Governance: Working with Big/Challenging Data Collections
The “big data working group” is a group of self-nominated volunteer representatives from the Australian climate community. Anyone is welcome to join the group or to contribute independently to the Jupyter Book the group is developing. This document outlines the governance of the big data working group. The group’s goals are described in a related document.
Statement of Scope
The goal of this working group is to create and share a best-practices framework for scientists in climate-related fields in Australia who work with big/challenging datasets. The framework will be collated and shared as a Jupyter Book, built from a public GitHub repository. Contents include:
Methods of data storage
Methods of accessing metadata and datasets quickly
Interpreting data format, variables, and metadata
Carrying out computations on large datasets
Computing platforms available to Australian researchers
Identifying which languages/tools are best suited to specific tasks
Lists of resources, such as training material and documentation
This working group grew out of a workshop “Creating a collaborative approach to climate data” at the Australian Meteorological and Oceanographic Society’s Annual Conference 2021. There are three other working groups that are closely related:
Creation of a single catalogue and/or access point for climate data
Enabling access of collections across institutions
Dataset management guidelines
Leadership
The working group has a chair, and individual group members lead each of the topics listed above. The group meets once per month and works collaboratively to organize and create the Jupyter Book.
Membership
Paige Martin (NASA, formerly ANU) - Chair
Chloe Mackallah (CSIRO)
Paola Petrelli (CLEX)
Claire Trenham (CSIRO)
Scott Wales (BoM, formerly CLEX)
Damien Irving (CSIRO)
Dougie Squire (ACCESS-NRI)
Alicia Takbash (CSIRO)
Thomas Moore (CSIRO)
Guidelines for contributors
Contributions to the Jupyter Book are welcomed and encouraged. The Jupyter Book is hosted on GitHub and can be edited by creating a pull request. Useful links:
“Working with Big/Challenging Data Collections” Jupyter Book
This is the landing page where we distribute information about working with big and challenging climate data.
GitHub repository for Jupyter Book
Instructions for contributing are in the README of this GitHub repository.