Best Practice Published Data Records

The climate community is better placed for best-practice data management than many other groups, thanks to the self-describing netCDF data format and to community modelling projects such as CMIP/CORDEX/Obs4MIPS that enforce strict standards. Many climate tools in wide use across the community take advantage of this, for example by updating the history attribute to document what was done to the data.
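As a rough illustration of the history attribute in practice, the sketch below prepends a timestamped line to an existing history string. It is a minimal example under stated assumptions, not how any particular tool does it: `add_history_entry` is a hypothetical helper, the plain dictionary stands in for a file's global attributes, and putting the newest entry first is only one common convention (some tools append instead).

```python
from datetime import datetime, timezone


def add_history_entry(attrs, action, when=None):
    """Return a copy of attrs with a timestamped line prepended to 'history'.

    attrs  : dict of global attributes (a stand-in for a netCDF file's attributes)
    action : short description of the processing step
    when   : timezone-aware datetime; defaults to the current UTC time
    """
    # Normalise the timestamp to UTC so the trailing "Z" is accurate.
    when = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    entry = f"{when.strftime('%Y-%m-%dT%H:%M:%SZ')}: {action}"

    previous = attrs.get("history", "")
    updated = dict(attrs)  # leave the caller's dictionary untouched
    # Newest entry first, earlier entries preserved below it.
    updated["history"] = f"{entry}\n{previous}" if previous else entry
    return updated


# Illustrative values only; the attributes and the action are made up.
attrs = {"title": "Example dataset", "history": "2023-01-01T00:00:00Z: created"}
stamp = datetime(2024, 5, 1, 12, 0, 0, tzinfo=timezone.utc)
new_attrs = add_history_entry(attrs, "regridded to 1 degree", when=stamp)
print(new_attrs["history"])
```

The point of keeping earlier lines intact is that the attribute then reads as an audit trail: each processing step adds one line and nothing is overwritten.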

What does best practice look like? At a minimum, it means including complete metadata, ensuring the data is FAIR/FAIRER [1], documenting the data processing and publishing that documentation as well, and undertaking QA/QC and documenting that process.

It is nearly impossible to get all of this right. Even with strict standards like CMIP/CORDEX, we still find mistakes in the metadata because the standards themselves are so complex. We will never get this perfect, but we aim for a best effort, and as our tooling (software/workflows) improves over time, so will our compliance.

How do we walk the line between showing people what their data should look like and putting them off making any effort at all by suggesting they have to get it perfect?

For examples of good practice, we suggest looking at the following: