Efforts to validate, monitor, and verify ocean-based carbon dioxide removal (CDR) will require a rich understanding of the ocean carbon system. Ocean observations anchor this understanding, but we know that some ongoing observations are precariously funded, that data products like SOCAT rely on volunteer effort, that regions essential to our understanding of the ocean carbon system are under-observed, and that some observation data is under-used. This presentation will be a progress report on our efforts to identify and document ocean carbon data flows using systematic literature reviews and examination of ocean data repositories. Documenting these data flows is essential to identifying what data the scientific community already relies on, what data and observation gaps exist, and what data might be under-used. We examined variables of interest based on GOOS EOVs, including Oxygen, Stable Carbon Isotopes, Ocean Surface Stress, and Ocean Surface Heat Flux, each with its supporting variables. Commonly observed supporting variables include O2, alkalinity, pCO2, pH, and temperature, along with near-surface air temperature, humidity, pressure, and wind speed.
Journals, funding agencies, and researchers increasingly expect manuscripts to include links to shared research data. Effective data sharing requires that data be findable, accessible, interoperable, and reusable (FAIR), and is thus predicated on establishing a common understanding of how to communicate: data exchange standards, common data formats, controlled vocabularies, and a communal data repository. When conducting research, we still communicate in shorthand that works for everyone on the team who understands our context, but whose meaning is lost when data is shared without that context. “Water temperature” means only one thing to my research team, yet can mean dozens of things outside of that context. Data sharing is thus an exercise in sharing not just the data, which is typically readily available, but also the context of that data, which requires additional effort. This effort is one of the barriers to sharing data. We’ll describe an alternative model for accepting data into a repository: data is ingested immediately, regardless of its metadata quality, and behavioural nudges and crowd-sourcing features then ensure that it meets appropriate standards prior to publication. We’ll show a work-in-progress prototype software tool that supports this alternative model and is capable of accepting a research data set and standardizing it to use CF conventions and ISO 8601 dates.
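To make the standardization step concrete, the following is a minimal sketch of what mapping team shorthand to CF standard names and normalizing dates to ISO 8601 might look like. It is not the prototype tool itself. The shorthand column names, the list of accepted input date formats (including the day-first reading of slash-separated dates), and the function names are all assumptions made for illustration; only the CF standard names on the right-hand side of the mapping come from the CF standard name table.

```python
from datetime import datetime

# Hypothetical mapping from a team's shorthand column names to CF standard names.
# The shorthand keys are invented for illustration.
CF_NAMES = {
    "water temperature": "sea_water_temperature",
    "salinity": "sea_water_practical_salinity",
}

# Assumed input date formats, tried in order. Reading "03/07/2021" as
# day/month/year is an assumption; a real tool would need to ask or infer.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y%m%d")


def to_iso8601(text):
    """Return the date as ISO 8601 (YYYY-MM-DD), or None if no format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def standardize(record):
    """Rename known columns to CF standard names and normalize the 'date' field.

    Nothing is rejected: values that cannot be standardized are kept as-is and
    reported in a list of issues for later review.
    """
    out, issues = {}, []
    for key, value in record.items():
        name = key.strip().lower()
        if name == "date":
            iso = to_iso8601(value)
            if iso is None:
                issues.append(f"unparsed date: {value!r}")
                out["time"] = value
            else:
                out["time"] = iso
        elif name in CF_NAMES:
            out[CF_NAMES[name]] = value
        else:
            issues.append(f"no CF name for column: {key!r}")
            out[key] = value
    return out, issues
```

Called on a record such as {"Date": "03/07/2021", "Water Temperature": 11.4, "Depth?": 5}, this sketch keeps every value, renames what it recognizes, and returns the unrecognized "Depth?" column as an issue rather than refusing the data. That list of issues is the kind of material the behavioural nudges and crowd-sourcing features could act on before publication.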