Martina Stockhause

and 5 more

Within the climate modeling community, the complex citation issue has been discussed for a decade in the context of research traceability and data citation/data impact. Traceability requires fine-grained information on individual datasets, whereas meaningful data impact analysis relies on data citations for large collections of data belonging to an individual model run or to a model experiment. To date, it has not been possible to achieve both goals with one technical solution. Suggestions for combining DOIs on data collections with user-defined PID collections of data subsets across several DOIs have not been taken up (see Stockhause et al., 2013). The IPCC FAIR Guidelines introduced in the Sixth Assessment Report (AR6) aimed to enhance the transparency of the AR6 and its outcomes by documenting the figure creation process (Pirani et al., 2022). Many figures are based on large numbers of datasets hosted in various repositories, and citing every dataset in the captions is not feasible. User-defined data collections utilizing data provenance records could be included in a caption, but they lack the information about the authors and funders of the individual objects that is required for data citation and data impact analysis. The exchange on complex citation difficulties intensified at AGU 2020 within the Community of Practice and led to the establishment of the RDA Complex Citation Working Group (WG). The WG brings all stakeholders together. It aims to provide recommendations for citing a large number of existing objects in a way that allows credit to be properly assigned to individual objects.

Denise Hills

and 17 more

This article is composed of three independent commentaries about the state of ICON principles (Goldman et al., 2021a) in Earth and Space Science Informatics (ESSI) and includes discussion on the opportunities and challenges of adopting them. Each commentary focuses on a different topic: (Section 2) Global collaboration, cyberinfrastructure, and data sharing; (Section 3) Machine learning for multiscale modeling; (Section 4) Aerial and satellite remote sensing for advancing Earth system model development by integrating field and ancillary data. ESSI addresses data management practices, computation and analysis, and hardware and software infrastructure. Our role in ICON science therefore involves collaborative work to assess, design, implement, and promote practices and tools that enable effective data management, discovery, integration, and reuse for interdisciplinary work in Earth and space science disciplines. Networks of diverse people with expertise across Earth, space, and data science disciplines are essential for efficient and ethical exchanges of FAIR research products and practices. Our challenge is then to coordinate the development of standards, curation practices, and tools that enable integrating and reusing multiple data types, software, multi-scale models, and machine learning approaches across disciplines in a way that is as open and/or FAIR as ethically possible. This is a major endeavor that could greatly increase the pace and potential of interdisciplinary scientific discovery.

Carlo Lacagnina

and 9 more

Knowledge of data quality and of the quality of the associated information, including metadata, is critical for data use and reuse. Assessment of data and metadata quality is key for ensuring that available information is credible, establishing a foundation of trust between the data provider and various downstream users, and demonstrating compliance with requirements established by funders and federal policies. Data quality information should be consistently curated, traceable, and adequately documented so that it provides sufficient evidence to guide users in addressing their specific needs. Quality information is especially important for data used to support decisions and policies, and for enabling data to be truly findable, accessible, interoperable, and reusable (FAIR). Clear documentation of the quality assessment protocols used can promote the reuse of quality assurance practices and thus support the generation of more easily comparable datasets and quality metrics. To enable interoperability across systems and tools, data quality information should be machine-actionable. Guidance on the curation of dataset quality information can help to improve the practices of the various stakeholders who contribute to the collection, curation, and dissemination of data. This presentation introduces international community guidelines for curating data quality information that is consistent with the FAIR principles throughout the entire data life cycle and inheritable by any derivative product. Supporting case studies demonstrate the applicability of the proposed guidelines.

Erin Robinson

and 5 more

Addressing research problems in Earth and environmental science usually requires combining data from multiple sources. This is facilitated by the use of common practices, vocabularies, interfaces, and standards, and it has recently been accelerated through connected communities of practice. This abstract focuses on the Earth Science Information Partners (ESIP) and the Australian Earth and Environment Science Information Partners (E2SIP). Over the last 20 years, ESIP has built a community of practice in the USA, supported by NASA, NOAA, and USGS, through regular meetings and online forums to examine and develop emerging technologies. ESIP has become a brain trust and professional home for the Earth science data and informatics community, where both peer-led education and training and the co-development of conventions, practices, and guidelines have helped make Earth science data more interoperable. Through connections in the ESIP network and these boundary objects, ESIP has influenced the international community. E2SIP was recently established through liaison with ESIP to support similar functions in Australia. E2SIP is working with the National Earth and Environmental Sciences Facilities Forum, which provides a common voice to government on behalf of long-term science infrastructure. In addition, E2SIP, supported by programs from the National Collaborative Research Infrastructure Strategy (NCRIS) such as the newly formed Australian Research Data Commons (ARDC), will convene workshops, courses, and hackathons, and will develop guidance and best practices tailored for the Australian community. This talk will explore how ESIP and E2SIP will work together, using the collective impact framework, orienting around a common shared agenda, and leveraging a shared backbone structure in the U.S. and Australia. We will highlight our current understanding through a few case studies.