Introduction
The morphological description of seeds and diaspores offers essential information for scientists and practitioners in a wide variety of fields, including botany, restoration, conservation, ethnobotany, archaeology, and agriculture. Diaspore traits, such as size, shape, colour, surface structures, and the presence of appendages, are needed to establish the identity of diaspores that have become detached from their mother plant (Martin and Barkley 1961), for instance in seed lots, seed traps, soil seed banks, archaeological sites, or forensic investigations. Moreover, integrating diaspore morphological traits into theoretical plant regeneration frameworks can lead to major advances in predictive evolutionary and ecological models, and thereby support conservation and restoration actions (Saatkamp et al. 2019).
Over the years, the demand for knowledge of diaspore morphology has led to numerous compilations of text descriptions and/or images of diaspores in books, guides, and atlases (e.g., Martin and Barkley 1961, Brouwer and Stählin 1975, Beijerinck 1976, Sweedman and Merritt 2006, Bojňanský and Fargašová 2007, Cappers et al. 2012). In the last two decades, databases have been built to synthesise and centralise information on diaspore traits (e.g., Kleyer et al. 2008, Hintze et al. 2013, GEVES 2022, Royal Botanic Gardens Kew 2022), facilitating large-scale analyses. Along with these databases, standardised protocols for trait measurements were established to allow the integration of data from different sources. These protocols included methods for describing diaspores, which consist of quantifying size and other morphometric measurements (mostly reported as taxon means or ranges) and classifying attributes based on visual (perceptual) categories, functional structures, and/or anatomical parts (Römermann et al. 2005).
The pressing need for new solutions to deal with environmental crises, together with the recent surge in applications of machine learning and image analysis in ecology and evolution, calls for an upgrade of diaspore morphological datasets. The automated extraction of information from digital images provides the opportunity to collect quantitative phenotypic data in large quantities, enabling the investigation of high-dimensional and complex relationships between traits and their interactions with environmental variables (Lürig et al. 2021). Furthermore, the use of machine learning algorithms to classify images and/or suites of traits can allow taxon identification to be automated, making the task faster and no longer exclusively dependent on experienced taxonomists (Borowiec et al. 2022, Loddo et al. 2022).
Here, we present DiasMorph, a comprehensive dataset of morphological traits and images of diaspores from Central Europe. It provides images of 94,214 diaspores from 1,437 taxa in 513 genera and 96 families, captured with a standardised and reproducible method (Dayrell et al. 2023). The dataset also compiles quantitative morphological traits extracted from the images using an image analysis method. These traits include not only traditional morphometric measurements, but also colour and shape features, which are made available for the first time in a large dataset (Dayrell et al. 2023). The quantitative trait records correspond to measurements of individual diaspores, a type of input currently unavailable in trait databases, which will allow a range of approaches to be used for a more complete exploration of the morphological traits of these species. We also include information on the presence or absence of appendages and structures in the diaspores of the evaluated taxa. By making these data available, we aim to encourage initiatives to develop new tools for diaspore identification, further our understanding of the functions of morphological traits, and provide the means for the continued development of image analysis applications.