Challenge
Plant research produces data of many types and scales, and in ever-increasing volume.
Data are generated through our research, citizen science, satellite imagery and sensor technologies. Current innovations in interoperability and analytics allow us to bring together vast quantities of data and look for unrecognised patterns behind the intractable problems we are trying to understand.
When scientists take measurements (e.g. fecundity, light absorption, root growth), they do not all measure them in the same way. For example, one might measure plant height from the soil to the first leaf, another from the soil to the top of the plant. They also often use proxies for the traits they are really interested in, e.g. canopy height as a proxy for light interception.
Research on agricultural and tree biodiversity needs to connect data on intraspecific diversity, species diversity, and landscape and ecosystem diversity, and to link these with data on the environment, economy, wellbeing, nutrition and productivity. Global solutions require these links, so that we can understand how interactions at one level and in one sector affect the others. However, these data are usually collected for different uses by different actors and kept in different formats in different databases. They need to be able to talk to each other so that scientists can link them up to generate knowledge-based solutions.
Solution
Ontologies are the ultimate silo-busters: they are a way to make data interoperable. At its simplest, an ontology is a list of agreed terms along with their definitions. Each term in an ontology has a stable identifier, which makes explicit the class of the term and its links to other terms. Because an ontology includes the vocabulary used in a domain, it also provides information such as synonyms and abbreviations. Once linked to the data, ontology terms act as metadata describing your data, which facilitates data integration, access and analysis. Ontologies are machine-readable, so automated agents can quickly check that all connections in both the ontology and the data are well described and work.
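To make this concrete, here is a minimal sketch in Python of how such a term list might be represented and used. It is an illustration only, not the format of any real ontology: the "EX:" identifiers, the term definitions, the column names and the helper functions are all hypothetical. It shows terms with stable identifiers, definitions, synonyms and parent links; a machine check that every link points to an existing term; and two differently labelled dataset columns being resolved to distinct, explicitly defined ways of measuring plant height.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Term:
    term_id: str                     # stable identifier (hypothetical "EX:" prefix)
    label: str                       # agreed name of the term
    definition: str                  # agreed definition
    parent_id: Optional[str] = None  # link to a broader term, if any
    synonyms: Tuple[str, ...] = ()   # synonyms and abbreviations


# A toy ontology; every entry here is invented for illustration.
ONTOLOGY: Dict[str, Term] = {
    t.term_id: t
    for t in [
        Term("EX:0001", "plant trait", "Any measurable plant characteristic."),
        Term("EX:0002", "plant height", "Vertical extent of a plant.",
             parent_id="EX:0001", synonyms=("height", "PH")),
        Term("EX:0003", "plant height to first leaf",
             "Distance from the soil to the first leaf.",
             parent_id="EX:0002", synonyms=("ht_first_leaf",)),
        Term("EX:0004", "plant height to top of plant",
             "Distance from the soil to the top of the plant.",
             parent_id="EX:0002", synonyms=("ht_top", "total height")),
    ]
}


def dangling_parents(ontology: Dict[str, Term]) -> List[str]:
    """Return ids of terms whose parent link points to a missing term."""
    return [
        t.term_id
        for t in ontology.values()
        if t.parent_id is not None and t.parent_id not in ontology
    ]


def build_synonym_index(ontology: Dict[str, Term]) -> Dict[str, str]:
    """Map every label and synonym (lower-cased) to its term id."""
    index: Dict[str, str] = {}
    for t in ontology.values():
        for name in (t.label, *t.synonyms):
            index[name.strip().lower()] = t.term_id
    return index


def resolve(name: str, index: Dict[str, str]) -> Optional[str]:
    """Look up a free-text column name; None if it is not in the ontology."""
    return index.get(name.strip().lower())


def ancestors(term_id: str, ontology: Dict[str, Term]) -> List[str]:
    """Follow parent links upwards; assumes dangling_parents() is empty."""
    chain: List[str] = []
    current = ontology[term_id].parent_id
    while current is not None and current not in chain:  # guard against cycles
        chain.append(current)
        current = ontology[current].parent_id
    return chain


def annotate(columns: List[str], index: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Attach a term id (metadata) to each dataset column name."""
    return {col: resolve(col, index) for col in columns}


if __name__ == "__main__":
    index = build_synonym_index(ONTOLOGY)
    # Two hypothetical datasets that both report a "height" of some kind.
    print(annotate(["ht_first_leaf"], index))
    print(annotate(["Total height"], index))
    print(dangling_parents(ONTOLOGY))
```

Once each column carries a term identifier, the two datasets can be integrated knowingly: both measurements share the broader term "plant height" as an ancestor, but they are not the same measurement and should not be pooled as if they were.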