

Plant research produces data in a profusion of types and scales, and in ever-increasing volume.

Data are generated through our research, citizen science, satellite imagery and sensor technologies. Current innovations in interoperability and analytics allow us to bring together vast quantities of data and look for unrecognised patterns behind the intractable problems we are trying to understand.

When scientists take measurements (e.g. fecundity, light absorption, root growth), they do not all take them in the same way. For example, one might measure plant height from the soil to the first leaf, another from the soil to the top of the plant. They also often use proxies for the traits they are really interested in, e.g. canopy height as a proxy for light interception.

Research on agricultural and tree biodiversity needs to connect data on intraspecific diversity, species diversity, and landscape and ecosystem diversity, and to link these with data on the environment, economy, wellbeing, nutrition and productivity. Global solutions require these links, so that we can understand how interactions at one level and in one sector affect the others. However, these data are usually collected for different uses by different actors, and kept in different formats in different databases. They need to be able to talk to each other so that scientists can link them up to generate knowledge-based solutions.


Ontologies are the ultimate silo-busters: they are a way to make data interoperable. An ontology is essentially a list of agreed terms along with their definitions. Each term has a stable identifier that makes explicit the class of the term and its links to other terms. Because an ontology includes the vocabulary used in a domain, it also provides information such as synonyms and abbreviations. Once linked to data, ontology terms act as metadata describing those data, which facilitates data integration, access and analysis. Ontologies are machine-readable, so automated agents can quickly check that all connections in both the ontology and the data are well described and work.
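To make this concrete, here is a minimal sketch in Python of how terms with stable identifiers, definitions and synonyms can harmonise column labels from two datasets, and how a program can check that every link in the ontology resolves. Everything in it is an assumption made for illustration: the "EX:" identifiers, the term definitions and the function names are hypothetical, and do not come from the Crop Ontology or any other real ontology.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Term:
    """One ontology term: a stable identifier plus human-readable details."""
    term_id: str                       # hypothetical identifier, e.g. "EX:0002"
    name: str
    definition: str
    synonyms: Tuple[str, ...] = ()     # alternative labels and abbreviations
    parent_id: Optional[str] = None    # link to a broader term, if any


# A toy ontology (illustrative only, not real Crop Ontology content).
ONTOLOGY = {
    "EX:0001": Term("EX:0001", "plant trait",
                    "Any measurable characteristic of a plant."),
    "EX:0002": Term("EX:0002", "plant height",
                    "Distance from the soil surface to the top of the plant.",
                    synonyms=("PH", "height"), parent_id="EX:0001"),
    "EX:0003": Term("EX:0003", "height to first leaf",
                    "Distance from the soil surface to the first leaf.",
                    synonyms=("first leaf height",), parent_id="EX:0001"),
}


def build_label_index(ontology):
    """Map every name and synonym (case-insensitive) to its term identifier."""
    index = {}
    for term in ontology.values():
        for label in (term.name, *term.synonyms):
            index[label.strip().lower()] = term.term_id
    return index


def annotate_columns(columns, index):
    """Return {column: term identifier, or None if no term matches}."""
    return {col: index.get(col.strip().lower()) for col in columns}


def find_broken_links(ontology):
    """List terms whose parent identifier is not defined in the ontology."""
    return [t.term_id for t in ontology.values()
            if t.parent_id is not None and t.parent_id not in ontology]


index = build_label_index(ONTOLOGY)
# Two datasets that label their columns differently.
dataset_a = annotate_columns(["PH", "plot"], index)
dataset_b = annotate_columns(["Plant height", "First leaf height"], index)

print(dataset_a)
print(dataset_b)
print(find_broken_links(ONTOLOGY))
```

In this sketch, "PH" in one dataset and "Plant height" in the other resolve to the same identifier, so the two columns can be merged safely, while "First leaf height" resolves to a different term and is kept apart. The unmatched "plot" column is flagged with None rather than guessed at.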

The Crop Ontology open-source tool

Bioversity International is one of the founding organizations behind the Crop Ontology, an online tool that compiles ontologies on the anatomy, structure and phenotypes of crops, as well as on germplasm, with multi-crop passport terms. It is free to use.

The concepts in the different crop ontologies are used to describe phenotypic data, that is, the expression of crop traits when crops are grown in different environments. Development of crop-specific ontologies began in 2008 for chickpea, rice, potato, maize and Musa, and in 2010 for cassava.

Visit the Crop Ontology website


This work was conducted as part of the CGIAR Excellence in Breeding and the Big Data Platform and is supported by contributors to the CGIAR Trust Fund.