TNLS: a new taxonomic name matching system in the age of Big Data

On 1st May 2024, the Taxonomic Name Linking Services (TNLS) project officially launched as part of the TETTRIs cascade funding program. It is one of 12 innovative projects designed to transform the field of taxonomy through collaborative solutions. Supported by a €150,000 grant, TNLS is tackling a pressing issue in modern taxonomy: matching species names across different datasets.

By developing permanent identifiers that are specific to individual species, TNLS sets out to produce a unified system for matching species names.

The complexity of name matching in modern taxonomy

For more than two centuries, the Latin naming system has been used to name and classify organisms, providing a universal way to identify species. The discipline built around it is known as taxonomy. However, the surge in Big Data – massive amounts of information that are difficult to process with traditional tools – has introduced new challenges.

Inconsistencies in how species are recorded, such as variant spellings, multiple names for the same species (synonyms), identical names for different organisms (homonyms), and differing taxonomic opinions, have made merging datasets a major challenge. In fact, 10-20% of species names fail to match perfectly. These cases require either human intervention, which costs considerable time, or the acceptance of errors, which leaves inaccuracies in the datasets.

This phenomenon, which researchers call “data fog,” is becoming a significant barrier to advancing biodiversity research. With big datasets that contain thousands of species, the need for a standardized system to match names is more critical than ever.
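To make the matching problem concrete, here is a minimal sketch in Python of how a misspelled name might be caught. It is not the TNLS method: the function name `match_name`, the reference list, and the similarity cutoff of 0.85 are assumptions chosen for illustration, and the comparison relies on the standard library's `difflib` rather than any taxonomic service.

```python
from difflib import get_close_matches


def match_name(name, reference_names, cutoff=0.85):
    """Return (matched_name, kind); kind is 'exact', 'fuzzy', or 'unmatched'.

    Illustrative only: the cutoff value is an arbitrary assumption.
    """
    # Collapse repeated whitespace and normalise capitalisation
    # ("gadus  morhua" becomes "Gadus morhua").
    normalized = " ".join(name.split()).capitalize()
    if normalized in reference_names:
        return normalized, "exact"
    # Fall back to approximate string matching for spelling variants.
    candidates = get_close_matches(normalized, reference_names, n=1, cutoff=cutoff)
    if candidates:
        return candidates[0], "fuzzy"
    return None, "unmatched"


# Hypothetical reference list standing in for a taxonomic backbone.
reference = ["Gadus morhua", "Solea solea", "Quercus robur"]

for raw in ["Gadus morhua", "gadus  morhua", "Gadus morrhua", "Fagus sylvatica"]:
    print(raw, "->", match_name(raw, reference))
```

Spelling variants are the easy part. A purely string-based approach like this cannot tell that two entirely different names refer to the same species, nor that one name is used for two different organisms, which is where human review or shared identifiers come in.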

About TNLS 

To address this challenge, TNLS is developing a unified name matching system capable of accurately matching species names across multiple big datasets. The project will achieve this through the use of stable identifiers, which act as an ID number for a species, even if the name or classification of a species is different across datasets. 
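The idea of a stable identifier can be sketched in a few lines of Python. Everything here is hypothetical: the identifiers (`ID-0001`, `ID-0002`), the lookup table `NAME_TO_ID`, the function `merge_by_identifier`, and the survey counts are invented for illustration and do not reflect the identifiers or interfaces TNLS will provide.

```python
# Hypothetical identifiers and records, for illustration only.
# Several names can point to the same identifier.
NAME_TO_ID = {
    "Gadus morhua": "ID-0001",
    "Gadus callarias": "ID-0001",  # older synonym for the same species
    "Solea solea": "ID-0002",
    "Solea vulgaris": "ID-0002",   # older synonym for the same species
}


def merge_by_identifier(*datasets):
    """Merge {name: count} datasets into {identifier: total count}.

    Names without a known identifier are returned separately for review.
    """
    merged, unresolved = {}, []
    for dataset in datasets:
        for name, count in dataset.items():
            identifier = NAME_TO_ID.get(name)
            if identifier is None:
                unresolved.append(name)
                continue
            merged[identifier] = merged.get(identifier, 0) + count
    return merged, unresolved


survey_a = {"Gadus morhua": 12, "Solea solea": 3}
survey_b = {"Gadus callarias": 5, "Solea vulgaris": 2, "Unknown sp.": 1}

merged, unresolved = merge_by_identifier(survey_a, survey_b)
print(merged)      # totals keyed by identifier rather than by name
print(unresolved)  # names that still need attention
```

Because the two surveys are joined on the identifier rather than on the name string, records filed under different names for the same species end up in the same row, and only names with no identifier at all are left for manual checking.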

Project coordinator Bart Vanhoorne explains: “The project is developing a unified and transparent approach to name matching Big Data, creating lasting tools that make biodiversity research more accurate, connected, and impactful.”

After the development of the name-matching system, TNLS will focus on: 

  1. Integrating the system into existing datasets, including those on marine and plant species.  
  2. Providing practical support to other institutions integrating the system.
  3. Creating a notification service to keep researchers up to date on taxonomic changes.

 

The team behind TNLS

TNLS is led by the Flanders Marine Institute (VLIZ) and supported by the Royal Botanic Garden Edinburgh (RBGE). The project also draws on the expertise of two renowned consultants, Walter Berendsohn and Andreas Müller, from the Botanischer Garten Berlin.

  • VLIZ, based in Belgium, serves as both the financial administrator and project manager. Importantly, VLIZ is also the host institute of the World Register of Marine Species, which manages the Aphia database, a backbone for marine biodiversity data.
  • RBGE, based in Scotland, takes part in the project as an extension of its broader contributions to World Flora Online, in particular its hosting of the WFO Plant List, a comprehensive database for plant taxonomy.

Figure 1. The TNLS team at the kick-off meeting in May in Edinburgh. In the picture from left to right: Roger Hyam (RBGE), Andreas Müller (consultant), Walter Berendsohn (consultant), Liesbeth Lyssens (VLIZ), Bart Vanhoorne (VLIZ)

What has been achieved so far?

One of TNLS’s first major achievements since its launch has been compiling a detailed list of how name-matching services work at leading European providers. This analysis will identify commonalities and discrepancies in their methodologies, which will inform the development of a unified system for name matching.

In parallel, TNLS is working to update the Aphia database to include non-marine taxonomy. This expansion will benefit a wider range of biodiversity projects and researchers, including other TETTRIs-funded initiatives such as iSedge, as it will allow efficient linking and matching across marine, non-marine, animal and non-animal species.

Figure 2. How the Aphia platform operates

What’s next for TNLS?

Looking ahead, the project will focus on finalising the list of name-matching services and defining a common interface – an essential step for establishing standardization across datasets. This work will be followed by further enhancements to the Aphia database, ensuring it is aligned with TNLS’s findings. Alongside these technical developments, the team will also concentrate on promoting the adoption of TNLS tools and services within the wider research community.

Stay connected

Explore TNLS findings, recommendations for researchers and system implementers, and case studies here:

 
