Institute of Marine Biology, Biotechnology and Aquaculture - HCMR
Literature Mining

Biodiversity literature and data constitute a vast public resource open to mining and knowledge extraction.

Associating organisms with key features of their life, for example the environment in which they live, the way they feed, and their breeding habits, is a cornerstone of explaining biodiversity patterns and informing ecological decisions.

The Literature Mining virtual Lab (LM-vLab) aims at both:

  • the automatic extraction of species-trait associations from the literature
  • augmenting LifeWatch Greece species-related information based on the above
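
The first aim above can be illustrated with a deliberately simplified sketch. It is not the method used by the LM-vLab tools: it assumes two tiny, hypothetical dictionaries (`SPECIES` and `ENVIRONMENTS`) and counts how often a species name and an environment term appear in the same sentence. The function names `find_terms` and `cooccurrences` are illustrative only.

```python
import re
from collections import Counter

# Hypothetical toy dictionaries (assumption); real taggers rely on far
# larger curated resources.
SPECIES = {"mytilus galloprovincialis", "posidonia oceanica"}
ENVIRONMENTS = {"seagrass meadow", "rocky shore", "estuary"}


def find_terms(sentence, dictionary):
    """Return the dictionary terms found in the sentence as whole words."""
    lowered = sentence.lower()
    return {
        term
        for term in dictionary
        if re.search(r"\b" + re.escape(term) + r"\b", lowered)
    }


def cooccurrences(text):
    """Count (species, environment) pairs mentioned in the same sentence."""
    counts = Counter()
    # Naive sentence split on ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        for species in find_terms(sentence, SPECIES):
            for environment in find_terms(sentence, ENVIRONMENTS):
                counts[(species, environment)] += 1
    return counts


text = (
    "Mytilus galloprovincialis is common on the rocky shore. "
    "Posidonia oceanica forms a seagrass meadow in shallow water. "
    "Mytilus galloprovincialis was also sampled in an estuary."
)
pairs = cooccurrences(text)
for (species, environment), n in sorted(pairs.items()):
    print(species, "|", environment, "|", n)
```

Sentence-level co-occurrence is only a crude proxy for an association; it is used here because it is easy to follow, and it says nothing about how the taggers described below resolve synonyms or ambiguous names.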

The EXTRACT annotation tool and the ENVIRONMENTS and SPECIES/ORGANISMS taggers are the LM-vLab tools relevant to this end. All three are being employed for standards-compliant term suggestion to describe the environmental context of metagenomic records, while ENVIRONMENTS is also being used for the identification of Environment Ontology terms in text and the annotation of the Encyclopedia of Life.

All tools are developed in collaboration with, and maintained at, the group of Prof. Lars Juhl Jensen, Novo Nordisk Foundation Center for Protein Research, Copenhagen, Denmark.

Documentation: The ENVIRONMENTS and SPECIES/ORGANISMS taggers are command-line tools. Their software documentation is available (a) as supplementary material in their publications and (b) as plain text files included in their downloadable archives. Please visit each tool's site for links to both. The functionality of the EXTRACT annotation tool and its Application Programming Interface (API) are thoroughly described in the EXTRACT FAQ tab. Please visit the tool's site for more information. In addition, a step-by-step EXTRACT usage guide and a user evaluation report can be found here.