CHEMDNER: the drugs and chemical names extraction challenge

Krallinger, Martin; Leitner, Florian; Rabal, Obdulia; Vázquez, Miguel; Oyarzabal, Julen and Valencia, Alfonso (2015). CHEMDNER: the drugs and chemical names extraction challenge. "Journal of Cheminformatics", v. 7 (n. 1). ISSN 1758-2946. https://doi.org/10.1186/1758-2946-7-S1-S1.

Description

Title: CHEMDNER: the drugs and chemical names extraction challenge
Author(s):
  • Krallinger, Martin
  • Leitner, Florian
  • Rabal, Obdulia
  • Vázquez, Miguel
  • Oyarzabal, Julen
  • Valencia, Alfonso
Document Type: Article
Journal/Publication Title: Journal of Cheminformatics
Date: 2015
Volume: 7
Subjects:
School: E.T.S. de Ingenieros Informáticos (UPM)
Department: Artificial Intelligence
Creative Commons Licenses: Attribution - No Derivatives - Non-Commercial

Abstract

Natural language processing (NLP) and text mining technologies for the chemical domain (ChemNLP or chemical text mining) are key to improving the access to and integration of information from unstructured data such as patents or the scientific literature. Therefore, the BioCreative organizers posed the CHEMDNER (chemical compound and drug name recognition) community challenge, which promoted the development of novel, competitive and accessible chemical text mining systems. This task allowed a comparative assessment of the performance of various methodologies using a carefully prepared collection of text, manually labeled by specially trained chemists, as Gold Standard data. We evaluated two important aspects: one covered the indexing of documents with chemicals (chemical document indexing - CDI task), and the other was concerned with finding the exact mentions of chemicals in text (chemical entity mention recognition - CEM task). A total of 27 teams (23 academic and 4 commercial, 87 researchers in all) returned results for the CHEMDNER tasks: 26 teams for the CEM task and 23 for the CDI task. Top scoring teams obtained an F-score of 87.39% for the CEM task and 88.20% for the CDI task, a very promising result when compared to the agreement between human annotators (91%). The strategies used to detect chemicals included machine learning methods (e.g. conditional random fields) using a variety of features, chemistry and drug lexica, and domain-specific rules. We expect that the tools and resources resulting from this effort will have an impact on future developments of chemical text mining applications and will form the basis for finding related chemical information for the detected entities, such as toxicological or pharmacogenomic properties.
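
As an illustration of how the two evaluation settings in the abstract can differ, the sketch below scores one toy document both ways. It is not the official CHEMDNER evaluation: it assumes the F-score is the balanced harmonic mean of precision and recall, that a CEM hit requires exact character offsets, and that a CDI entry is a unique chemical string per document. The data layout, the document ID `d1` and the helper names `prf` and `to_index` are hypothetical.

```python
from typing import Dict, Set, Tuple

# Hypothetical annotation layouts (not taken from the CHEMDNER distribution):
# a CEM annotation is (doc_id, start_offset, end_offset),
# a CDI annotation is (doc_id, mention_text).
Mention = Tuple[str, int, int]
IndexEntry = Tuple[str, str]


def prf(gold: set, predicted: set) -> Tuple[float, float, float]:
    """Return (precision, recall, balanced F-score) for two annotation sets."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def to_index(mentions: Set[Mention], texts: Dict[str, str]) -> Set[IndexEntry]:
    """Collapse exact mentions into unique (doc_id, mention_text) pairs."""
    return {(doc, texts[doc][start:end]) for doc, start, end in mentions}


# Toy document: "aspirin" occurs twice, "ibuprofen" once.
texts = {"d1": "aspirin and ibuprofen, then aspirin"}
gold_cem = {("d1", 0, 7), ("d1", 12, 21), ("d1", 28, 35)}
pred_cem = {("d1", 0, 7), ("d1", 12, 21)}  # second "aspirin" mention missed

cem_scores = prf(gold_cem, pred_cem)
cdi_scores = prf(to_index(gold_cem, texts), to_index(pred_cem, texts))

print("CEM precision/recall/F:", cem_scores)
print("CDI precision/recall/F:", cdi_scores)
```

In this toy case the missed repeat mention lowers CEM recall, while the CDI score is unaffected because both chemical names still appear in the document-level index.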

More information

Record ID: 41176
DC Identifier: http://oa.upm.es/41176/
OAI Identifier: oai:oa.upm.es:41176
DOI Identifier: 10.1186/1758-2946-7-S1-S1
Official URL: http://jcheminf.springeropen.com/articles/10.1186/1758-2946-7-S1-S1
Deposited by: Memoria Investigacion
Deposited on: 26 Oct 2016 11:04
Last Modified: 26 Oct 2016 11:04