Exploring named-entity recognition techniques for academic books

Calleja Ibáñez, Pablo ORCID: https://orcid.org/0000-0001-8423-8240 and Giménez Toledo, Elea ORCID: https://orcid.org/0000-0001-5425-0003 (2024). Exploring named-entity recognition techniques for academic books. "Learned Publishing", v. 37 (n. 3); ISSN 1741-4857. https://doi.org/10.1002/leap.1610.

Description

Title: Exploring named-entity recognition techniques for academic books
Author(s): Calleja Ibáñez, Pablo; Giménez Toledo, Elea
Document Type: Article
Journal/Publication Title: Learned Publishing
Date: 1 May 2024
ISSN: 1741-4857
Volume: 37
Issue: 3
Subjects:
Uncontrolled Keywords: Academic books, Discoverability, Multilingualism, Named-entity recognition (NER), Onomastic index, Ontology, Semantic web
School: E.T.S. de Ingenieros Informáticos (UPM)
Department: Artificial Intelligence
Creative Commons Licenses: Attribution - No Derivatives - Non-Commercial

Full text

PDF: 10219129.pdf (4MB)

Abstract

Recent advances in the natural language processing (NLP) field have achieved impressive results in various tasks. However, NLP techniques are underrepresented in the analysis of Humanities and Social Science texts and in languages other than English. In particular, academic books are a highly valuable source of information that has not been exploited by these techniques at all. The recognition of named entities (person names, organizations or locations) and their semantic annotation over books could enrich the visibility and discoverability of the information by users. This is an opportunity for academia and the academic publishing industry in which semantic search is a central task and now books can be queried by named entities of interest that are in their content. This work proposes a methodology to apply named-entity recognition to publish the results into an ontological semantic-web format. The work has been performed over a corpus of academic books provided by UNE (Unión de Editoriales Universitarias Españolas, Union of Spanish University Presses). Results show an enrichment of the information extracted over the books and of the possibilities of querying them at the individual level but also within the whole set of books, increasing the possibilities for books to be discovered or retrieved beyond metadata.
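
The idea of publishing recognized entities in a semantic-web format and then querying books by entity can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes the entities have already been recognized by some NER step, and the `http://example.org/` namespace, the `mentions` property, the class names, and the function names are all hypothetical placeholders chosen for illustration. It serializes (entity text, entity type) pairs as N-Triples lines and runs a naive lookup over them.

```python
from urllib.parse import quote

# Hypothetical namespace and vocabulary; not taken from the paper.
BASE = "http://example.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def entity_iri(text):
    """Build an IRI for an entity from its surface form."""
    return BASE + "entity/" + quote(text.replace(" ", "_"))


def escape_literal(text):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def to_ntriples(book_id, entities):
    """Turn (text, label) pairs found in one book into N-Triples lines."""
    book = f"<{BASE}book/{quote(book_id)}>"
    lines = []
    for text, label in entities:
        ent = f"<{entity_iri(text)}>"
        lines.append(f"{ent} <{RDF_TYPE}> <{BASE}ontology/{label}> .")
        lines.append(f'{ent} <{RDFS_LABEL}> "{escape_literal(text)}" .')
        lines.append(f"{book} <{BASE}ontology/mentions> {ent} .")
    return lines


def books_mentioning(triples, text):
    """Return the sorted book IRIs that mention the given entity."""
    target = f"<{entity_iri(text)}>"
    mentions = f"<{BASE}ontology/mentions>"
    found = set()
    for line in triples:
        # Drop the trailing " ." and split into subject, predicate, object.
        subj, pred, obj = line.rstrip(" .").split(" ", 2)
        if pred == mentions and obj == target:
            found.add(subj)
    return sorted(found)


if __name__ == "__main__":
    # Entities are supplied by hand here; a real system would get them from NER.
    corpus = []
    corpus += to_ntriples("book-1", [("Miguel de Cervantes", "Person")])
    corpus += to_ntriples("book-2", [("Miguel de Cervantes", "Person"),
                                     ("Madrid", "Location")])
    for book in books_mentioning(corpus, "Miguel de Cervantes"):
        print(book)
```

In practice the lookup would more likely be a SPARQL query against a triple store; the string-based search here only shows that, once entities are expressed as triples, a question such as "which books mention this person" spans the whole collection rather than a single book's metadata.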

More information

Record ID: 86409
DC Identifier: https://oa.upm.es/86409/
OAI Identifier: oai:oa.upm.es:86409
Scientific Portal URL: https://portalcientifico.upm.es/es/ipublic/item/10219129
DOI: 10.1002/leap.1610
Official URL: https://onlinelibrary.wiley.com/doi/10.1002/leap.1...
Deposited by: iMarina Portal Científico
Deposited on: 21 Jan 2025 15:00
Last Modified: 21 Jan 2025 15:00