Distributing Text Mining tasks with librAIry

Badenes-Olmedo, Carlos and Redondo-Garcia, José Luis and Corcho, Oscar (2017). Distributing Text Mining tasks with librAIry. In: "ACM Symposium on Document Engineering (DOCEng 2017)", 3-7 Sept 2017, Valletta, Malta. pp. 63-66. https://doi.org/10.1145/3103010.3121040.


Title: Distributing Text Mining tasks with librAIry
  • Badenes-Olmedo, Carlos
  • Redondo-Garcia, José Luis
  • Corcho, Oscar
Item Type: Presentation at Congress or Conference (Article)
Event Title: ACM Symposium on Document Engineering (DOCEng 2017)
Event Dates: 3-7 Sept 2017
Event Location: Valletta, Malta
Title of Book: Proceedings of the 2017 ACM Symposium on Document Engineering - DocEng '17
Date: 31 August 2017
Freetext Keywords: large-scale text analysis; NLP; scholarly data; text mining; data integration
Faculty: E.T.S. de Ingenieros Informáticos (UPM)
Department: Inteligencia Artificial
UPM's Research Group: Ontology Engineering Group OEG
Creative Commons Licenses: Attribution - ShareAlike


We present librAIry, a novel architecture to store, process and analyze large collections of textual resources, integrating existing algorithms and tools into a common, distributed, high-performance workflow. Available text mining techniques can be incorporated into the framework as independent plug&play modules that work collaboratively. In the absence of a pre-defined flow, librAIry relies on the aggregation of operations executed by different components in response to an emergent chain of events. Linked Data (LD) and Representational State Transfer (REST) principles are used extensively to provide individually addressable resources from textual documents. We describe the architecture design and its implementation, and test its effectiveness in real-world scenarios such as collections of research papers, patents or ICT aids, with the objective of providing solutions for decision makers and experts in those domains. Major advantages of the framework and lessons learned from these experiments are reported.
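
The idea of modules reacting to an emergent chain of events, rather than following a pre-defined flow, can be illustrated with a minimal in-memory sketch. This is not librAIry's actual API: the `EventBus` class, the event names (`document.created`, `document.tokenized`, `document.annotated`), the module functions and the `/documents/42` resource path are all hypothetical and chosen only for illustration. The real framework is distributed, whereas this sketch runs in a single process.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical handler signature: receives a resource URI and the bus itself.
Handler = Callable[[str, "EventBus"], None]


class EventBus:
    """Toy in-process event bus (an assumption, not librAIry's implementation)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)
        self.log: List[str] = []  # records every published event, in order

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, uri: str) -> None:
        self.log.append(f"{event_type} {uri}")
        # Each subscribed module reacts independently to the event.
        for handler in list(self._handlers[event_type]):
            handler(uri, self)


def tokenizer(uri: str, bus: EventBus) -> None:
    # Illustrative module: would tokenize the text, then announce the result.
    bus.publish("document.tokenized", uri)


def topic_modeler(uri: str, bus: EventBus) -> None:
    # Illustrative module: would annotate the document with topics.
    bus.publish("document.annotated", uri)


def run_pipeline(uri: str) -> List[str]:
    """Wire the modules to events and trigger the chain with one creation event."""
    bus = EventBus()
    bus.subscribe("document.created", tokenizer)
    bus.subscribe("document.tokenized", topic_modeler)
    bus.publish("document.created", uri)
    return bus.log


if __name__ == "__main__":
    for entry in run_pipeline("/documents/42"):
        print(entry)
```

No module calls another directly: the processing order emerges from which events each module subscribes to, so a new technique could be plugged in by subscribing one more handler without touching the existing ones.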

Funding Projects

Government of Spain | TIN2016-78011-C4-4-R | Unspecified | Unspecified | DATOS 4.0: RETOS Y SOLUCIONES

More information

Item ID: 52010
DC Identifier: https://oa.upm.es/52010/
OAI Identifier: oai:oa.upm.es:52010
DOI: 10.1145/3103010.3121040
Official URL: https://doi.org/10.1145/3103010.3121040
Deposited by: Carlos Badenes-Olmedo
Deposited on: 03 Sep 2018 11:05
Last Modified: 03 Sep 2018 11:05