Challenges of terminology extraction from legal Spanish corpora

Martín Chozas, Patricia and Calleja Ibáñez, Pablo (2018). Challenges of terminology extraction from legal Spanish corpora. In: "Proceedings of the 2nd Workshop on Technologies for Regulatory Compliance", 12 Dec 2018, Groningen, Netherlands. pp. 73-83.

Description

Title: Challenges of terminology extraction from legal Spanish corpora
Author/s:
  • Martín Chozas, Patricia
  • Calleja Ibáñez, Pablo
Item Type: Presentation at Congress or Conference (Article)
Event Title: Proceedings of the 2nd Workshop on Technologies for Regulatory Compliance
Event Dates: 12 Dec 2018
Event Location: Groningen, Netherlands
Title of Book: TERECOM 2018: Technologies for Regulatory Compliance
Date: 2018
Volume: 2309
Subjects:
Freetext Keywords: Legal terminology, Automatic Term Extraction, Natural Language Processing, Semantic Web Technologies
Faculty: E.T.S. de Ingenieros Informáticos (UPM)
Department: Lingüistica Aplicada a la Ciencia y a la Tecnología
Creative Commons Licenses: Attribution - No Derivatives - Non-Commercial

Abstract

Untangling the complexities of legal documentation is an imperative need for non-practitioners of the legal profession. The terminology used in the domain is complex and usually requires expert knowledge to be fully understood, since the legal framework is constantly being updated and the meaning of terms varies accordingly. Non-proprietary Automatic Terminology Extraction (ATE) tools are required in this particular domain, in which documents contain private and sensitive data. This paper describes methods for obtaining accurate legal terms from labour law corpora, overcoming the difficulties present in the area, and also analyses the peculiarities of legal jargon, specifically in the Spanish language. The experiments, run with JATE, a well-known open-source library in the ATE literature, are still preliminary but promising.

More information

Item ID: 67249
DC Identifier: https://oa.upm.es/67249/
OAI Identifier: oai:oa.upm.es:67249
Deposited by: Memoria Investigacion
Deposited on: 25 May 2021 08:46
Last Modified: 25 May 2021 08:46