Deep-Sync: A novel deep learning-based tool for semantic-aware subtitling synchronisation

Martín García, Alejandro ORCID: https://orcid.org/0000-0002-0800-7632, González Carrasco, Israel ORCID: https://orcid.org/0000-0001-8294-3157, Rodríguez Fernández, Víctor ORCID: https://orcid.org/0000-0002-8589-6621, Souto Rico, Mónica ORCID: https://orcid.org/0000-0002-9315-7861, Camacho Fernández, David ORCID: https://orcid.org/0000-0002-5051-3475 and Ruiz Mezcua, Belen ORCID: https://orcid.org/0000-0003-1993-8325 (2021). Deep-Sync: A novel deep learning-based tool for semantic-aware subtitling synchronisation. "Neural Computing and Applications" ; ISSN 1433-3058. https://doi.org/10.1007/s00521-021-05751-y.

Description

Title: Deep-Sync: A novel deep learning-based tool for semantic-aware subtitling synchronisation
Author(s):
Document Type: Article
Journal/Publication Title: Neural Computing and Applications
Date: 8 February 2021
ISSN: 1433-3058
Subjects:
SDGs:
Informal Keywords: TV Broadcasting, Synchronisation, Language Model, Deep Neural Networks, Machine Learning
School: E.T.S.I. de Sistemas Informáticos (UPM)
Department: Sistemas Informáticos
Creative Commons License: Attribution - NonCommercial - NoDerivatives

Full text

PDF (9220082.pdf) - Download (1 MB)

Abstract

Subtitles are a key element in making media content accessible to people with hearing impairments and to elderly people, and they are also useful when watching TV in a noisy environment or learning a new language. Most of the time, subtitles are generated manually in advance, producing a verbatim, synchronised transcription of the audio. In live TV broadcasts, however, captions are created in real time by a re-speaker with the help of voice recognition software, which inevitably leads to delays and a lack of synchronisation. In this paper, we present Deep-Sync, a tool for aligning subtitles with audio-visual content. Its architecture integrates a deep language representation model with real-time voice recognition software to build a semantic-aware alignment tool that successfully aligns most subtitles even when there is no direct correspondence between the re-speaker's words and the audio content. To avoid any kind of censorship, Deep-Sync can be deployed directly on users' TVs: this introduces a small delay to perform the alignment but avoids delaying the signal at the broadcaster's station. Deep-Sync was compared with another subtitle alignment tool, and our proposal improved synchronisation in all tested cases.
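The core idea of semantic-aware alignment can be illustrated with a minimal sketch: each subtitle is compared against time-stamped segments of the recognised audio, and the subtitle is re-timed to the segment whose text is semantically closest. This is NOT the paper's implementation: Deep-Sync uses a deep language representation model, whereas this toy version uses bag-of-words cosine similarity as a stand-in for deep embeddings, and the function and variable names are hypothetical.

```python
from collections import Counter
from math import sqrt

def embed(text):
    # Toy stand-in for a deep language representation model:
    # a bag-of-words count vector (punctuation stripped, lowercased).
    cleaned = "".join(c for c in text.lower() if c.isalnum() or c.isspace())
    return Counter(cleaned.split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    num = sum(a[t] * b[t] for t in a)
    den = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def align_subtitle(subtitle, asr_segments):
    """Return (start_time, score) of the ASR segment most semantically
    similar to the subtitle; the subtitle would then be re-timed to it."""
    sub_vec = embed(subtitle)
    best_start, best_text = max(
        asr_segments, key=lambda seg: cosine(sub_vec, embed(seg[1]))
    )
    return best_start, cosine(sub_vec, embed(best_text))

# Recognised audio as (start_time_in_seconds, recognised_text) pairs.
asr = [
    (0.0, "good evening and welcome"),
    (3.2, "tonight we talk about the weather"),
    (7.5, "sunny skies are expected tomorrow"),
]

start, score = align_subtitle("Sunny skies expected tomorrow.", asr)
# The subtitle matches the third segment, so it is re-timed to start at 7.5 s.
```

A semantic (rather than exact-text) comparison is what lets the alignment succeed even when the re-speaker paraphrases the audio, as the abstract notes; a real deployment would replace `embed` with the deep language model's sentence embeddings.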

Associated projects

Type                 Code                  Acronym     Lead researcher  Title
Gobierno de España   TIN2017-85727-C4-3-P  DeepBio     Not specified    Not specified
Comunidad de Madrid  S2018/TCS-4566        CYNAMON-CM  Not specified    Not specified

More information

Record ID: 88871
DC Identifier: https://oa.upm.es/88871/
OAI Identifier: oai:oa.upm.es:88871
Scientific Portal URL: https://portalcientifico.upm.es/es/ipublic/item/9220082
DOI: 10.1007/s00521-021-05751-y
Official URL: https://link.springer.com/article/10.1007/s00521-0...
Deposited by: iMarina Portal Científico
Deposited on: 05 May 2025 17:24
Last Modified: 05 May 2025 17:24