Full text

PDF (Portable Document Format)
- A PDF file viewer is required, such as GSview, Xpdf or Adobe Acrobat Reader

Download (1MB)
(ORCID: https://orcid.org/0000-0002-0800-7632), González Carrasco, Israel (ORCID: https://orcid.org/0000-0001-8294-3157), Rodríguez Fernández, Víctor (ORCID: https://orcid.org/0000-0002-8589-6621), Souto Rico, Mónica (ORCID: https://orcid.org/0000-0002-9315-7861), Camacho Fernández, David (ORCID: https://orcid.org/0000-0002-5051-3475) and Ruiz Mezcua, Belen (ORCID: https://orcid.org/0000-0003-1993-8325) (2021). Deep-Sync: A novel deep learning-based tool for semantic-aware subtitling synchronisation. Neural Computing and Applications. ISSN 1433-3058. https://doi.org/10.1007/s00521-021-05751-y
| Title: | Deep-Sync: A novel deep learning-based tool for semantic-aware subtitling synchronisation |
|---|---|
| Author(s): | |
| Document type: | Article |
| Journal/Publication title: | Neural Computing and Applications |
| Date: | 8 February 2021 |
| ISSN: | 1433-3058 |
| Subjects: | |
| SDG: | |
| Informal keywords: | TV Broadcasting, Synchronisation, Language Model, Deep Neural Networks, Machine Learning |
| School: | E.T.S.I. de Sistemas Informáticos (UPM) |
| Department: | Sistemas Informáticos |
| Creative Commons license: | Attribution - NoDerivatives - NonCommercial |
Subtitles are a key element in making any media content accessible to people with hearing impairments and to elderly people, but they are also useful when watching TV in a noisy environment or when learning new languages. Most of the time, subtitles are generated manually in advance, building a verbatim, synchronised transcription of the audio. In TV live broadcasts, however, captions are created in real time by a re-speaker with the help of voice recognition software, which inevitably leads to delays and a lack of synchronisation. In this paper, we present Deep-Sync, a tool for the alignment of subtitles with the audio-visual content. The architecture integrates a deep language representation model and real-time voice recognition software to build a semantic-aware alignment tool that successfully aligns most of the subtitles even when there is no direct correspondence between the re-speaker and the audio content. In order to avoid any kind of censorship, Deep-Sync can be deployed directly on users' TVs, introducing a small delay to perform the alignment while avoiding delaying the signal at the broadcaster station. Deep-Sync was compared with another subtitle alignment tool, showing that our proposal improves the synchronisation in all tested cases.
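The core idea described in the abstract, re-timing each delayed subtitle to the most semantically similar segment of the live transcript, can be sketched as follows. This is a minimal illustration, not the authors' implementation: Deep-Sync's deep language representation model is replaced here by a toy bag-of-words cosine similarity, and all function names and example data are hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(a: str, b: str) -> float:
    """Bag-of-words cosine similarity between two text fragments
    (a stand-in for a deep semantic embedding model)."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = sqrt(sum(v * v for v in ca.values()))
    nb = sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0


def align_subtitles(subtitles, asr_segments):
    """For each delayed subtitle (text, broadcast_time), find the
    live ASR segment (text, true_time) whose text is most similar,
    and re-time the subtitle to that segment's timestamp."""
    aligned = []
    for text, _delayed_time in subtitles:
        best_text, best_time = max(
            asr_segments, key=lambda seg: cosine(text, seg[0])
        )
        aligned.append((text, best_time))
    return aligned


# Hypothetical example: subtitles arrive several seconds late,
# while the ASR segments carry the true audio timestamps.
asr = [
    ("good evening and welcome to the news", 0.0),
    ("heavy rain is expected tomorrow", 4.5),
]
subs = [
    ("Good evening and welcome to the news", 6.0),
    ("Heavy rain is expected tomorrow", 10.5),
]
print(align_subtitles(subs, asr))  # re-timed to 0.0 and 4.5
```

A semantic rather than literal match is what lets such a scheme cope with the re-speaker paraphrasing the audio instead of repeating it verbatim, which is the situation Deep-Sync is designed for.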
| Record ID: | 88871 |
|---|---|
| DC identifier: | https://oa.upm.es/88871/ |
| OAI identifier: | oai:oa.upm.es:88871 |
| Scientific Portal URL: | https://portalcientifico.upm.es/es/ipublic/item/9220082 |
| DOI: | 10.1007/s00521-021-05751-y |
| Official URL: | https://link.springer.com/article/10.1007/s00521-0... |
| Deposited by: | iMarina Portal Científico |
| Deposited on: | 05 May 2025 17:24 |
| Last modified: | 05 May 2025 17:24 |