Combining pulse-based features for rejecting far-field speech in a HMM-based Voice Activity Detector. Computers & Electrical Engineering (CAEE).

Varela Serrano, Oscar and San Segundo Hernández, Rubén and Hernández, Luis A. (2011). Combining pulse-based features for rejecting far-field speech in a HMM-based Voice Activity Detector. Computers & Electrical Engineering (CAEE). "Computers and Electrical Engineering", v. 37 (n. 4); pp. 589-600. ISSN 0045-7906. https://doi.org/10.1016/j.compeleceng.2011.04.005.

Description

Title: Combining pulse-based features for rejecting far-field speech in a HMM-based Voice Activity Detector. Computers & Electrical Engineering (CAEE).
Author/s:
  • Varela Serrano, Oscar
  • San Segundo Hernández, Rubén
  • Hernández, Luis A.
Item Type: Article
Journal/Publication Title: Computers and Electrical Engineering
Date: July 2011
ISSN: 0045-7906
Volume: 37
Subjects:
Faculty: E.T.S.I. Telecomunicación (UPM)
Department: Ingeniería Electrónica
Creative Commons Licenses: Attribution - No derivative works - Non commercial

Abstract

Several computational techniques for speech recognition have been proposed in recent years. These techniques represent an important improvement for real-time applications in which a speaker interacts with a speech recognition system. Although researchers have proposed many methods, none of them solves the high false-alarm problem that arises when far-field speakers interfere in a human-machine conversation. This paper presents a two-class (speech and non-speech) decision-tree-based approach for combining new speech pulse features in a VAD (Voice Activity Detector) in order to reject far-field speech in speech recognition systems. The decision tree is applied to the speech pulses obtained by a baseline VAD composed of a frame feature extractor, an HMM-based (Hidden Markov Model) segmentation module, and a pulse detector. The paper also presents a detailed analysis of a large number of features for discriminating between close and far-field speech. The detection error obtained with the proposed VAD is lower than that of the other well-known VADs evaluated.
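
The pipeline named in the abstract (frame-level segmentation, pulse detection, then a decision over per-pulse features) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not the paper's method: the frame labels stand in for the output of the HMM-based segmentation, the per-frame energies and the two pulse features (duration and mean energy) are hypothetical, and the hand-written rule with made-up thresholds stands in for a trained decision tree.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Pulse:
    start: int          # first frame index of the pulse (inclusive)
    end: int            # frame index just past the pulse (exclusive)
    mean_energy: float  # hypothetical feature: average frame energy
    max_energy: float   # hypothetical feature: peak frame energy

    @property
    def duration(self) -> int:
        return self.end - self.start


def detect_pulses(labels: Sequence[int], energies: Sequence[float]) -> List[Pulse]:
    """Group consecutive speech frames (label 1) into pulses.

    `labels` is assumed to come from an upstream frame-level segmenter.
    """
    pulses: List[Pulse] = []
    start = None
    # A trailing 0 acts as a sentinel so a pulse ending on the last frame is closed.
    for i, label in enumerate(list(labels) + [0]):
        if label == 1 and start is None:
            start = i
        elif label != 1 and start is not None:
            segment = list(energies[start:i])
            pulses.append(
                Pulse(start, i, sum(segment) / len(segment), max(segment))
            )
            start = None
    return pulses


def is_close_speech(pulse: Pulse, min_frames: int = 3,
                    min_mean_energy: float = 0.5) -> bool:
    """Two-level rule standing in for a trained decision tree.

    Both thresholds are illustrative values, not taken from the paper.
    """
    if pulse.duration < min_frames:
        return False
    return pulse.mean_energy >= min_mean_energy


def reject_far_field(labels: Sequence[int],
                     energies: Sequence[float]) -> List[Pulse]:
    """Keep only the pulses that the rule accepts as close speech."""
    return [p for p in detect_pulses(labels, energies) if is_close_speech(p)]


if __name__ == "__main__":
    labels = [0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1]
    energies = [0.1, 0.9, 0.8, 0.9, 0.7, 0.1, 0.1, 0.2, 0.3, 0.2, 0.1, 0.9]
    for pulse in reject_far_field(labels, energies):
        print(pulse.start, pulse.end, pulse.duration)
```

In this toy input the low-energy pulse and the single-frame pulse are both rejected, which mirrors the idea of classifying whole pulses rather than individual frames. A real system would learn the tree and its thresholds from labeled data.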

More information

Item ID: 8863
DC Identifier: http://oa.upm.es/8863/
OAI Identifier: oai:oa.upm.es:8863
DOI: 10.1016/j.compeleceng.2011.04.005
Official URL: http://www.sciencedirect.com/science/journal/00457906
Deposited by: Memoria Investigacion
Deposited on: 23 Sep 2011 10:15
Last Modified: 20 Apr 2016 17:30