Please use this identifier to cite or link to this item: http://dspace.library.iitb.ac.in/jspui/handle/100/1623

Title: Using likelihood L-statistics to measure confidence in audio-visual speech recognition
Authors: GHOSH, A
VERMA, A
SARKAR, A
Issue Date: 2001
Publisher: IEEE
Citation: 2001 IEEE FOURTH WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING, 27-32
Abstract: This paper describes recent work on decision fusion in audiovisual speech recognition. In this work, a novel approach is proposed to combine audio and video channel information in an audiovisual speech recognition scenario. We consider the frame-level phonetic classification problem using two single-stream Gaussian Mixture Models. Audio and video streams are adaptively weighted using a cumulative mean of the sample confidence values over past frames in addition to the present sample confidence value. The confidence values for audio and video decisions are computed using an L-statistic (a linear combination of order statistics) of log-likelihoods against phone models. Experiments on a database of about 15000 sentences from large-vocabulary continuous speech show that the proposed approach yields better classification accuracy than other approaches.
URI: http://dx.doi.org/10.1109/MMSP.2001.962707
http://dspace.library.iitb.ac.in/xmlui/handle/10054/15563
http://hdl.handle.net/100/1623
ISBN: 0-7803-7025-2
Appears in Collections: Proceedings papers
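
The fusion scheme outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the L-statistic coefficients, the rule for combining the present confidence with the cumulative mean, or the fusion formula. The "best score minus mean of the rest" coefficients, the `mix` parameter, the normalised weight, and all function names below are assumptions chosen only to make the idea concrete.

```python
from typing import List, Sequence, Tuple


def l_statistic(log_likelihoods: Sequence[float], coeffs: Sequence[float]) -> float:
    """Linear combination of order statistics (scores sorted in descending order)."""
    ordered = sorted(log_likelihoods, reverse=True)
    return sum(a * x for a, x in zip(coeffs, ordered))


def top_vs_rest_coeffs(n: int) -> List[float]:
    """Assumed coefficients: best score minus the mean of the other n - 1 scores."""
    return [1.0] + [-1.0 / (n - 1)] * (n - 1)


def fuse_frames(
    audio_ll: Sequence[Sequence[float]],
    video_ll: Sequence[Sequence[float]],
    mix: float = 0.5,  # assumed blend between present and past confidence
) -> List[Tuple[float, int]]:
    """Return, for each frame, (audio weight, index of the chosen phone model)."""
    n = len(audio_ll[0])
    coeffs = top_vs_rest_coeffs(n)
    sum_a = sum_v = 0.0  # running sums of past confidence values
    results = []
    for t, (a, v) in enumerate(zip(audio_ll, video_ll)):
        c_a = l_statistic(a, coeffs)
        c_v = l_statistic(v, coeffs)
        if t == 0:
            s_a, s_v = c_a, c_v  # no past frames yet
        else:
            # Blend the present confidence with the cumulative mean of past frames.
            s_a = mix * c_a + (1 - mix) * sum_a / t
            s_v = mix * c_v + (1 - mix) * sum_v / t
        sum_a += c_a
        sum_v += c_v
        total = s_a + s_v
        w_a = s_a / total if total > 0 else 0.5  # fall back to equal weights
        fused = [w_a * x + (1 - w_a) * y for x, y in zip(a, v)]
        results.append((w_a, max(range(n), key=fused.__getitem__)))
    return results
```

Calling `fuse_frames(audio_ll, video_ll)` with one list of per-phone log-likelihoods per frame for each stream returns the audio weight and the fused decision for every frame. With these assumed coefficients, a stream whose best score stands well clear of the others receives the larger weight.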

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

 
