Signal representations for automatic speech recognition (Signalrepresentasjoner for automatisk talegjenkjenning)

FFI Report 2005

About the publication

Report number

2005/01053

ISBN

82-464-0936-0

Format

PDF document

Size

843.4 KB

Language

Norwegian

Marius Gamborg, Frode Lillevold
In this report we give an overview of methods for front-end processing of speech signals for automatic speech recognition (ASR) that are described in the literature. The most common representation of speech in this context seems to be mel-frequency cepstral coefficients (MFCCs) with delta and double-delta coefficients, usually combined with cepstral mean normalization (CMN). Other representations include perceptual linear prediction (PLP) and linear prediction cepstral coefficients (LPCC).
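
As a rough illustration of the post-processing named above, the following sketch applies CMN to a sequence of cepstral frames and then appends delta and double-delta coefficients. It assumes the static coefficients (for example MFCCs) have already been computed. The function names, the regression window `width=2`, and the choice to repeat edge frames are illustrative assumptions, not details taken from the report.

```python
def cepstral_mean_normalize(frames):
    """CMN: subtract each coefficient's mean over all frames of the utterance."""
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    means = [sum(f[k] for f in frames) / n_frames for k in range(n_coeffs)]
    return [[f[k] - means[k] for k in range(n_coeffs)] for f in frames]


def delta(frames, width=2):
    """Regression-based time derivative over +/- width frames.

    Frames beyond the start or end are replaced by the first or last frame
    (an assumed edge-handling choice).
    """
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(n_frames):
        row = []
        for k in range(n_coeffs):
            acc = 0.0
            for n in range(1, width + 1):
                nxt = frames[min(t + n, n_frames - 1)][k]
                prv = frames[max(t - n, 0)][k]
                acc += n * (nxt - prv)
            row.append(acc / denom)
        out.append(row)
    return out


def add_dynamic_features(frames, width=2):
    """Return CMN-normalized static, delta, and double-delta coefficients per frame."""
    static = cepstral_mean_normalize(frames)
    d1 = delta(static, width)
    d2 = delta(d1, width)  # double-delta is the delta of the delta
    return [s + a + b for s, a, b in zip(static, d1, d2)]


# Hypothetical input: 5 frames with 2 cepstral coefficients each.
example_frames = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0], [5.0, 10.0]]
features = add_dynamic_features(example_frames)
```

Each output frame is three times as long as the input frame: the normalized static coefficients come first, followed by their deltas and double-deltas.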
