Study of the parametrization of the speech signal from wavelet representations.

Authors
Publication date
1995
Publication type
Thesis
Summary The parametrization step consists in representing the signal by a reduced, relevant and robust set of parameters. Compared with the short-term Fourier transform, wavelet representations have properties that make them attractive for parametrizing the speech signal. The purpose of our work is to determine what wavelet representations contribute to speech recognition. In order to validate our parametrizations in existing recognition systems, we work within the framework of fixed-size frame analysis. The Morlet wavelet is particularly well suited to the processed signal, because of its adaptable frequency distribution and its time-frequency localization, which reaches the lower bound set by the uncertainty principle. The parametrizations we implemented consist of a single energy coefficient in each frequency band, for each analysis window. Several variants have been tested: average or maximum coefficient, discrete or continuous wavelet decomposition, logarithmic or psychoacoustic frequency scale, synchronous or asynchronous maxima, spectral or pseudo-spectral domain. Our study leads to the conclusion that the wavelet parametrizations implemented are, at best, as robust as MFCC (mel-frequency cepstral coefficients). More precisely, it appears that the operating framework used is too restrictive to bring out the expected contribution of wavelet representations to parametrization. Even if the implemented parametrizations can be improved, the preferred operating framework for wavelet representations remains variable-time analysis, which will require the development of recognition systems with specific architectures.
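To make the "one energy coefficient per frequency band and per analysis window" idea concrete, here is a minimal sketch of one possible variant (continuous Morlet decomposition, logarithmic frequency scale, average or maximum coefficient). It is not the thesis's implementation: the function names, the number of wavelet cycles, the frame length, the hop size, the band frequencies and the test tone are all illustrative assumptions, and the wavelet transform is approximated by a plain NumPy convolution.

```python
import numpy as np

def morlet_kernel(freq_hz, sample_rate, n_cycles=6.0):
    """Complex Morlet wavelet: a Gaussian-windowed complex exponential.

    n_cycles (an assumed value) sets the time/frequency trade-off.
    """
    sigma_t = n_cycles / (2.0 * np.pi * freq_hz)       # Gaussian width in seconds
    half = int(np.ceil(4.0 * sigma_t * sample_rate))   # truncate at 4 sigma
    t = np.arange(-half, half + 1) / sample_rate
    kernel = np.exp(2j * np.pi * freq_hz * t) * np.exp(-t**2 / (2.0 * sigma_t**2))
    return kernel / np.sqrt(np.sum(np.abs(kernel) ** 2))  # unit energy

def wavelet_band_energies(signal, sample_rate, center_freqs,
                          frame_len, hop, reduce="mean"):
    """Return a (frames x bands) array of log energies on fixed-size frames."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    features = np.empty((n_frames, len(center_freqs)))
    for b, freq in enumerate(center_freqs):
        # Wavelet coefficients of this band at every sample position.
        coeffs = np.convolve(signal, morlet_kernel(freq, sample_rate), mode="same")
        power = np.abs(coeffs) ** 2
        for i in range(n_frames):
            frame = power[i * hop : i * hop + frame_len]
            # "average or maximum coefficient" variants of the summary
            features[i, b] = frame.max() if reduce == "max" else frame.mean()
    return np.log(features + 1e-12)  # small offset avoids log(0)

# Usage on a synthetic 800 Hz tone (hypothetical test signal, 8 kHz sampling).
sample_rate = 8000
time = np.arange(int(0.5 * sample_rate)) / sample_rate
tone = np.sin(2.0 * np.pi * 800.0 * time)
center_freqs = np.geomspace(200.0, 3200.0, 5)  # logarithmic frequency scale
feats = wavelet_band_energies(tone, sample_rate, center_freqs,
                              frame_len=200, hop=80)
feats_max = wavelet_band_energies(tone, sample_rate, center_freqs,
                                  frame_len=200, hop=80, reduce="max")
print(feats.shape)
```

Each row of `feats` is one fixed-size analysis window and each column one frequency band, so the array can stand in for a frame-by-frame feature matrix such as the one an MFCC front end would feed to a recognizer. A psychoacoustic scale would only change how `center_freqs` is chosen.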
Topics of the publication
  • No themes identified
Themes detected by scanR from retrieved publications. For more information, see https://scanr.enseignementsup-recherche.gouv.fr