A study of continuous space word and sentence representations applied to ASR error detection - LIUM - Language and Speech Technology Team
Journal article, Speech Communication, Year: 2020

A study of continuous space word and sentence representations applied to ASR error detection

Abstract

This paper presents a study of continuous word representations applied to the automatic detection of speech recognition errors. A neural network architecture is proposed that is well suited to handling continuous word representations such as word embeddings. We explore the use of several types of word representations: simple and combined linguistic embeddings, and acoustic embeddings associated with prosodic features extracted from the audio signal. To compensate for certain phenomena highlighted by the analysis of the average error span, we propose to model errors at the sentence level through the use of sentence embeddings. An approach to building continuous sentence representations dedicated to ASR error detection is also proposed and compared to the Doc2vec approach. Experiments are performed on automatic transcriptions generated by the LIUM ASR system applied to the French ETAPE corpus. They show that combining linguistic embeddings, acoustic embeddings, prosodic features, and sentence embeddings with more classical features yields very competitive results. In particular, these results show the complementarity of acoustic embeddings and prosodic information, and show that the proposed sentence embeddings dedicated to ASR error detection achieve better results than generic sentence embeddings.
Main file
A_study_of_continuous_space_word_and_sentence_representations_applied_to_ASR_error_detection-4.pdf (623.74 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02501943, version 1 (08-03-2020)

Identifiers

  • HAL Id: hal-02501943, version 1

Cite

Sahar Ghannay, Yannick Estève, Nathalie Camelin. A study of continuous space word and sentence representations applied to ASR error detection. Speech Communication, 2020. ⟨hal-02501943⟩