INDEPENDENT LANGUAGE MODELING ARCHITECTURE FOR END-TO-END ASR

dc.contributor.author: Pham, Van Tung
dc.contributor.author: Xu, Haihua
dc.contributor.author: Khassanov, Yerbolat
dc.contributor.author: Zeng, Zhiping
dc.contributor.author: Chng, Eng Siong
dc.contributor.author: Ni, Chongjia
dc.contributor.author: Ma, Bin
dc.contributor.author: Li, Haizhou
dc.date.accessioned: 2022-07-15T08:58:23Z
dc.date.available: 2022-07-15T08:58:23Z
dc.date.issued: 2019
dc.description.abstract: The attention-based end-to-end (E2E) automatic speech recognition (ASR) architecture allows for joint optimization of acoustic and language models within a single network. However, in a vanilla E2E ASR architecture, the decoder sub-network (subnet), which incorporates the role of the language model (LM), is conditioned on the encoder output. This means that the acoustic encoder and the language model are entangled, which prevents the language model from being trained separately on external text data. To address this problem, in this work, we propose a new architecture that separates the decoder subnet from the encoder output. In this way, the decoupled subnet becomes an independently trainable LM subnet, which can easily be updated using the external text data. We study two strategies for updating the new architecture. Experimental results show that: 1) the independent LM architecture benefits from external text data, achieving 9.3% and 22.8% relative character and word error rate reduction on the Mandarin HKUST and English NSC datasets, respectively; 2) the proposed architecture works well with an external LM and generalizes to different amounts of labelled data.
dc.identifier.uri: http://nur.nu.edu.kz/handle/123456789/6448
dc.language.iso: en
dc.publisher: arxiv
dc.rights: Attribution-NonCommercial-ShareAlike 3.0 United States
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/3.0/us/
dc.subject: Type of access: Open Access
dc.subject: Independent language model
dc.subject: low-resource ASR
dc.subject: pre-training
dc.subject: fine-tuning
dc.subject: catastrophic forgetting
dc.title: INDEPENDENT LANGUAGE MODELING ARCHITECTURE FOR END-TO-END ASR
dc.type: Article
workflow.import.source: science
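The abstract's central idea — a decoder whose LM subnet conditions only on the token history, with the encoder output fused in afterwards — can be illustrated with a minimal sketch. This is a hypothetical toy in plain numpy, not the paper's implementation: the class names, sizes, and the mean-of-embeddings "state" are illustrative assumptions chosen only to show the separation of concerns.

```python
import numpy as np

rng = np.random.default_rng(0)
V, H, D = 10, 8, 6  # vocab size, LM hidden size, encoder feature size

class LMSubnet:
    """Text-only subnet: depends solely on previous output tokens,
    never on the encoder, so it can be trained on external text."""
    def __init__(self):
        self.emb = rng.normal(size=(V, H)) * 0.1
        self.out = rng.normal(size=(H, V)) * 0.1

    def hidden(self, tokens):
        # Toy recurrent "state": mean of the history's embeddings.
        return self.emb[tokens].mean(axis=0)

    def logits(self, tokens):
        # Standalone next-token scores: usable as an ordinary LM.
        return self.hidden(tokens) @ self.out

class Fusion:
    """Combines the LM state with acoustic context from the encoder;
    only this part depends on the speech input."""
    def __init__(self):
        self.w = rng.normal(size=(H + D, V)) * 0.1

    def logits(self, lm_state, acoustic_ctx):
        return np.concatenate([lm_state, acoustic_ctx]) @ self.w

lm, fusion = LMSubnet(), Fusion()
history = [1, 4, 2]               # previously emitted token ids
enc_ctx = rng.normal(size=D)      # stand-in for attention-pooled encoder output

text_only_scores = lm.logits(history)                    # text-only path
asr_scores = fusion.logits(lm.hidden(history), enc_ctx)  # full ASR path
```

Because `LMSubnet` never touches `enc_ctx`, its parameters can be updated from text-only corpora (the paper's "external text data") while `Fusion` and the encoder are left untouched — which is what the entangled vanilla decoder does not allow.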

Files

Original bundle

Name: 1912.00863.pdf
Size: 988.74 KB
Format: Adobe Portable Document Format
Description: article