Reusing Weights in Subword-Aware Neural Language Models


Publisher

Association for Computational Linguistics

Abstract

The authors introduce methods for reusing subword embeddings and other parameters in subword-aware neural language models. These techniques improve the performance of syllable- and morpheme-aware models while greatly reducing model size. A practical principle is identified: when embedding layers are reused at the output, they should be tied consecutively from the bottom up. The best morpheme-aware model significantly outperforms word-level baselines across languages while using 20–87% fewer parameters.
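As a rough illustration of the kind of weight reuse described in the abstract, the sketch below ties a subword LM's output projection to its input embedding matrix so the same parameters serve both roles. This is a minimal, hypothetical PyTorch example; the layer names, sizes, and single tied layer are assumptions for illustration, not the authors' exact architecture.

```python
# Hypothetical sketch of output-embedding weight tying; layer names and sizes
# are illustrative and do not reproduce the paper's exact model.
import torch
import torch.nn as nn

class TiedSubwordLM(nn.Module):
    def __init__(self, vocab_size: int, emb_dim: int, hidden_dim: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)        # input subword embeddings
        self.rnn = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.proj = nn.Linear(hidden_dim, emb_dim)            # map hidden state back to embedding space
        self.out = nn.Linear(emb_dim, vocab_size, bias=False)
        self.out.weight = self.embed.weight                   # reuse (tie) the embedding matrix at the output

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        h, _ = self.rnn(self.embed(token_ids))
        return self.out(self.proj(h))                         # logits over the subword vocabulary

# Usage: logits for a batch of 2 sequences of length 5 over a 1000-item vocabulary.
model = TiedSubwordLM(vocab_size=1000, emb_dim=64, hidden_dim=128)
logits = model(torch.randint(0, 1000, (2, 5)))
print(logits.shape)  # torch.Size([2, 5, 1000])
```

Tying the output matrix to the input embeddings removes one of the model's largest parameter blocks, which is the main source of the size reductions reported in the abstract.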

Citation

Assylbekov Z, Takhanov R (2018). Reusing Weights in Subword-Aware Neural Language Models. In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), pp. 1413–1423. Association for Computational Linguistics. doi:10.18653/v1/N18-1128
