BENIGN OVERFITTING WITH RETRIEVAL AUGMENTED MODELS
Authors
Assylbekov, Zhenisbek
Tezekbayev, Maxat
Nikoulina, Vassilina
Gallé, Matthias
Publisher
Nazarbayev University School of Sciences and Humanities
Abstract
Although modern deep neural networks can memorize (almost) the entire training set, they still generalize well to unseen data, contradicting traditional learning theory. This phenomenon --- dubbed benign overfitting --- has so far been studied theoretically only in simplified settings. At the same time, ML practitioners (especially in NLP) have figured out how to exploit this feature for more efficient training: retrieval-augmented models (e.g., kNN-LM, RETRO) explicitly store (part of) the training data in an external datastore and thus try to (partially) offload memorization from the neural network. In this paper we link these apparently separate threads of research and propose several possible research directions regarding benign overfitting in retrieval-augmented models.
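To make the retrieval-augmentation idea mentioned in the abstract concrete, below is a minimal sketch of the kNN-LM mechanism: training examples are stored as (context embedding, next token) pairs in an external datastore, and at inference time the parametric LM's distribution is interpolated with a distribution over retrieved neighbors. The names lm, embed, next_token_probs, and the hyperparameters k, lam, temp are illustrative assumptions, not taken from the paper.

    # Illustrative sketch of the kNN-LM idea; names and signatures are assumptions.
    import numpy as np

    def build_datastore(lm, training_contexts, next_tokens):
        """Store (context embedding, next token) pairs from the training set."""
        keys = np.stack([lm.embed(c) for c in training_contexts])  # context vectors
        values = np.array(next_tokens)                              # observed next tokens
        return keys, values

    def knn_lm_probs(lm, context, keys, values, vocab_size, k=8, lam=0.25, temp=1.0):
        """Interpolate the parametric LM distribution with a kNN distribution
        computed over the stored training examples."""
        q = lm.embed(context)
        dists = np.linalg.norm(keys - q, axis=1)      # distance to every stored key
        nn = np.argsort(dists)[:k]                    # k nearest training contexts
        weights = np.exp(-dists[nn] / temp)
        weights /= weights.sum()
        p_knn = np.zeros(vocab_size)
        for w, v in zip(weights, values[nn]):
            p_knn[v] += w                             # mass on retrieved next tokens
        p_lm = lm.next_token_probs(context)           # standard softmax distribution
        return lam * p_knn + (1 - lam) * p_lm         # kNN-LM interpolation

In this view, the datastore carries part of the memorization load explicitly, which is exactly the connection to benign overfitting that the paper proposes to study.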
Creative Commons license
Except where otherwise noted, this item's license is described as CC0 1.0 Universal
