Semantically expanded spoken term detection

dc.contributor.author: Zhanibek Kozhirbayev
dc.contributor.author: Zhandos Yessenbayev
dc.date.accessioned: 2025-08-26T11:25:29Z
dc.date.available: 2025-08-26T11:25:29Z
dc.date.issued: 2024-01-01
dc.description.abstract: Spoken term detection (STD) is effectively implemented using fundamental techniques such as automatic speech recognition (ASR) and information retrieval. Through these methods, queried keywords can be identified in the decoded texts and indexed lattices produced by the ASR system. However, this approach relies heavily on the performance of the ASR; it may not produce the desired results when dealing with out-of-vocabulary (OOV) words that are not included in the ASR's lexicon. To address this limitation, we analyzed the semantic query expansion technique through extensive and reproducible experiments to assess its impact on the search quality for OOV words. We propose an approach to enhance existing spoken content retrieval methods by searching semantically expanded query sets and leveraging the advanced features of search engines. Our experiments, conducted on the Wall Street Journal (WSJ) datasets and top Google frequent queries, demonstrate that the proposed approach significantly improves retrieval accuracy over the traditional word-based STD method for in-vocabulary (IV) terms. Specifically, the Actual Term Weighted Value (ATWV) score improved from 0 to 0.5776 for the trigram query category. Additionally, our approach outperforms the proxy-based method for OOV words. While the proxy-based technique fails to retrieve results for both bigrams and trigrams, the semantic-based approach achieves ATWV scores of 0.7143 and 0.8846 for bigrams and trigrams, respectively. Furthermore, substantial gains are observed when combining semantic-based query expansion with a full-text search engine, improving the performance of the word-based STD system by approximately 3 to 4 times on the bigram and trigram query categories. [en]
dc.identifier.citation: Kozhirbayev Zhanibek, Yessenbayev Zhandos. (2024). Semantically Expanded Spoken Term Detection. IEEE Access. https://doi.org/10.1109/access.2024.3506982 [en]
dc.identifier.doi: 10.1109/access.2024.3506982
dc.identifier.uri: https://doi.org/10.1109/access.2024.3506982
dc.identifier.uri: https://nur.nu.edu.kz/handle/123456789/10271
dc.language.iso: en
dc.publisher: Institute of Electrical and Electronics Engineers (IEEE)
dc.rights: Open access [en]
dc.source: (2024) [en]
dc.subject: Term (time) [en]
dc.subject: Computer science [en]
dc.subject: Natural language processing [en]
dc.subject: Artificial intelligence [en]
dc.subject: Speech recognition [en]
dc.subject: Information retrieval [en]
dc.subject: Quantum mechanics [en]
dc.subject: Physics; type of access: open access [en]
dc.title: Semantically expanded spoken term detection [en]
dc.type: article [en]

Files

Original bundle

Name: 10.1109_ACCESS.2024.3506982.pdf
Size: 1.64 MB
Format: Adobe Portable Document Format
