Data Sparsity and Unsupervised Training of the Classifier-Based Speech Translation A Novel Approach for Addressing Data Sparsity in Classifier-Based Speech Translation
Main Article Content
Authors
Abstract
We focused on speech recognition jobs that need huge volumes of tagged voice data yet arechallenging to gather. Both academic study and practical applications make use of domain-adaptive andsemi-supervised learning techniques. Algorithms from the field of machine learning may be utilised forunsupervised learning, which entails studying and categorising data sets that have not been labelled. It ishypothesized that the accuracy of the suggested strategy depends on the size of the n-best lists. The trialsemployed n-best lists with sizes of 100, 500, 1000, and 2000 to make these findings. Thus, theseunsupervised algorithms are able to find patterns in data without any external supervision. This studymade use of a dataset that was compiled throughout the duration of the Transonics project. In addition,the output vocabulary may benefit greatly from employing a more robust SMT engine. For this purpose,we have adopted a strategy for determining how far apart two statements are conceptually, and asuitable clustering algorithm. To deal with the problem of sparse data, researchers have developed anovel approach that use statistical machine translation software to train classifiers
Downloads
Download data is not yet available.
Article Details
Section
Articles