Enhancing Low-Resource Code-Mixed Translation with SMoL Architectural Inductive Biases
Abstract
In multilingual communities, code-mixed speech, in which speakers switch and blend two or more languages within a single sentence, is increasingly common, and it poses a serious challenge for machine translation systems, especially in low-resource settings. Addressing the particular complexities of code-mixed text, such as frequent switching between languages, non-standard grammar, and mixed scripts, requires more than data scaling alone; it calls for informed architectural design. This paper introduces a novel approach that incorporates the inductive biases of the SMoL (Small Models with Large Contexts) architecture, combining hierarchical attention, sparse routing, switch-aware encoders, and multitask auxiliary supervision to improve code-mixed sentence translation under limited labeled data. Experiments on Hindi-English, Spanish-English, and Tamil-English code-mixed test sets demonstrate that the proposed SMoL-based models outperform strong baselines such as the Transformer and multilingual BART, achieving notable gains in BLEU scores as well as human-rated fluency and adequacy. These findings show that architectural innovation can substantially alleviate data scarcity and linguistic diversity in code-mixed machine translation, opening promising avenues for future research in low-resource multilingual NLP.
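Two of the inductive biases named above, switch-aware encoding and sparse routing, can be illustrated with a minimal sketch. The code below is a hypothetical, simplified illustration (not the paper's implementation): language-ID embeddings are added to token embeddings so the encoder sees where code-switches occur, and a top-1 gate routes each token to a single small expert. All names, sizes, and the random weights are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

D, E = 16, 4  # hidden size and number of experts (illustrative values)

# Switch-aware encoding: one embedding per language ID (0 = matrix
# language, 1 = embedded language), added to each token embedding.
lang_emb = rng.normal(size=(2, D))
W_gate = rng.normal(size=(D, E))      # router weights for sparse routing
experts = rng.normal(size=(E, D, D))  # one small feed-forward weight per expert

def encode(tokens_emb, lang_ids):
    """Sketch of a switch-aware, sparsely routed encoder layer."""
    h = tokens_emb + lang_emb[lang_ids]   # inject switch-point information
    logits = h @ W_gate
    chosen = logits.argmax(axis=-1)       # top-1 routing: one expert per token
    out = np.empty_like(h)
    for t, e in enumerate(chosen):
        out[t] = np.tanh(h[t] @ experts[e])  # only the selected expert runs
    return out, chosen

# A 5-token code-mixed sentence with a mid-sentence language switch.
tokens = rng.normal(size=(5, D))
lang_ids = np.array([0, 0, 1, 1, 0])
out, routes = encode(tokens, lang_ids)
print(out.shape, routes)
```

Because only one expert fires per token, total compute stays close to that of a single small model, which is the appeal of sparse routing in low-resource settings.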