Enhancing Low-Resource Code-Mixed Translation with SMoL Architectural Inductive Biases
Abstract
In multilingual communities, code-mixed speech, in which speakers switch and blend two or more languages within a single sentence, is increasingly common, and it poses a serious challenge for machine translation systems, especially in low-resource settings. Addressing the particular complexities of code-mixed text, such as frequent switching between languages, non-standard grammar, and mixed scripts, requires more than data scaling alone; it calls for informed architectural design. This paper introduces a novel approach that incorporates the inductive biases of the SMoL (Small Models with Large Contexts) architecture, combining hierarchical attention, sparse routing, switch-aware encoders, and multitask auxiliary supervision to improve code-mixed sentence translation under limited labeled data. Experiments on Hindi-English, Spanish-English, and Tamil-English code-mixed test sets demonstrate that the proposed SMoL-based models outperform strong baselines such as the Transformer and multilingual BART, achieving notable gains in BLEU scores as well as human-rated fluency and adequacy. These findings show that architectural innovation can substantially alleviate data scarcity and linguistic diversity in code-mixed machine translation, opening promising avenues for future research in low-resource multilingual NLP.
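Two of the inductive biases named above, switch-aware encoding and sparse routing, can be illustrated with a minimal sketch. The code below is a hypothetical, simplified illustration (not the paper's implementation): language-ID embeddings are added to token embeddings so the encoder sees where code-switches occur, and a top-1 gate routes each token to a single small expert. All names, sizes, and the random weights are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

D, E = 16, 4  # hidden size and number of experts (illustrative values)

# Switch-aware encoding: one embedding per language ID (0 = matrix
# language, 1 = embedded language), added to each token embedding.
lang_emb = rng.normal(size=(2, D))
W_gate = rng.normal(size=(D, E))      # router weights for sparse routing
experts = rng.normal(size=(E, D, D))  # one small feed-forward weight per expert

def encode(tokens_emb, lang_ids):
    """Sketch of a switch-aware, sparsely routed encoder layer."""
    h = tokens_emb + lang_emb[lang_ids]   # inject switch-point information
    logits = h @ W_gate
    chosen = logits.argmax(axis=-1)       # top-1 routing: one expert per token
    out = np.empty_like(h)
    for t, e in enumerate(chosen):
        out[t] = np.tanh(h[t] @ experts[e])  # only the selected expert runs
    return out, chosen

# A 5-token code-mixed sentence with a mid-sentence language switch.
tokens = rng.normal(size=(5, D))
lang_ids = np.array([0, 0, 1, 1, 0])
out, routes = encode(tokens, lang_ids)
print(out.shape, routes)
```

Because only one expert fires per token, total compute stays close to that of a single small model, which is the appeal of sparse routing in low-resource settings.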