EDM 2023
Machine learning has found several use cases in educational applications. In particular, neural machine translation (NMT) systems are socially significant, with the potential to make information accessible to a diverse set of users in multilingual societies. NMT systems have helped translate audio, video, and textual content into vernacular languages, aiding both students and teachers. However, translating higher-education and technical textbooks and courses requires MT systems to adhere to the lexicon of the source and target domains. In this tutorial, we provide insights from our translation ecosystem (https://udaanproject.org), which has helped translate hundreds of diploma and engineering books, each into more than 11 Indian languages.
Invited Tutorial: Data Efficient Machine Learning for Educational Content Creation
Co-presented with Prof. Ganesh Ramakrishnan at the 16th International Conference on Educational Data Mining (EDM 2023), IISc Bangalore.
This half-day tutorial showcased practical applications of machine learning in education, specifically focusing on neural machine translation (NMT) for making educational content accessible across multilingual societies.
Data-efficient NMT techniques for low-resource educational content
Domain-specific translation challenges in technical/higher education textbooks
UDAAN translation ecosystem - our production system that has helped translate hundreds of diploma and engineering books, each into more than 11 Indian languages
Lexicon adherence and terminology consistency in technical translation
Post-editing workflows and human-in-the-loop systems
Real-world deployment insights from large-scale educational content translation
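To make the lexicon-adherence topic above concrete, here is a minimal sketch of a glossary check that a post-editing workflow might run on a translated sentence. It is an illustration only and not the UDAAN implementation: the function names, the glossary entries, and the example sentences are hypothetical, and plain substring matching is assumed, which ignores inflection and tokenization.

```python
from typing import Dict, List


def lexicon_violations(source: str, translation: str, lexicon: Dict[str, str]) -> List[str]:
    """Return source terms whose prescribed target term is missing from the translation.

    Assumption: case-insensitive substring matching is good enough for a sketch.
    A real system would need morphology-aware matching.
    """
    src = source.lower()
    tgt = translation.lower()
    missing = []
    for src_term, tgt_term in lexicon.items():
        # A term is only checked if it actually occurs in the source sentence.
        if src_term.lower() in src and tgt_term.lower() not in tgt:
            missing.append(src_term)
    return missing


def adherence_rate(source: str, translation: str, lexicon: Dict[str, str]) -> float:
    """Fraction of applicable glossary terms that were translated as prescribed."""
    src = source.lower()
    applicable = [term for term in lexicon if term.lower() in src]
    if not applicable:
        return 1.0  # nothing to enforce for this sentence
    violations = lexicon_violations(source, translation, lexicon)
    return 1.0 - len(violations) / len(applicable)


# Hypothetical toy glossary (English -> Hindi) and sentence pair.
glossary = {"velocity": "वेग", "acceleration": "त्वरण", "force": "बल"}
source_sentence = "Velocity and acceleration are vectors."
machine_output = "वेग और गति सदिश हैं।"  # "acceleration" is not rendered as the glossary term

print(lexicon_violations(source_sentence, machine_output, glossary))
print(adherence_rate(source_sentence, machine_output, glossary))
```

In a human-in-the-loop setting, sentences with a non-empty violation list could be flagged for a post-editor instead of being corrected automatically.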
The UDAAN project demonstrates how ML can bridge language barriers in education, enabling access to technical knowledge for millions of students across India. This work received the Best Paper Award at CODS-COMAD 2023.
Tutorial materials and insights from translating hundreds of technical books across diverse Indian languages.