Invited Tutorial at EDM 2023 - Data Efficient Machine Learning for Educational Content Creation

EDM 2023

Abstract

Machine learning has found several use cases in educational applications. In particular, neural machine translation (NMT) systems are socially significant, with the potential to make information accessible to a diverse set of users in multilingual societies. NMT systems have helped translate audio, video, and textual content into vernacular languages, aiding both students and teachers. However, translating higher-education and technical textbooks and courses requires MT systems to adhere to the lexicon of the source and target domains. In this tutorial, we provide insights from our translation ecosystem (https://udaanproject.org), which has helped translate hundreds of diploma and engineering books, each into more than 11 Indian languages.

Date
Jul 14, 2023 9:00 AM — 12:30 PM
Location
IISc Bangalore, India

Invited Tutorial: Data Efficient Machine Learning for Educational Content Creation

Co-presented with Prof. Ganesh Ramakrishnan at the 16th International Conference on Educational Data Mining (EDM 2023), IISc Bangalore.

Tutorial Overview

This half-day tutorial showcased practical applications of machine learning in education, specifically focusing on neural machine translation (NMT) for making educational content accessible across multilingual societies.

Key Topics Covered:

  • Data-efficient NMT techniques for low-resource educational content

  • Domain-specific translation challenges in technical/higher education textbooks

  • UDAAN translation ecosystem - our production system that has:

    • Translated 100+ diploma and engineering books
    • Supported 11+ Indian languages
    • Empowered 100+ professional translators
    • Received Presidential recognition
  • Lexicon adherence and terminology consistency in technical translation

  • Post-editing workflows and human-in-the-loop systems

  • Real-world deployment insights from large-scale educational content translation
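To make the lexicon-adherence topic above concrete, here is a minimal sketch of a glossary check on a translated sentence. It is an illustration only and not the UDAAN implementation: the function name `missing_terms`, the three-entry English-to-Hindi glossary, and the example sentences are all assumptions made for this example. It uses plain substring matching, which a real system would likely replace with morphology-aware matching.

```python
def missing_terms(source, translation, glossary):
    """Return (source_term, target_term) pairs where the source term occurs
    in `source` but the required target term is absent from `translation`.

    Hypothetical helper for illustration; matching is naive substring search.
    """
    src = source.lower()
    violations = []
    for src_term, tgt_term in glossary.items():
        if src_term.lower() in src and tgt_term not in translation:
            violations.append((src_term, tgt_term))
    return violations


# Hypothetical domain glossary (English -> Hindi physics terms).
glossary = {"force": "बल", "velocity": "वेग", "energy": "ऊर्जा"}

source = "The force changes the velocity of the body."
good = "बल पिंड के वेग को बदलता है।"   # uses both glossary terms
bad = "बल पिंड की गति को बदलता है।"    # uses "गति" instead of the glossary term "वेग"

print(missing_terms(source, good, glossary))
print(missing_terms(source, bad, glossary))
```

A check like this could flag sentences for a post-editor to review. It cannot confirm that a term is used correctly in context, only that the expected target term appears.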

Impact

The UDAAN project demonstrates how ML can bridge language barriers in education, enabling access to technical knowledge for millions of students across India. This work received the Best Paper Award at CODS-COMAD 2023.

The tutorial materials share insights from translating hundreds of technical books across diverse Indian languages.

Ayush Maheshwari
Sr. Solutions Architect at NVIDIA
PhD in NLP/ML from CSE, IITB

My research interests include machine learning, NLP and machine translation.