Ayush Maheshwari

Senior Solutions Architect at NVIDIA
Foundation Models, NLP, and AI for Science

Biography

🎉 Update: PhD completed in July 2024 • Currently at NVIDIA • Delivered lectures on Systems View of AI at IIT Kharagpur (Spring 2026)

I am Ayush Maheshwari (आयुष माहेश्वरी), a Senior Solutions Architect at NVIDIA, where I work with the NVAITC India team.

My work focuses on multilingual and domain-specific foundation models, AI for Science, and research collaborations with academic and scientific institutions.

At NVIDIA, I focus on:

Foundation Models for Indian languages and scientific domains
AI for Science, including geospatial foundation models and air-pollution modeling
Research Collaborations with academic institutions and research organizations to advance impactful AI applications

My work spans AI solution architecture, applied research, and community enablement through technical workshops, teaching, and collaborations.

I also served as Adjunct Faculty at IIT Kharagpur, where I delivered lectures in the Deep Learning course during the Spring 2026 semester to a class of 150+ students. These lectures focused on the systems perspective of large-scale AI development, including training, inference, and deployment of modern deep learning systems.

Previously, I completed my PhD at CSE, IIT Bombay under the guidance of Prof. Ganesh Ramakrishnan. I was fortunate to be supported by the Ekal Fellowship during my PhD.

My research interests include natural language processing, machine learning, graph-based learning, and multilingual AI. I have worked on constrained neural machine translation, semi-supervised and unsupervised learning, and data programming.

During my PhD, I was a key member of the neural machine translation project UDAAN, which helps publishers translate technical content into Indian languages at scale. The project is open source and has been used by several Indian government technical education agencies and official language departments.

In my spare time, I enjoy playing table tennis and cricket, and reading about Indian culture, the Ramayana, and the Mahabharata.

Download my resumé
(Last updated: April 2026)

Interests

Foundation Models
Large Language Models
Natural Language Processing
AI for Science
Human-in-the-loop AI
Neural Machine Translation
Machine Learning
Information Retrieval

Education

PhD in Computer Science, Jan 2019 - Aug 2023 (Defended July 2024)

Indian Institute of Technology Bombay

Updates

[Dec 25] Paper on benchmarking LLMs on extremely low-resource Indic languages! [IndicParam: Benchmark to evaluate LLMs on low-resource Indic Languages]
[May 25] Our paper on domain-aware lexicon generation is accepted at ACL Main Conference 2025 ❤️ 😊 [LexGen: Domain-aware Multilingual Lexicon Generation]
[Jan 25] Our paper on synthetic data generation, ARISE: Iterative Rule Induction and Synthetic Data Generation for Text Classification, is accepted at NAACL 2025 🎆🎇 [Paper]
[Nov 24] Delighted to announce that my PhD work has been awarded the Impactful Research Award 2023 by IIT Bombay ❤️ 🎇!
[Oct 24] Joined NVIDIA 😌! Excited to be a part of the future of AI 😇
[Sep 24] Our paper on Dictionary Constrained Disambiguation for Improved NMT is accepted at EMNLP 2024 [Paper] ❤️ 😌

Click here for updates archive

Experience

Senior Solutions Architect

NVIDIA

Oct 2024 – Present Gurugram

Part of NVIDIA AI Technology Center (NVAITC) India, driving research engagements and strategic collaborations.

Responsibilities include:

Leading enterprise AI solution architecture and deployment strategies
Collaborating with academic and research institutions on cutting-edge AI projects
Providing technical expertise on NVIDIA GPU acceleration, large language models, and AI infrastructure
Architecting scalable AI solutions for diverse industry verticals
Mentoring teams on best practices for AI/ML model development and deployment
Led “LLM Development from Scratch” workshop series attended by 1,000+ participants from IITs, CDAC, and AI startups
Industry keynote on Mamba/SSM architectures at PREMI 2025, IIT Delhi (100+ attendees)
3-hour hands-on LLM workshop at Conference on Scientific ML, IISc Bangalore (150+ attendees)
10+ sessions at CDAC, IITs, SAIL, and Thapar University on NVIDIA AI stack

Adjunct Faculty

IIT Kharagpur

Jan 2026 – May 2026 Kharagpur

Delivered lectures in the Deep Learning course (Spring 2026) to a class of 150+ students, focusing on the systems perspective of large-scale AI development — covering training infrastructure, inference optimization, and deployment of deep learning models at scale. (Slides)

Research Scientist

Vizzhy Inc

Sep 2023 – Sep 2024 Bengaluru

Led a team of 5 people to build Indic large language models from scratch.
Developed data collection and processing pipelines for training and evaluation.
Trained the tokenizer and designed the model training architecture.
Trained the model on a large Intel Gaudi-2 cluster.
Performed instruction tuning and preference training of the pre-trained models.

Research Intern

Adobe Research

May 2021 – Aug 2021 Bengaluru

Worked on prototyping a new service for Adobe PDF in the legal domain.

Responsibilities included:

Modeling of the problem
Designing, developing and prototyping using ML
Deployment and Demonstration

Project Engineer

IIT Bombay

Jan 2016 – Dec 2018 Mumbai

Developed software solutions for security agencies

System Engineer

Tata Consultancy Services

Oct 2011 – Jul 2013 Mumbai

Projects

UDAAN - An NMT pipeline + Post-editing tool to translate documents (Best Paper Award at CODS-COMAD 2023)

An end-to-end Machine Translation and post-editing platform that has translated >50 books across 10 Indian languages. 100+ professional translators actively use UDAAN to upload documents, obtain MT output, and edit translations. Received Presidential recognition and Best Paper Award at CODS-COMAD 2023. Includes >100 digitized dictionaries freely available from CSTT.

SPEAR - Programmatically label and quickly build training data

A widely adopted open-source Python library for programmatic data labeling with 100+ GitHub stars. SPEAR reduces data labeling effort by implementing cutting-edge data programming approaches (Snorkel, ImplyLoss, Learning to Reweight). Integrates semi-supervised learning for efficient training and inference. Featured at EMNLP 2021.

Temples of India

Temples of India is a not-for-profit knowledge platform that aims to document and store as many details as possible about temples across the Indian subcontinent. We aim to present each detail related to a temple, such as its location, images, videos, and opening and closing timings.

Awards & Honors

Impactful Research Award 2023

IIT Bombay Nov 2024

Awarded for impactful research contributions during PhD at IIT Bombay

Best Paper Award

CODS-COMAD 2023 Jan 2023

Our paper on the translation post-editing tool (UDAAN) won the Best Paper Award at CODS-COMAD 2023

Ekal Fellowship for PhD Research

Ekal Foundation Jan 2019 – Aug 2024

Funded by Ekal Fellowship throughout PhD research at IIT Bombay

1:1 Mentoring & Consultations

Connect with me on Topmate

💡 Book a 1:1 Session with Me

I offer mentoring sessions for PhD students, researchers, and professionals working in:

🎓 PhD Applications & Research
🤖 NLP & Machine Learning
💼 Career Guidance
📝 Paper Reviews & Feedback

📅 Book a Session on Topmate →

📧 For collaborations or speaking opportunities, reach out via the contact section below

Featured Publications

See all publications →

Complete publication list also available on Google Scholar.

Ayush Maheshwari, Kaushal Sharma, Vivek Patel, Aditya Maheshwari

December, 2025

IndicParam: Benchmark to evaluate LLMs on low-resource Indic Languages

TL;DR: Comprehensive benchmark to evaluate LLMs on low-resource Indic languages. Dataset publicly available on HuggingFace. Addresses critical gap in multilingual NLP evaluation.

Ayush Maheshwari, Atul Kumar Singh, Karthika NJ, Krishnakant Bhat, Preethi Jyothi, Ganesh Ramakrishnan

June, 2025 In ACL Main Conference 2025

LexGen: Domain-aware Multilingual Lexicon Generation

TL;DR: Domain-aware multilingual lexicon generation for 6 Indian languages across 8 domains using routing-based architecture. Released benchmark with 75K+ translation pairs. Accepted at ACL Main Conference 2025.

Yaswanth M, Vaibhav Singh, Ayush Maheshwari, Amrith Krishna, Ganesh Ramakrishnan

March, 2025 In Findings of NAACL 2025

ARISE: Iterative Rule Induction and Synthetic Data Generation for Text Classification

TL;DR: ARISE iteratively induces rules and generates synthetic data for text classification via bootstrapping. Outperforms complex methods like contrastive learning across diverse domains and languages. Published at NAACL 2025 Findings.

Ayush Maheshwari, Preethi Jyothi, Ganesh Ramakrishnan

November, 2024 In Findings of EMNLP 2024

DictDis: Dictionary Constrained Disambiguation for Improved NMT

TL;DR: DictDis disambiguates between multiple dictionary candidate translations in lexically constrained NMT. Achieves 2-3 BLEU point improvements across regulatory, finance, engineering, and health domains. Published at EMNLP 2024 Findings.

Ayush Maheshwari, Ashim Gupta, Amrith Krishna, Atul Kumar Singh, Ganesh Ramakrishnan, G. Anil Kumar, Jitin Singla

March, 2024 In LREC-COLING 2024

Sāmayik: A Benchmark and Dataset for English-Sanskrit Translation

TL;DR: First comprehensive benchmark and dataset for English-Sanskrit machine translation. Addresses critical gap in classical language NLP. Published at LREC-COLING 2024.

Ayush Maheshwari, Ajay Ravindran, Venkatapathy Subramanian, Ganesh Ramakrishnan

January, 2023 Best Paper Award In CODS-COMAD 2023

UDAAN - Machine Learning based Post-Editing tool for Document Translation

TL;DR: A production-ready MT post-editing tool used by 100+ translators to translate technical content into Indian languages. Won Best Paper Award at CODS-COMAD 2023. Impact: The first batch of engineering books translated using UDAAN was released by the President of India.

Durga S, Ayush Maheshwari, Pradeep Shenoy, Prathosh AP, Ganesh Ramakrishnan

November, 2022 In AAAI 2023

Reweighing auxiliary losses in supervised learning

TL;DR: AMAL learns instance-specific weights using meta-learning to optimally combine auxiliary losses in supervised learning. Provides significant gains in knowledge distillation and rule denoising. Published at AAAI 2023.

Ayush Maheshwari, Oishik Chatterjee, Krishnateja Killamsetty, Ganesh Ramakrishnan, Rishabh Iyer

August, 2021 In Findings of ACL 2021

Semi-Supervised Data Programming with Subset Selection

TL;DR: SPEAR combines semi-supervised learning with data programming to improve noisy labeling functions. Significantly outperforms state-of-the-art on seven datasets by jointly learning from rules and labeled data. Published at ACL 2021 Findings.

Soumya Chatterjee, Ayush Maheshwari, Ganesh Ramakrishnan, Saketha Nath Jagarlapudi

April, 2021 In EACL 2021

Joint Learning of Hyperbolic Label Embeddings for Hierarchical Multi-label Classification

TL;DR: Jointly learns classifier parameters and hyperbolic label embeddings for hierarchical multi-label classification without assuming known label hierarchy. Achieves state-of-the-art results by capturing hierarchical structure in hyperbolic space. Published at EACL 2021.