Harshvardhan Vatsa

I'm currently a Machine Learning Engineer at Mecha Systems. Most recently, I've been working on:

  • LLM post-training, fine-tuning, and deployment on edge devices
  • Multimodal computer vision systems
  • Generative models such as Stable Diffusion and other text-to-image systems

I also do AI research; my current interests are mainly computer vision and generative modelling.

Some of the most interesting topics I have come across (and would love to contribute to someday) are:

  • VLMs (Vision Language Models)
  • NeRF (Neural Radiance Fields)
  • Vision Mamba
  • The Mamba architecture in general

I also write blog posts. Feel free to check them out here.

Please say hi to me on Twitter (I mainly use it to connect with fellow researchers and other people).

You can email me at harshvardhanvatsa@gmail.com for any collaborations or enquiries.

I am planning to do an MS and a PhD in the future. For now, I am learning new things and trying to be the best version of myself.

Email  /  Resume  /  LinkedIn  /  GitHub  /  Twitter


Machine Learning Research

Investigating Transformer-Based Architectures for Efficient Multi-Class Dental Disease Classification

Harshvardhan Vatsa, Hitesh Shivkumar, Bhavya Bhardwaj, Dr. Shalini L.

Primarily trained ViT and MobileViT models, reaching 92.21% and 93.81% accuracy, respectively, on 11,653 dental images spanning 6 disease categories. The conclusion was that MobileViT slightly outperforms the traditional ViT on a small dataset.

Wrote the code and set up the whole pipeline for preprocessing and training. The paper is currently under review and is expected to be published soon!

Open‑Source Contributions

Open Climate Fix

  • Analysed the TZ‑SAM solar dataset and fixed issues in the forecast predictor pipeline.
  • Modified Dockerfile and pyproject.toml to resolve Docker build failures.

Unsloth AI

  • Converted and published GGUF files for FLUX.1‑Kontext‑dev to improve accessibility.
  • Released files and docs to the repo; the Hugging Face release has 100k+ lifetime downloads.

Experience

Mecha Systems | Machine Learning Engineer Intern — May 2025–Present (Remote)

  • Built a Linux Command Assistant Agent for Mecha Comet using a fine‑tuned Llama 3.2 (1B) model.
  • Curated an 8,000‑command dataset focused on Debian and Mechanix OS for improved prediction and reliability.

Monkhub Innovations | Machine Learning Intern — Feb 2025–Jul 2025 (Gurugram, Haryana, India)

  • Fine‑tuned Stable Diffusion with LoRA for an AI fashion try‑on pipeline.
  • Built a YOLOv11 wildlife detection pipeline with real‑time deployment and feedback loops.
  • Created a Hindi–Hinglish conversational helpline using fine‑tuned Whisper ASR and a RAG architecture.

Projects

Steel Defect Detection
PyTorch, Computer Vision — Mar 2025

  • Developed a U‑Net architecture from scratch in PyTorch, featuring an encoder‑decoder with skip connections for precise pixel‑wise defect detection on steel surfaces.
  • Implemented a combined Dice‑BCE loss to address class imbalance and created a custom data pipeline using PIL and NumPy for efficient preprocessing of the Severstal dataset.
  • Attained strong performance metrics: 88.13% pixel accuracy, 0.7038 mean Dice coefficient, and 0.5841 mean IoU.

English–Swedish Neural Machine Translation
PyTorch, NLP — Jun 2025

  • Built a complete Transformer architecture from scratch with multi‑head attention and positional encoding.
  • Trained a custom tokenizer for English–Swedish translation and implemented a full training/validation pipeline.

This website was produced from a template made by Jon Barron.