About Me
I'm a second-year MS student at The Ohio State University, focusing on computer vision, machine learning, and multimodal learning (VLMs). Currently, I'm a full-time Computer Vision Engineer Intern at Ubihere, working on multi-camera systems for tracking, re-identification, and spatial analysis. Previously, I was a Graduate Research Assistant at PCVLab under Alper Yilmaz, working on computer vision and multimodal learning for medical imaging. We have a dataset paper in revision at Nature Scientific Data, and our dataset has 6K+ downloads on Hugging Face. I enjoy turning research into real systems and experimenting with new tech; recently I built a robotics app integrating LLMs and computer vision for the Reachy Mini robot. I'm looking for full-time AI/ML roles starting May 2026.
Highlighted Skills


Built a real-time object detection system for German supermarket products using YOLOv8m and webcam input. The system is designed to eventually guide users to the correct storage locations based on recognized items in the kitchen.
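The storage-guidance step can be sketched as a lookup from recognized product classes to kitchen locations; the mapping and `suggest_location` helper below are hypothetical stand-ins (in the real system the labels would come from YOLOv8m detections on webcam frames):

```python
# Hypothetical mapping from recognized German product classes to kitchen
# storage locations; the real system would feed in YOLOv8m detection labels.
STORAGE_LOCATIONS = {
    "milch": "refrigerator, top shelf",
    "brot": "bread box",
    "nudeln": "pantry, middle shelf",
}

def suggest_location(detected_label: str) -> str:
    """Return the storage location for a detected product label."""
    return STORAGE_LOCATIONS.get(detected_label.lower(), "unknown product")

print(suggest_location("Milch"))  # -> refrigerator, top shelf
```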

Built a GPU-powered, LangGraph-orchestrated pipeline that detects, tracks, and re-identifies people across a video, then leverages LLM reasoning to link identities into {global_id, description} entries for identity tracking.
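The identity store can be sketched as a registry of {global_id, description} entries; the `link` method below is a hypothetical simplification where the match decision is passed in directly (in the real pipeline, LLM reasoning over appearance descriptions would decide which existing identity a tracklet matches):

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class IdentityRegistry:
    """Registry of {global_id: description} entries for cross-video tracking."""
    entries: dict = field(default_factory=dict)
    _next_id: int = 0

    def link(self, description: str, matches_existing: Optional[int] = None) -> int:
        """Link a tracklet description to an existing global_id, or mint a new one.
        Here `matches_existing` is supplied directly; the real pipeline would
        derive it from LLM reasoning over the stored descriptions."""
        if matches_existing is not None and matches_existing in self.entries:
            self.entries[matches_existing] += "; " + description
            return matches_existing
        gid = self._next_id
        self._next_id += 1
        self.entries[gid] = description
        return gid

reg = IdentityRegistry()
a = reg.link("red jacket, carrying backpack")                    # new identity
b = reg.link("red jacket, backpack now absent", matches_existing=a)  # merged
```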
Developing a multimodal vision-language model (VLM) for automatic chest X-ray interpretation and clinical-report generation by learning joint image–text representations from the MIMIC-CXR dataset.
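Joint image–text representation learning of this kind is commonly trained with a CLIP-style symmetric contrastive objective; a minimal NumPy sketch (the embeddings here are random stand-ins, not actual MIMIC-CXR features):

```python
import numpy as np

def clip_style_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss: matched image-report pairs lie on the diagonal."""
    # L2-normalize so the dot product becomes cosine similarity
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    logits = img @ txt.T / temperature  # (batch, batch) similarity matrix

    def xent(l):
        # cross-entropy against the diagonal targets, numerically stabilized
        l = l - l.max(axis=1, keepdims=True)
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        idx = np.arange(len(l))
        return -logp[idx, idx].mean()

    # average the image->text and text->image directions
    return 0.5 * (xent(logits) + xent(logits.T))

rng = np.random.default_rng(0)
img, txt = rng.normal(size=(4, 32)), rng.normal(size=(4, 32))
loss = clip_style_loss(img, txt)
```

Perfectly aligned pairs (identical embeddings) drive this loss toward zero, which is the signal the joint representation is trained on.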

Contributed extensively to the paper and experimentation phase, collaborating with a colleague who adapted the initial Lightning-Hydra codebase.
Acknowledgement – Pouyan Navard provided the original Lightning-Hydra template, baseline models (ViT, UNETR, SwinUNETR, V-Net, UNet++, SENet154, 3D ResNet, and 3D UNet), dataset, and pipeline.
Contact Me