About me
I'm a Ph.D. student in Computer Science at Georgia Tech, where I'm fortunate to be advised by Prof. Judy Hoffman. My research focuses on developing multimodal vision-language models that integrate spatial, semantic, and temporal reasoning with minimal supervision.
Recent work includes:
A 7B open-source VLM for open-vocabulary 3D scene graph generation, under review at NeurIPS 2025.
SkyScenes, a synthetic aerial dataset for improving real-world segmentation, accepted at ECCV 2024.
Generalist Multimodal LLM, where I designed a jointly-trained vision-audio model that outperforms larger generalist systems by reducing cross-modal interference.
I bring prior experience in domain generalization, zero-shot learning, and synthetic-to-real adaptation, focusing on making models robust to diversity, correlation, and semantic shifts in real-world environments. My goal is to build generalizable systems that require minimal labeled data yet remain reliable under distribution shifts.
I also review for top conferences (NeurIPS, CVPR, ECCV), and have published across both vision and language communities.
I'll be attending CVPR 2025. If you're around and working in similar areas, I'd love to connect!
💼 I'm currently looking for research internships for Summer 2026. Feel free to reach out if you're hiring!
Recent Updates
[ 💡: Research Paper ]
Attending CVPR 2025 in Nashville!
Attending ECCV 2024 in Milan, Italy! Georgia Tech published an article about our work!
💡 Jul 1, 2024: My first first-author paper, SkyScenes: A Synthetic Dataset for Aerial Scene Understanding, was accepted at ECCV 2024!
Apr 1, 2024: Joining Georgia Tech for Ph.D. CS under Prof. Judy Hoffman in Fall 2024.
Mar 12, 2024: Serving as a reviewer for ECCV 2024.
💡 Oct 24, 2023: My first main-conference paper, LatentDR: Improving Model Generalization Through Sample-Aware Latent Degradation and Restoration, was accepted at WACV 2024!
Attending NeurIPS 2022 in New Orleans, LA, USA. My first in-person conference!
Apr 4, 2022: Admitted to the MS CS program of Georgia Tech for Fall 2022!
Publications
2024
- ECCV 2024 [First first-author paper!]
- SkyScenes: A Synthetic Dataset for Aerial Scene Understanding
- Sahil Khose*, Anisha Pal*, Aayushi Agarwal*, Deepanshi*, Judy Hoffman, Prithvijit Chattopadhyay
- Website | HF🤗 | Paper | CoC Article | GitHub
- WACV 2024 [First main-conference paper!]
- LatentDR: Improving Model Generalization Through Sample-Aware Latent Degradation and Restoration
- Ran Liu, Sahil Khose, Jingyun Xiao, Lakshmi Sathidevi, Keerthan Ramnath, Zsolt Kira, Eva L. Dyer
2022
- NeurIPS 2022 [Visited my first in-person conference in New Orleans!]
- [Poster] Continual VQA for Disaster Response Systems at CCAI
- ICML 2022
- [Best Paper Award] An Efficient Modern Baseline for FloodNet VQA at New In ML
- ACL 2022
2021
- NeurIPS 2021
- [Spotlight Paper] Semi-Supervised Classification and Segmentation on High Resolution Aerial Images at CCAI
- XCI-Sketch: Extraction of Color Information from Images for Generation of Colored Outlines and Sketches at [Oral] New in ML, [Paper] CtrlGen, [Paper] ML4CD, and [Poster] DGM
- [Poster] A Studious Approach to Semi-Supervised Learning at ICBINB
- NAACL 2021