I am a Ph.D. candidate in Artificial Intelligence at Yonsei University, advised by Professor Seon Joo Kim.
My research focuses on multimodal intelligence, especially reasoning, grounding, and agentic behavior. I develop models and evaluations for systems that reason over visual evidence and make decisions across multiple steps.
Research interests: multimodal reasoning, grounded AI, agentic systems, diagnostic evaluation, and representation learning.
- Open to Research Scientist opportunities starting in Spring 2027.
Microsoft Research, AI Frontiers Redmond, U.S.
Research Scientist Intern Summer 2025
Conducted research on diagnosing multimodal reasoning capabilities in MLLMs (MathLens; arXiv 2025), with the collaboration later continuing into process-level evaluation of web agents (WebStep; 2026).
LG AI Research, AML Seoul, South Korea
Research Scientist Intern Summer 2024
Conducted research on extending multimodal LLMs beyond language-only outputs, including embodied VLA, image generation, and mathematical reasoning (DIST2Loss; ICLR 2026).
Naver, Foundational Research Seongnam, South Korea
Research Scientist Intern Summer 2023
Contributed to HyperCLOVA X LLM development and conducted preliminary exploration on omnimodal language modeling.
Yonsei University Seoul, South Korea
Ph.D. Student Mar. 2024 - Present
Advisor: Seon Joo Kim
Seoul National University Seoul, South Korea
M.Sc. Student Mar. 2020 - Feb. 2023
Advisor: Gunhee Kim