Jieneng Chen

I'm a final-year Ph.D. candidate in Computer Science at Johns Hopkins University, advised by Dist. Prof. Alan L. Yuille and co-advised by Prof. Daniel Khashabi and Dist. Prof. Rama Chellappa. I was named a Siebel Scholar, Class of 2025. My Ph.D. work on neural architectures has become influential, with over 15,000 citations.

One snapshot of myself on campus is enough for me 🧠 (or our neighbour deer 🦌 and squirrel 🐿️) to rebuild the entire space in 3D, explore every unseen nook, and test my intuitions about the world.

Why can't machines match that? What still separates them from a squirrel 🐿️? I aim to bridge the gap: enabling machines to model the world's meaning, shape, and dynamics straight from raw data, giving machines the base they need for multimodal perception, reasoning, and interaction.

AI must serve people, so I also tackle cancer AI's “unattainable triangle” of accessibility, accuracy, and generalizability to lay the foundation for equitable, personalized cancer care worldwide.

Super happy to chat more! I'd love to collaborate or present my research in related seminars.

News

invited talk at NSF IAIFI for physics & AI in Boston.
received the Best Paper Award at KDD 2025 Health Day. Grateful to the NIH PDs for their votes and to CCC and NIH for the prize.
awarded an NVIDIA Academic Grant.
selected for the CVPR 2025 Doctoral Consortium with an NSF travel award.
invited talk at ICLR 2025 Workshop on Embodied Intelligence with Large Language Models In Open City Environment (slides).
invited talk at Chemical and Biomolecular Engineering.
invited talk at the Cognitive Science Brown Bag series.
co-organized CVPR'25 workshop on Generative Models for Computer Vision.
selected as a Siebel Scholar, a recognition as a leading scholar in bioengineering at JHU and globally.
TransUNet is listed among the top 15 most-cited 2021 papers across all AI fields (the top-ranked paper, AlphaFold, went on to win a Nobel Prize).
SwinUNet is listed among the top 3 most-cited ECCV papers of the past five years in Google Scholar Metrics.

Recent Projects

Full list on Google Scholar Profile. ☆ denotes visiting undergraduate / graduate mentees.

	GenEx: Generating an Explorable World. TaiMing Lu ☆, Tianmin Shu, Alan Yuille, Daniel Khashabi, Jieneng Chen. ICLR, 2025. Turn a single image into a 3D world adventure. Embodied agents refine their beliefs by predicting unseen parts of the physical world. Paper (OpenReview) \| Blog \| Project Website
	Medical World Model: Generative Simulation of Tumor Evolution for Treatment Planning. Yijun Yang ☆, Zhao-Yang Wang, Qiuping Liu, Shuwen Sun, Kang Wang, Rama Chellappa, Zongwei Zhou, Alan Yuille, Lei Zhu, Yu-Dong Zhang, Jieneng Chen. ICCV, 2025. Envision precision medicine via generative world modeling. Paper \| Code \| Project
	4D-Animal: Freely Reconstructing Animatable 3D Animals from Videos. Shanshan Zhong ☆, Jiawei Peng, Zehan Zheng, Zhongzhan Huang, Wufei Ma, Guofeng Zhang, Qihao Liu, Alan Yuille, Jieneng Chen. Technical report, 2025. Paper \| Code
	Vision-Language-Vision Auto-Encoder: Scalable Knowledge Distillation from Diffusion Models. Tiezheng Zhang, Yitong Li, Yu-Cheng Chou, Jieneng Chen, Alan Yuille, Chen Wei, Junfei Xiao. Technical report, 2025. Paper \| Project \| Code \| HuggingFace Data Card
	Spatial457: A Diagnostic Benchmark for 6D Spatial Reasoning of Large Multimodal Models. Xingrui Wang, Wufei Ma, Tiezheng Zhang, Celso Miguel de Melo, Jieneng Chen†, Alan Yuille†. CVPR, Highlight, 2025. Paper \| Code \| HuggingFace Data Card
	SpatialLLM: A Compound 3D-Informed Design towards Spatially-Intelligent Large Multimodal Models. Wufei Ma, Luoxin Ye ☆, Nessa McWeeney, Celso Miguel de Melo, Jieneng Chen, Alan Yuille. CVPR, Highlight, 2025. arXiv
	LLaVolta: Efficient Large Multi-modal Models via Visual Context Compression. Jieneng Chen, Luoxin Ye, Ju He, Zhaoyang Wang, Daniel Khashabi, Alan Yuille. NeurIPS, 2024. Paper \| Code \| Project
	ViTamin: Designing Scalable Vision Models in the Vision-Language Era. Jieneng Chen, Qihang Yu, Xiaohui Shen, Alan Yuille, Liang-Chieh Chen. CVPR, 2024. The first vision-centric design for LMM encoders, with SoTA performance on 60+ multimodal tasks in 2024. Paper \| Code \| 🤗 HuggingFace \| timm \| open_clip
	TransUNet: Rethinking the U-Net Architecture Design for Medical Image Segmentation through the Lens of Transformers. Jieneng Chen, Jieru Mei, Xianhang Li, Yongyi Lu, Qihang Yu, Qingyue Wei, Xiangde Luo, Yutong Xie, Ehsan Adeli, Yan Wang, Matthew P Lungren, Shaoting Zhang, Lei Xing, Le Lu, Alan Yuille, Yuyin Zhou. Medical Image Analysis (MedIA), 2024. ICML-W 2021 \| Journal \| Code \| A top-downloaded ScienceDirect article 🏆 of all time. Among the top 15 most-cited 2021 papers across all AI fields, with 6,000 citations.

Teaching

Instructor: I designed and taught the undergraduate course Machine Imagination at JHU in 2025.

Service

Invited reviewer across communities: computer vision (CVPR, ICCV, ECCV, WACV), deep learning (NeurIPS, ICML, ICLR, AAAI, TPAMI), medical AI (TMI, MICCAI), and CogSci.
Workshop co-organizer for CVPR and MICCAI.

Mentoring

I am fortunate to have mentored super talented undergraduate, master's, and visiting students at JHU.
Students from the 2024-2025 cohort will join top CS Ph.D. programs at institutions including CMU, JHU, Princeton, Northwestern, and Oxford.
TaiMing Lu, a JHU undergraduate on the GenEx project, has received the Michael J. Muuss Research Award and been named a finalist for the CRA Outstanding Undergraduate Researcher Award.