Jieneng Chen

I'm a fifth-year Ph.D. candidate in Computer Science at Johns Hopkins University, advised by Distinguished Professor Alan L. Yuille. I was named a Siebel Scholar, Class of 2025.

My research focuses on artificial intelligence, computer vision, multimodality, medical AI, and embodied AI. I work with Prof. Daniel Khashabi, Prof. Tianmin Shu, and Prof. Rama Chellappa on many topics.

I am on the job market for 2025! I would love to chat if you are interested, and I am happy to give talks on my research at related seminars.


               

Research Themes
My research is driven by a desire to understand the computational foundations of human-level and expert-level intelligence. I strive to replicate human intelligence by developing general-purpose spatial and embodied intelligence, and to build expert-level intelligence to address challenges like health inequity and climate change.

  • Spatial Intelligence: my vision is grounded in interpreting 3D spatial configurations and advancing spatial reasoning capabilities. (1) Develop vision models capable of predicting the 3D orientation and geometry of objects from monocular images and videos. (2) Benchmark and bridge the significant gap in 3D spatial reasoning capabilities between humans and GPT models.
  • Embodied Intelligence: 🧠 I am developing human-like multi-sensory embodied agents and working towards addressing critical challenges in embodied AI. (1) Develop embodied large multimodal models through a vision-focused, spatially aware redesign. (2) Build a generative world explorer to address the challenge of planning with partial observations. (3) Equip embodied agents with 3D spatial intelligence in the dynamic physical world.
  • Medical Intelligence: my vision is grounded in foundational algorithms and scalable, clinically relevant applications to address health inequity 🙏. (1) Algorithms: I developed TransUNet, pioneering the Transformer era in medical image analysis. (2) Clinical applications: I developed the world's first AI model capable of detecting and diagnosing eight major cancers. Once a cancer was detected, I monitored its prognosis using multi-scale spatial analysis in CT and IHC. (3) Spatial and embodied intelligence hold immense potential to revolutionize patient care through homecare agents.
News
  • Invited talk at Cognitive Science Brown Bag Talk (slides).
  • Awarded an NVIDIA Academic Grant.
  • Co-organizing the CVPR'25 Workshop on Generative Models for Computer Vision. Call for submissions!
  • Two papers on spatial intelligence were selected as CVPR'25 Highlights (top 3%)!
  • I was selected to present my latest research at the CVPR'25 Doctoral Consortium.
  • Genex will be presented at ICLR'25 and in the CVPR'25 Demo Track. Congrats to TaiMing for winning the Michael J. Muuss Research Award and for being named a finalist for CRA's Outstanding Undergraduate Researcher Award!
  • I have designed a new course 'Machine Imagination' for undergraduates at JHU in 2025.
  • Selected as a Siebel Scholar, which recognizes me as one of the leading PhD students in bioengineering at JHU and globally.
  • Our paper TransUNet is listed as one of the 15 most-cited 2021 papers across all AI fields (the most-cited one, AlphaFold, has won the Nobel Prize).
  • Our paper SwinUNet is listed as one of the top 3 most-cited ECCV papers of the past five years in Google Scholar Metrics.
Recent Publications (last year)

    Full list on Google Scholar Profile (over 13,000 citations).


    SpatialLLM: A Compound 3D-Informed Design towards Spatially-Intelligent Large Multimodal Models.
    Wufei Ma, Luoxin Ye, Celso Miguel de Melo, Alan Yuille, Jieneng Chen.

    In Conference on Computer Vision and Pattern Recognition (CVPR), Highlight 🏆 (top 3%), 2025.
    Coming soon.



    Spatial457: A Diagnostic Benchmark for 6D Spatial Reasoning of Large Multimodal Models.
    Xingrui Wang, Wufei Ma, Tiezheng Zhang, Celso Miguel de Melo, Jieneng Chen†, Alan Yuille†.

    In Conference on Computer Vision and Pattern Recognition (CVPR), Highlight 🏆 (top 3%), 2025.
    Paper | Code | HuggingFace Data Card 🤗



    GenEx: Generating an Explorable World.
    Taiming Lu (Ugrad), Tianmin Shu, Junfei Xiao, Luoxin Ye, Jiahao Wang, Cheng Peng, Chen Wei, Daniel Khashabi, Rama Chellappa, Alan Yuille, Jieneng Chen.

    In Conference on Computer Vision and Pattern Recognition (CVPR) Demo Track, 2025.

    Turn a single image into a 3D world adventure. Discover the magic within.
    Paper | Project



    Generative World Explorer.
    Taiming Lu (Ugrad), Tianmin Shu, Alan Yuille, Daniel Khashabi, Jieneng Chen.

    In International Conference on Learning Representations (ICLR), 🤗 Top 1 Hugging Face Daily Papers, 2025.

    Humans can see and explore the unseen. Read our paper to learn how.
    Paper | Code | Project | Video



    3DSRBench: A Comprehensive 3D Spatial Reasoning Benchmark.
    Wufei Ma, Haoyu Chen, Guofeng Zhang, Celso Miguel de Melo, Alan Yuille, Jieneng Chen.

    Technical Report, 2024.

    Guess what? The state-of-the-art GPT is way off, scoring less than 50% accuracy compared to humans' near-perfect 95%.
    Paper | Data | Project



    Efficient Large Multi-modal Models via Visual Context Compression.
    Jieneng Chen, Luoxin Ye, Ju He, Zhaoyang Wang, Daniel Khashabi, Alan Yuille.

    In Neural Information Processing Systems (NeurIPS), 2024.
    Paper | Code | Project


    Designing Scalable Vision Models in the Vision-Language Era.
    Jieneng Chen, Qihang Yu, Xiaohui Shen, Alan Yuille, Liang-Chieh Chen.

    In Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
    Paper | Code | 🤗 HuggingFace | timm | open_clip


    TransUNet: Rethinking the U-Net Architecture Design for Medical Image Segmentation through the Lens of Transformers.
    Jieneng Chen, Jieru Mei, Xianhang Li, Yongyi Lu, Qihang Yu, Qingyue Wei, Xiangde Luo, Yutong Xie, Ehsan Adeli, Yan Wang, Matthew P Lungren, Shaoting Zhang, Lei Xing, Le Lu, Alan Yuille, Yuyin Zhou.

    In Medical Image Analysis (MedIA), 2024.

    ICML-W 2021 | Journal | Code
    Top downloaded article on ScienceDirect 🏆 (all-time).
    The arXiv version is among the 15 most-cited 2021 papers across all AI fields (cited 5K times as of 2024). [Source]


    Teaching and Mentoring
    • Instructor: I designed and taught the undergraduate course Machine Imagination at JHU in 2025.
    • Serving: I serve as an invited reviewer and program committee member for major conferences and journals, such as CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR, AAAI, TPAMI, TMI, MICCAI, and CogSci.
    • Mentoring: I am fortunate to have mentored super talented undergraduate, master's, and visiting students at JHU (some from underrepresented groups). Several from the 2024 cohort have gone on to pursue top CS PhD programs at institutions including CMU, JHU, Princeton, Northwestern, and Oxford.