Research Themes
My research is driven by a desire to understand the computational foundations of human-level and expert-level intelligence. I strive to replicate human intelligence by developing general-purpose spatial and embodied intelligence, and to build expert-level intelligence to address challenges like health inequity and climate change.
Spatial Intelligence:
My vision is grounded in interpreting 3D spatial configurations and advancing spatial reasoning capabilities.
(1) develop vision models capable of predicting the 3D orientation and geometry of objects from monocular images and videos.
(2) benchmark and bridge the (significant) gap in 3D spatial reasoning capabilities between humans and GPT models.
Embodied Intelligence:
🧠 I am developing human-like multi-sensory embodied agents and working towards addressing critical challenges in embodied AI.
(1) develop embodied large multimodal models through a vision-focused and spatially aware redesign.
(2) build a generative world explorer to address the challenge of planning with partial observations.
(3) equip embodied agents with 3D spatial intelligence in the dynamic physical world.
Medical Intelligence:
My vision is grounded in foundational algorithms and scalable, clinically relevant applications to address health inequity 🙏.
(1) algorithms: I developed TransUNet, pioneering the Transformer era in medical image analysis.
(2) clinical applications: I developed the world's first AI model capable of detecting and diagnosing eight major cancers.
Once a cancer was detected, I monitored its prognosis using multi-scale spatial analysis in CT and IHC.
(3) Spatial and embodied intelligence hold immense potential to revolutionize patient care through homecare agents.
I will be offering a brand new course 'Machine Imagination' for students of all levels at JHU in Winter 2025.
Selected as a Siebel Scholar, an award recognizing exceptional students from the world’s leading graduate schools.
My algorithms have been hosted in the world-renowned timm and open_clip.
Our paper TransUNet is listed as one of the top 15 cited 2021 papers in all AI fields (the top 1, AlphaFold, has won the Nobel Prize).
Our paper TransFG is listed as one of the top 3 most influential AAAI 2023 papers.
Our paper SwinUNet is listed as one of the top 3 most cited ECCV papers in five years in Google Metrics.
Recent Publications (last year)
Full list on Google Scholar Profile (10+ first-authored, an h-index of 22, with over 11,000 citations in three years).
GenEx: Generating an Explorable World
Taiming Lu (Ugrad),
Tianmin Shu,
Junfei Xiao,
Luoxin Ye,
Jiahao Wang,
Cheng Peng,
Chen Wei,
Daniel Khashabi,
Rama Chellappa,
Alan Yuille,
Jieneng Chen
Technical Report, 2024
Turn a single image into a 3D world adventure. Discover the magic within.
Paper |
Project
|
Generative World Explorer
Taiming Lu (Ugrad),
Tianmin Shu,
Alan Yuille,
Daniel Khashabi,
Jieneng Chen
Technical Report, 🤗 #1 Hugging Face Daily Papers, 2024
Humans can see and explore the unseen. Read our paper to learn how.
Paper |
Code |
Project | Video
|
3DSRBench: A Comprehensive 3D Spatial Reasoning Benchmark
Wufei Ma,
Haoyu Chen,
Guofeng Zhang,
Celso Miguel de Melo,
Alan Yuille,
Jieneng Chen
Technical Report, 2024
Guess what? Gemini-2 is way off, scoring less than 50% accuracy compared to humans' near-perfect 95%.
Paper |
Data |
Project
|
Efficient Large Multi-modal Models via Visual Context Compression
Jieneng Chen,
Luoxin Ye,
Ju He,
Zhaoyang Wang,
Daniel Khashabi,
Alan Yuille
In Neural Information Processing Systems (NeurIPS), 2024
Paper |
Code |
Project
|
Designing Scalable Vision Models in the Vision-Language Era
Jieneng Chen,
Qihang Yu,
Xiaohui Shen,
Alan Yuille,
Liang-Chieh Chen
In Conference on Computer Vision and Pattern Recognition (CVPR), 2024
Paper |
Code |
🤗 HuggingFace | timm | open_clip
|
VM-Gait: Virtual Marker-Driven 3D Representation for Multi-Modal Gait Recognition
Zhaoyang Wang,
Jiang Liu,
Jieneng Chen†,
Rama Chellappa†
To appear in Winter Conference on Applications of Computer Vision (WACV), 2025
IARPA program
|
TransUNet: Rethinking the U-Net Architecture Design for Medical Image Segmentation through the Lens of Transformers
Jieneng Chen,
Jieru Mei,
Xianhang Li,
Yongyi Lu,
Qihang Yu,
Qingyue Wei,
Xiangde Luo,
Yutong Xie,
Ehsan Adeli,
Yan Wang,
Matthew P Lungren,
Shaoting Zhang,
Lei Xing,
Le Lu,
Alan Yuille,
Yuyin Zhou
Medical Image Analysis (MedIA), 2024
ICML-W 2021 |
Journal |
Code |
One of the top downloaded ScienceDirect articles in 2024 (among articles published all time)
One of the top 15 cited 2021 papers in all AI fields (cited 5K times as of 2024) for the arXiv version [Source]
|
- Instructor: He is the instructor for a course in Winter 2025 at JHU.
- Teaching assistant:
- Guest lecturer
- Serving: He serves as an invited reviewer and program committee member for major conferences and journals, such as CVPR, ICCV, ECCV, NeurIPS, ICML, ICLR, AAAI, TPAMI, TMI and MICCAI. He provided mentoring hours for PhD students and underrepresented students at JHU.