🔥 Chinese AI Agent Beats Medical Imaging SOTA by 35%, Without Touching the Model
What if AI could read medical scans the way doctors actually think: step by step, not just a single glance?
Most medical AI today works like a one-shot guessing game: look at the scan, output a mask, done. The problem? You have to rebuild the model for every new imaging type, and every modification chips away at the AI's ability to reason.
Researchers from Zhejiang University and Shanghai AI Lab just flipped the script.
Their new system, IBISAgent, treats medical image segmentation as a multi-step reasoning problem. Instead of modifying the model or adding special tokens, it teaches the AI to analyze, point, call external tools, and verify, iterating until it is confident.
Think of it as giving the AI a doctor's thinking process rather than just a doctor's eyes.
🎯 The results speak for themselves:
- +35% IoU improvement over the best medical AI baselines
- Works across 5 imaging modalities (including CT, MRI, and pathology) with one model
- 80% accuracy on 7 cancer types it was never trained on
- Just 4.26 steps on average to reach a conclusion
- Trained on 456K high-quality reasoning trajectories
The elegant part? The base multimodal model stays completely intact. IBISAgent uses MedSAM2 as an external tool and learns when and where to call it through reinforcement learning, preserving full language reasoning while gaining surgical precision.
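For intuition, here is a minimal sketch of what such an analyze-point-segment-verify loop could look like. The names (`run_agent`, `propose`, `segment_with_points`) are hypothetical placeholders, not the actual IBISAgent or MedSAM2 interfaces; in the real system, the decision of when to call the tool and when to stop is learned via reinforcement learning.

```python
# Minimal sketch of an iterative "analyze -> point -> segment -> verify" agent loop.
# All interfaces here (MultimodalLM-style `llm`, `seg_tool`, Step fields) are
# hypothetical stand-ins, not the published IBISAgent or MedSAM2 APIs.

from dataclasses import dataclass

@dataclass
class Step:
    reasoning: str   # free-text analysis produced by the language model
    points: list     # (x, y) point prompts the model proposes for the tool
    done: bool       # whether the model is confident enough to stop

def run_agent(image, question, llm, seg_tool, max_steps=8):
    """Iterate: the language model reasons about the scan, proposes point
    prompts, an external segmentation tool turns them into a mask, and the
    model inspects the result before deciding to refine or stop."""
    mask, history = None, []
    for _ in range(max_steps):
        # Analyze the image and current mask, then point at regions of interest.
        step = llm.propose(image=image, question=question,
                           current_mask=mask, history=history)
        if step.points:
            # Tool call: the external segmenter converts points into a mask.
            mask = seg_tool.segment_with_points(image, step.points)
        history.append(step)
        if step.done:
            # Verify: the model judged the mask good enough to terminate.
            break
    return mask, history
```

The base model's weights never change here; it only emits text and point prompts, which is why its general language-reasoning ability is preserved.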
Already accepted at CVPR 2026, this work signals a shift: the future of medical AI isn't bigger models, it's smarter agents.
🔗 Source
qbitai