Teaching AI Anatomy for Colorectal Cancer Surveillance
As an abdominal radiologist, Kang Wang, MD, PhD, noticed a disconnect in his daily work between how quickly he could determine whether a cancer patient's disease is progressing, responding to treatment, or stable, and how long it took to document those findings.
"It probably takes me less than five minutes to have an overall impression of how this patient is doing," Wang explained to Yang Yang, PhD. "But it takes me at least another 15 to 20 minutes to draft the report describing all the findings in this patient with extensive disease to provide detailed context for the oncologist."
This inefficiency isn't trivial. With the volume of cancer CT follow-ups ever rising, could artificial intelligence help automate the routine parts of reporting while preserving the expertise radiologists bring to complex interpretations?
Yang had the technical foundation to explore this question. He built RadImageNet, a database of 1.35 million expert-labeled medical images and a crucial resource for teaching AI systems to recognize patterns in scans. He connected Wang, who understood the clinical need intimately, with Zongwei Zhou, PhD, at Johns Hopkins University, a leader in developing methods to train AI with less human effort. This cross-institutional team secured a $2.8 million R01 grant from the National Institute of Biomedical Imaging and Bioengineering. Their four-year project aims to develop an AI system that can automatically detect, track, and report on metastatic colorectal cancer across multiple CT scans over time.
The Data Challenge
Most AI development in medical imaging hits the same bottleneck: teaching AI to recognize abnormalities requires experts to manually mark and label thousands of images. Experts must outline every tumor, identify every organ, annotate every abnormality. Most research teams exhaust their resources after labeling a few hundred images, which limits how well AI can learn the full spectrum of disease appearances.
Wang, Yang, and Zhou designed an innovative workaround by using what radiologists already produce: written reports. UCSF's system contains over 207,000 CT scans from 22,000 colorectal cancer patients, each with a corresponding radiology report. By using large language models (similar to ChatGPT) to automatically extract key information from these reports, the team can teach their imaging AI system using tens of thousands of examples instead of just hundreds.
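To make the report-mining step concrete, here is a minimal sketch of what extracting structured labels from a radiology report via a large language model could look like. The prompt wording, the JSON schema, and all function names are illustrative assumptions, not the team's actual pipeline; the LLM call itself is represented only by a hypothetical reply string.

```python
import json

# Illustrative prompt; the team's real prompt and schema are not public.
PROMPT_TEMPLATE = (
    "List every finding in this radiology report as a JSON array of objects "
    "with 'organ', 'finding', and 'status' "
    "(one of 'new', 'stable', 'progressing', 'responding').\n\nReport:\n{report}"
)

def parse_llm_reply(reply: str) -> list[dict]:
    """Turn the model's JSON reply into (organ, finding, status) labels,
    dropping any entries that lack the required fields."""
    findings = json.loads(reply)
    required = {"organ", "finding", "status"}
    return [f for f in findings if required <= f.keys()]

# A reply an LLM might return for one follow-up report (hypothetical).
reply = json.dumps([
    {"organ": "liver", "finding": "1.8 cm segment VII lesion",
     "status": "stable"},
    {"organ": "periportal lymph node", "finding": "enlarged",
     "status": "progressing"},
])
labels = parse_llm_reply(reply)
```

Run over tens of thousands of archived reports, a parser like this would turn free-text impressions into weak labels for the imaging model, which is what lets the team sidestep manual pixel-level annotation.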
Teaching AI to Think Like a Radiologist
The key idea was to teach the AI anatomy first, then pathology, the same way medical students learn.
Wang explained, "Help the AI understand the anatomy on CT first. Then the AI can understand what the report means when they say 'enlarged periportal lymph node.' It's looking at a lymph node around the main portal vein near the hilum, rather than processing the entire CT volume."
Their AI system learns in two stages. First, it masters normal anatomy throughout the abdomen. Then it learns to recognize disease by connecting radiologists' written descriptions with visual abnormalities in the images. This "anatomy-first" approach makes the AI far more efficient. Instead of searching an entire CT scan for problems, it can focus on specific locations mentioned in previous reports.
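The efficiency gain from the anatomy-first stage can be sketched in a few lines: once a stage-one model has labeled each voxel with its anatomy, a reported finding like "enlarged periportal lymph node" only needs to be searched for inside the matching region. Everything below is a toy illustration under that assumption, not the team's implementation.

```python
# Hedged sketch of the "anatomy-first" idea: restrict the search for a
# reported finding to the voxels of the anatomy the report names,
# instead of scanning the whole CT volume. All names are illustrative.

def region_of_interest(anatomy_map, organ):
    """Voxel coordinates the stage-1 model labeled as the given organ."""
    return {vox for vox, label in anatomy_map.items() if label == organ}

def find_candidates(abnormal_voxels, anatomy_map, reported_organ):
    """Keep only abnormal voxels inside the organ the report mentions."""
    roi = region_of_interest(anatomy_map, reported_organ)
    return sorted(v for v in abnormal_voxels if v in roi)

# Toy 2-voxel-by-2-voxel "volume": coordinate -> anatomy label (assumed
# output of the stage-1 anatomy model).
anatomy_map = {(0, 0): "liver", (0, 1): "liver",
               (1, 0): "portal_lymph_node", (1, 1): "bowel"}
# Voxels a stage-2 model flags as abnormal (assumed).
abnormal = {(0, 1), (1, 0)}

# A report mentioning a portal lymph node narrows the search to one voxel.
candidates = find_candidates(abnormal, anatomy_map, "portal_lymph_node")
```

The same filtering logic also explains the training benefit: pairing a report phrase with a small anatomical region gives the model a far more specific learning signal than pairing it with the entire scan.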
Beyond Development: Real-World Testing
The team now plans to conduct a rigorous prospective study with three UCSF radiologists who will interpret colorectal cancer CT scans both with and without AI assistance. UCSF's status as a referral center provides a unique opportunity. Radiologists here regularly re-interpret CT scans from outside hospitals, which means the team can test their AI on images from diverse scanners and institutions. And while the team’s initial focus is colorectal cancer metastases to the liver, lymph nodes, bones, and lining of the abdominal cavity, the underlying methods of anatomy-first learning and report-based training can be adapted to virtually any cancer where repeated imaging tracks disease over time.
Reducing Burden, Improving Care
For radiologists, this AI assistance can help focus their expertise on interpretation rather than documentation. For patients, it can mean earlier detection of subtle disease changes and more timely treatment adjustments. The system will also generate standardized reports using OR-RADS (Oncologic Response Reporting and Data System), providing clear, consistent language about whether disease is progressing, responding, or stable and making reports easier for both oncologists and patients to understand.