Posts by Collection

portfolio

publications

Renal cell carcinoma detection and subtyping with minimal point-based annotation in whole-slide images

Published in International Conference on Medical Image Computing and Computer-Assisted Intervention, 2020

Automated renal cell carcinoma detection and subtyping is limited by the lack of large whole-slide datasets with precise annotations. This paper proposes a semi-supervised framework built on a minimal point-based annotation strategy, where annotators only mark a few cancerous and non-cancerous points in each slide. The resulting detector and subtype classifier achieve performance comparable to much more heavily annotated alternatives while substantially lowering labeling effort.

Recommended citation: Gao Z, Puttapirat P, Shi J, et al. Renal cell carcinoma detection and subtyping with minimal point-based annotation in whole-slide images[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention. Cham: Springer International Publishing, 2020: 439-448.
Download Paper

Nuclei Grading of Clear Cell Renal Cell Carcinoma in Histopathological Image by Composite High-Resolution Network

Published in International Conference on Medical Image Computing and Computer-Assisted Intervention, 2021

Clear cell renal cell carcinoma grading depends on accurate nuclei segmentation and fine-grained nuclei classification, both of which are challenging in crowded pathological images. This work introduces a composite high-resolution network that first separates clustered nuclei and then performs cross-category grading-aware classification. The framework addresses inter-class similarity in nuclear appearance and improves automated ccRCC grading for pathology analysis.

Recommended citation: Gao Z, Shi J, Zhang X, et al. Nuclei grading of clear cell renal cell carcinoma in histopathological image by composite high-resolution network[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention. Cham: Springer International Publishing, 2021: 132-142.
Download Paper

Instance-based vision transformer for subtyping of papillary renal cell carcinoma in histopathological image

Published in International Conference on Medical Image Computing and Computer-Assisted Intervention, 2021

Papillary renal cell carcinoma subtyping requires recognizing subtle cellular and cell-layer patterns that conventional CNNs struggle to capture in large histopathological images. This paper proposes an instance-based vision transformer that focuses on top informative nuclei-centered instances and models their relationships with positional and grade-aware embeddings. The design learns finer morphological representations and improves performance on fine-grained pRCC subtyping.

Recommended citation: Gao Z, Hong B, Zhang X, et al. Instance-based vision transformer for subtyping of papillary renal cell carcinoma in histopathological image[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention. Cham: Springer International Publishing, 2021: 299-308.
Download Paper

Unsupervised representation learning for tissue segmentation in histopathological images: From global to local contrast

Published in IEEE Transactions on Medical Imaging, 2022

Tissue segmentation requires pixel-level labels that are expensive to obtain in histopathology. This paper develops an unsupervised representation learning framework that moves from global to local contrastive objectives so that the learned features become useful for fine-grained tissue discrimination. By encoding multi-granularity views without annotations, it improves segmentation quality under limited-label conditions.

Recommended citation: Gao Z, Jia C, Li Y, et al. Unsupervised representation learning for tissue segmentation in histopathological images: From global to local contrast[J]. IEEE Transactions on Medical Imaging, 2022, 41(12): 3611-3623.
Download Paper

Uncertainty-based Model Acceleration for Cancer Classification in Whole-Slide Images

Published in 2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2022

Whole-slide image classification is often slowed by the need to process many high-magnification patches across a gigapixel slide. This paper proposes an uncertainty-based acceleration strategy that mimics pathologists by sending only suspicious high-uncertainty regions to expensive high-resolution analysis while handling most regions at low magnification. The framework reduces inference cost and deployment burden without sacrificing the accuracy needed for computational pathology applications.

Recommended citation: Gao Z, Mao A, Wu J, et al. Uncertainty-based Model Acceleration for Cancer Classification in Whole-Slide Images[C]//2022 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE, 2022: 1534-1538.
Download Paper
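
The routing idea summarized in this entry (handle most regions at low magnification, send only uncertain ones to high-resolution analysis) can be sketched as follows. This is a minimal illustration, not the paper's implementation: using predictive entropy as the uncertainty measure, the threshold value, and the names `route_regions`, `classify_slide`, and `high_res_model` are all assumptions made for the example.

```python
import math
from typing import Callable, List, Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def route_regions(low_res_probs: Sequence[Sequence[float]],
                  threshold: float) -> List[int]:
    """Return indices of regions whose low-magnification prediction is uncertain."""
    return [i for i, p in enumerate(low_res_probs) if entropy(p) > threshold]


def classify_slide(low_res_probs: Sequence[Sequence[float]],
                   high_res_model: Callable[[int], Sequence[float]],
                   threshold: float = 0.5) -> List[List[float]]:
    """Keep confident low-res predictions; re-run only uncertain regions.

    `high_res_model` is a hypothetical callable that takes a region index and
    returns class probabilities from the expensive high-magnification pass.
    """
    final = [list(p) for p in low_res_probs]
    for i in route_regions(low_res_probs, threshold):
        final[i] = list(high_res_model(i))
    return final


if __name__ == "__main__":
    probs = [[0.99, 0.01], [0.5, 0.5], [0.9, 0.1]]
    # Only the ambiguous middle region is sent to the high-resolution model.
    print(route_regions(probs, threshold=0.5))
```

In this sketch the cost saving comes from how rarely `high_res_model` is called; the threshold trades accuracy against the number of high-magnification passes.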

A semi-supervised multi-task learning framework for cancer classification with weak annotation in whole-slide images

Published in Medical Image Analysis, 2023

Cancer region detection and subtype classification are two key tasks in digital pathology, but both are constrained by limited precise annotations on whole-slide images. This work proposes a semi-supervised multi-task framework that jointly learns detection and subtyping instead of training them as isolated steps. By coupling the two tasks under weak supervision, it reduces annotation demand and improves slide-level cancer classification.

Recommended citation: Gao Z, Hong B, Li Y, et al. A semi-supervised multi-task learning framework for cancer classification with weak annotation in whole-slide images[J]. Medical Image Analysis, 2023, 83: 102652.
Download Paper

Childhood leukemia classification via information bottleneck enhanced hierarchical multi-instance learning

Published in IEEE Transactions on Medical Imaging, 2023

Bone marrow smear analysis for childhood leukemia is labor-intensive and traditionally depends on detailed expert cell annotations. This work formulates the problem with patient-level supervision and introduces a hierarchical multi-instance learning framework enhanced by an information bottleneck. The model captures subtype relationships across multiple hierarchies and improves leukemia classification with better data efficiency and generalization.

Recommended citation: Gao Z, Mao A, Wu K, et al. Childhood leukemia classification via information bottleneck enhanced hierarchical multi-instance learning[J]. IEEE Transactions on Medical Imaging, 2023, 42(8): 2348-2359.
Download Paper

A structure-aware hierarchical graph-based multiple instance learning framework for pT staging in histopathological image

Published in IEEE Transactions on Medical Imaging, 2023

Pathological primary tumor staging depends on contextual evidence across multiple magnifications, but dense annotation on gigapixel whole-slide images is impractical. This work introduces a structure-aware hierarchical graph-based MIL framework that progressively models cross-scale contextual information instead of treating patches independently. The method improves weakly supervised pT staging by capturing multiscale structural cues that are critical for prognosis-related classification.

Recommended citation: Shi J, Tang L, Li Y, Zhang X, Gao Z, Zheng Y, Wang C, Gong T, Li C. A structure-aware hierarchical graph-based multiple instance learning framework for pT staging in histopathological image[J]. IEEE Transactions on Medical Imaging, 2023, 42(10): 3000-3011.
Download Paper

MG-Trans: Multi-scale graph transformer with information bottleneck for whole slide image classification

Published in IEEE Transactions on Medical Imaging, 2023

Existing MIL pipelines for whole-slide image classification often rely on many high-magnification patches, creating redundant inputs while underusing spatial structure. MG-Trans addresses this by combining patch anchoring, dynamic structure learning, and a multi-scale information bottleneck within a graph-transformer framework. The resulting model captures fine-grained morphology more efficiently and strengthens discriminative whole-slide representations.

Recommended citation: Shi J, Tang L, Gao Z, Li Y, Wang C, Gong T, Li C, Fu H. MG-Trans: Multi-scale graph transformer with information bottleneck for whole slide image classification[J]. IEEE Transactions on Medical Imaging, 2023, 42(12): 3871-3883.
Download Paper

Multiple serous cavity effusion screening based on smear images using vision transformer

Published in Scientific Reports, 2024

Serous cavity effusion smears are widely used in cytological diagnosis, but manual examination can be labor-intensive and variable in accuracy. This study builds a vision transformer-based framework for identifying malignant cells from smear images collected from 161 patients and thousands of annotated patches. The model improves automated screening performance and offers a more precise computational tool for assisting cytological assessment in routine clinical workflows.

Recommended citation: Wang C, Wang X, Gao Z, Ran C, Li C, Ding C. Multiple serous cavity effusion screening based on smear images using vision transformer[J]. Scientific Reports, 2024, 14(1): 7395.
Download Paper

PAMIL: Prototype attention-based multiple instance learning for whole slide image classification

Published in International Conference on Medical Image Computing and Computer-Assisted Intervention, 2024

Whole-slide images often contain heterogeneous tumor patterns, but many MIL methods still assume a single dominant label and provide limited interpretability. PAMIL introduces prototype attention-based multiple instance learning to model multiple histotypes within one slide while producing more meaningful explanations of the reasoning process. This makes whole-slide classification more clinically useful in settings where tumor heterogeneity matters.

Recommended citation: J Liu, A Mao, Y Niu, X Zhang, T Gong, C Li, Z Gao*. PAMIL: Prototype attention-based multiple instance learning for whole slide image classification[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention. Cham: Springer Nature Switzerland, 2024: 362-372.
Download Paper

Shallow-Deep Synergy: Boosting Cross-Domain Generalization in Histopathological Image Segmentation

Published in 2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2024

Histopathological image segmentation suffers from severe domain shifts caused by staining variation, imaging conditions, and tissue diversity across sites and organs. Shallow-Deep Synergy improves generalization in U-Net-based segmentation by explicitly combining the complementary strengths of shallow fine-detail features and deep semantic features. This design strengthens dense prediction performance under cross-domain settings where standard domain generalization methods are less effective.

Recommended citation: X Wang, W Su, Y Dong, Y Li, X Zhang, T Gong, IP Machado, M Crispin-Ortuzar, C Li, Z Gao*. Shallow-Deep Synergy: Boosting Cross-Domain Generalization in Histopathological Image Segmentation[C]//2024 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). IEEE, 2024: 3790-3794.
Download Paper

From patches to WSIs: A systematic review of deep Multiple Instance Learning in computational pathology

Published in Information Fusion, 2025

Computational pathology systems based on whole-slide images are often bottlenecked by the need for costly fine-grained annotations. This review surveys how multiple instance learning reduces that dependence by learning from coarse supervision while aggregating information from large-scale WSIs. It synthesizes recent advances, organizes the rapidly growing literature, and highlights the technical trends shaping modern MIL research in pathology.

Recommended citation: Zhang Y¹, Gao Z¹, He K, et al. From patches to WSIs: A systematic review of deep Multiple Instance Learning in computational pathology[J]. Information Fusion, 2025: 103027.
Download Paper

ALPaCA: Adapting Llama for Pathology Context Analysis to enable slide-level question answering

Published in medRxiv (preprint), 2025

Large vision-language models are promising for computational pathology, but existing systems are largely restricted to small predefined regions rather than gigapixel whole-slide images. ALPaCA introduces a general-purpose slide-level LVLM trained on tens of thousands of WSIs with curated descriptions and question-answer pairs, combining a slide-level adaptor with prototype-based modeling and Llama3.1. It achieves strong slide-level question answering performance and can be adapted efficiently to organ-specific or disease-specific pathology tasks.

Recommended citation: Gao Z, He K, Su W, et al. ALPaCA: Adapting Llama for Pathology Context Analysis to enable slide-level question answering[J]. medRxiv, 2025: 2025.04.22.25326190.
Download Paper

CoxKAN: Kolmogorov-Arnold networks for interpretable, high-performance survival analysis

Published in Bioinformatics, 2025

Survival analysis in medicine requires models that are both accurate and interpretable, yet deep survival models are often treated as black boxes. CoxKAN introduces a Cox proportional hazards Kolmogorov-Arnold Network that combines strong predictive performance with transparent functional structure. Evaluations on synthetic and real-world datasets show that it offers a practical balance between interpretability and high-performance survival modeling.

Recommended citation: W Knottenbelt, W McGough, R Wray, W Zhang, J Liu, I Machado, Z Gao*, M Crispin*. CoxKAN: Kolmogorov-Arnold networks for interpretable, high-performance survival analysis. Bioinformatics, 2025.
Download Paper
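
For readers unfamiliar with the Cox proportional hazards objective mentioned in this entry, the standard negative log partial likelihood can be sketched as below. This is the textbook Cox loss (Breslow-style handling of ties), not code from CoxKAN itself; in a model such as the one described, the `risk` scores would come from the network, whereas here they are plain numbers supplied by the caller. The function name is chosen for this example only.

```python
import math
from typing import Sequence


def neg_log_partial_likelihood(risk: Sequence[float],
                               time: Sequence[float],
                               event: Sequence[bool]) -> float:
    """Negative Cox log partial likelihood.

    risk[i]  : predicted log-risk score for subject i
    time[i]  : observed follow-up time for subject i
    event[i] : True if the event was observed, False if censored
    """
    total = 0.0
    n = len(risk)
    for i in range(n):
        if not event[i]:
            continue  # censored subjects only appear in risk sets
        # Risk set: everyone still under observation at time[i].
        denom = sum(math.exp(risk[j]) for j in range(n) if time[j] >= time[i])
        total -= risk[i] - math.log(denom)
    return total


if __name__ == "__main__":
    # Three subjects with identical risk scores and observed events.
    loss = neg_log_partial_likelihood([0.0, 0.0, 0.0], [1.0, 2.0, 3.0],
                                      [True, True, True])
    print(loss)
```

With equal risk scores the loss reduces to the sum of the logs of the risk-set sizes, which makes a convenient sanity check.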

A fully annotated pathology slide dataset for early gastric cancer and precancerous lesions

Published in Scientific Data, 2025

Early gastric cancer diagnosis from ESD specimens is clinically important but remains labor-intensive and prone to interobserver variability. This work releases a fully annotated pathology slide dataset designed for precise examination of early gastric cancer and precancerous lesions. The dataset provides a challenging benchmark for computational pathology and supports the development of AI systems for fine-grained lesion detection and analysis.

Recommended citation: Wang, C., Ge, J., Niu, Y., Ding, C., Fan, Y., Chang, H., Yang, Z., Ran, C., Teng, X., Wang, X., Wu, L., Gao, Z.*, Li, C.* (2025). A fully annotated pathology slide dataset for early gastric cancer and precancerous lesions. Scientific Data, 12.
Download Paper

StaDis: Stability distance to detecting out-of-distribution data in computational pathology

Published in Medical Image Analysis, 2025

Computational pathology models can fail silently when they encounter out-of-distribution data that differ from the training distribution. StaDis introduces a plug-and-play OOD detection method tailored to this setting by measuring the feature gap between an image and its perturbed counterpart. Without retraining the underlying predictor, it improves deployment safety and helps flag unreliable predictions in real clinical environments.

Recommended citation: Zhang D, Ge J, Liu J, Wang C, Gong T, Gao Z*, Li C*. StaDis: Stability distance to detecting out-of-distribution data in computational pathology. Medical Image Analysis. 2025 Aug 27:103774.
Download Paper
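
The core quantity described in this entry, the feature gap between an image and its perturbed counterpart, can be illustrated with a small sketch. Everything concrete here is an assumption for illustration: the Euclidean distance, the toy feature extractor, the additive perturbation, and the rule that a larger gap is flagged as out-of-distribution are not taken from the paper.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def stability_distance(features: Callable[[Vector], Vector],
                       perturb: Callable[[Vector], Vector],
                       image: Vector) -> float:
    """Euclidean gap between features of an input and its perturbed copy."""
    f_clean = features(image)
    f_pert = features(perturb(image))
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f_clean, f_pert)))


def flag_ood(score: float, threshold: float) -> bool:
    """Assumed decision rule: treat a large feature gap as a warning sign."""
    return score > threshold


# Hypothetical stand-ins for a frozen encoder and a perturbation.
def toy_features(image: Vector) -> List[float]:
    return [sum(image) / len(image), max(image)]


def toy_perturb(image: Vector) -> List[float]:
    return [v + 0.1 for v in image]


if __name__ == "__main__":
    score = stability_distance(toy_features, toy_perturb, [0.0, 0.0, 0.0, 0.0])
    print(score, flag_ood(score, threshold=0.5))
```

Because the score only needs a forward pass through an existing feature extractor, a check of this kind can sit beside a deployed predictor without retraining it, which matches the plug-and-play framing in the summary.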

ProGIS: Prototype-Guided Interactive Segmentation for Pathological Images

Published in IEEE Transactions on Medical Imaging, 2025

Interactive segmentation is attractive for computational pathology because it can reduce annotation cost while still accommodating pathologist input. ProGIS introduces a prototype-guided interactive segmentation framework that segments pathological structures accurately with minimal interaction and can recover multiple same-type connected components from one prompt. This makes slide-level interactive pathology segmentation more efficient and practical than fully automatic or heavily interactive alternatives.

Recommended citation: Ge J, Zhang D, Zhan Y, Liu J, Gong T, Wu J, Crispin M, Li C, Gao Z*. ProGIS: Prototype-Guided Interactive Segmentation for Pathological Images. IEEE Transactions on Medical Imaging. 2025.
Download Paper

SMMILe enables accurate spatial quantification in digital pathology using multiple-instance learning

Published in Nature Cancer, 2025

Spatial quantification is essential in computational pathology, yet many multiple-instance learning methods gain slide-level accuracy at the cost of spatial awareness. SMMILe shows that instance-level aggregation can achieve strong spatial quantification without sacrificing whole-slide prediction and introduces a superpatch-based measurable MIL formulation. Across multiple cancer types, tasks, and datasets, it consistently improves spatial localization and slide-level performance.

Recommended citation: Gao, Z., Mao, A., Dong, Y. et al. SMMILe enables accurate spatial quantification in digital pathology using multiple-instance learning. Nat Cancer (2025).
Download Paper

Learning Heterogeneous Embedding with Prototype-Aware Graph Attention for Whole Slide Image Classification

Published in 2025 IEEE International Conference on Bioinformatics and Biomedicine, 2025

Whole-slide images contain diagnostic cues spanning local neighborhoods, distant regions, and hierarchical tissue organization, but existing graph and MIL models do not unify these relations effectively. This paper proposes a prototype-aware heterogeneous graph attention network that lets each region interact with diverse heterogeneous neighbors while guiding slide-level representation learning with multilevel prototypes. The framework strengthens whole-slide classification by jointly modeling local, non-local, and hierarchical structure within a single representation space.

Recommended citation: Niu Y, Liu J, Zhan Y, Shi J, Chen J, Zhang D, Li C, Gao Z*. Learning Heterogeneous Embedding with Prototype-Aware Graph Attention for Whole Slide Image Classification[C]. 2025 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2025: 2671-2678.
Download Paper

MegaSeg: Towards Scalable Semantic Segmentation for Megapixel Images

Published in Medical Image Analysis, 2026

Megapixel image segmentation is central to high-resolution histopathology analysis, but standard pipelines depend on patching or downsampling that lose important context. MegaSeg introduces an end-to-end segmentation framework for megapixel images that combines streaming convolutional networks, a U-shaped architecture, and a divide-and-conquer strategy. It preserves fine detail and global structure while dramatically reducing memory requirements for very large images.

Recommended citation: Kaura SK, Wu J, Gao Z*, Li C*. MegaSeg: Towards Scalable Semantic Segmentation for Megapixel Images. Medical Image Analysis. 2026 Jan 10:103933.
Download Paper

HAAF: Hierarchical Adaptation and Alignment of Foundation Models for Few-Shot Pathology Anomaly Detection

Published in The Web Conference (WWW) 2026, 2026

Few-shot pathology anomaly detection depends on subtle region-level abnormalities, yet generic vision-language adaptation often fails because semantic prompts are not grounded in fine-grained visual evidence. HAAF tackles this granularity mismatch with a hierarchical adaptation and alignment strategy centered on cross-level scaled alignment, where visual context first refines text prompts and the adapted prompts then guide anomaly-focused visual encoding. A dual-branch inference design further improves stability, and experiments on four benchmarks show strong gains over existing few-shot baselines.

Recommended citation: Yang C, Zhao W, Tang Y, Lu J, Ge J, Liu Q, Gao Z*, Li C. HAAF: Hierarchical Adaptation and Alignment of Foundation Models for Few-Shot Pathology Anomaly Detection. Accepted to The Web Conference (WWW), 2026.
Download Paper

PH2ST: Prompt-Guided Hypergraph Learning for Spatial Transcriptomics Prediction in Whole Slide Images

Published in Medical Image Analysis, 2026

Spatial transcriptomics provides valuable molecular maps, but current assays remain costly, sparse, and difficult to scale across large tissue regions. PH2ST uses limited spatial transcriptomics signals as prompts to guide multi-scale histological representation learning with a hypergraph framework for robust gene expression prediction from H&E slides. Across public datasets and realistic prompt settings, it outperforms prior methods and supports applications such as missing-spot imputation, super-resolution, and local-to-global prediction.

Recommended citation: Niu Y, Liu J, Zhan Y, Shi J, Zhang D, Reinius M, Machado I, Crispin-Ortuzar M, Wu J, Li C, Gao Z*. PH2ST: Prompt-Guided Hypergraph Learning for Spatial Transcriptomics Prediction in Whole Slide Images[J]. Medical Image Analysis, 2026: 104008.
Download Paper

CARE: A Molecular-Guided Foundation Model with Adaptive Region Modeling for Whole Slide Image Analysis

Published in IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2026

Existing pathology foundation models often inherit natural-image backbones that overlook the heterogeneous and irregular organization of pathological regions of interest. CARE introduces a two-stage pretraining strategy that first learns morphological structure from large-scale whole-slide images and then aligns adaptive region representations with molecular signals from RNA and protein profiles. Using only a fraction of the pretraining data common in prior work, CARE delivers strong average performance across 33 downstream benchmarks for classification, molecular prediction, and survival analysis.

Recommended citation: D Zhang, Z Gong, X Pang, J Liu, J Lu, H Cui, J Ge, Z Zeng, K Yi, Y Li, S Liu, T Yu, H Wang, M Crispin-Ortuzar, W Yu, C Li, Z Gao*. CARE: A Molecular-Guided Foundation Model with Adaptive Region Modeling for Whole Slide Image Analysis. Accepted to IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026.
Download Paper

Thinking in Scales: Accelerating Gigapixel Pathology Image Analysis via Adaptive Continuous Reasoning

Published in International Conference on Machine Learning, 2026

Traditional whole-slide image analysis relies on exhaustive patch-level processing that is computationally expensive at gigapixel scale. PathCTM formulates diagnosis as adaptive scale-space continuous reasoning, progressively moving from low-magnification global inspection to high-magnification local evidence gathering with dynamic scale switching, region pruning, and confidence-aware early stopping. It cuts required image patches and inference time by about 96% while maintaining slide-level AUC.

Recommended citation: Ge J, Zhan Y, Zhao W, Zhang D, Wang K, Liu J, Yang C, Li C, Zhang J, Dong Y, Zhang N, Liu Q, Crispin-Ortuzar M, Fu H, Li C, Gao Z. Thinking in Scales: Accelerating Gigapixel Pathology Image Analysis via Adaptive Continuous Reasoning. Accepted to the International Conference on Machine Learning (ICML), 2026.
Download Paper

talks

Multimodal and Multiomics Foundation Models

Published:

Invited lecture at the Artificial Intelligence in Cancer Research Summer School 2025 on multimodal and multiomics foundation models for cancer research.

Multiomics for Cancer

Published:

Invited lecture at Oxford ML School 2025 on multiomics approaches for cancer research.

teaching