ALPaCA: Adapting Llama for Pathology Context Analysis to enable slide-level question answering

Large vision-language models (LVLMs) are promising for computational pathology, but existing systems are largely restricted to small predefined regions rather than gigapixel whole-slide images (WSIs). ALPaCA is a general-purpose slide-level LVLM trained on tens of thousands of WSIs with curated descriptions and question-answer pairs; it combines a slide-level adaptor with prototype-based modeling and Llama3.1. It achieves strong slide-level question answering performance and can be adapted efficiently to organ-specific or disease-specific pathology tasks.
Recommended citation: Gao Z, He K, Su W, et al. ALPaCA: Adapting Llama for Pathology Context Analysis to enable slide-level question answering[J]. medRxiv, 2025: 2025.04.22.25326190.