Introduction
Pathology sits at the center of modern medicine, providing definitive diagnoses that guide patient management, prognostication, and therapeutic decision-making. For more than a century, this discipline has relied on light microscopy and glass slides as its primary tools. While this paradigm has proven remarkably robust, it is increasingly challenged by rising case volumes, growing diagnostic complexity, subspecialization, workforce shortages, and escalating demands for accuracy, efficiency, and standardization. Against this backdrop, digital pathology (DP) and artificial intelligence (AI) have emerged as transformative technologies with the potential to fundamentally reshape pathology practice and its integration into clinical workflows.
DP involves the acquisition, management, sharing, and interpretation of pathology data in a digital environment. Whole-slide imaging (WSI) enables pathologists to review and analyze cases on computers, freeing them from traditional microscopes. While similar to radiology’s digital shift, DP faces unique challenges related to image size and complexity. Advances in scanning, storage, and software have driven its rapid adoption. Once limited to education and research, DP is now widely used in clinical practice, including primary diagnosis, with growing implementation across academic and community settings—accelerated by needs for remote work and resilience highlighted during the coronavirus disease 2019 (COVID-19) pandemic.
Concurrently, AI has advanced from experimental algorithms to clinically relevant tools that assist with detection, classification, quantification, and prognostic assessment. Deep learning (DL) models trained on whole-slide images can identify mitoses, grade tumors, quantify biomarkers, detect metastases, and reveal patterns linked to molecular and clinical outcomes. Rather than replacing pathologists, AI serves as an augmentative tool that improves consistency, reduces workload, and uncovers additional insights from pathology images.
The convergence of DP and AI represents more than a technological upgrade; it signals a paradigm shift in how pathology is practiced. Digital workflows enable seamless image sharing across institutions and disciplines, facilitating multidisciplinary team discussions, telepathology, and global consultation. AI-driven image analysis extends the diagnostic reach of pathology into quantitative and predictive domains, supporting personalized treatment strategies.
Despite this promise, the path to widespread adoption is complex. Implementing DP requires substantial investment, careful workflow redesign, robust validation, staff training, and adherence to regulatory and accreditation standards. Similarly, AI tools must be rigorously validated, transparently evaluated, and thoughtfully integrated into daily practice to ensure safety, usability, and clinical value. Ethical considerations including data privacy, algorithmic bias, accountability, and the evolving role of the pathologist must be addressed alongside technical innovation.
This review aims to provide a comprehensive, practical, and forward-looking overview of DP and AI as they relate to clinical workflow. It first explores the implementation of DP systems and their clinical and non-clinical applications, then turns to the fundamentals of AI, its clinical applications, and emerging developments such as generative AI (GAI), multimodal AI, and agentic AI.
Digital pathology
DP infrastructure
WSI systems
DP has evolved over nearly two centuries, from early photomicrography to telepathology and, more recently, WSI. A key milestone came in 2017, when the U.S. Food and Drug Administration approved the first WSI system for primary diagnosis, establishing DP as a clinically viable alternative to traditional microscopy. WSI scanners serve as the entry point to digital workflows, converting glass slides into high-resolution images by capturing and stitching sequential tiles. Modern WSI systems include two core components: image acquisition (slide scanning) and a workstation for image viewing and management. Scanner capabilities vary in capacity, throughput, and automation, with selection guided by case volume, specimen type, and turnaround needs.
Whole slide image scanners
Scanners can be broadly classified as high-throughput, medium-throughput, and single-slide systems[1,2].
High-throughput scanners digitize hundreds to thousands of slides per batch and are suited for large centers, with automated loading and barcode tracking. Medium-throughput systems support lower volumes with faster turnaround for targeted workflows such as frozen sections and consultations. Single-slide or on-demand scanners are used for rapid, flexible applications such as frozen sections, rapid on-site evaluation (ROSE) for cytology, and education, where immediate access is prioritized over batch efficiency.
Scanner speed depends on several factors, including magnification (20× vs. 40×), file format, focus algorithms, and use of z-stacking for thicker sections. Institutions adopting full digital sign-out should ensure scanner throughput aligns with daily slide volume, ideally enabling complete digitization within 24 h to avoid diagnostic delays.
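As a rough illustration of the throughput check described above, the following sketch estimates how many hours a scanner fleet needs to digitize one day's slides and whether that fits the 24 h target. The per-slide scan time, scanner count, and uptime fraction are hypothetical planning inputs, not vendor specifications.

```python
def hours_to_digitize(daily_slides, scanners, seconds_per_slide, uptime_fraction=0.85):
    """Estimate wall-clock hours needed to scan one day's slide volume.

    All inputs are hypothetical planning assumptions, not vendor figures.
    """
    if scanners <= 0 or seconds_per_slide <= 0 or not 0 < uptime_fraction <= 1:
        raise ValueError("scanners, seconds_per_slide and uptime_fraction must be positive")
    # Slides the whole fleet completes per hour of effective operation.
    slides_per_hour = scanners * (3600 / seconds_per_slide) * uptime_fraction
    return daily_slides / slides_per_hour


def meets_24h_target(daily_slides, scanners, seconds_per_slide, uptime_fraction=0.85):
    """Return True when the fleet can clear the daily volume within 24 h."""
    return hours_to_digitize(daily_slides, scanners, seconds_per_slide, uptime_fraction) <= 24
```

For example, two scanners at an assumed 90 s per slide with no downtime would need about 15 h for 1,200 slides. Scanning at 40× or with z-stacking would lengthen the per-slide time and should be modeled separately.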
Some DP systems have received U.S. FDA clearance/authorization for primary diagnosis (Table 1).
Whole slide imaging file formats
WSI files are large, often 1–3 GB per slide at 40×, making efficient compression essential for storage and network performance. Compression may be lossless or lossy; for primary diagnosis, lossless or minimally lossy formats are preferred, and settings should be validated to ensure diagnostic integrity. A longstanding challenge in DP is the variety of proprietary file formats (e.g., .svs, .ndpi, .mrxs, .iSyntax), each with vendor-specific structures, compression methods, and metadata.
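To show why compression is unavoidable, the sketch below estimates the size of a scanned area from its dimensions and pixel spacing. The default spacing of 0.25 µm per pixel is an assumed value often associated with 40× scanning, and the compression ratio is a free parameter. The estimate ignores pyramid levels and file headers, so it is only an order-of-magnitude guide.

```python
def wsi_size_gb(tissue_mm_w, tissue_mm_h, microns_per_pixel=0.25, compression_ratio=1.0):
    """Approximate size in GB (10**9 bytes) of a 24-bit RGB scan of a rectangular area.

    microns_per_pixel and compression_ratio are illustrative assumptions.
    """
    if microns_per_pixel <= 0 or compression_ratio <= 0:
        raise ValueError("microns_per_pixel and compression_ratio must be positive")
    px_w = tissue_mm_w * 1000 / microns_per_pixel
    px_h = tissue_mm_h * 1000 / microns_per_pixel
    raw_bytes = px_w * px_h * 3  # 3 bytes per RGB pixel
    return raw_bytes / compression_ratio / 1e9
```

Under these assumptions, a 15 mm × 15 mm area is about 10.8 GB uncompressed and roughly 1 GB at a 10:1 ratio, which is consistent with the 1–3 GB range quoted above.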
To promote interoperability, the Digital Imaging and Communications in Medicine (DICOM) Working Group 26 extended the DICOM standard to encompass WSIs[3]. This standardization allows for consistent encoding of images, annotations, and metadata, supporting cross-platform compatibility and long-term accessibility[3]. Transitioning to DICOM requires infrastructure readiness and close vendor collaboration, but offers key advantages, including standardized data management, reduced vendor lock-in, and improved integration with radiology archives. As radiology and pathology converge within enterprise imaging strategies, DICOM compliance becomes increasingly important for creating unified patient imaging records[4,5].
Image quality and calibration
Diagnostic fidelity in DP depends on accurate color reproduction, image sharpness, and consistent focus. Regular scanner calibration using standardized color targets and focus slides is essential and should be built into routine maintenance, with quality control (QC) logs maintained for compliance[6–8]. Because pathologists are sensitive to color variation, especially in hematoxylin and eosin (H&E) staining, scanner software should support color normalization to reduce batch variability. Periodic quality assurance (QA) review of scanned slides is also important to detect artifacts such as stitching errors, debris, or incomplete scans.
Image management system (IMS)
Once slides are digitized, effective image management is critical to workflow success. The IMS serves as the central hub—analogous to the radiology Picture Archiving and Communication System (PACS)—enabling secure, efficient, and scalable storage and access to WSIs for clinical care, education, research, and QA.
A robust IMS must handle large files, support multiple formats, and integrate seamlessly with the laboratory information system (LIS), electronic medical record (EMR), and tools such as AI analytics and annotation platforms. This requires careful planning of database structure, network performance, caching, and access control. Key capabilities include rapid image access, smooth navigation, reliable linkage to LIS metadata, annotation tools, audit trails with role-based permissions, and multi-user collaboration. Several commercial IMS platforms are available, and some have FDA clearance, including Philips IntelliSite Pathology Solution, PathPresenter Platform, Sectra Digital Pathology Solution, and PathAI AISight Dx.
IMS performance and scalability
IMS performance is critical to user satisfaction, as pathologists expect rapid loading and smooth navigation without latency. This requires high-speed network connectivity and optimized streaming protocols. Modern systems use tile-based streaming, loading only the visible portion of an image to reduce bandwidth demands.
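The idea behind tile-based streaming can be sketched in a few lines: given the viewport position at one pyramid level, the viewer requests only the tiles that intersect it. The 256-pixel tile size and the function name are assumptions for illustration. Real viewers also handle multiple zoom levels, prefetching, and caching.

```python
def visible_tiles(view_x, view_y, view_w, view_h, tile_size=256):
    """Return (col, row) indices of tiles intersecting the viewport at one pyramid level.

    Coordinates are non-negative integer pixel positions at that level.
    """
    if view_w <= 0 or view_h <= 0:
        return []
    first_col = view_x // tile_size
    first_row = view_y // tile_size
    # Subtract 1 so a viewport ending exactly on a tile boundary does not pull in an extra tile.
    last_col = (view_x + view_w - 1) // tile_size
    last_row = (view_y + view_h - 1) // tile_size
    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]
```

A 3840 × 2160 viewport aligned to the tile grid needs 15 × 9 = 135 tiles of 256 pixels, a tiny fraction of the tiles in a full-resolution slide.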
Scalability is equally important. As slide volumes grow, the IMS must efficiently manage millions of files. A modular architecture allows expansion through additional servers or cloud resources, and vendors should be assessed on their ability to support seamless horizontal scaling without service disruption.
Integration with LIS
The LIS remains the authoritative source of patient and specimen data in anatomic pathology. Seamless integration with the IMS ensures each digital slide is accurately linked to its case, block, and stain, typically via barcode identifiers. When a case is opened in the LIS, associated images should automatically load in the IMS, eliminating manual searching[9].
Bidirectional communication is essential: annotations and measurements from the IMS should be transferred to the LIS or reporting system, while case status updates synchronize across platforms. Effective integration reduces duplicate data entry and minimizes the risk of mismatched records and images.
Supporting AI integration and research
As AI advances, the IMS should be able to serve as a central platform for deploying inference tools, integrating automated image analysis into diagnostic workflows. Choosing an IMS that supports such extensions helps ensure future AI integration without major system redesign.
Beyond clinical use, WSI storage and management are critical for research, particularly in computational pathology and AI. Structured storage and metadata tagging enable efficient cohort selection and large-scale image retrieval for model development, while sandbox environments allow research access without compromising clinical systems or data integrity[10].
Data storage, security, and retention
Storage architecture is among the most technically demanding aspects of implementing DP. A fully digital laboratory can produce tens to hundreds of terabytes of image data each year, depending on case volume and scanning resolution. As a result, storage systems must carefully balance capacity, speed, redundancy, and cost[11,12].
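A back-of-the-envelope calculation makes the annual storage figure concrete. In the sketch below, the daily slide count, average file size, number of working days, and number of stored copies are all hypothetical inputs that each laboratory would replace with its own values.

```python
def annual_storage_tb(slides_per_day, avg_slide_gb, working_days=250, replicas=1):
    """Estimate yearly image storage in TB (1 TB = 1000 GB).

    working_days and replicas are planning assumptions, not recommendations.
    """
    if min(slides_per_day, avg_slide_gb, working_days, replicas) < 0:
        raise ValueError("inputs must be non-negative")
    return slides_per_day * avg_slide_gb * working_days * replicas / 1000
```

With 500 slides per day at an assumed 1.5 GB each, a single copy comes to 187.5 TB per year, and keeping a second replica doubles that.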
Storage tiers and locations
Most institutions use a tiered storage model. Primary storage (Tier 1) consists of high-performance disks for active cases, prioritizing speed and reliability, often with redundant array of independent disks (RAID) for data protection. Secondary storage (Tier 2) offers slower but more cost-effective systems for recently completed cases. Archival storage (Tier 3) provides long-term, low-cost options—such as magnetic tape, optical media, or cloud-based cold storage—optimized for durability rather than retrieval speed. Institutional policies should specify how long images remain in each tier before migration, based on retention requirements and the likelihood of re-access.
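A tier-migration rule of the kind described above might look like the following sketch. The two-year threshold for moving signed-out cases from Tier 2 to Tier 3 is a hypothetical value, since the actual interval is set by institutional policy.

```python
from datetime import date

# Hypothetical threshold: how long a signed-out case stays on Tier 2 before archiving.
TIER2_MAX_AGE_DAYS = 2 * 365


def storage_tier(signed_out_on, today, tier2_max_age_days=TIER2_MAX_AGE_DAYS):
    """Pick a storage tier: 1 for active cases, 2 for recently completed, 3 for archive.

    signed_out_on is None while the case is still active.
    """
    if signed_out_on is None:
        return 1
    age_days = (today - signed_out_on).days
    return 2 if age_days <= tier2_max_age_days else 3
```

A nightly job could apply such a rule to every stored image and queue the ones whose tier has changed for migration.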
Institutions must weigh the choice between on-premises and cloud-based storage. On-premises solutions offer greater control and lower latency but require substantial capital investment and ongoing maintenance, often using RAID arrays, storage area networks (SANs), or object-based systems with built-in data replication to protect against loss. Cloud storage provides scalable, flexible capacity and advanced redundancy with geographic replication, though institutions must ensure compliance with healthcare data protection regulations such as the Health Insurance Portability and Accountability Act (HIPAA), the General Data Protection Regulation (GDPR), or local equivalents. Hybrid models, which combine high-speed on-premises storage for active cases with cloud archiving for long-term retention or disaster recovery, are increasingly common and cost-effective.
Data backup and security
Robust backup policies are essential to protect against data loss. Backups should be performed at regular intervals and stored in geographically separate locations, with automated verification processes to ensure integrity. Disaster recovery plans should define recovery time objectives (RTO) and recovery point objectives (RPO) tailored to clinical needs, and mission-critical systems should incorporate failover mechanisms to maintain access if the primary server fails.
Patient data within DP systems is subject to strict privacy regulations, including HIPAA in the United States and the GDPR in Europe. Compliance requires a combination of technical, administrative, and physical safeguards. Technical measures include encryption of stored images (data at rest) and secure transmission channels (data in motion). Administrative controls involve access policies, staff training, and periodic security audits. Physical protections cover server room access restrictions, environmental controls, and redundant power and network systems. Institutions must also establish protocols for image anonymization and de-identification, particularly when sharing data for research or AI development. Metadata that could inadvertently identify patients, such as accession numbers, dates, or annotations, should be removed prior to export. Automated anonymization workflows help prevent data breaches while enabling ethical secondary use of DP data.
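The metadata-removal step can be illustrated with a minimal sketch that drops identifying fields from a metadata record before export. The field names are hypothetical. Real WSI formats store identifiers in vendor-specific headers and in label or macro images, which this sketch does not touch, so it should not be read as a complete de-identification procedure.

```python
# Hypothetical field names; actual formats keep identifiers in vendor-specific places.
IDENTIFYING_FIELDS = frozenset(
    {"accession_number", "patient_name", "patient_id", "scan_date", "annotations"}
)


def deidentify(metadata, identifying_fields=IDENTIFYING_FIELDS):
    """Return a copy of a metadata dict without fields that could identify a patient.

    Keys are assumed to be strings and are matched case-insensitively.
    """
    return {
        key: value
        for key, value in metadata.items()
        if key.lower() not in identifying_fields
    }
```

Because the function returns a new dictionary, the clinical record is left unchanged and only the exported copy is stripped.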
Data retention and archiving
Regulatory requirements determine how long pathology materials must be kept. In the U.S., the College of American Pathologists (CAP) mandates a minimum 10-year retention for glass slides and blocks[13], but no formal guideline yet exists for digital WSIs. As a result, institutions set their own policies: some delete images after 3–6 months, while others keep them much longer to support research and innovation.
As DP adoption grows, managing the balance between image accessibility and storage costs has become increasingly important, with storage expenses rising annually. Effective WSI life cycle management (WSI-LCM) policies are therefore essential. Key considerations include the purpose of retention, required duration, access speed, and storage cost. Common retention strategies include:
•Clinical diagnostic cases: retained for up to 3 months on rapid, high-performance storage; after sign-out, these cases can be migrated to secondary storage.
•Prior cases after sign-out: retention varies by subspecialty, averaging 2–3 years; moderate access times are acceptable, so slower, cost-effective storage suffices.
•Educational cases: retained indefinitely with fast access to support teaching needs.
•Legal/regulatory cases: often stored for up to 10 years; rarely accessed and suitable for slow, low-cost archival storage.
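The retention strategies listed above can be encoded as a small policy table that an automated life cycle job consults. In this sketch the category keys and storage labels are illustrative names, and 3 years is used for prior cases as the upper end of the 2–3 year range. What happens once a window has elapsed (migration, review, or deletion) remains a policy decision.

```python
from datetime import date, timedelta

# Retention windows follow the strategies listed above; None means "keep indefinitely".
# Category keys and storage labels are illustrative assumptions.
RETENTION_POLICY = {
    "clinical": {"max_age_days": 90, "storage": "fast"},
    "prior": {"max_age_days": 3 * 365, "storage": "moderate"},
    "educational": {"max_age_days": None, "storage": "fast"},
    "legal": {"max_age_days": 10 * 365, "storage": "archival"},
}


def retention_window_elapsed(category, scanned_on, today):
    """Return True when an image has outlived the retention window of its category."""
    max_age = RETENTION_POLICY[category]["max_age_days"]
    if max_age is None:
        return False
    return today - scanned_on > timedelta(days=max_age)
```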
Network architecture and performance
The transition to DP significantly increases network demands. WSIs, even when compressed, are substantially larger than typical clinical data files, requiring networks with sufficient bandwidth, low latency, and high reliability. Bandwidth needs depend on case volume, image size, and the number of concurrent users. As a guideline, institutions should provide at least 1 Gbps connections between scanners, storage servers, and pathologist workstations, with 10 Gbps backbones preferred in high-volume settings. Network performance should be evaluated under peak load to identify potential bottlenecks.
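The difference between 1 Gbps and 10 Gbps links is easiest to appreciate as transfer time per slide. The sketch below assumes a dedicated link and a hypothetical efficiency factor to account for protocol overhead. It does not model contention between concurrent users.

```python
def transfer_seconds(file_gb, link_gbps, efficiency=0.8):
    """Estimate seconds to move one file of file_gb (10**9 bytes) over a link.

    efficiency is an assumed fraction of nominal bandwidth actually achieved.
    """
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_gbps must be positive and efficiency in (0, 1]")
    bits = file_gb * 8e9
    return bits / (link_gbps * 1e9 * efficiency)
```

At full nominal bandwidth, a 2 GB slide takes about 16 s over 1 Gbps and 1.6 s over 10 Gbps, which is why bulk scanner-to-storage traffic benefits from the faster backbone.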
Even minor latency can impair pathologists' productivity. To address this, IMS platforms often employ local caching, temporarily storing recently accessed images on workstations or local servers for near-instantaneous reload during viewing sessions. Network optimization techniques, such as content delivery networks (CDNs) or distributed caching, can further enhance performance across geographically dispersed campuses. Because DP involves protected health information (PHI), network security must meet strict regulatory standards. Access controls ensure that only authorized personnel can view, annotate, or modify images. Role-based access control (RBAC) assigns permissions according to user roles, while audit trails log every access event to support accountability and compliance.
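The combination of RBAC and audit logging described above can be sketched as a single authorization function that records every attempt, whether permitted or not. The roles, permissions, and log fields shown are hypothetical. A production system would persist the log in tamper-evident storage rather than an in-memory list.

```python
from datetime import datetime, timezone

# Hypothetical roles and permissions; a real deployment defines its own.
ROLE_PERMISSIONS = {
    "pathologist": {"view", "annotate"},
    "technologist": {"view"},
    "administrator": {"view", "annotate", "modify"},
}


def authorize(user, role, action, slide_id, audit_log):
    """Check a role-based permission and append the attempt to audit_log."""
    allowed = action in ROLE_PERMISSIONS.get(role, set())
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "role": role,
        "action": action,
        "slide_id": slide_id,
        "allowed": allowed,
    })
    return allowed
```

Logging denied attempts alongside granted ones is a deliberate choice here, since refused access is often the more informative event during a security review.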
For remote access, secure virtual private networks (VPNs) or institution-managed web portals are used. Mandatory safeguards include multi-factor authentication, encryption in transit (HTTPS/TLS), and data integrity verification.
Diagnostic workstations
The quality of the viewing workstation directly affects the diagnostic experience. Pathologists transitioning from microscopes to digital monitors require high-resolution, color-calibrated displays that accurately reproduce histologic detail. A fully digital workflow also necessitates dedicated diagnostic displays at each pathologist’s workstation. DP displays can be categorized as medical grade (MG), professional grade (PG), or consumer off-the-shelf (COTS). Recently, pathology-specific MG displays have received FDA approval, and preliminary studies have benchmarked these instruments for primary diagnosis. Other studies have compared the performance of PG and COTS displays against MG displays[14–17].
Diagnostic monitors generally require at least 27-inch screens with 4K resolution or higher, providing a sufficient field of view for efficient navigation. Color calibration should meet medical imaging standards, and luminance should remain consistent across viewing sessions. Some institutions implement dual- or multi-monitor setups, dedicating one screen for image viewing and another for the LIS or report interface to enhance multitasking efficiency. Digital sign-out also changes the physical posture of pathologists, shifting from microscope-based to screen-based work. Proper ergonomic arrangements, including adjustable monitor height, supportive chairs, and optimal lighting, help reduce fatigue and prevent musculoskeletal strain. Training in ergonomic practices should be incorporated into onboarding for all digital users[18].
Annotation tablets, programmable keyboards, and high-precision mice can enhance navigation speed and user comfort. Some pathologists prefer trackballs or touch interfaces for faster panning and zooming[19]. Institutions should accommodate individual preferences to optimize productivity and user satisfaction[20].
Information technology (IT) support, maintenance, and lifecycle management
DP systems demand ongoing IT support well beyond traditional histology operations. Dedicated IT personnel, ideally embedded within the pathology department, should manage hardware maintenance, software updates, and user support[11]. Routine tasks include monitoring scanner performance and updating firmware, verifying backups and storage integrity, applying security patches and antivirus updates, and reviewing system logs for errors or performance issues. Lifecycle management encompasses planning hardware refresh cycles (typically every 3–5 years), ensuring compatibility between legacy and new systems, and budgeting for software upgrades. A long-term maintenance roadmap helps prevent sudden obsolescence and ensures consistent user experience.
Clinical implementation of digital workflow
Although infrastructure and technology provide the foundation for DP, true success relies on seamless integration into daily laboratory workflows. This requires re-engineering traditional processes to include slide scanning, digital case assignment, and electronic sign-out, ensuring that digitization improves diagnostic efficiency without disruption[9,21–25].
DP workflows can be broadly classified into two models: partial digital implementation, in which only selected subspecialties (for example, breast, gastrointestinal, or genitourinary pathology) or use cases (for example, consultation, frozen sections, or education) are digitized; and full digital implementation, in which all routine diagnostic slides are scanned and reviewed digitally. Both models offer unique advantages and challenges. Partial implementation allows laboratories to pilot the technology, build confidence, and identify bottlenecks before large-scale deployment. Full digital adoption, while more ambitious, enables seamless integration with other digital health systems and unlocks efficiencies in workload distribution, telepathology, and computational analytics.
Institutions pursue DP for diverse reasons. Many aim to enhance diagnostic efficiency, reduce turnaround times, and support remote sign-out. Others seek to improve patient safety through standardized workflows and traceable digital records. Additionally, DP provides a foundation for future innovations, including AI-assisted diagnostics, computational image analysis, and precision oncology. Regardless of the initial motivation, successful implementation requires alignment among multiple stakeholders, including pathologists, laboratory managers, IT teams, hospital administrators, and regulatory authorities.
Historically, pathology has lagged behind other medical disciplines in digital transformation, largely due to the technical and logistical challenges of digitizing glass slides at diagnostic resolution. Nevertheless, several landmark institutional initiatives have shown that DP is both feasible and beneficial when carefully planned and executed[11,18,26–37]. Large academic institutions have reported measurable gains in workflow efficiency, cost savings through reduced slide handling and courier services, and improvements in multidisciplinary collaboration[11,12,38–40]. These experiences offer invaluable insights into the operational and strategic considerations necessary for successful adoption. Over more than two decades of published work, early adopters have demonstrated that DP is a transformative yet inherently complex endeavor. Key findings from these pioneers are summarized in Table 2.
A key lesson from early adopters is the critical importance of infrastructure readiness. Implementing a DP workflow requires a robust network capable of handling terabyte-scale data, secure and redundant storage, and scalable image management software that integrates seamlessly with existing LISs. Equally essential is clinical validation to ensure diagnostic equivalence between digital and optical modalities, supported by rigorous QA protocols, standard operating procedures, and compliance with guidelines from professional bodies such as the CAP, the Digital Pathology Association (DPA), and relevant international agencies[24,41–44].
Equally important is the human factor. Digital transformation reshapes how pathologists work, communicate, and interact with their environment. Training programs must address both the technical use of digital systems and the cognitive adaptation required for interpreting digital images. User acceptance, workflow redesign, and sustained leadership support are critical to success. Institutions that have achieved smooth transitions often emphasize early and continuous engagement of pathologists in planning, system selection, and validation.
DP should also be viewed as a long-term investment rather than a one-time project. Beyond initial hardware and software acquisition, ongoing costs include system maintenance, data storage expansion, software updates, and cybersecurity. A comprehensive business case is essential to secure institutional funding and demonstrate the economic value of DP, whether through direct operational savings or indirect benefits such as improved efficiency, faster turnaround times, and enhanced academic and research capacity.
A comprehensive roadmap for implementing a digital workflow in clinical surgical pathology is illustrated in Fig. 1. Based on institutional case studies and published evidence, it outlines the technical, operational, and human dimensions of digital adoption. Each section addresses a core domain of implementation, from strategic planning to technical infrastructure, workflow design, system integration, validation, staffing, training, adoption, and continuous quality improvement. This roadmap provides a practical framework adaptable to institutions of varying size, complexity, and readiness, helping ensure that DP delivers sustainable improvements in diagnostic practice.
Designing a digital slide workflow
The quality of a digital image starts with the glass slide. Slides must be clean, uniformly coverslipped, and accurately labeled or barcoded, as even minor imperfections (such as excess mounting medium, debris, or air bubbles) can distort scanning. Barcode labeling is critical: each slide should carry a 2D barcode encoding the accession number and slide ID to enable automatic linkage to the case record. Standardizing label placement ensures reliable recognition by scanner cameras. Modern histology laboratories often integrate immunohistochemistry (IHC) staining into routine workflows. Ideally, whole-slide scanners are positioned between the H&E and IHC areas so slides can be scanned immediately after coverslipping. This placement minimizes manual handling, reduces the risk of loss or damage, and ensures temporal alignment between physical and digital slides. In high-throughput settings, dedicating scanning personnel helps maintain continuous operation, rapid troubleshooting, and consistent QC[11,12,29].
Slides are typically loaded into racks or cassettes, verified by barcode, and automatically queued for scanning. Modern scanners include automated tissue detection, focus mapping, and calibration algorithms, but technician oversight remains essential. All scanned fields should be reviewed to ensure proper focus and image fidelity. Key QC checkpoints include focus accuracy, scan completeness, absence of scanning or staining artifacts, and appropriate image color balance.
The IMS assigns a unique digital identifier to each image, linking it to metadata such as case number, tissue type, stain, and block or slide ID. Consistent naming conventions are critical to prevent mismatches, and ideally, metadata is imported directly from the LIS via Health Level Seven (HL7) interfaces to avoid manual entry errors.
After scanning, digital files are automatically uploaded to the IMS, where checksum verification confirms file integrity, followed by thumbnail generation and indexing. In high-volume laboratories, this process is parallelized across multiple scanners and servers. Some labs also employ automated dashboards that display scanning progress, error rates, and throughput metrics in real time, enabling supervisors to quickly identify bottlenecks, such as scanner downtime or poor slide quality.
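Checksum verification on ingestion can be sketched with Python's standard `hashlib` module. The choice of SHA-256 and the function names are assumptions for illustration, as the source does not specify which digest an IMS uses. The file is read in chunks so that multi-gigabyte slides do not need to fit in memory.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_upload(path, expected_hex):
    """Return True when the stored file matches the digest recorded at scan time."""
    return sha256_of(path) == expected_hex.lower()
```

In practice the scanner side would record the digest before transfer, and the IMS would call a check like `verify_upload` before generating thumbnails and indexing the image.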
LIS-IMS synchronization
Integration between the LIS and the IMS forms the backbone of a digital workflow. Without reliable synchronization, pathologists cannot seamlessly access digital slides within their case worklists. When a case is accessioned in the LIS, a unique case ID is generated. During scanning, barcode data link each digital image to this case ID, enabling automatic association. Once ingestion is complete, the IMS sends a confirmation message to the LIS, indicating that the slides are available for review. In a fully integrated environment, the pathologist can open the case in the LIS and launch the digital viewer directly, which loads all corresponding slides in the correct order with the appropriate stain identifiers[11,12,31]. Two-way communication between the LIS and IMS enables real-time updates on case status, such as “in scan”, “available”, or “awaiting QC”. This transparency supports efficient workload management and helps prevent premature case assignment.
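The status updates mentioned above can be modeled as a small state machine shared by the LIS and IMS. The three status names come from the text, but the allowed transitions (scan, then QC, then available, with a failed QC returning the slide for rescanning) are an assumed ordering for illustration only.

```python
# Status names come from the text; the transition order is an assumption.
ALLOWED_TRANSITIONS = {
    "in scan": {"awaiting QC"},
    "awaiting QC": {"available", "in scan"},  # a failed QC sends the slide back for rescan
    "available": set(),
}


def advance(current, new):
    """Return the new slide status, or raise ValueError for a transition not allowed."""
    if new not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"cannot move slide from {current!r} to {new!r}")
    return new


def case_ready(slide_statuses):
    """A case is ready for assignment only when it has slides and all are available."""
    statuses = list(slide_statuses)
    return bool(statuses) and all(status == "available" for status in statuses)
```

Gating assignment on a check like `case_ready` is one way to prevent the premature case assignment noted above.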
Hybrid workflows
Few institutions transition immediately to full digital operations; most begin with hybrid workflows that combine glass and digital review. Hybrid approaches commonly include: (1) pilot subspecialties, where digital workflows are initially implemented in areas that benefit most, such as dermatopathology or gastrointestinal pathology; (2) remote consultations, using digital slides for second opinions while maintaining glass archives; and (3) academic or training deployments, introducing digital systems in educational settings before diagnostic rollout[11,12,18,22,27–32,35,37,45,46].
Each scanning batch should undergo systematic QC inspection. This includes verifying focus, color fidelity, completeness of tissue capture, and absence of artifacts. Institutions often designate dedicated QC personnel who review thumbnail grids or random samples[47–49].
Validation and QA
Validation and QA are essential for DP. Unlike traditional microscopy, digital workflows introduce variables, including scanners, image formats, displays, network performance, and software, that can affect diagnostic accuracy. Robust validation and ongoing QA ensure that WSIs faithfully replicate optical review, support regulatory compliance, and maintain diagnostic confidence.
Validation
Validation in DP covers both technical and clinical performance. Technical validation evaluates hardware and software, including image fidelity, scanner calibration, focus accuracy, color reproduction, tissue coverage, and scanning speed. Test slides representing typical tissue types, staining variations, and known artifacts are scanned, and deficiencies prompt calibration, software adjustment, or protocol modification. Repeated scans under varying conditions (operator, scanner, or time of day) should yield consistent results[27,30,42,49,50].
Clinical validation ensures pathologists can render accurate diagnoses on WSIs compared to glass slides. Studies should include the full spectrum of routine cases, with subspecialty-specific validation as needed. Pathologists independently review cases in both digital and glass formats, with concordance rates calculated to confirm diagnostic equivalence. Systematic discrepancies must be addressed, and a minimum washout period between readings is recommended. For telepathology or remote sign-out, validation should replicate expected network conditions and display setups to maintain accuracy across locations.
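The concordance calculation at the heart of clinical validation is simple to express. The sketch below treats each case as a pair of diagnoses (digital, glass) and counts exact matches. How diagnoses are coded, and whether minor and major discrepancies are separated, are choices left to the validation protocol and are not modeled here.

```python
def concordance_rate(paired_diagnoses):
    """Return the fraction of cases where the digital and glass diagnoses agree.

    paired_diagnoses is an iterable of (digital, glass) pairs, one per case.
    """
    pairs = list(paired_diagnoses)
    if not pairs:
        raise ValueError("at least one paired case is required")
    agreements = sum(1 for digital, glass in pairs if digital == glass)
    return agreements / len(pairs)
```

For instance, three matching pairs out of four cases give a concordance rate of 0.75; any discordant pair would then be reviewed to decide whether it reflects a systematic problem.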
Regulatory and professional guidelines shape the validation process. Institutions typically refer to standards set forth by bodies such as the CAP in the US (Table 3), the Royal College of Pathologists (RCPath) in the UK, or other regional authorities/accreditation agencies[41,42].
QA
QA is essential to maintaining diagnostic integrity in DP. Digitization introduces variables absent in conventional microscopy, requiring systematic monitoring of image quality. Technologists and pathologists perform daily or batch-level inspections to detect focus errors, blurring, color distortions, tissue truncation, and missing sections. Barcode accuracy and metadata integrity are also verified to ensure proper case assignment and seamless LIS integration. Modern scanners and imaging platforms increasingly include automated QC algorithms that detect focus or registration deficiencies, allowing timely corrective action before review[47–49,51].
Monitoring rescan rates provides an important operational metric, as elevated rates may indicate issues in tissue processing, coverslipping, staining, scanner calibration, or operator technique. Display calibration is critical for accurate color, brightness, and contrast rendering, with photometric or colorimetric tools ensuring consistent viewing across workstations.
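Rescan-rate monitoring lends itself to a simple automated check. In the sketch below, the 5% alert threshold and the batch labels are hypothetical, and each laboratory would set its own threshold from historical data.

```python
def rescan_rate(slides_scanned, slides_rescanned):
    """Return the fraction of scanned slides that had to be rescanned."""
    if slides_scanned <= 0:
        raise ValueError("slides_scanned must be positive")
    return slides_rescanned / slides_scanned


def flag_high_rescan(batches, threshold=0.05):
    """Return the sorted labels of batches whose rescan rate exceeds the threshold.

    batches maps a batch label to a (scanned, rescanned) tuple.
    The default threshold is an illustrative assumption.
    """
    return sorted(
        label
        for label, (scanned, rescanned) in batches.items()
        if rescan_rate(scanned, rescanned) > threshold
    )
```

Flagged batches can then be traced back to a specific scanner, operator, or staining run to identify the upstream cause.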
QA extends to DP software and systems, including functional validation, regression testing, bug tracking, version control, and formal change-management policies to prevent workflow disruptions. Data integrity is supported by comprehensive audit trails documenting scanning times, operators, access logs, annotations, modifications, case assignments, and sign-out timestamps, enabling root-cause analysis when errors occur.
A structured error-management framework addresses inevitable scanning, data handling, or interpretive errors. This includes prompt identification via automated alerts or QC checkpoints, classification of errors, corrective actions such as rescanning or retraining, and preventive strategies guided by trend analysis.
Ongoing performance monitoring using quantitative metrics—diagnostic concordance, scanner uptime, image rejection/rescan rates, digital turnaround time, and user satisfaction—ensures that DP systems continue to meet clinical, operational, and regulatory standards.
Clinical adoption
Resistance to change is a common barrier in digital transformation. Pathologists and laboratory staff accustomed to glass-slide practice may question digital accuracy or workflow efficiency, making effective change management essential. Institutions should engage stakeholders early, communicate benefits such as faster turnaround, remote access, and AI integration, and use pilot projects to demonstrate tangible improvements.
Leadership fosters adoption by recognizing early adopters, sharing successes, celebrating milestones, and providing channels for feedback. Structured strategies—early engagement, transparent communication, targeted training, pilot projects, and “digital champions” mentoring colleagues—further support acceptance.
Most institutions implement DP in phases[
11,
12,
29]:
•Pilot phase: Limited deployment, often in one subspecialty or consultation cases, to assess technology, training, and workflow impact.
•Hybrid phase: Expanded scanning for selected routine cases, combining glass and digital workflows, identifying bottlenecks, validating systems, and refining standard operating procedures (SOPs).
•Full digital phase: Transition all diagnostic cases to digital, supported by established QC measures, IT infrastructure, and trained staff.
Each phase includes performance monitoring, staff feedback, and iterative improvement, reducing risk and building confidence. Adoption can be evaluated using metrics such as digital case volume, turnaround time, user satisfaction, costs, rescan rates, diagnostic concordance, and QC compliance, ensuring alignment with institutional goals and guiding optimization.
AI in pathology
Introduction of AI in pathology
AI is rapidly transforming diagnostic pathology[
52,
53]. As a discipline grounded in interpreting complex tissue morphology, pathology has traditionally relied on manual expertise developed over years of training. Increasing case complexity, the rise of precision medicine, and a global workforce shortage now create a strong need for computational support. AI addresses this gap by performing pattern recognition, decision-making, and learning tasks at scale and speed[
52,
54].
Unlike structured data (e.g., laboratory values), histopathology images are inherently unstructured. WSIs consist of millions of pixels without inherent semantic meaning—computers “see” only pixel arrays, not biological entities such as nuclei or tumors. This lack of structure limits direct computational interpretation. To enable analysis, visual patterns must be translated into quantitative features[
55]. This involves segmentation (identifying regions), feature extraction (e.g., nuclear size, texture), and classification (assigning biological meaning). AI bridges this gap by converting pixel data into structured outputs—such as tumor grade, biomarker expression, or prognostic scores—thereby supporting automated diagnosis, risk prediction, and treatment planning.
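The three steps named above (segmentation, feature extraction, classification) can be illustrated on a toy grayscale tile. This is a deliberately simplified sketch: the intensity threshold, the cellularity cutoff, and the 4 x 4 tile are hypothetical, and real pipelines operate on color WSIs with learned models rather than fixed rules.

```python
def segment(image, threshold=100):
    """Segmentation: mark pixels darker than the threshold as foreground
    (standing in for hematoxylin-stained nuclei)."""
    return [[1 if pixel < threshold else 0 for pixel in row] for row in image]


def extract_features(image, mask):
    """Feature extraction: turn the mask into quantitative descriptors."""
    foreground = [
        pixel
        for row, mask_row in zip(image, mask)
        for pixel, keep in zip(row, mask_row)
        if keep
    ]
    total_pixels = sum(len(row) for row in image)
    return {
        "area_fraction": len(foreground) / total_pixels,
        "mean_intensity": sum(foreground) / len(foreground) if foreground else 0.0,
    }


def classify(features, cellularity_cutoff=0.5):
    """Classification: map the features to a structured label (assumed rule)."""
    if features["area_fraction"] >= cellularity_cutoff:
        return "high cellularity"
    return "low cellularity"


# Hypothetical 4 x 4 grayscale tile (0 = black, 255 = white).
tile = [
    [40, 60, 200, 210],
    [50, 70, 220, 205],
    [45, 230, 215, 225],
    [240, 235, 228, 232],
]
mask = segment(tile)
features = extract_features(tile, mask)
print(features, classify(features))
```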
Every pathology image begins as a grid of pixels—rich in color but devoid of meaning. While pathologists recognize nuclei, glands, and tumor architecture, computers see only numerical arrays. Bridging this gap requires a structured pipeline that converts raw pixels into clinically actionable insights.
Annotation and labeling provide the foundation. Expert pathologists define “ground truth” by marking key structures (e.g., tumor regions, mitoses), enabling supervised learning and linking biological knowledge to computational models[
56]. Segmentation partitions images into meaningful regions—tumor, stroma, necrosis, lymphocytes—so that context-specific analysis becomes possible. This step underpins tasks such as tumor grading and biomarker quantification[
57–
65]. Registration aligns multi-modal images (e.g., H&E and IHC), ensuring that corresponding regions match across slides and enabling integrated analysis of morphology and molecular markers[
66–
69]. Color normalization reduces variability from staining and scanning differences, improving model robustness and consistency across datasets[
70–
73]. Feature extraction then converts visual patterns into quantitative representations. Traditional approaches rely on engineered features (e.g., nuclear size, texture), whereas DL learns hierarchical features directly from pixel data, often revealing patterns beyond human perception[
74–
76]. Finally, model outputs translate these features into structured results—tumor classification, biomarker levels, or prognostic predictions—supporting diagnosis and treatment decisions[
77–
88]. In essence, this pipeline transforms unstructured images into structured knowledge, enabling scalable, reproducible analysis and advancing precision pathology.
Overview of machine learning (ML) and DL
AI in pathology is driven primarily by two approaches. ML uses handcrafted features (e.g., nuclear size, texture, shape) defined by experts to train models on labeled data (Fig. 2A). DL uses neural networks to learn hierarchical features directly from raw pixels, often uncovering patterns beyond human perception. This distinction has practical implications: DL typically requires larger datasets, ML is more interpretable, and both must meet clinical expectations for transparency and reproducibility.
ML: handcrafted intelligence
ML learns from annotated datasets by converting images into predefined quantitative features. These features enable models to classify patterns—such as distinguishing mitotic from non-mitotic cells—based on measurable attributes. Because inputs are explicit and biologically meaningful, ML models are generally more interpretable and well-suited for tasks with defined criteria (e.g., tumor grading, biomarker quantification). Learning in ML is essentially mapping patterns to meaning. In supervised learning, models are trained on labeled examples (e.g., annotated slides), enabling accurate, task-specific predictions. In unsupervised learning, models identify inherent patterns without labels, supporting discoveries such as identifying novel histologic subgroups. Together, they balance reliability and innovation[
89].
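To make the idea of supervised learning on handcrafted features concrete, the sketch below trains a nearest-centroid classifier on two predefined features per cell. The classifier choice, the feature values, and the labels are all illustrative assumptions rather than a method taken from the cited literature.

```python
import math


def fit_centroids(samples, labels):
    """Supervised training: average the feature vectors of each labeled class."""
    sums, counts = {}, {}
    for features, label in zip(samples, labels):
        accumulator = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            accumulator[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Assign the class whose centroid is closest in feature space."""
    return min(centroids, key=lambda label: math.dist(centroids[label], features))


# Hypothetical handcrafted features per cell: (nuclear area, texture contrast).
training_features = [(80.0, 0.9), (75.0, 0.8), (40.0, 0.3), (45.0, 0.2)]
training_labels = ["mitotic", "mitotic", "non-mitotic", "non-mitotic"]

centroids = fit_centroids(training_features, training_labels)
print(predict(centroids, (70.0, 0.7)))
print(predict(centroids, (50.0, 0.4)))
```

Because each input is an explicit measurement, a prediction can be traced back to the feature values that drove it, which is the interpretability advantage noted above. In practice, features on different scales would be standardized before computing distances.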
DL: automated feature discovery
DL represents a shift from manual feature design to data-driven learning. Neural networks trained on large image datasets automatically learn features associated with diagnostic patterns such as tumor architecture or mitotic activity without explicit instructions[
90,
91]. This enables detection of subtle, previously unrecognized signals but often at the cost of interpretability (“black box” behavior). An architecture is the design of the network (e.g., CNN, Vision Transformer), whereas a model is an architecture that has been trained on data for a specific task. Common DL architectures in pathology include: (1) Convolutional Neural Networks (CNNs): detect local features; widely used for segmentation and classification[
92,
93]; (2) Vision Transformers (ViTs): capture global context for tasks like grading[
94,
95]; (3) Graph Neural Networks (GNNs): model cell–cell interactions and tissue architecture[
96,
97]; (4) Recurrent Neural Networks (RNNs)/Long Short-Term Memory networks (LSTMs): analyze sequential data (e.g., longitudinal samples, reports)[
98,
99]; and (5) multimodal models: integrate images with genomic and clinical data (Fig. 2B).
DL models can achieve high diagnostic accuracy yet often lack transparent reasoning, which is why they are described as “black boxes.” In clinical settings, this is problematic: predictions must be explainable, defensible, and aligned with medical logic. Without interpretability, even accurate models risk limited trust and adoption.
This opacity stems from how DL works. Unlike traditional machine learning with explicit features (e.g., nuclear size, texture), DL learns abstract patterns from raw pixels. These features are encoded as numerical vectors without clear biological meaning, making it difficult to explain why a model predicts high grade or poor prognosis.
Explainability is therefore essential in pathology, where errors have direct clinical consequences and regulatory approval depends on transparency. To address this, several methods provide insight into model behavior: Saliency maps: highlight image regions influencing predictions[
100]; Grad-CAM: generates heatmaps showing where the model “looked”[
101,
102]; Attention mechanisms: indicate which regions receive the most focus[
85,
103–
105]. While not fully resolving interpretability, these tools help validate whether model decisions align with pathology principles. Ultimately, explainability is a clinical necessity, enabling trust, regulatory acceptance, and effective collaboration between AI and pathologists.
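The general idea behind such region-importance maps can be sketched with occlusion sensitivity, a simpler, model-agnostic relative of saliency maps and Grad-CAM that is swapped in here because it needs no gradients. Each patch is masked in turn and the resulting drop in the model score is recorded. The scoring function below is a hypothetical stand-in for a trained network.

```python
def occlusion_map(image, score_fn, patch=2, fill=0.0):
    """Heatmap of score drops: a larger value means the masked patch
    mattered more to the model output."""
    height, width = len(image), len(image[0])
    baseline = score_fn(image)
    heat = [[0.0] * width for _ in range(height)]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            rows = range(top, min(top + patch, height))
            cols = range(left, min(left + patch, width))
            occluded = [row[:] for row in image]  # copy, then mask one patch
            for r in rows:
                for c in cols:
                    occluded[r][c] = fill
            drop = baseline - score_fn(occluded)
            for r in rows:
                for c in cols:
                    heat[r][c] = drop
    return heat


def toy_score(image):
    """Hypothetical 'model' that responds only to the top-left 2 x 2 corner."""
    return sum(image[r][c] for r in range(2) for c in range(2)) / 4.0


tile = [[1.0] * 4 for _ in range(4)]
heat = occlusion_map(tile, toy_score)
for row in heat:
    print(row)
```

A pathologist reviewing such a map would check whether the highlighted region corresponds to diagnostically meaningful tissue rather than to artifacts such as pen marks or folds.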
Foundation models in pathology: the era of scalable intelligence
The emergence of foundation models marks a turning point in computational pathology. Traditionally, AI systems were developed for narrow tasks—such as mitosis detection, tumor grading, or biomarker quantification—each requiring dedicated datasets and training pipelines. In contrast, foundation models are trained on millions of image tiles from diverse WSIs, spanning tissue types and staining protocols. Models such as CTransPath, PLIP, UNI, Virchow, and CHIEF operate at an unprecedented scale, enabling them to learn generalizable representations of histopathology[
106–
112]. These representations act as universal building blocks that can be fine-tuned for specific tasks using minimal labeled data, substantially reducing the annotation burden that has long constrained AI development in pathology. Beyond scalability, foundation models offer strong adaptability. Through transfer learning, knowledge gained in one domain (e.g., breast cancer morphology) can accelerate learning in others, such as prostate cancer or rare sarcomas—an essential capability for the heterogeneous landscape of pathology. Moreover, multimodal models like PLIP extend this paradigm by integrating images with text, linking WSIs to pathology reports and clinical narratives. This convergence of visual and contextual data moves the field toward more holistic diagnostic systems and intelligent platforms that support precision medicine.
Current state of AI applications in clinical practice
To understand the practical impact of AI in pathology, imagine the daily challenges faced in a busy diagnostic lab. Each case brings unique complexities, some subtle and others glaring, and every decision carries clinical consequences. AI steps in as a digital assistant, not replacing the pathologist but amplifying their capabilities. Below are key areas where AI is making a difference, illustrated through real-world scenarios.
Classification and diagnosis
Precise classification of pathologic lesions is essential for appropriate treatment selection. AI is increasingly used to support primary diagnosis by analyzing complex morphologic patterns at scale. For example, in prostate biopsies, grading depends on glandular architecture—yet even experts may disagree in borderline cases. AI models trained on large annotated datasets can assess these patterns consistently, reducing variability. In breast pathology, CNNs analyze nuclear features and mitotic activity, achieving performance comparable to experienced pathologists and serving as a reliable second reader[
75,
82,
113–
117].
AI-based screening tools further enhance workflow efficiency. Systems such as the IBEX (Galen Breast) platform, trained on millions of labeled image patches, demonstrate high accuracy across lesion types while improving pathologist performance, reducing review time, and lowering unnecessary immunohistochemistry use[
118,
119]. Regulatory clearance of Paige Prostate Detect by the FDA represented a key milestone, accelerating the clinical deployment of AI for prostate cancer detection in core needle biopsies across U.S. pathology practices[
120–
122]. Meanwhile, in Europe, CE-marked platforms—including Ibex’s Galen Prostate, Aiforia’s Prostate Cancer AI, and DeepDx—are seeing increasing adoption in routine laboratory workflows[
123–
126].
Overall, AI-assisted diagnostic classification has matured rapidly, with algorithms approaching or even exceeding expert-level performance in specific tasks. The greatest near-term utility lies in workflow augmentation, including triage of negative or high-probability malignant cases, pre-screening, and decision support for challenging diagnostic lesions.
Screening and detection
Lymph node metastasis
Detecting metastases in lymph nodes is critical for accurate staging in cancer, yet small tumor deposits can be easily missed. AI systems trained on large datasets can identify these subtle foci and flag suspicious regions, reducing false negatives and improving patient safety[
80,
83,
85]. This challenge is especially relevant for axillary lymph nodes in breast cancer patients, where micrometastases and isolated tumor cells are difficult and time-consuming to detect. Studies show that AI significantly improves both sensitivity and efficiency. In the landmark CAMELYON16 challenge, top-performing algorithms achieved near-perfect discrimination (area under the curve [AUC] up to 0.994), in some cases exceeding the performance of pathologists. AI assistance has also been shown to increase detection sensitivity and reduce interpretation time in clinical settings[
127–
130].
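For readers less familiar with the AUC metric quoted above, the following sketch computes it directly from its definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The labels and scores are hypothetical and unrelated to the CAMELYON16 data.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical slide-level labels (1 = metastasis present) and model scores.
slide_labels = [1, 1, 1, 0, 0, 0]
slide_scores = [0.95, 0.80, 0.40, 0.60, 0.30, 0.10]
print(f"AUC = {roc_auc(slide_labels, slide_scores):.3f}")
```

An AUC of 1.0 means every positive slide outranks every negative slide, and 0.5 corresponds to chance-level ranking.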
Commercial tools, such as the Visiopharm platform, now automate metastasis detection, measurement, and annotation on WSIs. These systems can achieve very high sensitivity while reducing review time, underscoring AI’s growing role in enhancing diagnostic accuracy and workflow efficiency[
131]. Tools such as Paige’s PanCancer Detect and the Aiforia Colon Suite are trained to recognize metastatic lesions in many tumors, including those arising from gastrointestinal, breast, and other primaries[
132,
133].
Microcalcifications
Identifying mammary microcalcifications on H&E-stained stereotactic breast biopsy slides is challenging, as they are associated with a broad spectrum of benign, premalignant, and malignant conditions. In addition to detecting malignant and atypical lesions, the IBEX AI solution is designed to identify microcalcifications in breast biopsy WSIs. In one study, the algorithm achieved an AUC of 0.925 with 95% sensitivity, highlighting its potential to support microcalcification detection in routine practice[
119].
Mitosis
Counting mitoses is a cornerstone of tumor grading, yet it is notoriously tedious. Scanning a large sarcoma or other tumor section for tiny mitotic figures among thousands of cells can be exhausting, and fatigue or time pressure may cause figures to be missed. AI algorithms excel in this task, rapidly scanning entire slides and accurately pinpointing mitotic cells. This automation not only speeds workflow but also enhances reproducibility, which is crucial in high-stakes diagnoses[
134–
137].
AI-driven quantification of biomarkers and grading
IHC biomarkers
Tissue biomarkers are central to diagnosis, prognosis, and treatment selection. Traditionally, protein markers are evaluated by IHC and interpreted manually by pathologists. With the advancement of DP, AI technology is now being used to provide more objective and standardized assessments. Key biomarkers in routine pathology practice, including estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor 2 (HER2/neu), Ki-67, and programmed death-ligand 1 (PD-L1), have been the focus of many AI studies.
ER and PR are predictive of breast cancer prognosis and treatment response and are routinely assessed by IHC, typically as the percentage of positive tumor nuclei. Manual scoring is variable[
138–
146], but AI algorithms show strong agreement with pathologists and can calculate H-scores, improving reproducibility[
12,
147–
155], though oversight remains necessary for faint staining or benign tissue[
156].
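The H-score mentioned above weights the percentage of tumor nuclei at each staining intensity (weak, moderate, strong) by 1, 2, and 3, giving a value from 0 to 300. A minimal sketch follows; the function names and the example percentages are hypothetical.

```python
def h_score(weak_pct, moderate_pct, strong_pct):
    """H-score = 1 x %weak + 2 x %moderate + 3 x %strong (range 0 to 300)."""
    percentages = (weak_pct, moderate_pct, strong_pct)
    if min(percentages) < 0 or sum(percentages) > 100 + 1e-9:
        raise ValueError("percentages must be non-negative and sum to at most 100")
    return 1 * weak_pct + 2 * moderate_pct + 3 * strong_pct


def percent_positive(weak_pct, moderate_pct, strong_pct):
    """Overall percentage of positive tumor nuclei, regardless of intensity."""
    return weak_pct + moderate_pct + strong_pct


# Hypothetical ER result: 20% weak, 30% moderate, 10% strong nuclei.
print(h_score(20, 30, 10), percent_positive(20, 30, 10))
```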
HER2, overexpressed in about 20% of breast cancers and many other cancers such as endometrial serous carcinoma, gastrointestinal adenocarcinoma, and lung cancer, is assessed via IHC and/or in situ hybridization (ISH) per American Society of Clinical Oncology (ASCO)/CAP guidelines[
157–
160]. Manual HER2 scoring suffers from variability[
161–
163]. AI-based methods have demonstrated superior reproducibility and accuracy[
164–
169]. Algorithms evaluate membrane staining patterns or staining intensity, while AI-assisted microscopy incorporating augmented reality is further improving reliability[
164,
170–
173]. Emerging research suggests that AI may help refine classification, such as distinguishing HER2-low subgroups[
174–
176].
Ki-67, a proliferation marker with prognostic significance in many tumors and in gastrointestinal neuroendocrine tumor grading[
177–
180], suffers from variable manual scoring[
177,
181–
185]. AI can automate detection and positivity calculations across entire tumor areas for more consistent estimates[
186–
193].
PD-L1 IHC guides immune checkpoint therapy, scored by the Tumor Proportion Score (TPS) in lung cancer and by the Combined Positive Score (CPS) in other tumors, but manual scoring is challenging. AI models can distinguish tumor cells from immune cells and compute TPS or CPS values that closely match expert assessments[
194,
195]. Emerging biomarkers, including multiplex panels and novel targets, are also poised to benefit from AI-based quantification, leveraging its ability to analyze complex staining patterns and spatial relationships.
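The two PD-L1 scores differ only in their numerators: TPS counts PD-L1-positive tumor cells, whereas CPS also counts PD-L1-positive immune cells (lymphocytes and macrophages), and both divide by the number of viable tumor cells. CPS is capped at 100. The sketch below assumes the cell counts have already been produced by an upstream detection step; the function names and counts are hypothetical.

```python
def tumor_proportion_score(positive_tumor_cells, viable_tumor_cells):
    """TPS (%) = PD-L1-positive tumor cells / viable tumor cells x 100."""
    if viable_tumor_cells <= 0:
        raise ValueError("viable_tumor_cells must be positive")
    return 100 * positive_tumor_cells / viable_tumor_cells


def combined_positive_score(positive_tumor_cells, positive_immune_cells,
                            viable_tumor_cells):
    """CPS = (positive tumor cells + positive immune cells)
    / viable tumor cells x 100, capped at 100."""
    if viable_tumor_cells <= 0:
        raise ValueError("viable_tumor_cells must be positive")
    raw = 100 * (positive_tumor_cells + positive_immune_cells) / viable_tumor_cells
    return min(100.0, raw)


# Hypothetical counts from one slide.
print(tumor_proportion_score(150, 1000))
print(combined_positive_score(150, 100, 1000))
```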
Tumor grading
Consistent histologic grading in cancers is essential for prognosis and therapy planning. For example, the Nottingham system in breast cancer evaluates tubule formation, nuclear pleomorphism, and mitotic activity, but manual assessment is prone to interobserver variability. DL models improve reproducibility across all three components[
37,
196–
206]. For instance, DL classifiers can automatically identify tubule-forming nuclei and calculate their ratio to total tumor cells, while automated pipelines detect mitotic hotspots and mitotic figures, improving the accuracy of proliferation assessment. AI models for nuclear pleomorphism have also demonstrated superior reproducibility and prognostic stratification compared with traditional grading. In genitourinary pathology, AI achieves performance comparable to that of expert pathologists in grading prostate cancer (Gleason score) and clear cell renal cell carcinoma[
207–
215].
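Once the three Nottingham components have been scored, whether by a pathologist or by separate AI models, combining them is a fixed rule: each component is scored 1 to 3, and the total (3 to 9) maps to grade 1 (3 to 5), grade 2 (6 to 7), or grade 3 (8 to 9). A minimal sketch, with a hypothetical function name:

```python
def nottingham_grade(tubule_score, pleomorphism_score, mitotic_score):
    """Return (total score, histologic grade) from the three component scores."""
    scores = (tubule_score, pleomorphism_score, mitotic_score)
    if any(score not in (1, 2, 3) for score in scores):
        raise ValueError("each component score must be 1, 2, or 3")
    total = sum(scores)
    if total <= 5:
        grade = 1
    elif total <= 7:
        grade = 2
    else:
        grade = 3
    return total, grade


print(nottingham_grade(2, 3, 2))
```

Because the final grade depends on thresholds, a one-point disagreement in any single component can change the grade, which is one reason reproducible component scoring matters.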
Tumor microenvironment
The tumor microenvironment, particularly tumor-infiltrating lymphocytes (TILs), is a strong prognostic factor in cancers, with high TIL levels linked to better survival and chemotherapy response[
216,
217]. Manual TIL assessment is subjective and variable, complicated by the complex spatial distribution of immune cells. DL algorithms improve detection and quantification, segmenting tumor versus stromal areas and calculating TIL density with high precision and reproducibility[
218,
219]. AI enables continuous scoring for finer stratification and systematic spatial profiling of immune cells, mapping their locations and interactions across entire slides. These spatial biomarkers provide additional prognostic and therapeutic insights, supporting research and clinical decision-making[
220].
Prognosis, risk stratification, and prediction of treatment response
Perhaps the most exciting frontier is prediction. Imagine an H&E slide that not only confirms diagnosis but also forecasts a patient’s future—risk of recurrence, likelihood of therapy response, or underlying genetic alterations. AI models can bridge morphology and genomics, extracting subtle signals from tissue architecture to generate actionable prognostic insights. This approach transforms pathology from a descriptive discipline into a predictive science, fully aligning with the goals of precision medicine[
77–
82,
84–
88,
117,
221–
224].
In breast cancer, histopathologic features contain rich prognostic information, traditionally assessed via histologic type, grade, tumor size, lymph node status, lymphovascular invasion, and biomarker expression[
225–
229]. Genomic assays like Oncotype DX, MammaPrint, and PAM50 guide adjuvant therapy decisions but are costly and not universally available. AI offers a scalable, cost-effective alternative by analyzing H&E slides to predict outcomes, recurrence risk, and molecular assay results. DL models, such as BCR-Net and Deep-ODX, can accurately score histologic features, correlate morphology with prognosis, and predict Oncotype DX scores with high consistency[
84,
86,
200,
225,
227,
230–
235]. Commercial tools like Stratipath Breast (Stratipath AB) and RlapsRisk® BC (Owkin) demonstrate the clinical feasibility of AI-based risk stratification.
AI also enables prediction of treatment response by extracting features from tumor biology and the microenvironment. Multimodal approaches integrating histology with biomarkers, genomics, and imaging improve accuracy, supporting early therapy guidance[
230,
236–
241].
AI applied to digital histopathology has emerged as a promising approach to infer molecular alterations directly from routine H&E slides, enabling “virtual molecular testing.” DL models can detect subtle morphologic correlates of molecular phenotypes, many of which are imperceptible to human observers. This capability has far-reaching implications for both research and clinical practice. By decoding complex biological patterns, AI is moving breast cancer management toward more personalized, predictive care.
AI-integrated pathology workflow
Successful AI implementation in pathology requires strategic decisions about workflow. Automation opportunities include highly standardized tasks—grading, routine IHC quantifications, and straightforward frozen-section evaluations—where AI can improve efficiency, accuracy, and consistency. Processes requiring reinvention include QA, which may shift from retrospective audits to real-time AI-driven evaluation, and education and certification, which will increasingly embed AI competencies. Reports will become structured with AI-generated data, positioning pathologists to focus on verification, interpretation, and contextual integration. AI-generated summaries will enhance multidisciplinary communication. Human-centric roles to retain include integrated diagnostic reasoning, management of complex clinical scenarios, ethical decision-making, regulatory oversight, and patient-centered communication. Emerging capabilities include data-driven reports, advanced visualizations, predictive analytics, AI-driven case prioritization, and enhanced prognostic guidance.
Careful management of these transitions, with active pathologist involvement, ensures AI enriches rather than diminishes professional roles. Delegating routine tasks to AI allows pathologists to focus on complex cases, interdisciplinary collaboration, and clinical impact. Implementing AI demands structured planning, validation, and governance, guided by regulatory standards and stakeholder engagement. By following systematic approaches, pathology departments can transition to AI-enhanced practice while maintaining diagnostic excellence. Ultimately, the pathologist’s role evolves from diagnostician to diagnostic orchestrator, integrating diverse data into cohesive clinical narratives.
Ethics in AI adoption
Algorithmic bias in AI, particularly in pathology, largely arises from limitations in the datasets used for model training, including issues with sample selection, representation, and completeness of variables[
242–
244]. For example, datasets that disproportionately represent certain populations, such as adult white males due to accessibility and socioeconomic factors, can result in algorithms that do not generalize well to broader, more diverse patient groups, thereby potentially disadvantaging underrepresented populations. Another major concern is under-specification, where critical variables (such as genetic background or social determinants of health) are not included in the training data, leading to incomplete models and potentially misleading correlations in outcome prediction. This can result in erroneous assumptions, such as interpreting lower healthcare spending as an indicator of better health, when it may instead reflect limited access to care. These challenges underscore the importance of thoughtful dataset design and a deep understanding of both clinical and contextual factors when developing AI tools. Although it is not always practical for busy pathologists to master the complexities of statistical and algorithmic bias, maintaining awareness of these risks is essential when interpreting AI-assisted results. Addressing these issues requires a shared responsibility: regulatory agencies such as the U.S. Food and Drug Administration play a central role in evaluating and approving AI systems, while researchers, industry vendors, and pathologists must collaborate to identify, mitigate, and monitor sources of bias over time. In addition, professional organizations like the DPA and the CAP are actively contributing to guidance and best practices, helping ensure that AI is implemented in a way that is equitable, reliable, and clinically meaningful.
Patient privacy is a fundamental principle in U.S. healthcare, centered on safeguarding PHI under the Health Insurance Portability and Accountability Act (HIPAA)[
245,
246]. HIPAA establishes national standards for protecting PHI, grants patients rights over their health data, regulates electronic healthcare transactions, and enforces compliance through its privacy and security rules, overseen by the HHS Office for Civil Rights[
245–
248]. As AI technologies become increasingly integrated into healthcare and pathology, practitioners, especially those involved in decision-making, must maintain a working understanding of these regulatory requirements to ensure compliance and protect patient data.
New frontiers of AI in pathology
Multimodal foundation models
Multimodal foundation models represent a new paradigm in AI, extending beyond traditional large language models (LLMs) and image-based models. Trained on large, multimodal datasets including images, text, genomics, and clinical records, they learn generalizable representations that can be adapted to diverse tasks with minimal labeled data. Their architecture, typically built on large transformer backbones, enables integration and reasoning across multiple data modalities, particularly valuable in pathology, where diagnosis depends on the convergence of morphological, molecular, and multi-omics clinical data. This multimodal reasoning capability enables more comprehensive decision support, for example, correlating immunohistochemistry with morphological patterns, predicting genomic alterations from routine H&E slides, and generating differential diagnoses informed by clinical context.
The superiority of multimodal foundation models lies in four key areas: scalability and transferability, as pretraining on large, diverse datasets enables rapid adaptation with minimal labeled data; multimodal integration, allowing fusion of images, laboratory results, and clinical text for precision diagnostics; robustness and generalization, improving reliability across diverse populations and settings; and emergent reasoning, including in-context learning and cross-domain inference, which can generate novel biological insights and enhance clinical decision-making.
In practice, multimodal foundation models are poised to underpin next-generation pathology workflows, serving as the “central intelligence” layer. They can orchestrate image analysis, molecular prediction, automated reporting, and communication with clinicians and patients. By combining the narrative fluency of LLMs, the generative capabilities of diffusion models, and large-scale multimodal integration, they enable a shift from siloed analyses to integrated, patient-centered precision pathology, transforming data from images and clinical sources into diagnostic, prognostic, and predictive insights. Several emerging models illustrate this potential. PathChat applies LLM capabilities to WSIs, enabling interactive queries that link histopathologic findings with natural language explanations for education and decision support[
249]. Virchow, a large vision foundation model trained on pathology images, demonstrates strong zero- and few-shot performance in cancer classification, survival prediction, and biomarker discovery[
250]. Other models, such as CONCH[
251] and PLIP[
252], extend this paradigm through cross-modal representation learning that aligns images with text, enhancing retrieval and multimodal reasoning. The clinical utility of these models can be substantial. A pathologist could use a multimodal model as a “copilot,” integrating pathologic images with relevant clinical history to generate a tailored differential diagnosis and recommend appropriate ancillary tests to reach a final diagnosis.
Together, these advances show that foundation models are not only enhancing the accuracy and efficiency of diagnostic pathology but also enabling new workflows—such as natural language–driven exploration of whole-slide images, automated biomarker detection, and seamless integration with genomics and electronic health records[
249–
251,
253–
256]. These capabilities lay the groundwork for scalable, generalizable, and clinically deployable AI systems in precision pathology.
Agentic AI
While multimodal foundation models have been transformative, the next paradigm shift is the transition from reactive copilot to proactive, autonomous AI agent[
257–
259]. Agentic AI is defined by true agency: the ability to understand context, invoke appropriate tools, formulate complex goals, execute multi-step tasks through an iterative reasoning loop, and continuously monitor outcomes[
260]. In healthcare, this could extend beyond answering queries to autonomously coordinating tasks such as test ordering or workflow management from a single high-level instruction[
258,
259]. In pathology, this shift marks a critical inflection point: from task-specific, assistive algorithms to fully agentic systems capable of orchestrating complex diagnostic workflows. By integrating modular architectures, persistent memory for gigapixel-scale WSIs, and clinically grounded reasoning, agentic AI moves beyond passive detection to goal-directed execution. This evolution has the potential to redefine the pathologist’s role from manual analyst to high-level director, overseeing intelligent, adaptive, and end-to-end diagnostic processes.
Ferber et al. recently demonstrated an autonomous AI agent combining GPT-4 with precision oncology tools—ViTs for histopathology, MedSAM for radiology, and web-based resources (OncoKB, PubMed, Google). Across 20 multimodal cases, the system achieved 87.5% tool-use accuracy, 91.0% correct clinical conclusions, and 75.5% guideline citation accuracy—substantially outperforming GPT-4 alone (30.3% overall). This underscores the potential of integrated language model–driven systems to enhance personalized oncology decision support[
261]. Similarly, SlideSeek exemplifies multi-agent AI in pathology[
262]. Designed to analyze gigapixel WSIs autonomously, it mimics hierarchical human diagnostic reasoning: a Supervisor Agent formulates hypotheses and plans, while Explorer Agents use the PathChat multimodal model to examine slide regions. Iteratively refining its analysis, SlideSeek achieved 80.0% primary diagnosis accuracy on the DDxBench differential diagnosis benchmark, rivaling human-assisted systems that rely on pre-selected regions of interest (ROIs). This shift from reactive copilot to proactive, autonomous partner redefines the pathologist’s role from primary diagnostician to AI supervisor, with implications for training, workflow, and liability.
A defining advantage of agentic AI is its capacity to co-evolve with the pathologist[
263]. Within a “pathologist-in-the-loop” model, the pathologist serves as an active supervisor rather than a passive end user. Corrections can be captured through active learning and converted into labeled training data. With lightweight fine-tuning, the system can quickly adapt to local laboratory practices or rare pathologic changes. This creates a continuous learning ecosystem in which AI performance improves alongside clinical expertise: the AI performs the computational heavy lifting, while the pathologist provides interpretive oversight and final diagnostic authority[
263,
264].
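One way to picture how corrections become labeled training data is the following minimal Python sketch. The `Review` record, the `CorrectionBuffer` class, and the `retrain_threshold` parameter are hypothetical names introduced only for illustration; the sketch assumes that only cases where the pathologist overrides the AI are queued, and it leaves the fine-tuning step itself out of scope.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Review:
    case_id: str
    ai_label: str           # label proposed by the AI system
    pathologist_label: str  # final, authoritative label from the pathologist


class CorrectionBuffer:
    """Collects pathologist overrides and releases them in batches."""

    def __init__(self, retrain_threshold: int = 3) -> None:
        self.retrain_threshold = retrain_threshold
        self.pending: List[Review] = []

    def record(self, review: Review) -> Optional[List[Review]]:
        """Queue a disagreement; return a training batch once enough accumulate."""
        if review.ai_label == review.pathologist_label:
            return None  # agreement: nothing new to learn from in this sketch
        self.pending.append(review)
        if len(self.pending) < self.retrain_threshold:
            return None
        batch, self.pending = self.pending, []
        return batch  # would be handed to a (not shown) fine-tuning job


buffer = CorrectionBuffer(retrain_threshold=3)
reviews = [
    Review("case-1", "benign", "benign"),
    Review("case-2", "benign", "atypical"),
    Review("case-3", "grade 1", "grade 2"),
    Review("case-4", "negative", "positive"),
]
for review in reviews:
    batch = buffer.record(review)
    if batch is not None:
        print("fine-tune on", [r.case_id for r in batch])
```

Batching the overrides, rather than updating the model after every case, is a design assumption here; it keeps each update reviewable before it changes system behavior.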
To ensure clinical safety, agentic AI must evolve beyond opaque “black box” outputs toward deliberate, self-evaluative reasoning. Pathology agents can implement an “Observe–Reflect–Refine” cycle, systematically reassessing intermediate outputs before finalizing a diagnosis. Such a cycle can yield a transparent, “glass-box” system whose reasoning is anchored in authoritative clinical standards.
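The cycle can be sketched as a bounded loop that records every intermediate draft together with the issues raised against it, so that the reasoning trail remains inspectable. This is a minimal Python illustration under stated assumptions: the function `observe_reflect_refine`, the checklist `REQUIRED_FIELDS`, and the helper functions are hypothetical, and the "reflection" here is a simple completeness check rather than a clinically validated review.

```python
from typing import Callable, Dict, List, Tuple

Draft = Dict[str, str]
Trace = List[Tuple[Draft, List[str]]]


def observe_reflect_refine(
    draft: Draft,
    reflect: Callable[[Draft], List[str]],
    refine: Callable[[Draft, List[str]], Draft],
    max_cycles: int = 3,
) -> Tuple[Draft, Trace, bool]:
    """Return (final draft, audit trace, resolved flag)."""
    trace: Trace = []
    for _ in range(max_cycles):
        issues = reflect(draft)        # reflect: list problems with the draft
        trace.append((draft, issues))  # keep every step for later audit
        if not issues:
            return draft, trace, True
        draft = refine(draft, issues)  # refine: produce a revised draft
    issues = reflect(draft)            # final check after the last revision
    trace.append((draft, issues))
    return draft, trace, not issues


REQUIRED_FIELDS = ("diagnosis", "grade", "margins")  # hypothetical checklist


def missing_fields(draft: Draft) -> List[str]:
    return [name for name in REQUIRED_FIELDS if name not in draft]


def fill_from_evidence(draft: Draft, issues: List[str]) -> Draft:
    # Placeholder evidence; a real agent would query its tools here.
    evidence = {"grade": "see grading tool output", "margins": "see margin tool output"}
    updated = dict(draft)  # copy so earlier trace entries stay unchanged
    for name in issues:
        if name in evidence:
            updated[name] = evidence[name]
    return updated


final, trace, resolved = observe_reflect_refine(
    {"diagnosis": "draft text"}, missing_fields, fill_from_evidence
)
print(resolved, len(trace), sorted(final))
```

When the flag comes back `False`, the issues in the last trace entry are still open, which is the natural point to hand the case to the pathologist rather than finalize it automatically.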
As AI transitions from a “co-pilot” to a true autonomous agent, accountability becomes paramount. Deploying agentic systems requires strong governance frameworks to ensure that greater autonomy does not dilute clinical responsibility. Large-scale data infrastructures must adhere to strict consent and privacy standards, while algorithmic bias should be actively monitored and mitigated through built-in reflective mechanisms.
The future of pathology lies in autonomous yet tightly integrated systems that augment, rather than replace, human expertise. By assuming routine, repetitive, and computationally intensive tasks, from ordering IHCs and quantifying biomarkers to drafting a diagnostic report, agentic AI can reduce workload and cognitive strain. In doing so, it reinforces the pathologist’s central role, enabling greater focus on complex diagnostic reasoning, multidisciplinary collaboration, and the delivery of precision care.
Conclusions
Pathology is entering a new era defined by the convergence of digital infrastructure, advanced analytics, and increasingly intelligent AI systems. The transition from glass slides to fully integrated digital workflows has not only improved efficiency and accessibility but has also expanded the scope of diagnostic insight through computational and multimodal approaches. Emerging technologies, including multimodal foundation models and agentic AI, further extend this trajectory, enabling more adaptive, context-aware, and scalable solutions that align closely with the goals of precision medicine.
Importantly, this transformation is not about replacing the pathologist, but about redefining and elevating their role. As these tools mature, the pathologist remains central: providing clinical judgment, oversight, and integration of complex data into meaningful patient care decisions. Realizing the full potential of digital and AI-enabled pathology will require thoughtful implementation, robust validation, and strong governance frameworks to ensure safety, transparency, and equity. Ultimately, the future of pathology lies in a synergistic partnership between human expertise and intelligent systems: one that enhances diagnostic accuracy, streamlines workflows, and advances personalized medicine on a global scale.
© The Author(s) 2026. This article is published by Higher Education Press at journal.hep.com.cn.