eprintid: 27825
rev_number: 9
eprint_status: archive
userid: 2
dir: disk0/00/02/78/25
datestamp: 2026-03-13 23:30:09
lastmod: 2026-03-13 23:30:11
status_changed: 2026-03-13 23:30:09
type: article
metadata_visibility: show
creators_name: Butt, Naveed Anwer
creators_name: Sarwat, Dilawaiz
creators_name: Delgado Noya, Irene
creators_name: Tutusaus, Kilian
creators_name: Samee, Nagwan Abdel
creators_name: Ashraf, Imran
creators_id: 
creators_id: 
creators_id: irene.delgado@uneatlantico.es
creators_id: kilian.tutusaus@uneatlantico.es
creators_id: 
creators_id: 
title: Benchmarking multiple instance learning architectures from patches to pathology for prostate cancer detection and grading using attention-based weak supervision
ispublished: pub
subjects: uneat_bm
subjects: uneat_eng
divisions: uneatlantico_produccion_cientifica
divisions: unincol_produccion_cientifica
divisions: uninimx_produccion_cientifica
divisions: uninipr_produccion_cientifica
divisions: unic_produccion_cientifica
divisions: uniromana_produccion_cientifica
full_text_status: public
keywords: Prostate cancer detection
Weakly supervised learning
Multiple instance learning
Whole slide images
ISUP grading
abstract: Histopathological evaluation is necessary for the diagnosis and grading of prostate cancer, which is still one of the most common cancers in men globally. Traditional evaluation is time-consuming, prone to inter-observer variability, and challenging to scale. The clinical usefulness of current AI systems is limited by the need for comprehensive pixel-level annotations. The objective of this research is to develop and evaluate a large-scale benchmarking study on a weakly supervised deep learning framework that minimizes the need for annotation and ensures interpretability for automated prostate cancer diagnosis and International Society of Urological Pathology (ISUP) grading using whole slide images (WSIs). This study rigorously tested six cutting-edge multiple instance learning (MIL) architectures (CLAM-MB, CLAM-SB, ILRA-MIL, AC-MIL, AMD-MIL, WiKG-MIL), three feature encoders (ResNet50, CTransPath, UNI2), and four patch extraction techniques (varying sizes and overlap) using the PANDA dataset (10,616 WSIs), yielding 72 experimental configurations. The methodology used distributed cloud computing to process over 31 million tissue patches, implementing advanced attention mechanisms to ensure clinical interpretability through Grad-CAM visualizations. The optimum configuration (UNI2 encoder with ILRA-MIL, 256 × 256 patches, 50% overlap) achieved 78.75% accuracy and 90.12% quadratic weighted kappa (QWK), outperforming traditional methods and approaching expert pathologist-level diagnostic capability. Overlapping smaller patches offered the best balance of spatial resolution and contextual information, while domain-specific foundation models performed noticeably better than generic encoders. This work is the first large-scale, comprehensive comparison of weakly supervised MIL methods for prostate cancer diagnosis and grading. The proposed approach has excellent clinical diagnostic performance, scalability, practical feasibility through cloud computing, and interpretability using visualization tools.
date: 2026-03
publication: Scientific Reports
id_number: doi:10.1038/s41598-026-39196-x
refereed: TRUE
issn: 2045-2322
official_url: http://doi.org/10.1038/s41598-026-39196-x
access: open
language: en
citation:   Article. Subjects > Biomedicine <http://repositorio.unincol.edu.co/view/subjects/uneat=5Fbm.html>
Subjects > Engineering <http://repositorio.unincol.edu.co/view/subjects/uneat=5Feng.html>
Universidad Europea del Atlántico > Research > Scientific Production <http://repositorio.unincol.edu.co/view/divisions/uneatlantico=5Fproduccion=5Fcientifica.html>
Fundación Universitaria Internacional de Colombia > Research > Articles and Books <http://repositorio.unincol.edu.co/view/divisions/unincol=5Fproduccion=5Fcientifica.html>
Universidad Internacional Iberoamericana México > Research > Scientific Production <http://repositorio.unincol.edu.co/view/divisions/uninimx=5Fproduccion=5Fcientifica.html>
Universidad Internacional Iberoamericana Puerto Rico > Research > Scientific Production <http://repositorio.unincol.edu.co/view/divisions/uninipr=5Fproduccion=5Fcientifica.html>
Universidad Internacional do Cuanza > Research > Scientific Production <http://repositorio.unincol.edu.co/view/divisions/unic=5Fproduccion=5Fcientifica.html>
Universidad de La Romana > Research > Scientific Production <http://repositorio.unincol.edu.co/view/divisions/uniromana=5Fproduccion=5Fcientifica.html>
Open access. English.
Butt, Naveed Anwer; Sarwat, Dilawaiz; Delgado Noya, Irene; Tutusaus, Kilian; Samee, Nagwan Abdel and Ashraf, Imran (mail: unspecified, unspecified, irene.delgado@uneatlantico.es, kilian.tutusaus@uneatlantico.es, unspecified, unspecified) (2026) Benchmarking multiple instance learning architectures from patches to pathology for prostate cancer detection and grading using attention-based weak supervision. Scientific Reports. ISSN 2045-2322 <http://repositorio.unincol.edu.co/id/eprint/27825/1/s41598-026-39196-x_reference.pdf>
document_url: http://repositorio.unincol.edu.co/id/eprint/27825/1/s41598-026-39196-x_reference.pdf