ADPv2: A hierarchical histological tissue type-annotated dataset for potential biomarker discovery of colorectal disease

Authors: Yang Z, Li K, Ramandi SG, Brassard P, Khellaf A, Trinh VQ, Zhang J, Chen L, Rowsell C, Varma S, Plataniotis K, Hosseini MS


Affiliations

1 Department of Computer Science & Software Engineering, Concordia University, 2155 Guy St, Montreal, QC H3H 2L9, Canada.
2 Department of Electrical & Computer Engineering, University of Toronto, 10 King's College Rd, Toronto, ON M5S 3G8, Canada.
3 Department of Chemistry & Biology, Toronto Metropolitan University, 350 Victoria St., Toronto, ON M5B 2K3, Canada.
4 Department of Medicine, Université de Montréal, Pavillon Roger-Gaudry, 2900 Edouard Montpetit Blvd, Montreal, QC H3T 1J4, Canada.
5 Department of Pathology & Molecular Medicine, Université de Montréal, 2900 Édouard-Montpetit Blvd, Montréal, QC H3T 1J4, Canada.
6 Axe Cancer, Centre de recherche du CHUM, 900 Saint-Denis St, Montréal, QC H2X 0A9, Canada.
7 Institut de recherche en immunologie et cancérologie, Université de Montréal, Marcelle-Coutu Pavilion, 2950 Chem. de Polytechnique, Montréal, QC H3T 1J4, Canada.
8 Anatomic Pathology, Sunnybrook Health Sciences Centre, 2075 Bayview Ave, Toronto, ON M4N 3M5, Canada.
9 Department of Laboratory Medicine & Pathobiology, University of Toronto, Simcoe Hall, 1 King's College Circle, Toronto, ON M5S 3K3, Canada.
10 Department of Pathology & Molecular Medicine, Queen's University, 88 Stuart Street, Kingston, ON K7L 3N6, Canada.

Description

Computational pathology (CPath) leverages histopathology images to enhance diagnostic precision and reproducibility in clinical pathology. However, publicly available CPath datasets annotated with extensive, granular histological tissue type (HTT) taxonomies remain scarce because of the expertise and high annotation costs required. Existing datasets, such as the Atlas of Digital Pathology (ADP), address this by offering diverse HTT annotations generalized to multiple organs, but they limit the capability for in-depth studies of specific organ diseases. Building upon this foundation, we introduce ADPv2, a novel dataset focused on gastrointestinal histopathology. Our dataset comprises 20,004 image patches derived from healthy colon biopsy slides, annotated according to a hierarchical taxonomy of 32 distinct HTTs across 3 levels. Furthermore, we train a multilabel representation learning model following a two-stage training procedure on our ADPv2 dataset. By leveraging the VMamba model architecture, we achieve a mean average precision of 0.88 in multilabel colon HTT classification. Finally, we show that our dataset supports organ-specific, in-depth study for potential biomarker discovery by analyzing the model's prediction behavior on tissues affected by different colon diseases, which reveals statistical patterns that confirm the two pathological pathways of colon cancer development. Our dataset is publicly available here: Part 1, Part 2, and Part 3.
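The abstract reports a mean average precision (mAP) for multilabel classification but does not say how it is computed. The sketch below shows one common convention, assumed here for illustration only: average precision is computed per label from ranked scores and then averaged over labels that have at least one positive patch. The function names, the toy labels, and the scores are hypothetical and are not taken from the ADPv2 paper or its code.

```python
from typing import List, Sequence


def average_precision(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Average precision for one label.

    Ranks samples by descending score and averages precision@k over
    every rank k that holds a positive sample. Ties keep input order.
    """
    order = sorted(range(len(y_score)), key=lambda i: y_score[i], reverse=True)
    hits = 0
    precisions: List[float] = []
    for rank, idx in enumerate(order, start=1):
        if y_true[idx] == 1:
            hits += 1
            precisions.append(hits / rank)
    if not precisions:
        raise ValueError("label has no positive samples")
    return sum(precisions) / len(precisions)


def mean_average_precision(
    labels: Sequence[Sequence[int]], scores: Sequence[Sequence[float]]
) -> float:
    """Macro-averaged AP over labels (rows = patches, columns = labels).

    Labels with no positive patch are skipped (an assumption of this sketch).
    """
    n_labels = len(labels[0])
    aps: List[float] = []
    for j in range(n_labels):
        col_true = [row[j] for row in labels]
        if sum(col_true) == 0:
            continue
        col_score = [row[j] for row in scores]
        aps.append(average_precision(col_true, col_score))
    if not aps:
        raise ValueError("no label has positive samples")
    return sum(aps) / len(aps)


# Toy example: 4 patches, 2 hypothetical tissue-type labels.
toy_labels = [[1, 0], [0, 1], [1, 0], [0, 0]]
toy_scores = [[0.9, 0.2], [0.8, 0.9], [0.7, 0.3], [0.1, 0.1]]
toy_map = mean_average_precision(toy_labels, toy_scores)
print(round(toy_map, 4))  # 0.9167
```

In the toy example the first label has an average precision of about 0.83 (positives at ranks 1 and 3) and the second has 1.0 (its single positive is ranked first), giving a mean of about 0.92. Other averaging conventions, such as micro-averaging over all label decisions, would give different numbers.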


Keywords: ADPv2 dataset; Biomarker discovery; Computational pathology; Deep learning; Multilabel representation learning


Links

PubMed: https://pubmed.ncbi.nlm.nih.gov/41658283/

DOI: 10.1016/j.jpi.2025.100537