Leveraging Image Captions for Selective Whole Slide Image Annotation

Qiu J, Aubreville M, Wilm F, Öttl M, Utz J, Schlereth M, Breininger K (2024)

Publication Type: Conference contribution

Publication year: 2024

Journal: Lecture Notes in Computer Science (Springer Verlag)

Publisher: Springer Science and Business Media Deutschland GmbH

Book Volume: 15012 LNCS

Pages Range: 207-217

Conference Proceedings Title: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Event location: Marrakesh, MAR

ISBN: 9783031723896

DOI: 10.1007/978-3-031-72390-2_20

Abstract

Acquiring annotations for whole slide image (WSI)-based deep learning tasks, such as creating tissue segmentation masks or detecting mitotic figures, is a laborious process due to the extensive image size and the significant manual work involved in the annotation. This paper focuses on identifying and annotating specific image regions that optimize model training, given a limited annotation budget. While random sampling helps capture data variance by collecting annotation regions throughout the WSI, insufficient data curation may result in an inadequate representation of minority classes. Recent studies proposed diversity sampling to select a set of regions that maximally represent unique characteristics of the WSIs. This is done by pretraining on unlabeled data through self-supervised learning and then clustering all regions in the latent space. However, establishing the optimal number of clusters can be difficult, and not all clusters are task-relevant. This paper presents prototype sampling, a new method for annotation region selection. It discovers regions exhibiting typical characteristics of each task-specific class. The process entails recognizing class prototypes from extensive histopathology image-caption databases and detecting unlabeled image regions that resemble these prototypes. Our results show that prototype sampling is more effective than random and diversity sampling in identifying annotation regions with valuable training information, resulting in improved model performance in semantic segmentation and mitotic figure detection tasks. Code is available at https://github.com/DeepMicroscopy/Prototype-sampling.
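
The selection step described in the abstract (finding unlabeled regions that resemble class prototypes) can be illustrated with a minimal sketch. This is not the authors' implementation; their code is at the repository linked above. The sketch assumes that region embeddings and one prototype embedding per class have already been computed in a shared feature space, and it uses cosine similarity as the resemblance measure, which is an assumption. The function names `cosine_similarity` and `select_regions_by_prototype`, the class names `tumor` and `stroma`, and all numeric values are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_regions_by_prototype(
    region_embeddings: List[Sequence[float]],
    class_prototypes: Dict[str, Sequence[float]],
    regions_per_class: int,
) -> Dict[str, List[int]]:
    """For each class, return indices of the regions most similar to its prototype.

    Assumption: a region is assigned to at most one class, so the
    annotation budget is never spent twice on the same region.
    """
    selected: Dict[str, List[int]] = {}
    taken = set()
    for class_name, prototype in class_prototypes.items():
        # Rank all regions by similarity to this class prototype, best first.
        ranked = sorted(
            range(len(region_embeddings)),
            key=lambda i: cosine_similarity(region_embeddings[i], prototype),
            reverse=True,
        )
        picks = [i for i in ranked if i not in taken][:regions_per_class]
        taken.update(picks)
        selected[class_name] = picks
    return selected


# Toy 2-D embeddings (hypothetical values, for illustration only).
regions = [(1.0, 0.1), (0.9, 0.2), (0.1, 1.0), (0.2, 0.8), (0.7, 0.7)]
prototypes = {"tumor": (1.0, 0.0), "stroma": (0.0, 1.0)}
selection = select_regions_by_prototype(regions, prototypes, regions_per_class=2)
print(selection)
```

With these toy inputs, each class receives the two regions closest to its prototype, and the ambiguous region at index 4 is left unselected. Unlike the clustering-based diversity sampling mentioned in the abstract, this kind of selection needs no cluster count; the per-class budget `regions_per_class` is the only parameter in the sketch.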

Authors with CRIS profile

Jingna Qiu, Department Artificial Intelligence in Biomedical Engineering (AIBE)

Frauke Wilm, Department Artificial Intelligence in Biomedical Engineering (AIBE)

Mathias Öttl, Department Artificial Intelligence in Biomedical Engineering (AIBE)

Jonas Utz, Department Artificial Intelligence in Biomedical Engineering (AIBE)

Maja Schlereth, Department Artificial Intelligence in Biomedical Engineering (AIBE)

Katharina Breininger, Department Artificial Intelligence in Biomedical Engineering (AIBE)

Involved external institutions

Technische Hochschule Ingolstadt

Germany (DE)

How to cite

APA:

Qiu, J., Aubreville, M., Wilm, F., Öttl, M., Utz, J., Schlereth, M., & Breininger, K. (2024). Leveraging Image Captions for Selective Whole Slide Image Annotation. In Marius George Linguraru, Qi Dou, Aasa Feragen, Stamatia Giannarou, Ben Glocker, Karim Lekadir, Julia A. Schnabel (Eds.), Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (pp. 207-217). Marrakesh, MAR: Springer Science and Business Media Deutschland GmbH.

MLA:

Qiu, Jingna, et al. "Leveraging Image Captions for Selective Whole Slide Image Annotation." Proceedings of the 27th International Conference on Medical Image Computing and Computer-Assisted Intervention, MICCAI 2024, Marrakesh, MAR. Ed. Marius George Linguraru, Qi Dou, Aasa Feragen, Stamatia Giannarou, Ben Glocker, Karim Lekadir, Julia A. Schnabel, Springer Science and Business Media Deutschland GmbH, 2024. 207-217.
