Dataset Pruning using Evolutionary Optimization

Neubig L, Kist A (2023)


Publication Type: Conference contribution

Publication year: 2023

Publisher: Springer Science and Business Media Deutschland GmbH

Pages Range: 134-139

Conference Proceedings Title: Informatik aktuell

Event location: Braunschweig, DEU

ISBN: 9783658416560

DOI: 10.1007/978-3-658-41657-7_30

Abstract

Data is key to training deep neural networks, and individual data units are commonly expected to be abundant and diverse. However, what makes a data unit informative, and how the amount of data relates to neural network performance, has barely been investigated. In this study, we use evolutionary algorithms to optimize data usage during deep neural network training. We tested multiple medical classification and segmentation datasets, since these are key tasks in medical imaging, and found that this so-called dataset pruning removes rather unimportant data elements. Depending on how strongly we penalized the incorporation of data, we found that, across tasks and datasets, the algorithm itself incorporates a critical amount of data. This shows that future research needs to incorporate not merely abundant data but relevant data.
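
The general idea of penalized, evolutionary subset selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the algorithm's details, so the function names (`fitness`, `mutate`, `evolve_mask`), the mutation-only scheme with elitism, and the toy `values` score that stands in for actually training a network are all assumptions made for illustration.

```python
import random


def fitness(mask, values, penalty):
    """Toy fitness (assumption): saturating 'performance' minus a data-usage cost.

    `values` are hypothetical per-sample informativeness scores that stand in
    for the effect a sample would have on a trained network.
    """
    kept = sum(v for keep, v in zip(mask, values) if keep)
    performance = kept / (1.0 + kept)        # stand-in for a validation metric
    fraction_used = sum(mask) / len(mask)    # share of the dataset that is kept
    return performance - penalty * fraction_used


def mutate(mask, rate, rng):
    """Flip each keep/drop decision independently with probability `rate`."""
    return [(not keep) if rng.random() < rate else keep for keep in mask]


def evolve_mask(values, penalty, generations=200, population=20, rate=0.05, seed=0):
    """Evolve a boolean keep-mask over the samples (mutation-only, with elitism)."""
    rng = random.Random(seed)
    n = len(values)
    # Start from the full dataset plus random subsets, so the result can never
    # score worse than simply using all data.
    pop = [[True] * n]
    pop += [[rng.random() < 0.5 for _ in range(n)] for _ in range(population - 1)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, values, penalty), reverse=True)
        parents = pop[: population // 2]     # the best half survives unchanged
        children = [
            mutate(rng.choice(parents), rate, rng)
            for _ in range(population - len(parents))
        ]
        pop = parents + children
    return max(pop, key=lambda m: fitness(m, values, penalty))
```

Calling `evolve_mask(values, penalty=0.2)` on a list of hypothetical per-sample scores returns a list of booleans marking which samples are kept. Raising `penalty` makes each kept sample more expensive, which in this toy setting mirrors the abstract's notion of punishing the incorporation of data.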

How to cite

APA:

Neubig, L., & Kist, A. (2023). Dataset Pruning using Evolutionary Optimization. In Thomas M. Deserno, Heinz Handels, Andreas Maier, Klaus Maier-Hein, Christoph Palm, Thomas Tolxdorff (Eds.), Informatik aktuell (pp. 134-139). Braunschweig, DEU: Springer Science and Business Media Deutschland GmbH.

MLA:

Neubig, Luisa, and Andreas Kist. "Dataset Pruning using Evolutionary Optimization." Proceedings of the Bildverarbeitung für die Medizin Workshop, BVM 2023, Braunschweig, DEU. Ed. Thomas M. Deserno, Heinz Handels, Andreas Maier, Klaus Maier-Hein, Christoph Palm, Thomas Tolxdorff, Springer Science and Business Media Deutschland GmbH, 2023. 134-139.
