Plant organ segmentation images and annotations on digitized herbarium scans

This dataset contains low-resolution digitized herbarium scans from the Herbarium Senckenbergianum collection, annotated for plant organ segmentation and scale detection.

Each scan is labeled with six types of plant organs: leaf, flower, fruit, stem, seed, and root. Annotations are provided in two formats:

  • Segmentation masks for each plant organ, suitable for training instance segmentation models like Mask R-CNN.
  • YOLO annotations that include polygon coordinates for each object, supporting polygon-based detection and segmentation in YOLO-compatible frameworks.
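As a rough illustration of how the polygon annotations might be consumed, the sketch below parses a single label line. It assumes the common YOLO segmentation layout (a class index followed by x/y pairs normalized to the image width and height); the exact layout and the class-index-to-organ mapping used in this dataset are not stated here and should be checked against the files. The function names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def parse_yolo_polygon(line: str, img_w: int, img_h: int) -> Tuple[int, List[Point]]:
    """Parse one label line into (class_id, polygon in pixel coordinates).

    Assumed layout: "<class_id> x1 y1 x2 y2 ... xn yn" with coordinates
    normalized to [0, 1]. This is an assumption, not confirmed by the dataset.
    """
    parts = line.split()
    # A polygon needs a class id plus at least three x/y pairs.
    if len(parts) < 7 or (len(parts) - 1) % 2 != 0:
        raise ValueError("expected a class id followed by at least 3 x/y pairs")
    class_id = int(parts[0])
    coords = [float(v) for v in parts[1:]]
    # Scale normalized coordinates to pixel coordinates.
    polygon = [(x * img_w, y * img_h) for x, y in zip(coords[0::2], coords[1::2])]
    return class_id, polygon


def polygon_bbox(polygon: List[Point]) -> Tuple[float, float, float, float]:
    """Return the axis-aligned box (x_min, y_min, x_max, y_max) of a polygon."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)
```

Calling `parse_yolo_polygon` on each line of a label file, with the width and height of the matching image, yields polygons that can be drawn or rasterized into masks.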

The dataset also includes annotations for scale bars commonly found in herbarium sheets, enabling models to detect scales for measurement and standardization tasks.
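The measurement use case can be sketched as simple arithmetic: once a scale bar has been detected, its length in pixels and its known physical length give a pixels-per-millimeter factor for the sheet. The example assumes that the physical length of the scale bar is known from the sheet itself and that the scan has uniform resolution; both function names are hypothetical.

```python
def pixels_per_mm(scale_bar_px: float, scale_bar_mm: float) -> float:
    """Return the image resolution in pixels per millimeter.

    scale_bar_px: length of the detected scale bar in pixels.
    scale_bar_mm: physical length of that scale bar (assumed to be known).
    """
    if scale_bar_px <= 0 or scale_bar_mm <= 0:
        raise ValueError("scale bar lengths must be positive")
    return scale_bar_px / scale_bar_mm


def px_to_mm(length_px: float, px_per_mm: float) -> float:
    """Convert a length measured in pixels to millimeters."""
    if px_per_mm <= 0:
        raise ValueError("px_per_mm must be positive")
    return length_px / px_per_mm
```

For instance, a 100 mm scale bar that spans 400 pixels gives 4 pixels per millimeter, so a leaf measuring 250 pixels would be 62.5 mm long.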

The data is organized into two main parts:

  • plant_organ_dataset: Contains images, masks, bounding box and polygon annotations for plant organs.
  • scale_dataset: Contains images, masks, bounding box and polygon annotations for scale bars.
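Because the internal folder layout of the two parts is not described here, a first step after download could be to take an inventory of what each part contains. The sketch below only assumes the two top-level folder names given above; it counts files by extension without assuming any subfolder names. The function name and the root path are hypothetical.

```python
from collections import Counter
from pathlib import Path
from typing import Dict

PARTS = ("plant_organ_dataset", "scale_dataset")


def inventory(root: str) -> Dict[str, Counter]:
    """Count files per extension under each top-level part of the dataset."""
    counts: Dict[str, Counter] = {}
    for part in PARTS:
        part_dir = Path(root) / part
        by_suffix: Counter = Counter()
        if part_dir.is_dir():
            # Walk the whole subtree; subfolder names are not assumed.
            for path in part_dir.rglob("*"):
                if path.is_file():
                    by_suffix[path.suffix.lower()] += 1
        counts[part] = by_suffix
    return counts
```

Comparing the number of image files with the number of label or mask files is a quick way to confirm that every scan has its annotations.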

The dataset includes ML Croissant metadata (mlcroissant.json) that describes the dataset’s structure, labels, and annotation formats in a standardized way for easier use in machine learning workflows.
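A minimal way to inspect the Croissant file without extra dependencies is to read it as plain JSON. The sketch below assumes the file follows the usual Croissant JSON-LD layout, with top-level `distribution` and `recordSet` entries; what this particular file actually contains has not been verified here, and the function name is hypothetical.

```python
import json
from typing import Any, Dict, List


def _as_list(value: Any) -> List[Any]:
    """JSON-LD allows a single object where a list is expected; normalize it."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def summarize_croissant(path: str) -> Dict[str, Any]:
    """Return the dataset name plus the names of its files and record sets."""
    with open(path, encoding="utf-8") as f:
        meta = json.load(f)

    def label(item: Dict[str, Any]) -> Any:
        # Fall back to the JSON-LD identifier when no name is given.
        return item.get("name") or item.get("@id")

    return {
        "name": meta.get("name"),
        "files": [label(d) for d in _as_list(meta.get("distribution"))],
        "record_sets": [label(r) for r in _as_list(meta.get("recordSet"))],
    }
```

Calling `summarize_croissant("mlcroissant.json")` on the downloaded file gives a quick overview before the metadata is passed to a dedicated Croissant loader.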


Dataset DOI: doi:10.12761/fj4m-zr97


Additional information

Field Value
Other information
Last updated May 22, 2025, 09:21 (UTC)
Created May 21, 2025, 13:30 (UTC)

Responsible parties

Creator and point of contact
Name Rajapreethi Rajendran

Creator
Name Marco Schmidt

Creator
Name Claus Weiland

Creator
Name Jonas Grieb

Research data management planning

Estimated volume of created data <1GB
Data will be stored at (long-term archived): The images in this dataset are reduced-resolution copies; the original-resolution scans are long-term archived at https://doi.org/10.1594/PANGAEA.920895. The annotations added in this dataset are stored only here (in the Dataportal).

Link to this dataset

https://doi.org/10.12761/fj4m-zr97