This dataset contains low-resolution digitized herbarium scans from the Herbarium Senckenbergianum collection, annotated for plant organ segmentation and scale detection.
Each scan is labeled with six types of plant organs: leaf, flower, fruit, stem, seed, and root. Annotations are provided in two formats:
- Segmentation masks for each plant organ, suitable for training instance segmentation models like Mask R-CNN.
- YOLO annotations that include polygon coordinates for each object, supporting polygon-based detection and segmentation in YOLO-compatible frameworks.
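As a rough illustration of how the polygon annotations might be read, the sketch below parses one label line. It assumes the common YOLO segmentation layout: a class ID followed by normalized x y pairs in the range 0 to 1. The function name `parse_yolo_polygon` is hypothetical, and the mapping from class IDs to organ names is not stated here, so it should be taken from the dataset's own configuration.

```python
from typing import List, Tuple


def parse_yolo_polygon(
    line: str, img_w: int, img_h: int
) -> Tuple[int, List[Tuple[float, float]]]:
    """Parse one YOLO-style polygon label line into pixel coordinates.

    Assumed layout (not confirmed by the dataset description):
        <class_id> x1 y1 x2 y2 ... xn yn
    with all coordinates normalized to [0, 1].
    """
    parts = line.split()
    # One class ID plus at least three (x, y) pairs gives an odd count >= 7.
    if len(parts) < 7 or len(parts) % 2 == 0:
        raise ValueError(f"not a valid polygon label line: {line!r}")

    class_id = int(parts[0])
    coords = [float(v) for v in parts[1:]]

    # Scale normalized coordinates back to pixel units.
    points = [
        (coords[i] * img_w, coords[i + 1] * img_h)
        for i in range(0, len(coords), 2)
    ]
    return class_id, points


# Example with a made-up label line for a 1000 x 2000 pixel scan.
class_id, points = parse_yolo_polygon("0 0.25 0.5 0.75 0.5 0.5 0.25", 1000, 2000)
```

The resulting pixel-space polygon can then be rasterized into a mask or passed to a plotting routine, depending on the framework in use.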
The dataset also includes annotations for scale bars commonly found in herbarium sheets, enabling models to detect scales for measurement and standardization tasks.
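To make the measurement use case concrete, the following sketch converts pixel distances to millimetres once a scale bar has been detected. It assumes the physical length of the detected bar is known from the sheet itself (for example a 100 mm ruler); the dataset description does not state bar lengths, and both function names are hypothetical.

```python
def pixels_per_mm(bar_length_px: float, bar_length_mm: float) -> float:
    """Return the image resolution implied by a detected scale bar.

    bar_length_px: length of the detected bar in pixels, e.g. the longer
                   side of its bounding box.
    bar_length_mm: physical length of that bar in millimetres (assumed known).
    """
    if bar_length_px <= 0 or bar_length_mm <= 0:
        raise ValueError("lengths must be positive")
    return bar_length_px / bar_length_mm


def px_to_mm(distance_px: float, px_per_mm: float) -> float:
    """Convert a pixel distance to millimetres using a given resolution."""
    return distance_px / px_per_mm


# Example with made-up numbers: a 100 mm bar spanning 500 pixels.
resolution = pixels_per_mm(500, 100)
leaf_length_mm = px_to_mm(250, resolution)
```

Using the longer side of the bounding box is only an approximation; a rotated or partially occluded bar would need the mask or polygon instead.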
The data is organized into two main parts:
- plant_organ_dataset: Contains images, masks, and bounding-box and polygon annotations for plant organs.
- scale_dataset: Contains images, masks, and bounding-box and polygon annotations for scale bars.
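When working with either part, images usually need to be matched to their masks. The sketch below pairs files by base name. It assumes that an image and its mask share the same file stem, and the paths in the example (`images/`, `masks/`) are purely hypothetical, since the internal folder layout is not described here.

```python
from pathlib import Path
from typing import Iterable, List, Tuple


def pair_by_stem(
    images: Iterable[Path], masks: Iterable[Path]
) -> List[Tuple[Path, Path]]:
    """Pair each image with the mask that has the same file stem.

    Images without a matching mask are skipped. The shared-stem naming
    convention is an assumption, not something the dataset guarantees.
    """
    mask_by_stem = {m.stem: m for m in masks}
    return [
        (img, mask_by_stem[img.stem])
        for img in sorted(images)
        if img.stem in mask_by_stem
    ]


# Example with hypothetical file names; on disk one might instead use
# Path("plant_organ_dataset").rglob("*.jpg") and a similar call for masks.
pairs = pair_by_stem(
    [Path("images/sheet_b.jpg"), Path("images/sheet_a.jpg")],
    [Path("masks/sheet_a.png")],
)
```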
The dataset includes ML Croissant metadata (mlcroissant.json) that describes the dataset’s structure, labels, and annotation formats in a standardized way for easier use in machine learning workflows.
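Because Croissant metadata is JSON-LD, it can be inspected with the standard library alone. The sketch below summarizes a metadata document, assuming it uses the `name`, `distribution`, and `recordSet` keys defined by the Croissant format; the actual contents of mlcroissant.json may differ, and the helper names are hypothetical.

```python
import json
from typing import Any, Dict, List


def load_croissant(path: str) -> Dict[str, Any]:
    """Read a Croissant JSON-LD file into a plain dictionary."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def summarize_croissant(meta: Dict[str, Any]) -> Dict[str, Any]:
    """Collect the dataset name, distribution entries, and record sets.

    Missing keys yield empty lists rather than errors, because the exact
    layout of the metadata file is an assumption here.
    """
    files: List[str] = [d.get("name", "") for d in meta.get("distribution", [])]
    record_sets: List[str] = [r.get("name", "") for r in meta.get("recordSet", [])]
    return {"name": meta.get("name"), "files": files, "record_sets": record_sets}


# Example with a small made-up metadata document; for the real file one
# would call summarize_croissant(load_croissant("mlcroissant.json")).
example = {
    "name": "example",
    "distribution": [{"@type": "cr:FileSet", "name": "images"}],
    "recordSet": [{"@type": "cr:RecordSet", "name": "annotations"}],
}
summary = summarize_croissant(example)
```

For full validation and record iteration, the dedicated Croissant tooling is likely a better fit than this manual inspection.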