Invasive species
Today I learned that there is a computer vision dataset of 196 invasive species. It's a pretty big dataset too: 19K images with bounding box annotations, plus another 1.2M unlabelled images. The dataset comes with a paper as well as a website, and there are a few interesting things about it.
![](taxonomy.png)
For starters, the dataset respects a taxonomy, which means that the classes can be related to each other hierarchically. That can be useful because some species belong to the same family.
![](family.png)
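To make the hierarchy idea a bit more concrete, here is a minimal sketch of how species-level labels could be rolled up to the family level. This is not the dataset's actual format or API. The `SPECIES_TO_FAMILY` mapping and both helper functions are hypothetical, and the three species are just examples I picked for illustration, so they may not match the dataset's classes.

```python
from collections import Counter

# Hypothetical species -> family lookup. Illustrative entries only;
# not taken from the dataset's own label files.
SPECIES_TO_FAMILY = {
    "Anoplophora glabripennis": "Cerambycidae",
    "Anoplophora chinensis": "Cerambycidae",
    "Lycorma delicatula": "Fulgoridae",
}


def same_family(species_a, species_b):
    """Return True when two species share a family in the lookup."""
    return SPECIES_TO_FAMILY[species_a] == SPECIES_TO_FAMILY[species_b]


def family_counts(species_labels):
    """Roll a list of species-level labels up to per-family counts."""
    return Counter(SPECIES_TO_FAMILY[s] for s in species_labels)


labels = [
    "Anoplophora glabripennis",
    "Anoplophora chinensis",
    "Lycorma delicatula",
]
print(family_counts(labels))
print(same_family("Anoplophora glabripennis", "Anoplophora chinensis"))
```

With a lookup like this you could, for example, score a model at the family level, where mixing up two species from the same family counts as less wrong than mixing up species from different families.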
But the dataset even goes a step further by taking life cycles into account. Insects of a specific species look different when they are mere larvae, and you may also be interested in what the eggs look like.
![](eggs.png)
The paper gives more details on how they used CLIP to construct the embeddings, as well as some benchmarks on common models. But I thought it was interesting to see domain knowledge seeping into the data collection methods like this. Neat!