Classification via Segmented Attention

I spotted an interesting paper the other day that starts by detecting bird species. It introduces an algorithm called PDiscoNet that tries to detect specific categories based on predefined segments.

The steps described in the paper: the approach is to first detect specific parts of the bird and then run a classifier on each separately.

Detecting a specific species is tricky, but it becomes a lot easier if you know what to look for. Some species have a specific beak, while others have a more or less distinctive wing. So what if you try to detect interesting features first? Things like the head, neck, wing and belly? You could then focus your attention on those parts and make your prediction based on these features.
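To make the "find the parts first, then classify each one" idea a bit more concrete, here is a toy sketch in plain Python. It is not the paper's implementation. The function names, the softmax assignment of locations to parts, and the choice to average the per-part logits are all my own assumptions for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def part_pooled_features(features, part_scores):
    """Pool one feature vector per part (hypothetical helper).

    features:    N image locations, each a D-dim feature vector.
    part_scores: N image locations, each with K scores (one per part).
    Returns K vectors of D dims: the attention-weighted mean per part.
    """
    # Softmax over parts: each location is softly assigned to a part.
    assign = [softmax(scores) for scores in part_scores]
    num_parts = len(part_scores[0])
    dim = len(features[0])
    pooled = []
    for k in range(num_parts):
        weight = sum(a[k] for a in assign)
        pooled.append([
            sum(a[k] * f[d] for a, f in zip(assign, features)) / weight
            for d in range(dim)
        ])
    return pooled


def classify(pooled, part_weights):
    """Run a separate linear classifier per part, then average the logits.

    part_weights[k][c] is the D-dim weight vector of part k for class c.
    Averaging is an assumption; other ways of combining parts are possible.
    """
    num_classes = len(part_weights[0])
    logits = [0.0] * num_classes
    for vec, weights_k in zip(pooled, part_weights):
        for c, w in enumerate(weights_k):
            logits[c] += sum(v * x for v, x in zip(vec, w)) / len(pooled)
    return softmax(logits)


# Toy input: 4 locations, 2-dim features, 2 parts (say "head" and "wing").
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
part_scores = [[5.0, 0.0], [5.0, 0.0], [0.0, 5.0], [0.0, 5.0]]
# Class 0 expects feature 0 on the head and feature 1 on the wing.
part_weights = [
    [[2.0, 0.0], [0.0, 2.0]],  # head classifier: class 0, class 1
    [[0.0, 2.0], [2.0, 0.0]],  # wing classifier: class 0, class 1
]

pooled = part_pooled_features(features, part_scores)
probs = classify(pooled, part_weights)
print(probs)
```

In a real model the features and part scores would come out of a neural network and everything would be trained end to end. The point here is only the flow: attention maps carve the image into parts, and each part gets its own say in the final prediction.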

Comparison of different methods.

It turns out that this won't just work for birds though. You could also do something similar when trying to detect specific people. In this case you'd go for facial features instead.

A similar trick can be applied to faces too.

Pretty interesting read. And creative use of a bird dataset that I didn't know about.