Classification as a Heavy-Tail Regressor

I haven't benchmarked this idea, but it sounds like it might work.

Let's say that you want to run a regression algorithm on a dataset with a large skew. Many values may be zero, but there's a very long tail too. How might we go about regressing this?

We could … turn it into a classification problem instead.

Let's say that we have an ordered dataset: item 1 has the smallest regression value and item \(n\) has the largest. That means that:

\[ y_1 \leq y_2 \leq \ldots \leq y_{n-1} \leq y_n \]

Let's now say we have a new datapoint with value \(y_{new}\). Maybe we don't need to perform regression. Maybe we only need to care about whether \(y_{new} \leq y_1\). If it is, we just predict \(y_{new} = y_1\). If it's not, we try \(y_1 \leq y_{new} \leq y_2\). If that's not it, we can try \(y_2 \leq y_{new} \leq y_3\), and so on.
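As a sketch of this bracket search: at prediction time we don't actually know \(y_{new}\), so the snippet below only illustrates the search over the sorted training targets, assuming an oracle comparison is available. The function name and the example values are made up for illustration.

```python
import bisect

# Hypothetical sorted training targets y_1 <= ... <= y_n for a skewed
# dataset: many zeros, then a long tail.
y_sorted = [0, 0, 0, 1, 2, 5, 40, 300]

def predict_by_bracket(y_new):
    """Find the bracket y_i <= y_new <= y_{i+1} and predict from it.

    Here we snap to the nearer bracket endpoint; interpolating between
    the endpoints would be another reasonable choice.
    """
    i = bisect.bisect_left(y_sorted, y_new)
    if i == 0:
        return y_sorted[0]          # y_new <= y_1
    if i == len(y_sorted):
        return y_sorted[-1]         # y_new >= y_n
    lo, hi = y_sorted[i - 1], y_sorted[i]
    return lo if y_new - lo <= hi - y_new else hi
```

Because the search is over sorted values, placing a point takes \(O(\log n)\) comparisons rather than \(n\).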

This turns the problem on its head. We're no longer worrying about how heavy the tail could be; instead, we're asking where our new datapoint falls in the order of our training data. That means that we can use classification!

Given a classifier trained to detect order, we can now use it as a heuristic to place new data within that order.
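One way such a classifier might look (a sketch, not the post's own code) is a pairwise comparator: train on feature differences \(x_i - x_j\) with label \(y_i \leq y_j\), then rank a new point by counting how many training points the classifier judges it to sit above, and read off the training target at that rank. The dataset, names, and use of scikit-learn here are all illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical heavy-tailed dataset: y grows exponentially in x.
n = 200
X = rng.uniform(0, 4, size=(n, 1))
y = np.exp(X[:, 0]) - 1 + rng.normal(0, 0.1, size=n)

# Pairwise training examples: feature is x_i - x_j,
# label is 1 when y_i <= y_j.
i = rng.integers(0, n, size=2000)
j = rng.integers(0, n, size=2000)
pair_features = X[i] - X[j]
pair_labels = (y[i] <= y[j]).astype(int)

comparator = LogisticRegression().fit(pair_features, pair_labels)

y_sorted = np.sort(y)

def predict(x_new):
    """Rank x_new against every training point, then return the
    training target at that rank as the 'regressed' value."""
    # Label 0 for pair (new, k) means the comparator judges y_new > y_k.
    below = int((comparator.predict(x_new - X) == 0).sum())
    return y_sorted[min(below, n - 1)]
```

Note that the prediction can never exceed the largest training target, which is exactly the point: the tail's heaviness no longer enters the loss, only the ordering does.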