I stumbled upon a funny paper the other day.
The paper is titled "torch.manual_seed(3407) is all you need: On the influence of random seeds in deep learning architectures for computer vision" by David Picard.
The Effect of Entropy
The paper describes the effect of the random seed on training computer vision models. The headline result is a histogram of final accuracies across seeds.
It's not nothing. And, to quote the paper:
On a scanning of 10^4 seeds, we obtained a difference between the maximum and minimum accuracy close to 2% which is above the threshold commonly used by the computer vision community of what is considered significant.
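To make the protocol concrete, here's a toy sketch of such a seed scan. `train_and_eval` is a hypothetical stand-in for a full training run (a seeded RNG producing a noisy "accuracy"), not the paper's actual setup:

```python
import random

def train_and_eval(seed):
    # Hypothetical stand-in for a full training run: a seeded RNG
    # yields a noisy "accuracy" around 90%. In the paper, this would
    # be torch.manual_seed(seed) followed by real training.
    rng = random.Random(seed)
    return 0.90 + rng.gauss(0.0, 0.005)

# Scan a range of seeds and measure the max-minus-min accuracy
# spread, the quantity the quote above reports as ~2% over 10^4 seeds.
accuracies = {seed: train_and_eval(seed) for seed in range(100)}
spread = max(accuracies.values()) - min(accuracies.values())
print(f"spread over {len(accuracies)} seeds: {spread:.4f}")
```

The point of the scan is that nothing but the seed varies between runs, so the whole spread is attributable to randomness.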
There's a grain of salt to take with this result, which the paper also mentions:
First, the accuracy obtained in these experiments are not at the level of the state of the art. This is because of the budget constraint that forbids training for longer times. One could argue that the variations observed in these experiments could very well disappear after longer training times and/or with the better setup required to reach higher accuracy.
That said, this paper is a nice example of a growing concern of mine: neural network training is not deterministic. At all. That's a shame, because it suggests that papers may be favoring luck by accident.
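The mechanics are simple: the seed fixes everything the RNG feeds, starting with weight initialization, so changing it changes the run. A minimal sketch, using Python's stdlib `random` as a stand-in for `torch.manual_seed` plus a layer init:

```python
import random

def init_weights(seed, n=4):
    # Stand-in for seeding a framework RNG and initializing a layer:
    # the seed fully determines the draw.
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

# Same seed: bit-identical initialization, hence a reproducible run.
assert init_weights(3407) == init_weights(3407)

# Different seed: a different starting point, and (after training)
# a slightly different final model and accuracy.
assert init_weights(3407) != init_weights(42)
```

So a fixed seed makes a single run reproducible, but says nothing about how lucky that particular seed is, which is exactly what the paper's scan measures.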
Or as the author nicely puts it:
As a matter of comparison, there are more than 10^4 submissions to major computer vision conferences each year. Of course, the submissions are not submitting the very same model, but they nonetheless account for an exploration of the seed landscape comparable in scale of the present study, of which the best ones are more likely to be selected for publication because of the impression it has on the reviewers.
For each of these submissions, the researchers are likely to have modified many times hyper-parameters or even the computational graph through trial and error as is common practice in deep learning. Even if these changes were insignificant in terms of accuracy, they would have contributed to an implicit scan of seeds. Authors may inadvertently be searching for a lucky seed just by tweaking their model. Competing teams on a similar subject with similar methods may unknowingly aggregate the search for lucky seeds.
I am definitely not saying that all recent publications in computer vision are the result of lucky seed optimization. This is clearly not the case, these methods work. However, in the light of this short study, I am inclined to believe that many results are overstated due to implicit seed selection - be it from common experimental practice of trial and error or of the “evolutionary pressure” that peer review exerts on them.