One Dimensional Word2Vec
I read a paper the other day that tries to generate one-dimensional word embeddings. It sounds like a crazy idea, but there's some interesting creativity applied here.
The idea is to take pre-existing word embeddings and connect all of them via a TSP solver. This creates a tour, and each embedding gets a position on that tour. That also means there's a distance between words that can be expressed as distance along the tour. This way you can still approximate distance, but you'd only have to store a single integer per token that denotes its order. For 40K words, that amounts to an embedding that's 321KB on disk.
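To make the idea concrete, here's a minimal sketch of the mechanic. The paper uses a proper off-the-shelf TSP solver; the `greedy_tour` nearest-neighbour heuristic and the variable names below are my own stand-ins, just to show how a tour turns high-dimensional embeddings into a single integer position per token.

```python
import numpy as np

def greedy_tour(embeddings: np.ndarray) -> list[int]:
    """Build a rough tour over word embeddings with a greedy
    nearest-neighbour heuristic (a real TSP solver would do better)."""
    n = len(embeddings)
    unvisited = set(range(1, n))
    tour = [0]
    while unvisited:
        last = embeddings[tour[-1]]
        # pick the closest not-yet-visited embedding
        nxt = min(unvisited, key=lambda i: np.linalg.norm(embeddings[i] - last))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

def tour_distance(pos_a: int, pos_b: int, n: int) -> int:
    """Distance between two words = how far apart they sit on the circular tour.
    The only thing stored per token is its position."""
    d = abs(pos_a - pos_b)
    return min(d, n - d)

# toy example: 5 random "word embeddings" in 50 dimensions
rng = np.random.default_rng(0)
emb = rng.normal(size=(5, 50))
tour = greedy_tour(emb)
position = {word_idx: pos for pos, word_idx in enumerate(tour)}
print(tour_distance(position[0], position[3], n=len(tour)))
```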
But how might you calculate distance? Well, there's a trick for that too: smoothing.
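The rough idea, as I understand it, is that you smear each word's count over its neighbours on the tour, so that documents containing nearby words end up with overlapping histograms. The exact kernel and document distance below are my own guesses, not the paper's recipe; this is just a sketch of the smoothing trick.

```python
import numpy as np

def smoothed_bow(token_positions: list[int], vocab_size: int, window: int = 2) -> np.ndarray:
    """Turn a document into a histogram over tour positions, then spread each
    count over neighbouring positions on the (circular) tour, so words that
    sit close together on the tour end up looking similar."""
    hist = np.zeros(vocab_size)
    for pos in token_positions:
        for offset in range(-window, window + 1):
            # simple triangular kernel; the actual weighting scheme is an assumption
            weight = 1.0 - abs(offset) / (window + 1)
            hist[(pos + offset) % vocab_size] += weight
    return hist / hist.sum()

# two "documents", each given as the tour positions of its tokens
doc_a = smoothed_bow([3, 4, 10], vocab_size=50)
doc_b = smoothed_bow([5, 11], vocab_size=50)
# nearby tour positions now overlap, so even a plain L1 distance picks up similarity
print(np.abs(doc_a - doc_b).sum())
```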
The paper goes on to compare these "WordTour" embeddings against PCA projections of the original word embeddings on some classification tasks.
The results aren't convincingly better than a normal bag of words, but the idea still tickled my mind a bit. A lot of folks try to make embeddings bigger, but this paper tries the other extreme by going for 1-dimensional representations. I also appreciate the creativity of using a TSP solver to get there, even if it might not be that practical.