author: Peter Frick
Word2vec is an established algorithm for learning vector embeddings of individual words or items with a shallow network. Learning embeddings at the document level (e.g., for a whole collection of words) is a more recent area of research.
This repo explores several network architectures for learning embeddings at the tweet level:
- maxpool encoder
- prediction autoencoder
- embeddings autoencoder
Each is explored in an individual *.ipynb notebook.
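The maxpool encoder above can be sketched minimally: each word of a tweet is mapped to its embedding, and the tweet vector is the element-wise max over those embeddings. The toy vocabulary, dimensionality, and random embedding matrix below are illustrative assumptions, not values from the notebooks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy vocabulary and embedding matrix; in the notebooks these would be
# learned or initialized from pretrained vectors.
vocab = {"the": 0, "cat": 1, "sat": 2, "on": 3, "mat": 4}
dim = 8
embeddings = rng.normal(size=(len(vocab), dim))

def maxpool_encode(tweet, vocab, embeddings):
    """Encode a tweet as the element-wise max over its word embeddings."""
    ids = [vocab[w] for w in tweet.split() if w in vocab]
    return embeddings[ids].max(axis=0)

vec = maxpool_encode("the cat sat on the mat", vocab, embeddings)
print(vec.shape)  # a single fixed-size vector per tweet
```

Max-pooling is order-invariant, which is part of what the other two (autoencoder-based) architectures try to improve on.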
Pretrained GloVe embeddings for tweets are publicly available here
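A loader for the downloaded vectors might look like the sketch below. It assumes the standard GloVe text format (one token per line followed by its space-separated vector components); the example path in the comment is illustrative.

```python
import numpy as np

def load_glove(path):
    """Parse a GloVe text file into a dict of token -> vector."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors

# e.g. glove = load_glove("glove.twitter.27B.100d.txt")
```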