# NLP Embeddings
Embeddings are vector representations of tokens (words or subwords) that let machines work with text numerically. They place tokens in a multi-dimensional space where tokens with similar meanings end up close together.
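To make “close together” concrete, here is a minimal sketch in Python with NumPy. The toy 3-dimensional vectors are invented for illustration; real embeddings typically have hundreds of dimensions.

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors:
    # near 1.0 = similar direction (similar meaning), near 0.0 = unrelated
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Toy 3-dimensional "embeddings" (made up for illustration)
cat = np.array([0.9, 0.8, 0.1])
dog = np.array([0.85, 0.75, 0.2])
car = np.array([0.1, 0.2, 0.9])

print(cosine_similarity(cat, dog))  # high: related meanings sit close together
print(cosine_similarity(cat, car))  # low: unrelated meanings sit far apart
```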
## Static vs. Contextual Embeddings
The biggest evolution in NLP was the move from “one word, one vector” to “one word, many vectors (depending on context)”.
### 1. Static Embeddings (Word2Vec)
Models: Word2Vec, GloVe, FastText
In these models, a token has a fixed representation regardless of where it appears.
- How it works: Assigns a single vector to each word in the vocabulary.
- The Problem: It cannot handle polysemy (words with multiple meanings).
- Example: The word “bank” has the same vector in “river bank” and “bank deposit”; the model can’t tell the difference (see the sketch below).
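A minimal sketch of this static behavior, assuming gensim is installed; the toy corpus and hyperparameters are invented for illustration:

```python
from gensim.models import Word2Vec

# Tiny toy corpus: one tokenized sentence per list (invented for illustration)
sentences = [
    ["the", "river", "bank", "was", "muddy"],
    ["she", "made", "a", "bank", "deposit"],
]

# Train a small Word2Vec model; hyperparameters are arbitrary for this sketch
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, seed=42, workers=1)

# The embedding is a plain table lookup: one fixed vector per vocabulary word,
# so "bank" gets the same vector whether the sentence was about rivers or money
bank_vec = model.wv["bank"]
print(bank_vec.shape)  # (50,)
```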
### 2. Contextual Embeddings (BERT)
Models: BERT, RoBERTa, GPT (Transformer-based models)
In these models, the representation of a token changes dynamically based on surrounding tokens.
- How it works: Uses attention to incorporate context. BERT uses bidirectional attention; GPT uses causal (left-to-right) attention.
- The Solution: Because each vector is computed from the surrounding sentence, different senses of the same word get different vectors.
- Example: The vector for “bank” in “river bank” differs substantially from the vector for “bank” in “bank deposit” (see the sketch below).
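A minimal sketch of the contextual behavior using Hugging Face transformers and the bert-base-uncased checkpoint (an assumed choice; any BERT-style model behaves the same way):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def bank_vector(sentence):
    # Run the full sentence through BERT and return the hidden state
    # at the position of the "bank" token
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # (seq_len, 768)
    bank_id = tokenizer.convert_tokens_to_ids("bank")
    position = (inputs["input_ids"][0] == bank_id).nonzero()[0].item()
    return hidden[position]

v1 = bank_vector("I sat on the river bank.")
v2 = bank_vector("I made a bank deposit.")

# Well below 1.0: the same surface word gets context-dependent vectors
print(torch.nn.functional.cosine_similarity(v1, v2, dim=0).item())
```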
## Comparison Table
| Feature | Word2Vec / GloVe | BERT (Transformer) |
|---|---|---|
| Type | Static Embedding | Contextual Embedding |
| Context Awareness | ❌ None (Context-independent) | ✅ High (Context-dependent) |
| Handling Polysemy | ❌ Fails (One vector per word) | ✅ Excellent (Different vectors for same word) |
| Computation | Fast, lightweight | Slower, computationally heavy |
| Best For | Simple analogies, keyword matching | Complex understanding, QA, sentiment analysis |
## Exam Tips
- Exam Tip: Contextual embeddings (e.g., BERT, RoBERTa, GPT) differentiate contextual meanings of the same word in different phrases (polysemy).
- Exam Tip: Word2Vec and GloVe are static and cannot distinguish word senses based on context.