Day 08
Section outline
March 19th, Wednesday (16:30-18:30)
Transformers: short recap
- Attention
- Encoder
- Decoder
- Residual stream
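The attention mechanism recapped above can be sketched in a few lines of NumPy. This is an illustrative single-head, unmasked version of scaled dot-product attention (the function and variable names are mine, not from the course materials): scores are dot products of queries and keys, scaled by sqrt(d_k), and turned into weights with a softmax.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the row max before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Q, K, V: (seq_len, d_k) matrices of queries, keys, and values.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # (seq_len, seq_len) similarity scores
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V, weights          # weighted sum of values

# Toy example: 4 tokens, 8-dimensional head.
rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out, w = scaled_dot_product_attention(Q, K, V)
```

A decoder additionally applies a causal mask (setting scores for future positions to a large negative value before the softmax) so each token attends only to itself and earlier tokens.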
Contextualized word embeddings
- Static embeddings vs. contextualized embeddings
- ELMo
- BERT: encoder-only model
- Masked language modeling
- Next sentence prediction
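BERT's masked language modeling objective selects 15% of input tokens as prediction targets; of those, 80% are replaced by [MASK], 10% by a random token, and 10% left unchanged. A minimal sketch of that masking step follows; MASK_ID, VOCAB_SIZE, and the -100 ignore-label convention are assumptions borrowed from common BERT implementations, not from the course materials.

```python
import numpy as np

MASK_ID = 103        # assumed id of the [MASK] token (BERT-base convention)
VOCAB_SIZE = 30522   # assumed vocabulary size (BERT-base convention)

def mask_tokens(token_ids, mask_prob=0.15, seed=0):
    rng = np.random.default_rng(seed)
    ids = np.array(token_ids)
    # -100 marks positions the loss should ignore (non-masked tokens).
    labels = np.full_like(ids, -100)
    selected = rng.random(ids.shape) < mask_prob
    labels[selected] = ids[selected]     # targets are the original tokens
    roll = rng.random(ids.shape)
    # Of the selected tokens: 80% -> [MASK], 10% -> random token, 10% -> unchanged.
    ids[selected & (roll < 0.8)] = MASK_ID
    random_tok = selected & (roll >= 0.8) & (roll < 0.9)
    ids[random_tok] = rng.integers(0, VOCAB_SIZE, random_tok.sum())
    return ids, labels

ids, labels = mask_tokens(list(range(100)))
```

The model then predicts the original token at every position where the label is not -100; the 10% random / 10% unchanged cases keep the model from relying on [MASK] being present at test time.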
References
- Jurafsky and Martin, chapter 9
- Jurafsky and Martin, sections 11.1, 11.2, 11.3
- Voita, NLP Course | For You (web course): Language Modeling
Resources