Here is the detailed list of subjects from lecture 6 that we have presented in class and that will be covered in the final exam.
Lecture 06: Contextual word embeddings.
Content: Static embeddings vs. contextual embeddings. ELMo architecture. BERT encoder-based model. Masked language modeling. Next sentence prediction. Applications of BERT: sentiment analysis and named-entity recognition. GPT-n decoder-based models and masked attention. Sentence-BERT.
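To make the "masked attention" item above concrete: in GPT-style decoders, each position may attend only to itself and earlier positions. Below is a minimal single-head NumPy sketch of causal self-attention; the weight matrices and dimensions are illustrative, not taken from any actual GPT model.

```python
import numpy as np

def masked_self_attention(X, Wq, Wk, Wv):
    """Single-head causal self-attention: each position attends
    only to itself and earlier positions (GPT-style decoder)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # (T, T) attention logits
    T = scores.shape[0]
    future = np.triu(np.ones((T, T), dtype=bool), k=1) # strictly upper triangle
    scores[future] = -np.inf                           # block attention to the future
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ V

# Toy example: 4 tokens, model dimension 8 (arbitrary illustrative sizes).
rng = np.random.default_rng(0)
T, d = 4, 8
X = rng.normal(size=(T, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = masked_self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8)
```

The defining property of the mask is that changing a later token leaves the outputs at earlier positions unchanged, which is what lets the decoder be trained as an autoregressive language model.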
References: I take it for granted that you already know about transformers, which are presented in Jurafsky & Martin, chapter 8. The ELMo model is taken from the online course 'NLP Course | For You' by Elena Voita, section Transfer Learning. BERT is presented in Jurafsky & Martin, chapter 9; skip sections 9.2.3, 9.3.1, 9.3.2, and 9.4.2. Use the lecture slides for GPT-n and Sentence-BERT.