Here is the detailed list of topics and references for Lecture 06.
Lecture 06: Large language models
Content:
Review of transformers. Static embeddings vs. contextualized embeddings. ELMo architecture. BERT architecture; masked language modeling and next sentence prediction. Large language models (LLMs). GPT-n family. Other LLMs; open-source LLMs. Classification of LLMs: encoder-only, decoder-only, and encoder-decoder. Multi-lingual LLMs: monolingual training and training based on parallel corpora. Sentence-BERT: training and inference. Miscellanea: emergent abilities, hallucinations, and mixture of experts. Adaptation: feature extraction and fine-tuning. Parameter-efficient fine-tuning: adapters and LoRA (see the sketch below). Transfer learning. Prompt learning. Retrieval-augmented generation. Contextualized embeddings and ethics.
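Since the content above is only an outline, here is a minimal, self-contained sketch of the LoRA idea in PyTorch, included as a concrete anchor for the parameter-efficient fine-tuning topic. It assumes a frozen pretrained linear layer with weight W and learns only a low-rank update, computing W x + (alpha / r) * B A x; the class and variable names (LoRALinear, A, B, scale) are illustrative and not taken from any particular library.

    import torch
    import torch.nn as nn

    class LoRALinear(nn.Module):
        """Wrap a pretrained nn.Linear with a trainable low-rank update."""

        def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
            super().__init__()
            self.base = base
            for p in self.base.parameters():   # freeze the pretrained weights
                p.requires_grad = False
            d_in, d_out = base.in_features, base.out_features
            # A is small random, B starts at zero, so at initialization the
            # wrapped layer computes exactly the same function as the base layer.
            self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)  # r x d_in
            self.B = nn.Parameter(torch.zeros(d_out, r))        # d_out x r
            self.scale = alpha / r             # effective weight: W + scale * B @ A

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.base(x) + self.scale * (x @ self.A.T) @ self.B.T

    layer = LoRALinear(nn.Linear(768, 768))
    y = layer(torch.randn(2, 768))             # same output shape as the base layer
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    print(y.shape, trainable)                  # torch.Size([2, 768]) 12288

With r = 8 and d_in = d_out = 768 as above, the trainable update has 8 * (768 + 768) = 12,288 parameters, against 590,592 for the full layer (weights plus bias), which is why the method is called parameter efficient.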
References:
Jurafsky & Martin, chapter 11; skip sections 11.3, 11.4.3, 11.5.
The ELMo and GPT-n models, adapters, and transfer learning are all taken from the online course 'NLP Course | For You' by Elena Voita, section Transfer Learning.
Use the lecture slides for the following topics: other LLMs, open-source LLMs, classification of LLMs, multi-lingual LLMs, Sentence-BERT, LoRA, prompt learning, and retrieval-augmented generation.
Read Jurafsky & Martin, section 10.10, on ethical issues for LLMs.