In this thread I am going to post, for each lecture, a detailed list of all the subjects that we have presented in class and that will be assessed in the final exam.
Here is the detailed list of subjects from lecture 1 that we have presented in class and that will be assessed in the final exam.
Lecture 01: Natural language processing: An unexpected journey.
Content: What is natural language processing? A very short history of natural language processing. Why is natural language processing tricky? Word distribution, ambiguity, composition, recursion and hidden structure. Language & Learning.
References: Slides from the lecture.
Lecture 03: Text normalization.
Content: Words, tokens, types and vocabulary. Herdan/Heaps law and Zipf/Mandelbrot law. Morphology: root and affixes; inflectional and derivational morphology. Word-forms and lemmas; multi-element word-forms. Corpora. Text normalization: language identification, spell checking, contractions, punctuation and special characters. Text tokenization: word tokenization, character tokenization, and subword tokenization. Subword tokenization: learning algorithm, encoder algorithm and decoder algorithm. Byte-pair encoding: algorithm and examples. Byte-level BPE and WordPiece. Sentence segmentation and case folding. Stop words, stemming and lemmatization.
References: Jurafsky & Martin, chapter 2; skip sections 2.3, 2.6 and 2.9. Slides from the lecture.
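To make the byte-pair encoding part concrete, here is a minimal sketch of the BPE learning algorithm on a toy corpus. This is an illustration only, not the exact implementation from the lecture or the textbook; the corpus, the end-of-word marker `</w>` and the function name `learn_bpe` are my own choices.

```python
from collections import Counter

def learn_bpe(corpus, num_merges):
    """Learn BPE merge rules from a whitespace-tokenized toy corpus."""
    # Represent each word as a tuple of symbols plus an end-of-word marker.
    vocab = Counter()
    for word in corpus.split():
        vocab[tuple(word) + ("</w>",)] += 1
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Replace every occurrence of the best pair with the merged symbol.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges

merges = learn_bpe("low low low low low lower lower newest newest", 4)
print(merges)  # the first merges pick up the most frequent pairs, e.g. ('l', 'o')
```

The learned merge list is exactly what the encoder algorithm replays, in order, on new text.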
Lecture 04: Words and meaning.
Content: Lexical semantics: word senses and word relationships. Distributional semantics. Vector semantics and term-context matrix. Cosine similarity. Neural static word embeddings. Word2vec and skip-gram with negative sampling: target embedding, context embedding, classifier algorithm derived from logistic regression and training algorithm. Practical issues. Other kinds of static embeddings: FastText and GloVe. Visualizing word embeddings. Semantic properties of word embeddings. Bias and word embeddings. Evaluation of word embeddings. Cross-lingual word embeddings.
References: Jurafsky & Martin, chapter 5; skip equations (5.22)-(5.27). Some of the topics under practical issues have been taken from the on-line course 'NLP Course | For You' by Elena Voita. Use lecture slides for cross-lingual word embeddings.
Lecture 05a: Statistical language models.
Content: Language modeling (LM) and applications. Relative frequency estimation. N-gram model, N-gram probabilities and bias-variance tradeoff. Practical issues. Evaluation: perplexity measure. Sampling sentences. Sparse data: Laplace smoothing and add-k smoothing; stupid backoff and linear interpolation; out-of-vocabulary words. Limitations of N-gram model.
References: Jurafsky & Martin, chapter 3; skip sections 3.7, 3.8.
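The N-gram estimation, add-k smoothing and perplexity topics above fit in a few lines of code. This is a toy sketch under my own simplifying assumptions (a bigram model, `<s>`/`</s>` boundary markers, a two-sentence corpus); it is not the textbook's implementation.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Collect unigram and bigram counts with sentence-boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for s in sentences:
        toks = ["<s>"] + s.split() + ["</s>"]
        unigrams.update(toks[:-1])          # contexts only, so probabilities normalize
        bigrams.update(zip(toks, toks[1:]))
    vocab = {t for s in sentences for t in s.split()} | {"</s>"}
    return unigrams, bigrams, vocab

def prob(w_prev, w, unigrams, bigrams, vocab, k=1.0):
    """Add-k smoothed bigram probability (k=1 gives Laplace smoothing)."""
    return (bigrams[(w_prev, w)] + k) / (unigrams[w_prev] + k * len(vocab))

def perplexity(sentence, model):
    """Per-token perplexity of a sentence under the smoothed bigram model."""
    unigrams, bigrams, vocab = model
    toks = ["<s>"] + sentence.split() + ["</s>"]
    logp = sum(math.log(prob(a, b, unigrams, bigrams, vocab))
               for a, b in zip(toks, toks[1:]))
    return math.exp(-logp / (len(toks) - 1))

model = train_bigram(["the cat sat", "the dog sat"])
print(perplexity("the cat sat", model))  # roughly 2.75 on this toy corpus
```

Note how smoothing spreads probability mass: even the in-corpus sentence gets perplexity well above 1, because every count is inflated by k over the whole vocabulary.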
Lecture 05b: Neural language models.
Content: Neural language models: general architecture. Feedforward neural LM (NLM): inference and training. Recurrent NLM: inference and training. Character level and character-aware NLM. Practical issues: weight tying, adaptive softmax, softmax temperature, contrastive evaluation.
References: Jurafsky & Martin, section 6.5 (skip subsection 'Pooling input embeddings for sentiment') and section 13.2. General architecture for NLM has been taken from the on-line course 'NLP Course | For You' by Elena Voita, section Language Modeling. I take it for granted that you already know about feedforward neural networks and recurrent neural networks: feedforward neural networks are presented in Jurafsky & Martin sections 6.3 and 6.6; recurrent neural networks are presented in Jurafsky & Martin section 13.1.
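Among the practical issues listed above, softmax temperature is easy to demonstrate in isolation. A minimal sketch, with made-up logits; the idea is that dividing the logits by a temperature T < 1 sharpens the output distribution and T > 1 flattens it.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Softmax over logits scaled by 1/temperature."""
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()                  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = [2.0, 1.0, 0.1]
print(softmax(logits, temperature=1.0))
print(softmax(logits, temperature=0.5))   # sharper: more mass on the argmax
print(softmax(logits, temperature=5.0))   # flatter: closer to uniform
```

In a neural LM the same trick is applied to the output layer when sampling, trading diversity against fidelity to the model's preferences.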
Lecture 06: Contextual word embeddings.
Content: Static embeddings vs. contextual embeddings. ELMo architecture. BERT encoder-based model. Masked language modeling. Next sentence prediction. Applications of BERT: sentiment analysis and named-entity recognition. GPT-n decoder-based model, masked attention. Sentence BERT.
References: I take it for granted that you already know about transformers, which are presented in Jurafsky & Martin, chapter 8. ELMo model has been taken from the on-line course 'NLP Course | For You' by Elena Voita, section Transfer Learning. BERT is presented in Jurafsky & Martin, chapter 9; skip sections 9.2.3, 9.3.1, 9.3.2, 9.4.2. Use lecture slides for GPT-n and sentence BERT.
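The masked attention used by GPT-style decoders can be illustrated with a small sketch. This is my own toy example, not code from the slides: uniform attention scores over four positions, with a causal (lower-triangular) mask so that position i can only attend to positions up to i.

```python
import numpy as np

def causal_mask(n):
    """Lower-triangular boolean mask: True where attention is allowed."""
    return np.tril(np.ones((n, n), dtype=bool))

def masked_attention(scores):
    """Mask out future positions, then apply a row-wise softmax."""
    n = scores.shape[0]
    scores = np.where(causal_mask(n), scores, -np.inf)  # -inf -> weight 0
    e = np.exp(scores - scores.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

scores = np.zeros((4, 4))        # uniform scores, for illustration only
A = masked_attention(scores)
print(A[0])  # row 0 attends only to position 0: [1. 0. 0. 0.]
print(A[1])  # row 1 splits attention over positions 0 and 1
```

This masking is what lets a decoder-based model like GPT be trained as a language model: each position predicts the next token without seeing it.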