Day 21
Section outline
Monday, May 12th (16:30-18:30)
Neural machine translation
- Neural machine translation (NMT) and posterior probability (see the formulation after this list)
- Encoder-decoder architecture (seq2seq): general idea
- Encoder-decoder with RNNs and with transformers
- RNN: autoregressive encoder-decoder
- RNN: greedy inference algorithm (see the decoding sketch below)
- RNN: training algorithm and teacher forcing (see the training sketch below)
- RNN: attention and dynamic context vector
- RNN: dot-product attention
- RNN: bilinear attention (both scoring functions are sketched below)
- Transformer-based architecture
- Cross-attention: query, key, and value (see the sketch below)
- Search tree and beam search (see the sketch below)
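Illustrative sketches

The "posterior probability" item refers to the standard framing of translation: pick the target sentence y that maximizes the conditional (posterior) probability of y given the source x, factored autoregressively over target tokens. A minimal statement in LaTeX (the notation is mine, not from the course slides):

    \hat{y} = \operatorname*{argmax}_{y} P(y \mid x)
            = \operatorname*{argmax}_{y} \prod_{t=1}^{|y|} P(y_t \mid y_{<t}, x)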
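A minimal sketch of greedy inference with an RNN encoder-decoder, in PyTorch. Everything concrete here is an illustrative assumption: the GRU cells, the layer sizes, and the token ids (BOS = 1, EOS = 2) are not taken from the course materials.

    import torch
    import torch.nn as nn

    class Seq2Seq(nn.Module):
        # Tiny autoregressive encoder-decoder; sizes are illustrative.
        def __init__(self, vocab=100, emb=32, hid=64):
            super().__init__()
            self.embed = nn.Embedding(vocab, emb)
            self.encoder = nn.GRU(emb, hid, batch_first=True)
            self.decoder = nn.GRU(emb, hid, batch_first=True)
            self.out = nn.Linear(hid, vocab)

        def greedy_decode(self, src, bos=1, eos=2, max_len=20):
            # Encode the source; the final hidden state seeds the decoder.
            _, h = self.encoder(self.embed(src))
            y, result = torch.tensor([[bos]]), []
            for _ in range(max_len):
                o, h = self.decoder(self.embed(y), h)
                # Greedy step: commit to the single most probable token.
                y = self.out(o[:, -1]).argmax(-1, keepdim=True)
                if y.item() == eos:
                    break
                result.append(y.item())
            return result

    model = Seq2Seq()
    print(model.greedy_decode(torch.tensor([[5, 6, 7]])))

Greedy decoding commits to the locally best token at each step; that weakness is what beam search (sketched below) addresses.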
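Teacher forcing can be sketched as a single training step: instead of feeding the decoder its own predictions, the gold target prefix is used as decoder input and the shifted gold target is the prediction objective. The components and shapes below are illustrative assumptions, mirroring the Seq2Seq sketch above.

    import torch
    import torch.nn as nn

    vocab, emb, hid = 100, 32, 64
    embed = nn.Embedding(vocab, emb)
    encoder = nn.GRU(emb, hid, batch_first=True)
    decoder = nn.GRU(emb, hid, batch_first=True)
    out = nn.Linear(hid, vocab)
    params = (list(embed.parameters()) + list(encoder.parameters())
              + list(decoder.parameters()) + list(out.parameters()))
    opt = torch.optim.Adam(params, lr=1e-3)

    src = torch.randint(3, vocab, (8, 10))  # dummy source batch
    tgt = torch.randint(3, vocab, (8, 12))  # dummy gold target batch

    _, h = encoder(embed(src))
    # Teacher forcing: the decoder reads the gold prefix tgt[:, :-1]
    # and is trained to predict the shifted gold target tgt[:, 1:].
    states, _ = decoder(embed(tgt[:, :-1]), h)
    loss = nn.functional.cross_entropy(out(states).reshape(-1, vocab),
                                       tgt[:, 1:].reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
    print(loss.item())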
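The two attention scoring functions in the list differ only in how the score between a decoder state and each encoder state is computed: a plain dot product, or a dot product through a learned matrix W (bilinear). A minimal sketch that also builds the dynamic context vector; tensor shapes are illustrative assumptions.

    import torch

    def attention_context(dec_h, enc_hs, W=None):
        # dec_h: (hid,) decoder state; enc_hs: (src_len, hid) encoder states.
        if W is None:
            scores = enc_hs @ dec_h        # dot-product attention: h_s . h_t
        else:
            scores = enc_hs @ (W @ dec_h)  # bilinear attention: h_s^T W h_t
        weights = torch.softmax(scores, dim=0)  # distribution over source positions
        return weights @ enc_hs                 # dynamic context vector

    enc_hs, dec_h = torch.randn(5, 64), torch.randn(64)
    print(attention_context(dec_h, enc_hs).shape)                        # dot-product
    print(attention_context(dec_h, enc_hs, W=torch.randn(64, 64)).shape) # bilinear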
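In the transformer decoder, cross-attention projects decoder states into queries and encoder states into keys and values. The sketch below is single-head scaled dot-product attention with random projection matrices; this is an illustrative simplification (real implementations use multiple heads and learned nn.Linear projections).

    import math
    import torch

    def cross_attention(dec_states, enc_states, Wq, Wk, Wv):
        Q = dec_states @ Wq   # queries from the decoder   (tgt_len, d_k)
        K = enc_states @ Wk   # keys from the encoder      (src_len, d_k)
        V = enc_states @ Wv   # values from the encoder    (src_len, d_k)
        scores = Q @ K.T / math.sqrt(K.shape[-1])  # scaled dot products
        return torch.softmax(scores, dim=-1) @ V   # (tgt_len, d_k)

    d_model, d_k = 64, 16
    dec, enc = torch.randn(7, d_model), torch.randn(5, d_model)
    Wq, Wk, Wv = (torch.randn(d_model, d_k) for _ in range(3))
    print(cross_attention(dec, enc, Wq, Wk, Wv).shape)  # torch.Size([7, 16])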
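Beam search explores the search tree breadth-first but keeps only the beam_size highest-scoring hypotheses at each depth. The sketch below works over any function mapping a prefix to next-token log-probabilities; the toy step function, the token ids, and the simplification of retiring finished hypotheses to a separate list are my assumptions, not necessarily the course's exact algorithm.

    import torch

    def beam_search(step_fn, bos, eos, beam_size=4, max_len=20):
        beams = [(0.0, [bos])]  # (cumulative log-prob, token sequence)
        finished = []
        for _ in range(max_len):
            candidates = []
            for lp, seq in beams:
                log_probs = step_fn(seq)  # (vocab,) next-token log-probs
                top_lp, top_ix = log_probs.topk(beam_size)
                for l, i in zip(top_lp.tolist(), top_ix.tolist()):
                    candidates.append((lp + l, seq + [i]))
            # Prune the search tree to the beam_size best hypotheses.
            candidates.sort(key=lambda c: c[0], reverse=True)
            beams = []
            for lp, seq in candidates[:beam_size]:
                (finished if seq[-1] == eos else beams).append((lp, seq))
            if not beams:
                break
        return max(finished + beams, key=lambda c: c[0])[1]

    # Toy stand-in for a real decoder's next-token distribution.
    def toy_step(seq):
        torch.manual_seed(len(seq))
        return torch.log_softmax(torch.randn(10), dim=0)

    print(beam_search(toy_step, bos=1, eos=2))

With beam_size = 1 the procedure reduces to the greedy decoder sketched earlier.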
References
- Jurafsky, D. and Martin, J. H., Speech and Language Processing, chapter 13