Section outline

  • Monday, May 12th (16:30-18:30)

    Neural machine translation

    • Neural machine translation (NMT) and posterior probability
    • Encoder-decoder architecture (seq2seq): general idea
    • Encoder-decoder via RNNs and via transformers
    • RNN: autoregressive encoder-decoder
    • RNN: greedy inference algorithm (sketched below)
    • RNN: training algorithm and teacher forcing (sketched below)
    • RNN: attention and dynamic context vector (sketched below)
    • RNN: dot-product attention
    • RNN: bilinear attention
    • Transformer-based architecture
    • Cross-attention: queries, keys, and values (sketched below)
    • Search tree and beam search (sketched below)
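
    Code sketches

    The toy decoder below illustrates the greedy inference and
    teacher-forcing items above. It is a minimal sketch, not the
    lecture's reference implementation: all dimensions, weights, and
    helper names (step, greedy_decode, teacher_forced_loss) are
    illustrative assumptions.

      import numpy as np

      rng = np.random.default_rng(0)
      V, H = 6, 8                     # toy vocabulary and hidden sizes
      BOS, EOS = 0, 1                 # special token ids
      E = rng.normal(size=(V, H))     # token embeddings
      W = rng.normal(size=(H, H))     # hidden-to-hidden weights
      U = rng.normal(size=(H, H))     # input-to-hidden weights
      Wo = rng.normal(size=(H, V))    # hidden-to-vocabulary projection

      def step(h, y):
          # One autoregressive decoder step: previous state + previous token.
          return np.tanh(h @ W + E[y] @ U)

      def softmax(z):
          e = np.exp(z - z.max())
          return e / e.sum()

      def greedy_decode(h, max_len=10):
          # Inference: keep only the single most probable token per step.
          y, out = BOS, []
          for _ in range(max_len):
              h = step(h, y)
              y = int(np.argmax(h @ Wo))   # argmax of logits = argmax of softmax
              if y == EOS:
                  break
              out.append(y)
          return out

      def teacher_forced_loss(h, gold):
          # Training: condition each step on the gold previous token,
          # not on the model's own prediction (teacher forcing).
          loss, prev = 0.0, BOS
          for y in gold:
              h = step(h, prev)
              loss -= np.log(softmax(h @ Wo)[y])
              prev = y                # feed the reference token back in
          return loss

      h0 = rng.normal(size=H)         # stands in for the encoder's final state
      print(greedy_decode(h0))
      print(teacher_forced_loss(h0, [3, 4, 2, EOS]))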
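
    A sketch of the two attention scoring functions named above and of
    the dynamic context vector they produce; the encoder states, the
    decoder state, and the bilinear matrix Wb are random toy values.

      import numpy as np

      rng = np.random.default_rng(1)
      H, T = 8, 5                     # hidden size, source length
      enc = rng.normal(size=(T, H))   # encoder states h_1 .. h_T
      dec = rng.normal(size=H)        # current decoder state s_t
      Wb = rng.normal(size=(H, H))    # learned matrix of the bilinear score

      def softmax(z):
          e = np.exp(z - z.max())
          return e / e.sum()

      # Dot-product attention: score(s_t, h_j) = s_t . h_j
      dot_scores = enc @ dec

      # Bilinear attention: score(s_t, h_j) = s_t^T Wb h_j
      bilinear_scores = enc @ (Wb.T @ dec)

      # Attention weights and the dynamic context vector: a weighted
      # average of encoder states, recomputed at every decoder step.
      alpha = softmax(dot_scores)
      context = alpha @ enc
      print(alpha.round(2), context.shape)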
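
    For the transformer items, a sketch of cross-attention with scaled
    dot-product scores: queries come from the decoder side, keys and
    values from the encoder output. The projection matrices and all
    sizes are random stand-ins.

      import numpy as np

      rng = np.random.default_rng(2)
      d, Ts, Tt = 8, 5, 3             # model dim, source length, target length
      enc = rng.normal(size=(Ts, d))  # encoder output
      dec = rng.normal(size=(Tt, d))  # decoder-side representations
      Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

      def softmax(z):
          e = np.exp(z - z.max(axis=-1, keepdims=True))
          return e / e.sum(axis=-1, keepdims=True)

      # Queries from the decoder, keys and values from the encoder.
      Q, K, V = dec @ Wq, enc @ Wk, enc @ Wv
      A = softmax(Q @ K.T / np.sqrt(d))   # (Tt, Ts) attention weights
      out = A @ V                         # one context vector per target position
      print(out.shape)                    # (3, 8)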
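
    A sketch of beam search over the implicit search tree of partial
    translations; the toy "model" is a fixed next-token table, a
    stand-in for a real decoder's softmax output, and beam_size and
    max_len are arbitrary.

      import numpy as np

      rng = np.random.default_rng(3)
      V, BOS, EOS = 6, 0, 1
      # Toy model: next-token log-probabilities depend only on the
      # last token of the prefix.
      table = np.log(rng.dirichlet(np.ones(V), size=V))

      def log_probs(prefix):
          return table[prefix[-1]]

      def beam_search(beam_size=3, max_len=8):
          # A hypothesis is (log-probability, token sequence); the search
          # tree is pruned to the beam_size best open hypotheses per step.
          beams = [(0.0, [BOS])]
          done = []
          for _ in range(max_len):
              candidates = []
              for lp, seq in beams:
                  for y, step_lp in enumerate(log_probs(seq)):
                      candidates.append((lp + step_lp, seq + [y]))
              candidates.sort(key=lambda c: c[0], reverse=True)
              beams = []
              for lp, seq in candidates:
                  # Finished hypotheses leave the beam; the rest compete.
                  (done if seq[-1] == EOS else beams).append((lp, seq))
                  if len(beams) == beam_size:
                      break
              if not beams:
                  break
          return max(done + beams, key=lambda c: c[0])

      print(beam_search())

    Real systems usually length-normalize the hypothesis scores before
    comparing finished translations of different lengths; that step is
    omitted here.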

    References

    • Jurafsky and Martin, Speech and Language Processing, chapter 13