Section outline

  • April 9th, Thursday (10:30-12:30)

    Large language models and pretraining

    • Sampling
    • Key-Value cache
    • Pretraining

    References

    • Jurafsky and Martin, chapter 7
    • Jurafsky and Martin, chapter 8