Transformers

by Giorgio Satta

Next week we will start with large language models (LLMs). I will take for granted a particular neural network architecture called the transformer.

Those of you who are taking deep learning for the first time this year may not have encountered it at this early point. I therefore recommend that you spend about an hour on this architecture, working through any of the tutorials available on the web. Focus especially on the notion of attention.