Starting on March 19th, we will use the Transformer neural network architecture for several NLP tasks.
Those of you who are taking deep learning this year may not have covered it yet at this early point. I therefore recommend that you spend at least one hour on this architecture, working through any of the tutorials you can find on the web. Focus especially on the notion of attention.
Two remarkable animated videos on the Transformer are also available at the following links: