Period: Second semester

Course unit contents:

* Gradient descent methods
* Ridge and LASSO regularization
* Deep neural networks and convolutional neural networks
* Clustering
* Data visualization
* Energy-based models
* Restricted Boltzmann machines
* Combination of models: bagging, random forests, boosting, XGBoost

Planned learning activities and teaching methods: The aim of this course is to introduce students to modern machine learning techniques and tools for classifying data, so that they can apply these methods in computer lab sessions. The first half of the course (24 hours) is devoted to learning general principles through applications, while the second half allows the students, working in small groups, to develop a deeper understanding of one specific subject by carrying out a small project.

The first half of the course consists of theoretical explanations of key data analysis procedures or of a class of algorithms, each followed by exercise sessions in which the students apply the new ideas on computers. This hands-on practice is expected to improve the understanding of the theoretical tools. The numerical work involves either adopting and modifying pre-built software or writing simple algorithms from scratch.
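
As a rough illustration of what a "from scratch" exercise of this kind might look like, the sketch below combines two of the listed topics, gradient descent and ridge regularization, on a one-parameter linear fit. It is not taken from the course material: the data, the function names, the penalty strength `lam`, the learning rate and the number of steps are all assumptions chosen for illustration. The loss assumed here is the mean squared error plus `lam * w**2`, for which a closed-form minimizer exists and can be used as a check.

```python
# Minimal sketch (hypothetical example, not course code):
# fit y ≈ w * x by minimizing  (1/n) * sum((w*x_i - y_i)^2) + lam * w^2.

def ridge_loss_gradient(w, xs, ys, lam):
    """Derivative of the ridge-penalized mean squared error with respect to w."""
    n = len(xs)
    data_term = 2.0 / n * sum((w * x - y) * x for x, y in zip(xs, ys))
    penalty_term = 2.0 * lam * w
    return data_term + penalty_term

def gradient_descent(xs, ys, lam=0.1, learning_rate=0.05, n_steps=500):
    """Plain gradient descent with a fixed learning rate, starting from w = 0."""
    w = 0.0
    for _ in range(n_steps):
        w -= learning_rate * ridge_loss_gradient(w, xs, ys, lam)
    return w

def ridge_closed_form(xs, ys, lam=0.1):
    """Exact minimizer of the same loss, used to check the iterative result."""
    n = len(xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    return sum_xy / (sum_xx + n * lam)

# Made-up toy data lying close to the line y = 2x (assumption for illustration).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 1.9, 4.2, 5.8, 8.1]

w_gd = gradient_descent(xs, ys)
w_exact = ridge_closed_form(xs, ys)
w_unregularized = ridge_closed_form(xs, ys, lam=0.0)

print("gradient descent:", w_gd)
print("closed form:     ", w_exact)
print("no penalty:      ", w_unregularized)
```

With these settings the iterative and closed-form slopes should agree closely, and both are slightly smaller than the unpenalized slope, which is the shrinkage effect expected from the ridge penalty. A learning rate that is too large for the data would make the iteration diverge, so the value used here should be read as tuned to this toy example only.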

The main text followed in the course is an open-access review on arXiv:
“A high-bias, low-variance introduction to Machine Learning for physicists” by Pankaj Mehta et al., arXiv:1803.08823.
This review also provides useful Python notebooks for analyzing data and is linked to tools such as the scikit-learn package.

Some of those notebooks are used in the course, while other code is written during lectures.

Last modified: Thursday, 1 June 2023, 6:00 PM