2022-09-27 - 2022-09-30 Data Analysis & Machine Learning (CAS ADS M3)
In this module, you will learn about standard analysis techniques and how to apply state-of-the-art machine learning with Python.
About this module
Data Analysis and Machine Learning
Basic introduction on how to perform typical machine learning tasks with Python.
Learning outcomes
- Overview of machine learning pipelines and their implementation with scikit-learn
- Regression and Classification: linear models and logistic regression
- Decision trees & random forest models
- Principal component analysis (PCA) and non-linear embeddings (t-SNE and UMAP)
- Clustering with K-means and Gaussian mixtures
- Artificial neural networks as general fitters; fully connected nets used to classify the Fashion-MNIST dataset
- Scikit-learn and clustering maps, Q&A
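As an illustration of the first outcome in the list above, a scikit-learn pipeline can be sketched as follows. This is a minimal example and not taken from the course material: the bundled Iris dataset, the 75/25 split and the choice of scaler and logistic regression are all assumptions made for the sake of the sketch.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumed example data: the Iris dataset shipped with scikit-learn.
X, y = load_iris(return_X_y=True)

# Hold out 25% of the samples for testing (assumed ratio).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# A pipeline chains preprocessing and the model into one estimator,
# so the scaler is fitted on the training data only.
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
pipeline.fit(X_train, y_train)

# For classifiers, score() returns the mean accuracy.
test_accuracy = pipeline.score(X_test, y_test)
print(f"Test accuracy: {test_accuracy:.2f}")
```

Wrapping the scaler and the model in one pipeline object keeps the test data out of the preprocessing fit, which is the main reason to prefer it over calling the steps by hand.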
Target group
- Students, researchers and professionals working with data
Prerequisites
- CAS ADS Module 1 and 2.
- Basic programming skills are necessary; no time is reserved for basic programming concepts.
Methods
- Course languages are Python and English
- Short theory sessions followed by hands-on tutorials with Jupyter notebooks
Certificate and points
- Certificate of attendance.
- 2 ECTS credit points are awarded upon a successful project presentation.
Coaches
- The coaches are local and external experts
Practical information (time, location ...)
Time : 2022-09-27 to 2022-09-30 08:30 - 19:00
Location : Mallorca (https://www.esblaudesnord.com/en/) and online
Training language: English
Participants : Max 24
Registration : Mandatory
Coaches : Dr. Mykhailo Vladymyrov, Dr. Aris Marcolongo
Prerequisites : Laptop, Python skills
Certificate : Certificate for full training attendance, 2 ECTS upon successful project presentation
Course material : You don't have to prepare anything for the course. You will be given access to a Jupyter environment that runs in your default browser and that is pre-loaded with the course material.
Schedule
Monday (arrival day)
17:00 Apero in the patio
19:00 Dinner
Tuesday
07:30 - 08:00 Breakfast
08:30 - 08:40 Welcome
08:40 - 10:30 General intro to ML, datasets, scikit-learn interface
10:30 - 11:00 Break
11:00 - 12:30 Linear models & logistic regression
12:30 - 17:00 Swimming, chilling, working etc
17:00 - 19:00 More course
19:30 - 20:30 Dinner
Wednesday
07:30 - 08:00 Breakfast
08:30 - 09:45 Trees and forests
09:45 - 10:15 Break
10:15 - 12:30 PCA and embeddings
12:30 - 17:00 Swimming, chilling, working etc
17:00 - 19:00 More course
19:30 - 20:30 Dinner
Thursday
07:30 - 08:00 Breakfast
08:30 - 09:45 Clustering 1
09:45 - 10:15 Break
10:15 - 12:30 Clustering 2
12:30 - 17:00 Swimming, chilling, working etc
17:00 - 19:00 More course
19:30 - 20:30 Dinner
Friday
07:30 - 08:30 Breakfast
08:30 - 09:45 Intro to NN and image fitting, Fully connected net for F-MNIST classification, scikit-learn map, Q&A
09:45 - 10:15 Break
10:15 - 11:30 Discussions ?
11:30 - 12:30 Summary Session and Project Discussion
2022-11-24 (Main Building Room 204)
13:30 - 14:00 Jonas and Marco
14:00 - 14:30 Raphael
14:30 - 15:00 Naima and Zita
15:00 - 15:30 Nicolas
15:30 - 16:00 Romain
16:00 - 16:30 Leonard
2022-12-12 Session 1 (Main Building Room 217) Aris (Sigve)
13:30 - 14:00 Sandra and Barbara
14:00 - 14:30 Filipe and Laura
14:30 - 15:00 Jani
15:00 - 15:30 Stefano and Marc
16:00 - 16:30 Jürg Krähenbühl
16:30 - 17:00 Nathalie and Claudia
2022-12-12 Session 2 (Main Building Room 331) Mykhailo
13:30 - 14:00 Kim and Lisa and Emilie
14:00 - 14:30 Michael Zbinden
14:30 - 15:00 Fred and Nico
15:00 - 15:30 Elena
15:30 - 16:00 Michael Albrecht
Project presentation day in Bern or online
Please sign up here for one of the slots for your group (or indicate there that you are not available on those days):
https://terminplaner4.dfn.de/FzcKpUxgGRGoY5LA
Project Instructions
Train and apply at least two models presented in the module to datasets other than those used in class.
Please present the notebook and/or slides in 15 minutes on the presentation day. Describe your objective and data, show descriptive statistics and plots, split the data into training, validation and test sets, train the models, and show performance measures and your conclusions.
You are encouraged to work in teams of two or three. If you don’t have a dataset yet, you may use one from: https://www.kaggle.com/datasets or https://archive.ics.uci.edu/ml/datasets.php
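The workflow described above (splitting the data, training two models and reporting performance) could look roughly like the sketch below. It is only a sketch under stated assumptions: the bundled breast cancer dataset is a stand-in for your own data, and the 60/20/20 split, the two model choices and accuracy as the performance measure are illustrative, not requirements.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption); replace with your own project dataset.
X, y = load_breast_cancer(return_X_y=True)

# Hold out 20% as the test set, then split the remainder 75/25,
# which gives roughly 60% training, 20% validation, 20% test.
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, random_state=0, stratify=y_rest
)

# Two model families covered in the module (illustrative settings).
models = {
    "logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Train on the training set and compare models on the validation set.
val_scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    val_scores[name] = accuracy_score(y_val, model.predict(X_val))
    print(f"{name}: validation accuracy = {val_scores[name]:.3f}")

# Evaluate only the selected model on the untouched test set.
best_name = max(val_scores, key=val_scores.get)
test_score = accuracy_score(y_test, models[best_name].predict(X_test))
print(f"Selected model: {best_name}, test accuracy = {test_score:.3f}")
```

The test set is touched once, after model selection, so the reported test accuracy is not biased by the comparison on the validation set.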
About the coaches
PD Dr. Sigve Haug (overview, school responsible)
Sigve studied physics in Germany, Spain and Norway. He has been involved in neutrino physics experiments and high energy frontier experiments, often with a main focus on the computing challenges related to the large and distributed data from these experiments. He worked for 15 years at the Albert Einstein Center for Fundamental Physics. Today he heads the Data Science Lab at the University of Bern. Beyond science, he likes philosophical conversations in the evening.
Dr. Aris Marcolongo
Aris is an ML expert by training, with a PhD in computer science and experience in various research enterprises. He is currently pursuing a research project investigating ML methods for understanding extreme climate events with compound drivers. He has the rare combination of friendliness and deep technical knowledge.
Dr. Mykhailo Vladymyrov
Mykhailo is a trained physicist whom we got to know at the Albert Einstein Center for Fundamental Physics (and beyond), with many years of experience in big data, machine learning and GPU computing. This year he is working for the Theodor Kocher Institute at the University of Bern. Mykhailo has a refined sense of humor and a thoughtful view of human striving.