2022-09-27 - 2022-09-30 Data Analysis & Machine Learning (CAS ADS M3)

In this module, you will learn about standard analysis techniques and how to apply state-of-the-art machine learning with Python.

Data Analysis and Machine Learning

A basic introduction to performing typical machine learning tasks with Python.

Learning outcomes
    • Overview of machine learning pipelines and their implementation with scikit-learn
    • Regression and Classification: linear models and logistic regression
    • Decision trees & random forest models
    • Principal component analysis (PCA) and non-linear embeddings (t-SNE and UMAP)
    • Clustering with K-means and Gaussian mixtures
    • Artificial neural networks as general function fitters; fully connected networks used to classify the Fashion-MNIST dataset
    • Scikit-learn and clustering maps, Q&A
Target group
  • Students, researchers and professionals working with data
Prerequisites
  • CAS ADS Module 1 and 2.
  • Basic programming skills are necessary; we don't reserve time for basic programming concepts.
Methods
  • Course languages are Python and English
  • Short theory sessions followed by hands-on tutorials with Jupyter notebooks
Certificate and points
  • Certificate of attendance.
  • 2 ECTS credit points are awarded upon a successful project presentation.
Coaches
  • The coaches are local and external experts
Time : 2022-09-27 to 2022-09-30 08:30 - 19:00 
Location : Mallorca (https://www.esblaudesnord.com/en/) and online

Training language: English
Participants : Max 24
Registration : Mandatory
Coaches : Dr. Mykhailo Vladymyrov, Dr. Aris Marcolongo
Prerequisites : Laptop, Python skills 
Certificate : Certificate for full training attendance, 2 ECTS upon successful project presentation

Course material : You don't have to prepare anything for the course. You will be given access to a Jupyter environment that runs in your default browser and that is pre-loaded with the course material.
Monday (arrival day)
17:00 Apero in the patio
19:00 Dinner

Tuesday 

07:30 - 08:00 Breakfast
08:30 - 08:40 Welcome
08:40 - 10:30 General intro to ML, datasets, scikit-learn interface
10:30 - 11:00 Break
11:00 - 12:30 Linear models & logistic regression
12:30 - 17:00 Swimming, chilling, working etc
17:00 - 19:00 More course
19:30 - 20:30 Dinner

Wednesday
07:30 - 08:00 Breakfast
08:30 - 09:45 Trees and forests
09:45 - 10:15 Break
10:15 - 12:30 PCA and embeddings
12:30 - 17:00 Swimming, chilling, working etc
17:00 - 19:00 More course
19:30 - 20:30 Dinner
 
Thursday
07:30 - 08:00 Breakfast
08:30 - 09:45 Clustering 1
09:45 - 10:15 Break
10:15 - 12:30 Clustering 2
12:30 - 17:00 Swimming, chilling, working etc
17:00 - 19:00 More course
19:30 - 20:30 Dinner
 
Friday
07:30 - 08:30 Breakfast
08:30 - 09:45 Intro to NN and image fitting, Fully connected net for F-MNIST classification, scikit-learn map, Q&A
09:45 - 10:15 Break
10:15 - 11:30 Discussions ?
11:30 - 12:30 Summary Session and Project Discussion 

2022-11-24 (Main Building Room 204)
13:30 - 14:00 Jonas and Marco
14:00 - 14:30 Raphael
14:30 - 15:00 Naima and Zita
15:00 - 15:30 Nicolas
15:30 - 16:00 Romain
16:00 - 16:30 Leonard

2022-12-12 Session 1 (Main Building Room 217) Aris (Sigve)
13:30 - 14:00 Sandra and Barbara
14:00 - 14:30 Filipe and Laura
14:30 - 15:00 Jani
15:00 - 15:30 Stefano and Marc
16:00 - 16:30 Jürg Krähenbühl
16:30 - 17:00 Nathalie and Claudia

2022-12-12 Session 2 (Main Building Room 331) Mykhailo
13:30 - 14:00 Kim and Lisa and Emilie 
14:00 - 14:30 Michael Zbinden
14:30 - 15:00 Fred and Nico
15:00 - 15:30 Elena
15:30 - 16:00 Michael Albrecht
 
Project presentations day in Bern or online 

Please sign up your group for one of the slots here (or indicate there that you are unavailable on those days):

https://terminplaner4.dfn.de/FzcKpUxgGRGoY5LA

Train and apply at least two models presented in the module to datasets other than those used in class.

Please present your notebook and/or slides in 15 minutes on the presentation day. Describe your objective and data, show descriptive statistics and plots, split the data into training, validation and test sets, train the models, and show performance measures and your conclusions.

You are encouraged to work in teams of two or three. If you don’t have a dataset yet, you may use one from: https://www.kaggle.com/datasets or https://archive.ics.uci.edu/ml/datasets.php
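
The project workflow described above (split the data, train two models, report performance) could be sketched roughly as follows. This is only an illustration under stated assumptions: the wine dataset bundled with scikit-learn is a placeholder for your own data, and the 60/20/20 split, the two model choices and accuracy as the metric are example choices, not course requirements.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Placeholder data (an assumption): replace with your own project dataset.
X, y = load_wine(return_X_y=True)

# Hold out 20% as the test set, then split the remainder 75/25
# into training and validation sets (roughly 60/20/20 overall).
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.25, stratify=y_rest, random_state=0
)

# Two model families covered in the module (example choices).
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

# Fit on the training set and compare the models on the validation set.
val_scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    val_scores[name] = accuracy_score(y_val, model.predict(X_val))

# Evaluate only the selected model on the untouched test set.
best_name = max(val_scores, key=val_scores.get)
test_score = accuracy_score(y_test, models[best_name].predict(X_test))

print("validation accuracy:", val_scores)
print("selected model:", best_name, "test accuracy:", test_score)
```

The validation set is used to choose between the models, so the test set is touched only once, for the final performance estimate. For a regression project, the same structure applies with regressors and a metric such as mean squared error.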
PD Dr. Sigve Haug (overview, school responsible)

Sigve studied physics in Germany, Spain and Norway. He has been involved in neutrino physics experiments and high energy frontier experiments, often with a main focus on the computing challenges related to the large and distributed data from these experiments. He worked for 15 years at the Albert Einstein Center for Fundamental Physics. Today he heads the Data Science Lab at the University of Bern. Beyond science, he likes philosophical conversations in the evening.

Dr. Aris Marcolongo 

Aris is an ML expert by training, with a PhD in computer science and experience in various research enterprises. He is currently pursuing a research project investigating ML methods for understanding extreme climate events with compound drivers. He brings the rare combination of friendliness and deep technical knowledge.

Dr. Mykhailo Vladymyrov

Mykhailo is a trained physicist with many years of experience with big data, machine learning and GPU computing, whom we got to know at the Albert Einstein Center for Fundamental Physics (and beyond). This year he is working for the Theodor Kocher Institute at the University of Bern. Mykhailo has a refined sense of humor and a thoughtful view of human striving.