STATISTICAL FOUNDATIONS OF DATA SCIENCE

Mariaelena Bottazzi Schenone, Marta Catalano

Learning objectives

The course provides an overview of some advanced statistical methods for data science. The focus is on understanding the advantages and limitations of each approach, its interpretation, and its main applications in various disciplines, particularly economics, business, and management. Students will learn how to solve several supervised and unsupervised learning tasks, including regression, classification, clustering, and Bayesian inference.

Prerequisites

Solid knowledge of basic probability, descriptive statistics, and statistical inference, including hypothesis testing and confidence intervals; see, for example, Chapters 1–8 of Ross (2017). Working knowledge of R is welcome but not mandatory.

Luiss Preliminary Courses for Master's degree. Recommended: Statistics, Probability. Suggested: R, Mathematics.

Course contents

• Introduction
• Probability and Statistics recap
• Principles of regression
• Simple and multivariate linear regression
• Cross-validation
• Principles of classification
• Logistic regression
• Clustering
• Principles of Bayesian inference
• Bayesian linear regression

Reference texts

• James G., Witten D., Hastie T., Tibshirani R. (2021). An Introduction to Statistical Learning: With Applications in R. 2nd Ed. Springer. [main]
• Bishop C. (2006). Pattern Recognition and Machine Learning. Springer.
• Gelman A., Carlin J., Stern H., Dunson D., Vehtari A., Rubin D. (2013). Bayesian Data Analysis. 3rd Ed. Chapman & Hall.
• Hastie T., Tibshirani R., Friedman J. (2009). The Elements of Statistical Learning. 2nd Ed. Springer.
• Hoff P. (2009). A First Course in Bayesian Statistical Methods. Springer.
• Ross S. M. (2017). Introductory Statistics. 4th Ed. Elsevier.

Teaching methods

Lectures and lab sessions.

Assessment methods

Assignment (1/3) and written final exam (2/3).

Criteria for assigning the final thesis

An interview to verify understanding and motivation.

Week 1

Introduction.

Week 2

Probability and Statistics recap.

Week 3

Regression.

Week 4

Linear regression.

Week 5

Multivariate regression.

Week 6

Cross-validation.

Week 7

Classification.

Week 8

Logistic regression.

Week 9

Clustering.

Week 10

Bayesian inference.

Week 11

Bayesian inference.

Week 12

Bayesian linear regression.