STATISTICAL FOUNDATIONS OF DATA SCIENCE
Mariaelena Bottazzi Schenone, Marta Catalano
Instructional goals
The course provides an overview of some advanced statistical methods for data science. The focus is on understanding the advantages and limitations of each approach, its interpretation, and its main applications in various disciplines, particularly economics, business, and management. Students will learn how to solve several supervised and unsupervised learning tasks, including regression, classification, and clustering, and will be introduced to Bayesian inference.
Prerequisites
Solid knowledge of basic probability, descriptive statistics and statistical inference, including hypothesis testing and confidence intervals; see, for example, Chapters 1–8 of Ross (2017). Working knowledge of R is welcome but not mandatory.
Luiss Preliminary Courses for Master's degree:
Recommended: Statistics, Probability
Suggested: R, Mathematics
Course Contents
• Introduction
• Probability and Statistics recap
• Principles of regression
• Simple and multiple linear regression
• Cross-validation
• Principles of classification
• Logistic regression
• Clustering
• Principles of Bayesian inference
• Bayesian linear regression
Reference Books
• James G., Witten D., Hastie T., Tibshirani R. (2021). An Introduction to Statistical Learning: With Applications in R. 2nd Ed. Springer. [main]
• Bishop C. (2006). Pattern Recognition and Machine Learning. Springer.
• Gelman A., Carlin J., Stern H., Dunson D., Vehtari A., Rubin D. (2013). Bayesian Data Analysis. 3rd Ed. Chapman & Hall.
• Hastie T., Tibshirani R., Friedman J. (2009). The Elements of Statistical Learning. 2nd Ed. Springer.
• Hoff P. (2009). A First Course in Bayesian Statistical Methods. Springer.
• Ross S. M. (2017). Introductory Statistics. 4th Ed. Elsevier.
Teaching Methods
Lectures and Lab sessions.
Assessment Method
Assignment (1/3).
Written final exam (2/3).
Thesis assignment criteria
An interview to verify understanding and motivation.
Week 1
Introduction.
Week 2
Probability and Statistics recap.
Week 3
Regression.
Week 4
Linear regression.
Week 5
Multiple linear regression.
Week 6
Cross-validation.
Week 7
Classification.
Week 8
Logistic regression.
Week 9
Clustering.
Week 10
Bayesian inference.
Week 11
Bayesian inference.
Week 12
Bayesian linear regression.