MACHINE LEARNING
Instructional goals
This course is an introduction to machine learning with a specialization in methods for financial time series. The course is divided into two parts and will be taught jointly by Professors Megha Patnaik (Part 1) and Marta Catalano (Part 2). The programming language for the course will be R.
Intended learning outcomes
.
Course Contents
In the first part, we will cover linear and polynomial regression, logistic regression, and linear discriminant analysis; cross-validation and the bootstrap; subset selection and model regularization methods (ridge and lasso); and tree-based methods, random forests, and boosting. The focus is on the key elements of modern data analysis and their applications. All computing is done in R.
In the second part of the course, you will learn how to incorporate prior information into your analysis and how to quantify the uncertainty in your estimates, both for static quantities and for quantities that evolve in time. Real-world applications include, e.g., forecasting the returns of a set of assets in a portfolio or predicting the growth of a country's Gross Domestic Product. We will cover Bayesian methods for unsupervised learning and time series analysis, with a particular focus on parametric density estimation, conjugate priors, and dynamic linear models, a wide class of models that includes, e.g., polynomial and cyclical trends, ARMA, and VAR models. We will describe how to perform forecasting and inference on these time series with the Kalman filter and discuss the implementation in R.
Reference Books
- An Introduction to Statistical Learning, G. James, D. Witten, T. Hastie, and R. Tibshirani, 2nd Ed. Springer New York, NY.
(pdf available at https://hastie.su.domains/ISLR2/ISLRv2_website.pdf)
- Dynamic Linear Models with R, G. Petris, S. Petrone, and P. Campagnoli, Springer New York, NY.
(pdf available at https://www.researchgate.net/publication/226410454_Dynamic_Linear_Models_with_R)
- Bayesian Data Analysis, A. Gelman, J. Carlin, H. Stern, D. Dunson, A. Vehtari, and D. Rubin, 3rd Ed. Chapman & Hall.
(pdf available at http://www.stat.columbia.edu/~gelman/book/)
- Bayesian Forecasting and Dynamic Models, M. West and J. Harrison, 2nd Ed. Springer New York, NY.
Teaching Methods
.
Assessment Method
Evaluation will be based on a combination of homework assignments and a final exam.
Thesis assignment criteria
.
Week 1 Content of online and on-campus sessions
Introduction & Review/Setup for the R programming language
Week 2 Content of online and on-campus sessions
Statistical Learning - Statistical Learning and Regression, Parametric vs. Non-Parametric Models, Model Accuracy, K-Nearest Neighbors
Week 3 Content of online and on-campus sessions
Linear Regression - Simple Linear Regression, Hypothesis Testing, Multiple Linear Regression, Model Selection, Interactions and Non-Linear Models
Week 4 Content of online and on-campus sessions
Classification - Logistic Regression, Multivariate Logistic Regression, Multiclass Logistic Regression, Linear Discriminant Analysis, Univariate Linear Discriminant Analysis, Multivariate Linear Discriminant Analysis, Quadratic Discriminant Analysis
Week 5 Content of online and on-campus sessions
Cross-Validation - K-Fold Cross-Validation, Bootstrap
Week 6 Content of online and on-campus sessions
Variable Selection - Linear Model Subset Selection, Forward Stepwise Selection, Backward Stepwise Selection, Estimating Test Error (AIC, BIC, Cross-Validation), Ridge Regression, Lasso, Tuning Parameters, Dimension Reduction, Principal Components and Partial Least Squares
Week 7 Content of online and on-campus sessions
Tree/Forest Methods - Decision Trees, Pruning Trees, Classification Trees, Bagging, Random Forests, Boosting
Week 8 Content of online and on-campus sessions
Parametric density estimation - introduction & recap, maximum likelihood estimation, Bayes' theorem.
Week 9 Content of online and on-campus sessions
Bayesian inference and prediction - conjugate priors, summaries of posterior distributions, interpretation.
Week 10 Content of online and on-campus sessions
Dynamic linear models - state-space models, filtering, smoothing, prediction, Kalman filter.
Week 11 Content of online and on-campus sessions
Implementation - basic building blocks, the dlm R package, model checking.
Week 12 Content of online and on-campus sessions
.