DATA ANALYSIS FOR SOCIAL SCIENCES

Michele Gallo

Instructional goals

The course introduces the main tools of data analysis, from simple descriptive analysis to multivariate statistical techniques.

Prerequisites

Basic concepts of Mathematics.

Intended learning outcomes

Students will be able to analyze social and economic data using the most appropriate statistical techniques. Data analysis will be carried out using the statistical software R; instruction in R is an integral part of the course.

Course Contents

1) Basic Concepts of Statistics and R Programming
2) Different Types of Data
3) Exploratory Data Analysis (EDA)
4) Linear Regression
5) Multivariate Analysis

Reference Books

MAIN:
C. Chapman and E. McDonnell Feit (2015) R for Marketing Research and Analytics, Springer.
SUGGESTED for R:
Venables, William N., David M. Smith, and R Development Core Team. "An introduction to R." (2024). https://cran.r-project.org/doc/manuals/r-release/R-intro.pdf
Wickham, Hadley, and Garrett Grolemund. R for data science (2e). O'Reilly Media, Inc., 2023. https://r4ds.hadley.nz/
SUGGESTED for the THEORY:
James, Gareth, et al. An introduction to statistical learning. Vol. 112. New York: Springer, 2023. https://www.statlearning.com/

Teaching Methods

Book, slides, lecture notes, R scripts

Assessment Method

At least one written exam and a project on a real dataset

Thesis assignment criteria

TBD

Week 1

1) Basic Concepts of Statistics and R Programming
- Introduction to R and RStudio IDE
- Introduction to Statistics

Week 2

2) Different Types of Data
- Quantitative Data (Numeric, Continuous, Discrete)
- Categorical Data (Nominal, Ordinal, Dichotomous)
- Different Types of Data in R
- Open Data and Datasets

Week 3

2) Different Types of Data
- Unit Distribution
- Simple Distribution
- Class Distribution
Basic Objects in R (vector, matrix, data frame, list)

Week 4

3) Exploratory Data Analysis: Describing Data
- Summarize Single Variable (Central Tendency, Variability, etc.)
- Single Variable Visualization (Bar Graphs, Histograms, Maps, Pie Charts, Boxplots, etc.)
- Basic Statistics with R
- Data Visualization in R with ggplot2

Week 5

3) Exploratory Data Analysis: Describing Data by Group
- Tables
- Summarize Variable by Group
Two Variables Visualization (Scatterplot, Correlation Plot, Conditional Histogram, etc.)
Exploring Relationship between Variables
- Correlation Coefficient
- Chi-square
Table and Statistics with R
Data Visualization in R with ggplot2

Week 6

4) Linear Regression
- Main Probability Distributions (Normal, Binomial, Poisson, etc.)
- Recap of Statistical Inference (Confidence Interval, Hypothesis Testing)
Inference in R

Week 7

4) Linear Regression
- Linear Regression
- OLS Method
- Parameters' Interpretation and Model Assessment
Linear Regression in R

Week 8

Recap of the topics
Intermediate Test
Project Work on a Real Dataset

Week 9

5) Multivariate Analysis
Reducing Data Complexity
- Basic Concepts
- Standardization and Representation
Multivariate Analysis in R

Week 10

5) Multivariate Analysis
- Principal Component Analysis (PCA)
- Singular Value Decomposition (SVD)
Multivariate Analysis in R

Week 11

5) Multivariate Analysis
Multidimensional Scaling
- Basic Concepts
- Clustering
Multivariate Analysis in R

Week 12

Recap of the topics
Final project presentations