MACHINE LEARNING APPLICATIONS IN POLITICAL SCIENCES
Luca Secondi, Carmine Pappalardo
Instructional goals
This course is designed to:
• Introduce students to key Machine Learning (ML) techniques in a clear and accessible way, showing how they can be applied to topics in politics, society, and international relations.
• Provide students with practical, hands-on learning opportunities using tools like R and simple artificial intelligence applications, so that they can explore real-world data with confidence.
• Help students understand when and why to use different ML models, focusing on making thoughtful choices rather than just on technical details.
• Guide students in exploring patterns and insights not only from quantitative values but also from text (such as policy speeches, news articles, or social media), using methods like sentiment and content analysis.
• Encourage students to think critically and ethically about the role of data and algorithms in today’s world, with a focus on how data-driven approaches and decisions can support the Sustainable Development Goals (SDGs).
• Strengthen students’ ability to communicate their findings clearly, turning complex data into insights that matter for real-world decision-making and meaningful social impact.
Prerequisites
This course builds naturally on the skills and knowledge developed in the Research Methods for Social Sciences (RMSS) course. Students are not expected to have advanced technical or programming skills; familiarity with the basic concepts and approaches introduced in the RMSS course provides a solid foundation for following the topics and hands-on exercises. The course guides students step by step, helping them gradually apply these methods to real-world data challenges in politics, society, and international relations.
Intended learning outcomes
By the end of the course, students will be able to:
i) distinguish and apply supervised and unsupervised machine learning methods to political and social data using R and AI tools;
ii) carry out simple sentiment and content analysis on policy and media texts;
iii) reflect on the strengths, limitations, and ethical issues of data-driven research;
iv) communicate analytical results effectively for policymaking and governance;
v) relate their analyses to the SDGs.
Course Contents
In today’s political and social context, data-driven decisions are fundamental to informing governance, shaping policies, and addressing the complex global challenges set out in the 2030 Agenda for Sustainable Development. This elective course offers a comprehensive introduction to machine learning (ML) techniques specifically designed for applications in international relations, political science, and the broader social sciences. As a natural continuation of the Research Methods for Social Sciences course, it builds on the computational and analytical foundations previously acquired and guides students toward further use of supervised and unsupervised methods, including logistic regression, decision trees, random forests, and cluster analysis.
Through hands-on sessions in R, students will work directly with political and social datasets, applying these models to real-world issues such as electoral forecasting, conflict prediction, public opinion clustering, and policy evaluation. In addition, the course introduces students to sentiment analysis and content analysis techniques, enabling them to extract insights from textual and qualitative data, such as social media posts, speeches, or policy documents. These analyses can be performed using R or by leveraging artificial intelligence tools to process large-scale textual data and uncover patterns in opinions, attitudes, and narratives.
While introducing theoretical underpinnings, the course focuses on the practical implementation and critical interpretation of machine learning tools, enabling students to apply algorithmic approaches effectively and reflect on their opportunities, limitations, and implications. Ultimately, the course equips participants with essential skills to contribute to evidence-based research and policymaking aligned with the Sustainable Development Goals (SDGs) and the broader pursuit of social progress.
Reference Books
Selected lectures and materials provided by the instructor.
Teaching Methods
The course will adopt an interactive and practice-based methodology, combining lectures with continuous student engagement. Emphasis will be placed on practical exercises, case study analysis, and empirical applications, through which students will learn the key methodological approaches. This structure aims to develop students’ ability to apply the techniques to real-world social and political data.
Assessment Method
In accordance with university regulations, student evaluation is based on continuous assessment activities (such as practical exercises and case analyses) and the development of a research paper. The paper allows students to apply the techniques and analytical approaches learned during the course to a relevant topic in political or social science. The research paper will be discussed during an oral examination at the end of the course. For attending students, one-third of the final grade comes from continuous assessment and two-thirds from the oral examination. For exempted or non-compliant students, the final grade is based solely on the oral examination, which, in addition to the discussion of the research paper, will include further questions on the topics covered during the course.
Thesis assignment criteria
The thesis consists of a research paper that applies statistical methods to the analysis of topics in international relations, political science, and the broader field of social sciences.
Week 1
Session 1: Introduction to data-driven decision-making in the social sciences; overview of machine learning types (supervised, unsupervised, ensemble, dimensionality reduction). Session 2: Setting up R and RStudio; introduction to political and social datasets; preparing and exploring structured data.
Week 2
Session 1: Supervised learning overview — classification problems in political science and international relations (beyond regression). Session 2: Applying classification models (logistic regression, decision trees) in R to political or social datasets.
Week 3
Session 1: Decision trees — theoretical background, splitting criteria, interpretability, and applications to political decision-making. Session 2: Hands-on work in R: building and interpreting decision trees applied to electoral, conflict, or policy datasets.
Week 4
Session 1: Ensemble methods I, with an introduction to random forests and bagging. Session 2: Applying random forests in R; interpreting feature importance and comparing to single models.
Week 5
Session 1: Ensemble methods II — introduction to gradient boosting (e.g., XGBoost), main concepts, and advantages over other models. Session 2: Applying gradient boosting in R; tuning parameters and analysing performance.
Week 6
Session 1: Unsupervised learning — clustering principles; when and why unsupervised methods are useful in social science research. Session 2: Applying k-means and hierarchical clustering in R; interpreting group patterns among political or social actors.
Week 7
Session 1: Dimensionality reduction — Principal Component Analysis (PCA), understanding variance, reducing complexity in datasets. Session 2: Applying PCA in R; visualizing and interpreting principal components in political and social data.
Week 8
Session 1: Model evaluation — cross-validation, overfitting, underfitting, balancing complexity and generalizability. Session 2: Running cross-validation in R; comparing models across different political datasets.
Week 9
Session 1: Feature engineering and selection — identifying meaningful variables, reducing noise, improving model interpretability. Session 2: Applying feature selection techniques in R; analysing the effect on model outcomes.
Week 10
Session 1: Case Study Discussion I — political risk prediction, policy diffusion, governance performance: how ML models contribute. Session 2: Group analysis and discussion of case study applications; comparing approaches, assumptions, and results.
Week 11
Session 1: Case Study Discussion II — electoral behavior, public opinion, international cooperation: interpreting patterns and predictions. Session 2: Group debates and critical reflections; drawing lessons from case comparisons and model outcomes.
Week 12
Session 1: Ethical and methodological reflections — fairness, accountability, algorithmic bias, and the responsible use of ML in political and social contexts. Session 2: Final wrap-up: integrating methods, reflecting on the broader significance of ML for evidence-based decisions and Agenda 2030 challenges.