DATABASES & BIG DATA
Instructional goals
The Databases & Big Data course aims to give students
a solid understanding of classical notions of data management as
well as modern concepts related to Big Data. All these
notions will be analyzed both formally and practically through real-world
scenarios. Furthermore, the course will cover various technologies related to relational databases as well as NoSQL databases for accessing, storing, and managing large data sets.
Intended learning outcomes
Knowledge and understanding. The course offers several conceptual
tools to solve data-related problems in an effective way. At the end
of the course, students will possess a solid understanding of the
issues related to managing large volumes of data as well as
theoretical and practical solutions to these issues.
Applying knowledge and understanding. Successful students will be able to:
- Apply the conceptual tools of data management to real-world scenarios;
- Extract information from data sets using both conceptual and practical tools;
- Understand the key differences underlying different data management technologies;
- Interact with and understand data-oriented technologies.
Making judgements. The course will foster the development of critical
thinking related to data-oriented applications. Successful students
will be able to analyse real-world problems involving data and choose the right conceptual and technological solutions to solve them.
Communication skills. The course will provide the students with an
understanding of the standard terms and concepts of classical data
management as well as notions related to so-called Big
Data. This will allow the students to effectively communicate their
ideas, proposals, and analyses.
Learning skills. This introductory course provides a solid background
of knowledge on the basics of data management. Successful students
will possess the skills required to study and understand intermediate
concepts related to data management and big data.
Course Contents
The course covers several topics related to classical data management and so-called Big Data. Topics include:
- Conceptual database design and the relational model;
- Relational design principles based on dependencies and normal forms;
- Concurrency control, storage, and indexing;
- Database design and the use of databases in applications;
- SQL databases and query language;
- NoSQL databases.
Reference Books
Lecture notes (slides) and course material will be made available on the e-learning platform. There is no mandatory textbook for the course. However, for students who want additional resources, we recommend:
- Ramakrishnan and Gehrke. Database Management Systems. 3rd Edition.
- Garcia-Molina, Ullman, Widom. Database Systems: The Complete Book. 2nd Edition. Prentice Hall.
- Karau, Konwinski, Wendell, Zaharia. Learning Spark: Lightning-Fast Big Data Analysis. O'Reilly Media.
Teaching Methods
Lectures, labs and exercise sessions. The use of a laptop is strongly recommended.
Assessment Method
- Mid-term written exam
- Final written exam
- Group Project
Thesis assignment criteria
To be discussed with the instructor.
Week 1
On Campus Session. Course introduction and organization. Introduction to data management, databases, data warehouse, data lakes.
On Campus Session. Setup of the MySQL and Python environment.
On Campus Session. Introduction to Relational Databases: tables, relations, keys
On Campus Session. Introduction to SQL: tables, keys, etc.
Week 2
On Campus Session. Relational databases and SQL (II)
On Campus Session. Exercises in SQL.
On Campus Session. Schema design: Introduction to ER modelling
On Campus Session. ER modelling
Week 3
On Campus Session. Exercises for ER modelling
On Campus Session. Practice with SQL
On Campus Session. ER modelling
On Campus Session. Restructuring the ER model (I)
Week 4
On Campus Session. Exercises for ER modelling
On Campus Session. Practice with SQL
On Campus Session. Restructuring the ER model (II)
On Campus Session. Normal forms (I)
Week 5
On Campus Session. Exercises for ER modelling
On Campus Session. Practice with SQL
On Campus Session. Normal forms (II)
On Campus Session. Indexing and query optimization (I)
Week 6
On Campus Session. Exercises for ER modelling
On Campus Session. Practice with SQL
On Campus Session. Indexing and query optimization (II)
On Campus Session. Indexing and query optimization (III)
Week 7
On Campus Session. Exercise on indexing
On Campus Session. Practice with SQL
On Campus Session. Concurrency in databases
On Campus Session. Transactions (I)
Week 8
On Campus Session. Exercises on transactions
On Campus Session. Practice with SQL
On Campus Session. Transactions (II)
On Campus Session. Transactions (III)
Week 9
On Campus Session. Exercises on transactions
On Campus Session. Practice with SQL
On Campus Session. Databases and Big Data (I)
On Campus Session. Databases and Big Data (II)
Week 10
On Campus Session. Exercises on transactions
On Campus Session. Practice with SQL
On Campus Session. Distributed Databases
On Campus Session. NoSQL databases
Week 11
On Campus Session. NoSQL: MongoDB
On Campus Session. Practice with SQL
On Campus Session. Vector databases (I)
On Campus Session. Vector databases (II)
Week 12
On Campus Session. NoSQL: MongoDB
On Campus Session. Q&A for the project
On Campus Session. Vector databases (III)
On Campus Session. Vector databases (IV)