DATABASES & BIG DATA

Blerina Sinaimeri

Learning objectives

The Databases & Big Data course aims to give students a solid understanding of the classical notions of data management as well as modern concepts related to Big Data. These notions will be analyzed both formally and practically through real-world scenarios. Furthermore, the course will cover various technologies related to relational and NoSQL databases for accessing, storing, and managing large data sets.

Expected learning outcomes

Knowledge and understanding. The course offers several conceptual tools to solve data-related problems effectively. At the end of the course, students will possess a solid understanding of the issues related to managing large volumes of data, as well as theoretical and practical solutions to these issues.

Applying knowledge and understanding. Successful students will be able to:
- Apply the conceptual tools of data management to real-world scenarios;
- Extract information from data sets using both conceptual and practical tools;
- Understand the key differences between data management technologies;
- Interact with and understand data-oriented technologies.

Making judgements. The course will foster the development of critical thinking related to data-oriented applications. Successful students will be able to analyse real-world problems involving data and choose the right conceptual and technological solutions to solve them.

Communication skills. The course will provide students with an understanding of the standard terms and concepts of classical data management, as well as notions related to so-called Big Data. This will allow students to communicate their ideas, proposals, and analyses effectively.

Learning skills. This introductory course provides a solid grounding in the basics of data management. Successful students will possess the skills required to study and understand intermediate concepts related to data management and Big Data.

Course contents

The course covers several topics related to classical data management and so-called Big Data. Topics include:
- Conceptual database design and the relational model;
- Relational design principles based on dependencies and normal forms;
- Concurrency control, storage, and indexing;
- Database design and the use of databases in applications;
- SQL databases and query language;
- NoSQL databases.

Reference texts

Lecture notes (slides) and course material will be made available on the e-learning platform. There is no mandatory textbook for the course. However, for students who want additional resources, we recommend:
- Ramakrishnan and Gehrke. Database Management Systems. 3rd Edition.
- Garcia-Molina, Ullman, and Widom. Database Systems: The Complete Book. 2nd Edition. Prentice Hall.
- Karau, Konwinski, Wendell, and Zaharia. Learning Spark: Lightning-Fast Big Data Analysis. O'Reilly Media.

Teaching methods

Lectures, labs and exercise sessions. The use of a laptop is strongly recommended.

Assessment methods

- Mid-term written exam
- Final written exam
- Group project

Criteria for assigning the final thesis

To be discussed with the instructor.

Week 1

- On Campus Session. Course introduction and organization. Introduction to data management, databases, data warehouses, and data lakes.
- On Campus Session. Setup of the MySQL and Python environment.
- On Campus Session. Introduction to relational databases: tables, relations, keys.
- On Campus Session. Introduction to SQL: tables, keys, etc.

Week 2

- On Campus Session. Relational databases and SQL (II).
- On Campus Session. Exercises in SQL.
- On Campus Session. Schema design: introduction to ER modelling.
- On Campus Session. ER modelling.

Week 3

- On Campus Session. Exercises for ER modelling.
- On Campus Session. Practice with SQL.
- On Campus Session. ER modelling.
- On Campus Session. Restructuring the ER model (I).

Week 4

- On Campus Session. Exercises for ER modelling.
- On Campus Session. Practice with SQL.
- On Campus Session. Restructuring the ER model (II).
- On Campus Session. Normal forms (I).

Week 5

- On Campus Session. Exercises for ER modelling.
- On Campus Session. Practice with SQL.
- On Campus Session. Normal forms (II).
- On Campus Session. Indexing and query optimization (I).

Week 6

- On Campus Session. Exercises for ER modelling.
- On Campus Session. Practice with SQL.
- On Campus Session. Indexing and query optimization (II).
- On Campus Session. Indexing and query optimization (III).

Week 7

- On Campus Session. Exercises on indexing.
- On Campus Session. Practice with SQL.
- On Campus Session. Concurrency in databases.
- On Campus Session. Transactions (I).

Week 8

- On Campus Session. Exercises on transactions.
- On Campus Session. Practice with SQL.
- On Campus Session. Transactions (II).
- On Campus Session. Transactions (III).

Week 9

- On Campus Session. Exercises on transactions.
- On Campus Session. Practice with SQL.
- On Campus Session. Databases and Big Data (I).
- On Campus Session. Databases and Big Data (II).

Week 10

- On Campus Session. Exercises on transactions.
- On Campus Session. Practice with SQL.
- On Campus Session. Distributed databases.
- On Campus Session. NoSQL databases.

Week 11

- On Campus Session. NoSQL: MongoDB.
- On Campus Session. Practice with SQL.
- On Campus Session. Vector databases (I).
- On Campus Session. Vector databases (II).

Week 12

- On Campus Session. NoSQL: MongoDB.
- On Campus Session. Q&A for the project.
- On Campus Session. Vector databases (III).
- On Campus Session. Vector databases (IV).