Website online!
August 19th, 2024
Last modification: August 19th, 2024
The Fall 2024 edition of the course is delivered in the classroom. The idea behind the course is that students do most of the work during the course itself, since this should yield considerably better results.
Learning how to handle one’s own data involves working on real personal data. Data collection is then a crucial part of the whole process. For this reason, the students of the course will be taught and asked to collect data from their own smartphones via an application developed by the Knowdive group, ideally over around one month. The calendar shows the dates of the data collection and when the data will be made available (dates can still change). Participants unwilling to collect their personal data will work on data collected in past data collections. More details will be provided by the professors during the first lessons.
Course Objectives and Outcomes
The course participants will learn how to collect and manage data about their everyday life to become researchers of themselves. The general objective is to allow the participants to gain a practical understanding of how data about their everyday life are related to AI. The participants have the opportunity to explore both technical questions about data quality and data generation, and ethical and privacy aspects. The course covers all the phases of the data life cycle: collection, preparation, documentation and distribution. After a brief theoretical introduction, the participants will collect behavioural data about themselves using a mobile application (available for Android and iOS) called iLog, developed by the Knowdive group at the University of Trento. The app collects phone sensor data and sends questions to the participants at regular intervals. The participants will then prepare and document their data. The course is data-intensive and hands-on. The participants are free to decide whether to collect their own data or to work on data collected in previous studies. However, collecting and managing one’s own data is strongly suggested in order to fully understand the challenges and the potential of handling one’s own personal data. The overall objective of this course is to familiarise participants with the management of personal data and to raise awareness of the value and the impact of one’s own data.
General Description
This course will cover the following topics:
Prerequisites
Students from all backgrounds are welcome. Because of the practical parts, a basic knowledge of Python programming is strongly suggested. Basic knowledge of data science, ethics, and data governance is useful but not mandatory to attend the course.
Course modality
Theory: The course runs from September 9, 2024 to November 26, 2024, with the following schedule:
Monday, 10:30-13:30, Room A216 Povo 1
Tuesday, 14:30-16:30, Room A216 Povo 1
You might want to read the Instructions to understand how to take the course.
Note also that the titles and structure of the lessons yet to be delivered might change slightly. The rule of thumb is: if a lesson has links to materials, it won’t change; if it has no links, its title and content are just suggestions.
Lesson Number | Date | Time | Material | Content of Material | Lecturer(s) | External resources
---|---|---|---|---|---|---
1 | Mon 9 Sep, 2024 | 10:30 | Module organization slides | Module organization | IB and FG
2 | Tue 10 Sep, 2024 | 14:30 | Module Project organization slides; Quantified self slides | Big thick data and quantified self + project organization | FG
5 | Mon 23 Sep, 2024 | 10:30 | Quantified self, data and privacy; iLog app | Data collection | FG
- | Wed 24 Sep, 2024 | - | Google Play; App Store; Datascientia Project | iLog data collection starts
7 | Mon 30 Sep, 2024 | 10:30 | Slides | Types of data: passive and active data | FG
8 | Tue 1 Oct, 2024 | 14:30 | | Q/A: project definition | IB and FG
9 | Mon 7 Oct, 2024 | 10:30 | Slides | Data cleaning and preparation - part 1 (motivation, problem, methods) | FG
10 | Tue 8 Oct, 2024 | 14:30 | | Data cleaning and preparation - part 2 (Total survey error) | IB
12 | Tue 15 Oct, 2024 | 14:30 | | Data cleaning and preparation - part 3 (feature engineering) | IB
13 | Mon 21 Oct, 2024 | 10:30 | | Data cleaning and preparation - part 4 (pseudonymization and anonymization) | FG
14 | Tue 22 Oct, 2024 | 14:30 | | Data cleaning and preparation - part 5 (documenting and sharing) | FG
- | Tue 22 Oct, 2024 | - | | iLog data collection end
17 | Mon 4 Nov, 2024 | 10:30 | | Q/A: data preparation | IB and FG
23 | Wed 25 Nov, 2024 | 10:30 | | Final presentation | IB and FG
The exam will consist of presenting the results of the study in a short presentation during the last lesson. The required deliverables (templates to be provided) are:
pseudonymized, cleaned and prepared dataset (only for evaluation purposes and removed from the course storage after the evaluation process);
metadata catalog with the metadata of the pseudonymized dataset (catalog example). The catalog is a static webpage hosted by the participant and can be private or public; if it is private, remember to give the professor permission to access it;
static website describing the data collection and the data preparation.
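As an illustration of what the pseudonymization step behind the first deliverable can look like, here is a minimal Python sketch. It is not the procedure prescribed by the course or the format exported by iLog: the field name `user_id`, the sample records and the helper names `pseudonymize` and `pseudonymize_records` are assumptions made for the example. It replaces a direct identifier with a keyed hash (HMAC-SHA256), so the same person always maps to the same pseudonym while the original value cannot be recovered without the secret key.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Map an identifier to a stable pseudonym using a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    # Keep the first 16 hex characters (64 bits) to get a shorter, readable pseudonym.
    return digest.hexdigest()[:16]


def pseudonymize_records(records, secret_key, id_field="user_id"):
    """Return copies of the records with the identifier field replaced by its pseudonym."""
    return [
        {**record, id_field: pseudonymize(record[id_field], secret_key)}
        for record in records
    ]


if __name__ == "__main__":
    # Hypothetical key and records, for illustration only.
    key = b"replace-with-a-secret-kept-outside-the-dataset"
    sample = [
        {"user_id": "alice@example.org", "timestamp": "2024-09-24T10:00:00", "steps": 120},
        {"user_id": "alice@example.org", "timestamp": "2024-09-24T11:00:00", "steps": 340},
    ]
    for row in pseudonymize_records(sample, key):
        print(row)
```

The key must be stored separately from the dataset. Note that replacing identifiers is only one part of pseudonymization: other fields, such as locations or free-text answers, can still reveal who a person is.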
Multiple positions are available as 150h activities and internships. They should be considered as the first part of a research project and thesis with the Knowdive group. The general activities of the group are listed on the website (http://knowdive.disi.unitn.it/), while activities already scheduled and available now can be found at http://knowdive.disi.unitn.it/work-with-us/. The 150h activities have variable length and are strictly related to software development: for this reason, knowledge of software development with at least one programming language is a must. All the activities can also be carried out remotely.
Anyone interested in these opportunities can send an email to knowdive-positions@disi.unitn.it, indicating any preferred topics or activities (if known). For 150h activities, it is important to list the programming languages known and to rate each one on a scale from 1 to 5, where 1 = basic knowledge and 5 = advanced knowledge.
Applications to the “150 ore” program can be submitted at the following link:
https://www.unitn.it/servizi/224/collaborazioni-studenti-150-ore
Note that the deadline for applications for A.Y. 2024-2025 is September 30, 2024.