metadata

Field Name	Description
Project title	name of the DataScientia project in a natural language as a string
Project url	this attribute encodes the dereferenceable URL of the DataScientia project
Project webpage	this attribute encodes the dereferenceable URL of the web page of the DataScientia project
Project keywords	list of keywords in a natural language to quickly understand the theme of the project
Project type	the type of the DataScientia project, e.g., Knowledge Resource Generation or Knowledge Resource Annotation
Project description	description of the DataScientia project in a natural language, namely (i) the project name and objectives; (ii) the type of data collected; (iii) suggested reuse.
Project start date	the date of the commencement of a DataScientia project
Project end date	the date of conclusion of a DataScientia project
Project funding agency	the name of the agency or institution funding a DataScientia project and, if available, the grant agreement number. European projects also have a DOI, e.g., https://doi.org/10.3030/823783
Project input	this (repeatable) attribute encodes the various inputs (e.g., datasets, specifications, etc.) with respect to a DataScientia project.
Project output	this (repeatable) attribute encodes the various outputs (e.g., datasets, domain languages, etc.) with respect to a DataScientia project.
Project coordinator	the name of the research coordinator in charge of a DataScientia project
Project observations	this attribute can be used to record any observations about a DataScientia project in a natural language.
Project coordinator organization	Institution of the ds:prjCoordinator
Project project area	Category of the project that pertains to the Catalog (LivePeople, LiveLanguage, LiveKnowledge, LiveData, LiveMedia).
Project members	Other members who took part in the project with roles other than the coordinator
Project target location	expected project data collection area
Project target population	Sociodemographic characteristics that were adopted as population inclusion criteria (e.g. gender, age, educational qualification, profession). Multiple options can be selected
Project overall participants involved	Total number of participants who were involved in the project
Project selected participants	Number of participants who have been selected for the project (less than or equal to ds:prjOverallPeopleInvolved), if the project has multiple phases
Project type of measurements	Survey methods that were used to collect participant data (i.e., Questionnaire, Structured Interview, Semi-Structured Interview, Participant Observation, Intensive Longitudinal Survey). Multiple options can be selected
Project IRB approval date	Date of Institutional review board (IRB) approval of the project
Project IRB approval organization	Institution to which the committee belongs
Project IRB approval number	Protocol number of the approval
Project cite as	BibTeX citation of the project
Project maintenance	description of the dataset maintenance process: how often is the dataset updated, how updates will be communicated, who is responsible for the updates, whether and how older versions will be supported
Project latitude	Latitude of the centroid of ds:prjTargetLocation (mainly for visualization purposes on the catalog website)
Project longitude	Longitude of the centroid of ds:prjTargetLocation (mainly for visualization purposes on the catalog website)
Project documentation name	Project documentation name
Project additional material name	Project additional material name
Data name	the name of the dataset in a natural language
Data description	a description of the dataset
Data version	the version of the dataset. The schema to follow is MAJOR.MINOR.PATCH, as described at https://semver.org/spec/v2.0.0.html
Data publication timestamp	the timestamp of the publication of the dataset in the respective catalog.
Data license	license of the dataset
Data url	the dereferenceable URL of the dataset
Data keyword	the keywords which can quickly convey the topic of the dataset
Data publisher	the publisher of the dataset
Data creator	the creator of the dataset. Individual member of the community or organization.
Data owner	the owner of the dataset. Individual member of the community or organization.
Data language	The language of the dataset (only if dataset type == Synchronic interaction or dataset type == Diachronic interaction). Recommended values are taken from https://en.wikipedia.org/wiki/List_of_ISO_639_language_codes. When not applicable, use N/A
Data level	the knowledge level of the dataset, e.g., L1-2, L4
Data size	the byte size of the dataset (e.g., the size of the Parquet file)
Data domain	the domain to which the dataset belongs. Possible values are: Digital University, Territory & Society, Health
Data file format	type of the file expressed as a MIME media type, see https://www.iana.org/assignments/media-types/media-types.xhtml
Data detailed description	links the dataset to its personalized project web page. Note that this is different from the ds:prjURL attribute, which links to the DataScientia project on the community.
Data download request	email of the institution responsible for validating the download request and sharing the data, or a link to the form to fill in
Data conditions of access	conditions that affect the availability of the data. E.g., only for scientific purposes
Data genre	Genre of the dataset: images, tabular data, geographical file
Data is accessible for free
Data expires	The date after which the dataset is no longer available
Data type	The type of dataset
Data sensor type	The type of sensor (only if dataset type == Sensor)
Data start date	The beginning of the data collection (i.e., the oldest date in the dataset)
Data end date	The end of the data collection (i.e., the most recent date in the dataset)
Data five stars	Ranking based on Tim Berners-Lee’s 5-star deployment scheme for Open Data (https://5stardata.info/en/). iLog data are rated 3 stars after data preparation
Data origin	Origin of the data. Synthetic: generated from an algorithm. Direct observation: collected from humans through human behavior studies. Composition: dataset resulting from the composition of other datasets. Feature engineering: dataset resulting from feature extraction or aggregation of existing datasets.
Data creative work status	Status of the data. Raw: no preprocessing has been performed. Cleaned: cleaning procedures have been applied to the raw data. Reduced: the data has been aggregated, e.g., to generate features. Enriched: contains additional data from third parties.
Data download request name	The name of the document to request the dataset
Data documentation name	link to the dataset documentation, e.g., technical report describing the data collection (design, collection, preparation, main results)
Data additional material name	link to the materials used during the research (e.g., focus group tracks, interviews, and questionnaires)
Data codebook name	A codebook describes the contents of a data collection. A well-documented codebook contains information intended to be complete and self-explanatory for each variable in a dataset.
Data identifier	alphanumeric code identifying the resource (see Identifier sheet)
Data changelog url
Data licence url
Data sha256	hash of the content of the file
Data update timestamp	timestamp of the last update of the dataset
Data based on	A resource from which this work is derived or from which it is a modification or adaptation. It is a link to Dataset, Project, Url
Data sensor name	single sensor name
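
Several of the fields above can be checked mechanically. The following is a minimal sketch, not a reference implementation, of how the Data version, Data size and Data sha256 values might be validated against the content of a file. It assumes that a metadata record is held as a Python dict keyed by the field names in this table; that representation and the helper names (is_valid_version, describe_content, check_record) are hypothetical and used only for illustration. The version check covers only the MAJOR.MINOR.PATCH core and ignores the optional pre-release and build suffixes allowed by SemVer 2.0.0.

```python
import hashlib
import re

# Core MAJOR.MINOR.PATCH pattern only (no leading zeros, no suffixes).
SEMVER_CORE = re.compile(r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)")


def is_valid_version(version: str) -> bool:
    """Return True if the whole string has the form MAJOR.MINOR.PATCH."""
    return SEMVER_CORE.fullmatch(version) is not None


def describe_content(content: bytes) -> dict:
    """Compute 'Data size' (bytes) and 'Data sha256' (hex digest) for file content."""
    return {
        "Data size": len(content),
        "Data sha256": hashlib.sha256(content).hexdigest(),
    }


def check_record(record: dict, content: bytes) -> list:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if not is_valid_version(str(record.get("Data version", ""))):
        problems.append("Data version is not MAJOR.MINOR.PATCH")
    # Compare the declared size and hash with values computed from the content.
    for field, value in describe_content(content).items():
        if record.get(field) != value:
            problems.append(f"{field} does not match the file content")
    return problems


# Hypothetical record built from in-memory content, for illustration only.
content = b"example"
record = {"Data version": "1.0.0", **describe_content(content)}
print(check_record(record, content))  # prints []
```

In practice the content would be read from the dataset file in binary mode; the in-memory bytes above only keep the sketch self-contained.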