Overview

Dataset statistics

Number of variables: 3
Number of observations: 3576798
Missing cells: 0
Missing cells (%): 0.0%
Total size in memory: 143.3 MiB
Average record size in memory: 42.0 B
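The average record size follows from the two figures above it. A minimal sketch of that arithmetic, assuming the report divides the total in-memory size by the number of observations (the variable names are illustrative, not taken from the profiling tool):

```python
# Figures copied from the dataset statistics above.
total_mib = 143.3            # reported total size in memory, in MiB
n_observations = 3_576_798   # reported number of observations

BYTES_PER_MIB = 1024 ** 2    # 1 MiB = 1,048,576 bytes

# Assumption: average record size = total bytes / number of rows.
total_bytes = total_mib * BYTES_PER_MIB
avg_record_bytes = total_bytes / n_observations

print(f"average record size: {avg_record_bytes:.1f} B")
```

Under that assumption the result rounds to the 42.0 B shown in the report.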

Variable types

Text: 1
Numeric: 1
DateTime: 1

Dataset

Description: [Step] The step detector sensor records an event each time the user takes a step. The value reported by the sensor is always one (the fractional part is always zero), and the event timestamp is the time at which the user's foot hit the ground. To make sensor observations comparable, the frequency was reduced to one minute. For each categorical variable, the first non-missing name is reported.
Creator: Andrea Bontempelli, Matteo Busso, Roy Alia Asiku
Author: Andrea Bontempelli, Matteo Busso, Fausto Giunchiglia
URL:
Copyright: (c) University of Trento - Knowledge Diversity 2023

Variable descriptions

experimentid: Experiment Id
userid: User id
timestamp: show month(2), day(2), hour(2), minute(2), second(2), decimals(3)

Alerts

experimentid has constant value "wenetItaly" (Constant)

Reproduction

Analysis started: 2024-11-23 06:06:46.304230
Analysis finished: 2024-11-23 06:06:58.102286
Duration: 11.8 seconds
Software version: ydata-profiling v4.8.3
Download configuration: config.json
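The reported duration can be checked against the two analysis timestamps. A small sketch using only the standard library; the format string is an assumption about how the timestamps above are laid out:

```python
from datetime import datetime

# Assumed layout of the timestamps printed in the Reproduction section.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-11-23 06:06:46.304230", FMT)
finished = datetime.strptime("2024-11-23 06:06:58.102286", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration_s = (finished - started).total_seconds()

print(f"{duration_s:.1f} seconds")
```

Rounded to one decimal place, the difference matches the 11.8 seconds listed above.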

Variables

experimentid
Text

CONSTANT 

Experiment Id

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 88.7 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 35767980
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: wenetItaly
2nd row: wenetItaly
3rd row: wenetItaly
4th row: wenetItaly
5th row: wenetItaly
Value       Count    Frequency (%)
wenetitaly  3576798  100.0%

Most occurring characters

Value  Count    Frequency (%)
e      7153596  20.0%
t      7153596  20.0%
w      3576798  10.0%
n      3576798  10.0%
I      3576798  10.0%
a      3576798  10.0%
l      3576798  10.0%
y      3576798  10.0%
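Because experimentid is constant, the character counts above can be derived from the single value "wenetItaly" and the row count. A sketch under that assumption (names such as `per_value` are illustrative):

```python
from collections import Counter

value = "wenetItaly"   # the only value of experimentid
n_rows = 3_576_798     # number of observations

# Character counts within one value: e and t occur twice, the rest once.
per_value = Counter(value)

# Every row holds the same string, so scale each count by the row count.
total_chars = len(value) * n_rows
char_counts = {ch: k * n_rows for ch, k in per_value.items()}
char_share = {ch: 100 * c / total_chars for ch, c in char_counts.items()}

for ch, c in sorted(char_counts.items(), key=lambda kv: -kv[1]):
    print(ch, c, f"{char_share[ch]:.1f}%")
```

This reproduces the 35767980 total characters and the 8 distinct characters reported for the variable.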

Most occurring categories

Value      Count     Frequency (%)
(unknown)  35767980  100.0%

Most frequent character per category

(unknown)
Value  Count    Frequency (%)
e      7153596  20.0%
t      7153596  20.0%
w      3576798  10.0%
n      3576798  10.0%
I      3576798  10.0%
a      3576798  10.0%
l      3576798  10.0%
y      3576798  10.0%

Most occurring scripts

Value      Count     Frequency (%)
(unknown)  35767980  100.0%

Most frequent character per script

(unknown)
Value  Count    Frequency (%)
e      7153596  20.0%
t      7153596  20.0%
w      3576798  10.0%
n      3576798  10.0%
I      3576798  10.0%
a      3576798  10.0%
l      3576798  10.0%
y      3576798  10.0%

Most occurring blocks

Value      Count     Frequency (%)
(unknown)  35767980  100.0%

Most frequent character per block

(unknown)
Value  Count    Frequency (%)
e      7153596  20.0%
t      7153596  20.0%
w      3576798  10.0%
n      3576798  10.0%
I      3576798  10.0%
a      3576798  10.0%
l      3576798  10.0%
y      3576798  10.0%

userid
Real number (ℝ)

User id

Distinct: 120
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 130.1816753
Minimum: 0
Maximum: 264
Zeros: 1791
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 54.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 19
Q1: 60
Median: 119
Q3: 191
95-th percentile: 250
Maximum: 264
Range: 264
Interquartile range (IQR): 131

Descriptive statistics

Standard deviation: 73.73056645
Coefficient of variation (CV): 0.5663667045
Kurtosis: -1.159709956
Mean: 130.1816753
Median Absolute Deviation (MAD): 71
Skewness: 0.02328835561
Sum: 465633556
Variance: 5436.19643
Monotonicity: Increasing
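Several of the userid statistics above are simple functions of the others. The sketch below recomputes them from the reported figures using the standard definitions (range = max - min, IQR = Q3 - Q1, mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared); it does not reproduce the profiling tool's own code.

```python
# Reported figures for userid, copied from the statistics above.
minimum, maximum = 0, 264
q1, q3 = 60, 191
std = 73.73056645
total, n = 465_633_556, 3_576_798   # Sum and number of observations

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
mean = total / n                  # Mean
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance

print(value_range, iqr)
print(f"mean={mean:.7f} cv={cv:.10f} variance={variance:.5f}")
```

The recomputed values agree with the reported range, IQR, mean, CV, and variance up to rounding.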
Histogram with fixed size bins (bins=50)
Value               Count    Frequency (%)
191                 270819   7.6%
27                  164633   4.6%
148                 136535   3.8%
78                  123871   3.5%
60                  120703   3.4%
107                 109861   3.1%
163                 102654   2.9%
19                  92211    2.6%
109                 84753    2.4%
42                  84470    2.4%
Other values (110)  2286288  63.9%
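The frequency column and the "Other values" row can be cross-checked against the total number of observations. A minimal sketch, assuming each percentage is the count divided by the 3576798 observations and rounded to one decimal place:

```python
n = 3_576_798  # number of observations

# Ten most frequent userid values and their counts, from the table above.
top = {
    191: 270_819, 27: 164_633, 148: 136_535, 78: 123_871, 60: 120_703,
    107: 109_861, 163: 102_654, 19: 92_211, 109: 84_753, 42: 84_470,
}

# Rows not covered by the ten listed values.
other = n - sum(top.values())

# Assumed rounding: percentage of all rows, one decimal place.
share = {uid: round(100 * count / n, 1) for uid, count in top.items()}
other_share = round(100 * other / n, 1)

print(other, other_share)
```

The remainder equals the 2286288 rows attributed to the other 110 user ids.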
Value  Count  Frequency (%)
0      1791   0.1%
2      10642  0.3%
3      2367   0.1%
8      68592  1.9%
17     1966   0.1%
Value  Count  Frequency (%)
264    8059   0.2%
263    28309  0.8%
262    36336  1.0%
259    1977   0.1%
255    35990  1.0%

timestamp
Date

show month(2), day(2), hour(2), minute(2), second(2), decimals(3)

Distinct: 3572003
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 54.6 MiB
Minimum: 2020-11-16 07:00:20.471000
Maximum: 2020-12-11 21:59:37.208000
Histogram with fixed size bins (bins=50)
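The timestamp figures above also imply the length of the collection window and the number of rows whose timestamp repeats an earlier one. A short sketch with the standard library; the format string and variable names are assumptions made for illustration:

```python
from datetime import datetime

# Assumed layout of the Minimum and Maximum timestamps shown above.
FMT = "%Y-%m-%d %H:%M:%S.%f"

first = datetime.strptime("2020-11-16 07:00:20.471000", FMT)
last = datetime.strptime("2020-12-11 21:59:37.208000", FMT)
span = last - first   # time between first and last recorded step

n, distinct = 3_576_798, 3_572_003
distinct_pct = 100 * distinct / n   # share of distinct timestamps
repeated_rows = n - distinct        # rows beyond the first use of a timestamp

print(span.days, "days")
print(f"{distinct_pct:.1f}% distinct, {repeated_rows} repeated rows")
```

The window spans a little under 26 days, and the distinct share rounds to the 99.9% reported for the variable.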