Overview

Dataset statistics

Number of variables: 4
Number of observations: 474108
Missing cells: 0
Missing cells (%): 0.0%
Total size in memory: 19.9 MiB
Average record size in memory: 44.0 B
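
The average record size follows from the total memory footprint and the number of observations. The short check below is a sketch of that arithmetic; it assumes that MiB means 1024 * 1024 bytes and that each numeric or datetime column stores 8-byte values, which the report does not state explicitly.

```python
MIB = 1024 * 1024  # bytes per mebibyte (assumed unit of the report)

total_mib = 19.9   # "Total size in memory"
n_rows = 474108    # "Number of observations"

# Average bytes per record = total bytes / number of rows.
avg_record_bytes = total_mib * MIB / n_rows
print(f"{avg_record_bytes:.1f} B per record")

# Assumption: userid, timestamp and value are stored as 8-byte values.
numeric_col_mib = n_rows * 8 / MIB
print(f"{numeric_col_mib:.1f} MiB per 8-byte column")
```

Both figures round to the values shown in the report (44.0 B per record, 3.6 MiB per numeric or datetime column).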

Variable types

Text: 1
Numeric: 2
DateTime: 1

Dataset

Description: [Steps] The step counter sensor reports the total number of steps taken by the user since the phone was last rebooted (powered on). To make sensor observations comparable, the sampling frequency was reduced to one observation per minute. For each categorical variable, the first non-missing name is reported.
Creator: Andrea Bontempelli, Matteo Busso, Roy Alia Asiku
Author: Andrea Bontempelli, Matteo Busso, Fausto Giunchiglia
URL
Copyright: (c) University of Trento - Knowledge Diversity 2023
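
Because the sensor reports a running total since the last reboot, the value column is cumulative rather than a per-minute count. The sketch below shows one way to recover steps per interval for a single user. The function name `per_interval_steps`, the sample readings, and the rule that a drop in the counter signals a reboot are all assumptions made for illustration, not part of the dataset documentation.

```python
from datetime import datetime


def per_interval_steps(readings):
    """Convert cumulative step-counter readings into steps per interval.

    `readings` is a list of (timestamp, cumulative_steps) tuples for ONE
    user, sorted by time. A drop in the counter is treated as a reboot
    (an assumption), in which case the new reading itself is taken as the
    number of steps since the reset.
    """
    deltas = []
    for (_, prev), (ts, curr) in zip(readings, readings[1:]):
        steps = curr - prev if curr >= prev else curr
        deltas.append((ts, steps))
    return deltas


# Hypothetical one-minute readings, not taken from the dataset.
readings = [
    (datetime(2020, 11, 16, 7, 0), 1000),
    (datetime(2020, 11, 16, 7, 1), 1040),
    (datetime(2020, 11, 16, 7, 2), 1040),
    (datetime(2020, 11, 16, 7, 3), 15),  # counter dropped: assumed reboot
    (datetime(2020, 11, 16, 7, 4), 60),
]
deltas = per_interval_steps(readings)
print([steps for _, steps in deltas])  # [40, 0, 15, 45]
```

Steps taken between the last reading before a reboot and the reboot itself cannot be recovered this way, so the result is a lower bound around resets.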

Variable descriptions

experimentid: Experiment Id
userid: User id
timestamp: show month(2), day(2), hour(2), minute(2), second(2), decimals(3)
value: The number of steps

Alerts

experimentid has constant value "wenetDenmark" (Constant)
userid has 39497 (8.3%) zeros (Zeros)

Reproduction

Analysis started: 2024-11-23 02:05:03.872931
Analysis finished: 2024-11-23 02:05:05.740583
Duration: 1.87 seconds
Software version: ydata-profiling v4.8.3
Download configuration: config.json
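
The reported duration can be checked against the two timestamps above. This is a minimal sketch using only the standard library; the format string is an assumption based on how the timestamps are printed.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed layout of the printed timestamps

started = datetime.strptime("2024-11-23 02:05:03.872931", FMT)
finished = datetime.strptime("2024-11-23 02:05:05.740583", FMT)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 1.87 seconds
```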

Variables

experimentid
Text

CONSTANT 

Experiment Id

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 9.0 MiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 5689296
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: wenetDenmark
2nd row: wenetDenmark
3rd row: wenetDenmark
4th row: wenetDenmark
5th row: wenetDenmark

Value         Count   Frequency (%)
wenetdenmark  474108  100.0%

Most occurring characters

Value  Count    Frequency (%)
e      1422324  25.0%
n      948216   16.7%
w      474108   8.3%
t      474108   8.3%
D      474108   8.3%
m      474108   8.3%
a      474108   8.3%
r      474108   8.3%
k      474108   8.3%
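
Since experimentid holds the same 12-character string in every row, the character counts can be derived from that one string and the row count. The snippet below is a small sketch of that derivation using the standard library; it is not how ydata-profiling computes the table internally.

```python
from collections import Counter

value = "wenetDenmark"  # the constant experimentid
n_rows = 474108         # number of observations

# Count characters once, then scale by the number of rows.
totals = {ch: count * n_rows for ch, count in Counter(value).items()}
total_chars = len(value) * n_rows

for ch, count in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(ch, count, f"{100 * count / total_chars:.1f}%")
```

The letter e appears three times and n twice in the string, which accounts for their 25.0% and 16.7% shares; the remaining seven characters appear once each.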

Most occurring categories, scripts, and blocks

Grouping  Value      Count    Frequency (%)
Category  (unknown)  5689296  100.0%
Script    (unknown)  5689296  100.0%
Block     (unknown)  5689296  100.0%

Most frequent character per category, script, and block

All characters fall into the single (unknown) category, script, and block, so each per-group breakdown is identical to the table under Most occurring characters.

userid
Real number (ℝ)

ZEROS 

User id

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.90895113
Minimum: 0
Maximum: 27
Zeros: 39497
Zeros (%): 8.3%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 6
Median: 17
Q3: 21
95-th percentile: 27
Maximum: 27
Range: 27
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 8.498390127
Coefficient of variation (CV): 0.5700193163
Kurtosis: -0.9855742791
Mean: 14.90895113
Median Absolute Deviation (MAD): 4
Skewness: -0.4342805832
Sum: 7068453
Variance: 72.22263475
Monotonicity: Increasing
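
Several of the userid statistics above are tied together by simple identities: CV is the standard deviation divided by the mean, variance is the standard deviation squared, IQR is Q3 minus Q1, and the sum is the mean times the number of observations. The check below is a sketch that recomputes them from the rounded figures in the report, so agreement is only expected up to rounding.

```python
n = 474108            # number of observations
mean = 14.90895113
std = 8.498390127
q1, q3 = 6, 21

cv = std / mean       # coefficient of variation
variance = std ** 2   # variance as the square of the standard deviation
iqr = q3 - q1         # interquartile range
total = mean * n      # sum of all values

print(f"CV={cv:.6f} variance={variance:.5f} IQR={iqr} sum={round(total)}")
```

Since userid is an identifier, these figures describe how rows are distributed across users rather than any measured quantity.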
Histogram with fixed size bins (bins=14)
Value             Count   Frequency (%)
17                190453  40.2%
6                 53296   11.2%
26                50620   10.7%
0                 39497   8.3%
21                35691   7.5%
2                 33625   7.1%
27                27763   5.9%
20                18486   3.9%
3                 12741   2.7%
25                6773    1.4%
Other values (4)  5163    1.1%

Minimum 5 values

Value  Count  Frequency (%)
0      39497  8.3%
2      33625  7.1%
3      12741  2.7%
6      53296  11.2%
8      4091   0.9%

Maximum 5 values

Value  Count  Frequency (%)
27     27763  5.9%
26     50620  10.7%
25     6773   1.4%
22     123    < 0.1%
21     35691  7.5%

timestamp
Date

show month(2), day(2), hour(2), minute(2), second(2), decimals(3)

Distinct: 474042
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 3.6 MiB
Minimum: 2020-11-16 07:00:00.452000
Maximum: 2020-12-11 21:59:59.724000
Histogram with fixed size bins (bins=50)
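
The timestamp figures above imply the length of the collection window and how many rows share a timestamp with another row. The sketch below derives both with the standard library; the format string is an assumption based on the printed minimum and maximum.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed layout of the printed timestamps

first = datetime.strptime("2020-11-16 07:00:00.452000", FMT)
last = datetime.strptime("2020-12-11 21:59:59.724000", FMT)
span = last - first  # length of the collection window

n_rows, n_distinct = 474108, 474042
repeated = n_rows - n_distinct              # rows beyond the distinct count
distinct_pct = 100 * n_distinct / n_rows    # share of distinct timestamps

print(span.days, "days,", span.seconds, "seconds")
print(repeated, "repeated rows,", f"{distinct_pct:.3f}% distinct")
```

The window covers a little under 26 days, and only 66 rows repeat a timestamp already seen elsewhere, which is why the distinct share is shown as > 99.9%.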

value
Real number (ℝ)

The number of steps

Distinct: 134250
Distinct (%): 28.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 43518.8592
Minimum: 0
Maximum: 364089
Zeros: 647
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1154
Q1: 7615
Median: 27001
Q3: 66893.25
95-th percentile: 126246
Maximum: 364089
Range: 364089
Interquartile range (IQR): 59278.25

Descriptive statistics

Standard deviation: 45772.02215
Coefficient of variation (CV): 1.051774403
Kurtosis: 2.698394126
Mean: 43518.8592
Median Absolute Deviation (MAD): 23140
Skewness: 1.525696018
Sum: 2.06326393 × 10^10
Variance: 2095078012
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                  Count   Frequency (%)
3281                   869     0.2%
0                      647     0.1%
96078                  629     0.1%
8454                   609     0.1%
51359                  535     0.1%
81643                  517     0.1%
59380                  515     0.1%
44431                  509     0.1%
3269                   481     0.1%
34643                  467     0.1%
Other values (134240)  468330  98.8%

Minimum 5 values

Value  Count  Frequency (%)
0      647    0.1%
2      1      < 0.1%
3      1      < 0.1%
4      1      < 0.1%
5      1      < 0.1%

Maximum 5 values

Value   Count  Frequency (%)
364089  1      < 0.1%
364011  1      < 0.1%
363918  1      < 0.1%
363807  1      < 0.1%
363793  1      < 0.1%

Correlations

        userid  value
userid  1.000   -0.066
value   -0.066  1.000
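
The report does not say which correlation coefficient produced this matrix. As an illustration only, the sketch below computes a Pearson coefficient for two columns in plain Python; the function name `pearson` and the sample data are hypothetical, and the profiler may have used a different measure (for example a rank-based one).

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations.
    dev_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)


# Hypothetical sample, not taken from the dataset.
user_ids = [0, 2, 6, 17, 17, 26]
steps = [5200, 310, 47000, 1200, 9800, 23000]

r = pearson(user_ids, steps)
print(f"{r:.3f}")
```

The coefficient is symmetric in its arguments, which is why the matrix above mirrors across the diagonal. Because userid is an identifier rather than a measurement, a value near zero such as -0.066 is best read as descriptive only.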