Overview

Dataset statistics

Number of variables: 4
Number of observations: 568824
Missing cells: 0
Missing cells (%): 0.0%
Total size in memory: 20.1 MiB
Average record size in memory: 37.0 B
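The average record size is the total in-memory size divided by the number of observations. A minimal sketch of that arithmetic, assuming 1 MiB is 1024 * 1024 bytes (the variable names are illustrative and not part of the report):

```python
# Figures copied from the dataset statistics above.
total_size_mib = 20.1
n_observations = 568824

# Assumption: 1 MiB = 1024 * 1024 bytes.
total_size_bytes = total_size_mib * 1024 * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.2f} B per record")
```

The result lands close to the reported 37.0 B; a small gap is expected because the total size is itself rounded to one decimal place.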

Variable types

Text: 1
Numeric: 1
DateTime: 1
Boolean: 1

Dataset

Description: [unitless] Sensor that detects when the user is present, for example when the user unlocks the screen. It can be compared with the Screen status sensor to check whether a screen turn-on event was caused by the user or by something else, such as a received notification. The user-present OFF event simply corresponds to the screen turning off. To make sensor observations comparable, the sampling frequency was reduced to one minute. For each categorical variable, the first non-missing name is reported.
Creator: Andrea Bontempelli, Matteo Busso, Roy Alia Asiku
Author: Andrea Bontempelli, Matteo Busso, Fausto Giunchiglia
URL:
Copyright: (c) University of Trento - Knowledge Diversity 2023
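The description mentions reducing the frequency to one minute and keeping the first non-missing entry. The report does not say how this was implemented; the following is only one possible reading, sketched with the standard library. The function name `downsample_to_minute` and the sample rows are hypothetical.

```python
from datetime import datetime

def downsample_to_minute(rows):
    """Keep, per (userid, minute), the first non-missing status.

    rows: iterable of (userid, timestamp, status) tuples ordered by
    timestamp; status is None when the observation is missing.
    """
    first_seen = {}
    for userid, ts, status in rows:
        # Truncate the timestamp to the start of its minute.
        key = (userid, ts.replace(second=0, microsecond=0))
        # Only fill the slot while no non-missing value has been stored.
        if first_seen.get(key) is None:
            first_seen[key] = status
    return first_seen

# Hypothetical observations for a single user.
rows = [
    (3, datetime(2020, 11, 16, 7, 0, 0, 484000), None),
    (3, datetime(2020, 11, 16, 7, 0, 20), True),
    (3, datetime(2020, 11, 16, 7, 0, 40), False),
    (3, datetime(2020, 11, 16, 7, 1, 5), False),
]
per_minute = downsample_to_minute(rows)
```

In this sketch the 07:00 minute resolves to `True` (the missing first reading is skipped) and the 07:01 minute to `False`.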

Variable descriptions

experimentid: Experiment Id
userid: User id
timestamp: show month(2), day(2), hour(2), minute(2), second(2), decimals(3)
status: Return if the user is present or not

Alerts

experimentid has constant value "wenetDenmark" (Constant)
status is highly imbalanced (50.3%) (Imbalance)
userid has 18114 (3.2%) zeros (Zeros)
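The 50.3% in the imbalance alert is not a class share (the classes split 89.1% / 10.9%). A minimal sketch, assuming the score is one minus the Shannon entropy normalised by the log of the number of classes; this assumption reproduces the reported figure from the status counts, and `imbalance_score` is an illustrative name:

```python
import math

def imbalance_score(counts):
    """Return 1 minus the Shannon entropy normalised by log2(number of classes)."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(counts))

# Counts of False and True for the status variable, taken from the report.
score = imbalance_score([506743, 62081])
print(f"{score:.1%}")
```

A perfectly balanced variable would score 0 and a constant one would approach 1, so 50.3% indicates a substantial but not extreme imbalance.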

Reproduction

Analysis started: 2024-11-23 01:50:53.055309
Analysis finished: 2024-11-23 01:50:54.957283
Duration: 1.9 seconds
Software version: ydata-profiling v4.8.3
Download configuration: config.json

Variables

experimentid
Text

CONSTANT 

Experiment Id

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 10.8 MiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 6825888
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: wenetDenmark
2nd row: wenetDenmark
3rd row: wenetDenmark
4th row: wenetDenmark
5th row: wenetDenmark

| Value | Count | Frequency (%) |
|---|---|---|
| wenetdenmark | 568824 | 100.0% |

Most occurring characters

| Value | Count | Frequency (%) |
|---|---|---|
| e | 1706472 | 25.0% |
| n | 1137648 | 16.7% |
| w | 568824 | 8.3% |
| t | 568824 | 8.3% |
| D | 568824 | 8.3% |
| m | 568824 | 8.3% |
| a | 568824 | 8.3% |
| r | 568824 | 8.3% |
| k | 568824 | 8.3% |
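Because experimentid is constant, the character counts above can be derived from the single value and the row count. A small sketch using `collections.Counter` (variable names are illustrative):

```python
from collections import Counter

value = "wenetDenmark"   # the only value of experimentid
n_rows = 568824          # number of observations

# Count characters once, then scale by the number of rows.
per_value = Counter(value)
char_counts = {ch: k * n_rows for ch, k in per_value.items()}
total_chars = sum(char_counts.values())

# Print characters from most to least frequent with their share.
for ch, count in sorted(char_counts.items(), key=lambda kv: -kv[1]):
    print(ch, count, f"{count / total_chars:.1%}")
```

The counting is case-sensitive, which is why "D" appears as its own character among the nine distinct ones.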

Most occurring categories

| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 6825888 | 100.0% |

Most frequent character per category

(unknown)

| Value | Count | Frequency (%) |
|---|---|---|
| e | 1706472 | 25.0% |
| n | 1137648 | 16.7% |
| w | 568824 | 8.3% |
| t | 568824 | 8.3% |
| D | 568824 | 8.3% |
| m | 568824 | 8.3% |
| a | 568824 | 8.3% |
| r | 568824 | 8.3% |
| k | 568824 | 8.3% |

Most occurring scripts

| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 6825888 | 100.0% |

Most frequent character per script

(unknown)

| Value | Count | Frequency (%) |
|---|---|---|
| e | 1706472 | 25.0% |
| n | 1137648 | 16.7% |
| w | 568824 | 8.3% |
| t | 568824 | 8.3% |
| D | 568824 | 8.3% |
| m | 568824 | 8.3% |
| a | 568824 | 8.3% |
| r | 568824 | 8.3% |
| k | 568824 | 8.3% |

Most occurring blocks

| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 6825888 | 100.0% |

Most frequent character per block

(unknown)

| Value | Count | Frequency (%) |
|---|---|---|
| e | 1706472 | 25.0% |
| n | 1137648 | 16.7% |
| w | 568824 | 8.3% |
| t | 568824 | 8.3% |
| D | 568824 | 8.3% |
| m | 568824 | 8.3% |
| a | 568824 | 8.3% |
| r | 568824 | 8.3% |
| k | 568824 | 8.3% |

userid
Real number (ℝ)

ZEROS 

User id

Distinct: 17
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.8791577
Minimum: 0
Maximum: 27
Zeros: 18114
Zeros (%): 3.2%
Negative: 0
Negative (%): 0.0%
Memory size: 4.3 MiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 6
Median: 17
Q3: 17
95-th percentile: 25
Maximum: 27
Range: 27
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 7.938786329
Coefficient of variation (CV): 0.6164057086
Kurtosis: -1.444711791
Mean: 12.8791577
Median Absolute Deviation (MAD): 6
Skewness: -0.07130086789
Sum: 7325974
Variance: 63.02432838
Monotonicity: Increasing
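Several of the userid statistics are derived from one another. As a quick consistency check, the sketch below recomputes the mean, variance, coefficient of variation, IQR, and range from the other reported figures (variable names are illustrative):

```python
# Figures copied from the userid statistics in this report.
n = 568824            # number of observations
total = 7325974       # Sum
std = 7.938786329     # Standard deviation
q1, q3 = 6, 17        # Q1 and Q3
minimum, maximum = 0, 27

mean = total / n                   # Mean = Sum / number of observations
variance = std ** 2                # Variance = standard deviation squared
cv = std / mean                    # Coefficient of variation = std / mean
iqr = q3 - q1                      # Interquartile range = Q3 - Q1
value_range = maximum - minimum    # Range = Maximum - Minimum

print(mean, variance, cv, iqr, value_range)
```

The recomputed values agree with the reported ones up to rounding of the inputs.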
Histogram with fixed size bins (bins=17)
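
Common values

| Value | Count | Frequency (%) |
|---|---|---|
| 17 | 199953 | 35.2% |
| 6 | 119188 | 21.0% |
| 3 | 92512 | 16.3% |
| 23 | 63674 | 11.2% |
| 21 | 19537 | 3.4% |
| 0 | 18114 | 3.2% |
| 26 | 12082 | 2.1% |
| 2 | 11142 | 2.0% |
| 25 | 9810 | 1.7% |
| 27 | 7174 | 1.3% |
| Other values (7) | 15638 | 2.7% |

Minimum 5 values

| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 18114 | 3.2% |
| 2 | 11142 | 2.0% |
| 3 | 92512 | 16.3% |
| 6 | 119188 | 21.0% |
| 8 | 937 | 0.2% |

Maximum 5 values

| Value | Count | Frequency (%) |
|---|---|---|
| 27 | 7174 | 1.3% |
| 26 | 12082 | 2.1% |
| 25 | 9810 | 1.7% |
| 23 | 63674 | 11.2% |
| 22 | 4553 | 0.8% |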

timestamp
Date

show month(2), day(2), hour(2), minute(2), second(2), decimals(3)

Distinct: 568732
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 4.3 MiB
Minimum: 2020-11-16 07:00:00.484000
Maximum: 2020-12-11 21:58:45.492000
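The minimum and maximum timestamps give the length of the observation window, and the distinct count shows how many rows repeat an already seen timestamp. A small sketch with the standard library (variable names are illustrative):

```python
from datetime import datetime

# Minimum and maximum timestamps as reported above.
first = datetime.fromisoformat("2020-11-16 07:00:00.484000")
last = datetime.fromisoformat("2020-12-11 21:58:45.492000")
span = last - first

# Rows whose timestamp duplicates an earlier row.
n_rows = 568824
n_distinct = 568732
repeated = n_rows - n_distinct

print(span, repeated)
```

The window covers a little under 26 days, and 92 rows share a timestamp with another row.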
Histogram with fixed size bins (bins=50)

status
Boolean

IMBALANCE 

Return if the user is present or not

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 555.6 KiB

| Value | Count | Frequency (%) |
|---|---|---|
| False | 506743 | 89.1% |
| True | 62081 | 10.9% |

Correlations

| | status | userid |
|---|---|---|
| status | 1.000 | 0.072 |
| userid | 0.072 | 1.000 |