Overview

Dataset statistics

Number of variables: 4
Number of observations: 470791
Missing cells: 0
Missing cells (%): 0.0%
Total size in memory: 23.1 MiB
Average record size in memory: 51.5 B

Variable types

Text: 2
Numeric: 1
DateTime: 1

Dataset

Description: [0/1] Reports whether the phone's screen is on or off. To make the sensor observations comparable, the sampling frequency was reduced to one minute. For each categorical variable, the first non-missing value is reported.
Creator: Andrea Bontempelli, Matteo Busso, Roy Alia Asiku
Author: Andrea Bontempelli, Matteo Busso, Fausto Giunchiglia
URL:
Copyright: (c) University of Trento - Knowledge Diversity 2023
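
The one-minute reduction mentioned in the description can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `downsample_to_minute`, the tuple layout, the use of `None` for a missing value, and the sample rows are all hypothetical.

```python
from datetime import datetime

def downsample_to_minute(rows):
    """Keep the first non-missing status per (userid, minute).

    `rows` is an iterable of (userid, timestamp, status) tuples.
    A status of None stands for a missing value (an assumption).
    """
    first_seen = {}
    # Sort chronologically so "first" means earliest within the minute.
    for userid, ts, status in sorted(rows, key=lambda r: r[1]):
        if status is None:
            continue  # skip missing values
        minute = ts.replace(second=0, microsecond=0)
        # setdefault keeps the value already stored for this key, if any.
        first_seen.setdefault((userid, minute), status)
    return first_seen

# Illustrative rows only; they are not taken from the dataset.
rows = [
    (0, datetime(2021, 7, 12, 8, 5, 13, 651000), "SCREEN_OFF"),
    (0, datetime(2021, 7, 12, 8, 5, 40), None),
    (0, datetime(2021, 7, 12, 8, 5, 55), "SCREEN_ON"),
    (0, datetime(2021, 7, 12, 8, 6, 2), "SCREEN_ON"),
    (12, datetime(2021, 7, 12, 8, 5, 30), None),
    (12, datetime(2021, 7, 12, 8, 5, 45), "SCREEN_ON"),
]
reduced = downsample_to_minute(rows)
```

Each key of `reduced` is a (userid, minute) pair, so at most one status per user and minute survives.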

Variable descriptions

experimentid: Experiment ID
userid: User ID
timestamp: Timestamp showing month (2), day (2), hour (2), minute (2), second (2), decimals (3)
status: Returns whether the screen is ON

Alerts

experimentid has constant value "wenetIndia" (Constant)
userid has 172016 (36.5%) zeros (Zeros)

Reproduction

Analysis started: 2024-11-22 12:32:46.028161
Analysis finished: 2024-11-22 12:32:47.745010
Duration: 1.72 seconds
Software version: ydata-profiling v4.8.3
Download configuration: config.json

Variables

experimentid
Text

CONSTANT 

Experiment Id

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.1 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 4707910
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: wenetIndia
2nd row: wenetIndia
3rd row: wenetIndia
4th row: wenetIndia
5th row: wenetIndia

Value         Count     Frequency (%)
wenetindia    470791    100.0%

Most occurring characters

Value    Count     Frequency (%)
e        941582    20.0%
n        941582    20.0%
w        470791    10.0%
t        470791    10.0%
I        470791    10.0%
d        470791    10.0%
i        470791    10.0%
a        470791    10.0%
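
Because experimentid is constant, the character counts above follow directly from the single value and the number of rows. The sketch below reproduces them using only figures from this report; the variable names are illustrative.

```python
from collections import Counter

value = "wenetIndia"  # the constant value reported for experimentid
n_rows = 470791       # number of observations in the report

# Occurrences of each character within one value, scaled by the row count.
per_value = Counter(value)
char_counts = {ch: k * n_rows for ch, k in per_value.items()}

total_chars = len(value) * n_rows
shares = {ch: 100 * c / total_chars for ch, c in char_counts.items()}
```

The letters "e" and "n" appear twice in the value, which is why each accounts for 20.0% of all characters while the other six account for 10.0% each.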

Most occurring categories

Value        Count      Frequency (%)
(unknown)    4707910    100.0%

Most frequent character per category

(unknown)
Value    Count     Frequency (%)
e        941582    20.0%
n        941582    20.0%
w        470791    10.0%
t        470791    10.0%
I        470791    10.0%
d        470791    10.0%
i        470791    10.0%
a        470791    10.0%

Most occurring scripts

Value        Count      Frequency (%)
(unknown)    4707910    100.0%

Most frequent character per script

(unknown)
Value    Count     Frequency (%)
e        941582    20.0%
n        941582    20.0%
w        470791    10.0%
t        470791    10.0%
I        470791    10.0%
d        470791    10.0%
i        470791    10.0%
a        470791    10.0%

Most occurring blocks

Value        Count      Frequency (%)
(unknown)    4707910    100.0%

Most frequent character per block

(unknown)
Value    Count     Frequency (%)
e        941582    20.0%
n        941582    20.0%
w        470791    10.0%
t        470791    10.0%
I        470791    10.0%
d        470791    10.0%
i        470791    10.0%
a        470791    10.0%

userid
Real number (ℝ)

ZEROS 

User id

Distinct: 20
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.25518755
Minimum: 0
Maximum: 62
Zeros: 172016
Zeros (%): 36.5%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 12
Q3: 26
95-th percentile: 26
Maximum: 62
Range: 62
Interquartile range (IQR): 26

Descriptive statistics

Standard deviation: 12.30761748
Coefficient of variation (CV): 1.004278183
Kurtosis: 0.8658215142
Mean: 12.25518755
Median Absolute Deviation (MAD): 12
Skewness: 0.9440134969
Sum: 5769632
Variance: 151.4774481
Monotonicity: Increasing
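
Several of the userid statistics above are derived from one another, and a short check confirms they are mutually consistent. The sketch uses only figures from this report and the standard definitions (mean = sum / n, standard deviation = square root of the variance, CV = standard deviation / mean); the variable names are illustrative.

```python
import math

n_rows = 470791         # number of observations
total = 5769632         # reported sum of userid
variance = 151.4774481  # reported variance
zeros = 172016          # reported number of zeros

mean = total / n_rows
std = math.sqrt(variance)
cv = std / mean                   # coefficient of variation
zeros_pct = 100 * zeros / n_rows  # share of rows where userid is 0
```

Since userid is an identifier, these moments describe how observations are distributed across users rather than a measured quantity.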
Histogram with fixed size bins (bins=20)
Value                Count     Frequency (%)
0                    172016    36.5%
12                   110177    23.4%
26                   110001    23.4%
9                    37360     7.9%
17                   8433      1.8%
35                   7860      1.7%
43                   4795      1.0%
24                   4428      0.9%
4                    3890      0.8%
44                   3080      0.7%
Other values (10)    8751      1.9%

Value    Count     Frequency (%)
0        172016    36.5%
4        3890      0.8%
8        979       0.2%
9        37360     7.9%
12       110177    23.4%

Value    Count    Frequency (%)
62       2143     0.5%
57       2352     0.5%
49       838      0.2%
46       164      < 0.1%
44       3080     0.7%

timestamp
Date

Timestamp showing month (2), day (2), hour (2), minute (2), second (2), decimals (3)

Distinct: 470750
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 3.6 MiB
Minimum: 2021-07-12 08:05:13.651000
Maximum: 2021-08-12 14:40:29.574000
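
The minimum and maximum above give the time span covered by the data, and the distinct count gives the number of rows that share a timestamp with another row. A small sketch of that arithmetic, using only figures from this report (variable names are illustrative):

```python
from datetime import datetime

start = datetime.fromisoformat("2021-07-12 08:05:13.651000")
end = datetime.fromisoformat("2021-08-12 14:40:29.574000")

span = end - start  # a timedelta covering the whole collection period

n_rows = 470791
n_distinct = 470750
repeated = n_rows - n_distinct  # rows beyond the first for each timestamp
```

The span comes out at 31 days plus roughly six and a half hours, and 41 rows repeat a timestamp already seen elsewhere, possibly from different users.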
Histogram with fixed size bins (bins=50)

status
Text

Returns whether the screen is ON

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 MiB

Length

Max length: 10
Median length: 10
Mean length: 9.517484404
Min length: 9

Characters and Unicode

Total characters: 4480746
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SCREEN_OFF
2nd row: SCREEN_ON
3rd row: SCREEN_OFF
4th row: SCREEN_OFF
5th row: SCREEN_ON

Value         Count     Frequency (%)
screen_off    243627    51.7%
screen_on     227164    48.3%
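
The mean length and total character count reported for status follow from these two value counts, since SCREEN_OFF has 10 characters and SCREEN_ON has 9. A minimal check using only figures from this report (variable names are illustrative):

```python
counts = {"SCREEN_OFF": 243627, "SCREEN_ON": 227164}

n_rows = sum(counts.values())
# Total characters: each value's length weighted by how often it occurs.
total_chars = sum(len(value) * count for value, count in counts.items())
mean_length = total_chars / n_rows
```

The fractional part of the mean length equals the share of SCREEN_OFF rows, because that value is exactly one character longer than SCREEN_ON.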

Most occurring characters

Value    Count     Frequency (%)
E        941582    21.0%
N        697955    15.6%
F        487254    10.9%
S        470791    10.5%
C        470791    10.5%
R        470791    10.5%
_        470791    10.5%
O        470791    10.5%

Most occurring categories

Value        Count      Frequency (%)
(unknown)    4480746    100.0%

Most frequent character per category

(unknown)
Value    Count     Frequency (%)
E        941582    21.0%
N        697955    15.6%
F        487254    10.9%
S        470791    10.5%
C        470791    10.5%
R        470791    10.5%
_        470791    10.5%
O        470791    10.5%

Most occurring scripts

Value        Count      Frequency (%)
(unknown)    4480746    100.0%

Most frequent character per script

(unknown)
Value    Count     Frequency (%)
E        941582    21.0%
N        697955    15.6%
F        487254    10.9%
S        470791    10.5%
C        470791    10.5%
R        470791    10.5%
_        470791    10.5%
O        470791    10.5%

Most occurring blocks

Value        Count      Frequency (%)
(unknown)    4480746    100.0%

Most frequent character per block

(unknown)
Value    Count     Frequency (%)
E        941582    21.0%
N        697955    15.6%
F        487254    10.9%
S        470791    10.5%
C        470791    10.5%
R        470791    10.5%
_        470791    10.5%
O        470791    10.5%