Overview

Dataset statistics

Number of variables: 4
Number of observations: 28208464
Missing cells: 0
Missing cells (%): 0.0%
Total size in memory: 1.4 GiB
Average record size in memory: 52.0 B
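
As a rough consistency check, the reported total size should be close to the number of observations multiplied by the average record size. The sketch below redoes that arithmetic using only the figures listed above; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics above.
n_obs = 28_208_464            # number of observations
avg_record_bytes = 52.0       # average record size in memory, in bytes

# Estimated total size in bytes, then converted to GiB (2**30 bytes).
total_bytes = n_obs * avg_record_bytes
total_gib = total_bytes / 2**30

print(f"{total_gib:.1f} GiB")
```

The average record size is itself rounded in the report, so the result is only expected to agree with the reported total to the displayed precision.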

Variable types

Text: 1
Numeric: 2
DateTime: 1

Dataset

Description: [lx] Illuminance. To make the sensor observations comparable, the sampling frequency was reduced to one minute. For each categorical variable, the first non-missing name is reported.
Creator: Andrea Bontempelli, Matteo Busso, Roy Alia Asiku
Author: Andrea Bontempelli, Matteo Busso, Fausto Giunchiglia
URL:
Copyright: (c) University of Trento - Knowledge Diversity 2023

Variable descriptions

experimentid: Experiment Id
userid: User id
timestamp: show month(2), day(2), hour(2), minute(2), second(2), decimals(3)
value: The ambient light level in SI lux units

Alerts

experimentid has constant value "wenetDenmark" [Constant]
value is highly skewed (γ1 = 64.953274) [Skewed]
userid has 6637628 (23.5%) zeros [Zeros]
value has 10575001 (37.5%) zeros [Zeros]
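
The percentages in the zero alerts are the zero counts divided by the total number of observations. A minimal sketch of that calculation, using the counts reported in this document (the dictionary layout is only for illustration):

```python
# Counts copied from the alerts and the dataset statistics.
n_obs = 28_208_464
zeros = {"userid": 6_637_628, "value": 10_575_001}

# Share of zeros per variable, in percent, rounded to one decimal place.
shares = {name: round(100 * count / n_obs, 1) for name, count in zeros.items()}

for name, share in shares.items():
    print(f"{name}: {share}% zeros")
```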

Reproduction

Analysis started: 2024-11-23 02:35:03.058165
Analysis finished: 2024-11-23 02:36:54.275291
Duration: 1 minute and 51.22 seconds
Software version: ydata-profiling v4.8.3
Download configuration: config.json
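
The reported duration follows from the start and finish timestamps. The sketch below recomputes it with the Python standard library; it assumes both timestamps are in the same time zone, which the report does not state.

```python
from datetime import datetime

# Timestamps copied from the reproduction block above.
start = datetime.fromisoformat("2024-11-23 02:35:03.058165")
end = datetime.fromisoformat("2024-11-23 02:36:54.275291")

# Elapsed time in seconds, split into whole minutes and remaining seconds.
elapsed = (end - start).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minute(s) and {seconds:.2f} seconds")
```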

Variables

experimentid
Text

CONSTANT 

Experiment Id

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 753.2 MiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 338501568
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: wenetDenmark
2nd row: wenetDenmark
3rd row: wenetDenmark
4th row: wenetDenmark
5th row: wenetDenmark

Value         Count     Frequency (%)
wenetdenmark  28208464  100.0%

Most occurring characters

Value  Count     Frequency (%)
e      84625392  25.0%
n      56416928  16.7%
w      28208464  8.3%
t      28208464  8.3%
D      28208464  8.3%
m      28208464  8.3%
a      28208464  8.3%
r      28208464  8.3%
k      28208464  8.3%
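
Because experimentid is constant, the character counts above can be derived from the single value "wenetDenmark" and the number of observations. The following sketch shows that derivation; it is an illustration of the arithmetic, not a description of how the profiling tool computes these counts.

```python
from collections import Counter

value = "wenetDenmark"   # the constant value of experimentid
n_obs = 28_208_464       # number of observations

# Character counts within one string, scaled by the number of rows.
per_string = Counter(value)
totals = {ch: count * n_obs for ch, count in per_string.items()}
total_chars = len(value) * n_obs

for ch, count in sorted(totals.items(), key=lambda item: -item[1]):
    print(ch, count, f"{100 * count / total_chars:.1f}%")
```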

Most occurring categories

Value      Count      Frequency (%)
(unknown)  338501568  100.0%

Most frequent character per category

(unknown)
Value  Count     Frequency (%)
e      84625392  25.0%
n      56416928  16.7%
w      28208464  8.3%
t      28208464  8.3%
D      28208464  8.3%
m      28208464  8.3%
a      28208464  8.3%
r      28208464  8.3%
k      28208464  8.3%

Most occurring scripts

Value      Count      Frequency (%)
(unknown)  338501568  100.0%

Most frequent character per script

(unknown)
Value  Count     Frequency (%)
e      84625392  25.0%
n      56416928  16.7%
w      28208464  8.3%
t      28208464  8.3%
D      28208464  8.3%
m      28208464  8.3%
a      28208464  8.3%
r      28208464  8.3%
k      28208464  8.3%

Most occurring blocks

Value      Count      Frequency (%)
(unknown)  338501568  100.0%

Most frequent character per block

(unknown)
Value  Count     Frequency (%)
e      84625392  25.0%
n      56416928  16.7%
w      28208464  8.3%
t      28208464  8.3%
D      28208464  8.3%
m      28208464  8.3%
a      28208464  8.3%
r      28208464  8.3%
k      28208464  8.3%

userid
Real number (ℝ)

ZEROS 

User id

Distinct: 17
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.829913674
Minimum: 0
Maximum: 27
Zeros: 6637628
Zeros (%): 23.5%
Negative: 0
Negative (%): 0.0%
Memory size: 430.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 3
median: 6
Q3: 20
95-th percentile: 26
Maximum: 27
Range: 27
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 9.850967567
Coefficient of variation (CV): 1.002141819
Kurtosis: -1.208658235
Mean: 9.829913674
Median Absolute Deviation (MAD): 6
Skewness: 0.6702756727
Sum: 277286766
Variance: 97.041562
Monotonicity: Increasing
Histogram with fixed size bins (bins=17)

Value             Count    Frequency (%)
0                 6637628  23.5%
6                 6488825  23.0%
3                 4502828  16.0%
26                3700909  13.1%
25                1718927  6.1%
17                1356653  4.8%
8                 904023   3.2%
20                858978   3.0%
21                756342   2.7%
23                375348   1.3%
Other values (7)  908003   3.2%

Value  Count    Frequency (%)
0      6637628  23.5%
2      269945   1.0%
3      4502828  16.0%
6      6488825  23.0%
8      904023   3.2%

Value  Count    Frequency (%)
27     235405   0.8%
26     3700909  13.1%
25     1718927  6.1%
23     375348   1.3%
22     14498    0.1%

timestamp
Date

show month(2), day(2), hour(2), minute(2), second(2), decimals(3)

Distinct: 28056559
Distinct (%): 99.5%
Missing: 0
Missing (%): 0.0%
Memory size: 430.4 MiB
Minimum: 2020-11-16 07:00:00.009000
Maximum: 2020-12-11 21:59:59.858000
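
The minimum and maximum timestamps give the period covered by the observations. A minimal sketch of that calculation with the standard library, assuming both timestamps share the same time zone (the report does not state one):

```python
from datetime import datetime

# Minimum and maximum timestamps copied from the statistics above.
first = datetime.fromisoformat("2020-11-16 07:00:00.009000")
last = datetime.fromisoformat("2020-12-11 21:59:59.858000")

# timedelta splits the span into days, seconds and microseconds.
span = last - first

print(f"{span.days} days, {span.seconds} s, {span.microseconds} µs")
```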
Histogram with fixed size bins (bins=50)

value
Real number (ℝ)

SKEWED  ZEROS 

The ambient light level in SI lux units

Distinct: 138090
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 94.73917191
Minimum: 0
Maximum: 416450.1875
Zeros: 10575001
Zeros (%): 37.5%
Negative: 0
Negative (%): 0.0%
Memory size: 430.4 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 4.46999979
Q3: 29.23999977
95-th percentile: 254
Maximum: 416450.1875
Range: 416450.1875
Interquartile range (IQR): 29.23999977

Descriptive statistics

Standard deviation: 978.5301604
Coefficient of variation (CV): 10.32867546
Kurtosis: 10669.17257
Mean: 94.73917191
Median Absolute Deviation (MAD): 4.46999979
Skewness: 64.953274
Sum: 2672446520
Variance: 957521.2748
Monotonicity: Not monotonic
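
Several of the statistics for value are tied together by standard definitions: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the interquartile range is Q3 minus Q1. The sketch below checks those relations against the figures reported here; the tolerance is an assumption chosen to allow for rounding in the displayed values.

```python
import math

# Figures copied from the statistics for `value`.
mean = 94.73917191
std = 978.5301604
q1, q3 = 0.0, 29.23999977

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance as the squared standard deviation
iqr = q3 - q1          # interquartile range

# Compare with the reported values, allowing for rounding.
print(math.isclose(cv, 10.32867546, rel_tol=1e-6))
print(math.isclose(variance, 957521.2748, rel_tol=1e-6))
print(math.isclose(iqr, 29.23999977, rel_tol=1e-9))
```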
Histogram with fixed size bins (bins=50)

Value                  Count     Frequency (%)
0                      10575001  37.5%
1                      814327    2.9%
2                      762362    2.7%
3                      447727    1.6%
4                      351789    1.2%
8                      306321    1.1%
5                      279198    1.0%
21                     271161    1.0%
10                     249717    0.9%
11                     245129    0.9%
Other values (138080)  13905732  49.3%

Value           Count     Frequency (%)
0               10575001  37.5%
0.009999999776  1255      < 0.1%
0.01999999955   68        < 0.1%
0.02999999933   95        < 0.1%
0.03999999911   36        < 0.1%

Value        Count  Frequency (%)
416450.1875  1      < 0.1%
272040       1      < 0.1%
266722       1      < 0.1%
258831       1      < 0.1%
251384       1      < 0.1%

Correlations

        userid  value
userid  1.000   0.176
value   0.176   1.000