Overview

Dataset statistics

Number of variables: 4
Number of observations: 12781814
Missing cells: 0
Missing cells (%): 0.0%
Total size in memory: 609.5 MiB
Average record size in memory: 50.0 B
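
The average record size appears to be the total size in memory divided by the number of observations. A minimal sketch of that arithmetic, assuming MiB denotes the binary unit (2^20 bytes); the variable names are illustrative and not part of the report:

```python
# Figures copied from the dataset statistics in this report.
TOTAL_SIZE_MIB = 609.5
N_OBSERVATIONS = 12_781_814

# Assumption: MiB is the binary unit, i.e. 1,048,576 bytes.
BYTES_PER_MIB = 2 ** 20

total_bytes = TOTAL_SIZE_MIB * BYTES_PER_MIB
avg_record_bytes = total_bytes / N_OBSERVATIONS

print(f"{avg_record_bytes:.1f} B per record")  # prints "50.0 B per record"
```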

Variable types

Text: 1
Numeric: 2
DateTime: 1

Dataset

Description: Illuminance [lx]. To make the sensor observations comparable, the sampling frequency was reduced to one minute. For each categorical variable, the first non-missing name is reported.
Creator: Andrea Bontempelli, Matteo Busso, Roy Alia Asiku
Author: Andrea Bontempelli, Matteo Busso, Fausto Giunchiglia
URL:
Copyright: (c) University of Trento - Knowledge Diversity 2023

Variable descriptions

experimentid: Experiment Id
userid: User id
timestamp: show month(2), day(2), hour(2), minute(2), second(2), decimals(3)
value: The ambient light level in SI lux units

Alerts

experimentid has constant value "wenetIndia" [Constant]
value is highly skewed (γ1 = 48.26722084) [Skewed]
userid has 906491 (7.1%) zeros [Zeros]
value has 2712356 (21.2%) zeros [Zeros]

Reproduction

Analysis started: 2024-11-22 13:02:12.220922
Analysis finished: 2024-11-22 13:02:53.757789
Duration: 41.54 seconds
Software version: ydata-profiling v4.8.3
Download configuration: config.json
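
The reported duration is consistent with the difference between the two analysis timestamps. A small sketch using only the Python standard library; the format string is an assumption about how the timestamps are written:

```python
from datetime import datetime

# Assumed timestamp layout: date, time, and microseconds.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-11-22 13:02:12.220922", FMT)
finished = datetime.strptime("2024-11-22 13:02:53.757789", FMT)

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")  # prints "41.54 seconds"
```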

Variables

experimentid
Text

CONSTANT 

Experiment Id

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 316.9 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 127818140
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: wenetIndia
2nd row: wenetIndia
3rd row: wenetIndia
4th row: wenetIndia
5th row: wenetIndia

Value       Count     Frequency (%)
wenetindia  12781814  100.0%

Most occurring characters

Value  Count     Frequency (%)
e      25563628  20.0%
n      25563628  20.0%
w      12781814  10.0%
t      12781814  10.0%
I      12781814  10.0%
d      12781814  10.0%
i      12781814  10.0%
a      12781814  10.0%
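
Because experimentid holds the same 10-character string in every row, these character counts can be derived from a single sample value. A minimal sketch, assuming every row equals "wenetIndia" as the Constant alert indicates; the names are illustrative:

```python
from collections import Counter

SAMPLE = "wenetIndia"    # the constant value of experimentid
N_ROWS = 12_781_814      # number of observations in the report

per_string = Counter(SAMPLE)           # occurrences of each character in one value
total_chars = len(SAMPLE) * N_ROWS     # 10 characters per row

for char, k in per_string.most_common():
    count = k * N_ROWS
    print(f"{char}  {count}  {count / total_chars:.1%}")
```

The counts are case-sensitive, which is why "I" and "i" are listed separately and the column has 8 distinct characters.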

Most occurring categories

Value      Count      Frequency (%)
(unknown)  127818140  100.0%

Most occurring scripts

Value      Count      Frequency (%)
(unknown)  127818140  100.0%

Most occurring blocks

Value      Count      Frequency (%)
(unknown)  127818140  100.0%

userid
Real number (ℝ)

ZEROS 

User id

Distinct: 18
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.14817615
Minimum: 0
Maximum: 62
Zeros: 906491
Zeros (%): 7.1%
Negative: 0
Negative (%): 0.0%
Memory size: 195.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 9
Median: 26
Q3: 43
95-th percentile: 44
Maximum: 62
Range: 62
Interquartile range (IQR): 34

Descriptive statistics

Standard deviation: 17.61430422
Coefficient of variation (CV): 0.7004207428
Kurtosis: -1.456394698
Mean: 25.14817615
Median Absolute Deviation (MAD): 17
Skewness: 0.07417222588
Sum: 321439310
Variance: 310.2637132
Monotonicity: Increasing
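
Several of the userid statistics are simple functions of one another, which gives a quick consistency check on the report. A sketch under the usual definitions (mean = sum / n, variance = std², CV = std / mean, IQR = Q3 − Q1); the inputs are copied from the report and the variable names are illustrative:

```python
import math

# Inputs copied from the userid statistics.
n = 12_781_814
total = 321_439_310
std = 17.61430422
q1, q3 = 9, 43
minimum, maximum = 0, 62

mean = total / n                   # arithmetic mean
variance = std ** 2                # variance is the squared standard deviation
cv = std / mean                    # coefficient of variation
iqr = q3 - q1                      # interquartile range
value_range = maximum - minimum    # range

# Compare with the reported figures, allowing for rounding in the report.
assert math.isclose(mean, 25.14817615, rel_tol=1e-6)
assert math.isclose(variance, 310.2637132, rel_tol=1e-6)
assert math.isclose(cv, 0.7004207428, rel_tol=1e-6)
print(iqr, value_range)  # prints "34 62"
```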
Histogram with fixed size bins (bins=18)

Common values

Value             Count    Frequency (%)
43                2539649  19.9%
44                1743825  13.6%
4                 1611935  12.6%
9                 1024831  8.0%
26                1023970  8.0%
12                956383   7.5%
0                 906491   7.1%
8                 589208   4.6%
57                418335   3.3%
18                414370   3.2%
Other values (8)  1552817  12.1%

Minimum 5 values

Value  Count    Frequency (%)
0      906491   7.1%
4      1611935  12.6%
8      589208   4.6%
9      1024831  8.0%
12     956383   7.5%

Maximum 5 values

Value  Count    Frequency (%)
62     72383    0.6%
57     418335   3.3%
46     41547    0.3%
44     1743825  13.6%
43     2539649  19.9%

timestamp
Date

show month(2), day(2), hour(2), minute(2), second(2), decimals(3)

Distinct: 12747987
Distinct (%): 99.7%
Missing: 0
Missing (%): 0.0%
Memory size: 195.0 MiB
Minimum: 2021-07-12 08:00:08.772000
Maximum: 2021-08-12 14:41:36.045000
Histogram with fixed size bins (bins=50)

value
Real number (ℝ)

SKEWED  ZEROS 

The ambient light level in SI lux units

Distinct: 67510
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 390.068209
Minimum: 0
Maximum: 445634
Zeros: 2712356
Zeros (%): 21.2%
Negative: 0
Negative (%): 0.0%
Memory size: 195.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
Median: 20
Q3: 58
95-th percentile: 598.9199829
Maximum: 445634
Range: 445634
Interquartile range (IQR): 56

Descriptive statistics

Standard deviation: 5491.912127
Coefficient of variation (CV): 14.07936356
Kurtosis: 2743.329648
Mean: 390.068209
Median Absolute Deviation (MAD): 20
Skewness: 48.26722084
Sum: 4985779295
Variance: 30161098.81
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                 Count    Frequency (%)
0                     2712356  21.2%
2                     195477   1.5%
1                     163086   1.3%
11                    150199   1.2%
6                     143769   1.1%
17                    132685   1.0%
14                    130219   1.0%
22                    128912   1.0%
12                    125469   1.0%
4                     124545   1.0%
Other values (67500)  8775097  68.7%

Minimum 5 values

Value           Count    Frequency (%)
0               2712356  21.2%
0.009999999776  85       < 0.1%
0.01999999955   154      < 0.1%
0.02999999933   28       < 0.1%
0.03999999911   14       < 0.1%

Maximum 5 values

Value   Count  Frequency (%)
445634  1      < 0.1%
369805  1      < 0.1%
353750  1      < 0.1%
353676  1      < 0.1%
353470  1      < 0.1%
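
The smallest non-zero values in this section (0.009999999776, 0.01999999955, and so on) and the 95-th percentile (598.9199829) look like 0.01, 0.02, and 598.92 after a round trip through 32-bit floating point. The report does not state the storage type, so the sketch below is only a plausibility check under that assumption; `as_float32` is a hypothetical helper built on the standard `struct` module:

```python
import struct

def as_float32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float and widen it back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# Nominal readings that would produce the values shown in the report
# if the column were stored as float32 (an assumption, not a stated fact).
for nominal in (0.01, 0.02, 0.03, 0.04, 598.92):
    print(nominal, f"{as_float32(nominal):.10g}")
```

If that assumption holds, the sensor readings are effectively recorded with a resolution of 0.01 lx, and the long decimal tails are representation artifacts rather than extra precision.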

Correlations

        userid  value
userid  1.000   0.195
value   0.195   1.000