Dataset statistics
Number of variables | 4 |
---|---|
Number of observations | 12781814 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Total size in memory | 609.5 MiB |
Average record size in memory | 50.0 B |
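The two memory figures above can be cross-checked: the average record size should equal the total size divided by the number of observations. The following is a minimal sketch, assuming MiB means 2**20 bytes; the constant and function names are illustrative and not part of the report.

```python
# Figures copied from the Dataset statistics table.
N_OBSERVATIONS = 12_781_814
TOTAL_SIZE_MIB = 609.5

BYTES_PER_MIB = 1024 ** 2  # assumption: MiB is the binary unit (2**20 bytes)


def average_record_size(total_mib: float, n_rows: int) -> float:
    """Return the mean in-memory size of one record, in bytes."""
    return total_mib * BYTES_PER_MIB / n_rows


avg_bytes = average_record_size(TOTAL_SIZE_MIB, N_OBSERVATIONS)
print(f"{avg_bytes:.1f} B")
```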
Variable types
Text | 1 |
---|---|
Numeric | 2 |
DateTime | 1 |
Dataset
Variable descriptions
experimentid | Experiment Id |
---|---|
userid | User id |
timestamp | Timestamp shown as month (2 digits), day (2 digits), hour (2 digits), minute (2 digits), second (2 digits), decimals (3 digits) |
value | The ambient light level in SI lux units |
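The timestamp description lists fixed-width fields, so a raw value can be parsed with a single format string. The sketch below is only illustrative: it assumes the raw values are 17-digit strings with a 4-digit year in front of the listed fields, which the description does not state. `RAW_FORMAT`, `parse_timestamp` and the sample string are hypothetical.

```python
from datetime import datetime

# Assumption: a 4-digit year precedes month, day, hour, minute, second and
# three decimal digits, e.g. "20210712080008772".
RAW_FORMAT = "%Y%m%d%H%M%S%f"


def parse_timestamp(raw: str) -> datetime:
    """Parse a fixed-width, digits-only timestamp string into a datetime."""
    if len(raw) != 17 or not raw.isdigit():
        raise ValueError(f"expected 17 digits, got {raw!r}")
    return datetime.strptime(raw, RAW_FORMAT)


example = parse_timestamp("20210712080008772")  # hypothetical raw value
print(example.isoformat(sep=" "))
```

If the real files use a different layout, only `RAW_FORMAT` and the length check should need to change.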
Alerts
experimentid has constant value "wenetIndia" | Constant |
value is highly skewed (γ1 = 48.26722084) | Skewed |
userid has 906491 (7.1%) zeros | Zeros |
value has 2712356 (21.2%) zeros | Zeros |
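The percentages in the zero-count alerts follow directly from the counts and the number of observations. A small sketch of that arithmetic, with a hypothetical helper `share` that also mimics the report's "< 0.1%" display for tiny shares:

```python
def share(count: int, total: int) -> str:
    """Format count/total as a percentage with one decimal place."""
    pct = 100 * count / total
    # Tiny non-zero shares are shown as "< 0.1%", as in the report tables.
    if 0 < pct < 0.1:
        return "< 0.1%"
    return f"{pct:.1f}%"


N = 12_781_814  # number of observations

print(share(906_491, N))    # zeros in userid
print(share(2_712_356, N))  # zeros in value
```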
Reproduction
Analysis started | 2024-11-22 13:02:12.220922 |
---|---|
Analysis finished | 2024-11-22 13:02:53.757789 |
Duration | 41.54 seconds |
Software version | ydata-profiling v4.8.3 |
Download configuration | config.json |
experimentid
Text
Experiment Id
Distinct | 1 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 316.9 MiB |
Length
Max length | 10 |
---|---|
Median length | 10 |
Mean length | 10 |
Min length | 10 |
Characters and Unicode
Total characters | 127818140 |
---|---|
Distinct characters | 8 |
Distinct categories | 1 |
Distinct scripts | 1 |
Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | wenetIndia |
---|---|
2nd row | wenetIndia |
3rd row | wenetIndia |
4th row | wenetIndia |
5th row | wenetIndia |
Value | Count | Frequency (%) |
wenetindia | 12781814 | 100.0% |
Most occurring characters
Value | Count | Frequency (%) |
e | 25563628 | 20.0% |
n | 25563628 | 20.0% |
w | 12781814 | 10.0% |
t | 12781814 | 10.0% |
I | 12781814 | 10.0% |
d | 12781814 | 10.0% |
i | 12781814 | 10.0% |
a | 12781814 | 10.0% |
Most occurring categories
Value | Count | Frequency (%) |
(unknown) | 127818140 | 100.0% |
Most frequent character per category
(unknown)
Value | Count | Frequency (%) |
e | 25563628 | 20.0% |
n | 25563628 | 20.0% |
w | 12781814 | 10.0% |
t | 12781814 | 10.0% |
I | 12781814 | 10.0% |
d | 12781814 | 10.0% |
i | 12781814 | 10.0% |
a | 12781814 | 10.0% |
Most occurring scripts
Value | Count | Frequency (%) |
(unknown) | 127818140 | 100.0% |
Most frequent character per script
(unknown)
Value | Count | Frequency (%) |
e | 25563628 | 20.0% |
n | 25563628 | 20.0% |
w | 12781814 | 10.0% |
t | 12781814 | 10.0% |
I | 12781814 | 10.0% |
d | 12781814 | 10.0% |
i | 12781814 | 10.0% |
a | 12781814 | 10.0% |
Most occurring blocks
Value | Count | Frequency (%) |
(unknown) | 127818140 | 100.0% |
Most frequent character per block
(unknown)
Value | Count | Frequency (%) |
e | 25563628 | 20.0% |
n | 25563628 | 20.0% |
w | 12781814 | 10.0% |
t | 12781814 | 10.0% |
I | 12781814 | 10.0% |
d | 12781814 | 10.0% |
i | 12781814 | 10.0% |
a | 12781814 | 10.0% |
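Because experimentid is constant, the character counts can be reproduced from the single value and the row count. A minimal sketch under that assumption (variable names are illustrative; counting is case-sensitive, so "I" and "i" are separate characters):

```python
from collections import Counter

N_ROWS = 12_781_814            # observations, from the dataset statistics
CONSTANT_VALUE = "wenetIndia"  # the only value of experimentid

# Count characters in one row, then scale by the number of rows.
per_row = Counter(CONSTANT_VALUE)
totals = {ch: n * N_ROWS for ch, n in per_row.items()}
total_chars = sum(totals.values())

for ch, count in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{ch} | {count} | {100 * count / total_chars:.1f}%")
```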
userid
Numeric
User id
Distinct | 18 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 25.14817615 |
Minimum | 0 |
---|---|
Maximum | 62 |
Zeros | 906491 |
Zeros (%) | 7.1% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 195.0 MiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 9 |
median | 26 |
Q3 | 43 |
95-th percentile | 44 |
Maximum | 62 |
Range | 62 |
Interquartile range (IQR) | 34 |
Descriptive statistics
Standard deviation | 17.61430422 |
---|---|
Coefficient of variation (CV) | 0.7004207428 |
Kurtosis | -1.456394698 |
Mean | 25.14817615 |
Median Absolute Deviation (MAD) | 17 |
Skewness | 0.07417222588 |
Sum | 321439310 |
Variance | 310.2637132 |
Monotonicity | Increasing |
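Several of the userid statistics are functions of the others (variance from the standard deviation, CV from standard deviation and mean, and so on), which gives a cheap consistency check on the table. The sketch below copies the reported figures into a dictionary; the `derived` helper and the tolerance are illustrative choices, not part of the report.

```python
import math

# Figures copied from the userid statistics above.
stats = {
    "n": 12_781_814, "mean": 25.14817615, "std": 17.61430422,
    "variance": 310.2637132, "cv": 0.7004207428, "sum": 321_439_310,
    "min": 0, "max": 62, "q1": 9, "q3": 43, "range": 62, "iqr": 34,
}


def derived(s: dict) -> dict:
    """Recompute the statistics that follow from the others."""
    return {
        "variance": s["std"] ** 2,
        "cv": s["std"] / s["mean"],
        "sum": s["mean"] * s["n"],
        "range": s["max"] - s["min"],
        "iqr": s["q3"] - s["q1"],
    }


# Compare each recomputed value with the reported one.
checks = {
    name: math.isclose(value, stats[name], rel_tol=1e-6)
    for name, value in derived(stats).items()
}
for name, ok in checks.items():
    print(f"{name}: {'ok' if ok else 'MISMATCH'}")
```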
Histogram with fixed size bins (bins=18)
Common values
Value | Count | Frequency (%) |
43 | 2539649 | 19.9% |
44 | 1743825 | 13.6% |
4 | 1611935 | 12.6% |
9 | 1024831 | 8.0% |
26 | 1023970 | 8.0% |
12 | 956383 | 7.5% |
0 | 906491 | 7.1% |
8 | 589208 | 4.6% |
57 | 418335 | 3.3% |
18 | 414370 | 3.2% |
Other values (8) | 1552817 | 12.1% |
Extreme values (5 smallest)
Value | Count | Frequency (%) |
0 | 906491 | 7.1% |
4 | 1611935 | 12.6% |
8 | 589208 | 4.6% |
9 | 1024831 | 8.0% |
12 | 956383 | 7.5% |
Extreme values (5 largest)
Value | Count | Frequency (%) |
62 | 72383 | 0.6% |
57 | 418335 | 3.3% |
46 | 41547 | 0.3% |
44 | 1743825 | 13.6% |
43 | 2539649 | 19.9% |
timestamp
Date
Timestamp shown as month (2 digits), day (2 digits), hour (2 digits), minute (2 digits), second (2 digits), decimals (3 digits)
Distinct | 12747987 |
---|---|
Distinct (%) | 99.7% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 195.0 MiB |
Minimum | 2021-07-12 08:00:08.772000 |
---|---|
Maximum | 2021-08-12 14:41:36.045000 |
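The minimum and maximum give the length of the collection window, and dividing it by the number of observations gives a rough idea of the sampling density. A minimal sketch; the even-spacing figure is a simplification that ignores per-user differences and any gaps in the data.

```python
from datetime import datetime

# Minimum and maximum copied from the timestamp statistics.
first = datetime.fromisoformat("2021-07-12 08:00:08.772000")
last = datetime.fromisoformat("2021-08-12 14:41:36.045000")
n_observations = 12_781_814

span = last - first

# Mean gap between consecutive readings if they were spread evenly over
# the whole window across all users (a deliberate simplification).
mean_gap_s = span.total_seconds() / (n_observations - 1)

print(span)
print(f"{mean_gap_s:.3f} s")
```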
Histogram with fixed size bins (bins=50)
value
Numeric
The ambient light level in SI lux units
Distinct | 67510 |
---|---|
Distinct (%) | 0.5% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 390.068209 |
Minimum | 0 |
---|---|
Maximum | 445634 |
Zeros | 2712356 |
Zeros (%) | 21.2% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 195.0 MiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 2 |
median | 20 |
Q3 | 58 |
95-th percentile | 598.9199829 |
Maximum | 445634 |
Range | 445634 |
Interquartile range (IQR) | 56 |
Descriptive statistics
Standard deviation | 5491.912127 |
---|---|
Coefficient of variation (CV) | 14.07936356 |
Kurtosis | 2743.329648 |
Mean | 390.068209 |
Median Absolute Deviation (MAD) | 20 |
Skewness | 48.26722084 |
Sum | 4985779295 |
Variance | 30161098.81 |
Monotonicity | Not monotonic |
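The statistics above show why value is flagged as highly skewed: the mean (about 390) sits far above the median (20) because a few very large readings dominate. The sketch below illustrates the effect on a small hypothetical sample, using the population form of skewness (third central moment divided by variance to the power 1.5). The profiler may use a bias-corrected estimator, so this is not guaranteed to match its exact formula.

```python
import math


def skewness(xs: list[float]) -> float:
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5


# Hypothetical lux readings: mostly dim, with one very bright outlier.
readings = [0, 0, 2, 5, 20, 20, 58, 60, 400_000]

raw_skew = skewness(readings)
# log1p keeps zeros defined and compresses the long right tail.
log_skew = skewness([math.log1p(x) for x in readings])

print(f"raw skewness   = {raw_skew:.2f}")
print(f"log1p skewness = {log_skew:.2f}")
```

A log transform like this is one common way to tame such a tail before plotting or modelling; whether it suits this dataset depends on the analysis.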
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%) |
0 | 2712356 | 21.2% |
2 | 195477 | 1.5% |
1 | 163086 | 1.3% |
11 | 150199 | 1.2% |
6 | 143769 | 1.1% |
17 | 132685 | 1.0% |
14 | 130219 | 1.0% |
22 | 128912 | 1.0% |
12 | 125469 | 1.0% |
4 | 124545 | 1.0% |
Other values (67500) | 8775097 | 68.7% |
Extreme values (5 smallest)
Value | Count | Frequency (%) |
0 | 2712356 | 21.2% |
0.009999999776 | 85 | < 0.1% |
0.01999999955 | 154 | < 0.1% |
0.02999999933 | 28 | < 0.1% |
0.03999999911 | 14 | < 0.1% |
Extreme values (5 largest)
Value | Count | Frequency (%) |
445634 | 1 | < 0.1% |
369805 | 1 | < 0.1% |
353750 | 1 | < 0.1% |
353676 | 1 | < 0.1% |
353470 | 1 | < 0.1% |
Correlations
 | userid | value |
---|---|---|
userid | 1.000 | 0.195 |
value | 0.195 | 1.000 |
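The matrix above is symmetric with ones on the diagonal, as any correlation matrix must be. The report text does not say which coefficient was used, so the following sketch shows the Pearson coefficient purely as an illustration of how such a matrix is built; the sample pairs are hypothetical and do not reproduce the 0.195 figure.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical (userid, value) pairs, for illustration only.
userids = [0, 4, 9, 26, 43, 44]
values = [20, 0, 58, 2, 11, 600]

columns = (userids, values)
matrix = [[pearson(a, b) for b in columns] for a in columns]

for row in matrix:
    print(" | ".join(f"{r:.3f}" for r in row))
```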