Dataset statistics
Number of variables | 4 |
---|---|
Number of observations | 642940848 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Total size in memory | 29.9 GiB |
Average record size in memory | 50.0 B |
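The total size and the average record size above are two views of the same figure. A minimal sketch of the arithmetic, using only the numbers reported in the table (the variable names are illustrative):

```python
GIB = 2**30  # bytes per gibibyte

n_rows = 642_940_848      # "Number of observations"
avg_record_bytes = 50.0   # "Average record size in memory"

# Total footprint implied by the per-record average.
total_bytes = n_rows * avg_record_bytes
total_gib = total_bytes / GIB

print(f"{total_gib:.1f} GiB")  # prints "29.9 GiB"
```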
Variable types
Text | 1 |
---|---|
Numeric | 2 |
DateTime | 1 |
Dataset
Variable descriptions
experimentid | Experiment Id |
---|---|
userid | User id |
timestamp | show month(2), day(2), hour(2), minute(2), second(2), decimals(3) |
value | The ambient light level in SI lux units |
Alerts
experimentid has constant value "wenetItaly" | Constant |
---|---|
value is highly skewed (γ1 = 4506.746548) | Skewed |
value has 204613392 (31.8%) zeros | Zeros |
Reproduction
Analysis started | 2024-11-23 11:40:18.781881 |
---|---|
Analysis finished | 2024-11-23 12:30:33.701083 |
Duration | 50 minutes and 14.92 seconds |
Software version | ydata-profiling v4.8.3 |
Download configuration | config.json |
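A report of this kind could be regenerated along the following lines. This is a sketch only: the original config.json is not reproduced here, so the title and the choice of options are assumptions, and `build_report` is a hypothetical helper name. `ProfileReport` and its `variables={"descriptions": ...}` option come from ydata-profiling.

```python
# Column descriptions copied from the "Variable descriptions" table.
VARIABLE_DESCRIPTIONS = {
    "experimentid": "Experiment Id",
    "userid": "User id",
    "timestamp": "show month(2), day(2), hour(2), minute(2), second(2), decimals(3)",
    "value": "The ambient light level in SI lux units",
}


def build_report(df, title="Ambient light profile"):
    """Build a profiling report for a DataFrame holding the four columns above.

    Assumption: the settings are illustrative and may differ from the
    configuration that produced the original report.
    """
    # Imported lazily so this module loads even where the package is absent.
    from ydata_profiling import ProfileReport

    return ProfileReport(
        df,
        title=title,  # hypothetical title
        variables={"descriptions": VARIABLE_DESCRIPTIONS},
    )
```

With a DataFrame `df` in hand, `build_report(df).to_file("report.html")` would write the HTML report.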
experimentid
Text
Experiment Id
Distinct | 1 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 15.6 GiB |
Length
Max length | 10 |
---|---|
Median length | 10 |
Mean length | 10 |
Min length | 10 |
Characters and Unicode
Total characters | 6429408480 |
---|---|
Distinct characters | 8 |
Distinct categories | 1 |
Distinct scripts | 1 |
Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
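Because experimentid is constant, the character totals follow directly from the single string and the row count. The sketch below reproduces them with the standard library and also looks up each character's Unicode general category. The report itself lists categories, scripts and blocks as "(unknown)", so the category lookup only illustrates what such properties look like and is not part of the report's output.

```python
import unicodedata
from collections import Counter

value = "wenetItaly"     # the constant experimentid value
n_rows = 642_940_848     # number of observations

# Occurrences of each character within one string.
per_string = Counter(value)

# Scale by the row count to get dataset-wide totals.
totals = {ch: count * n_rows for ch, count in per_string.items()}

# Unicode general category per character, e.g. "Ll" (lowercase letter).
categories = {ch: unicodedata.category(ch) for ch in per_string}

print(len(per_string))        # 8 distinct characters
print(sum(totals.values()))   # 6429408480 total characters
```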
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | wenetItaly |
---|---|
2nd row | wenetItaly |
3rd row | wenetItaly |
4th row | wenetItaly |
5th row | wenetItaly |
Value | Count | Frequency (%) |
---|---|---|
wenetitaly | 642940848 | 100.0% |
Most occurring characters
Value | Count | Frequency (%) |
---|---|---|
e | 1285881696 | 20.0% |
t | 1285881696 | 20.0% |
w | 642940848 | 10.0% |
n | 642940848 | 10.0% |
I | 642940848 | 10.0% |
a | 642940848 | 10.0% |
l | 642940848 | 10.0% |
y | 642940848 | 10.0% |
Most occurring categories
Value | Count | Frequency (%) |
---|---|---|
(unknown) | 6429408480 | 100.0% |
Most frequent character per category
(unknown)
Value | Count | Frequency (%) |
---|---|---|
e | 1285881696 | 20.0% |
t | 1285881696 | 20.0% |
w | 642940848 | 10.0% |
n | 642940848 | 10.0% |
I | 642940848 | 10.0% |
a | 642940848 | 10.0% |
l | 642940848 | 10.0% |
y | 642940848 | 10.0% |
Most occurring scripts
Value | Count | Frequency (%) |
---|---|---|
(unknown) | 6429408480 | 100.0% |
Most frequent character per script
(unknown)
Value | Count | Frequency (%) |
---|---|---|
e | 1285881696 | 20.0% |
t | 1285881696 | 20.0% |
w | 642940848 | 10.0% |
n | 642940848 | 10.0% |
I | 642940848 | 10.0% |
a | 642940848 | 10.0% |
l | 642940848 | 10.0% |
y | 642940848 | 10.0% |
Most occurring blocks
Value | Count | Frequency (%) |
---|---|---|
(unknown) | 6429408480 | 100.0% |
Most frequent character per block
(unknown)
Value | Count | Frequency (%) |
---|---|---|
e | 1285881696 | 20.0% |
t | 1285881696 | 20.0% |
w | 642940848 | 10.0% |
n | 642940848 | 10.0% |
I | 642940848 | 10.0% |
a | 642940848 | 10.0% |
l | 642940848 | 10.0% |
y | 642940848 | 10.0% |
userid
Real number (ℝ)
User id
Distinct | 214 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 122.760921 |
Minimum | 0 |
---|---|
Maximum | 265 |
Zeros | 553319 |
Zeros (%) | 0.1% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 9.6 GiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 6 |
Q1 | 55 |
median | 119 |
Q3 | 200 |
95-th percentile | 254 |
Maximum | 265 |
Range | 265 |
Interquartile range (IQR) | 145 |
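The range and interquartile range are derived from the other rows: range = maximum - minimum and IQR = Q3 - Q1. A small sketch of the same summary computed from raw values with the standard library follows. `quantile_summary` is an illustrative helper, and its interpolation method is an assumption that may differ from the one the profiler uses.

```python
import statistics


def quantile_summary(values):
    """Return min, quartiles, max, range and IQR for a sequence of numbers."""
    # "inclusive" treats the data as a full population, not a sample.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    lo, hi = min(values), max(values)
    return {
        "min": lo,
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": hi,
        "range": hi - lo,
        "iqr": q3 - q1,
    }


# Consistency check against the figures reported for userid.
reported = {"min": 0, "q1": 55, "q3": 200, "max": 265}
assert reported["max"] - reported["min"] == 265   # Range
assert reported["q3"] - reported["q1"] == 145     # IQR
```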
Descriptive statistics
Standard deviation | 80.38110694 |
---|---|
Coefficient of variation (CV) | 0.6547776465 |
Kurtosis | -1.253834288 |
Mean | 122.760921 |
Median Absolute Deviation (MAD) | 72 |
Skewness | 0.1267212358 |
Sum | 7.892801065 × 10¹⁰ |
Variance | 6461.122353 |
Monotonicity | Increasing |
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%) |
---|---|---|
5 | 17251045 | 2.7% |
58 | 16510055 | 2.6% |
134 | 15264417 | 2.4% |
225 | 15101079 | 2.3% |
40 | 14664449 | 2.3% |
91 | 14436934 | 2.2% |
15 | 13610247 | 2.1% |
99 | 13601091 | 2.1% |
258 | 13467124 | 2.1% |
158 | 13337248 | 2.1% |
Other values (204) | 495697159 | 77.1% |
Minimum 5 values
Value | Count | Frequency (%) |
---|---|---|
0 | 553319 | 0.1% |
1 | 7671609 | 1.2% |
2 | 1142716 | 0.2% |
3 | 989565 | 0.2% |
4 | 2057941 | 0.3% |
Maximum 5 values
Value | Count | Frequency (%) |
---|---|---|
265 | 3393983 | 0.5% |
264 | 45140 | < 0.1% |
263 | 1862719 | 0.3% |
262 | 1174411 | 0.2% |
260 | 31436 | < 0.1% |
timestamp
Date
show month(2), day(2), hour(2), minute(2), second(2), decimals(3)
Distinct | 553986674 |
---|---|
Distinct (%) | 86.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 9.6 GiB |
Minimum | 2020-11-16 07:00:00.008000 |
---|---|
Maximum | 2020-12-11 21:59:59.998000 |
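The minimum and maximum timestamps give the length of the collection window, and together with the row and user counts they yield a rough average sampling rate. The sketch below assumes, purely for illustration, that readings are spread evenly over the window and over the 214 distinct userid values, which the report does not claim.

```python
from datetime import datetime

start = datetime.fromisoformat("2020-11-16 07:00:00.008000")  # Minimum
end = datetime.fromisoformat("2020-12-11 21:59:59.998000")    # Maximum
span = end - start

n_rows = 642_940_848   # number of observations
n_users = 214          # distinct userid values

# Average rates under the even-spread assumption stated above.
rows_per_second = n_rows / span.total_seconds()
rows_per_second_per_user = rows_per_second / n_users

print(span)  # 25 days, 14:59:59.990000
```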
Histogram with fixed size bins (bins=50)
value
Real number (ℝ)
The ambient light level in SI lux units
Distinct | 404636 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 155.9931373 |
Minimum | -71238 |
---|---|
Maximum | 21474836 |
Zeros | 204613392 |
Zeros (%) | 31.8% |
Negative | 116 |
Negative (%) | < 0.1% |
Memory size | 9.6 GiB |
Quantile statistics
Minimum | -71238 |
---|---|
5-th percentile | 0 |
Q1 | 0 |
median | 9 |
Q3 | 59 |
95-th percentile | 476 |
Maximum | 21474836 |
Range | 21546074 |
Interquartile range (IQR) | 59 |
Descriptive statistics
Standard deviation | 2742.855611 |
---|---|
Coefficient of variation (CV) | 17.58318128 |
Kurtosis | 35068686.29 |
Mean | 155.9931373 |
Median Absolute Deviation (MAD) | 9 |
Skewness | 4506.746548 |
Sum | 1.0029436 × 10¹¹ |
Variance | 7523256.904 |
Monotonicity | Not monotonic |
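Several of the figures above are functions of the others: the variance is the square of the standard deviation, and the coefficient of variation is the standard deviation divided by the mean. The sketch below shows one way to compute these together with skewness and excess kurtosis from raw values. It uses plain moment estimators for skewness and kurtosis, which is an assumption: the profiler may apply bias corrections, although at this sample size the difference would be negligible. `moment_summary` is an illustrative name.

```python
import math


def moment_summary(values):
    """Mean, sample std, CV, skewness and excess kurtosis of a sequence."""
    n = len(values)
    mean = sum(values) / n
    # Central moments (population form).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    std = math.sqrt(m2 * n / (n - 1))  # sample standard deviation
    return {
        "mean": mean,
        "std": std,
        "cv": std / mean,
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3,
    }


# Consistency check on the figures reported for value.
reported_std, reported_mean = 2742.855611, 155.9931373
cv = reported_std / reported_mean   # about 17.583, matching the CV row
variance = reported_std ** 2        # about 7523256.9, matching the Variance row
```

A skewness in the thousands, as reported here, means a handful of extreme readings dominate the third moment. The median (9) and MAD (9) describe the bulk of the data far better than the mean does.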
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%) |
---|---|---|
0 | 204613392 | 31.8% |
1 | 25694079 | 4.0% |
2 | 17382850 | 2.7% |
3 | 13540948 | 2.1% |
5 | 9098039 | 1.4% |
4 | 8934894 | 1.4% |
6 | 8839110 | 1.4% |
9 | 7686629 | 1.2% |
7 | 7435170 | 1.2% |
8 | 6164538 | 1.0% |
Other values (404626) | 333551199 | 51.9% |
Minimum 5 values
Value | Count | Frequency (%) |
---|---|---|
-71238 | 1 | < 0.1% |
-71070 | 1 | < 0.1% |
-70591 | 1 | < 0.1% |
-70189 | 1 | < 0.1% |
-70158 | 1 | < 0.1% |
Maximum 5 values
Value | Count | Frequency (%) |
---|---|---|
21474836 | 6 | < 0.1% |
1190695.25 | 5 | < 0.1% |
1185558.125 | 1 | < 0.1% |
1135853.875 | 1 | < 0.1% |
894393.1875 | 1 | < 0.1% |
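The extremes of value deserve a second look before any modelling. Illuminance in lux cannot be negative, yet 116 readings are. The maximum, 21474836, equals (2³¹ - 1) // 100, which may indicate a saturated or sentinel 32-bit reading. That interpretation is an assumption, not something the report states. A minimal sketch of a flagging helper (`classify_lux` is a hypothetical name):

```python
INT32_MAX = 2**31 - 1
# Assumption: readings at this exact level are saturated or sentinel values.
SUSPECT_CEILING = INT32_MAX // 100  # 21474836, the reported maximum


def classify_lux(value):
    """Label a lux reading as 'negative', 'ceiling' or 'ok'."""
    if value < 0:
        return "negative"   # physically impossible for illuminance
    if value == SUSPECT_CEILING:
        return "ceiling"    # matches the suspected saturation level
    return "ok"


print(classify_lux(-71238))    # negative
print(classify_lux(21474836))  # ceiling
print(classify_lux(9))         # ok
```

Whether flagged rows should be dropped, clipped or kept depends on how the sensor encodes errors, which the report does not describe.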
Correlations
 | userid | value |
---|---|---|
userid | 1.000 | 0.010 |
value | 0.010 | 1.000 |
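A coefficient of 0.010 indicates essentially no linear or monotonic association between userid and value, which is expected for an arbitrary identifier. The report does not say which coefficient was used. The sketch below implements the Pearson coefficient as one possibility, and it is not necessarily the measure behind the table.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(ss_x * ss_y)
```

The constant-sequence guard is also why experimentid cannot appear in such a matrix: a column with a single value has zero variance.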