Dataset statistics
Number of variables | 3 |
---|---|
Number of observations | 5097489 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Total size in memory | 204.2 MiB |
Average record size in memory | 42.0 B |
Variable types
Text | 1 |
---|---|
Numeric | 1 |
DateTime | 1 |
Dataset
Variable descriptions
experimentid | Experiment Id |
---|---|
userid | User id |
timestamp | show month(2), day(2), hour(2), minute(2), second(2), decimals(3) |
Alerts
experimentid has constant value "wenetIndia" | Constant |
---|---|
userid has 2931971 (57.5%) zeros | Zeros |
Reproduction
Analysis started | 2024-11-22 12:32:22.465145 |
---|---|
Analysis finished | 2024-11-22 12:32:36.851021 |
Duration | 14.39 seconds |
Software version | ydata-profiling v4.8.3 |
Download configuration | config.json |
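The reported duration can be cross-checked from the two timestamps in the table above. The following is a minimal sketch using only the Python standard library; the throughput figure is a derived quantity that the report itself does not state, and the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction table.
started = datetime.fromisoformat("2024-11-22 12:32:22.465145")
finished = datetime.fromisoformat("2024-11-22 12:32:36.851021")

# "Number of observations" from the Dataset statistics table.
rows = 5_097_489

duration_s = (finished - started).total_seconds()
rows_per_second = rows / duration_s  # derived, not part of the report

print(f"duration: {duration_s:.2f} s")
print(f"throughput: {rows_per_second:,.0f} rows/s")
```

The computed duration rounds to the 14.39 seconds shown in the table.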
experimentid
Text
Experiment Id
Distinct | 1 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 126.4 MiB |
Length
Max length | 10 |
---|---|
Median length | 10 |
Mean length | 10 |
Min length | 10 |
Characters and Unicode
Total characters | 50974890 |
---|---|
Distinct characters | 8 |
Distinct categories | 1 |
Distinct scripts | 1 |
Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
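As an illustration of what such an analysis involves, the sketch below recomputes the character totals for the constant value "wenetIndia" and looks up each character's general category with Python's standard `unicodedata` module. This is not the profiler's own code. Note that the tables in this report list a single "(unknown)" category, script, and block, whereas the standard-library lookup distinguishes lowercase from uppercase letters; presumably the Unicode property lookup was unavailable in the profiling run, but that is an assumption.

```python
import unicodedata
from collections import Counter

value = "wenetIndia"  # constant value of experimentid
rows = 5_097_489      # number of observations

# Character counts in one string, scaled to the whole column.
per_string = Counter(value)
totals = {ch: n * rows for ch, n in per_string.items()}

# Aggregate the totals by Unicode general category (e.g. Ll, Lu).
categories = Counter()
for ch, n in totals.items():
    categories[unicodedata.category(ch)] += n

total_characters = sum(totals.values())
for ch, n in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{ch!r}: {n} ({n / total_characters:.1%})")
print(dict(categories))
```

The totals agree with the report: 8 distinct characters and 50974890 characters overall, with `e` and `n` each occurring twice per value.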
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | wenetIndia |
---|---|
2nd row | wenetIndia |
3rd row | wenetIndia |
4th row | wenetIndia |
5th row | wenetIndia |
Value | Count | Frequency (%) |
wenetindia | 5097489 | 100.0% |
Most occurring characters
Value | Count | Frequency (%) |
e | 10194978 | 20.0% |
n | 10194978 | 20.0% |
w | 5097489 | 10.0% |
t | 5097489 | 10.0% |
I | 5097489 | 10.0% |
d | 5097489 | 10.0% |
i | 5097489 | 10.0% |
a | 5097489 | 10.0% |
Most occurring categories
Value | Count | Frequency (%) |
(unknown) | 50974890 | 100.0% |
Most frequent character per category
(unknown)
Value | Count | Frequency (%) |
e | 10194978 | 20.0% |
n | 10194978 | 20.0% |
w | 5097489 | 10.0% |
t | 5097489 | 10.0% |
I | 5097489 | 10.0% |
d | 5097489 | 10.0% |
i | 5097489 | 10.0% |
a | 5097489 | 10.0% |
Most occurring scripts
Value | Count | Frequency (%) |
(unknown) | 50974890 | 100.0% |
Most frequent character per script
(unknown)
Value | Count | Frequency (%) |
e | 10194978 | 20.0% |
n | 10194978 | 20.0% |
w | 5097489 | 10.0% |
t | 5097489 | 10.0% |
I | 5097489 | 10.0% |
d | 5097489 | 10.0% |
i | 5097489 | 10.0% |
a | 5097489 | 10.0% |
Most occurring blocks
Value | Count | Frequency (%) |
(unknown) | 50974890 | 100.0% |
Most frequent character per block
(unknown)
Value | Count | Frequency (%) |
e | 10194978 | 20.0% |
n | 10194978 | 20.0% |
w | 5097489 | 10.0% |
t | 5097489 | 10.0% |
I | 5097489 | 10.0% |
d | 5097489 | 10.0% |
i | 5097489 | 10.0% |
a | 5097489 | 10.0% |
userid
Numeric
User id
Distinct | 12 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 8.750704121 |
Minimum | 0 |
---|---|
Maximum | 62 |
Zeros | 2931971 |
Zeros (%) | 57.5% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 77.8 MiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 0 |
median | 0 |
Q3 | 12 |
95-th percentile | 43 |
Maximum | 62 |
Range | 62 |
Interquartile range (IQR) | 12 |
Descriptive statistics
Standard deviation | 14.39924737 |
---|---|
Coefficient of variation (CV) | 1.645495856 |
Kurtosis | 5.380385753 |
Mean | 8.750704121 |
Median Absolute Deviation (MAD) | 0 |
Skewness | 2.328552181 |
Sum | 44606618 |
Variance | 207.3383247 |
Monotonicity | Increasing |
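Several of the descriptive statistics above are related by simple formulas, so they can be checked against each other. The sketch below does this with the reported values only: the mean as sum divided by number of observations, the coefficient of variation as standard deviation divided by mean, and the variance as the squared standard deviation. Variable names are illustrative.

```python
# Values copied from the report.
n = 5_097_489        # Number of observations
total = 44_606_618   # Sum
std = 14.39924737    # Standard deviation

mean = total / n     # reported Mean: 8.750704121
cv = std / mean      # reported CV: 1.645495856
variance = std ** 2  # reported Variance: 207.3383247

print(f"mean     = {mean:.9f}")
print(f"cv       = {cv:.9f}")
print(f"variance = {variance:.7f}")
```

Small differences in the last digit are to be expected, since the standard deviation is itself a rounded figure.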
Histogram with fixed size bins (bins=12)
Common values
Value | Count | Frequency (%) |
0 | 2931971 | 57.5% |
12 | 1211469 | 23.8% |
17 | 519385 | 10.2% |
62 | 156086 | 3.1% |
35 | 85469 | 1.7% |
43 | 85023 | 1.7% |
57 | 69208 | 1.4% |
4 | 12676 | 0.2% |
49 | 10594 | 0.2% |
25 | 10547 | 0.2% |
Other values (2) | 5061 | 0.1% |
Minimum 5 values
Value | Count | Frequency (%) |
0 | 2931971 | 57.5% |
4 | 12676 | 0.2% |
12 | 1211469 | 23.8% |
17 | 519385 | 10.2% |
22 | 890 | < 0.1% |
Maximum 5 values
Value | Count | Frequency (%) |
62 | 156086 | 3.1% |
57 | 69208 | 1.4% |
49 | 10594 | 0.2% |
43 | 85023 | 1.7% |
35 | 85469 | 1.7% |
timestamp
Date
show month(2), day(2), hour(2), minute(2), second(2), decimals(3)
Distinct | 5093826 |
---|---|
Distinct (%) | 99.9% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 77.8 MiB |
Minimum | 2021-07-12 08:00:01.384000 |
---|---|
Maximum | 2021-08-12 13:39:46.970000 |
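The minimum and maximum above, together with the distinct count, give a quick sense of the collection window and of how many rows share a timestamp with another row. The sketch below derives both with the standard library; neither figure is stated in the report itself, and the variable names are illustrative.

```python
from datetime import datetime

# Values copied from the timestamp tables.
first = datetime.fromisoformat("2021-07-12 08:00:01.384000")
last = datetime.fromisoformat("2021-08-12 13:39:46.970000")
rows = 5_097_489      # Number of observations
distinct = 5_093_826  # Distinct timestamps

span = last - first         # length of the collection window
repeated = rows - distinct  # rows beyond the first for each timestamp

print(f"span: {span}")
print(f"rows repeating an earlier timestamp: {repeated}")
print(f"mean rate: {rows / span.total_seconds():.2f} rows/s")
```

The window covers 31 days plus a few hours, and 3663 rows repeat a timestamp that already occurs elsewhere, consistent with the 99.9% distinct share.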
Histogram with fixed size bins (bins=50)