Overview

Dataset statistics

Number of variables: 10
Number of observations: 68890509
Missing cells: 2
Missing cells (%): < 0.1%
Total size in memory: 6.7 GiB
Average record size in memory: 104.9 B
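
As a rough cross-check of the figures above, the average record size is the total memory footprint divided by the number of observations. The sketch below works from the rounded 6.7 GiB shown in the report, so it yields roughly 104.4 B rather than the reported 104.9 B; the difference is consistent with the total having been rounded to one decimal place.

```python
# Cross-check: average record size = total bytes / number of observations.
# Inputs are the rounded figures from the report, so the result is approximate.
GIB = 2**30  # bytes per gibibyte

total_bytes = 6.7 * GIB
n_observations = 68_890_509

avg_record_bytes = total_bytes / n_observations
print(f"{avg_record_bytes:.1f} B per record")
```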

Variable types

Text: 2
Numeric: 7
DateTime: 1

Dataset

Description: [unitless] Returns the Points of Interest (POIs) surrounding the geocoordinates where the phone is located. POIs are extracted every 5 minutes. To make the sensor observations comparable, the frequency was reduced to one minute. The first non-missing name is reported for each of the categorical variables.
Creator: Andrea Bontempelli, Matteo Busso, Roy Alia Asiku
Author: Andrea Bontempelli, Matteo Busso, Fausto Giunchiglia
URL:
Copyright: (c) University of Trento - Knowledge Diversity 2023

Variable descriptions

experimentid: Experiment Id
userid: User id
timestamp: show month(2), day(2), hour(2), minute(2), second(2), decimals(3)
accuracy: The GPS accuracy in meters
bearing: The compass direction from the current position to the intended destination. Bearing is measured in degrees and calculated clockwise from true north (e.g., the bearing for the direction of east is 090°)
provider: Indicates whether the coordinates were found using the network/Wi-Fi or using GPS
speed: The speed of the device over ground, measured in meters/second
latitude: Geographic coordinate that specifies the N/S position. Latitude is an angle which ranges from 0° at the Equator to 90° at the poles. It is expressed in decimal degrees.
longitude: Geographic coordinate that specifies the E/W position. Longitude is an angle which ranges from 0° at the prime meridian to 180°. It is expressed in decimal degrees.
altitude: Elevation above sea level in meters.

Alerts

experimentid has constant value "wenetItaly" (Constant)
altitude is highly correlated overall with bearing and 1 other field (High correlation)
bearing is highly correlated overall with altitude and 1 other field (High correlation)
speed is highly correlated overall with altitude and 1 other field (High correlation)
altitude is highly skewed (γ1 = 21.62002563) (Skewed)
bearing has 16460209 (23.9%) zeros (Zeros)
speed has 27335270 (39.7%) zeros (Zeros)
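
The frequency tables for bearing, speed and altitude later in this report show -1 as a very common value (about 34% of rows for bearing and altitude). One plausible reading, which the report itself does not confirm, is that -1 marks an unavailable reading; if so, the shared placeholder could inflate the correlations among these fields. A minimal sketch of masking such values before analysis follows. The `SENTINEL` constant and the `mask_sentinels` helper are hypothetical names introduced for illustration only.

```python
# Assumption (not stated in the report): -1 is a placeholder for "no reading".
SENTINEL = -1.0
SENTINEL_FIELDS = ("bearing", "speed", "altitude")


def mask_sentinels(record: dict) -> dict:
    """Return a copy of the record with sentinel readings replaced by None."""
    cleaned = dict(record)
    for field in SENTINEL_FIELDS:
        if cleaned.get(field) == SENTINEL:
            cleaned[field] = None
    return cleaned


# Hypothetical row, for illustration only.
row = {"userid": 58, "bearing": -1.0, "speed": 0.0, "altitude": -1.0}
print(mask_sentinels(row))
```

Genuine zeros (for example a stationary device with speed 0) are left untouched, since only the assumed placeholder value is masked.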

Reproduction

Analysis started: 2024-11-24 09:52:50.047132
Analysis finished: 2024-11-24 10:03:59.820722
Duration: 11 minutes and 9.77 seconds
Software version: ydata-profiling v4.8.3
Download configuration: config.json
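
The reported duration follows directly from the two timestamps above. A small sketch using only the standard library reproduces it:

```python
from datetime import datetime

# Start and end timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-11-24 09:52:50.047132")
finished = datetime.fromisoformat("2024-11-24 10:03:59.820722")

elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```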

Variables

experimentid
Text

CONSTANT 

Experiment Id

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 GiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 688905090
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: wenetItaly
2nd row: wenetItaly
3rd row: wenetItaly
4th row: wenetItaly
5th row: wenetItaly

Value  Count  Frequency (%)
wenetitaly  68890509  100.0%

Most occurring characters

Value  Count  Frequency (%)
e  137781018  20.0%
t  137781018  20.0%
w  68890509  10.0%
n  68890509  10.0%
I  68890509  10.0%
a  68890509  10.0%
l  68890509  10.0%
y  68890509  10.0%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  688905090  100.0%

Most frequent character per category

(unknown)
Value  Count  Frequency (%)
e  137781018  20.0%
t  137781018  20.0%
w  68890509  10.0%
n  68890509  10.0%
I  68890509  10.0%
a  68890509  10.0%
l  68890509  10.0%
y  68890509  10.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  688905090  100.0%

Most frequent character per script

(unknown)
Value  Count  Frequency (%)
e  137781018  20.0%
t  137781018  20.0%
w  68890509  10.0%
n  68890509  10.0%
I  68890509  10.0%
a  68890509  10.0%
l  68890509  10.0%
y  68890509  10.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  688905090  100.0%

Most frequent character per block

(unknown)
Value  Count  Frequency (%)
e  137781018  20.0%
t  137781018  20.0%
w  68890509  10.0%
n  68890509  10.0%
I  68890509  10.0%
a  68890509  10.0%
l  68890509  10.0%
y  68890509  10.0%

userid
Real number (ℝ)

User id

Distinct: 206
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 110.6713333
Minimum: 0
Maximum: 264
Zeros: 1998
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 GiB

Quantile statistics

Minimum: 0
5-th percentile: 8
Q1: 58
median: 78
Q3: 163
95-th percentile: 249
Maximum: 264
Range: 264
Interquartile range (IQR): 105

Descriptive statistics

Standard deviation: 75.38776589
Coefficient of variation (CV): 0.6811860276
Kurtosis: -1.0746079
Mean: 110.6713333
Median Absolute Deviation (MAD): 52
Skewness: 0.4369244871
Sum: 7624204482
Variance: 5683.315246
Monotonicity: Increasing
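
Several of the derived statistics above follow from the others by simple arithmetic. The sketch below recomputes them from the reported userid values as a consistency check; it illustrates the standard definitions and is not a description of how the profiling tool computes them internally.

```python
# Reported summary values for userid, copied from the tables above.
mean = 110.6713333
std = 75.38776589
q1, q3 = 58, 163
minimum, maximum = 0, 264

variance = std ** 2              # variance is the squared standard deviation
cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"variance={variance:.6f} cv={cv:.10f} iqr={iqr} range={value_range}")
```

The same relations can be applied to the other numeric variables in this report.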
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
58  15792740  22.9%
163  5144969  7.5%
8  3651336  5.3%
75  2332490  3.4%
212  1837840  2.7%
114  1820529  2.6%
32  1341808  1.9%
2  1164729  1.7%
30  1130396  1.6%
66  1074908  1.6%
Other values (196)  33598764  48.8%

Value  Count  Frequency (%)
0  1998  < 0.1%
1  184570  0.3%
2  1164729  1.7%
3  4392  < 0.1%
4  66013  0.1%

Value  Count  Frequency (%)
264  745  < 0.1%
263  304455  0.4%
262  180703  0.3%
260  1067  < 0.1%
259  54482  0.1%

timestamp
Date

show month(2), day(2), hour(2), minute(2), second(2), decimals(3)

Distinct: 67653754
Distinct (%): 98.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 GiB
Minimum: 2020-11-16 07:00:00.173000
Maximum: 2020-12-11 21:59:58.190000
Histogram with fixed size bins (bins=50)

accuracy
Real number (ℝ)

The GPS accuracy in meters

Distinct: 659212
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 101.9284606
Minimum: 0
Maximum: 24973.80078
Zeros: 15068
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 GiB

Quantile statistics

Minimum: 0
5-th percentile: 1.100000024
Q1: 4.300000191
median: 17
Q3: 25.80500031
95-th percentile: 260.4960022
Maximum: 24973.80078
Range: 24973.80078
Interquartile range (IQR): 21.50500011

Descriptive statistics

Standard deviation: 406.2640445
Coefficient of variation (CV): 3.985776321
Kurtosis: 114.3825893
Mean: 101.9284606
Median Absolute Deviation (MAD): 12
Skewness: 7.416205284
Sum: 7021903529
Variance: 165050.4739
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
1.100000024  5394263  7.8%
20  5389768  7.8%
1  1570597  2.3%
19  1245597  1.8%
4  1077793  1.6%
3.790092468  970512  1.4%
1.5  758206  1.1%
16  684760  1.0%
3  667841  1.0%
24  659561  1.0%
Other values (659202)  50471611  73.3%

Value  Count  Frequency (%)
0  15068  < 0.1%
0.5  1278  < 0.1%
0.75  4702  < 0.1%
0.8000000119  6  < 0.1%
0.8999999762  5414  < 0.1%

Value  Count  Frequency (%)
24973.80078  2  < 0.1%
24869.46484  2  < 0.1%
24769.19531  2  < 0.1%
24674.49609  2  < 0.1%
24582.30859  2  < 0.1%

bearing
Real number (ℝ)

HIGH CORRELATION  ZEROS 

The compass direction from the current position to the intended destination. Bearing is measured in degrees and calculated clockwise from true north (e.g., the bearing for the direction of east is 090°)
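
To illustrate the clockwise-from-true-north convention, here is a sketch of the standard great-circle initial-bearing formula between two coordinates. It is not necessarily how the values in this dataset were obtained, and `initial_bearing` is a hypothetical helper name.

```python
import math


def initial_bearing(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Initial great-circle bearing from point 1 to point 2, in degrees [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    # East-west and north-south components of the direction of travel.
    x = math.sin(dlon) * math.cos(phi2)
    y = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    # atan2(x, y) measures the angle clockwise from north; wrap into [0, 360).
    return math.degrees(math.atan2(x, y)) % 360.0


# Heading due east along the Equator gives a bearing of 90 degrees.
print(initial_bearing(0.0, 0.0, 0.0, 1.0))
```

Under this convention north is 0°, east 90°, south 180° and west 270°.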

Distinct: 35999
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 75.13230873
Minimum: -1
Maximum: 360
Zeros: 16460209
Zeros (%): 23.9%
Negative: 23559209
Negative (%): 34.2%
Memory size: 1.0 GiB

Quantile statistics

Minimum: -1
5-th percentile: -1
Q1: -1
median: 0
Q3: 137.8
95-th percentile: 330.01
Maximum: 360
Range: 361
Interquartile range (IQR): 138.8

Descriptive statistics

Standard deviation: 115.4066117
Coefficient of variation (CV): 1.536045061
Kurtosis: 0.03945932793
Mean: 75.13230873
Median Absolute Deviation (MAD): 1
Skewness: 1.251522082
Sum: 5175902991
Variance: 13318.68603
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
-1  23559209  34.2%
0  16460209  23.9%
324.68  373720  0.5%
330.01  373569  0.5%
342.29  336204  0.5%
15.53  310400  0.5%
6.78  265217  0.4%
113.41  249701  0.4%
346.92  246026  0.4%
342.39  243279  0.4%
Other values (35989)  26472975  38.4%

Value  Count  Frequency (%)
-1  23559209  34.2%
0  16460209  23.9%
0.01  1484  < 0.1%
0.02  117  < 0.1%
0.03  224  < 0.1%

Value  Count  Frequency (%)
360  243  < 0.1%
359.99  406  < 0.1%
359.98  964  < 0.1%
359.97  237  < 0.1%
359.96  176  < 0.1%

provider
Text

Indicates whether the coordinates were found using the network/Wi-Fi or using GPS

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 GiB

Length

Max length: 7
Median length: 7
Mean length: 6.875010998
Min length: 3
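
The length statistics can be recovered from the value counts reported for this variable (passive, gps, network). The sketch below recomputes the total character count and the mean length from those counts:

```python
# Value counts for the provider column, as listed in this report.
counts = {"passive": 65_200_765, "gps": 2_152_639, "network": 1_537_105}

n_rows = sum(counts.values())
total_chars = sum(len(value) * n for value, n in counts.items())
mean_length = total_chars / n_rows

print(n_rows, total_chars, round(mean_length, 9))
```

The mean sits just below 7 because only the 3-character value "gps" (about 3.1% of rows) is shorter than the two 7-character values.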

Characters and Unicode

Total characters: 473623007
Distinct characters: 13
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: network
2nd row: passive
3rd row: network
4th row: passive
5th row: passive

Value  Count  Frequency (%)
passive  65200765  94.6%
gps  2152639  3.1%
network  1537105  2.2%

Most occurring characters

Value  Count  Frequency (%)
s  132554169  28.0%
p  67353404  14.2%
e  66737870  14.1%
a  65200765  13.8%
i  65200765  13.8%
v  65200765  13.8%
g  2152639  0.5%
n  1537105  0.3%
t  1537105  0.3%
w  1537105  0.3%
Other values (3)  4611315  1.0%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  473623007  100.0%

Most frequent character per category

(unknown)
Value  Count  Frequency (%)
s  132554169  28.0%
p  67353404  14.2%
e  66737870  14.1%
a  65200765  13.8%
i  65200765  13.8%
v  65200765  13.8%
g  2152639  0.5%
n  1537105  0.3%
t  1537105  0.3%
w  1537105  0.3%
Other values (3)  4611315  1.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  473623007  100.0%

Most frequent character per script

(unknown)
Value  Count  Frequency (%)
s  132554169  28.0%
p  67353404  14.2%
e  66737870  14.1%
a  65200765  13.8%
i  65200765  13.8%
v  65200765  13.8%
g  2152639  0.5%
n  1537105  0.3%
t  1537105  0.3%
w  1537105  0.3%
Other values (3)  4611315  1.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  473623007  100.0%

Most frequent character per block

(unknown)
Value  Count  Frequency (%)
s  132554169  28.0%
p  67353404  14.2%
e  66737870  14.1%
a  65200765  13.8%
i  65200765  13.8%
v  65200765  13.8%
g  2152639  0.5%
n  1537105  0.3%
t  1537105  0.3%
w  1537105  0.3%
Other values (3)  4611315  1.0%

speed
Real number (ℝ)

HIGH CORRELATION  ZEROS 

The speed of the device, measured in meters/second over ground

Distinct: 4843
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.335258295
Minimum: -1
Maximum: 231.3600006
Zeros: 27335270
Zeros (%): 39.7%
Negative: 23559615
Negative (%): 34.2%
Memory size: 1.0 GiB

Quantile statistics

Minimum: -1
5-th percentile: -0.009999999776
Q1: -0.009999999776
median: 0
Q3: 0.1400000006
95-th percentile: 11.75
Maximum: 231.3600006
Range: 232.3600006
Interquartile range (IQR): 0.1500000004

Descriptive statistics

Standard deviation: 4.495260826
Coefficient of variation (CV): 3.36658521
Kurtosis: 36.98472704
Mean: 1.335258295
Median Absolute Deviation (MAD): 0.009999999776
Skewness: 4.663322723
Sum: 91986623.57
Variance: 20.2073699
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
0  27335270  39.7%
-0.009999999776  22022154  32.0%
-1  1537105  2.2%
0.4499999881  161145  0.2%
0.2599999905  160377  0.2%
0.4199999869  160114  0.2%
0.2899999917  159254  0.2%
0.3899999857  157581  0.2%
0.4799999893  154746  0.2%
0.5199999809  153055  0.2%
Other values (4833)  16889708  24.5%

Value  Count  Frequency (%)
-1  1537105  2.2%
-0.8500000238  16  < 0.1%
-0.5899999738  14  < 0.1%
-0.5199999809  12  < 0.1%
-0.4199999869  21  < 0.1%

Value  Count  Frequency (%)
231.3600006  7  < 0.1%
231.2400055  7  < 0.1%
231.1499939  7  < 0.1%
224.1900024  7  < 0.1%
224.1799927  7  < 0.1%

latitude
Real number (ℝ)

Geographic coordinate that specifies the N/S position. Latitude is an angle which ranges from 0° at the Equator to 90° at the poles. It is expressed in decimal degrees.

Distinct: 22968
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 45.36320509
Minimum: -11.1903
Maximum: 54.9885
Zeros: 0
Zeros (%): 0.0%
Negative: 2691
Negative (%): < 0.1%
Memory size: 1.0 GiB

Quantile statistics

Minimum: -11.1903
5-th percentile: 37.3689
Q1: 45.6343
median: 45.9171
Q3: 46.0662
95-th percentile: 46.1671
Maximum: 54.9885
Range: 66.1788
Interquartile range (IQR): 0.4319

Descriptive statistics

Standard deviation: 2.146508816
Coefficient of variation (CV): 0.04731827947
Kurtosis: 28.07168762
Mean: 45.36320509
Median Absolute Deviation (MAD): 0.152
Skewness: -3.239154499
Sum: 3125094289
Variance: 4.607500097
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
45.9171  2848111  4.1%
45.917  2568822  3.7%
45.9169  2222689  3.2%
45.4205  1484155  2.2%
45.9168  1348504  2.0%
45.9172  1173352  1.7%
37.3651  938965  1.4%
46.0116  797200  1.2%
45.9118  790040  1.1%
45.9167  773069  1.1%
Other values (22958)  53945602  78.3%

Value  Count  Frequency (%)
-11.1903  2691  < 0.1%
36.8256  13  < 0.1%
36.8257  22  < 0.1%
36.8261  23  < 0.1%
36.8262  19  < 0.1%

Value  Count  Frequency (%)
54.9885  4  < 0.1%
54.9884  4  < 0.1%
54.9883  4  < 0.1%
54.9882  4  < 0.1%
54.9881  4  < 0.1%

longitude
Real number (ℝ)

Geographic coordinate that specifies the E/W position. Longitude is an angle which ranges from 0° at the prime meridian to 180°. It is expressed in decimal degrees.
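
The coordinate values in this report (for example 11.1223 or -1.7061) appear as signed decimal numbers, where a negative sign conventionally denotes west for longitude and south for latitude. For readers who need degrees, minutes and seconds instead, a minimal conversion sketch follows; `to_dms` is a hypothetical helper name.

```python
def to_dms(decimal_degrees: float):
    """Split a signed decimal-degree value into (sign, degrees, minutes, seconds)."""
    sign = -1 if decimal_degrees < 0 else 1
    magnitude = abs(decimal_degrees)
    degrees = int(magnitude)
    minutes_full = (magnitude - degrees) * 60
    minutes = int(minutes_full)
    seconds = (minutes_full - minutes) * 60
    return sign, degrees, minutes, seconds


sign, d, m, s = to_dms(-11.25)
hemisphere = "W" if sign < 0 else "E"  # use "S"/"N" for latitude values
print(f"{d}° {m}' {s:.2f}\" {hemisphere}")
```

Because of floating-point rounding, the seconds component is best formatted to a fixed number of decimals rather than compared exactly.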

Distinct: 25054
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.2964586
Minimum: -1.7061
Maximum: 47.0565
Zeros: 0
Zeros (%): 0.0%
Negative: 348250
Negative (%): 0.5%
Memory size: 1.0 GiB

Quantile statistics

Minimum: -1.7061
5-th percentile: 10.7693
Q1: 11.033
median: 11.1223
Q3: 11.317
95-th percentile: 13.8524
Maximum: 47.0565
Range: 48.7626
Interquartile range (IQR): 0.284

Descriptive statistics

Standard deviation: 1.205842034
Coefficient of variation (CV): 0.1067451382
Kurtosis: 60.38520621
Mean: 11.2964586
Median Absolute Deviation (MAD): 0.0942
Skewness: -5.073175387
Sum: 778218783.2
Variance: 1.454055012
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
11.0331  2782467  4.0%
11.033  2491312  3.6%
11.0332  1594114  2.3%
11.0329  1267272  1.8%
10.9545  1040580  1.5%
11.0333  915978  1.3%
11.0282  891044  1.3%
11.0281  869527  1.3%
11.3005  867399  1.3%
13.8525  835773  1.2%
Other values (25044)  55335043  80.3%

Value  Count  Frequency (%)
-1.7061  3  < 0.1%
-1.7057  3  < 0.1%
-1.7045  232  < 0.1%
-1.7015  95  < 0.1%
-1.7012  24  < 0.1%

Value  Count  Frequency (%)
47.0565  2  < 0.1%
17.4265  3  < 0.1%
17.4261  2  < 0.1%
17.421  9  < 0.1%
17.4209  12  < 0.1%

altitude
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Elevation above sea level in meters.

Distinct: 1191131
Distinct (%): 1.7%
Missing: 2
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 212.5581349
Minimum: -1691.1
Maximum: 63366.5
Zeros: 193905
Zeros (%): 0.3%
Negative: 23670211
Negative (%): 34.4%
Memory size: 1.0 GiB

Quantile statistics

Minimum: -1691.1
5-th percentile: -1
Q1: -1
median: 179.7
Q3: 259
95-th percentile: 674.6458
Maximum: 63366.5
Range: 65057.6
Interquartile range (IQR): 260

Descriptive statistics

Standard deviation: 343.1330013
Coefficient of variation (CV): 1.614301902
Kurtosis: 2493.186198
Mean: 212.5581349
Median Absolute Deviation (MAD): 180.7
Skewness: 21.62002563
Sum: 1.464323768 × 10^10
Variance: 117740.2566
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
-1  23560108  34.2%
0  193905  0.3%
190  166972  0.2%
210  131077  0.2%
188.1  124773  0.2%
205.6  120399  0.2%
190.4  108591  0.2%
215.1  107335  0.2%
200.4  101628  0.1%
552  96504  0.1%
Other values (1191121)  44179215  64.1%

Value  Count  Frequency (%)
-1691.1  3  < 0.1%
-1688.7  3  < 0.1%
-1686  3  < 0.1%
-1683.6  3  < 0.1%
-1681.1  3  < 0.1%

Value  Count  Frequency (%)
63366.5  4  < 0.1%
63316.3  4  < 0.1%
63243.7  4  < 0.1%
63240.1  4  < 0.1%
63116.7  4  < 0.1%

Correlations

           accuracy  altitude  bearing  latitude  longitude  speed   userid
accuracy   1.000     -0.249    -0.457   0.032     0.270      -0.290  0.147
altitude   -0.249    1.000     0.680    0.203     0.045      0.756   -0.048
bearing    -0.457    0.680     1.000    -0.012    -0.160     0.825   -0.130
latitude   0.032     0.203     -0.012   1.000     0.067      0.015   0.199
longitude  0.270     0.045     -0.160   0.067     1.000      -0.027  0.061
speed      -0.290    0.756     0.825    0.015     -0.027     1.000   -0.021
userid     0.147     -0.048    -0.130   0.199     0.061      -0.021  1.000
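
To see which pairs drive the "High correlation" alerts, the coefficients in the matrix above can be scanned for entries whose absolute value exceeds a cut-off. The sketch below assumes a threshold of 0.5, which is an assumption rather than a value confirmed by this report; the `high_correlation_pairs` helper is a hypothetical name.

```python
FIELDS = ["accuracy", "altitude", "bearing", "latitude", "longitude", "speed", "userid"]

# Correlation coefficients copied from the matrix in this report.
MATRIX = [
    [1.000, -0.249, -0.457, 0.032, 0.270, -0.290, 0.147],
    [-0.249, 1.000, 0.680, 0.203, 0.045, 0.756, -0.048],
    [-0.457, 0.680, 1.000, -0.012, -0.160, 0.825, -0.130],
    [0.032, 0.203, -0.012, 1.000, 0.067, 0.015, 0.199],
    [0.270, 0.045, -0.160, 0.067, 1.000, -0.027, 0.061],
    [-0.290, 0.756, 0.825, 0.015, -0.027, 1.000, -0.021],
    [0.147, -0.048, -0.130, 0.199, 0.061, -0.021, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off for flagging a pair as highly correlated


def high_correlation_pairs(fields, matrix, threshold):
    """List (field_a, field_b, coefficient) for off-diagonal entries above the threshold."""
    pairs = []
    for i in range(len(fields)):
        for j in range(i + 1, len(fields)):
            if abs(matrix[i][j]) >= threshold:
                pairs.append((fields[i], fields[j], matrix[i][j]))
    return pairs


for a, b, r in high_correlation_pairs(FIELDS, MATRIX, THRESHOLD):
    print(f"{a} ~ {b}: {r:+.3f}")
```

With this cut-off the flagged pairs are altitude/bearing, altitude/speed and bearing/speed, which matches the three variables named in the alerts.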