Dataset statistics
Number of variables | 23 |
---|---|
Number of observations | 7631721 |
Missing cells | 68145580 |
Missing cells (%) | 38.8% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 1.3 GiB |
Average record size in memory | 184.0 B |
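The missing-cell percentage above can be cross-checked from the other figures in this table. The following is a minimal sketch, assuming the percentage is simply missing cells divided by (variables × observations); the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 23
n_observations = 7_631_721
missing_cells = 68_145_580

# Assumption: every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
```

Under that assumption the result rounds to the 38.8% shown in the table.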
Variable types
Numeric | 1 |
---|---|
DateTime | 1 |
Text | 11 |
Categorical | 10 |
Alerts
Facility Type is highly imbalanced (68.4%) | Imbalance |
Status is highly imbalanced (93.3%) | Imbalance |
Vehicle Type is highly imbalanced (74.8%) | Imbalance |
Closed Date has 143947 (1.9%) missing values | Missing |
Descriptor has 129525 (1.7%) missing values | Missing |
Location Type has 1699240 (22.3%) missing values | Missing |
Landmark has 7629562 (> 99.9%) missing values | Missing |
Facility Type has 5217452 (68.4%) missing values | Missing |
Vehicle Type has 7631568 (> 99.9%) missing values | Missing |
Taxi Company Borough has 7625165 (99.9%) missing values | Missing |
Taxi Pick Up Location has 7592481 (99.5%) missing values | Missing |
Bridge Highway Name has 7618111 (99.8%) missing values | Missing |
Bridge Highway Direction has 7618125 (99.8%) missing values | Missing |
Road Ramp has 7618286 (99.8%) missing values | Missing |
Bridge Highway Segment has 7615395 (99.8%) missing values | Missing |
Unique Key has unique values | Unique |
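The imbalance percentages listed above are consistent with a score of one minus the normalized Shannon entropy of the class counts. That definition is an assumption about how the profiler computes the figure, not something stated in this report. The sketch below applies it to the three non-missing Facility Type counts from the Common Values table; `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts."""
    total = sum(counts)
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    max_entropy = math.log(len(counts))
    # A single class has zero maximum entropy; treat it as not scorable.
    return 1 - entropy / max_entropy if max_entropy > 0 else 0.0

# Non-missing Facility Type counts: Precinct, DSNY Garage, School.
facility_counts = [2_161_151, 247_123, 5_995]
score = imbalance_score(facility_counts)
print(f"{100 * score:.1f}%")
```

With these counts the score rounds to 68.4%, matching the alert. A perfectly balanced column would score 0 under the same definition.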
Reproduction
Analysis started | 2024-04-22 15:30:54.003259 |
---|---|
Analysis finished | 2024-04-22 15:44:48.702554 |
Duration | 13 minutes and 54.7 seconds |
Software version | ydata-profiling v4.7.0 |
Download configuration | config.json |
Unique Key
Real number (ℝ)
UNIQUE
 
Distinct | 7631721 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 36901967 |
Minimum | 32305076 |
---|---|
Maximum | 52179269 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 58.2 MiB |
Quantile statistics
Minimum | 32305076 |
---|---|
5-th percentile | 32778445 |
Q1 | 34613509 |
median | 36920091 |
Q3 | 39151185 |
95-th percentile | 41007169 |
Maximum | 52179269 |
Range | 19874193 |
Interquartile range (IQR) | 4537676 |
Descriptive statistics
Standard deviation | 2655089.9 |
---|---|
Coefficient of variation (CV) | 0.07194982 |
Kurtosis | -1.0864008 |
Mean | 36901967 |
Median Absolute Deviation (MAD) | 2268547 |
Skewness | 0.034143271 |
Sum | 2.8162551 × 10^14 |
Variance | 7.0495023 × 10^12 |
Monotonicity | Not monotonic |
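Several of the Unique Key statistics are derived from one another, which makes them easy to sanity-check. The sketch below recomputes the range, IQR, coefficient of variation and variance from the reported minimum, maximum, quartiles, mean and standard deviation. Because the inputs are the rounded values printed in the report, the last digits of the derived values may differ slightly.

```python
# Figures copied from the Unique Key tables.
minimum, maximum = 32_305_076, 52_179_269
q1, q3 = 34_613_509, 39_151_185
mean = 36_901_967
std = 2_655_089.9

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance

print(value_range, iqr)
print(f"CV = {cv:.8f}")
print(f"variance = {variance:.7e}")
```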
Value | Count | Frequency (%) |
38237851 | 1 | < 0.1% |
34826212 | 1 | < 0.1% |
34826224 | 1 | < 0.1% |
34826223 | 1 | < 0.1% |
34826222 | 1 | < 0.1% |
34826221 | 1 | < 0.1% |
34826220 | 1 | < 0.1% |
34826219 | 1 | < 0.1% |
34826218 | 1 | < 0.1% |
34826217 | 1 | < 0.1% |
Other values (7631711) | 7631711 |
Value | Count | Frequency (%) |
32305076 | 1 | |
32305086 | 1 | |
32305088 | 1 | |
32305113 | 1 | |
32305114 | 1 | |
32305125 | 1 | |
32305135 | 1 | |
32305138 | 1 | |
32305139 | 1 | |
32305154 | 1 |
Value | Count | Frequency (%) |
52179269 | 1 | |
52179256 | 1 | |
52179246 | 1 | |
52179245 | 1 | |
52179244 | 1 | |
52179235 | 1 | |
52179234 | 1 | |
52179233 | 1 | |
52179232 | 1 | |
52179231 | 1 |
Created Date
Date
Distinct | 5657411 |
---|---|
Distinct (%) | 74.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 58.2 MiB |
Minimum | 2016-01-01 00:00:00 |
---|---|
Maximum | 2018-12-31 23:59:56 |
Closed Date
Text
MISSING
 
Distinct | 3171371 |
---|---|
Distinct (%) | 42.4% |
Missing | 143947 |
Missing (%) | 1.9% |
Memory size | 58.2 MiB |
Length
Max length | 22 |
---|---|
Median length | 22 |
Mean length | 22 |
Min length | 22 |
Characters and Unicode
Total characters | 164731028 |
---|---|
Distinct characters | 16 |
Distinct categories | 4 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 1978311 |
---|---|
Unique (%) | 26.4% |
Sample
1st row | 01/24/2018 12:00:00 AM |
---|---|
2nd row | 01/21/2018 12:00:00 AM |
3rd row | 01/20/2018 10:02:00 PM |
4th row | 01/19/2018 12:00:00 AM |
5th row | 01/20/2018 07:41:00 PM |
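Closed Date is profiled as Text, but the sample rows all follow a fixed 22-character timestamp layout. A minimal sketch of converting such strings to datetimes follows, assuming the layout is `MM/DD/YYYY hh:mm:ss AM/PM` (inferred from the samples only) and that missing values arrive as empty strings or `None`. `parse_closed_date` is a hypothetical helper name.

```python
from datetime import datetime

# Assumed format, inferred from the sample rows: MM/DD/YYYY hh:mm:ss AM/PM.
CLOSED_DATE_FORMAT = "%m/%d/%Y %I:%M:%S %p"

def parse_closed_date(text):
    """Parse one Closed Date string; return None for a missing value."""
    if not text:
        return None
    return datetime.strptime(text, CLOSED_DATE_FORMAT)

samples = ["01/24/2018 12:00:00 AM", "01/20/2018 10:02:00 PM"]
parsed = [parse_closed_date(s) for s in samples]
for original, value in zip(samples, parsed):
    print(original, "->", value.isoformat())
```

Note that `%p` depends on the process locale, so this sketch assumes an English or C locale.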
Value | Count | Frequency (%) |
am | 3860303 | 17.2% |
pm | 3627471 | 16.1% |
12:00:00 | 1198410 | 5.3% |
10:00:00 | 16923 | 0.1% |
11:00:00 | 16300 | 0.1% |
01:00:00 | 13934 | 0.1% |
09:00:00 | 13758 | 0.1% |
01:45:00 | 13363 | 0.1% |
02:00:00 | 13107 | 0.1% |
10:30:00 | 13066 | 0.1% |
Other values (45421) | 13676687 |
Most occurring characters
Value | Count | Frequency (%) |
0 | 32503924 | |
1 | 21457693 | |
2 | 17201338 | |
/ | 14975548 | |
(space) | 14975548 | |
: | 14975548 | |
M | 7487774 | 4.5% |
8 | 5606542 | 3.4% |
7 | 5223507 | 3.2% |
3 | 5154427 | 3.1% |
Other values (6) | 25169179 |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 104828836 | |
Other Punctuation | 29951096 | 18.2% |
Space Separator | 14975548 | 9.1% |
Uppercase Letter | 14975548 | 9.1% |
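The category totals above follow directly from the fixed 22-character layout of the values. The sketch below tallies Unicode general categories for one sample row with the standard `unicodedata` module and scales by the number of non-missing values. It assumes every non-missing value has the same mix of character categories as the sample, which the equal minimum and maximum lengths suggest but do not prove.

```python
import unicodedata
from collections import Counter

sample = "01/24/2018 12:00:00 AM"   # 1st sample row; 22 characters
non_missing = 7_631_721 - 143_947   # observations minus missing values

# Nd = Decimal Number, Po = Other Punctuation,
# Zs = Space Separator, Lu = Uppercase Letter.
per_value = Counter(unicodedata.category(ch) for ch in sample)
totals = {cat: n * non_missing for cat, n in per_value.items()}

print(len(sample) * non_missing)    # total characters
for cat, total in sorted(totals.items()):
    print(cat, total)
```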
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 32503924 | |
1 | 21457693 | |
2 | 17201338 | |
8 | 5606542 | 5.3% |
7 | 5223507 | 5.0% |
3 | 5154427 | 4.9% |
6 | 5042739 | 4.8% |
5 | 4793207 | 4.6% |
4 | 4611931 | 4.4% |
9 | 3233528 | 3.1% |
Uppercase Letter
Value | Count | Frequency (%) |
M | 7487774 | |
A | 3860303 | |
P | 3627471 |
Other Punctuation
Value | Count | Frequency (%) |
/ | 14975548 | |
: | 14975548 |
Space Separator
Value | Count | Frequency (%) |
(space) | 14975548 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 149755480 | |
Latin | 14975548 | 9.1% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 32503924 | |
1 | 21457693 | |
2 | 17201338 | |
/ | 14975548 | |
(space) | 14975548 | |
: | 14975548 | |
8 | 5606542 | 3.7% |
7 | 5223507 | 3.5% |
3 | 5154427 | 3.4% |
6 | 5042739 | 3.4% |
Other values (3) | 12638666 | 8.4% |
Latin
Value | Count | Frequency (%) |
M | 7487774 | |
A | 3860303 | |
P | 3627471 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 164731028 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 32503924 | |
1 | 21457693 | |
2 | 17201338 | |
/ | 14975548 | |
(space) | 14975548 | |
: | 14975548 | |
M | 7487774 | 4.5% |
8 | 5606542 | 3.4% |
7 | 5223507 | 3.2% |
3 | 5154427 | 3.1% |
Other values (6) | 25169179 |
Agency
Categorical
Distinct | 30 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 58.2 MiB |
NYPD | |
---|---|
HPD | |
DOT | |
DSNY | |
DEP | |
Other values (25) |
Length
Max length | 5 |
---|---|
Median length | 3 |
Mean length | 3.450203 |
Min length | 3 |
Characters and Unicode
Total characters | 26330987 |
---|---|
Distinct characters | 21 |
Distinct categories | 3 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | DSNY |
---|---|
2nd row | DSNY |
3rd row | DSNY |
4th row | DSNY |
5th row | DSNY |
Common Values
Value | Count | Frequency (%) |
NYPD | 2167548 | |
HPD | 1778185 | |
DOT | 877335 | |
DSNY | 831025 | 10.9% |
DEP | 585582 | 7.7% |
DOB | 399518 | 5.2% |
DPR | 316820 | 4.2% |
DOHMH | 201343 | 2.6% |
DOF | 149812 | 2.0% |
DHS | 88780 | 1.2% |
Other values (20) | 235773 | 3.1% |
Length
Value | Count | Frequency (%) |
nypd | 2167548 | |
hpd | 1778185 | |
dot | 877335 | |
dsny | 831025 | 10.9% |
dep | 585582 | 7.7% |
dob | 399518 | 5.2% |
dpr | 316820 | 4.2% |
dohmh | 201343 | 2.6% |
dof | 149812 | 2.0% |
dhs | 88780 | 1.2% |
Other values (20) | 235773 | 3.1% |
Most occurring characters
Value | Count | Frequency (%) |
D | 7490189 | |
P | 4848210 | |
N | 2999777 | |
Y | 2999777 | |
H | 2325398 | 8.8% |
O | 1636139 | 6.2% |
T | 990234 | 3.8% |
S | 922330 | 3.5% |
E | 596426 | 2.3% |
B | 399536 | 1.5% |
Other values (11) | 1122971 | 4.3% |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 26328747 | |
Decimal Number | 1344 | < 0.1% |
Dash Punctuation | 896 | < 0.1% |
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
D | 7490189 | |
P | 4848210 | |
N | 2999777 | |
Y | 2999777 | |
H | 2325398 | 8.8% |
O | 1636139 | 6.2% |
T | 990234 | 3.8% |
S | 922330 | 3.5% |
E | 596426 | 2.3% |
B | 399536 | 1.5% |
Other values (8) | 1120731 | 4.3% |
Decimal Number
Value | Count | Frequency (%) |
1 | 896 | |
3 | 448 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 896 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 26328747 | |
Common | 2240 | < 0.1% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
D | 7490189 | |
P | 4848210 | |
N | 2999777 | |
Y | 2999777 | |
H | 2325398 | 8.8% |
O | 1636139 | 6.2% |
T | 990234 | 3.8% |
S | 922330 | 3.5% |
E | 596426 | 2.3% |
B | 399536 | 1.5% |
Other values (8) | 1120731 | 4.3% |
Common
Value | Count | Frequency (%) |
- | 896 | |
1 | 896 | |
3 | 448 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 26330987 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
D | 7490189 | |
P | 4848210 | |
N | 2999777 | |
Y | 2999777 | |
H | 2325398 | 8.8% |
O | 1636139 | 6.2% |
T | 990234 | 3.8% |
S | 922330 | 3.5% |
E | 596426 | 2.3% |
B | 399536 | 1.5% |
Other values (11) | 1122971 | 4.3% |
Agency Name
Text
Distinct | 1373 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 58.2 MiB |
Length
Max length | 82 |
---|---|
Median length | 78 |
Mean length | 34.290636 |
Min length | 3 |
Characters and Unicode
Total characters | 261696566 |
---|---|
Distinct characters | 69 |
Distinct categories | 8 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 342 |
---|---|
Unique (%) | < 0.1% |
Sample
1st row | Department of Sanitation |
---|---|
2nd row | Department of Sanitation |
3rd row | Department of Sanitation |
4th row | Department of Sanitation |
5th row | Department of Sanitation |
Value | Count | Frequency (%) |
department | 6917923 | |
of | 4732019 | |
and | 2411979 | 6.9% |
new | 2166530 | 6.2% |
york | 2166426 | 6.2% |
city | 2166367 | 6.2% |
police | 2166325 | 6.2% |
development | 1781227 | 5.1% |
housing | 1778111 | 5.1% |
preservation | 1778111 | 5.1% |
Other values (1855) | 7082746 |
Most occurring characters
Value | Count | Frequency (%) |
e | 31300846 | |
(space) | 27516043 | 10.5% |
t | 25609180 | 9.8% |
n | 22229250 | 8.5% |
o | 19878380 | 7.6% |
r | 16991927 | 6.5% |
a | 16146107 | 6.2% |
i | 13205430 | 5.0% |
m | 9835462 | 3.8% |
p | 9812916 | 3.7% |
Other values (59) | 69171025 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 205632134 | |
Uppercase Letter | 27977089 | 10.7% |
Space Separator | 27516043 | 10.5% |
Dash Punctuation | 402569 | 0.2% |
Decimal Number | 161941 | 0.1% |
Other Punctuation | 6752 | < 0.1% |
Open Punctuation | 19 | < 0.1% |
Close Punctuation | 19 | < 0.1% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
e | 31300846 | |
t | 25609180 | |
n | 22229250 | |
o | 19878380 | |
r | 16991927 | |
a | 16146107 | |
i | 13205430 | 6.4% |
m | 9835462 | 4.8% |
p | 9812916 | 4.8% |
s | 6292678 | 3.1% |
Other values (16) | 34329958 |
Uppercase Letter
Value | Count | Frequency (%) |
D | 8706702 | |
P | 4915156 | |
C | 2834178 | 10.1% |
H | 2329904 | 8.3% |
N | 2211407 | 7.9% |
Y | 2168172 | 7.7% |
T | 973118 | 3.5% |
B | 825310 | 2.9% |
E | 705860 | 2.5% |
S | 655743 | 2.3% |
Other values (16) | 1651539 | 5.9% |
Decimal Number
Value | Count | Frequency (%) |
0 | 62358 | |
1 | 38414 | |
2 | 18367 | 11.3% |
3 | 17112 | 10.6% |
4 | 8020 | 5.0% |
8 | 4971 | 3.1% |
6 | 4934 | 3.0% |
5 | 3615 | 2.2% |
7 | 3466 | 2.1% |
9 | 684 | 0.4% |
Other Punctuation
Value | Count | Frequency (%) |
, | 6644 | |
' | 72 | 1.1% |
: | 36 | 0.5% |
Space Separator
Value | Count | Frequency (%) |
(space) | 27516043 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 402569 |
Open Punctuation
Value | Count | Frequency (%) |
( | 19 |
Close Punctuation
Value | Count | Frequency (%) |
) | 19 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 233609223 | |
Common | 28087343 | 10.7% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
e | 31300846 | |
t | 25609180 | |
n | 22229250 | 9.5% |
o | 19878380 | 8.5% |
r | 16991927 | 7.3% |
a | 16146107 | 6.9% |
i | 13205430 | 5.7% |
m | 9835462 | 4.2% |
p | 9812916 | 4.2% |
D | 8706702 | 3.7% |
Other values (42) | 59893023 |
Common
Value | Count | Frequency (%) |
(space) | 27516043 | |
- | 402569 | 1.4% |
0 | 62358 | 0.2% |
1 | 38414 | 0.1% |
2 | 18367 | 0.1% |
3 | 17112 | 0.1% |
4 | 8020 | < 0.1% |
, | 6644 | < 0.1% |
8 | 4971 | < 0.1% |
6 | 4934 | < 0.1% |
Other values (7) | 7911 | < 0.1% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 261696566 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
e | 31300846 | |
(space) | 27516043 | 10.5% |
t | 25609180 | 9.8% |
n | 22229250 | 8.5% |
o | 19878380 | 7.6% |
r | 16991927 | 6.5% |
a | 16146107 | 6.2% |
i | 13205430 | 5.0% |
m | 9835462 | 3.8% |
p | 9812916 | 3.7% |
Other values (59) | 69171025 |
Complaint Type
Text
Distinct | 271 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 58.2 MiB |
Length
Max length | 41 |
---|---|
Median length | 33 |
Mean length | 17.007385 |
Min length | 3 |
Characters and Unicode
Total characters | 129795617 |
---|---|
Distinct characters | 59 |
Distinct categories | 8 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 8 |
---|---|
Unique (%) | < 0.1% |
Sample
1st row | Request Large Bulky Item Collection |
---|---|
2nd row | Request Large Bulky Item Collection |
3rd row | Request Large Bulky Item Collection |
4th row | Request Large Bulky Item Collection |
5th row | Request Large Bulky Item Collection |
Value | Count | Frequency (%) |
1335244 | 7.7% | |
noise | 1305443 | 7.5% |
condition | 1161910 | 6.7% |
water | 1012500 | 5.8% |
residential | 670050 | 3.9% |
heat/hot | 665315 | 3.8% |
street | 586764 | 3.4% |
parking | 467965 | 2.7% |
illegal | 443251 | 2.6% |
blocked | 392458 | 2.3% |
Other values (356) | 9299196 |
Most occurring characters
Value | Count | Frequency (%) |
e | 12875357 | 9.9% |
i | 9907988 | 7.6% |
(space) | 9708375 | 7.5% |
t | 7478979 | 5.8% |
o | 6580378 | 5.1% |
n | 6475264 | 5.0% |
l | 5876127 | 4.5% |
a | 5342985 | 4.1% |
r | 4861128 | 3.7% |
s | 4621470 | 3.6% |
Other values (49) | 56067566 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 81656963 | |
Uppercase Letter | 35167695 | |
Space Separator | 9708375 | 7.5% |
Other Punctuation | 1635452 | 1.3% |
Dash Punctuation | 1354422 | 1.0% |
Open Punctuation | 136354 | 0.1% |
Close Punctuation | 136354 | 0.1% |
Decimal Number | 2 | < 0.1% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
e | 12875357 | |
i | 9907988 | |
t | 7478979 | |
o | 6580378 | |
n | 6475264 | |
l | 5876127 | |
a | 5342985 | 6.5% |
r | 4861128 | 6.0% |
s | 4621470 | 5.7% |
d | 3115955 | 3.8% |
Other values (16) | 14521332 |
Uppercase Letter
Value | Count | Frequency (%) |
T | 3617955 | 10.3% |
N | 3023045 | 8.6% |
A | 3010699 | 8.6% |
R | 2915949 | 8.3% |
S | 2509008 | 7.1% |
C | 2450028 | 7.0% |
E | 2382909 | 6.8% |
I | 2312430 | 6.6% |
O | 1864243 | 5.3% |
P | 1607318 | 4.6% |
Other values (15) | 9474111 |
Other Punctuation
Value | Count | Frequency (%) |
/ | 1634828 | |
' | 612 | < 0.1% |
. | 12 | < 0.1% |
Space Separator
Value | Count | Frequency (%) |
(space) | 9708375 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 1354422 |
Open Punctuation
Value | Count | Frequency (%) |
( | 136354 |
Close Punctuation
Value | Count | Frequency (%) |
) | 136354 |
Decimal Number
Value | Count | Frequency (%) |
4 | 2 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 116824658 | |
Common | 12970959 | 10.0% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
e | 12875357 | 11.0% |
i | 9907988 | 8.5% |
t | 7478979 | 6.4% |
o | 6580378 | 5.6% |
n | 6475264 | 5.5% |
l | 5876127 | 5.0% |
a | 5342985 | 4.6% |
r | 4861128 | 4.2% |
s | 4621470 | 4.0% |
T | 3617955 | 3.1% |
Other values (41) | 49187027 |
Common
Value | Count | Frequency (%) |
(space) | 9708375 | |
/ | 1634828 | 12.6% |
- | 1354422 | 10.4% |
( | 136354 | 1.1% |
) | 136354 | 1.1% |
' | 612 | < 0.1% |
. | 12 | < 0.1% |
4 | 2 | < 0.1% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 129795617 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
e | 12875357 | 9.9% |
i | 9907988 | 7.6% |
(space) | 9708375 | 7.5% |
t | 7478979 | 5.8% |
o | 6580378 | 5.1% |
n | 6475264 | 5.0% |
l | 5876127 | 4.5% |
a | 5342985 | 4.1% |
r | 4861128 | 3.7% |
s | 4621470 | 3.6% |
Other values (49) | 56067566 |
Descriptor
Text
MISSING
 
Distinct | 1314 |
---|---|
Distinct (%) | < 0.1% |
Missing | 129525 |
Missing (%) | 1.7% |
Memory size | 58.2 MiB |
Length
Max length | 80 |
---|---|
Median length | 63 |
Mean length | 18.535133 |
Min length | 3 |
Characters and Unicode
Total characters | 139054204 |
---|---|
Distinct characters | 76 |
Distinct categories | 9 |
Distinct scripts | 2 |
Distinct blocks | 2 |
Unique
Unique | 73 |
---|---|
Unique (%) | < 0.1% |
Sample
1st row | Request Large Bulky Item Collection |
---|---|
2nd row | Request Large Bulky Item Collection |
3rd row | Request Large Bulky Item Collection |
4th row | Request Large Bulky Item Collection |
5th row | Request Large Bulky Item Collection |
Value | Count | Frequency (%) |
loud | 829704 | 4.3% |
music/party | 696660 | 3.6% |
building | 528467 | 2.7% |
entire | 448188 | 2.3% |
access | 393451 | 2.0% |
389412 | 2.0% | |
no | 368568 | 1.9% |
street | 333805 | 1.7% |
collection | 299475 | 1.5% |
request | 276089 | 1.4% |
Other values (1704) | 14841851 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 11904997 | 8.6% |
e | 9940675 | 7.1% |
i | 7775645 | 5.6% |
t | 7296442 | 5.2% |
o | 7112437 | 5.1% |
n | 6330243 | 4.6% |
r | 5970261 | 4.3% |
a | 5697013 | 4.1% |
l | 4813929 | 3.5% |
s | 4703019 | 3.4% |
Other values (66) | 67509543 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 84376607 | |
Uppercase Letter | 37037039 | |
Space Separator | 11904997 | 8.6% |
Other Punctuation | 2403910 | 1.7% |
Decimal Number | 1093995 | 0.8% |
Open Punctuation | 791269 | 0.6% |
Close Punctuation | 790028 | 0.6% |
Dash Punctuation | 656308 | 0.5% |
Other Symbol | 51 | < 0.1% |
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
L | 3185828 | 8.6% |
N | 2797927 | 7.6% |
P | 2646636 | 7.1% |
I | 2622547 | 7.1% |
E | 2579532 | 7.0% |
A | 2291255 | 6.2% |
R | 2258719 | 6.1% |
S | 2071969 | 5.6% |
C | 2057280 | 5.6% |
B | 2054680 | 5.5% |
Other values (17) | 12470666 |
Lowercase Letter
Value | Count | Frequency (%) |
e | 9940675 | |
i | 7775645 | 9.2% |
t | 7296442 | 8.6% |
o | 7112437 | 8.4% |
n | 6330243 | 7.5% |
r | 5970261 | 7.1% |
a | 5697013 | 6.8% |
l | 4813929 | 5.7% |
s | 4703019 | 5.6% |
c | 4116077 | 4.9% |
Other values (16) | 20620866 |
Decimal Number
Value | Count | Frequency (%) |
1 | 481900 | |
2 | 210660 | |
3 | 128852 | 11.8% |
5 | 125819 | 11.5% |
4 | 53074 | 4.9% |
0 | 44382 | 4.1% |
9 | 19778 | 1.8% |
8 | 18752 | 1.7% |
6 | 9047 | 0.8% |
7 | 1731 | 0.2% |
Other Punctuation
Value | Count | Frequency (%) |
/ | 2090176 | |
: | 178382 | 7.4% |
, | 100858 | 4.2% |
. | 21176 | 0.9% |
\ | 4842 | 0.2% |
" | 4842 | 0.2% |
& | 3624 | 0.2% |
* | 10 | < 0.1% |
Space Separator
Value | Count | Frequency (%) |
(space) | 11904997 |
Open Punctuation
Value | Count | Frequency (%) |
( | 791269 |
Close Punctuation
Value | Count | Frequency (%) |
) | 790028 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 656308 |
Other Symbol
Value | Count | Frequency (%) |
© | 51 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 121413646 | |
Common | 17640558 | 12.7% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
e | 9940675 | 8.2% |
i | 7775645 | 6.4% |
t | 7296442 | 6.0% |
o | 7112437 | 5.9% |
n | 6330243 | 5.2% |
r | 5970261 | 4.9% |
a | 5697013 | 4.7% |
l | 4813929 | 4.0% |
s | 4703019 | 3.9% |
c | 4116077 | 3.4% |
Other values (43) | 57657905 |
Common
Value | Count | Frequency (%) |
(space) | 11904997 | |
/ | 2090176 | 11.8% |
( | 791269 | 4.5% |
) | 790028 | 4.5% |
- | 656308 | 3.7% |
1 | 481900 | 2.7% |
2 | 210660 | 1.2% |
: | 178382 | 1.0% |
3 | 128852 | 0.7% |
5 | 125819 | 0.7% |
Other values (13) | 282167 | 1.6% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 139054102 | |
None | 102 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 11904997 | 8.6% |
e | 9940675 | 7.1% |
i | 7775645 | 5.6% |
t | 7296442 | 5.2% |
o | 7112437 | 5.1% |
n | 6330243 | 4.6% |
r | 5970261 | 4.3% |
a | 5697013 | 4.1% |
l | 4813929 | 3.5% |
s | 4703019 | 3.4% |
Other values (64) | 67509441 |
None
Value | Count | Frequency (%) |
à | 51 | |
© | 51 |
Location Type
Text
MISSING
 
Distinct | 161 |
---|---|
Distinct (%) | < 0.1% |
Missing | 1699240 |
Missing (%) | 22.3% |
Memory size | 58.2 MiB |
Length
Max length | 36 |
---|---|
Median length | 30 |
Mean length | 16.105777 |
Min length | 3 |
Characters and Unicode
Total characters | 95547217 |
---|---|
Distinct characters | 59 |
Distinct categories | 9 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 1 |
---|---|
Unique (%) | < 0.1% |
Sample
1st row | Sidewalk |
---|---|
2nd row | Sidewalk |
3rd row | Sidewalk |
4th row | Sidewalk |
5th row | Sidewalk |
Value | Count | Frequency (%) |
residential | 2491260 | |
building | 1860682 | |
street/sidewalk | 1331865 | |
street | 718155 | 7.8% |
building/house | 709171 | 7.7% |
sidewalk | 652507 | 7.1% |
address | 176877 | 1.9% |
family | 124163 | 1.4% |
store/commercial | 103650 | 1.1% |
3 | 90303 | 1.0% |
Other values (179) | 925986 | 10.1% |
Most occurring characters
Value | Count | Frequency (%) |
e | 9218360 | 9.6% |
I | 7067894 | 7.4% |
S | 5969522 | 6.2% |
i | 5527392 | 5.8% |
t | 5407940 | 5.7% |
l | 4019177 | 4.2% |
d | 3973723 | 4.2% |
D | 3603967 | 3.8% |
N | 3592566 | 3.8% |
L | 3564165 | 3.7% |
Other values (49) | 43602511 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 47818150 | |
Uppercase Letter | 41554795 | |
Space Separator | 3252138 | 3.4% |
Other Punctuation | 2488411 | 2.6% |
Decimal Number | 203907 | 0.2% |
Dash Punctuation | 118757 | 0.1% |
Math Symbol | 67407 | 0.1% |
Open Punctuation | 21826 | < 0.1% |
Close Punctuation | 21826 | < 0.1% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
e | 9218360 | |
i | 5527392 | |
t | 5407940 | |
l | 4019177 | |
d | 3973723 | |
a | 3522275 | 7.4% |
r | 3094034 | 6.5% |
k | 2106450 | 4.4% |
w | 2051414 | 4.3% |
s | 1961602 | 4.1% |
Other values (14) | 6935783 |
Uppercase Letter
Value | Count | Frequency (%) |
I | 7067894 | |
S | 5969522 | |
D | 3603967 | |
N | 3592566 | |
L | 3564165 | |
E | 3553648 | |
B | 2735144 | 6.6% |
R | 2602326 | 6.3% |
A | 2011097 | 4.8% |
U | 1816876 | 4.4% |
Other values (13) | 5037590 |
Other Punctuation
Value | Count | Frequency (%) |
/ | 2415789 | |
. | 46346 | 1.9% |
, | 22896 | 0.9% |
' | 3380 | 0.1% |
Decimal Number
Value | Count | Frequency (%) |
3 | 90401 | |
1 | 56802 | |
2 | 56704 |
Space Separator
Value | Count | Frequency (%) |
(space) | 3252138 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 118757 |
Math Symbol
Value | Count | Frequency (%) |
+ | 67407 |
Open Punctuation
Value | Count | Frequency (%) |
( | 21826 |
Close Punctuation
Value | Count | Frequency (%) |
) | 21826 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 89372945 | |
Common | 6174272 | 6.5% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
e | 9218360 | 10.3% |
I | 7067894 | 7.9% |
S | 5969522 | 6.7% |
i | 5527392 | 6.2% |
t | 5407940 | 6.1% |
l | 4019177 | 4.5% |
d | 3973723 | 4.4% |
D | 3603967 | 4.0% |
N | 3592566 | 4.0% |
L | 3564165 | 4.0% |
Other values (37) | 37428239 |
Common
Value | Count | Frequency (%) |
(space) | 3252138 | |
/ | 2415789 | |
- | 118757 | 1.9% |
3 | 90401 | 1.5% |
+ | 67407 | 1.1% |
1 | 56802 | 0.9% |
2 | 56704 | 0.9% |
. | 46346 | 0.8% |
, | 22896 | 0.4% |
( | 21826 | 0.4% |
Other values (2) | 25206 | 0.4% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 95547217 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
e | 9218360 | 9.6% |
I | 7067894 | 7.4% |
S | 5969522 | 6.2% |
i | 5527392 | 5.8% |
t | 5407940 | 5.7% |
l | 4019177 | 4.2% |
d | 3973723 | 4.2% |
D | 3603967 | 3.8% |
N | 3592566 | 3.8% |
L | 3564165 | 3.7% |
Other values (49) | 43602511 |
Landmark
Text
MISSING
 
Distinct | 416 |
---|---|
Distinct (%) | 19.3% |
Missing | 7629562 |
Missing (%) | > 99.9% |
Memory size | 58.2 MiB |
Length
Max length | 32 |
---|---|
Median length | 29 |
Mean length | 14.862899 |
Min length | 3 |
Characters and Unicode
Total characters | 32089 |
---|---|
Distinct characters | 38 |
Distinct categories | 4 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 234 |
---|---|
Unique (%) | 10.8% |
Sample
1st row | J F K AIRPORT |
---|---|
2nd row | LGA |
3rd row | FT TOTTEN |
4th row | FLUSHING MEADOWS CORONA PARK |
5th row | CITI FIELD |
Value | Count | Frequency (%) |
park | 1036 | 19.0% |
central | 295 | 5.4% |
airport | 262 | 4.8% |
j | 156 | 2.9% |
f | 152 | 2.8% |
k | 151 | 2.8% |
square | 119 | 2.2% |
prospect | 105 | 1.9% |
la | 100 | 1.8% |
guardia | 100 | 1.8% |
Other values (490) | 2984 |
Most occurring characters
Value | Count | Frequency (%) |
A | 3577 | 11.1% |
R | 3509 | 10.9% |
(space) | 3307 | 10.3% |
E | 2197 | 6.8% |
N | 1844 | 5.7% |
P | 1822 | 5.7% |
T | 1808 | 5.6% |
O | 1666 | 5.2% |
I | 1608 | 5.0% |
L | 1510 | 4.7% |
Other values (28) | 9241 |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 28666 | |
Space Separator | 3307 | 10.3% |
Decimal Number | 109 | 0.3% |
Other Punctuation | 7 | < 0.1% |
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
A | 3577 | |
R | 3509 | |
E | 2197 | 7.7% |
N | 1844 | 6.4% |
P | 1822 | 6.4% |
T | 1808 | 6.3% |
O | 1666 | 5.8% |
I | 1608 | 5.6% |
L | 1510 | 5.3% |
K | 1418 | 4.9% |
Other values (16) | 7707 |
Decimal Number
Value | Count | Frequency (%) |
2 | 21 | |
1 | 15 | |
4 | 15 | |
9 | 14 | |
7 | 11 | |
6 | 8 | 7.3% |
5 | 8 | 7.3% |
3 | 7 | 6.4% |
8 | 6 | 5.5% |
0 | 4 | 3.7% |
Space Separator
Value | Count | Frequency (%) |
(space) | 3307 |
Other Punctuation
Value | Count | Frequency (%) |
' | 7 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 28666 | |
Common | 3423 | 10.7% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
A | 3577 | |
R | 3509 | |
E | 2197 | 7.7% |
N | 1844 | 6.4% |
P | 1822 | 6.4% |
T | 1808 | 6.3% |
O | 1666 | 5.8% |
I | 1608 | 5.6% |
L | 1510 | 5.3% |
K | 1418 | 4.9% |
Other values (16) | 7707 |
Common
Value | Count | Frequency (%) |
(space) | 3307 | |
2 | 21 | 0.6% |
1 | 15 | 0.4% |
4 | 15 | 0.4% |
9 | 14 | 0.4% |
7 | 11 | 0.3% |
6 | 8 | 0.2% |
5 | 8 | 0.2% |
' | 7 | 0.2% |
3 | 7 | 0.2% |
Other values (2) | 10 | 0.3% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 32089 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
A | 3577 | 11.1% |
R | 3509 | 10.9% |
(space) | 3307 | 10.3% |
E | 2197 | 6.8% |
N | 1844 | 5.7% |
P | 1822 | 5.7% |
T | 1808 | 5.6% |
O | 1666 | 5.2% |
I | 1608 | 5.0% |
L | 1510 | 4.7% |
Other values (28) | 9241 |
Facility Type
Categorical
IMBALANCE
  MISSING
 
Distinct | 3 |
---|---|
Distinct (%) | < 0.1% |
Missing | 5217452 |
Missing (%) | 68.4% |
Memory size | 58.2 MiB |
Precinct | |
---|---|
DSNY Garage | |
School | 5995 |
Length
Max length | 11 |
---|---|
Median length | 8 |
Mean length | 8.3021117 |
Min length | 6 |
Characters and Unicode
Total characters | 20043531 |
---|---|
Distinct characters | 18 |
Distinct categories | 3 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | Precinct |
---|---|
2nd row | Precinct |
3rd row | Precinct |
4th row | Precinct |
5th row | Precinct |
Common Values
Value | Count | Frequency (%) |
Precinct | 2161151 | |
DSNY Garage | 247123 | 3.2% |
School | 5995 | 0.1% |
(Missing) | 5217452 |
Length
Common Values (Plot)
Value | Count | Frequency (%) |
precinct | 2161151 | |
dsny | 247123 | 9.3% |
garage | 247123 | 9.3% |
school | 5995 | 0.2% |
Most occurring characters
Value | Count | Frequency (%) |
c | 4328297 | |
e | 2408274 | |
r | 2408274 | |
P | 2161151 | |
i | 2161151 | |
n | 2161151 | |
t | 2161151 | |
a | 494246 | 2.5% |
S | 253118 | 1.3% |
G | 247123 | 1.2% |
Other values (8) | 1259595 | 6.3% |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 16393647 | |
Uppercase Letter | 3402761 | 17.0% |
Space Separator | 247123 | 1.2% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
c | 4328297 | |
e | 2408274 | |
r | 2408274 | |
i | 2161151 | |
n | 2161151 | |
t | 2161151 | |
a | 494246 | 3.0% |
g | 247123 | 1.5% |
o | 11990 | 0.1% |
h | 5995 | < 0.1% |
Uppercase Letter
Value | Count | Frequency (%) |
P | 2161151 | |
S | 253118 | 7.4% |
G | 247123 | 7.3% |
N | 247123 | 7.3% |
Y | 247123 | 7.3% |
D | 247123 | 7.3% |
Space Separator
Value | Count | Frequency (%) |
(space) | 247123 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 19796408 | |
Common | 247123 | 1.2% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
c | 4328297 | |
e | 2408274 | |
r | 2408274 | |
P | 2161151 | |
i | 2161151 | |
n | 2161151 | |
t | 2161151 | |
a | 494246 | 2.5% |
S | 253118 | 1.3% |
G | 247123 | 1.2% |
Other values (7) | 1012472 | 5.1% |
Common
Value | Count | Frequency (%) |
(space) | 247123 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 20043531 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
c | 4328297 | |
e | 2408274 | |
r | 2408274 | |
P | 2161151 | |
i | 2161151 | |
n | 2161151 | |
t | 2161151 | |
a | 494246 | 2.5% |
S | 253118 | 1.3% |
G | 247123 | 1.2% |
Other values (8) | 1259595 | 6.3% |
Status
Categorical
IMBALANCE
 
Distinct | 11 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 58.2 MiB |
Closed | |
---|---|
Pending | 69260 |
Open | 53839 |
Assigned | 50936 |
In Progress | 21036 |
Other values (6) | 6201 |
Length
Max length | 16 |
---|---|
Median length | 6 |
Mean length | 6.024083 |
Min length | 4 |
Characters and Unicode
Total characters | 45974121 |
---|---|
Distinct characters | 27 |
Distinct categories | 4 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | Closed |
---|---|
2nd row | Closed |
3rd row | Closed |
4th row | Closed |
5th row | Closed |
Common Values
Value | Count | Frequency (%) |
Closed | 7430449 | |
Pending | 69260 | 0.9% |
Open | 53839 | 0.7% |
Assigned | 50936 | 0.7% |
In Progress | 21036 | 0.3% |
Started | 3233 | < 0.1% |
Email Sent | 2903 | < 0.1% |
Unassigned | 26 | < 0.1% |
Closed - Testing | 19 | < 0.1% |
Draft | 13 | < 0.1% |
Length
Value | Count | Frequency (%) |
closed | 7430468 | |
pending | 69260 | 0.9% |
open | 53839 | 0.7% |
assigned | 50936 | 0.7% |
in | 21036 | 0.3% |
progress | 21036 | 0.3% |
started | 3233 | < 0.1% |
email | 2903 | < 0.1% |
sent | 2903 | < 0.1% |
unassigned | 26 | < 0.1% |
Other values (4) | 58 | < 0.1% |
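The word counts above can be reproduced from the Status value counts. The sketch below assumes words are obtained by lowercasing each value and splitting on whitespace. That rule is inferred from the numbers in this report (for example, "closed" totals 7430468 because "Closed - Testing" contributes 19 occurrences), not taken from the profiler's documentation. Only the ten values listed under Common Values are included.

```python
from collections import Counter

# Value counts copied from the Status "Common Values" table (ten listed values).
status_counts = {
    "Closed": 7_430_449, "Pending": 69_260, "Open": 53_839, "Assigned": 50_936,
    "In Progress": 21_036, "Started": 3_233, "Email Sent": 2_903,
    "Unassigned": 26, "Closed - Testing": 19, "Draft": 13,
}

word_counts = Counter()
for value, count in status_counts.items():
    # Assumption: words are lowercased and split on whitespace.
    for word in value.lower().split():
        word_counts[word] += count

for word, count in word_counts.most_common(5):
    print(word, count)
```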
Most occurring characters
Value | Count | Frequency (%) |
e | 7631734 | |
s | 7574490 | |
d | 7553930 | |
o | 7451504 | |
l | 7433371 | |
C | 7430468 | |
n | 267312 | 0.6% |
g | 141277 | 0.3% |
i | 123158 | 0.3% |
P | 90296 | 0.2% |
Other values (17) | 276581 | 0.6% |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 38294446 | |
Uppercase Letter | 7655679 | 16.7% |
Space Separator | 23977 | 0.1% |
Dash Punctuation | 19 | < 0.1% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
e | 7631734 | |
s | 7574490 | |
d | 7553930 | |
o | 7451504 | |
l | 7433371 | |
n | 267312 | 0.7% |
g | 141277 | 0.4% |
i | 123158 | 0.3% |
p | 53846 | 0.1% |
r | 45318 | 0.1% |
Other values (5) | 18506 | < 0.1% |
Uppercase Letter
Value | Count | Frequency (%) |
C | 7430468 | |
P | 90296 | 1.2% |
O | 53839 | 0.7% |
A | 50936 | 0.7% |
I | 21036 | 0.3% |
S | 6136 | 0.1% |
E | 2903 | < 0.1% |
U | 33 | < 0.1% |
T | 19 | < 0.1% |
D | 13 | < 0.1% |
Space Separator
Value | Count | Frequency (%) |
(space) | 23977 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 19 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 45950125 | |
Common | 23996 | 0.1% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
e | 7631734 | |
s | 7574490 | |
d | 7553930 | |
o | 7451504 | |
l | 7433371 | |
C | 7430468 | |
n | 267312 | 0.6% |
g | 141277 | 0.3% |
i | 123158 | 0.3% |
P | 90296 | 0.2% |
Other values (15) | 252585 | 0.5% |
Common
Value | Count | Frequency (%) |
(space) | 23977 | |
- | 19 | 0.1% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 45974121 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
e | 7631734 | |
s | 7574490 | |
d | 7553930 | |
o | 7451504 | |
l | 7433371 | |
C | 7430468 | |
n | 267312 | 0.6% |
g | 141277 | 0.3% |
i | 123158 | 0.3% |
P | 90296 | 0.2% |
Other values (17) | 276581 | 0.6% |
Community Board
Text
Distinct | 81 |
---|---|
Distinct (%) | < 0.1% |
Missing | 2238 |
Missing (%) | < 0.1% |
Memory size | 58.2 MiB |
Length
Max length | 801 |
---|---|
Median length | 758 |
Mean length | 10.992296 |
Min length | 4 |
Characters and Unicode
Total characters | 83865533 |
---|---|
Distinct characters | 67 |
Distinct categories | 9 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 4 |
---|---|
Unique (%) | < 0.1% |
Sample
1st row | 01 BROOKLYN |
---|---|
2nd row | 03 STATEN ISLAND |
3rd row | 11 QUEENS |
4th row | 12 BRONX |
5th row | 06 BROOKLYN |
Value | Count | Frequency (%) |
brooklyn | 2373911 | |
queens | 1814028 | 11.6% |
manhattan | 1532407 | 9.8% |
bronx | 1384810 | 8.8% |
12 | 668515 | 4.3% |
01 | 631716 | 4.0% |
03 | 604545 | 3.9% |
05 | 586667 | 3.7% |
unspecified | 564638 | 3.6% |
07 | 545379 | 3.5% |
Other values (149) | 4950638 |
Most occurring characters
Value | Count | Frequency (%) |
N | 9433879 | 11.2% |
(space) | 8027768 | 9.6% |
O | 6132656 | 7.3% |
A | 5393529 | 6.4% |
0 | 5199786 | 6.2% |
E | 4026263 | 4.8% |
T | 3861112 | 4.6% |
R | 3758740 | 4.5% |
B | 3758728 | 4.5% |
1 | 3481625 | 4.2% |
Other values (57) | 30791447 |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 55933885 | |
Decimal Number | 14256489 | 17.0% |
Space Separator | 8027768 | 9.6% |
Lowercase Letter | 5647040 | 6.7% |
Other Punctuation | 314 | < 0.1% |
Dash Punctuation | 15 | < 0.1% |
Open Punctuation | 11 | < 0.1% |
Close Punctuation | 8 | < 0.1% |
Control | 3 | < 0.1% |
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
N | 9433879 | |
O | 6132656 | |
A | 5393529 | |
E | 4026263 | 7.2% |
T | 3861112 | 6.9% |
R | 3758740 | 6.7% |
B | 3758728 | 6.7% |
L | 2772056 | 5.0% |
S | 2610335 | 4.7% |
U | 2378689 | 4.3% |
Other values (14) | 11807898 |
Lowercase Letter
Value | Count | Frequency (%) |
e | 1129366 | |
i | 1129329 | |
n | 564690 | |
c | 564677 | |
d | 564670 | |
s | 564665 | |
p | 564659 | |
f | 564655 | |
o | 66 | < 0.1% |
r | 49 | < 0.1% |
Other values (12) | 214 | < 0.1% |
Decimal Number
Value | Count | Frequency (%) |
0 | 5199786 | |
1 | 3481625 | |
2 | 1133111 | 7.9% |
3 | 796173 | 5.6% |
4 | 726700 | 5.1% |
5 | 708310 | 5.0% |
7 | 706904 | 5.0% |
8 | 574226 | 4.0% |
9 | 501247 | 3.5% |
6 | 428407 | 3.0% |
Other Punctuation
Value | Count | Frequency (%) |
, | 174 | |
" | 59 | 18.8% |
/ | 30 | 9.6% |
: | 28 | 8.9% |
. | 20 | 6.4% |
\ | 3 | 1.0% |
Space Separator
Value | Count | Frequency (%) |
(space) | 8027768 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 15 |
Open Punctuation
Value | Count | Frequency (%) |
( | 11 |
Close Punctuation
Value | Count | Frequency (%) |
) | 8 |
Control
Value | Count | Frequency (%) |
(control character) | 3 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 61580925 | |
Common | 22284608 | 26.6% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
N | 9433879 | |
O | 6132656 | 10.0% |
A | 5393529 | 8.8% |
E | 4026263 | 6.5% |
T | 3861112 | 6.3% |
R | 3758740 | 6.1% |
B | 3758728 | 6.1% |
L | 2772056 | 4.5% |
S | 2610335 | 4.2% |
U | 2378689 | 3.9% |
Other values (36) | 17454938 |
Common
Value | Count | Frequency (%) |
(space) | 8027768 | |
0 | 5199786 | |
1 | 3481625 | |
2 | 1133111 | 5.1% |
3 | 796173 | 3.6% |
4 | 726700 | 3.3% |
5 | 708310 | 3.2% |
7 | 706904 | 3.2% |
8 | 574226 | 2.6% |
9 | 501247 | 2.2% |
Other values (11) | 428758 | 1.9% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 83865533 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
N | 9433879 | 11.2% |
(space) | 8027768 | 9.6% |
O | 6132656 | 7.3% |
A | 5393529 | 6.4% |
0 | 5199786 | 6.2% |
E | 4026263 | 4.8% |
T | 3861112 | 4.6% |
R | 3758740 | 4.5% |
B | 3758728 | 4.5% |
1 | 3481625 | 4.2% |
Other values (57) | 30791447 |
Borough
Categorical
Distinct | 8 |
---|---|
Distinct (%) | < 0.1% |
Missing | 2238 |
Missing (%) | < 0.1% |
Memory size | 58.2 MiB |
BROOKLYN | 2373911 |
---|---|
QUEENS | 1811212 |
MANHATTAN | 1537823 |
BRONX | 1382211 |
STATEN ISLAND | 398135 |
Other values (3) | 126191 |
Length
Max length | 13 |
---|---|
Median length | 11 |
Mean length | 7.4938049 |
Min length | 4 |
Characters and Unicode
Total characters | 57173857 |
---|---|
Distinct characters | 32 |
Distinct categories | 4 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 1 |
---|---|
Unique (%) | < 0.1% |
Sample
1st row | BROOKLYN |
---|---|
2nd row | STATEN ISLAND |
3rd row | QUEENS |
4th row | BRONX |
5th row | BROOKLYN |
Common Values
Value | Count | Frequency (%) |
BROOKLYN | 2373911 | 31.1% |
QUEENS | 1811212 | 23.7% |
MANHATTAN | 1537823 | 20.2% |
BRONX | 1382211 | 18.1% |
STATEN ISLAND | 398135 | 5.2% |
Unspecified | 126188 | 1.7% |
2016 | 2 | < 0.1% |
2017 | 1 | < 0.1% |
(Missing) | 2238 | < 0.1% |
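The values `2016` and `2017` above are not borough names and may indicate a few misaligned rows in the source file. One possible check is sketched below with only the standard library. The expected label set is an assumption taken from this table, and `unexpected_values` and the sample list are hypothetical.

```python
from collections import Counter

# Assumed set of legitimate labels, copied from the Common Values table.
EXPECTED_BOROUGHS = {
    "BROOKLYN", "QUEENS", "MANHATTAN", "BRONX", "STATEN ISLAND", "Unspecified",
}


def unexpected_values(values, expected):
    """Count non-missing values that fall outside the expected label set."""
    return Counter(v for v in values if v is not None and v not in expected)


# Hypothetical sample, not taken from the dataset.
sample = ["BROOKLYN", "2016", None, "QUEENS", "2017", "2016", "Unspecified"]
stray = unexpected_values(sample, EXPECTED_BOROUGHS)
print(stray)
```

Rows flagged this way would still need manual inspection, since the sketch only reports that a label is unexpected, not why.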
Words
Value | Count | Frequency (%) |
brooklyn | 2373911 | |
queens | 1811212 | |
manhattan | 1537823 | |
bronx | 1382211 | |
staten | 398135 | 5.0% |
island | 398135 | 5.0% |
unspecified | 126188 | 1.6% |
2016 | 2 | < 0.1% |
2017 | 1 | < 0.1% |
Most occurring characters
Value | Count | Frequency (%) |
N | 9439250 | |
O | 6130033 | |
A | 5409739 | |
E | 4020559 | 7.0% |
T | 3871916 | 6.8% |
B | 3756122 | 6.6% |
R | 3756122 | 6.6% |
L | 2772046 | 4.8% |
S | 2607482 | 4.6% |
Y | 2373911 | 4.2% |
Other values (22) | 13036677 |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 55513830 | |
Lowercase Letter | 1261880 | 2.2% |
Space Separator | 398135 | 0.7% |
Decimal Number | 12 | < 0.1% |
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
N | 9439250 | |
O | 6130033 | |
A | 5409739 | |
E | 4020559 | 7.2% |
T | 3871916 | 7.0% |
B | 3756122 | 6.8% |
R | 3756122 | 6.8% |
L | 2772046 | 5.0% |
S | 2607482 | 4.7% |
Y | 2373911 | 4.3% |
Other values (8) | 11376650 |
Lowercase Letter
Value | Count | Frequency (%) |
e | 252376 | |
i | 252376 | |
n | 126188 | |
s | 126188 | |
p | 126188 | |
c | 126188 | |
f | 126188 | |
d | 126188 |
Decimal Number
Value | Count | Frequency (%) |
2 | 3 | |
0 | 3 | |
1 | 3 | |
6 | 2 | |
7 | 1 | 8.3% |
Space Separator
Value | Count | Frequency (%) |
(space) | 398135 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 56775710 | |
Common | 398147 | 0.7% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
N | 9439250 | |
O | 6130033 | |
A | 5409739 | |
E | 4020559 | 7.1% |
T | 3871916 | 6.8% |
B | 3756122 | 6.6% |
R | 3756122 | 6.6% |
L | 2772046 | 4.9% |
S | 2607482 | 4.6% |
Y | 2373911 | 4.2% |
Other values (16) | 12638530 |
Common
Value | Count | Frequency (%) |
(space) | 398135 | |
2 | 3 | < 0.1% |
0 | 3 | < 0.1% |
1 | 3 | < 0.1% |
6 | 2 | < 0.1% |
7 | 1 | < 0.1% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 57173857 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
N | 9439250 | |
O | 6130033 | |
A | 5409739 | |
E | 4020559 | 7.0% |
T | 3871916 | 6.8% |
B | 3756122 | 6.6% |
R | 3756122 | 6.6% |
L | 2772046 | 4.8% |
S | 2607482 | 4.6% |
Y | 2373911 | 4.2% |
Other values (22) | 13036677 |
Open Data Channel Type
Categorical
Distinct | 5 |
---|---|
Distinct (%) | < 0.1% |
Missing | 3 |
Missing (%) | < 0.1% |
Memory size | 58.2 MiB |
PHONE | 4054186 |
---|---|
ONLINE | 1597490 |
UNKNOWN | 1051355 |
MOBILE | 838350 |
OTHER | 90337 |
Length
Max length | 7 |
---|---|
Median length | 5 |
Mean length | 5.5946957 |
Min length | 5 |
Characters and Unicode
Total characters | 42697140 |
---|---|
Distinct characters | 14 |
Distinct categories | 1 |
Distinct scripts | 1 |
Distinct blocks | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | PHONE |
---|---|
2nd row | PHONE |
3rd row | PHONE |
4th row | PHONE |
5th row | PHONE |
Common Values
Value | Count | Frequency (%) |
PHONE | 4054186 | 53.1% |
ONLINE | 1597490 | 20.9% |
UNKNOWN | 1051355 | 13.8% |
MOBILE | 838350 | 11.0% |
OTHER | 90337 | 1.2% |
(Missing) | 3 | < 0.1% |
Words
Value | Count | Frequency (%) |
phone | 4054186 | |
online | 1597490 | 20.9% |
unknown | 1051355 | 13.8% |
mobile | 838350 | 11.0% |
other | 90337 | 1.2% |
Most occurring characters
Value | Count | Frequency (%) |
N | 10403231 | |
O | 7631718 | |
E | 6580363 | |
H | 4144523 | 9.7% |
P | 4054186 | 9.5% |
L | 2435840 | 5.7% |
I | 2435840 | 5.7% |
U | 1051355 | 2.5% |
K | 1051355 | 2.5% |
W | 1051355 | 2.5% |
Other values (4) | 1857374 | 4.4% |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 42697140 |
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
N | 10403231 | |
O | 7631718 | |
E | 6580363 | |
H | 4144523 | 9.7% |
P | 4054186 | 9.5% |
L | 2435840 | 5.7% |
I | 2435840 | 5.7% |
U | 1051355 | 2.5% |
K | 1051355 | 2.5% |
W | 1051355 | 2.5% |
Other values (4) | 1857374 | 4.4% |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 42697140 |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
N | 10403231 | |
O | 7631718 | |
E | 6580363 | |
H | 4144523 | 9.7% |
P | 4054186 | 9.5% |
L | 2435840 | 5.7% |
I | 2435840 | 5.7% |
U | 1051355 | 2.5% |
K | 1051355 | 2.5% |
W | 1051355 | 2.5% |
Other values (4) | 1857374 | 4.4% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 42697140 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
N | 10403231 | |
O | 7631718 | |
E | 6580363 | |
H | 4144523 | 9.7% |
P | 4054186 | 9.5% |
L | 2435840 | 5.7% |
I | 2435840 | 5.7% |
U | 1051355 | 2.5% |
K | 1051355 | 2.5% |
W | 1051355 | 2.5% |
Other values (4) | 1857374 | 4.4% |
Park Facility Name
Text
Distinct | 3056 |
---|---|
Distinct (%) | < 0.1% |
Missing | 3 |
Missing (%) | < 0.1% |
Memory size | 58.2 MiB |
Length
Max length | 95 |
---|---|
Median length | 11 |
Mean length | 11.09074 |
Min length | 6 |
Characters and Unicode
Total characters | 84641403 |
---|---|
Distinct characters | 70 |
Distinct categories | 8 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 466 |
---|---|
Unique (%) | < 0.1% |
Sample
1st row | Unspecified |
---|---|
2nd row | Unspecified |
3rd row | Unspecified |
4th row | Unspecified |
5th row | Unspecified |
Value | Count | Frequency (%) |
unspecified | 7572739 | |
park | 33971 | 0.4% |
17810 | 0.2% | |
playground | 15042 | 0.2% |
school | 9027 | 0.1% |
ps | 4639 | 0.1% |
center | 2864 | < 0.1% |
central | 2553 | < 0.1% |
beach | 2406 | < 0.1% |
and | 2272 | < 0.1% |
Other values (2963) | 128820 | 1.7% |
Most occurring characters
Value | Count | Frequency (%) |
e | 15230538 | |
i | 15191095 | |
n | 7648417 | |
d | 7613055 | |
s | 7605818 | |
c | 7601552 | |
p | 7579073 | |
f | 7577850 | |
U | 7573384 | |
(space) | 160559 | 0.2% |
Other values (60) | 860062 | 1.0% |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 76671347 | |
Uppercase Letter | 7772935 | 9.2% |
Space Separator | 160559 | 0.2% |
Dash Punctuation | 18132 | < 0.1% |
Decimal Number | 16407 | < 0.1% |
Other Punctuation | 1751 | < 0.1% |
Open Punctuation | 136 | < 0.1% |
Close Punctuation | 136 | < 0.1% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
e | 15230538 | |
i | 15191095 | |
n | 7648417 | |
d | 7613055 | |
s | 7605818 | |
c | 7601552 | |
p | 7579073 | |
f | 7577850 | |
a | 114472 | 0.1% |
r | 113107 | 0.1% |
Other values (16) | 396370 | 0.5% |
Uppercase Letter
Value | Count | Frequency (%) |
U | 7573384 | |
P | 62086 | 0.8% |
S | 26063 | 0.3% |
C | 16200 | 0.2% |
B | 11219 | 0.1% |
M | 9980 | 0.1% |
H | 9643 | 0.1% |
R | 9623 | 0.1% |
F | 6628 | 0.1% |
A | 6616 | 0.1% |
Other values (16) | 41493 | 0.5% |
Decimal Number
Value | Count | Frequency (%) |
1 | 3798 | |
2 | 2067 | |
3 | 1611 | |
7 | 1548 | |
4 | 1426 | 8.7% |
9 | 1302 | 7.9% |
0 | 1266 | 7.7% |
8 | 1263 | 7.7% |
6 | 1077 | 6.6% |
5 | 1049 | 6.4% |
Other Punctuation
Value | Count | Frequency (%) |
' | 1246 | |
. | 394 | 22.5% |
, | 71 | 4.1% |
: | 40 | 2.3% |
Space Separator
Value | Count | Frequency (%) |
(space) | 160559 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 18132 |
Open Punctuation
Value | Count | Frequency (%) |
( | 136 |
Close Punctuation
Value | Count | Frequency (%) |
) | 136 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 84444282 | |
Common | 197121 | 0.2% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
e | 15230538 | |
i | 15191095 | |
n | 7648417 | |
d | 7613055 | |
s | 7605818 | |
c | 7601552 | |
p | 7579073 | |
f | 7577850 | |
U | 7573384 | |
a | 114472 | 0.1% |
Other values (42) | 709028 | 0.8% |
Common
Value | Count | Frequency (%) |
(space) | 160559 | |
- | 18132 | 9.2% |
1 | 3798 | 1.9% |
2 | 2067 | 1.0% |
3 | 1611 | 0.8% |
7 | 1548 | 0.8% |
4 | 1426 | 0.7% |
9 | 1302 | 0.7% |
0 | 1266 | 0.6% |
8 | 1263 | 0.6% |
Other values (8) | 4149 | 2.1% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 84641403 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
e | 15230538 | |
i | 15191095 | |
n | 7648417 | |
d | 7613055 | |
s | 7605818 | |
c | 7601552 | |
p | 7579073 | |
f | 7577850 | |
U | 7573384 | |
(space) | 160559 | 0.2% |
Other values (60) | 860062 | 1.0% |
Park Borough
Categorical
Distinct | 6 |
---|---|
Distinct (%) | < 0.1% |
Missing | 2241 |
Missing (%) | < 0.1% |
Memory size | 58.2 MiB |
BROOKLYN | 2373911 |
---|---|
QUEENS | 1811212 |
MANHATTAN | 1537823 |
BRONX | 1382211 |
STATEN ISLAND | 398135 |
Length
Max length | 13 |
---|---|
Median length | 11 |
Mean length | 7.4938063 |
Min length | 5 |
Characters and Unicode
Total characters | 57173845 |
---|---|
Distinct characters | 27 |
Distinct categories | 3 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | BROOKLYN |
---|---|
2nd row | STATEN ISLAND |
3rd row | QUEENS |
4th row | BRONX |
5th row | BROOKLYN |
Common Values
Value | Count | Frequency (%) |
BROOKLYN | 2373911 | 31.1% |
QUEENS | 1811212 | 23.7% |
MANHATTAN | 1537823 | 20.2% |
BRONX | 1382211 | 18.1% |
STATEN ISLAND | 398135 | 5.2% |
Unspecified | 126188 | 1.7% |
(Missing) | 2241 | < 0.1% |
Words
Value | Count | Frequency (%) |
brooklyn | 2373911 | |
queens | 1811212 | |
manhattan | 1537823 | |
bronx | 1382211 | |
staten | 398135 | 5.0% |
island | 398135 | 5.0% |
unspecified | 126188 | 1.6% |
Most occurring characters
Value | Count | Frequency (%) |
N | 9439250 | |
O | 6130033 | |
A | 5409739 | |
E | 4020559 | 7.0% |
T | 3871916 | 6.8% |
B | 3756122 | 6.6% |
R | 3756122 | 6.6% |
L | 2772046 | 4.8% |
S | 2607482 | 4.6% |
K | 2373911 | 4.2% |
Other values (17) | 13036665 |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 55513830 | |
Lowercase Letter | 1261880 | 2.2% |
Space Separator | 398135 | 0.7% |
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
N | 9439250 | |
O | 6130033 | |
A | 5409739 | |
E | 4020559 | 7.2% |
T | 3871916 | 7.0% |
B | 3756122 | 6.8% |
R | 3756122 | 6.8% |
L | 2772046 | 5.0% |
S | 2607482 | 4.7% |
K | 2373911 | 4.3% |
Other values (8) | 11376650 |
Lowercase Letter
Value | Count | Frequency (%) |
e | 252376 | |
i | 252376 | |
n | 126188 | |
s | 126188 | |
p | 126188 | |
c | 126188 | |
f | 126188 | |
d | 126188 |
Space Separator
Value | Count | Frequency (%) |
(space) | 398135 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 56775710 | |
Common | 398135 | 0.7% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
N | 9439250 | |
O | 6130033 | |
A | 5409739 | |
E | 4020559 | 7.1% |
T | 3871916 | 6.8% |
B | 3756122 | 6.6% |
R | 3756122 | 6.6% |
L | 2772046 | 4.9% |
S | 2607482 | 4.6% |
K | 2373911 | 4.2% |
Other values (16) | 12638530 |
Common
Value | Count | Frequency (%) |
(space) | 398135 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 57173845 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
N | 9439250 | |
O | 6130033 | |
A | 5409739 | |
E | 4020559 | 7.0% |
T | 3871916 | 6.8% |
B | 3756122 | 6.6% |
R | 3756122 | 6.6% |
L | 2772046 | 4.8% |
S | 2607482 | 4.6% |
K | 2373911 | 4.2% |
Other values (17) | 13036665 |
Vehicle Type
Categorical
IMBALANCE
  MISSING
 
Distinct | 4 |
---|---|
Distinct (%) | 2.6% |
Missing | 7631568 |
Missing (%) | > 99.9% |
Memory size | 58.2 MiB |
Car Service | 140 |
---|---|
Green Taxi | 10 |
Commuter Van | 2 |
Ambulette / Paratransit | 1 |
Length
Max length | 23 |
---|---|
Median length | 11 |
Mean length | 11.026144 |
Min length | 10 |
Characters and Unicode
Total characters | 1687 |
---|---|
Distinct characters | 24 |
Distinct categories | 4 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 1 |
---|---|
Unique (%) | 0.7% |
Sample
1st row | Car Service |
---|---|
2nd row | Green Taxi |
3rd row | Car Service |
4th row | Car Service |
5th row | Car Service |
Common Values
Value | Count | Frequency (%) |
Car Service | 140 | < 0.1% |
Green Taxi | 10 | < 0.1% |
Commuter Van | 2 | < 0.1% |
Ambulette / Paratransit | 1 | < 0.1% |
(Missing) | 7631568 | > 99.9% |
Words
Value | Count | Frequency (%) |
car | 140 | |
service | 140 | |
green | 10 | 3.3% |
taxi | 10 | 3.3% |
commuter | 2 | 0.7% |
van | 2 | 0.7% |
ambulette | 1 | 0.3% |
/ | 1 | 0.3% |
paratransit | 1 | 0.3% |
Most occurring characters
Value | Count | Frequency (%) |
e | 304 | |
r | 294 | |
a | 155 | |
(space) | 154 | |
i | 151 | |
C | 142 | |
S | 140 | |
v | 140 | |
c | 140 | |
n | 13 | 0.8% |
Other values (14) | 54 | 3.2% |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 1226 | |
Uppercase Letter | 306 | 18.1% |
Space Separator | 154 | 9.1% |
Other Punctuation | 1 | 0.1% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
e | 304 | |
r | 294 | |
a | 155 | |
i | 151 | |
v | 140 | |
c | 140 | |
n | 13 | 1.1% |
x | 10 | 0.8% |
t | 6 | 0.5% |
m | 5 | 0.4% |
Other values (5) | 8 | 0.7% |
Uppercase Letter
Value | Count | Frequency (%) |
C | 142 | |
S | 140 | |
T | 10 | 3.3% |
G | 10 | 3.3% |
V | 2 | 0.7% |
A | 1 | 0.3% |
P | 1 | 0.3% |
Space Separator
Value | Count | Frequency (%) |
(space) | 154 |
Other Punctuation
Value | Count | Frequency (%) |
/ | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 1532 | |
Common | 155 | 9.2% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
e | 304 | |
r | 294 | |
a | 155 | |
i | 151 | |
C | 142 | |
S | 140 | |
v | 140 | |
c | 140 | |
n | 13 | 0.8% |
T | 10 | 0.7% |
Other values (12) | 43 | 2.8% |
Common
Value | Count | Frequency (%) |
(space) | 154 | |
/ | 1 | 0.6% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 1687 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
e | 304 | |
r | 294 | |
a | 155 | |
(space) | 154 | |
i | 151 | |
C | 142 | |
S | 140 | |
v | 140 | |
c | 140 | |
n | 13 | 0.8% |
Other values (14) | 54 | 3.2% |
Taxi Company Borough
Categorical
MISSING
 
Distinct | 5 |
---|---|
Distinct (%) | 0.1% |
Missing | 7625165 |
Missing (%) | 99.9% |
Memory size | 58.2 MiB |
MANHATTAN | 1928 |
---|---|
BROOKLYN | 1638 |
QUEENS | 1506 |
BRONX | 1194 |
STATEN ISLAND | 290 |
Length
Max length | 13 |
---|---|
Median length | 9 |
Mean length | 7.509457 |
Min length | 5 |
Characters and Unicode
Total characters | 49232 |
---|---|
Distinct characters | 19 |
Distinct categories | 2 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | BRONX |
---|---|
2nd row | BROOKLYN |
3rd row | STATEN ISLAND |
4th row | QUEENS |
5th row | MANHATTAN |
Common Values
Value | Count | Frequency (%) |
MANHATTAN | 1928 | < 0.1% |
BROOKLYN | 1638 | < 0.1% |
QUEENS | 1506 | < 0.1% |
BRONX | 1194 | < 0.1% |
STATEN ISLAND | 290 | < 0.1% |
(Missing) | 7625165 | 99.9% |
Words
Value | Count | Frequency (%) |
manhattan | 1928 | |
brooklyn | 1638 | |
queens | 1506 | |
bronx | 1194 | |
staten | 290 | 4.2% |
island | 290 | 4.2% |
Most occurring characters
Value | Count | Frequency (%) |
N | 8774 | |
A | 6364 | |
O | 4470 | |
T | 4436 | |
E | 3302 | 6.7% |
B | 2832 | 5.8% |
R | 2832 | 5.8% |
S | 2086 | 4.2% |
M | 1928 | 3.9% |
L | 1928 | 3.9% |
Other values (9) | 10280 |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 48942 | |
Space Separator | 290 | 0.6% |
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
N | 8774 | |
A | 6364 | |
O | 4470 | |
T | 4436 | |
E | 3302 | 6.7% |
B | 2832 | 5.8% |
R | 2832 | 5.8% |
S | 2086 | 4.3% |
M | 1928 | 3.9% |
L | 1928 | 3.9% |
Other values (8) | 9990 |
Space Separator
Value | Count | Frequency (%) |
(space) | 290 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 48942 | |
Common | 290 | 0.6% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
N | 8774 | |
A | 6364 | |
O | 4470 | |
T | 4436 | |
E | 3302 | 6.7% |
B | 2832 | 5.8% |
R | 2832 | 5.8% |
S | 2086 | 4.3% |
M | 1928 | 3.9% |
L | 1928 | 3.9% |
Other values (8) | 9990 |
Common
Value | Count | Frequency (%) |
(space) | 290 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 49232 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
N | 8774 | |
A | 6364 | |
O | 4470 | |
T | 4436 | |
E | 3302 | 6.7% |
B | 2832 | 5.8% |
R | 2832 | 5.8% |
S | 2086 | 4.2% |
M | 1928 | 3.9% |
L | 1928 | 3.9% |
Other values (9) | 10280 |
Taxi Pick Up Location
Text
MISSING
 
Distinct | 2227 |
---|---|
Distinct (%) | 5.7% |
Missing | 7592481 |
Missing (%) | 99.5% |
Memory size | 58.2 MiB |
Length
Max length | 55 |
---|---|
Median length | 5 |
Mean length | 8.825739 |
Min length | 5 |
Characters and Unicode
Total characters | 346322 |
---|---|
Distinct characters | 58 |
Distinct categories | 6 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 1955 |
---|---|
Unique (%) | 5.0% |
Sample
1st row | JFK Airport |
---|---|
2nd row | Other |
3rd row | La Guardia Airport |
4th row | Other |
5th row | Other |
Value | Count | Frequency (%) |
other | 28137 | |
airport | 6679 | 10.7% |
jfk | 4198 | 6.7% |
la | 2482 | 4.0% |
guardia | 2482 | 4.0% |
street | 1909 | 3.1% |
avenue | 1376 | 2.2% |
manhattan | 1031 | 1.7% |
station | 1009 | 1.6% |
and | 973 | 1.6% |
Other values (1441) | 12151 |
Most occurring characters
Value | Count | Frequency (%) |
r | 46572 | |
t | 38616 | 11.2% |
e | 30493 | 8.8% |
O | 30393 | 8.8% |
h | 28471 | 8.2% |
(space) | 24708 | 7.1% |
A | 15424 | 4.5% |
i | 11034 | 3.2% |
E | 10101 | 2.9% |
a | 9565 | 2.8% |
Other values (48) | 100945 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 193734 | |
Uppercase Letter | 118624 | |
Space Separator | 24708 | 7.1% |
Decimal Number | 8434 | 2.4% |
Dash Punctuation | 805 | 0.2% |
Other Punctuation | 17 | < 0.1% |
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
O | 30393 | |
A | 15424 | |
E | 10101 | 8.5% |
T | 8814 | 7.4% |
N | 7672 | 6.5% |
S | 5426 | 4.6% |
K | 4877 | 4.1% |
F | 4502 | 3.8% |
J | 4235 | 3.6% |
R | 3833 | 3.2% |
Other values (16) | 23347 |
Lowercase Letter
Value | Count | Frequency (%) |
r | 46572 | |
t | 38616 | |
e | 30493 | |
h | 28471 | |
i | 11034 | 5.7% |
a | 9565 | 4.9% |
o | 9173 | 4.7% |
p | 6679 | 3.4% |
n | 3753 | 1.9% |
u | 3150 | 1.6% |
Other values (8) | 6228 | 3.2% |
Decimal Number
Value | Count | Frequency (%) |
1 | 1986 | |
2 | 1114 | |
3 | 906 | |
0 | 894 | |
5 | 743 | 8.8% |
4 | 739 | 8.8% |
7 | 587 | 7.0% |
6 | 574 | 6.8% |
8 | 472 | 5.6% |
9 | 419 | 5.0% |
Other Punctuation
Value | Count | Frequency (%) |
/ | 14 | |
. | 3 | 17.6% |
Space Separator
Value | Count | Frequency (%) |
(space) | 24708 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 805 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 312358 | |
Common | 33964 | 9.8% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
r | 46572 | |
t | 38616 | |
e | 30493 | 9.8% |
O | 30393 | 9.7% |
h | 28471 | 9.1% |
A | 15424 | 4.9% |
i | 11034 | 3.5% |
E | 10101 | 3.2% |
a | 9565 | 3.1% |
o | 9173 | 2.9% |
Other values (34) | 82516 |
Common
Value | Count | Frequency (%) |
(space) | 24708 | |
1 | 1986 | 5.8% |
2 | 1114 | 3.3% |
3 | 906 | 2.7% |
0 | 894 | 2.6% |
- | 805 | 2.4% |
5 | 743 | 2.2% |
4 | 739 | 2.2% |
7 | 587 | 1.7% |
6 | 574 | 1.7% |
Other values (4) | 908 | 2.7% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 346322 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
r | 46572 | |
t | 38616 | 11.2% |
e | 30493 | 8.8% |
O | 30393 | 8.8% |
h | 28471 | 8.2% |
(space) | 24708 | 7.1% |
A | 15424 | 4.5% |
i | 11034 | 3.2% |
E | 10101 | 2.9% |
a | 9565 | 2.8% |
Other values (48) | 100945 |
Bridge Highway Name
Text
MISSING
 
Distinct | 81 |
---|---|
Distinct (%) | 0.6% |
Missing | 7618111 |
Missing (%) | 99.8% |
Memory size | 58.2 MiB |
Length
Max length | 42 |
---|---|
Median length | 32 |
Mean length | 16.51543 |
Min length | 6 |
Characters and Unicode
Total characters | 224775 |
---|---|
Distinct characters | 63 |
Distinct categories | 6 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 6 |
---|---|
Unique (%) | < 0.1% |
Sample
1st row | Henry Hudson Pkwy/Rt 9A |
---|---|
2nd row | Grand Central Pkwy |
3rd row | Belt Pkwy |
4th row | BQE/Gowanus Expwy |
5th row | Henry Hudson Pkwy/Rt 9A |
Value | Count | Frequency (%) |
expwy | 6319 | 16.7% |
pkwy | 3699 | 9.8% |
island | 2122 | 5.6% |
bqe/gowanus | 1526 | 4.0% |
dr | 1316 | 3.5% |
belt | 1251 | 3.3% |
cross | 1221 | 3.2% |
br | 1151 | 3.1% |
central | 1042 | 2.8% |
grand | 1011 | 2.7% |
Other values (127) | 17069 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 24117 | 10.7% |
n | 14736 | 6.6% |
r | 13080 | 5.8% |
w | 12840 | 5.7% |
y | 12601 | 5.6% |
a | 11602 | 5.2% |
e | 10575 | 4.7% |
o | 9841 | 4.4% |
s | 9406 | 4.2% |
E | 7967 | 3.5% |
Other values (53) | 98010 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 149951 | |
Uppercase Letter | 45203 | 20.1% |
Space Separator | 24117 | 10.7% |
Other Punctuation | 3468 | 1.5% |
Decimal Number | 1708 | 0.8% |
Dash Punctuation | 328 | 0.1% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
n | 14736 | 9.8% |
r | 13080 | 8.7% |
w | 12840 | 8.6% |
y | 12601 | 8.4% |
a | 11602 | 7.7% |
e | 10575 | 7.1% |
o | 9841 | 6.6% |
s | 9406 | 6.3% |
t | 7313 | 4.9% |
x | 7261 | 4.8% |
Other values (15) | 40696 |
Uppercase Letter
Value | Count | Frequency (%) |
E | 7967 | |
B | 5468 | |
P | 4950 | |
R | 3159 | 7.0% |
D | 2926 | 6.5% |
G | 2601 | 5.8% |
C | 2542 | 5.6% |
I | 2460 | 5.4% |
W | 1935 | 4.3% |
Q | 1775 | 3.9% |
Other values (14) | 9420 |
Decimal Number
Value | Count | Frequency (%) |
9 | 816 | |
5 | 328 | |
1 | 246 | 14.4% |
2 | 91 | 5.3% |
4 | 76 | 4.4% |
8 | 56 | 3.3% |
3 | 54 | 3.2% |
0 | 22 | 1.3% |
7 | 19 | 1.1% |
Other Punctuation
Value | Count | Frequency (%) |
/ | 3191 | |
. | 235 | 6.8% |
, | 42 | 1.2% |
Space Separator
Value | Count | Frequency (%) |
(space) | 24117 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 328 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 195154 | |
Common | 29621 | 13.2% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
n | 14736 | 7.6% |
r | 13080 | 6.7% |
w | 12840 | 6.6% |
y | 12601 | 6.5% |
a | 11602 | 5.9% |
e | 10575 | 5.4% |
o | 9841 | 5.0% |
s | 9406 | 4.8% |
E | 7967 | 4.1% |
t | 7313 | 3.7% |
Other values (39) | 85193 |
Common
Value | Count | Frequency (%) |
(space) | 24117 | |
/ | 3191 | 10.8% |
9 | 816 | 2.8% |
- | 328 | 1.1% |
5 | 328 | 1.1% |
1 | 246 | 0.8% |
. | 235 | 0.8% |
2 | 91 | 0.3% |
4 | 76 | 0.3% |
8 | 56 | 0.2% |
Other values (4) | 137 | 0.5% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 224775 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 24117 | 10.7% |
n | 14736 | 6.6% |
r | 13080 | 5.8% |
w | 12840 | 5.7% |
y | 12601 | 5.6% |
a | 11602 | 5.2% |
e | 10575 | 4.7% |
o | 9841 | 4.4% |
s | 9406 | 4.2% |
E | 7967 | 3.5% |
Other values (53) | 98010 |
Bridge Highway Direction
Categorical
MISSING
 
Distinct | 50 |
---|---|
Distinct (%) | 0.4% |
Missing | 7618125 |
Missing (%) | 99.8% |
Memory size | 58.2 MiB |
North/Bronx Bound | 1306 |
---|---|
East/Long Island Bound | 1182 |
East/Queens Bound | 985 |
West/Staten Island Bound | 709 |
West/Brooklyn Bound | 596 |
Other values (45) | 8818 |
Length
Max length | 30 |
---|---|
Median length | 24 |
Mean length | 19.149382 |
Min length | 9 |
Characters and Unicode
Total characters | 260355 |
---|---|
Distinct characters | 47 |
Distinct categories | 6 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 1 |
---|---|
Unique (%) | < 0.1% |
Sample
1st row | North/Bronx Bound |
---|---|
2nd row | East/Long Island Bound |
3rd row | East/Queens Bound |
4th row | East/Bronx Bound |
5th row | South/Downtown |
Common Values
Value | Count | Frequency (%) |
North/Bronx Bound | 1306 | < 0.1% |
East/Long Island Bound | 1182 | < 0.1% |
East/Queens Bound | 985 | < 0.1% |
West/Staten Island Bound | 709 | < 0.1% |
West/Brooklyn Bound | 596 | < 0.1% |
West/Manhattan Bound | 493 | < 0.1% |
West/Toward Triborough Br | 478 | < 0.1% |
Northbound/Uptown | 470 | < 0.1% |
South/Downtown | 468 | < 0.1% |
North/Westchester County Bound | 454 | < 0.1% |
Other values (40) | 6455 | 0.1% |
(Missing) | 7618125 | 99.8% |
Words
Value | Count | Frequency (%) |
bound | 8729 | |
island | 2212 | 7.2% |
br | 1899 | 6.2% |
north/bronx | 1306 | 4.3% |
east/long | 1182 | 3.9% |
triborough | 1069 | 3.5% |
east/queens | 985 | 3.2% |
west/staten | 709 | 2.3% |
to | 613 | 2.0% |
west/brooklyn | 596 | 1.9% |
Other values (57) | 11357 |
Most occurring characters
Value | Count | Frequency (%) |
o | 32147 | 12.3% |
n | 25320 | 9.7% |
t | 20240 | 7.8% |
u | 18276 | 7.0% |
(space) | 17061 | 6.6% |
d | 15101 | 5.8% |
B | 13717 | 5.3% |
r | 13165 | 5.1% |
s | 11767 | 4.5% |
/ | 11178 | 4.3% |
Other values (37) | 82383 |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 187774 | |
Uppercase Letter | 43230 | 16.6% |
Space Separator | 17061 | 6.6% |
Other Punctuation | 11178 | 4.3% |
Close Punctuation | 556 | 0.2% |
Open Punctuation | 556 | 0.2% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
o | 32147 | |
n | 25320 | |
t | 20240 | |
u | 18276 | |
d | 15101 | |
r | 13165 | |
s | 11767 | 6.3% |
a | 10573 | 5.6% |
h | 9653 | 5.1% |
e | 9148 | 4.9% |
Other values (12) | 22384 |
Uppercase Letter
Value | Count | Frequency (%) |
B | 13717 | |
S | 4217 | 9.8% |
W | 3894 | 9.0% |
N | 3725 | 8.6% |
E | 3596 | 8.3% |
T | 3407 | 7.9% |
I | 2212 | 5.1% |
L | 1492 | 3.5% |
Q | 1333 | 3.1% |
M | 961 | 2.2% |
Other values (11) | 4676 | 10.8% |
Space Separator
Value | Count | Frequency (%) |
(space) | 17061 |
Other Punctuation
Value | Count | Frequency (%) |
/ | 11178 |
Close Punctuation
Value | Count | Frequency (%) |
) | 556 |
Open Punctuation
Value | Count | Frequency (%) |
( | 556 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 231004 | |
Common | 29351 | 11.3% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
o | 32147 | |
n | 25320 | |
t | 20240 | 8.8% |
u | 18276 | 7.9% |
d | 15101 | 6.5% |
B | 13717 | 5.9% |
r | 13165 | 5.7% |
s | 11767 | 5.1% |
a | 10573 | 4.6% |
h | 9653 | 4.2% |
Other values (33) | 61045 | 26.4% |
Common
Value | Count | Frequency (%) |
(space) | 17061 | 58.1% |
/ | 11178 | 38.1% |
) | 556 | 1.9% |
( | 556 | 1.9% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 260355 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
o | 32147 | 12.3% |
n | 25320 | 9.7% |
t | 20240 | 7.8% |
u | 18276 | 7.0% |
(space) | 17061 | 6.6% |
d | 15101 | 5.8% |
B | 13717 | 5.3% |
r | 13165 | 5.1% |
s | 11767 | 4.5% |
/ | 11178 | 4.3% |
Other values (37) | 82383 | 31.6% |
Road Ramp
Categorical
MISSING
 
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 7618286 |
Missing (%) | 99.8% |
Memory size | 58.2 MiB |
Roadway | 9716 |
---|---|
Ramp | 3719 |
Length
Max length | 7 |
---|---|
Median length | 7 |
Mean length | 6.1695571 |
Min length | 4 |
Characters and Unicode
Total characters | 82888 |
---|---|
Distinct characters | 8 |
Distinct categories | 2 |
Distinct scripts | 1 |
Distinct blocks | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | Ramp |
---|---|
2nd row | Ramp |
3rd row | Roadway |
4th row | Roadway |
5th row | Ramp |
Common Values
Value | Count | Frequency (%) |
Roadway | 9716 | 0.1% |
Ramp | 3719 | < 0.1% |
(Missing) | 7618286 | 99.8% |
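The two counts above are enough to re-derive the length statistics reported for Road Ramp. A minimal Python sketch, assuming only the value counts from the Common Values table; the variable names are illustrative and not part of ydata-profiling:

```python
# Value counts copied from the Common Values table for Road Ramp.
counts = {"Roadway": 9716, "Ramp": 3719}

# Rows that actually carry a value (everything else is missing).
non_missing = sum(counts.values())

# Total characters and mean string length over the non-missing rows.
total_chars = sum(len(value) * n for value, n in counts.items())
mean_length = total_chars / non_missing

# Share of each value among the non-missing rows, in percent.
shares = {value: round(100 * n / non_missing, 1) for value, n in counts.items()}

print(non_missing, total_chars, round(mean_length, 7), shares)
```

The totals agree with the "Total characters" and "Mean length" entries listed for this variable, which suggests those statistics are computed over non-missing rows only.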
Length (histogram not reproduced)
Common Values (Plot) (chart not reproduced)
Value | Count | Frequency (%) |
roadway | 9716 | 72.3% |
ramp | 3719 | 27.7% |
Most occurring characters
Value | Count | Frequency (%) |
a | 23151 | 27.9% |
R | 13435 | 16.2% |
o | 9716 | 11.7% |
d | 9716 | 11.7% |
w | 9716 | 11.7% |
y | 9716 | 11.7% |
m | 3719 | 4.5% |
p | 3719 | 4.5% |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 69453 | 83.8% |
Uppercase Letter | 13435 | 16.2% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
a | 23151 | 33.3% |
o | 9716 | 14.0% |
d | 9716 | 14.0% |
w | 9716 | 14.0% |
y | 9716 | 14.0% |
m | 3719 | 5.4% |
p | 3719 | 5.4% |
Uppercase Letter
Value | Count | Frequency (%) |
R | 13435 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 82888 | 100.0% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
a | 23151 | 27.9% |
R | 13435 | 16.2% |
o | 9716 | 11.7% |
d | 9716 | 11.7% |
w | 9716 | 11.7% |
y | 9716 | 11.7% |
m | 3719 | 4.5% |
p | 3719 | 4.5% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 82888 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
a | 23151 | 27.9% |
R | 13435 | 16.2% |
o | 9716 | 11.7% |
d | 9716 | 11.7% |
w | 9716 | 11.7% |
y | 9716 | 11.7% |
m | 3719 | 4.5% |
p | 3719 | 4.5% |
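The character and category tables for Road Ramp can be rebuilt from the two value counts alone. The sketch below is a standard-library approximation, not the profiler's own code: it assumes the category labels correspond to Unicode general categories (`Ll` for Lowercase Letter, `Lu` for Uppercase Letter) as returned by `unicodedata.category`.

```python
import unicodedata
from collections import Counter

# Value counts copied from the Common Values table for Road Ramp.
counts = {"Roadway": 9716, "Ramp": 3719}

char_counts = Counter()      # occurrences of each character
category_counts = Counter()  # occurrences per Unicode general category

for value, n in counts.items():
    for ch in value:
        # Each character of a value occurs once per row holding that value.
        char_counts[ch] += n
        category_counts[unicodedata.category(ch)] += n

total = sum(char_counts.values())

# Print rows in the same "Value | Count | Frequency (%) |" layout as the report.
for ch, n in char_counts.most_common():
    print(f"{ch} | {n} | {100 * n / total:.1f}% |")
```

Under these assumptions the counts for every character, and the lowercase and uppercase totals, match the tables in this section.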
Bridge Highway Segment
Text
MISSING
 
Distinct | 3879 |
---|---|
Distinct (%) | 23.8% |
Missing | 7615395 |
Missing (%) | 99.8% |
Memory size | 58.2 MiB |
Length
Max length | 100 |
---|---|
Median length | 77 |
Mean length | 41.255666 |
Min length | 4 |
Characters and Unicode
Total characters | 673540 |
---|---|
Distinct characters | 67 |
Distinct categories | 8 |
Distinct scripts | 2 |
Distinct blocks | 1 |
Unique
Unique | 2972 |
---|---|
Unique (%) | 18.2% |
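The percentages in this overview use two different denominators, which the report does not state explicitly. The following sketch infers them from the numbers: "Missing (%)" is taken relative to all observations, while "Distinct (%)" and "Unique (%)" are taken relative to non-missing rows. The variable names are illustrative.

```python
# Figures copied from the report.
n_rows = 7_631_721      # Number of observations (dataset statistics)
n_missing = 7_615_395   # Missing
n_distinct = 3_879      # Distinct
n_unique = 2_972        # Unique
total_chars = 673_540   # Total characters

# Rows that actually carry a value.
non_missing = n_rows - n_missing

# Missing share is relative to all rows.
missing_pct = round(100 * n_missing / n_rows, 1)

# Distinct and unique shares are relative to non-missing rows (inferred).
distinct_pct = round(100 * n_distinct / non_missing, 1)
unique_pct = round(100 * n_unique / non_missing, 1)

# Mean string length over non-missing rows.
mean_length = total_chars / non_missing

print(non_missing, missing_pct, distinct_pct, unique_pct, round(mean_length, 6))
```

With only 16326 populated rows out of more than 7.6 million, the distinct and unique shares describe a very small slice of the dataset.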
Sample
1st row | W 96 St (Exit 11) |
---|---|
2nd row | 1-1-1514825238 |
3rd row | 1-1-1514671280 |
4th row | Brooklyn-Queens Expwy (I-278) (Exit 4) |
5th row | Flatbush Ave (Exit 11N) - Gateway National Recreation Area (Exit 12) |
Value | Count | Frequency (%) |
exit | 18059 | 14.4% |
(blank) | 11155 | 8.9% |
ave | 5916 | 4.7% |
st | 5111 | 4.1% |
blvd | 3342 | 2.7% |
expwy | 3175 | 2.5% |
pkwy | 2349 | 1.9% |
east | 2171 | 1.7% |
ny | 1322 | 1.1% |
island | 1220 | 1.0% |
Other values (3482) | 71485 | 57.0% |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 109027 | 16.2% |
t | 41602 | 6.2% |
i | 28370 | 4.2% |
E | 26833 | 4.0% |
e | 26197 | 3.9% |
) | 22880 | 3.4% |
( | 22880 | 3.4% |
x | 22617 | 3.4% |
1 | 21606 | 3.2% |
- | 20194 | 3.0% |
Other values (57) | 331334 | 49.2% |
Most occurring categories
Value | Count | Frequency (%) |
Lowercase Letter | 305280 | 45.3% |
Space Separator | 109027 | 16.2% |
Uppercase Letter | 98864 | 14.7% |
Decimal Number | 88247 | 13.1% |
Close Punctuation | 22880 | 3.4% |
Open Punctuation | 22880 | 3.4% |
Dash Punctuation | 20194 | 3.0% |
Other Punctuation | 6168 | 0.9% |
Most frequent character per category
Lowercase Letter
Value | Count | Frequency (%) |
t | 41602 | 13.6% |
i | 28370 | 9.3% |
e | 26197 | 8.6% |
x | 22617 | 7.4% |
a | 19831 | 6.5% |
n | 19491 | 6.4% |
r | 18384 | 6.0% |
o | 17139 | 5.6% |
s | 14946 | 4.9% |
d | 13407 | 4.4% |
Other values (15) | 83296 | 27.3% |
Uppercase Letter
Value | Count | Frequency (%) |
E | 26833 | 27.1% |
B | 9618 | 9.7% |
A | 9560 | 9.7% |
S | 8446 | 8.5% |
W | 5507 | 5.6% |
N | 4902 | 5.0% |
R | 4571 | 4.6% |
P | 4402 | 4.5% |
I | 4006 | 4.1% |
C | 3717 | 3.8% |
Other values (15) | 17302 | 17.5% |
Decimal Number
Value | Count | Frequency (%) |
1 | 21606 | 24.5% |
2 | 11678 | 13.2% |
5 | 9599 | 10.9% |
3 | 8422 | 9.5% |
4 | 7380 | 8.4% |
9 | 6484 | 7.3% |
6 | 6360 | 7.2% |
7 | 6245 | 7.1% |
8 | 5934 | 6.7% |
0 | 4539 | 5.1% |
Other Punctuation
Value | Count | Frequency (%) |
/ | 4560 | 73.9% |
. | 1532 | 24.8% |
, | 76 | 1.2% |
Space Separator
Value | Count | Frequency (%) |
(space) | 109027 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 22880 | 100.0% |
Open Punctuation
Value | Count | Frequency (%) |
( | 22880 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 20194 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 404144 | 60.0% |
Common | 269396 | 40.0% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
t | 41602 | 10.3% |
i | 28370 | 7.0% |
E | 26833 | 6.6% |
e | 26197 | 6.5% |
x | 22617 | 5.6% |
a | 19831 | 4.9% |
n | 19491 | 4.8% |
r | 18384 | 4.5% |
o | 17139 | 4.2% |
s | 14946 | 3.7% |
Other values (40) | 168734 | 41.8% |
Common
Value | Count | Frequency (%) |
(space) | 109027 | 40.5% |
) | 22880 | 8.5% |
( | 22880 | 8.5% |
1 | 21606 | 8.0% |
- | 20194 | 7.5% |
2 | 11678 | 4.3% |
5 | 9599 | 3.6% |
3 | 8422 | 3.1% |
4 | 7380 | 2.7% |
9 | 6484 | 2.4% |
Other values (7) | 29246 | 10.9% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 673540 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 109027 | 16.2% |
t | 41602 | 6.2% |
i | 28370 | 4.2% |
E | 26833 | 4.0% |
e | 26197 | 3.9% |
) | 22880 | 3.4% |
( | 22880 | 3.4% |
x | 22617 | 3.4% |
1 | 21606 | 3.2% |
- | 20194 | 3.0% |
Other values (57) | 331334 | 49.2% |