Overview

Dataset statistics

Number of variables: 6
Number of observations: 8474
Missing cells: 610
Missing cells (%): 1.2%
Duplicate rows: 8
Duplicate rows (%): 0.1%
Total size in memory: 422.2 KiB
Average record size in memory: 51.0 B

Variable types

Categorical: 1
DateTime: 1
Text: 2
Numeric: 2

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-22190/F/1/datasetView.do

Alerts

단속일 (enforcement date) has constant value "20231101" [Constant]
Dataset has 8 (0.1%) duplicate rows [Duplicates]
도로명 (road-name address) has 600 (7.1%) missing values [Missing]
경도 (longitude) is highly skewed (γ1 = 25.47397936) [Skewed]
위도 (latitude) is highly skewed (γ1 = 41.1044938) [Skewed]
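The Constant and Missing alerts can be re-checked by hand with plain Python. The sketch below is a minimal illustration, not the profiler's own logic: `column_summary` is a hypothetical helper, and `rows` holds three records copied from the Sample and Duplicate rows sections of this report, reduced to two columns.

```python
from typing import Optional


def column_summary(rows: list[dict[str, Optional[str]]], column: str) -> dict:
    """Count missing and distinct values for one column (hypothetical helper)."""
    if not rows:
        raise ValueError("rows must not be empty")
    values = [row.get(column) for row in rows]
    # Treat None and the empty string as missing.
    present = [v for v in values if v not in (None, "")]
    missing = len(values) - len(present)
    distinct = set(present)
    return {
        "missing": missing,
        "missing_pct": 100.0 * missing / len(values),
        "distinct": len(distinct),
        "constant": len(distinct) == 1,
    }


# Three records taken from this report; the third has no 도로명 (<NA>).
rows = [
    {"단속일": "20231101", "도로명": "서울특별시 용산구 두텁바위로 61"},
    {"단속일": "20231101", "도로명": "서울특별시 송파구 중대로 80"},
    {"단속일": "20231101", "도로명": None},
]

date_summary = column_summary(rows, "단속일")
road_summary = column_summary(rows, "도로명")
print(date_summary["constant"], road_summary["missing"])
```

On the full dataset the same counts should line up with the alert figures above (one distinct 단속일 value, 600 missing 도로명 values), provided missing cells are read in as `None` or empty strings.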

Reproduction

Analysis started: 2024-04-21 00:07:29.343836
Analysis finished: 2024-04-21 00:07:32.428078
Duration: 3.08 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

단속일 (enforcement date)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 66.3 KiB
Value counts: 20231101 (8474 rows)

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20231101
2nd row: 20231101
3rd row: 20231101
4th row: 20231101
5th row: 20231101

Common Values

Value | Count | Frequency (%)
20231101 | 8474 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot; 20231101 accounts for all 8474 rows (100.0%)]

단속시간 (enforcement time)
DateTime

Distinct: 7707
Distinct (%): 90.9%
Missing: 0
Missing (%): 0.0%
Memory size: 66.3 KiB
Minimum: 2024-04-21 00:01:05
Maximum: 2024-04-21 20:15:59

[Figure: Histogram with fixed size bins (bins=50)]
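The minimum and maximum above carry the date of the analysis run (2024-04-21), while every record in the Sample section is dated 20231101. That pattern suggests the time-of-day strings were parsed without their date, although the report does not confirm this. A minimal sketch of combining the two columns before profiling, assuming the `YYYYMMDD` and `HH:MM:SS` layouts seen in the Sample section (`combine` is a hypothetical helper):

```python
from datetime import datetime


def combine(date_str: str, time_str: str) -> datetime:
    """Join a YYYYMMDD date and an HH:MM:SS time into one timestamp."""
    return datetime.strptime(f"{date_str} {time_str}", "%Y%m%d %H:%M:%S")


# First and last enforcement times listed in the Sample section.
first = combine("20231101", "00:01:05")
last = combine("20231101", "20:15:59")

print(first.isoformat())  # 2023-11-01T00:01:05
print(last - first)       # 20:14:54
```

With a combined timestamp column, the profiler's minimum and maximum would fall on the enforcement date rather than on the date the report was generated.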

구주소 (lot-number address)
Text

Distinct: 4026
Distinct (%): 47.5%
Missing: 6
Missing (%): 0.1%
Memory size: 66.3 KiB

Length

Max length: 42
Median length: 35
Mean length: 14.172296
Min length: 4

Characters and Unicode

Total characters: 120011
Distinct characters: 286
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
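As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for the first sample address of this variable; `CATEGORY_NAMES` and `category_counts` are hypothetical names, and the mapping covers only the categories this report lists.

```python
import unicodedata
from collections import Counter

# Short category codes mapped to the long names used in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Sm": "Math Symbol",
}


def category_counts(text: str) -> Counter:
    """Count characters per Unicode general category (short code)."""
    return Counter(unicodedata.category(ch) for ch in text)


sample = "서울 용산구 후암동 256-4"
counts = category_counts(sample)
for code, n in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), n)
```

For this address the Hangul syllables fall under Other Letter, the digits under Decimal Number, and the hyphen under Dash Punctuation, which matches the category names used in the tables of this section.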

Unique

Unique: 2472
Unique (%): 29.2%

Sample

1st row: 서울 용산구 후암동 256-4
2nd row: 서울 송파구 문정동 150-2
3rd row: 서울 중구 을지로7가 8-5
4th row: 서울 서초구 방배동 1-36
5th row: 서울 광진구 중곡동 453

Most occurring words

Value | Count | Frequency (%)
서울 | 5467 | 19.4%
영등포구 | 552 | 2.0%
강남구 | 514 | 1.8%
송파구 | 448 | 1.6%
종로구 | 398 | 1.4%
중구 | 301 | 1.1%
동대문구 | 295 | 1.0%
마포구 | 292 | 1.0%
강동구 | 243 | 0.9%
구로구 | 232 | 0.8%
Other values (4132) | 19404 | 68.9%

Most occurring characters

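Value | Count | Frequency (%)
(space) | 20112 | 16.8%
(Hangul character) | 9045 | 7.5%
- | 6674 | 5.6%
1 | 6647 | 5.5%
(Hangul character) | 6104 | 5.1%
(Hangul character) | 6081 | 5.1%
(Hangul character) | 5478 | 4.6%
2 | 4746 | 4.0%
3 | 4017 | 3.3%
4 | 3422 | 2.9%
Other values (276) | 47685 | 39.7%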

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 57532 | 47.9%
Decimal Number | 35429 | 29.5%
Space Separator | 20112 | 16.8%
Dash Punctuation | 6674 | 5.6%
Open Punctuation | 79 | 0.1%
Close Punctuation | 79 | 0.1%
Other Punctuation | 69 | 0.1%
Uppercase Letter | 37 | < 0.1%

Most frequent character per category

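Other Letter
Value | Count | Frequency (%)
(Hangul character) | 9045 | 15.7%
(Hangul character) | 6104 | 10.6%
(Hangul character) | 6081 | 10.6%
(Hangul character) | 5478 | 9.5%
(Hangul character) | 1390 | 2.4%
(Hangul character) | 1239 | 2.2%
(Hangul character) | 1105 | 1.9%
(Hangul character) | 1052 | 1.8%
(Hangul character) | 806 | 1.4%
(Hangul character) | 762 | 1.3%
Other values (253) | 24470 | 42.5%

Decimal Number
Value | Count | Frequency (%)
1 | 6647 | 18.8%
2 | 4746 | 13.4%
3 | 4017 | 11.3%
4 | 3422 | 9.7%
5 | 3245 | 9.2%
6 | 3120 | 8.8%
7 | 2994 | 8.5%
8 | 2425 | 6.8%
9 | 2421 | 6.8%
0 | 2392 | 6.8%

Uppercase Letter
Value | Count | Frequency (%)
G | 14 | 37.8%
L | 7 | 18.9%
S | 7 | 18.9%
A | 3 | 8.1%
P | 3 | 8.1%
T | 3 | 8.1%

Other Punctuation
Value | Count | Frequency (%)
, | 62 | 89.9%
/ | 6 | 8.7%
@ | 1 | 1.4%

Space Separator
Value | Count | Frequency (%)
(space) | 20112 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 6674 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 79 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 79 | 100.0%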

Most occurring scripts

Value | Count | Frequency (%)
Common | 62442 | 52.0%
Hangul | 57532 | 47.9%
Latin | 37 | < 0.1%

Most frequent character per script

Hangul
Same rows, counts, and percentages as the Other Letter table under "Most frequent character per category" (ten Hangul characters from 9045 down to 762, plus Other values (253) | 24470 | 42.5%).

Common
Value | Count | Frequency (%)
(space) | 20112 | 32.2%
- | 6674 | 10.7%
1 | 6647 | 10.6%
2 | 4746 | 7.6%
3 | 4017 | 6.4%
4 | 3422 | 5.5%
5 | 3245 | 5.2%
6 | 3120 | 5.0%
7 | 2994 | 4.8%
8 | 2425 | 3.9%
Other values (7) | 5040 | 8.1%

Latin
Value | Count | Frequency (%)
G | 14 | 37.8%
L | 7 | 18.9%
S | 7 | 18.9%
A | 3 | 8.1%
P | 3 | 8.1%
T | 3 | 8.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 62479 | 52.1%
Hangul | 57532 | 47.9%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 20112 | 32.2%
- | 6674 | 10.7%
1 | 6647 | 10.6%
2 | 4746 | 7.6%
3 | 4017 | 6.4%
4 | 3422 | 5.5%
5 | 3245 | 5.2%
6 | 3120 | 5.0%
7 | 2994 | 4.8%
8 | 2425 | 3.9%
Other values (13) | 5077 | 8.1%

Hangul
Same rows, counts, and percentages as the Other Letter table under "Most frequent character per category" (ten Hangul characters from 9045 down to 762, plus Other values (253) | 24470 | 42.5%).

도로명 (road-name address)
Text

MISSING

Distinct: 3589
Distinct (%): 45.6%
Missing: 600
Missing (%): 7.1%
Memory size: 66.3 KiB

Length

Max length: 46
Median length: 38
Mean length: 14.997587
Min length: 5

Characters and Unicode

Total characters: 118091
Distinct characters: 456
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2066
Unique (%): 26.2%

Sample

1st row: 서울특별시 용산구 두텁바위로 61
2nd row: 서울특별시 송파구 중대로 80
3rd row: 서울 중구 을지로7가 8-5
4th row: 서울특별시 서초구 사평대로6길 102 (방배동, 서래acoA동)
5th row: 서울 광진구 중곡동 453

Most occurring words

Value | Count | Frequency (%)
서울 | 3189 | 11.9%
서울특별시 | 1698 | 6.3%
영등포구 | 519 | 1.9%
강남구 | 460 | 1.7%
송파구 | 379 | 1.4%
종로구 | 359 | 1.3%
마포구 | 287 | 1.1%
동대문구 | 271 | 1.0%
강동구 | 269 | 1.0%
성동구 | 226 | 0.8%
Other values (3772) | 19123 | 71.4%

Most occurring characters

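Value | Count | Frequency (%)
(space) | 19167 | 16.2%
(Hangul character) | 6643 | 5.6%
(Hangul character) | 5527 | 4.7%
1 | 5447 | 4.6%
(Hangul character) | 5394 | 4.6%
(Hangul character) | 4930 | 4.2%
(Hangul character) | 3918 | 3.3%
2 | 3916 | 3.3%
3 | 3378 | 2.9%
(Hangul character) | 3085 | 2.6%
Other values (446) | 56686 | 48.0%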

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 67188 | 56.9%
Decimal Number | 27510 | 23.3%
Space Separator | 19167 | 16.2%
Dash Punctuation | 2119 | 1.8%
Open Punctuation | 829 | 0.7%
Close Punctuation | 808 | 0.7%
Other Punctuation | 281 | 0.2%
Uppercase Letter | 156 | 0.1%
Lowercase Letter | 30 | < 0.1%
Math Symbol | 3 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(Hangul character) | 6643 | 9.9%
(Hangul character) | 5527 | 8.2%
(Hangul character) | 5394 | 8.0%
(Hangul character) | 4930 | 7.3%
(Hangul character) | 3918 | 5.8%
(Hangul character) | 3085 | 4.6%
(Hangul character) | 1827 | 2.7%
(Hangul character) | 1698 | 2.5%
(Hangul character) | 1698 | 2.5%
(Hangul character) | 1263 | 1.9%
Other values (408) | 31205 | 46.4%

Uppercase Letter
Value | Count | Frequency (%)
S | 38 | 24.4%
A | 24 | 15.4%
I | 14 | 9.0%
N | 12 | 7.7%
U | 11 | 7.1%
E | 11 | 7.1%
Q | 11 | 7.1%
R | 11 | 7.1%
G | 6 | 3.8%
C | 4 | 2.6%
Other values (7) | 14 | 9.0%

Decimal Number
Value | Count | Frequency (%)
1 | 5447 | 19.8%
2 | 3916 | 14.2%
3 | 3378 | 12.3%
4 | 2863 | 10.4%
5 | 2404 | 8.7%
6 | 2206 | 8.0%
7 | 1928 | 7.0%
8 | 1926 | 7.0%
0 | 1844 | 6.7%
9 | 1598 | 5.8%

Lowercase Letter
Value | Count | Frequency (%)
k | 12 | 40.0%
t | 12 | 40.0%
o | 2 | 6.7%
a | 2 | 6.7%
c | 2 | 6.7%

Space Separator
Value | Count | Frequency (%)
(space) | 19167 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 2119 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 829 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 808 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 281 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 67188 | 56.9%
Common | 50717 | 42.9%
Latin | 186 | 0.2%

Most frequent character per script

Hangul
Same rows, counts, and percentages as the Other Letter table under "Most frequent character per category" (ten Hangul characters from 6643 down to 1263, plus Other values (408) | 31205 | 46.4%).

Latin
Value | Count | Frequency (%)
S | 38 | 20.4%
A | 24 | 12.9%
I | 14 | 7.5%
k | 12 | 6.5%
t | 12 | 6.5%
N | 12 | 6.5%
U | 11 | 5.9%
E | 11 | 5.9%
Q | 11 | 5.9%
R | 11 | 5.9%
Other values (12) | 30 | 16.1%

Common
Value | Count | Frequency (%)
(space) | 19167 | 37.8%
1 | 5447 | 10.7%
2 | 3916 | 7.7%
3 | 3378 | 6.7%
4 | 2863 | 5.6%
5 | 2404 | 4.7%
6 | 2206 | 4.3%
- | 2119 | 4.2%
7 | 1928 | 3.8%
8 | 1926 | 3.8%
Other values (6) | 5363 | 10.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 67188 | 56.9%
ASCII | 50903 | 43.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 19167 | 37.7%
1 | 5447 | 10.7%
2 | 3916 | 7.7%
3 | 3378 | 6.6%
4 | 2863 | 5.6%
5 | 2404 | 4.7%
6 | 2206 | 4.3%
- | 2119 | 4.2%
7 | 1928 | 3.8%
8 | 1926 | 3.8%
Other values (28) | 5549 | 10.9%

Hangul
Same rows, counts, and percentages as the Other Letter table under "Most frequent character per category" (ten Hangul characters from 6643 down to 1263, plus Other values (408) | 31205 | 46.4%).

경도 (longitude)
Real number (ℝ)

SKEWED

Distinct: 5508
Distinct (%): 65.0%
Missing: 2
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 1939538.7
Minimum: 37.538419
Maximum: 1.2638998 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 74.6 KiB

Quantile statistics

Minimum: 37.538419
5-th percentile: 126.8748
Q1: 126.92153
Median: 127.0006
Q3: 127.04824
95-th percentile: 127.12193
Maximum: 1.2638998 × 10^9
Range: 1.2638998 × 10^9
Interquartile range (IQR): 0.12671279

Descriptive statistics

Standard deviation: 49474737
Coefficient of variation (CV): 25.508507
Kurtosis: 647.07638
Mean: 1939538.7
Median Absolute Deviation (MAD): 0.062414534
Skewness: 25.473979
Sum: 1.6431772 × 10^10
Variance: 2.4477496 × 10^15
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
126.97925943831896 | 63 | 0.7%
126.97281501547188 | 42 | 0.5%
126.9786163 | 33 | 0.4%
127.002162 | 29 | 0.3%
126.9158448 | 25 | 0.3%
126.93173 | 24 | 0.3%
126.97020285491436 | 23 | 0.3%
126.9785413005342 | 22 | 0.3%
126.9204924 | 22 | 0.3%
127.1303715717826 | 22 | 0.3%
Other values (5498) | 8167 | 96.4%

Smallest values

Value | Count | Frequency (%)
37.53841854034201 | 1 | < 0.1%
37.54149572893306 | 2 | < 0.1%
37.5434837341309 | 2 | < 0.1%
126.7966417595744 | 1 | < 0.1%
126.79675256833434 | 1 | < 0.1%
126.79695691913366 | 1 | < 0.1%
126.79705708287656 | 1 | < 0.1%
126.801278424065 | 1 | < 0.1%
126.80660323239864 | 1 | < 0.1%
126.80662116967142 | 1 | < 0.1%

Largest values

Value | Count | Frequency (%)
1263899849.0 | 13 | 0.2%
127.18287362 | 1 | < 0.1%
127.18284144 | 1 | < 0.1%
127.18271987 | 1 | < 0.1%
127.17999717 | 1 | < 0.1%
127.17999433 | 1 | < 0.1%
127.17926628 | 1 | < 0.1%
127.178237 | 1 | < 0.1%
127.1738434 | 1 | < 0.1%
127.1734543 | 1 | < 0.1%
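The extreme values of this column point to two kinds of bad coordinates: five rows hold values near 37.54, which look like latitudes, and 13 rows hold 1263899849.0, which is far outside any valid longitude. The 위도 column shows the mirror image, with five values near 127. A minimal sketch of a sanity check follows; the bounding box is a rough assumption chosen for illustration (it is not taken from the dataset's documentation), `classify` is a hypothetical helper, and the coordinate pairings in the examples are illustrative because the report does not show which longitude belongs to which latitude for the outlier rows.

```python
# Rough bounding box around Seoul; these limits are assumptions.
SEOUL_LON = (126.7, 127.3)
SEOUL_LAT = (37.3, 37.8)


def _within(value: float, bounds: tuple[float, float]) -> bool:
    """True when value lies inside the inclusive bounds."""
    return bounds[0] <= value <= bounds[1]


def classify(lon: float, lat: float) -> str:
    """Label a coordinate pair as 'ok', 'swapped', or 'out_of_range'."""
    if _within(lon, SEOUL_LON) and _within(lat, SEOUL_LAT):
        return "ok"
    # Values that fit only after exchanging the two columns.
    if _within(lat, SEOUL_LON) and _within(lon, SEOUL_LAT):
        return "swapped"
    return "out_of_range"


examples = [
    (126.978994, 37.546086),                  # first row of the Sample section
    (37.53841854034201, 127.05403869196364),  # illustrative pairing of extremes
    (1263899849.0, 37.55),                    # illustrative latitude
]
labels = [classify(lon, lat) for lon, lat in examples]
print(labels)  # ['ok', 'swapped', 'out_of_range']
```

Rows labelled `swapped` could be repaired by exchanging the two values; rows labelled `out_of_range` need a check against the source before any correction is attempted.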

위도 (latitude)
Real number (ℝ)

SKEWED

Distinct: 5503
Distinct (%): 65.0%
Missing: 2
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.602829
Minimum: 37.36114
Maximum: 127.05404
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 74.6 KiB

Quantile statistics

Minimum: 37.36114
5-th percentile: 37.481356
Q1: 37.513732
Median: 37.550397
Q3: 37.574048
95-th percentile: 37.643524
Maximum: 127.05404
Range: 89.692899
Interquartile range (IQR): 0.060315313

Descriptive statistics

Standard deviation: 2.1739546
Coefficient of variation (CV): 0.057813592
Kurtosis: 1688.7853
Mean: 37.602829
Median Absolute Deviation (MAD): 0.030382209
Skewness: 41.104494
Sum: 318571.16
Variance: 4.7260785
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
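The very large skewness figures reported for 경도 and 위도 are what a handful of extreme values produce in otherwise tightly clustered data. The sketch below computes the moment coefficient of skewness, γ1 = m3 / m2^1.5, on made-up values chosen to resemble the latitude range; the profiler may apply a small-sample bias correction, so its figures can differ slightly from this plain formula.

```python
def skewness(values: list[float]) -> float:
    """Moment coefficient of skewness: third central moment over m2 ** 1.5."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    return m3 / m2 ** 1.5


# Made-up latitudes: evenly spaced, so the clean set is symmetric.
clean = [37.50 + 0.001 * i for i in range(100)]
# One value near 127 (a longitude in the wrong column) distorts the shape.
with_outlier = clean + [127.05]

print(round(skewness(clean), 3))
print(round(skewness(with_outlier), 3))
```

The clean set has skewness close to zero, while a single misplaced value pushes it to roughly ten. This is why the Skewed alerts here are better read as a data-quality signal than as a property of the real coordinate distribution.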

Common values

Value | Count | Frequency (%)
37.57122342169176 | 63 | 0.7%
37.57306915349144 | 42 | 0.5%
37.5744654 | 33 | 0.4%
37.5657701 | 29 | 0.3%
37.5859655 | 25 | 0.3%
37.520769 | 24 | 0.3%
37.58389847202408 | 23 | 0.3%
37.55904182543168 | 22 | 0.3%
37.55091065 | 22 | 0.3%
37.51571704892341 | 22 | 0.3%
Other values (5493) | 8167 | 96.4%

Smallest values

Value | Count | Frequency (%)
37.36114 | 2 | < 0.1%
37.44122280819926 | 1 | < 0.1%
37.4462083333333 | 1 | < 0.1%
37.4462736666667 | 1 | < 0.1%
37.4463926666667 | 1 | < 0.1%
37.4466823333333 | 1 | < 0.1%
37.447242 | 1 | < 0.1%
37.4474221666667 | 1 | < 0.1%
37.44893 | 1 | < 0.1%
37.4498946 | 5 | 0.1%

Largest values

Value | Count | Frequency (%)
127.05403869196364 | 2 | < 0.1%
127.0438277284258 | 1 | < 0.1%
127.015312194824 | 2 | < 0.1%
37.69013773708887 | 3 | < 0.1%
37.68931953088563 | 2 | < 0.1%
37.6875678333333 | 1 | < 0.1%
37.687545 | 1 | < 0.1%
37.68727205 | 1 | < 0.1%
37.6870305 | 1 | < 0.1%
37.68702681 | 1 | < 0.1%

Interactions

[Figures: four interaction plots between the numeric variables 경도 and 위도]

Correlations

[Figure: correlation heatmap]

Variable | 경도 | 위도
경도 | 1.000 | 0.000
위도 | 0.000 | 1.000

[Figure: correlation heatmap]

Variable | 경도 | 위도
경도 | 1.000 | 0.244
위도 | 0.244 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.

[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
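Nullity correlation can be read as the ordinary Pearson correlation between presence indicators (1 when a cell is filled, 0 when it is missing). The sketch below shows that calculation on made-up columns; `nullity_correlation` is a hypothetical helper, and the report's own heatmap may be computed differently in detail.

```python
from math import sqrt
from typing import Optional, Sequence


def nullity_correlation(col_a: Sequence[object],
                        col_b: Sequence[object]) -> Optional[float]:
    """Pearson correlation between the presence indicators of two columns.

    Returns None when either column is fully present or fully missing,
    because the correlation is undefined without variation.
    """
    if len(col_a) != len(col_b) or len(col_a) == 0:
        raise ValueError("columns must be non-empty and of equal length")
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)


# Made-up columns: None marks a missing cell.
lon = [127.0, None, 126.9, None]
lat = [37.5, None, 37.6, None]   # missing in the same rows as lon
road = [None, "a", None, "b"]    # missing exactly where lon is present
full = ["x", "y", "z", "w"]      # never missing

same = nullity_correlation(lon, lat)
opposite = nullity_correlation(lon, road)
undefined = nullity_correlation(lon, full)
print(same, opposite, undefined)
```

A value near 1 means two columns tend to be missing in the same rows, a value near -1 means one is filled where the other is empty, and columns with no missing cells at all drop out because the measure is undefined for them.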

Sample

Columns: 단속일 (enforcement date), 단속시간 (enforcement time), 구주소 (lot-number address), 도로명 (road-name address), 경도 (longitude), 위도 (latitude).

First rows

Row | 단속일 | 단속시간 | 구주소 | 도로명 | 경도 | 위도
0 | 20231101 | 00:01:05 | 서울 용산구 후암동 256-4 | 서울특별시 용산구 두텁바위로 61 | 126.978994 | 37.546086
1 | 20231101 | 00:02:16 | 서울 송파구 문정동 150-2 | 서울특별시 송파구 중대로 80 | 127.117142 | 37.491273
2 | 20231101 | 00:02:27 | 서울 중구 을지로7가 8-5 | 서울 중구 을지로7가 8-5 | 127.009735 | 37.565491
3 | 20231101 | 00:04:21 | 서울 서초구 방배동 1-36 | 서울특별시 서초구 사평대로6길 102 (방배동, 서래acoA동) | 126.993378 | 37.494293
4 | 20231101 | 00:04:25 | 서울 광진구 중곡동 453 | 서울 광진구 중곡동 453 | 127.091153 | 37.556489
5 | 20231101 | 00:05:08 | 서울 서초구 방배동 1-36 | 서울특별시 서초구 사평대로6길 102 (방배동, 서래acoA동) | 126.993576 | 37.494264
6 | 20231101 | 00:06:32 | 서울 용산구 후암동 214-6 | 서울특별시 용산구 후암로 26 | 126.977873 | 37.548372
7 | 20231101 | 00:06:32 | 서울 용산구 후암동 214-6 | 서울특별시 용산구 후암로 26 | 126.977873 | 37.548372
8 | 20231101 | 00:10:29 | 서울 용산구 후암동 244-97 | 서울특별시 용산구 후암로16길 26 | 126.97934 | 37.548496
9 | 20231101 | 00:13:48 | 서울 동대문구 용두동 797-2 | 서울특별시 동대문구 고산자로34길 70 청량리역 해링턴플레이스 | 127.041832 | 37.57846

Last rows

Row | 단속일 | 단속시간 | 구주소 | 도로명 | 경도 | 위도
8464 | 20231101 | 20:15:42 | 서울 용산구 청파동3가 90-4 | 서울 용산구 원효로97길 22 | 126.969431 | 37.541992
8465 | 20231101 | 20:15:44 | 서울 중랑구 묵동 192-3 | 서울특별시 중랑구 동일로160길 12 | 127.078561 | 37.610765
8466 | 20231101 | 20:15:45 | 서울 강동구 고덕동 211 | 서울 강동구 고덕동 189 | 127.168675 | 37.557759
8467 | 20231101 | 20:15:46 | 서울 종로구 창신동 23-878 | 서울 종로구 창신동 23-445 | 127.01116 | 37.578396
8468 | 20231101 | 20:15:47 | 서울 서초구 반포동 127-1 | 서울특별시 서초구 사평대로22길 36 | 126.997708 | 37.497821
8469 | 20231101 | 20:15:49 | 서울 용산구 동빙고동 7-18 | 서울 용산구 장문로 14 | 126.99217 | 37.528682
8470 | 20231101 | 20:15:51 | 서울 은평구 응암동 125-12 | 서울 은평구 응암로21길 12 | 126.916343 | 37.594207
8471 | 20231101 | 20:15:52 | 용두동 39-951 | 고산자로28길 21 | 127.039239 | 37.575041
8472 | 20231101 | 20:15:56 | 서울 영등포구 신길동 451-2 | 서울 영등포구 신길동 470-2 | 126.921991 | 37.512199
8473 | 20231101 | 20:15:59 | 서울 마포구 신수동 181-2 | 서울 마포구 독막로 210 | 126.936946 | 37.547143

Duplicate rows

Most frequently occurring

Row | 단속일 | 단속시간 | 구주소 | 도로명 | 경도 | 위도 | # duplicates
0 | 20231101 | 00:06:32 | 서울 용산구 후암동 214-6 | 서울특별시 용산구 후암로 26 | 126.977873 | 37.548372 | 2
1 | 20231101 | 09:32:39 | 서울 송파구 신천동 19 | <NA> | 127.110462 | 37.516943 | 2
2 | 20231101 | 10:49:10 | 서울 송파구 석촌동 209 | <NA> | 127.1065 | 37.505015 | 2
3 | 20231101 | 12:41:16 | 서울 강서구 외발산동 276-9 | 서울 강서구 외발산동 276-9 | 126.816633 | 37.548694 | 2
4 | 20231101 | 16:11:32 | 서울 강남구 대치동 1012-10 | 서울특별시 강남구 테헤란로114길 16 (대치동, 동남빌딩) | 127.066813 | 37.508605 | 2
5 | 20231101 | 16:20:24 | 서울 중구 방산동 1-1 | <NA> | 127.001512 | 37.568573 | 2
6 | 20231101 | 17:18:04 | 서울 영등포구 영등포동6가 10-1 | 서울 영등포구 영등포동5가 62-1 | 126.905003 | 37.522572 | 2
7 | 20231101 | 19:50:57 | 서울 성동구 행당동 158-4 | 서울 성동구 행당동 19-72 마조로1길6 | 127.039924 | 37.55831 | 2
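A duplicate-row listing like this one can be reproduced by counting identical records. A minimal sketch, assuming each record is represented as a tuple of its six column values; `duplicate_rows` is a hypothetical helper, and the three records are rows 6, 7, and 8 from the Sample section of this report.

```python
from collections import Counter


def duplicate_rows(rows: list[tuple]) -> dict[tuple, int]:
    """Return each record that occurs more than once with its count."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Rows 6, 7, and 8 of the Sample section; rows 6 and 7 are identical.
rows = [
    ("20231101", "00:06:32", "서울 용산구 후암동 214-6",
     "서울특별시 용산구 후암로 26", 126.977873, 37.548372),
    ("20231101", "00:06:32", "서울 용산구 후암동 214-6",
     "서울특별시 용산구 후암로 26", 126.977873, 37.548372),
    ("20231101", "00:10:29", "서울 용산구 후암동 244-97",
     "서울특별시 용산구 후암로16길 26", 126.97934, 37.548496),
]

dupes = duplicate_rows(rows)
for row, n in dupes.items():
    print(row[1], row[2], n)
```

Whether such pairs are true double entries or two separate enforcement events at the same second and place cannot be decided from the report alone, so removing them is a judgment call.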