Overview

Dataset statistics

Number of variables: 6
Number of observations: 7065
Missing cells: 1539
Missing cells (%): 3.6%
Duplicate rows: 6
Duplicate rows (%): 0.1%
Total size in memory: 352.0 KiB
Average record size in memory: 51.0 B
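
The derived figures above can be checked from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (rows × columns) and that the average record size is total memory divided by the row count; the variable names are illustrative and not part of the report.

```python
# Raw counts copied from the dataset statistics above.
n_variables = 6
n_observations = 7065
missing_cells = 1539
duplicate_rows = 6
total_memory_kib = 352.0

# Assumption: percentages are relative to all cells / all rows.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the rounded results agree with the reported 3.6%, 0.1%, and 51.0 B.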

Variable types

Categorical: 1
DateTime: 1
Text: 2
Numeric: 2

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-22190/F/1/datasetView.do

Alerts

단속일 (enforcement date) has constant value "20210929" [Constant]
Dataset has 6 (0.1%) duplicate rows [Duplicates]
도로명 (road name) has 1531 (21.7%) missing values [Missing]
경도 (longitude) is highly skewed (γ1 = 59.41379953) [Skewed]
위도 (latitude) is highly skewed (γ1 = 83.71093753) [Skewed]

Reproduction

Analysis started: 2024-04-21 00:07:02.278536
Analysis finished: 2024-04-21 00:07:05.121040
Duration: 2.84 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

단속일
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 55.3 KiB
20210929: 7065

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20210929
2nd row: 20210929
3rd row: 20210929
4th row: 20210929
5th row: 20210929

Common Values

Value | Count | Frequency (%)
20210929 | 7065 | 100.0%

Length

Histogram of lengths of the category


단속시간
DateTime

Distinct: 6473
Distinct (%): 91.6%
Missing: 0
Missing (%): 0.0%
Memory size: 55.3 KiB
Minimum: 2024-04-21 00:00:44
Maximum: 2024-04-21 20:50:27
Histogram with fixed size bins (bins=50)
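
The Minimum and Maximum above carry the date 2024-04-21, the day the analysis ran, while the sample rows show an enforcement date of 20210929. One plausible reading (an assumption, not confirmed by the report) is that the time-only strings were parsed with the current date filled in. A minimal sketch of combining the date and time columns into one timestamp before profiling; the function name `combine_date_time` is hypothetical:

```python
from datetime import datetime


def combine_date_time(date_str: str, time_str: str) -> datetime:
    """Join a YYYYMMDD date and an HH:MM:SS time into one datetime."""
    return datetime.strptime(f"{date_str} {time_str}", "%Y%m%d %H:%M:%S")


# Date and times taken from the first and last sample rows of the report.
first = combine_date_time("20210929", "00:00:44")
last = combine_date_time("20210929", "20:50:27")

print(first.isoformat())
print(last.isoformat())
```

With a combined column, the reported range would fall on 2021-09-29 rather than on the analysis date.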

구주소
Text

Distinct: 2613
Distinct (%): 37.0%
Missing: 4
Missing (%): 0.1%
Memory size: 55.3 KiB

Length

Max length: 56
Median length: 41
Mean length: 13.83685
Min length: 2

Characters and Unicode

Total characters: 97702
Distinct characters: 408
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
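
As an illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own code; the helper name `category_counts` is hypothetical, and the input is the first sample address from this report. The short codes map to the category names used in the tables: `Lo` is Other Letter, `Nd` is Decimal Number, `Zs` is Space Separator.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category (e.g. 'Lo', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)


sample = "서울특별시 동대문구 장한로18길 9"
counts = category_counts(sample)

for category, count in counts.most_common():
    print(category, count)
```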

Unique

Unique: 1312
Unique (%): 18.6%

Sample

1st row: 서울특별시 동대문구 장한로18길 9
2nd row: 서울 강북구 수유동 142
3rd row: 서울 중구 흥인동 160
4th row: 서울 중구 흥인동 160
5th row: 서울 중구 흥인동 160
Value | Count | Frequency (%)
서울 | 2600 | 12.4%
서울특별시 | 675 | 3.2%
중구 | 426 | 2.0%
서초구 | 305 | 1.5%
종로구 | 233 | 1.1%
마포구 | 229 | 1.1%
동대문구 | 201 | 1.0%
양천구 | 188 | 0.9%
강서구 | 187 | 0.9%
신당동 | 175 | 0.8%
Other values (3238) | 15728 | 75.1%

Most occurring characters

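Value | Count | Frequency (%)
(space) | 14502 | 14.8%
(character not preserved) | 6857 | 7.0%
1 | 5806 | 5.9%
- | 5262 | 5.4%
(character not preserved) | 4145 | 4.2%
2 | 4033 | 4.1%
(character not preserved) | 3592 | 3.7%
3 | 3488 | 3.6%
(character not preserved) | 3297 | 3.4%
4 | 2929 | 3.0%
Other values (398) | 43791 | 44.8%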

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 46866 | 48.0%
Decimal Number | 29990 | 30.7%
Space Separator | 14502 | 14.8%
Dash Punctuation | 5262 | 5.4%
Open Punctuation | 381 | 0.4%
Close Punctuation | 378 | 0.4%
Other Punctuation | 294 | 0.3%
Uppercase Letter | 16 | < 0.1%
Math Symbol | 6 | < 0.1%
Lowercase Letter | 6 | < 0.1%

Most frequent character per category

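Other Letter
Value | Count | Frequency (%)
(character not preserved) | 6857 | 14.6%
(character not preserved) | 4145 | 8.8%
(character not preserved) | 3592 | 7.7%
(character not preserved) | 3297 | 7.0%
(character not preserved) | 1613 | 3.4%
(character not preserved) | 1225 | 2.6%
(character not preserved) | 758 | 1.6%
(character not preserved) | 722 | 1.5%
(character not preserved) | 706 | 1.5%
(character not preserved) | 675 | 1.4%
Other values (362) | 23276 | 49.7%

Decimal Number
Value | Count | Frequency (%)
1 | 5806 | 19.4%
2 | 4033 | 13.4%
3 | 3488 | 11.6%
4 | 2929 | 9.8%
5 | 2828 | 9.4%
7 | 2517 | 8.4%
6 | 2372 | 7.9%
9 | 2108 | 7.0%
0 | 2046 | 6.8%
8 | 1863 | 6.2%

Uppercase Letter
Value | Count | Frequency (%)
S | 5 | 31.2%
B | 2 | 12.5%
G | 2 | 12.5%
K | 1 | 6.2%
I | 1 | 6.2%
L | 1 | 6.2%
C | 1 | 6.2%
U | 1 | 6.2%
E | 1 | 6.2%
T | 1 | 6.2%

Lowercase Letter
Value | Count | Frequency (%)
o | 1 | 16.7%
w | 1 | 16.7%
e | 1 | 16.7%
r | 1 | 16.7%
d | 1 | 16.7%
b | 1 | 16.7%

Other Punctuation
Value | Count | Frequency (%)
, | 289 | 98.3%
/ | 3 | 1.0%
: | 1 | 0.3%
@ | 1 | 0.3%

Space Separator
Value | Count | Frequency (%)
(space) | 14502 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 5262 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 381 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 378 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 6 | 100.0%

Letter Number
Value | Count | Frequency (%)
(character not preserved) | 1 | 100.0%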

Most occurring scripts

Value | Count | Frequency (%)
Common | 50813 | 52.0%
Hangul | 46866 | 48.0%
Latin | 23 | < 0.1%

Most frequent character per script

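Hangul
Value | Count | Frequency (%)
(character not preserved) | 6857 | 14.6%
(character not preserved) | 4145 | 8.8%
(character not preserved) | 3592 | 7.7%
(character not preserved) | 3297 | 7.0%
(character not preserved) | 1613 | 3.4%
(character not preserved) | 1225 | 2.6%
(character not preserved) | 758 | 1.6%
(character not preserved) | 722 | 1.5%
(character not preserved) | 706 | 1.5%
(character not preserved) | 675 | 1.4%
Other values (362) | 23276 | 49.7%

Common
Value | Count | Frequency (%)
(space) | 14502 | 28.5%
1 | 5806 | 11.4%
- | 5262 | 10.4%
2 | 4033 | 7.9%
3 | 3488 | 6.9%
4 | 2929 | 5.8%
5 | 2828 | 5.6%
7 | 2517 | 5.0%
6 | 2372 | 4.7%
9 | 2108 | 4.1%
Other values (9) | 4968 | 9.8%

Latin
Value | Count | Frequency (%)
S | 5 | 21.7%
B | 2 | 8.7%
G | 2 | 8.7%
K | 1 | 4.3%
(character not preserved) | 1 | 4.3%
I | 1 | 4.3%
L | 1 | 4.3%
C | 1 | 4.3%
U | 1 | 4.3%
E | 1 | 4.3%
Other values (7) | 7 | 30.4%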

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 50835 | 52.0%
Hangul | 46866 | 48.0%
Number Forms | 1 | < 0.1%

Most frequent character per block

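ASCII
Value | Count | Frequency (%)
(space) | 14502 | 28.5%
1 | 5806 | 11.4%
- | 5262 | 10.4%
2 | 4033 | 7.9%
3 | 3488 | 6.9%
4 | 2929 | 5.8%
5 | 2828 | 5.6%
7 | 2517 | 5.0%
6 | 2372 | 4.7%
9 | 2108 | 4.1%
Other values (25) | 4990 | 9.8%

Hangul
Value | Count | Frequency (%)
(character not preserved) | 6857 | 14.6%
(character not preserved) | 4145 | 8.8%
(character not preserved) | 3592 | 7.7%
(character not preserved) | 3297 | 7.0%
(character not preserved) | 1613 | 3.4%
(character not preserved) | 1225 | 2.6%
(character not preserved) | 758 | 1.6%
(character not preserved) | 722 | 1.5%
(character not preserved) | 706 | 1.5%
(character not preserved) | 675 | 1.4%
Other values (362) | 23276 | 49.7%

Number Forms
Value | Count | Frequency (%)
(character not preserved) | 1 | 100.0%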

도로명
Text

MISSING 

Distinct: 1781
Distinct (%): 32.2%
Missing: 1531
Missing (%): 21.7%
Memory size: 55.3 KiB

Length

Max length: 29
Median length: 24
Mean length: 11.244308
Min length: 2

Characters and Unicode

Total characters: 62226
Distinct characters: 327
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 793
Unique (%): 14.3%
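
The percentages for this variable are consistent with two different denominators: the missing share appears to be taken over all rows, while the distinct and unique shares appear to be taken over non-missing rows only. This is inferred from the reported numbers, not stated by the report. A small arithmetic check, with illustrative variable names:

```python
# Counts reported for the 도로명 variable.
n_rows = 7065
missing = 1531
distinct = 1781
unique = 793

non_missing = n_rows - missing

# Assumption: Missing (%) uses all rows; Distinct/Unique (%) use non-missing rows.
missing_pct = 100 * missing / n_rows
distinct_pct = 100 * distinct / non_missing
unique_pct = 100 * unique / non_missing

print(f"non-missing rows: {non_missing}")
print(f"missing: {missing_pct:.1f}%  distinct: {distinct_pct:.1f}%  unique: {unique_pct:.1f}%")
```

Dividing the distinct count by all 7065 rows instead would give about 25.2%, which does not match the reported 32.2%.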

Sample

1st row: 서울특별시 동대문구 장한로18길 9
2nd row: 서울특별시 강북구 도봉로 337
3rd row: 서울특별시 강북구 솔매로 99
4th row: 퇴계로 307 (광희동1가)
5th row: 을지로 238 (을지로6가)
Value | Count | Frequency (%)
서울 | 1339 | 9.3%
서울특별시 | 409 | 2.8%
중구 | 226 | 1.6%
양천구 | 182 | 1.3%
마포구 | 152 | 1.1%
청계천로 | 145 | 1.0%
퇴계로 | 112 | 0.8%
은평구 | 106 | 0.7%
11 | 104 | 0.7%
광진구 | 100 | 0.7%
Other values (1864) | 11546 | 80.1%

Most occurring characters

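Value | Count | Frequency (%)
(space) | 9118 | 14.7%
(character not preserved) | 5533 | 8.9%
1 | 3896 | 6.3%
2 | 2452 | 3.9%
(character not preserved) | 2308 | 3.7%
3 | 2224 | 3.6%
(character not preserved) | 2049 | 3.3%
(character not preserved) | 1971 | 3.2%
(character not preserved) | 1775 | 2.9%
4 | 1641 | 2.6%
Other values (317) | 29259 | 47.0%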

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 34050 | 54.7%
Decimal Number | 17720 | 28.5%
Space Separator | 9118 | 14.7%
Dash Punctuation | 606 | 1.0%
Open Punctuation | 333 | 0.5%
Close Punctuation | 333 | 0.5%
Uppercase Letter | 62 | 0.1%
Other Punctuation | 4 | < 0.1%

Most frequent character per category

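Other Letter
Value | Count | Frequency (%)
(character not preserved) | 5533 | 16.2%
(character not preserved) | 2308 | 6.8%
(character not preserved) | 2049 | 6.0%
(character not preserved) | 1971 | 5.8%
(character not preserved) | 1775 | 5.2%
(character not preserved) | 988 | 2.9%
(character not preserved) | 675 | 2.0%
(character not preserved) | 633 | 1.9%
(character not preserved) | 506 | 1.5%
(character not preserved) | 474 | 1.4%
Other values (298) | 17138 | 50.3%

Decimal Number
Value | Count | Frequency (%)
1 | 3896 | 22.0%
2 | 2452 | 13.8%
3 | 2224 | 12.6%
4 | 1641 | 9.3%
6 | 1453 | 8.2%
5 | 1431 | 8.1%
7 | 1304 | 7.4%
0 | 1258 | 7.1%
8 | 1127 | 6.4%
9 | 934 | 5.3%

Uppercase Letter
Value | Count | Frequency (%)
L | 30 | 48.4%
G | 30 | 48.4%
C | 1 | 1.6%
S | 1 | 1.6%

Space Separator
Value | Count | Frequency (%)
(space) | 9118 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 606 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 333 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 333 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 4 | 100.0%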

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 34050 | 54.7%
Common | 28114 | 45.2%
Latin | 62 | 0.1%

Most frequent character per script

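Hangul
Value | Count | Frequency (%)
(character not preserved) | 5533 | 16.2%
(character not preserved) | 2308 | 6.8%
(character not preserved) | 2049 | 6.0%
(character not preserved) | 1971 | 5.8%
(character not preserved) | 1775 | 5.2%
(character not preserved) | 988 | 2.9%
(character not preserved) | 675 | 2.0%
(character not preserved) | 633 | 1.9%
(character not preserved) | 506 | 1.5%
(character not preserved) | 474 | 1.4%
Other values (298) | 17138 | 50.3%

Common
Value | Count | Frequency (%)
(space) | 9118 | 32.4%
1 | 3896 | 13.9%
2 | 2452 | 8.7%
3 | 2224 | 7.9%
4 | 1641 | 5.8%
6 | 1453 | 5.2%
5 | 1431 | 5.1%
7 | 1304 | 4.6%
0 | 1258 | 4.5%
8 | 1127 | 4.0%
Other values (5) | 2210 | 7.9%

Latin
Value | Count | Frequency (%)
L | 30 | 48.4%
G | 30 | 48.4%
C | 1 | 1.6%
S | 1 | 1.6%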

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 34050 | 54.7%
ASCII | 28176 | 45.3%

Most frequent character per block

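ASCII
Value | Count | Frequency (%)
(space) | 9118 | 32.4%
1 | 3896 | 13.8%
2 | 2452 | 8.7%
3 | 2224 | 7.9%
4 | 1641 | 5.8%
6 | 1453 | 5.2%
5 | 1431 | 5.1%
7 | 1304 | 4.6%
0 | 1258 | 4.5%
8 | 1127 | 4.0%
Other values (9) | 2272 | 8.1%

Hangul
Value | Count | Frequency (%)
(character not preserved) | 5533 | 16.2%
(character not preserved) | 2308 | 6.8%
(character not preserved) | 2049 | 6.0%
(character not preserved) | 1971 | 5.8%
(character not preserved) | 1775 | 5.2%
(character not preserved) | 988 | 2.9%
(character not preserved) | 675 | 2.0%
(character not preserved) | 633 | 1.9%
(character not preserved) | 506 | 1.5%
(character not preserved) | 474 | 1.4%
Other values (298) | 17138 | 50.3%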

경도
Real number (ℝ)

SKEWED 

Distinct: 3383
Distinct (%): 47.9%
Missing: 2
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 358020.13
Minimum: 37.543484
Maximum: 1.2638998 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 62.2 KiB

Quantile statistics

Minimum: 37.543484
5-th percentile: 126.86325
Q1: 126.91909
Median: 126.98556
Q3: 127.02596
95-th percentile: 127.0759
Maximum: 1.2638998 × 10^9
Range: 1.2638998 × 10^9
Interquartile range (IQR): 0.1068695
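
The quartiles above make the extreme minimum and maximum easy to flag. The sketch below applies Tukey's 1.5 × IQR fences, a common convention that the report itself does not apply; the helper name `is_outlier` is hypothetical, and the quartile values are copied from the table above.

```python
# Quartiles reported for 경도 (longitude).
q1 = 126.91909
q3 = 127.02596

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr


def is_outlier(value: float) -> bool:
    """True if the value lies outside the 1.5 x IQR fences."""
    return value < lower_fence or value > upper_fence


print(f"IQR = {iqr:.5f}, fences = [{lower_fence:.4f}, {upper_fence:.4f}]")
for value in (37.543484, 126.98556, 1263899849.0):
    print(value, is_outlier(value))
```

Both the reported minimum and maximum fall outside these fences, while the median does not.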

Descriptive statistics

Standard deviation: 21266806
Coefficient of variation (CV): 59.401144
Kurtosis: 3528.9989
Mean: 358020.13
Median Absolute Deviation (MAD): 0.054280531
Skewness: 59.4138
Sum: 2.5286962 × 10^9
Variance: 4.5227702 × 10^14
Monotonicity: Not monotonic
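
The mean of about 358,020 is far from the median of about 126.99. A short arithmetic sketch shows how much of the reported sum comes from the two rows with value 1263899849.0 listed among this variable's largest values. All inputs are copied from the report; because the reported sum is rounded, the recomputed figures are approximate.

```python
# Figures reported for 경도 (longitude).
n_valid = 7065 - 2              # observations minus missing values
reported_sum = 2.5286962e9
reported_std = 21266806
outlier_value = 1263899849.0    # largest value in the report
outlier_count = 2               # its reported count

mean = reported_sum / n_valid
cv = reported_std / mean        # coefficient of variation = std / mean

outlier_total = outlier_count * outlier_value
outlier_share = outlier_total / reported_sum
mean_without = (reported_sum - outlier_total) / (n_valid - outlier_count)

print(f"mean = {mean:.2f}, CV = {cv:.4f}")
print(f"share of sum from the two extreme rows = {outlier_share:.4%}")
print(f"approximate mean without them = {mean_without:.2f}")
```

Under these inputs the two extreme rows account for more than 99.9% of the sum, and the mean without them lands close to the median.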
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
126.9158448 | 59 | 0.8%
126.9786163 | 53 | 0.8%
126.93173 | 44 | 0.6%
126.97281501547188 | 42 | 0.6%
126.97925943831896 | 40 | 0.6%
126.924697 | 34 | 0.5%
126.91878 | 30 | 0.4%
126.98750515671884 | 29 | 0.4%
126.9149868 | 23 | 0.3%
126.86665492510193 | 21 | 0.3%
Other values (3373) | 6688 | 94.7%

Minimum 10 values
Value | Count | Frequency (%)
37.5434837341309 | 1 | < 0.1%
126.807648069139 | 1 | < 0.1%
126.80785727929504 | 1 | < 0.1%
126.807877655171 | 1 | < 0.1%
126.80794662702776 | 1 | < 0.1%
126.8081155843709 | 1 | < 0.1%
126.80856970039116 | 1 | < 0.1%
126.80861714276212 | 1 | < 0.1%
126.80878027490792 | 1 | < 0.1%
126.80880908682272 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
1263899849.0 | 2 | < 0.1%
127.17401708848259 | 2 | < 0.1%
127.1734772 | 3 | < 0.1%
127.1734543 | 2 | < 0.1%
127.1733246 | 1 | < 0.1%
127.1729889 | 3 | < 0.1%
127.1723099 | 1 | < 0.1%
127.168663 | 5 | 0.1%
127.166030545194 | 2 | < 0.1%
127.165508 | 4 | 0.1%

위도
Real number (ℝ)

SKEWED 

Distinct: 3381
Distinct (%): 47.9%
Missing: 2
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.564385
Minimum: 35.517833
Maximum: 127.01531
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 62.2 KiB

Quantile statistics

Minimum: 35.517833
5-th percentile: 37.476667
Q1: 37.519244
Median: 37.557256
Q3: 37.573069
95-th percentile: 37.640806
Maximum: 127.01531
Range: 91.497479
Interquartile range (IQR): 0.053825178

Descriptive statistics

Standard deviation: 1.0659104
Coefficient of variation (CV): 0.02837556
Kurtosis: 7026.0692
Mean: 37.564385
Median Absolute Deviation (MAD): 0.028588932
Skewness: 83.710938
Sum: 265317.25
Variance: 1.1361651
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
37.5859655 | 59 | 0.8%
37.5744654 | 53 | 0.8%
37.520769 | 44 | 0.6%
37.57306915349144 | 42 | 0.6%
37.57122342169176 | 40 | 0.6%
37.491633 | 34 | 0.5%
37.529953 | 30 | 0.4%
37.55846872864774 | 29 | 0.4%
37.5693973 | 23 | 0.3%
37.5695626 | 21 | 0.3%
Other values (3371) | 6688 | 94.7%

Minimum 10 values
Value | Count | Frequency (%)
35.517833 | 1 | < 0.1%
37.44122280819926 | 2 | < 0.1%
37.44410730254636 | 1 | < 0.1%
37.444220068329805 | 1 | < 0.1%
37.446707 | 1 | < 0.1%
37.447218 | 1 | < 0.1%
37.4477544 | 1 | < 0.1%
37.44893 | 2 | < 0.1%
37.4489924636289 | 1 | < 0.1%
37.450028 | 2 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
127.015312194824 | 1 | < 0.1%
37.69013773708887 | 4 | 0.1%
37.683274044176514 | 1 | < 0.1%
37.6792296666667 | 1 | < 0.1%
37.6792063333333 | 1 | < 0.1%
37.6791726666667 | 1 | < 0.1%
37.6791491666667 | 1 | < 0.1%
37.6790831666667 | 1 | < 0.1%
37.6790255 | 1 | < 0.1%
37.6785601666667 | 1 | < 0.1%
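
The extremes of the two coordinate variables look like data-entry problems: the smallest 경도 (37.54...) resembles a latitude, and the largest 위도 (127.01...) resembles a longitude. The sketch below shows one way such rows could be flagged before analysis. The bounding box for Seoul is a rough assumption chosen for illustration, the function names are hypothetical, and pairing the two extreme values as one row is also an assumption, since the report does not show which rows they come from.

```python
# Rough bounding box for Seoul (assumed limits, for illustration only).
LON_RANGE = (126.7, 127.3)
LAT_RANGE = (37.4, 37.75)


def in_range(value: float, bounds: tuple) -> bool:
    low, high = bounds
    return low <= value <= high


def classify(lon: float, lat: float) -> str:
    """Return 'ok', 'swapped', or 'invalid' for a (longitude, latitude) pair."""
    if in_range(lon, LON_RANGE) and in_range(lat, LAT_RANGE):
        return "ok"
    if in_range(lat, LON_RANGE) and in_range(lon, LAT_RANGE):
        return "swapped"
    return "invalid"


print(classify(127.069473, 37.567763))               # first sample row
print(classify(37.5434837341309, 127.015312194824))  # hypothetical pairing of the extremes
print(classify(1263899849.0, 37.5))                  # largest 경도 with a placeholder latitude
```

Rows classified as swapped could be repaired by exchanging the two fields, while invalid rows would need to be checked against the source data.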

Interactions

[Interaction plots for 경도 and 위도; figures not preserved]

Correlations

Matrix 1
 | 경도 | 위도
경도 | 1.000 | 0.000
위도 | 0.000 | 1.000

Matrix 2
 | 경도 | 위도
경도 | 1.000 | 0.467
위도 | 0.467 | 1.000
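
The report does not say which coefficient each correlation matrix uses. One possible explanation for such a gap (an assumption, not something the report confirms) is that a linear coefficient and a rank-based one respond very differently to a single extreme value. The sketch below illustrates this effect on made-up numbers; `pearson`, `ranks`, and the data are hypothetical and unrelated to the profiler's implementation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def ranks(values):
    """Rank values 1..n (no tie handling; the toy inputs are distinct)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


# Made-up data: a clear increasing trend plus one extreme x value.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 1e9]
y = [1.1, 2.3, 2.9, 4.2, 5.1, 3.0]

linear = pearson(x, y)
rank_based = pearson(ranks(x), ranks(y))  # Spearman = Pearson on ranks

print(f"Pearson: {linear:.3f}, Spearman: {rank_based:.3f}")
```

On this toy data the linear coefficient is close to zero while the rank-based one stays high, which is the kind of difference a single extreme coordinate value can produce.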

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
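
As a rough illustration of nullity correlation, the sketch below computes the Pearson correlation between the presence indicators of two columns. It is a minimal sketch rather than the profiler's code; the function name and the toy columns are made up, with `None` standing in for a missing value.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the is-present indicators of two columns."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None  # undefined when a column is fully present or fully missing
    return cov / sqrt(var_a * var_b)


# Toy columns (made up): lon and lat are always missing together.
lon = [127.01, None, 126.98, None, 127.02]
lat = [37.56, None, 37.57, None, 37.55]
road = ["a", "b", None, "c", "d"]

print(nullity_correlation(lon, lat))
print(nullity_correlation(lon, road))
```

A value near 1 means two columns tend to be missing in the same rows; a negative value means one tends to be present where the other is missing.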

Sample

First rows
Index | 단속일 | 단속시간 | 구주소 | 도로명 | 경도 | 위도
0 | 20210929 | 00:00:44 | 서울특별시 동대문구 장한로18길 9 | 서울특별시 동대문구 장한로18길 9 | 127.069473 | 37.567763
1 | 20210929 | 00:03:36 | 서울 강북구 수유동 142 | 서울특별시 강북구 도봉로 337 | 127.025343 | 37.638049
2 | 20210929 | 00:04:20 | 서울 중구 흥인동 160 | <NA> | 127.014166 | 37.565367
3 | 20210929 | 00:04:53 | 서울 중구 흥인동 160 | <NA> | 127.014206 | 37.565343
4 | 20210929 | 00:05:23 | 서울 중구 흥인동 160 | <NA> | 127.014194 | 37.565294
5 | 20210929 | 00:07:03 | 서울 중구 을지로7가 2-36 | <NA> | 127.010355 | 37.565875
6 | 20210929 | 00:07:05 | 서울 중구 을지로7가 2-36 | <NA> | 127.010485 | 37.565923
7 | 20210929 | 00:07:09 | 서울 중구 을지로7가 2-36 | <NA> | 127.01078 | 37.56601
8 | 20210929 | 00:07:10 | 서울 중구 을지로7가 2-36 | <NA> | 127.010922 | 37.566052
9 | 20210929 | 00:07:11 | 서울 중구 을지로7가 2-36 | <NA> | 127.010922 | 37.566052

Last rows
Index | 단속일 | 단속시간 | 구주소 | 도로명 | 경도 | 위도
7055 | 20210929 | 20:49:12 | 서울특별시 서초구 서초대로77길 15 (서초동, 대경빌딩) | <NA> | 127.026295 | 37.499184
7056 | 20210929 | 20:49:31 | 서울특별시 서초구 서초대로77길 15 (서초동, 대경빌딩) | <NA> | 127.026268 | 37.499128
7057 | 20210929 | 20:49:32 | 서울특별시 서초구 청두곶길 50 (방배동) | <NA> | 126.985357 | 37.481141
7058 | 20210929 | 20:49:41 | 신당동 217-92번지앞 | 청계천로 318 | 127.01403 | 37.569563
7059 | 20210929 | 20:49:41 | 서울 중구 신당동 372-12 | <NA> | 127.01135 | 37.5534
7060 | 20210929 | 20:49:54 | 방이동 89-28 | 양재대로 1233 | 127.130372 | 37.515717
7061 | 20210929 | 20:50:00 | 서울 금천구 독산동 901-4 | 서울특별시 금천구 남부순환로 1424 | 126.908784 | 37.480488
7062 | 20210929 | 20:50:01 | 서울특별시 서초구 서초대로77길 15 (서초동, 대경빌딩) | <NA> | 127.026246 | 37.498826
7063 | 20210929 | 20:50:16 | 신월3동 166-7 | 남부순환로 364 | 126.828895 | 37.534737
7064 | 20210929 | 20:50:27 | 서울특별시 중구 동호로 171 | 서울특별시 중구 동호로 171 | 127.011243 | 37.553442

Duplicate rows

Most frequently occurring

Index | 단속일 | 단속시간 | 구주소 | 도로명 | 경도 | 위도 | # duplicates
0 | 20210929 | 04:10:00 | 서울 관악구 신림동 475-103 | 서울 관악구 조원로31길 57 | 126.918917 | 37.48845 | 2
1 | 20210929 | 13:31:51 | 서울 강서구 마곡동 758 | <NA> | 126.825353 | 37.569363 | 2
2 | 20210929 | 13:36:00 | 서울 동대문구 장안동 465-1 | 서울특별시 동대문구 천호대로 427-4 | 127.066761 | 37.561834 | 2
3 | 20210929 | 14:36:00 | 서울 관악구 봉천동 893-29 | 서울 관악구 남부순환로 1769 | 126.946944 | 37.482138 | 2
4 | 20210929 | 18:55:00 | 서울 강서구 염창동 281 | 서울특별시 강서구 공항대로 607 | 126.872405 | 37.547476 | 2
5 | 20210929 | 19:27:38 | 서울특별시 동대문구 홍릉로12길 24 | <NA> | 127.044429 | 37.585782 | 2
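
Duplicate groups like these can be found by counting identical rows. The following is a minimal sketch using only the standard library, not the profiler's implementation; `duplicate_groups` is a hypothetical helper, rows are represented as tuples with `None` for a missing 도로명, and the example rows are taken from the tables in this report.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return {row: count} for every row that occurs more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


rows = [
    ("20210929", "13:31:51", "서울 강서구 마곡동 758", None, 126.825353, 37.569363),
    ("20210929", "13:31:51", "서울 강서구 마곡동 758", None, 126.825353, 37.569363),
    ("20210929", "00:00:44", "서울특별시 동대문구 장한로18길 9",
     "서울특별시 동대문구 장한로18길 9", 127.069473, 37.567763),
]

groups = duplicate_groups(rows)
for row, count in groups.items():
    print(count, row)
```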