Overview

Dataset statistics

Number of variables: 7
Number of observations: 370
Missing cells: 2
Missing cells (%): 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 20.7 KiB
Average record size in memory: 57.4 B
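
The percentages and sizes above follow from simple arithmetic on the counts. The sketch below reproduces two of them from the figures in this section; the function name `missing_cell_share` is illustrative and is not part of any profiling library.

```python
# Figures copied from the Dataset statistics in this report.
n_variables = 7
n_observations = 370
missing_cells = 2
avg_record_bytes = 57.4


def missing_cell_share(missing: int, rows: int, cols: int) -> float:
    """Return missing cells as a percentage of all cells (rows x cols)."""
    return 100.0 * missing / (rows * cols)


share = missing_cell_share(missing_cells, n_observations, n_variables)
print(f"Missing cells: {share:.1f}%")  # 2 of 2,590 cells

# Average record size times the row count approximates the total size.
total_kib = avg_record_bytes * n_observations / 1024
print(f"Total size: {total_kib:.1f} KiB")
```

Both values agree with the report after rounding to one decimal place.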

Variable types

Categorical: 1
Text: 5
Numeric: 1

Dataset

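Description: Provides information on the business offices (영업소) for expressway Hi-pass terminals. Columns: 본부명 (regional headquarters), 지사명 (branch), 영업소명 (business office), 전화번호 (telephone number), 팩스번호 (fax number), 우편번호 (postal code), 주소 (address).
Author: 한국도로공사 (Korea Expressway Corporation)
URL: https://www.data.go.kr/data/15064223/fileData.do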

Alerts

우편번호 is highly overall correlated with 본부명 (High correlation)
본부명 is highly overall correlated with 우편번호 (High correlation)
영업소명 has unique values (Unique)
전화번호 has unique values (Unique)
팩스번호 has unique values (Unique)
주소 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 01:36:07.230211
Analysis finished: 2023-12-12 01:36:08.220695
Duration: 0.99 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
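
The reported duration can be checked against the two timestamps. This is a minimal sketch using only the Python standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of this report.
started = datetime.fromisoformat("2023-12-12 01:36:07.230211")
finished = datetime.fromisoformat("2023-12-12 01:36:08.220695")

# Subtracting two datetimes yields a timedelta.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```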

Variables

본부명
Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

Value | Count
부산경남 | 69
대구경북 | 57
광주전남 | 52
강원 | 48
수도권 | 41
Other values (3) | 103

Length

Max length: 4
Median length: 4
Mean length: 3.2675676
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 수도권
2nd row: 수도권
3rd row: 수도권
4th row: 수도권
5th row: 수도권

Common Values

Value | Count | Frequency (%)
부산경남 | 69 | 18.6%
대구경북 | 57 | 15.4%
광주전남 | 52 | 14.1%
강원 | 48 | 13.0%
수도권 | 41 | 11.1%
대전충남 | 36 | 9.7%
전북 | 34 | 9.2%
충북 | 33 | 8.9%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the counts listed in the Common Values table]

지사명
Text

Distinct: 56
Distinct (%): 15.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

Length

Max length: 4
Median length: 2
Mean length: 2.0702703
Min length: 2

Characters and Unicode

Total characters: 766
Distinct characters: 58
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
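
As a rough illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories over a list of strings. The helper name `category_counts` is illustrative, and this is not necessarily how ydata-profiling builds its own tables. The standard library reports categories and character names but not scripts or blocks.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories (e.g. 'Lo', 'Nd') over all characters."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample values listed for this variable.
sample = ["인천", "인천", "시흥", "시흥", "시흥"]
counts = category_counts(sample)
print(dict(counts))  # every character is 'Lo' (Other Letter)

# Character names identify Hangul syllables explicitly.
print(unicodedata.name("인"))
```

Applied to a hyphenated telephone number, the same helper separates decimal digits (`Nd`) from dash punctuation (`Pd`).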

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 인천
2nd row: 인천
3rd row: 시흥
4th row: 시흥
5th row: 시흥

Value | Count | Frequency (%)
대구 | 15 | 4.1%
진주 | 12 | 3.2%
함평 | 11 | 3.0%
울산 | 10 | 2.7%
화성 | 10 | 2.7%
이천 | 9 | 2.4%
순천 | 9 | 2.4%
춘천 | 9 | 2.4%
강릉 | 9 | 2.4%
군포 | 9 | 2.4%
Other values (46) | 267 | 72.2%

Most occurring characters

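Value | Count | Frequency (%)
(not captured) | 72 | 9.4%
(not captured) | 56 | 7.3%
(not captured) | 36 | 4.7%
(not captured) | 31 | 4.0%
(not captured) | 29 | 3.8%
(not captured) | 28 | 3.7%
(not captured) | 27 | 3.5%
(not captured) | 25 | 3.3%
(not captured) | 25 | 3.3%
(not captured) | 21 | 2.7%
Other values (48) | 416 | 54.3%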

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 766 | 100.0%

Most frequent character per category

Other Letter: same rows as under Most occurring characters, since all 766 characters are Other Letter.

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 766 | 100.0%

Most frequent character per script

Hangul: same rows as under Most occurring characters.

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 766 | 100.0%

Most frequent character per block

Hangul: same rows as under Most occurring characters.

영업소명
Text

UNIQUE 

Distinct: 370
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

Length

Max length: 9
Median length: 2
Mean length: 2.5324324
Min length: 2

Characters and Unicode

Total characters: 937
Distinct characters: 187
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 370
Unique (%): 100.0%

Sample

1st row: 인천
2nd row: 김포
3rd row: 서서울
4th row: 시흥
5th row: 남인천

Value | Count | Frequency (%)
인천 | 1 | 0.3%
서포항 | 1 | 0.3%
경주 | 1 | 0.3%
동김천 | 1 | 0.3%
칠곡물류 | 1 | 0.3%
추풍령 | 1 | 0.3%
김천 | 1 | 0.3%
구미 | 1 | 0.3%
남구미 | 1 | 0.3%
왜관 | 1 | 0.3%
Other values (360) | 360 | 97.3%

Most occurring characters

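Value | Count | Frequency (%)
(not captured) | 42 | 4.5%
(not captured) | 40 | 4.3%
(not captured) | 39 | 4.2%
(not captured) | 37 | 3.9%
(not captured) | 37 | 3.9%
(not captured) | 36 | 3.8%
(not captured) | 29 | 3.1%
(not captured) | 25 | 2.7%
(not captured) | 24 | 2.6%
(not captured) | 20 | 2.1%
Other values (177) | 608 | 64.9%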

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 930 | 99.3%
Open Punctuation | 3 | 0.3%
Close Punctuation | 3 | 0.3%
Decimal Number | 1 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not captured) | 42 | 4.5%
(not captured) | 40 | 4.3%
(not captured) | 39 | 4.2%
(not captured) | 37 | 4.0%
(not captured) | 37 | 4.0%
(not captured) | 36 | 3.9%
(not captured) | 29 | 3.1%
(not captured) | 25 | 2.7%
(not captured) | 24 | 2.6%
(not captured) | 20 | 2.2%
Other values (174) | 601 | 64.6%

Open Punctuation
Value | Count | Frequency (%)
( | 3 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 3 | 100.0%

Decimal Number
Value | Count | Frequency (%)
2 | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 930 | 99.3%
Common | 7 | 0.7%

Most frequent character per script

Hangul: same rows as the Other Letter table under Most frequent character per category.

Common
Value | Count | Frequency (%)
( | 3 | 42.9%
) | 3 | 42.9%
2 | 1 | 14.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 930 | 99.3%
ASCII | 7 | 0.7%

Most frequent character per block

Hangul: same rows as the Other Letter table under Most frequent character per category.

ASCII
Value | Count | Frequency (%)
( | 3 | 42.9%
) | 3 | 42.9%
2 | 1 | 14.3%

전화번호
Text

UNIQUE 

Distinct: 370
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 4440
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 370
Unique (%): 100.0%

Sample

1st row: 032-718-2190
2nd row: 031-327-2101
3rd row: 031-327-2103
4th row: 031-327-2104
5th row: 032-718-2191
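
Every value in this column is 12 characters long, and the sample rows all follow a three-three-four digit layout separated by hyphens. The sketch below checks values against that layout. The pattern is inferred from the sample rows and length statistics, not stated by the report, and the names `PHONE_PATTERN` and `is_well_formed` are illustrative.

```python
import re

# Assumed layout: NNN-NNN-NNNN, using ASCII digits only.
PHONE_PATTERN = re.compile(r"[0-9]{3}-[0-9]{3}-[0-9]{4}")


def is_well_formed(number: str) -> bool:
    """Return True if the whole string matches the assumed layout."""
    return PHONE_PATTERN.fullmatch(number) is not None


# The five sample rows shown for this variable.
sample = [
    "032-718-2190",
    "031-327-2101",
    "031-327-2103",
    "031-327-2104",
    "032-718-2191",
]

results = [is_well_formed(number) for number in sample]
print(all(results))
```

A check like this could flag rows that break the layout before the column is used for matching or deduplication.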

Value | Count | Frequency (%)
032-718-2190 | 1 | 0.3%
054-706-2745 | 1 | 0.3%
054-706-2702 | 1 | 0.3%
054-706-2712 | 1 | 0.3%
054-706-2742 | 1 | 0.3%
054-706-2714 | 1 | 0.3%
054-706-2711 | 1 | 0.3%
054-706-2708 | 1 | 0.3%
054-706-2707 | 1 | 0.3%
054-706-2741 | 1 | 0.3%
Other values (360) | 360 | 97.3%

Most occurring characters

Value | Count | Frequency (%)
- | 740 | 16.7%
2 | 701 | 15.8%
1 | 581 | 13.1%
0 | 552 | 12.4%
7 | 385 | 8.7%
3 | 383 | 8.6%
5 | 299 | 6.7%
4 | 268 | 6.0%
6 | 240 | 5.4%
8 | 224 | 5.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 3700 | 83.3%
Dash Punctuation | 740 | 16.7%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 701 | 18.9%
1 | 581 | 15.7%
0 | 552 | 14.9%
7 | 385 | 10.4%
3 | 383 | 10.4%
5 | 299 | 8.1%
4 | 268 | 7.2%
6 | 240 | 6.5%
8 | 224 | 6.1%
9 | 67 | 1.8%

Dash Punctuation
Value | Count | Frequency (%)
- | 740 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 4440 | 100.0%

Most frequent character per script

Common: same rows as under Most occurring characters.

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 4440 | 100.0%

Most frequent character per block

ASCII: same rows as under Most occurring characters.

팩스번호
Text

UNIQUE 

Distinct: 370
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 4440
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 370
Unique (%): 100.0%

Sample

1st row: 032-717-4190
2nd row: 031-894-4101
3rd row: 031-894-4103
4th row: 031-894-4104
5th row: 032-717-4191

Value | Count | Frequency (%)
032-717-4190 | 1 | 0.3%
054-706-4745 | 1 | 0.3%
054-706-4702 | 1 | 0.3%
054-706-4712 | 1 | 0.3%
054-706-4742 | 1 | 0.3%
054-706-4714 | 1 | 0.3%
054-706-4711 | 1 | 0.3%
054-706-4708 | 1 | 0.3%
054-706-4707 | 1 | 0.3%
054-706-4741 | 1 | 0.3%
Other values (360) | 360 | 97.3%

Most occurring characters

Value | Count | Frequency (%)
- | 740 | 16.7%
4 | 663 | 14.9%
0 | 642 | 14.5%
1 | 490 | 11.0%
7 | 370 | 8.3%
3 | 363 | 8.2%
5 | 312 | 7.0%
8 | 277 | 6.2%
2 | 248 | 5.6%
6 | 218 | 4.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 3700 | 83.3%
Dash Punctuation | 740 | 16.7%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
4 | 663 | 17.9%
0 | 642 | 17.4%
1 | 490 | 13.2%
7 | 370 | 10.0%
3 | 363 | 9.8%
5 | 312 | 8.4%
8 | 277 | 7.5%
2 | 248 | 6.7%
6 | 218 | 5.9%
9 | 117 | 3.2%

Dash Punctuation
Value | Count | Frequency (%)
- | 740 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 4440 | 100.0%

Most frequent character per script

Common: same rows as under Most occurring characters.

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 4440 | 100.0%

Most frequent character per block

ASCII: same rows as under Most occurring characters.

우편번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 362
Distinct (%): 98.4%
Missing: 2
Missing (%): 0.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 38141.443
Minimum: 10133
Maximum: 62683
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.4 KiB

Quantile statistics

Minimum: 10133
5-th percentile: 13375.2
Q1: 26389.25
median: 38007.5
Q3: 52206.75
95-th percentile: 57909.95
Maximum: 62683
Range: 52550
Interquartile range (IQR): 25817.5

Descriptive statistics

Standard deviation: 14562.942
Coefficient of variation (CV): 0.38181412
Kurtosis: -1.1850679
Mean: 38141.443
Median Absolute Deviation (MAD): 12798
Skewness: -0.14568811
Sum: 14036051
Variance: 2.1207927 × 10^8
Monotonicity: Not monotonic
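
Several of the statistics above are derived from others: the range from the extremes, the IQR from the quartiles, the CV and variance from the mean and standard deviation, and the sum from the mean and the number of non-missing values. The sketch below recomputes them from the rounded figures in this report, so small rounding differences are expected. The variable names are illustrative.

```python
# Figures copied from this report (rounded as displayed).
minimum, maximum = 10133, 62683
q1, q3 = 26389.25, 52206.75
mean, std = 38141.443, 14562.942
n_observations, n_missing = 370, 2

value_range = maximum - minimum        # 52550
iqr = q3 - q1                          # 25817.5
cv = std / mean                        # coefficient of variation
variance = std ** 2                    # square of the standard deviation
total = mean * (n_observations - n_missing)  # sum over non-missing values

print(value_range, iqr)
print(f"CV = {cv:.4f}, variance = {variance:.4e}, sum = {total:.0f}")
```

Each recomputed value matches the reported figure to within the rounding of its inputs.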

[Figure: Histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
11960 | 2 | 0.5%
39865 | 2 | 0.5%
50319 | 2 | 0.5%
58569 | 2 | 0.5%
46076 | 2 | 0.5%
15208 | 2 | 0.5%
58027 | 1 | 0.3%
56050 | 1 | 0.3%
38898 | 1 | 0.3%
38180 | 1 | 0.3%
Other values (352) | 352 | 95.1%
(Missing) | 2 | 0.5%

Minimum 10 values

Value | Count | Frequency (%)
10133 | 1 | 0.3%
11960 | 2 | 0.5%
12196 | 1 | 0.3%
12203 | 1 | 0.3%
12205 | 1 | 0.3%
12278 | 1 | 0.3%
12467 | 1 | 0.3%
12500 | 1 | 0.3%
12508 | 1 | 0.3%
12604 | 1 | 0.3%

Maximum 10 values

Value | Count | Frequency (%)
62683 | 1 | 0.3%
62460 | 1 | 0.3%
62409 | 1 | 0.3%
61046 | 1 | 0.3%
59502 | 1 | 0.3%
59442 | 1 | 0.3%
59416 | 1 | 0.3%
59309 | 1 | 0.3%
59202 | 1 | 0.3%
58569 | 2 | 0.5%

주소
Text

UNIQUE 

Distinct: 370
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

Length

Max length: 35
Median length: 32
Mean length: 22.797297
Min length: 14

Characters and Unicode

Total characters: 8435
Distinct characters: 268
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 370
Unique (%): 100.0%

Sample

1st row: 인천광역시 계양구 경인고속도로 17
2nd row: 경기도 김포시 김포대로 319번길 209-23
3rd row: 경기도 안산시 상록구 장하로 141-2
4th row: 경기도 시흥시 서해안로 1780번길94
5th row: 인천광역시 남동구 음실로 117번길 32

Value | Count | Frequency (%)
경기도 | 57 | 3.2%
경상남도 | 50 | 2.8%
경상북도 | 49 | 2.7%
전라남도 | 39 | 2.2%
강원도 | 35 | 1.9%
전라북도 | 33 | 1.8%
충청남도 | 29 | 1.6%
충청북도 | 29 | 1.6%
대구광역시 | 15 | 0.8%
남해고속도로 | 12 | 0.7%
Other values (991) | 1457 | 80.7%

Most occurring characters

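Value | Count | Frequency (%)
(space) | 1448 | 17.2%
(not captured) | 468 | 5.5%
(not captured) | 326 | 3.9%
1 | 246 | 2.9%
(not captured) | 241 | 2.9%
(not captured) | 207 | 2.5%
(not captured) | 196 | 2.3%
(not captured) | 189 | 2.2%
2 | 183 | 2.2%
(not captured) | 162 | 1.9%
Other values (258) | 4769 | 56.5%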

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 5463 | 64.8%
Space Separator | 1448 | 17.2%
Decimal Number | 1322 | 15.7%
Dash Punctuation | 124 | 1.5%
Close Punctuation | 37 | 0.4%
Open Punctuation | 37 | 0.4%
Other Punctuation | 2 | < 0.1%
Uppercase Letter | 2 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not captured) | 468 | 8.6%
(not captured) | 326 | 6.0%
(not captured) | 241 | 4.4%
(not captured) | 207 | 3.8%
(not captured) | 196 | 3.6%
(not captured) | 189 | 3.5%
(not captured) | 162 | 3.0%
(not captured) | 161 | 2.9%
(not captured) | 139 | 2.5%
(not captured) | 135 | 2.5%
Other values (240) | 3239 | 59.3%

Decimal Number
Value | Count | Frequency (%)
1 | 246 | 18.6%
2 | 183 | 13.8%
3 | 154 | 11.6%
5 | 141 | 10.7%
4 | 132 | 10.0%
7 | 106 | 8.0%
6 | 97 | 7.3%
0 | 93 | 7.0%
9 | 87 | 6.6%
8 | 83 | 6.3%

Other Punctuation
Value | Count | Frequency (%)
? | 1 | 50.0%
. | 1 | 50.0%

Uppercase Letter
Value | Count | Frequency (%)
I | 1 | 50.0%
C | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1448 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 124 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 37 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 37 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 5463 | 64.8%
Common | 2970 | 35.2%
Latin | 2 | < 0.1%

Most frequent character per script

Hangul: same rows as the Other Letter table under Most frequent character per category.

Common
Value | Count | Frequency (%)
(space) | 1448 | 48.8%
1 | 246 | 8.3%
2 | 183 | 6.2%
3 | 154 | 5.2%
5 | 141 | 4.7%
4 | 132 | 4.4%
- | 124 | 4.2%
7 | 106 | 3.6%
6 | 97 | 3.3%
0 | 93 | 3.1%
Other values (6) | 246 | 8.3%

Latin
Value | Count | Frequency (%)
I | 1 | 50.0%
C | 1 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 5463 | 64.8%
ASCII | 2972 | 35.2%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 1448 | 48.7%
1 | 246 | 8.3%
2 | 183 | 6.2%
3 | 154 | 5.2%
5 | 141 | 4.7%
4 | 132 | 4.4%
- | 124 | 4.2%
7 | 106 | 3.6%
6 | 97 | 3.3%
0 | 93 | 3.1%
Other values (8) | 248 | 8.3%

Hangul: same rows as the Other Letter table under Most frequent character per category.

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation plot]

Variable | 본부명 | 지사명 | 우편번호
본부명 | 1.000 | 1.000 | 0.901
지사명 | 1.000 | 1.000 | 0.972
우편번호 | 0.901 | 0.972 | 1.000

[Figure: correlation plot]

Variable | 우편번호 | 본부명
우편번호 | 1.000 | 0.725
본부명 | 0.725 | 1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows

Index | 본부명 | 지사명 | 영업소명 | 전화번호 | 팩스번호 | 우편번호 | 주소
0 | 수도권 | 인천 | 인천 | 032-718-2190 | 032-717-4190 | 21077 | 인천광역시 계양구 경인고속도로 17
1 | 수도권 | 인천 | 김포 | 031-327-2101 | 031-894-4101 | 10133 | 경기도 김포시 김포대로 319번길 209-23
2 | 수도권 | 시흥 | 서서울 | 031-327-2103 | 031-894-4103 | 15208 | 경기도 안산시 상록구 장하로 141-2
3 | 수도권 | 시흥 | 시흥 | 031-327-2104 | 031-894-4104 | 14924 | 경기도 시흥시 서해안로 1780번길94
4 | 수도권 | 시흥 | 남인천 | 032-718-2191 | 032-717-4191 | 21601 | 인천광역시 남동구 음실로 117번길 32
5 | 수도권 | 군포 | 군자 | 031-327-2105 | 031-894-4105 | 15004 | 경기도 시흥시 군자로335번길 36-29
6 | 수도권 | 군포 | 서안산 | 031-327-2106 | 031-894-4106 | 15210 | 경기도 안산시 단원구 시흥대로 19-30
7 | 수도권 | 군포 | 안산 | 031-327-2107 | 031-894-4107 | 15208 | 경기도 안산시 오리골길 15-1
8 | 수도권 | 군포 | 군포 | 031-327-2112 | 031-894-4112 | 15884 | 경기도 군포시 영동고속도로 26
9 | 수도권 | 군포 | 동군포 | 031-327-2111 | 031-894-4111 | 15878 | 경기도 군포시 영동고속도로 25

Last rows

Index | 본부명 | 지사명 | 영업소명 | 전화번호 | 팩스번호 | 우편번호 | 주소
360 | 부산경남 | 고성 | 동고성 | 055-711-2803 | 055-717-4803 | 52924 | 경상남도 고성군 거류면 송산로 421
361 | 부산경남 | 고성 | 북통영 | 055-711-2835 | 055-717-4835 | 53012 | 경상남도 통영시 광도면 남해안대로 1227
362 | 부산경남 | 고성 | 통영 | 055-711-2836 | 055-717-4836 | 53026 | 경상남도 통영시 용남면 기호로 25-52
363 | 부산경남 | 서울산 | 부산 | 051-793-2882 | 051-796-4882 | 46209 | 부산광역시 금정구 고분로 148
364 | 부산경남 | 서울산 | 노포 | 051-793-2883 | 051-796-4883 | 46204 | 부산광역시 금정구 고분로93번길 51
365 | 부산경남 | 서울산 | 양산 | 055-711-2822 | 055-717-4822 | 50565 | 경상남도 양산시 상북면 와곡2길 12-1
366 | 부산경남 | 서울산 | 통도사 | 052-701-2893 | 052-701-4893 | 44954 | 울산광역시 울주군 삼남면 반구대로 80
367 | 부산경남 | 서울산 | 서울산 | 052-701-2892 | 052-701-4892 | 44951 | 울산광역시 울주군 삼남면 반구대로 760
368 | 부산경남 | 서울산 | 활천 | 052-701-2890 | 052-701-4890 | 44910 | 울산광역시 울주군 두서면 활천내와로 30-25
369 | 부산경남 | 서울산 | 배내골 | 055-711-2849 | 055-717-4849 | 62683 | 경상남도 양산시 원동면 배내로 915