Overview

Dataset statistics

Number of variables: 7
Number of observations: 2802
Missing cells: 1698
Missing cells (%): 8.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 158.8 KiB
Average record size in memory: 58.0 B
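
The percentage and per-record figures above follow from the raw counts. The short check below is a sketch that assumes the missing-cell percentage is taken over all cells (rows × columns) and that 1 KiB is 1024 bytes; the variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" block.
n_vars = 7
n_obs = 2802
missing_cells = 1698
total_kib = 158.8

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_vars * n_obs
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_kib * 1024 / n_obs

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```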

Variable types

Numeric: 1
Text: 3
Categorical: 3

Dataset

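Description: Road name address codes, road names, city, gu/gun (district/county), eup/myeon/dong, and ri records that the Daegu Metropolitan City Waterworks Headquarters (대구광역시 상수도사업본부) manages in linkage with the Ministry of the Interior and Safety's road name address system.
URL: https://www.data.go.kr/data/15116747/fileData.do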

Alerts

시도 has constant value "대구광역시" (Constant)
시코드 has constant value "27" (Constant)
도로명관리번호 is highly overall correlated with 구군 (High correlation)
구군 is highly overall correlated with 도로명관리번호 (High correlation)
(unnamed variable) has 1698 (60.6%) missing values (Missing)

Reproduction

Analysis started: 2023-12-12 20:47:44.821984
Analysis finished: 2023-12-12 20:47:45.461584
Duration: 0.64 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
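
The report does not record the command that produced it. The following is a minimal sketch of how a comparable report could be generated with ydata-profiling. The `build_report` helper, the file paths, and the `encoding` default are assumptions for illustration, and the original config.json may set options this sketch does not.

```python
def build_report(csv_path: str, html_path: str, encoding: str = "utf-8"):
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so this module still loads where the packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the source file is a CSV; Korean public data files are
    # sometimes encoded as cp949, so the encoding is left configurable.
    df = pd.read_csv(csv_path, encoding=encoding)

    profile = ProfileReport(
        df,
        title="Profiling Report",
        # Fills the "Dataset" block of the overview.
        dataset={"url": "https://www.data.go.kr/data/15116747/fileData.do"},
    )
    profile.to_file(html_path)
    return profile
```

A call such as `build_report("roads.csv", "report.html")` would write the report; both file names are hypothetical.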

Variables

도로명관리번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 2082
Distinct (%): 74.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.740242 × 10^11
Minimum: 2.7110201 × 10^11
Maximum: 2.7720486 × 10^11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 24.8 KiB

Quantile statistics

Minimum: 2.7110201 × 10^11
5-th percentile: 2.7110422 × 10^11
Q1: 2.7170423 × 10^11
Median: 2.7260424 × 10^11
Q3: 2.7710424 × 10^11
95-th percentile: 2.7720474 × 10^11
Maximum: 2.7720486 × 10^11
Range: 6.1028489 × 10^9
Interquartile range (IQR): 5.4000152 × 10^9

Descriptive statistics

Standard deviation: 2.5635391 × 10^9
Coefficient of variation (CV): 0.0093551558
Kurtosis: -1.756146
Mean: 2.740242 × 10^11
Median Absolute Deviation (MAD): 1.2010964 × 10^9
Skewness: 0.32463748
Sum: 7.6781582 × 10^14
Variance: 6.5717328 × 10^18
Monotonicity: Increasing
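
Several of these figures are derived from others in the report, so they can be cross-checked. The sketch below recomputes range, IQR, CV, variance, and sum from the rounded values printed above; small differences in the last digits are expected because the inputs are rounded to eight significant figures.

```python
import math

# Values copied from the quantile and descriptive statistics.
mean = 2.740242e11
minimum, maximum = 2.7110201e11, 2.7720486e11
q1, q3 = 2.7170423e11, 2.7710424e11
std = 2.5635391e9
n = 2802  # number of observations

derived = {
    "range": maximum - minimum,   # reported: 6.1028489e9
    "iqr": q3 - q1,               # reported: 5.4000152e9
    "cv": std / mean,             # reported: 0.0093551558
    "variance": std ** 2,         # reported: 6.5717328e18
    "sum": mean * n,              # reported: 7.6781582e14
}

reported = {
    "range": 6.1028489e9,
    "iqr": 5.4000152e9,
    "cv": 0.0093551558,
    "variance": 6.5717328e18,
    "sum": 7.6781582e14,
}

for name, value in derived.items():
    agrees = math.isclose(value, reported[name], rel_tol=1e-5)
    print(f"{name}: {value:.6g} (agrees with report: {agrees})")
```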
Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
277103007026  20  0.7%
271103007001  15  0.5%
277202018002  12  0.4%
271102007001  11  0.4%
271104223003  10  0.4%
272602007002  10  0.4%
277203308077  10  0.4%
277203018066  9  0.3%
277103351381  9  0.3%
272902007002  9  0.3%
Other values (2072)  2687  95.9%

Minimum 10 values

Value  Count  Frequency (%)
271102007001  11  0.4%
271102007002  9  0.3%
271103007001  15  0.5%
271103007017  4  0.1%
271103141001  1  < 0.1%
271103141002  1  < 0.1%
271103141003  2  0.1%
271103141004  5  0.2%
271103141005  3  0.1%
271103141006  8  0.3%

Maximum 10 values

Value  Count  Frequency (%)
277204855924  2  0.1%
277204742306  3  0.1%
277204742302  2  0.1%
277204742301  1  < 0.1%
277204742300  1  < 0.1%
277204742299  1  < 0.1%
277204742298  1  < 0.1%
277204742297  1  < 0.1%
277204742296  1  < 0.1%
277204742295  1  < 0.1%

도로명
Text

Distinct: 2035
Distinct (%): 72.6%
Missing: 0
Missing (%): 0.0%
Memory size: 22.0 KiB

Length

Max length: 10
Median length: 8
Mean length: 5.0820842
Min length: 2

Characters and Unicode

Total characters: 14240
Distinct characters: 273
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
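
As an illustration of the "categories" figure, the sketch below classifies characters with Python's standard `unicodedata` module. Hangul syllables fall in general category `Lo` (Other Letter) and ASCII digits in `Nd` (Decimal Number), which matches the two categories reported for this variable. The four road names are taken from the sample rows; the helper name is an assumption.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Road names that appear in the sample rows of this report.
sample = ["중앙대로", "화전1길", "화전2길", "효령공단길"]
counts = category_counts(sample)
print(dict(counts))
```

Script and block membership are not exposed by `unicodedata`, so this sketch covers categories only.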

Unique

Unique: 1682
Unique (%): 60.0%

Sample

1st row: 중앙대로
2nd row: 중앙대로
3rd row: 중앙대로
4th row: 중앙대로
5th row: 중앙대로

Value  Count  Frequency (%)
달구벌대로  32  1.1%
비슬로  20  0.7%
국채보상로  19  0.7%
중앙대로  15  0.5%
경북대로  12  0.4%
도군로  10  0.4%
태평로  10  0.4%
경상감영길  10  0.4%
산성가음로  9  0.3%
동부로  9  0.3%
Other values (2025)  2656  94.8%

Most occurring characters

Value  Count  Frequency (%)
(char lost)  2067  14.5%
(char lost)  2029  14.2%
1  615  4.3%
2  522  3.7%
3  357  2.5%
(char lost)  347  2.4%
4  299  2.1%
(char lost)  282  2.0%
5  248  1.7%
(char lost)  233  1.6%
Other values (263)  7241  50.8%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  11308  79.4%
Decimal Number  2932  20.6%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
(char lost)  2067  18.3%
(char lost)  2029  17.9%
(char lost)  347  3.1%
(char lost)  282  2.5%
(char lost)  233  2.1%
(char lost)  218  1.9%
(char lost)  183  1.6%
(char lost)  177  1.6%
(char lost)  162  1.4%
(char lost)  136  1.2%
Other values (253)  5474  48.4%

Decimal Number
Value  Count  Frequency (%)
1  615  21.0%
2  522  17.8%
3  357  12.2%
4  299  10.2%
5  248  8.5%
6  230  7.8%
7  200  6.8%
0  162  5.5%
8  159  5.4%
9  140  4.8%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  11308  79.4%
Common  2932  20.6%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
(char lost)  2067  18.3%
(char lost)  2029  17.9%
(char lost)  347  3.1%
(char lost)  282  2.5%
(char lost)  233  2.1%
(char lost)  218  1.9%
(char lost)  183  1.6%
(char lost)  177  1.6%
(char lost)  162  1.4%
(char lost)  136  1.2%
Other values (253)  5474  48.4%

Common
Value  Count  Frequency (%)
1  615  21.0%
2  522  17.8%
3  357  12.2%
4  299  10.2%
5  248  8.5%
6  230  7.8%
7  200  6.8%
0  162  5.5%
8  159  5.4%
9  140  4.8%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  11308  79.4%
ASCII  2932  20.6%

Most frequent character per block

Hangul
Value  Count  Frequency (%)
(char lost)  2067  18.3%
(char lost)  2029  17.9%
(char lost)  347  3.1%
(char lost)  282  2.5%
(char lost)  233  2.1%
(char lost)  218  1.9%
(char lost)  183  1.6%
(char lost)  177  1.6%
(char lost)  162  1.4%
(char lost)  136  1.2%
Other values (253)  5474  48.4%

ASCII
Value  Count  Frequency (%)
1  615  21.0%
2  522  17.8%
3  357  12.2%
4  299  10.2%
5  248  8.5%
6  230  7.8%
7  200  6.8%
0  162  5.5%
8  159  5.4%
9  140  4.8%

시도
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.0 KiB

Value  Count
대구광역시  2802

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 대구광역시
2nd row: 대구광역시
3rd row: 대구광역시
4th row: 대구광역시
5th row: 대구광역시

Common Values

Value  Count  Frequency (%)
대구광역시  2802  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
대구광역시  2802  100.0%

시코드
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.0 KiB

Value  Count
27  2802

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 27
2nd row: 27
3rd row: 27
4th row: 27
5th row: 27

Common Values

Value  Count  Frequency (%)
27  2802  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
27  2802  100.0%

구군
Categorical

HIGH CORRELATION 

Distinct: 9
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 22.0 KiB

Value  Count
달성군  660
군위군  444
동구  344
북구  305
중구  265
Other values (4)  784

Length

Max length: 3
Median length: 3
Mean length: 2.5710207
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 중구
2nd row: 중구
3rd row: 중구
4th row: 중구
5th row: 중구

Common Values

Value  Count  Frequency (%)
달성군  660  23.6%
군위군  444  15.8%
동구  344  12.3%
북구  305  10.9%
중구  265  9.5%
달서구  256  9.1%
수성구  240  8.6%
남구  171  6.1%
서구  117  4.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
달성군  660  23.6%
군위군  444  15.8%
동구  344  12.3%
북구  305  10.9%
중구  265  9.5%
달서구  256  9.1%
수성구  240  8.6%
남구  171  6.1%
서구  117  4.2%

읍면동
Text

Distinct: 210
Distinct (%): 7.5%
Missing: 0
Missing (%): 0.0%
Memory size: 22.0 KiB

Length

Max length: 5
Median length: 3
Mean length: 3.1448965
Min length: 2

Characters and Unicode

Total characters: 8812
Distinct characters: 135
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 13
Unique (%): 0.5%

Sample

1st row: 남산동
2nd row: 남일동
3rd row: 덕산동
4th row: 동성로3가
5th row: 북성로1가

Value  Count  Frequency (%)
대명동  110  3.9%
군위읍  93  3.3%
현풍읍  92  3.3%
옥포읍  91  3.2%
효령면  77  2.7%
논공읍  74  2.6%
다사읍  65  2.3%
유가읍  65  2.3%
소보면  64  2.3%
구지면  56  2.0%
Other values (200)  2015  71.9%

Most occurring characters

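Value  Count  Frequency (%)
(char lost)  1709  19.4%
(char lost)  573  6.5%
(char lost)  550  6.2%
(char lost)  337  3.8%
(char lost)  269  3.1%
(char lost)  208  2.4%
(char lost)  188  2.1%
(char lost)  173  2.0%
(char lost)  131  1.5%
(char lost)  128  1.5%
Other values (125)  4546  51.6%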

Most occurring categories

Value  Count  Frequency (%)
Other Letter  8604  97.6%
Decimal Number  208  2.4%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
(char lost)  1709  19.9%
(char lost)  573  6.7%
(char lost)  550  6.4%
(char lost)  337  3.9%
(char lost)  269  3.1%
(char lost)  208  2.4%
(char lost)  188  2.2%
(char lost)  173  2.0%
(char lost)  131  1.5%
(char lost)  128  1.5%
Other values (121)  4338  50.4%

Decimal Number
Value  Count  Frequency (%)
2  72  34.6%
1  67  32.2%
3  51  24.5%
4  18  8.7%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  8604  97.6%
Common  208  2.4%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
(char lost)  1709  19.9%
(char lost)  573  6.7%
(char lost)  550  6.4%
(char lost)  337  3.9%
(char lost)  269  3.1%
(char lost)  208  2.4%
(char lost)  188  2.2%
(char lost)  173  2.0%
(char lost)  131  1.5%
(char lost)  128  1.5%
Other values (121)  4338  50.4%

Common
Value  Count  Frequency (%)
2  72  34.6%
1  67  32.2%
3  51  24.5%
4  18  8.7%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  8604  97.6%
ASCII  208  2.4%

Most frequent character per block

Hangul
Value  Count  Frequency (%)
(char lost)  1709  19.9%
(char lost)  573  6.7%
(char lost)  550  6.4%
(char lost)  337  3.9%
(char lost)  269  3.1%
(char lost)  208  2.4%
(char lost)  188  2.2%
(char lost)  173  2.0%
(char lost)  131  1.5%
(char lost)  128  1.5%
Other values (121)  4338  50.4%

ASCII
Value  Count  Frequency (%)
2  72  34.6%
1  67  32.2%
3  51  24.5%
4  18  8.7%


Text

MISSING 

Distinct: 175
Distinct (%): 15.9%
Missing: 1698
Missing (%): 60.6%
Memory size: 22.0 KiB

Length

Max length: 3
Median length: 3
Mean length: 2.8432971
Min length: 2

Characters and Unicode

Total characters: 3139
Distinct characters: 129
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): 0.5%
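
For this variable the percentages do not all share a denominator. The figures are consistent with the missing percentage being taken over all 2802 rows, and the distinct percentage, unique percentage, and mean length being taken over the non-missing rows only. The check below is a sketch of that reading, not a statement of how the tool computes them internally.

```python
# Figures copied from this variable's summary.
n_obs = 2802
n_missing = 1698
n_distinct = 175
n_unique = 5
total_chars = 3139

n_present = n_obs - n_missing                 # rows with a value

missing_pct = 100 * n_missing / n_obs         # over all rows
distinct_pct = 100 * n_distinct / n_present   # over non-missing rows
unique_pct = 100 * n_unique / n_present       # over non-missing rows
mean_length = total_chars / n_present         # characters per present value

print(n_present, round(missing_pct, 1), round(distinct_pct, 1),
      round(unique_pct, 1), round(mean_length, 7))
```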

Sample

1st row: 성하리
2nd row: 삼리리
3rd row: 매곡리
4th row: 문양리
5th row: 죽곡리

Value  Count  Frequency (%)
본리리  36  3.3%
금포리  24  2.2%
상리  23  2.1%
원교리  21  1.9%
교항리  21  1.9%
강림리  18  1.6%
하리  18  1.6%
부리  17  1.5%
성하리  17  1.5%
서부리  16  1.4%
Other values (165)  893  80.9%

Most occurring characters

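Value  Count  Frequency (%)
(char lost)  1160  37.0%
(char lost)  103  3.3%
(char lost)  82  2.6%
(char lost)  59  1.9%
(char lost)  57  1.8%
(char lost)  49  1.6%
(char lost)  49  1.6%
(char lost)  48  1.5%
(char lost)  48  1.5%
(char lost)  47  1.5%
Other values (119)  1437  45.8%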

Most occurring categories

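Value  Count  Frequency (%)
Other Letter  3139  100.0%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
(char lost)  1160  37.0%
(char lost)  103  3.3%
(char lost)  82  2.6%
(char lost)  59  1.9%
(char lost)  57  1.8%
(char lost)  49  1.6%
(char lost)  49  1.6%
(char lost)  48  1.5%
(char lost)  48  1.5%
(char lost)  47  1.5%
Other values (119)  1437  45.8%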

Most occurring scripts

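Value  Count  Frequency (%)
Hangul  3139  100.0%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
(char lost)  1160  37.0%
(char lost)  103  3.3%
(char lost)  82  2.6%
(char lost)  59  1.9%
(char lost)  57  1.8%
(char lost)  49  1.6%
(char lost)  49  1.6%
(char lost)  48  1.5%
(char lost)  48  1.5%
(char lost)  47  1.5%
Other values (119)  1437  45.8%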

Most occurring blocks

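Value  Count  Frequency (%)
Hangul  3139  100.0%

Most frequent character per block

Hangul
Value  Count  Frequency (%)
(char lost)  1160  37.0%
(char lost)  103  3.3%
(char lost)  82  2.6%
(char lost)  59  1.9%
(char lost)  57  1.8%
(char lost)  49  1.6%
(char lost)  49  1.6%
(char lost)  48  1.5%
(char lost)  48  1.5%
(char lost)  47  1.5%
Other values (119)  1437  45.8%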

Interactions

[Figure: interactions plot, image not preserved]

Correlations

도로명관리번호  구군
도로명관리번호  1.000  0.988
구군  0.988  1.000

도로명관리번호  구군
도로명관리번호  1.000  0.994
구군  0.994  1.000
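
One plausible reading of this near-perfect correlation is that 도로명관리번호 is an identifier whose leading digits encode the district, so its numeric magnitude tracks 구군 by construction. The sketch below assumes, without confirmation from the report, that the first five digits are a district code; the rows are taken from the sample rows of this report and the `prefix` helper is hypothetical.

```python
from collections import defaultdict

# (도로명관리번호, 구군) pairs from the first and last sample rows.
rows = [
    (271102007001, "중구"),
    (271102007001, "중구"),
    (277204742299, "군위군"),
    (277204742306, "군위군"),
    (277204855924, "군위군"),
]


def prefix(road_id: int, digits: int = 5) -> str:
    """Return the leading digits of the ID (assumed to be a district code)."""
    return str(road_id)[:digits]


districts_by_prefix = defaultdict(set)
for road_id, district in rows:
    districts_by_prefix[prefix(road_id)].add(district)

for code, districts in sorted(districts_by_prefix.items()):
    print(code, sorted(districts))
```

If that reading holds, the column may be better profiled as a string identifier than as a real number.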

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
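
The per-column nullity that these plots summarise can be tallied directly. The sketch below counts missing entries per column for four of the sample rows in this report; `None` stands in for `<NA>`, and `ri` is a hypothetical label for the seventh column, whose name is not shown in the report.

```python
# Column labels; "ri" is a placeholder for the unnamed seventh column.
columns = ["도로명관리번호", "도로명", "시도", "시코드", "구군", "읍면동", "ri"]

# Rows 0, 1, 2800 and 2801 from the sample tables.
rows = [
    (271102007001, "중앙대로", "대구광역시", "27", "중구", "남산동", None),
    (271102007001, "중앙대로", "대구광역시", "27", "중구", "남일동", None),
    (277204855924, "도담마을길", "대구광역시", "27", "군위군", "군위읍", "내량리"),
    (277204855924, "도담마을길", "대구광역시", "27", "군위군", "군위읍", "대북리"),
]

missing = {name: 0 for name in columns}
for row in rows:
    for name, value in zip(columns, row):
        if value is None:
            missing[name] += 1

for name, count in missing.items():
    print(f"{name}: {count} missing of {len(rows)}")
```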

Sample

First rows

(index)  도로명관리번호  도로명  시도  시코드  구군  읍면동  (unnamed)
0  271102007001  중앙대로  대구광역시  27  중구  남산동  <NA>
1  271102007001  중앙대로  대구광역시  27  중구  남일동  <NA>
2  271102007001  중앙대로  대구광역시  27  중구  덕산동  <NA>
3  271102007001  중앙대로  대구광역시  27  중구  동성로3가  <NA>
4  271102007001  중앙대로  대구광역시  27  중구  북성로1가  <NA>
5  271102007001  중앙대로  대구광역시  27  중구  사일동  <NA>
6  271102007001  중앙대로  대구광역시  27  중구  전동  <NA>
7  271102007001  중앙대로  대구광역시  27  중구  종로2가  <NA>
8  271102007001  중앙대로  대구광역시  27  중구  포정동  <NA>
9  271102007001  중앙대로  대구광역시  27  중구  향촌동  <NA>

Last rows

(index)  도로명관리번호  도로명  시도  시코드  구군  읍면동  (unnamed)
2792  277204742299  화수길  대구광역시  27  군위군  삼국유사면  화수리
2793  277204742300  화전1길  대구광역시  27  군위군  산성면  화전리
2794  277204742301  화전2길  대구광역시  27  군위군  산성면  화전리
2795  277204742302  효령공단길  대구광역시  27  군위군  효령면  거매리
2796  277204742302  효령공단길  대구광역시  27  군위군  효령면  중구리
2797  277204742306  내외량길  대구광역시  27  군위군  군위읍  내량리
2798  277204742306  내외량길  대구광역시  27  군위군  군위읍  삽령리
2799  277204742306  내외량길  대구광역시  27  군위군  군위읍  외량리
2800  277204855924  도담마을길  대구광역시  27  군위군  군위읍  내량리
2801  277204855924  도담마을길  대구광역시  27  군위군  군위읍  대북리