Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 654.3 KiB
Average record size in memory: 67.0 B
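
The average record size follows from the two figures above it. A minimal arithmetic check, assuming 1 KiB = 1024 bytes and that the average is simply total size divided by the number of observations:

```python
# Figures copied from the "Dataset statistics" block.
total_kib = 654.3          # total size in memory, KiB
n_rows = 10_000            # number of observations

bytes_per_row = total_kib * 1024 / n_rows
print(f"{bytes_per_row:.1f} B")  # 67.0 B
```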

Variable types

Numeric: 3
Categorical: 3
Text: 1

Dataset

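Description: Data held by Uiseong-gun, Gyeongsangbuk-do, providing the serial number (일련번호), postal code (우편번호코드), province (시도), city/county/district (시군구), town/township/neighborhood (읍면동), road name (도로명), main building number (건물번호본번), and related fields.
Author: Uiseong-gun, Gyeongsangbuk-do (경상북도 의성군)
URL: https://www.data.go.kr/data/15123672/fileData.do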

Alerts

시도 has constant value "경상북도" (Constant)
시군구 is highly overall correlated with 일련번호 and 2 other fields (High correlation)
읍면동 is highly overall correlated with 일련번호 and 2 other fields (High correlation)
일련번호 is highly overall correlated with 우편번호코드 and 2 other fields (High correlation)
우편번호코드 is highly overall correlated with 일련번호 and 2 other fields (High correlation)
일련번호 has unique values (Unique)
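
The Constant and Unique alerts both depend only on how many distinct values a column holds. The sketch below shows one way such checks could be written. The function name `basic_alerts` and its rules are illustrative assumptions, not the profiler's own implementation, and the toy columns are tiny stand-ins for the real data.

```python
def basic_alerts(name, values):
    """Return simple cardinality alerts for one column (illustrative rules only)."""
    distinct = len(set(values))
    alerts = []
    if distinct == 1:
        # Every row carries the same value.
        alerts.append(f'{name} has constant value "{values[0]}"')
    if distinct == len(values) and len(values) > 1:
        # No value is repeated anywhere in the column.
        alerts.append(f"{name} has unique values")
    return alerts

sido = ["경상북도"] * 5              # toy stand-in for the 시도 column
serial = [23, 35, 39, 70, 73]        # five smallest 일련번호 values
print(basic_alerts("시도", sido))
print(basic_alerts("일련번호", serial))
```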

Reproduction

Analysis started: 2023-12-12 20:30:32.740821
Analysis finished: 2023-12-12 20:30:34.695277
Duration: 1.95 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
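
The reported duration is the difference between the two timestamps. The sketch below checks that, then outlines how a comparable report might be generated with ydata-profiling. The function `build_report`, both file paths, and the report title are hypothetical. The source does not state how the original CSV was loaded or sampled.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2023-12-12 20:30:32.740821")
finished = datetime.fromisoformat("2023-12-12 20:30:34.695277")
duration = (finished - started).total_seconds()
print(f"{duration:.2f} s")  # 1.95 s


def build_report(csv_path, html_path):
    """Profile a CSV file; both paths are hypothetical placeholders."""
    # Imported lazily so the duration check above runs without these packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    ProfileReport(df, title="Dataset profile").to_file(html_path)
```

A call such as `build_report("addresses.csv", "report.html")` would write an HTML report. The file names here are placeholders.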

Variables

일련번호
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 47482.031
Minimum: 23
Maximum: 95283
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 23
5-th percentile: 4883.85
Q1: 23481.25
median: 47801.5
Q3: 70901
95-th percentile: 90410.4
Maximum: 95283
Range: 95260
Interquartile range (IQR): 47419.75
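
Fractional quantiles such as 4883.85 on an integer-valued column suggest interpolation between neighbouring order statistics. The sketch below implements linear interpolation, which is pandas' default quantile method. That the report used exactly this method is an assumption, and the toy data is hypothetical.

```python
def quantile(values, q):
    """q-th quantile (0 <= q <= 1) with linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)      # fractional index into the sorted data
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])

toy = [1, 2, 3, 4]                    # hypothetical data, not from the dataset
q1, q3 = quantile(toy, 0.25), quantile(toy, 0.75)
print(q1, q3, q3 - q1)                # 1.75 3.25 1.5
```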

Descriptive statistics

Standard deviation: 27406.258
Coefficient of variation (CV): 0.5771922
Kurtosis: -1.1945528
Mean: 47482.031
Median Absolute Deviation (MAD): 23648
Skewness: 0.00018395022
Sum: 4.7482031 × 10^8
Variance: 7.5110297 × 10^8
Monotonicity: Not monotonic
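
Several of the figures above are derived from others, so they can be cross-checked. The sketch below recomputes the range, IQR, coefficient of variation, variance, and sum from the reported minimum, maximum, quartiles, mean, and standard deviation. Small differences are expected because the inputs are rounded.

```python
import math

# Values copied from the 일련번호 statistics.
minimum, maximum = 23, 95283
q1, q3 = 23481.25, 70901
mean, std, n = 47482.031, 27406.258, 10_000

value_range = maximum - minimum   # 95260
iqr = q3 - q1                     # 47419.75
cv = std / mean                   # about 0.5772
variance = std ** 2               # about 7.511e8
total = mean * n                  # about 4.7482e8

print(value_range, iqr, round(cv, 4))
print(math.isclose(variance, 7.5110297e8, rel_tol=1e-6))  # True
```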
Histogram with fixed size bins (bins=50)
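
With bins=50, each bin would cover one fiftieth of the observed range if the bins span minimum to maximum, which is an assumption here. The sketch below shows fixed-width binning on toy data. The helper name `fixed_width_counts` is illustrative, and it expects at least two distinct values.

```python
def fixed_width_counts(values, bins):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins          # assumes hi > lo
    counts = [0] * bins
    for v in values:
        # The maximum falls on the right edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts

print((95283 - 23) / 50)                        # bin width for 일련번호: 1905.2
print(fixed_width_counts(list(range(11)), 5))   # [2, 2, 2, 2, 3]
```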
Value | Count | Frequency (%)
63969 | 1 | < 0.1%
13415 | 1 | < 0.1%
42827 | 1 | < 0.1%
1050 | 1 | < 0.1%
62200 | 1 | < 0.1%
89002 | 1 | < 0.1%
82054 | 1 | < 0.1%
57468 | 1 | < 0.1%
78518 | 1 | < 0.1%
48559 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Value | Count | Frequency (%)
23 | 1 | < 0.1%
35 | 1 | < 0.1%
39 | 1 | < 0.1%
70 | 1 | < 0.1%
73 | 1 | < 0.1%
76 | 1 | < 0.1%
80 | 1 | < 0.1%
89 | 1 | < 0.1%
100 | 1 | < 0.1%
105 | 1 | < 0.1%

Value | Count | Frequency (%)
95283 | 1 | < 0.1%
95280 | 1 | < 0.1%
95276 | 1 | < 0.1%
95270 | 1 | < 0.1%
95265 | 1 | < 0.1%
95254 | 1 | < 0.1%
95238 | 1 | < 0.1%
95229 | 1 | < 0.1%
95221 | 1 | < 0.1%
95219 | 1 | < 0.1%

우편번호코드
Real number (ℝ)

HIGH CORRELATION 

Distinct: 431
Distinct (%): 4.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 38294.696
Minimum: 38000
Maximum: 38699
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 38000
5-th percentile: 38015
Q1: 38125
median: 38198
Q3: 38512
95-th percentile: 38657
Maximum: 38699
Range: 699
Interquartile range (IQR): 387

Descriptive statistics

Standard deviation: 216.04461
Coefficient of variation (CV): 0.0056416328
Kurtosis: -1.3053741
Mean: 38294.696
Median Absolute Deviation (MAD): 166
Skewness: 0.40431944
Sum: 3.8294696 × 10^8
Variance: 46675.275
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
38125 | 181 | 1.8%
38053 | 177 | 1.8%
38121 | 166 | 1.7%
38695 | 134 | 1.3%
38187 | 129 | 1.3%
38180 | 123 | 1.2%
38572 | 120 | 1.2%
38033 | 117 | 1.2%
38221 | 106 | 1.1%
38216 | 103 | 1.0%
Other values (421) | 8644 | 86.4%
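
The "Other values" row is what remains after the ten most frequent postal codes are removed. The sketch below shows one way to build such a table with `collections.Counter` and checks the reported remainder against the listed counts. The helper name `common_values` is illustrative.

```python
from collections import Counter

def common_values(values, k=10):
    """Top-k (value, count) pairs plus the remainder as an 'Other values' row."""
    counts = Counter(values)
    top = counts.most_common(k)
    other_distinct = len(counts) - len(top)
    other_count = len(values) - sum(c for _, c in top)
    return top, (other_distinct, other_count)

# Consistency check using the ten counts listed for 우편번호코드.
top_counts = [181, 177, 166, 134, 129, 123, 120, 117, 106, 103]
print(10_000 - sum(top_counts), 431 - len(top_counts))   # 8644 421
```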

Value | Count | Frequency (%)
38000 | 56 | 0.6%
38001 | 28 | 0.3%
38002 | 86 | 0.9%
38003 | 84 | 0.8%
38004 | 50 | 0.5%
38005 | 28 | 0.3%
38006 | 30 | 0.3%
38007 | 7 | 0.1%
38008 | 51 | 0.5%
38009 | 19 | 0.2%

Value | Count | Frequency (%)
38699 | 7 | 0.1%
38698 | 13 | 0.1%
38697 | 39 | 0.4%
38696 | 47 | 0.5%
38695 | 134 | 1.3%
38694 | 3 | < 0.1%
38693 | 1 | < 0.1%
38692 | 1 | < 0.1%
38691 | 7 | 0.1%
38690 | 7 | 0.1%

시도
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
경상북도 | 10000

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 경상북도
2nd row: 경상북도
3rd row: 경상북도
4th row: 경상북도
5th row: 경상북도

Common Values

Value | Count | Frequency (%)
경상북도 | 10000 | 100.0%

Length

Histogram of lengths of the category


시군구
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
경주시 | 5960
경산시 | 4040

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 경주시
2nd row: 경주시
3rd row: 경산시
4th row: 경주시
5th row: 경주시

Common Values

Value | Count | Frequency (%)
경주시 | 5960 | 59.6%
경산시 | 4040 | 40.4%

Length

Histogram of lengths of the category


읍면동
Categorical

HIGH CORRELATION 

Distinct: 19
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
<NA> | 3065
안강읍 | 766
진량읍 | 503
건천읍 | 486
하양읍 | 478
Other values (14) | 4702

Length

Max length: 5
Median length: 3
Mean length: 3.3369
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: 강동면
3rd row: 압량읍
4th row: 산내면
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 3065 | 30.6%
안강읍 | 766 | 7.7%
진량읍 | 503 | 5.0%
건천읍 | 486 | 4.9%
하양읍 | 478 | 4.8%
외동읍 | 467 | 4.7%
내남면 | 405 | 4.0%
양남면 | 401 | 4.0%
와촌면 | 387 | 3.9%
압량읍 | 386 | 3.9%
Other values (9) | 2656 | 26.6%

Length

Histogram of lengths of the category

도로명
Text

Distinct: 1777
Distinct (%): 17.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 8
Mean length: 4.2791
Min length: 3

Characters and Unicode

Total characters: 42791
Distinct characters: 322
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
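
As an illustration of those character properties, the sketch below uses Python's standard `unicodedata` module to count general categories in one of the road names from this variable's sample rows. `Lo` is "Other Letter" and `Nd` is "Decimal Number", the two categories the report lists. The helper name `category_counts` is illustrative.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count Unicode general categories (e.g. 'Lo', 'Nd') in a string."""
    return Counter(unicodedata.category(ch) for ch in text)

sample = "북성로56번길"             # one of the sampled road names
counts = category_counts(sample)
print(counts)                       # Counter({'Lo': 5, 'Nd': 2})

# Character names reveal the script; Hangul syllables share a common prefix.
print(unicodedata.name(sample[0]).startswith("HANGUL SYLLABLE"))  # True
```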

Unique

Unique: 286
Unique (%): 2.9%
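
"Distinct" and "Unique" measure different things here: 1777 different road names occur, but only 286 of them occur exactly once. The sketch below shows the distinction on a hypothetical toy column. The helper name `distinct_and_unique` is illustrative.

```python
from collections import Counter

def distinct_and_unique(values):
    """Distinct = different values present; unique = values occurring exactly once."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

roads = ["포석로", "강동로", "포석로", "원효로"]   # hypothetical toy column
print(distinct_and_unique(roads))                # (3, 2)
print(f"{286 / 10_000:.1%}")                     # 2.9%
```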

Sample

1st row: 포석로
2nd row: 강동로
3rd row: 당음길8길
4th row: 거산제궁길
5th row: 북성로56번길

Value | Count | Frequency (%)
동해안로 | 94 | 0.9%
원효로 | 84 | 0.8%
단석로 | 79 | 0.8%
문복로 | 71 | 0.7%
산업로 | 70 | 0.7%
내서로 | 69 | 0.7%
감포로 | 63 | 0.6%
대경로 | 60 | 0.6%
내외로 | 58 | 0.6%
안현로 | 51 | 0.5%
Other values (1767) | 9301 | 93.0%

Most occurring characters

ValueCountFrequency (%)
7943
 
18.6%
4176
 
9.8%
1 1459
 
3.4%
2 1148
 
2.7%
788
 
1.8%
735
 
1.7%
724
 
1.7%
3 677
 
1.6%
643
 
1.5%
614
 
1.4%
Other values (312) 23884
55.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 37223
87.0%
Decimal Number 5568
 
13.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
7943
 
21.3%
4176
 
11.2%
788
 
2.1%
735
 
2.0%
724
 
1.9%
643
 
1.7%
614
 
1.6%
563
 
1.5%
501
 
1.3%
490
 
1.3%
Other values (302) 20046
53.9%
Decimal Number
ValueCountFrequency (%)
1 1459
26.2%
2 1148
20.6%
3 677
12.2%
4 417
 
7.5%
5 383
 
6.9%
6 350
 
6.3%
7 306
 
5.5%
8 283
 
5.1%
9 279
 
5.0%
0 266
 
4.8%

Most occurring scripts

ValueCountFrequency (%)
Hangul 37223
87.0%
Common 5568
 
13.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
7943
 
21.3%
4176
 
11.2%
788
 
2.1%
735
 
2.0%
724
 
1.9%
643
 
1.7%
614
 
1.6%
563
 
1.5%
501
 
1.3%
490
 
1.3%
Other values (302) 20046
53.9%
Common
ValueCountFrequency (%)
1 1459
26.2%
2 1148
20.6%
3 677
12.2%
4 417
 
7.5%
5 383
 
6.9%
6 350
 
6.3%
7 306
 
5.5%
8 283
 
5.1%
9 279
 
5.0%
0 266
 
4.8%

Most occurring blocks

ValueCountFrequency (%)
Hangul 37223
87.0%
ASCII 5568
 
13.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
7943
 
21.3%
4176
 
11.2%
788
 
2.1%
735
 
2.0%
724
 
1.9%
643
 
1.7%
614
 
1.6%
563
 
1.5%
501
 
1.3%
490
 
1.3%
Other values (302) 20046
53.9%
ASCII
ValueCountFrequency (%)
1 1459
26.2%
2 1148
20.6%
3 677
12.2%
4 417
 
7.5%
5 383
 
6.9%
6 350
 
6.3%
7 306
 
5.5%
8 283
 
5.1%
9 279
 
5.0%
0 266
 
4.8%

건물번호본번
Real number (ℝ)

Distinct: 1121
Distinct (%): 11.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 200.4807
Minimum: 1
Maximum: 5670
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5
Q1: 17
median: 42
Q3: 130
95-th percentile: 1056
Maximum: 5670
Range: 5669
Interquartile range (IQR): 113

Descriptive statistics

Standard deviation: 493.52567
Coefficient of variation (CV): 2.4617116
Kurtosis: 32.149463
Mean: 200.4807
Median Absolute Deviation (MAD): 32
Skewness: 5.0032233
Sum: 2004807
Variance: 243567.58
Monotonicity: Not monotonic
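
The gap between the mean (200.48) and the median (42), together with skewness near 5, points to a long right tail. In that situation the median and MAD describe a typical value better than the mean and standard deviation. The sketch below computes an unscaled MAD on a toy sample built from the seven quantile values reported above. It is not the real column, so the results only illustrate the effect.

```python
import statistics

def median_abs_deviation(values):
    """Median of absolute deviations from the median (unscaled)."""
    med = statistics.median(values)
    return statistics.median(abs(v - med) for v in values)

# Toy sample built from the reported quantile values (not the real column).
toy = [1, 5, 17, 42, 130, 1056, 5670]
print(statistics.median(toy), median_abs_deviation(toy))  # 42 41
print(round(statistics.mean(toy), 1))                     # 988.7
```

The two largest toy values pull the mean far above the median, while the MAD stays close to the spread of the bulk of the data.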
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
7 | 193 | 1.9%
13 | 188 | 1.9%
8 | 169 | 1.7%
14 | 163 | 1.6%
11 | 162 | 1.6%
12 | 159 | 1.6%
15 | 157 | 1.6%
5 | 157 | 1.6%
17 | 155 | 1.6%
9 | 153 | 1.5%
Other values (1111) | 8344 | 83.4%

Value | Count | Frequency (%)
1 | 76 | 0.8%
2 | 82 | 0.8%
3 | 141 | 1.4%
4 | 144 | 1.4%
5 | 157 | 1.6%
6 | 149 | 1.5%
7 | 193 | 1.9%
8 | 169 | 1.7%
9 | 153 | 1.5%
10 | 152 | 1.5%

Value | Count | Frequency (%)
5670 | 1 | < 0.1%
5537 | 2 | < 0.1%
5523 | 1 | < 0.1%
5522 | 1 | < 0.1%
5459 | 1 | < 0.1%
5435 | 1 | < 0.1%
5397 | 2 | < 0.1%
5299 | 1 | < 0.1%
4875 | 1 | < 0.1%
4846 | 1 | < 0.1%

Interactions

[Figures: nine interaction plots; images not preserved]

Correlations

 | 일련번호 | 우편번호코드 | 시군구 | 읍면동 | 건물번호본번
일련번호 | 1.000 | 0.857 | 1.000 | 0.991 | 0.285
우편번호코드 | 0.857 | 1.000 | 1.000 | 0.997 | 0.170
시군구 | 1.000 | 1.000 | 1.000 | 1.000 | 0.196
읍면동 | 0.991 | 0.997 | 1.000 | 1.000 | 0.356
건물번호본번 | 0.285 | 0.170 | 0.196 | 0.356 | 1.000

 | 시군구 | 읍면동
시군구 | 1.000 | 0.999
읍면동 | 0.999 | 1.000

 | 일련번호 | 우편번호코드 | 건물번호본번 | 시군구 | 읍면동
일련번호 | 1.000 | -0.717 | 0.092 | 0.994 | 0.952
우편번호코드 | -0.717 | 1.000 | -0.068 | 1.000 | 0.910
건물번호본번 | 0.092 | -0.068 | 1.000 | 0.150 | 0.126
시군구 | 0.994 | 1.000 | 0.150 | 1.000 | 0.999
읍면동 | 0.952 | 0.910 | 0.126 | 0.999 | 1.000
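
The near-perfect association between 시군구 and 읍면동 is what one would expect when each town belongs to exactly one city. The sketch below computes Cramér's V, a common association measure for two categorical variables. Which measure the report used for each matrix is not stated in the text, so treating this pair as Cramér's V is an assumption. The sketch also omits any bias correction, and its toy data is hypothetical.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row, col = Counter(xs), Counter(ys)
    chi2 = 0.0
    for r, r_count in row.items():
        for c, c_count in col.items():
            expected = r_count * c_count / n
            chi2 += (joint.get((r, c), 0) - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Hypothetical toy data: each town belongs to exactly one city.
city = ["경주시", "경주시", "경산시", "경산시"]
town = ["안강읍", "건천읍", "진량읍", "하양읍"]
print(cramers_v(city, town))   # 1.0
```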

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
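
The report counts zero missing cells, yet `<NA>` is the most frequent value of 읍면동 (3065 rows). This suggests the marker was stored as ordinary text rather than as a true missing value, though that is an inference from the report. Under that assumption, the sketch below converts placeholder strings to `None` before counting. The placeholder set and the helper name are illustrative.

```python
PLACEHOLDERS = {"<NA>", "na", ""}   # assumed spellings of a missing marker

def normalize_missing(values):
    """Replace placeholder strings with None so they count as missing."""
    return [None if v in PLACEHOLDERS else v for v in values]

towns = ["<NA>", "강동면", "압량읍", "산내면", "<NA>"]   # the five 읍면동 sample rows
cleaned = normalize_missing(towns)
missing = sum(v is None for v in cleaned)
print(missing, f"{missing / len(cleaned):.0%}")          # 2 40%
```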

Sample

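(row index) | 일련번호 | 우편번호코드 | 시도 | 시군구 | 읍면동 | 도로명 | 건물번호본번
63968 | 63969 | 38173 | 경상북도 | 경주시 | <NA> | 포석로 | 742
44887 | 44888 | 38004 | 경상북도 | 경주시 | 강동면 | 강동로 | 8
11007 | 11008 | 38539 | 경상북도 | 경산시 | 압량읍 | 당음길8길 | 36
69729 | 69730 | 38186 | 경상북도 | 경주시 | 산내면 | 거산제궁길 | 287
72314 | 72315 | 38144 | 경상북도 | 경주시 | <NA> | 북성로56번길 | 9
56183 | 56184 | 38155 | 경상북도 | 경주시 | <NA> | 봉황로 | 37
14846 | 14847 | 38680 | 경상북도 | 경산시 | <NA> | 성암로12길 | 38
55608 | 55609 | 38158 | 경상북도 | 경주시 | <NA> | 원효로 | 86
9928 | 9929 | 38540 | 경상북도 | 경산시 | <NA> | 정상지길 | 63
67214 | 67215 | 38184 | 경상북도 | 경주시 | 산내면 | 오봉로 | 324

(row index) | 일련번호 | 우편번호코드 | 시도 | 시군구 | 읍면동 | 도로명 | 건물번호본번
11754 | 11755 | 38509 | 경상북도 | 경산시 | 압량읍 | 압량시장길 | 15
69894 | 69895 | 38053 | 경상북도 | 경주시 | 서면 | 도계서오길 | 230
13197 | 13198 | 38496 | 경상북도 | 경산시 | 압량읍 | 의송길 | 97
11526 | 11527 | 38510 | 경상북도 | 경산시 | 압량읍 | 부적길 | 36
59059 | 59060 | 38110 | 경상북도 | 경주시 | <NA> | 탈해로 | 36
39946 | 39947 | 38123 | 경상북도 | 경주시 | 감포읍 | 노동길 | 31
54633 | 54634 | 38197 | 경상북도 | 경주시 | 내남면 | 용장1길 | 46
94052 | 94053 | 38215 | 경상북도 | 경주시 | 외동읍 | 모화북2길 | 120
93680 | 93681 | 38215 | 경상북도 | 경주시 | 외동읍 | 산업로 | 1768
44571 | 44572 | 38004 | 경상북도 | 경주시 | 강동면 | 동해대로 | 167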