Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 11
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 634.8 KiB
Average record size in memory: 65.0 B
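
The percentage and per-record figures follow from the raw counts. A minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (rows × columns) and that 1 KiB is 1024 bytes; the variable names are illustrative only:

```python
# Figures copied from the dataset statistics.
n_variables = 7
n_observations = 10_000
missing_cells = 11
total_size_kib = 634.8

# Share of missing cells over all cells (assumed definition).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average bytes per record, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"missing cells: {missing_pct:.4f}%")   # well below 0.1%
print(f"average record size: {avg_record_bytes:.1f} B")
```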

Variable types

Numeric: 1
Categorical: 4
Text: 2

Dataset

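Description: Road-name address (building number) data for Boryeong-si, Chungcheongnam-do. The fields are the city/county/district name (시군구명), the town/township/neighborhood name (읍면동명), the road-name address (도로명주소), the building use (건축물용도), the form (형태), and the building sign use (용도).
Author: Boryeong-si, Chungcheongnam-do (충청남도 보령시)
URL: https://www.data.go.kr/data/15041819/fileData.do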

Alerts

시군구명 has constant value "보령시" (Constant)
용도 is highly overall correlated with 형태 (High correlation)
형태 is highly overall correlated with 용도 (High correlation)
형태 is highly imbalanced (98.0%) (Imbalance)
용도 is highly imbalanced (97.2%) (Imbalance)
순번 has unique values (Unique)
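
The two imbalance percentages can be reproduced from the value counts reported in the 형태 and 용도 variable sections. The sketch below assumes the score is one minus the Shannon entropy divided by its maximum (log2 of the number of classes). That definition is an assumption, not something stated in this report, but it yields 98.0% and 97.2% for those counts. The function name `imbalance_score` is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for class counts."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# Value counts taken from the 형태 and 용도 variable sections of this report.
hyeongtae_counts = [9981, 19]
yongdo_counts = [9932, 47, 19, 1, 1]

print(f"형태: {imbalance_score(hyeongtae_counts):.1%}")
print(f"용도: {imbalance_score(yongdo_counts):.1%}")
```

A score near 100% means almost all rows share one value, which is the case for both columns.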

Reproduction

Analysis started: 2024-01-14 13:26:24.463415
Analysis finished: 2024-01-14 13:26:25.804995
Duration: 1.34 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
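
A report of this kind can be regenerated with the ydata-profiling package. The following is a minimal sketch, not the exact invocation used here: the file name, the `cp949` encoding, and the report title are assumptions, and the helper `build_report` is hypothetical.

```python
def build_report(csv_path, html_path="report.html"):
    """Profile a CSV file and write an HTML report (sketch)."""
    # Imported lazily so the function can be defined without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; adjust it to match the downloaded file.
    df = pd.read_csv(csv_path, encoding="cp949")
    profile = ProfileReport(df, title="Boryeong road-name addresses")
    profile.to_file(html_path)
    return html_path
```

Calling `build_report("boryeong_addresses.csv")` (a hypothetical file name) would write `report.html` next to the script.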

Variables

순번
Real number (ℝ)

UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17108.416
Minimum: 2
Maximum: 34337
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2
5-th percentile: 1714.9
Q1: 8495.5
median: 17119.5
Q3: 25686.25
95-th percentile: 32563.2
Maximum: 34337
Range: 34335
Interquartile range (IQR): 17190.75

Descriptive statistics

Standard deviation: 9876.6448
Coefficient of variation (CV): 0.57729745
Kurtosis: -1.1970578
Mean: 17108.416
Median Absolute Deviation (MAD): 8595
Skewness: -0.0034987555
Sum: 1.7108416 × 10^8
Variance: 97548112
Monotonicity: Not monotonic
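
Several of these statistics are derived from one another. The sketch below recomputes the range, IQR, coefficient of variation, variance, and sum from the reported values, using the standard definitions (CV = standard deviation / mean, variance = standard deviation squared, sum = mean × count). Small differences from the report come from the rounding of the inputs.

```python
# Values copied from the 순번 statistics.
n = 10_000
mean = 17108.416
std = 9876.6448
minimum, maximum = 2, 34337
q1, q3 = 8495.5, 25686.25

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance
total = mean * n                  # Sum

print(value_range, iqr)
print(round(cv, 6), round(variance), round(total))
```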
Histogram with fixed size bins (bins=50)
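
With fixed-size bins, each value is assigned to a bin by its offset from the minimum. The sketch below assumes 50 equal-width bins spanning the minimum to the maximum of 순번, with the maximum placed in the last bin. The plotting code is not part of this report, and `bin_index` is an illustrative helper.

```python
def bin_index(value, lower, upper, bins=50):
    """Return the zero-based index of the equal-width bin containing value."""
    if not lower <= value <= upper:
        raise ValueError("value outside the histogram range")
    width = (upper - lower) / bins
    # The maximum falls into the last bin rather than opening a new one.
    return min(int((value - lower) / width), bins - 1)

lower, upper = 2, 34337   # minimum and maximum of 순번
width = (upper - lower) / 50
print(width)
print(bin_index(2, lower, upper),
      bin_index(17119.5, lower, upper),
      bin_index(34337, lower, upper))
```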

Common values

Value                Count  Frequency (%)
88                   1      < 0.1%
13239                1      < 0.1%
23938                1      < 0.1%
14493                1      < 0.1%
4039                 1      < 0.1%
21167                1      < 0.1%
11143                1      < 0.1%
4001                 1      < 0.1%
7469                 1      < 0.1%
27115                1      < 0.1%
Other values (9990)  9990   99.9%

Minimum 10 values

Value  Count  Frequency (%)
2      1      < 0.1%
15     1      < 0.1%
17     1      < 0.1%
19     1      < 0.1%
20     1      < 0.1%
21     1      < 0.1%
23     1      < 0.1%
24     1      < 0.1%
25     1      < 0.1%
26     1      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
34337  1      < 0.1%
34336  1      < 0.1%
34335  1      < 0.1%
34327  1      < 0.1%
34325  1      < 0.1%
34324  1      < 0.1%
34323  1      < 0.1%
34321  1      < 0.1%
34320  1      < 0.1%
34318  1      < 0.1%

시군구명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

보령시: 10000

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 보령시
2nd row: 보령시
3rd row: 보령시
4th row: 보령시
5th row: 보령시

Common Values

Value   Count  Frequency (%)
보령시  10000  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
보령시  10000  100.0%

읍면동명
Categorical

Distinct: 21
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

웅천읍: 1050
대천동: 966
남포면: 914
오천면: 838
천북면: 751
Other values (16): 5481

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 청라면
2nd row: 신흑동
3rd row: 청소면
4th row: 오천면
5th row: 궁촌동

Common Values

Value              Count  Frequency (%)
웅천읍             1050   10.5%
대천동             966    9.7%
남포면             914    9.1%
오천면             838    8.4%
천북면             751    7.5%
주교면             743    7.4%
청라면             698    7.0%
주산면             551    5.5%
동대동             542    5.4%
청소면             510    5.1%
Other values (11)  2437   24.4%

Length

Histogram of lengths of the category

도로명주소
Text

Distinct: 727
Distinct (%): 7.3%
Missing: 11
Missing (%): 0.1%
Memory size: 156.2 KiB

Length

Max length: 7
Median length: 6
Mean length: 3.9982981
Min length: 2

Characters and Unicode

Total characters: 39939
Distinct characters: 309
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
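
As an illustration of these character properties, the sketch below uses Python's standard `unicodedata` module to count the general category of every character in the five sample road names listed in this section. Hangul syllables fall under "Lo" (Other Letter) and ASCII digits under "Nd" (Decimal Number). Script and block lookups are not available in the standard library, so only categories are shown.

```python
import unicodedata
from collections import Counter

# The five sample road names listed in this section.
samples = ["가느실길", "해수욕장3길", "밧느락길", "원산도5길", "진등1길"]

# General category of every character, e.g. "Lo" (Other Letter) or "Nd" (Decimal Number).
categories = Counter(
    unicodedata.category(ch) for name in samples for ch in name
)

print(categories)
```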

Unique

Unique: 19
Unique (%): 0.2%

Sample

1st row: 가느실길
2nd row: 해수욕장3길
3rd row: 밧느락길
4th row: 원산도5길
5th row: 진등1길

Value               Count  Frequency (%)
충서로              193    1.9%
홍보로              124    1.2%
토정로              120    1.2%
대해로              87     0.9%
중앙로              82     0.8%
만수로              81     0.8%
성주산로            78     0.8%
충청수영로          77     0.8%
대청로              77     0.8%
봉덕삼현길          71     0.7%
Other values (717)  8999   90.1%

Most occurring characters

ValueCountFrequency (%)
7550
 
18.9%
2487
 
6.2%
1 1082
 
2.7%
2 928
 
2.3%
853
 
2.1%
831
 
2.1%
673
 
1.7%
625
 
1.6%
583
 
1.5%
536
 
1.3%
Other values (299) 23791
59.6%

Most occurring categories

Value           Count  Frequency (%)
Other Letter    36918  92.4%
Decimal Number  3021   7.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
7550
 
20.5%
2487
 
6.7%
853
 
2.3%
831
 
2.3%
673
 
1.8%
625
 
1.7%
583
 
1.6%
536
 
1.5%
462
 
1.3%
461
 
1.2%
Other values (289) 21857
59.2%
Decimal Number
ValueCountFrequency (%)
1 1082
35.8%
2 928
30.7%
3 428
 
14.2%
4 218
 
7.2%
5 107
 
3.5%
6 82
 
2.7%
7 71
 
2.4%
9 36
 
1.2%
8 36
 
1.2%
0 33
 
1.1%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  36918  92.4%
Common  3021   7.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
7550
 
20.5%
2487
 
6.7%
853
 
2.3%
831
 
2.3%
673
 
1.8%
625
 
1.7%
583
 
1.6%
536
 
1.5%
462
 
1.3%
461
 
1.2%
Other values (289) 21857
59.2%
Common
ValueCountFrequency (%)
1 1082
35.8%
2 928
30.7%
3 428
 
14.2%
4 218
 
7.2%
5 107
 
3.5%
6 82
 
2.7%
7 71
 
2.4%
9 36
 
1.2%
8 36
 
1.2%
0 33
 
1.1%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  36918  92.4%
ASCII   3021   7.6%

Most frequent character per block

Hangul
ValueCountFrequency (%)
7550
 
20.5%
2487
 
6.7%
853
 
2.3%
831
 
2.3%
673
 
1.8%
625
 
1.7%
583
 
1.6%
536
 
1.5%
462
 
1.3%
461
 
1.2%
Other values (289) 21857
59.2%
ASCII
ValueCountFrequency (%)
1 1082
35.8%
2 928
30.7%
3 428
 
14.2%
4 218
 
7.2%
5 107
 
3.5%
6 82
 
2.7%
7 71
 
2.4%
9 36
 
1.2%
8 36
 
1.2%
0 33
 
1.1%

건축물용도
Text

Distinct: 154
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 16
Median length: 4
Mean length: 4.1988
Min length: 2

Characters and Unicode

Total characters: 41988
Distinct characters: 195
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 41
Unique (%): 0.4%

Sample

1st row: 기타창고시설
2nd row: 노래연습장
3rd row: 단독주택
4th row: 소매점
5th row: 단독주택

Value               Count  Frequency (%)
단독주택            7717   77.2%
일반음식점          194    1.9%
창고                124    1.2%
기타판매및영업시설  121    1.2%
기타창고시설        116    1.2%
사무소              116    1.2%
소매점              115    1.1%
일반공장            101    1.0%
축사                98     1.0%
기타관광숙박시설    83     0.8%
Other values (146)  1217   12.2%

Most occurring characters

ValueCountFrequency (%)
7844
18.7%
7821
18.6%
7723
18.4%
7717
18.4%
682
 
1.6%
674
 
1.6%
662
 
1.6%
660
 
1.6%
465
 
1.1%
411
 
1.0%
Other values (185) 7329
17.5%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       41870  99.7%
Decimal Number     53     0.1%
Open Punctuation   28     0.1%
Close Punctuation  28     0.1%
Other Punctuation  7      < 0.1%
Space Separator    2      < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
7844
18.7%
7821
18.7%
7723
18.4%
7717
18.4%
682
 
1.6%
674
 
1.6%
662
 
1.6%
660
 
1.6%
465
 
1.1%
411
 
1.0%
Other values (179) 7211
17.2%
Decimal Number
ValueCountFrequency (%)
1 28
52.8%
2 25
47.2%
Open Punctuation
ValueCountFrequency (%)
( 28
100.0%
Close Punctuation
ValueCountFrequency (%)
) 28
100.0%
Other Punctuation
ValueCountFrequency (%)
. 7
100.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  41870  99.7%
Common  118    0.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
7844
18.7%
7821
18.7%
7723
18.4%
7717
18.4%
682
 
1.6%
674
 
1.6%
662
 
1.6%
660
 
1.6%
465
 
1.1%
411
 
1.0%
Other values (179) 7211
17.2%
Common
ValueCountFrequency (%)
1 28
23.7%
( 28
23.7%
) 28
23.7%
2 25
21.2%
. 7
 
5.9%
2
 
1.7%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  41870  99.7%
ASCII   118    0.3%

Most frequent character per block

Hangul
ValueCountFrequency (%)
7844
18.7%
7821
18.7%
7723
18.4%
7717
18.4%
682
 
1.6%
674
 
1.6%
662
 
1.6%
660
 
1.6%
465
 
1.1%
411
 
1.0%
Other values (179) 7211
17.2%
ASCII
ValueCountFrequency (%)
1 28
23.7%
( 28
23.7%
) 28
23.7%
2 25
21.2%
. 7
 
5.9%
2
 
1.7%

형태
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

표준형: 9981
비표준형: 19

Length

Max length: 4
Median length: 3
Mean length: 3.0019
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 표준형
2nd row: 표준형
3rd row: 표준형
4th row: 표준형
5th row: 표준형

Common Values

Value     Count  Frequency (%)
표준형    9981   99.8%
비표준형  19     0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value     Count  Frequency (%)
표준형    9981   99.8%
비표준형  19     0.2%

용도
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

일반용(오각형): 9932
관공서용: 47
자율형: 19
일반용(4각형): 1
문화재 관광용: 1

Length

Max length: 8
Median length: 8
Mean length: 7.9716
Min length: 3

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 일반용(오각형)
2nd row: 일반용(오각형)
3rd row: 일반용(오각형)
4th row: 일반용(오각형)
5th row: 일반용(오각형)

Common Values

Value           Count  Frequency (%)
일반용(오각형)  9932   99.3%
관공서용        47     0.5%
자율형          19     0.2%
일반용(4각형)   1      < 0.1%
문화재 관광용   1      < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value          Count  Frequency (%)
일반용(오각형  9932   99.3%
관공서용       47     0.5%
자율형         19     0.2%
일반용(4각형   1      < 0.1%
문화재         1      < 0.1%
관광용         1      < 0.1%

Interactions


Correlations

          순번   읍면동명  형태   용도
순번      1.000  0.579     0.019  0.048
읍면동명  0.579  1.000     0.027  0.044
형태      0.019  0.027     1.000  1.000
용도      0.048  0.044     1.000  1.000

          용도   읍면동명  형태
용도      1.000  0.021     1.000
읍면동명  0.021  1.000     0.024
형태      1.000  0.024     1.000

          순번   읍면동명  형태   용도
순번      1.000  0.255     0.015  0.020
읍면동명  0.255  1.000     0.024  0.021
형태      0.015  0.024     1.000  1.000
용도      0.020  0.021     1.000  1.000
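
The coefficient of 1.000 between 형태 and 용도 indicates that one column fully determines the other. The report does not state which coefficient each matrix uses, and it does not include the joint counts. The sketch below therefore uses Cramér's V (without bias correction) as one common choice for two categorical variables, applied to a hypothetical contingency table whose marginal totals match the reported value counts. The assignment of all 19 자율형 rows to 비표준형 is an assumption made for illustration.

```python
import math

def cramers_v(table):
    """Cramér's V (no bias correction) for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))

# HYPOTHETICAL joint counts: rows are 형태 (표준형, 비표준형); columns are 용도
# (일반용(오각형), 관공서용, 자율형, 일반용(4각형), 문화재 관광용).
# Only the row and column totals come from the report.
table = [
    [9932, 47, 0, 1, 1],
    [0, 0, 19, 0, 0],
]
print(round(cramers_v(table), 3))
```

Under that assumed table every 용도 value maps to exactly one 형태 value, so the statistic reaches its maximum.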

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

       순번   시군구명  읍면동명  도로명주소    건축물용도        형태    용도
87     88     보령시    청라면    가느실길      기타창고시설      표준형  일반용(오각형)
31798  31799  보령시    신흑동    해수욕장3길   노래연습장        표준형  일반용(오각형)
10277  10278  보령시    청소면    밧느락길      단독주택          표준형  일반용(오각형)
22219  22220  보령시    오천면    원산도5길     소매점            표준형  일반용(오각형)
26768  26769  보령시    궁촌동    진등1길       단독주택          표준형  일반용(오각형)
7406   7407   보령시    미산면    도흥길        단독주택          표준형  일반용(오각형)
2329   2330   보령시    대천동    구시5길       단독주택          표준형  일반용(오각형)
24931  24932  보령시    웅천읍    정굴길        단독주택          표준형  일반용(오각형)
30057  30058  보령시    신흑동    통나무마을1길 단독주택          표준형  일반용(오각형)
22249  22250  보령시    오천면    원산도6길     단독주택          표준형  일반용(오각형)

Last rows

       순번   시군구명  읍면동명  도로명주소    건축물용도        형태    용도
31800  31801  보령시    신흑동    해수욕장3길   공중화장실        표준형  일반용(오각형)
23397  23398  보령시    남포면    읍내냇둑길    단독주택          표준형  일반용(오각형)
10245  10246  보령시    주포면    밖강술길      단독주택          표준형  일반용(오각형)
3824   3825   보령시    남포면    남포역전길    단독주택          표준형  일반용(오각형)
31743  31744  보령시    신흑동    해수욕장12길  기타관광숙박시설  표준형  일반용(오각형)
13081  13082  보령시    천북면    사호장은로    단독주택          표준형  일반용(오각형)
29027  29028  보령시    오천면    충청수영로    대규모소매점      표준형  일반용(오각형)
24874  24875  보령시    내항동    절길          단독주택          표준형  일반용(오각형)
19575  19576  보령시    청라면    오서산길      단독주택          표준형  일반용(오각형)
9418   9419   보령시    성주면    먹방길        단독주택          표준형  일반용(오각형)