Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 9
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 634.8 KiB
Average record size in memory: 65.0 B
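
The derived figures in this block follow from the raw counts. The sketch below is a quick consistency check, not part of the report itself. It assumes that the missing-cell percentage is taken over all cells (rows times columns) and that 1 KiB is 1024 bytes; the variable names are illustrative.

```python
# Consistency check of the dataset statistics (values copied from the report).
# Assumptions: missing % = missing cells / (rows * columns); 1 KiB = 1024 bytes.
n_variables = 7
n_observations = 10_000
missing_cells = 9
total_size_kib = 634.8

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.4f}%")               # well below 0.1%
print(f"average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the 9 missing cells are about 0.013% of 70000 cells, which the report rounds to "< 0.1%".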

Variable types

Numeric: 1
Categorical: 4
Text: 2

Dataset

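Description: Road-name address (building number) data for Boryeong-si, Chungcheongnam-do. The columns are 시군구명 (si/gun/gu name), 읍면동명 (eup/myeon/dong name), 도로명주소 (road-name address), 건축물용도 (building use), 형태 (form), and 용도 (use of the building number plate).
Author: 충청남도 (Chungcheongnam-do)
URL: https://alldam.chungnam.go.kr/index.chungnam?menuCd=DOM_000000201001001001&st=&cds=&orgCd=&apiType=&isOpen=Y&pageIndex=400&beforeMenuCd=DOM_000000201001001000&publicdatapk=15041819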

Alerts

시군구명 has constant value "보령시" (Constant)
형태 is highly overall correlated with 용도 (High correlation)
용도 is highly overall correlated with 형태 (High correlation)
형태 is highly imbalanced (98.1%) (Imbalance)
용도 is highly imbalanced (97.1%) (Imbalance)
순번 has unique values (Unique)
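
The two imbalance percentages can be reproduced from the category counts listed later in this report. The sketch below assumes the score is one minus the Shannon entropy of the class frequencies divided by its maximum, log2(k); that definition is an assumption here, although it matches both reported figures. The function name `imbalance_score` is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (bits) of the class frequencies."""
    k = len(counts)
    if k <= 1:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

form_counts = [9982, 18]         # 형태: 표준형, 비표준형
use_counts = [9940, 41, 18, 1]   # 용도: four categories, counts from the report

print(f"형태: {imbalance_score(form_counts):.1%}")  # 98.1%
print(f"용도: {imbalance_score(use_counts):.1%}")   # 97.1%
```

A score near 100% means almost all rows fall into a single category, which is the case for both columns.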

Reproduction

Analysis started: 2024-01-14 06:39:40.178409
Analysis finished: 2024-01-14 06:39:41.111864
Duration: 0.93 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
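
A report of this kind is typically produced with a few lines of Python. The following is a minimal sketch, not the configuration actually used for this report (that is in config.json). The function name `build_report`, the file paths, the sampling step, and the metadata values are hypothetical, and the `dataset` keys are assumed to correspond to the Description, Author, and URL fields shown in the Dataset section.

```python
def build_report(csv_path, output_path="report.html", sample_size=10_000, seed=0):
    """Profile a CSV file with ydata-profiling and write an HTML report."""
    # Imported lazily so this module loads even without the heavy dependencies.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    # Assumption: profile a random sample when the file has more rows than sample_size.
    if len(df) > sample_size:
        df = df.sample(n=sample_size, random_state=seed)

    profile = ProfileReport(
        df,
        title="Profiling report",
        # Assumed metadata keys; the values here are placeholders.
        dataset={"description": "dataset description", "author": "dataset author"},
    )
    profile.to_file(output_path)
    return output_path
```

Calling `build_report("addresses.csv")` with a hypothetical input file would write `report.html` to the working directory.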

Variables

순번
Real number (ℝ)

UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17267.812
Minimum: 1
Maximum: 34336
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1785.7
Q1: 8777.25
Median: 17242
Q3: 25918.75
95-th percentile: 32566.05
Maximum: 34336
Range: 34335
Interquartile range (IQR): 17141.5

Descriptive statistics

Standard deviation: 9865.6989
Coefficient of variation (CV): 0.57133463
Kurtosis: -1.1957846
Mean: 17267.812
Median Absolute Deviation (MAD): 8577.5
Skewness: -0.010034457
Sum: 1.7267812 × 10^8
Variance: 97332015
Monotonicity: Not monotonic
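
Several of the quantile and descriptive statistics for 순번 are simple functions of one another. The sketch below recomputes them from the reported minimum, maximum, quartiles, mean, and standard deviation as a sanity check; it uses only the rounded values printed in this report, so small rounding differences are expected.

```python
# Values copied from the 순번 statistics in this report.
n = 10_000
mean = 17267.812
std = 9865.6989
minimum, maximum = 1, 34336
q1, q3 = 8777.25, 25918.75

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance
total = mean * n                  # Sum

print(value_range, iqr)
print(f"CV = {cv:.6f}, variance = {variance:.0f}, sum = {total:.7e}")
```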
Histogram with fixed size bins (bins=50)
Common Values

Value  Count  Frequency (%)
29602  1      < 0.1%
23997  1      < 0.1%
20563  1      < 0.1%
5991   1      < 0.1%
4276   1      < 0.1%
26160  1      < 0.1%
4861   1      < 0.1%
22660  1      < 0.1%
27935  1      < 0.1%
23719  1      < 0.1%
Other values (9990)  9990  99.9%

Minimum 10 values

Value  Count  Frequency (%)
1      1      < 0.1%
2      1      < 0.1%
3      1      < 0.1%
15     1      < 0.1%
17     1      < 0.1%
19     1      < 0.1%
25     1      < 0.1%
26     1      < 0.1%
27     1      < 0.1%
33     1      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
34336  1      < 0.1%
34328  1      < 0.1%
34326  1      < 0.1%
34324  1      < 0.1%
34323  1      < 0.1%
34320  1      < 0.1%
34317  1      < 0.1%
34316  1      < 0.1%
34315  1      < 0.1%
34310  1      < 0.1%

시군구명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
보령시: 10000

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 보령시
2nd row: 보령시
3rd row: 보령시
4th row: 보령시
5th row: 보령시

Common Values

Value  Count  Frequency (%)
보령시  10000  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
보령시  10000  100.0%

읍면동명
Categorical

Distinct: 21
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
웅천읍: 1123
대천동: 946
남포면: 918
오천면: 857
청라면: 721
Other values (16): 5435

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 대천동
2nd row: 신흑동
3rd row: 신흑동
4th row: 주교면
5th row: 남포면

Common Values

Value  Count  Frequency (%)
웅천읍  1123  11.2%
대천동  946  9.5%
남포면  918  9.2%
오천면  857  8.6%
청라면  721  7.2%
주교면  704  7.0%
천북면  697  7.0%
주산면  564  5.6%
동대동  521  5.2%
청소면  514  5.1%
Other values (11)  2435  24.3%

Length

Histogram of lengths of the category

도로명주소
Text

Distinct: 727
Distinct (%): 7.3%
Missing: 9
Missing (%): 0.1%
Memory size: 156.2 KiB

Length

Max length: 8
Median length: 7
Mean length: 4.0047042
Min length: 2

Characters and Unicode

Total characters: 40011
Distinct characters: 309
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
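
To illustrate what the category breakdown measures, the sketch below classifies the characters of the five sample values listed for this variable by Unicode general category, using Python's standard `unicodedata` module. It is only an illustration of the idea; how the report itself derives scripts and blocks is not shown here. "Lo" is the category code for Other Letter and "Nd" for Decimal Number.

```python
import unicodedata
from collections import Counter

# The five sample values shown for this variable.
samples = ["토정로", "고잠2길", "석서1길", "가운데벌길", "평촌밤섬길"]

# Count characters by Unicode general category across all sample values.
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)
print(category_counts)
```

For these five values the Hangul syllables fall under Other Letter and the digits under Decimal Number, the same two categories the report finds for the whole column.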

Unique

Unique: 15
Unique (%): 0.2%

Sample

1st row: 토정로
2nd row: 고잠2길
3rd row: 석서1길
4th row: 가운데벌길
5th row: 평촌밤섬길

Value  Count  Frequency (%)
충서로  211  2.1%
토정로  120  1.2%
홍보로  116  1.2%
중앙로  87  0.9%
만수로  85  0.9%
보령남로  78  0.8%
성주산로  77  0.8%
대해로  77  0.8%
충청수영로  76  0.8%
봉덕삼현길  73  0.7%
Other values (717)  8991  90.0%

Most occurring characters

Value  Count  Frequency (%)
       7537   18.8%
       2508   6.3%
1      1067   2.7%
2      893    2.2%
       831    2.1%
       780    1.9%
       718    1.8%
       615    1.5%
       559    1.4%
       554    1.4%
Other values (299)  23949  59.9%

Most occurring categories

Value  Count  Frequency (%)
Other Letter    37005  92.5%
Decimal Number  3006   7.5%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
       7537   20.4%
       2508   6.8%
       831    2.2%
       780    2.1%
       718    1.9%
       615    1.7%
       559    1.5%
       554    1.5%
       473    1.3%
       440    1.2%
Other values (289)  21990  59.4%
Decimal Number
Value  Count  Frequency (%)
1      1067   35.5%
2      893    29.7%
3      444    14.8%
4      214    7.1%
5      111    3.7%
6      95     3.2%
7      71     2.4%
8      42     1.4%
0      41     1.4%
9      28     0.9%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  37005  92.5%
Common  3006   7.5%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
       7537   20.4%
       2508   6.8%
       831    2.2%
       780    2.1%
       718    1.9%
       615    1.7%
       559    1.5%
       554    1.5%
       473    1.3%
       440    1.2%
Other values (289)  21990  59.4%
Common
Value  Count  Frequency (%)
1      1067   35.5%
2      893    29.7%
3      444    14.8%
4      214    7.1%
5      111    3.7%
6      95     3.2%
7      71     2.4%
8      42     1.4%
0      41     1.4%
9      28     0.9%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  37005  92.5%
ASCII   3006   7.5%

Most frequent character per block

Hangul
Value  Count  Frequency (%)
       7537   20.4%
       2508   6.8%
       831    2.2%
       780    2.1%
       718    1.9%
       615    1.7%
       559    1.5%
       554    1.5%
       473    1.3%
       440    1.2%
Other values (289)  21990  59.4%
ASCII
Value  Count  Frequency (%)
1      1067   35.5%
2      893    29.7%
3      444    14.8%
4      214    7.1%
5      111    3.7%
6      95     3.2%
7      71     2.4%
8      42     1.4%
0      41     1.4%
9      28     0.9%

건축물용도
Text

Distinct: 157
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 16
Median length: 4
Mean length: 4.1916
Min length: 2

Characters and Unicode

Total characters: 41916
Distinct characters: 192
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 46
Unique (%): 0.5%

Sample

1st row: 단독주택
2nd row: 기타일반숙박시설
3rd row: 어린이집
4th row: 단독주택
5th row: 단독주택

Value  Count  Frequency (%)
단독주택  7659  76.6%
일반음식점  182  1.8%
창고  130  1.3%
사무소  129  1.3%
소매점  124  1.2%
기타판매및영업시설  116  1.2%
일반공장  112  1.1%
기타창고시설  104  1.0%
휴게음식점  92  0.9%
축사  89  0.9%
Other values (147)  1263  12.6%

Most occurring characters

Value  Count  Frequency (%)
       7785   18.6%
       7757   18.5%
       7664   18.3%
       7660   18.3%
       679    1.6%
       669    1.6%
       657    1.6%
       651    1.6%
       463    1.1%
       438    1.0%
Other values (182)  7493  17.9%

Most occurring categories

Value  Count  Frequency (%)
Other Letter       41769  99.6%
Decimal Number     56     0.1%
Close Punctuation  43     0.1%
Open Punctuation   43     0.1%
Other Punctuation  5      < 0.1%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
       7785   18.6%
       7757   18.6%
       7664   18.3%
       7660   18.3%
       679    1.6%
       669    1.6%
       657    1.6%
       651    1.6%
       463    1.1%
       438    1.0%
Other values (177)  7346  17.6%
Decimal Number
Value  Count  Frequency (%)
1      29     51.8%
2      27     48.2%
Close Punctuation
Value  Count  Frequency (%)
)      43     100.0%
Open Punctuation
Value  Count  Frequency (%)
(      43     100.0%
Other Punctuation
Value  Count  Frequency (%)
.      5      100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  41769  99.6%
Common  147    0.4%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
       7785   18.6%
       7757   18.6%
       7664   18.3%
       7660   18.3%
       679    1.6%
       669    1.6%
       657    1.6%
       651    1.6%
       463    1.1%
       438    1.0%
Other values (177)  7346  17.6%
Common
Value  Count  Frequency (%)
)      43     29.3%
(      43     29.3%
1      29     19.7%
2      27     18.4%
.      5      3.4%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  41769  99.6%
ASCII   147    0.4%

Most frequent character per block

Hangul
Value  Count  Frequency (%)
       7785   18.6%
       7757   18.6%
       7664   18.3%
       7660   18.3%
       679    1.6%
       669    1.6%
       657    1.6%
       651    1.6%
       463    1.1%
       438    1.0%
Other values (177)  7346  17.6%
ASCII
Value  Count  Frequency (%)
)      43     29.3%
(      43     29.3%
1      29     19.7%
2      27     18.4%
.      5      3.4%

형태
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
표준형: 9982
비표준형: 18

Length

Max length: 4
Median length: 3
Mean length: 3.0018
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 표준형
2nd row: 표준형
3rd row: 표준형
4th row: 표준형
5th row: 표준형

Common Values

Value  Count  Frequency (%)
표준형  9982  99.8%
비표준형  18  0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
표준형  9982  99.8%
비표준형  18  0.2%

용도
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
일반용(오각형): 9940
관공서용: 41
자율형: 18
문화재 관광용: 1

Length

Max length: 8
Median length: 8
Mean length: 7.9745
Min length: 3

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 일반용(오각형)
2nd row: 일반용(오각형)
3rd row: 일반용(오각형)
4th row: 일반용(오각형)
5th row: 일반용(오각형)

Common Values

Value  Count  Frequency (%)
일반용(오각형)  9940  99.4%
관공서용  41  0.4%
자율형  18  0.2%
문화재 관광용  1  < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
일반용(오각형  9940  99.4%
관공서용  41  0.4%
자율형  18  0.2%
문화재  1  < 0.1%
관광용  1  < 0.1%

Interactions


Correlations

          순번    읍면동명  형태    용도
순번      1.000  0.572  0.012  0.000
읍면동명  0.572  1.000  0.000  0.061
형태      0.012  0.000  1.000  1.000
용도      0.000  0.061  1.000  1.000

          읍면동명  형태    용도
읍면동명  1.000  0.000  0.033
형태      0.000  1.000  1.000
용도      0.033  1.000  1.000

          순번    읍면동명  형태    용도
순번      1.000  0.251  0.009  0.000
읍면동명  0.251  1.000  0.000  0.033
형태      0.009  0.000  1.000  1.000
용도      0.000  0.033  1.000  1.000
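
The 형태 and 용도 coefficient of 1.000 is what one would expect if each 용도 category maps to exactly one 형태 category. The sketch below illustrates this with Cramér's V, a common association measure for two categorical variables; the report does not state which measure each matrix uses, so this choice is an assumption. The cross-tabulation is HYPOTHETICAL: the report gives only the marginal counts, and the table merely assumes that the 18 자율형 rows are the 18 비표준형 rows.

```python
import math

def cramers_v(table):
    """Cramér's V (no bias correction) for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))

# HYPOTHETICAL cross-tabulation consistent with the reported marginal counts.
# Columns: 일반용(오각형), 관공서용, 자율형, 문화재 관광용
table = [
    [9940, 41, 0, 1],   # 표준형
    [0, 0, 18, 0],      # 비표준형
]
print(round(cramers_v(table), 3))
```

Under that assumption 형태 is fully determined by 용도, so the statistic reaches its maximum value of 1.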

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

(index)  순번  시군구명  읍면동명  도로명주소  건축물용도  형태  용도
29601  29602  보령시  대천동  토정로  단독주택  표준형  일반용(오각형)
1496  1497  보령시  신흑동  고잠2길  기타일반숙박시설  표준형  일반용(오각형)
14529  14530  보령시  신흑동  석서1길  어린이집  표준형  일반용(오각형)
254  255  보령시  주교면  가운데벌길  단독주택  표준형  일반용(오각형)
30575  30576  보령시  남포면  평촌밤섬길  단독주택  표준형  일반용(오각형)
2330  2331  보령시  대천동  구시5길  단독주택  표준형  일반용(오각형)
9541  9542  보령시  명천동  명암2길  동사무소  표준형  일반용(오각형)
15037  15038  보령시  성주면  성주산로  단독주택  표준형  일반용(오각형)
10939  10940  보령시  성주면  벌뜸길  단독주택  표준형  일반용(오각형)
19014  19015  보령시  청라면  여술길  단독주택  표준형  일반용(오각형)

(index)  순번  시군구명  읍면동명  도로명주소  건축물용도  형태  용도
28980  28981  보령시  오천면  충청수영로  단독주택  표준형  일반용(오각형)
8932  8933  보령시  미산면  만수로  단독주택  표준형  일반용(오각형)
12683  12684  보령시  청라면  사거리길  단독주택  표준형  일반용(오각형)
4365  4366  보령시  청소면  넙티로  일반공장  표준형  일반용(오각형)
10995  10996  보령시  대천동  벼루길  단독주택  표준형  일반용(오각형)
17219  17220  보령시  주교면  신대1길  창고  표준형  일반용(오각형)
20435  20436  보령시  내항동  왕대산길  단독주택  표준형  일반용(오각형)
28099  28100  보령시  청소면  청소큰길  사무소  표준형  일반용(오각형)
5004  5005  보령시  청라면  당안길  단독주택  표준형  일반용(오각형)
15400  15401  보령시  오천면  소루구지길  단독주택  표준형  일반용(오각형)