Overview

Dataset statistics

Number of variables: 5
Number of observations: 832
Missing cells: 48
Missing cells (%): 1.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 34.3 KiB
Average record size in memory: 42.2 B
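The percentage and per-record figures above follow from the raw counts. The sketch below re-derives them, assuming that "missing cells (%)" is taken over all cells (variables × observations) and that 1 KiB = 1024 B; the variable names are illustrative only.

```python
# Re-derive two summary figures from the counts listed above.
n_variables = 5
n_observations = 832
missing_cells = 48
total_size_kib = 34.3

# Share of missing cells over the whole table (assumed denominator: all cells).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average bytes per record, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

The same 48 missing cells amount to 5.8% when divided by the 832 observations of a single column, which is the figure the alert for 건물명(소유자명) reports.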

Variable types

Categorical: 1
Text: 2
Numeric: 2

Dataset

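Description: Data on buildings subject to mechanical equipment performance inspection in Hwaseong-si, Gyeonggi-do. It contains 건축물용도 (building use), 건물명(소유자명) (building name / owner name), 도로명주소 (road-name address), 우편번호 (postal code), and 건축물연면적 (total floor area).
Author: Hwaseong-si, Gyeonggi-do (경기도 화성시)
URL: https://www.data.go.kr/data/15124797/fileData.do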

Alerts

Missing: 건물명(소유자명) has 48 (5.8%) missing values
Skewed: 건축물연면적 is highly skewed (γ1 = 21.64836309)
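To make the skewness alert concrete, here is a minimal sketch of one common definition of sample skewness, the adjusted Fisher-Pearson coefficient. It is assumed, not confirmed by the report, that γ1 above is computed this way; the function name is hypothetical, and the input is a handful of floor-area values copied from the sample rows of this report, not the full column.

```python
from math import sqrt


def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness (assumed definition, see lead-in)."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness needs at least three values")
    mean = sum(values) / n
    # Second and third central moments (population form).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    g1 = m3 / m2 ** 1.5
    # Small-sample correction factor.
    return sqrt(n * (n - 1)) / (n - 2) * g1


# A few 건축물연면적 values from the sample rows: five small, one very large.
areas = [556.0, 650.0, 380.0, 428.0, 600.0, 2724340.32]
print(sample_skewness(areas))  # positive: a long right tail
```

A single very large building among many small ones is enough to push the coefficient strongly positive, which is the pattern the alert flags.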

Reproduction

Analysis started: 2023-12-12 17:28:00.013482
Analysis finished: 2023-12-12 17:28:01.210776
Duration: 1.2 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
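A report of this kind could be regenerated along the following lines. This is a sketch under stated assumptions, not the configuration actually used: the helper name `build_report`, the CSV path, the `cp949` encoding, the report title, and the output file name are all hypothetical, and the real settings live in the config.json referenced above.

```python
def build_report(csv_path, output_path="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Third-party packages are imported lazily so this module still loads
    # where pandas / ydata-profiling are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the published file is a CSV; the encoding is a guess.
    df = pd.read_csv(csv_path, encoding=encoding)

    profile = ProfileReport(
        df,
        title="Hwaseong-si mechanical equipment inspection targets",  # hypothetical title
        dataset={
            "description": "경기도 화성시 기계설비 성능점검 대상 현황",
            "author": "경기도 화성시",
            "url": "https://www.data.go.kr/data/15124797/fileData.do",
        },
    )
    profile.to_file(output_path)
    return profile
```

Calling `build_report("hwaseong.csv")` (a placeholder file name) would write `report.html` next to the script.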

Variables

건축물용도
Categorical

Distinct: 44
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Memory size: 6.6 KiB

공동주택: 332
공장: 147
교육연구시설: 125
업무시설: 57
제1종근린생활시설: 35
Other values (39): 136

Length

Max length: 28
Median length: 4
Mean length: 4.7644231
Min length: 2

Unique

Unique: 25
Unique (%): 3.0%
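"Distinct" and "Unique" are easy to confuse. The sketch below assumes the usual reading: distinct counts the different values present, while unique counts the values that occur exactly once. The function name is hypothetical and the input is the five sample rows of this variable, not the full column.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of distinct values, number of values seen exactly once)."""
    counts = Counter(v for v in values if v is not None)  # ignore missing
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


sample_rows = ["공장", "교육연구시설", "공장", "공장", "공장"]
print(distinct_and_unique(sample_rows))
```

On the full column the report gives 44 distinct building uses, of which 25 appear on a single row.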

Sample

1st row: 공장
2nd row: 교육연구시설
3rd row: 공장
4th row: 공장
5th row: 공장

Common Values

Value | Count | Frequency (%)
공동주택 | 332 | 39.9%
공장 | 147 | 17.7%
교육연구시설 | 125 | 15.0%
업무시설 | 57 | 6.9%
제1종근린생활시설 | 35 | 4.2%
제2종근린생활시설 | 28 | 3.4%
판매시설 | 21 | 2.5%
자동차관련시설 | 16 | 1.9%
숙박시설 | 9 | 1.1%
업무시설, 판매시설 | 8 | 1.0%
Other values (34) | 54 | 6.5%

Length

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
공동주택 | 332 | 38.2%
공장 | 151 | 17.4%
교육연구시설 | 128 | 14.7%
업무시설 | 76 | 8.8%
제1종근린생활시설 | 35 | 4.0%
판매시설 | 31 | 3.6%
제2종근린생활시설 | 29 | 3.3%
자동차관련시설 | 16 | 1.8%
숙박시설 | 9 | 1.0%
문화및집회시설 | 7 | 0.8%
Other values (23) | 54 | 6.2%

건물명(소유자명)
Text

Distinct: 746
Distinct (%): 95.2%
Missing: 48
Missing (%): 5.8%
Memory size: 6.6 KiB

Length

Max length: 31
Median length: 22
Mean length: 9.7704082
Min length: 2

Characters and Unicode

Total characters: 7660
Distinct characters: 445
Distinct categories: 11
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
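As an illustration of the character properties mentioned above, the following sketch tallies Unicode general categories with Python's standard `unicodedata` module. The function name is hypothetical, and the two inputs are building names taken from the sample rows of this report. The two-letter codes correspond to the category names used in the tables of this report (Lo = Other Letter, Lu = Uppercase Letter, Zs = Space Separator, Ps = Open Punctuation, Pe = Close Punctuation).

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters in `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


for name in ["삼성전자(주)", "금강펜테리움 IX타워"]:
    print(name, dict(category_counts(name)))
```

The standard library does not expose script or block properties, so the script and block tables in this report would need a third-party package or the Unicode data files to reproduce.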

Unique

Unique: 715
Unique (%): 91.2%

Sample

1st row: 삼성전자(주)
2nd row: 현대.기아자동차(주)연구개발본부
3rd row: 기아자동차 화성공장
4th row: 기아자동차 화성공장
5th row: 금강펜테리움 IX타워

Words

Value | Count | Frequency (%)
동탄역 | 52 | 4.1%
동탄 | 35 | 2.8%
아파트 | 15 | 1.2%
향남시범 | 12 | 0.9%
반도유보라 | 11 | 0.9%
(not recovered) | 9 | 0.7%
화성 | 9 | 0.7%
오피스텔 | 8 | 0.6%
화성공장 | 8 | 0.6%
롯데캐슬 | 8 | 0.6%
Other values (883) | 1098 | 86.8%

Most occurring characters

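Value | Count | Frequency (%)
(space) | 490 | 6.4%
(not recovered) | 303 | 4.0%
(not recovered) | 244 | 3.2%
(not recovered) | 178 | 2.3%
(not recovered) | 178 | 2.3%
(not recovered) | 170 | 2.2%
(not recovered) | 166 | 2.2%
(not recovered) | 155 | 2.0%
(not recovered) | 128 | 1.7%
(not recovered) | 116 | 1.5%
Other values (435) | 5532 | 72.2%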

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 6569 | 85.8%
Space Separator | 490 | 6.4%
Decimal Number | 218 | 2.8%
Uppercase Letter | 194 | 2.5%
Lowercase Letter | 46 | 0.6%
Open Punctuation | 36 | 0.5%
Close Punctuation | 36 | 0.5%
Other Symbol | 32 | 0.4%
Other Punctuation | 22 | 0.3%
Letter Number | 10 | 0.1%

Most frequent character per category

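Other Letter
Value | Count | Frequency (%)
(not recovered) | 303 | 4.6%
(not recovered) | 244 | 3.7%
(not recovered) | 178 | 2.7%
(not recovered) | 178 | 2.7%
(not recovered) | 170 | 2.6%
(not recovered) | 166 | 2.5%
(not recovered) | 155 | 2.4%
(not recovered) | 128 | 1.9%
(not recovered) | 116 | 1.8%
(not recovered) | 116 | 1.8%
Other values (373) | 4815 | 73.3%

Uppercase Letter
Value | Count | Frequency (%)
T | 20 | 10.3%
I | 18 | 9.3%
A | 17 | 8.8%
L | 15 | 7.7%
E | 15 | 7.7%
S | 15 | 7.7%
M | 13 | 6.7%
H | 11 | 5.7%
O | 10 | 5.2%
G | 8 | 4.1%
Other values (14) | 52 | 26.8%

Lowercase Letter
Value | Count | Frequency (%)
e | 9 | 19.6%
n | 5 | 10.9%
c | 4 | 8.7%
a | 4 | 8.7%
l | 4 | 8.7%
o | 3 | 6.5%
r | 3 | 6.5%
t | 3 | 6.5%
b | 2 | 4.3%
d | 2 | 4.3%
Other values (6) | 7 | 15.2%

Decimal Number
Value | Count | Frequency (%)
2 | 77 | 35.3%
1 | 46 | 21.1%
3 | 23 | 10.6%
0 | 20 | 9.2%
5 | 13 | 6.0%
4 | 11 | 5.0%
6 | 9 | 4.1%
8 | 8 | 3.7%
7 | 6 | 2.8%
9 | 5 | 2.3%

Letter Number
Value | Count | Frequency (%)
(not recovered) | 7 | 70.0%
(not recovered) | 1 | 10.0%
(not recovered) | 1 | 10.0%
(not recovered) | 1 | 10.0%

Other Punctuation
Value | Count | Frequency (%)
. | 18 | 81.8%
, | 3 | 13.6%
& | 1 | 4.5%

Space Separator
Value | Count | Frequency (%)
(space) | 490 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 36 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 36 | 100.0%

Other Symbol
Value | Count | Frequency (%)
(not recovered) | 32 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 7 | 100.0%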

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 6600 | 86.2%
Common | 809 | 10.6%
Latin | 250 | 3.3%
Han | 1 | < 0.1%

Most frequent character per script

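Hangul
Value | Count | Frequency (%)
(not recovered) | 303 | 4.6%
(not recovered) | 244 | 3.7%
(not recovered) | 178 | 2.7%
(not recovered) | 178 | 2.7%
(not recovered) | 170 | 2.6%
(not recovered) | 166 | 2.5%
(not recovered) | 155 | 2.3%
(not recovered) | 128 | 1.9%
(not recovered) | 116 | 1.8%
(not recovered) | 116 | 1.8%
Other values (373) | 4846 | 73.4%

Latin
Value | Count | Frequency (%)
T | 20 | 8.0%
I | 18 | 7.2%
A | 17 | 6.8%
L | 15 | 6.0%
E | 15 | 6.0%
S | 15 | 6.0%
M | 13 | 5.2%
H | 11 | 4.4%
O | 10 | 4.0%
e | 9 | 3.6%
Other values (34) | 107 | 42.8%

Common
Value | Count | Frequency (%)
(space) | 490 | 60.6%
2 | 77 | 9.5%
1 | 46 | 5.7%
( | 36 | 4.4%
) | 36 | 4.4%
3 | 23 | 2.8%
0 | 20 | 2.5%
. | 18 | 2.2%
5 | 13 | 1.6%
4 | 11 | 1.4%
Other values (7) | 39 | 4.8%

Han
Value | Count | Frequency (%)
(not recovered) | 1 | 100.0%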

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 6568 | 85.7%
ASCII | 1049 | 13.7%
None | 32 | 0.4%
Number Forms | 10 | 0.1%
CJK | 1 | < 0.1%

Most frequent character per block

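ASCII
Value | Count | Frequency (%)
(space) | 490 | 46.7%
2 | 77 | 7.3%
1 | 46 | 4.4%
( | 36 | 3.4%
) | 36 | 3.4%
3 | 23 | 2.2%
0 | 20 | 1.9%
T | 20 | 1.9%
I | 18 | 1.7%
. | 18 | 1.7%
Other values (47) | 265 | 25.3%

Hangul
Value | Count | Frequency (%)
(not recovered) | 303 | 4.6%
(not recovered) | 244 | 3.7%
(not recovered) | 178 | 2.7%
(not recovered) | 178 | 2.7%
(not recovered) | 170 | 2.6%
(not recovered) | 166 | 2.5%
(not recovered) | 155 | 2.4%
(not recovered) | 128 | 1.9%
(not recovered) | 116 | 1.8%
(not recovered) | 116 | 1.8%
Other values (372) | 4814 | 73.3%

None
Value | Count | Frequency (%)
(not recovered) | 32 | 100.0%

Number Forms
Value | Count | Frequency (%)
(not recovered) | 7 | 70.0%
(not recovered) | 1 | 10.0%
(not recovered) | 1 | 10.0%
(not recovered) | 1 | 10.0%

CJK
Value | Count | Frequency (%)
(not recovered) | 1 | 100.0%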

도로명주소
Text

Distinct: 814
Distinct (%): 97.8%
Missing: 0
Missing (%): 0.0%
Memory size: 6.6 KiB

Length

Max length: 35
Median length: 26
Mean length: 18.877404
Min length: 13

Characters and Unicode

Total characters: 15706
Distinct characters: 181
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 799
Unique (%): 96.0%

Sample

1st row: 경기도 화성시 삼성전자로 1
2nd row: 경기도 화성시 남양읍 현대연구소로 150
3rd row: 경기도 화성시 우정읍 기아자동차로 95
4th row: 경기도 화성시 우정읍 기아자동차로 95
5th row: 경기도 화성시 동탄첨단산업1로 27

Words

Value | Count | Frequency (%)
화성시 | 834 | 23.1%
경기도 | 833 | 23.1%
향남읍 | 80 | 2.2%
봉담읍 | 60 | 1.7%
남양읍 | 37 | 1.0%
동탄대로 | 31 | 0.9%
동탄기흥로 | 29 | 0.8%
동탄반석로 | 26 | 0.7%
동탄순환대로 | 20 | 0.6%
동탄대로시범길 | 19 | 0.5%
Other values (734) | 1635 | 45.4%

Most occurring characters

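Value | Count | Frequency (%)
(space) | 2772 | 17.6%
(not recovered) | 883 | 5.6%
(not recovered) | 876 | 5.6%
(not recovered) | 873 | 5.6%
(not recovered) | 861 | 5.5%
(not recovered) | 853 | 5.4%
(not recovered) | 836 | 5.3%
(not recovered) | 722 | 4.6%
1 | 643 | 4.1%
(not recovered) | 441 | 2.8%
Other values (171) | 5946 | 37.9%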

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 9838 | 62.6%
Decimal Number | 2885 | 18.4%
Space Separator | 2772 | 17.6%
Dash Punctuation | 190 | 1.2%
Close Punctuation | 7 | < 0.1%
Open Punctuation | 7 | < 0.1%
Other Punctuation | 7 | < 0.1%

Most frequent character per category

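Other Letter
Value | Count | Frequency (%)
(not recovered) | 883 | 9.0%
(not recovered) | 876 | 8.9%
(not recovered) | 873 | 8.9%
(not recovered) | 861 | 8.8%
(not recovered) | 853 | 8.7%
(not recovered) | 836 | 8.5%
(not recovered) | 722 | 7.3%
(not recovered) | 441 | 4.5%
(not recovered) | 414 | 4.2%
(not recovered) | 370 | 3.8%
Other values (155) | 2709 | 27.5%

Decimal Number
Value | Count | Frequency (%)
1 | 643 | 22.3%
2 | 424 | 14.7%
3 | 304 | 10.5%
4 | 266 | 9.2%
6 | 250 | 8.7%
5 | 242 | 8.4%
7 | 217 | 7.5%
0 | 203 | 7.0%
8 | 169 | 5.9%
9 | 167 | 5.8%

Other Punctuation
Value | Count | Frequency (%)
. | 4 | 57.1%
, | 3 | 42.9%

Space Separator
Value | Count | Frequency (%)
(space) | 2772 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 190 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 7 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 7 | 100.0%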

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 9838 | 62.6%
Common | 5868 | 37.4%

Most frequent character per script

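Hangul
Value | Count | Frequency (%)
(not recovered) | 883 | 9.0%
(not recovered) | 876 | 8.9%
(not recovered) | 873 | 8.9%
(not recovered) | 861 | 8.8%
(not recovered) | 853 | 8.7%
(not recovered) | 836 | 8.5%
(not recovered) | 722 | 7.3%
(not recovered) | 441 | 4.5%
(not recovered) | 414 | 4.2%
(not recovered) | 370 | 3.8%
Other values (155) | 2709 | 27.5%

Common
Value | Count | Frequency (%)
(space) | 2772 | 47.2%
1 | 643 | 11.0%
2 | 424 | 7.2%
3 | 304 | 5.2%
4 | 266 | 4.5%
6 | 250 | 4.3%
5 | 242 | 4.1%
7 | 217 | 3.7%
0 | 203 | 3.5%
- | 190 | 3.2%
Other values (6) | 357 | 6.1%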

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 9838 | 62.6%
ASCII | 5868 | 37.4%

Most frequent character per block

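ASCII
Value | Count | Frequency (%)
(space) | 2772 | 47.2%
1 | 643 | 11.0%
2 | 424 | 7.2%
3 | 304 | 5.2%
4 | 266 | 4.5%
6 | 250 | 4.3%
5 | 242 | 4.1%
7 | 217 | 3.7%
0 | 203 | 3.5%
- | 190 | 3.2%
Other values (6) | 357 | 6.1%

Hangul
Value | Count | Frequency (%)
(not recovered) | 883 | 9.0%
(not recovered) | 876 | 8.9%
(not recovered) | 873 | 8.9%
(not recovered) | 861 | 8.8%
(not recovered) | 853 | 8.7%
(not recovered) | 836 | 8.5%
(not recovered) | 722 | 7.3%
(not recovered) | 441 | 4.5%
(not recovered) | 414 | 4.2%
(not recovered) | 370 | 3.8%
Other values (155) | 2709 | 27.5%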

우편번호
Real number (ℝ)

Distinct: 253
Distinct (%): 30.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18452.528
Minimum: 18221
Maximum: 18635
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.4 KiB

Quantile statistics

Minimum: 18221
5th percentile: 18268
Q1: 18406.75
Median: 18469
Q3: 18499
95th percentile: 18610
Maximum: 18635
Range: 414
Interquartile range (IQR): 92.25

Descriptive statistics

Standard deviation: 94.768953
Coefficient of variation (CV): 0.005135825
Kurtosis: -0.16141331
Mean: 18452.528
Median Absolute Deviation (MAD): 41
Skewness: -0.39636949
Sum: 15352503
Variance: 8981.1545
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
18478 | 25 | 3.0%
18479 | 25 | 3.0%
18469 | 24 | 2.9%
18487 | 20 | 2.4%
18450 | 14 | 1.7%
18449 | 12 | 1.4%
18476 | 12 | 1.4%
18454 | 12 | 1.4%
18468 | 12 | 1.4%
18484 | 11 | 1.3%
Other values (243) | 665 | 79.9%

Minimum 10 values

Value | Count | Frequency (%)
18221 | 2 | 0.2%
18236 | 2 | 0.2%
18237 | 7 | 0.8%
18238 | 3 | 0.4%
18239 | 2 | 0.2%
18241 | 1 | 0.1%
18242 | 2 | 0.2%
18244 | 4 | 0.5%
18247 | 1 | 0.1%
18256 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
18635 | 1 | 0.1%
18631 | 1 | 0.1%
18629 | 1 | 0.1%
18627 | 2 | 0.2%
18626 | 1 | 0.1%
18623 | 8 | 1.0%
18622 | 8 | 1.0%
18621 | 1 | 0.1%
18617 | 1 | 0.1%
18616 | 1 | 0.1%

건축물연면적
Real number (ℝ)

SKEWED 

Distinct: 775
Distinct (%): 93.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20313.228
Minimum: 312
Maximum: 2724340.3
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.4 KiB

Quantile statistics

Minimum: 312
5th percentile: 473.85
Q1: 849.25
Median: 11502.425
Q3: 17192.143
95th percentile: 50761.03
Maximum: 2724340.3
Range: 2724028.3
Interquartile range (IQR): 16342.893

Descriptive statistics

Standard deviation: 105134.18
Coefficient of variation (CV): 5.175651
Kurtosis: 535.3868
Mean: 20313.228
Median Absolute Deviation (MAD): 10497.925
Skewness: 21.648363
Sum: 16900605
Variance: 1.1053195 × 10^10
Monotonicity: Not monotonic
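Several of the figures for 건축물연면적 are simple functions of others, which makes them easy to cross-check. The sketch below recomputes them from the rounded values printed in this report, so small rounding differences are expected; the function and key names are illustrative only.

```python
def derived_stats(mean, std, q1, q3, minimum, maximum):
    """Recompute derived summary statistics from the basic ones."""
    return {
        "cv": std / mean,            # coefficient of variation
        "iqr": q3 - q1,              # interquartile range
        "range": maximum - minimum,  # max minus min
        "variance": std ** 2,        # square of the standard deviation
    }


stats = derived_stats(
    mean=20313.228,
    std=105134.18,
    q1=849.25,
    q3=17192.143,
    minimum=312,
    maximum=2724340.3,
)
for key, value in stats.items():
    print(f"{key}: {value:.6g}")
```

A coefficient of variation above 5 means the standard deviation is more than five times the mean, consistent with a few very large buildings dominating the column.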
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
872.0 | 3 | 0.4%
608.0 | 3 | 0.4%
470.0 | 3 | 0.4%
498.0 | 3 | 0.4%
1005.0 | 3 | 0.4%
545.0 | 3 | 0.4%
622.0 | 3 | 0.4%
514.0 | 3 | 0.4%
536.0 | 2 | 0.2%
534.0 | 2 | 0.2%
Other values (765) | 804 | 96.6%

Minimum 10 values

Value | Count | Frequency (%)
312.0 | 1 | 0.1%
326.0 | 1 | 0.1%
330.0 | 1 | 0.1%
344.0 | 1 | 0.1%
352.0 | 1 | 0.1%
357.0 | 1 | 0.1%
361.0 | 1 | 0.1%
365.0 | 1 | 0.1%
367.0 | 1 | 0.1%
376.0 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
2724340.32 | 1 | 0.1%
843922.76 | 1 | 0.1%
772228.76 | 1 | 0.1%
333571.89 | 1 | 0.1%
287024.48 | 1 | 0.1%
277416.07 | 1 | 0.1%
276457.0 | 1 | 0.1%
238551.15 | 1 | 0.1%
142799.07 | 1 | 0.1%
123177.01 | 1 | 0.1%

Interactions


Correlations

Variable | 건축물용도 | 우편번호 | 건축물연면적
건축물용도 | 1.000 | 0.418 | 0.000
우편번호 | 0.418 | 1.000 | 0.112
건축물연면적 | 0.000 | 0.112 | 1.000

Variable | 우편번호 | 건축물연면적 | 건축물용도
우편번호 | 1.000 | 0.194 | 0.155
건축물연면적 | 0.194 | 1.000 | 0.000
건축물용도 | 0.155 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 건축물용도 | 건물명(소유자명) | 도로명주소 | 우편번호 | 건축물연면적
0 | 공장 | 삼성전자(주) | 경기도 화성시 삼성전자로 1 | 18448 | 2724340.32
1 | 교육연구시설 | 현대.기아자동차(주)연구개발본부 | 경기도 화성시 남양읍 현대연구소로 150 | 18280 | 843922.76
2 | 공장 | 기아자동차 화성공장 | 경기도 화성시 우정읍 기아자동차로 95 | 18571 | 772228.76
3 | 공장 | 기아자동차 화성공장 | 경기도 화성시 우정읍 기아자동차로 95 | 18571 | 333571.89
4 | 공장 | 금강펜테리움 IX타워 | 경기도 화성시 동탄첨단산업1로 27 | 18469 | 287024.48
5 | 교육연구시설 | 수원대학교 | 경기도 화성시 봉담읍 와우안길 17 | 18323 | 277416.07
6 | 판매시설 | 동탄역 롯데캐슬(롯데백화점) | 경기도 화성시 동탄역로 160 | 18478 | 276457.0
7 | 교육연구시설 | 장안대학교 | 경기도 화성시 봉담읍 삼천병마로 1182 | 18331 | 142799.07
8 | 교육연구시설 | 수원과학대학교 | 경기도 화성시 정남면 세자로 288 | 18516 | 111353.01
9 | 의료시설 | 한림대학교의료원 | 경기도 화성시 큰재봉길 7 | 18450 | 101795.47

Last rows

Index | 건축물용도 | 건물명(소유자명) | 도로명주소 | 우편번호 | 건축물연면적
822 | 공동주택 | 시티프라디움4차아파트 | 경기도 화성시 남양읍 남양로862번길 13 | 18264 | 556.0
823 | 공동주택 | 우미린 센트포레1단지 | 경기도 화성시 장조4로 13 | 18354 | 650.0
824 | 공동주택 | 남양리젠시빌란트아파트 | 경기도 화성시 남양읍 시청로102번길 52 | 18268 | 380.0
825 | 공동주택 | 동탄역헤리엇 | 경기도 화성시 동탄역로 54 | 18481 | 428.0
826 | 공동주택 | 에듀시티파라곤 | 경기도 화성시 봉담읍 동화서북안길 10 | 18298 | 600.0
827 | 공동주택 | 화성시청역 센트럴파크 서희스타힐스3단지 | 경기도 화성시 남양읍 화성시청역로 14 | 18269 | 847.0
828 | 공동주택 | 화성동탄2 LH행복주택 38단지 | 경기도 화성시 동탄신리천로5길 90 | 18492 | 700.0
829 | 공동주택 | 향남언덕마을15단지 | 경기도 화성시 향남읍 향남로39번길 22 | 18621 | 922.0
830 | 공동주택 | 증흥에스클래스 더 센트럴 | 경기도 화성시 봉담읍 상리3길 160 | 18311 | 824.0
831 | 공동주택 | 우미린 센트포레2단지 | 경기도 화성시 장조4로 33 | 18352 | 650.0