Overview

Dataset statistics

Number of variables: 5
Number of observations: 249
Missing cells: 494
Missing cells (%): 39.7%
Duplicate rows: 60
Duplicate rows (%): 24.1%
Total size in memory: 10.1 KiB
Average record size in memory: 41.5 B
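
The percentages and the average record size follow directly from the raw counts. A small sketch that recomputes them, using only the figures reported in this section (the variable names are illustrative):

```python
# Figures copied from the dataset statistics reported above.
n_variables = 5
n_observations = 249
missing_cells = 494
duplicate_rows = 60
total_size_kib = 10.1

total_cells = n_variables * n_observations  # 5 * 249 = 1245 cells
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}")
print(f"Duplicate rows (%): {duplicate_pct:.1f}")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the three results agree with the reported 39.7%, 24.1% and 41.5 B.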

Variable types

Text: 4
Numeric: 1

Dataset

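Description: Public data on buildings located in Seo-gu, Gwangju Metropolitan City that fall under the Mechanical Equipment Act (기계설비법), including building name (건축물명), road-name address (도로명주소), total floor area (연면적), and number of households (세대수).
Author: Seo-gu, Gwangju Metropolitan City (광주광역시 서구)
URL: https://www.data.go.kr/data/15125453/fileData.do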

Alerts

Dataset has 60 (24.1%) duplicate rows [Duplicates]
연면적 (total floor area) has 76 (30.5%) missing values [Missing]
세대수 (number of households) has 173 (69.5%) missing values [Missing]
비고 (remarks) has 245 (98.4%) missing values [Missing]
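
The per-column alerts can be cross-checked against the overall missing-cell count. A minimal sketch, assuming the two remaining columns have no missing values (their sections report Missing 0):

```python
n_observations = 249

# Missing-value counts taken from the alerts above.
missing_by_column = {"연면적": 76, "세대수": 173, "비고": 245}

# Percentage of missing values per column, rounded as in the report.
missing_pct = {
    column: round(100 * n_missing / n_observations, 1)
    for column, n_missing in missing_by_column.items()
}
total_missing = sum(missing_by_column.values())

for column, pct in missing_pct.items():
    print(f"{column}: {pct}% missing")
print(f"Total missing cells: {total_missing}")
```

The three counts sum to 494, the number of missing cells listed under the dataset statistics.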

Reproduction

Analysis started: 2023-12-23 07:55:39.051184
Analysis finished: 2023-12-23 07:55:43.657754
Duration: 4.61 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
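
The report records the tool and version but not how it was invoked. A minimal sketch of how a comparable report might be generated with pandas and ydata-profiling; the file names, the title, and the `cp949` encoding are assumptions, not details taken from this report:

```python
def build_report(csv_path: str, output_path: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so this module loads even where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; adjust it to match the downloaded file.
    df = pd.read_csv(csv_path, encoding="cp949")
    report = ProfileReport(df, title="Dataset profile")
    report.to_file(output_path)
```

A call such as `build_report("buildings.csv")` (a hypothetical file name) would write `report.html` to the working directory.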

Variables

건축물명
Text

Distinct: 188
Distinct (%): 75.5%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 KiB

Length

Max length: 15
Median length: 12
Mean length: 7.9196787
Min length: 3

Characters and Unicode

Total characters: 1972
Distinct characters: 262
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
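
As an illustration of such character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for one building name from this dataset; the two-letter codes `Lo` and `Lu` correspond to the "Other Letter" and "Uppercase Letter" labels used in the tables that follow. The helper name is illustrative.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category (e.g. 'Lo', 'Lu', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)


# One of the building names that appears in this dataset.
counts = category_counts("광천E편한세상")
print(counts)
```

Here the six Hangul syllables fall under `Lo` and the Latin capital `E` under `Lu`.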

Unique

Unique: 128
Unique (%): 51.4%
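
The report lists both a distinct count and a unique count. On the usual reading, "distinct" is the number of different values and "unique" is the number of values that occur exactly once, which is consistent with 188 distinct versus 128 unique here. A minimal sketch of that distinction on a hypothetical toy column:

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of distinct values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique


# Hypothetical toy column, not taken from the dataset.
names = ["A", "B", "B", "C", "C", "C", "D"]
distinct, unique = distinct_and_unique(names)
print(distinct, unique)
```

In the toy column there are four distinct values, of which only "A" and "D" occur once.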

Sample

1st row: 동천마을주공아파트6단지
2nd row: 쌍촌주공아파트
3rd row: 빛고을파크
4th row: 상무중흥아파트2단지
5th row: 광천E편한세상
Value | Count | Frequency (%)
광주광역시 | 5 | 1.7%
건물 | 4 | 1.4%
유탑유블레스 | 4 | 1.4%
상무병원 | 3 | 1.0%
상무영무예다음(상가 | 2 | 0.7%
모아엘가비즈니스센터 | 2 | 0.7%
한국토지주택공사 | 2 | 0.7%
치평동 | 2 | 0.7%
현대해상화재보험㈜ | 2 | 0.7%
광덕중고등학교 | 2 | 0.7%
Other values (206) | 266 | 90.5%

Most occurring characters

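Value | Count | Frequency (%)
[n/a] | 77 | 3.9%
[n/a] | 76 | 3.9%
[n/a] | 66 | 3.3%
[n/a] | 55 | 2.8%
[n/a] | 47 | 2.4%
[n/a] | 46 | 2.3%
(space) | 45 | 2.3%
[n/a] | 33 | 1.7%
[n/a] | 29 | 1.5%
[n/a] | 27 | 1.4%
Other values (252) | 1471 | 74.6%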

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1824 | 92.5%
Space Separator | 45 | 2.3%
Uppercase Letter | 43 | 2.2%
Decimal Number | 38 | 1.9%
Open Punctuation | 6 | 0.3%
Close Punctuation | 6 | 0.3%
Other Symbol | 4 | 0.2%
Other Punctuation | 3 | 0.2%
Lowercase Letter | 2 | 0.1%
Dash Punctuation | 1 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[n/a] | 77 | 4.2%
[n/a] | 76 | 4.2%
[n/a] | 66 | 3.6%
[n/a] | 55 | 3.0%
[n/a] | 47 | 2.6%
[n/a] | 46 | 2.5%
[n/a] | 33 | 1.8%
[n/a] | 29 | 1.6%
[n/a] | 27 | 1.5%
[n/a] | 26 | 1.4%
Other values (223) | 1342 | 73.6%

Uppercase Letter
Value | Count | Frequency (%)
B | 7 | 16.3%
K | 6 | 14.0%
S | 6 | 14.0%
C | 5 | 11.6%
L | 4 | 9.3%
G | 3 | 7.0%
T | 3 | 7.0%
E | 2 | 4.7%
Y | 2 | 4.7%
M | 2 | 4.7%
Other values (3) | 3 | 7.0%

Decimal Number
Value | Count | Frequency (%)
1 | 15 | 39.5%
2 | 11 | 28.9%
3 | 5 | 13.2%
5 | 4 | 10.5%
6 | 2 | 5.3%
8 | 1 | 2.6%

Other Punctuation
Value | Count | Frequency (%)
, | 1 | 33.3%
& | 1 | 33.3%
. | 1 | 33.3%

Lowercase Letter
Value | Count | Frequency (%)
k | 1 | 50.0%
t | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 45 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 6 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 6 | 100.0%

Other Symbol
Value | Count | Frequency (%)
[n/a] | 4 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1828 | 92.7%
Common | 99 | 5.0%
Latin | 45 | 2.3%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[n/a] | 77 | 4.2%
[n/a] | 76 | 4.2%
[n/a] | 66 | 3.6%
[n/a] | 55 | 3.0%
[n/a] | 47 | 2.6%
[n/a] | 46 | 2.5%
[n/a] | 33 | 1.8%
[n/a] | 29 | 1.6%
[n/a] | 27 | 1.5%
[n/a] | 26 | 1.4%
Other values (224) | 1346 | 73.6%

Latin
Value | Count | Frequency (%)
B | 7 | 15.6%
K | 6 | 13.3%
S | 6 | 13.3%
C | 5 | 11.1%
L | 4 | 8.9%
G | 3 | 6.7%
T | 3 | 6.7%
E | 2 | 4.4%
Y | 2 | 4.4%
M | 2 | 4.4%
Other values (5) | 5 | 11.1%

Common
Value | Count | Frequency (%)
(space) | 45 | 45.5%
1 | 15 | 15.2%
2 | 11 | 11.1%
( | 6 | 6.1%
) | 6 | 6.1%
3 | 5 | 5.1%
5 | 4 | 4.0%
6 | 2 | 2.0%
, | 1 | 1.0%
- | 1 | 1.0%
Other values (3) | 3 | 3.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1824 | 92.5%
ASCII | 144 | 7.3%
None | 4 | 0.2%

Most frequent character per block

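Hangul
Value | Count | Frequency (%)
[n/a] | 77 | 4.2%
[n/a] | 76 | 4.2%
[n/a] | 66 | 3.6%
[n/a] | 55 | 3.0%
[n/a] | 47 | 2.6%
[n/a] | 46 | 2.5%
[n/a] | 33 | 1.8%
[n/a] | 29 | 1.6%
[n/a] | 27 | 1.5%
[n/a] | 26 | 1.4%
Other values (223) | 1342 | 73.6%

ASCII
Value | Count | Frequency (%)
(space) | 45 | 31.2%
1 | 15 | 10.4%
2 | 11 | 7.6%
B | 7 | 4.9%
( | 6 | 4.2%
) | 6 | 4.2%
K | 6 | 4.2%
S | 6 | 4.2%
C | 5 | 3.5%
3 | 5 | 3.5%
Other values (18) | 32 | 22.2%

None
Value | Count | Frequency (%)
[n/a] | 4 | 100.0%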

도로명주소
Text

Distinct: 188
Distinct (%): 75.5%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 KiB

Length

Max length: 30
Median length: 26
Mean length: 19.662651
Min length: 15

Characters and Unicode

Total characters: 4896
Distinct characters: 95
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 127
Unique (%): 51.0%

Sample

1st row: 광주광역시 서구 하남대로710번길 5
2nd row: 광주광역시 서구 쌍학로 47
3rd row: 광주광역시 서구 화정로 105
4th row: 광주광역시 서구 치평로 77
5th row: 광주광역시 서구 화운로 278
Value | Count | Frequency (%)
광주광역시 | 249 | 22.8%
서구 | 249 | 22.8%
치평동 | 24 | 2.2%
시청로 | 19 | 1.7%
상무중앙로 | 16 | 1.5%
쌍촌동 | 14 | 1.3%
풍암동 | 14 | 1.3%
죽봉대로 | 13 | 1.2%
화정동 | 11 | 1.0%
무진대로 | 9 | 0.8%
Other values (207) | 475 | 43.5%

Most occurring characters

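Value | Count | Frequency (%)
(space) | 846 | 17.3%
[n/a] | 502 | 10.3%
[n/a] | 275 | 5.6%
[n/a] | 255 | 5.2%
[n/a] | 249 | 5.1%
[n/a] | 249 | 5.1%
[n/a] | 249 | 5.1%
[n/a] | 247 | 5.0%
1 | 149 | 3.0%
2 | 108 | 2.2%
Other values (85) | 1767 | 36.1%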

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3101 | 63.3%
Space Separator | 846 | 17.3%
Decimal Number | 744 | 15.2%
Close Punctuation | 98 | 2.0%
Open Punctuation | 98 | 2.0%
Dash Punctuation | 9 | 0.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[n/a] | 502 | 16.2%
[n/a] | 275 | 8.9%
[n/a] | 255 | 8.2%
[n/a] | 249 | 8.0%
[n/a] | 249 | 8.0%
[n/a] | 249 | 8.0%
[n/a] | 247 | 8.0%
[n/a] | 106 | 3.4%
[n/a] | 70 | 2.3%
[n/a] | 61 | 2.0%
Other values (71) | 838 | 27.0%

Decimal Number
Value | Count | Frequency (%)
1 | 149 | 20.0%
2 | 108 | 14.5%
3 | 75 | 10.1%
7 | 71 | 9.5%
4 | 70 | 9.4%
5 | 59 | 7.9%
9 | 56 | 7.5%
0 | 55 | 7.4%
6 | 55 | 7.4%
8 | 46 | 6.2%

Space Separator
Value | Count | Frequency (%)
(space) | 846 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 98 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 98 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 9 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3101 | 63.3%
Common | 1795 | 36.7%

Most frequent character per script

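Hangul
Value | Count | Frequency (%)
[n/a] | 502 | 16.2%
[n/a] | 275 | 8.9%
[n/a] | 255 | 8.2%
[n/a] | 249 | 8.0%
[n/a] | 249 | 8.0%
[n/a] | 249 | 8.0%
[n/a] | 247 | 8.0%
[n/a] | 106 | 3.4%
[n/a] | 70 | 2.3%
[n/a] | 61 | 2.0%
Other values (71) | 838 | 27.0%

Common
Value | Count | Frequency (%)
(space) | 846 | 47.1%
1 | 149 | 8.3%
2 | 108 | 6.0%
) | 98 | 5.5%
( | 98 | 5.5%
3 | 75 | 4.2%
7 | 71 | 4.0%
4 | 70 | 3.9%
5 | 59 | 3.3%
9 | 56 | 3.1%
Other values (4) | 165 | 9.2%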

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3101 | 63.3%
ASCII | 1795 | 36.7%

Most frequent character per block

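ASCII
Value | Count | Frequency (%)
(space) | 846 | 47.1%
1 | 149 | 8.3%
2 | 108 | 6.0%
) | 98 | 5.5%
( | 98 | 5.5%
3 | 75 | 4.2%
7 | 71 | 4.0%
4 | 70 | 3.9%
5 | 59 | 3.3%
9 | 56 | 3.1%
Other values (4) | 165 | 9.2%

Hangul
Value | Count | Frequency (%)
[n/a] | 502 | 16.2%
[n/a] | 275 | 8.9%
[n/a] | 255 | 8.2%
[n/a] | 249 | 8.0%
[n/a] | 249 | 8.0%
[n/a] | 249 | 8.0%
[n/a] | 247 | 8.0%
[n/a] | 106 | 3.4%
[n/a] | 70 | 2.3%
[n/a] | 61 | 2.0%
Other values (71) | 838 | 27.0%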

연면적
Text

MISSING 

Distinct: 129
Distinct (%): 74.6%
Missing: 76
Missing (%): 30.5%
Memory size: 2.1 KiB

Length

Max length: 10
Median length: 8
Mean length: 7.7803468
Min length: 5

Characters and Unicode

Total characters: 1346
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 85
Unique (%): 49.1%

Sample

1st row: 23489.67
2nd row: 28211.39
3rd row: 15579.24
4th row: 15744.51
5th row: 16685.68
Value | Count | Frequency (%)
21816.88 | 2 | 1.2%
16111.43 | 2 | 1.2%
23489.67 | 2 | 1.2%
21975.5 | 2 | 1.2%
15159.25 | 2 | 1.2%
23410.24 | 2 | 1.2%
24745.18 | 2 | 1.2%
17464.97 | 2 | 1.2%
25206.26 | 2 | 1.2%
20795.39 | 2 | 1.2%
Other values (119) | 153 | 88.4%

Most occurring characters

Value | Count | Frequency (%)
1 | 213 | 15.8%
. | 160 | 11.9%
2 | 147 | 10.9%
5 | 126 | 9.4%
4 | 108 | 8.0%
3 | 105 | 7.8%
0 | 105 | 7.8%
8 | 103 | 7.7%
9 | 102 | 7.6%
7 | 95 | 7.1%
Other values (2) | 82 | 6.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1184 | 88.0%
Other Punctuation | 162 | 12.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 213 | 18.0%
2 | 147 | 12.4%
5 | 126 | 10.6%
4 | 108 | 9.1%
3 | 105 | 8.9%
0 | 105 | 8.9%
8 | 103 | 8.7%
9 | 102 | 8.6%
7 | 95 | 8.0%
6 | 80 | 6.8%

Other Punctuation
Value | Count | Frequency (%)
. | 160 | 98.8%
, | 2 | 1.2%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1346 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 213 | 15.8%
. | 160 | 11.9%
2 | 147 | 10.9%
5 | 126 | 9.4%
4 | 108 | 8.0%
3 | 105 | 7.8%
0 | 105 | 7.8%
8 | 103 | 7.7%
9 | 102 | 7.6%
7 | 95 | 7.1%
Other values (2) | 82 | 6.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1346 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 213 | 15.8%
. | 160 | 11.9%
2 | 147 | 10.9%
5 | 126 | 9.4%
4 | 108 | 8.0%
3 | 105 | 7.8%
0 | 105 | 7.8%
8 | 103 | 7.7%
9 | 102 | 7.6%
7 | 95 | 7.1%
Other values (2) | 82 | 6.1%

세대수
Real number (ℝ)

MISSING 

Distinct: 60
Distinct (%): 78.9%
Missing: 173
Missing (%): 69.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 990.28947
Minimum: 276
Maximum: 2185
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.3 KiB

Quantile statistics

Minimum: 276
5th percentile: 519
Q1: 639.25
Median: 874
Q3: 1283
95th percentile: 1734
Maximum: 2185
Range: 1909
Interquartile range (IQR): 643.75

Descriptive statistics

Standard deviation: 412.93063
Coefficient of variation (CV): 0.41697972
Kurtosis: -0.0060529718
Mean: 990.28947
Median Absolute Deviation (MAD): 267.5
Skewness: 0.73510039
Sum: 75262
Variance: 170511.7
Monotonicity: Not monotonic
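
Several of the 세대수 statistics are simple functions of the others. The sketch below recomputes the derived quantities from the reported figures, using the standard definitions of range, interquartile range, coefficient of variation, and variance as the square of the standard deviation; the variable names are illustrative.

```python
# Figures copied from the statistics reported for 세대수.
n_observations = 249
n_missing = 173
total = 75262
std_dev = 412.93063
q1, q3 = 639.25, 1283
minimum, maximum = 276, 2185

n_valid = n_observations - n_missing  # 76 non-missing values
mean = total / n_valid                # sum divided by non-missing count
cv = std_dev / mean                   # coefficient of variation
variance = std_dev ** 2               # square of the standard deviation
iqr = q3 - q1                         # interquartile range
value_range = maximum - minimum

print(f"mean={mean:.5f} cv={cv:.8f} variance={variance:.1f}")
print(f"iqr={iqr} range={value_range}")
```

The results agree with the reported mean, CV, variance, IQR and range to the precision shown in the report.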
[Figure] Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1210 | 2 | 0.8%
1435 | 2 | 0.8%
1976 | 2 | 0.8%
1600 | 2 | 0.8%
1298 | 2 | 0.8%
1734 | 2 | 0.8%
1278 | 2 | 0.8%
1233 | 2 | 0.8%
1308 | 2 | 0.8%
1060 | 2 | 0.8%
Other values (50) | 56 | 22.5%
(Missing) | 173 | 69.5%

Smallest values
Value | Count | Frequency (%)
276 | 1 | 0.4%
374 | 1 | 0.4%
500 | 1 | 0.4%
510 | 1 | 0.4%
522 | 1 | 0.4%
525 | 1 | 0.4%
536 | 1 | 0.4%
564 | 1 | 0.4%
570 | 1 | 0.4%
571 | 1 | 0.4%

Largest values
Value | Count | Frequency (%)
2185 | 1 | 0.4%
1976 | 2 | 0.8%
1734 | 2 | 0.8%
1600 | 2 | 0.8%
1500 | 2 | 0.8%
1442 | 2 | 0.8%
1437 | 2 | 0.8%
1435 | 2 | 0.8%
1308 | 2 | 0.8%
1298 | 2 | 0.8%

비고
Text

MISSING 

Distinct: 3
Distinct (%): 75.0%
Missing: 245
Missing (%): 98.4%
Memory size: 2.1 KiB

Length

Max length: 6
Median length: 5.5
Mean length: 4.75
Min length: 4

Characters and Unicode

Total characters: 19
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 2
Unique (%): 50.0%

Sample

1st row: 중앙집중
2nd row: 중앙집중
3rd row: 공동주택제외
4th row: 상가 제외
Value | Count | Frequency (%)
중앙집중 | 2 | 40.0%
공동주택제외 | 1 | 20.0%
상가 | 1 | 20.0%
제외 | 1 | 20.0%

Most occurring characters

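Value | Count | Frequency (%)
[n/a] | 4 | 21.1%
[n/a] | 2 | 10.5%
[n/a] | 2 | 10.5%
[n/a] | 2 | 10.5%
[n/a] | 2 | 10.5%
[n/a] | 1 | 5.3%
[n/a] | 1 | 5.3%
[n/a] | 1 | 5.3%
[n/a] | 1 | 5.3%
[n/a] | 1 | 5.3%
Other values (2) | 2 | 10.5%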

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 18 | 94.7%
Space Separator | 1 | 5.3%

Most frequent character per category

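Other Letter
Value | Count | Frequency (%)
[n/a] | 4 | 22.2%
[n/a] | 2 | 11.1%
[n/a] | 2 | 11.1%
[n/a] | 2 | 11.1%
[n/a] | 2 | 11.1%
[n/a] | 1 | 5.6%
[n/a] | 1 | 5.6%
[n/a] | 1 | 5.6%
[n/a] | 1 | 5.6%
[n/a] | 1 | 5.6%

Space Separator
Value | Count | Frequency (%)
(space) | 1 | 100.0%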

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 18 | 94.7%
Common | 1 | 5.3%

Most frequent character per script

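Hangul
Value | Count | Frequency (%)
[n/a] | 4 | 22.2%
[n/a] | 2 | 11.1%
[n/a] | 2 | 11.1%
[n/a] | 2 | 11.1%
[n/a] | 2 | 11.1%
[n/a] | 1 | 5.6%
[n/a] | 1 | 5.6%
[n/a] | 1 | 5.6%
[n/a] | 1 | 5.6%
[n/a] | 1 | 5.6%

Common
Value | Count | Frequency (%)
(space) | 1 | 100.0%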

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 18 | 94.7%
ASCII | 1 | 5.3%

Most frequent character per block

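Hangul
Value | Count | Frequency (%)
[n/a] | 4 | 22.2%
[n/a] | 2 | 11.1%
[n/a] | 2 | 11.1%
[n/a] | 2 | 11.1%
[n/a] | 2 | 11.1%
[n/a] | 1 | 5.6%
[n/a] | 1 | 5.6%
[n/a] | 1 | 5.6%
[n/a] | 1 | 5.6%
[n/a] | 1 | 5.6%

ASCII
Value | Count | Frequency (%)
(space) | 1 | 100.0%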

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

Variable | 세대수 | 비고
세대수 | 1.000 | NaN
비고 | NaN | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
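
Nullity correlation can be read as the ordinary Pearson correlation between the "is present" indicators of two columns. The sketch below illustrates that idea with the standard library only; it is not the report's own implementation, and the toy columns are hypothetical, loosely mimicking the sample rows in which 연면적 is missing wherever 세대수 is present.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Undefined when a column is fully present or fully missing.
        return float("nan")
    return cov / sqrt(var_a * var_b)


# Hypothetical toy columns: each is filled exactly where the other is empty.
area = [None, None, 100.5, 200.0]
households = [1308, 1435, None, None]
r = nullity_correlation(area, households)
print(r)
```

A value near -1 means one column tends to be filled exactly where the other is empty, and a value near 1 means the two columns tend to be present or missing together.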

Sample

First rows

Row | 건축물명 | 도로명주소 | 연면적 | 세대수 | 비고
0 | 동천마을주공아파트6단지 | 광주광역시 서구 하남대로710번길 5 | <NA> | 1308 | <NA>
1 | 쌍촌주공아파트 | 광주광역시 서구 쌍학로 47 | <NA> | 1435 | <NA>
2 | 빛고을파크 | 광주광역시 서구 화정로 105 | <NA> | 1100 | <NA>
3 | 상무중흥아파트2단지 | 광주광역시 서구 치평로 77 | <NA> | 1108 | <NA>
4 | 광천E편한세상 | 광주광역시 서구 화운로 278 | <NA> | 1096 | <NA>
5 | 금호빛여울채아파트 | 광주광역시 서구 운천로32번길 23 | <NA> | 1500 | <NA>
6 | 동천마을1단지아파트 | 광주광역시 서구 동천로 25 | <NA> | 1442 | <NA>
7 | 풍암중흥아파트 | 광주광역시 서구 마재로 21 | <NA> | 1437 | <NA>
8 | 화정라인동산아파트 | 광주광역시 서구 염화로45번길 17 | <NA> | 1060 | <NA>
9 | 내방마을주공아파트 | 광주광역시 서구 화운로193번길 25 | <NA> | 1210 | <NA>

Last rows

Row | 건축물명 | 도로명주소 + 연면적 | 세대수 | 비고
239 | 디오빌 | 광주광역시 서구 시청로 4137112.93 | <NA> | <NA>
240 | 광주광역시도시공사 | 광주광역시 서구 시청로 2632840.77 | <NA> | <NA>
241 | 기아1,3공장 | 광주광역시 서구 화운로 277298002.1 | <NA> | <NA>
242 | 기아2-1공장 | 광주광역시 서구 월드컵4강로 27735889.53 | <NA> | <NA>
243 | 기아2공장 | 광주광역시 서구 화운로 211202592.05 | <NA> | <NA>
244 | 유스퀘어터미널 | 광주광역시 서구 무진대로 90481277.79 | <NA> | <NA>
245 | 광주신세계백화점 | 광주광역시 서구 무진대로 93258432 | <NA> | <NA>
246 | 이마트 신세계점 | 광주광역시 서구 죽봉대로 6176386 | <NA> | <NA>
247 | 서부농수산물도매시장 | 광주광역시 서구 매월2로 1659702.56 | <NA> | <NA>
248 | 금호월드 | 광주광역시 서구 군분2로 5458019.67 | <NA> | <NA>

Duplicate rows

Most frequently occurring

Row | 건축물명 | 도로명주소 + 연면적 | 세대수 | 비고 | # duplicates
0 | BM타워 | 광주광역시 서구 상무중앙로 9815579.24 | <NA> | <NA> | 2
1 | BYC주식회사 | 광주광역시 서구 상무중앙로 4321935.23 | <NA> | <NA> | 2
2 | KBC플러스(호반써밋상가) | 광주광역시 서구 무진대로 91927484.85 | <NA> | <NA> | 2
3 | LST주식회사 | 광주광역시 서구 매월2로15번길 1520795.39 | <NA> | <NA> | 2
4 | 골든빌오피스텔 | 광주광역시 서구 시청로96번길 1222898.58 | <NA> | <NA> | 2
5 | 골든힐스타워 | 광주광역시 서구 죽봉대로78번길 1021975.5 | <NA> | <NA> | 2
6 | 광덕중고등학교 | 광주광역시 서구 화정로 20223147.51 | <NA> | <NA> | 2
7 | 광주광역시 도시철도공사 | 광주광역시 서구 상무대로 76027849.18 | <NA> | <NA> | 2
8 | 광주광역시 서구청 | 광주광역시 서구 경열로 3323489.67 | <NA> | <NA> | 2
9 | 광주서석중고등학교 | 광주광역시 서구 화정로253번길 2718811.23 | <NA> | <NA> | 2
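
The "# duplicates" column gives the number of times each fully identical row occurs. A minimal sketch of that grouping with the standard library; the helper name and the toy rows are hypothetical, and the report's own duplicate counting may differ in detail:

```python
from collections import Counter


def duplicate_groups(rows):
    """Map each row that occurs more than once to its occurrence count."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# Hypothetical toy rows (name, households); None stands for a missing value.
rows = [
    ("Tower A", None),
    ("Tower B", 500),
    ("Tower A", None),
]
groups = duplicate_groups(rows)
print(groups)
```

Rows are compared on every column, so two entries for the same building count as duplicates only if all of their fields match, including the missing ones.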