Overview

Dataset statistics

Number of variables: 6
Number of observations: 218
Missing cells: 1
Missing cells (%): 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 11.0 KiB
Average record size in memory: 51.6 B

Variable types

Text: 2
Numeric: 2
Categorical: 2

Dataset

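Description: Data on village halls (마을회관) in Gyeongsan-si, Gyeongsangbuk-do, providing fields such as village hall location, floor area, and building structure.
Author: Gyeongsan-si, Gyeongsangbuk-do (경상북도 경산시)
URL: https://www.data.go.kr/data/15007448/fileData.do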

Alerts

Imbalance: 건물구조 is highly imbalanced (61.3%)
Unique: 소재지 has unique values
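
The 61.3% figure in the imbalance alert can be reproduced from the 건물구조 category counts listed under Common Values later in this report. The sketch below assumes the score is one minus the Shannon entropy of the category distribution divided by its maximum possible value, log(k) for k categories. That formula is an assumption about how ydata-profiling scores imbalance, and `imbalance_score` is a hypothetical helper name, not part of the library.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the counts."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # with a single category the ratio is undefined; report 0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)


# 건물구조 counts from the Common Values table: six categories with 2+ rows,
# plus eight categories that occur exactly once (14 distinct values, 218 rows).
counts = [161, 26, 15, 3, 3, 2] + [1] * 8
score = imbalance_score(counts)
print(f"{score:.1%}")
```

Under these assumptions the score rounds to 61.3%, which matches the alert. A score of 0% would mean all 14 categories are equally frequent, and a score near 100% would mean almost every row falls in one category.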

Reproduction

Analysis started: 2023-12-12 22:00:30.072830
Analysis finished: 2023-12-12 22:00:31.209565
Duration: 1.14 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

회관명 (village hall name)
Text

Distinct: 216
Distinct (%): 99.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

Length

Max length: 6
Median length: 4
Mean length: 3.5779817
Min length: 2

Characters and Unicode

Total characters: 780
Distinct characters: 115
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
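
As an illustration of the character properties mentioned above, the following sketch tallies Unicode general categories with Python's standard `unicodedata` module. It uses the first five 회관명 values from the Sample section; `category_counts` is a hypothetical helper name, and this is not necessarily how the profiling tool computes its own tables. The two-letter codes map to the names used in this report: `Lo` is Other Letter, `Nd` is Decimal Number, and `Zs` is Space Separator.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in the values."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


# First five 회관명 values, copied from the Sample section of this report.
names = ["금락1리", "금락2리", "금락3리", "금락4리", "동서1리"]
counts = category_counts(names)

for code, n in counts.most_common():
    print(code, n)
```

Each of these five names contains three Hangul syllables and one digit, so the tally is dominated by `Lo`, in line with the Other Letter share reported for the full column.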

Unique

Unique: 214
Unique (%): 98.2%

Sample

1st row: 금락1리
2nd row: 금락2리
3rd row: 금락3리
4th row: 금락4리
5th row: 동서1리
Value | Count | Frequency (%)
용천1리 | 2 | 0.9%
당리리 | 2 | 0.9%
조영1동 | 1 | 0.5%
사월2리 | 1 | 0.5%
사월1리 | 1 | 0.5%
우검리 | 1 | 0.5%
금락1리 | 1 | 0.5%
용산리 | 1 | 0.5%
미산1리 | 1 | 0.5%
미산2리 | 1 | 0.5%
Other values (207) | 207 | 94.5%

Most occurring characters

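Value | Count | Frequency (%)
(not recovered) | 195 | 25.0%
1 | 63 | 8.1%
2 | 57 | 7.3%
(not recovered) | 34 | 4.4%
(not recovered) | 20 | 2.6%
(not recovered) | 18 | 2.3%
(not recovered) | 15 | 1.9%
(not recovered) | 14 | 1.8%
(not recovered) | 14 | 1.8%
(not recovered) | 13 | 1.7%
Other values (105) | 337 | 43.2%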

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 645 | 82.7%
Decimal Number | 131 | 16.8%
Space Separator | 4 | 0.5%

Most frequent character per category

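Other Letter
Value | Count | Frequency (%)
(not recovered) | 195 | 30.2%
(not recovered) | 34 | 5.3%
(not recovered) | 20 | 3.1%
(not recovered) | 18 | 2.8%
(not recovered) | 15 | 2.3%
(not recovered) | 14 | 2.2%
(not recovered) | 14 | 2.2%
(not recovered) | 13 | 2.0%
(not recovered) | 10 | 1.6%
(not recovered) | 9 | 1.4%
Other values (99) | 303 | 47.0%

Decimal Number
Value | Count | Frequency (%)
1 | 63 | 48.1%
2 | 57 | 43.5%
3 | 8 | 6.1%
4 | 2 | 1.5%
5 | 1 | 0.8%

Space Separator
Value | Count | Frequency (%)
(space) | 4 | 100.0%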

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 645 | 82.7%
Common | 135 | 17.3%

Most frequent character per script

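Hangul
Value | Count | Frequency (%)
(not recovered) | 195 | 30.2%
(not recovered) | 34 | 5.3%
(not recovered) | 20 | 3.1%
(not recovered) | 18 | 2.8%
(not recovered) | 15 | 2.3%
(not recovered) | 14 | 2.2%
(not recovered) | 14 | 2.2%
(not recovered) | 13 | 2.0%
(not recovered) | 10 | 1.6%
(not recovered) | 9 | 1.4%
Other values (99) | 303 | 47.0%

Common
Value | Count | Frequency (%)
1 | 63 | 46.7%
2 | 57 | 42.2%
3 | 8 | 5.9%
(space) | 4 | 3.0%
4 | 2 | 1.5%
5 | 1 | 0.7%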

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 645 | 82.7%
ASCII | 135 | 17.3%

Most frequent character per block

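Hangul
Value | Count | Frequency (%)
(not recovered) | 195 | 30.2%
(not recovered) | 34 | 5.3%
(not recovered) | 20 | 3.1%
(not recovered) | 18 | 2.8%
(not recovered) | 15 | 2.3%
(not recovered) | 14 | 2.2%
(not recovered) | 14 | 2.2%
(not recovered) | 13 | 2.0%
(not recovered) | 10 | 1.6%
(not recovered) | 9 | 1.4%
Other values (99) | 303 | 47.0%

ASCII
Value | Count | Frequency (%)
1 | 63 | 46.7%
2 | 57 | 42.2%
3 | 8 | 5.9%
(space) | 4 | 3.0%
4 | 2 | 1.5%
5 | 1 | 0.7%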

소재지 (location address)
Text

UNIQUE

Distinct: 218
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

Length

Max length: 18
Median length: 17
Mean length: 15.541284
Min length: 12

Characters and Unicode

Total characters: 3388
Distinct characters: 132
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 218
Unique (%): 100.0%

Sample

1st row: 경산시 하양읍금락리 75
2nd row: 경산시 하양읍금락리 35-2
3rd row: 경산시 하양읍금락리 209-38
4th row: 경산시 하양읍금락리 133-13
5th row: 경산시 하양읍동서리 558-4
Value | Count | Frequency (%)
경산시 | 218 | 33.4%
하양읍금락리 | 4 | 0.6%
하양읍동서리 | 3 | 0.5%
149-1 | 2 | 0.3%
용성면일광리 | 2 | 0.3%
서부1동사정동 | 2 | 0.3%
하양읍남하리 | 2 | 0.3%
하양읍부호리 | 2 | 0.3%
하양읍대곡리 | 2 | 0.3%
하양읍은호리 | 2 | 0.3%
Other values (412) | 414 | 63.4%

Most occurring characters

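Value | Count | Frequency (%)
(space) | 436 | 12.9%
(not recovered) | 253 | 7.5%
(not recovered) | 220 | 6.5%
(not recovered) | 220 | 6.5%
1 | 180 | 5.3%
(not recovered) | 174 | 5.1%
2 | 151 | 4.5%
- | 133 | 3.9%
(not recovered) | 112 | 3.3%
3 | 93 | 2.7%
Other values (122) | 1416 | 41.8%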

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1962 | 57.9%
Decimal Number | 855 | 25.2%
Space Separator | 436 | 12.9%
Dash Punctuation | 133 | 3.9%
Open Punctuation | 1 | < 0.1%
Close Punctuation | 1 | < 0.1%

Most frequent character per category

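Other Letter
Value | Count | Frequency (%)
(not recovered) | 253 | 12.9%
(not recovered) | 220 | 11.2%
(not recovered) | 220 | 11.2%
(not recovered) | 174 | 8.9%
(not recovered) | 112 | 5.7%
(not recovered) | 80 | 4.1%
(not recovered) | 68 | 3.5%
(not recovered) | 54 | 2.8%
(not recovered) | 53 | 2.7%
(not recovered) | 40 | 2.0%
Other values (108) | 688 | 35.1%

Decimal Number
Value | Count | Frequency (%)
1 | 180 | 21.1%
2 | 151 | 17.7%
3 | 93 | 10.9%
5 | 77 | 9.0%
4 | 77 | 9.0%
7 | 67 | 7.8%
8 | 56 | 6.5%
9 | 54 | 6.3%
6 | 54 | 6.3%
0 | 46 | 5.4%

Space Separator
Value | Count | Frequency (%)
(space) | 436 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 133 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1 | 100.0%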

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1960 | 57.9%
Common | 1426 | 42.1%
Han | 2 | 0.1%

Most frequent character per script

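Hangul
Value | Count | Frequency (%)
(not recovered) | 253 | 12.9%
(not recovered) | 220 | 11.2%
(not recovered) | 220 | 11.2%
(not recovered) | 174 | 8.9%
(not recovered) | 112 | 5.7%
(not recovered) | 80 | 4.1%
(not recovered) | 68 | 3.5%
(not recovered) | 54 | 2.8%
(not recovered) | 53 | 2.7%
(not recovered) | 40 | 2.0%
Other values (106) | 686 | 35.0%

Common
Value | Count | Frequency (%)
(space) | 436 | 30.6%
1 | 180 | 12.6%
2 | 151 | 10.6%
- | 133 | 9.3%
3 | 93 | 6.5%
5 | 77 | 5.4%
4 | 77 | 5.4%
7 | 67 | 4.7%
8 | 56 | 3.9%
9 | 54 | 3.8%
Other values (4) | 102 | 7.2%

Han
Value | Count | Frequency (%)
(not recovered) | 1 | 50.0%
(not recovered) | 1 | 50.0%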

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1960 | 57.9%
ASCII | 1426 | 42.1%
CJK | 2 | 0.1%

Most frequent character per block

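ASCII
Value | Count | Frequency (%)
(space) | 436 | 30.6%
1 | 180 | 12.6%
2 | 151 | 10.6%
- | 133 | 9.3%
3 | 93 | 6.5%
5 | 77 | 5.4%
4 | 77 | 5.4%
7 | 67 | 4.7%
8 | 56 | 3.9%
9 | 54 | 3.8%
Other values (4) | 102 | 7.2%

Hangul
Value | Count | Frequency (%)
(not recovered) | 253 | 12.9%
(not recovered) | 220 | 11.2%
(not recovered) | 220 | 11.2%
(not recovered) | 174 | 8.9%
(not recovered) | 112 | 5.7%
(not recovered) | 80 | 4.1%
(not recovered) | 68 | 3.5%
(not recovered) | 54 | 2.8%
(not recovered) | 53 | 2.7%
(not recovered) | 40 | 2.0%
Other values (106) | 686 | 35.0%

CJK
Value | Count | Frequency (%)
(not recovered) | 1 | 50.0%
(not recovered) | 1 | 50.0%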

연면적(m2) (total floor area, m²)
Real number (ℝ)

Distinct: 215
Distinct (%): 98.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 157.05573
Minimum: 0
Maximum: 939.11
Zeros: 1
Zeros (%): 0.5%
Negative: 0
Negative (%): 0.0%
Memory size: 2.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 71.0255
Q1: 111.3875
Median: 151.825
Q3: 178.91
95-th percentile: 271.8155
Maximum: 939.11
Range: 939.11
Interquartile range (IQR): 67.5225

Descriptive statistics

Standard deviation: 80.847472
Coefficient of variation (CV): 0.51476931
Kurtosis: 40.624033
Mean: 157.05573
Median Absolute Deviation (MAD): 33.485
Skewness: 4.6653549
Sum: 34238.15
Variance: 6536.3137
Monotonicity: Not monotonic
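
Several of the 연면적(m2) figures above are derived from others, so they can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers copied from this section and the standard definitions of range, IQR, coefficient of variation, and variance; the variable names are illustrative.

```python
# Figures copied from the 연면적(m2) statistics in this section.
minimum, maximum = 0.0, 939.11
q1, q3 = 111.3875, 178.91
mean, std = 157.05573, 80.847472
total = 34238.15

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # Interquartile range = Q3 - Q1
cv = std / mean                  # Coefficient of variation = std / mean
variance = std ** 2              # Variance = std squared
implied_n = total / mean         # Sum / Mean gives the number of values averaged

print(f"range={value_range:.2f} iqr={iqr:.4f} cv={cv:.4f}")
print(f"variance={variance:.2f} implied_n={implied_n:.1f}")
```

The implied count comes out at the 218 observations reported in the overview, which is consistent with this column having no missing values.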
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
105.0 | 2 | 0.9%
166.25 | 2 | 0.9%
132.0 | 2 | 0.9%
130.23 | 1 | 0.5%
137.04 | 1 | 0.5%
66.96 | 1 | 0.5%
102.52 | 1 | 0.5%
163.11 | 1 | 0.5%
108.0 | 1 | 0.5%
78.72 | 1 | 0.5%
Other values (205) | 205 | 94.0%

Minimum 10 values
Value | Count | Frequency (%)
0.0 | 1 | 0.5%
35.37 | 1 | 0.5%
41.68 | 1 | 0.5%
41.98 | 1 | 0.5%
43.26 | 1 | 0.5%
48.0 | 1 | 0.5%
65.93 | 1 | 0.5%
66.0 | 1 | 0.5%
66.96 | 1 | 0.5%
69.0 | 1 | 0.5%

Maximum 10 values
Value | Count | Frequency (%)
939.11 | 1 | 0.5%
442.54 | 1 | 0.5%
436.24 | 1 | 0.5%
324.0 | 1 | 0.5%
323.79 | 1 | 0.5%
312.04 | 1 | 0.5%
291.3 | 1 | 0.5%
290.62 | 1 | 0.5%
280.87 | 1 | 0.5%
275.69 | 1 | 0.5%

층수 (number of floors)
Categorical

Distinct: 3
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

Value | Count
2 | 146
1 | 68
3 | 4

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 2
3rd row: 1
4th row: 1
5th row: 2

Common Values

Value | Count | Frequency (%)
2 | 146 | 67.0%
1 | 68 | 31.2%
3 | 4 | 1.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2 | 146 | 67.0%
1 | 68 | 31.2%
3 | 4 | 1.8%

건물구조 (building structure)
Categorical

IMBALANCE

Distinct: 14
Distinct (%): 6.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

Value | Count
철근콘크리트 | 161
벽돌 | 26
시멘트벽돌 | 15
시멘트블럭 | 3
조적조 | 3
Other values (9) | 10

Length

Max length: 12
Median length: 6
Mean length: 5.3669725
Min length: 2

Unique

Unique: 8
Unique (%): 3.7%

Sample

1st row: 철근콘크리트
2nd row: 시멘트벽돌
3rd row: 시멘트블럭
4th row: 시멘트블럭
5th row: 철근콘크리트

Common Values

Value | Count | Frequency (%)
철근콘크리트 | 161 | 73.9%
벽돌 | 26 | 11.9%
시멘트벽돌 | 15 | 6.9%
시멘트블럭 | 3 | 1.4%
조적조 | 3 | 1.4%
블록 | 2 | 0.9%
철골조 | 1 | 0.5%
블럭조+벽돌조 | 1 | 0.5%
철근콘크리트+시멘트벽돌 | 1 | 0.5%
슬라브 | 1 | 0.5%
Other values (4) | 4 | 1.8%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
철근콘크리트 | 161 | 73.9%
벽돌 | 26 | 11.9%
시멘트벽돌 | 15 | 6.9%
시멘트블럭 | 3 | 1.4%
조적조 | 3 | 1.4%
블록 | 2 | 0.9%
철골조 | 1 | 0.5%
블럭조+벽돌조 | 1 | 0.5%
철근콘크리트+시멘트벽돌 | 1 | 0.5%
슬라브 | 1 | 0.5%
Other values (4) | 4 | 1.8%

건립연도 (year built)
Real number (ℝ)

Distinct: 45
Distinct (%): 20.7%
Missing: 1
Missing (%): 0.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 2003.1152
Minimum: 1972
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.0 KiB

Quantile statistics

Minimum: 1972
5-th percentile: 1986.6
Q1: 1998
Median: 2003
Q3: 2008
95-th percentile: 2018
Maximum: 2022
Range: 50
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 9.1241742
Coefficient of variation (CV): 0.0045549922
Kurtosis: 1.2385398
Mean: 2003.1152
Median Absolute Deviation (MAD): 5
Skewness: -0.63140053
Sum: 434676
Variance: 83.250555
Monotonicity: Not monotonic
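
Because 건립연도 has one missing value, the mean and the distinct percentage appear to be computed over the 217 non-missing rows rather than all 218. The check below is a sketch that uses only figures from this section and the overview. The conclusion about the denominator is an inference from the arithmetic, not something the report states.

```python
n_rows = 218     # Number of observations (from the overview)
n_missing = 1    # Missing
n_valid = n_rows - n_missing

total = 434676   # Sum
distinct = 45    # Distinct

mean_valid = total / n_valid                   # mean over non-missing rows
mean_all = total / n_rows                      # mean if the missing row were counted
distinct_pct_valid = 100 * distinct / n_valid  # distinct share of non-missing rows
distinct_pct_all = 100 * distinct / n_rows     # distinct share of all rows

print(f"mean: {mean_valid:.4f} (n=217) vs {mean_all:.4f} (n=218)")
print(f"distinct: {distinct_pct_valid:.1f}% (n=217) vs {distinct_pct_all:.1f}% (n=218)")
```

Only the n=217 variants agree with the reported Mean of 2003.1152 and Distinct (%) of 20.7%.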
Histogram with fixed size bins (bins=45)
Common values
Value | Count | Frequency (%)
2001 | 18 | 8.3%
2002 | 15 | 6.9%
1998 | 14 | 6.4%
2003 | 13 | 6.0%
2008 | 11 | 5.0%
2007 | 11 | 5.0%
2000 | 10 | 4.6%
2005 | 10 | 4.6%
2014 | 8 | 3.7%
1996 | 8 | 3.7%
Other values (35) | 99 | 45.4%

Minimum 10 values
Value | Count | Frequency (%)
1972 | 1 | 0.5%
1973 | 1 | 0.5%
1974 | 1 | 0.5%
1978 | 1 | 0.5%
1979 | 1 | 0.5%
1980 | 2 | 0.9%
1981 | 1 | 0.5%
1983 | 1 | 0.5%
1985 | 2 | 0.9%
1987 | 1 | 0.5%

Maximum 10 values
Value | Count | Frequency (%)
2022 | 2 | 0.9%
2021 | 1 | 0.5%
2020 | 1 | 0.5%
2019 | 6 | 2.8%
2018 | 2 | 0.9%
2017 | 4 | 1.8%
2016 | 2 | 0.9%
2015 | 4 | 1.8%
2014 | 8 | 3.7%
2013 | 3 | 1.4%

Interactions

[Figure: pairwise interaction plots (four panels)]

Correlations

 | 연면적(m2) | 층수 | 건물구조 | 건립연도
연면적(m2) | 1.000 | 0.583 | 0.570 | 0.367
층수 | 0.583 | 1.000 | 0.238 | 0.366
건물구조 | 0.570 | 0.238 | 1.000 | 0.702
건립연도 | 0.367 | 0.366 | 0.702 | 1.000

 | 건물구조 | 층수
건물구조 | 1.000 | 0.131
층수 | 0.131 | 1.000

 | 연면적(m2) | 건립연도 | 층수 | 건물구조
연면적(m2) | 1.000 | -0.081 | 0.289 | 0.317
건립연도 | -0.081 | 1.000 | 0.241 | 0.396
층수 | 0.289 | 0.241 | 1.000 | 0.131
건물구조 | 0.317 | 0.396 | 0.131 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

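First rows
Index | 회관명 | 소재지 | 연면적(m2) | 층수 | 건물구조 | 건립연도
0 | 금락1리 | 경산시 하양읍금락리 75 | 262.44 | 3 | 철근콘크리트 | 1998
1 | 금락2리 | 경산시 하양읍금락리 35-2 | 41.68 | 2 | 시멘트벽돌 | 1985
2 | 금락3리 | 경산시 하양읍금락리 209-38 | 35.37 | 1 | 시멘트블럭 | 1973
3 | 금락4리 | 경산시 하양읍금락리 133-13 | 41.98 | 1 | 시멘트블럭 | 1974
4 | 동서1리 | 경산시 하양읍동서리 558-4 | 214.05 | 2 | 철근콘크리트 | 2006
5 | 동서2리 | 경산시 하양읍동서리 592-7 | 268.9 | 2 | 철골조 | 1997
6 | 동서3리 | 경산시 하양읍동서리 115-23 | 104.22 | 1 | 시멘트블럭 | 1978
7 | 도리리 | 경산시 하양읍도리리 112-43 | 263.85 | 2 | 철근콘크리트 | 2003
8 | 양지리 | 경산시 하양읍양지리 162-3 | 99.21 | 2 | 철근콘크리트 | 2007
9 | 대곡1리 | 경산시 하양읍대곡리 138-1 | 138.96 | 1 | 철근콘크리트 | 2002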
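
Last rows
Index | 회관명 | 소재지 | 연면적(m2) | 층수 | 건물구조 | 건립연도
208 | 대평동 | 경산시 북부동대평동 176-1 | 156.7 | 2 | 철근콘크리트 | 2001
209 | 대정동 | 경산시 북부동대정동 623-1 | 264.54 | 2 | 철근콘크리트 | 2001
210 | 임당1동 | 경산시 북부동임당1동 403 | 192.9 | 2 | 철근콘크리트 | 2002
211 | 임당2동 | 경산시 북부동임당2동 516-5 | 170.1 | 2 | 철근콘크리트 | 1987
212 | 임당3동 | 경산시 북부동임당로19길 2-5 | 149.14 | 2 | 철근콘크리트 | 1997
213 | 대동 | 경산시 북부동대동 57-6 | 179.22 | 2 | 철근콘크리트 | 2007
214 | 조영1동 | 경산시 북부동조영1동 272 | 112.46 | 2 | 철근콘크리트 | 2000
215 | 조영2동 | 경산시 북부동조영2동 371-4 | 171.78 | 2 | 철근콘크리트 | 2005
216 | 갑제동 | 경산시 북부동갑제동 296-10 | 170.85 | 2 | 철근콘크리트 | 2003
217 | 계양동 | 경산시 북부동계양동 235-1 | 146.26 | 2 | 철근콘크리트 | 1988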