Overview

Dataset statistics

Number of variables: 4
Number of observations: 72
Missing cells: 15
Missing cells (%): 5.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.4 KiB
Average record size in memory: 34.8 B
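
The missing-cell percentage follows from the counts above, since the number of cells is variables times observations. A minimal sketch of that arithmetic, using only the figures reported in this section (the variable names are illustrative, not part of the report):

```python
# Figures copied from the Dataset statistics above.
n_variables = 4
n_observations = 72
missing_cells = 15

total_cells = n_variables * n_observations        # every cell in the table
missing_pct = 100 * missing_cells / total_cells   # share of empty cells, in percent

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

All 15 missing cells fall in a single column (see Alerts), which is why the column-level rate (15 of 72 rows) is four times the dataset-level rate.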

Variable types

Categorical: 2
Text: 1
Numeric: 1

Dataset

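Description: Status of public building construction projects carried out by the National Agency for Administrative City Construction (행정중심복합도시건설청). The columns are 구분 (category), 공공건축물 (public building), 현황 (status), and 준공(예정)연도 ((expected) completion year).
URL: https://www.data.go.kr/data/15064005/fileData.do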

Alerts

준공(예정)연도 is highly overall correlated with 현황 [High correlation]
현황 is highly overall correlated with 준공(예정)연도 [High correlation]
준공(예정)연도 has 15 (20.8%) missing values [Missing]
공공건축물 has unique values [Unique]

Reproduction

Analysis started: 2023-12-12 03:59:19.955223
Analysis finished: 2023-12-12 03:59:20.466497
Duration: 0.51 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

구분
Categorical

Distinct: 7
Distinct (%): 9.7%
Missing: 0
Missing (%): 0.0%
Memory size: 708.0 B

Length

Max length: 4
Median length: 2
Mean length: 2.8611111
Min length: 2

Unique

Unique: 2
Unique (%): 2.8%

Sample

1st row: 중앙행정
2nd row: 중앙행정
3rd row: 중앙행정
4th row: 중앙행정
5th row: 중앙행정

Common Values

Value      Count  Frequency (%)
복지       28     38.9%
중앙행정   21     29.2%
문화       10     13.9%
지방행정   9      12.5%
주거       2      2.8%
체육       1      1.4%
도시기능   1      1.4%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

공공건축물
Text

UNIQUE 

Distinct: 72
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 708.0 B

Length

Max length: 16
Median length: 12
Mean length: 9.7638889
Min length: 3

Characters and Unicode

Total characters: 703
Distinct characters: 123
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
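
As an illustration of how such properties can be read, here is a minimal sketch using Python's standard `unicodedata` module. It is not the code the report itself runs. The three names are taken from the Sample section, and the helper name `category_counts` is hypothetical. The two-letter codes map to the labels used in this report: `Lo` is Other Letter, `Nd` is Decimal Number, `Zs` is Space Separator, `Pd` is Dash Punctuation, and `Lu` is Uppercase Letter.

```python
import unicodedata
from collections import Counter

# Three building names copied from the Sample section of this report.
names = ["정부세종청사 주차장", "행복아파트 2차", "종합체육시설 3생활권"]

def category_counts(texts):
    """Count Unicode general categories (e.g. 'Lo', 'Nd') over all characters."""
    counts = Counter()
    for text in texts:
        # Normalize to NFC so each Hangul syllable is one precomposed code point.
        normalized = unicodedata.normalize("NFC", text)
        counts.update(unicodedata.category(ch) for ch in normalized)
    return counts

counts = category_counts(names)
for code, n in counts.most_common():
    print(code, n)
```

Hangul syllables fall under `Lo`, which is why Other Letter dominates the category table for this column.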

Unique

Unique: 72
Unique (%): 100.0%

Sample

1st row: 정부세종청사
2nd row: 총리공관
3rd row: 행정지원센터
4th row: 대통령기록관
5th row: 선관위

Value                 Count  Frequency (%)
복합커뮤니티센터      22     20.0%
광역복지지원센터      6      5.5%
정부세종청사          4      3.6%
3생활권               2      1.8%
2차                   1      0.9%
디지털문화유산센터    1      0.9%
4생활권               1      0.9%
5-2생활권             1      0.9%
6-3생활권             1      0.9%
4-2생활권             1      0.9%
Other values (70)     70     63.6%
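
The word counts above are consistent with splitting each building name on whitespace: 72 names containing 38 spaces give 110 words, and 22/110 = 20.0%. A minimal sketch of that tokenization follows, assuming plain whitespace splitting (the report's actual tokenizer is not shown in this document) and using four names from the Sample section:

```python
from collections import Counter

# Four building names copied from the Sample section of this report.
names = ["정부세종청사", "정부세종청사 주차장", "정부세종청사 문화관", "행복아파트 2차"]

# Split every name on whitespace and count each resulting word.
word_counts = Counter(word for name in names for word in name.split())
total_words = sum(word_counts.values())

# Print the most frequent words with their share of all words.
for word, count in word_counts.most_common(3):
    print(f"{word}\t{count}\t{100 * count / total_words:.1f}%")
```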

Most occurring characters

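Value                Count  Frequency (%)
(space)              38     5.4%
(not preserved)      35     5.0%
(not preserved)      34     4.8%
(not preserved)      31     4.4%
(not preserved)      30     4.3%
(not preserved)      29     4.1%
(not preserved)      29     4.1%
(not preserved)      25     3.6%
(not preserved)      22     3.1%
(not preserved)      22     3.1%
Other values (113)   408    58.0%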

Most occurring categories

Value              Count  Frequency (%)
Other Letter       583    82.9%
Decimal Number     58     8.3%
Space Separator    38     5.4%
Dash Punctuation   22     3.1%
Uppercase Letter   2      0.3%

Most frequent character per category

Other Letter
Value                Count  Frequency (%)
(not preserved)      35     6.0%
(not preserved)      34     5.8%
(not preserved)      31     5.3%
(not preserved)      30     5.1%
(not preserved)      29     5.0%
(not preserved)      29     5.0%
(not preserved)      25     4.3%
(not preserved)      22     3.8%
(not preserved)      22     3.8%
(not preserved)      22     3.8%
Other values (102)   304    52.1%

Decimal Number
Value   Count  Frequency (%)
1       16     27.6%
2       15     25.9%
3       10     17.2%
4       6      10.3%
5       5      8.6%
6       5      8.6%
9       1      1.7%

Uppercase Letter
Value   Count  Frequency (%)
A       1      50.0%
B       1      50.0%

Space Separator
Value     Count  Frequency (%)
(space)   38     100.0%

Dash Punctuation
Value   Count  Frequency (%)
-       22     100.0%

Most occurring scripts

Value    Count  Frequency (%)
Hangul   583    82.9%
Common   118    16.8%
Latin    2      0.3%

Most frequent character per script

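Hangul
Value                Count  Frequency (%)
(not preserved)      35     6.0%
(not preserved)      34     5.8%
(not preserved)      31     5.3%
(not preserved)      30     5.1%
(not preserved)      29     5.0%
(not preserved)      29     5.0%
(not preserved)      25     4.3%
(not preserved)      22     3.8%
(not preserved)      22     3.8%
(not preserved)      22     3.8%
Other values (102)   304    52.1%

Common
Value     Count  Frequency (%)
(space)   38     32.2%
-         22     18.6%
1         16     13.6%
2         15     12.7%
3         10     8.5%
4         6      5.1%
5         5      4.2%
6         5      4.2%
9         1      0.8%

Latin
Value   Count  Frequency (%)
A       1      50.0%
B       1      50.0%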

Most occurring blocks

Value    Count  Frequency (%)
Hangul   583    82.9%
ASCII    120    17.1%

Most frequent character per block

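ASCII
Value     Count  Frequency (%)
(space)   38     31.7%
-         22     18.3%
1         16     13.3%
2         15     12.5%
3         10     8.3%
4         6      5.0%
5         5      4.2%
6         5      4.2%
A         1      0.8%
B         1      0.8%

Hangul
Value                Count  Frequency (%)
(not preserved)      35     6.0%
(not preserved)      34     5.8%
(not preserved)      31     5.3%
(not preserved)      30     5.1%
(not preserved)      29     5.0%
(not preserved)      29     5.0%
(not preserved)      25     4.3%
(not preserved)      22     3.8%
(not preserved)      22     3.8%
(not preserved)      22     3.8%
Other values (102)   304    52.1%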

현황
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 4.2%
Missing: 0
Missing (%): 0.0%
Memory size: 708.0 B

Length

Max length: 5
Median length: 2
Mean length: 2.8055556
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 준공
2nd row: 준공
3rd row: 준공
4th row: 준공
5th row: 준공

Common Values

Value       Count  Frequency (%)
준공        40     55.6%
진행중      19     26.4%
착수 예정   13     18.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Value    Count  Frequency (%)
준공     40     47.1%
진행중   19     22.4%
착수     13     15.3%
예정     13     15.3%

준공(예정)연도
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 18
Distinct (%): 31.6%
Missing: 15
Missing (%): 20.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 2019.9649
Minimum: 2012
Maximum: 2030
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 780.0 B

Quantile statistics

Minimum: 2012
5-th percentile: 2013
Q1: 2015
Median: 2020
Q3: 2024
95-th percentile: 2027.4
Maximum: 2030
Range: 18
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 5.0953933
Coefficient of variation (CV): 0.0025225157
Kurtosis: -1.1207634
Mean: 2019.9649
Median Absolute Deviation (MAD): 4
Skewness: 0.14703253
Sum: 115138
Variance: 25.963033
Monotonicity: Not monotonic
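
Several of these figures are derived from one another. The sketch below recomputes the mean, coefficient of variation, variance, range, and IQR from the values reported in this section. The variable names are illustrative, and the inputs are the rounded figures printed above, so results agree only up to rounding.

```python
# Figures copied from the statistics reported for 준공(예정)연도.
n_total, n_missing = 72, 15
total_sum = 115138
std_dev = 5.0953933
q1, q3 = 2015, 2024
minimum, maximum = 2012, 2030

n_valid = n_total - n_missing      # observations with a year present
mean = total_sum / n_valid         # arithmetic mean over non-missing values
cv = std_dev / mean                # coefficient of variation
variance = std_dev ** 2            # variance is the squared standard deviation
iqr = q3 - q1                      # interquartile range
value_range = maximum - minimum    # range

print(f"n={n_valid} mean={mean:.4f} cv={cv:.10f} var={variance:.6f}")
print(f"iqr={iqr} range={value_range}")
```

The very small CV reflects that the values are calendar years: the spread of about five years is tiny relative to a mean near 2020.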
[Figure: Histogram with fixed size bins (bins=18)]

Common values

Value              Count  Frequency (%)
2021               6      8.3%
2014               6      8.3%
2027               5      6.9%
2018               4      5.6%
2013               4      5.6%
2016               4      5.6%
2019               3      4.2%
2015               3      4.2%
2022               3      4.2%
2024               3      4.2%
Other values (8)   16     22.2%
(Missing)          15     20.8%

Minimum 10 values

Value   Count  Frequency (%)
2012    2      2.8%
2013    4      5.6%
2014    6      8.3%
2015    3      4.2%
2016    4      5.6%
2017    1      1.4%
2018    4      5.6%
2019    3      4.2%
2020    2      2.8%
2021    6      8.3%

Maximum 10 values

Value   Count  Frequency (%)
2030    1      1.4%
2029    2      2.8%
2027    5      6.9%
2026    2      2.8%
2025    3      4.2%
2024    3      4.2%
2023    3      4.2%
2022    3      4.2%
2021    6      8.3%
2020    2      2.8%

Interactions


Correlations

                  구분    공공건축물   현황    준공(예정)연도
구분              1.000   1.000        0.000   0.044
공공건축물        1.000   1.000        1.000   1.000
현황              0.000   1.000        1.000   0.894
준공(예정)연도    0.044   1.000        0.894   1.000

        현황    구분
현황    1.000   0.000
구분    0.000   1.000

                  준공(예정)연도   구분    현황
준공(예정)연도    1.000            0.000   0.787
구분              0.000            1.000   0.000
현황              0.787            0.000   1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]
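
The two displays above can be approximated in a few lines. This is a minimal sketch, not the report's plotting code: it uses four of the last rows shown in the Sample section, with `None` standing in for `<NA>`, and the helper names `missing_per_column` and `nullity_matrix` are hypothetical.

```python
# Four rows copied from the Sample section; None stands for <NA>.
columns = ["구분", "공공건축물", "현황", "준공(예정)연도"]
rows = [
    ("문화", "도시건축박물관", "진행중", None),
    ("문화", "민속박물관", "착수 예정", None),
    ("체육", "종합체육시설 3생활권", "진행중", 2027),
    ("주거", "경로복지관", "준공", 2014),
]

def missing_per_column(columns, rows):
    """Return the number of missing (None) values for each column."""
    return {col: sum(row[i] is None for row in rows) for i, col in enumerate(columns)}

def nullity_matrix(rows):
    """Return a row-by-column grid with 1 for present values and 0 for missing ones."""
    return [[0 if value is None else 1 for value in row] for row in rows]

missing = missing_per_column(columns, rows)
matrix = nullity_matrix(rows)

for col in columns:
    print(col, missing[col])
for line in matrix:
    print("".join("#" if cell else "." for cell in line))
```

In this excerpt, as in the full dataset, only 준공(예정)연도 has gaps, and they occur in rows whose 현황 is not 준공.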

Sample

First rows

     구분       공공건축물                 현황   준공(예정)연도
0    중앙행정   정부세종청사               준공   2014
1    중앙행정   총리공관                   준공   2012
2    중앙행정   행정지원센터               준공   2014
3    중앙행정   대통령기록관               준공   2015
4    중앙행정   선관위                     준공   2018
5    중앙행정   정부세종청사 주차장        준공   2019
6    중앙행정   정부세종청사 문화관        준공   2019
7    중앙행정   세무서                     준공   2021
8    중앙행정   남부경찰서                 준공   2021
9    중앙행정   복합편의시설 체육관        준공   2021

Last rows

     구분       공공건축물                 현황        준공(예정)연도
62   문화       도시건축박물관             진행중      <NA>
63   문화       디자인박물관               진행중      <NA>
64   문화       디지털문화유산센터         진행중      <NA>
65   문화       국가기록박물관             진행중      <NA>
66   문화       민속박물관                 착수 예정   <NA>
67   문화       자연사박물관               착수 예정   <NA>
68   체육       종합체육시설 3생활권       진행중      2027
69   도시기능   산학연클러스터 지원센터    준공        2019
70   주거       행복아파트 2차             준공        2014
71   주거       경로복지관                 준공        2014