Overview

Dataset statistics

Number of variables: 8
Number of observations: 22
Missing cells: 26
Missing cells (%): 14.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.6 KiB
Average record size in memory: 73.0 B

Variable types

Text: 2
Numeric: 3
DateTime: 2
Categorical: 1

Dataset

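Description: Construction status of the multi-purpose community centers (복합커뮤니티센터) in Haengbok City (행복도시). Each center combines public amenities such as administrative, cultural, childcare, and sports facilities, and is built to match the move-in schedule of its residential district (생활권).
URL: https://www.data.go.kr/data/15091178/fileData.do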

Alerts

부지면적 is highly correlated overall with 연면적 and 1 other field (High correlation)
연면적 is highly correlated overall with 부지면적 and 1 other field (High correlation)
사업비 is highly correlated overall with 부지면적 and 2 other fields (High correlation)
현황 is highly correlated overall with 사업비 (High correlation)
사업비 has 3 (13.6%) missing values (Missing)
준공일 has 3 (13.6%) missing values (Missing)
개청일 has 6 (27.3%) missing values (Missing)
비고 has 14 (63.6%) missing values (Missing)
사업명 has unique values (Unique)
부지면적 has unique values (Unique)
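
The Missing alerts can be cross-checked against the headline figures in Dataset statistics. The sketch below sums the per-column missing counts from the alerts and divides by the number of cells (variables × observations). It assumes that columns without a Missing alert contribute no missing cells; the variable names are illustrative only.

```python
# Per-column missing counts, copied from the Missing alerts.
# Assumption: columns without a Missing alert have zero missing values.
missing_per_column = {"사업비": 3, "준공일": 3, "개청일": 6, "비고": 14}

n_variables = 8
n_observations = 22

total_cells = n_variables * n_observations
missing_cells = sum(missing_per_column.values())
missing_pct = 100 * missing_cells / total_cells

print(f"missing cells: {missing_cells} ({missing_pct:.1f}%)")

# Per-column percentages use the number of observations as the denominator.
for column, count in missing_per_column.items():
    print(f"{column}: {100 * count / n_observations:.1f}%")
```

Under that assumption the total comes to 26 missing cells out of 176, which agrees with the 14.8% reported above.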

Reproduction

Analysis started: 2023-12-12 20:11:00.637760
Analysis finished: 2023-12-12 20:11:01.899963
Duration: 1.26 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

사업명
Text

UNIQUE 

Distinct: 22
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 308.0 B

Length

Max length: 10
Median length: 6
Mean length: 6.3181818
Min length: 6

Characters and Unicode

Total characters: 139
Distinct characters: 15
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
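
As a small illustration of these character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It uses only the five 사업명 values shown under Sample, so its counts cover those five rows rather than the full column. The helper name `category_counts` is hypothetical and is not part of the profiling tool.

```python
import unicodedata
from collections import Counter

# First five values of 사업명, copied from the Sample rows of this variable.
names = ["1-2생활권", "1-3생활권", "1-4생활권", "1-5생활권", "2-3생활권"]


def category_counts(values):
    """Count characters by Unicode general category abbreviation."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


counts = category_counts(names)

# Lo = Other Letter, Nd = Decimal Number, Pd = Dash Punctuation
for abbreviation, count in counts.most_common():
    print(abbreviation, count)
```

Each of these names contributes three Hangul letters, two digits, and one hyphen, which is the same 3:2:1 ratio seen in the full-column counts of 66, 44, and 22.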

Unique

Unique: 22
Unique (%): 100.0%

Sample

1st row: 1-2생활권
2nd row: 1-3생활권
3rd row: 1-4생활권
4th row: 1-5생활권
5th row: 2-3생활권
Value               Count  Frequency (%)
1-2생활권           1      4.5%
1-3생활권           1      4.5%
6-1생활권           1      4.5%
5-3생활권           1      4.5%
5-2생활권           1      4.5%
6-3생활권           1      4.5%
4-2생활권           1      4.5%
5-1생활권           1      4.5%
2-4생활권           1      4.5%
6-4생활권           1      4.5%
Other values (12)   12     54.5%

Most occurring characters

Value              Count  Frequency (%)
-                  22     15.8%
생                 22     15.8%
활                 22     15.8%
권                 22     15.8%
1                  13     9.4%
2                  10     7.2%
3                  8      5.8%
4                  5      3.6%
5                  4      2.9%
6                  4      2.9%
Other values (5)   7      5.0%

Most occurring categories

Value               Count  Frequency (%)
Other Letter        66     47.5%
Decimal Number      44     31.7%
Dash Punctuation    22     15.8%
Open Punctuation    2      1.4%
Close Punctuation   2      1.4%
Uppercase Letter    2      1.4%
Space Separator     1      0.7%

Most frequent character per category

Decimal Number
Value     Count  Frequency (%)
1         13     29.5%
2         10     22.7%
3         8      18.2%
4         5      11.4%
5         4      9.1%
6         4      9.1%

Other Letter
Value     Count  Frequency (%)
생        22     33.3%
활        22     33.3%
권        22     33.3%

Uppercase Letter
Value     Count  Frequency (%)
A         1      50.0%
B         1      50.0%

Dash Punctuation
Value     Count  Frequency (%)
-         22     100.0%

Open Punctuation
Value     Count  Frequency (%)
(         2      100.0%

Close Punctuation
Value     Count  Frequency (%)
)         2      100.0%

Space Separator
Value     Count  Frequency (%)
(space)   1      100.0%

Most occurring scripts

Value    Count  Frequency (%)
Common   71     51.1%
Hangul   66     47.5%
Latin    2      1.4%

Most frequent character per script

Common
Value     Count  Frequency (%)
-         22     31.0%
1         13     18.3%
2         10     14.1%
3         8      11.3%
4         5      7.0%
5         4      5.6%
6         4      5.6%
(         2      2.8%
)         2      2.8%
(space)   1      1.4%

Hangul
Value     Count  Frequency (%)
생        22     33.3%
활        22     33.3%
권        22     33.3%

Latin
Value     Count  Frequency (%)
A         1      50.0%
B         1      50.0%

Most occurring blocks

Value    Count  Frequency (%)
ASCII    73     52.5%
Hangul   66     47.5%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
-                  22     30.1%
1                  13     17.8%
2                  10     13.7%
3                  8      11.0%
4                  5      6.8%
5                  4      5.5%
6                  4      5.5%
(                  2      2.7%
)                  2      2.7%
A                  1      1.4%
Other values (2)   2      2.7%

Hangul
Value              Count  Frequency (%)
생                 22     33.3%
활                 22     33.3%
권                 22     33.3%

부지면적
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 22
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12993.636
Minimum: 6781
Maximum: 52539
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 330.0 B

Quantile statistics

Minimum: 6781
5-th percentile: 7898.95
Q1: 8130.75
Median: 10142
Q3: 14029.25
95-th percentile: 25809.5
Maximum: 52539
Range: 45758
Interquartile range (IQR): 5898.5
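
The quantiles above can be reproduced from the 22 부지면적 values that appear in this variable's value tables. The sketch below uses linear interpolation between the two nearest ranks. That choice of interpolation is an assumption, although it does yield the reported figures. The `quantile` helper is hypothetical and written only for this illustration.

```python
# All 22 부지면적 values, assembled from the value tables of this variable.
site_area = [
    6781, 7898, 7917, 8057, 8086, 8109, 8196, 8243, 8284, 9925, 10074,
    10210, 10763, 10957, 11158, 11171, 14982, 15009, 15395, 15768, 26338, 52539,
]


def quantile(values, q):
    """Return the q-quantile using linear interpolation between ranks."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


q1 = quantile(site_area, 0.25)
median = quantile(site_area, 0.50)
q3 = quantile(site_area, 0.75)
iqr = q3 - q1

print(q1, median, q3, iqr)  # 8130.75 10142.0 14029.25 5898.5
```

Because the maximum (52539) is roughly twice the next-largest value, the range is dominated by a single observation, while the IQR stays close to the spread of the typical sites.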

Descriptive statistics

Standard deviation: 9844.7652
Coefficient of variation (CV): 0.75766052
Kurtosis: 13.284969
Mean: 12993.636
Median Absolute Deviation (MAD): 2044.5
Skewness: 3.4576319
Sum: 285860
Variance: 96919403
Monotonicity: Not monotonic
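
Several of these descriptive statistics follow directly from one another. The sketch below recomputes the standard deviation, coefficient of variation, and median absolute deviation with Python's `statistics` module. It assumes the sample (n - 1) form of the standard deviation, which is the form that matches the reported value. It also takes MAD as the median of absolute deviations from the median, with no scaling factor.

```python
import statistics

# All 22 부지면적 values, assembled from the value tables of this variable.
site_area = [
    6781, 7898, 7917, 8057, 8086, 8109, 8196, 8243, 8284, 9925, 10074,
    10210, 10763, 10957, 11158, 11171, 14982, 15009, 15395, 15768, 26338, 52539,
]

mean = statistics.fmean(site_area)
std = statistics.stdev(site_area)  # sample standard deviation (n - 1)
cv = std / mean                    # coefficient of variation

median = statistics.median(site_area)
mad = statistics.median([abs(x - median) for x in site_area])

print(f"mean={mean:.3f} std={std:.4f} cv={cv:.8f} mad={mad}")
```

The gap between the standard deviation (about 9845) and the MAD (2044.5) reflects the same single large value that drives the skewness and kurtosis figures.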
Histogram with fixed size bins (bins=22)
Value               Count  Frequency (%)
26338               1      4.5%
7917                1      4.5%
10957               1      4.5%
10210               1      4.5%
8284                1      4.5%
15768               1      4.5%
11171               1      4.5%
15009               1      4.5%
10074               1      4.5%
6781                1      4.5%
Other values (12)   12     54.5%

Value               Count  Frequency (%)
6781                1      4.5%
7898                1      4.5%
7917                1      4.5%
8057                1      4.5%
8086                1      4.5%
8109                1      4.5%
8196                1      4.5%
8243                1      4.5%
8284                1      4.5%
9925                1      4.5%

Value               Count  Frequency (%)
52539               1      4.5%
26338               1      4.5%
15768               1      4.5%
15395               1      4.5%
15009               1      4.5%
14982               1      4.5%
11171               1      4.5%
11158               1      4.5%
10957               1      4.5%
10763               1      4.5%

연면적
Real number (ℝ)

HIGH CORRELATION 

Distinct: 20
Distinct (%): 90.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15454.909
Minimum: 11793
Maximum: 51946
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 330.0 B

Quantile statistics

Minimum: 11793
5-th percentile: 11800.9
Q1: 12245.25
Median: 12600
Q3: 15448.25
95-th percentile: 18959.15
Maximum: 51946
Range: 40153
Interquartile range (IQR): 3203

Descriptive statistics

Standard deviation: 8435.6495
Coefficient of variation (CV): 0.5458233
Kurtosis: 18.742227
Mean: 15454.909
Median Absolute Deviation (MAD): 728
Skewness: 4.2079081
Sum: 340008
Variance: 71160183
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=20)
Value               Count  Frequency (%)
12600               2      9.1%
11793               2      9.1%
19015               1      4.5%
12564               1      4.5%
15860               1      4.5%
17898               1      4.5%
16298               1      4.5%
12581               1      4.5%
12420               1      4.5%
12187               1      4.5%
Other values (10)   10     45.5%

Value               Count  Frequency (%)
11793               2      9.1%
11951               1      4.5%
12023               1      4.5%
12075               1      4.5%
12187               1      4.5%
12420               1      4.5%
12502               1      4.5%
12564               1      4.5%
12581               1      4.5%
12600               2      9.1%

Value               Count  Frequency (%)
51946               1      4.5%
19015               1      4.5%
17898               1      4.5%
17489               1      4.5%
16298               1      4.5%
15860               1      4.5%
14213               1      4.5%
14100               1      4.5%
13461               1      4.5%
12639               1      4.5%

사업비
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 19
Distinct (%): 100.0%
Missing: 3
Missing (%): 13.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 462.21053
Minimum: 298
Maximum: 1047
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 330.0 B

Quantile statistics

Minimum: 298
5-th percentile: 300.7
Q1: 370
Median: 404
Q3: 484
95-th percentile: 741
Maximum: 1047
Range: 749
Interquartile range (IQR): 114

Descriptive statistics

Standard deviation: 181.17106
Coefficient of variation (CV): 0.39196654
Kurtosis: 5.4106179
Mean: 462.21053
Median Absolute Deviation (MAD): 72
Skewness: 2.1557595
Sum: 8782
Variance: 32822.953
Monotonicity: Not monotonic
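
사업비 is the only numeric variable with missing values, so its percentages use two different denominators. The sketch below rebuilds the column from the value tables of this variable, with `None` standing in for the 3 missing entries. The position of the missing entries within the list is arbitrary here and does not affect the results. The figures suggest that the mean, median, and Distinct (%) are taken over the 19 observed values, while Missing (%) is taken over all 22 rows.

```python
import statistics

# The 19 observed 사업비 values from the value tables, plus 3 missing entries.
cost = [
    298, 301, 316, 329, 358, 382, 385, 386, 392, 404,
    406, 411, 454, 476, 492, 578, 660, 707, 1047,
    None, None, None,
]

observed = [value for value in cost if value is not None]

missing_pct = 100 * (len(cost) - len(observed)) / len(cost)   # over all rows
distinct_pct = 100 * len(set(observed)) / len(observed)       # over observed rows

mean = statistics.fmean(observed)
median = statistics.median(observed)

print(f"missing={missing_pct:.1f}% distinct={distinct_pct:.1f}%")
print(f"sum={sum(observed)} mean={mean:.5f} median={median}")
```

This is also why each observed value shows a frequency of 4.5% (1 of 22) in the value table while Distinct (%) is still 100.0% (19 of 19).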
Histogram with fixed size bins (bins=19)
Value              Count  Frequency (%)
298                1      4.5%
707                1      4.5%
476                1      4.5%
660                1      4.5%
454                1      4.5%
404                1      4.5%
411                1      4.5%
392                1      4.5%
406                1      4.5%
578                1      4.5%
Other values (9)   9      40.9%
(Missing)          3      13.6%

Value              Count  Frequency (%)
298                1      4.5%
301                1      4.5%
316                1      4.5%
329                1      4.5%
358                1      4.5%
382                1      4.5%
385                1      4.5%
386                1      4.5%
392                1      4.5%
404                1      4.5%

Value              Count  Frequency (%)
1047               1      4.5%
707                1      4.5%
660                1      4.5%
578                1      4.5%
492                1      4.5%
476                1      4.5%
454                1      4.5%
411                1      4.5%
406                1      4.5%
404                1      4.5%

준공일
Date

MISSING 

Distinct: 16
Distinct (%): 84.2%
Missing: 3
Missing (%): 13.6%
Memory size: 308.0 B
Minimum: 2012-01-01 00:00:00
Maximum: 2027-12-01 00:00:00
Histogram with fixed size bins (bins=16)

개청일
Date

MISSING 

Distinct: 14
Distinct (%): 87.5%
Missing: 6
Missing (%): 27.3%
Memory size: 308.0 B
Minimum: 2012-07-01 00:00:00
Maximum: 2025-12-31 00:00:00
Histogram with fixed size bins (bins=14)

현황
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 22.7%
Missing: 0
Missing (%): 0.0%
Memory size: 308.0 B

Length

Max length: 4
Median length: 2
Mean length: 2.2727273
Min length: 2

Unique

Unique: 2
Unique (%): 9.1%

Sample

1st row: 준공
2nd row: 준공
3rd row: 준공
4th row: 준공
5th row: 준공

Common Values

Value                       Count  Frequency (%)
준공 (completed)            15     68.2%
<NA>                        3      13.6%
공사 (under construction)   2      9.1%
설계 (design)               1      4.5%
기획 (planning)             1      4.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
준공    15     68.2%
na      3      13.6%
공사    2      9.1%
설계    1      4.5%
기획    1      4.5%
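
The 현황 summary reports 0 missing values even though `<NA>` appears 3 times among the common values. One reading that fits the reported numbers is that `<NA>` is counted as an ordinary four-character category string here. This is an inference from the statistics, not something the report states. The sketch below rebuilds the column from the Common Values counts and recomputes the Distinct, Unique, and length figures under that reading.

```python
from collections import Counter

# 현황 values rebuilt from the Common Values counts.
# Assumption: "<NA>" is treated as a literal string, not as a missing value.
status = ["준공"] * 15 + ["<NA>"] * 3 + ["공사"] * 2 + ["설계", "기획"]

counts = Counter(status)

distinct = len(counts)                                   # number of categories
unique = sum(1 for n in counts.values() if n == 1)       # categories seen once
lengths = [len(value) for value in status]
mean_length = sum(lengths) / len(lengths)

print(distinct, unique, max(lengths), min(lengths), round(mean_length, 7))
```

With `<NA>` counted as a string of length 4, the mean length is 50 / 22 ≈ 2.2727273 and the maximum length is 4, both matching the Length statistics for this variable.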

비고
Text

MISSING 

Distinct: 6
Distinct (%): 75.0%
Missing: 14
Missing (%): 63.6%
Memory size: 308.0 B

Length

Max length: 15
Median length: 13
Mean length: 9.625
Min length: 6

Characters and Unicode

Total characters: 77
Distinct characters: 35
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): 62.5%

Sample

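1st row: 2023년 3월 개청예정 (opening planned for March 2023)
2nd row: 2022년 12월 공사 발주 (construction ordered in December 2022)
3rd row: 부지조성 중 (site preparation in progress)
4th row: 실시설계 중 (detailed design in progress)
5th row: 기본계획 수립 추진 (master plan being drawn up)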
Value              Count  Frequency (%)
24년               3      13.0%
이후               3      13.0%
건립               3      13.0%
중                 2      8.7%
2023년             1      4.3%
3월                1      4.3%
개청예정           1      4.3%
2022년             1      4.3%
12월               1      4.3%
공사               1      4.3%
Other values (6)   6      26.1%

Most occurring characters

ValueCountFrequency (%)
15
19.5%
2 9
 
11.7%
5
 
6.5%
4
 
5.2%
3
 
3.9%
3
 
3.9%
3
 
3.9%
4 3
 
3.9%
2
 
2.6%
2
 
2.6%
Other values (25) 28
36.4%

Most occurring categories

Value             Count  Frequency (%)
Other Letter      45     58.4%
Decimal Number    17     22.1%
Space Separator   15     19.5%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5
 
11.1%
4
 
8.9%
3
 
6.7%
3
 
6.7%
3
 
6.7%
2
 
4.4%
2
 
4.4%
2
 
4.4%
1
 
2.2%
1
 
2.2%
Other values (19) 19
42.2%
Decimal Number
Value     Count  Frequency (%)
2         9      52.9%
4         3      17.6%
3         2      11.8%
0         2      11.8%
1         1      5.9%

Space Separator
Value     Count  Frequency (%)
(space)   15     100.0%

Most occurring scripts

Value    Count  Frequency (%)
Hangul   45     58.4%
Common   32     41.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5
 
11.1%
4
 
8.9%
3
 
6.7%
3
 
6.7%
3
 
6.7%
2
 
4.4%
2
 
4.4%
2
 
4.4%
1
 
2.2%
1
 
2.2%
Other values (19) 19
42.2%
Common
Value     Count  Frequency (%)
(space)   15     46.9%
2         9      28.1%
4         3      9.4%
3         2      6.2%
0         2      6.2%
1         1      3.1%

Most occurring blocks

Value    Count  Frequency (%)
Hangul   45     58.4%
ASCII    32     41.6%

Most frequent character per block

ASCII
Value     Count  Frequency (%)
(space)   15     46.9%
2         9      28.1%
4         3      9.4%
3         2      6.2%
0         2      6.2%
1         1      3.1%
Hangul
ValueCountFrequency (%)
5
 
11.1%
4
 
8.9%
3
 
6.7%
3
 
6.7%
3
 
6.7%
2
 
4.4%
2
 
4.4%
2
 
4.4%
1
 
2.2%
1
 
2.2%
Other values (19) 19
42.2%

Interactions


Correlations

           사업명   부지면적   연면적   사업비   준공일   개청일   현황    비고
사업명     1.000    1.000      1.000    1.000    1.000    1.000    1.000   1.000
부지면적   1.000    1.000      0.791    0.900    0.926    0.767    0.000   1.000
연면적     1.000    0.791      1.000    0.918    0.936    0.668    0.423   0.249
사업비     1.000    0.900      0.918    1.000    0.721    0.580    0.794   1.000
준공일     1.000    0.926      0.936    0.721    1.000    1.000    0.000   1.000
개청일     1.000    0.767      0.668    0.580    1.000    1.000    0.000   NaN
현황       1.000    0.000      0.423    0.794    0.000    0.000    1.000   1.000
비고       1.000    1.000      0.249    1.000    1.000    NaN      1.000   1.000
           부지면적   연면적   사업비   현황
부지면적   1.000      0.575    0.728    0.000
연면적     0.575      1.000    0.656    0.000
사업비     0.728      0.656    1.000    0.618
현황       0.000      0.000    0.618    1.000
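
The four-variable matrix above lines up with the High correlation alerts in the Overview. The sketch below lists, for each variable, the other variables whose coefficient meets a threshold. The threshold of 0.5 is an assumption chosen for illustration, not a value taken from this report, and the `highly_correlated` helper is hypothetical.

```python
# Four-variable correlation matrix, copied from the table above.
columns = ["부지면적", "연면적", "사업비", "현황"]
matrix = [
    [1.000, 0.575, 0.728, 0.000],
    [0.575, 1.000, 0.656, 0.000],
    [0.728, 0.656, 1.000, 0.618],
    [0.000, 0.000, 0.618, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off for flagging a pair as highly correlated


def highly_correlated(names, values, threshold):
    """Map each variable to the other variables at or above the threshold."""
    result = {}
    for i, name in enumerate(names):
        partners = [
            names[j]
            for j, coefficient in enumerate(values[i])
            if j != i and coefficient >= threshold
        ]
        if partners:
            result[name] = partners
    return result


for name, partners in highly_correlated(columns, matrix, THRESHOLD).items():
    print(f"{name}: {', '.join(partners)}")
```

With this threshold, 사업비 is flagged against three variables and 현황 only against 사업비, which matches the counts given in the alerts.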

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

Columns: 사업명 (project name), 부지면적 (site area), 연면적 (gross floor area), 사업비 (project cost), 준공일 (completion date), 개청일 (opening date), 현황 (status), 비고 (remarks)

     사업명          부지면적   연면적   사업비   준공일    개청일       현황   비고
0    1-2생활권       26338      19015    578      2013-11   2015-01-26   준공   <NA>
1    1-3생활권       8109       12502    298      2015-04   2016-04-18   준공   <NA>
2    1-4생활권       11158      11951    358      2013-09   2014-02-10   준공   <NA>
3    1-5생활권       14982      13461    382      2013-07   <NA>         준공   <NA>
4    2-3생활권       52539      51946    1047     2012-01   2012-07-01   준공   <NA>
5    1-1생활권(A)    8057       11793    301      2016-10   2017-02-20   준공   <NA>
6    3-2생활권       15395      17489    492      2016-11   2017-02-27   준공   <NA>
7    3-1생활권       7898       12023    316      2017-10   2018-07-16   준공   <NA>
8    1-1생활권(B)    8243       12075    329      2018-04   2018-07-26   준공   <NA>
9    2-2생활권       8086       14100    385      2018-03   2018-03-19   준공   <NA>

     사업명          부지면적   연면적   사업비   준공일    개청일       현황   비고
12   4-1생활권       7917       12639    392      2021-07   2021-11-05   준공   <NA>
13   6-4생활권       9925       12187    411      2021-06   2021-10-01   준공   <NA>
14   2-4생활권       6781       12420    404      2022-11   <NA>         준공   2023년 3월 개청예정
15   5-1생활권       10074      12581    454      2025-12   2025-12-31   공사   2022년 12월 공사 발주
16   4-2생활권       15009      16298    660      2025-12   2025-12-31   공사   부지조성 중
17   6-3생활권       11171      11793    476      2025-12   2025-12-31   설계   실시설계 중
18   5-2생활권       15768      17898    707      2027-12   <NA>         기획   기본계획 수립 추진
19   5-3생활권       8284       12600    <NA>     <NA>      <NA>         <NA>   24년 이후 건립
20   6-1생활권       10210      12600    <NA>     <NA>      <NA>         <NA>   24년 이후 건립
21   6-2생활권       10957      15860    <NA>     <NA>      <NA>         <NA>   24년 이후 건립