Overview

Dataset statistics

Number of variables: 8
Number of observations: 143
Missing cells: 12
Missing cells (%): 1.0%
Duplicate rows: 1
Duplicate rows (%): 0.7%
Total size in memory: 9.3 KiB
Average record size in memory: 66.9 B
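
The two percentages above follow from the raw counts. A minimal sketch, assuming the missing-cell rate is taken over all cells (variables × observations) and the duplicate rate over all rows; the variable names are illustrative.

```python
# Figures copied from the dataset statistics above.
n_variables = 8
n_observations = 143
missing_cells = 12
duplicate_rows = 1

# Assumption: the missing-cell rate is taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: the duplicate rate is taken over all rows.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"duplicate rows: {duplicate_pct:.1f}%")
```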

Variable types

Numeric: 2
Text: 3
Categorical: 2
DateTime: 1

Dataset

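Description: Data on school sports facilities in Seongnam City, consisting of fields such as school name (학교명), facility type (시설종류), school location (학교위치), year of installation (설치년도), and remarks (비고). ※ For details about school sports facilities, please contact the Gyeonggi-do Seongnam Office of Education (780-2500).
URL: https://www.data.go.kr/data/15000788/fileData.do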

Alerts

데이터기준일자 has constant value "" [Constant]
Dataset has 1 (0.7%) duplicate rows [Duplicates]
비고 is highly overall correlated with 번호 and 2 other fields [High correlation]
시설종류 is highly overall correlated with 비고 [High correlation]
번호 is highly overall correlated with 비고 [High correlation]
설치년도 is highly overall correlated with 비고 [High correlation]
시설종류 is highly imbalanced (67.6%) [Imbalance]
비고 is highly imbalanced (66.1%) [Imbalance]
번호 has 2 (1.4%) missing values [Missing]
학교명 has 2 (1.4%) missing values [Missing]
학교위치(지번) has 2 (1.4%) missing values [Missing]
학교위치(도로명) has 2 (1.4%) missing values [Missing]
설치년도 has 2 (1.4%) missing values [Missing]
데이터기준일자 has 2 (1.4%) missing values [Missing]
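
The imbalance percentages in the alerts are consistent with a normalized-entropy score, 1 − H / log2(k), where H is the Shannon entropy of the class frequencies and k is the number of classes. The sketch below is an assumption about how such a score can be computed, not a copy of the ydata-profiling implementation; `imbalance_score` is an illustrative name, and the counts are taken from the 시설종류 and 비고 value tables later in this report.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of class counts.

    0 means perfectly balanced; values near 1 mean one class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # convention chosen for this sketch when only one class exists
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

facility_counts = [125, 15, 2, 1]  # 시설종류: 체육관, 인조잔디 운동장, <NA>, 천연잔디 운동장
remark_counts = [134, 9]           # 비고: <NA>, 신규

print(f"시설종류: {100 * imbalance_score(facility_counts):.1f}%")
print(f"비고: {100 * imbalance_score(remark_counts):.1f}%")
```

Under this formula the two columns score about 67.6% and 66.1%, matching the alert text.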

Reproduction

Analysis started: 2023-12-12 02:36:55.226386
Analysis finished: 2023-12-12 02:36:56.462504
Duration: 1.24 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

번호
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 141
Distinct (%): 100.0%
Missing: 2
Missing (%): 1.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 71
Minimum: 1
Maximum: 141
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 8
Q1: 36
Median: 71
Q3: 106
95-th percentile: 134
Maximum: 141
Range: 140
Interquartile range (IQR): 70

Descriptive statistics

Standard deviation: 40.847277
Coefficient of variation (CV): 0.57531375
Kurtosis: -1.2
Mean: 71
Median Absolute Deviation (MAD): 35
Skewness: 0
Sum: 10011
Variance: 1668.5
Monotonicity: Strictly increasing
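
These statistics are what one would expect if 번호 is simply the running row number 1 to 141. That is an assumption (the report only shows the minimum, maximum, distinct count and strict monotonicity), but the sketch below reproduces the reported sum, mean, variance, standard deviation, coefficient of variation, quartiles and MAD from it using only the standard library. The `percentile` helper is illustrative and uses linear interpolation.

```python
import math
import statistics

# Assumption: 번호 holds the consecutive integers 1..141.
values = list(range(1, 142))

mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
std = math.sqrt(variance)
cv = std / mean                          # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)  # median absolute deviation

def percentile(sorted_values, q):
    """Linearly interpolated percentile of an ascending list, q in [0, 1]."""
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac

q1 = percentile(values, 0.25)
q3 = percentile(values, 0.75)

print(sum(values), mean, variance, round(std, 6), round(cv, 8))
print(q1, median, q3, q3 - q1, mad)
```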
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
98 | 1 | 0.7%
92 | 1 | 0.7%
93 | 1 | 0.7%
94 | 1 | 0.7%
95 | 1 | 0.7%
96 | 1 | 0.7%
97 | 1 | 0.7%
99 | 1 | 0.7%
90 | 1 | 0.7%
100 | 1 | 0.7%
Other values (131) | 131 | 91.6%
(Missing) | 2 | 1.4%

Value | Count | Frequency (%)
1 | 1 | 0.7%
2 | 1 | 0.7%
3 | 1 | 0.7%
4 | 1 | 0.7%
5 | 1 | 0.7%
6 | 1 | 0.7%
7 | 1 | 0.7%
8 | 1 | 0.7%
9 | 1 | 0.7%
10 | 1 | 0.7%

Value | Count | Frequency (%)
141 | 1 | 0.7%
140 | 1 | 0.7%
139 | 1 | 0.7%
138 | 1 | 0.7%
137 | 1 | 0.7%
136 | 1 | 0.7%
135 | 1 | 0.7%
134 | 1 | 0.7%
133 | 1 | 0.7%
132 | 1 | 0.7%

학교명
Text

MISSING 

Distinct: 133
Distinct (%): 94.3%
Missing: 2
Missing (%): 1.4%
Memory size: 1.2 KiB

Length

Max length: 11
Median length: 3
Mean length: 3.7092199
Min length: 3

Characters and Unicode

Total characters: 523
Distinct characters: 101
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 125
Unique (%): 88.7%
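
The percentage and mean-length figures for this text column appear to be computed over the 141 non-missing values rather than all 143 rows. A short check under that assumption, using the totals reported in this section (variable names are illustrative):

```python
n_rows = 143
n_missing = 2
n_present = n_rows - n_missing   # 141 non-missing values

distinct = 133
unique = 125
total_characters = 523

# Assumption: percentages and the mean length use the non-missing count as denominator.
distinct_pct = 100 * distinct / n_present
unique_pct = 100 * unique / n_present
mean_length = total_characters / n_present

print(f"distinct: {distinct_pct:.1f}%")
print(f"unique: {unique_pct:.1f}%")
print(f"mean length: {mean_length:.7f}")
```

Using 143 as the denominator instead would give 93.0% distinct, which does not match the report.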

Sample

1st row: 검단초
2nd row: 구미초
3rd row: 금빛초
4th row: 금상초
5th row: 금상초

Value | Count | Frequency (%)
태평초 | 2 | 1.4%
풍생고 | 2 | 1.4%
수내중 | 2 | 1.4%
양영디지털고 | 2 | 1.4%
성남중 | 2 | 1.4%
복정고 | 2 | 1.4%
분당영덕여자고 | 2 | 1.4%
금상초 | 2 | 1.4%
창성중(구 | 2 | 1.4%
장안중 | 1 | 0.7%
Other values (124) | 124 | 86.7%

Most occurring characters

ValueCountFrequency (%)
66
 
12.6%
53
 
10.1%
36
 
6.9%
31
 
5.9%
25
 
4.8%
11
 
2.1%
10
 
1.9%
9
 
1.7%
9
 
1.7%
9
 
1.7%
Other values (91) 264
50.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 517 | 98.9%
Close Punctuation | 2 | 0.4%
Space Separator | 2 | 0.4%
Open Punctuation | 2 | 0.4%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
66
 
12.8%
53
 
10.3%
36
 
7.0%
31
 
6.0%
25
 
4.8%
11
 
2.1%
10
 
1.9%
9
 
1.7%
9
 
1.7%
9
 
1.7%
Other values (88) 258
49.9%
Close Punctuation
Value | Count | Frequency (%)
) | 2 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 2 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 517 | 98.9%
Common | 6 | 1.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
66
 
12.8%
53
 
10.3%
36
 
7.0%
31
 
6.0%
25
 
4.8%
11
 
2.1%
10
 
1.9%
9
 
1.7%
9
 
1.7%
9
 
1.7%
Other values (88) 258
49.9%
Common
ValueCountFrequency (%)
) 2
33.3%
2
33.3%
( 2
33.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 517 | 98.9%
ASCII | 6 | 1.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
66
 
12.8%
53
 
10.3%
36
 
7.0%
31
 
6.0%
25
 
4.8%
11
 
2.1%
10
 
1.9%
9
 
1.7%
9
 
1.7%
9
 
1.7%
Other values (88) 258
49.9%
ASCII
ValueCountFrequency (%)
) 2
33.3%
2
33.3%
( 2
33.3%

시설종류
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB

Value | Count
체육관 | 125
인조잔디 운동장 | 15
<NA> | 2
천연잔디 운동장 | 1

Length

Max length: 8
Median length: 3
Mean length: 3.5734266
Min length: 3

Unique

Unique: 1
Unique (%): 0.7%

Sample

1st row: 체육관
2nd row: 체육관
3rd row: 체육관
4th row: 인조잔디 운동장
5th row: 체육관

Common Values

Value | Count | Frequency (%)
체육관 | 125 | 87.4%
인조잔디 운동장 | 15 | 10.5%
<NA> | 2 | 1.4%
천연잔디 운동장 | 1 | 0.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
체육관 | 125 | 78.6%
운동장 | 16 | 10.1%
인조잔디 | 15 | 9.4%
na | 2 | 1.3%
천연잔디 | 1 | 0.6%

학교위치(지번)
Text

MISSING 

Distinct: 130
Distinct (%): 92.2%
Missing: 2
Missing (%): 1.4%
Memory size: 1.2 KiB

Length

Max length: 22
Median length: 21
Mean length: 19.489362
Min length: 17

Characters and Unicode

Total characters: 2748
Distinct characters: 66
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
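
The category names used in these tables correspond to Unicode general categories, which Python exposes through the standard `unicodedata` module (`Lo` = Other Letter, `Zs` = Space Separator, `Nd` = Decimal Number, `Pd` = Dash Punctuation). A minimal sketch using one lot-number address from the sample rows of this report; the helper name `category_counts` is illustrative, and this is not necessarily how the profiler itself tallies characters.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

# Lot-number address taken from the sample rows of this report.
address = "경기도 성남시 분당구 정자동 178-5"
counts = category_counts(address)

for code, n in counts.most_common():
    print(code, n)
```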

Unique

Unique: 120
Unique (%): 85.1%

Sample

1st row: 경기도 성남시 중원구 하대원동 123
2nd row: 경기도 성남시 분당구 구미동 215
3rd row: 경기도 성남시 수정구 태평동 7289
4th row: 경기도 성남시 중원구 금광1동 1645
5th row: 경기도 성남시 중원구 금광1동 1645

Value | Count | Frequency (%)
경기도 | 141 | 20.0%
성남시 | 141 | 20.0%
분당구 | 76 | 10.8%
수정구 | 33 | 4.7%
중원구 | 32 | 4.5%
야탑동 | 11 | 1.6%
정자동 | 10 | 1.4%
성남동 | 8 | 1.1%
수내동 | 7 | 1.0%
창곡동 | 7 | 1.0%
Other values (155) | 239 | 33.9%

Most occurring characters

ValueCountFrequency (%)
564
20.5%
152
 
5.5%
149
 
5.4%
147
 
5.3%
143
 
5.2%
141
 
5.1%
141
 
5.1%
141
 
5.1%
141
 
5.1%
1 90
 
3.3%
Other values (56) 939
34.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1701 | 61.9%
Space Separator | 564 | 20.5%
Decimal Number | 462 | 16.8%
Dash Punctuation | 21 | 0.8%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
152
 
8.9%
149
 
8.8%
147
 
8.6%
143
 
8.4%
141
 
8.3%
141
 
8.3%
141
 
8.3%
141
 
8.3%
80
 
4.7%
80
 
4.7%
Other values (44) 386
22.7%
Decimal Number
Value | Count | Frequency (%)
1 | 90 | 19.5%
2 | 64 | 13.9%
5 | 57 | 12.3%
7 | 50 | 10.8%
3 | 49 | 10.6%
6 | 43 | 9.3%
4 | 43 | 9.3%
8 | 24 | 5.2%
9 | 22 | 4.8%
0 | 20 | 4.3%
Space Separator
Value | Count | Frequency (%)
(space) | 564 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 21 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1701 | 61.9%
Common | 1047 | 38.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
152
 
8.9%
149
 
8.8%
147
 
8.6%
143
 
8.4%
141
 
8.3%
141
 
8.3%
141
 
8.3%
141
 
8.3%
80
 
4.7%
80
 
4.7%
Other values (44) 386
22.7%
Common
ValueCountFrequency (%)
564
53.9%
1 90
 
8.6%
2 64
 
6.1%
5 57
 
5.4%
7 50
 
4.8%
3 49
 
4.7%
6 43
 
4.1%
4 43
 
4.1%
8 24
 
2.3%
9 22
 
2.1%
Other values (2) 41
 
3.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1701 | 61.9%
ASCII | 1047 | 38.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
564
53.9%
1 90
 
8.6%
2 64
 
6.1%
5 57
 
5.4%
7 50
 
4.8%
3 49
 
4.7%
6 43
 
4.1%
4 43
 
4.1%
8 24
 
2.3%
9 22
 
2.1%
Other values (2) 41
 
3.9%
Hangul
ValueCountFrequency (%)
152
 
8.9%
149
 
8.8%
147
 
8.6%
143
 
8.4%
141
 
8.3%
141
 
8.3%
141
 
8.3%
141
 
8.3%
80
 
4.7%
80
 
4.7%
Other values (44) 386
22.7%

학교위치(도로명)
Text

MISSING 

Distinct: 131
Distinct (%): 92.9%
Missing: 2
Missing (%): 1.4%
Memory size: 1.2 KiB

Length

Max length: 25
Median length: 23
Mean length: 20.340426
Min length: 17

Characters and Unicode

Total characters: 2868
Distinct characters: 102
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 121
Unique (%): 85.8%

Sample

1st row: 경기도 성남시 중원구 원터로 25
2nd row: 경기도 성남시 분당구 미금로22번길 25
3rd row: 경기도 성남시 수정구 남문로 112
4th row: 경기도 성남시 중원구 금상로 97
5th row: 경기도 성남시 중원구 금상로 97

Value | Count | Frequency (%)
경기도 | 141 | 20.0%
성남시 | 141 | 20.0%
분당구 | 76 | 10.8%
수정구 | 33 | 4.7%
중원구 | 31 | 4.4%
11 | 6 | 0.9%
동판교로 | 6 | 0.9%
금곡로 | 5 | 0.7%
7 | 5 | 0.7%
25 | 5 | 0.7%
Other values (184) | 256 | 36.3%

Most occurring characters

ValueCountFrequency (%)
564
19.7%
148
 
5.2%
145
 
5.1%
145
 
5.1%
143
 
5.0%
142
 
5.0%
141
 
4.9%
141
 
4.9%
141
 
4.9%
1 84
 
2.9%
Other values (92) 1074
37.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1852 | 64.6%
Space Separator | 564 | 19.7%
Decimal Number | 449 | 15.7%
Dash Punctuation | 3 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
148
 
8.0%
145
 
7.8%
145
 
7.8%
143
 
7.7%
142
 
7.7%
141
 
7.6%
141
 
7.6%
141
 
7.6%
79
 
4.3%
79
 
4.3%
Other values (80) 548
29.6%
Decimal Number
Value | Count | Frequency (%)
1 | 84 | 18.7%
2 | 60 | 13.4%
3 | 51 | 11.4%
4 | 47 | 10.5%
5 | 46 | 10.2%
7 | 42 | 9.4%
9 | 36 | 8.0%
6 | 29 | 6.5%
0 | 28 | 6.2%
8 | 26 | 5.8%
Space Separator
Value | Count | Frequency (%)
(space) | 564 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1852 | 64.6%
Common | 1016 | 35.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
148
 
8.0%
145
 
7.8%
145
 
7.8%
143
 
7.7%
142
 
7.7%
141
 
7.6%
141
 
7.6%
141
 
7.6%
79
 
4.3%
79
 
4.3%
Other values (80) 548
29.6%
Common
ValueCountFrequency (%)
564
55.5%
1 84
 
8.3%
2 60
 
5.9%
3 51
 
5.0%
4 47
 
4.6%
5 46
 
4.5%
7 42
 
4.1%
9 36
 
3.5%
6 29
 
2.9%
0 28
 
2.8%
Other values (2) 29
 
2.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1852 | 64.6%
ASCII | 1016 | 35.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
564
55.5%
1 84
 
8.3%
2 60
 
5.9%
3 51
 
5.0%
4 47
 
4.6%
5 46
 
4.5%
7 42
 
4.1%
9 36
 
3.5%
6 29
 
2.9%
0 28
 
2.8%
Other values (2) 29
 
2.9%
Hangul
ValueCountFrequency (%)
148
 
8.0%
145
 
7.8%
145
 
7.8%
143
 
7.7%
142
 
7.7%
141
 
7.6%
141
 
7.6%
141
 
7.6%
79
 
4.3%
79
 
4.3%
Other values (80) 548
29.6%

설치년도
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 33
Distinct (%): 23.4%
Missing: 2
Missing (%): 1.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 2009.9574
Minimum: 1976
Maximum: 2023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.4 KiB

Quantile statistics

Minimum: 1976
5-th percentile: 1994
Q1: 2006
Median: 2010
Q3: 2017
95-th percentile: 2022
Maximum: 2023
Range: 47
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 8.9177771
Coefficient of variation (CV): 0.0044367989
Kurtosis: 1.5353024
Mean: 2009.9574
Median Absolute Deviation (MAD): 6
Skewness: -0.94549189
Sum: 283404
Variance: 79.526748
Monotonicity: Not monotonic
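
Several of these descriptive statistics are tied together by simple identities (mean = sum / count, variance = standard deviation squared, CV = standard deviation / mean, IQR = Q3 − Q1). A quick consistency check on the reported figures, assuming the count is the 141 non-missing values; the variable names are illustrative.

```python
import math

# Figures copied from the 설치년도 statistics above.
total = 283404
n_present = 143 - 2            # assumption: statistics exclude the 2 missing values
std = 8.9177771
reported_mean = 2009.9574
reported_variance = 79.526748
reported_cv = 0.0044367989
minimum, q1, q3, maximum = 1976, 2006, 2017, 2023

mean = total / n_present
variance = std ** 2
cv = std / mean

print(f"mean {mean:.4f}, variance {variance:.4f}, CV {cv:.8f}")
print(f"range {maximum - minimum}, IQR {q3 - q1}")

# Each derived value should agree with the report to within rounding.
assert math.isclose(mean, reported_mean, abs_tol=1e-4)
assert math.isclose(variance, reported_variance, abs_tol=1e-5)
assert math.isclose(cv, reported_cv, abs_tol=1e-9)
```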
Histogram with fixed size bins (bins=33)
Value | Count | Frequency (%)
2010 | 22 | 15.4%
2009 | 16 | 11.2%
2021 | 9 | 6.3%
2016 | 8 | 5.6%
2005 | 8 | 5.6%
2019 | 7 | 4.9%
2020 | 7 | 4.9%
2008 | 6 | 4.2%
2006 | 5 | 3.5%
2022 | 5 | 3.5%
Other values (23) | 48 | 33.6%

Value | Count | Frequency (%)
1976 | 1 | 0.7%
1984 | 1 | 0.7%
1985 | 2 | 1.4%
1987 | 1 | 0.7%
1989 | 1 | 0.7%
1994 | 2 | 1.4%
1995 | 1 | 0.7%
1997 | 3 | 2.1%
1998 | 2 | 1.4%
1999 | 2 | 1.4%

Value | Count | Frequency (%)
2023 | 4 | 2.8%
2022 | 5 | 3.5%
2021 | 9 | 6.3%
2020 | 7 | 4.9%
2019 | 7 | 4.9%
2018 | 2 | 1.4%
2017 | 3 | 2.1%
2016 | 8 | 5.6%
2015 | 1 | 0.7%
2014 | 2 | 1.4%

비고
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB

Value | Count
<NA> | 134
신규 | 9

Length

Max length: 4
Median length: 4
Mean length: 3.8741259
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: 신규
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 134 | 93.7%
신규 | 9 | 6.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 134 | 93.7%
신규 | 9 | 6.3%

데이터기준일자
Date

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 0.7%
Missing: 2
Missing (%): 1.4%
Memory size: 1.2 KiB
Minimum: 2023-05-30 00:00:00
Maximum: 2023-05-30 00:00:00
Histogram with fixed size bins (bins=1)

Interactions


Correlations

Variable | 번호 | 시설종류 | 설치년도
번호 | 1.000 | 0.065 | 0.340
시설종류 | 0.065 | 1.000 | 0.802
설치년도 | 0.340 | 0.802 | 1.000

Variable | 비고 | 시설종류
비고 | 1.000 | 1.000
시설종류 | 1.000 | 1.000

Variable | 번호 | 설치년도 | 시설종류 | 비고
번호 | 1.000 | -0.323 | 0.000 | 1.000
설치년도 | -0.323 | 1.000 | 0.491 | 1.000
시설종류 | 0.000 | 0.491 | 1.000 | 1.000
비고 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

Index | 번호 | 학교명 | 시설종류 | 학교위치(지번) | 학교위치(도로명) | 설치년도 | 비고 | 데이터기준일자
0 | 1 | 검단초 | 체육관 | 경기도 성남시 중원구 하대원동 123 | 경기도 성남시 중원구 원터로 25 | 2021 | <NA> | 2023-05-30
1 | 2 | 구미초 | 체육관 | 경기도 성남시 분당구 구미동 215 | 경기도 성남시 분당구 미금로22번길 25 | 2023 | 신규 | 2023-05-30
2 | 3 | 금빛초 | 체육관 | 경기도 성남시 수정구 태평동 7289 | 경기도 성남시 수정구 남문로 112 | 2003 | <NA> | 2023-05-30
3 | 4 | 금상초 | 인조잔디 운동장 | 경기도 성남시 중원구 금광1동 1645 | 경기도 성남시 중원구 금상로 97 | 2008 | <NA> | 2023-05-30
4 | 5 | 금상초 | 체육관 | 경기도 성남시 중원구 금광1동 1645 | 경기도 성남시 중원구 금상로 97 | 2002 | <NA> | 2023-05-30
5 | 6 | 낙생초 | 체육관 | 경기도 성남시 분당구 판교동 551 | 경기도 성남시 분당구 서판교로74번길 11 | 2009 | <NA> | 2023-05-30
6 | 7 | 내정초 | 체육관 | 경기도 성남시 분당구 수내동 57 | 경기도 성남시 분당구 내정로174번길 19 | 2017 | <NA> | 2023-05-30
7 | 8 | 늘푸른초 | 체육관 | 경기도 성남시 분당구 정자동 178-5 | 경기도 성남시 분당구 정자일로 87 | 2005 | <NA> | 2023-05-30
8 | 9 | 단남초 | 체육관 | 경기도 성남시 중원구 금광동 210 | 경기도 성남시 중원구 광명로300번길 25 | 2022 | 신규 | 2023-05-30
9 | 10 | 단대초 | 체육관 | 경기도 성남시 수정구 단대동 77 | 경기도 성남시 수정구 단대로 30 | 2004 | <NA> | 2023-05-30

Index | 번호 | 학교명 | 시설종류 | 학교위치(지번) | 학교위치(도로명) | 설치년도 | 비고 | 데이터기준일자
133 | 134 | 이매고 | 체육관 | 경기도 성남시 분당구 이매동 147 | 경기도 성남시 분당구 탄천로 1 | 1998 | <NA> | 2023-05-30
134 | 135 | 태원고 | 체육관 | 경기도 성남시 분당구 야탑동 754-1 | 경기도 성남시 분당구 성남대로 822 | 1999 | <NA> | 2023-05-30
135 | 136 | 판교고 | 체육관 | 경기도 성남시 분당구 삼평동 706 | 경기도 성남시 분당구 동판교로 257 | 2011 | <NA> | 2023-05-30
136 | 137 | 판교대장초 | 체육관 | 경기도 성남시 분당구 대장동 155 | 경기도 성남시 분당구 판교대장로5길 71 | 2021 | <NA> | 2023-05-30
137 | 138 | 풍생고 | 인조잔디 운동장 | 경기도 성남시 수정구 수진동 4511-1 | 경기도 성남시 수정구 산성대로 111 | 2008 | <NA> | 2023-05-30
138 | 139 | 풍생고 | 체육관 | 경기도 성남시 수정구 수진동 4511-1 | 경기도 성남시 수정구 산성대로 111 | 1984 | <NA> | 2023-05-30
139 | 140 | 한솔고 | 체육관 | 경기도 성남시 분당구 정자동 115 | 경기도 성남시 분당구 내정로 70 | 2006 | <NA> | 2023-05-30
140 | 141 | 효성고 | 체육관 | 경기도 성남시 수정구 심곡동 321 | 경기도 성남시 수정구 심곡로 16 | 2006 | <NA> | 2023-05-30
141 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
142 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

Index | 번호 | 학교명 | 시설종류 | 학교위치(지번) | 학교위치(도로명) | 설치년도 | 비고 | 데이터기준일자 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2
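
The duplicate reported here is a pair of rows in which every field is empty (the last two rows of the sample), which also appears to account for the two missing values reported per column. One way to avoid this is to drop fully empty rows before profiling. A minimal standard-library sketch, assuming the source is a comma-separated file with blank trailing records; the inline excerpt (a subset of columns) and the function name `drop_empty_rows` are illustrative.

```python
import csv
import io

def drop_empty_rows(rows):
    """Keep only rows in which at least one field is non-blank."""
    return [row for row in rows if any(field.strip() for field in row)]

# Illustrative excerpt: two records from the sample plus two blank trailing records.
raw = (
    "번호,학교명,시설종류,설치년도\n"
    "140,한솔고,체육관,2006\n"
    "141,효성고,체육관,2006\n"
    ",,,\n"
    ",,,\n"
)

header, *records = list(csv.reader(io.StringIO(raw)))
cleaned = drop_empty_rows(records)

print(len(records), "records before,", len(cleaned), "after")
```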