Overview

Dataset statistics

Number of variables: 10
Number of observations: 2288
Missing cells: 2298
Missing cells (%): 10.0%
Duplicate rows: 2
Duplicate rows (%): 0.1%
Total size in memory: 185.6 KiB
Average record size in memory: 83.1 B
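
The percentages and the average record size follow directly from the raw counts. A minimal sanity check, assuming 1 KiB = 1024 bytes (the variable names are illustrative and not part of the report):

```python
# Raw counts as reported in the dataset statistics.
n_variables = 10
n_observations = 2288
missing_cells = 2298
duplicate_rows = 2
total_size_kib = 185.6

# Share of missing cells over all cells (rows x columns).
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Share of duplicate rows over all rows.
duplicate_pct = 100 * duplicate_rows / n_observations

# Average record size in bytes (assuming 1 KiB = 1024 bytes).
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{missing_pct:.1f}% missing, {duplicate_pct:.1f}% duplicates, "
      f"{avg_record_bytes:.1f} B per record")
```

Of the 2298 missing cells, 2288 come from 비고 (entirely empty) and 10 from 예산액(천원).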

Variable types

Categorical: 6
Text: 2
Numeric: 1
Unsupported: 1

Dataset

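Description: Status of school facility environment improvement projects run by the Gangwon State Office of Education (강원특별자치도교육청). Provides the institution (school) name, project name, budget amount, and related fields.
URL: https://www.data.go.kr/data/15117205/fileData.do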

Alerts

연도 has constant value "2023" (Constant)
예산구분 has constant value "본예산" (Constant)
Dataset has 2 (0.1%) duplicate rows (Duplicates)
세세부사업 is highly overall correlated with 예산액(천원) and 2 other fields (High correlation)
세부사업 is highly overall correlated with 세세부사업 (High correlation)
예산액(천원) is highly overall correlated with 세세부사업 (High correlation)
급별 is highly overall correlated with 세세부사업 (High correlation)
세부사업 is highly imbalanced (75.8%) (Imbalance)
비고 has 2288 (100.0%) missing values (Missing)
예산액(천원) is highly skewed (γ1 = 21.91314061) (Skewed)
비고 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)

Reproduction

Analysis started: 2023-12-11 23:33:48.607091
Analysis finished: 2023-12-11 23:33:49.562303
Duration: 0.96 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연도
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 18.0 KiB

2023: 2288

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023
2nd row: 2023
3rd row: 2023
4th row: 2023
5th row: 2023

Common Values

Value | Count | Frequency (%)
2023 | 2288 | 100.0%

Length

Histogram of lengths of the category


예산구분
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 18.0 KiB

본예산: 2288

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 본예산
2nd row: 본예산
3rd row: 본예산
4th row: 본예산
5th row: 본예산

Common Values

Value | Count | Frequency (%)
본예산 | 2288 | 100.0%

Length

Histogram of lengths of the category


지역
Categorical

Distinct: 18
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 18.0 KiB

원주: 434
강릉: 385
춘천: 256
속초양양: 152
홍천: 134
Other values (13): 927

Length

Max length: 4
Median length: 2
Mean length: 2.1328671
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 평창
2nd row: 양구
3rd row: 인제
4th row: 춘천
5th row: 춘천

Common Values

Value | Count | Frequency (%)
원주 | 434 | 19.0%
강릉 | 385 | 16.8%
춘천 | 256 | 11.2%
속초양양 | 152 | 6.6%
홍천 | 134 | 5.9%
삼척 | 99 | 4.3%
동해 | 96 | 4.2%
철원 | 92 | 4.0%
태백 | 91 | 4.0%
영월 | 84 | 3.7%
Other values (8) | 465 | 20.3%

Length

Histogram of lengths of the category

급별
Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 18.0 KiB
1125 
541 
469 
직속기관
 
57
 
35
Other values (3)
 
61

Length

Max length: 5
Median length: 1
Mean length: 1.1368007
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

ValueCountFrequency (%)
1125
49.2%
541
23.6%
469
20.5%
직속기관 57
 
2.5%
35
 
1.5%
교육지원청 25
 
1.1%
22
 
1.0%
도교육청 14
 
0.6%

Length

Histogram of lengths of the category


기관(학교)명
Text

Distinct: 606
Distinct (%): 26.5%
Missing: 0
Missing (%): 0.0%
Memory size: 18.0 KiB

Length

Max length: 11
Median length: 3
Mean length: 3.5699301
Min length: 3

Characters and Unicode

Total characters: 8168
Distinct characters: 215
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 162
Unique (%): 7.1%

Sample

1st row: 도성초
2nd row: 한전초병설유
3rd row: 상남초병설유
4th row: 교동초
5th row: 금산초

Value | Count | Frequency (%)
중앙초 | 27 | 1.2%
학교시 | 23 | 1.0%
강릉중앙고 | 19 | 0.8%
남산초 | 18 | 0.8%
율곡초 | 17 | 0.7%
경포고 | 17 | 0.7%
원주고 | 16 | 0.7%
진광고 | 15 | 0.7%
강릉제일고 | 15 | 0.7%
강원과학고 | 15 | 0.7%
Other values (596) | 2106 | 92.0%

Most occurring characters

ValueCountFrequency (%)
1165
 
14.3%
558
 
6.8%
524
 
6.4%
344
 
4.2%
193
 
2.4%
185
 
2.3%
167
 
2.0%
165
 
2.0%
151
 
1.8%
124
 
1.5%
Other values (205) 4592
56.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 8164
> 99.9%
Open Punctuation 2
 
< 0.1%
Close Punctuation 2
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1165
 
14.3%
558
 
6.8%
524
 
6.4%
344
 
4.2%
193
 
2.4%
185
 
2.3%
167
 
2.0%
165
 
2.0%
151
 
1.8%
124
 
1.5%
Other values (203) 4588
56.2%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 8164
> 99.9%
Common 4
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1165
 
14.3%
558
 
6.8%
524
 
6.4%
344
 
4.2%
193
 
2.4%
185
 
2.3%
167
 
2.0%
165
 
2.0%
151
 
1.8%
124
 
1.5%
Other values (203) 4588
56.2%
Common
ValueCountFrequency (%)
( 2
50.0%
) 2
50.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 8164
> 99.9%
ASCII 4
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1165
 
14.3%
558
 
6.8%
524
 
6.4%
344
 
4.2%
193
 
2.4%
185
 
2.3%
167
 
2.0%
165
 
2.0%
151
 
1.8%
124
 
1.5%
Other values (203) 4588
56.2%
ASCII
ValueCountFrequency (%)
( 2
50.0%
) 2
50.0%

세부사업
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 12
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 18.0 KiB

학교시설환경개선: 1982
교육과정운영여건개선: 105
기관시설유지관리: 90
학교급식환경개선: 71
학교신증설: 19
Other values (7): 21

Length

Max length: 10
Median length: 8
Mean length: 8.0638112
Min length: 5

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 학교환경위생관리
2nd row: 학교시설환경개선
3rd row: 학교시설환경개선
4th row: 학교시설환경개선
5th row: 학교시설환경개선

Common Values

Value | Count | Frequency (%)
학교시설환경개선 | 1982 | 86.6%
교육과정운영여건개선 | 105 | 4.6%
기관시설유지관리 | 90 | 3.9%
학교급식환경개선 | 71 | 3.1%
학교신증설 | 19 | 0.8%
학교시설확충 | 5 | 0.2%
직업교육환경개선 | 4 | 0.2%
유치원교육여건개선 | 3 | 0.1%
학교도서관운영 | 3 | 0.1%
학교폭력예방및교육 | 3 | 0.1%
Other values (2) | 3 | 0.1%
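
The 75.8% imbalance figure reported for 세부사업 is consistent with one minus the normalised Shannon entropy of the category counts. The sketch below is an illustration under that assumption, not the profiler's own code. It also assumes that the two categories grouped under "Other values (2)" have counts 2 and 1, which is inferred from their total of 3 and from the single unique value reported.

```python
import math

# Category counts for 세부사업 from the Common Values table.
# Assumption: the two "Other values" categories have counts 2 and 1.
counts = [1982, 105, 90, 71, 19, 5, 4, 3, 3, 3, 2, 1]

total = sum(counts)  # should equal the 2288 observations
probs = [c / total for c in counts]

# Shannon entropy of the observed distribution (natural log).
entropy = -sum(p * math.log(p) for p in probs)

# Normalise by the maximum entropy for this many categories, then
# invert so that 0 means perfectly balanced and 1 fully imbalanced.
imbalance = 1 - entropy / math.log(len(counts))

print(f"imbalance score: {imbalance:.1%}")
```

A single category holding 86.6% of the rows is what drives the score this high; the remaining eleven categories share the other 13.4%.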

Length

Histogram of lengths of the category

세세부사업
Categorical

HIGH CORRELATION 

Distinct: 44
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 18.0 KiB

교육시설대수선: 497
외부환경개선: 376
석면해체연계교육환경개선: 301
기타교육환경개선: 165
내진보강: 124
Other values (39): 825

Length

Max length: 16
Median length: 12
Mean length: 7.9479895
Min length: 4

Unique

Unique: 3
Unique (%): 0.1%

Sample

1st row: 학교교육환경보호지원
2nd row: 어린이놀이시설개선
3rd row: 어린이놀이시설개선
4th row: 내진보강
5th row: 내진보강

Common Values

Value | Count | Frequency (%)
교육시설대수선 | 497 | 21.7%
외부환경개선 | 376 | 16.4%
석면해체연계교육환경개선 | 301 | 13.2%
기타교육환경개선 | 165 | 7.2%
내진보강 | 124 | 5.4%
학교체육시설확충및개보수 | 104 | 4.5%
교직원편의시설확충 | 86 | 3.8%
화장실개선 | 76 | 3.3%
냉난방시설개선 | 54 | 2.4%
직속기관시설관리 | 47 | 2.1%
Other values (34) | 458 | 20.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
교육시설대수선 497
21.7%
외부환경개선 376
16.4%
석면해체연계교육환경개선 301
13.2%
기타교육환경개선 165
 
7.2%
내진보강 124
 
5.4%
학교체육시설확충및개보수 104
 
4.5%
교직원편의시설확충 86
 
3.8%
화장실개선 76
 
3.3%
냉난방시설개선 54
 
2.4%
교실환경개선 47
 
2.1%
Other values (34) 458
20.0%

사업명
Text

Distinct: 2011
Distinct (%): 87.9%
Missing: 0
Missing (%): 0.0%
Memory size: 18.0 KiB

Length

Max length: 33
Median length: 28
Mean length: 11.092657
Min length: 2

Characters and Unicode

Total characters: 25380
Distinct characters: 388
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1898
Unique (%): 83.0%

Sample

1st row: 라돈저감설비설치
2nd row: 어린이놀이시설 교체보수
3rd row: 어린이놀이시설 교체보수
4th row: 교동초 내진보강
5th row: 금산초 내진보강

Value | Count | Frequency (%)
보수 | 312 | 5.3%
교사동 | 284 | 4.8%
드라이비트 | 182 | 3.1%
환경개선 | 145 | 2.5%
설치 | 121 | 2.0%
내진보강 | 111 | 1.9%
급식소 | 99 | 1.7%
냉난방개선 | 97 | 1.6%
관사보수 | 88 | 1.5%
지붕보수 | 70 | 1.2%
Other values (1130) | 4408 | 74.5%

Most occurring characters

ValueCountFrequency (%)
3702
 
14.6%
958
 
3.8%
880
 
3.5%
817
 
3.2%
808
 
3.2%
582
 
2.3%
558
 
2.2%
552
 
2.2%
472
 
1.9%
438
 
1.7%
Other values (378) 15613
61.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 21376
84.2%
Space Separator 3702
 
14.6%
Open Punctuation 106
 
0.4%
Close Punctuation 103
 
0.4%
Decimal Number 49
 
0.2%
Uppercase Letter 29
 
0.1%
Math Symbol 13
 
0.1%
Other Punctuation 2
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
958
 
4.5%
880
 
4.1%
817
 
3.8%
808
 
3.8%
582
 
2.7%
558
 
2.6%
552
 
2.6%
472
 
2.2%
438
 
2.0%
431
 
2.0%
Other values (359) 14880
69.6%
Uppercase Letter
ValueCountFrequency (%)
C 6
20.7%
E 5
17.2%
T 4
13.8%
D 4
13.8%
L 4
13.8%
V 3
10.3%
M 1
 
3.4%
A 1
 
3.4%
S 1
 
3.4%
Decimal Number
ValueCountFrequency (%)
1 22
44.9%
2 19
38.8%
3 5
 
10.2%
4 2
 
4.1%
5 1
 
2.0%
Space Separator
ValueCountFrequency (%)
3702
100.0%
Open Punctuation
ValueCountFrequency (%)
( 106
100.0%
Close Punctuation
ValueCountFrequency (%)
) 103
100.0%
Math Symbol
ValueCountFrequency (%)
+ 13
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 21376
84.2%
Common 3975
 
15.7%
Latin 29
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
958
 
4.5%
880
 
4.1%
817
 
3.8%
808
 
3.8%
582
 
2.7%
558
 
2.6%
552
 
2.6%
472
 
2.2%
438
 
2.0%
431
 
2.0%
Other values (359) 14880
69.6%
Common
ValueCountFrequency (%)
3702
93.1%
( 106
 
2.7%
) 103
 
2.6%
1 22
 
0.6%
2 19
 
0.5%
+ 13
 
0.3%
3 5
 
0.1%
4 2
 
0.1%
/ 2
 
0.1%
5 1
 
< 0.1%
Latin
ValueCountFrequency (%)
C 6
20.7%
E 5
17.2%
T 4
13.8%
D 4
13.8%
L 4
13.8%
V 3
10.3%
M 1
 
3.4%
A 1
 
3.4%
S 1
 
3.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 21376
84.2%
ASCII 4004
 
15.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3702
92.5%
( 106
 
2.6%
) 103
 
2.6%
1 22
 
0.5%
2 19
 
0.5%
+ 13
 
0.3%
C 6
 
0.1%
E 5
 
0.1%
3 5
 
0.1%
T 4
 
0.1%
Other values (9) 19
 
0.5%
Hangul
ValueCountFrequency (%)
958
 
4.5%
880
 
4.1%
817
 
3.8%
808
 
3.8%
582
 
2.7%
558
 
2.6%
552
 
2.6%
472
 
2.2%
438
 
2.0%
431
 
2.0%
Other values (359) 14880
69.6%

예산액(천원)
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 1841
Distinct (%): 80.8%
Missing: 10
Missing (%): 0.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 293006.96
Minimum: 1000
Maximum: 46428595
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 20.2 KiB

Quantile statistics

Minimum: 1000
5-th percentile: 8258.1
Q1: 25552.5
Median: 72000
Q3: 223107.75
95-th percentile: 890271.65
Maximum: 46428595
Range: 46427595
Interquartile range (IQR): 197555.25

Descriptive statistics

Standard deviation: 1651948.7
Coefficient of variation (CV): 5.6379163
Kurtosis: 560.82533
Mean: 293006.96
Median Absolute Deviation (MAD): 56612
Skewness: 21.913141
Sum: 6.6746986 × 10^8
Variance: 2.7289346 × 10^12
Monotonicity: Not monotonic
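
Several of these statistics are derived from others in the same report, so they can be cross-checked with simple arithmetic. A minimal sketch (variable names are illustrative; the inputs are the rounded values printed in the report, so the results agree only to the displayed precision):

```python
# Values as reported in the quantile and descriptive statistics.
minimum, maximum = 1000, 46428595
q1, q3 = 25552.5, 223107.75
mean, std = 293006.96, 1651948.7
n_valid = 2288 - 10  # observations minus missing values

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance
total = mean * n_valid           # Sum

print(value_range, iqr)
print(f"CV={cv:.4f}, variance={variance:.4e}, sum={total:.4e}")
```

The standard deviation is more than five times the mean, and the mean (293006.96) sits well above the third quartile (223107.75). Both point to the same heavy right tail that the skewness alert flags.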
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
20000 | 27 | 1.2%
30000 | 26 | 1.1%
10000 | 22 | 1.0%
25000 | 14 | 0.6%
40000 | 13 | 0.6%
50000 | 10 | 0.4%
15000 | 10 | 0.4%
289910 | 9 | 0.4%
18000 | 9 | 0.4%
13000 | 8 | 0.3%
Other values (1831) | 2130 | 93.1%
(Missing) | 10 | 0.4%

Minimum 10 values

Value | Count | Frequency (%)
1000 | 1 | < 0.1%
1200 | 2 | 0.1%
1600 | 1 | < 0.1%
1900 | 1 | < 0.1%
2000 | 3 | 0.1%
2200 | 1 | < 0.1%
2244 | 1 | < 0.1%
2380 | 1 | < 0.1%
2386 | 1 | < 0.1%
2500 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
46428595 | 1 | < 0.1%
45917929 | 1 | < 0.1%
26703363 | 1 | < 0.1%
18230479 | 1 | < 0.1%
14772919 | 1 | < 0.1%
11691911 | 1 | < 0.1%
10796814 | 1 | < 0.1%
7330912 | 1 | < 0.1%
6940105 | 1 | < 0.1%
6520313 | 1 | < 0.1%

비고
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 2288
Missing (%): 100.0%
Memory size: 20.2 KiB

Interactions


Correlations

Variable | 지역 | 급별 | 세부사업 | 세세부사업 | 예산액(천원)
지역 | 1.000 | 0.706 | 0.214 | 0.724 | 0.421
급별 | 0.706 | 1.000 | 0.713 | 0.927 | 0.348
세부사업 | 0.214 | 0.713 | 1.000 | 1.000 | 0.418
세세부사업 | 0.724 | 0.927 | 1.000 | 1.000 | 0.841
예산액(천원) | 0.421 | 0.348 | 0.418 | 0.841 | 1.000

Variable | 급별 | 지역 | 세세부사업 | 세부사업
급별 | 1.000 | 0.390 | 0.671 | 0.397
지역 | 0.390 | 1.000 | 0.269 | 0.073
세세부사업 | 0.671 | 0.269 | 1.000 | 0.993
세부사업 | 0.397 | 0.073 | 0.993 | 1.000

Variable | 예산액(천원) | 지역 | 급별 | 세부사업 | 세세부사업
예산액(천원) | 1.000 | 0.181 | 0.201 | 0.176 | 0.551
지역 | 0.181 | 1.000 | 0.390 | 0.073 | 0.269
급별 | 0.201 | 0.390 | 1.000 | 0.397 | 0.671
세부사업 | 0.176 | 0.073 | 0.397 | 1.000 | 0.993
세세부사업 | 0.551 | 0.269 | 0.671 | 0.993 | 1.000
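
The high-correlation alerts in the overview can be matched against the last matrix above (the one that includes 예산액(천원)). The sketch below lists, for each field, the other fields whose coefficient reaches a cut-off. The 0.5 threshold is an assumption chosen because it reproduces the alerts; the report itself does not state the threshold or name the correlation measure behind each matrix.

```python
# Last correlation matrix from the report, transcribed row by row.
fields = ["예산액(천원)", "지역", "급별", "세부사업", "세세부사업"]
matrix = [
    [1.000, 0.181, 0.201, 0.176, 0.551],
    [0.181, 1.000, 0.390, 0.073, 0.269],
    [0.201, 0.390, 1.000, 0.397, 0.671],
    [0.176, 0.073, 0.397, 1.000, 0.993],
    [0.551, 0.269, 0.671, 0.993, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off; not stated in the report

# For every field, collect the other fields at or above the threshold.
high = {
    name: [other for other, value in zip(fields, row)
           if other != name and abs(value) >= THRESHOLD]
    for name, row in zip(fields, matrix)
}

for name, partners in high.items():
    if partners:
        print(f"{name}: {', '.join(partners)}")
```

Under this assumption 세세부사업 pairs with three fields (예산액(천원), 급별, 세부사업), each of those pairs back with 세세부사업 only, and 지역 has no partner, which matches the four alerts listed in the overview.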

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

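First rows

Index | 연도 | 예산구분 | 지역 | 급별 | 기관(학교)명 | 세부사업 | 세세부사업 | 사업명 | 예산액(천원) | 비고
0 | 2023 | 본예산 | 평창 |  | 도성초 | 학교환경위생관리 | 학교교육환경보호지원 | 라돈저감설비설치 | 25000 | <NA>
1 | 2023 | 본예산 | 양구 |  | 한전초병설유 | 학교시설환경개선 | 어린이놀이시설개선 | 어린이놀이시설 교체보수 | 36059 | <NA>
2 | 2023 | 본예산 | 인제 |  | 상남초병설유 | 학교시설환경개선 | 어린이놀이시설개선 | 어린이놀이시설 교체보수 | 10278 | <NA>
3 | 2023 | 본예산 | 춘천 |  | 교동초 | 학교시설환경개선 | 내진보강 | 교동초 내진보강 | 40000 | <NA>
4 | 2023 | 본예산 | 춘천 |  | 금산초 | 학교시설환경개선 | 내진보강 | 금산초 내진보강 | 99000 | <NA>
5 | 2023 | 본예산 | 춘천 |  | 금산초 | 학교시설환경개선 | 내진보강 | 금산초 내진보강 | 28000 | <NA>
6 | 2023 | 본예산 | 춘천 |  | 동내초 | 학교시설환경개선 | 내진보강 | 동내초 내진보강 | 24000 | <NA>
7 | 2023 | 본예산 | 춘천 |  | 동내초 | 학교시설환경개선 | 내진보강 | 동내초 내진보강 | 20000 | <NA>
8 | 2023 | 본예산 | 춘천 |  | 봉의초 | 학교시설환경개선 | 내진보강 | 봉의초 내진보강 | 20000 | <NA>
9 | 2023 | 본예산 | 춘천 |  | 봉의초 | 학교시설환경개선 | 내진보강 | 봉의초 내진보강 | 26000 | <NA>

Last rows

Index | 연도 | 예산구분 | 지역 | 급별 | 기관(학교)명 | 세부사업 | 세세부사업 | 사업명 | 예산액(천원) | 비고
2278 | 2023 | 본예산 | 인제 |  | 인제남초 | 학교시설환경개선 | 외부환경개선 | 인제남초 테니스장 락커룸 외부 배수로 설치 | 10000 | <NA>
2279 | 2023 | 본예산 | 인제 |  | 서화초 | 학교시설환경개선 | 외부환경개선 | 서화초 조례대 지붕 교체 | 25000 | <NA>
2280 | 2023 | 본예산 | 인제 |  | 원통초 | 학교시설환경개선 | 기타교육환경개선 | 원통초 교직원 휴게실 설치 | 15000 | <NA>
2281 | 2023 | 본예산 | 인제 |  | 인제고 | 학교시설환경개선 | 기타교육환경개선 | 인제고 교직원 휴게실 설치 | 23640 | <NA>
2282 | 2023 | 본예산 | 고성 |  | 거진초 | 학교시설환경개선 | 교실환경개선 | 거진초교실환경개선 | 17040 | <NA>
2283 | 2023 | 본예산 | 고성 |  | 공현진초 | 학교시설환경개선 | 관리실환경개선 | 공현진초관리실환경개선 | 7453 | <NA>
2284 | 2023 | 본예산 | 고성 |  | 고성고 | 학교시설환경개선 | 관리실환경개선 | 고성고관리실환경개선 | 60007 | <NA>
2285 | 2023 | 본예산 | 고성 |  | 고성고 | 학교시설환경개선 | 외부환경개선 | 고성고옥상연결통로바닥보수 | 21000 | <NA>
2286 | 2023 | 본예산 | 고성 |  | 죽왕초 | 학교시설환경개선 | 기타교육환경개선 | 죽왕초학생쉼터조성 | 18000 | <NA>
2287 | 2023 | 본예산 | 고성 |  | 도학초 | 학교시설환경개선 | 기타교육환경개선 | 도학초야외학습장조성 | 20000 | <NA>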

Duplicate rows

Most frequently occurring

Index | 연도 | 예산구분 | 지역 | 급별 | 기관(학교)명 | 세부사업 | 세세부사업 | 사업명 | 예산액(천원) | # duplicates
0 | 2023 | 본예산 | 원주 |  | 원주고 | 학교시설환경개선 | 내진보강 | 원주고 내진보강 | 5000 | 2
1 | 2023 | 본예산 | 원주 |  | 원주공고 | 학교시설환경개선 | 내진보강 | 원주공업고 내진보강 | 30000 | 2
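
Duplicate rows like these can be found by counting identical records. The sketch below uses only the standard library and a handful of rows transcribed from this report. It is an illustration of the idea, not the profiler's implementation, and it leaves out 급별 because that column's values did not survive in this copy of the report.

```python
from collections import Counter

# Rows as tuples: (연도, 예산구분, 지역, 기관(학교)명, 세부사업,
# 세세부사업, 사업명, 예산액(천원)). The two duplicated records are
# listed twice each, mirroring the "# duplicates" column; the last
# record is a non-duplicated row taken from the sample table.
rows = [
    ("2023", "본예산", "원주", "원주고", "학교시설환경개선", "내진보강", "원주고 내진보강", 5000),
    ("2023", "본예산", "원주", "원주고", "학교시설환경개선", "내진보강", "원주고 내진보강", 5000),
    ("2023", "본예산", "원주", "원주공고", "학교시설환경개선", "내진보강", "원주공업고 내진보강", 30000),
    ("2023", "본예산", "원주", "원주공고", "학교시설환경개선", "내진보강", "원주공업고 내진보강", 30000),
    ("2023", "본예산", "춘천", "교동초", "학교시설환경개선", "내진보강", "교동초 내진보강", 40000),
]

# Count how often each full record occurs and keep those seen more than once.
occurrences = Counter(rows)
duplicates = {row: n for row, n in occurrences.items() if n > 1}

for row, n in duplicates.items():
    print(row[3], row[6], row[7], "x", n)
```

Whether such rows are data-entry repeats or genuinely separate budget lines with identical attributes cannot be decided from the report alone, so they are worth checking against the source file before dropping them.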