Overview

Dataset statistics

Number of variables: 6
Number of observations: 198
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 9.8 KiB
Average record size in memory: 50.7 B

Variable types

Text: 2
Categorical: 2
Numeric: 2

Dataset

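Description: Provides the status of schools participating in the SME specialized high school workforce training program of the Korea SMEs and Startups Agency (중소벤처기업진흥공단), including participating school, track, and location.
URL: https://www.data.go.kr/data/15047642/fileData.do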

Alerts

학생수 is highly overall correlated with 취업맞춤반 과정수 (High correlation)
취업맞춤반 과정수 is highly overall correlated with 학생수 (High correlation)
참여학교 has unique values (Unique)
학생수 has 5 (2.5%) zeros (Zeros)
취업맞춤반 과정수 has 5 (2.5%) zeros (Zeros)

Reproduction

Analysis started: 2023-12-12 06:58:42.070392
Analysis finished: 2023-12-12 06:58:43.005267
Duration: 0.93 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
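
A run like the one recorded above could be repeated with the same library. The following is a minimal sketch, assuming the CSV behind the dataset URL has been downloaded locally. The helper name `build_report`, the file name `schools.csv`, the `cp949` encoding, and the report title are illustrative assumptions, not details taken from this report.

```python
def build_report(csv_path, out_path="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report.

    Sketch only: the paths and the default encoding are assumptions.
    """
    # Imported lazily so the helper can be defined even where the
    # third-party packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Participating schools")  # illustrative title
    report.to_file(out_path)


# Example call (hypothetical local file name):
# build_report("schools.csv")
```

Timings and memory figures will differ between machines, so only the statistics themselves should be expected to match.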

Variables

참여학교 (participating school)
Text

UNIQUE 

Distinct: 198
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Length

Max length: 12
Median length: 11
Mean length: 8.959596
Min length: 6

Characters and Unicode

Total characters: 1774
Distinct characters: 171
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 198
Unique (%): 100.0%

Sample

1st row: 강릉정보공업고등학교
2nd row: 강원생활과학고등학교
3rd row: 김화공업고등학교
4th row: 강원생명과학고등학교
5th row: 영서고등학교

Value | Count | Frequency (%)
강릉정보공업고등학교 | 1 | 0.5%
서울전자고등학교 | 1 | 0.5%
선일빅데이터고등학교 | 1 | 0.5%
성동공업고등학교 | 1 | 0.5%
성동글로벌경영고등학교 | 1 | 0.5%
성암국제무역고등학교 | 1 | 0.5%
세그루패션디자인고등학교 | 1 | 0.5%
세명컴퓨터고등학교 | 1 | 0.5%
송곡관광고등학교 | 1 | 0.5%
신진과학기술고등학교 | 1 | 0.5%
Other values (188) | 188 | 94.9%

Most occurring characters

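Value | Count | Frequency (%)
[illegible] | 226 | 12.7%
[illegible] | 199 | 11.2%
[illegible] | 198 | 11.2%
[illegible] | 198 | 11.2%
[illegible] | 76 | 4.3%
[illegible] | 57 | 3.2%
[illegible] | 43 | 2.4%
[illegible] | 32 | 1.8%
[illegible] | 27 | 1.5%
[illegible] | 25 | 1.4%
Other values (161) | 693 | 39.1%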

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1771 | 99.8%
Uppercase Letter | 2 | 0.1%
Lowercase Letter | 1 | 0.1%

Most frequent character per category

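Other Letter
Value | Count | Frequency (%)
[illegible] | 226 | 12.8%
[illegible] | 199 | 11.2%
[illegible] | 198 | 11.2%
[illegible] | 198 | 11.2%
[illegible] | 76 | 4.3%
[illegible] | 57 | 3.2%
[illegible] | 43 | 2.4%
[illegible] | 32 | 1.8%
[illegible] | 27 | 1.5%
[illegible] | 25 | 1.4%
Other values (158) | 690 | 39.0%

Uppercase Letter
Value | Count | Frequency (%)
I | 1 | 50.0%
T | 1 | 50.0%

Lowercase Letter
Value | Count | Frequency (%)
e | 1 | 100.0%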

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1771 | 99.8%
Latin | 3 | 0.2%

Most frequent character per script

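Hangul
Value | Count | Frequency (%)
[illegible] | 226 | 12.8%
[illegible] | 199 | 11.2%
[illegible] | 198 | 11.2%
[illegible] | 198 | 11.2%
[illegible] | 76 | 4.3%
[illegible] | 57 | 3.2%
[illegible] | 43 | 2.4%
[illegible] | 32 | 1.8%
[illegible] | 27 | 1.5%
[illegible] | 25 | 1.4%
Other values (158) | 690 | 39.0%

Latin
Value | Count | Frequency (%)
I | 1 | 33.3%
e | 1 | 33.3%
T | 1 | 33.3%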

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1771 | 99.8%
ASCII | 3 | 0.2%

Most frequent character per block

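Hangul
Value | Count | Frequency (%)
[illegible] | 226 | 12.8%
[illegible] | 199 | 11.2%
[illegible] | 198 | 11.2%
[illegible] | 198 | 11.2%
[illegible] | 76 | 4.3%
[illegible] | 57 | 3.2%
[illegible] | 43 | 2.4%
[illegible] | 32 | 1.8%
[illegible] | 27 | 1.5%
[illegible] | 25 | 1.4%
Other values (158) | 690 | 39.0%

ASCII
Value | Count | Frequency (%)
I | 1 | 33.3%
e | 1 | 33.3%
T | 1 | 33.3%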

계열 (school track)
Categorical

Distinct: 15
Distinct (%): 7.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

공업: 105
상업: 61
가사실업: 7
상업정보: 6
상업가사: 6
Other values (10): 13

Length

Max length: 6
Median length: 2
Mean length: 2.3383838
Min length: 2

Unique

Unique: 7
Unique (%): 3.5%

Sample

1st row: 공업
2nd row: 가사실업
3rd row: 공업
4th row: 농업공업
5th row: 농생명

Common Values

Value | Count | Frequency (%)
공업 | 105 | 53.0%
상업 | 61 | 30.8%
가사실업 | 7 | 3.5%
상업정보 | 6 | 3.0%
상업가사 | 6 | 3.0%
농업공업 | 2 | 1.0%
농생명 | 2 | 1.0%
상업가사실업 | 2 | 1.0%
농생명산업 | 1 | 0.5%
가사 | 1 | 0.5%
Other values (5) | 5 | 2.5%

Length

Histogram of lengths of the category

소재지 (location)
Categorical

Distinct: 16
Distinct (%): 8.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

경기: 32
서울: 31
부산: 21
전남: 14
경남: 13
Other values (11): 87

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 2
Unique (%): 1.0%

Sample

1st row: 강원
2nd row: 강원
3rd row: 강원
4th row: 강원
5th row: 강원

Common Values

Value | Count | Frequency (%)
경기 | 32 | 16.2%
서울 | 31 | 15.7%
부산 | 21 | 10.6%
전남 | 14 | 7.1%
경남 | 13 | 6.6%
경북 | 13 | 6.6%
충북 | 13 | 6.6%
대구 | 12 | 6.1%
인천 | 12 | 6.1%
충남 | 10 | 5.1%
Other values (6) | 27 | 13.6%

Length

Histogram of lengths of the category

전공 (majors)
Text

Distinct: 197
Distinct (%): 99.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Length

Max length: 144
Median length: 56
Mean length: 33.989899
Min length: 4

Characters and Unicode

Total characters: 6730
Distinct characters: 272
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 196
Unique (%): 99.0%

Sample

1st row: 소프트웨어과, 그린자동차과, 미용디자인과, 신재생에너지과, 조리제빵과
2nd row: 보건간호과, 미용예술과
3rd row: 전기시스템제어과, IT융합과, 웰빙식품과
4th row: 스마트팜도시농업과, 플라워가드닝과, 반려동물케어과, 카페N디저트과, 바이오식품가공과, 스마트전기전자과, IoT그린전기차과
5th row: 동물자원, 산업기계, 환경조경, 생활원예, 골프경영과, 식품산업, 유통경영, 사무행정

Value | Count | Frequency (%)
전기과 | 26 | 3.0%
기계과 | 23 | 2.6%
보건간호과 | 19 | 2.2%
자동차과 | 13 | 1.5%
전기전자과 | 12 | 1.4%
건축과 | 10 | 1.1%
정밀기계과 | 8 | 0.9%
전자과 | 8 | 0.9%
토목과 | 8 | 0.9%
외식조리과 | 8 | 0.9%
Other values (523) | 741 | 84.6%

Most occurring characters

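Value | Count | Frequency (%)
[illegible] | 864 | 12.8%
, | 685 | 10.2%
(space) | 680 | 10.1%
[illegible] | 211 | 3.1%
[illegible] | 172 | 2.6%
[illegible] | 159 | 2.4%
[illegible] | 158 | 2.3%
[illegible] | 142 | 2.1%
[illegible] | 125 | 1.9%
[illegible] | 119 | 1.8%
Other values (262) | 3415 | 50.7%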

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 5175 | 76.9%
Other Punctuation | 693 | 10.3%
Space Separator | 680 | 10.1%
Uppercase Letter | 101 | 1.5%
Open Punctuation | 29 | 0.4%
Close Punctuation | 29 | 0.4%
Lowercase Letter | 8 | 0.1%
Decimal Number | 7 | 0.1%
Control | 6 | 0.1%
Dash Punctuation | 2 | < 0.1%

Most frequent character per category

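Other Letter
Value | Count | Frequency (%)
[illegible] | 864 | 16.7%
[illegible] | 211 | 4.1%
[illegible] | 172 | 3.3%
[illegible] | 159 | 3.1%
[illegible] | 158 | 3.1%
[illegible] | 142 | 2.7%
[illegible] | 125 | 2.4%
[illegible] | 119 | 2.3%
[illegible] | 103 | 2.0%
[illegible] | 98 | 1.9%
Other values (229) | 3024 | 58.4%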

Uppercase Letter
Value | Count | Frequency (%)
I | 34 | 33.7%
T | 26 | 25.7%
A | 8 | 7.9%
D | 6 | 5.9%
S | 4 | 4.0%
C | 4 | 4.0%
W | 4 | 4.0%
R | 3 | 3.0%
N | 3 | 3.0%
V | 2 | 2.0%
Other values (6) | 7 | 6.9%

Lowercase Letter
Value | Count | Frequency (%)
o | 4 | 50.0%
s | 1 | 12.5%
h | 1 | 12.5%
e | 1 | 12.5%
p | 1 | 12.5%

Other Punctuation
Value | Count | Frequency (%)
, | 685 | 98.8%
/ | 4 | 0.6%
: | 3 | 0.4%
· | 1 | 0.1%

Decimal Number
Value | Count | Frequency (%)
3 | 5 | 71.4%
2 | 1 | 14.3%
1 | 1 | 14.3%

Space Separator
Value | Count | Frequency (%)
(space) | 680 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 29 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 29 | 100.0%

Control
Value | Count | Frequency (%)
(control character) | 6 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 5175 | 76.9%
Common | 1446 | 21.5%
Latin | 109 | 1.6%

Most frequent character per script

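Hangul
Value | Count | Frequency (%)
[illegible] | 864 | 16.7%
[illegible] | 211 | 4.1%
[illegible] | 172 | 3.3%
[illegible] | 159 | 3.1%
[illegible] | 158 | 3.1%
[illegible] | 142 | 2.7%
[illegible] | 125 | 2.4%
[illegible] | 119 | 2.3%
[illegible] | 103 | 2.0%
[illegible] | 98 | 1.9%
Other values (229) | 3024 | 58.4%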

Latin
Value | Count | Frequency (%)
I | 34 | 31.2%
T | 26 | 23.9%
A | 8 | 7.3%
D | 6 | 5.5%
o | 4 | 3.7%
S | 4 | 3.7%
C | 4 | 3.7%
W | 4 | 3.7%
R | 3 | 2.8%
N | 3 | 2.8%
Other values (11) | 13 | 11.9%

Common
Value | Count | Frequency (%)
, | 685 | 47.4%
(space) | 680 | 47.0%
( | 29 | 2.0%
) | 29 | 2.0%
(control character) | 6 | 0.4%
3 | 5 | 0.3%
/ | 4 | 0.3%
: | 3 | 0.2%
- | 2 | 0.1%
2 | 1 | 0.1%
Other values (2) | 2 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 5175 | 76.9%
ASCII | 1554 | 23.1%
None | 1 | < 0.1%

Most frequent character per block

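Hangul
Value | Count | Frequency (%)
[illegible] | 864 | 16.7%
[illegible] | 211 | 4.1%
[illegible] | 172 | 3.3%
[illegible] | 159 | 3.1%
[illegible] | 158 | 3.1%
[illegible] | 142 | 2.7%
[illegible] | 125 | 2.4%
[illegible] | 119 | 2.3%
[illegible] | 103 | 2.0%
[illegible] | 98 | 1.9%
Other values (229) | 3024 | 58.4%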

ASCII
Value | Count | Frequency (%)
, | 685 | 44.1%
(space) | 680 | 43.8%
I | 34 | 2.2%
( | 29 | 1.9%
) | 29 | 1.9%
T | 26 | 1.7%
A | 8 | 0.5%
D | 6 | 0.4%
(control character) | 6 | 0.4%
3 | 5 | 0.3%
Other values (22) | 46 | 3.0%

None
Value | Count | Frequency (%)
· | 1 | 100.0%

학생수 (number of students)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 57
Distinct (%): 28.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 31.318182
Minimum: 0
Maximum: 102
Zeros: 5
Zeros (%): 2.5%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 10
Q1: 21
Median: 28.5
Q3: 38.75
95-th percentile: 61.6
Maximum: 102
Range: 102
Interquartile range (IQR): 17.75

Descriptive statistics

Standard deviation: 16.710882
Coefficient of variation (CV): 0.53358404
Kurtosis: 1.8456111
Mean: 31.318182
Median Absolute Deviation (MAD): 9.5
Skewness: 1.0512613
Sum: 6201
Variance: 279.25358
Monotonicity: Not monotonic
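
Several of the figures above are determined by the others, which allows a quick consistency check: the mean is the sum divided by the number of observations, the variance is the square of the standard deviation, the coefficient of variation is the standard deviation divided by the mean, and the IQR is Q3 minus Q1. The sketch below recomputes them using only values printed in this report. The variable names are illustrative.

```python
n = 198                 # Number of observations
total = 6201            # Sum
std = 16.710882         # Standard deviation
q1, q3 = 21.0, 38.75    # Q1 and Q3
minimum, maximum = 0, 102

mean = total / n
variance = std ** 2
cv = std / mean
iqr = q3 - q1
value_range = maximum - minimum

print(f"mean={mean:.6f}")
print(f"variance={variance:.5f}")
print(f"cv={cv:.8f}")
print(f"iqr={iqr} range={value_range}")
```

Each recomputed value can then be compared with the corresponding line of the report at its printed precision.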
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
33 | 9 | 4.5%
30 | 8 | 4.0%
27 | 8 | 4.0%
24 | 7 | 3.5%
26 | 7 | 3.5%
25 | 7 | 3.5%
47 | 7 | 3.5%
28 | 7 | 3.5%
32 | 7 | 3.5%
21 | 6 | 3.0%
Other values (47) | 125 | 63.1%

Value | Count | Frequency (%)
0 | 5 | 2.5%
5 | 1 | 0.5%
7 | 1 | 0.5%
8 | 1 | 0.5%
10 | 3 | 1.5%
11 | 2 | 1.0%
12 | 4 | 2.0%
13 | 3 | 1.5%
14 | 3 | 1.5%
15 | 4 | 2.0%

Value | Count | Frequency (%)
102 | 1 | 0.5%
85 | 1 | 0.5%
83 | 1 | 0.5%
75 | 1 | 0.5%
74 | 1 | 0.5%
70 | 1 | 0.5%
69 | 1 | 0.5%
66 | 1 | 0.5%
65 | 2 | 1.0%
61 | 3 | 1.5%

취업맞춤반 과정수 (number of job-tailored class courses)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 9
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.5707071
Minimum: 0
Maximum: 10
Zeros: 5
Zeros (%): 2.5%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 3
Median: 3
Q3: 4
95-th percentile: 6
Maximum: 10
Range: 10
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.4470521
Coefficient of variation (CV): 0.40525647
Kurtosis: 1.6438321
Mean: 3.5707071
Median Absolute Deviation (MAD): 1
Skewness: 0.4568126
Sum: 707
Variance: 2.0939599
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)

Value | Count | Frequency (%)
3 | 65 | 32.8%
4 | 44 | 22.2%
5 | 33 | 16.7%
2 | 32 | 16.2%
6 | 11 | 5.6%
0 | 5 | 2.5%
7 | 4 | 2.0%
1 | 3 | 1.5%
10 | 1 | 0.5%

Value | Count | Frequency (%)
0 | 5 | 2.5%
1 | 3 | 1.5%
2 | 32 | 16.2%
3 | 65 | 32.8%
4 | 44 | 22.2%
5 | 33 | 16.7%
6 | 11 | 5.6%
7 | 4 | 2.0%
10 | 1 | 0.5%

Value | Count | Frequency (%)
10 | 1 | 0.5%
7 | 4 | 2.0%
6 | 11 | 5.6%
5 | 33 | 16.7%
4 | 44 | 22.2%
3 | 65 | 32.8%
2 | 32 | 16.2%
1 | 3 | 1.5%
0 | 5 | 2.5%
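
Because this variable takes only nine distinct values, the frequency table lists every observation, so the summary statistics can be rebuilt from it. The sketch below does so, assuming the variance uses the n - 1 (sample) denominator. That convention is an assumption about the report rather than something it states, and the variable names are illustrative.

```python
import math

# value -> count, copied from the frequency table for 취업맞춤반 과정수
freq = {0: 5, 1: 3, 2: 32, 3: 65, 4: 44, 5: 33, 6: 11, 7: 4, 10: 1}

n = sum(freq.values())
total = sum(value * count for value, count in freq.items())
mean = total / n

# Sample variance: squared deviations weighted by count, divided by n - 1.
variance = sum(count * (value - mean) ** 2 for value, count in freq.items()) / (n - 1)
std = math.sqrt(variance)
zeros_pct = 100 * freq[0] / n

print(f"n={n} sum={total} mean={mean:.7f}")
print(f"variance={variance:.7f} std={std:.7f} zeros={zeros_pct:.1f}%")
```

The same approach does not work for 학생수, where the report lists only the most common and most extreme values.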

Interactions


Correlations

Variable | 계열 | 소재지 | 학생수 | 취업맞춤반 과정수
계열 | 1.000 | 0.349 | 0.444 | 0.225
소재지 | 0.349 | 1.000 | 0.000 | 0.000
학생수 | 0.444 | 0.000 | 1.000 | 0.789
취업맞춤반 과정수 | 0.225 | 0.000 | 0.789 | 1.000

Variable | 계열 | 소재지
계열 | 1.000 | 0.122
소재지 | 0.122 | 1.000

Variable | 학생수 | 취업맞춤반 과정수 | 계열 | 소재지
학생수 | 1.000 | 0.735 | 0.178 | 0.000
취업맞춤반 과정수 | 0.735 | 1.000 | 0.064 | 0.000
계열 | 0.178 | 0.064 | 1.000 | 0.122
소재지 | 0.000 | 0.000 | 0.122 | 1.000
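
The matrices above do not name the coefficient behind each value. As an illustration of how a rank-based coefficient for the numeric pair could be computed, the sketch below implements Spearman's rho with average ranks for ties and applies it to the 20 rows shown in the Sample section. The choice of Spearman is an assumption, and the (학생수, 취업맞춤반 과정수) pairs are read from the fused sample rows by treating the last digit as the course count. With only 20 of 198 rows, the result is not expected to reproduce the values above.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of sorted positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# First 10 and last 10 rows of the Sample section.
students = [30, 16, 19, 26, 58, 18, 8, 38, 54, 29, 5, 10, 36, 47, 31, 41, 23, 65, 39, 26]
courses = [3, 2, 3, 4, 6, 3, 2, 6, 7, 3, 1, 2, 2, 6, 3, 5, 3, 6, 4, 3]

rho = spearman(students, courses)
print(f"Spearman rho on the 20 sample rows: {rho:.3f}")
```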

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 참여학교 | 계열 | 소재지 | 전공 | 학생수 | 취업맞춤반 과정수
0 | 강릉정보공업고등학교 | 공업 | 강원 | 소프트웨어과, 그린자동차과, 미용디자인과, 신재생에너지과, 조리제빵과 | 30 | 3
1 | 강원생활과학고등학교 | 가사실업 | 강원 | 보건간호과, 미용예술과 | 16 | 2
2 | 김화공업고등학교 | 공업 | 강원 | 전기시스템제어과, IT융합과, 웰빙식품과 | 19 | 3
3 | 강원생명과학고등학교 | 농업공업 | 강원 | 스마트팜도시농업과, 플라워가드닝과, 반려동물케어과, 카페N디저트과, 바이오식품가공과, 스마트전기전자과, IoT그린전기차과 | 26 | 4
4 | 영서고등학교 | 농생명 | 강원 | 동물자원, 산업기계, 환경조경, 생활원예, 골프경영과, 식품산업, 유통경영, 사무행정 | 58 | 6
5 | 미래고등학교 | 공업 | 강원 | 컴퓨터응용기계과, 전기과, 모바일전자과, 토목과, 건축과, 자동차과, 드론전자과, 뷰티과 | 18 | 3
6 | 정선정보공업고등학교 | 상업정보 | 강원 | 토목과, 정보처리과, 금융정보과 | 8 | 2
7 | 춘천기계공업고등학교 | 공업 | 강원 | 기계과, 스마트금형과, 산업설비과, 자동차과, 전기과, 건축토목과 | 38 | 6
8 | 춘천한샘고등학교 | 공업 | 강원 | 뷰티패션과,조리과,디자인콘텐츠과,바이오코스메틱과,스마트경영과,융합소프트웨어과 | 54 | 7
9 | 경기경영고등학교 | 상업정보 | 경기 | 패션코디네이터과, 웹툰크리에이터과, 금융경영과, 회계경영과, 콘텐츠크리에이터과, 스마트콘텐츠과, 조리디저트과, 뷰티미용과 | 29 | 3

Index | 참여학교 | 계열 | 소재지 | 전공 | 학생수 | 취업맞춤반 과정수
188 | 제천디지털전자고등학교 | 공업 | 충북 | 전기전자과, IT전자과, 보건간호과 | 5 | 1
189 | 제천산업고등학교 | 공업 | 충북 | 기계과, 전기제어과, 뷰티미용과 | 10 | 2
190 | 청주IT과학고등학교 | 상업 | 충북 | 스마트소프트웨어과, 판매관리과, 사무행정과 | 36 | 2
191 | 청주공업고등학교 | 공업 | 충북 | 정밀기계과, 기계설계과, 항공모빌리티과, 융합설비과, 화학공업과, 전기제어과, 반도체전자과 | 47 | 6
192 | 청주하이텍고등학교 | 공업 | 충북 | 정밀기계과, 자동화시스템과, 전기전자과 | 31 | 3
193 | 충북공업고등학교 | 공업 | 충북 | 생산자동화설비과, 금형과, 정밀기계과, 전기전자과 | 41 | 5
194 | 충주공업고등학교 | 공업 | 충북 | 건축디자인과, 자동화기계과, 전기전자과, 토목시스템과 | 23 | 3
195 | 충주상업고등학교 | 상업 | 충북 | 경영회계과,경영관리과, 스마트IT과, 관광레저과, 외식조리과 | 65 | 6
196 | 한림디자인고등학교 | 상업가사 | 충북 | 경영회계과, 디자인과, 패션디자인과, 뷰티디자인과 | 39 | 4
197 | 충북상업정보고등학교 | 상업 | 충북 | 창업경영과, 항공물류서비스과, 사무행정과 | 26 | 3