Overview

Dataset statistics

Number of variables: 4
Number of observations: 57
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.0 KiB
Average record size in memory: 35.3 B

Variable types

Categorical: 2
Text: 1
Numeric: 1

Dataset

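Description: Provides the departments and number of classes at specialized high schools and Meister high schools under the Daejeon Metropolitan Office of Education, based on new students for the 2024 school year.
Author: Daejeon Metropolitan Office of Education (대전광역시교육청)
URL: https://www.data.go.kr/data/15112950/fileData.do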

Alerts

학교명 (school name) is highly correlated overall with 학생성별 (student gender): High correlation
학생성별 (student gender) is highly correlated overall with 학교명 (school name): High correlation
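
The alert above flags an association between two categorical columns. This text does not state which measure produced the score. One common choice for a pair of categorical variables is Cramér's V, so the following is only a minimal sketch under that assumption. The function name `cramers_v` and the toy rows are hypothetical and are not taken from the dataset.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Cramér's V (without bias correction) for two categorical sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))      # observed cell counts
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    chi2 = 0.0
    for x, rx in row_totals.items():
        for y, cy in col_totals.items():
            expected = rx * cy / n    # count expected under independence
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return 0.0                    # a constant column carries no association
    return math.sqrt(chi2 / (n * k))


# Hypothetical toy data: every school maps to exactly one gender category.
toy_schools = ["A", "A", "B", "B", "C", "C"]
toy_genders = ["m", "m", "f", "f", "mf", "mf"]
v_perfect = cramers_v(toy_schools, toy_genders)

# Hypothetical toy data with no association between the two columns.
v_none = cramers_v(["A", "A", "B", "B"], ["m", "f", "m", "f"])
```

When each school admits a single gender category, knowing the school determines the gender value, and a measure of this kind reaches its maximum of 1.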

Reproduction

Analysis started: 2024-04-21 02:16:10.866056
Analysis finished: 2024-04-21 02:16:12.801838
Duration: 1.94 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
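
A report of this kind can be regenerated with ydata-profiling. The sketch below assumes the CSV behind the URL above has already been downloaded locally. The function name `build_report`, the file names, the `cp949` encoding, and the report title are assumptions and are not recorded in this report.

```python
def build_report(csv_path, html_path, encoding="cp949"):
    """Profile a CSV file and write an HTML report (sketch; paths are hypothetical)."""
    # Imported lazily so this module still loads where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)  # encoding is an assumption
    profile = ProfileReport(df, title="Profiling report")
    profile.to_file(html_path)
    return profile


# Example call with hypothetical file names:
# build_report("daejeon_vocational_2024.csv", "report.html")
```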

Variables

학교명 (school name)
Categorical

HIGH CORRELATION

Distinct: 12
Distinct (%): 21.1%
Missing: 0
Missing (%): 0.0%
Memory size: 588.0 B

Length

Max length: 15
Median length: 10
Mean length: 10.298246
Min length: 9

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 대덕소프트웨어마이스터고등학교
2nd row: 대덕소프트웨어마이스터고등학교
3rd row: 대덕소프트웨어마이스터고등학교
4th row: 동아마이스터고등학교
5th row: 동아마이스터고등학교

Common Values

Value | Count | Frequency (%)
충남기계공업고등학교 | 7 | 12.3%
대전도시과학고등학교 | 6 | 10.5%
대전국제통상고등학교 | 6 | 10.5%
대전대성여자고등학교 | 6 | 10.5%
대전전자디자인고등학교 | 6 | 10.5%
동아마이스터고등학교 | 5 | 8.8%
대전생활과학고등학교 | 5 | 8.8%
계룡디지텍고등학교 | 4 | 7.0%
유성생명과학고등학교 | 4 | 7.0%
대덕소프트웨어마이스터고등학교 | 3 | 5.3%
Other values (2) | 5 | 8.8%
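
The frequency column is each count divided by the 57 observations. As a minimal sketch, the percentages can be recomputed from the counts listed above. The variable names are illustrative, and "Other values (2)" is kept as a single bucket because its two schools are not itemized here.

```python
# Counts copied from the Common Values table for 학교명.
school_counts = {
    "충남기계공업고등학교": 7,
    "대전도시과학고등학교": 6,
    "대전국제통상고등학교": 6,
    "대전대성여자고등학교": 6,
    "대전전자디자인고등학교": 6,
    "동아마이스터고등학교": 5,
    "대전생활과학고등학교": 5,
    "계룡디지텍고등학교": 4,
    "유성생명과학고등학교": 4,
    "대덕소프트웨어마이스터고등학교": 3,
    "Other values (2)": 5,  # two schools, not itemized in the table
}

n_observations = sum(school_counts.values())

# Frequency (%) = 100 * count / number of observations, to one decimal place.
school_frequency = {
    name: round(100 * count / n_observations, 1)
    for name, count in school_counts.items()
}
```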

Length

Histogram of lengths of the category

학과명 (department name)
Text

Distinct: 52
Distinct (%): 91.2%
Missing: 0
Missing (%): 0.0%
Memory size: 588.0 B

Length

Max length: 26
Median length: 20
Mean length: 8.3333333
Min length: 3

Characters and Unicode

Total characters: 475
Distinct characters: 122
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
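
As a minimal sketch of this kind of analysis, Python's standard `unicodedata` module exposes the general category of each code point. The helper name `category_counts` is hypothetical, and the two input strings are department names that appear in this report. Script and block lookups are not covered, since the standard library does not provide them.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in the values."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


# Two department names taken from the sample rows of this report.
sample_names = ["SW개발과", "전기전자제어과(군특성화마이스터)"]
sample_categories = category_counts(sample_names)

# Two-letter codes map to the category names used in the report:
# Lo = Other Letter, Lu = Uppercase Letter,
# Ps = Open Punctuation, Pe = Close Punctuation, Po = Other Punctuation.
```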

Unique

Unique: 48
Unique (%): 84.2%
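
Distinct counts how many different values occur, while Unique counts the values that occur exactly once. The sketch below rebuilds both figures from the value counts this report lists for 학과명: four repeated department names plus 48 names that each appear once. The singleton labels are placeholders, not real department names.

```python
from collections import Counter

# Repeated values, as listed in the frequency table for 학과명.
department_counts = Counter({
    "스마트기계과": 3,
    "토탈미용과": 2,
    "보건간호과": 2,
    "전기과": 2,
})

# The remaining 48 values occur once each; placeholder labels stand in for them.
for i in range(48):
    department_counts[f"singleton_{i}"] = 1

n_rows = sum(department_counts.values())
n_distinct = len(department_counts)                                   # different values
n_unique = sum(1 for c in department_counts.values() if c == 1)       # seen exactly once

distinct_pct = round(100 * n_distinct / n_rows, 1)
unique_pct = round(100 * n_unique / n_rows, 1)
```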

Sample

1st row: SW개발과
2nd row: 임베디드SW과
3rd row: 인공지능SW과
4th row: 전기전자제어과
5th row: 스마트자동화시스템과
Value | Count | Frequency (%)
스마트기계과 | 3 | 5.3%
토탈미용과 | 2 | 3.5%
보건간호과 | 2 | 3.5%
전기과 | 2 | 3.5%
스마트융합기계과 | 1 | 1.8%
철도차량과 | 1 | 1.8%
스마트시티과 | 1 | 1.8%
스마트경영과 | 1 | 1.8%
철도건축시설과 | 1 | 1.8%
철도전기신호과 | 1 | 1.8%
Other values (42) | 42 | 73.7%

Most occurring characters

Value | Count | Frequency (%)
 | 70 | 14.7%
 | 17 | 3.6%
 | 16 | 3.4%
 | 15 | 3.2%
 | 14 | 2.9%
 | 13 | 2.7%
 | 12 | 2.5%
 | 12 | 2.5%
 | 10 | 2.1%
) | 10 | 2.1%
Other values (112) | 286 | 60.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 437 | 92.0%
Close Punctuation | 10 | 2.1%
Open Punctuation | 10 | 2.1%
Other Punctuation | 10 | 2.1%
Uppercase Letter | 8 | 1.7%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 70 | 16.0%
 | 17 | 3.9%
 | 16 | 3.7%
 | 15 | 3.4%
 | 14 | 3.2%
 | 13 | 3.0%
 | 12 | 2.7%
 | 12 | 2.7%
 | 10 | 2.3%
 | 9 | 2.1%
Other values (104) | 249 | 57.0%

Uppercase Letter
Value | Count | Frequency (%)
S | 3 | 37.5%
W | 3 | 37.5%
I | 1 | 12.5%
T | 1 | 12.5%

Other Punctuation
Value | Count | Frequency (%)
, | 9 | 90.0%
· | 1 | 10.0%

Close Punctuation
Value | Count | Frequency (%)
) | 10 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 10 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 437 | 92.0%
Common | 30 | 6.3%
Latin | 8 | 1.7%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 70 | 16.0%
 | 17 | 3.9%
 | 16 | 3.7%
 | 15 | 3.4%
 | 14 | 3.2%
 | 13 | 3.0%
 | 12 | 2.7%
 | 12 | 2.7%
 | 10 | 2.3%
 | 9 | 2.1%
Other values (104) | 249 | 57.0%

Common
Value | Count | Frequency (%)
) | 10 | 33.3%
( | 10 | 33.3%
, | 9 | 30.0%
· | 1 | 3.3%

Latin
Value | Count | Frequency (%)
S | 3 | 37.5%
W | 3 | 37.5%
I | 1 | 12.5%
T | 1 | 12.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 437 | 92.0%
ASCII | 37 | 7.8%
None | 1 | 0.2%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 70 | 16.0%
 | 17 | 3.9%
 | 16 | 3.7%
 | 15 | 3.4%
 | 14 | 3.2%
 | 13 | 3.0%
 | 12 | 2.7%
 | 12 | 2.7%
 | 10 | 2.3%
 | 9 | 2.1%
Other values (104) | 249 | 57.0%

ASCII
Value | Count | Frequency (%)
) | 10 | 27.0%
( | 10 | 27.0%
, | 9 | 24.3%
S | 3 | 8.1%
W | 3 | 8.1%
I | 1 | 2.7%
T | 1 | 2.7%

None
Value | Count | Frequency (%)
· | 1 | 100.0%

학생성별 (student gender)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Memory size: 588.0 B

Length

Max length: 2
Median length: 2
Mean length: 1.5263158
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 남녀
2nd row: 남녀
3rd row: 남녀
4th row:
5th row:

Common Values

Value | Count | Frequency (%)
남녀 | 30 | 52.6%
 | 16 | 28.1%
 | 11 | 19.3%

Length

Histogram of lengths of the category

학급수 (number of classes)
Real number (ℝ)

Distinct: 6
Distinct (%): 10.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0175439
Minimum: 1
Maximum: 6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 645.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 2
Q3: 2
95-th percentile: 4.2
Maximum: 6
Range: 5
Interquartile range (IQR): 1
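
The non-integer 95th percentile (4.2) comes from interpolating between neighboring sorted values. The sketch below assumes linear interpolation between order statistics, which is the default in pandas, and rebuilds the 57 values of 학급수 from the value counts this report lists for the variable. The helper name `percentile` is hypothetical.

```python
import math

# Value counts for 학급수 as listed in this report: value -> number of rows.
class_value_counts = {1: 16, 2: 32, 3: 5, 4: 1, 5: 2, 6: 1}
class_values = sorted(v for v, c in class_value_counts.items() for _ in range(c))


def percentile(sorted_values, q):
    """Quantile for 0 <= q <= 1 using linear interpolation between neighbors."""
    pos = q * (len(sorted_values) - 1)       # fractional index into the sorted data
    lower = math.floor(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


p05 = percentile(class_values, 0.05)
q1 = percentile(class_values, 0.25)
q2 = percentile(class_values, 0.50)
q3 = percentile(class_values, 0.75)
p95 = percentile(class_values, 0.95)
iqr = q3 - q1
value_range = class_values[-1] - class_values[0]
```

For the 95th percentile, the fractional index is 0.95 × 56 = 53.2, which falls between the sorted values 4 and 5, giving 4 + 0.2 × (5 − 4) = 4.2.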

Descriptive statistics

Standard deviation: 1.0262837
Coefficient of variation (CV): 0.50867973
Kurtosis: 4.8423488
Mean: 2.0175439
Median Absolute Deviation (MAD): 0
Skewness: 1.9157284
Sum: 115
Variance: 1.0532581
Monotonicity: Not monotonic
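
Several of these figures can be checked with Python's standard `statistics` module. This is a minimal sketch that rebuilds the 57 values from the value counts listed in this report and assumes the sample (n − 1) variance, which matches the reported numbers. Skewness and kurtosis are left out, and the variable names are illustrative.

```python
import statistics

# Value counts for 학급수 as listed in this report: value -> number of rows.
value_counts = {1: 16, 2: 32, 3: 5, 4: 1, 5: 2, 6: 1}
values = [v for v, c in value_counts.items() for _ in range(c)]

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, divides by n - 1
std_dev = statistics.stdev(values)       # square root of the sample variance
cv = std_dev / mean                      # coefficient of variation

median = statistics.median(values)
# Median absolute deviation: median of |x - median|.
mad = statistics.median([abs(x - median) for x in values])
```

The MAD is 0 because 32 of the 57 values equal the median of 2, so more than half of the absolute deviations are zero.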
Histogram with fixed size bins (bins=6)
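Values by frequency
Value | Count | Frequency (%)
2 | 32 | 56.1%
1 | 16 | 28.1%
3 | 5 | 8.8%
5 | 2 | 3.5%
4 | 1 | 1.8%
6 | 1 | 1.8%

Values in ascending order
Value | Count | Frequency (%)
1 | 16 | 28.1%
2 | 32 | 56.1%
3 | 5 | 8.8%
4 | 1 | 1.8%
5 | 2 | 3.5%
6 | 1 | 1.8%

Values in descending order
Value | Count | Frequency (%)
6 | 1 | 1.8%
5 | 2 | 3.5%
4 | 1 | 1.8%
3 | 5 | 8.8%
2 | 32 | 56.1%
1 | 16 | 28.1%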

Interactions


Correlations

 | 학교명 | 학과명 | 학생성별 | 학급수
학교명 | 1.000 | 0.844 | 1.000 | 0.802
학과명 | 0.844 | 1.000 | 0.657 | 0.937
학생성별 | 1.000 | 0.657 | 1.000 | 0.608
학급수 | 0.802 | 0.937 | 0.608 | 1.000

 | 학교명 | 학생성별
학교명 | 1.000 | 0.913
학생성별 | 0.913 | 1.000

 | 학급수 | 학교명 | 학생성별
학급수 | 1.000 | 0.415 | 0.298
학교명 | 0.415 | 1.000 | 0.913
학생성별 | 0.298 | 0.913 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

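First rows
 | 학교명 | 학과명 | 학생성별 | 학급수
0 | 대덕소프트웨어마이스터고등학교 | SW개발과 | 남녀 | 2
1 | 대덕소프트웨어마이스터고등학교 | 임베디드SW과 | 남녀 | 1
2 | 대덕소프트웨어마이스터고등학교 | 인공지능SW과 | 남녀 | 1
3 | 동아마이스터고등학교 | 전기전자제어과 |  | 3
4 | 동아마이스터고등학교 | 스마트자동화시스템과 |  | 3
5 | 동아마이스터고등학교 | 스마트기계과 |  | 2
6 | 동아마이스터고등학교 | 전기전자제어과(군특성화마이스터) |  | 1
7 | 동아마이스터고등학교 | 스마트기계과(군특성화마이스터) |  | 1
8 | 계룡디지텍고등학교 | 스마트소프트웨어과 |  | 2
9 | 계룡디지텍고등학교 | 정보통신과 |  | 2

Last rows
 | 학교명 | 학과명 | 학생성별 | 학급수
47 | 유성생명과학고등학교 | 자동차·건설정보과(건설정보,자동차건설기계) | 남녀 | 2
48 | 유성생명과학고등학교 | 토탈미용과 | 남녀 | 2
49 | 유성생명과학고등학교 | 보건간호과 | 남녀 | 1
50 | 충남기계공업고등학교 | 스마트융합기계과 |  | 2
51 | 충남기계공업고등학교 | 철도차량과 |  | 2
52 | 충남기계공업고등학교 | 스마트시티과 |  | 2
53 | 충남기계공업고등학교 | 디지털설비과 |  | 2
54 | 충남기계공업고등학교 | 기계설계과 |  | 2
55 | 충남기계공업고등학교 | 전기과 |  | 2
56 | 충남기계공업고등학교 | 스마트팩토리과 |  | 2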