Overview

Dataset statistics

Number of variables: 8
Number of observations: 2344
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 151.2 KiB
Average record size in memory: 66.1 B
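
The average record size can be cross-checked against the two figures above: it is the total memory divided by the number of observations. A minimal arithmetic sketch, assuming the report uses binary units (1 KiB = 1024 B):

```python
# Figures copied from the dataset statistics above.
total_size_kib = 151.2
n_observations = 2344

# Assumption: the report uses binary units (1 KiB = 1024 bytes).
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")  # prints "66.1 B"
```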

Variable types

Numeric: 1
Categorical: 4
Text: 2
Boolean: 1

Dataset

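Description: Incentive grant for third-year high school students who take jobs at small and medium-sized or mid-tier enterprises. Lists the high schools participating in the High School Employment-Linked Incentive (고교 취업연계 장려금), broken down by school type (specialized high school, general high school, etc.) and school name.
URL: https://www.data.go.kr/data/15101534/fileData.do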

Alerts

상품명 has constant value "고교 취업연계 장려금" (Constant)
장학년도 has constant value "2022" (Constant)
학교세부유형 is highly overall correlated with 고등학교유형 (High correlation)
고등학교유형 is highly overall correlated with 학교세부유형 and 1 other field (High correlation)
참여여부 is highly overall correlated with 고등학교유형 (High correlation)
학교세부유형 is highly imbalanced (61.2%) (Imbalance)
고등학교유형 is highly imbalanced (56.8%) (Imbalance)
참여여부 is highly imbalanced (99.0%) (Imbalance)
순번 has unique values (Unique)
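
The report does not spell out how the imbalance percentages are computed. One reading that appears consistent with the figures is an entropy-based score, 1 - H / log(k), where H is the Shannon entropy of the class frequencies and k is the number of classes. The sketch below is an assumption along those lines, not the tool's confirmed formula; the counts are copied from the variable sections of this report, and the function name `imbalance_score` is made up for illustration.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/log(k): 0 for perfectly balanced classes, near 1 when
    a single class dominates. H is the Shannon entropy of the frequencies."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)

# Counts copied from the variable sections of this report.
school_subtype = [1608, 670, 28, 19, 8, 8, 3]  # 학교세부유형
participation = [2342, 2]                       # 참여여부

print(f"{imbalance_score(school_subtype):.1%}")
print(f"{imbalance_score(participation):.1%}")
```

Under this assumption the two scores land close to the 61.2% and 99.0% listed in the alerts. The 56.8% for 고등학교유형 cannot be checked the same way, because five of its category counts are folded into "Other values".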

Reproduction

Analysis started: 2023-12-12 22:16:06.524398
Analysis finished: 2023-12-12 22:16:07.631855
Duration: 1.11 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
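
The duration is the difference between the two timestamps. A short check using only the Python standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-12 22:16:06.524398")
finished = datetime.fromisoformat("2023-12-12 22:16:07.631855")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # prints "1.11 seconds"
```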

Variables

순번
Real number (ℝ)

UNIQUE 

Distinct: 2344
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1172.5
Minimum: 1
Maximum: 2344
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 20.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 118.15
Q1: 586.75
Median: 1172.5
Q3: 1758.25
95-th percentile: 2226.85
Maximum: 2344
Range: 2343
Interquartile range (IQR): 1171.5

Descriptive statistics

Standard deviation: 676.79884
Coefficient of variation (CV): 0.57722715
Kurtosis: -1.2
Mean: 1172.5
Median Absolute Deviation (MAD): 586
Skewness: 0
Sum: 2748340
Variance: 458056.67
Monotonicity: Strictly increasing
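
Because 순번 is a gap-free running index from 1 to 2344, every statistic in this section follows from the sequence itself. The sketch below recomputes them with the standard library. It assumes the sample (n - 1) variance and linear-interpolation percentiles, which is what the reported figures are consistent with; the helper `percentile` is written here for illustration.

```python
import math
import statistics

n = 2344
values = list(range(1, n + 1))  # 순번 runs from 1 to n with no gaps

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = math.sqrt(variance)
cv = std / mean
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

def percentile(sorted_values, q):
    """Linear-interpolation percentile of sorted data, q in [0, 1]."""
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])

q1 = percentile(values, 0.25)
q3 = percentile(values, 0.75)

print(mean, round(variance, 2), round(std, 5), round(cv, 8))
print(median, mad, sum(values), q1, q3, q3 - q1)
```

For a sequence 1..n the sample variance also has the closed form n(n + 1) / 12, which gives 2344 × 2345 / 12 ≈ 458056.67.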
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
1 | 1 | < 0.1%
1576 | 1 | < 0.1%
1560 | 1 | < 0.1%
1561 | 1 | < 0.1%
1562 | 1 | < 0.1%
1563 | 1 | < 0.1%
1564 | 1 | < 0.1%
1565 | 1 | < 0.1%
1566 | 1 | < 0.1%
1567 | 1 | < 0.1%
Other values (2334) | 2334 | 99.6%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
2344 | 1 | < 0.1%
2343 | 1 | < 0.1%
2342 | 1 | < 0.1%
2341 | 1 | < 0.1%
2340 | 1 | < 0.1%
2339 | 1 | < 0.1%
2338 | 1 | < 0.1%
2337 | 1 | < 0.1%
2336 | 1 | < 0.1%
2335 | 1 | < 0.1%

상품명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 18.4 KiB
고교 취업연계 장려금: 2344

Length

Max length: 11
Median length: 11
Mean length: 11
Min length: 11

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 고교 취업연계 장려금
2nd row: 고교 취업연계 장려금
3rd row: 고교 취업연계 장려금
4th row: 고교 취업연계 장려금
5th row: 고교 취업연계 장려금

Common Values

Value | Count | Frequency (%)
고교 취업연계 장려금 | 2344 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
고교 | 2344 | 33.3%
취업연계 | 2344 | 33.3%
장려금 | 2344 | 33.3%

장학년도
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 18.4 KiB
2022: 2344

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022
2nd row: 2022
3rd row: 2022
4th row: 2022
5th row: 2022

Common Values

Value | Count | Frequency (%)
2022 | 2344 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
2022 | 2344 | 100.0%

고등학교명
Text

Distinct: 2232
Distinct (%): 95.2%
Missing: 0
Missing (%): 0.0%
Memory size: 18.4 KiB

Length

Max length: 24
Median length: 21
Mean length: 7.4701365
Min length: 6

Characters and Unicode

Total characters: 17510
Distinct characters: 382
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
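
The category names used in the character tables of this report ("Other Letter", "Lowercase Letter", "Space Separator", and so on) correspond to Unicode general categories. As a rough sketch of how such counts can be obtained, Python's standard `unicodedata` module returns the two-letter category code for each character: `Lo` is Other Letter (Hangul syllables fall here), `Ll` Lowercase Letter, `Lu` Uppercase Letter, `Zs` Space Separator, `Pd` Dash Punctuation, `Nd` Decimal Number. The helper name `category_counts` and the third input string are made up for illustration.

```python
import unicodedata
from collections import Counter

def category_counts(strings):
    """Count characters by Unicode general category code (e.g. 'Lo', 'Zs')."""
    return Counter(unicodedata.category(ch) for s in strings for ch in s)

# The first two entries are sample rows from this report; the third is a
# made-up string that mixes Latin letters, spaces, and a hyphen.
names = ["가곡고등학교", "가락고등학교", "IT e-biz 고등학교"]
counts = category_counts(names)

for code, count in counts.most_common():
    print(code, count)
```

Script and block breakdowns need data that `unicodedata` does not expose directly, so they are left out of this sketch.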

Unique

Unique: 2131
Unique (%): 90.9%

Sample

1st row: 가곡고등학교
2nd row: 가락고등학교
3rd row: 가림고등학교
4th row: 가야고등학교
5th row: 가온고등학교

Words

Value | Count | Frequency (%)
금천고등학교 | 3 | 0.1%
금성고등학교 | 3 | 0.1%
영산고등학교 | 3 | 0.1%
덕산고등학교 | 3 | 0.1%
경일고등학교 | 3 | 0.1%
강동고등학교 | 3 | 0.1%
광남고등학교 | 3 | 0.1%
대진고등학교 | 3 | 0.1%
영동고등학교 | 3 | 0.1%
세화고등학교 | 3 | 0.1%
Other values (2224) | 2316 | 98.7%

Most occurring characters

ValueCountFrequency (%)
2512
 
14.3%
2388
 
13.6%
2371
 
13.5%
2346
 
13.4%
423
 
2.4%
392
 
2.2%
232
 
1.3%
217
 
1.2%
176
 
1.0%
168
 
1.0%
Other values (372) 6285
35.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 17483 | 99.8%
Lowercase Letter | 14 | 0.1%
Uppercase Letter | 10 | 0.1%
Space Separator | 2 | < 0.1%
Dash Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2512
14.4%
2388
 
13.7%
2371
 
13.6%
2346
 
13.4%
423
 
2.4%
392
 
2.2%
232
 
1.3%
217
 
1.2%
176
 
1.0%
168
 
1.0%
Other values (355) 6258
35.8%
Lowercase Letter
ValueCountFrequency (%)
s 4
28.6%
n 2
14.3%
e 2
14.3%
i 2
14.3%
l 1
 
7.1%
h 1
 
7.1%
g 1
 
7.1%
u 1
 
7.1%
Uppercase Letter
ValueCountFrequency (%)
P 2
20.0%
I 2
20.0%
T 2
20.0%
B 1
10.0%
K 1
10.0%
O 1
10.0%
E 1
10.0%
Space Separator
ValueCountFrequency (%)
2
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 17483
99.8%
Latin 24
 
0.1%
Common 3
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2512
14.4%
2388
 
13.7%
2371
 
13.6%
2346
 
13.4%
423
 
2.4%
392
 
2.2%
232
 
1.3%
217
 
1.2%
176
 
1.0%
168
 
1.0%
Other values (355) 6258
35.8%
Latin
ValueCountFrequency (%)
s 4
16.7%
P 2
 
8.3%
I 2
 
8.3%
T 2
 
8.3%
n 2
 
8.3%
e 2
 
8.3%
i 2
 
8.3%
B 1
 
4.2%
l 1
 
4.2%
h 1
 
4.2%
Other values (5) 5
20.8%
Common
ValueCountFrequency (%)
2
66.7%
- 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 17483
99.8%
ASCII 27
 
0.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2512
14.4%
2388
 
13.7%
2371
 
13.6%
2346
 
13.4%
423
 
2.4%
392
 
2.2%
232
 
1.3%
217
 
1.2%
176
 
1.0%
168
 
1.0%
Other values (355) 6258
35.8%
ASCII
ValueCountFrequency (%)
s 4
14.8%
P 2
 
7.4%
I 2
 
7.4%
T 2
 
7.4%
n 2
 
7.4%
e 2
 
7.4%
2
 
7.4%
i 2
 
7.4%
B 1
 
3.7%
l 1
 
3.7%
Other values (7) 7
25.9%

학교세부유형
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 7
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 18.4 KiB
일반고: 1608
특성화고 및 예체능고: 670
외국어고: 28
과학고: 19
영재고: 8
Other values (2): 11

Length

Max length: 11
Median length: 3
Mean length: 5.2986348
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 일반고
2nd row: 일반고
3rd row: 일반고
4th row: 일반고
5th row: 특성화고 및 예체능고

Common Values

Value | Count | Frequency (%)
일반고 | 1608 | 68.6%
특성화고 및 예체능고 | 670 | 28.6%
외국어고 | 28 | 1.2%
과학고 | 19 | 0.8%
영재고 | 8 | 0.3%
국제고 | 8 | 0.3%
자율고 | 3 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
일반고 | 1608 | 43.6%
특성화고 | 670 | 18.2%
및 | 670 | 18.2%
예체능고 | 670 | 18.2%
외국어고 | 28 | 0.8%
과학고 | 19 | 0.5%
영재고 | 8 | 0.2%
국제고 | 8 | 0.2%
자율고 | 3 | 0.1%
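
The last table in this section gives 일반고 a share of 43.6%, while the Common Values table gives it 68.6%. The difference appears to come from the denominator: the later percentages are consistent with counting whitespace-separated words rather than rows. A minimal sketch of that reading, using the value counts listed under Common Values (the whitespace tokenization is an assumption):

```python
from collections import Counter

# Value counts copied from the Common Values table for 학교세부유형.
value_counts = {
    "일반고": 1608,
    "특성화고 및 예체능고": 670,
    "외국어고": 28,
    "과학고": 19,
    "영재고": 8,
    "국제고": 8,
    "자율고": 3,
}

# Assumption: each value is split on whitespace and every word inherits
# the row count of the value it came from.
word_counts = Counter()
for value, count in value_counts.items():
    for word in value.split():
        word_counts[word] += count

total_words = sum(word_counts.values())
for word, count in word_counts.most_common():
    print(f"{word}: {count} ({count / total_words:.1%})")
```

With 670 rows contributing three words each, the word total is 3684 rather than 2344, which is why every share in the word table is lower than its row-based counterpart.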

고등학교유형
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 15
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 18.4 KiB
일반고: 1667
공업고: 229
상업고: 178
종합고: 47
농림업고: 45
Other values (10): 178

Length

Max length: 7
Median length: 3
Mean length: 3.0341297
Min length: 3

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 일반고
2nd row: 일반고
3rd row: 일반고
4th row: 일반고
5th row: 종합고

Common Values

Value | Count | Frequency (%)
일반고 | 1667 | 71.1%
공업고 | 229 | 9.8%
상업고 | 178 | 7.6%
종합고 | 47 | 2.0%
농림업고 | 45 | 1.9%
가사고 | 31 | 1.3%
외국어고 | 31 | 1.3%
예술고 | 29 | 1.2%
과학고 | 27 | 1.2%
실업고 | 23 | 1.0%
Other values (5) | 37 | 1.6%

Length

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
일반고 | 1667 | 71.1%
공업고 | 229 | 9.8%
상업고 | 178 | 7.6%
종합고 | 47 | 2.0%
농림업고 | 45 | 1.9%
가사고 | 31 | 1.3%
외국어고 | 31 | 1.3%
예술고 | 29 | 1.2%
과학고 | 27 | 1.2%
실업고 | 23 | 1.0%
Other values (5) | 37 | 1.6%

학교주소
Text

Distinct: 2328
Distinct (%): 99.3%
Missing: 0
Missing (%): 0.0%
Memory size: 18.4 KiB

Length

Max length: 46
Median length: 38
Mean length: 23.087031
Min length: 10

Characters and Unicode

Total characters: 54116
Distinct characters: 444
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2314
Unique (%): 98.7%

Sample

1st row: 강원도 삼척시 가곡면 가곡천로 1427 . 가곡고등학교
2nd row: 서울특별시 송파구 송이로 42
3rd row: 인천광역시 서구 원적로 58
4th row: 부산광역시 부산진구 엄광로 152
5th row: 경기도 안성시 샛터길 46 . 가온고등학교

Words

Value | Count | Frequency (%)
 | 642 | 5.5%
경기도 | 476 | 4.1%
서울특별시 | 319 | 2.7%
경상남도 | 187 | 1.6%
경상북도 | 181 | 1.5%
부산광역시 | 141 | 1.2%
전라남도 | 140 | 1.2%
전라북도 | 128 | 1.1%
인천광역시 | 124 | 1.1%
충청남도 | 115 | 1.0%
Other values (4180) | 9248 | 79.0%

Most occurring characters

ValueCountFrequency (%)
11663
 
21.6%
2052
 
3.8%
1987
 
3.7%
1573
 
2.9%
1 1571
 
2.9%
1393
 
2.6%
2 1068
 
2.0%
999
 
1.8%
973
 
1.8%
3 892
 
1.6%
Other values (434) 29945
55.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 33829 | 62.5%
Space Separator | 11663 | 21.6%
Decimal Number | 7664 | 14.2%
Other Punctuation | 664 | 1.2%
Dash Punctuation | 295 | 0.5%
Close Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2052
 
6.1%
1987
 
5.9%
1573
 
4.6%
1393
 
4.1%
999
 
3.0%
973
 
2.9%
831
 
2.5%
789
 
2.3%
778
 
2.3%
736
 
2.2%
Other values (420) 21718
64.2%
Decimal Number
ValueCountFrequency (%)
1 1571
20.5%
2 1068
13.9%
3 892
11.6%
5 719
9.4%
4 706
9.2%
6 626
 
8.2%
7 613
 
8.0%
0 562
 
7.3%
9 475
 
6.2%
8 432
 
5.6%
Space Separator
ValueCountFrequency (%)
11663
100.0%
Other Punctuation
ValueCountFrequency (%)
. 664
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 295
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 33829
62.5%
Common 20287
37.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2052
 
6.1%
1987
 
5.9%
1573
 
4.6%
1393
 
4.1%
999
 
3.0%
973
 
2.9%
831
 
2.5%
789
 
2.3%
778
 
2.3%
736
 
2.2%
Other values (420) 21718
64.2%
Common
ValueCountFrequency (%)
11663
57.5%
1 1571
 
7.7%
2 1068
 
5.3%
3 892
 
4.4%
5 719
 
3.5%
4 706
 
3.5%
. 664
 
3.3%
6 626
 
3.1%
7 613
 
3.0%
0 562
 
2.8%
Other values (4) 1203
 
5.9%

Most occurring blocks

ValueCountFrequency (%)
Hangul 33829
62.5%
ASCII 20287
37.5%

Most frequent character per block

ASCII
ValueCountFrequency (%)
11663
57.5%
1 1571
 
7.7%
2 1068
 
5.3%
3 892
 
4.4%
5 719
 
3.5%
4 706
 
3.5%
. 664
 
3.3%
6 626
 
3.1%
7 613
 
3.0%
0 562
 
2.8%
Other values (4) 1203
 
5.9%
Hangul
ValueCountFrequency (%)
2052
 
6.1%
1987
 
5.9%
1573
 
4.6%
1393
 
4.1%
999
 
3.0%
973
 
2.9%
831
 
2.5%
789
 
2.3%
778
 
2.3%
736
 
2.2%
Other values (420) 21718
64.2%

참여여부
Boolean

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB
True: 2342
False: 2

Value | Count | Frequency (%)
True | 2342 | 99.9%
False | 2 | 0.1%

Interactions


Correlations

Variable | 순번 | 학교세부유형 | 고등학교유형 | 참여여부
순번 | 1.000 | 0.067 | 0.105 | 0.000
학교세부유형 | 0.067 | 1.000 | 0.929 | 0.000
고등학교유형 | 0.105 | 0.929 | 1.000 | 0.752
참여여부 | 0.000 | 0.000 | 0.752 | 1.000

Variable | 고등학교유형 | 학교세부유형 | 참여여부
고등학교유형 | 1.000 | 0.763 | 0.703
학교세부유형 | 0.763 | 1.000 | 0.000
참여여부 | 0.703 | 0.000 | 1.000

Variable | 순번 | 학교세부유형 | 고등학교유형 | 참여여부
순번 | 1.000 | 0.034 | 0.039 | 0.000
학교세부유형 | 0.034 | 1.000 | 0.763 | 0.000
고등학교유형 | 0.039 | 0.763 | 1.000 | 0.703
참여여부 | 0.000 | 0.000 | 0.703 | 1.000
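
The report does not say which association measure backs each matrix. For pairs of categorical variables such as 학교세부유형 and 고등학교유형, one commonly used measure is Cramér's V, which rescales the chi-squared statistic of a contingency table to the range 0 to 1. The sketch below shows the plain, uncorrected form on hypothetical 2x2 tables. It is an illustration of the measure under that assumption, not a reproduction of the values above, which would require the underlying data.

```python
import math

def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table (list of count rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Hypothetical tables, not taken from the dataset.
perfect = [[10, 0], [0, 10]]      # one variable fully determines the other
independent = [[5, 5], [5, 5]]    # no association

print(cramers_v(perfect))      # prints 1.0
print(cramers_v(independent))  # prints 0.0
```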

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 순번 | 상품명 | 장학년도 | 고등학교명 | 학교세부유형 | 고등학교유형 | 학교주소 | 참여여부
0 | 1 | 고교 취업연계 장려금 | 2022 | 가곡고등학교 | 일반고 | 일반고 | 강원도 삼척시 가곡면 가곡천로 1427 . 가곡고등학교 | Y
1 | 2 | 고교 취업연계 장려금 | 2022 | 가락고등학교 | 일반고 | 일반고 | 서울특별시 송파구 송이로 42 | Y
2 | 3 | 고교 취업연계 장려금 | 2022 | 가림고등학교 | 일반고 | 일반고 | 인천광역시 서구 원적로 58 | Y
3 | 4 | 고교 취업연계 장려금 | 2022 | 가야고등학교 | 일반고 | 일반고 | 부산광역시 부산진구 엄광로 152 | Y
4 | 5 | 고교 취업연계 장려금 | 2022 | 가온고등학교 | 특성화고 및 예체능고 | 종합고 | 경기도 안성시 샛터길 46 . 가온고등학교 | Y
5 | 6 | 고교 취업연계 장려금 | 2022 | 가운고등학교 | 일반고 | 일반고 | 경기도 남양주시 가운로2길 115 | Y
6 | 7 | 고교 취업연계 장려금 | 2022 | 가은고등학교 | 일반고 | 일반고 | 경상북도 문경시 가은읍 가은로 205 . 가은고등학교 | Y
7 | 8 | 고교 취업연계 장려금 | 2022 | 가재울고등학교 | 일반고 | 일반고 | 서울특별시 서대문구 수색로 100-35 | Y
8 | 9 | 고교 취업연계 장려금 | 2022 | 가정고등학교 | 일반고 | 일반고 | 인천광역시 서구 서달로 162 | Y
9 | 10 | 고교 취업연계 장려금 | 2022 | 가좌고등학교 | 일반고 | 일반고 | 인천광역시 서구 장고개로287번길 24 | Y

Index | 순번 | 상품명 | 장학년도 | 고등학교명 | 학교세부유형 | 고등학교유형 | 학교주소 | 참여여부
2334 | 2335 | 고교 취업연계 장려금 | 2022 | 효정고등학교 | 일반고 | 일반고 | 울산광역시 북구 율동2길 41 | Y
2335 | 2336 | 고교 취업연계 장려금 | 2022 | 효청보건고등학교 | 특성화고 및 예체능고 | 가사고 | 경상북도 경주시 외동읍 모화남1길 26-132 | Y
2336 | 2337 | 고교 취업연계 장려금 | 2022 | 후포고등학교 | 일반고 | 일반고 | 경상북도 울진군 후포면 후포로 41 | Y
2337 | 2338 | 고교 취업연계 장려금 | 2022 | 휘경공업고등학교 | 특성화고 및 예체능고 | 공업고 | 서울특별시 동대문구 겸재로 21 . 휘경공업고등학교 | Y
2338 | 2339 | 고교 취업연계 장려금 | 2022 | 휘경여자고등학교 | 일반고 | 일반고 | 서울특별시 동대문구 한천로 247 | Y
2339 | 2340 | 고교 취업연계 장려금 | 2022 | 휘문고등학교 | 일반고 | 일반고 | 서울특별시 강남구 역삼로 541 | Y
2340 | 2341 | 고교 취업연계 장려금 | 2022 | 휘봉고등학교 | 일반고 | 일반고 | 서울특별시 동대문구 한천로 290 | Y
2341 | 2342 | 고교 취업연계 장려금 | 2022 | 흥덕고등학교 | 일반고 | 일반고 | 충청북도 청주시 흥덕구 증안로 9 | Y
2342 | 2343 | 고교 취업연계 장려금 | 2022 | 흥덕고등학교 | 일반고 | 일반고 | 경기도 용인시 기흥구 흥덕2로 36 | Y
2343 | 2344 | 고교 취업연계 장려금 | 2022 | 흥진고등학교 | 일반고 | 일반고 | 경기도 군포시 오금로 15-17 | Y