Overview

Dataset statistics

Number of variables: 5
Number of observations: 365
Missing cells: 132
Missing cells (%): 7.2%
Duplicate rows: 2
Duplicate rows (%): 0.5%
Total size in memory: 14.7 KiB
Average record size in memory: 41.4 B
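
The two percentages above follow directly from the counts. As a quick check, here is a minimal sketch that uses only the figures reported in this section. The variable names are illustrative and are not part of the report.

```python
# Figures copied from the Dataset statistics above.
n_variables = 5
n_observations = 365
missing_cells = 132
duplicate_rows = 2

# Missing cells are measured against every cell in the table,
# duplicate rows against the number of observations.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"duplicate rows: {duplicate_pct:.1f}%")
```

Note that the 36.2% figure in the Alerts section uses a different denominator: it is the share of rows missing in one column (132 of 365), not the share of all cells.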

Variable types

Categorical: 2
Text: 3

Dataset

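Description: Korean Education Center (한국교육원) information from the university finance information system of the Korea Advancing Schools Foundation (한국사학진흥재단). The file covers fiscal year, school/education center classification, local name, Korean name, address, number of students, homepage address, and region (education centers).
Author: 한국사학진흥재단 (Korea Advancing Schools Foundation)
URL: https://www.data.go.kr/data/15120055/fileData.do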

Alerts

Dataset has 2 (0.5%) duplicate rows (alert type: Duplicates)
홈페이지 링크주소 (homepage link address) has 132 (36.2%) missing values (alert type: Missing)

Reproduction

Analysis started: 2023-12-12 22:13:22.716193
Analysis finished: 2023-12-12 22:13:23.498105
Duration: 0.78 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

회계년도 (fiscal year)
Categorical

Distinct: 5
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

Value | Count
2023 | 88
2022 | 84
2021 | 79
2020 | 79
2024 | 35

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021
2nd row: 2021
3rd row: 2021
4th row: 2021
5th row: 2021

Common Values

Value | Count | Frequency (%)
2023 | 88 | 24.1%
2022 | 84 | 23.0%
2021 | 79 | 21.6%
2020 | 79 | 21.6%
2024 | 35 | 9.6%
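
Each frequency is the value's count divided by the total number of observations. A minimal sketch of that arithmetic, using the counts reported for 회계년도 (the dictionary and variable names are illustrative):

```python
# Counts copied from the Common Values table for 회계년도.
counts = {"2023": 88, "2022": 84, "2021": 79, "2020": 79, "2024": 35}

# The column has no missing values, so the counts sum to the
# number of observations.
total = sum(counts.values())

# Share of each value as a percentage, rounded to one decimal place.
shares = {year: round(100 * n / total, 1) for year, n in counts.items()}

for year, share in shares.items():
    print(f"{year}: {counts[year]} rows, {share}%")
```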

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of the value counts for 회계년도; same data as the Common Values table.]

학교_교육원 구분 (school / education center classification)
Categorical

Distinct: 2
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

Value | Count
교육원 | 189
학교 | 176

Length

Max length: 3
Median length: 3
Mean length: 2.5178082
Min length: 2
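
The mean length can be reproduced from the two category values and their counts, since 교육원 has three characters and 학교 has two. A minimal sketch under that reading (names are illustrative; the values are normalized to NFC so that each Hangul syllable counts as one character):

```python
import unicodedata

# Counts copied from the Common Values table for 학교_교육원 구분.
counts = {"교육원": 189, "학교": 176}

# Weighted character total: length of each value times its count.
total_chars = sum(
    len(unicodedata.normalize("NFC", value)) * n for value, n in counts.items()
)
mean_length = total_chars / sum(counts.values())

print(f"total characters: {total_chars}")
print(f"mean length: {mean_length:.7f}")
```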

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 교육원
2nd row: 교육원
3rd row: 교육원
4th row: 교육원
5th row: 학교

Common Values

Value | Count | Frequency (%)
교육원 | 189 | 51.8%
학교 | 176 | 48.2%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of the value counts for 학교_교육원 구분; same data as the Common Values table.]

현지 명칭 (local name)
Text

Distinct: 100
Distinct (%): 27.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

Length

Max length: 73
Median length: 45
Mean length: 16.460274
Min length: 2

Characters and Unicode

Total characters: 6008
Distinct characters: 227
Distinct categories: 8
Distinct scripts: 5
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
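
To illustrate what a per-category count looks like, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. This is an assumption-laden stand-in, not the profiler's own implementation: the helper name `category_counts` and the label mapping are hypothetical, and the standard library exposes general categories only, not scripts or blocks. The three sample strings are taken from the Sample section of this report.

```python
import unicodedata
from collections import Counter

# Readable labels for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counter = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw two-letter code for unlisted categories.
            counter[CATEGORY_NAMES.get(code, code)] += 1
    return counter


sample = [
    "Tokyo Korean School",
    "동경한국학교",
    "TAIPEI KOREAN SCHOOL/台北韓國學校",
]
result = category_counts(sample)
for name, n in result.most_common():
    print(f"{name}: {n}")
```

Hangul syllables and Han ideographs both fall under Other Letter, which is why that category dominates columns holding Korean or Chinese names.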

Unique

Unique: 9
Unique (%): 2.5%

Sample

1st row: 하바롭스크한국교육원
2nd row: 사할린한국교육원
3rd row: 시드니한국교육원
4th row: 뉴질랜드한국교육원
5th row: 범일한국학교

Value | Count | Frequency (%)
korean | 98 | 13.0%
school | 90 | 11.9%
international | 51 | 6.7%
of | 25 | 3.3%
in | 23 | 3.0%
coreano | 10 | 1.3%
주미대사관 | 6 | 0.8%
hongkong | 5 | 0.7%
hcmc | 5 | 0.7%
colegio | 5 | 0.7%
Other values (120) | 438 | 57.9%

Most occurring characters

ValueCountFrequency (%)
387
 
6.4%
o 346
 
5.8%
n 259
 
4.3%
a 227
 
3.8%
213
 
3.5%
203
 
3.4%
191
 
3.2%
O 186
 
3.1%
186
 
3.1%
182
 
3.0%
Other values (217) 3628
60.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2244 | 37.4%
Lowercase Letter | 1761 | 29.3%
Uppercase Letter | 1542 | 25.7%
Space Separator | 391 | 6.5%
Other Punctuation | 56 | 0.9%
Dash Punctuation | 6 | 0.1%
Close Punctuation | 4 | 0.1%
Open Punctuation | 4 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
213
 
9.5%
203
 
9.0%
191
 
8.5%
186
 
8.3%
182
 
8.1%
72
 
3.2%
33
 
1.5%
28
 
1.2%
28
 
1.2%
26
 
1.2%
Other values (161) 1082
48.2%
Uppercase Letter
ValueCountFrequency (%)
O 186
12.1%
A 153
9.9%
N 146
9.5%
S 136
8.8%
I 132
8.6%
K 123
8.0%
E 105
 
6.8%
H 90
 
5.8%
C 83
 
5.4%
L 82
 
5.3%
Other values (14) 306
19.8%
Lowercase Letter
ValueCountFrequency (%)
o 346
19.6%
n 259
14.7%
a 227
12.9%
r 131
 
7.4%
e 129
 
7.3%
i 116
 
6.6%
l 106
 
6.0%
h 96
 
5.5%
t 86
 
4.9%
c 63
 
3.6%
Other values (11) 202
11.5%
Other Punctuation
ValueCountFrequency (%)
/ 23
41.1%
· 19
33.9%
, 12
21.4%
& 2
 
3.6%
Space Separator
ValueCountFrequency (%)
387
99.0%
  4
 
1.0%
Close Punctuation
ValueCountFrequency (%)
) 3
75.0%
1
 
25.0%
Open Punctuation
ValueCountFrequency (%)
( 3
75.0%
1
 
25.0%
Dash Punctuation
ValueCountFrequency (%)
- 6
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 3303 | 55.0%
Hangul | 1753 | 29.2%
Han | 467 | 7.8%
Common | 461 | 7.7%
Katakana | 24 | 0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
213
 
12.2%
203
 
11.6%
191
 
10.9%
186
 
10.6%
182
 
10.4%
28
 
1.6%
28
 
1.6%
26
 
1.5%
26
 
1.5%
24
 
1.4%
Other values (109) 646
36.9%
Latin
ValueCountFrequency (%)
o 346
 
10.5%
n 259
 
7.8%
a 227
 
6.9%
O 186
 
5.6%
A 153
 
4.6%
N 146
 
4.4%
S 136
 
4.1%
I 132
 
4.0%
r 131
 
4.0%
e 129
 
3.9%
Other values (35) 1458
44.1%
Han
ValueCountFrequency (%)
72
 
15.4%
33
 
7.1%
26
 
5.6%
25
 
5.4%
21
 
4.5%
21
 
4.5%
21
 
4.5%
21
 
4.5%
20
 
4.3%
15
 
3.2%
Other values (35) 192
41.1%
Common
ValueCountFrequency (%)
387
83.9%
/ 23
 
5.0%
· 19
 
4.1%
, 12
 
2.6%
- 6
 
1.3%
  4
 
0.9%
) 3
 
0.7%
( 3
 
0.7%
& 2
 
0.4%
1
 
0.2%
Katakana
ValueCountFrequency (%)
6
25.0%
3
12.5%
3
12.5%
3
12.5%
3
12.5%
3
12.5%
3
12.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 3739 | 62.2%
Hangul | 1753 | 29.2%
CJK | 467 | 7.8%
None | 25 | 0.4%
Katakana | 24 | 0.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
387
 
10.4%
o 346
 
9.3%
n 259
 
6.9%
a 227
 
6.1%
O 186
 
5.0%
A 153
 
4.1%
N 146
 
3.9%
S 136
 
3.6%
I 132
 
3.5%
r 131
 
3.5%
Other values (42) 1636
43.8%
Hangul
ValueCountFrequency (%)
213
 
12.2%
203
 
11.6%
191
 
10.9%
186
 
10.6%
182
 
10.4%
28
 
1.6%
28
 
1.6%
26
 
1.5%
26
 
1.5%
24
 
1.4%
Other values (109) 646
36.9%
CJK
ValueCountFrequency (%)
72
 
15.4%
33
 
7.1%
26
 
5.6%
25
 
5.4%
21
 
4.5%
21
 
4.5%
21
 
4.5%
21
 
4.5%
20
 
4.3%
15
 
3.2%
Other values (35) 192
41.1%
None
ValueCountFrequency (%)
· 19
76.0%
  4
 
16.0%
1
 
4.0%
1
 
4.0%
Katakana
ValueCountFrequency (%)
6
25.0%
3
12.5%
3
12.5%
3
12.5%
3
12.5%
3
12.5%
3
12.5%

한글 명칭 (Korean name)
Text

Distinct: 91
Distinct (%): 24.9%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

Length

Max length: 12
Median length: 11
Mean length: 8.2739726
Min length: 6

Characters and Unicode

Total characters: 3020
Distinct characters: 146
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9
Unique (%): 2.5%

Sample

1st row: 하바롭스크한국교육원
2nd row: 사할린한국교육원
3rd row: 시드니한국교육원
4th row: 뉴질랜드한국교육원
5th row: 범일한국학교

Value | Count | Frequency (%)
주미대사관 | 6 | 1.6%
리야드한국학교 | 5 | 1.3%
젯다한국국제학교 | 5 | 1.3%
프놈펜한국국제학교 | 5 | 1.3%
까오숑한국국제학교 | 5 | 1.3%
하노이한국국제학교 | 5 | 1.3%
호치민시한국국제학교 | 5 | 1.3%
타이뻬이한국학교 | 5 | 1.3%
싱가포르한국국제학교 | 5 | 1.3%
연변한국국제학교 | 5 | 1.3%
Other values (82) | 321 | 86.3%

Most occurring characters

ValueCountFrequency (%)
440
14.6%
376
 
12.5%
340
 
11.3%
191
 
6.3%
179
 
5.9%
176
 
5.8%
85
 
2.8%
65
 
2.2%
38
 
1.3%
33
 
1.1%
Other values (136) 1097
36.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3005 | 99.5%
Uppercase Letter | 8 | 0.3%
Space Separator | 7 | 0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
440
14.6%
376
 
12.5%
340
 
11.3%
191
 
6.4%
179
 
6.0%
176
 
5.9%
85
 
2.8%
65
 
2.2%
38
 
1.3%
33
 
1.1%
Other values (133) 1082
36.0%
Uppercase Letter
ValueCountFrequency (%)
L 4
50.0%
A 4
50.0%
Space Separator
ValueCountFrequency (%)
7
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3005 | 99.5%
Latin | 8 | 0.3%
Common | 7 | 0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
440
14.6%
376
 
12.5%
340
 
11.3%
191
 
6.4%
179
 
6.0%
176
 
5.9%
85
 
2.8%
65
 
2.2%
38
 
1.3%
33
 
1.1%
Other values (133) 1082
36.0%
Latin
ValueCountFrequency (%)
L 4
50.0%
A 4
50.0%
Common
ValueCountFrequency (%)
7
100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3005 | 99.5%
ASCII | 15 | 0.5%

Most frequent character per block

Hangul
ValueCountFrequency (%)
440
14.6%
376
 
12.5%
340
 
11.3%
191
 
6.4%
179
 
6.0%
176
 
5.9%
85
 
2.8%
65
 
2.2%
38
 
1.3%
33
 
1.1%
Other values (133) 1082
36.0%
ASCII
ValueCountFrequency (%)
7
46.7%
L 4
26.7%
A 4
26.7%

홈페이지 링크주소 (homepage link address)
Text

Distinct: 64
Distinct (%): 27.5%
Missing: 132
Missing (%): 36.2%
Memory size: 3.0 KiB

Length

Max length: 44
Median length: 38
Mean length: 28.201717
Min length: 8

Characters and Unicode

Total characters: 6571
Distinct characters: 34
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): 1.7%

Sample

1st row: http://www.havaedu.com/main/main.php
2nd row: http://www.sakhalinedu.com/main/main.php
3rd row: http://www.auskec.kr/
4th row: http://www.nzkoreanedu.com/
5th row: http://www.nzkoreanedu.com/

Value | Count | Frequency (%)
http://www.nzkoreanedu.com | 14 | 6.0%
http://www.kecvn.com | 4 | 1.7%
http://www.cecp.or.kr/cms/page_home | 4 | 1.7%
http://www.havaedu.com/main/main.php | 4 | 1.7%
https://www.kecla.org | 4 | 1.7%
http://hiroshima.kankoku.or.kr/smain.html | 4 | 1.7%
http://shimonoseki.kankoku.or.kr/smain.html | 4 | 1.7%
http://www.klech.org | 4 | 1.7%
http://kecmy.com/ko | 4 | 1.7%
https://kecb.kg | 4 | 1.7%
Other values (54) | 183 | 78.5%

Most occurring characters

Value | Count | Frequency (%)
/ | 698 | 10.6%
t | 584 | 8.9%
. | 577 | 8.8%
k | 480 | 7.3%
o | 431 | 6.6%
w | 416 | 6.3%
h | 381 | 5.8%
a | 342 | 5.2%
r | 306 | 4.7%
p | 292 | 4.4%
Other values (24) | 2064 | 31.4%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 5041 | 76.7%
Other Punctuation | 1502 | 22.9%
Uppercase Letter | 10 | 0.2%
Dash Punctuation | 8 | 0.1%
Connector Punctuation | 7 | 0.1%
Math Symbol | 3 | < 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 584
11.6%
k 480
 
9.5%
o 431
 
8.5%
w 416
 
8.3%
h 381
 
7.6%
a 342
 
6.8%
r 306
 
6.1%
p 292
 
5.8%
n 260
 
5.2%
e 242
 
4.8%
Other values (15) 1307
25.9%
Other Punctuation
ValueCountFrequency (%)
/ 698
46.5%
. 577
38.4%
: 227
 
15.1%
Uppercase Letter
ValueCountFrequency (%)
H 4
40.0%
K 3
30.0%
R 3
30.0%
Dash Punctuation
ValueCountFrequency (%)
- 8
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 7
100.0%
Math Symbol
ValueCountFrequency (%)
= 3
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 5051 | 76.9%
Common | 1520 | 23.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 584
11.6%
k 480
 
9.5%
o 431
 
8.5%
w 416
 
8.2%
h 381
 
7.5%
a 342
 
6.8%
r 306
 
6.1%
p 292
 
5.8%
n 260
 
5.1%
e 242
 
4.8%
Other values (18) 1317
26.1%
Common
ValueCountFrequency (%)
/ 698
45.9%
. 577
38.0%
: 227
 
14.9%
- 8
 
0.5%
_ 7
 
0.5%
= 3
 
0.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 6571 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 698
 
10.6%
t 584
 
8.9%
. 577
 
8.8%
k 480
 
7.3%
o 431
 
6.6%
w 416
 
6.3%
h 381
 
5.8%
a 342
 
5.2%
r 306
 
4.7%
p 292
 
4.4%
Other values (24) 2064
31.4%

Correlations

 | 회계년도 | 학교_교육원 구분 | 현지 명칭 | 한글 명칭 | 홈페이지 링크주소
회계년도 | 1.000 | 0.265 | 0.000 | 0.000 | 0.000
학교_교육원 구분 | 0.265 | 1.000 | 1.000 | 1.000 | 0.994
현지 명칭 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000
한글 명칭 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000
홈페이지 링크주소 | 0.000 | 0.994 | 1.000 | 1.000 | 1.000

 | 회계년도 | 학교_교육원 구분
회계년도 | 1.000 | 0.323
학교_교육원 구분 | 0.323 | 1.000

 | 회계년도 | 학교_교육원 구분
회계년도 | 1.000 | 0.323
학교_교육원 구분 | 0.323 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Row | 회계년도 | 학교_교육원 구분 | 현지 명칭 | 한글 명칭 | 홈페이지 링크주소
0 | 2021 | 교육원 | 하바롭스크한국교육원 | 하바롭스크한국교육원 | http://www.havaedu.com/main/main.php
1 | 2021 | 교육원 | 사할린한국교육원 | 사할린한국교육원 | http://www.sakhalinedu.com/main/main.php
2 | 2021 | 교육원 | 시드니한국교육원 | 시드니한국교육원 | http://www.auskec.kr/
3 | 2021 | 교육원 | 뉴질랜드한국교육원 | 뉴질랜드한국교육원 | http://www.nzkoreanedu.com/
4 | 2021 | 학교 | 범일한국학교 | 범일한국학교 | http://www.nzkoreanedu.com/
5 | 2021 | 교육원 | 범일한국교육원 | 범일한국교육원 | http://www.nzkoreanedu.com/
6 | 2022 | 학교 | 범일한국학교 | 범일한국학교 | http://www.nzkoreanedu.com/
7 | 2022 | 학교 | Tokyo Korean School | 동경한국학교 | <NA>
8 | 2022 | 학교 | 京都國際中學高等學校 | 교토국제중고등학교 | <NA>
9 | 2022 | 학교 | OSAKA KONGO INTERNATIONAL ELEMENTARY-MIDDLE-HIGH SCHOOL/大阪金剛インタナショナル小中高等校 | 오사카금강학교 | https://www.kongogakuen.ed.jp/

Last rows

Row | 회계년도 | 학교_교육원 구분 | 현지 명칭 | 한글 명칭 | 홈페이지 링크주소
355 | 2024 | 학교 | 威海永仁外籍人子女校 | 웨이하이한국학교 | http://www.weihaischool.org/
356 | 2024 | 학교 | 大連韓國國際學校 | 대련한국국제학교 | <NA>
357 | 2024 | 학교 | 沈陽韓國國際學校 | 선양한국국제학교 | http://www.sykis.org
358 | 2024 | 학교 | KOREA INTERNATIONAL SCHOOL IN YANBIAN/延外籍人子女校 | 연변한국국제학교 | <NA>
359 | 2024 | 학교 | GUANGZHOU KOREAN SCHOOL | 광저우한국학교 | http://gks.or.kr
360 | 2024 | 학교 | Korean International School in Hongkong | 홍콩한국국제학교 | <NA>
361 | 2024 | 학교 | TAIPEI KOREAN SCHOOL/台北韓國學校 | 타이뻬이한국학교 | <NA>
362 | 2024 | 학교 | Kaohsiung korean international school | 까오숑한국국제학교 | <NA>
363 | 2024 | 학교 | Korean International School In Hanoi | 하노이한국국제학교 | http://www.hanoischool.net/
364 | 2024 | 학교 | KOREAN INTERNATIONAL SCHOOL OF JEDDAH | 젯다한국국제학교 | <NA>

Duplicate rows

Most frequently occurring

Row | 회계년도 | 학교_교육원 구분 | 현지 명칭 | 한글 명칭 | 홈페이지 링크주소 | # duplicates
0 | 2023 | 교육원 | 주미대사관 교육관 | 주미대사관 교육관 | <NA> | 3
1 | 2023 | 교육원 | 주미대사관 교육관실 | 주미대사관 교육관실 | <NA> | 2