Overview

Dataset statistics

Number of variables: 5
Number of observations: 3680
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 6
Duplicate rows (%): 0.2%
Total size in memory: 147.5 KiB
Average record size in memory: 41.0 B
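
The derived figures in this table follow from the raw counts. The sketch below recomputes two of them, assuming that percentages are rounded to one decimal place and that 1 KiB is 1024 bytes. The variable names are illustrative and are not part of the report.

```python
# Raw figures copied from the dataset statistics in this report.
n_rows = 3680
n_duplicate_rows = 6
total_size_kib = 147.5

# Share of duplicate rows, in percent.
duplicate_pct = n_duplicate_rows / n_rows * 100

# Average record size in bytes (assumption: 1 KiB = 1024 B).
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both results round to the values shown in the report (0.2% and 41.0 B).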

Variable types

Text: 2
Numeric: 1
Categorical: 2

Dataset

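Description: "School information attachment list" data from the "Comprehensive Guide to Foreign Educational Institutions and Foreign Schools" system maintained by the National Research Foundation of Korea (한국연구재단). It consists of the columns 학교명 (school name), 기준년도 (reference year), 분기 (quarter), 파일구분 (file category), and 파일명 (file name).
URL: https://www.data.go.kr/data/15117892/fileData.do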

Alerts

Dataset has 6 (0.2%) duplicate rows [Duplicates]
분기 (quarter) is highly imbalanced (56.3%) [Imbalance]
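
The report does not state how the imbalance percentage is computed. One common definition is one minus the normalized Shannon entropy of the class distribution, and that definition reproduces the 56.3% shown for 분기. The sketch below is written under that assumption; the function name is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits) of the
    class distribution and k is the number of classes.
    0 means perfectly balanced; values near 1 mean one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts for 분기 taken from this report: 1분기 = 3348, 2분기 = 332.
score = imbalance_score([3348, 332])
print(f"{score:.1%}")
```

A 50/50 split would score 0.0 under this definition, so a 91%/9% split landing at roughly 0.56 is consistent with the alert.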

Reproduction

Analysis started: 2023-12-12 17:54:55.914209
Analysis finished: 2023-12-12 17:54:56.976680
Duration: 1.06 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

학교명 (school name)
Text

Distinct: 55
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 28.9 KiB

Length

Max length: 18
Median length: 14
Mean length: 9.3407609
Min length: 6
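
Minimum, median, and mean constrain one another: if the median is m and the minimum is a, at least half of the values are at least m and the rest are at least a, so the mean cannot fall below (a + m) / 2. The sketch below applies that check to the figures above (the helper name is hypothetical). The result suggests the reported median length may not be a row-level median and is worth rechecking against the raw data.

```python
def min_possible_mean(minimum, median):
    """Lower bound on the mean of any sample with this minimum and median:
    at least half the values are >= median and the rest are >= minimum."""
    return (minimum + median) / 2

# Length statistics reported for this variable.
reported_min, reported_median, reported_mean = 6, 14, 9.3407609

bound = min_possible_mean(reported_min, reported_median)
consistent = reported_mean >= bound
print(bound, consistent)
```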

Characters and Unicode

Total characters: 34374
Distinct characters: 124
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
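
As an illustration of the category breakdown, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. It covers categories only; the standard library has no script or block lookup, so those parts of the report are not reproduced here. The function name is hypothetical.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories (e.g. 'Lo', 'Lu', 'Zs') over all
    characters in an iterable of strings."""
    return Counter(unicodedata.category(ch) for text in values for ch in text)

# One value from this variable's sample rows.
counts = category_counts(["독일FAU대학교 부산캠퍼스"])

# 'Lo' = Other Letter (the Hangul syllables), 'Lu' = Uppercase Letter,
# 'Zs' = Space Separator, matching the category names used in this report.
print(counts["Lo"], counts["Lu"], counts["Zs"])
```

For this single value the tally is 10 Other Letter, 3 Uppercase Letter, and 1 Space Separator characters.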

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 독일FAU대학교 부산캠퍼스
2nd row: 독일FAU대학교 부산캠퍼스
3rd row: 독일FAU대학교 부산캠퍼스
4th row: 한국켄트외국인학교
5th row: 서울독일학교
Value | Count | Frequency (%)
한국뉴욕주립대학교 | 157 | 3.6%
채드윅송도국제학교 | 138 | 3.1%
하비에르 | 129 | 2.9%
국제학교 | 129 | 2.9%
애서튼국제외국인학교 | 111 | 2.5%
서울프랑스학교 | 108 | 2.5%
부산국제외국인학교 | 103 | 2.3%
sbu | 103 | 2.3%
덜위치칼리지서울영국학교 | 103 | 2.3%
재한몽골학교 | 95 | 2.2%
Other values (54) | 3219 | 73.2%
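
This table appears to count whitespace-separated tokens after lowercasing: `sbu` is listed in lowercase, while the sample rows later in the report contain `SBU`. The sketch below counts words under that assumption. The report does not document its tokenizer, and the function name is hypothetical.

```python
from collections import Counter

def word_counts(values):
    """Count whitespace-separated tokens, lowercased, across all values."""
    return Counter(token.lower() for text in values for token in text.split())

# A few school names taken from this report's sample rows;
# a real run would pass the full 학교명 column.
names = ["한국뉴욕주립대학교 SBU", "한국뉴욕주립대학교 FIT", "채드윅송도국제학교"]
counts = word_counts(names)
print(counts.most_common(1))
```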

Most occurring characters

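Value | Count | Frequency (%)
(not recovered) | 4285 | 12.5%
(not recovered) | 3557 | 10.3%
(not recovered) | 3219 | 9.4%
(not recovered) | 1574 | 4.6%
(not recovered) | 1372 | 4.0%
(not recovered) | 1007 | 2.9%
(not recovered) | 983 | 2.9%
(not recovered) | 845 | 2.5%
(not recovered) | 783 | 2.3%
(not recovered) | 734 | 2.1%
Other values (114) | 16015 | 46.6%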

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 32637 | 94.9%
Space Separator | 715 | 2.1%
Uppercase Letter | 675 | 2.0%
Close Punctuation | 139 | 0.4%
Open Punctuation | 139 | 0.4%
Other Punctuation | 57 | 0.2%
Decimal Number | 12 | < 0.1%

Most frequent character per category

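Other Letter
Value | Count | Frequency (%)
(not recovered) | 4285 | 13.1%
(not recovered) | 3557 | 10.9%
(not recovered) | 3219 | 9.9%
(not recovered) | 1574 | 4.8%
(not recovered) | 1372 | 4.2%
(not recovered) | 1007 | 3.1%
(not recovered) | 983 | 3.0%
(not recovered) | 845 | 2.6%
(not recovered) | 783 | 2.4%
(not recovered) | 734 | 2.2%
Other values (101) | 14278 | 43.7%

Uppercase Letter
Value | Count | Frequency (%)
U | 171 | 25.3%
F | 122 | 18.1%
B | 103 | 15.3%
S | 103 | 15.3%
A | 68 | 10.1%
I | 54 | 8.0%
T | 54 | 8.0%

Decimal Number
Value | Count | Frequency (%)
0 | 8 | 66.7%
1 | 4 | 33.3%

Space Separator
Value | Count | Frequency (%)
(space) | 715 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 139 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 139 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
· | 57 | 100.0%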

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 32637 | 94.9%
Common | 1062 | 3.1%
Latin | 675 | 2.0%

Most frequent character per script

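Hangul
Value | Count | Frequency (%)
(not recovered) | 4285 | 13.1%
(not recovered) | 3557 | 10.9%
(not recovered) | 3219 | 9.9%
(not recovered) | 1574 | 4.8%
(not recovered) | 1372 | 4.2%
(not recovered) | 1007 | 3.1%
(not recovered) | 983 | 3.0%
(not recovered) | 845 | 2.6%
(not recovered) | 783 | 2.4%
(not recovered) | 734 | 2.2%
Other values (101) | 14278 | 43.7%

Latin
Value | Count | Frequency (%)
U | 171 | 25.3%
F | 122 | 18.1%
B | 103 | 15.3%
S | 103 | 15.3%
A | 68 | 10.1%
I | 54 | 8.0%
T | 54 | 8.0%

Common
Value | Count | Frequency (%)
(space) | 715 | 67.3%
) | 139 | 13.1%
( | 139 | 13.1%
· | 57 | 5.4%
0 | 8 | 0.8%
1 | 4 | 0.4%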

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 32637 | 94.9%
ASCII | 1680 | 4.9%
None | 57 | 0.2%

Most frequent character per block

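Hangul
Value | Count | Frequency (%)
(not recovered) | 4285 | 13.1%
(not recovered) | 3557 | 10.9%
(not recovered) | 3219 | 9.9%
(not recovered) | 1574 | 4.8%
(not recovered) | 1372 | 4.2%
(not recovered) | 1007 | 3.1%
(not recovered) | 983 | 3.0%
(not recovered) | 845 | 2.6%
(not recovered) | 783 | 2.4%
(not recovered) | 734 | 2.2%
Other values (101) | 14278 | 43.7%

ASCII
Value | Count | Frequency (%)
(space) | 715 | 42.6%
U | 171 | 10.2%
) | 139 | 8.3%
( | 139 | 8.3%
F | 122 | 7.3%
B | 103 | 6.1%
S | 103 | 6.1%
A | 68 | 4.0%
I | 54 | 3.2%
T | 54 | 3.2%
Other values (2) | 12 | 0.7%

None
Value | Count | Frequency (%)
· | 57 | 100.0%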

기준년도 (reference year)
Real number (ℝ)

Distinct: 10
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2017.6856
Minimum: 2013
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 32.5 KiB

Quantile statistics

Minimum: 2013
5-th percentile: 2013
Q1: 2015
Median: 2018
Q3: 2020
95-th percentile: 2022
Maximum: 2022
Range: 9
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.7668895
Coefficient of variation (CV): 0.0013713184
Kurtosis: -1.1373653
Mean: 2017.6856
Median Absolute Deviation (MAD): 2
Skewness: -0.049853763
Sum: 7425083
Variance: 7.6556773
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=10)]

Common Values

Value | Count | Frequency (%)
2018 | 423 | 11.5%
2017 | 403 | 11.0%
2019 | 384 | 10.4%
2016 | 379 | 10.3%
2022 | 378 | 10.3%
2020 | 376 | 10.2%
2021 | 373 | 10.1%
2015 | 365 | 9.9%
2014 | 327 | 8.9%
2013 | 272 | 7.4%

Minimum 10 values

Value | Count | Frequency (%)
2013 | 272 | 7.4%
2014 | 327 | 8.9%
2015 | 365 | 9.9%
2016 | 379 | 10.3%
2017 | 403 | 11.0%
2018 | 423 | 11.5%
2019 | 384 | 10.4%
2020 | 376 | 10.2%
2021 | 373 | 10.1%
2022 | 378 | 10.3%

Maximum 10 values

Value | Count | Frequency (%)
2022 | 378 | 10.3%
2021 | 373 | 10.1%
2020 | 376 | 10.2%
2019 | 384 | 10.4%
2018 | 423 | 11.5%
2017 | 403 | 11.0%
2016 | 379 | 10.3%
2015 | 365 | 9.9%
2014 | 327 | 8.9%
2013 | 272 | 7.4%
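
Because 기준년도 takes only ten distinct values, the descriptive statistics reported for it can be cross-checked from the frequency table alone. The sketch below does so, assuming the reported variance is the sample variance (denominator n - 1), which is the choice that matches the report's figure. Variable names are illustrative.

```python
import math

# Year -> count, copied from the frequency table for 기준년도.
freq = {2013: 272, 2014: 327, 2015: 365, 2016: 379, 2017: 403,
        2018: 423, 2019: 384, 2020: 376, 2021: 373, 2022: 378}

n = sum(freq.values())                                  # number of observations
total = sum(year * count for year, count in freq.items())
mean = total / n

# Sample variance (ddof = 1): assumption, chosen because it matches the report.
variance = sum(count * (year - mean) ** 2 for year, count in freq.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean                                         # coefficient of variation

print(n, total)
print(round(mean, 4), round(variance, 4), round(std, 4))
```

The recomputed count (3680), sum (7425083), mean, variance, standard deviation, and coefficient of variation agree with the reported values to the precision shown.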

분기 (quarter)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 28.9 KiB
1분기: 3348
2분기: 332

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1분기
2nd row: 1분기
3rd row: 1분기
4th row: 1분기
5th row: 1분기

Common Values

Value | Count | Frequency (%)
1분기 | 3348 | 91.0%
2분기 | 332 | 9.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values: 1분기 3348 (91.0%), 2분기 332 (9.0%)]

파일구분 (file category)
Categorical

Distinct: 14
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 28.9 KiB
학교 규칙 등 학교 운영 규정 / 학교 규칙 및 학사 운영 규정: 677
회계결산서: 518
예·결산 내역: 517
학교교육과정 편성 운영 및 평가: 422
학력인정: 413
Other values (9): 1133

Length

Max length: 35
Median length: 14
Mean length: 12.49212
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 학교 규칙 등 학교 운영 규정 / 학교 규칙 및 학사 운영 규정
2nd row: 학교 규칙 등 학교 운영 규정 / 학교 규칙 및 학사 운영 규정
3rd row: 학교 규칙 등 학교 운영 규정 / 학교 규칙 및 학사 운영 규정
4th row: 회계결산서
5th row: 학력인정

Common Values

Value | Count | Frequency (%)
학교 규칙 등 학교 운영 규정 / 학교 규칙 및 학사 운영 규정 | 677 | 18.4%
회계결산서 | 518 | 14.1%
예·결산 내역 | 517 | 14.0%
학교교육과정 편성 운영 및 평가 | 422 | 11.5%
학력인정 | 413 | 11.2%
교육과정 | 330 | 9.0%
교과 외 활동 | 256 | 7.0%
학비 | 193 | 5.2%
교육운영 특색사업 계획 | 130 | 3.5%
신입생 모집 요강 | 125 | 3.4%
Other values (4) | 99 | 2.7%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
학교 | 2031 | 13.4%
운영 | 1776 | 11.7%
규칙 | 1354 | 8.9%
규정 | 1354 | 8.9%
(not recovered) | 1120 | 7.4%
(not recovered) | 677 | 4.5%
(not recovered) | 677 | 4.5%
학사 | 677 | 4.5%
회계결산서 | 518 | 3.4%
예·결산 | 517 | 3.4%
Other values (25) | 4469 | 29.5%

파일명 (file name)
Text

Distinct: 2757
Distinct (%): 74.9%
Missing: 0
Missing (%): 0.0%
Memory size: 28.9 KiB

Length

Max length: 94
Median length: 62
Mean length: 26.102717
Min length: 6

Characters and Unicode

Total characters: 96058
Distinct characters: 497
Distinct categories: 13
Distinct scripts: 5
Distinct blocks: 8
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2304
Unique (%): 62.6%

Sample

1st row: 1_1_FAU Busan - Immatrikulationssatzung dt-kor_등록 일반규정.pdf
2nd row: 1_2_FAU Busan - Seiten_aus_APO_TechFak_kor_시험규정.pdf
3rd row: 1_3_FAU Busan - Ueberblick Bayerisches Hochschulrecht dt-kor_독일 대학법 개요.pdf
4th row: 2012.7~2013.6. 학교 운영비.pdf
5th row: DSSI Schulprogramm 학교교육과정 및 편성.doc
Value | Count | Frequency (%)
(not recovered) | 167 | 1.5%
(not recovered) | 161 | 1.5%
교육과정 | 152 | 1.4%
school | 149 | 1.4%
교비회계 | 142 | 1.3%
of | 99 | 0.9%
예산서.pdf | 93 | 0.8%
결산서.pdf | 89 | 0.8%
학교 | 78 | 0.7%
2-1 | 77 | 0.7%
Other values (2842) | 9745 | 89.0%

Most occurring characters

Value | Count | Frequency (%)
(space) | 7472 | 7.8%
2 | 5038 | 5.2%
. | 4284 | 4.5%
1 | 3924 | 4.1%
d | 3712 | 3.9%
0 | 3663 | 3.8%
p | 3244 | 3.4%
f | 2888 | 3.0%
(not recovered) | 2566 | 2.7%
- | 2473 | 2.6%
Other values (487) | 56794 | 59.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 29137 | 30.3%
Other Letter | 26574 | 27.7%
Decimal Number | 15979 | 16.6%
Space Separator | 7492 | 7.8%
Uppercase Letter | 6856 | 7.1%
Other Punctuation | 4530 | 4.7%
Dash Punctuation | 2480 | 2.6%
Connector Punctuation | 1478 | 1.5%
Close Punctuation | 721 | 0.8%
Open Punctuation | 715 | 0.7%
Other values (3) | 96 | 0.1%

Most frequent character per category

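Other Letter
Value | Count | Frequency (%)
(not recovered) | 2566 | 9.7%
(not recovered) | 2425 | 9.1%
(not recovered) | 968 | 3.6%
(not recovered) | 912 | 3.4%
(not recovered) | 790 | 3.0%
(not recovered) | 721 | 2.7%
(not recovered) | 659 | 2.5%
(not recovered) | 599 | 2.3%
(not recovered) | 541 | 2.0%
(not recovered) | 489 | 1.8%
Other values (400) | 15904 | 59.8%

Lowercase Letter
Value | Count | Frequency (%)
d | 3712 | 12.7%
p | 3244 | 11.1%
f | 2888 | 9.9%
e | 2211 | 7.6%
o | 1849 | 6.3%
i | 1475 | 5.1%
n | 1469 | 5.0%
a | 1438 | 4.9%
t | 1422 | 4.9%
c | 1412 | 4.8%
Other values (16) | 8017 | 27.5%

Uppercase Letter
Value | Count | Frequency (%)
S | 1344 | 19.6%
C | 671 | 9.8%
A | 650 | 9.5%
I | 489 | 7.1%
F | 467 | 6.8%
P | 351 | 5.1%
E | 319 | 4.7%
B | 315 | 4.6%
Y | 271 | 4.0%
K | 219 | 3.2%
Other values (15) | 1760 | 25.7%

Decimal Number
Value | Count | Frequency (%)
2 | 5038 | 31.5%
1 | 3924 | 24.6%
0 | 3663 | 22.9%
9 | 568 | 3.6%
3 | 534 | 3.3%
6 | 517 | 3.2%
8 | 477 | 3.0%
7 | 435 | 2.7%
5 | 424 | 2.7%
4 | 395 | 2.5%
Other values (2) | 4 | < 0.1%

Other Punctuation
Value | Count | Frequency (%)
. | 4284 | 94.6%
(not recovered) | 105 | 2.3%
& | 73 | 1.6%
, | 24 | 0.5%
· | 21 | 0.5%
# | 16 | 0.4%
' | 7 | 0.2%

Close Punctuation
Value | Count | Frequency (%)
) | 668 | 92.6%
] | 52 | 7.2%
(not recovered) | 1 | 0.1%

Open Punctuation
Value | Count | Frequency (%)
( | 662 | 92.6%
[ | 52 | 7.3%
(not recovered) | 1 | 0.1%

Letter Number
Value | Count | Frequency (%)
(not recovered) | 4 | 36.4%
(not recovered) | 4 | 36.4%
(not recovered) | 3 | 27.3%

Space Separator
Value | Count | Frequency (%)
(space) | 7472 | 99.7%
(non-ASCII space) | 20 | 0.3%

Dash Punctuation
Value | Count | Frequency (%)
- | 2473 | 99.7%
(not recovered) | 7 | 0.3%

Math Symbol
Value | Count | Frequency (%)
~ | 81 | 96.4%
+ | 3 | 3.6%

Connector Punctuation
Value | Count | Frequency (%)
_ | 1478 | 100.0%

Other Symbol
Value | Count | Frequency (%)
(not recovered) | 1 | 100.0%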

Most occurring scripts

Value | Count | Frequency (%)
Latin | 36004 | 37.5%
Common | 33480 | 34.9%
Hangul | 25296 | 26.3%
Han | 1276 | 1.3%
Hiragana | 2 | < 0.1%

Most frequent character per script

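Hangul
Value | Count | Frequency (%)
(not recovered) | 2566 | 10.1%
(not recovered) | 2425 | 9.6%
(not recovered) | 968 | 3.8%
(not recovered) | 912 | 3.6%
(not recovered) | 790 | 3.1%
(not recovered) | 721 | 2.9%
(not recovered) | 659 | 2.6%
(not recovered) | 599 | 2.4%
(not recovered) | 541 | 2.1%
(not recovered) | 489 | 1.9%
Other values (259) | 14626 | 57.8%

Han
Value | Count | Frequency (%)
(not recovered) | 100 | 7.8%
(not recovered) | 53 | 4.2%
(not recovered) | 48 | 3.8%
(not recovered) | 37 | 2.9%
(not recovered) | 35 | 2.7%
(not recovered) | 35 | 2.7%
(not recovered) | 33 | 2.6%
(not recovered) | 32 | 2.5%
(not recovered) | 30 | 2.4%
(not recovered) | 28 | 2.2%
Other values (129) | 845 | 66.2%

Latin
Value | Count | Frequency (%)
d | 3712 | 10.3%
p | 3244 | 9.0%
f | 2888 | 8.0%
e | 2211 | 6.1%
o | 1849 | 5.1%
i | 1475 | 4.1%
n | 1469 | 4.1%
a | 1438 | 4.0%
t | 1422 | 3.9%
c | 1412 | 3.9%
Other values (44) | 14884 | 41.3%

Common
Value | Count | Frequency (%)
(space) | 7472 | 22.3%
2 | 5038 | 15.0%
. | 4284 | 12.8%
1 | 3924 | 11.7%
0 | 3663 | 10.9%
- | 2473 | 7.4%
_ | 1478 | 4.4%
) | 668 | 2.0%
( | 662 | 2.0%
9 | 568 | 1.7%
Other values (23) | 3250 | 9.7%

Hiragana
Value | Count | Frequency (%)
(not recovered) | 1 | 50.0%
(not recovered) | 1 | 50.0%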

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 69313 | 72.2%
Hangul | 25296 | 26.3%
CJK | 1276 | 1.3%
None | 152 | 0.2%
Number Forms | 11 | < 0.1%
Punctuation | 7 | < 0.1%
Hiragana | 2 | < 0.1%
Geometric Shapes | 1 | < 0.1%

Most frequent character per block

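ASCII
Value | Count | Frequency (%)
(space) | 7472 | 10.8%
2 | 5038 | 7.3%
. | 4284 | 6.2%
1 | 3924 | 5.7%
d | 3712 | 5.4%
0 | 3663 | 5.3%
p | 3244 | 4.7%
f | 2888 | 4.2%
- | 2473 | 3.6%
e | 2211 | 3.2%
Other values (65) | 30404 | 43.9%

Hangul
Value | Count | Frequency (%)
(not recovered) | 2566 | 10.1%
(not recovered) | 2425 | 9.6%
(not recovered) | 968 | 3.8%
(not recovered) | 912 | 3.6%
(not recovered) | 790 | 3.1%
(not recovered) | 721 | 2.9%
(not recovered) | 659 | 2.6%
(not recovered) | 599 | 2.4%
(not recovered) | 541 | 2.1%
(not recovered) | 489 | 1.9%
Other values (259) | 14626 | 57.8%

None
Value | Count | Frequency (%)
(not recovered) | 105 | 69.1%
· | 21 | 13.8%
(non-ASCII space) | 20 | 13.2%
(not recovered) | 3 | 2.0%
(not recovered) | 1 | 0.7%
(not recovered) | 1 | 0.7%
(not recovered) | 1 | 0.7%

CJK
Value | Count | Frequency (%)
(not recovered) | 100 | 7.8%
(not recovered) | 53 | 4.2%
(not recovered) | 48 | 3.8%
(not recovered) | 37 | 2.9%
(not recovered) | 35 | 2.7%
(not recovered) | 35 | 2.7%
(not recovered) | 33 | 2.6%
(not recovered) | 32 | 2.5%
(not recovered) | 30 | 2.4%
(not recovered) | 28 | 2.2%
Other values (129) | 845 | 66.2%

Punctuation
Value | Count | Frequency (%)
(not recovered) | 7 | 100.0%

Number Forms
Value | Count | Frequency (%)
(not recovered) | 4 | 36.4%
(not recovered) | 4 | 36.4%
(not recovered) | 3 | 27.3%

Hiragana
Value | Count | Frequency (%)
(not recovered) | 1 | 50.0%
(not recovered) | 1 | 50.0%

Geometric Shapes
Value | Count | Frequency (%)
(not recovered) | 1 | 100.0%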

Interactions


Correlations

Variable | 학교명 | 기준년도 | 분기 | 파일구분
학교명 | 1.000 | 0.143 | 0.630 | 0.692
기준년도 | 0.143 | 1.000 | 0.079 | 0.050
분기 | 0.630 | 0.079 | 1.000 | 0.492
파일구분 | 0.692 | 0.050 | 0.492 | 1.000

Variable | 분기 | 파일구분
분기 | 1.000 | 0.386
파일구분 | 0.386 | 1.000

Variable | 기준년도 | 분기 | 파일구분
기준년도 | 1.000 | 0.118 | 0.022
분기 | 0.118 | 1.000 | 0.386
파일구분 | 0.022 | 0.386 | 1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly pick out patterns in data completion.]

Sample

First rows

Index | 학교명 | 기준년도 | 분기 | 파일구분 | 파일명
0 | 독일FAU대학교 부산캠퍼스 | 2013 | 1분기 | 학교 규칙 등 학교 운영 규정 / 학교 규칙 및 학사 운영 규정 | 1_1_FAU Busan - Immatrikulationssatzung dt-kor_등록 일반규정.pdf
1 | 독일FAU대학교 부산캠퍼스 | 2013 | 1분기 | 학교 규칙 등 학교 운영 규정 / 학교 규칙 및 학사 운영 규정 | 1_2_FAU Busan - Seiten_aus_APO_TechFak_kor_시험규정.pdf
2 | 독일FAU대학교 부산캠퍼스 | 2013 | 1분기 | 학교 규칙 등 학교 운영 규정 / 학교 규칙 및 학사 운영 규정 | 1_3_FAU Busan - Ueberblick Bayerisches Hochschulrecht dt-kor_독일 대학법 개요.pdf
3 | 한국켄트외국인학교 | 2013 | 1분기 | 회계결산서 | 2012.7~2013.6. 학교 운영비.pdf
4 | 서울독일학교 | 2013 | 1분기 | 학력인정 | DSSI Schulprogramm 학교교육과정 및 편성.doc
5 | 서울독일학교 | 2013 | 1분기 | 학교교육과정 편성 운영 및 평가 | DSSI Schulprogramm 학교교육과정 및 편성.doc
6 | 코리아외국인학교 | 2013 | 1분기 | 학교 규칙 등 학교 운영 규정 / 학교 규칙 및 학사 운영 규정 | 학칙.pdf
7 | 부산외국인학교 | 2013 | 1분기 | 학교 규칙 등 학교 운영 규정 / 학교 규칙 및 학사 운영 규정 | SchoolHandbookSchoolYear201314.pdf
8 | 독일FAU대학교 부산캠퍼스 | 2013 | 1분기 | 예·결산 내역 | 2013년도 예산계획서.pdf
9 | 독일FAU대학교 부산캠퍼스 | 2013 | 1분기 | 회계결산서 | 2012년도 정산내역서.pdf

Last rows

Index | 학교명 | 기준년도 | 분기 | 파일구분 | 파일명
3670 | 한국뉴욕주립대학교 SBU | 2022 | 2분기 | 회계결산서 | 2021-22 한국뉴욕주립대학교 (외국교육기관회계기준).pdf
3671 | 한국뉴욕주립대학교 FIT | 2022 | 2분기 | 회계결산서 | 2021-22 한국뉴욕주립대학교 (외국교육기관회계기준).pdf
3672 | 한국뉴욕주립대학교 SBU | 2022 | 2분기 | 신입생 모집 요강 | 3-a. 한국뉴욕주립대학교 대학원과정 신입생 모집요강 (2023).pdf
3673 | 한국뉴욕주립대학교 SBU | 2022 | 2분기 | 신입생 모집 요강 | 3-b. 한국뉴욕주립대학교 학부과정 신입생 모집요강 (2023).pdf
3674 | 한국뉴욕주립대학교 FIT | 2022 | 2분기 | 신입생 모집 요강 | 3-c. 한국뉴욕주립대학교 패션기술대학 학부과정 신입생 모집요강 (2023).pdf
3675 | 채드윅송도국제학교 | 2022 | 2분기 | 학교 규칙 등 학교 운영 규정 / 학교 규칙 및 학사 운영 규정 | 2022-2023 VillageSchool Handbook.pdf.pdf
3676 | 채드윅송도국제학교 | 2022 | 2분기 | 학교 규칙 등 학교 운영 규정 / 학교 규칙 및 학사 운영 규정 | 2022-2023 MiddleSchool Handbook.pdf.pdf
3677 | 채드윅송도국제학교 | 2022 | 2분기 | 학교 규칙 등 학교 운영 규정 / 학교 규칙 및 학사 운영 규정 | 2022-2023 UpperSchool tHandbook.pdf.pdf
3678 | 채드윅송도국제학교 | 2022 | 2분기 | 회계결산서 | Chadwick MOE_Audit report_202206.pdf.pdf
3679 | 겐트대학교 글로벌캠퍼스 | 2022 | 2분기 | 회계결산서 | FY2022_결산서.pdf

Duplicate rows

Most frequently occurring

Index | 학교명 | 기준년도 | 분기 | 파일구분 | 파일명 | # duplicates
0 | 겐트대학교 글로벌캠퍼스 | 2016 | 2분기 | 회계결산서 | 겐트대학교 2016 운영계산서.PDF | 2
1 | 부산화교소학교 | 2014 | 1분기 | 회계결산서 | 교비회계결산(2).docx | 2
2 | 부산화교소학교 | 2015 | 1분기 | 교육과정 | 僑民學校立案證明書.jpeg | 2
3 | 부산화교중고등학교 | 2021 | 1분기 | 학력인정 | 中台149-立案證書.PNG | 2
4 | 현대외국인학교 | 2018 | 1분기 | 학력인정 | 18-19 학사일정.pdf | 2
5 | 현대외국인학교 | 2020 | 1분기 | 교육과정 | 현대외국인학교 2020-2021학년도 연간학사계획 (20-21 Academic Calendar).pdf | 2
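
The table lists six distinct rows that each occur twice, which is consistent with the "6 duplicate rows" figure in the overview if that figure counts duplicated groups rather than surplus copies. The sketch below shows one way to find such groups with the standard library, using toy rows shaped like this dataset. The function name is hypothetical, and a real check would run over all 3680 rows.

```python
from collections import Counter

def duplicate_groups(rows):
    """Return {row: occurrences} for every row that appears more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

# Toy rows shaped like (학교명, 기준년도, 분기, 파일구분, 파일명).
# The first is the last row of the Sample section; the other two repeat
# one of the rows from the duplicate table.
rows = [
    ("겐트대학교 글로벌캠퍼스", 2022, "2분기", "회계결산서", "FY2022_결산서.pdf"),
    ("현대외국인학교", 2018, "1분기", "학력인정", "18-19 학사일정.pdf"),
    ("현대외국인학교", 2018, "1분기", "학력인정", "18-19 학사일정.pdf"),
]
groups = duplicate_groups(rows)
print(len(groups))
```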