Overview

Dataset statistics

Number of variables: 3
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 384
Duplicate rows (%): 3.8%
Total size in memory: 312.5 KiB
Average record size in memory: 32.0 B
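
The two derived figures above (duplicate percentage and average record size) follow from the raw counts by simple arithmetic. A minimal sketch that recomputes them, assuming 1 KiB = 1024 bytes; the variable names are illustrative and are not taken from the profiling tool:

```python
# Recompute the derived dataset statistics from the raw figures in the report.
# Assumption: KiB means 1024 bytes. Variable names are illustrative only.
n_rows = 10_000
n_duplicate_rows = 384
total_memory_kib = 312.5

duplicate_pct = 100 * n_duplicate_rows / n_rows        # share of duplicate rows
avg_record_bytes = total_memory_kib * 1024 / n_rows    # bytes per observation

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Both results agree with the reported values after rounding to one decimal place.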

Variable types

Text: 1
DateTime: 1
Categorical: 1

Dataset

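Description: Data on changes to the institutional information of education and training institutions under the Academic Credit Bank System (학점은행제). It provides fields such as the institution name, change type, and change date.
Author: 국가평생교육진흥원 (National Institute for Lifelong Education)
URL: https://www.data.go.kr/data/15089324/fileData.do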

Alerts

Dataset has 384 (3.8%) duplicate rows (alert type: Duplicates)

Reproduction

Analysis started: 2023-12-12 20:55:19.180390
Analysis finished: 2023-12-12 20:55:19.532430
Duration: 0.35 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

기관명 (institution name)
Text

Distinct: 965
Distinct (%): 9.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 24
Mean length: 11.971
Min length: 3
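
The length statistics are computed over the character count of each value. A minimal sketch of the same calculation, using only the five sample rows listed in this report as input, so the results will not match the full-dataset figures above. Normalizing to NFC first is an assumption made here so that each Hangul syllable counts as one character:

```python
import unicodedata
from statistics import mean, median

# The five sample values of 기관명 shown in this report.
names = [
    "한국폴리텍Ⅶ대학부설평생교육원",
    "성운대학교부설평생교육원",
    "서울벤처대학원대학교부설평생교육원",
    "한국IT직업전문학교",
    "한국정보통신기능대학",
]

# Assumption: count code points after NFC normalization.
lengths = [len(unicodedata.normalize("NFC", name)) for name in names]

length_stats = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": mean(lengths),
    "min": min(lengths),
}
print(length_stats)
```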

Characters and Unicode

Total characters: 119710
Distinct characters: 394
Distinct categories: 12
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
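
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of a code point (it does not expose script or block). The sketch below counts categories in one institution name taken from this report; the `category_counts` helper and the `CATEGORY_NAMES` mapping are hypothetical names introduced here and are not part of the profiling tool:

```python
import unicodedata
from collections import Counter

# Long names for the two-letter general-category codes that occur in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter", "Lu": "Uppercase Letter", "Ll": "Lowercase Letter",
    "Nd": "Decimal Number", "Nl": "Letter Number", "Zs": "Space Separator",
    "Ps": "Open Punctuation", "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation", "Po": "Other Punctuation",
    "Sm": "Math Symbol", "Pc": "Connector Punctuation",
}

def category_counts(text):
    """Count characters of `text` per Unicode general category."""
    text = unicodedata.normalize("NFC", text)  # keep Hangul syllables precomposed
    codes = (unicodedata.category(ch) for ch in text)
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)

counts = category_counts("(주)성준항공 부설 한국에어텍항공직업전문학교")
for name, count in counts.most_common():
    print(name, count)
```

Hangul syllables fall under Other Letter, which is consistent with that category dominating the tables in this section.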

Unique

Unique: 101
Unique (%): 1.0%

Sample

1st row: 한국폴리텍Ⅶ대학부설평생교육원
2nd row: 성운대학교부설평생교육원
3rd row: 서울벤처대학원대학교부설평생교육원
4th row: 한국IT직업전문학교
5th row: 한국정보통신기능대학

Value | Count | Frequency (%)
평생교육원 | 433 | 3.8%
부설 | 349 | 3.0%
국가평생교육진흥원 | 178 | 1.5%
원격평생교육원 | 109 | 0.9%
미래교육원 | 102 | 0.9%
동국대학교 | 61 | 0.5%
gb)글로벌이노에듀 | 53 | 0.5%
칠곡군교육문화회관 | 47 | 0.4%
중앙대학교 | 45 | 0.4%
패스원 | 43 | 0.4%
Other values (992) | 10104 | 87.7%
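
A word-frequency table of this kind can be approximated by splitting each value into tokens and counting them. The sketch below assumes lower-casing and whitespace splitting, which may differ from the tokenization the profiling tool actually applies; the three input names are taken from the Sample section of this report:

```python
from collections import Counter

# Three institution names that appear in this report's sample rows.
names = [
    "(주)성준항공 부설 한국에어텍항공직업전문학교",
    "남서울대학교 부설 원격평생교육원",
    "영산대학교 평생교육원 해운대캠퍼스",
]

# Assumption: tokens are lower-cased and separated by whitespace.
word_counts = Counter(word for name in names for word in name.lower().split())
total_words = sum(word_counts.values())

for word, count in word_counts.most_common(3):
    print(f"{word} | {count} | {100 * count / total_words:.1f}%")
```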

Most occurring characters

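Value | Count | Frequency (%)
[not shown] | 12329 | 10.3%
[not shown] | 8816 | 7.4%
[not shown] | 7734 | 6.5%
[not shown] | 6906 | 5.8%
[not shown] | 5557 | 4.6%
[not shown] | 5374 | 4.5%
[not shown] | 5370 | 4.5%
[not shown] | 3927 | 3.3%
[not shown] | 3395 | 2.8%
[not shown] | 2164 | 1.8%
Other values (384) | 58138 | 48.6%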

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 112806 | 94.2%
Uppercase Letter | 2029 | 1.7%
Space Separator | 1524 | 1.3%
Close Punctuation | 1369 | 1.1%
Open Punctuation | 1369 | 1.1%
Decimal Number | 254 | 0.2%
Lowercase Letter | 144 | 0.1%
Dash Punctuation | 75 | 0.1%
Letter Number | 70 | 0.1%
Other Punctuation | 65 | 0.1%
Other values (2) | 5 | < 0.1%

Most frequent character per category

Other Letter
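Value | Count | Frequency (%)
[not shown] | 12329 | 10.9%
[not shown] | 8816 | 7.8%
[not shown] | 7734 | 6.9%
[not shown] | 6906 | 6.1%
[not shown] | 5557 | 4.9%
[not shown] | 5374 | 4.8%
[not shown] | 5370 | 4.8%
[not shown] | 3927 | 3.5%
[not shown] | 3395 | 3.0%
[not shown] | 2164 | 1.9%
Other values (327) | 51234 | 45.4%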

Uppercase Letter
Value | Count | Frequency (%)
C | 316 | 15.6%
B | 213 | 10.5%
A | 212 | 10.4%
K | 157 | 7.7%
I | 150 | 7.4%
M | 147 | 7.2%
O | 146 | 7.2%
S | 146 | 7.2%
E | 99 | 4.9%
D | 97 | 4.8%
Other values (12) | 346 | 17.1%

Lowercase Letter
Value | Count | Frequency (%)
s | 28 | 19.4%
o | 24 | 16.7%
u | 16 | 11.1%
d | 16 | 11.1%
t | 16 | 11.1%
y | 16 | 11.1%
g | 12 | 8.3%
e | 12 | 8.3%
i | 1 | 0.7%
m | 1 | 0.7%
Other values (2) | 2 | 1.4%

Decimal Number
Value | Count | Frequency (%)
6 | 56 | 22.0%
1 | 50 | 19.7%
0 | 49 | 19.3%
9 | 41 | 16.1%
2 | 26 | 10.2%
8 | 11 | 4.3%
4 | 11 | 4.3%
5 | 10 | 3.9%

Other Punctuation
Value | Count | Frequency (%)
. | 22 | 33.8%
& | 20 | 30.8%
: | 11 | 16.9%
· | 9 | 13.8%
, | 3 | 4.6%

Letter Number
Value | Count | Frequency (%)
[not shown] | 51 | 72.9%
[not shown] | 9 | 12.9%
[not shown] | 8 | 11.4%
[not shown] | 2 | 2.9%

Space Separator
Value | Count | Frequency (%)
(space) | 1524 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1369 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1369 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 75 | 100.0%

Math Symbol
Value | Count | Frequency (%)
+ | 4 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 112806 | 94.2%
Common | 4661 | 3.9%
Latin | 2243 | 1.9%

Most frequent character per script

Hangul
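Value | Count | Frequency (%)
[not shown] | 12329 | 10.9%
[not shown] | 8816 | 7.8%
[not shown] | 7734 | 6.9%
[not shown] | 6906 | 6.1%
[not shown] | 5557 | 4.9%
[not shown] | 5374 | 4.8%
[not shown] | 5370 | 4.8%
[not shown] | 3927 | 3.5%
[not shown] | 3395 | 3.0%
[not shown] | 2164 | 1.9%
Other values (327) | 51234 | 45.4%

Latin
Value | Count | Frequency (%)
C | 316 | 14.1%
B | 213 | 9.5%
A | 212 | 9.5%
K | 157 | 7.0%
I | 150 | 6.7%
M | 147 | 6.6%
O | 146 | 6.5%
S | 146 | 6.5%
E | 99 | 4.4%
D | 97 | 4.3%
Other values (28) | 560 | 25.0%

Common
Value | Count | Frequency (%)
(space) | 1524 | 32.7%
) | 1369 | 29.4%
( | 1369 | 29.4%
- | 75 | 1.6%
6 | 56 | 1.2%
1 | 50 | 1.1%
0 | 49 | 1.1%
9 | 41 | 0.9%
2 | 26 | 0.6%
. | 22 | 0.5%
Other values (9) | 80 | 1.7%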

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 112806 | 94.2%
ASCII | 6825 | 5.7%
Number Forms | 70 | 0.1%
None | 9 | < 0.1%

Most frequent character per block

Hangul
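Value | Count | Frequency (%)
[not shown] | 12329 | 10.9%
[not shown] | 8816 | 7.8%
[not shown] | 7734 | 6.9%
[not shown] | 6906 | 6.1%
[not shown] | 5557 | 4.9%
[not shown] | 5374 | 4.8%
[not shown] | 5370 | 4.8%
[not shown] | 3927 | 3.5%
[not shown] | 3395 | 3.0%
[not shown] | 2164 | 1.9%
Other values (327) | 51234 | 45.4%

ASCII
Value | Count | Frequency (%)
(space) | 1524 | 22.3%
) | 1369 | 20.1%
( | 1369 | 20.1%
C | 316 | 4.6%
B | 213 | 3.1%
A | 212 | 3.1%
K | 157 | 2.3%
I | 150 | 2.2%
M | 147 | 2.2%
O | 146 | 2.1%
Other values (42) | 1222 | 17.9%

Number Forms
Value | Count | Frequency (%)
[not shown] | 51 | 72.9%
[not shown] | 9 | 12.9%
[not shown] | 8 | 11.4%
[not shown] | 2 | 2.9%

None
Value | Count | Frequency (%)
· | 9 | 100.0%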

변경일자 (change date)
DateTime

Distinct: 2394
Distinct (%): 23.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2005-03-08 00:00:00
Maximum: 2023-01-04 00:00:00
Histogram with fixed size bins (bins=50)
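
Fixed-size binning divides the range between the minimum and maximum into equal-width intervals. A minimal sketch for the reported range and `bins=50`; the `bin_index` helper is a hypothetical name, and placing the maximum in the last bin is an assumption about edge handling:

```python
from datetime import datetime

# Range and bin count as reported for this variable.
lo = datetime(2005, 3, 8)
hi = datetime(2023, 1, 4)
BINS = 50

width = (hi - lo) / BINS                           # width of one bin (timedelta)
edges = [lo + i * width for i in range(BINS + 1)]  # 51 edges delimit 50 bins

def bin_index(ts):
    """Return the 0-based bin of `ts`; the maximum is placed in the last bin."""
    return min((ts - lo) // width, BINS - 1)

print("Range in days:", (hi - lo).days)
print("Bin width:", width)
print("Bin of 2017-05-25:", bin_index(datetime(2017, 5, 25)))
```

With these inputs each bin covers a little over 130 days.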

변경구분 (change type)
Categorical

Distinct: 17
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
전화 | 1370
기관장명 | 1162
기관장주민번호 | 1133
기관장한자 | 1064
신고주소 | 984
Other values (12) | 4287

Length

Max length: 7
Median length: 6
Mean length: 3.962
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 기관장명
2nd row: 사용유무
3rd row: 기관장한자
4th row: 예금주
5th row: 기관장주민번호

Common Values

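Value | Count | Frequency (%)
전화 (phone) | 1370 | 13.7%
기관장명 (name of institution head) | 1162 | 11.6%
기관장주민번호 (resident registration number of institution head) | 1133 | 11.3%
기관장한자 (name of institution head in Hanja) | 1064 | 10.6%
신고주소 (registered address) | 984 | 9.8%
팩스 (fax) | 695 | 7.0%
기관명칭 (institution name) | 623 | 6.2%
사용유무 (in use or not) | 571 | 5.7%
우편번호 (postal code) | 513 | 5.1%
예금주 (account holder) | 464 | 4.6%
Other values (7) | 1421 | 14.2%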
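
The "Other values" row is the remainder after the listed categories are subtracted from the total. A minimal sketch that checks this against the reported figures, using the ten counts listed in the table, the 10000 observations, and the 17 distinct values from the report; the variable names are illustrative:

```python
# Counts of the ten most common categories, as listed in the table above.
top_counts = {
    "전화": 1370, "기관장명": 1162, "기관장주민번호": 1133, "기관장한자": 1064,
    "신고주소": 984, "팩스": 695, "기관명칭": 623, "사용유무": 571,
    "우편번호": 513, "예금주": 464,
}
n_rows = 10_000     # number of observations
n_distinct = 17     # distinct categories reported for this variable

other_count = n_rows - sum(top_counts.values())   # rows in the remaining categories
other_distinct = n_distinct - len(top_counts)     # how many categories remain
other_pct = 100 * other_count / n_rows

print(f"Other values ({other_distinct}) | {other_count} | {other_pct:.1f}%")
```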

Length

Histogram of lengths of the category

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 기관명 (institution name) | 변경일자 (change date) | 변경구분 (change type)
12817 | 한국폴리텍Ⅶ대학부설평생교육원 | 2009-03-23 | 기관장명
4105 | 성운대학교부설평생교육원 | 2019-11-29 | 사용유무
1489 | 서울벤처대학원대학교부설평생교육원 | 2013-01-30 | 기관장한자
5498 | 한국IT직업전문학교 | 2019-06-05 | 예금주
3578 | 한국정보통신기능대학 | 2006-09-29 | 기관장주민번호
9135 | 기독간호대학교부설평생교육원 | 2014-03-19 | 전화
7624 | 국가평생교육진흥원 | 2017-05-25 | 기관유형
3565 | 한국예술사관실용전문학교 | 2015-07-23 | 기관장주민번호
1137 | 단국대학교천안캠퍼스부설평생교육원 | 2019-09-03 | 기관장한자
9743 | 숭실원격평생교육원 | 2015-10-28 | 전화

Index | 기관명 (institution name) | 변경일자 (change date) | 변경구분 (change type)
8554 | 유한대학부설평생교육원 | 2017-02-09 | 우편번호
32 | (주)성준항공 부설 한국에어텍항공직업전문학교 | 2009-11-11 | 기관명칭
8888 | 가톨릭대학교부설평생교육원(성의) | 2017-03-03 | 전화
10868 | 사이버한국원격평생교육원 | 2013-03-13 | 팩스
1468 | 서울기독대학교부설평생교육원 | 2013-10-23 | 기관장한자
8514 | 영산대학교 평생교육원 해운대캠퍼스 | 2017-02-09 | 우편번호
27 | (재)서울예술실용전문학교 | 2015-04-23 | 기관명칭
1795 | 육군정보학교 | 2009-01-06 | 기관장한자
4282 | 첨단자동차직업전문학교 | 2010-10-05 | 사용유무
6624 | 남예종예술실용전문학교 | 2013-10-07 | 신고주소

Duplicate rows

Most frequently occurring

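Index | 기관명 (institution name) | 변경일자 (change date) | 변경구분 (change type) | # duplicates
93 | 국가평생교육진흥원 | 2017-05-25 | 기관유형 | 11
97 | 국가평생교육진흥원 | 2017-05-30 | 기관유형 | 10
13 | (주)성준항공 부설 한국에어텍항공직업전문학교 | 2015-07-10 | 예금주 | 9
96 | 국가평생교육진흥원 | 2017-05-29 | 기관유형 | 9
102 | 국가평생교육진흥원 | 2017-06-08 | 기관유형 | 9
100 | 국가평생교육진흥원 | 2017-06-02 | 기관유형 | 8
138 | 남서울대학교 부설 원격평생교육원 | 2013-07-16 | 신고주소 | 8
98 | 국가평생교육진흥원 | 2017-05-31 | 기관유형 | 7
95 | 국가평생교육진흥원 | 2017-05-28 | 기관유형 | 6
99 | 국가평생교육진흥원 | 2017-06-01 | 기관유형 | 6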
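
A table like this can be produced by counting identical rows and keeping those that occur more than once. The sketch below uses a small toy list, so its counts are invented for illustration and are not the report's figures. It assumes that "# duplicates" means the total number of occurrences of a row, which the report does not state:

```python
from collections import Counter

# Toy rows (기관명, 변경일자, 변경구분); the repetition counts are made up.
rows = [
    ("국가평생교육진흥원", "2017-05-25", "기관유형"),
    ("국가평생교육진흥원", "2017-05-25", "기관유형"),
    ("국가평생교육진흥원", "2017-05-30", "기관유형"),
    ("국가평생교육진흥원", "2017-05-30", "기관유형"),
    ("국가평생교육진흥원", "2017-05-30", "기관유형"),
    ("육군정보학교", "2009-01-06", "기관장한자"),
]

row_counts = Counter(rows)

# Keep only rows seen more than once, most frequent first.
duplicates = [(row, n) for row, n in row_counts.most_common() if n > 1]

for (name, date, change_type), n in duplicates:
    print(f"{name} | {date} | {change_type} | {n}")
```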