Overview

Dataset statistics

Number of variables: 3
Number of observations: 5395
Missing cells: 11
Missing cells (%): 0.1%
Duplicate rows: 130
Duplicate rows (%): 2.4%
Total size in memory: 126.6 KiB
Average record size in memory: 24.0 B
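
The two percentages above can be checked from the counts in the same table. The sketch below assumes that the missing-cell share is taken over all cells (variables times observations) and that the duplicate share is taken over observations; the report does not state its formulas, so treat this as a consistency check rather than the tool's actual computation.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 3
n_observations = 5395
missing_cells = 11
duplicate_rows = 130

# Assumption: missing cells are measured against every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: duplicate rows are measured against the number of observations.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"duplicate rows: {duplicate_pct:.1f}%")
```

Rounded to one decimal place, both values agree with the figures reported above.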

Variable types

DateTime: 1
Text: 2

Dataset

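Description: List of research reports from the domestic and overseas education and training of civil servants in central administrative agencies (2006~2017)
Author: Ministry of Personnel Management (인사혁신처)
URL: https://www.data.go.kr/data/15050434/fileData.do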

Alerts

Dataset has 130 (2.4%) duplicate rows (Duplicates)

Reproduction

Analysis started: 2023-12-12 12:09:44.693935
Analysis finished: 2023-12-12 12:09:45.292754
Duration: 0.6 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

제출일자 (submission date), DateTime

Distinct: 2588
Distinct (%): 48.1%
Missing: 11
Missing (%): 0.2%
Memory size: 42.3 KiB
Minimum: 2006-08-28 00:00:00
Maximum: 2017-12-01 00:00:00
Histogram with fixed size bins (bins=50)
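
A histogram with fixed size bins divides the range between the minimum and the maximum into equal-width intervals and counts the values in each. The sketch below is one way to do that for dates using only the standard library; `fixed_bin_counts` is a hypothetical helper, not the report's plotting code, and the input list is a small illustrative sample (the reported minimum and maximum plus three dates from the sample rows), not the full column.

```python
from datetime import datetime

def fixed_bin_counts(values, bins=50):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    span = (hi - lo).total_seconds()
    counts = [0] * bins
    for v in values:
        if span == 0:
            idx = 0  # all values identical: everything goes in the first bin
        else:
            idx = int((v - lo).total_seconds() / span * bins)
        # The maximum would compute to index `bins`, so clamp it into the last bin.
        counts[min(idx, bins - 1)] += 1
    return counts

# Illustrative sample only, not the 5395-row column.
dates = [
    datetime(2006, 8, 28),   # reported minimum
    datetime(2007, 6, 28),
    datetime(2007, 9, 5),
    datetime(2008, 1, 5),
    datetime(2017, 12, 1),   # reported maximum
]
counts = fixed_bin_counts(dates, bins=50)
print(counts)
```

The clamp on the last index is a design choice: it makes the final bin closed on the right so that the maximum value is counted rather than dropped.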

훈련과제명 (training project title), Text

Distinct: 5113
Distinct (%): 94.8%
Missing: 0
Missing (%): 0.0%
Memory size: 42.3 KiB

Length

Max length: 100
Median length: 76
Mean length: 25.120482
Min length: 2

Characters and Unicode

Total characters: 135525
Distinct characters: 769
Distinct categories: 13
Distinct scripts: 5
Distinct blocks: 8
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
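
As a rough illustration of how character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. `category_counts` is a hypothetical helper, and the two input strings are titles taken from the sample rows of this report. The module returns two-letter codes; "Lo" corresponds to Other Letter, "Zs" to Space Separator, "Ps" to Open Punctuation and "Pe" to Close Punctuation. Scripts and blocks are not exposed by `unicodedata`, so they are not covered here.

```python
import unicodedata
from collections import Counter

def category_counts(texts):
    """Tally the Unicode general category of every character in `texts`."""
    counts = Counter()
    for text in texts:
        # Normalise to NFC so each Hangul syllable is a single code point.
        for ch in unicodedata.normalize("NFC", text):
            counts[unicodedata.category(ch)] += 1
    return counts

titles = [
    "미국의 지속가능농업정책 연구",
    "재정효율성 제고방안 연구(미국의 사례분석)",
]
counts = category_counts(titles)
for code, n in counts.most_common():
    print(code, n)
```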

Unique

Unique: 4898
Unique (%): 90.8%

Sample

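1st row: 미국의 지속가능농업정책 연구 (A study of sustainable agriculture policy in the United States)
2nd row: 보육정책 성과지표 개발 제안 (Proposal for developing childcare policy performance indicators)
3rd row: 미국 지방정부의 행정혁신 (Administrative innovation in US local governments)
4th row: 도로교통 안전정책 및 첨단교통체계 (Road traffic safety policy and advanced traffic systems)
5th row: 한반도 평화체제 구축방안에 관한 연구 (A study on ways to establish a peace regime on the Korean Peninsula)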
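
Most frequent words

Value | Count | Frequency (%)
연구 (research) | 2867 | 9.2%
(not preserved) | 1340 | 4.3%
위한 (for) | 928 | 3.0%
방안 (measures) | 497 | 1.6%
관한 (regarding) | 428 | 1.4%
대한 (about) | 308 | 1.0%
통한 (through) | 303 | 1.0%
미국의 (of the United States) | 219 | 0.7%
미국 (United States) | 179 | 0.6%
정책 (policy) | 172 | 0.6%
Other values (10688) | 23903 | 76.7%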
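
A word-frequency table of this kind can be approximated by splitting each title on whitespace and counting the tokens. The report does not say how it tokenizes text, so whitespace splitting is an assumption here, and `word_counts` is computed only over the five sample titles rather than the full column.

```python
from collections import Counter

# The five sample titles listed under "Sample" for this variable.
titles = [
    "미국의 지속가능농업정책 연구",
    "보육정책 성과지표 개발 제안",
    "미국 지방정부의 행정혁신",
    "도로교통 안전정책 및 첨단교통체계",
    "한반도 평화체제 구축방안에 관한 연구",
]

# Assumption: tokens are whitespace-separated; no stemming or particle stripping.
word_counts = Counter(word for title in titles for word in title.split())
total = sum(word_counts.values())

for word, n in word_counts.most_common(3):
    print(f"{word} | {n} | {100 * n / total:.1f}%")
```

Because Korean particles stay attached to their nouns under this scheme, 미국 and 미국의 are counted as separate words, which matches how they appear as separate rows in the table.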

Most occurring characters

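Value | Count | Frequency (%)
(space) | 25794 | 19.0%
(not preserved) | 3978 | 2.9%
(not preserved) | 3847 | 2.8%
(not preserved) | 2869 | 2.1%
(not preserved) | 2483 | 1.8%
(not preserved) | 2238 | 1.7%
(not preserved) | 2077 | 1.5%
(not preserved) | 2014 | 1.5%
(not preserved) | 1918 | 1.4%
(not preserved) | 1915 | 1.4%
Other values (759) | 86392 | 63.7%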

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 101519 | 74.9%
Space Separator | 25795 | 19.0%
Lowercase Letter | 3604 | 2.7%
Uppercase Letter | 2747 | 2.0%
Other Punctuation | 537 | 0.4%
Decimal Number | 374 | 0.3%
Open Punctuation | 344 | 0.3%
Close Punctuation | 340 | 0.3%
Dash Punctuation | 229 | 0.2%
Initial Punctuation | 13 | < 0.1%
Other values (3) | 23 | < 0.1%

Most frequent character per category

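Other Letter
Value | Count | Frequency (%)
(not preserved) | 3978 | 3.9%
(not preserved) | 3847 | 3.8%
(not preserved) | 2869 | 2.8%
(not preserved) | 2483 | 2.4%
(not preserved) | 2238 | 2.2%
(not preserved) | 2077 | 2.0%
(not preserved) | 2014 | 2.0%
(not preserved) | 1918 | 1.9%
(not preserved) | 1915 | 1.9%
(not preserved) | 1547 | 1.5%
Other values (653) | 76633 | 75.5%

Uppercase Letter
Value | Count | Frequency (%)
T | 327 | 11.9%
A | 324 | 11.8%
C | 198 | 7.2%
E | 197 | 7.2%
I | 193 | 7.0%
S | 193 | 7.0%
F | 187 | 6.8%
D | 160 | 5.8%
O | 128 | 4.7%
U | 127 | 4.6%
Other values (32) | 713 | 26.0%

Lowercase Letter
Value | Count | Frequency (%)
e | 437 | 12.1%
n | 342 | 9.5%
t | 323 | 9.0%
i | 306 | 8.5%
a | 299 | 8.3%
o | 296 | 8.2%
r | 239 | 6.6%
s | 198 | 5.5%
l | 162 | 4.5%
c | 146 | 4.1%
Other values (16) | 856 | 23.8%

Other Punctuation
Value | Count | Frequency (%)
, | 220 | 41.0%
· | 115 | 21.4%
. | 68 | 12.7%
& | 50 | 9.3%
: | 35 | 6.5%
/ | 22 | 4.1%
' | 9 | 1.7%
" | 8 | 1.5%
; | 6 | 1.1%
% | 2 | 0.4%

Decimal Number
Value | Count | Frequency (%)
0 | 97 | 25.9%
2 | 80 | 21.4%
1 | 67 | 17.9%
3 | 41 | 11.0%
4 | 28 | 7.5%
5 | 20 | 5.3%
6 | 15 | 4.0%
7 | 12 | 3.2%
9 | 10 | 2.7%
8 | 4 | 1.1%

Open Punctuation
Value | Count | Frequency (%)
( | 340 | 98.8%
(not preserved) | 3 | 0.9%
(not preserved) | 1 | 0.3%

Close Punctuation
Value | Count | Frequency (%)
) | 336 | 98.8%
(not preserved) | 3 | 0.9%
(not preserved) | 1 | 0.3%

Math Symbol
Value | Count | Frequency (%)
+ | 3 | 60.0%
< | 1 | 20.0%
> | 1 | 20.0%

Space Separator
Value | Count | Frequency (%)
(space) | 25794 | > 99.9%
(non-ASCII space) | 1 | < 0.1%

Initial Punctuation
Value | Count | Frequency (%)
(not preserved) | 8 | 61.5%
(not preserved) | 5 | 38.5%

Final Punctuation
Value | Count | Frequency (%)
(not preserved) | 7 | 58.3%
(not preserved) | 5 | 41.7%

Dash Punctuation
Value | Count | Frequency (%)
- | 229 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 6 | 100.0%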

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 101463 | 74.9%
Common | 27655 | 20.4%
Latin | 6312 | 4.7%
Han | 56 | < 0.1%
Cyrillic | 39 | < 0.1%

Most frequent character per script

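Hangul
Value | Count | Frequency (%)
(not preserved) | 3978 | 3.9%
(not preserved) | 3847 | 3.8%
(not preserved) | 2869 | 2.8%
(not preserved) | 2483 | 2.4%
(not preserved) | 2238 | 2.2%
(not preserved) | 2077 | 2.0%
(not preserved) | 2014 | 2.0%
(not preserved) | 1918 | 1.9%
(not preserved) | 1915 | 1.9%
(not preserved) | 1547 | 1.5%
Other values (617) | 76577 | 75.5%

Latin
Value | Count | Frequency (%)
e | 437 | 6.9%
n | 342 | 5.4%
T | 327 | 5.2%
A | 324 | 5.1%
t | 323 | 5.1%
i | 306 | 4.8%
a | 299 | 4.7%
o | 296 | 4.7%
r | 239 | 3.8%
s | 198 | 3.1%
Other values (42) | 3221 | 51.0%

Common
Value | Count | Frequency (%)
(space) | 25794 | 93.3%
( | 340 | 1.2%
) | 336 | 1.2%
- | 229 | 0.8%
, | 220 | 0.8%
· | 115 | 0.4%
0 | 97 | 0.4%
2 | 80 | 0.3%
. | 68 | 0.2%
1 | 67 | 0.2%
Other values (28) | 309 | 1.1%

Han
Value | Count | Frequency (%)
(not preserved) | 5 | 8.9%
(not preserved) | 5 | 8.9%
(not preserved) | 4 | 7.1%
(not preserved) | 4 | 7.1%
(not preserved) | 3 | 5.4%
(not preserved) | 2 | 3.6%
(not preserved) | 2 | 3.6%
(not preserved) | 2 | 3.6%
(not preserved) | 2 | 3.6%
(not preserved) | 1 | 1.8%
Other values (26) | 26 | 46.4%

Cyrillic
Value | Count | Frequency (%)
И | 7 | 17.9%
А | 4 | 10.3%
Л | 4 | 10.3%
У | 3 | 7.7%
Т | 3 | 7.7%
М | 3 | 7.7%
О | 2 | 5.1%
Н | 2 | 5.1%
К | 2 | 5.1%
Ь | 2 | 5.1%
Other values (6) | 7 | 17.9%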

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 101454 | 74.9%
ASCII | 33818 | 25.0%
None | 124 | 0.1%
CJK | 55 | < 0.1%
Cyrillic | 39 | < 0.1%
Punctuation | 25 | < 0.1%
Compat Jamo | 9 | < 0.1%
CJK Compat Ideographs | 1 | < 0.1%

Most frequent character per block

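ASCII
Value | Count | Frequency (%)
(space) | 25794 | 76.3%
e | 437 | 1.3%
n | 342 | 1.0%
( | 340 | 1.0%
) | 336 | 1.0%
T | 327 | 1.0%
A | 324 | 1.0%
t | 323 | 1.0%
i | 306 | 0.9%
a | 299 | 0.9%
Other values (70) | 4990 | 14.8%

Hangul
Value | Count | Frequency (%)
(not preserved) | 3978 | 3.9%
(not preserved) | 3847 | 3.8%
(not preserved) | 2869 | 2.8%
(not preserved) | 2483 | 2.4%
(not preserved) | 2238 | 2.2%
(not preserved) | 2077 | 2.0%
(not preserved) | 2014 | 2.0%
(not preserved) | 1918 | 1.9%
(not preserved) | 1915 | 1.9%
(not preserved) | 1547 | 1.5%
Other values (614) | 76568 | 75.5%

None
Value | Count | Frequency (%)
· | 115 | 92.7%
(not preserved) | 3 | 2.4%
(not preserved) | 3 | 2.4%
(not preserved) | 1 | 0.8%
(not preserved) | 1 | 0.8%
(non-ASCII space) | 1 | 0.8%

Punctuation
Value | Count | Frequency (%)
(not preserved) | 8 | 32.0%
(not preserved) | 7 | 28.0%
(not preserved) | 5 | 20.0%
(not preserved) | 5 | 20.0%

Compat Jamo
Value | Count | Frequency (%)
(not preserved) | 7 | 77.8%
(not preserved) | 1 | 11.1%
(not preserved) | 1 | 11.1%

Cyrillic
Value | Count | Frequency (%)
И | 7 | 17.9%
А | 4 | 10.3%
Л | 4 | 10.3%
У | 3 | 7.7%
Т | 3 | 7.7%
М | 3 | 7.7%
О | 2 | 5.1%
Н | 2 | 5.1%
К | 2 | 5.1%
Ь | 2 | 5.1%
Other values (6) | 7 | 17.9%

CJK
Value | Count | Frequency (%)
(not preserved) | 5 | 9.1%
(not preserved) | 5 | 9.1%
(not preserved) | 4 | 7.3%
(not preserved) | 4 | 7.3%
(not preserved) | 3 | 5.5%
(not preserved) | 2 | 3.6%
(not preserved) | 2 | 3.6%
(not preserved) | 2 | 3.6%
(not preserved) | 2 | 3.6%
(not preserved) | 1 | 1.8%
Other values (25) | 25 | 45.5%

CJK Compat Ideographs
Value | Count | Frequency (%)
(not preserved) | 1 | 100.0%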

등록자 (registrant), Text

Distinct: 4342
Distinct (%): 80.5%
Missing: 0
Missing (%): 0.0%
Memory size: 42.3 KiB

Length

Max length: 6
Median length: 3
Mean length: 2.9877665
Min length: 2

Characters and Unicode

Total characters: 16119
Distinct characters: 271
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3584
Unique (%): 66.4%

Sample

1st row: 하영효
2nd row: 정봉협
3rd row: 이종배
4th row: 홍기범
5th row: 오균

Most frequent values

Value | Count | Frequency (%)
김정삼 | 13 | 0.2%
이성미 | 9 | 0.2%
임창환 | 8 | 0.1%
이태훈 | 7 | 0.1%
김재현 | 7 | 0.1%
이지현 | 6 | 0.1%
김자영 | 6 | 0.1%
박지은 | 5 | 0.1%
김성호 | 5 | 0.1%
이진수 | 5 | 0.1%
Other values (4333) | 5325 | 98.7%

Most occurring characters

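Value | Count | Frequency (%)
(not preserved) | 1057 | 6.6%
(not preserved) | 857 | 5.3%
(not preserved) | 587 | 3.6%
(not preserved) | 497 | 3.1%
(not preserved) | 421 | 2.6%
(not preserved) | 358 | 2.2%
(not preserved) | 345 | 2.1%
(not preserved) | 326 | 2.0%
(not preserved) | 297 | 1.8%
(not preserved) | 284 | 1.8%
Other values (261) | 11090 | 68.8%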

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 16094 | 99.8%
Uppercase Letter | 16 | 0.1%
Close Punctuation | 4 | < 0.1%
Open Punctuation | 4 | < 0.1%
Space Separator | 1 | < 0.1%

Most frequent character per category

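Other Letter
Value | Count | Frequency (%)
(not preserved) | 1057 | 6.6%
(not preserved) | 857 | 5.3%
(not preserved) | 587 | 3.6%
(not preserved) | 497 | 3.1%
(not preserved) | 421 | 2.6%
(not preserved) | 358 | 2.2%
(not preserved) | 345 | 2.1%
(not preserved) | 326 | 2.0%
(not preserved) | 297 | 1.8%
(not preserved) | 284 | 1.8%
Other values (255) | 11065 | 68.8%

Uppercase Letter
Value | Count | Frequency (%)
L | 8 | 50.0%
N | 4 | 25.0%
U | 4 | 25.0%

Close Punctuation
Value | Count | Frequency (%)
) | 4 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 4 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1 | 100.0%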

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 16094 | 99.8%
Latin | 16 | 0.1%
Common | 9 | 0.1%

Most frequent character per script

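Hangul
Value | Count | Frequency (%)
(not preserved) | 1057 | 6.6%
(not preserved) | 857 | 5.3%
(not preserved) | 587 | 3.6%
(not preserved) | 497 | 3.1%
(not preserved) | 421 | 2.6%
(not preserved) | 358 | 2.2%
(not preserved) | 345 | 2.1%
(not preserved) | 326 | 2.0%
(not preserved) | 297 | 1.8%
(not preserved) | 284 | 1.8%
Other values (255) | 11065 | 68.8%

Latin
Value | Count | Frequency (%)
L | 8 | 50.0%
N | 4 | 25.0%
U | 4 | 25.0%

Common
Value | Count | Frequency (%)
) | 4 | 44.4%
( | 4 | 44.4%
(space) | 1 | 11.1%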

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 16094 | 99.8%
ASCII | 25 | 0.2%

Most frequent character per block

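Hangul
Value | Count | Frequency (%)
(not preserved) | 1057 | 6.6%
(not preserved) | 857 | 5.3%
(not preserved) | 587 | 3.6%
(not preserved) | 497 | 3.1%
(not preserved) | 421 | 2.6%
(not preserved) | 358 | 2.2%
(not preserved) | 345 | 2.1%
(not preserved) | 326 | 2.0%
(not preserved) | 297 | 1.8%
(not preserved) | 284 | 1.8%
Other values (255) | 11065 | 68.8%

ASCII
Value | Count | Frequency (%)
L | 8 | 32.0%
N | 4 | 16.0%
) | 4 | 16.0%
U | 4 | 16.0%
( | 4 | 16.0%
(space) | 1 | 4.0%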

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
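
The per-column nullity shown in these plots reduces to counting missing values in each column. The sketch below does that for a list of row dictionaries; `missing_by_column` is a hypothetical helper, the first two rows come from the sample table in this report, and the third row (with no submission date) is invented purely to exercise the function. Treating the empty string as missing is also an assumption.

```python
def missing_by_column(rows, columns):
    """Return {column: number of rows where the value is None or empty}."""
    return {
        col: sum(1 for row in rows if row.get(col) in (None, ""))
        for col in columns
    }

columns = ["제출일자", "훈련과제명", "등록자"]
rows = [
    {"제출일자": "2007-06-28", "훈련과제명": "미국의 지속가능농업정책 연구", "등록자": "하영효"},
    {"제출일자": "2007-09-05", "훈련과제명": "보육정책 성과지표 개발 제안", "등록자": "정봉협"},
    # Hypothetical row with no submission date, not taken from the dataset.
    {"제출일자": None, "훈련과제명": "예시 과제", "등록자": "홍길동"},
]

missing = missing_by_column(rows, columns)
for col in columns:
    share = 100 * missing[col] / len(rows)
    print(f"{col}: {missing[col]} missing ({share:.1f}%)")
```

In the profiled dataset the same count yields 11 missing values, all in 제출일자, as reported in the variable summaries above.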

Sample

First rows

Index | 제출일자 (submission date) | 훈련과제명 (training project title) | 등록자 (registrant)
0 | 2007-06-28 | 미국의 지속가능농업정책 연구 | 하영효
1 | 2007-09-05 | 보육정책 성과지표 개발 제안 | 정봉협
2 | 2007-08-14 | 미국 지방정부의 행정혁신 | 이종배
3 | 2008-01-05 | 도로교통 안전정책 및 첨단교통체계 | 홍기범
4 | 2008-05-01 | 한반도 평화체제 구축방안에 관한 연구 | 오균
5 | 2007-08-28 | 한반도 평화체제 구축방안에 관한 연구 | 조재정
6 | 2007-07-26 | 재정효율성 제고방안 연구(미국의 사례분석) | 조경규
7 | 2007-05-21 | 남북중 3자경협방안 | 설동근
8 | 2007-08-26 | 미국헌법상 평등원칙의 심사기준 및 사례 | 홍두표
9 | 2007-11-01 | 미국의 고등교육정책 동향 및 미국, 일본의 대학가버넌스 고찰 | 황홍규

Last rows

Index | 제출일자 (submission date) | 훈련과제명 (training project title) | 등록자 (registrant)
5385 | 2017-07-18 | 한국범죄분류 개발을 위한 국제표준범죄분류 도입 선진사례 연수 | 김현기
5386 | 2017-09-29 | 미국의 폴리그래프 검사 등 | 임익삼
5387 | 2017-09-29 | 야생동물 역학조사 현장실무교육 | 정지민
5388 | 2017-08-25 | 미승인 유전자변형식품 국내 유입차단을 위한 시험법 개발 및 개발동향 파악 | 문귀임
5389 | 2017-08-30 | 미국의 전자장치 개선 및 제도 운영 사례 연구 | 허강무
5390 | 2017-08-20 | 지진재해 경감을 위한 지진조기경보 현황 조사 및 지진재난 방재 업무 벤치마킹 | 함인경
5391 | 2017-05-25 | 지역경제활성화를 위한 공동체 금융연구 | 설창환
5392 | 2017-07-28 | 글로벌공동연수과정 | 엄기복
5393 | 2017-05-24 | 부산 크루즈통계 개발을 위한 연구 | 신연주
5394 | 2017-09-08 | 기록관리 평가 체계 혁신을 위한 선진기관 사례 연구 | 이혜원

Duplicate rows

Most frequently occurring

Index | 제출일자 (submission date) | 훈련과제명 (training project title) | 등록자 (registrant) | # duplicates
20 | 2008-07-09 | 첨단 IT 클러스터 구축, 활용방안 | 김정삼 | 12
21 | 2008-07-13 | 축산물 중 병원성 미생물 위해분석 기법 연수 | 이성미 | 8
16 | 2008-04-09 | EU환경에너지규제에대응한자동차산업기술전략 | 공성호 | 5
33 | 2009-01-02 | 한국.중국 해관 감관화물 관리제도 비교 연구 | 임창환 | 5
6 | 2007-08-01 | 수목장림 조성 및 운영기법 연수 | 배상원 | 4
40 | 2009-08-01 | 중국 경제금융제도현황과 금융제도개혁에 관한 연구 | 장도환 | 4
76 | 2013-12-18 | 과학기술기반 재난안전 관리정책 및 관리기술 개발(지진 및 기후변화 재난관리 연구) | 박병철 | 4
98 | 2014-06-14 | 부동산금융 선진화 추진 | 이지혜 | 4
8 | 2007-08-13 | 디지털 증거 전문분석도구 활용방안 | 독고지은 | 3
18 | 2008-05-26 | 식물유전자원 보존을 위한 종자 증식 및 관리기술 개발 | 김재현 | 3
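
A listing like the one above can be produced by counting identical rows and keeping those that occur more than once. The sketch below uses `collections.Counter` on tuples of (submission date, title, registrant); `duplicate_groups` is a hypothetical helper, and the repetition counts in the input are shortened for illustration, so they do not match the "# duplicates" column. How the report derives its overall figure of 130 duplicate rows is not stated in the output, so this example does not attempt to reproduce it.

```python
from collections import Counter

def duplicate_groups(rows):
    """Return [(row, count)] for rows occurring more than once, most frequent first."""
    counts = Counter(rows)
    return [(row, n) for row, n in counts.most_common() if n > 1]

# Row values are taken from the tables in this report; the number of
# repetitions is illustrative only.
rows = [
    ("2008-07-09", "첨단 IT 클러스터 구축, 활용방안", "김정삼"),
    ("2008-07-09", "첨단 IT 클러스터 구축, 활용방안", "김정삼"),
    ("2008-07-09", "첨단 IT 클러스터 구축, 활용방안", "김정삼"),
    ("2007-08-13", "디지털 증거 전문분석도구 활용방안", "독고지은"),
    ("2007-08-13", "디지털 증거 전문분석도구 활용방안", "독고지은"),
    ("2007-06-28", "미국의 지속가능농업정책 연구", "하영효"),
]

groups = duplicate_groups(rows)
for (date, title, registrant), n in groups:
    print(f"{date} | {title} | {registrant} | {n}")
```

Rows are compared as whole tuples, so two reports with the same title but different submission dates or registrants are not treated as duplicates.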