Overview

Dataset statistics

Number of variables: 10
Number of observations: 4378
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 598
Duplicate rows (%): 13.7%
Total size in memory: 355.0 KiB
Average record size in memory: 83.0 B
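
The duplicate figure above (598 of 4378, i.e. 13.7%) can be cross-checked by counting repeated rows. The report text does not say whether it counts distinct repeated patterns or surplus copies, so this minimal sketch returns both; the function name `duplicate_summary` and the toy rows are hypothetical and not taken from the dataset. With pandas, `DataFrame.duplicated()` gives a comparable per-row flag.

```python
from collections import Counter


def duplicate_summary(rows):
    """Return (duplicated_patterns, surplus_rows, surplus_share) for a list of row tuples."""
    counts = Counter(rows)
    # Distinct row patterns that occur more than once.
    duplicated_patterns = sum(1 for c in counts.values() if c > 1)
    # Copies beyond the first occurrence of each pattern.
    surplus_rows = sum(c - 1 for c in counts.values() if c > 1)
    share = surplus_rows / len(rows) if rows else 0.0
    return duplicated_patterns, surplus_rows, share


# Hypothetical toy rows, for illustration only.
rows = [("A", 1), ("A", 1), ("B", 2), ("A", 1), ("C", 3), ("C", 3)]
print(duplicate_summary(rows))
```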

Variable types

Text: 4
Categorical: 3
Numeric: 3

Dataset

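Description: 보고서제목 (report title), 국외훈련분야 (overseas training field), 국외훈련국가 (overseas training country), 훈련과정구분 (training course category), 훈련과정명 (training course name), 훈련기관한글명 (training institution name, Korean), 훈련기관영문명 (training institution name, English), 파견시작년월일 (dispatch start date), 파견종료년월일 (dispatch end date), 등록일자 (registration date)
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-2557/S/1/datasetView.do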

Alerts

Duplicates: Dataset has 598 (13.7%) duplicate rows.
High correlation: 국외훈련분야 is highly overall correlated with 훈련과정구분.
High correlation: 훈련과정구분 is highly overall correlated with 국외훈련분야.
High correlation: 파견시작년월일 is highly overall correlated with 파견종료년월일 and 1 other field (등록일자).
High correlation: 파견종료년월일 is highly overall correlated with 파견시작년월일 and 1 other field (등록일자).
High correlation: 등록일자 is highly overall correlated with 파견시작년월일 and 1 other field (파견종료년월일).
Imbalance: 국외훈련국가 is highly imbalanced (63.3%).
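
The report does not state how the 63.3% imbalance figure is derived. One common definition, assumed here, is one minus the Shannon entropy of the class frequencies divided by the maximum possible entropy for that number of classes. The sketch below implements that definition only; `imbalance_score` is a hypothetical helper, not the profiler's own function.

```python
from math import log2


def imbalance_score(counts):
    """1 - entropy / max entropy: 0.0 for a uniform distribution, near 1.0 when one class dominates.

    Returns 0.0 when fewer than two non-empty classes are present (score undefined).
    """
    counts = [c for c in counts if c > 0]
    if len(counts) < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * log2(c / total) for c in counts)
    return 1.0 - entropy / log2(len(counts))


# Hypothetical counts, for illustration only.
print(round(imbalance_score([5, 5, 5, 5]), 4))   # uniform
print(round(imbalance_score([97, 1, 1, 1]), 4))  # one dominant class
```

Under this assumed definition a variable such as 국외훈련국가, where one value covers 69.3% of rows, would be expected to score well above zero.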

Reproduction

Analysis started: 2024-05-03 22:56:18.767899
Analysis finished: 2024-05-03 22:56:26.663904
Duration: 7.9 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

보고서제목
Text

Distinct: 2290
Distinct (%): 52.3%
Missing: 0
Missing (%): 0.0%
Memory size: 34.3 KiB

Length

Max length: 90
Median length: 72
Mean length: 17.966651
Min length: 3

Characters and Unicode

Total characters: 78658
Distinct characters: 742
Distinct categories: 13
Distinct scripts: 4
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1909
Unique (%): 43.6%
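
The two counts above differ in meaning: "Distinct" is read here as the number of different values (2290, 52.3% of 4378 rows), and "Unique" as the number of values that occur exactly once (1909, 43.6%). That reading is an assumption based on the figures. A minimal sketch of both counts, using a hypothetical helper and made-up input:

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values occurring exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Hypothetical titles, for illustration only.
titles = ["도착신고서", "귀국보고서", "도착신고서", "훈련상황보고서"]
print(distinct_and_unique(titles))
```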

Sample

1st row: 탈라하시의 시내버스 정기이용권
2nd row: 국외훈련 도착신고
3rd row: 도착신고서
4th row: 합리적 시민참여 방안에 대한 고찰
5th row: 도착신고서

Value | Count | Frequency (%)
훈련상황보고서 | 709 | 4.9%
귀국보고서 | 607 | 4.2%
(missing in source) | 367 | 2.5%
미국 | 226 | 1.6%
연구 | 222 | 1.5%
2011년 | 140 | 1.0%
(missing in source) | 122 | 0.8%
미국의 | 120 | 0.8%
위한 | 116 | 0.8%
보고서 | 107 | 0.7%
Other values (4838) | 11795 | 81.2%

Most occurring characters

ValueCountFrequency (%)
10311
 
13.1%
2569
 
3.3%
2526
 
3.2%
2423
 
3.1%
1706
 
2.2%
1635
 
2.1%
1626
 
2.1%
1579
 
2.0%
1363
 
1.7%
0 1146
 
1.5%
Other values (732) 51774
65.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 52472
66.7%
Space Separator 10313
 
13.1%
Lowercase Letter 6336
 
8.1%
Decimal Number 4199
 
5.3%
Uppercase Letter 1863
 
2.4%
Open Punctuation 1135
 
1.4%
Close Punctuation 1135
 
1.4%
Other Punctuation 690
 
0.9%
Dash Punctuation 398
 
0.5%
Connector Punctuation 54
 
0.1%
Other values (3) 63
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2569
 
4.9%
2526
 
4.8%
2423
 
4.6%
1706
 
3.3%
1635
 
3.1%
1626
 
3.1%
1579
 
3.0%
1363
 
2.6%
1138
 
2.2%
1080
 
2.1%
Other values (642) 34827
66.4%
Lowercase Letter
ValueCountFrequency (%)
e 708
11.2%
i 637
10.1%
o 573
 
9.0%
n 564
 
8.9%
r 563
 
8.9%
t 538
 
8.5%
a 494
 
7.8%
s 402
 
6.3%
l 251
 
4.0%
u 171
 
2.7%
Other values (16) 1435
22.6%
Uppercase Letter
ValueCountFrequency (%)
C 227
 
12.2%
S 214
 
11.5%
U 137
 
7.4%
P 128
 
6.9%
B 112
 
6.0%
A 110
 
5.9%
T 99
 
5.3%
M 86
 
4.6%
I 86
 
4.6%
N 80
 
4.3%
Other values (16) 584
31.3%
Decimal Number
ValueCountFrequency (%)
0 1146
27.3%
2 1108
26.4%
1 1054
25.1%
4 299
 
7.1%
9 206
 
4.9%
3 176
 
4.2%
8 118
 
2.8%
6 38
 
0.9%
7 34
 
0.8%
5 20
 
0.5%
Other Punctuation
ValueCountFrequency (%)
/ 201
29.1%
, 160
23.2%
. 160
23.2%
: 70
 
10.1%
' 51
 
7.4%
? 32
 
4.6%
& 7
 
1.0%
# 4
 
0.6%
; 4
 
0.6%
% 1
 
0.1%
Math Symbol
ValueCountFrequency (%)
~ 30
88.2%
+ 2
 
5.9%
> 1
 
2.9%
< 1
 
2.9%
Open Punctuation
ValueCountFrequency (%)
( 1132
99.7%
2
 
0.2%
[ 1
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 1132
99.7%
2
 
0.2%
] 1
 
0.1%
Space Separator
ValueCountFrequency (%)
10311
> 99.9%
  2
 
< 0.1%
Final Punctuation
ValueCountFrequency (%)
9
56.2%
7
43.8%
Initial Punctuation
ValueCountFrequency (%)
8
61.5%
5
38.5%
Dash Punctuation
ValueCountFrequency (%)
- 398
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 54
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 52427
66.7%
Common 17987
 
22.9%
Latin 8199
 
10.4%
Han 45
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2569
 
4.9%
2526
 
4.8%
2423
 
4.6%
1706
 
3.3%
1635
 
3.1%
1626
 
3.1%
1579
 
3.0%
1363
 
2.6%
1138
 
2.2%
1080
 
2.1%
Other values (608) 34782
66.3%
Latin
ValueCountFrequency (%)
e 708
 
8.6%
i 637
 
7.8%
o 573
 
7.0%
n 564
 
6.9%
r 563
 
6.9%
t 538
 
6.6%
a 494
 
6.0%
s 402
 
4.9%
l 251
 
3.1%
C 227
 
2.8%
Other values (42) 3242
39.5%
Common
ValueCountFrequency (%)
10311
57.3%
0 1146
 
6.4%
( 1132
 
6.3%
) 1132
 
6.3%
2 1108
 
6.2%
1 1054
 
5.9%
- 398
 
2.2%
4 299
 
1.7%
9 206
 
1.1%
/ 201
 
1.1%
Other values (28) 1000
 
5.6%
Han
ValueCountFrequency (%)
8
 
17.8%
4
 
8.9%
2
 
4.4%
1
 
2.2%
1
 
2.2%
1
 
2.2%
1
 
2.2%
1
 
2.2%
1
 
2.2%
1
 
2.2%
Other values (24) 24
53.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 52427
66.7%
ASCII 26151
33.2%
CJK 42
 
0.1%
Punctuation 29
 
< 0.1%
None 6
 
< 0.1%
CJK Compat Ideographs 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
10311
39.4%
0 1146
 
4.4%
( 1132
 
4.3%
) 1132
 
4.3%
2 1108
 
4.2%
1 1054
 
4.0%
e 708
 
2.7%
i 637
 
2.4%
o 573
 
2.2%
n 564
 
2.2%
Other values (73) 7786
29.8%
Hangul
ValueCountFrequency (%)
2569
 
4.9%
2526
 
4.8%
2423
 
4.6%
1706
 
3.3%
1635
 
3.1%
1626
 
3.1%
1579
 
3.0%
1363
 
2.6%
1138
 
2.2%
1080
 
2.1%
Other values (608) 34782
66.3%
Punctuation
ValueCountFrequency (%)
9
31.0%
8
27.6%
7
24.1%
5
17.2%
CJK
ValueCountFrequency (%)
8
 
19.0%
4
 
9.5%
2
 
4.8%
1
 
2.4%
1
 
2.4%
1
 
2.4%
1
 
2.4%
1
 
2.4%
1
 
2.4%
1
 
2.4%
Other values (21) 21
50.0%
None
ValueCountFrequency (%)
  2
33.3%
2
33.3%
2
33.3%
CJK Compat Ideographs
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

국외훈련분야
Categorical

HIGH CORRELATION 

Distinct: 10
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 34.3 KiB
도시행정: 1087
경제진흥: 629
도시계획: 616
환경: 502
보건복지: 452
Other values (5): 1092

Length

Max length: 4
Median length: 4
Mean length: 3.7706715
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 도시행정
2nd row: 재무행정
3rd row: 경제진흥
4th row: 문화체육
5th row: 도시계획

Common Values

Value | Count | Frequency (%)
도시행정 | 1087 | 24.8%
경제진흥 | 629 | 14.4%
도시계획 | 616 | 14.1%
환경 | 502 | 11.5%
보건복지 | 452 | 10.3%
재무행정 | 315 | 7.2%
도시교통 | 305 | 7.0%
문화체육 | 217 | 5.0%
IT행정 | 197 | 4.5%
소방방재 | 58 | 1.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
도시행정 | 1087 | 24.8%
경제진흥 | 629 | 14.4%
도시계획 | 616 | 14.1%
환경 | 502 | 11.5%
보건복지 | 452 | 10.3%
재무행정 | 315 | 7.2%
도시교통 | 305 | 7.0%
문화체육 | 217 | 5.0%
it행정 | 197 | 4.5%
소방방재 | 58 | 1.3%

국외훈련국가
Categorical

IMBALANCE 

Distinct: 24
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 34.3 KiB
미국: 3033
영국: 536
일본: 282
중국: 189
캐나다: 121
Other values (19): 217

Length

Max length: 7
Median length: 2
Mean length: 2.1103243
Min length: 2

Unique

Unique: 6
Unique (%): 0.1%

Sample

1st row: 미국
2nd row: 미국
3rd row: 중국
4th row: 미국
5th row: 영국

Common Values

Value | Count | Frequency (%)
미국 | 3033 | 69.3%
영국 | 536 | 12.2%
일본 | 282 | 6.4%
중국 | 189 | 4.3%
캐나다 | 121 | 2.8%
독일 | 83 | 1.9%
오스트레일리아 | 48 | 1.1%
프랑스 | 39 | 0.9%
인도네시아 | 9 | 0.2%
싱가포르 | 8 | 0.2%
Other values (14) | 30 | 0.7%

Length

Histogram of lengths of the category

훈련과정구분
Categorical

HIGH CORRELATION 

Distinct: 45
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 34.3 KiB
기획: 824
도시관리: 340
관광진흥: 286
기업지원: 281
예산: 257
Other values (40): 2390

Length

Max length: 6
Median length: 2
Mean length: 2.9216537
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 기획
2nd row: 회계
3rd row: 관광진흥
4th row: 문화예술
5th row: 주택

Common Values

Value | Count | Frequency (%)
기획 | 824 | 18.8%
도시관리 | 340 | 7.8%
관광진흥 | 286 | 6.5%
기업지원 | 281 | 6.4%
예산 | 257 | 5.9%
교통 | 252 | 5.8%
대기 | 190 | 4.3%
문화예술 | 184 | 4.2%
보건 | 182 | 4.2%
건축 | 173 | 4.0%
Other values (35) | 1409 | 32.2%

Length

Histogram of lengths of the category

훈련과정명
Text

Distinct: 530
Distinct (%): 12.1%
Missing: 0
Missing (%): 0.0%
Memory size: 34.3 KiB

Length

Max length: 79
Median length: 49
Mean length: 26.232526
Min length: 1

Characters and Unicode

Total characters: 114846
Distinct characters: 433
Distinct categories: 9
Distinct scripts: 4
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 126
Unique (%): 2.9%

Sample

1st row: 글로벌 도시 경쟁력 제고를 위한 해외도시사례 및 우리시 활용방안 연구
2nd row: 글로벌 TOP5를 지향하는 서울시의 재정건전성 확보방안
3rd row: 해외 주요 경쟁도시의 관광정책 비교분석
4th row: 시민중심의 문화예술 교육 활성화 방안
5th row: 주택정책 및 공급제도의 선진화 방안연구

Value | Count | Frequency (%)
연구 | 2035 | 7.9%
(missing in source) | 1224 | 4.7%
위한 | 691 | 2.7%
방안 | 481 | 1.9%
통한 | 421 | 1.6%
지방정부의 | 276 | 1.1%
서울시 | 258 | 1.0%
효율적 | 244 | 0.9%
관한 | 226 | 0.9%
활성화 | 182 | 0.7%
Other values (1519) | 19850 | 76.7%

Most occurring characters

ValueCountFrequency (%)
21776
 
19.0%
2893
 
2.5%
2889
 
2.5%
2730
 
2.4%
2722
 
2.4%
2519
 
2.2%
2492
 
2.2%
2031
 
1.8%
2023
 
1.8%
1879
 
1.6%
Other values (423) 70892
61.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 86045
74.9%
Space Separator 21776
 
19.0%
Lowercase Letter 4009
 
3.5%
Uppercase Letter 1031
 
0.9%
Other Punctuation 676
 
0.6%
Open Punctuation 482
 
0.4%
Close Punctuation 482
 
0.4%
Decimal Number 268
 
0.2%
Dash Punctuation 77
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2893
 
3.4%
2889
 
3.4%
2730
 
3.2%
2722
 
3.2%
2519
 
2.9%
2492
 
2.9%
2031
 
2.4%
2023
 
2.4%
1879
 
2.2%
1793
 
2.1%
Other values (362) 62074
72.1%
Lowercase Letter
ValueCountFrequency (%)
n 641
16.0%
e 524
13.1%
a 445
11.1%
t 296
7.4%
i 295
7.4%
r 284
7.1%
o 266
 
6.6%
m 211
 
5.3%
g 203
 
5.1%
s 144
 
3.6%
Other values (13) 700
17.5%
Uppercase Letter
ValueCountFrequency (%)
N 128
12.4%
P 126
12.2%
M 125
12.1%
U 115
11.2%
C 92
8.9%
S 79
7.7%
T 55
 
5.3%
R 52
 
5.0%
B 48
 
4.7%
G 46
 
4.5%
Other values (8) 165
16.0%
Decimal Number
ValueCountFrequency (%)
1 113
42.2%
2 102
38.1%
0 43
 
16.0%
5 2
 
0.7%
4 2
 
0.7%
6 2
 
0.7%
7 1
 
0.4%
9 1
 
0.4%
3 1
 
0.4%
8 1
 
0.4%
Other Punctuation
ValueCountFrequency (%)
, 337
49.9%
? 268
39.6%
: 47
 
7.0%
& 21
 
3.1%
' 2
 
0.3%
. 1
 
0.1%
Space Separator
ValueCountFrequency (%)
21776
100.0%
Open Punctuation
ValueCountFrequency (%)
( 482
100.0%
Close Punctuation
ValueCountFrequency (%)
) 482
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 77
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 86030
74.9%
Common 23761
 
20.7%
Latin 5040
 
4.4%
Han 15
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2893
 
3.4%
2889
 
3.4%
2730
 
3.2%
2722
 
3.2%
2519
 
2.9%
2492
 
2.9%
2031
 
2.4%
2023
 
2.4%
1879
 
2.2%
1793
 
2.1%
Other values (361) 62059
72.1%
Latin
ValueCountFrequency (%)
n 641
 
12.7%
e 524
 
10.4%
a 445
 
8.8%
t 296
 
5.9%
i 295
 
5.9%
r 284
 
5.6%
o 266
 
5.3%
m 211
 
4.2%
g 203
 
4.0%
s 144
 
2.9%
Other values (31) 1731
34.3%
Common
ValueCountFrequency (%)
21776
91.6%
( 482
 
2.0%
) 482
 
2.0%
, 337
 
1.4%
? 268
 
1.1%
1 113
 
0.5%
2 102
 
0.4%
- 77
 
0.3%
: 47
 
0.2%
0 43
 
0.2%
Other values (10) 34
 
0.1%
Han
ValueCountFrequency (%)
15
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 86030
74.9%
ASCII 28801
 
25.1%
CJK 15
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
21776
75.6%
n 641
 
2.2%
e 524
 
1.8%
( 482
 
1.7%
) 482
 
1.7%
a 445
 
1.5%
, 337
 
1.2%
t 296
 
1.0%
i 295
 
1.0%
r 284
 
1.0%
Other values (51) 3239
 
11.2%
Hangul
ValueCountFrequency (%)
2893
 
3.4%
2889
 
3.4%
2730
 
3.2%
2722
 
3.2%
2519
 
2.9%
2492
 
2.9%
2031
 
2.4%
2023
 
2.4%
1879
 
2.2%
1793
 
2.1%
Other values (361) 62059
72.1%
CJK
ValueCountFrequency (%)
15
100.0%

훈련기관한글명
Text

Distinct: 269
Distinct (%): 6.1%
Missing: 0
Missing (%): 0.0%
Memory size: 34.3 KiB

Length

Max length: 24
Median length: 18
Mean length: 7.2768387
Min length: 2

Characters and Unicode

Total characters: 31858
Distinct characters: 336
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 76
Unique (%): 1.7%

Sample

1st row: 플로리다주립대
2nd row: 포틀랜드주립대
3rd row: 북경과학기술대
4th row: 포틀랜드주립대
5th row: 요크대학교

Value | Count | Frequency (%)
콜로라도대(덴버 | 323 | 6.8%
켄터키대 | 198 | 4.2%
포틀랜드주립대 | 193 | 4.1%
듀크대 | 181 | 3.8%
미주리대(컬럼비아 | 156 | 3.3%
버밍험대 | 149 | 3.1%
엑시터대 | 145 | 3.1%
미시간주립대 | 131 | 2.8%
미주리대 | 98 | 2.1%
듀크대(프랭클린연구소 | 86 | 1.8%
Other values (319) | 3090 | 65.1%

Most occurring characters

ValueCountFrequency (%)
4206
 
13.2%
1129
 
3.5%
( 1109
 
3.5%
) 1109
 
3.5%
1081
 
3.4%
955
 
3.0%
849
 
2.7%
791
 
2.5%
735
 
2.3%
703
 
2.2%
Other values (326) 19191
60.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 28813
90.4%
Open Punctuation 1109
 
3.5%
Close Punctuation 1109
 
3.5%
Space Separator 373
 
1.2%
Other Punctuation 220
 
0.7%
Uppercase Letter 174
 
0.5%
Lowercase Letter 50
 
0.2%
Decimal Number 10
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4206
 
14.6%
1129
 
3.9%
1081
 
3.8%
955
 
3.3%
849
 
2.9%
791
 
2.7%
735
 
2.6%
703
 
2.4%
552
 
1.9%
541
 
1.9%
Other values (298) 17271
59.9%
Uppercase Letter
ValueCountFrequency (%)
A 45
25.9%
L 37
21.3%
I 36
20.7%
M 26
14.9%
T 19
10.9%
S 4
 
2.3%
Y 2
 
1.1%
U 2
 
1.1%
C 2
 
1.1%
F 1
 
0.6%
Lowercase Letter
ValueCountFrequency (%)
a 9
18.0%
h 8
16.0%
i 6
12.0%
g 6
12.0%
n 6
12.0%
l 4
8.0%
r 4
8.0%
e 4
8.0%
o 2
 
4.0%
y 1
 
2.0%
Other Punctuation
ValueCountFrequency (%)
, 201
91.4%
? 11
 
5.0%
& 8
 
3.6%
Decimal Number
ValueCountFrequency (%)
2 7
70.0%
1 3
30.0%
Open Punctuation
ValueCountFrequency (%)
( 1109
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1109
100.0%
Space Separator
ValueCountFrequency (%)
373
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 28813
90.4%
Common 2821
 
8.9%
Latin 224
 
0.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4206
 
14.6%
1129
 
3.9%
1081
 
3.8%
955
 
3.3%
849
 
2.9%
791
 
2.7%
735
 
2.6%
703
 
2.4%
552
 
1.9%
541
 
1.9%
Other values (298) 17271
59.9%
Latin
ValueCountFrequency (%)
A 45
20.1%
L 37
16.5%
I 36
16.1%
M 26
11.6%
T 19
8.5%
a 9
 
4.0%
h 8
 
3.6%
i 6
 
2.7%
g 6
 
2.7%
n 6
 
2.7%
Other values (10) 26
11.6%
Common
ValueCountFrequency (%)
( 1109
39.3%
) 1109
39.3%
373
 
13.2%
, 201
 
7.1%
? 11
 
0.4%
& 8
 
0.3%
2 7
 
0.2%
1 3
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 28813
90.4%
ASCII 3045
 
9.6%

Most frequent character per block

Hangul
ValueCountFrequency (%)
4206
 
14.6%
1129
 
3.9%
1081
 
3.8%
955
 
3.3%
849
 
2.9%
791
 
2.7%
735
 
2.6%
703
 
2.4%
552
 
1.9%
541
 
1.9%
Other values (298) 17271
59.9%
ASCII
ValueCountFrequency (%)
( 1109
36.4%
) 1109
36.4%
373
 
12.2%
, 201
 
6.6%
A 45
 
1.5%
L 37
 
1.2%
I 36
 
1.2%
M 26
 
0.9%
T 19
 
0.6%
? 11
 
0.4%
Other values (18) 79
 
2.6%

훈련기관영문명
Text

Distinct: 269
Distinct (%): 6.1%
Missing: 0
Missing (%): 0.0%
Memory size: 34.3 KiB

Length

Max length: 72
Median length: 58
Mean length: 27.794655
Min length: 3

Characters and Unicode

Total characters: 121685
Distinct characters: 61
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 78
Unique (%): 1.8%

Sample

1st row: Florida State University
2nd row: Portland State University
3rd row: University of Science and Technology Beijing
4th row: Portland State University
5th row: The University of York

Value | Count | Frequency (%)
university | 3738 | 24.1%
of | 2549 | 16.4%
state | 599 | 3.9%
denver | 401 | 2.6%
colorado | 367 | 2.4%
duke | 267 | 1.7%
kentucky | 216 | 1.4%
california | 214 | 1.4%
portland | 193 | 1.2%
at | 191 | 1.2%
Other values (405) | 6788 | 43.7%

Most occurring characters

ValueCountFrequency (%)
i 12850
 
10.6%
11148
 
9.2%
e 9683
 
8.0%
n 8713
 
7.2%
t 8468
 
7.0%
r 8349
 
6.9%
o 8028
 
6.6%
s 6431
 
5.3%
a 5516
 
4.5%
y 4935
 
4.1%
Other values (51) 37564
30.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 95242
78.3%
Uppercase Letter 13740
 
11.3%
Space Separator 11148
 
9.2%
Open Punctuation 474
 
0.4%
Close Punctuation 474
 
0.4%
Other Punctuation 334
 
0.3%
Dash Punctuation 268
 
0.2%
Decimal Number 4
 
< 0.1%
Other Letter 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 12850
13.5%
e 9683
10.2%
n 8713
9.1%
t 8468
8.9%
r 8349
8.8%
o 8028
8.4%
s 6431
 
6.8%
a 5516
 
5.8%
y 4935
 
5.2%
v 4725
 
5.0%
Other values (15) 17544
18.4%
Uppercase Letter
ValueCountFrequency (%)
U 4113
29.9%
C 1715
12.5%
S 1492
 
10.9%
D 811
 
5.9%
M 648
 
4.7%
I 583
 
4.2%
A 527
 
3.8%
P 512
 
3.7%
T 476
 
3.5%
B 393
 
2.9%
Other values (15) 2470
18.0%
Other Punctuation
ValueCountFrequency (%)
. 156
46.7%
, 151
45.2%
& 17
 
5.1%
' 9
 
2.7%
: 1
 
0.3%
Space Separator
ValueCountFrequency (%)
11148
100.0%
Open Punctuation
ValueCountFrequency (%)
( 474
100.0%
Close Punctuation
ValueCountFrequency (%)
) 474
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 268
100.0%
Decimal Number
ValueCountFrequency (%)
2 4
100.0%
Other Letter
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 108982
89.6%
Common 12702
 
10.4%
Han 1
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 12850
11.8%
e 9683
 
8.9%
n 8713
 
8.0%
t 8468
 
7.8%
r 8349
 
7.7%
o 8028
 
7.4%
s 6431
 
5.9%
a 5516
 
5.1%
y 4935
 
4.5%
v 4725
 
4.3%
Other values (40) 31284
28.7%
Common
ValueCountFrequency (%)
11148
87.8%
( 474
 
3.7%
) 474
 
3.7%
- 268
 
2.1%
. 156
 
1.2%
, 151
 
1.2%
& 17
 
0.1%
' 9
 
0.1%
2 4
 
< 0.1%
: 1
 
< 0.1%
Han
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 121684
> 99.9%
CJK 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 12850
 
10.6%
11148
 
9.2%
e 9683
 
8.0%
n 8713
 
7.2%
t 8468
 
7.0%
r 8349
 
6.9%
o 8028
 
6.6%
s 6431
 
5.3%
a 5516
 
4.5%
y 4935
 
4.1%
Other values (50) 37563
30.9%
CJK
ValueCountFrequency (%)
1
100.0%

파견시작년월일
Real number (ℝ)

HIGH CORRELATION 

Distinct: 415
Distinct (%): 9.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20079028
Minimum: 19930817
Maximum: 20121228
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 38.6 KiB

Quantile statistics

Minimum: 19930817
5-th percentile: 20030579
Q1: 20070102
median: 20080709
Q3: 20100527
95-th percentile: 20120101
Maximum: 20121228
Range: 190411
Interquartile range (IQR): 30425

Descriptive statistics

Standard deviation: 28618.332
Coefficient of variation (CV): 0.0014252848
Kurtosis: 4.1507788
Mean: 20079028
Median Absolute Deviation (MAD): 19482
Skewness: -1.4747353
Sum: 8.7905983 × 10^10
Variance: 8.1900894 × 10^8
Monotonicity: Not monotonic
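
The three date columns are profiled as real numbers because the values are stored as YYYYMMDD integers, so statistics such as the range (190411 = 20121228 − 19930817) are differences between integers, not durations. A minimal sketch of converting such values to calendar dates before doing arithmetic follows; `parse_yyyymmdd` is a hypothetical helper, and the two values are the start and end dates of the first sample row. In pandas, `pd.to_datetime` with `format="%Y%m%d"` on the string form of the column serves the same purpose.

```python
from datetime import date, datetime


def parse_yyyymmdd(value):
    """Convert an integer or string such as 20121228 to a datetime.date."""
    return datetime.strptime(str(value), "%Y%m%d").date()


start = parse_yyyymmdd(20121228)
end = parse_yyyymmdd(20141227)

print((end - start).days)   # calendar days between the two dates: 729
print(20141227 - 20121228)  # integer difference, not a day count: 19999
```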
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
20110101 | 143 | 3.3%
20120101 | 117 | 2.7%
20100101 | 93 | 2.1%
20090720 | 91 | 2.1%
20070825 | 85 | 1.9%
20101227 | 72 | 1.6%
20070706 | 70 | 1.6%
20080701 | 57 | 1.3%
20070619 | 56 | 1.3%
20070820 | 54 | 1.2%
Other values (405) | 3540 | 80.9%

Value | Count | Frequency (%)
19930817 | 3 | 0.1%
19930819 | 5 | 0.1%
19940504 | 1 | < 0.1%
19940601 | 3 | 0.1%
19940801 | 5 | 0.1%
19940920 | 3 | 0.1%
19941129 | 2 | < 0.1%
19960915 | 1 | < 0.1%
19970612 | 4 | 0.1%
19970614 | 5 | 0.1%

Value | Count | Frequency (%)
20121228 | 1 | < 0.1%
20121101 | 1 | < 0.1%
20121022 | 2 | < 0.1%
20121021 | 1 | < 0.1%
20121007 | 1 | < 0.1%
20120916 | 1 | < 0.1%
20120915 | 1 | < 0.1%
20120909 | 1 | < 0.1%
20120906 | 1 | < 0.1%
20120903 | 1 | < 0.1%

파견종료년월일
Real number (ℝ)

HIGH CORRELATION 

Distinct: 426
Distinct (%): 9.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20097282
Minimum: 19950503
Maximum: 20141227
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 38.6 KiB

Quantile statistics

Minimum: 19950503
5-th percentile: 20050618
Q1: 20081226
median: 20100630
Q3: 20111231
95-th percentile: 20130726
Maximum: 20141227
Range: 190724
Interquartile range (IQR): 30005

Descriptive statistics

Standard deviation: 26966.241
Coefficient of variation (CV): 0.0013417855
Kurtosis: 4.8492778
Mean: 20097282
Median Absolute Deviation (MAD): 10601
Skewness: -1.6541373
Sum: 8.7985899 × 10^10
Variance: 7.2717814 × 10^8
Monotonicity: Decreasing
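
The "Decreasing" label suggests the rows are ordered by end date, with ties allowed, which agrees with the first and last sample rows of the dataset. A minimal sketch of such a classification is given below. The labels "Decreasing" and "Not monotonic" appear in this report; the other labels and the helper name `monotonicity` are assumptions made for illustration.

```python
def monotonicity(values):
    """Classify a sequence by comparing each value with its successor."""
    pairs = list(zip(values, values[1:]))
    # Sequences with fewer than two values fall through to the first label.
    if all(a < b for a, b in pairs):
        return "Strictly increasing"
    if all(a <= b for a, b in pairs):
        return "Increasing"
    if all(a > b for a, b in pairs):
        return "Strictly decreasing"
    if all(a >= b for a, b in pairs):
        return "Decreasing"
    return "Not monotonic"


# End dates of the first six sample rows in this report.
end_dates = [20141227, 20140905, 20140831, 20140820, 20140820, 20140727]
print(monotonicity(end_dates))
```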
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
20111231 | 217 | 5.0%
20121231 | 100 | 2.3%
20090824 | 85 | 1.9%
20110719 | 84 | 1.9%
20121226 | 72 | 1.6%
20090705 | 70 | 1.6%
20100630 | 57 | 1.3%
20090819 | 54 | 1.2%
20120625 | 52 | 1.2%
20120719 | 52 | 1.2%
Other values (416) | 3535 | 80.7%

Value | Count | Frequency (%)
19950503 | 1 | < 0.1%
19950630 | 3 | 0.1%
19960531 | 3 | 0.1%
19960615 | 5 | 0.1%
19960912 | 2 | < 0.1%
19970331 | 3 | 0.1%
19970925 | 1 | < 0.1%
19971127 | 5 | 0.1%
19980904 | 4 | 0.1%
19981215 | 2 | < 0.1%

Value | Count | Frequency (%)
20141227 | 1 | < 0.1%
20140905 | 1 | < 0.1%
20140831 | 1 | < 0.1%
20140820 | 2 | < 0.1%
20140727 | 8 | 0.2%
20140723 | 6 | 0.1%
20140718 | 9 | 0.2%
20140713 | 2 | < 0.1%
20140708 | 3 | 0.1%
20140617 | 7 | 0.2%

등록일자
Real number (ℝ)

HIGH CORRELATION 

Distinct: 843
Distinct (%): 19.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20100407
Minimum: 20080304
Maximum: 20130203
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 38.6 KiB

Quantile statistics

Minimum: 20080304
5-th percentile: 20081204
Q1: 20090416
median: 20100406
Q3: 20110713
95-th percentile: 20121004
Maximum: 20130203
Range: 49899
Interquartile range (IQR): 20297

Descriptive statistics

Standard deviation: 12942.604
Coefficient of variation (CV): 0.00064389761
Kurtosis: -0.99770192
Mean: 20100407
Median Absolute Deviation (MAD): 10096
Skewness: 0.27088635
Sum: 8.7999582 × 10^10
Variance: 1.67511 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
20090116 | 82 | 1.9%
20081215 | 69 | 1.6%
20081204 | 63 | 1.4%
20090112 | 48 | 1.1%
20081125 | 48 | 1.1%
20081224 | 47 | 1.1%
20090120 | 38 | 0.9%
20081121 | 37 | 0.8%
20081208 | 37 | 0.8%
20101201 | 36 | 0.8%
Other values (833) | 3873 | 88.5%

Value | Count | Frequency (%)
20080304 | 1 | < 0.1%
20080305 | 2 | < 0.1%
20080307 | 1 | < 0.1%
20081002 | 3 | 0.1%
20081006 | 1 | < 0.1%
20081008 | 1 | < 0.1%
20081009 | 12 | 0.3%
20081016 | 15 | 0.3%
20081103 | 1 | < 0.1%
20081104 | 12 | 0.3%

Value | Count | Frequency (%)
20130203 | 1 | < 0.1%
20130202 | 1 | < 0.1%
20130131 | 2 | < 0.1%
20130125 | 2 | < 0.1%
20130121 | 7 | 0.2%
20130117 | 2 | < 0.1%
20130116 | 8 | 0.2%
20130114 | 8 | 0.2%
20130111 | 7 | 0.2%
20130110 | 4 | 0.1%

Interactions

[Interaction plots (9 images) not preserved.]

Correlations

Variable | 국외훈련분야 | 국외훈련국가 | 훈련과정구분 | 파견시작년월일 | 파견종료년월일 | 등록일자
국외훈련분야 | 1.000 | 0.492 | 1.000 | 0.392 | 0.379 | 0.391
국외훈련국가 | 0.492 | 1.000 | 0.774 | 0.307 | 0.291 | 0.392
훈련과정구분 | 1.000 | 0.774 | 1.000 | 0.654 | 0.679 | 0.611
파견시작년월일 | 0.392 | 0.307 | 0.654 | 1.000 | 0.991 | 0.879
파견종료년월일 | 0.379 | 0.291 | 0.679 | 0.991 | 1.000 | 0.863
등록일자 | 0.391 | 0.392 | 0.611 | 0.879 | 0.863 | 1.000

Variable | 국외훈련분야 | 훈련과정구분 | 국외훈련국가
국외훈련분야 | 1.000 | 0.996 | 0.203
훈련과정구분 | 0.996 | 1.000 | 0.275
국외훈련국가 | 0.203 | 0.275 | 1.000

Variable | 파견시작년월일 | 파견종료년월일 | 등록일자 | 국외훈련분야 | 국외훈련국가 | 훈련과정구분
파견시작년월일 | 1.000 | 0.981 | 0.780 | 0.129 | 0.117 | 0.280
파견종료년월일 | 0.981 | 1.000 | 0.776 | 0.124 | 0.111 | 0.300
등록일자 | 0.780 | 0.776 | 1.000 | 0.129 | 0.154 | 0.253
국외훈련분야 | 0.129 | 0.124 | 0.129 | 1.000 | 0.203 | 0.996
국외훈련국가 | 0.117 | 0.111 | 0.154 | 0.203 | 1.000 | 0.275
훈련과정구분 | 0.280 | 0.300 | 0.253 | 0.996 | 0.275 | 1.000
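
The extracted text does not name the coefficient behind each matrix. For pairs of categorical variables such as 국외훈련분야 and 훈련과정구분 (0.996 in the matrix that contains only the categorical variables), a widely used association measure is Cramér's V. The sketch below computes the plain, uncorrected form from two equal-length lists; treating it as the measure used in this report is an assumption, and `cramers_v` is a hypothetical helper.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    joint = Counter(zip(xs, ys))
    k = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or k == 0:
        return 0.0  # undefined when either variable has a single level
    chi2 = 0.0
    for r, r_count in row_totals.items():
        for c, c_count in col_totals.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return sqrt(chi2 / (n * k))


# Hypothetical toy data: each field maps to exactly one course category.
fields = ["도시행정", "도시행정", "환경", "환경"]
courses = ["기획", "기획", "대기", "대기"]
print(cramers_v(fields, courses))
```

A value close to 1 means one variable is almost fully determined by the other, which would match a course category that nests inside a training field.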

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 보고서제목 | 국외훈련분야 | 국외훈련국가 | 훈련과정구분 | 훈련과정명 | 훈련기관한글명 | 훈련기관영문명 | 파견시작년월일 | 파견종료년월일 | 등록일자
0 | 탈라하시의 시내버스 정기이용권 | 도시행정 | 미국 | 기획 | 글로벌 도시 경쟁력 제고를 위한 해외도시사례 및 우리시 활용방안 연구 | 플로리다주립대 | Florida State University | 20121228 | 20141227 | 20130131
1 | 국외훈련 도착신고 | 재무행정 | 미국 | 회계 | 글로벌 TOP5를 지향하는 서울시의 재정건전성 확보방안 | 포틀랜드주립대 | Portland State University | 20120906 | 20140905 | 20121025
2 | 도착신고서 | 경제진흥 | 중국 | 관광진흥 | 해외 주요 경쟁도시의 관광정책 비교분석 | 북경과학기술대 | University of Science and Technology Beijing | 20120901 | 20140831 | 20121004
3 | 합리적 시민참여 방안에 대한 고찰 | 문화체육 | 미국 | 문화예술 | 시민중심의 문화예술 교육 활성화 방안 | 포틀랜드주립대 | Portland State University | 20120821 | 20140820 | 20121220
4 | 도착신고서 | 도시계획 | 영국 | 주택 | 주택정책 및 공급제도의 선진화 방안연구 | 요크대학교 | The University of York | 20120821 | 20140820 | 20121004
5 | 미국 국립공원의 환경교육 ‘Jonior Ranger Game' 사례연구 | 도시행정 | 미국 | 교육 | 공무원의 능력발전 방안에 관한 연구 | 미주리대 | University of Missouri | 20120728 | 20140727 | 20121125
6 | 훈련기관 및 현지생활 소개서(University of Missouri-St.Louis) | 도시행정 | 미국 | 교육 | 공무원의 능력발전 방안에 관한 연구 | 미주리대 | University of Missouri | 20120728 | 20140727 | 20121018
7 | St. Louis市의 Free Flue Shot Campaign 사례연구 | 도시행정 | 미국 | 교육 | 공무원의 능력발전 방안에 관한 연구 | 미주리대 | University of Missouri | 20120728 | 20140727 | 20121001
8 | 폭염대책사례연구 | 도시행정 | 미국 | 교육 | 공무원의 능력발전 방안에 관한 연구 | 미주리대 | University of Missouri | 20120728 | 20140727 | 20120906
9 | 2012 가을학기 훈련상황보고서 | 도시행정 | 미국 | 교육 | 공무원의 능력발전 방안에 관한 연구 | 미주리대 | University of Missouri | 20120728 | 20140727 | 20121224

Index | 보고서제목 | 국외훈련분야 | 국외훈련국가 | 훈련과정구분 | 훈련과정명 | 훈련기관한글명 | 훈련기관영문명 | 파견시작년월일 | 파견종료년월일 | 등록일자
4368 | 귀국보고서 | 도시계획 | 미국 | 도시관리 | 선진행정기법 연구 | 아이오와주립대 | Iowa State University of Science and Technology | 19940801 | 19960615 | 20090923
4369 | 귀국보고서 | 도시계획 | 미국 | 도시관리 | 선진행정기법 연구 | 아이오와주립대 | Iowa State University of Science and Technology | 19940801 | 19960615 | 20090923
4370 | 귀국보고서 | 도시계획 | 미국 | 도시관리 | 선진행정기법 연구 | 아이오와주립대 | Iowa State University of Science and Technology | 19940801 | 19960615 | 20090923
4371 | 귀국보고서 | 도시행정 | 미국 | 기획 | 선진행정 기법 연구 | 위스콘신대 | University of Wisconsin-Madison | 19940601 | 19960531 | 20090923
4372 | 한국에서의 중앙과 지방정부간 국세의 합리적 재배분을 위한 대안 모색 | 도시행정 | 미국 | 기획 | 선진행정 기법 연구 | 위스콘신대 | University of Wisconsin-Madison | 19940601 | 19960531 | 20090928
4373 | 훈련상황보고서 | 도시행정 | 미국 | 기획 | 선진행정 기법 연구 | 위스콘신대 | University of Wisconsin-Madison | 19940601 | 19960531 | 20090928
4374 | 귀국보고서 | 도시행정 | 미국 | 기획 | 태평양지역 도시정책 비교연구, 도시환경, 산업정책 결정 및 평가 기법 연구 | 캘리포니아주립대 | University of California, San Diego | 19930817 | 19950630 | 20090922
4375 | 귀국보고서 | 도시행정 | 미국 | 기획 | 태평양지역 도시정책 비교연구, 도시환경, 산업정책 결정 및 평가 기법 연구 | 캘리포니아주립대 | University of California, San Diego | 19930817 | 19950630 | 20090922
4376 | 훈련상황 보고서 | 도시행정 | 미국 | 기획 | 태평양지역 도시정책 비교연구, 도시환경, 산업정책 결정 및 평가 기법 연구 | 캘리포니아주립대 | University of California, San Diego | 19930817 | 19950630 | 20090929
4377 | 귀국보고서 | 도시행정 | 미국 | 기획 | 지방재정의 확충 및 효율적인 배분 방안 모색 | 워싱턴대 | University of Washington | 19940504 | 19950503 | 20090923

Duplicate rows

Most frequently occurring

Index | 보고서제목 | 국외훈련분야 | 국외훈련국가 | 훈련과정구분 | 훈련과정명 | 훈련기관한글명 | 훈련기관영문명 | 파견시작년월일 | 파견종료년월일 | 등록일자 | # duplicates
520 | 훈련상황보고서 | 보건복지 | 미국 | 노인복지 | 재가노인복지서비스사업 실천체계 연구 | 럿거스대 | Rutgers University | 20060810 | 20080809 | 20081215 | 16
483 | 훈련상황보고서 | 도시계획 | 캐나다 | 도시관리 | 도시계획 지하공간 개발(북미지역 도시 사례 조사) | 브리티시컬럼비아대(국제문화교류센타) | Univ. of British Columbia(CenterforInterculturalCommunication) | 20060620 | 20080619 | 20081121 | 11
495 | 훈련상황보고서 | 도시행정 | 미국 | 기획 | 공공정책 및 의사결정과정 비교연구 | 미주리대(컬럼비아,아시아센터) | University of Missouri-Columbia(AsianAfairsCenter) | 20060520 | 20080519 | 20081125 | 11
521 | 훈련상황보고서 | 보건복지 | 미국 | 보건 | Pulse Net 시스템 개발 및 절족동물 매개성 질환연구 | 노스텍사스대 | University of North Texas | 20060701 | 20080630 | 20081211 | 11
399 | 지속가능한 도시재생수법 중 수변공간 도시디자인 설계기법연구 | 도시계획 | 미국 | 도시관리 | 공공디자인을 적용한 도시계획 | 휴스턴대 | Houston University | 20100626 | 20120625 | 20110305 | 10
424 | 포틀랜드 교통정책 우수사례 조사 - 경전철 및 스트리트카(교통)-3 | 도시계획 | 미국 | 건축 | 친환경적이고 안전한 도시기반시설 구축방안 연구 | 포틀랜드주립대 | Portland State University | 20070825 | 20090824 | 20090116 | 10
484 | 훈련상황보고서 | 도시교통 | 미국 | 교통 | 교통안전시설의 과학적? 효율적 관리방안 연구 | 아이오와대 | University of Iowa | 20061226 | 20081225 | 20090102 | 10
107 | 공공 건축물 설계?시공 사업관리방안-지정 | 도시계획 | 미국 | 설비 | 공공 건축물 설계?시공 사업관리방안 연구 | 스티븐스공과대 | Stevens Institute of Technology | 20070730 | 20090729 | 20090112 | 9
320 | 상수원수 및 주운수로(운하)로 이용되는 해외 하천 사례 자료보고 | 환경 | 미국 | 상하수도 | 수자원고갈과 수질악화에 대비한 상하수도분야 서울시대책 수립 | 텍사스대(오스틴) | University of Texas at Austin | 20061128 | 20081127 | 20081224 | 9
348 | 영국 런던 및 버밍엄시의 도시디자인 정책 | 재무행정 | 영국 | 예산 | 국제개발 및 공공부문의 경영관리(1) | 버밍험대 | University of Birmingham | 20070820 | 20090819 | 20090116 | 9