Overview

Dataset statistics

Number of variables: 10
Number of observations: 2082
Missing cells: 3349
Missing cells (%): 16.1%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 162.8 KiB
Average record size in memory: 80.1 B
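
The headline figures can be cross-checked against the per-variable missing counts reported further down. The sketch below is only an illustration of that arithmetic, not the profiler's own code. The dictionary key `세부전공분야1` is an assumption: the report shows a block with 17 missing values whose heading did not survive, and this is the only column name left unaccounted for.

```python
# Figures copied from the report; variable names here are illustrative only.
n_variables = 10
n_observations = 2082

# Missing counts per variable (columns not listed have 0 missing values).
# The "세부전공분야1" key is an assumption about which column the
# 17-missing block belongs to.
missing_by_variable = {
    "자격증": 779,
    "세부전공분야1": 17,
    "세부전공분야2": 913,
    "세부전공분야3": 1640,
}

total_cells = n_variables * n_observations
missing_cells = sum(missing_by_variable.values())
missing_pct = 100 * missing_cells / total_cells

# Average record size: total memory (KiB -> bytes) divided by row count.
total_bytes = 162.8 * 1024
avg_record_bytes = total_bytes / n_observations

print(missing_cells, round(missing_pct, 1), round(avg_record_bytes, 1))
```

The per-column counts add up to the reported total of missing cells, and the derived percentage and record size agree with the overview after rounding.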

Variable types

Categorical: 3
Text: 7

Dataset

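Description: 위원기수, 위원명, 전문분야, 소속, 직위, 자격증, 세부전공분야1, 세부전공분야2, 세부전공분야3, 위촉상태 (column names: committee term, member name, field of expertise, affiliation, position, certifications, detailed specialty 1, detailed specialty 2, detailed specialty 3, appointment status)
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-2524/S/1/datasetView.do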

Alerts

Dataset has 1 (< 0.1%) duplicate rows [Duplicates]
위촉상태 is highly imbalanced (94.2%) [Imbalance]
자격증 has 779 (37.4%) missing values [Missing]
세부전공분야2 has 913 (43.9%) missing values [Missing]
세부전공분야3 has 1640 (78.8%) missing values [Missing]
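
The 94.2% imbalance figure for 위촉상태 can be reproduced from the class counts reported later (2068 위촉, 14 해촉). The sketch below assumes the score is one minus the Shannon entropy normalized by its maximum for the number of classes; the report does not state the formula, so treat this as an assumption that happens to match the reported value. The function name `imbalance_score` is hypothetical.

```python
from math import log2


def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform split, 1 for a single class.

    Assumed formula; the report itself does not say how the score is derived.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    # Shannon entropy in bits, skipping empty classes.
    entropy = -sum((c / total) * log2(c / total) for c in counts if c > 0)
    return 1 - entropy / log2(n_classes)


# Class counts taken from the 위촉상태 section of the report.
status_counts = {"위촉": 2068, "해촉": 14}
score = imbalance_score(list(status_counts.values()))
print(f"{score:.1%}")
```

A perfectly balanced two-class column would score 0 under this definition, so a value above 0.94 reflects that nearly every row carries the same status.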

Reproduction

Analysis started: 2024-05-17 23:09:20.642194
Analysis finished: 2024-05-17 23:09:23.828283
Duration: 3.19 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

위원기수
Categorical

Distinct: 9
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 16.4 KiB

11기  288
10기  282
14기  253
12기  249
15기  249
Other values (4)  761

Length

Max length: 3
Median length: 3
Mean length: 2.8631124
Min length: 2
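
The length statistics follow directly from the value counts listed under Common Values for this variable. As a minimal sketch (the dictionary name `term_counts` is illustrative, and lengths are assumed to be counted in Unicode code points, which is what Python's `len` does):

```python
# Value counts for 위원기수, copied from the Common Values table.
term_counts = {
    "11기": 288, "10기": 282, "14기": 253, "12기": 249, "15기": 249,
    "13기": 248, "9기": 237, "16기": 228, "2기": 48,
}

total = sum(term_counts.values())

# Weighted mean of string lengths; len() counts code points, so "11기" is 3.
mean_length = sum(len(value) * count for value, count in term_counts.items()) / total
max_length = max(len(value) for value in term_counts)
min_length = min(len(value) for value in term_counts)

print(total, max_length, min_length, mean_length)
```

Only 9기 and 2기 have two characters, which is why the mean sits slightly below 3.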

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 9기
2nd row: 9기
3rd row: 9기
4th row: 9기
5th row: 9기

Common Values

Value  Count  Frequency (%)
11기  288  13.8%
10기  282  13.5%
14기  253  12.2%
12기  249  12.0%
15기  249  12.0%
13기  248  11.9%
9기  237  11.4%
16기  228  11.0%
2기  48  2.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

위원명
Text

Distinct: 1201
Distinct (%): 57.7%
Missing: 0
Missing (%): 0.0%
Memory size: 16.4 KiB

Length

Max length: 5
Median length: 3
Mean length: 2.9975985
Min length: 2

Characters and Unicode

Total characters: 6241
Distinct characters: 206
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 676
Unique (%): 32.5%

Sample

1st row: 정광섭
2nd row: 황인환
3rd row: 강태은
4th row: 김갑일
5th row: 이승은

Value  Count  Frequency (%)
김현아  7  0.3%
이선화  6  0.3%
이승원  6  0.3%
신효섭  6  0.3%
김정선  6  0.3%
기유경  6  0.3%
최광현  6  0.3%
유제남  6  0.3%
임성순  6  0.3%
이채규  6  0.3%
Other values (1195)  2031  97.1%

Most occurring characters

ValueCountFrequency (%)
407
 
6.5%
331
 
5.3%
180
 
2.9%
168
 
2.7%
143
 
2.3%
133
 
2.1%
124
 
2.0%
124
 
2.0%
119
 
1.9%
111
 
1.8%
Other values (196) 4401
70.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 6219
99.6%
Space Separator 22
 
0.4%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
407
 
6.5%
331
 
5.3%
180
 
2.9%
168
 
2.7%
143
 
2.3%
133
 
2.1%
124
 
2.0%
124
 
2.0%
119
 
1.9%
111
 
1.8%
Other values (195) 4379
70.4%
Space Separator
ValueCountFrequency (%)
22
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 6219
99.6%
Common 22
 
0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
407
 
6.5%
331
 
5.3%
180
 
2.9%
168
 
2.7%
143
 
2.3%
133
 
2.1%
124
 
2.0%
124
 
2.0%
119
 
1.9%
111
 
1.8%
Other values (195) 4379
70.4%
Common
ValueCountFrequency (%)
22
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 6219
99.6%
ASCII 22
 
0.4%

Most frequent character per block

Hangul
ValueCountFrequency (%)
407
 
6.5%
331
 
5.3%
180
 
2.9%
168
 
2.7%
143
 
2.3%
133
 
2.1%
124
 
2.0%
124
 
2.0%
119
 
1.9%
111
 
1.8%
Other values (195) 4379
70.4%
ASCII
ValueCountFrequency (%)
22
100.0%

전문분야
Categorical

Distinct: 27
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 16.4 KiB

토목시공  219
토목구조  201
토질및기초  178
건축계획  162
조경  153
Other values (22)  1169

Length

Max length: 7
Median length: 4
Mean length: 4.0533141
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 건축기계설비
2nd row: 건축기계설비
3rd row: 전기전력설비
4th row: 전기전력설비
5th row: 건축구조

Common Values

Value  Count  Frequency (%)
토목시공  219  10.5%
토목구조  201  9.7%
토질및기초  178  8.5%
건축계획  162  7.8%
조경  153  7.3%
전기전력설비  152  7.3%
상하수도  133  6.4%
건축시공  93  4.5%
수자원개발  90  4.3%
건축구조  88  4.2%
Other values (17)  613  29.4%

Length

[Figure: Histogram of lengths of the category]

소속
Text

Distinct: 938
Distinct (%): 45.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.4 KiB

Length

Max length: 29
Median length: 24
Mean length: 7.5557157
Min length: 1

Characters and Unicode

Total characters: 15731
Distinct characters: 373
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 586
Unique (%): 28.1%

Sample

1st row: 서울산업대학교
2nd row: (주)한은기술사사무소
3rd row: (주)한성컨설턴트
4th row: 명지대학교
5th row: 포항산업과학연구원

Value  Count  Frequency (%)
한국건설기술연구원  43  1.9%
서울시립대학교  41  1.8%
연세대학교  30  1.3%
서울대학교  25  1.1%
서울시  24  1.1%
한양대학교  24  1.1%
고려대학교  23  1.0%
중앙대학교  22  1.0%
서울특별시의회  21  0.9%
한국철도기술연구원  19  0.8%
Other values (978)  1982  87.9%

Most occurring characters

ValueCountFrequency (%)
640
 
4.1%
590
 
3.8%
543
 
3.5%
535
 
3.4%
438
 
2.8%
411
 
2.6%
406
 
2.6%
402
 
2.6%
) 387
 
2.5%
( 381
 
2.4%
Other values (363) 10998
69.9%

Most occurring categories

ValueCountFrequency (%)
Other Letter 14043
89.3%
Other Symbol 438
 
2.8%
Close Punctuation 387
 
2.5%
Open Punctuation 381
 
2.4%
Uppercase Letter 209
 
1.3%
Space Separator 188
 
1.2%
Lowercase Letter 63
 
0.4%
Other Punctuation 11
 
0.1%
Decimal Number 10
 
0.1%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
640
 
4.6%
590
 
4.2%
543
 
3.9%
535
 
3.8%
411
 
2.9%
406
 
2.9%
402
 
2.9%
376
 
2.7%
376
 
2.7%
336
 
2.4%
Other values (313) 9428
67.1%
Uppercase Letter
ValueCountFrequency (%)
S 37
17.7%
C 26
12.4%
H 18
 
8.6%
K 16
 
7.7%
G 13
 
6.2%
A 12
 
5.7%
M 10
 
4.8%
I 9
 
4.3%
D 9
 
4.3%
E 9
 
4.3%
Other values (13) 50
23.9%
Lowercase Letter
ValueCountFrequency (%)
t 9
14.3%
e 8
12.7%
n 8
12.7%
a 5
7.9%
i 5
7.9%
r 5
7.9%
s 5
7.9%
g 4
6.3%
o 3
 
4.8%
l 3
 
4.8%
Other values (6) 8
12.7%
Other Punctuation
ValueCountFrequency (%)
& 6
54.5%
. 3
27.3%
1
 
9.1%
, 1
 
9.1%
Decimal Number
ValueCountFrequency (%)
2 6
60.0%
1 4
40.0%
Other Symbol
ValueCountFrequency (%)
438
100.0%
Close Punctuation
ValueCountFrequency (%)
) 387
100.0%
Open Punctuation
ValueCountFrequency (%)
( 381
100.0%
Space Separator
ValueCountFrequency (%)
188
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 14481
92.1%
Common 978
 
6.2%
Latin 272
 
1.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
640
 
4.4%
590
 
4.1%
543
 
3.7%
535
 
3.7%
438
 
3.0%
411
 
2.8%
406
 
2.8%
402
 
2.8%
376
 
2.6%
376
 
2.6%
Other values (314) 9764
67.4%
Latin
ValueCountFrequency (%)
S 37
 
13.6%
C 26
 
9.6%
H 18
 
6.6%
K 16
 
5.9%
G 13
 
4.8%
A 12
 
4.4%
M 10
 
3.7%
I 9
 
3.3%
D 9
 
3.3%
E 9
 
3.3%
Other values (29) 113
41.5%
Common
ValueCountFrequency (%)
) 387
39.6%
( 381
39.0%
188
19.2%
2 6
 
0.6%
& 6
 
0.6%
1 4
 
0.4%
. 3
 
0.3%
1
 
0.1%
, 1
 
0.1%
- 1
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 14043
89.3%
ASCII 1249
 
7.9%
None 439
 
2.8%

Most frequent character per block

Hangul
ValueCountFrequency (%)
640
 
4.6%
590
 
4.2%
543
 
3.9%
535
 
3.8%
411
 
2.9%
406
 
2.9%
402
 
2.9%
376
 
2.7%
376
 
2.7%
336
 
2.4%
Other values (313) 9428
67.1%
None
ValueCountFrequency (%)
438
99.8%
1
 
0.2%
ASCII
ValueCountFrequency (%)
) 387
31.0%
( 381
30.5%
188
15.1%
S 37
 
3.0%
C 26
 
2.1%
H 18
 
1.4%
K 16
 
1.3%
G 13
 
1.0%
A 12
 
1.0%
M 10
 
0.8%
Other values (38) 161
12.9%

직위
Text

Distinct: 180
Distinct (%): 8.6%
Missing: 0
Missing (%): 0.0%
Memory size: 16.4 KiB

Length

Max length: 19
Median length: 16
Mean length: 3.2089337
Min length: 1

Characters and Unicode

Total characters: 6681
Distinct characters: 119
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 96
Unique (%): 4.6%

Sample

1st row: 교수
2nd row: 대표이사
3rd row: 대표이사
4th row: 교수
5th row: 선임연구원

Value  Count  Frequency (%)
부사장  220  10.5%
대표이사  205  9.8%
정교수  203  9.7%
대표  192  9.1%
교수  166  7.9%
부교수  97  4.6%
상무  91  4.3%
전무  86  4.1%
사장  73  3.5%
이사  70  3.3%
Other values (164)  697  33.2%

Most occurring characters

ValueCountFrequency (%)
665
 
10.0%
663
 
9.9%
568
 
8.5%
543
 
8.1%
451
 
6.8%
412
 
6.2%
411
 
6.2%
340
 
5.1%
235
 
3.5%
234
 
3.5%
Other values (109) 2159
32.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 6398
95.8%
Close Punctuation 114
 
1.7%
Open Punctuation 111
 
1.7%
Space Separator 25
 
0.4%
Decimal Number 25
 
0.4%
Other Punctuation 7
 
0.1%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
665
 
10.4%
663
 
10.4%
568
 
8.9%
543
 
8.5%
451
 
7.0%
412
 
6.4%
411
 
6.4%
340
 
5.3%
235
 
3.7%
234
 
3.7%
Other values (100) 1876
29.3%
Decimal Number
ValueCountFrequency (%)
1 12
48.0%
2 12
48.0%
3 1
 
4.0%
Other Punctuation
ValueCountFrequency (%)
/ 5
71.4%
, 2
 
28.6%
Close Punctuation
ValueCountFrequency (%)
) 114
100.0%
Open Punctuation
ValueCountFrequency (%)
( 111
100.0%
Space Separator
ValueCountFrequency (%)
25
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 6398
95.8%
Common 283
 
4.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
665
 
10.4%
663
 
10.4%
568
 
8.9%
543
 
8.5%
451
 
7.0%
412
 
6.4%
411
 
6.4%
340
 
5.3%
235
 
3.7%
234
 
3.7%
Other values (100) 1876
29.3%
Common
ValueCountFrequency (%)
) 114
40.3%
( 111
39.2%
25
 
8.8%
1 12
 
4.2%
2 12
 
4.2%
/ 5
 
1.8%
, 2
 
0.7%
- 1
 
0.4%
3 1
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 6398
95.8%
ASCII 283
 
4.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
665
 
10.4%
663
 
10.4%
568
 
8.9%
543
 
8.5%
451
 
7.0%
412
 
6.4%
411
 
6.4%
340
 
5.3%
235
 
3.7%
234
 
3.7%
Other values (100) 1876
29.3%
ASCII
ValueCountFrequency (%)
) 114
40.3%
( 111
39.2%
25
 
8.8%
1 12
 
4.2%
2 12
 
4.2%
/ 5
 
1.8%
, 2
 
0.7%
- 1
 
0.4%
3 1
 
0.4%

자격증
Text

MISSING 

Distinct: 431
Distinct (%): 33.1%
Missing: 779
Missing (%): 37.4%
Memory size: 16.4 KiB

Length

Max length: 70
Median length: 51
Mean length: 8.085188
Min length: 1

Characters and Unicode

Total characters: 10535
Distinct characters: 187
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 273
Unique (%): 21.0%

Sample

1st row: 공학박사
2nd row: 건축기계설비기술사
3rd row: 공학박사, 건축전기설비기술사
4th row: 공학박사
5th row: 공학박사

Value  Count  Frequency (%)
공학박사  161  8.7%
박사  136  7.4%
토목시공기술사  84  4.6%
건축사  63  3.4%
토목구조기술사  53  2.9%
석사  51  2.8%
토질및기초  45  2.4%
토질및기초기술사  42  2.3%
토목시공  41  2.2%
토목구조  38  2.1%
Other values (328)  1132  61.3%

Most occurring characters

ValueCountFrequency (%)
1221
 
11.6%
986
 
9.4%
637
 
6.0%
590
 
5.6%
522
 
5.0%
444
 
4.2%
411
 
3.9%
, 391
 
3.7%
339
 
3.2%
333
 
3.2%
Other values (177) 4661
44.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 9015
85.6%
Space Separator 590
 
5.6%
Other Punctuation 442
 
4.2%
Lowercase Letter 203
 
1.9%
Uppercase Letter 147
 
1.4%
Decimal Number 51
 
0.5%
Close Punctuation 39
 
0.4%
Open Punctuation 39
 
0.4%
Dash Punctuation 9
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1221
 
13.5%
986
 
10.9%
637
 
7.1%
522
 
5.8%
444
 
4.9%
411
 
4.6%
339
 
3.8%
333
 
3.7%
317
 
3.5%
301
 
3.3%
Other values (120) 3504
38.9%
Lowercase Letter
ValueCountFrequency (%)
n 30
14.8%
e 29
14.3%
i 22
10.8%
o 18
8.9%
a 17
8.4%
r 16
7.9%
t 14
6.9%
s 14
6.9%
g 9
 
4.4%
f 9
 
4.4%
Other values (10) 25
12.3%
Uppercase Letter
ValueCountFrequency (%)
P 27
18.4%
E 26
17.7%
V 19
12.9%
S 14
9.5%
M 13
8.8%
C 13
8.8%
A 10
 
6.8%
I 5
 
3.4%
L 5
 
3.4%
T 4
 
2.7%
Other values (9) 11
7.5%
Decimal Number
ValueCountFrequency (%)
1 24
47.1%
9 6
 
11.8%
8 5
 
9.8%
5 5
 
9.8%
2 5
 
9.8%
0 3
 
5.9%
6 1
 
2.0%
7 1
 
2.0%
3 1
 
2.0%
Other Punctuation
ValueCountFrequency (%)
, 391
88.5%
. 43
 
9.7%
/ 6
 
1.4%
' 1
 
0.2%
# 1
 
0.2%
Space Separator
ValueCountFrequency (%)
590
100.0%
Close Punctuation
ValueCountFrequency (%)
) 39
100.0%
Open Punctuation
ValueCountFrequency (%)
( 39
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 9
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 9015
85.6%
Common 1170
 
11.1%
Latin 350
 
3.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1221
 
13.5%
986
 
10.9%
637
 
7.1%
522
 
5.8%
444
 
4.9%
411
 
4.6%
339
 
3.8%
333
 
3.7%
317
 
3.5%
301
 
3.3%
Other values (120) 3504
38.9%
Latin
ValueCountFrequency (%)
n 30
 
8.6%
e 29
 
8.3%
P 27
 
7.7%
E 26
 
7.4%
i 22
 
6.3%
V 19
 
5.4%
o 18
 
5.1%
a 17
 
4.9%
r 16
 
4.6%
t 14
 
4.0%
Other values (29) 132
37.7%
Common
ValueCountFrequency (%)
590
50.4%
, 391
33.4%
. 43
 
3.7%
) 39
 
3.3%
( 39
 
3.3%
1 24
 
2.1%
- 9
 
0.8%
9 6
 
0.5%
/ 6
 
0.5%
8 5
 
0.4%
Other values (8) 18
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 9015
85.6%
ASCII 1520
 
14.4%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1221
 
13.5%
986
 
10.9%
637
 
7.1%
522
 
5.8%
444
 
4.9%
411
 
4.6%
339
 
3.8%
333
 
3.7%
317
 
3.5%
301
 
3.3%
Other values (120) 3504
38.9%
ASCII
ValueCountFrequency (%)
590
38.8%
, 391
25.7%
. 43
 
2.8%
) 39
 
2.6%
( 39
 
2.6%
n 30
 
2.0%
e 29
 
1.9%
P 27
 
1.8%
E 26
 
1.7%
1 24
 
1.6%
Other values (47) 282
18.6%

세부전공분야1
Text

Distinct: 501
Distinct (%): 24.3%
Missing: 17
Missing (%): 0.8%
Memory size: 16.4 KiB

Length

Max length: 26
Median length: 4
Mean length: 4.7898305
Min length: 1

Characters and Unicode

Total characters: 9891
Distinct characters: 240
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 301
Unique (%): 14.6%

Sample

1st row: 건축기계설비
2nd row: 건축기계설비
3rd row: 전기설비
4th row: 전기공학(전자통신분
5th row: 건축구조미진동

Value  Count  Frequency (%)
토목시공  138  5.8%
건축계획  103  4.3%
토목구조  100  4.2%
토질및기초  84  3.5%
(not shown)  82  3.4%
건축시공  75  3.2%
조경계획  66  2.8%
상하수도  58  2.4%
전기전력설비  54  2.3%
건축기계설비  53  2.2%
Other values (478)  1566  65.8%

Most occurring characters

ValueCountFrequency (%)
569
 
5.8%
467
 
4.7%
441
 
4.5%
431
 
4.4%
421
 
4.3%
399
 
4.0%
397
 
4.0%
352
 
3.6%
352
 
3.6%
295
 
3.0%
Other values (230) 5767
58.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 9385
94.9%
Space Separator 352
 
3.6%
Other Punctuation 60
 
0.6%
Uppercase Letter 31
 
0.3%
Open Punctuation 29
 
0.3%
Close Punctuation 23
 
0.2%
Dash Punctuation 10
 
0.1%
Decimal Number 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
569
 
6.1%
467
 
5.0%
441
 
4.7%
431
 
4.6%
421
 
4.5%
399
 
4.3%
397
 
4.2%
352
 
3.8%
295
 
3.1%
294
 
3.1%
Other values (210) 5319
56.7%
Uppercase Letter
ValueCountFrequency (%)
V 5
16.1%
E 5
16.1%
I 5
16.1%
S 4
12.9%
C 3
9.7%
B 3
9.7%
M 2
 
6.5%
R 1
 
3.2%
P 1
 
3.2%
T 1
 
3.2%
Other Punctuation
ValueCountFrequency (%)
, 43
71.7%
. 11
 
18.3%
/ 4
 
6.7%
? 2
 
3.3%
Space Separator
ValueCountFrequency (%)
352
100.0%
Open Punctuation
ValueCountFrequency (%)
( 29
100.0%
Close Punctuation
ValueCountFrequency (%)
) 23
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 10
100.0%
Decimal Number
ValueCountFrequency (%)
1 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 9385
94.9%
Common 475
 
4.8%
Latin 31
 
0.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
569
 
6.1%
467
 
5.0%
441
 
4.7%
431
 
4.6%
421
 
4.5%
399
 
4.3%
397
 
4.2%
352
 
3.8%
295
 
3.1%
294
 
3.1%
Other values (210) 5319
56.7%
Latin
ValueCountFrequency (%)
V 5
16.1%
E 5
16.1%
I 5
16.1%
S 4
12.9%
C 3
9.7%
B 3
9.7%
M 2
 
6.5%
R 1
 
3.2%
P 1
 
3.2%
T 1
 
3.2%
Common
ValueCountFrequency (%)
352
74.1%
, 43
 
9.1%
( 29
 
6.1%
) 23
 
4.8%
. 11
 
2.3%
- 10
 
2.1%
/ 4
 
0.8%
? 2
 
0.4%
1 1
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
Hangul 9385
94.9%
ASCII 506
 
5.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
569
 
6.1%
467
 
5.0%
441
 
4.7%
431
 
4.6%
421
 
4.5%
399
 
4.3%
397
 
4.2%
352
 
3.8%
295
 
3.1%
294
 
3.1%
Other values (210) 5319
56.7%
ASCII
ValueCountFrequency (%)
352
69.6%
, 43
 
8.5%
( 29
 
5.7%
) 23
 
4.5%
. 11
 
2.2%
- 10
 
2.0%
V 5
 
1.0%
E 5
 
1.0%
I 5
 
1.0%
S 4
 
0.8%
Other values (10) 19
 
3.8%

세부전공분야2
Text

MISSING 

Distinct: 462
Distinct (%): 39.5%
Missing: 913
Missing (%): 43.9%
Memory size: 16.4 KiB

Length

Max length: 13
Median length: 12
Mean length: 4.2994012
Min length: 2

Characters and Unicode

Total characters: 5026
Distinct characters: 241
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 258
Unique (%): 22.1%

Sample

1st row: 냉동냉장설비
2nd row: 전원설비
3rd row: 건축구조
4th row: 건축계획
5th row: 안전

Value  Count  Frequency (%)
조경설계  45  3.5%
터널  43  3.3%
강구조  36  2.8%
교량  36  2.8%
설계  29  2.3%
안전진단  28  2.2%
건축사  24  1.9%
(not shown)  23  1.8%
지반  20  1.6%
콘크리트구조  16  1.2%
Other values (445)  988  76.7%

Most occurring characters

ValueCountFrequency (%)
257
 
5.1%
222
 
4.4%
214
 
4.3%
165
 
3.3%
146
 
2.9%
139
 
2.8%
137
 
2.7%
127
 
2.5%
123
 
2.4%
118
 
2.3%
Other values (231) 3378
67.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 4783
95.2%
Space Separator 127
 
2.5%
Uppercase Letter 49
 
1.0%
Other Punctuation 31
 
0.6%
Open Punctuation 20
 
0.4%
Close Punctuation 14
 
0.3%
Dash Punctuation 2
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
257
 
5.4%
222
 
4.6%
214
 
4.5%
165
 
3.4%
146
 
3.1%
139
 
2.9%
137
 
2.9%
123
 
2.6%
118
 
2.5%
112
 
2.3%
Other values (214) 3150
65.9%
Uppercase Letter
ValueCountFrequency (%)
C 15
30.6%
M 8
16.3%
A 6
 
12.2%
P 6
 
12.2%
E 4
 
8.2%
V 4
 
8.2%
D 2
 
4.1%
F 2
 
4.1%
T 1
 
2.0%
L 1
 
2.0%
Other Punctuation
ValueCountFrequency (%)
, 26
83.9%
? 4
 
12.9%
. 1
 
3.2%
Space Separator
ValueCountFrequency (%)
127
100.0%
Open Punctuation
ValueCountFrequency (%)
( 20
100.0%
Close Punctuation
ValueCountFrequency (%)
) 14
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 4783
95.2%
Common 194
 
3.9%
Latin 49
 
1.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
257
 
5.4%
222
 
4.6%
214
 
4.5%
165
 
3.4%
146
 
3.1%
139
 
2.9%
137
 
2.9%
123
 
2.6%
118
 
2.5%
112
 
2.3%
Other values (214) 3150
65.9%
Latin
ValueCountFrequency (%)
C 15
30.6%
M 8
16.3%
A 6
 
12.2%
P 6
 
12.2%
E 4
 
8.2%
V 4
 
8.2%
D 2
 
4.1%
F 2
 
4.1%
T 1
 
2.0%
L 1
 
2.0%
Common
ValueCountFrequency (%)
127
65.5%
, 26
 
13.4%
( 20
 
10.3%
) 14
 
7.2%
? 4
 
2.1%
- 2
 
1.0%
. 1
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 4783
95.2%
ASCII 243
 
4.8%

Most frequent character per block

Hangul
ValueCountFrequency (%)
257
 
5.4%
222
 
4.6%
214
 
4.5%
165
 
3.4%
146
 
3.1%
139
 
2.9%
137
 
2.9%
123
 
2.6%
118
 
2.5%
112
 
2.3%
Other values (214) 3150
65.9%
ASCII
ValueCountFrequency (%)
127
52.3%
, 26
 
10.7%
( 20
 
8.2%
C 15
 
6.2%
) 14
 
5.8%
M 8
 
3.3%
A 6
 
2.5%
P 6
 
2.5%
E 4
 
1.6%
? 4
 
1.6%
Other values (7) 13
 
5.3%

세부전공분야3
Text

MISSING 

Distinct: 278
Distinct (%): 62.9%
Missing: 1640
Missing (%): 78.8%
Memory size: 16.4 KiB

Length

Max length: 14
Median length: 11
Mean length: 5.0135747
Min length: 1

Characters and Unicode

Total characters: 2216
Distinct characters: 193
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 192
Unique (%): 43.4%

Sample

1st row: 배전배선
2nd row: 소방
3rd row: 공동주택
4th row: 유지관리
5th row: 전통조경계획

Value  Count  Frequency (%)
터널  16  3.0%
(not shown)  15  2.8%
안전진단  12  2.3%
강구조  11  2.1%
교량  10  1.9%
사면  10  1.9%
안전관리  8  1.5%
토질기초  8  1.5%
콘크리트구조  7  1.3%
구조해석  7  1.3%
Other values (267)  426  80.4%

Most occurring characters

ValueCountFrequency (%)
104
 
4.7%
71
 
3.2%
67
 
3.0%
65
 
2.9%
64
 
2.9%
62
 
2.8%
56
 
2.5%
52
 
2.3%
52
 
2.3%
48
 
2.2%
Other values (183) 1575
71.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 2009
90.7%
Space Separator 104
 
4.7%
Other Punctuation 51
 
2.3%
Uppercase Letter 32
 
1.4%
Open Punctuation 14
 
0.6%
Close Punctuation 5
 
0.2%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
71
 
3.5%
67
 
3.3%
65
 
3.2%
64
 
3.2%
62
 
3.1%
56
 
2.8%
52
 
2.6%
52
 
2.6%
48
 
2.4%
48
 
2.4%
Other values (165) 1424
70.9%
Uppercase Letter
ValueCountFrequency (%)
C 7
21.9%
V 7
21.9%
E 7
21.9%
L 3
9.4%
H 2
 
6.2%
A 2
 
6.2%
I 1
 
3.1%
T 1
 
3.1%
S 1
 
3.1%
R 1
 
3.1%
Other Punctuation
ValueCountFrequency (%)
, 43
84.3%
/ 5
 
9.8%
? 2
 
3.9%
. 1
 
2.0%
Space Separator
ValueCountFrequency (%)
104
100.0%
Open Punctuation
ValueCountFrequency (%)
( 14
100.0%
Close Punctuation
ValueCountFrequency (%)
) 5
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 2009
90.7%
Common 175
 
7.9%
Latin 32
 
1.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
71
 
3.5%
67
 
3.3%
65
 
3.2%
64
 
3.2%
62
 
3.1%
56
 
2.8%
52
 
2.6%
52
 
2.6%
48
 
2.4%
48
 
2.4%
Other values (165) 1424
70.9%
Latin
ValueCountFrequency (%)
C 7
21.9%
V 7
21.9%
E 7
21.9%
L 3
9.4%
H 2
 
6.2%
A 2
 
6.2%
I 1
 
3.1%
T 1
 
3.1%
S 1
 
3.1%
R 1
 
3.1%
Common
ValueCountFrequency (%)
104
59.4%
, 43
24.6%
( 14
 
8.0%
/ 5
 
2.9%
) 5
 
2.9%
? 2
 
1.1%
- 1
 
0.6%
. 1
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
Hangul 2009
90.7%
ASCII 207
 
9.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
104
50.2%
, 43
20.8%
( 14
 
6.8%
C 7
 
3.4%
V 7
 
3.4%
E 7
 
3.4%
/ 5
 
2.4%
) 5
 
2.4%
L 3
 
1.4%
H 2
 
1.0%
Other values (8) 10
 
4.8%
Hangul
ValueCountFrequency (%)
71
 
3.5%
67
 
3.3%
65
 
3.2%
64
 
3.2%
62
 
3.1%
56
 
2.8%
52
 
2.6%
52
 
2.6%
48
 
2.4%
48
 
2.4%
Other values (165) 1424
70.9%

위촉상태
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.4 KiB

위촉  2068
해촉  14

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 위촉
2nd row: 위촉
3rd row: 위촉
4th row: 위촉
5th row: 위촉

Common Values

Value  Count  Frequency (%)
위촉  2068  99.3%
해촉  14  0.7%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Correlations

Variable  위원기수  전문분야  위촉상태
위원기수  1.000  0.546  0.056
전문분야  0.546  1.000  0.099
위촉상태  0.056  0.099  1.000

Variable  위촉상태  전문분야  위원기수
위촉상태  1.000  0.078  0.056
전문분야  0.078  1.000  0.238
위원기수  0.056  0.238  1.000

Variable  위원기수  전문분야  위촉상태
위원기수  1.000  0.238  0.056
전문분야  0.238  1.000  0.078
위촉상태  0.056  0.078  1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
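
To make the idea of nullity correlation concrete, the sketch below computes a Pearson correlation between the presence indicators of two columns. It is an illustration only: the function name `nullity_correlation` is hypothetical, and the input lists hold just the 세부전공분야2 and 세부전공분야3 values of the first five sample rows shown in this report (with `None` standing in for `<NA>`), not the full dataset.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation of presence indicators (1 = present, 0 = missing)."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is always present (or always missing) has no
        # variation, so the correlation is undefined.
        return float("nan")
    return cov / sqrt(var_a * var_b)


# First five sample rows only; None marks a missing value.
specialty2 = [None, "냉동냉장설비", "전원설비", None, "건축구조"]
specialty3 = [None, None, "배전배선", None, None]

r = nullity_correlation(specialty2, specialty3)
print(round(r, 3))
```

A value near 1 would mean the two columns tend to be filled in or left empty together; a value near 0 means the presence of one says little about the other.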

Sample

위원기수위원명전문분야소속직위자격증세부전공분야1세부전공분야2세부전공분야3위촉상태
09기정광섭건축기계설비서울산업대학교교수공학박사건축기계설비<NA><NA>위촉
19기황인환건축기계설비(주)한은기술사사무소대표이사건축기계설비기술사건축기계설비냉동냉장설비<NA>위촉
29기강태은전기전력설비(주)한성컨설턴트대표이사공학박사, 건축전기설비기술사전기설비전원설비배전배선위촉
39기김갑일전기전력설비명지대학교교수공학박사전기공학(전자통신분<NA><NA>위촉
49기이승은건축구조포항산업과학연구원선임연구원공학박사건축구조미진동건축구조<NA>위촉
59기이인영건축구조(주)오푸스필구조기술사사무소대표이사건축구조기술사건축구조<NA><NA>위촉
69기강경인건축시공고려대학교교수공학박사건축시공<NA><NA>위촉
79기권회구건축시공(재)한국재난연구원강남원장건축시공기술사건축시공<NA><NA>위촉
89기김분란건축시공푸른미래도시 광진연구소소장건축사건축시공건축계획<NA>위촉
99기김태환건축시공용인대학교교수공학박사건축시공안전소방위촉
위원기수위원명전문분야소속직위자격증세부전공분야1세부전공분야2세부전공분야3위촉상태
207215기고주연교통㈜이산상무<NA>교통교통계획<NA>위촉
207315기금기정교통명지대학교정교수<NA>교통<NA><NA>위촉
207415기김지현교통㈜명성대표이사<NA>교통설계교통영향평가<NA>위촉
207515기이경아교통㈜태승알엔디상무<NA>교통안전<NA><NA>위촉
207615기이미영교통국토연구원책임연구원<NA>교통도로철도위촉
207715기이수범교통서울시립대학교정교수<NA>교통교통계획<NA>위촉
207815기허정아교통한국토지주택공사과장<NA>교통계획교통영향평가<NA>위촉
207915기강창구도로서울시설공단처장<NA>도로환승주차장<NA>위촉
208015기김철중도로한국도로공사부장<NA>고속도로설계<NA><NA>위촉
208115기문성호도로서울과학기술대학교부교수<NA>도로포장도로구조도로소음위촉

Duplicate rows

Most frequently occurring

Index | 위원기수 | 위원명 | 전문분야 | 소속 | 직위 | 자격증 | 세부전공분야1 | 세부전공분야2 | 세부전공분야3 | 위촉상태 | # duplicates
0 | 10기 | 김승철 | 토질및기초 | 삼성물산(주)건설부문 | 상무 | 토질및기초기술사.시공.품질기술사 | 토질및기초 | 토목시공 | <NA> | 위촉 | 2