Overview

Dataset statistics

Number of variables: 5
Number of observations: 713
Missing cells: 1053
Missing cells (%): 29.5%
Duplicate rows: 40
Duplicate rows (%): 5.6%
Total size in memory: 28.0 KiB
Average record size in memory: 40.2 B
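
The derived figures above can be reproduced from the raw counts. The following is a minimal sketch, assuming that the missing-cell percentage is taken over all cells (observations times variables), that the duplicate percentage is taken over all rows, and that 1 KiB equals 1024 B. The variable names are illustrative and do not come from the report.

```python
# Raw counts as stated in the dataset statistics.
n_variables = 5
n_observations = 713
missing_cells = 1053
duplicate_rows = 40
memory_bytes = 28.0 * 1024  # 28.0 KiB, assuming 1 KiB = 1024 B

# Derived quantities.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = memory_bytes / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the rounded results agree with the reported 29.5%, 5.6%, and 40.2 B.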

Variable types

Categorical: 2
Text: 3

Dataset

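Description: Organization and duties of the Honam National Institute of Biological Resources (국립호남권생물자원관). The data contains information on category, department name, and duties.
Author: 국립호남권생물자원관
URL: https://www.data.go.kr/data/15118083/fileData.do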

Alerts

데이터기준일자 has constant value "2024-02-20" (Constant)
Dataset has 40 (5.6%) duplicate rows (Duplicates)
소속 is highly imbalanced (59.1%) (Imbalance)
성명 has 526 (73.8%) missing values (Missing)
전화번호 has 527 (73.9%) missing values (Missing)

Reproduction

Analysis started: 2024-04-20 14:17:22.451792
Analysis finished: 2024-04-20 14:17:24.182909
Duration: 1.73 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

소속
Categorical

IMBALANCE 

Distinct: 29
Distinct (%): 4.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.7 KiB

<NA>: 526
동물자원연구부: 18
식물자원연구부: 15
전시부: 15
인사총무부: 12
Other values (24): 127

Length

Max length: 14
Median length: 4
Mean length: 4.5708275
Min length: 2

Unique

Unique: 7
Unique (%): 1.0%

Sample

1st row: 관장
2nd row: 경영관리본부
3rd row: 도서생물연구본부
4th row: 감사실
5th row: 감사실

Common Values

Value | Count | Frequency (%)
<NA> | 526 | 73.8%
동물자원연구부 | 18 | 2.5%
식물자원연구부 | 15 | 2.1%
전시부 | 15 | 2.1%
인사총무부 | 12 | 1.7%
천연소재연구부 | 11 | 1.5%
실용화연구부 | 10 | 1.4%
섬야생생물소재 선진화연구단 | 10 | 1.4%
환경소재연구부 | 10 | 1.4%
시설관리부 | 9 | 1.3%
Other values (19) | 77 | 10.8%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
na | 526 | 72.8%
동물자원연구부 | 18 | 2.5%
식물자원연구부 | 15 | 2.1%
전시부 | 15 | 2.1%
인사총무부 | 12 | 1.7%
천연소재연구부 | 11 | 1.5%
실용화연구부 | 10 | 1.4%
섬야생생물소재 | 10 | 1.4%
선진화연구단 | 10 | 1.4%
환경소재연구부 | 10 | 1.4%
Other values (20) | 86 | 11.9%

성명
Text

MISSING 

Distinct: 182
Distinct (%): 97.3%
Missing: 526
Missing (%): 73.8%
Memory size: 5.7 KiB

Length

Max length: 3
Median length: 3
Mean length: 2.9786096
Min length: 2

Characters and Unicode

Total characters: 557
Distinct characters: 116
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 177
Unique (%): 94.7%

Sample

1st row: 류태철
2nd row: 조용환
3rd row: 유강열
4th row: 김종삼
5th row: 이효람
Value | Count | Frequency (%)
김미연 | 2 | 1.1%
이환휘 | 2 | 1.1%
최경민 | 2 | 1.1%
남보미 | 2 | 1.1%
이경준 | 2 | 1.1%
박철민 | 1 | 0.5%
정준성 | 1 | 0.5%
황성민 | 1 | 0.5%
박지원 | 1 | 0.5%
서혜민 | 1 | 0.5%
Other values (172) | 172 | 92.0%

Most occurring characters

ValueCountFrequency (%)
37
 
6.6%
36
 
6.5%
28
 
5.0%
19
 
3.4%
18
 
3.2%
16
 
2.9%
14
 
2.5%
13
 
2.3%
12
 
2.2%
11
 
2.0%
Other values (106) 353
63.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 557
100.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
37
 
6.6%
36
 
6.5%
28
 
5.0%
19
 
3.4%
18
 
3.2%
16
 
2.9%
14
 
2.5%
13
 
2.3%
12
 
2.2%
11
 
2.0%
Other values (106) 353
63.4%

Most occurring scripts

ValueCountFrequency (%)
Hangul 557
100.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
37
 
6.6%
36
 
6.5%
28
 
5.0%
19
 
3.4%
18
 
3.2%
16
 
2.9%
14
 
2.5%
13
 
2.3%
12
 
2.2%
11
 
2.0%
Other values (106) 353
63.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 557
100.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
37
 
6.6%
36
 
6.5%
28
 
5.0%
19
 
3.4%
18
 
3.2%
16
 
2.9%
14
 
2.5%
13
 
2.3%
12
 
2.2%
11
 
2.0%
Other values (106) 353
63.4%

전화번호
Text

MISSING 

Distinct: 178
Distinct (%): 95.7%
Missing: 527
Missing (%): 73.9%
Memory size: 5.7 KiB

Length

Max length: 14
Median length: 12
Mean length: 12.064516
Min length: 12

Characters and Unicode

Total characters: 2244
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
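
As an illustration of these character properties, the sketch below uses Python's standard `unicodedata` module to count the general category of each character in one of the phone-number values from this column. The helper name `category_counts` is hypothetical, and this is not necessarily how the profiling tool computes its tables.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# Category codes: 'Nd' = Decimal Number, 'Pd' = Dash Punctuation,
# 'Sm' = Math Symbol, 'Lo' = Other Letter.
counts = category_counts("061-288-7890~1")
for code, n in sorted(counts.items()):
    print(code, n)
```

The three codes found here correspond to the three categories the report lists for this column: Decimal Number, Dash Punctuation, and Math Symbol (the `~` in range-style numbers).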

Unique

Unique: 173
Unique (%): 93.0%

Sample

1st row: 061-288-7801
2nd row: 061-288-7802
3rd row: 061-288-7803
4th row: 061-288-7808
5th row: 061-288-7809
Value | Count | Frequency (%)
061-288-7890~1 | 5 | 2.7%
061-288-8946 | 2 | 1.1%
061-288-7812 | 2 | 1.1%
061-288-7826 | 2 | 1.1%
061-288-8901 | 2 | 1.1%
061-288-8915 | 1 | 0.5%
061-288-7801 | 1 | 0.5%
061-288-8909 | 1 | 0.5%
061-288-8905 | 1 | 0.5%
061-288-8908 | 1 | 0.5%
Other values (168) | 168 | 90.3%

Most occurring characters

Value | Count | Frequency (%)
8 | 547 | 24.4%
- | 372 | 16.6%
1 | 237 | 10.6%
2 | 226 | 10.1%
0 | 225 | 10.0%
6 | 216 | 9.6%
7 | 160 | 7.1%
9 | 140 | 6.2%
3 | 45 | 2.0%
4 | 39 | 1.7%
Other values (2) | 37 | 1.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1866 | 83.2%
Dash Punctuation | 372 | 16.6%
Math Symbol | 6 | 0.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
8 | 547 | 29.3%
1 | 237 | 12.7%
2 | 226 | 12.1%
0 | 225 | 12.1%
6 | 216 | 11.6%
7 | 160 | 8.6%
9 | 140 | 7.5%
3 | 45 | 2.4%
4 | 39 | 2.1%
5 | 31 | 1.7%

Dash Punctuation
Value | Count | Frequency (%)
- | 372 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 6 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2244 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
8 | 547 | 24.4%
- | 372 | 16.6%
1 | 237 | 10.6%
2 | 226 | 10.1%
0 | 225 | 10.0%
6 | 216 | 9.6%
7 | 160 | 7.1%
9 | 140 | 6.2%
3 | 45 | 2.0%
4 | 39 | 1.7%
Other values (2) | 37 | 1.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2244 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
8 | 547 | 24.4%
- | 372 | 16.6%
1 | 237 | 10.6%
2 | 226 | 10.1%
0 | 225 | 10.0%
6 | 216 | 9.6%
7 | 160 | 7.1%
9 | 140 | 6.2%
3 | 45 | 2.0%
4 | 39 | 1.7%
Other values (2) | 37 | 1.6%

담당업무
Text

Distinct: 620
Distinct (%): 87.0%
Missing: 0
Missing (%): 0.0%
Memory size: 5.7 KiB

Length

Max length: 47
Median length: 34
Mean length: 20.210379
Min length: 4

Characters and Unicode

Total characters: 14410
Distinct characters: 362
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 565
Unique (%): 79.2%

Sample

1st row: 국립호남권생물자원관 업무 총괄
2nd row: 경영관리본부 업무총괄
3rd row: 도서생물연구본부 업무 총괄
4th row: 감사실 업무 총괄
5th row: 기관 자체감사 계획 수립 및 시행에 관한 업무
ValueCountFrequency (%)
318
 
8.3%
관리 202
 
5.3%
업무 154
 
4.0%
지원 119
 
3.1%
관한 103
 
2.7%
운영 81
 
2.1%
사항 70
 
1.8%
부서 53
 
1.4%
51
 
1.3%
연구 42
 
1.1%
Other values (1024) 2637
68.9%

Most occurring characters

ValueCountFrequency (%)
3119
 
21.6%
542
 
3.8%
, 377
 
2.6%
335
 
2.3%
318
 
2.2%
276
 
1.9%
275
 
1.9%
271
 
1.9%
269
 
1.9%
259
 
1.8%
Other values (352) 8369
58.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 10365 | 71.9%
Space Separator | 3119 | 21.6%
Other Punctuation | 387 | 2.7%
Uppercase Letter | 167 | 1.2%
Close Punctuation | 150 | 1.0%
Open Punctuation | 150 | 1.0%
Decimal Number | 71 | 0.5%
Dash Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
542
 
5.2%
335
 
3.2%
318
 
3.1%
276
 
2.7%
275
 
2.7%
271
 
2.6%
269
 
2.6%
259
 
2.5%
231
 
2.2%
219
 
2.1%
Other values (319) 7370
71.1%
Uppercase Letter
ValueCountFrequency (%)
B 37
22.2%
I 37
22.2%
S 24
14.4%
D 21
12.6%
O 10
 
6.0%
M 9
 
5.4%
T 4
 
2.4%
F 4
 
2.4%
R 4
 
2.4%
L 3
 
1.8%
Other values (8) 14
 
8.4%
Decimal Number
ValueCountFrequency (%)
2 26
36.6%
1 11
15.5%
3 11
15.5%
4 10
 
14.1%
8 6
 
8.5%
6 2
 
2.8%
7 2
 
2.8%
5 2
 
2.8%
0 1
 
1.4%
Other Punctuation
ValueCountFrequency (%)
, 377
97.4%
/ 10
 
2.6%
Space Separator
ValueCountFrequency (%)
3119
100.0%
Close Punctuation
ValueCountFrequency (%)
) 150
100.0%
Open Punctuation
ValueCountFrequency (%)
( 150
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 10365 | 71.9%
Common | 3878 | 26.9%
Latin | 167 | 1.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
542
 
5.2%
335
 
3.2%
318
 
3.1%
276
 
2.7%
275
 
2.7%
271
 
2.6%
269
 
2.6%
259
 
2.5%
231
 
2.2%
219
 
2.1%
Other values (319) 7370
71.1%
Latin
ValueCountFrequency (%)
B 37
22.2%
I 37
22.2%
S 24
14.4%
D 21
12.6%
O 10
 
6.0%
M 9
 
5.4%
T 4
 
2.4%
F 4
 
2.4%
R 4
 
2.4%
L 3
 
1.8%
Other values (8) 14
 
8.4%
Common
ValueCountFrequency (%)
3119
80.4%
, 377
 
9.7%
) 150
 
3.9%
( 150
 
3.9%
2 26
 
0.7%
1 11
 
0.3%
3 11
 
0.3%
/ 10
 
0.3%
4 10
 
0.3%
8 6
 
0.2%
Other values (5) 8
 
0.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 10365 | 71.9%
ASCII | 4045 | 28.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3119
77.1%
, 377
 
9.3%
) 150
 
3.7%
( 150
 
3.7%
B 37
 
0.9%
I 37
 
0.9%
2 26
 
0.6%
S 24
 
0.6%
D 21
 
0.5%
1 11
 
0.3%
Other values (23) 93
 
2.3%
Hangul
ValueCountFrequency (%)
542
 
5.2%
335
 
3.2%
318
 
3.1%
276
 
2.7%
275
 
2.7%
271
 
2.6%
269
 
2.6%
259
 
2.5%
231
 
2.2%
219
 
2.1%
Other values (319) 7370
71.1%

데이터기준일자
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.7 KiB

2024-02-20: 713

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2024-02-20
2nd row: 2024-02-20
3rd row: 2024-02-20
4th row: 2024-02-20
5th row: 2024-02-20

Common Values

Value | Count | Frequency (%)
2024-02-20 | 713 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2024-02-20 | 713 | 100.0%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
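
One common way to quantify nullity correlation is the Pearson correlation between per-column presence indicators (1 if a value is present, 0 if it is missing). The sketch below assumes that definition; the report does not state the exact formula behind the heatmap. The indicator lists encode the first ten rows shown in the Sample section, and the function name `pearson` is illustrative.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Returns NaN when either sequence has zero variance, which is the case
    for a column that is always present or always missing.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


# 1 = present, 0 = missing, for rows 0 to 9 of the sample.
name_present = [1, 1, 1, 1, 1, 0, 0, 1, 1, 0]   # 성명
phone_present = [1, 1, 1, 1, 1, 0, 0, 1, 1, 0]  # 전화번호
duty_present = [1] * 10                         # 담당업무 (never missing)

print(pearson(name_present, phone_present))
print(pearson(name_present, duty_present))
```

In these ten rows 성명 and 전화번호 are missing together, so their correlation is close to 1; a column with no missing values has no defined nullity correlation under this definition.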

Sample

  | 소속 | 성명 | 전화번호 | 담당업무 | 데이터기준일자
0 | 관장 | 류태철 | 061-288-7801 | 국립호남권생물자원관 업무 총괄 | 2024-02-20
1 | 경영관리본부 | 조용환 | 061-288-7802 | 경영관리본부 업무총괄 | 2024-02-20
2 | 도서생물연구본부 | 유강열 | 061-288-7803 | 도서생물연구본부 업무 총괄 | 2024-02-20
3 | 감사실 | 김종삼 | 061-288-7808 | 감사실 업무 총괄 | 2024-02-20
4 | 감사실 | 이효람 | 061-288-7809 | 기관 자체감사 계획 수립 및 시행에 관한 업무 | 2024-02-20
5 | <NA> | <NA> | <NA> | 환경부 등 외부기관 감사 수감에 관한 업무 | 2024-02-20
6 | <NA> | <NA> | <NA> | 진정, 비위사실, 민원 등의 신고처리에 관한 업무 등 | 2024-02-20
7 | 전략기획실 | 정종범 | 061-288-7812 | 전략기획실 업무 총괄 | 2024-02-20
8 | 기획성과부 | 이정석 | 061-288-7812 | 기획성과부 업무 총괄 | 2024-02-20
9 | <NA> | <NA> | <NA> | 경영평가 및 성과평가 총괄 | 2024-02-20

  | 소속 | 성명 | 전화번호 | 담당업무 | 데이터기준일자
703 | <NA> | <NA> | <NA> | 생물, 소재 정보 DB 구축 지원 | 2024-02-20
704 | <NA> | <NA> | <NA> | 메타지놈 확보 및 DB 구축 지원 | 2024-02-20
705 | 섬야생생물소재 선진화연구단 | 이민지 | 061-288-8936 | 생물소재(펩타이드) 표준화 및 품질관리 | 2024-02-20
706 | <NA> | <NA> | <NA> | 펩타이드 및 오믹스빅데이터 유용성정보 DB 구축 | 2024-02-20
707 | <NA> | <NA> | <NA> | 분양 및 홍보부스 지원 | 2024-02-20
708 | 섬야생생물소재 선진화연구단 | 강소리 | 061-288-8941 | 다부처 국가생명연구자원 선진화사업 3세부 실무 | 2024-02-20
709 | <NA> | <NA> | <NA> | 도서, 연안 생물자원유래 소재 정보 보관 및 관리 | 2024-02-20
710 | 노동조합 | 이영남 | 061-288-7995 | 국립호남권생물자원관 노동조합 위원장 | 2024-02-20
711 | 노동조합 | 김미연 | 061-288-7994 | 국립호남권생물자원관 노동조합 사무국장 | 2024-02-20
712 | 전시부 | 전시관 | 061-288-7892~3 | 국립호남권생물자원관 전시관 | 2024-02-20

Duplicate rows

Most frequently occurring

  | 소속 | 성명 | 전화번호 | 담당업무 | 데이터기준일자 | # duplicates
12 | <NA> | <NA> | <NA> | 부서 및 연구사업 행정업무 지원 | 2024-02-20 | 9
20 | <NA> | <NA> | <NA> | 식물표본 입, 출입 관리 외 연구사업 지원 | 2024-02-20 | 4
26 | <NA> | <NA> | <NA> | 실험실, 부서 물품 재고 관리 | 2024-02-20 | 4
33 | <NA> | <NA> | <NA> | 입장료, 관람료 등 수입 징수 | 2024-02-20 | 4
1 | <NA> | <NA> | <NA> | 경영공시, 경영평가 지표 및 관련 규정 정비 | 2024-02-20 | 3
2 | <NA> | <NA> | <NA> | 고등식물표본수장고, 예비수장고 운영 지원 | 2024-02-20 | 3
4 | <NA> | <NA> | <NA> | 관람객 안전관리에 관한 사항 | 2024-02-20 | 3
5 | <NA> | <NA> | <NA> | 교육, 행사 등 공용차량 운행 지원 | 2024-02-20 | 3
6 | <NA> | <NA> | <NA> | 국회 및 환경부 정책자료 대응 | 2024-02-20 | 3
8 | <NA> | <NA> | <NA> | 기타 업무분장에 포함되지 않은 사항 | 2024-02-20 | 3
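
Every duplicate listed above has <NA> in 소속, 성명, and 전화번호, so rows collide whenever the same 담당업무 text repeats. The sketch below shows one way to find such groups, assuming a duplicate is a row whose values match another row in all five columns. The rows are a small illustrative subset with `None` standing in for <NA>, and `duplicate_groups` is a hypothetical helper, not the profiling tool's own routine.

```python
from collections import Counter

# Illustrative rows: (소속, 성명, 전화번호, 담당업무, 데이터기준일자).
rows = [
    ("감사실", "김종삼", "061-288-7808", "감사실 업무 총괄", "2024-02-20"),
    (None, None, None, "부서 및 연구사업 행정업무 지원", "2024-02-20"),
    (None, None, None, "부서 및 연구사업 행정업무 지원", "2024-02-20"),
    (None, None, None, "실험실, 부서 물품 재고 관리", "2024-02-20"),
    (None, None, None, "부서 및 연구사업 행정업무 지원", "2024-02-20"),
]


def duplicate_groups(rows):
    """Return {row: occurrence count} for every row that appears more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


groups = duplicate_groups(rows)
for row, n in groups.items():
    print(row[3], n)
```

Whether such rows should be dropped depends on the use case: here they may represent the same duty assigned to several staff members whose identifying fields were left blank.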