Overview

Dataset statistics

Number of variables: 5
Number of observations: 216
Missing cells: 120
Missing cells (%): 11.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 8.6 KiB
Average record size in memory: 40.6 B
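
The missing-cell percentage follows directly from the counts. As a quick sanity check, here is a minimal Python sketch; the variable names are illustrative, the totals are copied from this report, and the per-column missing counts (40 and 80) are taken from the report's alerts for 기간_시작 and 기간_종료.

```python
# Figures copied from this report; names are illustrative only.
n_variables = 5
n_observations = 216
missing_cells = 120

# Per-column missing counts as listed in the report's alerts.
missing_by_column = {"기간_시작": 40, "기간_종료": 80}

total_cells = n_variables * n_observations        # every cell in the table
missing_pct = 100 * missing_cells / total_cells   # share of cells that are missing

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
for column, count in missing_by_column.items():
    # Column-level percentages are relative to the number of observations.
    print(f"{column}: {100 * count / n_observations:.1f}% missing")
```

The two column-level counts add up to the 120 missing cells reported for the whole dataset.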

Variable types

Categorical: 2
Text: 2
DateTime: 1

Dataset

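Description: Career-history data for the instructors who teach the training courses run by Korea Asset Management Corporation (한국자산관리공사), such as job training and civil servant training
Author: 한국자산관리공사 (Korea Asset Management Corporation)
URL: https://www.data.go.kr/data/15111453/fileData.do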

Alerts

강사이름 is highly overall correlated with 강사번호 (High correlation)
강사번호 is highly overall correlated with 강사이름 (High correlation)
기간_시작 has 40 (18.5%) missing values (Missing)
기간_종료 has 80 (37.0%) missing values (Missing)

Reproduction

Analysis started: 2023-12-12 03:11:49.458461
Analysis finished: 2023-12-12 03:11:50.697643
Duration: 1.24 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

강사번호
Categorical

HIGH CORRELATION 

Distinct: 39
Distinct (%): 18.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

Value | Count
P060 | 74
P003 | 19
P180 | 7
P063 | 6
P151 | 6
Other values (34) | 104

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 5
Unique (%): 2.3%
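
"Distinct" and "Unique" measure different things in this report: Distinct counts how many different values occur, while Unique counts the values that occur exactly once. A minimal sketch of the difference, using a short hypothetical list of instructor IDs rather than the real column:

```python
from collections import Counter

# Hypothetical values for illustration; not the actual 강사번호 column.
values = ["P003", "P003", "P060", "P060", "P060", "P180", "P190"]

counts = Counter(values)

# Distinct: how many different values appear at all.
n_distinct = len(counts)

# Unique: how many values appear exactly once.
n_unique = sum(1 for count in counts.values() if count == 1)

print(n_distinct, n_unique)  # prints "4 2"
```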

Sample

1st row: P003
2nd row: P003
3rd row: P003
4th row: P003
5th row: P003

Common Values

Value | Count | Frequency (%)
P060 | 74 | 34.3%
P003 | 19 | 8.8%
P180 | 7 | 3.2%
P063 | 6 | 2.8%
P151 | 6 | 2.8%
P174 | 5 | 2.3%
P148 | 5 | 2.3%
P165 | 5 | 2.3%
P132 | 4 | 1.9%
P188 | 4 | 1.9%
Other values (29) | 81 | 37.5%

Length

Figure: Histogram of lengths of the category
Value | Count | Frequency (%)
p060 | 74 | 34.3%
p003 | 19 | 8.8%
p180 | 7 | 3.2%
p063 | 6 | 2.8%
p151 | 6 | 2.8%
p174 | 5 | 2.3%
p148 | 5 | 2.3%
p165 | 5 | 2.3%
p166 | 4 | 1.9%
p084 | 4 | 1.9%
Other values (29) | 81 | 37.5%

강사이름
Categorical

HIGH CORRELATION 

Distinct: 22
Distinct (%): 10.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

Value | Count
문** | 78
이** | 20
강** | 19
박** | 16
최** | 9
Other values (17) | 74

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 2
Unique (%): 0.9%

Sample

1st row: 강**
2nd row: 강**
3rd row: 강**
4th row: 강**
5th row: 강**

Common Values

Value | Count | Frequency (%)
문** | 78 | 36.1%
이** | 20 | 9.3%
강** | 19 | 8.8%
박** | 16 | 7.4%
최** | 9 | 4.2%
전** | 9 | 4.2%
임** | 8 | 3.7%
김** | 8 | 3.7%
한** | 7 | 3.2%
정** | 7 | 3.2%
Other values (12) | 35 | 16.2%

Length

Figure: Histogram of lengths of the category
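Value | Count | Frequency (%)
(not shown) | 78 | 36.1%
(not shown) | 20 | 9.3%
(not shown) | 19 | 8.8%
(not shown) | 16 | 7.4%
(not shown) | 9 | 4.2%
(not shown) | 9 | 4.2%
(not shown) | 8 | 3.7%
(not shown) | 8 | 3.7%
(not shown) | 7 | 3.2%
(not shown) | 7 | 3.2%
Other values (12) | 35 | 16.2%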

경력사항
Text

Distinct: 192
Distinct (%): 88.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB

Length

Max length: 258
Median length: 41
Mean length: 26.597222
Min length: 2

Characters and Unicode

Total characters: 5745
Distinct characters: 407
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
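
As an illustration of those character properties, the sketch below uses Python's standard `unicodedata` module to count the general category of each character in the first sample row of this variable. It only shows the idea; it is not necessarily how the report itself computes its tables. The two-letter codes correspond to the category names used in the report (for example `Lo` is Other Letter, `Zs` is Space Separator, `Lu` is Uppercase Letter, `Po` is Other Punctuation).

```python
import unicodedata
from collections import Counter

# First sample row of the 경력사항 column, as shown in this report.
text = "광주은행 개발업무 담당(DW, 기획, 차세대TFT)"

# unicodedata.category returns the two-letter general category of a character.
category_counts = Counter(unicodedata.category(ch) for ch in text)

for code, count in category_counts.most_common():
    print(code, count)
```

Hangul syllables fall into `Lo`, which is why Other Letter dominates the category table for this column.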

Unique

Unique: 172
Unique (%): 79.6%

Sample

1st row: 광주은행 개발업무 담당(DW, 기획, 차세대TFT)
2nd row: 광주은행 개발업무 담당
3rd row: 한국SW기술진흥협회(KOSTA) 재직자과정 및 청년취업아카데미과정, 취업예정자등
4th row: 한국 SW기술진흥협회 재직자과정 및 청년취업아카데미과정
5th row: 한국SW기술진흥협회 전문기술위원

Value | Count | Frequency (%)
(not shown) | 29 | 3.1%
빅데이터 | 21 | 2.3%
한국자산관리공사 | 15 | 1.6%
(not shown) | 10 | 1.1%
교육 | 9 | 1.0%
공무원 | 9 | 1.0%
캠코 | 8 | 0.9%
근무 | 8 | 0.9%
겸임교수 | 7 | 0.8%
분석 | 7 | 0.8%
Other values (527) | 806 | 86.8%

Most occurring characters

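Value | Count | Frequency (%)
(space) | 729 | 12.7%
, | 167 | 2.9%
(not shown) | 107 | 1.9%
(not shown) | 107 | 1.9%
(not shown) | 93 | 1.6%
) | 86 | 1.5%
( | 85 | 1.5%
(not shown) | 84 | 1.5%
(not shown) | 79 | 1.4%
(not shown) | 73 | 1.3%
Other values (397) | 4135 | 72.0%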

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3619 | 63.0%
Space Separator | 729 | 12.7%
Lowercase Letter | 522 | 9.1%
Uppercase Letter | 431 | 7.5%
Other Punctuation | 229 | 4.0%
Close Punctuation | 86 | 1.5%
Open Punctuation | 85 | 1.5%
Decimal Number | 26 | 0.5%
Other Symbol | 10 | 0.2%
Dash Punctuation | 7 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not shown) | 107 | 3.0%
(not shown) | 107 | 3.0%
(not shown) | 93 | 2.6%
(not shown) | 84 | 2.3%
(not shown) | 79 | 2.2%
(not shown) | 73 | 2.0%
(not shown) | 73 | 2.0%
(not shown) | 73 | 2.0%
(not shown) | 70 | 1.9%
(not shown) | 69 | 1.9%
Other values (329) | 2791 | 77.1%

Lowercase Letter
Value | Count | Frequency (%)
e | 58 | 11.1%
n | 45 | 8.6%
o | 44 | 8.4%
i | 41 | 7.9%
s | 40 | 7.7%
a | 40 | 7.7%
r | 38 | 7.3%
t | 38 | 7.3%
c | 36 | 6.9%
l | 18 | 3.4%
Other values (13) | 124 | 23.8%

Uppercase Letter
Value | Count | Frequency (%)
S | 43 | 10.0%
C | 42 | 9.7%
I | 36 | 8.4%
T | 35 | 8.1%
P | 32 | 7.4%
A | 29 | 6.7%
O | 29 | 6.7%
B | 26 | 6.0%
K | 23 | 5.3%
E | 19 | 4.4%
Other values (13) | 117 | 27.1%

Decimal Number
Value | Count | Frequency (%)
2 | 8 | 30.8%
1 | 7 | 26.9%
3 | 2 | 7.7%
4 | 2 | 7.7%
0 | 2 | 7.7%
8 | 2 | 7.7%
6 | 2 | 7.7%
7 | 1 | 3.8%

Other Punctuation
Value | Count | Frequency (%)
, | 167 | 72.9%
/ | 28 | 12.2%
. | 20 | 8.7%
: | 7 | 3.1%
& | 4 | 1.7%
' | 2 | 0.9%
# | 1 | 0.4%

Other Symbol
Value | Count | Frequency (%)
(not shown) | 8 | 80.0%
® | 2 | 20.0%

Space Separator
Value | Count | Frequency (%)
(space) | 729 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 86 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 85 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 7 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3627 | 63.1%
Common | 1165 | 20.3%
Latin | 953 | 16.6%

Most frequent character per script

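Hangul
Value | Count | Frequency (%)
(not shown) | 107 | 3.0%
(not shown) | 107 | 3.0%
(not shown) | 93 | 2.6%
(not shown) | 84 | 2.3%
(not shown) | 79 | 2.2%
(not shown) | 73 | 2.0%
(not shown) | 73 | 2.0%
(not shown) | 73 | 2.0%
(not shown) | 70 | 1.9%
(not shown) | 69 | 1.9%
Other values (330) | 2799 | 77.2%

Latin
Value | Count | Frequency (%)
e | 58 | 6.1%
n | 45 | 4.7%
o | 44 | 4.6%
S | 43 | 4.5%
C | 42 | 4.4%
i | 41 | 4.3%
s | 40 | 4.2%
a | 40 | 4.2%
r | 38 | 4.0%
t | 38 | 4.0%
Other values (36) | 524 | 55.0%

Common
Value | Count | Frequency (%)
(space) | 729 | 62.6%
, | 167 | 14.3%
) | 86 | 7.4%
( | 85 | 7.3%
/ | 28 | 2.4%
. | 20 | 1.7%
2 | 8 | 0.7%
: | 7 | 0.6%
- | 7 | 0.6%
1 | 7 | 0.6%
Other values (11) | 21 | 1.8%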

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3619 | 63.0%
ASCII | 2116 | 36.8%
None | 10 | 0.2%

Most frequent character per block

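ASCII
Value | Count | Frequency (%)
(space) | 729 | 34.5%
, | 167 | 7.9%
) | 86 | 4.1%
( | 85 | 4.0%
e | 58 | 2.7%
n | 45 | 2.1%
o | 44 | 2.1%
S | 43 | 2.0%
C | 42 | 2.0%
i | 41 | 1.9%
Other values (56) | 776 | 36.7%

Hangul
Value | Count | Frequency (%)
(not shown) | 107 | 3.0%
(not shown) | 107 | 3.0%
(not shown) | 93 | 2.6%
(not shown) | 84 | 2.3%
(not shown) | 79 | 2.2%
(not shown) | 73 | 2.0%
(not shown) | 73 | 2.0%
(not shown) | 73 | 2.0%
(not shown) | 70 | 1.9%
(not shown) | 69 | 1.9%
Other values (329) | 2791 | 77.1%

None
Value | Count | Frequency (%)
(not shown) | 8 | 80.0%
® | 2 | 20.0%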

기간_시작
Date

MISSING 

Distinct: 86
Distinct (%): 48.9%
Missing: 40
Missing (%): 18.5%
Memory size: 1.8 KiB
Minimum: 1991-07-01 00:00:00
Maximum: 2021-07-16 00:00:00
Figure: Histogram with fixed size bins (bins=50)

기간_종료
Text

MISSING 

Distinct: 71
Distinct (%): 52.2%
Missing: 80
Missing (%): 37.0%
Memory size: 1.8 KiB

Length

Max length: 10
Median length: 10
Mean length: 7.7058824
Min length: 1

Characters and Unicode

Total characters: 1048
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 46
Unique (%): 33.8%

Sample

1st row: 2000-04-01
2nd row: 2000-04-01
3rd row: 2015-09-01
4th row: 2002-05-01
5th row: 2013

Value | Count | Frequency (%)
2022 | 16 | 11.9%
2018 | 13 | 9.7%
2020 | 7 | 5.2%
2020-01-01 | 4 | 3.0%
2018-12-01 | 4 | 3.0%
2020-02-01 | 3 | 2.2%
2013 | 3 | 2.2%
2019 | 3 | 2.2%
2013-12-01 | 3 | 2.2%
2018-01-01 | 3 | 2.2%
Other values (60) | 75 | 56.0%
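
The values of 기간_종료 mix year-only strings such as "2013" with full dates such as "2000-04-01", which is consistent with the column being profiled as Text rather than as a date. A minimal sketch of one way to normalize such values with the standard library follows. The function name `parse_period` is hypothetical, and mapping a year-only value to January 1 of that year is an assumption made for illustration, not something the dataset documents.

```python
from datetime import date, datetime
from typing import Optional

def parse_period(value: Optional[str]) -> Optional[date]:
    """Parse 'YYYY-MM-DD' or 'YYYY'; return None for anything else."""
    if value is None:
        return None
    text = value.strip()
    # Try the most specific format first, then fall back to year-only.
    for fmt in ("%Y-%m-%d", "%Y"):
        try:
            # ASSUMPTION: a bare year is mapped to January 1 of that year.
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None  # unparseable values (e.g. "<NA>") are treated as missing

print(parse_period("2000-04-01"))  # 2000-04-01
print(parse_period("2013"))        # 2013-01-01
print(parse_period("<NA>"))        # None
```

Whether a year-only end date should instead map to December 31, or be kept as a separate precision flag, depends on how the periods are used downstream.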

Most occurring characters

Value | Count | Frequency (%)
0 | 306 | 29.2%
2 | 234 | 22.3%
1 | 202 | 19.3%
- | 170 | 16.2%
8 | 36 | 3.4%
9 | 24 | 2.3%
5 | 17 | 1.6%
4 | 17 | 1.6%
6 | 17 | 1.6%
3 | 15 | 1.4%
Other values (2) | 10 | 1.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 876 | 83.6%
Dash Punctuation | 170 | 16.2%
Space Separator | 2 | 0.2%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 306 | 34.9%
2 | 234 | 26.7%
1 | 202 | 23.1%
8 | 36 | 4.1%
9 | 24 | 2.7%
5 | 17 | 1.9%
4 | 17 | 1.9%
6 | 17 | 1.9%
3 | 15 | 1.7%
7 | 8 | 0.9%

Dash Punctuation
Value | Count | Frequency (%)
- | 170 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1048 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 306 | 29.2%
2 | 234 | 22.3%
1 | 202 | 19.3%
- | 170 | 16.2%
8 | 36 | 3.4%
9 | 24 | 2.3%
5 | 17 | 1.6%
4 | 17 | 1.6%
6 | 17 | 1.6%
3 | 15 | 1.4%
Other values (2) | 10 | 1.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1048 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 306 | 29.2%
2 | 234 | 22.3%
1 | 202 | 19.3%
- | 170 | 16.2%
8 | 36 | 3.4%
9 | 24 | 2.3%
5 | 17 | 1.6%
4 | 17 | 1.6%
6 | 17 | 1.6%
3 | 15 | 1.4%
Other values (2) | 10 | 1.0%

Correlations

Variable | 강사번호 | 강사이름 | 기간_시작 | 기간_종료
강사번호 | 1.000 | 1.000 | 0.781 | 0.000
강사이름 | 1.000 | 1.000 | 0.532 | 0.000
기간_시작 | 0.781 | 0.532 | 1.000 | 0.997
기간_종료 | 0.000 | 0.000 | 0.997 | 1.000

Variable | 강사이름 | 강사번호
강사이름 | 1.000 | 0.955
강사번호 | 0.955 | 1.000

Variable | 강사번호 | 강사이름
강사번호 | 1.000 | 0.955
강사이름 | 0.955 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
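
One common way to compute a nullity correlation is to turn each column into a 0/1 "is missing" indicator and take the Pearson correlation between the indicators. The sketch below does that with the standard library on a small hypothetical table. The rows are made up for illustration and are not the dataset, and the report does not state exactly how its heatmap values are computed, so treat this as an assumption about the general technique.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation; None if either series has no variation."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # a column that is never (or always) missing has no correlation
    return cov / sqrt(var_x * var_y)

# Hypothetical (start, end) pairs; None marks a missing value.
rows = [
    ("1991-07-01", "2000-04-01"),
    ("1999", None),
    ("2002", "2013"),
    (None, None),
    (None, None),
    ("2015", "2020"),
]

# 1 where the value is missing, 0 where it is present.
start_missing = [int(start is None) for start, _ in rows]
end_missing = [int(end is None) for _, end in rows]

nullity_corr = pearson(start_missing, end_missing)
print(round(nullity_corr, 3))  # 0.707
```

A value near 1 means the two columns tend to be missing in the same rows; a value near 0 means missingness in one says little about the other.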

Sample

Index | 강사번호 | 강사이름 | 경력사항 | 기간_시작 | 기간_종료
0 | P003 | 강** | 광주은행 개발업무 담당(DW, 기획, 차세대TFT) | 1991-07-01 | 2000-04-01
1 | P003 | 강** | 광주은행 개발업무 담당 | 1991-07-01 | 2000-04-01
2 | P003 | 강** | 한국SW기술진흥협회(KOSTA) 재직자과정 및 청년취업아카데미과정, 취업예정자등 | 1999 | <NA>
3 | P003 | 강** | 한국 SW기술진흥협회 재직자과정 및 청년취업아카데미과정 | 1999 | <NA>
4 | P003 | 강** | 한국SW기술진흥협회 전문기술위원 | 2000-01-01 | 2015-09-01
5 | P003 | 강** | 현대정보기술 업무총괄(사내벤처 운영) | 2000-07-01 | 2002-05-01
6 | P003 | 강** | 대형 SI 교육센터(SUN, SKC&C, 대우정보, 네이버 등) | 2002 | <NA>
7 | P003 | 강** | 프로젝트 On-Site (신한/국민/기업은행, 한진해운, 현대자동차, 현대모비스, CJ대한통운 등) | 2002 | 2013
8 | P003 | 강** | 프로젝트 On-Site(신한/국민/기업은행, 한진해운, 현대자동차, 현대모비스, CJ대한통운 등) | 2002 | 2013
9 | P003 | 강** | 대형 SI 교육센터(SUN,SKC&C, 대우정보, 네이버 등) | 2002 | <NA>

Index | 강사번호 | 강사이름 | 경력사항 | 기간_시작 | 기간_종료
206 | P180 | 한** | 삼성전자 상생협력아카데미 자문교수 | <NA> | <NA>
207 | P180 | 한** | PMBOK® Guide 3rd, 4th, 6th Edition 한글 번역감수위원 | <NA> | <NA>
208 | P180 | 한** | LG화학, LG에너지솔루션, 삼성 SDI, SK 하이닉스, SK 이노베이션, KT, LG 인화원, CJ 올리브네트웍스, 한국고용정보원, 한국 IBM, 삼성전자, NIA, 아시아나 IDT, 국방부, 동우화인켐, 인크루트, 소니 코리아, KEPCO, KIRD, 방위사업청, 휴넷, 메리츠 화재, 동부화재, 한양대학교, 현대자동차, 현대위아, 한화, 강원랜드, 이지케어택, 한솔인티큐브, 국가과학기술인력개발원, 삼성물산, 한화테크윈, 삼성중공업, DSME, 포스텍 등 | <NA> | <NA>
209 | P188 | 허** | 국유재산기획처 근무(도시계획 관련 정책지원 및 신규사업 발굴 및 시행) | 2015 | 2020
210 | P188 | 허** | 공무원 직무전문교육 강사(도시계획_국토계획법을 중심으로) | 2016 | 2018
211 | P188 | 허** | 사내 직무마스터 국유부동산 강사(도시계획 개론) | 2018 | 2020
212 | P188 | 허** | 국유본부 신규부임자교육 사내 강사(부동산 공법) | 2018 | 2020
213 | P190 | 황** | 경성대학교 미디어콘텐츠학과 교수 | <NA> | <NA>
214 | P190 | 황** | GIO Communication 대표, 밀양시 경관위원회 위원 | <NA> | <NA>
215 | P190 | 황** | 한국멀티미디어학회 이사, 숨쉬는 동천 홍보단장 | <NA> | <NA>