Overview

Dataset statistics

Number of variables: 6
Number of observations: 63
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.2 KiB
Average record size in memory: 52.1 B

Variable types

Text: 1
Categorical: 3
Numeric: 2

Dataset

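Description: Data on the annual training courses offered to trainees of the Human Resources Development Institute (인재개발원), including course name (과정명), training period (교육기간), number of cohorts (기수), trainees per cohort (기수당 인원), and total annual trainees (연인원).
Author: Gyeongsangnam-do (경상남도)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=3078627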

Alerts

기수 is highly correlated overall with 연인원 and 1 other field (High correlation)
연인원 is highly correlated overall with 기수 and 2 other fields (High correlation)
교육기간 is highly correlated overall with 연인원 and 1 other field (High correlation)
기당인원 is highly correlated overall with 기수 and 2 other fields (High correlation)
과정명 has unique values (Unique)

Reproduction

Analysis started: 2023-08-15 04:43:19.318599
Analysis finished: 2023-08-15 04:43:21.148631
Duration: 1.83 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
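
The reported duration can be checked directly from the two timestamps. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and are not part of the report.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block of this report.
started = datetime.fromisoformat("2023-08-15 04:43:19.318599")
finished = datetime.fromisoformat("2023-08-15 04:43:21.148631")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```

Rounded to two decimals, the difference agrees with the Duration line above.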

Variables

과정명
Text

UNIQUE 

Distinct: 63
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 636.0 B

Length

Max length: 27
Median length: 15
Mean length: 12.222222
Min length: 4

Characters and Unicode

Total characters: 770
Distinct characters: 190
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
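
As an illustration of such character properties, the sketch below tallies the Unicode general category of every character in one sample value of 과정명, using Python's standard `unicodedata` module. It is not the profiler's own implementation, and the helper name `category_counts` is hypothetical. The two-letter codes correspond to the long names used in this report: `Lo` is Other Letter, `Zs` is Space Separator, `Nd` is Decimal Number, and `Sm` is Math Symbol. Scripts and blocks are not exposed by `unicodedata`, so they are left out here.

```python
import unicodedata
from collections import Counter

def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

# Second sample row of the 과정명 column, normalized so that each Hangul
# syllable is a single precomposed code point.
sample = unicodedata.normalize("NFC", "7~8급 승진자 역량향상 과정")
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```

Summing such counters over all 63 values would give a per-category table of the kind shown under "Most occurring categories".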

Unique

Unique: 63
Unique (%): 100.0%

Sample

1st row: 신임 인재 양성 과정
2nd row: 7~8급 승진자 역량향상 과정
3rd row: 6급 승진자 역량향상 과정
4th row: 임기제공무원 역량 향상 과정
5th row: 중견리더 과정

Value | Count | Frequency (%)
과정 | 59 | 27.2%
역량 | 6 | 2.8%
향상 | 6 | 2.8%
찾아가는 | 5 | 2.3%
이해 | 5 | 2.3%
실무 | 4 | 1.8%
리더십 | 4 | 1.8%
역량향상 | 4 | 1.8%
경남 | 3 | 1.4%
지속가능발전 | 2 | 0.9%
Other values (111) | 119 | 54.8%

Most occurring characters

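Value | Count | Frequency (%)
(space) | 158 | 20.5%
(not shown) | 63 | 8.2%
(not shown) | 63 | 8.2%
(not shown) | 13 | 1.7%
(not shown) | 12 | 1.6%
(not shown) | 12 | 1.6%
(not shown) | 11 | 1.4%
(not shown) | 10 | 1.3%
(not shown) | 10 | 1.3%
(not shown) | 10 | 1.3%
Other values (180) | 408 | 53.0%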

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 592 | 76.9%
Space Separator | 158 | 20.5%
Decimal Number | 8 | 1.0%
Uppercase Letter | 7 | 0.9%
Other Punctuation | 4 | 0.5%
Math Symbol | 1 | 0.1%

Most frequent character per category

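Other Letter
Value | Count | Frequency (%)
(not shown) | 63 | 10.6%
(not shown) | 63 | 10.6%
(not shown) | 13 | 2.2%
(not shown) | 12 | 2.0%
(not shown) | 12 | 2.0%
(not shown) | 11 | 1.9%
(not shown) | 10 | 1.7%
(not shown) | 10 | 1.7%
(not shown) | 10 | 1.7%
(not shown) | 9 | 1.5%
Other values (163) | 379 | 64.0%

Decimal Number
Value | Count | Frequency (%)
3 | 1 | 12.5%
6 | 1 | 12.5%
1 | 1 | 12.5%
2 | 1 | 12.5%
4 | 1 | 12.5%
7 | 1 | 12.5%
8 | 1 | 12.5%
5 | 1 | 12.5%

Uppercase Letter
Value | Count | Frequency (%)
D | 2 | 28.6%
R | 1 | 14.3%
L | 1 | 14.3%
S | 1 | 14.3%
T | 1 | 14.3%
F | 1 | 14.3%

Space Separator
Value | Count | Frequency (%)
(space) | 158 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 4 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 1 | 100.0%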

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 592 | 76.9%
Common | 171 | 22.2%
Latin | 7 | 0.9%

Most frequent character per script

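Hangul
Value | Count | Frequency (%)
(not shown) | 63 | 10.6%
(not shown) | 63 | 10.6%
(not shown) | 13 | 2.2%
(not shown) | 12 | 2.0%
(not shown) | 12 | 2.0%
(not shown) | 11 | 1.9%
(not shown) | 10 | 1.7%
(not shown) | 10 | 1.7%
(not shown) | 10 | 1.7%
(not shown) | 9 | 1.5%
Other values (163) | 379 | 64.0%

Common
Value | Count | Frequency (%)
(space) | 158 | 92.4%
, | 4 | 2.3%
3 | 1 | 0.6%
6 | 1 | 0.6%
1 | 1 | 0.6%
2 | 1 | 0.6%
~ | 1 | 0.6%
4 | 1 | 0.6%
7 | 1 | 0.6%
8 | 1 | 0.6%

Latin
Value | Count | Frequency (%)
D | 2 | 28.6%
R | 1 | 14.3%
L | 1 | 14.3%
S | 1 | 14.3%
T | 1 | 14.3%
F | 1 | 14.3%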

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 592 | 76.9%
ASCII | 178 | 23.1%

Most frequent character per block

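ASCII
Value | Count | Frequency (%)
(space) | 158 | 88.8%
, | 4 | 2.2%
D | 2 | 1.1%
3 | 1 | 0.6%
6 | 1 | 0.6%
1 | 1 | 0.6%
2 | 1 | 0.6%
R | 1 | 0.6%
L | 1 | 0.6%
S | 1 | 0.6%
Other values (7) | 7 | 3.9%

Hangul
Value | Count | Frequency (%)
(not shown) | 63 | 10.6%
(not shown) | 63 | 10.6%
(not shown) | 13 | 2.2%
(not shown) | 12 | 2.0%
(not shown) | 12 | 2.0%
(not shown) | 11 | 1.9%
(not shown) | 10 | 1.7%
(not shown) | 10 | 1.7%
(not shown) | 10 | 1.7%
(not shown) | 9 | 1.5%
Other values (163) | 379 | 64.0%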

교육기간
Categorical

HIGH CORRELATION 

Distinct: 10
Distinct (%): 15.9%
Missing: 0
Missing (%): 0.0%
Memory size: 636.0 B

Length

Max length: 4
Median length: 2
Mean length: 2.047619
Min length: 2

Unique

Unique: 5
Unique (%): 7.9%

Sample

1st row: 3주
2nd row: 4일
3rd row: 4일
4th row: 3일
5th row: 43주

Common Values

Value | Count | Frequency (%)
3일 | 33 | 52.4%
2일 | 9 | 14.3%
4일 | 6 | 9.5%
1일 | 6 | 9.5%
2h | 4 | 6.3%
3주 | 1 | 1.6%
43주 | 1 | 1.6%
3h | 1 | 1.6%
2~4h | 1 | 1.6%
2주 | 1 | 1.6%
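
The Frequency (%), Distinct, and Unique figures for 교육기간 all follow from the counts in this table. The sketch below recomputes them in plain Python as a consistency check; it assumes that "Unique" means values occurring exactly once, which matches the numbers reported for this variable. The variable names are illustrative.

```python
# Counts copied from the Common Values table for 교육기간.
counts = {"3일": 33, "2일": 9, "4일": 6, "1일": 6, "2h": 4,
          "3주": 1, "43주": 1, "3h": 1, "2~4h": 1, "2주": 1}

total = sum(counts.values())                       # number of observations
percentages = {value: round(100 * n / total, 1) for value, n in counts.items()}
distinct = len(counts)                             # number of different values
unique = sum(1 for n in counts.values() if n == 1) # values seen exactly once

for value, n in counts.items():
    print(f"{value} | {n} | {percentages[value]}%")
print(f"Distinct: {distinct} ({100 * distinct / total:.1f}%)")
print(f"Unique: {unique} ({100 * unique / total:.1f}%)")
```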

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values tabulated above]

기수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 11
Distinct (%): 17.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.0634921
Minimum: 1
Maximum: 19
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 699.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 2
Q3: 3
95-th percentile: 12.6
Maximum: 19
Range: 18
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 3.6802569
Coefficient of variation (CV): 1.2013274
Kurtosis: 8.8931476
Mean: 3.0634921
Median Absolute Deviation (MAD): 1
Skewness: 2.9856035
Sum: 193
Variance: 13.544291
Monotonicity: Not monotonic
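
Because 기수 has only 11 distinct values, the whole column can be rebuilt from the value counts listed in this section, and most of the statistics above can then be recomputed. The sketch below does so with the standard library. It assumes the sample (n - 1) variance and linear interpolation between neighbouring ranks for percentiles, both of which reproduce the reported figures; the `percentile` helper is hypothetical and is not taken from the profiler.

```python
import statistics

# Value counts for 기수, combined from the frequency tables in this section.
value_counts = {1: 22, 2: 22, 3: 6, 4: 4, 5: 3, 6: 1,
                9: 1, 13: 1, 15: 1, 16: 1, 19: 1}
data = sorted(v for v, n in value_counts.items() for _ in range(n))

def percentile(sorted_values, q):
    """Percentile for q in [0, 1], interpolating linearly between ranks."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])

mean = statistics.mean(data)
variance = statistics.variance(data)  # sample variance, divides by n - 1
stdev = statistics.stdev(data)
cv = stdev / mean                     # coefficient of variation
median = statistics.median(data)
q1, q3 = percentile(data, 0.25), percentile(data, 0.75)
p95 = percentile(data, 0.95)

print(f"n={len(data)} sum={sum(data)} mean={mean:.7f}")
print(f"variance={variance:.6f} stdev={stdev:.7f} cv={cv:.7f}")
print(f"median={median} Q1={q1} Q3={q3} IQR={q3 - q1} p95={p95:.1f}")
```

The gap between the mean (about 3.06) and the median (2) comes from the handful of courses with 9 or more cohorts, which is also what drives the large positive skewness.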
[Figure: histogram with fixed size bins (bins=11)]
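
Common values
Value | Count | Frequency (%)
2 | 22 | 34.9%
1 | 22 | 34.9%
3 | 6 | 9.5%
4 | 4 | 6.3%
5 | 3 | 4.8%
9 | 1 | 1.6%
13 | 1 | 1.6%
19 | 1 | 1.6%
16 | 1 | 1.6%
6 | 1 | 1.6%

Minimum 10 values
Value | Count | Frequency (%)
1 | 22 | 34.9%
2 | 22 | 34.9%
3 | 6 | 9.5%
4 | 4 | 6.3%
5 | 3 | 4.8%
6 | 1 | 1.6%
9 | 1 | 1.6%
13 | 1 | 1.6%
15 | 1 | 1.6%
16 | 1 | 1.6%

Maximum 10 values
Value | Count | Frequency (%)
19 | 1 | 1.6%
16 | 1 | 1.6%
15 | 1 | 1.6%
13 | 1 | 1.6%
9 | 1 | 1.6%
6 | 1 | 1.6%
5 | 3 | 4.8%
4 | 4 | 6.3%
3 | 6 | 9.5%
2 | 22 | 34.9%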

기당인원
Categorical

HIGH CORRELATION 

Distinct: 10
Distinct (%): 15.9%
Missing: 0
Missing (%): 0.0%
Memory size: 636.0 B

Length

Max length: 6
Median length: 2
Mean length: 2.1269841
Min length: 2

Unique

Unique: 4
Unique (%): 6.3%

Sample

1st row: 80~250
2nd row: 20
3rd row: 20
4th row: 20
5th row: 78

Common Values

Value | Count | Frequency (%)
30 | 29 | 46.0%
20 | 20 | 31.7%
40 | 4 | 6.3%
200 | 2 | 3.2%
100 | 2 | 3.2%
25 | 2 | 3.2%
80~250 | 1 | 1.6%
78 | 1 | 1.6%
10 | 1 | 1.6%
50 | 1 | 1.6%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values tabulated above]

연인원
Real number (ℝ)

HIGH CORRELATION 

Distinct: 18
Distinct (%): 28.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 204.73016
Minimum: 20
Maximum: 3200
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 699.0 B

Quantile statistics

Minimum: 20
5-th percentile: 30
Q1: 30
Median: 40
Q3: 90
95-th percentile: 1300
Maximum: 3200
Range: 3180
Interquartile range (IQR): 60

Descriptive statistics

Standard deviation: 573.65656
Coefficient of variation (CV): 2.802013
Kurtosis: 17.15248
Mean: 204.73016
Median Absolute Deviation (MAD): 10
Skewness: 4.1460827
Sum: 12898
Variance: 329081.85
Monotonicity: Not monotonic
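
For 연인원 the mean (about 204.7) is roughly five times the median (40), so robust measures such as the MAD say more about a typical course than the standard deviation does. The sketch below rebuilds the column from the value counts in this section and recomputes the median, MAD, and CV. It assumes the MAD is the median of absolute deviations from the median, with no scaling factor, which matches the reported value of 10.

```python
import statistics

# Value counts for 연인원, combined from the frequency tables in this section.
value_counts = {20: 3, 30: 16, 40: 14, 50: 2, 60: 8, 78: 1, 80: 2, 90: 5,
                100: 3, 120: 1, 150: 1, 160: 1, 300: 1, 400: 1,
                1400: 1, 1900: 1, 2600: 1, 3200: 1}
data = [v for v, n in value_counts.items() for _ in range(n)]

median = statistics.median(data)
# Median absolute deviation: the typical distance from the median.
mad = statistics.median(abs(x - median) for x in data)
mean = statistics.mean(data)
variance = statistics.variance(data)    # sample variance, divides by n - 1
cv = statistics.stdev(data) / mean      # coefficient of variation

print(f"n={len(data)} sum={sum(data)}")
print(f"median={median} MAD={mad} mean={mean:.5f} CV={cv:.6f}")
```

Four courses with 1,400 or more annual trainees account for most of the variance; the MAD of 10 is unaffected by them.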
[Figure: histogram with fixed size bins (bins=18)]

Common values
Value | Count | Frequency (%)
30 | 16 | 25.4%
40 | 14 | 22.2%
60 | 8 | 12.7%
90 | 5 | 7.9%
20 | 3 | 4.8%
100 | 3 | 4.8%
50 | 2 | 3.2%
80 | 2 | 3.2%
150 | 1 | 1.6%
120 | 1 | 1.6%
Other values (8) | 8 | 12.7%

Minimum 10 values
Value | Count | Frequency (%)
20 | 3 | 4.8%
30 | 16 | 25.4%
40 | 14 | 22.2%
50 | 2 | 3.2%
60 | 8 | 12.7%
78 | 1 | 1.6%
80 | 2 | 3.2%
90 | 5 | 7.9%
100 | 3 | 4.8%
120 | 1 | 1.6%

Maximum 10 values
Value | Count | Frequency (%)
3200 | 1 | 1.6%
2600 | 1 | 1.6%
1900 | 1 | 1.6%
1400 | 1 | 1.6%
400 | 1 | 1.6%
300 | 1 | 1.6%
160 | 1 | 1.6%
150 | 1 | 1.6%
120 | 1 | 1.6%
100 | 3 | 4.8%

비고
Categorical

Distinct: 8
Distinct (%): 12.7%
Missing: 0
Missing (%): 0.0%
Memory size: 636.0 B

Length

Max length: 5
Median length: 4
Mean length: 3.8095238
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 기본역량
2nd row: 기본역량
3rd row: 기본역량
4th row: 기본역량
5th row: 기본역량

Common Values

Value | Count | Frequency (%)
핵심가치 | 14 | 22.2%
직무전문 | 13 | 20.6%
정보화 | 9 | 14.3%
기본역량 | 7 | 11.1%
리더십 | 7 | 11.1%
직무공통 | 6 | 9.5%
인문·소양 | 4 | 6.3%
도민참여 | 3 | 4.8%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values tabulated above]

Interactions

[Figure: four interaction plots; images not included]

Correlations

[Figure: correlation heatmap]
 | 과정명 | 교육기간 | 기수 | 기당인원 | 연인원 | 비고
과정명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
교육기간 | 1.000 | 1.000 | 0.746 | 0.949 | 0.863 | 0.492
기수 | 1.000 | 0.746 | 1.000 | 0.796 | 0.956 | 0.322
기당인원 | 1.000 | 0.949 | 0.796 | 1.000 | 0.886 | 0.345
연인원 | 1.000 | 0.863 | 0.956 | 0.886 | 1.000 | 0.000
비고 | 1.000 | 0.492 | 0.322 | 0.345 | 0.000 | 1.000

[Figure: correlation heatmap]
 | 기당인원 | 교육기간 | 비고
기당인원 | 1.000 | 0.618 | 0.162
교육기간 | 0.618 | 1.000 | 0.251
비고 | 0.162 | 0.251 | 1.000

[Figure: correlation heatmap]
 | 기수 | 연인원 | 교육기간 | 기당인원 | 비고
기수 | 1.000 | 0.887 | 0.470 | 0.532 | 0.103
연인원 | 0.887 | 1.000 | 0.661 | 0.703 | 0.000
교육기간 | 0.470 | 0.661 | 1.000 | 0.618 | 0.251
기당인원 | 0.532 | 0.703 | 0.618 | 1.000 | 0.162
비고 | 0.103 | 0.000 | 0.251 | 0.162 | 1.000

Missing values

[Figure: a simple visualization of nullity by column]
[Figure: nullity matrix, a data-dense display which lets you quickly pick out patterns in data completion]

Sample

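First rows
 | 과정명 | 교육기간 | 기수 | 기당인원 | 연인원 | 비고
0 | 신임 인재 양성 과정 | 3주 | 9 | 80~250 | 1400 | 기본역량
1 | 7~8급 승진자 역량향상 과정 | 4일 | 5 | 20 | 100 | 기본역량
2 | 6급 승진자 역량향상 과정 | 4일 | 5 | 20 | 100 | 기본역량
3 | 임기제공무원 역량 향상 과정 | 3일 | 2 | 20 | 40 | 기본역량
4 | 중견리더 과정 | 43주 | 1 | 78 | 78 | 기본역량
5 | 신규공무원 역량향상 심화 과정 | 2일 | 2 | 20 | 40 | 기본역량
6 | 전입공무원 역량향상 과정 | 2일 | 2 | 20 | 40 | 기본역량
7 | 경남 바로 알기 과정 | 3일 | 2 | 30 | 60 | 핵심가치
8 | 경남형 뉴딜 이해 과정 | 3일 | 2 | 30 | 60 | 핵심가치
9 | 기후위기 적응대응 과정 | 3일 | 2 | 30 | 60 | 핵심가치

Last rows
 | 과정명 | 교육기간 | 기수 | 기당인원 | 연인원 | 비고
53 | 멋진 보고서 꾸미기 과정 | 3일 | 3 | 30 | 90 | 정보화
54 | DSLR촬영 및 포토샵 활용 과정 | 3일 | 1 | 30 | 30 | 정보화
55 | 업무능력 2배 향상되는 오피스 활용 테크닉 과정 | 3일 | 2 | 20 | 40 | 정보화
56 | 상담기법 | 2일 | 2 | 20 | 40 | 인문·소양
57 | 스트레스 치유 과정 | 3일 | 3 | 30 | 90 | 인문·소양
58 | 테마가 있는 약초 탐방 과정 | 3일 | 3 | 30 | 90 | 인문·소양
59 | 미래설계 과정 | 2주 | 1 | 40 | 40 | 인문·소양
60 | 보조금 단체 회계실무 과정 | 1일 | 4 | 30 | 120 | 도민참여
61 | 안전교육 전문인력 교육 과정 | 2일 | 1 | 30 | 30 | 도민참여
62 | 지속가능발전 목표 이해 과정 | 1일 | 3 | 30 | 90 | 도민참여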
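
One possible reading of the High correlation alerts is that 연인원 is simply 기수 multiplied by 기당인원. The sketch below checks this against the 20 sample rows shown above; it says nothing about the other 43 rows, which this report does not list. Because 기당인원 is stored as text (one value is the range 80~250), it is parsed into a lower and upper bound first. The helper names are hypothetical.

```python
# (row index, 기수, 기당인원, 연인원) copied from the sample rows above.
rows = [
    (0, 9, "80~250", 1400), (1, 5, "20", 100), (2, 5, "20", 100),
    (3, 2, "20", 40), (4, 1, "78", 78), (5, 2, "20", 40), (6, 2, "20", 40),
    (7, 2, "30", 60), (8, 2, "30", 60), (9, 2, "30", 60),
    (53, 3, "30", 90), (54, 1, "30", 30), (55, 2, "20", 40),
    (56, 2, "20", 40), (57, 3, "30", 90), (58, 3, "30", 90),
    (59, 1, "40", 40), (60, 4, "30", 120), (61, 1, "30", 30),
    (62, 3, "30", 90),
]

def headcount_bounds(text):
    """Parse '20' or '80~250' into an inclusive (low, high) pair of integers."""
    parts = [int(p) for p in text.split("~")]
    return parts[0], parts[-1]

def is_consistent(cohorts, per_cohort, annual_total):
    """True if annual_total lies within cohorts * [low, high]."""
    low, high = headcount_bounds(per_cohort)
    return cohorts * low <= annual_total <= cohorts * high

inconsistent = [idx for idx, c, p, t in rows if not is_consistent(c, p, t)]
print("rows checked:", len(rows), "inconsistent:", inconsistent)
```

If the relationship holds for the full dataset, the strong correlations among these three columns are an arithmetic consequence of how 연인원 is defined rather than an independent finding.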