Overview

Dataset statistics

Number of variables: 6
Number of observations: 72
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.6 KiB
Average record size in memory: 51.8 B

Variable types

Text: 1
Categorical: 3
Numeric: 2

Dataset

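Description: Data on the annual training courses offered to trainees of the Human Resources Development Institute (인재개발원). It provides the course name (과정명), training period (교육기간), number of cohorts (기수), participants per cohort (기수당인원), total annual participants (연인원), and related fields.
Author: 경상남도 (Gyeongsangnam-do)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=3078627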

Alerts

기수 is highly overall correlated with 연인원 (High correlation)
연인원 is highly overall correlated with 기수 and 2 other fields (High correlation)
교육기간 is highly overall correlated with 연인원 and 1 other fields (High correlation)
기수당인원 is highly overall correlated with 연인원 and 1 other fields (High correlation)
과정명 has unique values (Unique)

Reproduction

Analysis started: 2023-12-11 00:22:24.267200
Analysis finished: 2023-12-11 00:22:25.069156
Duration: 0.8 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
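
A report of this kind can be regenerated from the source CSV along the following lines. This is a minimal sketch, not the configuration actually used for this report: the function name `build_profile`, the file paths, and the report title are hypothetical, and the CSV may need an explicit `encoding` argument depending on how it was exported.

```python
from pathlib import Path


def build_profile(csv_path: str, html_path: str = "report.html") -> Path:
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so this module can be loaded even where the
    # third-party libraries are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)  # assumption: default encoding works for the file
    report = ProfileReport(df, title="Training courses profile")  # hypothetical title
    report.to_file(html_path)
    return Path(html_path)
```

Calling `build_profile("courses.csv")` (a hypothetical file name) would write `report.html` to the working directory.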

Variables

과정명
Text

UNIQUE 

Distinct: 72
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 708.0 B

Length

Max length: 22
Median length: 17
Mean length: 12.611111
Min length: 7

Characters and Unicode

Total characters: 908
Distinct characters: 210
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 72
Unique (%): 100.0%

Sample

1st row: 신임 인재 양성 과정
2nd row: 신규 공무원 역량 향상 심화 과정
3rd row: 7ㆍ8급 승진자 역량 향상 과정
4th row: 6급 승진자 역량 향상 과정
5th row: 임기제공무원 역량 향상 과정

Value | Count | Frequency (%)
과정 | 48 | 20.7%
과정(★ | 12 | 5.2%
향상 | 9 | 3.9%
역량 | 7 | 3.0%
이해 | 4 | 1.7%
실무 | 3 | 1.3%
경남 | 3 | 1.3%
맞춤형 | 3 | 1.3%
활용 | 3 | 1.3%
관리자 | 2 | 0.9%
Other values (126) | 138 | 59.5%

Most occurring characters

ValueCountFrequency (%)
163
 
18.0%
74
 
8.1%
72
 
7.9%
( 19
 
2.1%
19
 
2.1%
19
 
2.1%
) 19
 
2.1%
12
 
1.3%
11
 
1.2%
11
 
1.2%
Other values (200) 489
53.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 672 | 74.0%
Space Separator | 163 | 18.0%
Open Punctuation | 20 | 2.2%
Close Punctuation | 20 | 2.2%
Other Symbol | 19 | 2.1%
Uppercase Letter | 10 | 1.1%
Decimal Number | 4 | 0.4%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
74
 
11.0%
72
 
10.7%
19
 
2.8%
12
 
1.8%
11
 
1.6%
11
 
1.6%
11
 
1.6%
11
 
1.6%
10
 
1.5%
10
 
1.5%
Other values (182) 431
64.1%
Uppercase Letter
ValueCountFrequency (%)
T 2
20.0%
U 2
20.0%
F 1
10.0%
I 1
10.0%
A 1
10.0%
M 1
10.0%
P 1
10.0%
N 1
10.0%
Decimal Number
ValueCountFrequency (%)
7 1
25.0%
8 1
25.0%
5 1
25.0%
6 1
25.0%
Open Punctuation
ValueCountFrequency (%)
( 19
95.0%
[ 1
 
5.0%
Close Punctuation
ValueCountFrequency (%)
) 19
95.0%
] 1
 
5.0%
Space Separator
ValueCountFrequency (%)
163
100.0%
Other Symbol
ValueCountFrequency (%)
19
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 672 | 74.0%
Common | 226 | 24.9%
Latin | 10 | 1.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
74
 
11.0%
72
 
10.7%
19
 
2.8%
12
 
1.8%
11
 
1.6%
11
 
1.6%
11
 
1.6%
11
 
1.6%
10
 
1.5%
10
 
1.5%
Other values (182) 431
64.1%
Common
ValueCountFrequency (%)
163
72.1%
( 19
 
8.4%
19
 
8.4%
) 19
 
8.4%
7 1
 
0.4%
8 1
 
0.4%
5 1
 
0.4%
] 1
 
0.4%
[ 1
 
0.4%
6 1
 
0.4%
Latin
ValueCountFrequency (%)
T 2
20.0%
U 2
20.0%
F 1
10.0%
I 1
10.0%
A 1
10.0%
M 1
10.0%
P 1
10.0%
N 1
10.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 668 | 73.6%
ASCII | 217 | 23.9%
Misc Symbols | 19 | 2.1%
Compat Jamo | 4 | 0.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
163
75.1%
( 19
 
8.8%
) 19
 
8.8%
T 2
 
0.9%
U 2
 
0.9%
F 1
 
0.5%
I 1
 
0.5%
7 1
 
0.5%
A 1
 
0.5%
M 1
 
0.5%
Other values (7) 7
 
3.2%
Hangul
ValueCountFrequency (%)
74
 
11.1%
72
 
10.8%
19
 
2.8%
12
 
1.8%
11
 
1.6%
11
 
1.6%
11
 
1.6%
11
 
1.6%
10
 
1.5%
10
 
1.5%
Other values (181) 427
63.9%
Misc Symbols
ValueCountFrequency (%)
19
100.0%
Compat Jamo
ValueCountFrequency (%)
4
100.0%

교육기간
Categorical

HIGH CORRELATION 

Distinct: 11
Distinct (%): 15.3%
Missing: 0
Missing (%): 0.0%
Memory size: 708.0 B

Length

Max length: 4
Median length: 2
Mean length: 2.0833333
Min length: 2

Unique

Unique: 6
Unique (%): 8.3%

Sample

1st row: 3주
2nd row: 3일
3rd row: 4일
4th row: 4일
5th row: 3일

Common Values

Value | Count | Frequency (%)
3일 | 38 | 52.8%
2일 | 11 | 15.3%
1일 | 8 | 11.1%
4일 | 6 | 8.3%
2h | 3 | 4.2%
3주 | 1 | 1.4%
43주 | 1 | 1.4%
2주 | 1 | 1.4%
3h | 1 | 1.4%
4h | 1 | 1.4%
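
Because 교육기간 mixes weeks (주), days (일), and hours (h), it is profiled as a categorical rather than a numeric variable. The following is a minimal sketch of how such strings could be split into a numeric range and a unit before further analysis. The function name `parse_duration`, the regular expression, and the English unit labels are illustrative assumptions; the range form `2~3h` is taken from the last sample row of this report, and the whitespace stripping guards against values that carry a trailing space.

```python
import re

# Unit suffixes seen in the 교육기간 column: 주 (weeks), 일 (days), h (hours).
UNIT_NAMES = {"주": "weeks", "일": "days", "h": "hours"}
PATTERN = re.compile(r"^(\d+)(?:~(\d+))?(주|일|h)$")


def parse_duration(raw: str) -> tuple[int, int, str]:
    """Return (low, high, unit) for strings such as '3일', '2h ' or '2~3h'."""
    match = PATTERN.match(raw.strip())
    if match is None:
        raise ValueError(f"unrecognised duration: {raw!r}")
    low = int(match.group(1))
    # A single value such as '3일' is treated as the degenerate range 3..3.
    high = int(match.group(2)) if match.group(2) else low
    return low, high, UNIT_NAMES[match.group(3)]
```

No conversion between units is attempted here, since the number of training hours per day or days per week is not stated in the dataset description.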

Length

Histogram of lengths of the category

기수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9
Distinct (%): 12.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.6527778
Minimum: 1
Maximum: 10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 780.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 2
Q3: 3
95-th percentile: 8
Maximum: 10
Range: 9
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 2.2776299
Coefficient of variation (CV): 0.85858299
Kurtosis: 3.4049862
Mean: 2.6527778
Median Absolute Deviation (MAD): 1
Skewness: 1.9515625
Sum: 191
Variance: 5.1875978
Monotonicity: Not monotonic
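
The figures above can be cross-checked against the value counts reported for this variable. The following is a minimal sketch, assuming the sample (n - 1) convention for the variance; the dictionary `FREQ` is transcribed by hand from this report's frequency counts for 기수, and all variable names are illustrative.

```python
import math

# Value -> count pairs transcribed from the 기수 frequency counts in this report.
FREQ = {1: 28, 2: 18, 3: 12, 4: 5, 5: 1, 6: 1, 7: 2, 8: 2, 10: 3}

n = sum(FREQ.values())                          # number of observations
total = sum(v * c for v, c in FREQ.items())     # the "Sum" field
mean = total / n
# Sample variance: squared deviations weighted by count, divided by n - 1.
variance = sum(c * (v - mean) ** 2 for v, c in FREQ.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean                                 # coefficient of variation

print(n, total, round(mean, 7), round(variance, 7), round(std, 7), round(cv, 8))
```

Under that assumption the recomputed count, sum, mean, variance, standard deviation, and CV agree with the reported values to the precision shown.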
Histogram with fixed size bins (bins=9)

Common Values
Value | Count | Frequency (%)
1 | 28 | 38.9%
2 | 18 | 25.0%
3 | 12 | 16.7%
4 | 5 | 6.9%
10 | 3 | 4.2%
8 | 2 | 2.8%
7 | 2 | 2.8%
5 | 1 | 1.4%
6 | 1 | 1.4%

Minimum values
Value | Count | Frequency (%)
1 | 28 | 38.9%
2 | 18 | 25.0%
3 | 12 | 16.7%
4 | 5 | 6.9%
5 | 1 | 1.4%
6 | 1 | 1.4%
7 | 2 | 2.8%
8 | 2 | 2.8%
10 | 3 | 4.2%

Maximum values
Value | Count | Frequency (%)
10 | 3 | 4.2%
8 | 2 | 2.8%
7 | 2 | 2.8%
6 | 1 | 1.4%
5 | 1 | 1.4%
4 | 5 | 6.9%
3 | 12 | 16.7%
2 | 18 | 25.0%
1 | 28 | 38.9%

기수당인원
Categorical

HIGH CORRELATION 

Distinct: 13
Distinct (%): 18.1%
Missing: 0
Missing (%): 0.0%
Memory size: 708.0 B

Length

Max length: 7
Median length: 2
Mean length: 2.1527778
Min length: 2

Unique

Unique: 9
Unique (%): 12.5%

Sample

1st row: 150
2nd row: 40
3rd row: 20
4th row: 20
5th row: 20

Common Values

Value | Count | Frequency (%)
20 | 32 | 44.4%
30 | 25 | 34.7%
40 | 3 | 4.2%
25 | 3 | 4.2%
150 | 1 | 1.4%
180 | 1 | 1.4%
250~300 | 1 | 1.4%
320 | 1 | 1.4%
80 | 1 | 1.4%
15 | 1 | 1.4%
Other values (3) | 3 | 4.2%
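
Although most values of 기수당인원 are plain integers, the column is profiled as Categorical. One plausible reason is the range value `250~300`, which cannot be read as a single number. The following is a minimal sketch that flags such values and splits them into bounds; the list `LISTED` contains only the ten values shown in the Common Values listing (the three "Other values" are not itemized in the report), and the helper name `parse_headcount` is hypothetical.

```python
# Values copied from the Common Values listing; the 3 "Other values" are unknown.
LISTED = ["20", "30", "40", "25", "150", "180", "250~300", "320", "80", "15"]


def parse_headcount(raw: str) -> tuple[int, int]:
    """Return (low, high); a plain number yields low == high."""
    low, _, high = raw.strip().partition("~")
    return int(low), int(high or low)


# Values that are not a bare run of digits and so block a numeric dtype.
non_numeric = [v for v in LISTED if not v.isdigit()]
parsed = {v: parse_headcount(v) for v in LISTED}

print(non_numeric)          # ['250~300']
print(parsed["250~300"])    # (250, 300)
```

Whether to keep the range as two bounds or collapse it to a single figure is an analysis decision that the source data does not settle.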

Length

Histogram of lengths of the category

연인원
Real number (ℝ)

HIGH CORRELATION 

Distinct: 20
Distinct (%): 27.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 182.29167
Minimum: 15
Maximum: 3200
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 780.0 B

Quantile statistics

Minimum: 15
5-th percentile: 20
Q1: 30
Median: 60
Q3: 90
95-th percentile: 960
Maximum: 3200
Range: 3185
Interquartile range (IQR): 60

Descriptive statistics

Standard deviation: 496.17618
Coefficient of variation (CV): 2.7218808
Kurtosis: 22.342718
Mean: 182.29167
Median Absolute Deviation (MAD): 30
Skewness: 4.5664211
Sum: 13125
Variance: 246190.8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=20)

Common Values
Value | Count | Frequency (%)
60 | 14 | 19.4%
30 | 13 | 18.1%
20 | 12 | 16.7%
40 | 7 | 9.7%
80 | 4 | 5.6%
90 | 4 | 5.6%
100 | 3 | 4.2%
120 | 3 | 4.2%
1400 | 1 | 1.4%
600 | 1 | 1.4%
Other values (10) | 10 | 13.9%

Minimum values
Value | Count | Frequency (%)
15 | 1 | 1.4%
20 | 12 | 16.7%
30 | 13 | 18.1%
40 | 7 | 9.7%
50 | 1 | 1.4%
60 | 14 | 19.4%
80 | 4 | 5.6%
90 | 4 | 5.6%
100 | 3 | 4.2%
120 | 3 | 4.2%

Maximum values
Value | Count | Frequency (%)
3200 | 1 | 1.4%
1900 | 1 | 1.4%
1800 | 1 | 1.4%
1400 | 1 | 1.4%
600 | 1 | 1.4%
320 | 1 | 1.4%
250 | 1 | 1.4%
200 | 1 | 1.4%
160 | 1 | 1.4%
140 | 1 | 1.4%

비고
Categorical

Distinct: 9
Distinct (%): 12.5%
Missing: 0
Missing (%): 0.0%
Memory size: 708.0 B

Length

Max length: 4
Median length: 4
Mean length: 3.4166667
Min length: 2

Unique

Unique: 1
Unique (%): 1.4%

Sample

1st row: 기본
2nd row: 기본
3rd row: 기본
4th row: 기본
5th row: 기본

Common Values

Value | Count | Frequency (%)
직무전문 | 17 | 23.6%
기본 | 13 | 18.1%
핵심과제 | 9 | 12.5%
디지털 | 9 | 12.5%
직무공통 | 7 | 9.7%
인문소양 | 7 | 9.7%
리더십 | 5 | 6.9%
도민참여 | 4 | 5.6%
장기 | 1 | 1.4%

Length

Histogram of lengths of the category


Interactions

[Figures: four interaction plots, not reproduced]

Correlations

Variable | 과정명 | 교육기간 | 기수 | 기수당인원 | 연인원 | 비고
과정명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
교육기간 | 1.000 | 1.000 | 0.517 | 0.904 | 0.934 | 0.705
기수 | 1.000 | 0.517 | 1.000 | 0.673 | 0.646 | 0.618
기수당인원 | 1.000 | 0.904 | 0.673 | 1.000 | 1.000 | 0.715
연인원 | 1.000 | 0.934 | 0.646 | 1.000 | 1.000 | 0.224
비고 | 1.000 | 0.705 | 0.618 | 0.715 | 0.224 | 1.000

Variable | 기수당인원 | 교육기간 | 비고
기수당인원 | 1.000 | 0.655 | 0.392
교육기간 | 0.655 | 1.000 | 0.408
비고 | 0.392 | 0.408 | 1.000

Variable | 기수 | 연인원 | 교육기간 | 기수당인원 | 비고
기수 | 1.000 | 0.867 | 0.256 | 0.353 | 0.237
연인원 | 0.867 | 1.000 | 0.803 | 0.938 | 0.120
교육기간 | 0.256 | 0.803 | 1.000 | 0.655 | 0.408
기수당인원 | 0.353 | 0.938 | 0.655 | 1.000 | 0.392
비고 | 0.237 | 0.120 | 0.408 | 0.392 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

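First rows
Row | 과정명 | 교육기간 | 기수 | 기수당인원 | 연인원 | 비고
0 | 신임 인재 양성 과정 | 3주 | 10 | 150 | 1400 | 기본
1 | 신규 공무원 역량 향상 심화 과정 | 3일 | 8 | 40 | 320 | 기본
2 | 7ㆍ8급 승진자 역량 향상 과정 | 4일 | 5 | 20 | 100 | 기본
3 | 6급 승진자 역량 향상 과정 | 4일 | 4 | 20 | 80 | 기본
4 | 임기제공무원 역량 향상 과정 | 3일 | 1 | 20 | 20 | 기본
5 | 전입공무원 역량향상 과정(★) | 3일 | 1 | 20 | 20 | 기본
6 | 소통과 공감 과정 | 2일 | 7 | 20 | 140 | 기본
7 | 대민소통역량 강화 과정 | 2일 | 2 | 30 | 60 | 기본
8 | 고위공직자 청렴 교육 과정 | 1일 | 2 | 30 | 60 | 기본
9 | 공직자이해충돌방지 과정(★) | 1일 | 3 | 40 | 120 | 기본

Last rows
Row | 과정명 | 교육기간 | 기수 | 기수당인원 | 연인원 | 비고
62 | 홍보영상제작 과정(★) | 3일 | 2 | 30 | 60 | 디지털
63 | 개인정보 보호 과정 | 2일 | 1 | 30 | 30 | 디지털
64 | 업무용 오피스 활용 과정 | 3일 | 4 | 25 | 100 | 디지털
65 | 사물인터넷과 로보틱스 이해 과정 | 3일 | 2 | 20 | 40 | 디지털
66 | 메타버스와 NFT이해 과정 | 3일 | 3 | 20 | 60 | 디지털
67 | 스마트기기 활용 업무능력 향상 과정 | 3일 | 4 | 25 | 100 | 디지털
68 | 보조금 단체 회계실무 과정 | 1일 | 3 | 30 | 90 | 도민참여
69 | 안전교육 전문인력 과정 | 2일 | 1 | 20 | 20 | 도민참여
70 | 지속가능발전 목표 이해 과정 | 4h | 3 | 30 | 90 | 도민참여
71 | 도민공감 캠퍼스(★) | 2~3h | 3 | 200 | 600 | 도민참여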
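
The high correlation flagged between 연인원, 기수, and 기수당인원 is consistent with 연인원 being close to the product of the other two in the sample rows. The following is a minimal sketch that checks this on the 20 sample rows only; the tuples are transcribed by hand from the sample, the variable names are illustrative, and nothing here should be read as a statement about the 52 rows not shown.

```python
# row index -> (기수, 기수당인원, 연인원), transcribed from the 20 sample rows.
ROWS = {
    0: (10, 150, 1400), 1: (8, 40, 320), 2: (5, 20, 100), 3: (4, 20, 80),
    4: (1, 20, 20), 5: (1, 20, 20), 6: (7, 20, 140), 7: (2, 30, 60),
    8: (2, 30, 60), 9: (3, 40, 120),
    62: (2, 30, 60), 63: (1, 30, 30), 64: (4, 25, 100), 65: (2, 20, 40),
    66: (3, 20, 60), 67: (4, 25, 100), 68: (3, 30, 90), 69: (1, 20, 20),
    70: (3, 30, 90), 71: (3, 200, 600),
}

# Rows where cohorts x participants-per-cohort differs from the annual total.
mismatches = {
    idx: (cohorts, per_cohort, annual)
    for idx, (cohorts, per_cohort, annual) in ROWS.items()
    if cohorts * per_cohort != annual
}

print(len(ROWS) - len(mismatches), "of", len(ROWS), "rows match the product")
print(mismatches)
```

In these sample rows only row 0 departs from the product (10 x 150 = 1500 against a reported 1400), which may be worth confirming against the source data.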