Overview

Dataset statistics

Number of variables: 9
Number of observations: 51
Missing cells: 33
Missing cells (%): 7.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.9 KiB
Average record size in memory: 77.6 B
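
The percentage figures above follow directly from the counts. As a quick check, the sketch below recomputes the missing-cell share from the reported totals. The per-column missing counts (1 and 32) are taken from the Alerts section of this report; the variable names are illustrative only.

```python
# Recompute the "Missing cells (%)" figure from the reported counts.
n_variables = 9
n_observations = 51

# Per-column missing counts as listed in the Alerts section.
missing_by_column = {"모집단위(직류)": 1, "모집단위(지역)": 32}

total_cells = n_variables * n_observations        # 9 * 51 cells in the table
missing_cells = sum(missing_by_column.values())   # should match "Missing cells"
missing_pct = 100 * missing_cells / total_cells

print(f"{missing_cells} of {total_cells} cells missing ({missing_pct:.1f}%)")
for column, count in missing_by_column.items():
    print(f"{column}: {100 * count / n_observations:.1f}% missing")
```

The same division by the number of observations gives the per-column percentages shown in the alerts.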

Variable types

Categorical: 3
Text: 3
Numeric: 3

Dataset

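Description: Data on application receipts for the Grade 5 open competitive recruitment examinations (administrative, technical, and diplomat candidate tracks), with fields such as job series, planned number of selections, number of applicants, and competition ratio.
Author: Ministry of Personnel Management (인사혁신처)
URL: https://www.data.go.kr/data/15060869/fileData.do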

Alerts

선발예정인원 is highly overall correlated with 출원인원 and 1 other field (High correlation)
출원인원 is highly overall correlated with 선발예정인원 and 2 other fields (High correlation)
경쟁률 is highly overall correlated with 출원인원 and 1 other field (High correlation)
시험명 is highly overall correlated with 선발예정인원 and 2 other fields (High correlation)
모집단위(직렬) is highly overall correlated with 경쟁률 and 1 other field (High correlation)
모집단위(직류) has 1 (2.0%) missing value (Missing)
모집단위(지역) has 32 (62.7%) missing values (Missing)

Reproduction

Analysis started: 2024-03-16 04:20:12.491727
Analysis finished: 2024-03-16 04:20:14.301354
Duration: 1.81 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

모집구분
Categorical

Distinct: 2
Distinct (%): 3.9%
Missing: 0
Missing (%): 0.0%
Memory size: 540.0 B

Length

Max length: 6
Median length: 5
Mean length: 5.372549
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 모집단위별
2nd row: 모집단위별
3rd row: 모집단위별
4th row: 모집단위별
5th row: 모집단위별

Common Values

Value | Count | Frequency (%)
모집단위별 | 32 | 62.7%
지역구분모집 | 19 | 37.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Plot of the value counts listed under Common Values.

시험명
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 5.9%
Missing: 0
Missing (%): 0.0%
Memory size: 540.0 B

Length

Max length: 6
Median length: 2
Mean length: 2.0784314
Min length: 2

Unique

Unique: 1
Unique (%): 2.0%

Sample

1st row: 행정
2nd row: 행정
3rd row: 행정
4th row: 행정
5th row: 행정

Common Values

Value | Count | Frequency (%)
행정 | 25 | 49.0%
기술 | 25 | 49.0%
외교관후보자 | 1 | 2.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Plot of the value counts listed under Common Values.

모집단위(직렬)
Categorical

HIGH CORRELATION 

Distinct: 17
Distinct (%): 33.3%
Missing: 0
Missing (%): 0.0%
Memory size: 540.0 B

Length

Max length: 6
Median length: 3
Mean length: 3.2352941
Min length: 3

Unique

Unique: 12
Unique (%): 23.5%

Sample

1st row: 행정직
2nd row: 행정직
3rd row: 행정직
4th row: 행정직
5th row: 행정직

Common Values

Value | Count | Frequency (%)
행정직 | 20 | 39.2%
시설직 | 10 | 19.6%
공업직 | 3 | 5.9%
전산직 | 3 | 5.9%
임업직 | 3 | 5.9%
기상직 | 1 | 2.0%
방송통신직 | 1 | 2.0%
교정직 | 1 | 2.0%
방재안전직 | 1 | 2.0%
보호직 | 1 | 2.0%
Other values (7) | 7 | 13.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
행정직 | 20 | 39.2%
시설직 | 10 | 19.6%
공업직 | 3 | 5.9%
전산직 | 3 | 5.9%
임업직 | 3 | 5.9%
환경직 | 1 | 2.0%
출입국관리직 | 1 | 2.0%
검찰직 | 1 | 2.0%
농업직 | 1 | 2.0%
해양수산직 | 1 | 2.0%
Other values (7) | 7 | 13.7%

모집단위(직류)
Text

MISSING 

Distinct: 31
Distinct (%): 62.0%
Missing: 1
Missing (%): 2.0%
Memory size: 540.0 B

Length

Max length: 7
Median length: 5
Mean length: 5.16
Min length: 2

Characters and Unicode

Total characters: 258
Distinct characters: 60
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 27
Unique (%): 54.0%

Sample

1st row: 일반행정:전국
2nd row: 일반행정:지역
3rd row: 인사조직
4th row: 법무행정
5th row: 재경

Value | Count | Frequency (%)
일반행정:지역 | 14 | 28.0%
건축:지역 | 4 | 8.0%
일반토목:지역 | 3 | 6.0%
산림자원:지역 | 2 | 4.0%
건축:전국 | 1 | 2.0%
일반수산 | 1 | 2.0%
일반환경 | 1 | 2.0%
기상 | 1 | 2.0%
일반토목:전국 | 1 | 2.0%
방재안전 | 1 | 2.0%
Other values (21) | 21 | 42.0%

Most occurring characters

Value | Count | Frequency (%)
: | 27 | 10.5%
[?] | 24 | 9.3%
[?] | 23 | 8.9%
[?] | 23 | 8.9%
[?] | 23 | 8.9%
[?] | 19 | 7.4%
[?] | 17 | 6.6%
[?] | 7 | 2.7%
[?] | 6 | 2.3%
[?] | 5 | 1.9%
Other values (50) | 84 | 32.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 230 | 89.1%
Other Punctuation | 27 | 10.5%
Close Punctuation | 1 | 0.4%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[?] | 24 | 10.4%
[?] | 23 | 10.0%
[?] | 23 | 10.0%
[?] | 23 | 10.0%
[?] | 19 | 8.3%
[?] | 17 | 7.4%
[?] | 7 | 3.0%
[?] | 6 | 2.6%
[?] | 5 | 2.2%
[?] | 5 | 2.2%
Other values (48) | 78 | 33.9%

Other Punctuation
Value | Count | Frequency (%)
: | 27 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 230 | 89.1%
Common | 28 | 10.9%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[?] | 24 | 10.4%
[?] | 23 | 10.0%
[?] | 23 | 10.0%
[?] | 23 | 10.0%
[?] | 19 | 8.3%
[?] | 17 | 7.4%
[?] | 7 | 3.0%
[?] | 6 | 2.6%
[?] | 5 | 2.2%
[?] | 5 | 2.2%
Other values (48) | 78 | 33.9%

Common
Value | Count | Frequency (%)
: | 27 | 96.4%
) | 1 | 3.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 230 | 89.1%
ASCII | 28 | 10.9%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
: | 27 | 96.4%
) | 1 | 3.6%

Hangul
Value | Count | Frequency (%)
[?] | 24 | 10.4%
[?] | 23 | 10.0%
[?] | 23 | 10.0%
[?] | 23 | 10.0%
[?] | 19 | 8.3%
[?] | 17 | 7.4%
[?] | 7 | 3.0%
[?] | 6 | 2.6%
[?] | 5 | 2.2%
[?] | 5 | 2.2%
Other values (48) | 78 | 33.9%

모집단위(지역)
Text

MISSING 

Distinct: 14
Distinct (%): 73.7%
Missing: 32
Missing (%): 62.7%
Memory size: 540.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 38
Distinct characters: 18
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10
Unique (%): 52.6%

Sample

1st row: 서울
2nd row: 부산
3rd row: 대구
4th row: 울산
5th row: 경기

Value | Count | Frequency (%)
서울 | 3 | 15.8%
부산 | 2 | 10.5%
대구 | 2 | 10.5%
경기 | 2 | 10.5%
울산 | 1 | 5.3%
강원 | 1 | 5.3%
충북 | 1 | 5.3%
충남 | 1 | 5.3%
전북 | 1 | 5.3%
전남 | 1 | 5.3%
Other values (4) | 4 | 21.1%

Most occurring characters

Value | Count | Frequency (%)
[?] | 4 | 10.5%
[?] | 4 | 10.5%
[?] | 3 | 7.9%
[?] | 3 | 7.9%
[?] | 3 | 7.9%
[?] | 3 | 7.9%
[?] | 2 | 5.3%
[?] | 2 | 5.3%
[?] | 2 | 5.3%
[?] | 2 | 5.3%
Other values (8) | 10 | 26.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 38 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[?] | 4 | 10.5%
[?] | 4 | 10.5%
[?] | 3 | 7.9%
[?] | 3 | 7.9%
[?] | 3 | 7.9%
[?] | 3 | 7.9%
[?] | 2 | 5.3%
[?] | 2 | 5.3%
[?] | 2 | 5.3%
[?] | 2 | 5.3%
Other values (8) | 10 | 26.3%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 38 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[?] | 4 | 10.5%
[?] | 4 | 10.5%
[?] | 3 | 7.9%
[?] | 3 | 7.9%
[?] | 3 | 7.9%
[?] | 3 | 7.9%
[?] | 2 | 5.3%
[?] | 2 | 5.3%
[?] | 2 | 5.3%
[?] | 2 | 5.3%
Other values (8) | 10 | 26.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 38 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[?] | 4 | 10.5%
[?] | 4 | 10.5%
[?] | 3 | 7.9%
[?] | 3 | 7.9%
[?] | 3 | 7.9%
[?] | 3 | 7.9%
[?] | 2 | 5.3%
[?] | 2 | 5.3%
[?] | 2 | 5.3%
[?] | 2 | 5.3%
Other values (8) | 10 | 26.3%

선발예정인원
Real number (ℝ)

HIGH CORRELATION 

Distinct: 14
Distinct (%): 27.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.3921569
Minimum: 1
Maximum: 98
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 591.0 B

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 2
Q3: 6
95th percentile: 32.5
Maximum: 98
Range: 97
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 16.422032
Coefficient of variation (CV): 2.2215481
Kurtosis: 20.269983
Mean: 7.3921569
Median Absolute Deviation (MAD): 1
Skewness: 4.3018209
Sum: 377
Variance: 269.68314
Monotonicity: Not monotonic
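
Several of the figures above are tied together by simple identities: the mean is the sum divided by the number of observations, the variance is the square of the standard deviation, the coefficient of variation is the standard deviation divided by the mean, and the range and IQR are differences of quantiles. A minimal consistency check using only the values reported for 선발예정인원 might look like this (the tolerance is an arbitrary choice for illustration):

```python
import math

# Values reported for 선발예정인원 in this report.
n = 51
total = 377
mean = 7.3921569
std = 16.422032
variance = 269.68314
cv = 2.2215481
minimum, q1, q3, maximum = 1, 1, 6, 98

# Relative tolerance is an assumption; the report prints about 8 significant digits.
tol = 1e-6

assert math.isclose(total / n, mean, rel_tol=tol)      # Mean = Sum / n
assert math.isclose(std ** 2, variance, rel_tol=tol)   # Variance = std^2
assert math.isclose(std / mean, cv, rel_tol=tol)       # CV = std / mean
assert maximum - minimum == 97                         # Range
assert q3 - q1 == 5                                    # Interquartile range (IQR)
print("reported statistics are mutually consistent")
```

The same identities can be applied to the 출원인원 and 경쟁률 blocks by substituting their reported values.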
Histogram with fixed size bins (bins=14)

Common values

Value | Count | Frequency (%)
1 | 19 | 37.3%
2 | 8 | 15.7%
3 | 7 | 13.7%
7 | 3 | 5.9%
6 | 3 | 5.9%
9 | 3 | 5.9%
98 | 1 | 2.0%
22 | 1 | 2.0%
58 | 1 | 2.0%
11 | 1 | 2.0%
Other values (4) | 4 | 7.8%

Minimum 10 values

Value | Count | Frequency (%)
1 | 19 | 37.3%
2 | 8 | 15.7%
3 | 7 | 13.7%
4 | 1 | 2.0%
5 | 1 | 2.0%
6 | 3 | 5.9%
7 | 3 | 5.9%
9 | 3 | 5.9%
11 | 1 | 2.0%
14 | 1 | 2.0%

Maximum 10 values

Value | Count | Frequency (%)
98 | 1 | 2.0%
58 | 1 | 2.0%
43 | 1 | 2.0%
22 | 1 | 2.0%
14 | 1 | 2.0%
11 | 1 | 2.0%
9 | 3 | 5.9%
7 | 3 | 5.9%
6 | 3 | 5.9%
5 | 1 | 2.0%

출원인원
Real number (ℝ)

HIGH CORRELATION 

Distinct: 45
Distinct (%): 88.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 253.01961
Minimum: 6
Maximum: 4325
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 591.0 B

Quantile statistics

Minimum: 6
5th percentile: 12.5
Q1: 29.5
Median: 64
Q3: 204
95th percentile: 943
Maximum: 4325
Range: 4319
Interquartile range (IQR): 174.5

Descriptive statistics

Standard deviation: 646.84748
Coefficient of variation (CV): 2.5565113
Kurtosis: 32.71808
Mean: 253.01961
Median Absolute Deviation (MAD): 51
Skewness: 5.4159677
Sum: 12904
Variance: 418411.66
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=45)

Common values

Value | Count | Frequency (%)
204 | 3 | 5.9%
12 | 2 | 3.9%
55 | 2 | 3.9%
13 | 2 | 3.9%
26 | 2 | 3.9%
4325 | 1 | 2.0%
25 | 1 | 2.0%
190 | 1 | 2.0%
1526 | 1 | 2.0%
210 | 1 | 2.0%
Other values (35) | 35 | 68.6%

Minimum 10 values

Value | Count | Frequency (%)
6 | 1 | 2.0%
12 | 2 | 3.9%
13 | 2 | 3.9%
17 | 1 | 2.0%
18 | 1 | 2.0%
24 | 1 | 2.0%
25 | 1 | 2.0%
26 | 2 | 3.9%
27 | 1 | 2.0%
29 | 1 | 2.0%

Maximum 10 values

Value | Count | Frequency (%)
4325 | 1 | 2.0%
1526 | 1 | 2.0%
1279 | 1 | 2.0%
607 | 1 | 2.0%
513 | 1 | 2.0%
402 | 1 | 2.0%
285 | 1 | 2.0%
272 | 1 | 2.0%
261 | 1 | 2.0%
241 | 1 | 2.0%

경쟁률
Real number (ℝ)

HIGH CORRELATION 

Distinct: 39
Distinct (%): 76.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 34.588235
Minimum: 6
Maximum: 131
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 591.0 B

Quantile statistics

Minimum: 6
5th percentile: 12.5
Q1: 19
Median: 28
Q3: 40
95th percentile: 84
Maximum: 131
Range: 125
Interquartile range (IQR): 21

Descriptive statistics

Standard deviation: 25.329174
Coefficient of variation (CV): 0.73230605
Kurtosis: 5.6973239
Mean: 34.588235
Median Absolute Deviation (MAD): 10
Skewness: 2.2356961
Sum: 1764
Variance: 641.56706
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=39)

Common values

Value | Count | Frequency (%)
13 | 3 | 5.9%
30 | 3 | 5.9%
18 | 2 | 3.9%
20 | 2 | 3.9%
17 | 2 | 3.9%
12 | 2 | 3.9%
40 | 2 | 3.9%
21 | 2 | 3.9%
26 | 2 | 3.9%
25 | 2 | 3.9%
Other values (29) | 29 | 56.9%

Minimum 10 values

Value | Count | Frequency (%)
6 | 1 | 2.0%
12 | 2 | 3.9%
13 | 3 | 5.9%
14 | 1 | 2.0%
15 | 1 | 2.0%
16 | 1 | 2.0%
17 | 2 | 3.9%
18 | 2 | 3.9%
20 | 2 | 3.9%
21 | 2 | 3.9%

Maximum 10 values

Value | Count | Frequency (%)
131 | 1 | 2.0%
121 | 1 | 2.0%
95 | 1 | 2.0%
73 | 1 | 2.0%
64 | 1 | 2.0%
61 | 1 | 2.0%
56 | 1 | 2.0%
55 | 1 | 2.0%
46 | 1 | 2.0%
45 | 1 | 2.0%

전년도경쟁률
Text

Distinct: 33
Distinct (%): 64.7%
Missing: 0
Missing (%): 0.0%
Memory size: 540.0 B

Length

Max length: 3
Median length: 2
Mean length: 2.1764706
Min length: 2

Characters and Unicode

Total characters: 111
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 23
Unique (%): 45.1%

Sample

1st row: 44
2nd row: 41
3rd row: 34
4th row: 97
5th row: 24

Value | Count | Frequency (%)
미실시 | 7 | 13.7%
23 | 3 | 5.9%
43 | 3 | 5.9%
20 | 3 | 5.9%
28 | 2 | 3.9%
18 | 2 | 3.9%
44 | 2 | 3.9%
24 | 2 | 3.9%
34 | 2 | 3.9%
36 | 2 | 3.9%
Other values (23) | 23 | 45.1%
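
This column is typed as Text rather than Numeric because 7 of its 51 entries hold the label 미실시 ("not conducted") instead of a number. A minimal sketch, assuming one wants a numeric series for comparison with 경쟁률, is to map that label to a missing value. The helper name `parse_ratio` is hypothetical, and the input values are the prior-year ratios of rows 41 to 50 in the Sample section of this report.

```python
from typing import Optional

NOT_CONDUCTED = "미실시"  # translates to "not conducted"

def parse_ratio(raw: str) -> Optional[int]:
    """Return the ratio as an int, or None for the 미실시 label."""
    text = raw.strip()
    if text == NOT_CONDUCTED:
        return None
    return int(text)

# Prior-year ratios from the last rows of the Sample section (rows 41 to 50).
raw_values = ["27", "43", "43", "미실시", "미실시", "18", "미실시", "28", "미실시", "미실시"]
parsed = [parse_ratio(v) for v in raw_values]
numeric = [v for v in parsed if v is not None]

print(parsed)
print(f"{len(numeric)} numeric, {parsed.count(None)} not conducted")
```

Mapping the label to `None` keeps the two cases distinguishable; substituting 0 would instead distort any mean or correlation computed afterwards.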

Most occurring characters

Value | Count | Frequency (%)
3 | 19 | 17.1%
2 | 17 | 15.3%
4 | 15 | 13.5%
1 | 8 | 7.2%
미 | 7 | 6.3%
실 | 7 | 6.3%
시 | 7 | 6.3%
5 | 7 | 6.3%
0 | 6 | 5.4%
8 | 5 | 4.5%
Other values (3) | 13 | 11.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 90 | 81.1%
Other Letter | 21 | 18.9%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
3 | 19 | 21.1%
2 | 17 | 18.9%
4 | 15 | 16.7%
1 | 8 | 8.9%
5 | 7 | 7.8%
0 | 6 | 6.7%
8 | 5 | 5.6%
7 | 5 | 5.6%
6 | 4 | 4.4%
9 | 4 | 4.4%

Other Letter
Value | Count | Frequency (%)
미 | 7 | 33.3%
실 | 7 | 33.3%
시 | 7 | 33.3%

Most occurring scripts

Value | Count | Frequency (%)
Common | 90 | 81.1%
Hangul | 21 | 18.9%

Most frequent character per script

Common
Value | Count | Frequency (%)
3 | 19 | 21.1%
2 | 17 | 18.9%
4 | 15 | 16.7%
1 | 8 | 8.9%
5 | 7 | 7.8%
0 | 6 | 6.7%
8 | 5 | 5.6%
7 | 5 | 5.6%
6 | 4 | 4.4%
9 | 4 | 4.4%

Hangul
Value | Count | Frequency (%)
미 | 7 | 33.3%
실 | 7 | 33.3%
시 | 7 | 33.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90 | 81.1%
Hangul | 21 | 18.9%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
3 | 19 | 21.1%
2 | 17 | 18.9%
4 | 15 | 16.7%
1 | 8 | 8.9%
5 | 7 | 7.8%
0 | 6 | 6.7%
8 | 5 | 5.6%
7 | 5 | 5.6%
6 | 4 | 4.4%
9 | 4 | 4.4%

Hangul
Value | Count | Frequency (%)
미 | 7 | 33.3%
실 | 7 | 33.3%
시 | 7 | 33.3%

Interactions

Nine interaction plots (figures not preserved).

Correlations

Variable | 모집구분 | 시험명 | 모집단위(직렬) | 모집단위(직류) | 모집단위(지역) | 선발예정인원 | 출원인원 | 경쟁률 | 전년도경쟁률
모집구분 | 1.000 | 0.145 | 0.266 | 0.722 | NaN | 0.000 | 0.000 | 0.352 | 0.162
시험명 | 0.145 | 1.000 | 1.000 | 1.000 | 0.000 | 0.936 | 0.720 | 0.200 | 0.000
모집단위(직렬) | 0.266 | 1.000 | 1.000 | 1.000 | 0.372 | 0.000 | 0.000 | 0.887 | 0.932
모집단위(직류) | 0.722 | 1.000 | 1.000 | 1.000 | 0.000 | 0.872 | 0.930 | 0.978 | 0.935
모집단위(지역) | NaN | 0.000 | 0.372 | 0.000 | 1.000 | NaN | NaN | 0.000 | 0.272
선발예정인원 | 0.000 | 0.936 | 0.000 | 0.872 | NaN | 1.000 | 0.945 | 0.000 | 0.000
출원인원 | 0.000 | 0.720 | 0.000 | 0.930 | NaN | 0.945 | 1.000 | 0.424 | 0.000
경쟁률 | 0.352 | 0.200 | 0.887 | 0.978 | 0.000 | 0.000 | 0.424 | 1.000 | 0.924
전년도경쟁률 | 0.162 | 0.000 | 0.932 | 0.935 | 0.272 | 0.000 | 0.000 | 0.924 | 1.000

Variable | 모집구분 | 시험명 | 모집단위(직렬)
모집구분 | 1.000 | 0.235 | 0.183
시험명 | 0.235 | 1.000 | 0.842
모집단위(직렬) | 0.183 | 0.842 | 1.000

Variable | 선발예정인원 | 출원인원 | 경쟁률 | 모집구분 | 시험명 | 모집단위(직렬)
선발예정인원 | 1.000 | 0.862 | 0.119 | 0.000 | 0.671 | 0.000
출원인원 | 0.862 | 1.000 | 0.576 | 0.000 | 0.695 | 0.000
경쟁률 | 0.119 | 0.576 | 1.000 | 0.211 | 0.136 | 0.559
모집구분 | 0.000 | 0.000 | 0.211 | 1.000 | 0.235 | 0.183
시험명 | 0.671 | 0.695 | 0.136 | 0.235 | 1.000 | 0.842
모집단위(직렬) | 0.000 | 0.000 | 0.559 | 0.183 | 0.842 | 1.000
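
The High correlation alerts can be cross-checked against the last matrix above. The sketch below lists, for each variable, the other variables whose coefficient is at least 0.5. That threshold is an assumption, chosen because it reproduces the counts in the Alerts section; it is not a value stated in this report.

```python
# Coefficients copied from the six-variable correlation matrix in this section.
names = ["선발예정인원", "출원인원", "경쟁률", "모집구분", "시험명", "모집단위(직렬)"]
matrix = [
    [1.000, 0.862, 0.119, 0.000, 0.671, 0.000],
    [0.862, 1.000, 0.576, 0.000, 0.695, 0.000],
    [0.119, 0.576, 1.000, 0.211, 0.136, 0.559],
    [0.000, 0.000, 0.211, 1.000, 0.235, 0.183],
    [0.671, 0.695, 0.136, 0.235, 1.000, 0.842],
    [0.000, 0.000, 0.559, 0.183, 0.842, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, not stated in the report

def highly_correlated(names, matrix, threshold):
    """Map each variable to the other variables at or above the threshold."""
    result = {}
    for i, name in enumerate(names):
        partners = [
            names[j]
            for j, value in enumerate(matrix[i])
            if j != i and abs(value) >= threshold
        ]
        if partners:
            result[name] = partners
    return result

flagged = highly_correlated(names, matrix, THRESHOLD)
for name, partners in flagged.items():
    print(f"{name}: {', '.join(partners)}")
```

With this cut-off, 모집구분 is the only variable in the matrix with no partner, which matches its absence from the alerts.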

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

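First rows

Row | 모집구분 | 시험명 | 모집단위(직렬) | 모집단위(직류) | 모집단위(지역) | 선발예정인원 | 출원인원 | 경쟁률 | 전년도경쟁률
0 | 모집단위별 | 행정 | 행정직 | 일반행정:전국 | <NA> | 98 | 4325 | 44 | 44
1 | 모집단위별 | 행정 | 행정직 | 일반행정:지역 | <NA> | 22 | 607 | 28 | 41
2 | 모집단위별 | 행정 | 행정직 | 인사조직 | <NA> | 2 | 92 | 46 | 34
3 | 모집단위별 | 행정 | 행정직 | 법무행정 | <NA> | 7 | 513 | 73 | 97
4 | 모집단위별 | 행정 | 행정직 | 재경 | <NA> | 58 | 1279 | 22 | 24
5 | 모집단위별 | 행정 | 행정직 | 국제통상 | <NA> | 11 | 402 | 37 | 44
6 | 모집단위별 | 행정 | 행정직 | 교육행정 | <NA> | 6 | 204 | 34 | 39
7 | 모집단위별 | 행정 | 사회복지직 | 사회복지 | <NA> | 1 | 55 | 55 | 36
8 | 모집단위별 | 행정 | 교정직 | 교정 | <NA> | 3 | 285 | 95 | 95
9 | 모집단위별 | 행정 | 보호직 | 보호 | <NA> | 2 | 241 | 121 | 43

Last rows

Row | 모집구분 | 시험명 | 모집단위(직렬) | 모집단위(직류) | 모집단위(지역) | 선발예정인원 | 출원인원 | 경쟁률 | 전년도경쟁률
41 | 지역구분모집 | 행정 | 행정직 | 일반행정:지역 | 전남 | 2 | 46 | 23 | 27
42 | 지역구분모집 | 행정 | 행정직 | 일반행정:지역 | 경북 | 1 | 18 | 18 | 43
43 | 지역구분모집 | 행정 | 행정직 | 일반행정:지역 | 경남 | 1 | 26 | 26 | 43
44 | 지역구분모집 | 행정 | 행정직 | 일반행정:지역 | 제주 | 1 | 17 | 17 | 미실시
45 | 지역구분모집 | 기술 | 임업직 | 산림자원:지역 | 인천 | 1 | 13 | 13 | 미실시
46 | 지역구분모집 | 기술 | 시설직 | 일반토목:지역 | 서울 | 2 | 27 | 14 | 18
47 | 지역구분모집 | 기술 | 시설직 | 일반토목:지역 | 경기 | 1 | 12 | 12 | 미실시
48 | 지역구분모집 | 기술 | 시설직 | 건축:지역 | 서울 | 1 | 29 | 29 | 28
49 | 지역구분모집 | 기술 | 시설직 | 건축:지역 | 부산 | 1 | 12 | 12 | 미실시
50 | 지역구분모집 | 기술 | 시설직 | 건축:지역 | 대구 | 1 | 6 | 6 | 미실시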
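
In the sampled rows, 경쟁률 equals 출원인원 divided by 선발예정인원, rounded to the nearest integer with halves rounded up (for example, 241 / 2 = 120.5 is reported as 121). This relationship is inferred from the rows shown here and is not documented by the data provider. The sketch below checks it on a few rows. Python's built-in `round` uses round-half-to-even and would give 120 for that case, so the hypothetical helper `competition_ratio` rounds explicitly.

```python
import math

def competition_ratio(applicants: int, openings: int) -> int:
    """Applicants per opening, rounded half up (an inferred convention)."""
    return math.floor(applicants / openings + 0.5)

# (선발예정인원, 출원인원, 경쟁률) taken from sample rows 0, 1, 3, 9 and 46.
rows = [
    (98, 4325, 44),
    (22, 607, 28),
    (7, 513, 73),
    (2, 241, 121),
    (2, 27, 14),
]

for openings, applicants, reported in rows:
    computed = competition_ratio(applicants, openings)
    print(f"{applicants} / {openings} -> {computed} (reported {reported})")
    assert computed == reported
```

Because the ratio is derived from the other two numeric columns, the correlation between 출원인원 and 경쟁률 partly reflects this construction and not an independent relationship.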