Overview

Dataset statistics

Number of variables: 10
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 898.4 KiB
Average record size in memory: 92.0 B
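
The average record size follows from the two memory figures above. A minimal sketch of the arithmetic, assuming 1 KiB = 1024 bytes and that the average is the total size divided by the number of observations. The helper name `average_record_size` is illustrative and not part of ydata-profiling.

```python
def average_record_size(total_kib: float, n_rows: int) -> float:
    """Return bytes per row, given a total size in KiB (1 KiB = 1024 B assumed)."""
    return total_kib * 1024 / n_rows


# Figures taken from the dataset statistics in this report.
size = average_record_size(898.4, 10000)
print(f"{size:.1f} B")
```

Under these assumptions the result rounds to the reported 92.0 B.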

Variable types

Numeric: 3
Categorical: 7

Dataset

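Description: Provides information for analyzing the status of candidates for the Level 1 Emergency Medical Technician (1급 응급구조사) national examination, in a form that cannot identify individuals. Fields: 연도 (year), 직종 (occupation), 회차 (exam round), 성별 (sex), 연령대 (age group), 응시지역 (test region), 졸업여부 (graduation status), 합격여부 (pass/fail result), 학교소재지 (school location).
URL: https://www.data.go.kr/data/15060463/fileData.do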

Alerts

직종 has constant value "1급응급구조사" [Constant]
연도 is highly overall correlated with 회차 and 1 other field [High correlation]
회차 is highly overall correlated with 연도 and 1 other field [High correlation]
일련번호 is highly overall correlated with 연도 and 1 other field [High correlation]
응시지역 is highly overall correlated with 학교소재지 [High correlation]
학교소재지 is highly overall correlated with 응시지역 [High correlation]
연령대 is highly imbalanced (77.7%) [Imbalance]
졸업여부 is highly imbalanced (63.0%) [Imbalance]
합격여부 is highly imbalanced (64.1%) [Imbalance]
일련번호 has unique values [Unique]
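
The imbalance percentages in the alerts are consistent with one minus the normalized Shannon entropy of the category counts. That reading is an assumption, checked only against the counts listed in this report, and is not a statement of how ydata-profiling computes the score internally. The function name `imbalance_score` is illustrative.

```python
import math


def imbalance_score(counts):
    """1 - H / log2(k): 0 for a uniform split, 1 when a single class holds all rows."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # convention chosen for this sketch when only one class exists
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Category counts as listed in the variable sections of this report,
# paired with the percentage shown in the corresponding alert.
reported = {
    "연령대": ([9156, 493, 313, 37, 1], 77.7),
    "졸업여부": ([8634, 1352, 14], 63.0),
    "합격여부": ([8414, 1412, 155, 19], 64.1),
}
for name, (counts, pct) in reported.items():
    print(name, f"{100 * imbalance_score(counts):.1f}%", "reported:", f"{pct}%")
```

A score near 100% means almost every row falls in one category, which is why 연령대 (91.6% of rows in the "20" group) scores highest of the three.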

Reproduction

Analysis started: 2023-12-11 23:18:08.342133
Analysis finished: 2023-12-11 23:18:10.060725
Duration: 1.72 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연도 (year)
Real number (ℝ)

HIGH CORRELATION

Distinct: 23
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2013.5228
Minimum: 2000
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2000
5th percentile: 2001
Q1: 2009
Median: 2015
Q3: 2019
95th percentile: 2022
Maximum: 2022
Range: 22
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 6.230326
Coefficient of variation (CV): 0.0030942416
Kurtosis: -0.6860439
Mean: 2013.5228
Median Absolute Deviation (MAD): 4
Skewness: -0.55947244
Sum: 20135228
Variance: 38.816962
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=23)]

Common values

Value | Count | Frequency (%)
2022 | 679 | 6.8%
2020 | 679 | 6.8%
2017 | 647 | 6.5%
2018 | 643 | 6.4%
2021 | 633 | 6.3%
2016 | 633 | 6.3%
2019 | 607 | 6.1%
2015 | 583 | 5.8%
2014 | 539 | 5.4%
2013 | 522 | 5.2%
Other values (13) | 3835 | 38.4%

Minimum 10 values

Value | Count | Frequency (%)
2000 | 287 | 2.9%
2001 | 272 | 2.7%
2002 | 282 | 2.8%
2003 | 95 | 0.9%
2004 | 227 | 2.3%
2005 | 215 | 2.1%
2006 | 234 | 2.3%
2007 | 289 | 2.9%
2008 | 350 | 3.5%
2009 | 362 | 3.6%

Maximum 10 values

Value | Count | Frequency (%)
2022 | 679 | 6.8%
2021 | 633 | 6.3%
2020 | 679 | 6.8%
2019 | 607 | 6.1%
2018 | 643 | 6.4%
2017 | 647 | 6.5%
2016 | 633 | 6.3%
2015 | 583 | 5.8%
2014 | 539 | 5.4%
2013 | 522 | 5.2%

직종 (occupation)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Categories: 1급응급구조사 (10000)

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1급응급구조사
2nd row: 1급응급구조사
3rd row: 1급응급구조사
4th row: 1급응급구조사
5th row: 1급응급구조사

Common Values

Value | Count | Frequency (%)
1급응급구조사 | 10000 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

회차 (exam round)
Real number (ℝ)

HIGH CORRELATION

Distinct: 23
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 19.5228
Minimum: 6
Maximum: 28
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 6
5th percentile: 7
Q1: 15
Median: 21
Q3: 25
95th percentile: 28
Maximum: 28
Range: 22
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 6.230326
Coefficient of variation (CV): 0.31913076
Kurtosis: -0.6860439
Mean: 19.5228
Median Absolute Deviation (MAD): 4
Skewness: -0.55947244
Sum: 195228
Variance: 38.816962
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=23)]

Common values

Value | Count | Frequency (%)
28 | 679 | 6.8%
26 | 679 | 6.8%
23 | 647 | 6.5%
24 | 643 | 6.4%
27 | 633 | 6.3%
22 | 633 | 6.3%
25 | 607 | 6.1%
21 | 583 | 5.8%
20 | 539 | 5.4%
19 | 522 | 5.2%
Other values (13) | 3835 | 38.4%

Minimum 10 values

Value | Count | Frequency (%)
6 | 287 | 2.9%
7 | 272 | 2.7%
8 | 282 | 2.8%
9 | 95 | 0.9%
10 | 227 | 2.3%
11 | 215 | 2.1%
12 | 234 | 2.3%
13 | 289 | 2.9%
14 | 350 | 3.5%
15 | 362 | 3.6%

Maximum 10 values

Value | Count | Frequency (%)
28 | 679 | 6.8%
27 | 633 | 6.3%
26 | 679 | 6.8%
25 | 607 | 6.1%
24 | 643 | 6.4%
23 | 647 | 6.5%
22 | 633 | 6.3%
21 | 583 | 5.8%
20 | 539 | 5.4%
19 | 522 | 5.2%
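
연도 and 회차 report the same standard deviation but very different coefficients of variation. In this sample the two columns appear to differ by a constant offset of 1994 (year 2000 pairs with round 6, year 2022 with round 28), which shifts the mean but leaves the spread unchanged. A small sketch of that arithmetic, using only the summary figures printed in the two sections. The helper name `coefficient_of_variation` is illustrative.

```python
def coefficient_of_variation(std: float, mean: float) -> float:
    """CV = standard deviation divided by the mean."""
    return std / mean


std = 6.230326         # reported for both 연도 and 회차
mean_year = 2013.5228  # reported mean of 연도
mean_round = 19.5228   # reported mean of 회차

# A constant shift changes the mean (and so the CV) but not the spread.
offset = mean_year - mean_round
cv_year = coefficient_of_variation(std, mean_year)
cv_round = coefficient_of_variation(std, mean_round)
print(round(offset), f"{cv_year:.7f}", f"{cv_round:.5f}")
```

The two results agree with the reported CVs (0.0030942416 and 0.31913076) to the precision of the rounded inputs. The constant offset is also why the two columns share identical kurtosis and skewness and show a correlation of 1.000.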

일련번호 (serial number)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13584.115
Minimum: 5
Maximum: 27086
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 5
5th percentile: 1417.9
Q1: 6803.75
Median: 13539
Q3: 20331.25
95th percentile: 25782.05
Maximum: 27086
Range: 27081
Interquartile range (IQR): 13527.5

Descriptive statistics

Standard deviation: 7802.2301
Coefficient of variation (CV): 0.57436424
Kurtosis: -1.1954866
Mean: 13584.115
Median Absolute Deviation (MAD): 6765
Skewness: 0.0073440472
Sum: 1.3584115 × 10^8
Variance: 60874794
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]

Common values

Value | Count | Frequency (%)
26084 | 1 | < 0.1%
24485 | 1 | < 0.1%
21937 | 1 | < 0.1%
18198 | 1 | < 0.1%
1211 | 1 | < 0.1%
21054 | 1 | < 0.1%
6848 | 1 | < 0.1%
5405 | 1 | < 0.1%
10996 | 1 | < 0.1%
26245 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Minimum 10 values

Value | Count | Frequency (%)
5 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
12 | 1 | < 0.1%
15 | 1 | < 0.1%
23 | 1 | < 0.1%
25 | 1 | < 0.1%
30 | 1 | < 0.1%
33 | 1 | < 0.1%
34 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
27086 | 1 | < 0.1%
27083 | 1 | < 0.1%
27078 | 1 | < 0.1%
27076 | 1 | < 0.1%
27072 | 1 | < 0.1%
27070 | 1 | < 0.1%
27067 | 1 | < 0.1%
27066 | 1 | < 0.1%
27065 | 1 | < 0.1%
27062 | 1 | < 0.1%

성별 (sex)
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Categories: (unlabeled) (5390), (unlabeled) (4610)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

Value | Count | Frequency (%)
(unlabeled) | 5390 | 53.9%
(unlabeled) | 4610 | 46.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

연령대 (age group)
Categorical

IMBALANCE

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Categories: 20 (9156), 30 (493), 40 (313), 50 (37), 60 (1)

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 20
2nd row: 20
3rd row: 20
4th row: 20
5th row: 40

Common Values

Value | Count | Frequency (%)
20 | 9156 | 91.6%
30 | 493 | 4.9%
40 | 313 | 3.1%
50 | 37 | 0.4%
60 | 1 | < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

응시지역 (test region)
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Categories: 서울특별시 (3907), 대전광역시 (2320), 대구광역시 (1885), 광주광역시 (1738), 제주도 (150)

Length

Max length: 5
Median length: 5
Mean length: 4.97
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 대구광역시
2nd row: 대구광역시
3rd row: 광주광역시
4th row: 서울특별시
5th row: 서울특별시

Common Values

Value | Count | Frequency (%)
서울특별시 | 3907 | 39.1%
대전광역시 | 2320 | 23.2%
대구광역시 | 1885 | 18.9%
광주광역시 | 1738 | 17.4%
제주도 | 150 | 1.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

졸업여부 (graduation status)
Categorical

IMBALANCE

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Categories: 졸업예정 (8634), 졸업 (1352), (unlabeled) (14)

Length

Max length: 4
Median length: 4
Mean length: 3.7254
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 졸업예정
2nd row: 졸업예정
3rd row: 졸업예정
4th row: 졸업예정
5th row: 졸업

Common Values

Value | Count | Frequency (%)
졸업예정 (expected to graduate) | 8634 | 86.3%
졸업 (graduated) | 1352 | 13.5%
(unlabeled) | 14 | 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; its shares are computed over the 9986 rows with a labeled value]

Value | Count | Frequency (%)
졸업예정 | 8634 | 86.5%
졸업 | 1352 | 13.5%

합격여부 (pass/fail result)
Categorical

IMBALANCE

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Categories: 합격 (8414), 불합격 (1412), 결시 (155), 응시결격 (19)

Length

Max length: 4
Median length: 2
Mean length: 2.145
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 합격
2nd row: 합격
3rd row: 합격
4th row: 합격
5th row: 불합격

Common Values

Value | Count | Frequency (%)
합격 (pass) | 8414 | 84.1%
불합격 (fail) | 1412 | 14.1%
결시 (absent) | 155 | 1.6%
응시결격 (ineligible) | 19 | 0.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]
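
The mean length of 2.145 reported for 합격여부 can be recovered from the category counts, taking length as the number of characters in each label. A minimal sketch of that weighted average; the function name `mean_label_length` is illustrative, and the NFC normalization is a precaution added here so that each Hangul syllable counts as one character.

```python
import unicodedata


def mean_label_length(counts: dict) -> float:
    """Count-weighted average of the character length of each label."""
    total = sum(counts.values())
    weighted = sum(
        len(unicodedata.normalize("NFC", label)) * n for label, n in counts.items()
    )
    return weighted / total


# Counts from the Common Values table for 합격여부.
pass_fail = {"합격": 8414, "불합격": 1412, "결시": 155, "응시결격": 19}
print(mean_label_length(pass_fail))
```

The same calculation reproduces the 4.97 reported for 응시지역, where only the 150 rows for 제주도 have a three-character label.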

학교소재지 (school location)
Categorical

HIGH CORRELATION

Distinct: 18
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Categories: 경기도 (1214), 광주광역시 (1152), 충청북도 (937), 대전광역시 (928), 충청남도 (927), Other values (13) (4842)

Length

Max length: 5
Median length: 4
Mean length: 4.0383
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 경상북도
2nd row: 경상남도
3rd row: 전라남도
4th row: 전라남도
5th row: 경기도

Common Values

Value | Count | Frequency (%)
경기도 | 1214 | 12.1%
광주광역시 | 1152 | 11.5%
충청북도 | 937 | 9.4%
대전광역시 | 928 | 9.3%
충청남도 | 927 | 9.3%
경상북도 | 899 | 9.0%
경상남도 | 596 | 6.0%
전라남도 | 590 | 5.9%
전라북도 | 506 | 5.1%
제주도 | 463 | 4.6%
Other values (8) | 1788 | 17.9%

Length

[Figure: histogram of lengths of the category]

Interactions

[Figures: nine interaction plots]

Correlations

[Figure: correlation heatmap]

Variable | 연도 | 회차 | 일련번호 | 성별 | 연령대 | 응시지역 | 졸업여부 | 합격여부 | 학교소재지
연도 | 1.000 | 1.000 | 0.978 | 0.109 | 0.161 | 0.502 | 0.158 | 0.168 | 0.273
회차 | 1.000 | 1.000 | 0.983 | 0.107 | 0.166 | 0.535 | 0.157 | 0.168 | 0.294
일련번호 | 0.978 | 0.983 | 1.000 | 0.108 | 0.187 | 0.558 | 0.149 | 0.170 | 0.311
성별 | 0.109 | 0.107 | 0.108 | 1.000 | 0.166 | 0.026 | 0.096 | 0.212 | 0.245
연령대 | 0.161 | 0.166 | 0.187 | 0.166 | 1.000 | 0.167 | 0.504 | 0.304 | 0.612
응시지역 | 0.502 | 0.535 | 0.558 | 0.026 | 0.167 | 1.000 | 0.103 | 0.081 | 0.910
졸업여부 | 0.158 | 0.157 | 0.149 | 0.096 | 0.504 | 0.103 | 1.000 | 0.372 | 0.604
합격여부 | 0.168 | 0.168 | 0.170 | 0.212 | 0.304 | 0.081 | 0.372 | 1.000 | 0.430
학교소재지 | 0.273 | 0.294 | 0.311 | 0.245 | 0.612 | 0.910 | 0.604 | 0.430 | 1.000

[Figure: correlation heatmap]

Variable | 학교소재지 | 졸업여부 | 응시지역 | 성별 | 합격여부 | 연령대
학교소재지 | 1.000 | 0.405 | 0.762 | 0.219 | 0.252 | 0.371
졸업여부 | 0.405 | 1.000 | 0.077 | 0.158 | 0.362 | 0.440
응시지역 | 0.762 | 0.077 | 1.000 | 0.032 | 0.066 | 0.063
성별 | 0.219 | 0.158 | 0.032 | 1.000 | 0.141 | 0.203
합격여부 | 0.252 | 0.362 | 0.066 | 0.141 | 1.000 | 0.253
연령대 | 0.371 | 0.440 | 0.063 | 0.203 | 0.253 | 1.000

[Figure: correlation heatmap]

Variable | 연도 | 회차 | 일련번호 | 성별 | 연령대 | 응시지역 | 졸업여부 | 합격여부 | 학교소재지
연도 | 1.000 | 1.000 | 0.999 | 0.082 | 0.070 | 0.250 | 0.093 | 0.101 | 0.118
회차 | 1.000 | 1.000 | 0.999 | 0.082 | 0.070 | 0.250 | 0.093 | 0.101 | 0.118
일련번호 | 0.999 | 0.999 | 1.000 | 0.083 | 0.079 | 0.264 | 0.089 | 0.102 | 0.126
성별 | 0.082 | 0.082 | 0.083 | 1.000 | 0.203 | 0.032 | 0.158 | 0.141 | 0.219
연령대 | 0.070 | 0.070 | 0.079 | 0.203 | 1.000 | 0.063 | 0.440 | 0.253 | 0.371
응시지역 | 0.250 | 0.250 | 0.264 | 0.032 | 0.063 | 1.000 | 0.077 | 0.066 | 0.762
졸업여부 | 0.093 | 0.093 | 0.089 | 0.158 | 0.440 | 0.077 | 1.000 | 0.362 | 0.405
합격여부 | 0.101 | 0.101 | 0.102 | 0.141 | 0.253 | 0.066 | 0.362 | 1.000 | 0.252
학교소재지 | 0.118 | 0.118 | 0.126 | 0.219 | 0.371 | 0.762 | 0.405 | 0.252 | 1.000

Missing values

[Figure: a simple visualization of nullity by column]
[Figure: nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion]

Sample

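(index) | 연도 | 직종 | 회차 | 일련번호 | 성별 | 연령대 | 응시지역 | 졸업여부 | 합격여부 | 학교소재지
26083 | 2022 | 1급응급구조사 | 28 | 26084 |  | 20 | 대구광역시 | 졸업예정 | 합격 | 경상북도
7506 | 2010 | 1급응급구조사 | 16 | 7507 |  | 20 | 대구광역시 | 졸업예정 | 합격 | 경상남도
21137 | 2019 | 1급응급구조사 | 25 | 21138 |  | 20 | 광주광역시 | 졸업예정 | 합격 | 전라남도
3332 | 2005 | 1급응급구조사 | 11 | 3333 |  | 20 | 서울특별시 | 졸업예정 | 합격 | 전라남도
9652 | 2012 | 1급응급구조사 | 18 | 9653 |  | 40 | 서울특별시 | 졸업 | 불합격 | 경기도
21600 | 2019 | 1급응급구조사 | 25 | 21601 |  | 20 | 대전광역시 | 졸업예정 | 합격 | 충청북도
14000 | 2015 | 1급응급구조사 | 21 | 14001 |  | 20 | 대구광역시 | 졸업예정 | 합격 | 부산광역시
22644 | 2020 | 1급응급구조사 | 26 | 22645 |  | 20 | 대구광역시 | 졸업예정 | 합격 | 경상남도
12626 | 2014 | 1급응급구조사 | 20 | 12627 |  | 20 | 광주광역시 | 졸업예정 | 합격 | 광주광역시
21786 | 2019 | 1급응급구조사 | 25 | 21787 |  | 20 | 제주도 | 졸업예정 | 합격 | 제주도

(index) | 연도 | 직종 | 회차 | 일련번호 | 성별 | 연령대 | 응시지역 | 졸업여부 | 합격여부 | 학교소재지
10690 | 2013 | 1급응급구조사 | 19 | 10691 |  | 20 | 서울특별시 | 졸업예정 | 합격 | 제주도
18116 | 2017 | 1급응급구조사 | 23 | 18117 |  | 20 | 대전광역시 | 졸업 | 합격 | 충청북도
23513 | 2020 | 1급응급구조사 | 26 | 23514 |  | 20 | 대전광역시 | 졸업예정 | 합격 | 충청남도
21429 | 2019 | 1급응급구조사 | 25 | 21430 |  | 20 | 대전광역시 | 졸업예정 | 합격 | 전라북도
14835 | 2015 | 1급응급구조사 | 21 | 14836 |  | 20 | 대전광역시 | 졸업예정 | 합격 | 대전광역시
12615 | 2014 | 1급응급구조사 | 20 | 12616 |  | 20 | 광주광역시 | 졸업예정 | 합격 | 전라남도
913 | 2001 | 1급응급구조사 | 7 | 914 |  | 20 | 서울특별시 | 졸업예정 | 합격 | 경기도
9745 | 2012 | 1급응급구조사 | 18 | 9746 |  | 20 | 대구광역시 | 졸업예정 | 합격 | 경상남도
10284 | 2012 | 1급응급구조사 | 18 | 10285 |  | 20 | 대전광역시 | 졸업예정 | 불합격 | 대전광역시
7696 | 2010 | 1급응급구조사 | 16 | 7697 |  | 20 | 광주광역시 | 졸업예정 | 합격 | 광주광역시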