Overview

Dataset statistics

Number of variables: 10
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 898.4 KiB
Average record size in memory: 92.0 B
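
As a quick arithmetic check, the average record size should equal the total memory figure divided by the number of observations. The sketch below reuses only the numbers reported above; the variable names are illustrative.

```python
# Figures copied from the "Dataset statistics" block.
total_kib = 898.4
n_rows = 10_000
n_cols = 10

# 1 KiB = 1024 bytes
bytes_per_row = total_kib * 1024 / n_rows
bytes_per_cell = bytes_per_row / n_cols

print(f"{bytes_per_row:.1f} B per record")
print(f"{bytes_per_cell:.1f} B per cell on average")
```

The per-record result rounds to the 92.0 B shown in the report.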

Variable types

Numeric: 3
Categorical: 7

Dataset

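Description: Provides information for analyzing the status of candidates for the national dietitian licensing examination (year 연도, occupation 직종, exam round 회차, sex 성별, age group 연령대, test region 응시지역, graduation status 졸업여부, pass/fail result 합격여부, school location 학교소재지) in a form that cannot identify individuals.
URL: https://www.data.go.kr/data/15060461/fileData.do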

Alerts

직종 has constant value "영양사" (Constant)
연도 is highly overall correlated with 회차 and 일련번호 (High correlation)
회차 is highly overall correlated with 연도 and 일련번호 (High correlation)
일련번호 is highly overall correlated with 연도 and 회차 (High correlation)
응시지역 is highly overall correlated with 학교소재지 (High correlation)
학교소재지 is highly overall correlated with 응시지역 (High correlation)
성별 is highly imbalanced (66.9%) (Imbalance)
연령대 is highly imbalanced (72.6%) (Imbalance)
일련번호 has unique values (Unique)
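
The imbalance percentages in the alerts are consistent with one minus the normalized Shannon entropy of the class shares. That formula is an assumption here (the report does not state it), but it reproduces both reported figures from the category counts listed in the 성별 and 연령대 sections. The function name `imbalance_score` is hypothetical.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (in bits) of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no imbalance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Category counts copied from the report.
sex_counts = [9391, 609]
age_counts = [8854, 809, 283, 52, 2]

print(f"성별: {imbalance_score(sex_counts):.1%}")
print(f"연령대: {imbalance_score(age_counts):.1%}")
```

A score of 0 would mean perfectly even classes, and a score near 1 means almost all rows fall in one class.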

Reproduction

Analysis started: 2023-12-11 23:50:35.176668
Analysis finished: 2023-12-11 23:50:37.248022
Duration: 2.07 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
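
The reported duration can be recomputed from the two timestamps. This is a minimal sketch using only the Python standard library and the values shown above.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2023-12-11 23:50:35.176668")
finished = datetime.fromisoformat("2023-12-11 23:50:37.248022")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```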

Variables

연도
Real number (ℝ)

HIGH CORRELATION 

Distinct: 11
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2004.6805
Minimum: 2000
Maximum: 2010
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2000
5-th percentile: 2000
Q1: 2002
Median: 2005
Q3: 2007
95-th percentile: 2009
Maximum: 2010
Range: 10
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.9686194
Coefficient of variation (CV): 0.0014808441
Kurtosis: -1.2115924
Mean: 2004.6805
Median Absolute Deviation (MAD): 3
Skewness: 0.0035728729
Sum: 20046805
Variance: 8.812701
Monotonicity: Not monotonic
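
Several of these figures are tied together by simple identities (mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean, range = max - min, IQR = Q3 - Q1). The sketch below checks the reported 연도 values against those identities; the tolerances are assumptions chosen to allow for the rounding in the report.

```python
import math

# Figures copied from the 연도 statistics.
n_rows = 10_000
total = 20_046_805
mean, std, variance = 2004.6805, 2.9686194, 8.812701
minimum, maximum, q1, q3 = 2000, 2010, 2002, 2007

checks = {
    "mean = sum / n": math.isclose(total / n_rows, mean),
    "variance = std ** 2": math.isclose(std ** 2, variance, rel_tol=1e-6),
    "CV = std / mean": math.isclose(std / mean, 0.0014808441, rel_tol=1e-6),
    "range = max - min": maximum - minimum == 10,
    "IQR = Q3 - Q1": q3 - q1 == 5,
}

for name, ok in checks.items():
    print(f"{name}: {'consistent' if ok else 'MISMATCH'}")
```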
Histogram with fixed size bins (bins=11)

Common values

Value | Count | Frequency (%)
2008 | 1033 | 10.3%
2009 | 1030 | 10.3%
2002 | 1025 | 10.2%
2003 | 999 | 10.0%
2005 | 991 | 9.9%
2007 | 974 | 9.7%
2000 | 958 | 9.6%
2006 | 934 | 9.3%
2001 | 919 | 9.2%
2004 | 907 | 9.1%

Minimum 10 values

Value | Count | Frequency (%)
2000 | 958 | 9.6%
2001 | 919 | 9.2%
2002 | 1025 | 10.2%
2003 | 999 | 10.0%
2004 | 907 | 9.1%
2005 | 991 | 9.9%
2006 | 934 | 9.3%
2007 | 974 | 9.7%
2008 | 1033 | 10.3%
2009 | 1030 | 10.3%

Maximum 10 values

Value | Count | Frequency (%)
2010 | 230 | 2.3%
2009 | 1030 | 10.3%
2008 | 1033 | 10.3%
2007 | 974 | 9.7%
2006 | 934 | 9.3%
2005 | 991 | 9.9%
2004 | 907 | 9.1%
2003 | 999 | 10.0%
2002 | 1025 | 10.2%
2001 | 919 | 9.2%

직종
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
영양사 | 10000

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 영양사
2nd row: 영양사
3rd row: 영양사
4th row: 영양사
5th row: 영양사

Common Values

Value | Count | Frequency (%)
영양사 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
영양사 | 10000 | 100.0%

회차
Real number (ℝ)

HIGH CORRELATION 

Distinct: 11
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27.6805
Minimum: 23
Maximum: 33
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 23
5-th percentile: 23
Q1: 25
Median: 28
Q3: 30
95-th percentile: 32
Maximum: 33
Range: 10
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.9686194
Coefficient of variation (CV): 0.10724587
Kurtosis: -1.2115924
Mean: 27.6805
Median Absolute Deviation (MAD): 3
Skewness: 0.0035728729
Sum: 276805
Variance: 8.812701
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)

Common values

Value | Count | Frequency (%)
31 | 1033 | 10.3%
32 | 1030 | 10.3%
25 | 1025 | 10.2%
26 | 999 | 10.0%
28 | 991 | 9.9%
30 | 974 | 9.7%
23 | 958 | 9.6%
29 | 934 | 9.3%
24 | 919 | 9.2%
27 | 907 | 9.1%

Minimum 10 values

Value | Count | Frequency (%)
23 | 958 | 9.6%
24 | 919 | 9.2%
25 | 1025 | 10.2%
26 | 999 | 10.0%
27 | 907 | 9.1%
28 | 991 | 9.9%
29 | 934 | 9.3%
30 | 974 | 9.7%
31 | 1033 | 10.3%
32 | 1030 | 10.3%

Maximum 10 values

Value | Count | Frequency (%)
33 | 230 | 2.3%
32 | 1030 | 10.3%
31 | 1033 | 10.3%
30 | 974 | 9.7%
29 | 934 | 9.3%
28 | 991 | 9.9%
27 | 907 | 9.1%
26 | 999 | 10.0%
25 | 1025 | 10.2%
24 | 919 | 9.2%
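
The 회차 frequencies mirror the 연도 frequencies with every value shifted by 1977, the difference between the two reported means (2004.6805 - 27.6805). That is consistent with the correlation of 1.000 between the two columns, although matching frequency tables alone do not prove a row-by-row relationship. The sketch below checks the shift against the counts in the report; the name `OFFSET` is illustrative.

```python
# Counts per value, copied from the 연도 and 회차 frequency tables.
year_counts = {
    2000: 958, 2001: 919, 2002: 1025, 2003: 999, 2004: 907, 2005: 991,
    2006: 934, 2007: 974, 2008: 1033, 2009: 1030, 2010: 230,
}
round_counts = {
    23: 958, 24: 919, 25: 1025, 26: 999, 27: 907, 28: 991,
    29: 934, 30: 974, 31: 1033, 32: 1030, 33: 230,
}

# Assumed offset, inferred from the difference of the reported means.
OFFSET = 1977

shifted = {year - OFFSET: count for year, count in year_counts.items()}
print("frequency tables match after shifting:", shifted == round_counts)
print("rows covered:", sum(year_counts.values()))
```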

일련번호
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 47438.092
Minimum: 6
Maximum: 95292
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 6
5-th percentile: 4433.95
Q1: 23550.25
Median: 47265.5
Q3: 71343.75
95-th percentile: 90570.3
Maximum: 95292
Range: 95286
Interquartile range (IQR): 47793.5

Descriptive statistics

Standard deviation: 27557.262
Coefficient of variation (CV): 0.58091001
Kurtosis: -1.200771
Mean: 47438.092
Median Absolute Deviation (MAD): 23940.5
Skewness: 0.015338494
Sum: 4.7438092 × 10^8
Variance: 7.594027 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1111 | 1 | < 0.1%
69235 | 1 | < 0.1%
29044 | 1 | < 0.1%
23688 | 1 | < 0.1%
86156 | 1 | < 0.1%
17886 | 1 | < 0.1%
89729 | 1 | < 0.1%
65879 | 1 | < 0.1%
50607 | 1 | < 0.1%
83465 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Minimum 10 values

Value | Count | Frequency (%)
6 | 1 | < 0.1%
42 | 1 | < 0.1%
51 | 1 | < 0.1%
55 | 1 | < 0.1%
66 | 1 | < 0.1%
82 | 1 | < 0.1%
83 | 1 | < 0.1%
90 | 1 | < 0.1%
94 | 1 | < 0.1%
99 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
95292 | 1 | < 0.1%
95287 | 1 | < 0.1%
95285 | 1 | < 0.1%
95271 | 1 | < 0.1%
95269 | 1 | < 0.1%
95257 | 1 | < 0.1%
95253 | 1 | < 0.1%
95246 | 1 | < 0.1%
95238 | 1 | < 0.1%
95237 | 1 | < 0.1%

성별
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
 | 9391
 | 609

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

Value | Count | Frequency (%)
 | 9391 | 93.9%
 | 609 | 6.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
 | 9391 | 93.9%
 | 609 | 6.1%

연령대
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
20 | 8854
30 | 809
40 | 283
50 | 52
60 | 2

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20
2nd row: 20
3rd row: 20
4th row: 20
5th row: 20

Common Values

Value | Count | Frequency (%)
20 | 8854 | 88.5%
30 | 809 | 8.1%
40 | 283 | 2.8%
50 | 52 | 0.5%
60 | 2 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
20 | 8854 | 88.5%
30 | 809 | 8.1%
40 | 283 | 2.8%
50 | 52 | 0.5%
60 | 2 | < 0.1%

응시지역
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
서울특별시 | 4706
부산광역시 | 1679
대구광역시 | 1282
대전광역시 | 972
광주광역시 | 940
Other values (2) | 421

Length

Max length: 5
Median length: 5
Mean length: 4.8737
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 대구광역시
3rd row: 서울특별시
4th row: 서울특별시
5th row: 부산광역시

Common Values

Value | Count | Frequency (%)
서울특별시 | 4706 | 47.1%
부산광역시 | 1679 | 16.8%
대구광역시 | 1282 | 12.8%
대전광역시 | 972 | 9.7%
광주광역시 | 940 | 9.4%
전주 | 306 | 3.1%
원주 | 115 | 1.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
서울특별시 | 4706 | 47.1%
부산광역시 | 1679 | 16.8%
대구광역시 | 1282 | 12.8%
대전광역시 | 972 | 9.7%
광주광역시 | 940 | 9.4%
전주 | 306 | 3.1%
원주 | 115 | 1.1%

졸업여부
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
졸업예정 | 6712
졸업 | 3163
 | 125

Length

Max length: 4
Median length: 4
Mean length: 3.3299
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 졸업예정
2nd row: 졸업예정
3rd row: 졸업
4th row: 졸업예정
5th row: 졸업

Common Values

Value | Count | Frequency (%)
졸업예정 | 6712 | 67.1%
졸업 | 3163 | 31.6%
 | 125 | 1.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
졸업예정 | 6712 | 68.0%
졸업 | 3163 | 32.0%

The plot percentages are computed over the 9875 rows in the two listed categories (6712 + 3163), so they differ from the Common Values table, which uses all 10000 rows.

합격여부
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
합격 | 4160
불합격 | 4119
결시 | 1717
응시결격 | 4

Length

Max length: 4
Median length: 2
Mean length: 2.4127
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 합격
2nd row: 합격
3rd row: 합격
4th row: 결시
5th row: 불합격

Common Values

Value | Count | Frequency (%)
합격 | 4160 | 41.6%
불합격 | 4119 | 41.2%
결시 | 1717 | 17.2%
응시결격 | 4 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
합격 | 4160 | 41.6%
불합격 | 4119 | 41.2%
결시 | 1717 | 17.2%
응시결격 | 4 | < 0.1%

학교소재지
Categorical

HIGH CORRELATION 

Distinct: 17
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
경기도 | 2051
서울특별시 | 1549
부산광역시 | 964
경상북도 | 759
경상남도 | 638
Other values (12) | 4039

Length

Max length: 5
Median length: 4
Mean length: 4.2002
Min length: 2

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 경기도
2nd row: 경상북도
3rd row: 서울특별시
4th row: 경기도
5th row: 부산광역시

Common Values

Value | Count | Frequency (%)
경기도 | 2051 | 20.5%
서울특별시 | 1549 | 15.5%
부산광역시 | 964 | 9.6%
경상북도 | 759 | 7.6%
경상남도 | 638 | 6.4%
대구광역시 | 631 | 6.3%
광주광역시 | 600 | 6.0%
대전광역시 | 576 | 5.8%
전라북도 | 548 | 5.5%
강원도 | 404 | 4.0%
Other values (7) | 1280 | 12.8%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
경기도 | 2051 | 20.5%
서울특별시 | 1549 | 15.5%
부산광역시 | 964 | 9.6%
경상북도 | 759 | 7.6%
경상남도 | 638 | 6.4%
대구광역시 | 631 | 6.3%
광주광역시 | 600 | 6.0%
대전광역시 | 576 | 5.8%
전라북도 | 548 | 5.5%
강원도 | 404 | 4.0%
Other values (7) | 1280 | 12.8%

Interactions


Correlations

 | 연도 | 회차 | 일련번호 | 성별 | 연령대 | 응시지역 | 졸업여부 | 합격여부 | 학교소재지
연도 | 1.000 | 1.000 | 0.983 | 0.019 | 0.167 | 0.200 | 0.233 | 0.132 | 0.136
회차 | 1.000 | 1.000 | 0.989 | 0.000 | 0.184 | 0.202 | 0.236 | 0.126 | 0.132
일련번호 | 0.983 | 0.989 | 1.000 | 0.014 | 0.180 | 0.158 | 0.217 | 0.109 | 0.112
성별 | 0.019 | 0.000 | 0.014 | 1.000 | 0.029 | 0.025 | 0.013 | 0.171 | 0.084
연령대 | 0.167 | 0.184 | 0.180 | 0.029 | 1.000 | 0.065 | 0.151 | 0.112 | 0.088
응시지역 | 0.200 | 0.202 | 0.158 | 0.025 | 0.065 | 1.000 | 0.044 | 0.040 | 0.943
졸업여부 | 0.233 | 0.236 | 0.217 | 0.013 | 0.151 | 0.044 | 1.000 | 0.254 | 0.096
합격여부 | 0.132 | 0.126 | 0.109 | 0.171 | 0.112 | 0.040 | 0.254 | 1.000 | 0.135
학교소재지 | 0.136 | 0.132 | 0.112 | 0.084 | 0.088 | 0.943 | 0.096 | 0.135 | 1.000

 | 응시지역 | 성별 | 합격여부 | 학교소재지 | 졸업여부 | 연령대
응시지역 | 1.000 | 0.026 | 0.027 | 0.806 | 0.029 | 0.041
성별 | 0.026 | 1.000 | 0.113 | 0.075 | 0.022 | 0.035
합격여부 | 0.027 | 0.113 | 1.000 | 0.075 | 0.243 | 0.092
학교소재지 | 0.806 | 0.075 | 0.075 | 1.000 | 0.051 | 0.045
졸업여부 | 0.029 | 0.022 | 0.243 | 0.051 | 1.000 | 0.114
연령대 | 0.041 | 0.035 | 0.092 | 0.045 | 0.114 | 1.000

 | 연도 | 회차 | 일련번호 | 성별 | 연령대 | 응시지역 | 졸업여부 | 합격여부 | 학교소재지
연도 | 1.000 | 1.000 | 0.995 | 0.013 | 0.079 | 0.104 | 0.145 | 0.072 | 0.053
회차 | 1.000 | 1.000 | 0.995 | 0.013 | 0.079 | 0.104 | 0.145 | 0.072 | 0.053
일련번호 | 0.995 | 0.995 | 1.000 | 0.011 | 0.076 | 0.080 | 0.132 | 0.066 | 0.044
성별 | 0.013 | 0.013 | 0.011 | 1.000 | 0.035 | 0.026 | 0.022 | 0.113 | 0.075
연령대 | 0.079 | 0.079 | 0.076 | 0.035 | 1.000 | 0.041 | 0.114 | 0.092 | 0.045
응시지역 | 0.104 | 0.104 | 0.080 | 0.026 | 0.041 | 1.000 | 0.029 | 0.027 | 0.806
졸업여부 | 0.145 | 0.145 | 0.132 | 0.022 | 0.114 | 0.029 | 1.000 | 0.243 | 0.051
합격여부 | 0.072 | 0.072 | 0.066 | 0.113 | 0.092 | 0.027 | 0.243 | 1.000 | 0.075
학교소재지 | 0.053 | 0.053 | 0.044 | 0.075 | 0.045 | 0.806 | 0.051 | 0.075 | 1.000
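
The report does not say which coefficient each matrix uses. One common measure of association between two categorical columns is Cramér's V, so the sketch below assumes the plain (uncorrected) form and applies it to the (응시지역, 학교소재지) pairs from the 20 rows in the Sample section. The function name `cramers_v` is hypothetical, and 20 rows are far too few to reproduce the values in the matrices above.

```python
import math
from collections import Counter

# (응시지역, 학교소재지) pairs tallied from the 20 rows in the Sample section.
pairs = (
    [("서울특별시", "경기도")] * 4
    + [("서울특별시", "서울특별시")] * 5
    + [("서울특별시", "강원도")] * 2
    + [("서울특별시", "충청남도")] * 1
    + [("부산광역시", "부산광역시")] * 6
    + [("대구광역시", "경상북도")] * 2
)


def cramers_v(pairs):
    """Uncorrected Cramér's V: sqrt(chi2 / (n * (min(rows, cols) - 1)))."""
    n = len(pairs)
    joint = Counter(pairs)
    row_totals = Counter(a for a, _ in pairs)
    col_totals = Counter(b for _, b in pairs)

    # Pearson chi-squared statistic over every cell of the contingency table.
    chi2 = 0.0
    for a, row_total in row_totals.items():
        for b, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))


print(f"Cramér's V on the sample rows: {cramers_v(pairs):.3f}")
```

In these 20 rows every school location maps to a single test region, so the sketch returns 1.0 up to floating-point rounding. The full 10000 rows contain exceptions, which is why the matrices report lower values for this pair.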

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

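First rows

 | 연도 | 직종 | 회차 | 일련번호 | 성별 | 연령대 | 응시지역 | 졸업여부 | 합격여부 | 학교소재지
1110 | 2000 | 영양사 | 23 | 1111 |  | 20 | 서울특별시 | 졸업예정 | 합격 | 경기도
33910 | 2003 | 영양사 | 26 | 33911 |  | 20 | 대구광역시 | 졸업예정 | 합격 | 경상북도
85330 | 2009 | 영양사 | 32 | 85331 |  | 20 | 서울특별시 | 졸업 | 합격 | 서울특별시
77280 | 2008 | 영양사 | 31 | 77281 |  | 20 | 서울특별시 | 졸업예정 | 결시 | 경기도
4746 | 2000 | 영양사 | 23 | 4747 |  | 20 | 부산광역시 | 졸업 | 불합격 | 부산광역시
20442 | 2002 | 영양사 | 25 | 20443 |  | 20 | 서울특별시 | 졸업예정 | 불합격 | 서울특별시
69662 | 2007 | 영양사 | 30 | 69663 |  | 20 | 부산광역시 | 졸업예정 | 합격 | 부산광역시
6717 | 2000 | 영양사 | 23 | 6718 |  | 20 | 대구광역시 | 졸업예정 | 불합격 | 경상북도
57898 | 2006 | 영양사 | 29 | 57899 |  | 20 | 서울특별시 | 졸업예정 | 합격 | 서울특별시
67517 | 2007 | 영양사 | 30 | 67518 |  | 20 | 서울특별시 | 졸업예정 | 불합격 | 강원도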
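
Last rows

 | 연도 | 직종 | 회차 | 일련번호 | 성별 | 연령대 | 응시지역 | 졸업여부 | 합격여부 | 학교소재지
5689 | 2000 | 영양사 | 23 | 5690 |  | 20 | 부산광역시 | 졸업예정 | 합격 | 부산광역시
50421 | 2005 | 영양사 | 28 | 50422 |  | 20 | 부산광역시 | 졸업예정 | 불합격 | 부산광역시
83551 | 2009 | 영양사 | 32 | 83552 |  | 20 | 서울특별시 | 졸업예정 | 결시 | 서울특별시
74933 | 2008 | 영양사 | 31 | 74934 |  | 30 | 서울특별시 | 졸업 | 불합격 | 경기도
48451 | 2005 | 영양사 | 28 | 48452 |  | 20 | 서울특별시 | 졸업예정 | 합격 | 서울특별시
87159 | 2009 | 영양사 | 32 | 87160 |  | 20 | 서울특별시 | 졸업예정 | 불합격 | 경기도
30326 | 2003 | 영양사 | 26 | 30327 |  | 20 | 서울특별시 | 졸업예정 | 불합격 | 강원도
32201 | 2003 | 영양사 | 26 | 32202 |  | 20 | 부산광역시 | 졸업예정 | 불합격 | 부산광역시
65053 | 2007 | 영양사 | 30 | 65054 |  | 20 | 서울특별시 | 졸업 | 합격 | 충청남도
41172 | 2004 | 영양사 | 27 | 41173 |  | 20 | 부산광역시 | 졸업예정 | 합격 | 부산광역시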