Overview

Dataset statistics

Number of variables: 9
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.4 KiB
Average record size in memory: 75.3 B

Variable types

Categorical: 6
Text: 1
Numeric: 1
Boolean: 1

Alerts

cour_nm has constant value "일반과정" [Constant]
qf_grade_nm is highly overall correlated with wrdn_tot_grde and 3 other fields [High correlation]
zon_nm is highly overall correlated with efc_yy and 1 other field [High correlation]
efc_yy is highly overall correlated with wrdn_tot_grde and 3 other fields [High correlation]
wrdn_tot_grde is highly overall correlated with efc_yy and 2 other fields [High correlation]
wrdn_pas_div_nm is highly overall correlated with wrdn_tot_grde [High correlation]
qf_itm_nm is highly overall correlated with efc_yy and 1 other field [High correlation]
efc_yy is highly imbalanced (80.6%) [Imbalance]
qf_grade_nm is highly imbalanced (80.6%) [Imbalance]
usr_no has unique values [Unique]
wrdn_tot_grde has 6 (6.0%) zeros [Zeros]
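
The 80.6% imbalance figure for efc_yy and qf_grade_nm can be reproduced from the class counts. The sketch below assumes the score is one minus the Shannon entropy of the class shares, normalized by the maximum entropy for that number of classes. The function name `imbalance_score` is illustrative and is not taken from the report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class is scored as 0 in this sketch
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Class counts reported for efc_yy (2015: 97, 2021: 3); qf_grade_nm has the same split.
score = imbalance_score([97, 3])
print(f"{score:.1%}")  # 80.6%
```

A perfectly balanced two-class variable scores 0 under this definition, and the score approaches 1 as one class dominates.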

Reproduction

Analysis started: 2023-12-10 09:52:36.176880
Analysis finished: 2023-12-10 09:52:37.548164
Duration: 1.37 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

efc_yy
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
2015: 97
2021: 3

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2015
2nd row: 2021
3rd row: 2015
4th row: 2015
5th row: 2015

Common Values

Value  Count  Frequency (%)
2015   97     97.0%
2021   3      3.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value  Count  Frequency (%)
2015   97     97.0%
2021   3      3.0%

qf_grade_nm
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
2급 장애인스포츠지도사: 97
유소년스포츠지도사: 3

Length

Max length: 12
Median length: 12
Mean length: 11.91
Min length: 9

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2급 장애인스포츠지도사
2nd row: 유소년스포츠지도사
3rd row: 2급 장애인스포츠지도사
4th row: 2급 장애인스포츠지도사
5th row: 2급 장애인스포츠지도사

Common Values

Value                   Count  Frequency (%)
2급 장애인스포츠지도사  97     97.0%
유소년스포츠지도사      3      3.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value               Count  Frequency (%)
2급                 97     49.2%
장애인스포츠지도사  97     49.2%
유소년스포츠지도사  3      1.5%
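
The 49.2% and 1.5% shares in the table above are consistent with counting whitespace-separated words rather than rows: 97 rows contribute two words each and 3 rows contribute one, giving 197 tokens. A minimal sketch, assuming simple whitespace tokenization (the profiler's actual tokenizer may differ):

```python
from collections import Counter

# Category counts reported for qf_grade_nm.
category_counts = {"2급 장애인스포츠지도사": 97, "유소년스포츠지도사": 3}

word_counts = Counter()
for value, n in category_counts.items():
    for word in value.split():  # assumption: split on whitespace only
        word_counts[word] += n

total = sum(word_counts.values())  # 97 * 2 + 3 = 197 tokens
for word, n in word_counts.most_common():
    print(f"{word}\t{n}\t{n / total:.1%}")
```

The row-level shares (97.0% and 3.0%) use 100 observations as the denominator, which is why the two tables disagree.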

cour_nm
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
일반과정: 100

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 일반과정
2nd row: 일반과정
3rd row: 일반과정
4th row: 일반과정
5th row: 일반과정

Common Values

Value     Count  Frequency (%)
일반과정  100    100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value     Count  Frequency (%)
일반과정  100    100.0%

usr_no
Text

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 1000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
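
As a rough illustration of that idea, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories for the five usr_no values listed under Sample for this variable. It is not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

# First five usr_no values from the Sample listing of this report.
sample_ids = ["C000014298", "P000212059", "C000019915", "C000020112", "C000024828"]

# 'Nd' is the Unicode general category "Decimal Number", 'Lu' is "Uppercase Letter".
category_counts = Counter(
    unicodedata.category(ch) for value in sample_ids for ch in value
)
print(dict(category_counts))
```

Each identifier contributes one uppercase letter and nine digits, which matches the 10% / 90% split the report shows for the full column.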

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: C000014298
2nd row: P000212059
3rd row: C000019915
4th row: C000020112
5th row: C000024828

Words

Value              Count  Frequency (%)
c000014298         1      1.0%
c000093395         1      1.0%
c000096873         1      1.0%
c000096856         1      1.0%
c000096769         1      1.0%
c000096674         1      1.0%
c000096597         1      1.0%
c000095987         1      1.0%
c000095536         1      1.0%
c000095355         1      1.0%
Other values (90)  90     90.0%

Most occurring characters

Value             Count  Frequency (%)
0                 426    42.6%
C                 97     9.7%
1                 69     6.9%
3                 57     5.7%
9                 56     5.6%
8                 53     5.3%
5                 52     5.2%
4                 51     5.1%
6                 50     5.0%
2                 46     4.6%
Other values (2)  43     4.3%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    900    90.0%
Uppercase Letter  100    10.0%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
0      426    47.3%
1      69     7.7%
3      57     6.3%
9      56     6.2%
8      53     5.9%
5      52     5.8%
4      51     5.7%
6      50     5.6%
2      46     5.1%
7      40     4.4%

Uppercase Letter

Value  Count  Frequency (%)
C      97     97.0%
P      3      3.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  900    90.0%
Latin   100    10.0%

Most frequent character per script

Common

Value  Count  Frequency (%)
0      426    47.3%
1      69     7.7%
3      57     6.3%
9      56     6.2%
8      53     5.9%
5      52     5.8%
4      51     5.7%
6      50     5.6%
2      46     5.1%
7      40     4.4%

Latin

Value  Count  Frequency (%)
C      97     97.0%
P      3      3.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  1000   100.0%

Most frequent character per block

ASCII

Value             Count  Frequency (%)
0                 426    42.6%
C                 97     9.7%
1                 69     6.9%
3                 57     5.7%
9                 56     5.6%
8                 53     5.3%
5                 52     5.2%
4                 51     5.1%
6                 50     5.0%
2                 46     4.6%
Other values (2)  43     4.3%

wrdn_tot_grde
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 40
Distinct (%): 40.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 69.84
Minimum: 0
Maximum: 320
Zeros: 6
Zeros (%): 6.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 59
Median: 67.5
Q3: 76
95th percentile: 83
Maximum: 320
Range: 320
Interquartile range (IQR): 17
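
The quartiles above can be turned into a simple outlier check. The sketch below applies Tukey's 1.5 × IQR fences to a few values that appear in this variable's value tables. The rule is an assumption for illustration and is not something the report itself computes.

```python
# Quartiles reported for wrdn_tot_grde.
q1, q3 = 59.0, 76.0
iqr = q3 - q1  # 17.0, matching the reported IQR

# Tukey's fences (illustrative choice, not part of the report).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# A few of the smallest and largest values listed for this variable.
reported_extremes = [0, 35, 84, 270, 295, 320]
outliers = [v for v in reported_extremes if v < lower_fence or v > upper_fence]

print(lower_fence, upper_fence)  # 33.5 101.5
print(outliers)  # [0, 270, 295, 320]
```

Under this rule the six zeros and the three values above 100 fall outside the fences, which fits the high skewness and kurtosis reported for the variable.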

Descriptive statistics

Standard deviation: 44.132885
Coefficient of variation (CV): 0.63191416
Kurtosis: 20.214169
Mean: 69.84
Median Absolute Deviation (MAD): 8.5
Skewness: 3.9461219
Sum: 6984
Variance: 1947.7115
Monotonicity: Not monotonic
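
Several of these figures are tied together by simple identities: CV is the standard deviation divided by the mean, variance is the squared standard deviation, and the sum is the mean times the number of observations. A quick consistency check using the reported values (small differences come from rounding in the report):

```python
# Values reported for wrdn_tot_grde.
mean = 69.84
std = 44.132885
n_observations = 100

cv = std / mean                # reported as 0.63191416
variance = std ** 2            # reported as 1947.7115
total = mean * n_observations  # reported as 6984

print(round(cv, 6), round(variance, 3), round(total))
```
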
Histogram with fixed size bins (bins=40)

Common Values

Value              Count  Frequency (%)
68                 8      8.0%
64                 7      7.0%
0                  6      6.0%
77                 6      6.0%
75                 4      4.0%
59                 4      4.0%
83                 4      4.0%
78                 4      4.0%
63                 4      4.0%
58                 4      4.0%
Other values (30)  49     49.0%

Minimum 10 values

Value  Count  Frequency (%)
0      6      6.0%
35     1      1.0%
39     1      1.0%
45     1      1.0%
47     1      1.0%
49     1      1.0%
50     2      2.0%
54     2      2.0%
55     1      1.0%
57     3      3.0%

Maximum 10 values

Value  Count  Frequency (%)
320    1      1.0%
295    1      1.0%
270    1      1.0%
84     1      1.0%
83     4      4.0%
82     1      1.0%
81     1      1.0%
80     3      3.0%
79     1      1.0%
78     4      4.0%

wrdn_pas_div_nm
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
합격: 68
불합격: 22
과락: 10

Length

Max length: 3
Median length: 2
Mean length: 2.22
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 불합격
2nd row: 불합격
3rd row: 합격
4th row: 합격
5th row: 합격

Common Values

Value   Count  Frequency (%)
합격    68     68.0%
불합격  22     22.0%
과락    10     10.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value   Count  Frequency (%)
합격    68     68.0%
불합격  22     22.0%
과락    10     10.0%

fnl_pas_yn
Boolean

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 232.0 B
False: 71
True: 29

Value  Count  Frequency (%)
False  71     71.0%
True   29     29.0%

qf_itm_nm
Categorical

HIGH CORRELATION 

Distinct: 22
Distinct (%): 22.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
수영: 20
역도: 11
배드민턴: 10
육상: 8
보치아: 6
Other values (17): 45

Length

Max length: 4
Median length: 2
Mean length: 2.37
Min length: 2

Unique

Unique: 4
Unique (%): 4.0%

Sample

1st row: 태권도
2nd row: 당구
3rd row: 탁구
4th row: 사이클
5th row: 볼링

Common Values

Value              Count  Frequency (%)
수영               20     20.0%
역도               11     11.0%
배드민턴           10     10.0%
육상               8      8.0%
보치아             6      6.0%
태권도             6      6.0%
탁구               6      6.0%
조정               4      4.0%
축구               4      4.0%
배구               3      3.0%
Other values (12)  22     22.0%

Length

Histogram of lengths of the category

Words

Value              Count  Frequency (%)
수영               20     20.0%
역도               11     11.0%
배드민턴           10     10.0%
육상               8      8.0%
보치아             6      6.0%
태권도             6      6.0%
탁구               6      6.0%
조정               4      4.0%
축구               4      4.0%
농구               3      3.0%
Other values (12)  22     22.0%

zon_nm
Categorical

HIGH CORRELATION 

Distinct: 10
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
서울: 35
경기: 19
전남: 14
인천: 9
전북: 8
Other values (5): 15

Length

Max length: 4
Median length: 2
Mean length: 2.06
Min length: 2

Unique

Unique: 2
Unique (%): 2.0%

Sample

1st row: 경기
2nd row: <NA>
3rd row: 경기
4th row: 인천
5th row: 경기

Common Values

Value  Count  Frequency (%)
서울   35     35.0%
경기   19     19.0%
전남   14     14.0%
인천   9      9.0%
전북   8      8.0%
충남   7      7.0%
<NA>   3      3.0%
충북   3      3.0%
경북   1      1.0%
강원   1      1.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value  Count  Frequency (%)
서울   35     35.0%
경기   19     19.0%
전남   14     14.0%
인천   9      9.0%
전북   8      8.0%
충남   7      7.0%
na     3      3.0%
충북   3      3.0%
경북   1      1.0%
강원   1      1.0%

Interactions


Correlations

                 efc_yy  qf_grade_nm  usr_no  wrdn_tot_grde  wrdn_pas_div_nm  fnl_pas_yn  qf_itm_nm  zon_nm
efc_yy           1.000   0.963        1.000   1.000          0.079            0.000       1.000      NaN
qf_grade_nm      0.963   1.000        1.000   1.000          0.079            0.000       1.000      NaN
usr_no           1.000   1.000        1.000   1.000          1.000            1.000       1.000      1.000
wrdn_tot_grde    1.000   1.000        1.000   1.000          0.612            0.251       0.782      0.000
wrdn_pas_div_nm  0.079   0.079        1.000   0.612          1.000            0.258       0.000      0.429
fnl_pas_yn       0.000   0.000        1.000   0.251          0.258            1.000       0.490      0.000
qf_itm_nm        1.000   1.000        1.000   0.782          0.000            0.490       1.000      0.482
zon_nm           NaN     NaN          1.000   0.000          0.429            0.000       0.482      1.000

                 wrdn_pas_div_nm  qf_grade_nm  qf_itm_nm  zon_nm  fnl_pas_yn  efc_yy
wrdn_pas_div_nm  1.000            0.129        0.000      0.201   0.417       0.129
qf_grade_nm      0.129            1.000        0.892      1.000   0.000       0.826
qf_itm_nm        0.000            0.892        1.000      0.194   0.345       0.892
zon_nm           0.201            1.000        0.194      1.000   0.000       1.000
fnl_pas_yn       0.417            0.000        0.345      0.000   1.000       0.000
efc_yy           0.129            0.826        0.892      1.000   0.000       1.000

                 wrdn_tot_grde  efc_yy  qf_grade_nm  wrdn_pas_div_nm  fnl_pas_yn  qf_itm_nm  zon_nm
wrdn_tot_grde    1.000          0.985   0.985        0.532            0.161       0.472      0.000
efc_yy           0.985          1.000   0.826        0.129            0.000       0.892      1.000
qf_grade_nm      0.985          0.826   1.000        0.129            0.000       0.892      1.000
wrdn_pas_div_nm  0.532          0.129   0.129        1.000            0.417       0.000      0.201
fnl_pas_yn       0.161          0.000   0.000        0.417            1.000       0.345      0.000
qf_itm_nm        0.472          0.892   0.892        0.000            0.345       1.000      0.194
zon_nm           0.000          1.000   1.000        0.201            0.000       0.194      1.000
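
The High correlation alerts in the Overview can be cross-checked against the seven-variable matrix directly above. The sketch assumes an alert is raised for every off-diagonal coefficient above 0.5. That threshold is an assumption, chosen because it reproduces the alert list, and the helper name `correlated_fields` is hypothetical.

```python
# Seven-variable correlation matrix, copied from the report.
columns = ["wrdn_tot_grde", "efc_yy", "qf_grade_nm", "wrdn_pas_div_nm",
           "fnl_pas_yn", "qf_itm_nm", "zon_nm"]
matrix = [
    [1.000, 0.985, 0.985, 0.532, 0.161, 0.472, 0.000],
    [0.985, 1.000, 0.826, 0.129, 0.000, 0.892, 1.000],
    [0.985, 0.826, 1.000, 0.129, 0.000, 0.892, 1.000],
    [0.532, 0.129, 0.129, 1.000, 0.417, 0.000, 0.201],
    [0.161, 0.000, 0.000, 0.417, 1.000, 0.345, 0.000],
    [0.472, 0.892, 0.892, 0.000, 0.345, 1.000, 0.194],
    [0.000, 1.000, 1.000, 0.201, 0.000, 0.194, 1.000],
]

THRESHOLD = 0.5  # assumption: cut-off that reproduces the report's alerts


def correlated_fields(columns, matrix, threshold):
    """Map each variable to the other variables whose coefficient exceeds threshold."""
    result = {}
    for i, name in enumerate(columns):
        others = [columns[j] for j, value in enumerate(matrix[i])
                  if j != i and abs(value) > threshold]
        if others:
            result[name] = others
    return result


alerts = correlated_fields(columns, matrix, THRESHOLD)
for name, others in alerts.items():
    print(f"{name}: {len(others)} field(s): {', '.join(others)}")
```

With this cut-off, six variables are flagged and fnl_pas_yn is not, which agrees with the alerts listed in the Overview.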

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

    efc_yy  qf_grade_nm             cour_nm   usr_no      wrdn_tot_grde  wrdn_pas_div_nm  fnl_pas_yn  qf_itm_nm  zon_nm
0   2015    2급 장애인스포츠지도사  일반과정  C000014298  55             불합격           N           태권도     경기
1   2021    유소년스포츠지도사      일반과정  P000212059  295            불합격           N           당구       <NA>
2   2015    2급 장애인스포츠지도사  일반과정  C000019915  64             합격             Y           탁구       경기
3   2015    2급 장애인스포츠지도사  일반과정  C000020112  60             합격             N           사이클     인천
4   2015    2급 장애인스포츠지도사  일반과정  C000024828  68             합격             Y           볼링       경기
5   2015    2급 장애인스포츠지도사  일반과정  C000028681  0              불합격           N           보치아     서울
6   2015    2급 장애인스포츠지도사  일반과정  C000028873  35             과락             N           수영       전남
7   2021    유소년스포츠지도사      일반과정  P000212309  270            불합격           N           줄넘기     <NA>
8   2015    2급 장애인스포츠지도사  일반과정  C000034149  57             불합격           N           배구       전북
9   2015    2급 장애인스포츠지도사  일반과정  C000035450  63             합격             Y           육상       충남

Last rows

    efc_yy  qf_grade_nm             cour_nm   usr_no      wrdn_tot_grde  wrdn_pas_div_nm  fnl_pas_yn  qf_itm_nm  zon_nm
90  2015    2급 장애인스포츠지도사  일반과정  C000103855  78             합격             Y           배드민턴   서울
91  2015    2급 장애인스포츠지도사  일반과정  C000104495  57             불합격           N           수영       충남
92  2015    2급 장애인스포츠지도사  일반과정  C000104665  67             합격             Y           배구       전북
93  2015    2급 장애인스포츠지도사  일반과정  C000105686  81             합격             N           컬링       서울
94  2015    2급 장애인스포츠지도사  일반과정  C000106743  68             합격             N           보치아     서울
95  2015    2급 장애인스포츠지도사  일반과정  C000110462  71             합격             N           배드민턴   서울
96  2015    2급 장애인스포츠지도사  일반과정  C000110590  64             과락             N           배드민턴   서울
97  2015    2급 장애인스포츠지도사  일반과정  C000112253  66             합격             Y           축구       인천
98  2015    2급 장애인스포츠지도사  일반과정  C000112338  73             합격             Y           육상       전북
99  2015    2급 장애인스포츠지도사  일반과정  C000112485  58             불합격           N           탁구       전북