Overview

Dataset statistics

Number of variables: 10
Number of observations: 100
Missing cells: 101
Missing cells (%): 10.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 8.3 KiB
Average record size in memory: 85.3 B
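
The percentages in this table follow from the counts. As a quick sanity check, the minimal sketch below recomputes the missing-cell share and an approximate per-record size. The variable names are illustrative and not part of the report, and because the total size is rounded to one decimal, the record size only approximates the reported 85.3 B.

```python
# Values copied from the dataset statistics above.
n_variables = 10
n_observations = 100
missing_cells = 101
total_size_kib = 8.3  # rounded to one decimal in the report

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Approximation only: the rounded total limits the precision here.
approx_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Approx. record size: {approx_record_bytes:.1f} B")
```

Note that 100 of the 101 missing cells come from zon_nm, which is entirely empty; the remaining one is in prtc_tot_grde.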

Variable types

Categorical: 5
Text: 1
Numeric: 2
Boolean: 1
Unsupported: 1

Alerts

prtc_tot_grde is highly overall correlated with prtc_pas_div_nm and 1 other field [High correlation]
orst_tot_grde is highly overall correlated with prtc_pas_div_nm and 1 other field [High correlation]
efc_yy is highly overall correlated with qf_grade_nm and 2 other fields [High correlation]
qf_grade_nm is highly overall correlated with efc_yy and 2 other fields [High correlation]
cour_nm is highly overall correlated with efc_yy and 2 other fields [High correlation]
prtc_pas_div_nm is highly overall correlated with prtc_tot_grde and 3 other fields [High correlation]
fnl_pas_yn is highly overall correlated with prtc_tot_grde and 3 other fields [High correlation]
qf_itm_nm is highly overall correlated with efc_yy and 4 other fields [High correlation]
efc_yy is highly imbalanced (80.6%) [Imbalance]
qf_grade_nm is highly imbalanced (80.6%) [Imbalance]
cour_nm is highly imbalanced (80.6%) [Imbalance]
zon_nm has 100 (100.0%) missing values [Missing]
usr_no has unique values [Unique]
zon_nm is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
prtc_tot_grde has 7 (7.0%) zeros [Zeros]
orst_tot_grde has 5 (5.0%) zeros [Zeros]
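
The 80.6% imbalance figure can be reproduced from the 97/3 class split that efc_yy, qf_grade_nm and cour_nm share. The sketch below assumes the score is one minus the Shannon entropy of the class shares, normalised by the maximum entropy for that number of classes. This definition matches the reported value, but the function name and the single-class fallback are assumptions made for illustration, not taken from the report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class shares and k is the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class is scored as 0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Class counts reported for efc_yy (2015: 97, 2021: 3).
score = imbalance_score([97, 3])
print(f"{score:.1%}")
```

Under this definition a perfectly balanced variable scores 0 and a variable dominated by one class approaches 1.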

Reproduction

Analysis started: 2023-12-10 09:51:19.336896
Analysis finished: 2023-12-10 09:51:21.015578
Duration: 1.68 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

efc_yy
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
2015 | 97
2021 | 3

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2015
2nd row: 2021
3rd row: 2015
4th row: 2015
5th row: 2015

Common Values

Value | Count | Frequency (%)
2015 | 97 | 97.0%
2021 | 3 | 3.0%

Length

[Figure: histogram of lengths of the category]

qf_grade_nm
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
2급 장애인스포츠지도사 | 97
유소년스포츠지도사 | 3

Length

Max length: 12
Median length: 12
Mean length: 11.91
Min length: 9

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2급 장애인스포츠지도사
2nd row: 유소년스포츠지도사
3rd row: 2급 장애인스포츠지도사
4th row: 2급 장애인스포츠지도사
5th row: 2급 장애인스포츠지도사

Common Values

Value | Count | Frequency (%)
2급 장애인스포츠지도사 | 97 | 97.0%
유소년스포츠지도사 | 3 | 3.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
2급 | 97 | 49.2%
장애인스포츠지도사 | 97 | 49.2%
유소년스포츠지도사 | 3 | 1.5%

cour_nm
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
일반과정 | 97
특별과정 | 3

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 일반과정
2nd row: 특별과정
3rd row: 일반과정
4th row: 일반과정
5th row: 일반과정

Common Values

Value | Count | Frequency (%)
일반과정 | 97 | 97.0%
특별과정 | 3 | 3.0%

Length

[Figure: histogram of lengths of the category]

usr_no
Text

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 1000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
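
As a small illustration of that idea, the sketch below classifies every character of the five usr_no sample values listed in this report by its Unicode general category, using Python's standard `unicodedata` module. It covers only those five values rather than the full column, so the counts are smaller than the report's totals, but the split into two categories is the same.

```python
import unicodedata
from collections import Counter

# The five usr_no sample values shown in this report.
samples = ["C000019915", "P000176688", "C000024828", "C000033959", "C000035450"]

# General category per character: "Lu" is an uppercase letter,
# "Nd" is a decimal number.
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)

for category, count in category_counts.most_common():
    print(category, count)
```

Each identifier contributes one uppercase letter and nine digits, which is consistent with the 10.0% / 90.0% split the report gives for the whole column.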

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: C000019915
2nd row: P000176688
3rd row: C000024828
4th row: C000033959
5th row: C000035450

Value | Count | Frequency (%)
c000019915 | 1 | 1.0%
c000110462 | 1 | 1.0%
c000119718 | 1 | 1.0%
c000119526 | 1 | 1.0%
c000118839 | 1 | 1.0%
c000118683 | 1 | 1.0%
c000118655 | 1 | 1.0%
c000115180 | 1 | 1.0%
c000114246 | 1 | 1.0%
c000113585 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Most occurring characters

Value | Count | Frequency (%)
0 | 396 | 39.6%
1 | 102 | 10.2%
C | 97 | 9.7%
6 | 61 | 6.1%
2 | 60 | 6.0%
9 | 53 | 5.3%
8 | 50 | 5.0%
3 | 49 | 4.9%
5 | 46 | 4.6%
4 | 43 | 4.3%
Other values (2) | 43 | 4.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 900 | 90.0%
Uppercase Letter | 100 | 10.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 396 | 44.0%
1 | 102 | 11.3%
6 | 61 | 6.8%
2 | 60 | 6.7%
9 | 53 | 5.9%
8 | 50 | 5.6%
3 | 49 | 5.4%
5 | 46 | 5.1%
4 | 43 | 4.8%
7 | 40 | 4.4%

Uppercase Letter
Value | Count | Frequency (%)
C | 97 | 97.0%
P | 3 | 3.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 900 | 90.0%
Latin | 100 | 10.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 396 | 44.0%
1 | 102 | 11.3%
6 | 61 | 6.8%
2 | 60 | 6.7%
9 | 53 | 5.9%
8 | 50 | 5.6%
3 | 49 | 5.4%
5 | 46 | 5.1%
4 | 43 | 4.8%
7 | 40 | 4.4%

Latin
Value | Count | Frequency (%)
C | 97 | 97.0%
P | 3 | 3.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 396 | 39.6%
1 | 102 | 10.2%
C | 97 | 9.7%
6 | 61 | 6.1%
2 | 60 | 6.0%
9 | 53 | 5.3%
8 | 50 | 5.0%
3 | 49 | 4.9%
5 | 46 | 4.6%
4 | 43 | 4.3%
Other values (2) | 43 | 4.3%

prtc_tot_grde
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 39
Distinct (%): 39.4%
Missing: 1
Missing (%): 1.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 75.393939
Minimum: 0
Maximum: 100
Zeros: 7
Zeros (%): 7.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 72
Median: 81
Q3: 88
95-th percentile: 100
Maximum: 100
Range: 100
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 23.707322
Coefficient of variation (CV): 0.31444599
Kurtosis: 4.9028211
Mean: 75.393939
Median Absolute Deviation (MAD): 7
Skewness: -2.2202834
Sum: 7464
Variance: 562.03711
Monotonicity: Not monotonic
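
Several of these figures are tied together by simple identities, which gives a quick way to check the table for consistency. The sketch below is a hedged check using only the values printed above, with illustrative variable names. It assumes the mean is taken over the 99 non-missing observations, which is what the reported sum and mean imply.

```python
import math

# Values copied from the prtc_tot_grde statistics above.
n_observations = 100
n_missing = 1
total = 7464
mean = 75.393939
std = 23.707322
variance = 562.03711
cv = 0.31444599
q1, q3, iqr = 72, 88, 16

# Assumption: statistics are computed over non-missing values only.
n_valid = n_observations - n_missing

# Tolerances allow for the rounding applied in the report.
checks = {
    "mean == sum / non-missing count": math.isclose(total / n_valid, mean, rel_tol=1e-6),
    "variance == std ** 2": math.isclose(std ** 2, variance, rel_tol=1e-6),
    "cv == std / mean": math.isclose(std / mean, cv, rel_tol=1e-6),
    "iqr == q3 - q1": q3 - q1 == iqr,
}

for name, ok in checks.items():
    print(f"{name}: {ok}")
```

Dividing the sum by all 100 observations instead would give 74.64, which does not match the reported mean.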
[Figure] Histogram with fixed size bins (bins=39)

Common values

Value | Count | Frequency (%)
0 | 7 | 7.0%
78 | 7 | 7.0%
100 | 7 | 7.0%
82 | 6 | 6.0%
81 | 5 | 5.0%
80 | 5 | 5.0%
88 | 4 | 4.0%
84 | 4 | 4.0%
85 | 4 | 4.0%
83 | 3 | 3.0%
Other values (29) | 47 | 47.0%

Minimum 10 values

Value | Count | Frequency (%)
0 | 7 | 7.0%
41 | 1 | 1.0%
55 | 1 | 1.0%
58 | 1 | 1.0%
61 | 1 | 1.0%
62 | 1 | 1.0%
63 | 2 | 2.0%
65 | 3 | 3.0%
66 | 1 | 1.0%
67 | 2 | 2.0%

Maximum 10 values

Value | Count | Frequency (%)
100 | 7 | 7.0%
99 | 1 | 1.0%
98 | 1 | 1.0%
97 | 1 | 1.0%
96 | 2 | 2.0%
95 | 2 | 2.0%
94 | 3 | 3.0%
91 | 2 | 2.0%
90 | 1 | 1.0%
89 | 2 | 2.0%

orst_tot_grde
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 36
Distinct (%): 36.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 69.49
Minimum: 0
Maximum: 97
Zeros: 5
Zeros (%): 5.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 18.05
Q1: 57.75
Median: 75
Q3: 85
95-th percentile: 92.05
Maximum: 97
Range: 97
Interquartile range (IQR): 27.25

Descriptive statistics

Standard deviation: 23.056582
Coefficient of variation (CV): 0.33179712
Kurtosis: 2.2258646
Mean: 69.49
Median Absolute Deviation (MAD): 10.5
Skewness: -1.5739569
Sum: 6949
Variance: 531.60596
Monotonicity: Not monotonic
[Figure] Histogram with fixed size bins (bins=36)

Common values

Value | Count | Frequency (%)
72 | 9 | 9.0%
75 | 8 | 8.0%
78 | 6 | 6.0%
82 | 6 | 6.0%
57 | 5 | 5.0%
0 | 5 | 5.0%
83 | 4 | 4.0%
92 | 4 | 4.0%
97 | 4 | 4.0%
87 | 4 | 4.0%
Other values (26) | 45 | 45.0%

Minimum 10 values

Value | Count | Frequency (%)
0 | 5 | 5.0%
19 | 1 | 1.0%
29 | 1 | 1.0%
34 | 3 | 3.0%
36 | 1 | 1.0%
39 | 1 | 1.0%
45 | 2 | 2.0%
48 | 1 | 1.0%
50 | 1 | 1.0%
52 | 2 | 2.0%

Maximum 10 values

Value | Count | Frequency (%)
97 | 4 | 4.0%
93 | 1 | 1.0%
92 | 4 | 4.0%
90 | 3 | 3.0%
89 | 2 | 2.0%
88 | 3 | 3.0%
87 | 4 | 4.0%
86 | 1 | 1.0%
85 | 4 | 4.0%
84 | 1 | 1.0%

prtc_pas_div_nm
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
합격 | 60
불합격 | 40

Length

Max length: 3
Median length: 2
Mean length: 2.4
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 합격
2nd row: 합격
3rd row: 합격
4th row: 합격
5th row: 합격

Common Values

Value | Count | Frequency (%)
합격 | 60 | 60.0%
불합격 | 40 | 40.0%

Length

[Figure: histogram of lengths of the category]

fnl_pas_yn
Boolean

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 232.0 B

Value | Count | Frequency (%)
True | 53 | 53.0%
False | 47 | 47.0%

qf_itm_nm
Categorical

HIGH CORRELATION 

Distinct: 24
Distinct (%): 24.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
수영 | 13
역도 | 13
배드민턴 | 10
보치아 | 8
배구 | 8
Other values (19) | 48

Length

Max length: 18
Median length: 2
Mean length: 2.61
Min length: 2

Unique

Unique: 9
Unique (%): 9.0%

Sample

1st row: 보치아
2nd row: 유도
3rd row: 볼링
4th row: 배구
5th row: 육상

Common Values

Value | Count | Frequency (%)
수영 | 13 | 13.0%
역도 | 13 | 13.0%
배드민턴 | 10 | 10.0%
보치아 | 8 | 8.0%
배구 | 8 | 8.0%
축구 | 7 | 7.0%
태권도 | 7 | 7.0%
럭비 | 5 | 5.0%
육상 | 5 | 5.0%
농구 | 5 | 5.0%
Other values (14) | 19 | 19.0%

Length

[Figure: histogram of lengths of the category]

zon_nm
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 100
Missing (%): 100.0%
Memory size: 1.0 KiB

Interactions

[Figures: four interaction plots; images not recoverable]

Correlations

[Figure: correlation heatmap]

Variable | efc_yy | qf_grade_nm | cour_nm | usr_no | prtc_tot_grde | orst_tot_grde | prtc_pas_div_nm | fnl_pas_yn | qf_itm_nm
efc_yy | 1.000 | 0.963 | 0.963 | 1.000 | 0.245 | 0.000 | 0.000 | 0.126 | 0.709
qf_grade_nm | 0.963 | 1.000 | 0.963 | 1.000 | 0.245 | 0.000 | 0.000 | 0.126 | 0.709
cour_nm | 0.963 | 0.963 | 1.000 | 1.000 | 0.245 | 0.000 | 0.000 | 0.126 | 0.709
usr_no | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
prtc_tot_grde | 0.245 | 0.245 | 0.245 | 1.000 | 1.000 | 0.785 | 0.632 | 0.539 | 0.590
orst_tot_grde | 0.000 | 0.000 | 0.000 | 1.000 | 0.785 | 1.000 | 0.912 | 0.793 | 0.000
prtc_pas_div_nm | 0.000 | 0.000 | 0.000 | 1.000 | 0.632 | 0.912 | 1.000 | 0.970 | 0.731
fnl_pas_yn | 0.126 | 0.126 | 0.126 | 1.000 | 0.539 | 0.793 | 0.970 | 1.000 | 0.743
qf_itm_nm | 0.709 | 0.709 | 0.709 | 1.000 | 0.590 | 0.000 | 0.731 | 0.743 | 1.000

[Figure: correlation heatmap]

Variable | efc_yy | cour_nm | fnl_pas_yn | qf_itm_nm | prtc_pas_div_nm | qf_grade_nm
efc_yy | 1.000 | 0.826 | 0.080 | 0.557 | 0.000 | 0.826
cour_nm | 0.826 | 1.000 | 0.080 | 0.557 | 0.000 | 0.826
fnl_pas_yn | 0.080 | 0.080 | 1.000 | 0.587 | 0.845 | 0.080
qf_itm_nm | 0.557 | 0.557 | 0.587 | 1.000 | 0.575 | 0.557
prtc_pas_div_nm | 0.000 | 0.000 | 0.845 | 0.575 | 1.000 | 0.000
qf_grade_nm | 0.826 | 0.826 | 0.080 | 0.557 | 0.000 | 1.000

[Figure: correlation heatmap]

Variable | prtc_tot_grde | orst_tot_grde | efc_yy | qf_grade_nm | cour_nm | prtc_pas_div_nm | fnl_pas_yn | qf_itm_nm
prtc_tot_grde | 1.000 | 0.307 | 0.254 | 0.254 | 0.254 | 0.663 | 0.564 | 0.270
orst_tot_grde | 0.307 | 1.000 | 0.000 | 0.000 | 0.000 | 0.722 | 0.602 | 0.000
efc_yy | 0.254 | 0.000 | 1.000 | 0.826 | 0.826 | 0.000 | 0.080 | 0.557
qf_grade_nm | 0.254 | 0.000 | 0.826 | 1.000 | 0.826 | 0.000 | 0.080 | 0.557
cour_nm | 0.254 | 0.000 | 0.826 | 0.826 | 1.000 | 0.000 | 0.080 | 0.557
prtc_pas_div_nm | 0.663 | 0.722 | 0.000 | 0.000 | 0.000 | 1.000 | 0.845 | 0.575
fnl_pas_yn | 0.564 | 0.602 | 0.080 | 0.080 | 0.080 | 0.845 | 1.000 | 0.587
qf_itm_nm | 0.270 | 0.000 | 0.557 | 0.557 | 0.557 | 0.575 | 0.587 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

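First rows

Row | efc_yy | qf_grade_nm | cour_nm | usr_no | prtc_tot_grde | orst_tot_grde | prtc_pas_div_nm | fnl_pas_yn | qf_itm_nm | zon_nm
0 | 2015 | 2급 장애인스포츠지도사 | 일반과정 | C000019915 | 82 | 83 | 합격 | Y | 보치아 | <NA>
1 | 2021 | 유소년스포츠지도사 | 특별과정 | P000176688 | <NA> | 82 | 합격 | N | 유도 | <NA>
2 | 2015 | 2급 장애인스포츠지도사 | 일반과정 | C000024828 | 81 | 92 | 합격 | Y | 볼링 | <NA>
3 | 2015 | 2급 장애인스포츠지도사 | 일반과정 | C000033959 | 95 | 97 | 합격 | Y | 배구 | <NA>
4 | 2015 | 2급 장애인스포츠지도사 | 일반과정 | C000035450 | 80 | 87 | 합격 | Y | 육상 | <NA>
5 | 2015 | 2급 장애인스포츠지도사 | 일반과정 | C000038412 | 99 | 89 | 합격 | N | 수영 | <NA>
6 | 2015 | 2급 장애인스포츠지도사 | 일반과정 | C000041380 | 80 | 92 | 합격 | Y | 육상 | <NA>
7 | 2021 | 유소년스포츠지도사 | 특별과정 | P000213697 | 69 | 73 | 불합격 | N | 검도 | <NA>
8 | 2015 | 2급 장애인스포츠지도사 | 일반과정 | C000042577 | 77 | 69 | 불합격 | N | 역도 | <NA>
9 | 2015 | 2급 장애인스포츠지도사 | 일반과정 | C000043773 | 94 | 88 | 합격 | Y | 수영 | <NA>

Last rows

Row | efc_yy | qf_grade_nm | cour_nm | usr_no | prtc_tot_grde | orst_tot_grde | prtc_pas_div_nm | fnl_pas_yn | qf_itm_nm | zon_nm
90 | 2015 | 2급 장애인스포츠지도사 | 일반과정 | C000127018 | 96 | 75 | 합격 | Y | 럭비 | <NA>
91 | 2015 | 2급 장애인스포츠지도사 | 일반과정 | C000127382 | 91 | 83 | 합격 | Y | 테니스 | <NA>
92 | 2015 | 2급 장애인스포츠지도사 | 일반과정 | C000127441 | 82 | 34 | 불합격 | N | 역도 | <NA>
93 | 2015 | 2급 장애인스포츠지도사 | 일반과정 | C000128200 | 78 | 75 | 합격 | Y | 배드민턴 | <NA>
94 | 2015 | 2급 장애인스포츠지도사 | 일반과정 | C000128653 | 87 | 78 | 합격 | Y | 축구 | <NA>
95 | 2015 | 2급 장애인스포츠지도사 | 일반과정 | C000129091 | 65 | 87 | 불합격 | N | 축구 | <NA>
96 | 2015 | 2급 장애인스포츠지도사 | 일반과정 | C000129171 | 84 | 75 | 합격 | Y | 보치아 | <NA>
97 | 2015 | 2급 장애인스포츠지도사 | 일반과정 | C000129303 | 91 | 36 | 불합격 | N | 수영 | <NA>
98 | 2015 | 2급 장애인스포츠지도사 | 일반과정 | C000130699 | 83 | 97 | 합격 | N | 배구 | <NA>
99 | 2015 | 2급 장애인스포츠지도사 | 일반과정 | C000131192 | 81 | 93 | 합격 | Y | 배구 | <NA>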