Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 2551
Duplicate rows (%): 25.5%
Total size in memory: 498.0 KiB
Average record size in memory: 51.0 B
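
The percentage and per-record figures above follow from the raw counts. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes and one-decimal rounding (both assumptions; the variable names are illustrative only):

```python
# Values copied from the dataset statistics above.
n_observations = 10_000
n_duplicate_rows = 2_551
total_size_kib = 498.0

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = 100 * n_duplicate_rows / n_observations

# Average record size in bytes, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```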

Variable types

Categorical: 3
Numeric: 2

Dataset

Description: Student sample physical examination (height) raw data
Author: Ministry of Education (교육부)
URL: https://www.data.go.kr/data/15051016/fileData.do

Alerts

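학년도 (school year) has constant value "2016" [Constant]
Dataset has 2551 (25.5%) duplicate rows [Duplicates]
The unnamed numeric variable is highly overall correlated with 학교급별 (school level) [High correlation]
학교급별 (school level) is highly overall correlated with the unnamed numeric variable [High correlation]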

Reproduction

Analysis started: 2023-12-11 23:44:16.615631
Analysis finished: 2023-12-11 23:44:17.528567
Duration: 0.91 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
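
A report like this one can be regenerated with ydata-profiling. The following is a minimal sketch, not the original author's script: the function name, the file paths, and the cp949 encoding are assumptions, and only the metadata values are taken from the Dataset block above.

```python
# Dataset metadata as shown in the "Dataset" block of this report.
DATASET_META = {
    "description": "학생표본 신체(키) 검사 rawdata",
    "author": "교육부",
    "url": "https://www.data.go.kr/data/15051016/fileData.do",
}


def build_report(csv_path, html_path, encoding="cp949"):
    """Profile a CSV file and write the HTML report.

    csv_path, html_path and the cp949 default are assumptions;
    adjust them to the file that was actually downloaded.
    """
    # Imported lazily so this module loads without the third-party packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    profile = ProfileReport(df, title="Profiling Report", dataset=DATASET_META)
    profile.to_file(html_path)
    return profile
```

A call such as `build_report("height_sample.csv", "report.html")` would write the report; both file names are hypothetical.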

Variables

학년도 (school year)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
2016: 10000

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2016
2nd row: 2016
3rd row: 2016
4th row: 2016
5th row: 2016

Common Values

Value  Count  Frequency (%)
2016   10000  100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

학교급별 (school level)
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Category counts (labels not shown): 4014, 3235, 2751

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

Value        Count  Frequency (%)
[not shown]  4014   40.1%
[not shown]  3235   32.4%
[not shown]  2751   27.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

학년 (grade)
Real number (ℝ)

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.587
Minimum: 1
Maximum: 6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 2
Q3: 3
95-th percentile: 6
Maximum: 6
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.4447975
Coefficient of variation (CV): 0.55848376
Kurtosis: 0.029889682
Mean: 2.587
Median Absolute Deviation (MAD): 1
Skewness: 0.87008432
Sum: 25870
Variance: 2.0874397
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=6)]

Values by descending frequency

Value  Count  Frequency (%)
2      2709   27.1%
1      2670   26.7%
3      2669   26.7%
6      668    6.7%
4      653    6.5%
5      631    6.3%

Values in ascending order

Value  Count  Frequency (%)
1      2670   26.7%
2      2709   27.1%
3      2669   26.7%
4      653    6.5%
5      631    6.3%
6      668    6.7%

Values in descending order

Value  Count  Frequency (%)
6      668    6.7%
5      631    6.3%
4      653    6.5%
3      2669   26.7%
2      2709   27.1%
1      2670   26.7%
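
The descriptive statistics for 학년 can be cross-checked against its frequency table. A minimal sketch using only the counts listed above; dividing by n - 1 (sample variance) is an assumption, chosen because it agrees with the reported variance:

```python
import math

# Frequency table for 학년 (grade), copied from the report: value -> count.
grade_counts = {1: 2670, 2: 2709, 3: 2669, 4: 653, 5: 631, 6: 668}

n = sum(grade_counts.values())
total = sum(value * count for value, count in grade_counts.items())
mean = total / n

# Sample variance (divides by n - 1); an assumption that matches the report.
sum_sq_dev = sum(count * (value - mean) ** 2 for value, count in grade_counts.items())
variance = sum_sq_dev / (n - 1)
std_dev = math.sqrt(variance)
cv = std_dev / mean  # coefficient of variation

print("n:", n, "sum:", total, "mean:", round(mean, 3))
print("variance:", round(variance, 7))
print("std dev:", round(std_dev, 7), "CV:", round(cv, 8))
```

The count, sum, and mean reproduce the reported 10000, 25870, and 2.587 exactly; the remaining values agree with the report to within rounding.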

성별 (sex)
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Category counts (labels not shown): 5033, 4967

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

Value        Count  Frequency (%)
[not shown]  5033   50.3%
[not shown]  4967   49.7%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]


Unnamed variable (height, per the dataset description)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 765
Distinct (%): 7.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 152.95533
Minimum: 95.8
Maximum: 193.7
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 95.8
5-th percentile: 121.395
Q1: 140
Median: 157
Q3: 166
95-th percentile: 176.3
Maximum: 193.7
Range: 97.9
Interquartile range (IQR): 26

Descriptive statistics

Standard deviation: 17.201766
Coefficient of variation (CV): 0.11246268
Kurtosis: -0.68669697
Mean: 152.95533
Median Absolute Deviation (MAD): 11.4
Skewness: -0.50410706
Sum: 1529553.3
Variance: 295.90076
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

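
Several of the figures reported for this variable are derived from others: range from the extremes, IQR from the quartiles, CV and variance from the standard deviation, and sum from the mean. A minimal sketch of those relations, using the reported values as inputs (the variable names are illustrative, and small differences are expected because the inputs are already rounded):

```python
# Reported statistics for the unnamed numeric variable, copied from above.
minimum, maximum = 95.8, 193.7
q1, q3 = 140.0, 166.0
mean = 152.95533
std_dev = 17.201766
n_observations = 10_000

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std_dev / mean              # Coefficient of variation (CV)
variance = std_dev ** 2          # Variance
total = mean * n_observations    # Sum

print("Range:", round(value_range, 1))
print("IQR:", iqr)
print("CV:", round(cv, 6))
print("Variance:", round(variance, 3))
print("Sum:", round(total, 1))
```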
Values by descending frequency

Value               Count  Frequency (%)
160.0               69     0.7%
156.0               68     0.7%
165.0               58     0.6%
170.0               53     0.5%
159.0               52     0.5%
157.0               52     0.5%
158.0               51     0.5%
163.0               48     0.5%
162.0               48     0.5%
164.0               48     0.5%
Other values (755)  9453   94.5%

Values in ascending order

Value  Count  Frequency (%)
95.8   1      < 0.1%
103.4  1      < 0.1%
105.4  1      < 0.1%
108.5  1      < 0.1%
108.6  1      < 0.1%
109.0  1      < 0.1%
109.2  1      < 0.1%
109.3  1      < 0.1%
109.6  1      < 0.1%
110.0  1      < 0.1%

Values in descending order

Value  Count  Frequency (%)
193.7  1      < 0.1%
190.5  1      < 0.1%
189.7  1      < 0.1%
189.0  1      < 0.1%
188.6  1      < 0.1%
188.2  1      < 0.1%
188.1  1      < 0.1%
187.5  2      < 0.1%
187.4  1      < 0.1%
187.2  2      < 0.1%

Interactions

[Figures: four interaction plots]

Correlations

[Figure: correlation heatmap]

            학교급별   학년     성별     (unnamed)
학교급별    1.000      0.747    0.017    0.753
학년        0.747      1.000    0.020    0.556
성별        0.017      0.020    1.000    0.558
(unnamed)   0.753      0.556    0.558    1.000

[Figure: correlation heatmap]

            성별     학교급별
성별        1.000    0.028
학교급별    0.028    1.000

[Figure: correlation heatmap]

            학년     (unnamed)  학교급별   성별
학년        1.000    -0.119     0.425      0.014
(unnamed)   -0.119   1.000      0.625      0.431
학교급별    0.425    0.625      1.000      0.028
성별        0.014    0.431      0.028      1.000
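
To see which variables are most strongly associated, the off-diagonal entries of the first correlation matrix can be ranked by absolute value. A minimal sketch, with the values copied from that matrix; the label "unnamed" is a placeholder (an assumption) for the variable whose name is missing in this report:

```python
# First correlation matrix from the report. "unnamed" is a placeholder label
# (an assumption) for the variable whose name is missing in the source.
labels = ["학교급별", "학년", "성별", "unnamed"]
matrix = [
    [1.000, 0.747, 0.017, 0.753],
    [0.747, 1.000, 0.020, 0.556],
    [0.017, 0.020, 1.000, 0.558],
    [0.753, 0.556, 0.558, 1.000],
]

# Collect each unordered pair once (upper triangle, diagonal excluded).
pairs = [
    (abs(matrix[i][j]), labels[i], labels[j])
    for i in range(len(labels))
    for j in range(i + 1, len(labels))
]
pairs.sort(reverse=True)  # strongest association first

for strength, a, b in pairs:
    print(f"{a} ~ {b}: {strength:.3f}")
```

The strongest pair in this ranking, 학교급별 and the unnamed variable, is the same pair named in the Alerts section.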

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

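First rows

Index  학년도  학교급별     학년  성별         (unnamed)
76828  2016    [not shown]  3     [not shown]  151.8
79417  2016    [not shown]  1     [not shown]  174.5
76173  2016    [not shown]  3     [not shown]  182.5
25527  2016    [not shown]  5     [not shown]  145.6
61860  2016    [not shown]  1     [not shown]  167.8
36217  2016    [not shown]  3     [not shown]  169.0
16261  2016    [not shown]  2     [not shown]  128.8
48893  2016    [not shown]  3     [not shown]  167.4
67790  2016    [not shown]  3     [not shown]  163.8
60289  2016    [not shown]  2     [not shown]  168.0

Last rows

Index  학년도  학교급별     학년  성별         (unnamed)
65552  2016    [not shown]  1     [not shown]  182.5
51089  2016    [not shown]  3     [not shown]  170.8
50805  2016    [not shown]  1     [not shown]  150.2
79527  2016    [not shown]  1     [not shown]  173.3
50703  2016    [not shown]  3     [not shown]  169.5
32988  2016    [not shown]  3     [not shown]  141.1
41105  2016    [not shown]  3     [not shown]  170.9
45839  2016    [not shown]  1     [not shown]  167.6
70239  2016    [not shown]  1     [not shown]  154.4
18533  2016    [not shown]  3     [not shown]  129.3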

Duplicate rows

Most frequently occurring

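Index  학년도  학교급별     학년  성별         (unnamed)  # duplicates
569    2016    [not shown]  3     [not shown]  170.0      14
207    2016    [not shown]  1     [not shown]  161.0      13
1194   2016    [not shown]  2     [not shown]  160.0      13
225    2016    [not shown]  1     [not shown]  163.0      12
454    2016    [not shown]  2     [not shown]  160.0      12
592    2016    [not shown]  3     [not shown]  173.0      12
691    2016    [not shown]  3     [not shown]  158.0      12
934    2016    [not shown]  1     [not shown]  156.0      12
313    2016    [not shown]  2     [not shown]  170.0      11
344    2016    [not shown]  2     [not shown]  174.0      11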