Overview

Dataset statistics

Number of variables: 5
Number of observations: 26
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.2 KiB
Average record size in memory: 48.1 B

Variable types

Text: 1
Categorical: 1
Numeric: 3

Dataset

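Description: Results of the sample health survey of elementary, middle, and high school students nationwide. The data covers age group, sex, height, weight, and number of students examined.
Author: Ministry of Education (교육부)
URL: https://www.data.go.kr/data/15051014/fileData.do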

Alerts

키(cm) is highly overall correlated with 몸무게(kg) and 1 other field (High correlation)
몸무게(kg) is highly overall correlated with 키(cm) and 1 other field (High correlation)
검사인원수 is highly overall correlated with 키(cm) and 1 other field (High correlation)
검사인원수 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 13:58:57.509806
Analysis finished: 2023-12-12 13:58:58.507064
Duration: 1 second
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

구분
Text

Distinct: 13
Distinct (%): 50.0%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B

Length

Max length: 5
Median length: 5
Mean length: 4.6153846
Min length: 4

Characters and Unicode

Total characters: 120
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 만 6세
2nd row: 만 6세
3rd row: 만 7세
4th row: 만 7세
5th row: 만 8세
Value             Count  Frequency (%)
만                24     48.0%
6세               2      4.0%
7세               2      4.0%
8세               2      4.0%
9세               2      4.0%
10세              2      4.0%
11세              2      4.0%
12세              2      4.0%
13세              2      4.0%
만14세            2      4.0%
Other values (4)  8      16.0%

Most occurring characters

Value             Count  Frequency (%)
만                26     21.7%
세                26     21.7%
(space)           24     20.0%
1                 20     16.7%
6                 4      3.3%
7                 4      3.3%
8                 4      3.3%
9                 2      1.7%
0                 2      1.7%
2                 2      1.7%
Other values (3)  6      5.0%

Most occurring categories

Value            Count  Frequency (%)
Other Letter     52     43.3%
Decimal Number   44     36.7%
Space Separator  24     20.0%
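
The category counts above can be checked with Python's standard `unicodedata` module. This is only a sketch: the list `values` is an assumption, rebuilt from the word counts and sample rows in this report (ages 6 to 18, two rows per age, with 만14세 written without a space) rather than read from the original file.

```python
import unicodedata
from collections import Counter

# Assumed reconstruction of the 구분 column: ages 6-18, two rows per age.
# Per the report, age 14 is written without a space ("만14세").
values = []
for age in range(6, 19):
    label = f"만{age}세" if age == 14 else f"만 {age}세"
    values.extend([label, label])

# Count the Unicode general category of every character in the column.
category_counts = Counter(
    unicodedata.category(ch) for value in values for ch in value
)

total_chars = sum(category_counts.values())
for code, count in category_counts.most_common():
    print(f"{code}: {count} ({count / total_chars:.1%})")
```

The general-category codes `Lo`, `Nd`, and `Zs` correspond to the report's Other Letter, Decimal Number, and Space Separator rows.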

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      20     45.5%
6      4      9.1%
7      4      9.1%
8      4      9.1%
9      2      4.5%
0      2      4.5%
2      2      4.5%
3      2      4.5%
4      2      4.5%
5      2      4.5%

Other Letter
Value  Count  Frequency (%)
만     26     50.0%
세     26     50.0%

Space Separator
Value    Count  Frequency (%)
(space)  24     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  68     56.7%
Hangul  52     43.3%

Most frequent character per script

Common
Value    Count  Frequency (%)
(space)  24     35.3%
1        20     29.4%
6        4      5.9%
7        4      5.9%
8        4      5.9%
9        2      2.9%
0        2      2.9%
2        2      2.9%
3        2      2.9%
4        2      2.9%

Hangul
Value  Count  Frequency (%)
만     26     50.0%
세     26     50.0%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   68     56.7%
Hangul  52     43.3%

Most frequent character per block

Hangul
Value  Count  Frequency (%)
만     26     50.0%
세     26     50.0%

ASCII
Value    Count  Frequency (%)
(space)  24     35.3%
1        20     29.4%
6        4      5.9%
7        4      5.9%
8        4      5.9%
9        2      2.9%
0        2      2.9%
2        2      2.9%
3        2      2.9%
4        2      2.9%

성별
Categorical

Distinct: 2
Distinct (%): 7.7%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

Value  Count  Frequency (%)
…      13     50.0%
…      13     50.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values, two categories with 13 rows (50.0%) each]

키(cm)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 25
Distinct (%): 96.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 150.32308
Minimum: 119.3
Maximum: 173.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 366.0 B

Quantile statistics

Minimum: 119.3
5-th percentile: 121.5
Q1: 136.8
Median: 156.25
Q3: 160.775
95-th percentile: 173.325
Maximum: 173.5
Range: 54.2
Interquartile range (IQR): 23.975

Descriptive statistics

Standard deviation: 17.380663
Coefficient of variation (CV): 0.11562205
Kurtosis: -1.0936147
Mean: 150.32308
Median Absolute Deviation (MAD): 13.35
Skewness: -0.39282158
Sum: 3908.4
Variance: 302.08745
Monotonicity: Not monotonic
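
Several of the 키(cm) figures above are derived from the others, which makes them easy to cross-check. The sketch below recomputes the coefficient of variation, variance, interquartile range, and range from the reported mean, standard deviation, quartiles, and extremes. The variable names are illustrative, and the inputs are the rounded values printed in this report, so the last digits may differ slightly.

```python
import math

# Values copied from the 키(cm) statistics in this report.
mean = 150.32308
std_dev = 17.380663
q1, q3 = 136.8, 160.775
minimum, maximum = 119.3, 173.5

cv = std_dev / mean              # coefficient of variation
variance = std_dev ** 2          # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV       = {cv:.8f}")
print(f"Variance = {variance:.5f}")
print(f"IQR      = {iqr:.3f}")
print(f"Range    = {value_range:.1f}")

# Compare against the reported figures, allowing for rounding of the inputs.
assert math.isclose(cv, 0.11562205, abs_tol=1e-6)
assert math.isclose(variance, 302.08745, abs_tol=1e-3)
```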
[Figure: histogram with fixed size bins (bins=25)]
Common values

Value              Count  Frequency (%)
173.5              2      7.7%
120.5              1      3.8%
119.3              1      3.8%
160.7              1      3.8%
160.8              1      3.8%
160.6              1      3.8%
172.8              1      3.8%
160.3              1      3.8%
171.6              1      3.8%
159.7              1      3.8%
Other values (15)  15     57.7%

Minimum 10 values

Value  Count  Frequency (%)
119.3  1      3.8%
120.5  1      3.8%
124.5  1      3.8%
125.7  1      3.8%
130.2  1      3.8%
131.8  1      3.8%
136.6  1      3.8%
137.4  1      3.8%
142.7  1      3.8%
143.1  1      3.8%

Maximum 10 values

Value  Count  Frequency (%)
173.5  2      7.7%
172.8  1      3.8%
171.6  1      3.8%
168.9  1      3.8%
164.0  1      3.8%
160.8  1      3.8%
160.7  1      3.8%
160.6  1      3.8%
160.3  1      3.8%
159.7  1      3.8%

몸무게(kg)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 25
Distinct (%): 96.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 47.011538
Minimum: 22.9
Maximum: 70
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 366.0 B

Quantile statistics

Minimum: 22.9
5-th percentile: 24.575
Q1: 33.725
Median: 50.2
Q3: 57.025
95-th percentile: 69.225
Maximum: 70
Range: 47.1
Interquartile range (IQR): 23.3

Descriptive statistics

Standard deviation: 14.988431
Coefficient of variation (CV): 0.31882451
Kurtosis: -1.2222514
Mean: 47.011538
Median Absolute Deviation (MAD): 12.2
Skewness: -0.1313183
Sum: 1222.3
Variance: 224.65306
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=25)]
Common values

Value              Count  Frequency (%)
52.0               2      7.7%
24.2               1      3.8%
57.6               1      3.8%
56.6               1      3.8%
70.0               1      3.8%
57.1               1      3.8%
69.6               1      3.8%
56.8               1      3.8%
68.1               1      3.8%
55.6               1      3.8%
Other values (15)  15     57.7%

Minimum 10 values

Value  Count  Frequency (%)
22.9   1      3.8%
24.2   1      3.8%
25.7   1      3.8%
27.1   1      3.8%
29.0   1      3.8%
31.4   1      3.8%
33.1   1      3.8%
35.6   1      3.8%
38.0   1      3.8%
40.3   1      3.8%

Maximum 10 values

Value  Count  Frequency (%)
70.0   1      3.8%
69.6   1      3.8%
68.1   1      3.8%
65.4   1      3.8%
62.4   1      3.8%
57.6   1      3.8%
57.1   1      3.8%
56.8   1      3.8%
56.6   1      3.8%
55.6   1      3.8%

검사인원수
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 26
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3185.8077
Minimum: 1267
Maximum: 4587
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 366.0 B

Quantile statistics

Minimum: 1267
5-th percentile: 1384.25
Q1: 2724.5
Median: 2939
Q3: 3990
95-th percentile: 4529.5
Maximum: 4587
Range: 3320
Interquartile range (IQR): 1265.5
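
The 5-th and 95-th percentiles above are not values that occur in the column, which suggests they were interpolated between neighbouring sorted values. The sketch below assumes linear interpolation at position q × (n − 1), an assumption that happens to reproduce the reported figures here. The helper `quantile_from_order_stats` is hypothetical, and it uses only the smallest and largest values listed for 검사인원수 in this report.

```python
import math

def quantile_from_order_stats(order_stats, n, q):
    """Linearly interpolate quantile q from known sorted values.

    order_stats maps a 0-based rank in the sorted column to its value;
    only the two ranks surrounding position q * (n - 1) are needed.
    """
    pos = q * (n - 1)
    lower_rank = math.floor(pos)
    upper_rank = math.ceil(pos)
    frac = pos - lower_rank
    lower = order_stats[lower_rank]
    upper = order_stats[upper_rank]
    return lower + frac * (upper - lower)

N_OBSERVATIONS = 26

# Three smallest and three largest values of 검사인원수, keyed by sorted rank.
known_values = {0: 1267, 1: 1296, 2: 1649, 23: 4525, 24: 4531, 25: 4587}

p05 = quantile_from_order_stats(known_values, N_OBSERVATIONS, 0.05)
p95 = quantile_from_order_stats(known_values, N_OBSERVATIONS, 0.95)

print(f"5-th percentile  ~ {p05:.2f}")
print(f"95-th percentile ~ {p95:.2f}")
```

Under that assumption, the 5-th percentile falls a quarter of the way between the second and third smallest values, and the 95-th percentile three quarters of the way between the third and second largest.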

Descriptive statistics

Standard deviation: 1000.1709
Coefficient of variation (CV): 0.31394579
Kurtosis: -0.69107143
Mean: 3185.8077
Median Absolute Deviation (MAD): 846.5
Skewness: -0.28949902
Sum: 82831
Variance: 1000341.8
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=26)]
Common values

Value              Count  Frequency (%)
1818               1      3.8%
3948               1      3.8%
1267               1      3.8%
1296               1      3.8%
4480               1      3.8%
4247               1      3.8%
4587               1      3.8%
4531               1      3.8%
4363               1      3.8%
4525               1      3.8%
Other values (16)  16     61.5%

Minimum 10 values

Value  Count  Frequency (%)
1267   1      3.8%
1296   1      3.8%
1649   1      3.8%
1818   1      3.8%
2602   1      3.8%
2644   1      3.8%
2721   1      3.8%
2735   1      3.8%
2779   1      3.8%
2801   1      3.8%

Maximum 10 values

Value  Count  Frequency (%)
4587   1      3.8%
4531   1      3.8%
4525   1      3.8%
4480   1      3.8%
4363   1      3.8%
4247   1      3.8%
4004   1      3.8%
3948   1      3.8%
3920   1      3.8%
3651   1      3.8%

Interactions

[Figures: nine pairwise interaction plots]

Correlations

            구분    성별    키(cm)  몸무게(kg)  검사인원수
구분        1.000  0.000  0.929   0.910       0.868
성별        0.000  1.000  0.650   0.000       0.167
키(cm)      0.929  0.650  1.000   0.983       0.676
몸무게(kg)  0.910  0.000  0.983   1.000       0.635
검사인원수  0.868  0.167  0.676   0.635       1.000

            키(cm)  몸무게(kg)  검사인원수  성별
키(cm)      1.000   0.998       0.575       0.399
몸무게(kg)  0.998   1.000       0.592       0.000
검사인원수  0.575   0.592       1.000       0.000
성별        0.399   0.000       0.000       1.000
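
The very strong association between 키(cm) and 몸무게(kg) in the matrices above can be illustrated with a plain Pearson correlation. This is a sketch only: the report does not say which coefficient each matrix uses, and the inputs below are just the ten first rows shown in the Sample section of this report, so the result is not expected to match the reported values exactly. The function name `pearson` is illustrative.

```python
import math

# 키(cm) and 몸무게(kg) for rows 0-9 of the Sample section of this report.
heights_cm = [120.5, 119.3, 125.7, 124.5, 131.8, 130.2, 137.4, 136.6, 142.7, 143.1]
weights_kg = [24.2, 22.9, 27.1, 25.7, 31.4, 29.0, 35.6, 33.1, 40.3, 38.0]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

r = pearson(heights_cm, weights_kg)
print(f"Pearson r (first ten rows) = {r:.3f}")
```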

Missing values

[Figure: a simple visualization of nullity by column.]
[Figure: nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows

    구분     성별  키(cm)  몸무게(kg)  검사인원수
0   만 6세   …     120.5   24.2        1818
1   만 6세   …     119.3   22.9        1649
2   만 7세   …     125.7   27.1        2896
3   만 7세   …     124.5   25.7        2721
4   만 8세   …     131.8   31.4        2959
5   만 8세   …     130.2   29.0        2801
6   만 9세   …     137.4   35.6        2779
7   만 9세   …     136.6   33.1        2602
8   만 10세  …     142.7   40.3        2844
9   만 10세  …     143.1   38.0        2644

Last rows

    구분     성별  키(cm)  몸무게(kg)  검사인원수
16  만14세   …     168.9   62.4        4004
17  만14세   …     159.7   53.9        3920
18  만 15세  …     171.6   65.4        4525
19  만 15세  …     160.3   55.6        4363
20  만 16세  …     172.8   68.1        4531
21  만 16세  …     160.6   56.8        4587
22  만 17세  …     173.5   69.6        4247
23  만 17세  …     160.8   57.1        4480
24  만 18세  …     173.5   70.0        1296
25  만 18세  …     160.7   56.6        1267