Overview

Dataset statistics

Number of variables: 6
Number of observations: 1830
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 91.3 KiB
Average record size in memory: 51.1 B
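
The average record size can be cross-checked from the two figures above it: total memory divided by the number of observations. The following is a minimal sketch, assuming the report uses 1 KiB = 1024 bytes and rounds to one decimal place; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics in this report.
total_kib = 91.3        # total size in memory, in KiB
n_observations = 1830   # number of rows

# Assumption: 1 KiB = 1024 bytes.
total_bytes = total_kib * 1024
avg_record_bytes = total_bytes / n_observations

# Compare with the reported "Average record size in memory" of 51.1 B.
print(f"{avg_record_bytes:.1f} B")
```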

Variable types

Numeric: 2
Categorical: 4

Dataset

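Description: Selection status of entrants to the Youth Startup Academy (청년창업사관학교) for 2022 and 2023, broken down by gender, region, and age group. Age groups are based on each entrant's age in full years (만나이) at the time of admission.
Author: 중소벤처기업진흥공단 (Korea SMEs and Startups Agency)
URL: https://www.data.go.kr/data/15107190/fileData.do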

Alerts

연번 is highly overall correlated with 일련번호 and 1 other field (High correlation)
일련번호 is highly overall correlated with 연번 and 1 other field (High correlation)
입교연도 is highly overall correlated with 연번 and 1 other field (High correlation)
연번 has unique values (Unique)
일련번호 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 17:00:18.090584
Analysis finished: 2023-12-12 17:00:19.079489
Duration: 0.99 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
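
The reported duration is the difference between the two timestamps above. A short sketch of that arithmetic, assuming both timestamps share the same time zone (the report does not state one):

```python
from datetime import datetime

# Timestamp layout used in the Reproduction block.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-12 17:00:18.090584", FMT)
finished = datetime.strptime("2023-12-12 17:00:19.079489", FMT)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()

# Compare with the reported "Duration" of 0.99 seconds.
print(f"{duration:.2f} seconds")
```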

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 1830
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 915.5
Minimum: 1
Maximum: 1830
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 16.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 92.45
Q1: 458.25
Median: 915.5
Q3: 1372.75
95-th percentile: 1738.55
Maximum: 1830
Range: 1829
Interquartile range (IQR): 914.5
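
These quantiles are consistent with 연번 being the integers 1 to 1830 and with percentiles computed by linear interpolation between order statistics. Both points are assumptions: the report does not state the interpolation method, and the value list is inferred from the minimum, maximum, and distinct count. The `percentile` helper below is a hypothetical illustration, not a ydata-profiling function.

```python
import math

def percentile(sorted_values, q):
    """Return percentile q (0-100) using linear interpolation
    between the two nearest order statistics."""
    pos = (len(sorted_values) - 1) * q / 100
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * frac

# Assumption: 연번 is simply 1, 2, ..., 1830.
values = list(range(1, 1831))

p5 = percentile(values, 5)
q1 = percentile(values, 25)
median = percentile(values, 50)
q3 = percentile(values, 75)
p95 = percentile(values, 95)
iqr = q3 - q1
value_range = values[-1] - values[0]

print(p5, q1, median, q3, p95, iqr, value_range)
```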

Descriptive statistics

Standard deviation: 528.41981
Coefficient of variation (CV): 0.57719259
Kurtosis: -1.2
Mean: 915.5
Median Absolute Deviation (MAD): 457.5
Skewness: 0
Sum: 1675365
Variance: 279227.5
Monotonicity: Strictly increasing
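
Several of these figures can be reproduced with the Python standard library, again assuming 연번 is the sequence 1 to 1830. The reported variance matches the sample variance (denominator n - 1), not the population variance. This is a sketch for cross-checking, not the report's own computation.

```python
import statistics

# Assumption: 연번 is simply 1, 2, ..., 1830.
values = [float(i) for i in range(1, 1831)]

mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, divides by n - 1
std = statistics.stdev(values)           # square root of the sample variance
cv = std / mean                          # coefficient of variation
median = statistics.median(values)
# Median absolute deviation: median of |x - median(x)|.
mad = statistics.median([abs(v - median) for v in values])
total = sum(values)

print(mean, variance, std, cv, mad, total)
```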
Histogram with fixed size bins (bins=50)

Common values

| Value | Count | Frequency (%) |
| 1 | 1 | 0.1% |
| 1231 | 1 | 0.1% |
| 1229 | 1 | 0.1% |
| 1228 | 1 | 0.1% |
| 1227 | 1 | 0.1% |
| 1226 | 1 | 0.1% |
| 1225 | 1 | 0.1% |
| 1224 | 1 | 0.1% |
| 1223 | 1 | 0.1% |
| 1222 | 1 | 0.1% |
| Other values (1820) | 1820 | 99.5% |

Minimum 10 values

| Value | Count | Frequency (%) |
| 1 | 1 | 0.1% |
| 2 | 1 | 0.1% |
| 3 | 1 | 0.1% |
| 4 | 1 | 0.1% |
| 5 | 1 | 0.1% |
| 6 | 1 | 0.1% |
| 7 | 1 | 0.1% |
| 8 | 1 | 0.1% |
| 9 | 1 | 0.1% |
| 10 | 1 | 0.1% |

Maximum 10 values

| Value | Count | Frequency (%) |
| 1830 | 1 | 0.1% |
| 1829 | 1 | 0.1% |
| 1828 | 1 | 0.1% |
| 1827 | 1 | 0.1% |
| 1826 | 1 | 0.1% |
| 1825 | 1 | 0.1% |
| 1824 | 1 | 0.1% |
| 1823 | 1 | 0.1% |
| 1822 | 1 | 0.1% |
| 1821 | 1 | 0.1% |

입교연도
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.4 KiB

2022: 915
2023: 915

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022
2nd row: 2022
3rd row: 2022
4th row: 2022
5th row: 2022

Common Values

| Value | Count | Frequency (%) |
| 2022 | 915 | 50.0% |
| 2023 | 915 | 50.0% |

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of the common values of 입교연도]

일련번호
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 1830
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20080921
Minimum: 20016854
Maximum: 20126582
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 16.2 KiB

Quantile statistics

Minimum: 20016854
5-th percentile: 20024059
Q1: 20040571
Median: 20084882
Q3: 20124191
95-th percentile: 20126090
Maximum: 20126582
Range: 109728
Interquartile range (IQR): 83619.75

Descriptive statistics

Standard deviation: 43354.666
Coefficient of variation (CV): 0.0021589979
Kurtosis: -1.9157738
Mean: 20080921
Median Absolute Deviation (MAD): 41483
Skewness: -0.048668035
Sum: 3.6748085 × 10^10
Variance: 1.8796271 × 10^9
Monotonicity: Not monotonic
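
The Monotonicity row can be understood as a check over consecutive values. The sketch below uses a hypothetical `monotonicity` helper; the labels other than "Strictly increasing" and "Not monotonic" do not appear in this report and are assumptions. The five 일련번호 values are the first rows as read from the Sample section. A non-monotonic prefix is enough to show that the whole column is not monotonic.

```python
def monotonicity(values):
    """Classify a sequence by comparing each value with its successor."""
    pairs = list(zip(values, values[1:]))
    if all(a < b for a, b in pairs):
        return "Strictly increasing"
    if all(a > b for a, b in pairs):
        return "Strictly decreasing"
    if all(a <= b for a, b in pairs):
        return "Increasing"      # assumed label, not shown in this report
    if all(a >= b for a, b in pairs):
        return "Decreasing"      # assumed label, not shown in this report
    return "Not monotonic"

# First five 일련번호 values, read from the Sample section of this report.
serial_head = [20042432, 20041606, 20042397, 20042681, 20041944]
serial_result = monotonicity(serial_head)

# Assumption: 연번 is simply 1, 2, ..., 1830.
seq_result = monotonicity(list(range(1, 1831)))

print(serial_result)
print(seq_result)
```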
Histogram with fixed size bins (bins=50)

Common values

| Value | Count | Frequency (%) |
| 20042432 | 1 | 0.1% |
| 20120962 | 1 | 0.1% |
| 20122148 | 1 | 0.1% |
| 20123177 | 1 | 0.1% |
| 20125961 | 1 | 0.1% |
| 20125581 | 1 | 0.1% |
| 20125740 | 1 | 0.1% |
| 20126517 | 1 | 0.1% |
| 20125271 | 1 | 0.1% |
| 20126400 | 1 | 0.1% |
| Other values (1820) | 1820 | 99.5% |

Minimum 10 values

| Value | Count | Frequency (%) |
| 20016854 | 1 | 0.1% |
| 20016870 | 1 | 0.1% |
| 20016879 | 1 | 0.1% |
| 20016888 | 1 | 0.1% |
| 20016910 | 1 | 0.1% |
| 20016918 | 1 | 0.1% |
| 20016923 | 1 | 0.1% |
| 20016937 | 1 | 0.1% |
| 20016938 | 1 | 0.1% |
| 20016952 | 1 | 0.1% |

Maximum 10 values

| Value | Count | Frequency (%) |
| 20126582 | 1 | 0.1% |
| 20126567 | 1 | 0.1% |
| 20126539 | 1 | 0.1% |
| 20126531 | 1 | 0.1% |
| 20126527 | 1 | 0.1% |
| 20126517 | 1 | 0.1% |
| 20126516 | 1 | 0.1% |
| 20126506 | 1 | 0.1% |
| 20126498 | 1 | 0.1% |
| 20126497 | 1 | 0.1% |

연령대
Categorical

Distinct: 4
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 14.4 KiB

30대: 1191
20대: 578
40대: 60
10대: 1

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 1
Unique (%): 0.1%
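
"Distinct" and "Unique" measure different things here: Distinct counts the different values that occur, while Unique appears to count the values that occur exactly once (only 10대 does). A minimal sketch of that reading, using the 연령대 counts from this report; the interpretation of Unique is inferred from the numbers, not stated by the report.

```python
from collections import Counter

# Counts per age group, copied from the 연령대 section of this report.
age_counts = Counter({"30대": 1191, "20대": 578, "40대": 60, "10대": 1})

n_rows = sum(age_counts.values())

# Distinct: number of different values.
n_distinct = len(age_counts)
# Unique (as inferred here): values that occur exactly once.
n_unique = sum(1 for c in age_counts.values() if c == 1)

distinct_pct = f"{100 * n_distinct / n_rows:.1f}%"
unique_pct = f"{100 * n_unique / n_rows:.1f}%"

print(n_distinct, distinct_pct)
print(n_unique, unique_pct)
```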

Sample

1st row: 30대
2nd row: 20대
3rd row: 20대
4th row: 30대
5th row: 30대

Common Values

| Value | Count | Frequency (%) |
| 30대 | 1191 | 65.1% |
| 20대 | 578 | 31.6% |
| 40대 | 60 | 3.3% |
| 10대 | 1 | 0.1% |
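
The Frequency (%) column is each count divided by the number of rows, and longer tables fold everything past the first ten values into an "Other values (k)" row. The sketch below reproduces both behaviours with a hypothetical `common_values` helper; the ten-row cutoff and one-decimal rounding are read off the tables in this report, not taken from the tool's documentation.

```python
from collections import Counter

def common_values(counts, top_k=10):
    """Build (value, count, frequency) rows, folding the tail into
    a single 'Other values (k)' row."""
    n = sum(counts.values())
    rows = [(str(v), c, f"{100 * c / n:.1f}%")
            for v, c in counts.most_common(top_k)]
    n_rest = len(counts) - len(rows)
    if n_rest > 0:
        rest_count = n - sum(c for _, c, _ in rows)
        rows.append((f"Other values ({n_rest})", rest_count,
                     f"{100 * rest_count / n:.1f}%"))
    return rows

# Counts copied from the 연령대 section of this report.
age_rows = common_values(Counter({"30대": 1191, "20대": 578, "40대": 60, "10대": 1}))

# Assumption: a column of 1830 unique values, like 연번 or 일련번호.
id_rows = common_values(Counter(range(1, 1831)))

for row in age_rows:
    print(row)
print(id_rows[-1])
```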

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of the common values of 연령대]

성별
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.4 KiB
1271 
559 

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

| Value | Count | Frequency (%) |
|  | 1271 | 69.5% |
|  | 559 | 30.5% |

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of the common values of 성별]

지역(사관학교)
Categorical

Distinct: 19
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 14.4 KiB

안산: 260
서울: 250
광주: 105
경북: 105
충남: 105
Other values (14): 1005

Length

Max length: 4
Median length: 2
Mean length: 2.0819672
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울
2nd row: 전북
3rd row: 강원
4th row: 충남
5th row: 강원

Common Values

| Value | Count | Frequency (%) |
| 안산 | 260 | 14.2% |
| 서울 | 250 | 13.7% |
| 광주 | 105 | 5.7% |
| 경북 | 105 | 5.7% |
| 충남 | 105 | 5.7% |
| 부산 | 90 | 4.9% |
| 대구 | 90 | 4.9% |
| 대전 | 85 | 4.6% |
| 전북 | 80 | 4.4% |
| 인천 | 80 | 4.4% |
| Other values (9) | 580 | 31.7% |

Length

Histogram of lengths of the category

| Value | Count | Frequency (%) |
| 안산 | 260 | 14.2% |
| 서울 | 250 | 13.7% |
| 광주 | 105 | 5.7% |
| 경북 | 105 | 5.7% |
| 충남 | 105 | 5.7% |
| 부산 | 90 | 4.9% |
| 대구 | 90 | 4.9% |
| 대전 | 85 | 4.6% |
| 경남 | 80 | 4.4% |
| 전남 | 80 | 4.4% |
| Other values (9) | 580 | 31.7% |

Interactions

[Figures: four pairwise interaction plots]

Correlations

[Figure: correlation heatmap]

|  | 연번 | 입교연도 | 일련번호 | 연령대 | 성별 | 지역(사관학교) |
| 연번 | 1.000 | 1.000 | 0.764 | 0.209 | 0.070 | 0.831 |
| 입교연도 | 1.000 | 1.000 | 1.000 | 0.291 | 0.000 | 0.000 |
| 일련번호 | 0.764 | 1.000 | 1.000 | 0.302 | 0.000 | 0.000 |
| 연령대 | 0.209 | 0.291 | 0.302 | 1.000 | 0.055 | 0.086 |
| 성별 | 0.070 | 0.000 | 0.000 | 0.055 | 1.000 | 0.118 |
| 지역(사관학교) | 0.831 | 0.000 | 0.000 | 0.086 | 0.118 | 1.000 |

[Figure: correlation heatmap]

|  | 연령대 | 지역(사관학교) | 입교연도 | 성별 |
| 연령대 | 1.000 | 0.047 | 0.194 | 0.036 |
| 지역(사관학교) | 0.047 | 1.000 | 0.000 | 0.104 |
| 입교연도 | 0.194 | 0.000 | 1.000 | 0.000 |
| 성별 | 0.036 | 0.104 | 0.000 | 1.000 |

[Figure: correlation heatmap]

|  | 연번 | 일련번호 | 입교연도 | 연령대 | 성별 | 지역(사관학교) |
| 연번 | 1.000 | 0.764 | 0.998 | 0.126 | 0.053 | 0.497 |
| 일련번호 | 0.764 | 1.000 | 0.999 | 0.122 | 0.000 | 0.000 |
| 입교연도 | 0.998 | 0.999 | 1.000 | 0.194 | 0.000 | 0.000 |
| 연령대 | 0.126 | 0.122 | 0.194 | 1.000 | 0.036 | 0.047 |
| 성별 | 0.053 | 0.000 | 0.000 | 0.036 | 1.000 | 0.104 |
| 지역(사관학교) | 0.497 | 0.000 | 0.000 | 0.047 | 0.104 | 1.000 |

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

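|  | 연번 | 입교연도 | 일련번호 | 연령대 | 성별 | 지역(사관학교) |
| 0 | 1 | 2022 | 20042432 | 30대 |  | 서울 |
| 1 | 2 | 2022 | 20041606 | 20대 |  | 전북 |
| 2 | 3 | 2022 | 20042397 | 20대 |  | 강원 |
| 3 | 4 | 2022 | 20042681 | 30대 |  | 충남 |
| 4 | 5 | 2022 | 20041944 | 30대 |  | 강원 |
| 5 | 6 | 2022 | 20017093 | 30대 |  | 전북 |
| 6 | 7 | 2022 | 20041282 | 30대 |  | 안산 |
| 7 | 8 | 2022 | 20041877 | 30대 |  | 강원 |
| 8 | 9 | 2022 | 20039673 | 20대 |  | 세종 |
| 9 | 10 | 2022 | 20017646 | 30대 |  | 전남 |

|  | 연번 | 입교연도 | 일련번호 | 연령대 | 성별 | 지역(사관학교) |
| 1820 | 1821 | 2023 | 20126312 | 30대 |  | 충북 |
| 1821 | 1822 | 2023 | 20125844 | 30대 |  | 충북 |
| 1822 | 1823 | 2023 | 20122183 | 30대 |  | 충북 |
| 1823 | 1824 | 2023 | 20120565 | 30대 |  | 충북 |
| 1824 | 1825 | 2023 | 20125876 | 30대 |  | 충북 |
| 1825 | 1826 | 2023 | 20124572 | 30대 |  | 충북 |
| 1826 | 1827 | 2023 | 20125460 | 30대 |  | 충북 |
| 1827 | 1828 | 2023 | 20126003 | 30대 |  | 충북 |
| 1828 | 1829 | 2023 | 20125718 | 30대 |  | 충북 |
| 1829 | 1830 | 2023 | 20124986 | 30대 |  | 충북 |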