Overview

Dataset statistics

Number of variables: 6
Number of observations: 996
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 49.7 KiB
Average record size in memory: 51.1 B
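
The average record size follows from the figures listed with it. A quick arithmetic check, as a sketch that assumes 1 KiB = 1024 bytes and that the average is simply total memory divided by the number of observations (variable names are illustrative):

```python
# Figures copied from the dataset statistics in this report
n_variables = 6
n_observations = 996
total_size_kib = 49.7

# Assumption: 1 KiB = 1024 bytes, average = total bytes / number of rows
total_bytes = total_size_kib * 1024
avg_record_bytes = total_bytes / n_observations

# Total number of cells (rows x columns) that the missing-cell count refers to
n_cells = n_variables * n_observations

print(f"{avg_record_bytes:.1f} B per record, {n_cells} cells")
```

Under these assumptions the result rounds to the reported 51.1 B.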

Variable types

Numeric: 3
Categorical: 2
Text: 1

Dataset

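Description: Year-by-year trainee selection records for the Smart Farm Youth Startup Incubation Centers. The Smart Farm Innovation Valleys (Gimje, Jeollabuk-do; Goheung, Jeollanam-do; Sangju, Gyeongsangbuk-do; Miryang, Gyeongsangnam-do) have selected and trained participants since 2018. Each record contains the year (연도), sequence number (순번), training institution (보육센터), sex (성별), age (나이), and place of residence (지역).
URL: https://www.data.go.kr/data/15103333/fileData.do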

Alerts

순번 is highly overall correlated with 보육센터 [High correlation]
보육센터 is highly overall correlated with 순번 [High correlation]
성별 is highly imbalanced (56.6%) [Imbalance]
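
The 56.6% figure in the imbalance alert can be checked by hand. The sketch below assumes the score is one minus the Shannon entropy of the category counts, normalised by the maximum entropy for that number of categories. That definition is an assumption about how the report computes the score, although it is consistent with the four 성별 counts listed in this report (765, 217, 11, 3). The function name `imbalance_score` is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class
    dominates. Assumed definition, not confirmed by the report itself.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # a single class has no imbalance to measure
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(n_classes)


# Category counts for 성별 as listed in this report
sex_counts = [765, 217, 11, 3]
score = imbalance_score(sex_counts)
print(f"{score:.1%}")
```

A perfectly balanced column would score 0; the two rare categories (11 and 3 rows) add little entropy, which is why a roughly 77/22 split still scores above 50%.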

Reproduction

Analysis started: 2023-12-12 22:54:28.436373
Analysis finished: 2023-12-12 22:54:29.786004
Duration: 1.35 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연도
Real number (ℝ)

Distinct: 6
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2021.0281
Minimum: 2018
Maximum: 2023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 2018
5-th percentile: 2018
Q1: 2020
Median: 2021
Q3: 2022
95-th percentile: 2023
Maximum: 2023
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.4880471
Coefficient of variation (CV): 0.00073628225
Kurtosis: -0.87118006
Mean: 2021.0281
Median Absolute Deviation (MAD): 1
Skewness: -0.29763589
Sum: 2012944
Variance: 2.2142843
Monotonicity: Increasing
Histogram with fixed size bins (bins=6)

Common values

Value  Count  Frequency (%)
2020   208    20.9%
2021   208    20.9%
2022   208    20.9%
2023   208    20.9%
2019   104    10.4%
2018   60     6.0%

Minimum 10 values

Value  Count  Frequency (%)
2018   60     6.0%
2019   104    10.4%
2020   208    20.9%
2021   208    20.9%
2022   208    20.9%
2023   208    20.9%

Maximum 10 values

Value  Count  Frequency (%)
2023   208    20.9%
2022   208    20.9%
2021   208    20.9%
2020   208    20.9%
2019   104    10.4%
2018   60     6.0%
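
Because 연도 has only six distinct values, its descriptive statistics can be rebuilt from the frequency table alone. The sketch below expands the counts back into one value per observation and recomputes the sum, mean, variance, standard deviation, CV, and median with Python's standard library. It assumes the reported variance is the sample variance (dividing by n - 1), which is what matches the figures in this report.

```python
import math
import statistics

# Frequency table for 연도 as listed in this report: value -> count
year_counts = {2018: 60, 2019: 104, 2020: 208, 2021: 208, 2022: 208, 2023: 208}

# Expand the table back into one value per observation
years = [year for year, count in year_counts.items() for _ in range(count)]

n = len(years)
total = sum(years)
mean = total / n
variance = statistics.variance(years)  # sample variance (divides by n - 1)
std_dev = math.sqrt(variance)
cv = std_dev / mean                    # coefficient of variation
median = statistics.median(years)

print(n, total, median)
print(round(mean, 4), round(variance, 7), round(std_dev, 7))
```

The very small CV simply reflects that calendar years sit far from zero relative to their spread; it says little about the variable itself.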

순번
Real number (ℝ)

HIGH CORRELATION 

Distinct: 208
Distinct (%): 20.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 94.61245
Minimum: 1
Maximum: 208
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 9
Q1: 42
Median: 88
Q3: 146
95-th percentile: 196
Maximum: 208
Range: 207
Interquartile range (IQR): 104

Descriptive statistics

Standard deviation: 60.348206
Coefficient of variation (CV): 0.63784635
Kurtosis: -1.1660684
Mean: 94.61245
Median Absolute Deviation (MAD): 51
Skewness: 0.23321452
Sum: 94234
Variance: 3641.9059
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
1                   6      0.6%
32                  6      0.6%
34                  6      0.6%
35                  6      0.6%
36                  6      0.6%
37                  6      0.6%
38                  6      0.6%
39                  6      0.6%
40                  6      0.6%
41                  6      0.6%
Other values (198)  936    94.0%

Minimum 10 values

Value  Count  Frequency (%)
1      6      0.6%
2      6      0.6%
3      6      0.6%
4      6      0.6%
5      6      0.6%
6      6      0.6%
7      6      0.6%
8      6      0.6%
9      6      0.6%
10     6      0.6%

Maximum 10 values

Value  Count  Frequency (%)
208    4      0.4%
207    4      0.4%
206    4      0.4%
205    4      0.4%
204    4      0.4%
203    4      0.4%
202    4      0.4%
201    4      0.4%
200    4      0.4%
199    4      0.4%

보육센터
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB
Category counts: 전라북도 260, 경상북도 260, 전라남도 208, 경상남도 208, 전라북도(JATC) 20, Other values (2) 40

Length

Max length: 10
Median length: 4
Mean length: 4.3413655
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전라북도(JATC)
2nd row: 전라북도(JATC)
3rd row: 전라북도(JATC)
4th row: 전라북도(JATC)
5th row: 전라북도(JATC)

Common Values

Value           Count  Frequency (%)
전라북도        260    26.1%
경상북도        260    26.1%
전라남도        208    20.9%
경상남도        208    20.9%
전라북도(JATC)  20     2.0%
전라남도(전남대)  20     2.0%
경상남도(ATEC)  20     2.0%
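
The length statistics for 보육센터 can be recovered from the category counts in the Common Values table. The sketch below assumes length is measured in Unicode code points (so each Hangul syllable counts as one character) and that the parentheses are plain ASCII; both are assumptions that happen to reproduce the reported figures.

```python
import statistics

# Category counts for 보육센터 copied from the Common Values table in this report
center_counts = {
    "전라북도": 260,
    "경상북도": 260,
    "전라남도": 208,
    "경상남도": 208,
    "전라북도(JATC)": 20,
    "전라남도(전남대)": 20,
    "경상남도(ATEC)": 20,
}

# One length per observation, measured in Unicode code points
lengths = [len(name) for name, count in center_counts.items() for _ in range(count)]

max_len = max(lengths)
min_len = min(lengths)
median_len = statistics.median(lengths)
mean_len = statistics.mean(lengths)

print(max_len, min_len, median_len, round(mean_len, 7))
```

Only the 60 rows with a parenthesised institution name are longer than four characters, which is why the median stays at 4 while the mean rises slightly above it.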

Length

Histogram of lengths of the category

Common Values (Plot)

Value          Count  Frequency (%)
전라북도       260    26.1%
경상북도       260    26.1%
전라남도       208    20.9%
경상남도       208    20.9%
전라북도(jatc  20     2.0%
전라남도(전남대  20     2.0%
경상남도(atec  20     2.0%

성별
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB
Category counts (category labels not preserved): 765, 217, 11, 3

Length

Max length: 2
Median length: 1
Mean length: 1.0140562
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

Value            Count  Frequency (%)
[not preserved]  765    76.8%
[not preserved]  217    21.8%
[not preserved]  11     1.1%
[not preserved]  3      0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value            Count  Frequency (%)
[not preserved]  776    77.9%
[not preserved]  220    22.1%

나이
Real number (ℝ)

Distinct: 23
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.619478
Minimum: 17
Maximum: 39
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 17
5-th percentile: 22
Q1: 26
Median: 31
Q3: 35
95-th percentile: 39
Maximum: 39
Range: 22
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 5.3774872
Coefficient of variation (CV): 0.17562309
Kurtosis: -0.91658794
Mean: 30.619478
Median Absolute Deviation (MAD): 4
Skewness: -0.18680934
Sum: 30497
Variance: 28.917369
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=23)

Common values

Value              Count  Frequency (%)
27                 71     7.1%
39                 69     6.9%
36                 68     6.8%
31                 64     6.4%
30                 59     5.9%
25                 59     5.9%
35                 58     5.8%
33                 58     5.8%
28                 56     5.6%
37                 52     5.2%
Other values (13)  382    38.4%

Minimum 10 values

Value  Count  Frequency (%)
17     5      0.5%
18     3      0.3%
19     8      0.8%
20     7      0.7%
21     15     1.5%
22     31     3.1%
23     35     3.5%
24     43     4.3%
25     59     5.9%
26     45     4.5%

Maximum 10 values

Value  Count  Frequency (%)
39     69     6.9%
38     47     4.7%
37     52     5.2%
36     68     6.8%
35     58     5.8%
34     51     5.1%
33     58     5.8%
32     51     5.1%
31     64     6.4%
30     59     5.9%

지역
Text

Distinct: 200
Distinct (%): 20.1%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Length

Max length: 11
Median length: 8
Mean length: 8.1014056
Min length: 7

Characters and Unicode

Total characters: 8069
Distinct characters: 124
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
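
As an illustration of that kind of character-property analysis, the sketch below counts Unicode general categories for the five sample values of 지역 listed in this section, using Python's standard `unicodedata` module. It is not the profiling tool's own implementation; the helper name `category_counts` is illustrative. In the standard's abbreviations, `Lo` is "Other Letter" and `Zs` is "Space Separator".

```python
import unicodedata
from collections import Counter

# Five sample values of 지역 taken from this report
samples = [
    "경기도 여주시",
    "전라북도 익산시",
    "경기도 안양시",
    "전라북도 진안군",
    "충청남도 부여군",
]


def category_counts(texts):
    """Count characters by Unicode general category (e.g. 'Lo', 'Zs')."""
    counts = Counter()
    for text in texts:
        # NFC keeps each Hangul syllable as a single precomposed code point
        for ch in unicodedata.normalize("NFC", text):
            counts[unicodedata.category(ch)] += 1
    return counts


counts = category_counts(samples)
print(dict(counts))
```

Each sample contains exactly one space between the province and the city or county, so only two categories appear, matching the two distinct categories reported for the full column.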

Unique

Unique: 58
Unique (%): 5.8%

Sample

1st row: 경기도 여주시
2nd row: 전라북도 익산시
3rd row: 경기도 안양시
4th row: 전라북도 진안군
5th row: 충청남도 부여군

Value               Count  Frequency (%)
전라북도            180    9.1%
경상남도            157    7.9%
전라남도            136    6.9%
경상북도            128    6.5%
경기도              99     5.0%
서울특별시          70     3.5%
전주시              47     2.4%
부산광역시          44     2.2%
상주시              42     2.1%
김제시              39     2.0%
Other values (178)  1041   52.5%

Most occurring characters

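Value               Count  Frequency (%)
[not preserved]     1007   12.5%
[not preserved]     823    10.2%
[not preserved]     771    9.6%
[not preserved]     403    5.0%
[not preserved]     379    4.7%
[not preserved]     374    4.6%
[not preserved]     343    4.3%
[not preserved]     330    4.1%
[not preserved]     316    3.9%
[not preserved]     254    3.1%
Other values (114)  3069   38.0%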

Most occurring categories

Value            Count  Frequency (%)
Other Letter     7062   87.5%
Space Separator  1007   12.5%

Most frequent character per category

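Other Letter

Value               Count  Frequency (%)
[not preserved]     823    11.7%
[not preserved]     771    10.9%
[not preserved]     403    5.7%
[not preserved]     379    5.4%
[not preserved]     374    5.3%
[not preserved]     343    4.9%
[not preserved]     330    4.7%
[not preserved]     316    4.5%
[not preserved]     254    3.6%
[not preserved]     234    3.3%
Other values (113)  2835   40.1%

Space Separator

Value            Count  Frequency (%)
[not preserved]  1007   100.0%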

Most occurring scripts

Value   Count  Frequency (%)
Hangul  7062   87.5%
Common  1007   12.5%

Most frequent character per script

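Hangul

Value               Count  Frequency (%)
[not preserved]     823    11.7%
[not preserved]     771    10.9%
[not preserved]     403    5.7%
[not preserved]     379    5.4%
[not preserved]     374    5.3%
[not preserved]     343    4.9%
[not preserved]     330    4.7%
[not preserved]     316    4.5%
[not preserved]     254    3.6%
[not preserved]     234    3.3%
Other values (113)  2835   40.1%

Common

Value            Count  Frequency (%)
[not preserved]  1007   100.0%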

Most occurring blocks

Value   Count  Frequency (%)
Hangul  7062   87.5%
ASCII   1007   12.5%

Most frequent character per block

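ASCII

Value            Count  Frequency (%)
[not preserved]  1007   100.0%

Hangul

Value               Count  Frequency (%)
[not preserved]     823    11.7%
[not preserved]     771    10.9%
[not preserved]     403    5.7%
[not preserved]     379    5.4%
[not preserved]     374    5.3%
[not preserved]     343    4.9%
[not preserved]     330    4.7%
[not preserved]     316    4.5%
[not preserved]     254    3.6%
[not preserved]     234    3.3%
Other values (113)  2835   40.1%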

Interactions

[Figures: 9 pairwise interaction plots]

Correlations

          연도   순번   보육센터  성별   나이
연도      1.000  0.290  0.208  0.160  0.218
순번      0.290  1.000  0.846  0.227  0.122
보육센터  0.208  0.846  1.000  0.194  0.052
성별      0.160  0.227  0.194  1.000  0.148
나이      0.218  0.122  0.052  0.148  1.000
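
The two "High correlation" alerts correspond to the single large off-diagonal entry in the first correlation matrix of this section. The sketch below scans the upper triangle of that matrix for entries above a cut-off. The threshold of 0.5 is an assumption made for illustration, since the report does not state the value it used, and the helper name `high_correlation_pairs` is hypothetical.

```python
from itertools import combinations

# Column order and values copied from the first correlation matrix in this report
columns = ["연도", "순번", "보육센터", "성별", "나이"]
matrix = [
    [1.000, 0.290, 0.208, 0.160, 0.218],
    [0.290, 1.000, 0.846, 0.227, 0.122],
    [0.208, 0.846, 1.000, 0.194, 0.052],
    [0.160, 0.227, 0.194, 1.000, 0.148],
    [0.218, 0.122, 0.052, 0.148, 1.000],
]


def high_correlation_pairs(columns, matrix, threshold=0.5):
    """Return (col_a, col_b, value) for off-diagonal entries above the threshold.

    The default threshold is an assumed cut-off, not one stated in the report.
    """
    pairs = []
    # combinations() visits each unordered pair once (upper triangle only)
    for i, j in combinations(range(len(columns)), 2):
        if abs(matrix[i][j]) > threshold:
            pairs.append((columns[i], columns[j], matrix[i][j]))
    return pairs


pairs = high_correlation_pairs(columns, matrix)
print(pairs)
```

Only the 순번/보육센터 pair (0.846) exceeds the assumed cut-off; the alert list shows it twice because it is reported once from each variable's side.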

          성별   보육센터
성별      1.000  0.134
보육센터  0.134  1.000

          연도   순번   나이   보육센터  성별
연도      1.000  0.248  0.013  0.464  0.134
순번      0.248  1.000  0.007  0.647  0.137
나이      0.013  0.007  1.000  0.030  0.088
보육센터  0.464  0.647  0.030  1.000  0.134
성별      0.134  0.137  0.088  0.134  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

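First rows

Row  연도  순번  보육센터  성별  나이  지역
0  2018  1  전라북도(JATC)  [not preserved]  28  경기도 여주시
1  2018  2  전라북도(JATC)  [not preserved]  27  전라북도 익산시
2  2018  3  전라북도(JATC)  [not preserved]  32  경기도 안양시
3  2018  4  전라북도(JATC)  [not preserved]  39  전라북도 진안군
4  2018  5  전라북도(JATC)  [not preserved]  24  충청남도 부여군
5  2018  6  전라북도(JATC)  [not preserved]  39  전라북도 정읍시
6  2018  7  전라북도(JATC)  [not preserved]  37  서울특별시 송파구
7  2018  8  전라북도(JATC)  [not preserved]  26  충청북도 옥천군
8  2018  9  전라북도(JATC)  [not preserved]  27  전라남도 순천시
9  2018  10  전라북도(JATC)  [not preserved]  32  인천광역시 남동구

Last rows

Row  연도  순번  보육센터  성별  나이  지역
986  2023  199  경상남도  [not preserved]  37  경상남도 밀양시
987  2023  200  경상남도  [not preserved]  30  경상남도 밀양시
988  2023  201  경상남도  [not preserved]  27  경상남도 산청군
989  2023  202  경상남도  [not preserved]  39  경상남도 합천군
990  2023  203  경상남도  [not preserved]  38  인천광역시 연수구
991  2023  204  경상남도  [not preserved]  26  경상북도 경산시
992  2023  205  경상남도  [not preserved]  37  경상남도 밀양시
993  2023  206  경상남도  [not preserved]  30  경상남도 밀양시
994  2023  207  경상남도  [not preserved]  35  경상남도 밀양시
995  2023  208  경상남도  [not preserved]  31  울산광역시 북구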