Overview

Dataset statistics

Number of variables: 4
Number of observations: 34
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.3 KiB
Average record size in memory: 37.9 B

Variable types

Categorical: 1
Text: 1
Numeric: 2

Dataset

Description: Whether startups have education experience, broken down by startup business age, industry, company form, founder gender, and founder age
URL: https://www.data.go.kr/data/15037551/fileData.do

Alerts

있음 ("yes") is highly overall correlated with 없음 ("no") (High correlation)
없음 ("no") is highly overall correlated with 있음 ("yes") (High correlation)
구분별(2) has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 19:25:31.297728
Analysis finished: 2023-12-12 19:25:32.009905
Duration: 0.71 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
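
The reported duration can be cross-checked from the two timestamps. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the report.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of the report.
started = datetime.fromisoformat("2023-12-12 19:25:31.297728")
finished = datetime.fromisoformat("2023-12-12 19:25:32.009905")

# Elapsed wall-clock time between start and finish, in seconds.
duration_seconds = (finished - started).total_seconds()

# Rounded to two decimals, the way the report displays it.
print(f"{duration_seconds:.2f} seconds")
```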

Variables

구분별(1)
Categorical

Distinct: 5
Distinct (%): 14.7%
Missing: 0
Missing (%): 0.0%
Memory size: 404.0 B

Length

Max length: 6
Median length: 2
Mean length: 2.9411765
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 업력
2nd row: 업력
3rd row: 업력
4th row: 업력
5th row: 업력

Common Values

Value | Count | Frequency (%)
업종 (industry) | 18 | 52.9%
업력 (business age) | 7 | 20.6%
창업자 연령 (founder age) | 5 | 14.7%
기업형태 (company form) | 2 | 5.9%
창업자 성별 (founder gender) | 2 | 5.9%
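
The frequency column appears to be each count divided by the 34 observations, shown to one decimal place. The sketch below checks that assumption; the dictionary restates the counts listed above, and the names `counts` and `shares` are illustrative.

```python
# Counts copied from the Common Values listing for 구분별(1).
counts = {
    "업종": 18,
    "업력": 7,
    "창업자 연령": 5,
    "기업형태": 2,
    "창업자 성별": 2,
}

# The counts should add up to the number of observations in the dataset.
total = sum(counts.values())

# Share of each category as a percentage, rounded to one decimal place.
shares = {value: round(100 * n / total, 1) for value, n in counts.items()}

for value, share in shares.items():
    print(f"{value}: {share}%")
```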

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
업종 | 18 | 43.9%
업력 | 7 | 17.1%
창업자 | 7 | 17.1%
연령 | 5 | 12.2%
기업형태 | 2 | 4.9%
성별 | 2 | 4.9%

구분별(2)
Text

UNIQUE 

Distinct: 34
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 404.0 B

Length

Max length: 24
Median length: 22
Mean length: 7.1764706
Min length: 2

Characters and Unicode

Total characters: 244
Distinct characters: 88
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 34
Unique (%): 100.0%

Sample

1st row: 1년
2nd row: 2년
3rd row: 3년
4th row: 4년
5th row: 5년

Value | Count | Frequency (%)
 | 12 | 15.0%
서비스업 | 6 | 7.5%
개인 | 2 | 2.5%
1년 | 1 | 1.2%
예술 | 1 | 1.2%
50대 | 1 | 1.2%
40대 | 1 | 1.2%
30대 | 1 | 1.2%
20대 | 1 | 1.2%
이하 | 1 | 1.2%
Other values (53) | 53 | 66.2%

Most occurring characters

Value | Count | Frequency (%)
(space) | 46 | 18.9%
 | 23 | 9.4%
 | 12 | 4.9%
, | 8 | 3.3%
 | 8 | 3.3%
 | 7 | 2.9%
 | 6 | 2.5%
 | 6 | 2.5%
 | 6 | 2.5%
 | 6 | 2.5%
Other values (78) | 116 | 47.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 173 | 70.9%
Space Separator | 46 | 18.9%
Decimal Number | 17 | 7.0%
Other Punctuation | 8 | 3.3%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 23 | 13.3%
 | 12 | 6.9%
 | 8 | 4.6%
 | 7 | 4.0%
 | 6 | 3.5%
 | 6 | 3.5%
 | 6 | 3.5%
 | 6 | 3.5%
 | 4 | 2.3%
 | 3 | 1.7%
Other values (68) | 92 | 53.2%

Decimal Number
Value | Count | Frequency (%)
0 | 5 | 29.4%
6 | 2 | 11.8%
5 | 2 | 11.8%
4 | 2 | 11.8%
3 | 2 | 11.8%
2 | 2 | 11.8%
1 | 1 | 5.9%
7 | 1 | 5.9%

Space Separator
Value | Count | Frequency (%)
(space) | 46 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 8 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 173 | 70.9%
Common | 71 | 29.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 23 | 13.3%
 | 12 | 6.9%
 | 8 | 4.6%
 | 7 | 4.0%
 | 6 | 3.5%
 | 6 | 3.5%
 | 6 | 3.5%
 | 6 | 3.5%
 | 4 | 2.3%
 | 3 | 1.7%
Other values (68) | 92 | 53.2%

Common
Value | Count | Frequency (%)
(space) | 46 | 64.8%
, | 8 | 11.3%
0 | 5 | 7.0%
6 | 2 | 2.8%
5 | 2 | 2.8%
4 | 2 | 2.8%
3 | 2 | 2.8%
2 | 2 | 2.8%
1 | 1 | 1.4%
7 | 1 | 1.4%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 173 | 70.9%
ASCII | 71 | 29.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 46 | 64.8%
, | 8 | 11.3%
0 | 5 | 7.0%
6 | 2 | 2.8%
5 | 2 | 2.8%
4 | 2 | 2.8%
3 | 2 | 2.8%
2 | 2 | 2.8%
1 | 1 | 1.4%
7 | 1 | 1.4%

Hangul
Value | Count | Frequency (%)
 | 23 | 13.3%
 | 12 | 6.9%
 | 8 | 4.6%
 | 7 | 4.0%
 | 6 | 3.5%
 | 6 | 3.5%
 | 6 | 3.5%
 | 6 | 3.5%
 | 4 | 2.3%
 | 3 | 1.7%
Other values (68) | 92 | 53.2%

있음
Real number (ℝ)

HIGH CORRELATION 

Distinct: 30
Distinct (%): 88.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.361765
Minimum: 1.6
Maximum: 21.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 438.0 B

Quantile statistics

Minimum: 1.6
5-th percentile: 9.93
Q1: 12.675
Median: 14.5
Q3: 16.675
95-th percentile: 19.235
Maximum: 21.5
Range: 19.9
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.6988135
Coefficient of variation (CV): 0.25754589
Kurtosis: 3.1057348
Mean: 14.361765
Median Absolute Deviation (MAD): 1.9
Skewness: -1.0214353
Sum: 488.3
Variance: 13.681221
Monotonicity: Not monotonic
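
Several of the figures for 있음 are derived from others in the same listing: the range from the minimum and maximum, the IQR from the quartiles, the CV from the standard deviation and mean, and the variance from the standard deviation. The sketch below recomputes them from the rounded values printed in the report, so agreement is only expected up to rounding; the variable names are illustrative.

```python
# Values copied from the report for the variable 있음 (already rounded there).
mean = 14.361765
std = 3.6988135
minimum, maximum = 1.6, 21.5
q1, q3 = 12.675, 16.675

# Range: distance between the largest and smallest value.
value_range = maximum - minimum

# Interquartile range: spread of the middle half of the data.
iqr = q3 - q1

# Coefficient of variation: standard deviation relative to the mean.
cv = std / mean

# Variance: square of the standard deviation.
variance = std ** 2

print(f"range={value_range:.1f} iqr={iqr:.1f} cv={cv:.6f} variance={variance:.4f}")
```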
Histogram with fixed size bins (bins=30)

Common values
Value | Count | Frequency (%)
16.0 | 2 | 5.9%
12.6 | 2 | 5.9%
14.0 | 2 | 5.9%
15.6 | 2 | 5.9%
11.9 | 1 | 2.9%
16.9 | 1 | 2.9%
14.6 | 1 | 2.9%
19.3 | 1 | 2.9%
17.2 | 1 | 2.9%
18.2 | 1 | 2.9%
Other values (20) | 20 | 58.8%

Minimum 10 values
Value | Count | Frequency (%)
1.6 | 1 | 2.9%
9.8 | 1 | 2.9%
10.0 | 1 | 2.9%
10.1 | 1 | 2.9%
10.2 | 1 | 2.9%
10.8 | 1 | 2.9%
11.9 | 1 | 2.9%
12.6 | 2 | 5.9%
12.9 | 1 | 2.9%
13.0 | 1 | 2.9%

Maximum 10 values
Value | Count | Frequency (%)
21.5 | 1 | 2.9%
19.3 | 1 | 2.9%
19.2 | 1 | 2.9%
18.6 | 1 | 2.9%
18.5 | 1 | 2.9%
18.2 | 1 | 2.9%
17.3 | 1 | 2.9%
17.2 | 1 | 2.9%
16.9 | 1 | 2.9%
16.0 | 2 | 5.9%

없음
Real number (ℝ)

HIGH CORRELATION 

Distinct: 30
Distinct (%): 88.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 85.638235
Minimum: 78.5
Maximum: 98.4
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 438.0 B

Quantile statistics

Minimum: 78.5
5-th percentile: 80.765
Q1: 83.325
Median: 85.5
Q3: 87.325
95-th percentile: 90.07
Maximum: 98.4
Range: 19.9
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.6988135
Coefficient of variation (CV): 0.043191145
Kurtosis: 3.1057348
Mean: 85.638235
Median Absolute Deviation (MAD): 1.9
Skewness: 1.0214353
Sum: 2911.7
Variance: 13.681221
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=30)

Common values
Value | Count | Frequency (%)
84.0 | 2 | 5.9%
87.4 | 2 | 5.9%
86.0 | 2 | 5.9%
84.4 | 2 | 5.9%
88.1 | 1 | 2.9%
83.1 | 1 | 2.9%
85.4 | 1 | 2.9%
80.7 | 1 | 2.9%
82.8 | 1 | 2.9%
81.8 | 1 | 2.9%
Other values (20) | 20 | 58.8%

Minimum 10 values
Value | Count | Frequency (%)
78.5 | 1 | 2.9%
80.7 | 1 | 2.9%
80.8 | 1 | 2.9%
81.4 | 1 | 2.9%
81.5 | 1 | 2.9%
81.8 | 1 | 2.9%
82.7 | 1 | 2.9%
82.8 | 1 | 2.9%
83.1 | 1 | 2.9%
84.0 | 2 | 5.9%

Maximum 10 values
Value | Count | Frequency (%)
98.4 | 1 | 2.9%
90.2 | 1 | 2.9%
90.0 | 1 | 2.9%
89.9 | 1 | 2.9%
89.8 | 1 | 2.9%
89.2 | 1 | 2.9%
88.1 | 1 | 2.9%
87.4 | 2 | 5.9%
87.1 | 1 | 2.9%
87.0 | 1 | 2.9%

Interactions

[Four interaction plots; images not included]

Correlations

 | 구분별(1) | 구분별(2) | 있음 | 없음
구분별(1) | 1.000 | 1.000 | 0.000 | 0.000
구분별(2) | 1.000 | 1.000 | 1.000 | 1.000
있음 | 0.000 | 1.000 | 1.000 | 1.000
없음 | 0.000 | 1.000 | 1.000 | 1.000

 | 있음 | 없음 | 구분별(1)
있음 | 1.000 | -1.000 | 0.000
없음 | -1.000 | 1.000 | 0.000
구분별(1) | 0.000 | 0.000 | 1.000
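
The -1.000 entry between 있음 and 없음 is what one would expect if the two columns are complementary percentages that sum to 100, which the rows in the Sample section suggest. The sketch below checks this on the 20 rows shown there (not the full 34), using a hand-written Pearson coefficient; the report does not state which correlation measure each matrix uses, so this is only an illustration, and `pearson` is a helper defined here rather than part of any library.

```python
from math import sqrt

# (있음, 없음) pairs copied from the 20 rows shown in the report's Sample section.
pairs = [
    (17.2, 82.8), (12.6, 87.4), (19.2, 80.8), (15.1, 84.9), (18.6, 81.4),
    (10.0, 90.0), (10.1, 89.9), (9.8, 90.2), (14.4, 85.6), (10.2, 89.8),
    (11.9, 88.1), (15.6, 84.4), (13.6, 86.4), (14.0, 86.0), (17.3, 82.7),
    (13.0, 87.0), (18.5, 81.5), (15.0, 85.0), (13.8, 86.2), (15.2, 84.8),
]


def pearson(xs, ys):
    """Pearson correlation coefficient, written out step by step."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


yes_values = [a for a, _ in pairs]
no_values = [b for _, b in pairs]

# Check that each displayed row sums to 100, allowing for floating-point error.
all_complementary = all(abs(a + b - 100.0) < 1e-9 for a, b in pairs)

# If y = 100 - x for every row, the linear correlation is exactly -1.
r = pearson(yes_values, no_values)
print(all_complementary, round(r, 3))
```

The same relationship would explain why the two variables share a standard deviation, range, and IQR, while their means add up to 100 and their skewness values differ only in sign.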

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

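First rows
Index | 구분별(1) | 구분별(2) | 있음 (yes) | 없음 (no)
0 | 업력 (business age) | 1년 (1 year) | 17.2 | 82.8
1 | 업력 (business age) | 2년 (2 years) | 12.6 | 87.4
2 | 업력 (business age) | 3년 (3 years) | 19.2 | 80.8
3 | 업력 (business age) | 4년 (4 years) | 15.1 | 84.9
4 | 업력 (business age) | 5년 (5 years) | 18.6 | 81.4
5 | 업력 (business age) | 6년 (6 years) | 10.0 | 90.0
6 | 업력 (business age) | 7년 (7 years) | 10.1 | 89.9
7 | 업종 (industry) | 농업, 임업 및 어업 (agriculture, forestry and fishing) | 9.8 | 90.2
8 | 업종 (industry) | 광업 (mining) | 14.4 | 85.6
9 | 업종 (industry) | 제조업 (manufacturing) | 10.2 | 89.8

Last rows
Index | 구분별(1) | 구분별(2) | 있음 (yes) | 없음 (no)
24 | 업종 (industry) | 수리 및 기타 개인 서비스업 (repair and other personal services) | 11.9 | 88.1
25 | 기업형태 (company form) | 개인 (sole proprietorship) | 15.6 | 84.4
26 | 기업형태 (company form) | 법인 (corporation) | 13.6 | 86.4
27 | 창업자 성별 (founder gender) | 남성 (male) | 14.0 | 86.0
28 | 창업자 성별 (founder gender) | 여성 (female) | 17.3 | 82.7
29 | 창업자 연령 (founder age) | 20대 이하 (20s or younger) | 13.0 | 87.0
30 | 창업자 연령 (founder age) | 30대 (30s) | 18.5 | 81.5
31 | 창업자 연령 (founder age) | 40대 (40s) | 15.0 | 85.0
32 | 창업자 연령 (founder age) | 50대 (50s) | 13.8 | 86.2
33 | 창업자 연령 (founder age) | 60대 이상 (60s or older) | 15.2 | 84.8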