Overview

Dataset statistics

Number of variables: 5
Number of observations: 25
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.2 KiB
Average record size in memory: 48.3 B

Variable types

Categorical: 1
Text: 1
Numeric: 3

Dataset

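Description: Korea Institute of Startup and Entrepreneurship Development (창업진흥원) data on startups by form (sole proprietorship, corporation) and years in business, and on startups by form (sole proprietorship, corporation) and industry (statistics from the 2020 Survey of Startups, 창업기업실태조사)
URL: https://www.data.go.kr/data/15048993/fileData.do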

Alerts

기업수 is highly overall correlated with 개인 and 2 other fields (High correlation)
개인 is highly overall correlated with 기업수 and 2 other fields (High correlation)
법인 is highly overall correlated with 기업수 and 2 other fields (High correlation)
구분별(1) is highly overall correlated with 기업수 and 2 other fields (High correlation)
구분별(2) has unique values (Unique)
기업수 has unique values (Unique)
개인 has unique values (Unique)
법인 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 10:01:48.108267
Analysis finished: 2023-12-12 10:01:49.604684
Duration: 1.5 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

구분별(1) (classification 1)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 8.0%
Missing: 0
Missing (%): 0.0%
Memory size: 332.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 업력
2nd row: 업력
3rd row: 업력
4th row: 업력
5th row: 업력

Common Values

Value                     Count  Frequency (%)
업종 (industry)            18     72.0%
업력 (years in business)   7      28.0%
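
The frequency and distinct percentages above follow directly from the counts. A minimal sketch, assuming only the two counts reported in the table (the list `values` is rebuilt from those counts, not read from the original file):

```python
from collections import Counter

# Rebuild the 구분별(1) column from the reported counts:
# 18 rows of 업종 (industry) and 7 rows of 업력 (years in business).
values = ["업종"] * 18 + ["업력"] * 7

counts = Counter(values)
total = sum(counts.values())

# Share of each category in percent, rounded to one decimal as in the report.
shares = {value: round(100 * count / total, 1) for value, count in counts.items()}

# "Distinct (%)" is the number of distinct values over the number of rows.
distinct_pct = round(100 * len(counts) / total, 1)

print(shares)
print(distinct_pct)
```

With 2 distinct values over 25 rows, the distinct share is the 8.0% shown in the variable summary.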

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot; 업종 18 (72.0%), 업력 7 (28.0%)]

구분별(2) (classification 2)
Text

UNIQUE

Distinct: 25
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 332.0 B

Length

Max length: 24
Median length: 21
Mean length: 8.6
Min length: 2

Characters and Unicode

Total characters: 215
Distinct characters: 82
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
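
As an illustration of those character properties, the sketch below counts characters by Unicode general category using Python's standard `unicodedata` module. The two labels are values of 구분별(2) that appear in this report's sample rows; the helper name `category_counts` and the `LONG_NAMES` mapping are introduced here for illustration only and are not part of the profiling tool.

```python
import unicodedata
from collections import Counter

# Long names for the four general categories that occur in this column.
LONG_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Nd": "Decimal Number",
}

def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

# Two values of 구분별(2) taken from the sample rows of this report.
labels = ["1년", "전문, 과학 및 기술 서비스업"]

totals = Counter()
for label in labels:
    totals.update(category_counts(label))

for code, count in totals.most_common():
    print(LONG_NAMES.get(code, code), count)
```

Hangul syllables fall under Other Letter, which is why that category dominates the column.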

Unique

Unique: 25
Unique (%): 100.0%

Sample

1st row: 1년
2nd row: 2년
3rd row: 3년
4th row: 4년
5th row: 5년

Value               Count  Frequency (%)
(value not shown)   12     17.4%
서비스업             6      8.7%
1년                 1      1.4%
관리                1      1.4%
숙박                1      1.4%
음식점업             1      1.4%
정보통신업           1      1.4%
금융                1      1.4%
보험업              1      1.4%
부동산업             1      1.4%
Other values (43)   43     62.3%

Most occurring characters

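Value               Count  Frequency (%)
(space)             44     20.5%
(not shown)         23     10.7%
(not shown)         12     5.6%
,                   8      3.7%
(not shown)         8      3.7%
(not shown)         7      3.3%
(not shown)         6      2.8%
(not shown)         6      2.8%
(not shown)         6      2.8%
(not shown)         4      1.9%
Other values (72)   91     42.3%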

Most occurring categories

Value               Count  Frequency (%)
Other Letter        156    72.6%
Space Separator     44     20.5%
Other Punctuation   8      3.7%
Decimal Number      7      3.3%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
(not shown)         23     14.7%
(not shown)         12     7.7%
(not shown)         8      5.1%
(not shown)         7      4.5%
(not shown)         6      3.8%
(not shown)         6      3.8%
(not shown)         6      3.8%
(not shown)         4      2.6%
(not shown)         3      1.9%
(not shown)         3      1.9%
Other values (63)   78     50.0%

Decimal Number
Value  Count  Frequency (%)
1      1      14.3%
2      1      14.3%
3      1      14.3%
4      1      14.3%
5      1      14.3%
6      1      14.3%
7      1      14.3%

Space Separator
Value    Count  Frequency (%)
(space)  44     100.0%

Other Punctuation
Value  Count  Frequency (%)
,      8      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  156    72.6%
Common  59     27.4%

Most frequent character per script

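Hangul
Value               Count  Frequency (%)
(not shown)         23     14.7%
(not shown)         12     7.7%
(not shown)         8      5.1%
(not shown)         7      4.5%
(not shown)         6      3.8%
(not shown)         6      3.8%
(not shown)         6      3.8%
(not shown)         4      2.6%
(not shown)         3      1.9%
(not shown)         3      1.9%
Other values (63)   78     50.0%

Common
Value    Count  Frequency (%)
(space)  44     74.6%
,        8      13.6%
1        1      1.7%
2        1      1.7%
3        1      1.7%
4        1      1.7%
5        1      1.7%
6        1      1.7%
7        1      1.7%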

Most occurring blocks

Value   Count  Frequency (%)
Hangul  156    72.6%
ASCII   59     27.4%

Most frequent character per block

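ASCII
Value    Count  Frequency (%)
(space)  44     74.6%
,        8      13.6%
1        1      1.7%
2        1      1.7%
3        1      1.7%
4        1      1.7%
5        1      1.7%
6        1      1.7%
7        1      1.7%

Hangul
Value               Count  Frequency (%)
(not shown)         23     14.7%
(not shown)         12     7.7%
(not shown)         8      5.1%
(not shown)         7      4.5%
(not shown)         6      3.8%
(not shown)         6      3.8%
(not shown)         6      3.8%
(not shown)         4      2.6%
(not shown)         3      1.9%
(not shown)         3      1.9%
Other values (63)   78     50.0%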

기업수 (number of enterprises)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 25
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 245735.52
Minimum: 544
Maximum: 806781
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 357.0 B

Quantile statistics

Minimum: 544
5-th percentile: 4848.8
Q1: 69584
Median: 184837
Q3: 342302
95-th percentile: 652569
Maximum: 806781
Range: 806237
Interquartile range (IQR): 272718
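
The last two rows are derived from the quantiles above them. A small sketch of that arithmetic, using only the 기업수 figures reported here (variable names are illustrative):

```python
# Quantiles of 기업수 as reported in this section.
minimum, q1, q3, maximum = 544, 69584, 342302, 806781

# Range: distance between the smallest and largest observation.
value_range = maximum - minimum

# Interquartile range: spread of the middle 50% of observations.
iqr = q3 - q1

print(value_range, iqr)
```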

Descriptive statistics

Standard deviation: 229880.85
Coefficient of variation (CV): 0.93548076
Kurtosis: 0.11923532
Mean: 245735.52
Median Absolute Deviation (MAD): 118202
Skewness: 1.005817
Sum: 6143388
Variance: 5.2845205 × 10^10
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=25)]
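
Several of these descriptive statistics are tied together by simple identities: the mean is the sum over the row count, the variance is the squared standard deviation, and the coefficient of variation is the standard deviation over the mean. The sketch below checks those identities against the rounded 기업수 figures in this report; the tolerance `REL_TOL` is an assumption chosen to absorb the report's rounding.

```python
import math

# Figures copied from the 기업수 statistics in this report.
n_rows = 25
total = 6143388
mean = 245735.52
std = 229880.85
variance = 5.2845205e10
cv = 0.93548076

# Loose tolerance, because the report prints about 8 significant digits.
REL_TOL = 1e-6

mean_ok = math.isclose(total / n_rows, mean, rel_tol=REL_TOL)
variance_ok = math.isclose(std ** 2, variance, rel_tol=REL_TOL)
cv_ok = math.isclose(std / mean, cv, rel_tol=REL_TOL)

print(mean_ok, variance_ok, cv_ok)
```

A CV close to 1 means the standard deviation is almost as large as the mean, which fits the wide range of counts across rows.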
Common values

Value               Count  Frequency (%)
659405              1      4.0%
625225              1      4.0%
184837              1      4.0%
94114               1      4.0%
26325               1      4.0%
137390              1      4.0%
92429               1      4.0%
122154              1      4.0%
127451              1      4.0%
17961               1      4.0%
Other values (15)   15     60.0%

10 smallest values

Value   Count  Frequency (%)
544     1      4.0%
4846    1      4.0%
4860    1      4.0%
17961   1      4.0%
26325   1      4.0%
66635   1      4.0%
69584   1      4.0%
92429   1      4.0%
94114   1      4.0%
122154  1      4.0%

10 largest values

Value   Count  Frequency (%)
806781  1      4.0%
659405  1      4.0%
625225  1      4.0%
574606  1      4.0%
493884  1      4.0%
408608  1      4.0%
342302  1      4.0%
290907  1      4.0%
257463  1      4.0%
251363  1      4.0%

개인 (sole proprietorships)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 25
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 214255.44
Minimum: 4
Maximum: 706458
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 357.0 B

Quantile statistics

Minimum: 4
5-th percentile: 758
Q1: 62282
Median: 168662
Q3: 294647
95-th percentile: 582830.2
Maximum: 706458
Range: 706454
Interquartile range (IQR): 232365

Descriptive statistics

Standard deviation: 208034.36
Coefficient of variation (CV): 0.97096419
Kurtosis: 0.030825503
Mean: 214255.44
Median Absolute Deviation (MAD): 125985
Skewness: 1.0217478
Sum: 5356386
Variance: 4.3278295 × 10^10
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=25)]
Common values

Value               Count  Frequency (%)
587150              1      4.0%
550351              1      4.0%
181151              1      4.0%
90022               1      4.0%
26054               1      4.0%
131245              1      4.0%
73298               1      4.0%
85374               1      4.0%
88149               1      4.0%
7833                1      4.0%
Other values (15)   15     60.0%

10 smallest values

Value  Count  Frequency (%)
4      1      4.0%
230    1      4.0%
2870   1      4.0%
7833   1      4.0%
26054  1      4.0%
40834  1      4.0%
62282  1      4.0%
73298  1      4.0%
85374  1      4.0%
88149  1      4.0%

10 largest values

Value   Count  Frequency (%)
706458  1      4.0%
587150  1      4.0%
565551  1      4.0%
550351  1      4.0%
431031  1      4.0%
354694  1      4.0%
294647  1      4.0%
247303  1      4.0%
246969  1      4.0%
213351  1      4.0%

법인 (corporations)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 25
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 31480.08
Minimum: 271
Maximum: 100323
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 357.0 B

Quantile statistics

Minimum: 271
5-th percentile: 646.4
Q1: 4856
Median: 28750
Q3: 47655
95-th percentile: 74350.2
Maximum: 100323
Range: 100052
Interquartile range (IQR): 42799

Descriptive statistics

Standard deviation: 28774.188
Coefficient of variation (CV): 0.91404431
Kurtosis: -0.45733619
Mean: 31480.08
Median Absolute Deviation (MAD): 23894
Skewness: 0.68617147
Sum: 787002
Variance: 8.279539 × 10^8
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=25)]
Common values

Value               Count  Frequency (%)
72255               1      4.0%
74874               1      4.0%
3686                1      4.0%
4092                1      4.0%
271                 1      4.0%
6145                1      4.0%
19131               1      4.0%
36780               1      4.0%
39302               1      4.0%
10128               1      4.0%
Other values (15)   15     60.0%

10 smallest values

Value  Count  Frequency (%)
271    1      4.0%
314    1      4.0%
1976   1      4.0%
3686   1      4.0%
4092   1      4.0%
4353   1      4.0%
4856   1      4.0%
6145   1      4.0%
9055   1      4.0%
10128  1      4.0%

10 largest values

Value   Count  Frequency (%)
100323  1      4.0%
74874   1      4.0%
72255   1      4.0%
68035   1      4.0%
62853   1      4.0%
53914   1      4.0%
47655   1      4.0%
46144   1      4.0%
43938   1      4.0%
39302   1      4.0%

Interactions

[Figures: 9 interaction plots; images not preserved]

Correlations

[Figure: correlation heatmap]

            구분별(1)  구분별(2)  기업수   개인    법인
구분별(1)    1.000     1.000     0.883   0.944   0.659
구분별(2)    1.000     1.000     1.000   1.000   1.000
기업수       0.883     1.000     1.000   0.996   0.801
개인         0.944     1.000     0.996   1.000   0.790
법인         0.659     1.000     0.801   0.790   1.000

[Figure: correlation heatmap]

            기업수   개인    법인    구분별(1)
기업수       1.000   0.992   0.785   0.584
개인         0.992   1.000   0.728   0.641
법인         0.785   0.728   1.000   0.562
구분별(1)    0.584   0.641   0.562   1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows

Row  구분별(1)  구분별(2)           기업수   개인     법인
0    업력       1년                 659405   587150   72255
1    업력       2년                 625225   550351   74874
2    업력       3년                 493884   431031   62853
3    업력       4년                 408608   354694   53914
4    업력       5년                 342302   294647   47655
5    업력       6년                 290907   246969   43938
6    업력       7년                 251363   213351   38012
7    업종       농업, 임업 및 어업   4860     4        4856
8    업종       광업                544      230      314
9    업종       제조업              236697   168662   68035

Last rows

Row  구분별(1)  구분별(2)                                   기업수   개인     법인
15   업종       숙박 및 음식점업                             574606   565551   9055
16   업종       정보통신업                                  69584    40834    28750
17   업종       금융 및 보험업                               17961    7833     10128
18   업종       부동산업                                    127451   88149    39302
19   업종       전문, 과학 및 기술 서비스업                   122154   85374    36780
20   업종       사업시설 관리, 사업 지원 및 임대 서비스업      92429    73298    19131
21   업종       교육 서비스업                                137390   131245   6145
22   업종       보건업 및 사회복지 서비스업                   26325    26054    271
23   업종       예술, 스포츠 및 여가관련 서비스업             94114    90022    4092
24   업종       수리 및 기타 개인 서비스업                    184837   181151   3686
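
In every sample row shown, 기업수 equals 개인 plus 법인, which is consistent with the High correlation alerts raised for these columns. The sketch below checks that relationship for the 20 rows visible in the sample; rows 10 to 14 are not shown in this report, so nothing is asserted about them. The tuple layout and the name `sample_rows` are illustrative.

```python
# (row index, 기업수, 개인, 법인) for the rows shown in the sample above.
sample_rows = [
    (0, 659405, 587150, 72255),
    (1, 625225, 550351, 74874),
    (2, 493884, 431031, 62853),
    (3, 408608, 354694, 53914),
    (4, 342302, 294647, 47655),
    (5, 290907, 246969, 43938),
    (6, 251363, 213351, 38012),
    (7, 4860, 4, 4856),
    (8, 544, 230, 314),
    (9, 236697, 168662, 68035),
    (15, 574606, 565551, 9055),
    (16, 69584, 40834, 28750),
    (17, 17961, 7833, 10128),
    (18, 127451, 88149, 39302),
    (19, 122154, 85374, 36780),
    (20, 92429, 73298, 19131),
    (21, 137390, 131245, 6145),
    (22, 26325, 26054, 271),
    (23, 94114, 90022, 4092),
    (24, 184837, 181151, 3686),
]

# Collect the indices of rows where the total is not the sum of its parts.
mismatches = [
    index
    for index, total, individual, corporate in sample_rows
    if total != individual + corporate
]

print(mismatches)
```

The reported column sums agree with this as well: 5356386 + 787002 = 6143388.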