Overview

Dataset statistics

Number of variables: 6
Number of observations: 2895
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 141.5 KiB
Average record size in memory: 50.0 B
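The average record size can be cross-checked from the total memory and the number of observations. A minimal sketch, assuming the report divides total bytes by the row count and that 1 KiB is 1024 bytes; because the total is itself rounded to 141.5 KiB, the result only matches the reported 50.0 B approximately.

```python
# Figures copied from the dataset statistics above.
total_kib = 141.5
n_observations = 2895

# Assumption: average record size = total bytes / number of rows.
total_bytes = total_kib * 1024
avg_record_bytes = total_bytes / n_observations

# Prints a value close to 50 B; the small gap comes from rounding of the total.
print(f"{avg_record_bytes:.2f} B per record")
```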

Variable types

Categorical: 5
Numeric: 1

Dataset

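Description: Status of the startup commercialization support program (창업사업화 지원사업) for companies selected since 2021, broken down by industry (manufacturing / non-manufacturing) and by technology classification (machinery and materials, chemicals and textiles, bio and food, electrical and electronics, information and communications, knowledge S/W, environment and energy, other)
Author: 중소벤처기업진흥공단 (Korea SMEs and Startups Agency)
URL: https://www.data.go.kr/data/15073337/fileData.do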

Alerts

지원내용 (support content) has constant value "창업사업화 지원" [Constant]
기술분류 (technology classification) is highly overall correlated with 업종 (industry) [High correlation]
업종 is highly overall correlated with 기술분류 [High correlation]
업체 일련번호 (company serial number) is highly overall correlated with 지원년도 (support year) [High correlation]
지원년도 is highly overall correlated with 업체 일련번호 [High correlation]
업체 일련번호 has unique values [Unique]
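The Constant and Unique alerts can be reproduced by counting distinct values per column. The sketch below is illustrative only: `column_alerts` is a hypothetical helper, not part of ydata-profiling, and the three rows are taken from the first and last sample rows shown at the end of this report.

```python
# Three rows copied from the sample tables of this report (indices 0, 2885, 2894).
rows = [
    {"지원년도": "2021", "지원내용": "창업사업화 지원", "업체 일련번호": 101328},
    {"지원년도": "2023", "지원내용": "창업사업화 지원", "업체 일련번호": 20126118},
    {"지원년도": "2023", "지원내용": "창업사업화 지원", "업체 일련번호": 20122213},
]

def column_alerts(rows):
    """Hypothetical helper: flag columns that are constant or all-unique."""
    alerts = {}
    for column in rows[0]:
        values = [row[column] for row in rows]
        distinct = len(set(values))
        if distinct == 1:
            alerts[column] = "Constant"   # one value repeated in every row
        elif distinct == len(values):
            alerts[column] = "Unique"     # no value appears twice
    return alerts

alerts = column_alerts(rows)
print(alerts)
```

On the full dataset the same rule gives Distinct = 1 for 지원내용 and Distinct = 2895 for 업체 일련번호, which is what the alerts report.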

Reproduction

Analysis started: 2023-12-12 22:34:52.959447
Analysis finished: 2023-12-12 22:34:53.487445
Duration: 0.53 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

지원년도 (support year)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.7 KiB

Value    Count
2021     1065
2022     915
2023     915

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021
2nd row: 2021
3rd row: 2021
4th row: 2021
5th row: 2021

Common Values

Value    Count    Frequency (%)
2021     1065     36.8%
2022     915      31.6%
2023     915      31.6%
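The Frequency (%) column and the Distinct (%) figure both follow from the counts and the 2895 observations. A short check, using only numbers that appear in this report:

```python
# Counts copied from the Common Values table for 지원년도.
counts = {"2021": 1065, "2022": 915, "2023": 915}
total = sum(counts.values())  # should equal the number of observations

# Share of each value, rounded to one decimal as in the report.
percentages = {value: round(100 * count / total, 1) for value, count in counts.items()}

# Distinct (%) = number of distinct values / number of observations.
distinct_pct = round(100 * len(counts) / total, 1)

print(total, percentages, distinct_pct)
```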

Length

Histogram of lengths of the category

Common Values (Plot)

Value    Count    Frequency (%)
2021     1065     36.8%
2022     915      31.6%
2023     915      31.6%

지원내용 (support content)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.7 KiB

Value                                                 Count
창업사업화 지원 (startup commercialization support)   2895

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 창업사업화 지원
2nd row: 창업사업화 지원
3rd row: 창업사업화 지원
4th row: 창업사업화 지원
5th row: 창업사업화 지원

Common Values

Value              Count    Frequency (%)
창업사업화 지원    2895     100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value         Count    Frequency (%)
창업사업화    2895     50.0%
지원          2895     50.0%
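The 50.0% split in this last table is what results from counting whitespace-separated tokens rather than whole values. A minimal sketch, assuming that is how the table was built (the report text does not state it):

```python
from collections import Counter

# The column holds the same two-word value in all 2895 rows.
values = ["창업사업화 지원"] * 2895

# Split each value on whitespace and count the resulting tokens.
word_counts = Counter(word for value in values for word in value.split())
total_words = sum(word_counts.values())
shares = {word: round(100 * count / total_words, 1) for word, count in word_counts.items()}

print(word_counts, shares)
```

Each token appears 2895 times out of 5790 tokens in total, hence 50.0% each, while the value as a whole still accounts for 100.0% of rows.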

지원지역 (support region)
Categorical

Distinct: 19
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 22.7 KiB

Value                Count
안산 (Ansan)         420
서울 (Seoul)         385
경북 (Gyeongbuk)     165
광주 (Gwangju)       165
충남 (Chungnam)      165
Other values (14)    1595

Length

Max length: 4
Median length: 2
Mean length: 2.0794473
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강원
2nd row: 강원
3rd row: 강원
4th row: 강원
5th row: 강원

Common Values

Value               Count    Frequency (%)
안산 (Ansan)        420      14.5%
서울 (Seoul)        385      13.3%
경북 (Gyeongbuk)    165      5.7%
광주 (Gwangju)      165      5.7%
충남 (Chungnam)     165      5.7%
경남 (Gyeongnam)    140      4.8%
대구 (Daegu)        140      4.8%
부산 (Busan)        140      4.8%
전북 (Jeonbuk)      140      4.8%
인천 (Incheon)      125      4.3%
Other values (9)    910      31.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value               Count    Frequency (%)
안산                420      14.5%
서울                385      13.3%
경북                165      5.7%
광주                165      5.7%
충남                165      5.7%
경남                140      4.8%
대구                140      4.8%
부산                140      4.8%
전북                140      4.8%
대전                125      4.3%
Other values (9)    910      31.4%

업체 일련번호 (company serial number)
Real number (ℝ)

HIGH CORRELATION  UNIQUE

Distinct: 2895
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12731064
Minimum: 101201
Maximum: 20126582
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 25.6 KiB

Quantile statistics

Minimum: 101201
5-th percentile: 101345.7
Q1: 101924.5
Median: 20040215
Q3: 20122158
95-th percentile: 20125822
Maximum: 20126582
Range: 20025381
Interquartile range (IQR): 20020233
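Range and IQR are derived from the other quantile statistics. A quick check with the values listed above; the IQR computed from the displayed Q1 and Q3 comes out half a unit above the reported figure, presumably because the displayed quantiles are rounded.

```python
# Quantile statistics copied from the list above.
minimum, q1, q3, maximum = 101201, 101924.5, 20122158, 20126582

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # IQR = Q3 - Q1

print(value_range, iqr)
```

Note how wide the IQR is relative to the median absolute deviation: the serial numbers fall into two clusters (six-digit values near 101201 and eight-digit values near 20 million), so Q1 and Q3 land in different clusters.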

Descriptive statistics

Standard deviation: 9636235.4
Coefficient of variation (CV): 0.75690731
Kurtosis: -1.7005754
Mean: 12731064
Median Absolute Deviation (MAD): 84355
Skewness: -0.54821973
Sum: 3.685643 × 10^10
Variance: 9.2857034 × 10^13
Monotonicity: Not monotonic
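Several of these descriptive statistics are functions of the others, which gives a way to confirm the exponents in Sum and Variance. A sketch using the rounded values shown in the report, so agreement is only expected to about six significant digits:

```python
# Values copied from the report (already rounded for display).
n = 2895
mean = 12731064
std = 9636235.4

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the squared standard deviation
total = mean * n       # sum of all values

print(f"CV={cv:.8f}  variance={variance:.7e}  sum={total:.6e}")
```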
Histogram with fixed size bins (bins=50)

Value                  Count    Frequency (%)
101328                 1        < 0.1%
20016918               1        < 0.1%
20041615               1        < 0.1%
20042259               1        < 0.1%
20042669               1        < 0.1%
20040831               1        < 0.1%
20035223               1        < 0.1%
20016870               1        < 0.1%
20016888               1        < 0.1%
20017096               1        < 0.1%
Other values (2885)    2885     99.7%

Minimum 10 values

Value       Count    Frequency (%)
101201      1        < 0.1%
101202      1        < 0.1%
101203      1        < 0.1%
101204      1        < 0.1%
101205      1        < 0.1%
101206      1        < 0.1%
101207      1        < 0.1%
101208      1        < 0.1%
101209      1        < 0.1%
101210      1        < 0.1%

Maximum 10 values

Value       Count    Frequency (%)
20126582    1        < 0.1%
20126567    1        < 0.1%
20126539    1        < 0.1%
20126531    1        < 0.1%
20126527    1        < 0.1%
20126517    1        < 0.1%
20126516    1        < 0.1%
20126506    1        < 0.1%
20126498    1        < 0.1%
20126497    1        < 0.1%

업종 (industry)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.7 KiB

Value                           Count
제조업 (manufacturing)          2012
비제조업 (non-manufacturing)    883

Length

Max length: 4
Median length: 3
Mean length: 3.3050086
Min length: 3
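The mean length is the count-weighted average of the string lengths of the two values. A small check using the counts reported for this variable, assuming length is measured in characters (Python's `len` on the string):

```python
# Counts copied from the 업종 value table.
counts = {"제조업": 2012, "비제조업": 883}

# Weighted average of character lengths: 3 for 제조업, 4 for 비제조업.
total_chars = sum(len(value) * count for value, count in counts.items())
mean_length = total_chars / sum(counts.values())

print(total_chars, round(mean_length, 7))
```

Because more than half of the rows hold the three-character value, the median length is 3 while the mean sits slightly above it.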

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 제조업
2nd row: 제조업
3rd row: 제조업
4th row: 제조업
5th row: 제조업

Common Values

Value       Count    Frequency (%)
제조업      2012     69.5%
비제조업    883      30.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value       Count    Frequency (%)
제조업      2012     69.5%
비제조업    883      30.5%

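기술분류 (technology classification)
Categorical

HIGH CORRELATION

Distinct: 8
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 22.7 KiB

Value                                         Count
전기전자 (electrical and electronics)         640
기계재료 (machinery and materials)            517
생명식품 (bio and food)                       506
지식S/W (knowledge S/W)                       420
정보통신 (information and communications)     379
Other values (3)                              433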
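The summary above keeps the five most frequent values and folds the rest into "Other values (3)". The sketch below reproduces that grouping from the full counts listed under Common Values; `top_n_with_other` is a hypothetical helper written for illustration, not a ydata-profiling function.

```python
# Full counts for 기술분류, copied from the Common Values table.
counts = {
    "전기전자": 640, "기계재료": 517, "생명식품": 506, "지식S/W": 420,
    "정보통신": 379, "화공섬유": 229, "환경에너지": 120, "기타": 84,
}

def top_n_with_other(counts, n=5):
    """Keep the n most frequent values and sum the remainder into one row."""
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    head, tail = ranked[:n], ranked[n:]
    summary = list(head)
    if tail:
        summary.append((f"Other values ({len(tail)})", sum(c for _, c in tail)))
    return summary

summary = top_n_with_other(counts)
for label, count in summary:
    print(label, count)
```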

Length

Max length: 5
Median length: 4
Mean length: 4.1284974
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 생명식품
2nd row: 생명식품
3rd row: 생명식품
4th row: 생명식품
5th row: 생명식품

Common Values

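Value                                         Count    Frequency (%)
전기전자 (electrical and electronics)         640      22.1%
기계재료 (machinery and materials)            517      17.9%
생명식품 (bio and food)                       506      17.5%
지식S/W (knowledge S/W)                       420      14.5%
정보통신 (information and communications)     379      13.1%
화공섬유 (chemicals and textiles)             229      7.9%
환경에너지 (environment and energy)           120      4.1%
기타 (other)                                  84       2.9%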

Length

Histogram of lengths of the category

Common Values (Plot)

Value         Count    Frequency (%)
전기전자      640      22.1%
기계재료      517      17.9%
생명식품      506      17.5%
지식s/w       420      14.5%
정보통신      379      13.1%
화공섬유      229      7.9%
환경에너지    120      4.1%
기타          84       2.9%

Interactions


Correlations

                 지원년도    지원지역    업체 일련번호    업종     기술분류
지원년도         1.000       0.000       1.000            0.221    0.428
지원지역         0.000       1.000       0.000            0.302    0.358
업체 일련번호    1.000       0.000       1.000            0.281    0.278
업종             0.221       0.302       0.281            1.000    1.000
기술분류         0.428       0.358       0.278            1.000    1.000

            기술분류    업종     지원년도    지원지역
기술분류    1.000       0.999    0.302       0.158
업종        0.999       1.000    0.362       0.267
지원년도    0.302       0.362    1.000       0.000
지원지역    0.158       0.267    0.000       1.000
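
A coefficient near 1.000 between 업종 and 기술분류 means that knowing one category almost determines the other. Cramér's V is one standard measure of association between two categorical variables; whether the matrix above was computed with it is an assumption, since this report text does not name the method. The sketch below is a plain, uncorrected implementation run on toy labels (not the dataset) to show the two extremes.

```python
from collections import Counter
from math import sqrt

def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_totals, y_totals = Counter(xs), Counter(ys)
    chi2 = 0.0
    for x, x_count in x_totals.items():
        for y, y_count in y_totals.items():
            expected = x_count * y_count / n      # count expected under independence
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(x_totals), len(y_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Toy data: the second label is fully determined by the first.
dependent = cramers_v(["A", "A", "B", "B", "C", "C"], ["x", "x", "y", "y", "x", "x"])
# Toy data: every combination occurs equally often, so there is no association.
independent = cramers_v(["A", "A", "B", "B"], ["x", "y", "x", "y"])

print(dependent, independent)
```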

                 업체 일련번호    지원년도    지원지역    업종     기술분류
업체 일련번호    1.000            1.000       0.000       0.182    0.208
지원년도         1.000            1.000       0.000       0.362    0.302
지원지역         0.000            0.000       1.000       0.267    0.158
업종             0.182            0.362       0.267       1.000    0.999
기술분류         0.208            0.302       0.158       0.999    1.000

Missing values

A simple visualization of nullity by column.

Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

        지원년도    지원내용           지원지역    업체 일련번호    업종      기술분류
0       2021        창업사업화 지원    강원        101328           제조업    생명식품
1       2021        창업사업화 지원    강원        101327           제조업    생명식품
2       2021        창업사업화 지원    강원        101326           제조업    생명식품
3       2021        창업사업화 지원    강원        101325           제조업    생명식품
4       2021        창업사업화 지원    강원        101324           제조업    생명식품
5       2021        창업사업화 지원    강원        101323           제조업    기계재료
6       2021        창업사업화 지원    강원        101322           제조업    생명식품
7       2021        창업사업화 지원    강원        101321           제조업    생명식품
8       2021        창업사업화 지원    강원        101320           제조업    생명식품
9       2021        창업사업화 지원    강원        101319           제조업    환경에너지

Last rows

        지원년도    지원내용           지원지역    업체 일련번호    업종      기술분류
2885    2023        창업사업화 지원    충북        20126118         제조업    환경에너지
2886    2023        창업사업화 지원    충북        20125514         제조업    환경에너지
2887    2023        창업사업화 지원    경북        20125309         제조업    환경에너지
2888    2023        창업사업화 지원    경북        20123049         제조업    환경에너지
2889    2023        창업사업화 지원    제주        20125042         제조업    환경에너지
2890    2023        창업사업화 지원    충남        20122883         제조업    환경에너지
2891    2023        창업사업화 지원    광주        20124469         제조업    환경에너지
2892    2023        창업사업화 지원    강원        20120473         제조업    환경에너지
2893    2023        창업사업화 지원    경남        20122175         제조업    환경에너지
2894    2023        창업사업화 지원    안산        20122213         제조업    환경에너지