Overview

Dataset statistics

Number of variables: 8
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.7 KiB
Average record size in memory: 68.3 B

Variable types

Text: 1
Numeric: 3
Categorical: 4

Dataset

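Description: Status of support (R&D, prototype production, etc.) provided to strengthen the technology commercialization capabilities of small and medium-sized enterprises.
1. Commercialization support (사업화지원): support for commercialization planning, product performance testing, and marketing, intended to improve the marketability of a company's existing technology.
   ① Commercialization planning (사업화기획): technology consulting, management consulting, business model improvement, etc.
   ② Product performance testing (제품성능테스트): prototype production, performance testing, etc.
   ③ Marketing (시장마케팅): market research, marketing strategy development, trade show participation, platform development, etc.
2. Market-friendly functional improvement (시장친화형 기능개선), additional R&D: R&D support for the functional improvements and performance gains needed for commercialization.
Author: 중소벤처기업진흥공단 (Korea SMEs and Startups Agency)
URL: https://www.data.go.kr/data/15071321/fileData.do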

Alerts

세부지원항목 is highly overall correlated with 지원금액(백만원) and 1 other field (High correlation)
지원항목 is highly overall correlated with 지원금액(백만원) and 2 other fields (High correlation)
종업원수 is highly overall correlated with 매출액(백만원) (High correlation)
매출액(백만원) is highly overall correlated with 종업원수 (High correlation)
지원금액(백만원) is highly overall correlated with 지원항목 and 1 other field (High correlation)
업종 is highly overall correlated with 지원항목 (High correlation)
매출액(백만원) has 4 (4.0%) zeros (Zeros)

Reproduction

Analysis started: 2023-12-12 22:19:49.104367
Analysis finished: 2023-12-12 22:19:50.130004
Duration: 1.03 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
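The reported duration follows directly from the two timestamps above. As a small check, the sketch below parses them with the standard library and subtracts; the helper name `elapsed_seconds` is made up for this illustration and is not part of ydata-profiling.

```python
from datetime import datetime

# Timestamp layout used in the "Analysis started/finished" lines.
FMT = "%Y-%m-%d %H:%M:%S.%f"

def elapsed_seconds(started: str, finished: str) -> float:
    """Return the number of seconds between two report timestamps."""
    delta = datetime.strptime(finished, FMT) - datetime.strptime(started, FMT)
    return delta.total_seconds()

duration = elapsed_seconds("2023-12-12 22:19:49.104367",
                           "2023-12-12 22:19:50.130004")
print(f"{duration:.2f} seconds")
```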

Variables

업체명 (company name)
Text

Distinct: 85
Distinct (%): 85.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 13
Median length: 9
Mean length: 4.82
Min length: 2

Characters and Unicode

Total characters: 482
Distinct characters: 72
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
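As a rough illustration of these character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It uses only the five masked names shown in this report's sample rows, so its totals cover those five strings rather than the full column. In the standard's abbreviations, `Po` is Other Punctuation, `Lo` is Other Letter, and `Lu` is Uppercase Letter.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count characters by Unicode general category across all strings."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)

# Masked company names copied from the report's sample rows.
sample = ["울****", "소**", "아***", "아***", "칸*"]
counts = category_counts(sample)

for category, count in counts.most_common():
    print(category, count)
```

The asterisks used for masking fall under `Po`, which is consistent with Other Punctuation dominating the category table for this column.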

Unique

Unique: 72
Unique (%): 72.0%

Sample

1st row: 울****
2nd row: 소**
3rd row: 아***
4th row: 아***
5th row: 칸*
Value | Count | Frequency (%)
(not shown) | 9 | 9.0%
(not shown) | 5 | 5.0%
(not shown) | 5 | 5.0%
(not shown) | 4 | 4.0%
(not shown) | 3 | 3.0%
(not shown) | 3 | 3.0%
(not shown) | 2 | 2.0%
(not shown) | 2 | 2.0%
(not shown) | 2 | 2.0%
(not shown) | 2 | 2.0%
Other values (61) | 63 | 63.0%

Most occurring characters

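Value | Count | Frequency (%)
* | 382 | 79.3%
(not shown) | 9 | 1.9%
(not shown) | 5 | 1.0%
(not shown) | 5 | 1.0%
(not shown) | 4 | 0.8%
(not shown) | 3 | 0.6%
(not shown) | 3 | 0.6%
(not shown) | 2 | 0.4%
(not shown) | 2 | 0.4%
(not shown) | 2 | 0.4%
Other values (62) | 65 | 13.5%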

Most occurring categories

Value | Count | Frequency (%)
Other Punctuation | 382 | 79.3%
Other Letter | 99 | 20.5%
Uppercase Letter | 1 | 0.2%

Most frequent character per category

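Other Letter
Value | Count | Frequency (%)
(not shown) | 9 | 9.1%
(not shown) | 5 | 5.1%
(not shown) | 5 | 5.1%
(not shown) | 4 | 4.0%
(not shown) | 3 | 3.0%
(not shown) | 3 | 3.0%
(not shown) | 2 | 2.0%
(not shown) | 2 | 2.0%
(not shown) | 2 | 2.0%
(not shown) | 2 | 2.0%
Other values (60) | 62 | 62.6%

Other Punctuation
Value | Count | Frequency (%)
* | 382 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
Y | 1 | 100.0%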

Most occurring scripts

Value | Count | Frequency (%)
Common | 382 | 79.3%
Hangul | 99 | 20.5%
Latin | 1 | 0.2%

Most frequent character per script

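Hangul
Value | Count | Frequency (%)
(not shown) | 9 | 9.1%
(not shown) | 5 | 5.1%
(not shown) | 5 | 5.1%
(not shown) | 4 | 4.0%
(not shown) | 3 | 3.0%
(not shown) | 3 | 3.0%
(not shown) | 2 | 2.0%
(not shown) | 2 | 2.0%
(not shown) | 2 | 2.0%
(not shown) | 2 | 2.0%
Other values (60) | 62 | 62.6%

Common
Value | Count | Frequency (%)
* | 382 | 100.0%

Latin
Value | Count | Frequency (%)
Y | 1 | 100.0%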

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 383 | 79.5%
Hangul | 99 | 20.5%

Most frequent character per block

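ASCII
Value | Count | Frequency (%)
* | 382 | 99.7%
Y | 1 | 0.3%

Hangul
Value | Count | Frequency (%)
(not shown) | 9 | 9.1%
(not shown) | 5 | 5.1%
(not shown) | 5 | 5.1%
(not shown) | 4 | 4.0%
(not shown) | 3 | 3.0%
(not shown) | 3 | 3.0%
(not shown) | 2 | 2.0%
(not shown) | 2 | 2.0%
(not shown) | 2 | 2.0%
(not shown) | 2 | 2.0%
Other values (60) | 62 | 62.6%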

종업원수 (number of employees)
Real number (ℝ)

HIGH CORRELATION

Distinct: 34
Distinct (%): 34.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.23
Minimum: 1
Maximum: 264
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 4
Median: 7.5
Q3: 18.25
95th percentile: 62.45
Maximum: 264
Range: 263
Interquartile range (IQR): 14.25

Descriptive statistics

Standard deviation: 31.378064
Coefficient of variation (CV): 1.8211297
Kurtosis: 40.184897
Mean: 17.23
Median Absolute Deviation (MAD): 4.5
Skewness: 5.6770314
Sum: 1723
Variance: 984.58293
Monotonicity: Not monotonic
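Several of these figures are determined by others in the same section: the mean is the sum divided by the number of observations, the variance is the square of the standard deviation, the CV is the standard deviation divided by the mean, and the range and IQR are simple differences. The sketch below recomputes them from the values reported for 종업원수. The function name `derived_stats` is made up for this illustration, and small differences in the last digits are expected because the inputs are already rounded.

```python
def derived_stats(n, total, std, q1, q3, minimum, maximum):
    """Recompute summary values that follow from other reported figures."""
    mean = total / n
    return {
        "mean": mean,
        "variance": std ** 2,       # variance is the squared standard deviation
        "cv": std / mean,           # coefficient of variation
        "iqr": q3 - q1,             # interquartile range
        "range": maximum - minimum,
    }

# Inputs copied from the 종업원수 section of the report.
stats = derived_stats(n=100, total=1723, std=31.378064,
                      q1=4, q3=18.25, minimum=1, maximum=264)

for name, value in stats.items():
    print(f"{name}: {value:.6g}")
```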
Histogram with fixed size bins (bins=34)
Common values

Value | Count | Frequency (%)
7 | 13 | 13.0%
4 | 10 | 10.0%
6 | 9 | 9.0%
1 | 8 | 8.0%
3 | 7 | 7.0%
9 | 7 | 7.0%
25 | 4 | 4.0%
8 | 4 | 4.0%
12 | 3 | 3.0%
17 | 3 | 3.0%
Other values (24) | 32 | 32.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 8 | 8.0%
2 | 2 | 2.0%
3 | 7 | 7.0%
4 | 10 | 10.0%
5 | 1 | 1.0%
6 | 9 | 9.0%
7 | 13 | 13.0%
8 | 4 | 4.0%
9 | 7 | 7.0%
10 | 2 | 2.0%

Maximum 10 values

Value | Count | Frequency (%)
264 | 1 | 1.0%
131 | 1 | 1.0%
73 | 1 | 1.0%
71 | 2 | 2.0%
62 | 1 | 1.0%
43 | 1 | 1.0%
40 | 1 | 1.0%
39 | 1 | 1.0%
36 | 1 | 1.0%
35 | 1 | 1.0%

지역 (region)
Categorical

Distinct: 15
Distinct (%): 15.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 5
Median length: 4
Mean length: 4.14
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 대구광역시
2nd row: 대구광역시
3rd row: 경기도
4th row: 전라북도
5th row: 서울특별시

Common Values

Value | Count | Frequency (%)
경기도 | 30 | 30.0%
서울특별시 | 12 | 12.0%
대전광역시 | 12 | 12.0%
광주광역시 | 7 | 7.0%
충청남도 | 6 | 6.0%
경상남도 | 6 | 6.0%
대구광역시 | 5 | 5.0%
인천광역시 | 4 | 4.0%
경상북도 | 4 | 4.0%
부산광역시 | 3 | 3.0%
Other values (5) | 11 | 11.0%

Length

Histogram of lengths of the category

업종 (industry)
Categorical

HIGH CORRELATION

Distinct: 12
Distinct (%): 12.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 8
Median length: 6
Mean length: 4.29
Min length: 2

Unique

Unique: 2
Unique (%): 2.0%

Sample

1st row: 기계
2nd row: 화공
3rd row: 전기·전자
4th row: 전기·전자
5th row: 바이오·의료

Common Values

Value | Count | Frequency (%)
전기·전자 | 22 | 22.0%
기계 | 20 | 20.0%
정보통신 | 17 | 17.0%
바이오·의료 | 14 | 14.0%
화공 | 6 | 6.0%
전기ㆍ전자 | 6 | 6.0%
기계ㆍ소재 | 4 | 4.0%
지식서비스/기타 | 3 | 3.0%
금속·소재 | 3 | 3.0%
바이오ㆍ의료 | 3 | 3.0%
Other values (2) | 2 | 2.0%
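Some labels in this table differ only in the separator character: 전기·전자 and 전기ㆍ전자, for example, are counted as distinct categories. The sketch below shows one possible way to merge such variants before counting. It assumes the two separators are U+00B7 (MIDDLE DOT) and U+318D (HANGUL LETTER ARAEA), which is how they appear in this text but should be confirmed against the raw file. The helper `normalize_industry` is made up for this illustration, and the two unlisted "Other values" are left out because the report does not show them.

```python
from collections import Counter

MIDDLE_DOT = "\u00b7"  # assumed separator in labels such as 전기·전자
ARAEA = "\u318d"       # assumed separator in labels such as 전기ㆍ전자

def normalize_industry(label: str) -> str:
    """Map the ARAEA separator onto MIDDLE DOT so variants compare equal."""
    return label.replace(ARAEA, MIDDLE_DOT)

# Counts copied from the Common Values table above.
reported = {
    "전기\u00b7전자": 22, "기계": 20, "정보통신": 17, "바이오\u00b7의료": 14,
    "화공": 6, "전기\u318d전자": 6, "기계\u318d소재": 4,
    "지식서비스/기타": 3, "금속\u00b7소재": 3, "바이오\u318d의료": 3,
}

merged = Counter()
for label, count in reported.items():
    merged[normalize_industry(label)] += count

for label, count in merged.most_common():
    print(label, count)
```

Under that assumption the ten listed labels collapse to eight, with 전기·전자 rising to 28 rows and 바이오·의료 to 17.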

Length

Histogram of lengths of the category

매출액(백만원) (sales, million KRW)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 96
Distinct (%): 96.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3359.58
Minimum: -255
Maximum: 22322
Zeros: 4
Zeros (%): 4.0%
Negative: 1
Negative (%): 1.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: -255
5th percentile: 2.85
Q1: 345.25
Median: 1489.5
Q3: 4175.5
95th percentile: 13439.7
Maximum: 22322
Range: 22577
Interquartile range (IQR): 3830.25

Descriptive statistics

Standard deviation: 4784.5769
Coefficient of variation (CV): 1.4241592
Kurtosis: 4.7229959
Mean: 3359.58
Median Absolute Deviation (MAD): 1233.5
Skewness: 2.1664835
Sum: 335958
Variance: 22892176
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0 | 4 | 4.0%
1782 | 2 | 2.0%
3386 | 1 | 1.0%
1285 | 1 | 1.0%
1007 | 1 | 1.0%
7697 | 1 | 1.0%
10918 | 1 | 1.0%
1014 | 1 | 1.0%
1560 | 1 | 1.0%
270 | 1 | 1.0%
Other values (86) | 86 | 86.0%

Minimum 10 values

Value | Count | Frequency (%)
-255 | 1 | 1.0%
0 | 4 | 4.0%
3 | 1 | 1.0%
13 | 1 | 1.0%
21 | 1 | 1.0%
30 | 1 | 1.0%
46 | 1 | 1.0%
55 | 1 | 1.0%
72 | 1 | 1.0%
105 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
22322 | 1 | 1.0%
22128 | 1 | 1.0%
18542 | 1 | 1.0%
15635 | 1 | 1.0%
14536 | 1 | 1.0%
13382 | 1 | 1.0%
12068 | 1 | 1.0%
11707 | 1 | 1.0%
10918 | 1 | 1.0%
10782 | 1 | 1.0%

지원금액(백만원) (support amount, million KRW)
Real number (ℝ)

HIGH CORRELATION

Distinct: 18
Distinct (%): 18.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 70.059145
Minimum: 11.511
Maximum: 98
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 11.511
5th percentile: 52.723487
Q1: 54
Median: 57
Q3: 96
95th percentile: 97
Maximum: 98
Range: 86.489
Interquartile range (IQR): 42

Descriptive statistics

Standard deviation: 20.905056
Coefficient of variation (CV): 0.29839154
Kurtosis: -1.2109007
Mean: 70.059145
Median Absolute Deviation (MAD): 3
Skewness: 0.2623446
Sum: 7005.9145
Variance: 437.02136
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=18)
Common values

Value | Count | Frequency (%)
54.0 | 27 | 27.0%
57.0 | 23 | 23.0%
97.0 | 14 | 14.0%
96.0 | 10 | 10.0%
95.0 | 8 | 8.0%
98.0 | 4 | 4.0%
60.0 | 2 | 2.0%
52.5 | 2 | 2.0%
48.0 | 1 | 1.0%
90.0 | 1 | 1.0%
Other values (8) | 8 | 8.0%

Minimum 10 values

Value | Count | Frequency (%)
11.511 | 1 | 1.0%
39.0 | 1 | 1.0%
48.0 | 1 | 1.0%
52.5 | 2 | 2.0%
52.73525 | 1 | 1.0%
53.6 | 1 | 1.0%
53.99325 | 1 | 1.0%
54.0 | 27 | 27.0%
56.166 | 1 | 1.0%
56.909 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
98.0 | 4 | 4.0%
97.0 | 14 | 14.0%
96.0 | 10 | 10.0%
95.0 | 8 | 8.0%
90.0 | 1 | 1.0%
80.0 | 1 | 1.0%
60.0 | 2 | 2.0%
57.0 | 23 | 23.0%
56.909 | 1 | 1.0%
56.166 | 1 | 1.0%

지원항목 (support category)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 10
Median length: 5
Mean length: 6.9
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 사업화지원
2nd row: 사업화지원
3rd row: 사업화지원
4th row: 사업화지원
5th row: 사업화지원

Common Values

Value | Count | Frequency (%)
사업화지원 | 62 | 62.0%
시장친화형 기능개선 | 38 | 38.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
사업화지원 | 62 | 44.9%
시장친화형 | 38 | 27.5%
기능개선 | 38 | 27.5%
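These percentages are consistent with a word-level count rather than a row-level one: 시장친화형 기능개선 contributes two words per row, so the denominator becomes 62 + 38 + 38 = 138 instead of 100. The sketch below reproduces the figures under the assumption that values are split on whitespace. The function name `word_frequencies` is made up for this illustration, and the report's actual tokenization may differ.

```python
from collections import Counter

def word_frequencies(value_counts):
    """Split each category value on whitespace and count words,
    weighting every word by the number of rows holding that value."""
    words = Counter()
    for value, count in value_counts.items():
        for word in value.split():
            words[word] += count
    total = sum(words.values())
    return {word: (count, round(100 * count / total, 1))
            for word, count in words.items()}

# Row counts copied from the Common Values table for 지원항목.
freqs = word_frequencies({"사업화지원": 62, "시장친화형 기능개선": 38})

for word, (count, pct) in freqs.items():
    print(word, count, f"{pct}%")
```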

세부지원항목 (detailed support item)
Categorical

HIGH CORRELATION

Distinct: 19
Distinct (%): 19.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 24
Median length: 23
Mean length: 9.51
Min length: 4

Unique

Unique: 10
Unique (%): 10.0%

Sample

1st row: 시장마케팅, 제품성능
2nd row: 시장마케팅, 제품성능
3rd row: 사업화기획, 제품성능
4th row: 사업화기획, 제품성능
5th row: 사업화기획, 제품성능

Common Values

Value | Count | Frequency (%)
추가R&D | 38 | 38.0%
사업화기획, 제품성능 | 15 | 15.0%
시장마케팅, 제품성능 | 13 | 13.0%
시장마케팅, 제품성능 | 9 | 9.0%
제품성능, 시장마케팅 | 6 | 6.0%
사업화기획제품성능 | 3 | 3.0%
제품성능시장마케팅 | 2 | 2.0%
시장마케팅, 사업화기획 | 2 | 2.0%
사업화기획, 시장마케팅, 제품성능 | 2 | 2.0%
제품성능 | 1 | 1.0%
Other values (9) | 9 | 9.0%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
제품성능 | 58 | 35.4%
시장마케팅 | 39 | 23.8%
추가r&d | 38 | 23.2%
사업화기획 | 22 | 13.4%
사업화기획제품성능 | 3 | 1.8%
제품성능시장마케팅 | 2 | 1.2%
사업화기획시장마케팅 | 2 | 1.2%

Interactions


Correlations

 | 업체명 | 종업원수 | 지역 | 업종 | 매출액(백만원) | 지원금액(백만원) | 지원항목 | 세부지원항목
업체명 | 1.000 | 0.932 | 0.904 | 0.768 | 0.933 | 0.000 | 0.000 | 0.766
종업원수 | 0.932 | 1.000 | 0.000 | 0.000 | 0.946 | 0.000 | 0.000 | 0.000
지역 | 0.904 | 0.000 | 1.000 | 0.353 | 0.000 | 0.000 | 0.000 | 0.316
업종 | 0.768 | 0.000 | 0.353 | 1.000 | 0.000 | 0.818 | 0.736 | 0.246
매출액(백만원) | 0.933 | 0.946 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000
지원금액(백만원) | 0.000 | 0.000 | 0.000 | 0.818 | 0.000 | 1.000 | 1.000 | 0.903
지원항목 | 0.000 | 0.000 | 0.000 | 0.736 | 0.000 | 1.000 | 1.000 | 1.000
세부지원항목 | 0.766 | 0.000 | 0.316 | 0.246 | 0.000 | 0.903 | 1.000 | 1.000

 | 업종 | 지역 | 세부지원항목 | 지원항목
업종 | 1.000 | 0.129 | 0.073 | 0.553
지역 | 0.129 | 1.000 | 0.094 | 0.000
세부지원항목 | 0.073 | 0.094 | 1.000 | 0.909
지원항목 | 0.553 | 0.000 | 0.909 | 1.000

 | 종업원수 | 매출액(백만원) | 지원금액(백만원) | 지역 | 업종 | 지원항목 | 세부지원항목
종업원수 | 1.000 | 0.757 | -0.019 | 0.000 | 0.000 | 0.000 | 0.000
매출액(백만원) | 0.757 | 1.000 | 0.018 | 0.000 | 0.000 | 0.000 | 0.000
지원금액(백만원) | -0.019 | 0.018 | 1.000 | 0.000 | 0.449 | 0.979 | 0.652
지역 | 0.000 | 0.000 | 0.000 | 1.000 | 0.129 | 0.000 | 0.094
업종 | 0.000 | 0.000 | 0.449 | 0.129 | 1.000 | 0.553 | 0.073
지원항목 | 0.000 | 0.000 | 0.979 | 0.000 | 0.553 | 1.000 | 0.909
세부지원항목 | 0.000 | 0.000 | 0.652 | 0.094 | 0.073 | 0.909 | 1.000
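The report does not state which matrix or cutoff produces the "High correlation" alerts. As a sketch only, the code below assumes a cutoff of 0.5 on the absolute value, applied to the last matrix above. Both the cutoff and the helper `high_correlations` are assumptions made for this illustration, not documented settings.

```python
from itertools import combinations

THRESHOLD = 0.5  # assumed cutoff; not stated in the report

columns = ["종업원수", "매출액(백만원)", "지원금액(백만원)",
           "지역", "업종", "지원항목", "세부지원항목"]

# Values copied from the last correlation matrix above.
matrix = [
    [1.000, 0.757, -0.019, 0.000, 0.000, 0.000, 0.000],
    [0.757, 1.000, 0.018, 0.000, 0.000, 0.000, 0.000],
    [-0.019, 0.018, 1.000, 0.000, 0.449, 0.979, 0.652],
    [0.000, 0.000, 0.000, 1.000, 0.129, 0.000, 0.094],
    [0.000, 0.000, 0.449, 0.129, 1.000, 0.553, 0.073],
    [0.000, 0.000, 0.979, 0.000, 0.553, 1.000, 0.909],
    [0.000, 0.000, 0.652, 0.094, 0.073, 0.909, 1.000],
]

def high_correlations(columns, matrix, threshold):
    """Map each column to the other columns whose |correlation| >= threshold."""
    flagged = {name: [] for name in columns}
    for i, j in combinations(range(len(columns)), 2):
        if abs(matrix[i][j]) >= threshold:
            flagged[columns[i]].append(columns[j])
            flagged[columns[j]].append(columns[i])
    # Keep only columns that have at least one flagged partner.
    return {name: others for name, others in flagged.items() if others}

alerts = high_correlations(columns, matrix, THRESHOLD)
for name, others in alerts.items():
    print(name, "->", ", ".join(others))
```

With these assumptions, the flagged pairs match the Alerts section: 지원항목 is linked to three other fields, 지원금액(백만원) and 세부지원항목 to two each, and 지역 to none.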

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

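First rows

Row | 업체명 | 종업원수 | 지역 | 업종 | 매출액(백만원) | 지원금액(백만원) | 지원항목 | 세부지원항목
0 | 울**** | 12 | 대구광역시 | 기계 | 3386 | 60.0 | 사업화지원 | 시장마케팅, 제품성능
1 | 소** | 16 | 대구광역시 | 화공 | 7837 | 57.0 | 사업화지원 | 시장마케팅, 제품성능
2 | 아*** | 7 | 경기도 | 전기·전자 | 1254 | 57.0 | 사업화지원 | 사업화기획, 제품성능
3 | 아*** | 3 | 전라북도 | 전기·전자 | 1534 | 57.0 | 사업화지원 | 사업화기획, 제품성능
4 | 칸* | 8 | 서울특별시 | 바이오·의료 | 46 | 57.0 | 사업화지원 | 사업화기획, 제품성능
5 | 페*** | 6 | 대전광역시 | 화공 | 1782 | 57.0 | 사업화지원 | 시장마케팅, 제품성능
6 | 비*** | 1 | 대구광역시 | 기계 | 271 | 57.0 | 사업화지원 | 사업화기획, 제품성능
7 | 나******* | 4 | 경기도 | 정보통신 | 387 | 57.0 | 사업화지원 | 사업화기획, 제품성능
8 | 스**** | 43 | 경기도 | 전기·전자 | 18542 | 57.0 | 사업화지원 | 사업화기획, 제품성능
9 | 정*** | 33 | 충청남도 | 화공 | 12068 | 56.166 | 사업화지원 | 시장마케팅, 제품성능

Last rows

Row | 업체명 | 종업원수 | 지역 | 업종 | 매출액(백만원) | 지원금액(백만원) | 지원항목 | 세부지원항목
90 | 황********* | 2 | 전라북도 | 바이오·의료 | 13 | 95.0 | 시장친화형 기능개선 | 추가R&D
91 | 담******* | 4 | 광주광역시 | 기계 | 427 | 95.0 | 시장친화형 기능개선 | 추가R&D
92 | 에*** | 20 | 경상남도 | 바이오·의료 | 350 | 95.0 | 시장친화형 기능개선 | 추가R&D
93 | 에** | 7 | 경기도 | 바이오·의료 | 1720 | 96.0 | 시장친화형 기능개선 | 추가R&D
94 | 우**** | 7 | 경기도 | 전기·전자 | 1730 | 96.0 | 시장친화형 기능개선 | 추가R&D
95 | 텐*** | 6 | 강원도 | 정보통신 | 1149 | 96.0 | 시장친화형 기능개선 | 추가R&D
96 | 비***** | 9 | 광주광역시 | 정보통신 | 841 | 95.0 | 시장친화형 기능개선 | 추가R&D
97 | 제******* | 1 | 서울특별시 | 바이오·의료 | 1711 | 95.0 | 시장친화형 기능개선 | 추가R&D
98 | 다**** | 17 | 충청남도 | 바이오·의료 | 4808 | 95.0 | 시장친화형 기능개선 | 추가R&D
99 | 제****** | 33 | 서울특별시 | 전기·전자 | 11707 | 95.0 | 시장친화형 기능개선 | 추가R&D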