Overview

Dataset statistics

Number of variables: 7
Number of observations: 109
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.2 KiB
Average record size in memory: 58.2 B

Variable types

Numeric: 1
Categorical: 4
Text: 2

Dataset

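Description: Open data on the support status (industry, main product, support type, etc.) of the "Data Valuation Support Program" (데이터가치평가 지원사업), newly launched by the Korea SMEs and Startups Agency in 2023.
Author: Korea SMEs and Startups Agency (중소벤처기업진흥공단)
URL: https://www.data.go.kr/data/15122090/fileData.do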

Alerts

지원유형 has constant value "디지털역량진단" (Constant)
지원년도 has constant value "2023년" (Constant)
업종 is highly imbalanced (53.3%) (Imbalance)
순번 has unique values (Unique)

Reproduction

Analysis started: 2024-03-14 11:04:10.041552
Analysis finished: 2024-03-14 11:04:11.246000
Duration: 1.2 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

순번
Real number (ℝ)

UNIQUE 

Distinct: 109
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 55
Minimum: 1
Maximum: 109
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 6.4
Q1: 28
Median: 55
Q3: 82
95-th percentile: 103.6
Maximum: 109
Range: 108
Interquartile range (IQR): 54

Descriptive statistics

Standard deviation: 31.609598
Coefficient of variation (CV): 0.57471996
Kurtosis: -1.2
Mean: 55
Median Absolute Deviation (MAD): 27
Skewness: 0
Sum: 5995
Variance: 999.16667
Monotonicity: Strictly increasing
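
The figures above are consistent with 순번 being simply the integers 1 through 109. As a quick cross-check, the sketch below recomputes several of them with Python's standard library. That the column holds exactly 1..109 is an assumption inferred from the report (109 distinct values, minimum 1, maximum 109, strictly increasing). `percentile` is a hypothetical helper that uses linear interpolation and may not match the profiler's internals in every case.

```python
import statistics

# Assumption: the 순번 column is exactly the integers 1..109.
values = list(range(1, 110))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (divides by n - 1)
std = statistics.stdev(values)          # sample standard deviation
cv = std / mean                         # coefficient of variation
median = statistics.median(values)
# Median absolute deviation: median of |x - median(x)|
mad = statistics.median(abs(v - median) for v in values)
# Quartiles, treating min and max as the 0th and 100th percentiles
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")


def percentile(sorted_values, p):
    """Linearly interpolated percentile of sorted data, with p in [0, 1]."""
    pos = p * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


p5 = percentile(values, 0.05)
p95 = percentile(values, 0.95)

print(total, mean, round(variance, 5), round(std, 6), round(cv, 8))
print(q1, q2, q3, q3 - q1, mad, round(p5, 1), round(p95, 1))
```

Under that assumption, the sample variance also follows from the closed form n(n + 1)/12 for the integers 1..n, which gives 109 × 110 / 12 ≈ 999.17.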
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | 0.9%
70 | 1 | 0.9%
81 | 1 | 0.9%
80 | 1 | 0.9%
79 | 1 | 0.9%
78 | 1 | 0.9%
77 | 1 | 0.9%
76 | 1 | 0.9%
75 | 1 | 0.9%
74 | 1 | 0.9%
Other values (99) | 99 | 90.8%

Value | Count | Frequency (%)
1 | 1 | 0.9%
2 | 1 | 0.9%
3 | 1 | 0.9%
4 | 1 | 0.9%
5 | 1 | 0.9%
6 | 1 | 0.9%
7 | 1 | 0.9%
8 | 1 | 0.9%
9 | 1 | 0.9%
10 | 1 | 0.9%

Value | Count | Frequency (%)
109 | 1 | 0.9%
108 | 1 | 0.9%
107 | 1 | 0.9%
106 | 1 | 0.9%
105 | 1 | 0.9%
104 | 1 | 0.9%
103 | 1 | 0.9%
102 | 1 | 0.9%
101 | 1 | 0.9%
100 | 1 | 0.9%

업종
Categorical

IMBALANCE 

Distinct: 7
Distinct (%): 6.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1000.0 B

제조업: 71
정보통신업: 31
기술 서비스업: 3
건설업: 1
도소매업: 1
Other values (2): 2

Length

Max length: 8
Median length: 3
Mean length: 3.7798165
Min length: 3

Unique

Unique: 4
Unique (%): 3.7%
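
In this report, "Distinct" counts the different values in a column, while "Unique" counts the values that occur exactly once. The sketch below reproduces both figures, along with the length statistics, from the category counts reported for 업종. Rebuilding the column from those counts is an illustrative assumption; the original CSV is not read here.

```python
from collections import Counter
import statistics

# Category counts for 업종 as reported in this profile.
reported_counts = {
    "제조업": 71,
    "정보통신업": 31,
    "기술 서비스업": 3,
    "건설업": 1,
    "도소매업": 1,
    "운수 및 창고업": 1,
    "도매 및 소매업": 1,
}

# Rebuild a column with the same value frequencies (row order is irrelevant here).
rows = [value for value, n in reported_counts.items() for _ in range(n)]
n_rows = len(rows)

freq = Counter(rows)
distinct = len(freq)                              # number of different values
unique = sum(1 for c in freq.values() if c == 1)  # values seen exactly once

lengths = [len(value) for value in rows]          # character length of each row
mean_length = statistics.mean(lengths)

print(distinct, f"{distinct / n_rows:.1%}")
print(unique, f"{unique / n_rows:.1%}")
print(max(lengths), statistics.median(lengths), round(mean_length, 7), min(lengths))
```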

Sample

1st row: 제조업
2nd row: 정보통신업
3rd row: 정보통신업
4th row: 제조업
5th row: 제조업

Common Values

Value | Count | Frequency (%)
제조업 | 71 | 65.1%
정보통신업 | 31 | 28.4%
기술 서비스업 | 3 | 2.8%
건설업 | 1 | 0.9%
도소매업 | 1 | 0.9%
운수 및 창고업 | 1 | 0.9%
도매 및 소매업 | 1 | 0.9%
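
The 53.3% imbalance figure reported for 업종 can be reproduced from these counts if the score is taken to be one minus the normalized Shannon entropy of the category distribution. That definition is an assumption here: it matches the reported number, but the profiler's own implementation was not consulted. `imbalance_score` is a hypothetical helper name.

```python
import math

# Category counts for 업종, taken from the Common Values table.
counts = [71, 31, 3, 1, 1, 1, 1]


def imbalance_score(class_counts):
    """Return 1 - H / log(k), where H is the Shannon entropy and k the class count.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    Returning 0 for a single class is a choice made for this sketch.
    """
    total = sum(class_counts)
    n_classes = len(class_counts)
    if n_classes <= 1:
        return 0.0
    entropy = -sum(
        (c / total) * math.log(c / total) for c in class_counts if c > 0
    )
    return 1 - entropy / math.log(n_classes)


score = imbalance_score(counts)
print(f"{score:.1%}")
```

With seven equally frequent classes the score would be 0; here 제조업 and 정보통신업 together account for 102 of 109 rows, which pushes it above one half.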

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
제조업 | 71 | 61.2%
정보통신업 | 31 | 26.7%
기술 | 3 | 2.6%
서비스업 | 3 | 2.6%
및 | 2 | 1.7%
건설업 | 1 | 0.9%
도소매업 | 1 | 0.9%
운수 | 1 | 0.9%
창고업 | 1 | 0.9%
도매 | 1 | 0.9%
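
The percentages in this table differ from those under Common Values because the entries appear to be whitespace-separated words rather than whole category values, so the denominator is the total word count instead of the row count. The sketch below illustrates that reading; splitting on whitespace is an assumption that happens to match the listed figures, not a confirmed description of the profiler's tokenizer.

```python
from collections import Counter

# Category counts for 업종 as reported in this profile.
reported_counts = {
    "제조업": 71,
    "정보통신업": 31,
    "기술 서비스업": 3,
    "건설업": 1,
    "도소매업": 1,
    "운수 및 창고업": 1,
    "도매 및 소매업": 1,
}

# Count each whitespace-separated word once per row it appears in.
word_counts = Counter()
for value, n_rows in reported_counts.items():
    for word in value.split():
        word_counts[word] += n_rows

total_words = sum(word_counts.values())
shares = {
    word: round(100 * count / total_words, 1)
    for word, count in word_counts.items()
}

for word, count in word_counts.most_common(10):
    print(word, count, f"{shares[word]}%")
```

Under this reading there are 116 words across the 109 rows, which is why 제조업 drops from 65.1% of rows to 61.2% of words.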

주생산품
Text

Distinct: 105
Distinct (%): 96.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1000.0 B

Length

Max length: 17
Median length: 11
Mean length: 6.2568807
Min length: 2

Characters and Unicode

Total characters: 682
Distinct characters: 252
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 102
Unique (%): 93.6%

Sample

1st row: 자동차 부품
2nd row: 플랫폼 서비스
3rd row: 드론 솔루션
4th row: 방산부품
5th row: 건조저장육류

Value | Count | Frequency (%)
부품 | 7 | 4.4%
자동차 | 5 | 3.1%
s/w | 4 | 2.5%
플랫폼 | 3 | 1.9%
공급개발 | 3 | 1.9%
(blank) | 2 | 1.3%
드론 | 2 | 1.3%
자동차용 | 2 | 1.3%
서비스 | 2 | 1.3%
금형 | 2 | 1.3%
Other values (127) | 127 | 79.9%

Most occurring characters

ValueCountFrequency (%)
52
 
7.6%
23
 
3.4%
17
 
2.5%
16
 
2.3%
15
 
2.2%
12
 
1.8%
12
 
1.8%
11
 
1.6%
10
 
1.5%
8
 
1.2%
Other values (242) 506
74.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 558 | 81.8%
Space Separator | 52 | 7.6%
Uppercase Letter | 42 | 6.2%
Lowercase Letter | 20 | 2.9%
Other Punctuation | 8 | 1.2%
Open Punctuation | 1 | 0.1%
Close Punctuation | 1 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
23
 
4.1%
17
 
3.0%
16
 
2.9%
15
 
2.7%
12
 
2.2%
12
 
2.2%
11
 
2.0%
10
 
1.8%
8
 
1.4%
8
 
1.4%
Other values (208) 426
76.3%
Uppercase Letter
Value | Count | Frequency (%)
S | 6 | 14.3%
W | 6 | 14.3%
A | 5 | 11.9%
P | 4 | 9.5%
I | 4 | 9.5%
E | 4 | 9.5%
G | 2 | 4.8%
C | 2 | 4.8%
B | 2 | 4.8%
T | 1 | 2.4%
Other values (6) | 6 | 14.3%
Lowercase Letter
Value | Count | Frequency (%)
h | 3 | 15.0%
s | 3 | 15.0%
e | 3 | 15.0%
y | 2 | 10.0%
t | 2 | 10.0%
r | 1 | 5.0%
a | 1 | 5.0%
u | 1 | 5.0%
i | 1 | 5.0%
n | 1 | 5.0%
Other values (2) | 2 | 10.0%
Other Punctuation
Value | Count | Frequency (%)
/ | 4 | 50.0%
, | 3 | 37.5%
' | 1 | 12.5%
Space Separator
Value | Count | Frequency (%)
(space) | 52 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 1 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 558 | 81.8%
Common | 62 | 9.1%
Latin | 62 | 9.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
23
 
4.1%
17
 
3.0%
16
 
2.9%
15
 
2.7%
12
 
2.2%
12
 
2.2%
11
 
2.0%
10
 
1.8%
8
 
1.4%
8
 
1.4%
Other values (208) 426
76.3%
Latin
Value | Count | Frequency (%)
S | 6 | 9.7%
W | 6 | 9.7%
A | 5 | 8.1%
P | 4 | 6.5%
I | 4 | 6.5%
E | 4 | 6.5%
h | 3 | 4.8%
s | 3 | 4.8%
e | 3 | 4.8%
G | 2 | 3.2%
Other values (18) | 22 | 35.5%
Common
Value | Count | Frequency (%)
(space) | 52 | 83.9%
/ | 4 | 6.5%
, | 3 | 4.8%
( | 1 | 1.6%
) | 1 | 1.6%
' | 1 | 1.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 558 | 81.8%
ASCII | 124 | 18.2%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 52 | 41.9%
S | 6 | 4.8%
W | 6 | 4.8%
A | 5 | 4.0%
/ | 4 | 3.2%
P | 4 | 3.2%
I | 4 | 3.2%
E | 4 | 3.2%
h | 3 | 2.4%
s | 3 | 2.4%
Other values (24) | 33 | 26.6%
Hangul
ValueCountFrequency (%)
23
 
4.1%
17
 
3.0%
16
 
2.9%
15
 
2.7%
12
 
2.2%
12
 
2.2%
11
 
2.0%
10
 
1.8%
8
 
1.4%
8
 
1.4%
Other values (208) 426
76.3%

소재지
Categorical

Distinct: 16
Distinct (%): 14.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1000.0 B

서울: 21
경기: 18
경남: 15
대구: 10
경북: 5
Other values (11): 40

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 0.9%

Sample

1st row: 경남
2nd row: 부산
3rd row: 대구
4th row: 경북
5th row: 경기

Common Values

Value | Count | Frequency (%)
서울 | 21 | 19.3%
경기 | 18 | 16.5%
경남 | 15 | 13.8%
대구 | 10 | 9.2%
경북 | 5 | 4.6%
충남 | 5 | 4.6%
대전 | 5 | 4.6%
인천 | 5 | 4.6%
강원 | 5 | 4.6%
부산 | 4 | 3.7%
Other values (6) | 16 | 14.7%

Length

Histogram of lengths of the category

매출액(백만원)
Text

Distinct: 107
Distinct (%): 98.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1000.0 B

Length

Max length: 5
Median length: 4
Mean length: 3.9724771
Min length: 1

Characters and Unicode

Total characters: 433
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 105
Unique (%): 96.3%

Sample

1st row: 9288
2nd row: 1600
3rd row: 2808
4th row: 5310
5th row: 52694

Value | Count | Frequency (%)
1460 | 2 | 1.8%
11 | 2 | 1.8%
1077 | 1 | 0.9%
33 | 1 | 0.9%
46877 | 1 | 0.9%
9633 | 1 | 0.9%
47315 | 1 | 0.9%
26887 | 1 | 0.9%
11343 | 1 | 0.9%
75972 | 1 | 0.9%
Other values (97) | 97 | 89.0%

Most occurring characters

Value | Count | Frequency (%)
1 | 64 | 14.8%
2 | 48 | 11.1%
3 | 43 | 9.9%
4 | 42 | 9.7%
6 | 42 | 9.7%
0 | 41 | 9.5%
8 | 39 | 9.0%
5 | 39 | 9.0%
7 | 37 | 8.5%
9 | 37 | 8.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 432 | 99.8%
Dash Punctuation | 1 | 0.2%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 64 | 14.8%
2 | 48 | 11.1%
3 | 43 | 10.0%
4 | 42 | 9.7%
6 | 42 | 9.7%
0 | 41 | 9.5%
8 | 39 | 9.0%
5 | 39 | 9.0%
7 | 37 | 8.6%
9 | 37 | 8.6%
Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 433 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 64 | 14.8%
2 | 48 | 11.1%
3 | 43 | 9.9%
4 | 42 | 9.7%
6 | 42 | 9.7%
0 | 41 | 9.5%
8 | 39 | 9.0%
5 | 39 | 9.0%
7 | 37 | 8.5%
9 | 37 | 8.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 433 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 64 | 14.8%
2 | 48 | 11.1%
3 | 43 | 9.9%
4 | 42 | 9.7%
6 | 42 | 9.7%
0 | 41 | 9.5%
8 | 39 | 9.0%
5 | 39 | 9.0%
7 | 37 | 8.5%
9 | 37 | 8.5%

지원유형
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1000.0 B

디지털역량진단: 109

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 디지털역량진단
2nd row: 디지털역량진단
3rd row: 디지털역량진단
4th row: 디지털역량진단
5th row: 디지털역량진단

Common Values

Value | Count | Frequency (%)
디지털역량진단 | 109 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
디지털역량진단 | 109 | 100.0%

지원년도
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1000.0 B

2023년: 109

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023년
2nd row: 2023년
3rd row: 2023년
4th row: 2023년
5th row: 2023년

Common Values

Value | Count | Frequency (%)
2023년 | 109 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2023년 | 109 | 100.0%

Interactions


Correlations

Variable | 순번 | 업종 | 소재지
순번 | 1.000 | 0.349 | 0.323
업종 | 0.349 | 1.000 | 0.515
소재지 | 0.323 | 0.515 | 1.000

Variable | 소재지 | 업종
소재지 | 1.000 | 0.251
업종 | 0.251 | 1.000

Variable | 순번 | 업종 | 소재지
순번 | 1.000 | 0.172 | 0.124
업종 | 0.172 | 1.000 | 0.251
소재지 | 0.124 | 0.251 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

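Index | 순번 | 업종 | 주생산품 | 소재지 | 매출액(백만원) | 지원유형 | 지원년도
0 | 1 | 제조업 | 자동차 부품 | 경남 | 9288 | 디지털역량진단 | 2023년
1 | 2 | 정보통신업 | 플랫폼 서비스 | 부산 | 1600 | 디지털역량진단 | 2023년
2 | 3 | 정보통신업 | 드론 솔루션 | 대구 | 2808 | 디지털역량진단 | 2023년
3 | 4 | 제조업 | 방산부품 | 경북 | 5310 | 디지털역량진단 | 2023년
4 | 5 | 제조업 | 건조저장육류 | 경기 | 52694 | 디지털역량진단 | 2023년
5 | 6 | 정보통신업 | 모션데이터 | 서울 | 723 | 디지털역량진단 | 2023년
6 | 7 | 기술 서비스업 | 잔류농약분석 | 충남 | 6580 | 디지털역량진단 | 2023년
7 | 8 | 제조업 | 자동차 램프류 | 경북 | 24584 | 디지털역량진단 | 2023년
8 | 9 | 정보통신업 | 소프트웨어개발 | 서울 | 689 | 디지털역량진단 | 2023년
9 | 10 | 제조업 | 자동차부품 | 경남 | 3208 | 디지털역량진단 | 2023년

Index | 순번 | 업종 | 주생산품 | 소재지 | 매출액(백만원) | 지원유형 | 지원년도
99 | 100 | 제조업 | 아동복 | 서울 | 13145 | 디지털역량진단 | 2023년
100 | 101 | 제조업 | 미숫가루 | 경기 | 16300 | 디지털역량진단 | 2023년
101 | 102 | 제조업 | Battery Ass'y | 경기 | 18243 | 디지털역량진단 | 2023년
102 | 103 | 제조업 | 피혁제품(지갑,벨트,골프) | 서울 | 10879 | 디지털역량진단 | 2023년
103 | 104 | 제조업 | 컬러콘택트렌즈 | 경기 | 3287 | 디지털역량진단 | 2023년
104 | 105 | 제조업 | 승강기 부품 | 인천 | 328 | 디지털역량진단 | 2023년
105 | 106 | 제조업 | 히팅자켓 | 경기 | 21322 | 디지털역량진단 | 2023년
106 | 107 | 제조업 | 인쇄업 | 서울 | 3154 | 디지털역량진단 | 2023년
107 | 108 | 정보통신업 | 온라인옷수선서비스 | 부산 | 33 | 디지털역량진단 | 2023년
108 | 109 | 도매 및 소매업 | 곤충식품 | 경기 | 19 | 디지털역량진단 | 2023년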