Overview

Dataset statistics

Number of variables: 12
Number of observations: 10000
Missing cells: 68
Missing cells (%): 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.0 MiB
Average record size in memory: 105.0 B
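
The derived figures above can be cross-checked from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (variables × observations) and that the rounded "1.0 MiB" is exactly 2**20 bytes, so the per-record size is only approximate.

```python
# Recompute the derived overview statistics from the raw counts in the report.
n_variables = 12
n_observations = 10_000
missing_cells = 68

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: the rounded "1.0 MiB" is treated as exactly 2**20 bytes.
total_bytes = 1.0 * 2**20
avg_record_bytes = total_bytes / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

The small gap between this estimate and the reported 105.0 B comes from the rounding of the total size.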

Variable types

Numeric: 1
Text: 5
Categorical: 4
DateTime: 2

Dataset

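Description: List of venture companies as of March 29, 2023. Each record consists of the company name, representative, confirmation type, region, address, industry classification, industry name, main product, validity start date, validity end date, and confirming institution.
URL: https://www.data.go.kr/data/15112684/fileData.do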

Alerts

벤처확인기관 is highly overall correlated with 벤처확인유형 (High correlation)
벤처확인유형 is highly overall correlated with 벤처확인기관 (High correlation)
벤처확인기관 is highly imbalanced (97.3%) (Imbalance)
연번 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 18:46:19.169391
Analysis finished: 2023-12-12 18:46:22.667723
Duration: 3.5 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
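
The reported duration is the difference between the two timestamps. A minimal sketch using only the standard library, with the timestamp format string inferred from how the values are printed here:

```python
from datetime import datetime

# Format inferred from the timestamps printed in the Reproduction section.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-12 18:46:19.169391", FMT)
finished = datetime.strptime("2023-12-12 18:46:22.667723", FMT)

# timedelta.total_seconds() keeps the microsecond precision.
duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.1f} seconds")
```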

Variables

연번
Real number (ℝ)

UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17567.328
Minimum: 3
Maximum: 35593
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 3
5-th percentile: 1789.9
Q1: 8673.25
median: 17407.5
Q3: 26613
95-th percentile: 33711.1
Maximum: 35593
Range: 35590
Interquartile range (IQR): 17939.75
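
Range and IQR are simple differences of the quantiles listed above. The sketch below recomputes them and, as an extra illustration that the report itself does not perform, applies the common 1.5 × IQR rule of thumb to see whether the extremes would count as outliers.

```python
# Quantiles of 연번 as reported above.
minimum, q1, q3, maximum = 3, 8673.25, 26613, 35593

value_range = maximum - minimum
iqr = q3 - q1

# Tukey-style fences: an illustrative convention, not part of the report.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
has_outliers = minimum < lower_fence or maximum > upper_fence

print(f"Range: {value_range}, IQR: {iqr}")
print(f"Fences: [{lower_fence}, {upper_fence}], outliers: {has_outliers}")
```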

Descriptive statistics

Standard deviation: 10277.884
Coefficient of variation (CV): 0.58505675
Kurtosis: -1.2035864
Mean: 17567.328
Median Absolute Deviation (MAD): 8949
Skewness: 0.036036354
Sum: 1.7567328 × 10^8
Variance: 1.056349 × 10^8
Monotonicity: Not monotonic
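
Several of these figures are functions of the mean, the standard deviation, and the row count. The sketch below recomputes them from the rounded values printed here, so agreement is only up to rounding.

```python
# Rounded inputs taken from the report.
mean = 17567.328
std = 10277.884
n_observations = 10_000

cv = std / mean               # coefficient of variation
variance = std ** 2           # variance is the squared standard deviation
total = mean * n_observations # sum of all values

print(f"CV: {cv:.8f}")
print(f"Variance: {variance:.6e}")
print(f"Sum: {total:.7e}")
```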
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
4587 | 1 | < 0.1%
18661 | 1 | < 0.1%
24189 | 1 | < 0.1%
18668 | 1 | < 0.1%
11124 | 1 | < 0.1%
10922 | 1 | < 0.1%
6357 | 1 | < 0.1%
9651 | 1 | < 0.1%
18208 | 1 | < 0.1%
6546 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Value | Count | Frequency (%)
3 | 1 | < 0.1%
6 | 1 | < 0.1%
25 | 1 | < 0.1%
26 | 1 | < 0.1%
31 | 1 | < 0.1%
32 | 1 | < 0.1%
36 | 1 | < 0.1%
45 | 1 | < 0.1%
46 | 1 | < 0.1%
47 | 1 | < 0.1%

Value | Count | Frequency (%)
35593 | 1 | < 0.1%
35592 | 1 | < 0.1%
35590 | 1 | < 0.1%
35585 | 1 | < 0.1%
35574 | 1 | < 0.1%
35573 | 1 | < 0.1%
35568 | 1 | < 0.1%
35561 | 1 | < 0.1%
35560 | 1 | < 0.1%
35541 | 1 | < 0.1%

업체명
Text

Distinct: 9973
Distinct (%): 99.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 44
Median length: 39
Mean length: 8.0735
Min length: 1

Characters and Unicode

Total characters: 80735
Distinct characters: 914
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9946
Unique (%): 99.5%

Sample

1st row: ㈜가원폴리텍
2nd row: 다안 스마트 이엔지
3rd row: 주식회사 호반식품
4th row: 뉴통 주식회사
5th row: ㈜에이앤티랩스
ValueCountFrequency (%)
주식회사 4010
 
27.3%
108
 
0.7%
예비창업자 67
 
0.5%
농업회사법인 59
 
0.4%
유한회사 41
 
0.3%
co.,ltd 38
 
0.3%
inc 34
 
0.2%
ltd 25
 
0.2%
co 23
 
0.2%
tech 9
 
0.1%
Other values (10181) 10288
70.0%

Most occurring characters

ValueCountFrequency (%)
5145
 
6.4%
4749
 
5.9%
4503
 
5.6%
4362
 
5.4%
4249
 
5.3%
3797
 
4.7%
3612
 
4.5%
2964
 
3.7%
1615
 
2.0%
) 1106
 
1.4%
Other values (904) 44633
55.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 67469
83.6%
Space Separator 4749
 
5.9%
Other Symbol 3797
 
4.7%
Uppercase Letter 1235
 
1.5%
Close Punctuation 1107
 
1.4%
Open Punctuation 1105
 
1.4%
Lowercase Letter 910
 
1.1%
Other Punctuation 276
 
0.3%
Decimal Number 69
 
0.1%
Dash Punctuation 18
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5145
 
7.6%
4503
 
6.7%
4362
 
6.5%
4249
 
6.3%
3612
 
5.4%
2964
 
4.4%
1615
 
2.4%
1046
 
1.6%
875
 
1.3%
866
 
1.3%
Other values (832) 38232
56.7%
Uppercase Letter
ValueCountFrequency (%)
C 142
11.5%
T 107
 
8.7%
L 105
 
8.5%
E 99
 
8.0%
I 89
 
7.2%
S 86
 
7.0%
N 71
 
5.7%
A 63
 
5.1%
O 63
 
5.1%
M 60
 
4.9%
Other values (15) 350
28.3%
Lowercase Letter
ValueCountFrequency (%)
o 117
12.9%
t 104
11.4%
e 89
9.8%
n 82
9.0%
c 76
8.4%
d 72
 
7.9%
a 59
 
6.5%
i 52
 
5.7%
r 42
 
4.6%
l 33
 
3.6%
Other values (15) 184
20.2%
Decimal Number
ValueCountFrequency (%)
1 20
29.0%
2 11
15.9%
3 9
13.0%
5 9
13.0%
6 4
 
5.8%
9 4
 
5.8%
0 4
 
5.8%
4 3
 
4.3%
8 3
 
4.3%
7 2
 
2.9%
Other Punctuation
ValueCountFrequency (%)
. 184
66.7%
, 66
 
23.9%
& 24
 
8.7%
/ 1
 
0.4%
' 1
 
0.4%
Close Punctuation
ValueCountFrequency (%)
) 1106
99.9%
] 1
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 1104
99.9%
[ 1
 
0.1%
Space Separator
ValueCountFrequency (%)
4749
100.0%
Other Symbol
ValueCountFrequency (%)
3797
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 18
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 71266
88.3%
Common 7324
 
9.1%
Latin 2145
 
2.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5145
 
7.2%
4503
 
6.3%
4362
 
6.1%
4249
 
6.0%
3797
 
5.3%
3612
 
5.1%
2964
 
4.2%
1615
 
2.3%
1046
 
1.5%
875
 
1.2%
Other values (833) 39098
54.9%
Latin
ValueCountFrequency (%)
C 142
 
6.6%
o 117
 
5.5%
T 107
 
5.0%
L 105
 
4.9%
t 104
 
4.8%
E 99
 
4.6%
e 89
 
4.1%
I 89
 
4.1%
S 86
 
4.0%
n 82
 
3.8%
Other values (40) 1125
52.4%
Common
ValueCountFrequency (%)
4749
64.8%
) 1106
 
15.1%
( 1104
 
15.1%
. 184
 
2.5%
, 66
 
0.9%
& 24
 
0.3%
1 20
 
0.3%
- 18
 
0.2%
2 11
 
0.2%
3 9
 
0.1%
Other values (11) 33
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 67469
83.6%
ASCII 9469
 
11.7%
None 3797
 
4.7%

Most frequent character per block

Hangul
ValueCountFrequency (%)
5145
 
7.6%
4503
 
6.7%
4362
 
6.5%
4249
 
6.3%
3612
 
5.4%
2964
 
4.4%
1615
 
2.4%
1046
 
1.6%
875
 
1.3%
866
 
1.3%
Other values (832) 38232
56.7%
ASCII
ValueCountFrequency (%)
4749
50.2%
) 1106
 
11.7%
( 1104
 
11.7%
. 184
 
1.9%
C 142
 
1.5%
o 117
 
1.2%
T 107
 
1.1%
L 105
 
1.1%
t 104
 
1.1%
E 99
 
1.0%
Other values (61) 1652
 
17.4%
None
ValueCountFrequency (%)
3797
100.0%

대표자명
Text

Distinct: 431
Distinct (%): 4.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 19
Median length: 3
Mean length: 3.2328
Min length: 3

Characters and Unicode

Total characters: 32328
Distinct characters: 121
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 243
Unique (%): 2.4%

Sample

1st row: 김**
2nd row: 김**
3rd row: 박**
4th row: 김**
5th row: 윤**
ValueCountFrequency (%)
2028
20.3%
1447
 
14.5%
811
 
8.1%
438
 
4.4%
424
 
4.2%
299
 
3.0%
246
 
2.5%
188
 
1.9%
186
 
1.9%
184
 
1.8%
Other values (421) 3749
37.5%

Most occurring characters

ValueCountFrequency (%)
* 21164
65.5%
2256
 
7.0%
1629
 
5.0%
905
 
2.8%
, 582
 
1.8%
500
 
1.5%
478
 
1.5%
331
 
1.0%
281
 
0.9%
212
 
0.7%
Other values (111) 3990
 
12.3%

Most occurring categories

ValueCountFrequency (%)
Other Punctuation 21746
67.3%
Other Letter 10537
32.6%
Uppercase Letter 45
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2256
21.4%
1629
15.5%
905
 
8.6%
500
 
4.7%
478
 
4.5%
331
 
3.1%
281
 
2.7%
212
 
2.0%
212
 
2.0%
209
 
2.0%
Other values (94) 3524
33.4%
Uppercase Letter
ValueCountFrequency (%)
L 9
20.0%
K 9
20.0%
C 6
13.3%
J 3
 
6.7%
P 3
 
6.7%
B 2
 
4.4%
Z 2
 
4.4%
Y 2
 
4.4%
H 2
 
4.4%
S 2
 
4.4%
Other values (5) 5
11.1%
Other Punctuation
ValueCountFrequency (%)
* 21164
97.3%
, 582
 
2.7%

Most occurring scripts

ValueCountFrequency (%)
Common 21746
67.3%
Hangul 10537
32.6%
Latin 45
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2256
21.4%
1629
15.5%
905
 
8.6%
500
 
4.7%
478
 
4.5%
331
 
3.1%
281
 
2.7%
212
 
2.0%
212
 
2.0%
209
 
2.0%
Other values (94) 3524
33.4%
Latin
ValueCountFrequency (%)
L 9
20.0%
K 9
20.0%
C 6
13.3%
J 3
 
6.7%
P 3
 
6.7%
B 2
 
4.4%
Z 2
 
4.4%
Y 2
 
4.4%
H 2
 
4.4%
S 2
 
4.4%
Other values (5) 5
11.1%
Common
ValueCountFrequency (%)
* 21164
97.3%
, 582
 
2.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 21791
67.4%
Hangul 10537
32.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
* 21164
97.1%
, 582
 
2.7%
L 9
 
< 0.1%
K 9
 
< 0.1%
C 6
 
< 0.1%
J 3
 
< 0.1%
P 3
 
< 0.1%
B 2
 
< 0.1%
Z 2
 
< 0.1%
Y 2
 
< 0.1%
Other values (7) 9
 
< 0.1%
Hangul
ValueCountFrequency (%)
2256
21.4%
1629
15.5%
905
 
8.6%
500
 
4.7%
478
 
4.5%
331
 
3.1%
281
 
2.7%
212
 
2.0%
212
 
2.0%
209
 
2.0%
Other values (94) 3524
33.4%

벤처확인유형
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
혁신성장유형: 6339
연구개발유형: 1858
벤처투자유형: 1698
예비벤처유형: 68
기술평가보증기업(기금): 21

Length

Max length: 14
Median length: 6
Mean length: 6.0254
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 혁신성장유형
2nd row: 혁신성장유형
3rd row: 혁신성장유형
4th row: 벤처투자유형
5th row: 연구개발유형

Common Values

Value | Count | Frequency (%)
혁신성장유형 | 6339 | 63.4%
연구개발유형 | 1858 | 18.6%
벤처투자유형 | 1698 | 17.0%
예비벤처유형 | 68 | 0.7%
기술평가보증기업(기금) | 21 | 0.2%
기술평가대출기업(중진공) | 16 | 0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
혁신성장유형 6339
63.4%
연구개발유형 1858
 
18.6%
벤처투자유형 1698
 
17.0%
예비벤처유형 68
 
0.7%
기술평가보증기업(기금 21
 
0.2%
기술평가대출기업(중진공 16
 
0.2%

지역
Categorical

Distinct: 17
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
경기: 3146
서울: 2848
부산: 487
인천: 464
대전: 401
Other values (12): 2654

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 충북
2nd row: 부산
3rd row: 충북
4th row: 경기
5th row: 경북

Common Values

Value | Count | Frequency (%)
경기 | 3146 | 31.5%
서울 | 2848 | 28.5%
부산 | 487 | 4.9%
인천 | 464 | 4.6%
대전 | 401 | 4.0%
경남 | 388 | 3.9%
대구 | 342 | 3.4%
경북 | 340 | 3.4%
충남 | 336 | 3.4%
충북 | 231 | 2.3%
Other values (7) | 1017 | 10.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
경기 3146
31.5%
서울 2848
28.5%
부산 487
 
4.9%
인천 464
 
4.6%
대전 401
 
4.0%
경남 388
 
3.9%
대구 342
 
3.4%
경북 340
 
3.4%
충남 336
 
3.4%
충북 231
 
2.3%
Other values (7) 1017
 
10.2%

주소
Text

Distinct: 241
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 14
Median length: 13
Mean length: 8.1639
Min length: 7

Characters and Unicode

Total characters: 81639
Distinct characters: 158
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 38
Unique (%): 0.4%

Sample

1st row: 충청북도 청주시
2nd row: 부산광역시 해운대구
3rd row: 충청북도 청주시
4th row: 경기도 고양시
5th row: 경상북도 구미시
ValueCountFrequency (%)
경기도 3146
 
15.7%
서울특별시 2848
 
14.2%
강남구 652
 
3.3%
부산광역시 487
 
2.4%
성남시 476
 
2.4%
인천광역시 464
 
2.3%
화성시 421
 
2.1%
대전광역시 401
 
2.0%
경상남도 388
 
1.9%
대구광역시 342
 
1.7%
Other values (226) 10375
51.9%

Most occurring characters

ValueCountFrequency (%)
10000
 
12.2%
9771
 
12.0%
5244
 
6.4%
5141
 
6.3%
3963
 
4.9%
3716
 
4.6%
3183
 
3.9%
3018
 
3.7%
2949
 
3.6%
2949
 
3.6%
Other values (148) 31705
38.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 71625
87.7%
Space Separator 10000
 
12.2%
Decimal Number 14
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
9771
 
13.6%
5244
 
7.3%
5141
 
7.2%
3963
 
5.5%
3716
 
5.2%
3183
 
4.4%
3018
 
4.2%
2949
 
4.1%
2949
 
4.1%
2431
 
3.4%
Other values (143) 29260
40.9%
Decimal Number
ValueCountFrequency (%)
7 8
57.1%
1 4
28.6%
2 1
 
7.1%
3 1
 
7.1%
Space Separator
ValueCountFrequency (%)
10000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 71625
87.7%
Common 10014
 
12.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
9771
 
13.6%
5244
 
7.3%
5141
 
7.2%
3963
 
5.5%
3716
 
5.2%
3183
 
4.4%
3018
 
4.2%
2949
 
4.1%
2949
 
4.1%
2431
 
3.4%
Other values (143) 29260
40.9%
Common
ValueCountFrequency (%)
10000
99.9%
7 8
 
0.1%
1 4
 
< 0.1%
2 1
 
< 0.1%
3 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 71625
87.7%
ASCII 10014
 
12.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
10000
99.9%
7 8
 
0.1%
1 4
 
< 0.1%
2 1
 
< 0.1%
3 1
 
< 0.1%
Hangul
ValueCountFrequency (%)
9771
 
13.6%
5244
 
7.3%
5141
 
7.2%
3963
 
5.5%
3716
 
5.2%
3183
 
4.4%
3018
 
4.2%
2949
 
4.1%
2949
 
4.1%
2431
 
3.4%
Other values (143) 29260
40.9%

업종분류
Categorical

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
제조업: 5886
정보처리S/W: 2166
기타: 958
연구개발서비스: 387
도소매업: 359
Other values (2): 244

Length

Max length: 8
Median length: 3
Mean length: 4.0001
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 제조업
2nd row: 제조업
3rd row: 제조업
4th row: 건설운수
5th row: 정보처리S/W

Common Values

Value | Count | Frequency (%)
제조업 | 5886 | 58.9%
정보처리S/W | 2166 | 21.7%
기타 | 958 | 9.6%
연구개발서비스 | 387 | 3.9%
도소매업 | 359 | 3.6%
건설운수 | 208 | 2.1%
농,어,임,광업 | 36 | 0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
제조업 5886
58.9%
정보처리s/w 2166
 
21.7%
기타 958
 
9.6%
연구개발서비스 387
 
3.9%
도소매업 359
 
3.6%
건설운수 208
 
2.1%
농,어,임,광업 36
 
0.4%

업종명
Text

Distinct: 704
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 29
Median length: 24
Mean length: 15.5258
Min length: 3

Characters and Unicode

Total characters: 155258
Distinct characters: 398
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 168
Unique (%): 1.7%

Sample

1st row: 접착제 및 젤라틴 제조업
2nd row: 기기용 자동측정 및 제어장치 제조업
3rd row: 수프 및 균질화식품 제조업
4th row: 냉장 및 냉동 창고업
5th row: 응용 소프트웨어 개발 및 공급업
ValueCountFrequency (%)
제조업 5561
 
12.2%
4511
 
9.9%
기타 3365
 
7.4%
1830
 
4.0%
1825
 
4.0%
서비스업 1304
 
2.9%
소프트웨어 1216
 
2.7%
공급업 1211
 
2.7%
개발 1206
 
2.7%
응용 791
 
1.7%
Other values (1049) 22638
49.8%

Most occurring characters

ValueCountFrequency (%)
35460
22.8%
10249
 
6.6%
7523
 
4.8%
6909
 
4.5%
6309
 
4.1%
4511
 
2.9%
3386
 
2.2%
2738
 
1.8%
2461
 
1.6%
2202
 
1.4%
Other values (388) 73510
47.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 118834
76.5%
Space Separator 35460
 
22.8%
Other Punctuation 876
 
0.6%
Decimal Number 32
 
< 0.1%
Open Punctuation 28
 
< 0.1%
Close Punctuation 28
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
10249
 
8.6%
7523
 
6.3%
6909
 
5.8%
6309
 
5.3%
4511
 
3.8%
3386
 
2.8%
2738
 
2.3%
2461
 
2.1%
2202
 
1.9%
2171
 
1.8%
Other values (382) 70375
59.2%
Other Punctuation
ValueCountFrequency (%)
, 869
99.2%
. 7
 
0.8%
Space Separator
ValueCountFrequency (%)
35460
100.0%
Decimal Number
ValueCountFrequency (%)
1 32
100.0%
Open Punctuation
ValueCountFrequency (%)
( 28
100.0%
Close Punctuation
ValueCountFrequency (%)
) 28
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 118834
76.5%
Common 36424
 
23.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
10249
 
8.6%
7523
 
6.3%
6909
 
5.8%
6309
 
5.3%
4511
 
3.8%
3386
 
2.8%
2738
 
2.3%
2461
 
2.1%
2202
 
1.9%
2171
 
1.8%
Other values (382) 70375
59.2%
Common
ValueCountFrequency (%)
35460
97.4%
, 869
 
2.4%
1 32
 
0.1%
( 28
 
0.1%
) 28
 
0.1%
. 7
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 118762
76.5%
ASCII 36424
 
23.5%
Compat Jamo 72
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
35460
97.4%
, 869
 
2.4%
1 32
 
0.1%
( 28
 
0.1%
) 28
 
0.1%
. 7
 
< 0.1%
Hangul
ValueCountFrequency (%)
10249
 
8.6%
7523
 
6.3%
6909
 
5.8%
6309
 
5.3%
4511
 
3.8%
3386
 
2.9%
2738
 
2.3%
2461
 
2.1%
2202
 
1.9%
2171
 
1.8%
Other values (381) 70303
59.2%
Compat Jamo
ValueCountFrequency (%)
72
100.0%

주생산품
Text

Distinct: 8972
Distinct (%): 90.3%
Missing: 68
Missing (%): 0.7%
Memory size: 156.2 KiB

Length

Max length: 114
Median length: 84
Mean length: 12.535441
Min length: 1

Characters and Unicode

Total characters: 124502
Distinct characters: 948
Distinct categories: 15
Distinct scripts: 4
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8665
Unique (%): 87.2%
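
This is the only variable with missing values, and its percentages appear to use two different denominators: the missing rate is relative to all 10000 observations, while the distinct rate, unique rate, and mean length match a division by the 9932 non-missing rows. That reading is an inference from the numbers, not something the report states. A quick check:

```python
# Counts reported for this variable.
n_observations = 10_000
missing = 68
distinct = 8972
unique = 8665
total_characters = 124_502

non_missing = n_observations - missing

# Missing rate uses all rows; the other ratios use non-missing rows only
# (an assumption that happens to reproduce the reported figures).
missing_pct = 100 * missing / n_observations
distinct_pct = 100 * distinct / non_missing
unique_pct = 100 * unique / non_missing
mean_length = total_characters / non_missing

print(f"Missing (%): {missing_pct:.1f}%")
print(f"Distinct (%): {distinct_pct:.1f}%")
print(f"Unique (%): {unique_pct:.1f}%")
print(f"Mean length: {mean_length:.6f}")
```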

Sample

1st row: 건축자재용 접착제, 폴리우레탄
2nd row: 주문식 제어장치
3rd row: 복합조미시즈닝엑기스소스
4th row: 기업형 식자재 유통물류서비스
5th row: 대학정보시스템 인공지능솔루션
ValueCountFrequency (%)
1277
 
5.1%
소프트웨어 477
 
1.9%
개발 374
 
1.5%
서비스 326
 
1.3%
280
 
1.1%
273
 
1.1%
플랫폼 235
 
0.9%
시스템 190
 
0.8%
솔루션 185
 
0.7%
화장품 160
 
0.6%
Other values (11500) 21282
84.9%

Most occurring characters

ValueCountFrequency (%)
15574
 
12.5%
, 3347
 
2.7%
3341
 
2.7%
2444
 
2.0%
1653
 
1.3%
1637
 
1.3%
1532
 
1.2%
1520
 
1.2%
1464
 
1.2%
1456
 
1.2%
Other values (938) 90534
72.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 92343
74.2%
Space Separator 15574
 
12.5%
Uppercase Letter 6831
 
5.5%
Lowercase Letter 4624
 
3.7%
Other Punctuation 3883
 
3.1%
Close Punctuation 451
 
0.4%
Open Punctuation 356
 
0.3%
Decimal Number 345
 
0.3%
Dash Punctuation 78
 
0.1%
Math Symbol 10
 
< 0.1%
Other values (5) 7
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3341
 
3.6%
2444
 
2.6%
1653
 
1.8%
1637
 
1.8%
1532
 
1.7%
1520
 
1.6%
1464
 
1.6%
1456
 
1.6%
1389
 
1.5%
1384
 
1.5%
Other values (851) 74523
80.7%
Uppercase Letter
ValueCountFrequency (%)
S 681
 
10.0%
C 556
 
8.1%
E 543
 
7.9%
I 492
 
7.2%
D 491
 
7.2%
T 481
 
7.0%
P 437
 
6.4%
A 423
 
6.2%
L 377
 
5.5%
R 376
 
5.5%
Other values (16) 1974
28.9%
Lowercase Letter
ValueCountFrequency (%)
e 559
12.1%
o 439
 
9.5%
a 417
 
9.0%
t 375
 
8.1%
i 363
 
7.9%
r 361
 
7.8%
l 258
 
5.6%
n 254
 
5.5%
s 208
 
4.5%
c 197
 
4.3%
Other values (16) 1193
25.8%
Other Punctuation
ValueCountFrequency (%)
, 3347
86.2%
/ 299
 
7.7%
. 111
 
2.9%
& 63
 
1.6%
; 38
 
1.0%
: 16
 
0.4%
· 5
 
0.1%
* 2
 
0.1%
! 1
 
< 0.1%
% 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
3 95
27.5%
2 94
27.2%
1 49
14.2%
0 30
 
8.7%
5 23
 
6.7%
4 17
 
4.9%
9 11
 
3.2%
7 10
 
2.9%
8 8
 
2.3%
6 8
 
2.3%
Open Punctuation
ValueCountFrequency (%)
( 354
99.4%
{ 1
 
0.3%
[ 1
 
0.3%
Close Punctuation
ValueCountFrequency (%)
) 450
99.8%
} 1
 
0.2%
Math Symbol
ValueCountFrequency (%)
+ 6
60.0%
| 4
40.0%
Other Symbol
ValueCountFrequency (%)
® 1
50.0%
1
50.0%
Space Separator
ValueCountFrequency (%)
15574
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 78
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%
Initial Punctuation
ValueCountFrequency (%)
1
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 92324
74.2%
Common 20704
 
16.6%
Latin 11455
 
9.2%
Han 19
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3341
 
3.6%
2444
 
2.6%
1653
 
1.8%
1637
 
1.8%
1532
 
1.7%
1520
 
1.6%
1464
 
1.6%
1456
 
1.6%
1389
 
1.5%
1384
 
1.5%
Other values (848) 74504
80.7%
Latin
ValueCountFrequency (%)
S 681
 
5.9%
e 559
 
4.9%
C 556
 
4.9%
E 543
 
4.7%
I 492
 
4.3%
D 491
 
4.3%
T 481
 
4.2%
o 439
 
3.8%
P 437
 
3.8%
A 423
 
3.7%
Other values (42) 6353
55.5%
Common
ValueCountFrequency (%)
15574
75.2%
, 3347
 
16.2%
) 450
 
2.2%
( 354
 
1.7%
/ 299
 
1.4%
. 111
 
0.5%
3 95
 
0.5%
2 94
 
0.5%
- 78
 
0.4%
& 63
 
0.3%
Other values (25) 239
 
1.2%
Han
ValueCountFrequency (%)
17
89.5%
1
 
5.3%
1
 
5.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 92324
74.2%
ASCII 32150
 
25.8%
CJK 19
 
< 0.1%
None 6
 
< 0.1%
Punctuation 2
 
< 0.1%
Letterlike Symbols 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
15574
48.4%
, 3347
 
10.4%
S 681
 
2.1%
e 559
 
1.7%
C 556
 
1.7%
E 543
 
1.7%
I 492
 
1.5%
D 491
 
1.5%
T 481
 
1.5%
) 450
 
1.4%
Other values (72) 8976
27.9%
Hangul
ValueCountFrequency (%)
3341
 
3.6%
2444
 
2.6%
1653
 
1.8%
1637
 
1.8%
1532
 
1.7%
1520
 
1.6%
1464
 
1.6%
1456
 
1.6%
1389
 
1.5%
1384
 
1.5%
Other values (848) 74504
80.7%
CJK
ValueCountFrequency (%)
17
89.5%
1
 
5.3%
1
 
5.3%
None
ValueCountFrequency (%)
· 5
83.3%
® 1
 
16.7%
Punctuation
ValueCountFrequency (%)
1
50.0%
1
50.0%
Letterlike Symbols
ValueCountFrequency (%)
1
100.0%

유효시작일
DateTime

Distinct: 753
Distinct (%): 7.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2021-01-25 00:00:00
Maximum: 2023-03-30 00:00:00
Histogram with fixed size bins (bins=50)

유효종료일
DateTime

Distinct: 775
Distinct (%): 7.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2023-03-31 00:00:00
Maximum: 2026-03-29 00:00:00
Histogram with fixed size bins (bins=50)

벤처확인기관
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
벤처기업확인기관: 9959
기술보증기금 기술평가센터: 23
중소기업진흥공단: 18

Length

Max length: 13
Median length: 8
Mean length: 8.0115
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 벤처기업확인기관
2nd row: 벤처기업확인기관
3rd row: 벤처기업확인기관
4th row: 벤처기업확인기관
5th row: 벤처기업확인기관

Common Values

Value | Count | Frequency (%)
벤처기업확인기관 | 9959 | 99.6%
기술보증기금 기술평가센터 | 23 | 0.2%
중소기업진흥공단 | 18 | 0.2%
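
The 97.3% in the Imbalance alert is consistent with a normalized-entropy score, 1 − H / log2(k), where H is the Shannon entropy of the class frequencies and k the number of classes. The sketch below assumes that definition; `imbalance_score` is a name chosen here for illustration, not a function taken from the report.

```python
import math

# Class counts for 벤처확인기관 as listed in Common Values.
counts = {
    "벤처기업확인기관": 9959,
    "기술보증기금 기술평가센터": 23,
    "중소기업진흥공단": 18,
}

def imbalance_score(class_counts):
    """Return 1 - H/log2(k): 0 for a uniform split, 1 for a single class."""
    total = sum(class_counts)
    n_classes = len(class_counts)
    if n_classes < 2 or total == 0:
        return 0.0
    # Shannon entropy in bits; empty classes contribute nothing.
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in class_counts if c > 0
    )
    return 1 - entropy / math.log2(n_classes)

score = imbalance_score(list(counts.values()))
print(f"Imbalance: {100 * score:.1f}%")
```

A perfectly balanced variable scores 0 under this definition, so a value close to 1 means a single category dominates, as it does here.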

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
벤처기업확인기관 9959
99.4%
기술보증기금 23
 
0.2%
기술평가센터 23
 
0.2%
중소기업진흥공단 18
 
0.2%

Interactions


Correlations

 | 연번 | 벤처확인유형 | 지역 | 업종분류 | 벤처확인기관
연번 | 1.000 | 0.180 | 0.057 | 0.050 | 0.217
벤처확인유형 | 0.180 | 1.000 | 0.250 | 0.265 | 0.998
지역 | 0.057 | 0.250 | 1.000 | 0.422 | 0.082
업종분류 | 0.050 | 0.265 | 0.422 | 1.000 | 0.000
벤처확인기관 | 0.217 | 0.998 | 0.082 | 0.000 | 1.000

 | 벤처확인기관 | 지역 | 업종분류 | 벤처확인유형
벤처확인기관 | 1.000 | 0.043 | 0.000 | 0.949
지역 | 0.043 | 1.000 | 0.205 | 0.120
업종분류 | 0.000 | 0.205 | 1.000 | 0.161
벤처확인유형 | 0.949 | 0.120 | 0.161 | 1.000

 | 연번 | 벤처확인유형 | 지역 | 업종분류 | 벤처확인기관
연번 | 1.000 | 0.095 | 0.022 | 0.025 | 0.132
벤처확인유형 | 0.095 | 1.000 | 0.120 | 0.161 | 0.949
지역 | 0.022 | 0.120 | 1.000 | 0.205 | 0.043
업종분류 | 0.025 | 0.161 | 0.205 | 1.000 | 0.000
벤처확인기관 | 0.132 | 0.949 | 0.043 | 0.000 | 1.000
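
The extracted text does not say which association measure each matrix uses. For pairs of categorical variables a common choice is Cramér's V, so the sketch below shows that measure as an illustration only. The contingency tables are made-up toy data, not counts from this dataset, and the function applies no bias correction.

```python
import math

def cramers_v(table):
    """Cramér's V for a contingency table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    k = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or k <= 0:
        return 0.0
    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))

# Hypothetical toy tables: one fully dependent pair, one independent pair.
toy_dependent = [[10, 0], [0, 10]]
toy_independent = [[5, 5], [5, 5]]

print(cramers_v(toy_dependent), cramers_v(toy_independent))
```

A value near 1, like the 0.949 to 0.998 reported between 벤처확인유형 and 벤처확인기관, indicates that one variable almost determines the other.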

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index) | 연번 | 업체명 | 대표자명 | 벤처확인유형 | 지역 | 주소 | 업종분류 | 업종명 | 주생산품 | 유효시작일 | 유효종료일 | 벤처확인기관
4586 | 4587 | ㈜가원폴리텍 | 김** | 혁신성장유형 | 충북 | 충청북도 청주시 | 제조업 | 접착제 및 젤라틴 제조업 | 건축자재용 접착제, 폴리우레탄 | 2021-05-26 | 2024-05-25 | 벤처기업확인기관
22427 | 22428 | 다안 스마트 이엔지 | 김** | 혁신성장유형 | 부산 | 부산광역시 해운대구 | 제조업 | 기기용 자동측정 및 제어장치 제조업 | 주문식 제어장치 | 2022-05-30 | 2025-05-29 | 벤처기업확인기관
23218 | 23219 | 주식회사 호반식품 | 박** | 혁신성장유형 | 충북 | 충청북도 청주시 | 제조업 | 수프 및 균질화식품 제조업 | 복합조미시즈닝엑기스소스 | 2022-03-24 | 2025-03-23 | 벤처기업확인기관
14546 | 14547 | 뉴통 주식회사 | 김** | 벤처투자유형 | 경기 | 경기도 고양시 | 건설운수 | 냉장 및 냉동 창고업 | 기업형 식자재 유통물류서비스 | 2021-12-08 | 2024-12-07 | 벤처기업확인기관
34399 | 34400 | ㈜에이앤티랩스 | 윤** | 연구개발유형 | 경북 | 경상북도 구미시 | 정보처리S/W | 응용 소프트웨어 개발 및 공급업 | 대학정보시스템 인공지능솔루션 | 2023-01-08 | 2026-01-07 | 벤처기업확인기관
18048 | 18049 | 주식회사 리신바이오 | 박** | 벤처투자유형 | 대전 | 대전광역시 유성구 | 연구개발서비스 | 의학 및 약학 연구개발업 | 항생제 개발 연구 | 2022-02-23 | 2025-02-22 | 벤처기업확인기관
6399 | 6400 | 소성정보기술 주식회사 | 김** | 혁신성장유형 | 서울 | 서울특별시 구로구 | 제조업 | 방송장비 제조업 | 방송장비,CCTV,출입통제 외 | 2021-07-10 | 2024-07-09 | 벤처기업확인기관
33494 | 33495 | 주식회사 큐브인스트루먼트 | 이** | 연구개발유형 | 대전 | 대전광역시 유성구 | 제조업 | 그 외 기타 의료용 기기 제조업 | 저온 플라즈마 멸균기 | 2023-02-09 | 2026-02-08 | 벤처기업확인기관
9263 | 9264 | 엔아이피(NIP) | 이** | 혁신성장유형 | 세종 | 세종특별자치시 집현중앙7로 | 연구개발서비스 | 기타 공학 연구개발업 | 노즐 및 디자인 | 2021-09-02 | 2024-09-01 | 벤처기업확인기관
35385 | 35386 | 주식회사 아이투프럼 | 이**,라** | 벤처투자유형 | 서울 | 서울특별시 강남구 | 제조업 | 전자감지장치 제조업 | 무선 디지털카운터 | 2023-03-02 | 2026-03-01 | 벤처기업확인기관

(index) | 연번 | 업체명 | 대표자명 | 벤처확인유형 | 지역 | 주소 | 업종분류 | 업종명 | 주생산품 | 유효시작일 | 유효종료일 | 벤처확인기관
518 | 519 | ㈜큐아이티 | 배** | 연구개발유형 | 경기 | 경기도 수원시 | 제조업 | 기타 전기 변환장치 제조업 | 전기변환장치 | 2021-03-05 | 2024-03-04 | 벤처기업확인기관
5384 | 5385 | ㈜텔레큐브 | 정** | 혁신성장유형 | 서울 | 서울특별시 영등포구 | 정보처리S/W | 컴퓨터 프로그래밍 서비스업 | 전자문서 솔루션 개발, 교환기 구축 | 2021-06-09 | 2024-06-08 | 벤처기업확인기관
18324 | 18325 | ㈜필텍 | 김** | 혁신성장유형 | 전북 | 전라북도 군산시 | 제조업 | 그 외 기타 의료용 기기 제조업 | 일회용 주사기 | 2022-03-26 | 2025-03-25 | 벤처기업확인기관
19195 | 19196 | ㈜이너웨이브 | 이** | 혁신성장유형 | 서울 | 서울특별시 금천구 | 정보처리S/W | 컴퓨터 프로그래밍 서비스업 | 응용 소프트웨어 | 2022-02-23 | 2025-02-22 | 벤처기업확인기관
28999 | 29000 | 주식회사 비밥소프트웨어 | 전** | 혁신성장유형 | 인천 | 인천광역시 연수구 | 정보처리S/W | 응용 소프트웨어 개발 및 공급업 | 스프트웨어 | 2022-09-18 | 2025-09-17 | 벤처기업확인기관
255 | 256 | ㈜시루정보 | 류** | 연구개발유형 | 서울 | 서울특별시 마포구 | 정보처리S/W | 컴퓨터 프로그래밍 서비스업 | 모바일결제, 모바일인증 S/W | 2021-02-11 | 2024-02-10 | 벤처기업확인기관
3466 | 3467 | 주식회사 제이씨엔터웍스 | 박** | 혁신성장유형 | 서울 | 서울특별시 강남구 | 기타 | 영화, 비디오물 및 방송프로그램 배급업 | 영화제작,수입,배급 | 2021-05-28 | 2024-05-27 | 벤처기업확인기관
18982 | 18983 | 주식회사 넥사 | 주** | 연구개발유형 | 서울 | 서울특별시 송파구 | 제조업 | 반도체 제조용 기계 제조업 | 자동화기계기구, 제조설계용역 | 2022-04-14 | 2025-04-13 | 벤처기업확인기관
27981 | 27982 | 주식회사 마켓오브메테리얼 | 조** | 벤처투자유형 | 울산 | 울산광역시 울주군 | 도소매업 | 전자상거래 소매 중개업 | 플랜트 자재 거래 | 2022-08-17 | 2025-08-16 | 벤처기업확인기관
7419 | 7420 | ㈜신성금속 | 최** | 혁신성장유형 | 인천 | 인천광역시 남동구 | 제조업 | 톱 및 호환성 공구 제조업 | 절삭공구제조 | 2021-08-25 | 2024-08-24 | 벤처기업확인기관