Overview

Dataset statistics

Number of variables: 6
Number of observations: 302
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 14.6 KiB
Average record size in memory: 49.4 B

Variable types

Numeric: 1
Categorical: 4
Text: 1

Dataset

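Description: Provides status information (region, item, industry, etc.) on small and medium-sized venture enterprises that participated in the Online Export Platform support program. The program supports building online export infrastructure, overseas marketing, and follow-up management of purchase offers from overseas buyers, with the aim of entering the global e-commerce market and boosting exports.
Author: Korea SMEs and Startups Agency (중소벤처기업진흥공단)
URL: https://www.data.go.kr/data/15095250/fileData.do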

Alerts

업종1 is highly overall correlated with 업종2 (High correlation)
업종2 is highly overall correlated with 업종1 (High correlation)
업종1 is highly imbalanced (50.2%) (Imbalance)
연번 has unique values (Unique)
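
The 50.2% imbalance figure for 업종1 can be reproduced from the two category counts reported for that variable (269 and 33). The sketch below assumes the score is one minus the Shannon entropy normalized by the maximum entropy for the number of classes; that definition is an assumption about how the tool computes it, and `imbalance_score` is a hypothetical helper name.

```python
from math import log2

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform split, near 1 when one class dominates.

    Assumed definition; `counts` holds the number of rows in each of k classes.
    """
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * log2(p) for p in probs)
    return 1 - entropy / log2(len(counts))

# Counts for 업종1 taken from the report: 제조업 = 269, 지식서비스업 = 33.
score = imbalance_score([269, 33])
print(f"{score:.1%}")
```

Under this assumed definition the result rounds to the 50.2% shown in the alert.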

Reproduction

Analysis started: 2023-12-12 15:08:30.757222
Analysis finished: 2023-12-12 15:08:31.344462
Duration: 0.59 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
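
The reported duration follows directly from the two timestamps. A minimal check using only the standard library (the format string is an assumption based on how the timestamps are printed here):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed layout of the timestamps shown in the report

started = datetime.strptime("2023-12-12 15:08:30.757222", FMT)
finished = datetime.strptime("2023-12-12 15:08:31.344462", FMT)

# Elapsed wall-clock time in seconds, rounded the way the report displays it.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```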

Variables

연번 (serial number)
Real number (ℝ)

UNIQUE

Distinct: 302
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 151.5
Minimum: 1
Maximum: 302
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 16.05
Q1: 76.25
Median: 151.5
Q3: 226.75
95-th percentile: 286.95
Maximum: 302
Range: 301
Interquartile range (IQR): 150.5

Descriptive statistics

Standard deviation: 87.324109
Coefficient of variation (CV): 0.57639676
Kurtosis: -1.2
Mean: 151.5
Median Absolute Deviation (MAD): 75.5
Skewness: 0
Sum: 45753
Variance: 7625.5
Monotonicity: Strictly increasing

[Figure: Histogram with fixed size bins (bins=50)]
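
Because 연번 is unique, strictly increasing, and runs from 1 to 302, its statistics can be cross-checked against the plain sequence 1..302. The sketch below assumes the column is exactly that sequence and that quartiles use linear interpolation between order statistics (the `inclusive` method of Python's `statistics.quantiles`).

```python
import statistics

# Assumption: 연번 is exactly the integers 1..302.
ids = list(range(1, 303))

total = sum(ids)                      # Sum
mean = statistics.mean(ids)           # Mean
variance = statistics.variance(ids)   # sample variance (n - 1 denominator)
std = statistics.stdev(ids)           # sample standard deviation
cv = std / mean                       # coefficient of variation
median = statistics.median(ids)
# Median absolute deviation: median of the distances from the median.
mad = statistics.median(abs(x - median) for x in ids)
# Quartiles with linear interpolation between order statistics.
q1, q2, q3 = statistics.quantiles(ids, n=4, method="inclusive")

print(total, mean, variance, round(std, 6), round(cv, 8), mad, q1, q3, q3 - q1)
```

These values agree with the sum, mean, variance, standard deviation, CV, MAD, quartiles, and IQR listed for 연번.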
Value | Count | Frequency (%)
1 | 1 | 0.3%
209 | 1 | 0.3%
207 | 1 | 0.3%
206 | 1 | 0.3%
205 | 1 | 0.3%
204 | 1 | 0.3%
203 | 1 | 0.3%
202 | 1 | 0.3%
201 | 1 | 0.3%
200 | 1 | 0.3%
Other values (292) | 292 | 96.7%

Value | Count | Frequency (%)
1 | 1 | 0.3%
2 | 1 | 0.3%
3 | 1 | 0.3%
4 | 1 | 0.3%
5 | 1 | 0.3%
6 | 1 | 0.3%
7 | 1 | 0.3%
8 | 1 | 0.3%
9 | 1 | 0.3%
10 | 1 | 0.3%

Value | Count | Frequency (%)
302 | 1 | 0.3%
301 | 1 | 0.3%
300 | 1 | 0.3%
299 | 1 | 0.3%
298 | 1 | 0.3%
297 | 1 | 0.3%
296 | 1 | 0.3%
295 | 1 | 0.3%
294 | 1 | 0.3%
293 | 1 | 0.3%

지역 (region)
Categorical

Distinct: 16
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

경기도: 100
서울특별시: 92
인천광역시: 14
부산광역시: 12
강원도: 11
Other values (11): 73

Length

Max length: 7
Median length: 5
Mean length: 4.1324503
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 제주특별자치도
3rd row: 서울특별시
4th row: 서울특별시
5th row: 경기도

Common Values

Value | Count | Frequency (%)
경기도 | 100 | 33.1%
서울특별시 | 92 | 30.5%
인천광역시 | 14 | 4.6%
부산광역시 | 12 | 4.0%
강원도 | 11 | 3.6%
충청북도 | 10 | 3.3%
경상남도 | 10 | 3.3%
대구광역시 | 9 | 3.0%
경상북도 | 8 | 2.6%
충청남도 | 8 | 2.6%
Other values (6) | 28 | 9.3%
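
The Frequency (%) column is each count divided by the 302 observations, and the "Other values" row is what remains after the ten listed regions. A short sketch of that arithmetic, using only the counts reported for 지역:

```python
# Counts for the ten most common 지역 values, as listed in the report.
top_counts = {
    "경기도": 100, "서울특별시": 92, "인천광역시": 14, "부산광역시": 12,
    "강원도": 11, "충청북도": 10, "경상남도": 10, "대구광역시": 9,
    "경상북도": 8, "충청남도": 8,
}
n_observations = 302

# Rows not covered by the top ten fall into "Other values".
other = n_observations - sum(top_counts.values())

# Share of all observations, formatted with one decimal place.
shares = {name: f"{count / n_observations:.1%}" for name, count in top_counts.items()}
shares["Other values"] = f"{other / n_observations:.1%}"

for name, share in shares.items():
    print(name, share)
```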

Length

[Figure: Histogram of lengths of the category]

업종1 (industry, level 1)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

제조업 (manufacturing): 269
지식서비스업 (knowledge services): 33

Length

Max length: 6
Median length: 3
Mean length: 3.3278146
Min length: 3
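
The length statistics describe the number of characters in each category label, weighted by how often the label occurs. A minimal sketch for 업종1, assuming length is counted in Unicode code points (which is what Python's `len` returns):

```python
import statistics

# Category counts for 업종1 as reported: label -> number of rows.
counts = {"제조업": 269, "지식서비스업": 33}

# One length entry per row, so frequent labels weigh more.
lengths = [len(label) for label, n in counts.items() for _ in range(n)]

mean_length = sum(lengths) / len(lengths)
median_length = statistics.median(lengths)

print(max(lengths), median_length, round(mean_length, 7), min(lengths))
```

The mean works out to (269 × 3 + 33 × 6) / 302, which matches the reported 3.3278146.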

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 제조업
2nd row: 제조업
3rd row: 지식서비스업
4th row: 지식서비스업
5th row: 제조업

Common Values

Value | Count | Frequency (%)
제조업 | 269 | 89.1%
지식서비스업 | 33 | 10.9%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

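업종2 (industry, level 2)
Categorical

HIGH CORRELATION

Distinct: 15
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

기타 (other): 133
식료 (food): 31
전자 (electronics): 23
화공 (chemicals): 22
도매 및 상품 중개업 (wholesale and commodity brokerage): 21
Other values (10): 72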

Length

Max length: 32
Median length: 2
Mean length: 3.1721854
Min length: 2

Unique

Unique: 3
Unique (%): 1.0%

Sample

1st row: 섬유
2nd row: 식료
3rd row: 도매 및 상품 중개업
4th row: 소프트웨어 개발 및 공급업
5th row: 기타

Common Values

Value | Count | Frequency (%)
기타 | 133 | 44.0%
식료 | 31 | 10.3%
전자 | 23 | 7.6%
화공 | 22 | 7.3%
도매 및 상품 중개업 | 21 | 7.0%
잡화 | 18 | 6.0%
기계 | 16 | 5.3%
섬유 | 12 | 4.0%
전기 | 9 | 3.0%
소프트웨어 개발 및 공급업 | 6 | 2.0%
Other values (5) | 11 | 3.6%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
기타 | 134 | 33.3%
(not shown) | 32 | 8.0%
식료 | 31 | 7.7%
전자 | 23 | 5.7%
화공 | 22 | 5.5%
도매 | 21 | 5.2%
상품 | 21 | 5.2%
중개업 | 21 | 5.2%
잡화 | 18 | 4.5%
기계 | 16 | 4.0%
Other values (19) | 63 | 15.7%

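업력 (years in business)
Categorical

Distinct: 5
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

7년이상 (7 years or more): 140
5년미만 (under 5 years): 62
3년미만 (under 3 years): 53
7년미만 (under 7 years): 46
1년미만 (under 1 year): 1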

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 1
Unique (%): 0.3%

Sample

1st row: 7년미만
2nd row: 3년미만
3rd row: 7년이상
4th row: 7년미만
5th row: 7년미만

Common Values

Value | Count | Frequency (%)
7년이상 | 140 | 46.4%
5년미만 | 62 | 20.5%
3년미만 | 53 | 17.5%
7년미만 | 46 | 15.2%
1년미만 | 1 | 0.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

품목명 (item name)
Text

Distinct: 132
Distinct (%): 43.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Length

Max length: 19
Median length: 16
Mean length: 5.910596
Min length: 1

Characters and Unicode

Total characters: 1785
Distinct characters: 219
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
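
As an illustration of those character properties, the sketch below tallies Unicode general categories for the five sample values of this variable using Python's standard `unicodedata` module. The two-letter codes it returns map to the category names used in this report: `Lo` is Other Letter, `Zs` is Space Separator, `Lu` is Uppercase Letter, and `Po` is Other Punctuation. This covers only the five sample rows, not the full column.

```python
import unicodedata
from collections import Counter

# The five sample values shown for this variable in the report.
samples = ["여성복", "수산물", "화장품", "캠핑 및 하이킹", "머리 손질"]

# Count the general category of every character across the samples.
category_counts = Counter(
    unicodedata.category(ch) for text in samples for ch in text
)

for code, count in category_counts.most_common():
    print(code, count)
```

Hangul syllables fall under `Lo` and the separating spaces under `Zs`, which is why those two categories dominate the full-column figures.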

Unique

Unique: 92
Unique (%): 30.5%

Sample

1st row: 여성복
2nd row: 수산물
3rd row: 화장품
4th row: 캠핑 및 하이킹
5th row: 머리 손질

Value | Count | Frequency (%)
기타 | 72 | 11.1%
(not shown) | 52 | 8.0%
케어 | 40 | 6.2%
스킨 | 39 | 6.0%
화장품 | 30 | 4.6%
제품 | 21 | 3.3%
식품 | 17 | 2.6%
음료 | 13 | 2.0%
기계 | 12 | 1.9%
용품 | 12 | 1.9%
Other values (171) | 338 | 52.3%

Most occurring characters

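Value | Count | Frequency (%)
(space) | 344 | 19.3%
(not shown) | 116 | 6.5%
(not shown) | 96 | 5.4%
(not shown) | 72 | 4.0%
(not shown) | 63 | 3.5%
(not shown) | 52 | 2.9%
(not shown) | 44 | 2.5%
(not shown) | 43 | 2.4%
(not shown) | 42 | 2.4%
(not shown) | 39 | 2.2%
Other values (209) | 874 | 49.0%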

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1425 | 79.8%
Space Separator | 344 | 19.3%
Uppercase Letter | 15 | 0.8%
Other Punctuation | 1 | 0.1%

Most frequent character per category

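Other Letter
Value | Count | Frequency (%)
(not shown) | 116 | 8.1%
(not shown) | 96 | 6.7%
(not shown) | 72 | 5.1%
(not shown) | 63 | 4.4%
(not shown) | 52 | 3.6%
(not shown) | 44 | 3.1%
(not shown) | 43 | 3.0%
(not shown) | 42 | 2.9%
(not shown) | 39 | 2.7%
(not shown) | 33 | 2.3%
Other values (201) | 825 | 57.9%

Uppercase Letter
Value | Count | Frequency (%)
C | 3 | 20.0%
B | 3 | 20.0%
P | 3 | 20.0%
D | 2 | 13.3%
E | 2 | 13.3%
L | 2 | 13.3%

Space Separator
Value | Count | Frequency (%)
(space) | 344 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 1 | 100.0%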

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1425 | 79.8%
Common | 345 | 19.3%
Latin | 15 | 0.8%

Most frequent character per script

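Hangul
Value | Count | Frequency (%)
(not shown) | 116 | 8.1%
(not shown) | 96 | 6.7%
(not shown) | 72 | 5.1%
(not shown) | 63 | 4.4%
(not shown) | 52 | 3.6%
(not shown) | 44 | 3.1%
(not shown) | 43 | 3.0%
(not shown) | 42 | 2.9%
(not shown) | 39 | 2.7%
(not shown) | 33 | 2.3%
Other values (201) | 825 | 57.9%

Latin
Value | Count | Frequency (%)
C | 3 | 20.0%
B | 3 | 20.0%
P | 3 | 20.0%
D | 2 | 13.3%
E | 2 | 13.3%
L | 2 | 13.3%

Common
Value | Count | Frequency (%)
(space) | 344 | 99.7%
, | 1 | 0.3%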

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1425 | 79.8%
ASCII | 360 | 20.2%

Most frequent character per block

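ASCII
Value | Count | Frequency (%)
(space) | 344 | 95.6%
C | 3 | 0.8%
B | 3 | 0.8%
P | 3 | 0.8%
D | 2 | 0.6%
E | 2 | 0.6%
L | 2 | 0.6%
, | 1 | 0.3%

Hangul
Value | Count | Frequency (%)
(not shown) | 116 | 8.1%
(not shown) | 96 | 6.7%
(not shown) | 72 | 5.1%
(not shown) | 63 | 4.4%
(not shown) | 52 | 3.6%
(not shown) | 44 | 3.1%
(not shown) | 43 | 3.0%
(not shown) | 42 | 2.9%
(not shown) | 39 | 2.7%
(not shown) | 33 | 2.3%
Other values (201) | 825 | 57.9%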

Interactions

[Figure: interactions plot]

Correlations

Variable | 연번 | 지역 | 업종1 | 업종2 | 업력
연번 | 1.000 | 0.191 | 0.000 | 0.000 | 0.000
지역 | 0.191 | 1.000 | 0.083 | 0.338 | 0.217
업종1 | 0.000 | 0.083 | 1.000 | 1.000 | 0.147
업종2 | 0.000 | 0.338 | 1.000 | 1.000 | 0.352
업력 | 0.000 | 0.217 | 0.147 | 0.352 | 1.000

Variable | 업력 | 지역 | 업종1 | 업종2
업력 | 1.000 | 0.109 | 0.179 | 0.154
지역 | 0.109 | 1.000 | 0.063 | 0.118
업종1 | 0.179 | 0.063 | 1.000 | 0.978
업종2 | 0.154 | 0.118 | 0.978 | 1.000

Variable | 연번 | 지역 | 업종1 | 업종2 | 업력
연번 | 1.000 | 0.074 | 0.000 | 0.000 | 0.000
지역 | 0.074 | 1.000 | 0.063 | 0.118 | 0.109
업종1 | 0.000 | 0.063 | 1.000 | 0.978 | 0.179
업종2 | 0.000 | 0.118 | 0.978 | 1.000 | 0.154
업력 | 0.000 | 0.109 | 0.179 | 0.154 | 1.000
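
The near-perfect association between 업종1 and 업종2 is what one would expect if each 업종2 value belongs to exactly one 업종1 value. One common measure for two categorical columns is Cramér's V; the sketch below computes the uncorrected version on the 20 (업종1, 업종2) pairs listed in the Sample section. Whether the matrices above use this exact estimator (for example, with a bias correction) is an assumption, and `cramers_v` is a hypothetical helper.

```python
from collections import Counter
from math import sqrt

def cramers_v(pairs):
    """Uncorrected Cramér's V for a list of (a, b) category pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    row_totals = Counter(a for a, _ in pairs)
    col_totals = Counter(b for _, b in pairs)
    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            chi2 += (joint.get((a, b), 0) - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k))

M, K = "제조업", "지식서비스업"
W = "도매 및 상품 중개업"
# (업종1, 업종2) for the first ten and last ten rows shown in the Sample section.
sample_pairs = [
    (M, "섬유"), (M, "식료"), (K, W), (K, "소프트웨어 개발 및 공급업"), (M, "기타"),
    (M, "기타"), (M, "기타"), (M, "기타"), (K, W), (M, "기타"),
    (M, "기타"), (M, "기타"), (M, "잡화"), (M, "기타"), (K, W),
    (M, "금속"), (M, "전자"), (K, W), (M, "기타"), (M, "잡화"),
]

v = cramers_v(sample_pairs)
print(round(v, 3))
```

In these 20 rows every 업종2 value maps to a single 업종1 value, so the statistic reaches its maximum; the full data set evidently contains a few exceptions, given the 0.978 in two of the matrices.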

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

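Index | 연번 | 지역 | 업종1 | 업종2 | 업력 | 품목명
0 | 1 | 서울특별시 | 제조업 | 섬유 | 7년미만 | 여성복
1 | 2 | 제주특별자치도 | 제조업 | 식료 | 3년미만 | 수산물
2 | 3 | 서울특별시 | 지식서비스업 | 도매 및 상품 중개업 | 7년이상 | 화장품
3 | 4 | 서울특별시 | 지식서비스업 | 소프트웨어 개발 및 공급업 | 7년미만 | 캠핑 및 하이킹
4 | 5 | 경기도 | 제조업 | 기타 | 7년미만 | 머리 손질
5 | 6 | 서울특별시 | 제조업 | 기타 | 7년미만 | 향수 탈취제
6 | 7 | 광주광역시 | 제조업 | 기타 | 5년미만 | 스킨 케어
7 | 8 | 강원도 | 제조업 | 기타 | 5년미만 | 화장품
8 | 9 | 제주특별자치도 | 지식서비스업 | 도매 및 상품 중개업 | 7년이상 | 수산물
9 | 10 | 경기도 | 제조업 | 기타 | 7년이상 | 스킨 케어

Index | 연번 | 지역 | 업종1 | 업종2 | 업력 | 품목명
292 | 293 | 서울특별시 | 제조업 | 기타 | 3년미만 | 머리 손질
293 | 294 | 부산광역시 | 제조업 | 기타 | 3년미만 | 구강 위생
294 | 295 | 서울특별시 | 제조업 | 잡화 | 5년미만 | 특수 목적 가방 케이스
295 | 296 | 서울특별시 | 제조업 | 기타 | 5년미만 | 스킨 케어
296 | 297 | 경기도 | 지식서비스업 | 도매 및 상품 중개업 | 3년미만 | 화장품
297 | 298 | 대구광역시 | 제조업 | 금속 | 5년미만 | 치과 장비
298 | 299 | 서울특별시 | 제조업 | 전자 | 7년이상 | 기타 컴퓨터 제품
299 | 300 | 서울특별시 | 지식서비스업 | 도매 및 상품 중개업 | 5년미만 | 화장품
300 | 301 | 서울특별시 | 제조업 | 기타 | 7년이상 | 바닥 및 액세서리
301 | 302 | 서울특별시 | 제조업 | 잡화 | 5년미만 | 양말류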