Overview

Dataset statistics

Number of variables: 6
Number of observations: 514
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 24.7 KiB
Average record size in memory: 49.3 B

Variable types

Numeric: 1
Categorical: 4
Text: 1

Dataset

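Description: Data on the region, industry, years in business, and export items of each company supported under the 2021 Online Direct Export program, which supports overseas online platform account creation, marketing, consulting, and related services. The publisher hopes this data will help boost exports through participation in the online export support program.
Author: 중소벤처기업진흥공단 (Korea SMEs and Startups Agency)
URL: https://www.data.go.kr/data/15091632/fileData.do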

Alerts

업종1 is highly correlated overall with 업종2 (High correlation)
업종2 is highly correlated overall with 업종1 (High correlation)
연번 is highly correlated overall with 지역 (High correlation)
지역 is highly correlated overall with 연번 (High correlation)
연번 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 00:50:58.480669
Analysis finished: 2023-12-12 00:50:59.194875
Duration: 0.71 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
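
The reported duration can be checked from the two timestamps above. The following is a minimal sketch using only the standard library; the helper name `duration_seconds` is hypothetical and not part of ydata-profiling.

```python
from datetime import datetime

# Timestamp layout used in the Reproduction block of this report.
FMT = "%Y-%m-%d %H:%M:%S.%f"


def duration_seconds(started: str, finished: str) -> float:
    """Return the elapsed time between two report timestamps, in seconds."""
    delta = datetime.strptime(finished, FMT) - datetime.strptime(started, FMT)
    return delta.total_seconds()


elapsed = duration_seconds(
    "2023-12-12 00:50:58.480669",
    "2023-12-12 00:50:59.194875",
)
print(f"{elapsed:.2f} seconds")
```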

Variables

연번 (serial number)
Real number (ℝ)

Alerts: High correlation, Unique

Distinct: 514
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 257.5
Minimum: 1
Maximum: 514
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 26.65
Q1: 129.25
Median: 257.5
Q3: 385.75
95-th percentile: 488.35
Maximum: 514
Range: 513
Interquartile range (IQR): 256.5
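
The quantiles above are consistent with linear interpolation over the values 1 to 514. The sketch below assumes that 연번 holds exactly those integers (suggested by the minimum, maximum, distinct count, and sum, but not stated in the report); the `percentile` helper is hypothetical.

```python
def percentile(sorted_values, q):
    """Linear-interpolation percentile for q in [0, 1] on pre-sorted data."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])


# Assumption: the serial-number column is simply 1, 2, ..., 514.
serial = list(range(1, 515))

for label, q in [("5%", 0.05), ("Q1", 0.25), ("median", 0.5),
                 ("Q3", 0.75), ("95%", 0.95)]:
    print(label, round(percentile(serial, q), 2))

iqr = percentile(serial, 0.75) - percentile(serial, 0.25)
print("IQR", iqr)
```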

Descriptive statistics

Standard deviation: 148.52329
Coefficient of variation (CV): 0.57678946
Kurtosis: -1.2
Mean: 257.5
Median Absolute Deviation (MAD): 128.5
Skewness: 0
Sum: 132355
Variance: 22059.167
Monotonicity: Strictly increasing
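
Most of the descriptive statistics above can be reproduced with the standard library, again assuming that 연번 is the sequence 1 to 514. This is a sketch under that assumption, not the code the report generator runs; the variance here is the sample variance (divisor n - 1).

```python
import statistics

# Assumption: the serial-number column is simply 1, 2, ..., 514.
serial = list(range(1, 515))

total = sum(serial)
mean = statistics.fmean(serial)
variance = statistics.variance(serial)   # sample variance, divisor n - 1
std = statistics.stdev(serial)           # sample standard deviation
cv = std / mean                          # coefficient of variation
median = statistics.median(serial)
mad = statistics.median(abs(x - median) for x in serial)
strictly_increasing = all(a < b for a, b in zip(serial, serial[1:]))

print(total, mean, round(variance, 3), round(std, 5), round(cv, 8), mad)
print("strictly increasing:", strictly_increasing)
```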
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
1                   1      0.2%
387                 1      0.2%
353                 1      0.2%
352                 1      0.2%
351                 1      0.2%
350                 1      0.2%
349                 1      0.2%
348                 1      0.2%
347                 1      0.2%
346                 1      0.2%
Other values (504)  504    98.1%

Minimum 10 values

Value  Count  Frequency (%)
1      1      0.2%
2      1      0.2%
3      1      0.2%
4      1      0.2%
5      1      0.2%
6      1      0.2%
7      1      0.2%
8      1      0.2%
9      1      0.2%
10     1      0.2%

Maximum 10 values

Value  Count  Frequency (%)
514    1      0.2%
513    1      0.2%
512    1      0.2%
511    1      0.2%
510    1      0.2%
509    1      0.2%
508    1      0.2%
507    1      0.2%
506    1      0.2%
505    1      0.2%

지역 (region)
Categorical

Alerts: High correlation

Distinct: 17
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Value              Count
서울특별시         169
경기도             130
인천광역시         32
부산광역시         29
대전광역시         27
Other values (12)  127

Length

Max length: 7
Median length: 5
Mean length: 4.3249027
Min length: 3

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: 강원도
2nd row: 강원도
3rd row: 강원도
4th row: 강원도
5th row: 강원도

Common Values

Value             Count  Frequency (%)
서울특별시        169    32.9%
경기도            130    25.3%
인천광역시        32     6.2%
부산광역시        29     5.6%
대전광역시        27     5.3%
전라북도          24     4.7%
충청남도          16     3.1%
경상북도          15     2.9%
강원도            13     2.5%
대구광역시        12     2.3%
Other values (7)  47     9.1%
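
The Frequency (%) column is each count divided by the 514 observations, and the "Other values" row is the remainder after the ten listed regions. A minimal sketch of that arithmetic follows; the counts are copied from the table above, and the function name `frequency_table` is hypothetical.

```python
# Counts copied from the 지역 Common Values table in this report.
REGION_COUNTS = {
    "서울특별시": 169, "경기도": 130, "인천광역시": 32, "부산광역시": 29,
    "대전광역시": 27, "전라북도": 24, "충청남도": 16, "경상북도": 15,
    "강원도": 13, "대구광역시": 12,
}
N_OBSERVATIONS = 514


def frequency_table(counts, total):
    """Return (value, count, percent) rows plus a remainder row if needed."""
    rows = [(value, count, round(100 * count / total, 1))
            for value, count in counts.items()]
    other = total - sum(counts.values())
    if other > 0:
        rows.append(("Other values", other, round(100 * other / total, 1)))
    return rows


rows = frequency_table(REGION_COUNTS, N_OBSERVATIONS)
for value, count, percent in rows:
    print(value, count, f"{percent}%")
```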

Length

Histogram of lengths of the category

업종1 (industry type 1)
Categorical

Alerts: High correlation

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Value     Count
제조업    425
서비스업  89

Length

Max length: 4
Median length: 3
Mean length: 3.1731518
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 제조업
2nd row: 제조업
3rd row: 서비스업
4th row: 제조업
5th row: 제조업

Common Values

Value                     Count  Frequency (%)
제조업 (manufacturing)    425    82.7%
서비스업 (services)       89     17.3%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot for 업종1]

업종2 (industry type 2)
Categorical

Alerts: High correlation

Distinct: 10
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Value             Count
기타              204
식료              91
유통              76
잡화              49
화공              46
Other values (5)  48

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 식료
2nd row: 식료
3rd row: 기타
4th row: 식료
5th row: 식료

Common Values

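Value                 Count  Frequency (%)
기타 (other)          204    39.7%
식료 (food)           91     17.7%
유통 (distribution)   76     14.8%
잡화 (sundries)       49     9.5%
화공 (chemicals)      46     8.9%
섬유 (textiles)       24     4.7%
전자 (electronics)    13     2.5%
전기 (electrical)     5      1.0%
기계 (machinery)      3      0.6%
금속 (metals)         3      0.6%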

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot for 업종2]

업력 (years in business)
Categorical

Distinct: 5
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Value     Count
10년이상  137
3년미만   116
5년미만   102
10년미만  81
7년미만   78

Length

Max length: 5
Median length: 4
Mean length: 4.4241245
Min length: 4
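
The length statistics follow directly from the category counts, since each category label has a fixed character length. The sketch below uses the 업력 counts listed in this report and assumes length means the number of Unicode code points with precomposed Hangul syllables, which is what Python's `len` counts here.

```python
import statistics

# Counts copied from the 업력 value table in this report.
TENURE_COUNTS = {
    "10년이상": 137, "3년미만": 116, "5년미만": 102,
    "10년미만": 81, "7년미만": 78,
}

# Expand to one length per observation (514 values in total).
lengths = [len(label) for label, count in TENURE_COUNTS.items()
           for _ in range(count)]

max_len = max(lengths)
min_len = min(lengths)
median_len = statistics.median(lengths)
mean_len = statistics.fmean(lengths)

print(max_len, median_len, round(mean_len, 7), min_len)
```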

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 5년미만
2nd row: 5년미만
3rd row: 7년미만
4th row: 7년미만
5th row: 7년미만

Common Values

Value                          Count  Frequency (%)
10년이상 (10 years or more)    137    26.7%
3년미만 (under 3 years)        116    22.6%
5년미만 (under 5 years)        102    19.8%
10년미만 (under 10 years)      81     15.8%
7년미만 (under 7 years)        78     15.2%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot for 업력]

수출품목명 (export item name)
Text

Distinct: 141
Distinct (%): 27.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Length

Max length: 20
Median length: 17.5
Mean length: 6.5603113
Min length: 2

Characters and Unicode

Total characters: 3372
Distinct characters: 202
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
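
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below runs on the five sample values shown in this report, not on the full column; the mapping of category codes to names covers only the five categories the report lists.

```python
import unicodedata
from collections import Counter

# The five sample values of 수출품목명 shown in this report.
SAMPLE_ROWS = [
    "건강 관리 보충",
    "기타 식품 및 음료",
    "애완 동물 제품",
    "야채 제품",
    "인스턴트 식품",
]

# Names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Lu": "Uppercase Letter",
    "Pd": "Dash Punctuation",
}


def category_counts(texts):
    """Count characters per Unicode general category code."""
    return Counter(unicodedata.category(ch) for text in texts for ch in text)


counts = category_counts(SAMPLE_ROWS)
for code, count in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
```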

Unique

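Unique: 81
Unique (%): 15.8%

Sample

1st row: 건강 관리 보충 (health care supplements)
2nd row: 기타 식품 및 음료 (other food and beverages)
3rd row: 애완 동물 제품 (pet products)
4th row: 야채 제품 (vegetable products)
5th row: 인스턴트 식품 (instant food)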
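
Words

Value               Count  Frequency (%)
기타 (other)        100    9.3%
케어 (care)         86     8.0%
스킨 (skin)         85     7.9%
(blank)             78     7.3%
화장품 (cosmetics)  67     6.3%
식품 (food)         51     4.8%
제품 (products)     45     4.2%
음료 (beverages)    38     3.5%
용품 (supplies)     37     3.5%
건강 (health)       25     2.3%
Other values (169)  459    42.9%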
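
A word table like the one above can be approximated by splitting each value on whitespace and counting the tokens. The sketch below does this for the five sample values shown in this report; whether the report generator tokenizes in exactly this way is an assumption.

```python
from collections import Counter

# The five sample values of 수출품목명 shown in this report.
SAMPLE_ROWS = [
    "건강 관리 보충",
    "기타 식품 및 음료",
    "애완 동물 제품",
    "야채 제품",
    "인스턴트 식품",
]

# Assumption: words are whitespace-separated tokens.
word_counts = Counter(word for row in SAMPLE_ROWS for word in row.split())
total_words = sum(word_counts.values())

for word, count in word_counts.most_common(3):
    print(word, count, f"{100 * count / total_words:.1f}%")
```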

Most occurring characters

Value               Count  Frequency (%)
(space)             1025   30.4%
(blank)             214    6.3%
(blank)             130    3.9%
(blank)             122    3.6%
(blank)             105    3.1%
(blank)             100    3.0%
(blank)             94     2.8%
(blank)             85     2.5%
(blank)             81     2.4%
(blank)             78     2.3%
Other values (192)  1338   39.7%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       2333   69.2%
Space Separator    1025   30.4%
Other Punctuation  6      0.2%
Uppercase Letter   6      0.2%
Dash Punctuation   2      0.1%

Most frequent character per category

Other Letter

Value               Count  Frequency (%)
(blank)             214    9.2%
(blank)             130    5.6%
(blank)             122    5.2%
(blank)             105    4.5%
(blank)             100    4.3%
(blank)             94     4.0%
(blank)             85     3.6%
(blank)             81     3.5%
(blank)             78     3.3%
(blank)             71     3.0%
Other values (186)  1253   53.7%

Uppercase Letter

Value  Count  Frequency (%)
T      3      50.0%
C      2      33.3%
V      1      16.7%

Space Separator

Value    Count  Frequency (%)
(space)  1025   100.0%

Other Punctuation

Value  Count  Frequency (%)
,      6      100.0%

Dash Punctuation

Value  Count  Frequency (%)
-      2      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  2333   69.2%
Common  1033   30.6%
Latin   6      0.2%

Most frequent character per script

Hangul

Value               Count  Frequency (%)
(blank)             214    9.2%
(blank)             130    5.6%
(blank)             122    5.2%
(blank)             105    4.5%
(blank)             100    4.3%
(blank)             94     4.0%
(blank)             85     3.6%
(blank)             81     3.5%
(blank)             78     3.3%
(blank)             71     3.0%
Other values (186)  1253   53.7%

Common

Value    Count  Frequency (%)
(space)  1025   99.2%
,        6      0.6%
-        2      0.2%

Latin

Value  Count  Frequency (%)
T      3      50.0%
C      2      33.3%
V      1      16.7%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  2333   69.2%
ASCII   1039   30.8%

Most frequent character per block

ASCII

Value    Count  Frequency (%)
(space)  1025   98.7%
,        6      0.6%
T        3      0.3%
C        2      0.2%
-        2      0.2%
V        1      0.1%

Hangul

Value               Count  Frequency (%)
(blank)             214    9.2%
(blank)             130    5.6%
(blank)             122    5.2%
(blank)             105    4.5%
(blank)             100    4.3%
(blank)             94     4.0%
(blank)             85     3.6%
(blank)             81     3.5%
(blank)             78     3.3%
(blank)             71     3.0%
Other values (186)  1253   53.7%

Interactions

[Figure: interactions plot]

Correlations

       연번   지역   업종1  업종2  업력
연번   1.000  0.923  0.271  0.470  0.799
지역   0.923  1.000  0.205  0.482  0.200
업종1  0.271  0.205  1.000  0.990  0.041
업종2  0.470  0.482  0.990  1.000  0.178
업력   0.799  0.200  0.041  0.178  1.000

       업력   지역   업종1  업종2
업력   1.000  0.102  0.050  0.074
지역   0.102  1.000  0.181  0.209
업종1  0.050  0.181  1.000  0.905
업종2  0.074  0.209  0.905  1.000

       연번   지역   업종1  업종2  업력
연번   1.000  0.697  0.206  0.160  0.454
지역   0.697  1.000  0.181  0.209  0.102
업종1  0.206  0.181  1.000  0.905  0.050
업종2  0.160  0.209  0.905  1.000  0.074
업력   0.454  0.102  0.050  0.074  1.000
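
The report does not state which coefficient fills each matrix. One common measure of association between two categorical columns is Cramér's V, sketched below without bias correction; treating it as the measure behind any of the matrices above is an assumption. The demonstration uses the 업종1 and 업종2 values of the first ten sample rows shown in this report, so its result is not comparable to the full-data figures.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Cramér's V (no bias correction) for two categorical sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    k = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or k == 0:
        return 0.0  # undefined when a column is constant; report 0 here
    joint = Counter(zip(xs, ys))
    chi2 = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return sqrt(chi2 / (n * k))


# 업종1 and 업종2 from the first ten sample rows of this report.
industry1 = ["제조업", "제조업", "서비스업", "제조업", "제조업",
             "제조업", "제조업", "제조업", "제조업", "제조업"]
industry2 = ["식료", "식료", "기타", "식료", "식료",
             "식료", "기타", "식료", "식료", "기타"]

v_sample = cramers_v(industry1, industry2)
print(round(v_sample, 3))
```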

Missing values

[Figure: nullity bar chart] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

     연번  지역      업종1     업종2  업력      수출품목명
0    1     강원도    제조업    식료   5년미만   건강 관리 보충
1    2     강원도    제조업    식료   5년미만   기타 식품 및 음료
2    3     강원도    서비스업  기타   7년미만   애완 동물 제품
3    4     강원도    제조업    식료   7년미만   야채 제품
4    5     강원도    제조업    식료   7년미만   인스턴트 식품
5    6     강원도    제조업    식료   7년미만
6    7     강원도    제조업    기타   10년미만  건강 관리 용품
7    8     강원도    제조업    식료   10년미만  기타 식품 및 음료
8    9     강원도    제조업    식료   10년미만  기타 식품 및 음료
9    10    강원도    제조업    기타   10년이상  기타 식품 및 음료

Last rows

     연번  지역      업종1   업종2  업력      수출품목명
504  505   충청남도  제조업  화공   10년이상  스티커
505  506   충청남도  제조업  식료   10년이상  건강 관리 보충
506  507   충청북도  제조업  화공   3년미만   스킨 케어
507  508   충청북도  제조업  식료   3년미만   방부제 및 조미료
508  509   충청북도  제조업  화공   3년미만   화장품
509  510   충청북도  제조업  식료   7년미만   방부제 및 조미료
510  511   충청북도  제조업  식료   10년미만  식재료
511  512   충청북도  제조업  기타   10년이상  기타
512  513   충청북도  제조업  화공   10년이상  화장품
513  514   충청북도  제조업  기타   10년이상  헤어케어