Overview

Dataset statistics

Number of variables: 6
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.9 KiB
Average record size in memory: 50.3 B

Variable types

Numeric: 1
DateTime: 1
Categorical: 3
Text: 1

Dataset

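Description: Status data on the foreign-language website production project, part of the online export support program run by the Korea SMEs and Startups Agency (중소벤처기업진흥공단), released for external use.
URL: https://www.data.go.kr/data/15040367/fileData.do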

Alerts

업태 (business type) is highly imbalanced (53.1%) [Imbalance]
외국어홈페이지(신청언어) (foreign-language website, requested language) is highly imbalanced (50.5%) [Imbalance]
일련번호 (serial number) has unique values [Unique]
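
The two imbalance percentages above are consistent with a score of one minus the normalized Shannon entropy of the class frequencies. The sketch below is an assumption inferred from the numbers, not a formula stated in this report; `imbalance_score` is a hypothetical helper, and the class counts are taken from the Common Values tables of the two variables.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (bits) of the class shares.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no balance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Class counts copied from the Common Values tables in this report.
business_type = [90, 10]            # 업태
language = [74, 14, 6, 4, 1, 1]     # 외국어홈페이지(신청언어)

print(f"{imbalance_score(business_type):.1%}")  # 53.1%
print(f"{imbalance_score(language):.1%}")       # 50.5%
```

Under this reading, a 90/10 split across two classes scores higher than a 74/14/6/4/1/1 split across six because the entropy is normalized by the number of classes.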

Reproduction

Analysis started: 2023-12-12 05:51:35.584277
Analysis finished: 2023-12-12 05:51:36.178056
Duration: 0.59 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
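
The reported duration can be checked from the two timestamps with plain date arithmetic. This is only a small verification sketch using the Python standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-12 05:51:35.584277")
finished = datetime.fromisoformat("2023-12-12 05:51:36.178056")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 0.59 seconds
```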

Variables

일련번호 (serial number)
Real number (ℝ)

UNIQUE

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.5
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5.95
Q1: 25.75
Median: 50.5
Q3: 75.25
95-th percentile: 95.05
Maximum: 100
Range: 99
Interquartile range (IQR): 49.5
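
The quantiles above match linear interpolation between order statistics. The sketch below reproduces them with the standard library, assuming the column holds exactly the integers 1 to 100 (suggested by the distinct count, minimum, maximum, and sum, but not stated outright in the report).

```python
import statistics

# Assumed contents of 일련번호: the integers 1..100.
serial = list(range(1, 101))

# method="inclusive" interpolates linearly between the sorted values.
q1, median, q3 = statistics.quantiles(serial, n=4, method="inclusive")
ventiles = statistics.quantiles(serial, n=20, method="inclusive")
p5, p95 = ventiles[0], ventiles[-1]

print(p5, q1, median, q3, p95)       # 5.95 25.75 50.5 75.25 95.05
print(q3 - q1)                       # 49.5 (interquartile range)
print(max(serial) - min(serial))     # 99 (range)
```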

Descriptive statistics

Standard deviation: 29.011492
Coefficient of variation (CV): 0.57448499
Kurtosis: -1.2
Mean: 50.5
Median Absolute Deviation (MAD): 25
Skewness: 0
Sum: 5050
Variance: 841.66667
Monotonicity: Strictly increasing
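
Several of these figures can be re-derived by hand. The sketch below assumes the column holds the integers 1 to 100 and that the variance uses the sample (n - 1) denominator, which is what the reported value of 841.66667 implies; kurtosis and skewness are left out.

```python
import statistics

# Assumed contents of 일련번호: the integers 1..100.
serial = list(range(1, 101))

mean = statistics.mean(serial)
variance = statistics.variance(serial)   # sample variance, divides by n - 1
std = statistics.stdev(serial)           # square root of the sample variance
cv = std / mean                          # coefficient of variation
med = statistics.median(serial)
mad = statistics.median(abs(x - med) for x in serial)  # median absolute deviation
strictly_increasing = all(a < b for a, b in zip(serial, serial[1:]))

print(sum(serial), mean, round(variance, 5), round(std, 6), round(cv, 8), mad)
```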
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1 | 1 | 1.0%
65 | 1 | 1.0%
75 | 1 | 1.0%
74 | 1 | 1.0%
73 | 1 | 1.0%
72 | 1 | 1.0%
71 | 1 | 1.0%
70 | 1 | 1.0%
69 | 1 | 1.0%
68 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 1.0%
2 | 1 | 1.0%
3 | 1 | 1.0%
4 | 1 | 1.0%
5 | 1 | 1.0%
6 | 1 | 1.0%
7 | 1 | 1.0%
8 | 1 | 1.0%
9 | 1 | 1.0%
10 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
100 | 1 | 1.0%
99 | 1 | 1.0%
98 | 1 | 1.0%
97 | 1 | 1.0%
96 | 1 | 1.0%
95 | 1 | 1.0%
94 | 1 | 1.0%
93 | 1 | 1.0%
92 | 1 | 1.0%
91 | 1 | 1.0%

지원일자 (support date)
DateTime

Distinct: 5
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2019-05-07 00:00:00
Maximum: 2019-09-02 00:00:00
Histogram with fixed size bins (bins=5)

지역 (region)
Categorical

Distinct: 16
Distinct (%): 16.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

서울 (Seoul): 25
경기 (Gyeonggi): 19
인천 (Incheon): 11
경북 (Gyeongbuk): 11
부산 (Busan): 7
Other values (11): 27

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 4
Unique (%): 4.0%

Sample

1st row: 서울 (Seoul)
2nd row: 경기 (Gyeonggi)
3rd row: 부산 (Busan)
4th row: 경남 (Gyeongnam)
5th row: 서울 (Seoul)

Common Values

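Value | Count | Frequency (%)
서울 (Seoul) | 25 | 25.0%
경기 (Gyeonggi) | 19 | 19.0%
인천 (Incheon) | 11 | 11.0%
경북 (Gyeongbuk) | 11 | 11.0%
부산 (Busan) | 7 | 7.0%
대전 (Daejeon) | 4 | 4.0%
충북 (Chungbuk) | 4 | 4.0%
충남 (Chungnam) | 4 | 4.0%
전북 (Jeonbuk) | 4 | 4.0%
강원 (Gangwon) | 3 | 3.0%
Other values (6) | 8 | 8.0%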

Length

Histogram of lengths of the category

업태 (business type)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

제조업 (manufacturing): 90
지식서비스업 (knowledge services): 10

Length

Max length: 7
Median length: 4
Mean length: 4.3
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 제조업 (manufacturing)
2nd row: 제조업 (manufacturing)
3rd row: 제조업 (manufacturing)
4th row: 제조업 (manufacturing)
5th row: 지식서비스업 (knowledge services)

Common Values

Value | Count | Frequency (%)
제조업 (manufacturing) | 90 | 90.0%
지식서비스업 (knowledge services) | 10 | 10.0%

Length

Histogram of lengths of the category


수출품목명 (export item name)
Text

Distinct: 82
Distinct (%): 82.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 31
Median length: 16
Mean length: 8.18
Min length: 1

Characters and Unicode

Total characters: 818
Distinct characters: 198
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
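
As a rough illustration of how such per-character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It runs only on the five sample values listed for this variable, not on the full column, so the totals differ from the report's; the two-letter codes `Lo` and `Zs` correspond to the "Other Letter" and "Space Separator" labels used in the tables.

```python
import unicodedata
from collections import Counter

# The five sample rows shown for 수출품목명 in this report.
samples = ["애완 동물용 식품", "맥주", "기초", "플라스틱 장난감", "기타 비 알코올 음료"]

category_counts = Counter(
    unicodedata.category(ch)
    for text in samples
    for ch in unicodedata.normalize("NFC", text)  # use precomposed Hangul syllables
)

print(category_counts)  # Counter({'Lo': 26, 'Zs': 6})
```

The standard library does not expose the script or block property, so reproducing the script and block tables would need another data source.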

Unique

Unique: 71
Unique (%): 71.0%

Sample

1st row: 애완 동물용 식품 (pet food)
2nd row: 맥주 (beer)
3rd row: 기초 (basic)
4th row: 플라스틱 장난감 (plastic toys)
5th row: 기타 비 알코올 음료 (other non-alcoholic beverages)
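
Value | Count | Frequency (%)
기타 (other) | 34 | 12.8%
제품 (products) | 13 | 4.9%
 | 11 | 4.2%
용품 (supplies) | 6 | 2.3%
플라스틱 (plastic) | 6 | 2.3%
기초 (basic) | 5 | 1.9%
부품 (parts) | 5 | 1.9%
화장품 (cosmetics) | 4 | 1.5%
전자 (electronic) | 4 | 1.5%
장비 (equipment) | 4 | 1.5%
Other values (136) | 173 | 65.3%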

Most occurring characters

Value | Count | Frequency (%)
(space) | 165 | 20.2%
 | 49 | 6.0%
 | 35 | 4.3%
 | 35 | 4.3%
 | 30 | 3.7%
 | 21 | 2.6%
 | 17 | 2.1%
 | 14 | 1.7%
 | 12 | 1.5%
 | 11 | 1.3%
Other values (188) | 429 | 52.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 631 | 77.1%
Space Separator | 165 | 20.2%
Uppercase Letter | 13 | 1.6%
Other Punctuation | 6 | 0.7%
Open Punctuation | 1 | 0.1%
Dash Punctuation | 1 | 0.1%
Close Punctuation | 1 | 0.1%

Most frequent character per category

Other Letter

Value | Count | Frequency (%)
 | 49 | 7.8%
 | 35 | 5.5%
 | 35 | 5.5%
 | 30 | 4.8%
 | 21 | 3.3%
 | 17 | 2.7%
 | 14 | 2.2%
 | 12 | 1.9%
 | 11 | 1.7%
 | 10 | 1.6%
Other values (172) | 397 | 62.9%

Uppercase Letter

Value | Count | Frequency (%)
T | 3 | 23.1%
M | 2 | 15.4%
X | 1 | 7.7%
I | 1 | 7.7%
S | 1 | 7.7%
O | 1 | 7.7%
F | 1 | 7.7%
L | 1 | 7.7%
E | 1 | 7.7%
P | 1 | 7.7%

Other Punctuation

Value | Count | Frequency (%)
, | 5 | 83.3%
& | 1 | 16.7%

Space Separator

Value | Count | Frequency (%)
(space) | 165 | 100.0%

Open Punctuation

Value | Count | Frequency (%)
( | 1 | 100.0%

Dash Punctuation

Value | Count | Frequency (%)
- | 1 | 100.0%

Close Punctuation

Value | Count | Frequency (%)
) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 631 | 77.1%
Common | 174 | 21.3%
Latin | 13 | 1.6%

Most frequent character per script

Hangul

Value | Count | Frequency (%)
 | 49 | 7.8%
 | 35 | 5.5%
 | 35 | 5.5%
 | 30 | 4.8%
 | 21 | 3.3%
 | 17 | 2.7%
 | 14 | 2.2%
 | 12 | 1.9%
 | 11 | 1.7%
 | 10 | 1.6%
Other values (172) | 397 | 62.9%

Latin

Value | Count | Frequency (%)
T | 3 | 23.1%
M | 2 | 15.4%
X | 1 | 7.7%
I | 1 | 7.7%
S | 1 | 7.7%
O | 1 | 7.7%
F | 1 | 7.7%
L | 1 | 7.7%
E | 1 | 7.7%
P | 1 | 7.7%

Common

Value | Count | Frequency (%)
(space) | 165 | 94.8%
, | 5 | 2.9%
& | 1 | 0.6%
( | 1 | 0.6%
- | 1 | 0.6%
) | 1 | 0.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 631 | 77.1%
ASCII | 187 | 22.9%

Most frequent character per block

ASCII

Value | Count | Frequency (%)
(space) | 165 | 88.2%
, | 5 | 2.7%
T | 3 | 1.6%
M | 2 | 1.1%
X | 1 | 0.5%
& | 1 | 0.5%
I | 1 | 0.5%
S | 1 | 0.5%
( | 1 | 0.5%
- | 1 | 0.5%
Other values (6) | 6 | 3.2%

Hangul

Value | Count | Frequency (%)
 | 49 | 7.8%
 | 35 | 5.5%
 | 35 | 5.5%
 | 30 | 4.8%
 | 21 | 3.3%
 | 17 | 2.7%
 | 14 | 2.2%
 | 12 | 1.9%
 | 11 | 1.7%
 | 10 | 1.6%
Other values (172) | 397 | 62.9%
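
외국어홈페이지(신청언어) (foreign-language website, requested language)
Categorical

Distinct: 6
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

영어 (English): 74
중국어 (Chinese): 14
일본어 (Japanese): 6
기타 (other): 4
러시아어 (Russian): 1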

Length

Max length: 4
Median length: 2
Mean length: 2.23
Min length: 2

Unique

Unique: 2
Unique (%): 2.0%

Sample

1st row: 영어 (English)
2nd row: 영어 (English)
3rd row: 기타 (other)
4th row: 영어 (English)
5th row: 영어 (English)

Common Values

Value | Count | Frequency (%)
영어 (English) | 74 | 74.0%
중국어 (Chinese) | 14 | 14.0%
일본어 (Japanese) | 6 | 6.0%
기타 (other) | 4 | 4.0%
러시아어 (Russian) | 1 | 1.0%
아랍어 (Arabic) | 1 | 1.0%

Length

Histogram of lengths of the category


Interactions


Correlations

Variable | 일련번호 | 지원일자 | 지역 | 업태 | 수출품목명 | 외국어홈페이지(신청언어)
일련번호 | 1.000 | 0.904 | 0.000 | 0.000 | 0.000 | 0.000
지원일자 | 0.904 | 1.000 | 0.396 | 0.000 | 0.960 | 0.000
지역 | 0.000 | 0.396 | 1.000 | 0.317 | 0.775 | 0.700
업태 | 0.000 | 0.000 | 0.317 | 1.000 | 0.905 | 0.502
수출품목명 | 0.000 | 0.960 | 0.775 | 0.905 | 1.000 | 0.000
외국어홈페이지(신청언어) | 0.000 | 0.000 | 0.700 | 0.502 | 0.000 | 1.000

Variable | 외국어홈페이지(신청언어) | 업태 | 지역
외국어홈페이지(신청언어) | 1.000 | 0.355 | 0.404
업태 | 0.355 | 1.000 | 0.227
지역 | 0.404 | 0.227 | 1.000

Variable | 일련번호 | 지역 | 업태 | 외국어홈페이지(신청언어)
일련번호 | 1.000 | 0.000 | 0.000 | 0.000
지역 | 0.000 | 1.000 | 0.227 | 0.404
업태 | 0.000 | 0.227 | 1.000 | 0.355
외국어홈페이지(신청언어) | 0.000 | 0.404 | 0.355 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

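First rows

Index | 일련번호 (serial number) | 지원일자 (support date) | 지역 (region) | 업태 (business type) | 수출품목명 (export item name) | 외국어홈페이지(신청언어) (requested language)
0 | 1 | 2019-05-07 | 서울 (Seoul) | 제조업 (manufacturing) | 애완 동물용 식품 (pet food) | 영어 (English)
1 | 2 | 2019-05-07 | 경기 (Gyeonggi) | 제조업 (manufacturing) | 맥주 (beer) | 영어 (English)
2 | 3 | 2019-05-07 | 부산 (Busan) | 제조업 (manufacturing) | 기초 (basic) | 기타 (other)
3 | 4 | 2019-05-07 | 경남 (Gyeongnam) | 제조업 (manufacturing) | 플라스틱 장난감 (plastic toys) | 영어 (English)
4 | 5 | 2019-05-07 | 서울 (Seoul) | 지식서비스업 (knowledge services) | 기타 비 알코올 음료 (other non-alcoholic beverages) | 영어 (English)
5 | 6 | 2019-05-07 | 인천 (Incheon) | 제조업 (manufacturing) | 건설 기계 및 공구 부품 (construction machinery and tool parts) | 영어 (English)
6 | 7 | 2019-05-07 | 인천 (Incheon) | 제조업 (manufacturing) | 기타 사무실 및 학교 용품 (other office and school supplies) | 영어 (English)
7 | 8 | 2019-05-07 | 인천 (Incheon) | 제조업 (manufacturing) | 화장품 (cosmetics) | 영어 (English)
8 | 9 | 2019-05-07 | 부산 (Busan) | 제조업 (manufacturing) | 기타 비 알코올 음료 (other non-alcoholic beverages) | 영어 (English)
9 | 10 | 2019-05-07 | 서울 (Seoul) | 제조업 (manufacturing) | 전자 제품 (electronic products) | 영어 (English)

Last rows

Index | 일련번호 (serial number) | 지원일자 (support date) | 지역 (region) | 업태 (business type) | 수출품목명 (export item name) | 외국어홈페이지(신청언어) (requested language)
90 | 91 | 2019-06-10 | 서울 (Seoul) | 제조업 (manufacturing) | 기초 (basic) | 중국어 (Chinese)
91 | 92 | 2019-06-10 | 경기 (Gyeonggi) | 제조업 (manufacturing) | 사무실 의자 (office chairs) | 영어 (English)
92 | 93 | 2019-06-10 | 경기 (Gyeonggi) | 제조업 (manufacturing) | 기타 소방 용품 (other firefighting supplies) | 영어 (English)
93 | 94 | 2019-09-02 | 경북 (Gyeongbuk) | 제조업 (manufacturing) |  | 영어 (English)
94 | 95 | 2019-09-02 | 부산 (Busan) | 제조업 (manufacturing) | 포장상자 (packaging boxes) | 영어 (English)
95 | 96 | 2019-09-02 | 경북 (Gyeongbuk) | 제조업 (manufacturing) | 기타 농업 제품 (other agricultural products) | 영어 (English)
96 | 97 | 2019-09-02 | 경기 (Gyeonggi) | 제조업 (manufacturing) | 호스용관이음쇠 (hose pipe fittings) | 영어 (English)
97 | 98 | 2019-09-02 | 경북 (Gyeongbuk) | 제조업 (manufacturing) | 플라스틱 필름 (plastic film) | 영어 (English)
98 | 99 | 2019-09-02 | 대구 (Daegu) | 지식서비스업 (knowledge services) | 가전 사업 (home appliance business) | 기타 (other)
99 | 100 | 2019-09-02 | 충남 (Chungnam) | 제조업 (manufacturing) | 해조 (seaweed) | 중국어 (Chinese)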