Overview

Dataset statistics

Number of variables: 6
Number of observations: 38
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.1 KiB
Average record size in memory: 55.5 B

Variable types

Categorical: 1
Text: 1
Numeric: 4

Dataset

Description: Sample
Author: 소상공인연합회 (Korea Federation of Micro Enterprise)
URL: https://www.bigdata-telecom.kr/invoke/SOKBP2603/?goodsCode=KFMECMS006

Alerts

수혜사업체수 is highly overall correlated with 비중 and 1 other field (High correlation)
비중 is highly overall correlated with 수혜사업체수 and 1 other field (High correlation)
총지원액 is highly overall correlated with 수혜사업체수 and 1 other field (High correlation)
수혜사업체수 has unique values (Unique)
총지원액 has unique values (Unique)
비중 has 3 (7.9%) zeros (Zeros)

Reproduction

Analysis started: 2023-12-10 06:26:59.371746
Analysis finished: 2023-12-10 06:27:04.711818
Duration: 5.34 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
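
A report of this kind can be regenerated with ydata-profiling once the source table is available locally. The sketch below is an assumption-laden illustration, not the configuration used for this report: the function name `build_profile_report`, the CSV path, the output file name, and the title are all hypothetical, and the settings stored in config.json are not reproduced here.

```python
from pathlib import Path


def build_profile_report(csv_path: str, html_path: str = "report.html") -> Path:
    """Profile a CSV file and write the HTML report to html_path."""
    # Imported lazily so this module loads even where the third-party
    # packages are missing (pip install pandas ydata-profiling).
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding of the original file is unknown; pandas defaults to UTF-8.
    df = pd.read_csv(csv_path)

    # "Sample" mirrors the Description field above; the title is a guess.
    profile = ProfileReport(df, title="Sample")
    profile.to_file(html_path)
    return Path(html_path)
```

Calling `build_profile_report("data.csv")` with a hypothetical local copy of the dataset would write `report.html` in the working directory.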

Variables

광역시도명 (province or metropolitan city name)
Categorical

Distinct: 18
Distinct (%): 47.4%
Missing: 0
Missing (%): 0.0%
Memory size: 436.0 B

Value | Count
전국 | 21
경기 | 1
경남 | 1
경북 | 1
광주 | 1
Other values (13) | 13

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 17
Unique (%): 44.7%

Sample

1st row: 강원
2nd row: 경기
3rd row: 경남
4th row: 경북
5th row: 광주

Common Values

Value | Count | Frequency (%)
전국 | 21 | 55.3%
경기 | 1 | 2.6%
경남 | 1 | 2.6%
경북 | 1 | 2.6%
광주 | 1 | 2.6%
대구 | 1 | 2.6%
대전 | 1 | 2.6%
부산 | 1 | 2.6%
서울 | 1 | 2.6%
강원 | 1 | 2.6%
Other values (8) | 8 | 21.1%
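
The Distinct and Unique figures for this column measure different things: Distinct counts how many different values occur, while Unique counts values that occur exactly once. The following sketch rebuilds the column from the counts reported here (21 rows of 전국 plus one row for each of the 17 regions named in the frequency tables and sample rows) and recomputes both figures. The row order is arbitrary and the helper `pct` is introduced only for illustration.

```python
from collections import Counter

# One row per region, as implied by the frequency tables and sample rows.
regions = ["강원", "경기", "경남", "경북", "광주", "대구", "대전", "부산", "서울",
           "세종", "울산", "인천", "전남", "전북", "제주", "충남", "충북"]
values = ["전국"] * 21 + regions

counts = Counter(values)
n = len(values)

distinct = len(counts)                              # different values
unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once


def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal, as displayed in the report."""
    return round(100 * part / whole, 1)


print(distinct, pct(distinct, n))  # 18 47.4
print(unique, pct(unique, n))      # 17 44.7
print(pct(counts["전국"], n))      # 55.3
```

전국 is the only value that repeats, which is why Unique is exactly one less than Distinct.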

Length

[Figure: histogram of lengths of the category]

Value | Count | Frequency (%)
전국 | 21 | 55.3%
경기 | 1 | 2.6%
충북 | 1 | 2.6%
충남 | 1 | 2.6%
제주 | 1 | 2.6%
전북 | 1 | 2.6%
전남 | 1 | 2.6%
인천 | 1 | 2.6%
울산 | 1 | 2.6%
강원 | 1 | 2.6%
Other values (8) | 8 | 21.1%

업종 (industry)
Text

Distinct: 21
Distinct (%): 55.3%
Missing: 0
Missing (%): 0.0%
Memory size: 436.0 B

Length

Max length: 37
Median length: 26
Mean length: 9.2631579
Min length: 3

Characters and Unicode

Total characters: 352
Distinct characters: 112
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 20
Unique (%): 52.6%

Sample

1st row: 전업종
2nd row: 전업종
3rd row: 전업종
4th row: 전업종
5th row: 전업종

Value | Count | Frequency (%)
전업종 | 18 | 19.1%
(blank in source) | 14 | 14.9%
단체 | 1 | 1.1%
관리 | 1 | 1.1%
사업 | 1 | 1.1%
지원 | 1 | 1.1%
임대서비스업(n | 1 | 1.1%
교육서비스업(p | 1 | 1.1%
보건업 | 1 | 1.1%
사회복지 | 1 | 1.1%
Other values (54) | 54 | 57.4%

Most occurring characters

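Value | Count | Frequency (%)
(space) | 56 | 15.9%
(not preserved) | 41 | 11.6%
(not preserved) | 20 | 5.7%
( | 20 | 5.7%
) | 20 | 5.7%
(not preserved) | 18 | 5.1%
(not preserved) | 14 | 4.0%
(not preserved) | 8 | 2.3%
(not preserved) | 7 | 2.0%
(not preserved) | 6 | 1.7%
Other values (102) | 142 | 40.3%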

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 236 | 67.0%
Space Separator | 56 | 15.9%
Open Punctuation | 20 | 5.7%
Close Punctuation | 20 | 5.7%
Uppercase Letter | 20 | 5.7%

Most frequent character per category

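Other Letter
Value | Count | Frequency (%)
(not preserved) | 41 | 17.4%
(not preserved) | 20 | 8.5%
(not preserved) | 18 | 7.6%
(not preserved) | 14 | 5.9%
(not preserved) | 8 | 3.4%
(not preserved) | 7 | 3.0%
(not preserved) | 6 | 2.5%
(not preserved) | 6 | 2.5%
(not preserved) | 5 | 2.1%
(not preserved) | 4 | 1.7%
Other values (79) | 107 | 45.3%

Uppercase Letter
Value | Count | Frequency (%)
P | 1 | 5.0%
N | 1 | 5.0%
T | 1 | 5.0%
M | 1 | 5.0%
S | 1 | 5.0%
Q | 1 | 5.0%
R | 1 | 5.0%
J | 1 | 5.0%
E | 1 | 5.0%
D | 1 | 5.0%
Other values (10) | 10 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 56 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 20 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 20 | 100.0%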

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 236 | 67.0%
Common | 96 | 27.3%
Latin | 20 | 5.7%

Most frequent character per script

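Hangul
Value | Count | Frequency (%)
(not preserved) | 41 | 17.4%
(not preserved) | 20 | 8.5%
(not preserved) | 18 | 7.6%
(not preserved) | 14 | 5.9%
(not preserved) | 8 | 3.4%
(not preserved) | 7 | 3.0%
(not preserved) | 6 | 2.5%
(not preserved) | 6 | 2.5%
(not preserved) | 5 | 2.1%
(not preserved) | 4 | 1.7%
Other values (79) | 107 | 45.3%

Latin
Value | Count | Frequency (%)
P | 1 | 5.0%
N | 1 | 5.0%
T | 1 | 5.0%
M | 1 | 5.0%
S | 1 | 5.0%
Q | 1 | 5.0%
R | 1 | 5.0%
J | 1 | 5.0%
E | 1 | 5.0%
D | 1 | 5.0%
Other values (10) | 10 | 50.0%

Common
Value | Count | Frequency (%)
(space) | 56 | 58.3%
( | 20 | 20.8%
) | 20 | 20.8%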

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 236 | 67.0%
ASCII | 116 | 33.0%

Most frequent character per block

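ASCII
Value | Count | Frequency (%)
(space) | 56 | 48.3%
( | 20 | 17.2%
) | 20 | 17.2%
P | 1 | 0.9%
N | 1 | 0.9%
T | 1 | 0.9%
M | 1 | 0.9%
S | 1 | 0.9%
Q | 1 | 0.9%
R | 1 | 0.9%
Other values (13) | 13 | 11.2%

Hangul
Value | Count | Frequency (%)
(not preserved) | 41 | 17.4%
(not preserved) | 20 | 8.5%
(not preserved) | 18 | 7.6%
(not preserved) | 14 | 5.9%
(not preserved) | 8 | 3.4%
(not preserved) | 7 | 3.0%
(not preserved) | 6 | 2.5%
(not preserved) | 6 | 2.5%
(not preserved) | 5 | 2.1%
(not preserved) | 4 | 1.7%
Other values (79) | 107 | 45.3%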

수혜사업체수 (number of beneficiary businesses)
Real number (ℝ)

Alerts: High correlation, Unique

Distinct: 38
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 192151.18
Minimum: 24
Maximum: 2433915
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 474.0 B

Quantile statistics

Minimum: 24
5th percentile: 188.7
Q1: 32143.75
Median: 83663
Q3: 155704.75
95th percentile: 586014.1
Maximum: 2433915
Range: 2433891
Interquartile range (IQR): 123561

Descriptive statistics

Standard deviation: 407833.5
Coefficient of variation (CV): 2.1224615
Kurtosis: 25.872449
Mean: 192151.18
Median Absolute Deviation (MAD): 65774.5
Skewness: 4.7899713
Sum: 7301745
Variance: 1.6632816 × 10^11
Monotonicity: Not monotonic
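
Several of the figures above follow directly from others, which gives a quick consistency check on the report. The sketch below recomputes the mean, range, IQR, coefficient of variation, and variance for 수혜사업체수 from the reported sum, extremes, quartiles, and standard deviation. It uses only values printed in this section; small differences in the last digits are expected because the inputs are already rounded.

```python
# Values copied from the report for 수혜사업체수.
n = 38                       # number of observations
total = 7301745              # Sum
minimum, maximum = 24, 2433915
q1, q3 = 32143.75, 155704.75
std = 407833.5               # Standard deviation

mean = total / n             # Mean = Sum / n
value_range = maximum - minimum
iqr = q3 - q1                # Interquartile range = Q3 - Q1
cv = std / mean              # Coefficient of variation = std / mean
variance = std ** 2          # Variance = std squared

print(round(mean, 2))        # 192151.18
print(value_range)           # 2433891
print(iqr)                   # 123561.0
print(round(cv, 4))          # 2.1225
print(f"{variance:.4e}")     # 1.6633e+11
```

A CV above 2 together with a skewness near 4.8 reflects the single 전국 total row (2433915), which is far larger than every other observation.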
[Figure: histogram with fixed size bins (bins=38)]

Common values
Value | Count | Frequency (%)
82997 | 1 | 2.6%
61738 | 1 | 2.6%
145006 | 1 | 2.6%
623426 | 1 | 2.6%
296670 | 1 | 2.6%
548329 | 1 | 2.6%
29282 | 1 | 2.6%
24 | 1 | 2.6%
72526 | 1 | 2.6%
56870 | 1 | 2.6%
Other values (28) | 28 | 73.7%

Minimum 10 values
Value | Count | Frequency (%)
24 | 1 | 2.6%
102 | 1 | 2.6%
204 | 1 | 2.6%
1146 | 1 | 2.6%
1499 | 1 | 2.6%
2486 | 1 | 2.6%
10158 | 1 | 2.6%
13394 | 1 | 2.6%
13457 | 1 | 2.6%
29282 | 1 | 2.6%

Maximum 10 values
Value | Count | Frequency (%)
2433915 | 1 | 2.6%
623426 | 1 | 2.6%
579412 | 1 | 2.6%
548329 | 1 | 2.6%
497538 | 1 | 2.6%
296670 | 1 | 2.6%
201554 | 1 | 2.6%
171945 | 1 | 2.6%
159398 | 1 | 2.6%
159271 | 1 | 2.6%

비중 (share, %)
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 31
Distinct (%): 81.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.9
Minimum: 0
Maximum: 100
Zeros: 3
Zeros (%): 7.9%
Negative: 0
Negative (%): 0.0%
Memory size: 474.0 B

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 1.325
Median: 3.45
Q3: 6.375
95th percentile: 24.07
Maximum: 100
Range: 100
Interquartile range (IQR): 5.05

Descriptive statistics

Standard deviation: 16.752152
Coefficient of variation (CV): 2.1205256
Kurtosis: 25.893879
Mean: 7.9
Median Absolute Deviation (MAD): 2.7
Skewness: 4.7923467
Sum: 300.2
Variance: 280.63459
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=31)]

Common values
Value | Count | Frequency (%)
0.0 | 3 | 7.9%
3.5 | 3 | 7.9%
0.1 | 3 | 7.9%
0.6 | 2 | 5.3%
100.0 | 1 | 2.6%
8.3 | 1 | 2.6%
4.5 | 1 | 2.6%
2.3 | 1 | 2.6%
2.5 | 1 | 2.6%
3.0 | 1 | 2.6%
Other values (21) | 21 | 55.3%

Minimum 10 values
Value | Count | Frequency (%)
0.0 | 3 | 7.9%
0.1 | 3 | 7.9%
0.4 | 1 | 2.6%
0.6 | 2 | 5.3%
1.2 | 1 | 2.6%
1.7 | 1 | 2.6%
2.0 | 1 | 2.6%
2.3 | 1 | 2.6%
2.5 | 1 | 2.6%
2.6 | 1 | 2.6%

Maximum 10 values
Value | Count | Frequency (%)
100.0 | 1 | 2.6%
25.6 | 1 | 2.6%
23.8 | 1 | 2.6%
22.5 | 1 | 2.6%
20.4 | 1 | 2.6%
12.2 | 1 | 2.6%
8.3 | 1 | 2.6%
7.1 | 1 | 2.6%
6.6 | 1 | 2.6%
6.5 | 1 | 2.6%

평균지원액 (average support amount)
Real number (ℝ)

Distinct: 36
Distinct (%): 94.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1080973.7
Minimum: 1000000
Maximum: 1894409
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 474.0 B

Quantile statistics

Minimum: 1000000
5th percentile: 1000000
Q1: 1002482.5
Median: 1017817.5
Q3: 1042577.5
95th percentile: 1316716.6
Maximum: 1894409
Range: 894409
Interquartile range (IQR): 40095

Descriptive statistics

Standard deviation: 176321.71
Coefficient of variation (CV): 0.16311379
Kurtosis: 13.727623
Mean: 1080973.7
Median Absolute Deviation (MAD): 17136
Skewness: 3.5567644
Sum: 41076999
Variance: 3.1089345 × 10^10
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=36)]

Common values
Value | Count | Frequency (%)
1000000 | 3 | 7.9%
1002361 | 1 | 2.6%
1000683 | 1 | 2.6%
1012031 | 1 | 2.6%
1000243 | 1 | 2.6%
1265545 | 1 | 2.6%
1004371 | 1 | 2.6%
1125000 | 1 | 2.6%
1002847 | 1 | 2.6%
1004001 | 1 | 2.6%
Other values (26) | 26 | 68.4%

Minimum 10 values
Value | Count | Frequency (%)
1000000 | 3 | 7.9%
1000175 | 1 | 2.6%
1000243 | 1 | 2.6%
1000258 | 1 | 2.6%
1000560 | 1 | 2.6%
1000683 | 1 | 2.6%
1001115 | 1 | 2.6%
1002361 | 1 | 2.6%
1002847 | 1 | 2.6%
1003009 | 1 | 2.6%

Maximum 10 values
Value | Count | Frequency (%)
1894409 | 1 | 2.6%
1606689 | 1 | 2.6%
1265545 | 1 | 2.6%
1191635 | 1 | 2.6%
1186141 | 1 | 2.6%
1171364 | 1 | 2.6%
1156331 | 1 | 2.6%
1125000 | 1 | 2.6%
1094567 | 1 | 2.6%
1045047 | 1 | 2.6%

총지원액 (total support amount)
Real number (ℝ)

Alerts: High correlation, Unique

Distinct: 38
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.1032236 × 10^11
Minimum: 27000000
Maximum: 2.66408 × 10^12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 474.0 B

Quantile statistics

Minimum: 27000000
5th percentile: 1.887 × 10^8
Q1: 3.2242375 × 10^10
Median: 8.5569 × 10^10
Q3: 1.6153675 × 10^11
95th percentile: 6.809878 × 10^11
Maximum: 2.66408 × 10^12
Range: 2.664053 × 10^12
Interquartile range (IQR): 1.2929438 × 10^11

Descriptive statistics

Standard deviation: 4.4957731 × 10^11
Coefficient of variation (CV): 2.1375631
Kurtosis: 25.084807
Mean: 2.1032236 × 10^11
Median Absolute Deviation (MAD): 6.9454 × 10^10
Skewness: 4.7071282
Sum: 7.9922495 × 10^12
Variance: 2.0211976 × 10^23
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=38)]

Common values
Value | Count | Frequency (%)
84429000000 | 1 | 2.6%
61985000000 | 1 | 2.6%
145105000000 | 1 | 2.6%
630927000000 | 1 | 2.6%
296742000000 | 1 | 2.6%
693935000000 | 1 | 2.6%
29410000000 | 1 | 2.6%
27000000 | 1 | 2.6%
72732500000 | 1 | 2.6%
57187500000 | 1 | 2.6%
Other values (28) | 28 | 73.7%

Minimum 10 values
Value | Count | Frequency (%)
27000000 | 1 | 2.6%
102000000 | 1 | 2.6%
204000000 | 1 | 2.6%
1154500000 | 1 | 2.6%
1499000000 | 1 | 2.6%
4709500000 | 1 | 2.6%
10410000000 | 1 | 2.6%
13401500000 | 1 | 2.6%
13472000000 | 1 | 2.6%
29410000000 | 1 | 2.6%

Maximum 10 values
Value | Count | Frequency (%)
2664080000000 | 1 | 2.6%
693935000000 | 1 | 2.6%
678703000000 | 1 | 2.6%
630927000000 | 1 | 2.6%
575319000000 | 1 | 2.6%
296742000000 | 1 | 2.6%
202655000000 | 1 | 2.6%
172351000000 | 1 | 2.6%
164688000000 | 1 | 2.6%
164589000000 | 1 | 2.6%

Interactions

[Figures: 16 pairwise interaction plots]

Correlations

[Figure: correlation heatmap]

Variable | 광역시도명 | 업종 | 수혜사업체수 | 비중 | 평균지원액 | 총지원액
광역시도명 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
업종 | 0.000 | 1.000 | 0.000 | 0.000 | 0.880 | 0.000
수혜사업체수 | 0.000 | 0.000 | 1.000 | 1.000 | 0.310 | 1.000
비중 | 0.000 | 0.000 | 1.000 | 1.000 | 0.310 | 1.000
평균지원액 | 0.000 | 0.880 | 0.310 | 0.310 | 1.000 | 0.433
총지원액 | 0.000 | 0.000 | 1.000 | 1.000 | 0.433 | 1.000

[Figure: correlation heatmap]

Variable | 수혜사업체수 | 비중 | 평균지원액 | 총지원액 | 광역시도명
수혜사업체수 | 1.000 | 0.999 | 0.295 | 0.993 | 0.000
비중 | 0.999 | 1.000 | 0.298 | 0.994 | 0.000
평균지원액 | 0.295 | 0.298 | 1.000 | 0.342 | 0.000
총지원액 | 0.993 | 0.994 | 0.342 | 1.000 | 0.000
광역시도명 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 광역시도명 | 업종 | 수혜사업체수 | 비중 | 평균지원액 | 총지원액
0 | 강원 | 전업종 | 82997 | 3.4 | 1017254 | 84429000000
1 | 경기 | 전업종 | 579412 | 23.8 | 1171364 | 678703000000
2 | 경남 | 전업종 | 159398 | 6.6 | 1032566 | 164589000000
3 | 경북 | 전업종 | 132782 | 5.5 | 1003009 | 133182000000
4 | 광주 | 전업종 | 63482 | 2.6 | 1034955 | 65701000000
5 | 대구 | 전업종 | 119758 | 4.9 | 1000175 | 119779000000
6 | 대전 | 전업종 | 67026 | 2.8 | 1029765 | 69021000000
7 | 부산 | 전업종 | 159271 | 6.5 | 1034008 | 164688000000
8 | 서울 | 전업종 | 497538 | 20.4 | 1156331 | 575319000000
9 | 세종 | 전업종 | 10158 | 0.4 | 1024808 | 10410000000

Last rows

Index | 광역시도명 | 업종 | 수혜사업체수 | 비중 | 평균지원액 | 총지원액
28 | 전국 | 부동산업(L) | 72526 | 3.0 | 1002847 | 72732500000
29 | 전국 | 전문 과학 및 기술 서비스업(M) | 61738 | 2.5 | 1004001 | 61985000000
30 | 전국 | 사업시설 관리 사업 지원 및 임대서비스업(N) | 56870 | 2.3 | 1005583 | 57187500000
31 | 전국 | 교육서비스업(P) | 109928 | 4.5 | 1191635 | 130994000000
32 | 전국 | 보건업 및 사회복지 서비스업(Q) | 1146 | 0.1 | 1007417 | 1154500000
33 | 전국 | 예술 스포츠 및 여가관련 서비스업(R) | 84329 | 3.5 | 1606689 | 135491000000
34 | 전국 | 협회 및 단체 수리 및 기타 개인 서비스업(S) | 201554 | 8.3 | 1005460 | 202655000000
35 | 전국 | 가구 내 고용활동 및 달리 분류되지 않 은 자가 소비 생산활동(T) | 204 | 0.0 | 1000000 | 204000000
36 | 전국 | 확인불가(X) | 2486 | 0.1 | 1894409 | 4709500000
37 | 전국 | 전업종 | 2433915 | 100.0 | 1094567 | 2664080000000
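
The sample rows suggest why 수혜사업체수, 비중, and 총지원액 are flagged as highly correlated. The sketch below tests two assumptions that the source does not state outright but that the visible rows are consistent with: that 비중 is each row's 수혜사업체수 as a percentage of the 전국 / 전업종 total, and that 총지원액 is approximately 수혜사업체수 × 평균지원액. Only four of the sample rows are checked.

```python
# (광역시도명, 수혜사업체수, 비중, 평균지원액, 총지원액) copied from the sample rows.
rows = [
    ("강원", 82997, 3.4, 1017254, 84429000000),
    ("경기", 579412, 23.8, 1171364, 678703000000),
    ("서울", 497538, 20.4, 1156331, 575319000000),
    ("세종", 10158, 0.4, 1024808, 10410000000),
]
national_count = 2433915  # 수혜사업체수 of the 전국 / 전업종 row (index 37)

for name, count, share, avg, total in rows:
    # Assumed definition: share of the national count, in percent.
    implied_share = round(100 * count / national_count, 1)
    # Assumed identity: total is count times average, up to rounding.
    rel_err = abs(count * avg - total) / total
    print(name, implied_share, share, f"{rel_err:.1e}")
```

If both assumptions hold across the full table, 비중 is a linear rescaling of 수혜사업체수, and 총지원액 tracks it closely because 평균지원액 varies little (CV of about 0.16), which would account for the near-1 coefficients.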