Overview

Dataset statistics

Number of variables: 6
Number of observations: 123
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.1 KiB
Average record size in memory: 51.1 B

Variable types

Categorical: 3
Text: 1
Numeric: 2

Dataset

Description: Sample
Author: 소상공인연합회
URL: https://www.bigdata-telecom.kr/invoke/SOKBP2603/?goodsCode=KFMECMS007

Alerts

서비스업지급금액 is highly correlated overall with 평균지급금액 (High correlation)
평균지급금액 is highly correlated overall with 서비스업지급금액 (High correlation)
제조업지급금액 is highly imbalanced, 70.5% (Imbalance)

Reproduction

Analysis started: 2023-12-10 06:14:07.105118
Analysis finished: 2023-12-10 06:14:08.234591
Duration: 1.13 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

광역
Categorical

Distinct: 8
Distinct (%): 6.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count
경기도 | 32
경상북도 | 24
경상남도 | 21
강원도 | 18
대구광역시 | 12
Other values (3) | 16

Length

Max length: 7
Median length: 6
Mean length: 5.8211382
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row 강원도
2nd row 강원도
3rd row 강원도
4th row 강원도
5th row 강원도

Common Values

Value | Count | Frequency (%)
경기도 | 32 | 26.0%
경상북도 | 24 | 19.5%
경상남도 | 21 | 17.1%
강원도 | 18 | 14.6%
대구광역시 | 12 | 9.8%
광주광역시 | 6 | 4.9%
대전광역시 | 5 | 4.1%
부산광역시 | 5 | 4.1%
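
The Frequency (%) column is each category's count divided by the 123 observations. The following minimal Python sketch recomputes the percentages from the 광역 counts in the table; the variable names are illustrative and not part of the report.

```python
# Counts per 광역 category, copied from the Common Values table.
counts = {
    "경기도": 32,
    "경상북도": 24,
    "경상남도": 21,
    "강원도": 18,
    "대구광역시": 12,
    "광주광역시": 6,
    "대전광역시": 5,
    "부산광역시": 5,
}

# The counts should add up to the number of observations in the dataset.
total = sum(counts.values())

# Frequency (%) = count / total * 100, rounded to one decimal place.
percentages = {name: round(count / total * 100, 1) for name, count in counts.items()}

for name, pct in percentages.items():
    print(f"{name} | {counts[name]} | {pct}%")
```

Because the eight counts sum to 123, the table covers every row and no category is hidden.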

Length

[Figure: histogram of lengths of the category]

시군구
Text

Distinct: 109
Distinct (%): 88.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 6
Median length: 5
Mean length: 4.902439
Min length: 4

Characters and Unicode

Total characters: 603
Distinct characters: 92
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 99
Unique (%): 80.5%

Sample

1st row 강릉시
2nd row 고성군
3rd row 동해시
4th row 삼척시
5th row 속초시

Value | Count | Frequency (%)
동구 | 4 | 3.3%
남구 | 3 | 2.4%
서구 | 3 | 2.4%
창녕군 | 2 | 1.6%
고령군 | 2 | 1.6%
칠곡군 | 2 | 1.6%
북구 | 2 | 1.6%
고성군 | 2 | 1.6%
유성구 | 2 | 1.6%
중구 | 2 | 1.6%
Other values (99) | 99 | 80.5%

Most occurring characters

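Value | Count | Frequency (%)
(space) | 246 | 40.8%
(not preserved) | 58 | 9.6%
(not preserved) | 44 | 7.3%
(not preserved) | 26 | 4.3%
(not preserved) | 14 | 2.3%
(not preserved) | 13 | 2.2%
(not preserved) | 12 | 2.0%
(not preserved) | 11 | 1.8%
(not preserved) | 9 | 1.5%
(not preserved) | 8 | 1.3%
Other values (82) | 162 | 26.9%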

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 357 | 59.2%
Space Separator | 246 | 40.8%

Most frequent character per category

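Other Letter
Value | Count | Frequency (%)
(not preserved) | 58 | 16.2%
(not preserved) | 44 | 12.3%
(not preserved) | 26 | 7.3%
(not preserved) | 14 | 3.9%
(not preserved) | 13 | 3.6%
(not preserved) | 12 | 3.4%
(not preserved) | 11 | 3.1%
(not preserved) | 9 | 2.5%
(not preserved) | 8 | 2.2%
(not preserved) | 7 | 2.0%
Other values (81) | 155 | 43.4%

Space Separator
Value | Count | Frequency (%)
(space) | 246 | 100.0%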

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 357 | 59.2%
Common | 246 | 40.8%

Most frequent character per script

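Hangul
Value | Count | Frequency (%)
(not preserved) | 58 | 16.2%
(not preserved) | 44 | 12.3%
(not preserved) | 26 | 7.3%
(not preserved) | 14 | 3.9%
(not preserved) | 13 | 3.6%
(not preserved) | 12 | 3.4%
(not preserved) | 11 | 3.1%
(not preserved) | 9 | 2.5%
(not preserved) | 8 | 2.2%
(not preserved) | 7 | 2.0%
Other values (81) | 155 | 43.4%

Common
Value | Count | Frequency (%)
(space) | 246 | 100.0%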

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 357 | 59.2%
ASCII | 246 | 40.8%

Most frequent character per block

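ASCII
Value | Count | Frequency (%)
(space) | 246 | 100.0%

Hangul
Value | Count | Frequency (%)
(not preserved) | 58 | 16.2%
(not preserved) | 44 | 12.3%
(not preserved) | 26 | 7.3%
(not preserved) | 14 | 3.9%
(not preserved) | 13 | 3.6%
(not preserved) | 12 | 3.4%
(not preserved) | 11 | 3.1%
(not preserved) | 9 | 2.5%
(not preserved) | 8 | 2.2%
(not preserved) | 7 | 2.0%
Other values (81) | 155 | 43.4%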

도소매업지급금액
Categorical

Distinct: 11
Distinct (%): 8.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count
- | 77
1000000 | 24
1500000 | 6
2000000 | 3
1333333 | 3
Other values (6) | 10

Length

Max length: 7
Median length: 3
Mean length: 4.4796748
Min length: 3

Unique

Unique: 2
Unique (%): 1.6%

Sample

1st row -
2nd row 1000000
3rd row -
4th row -
5th row 1000000

Common Values

Value | Count | Frequency (%)
- | 77 | 62.6%
1000000 | 24 | 19.5%
1500000 | 6 | 4.9%
2000000 | 3 | 2.4%
1333333 | 3 | 2.4%
1250000 | 2 | 1.6%
1200000 | 2 | 1.6%
1666667 | 2 | 1.6%
666667 | 2 | 1.6%
1285714 | 1 | 0.8%

Length

[Figure: histogram of lengths of the category]

서비스업지급금액
Real number (ℝ)

HIGH CORRELATION 

Distinct: 58
Distinct (%): 47.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1069143.6
Minimum: 1000000
Maximum: 2583333
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.2 KiB

Quantile statistics

Minimum: 1000000
5th percentile: 1000000
Q1: 1000000
Median: 1000000
Q3: 1045186
95th percentile: 1260749.9
Maximum: 2583333
Range: 1583333
Interquartile range (IQR): 45186

Descriptive statistics

Standard deviation: 196961.81
Coefficient of variation (CV): 0.1842239
Kurtosis: 33.803959
Mean: 1069143.6
Median Absolute Deviation (MAD): 0
Skewness: 5.3721666
Sum: 1.3150467 × 10^8
Variance: 3.8793953 × 10^10
Monotonicity: Not monotonic
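
Several of these figures follow from the others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the CV is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the 123 observations. The sketch below re-derives them from the rounded values printed in the report, so small rounding differences are expected; the variable names are illustrative only.

```python
import math

# Figures copied from the 서비스업지급금액 statistics in this report.
n = 123
mean = 1069143.6
std = 196961.81
minimum, maximum = 1000000, 2583333
q1, q3 = 1000000, 1045186

value_range = maximum - minimum   # reported Range: 1583333
iqr = q3 - q1                     # reported IQR: 45186
cv = std / mean                   # reported CV: 0.1842239
variance = std ** 2               # reported Variance: 3.8793953e10
total = mean * n                  # reported Sum: 1.3150467e8

print(f"range={value_range}, IQR={iqr}")
print(f"CV={cv:.7f}, variance={variance:.7e}, sum={total:.7e}")

# The derived values agree with the reported ones up to input rounding.
print(math.isclose(cv, 0.1842239, rel_tol=1e-5))
```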
[Figure: histogram with fixed-size bins (bins=50)]

Common Values

Value | Count | Frequency (%)
1000000 | 64 | 52.0%
1040000 | 2 | 1.6%
1142857 | 2 | 1.6%
1187500 | 1 | 0.8%
1008475 | 1 | 0.8%
2583333 | 1 | 0.8%
1042553 | 1 | 0.8%
1100000 | 1 | 0.8%
1115502 | 1 | 0.8%
1047619 | 1 | 0.8%
Other values (48) | 48 | 39.0%

Minimum 10 values

Value | Count | Frequency (%)
1000000 | 64 | 52.0%
1008475 | 1 | 0.8%
1008772 | 1 | 0.8%
1010870 | 1 | 0.8%
1011364 | 1 | 0.8%
1012821 | 1 | 0.8%
1015504 | 1 | 0.8%
1015873 | 1 | 0.8%
1016949 | 1 | 0.8%
1017699 | 1 | 0.8%

Maximum 10 values

Value | Count | Frequency (%)
2583333 | 1 | 0.8%
2000000 | 1 | 0.8%
1818182 | 1 | 0.8%
1642105 | 1 | 0.8%
1357143 | 1 | 0.8%
1309091 | 1 | 0.8%
1260870 | 1 | 0.8%
1259669 | 1 | 0.8%
1240000 | 1 | 0.8%
1200000 | 1 | 0.8%

제조업지급금액
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): 4.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count
- | 108
2000000 | 11
2333333 | 2
2375000 | 1
2500000 | 1

Length

Max length: 7
Median length: 3
Mean length: 3.4878049
Min length: 3

Unique

Unique: 2
Unique (%): 1.6%

Sample

1st row -
2nd row -
3rd row -
4th row -
5th row -

Common Values

Value | Count | Frequency (%)
- | 108 | 87.8%
2000000 | 11 | 8.9%
2333333 | 2 | 1.6%
2375000 | 1 | 0.8%
2500000 | 1 | 0.8%
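
The 70.5% imbalance flagged in the Alerts section is consistent with a normalized-entropy score, 1 − H / log(k), where H is the Shannon entropy of the class frequencies and k is the number of distinct classes. The sketch below is an assumption about how that figure can be reproduced from the counts listed for 제조업지급금액, not code taken from the profiling tool; the function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k): 0 for a uniform distribution, approaching 1 as one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single-class column is treated as not imbalanced
    # Shannon entropy of the class frequencies (natural log; the base cancels in the ratio).
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)

# Counts for "-", 2000000, 2333333, 2375000, 2500000 as listed in the report.
counts = [108, 11, 2, 1, 1]
score = imbalance_score(counts)
print(f"{score:.1%}")
```

Under this definition the score is driven almost entirely by the `-` placeholder, which accounts for 108 of the 123 rows.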

Length

[Figure: histogram of lengths of the category]


평균지급금액
Real number (ℝ)

HIGH CORRELATION 

Distinct: 65
Distinct (%): 52.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1075223.1
Minimum: 981818
Maximum: 2583333
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.2 KiB

Quantile statistics

Minimum: 981818
5th percentile: 1000000
Q1: 1000000
Median: 1017699
Q3: 1061381
95th percentile: 1269695.6
Maximum: 2583333
Range: 1601515
Interquartile range (IQR): 61381

Descriptive statistics

Standard deviation: 195711.08
Coefficient of variation (CV): 0.18201905
Kurtosis: 33.837033
Mean: 1075223.1
Median Absolute Deviation (MAD): 17699
Skewness: 5.3409973
Sum: 1.3225244 × 10^8
Variance: 3.8302828 × 10^10
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]

Common Values

Value | Count | Frequency (%)
1000000 | 51 | 41.5%
1017699 | 2 | 1.6%
1021277 | 2 | 1.6%
1066667 | 2 | 1.6%
1030769 | 2 | 1.6%
1105263 | 2 | 1.6%
1142857 | 2 | 1.6%
1012658 | 2 | 1.6%
1020000 | 2 | 1.6%
981818 | 1 | 0.8%
Other values (55) | 55 | 44.7%

Minimum 10 values

Value | Count | Frequency (%)
981818 | 1 | 0.8%
1000000 | 51 | 41.5%
1007407 | 1 | 0.8%
1008403 | 1 | 0.8%
1008439 | 1 | 0.8%
1008696 | 1 | 0.8%
1011173 | 1 | 0.8%
1012658 | 2 | 1.6%
1013889 | 1 | 0.8%
1015504 | 1 | 0.8%

Maximum 10 values

Value | Count | Frequency (%)
2583333 | 1 | 0.8%
2000000 | 1 | 0.8%
1800000 | 1 | 0.8%
1628866 | 1 | 0.8%
1357143 | 1 | 0.8%
1309091 | 1 | 0.8%
1270833 | 1 | 0.8%
1259459 | 1 | 0.8%
1240000 | 1 | 0.8%
1217949 | 1 | 0.8%

Interactions

[Figures: four interaction plots; images not preserved]

Correlations

[Figure: correlation heatmap]

Variable | 광역 | 도소매업지급금액 | 서비스업지급금액 | 제조업지급금액 | 평균지급금액
광역 | 1.000 | 0.426 | 0.440 | 0.444 | 0.206
도소매업지급금액 | 0.426 | 1.000 | 0.000 | 0.570 | 0.000
서비스업지급금액 | 0.440 | 0.000 | 1.000 | 0.000 | 0.997
제조업지급금액 | 0.444 | 0.570 | 0.000 | 1.000 | 0.000
평균지급금액 | 0.206 | 0.000 | 0.997 | 0.000 | 1.000

[Figure: correlation heatmap]

Variable | 도소매업지급금액 | 제조업지급금액 | 광역
도소매업지급금액 | 1.000 | 0.349 | 0.212
제조업지급금액 | 0.349 | 1.000 | 0.288
광역 | 0.212 | 0.288 | 1.000

[Figure: correlation heatmap]

Variable | 서비스업지급금액 | 평균지급금액 | 광역 | 도소매업지급금액 | 제조업지급금액
서비스업지급금액 | 1.000 | 0.894 | 0.000 | 0.000 | 0.000
평균지급금액 | 0.894 | 1.000 | 0.106 | 0.000 | 0.000
광역 | 0.000 | 0.106 | 1.000 | 0.212 | 0.288
도소매업지급금액 | 0.000 | 0.000 | 0.212 | 1.000 | 0.349
제조업지급금액 | 0.000 | 0.000 | 0.288 | 0.349 | 1.000

Missing values

[Figure: missing-value count by column] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

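First rows

Row | 광역 | 시군구 | 도소매업지급금액 | 서비스업지급금액 | 제조업지급금액 | 평균지급금액
0 | 강원도 | 강릉시 | - | 1309091 | - | 1309091
1 | 강원도 | 고성군 | 1000000 | 1000000 | - | 1000000
2 | 강원도 | 동해시 | - | 1000000 | - | 1000000
3 | 강원도 | 삼척시 | - | 1000000 | - | 1000000
4 | 강원도 | 속초시 | 1000000 | 1818182 | - | 1800000
5 | 강원도 | 양구군 | - | 1000000 | - | 1000000
6 | 강원도 | 양양군 | - | 1000000 | - | 1000000
7 | 강원도 | 영월군 | - | 1000000 | - | 1000000
8 | 강원도 | 원주시 | 1250000 | 1000000 | - | 1007407
9 | 강원도 | 인제군 | - | 1000000 | - | 1000000

Last rows

Row | 광역 | 시군구 | 도소매업지급금액 | 서비스업지급금액 | 제조업지급금액 | 평균지급금액
113 | 대전광역시 | 대덕구 | - | 1000000 | 2000000 | 1012658
114 | 대전광역시 | 동구 | 1000000 | 1000000 | - | 1000000
115 | 대전광역시 | 서구 | 1000000 | 1000000 | 2000000 | 1018182
116 | 대전광역시 | 유성구 | 1333333 | 1000000 | 2500000 | 1050000
117 | 대전광역시 | 중구 | 1500000 | 1040000 | - | 1089286
118 | 부산광역시 | 강서구 | - | 1000000 | 2000000 | 1036364
119 | 부산광역시 | 금정구 | - | 1000000 | - | 1000000
120 | 부산광역시 | 기장군 | - | 1034483 | 2000000 | 1066667
121 | 부산광역시 | 남구 | 1000000 | 1000000 | - | 1000000
122 | 부산광역시 | 동구 | - | 1105263 | - | 1105263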
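
In the sample rows, `-` appears where an industry has no payment amount, which is presumably why 도소매업지급금액 and 제조업지급금액 were profiled as Categorical rather than numeric. The sketch below assumes that `-` means "no value" (the report does not state this) and uses a hypothetical helper, `parse_amount`, to convert such fields before any numeric analysis.

```python
from typing import Optional

def parse_amount(text: str) -> Optional[int]:
    """Convert an amount field to int; treat the '-' placeholder as missing (assumption)."""
    cleaned = text.strip()
    if cleaned in ("-", ""):
        return None
    return int(cleaned)

# 도소매업지급금액 values from sample rows 0 to 4 of the report.
raw = ["-", "1000000", "-", "-", "1000000"]
parsed = [parse_amount(value) for value in raw]
present = [value for value in parsed if value is not None]

print(parsed)
print(f"{len(present)} of {len(parsed)} values present")
```

Stripping whitespace first matters here because the profiled text values carry padding spaces, as the Space Separator counts for 시군구 suggest.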