Overview

Dataset statistics

Number of variables: 11
Number of observations: 400
Missing cells: 400
Missing cells (%): 9.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 36.8 KiB
Average record size in memory: 94.3 B
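
The missing-cell percentage follows from the three counts in the table above. A minimal sketch of that arithmetic, assuming the percentage is taken over all cells (rows × columns); the helper name `missing_cell_share` is hypothetical and not part of the profiling tool:

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 11
n_observations = 400
missing_cells = 400


def missing_cell_share(missing: int, rows: int, cols: int) -> float:
    """Return missing cells as a percentage of all cells (rows * cols)."""
    return 100.0 * missing / (rows * cols)


share = missing_cell_share(missing_cells, n_observations, n_variables)
print(f"{share:.1f}%")  # 9.1%
```

The 400 missing cells equal the number of observations, which is consistent with a single column (소상공인시스템로그ID) being empty in every row.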

Variable types

Categorical: 6
Unsupported: 1
Numeric: 3
Text: 1

Dataset

Description: Sample
Author: 소상공인연합회
URL: https://www.bigdata-telecom.kr/invoke/SOKBP2603/?goodsCode=KFMZEROSTT011

Alerts

소상공인결제분류코드 has constant value "ZEROP42000" (Constant)
년월 has constant value "202008" (Constant)
광역시도코드 has constant value "42" (Constant)
광역시도명 has constant value "강원도" (Constant)
소상공인시스템로그일시 has constant value "2020-10-21 12:28:43.0" (Constant)
결제건수 is highly overall correlated with 합계금액 (High correlation)
합계금액 is highly overall correlated with 결제건수 (High correlation)
표준산업업종상세분류코드 is highly overall correlated with 표준산업업종대분류코드 (High correlation)
표준산업업종대분류코드 is highly overall correlated with 표준산업업종상세분류코드 (High correlation)
소상공인시스템로그ID has 400 (100.0%) missing values (Missing)
표준산업업종상세분류코드 has unique values (Unique)
표준산업업종상세분류명 has unique values (Unique)
소상공인시스템로그ID is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
결제건수 has 229 (57.2%) zeros (Zeros)
합계금액 has 229 (57.2%) zeros (Zeros)

Reproduction

Analysis started: 2023-12-10 06:55:14.661042
Analysis finished: 2023-12-10 06:55:16.131513
Duration: 1.47 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

소상공인결제분류코드
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

ZEROP42000: 400

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ZEROP42000
2nd row: ZEROP42000
3rd row: ZEROP42000
4th row: ZEROP42000
5th row: ZEROP42000

Common Values

Value | Count | Frequency (%)
ZEROP42000 | 400 | 100.0%

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

년월
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

202008: 400

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202008
2nd row: 202008
3rd row: 202008
4th row: 202008
5th row: 202008

Common Values

Value | Count | Frequency (%)
202008 | 400 | 100.0%

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

소상공인시스템로그ID
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 400
Missing (%): 100.0%
Memory size: 3.6 KiB

광역시도코드
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

42: 400

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 42
2nd row: 42
3rd row: 42
4th row: 42
5th row: 42

Common Values

Value | Count | Frequency (%)
42 | 400 | 100.0%

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

광역시도명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

강원도: 400

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강원도
2nd row: 강원도
3rd row: 강원도
4th row: 강원도
5th row: 강원도

Common Values

Value | Count | Frequency (%)
강원도 | 400 | 100.0%

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

결제건수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 75
Distinct (%): 18.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 131.1975
Minimum: 0
Maximum: 18513
Zeros: 229
Zeros (%): 57.2%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 7
95th percentile: 185.05
Maximum: 18513
Range: 18513
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 1181.8317
Coefficient of variation (CV): 9.0080354
Kurtosis: 174.4912
Mean: 131.1975
Median Absolute Deviation (MAD): 0
Skewness: 12.736001
Sum: 52479
Variance: 1396726.2
Monotonicity: Not monotonic
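
Several of the descriptive statistics for 결제건수 are derived from one another. The sketch below recomputes the mean, the coefficient of variation and the variance from the reported sum and standard deviation, assuming the usual definitions (mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared); the variable names are illustrative only:

```python
# Reported values for 결제건수 (payment count).
n = 400            # number of observations
total = 52479      # "Sum"
std = 1181.8317    # "Standard deviation"

mean = total / n          # arithmetic mean
cv = std / mean           # coefficient of variation
variance = std ** 2       # variance as the squared standard deviation

print(f"mean={mean:.4f}  cv={cv:.4f}  variance={variance:.1f}")
```

A CV of about 9 means the standard deviation is roughly nine times the mean, which fits a column where 57.2% of the values are zero and a few values run into the thousands.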
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0 | 229 | 57.2%
1 | 18 | 4.5%
3 | 16 | 4.0%
2 | 12 | 3.0%
4 | 9 | 2.2%
5 | 9 | 2.2%
7 | 8 | 2.0%
8 | 5 | 1.2%
15 | 4 | 1.0%
19 | 4 | 1.0%
Other values (65) | 86 | 21.5%

Minimum 10 values

Value | Count | Frequency (%)
0 | 229 | 57.2%
1 | 18 | 4.5%
2 | 12 | 3.0%
3 | 16 | 4.0%
4 | 9 | 2.2%
5 | 9 | 2.2%
6 | 2 | 0.5%
7 | 8 | 2.0%
8 | 5 | 1.2%
9 | 3 | 0.8%

Maximum 10 values

Value | Count | Frequency (%)
18513 | 1 | 0.2%
10958 | 1 | 0.2%
9646 | 1 | 0.2%
1328 | 1 | 0.2%
1183 | 1 | 0.2%
1104 | 1 | 0.2%
1028 | 1 | 0.2%
1027 | 1 | 0.2%
691 | 1 | 0.2%
564 | 1 | 0.2%

합계금액
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 168
Distinct (%): 42.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4608035.2
Minimum: 0
Maximum: 4.8997702 × 10^8
Zeros: 229
Zeros (%): 57.2%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 421250
95th percentile: 16582361
Maximum: 4.8997702 × 10^8
Range: 4.8997702 × 10^8
Interquartile range (IQR): 421250

Descriptive statistics

Standard deviation: 30481824
Coefficient of variation (CV): 6.6149286
Kurtosis: 180.89756
Mean: 4608035.2
Median Absolute Deviation (MAD): 0
Skewness: 12.600091
Sum: 1.8432141 × 10^9
Variance: 9.291416 × 10^14
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0 | 229 | 57.2%
78000 | 2 | 0.5%
76000 | 2 | 0.5%
200000 | 2 | 0.5%
97000 | 2 | 0.5%
3302110 | 1 | 0.2%
87130507 | 1 | 0.2%
20521609 | 1 | 0.2%
16534655 | 1 | 0.2%
6556105 | 1 | 0.2%
Other values (158) | 158 | 39.5%

Minimum 10 values

Value | Count | Frequency (%)
0 | 229 | 57.2%
1 | 1 | 0.2%
3000 | 1 | 0.2%
6000 | 1 | 0.2%
10200 | 1 | 0.2%
12560 | 1 | 0.2%
12900 | 1 | 0.2%
13500 | 1 | 0.2%
16000 | 1 | 0.2%
17500 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
489977022 | 1 | 0.2%
285105325 | 1 | 0.2%
163523266 | 1 | 0.2%
87130507 | 1 | 0.2%
63980801 | 1 | 0.2%
55675154 | 1 | 0.2%
52576826 | 1 | 0.2%
50934000 | 1 | 0.2%
35788980 | 1 | 0.2%
35136000 | 1 | 0.2%

표준산업업종대분류코드
Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

C: 169
G: 161
A: 24
F: 21
H: 17
Other values (3): 8

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: A
2nd row: A
3rd row: A
4th row: A
5th row: A

Common Values

Value | Count | Frequency (%)
C | 169 | 42.2%
G | 161 | 40.2%
A | 24 | 6.0%
F | 21 | 5.2%
H | 17 | 4.2%
D | 3 | 0.8%
E | 3 | 0.8%
I | 2 | 0.5%

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]

표준산업업종상세분류코드
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 400
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32794.677
Minimum: 1110
Maximum: 55102
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: 1110
5th percentile: 2039.5
Q1: 17901.75
Median: 41111.5
Q3: 46693.75
95th percentile: 47992.05
Maximum: 55102
Range: 53992
Interquartile range (IQR): 28792
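
The range and interquartile range in this table can be checked against the reported quantiles. A small sketch, assuming the standard definitions (range = maximum − minimum, IQR = Q3 − Q1):

```python
# Quantiles reported for 표준산업업종상세분류코드.
minimum, q1, q3, maximum = 1110, 17901.75, 46693.75, 55102

iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # full range of the column

print(iqr, value_range)  # 28792.0 53992
```

Because this column holds industry classification codes, these spreads describe the numbering scheme rather than a measured quantity.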

Descriptive statistics

Standard deviation: 15885.66
Coefficient of variation (CV): 0.4843975
Kurtosis: -1.1773585
Mean: 32794.677
Median Absolute Deviation (MAD): 7947
Skewness: -0.53017602
Sum: 13117871
Variance: 2.5235419 × 10^8
Monotonicity: Strictly increasing
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
1110 | 1 | 0.2%
46431 | 1 | 0.2%
46453 | 1 | 0.2%
46452 | 1 | 0.2%
46451 | 1 | 0.2%
46444 | 1 | 0.2%
46443 | 1 | 0.2%
46442 | 1 | 0.2%
46441 | 1 | 0.2%
46439 | 1 | 0.2%
Other values (390) | 390 | 97.5%

Minimum 10 values

Value | Count | Frequency (%)
1110 | 1 | 0.2%
1121 | 1 | 0.2%
1122 | 1 | 0.2%
1123 | 1 | 0.2%
1131 | 1 | 0.2%
1132 | 1 | 0.2%
1140 | 1 | 0.2%
1152 | 1 | 0.2%
1159 | 1 | 0.2%
1212 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
55102 | 1 | 0.2%
55101 | 1 | 0.2%
52999 | 1 | 0.2%
52992 | 1 | 0.2%
52929 | 1 | 0.2%
52919 | 1 | 0.2%
52915 | 1 | 0.2%
52913 | 1 | 0.2%
52109 | 1 | 0.2%
52103 | 1 | 0.2%

표준산업업종상세분류명
Text

UNIQUE 

Distinct: 400
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 28
Median length: 22
Mean length: 13.3325
Min length: 3

Characters and Unicode

Total characters: 5333
Distinct characters: 313
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 400
Unique (%): 100.0%

Sample

1st row: 곡물 및 기타 식량작물 재배업
2nd row: 채소작물 재배업
3rd row: 화훼작물 재배업
4th row: 종자 및 묘목 생산업
5th row: 과실작물 재배업

Value | Count | Frequency (%)
[unreadable] | 195 | 12.2%
제조업 | 140 | 8.7%
기타 | 90 | 5.6%
도매업 | 72 | 4.5%
소매업 | 64 | 4.0%
[unreadable] | 23 | 1.4%
[unreadable] | 23 | 1.4%
기기 | 12 | 0.7%
판매업 | 12 | 0.7%
자동차 | 12 | 0.7%
Other values (575) | 960 | 59.9%

Most occurring characters

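Value | Count | Frequency (%)
(space) | 1203 | 22.6%
[unreadable] | 417 | 7.8%
[unreadable] | 213 | 4.0%
[unreadable] | 195 | 3.7%
[unreadable] | 180 | 3.4%
[unreadable] | 173 | 3.2%
[unreadable] | 151 | 2.8%
[unreadable] | 136 | 2.6%
[unreadable] | 117 | 2.2%
[unreadable] | 96 | 1.8%
Other values (303) | 2452 | 46.0%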

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4125 | 77.3%
Space Separator | 1203 | 22.6%
Close Punctuation | 2 | < 0.1%
Open Punctuation | 2 | < 0.1%
Decimal Number | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[unreadable] | 417 | 10.1%
[unreadable] | 213 | 5.2%
[unreadable] | 195 | 4.7%
[unreadable] | 180 | 4.4%
[unreadable] | 173 | 4.2%
[unreadable] | 151 | 3.7%
[unreadable] | 136 | 3.3%
[unreadable] | 117 | 2.8%
[unreadable] | 96 | 2.3%
[unreadable] | 84 | 2.0%
Other values (299) | 2363 | 57.3%

Space Separator
Value | Count | Frequency (%)
(space) | 1203 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 2 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 2 | 100.0%

Decimal Number
Value | Count | Frequency (%)
1 | 1 | 100.0%

Most occurring scripts

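Value | Count | Frequency (%)
Hangul | 4125 | 77.3%
Common | 1208 | 22.7%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[unreadable] | 417 | 10.1%
[unreadable] | 213 | 5.2%
[unreadable] | 195 | 4.7%
[unreadable] | 180 | 4.4%
[unreadable] | 173 | 4.2%
[unreadable] | 151 | 3.7%
[unreadable] | 136 | 3.3%
[unreadable] | 117 | 2.8%
[unreadable] | 96 | 2.3%
[unreadable] | 84 | 2.0%
Other values (299) | 2363 | 57.3%

Common
Value | Count | Frequency (%)
(space) | 1203 | 99.6%
) | 2 | 0.2%
( | 2 | 0.2%
1 | 1 | 0.1%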

Most occurring blocks

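Value | Count | Frequency (%)
Hangul | 4106 | 77.0%
ASCII | 1208 | 22.7%
Compat Jamo | 19 | 0.4%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 1203 | 99.6%
) | 2 | 0.2%
( | 2 | 0.2%
1 | 1 | 0.1%

Hangul
Value | Count | Frequency (%)
[unreadable] | 417 | 10.2%
[unreadable] | 213 | 5.2%
[unreadable] | 195 | 4.7%
[unreadable] | 180 | 4.4%
[unreadable] | 173 | 4.2%
[unreadable] | 151 | 3.7%
[unreadable] | 136 | 3.3%
[unreadable] | 117 | 2.8%
[unreadable] | 96 | 2.3%
[unreadable] | 84 | 2.0%
Other values (298) | 2344 | 57.1%

Compat Jamo
Value | Count | Frequency (%)
[unreadable] | 19 | 100.0%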

소상공인시스템로그일시
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

2020-10-21 12:28:43.0: 400

Length

Max length: 21
Median length: 21
Mean length: 21
Min length: 21

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020-10-21 12:28:43.0
2nd row: 2020-10-21 12:28:43.0
3rd row: 2020-10-21 12:28:43.0
4th row: 2020-10-21 12:28:43.0
5th row: 2020-10-21 12:28:43.0

Common Values

Value | Count | Frequency (%)
2020-10-21 12:28:43.0 | 400 | 100.0%

[Figure: Histogram of lengths of the category]

[Figure: Common Values (Plot)]
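
The profiler treats 소상공인시스템로그일시 and 년월 as categorical strings. If they need to be handled as dates, one possible approach is to parse them with the standard library; the sketch below assumes the formats `%Y-%m-%d %H:%M:%S.%f` and `%Y%m`, inferred only from the single constant value each column shows in this report:

```python
from datetime import datetime

# Constant values taken from the report.
raw_log_time = "2020-10-21 12:28:43.0"  # 소상공인시스템로그일시
raw_year_month = "202008"               # 년월

# Assumed formats; other rows in the full dataset may differ.
log_time = datetime.strptime(raw_log_time, "%Y-%m-%d %H:%M:%S.%f")
year_month = datetime.strptime(raw_year_month, "%Y%m")

print(log_time.isoformat())
print(year_month.year, year_month.month)
```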

Interactions

[Figures: nine interaction plots]

Correlations

[Figure: correlation heatmap]

 | 결제건수 | 합계금액 | 표준산업업종대분류코드 | 표준산업업종상세분류코드
결제건수 | 1.000 | 1.000 | 0.000 | 0.000
합계금액 | 1.000 | 1.000 | 0.000 | 0.000
표준산업업종대분류코드 | 0.000 | 0.000 | 1.000 | 0.910
표준산업업종상세분류코드 | 0.000 | 0.000 | 0.910 | 1.000

[Figure: correlation heatmap]

 | 결제건수 | 합계금액 | 표준산업업종상세분류코드 | 표준산업업종대분류코드
결제건수 | 1.000 | 0.986 | 0.280 | 0.000
합계금액 | 0.986 | 1.000 | 0.301 | 0.000
표준산업업종상세분류코드 | 0.280 | 0.301 | 1.000 | 0.744
표준산업업종대분류코드 | 0.000 | 0.000 | 0.744 | 1.000
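
The high-correlation alerts in this report line up with the off-diagonal entries of the first correlation matrix. A minimal sketch that lists the column pairs whose absolute coefficient reaches a cut-off; the threshold of 0.5 and the helper `high_pairs` are assumptions for illustration, since the report does not state the cut-off it used:

```python
from itertools import combinations

columns = ["결제건수", "합계금액", "표준산업업종대분류코드", "표준산업업종상세분류코드"]
# First correlation matrix from the report, rows and columns in the order above.
matrix = [
    [1.000, 1.000, 0.000, 0.000],
    [1.000, 1.000, 0.000, 0.000],
    [0.000, 0.000, 1.000, 0.910],
    [0.000, 0.000, 0.910, 1.000],
]
THRESHOLD = 0.5  # assumed cut-off, not taken from the report


def high_pairs(names, values, threshold):
    """Return (name_a, name_b, coefficient) for each pair at or above threshold."""
    pairs = []
    for i, j in combinations(range(len(names)), 2):
        if abs(values[i][j]) >= threshold:
            pairs.append((names[i], names[j], values[i][j]))
    return pairs


flagged = high_pairs(columns, matrix, THRESHOLD)
for a, b, r in flagged:
    print(f"{a} <-> {b}: {r:.3f}")
```

With these inputs the two flagged pairs are the same ones named in the Alerts section: 결제건수 with 합계금액, and 표준산업업종대분류코드 with 표준산업업종상세분류코드.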

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

 | 소상공인결제분류코드 | 년월 | 소상공인시스템로그ID | 광역시도코드 | 광역시도명 | 결제건수 + 합계금액 (fused) | 표준산업업종대분류코드 | 표준산업업종상세분류코드 | 표준산업업종상세분류명 | 소상공인시스템로그일시
0 | ZEROP42000 | 202008 | <NA> | 42 | 강원도 | 00 | A | 1110 | 곡물 및 기타 식량작물 재배업 | 2020-10-21 12:28:43.0
1 | ZEROP42000 | 202008 | <NA> | 42 | 강원도 | 00 | A | 1121 | 채소작물 재배업 | 2020-10-21 12:28:43.0
2 | ZEROP42000 | 202008 | <NA> | 42 | 강원도 | 00 | A | 1122 | 화훼작물 재배업 | 2020-10-21 12:28:43.0
3 | ZEROP42000 | 202008 | <NA> | 42 | 강원도 | 00 | A | 1123 | 종자 및 묘목 생산업 | 2020-10-21 12:28:43.0
4 | ZEROP42000 | 202008 | <NA> | 42 | 강원도 | 482180000 | A | 1131 | 과실작물 재배업 | 2020-10-21 12:28:43.0
5 | ZEROP42000 | 202008 | <NA> | 42 | 강원도 | 00 | A | 1132 | 음료용 및 향신용 작물 재배업 | 2020-10-21 12:28:43.0
6 | ZEROP42000 | 202008 | <NA> | 42 | 강원도 | 00 | A | 1140 | 기타 작물 재배업 | 2020-10-21 12:28:43.0
7 | ZEROP42000 | 202008 | <NA> | 42 | 강원도 | 00 | A | 1152 | 채소화훼 및 과실작물 시설 재배업 | 2020-10-21 12:28:43.0
8 | ZEROP42000 | 202008 | <NA> | 42 | 강원도 | 00 | A | 1159 | 기타 시설작물 재배업 | 2020-10-21 12:28:43.0
9 | ZEROP42000 | 202008 | <NA> | 42 | 강원도 | 21600000 | A | 1212 | 육우 사육업 | 2020-10-21 12:28:43.0

Last rows

 | 소상공인결제분류코드 | 년월 | 소상공인시스템로그ID | 광역시도코드 | 광역시도명 | 결제건수 + 합계금액 (fused) | 표준산업업종대분류코드 | 표준산업업종상세분류코드 | 표준산업업종상세분류명 | 소상공인시스템로그일시
390 | ZEROP42000 | 202008 | <NA> | 42 | 강원도 | 00 | H | 52103 | 농산물 창고업 | 2020-10-21 12:28:43.0
391 | ZEROP42000 | 202008 | <NA> | 42 | 강원도 | 00 | H | 52109 | 기타 보관 및 창고업 | 2020-10-21 12:28:43.0
392 | ZEROP42000 | 202008 | <NA> | 42 | 강원도 | 00 | H | 52913 | 물류 터미널 운영업 | 2020-10-21 12:28:43.0
393 | ZEROP42000 | 202008 | <NA> | 42 | 강원도 | 1084800 | H | 52915 | 주차장 운영업 | 2020-10-21 12:28:43.0
394 | ZEROP42000 | 202008 | <NA> | 42 | 강원도 | 4161697 | H | 52919 | 기타 육상 운송지원 서비스업 | 2020-10-21 12:28:43.0
395 | ZEROP42000 | 202008 | <NA> | 42 | 강원도 | 00 | H | 52929 | 기타 수상 운송 지원 서비스업 | 2020-10-21 12:28:43.0
396 | ZEROP42000 | 202008 | <NA> | 42 | 강원도 | 00 | H | 52992 | 화물 운송 중개대리 및 관련 서비스업 | 2020-10-21 12:28:43.0
397 | ZEROP42000 | 202008 | <NA> | 42 | 강원도 | 00 | H | 52999 | 그 외 기타 분류 안된 운송관련 서비스업 | 2020-10-21 12:28:43.0
398 | ZEROP42000 | 202008 | <NA> | 42 | 강원도 | 924556700 | I | 55101 | 호텔업 | 2020-10-21 12:28:43.0
399 | ZEROP42000 | 202008 | <NA> | 42 | 강원도 | 7716404320 | I | 55102 | 여관업 | 2020-10-21 12:28:43.0