Overview

Dataset statistics

Number of variables: 11
Number of observations: 400
Missing cells: 400
Missing cells (%): 9.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 36.8 KiB
Average record size in memory: 94.3 B
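
The derived figures in this block follow from simple arithmetic on the reported counts. The sketch below assumes that the missing-cell percentage is missing cells divided by variables × observations, and that the average record size is total memory divided by observations; the variable names are illustrative and are not taken from the report's code.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 11
n_observations = 400
missing_cells = 400
total_memory_kib = 36.8  # already rounded to one decimal in the report

total_cells = n_variables * n_observations        # 4400 cells
missing_pct = 100 * missing_cells / total_cells   # about 9.09

# Approximate bytes per row, recomputed from the rounded KiB value.
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

The recomputed record size differs slightly from the reported 94.3 B because it starts from the rounded 36.8 KiB. The 400 missing cells correspond to one entirely empty column of 400 rows.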

Variable types

Categorical: 6
Unsupported: 1
Numeric: 3
Text: 1

Dataset

Description: Sample
Author: 소상공인연합회
URL: https://www.bigdata-telecom.kr/invoke/SOKBP2603/?goodsCode=KFMZEROSTT005

Alerts

소상공인결제분류코드 has constant value "ZEROP26000" (Constant)
년월 has constant value "202008" (Constant)
광역시도코드 has constant value "26" (Constant)
광역시도명 has constant value "부산광역시" (Constant)
소상공인시스템로그일시 has constant value "2020-10-21 12:28:43.0" (Constant)
결제건수 is highly overall correlated with 합계금액 (High correlation)
합계금액 is highly overall correlated with 결제건수 (High correlation)
표준산업업종상세분류코드 is highly overall correlated with 표준산업업종대분류코드 (High correlation)
표준산업업종대분류코드 is highly overall correlated with 표준산업업종상세분류코드 (High correlation)
소상공인시스템로그ID has 400 (100.0%) missing values (Missing)
표준산업업종상세분류코드 has unique values (Unique)
표준산업업종상세분류명 has unique values (Unique)
소상공인시스템로그ID is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
결제건수 has 346 (86.5%) zeros (Zeros)
합계금액 has 346 (86.5%) zeros (Zeros)
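
Most of these alerts rest on per-column checks: one distinct value, share of missing cells, share of zeros. The sketch below shows how such checks could be computed on three rows taken from the Sample section. `summarize_columns` is a hypothetical helper written for illustration; it is not the profiling library's implementation, and the thresholds that turn a share into an alert are not shown because the report does not state them.

```python
from collections import Counter

# Three rows from the Sample section of this report (subset of columns).
rows = [
    {"년월": "202008", "소상공인시스템로그ID": None, "광역시도명": "부산광역시", "결제건수": 0, "합계금액": 0},
    {"년월": "202008", "소상공인시스템로그ID": None, "광역시도명": "부산광역시", "결제건수": 1, "합계금액": 74000},
    {"년월": "202008", "소상공인시스템로그ID": None, "광역시도명": "부산광역시", "결제건수": 3, "합계금액": 1390000},
]

def summarize_columns(rows):
    """Hypothetical helper: missing share, zero share and constant flag per column."""
    summary = {}
    for name in rows[0]:
        values = [row[name] for row in rows]
        present = [v for v in values if v is not None]
        distinct = Counter(present)
        summary[name] = {
            "missing_pct": 100 * (len(values) - len(present)) / len(values),
            "zeros_pct": 100 * sum(1 for v in present if v == 0) / len(values),
            # A fully missing column has no distinct value, so it is not constant.
            "constant": len(distinct) == 1,
        }
    return summary

summary = summarize_columns(rows)
for name, stats in summary.items():
    print(name, stats)
```

On the full 400-row dataset the same checks would give the counts listed in the alerts, for example 346 zeros out of 400 rows (86.5%) for 결제건수.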

Reproduction

Analysis started: 2023-12-10 06:24:20.225637
Analysis finished: 2023-12-10 06:24:23.027284
Duration: 2.8 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
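
The reported duration can be checked against the two timestamps. This is a small sketch using only the standard library; the format string is an assumption based on how the timestamps are printed above.

```python
from datetime import datetime

# Timestamp layout as printed in the Reproduction block (assumed format).
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-10 06:24:20.225637", FMT)
finished = datetime.strptime("2023-12-10 06:24:23.027284", FMT)

# Elapsed wall-clock time between the two timestamps, in seconds.
duration = (finished - started).total_seconds()
print(f"Duration: {duration:.1f} seconds")
```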

Variables

소상공인결제분류코드
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

ZEROP26000 | 400

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ZEROP26000
2nd row: ZEROP26000
3rd row: ZEROP26000
4th row: ZEROP26000
5th row: ZEROP26000

Common Values

Value | Count | Frequency (%)
ZEROP26000 | 400 | 100.0%

Length

Histogram of lengths of the category [image not preserved]

Common Values (Plot)

[image not preserved]

Words

Value | Count | Frequency (%)
zerop26000 | 400 | 100.0%

년월
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

202008 | 400

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202008
2nd row: 202008
3rd row: 202008
4th row: 202008
5th row: 202008

Common Values

Value | Count | Frequency (%)
202008 | 400 | 100.0%

Length

Histogram of lengths of the category [image not preserved]

Common Values (Plot)

[image not preserved]

Words

Value | Count | Frequency (%)
202008 | 400 | 100.0%

소상공인시스템로그ID
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 400
Missing (%): 100.0%
Memory size: 3.6 KiB

광역시도코드
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

26 | 400

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 26
2nd row: 26
3rd row: 26
4th row: 26
5th row: 26

Common Values

Value | Count | Frequency (%)
26 | 400 | 100.0%

Length

Histogram of lengths of the category [image not preserved]

Common Values (Plot)

[image not preserved]

Words

Value | Count | Frequency (%)
26 | 400 | 100.0%

광역시도명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

부산광역시 | 400

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 부산광역시
2nd row: 부산광역시
3rd row: 부산광역시
4th row: 부산광역시
5th row: 부산광역시

Common Values

Value | Count | Frequency (%)
부산광역시 | 400 | 100.0%

Length

Histogram of lengths of the category [image not preserved]

Common Values (Plot)

[image not preserved]

Words

Value | Count | Frequency (%)
부산광역시 | 400 | 100.0%

결제건수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 16
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.9625
Minimum: 0
Maximum: 791
Zeros: 346
Zeros (%): 86.5%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 3.05
Maximum: 791
Range: 791
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 39.889146
Coefficient of variation (CV): 13.464691
Kurtosis: 384.5517
Mean: 2.9625
Median Absolute Deviation (MAD): 0
Skewness: 19.440914
Sum: 1185
Variance: 1591.144
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=16) [image not preserved]

Common values

Value | Count | Frequency (%)
0 | 346 | 86.5%
1 | 22 | 5.5%
3 | 6 | 1.5%
2 | 6 | 1.5%
5 | 5 | 1.2%
6 | 2 | 0.5%
60 | 2 | 0.5%
4 | 2 | 0.5%
36 | 2 | 0.5%
10 | 1 | 0.2%
Other values (6) | 6 | 1.5%

Minimum 10 values

Value | Count | Frequency (%)
0 | 346 | 86.5%
1 | 22 | 5.5%
2 | 6 | 1.5%
3 | 6 | 1.5%
4 | 2 | 0.5%
5 | 5 | 1.2%
6 | 2 | 0.5%
7 | 1 | 0.2%
9 | 1 | 0.2%
10 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
791 | 1 | 0.2%
60 | 2 | 0.5%
38 | 1 | 0.2%
36 | 2 | 0.5%
26 | 1 | 0.2%
15 | 1 | 0.2%
10 | 1 | 0.2%
9 | 1 | 0.2%
7 | 1 | 0.2%
6 | 2 | 0.5%
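
Because the value tables for 결제건수 together list all 16 distinct values, the full distribution can be rebuilt and the summary statistics rechecked. The sketch below assumes the sample variance (n − 1 denominator) and a linearly interpolated percentile; both assumptions reproduce the reported figures, but the report itself does not state which estimators were used. The `percentile` helper is illustrative.

```python
import math

# Value -> count for 결제건수, combined from the common, minimum and maximum
# value tables of this variable (16 distinct values, 400 rows).
counts = {0: 346, 1: 22, 2: 6, 3: 6, 4: 2, 5: 5, 6: 2, 7: 1, 9: 1,
          10: 1, 15: 1, 26: 1, 36: 2, 38: 1, 60: 2, 791: 1}

# Expand the frequency table into a sorted list of observations.
values = sorted(v for v, c in counts.items() for _ in range(c))
n = len(values)
total = sum(values)
mean = total / n

# Sample variance with n - 1 in the denominator (assumed estimator).
variance = sum((v - mean) ** 2 for v in values) / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

def percentile(sorted_values, q):
    """Percentile by linear interpolation between neighbouring ranks, q in [0, 1]."""
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])

p95 = percentile(values, 0.95)
print(n, total, mean)
print(round(variance, 3), round(std, 6), round(cv, 6), round(p95, 2))
```

The single observation of 791 accounts for about two thirds of the sum of 1185, which explains the very large skewness and kurtosis next to a median of 0.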

합계금액
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 53
Distinct (%): 13.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1475415
Minimum: 0
Maximum: 5.2541288 × 10^8
Zeros: 346
Zeros (%): 86.5%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 270610
Maximum: 5.2541288 × 10^8
Range: 5.2541288 × 10^8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 26302857
Coefficient of variation (CV): 17.827429
Kurtosis: 397.53555
Mean: 1475415
Median Absolute Deviation (MAD): 0
Skewness: 19.910014
Sum: 5.9016602 × 10^8
Variance: 6.9184027 × 10^14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50) [image not preserved]

Common values

Value | Count | Frequency (%)
0 | 346 | 86.5%
21000 | 2 | 0.5%
15000 | 2 | 0.5%
269500 | 1 | 0.2%
23744400 | 1 | 0.2%
4091829 | 1 | 0.2%
194000 | 1 | 0.2%
25000 | 1 | 0.2%
2836870 | 1 | 0.2%
2838050 | 1 | 0.2%
Other values (43) | 43 | 10.8%

Minimum 10 values

Value | Count | Frequency (%)
0 | 346 | 86.5%
1 | 1 | 0.2%
6500 | 1 | 0.2%
7000 | 1 | 0.2%
13000 | 1 | 0.2%
14000 | 1 | 0.2%
15000 | 2 | 0.5%
16000 | 1 | 0.2%
17000 | 1 | 0.2%
17500 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
525412877 | 1 | 0.2%
23744400 | 1 | 0.2%
15409800 | 1 | 0.2%
4091829 | 1 | 0.2%
3405800 | 1 | 0.2%
2838050 | 1 | 0.2%
2836870 | 1 | 0.2%
1800000 | 1 | 0.2%
1390000 | 1 | 0.2%
1200000 | 1 | 0.2%

표준산업업종대분류코드
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

C | 259
G | 76
F | 37
A | 16
E | 7
Other values (2) | 5

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: A
2nd row: A
3rd row: A
4th row: A
5th row: A

Common Values

Value | Count | Frequency (%)
C | 259 | 64.8%
G | 76 | 19.0%
F | 37 | 9.2%
A | 16 | 4.0%
E | 7 | 1.8%
D | 3 | 0.8%
B | 2 | 0.5%
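
The frequency column can be recomputed from the counts. The sketch below uses Python's one-decimal formatting, which rounds exact ties to the even digit; that behaviour matches the printed values (37 of 400 is exactly 9.25% and appears as 9.2%), although the report's own formatting code is not visible here, so treat the rounding rule as an assumption.

```python
# Counts copied from the Common Values table of 표준산업업종대분류코드.
counts = {"C": 259, "G": 76, "F": 37, "A": 16, "E": 7, "D": 3, "B": 2}
n_observations = sum(counts.values())

formatted = {}
for code, count in counts.items():
    share = 100 * count / n_observations
    # One-decimal formatting rounds exact ties to the even digit,
    # so 9.25 prints as 9.2 and 1.75 prints as 1.8.
    formatted[code] = f"{share:.1f}%"
    print(f"{code}: {count} ({formatted[code]})")
```

The same rule would explain why a single row out of 400 (0.25%) is shown as 0.2% throughout the report.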

Length

Histogram of lengths of the category [image not preserved]

Common Values (Plot)

[image not preserved]

Words

Value | Count | Frequency (%)
c | 259 | 64.8%
g | 76 | 19.0%
f | 37 | 9.2%
a | 16 | 4.0%
e | 7 | 1.8%
d | 3 | 0.8%
b | 2 | 0.5%

표준산업업종상세분류코드
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 400
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 28171.265
Minimum: 1110
Maximum: 46592
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: 1110
5-th percentile: 10206.9
Q1: 18060.5
median: 28120
Q3: 42112.75
95-th percentile: 46444.35
Maximum: 46592
Range: 45482
Interquartile range (IQR): 24052.25

Descriptive statistics

Standard deviation: 13065.105
Coefficient of variation (CV): 0.46377417
Kurtosis: -0.98304248
Mean: 28171.265
Median Absolute Deviation (MAD): 12900.5
Skewness: -0.1165375
Sum: 11268506
Variance: 1.7069697 × 10^8
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50) [image not preserved]

Common values

Value | Count | Frequency (%)
1110 | 1 | 0.2%
33309 | 1 | 0.2%
34011 | 1 | 0.2%
33999 | 1 | 0.2%
33992 | 1 | 0.2%
33933 | 1 | 0.2%
33932 | 1 | 0.2%
33931 | 1 | 0.2%
33920 | 1 | 0.2%
33910 | 1 | 0.2%
Other values (390) | 390 | 97.5%

Minimum 10 values

Value | Count | Frequency (%)
1110 | 1 | 0.2%
1121 | 1 | 0.2%
1122 | 1 | 0.2%
1131 | 1 | 0.2%
1140 | 1 | 0.2%
1151 | 1 | 0.2%
1152 | 1 | 0.2%
1231 | 1 | 0.2%
1239 | 1 | 0.2%
1299 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
46592 | 1 | 0.2%
46591 | 1 | 0.2%
46539 | 1 | 0.2%
46533 | 1 | 0.2%
46532 | 1 | 0.2%
46522 | 1 | 0.2%
46521 | 1 | 0.2%
46510 | 1 | 0.2%
46499 | 1 | 0.2%
46493 | 1 | 0.2%

표준산업업종상세분류명
Text

UNIQUE 

Distinct: 400
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 28
Median length: 22
Mean length: 13.8175
Min length: 3

Characters and Unicode

Total characters: 5527
Distinct characters: 319
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 400
Unique (%): 100.0%

Sample

1st row: 곡물 및 기타 식량작물 재배업
2nd row: 채소작물 재배업
3rd row: 화훼작물 재배업
4th row: 과실작물 재배업
5th row: 기타 작물 재배업

Words

Value | Count | Frequency (%)
제조업 | 221 | 13.3%
(not preserved) | 189 | 11.4%
기타 | 97 | 5.8%
도매업 | 60 | 3.6%
(not preserved) | 29 | 1.7%
(not preserved) | 27 | 1.6%
공사업 | 19 | 1.1%
금속 | 17 | 1.0%
기기 | 15 | 0.9%
유사 | 14 | 0.8%
Other values (600) | 977 | 58.7%

Most occurring characters

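Value | Count | Frequency (%)
(space) | 1265 | 22.9%
(not preserved) | 420 | 7.6%
(not preserved) | 298 | 5.4%
(not preserved) | 270 | 4.9%
(not preserved) | 213 | 3.9%
(not preserved) | 189 | 3.4%
(not preserved) | 125 | 2.3%
(not preserved) | 116 | 2.1%
(not preserved) | 102 | 1.8%
(not preserved) | 76 | 1.4%
Other values (309) | 2453 | 44.4%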

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4255 | 77.0%
Space Separator | 1265 | 22.9%
Decimal Number | 3 | 0.1%
Open Punctuation | 2 | < 0.1%
Close Punctuation | 2 | < 0.1%

Most frequent character per category

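Other Letter
Value | Count | Frequency (%)
(not preserved) | 420 | 9.9%
(not preserved) | 298 | 7.0%
(not preserved) | 270 | 6.3%
(not preserved) | 213 | 5.0%
(not preserved) | 189 | 4.4%
(not preserved) | 125 | 2.9%
(not preserved) | 116 | 2.7%
(not preserved) | 102 | 2.4%
(not preserved) | 76 | 1.8%
(not preserved) | 71 | 1.7%
Other values (305) | 2375 | 55.8%

Space Separator
Value | Count | Frequency (%)
(space) | 1265 | 100.0%

Decimal Number
Value | Count | Frequency (%)
1 | 3 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 2 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 2 | 100.0%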

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4255 | 77.0%
Common | 1272 | 23.0%

Most frequent character per script

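Hangul
Value | Count | Frequency (%)
(not preserved) | 420 | 9.9%
(not preserved) | 298 | 7.0%
(not preserved) | 270 | 6.3%
(not preserved) | 213 | 5.0%
(not preserved) | 189 | 4.4%
(not preserved) | 125 | 2.9%
(not preserved) | 116 | 2.7%
(not preserved) | 102 | 2.4%
(not preserved) | 76 | 1.8%
(not preserved) | 71 | 1.7%
Other values (305) | 2375 | 55.8%

Common
Value | Count | Frequency (%)
(space) | 1265 | 99.4%
1 | 3 | 0.2%
( | 2 | 0.2%
) | 2 | 0.2%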

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 4237 | 76.7%
ASCII | 1272 | 23.0%
Compat Jamo | 18 | 0.3%

Most frequent character per block

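ASCII
Value | Count | Frequency (%)
(space) | 1265 | 99.4%
1 | 3 | 0.2%
( | 2 | 0.2%
) | 2 | 0.2%

Hangul
Value | Count | Frequency (%)
(not preserved) | 420 | 9.9%
(not preserved) | 298 | 7.0%
(not preserved) | 270 | 6.4%
(not preserved) | 213 | 5.0%
(not preserved) | 189 | 4.5%
(not preserved) | 125 | 3.0%
(not preserved) | 116 | 2.7%
(not preserved) | 102 | 2.4%
(not preserved) | 76 | 1.8%
(not preserved) | 71 | 1.7%
Other values (304) | 2357 | 55.6%

Compat Jamo
Value | Count | Frequency (%)
(not preserved) | 18 | 100.0%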

소상공인시스템로그일시
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

2020-10-21 12:28:43.0 | 400

Length

Max length: 21
Median length: 21
Mean length: 21
Min length: 21

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020-10-21 12:28:43.0
2nd row: 2020-10-21 12:28:43.0
3rd row: 2020-10-21 12:28:43.0
4th row: 2020-10-21 12:28:43.0
5th row: 2020-10-21 12:28:43.0

Common Values

Value | Count | Frequency (%)
2020-10-21 12:28:43.0 | 400 | 100.0%

Length

Histogram of lengths of the category [image not preserved]

Common Values (Plot)

[image not preserved]

Words

Value | Count | Frequency (%)
2020-10-21 | 400 | 50.0%
12:28:43.0 | 400 | 50.0%

Interactions

[Nine interaction plots; images not preserved]

Correlations

[Heatmap image not preserved]

Variable | 결제건수 | 합계금액 | 표준산업업종대분류코드 | 표준산업업종상세분류코드
결제건수 | 1.000 | 0.704 | 0.000 | 0.000
합계금액 | 0.704 | 1.000 | 0.000 | 0.000
표준산업업종대분류코드 | 0.000 | 0.000 | 1.000 | 0.882
표준산업업종상세분류코드 | 0.000 | 0.000 | 0.882 | 1.000

[Heatmap image not preserved]

Variable | 결제건수 | 합계금액 | 표준산업업종상세분류코드 | 표준산업업종대분류코드
결제건수 | 1.000 | 0.997 | 0.167 | 0.000
합계금액 | 0.997 | 1.000 | 0.169 | 0.000
표준산업업종상세분류코드 | 0.167 | 0.169 | 1.000 | 0.712
표준산업업종대분류코드 | 0.000 | 0.000 | 0.712 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

(index) | 소상공인결제분류코드 | 년월 | 소상공인시스템로그ID | 광역시도코드 | 광역시도명 | 결제건수 | 합계금액 | 표준산업업종대분류코드 | 표준산업업종상세분류코드 | 표준산업업종상세분류명 | 소상공인시스템로그일시
0 | ZEROP26000 | 202008 | <NA> | 26 | 부산광역시 | 0 | 0 | A | 1110 | 곡물 및 기타 식량작물 재배업 | 2020-10-21 12:28:43.0
1 | ZEROP26000 | 202008 | <NA> | 26 | 부산광역시 | 0 | 0 | A | 1121 | 채소작물 재배업 | 2020-10-21 12:28:43.0
2 | ZEROP26000 | 202008 | <NA> | 26 | 부산광역시 | 0 | 0 | A | 1122 | 화훼작물 재배업 | 2020-10-21 12:28:43.0
3 | ZEROP26000 | 202008 | <NA> | 26 | 부산광역시 | 0 | 0 | A | 1131 | 과실작물 재배업 | 2020-10-21 12:28:43.0
4 | ZEROP26000 | 202008 | <NA> | 26 | 부산광역시 | 0 | 0 | A | 1140 | 기타 작물 재배업 | 2020-10-21 12:28:43.0
5 | ZEROP26000 | 202008 | <NA> | 26 | 부산광역시 | 0 | 0 | A | 1151 | 콩나물 재배업 | 2020-10-21 12:28:43.0
6 | ZEROP26000 | 202008 | <NA> | 26 | 부산광역시 | 0 | 0 | A | 1152 | 채소화훼 및 과실작물 시설 재배업 | 2020-10-21 12:28:43.0
7 | ZEROP26000 | 202008 | <NA> | 26 | 부산광역시 | 0 | 0 | A | 1231 | 양계업 | 2020-10-21 12:28:43.0
8 | ZEROP26000 | 202008 | <NA> | 26 | 부산광역시 | 0 | 0 | A | 1239 | 기타 가금류 및 조류 사육업 | 2020-10-21 12:28:43.0
9 | ZEROP26000 | 202008 | <NA> | 26 | 부산광역시 | 1 | 74000 | A | 1299 | 그 외 기타 축산업 | 2020-10-21 12:28:43.0

Last rows

(index) | 소상공인결제분류코드 | 년월 | 소상공인시스템로그ID | 광역시도코드 | 광역시도명 | 결제건수 | 합계금액 | 표준산업업종대분류코드 | 표준산업업종상세분류코드 | 표준산업업종상세분류명 | 소상공인시스템로그일시
390 | ZEROP26000 | 202008 | <NA> | 26 | 부산광역시 | 0 | 0 | G | 46493 | 안경사진장비 및 광학용품 도매업 | 2020-10-21 12:28:43.0
391 | ZEROP26000 | 202008 | <NA> | 26 | 부산광역시 | 1 | 16000 | G | 46499 | 그 외 기타 생활용품 도매업 | 2020-10-21 12:28:43.0
392 | ZEROP26000 | 202008 | <NA> | 26 | 부산광역시 | 1 | 29000 | G | 46510 | 컴퓨터 및 주변장치소프트웨어 도매업 | 2020-10-21 12:28:43.0
393 | ZEROP26000 | 202008 | <NA> | 26 | 부산광역시 | 3 | 1390000 | G | 46521 | 가전제품 및 부품 도매업 | 2020-10-21 12:28:43.0
394 | ZEROP26000 | 202008 | <NA> | 26 | 부산광역시 | 0 | 0 | G | 46522 | 통신ㆍ방송장비 및 부품 도매업 | 2020-10-21 12:28:43.0
395 | ZEROP26000 | 202008 | <NA> | 26 | 부산광역시 | 0 | 0 | G | 46532 | 건설ㆍ광업용 기계 및 장비 도매업 | 2020-10-21 12:28:43.0
396 | ZEROP26000 | 202008 | <NA> | 26 | 부산광역시 | 0 | 0 | G | 46533 | 공작용 기계 및 장비 도매업 | 2020-10-21 12:28:43.0
397 | ZEROP26000 | 202008 | <NA> | 26 | 부산광역시 | 0 | 0 | G | 46539 | 기타 산업용 기계 및 장비 도매업 | 2020-10-21 12:28:43.0
398 | ZEROP26000 | 202008 | <NA> | 26 | 부산광역시 | 0 | 0 | G | 46591 | 사무용 가구 및 기기 도매업 | 2020-10-21 12:28:43.0
399 | ZEROP26000 | 202008 | <NA> | 26 | 부산광역시 | 0 | 0 | G | 46592 | 의료 기기 도매업 | 2020-10-21 12:28:43.0