Overview

Dataset statistics

Number of variables: 11
Number of observations: 400
Missing cells: 400
Missing cells (%): 9.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 36.8 KiB
Average record size in memory: 94.3 B
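
The percentage and per-record figures above are simple ratios of the reported counts. A minimal sketch in Python reproduces them; the function names are illustrative and are not part of ydata-profiling.

```python
def missing_cell_pct(missing_cells: int, n_rows: int, n_cols: int) -> float:
    """Share of missing cells among all cells, in percent."""
    return 100.0 * missing_cells / (n_rows * n_cols)


def avg_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Average bytes per row, given the total in-memory size in KiB."""
    return total_kib * 1024 / n_rows


# Figures taken from the statistics above: 400 missing cells, 400 rows, 11 columns.
pct = missing_cell_pct(400, 400, 11)
size = avg_record_size_bytes(36.8, 400)
print(f"{pct:.1f}%")    # 9.1%
print(f"{size:.1f} B")  # close to the reported 94.3 B, since 36.8 KiB is itself rounded
```

The 400 missing cells equal one fully empty column out of 11, which is why the share is 1/11.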

Variable types

Categorical: 5
Unsupported: 1
Numeric: 3
Text: 1
DateTime: 1

Dataset

Description: Sample
Author: 소상공인연합회
URL: https://www.bigdata-telecom.kr/invoke/SOKBP2603/?goodsCode=KFMZEROSTT012

Alerts

소상공인결제분류코드 has constant value "ZEROP43000" (Constant)
년월 has constant value "202008" (Constant)
광역시도코드 has constant value "43" (Constant)
광역시도명 has constant value "충청북도" (Constant)
소상공인시스템로그일시 has constant value "2020-10-21 12:28:43" (Constant)
결제건수 is highly overall correlated with 합계금액 (High correlation)
합계금액 is highly overall correlated with 결제건수 (High correlation)
표준산업업종상세분류코드 is highly overall correlated with 표준산업업종대분류코드 (High correlation)
표준산업업종대분류코드 is highly overall correlated with 표준산업업종상세분류코드 (High correlation)
소상공인시스템로그ID has 400 (100.0%) missing values (Missing)
표준산업업종상세분류코드 has unique values (Unique)
표준산업업종상세분류명 has unique values (Unique)
소상공인시스템로그ID is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
결제건수 has 290 (72.5%) zeros (Zeros)
합계금액 has 290 (72.5%) zeros (Zeros)
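
The Constant and Zeros alerts can be checked by hand. The sketch below is a plain-Python illustration rather than ydata-profiling's own logic; the helper names and the toy rows are assumptions made for the example.

```python
from typing import Any


def constant_columns(rows: list[dict[str, Any]]) -> list[str]:
    """Return columns whose non-missing values are all identical."""
    if not rows:
        return []
    constant = []
    for col in rows[0]:
        values = {row[col] for row in rows if row[col] is not None}
        if len(values) == 1:
            constant.append(col)
    return constant


def zero_share(rows: list[dict[str, Any]], col: str) -> float:
    """Fraction of rows in which `col` equals zero."""
    return sum(1 for row in rows if row[col] == 0) / len(rows)


# Toy rows shaped like the dataset (values are illustrative, not the real data).
rows = [
    {"년월": "202008", "광역시도코드": "43", "결제건수": 0},
    {"년월": "202008", "광역시도코드": "43", "결제건수": 3},
    {"년월": "202008", "광역시도코드": "43", "결제건수": 0},
    {"년월": "202008", "광역시도코드": "43", "결제건수": 0},
]
print(constant_columns(rows))       # ['년월', '광역시도코드']
print(zero_share(rows, "결제건수"))  # 0.75
```

In the report itself the same ratio is 290 / 400 = 72.5% for both 결제건수 and 합계금액. A column that is entirely missing yields no distinct values in this sketch, so it is not counted as constant.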

Reproduction

Analysis started: 2023-12-10 06:35:59.764454
Analysis finished: 2023-12-10 06:36:02.081677
Duration: 2.32 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
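
A report of this kind could be regenerated along the following lines. This is a sketch under stated assumptions: the input is assumed to be a CSV file, the file name is a placeholder, and the settings actually used for this report live in config.json and are not reproduced here.

```python
from pathlib import Path


def build_report(csv_path: str, out_path: str = "report.html") -> Path:
    """Profile a CSV file and write the HTML report to `out_path`."""
    # Imported lazily so that defining this function does not require the libraries.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the source data is a CSV file; the real input format is not stated.
    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Profiling report")
    profile.to_file(out_path)
    return Path(out_path)


# Hypothetical usage; "payments_202008.csv" is a placeholder file name.
# build_report("payments_202008.csv")
```

Running it requires pandas and ydata-profiling to be installed; output details may differ between library versions.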

Variables

소상공인결제분류코드
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value | Count
ZEROP43000 | 400

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ZEROP43000
2nd row: ZEROP43000
3rd row: ZEROP43000
4th row: ZEROP43000
5th row: ZEROP43000

Common Values

Value | Count | Frequency (%)
ZEROP43000 | 400 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
zerop43000 | 400 | 100.0%

년월
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value | Count
202008 | 400

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202008
2nd row: 202008
3rd row: 202008
4th row: 202008
5th row: 202008

Common Values

Value | Count | Frequency (%)
202008 | 400 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
202008 | 400 | 100.0%

소상공인시스템로그ID
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 400
Missing (%): 100.0%
Memory size: 3.6 KiB

광역시도코드
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value | Count
43 | 400

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 43
2nd row: 43
3rd row: 43
4th row: 43
5th row: 43

Common Values

Value | Count | Frequency (%)
43 | 400 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
43 | 400 | 100.0%

광역시도명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value | Count
충청북도 | 400

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 충청북도
2nd row: 충청북도
3rd row: 충청북도
4th row: 충청북도
5th row: 충청북도

Common Values

Value | Count | Frequency (%)
충청북도 | 400 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
충청북도 | 400 | 100.0%

결제건수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 40
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 19.9125
Minimum: 0
Maximum: 4433
Zeros: 290
Zeros (%): 72.5%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 29.1
Maximum: 4433
Range: 4433
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 226.85389
Coefficient of variation (CV): 11.392537
Kurtosis: 361.58702
Mean: 19.9125
Median Absolute Deviation (MAD): 0
Skewness: 18.640877
Sum: 7965
Variance: 51462.687
Monotonicity: Not monotonic
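
Several of the descriptive statistics for 결제건수 are tied together by simple identities: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, and the variance is the square of the standard deviation. A short check in Python, using only the reported figures:

```python
# Values as reported for 결제건수.
n = 400
total = 7965
std = 226.85389

mean = total / n      # sum divided by the number of observations
cv = std / mean       # coefficient of variation
variance = std ** 2   # square of the standard deviation

print(mean)                # 19.9125
print(round(cv, 4))        # 11.3925
print(round(variance, 1))  # 51462.7
```

A CV above 11 together with a median of 0 indicates that a few large values dominate the distribution.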

Histogram with fixed size bins (bins=40)

Common Values

Value | Count | Frequency (%)
0 | 290 | 72.5%
1 | 29 | 7.2%
2 | 16 | 4.0%
3 | 11 | 2.8%
4 | 4 | 1.0%
9 | 4 | 1.0%
7 | 3 | 0.8%
10 | 2 | 0.5%
14 | 2 | 0.5%
20 | 2 | 0.5%
Other values (30) | 37 | 9.2%

Minimum 10 values

Value | Count | Frequency (%)
0 | 290 | 72.5%
1 | 29 | 7.2%
2 | 16 | 4.0%
3 | 11 | 2.8%
4 | 4 | 1.0%
5 | 1 | 0.2%
7 | 3 | 0.8%
8 | 2 | 0.5%
9 | 4 | 1.0%
10 | 2 | 0.5%

Maximum 10 values

Value | Count | Frequency (%)
4433 | 1 | 0.2%
613 | 1 | 0.2%
600 | 1 | 0.2%
319 | 1 | 0.2%
216 | 1 | 0.2%
209 | 1 | 0.2%
194 | 1 | 0.2%
141 | 1 | 0.2%
128 | 1 | 0.2%
81 | 1 | 0.2%

합계금액
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 108
Distinct (%): 27.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 665410.26
Minimum: 0
Maximum: 1.06839 × 10^8
Zeros: 290
Zeros (%): 72.5%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 13350
95-th percentile: 1152900
Maximum: 1.06839 × 10^8
Range: 1.06839 × 10^8
Interquartile range (IQR): 13350

Descriptive statistics

Standard deviation: 5800716.5
Coefficient of variation (CV): 8.7175038
Kurtosis: 285.016
Mean: 665410.26
Median Absolute Deviation (MAD): 0
Skewness: 16.013322
Sum: 2.661641 × 10^8
Variance: 3.3648311 × 10^13
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
0 | 290 | 72.5%
15000 | 3 | 0.8%
155000 | 2 | 0.5%
99000 | 1 | 0.2%
913100 | 1 | 0.2%
1820300 | 1 | 0.2%
687740 | 1 | 0.2%
324000 | 1 | 0.2%
200200 | 1 | 0.2%
124000 | 1 | 0.2%
Other values (98) | 98 | 24.5%

Minimum 10 values

Value | Count | Frequency (%)
0 | 290 | 72.5%
4000 | 1 | 0.2%
4200 | 1 | 0.2%
4500 | 1 | 0.2%
4900 | 1 | 0.2%
5000 | 1 | 0.2%
8000 | 1 | 0.2%
9000 | 1 | 0.2%
10100 | 1 | 0.2%
10500 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
106839000 | 1 | 0.2%
30637250 | 1 | 0.2%
20973000 | 1 | 0.2%
17704760 | 1 | 0.2%
13755360 | 1 | 0.2%
10108500 | 1 | 0.2%
7420000 | 1 | 0.2%
6234020 | 1 | 0.2%
6106500 | 1 | 0.2%
5115900 | 1 | 0.2%
0.2%

표준산업업종대분류코드
Categorical

HIGH CORRELATION 

Distinct: 17
Distinct (%): 4.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value | Count
G | 151
C | 103
I | 26
P | 18
M | 17
Other values (12) | 85

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 3
Unique (%): 0.8%

Sample

1st row: A
2nd row: A
3rd row: A
4th row: A
5th row: A

Common Values

Value | Count | Frequency (%)
G | 151 | 37.8%
C | 103 | 25.8%
I | 26 | 6.5%
P | 18 | 4.5%
M | 17 | 4.2%
Q | 13 | 3.2%
F | 13 | 3.2%
N | 12 | 3.0%
A | 11 | 2.8%
J | 9 | 2.2%
Other values (7) | 27 | 6.8%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
g | 151 | 37.8%
c | 103 | 25.8%
i | 26 | 6.5%
p | 18 | 4.5%
m | 17 | 4.2%
q | 13 | 3.2%
f | 13 | 3.2%
n | 12 | 3.0%
a | 11 | 2.8%
j | 9 | 2.2%
Other values (7) | 27 | 6.8%

표준산업업종상세분류코드
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 400
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 45393.3
Minimum: 1110
Maximum: 91134
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: 1110
5-th percentile: 10501.95
Q1: 29236.25
median: 46695.5
Q3: 56122.25
95-th percentile: 86102.05
Maximum: 91134
Range: 90024
Interquartile range (IQR): 26886

Descriptive statistics

Standard deviation: 22658.713
Coefficient of variation (CV): 0.49916427
Kurtosis: -0.45102824
Mean: 45393.3
Median Absolute Deviation (MAD): 11416.5
Skewness: 0.062511089
Sum: 18157320
Variance: 5.1341729 × 10^8
Monotonicity: Strictly increasing
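
"Strictly increasing" means that each code is larger than the one in the row before it, which is consistent with the column holding 400 distinct values in sorted order. A minimal sketch of such a check (the function name is illustrative, not ydata-profiling's API):

```python
def is_strictly_increasing(values: list[float]) -> bool:
    """True when every value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


# The ten smallest codes listed in this report, in row order.
codes = [1110, 1121, 1122, 1123, 1131, 1140, 1152, 1159, 1299, 1300]
print(is_strictly_increasing(codes))               # True
print(is_strictly_increasing([1110, 1110, 1121]))  # False: a repeat breaks strictness
```

Because these values are classification codes rather than quantities, statistics such as the mean or skewness of this column carry little meaning.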

Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
1110 | 1 | 0.2%
47813 | 1 | 0.2%
47869 | 1 | 0.2%
47862 | 1 | 0.2%
47859 | 1 | 0.2%
47852 | 1 | 0.2%
47851 | 1 | 0.2%
47842 | 1 | 0.2%
47841 | 1 | 0.2%
47830 | 1 | 0.2%
Other values (390) | 390 | 97.5%

Minimum 10 values

Value | Count | Frequency (%)
1110 | 1 | 0.2%
1121 | 1 | 0.2%
1122 | 1 | 0.2%
1123 | 1 | 0.2%
1131 | 1 | 0.2%
1140 | 1 | 0.2%
1152 | 1 | 0.2%
1159 | 1 | 0.2%
1299 | 1 | 0.2%
1300 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
91134 | 1 | 0.2%
91132 | 1 | 0.2%
91131 | 1 | 0.2%
91111 | 1 | 0.2%
90212 | 1 | 0.2%
90199 | 1 | 0.2%
90132 | 1 | 0.2%
90121 | 1 | 0.2%
90110 | 1 | 0.2%
87299 | 1 | 0.2%

표준산업업종상세분류명
Text

UNIQUE 

Distinct: 400
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 28
Median length: 21
Mean length: 12.4225
Min length: 3

Characters and Unicode

Total characters: 4969
Distinct characters: 320
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
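
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point ("Lo" is Other Letter, "Zs" is Space Separator). The sketch below tallies categories for two values from this column; it shows the idea only and is not ydata-profiling's implementation.

```python
import unicodedata
from collections import Counter


def category_counts(texts: list[str]) -> Counter:
    """Count characters by Unicode general category (e.g. 'Lo', 'Zs')."""
    return Counter(unicodedata.category(ch) for text in texts for ch in text)


# Two values from the 표준산업업종상세분류명 column.
names = ["채소작물 재배업", "종자 및 묘목 생산업"]
counts = category_counts(names)
print(counts["Lo"], counts["Zs"])  # 15 4
```

Hangul syllables fall under "Lo", which is why Other Letter dominates a Korean text column.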

Unique

Unique: 400
Unique (%): 100.0%

Sample

1st row: 곡물 및 기타 식량작물 재배업
2nd row: 채소작물 재배업
3rd row: 화훼작물 재배업
4th row: 종자 및 묘목 생산업
5th row: 과실작물 재배업
ValueCountFrequency (%)
169
 
11.3%
기타 94
 
6.3%
제조업 83
 
5.6%
도매업 68
 
4.6%
소매업 62
 
4.2%
27
 
1.8%
27
 
1.8%
서비스업 14
 
0.9%
운영업 13
 
0.9%
일반 10
 
0.7%
Other values (586) 923
61.9%

Most occurring characters

ValueCountFrequency (%)
1090
21.9%
376
 
7.6%
169
 
3.4%
146
 
2.9%
143
 
2.9%
139
 
2.8%
113
 
2.3%
105
 
2.1%
96
 
1.9%
87
 
1.8%
Other values (310) 2505
50.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3874 | 78.0%
Space Separator | 1090 | 21.9%
Decimal Number | 3 | 0.1%
Open Punctuation | 1 | < 0.1%
Close Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
376
 
9.7%
169
 
4.4%
146
 
3.8%
143
 
3.7%
139
 
3.6%
113
 
2.9%
105
 
2.7%
96
 
2.5%
87
 
2.2%
78
 
2.0%
Other values (306) 2422
62.5%
Space Separator
ValueCountFrequency (%)
1090
100.0%
Decimal Number
ValueCountFrequency (%)
1 3
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3874 | 78.0%
Common | 1095 | 22.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
376
 
9.7%
169
 
4.4%
146
 
3.8%
143
 
3.7%
139
 
3.6%
113
 
2.9%
105
 
2.7%
96
 
2.5%
87
 
2.2%
78
 
2.0%
Other values (306) 2422
62.5%
Common
ValueCountFrequency (%)
1090
99.5%
1 3
 
0.3%
( 1
 
0.1%
) 1
 
0.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3855 | 77.6%
ASCII | 1095 | 22.0%
Compat Jamo | 19 | 0.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1090
99.5%
1 3
 
0.3%
( 1
 
0.1%
) 1
 
0.1%
Hangul
ValueCountFrequency (%)
376
 
9.8%
169
 
4.4%
146
 
3.8%
143
 
3.7%
139
 
3.6%
113
 
2.9%
105
 
2.7%
96
 
2.5%
87
 
2.3%
78
 
2.0%
Other values (305) 2403
62.3%
Compat Jamo
ValueCountFrequency (%)
19
100.0%

소상공인시스템로그일시
DateTime

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB
Minimum: 2020-10-21 12:28:43
Maximum: 2020-10-21 12:28:43

Histogram with fixed size bins (bins=1)

Interactions


Correlations

Variable | 결제건수 | 합계금액 | 표준산업업종대분류코드 | 표준산업업종상세분류코드
결제건수 | 1.000 | 0.841 | 0.000 | 0.000
합계금액 | 0.841 | 1.000 | 0.000 | 0.000
표준산업업종대분류코드 | 0.000 | 0.000 | 1.000 | 0.958
표준산업업종상세분류코드 | 0.000 | 0.000 | 0.958 | 1.000

Variable | 결제건수 | 합계금액 | 표준산업업종상세분류코드 | 표준산업업종대분류코드
결제건수 | 1.000 | 0.991 | 0.194 | 0.000
합계금액 | 0.991 | 1.000 | 0.195 | 0.000
표준산업업종상세분류코드 | 0.194 | 0.195 | 1.000 | 0.805
표준산업업종대분류코드 | 0.000 | 0.000 | 0.805 | 1.000
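
The report does not state which coefficient each matrix holds. As one illustration of how an association between two numeric columns such as 결제건수 and 합계금액 could be measured, the sketch below computes a Spearman rank correlation in plain Python. The helper names and the toy values are assumptions for the example and are not taken from the dataset.

```python
def average_ranks(values: list[float]) -> list[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x: list[float], y: list[float]) -> float:
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy values only: payment counts and total amounts for five hypothetical rows.
counts = [0, 0, 1, 3, 20]
amounts = [0, 0, 15000, 99000, 60000]
print(round(spearman(counts, amounts), 3))
```

With 72.5% of rows equal to zero in both columns, tied ranks dominate, so the tie handling in `average_ranks` matters for data shaped like this.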

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
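
The per-column nullity that these plots visualize can also be tabulated directly. A minimal sketch, assuming rows are held as dictionaries with `None` marking a missing value (the helper name and the toy rows are illustrative):

```python
def missing_per_column(rows: list[dict]) -> dict[str, int]:
    """Count None values in each column of a list of dict rows."""
    counts = {col: 0 for col in rows[0]} if rows else {}
    for row in rows:
        for col, value in row.items():
            if value is None:
                counts[col] = counts.get(col, 0) + 1
    return counts


# Toy rows: the log ID is missing everywhere, as in the profiled dataset.
rows = [
    {"소상공인시스템로그ID": None, "광역시도코드": "43", "결제건수": 0},
    {"소상공인시스템로그ID": None, "광역시도코드": "43", "결제건수": 3},
]
print(missing_per_column(rows))
# {'소상공인시스템로그ID': 2, '광역시도코드': 0, '결제건수': 0}
```

In the profiled data the only column with missing values is 소상공인시스템로그ID, which is empty in all 400 rows.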

Sample

소상공인결제분류코드년월소상공인시스템로그ID광역시도코드광역시도명결제건수합계금액표준산업업종대분류코드표준산업업종상세분류코드표준산업업종상세분류명소상공인시스템로그일시
0ZEROP43000202008<NA>43충청북도00A1110곡물 및 기타 식량작물 재배업2020-10-21 12:28:43.0
1ZEROP43000202008<NA>43충청북도00A1121채소작물 재배업2020-10-21 12:28:43.0
2ZEROP43000202008<NA>43충청북도00A1122화훼작물 재배업2020-10-21 12:28:43.0
3ZEROP43000202008<NA>43충청북도00A1123종자 및 묘목 생산업2020-10-21 12:28:43.0
4ZEROP43000202008<NA>43충청북도00A1131과실작물 재배업2020-10-21 12:28:43.0
5ZEROP43000202008<NA>43충청북도00A1140기타 작물 재배업2020-10-21 12:28:43.0
6ZEROP43000202008<NA>43충청북도00A1152채소화훼 및 과실작물 시설 재배업2020-10-21 12:28:43.0
7ZEROP43000202008<NA>43충청북도00A1159기타 시설작물 재배업2020-10-21 12:28:43.0
8ZEROP43000202008<NA>43충청북도3291390A1299그 외 기타 축산업2020-10-21 12:28:43.0
9ZEROP43000202008<NA>43충청북도00A1300작물재배 및 축산 복합농업2020-10-21 12:28:43.0
소상공인결제분류코드년월소상공인시스템로그ID광역시도코드광역시도명결제건수합계금액표준산업업종대분류코드표준산업업종상세분류코드표준산업업종상세분류명소상공인시스템로그일시
390ZEROP43000202008<NA>43충청북도00Q87299그 외 기타 비거주 복지 서비스업2020-10-21 12:28:43.0
391ZEROP43000202008<NA>43충청북도00R90110공연시설 운영업2020-10-21 12:28:43.0
392ZEROP43000202008<NA>43충청북도00R90121연극단체2020-10-21 12:28:43.0
393ZEROP43000202008<NA>43충청북도00R90132비공연 예술가2020-10-21 12:28:43.0
394ZEROP43000202008<NA>43충청북도00R90199그 외 기타 창작 및 예술관련 서비스업2020-10-21 12:28:43.0
395ZEROP43000202008<NA>43충청북도00R90212독서실 운영업2020-10-21 12:28:43.0
396ZEROP43000202008<NA>43충청북도00R91111실내 경기장 운영업2020-10-21 12:28:43.0
397ZEROP43000202008<NA>43충청북도199000R91131종합 스포츠시설 운영업2020-10-21 12:28:43.0
398ZEROP43000202008<NA>43충청북도3115000R91132체력 단련시설 운영업2020-10-21 12:28:43.0
399ZEROP43000202008<NA>43충청북도00R91134볼링장 운영업2020-10-21 12:28:43.0