Overview

Dataset statistics

Number of variables: 11
Number of observations: 400
Missing cells: 400
Missing cells (%): 9.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 36.8 KiB
Average record size in memory: 94.3 B
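
The missing-cell percentage follows directly from the counts in this table. As a small worked check (a sketch that only reuses the figures reported here; the variable names are illustrative):

```python
# Figures copied from the Dataset statistics table.
n_variables = 11
n_observations = 400
missing_cells = 400

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations

# Share of cells that are empty, as a percentage.
missing_pct = 100 * missing_cells / total_cells

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
```

The 400 missing cells equal one full column of 400 observations, which matches the alert that 소상공인시스템로그ID is 100.0% missing.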

Variable types

Categorical: 6
Unsupported: 1
Numeric: 3
Text: 1

Dataset

Description: Sample
Author: 소상공인연합회
URL: https://www.bigdata-telecom.kr/invoke/SOKBP2603/?goodsCode=KFMZEROSTT006

Alerts

소상공인결제분류코드 has constant value "ZEROP31000" (Constant)
년월 has constant value "202008" (Constant)
광역시도코드 has constant value "31" (Constant)
광역시도명 has constant value "울산광역시" (Constant)
소상공인시스템로그일시 has constant value "2020-10-21 12:28:43.0" (Constant)
결제건수 is highly overall correlated with 합계금액 (High correlation)
합계금액 is highly overall correlated with 결제건수 (High correlation)
표준산업업종상세분류코드 is highly overall correlated with 표준산업업종대분류코드 (High correlation)
표준산업업종대분류코드 is highly overall correlated with 표준산업업종상세분류코드 (High correlation)
소상공인시스템로그ID has 400 (100.0%) missing values (Missing)
표준산업업종상세분류코드 has unique values (Unique)
표준산업업종상세분류명 has unique values (Unique)
소상공인시스템로그ID is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
결제건수 has 338 (84.5%) zeros (Zeros)
합계금액 has 338 (84.5%) zeros (Zeros)
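
The Constant and Zeros alerts can be reproduced by hand. The sketch below is an illustration, not the profiler's own implementation: the helper names are hypothetical, and the toy columns only mimic the counts reported above (400 rows, 338 zeros).

```python
def is_constant(values):
    """Return True when a column holds exactly one distinct value."""
    return len(set(values)) == 1


def zero_share(values):
    """Return the fraction of entries equal to zero."""
    return sum(1 for v in values if v == 0) / len(values)


# Toy columns shaped like the report: 400 rows, one region, 338 zeros.
region = ["울산광역시"] * 400
payment_count = [0] * 338 + [1] * 62  # the non-zero values are placeholders

print(is_constant(region))
print(f"{zero_share(payment_count):.1%}")
```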

Reproduction

Analysis started: 2023-12-10 06:56:12.189625
Analysis finished: 2023-12-10 06:56:13.550957
Duration: 1.36 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

소상공인결제분류코드
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB
ZEROP31000: 400

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ZEROP31000
2nd row: ZEROP31000
3rd row: ZEROP31000
4th row: ZEROP31000
5th row: ZEROP31000

Common Values

Value       Count  Frequency (%)
ZEROP31000  400    100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value       Count  Frequency (%)
zerop31000  400    100.0%

년월
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB
202008: 400

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202008
2nd row: 202008
3rd row: 202008
4th row: 202008
5th row: 202008

Common Values

Value   Count  Frequency (%)
202008  400    100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
202008  400    100.0%

소상공인시스템로그ID
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 400
Missing (%): 100.0%
Memory size: 3.6 KiB

광역시도코드
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB
31: 400

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 31
2nd row: 31
3rd row: 31
4th row: 31
5th row: 31

Common Values

Value  Count  Frequency (%)
31     400    100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
31     400    100.0%

광역시도명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB
울산광역시: 400

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 울산광역시
2nd row: 울산광역시
3rd row: 울산광역시
4th row: 울산광역시
5th row: 울산광역시

Common Values

Value       Count  Frequency (%)
울산광역시  400    100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value       Count  Frequency (%)
울산광역시  400    100.0%

결제건수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 21
Distinct (%): 5.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.635
Minimum: 0
Maximum: 2005
Zeros: 338
Zeros (%): 84.5%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 5
Maximum: 2005
Range: 2005
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 106.5489
Coefficient of variation (CV): 12.33919
Kurtosis: 314.48943
Mean: 8.635
Median Absolute Deviation (MAD): 0
Skewness: 17.20958
Sum: 3454
Variance: 11352.668
Monotonicity: Not monotonic
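
Several of these descriptive statistics are tied together by simple identities: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. A quick consistency check using only the figures reported for 결제건수 (the variable names are illustrative):

```python
# Figures copied from the report for 결제건수.
n_observations = 400
total = 3454
std = 106.5489

mean = total / n_observations   # reported mean: 8.635
cv = std / mean                 # reported CV: 12.33919
variance = std ** 2             # reported variance: 11352.668

print(f"mean={mean:.3f}  cv={cv:.3f}  variance={variance:.1f}")
```

A CV above 12 reflects how strongly a few large values (up to 2005) dominate a column that is 84.5% zeros.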
Histogram with fixed size bins (bins=21)

Common values

Value              Count  Frequency (%)
0                  338    84.5%
1                  23     5.8%
2                  9      2.2%
5                  5      1.2%
3                  4      1.0%
6                  3      0.8%
4                  3      0.8%
8                  2      0.5%
676                1      0.2%
11                 1      0.2%
Other values (11)  11     2.8%

Minimum 10 values

Value  Count  Frequency (%)
0      338    84.5%
1      23     5.8%
2      9      2.2%
3      4      1.0%
4      3      0.8%
5      5      1.2%
6      3      0.8%
7      1      0.2%
8      2      0.5%
9      1      0.2%

Maximum 10 values

Value  Count  Frequency (%)
2005   1      0.2%
676    1      0.2%
191    1      0.2%
166    1      0.2%
95     1      0.2%
78     1      0.2%
42     1      0.2%
21     1      0.2%
17     1      0.2%
12     1      0.2%

합계금액
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 59
Distinct (%): 14.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1230670.4
Minimum: 0
Maximum: 4.56068 × 10^8
Zeros: 338
Zeros (%): 84.5%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 131625
Maximum: 4.56068 × 10^8
Range: 4.56068 × 10^8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 22811969
Coefficient of variation (CV): 18.536213
Kurtosis: 399.07482
Mean: 1230670.4
Median Absolute Deviation (MAD): 0
Skewness: 19.965859
Sum: 4.9226815 × 10^8
Variance: 5.2038592 × 10^14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
0                  338    84.5%
70000              2      0.5%
61000              2      0.5%
15000              2      0.5%
7000               2      0.5%
24000              1      0.2%
20500              1      0.2%
30000              1      0.2%
17000              1      0.2%
131500             1      0.2%
Other values (49)  49     12.2%

Minimum 10 values

Value  Count  Frequency (%)
0      338    84.5%
500    1      0.2%
2000   1      0.2%
4400   1      0.2%
5700   1      0.2%
7000   2      0.5%
15000  2      0.5%
17000  1      0.2%
18800  1      0.2%
20500  1      0.2%

Maximum 10 values

Value      Count  Frequency (%)
456068000  1      0.2%
11932000   1      0.2%
7924370    1      0.2%
4269700    1      0.2%
3785680    1      0.2%
1205000    1      0.2%
854300     1      0.2%
700000     1      0.2%
633000     1      0.2%
405000     1      0.2%

표준산업업종대분류코드
Categorical

HIGH CORRELATION 

Distinct: 15
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB
G: 135
C: 85
I: 26
R: 23
M: 20
Other values (10): 111

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: A
2nd row: A
3rd row: A
4th row: A
5th row: A

Common Values

Value             Count  Frequency (%)
G                 135    33.8%
C                 85     21.2%
I                 26     6.5%
R                 23     5.8%
M                 20     5.0%
F                 18     4.5%
P                 18     4.5%
Q                 13     3.2%
N                 12     3.0%
H                 10     2.5%
Other values (5)  40     10.0%

Length

Histogram of lengths of the category

Value             Count  Frequency (%)
g                 135    33.8%
c                 85     21.2%
i                 26     6.5%
r                 23     5.8%
m                 20     5.0%
f                 18     4.5%
p                 18     4.5%
q                 13     3.2%
n                 12     3.0%
h                 10     2.5%
Other values (5)  40     10.0%

표준산업업종상세분류코드
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 400
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50653.605
Minimum: 1122
Maximum: 95310
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: 1122
5-th percentile: 10750.9
Q1: 41905.5
median: 47412.5
Q3: 68149.5
95-th percentile: 91142
Maximum: 95310
Range: 94188
Interquartile range (IQR): 26244

Descriptive statistics

Standard deviation: 23598.027
Coefficient of variation (CV): 0.46587063
Kurtosis: -0.50733056
Mean: 50653.605
Median Absolute Deviation (MAD): 13407.5
Skewness: 0.074872976
Sum: 20261442
Variance: 5.5686686 × 10^8
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
1122                1      0.2%
56113               1      0.2%
56192               1      0.2%
56191               1      0.2%
56142               1      0.2%
56141               1      0.2%
56130               1      0.2%
56129               1      0.2%
56123               1      0.2%
56122               1      0.2%
Other values (390)  390    97.5%

Minimum 10 values

Value  Count  Frequency (%)
1122   1      0.2%
1123   1      0.2%
1151   1      0.2%
1152   1      0.2%
1299   1      0.2%
1411   1      0.2%
1420   1      0.2%
2040   1      0.2%
3112   1      0.2%
10121  1      0.2%

Maximum 10 values

Value  Count  Frequency (%)
95310  1      0.2%
95220  1      0.2%
95213  1      0.2%
95212  1      0.2%
95211  1      0.2%
95120  1      0.2%
95110  1      0.2%
94990  1      0.2%
94939  1      0.2%
94110  1      0.2%

표준산업업종상세분류명
Text

UNIQUE 

Distinct: 400
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 28
Median length: 21
Mean length: 12.1825
Min length: 3

Characters and Unicode

Total characters: 4873
Distinct characters: 317
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
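
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories for three of the sample values from this variable. It is an example of the idea, not the profiler's own code, and the code-to-label mapping covers only the categories named in this report.

```python
import unicodedata
from collections import Counter

# Three values taken from the Sample rows of 표준산업업종상세분류명.
samples = ["화훼작물 재배업", "종자 및 묘목 생산업", "콩나물 재배업"]

# Two-letter general-category codes and the labels used in this report.
LABELS = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}

# Count the general category of every character across the samples.
counts = Counter(unicodedata.category(ch) for name in samples for ch in name)

for code, n in counts.most_common():
    print(f"{LABELS.get(code, code)}: {n}")
```

Hangul syllables fall under "Other Letter" (Lo) and the ASCII space under "Space Separator" (Zs), which is why those two categories dominate the tables that follow.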

Unique

Unique: 400
Unique (%): 100.0%

Sample

1st row: 화훼작물 재배업
2nd row: 종자 및 묘목 생산업
3rd row: 콩나물 재배업
4th row: 채소화훼 및 과실작물 시설 재배업
5th row: 그 외 기타 축산업

Value               Count  Frequency (%)
                    149    10.1%
기타                96     6.5%
제조업              68     4.6%
소매업              61     4.2%
도매업              52     3.5%
                    30     2.0%
                    30     2.0%
운영업              26     1.8%
서비스업            24     1.6%
자동차              13     0.9%
Other values (567)  919    62.6%

Most occurring characters

ValueCountFrequency (%)
1068
21.9%
378
 
7.8%
163
 
3.3%
149
 
3.1%
126
 
2.6%
108
 
2.2%
98
 
2.0%
92
 
1.9%
90
 
1.8%
85
 
1.7%
Other values (307) 2516
51.6%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       3801   78.0%
Space Separator    1068   21.9%
Decimal Number     2      < 0.1%
Open Punctuation   1      < 0.1%
Close Punctuation  1      < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
378
 
9.9%
163
 
4.3%
149
 
3.9%
126
 
3.3%
108
 
2.8%
98
 
2.6%
92
 
2.4%
90
 
2.4%
85
 
2.2%
74
 
1.9%
Other values (303) 2438
64.1%
Space Separator
ValueCountFrequency (%)
1068
100.0%
Decimal Number
ValueCountFrequency (%)
1 2
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  3801   78.0%
Common  1072   22.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
378
 
9.9%
163
 
4.3%
149
 
3.9%
126
 
3.3%
108
 
2.8%
98
 
2.6%
92
 
2.4%
90
 
2.4%
85
 
2.2%
74
 
1.9%
Other values (303) 2438
64.1%
Common
ValueCountFrequency (%)
1068
99.6%
1 2
 
0.2%
( 1
 
0.1%
) 1
 
0.1%

Most occurring blocks

Value        Count  Frequency (%)
Hangul       3788   77.7%
ASCII        1072   22.0%
Compat Jamo  13     0.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1068
99.6%
1 2
 
0.2%
( 1
 
0.1%
) 1
 
0.1%
Hangul
ValueCountFrequency (%)
378
 
10.0%
163
 
4.3%
149
 
3.9%
126
 
3.3%
108
 
2.9%
98
 
2.6%
92
 
2.4%
90
 
2.4%
85
 
2.2%
74
 
2.0%
Other values (302) 2425
64.0%
Compat Jamo
ValueCountFrequency (%)
13
100.0%

소상공인시스템로그일시
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB
2020-10-21 12:28:43.0: 400

Length

Max length: 21
Median length: 21
Mean length: 21
Min length: 21

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020-10-21 12:28:43.0
2nd row: 2020-10-21 12:28:43.0
3rd row: 2020-10-21 12:28:43.0
4th row: 2020-10-21 12:28:43.0
5th row: 2020-10-21 12:28:43.0

Common Values

Value                  Count  Frequency (%)
2020-10-21 12:28:43.0  400    100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value       Count  Frequency (%)
2020-10-21  400    50.0%
12:28:43.0  400    50.0%
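
The plot table lists two entries at 50.0% each even though the column holds a single value. That pattern is what one would get if each value were split on whitespace before counting. The sketch below reproduces the numbers under that assumption; it is not a statement about how ydata-profiling tokenizes internally.

```python
from collections import Counter

# The column's single constant value, repeated for all 400 observations.
values = ["2020-10-21 12:28:43.0"] * 400

# Assumed tokenization: split each value on whitespace, then count tokens.
tokens = Counter(tok for value in values for tok in value.split())
total = sum(tokens.values())
shares = {tok: n / total for tok, n in tokens.items()}

for tok, n in tokens.items():
    print(f"{tok}  {n}  {shares[tok]:.1%}")
```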

Interactions

[Figures: nine pairwise interaction plots]

Correlations

                          결제건수  합계금액  표준산업업종대분류코드  표준산업업종상세분류코드
결제건수                  1.000  1.000  0.000  0.000
합계금액                  1.000  1.000  0.000  0.000
표준산업업종대분류코드    0.000  0.000  1.000  0.979
표준산업업종상세분류코드  0.000  0.000  0.979  1.000

                          결제건수  합계금액  표준산업업종상세분류코드  표준산업업종대분류코드
결제건수                  1.000  0.995  0.003  0.000
합계금액                  0.995  1.000  0.008  0.000
표준산업업종상세분류코드  0.003  0.008  1.000  0.846
표준산업업종대분류코드    0.000  0.000  0.846  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
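
The per-column nullity count behind these plots is easy to reproduce. A minimal sketch follows, assuming rows are held as dictionaries with `None` standing in for `<NA>`. The three rows and the `null_counts` helper are illustrative, and only a subset of the 11 columns is shown.

```python
# Three rows shaped like the report's sample; None stands in for <NA>.
rows = [
    {"소상공인시스템로그ID": None, "광역시도명": "울산광역시", "결제건수": 0},
    {"소상공인시스템로그ID": None, "광역시도명": "울산광역시", "결제건수": 0},
    {"소상공인시스템로그ID": None, "광역시도명": "울산광역시", "결제건수": 4},
]


def null_counts(records):
    """Count missing (None) entries per column."""
    counts = {}
    for record in records:
        for column, value in record.items():
            counts[column] = counts.get(column, 0) + (value is None)
    return counts


nullity = null_counts(rows)
for column, n_missing in nullity.items():
    print(f"{column}: {n_missing}/{len(rows)} missing")
```

In the full dataset the same count would give 400 of 400 missing for 소상공인시스템로그ID and 0 for every other column, in line with the alerts.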

Sample

First rows

     소상공인결제분류코드  년월  소상공인시스템로그ID  광역시도코드  광역시도명  결제건수  합계금액  표준산업업종대분류코드  표준산업업종상세분류코드  표준산업업종상세분류명  소상공인시스템로그일시
0    ZEROP31000  202008  <NA>  31  울산광역시  0  0  A  1122  화훼작물 재배업  2020-10-21 12:28:43.0
1    ZEROP31000  202008  <NA>  31  울산광역시  0  0  A  1123  종자 및 묘목 생산업  2020-10-21 12:28:43.0
2    ZEROP31000  202008  <NA>  31  울산광역시  0  0  A  1151  콩나물 재배업  2020-10-21 12:28:43.0
3    ZEROP31000  202008  <NA>  31  울산광역시  0  0  A  1152  채소화훼 및 과실작물 시설 재배업  2020-10-21 12:28:43.0
4    ZEROP31000  202008  <NA>  31  울산광역시  0  0  A  1299  그 외 기타 축산업  2020-10-21 12:28:43.0
5    ZEROP31000  202008  <NA>  31  울산광역시  0  0  A  1411  작물재배 지원 서비스업  2020-10-21 12:28:43.0
6    ZEROP31000  202008  <NA>  31  울산광역시  0  0  A  1420  축산 관련 서비스업  2020-10-21 12:28:43.0
7    ZEROP31000  202008  <NA>  31  울산광역시  0  0  A  2040  임업 관련 서비스업  2020-10-21 12:28:43.0
8    ZEROP31000  202008  <NA>  31  울산광역시  0  0  A  3112  연근해 어업  2020-10-21 12:28:43.0
9    ZEROP31000  202008  <NA>  31  울산광역시  0  0  C  10121  가금류 가공 및 저장 처리업  2020-10-21 12:28:43.0

Last rows

     소상공인결제분류코드  년월  소상공인시스템로그ID  광역시도코드  광역시도명  결제건수  합계금액  표준산업업종대분류코드  표준산업업종상세분류코드  표준산업업종상세분류명  소상공인시스템로그일시
390  ZEROP31000  202008  <NA>  31  울산광역시  4  25500  S  94110  산업 단체  2020-10-21 12:28:43.0
391  ZEROP31000  202008  <NA>  31  울산광역시  0  0  S  94939  기타 시민운동 단체  2020-10-21 12:28:43.0
392  ZEROP31000  202008  <NA>  31  울산광역시  0  0  S  94990  그 외 기타 협회 및 단체  2020-10-21 12:28:43.0
393  ZEROP31000  202008  <NA>  31  울산광역시  0  0  S  95110  컴퓨터 및 주변 기기 수리업  2020-10-21 12:28:43.0
394  ZEROP31000  202008  <NA>  31  울산광역시  0  0  S  95120  통신장비 수리업  2020-10-21 12:28:43.0
395  ZEROP31000  202008  <NA>  31  울산광역시  0  0  S  95211  자동차 종합 수리업  2020-10-21 12:28:43.0
396  ZEROP31000  202008  <NA>  31  울산광역시  0  0  S  95212  자동차 전문 수리업  2020-10-21 12:28:43.0
397  ZEROP31000  202008  <NA>  31  울산광역시  0  0  S  95213  자동차 세차업  2020-10-21 12:28:43.0
398  ZEROP31000  202008  <NA>  31  울산광역시  0  0  S  95220  모터사이클 수리업  2020-10-21 12:28:43.0
399  ZEROP31000  202008  <NA>  31  울산광역시  0  0  S  95310  가전제품 수리업  2020-10-21 12:28:43.0