Overview

Dataset statistics

Number of variables: 20
Number of observations: 30
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.0 KiB
Average record size in memory: 170.4 B

Variable types

DateTime: 3
Categorical: 10
Text: 5
Numeric: 2

Dataset

Description: Sample data
Author: 경기신용보증재단
URL: https://bigdata-region.kr/#/dataset/266e8af1-d30d-4af0-9c56-cdba6e6c4528

Alerts

기준년월 has constant value "2020-10-01" (Constant)
시도명 has constant value "경기도" (Constant)
자금종류중분류명 has constant value "소상공인지원자금" (Constant)
성별코드 is highly overall correlated with 이전과세년도 (High correlation)
업종대분류명 is highly overall correlated with 이전과세년도 (High correlation)
사업소득과세금액 is highly overall correlated with 신청금액 and 2 other fields (High correlation)
근로소득과세금액 is highly overall correlated with 이전과세년도 and 1 other field (High correlation)
자금종류대분류명 is highly overall correlated with 신청금액 and 1 other field (High correlation)
연령대코드 is highly overall correlated with 이전과세년도 (High correlation)
과세유형명 is highly overall correlated with 이전과세년도 (High correlation)
이전과세년도 is highly overall correlated with 부가가치과세금액 and 8 other fields (High correlation)
부가가치과세금액 is highly overall correlated with 이전과세년도 (High correlation)
신청금액 is highly overall correlated with 이전과세년도 and 2 other fields (High correlation)
사업소득과세금액 is highly imbalanced (78.9%) (Imbalance)
근로소득과세금액 is highly imbalanced (64.1%) (Imbalance)
자금종류대분류명 is highly imbalanced (53.1%) (Imbalance)
사업자등록번호 has unique values (Unique)
개업일자 has unique values (Unique)
기업번호 has unique values (Unique)
부가가치과세금액 has 18 (60.0%) zeros (Zeros)
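
The imbalance percentages in the alerts above appear consistent with a normalized-entropy score, 1 - H / log2(k), where H is the Shannon entropy of the category frequencies and k is the number of distinct values. That formula is an assumption inferred from the reported numbers, not something the report states, and the helper name `imbalance_score` is hypothetical. A minimal sketch using the value counts reported for the three flagged variables:

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of category counts (assumed formula)."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single category has no entropy to normalize
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Value counts copied from the Common Values tables in this report.
reported_counts = {
    "사업소득과세금액": [29, 1],
    "근로소득과세금액": [26, 1, 1, 1, 1],
    "자금종류대분류명": [27, 3],
}

scores = {name: imbalance_score(c) for name, c in reported_counts.items()}
for name, score in scores.items():
    print(f"{name}: {score:.1%}")
```

With only 30 rows, a single differing value is enough to move the score well above 50%, so these alerts say more about the sample size than about the underlying population.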

Reproduction

Analysis started: 2023-12-10 14:01:45.138756
Analysis finished: 2023-12-10 14:01:47.750938
Duration: 2.61 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

기준년월
Date

CONSTANT 

Distinct1
Distinct (%)3.3%
Missing0
Missing (%)0.0%
Memory size372.0 B
Minimum2020-10-01 00:00:00
Maximum2020-10-01 00:00:00
Histogram with fixed size bins (bins=1)

성별코드
Categorical

HIGH CORRELATION 

Distinct2
Distinct (%)6.7%
Missing0
Missing (%)0.0%
Memory size372.0 B

Length

Max length1
Median length1
Mean length1
Min length1

Unique

Unique0
Unique (%)0.0%

Sample

1st rowF
2nd rowF
3rd rowM
4th rowF
5th rowF

Common Values

Value | Count | Frequency (%)
F | 15 | 50.0%
M | 15 | 50.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
f 15
50.0%
m 15
50.0%

연령대코드
Categorical

HIGH CORRELATION 

Distinct4
Distinct (%)13.3%
Missing0
Missing (%)0.0%
Memory size372.0 B

Length

Max length2
Median length2
Mean length2
Min length2

Unique

Unique1
Unique (%)3.3%

Sample

1st row30
2nd row40
3rd row40
4th row40
5th row40

Common Values

Value | Count | Frequency (%)
40 | 18 | 60.0%
30 | 6 | 20.0%
50 | 5 | 16.7%
20 | 1 | 3.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
40 18
60.0%
30 6
 
20.0%
50 5
 
16.7%
20 1
 
3.3%

사업자등록번호
Text

UNIQUE

Distinct30
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Memory size372.0 B

Length

Max length10
Median length10
Mean length10
Min length10

Characters and Unicode

Total characters300
Distinct characters11
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
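
As a rough illustration of how such character properties can be read, the sketch below uses Python's standard `unicodedata` module to tally Unicode general categories for the five sample values listed in this section. Whether the report computes its counts the same way is an assumption, and the helper name `category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Tally the Unicode general category of every character in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# The five masked sample values shown for this variable.
samples = ["28419*****", "19129*****", "39626*****", "67325*****", "44915*****"]

counts = category_counts(samples)
# "Nd" is Decimal Number and "Po" is Other Punctuation in the Unicode standard.
print(counts)
```

Half of every value is the masking character `*`, which is why the report shows an even split between Decimal Number and Other Punctuation for this column.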

Unique

Unique30
Unique (%)100.0%

Sample

1st row28419*****
2nd row19129*****
3rd row39626*****
4th row67325*****
5th row44915*****
ValueCountFrequency (%)
28419 1
 
3.3%
19129 1
 
3.3%
14104 1
 
3.3%
73616 1
 
3.3%
22710 1
 
3.3%
14303 1
 
3.3%
12494 1
 
3.3%
27968 1
 
3.3%
79613 1
 
3.3%
18422 1
 
3.3%
Other values (20) 20
66.7%

Most occurring characters

ValueCountFrequency (%)
* 150
50.0%
1 31
 
10.3%
2 21
 
7.0%
6 18
 
6.0%
4 16
 
5.3%
3 14
 
4.7%
9 13
 
4.3%
8 12
 
4.0%
0 12
 
4.0%
5 8
 
2.7%

Most occurring categories

ValueCountFrequency (%)
Other Punctuation 150
50.0%
Decimal Number 150
50.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 31
20.7%
2 21
14.0%
6 18
12.0%
4 16
10.7%
3 14
9.3%
9 13
8.7%
8 12
 
8.0%
0 12
 
8.0%
5 8
 
5.3%
7 5
 
3.3%
Other Punctuation
ValueCountFrequency (%)
* 150
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 300
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
* 150
50.0%
1 31
 
10.3%
2 21
 
7.0%
6 18
 
6.0%
4 16
 
5.3%
3 14
 
4.7%
9 13
 
4.3%
8 12
 
4.0%
0 12
 
4.0%
5 8
 
2.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 300
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
* 150
50.0%
1 31
 
10.3%
2 21
 
7.0%
6 18
 
6.0%
4 16
 
5.3%
3 14
 
4.7%
9 13
 
4.3%
8 12
 
4.0%
0 12
 
4.0%
5 8
 
2.7%

시도명
Categorical

CONSTANT 

Distinct1
Distinct (%)3.3%
Missing0
Missing (%)0.0%
Memory size372.0 B

Length

Max length3
Median length3
Mean length3
Min length3

Unique

Unique0
Unique (%)0.0%

Sample

1st row경기도
2nd row경기도
3rd row경기도
4th row경기도
5th row경기도

Common Values

Value | Count | Frequency (%)
경기도 | 30 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
경기도 30
100.0%

신청년월
Date

Distinct2
Distinct (%)6.7%
Missing0
Missing (%)0.0%
Memory size372.0 B
Minimum2020-07-01 00:00:00
Maximum2020-08-01 00:00:00
Histogram with fixed size bins (bins=2)

개업일자
Date

UNIQUE 

Distinct30
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Memory size372.0 B
Minimum1997-01-25 00:00:00
Maximum2020-03-18 00:00:00
Histogram with fixed size bins (bins=30)

과세유형명
Categorical

HIGH CORRELATION 

Distinct3
Distinct (%)10.0%
Missing0
Missing (%)0.0%
Memory size372.0 B

Length

Max length13
Median length7
Mean length7.8
Min length7

Unique

Unique0
Unique (%)0.0%

Sample

1st row 일반과세자
2nd row 일반과세자
3rd row 일반과세자
4th row 간이과세자
5th row 일반과세자

Common Values

Value | Count | Frequency (%)
일반과세자 | 23 | 76.7%
부가가치세 면세사업자 | 4 | 13.3%
간이과세자 | 3 | 10.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
일반과세자 23
67.6%
부가가치세 4
 
11.8%
면세사업자 4
 
11.8%
간이과세자 3
 
8.8%

이전과세년도
Categorical

HIGH CORRELATION 

Distinct2
Distinct (%)6.7%
Missing0
Missing (%)0.0%
Memory size372.0 B

Length

Max length4
Median length4
Mean length4
Min length4

Unique

Unique0
Unique (%)0.0%

Sample

1st row2019
2nd row2019
3rd row2019
4th row<NA>
5th row2019

Common Values

Value | Count | Frequency (%)
2019 | 23 | 76.7%
<NA> | 7 | 23.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
2019 23
76.7%
na 7
 
23.3%
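
The overview reports 0 missing cells even though 이전과세년도 lists `<NA>` 7 times with a length of 4 characters, which suggests the marker was loaded as a literal string rather than as a missing value. That reading is an assumption based on the counts in this report. A minimal sketch of recounting with the marker treated as missing (the helper name `count_missing` and the `NA_MARKERS` set are hypothetical):

```python
NA_MARKERS = {"<NA>", ""}  # assumed placeholder strings; adjust to the real data


def count_missing(values, markers=NA_MARKERS):
    """Count entries that are None or equal to a placeholder string."""
    return sum(1 for v in values if v is None or v in markers)


# Reconstructed from the Common Values table: 23 x "2019" and 7 x "<NA>".
prev_tax_year = ["2019"] * 23 + ["<NA>"] * 7

missing = count_missing(prev_tax_year)
missing_pct = 100 * missing / len(prev_tax_year)
print(missing, f"{missing_pct:.1f}%")
```

If the placeholder is converted to a real missing value before profiling, the column would likely be reported as a constant with missing entries rather than as a two-level category, which would also change its correlation alerts.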

사업소득과세금액
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)6.7%
Missing0
Missing (%)0.0%
Memory size372.0 B

Length

Max length6
Median length1
Mean length1.1666667
Min length1

Unique

Unique1
Unique (%)3.3%

Sample

1st row378520
2nd row0
3rd row0
4th row0
5th row0

Common Values

Value | Count | Frequency (%)
0 | 29 | 96.7%
378520 | 1 | 3.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 29
96.7%
378520 1
 
3.3%

근로소득과세금액
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct5
Distinct (%)16.7%
Missing0
Missing (%)0.0%
Memory size372.0 B

Length

Max length5
Median length1
Mean length1.4666667
Min length1

Unique

Unique4
Unique (%)13.3%

Sample

1st row17390
2nd row30370
3rd row77800
4th row0
5th row0

Common Values

Value | Count | Frequency (%)
0 | 26 | 86.7%
17390 | 1 | 3.3%
30370 | 1 | 3.3%
77800 | 1 | 3.3%
340 | 1 | 3.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 26
86.7%
17390 1
 
3.3%
30370 1
 
3.3%
77800 1
 
3.3%
340 1
 
3.3%

부가가치과세금액
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct13
Distinct (%)43.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean1718628.7
Minimum0
Maximum19852810
Zeros18
Zeros (%)60.0%
Negative0
Negative (%)0.0%
Memory size402.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 978477.5
95-th percentile: 7557414.5
Maximum: 19852810
Range: 19852810
Interquartile range (IQR): 978477.5

Descriptive statistics

Standard deviation: 4111109.6
Coefficient of variation (CV): 2.3920872
Kurtosis: 13.220571
Mean: 1718628.7
Median Absolute Deviation (MAD): 0
Skewness: 3.4037816
Sum: 51558860
Variance: 1.6901222 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)
Value | Count | Frequency (%)
0 | 18 | 60.0%
6234910 | 1 | 3.3%
769250 | 1 | 3.3%
102630 | 1 | 3.3%
2320330 | 1 | 3.3%
6294950 | 1 | 3.3%
118820 | 1 | 3.3%
4849230 | 1 | 3.3%
74370 | 1 | 3.3%
8590340 | 1 | 3.3%
Other values (3) | 3 | 10.0%

Value | Count | Frequency (%)
0 | 18 | 60.0%
74370 | 1 | 3.3%
102630 | 1 | 3.3%
118820 | 1 | 3.3%
769250 | 1 | 3.3%
1048220 | 1 | 3.3%
1303000 | 1 | 3.3%
2320330 | 1 | 3.3%
4849230 | 1 | 3.3%
6234910 | 1 | 3.3%

Value | Count | Frequency (%)
19852810 | 1 | 3.3%
8590340 | 1 | 3.3%
6294950 | 1 | 3.3%
6234910 | 1 | 3.3%
4849230 | 1 | 3.3%
2320330 | 1 | 3.3%
1303000 | 1 | 3.3%
1048220 | 1 | 3.3%
769250 | 1 | 3.3%
118820 | 1 | 3.3%

신청금액
Real number (ℝ)

HIGH CORRELATION 

Distinct8
Distinct (%)26.7%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean31100000
Minimum10000000
Maximum50000000
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size402.0 B

Quantile statistics

Minimum: 10000000
5-th percentile: 10000000
Q1: 20000000
median: 20000000
Q3: 50000000
95-th percentile: 50000000
Maximum: 50000000
Range: 40000000
Interquartile range (IQR): 30000000

Descriptive statistics

Standard deviation: 15737831
Coefficient of variation (CV): 0.50603957
Kurtosis: -1.7910394
Mean: 31100000
Median Absolute Deviation (MAD): 10000000
Skewness: 0.18154895
Sum: 9.33 × 10^8
Variance: 2.4767931 × 10^14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
Value | Count | Frequency (%)
50000000 | 10 | 33.3%
20000000 | 10 | 33.3%
10000000 | 3 | 10.0%
40000000 | 2 | 6.7%
15000000 | 2 | 6.7%
18000000 | 1 | 3.3%
45000000 | 1 | 3.3%
30000000 | 1 | 3.3%

Value | Count | Frequency (%)
10000000 | 3 | 10.0%
15000000 | 2 | 6.7%
18000000 | 1 | 3.3%
20000000 | 10 | 33.3%
30000000 | 1 | 3.3%
40000000 | 2 | 6.7%
45000000 | 1 | 3.3%
50000000 | 10 | 33.3%

Value | Count | Frequency (%)
50000000 | 10 | 33.3%
45000000 | 1 | 3.3%
40000000 | 2 | 6.7%
30000000 | 1 | 3.3%
20000000 | 10 | 33.3%
18000000 | 1 | 3.3%
15000000 | 2 | 6.7%
10000000 | 3 | 10.0%
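
Because the value counts for 신청금액 cover all 30 rows, the summary statistics reported for this variable can be recomputed from them. The sketch below uses only Python's standard `statistics` module. It assumes the report uses the sample standard deviation (n - 1 denominator) and linearly interpolated quartiles, which is what the reported figures are consistent with.

```python
import statistics

# Value counts from the 신청금액 tables: amount -> number of rows.
value_counts = {
    50_000_000: 10, 20_000_000: 10, 10_000_000: 3, 40_000_000: 2,
    15_000_000: 2, 18_000_000: 1, 45_000_000: 1, 30_000_000: 1,
}

# Expand the counts back into a sorted list of 30 observations.
data = sorted(v for v, n in value_counts.items() for _ in range(n))

total = sum(data)
mean = statistics.mean(data)
std = statistics.stdev(data)  # sample standard deviation (n - 1)
cv = std / mean               # coefficient of variation
# "inclusive" interpolates linearly between the observed minimum and maximum.
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")

print(len(data), total, mean)
print(round(std), round(cv, 4))
print(q1, median, q3, q3 - q1)
```

The recomputed values can be compared line by line with the Quantile statistics and Descriptive statistics listed for this variable.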

자금종류대분류명
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct2
Distinct (%)6.7%
Missing0
Missing (%)0.0%
Memory size372.0 B

Length

Max length5
Median length5
Mean length4.9
Min length4

Unique

Unique0
Unique (%)0.0%

Sample

1st row일반자금
2nd row경기도자금
3rd row경기도자금
4th row경기도자금
5th row경기도자금

Common Values

Value | Count | Frequency (%)
경기도자금 | 27 | 90.0%
일반자금 | 3 | 10.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
경기도자금 27
90.0%
일반자금 3
 
10.0%

자금종류중분류명
Categorical

CONSTANT 

Distinct1
Distinct (%)3.3%
Missing0
Missing (%)0.0%
Memory size372.0 B

Length

Max length8
Median length8
Mean length8
Min length8

Unique

Unique0
Unique (%)0.0%

Sample

1st row소상공인지원자금
2nd row소상공인지원자금
3rd row소상공인지원자금
4th row소상공인지원자금
5th row소상공인지원자금

Common Values

Value | Count | Frequency (%)
소상공인지원자금 | 30 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
소상공인지원자금 30
100.0%

업종대분류명
Categorical

HIGH CORRELATION 

Distinct11
Distinct (%)36.7%
Missing0
Missing (%)0.0%
Memory size372.0 B

Length

Max length25
Median length23
Mean length13.8
Min length11

Unique

Unique6
Unique (%)20.0%

Sample

1st rowI숙박및음식점업(55~56)
2nd rowG도매및소매업(45~47)
3rd rowG도매및소매업(45~47)
4th rowI숙박및음식점업(55~56)
5th rowR예술;스포츠및여가관련서비스업(90~91)

Common Values

Value | Count | Frequency (%)
G도매및소매업(45~47) | 7 | 23.3%
I숙박및음식점업(55~56) | 6 | 20.0%
P교육서비스업(85) | 4 | 13.3%
H운수업(49~52) | 4 | 13.3%
C제조업(10~34) | 3 | 10.0%
R예술;스포츠및여가관련서비스업(90~91) | 1 | 3.3%
F건설업(41~42) | 1 | 3.3%
S협회및단체;수리및기타개인서비스업(94~96) | 1 | 3.3%
J정보통신업(58~63) | 1 | 3.3%
M전문;과학및기술서비스업(70~73) | 1 | 3.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
g도매및소매업(45~47 7
23.3%
i숙박및음식점업(55~56 6
20.0%
p교육서비스업(85 4
13.3%
h운수업(49~52 4
13.3%
c제조업(10~34 3
10.0%
r예술;스포츠및여가관련서비스업(90~91 1
 
3.3%
f건설업(41~42 1
 
3.3%
s협회및단체;수리및기타개인서비스업(94~96 1
 
3.3%
j정보통신업(58~63 1
 
3.3%
m전문;과학및기술서비스업(70~73 1
 
3.3%

업종중분류명
Text

Distinct20
Distinct (%)66.7%
Missing0
Missing (%)0.0%
Memory size372.0 B

Length

Max length19
Median length16
Mean length8.1333333
Min length4

Characters and Unicode

Total characters244
Distinct characters93
Distinct categories2
Distinct scripts2
Distinct blocks3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique16
Unique (%)53.3%

Sample

1st row음식점업
2nd row무점포소매업
3rd row섬유;의복;신발및가죽제품소매업
4th row주점및비알콜음료점업
5th row스포츠서비스업
ValueCountFrequency (%)
음식점업 5
16.7%
도로화물운송업 4
 
13.3%
기타교육기관 3
 
10.0%
종합소매업 2
 
6.7%
음·식료품및담배소매업 1
 
3.3%
상품종합도매업 1
 
3.3%
부동산관련서비스업 1
 
3.3%
신발및신발부분품제조업 1
 
3.3%
전문디자인업 1
 
3.3%
컴퓨터프로그래밍;시스템통합및관리업 1
 
3.3%
Other values (10) 10
33.3%

Most occurring characters

ValueCountFrequency (%)
26
 
10.7%
9
 
3.7%
8
 
3.3%
8
 
3.3%
7
 
2.9%
7
 
2.9%
6
 
2.5%
6
 
2.5%
6
 
2.5%
6
 
2.5%
Other values (83) 155
63.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 238
97.5%
Other Punctuation 6
 
2.5%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
26
 
10.9%
9
 
3.8%
8
 
3.4%
8
 
3.4%
7
 
2.9%
7
 
2.9%
6
 
2.5%
6
 
2.5%
6
 
2.5%
6
 
2.5%
Other values (81) 149
62.6%
Other Punctuation
ValueCountFrequency (%)
; 5
83.3%
· 1
 
16.7%

Most occurring scripts

ValueCountFrequency (%)
Hangul 238
97.5%
Common 6
 
2.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
26
 
10.9%
9
 
3.8%
8
 
3.4%
8
 
3.4%
7
 
2.9%
7
 
2.9%
6
 
2.5%
6
 
2.5%
6
 
2.5%
6
 
2.5%
Other values (81) 149
62.6%
Common
ValueCountFrequency (%)
; 5
83.3%
· 1
 
16.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 238
97.5%
ASCII 5
 
2.0%
None 1
 
0.4%

Most frequent character per block

Hangul
ValueCountFrequency (%)
26
 
10.9%
9
 
3.8%
8
 
3.4%
8
 
3.4%
7
 
2.9%
7
 
2.9%
6
 
2.5%
6
 
2.5%
6
 
2.5%
6
 
2.5%
Other values (81) 149
62.6%
ASCII
ValueCountFrequency (%)
; 5
100.0%
None
ValueCountFrequency (%)
· 1
100.0%

업종소분류명
Text

Distinct26
Distinct (%)86.7%
Missing0
Missing (%)0.0%
Memory size372.0 B

Length

Max length17
Median length13
Mean length8.9
Min length4

Characters and Unicode

Total characters267
Distinct characters102
Distinct categories2
Distinct scripts2
Distinct blocks3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique23
Unique (%)76.7%

Sample

1st row한식일반음식점업
2nd row전자상거래소매중개업
3rd row남자용겉옷소매업
4th row커피전문점
5th row그외기타스포츠시설운영업
ValueCountFrequency (%)
한식일반음식점업 3
 
10.0%
용달화물자동차운송업 2
 
6.7%
김밥및기타간이음식점업 2
 
6.7%
도배;실내장식및내장목공사업 1
 
3.3%
상품종합도매업 1
 
3.3%
기타직물제품제조업 1
 
3.3%
체인화편의점 1
 
3.3%
부동산중개및대리업 1
 
3.3%
신발부분품제조업 1
 
3.3%
외국어학원 1
 
3.3%
Other values (16) 16
53.3%

Most occurring characters

ValueCountFrequency (%)
24
 
9.0%
11
 
4.1%
8
 
3.0%
7
 
2.6%
7
 
2.6%
7
 
2.6%
6
 
2.2%
6
 
2.2%
6
 
2.2%
5
 
1.9%
Other values (92) 180
67.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 263
98.5%
Other Punctuation 4
 
1.5%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
24
 
9.1%
11
 
4.2%
8
 
3.0%
7
 
2.7%
7
 
2.7%
7
 
2.7%
6
 
2.3%
6
 
2.3%
6
 
2.3%
5
 
1.9%
Other values (90) 176
66.9%
Other Punctuation
ValueCountFrequency (%)
; 3
75.0%
· 1
 
25.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 263
98.5%
Common 4
 
1.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
24
 
9.1%
11
 
4.2%
8
 
3.0%
7
 
2.7%
7
 
2.7%
7
 
2.7%
6
 
2.3%
6
 
2.3%
6
 
2.3%
5
 
1.9%
Other values (90) 176
66.9%
Common
ValueCountFrequency (%)
; 3
75.0%
· 1
 
25.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 263
98.5%
ASCII 3
 
1.1%
None 1
 
0.4%

Most frequent character per block

Hangul
ValueCountFrequency (%)
24
 
9.1%
11
 
4.2%
8
 
3.0%
7
 
2.7%
7
 
2.7%
7
 
2.7%
6
 
2.3%
6
 
2.3%
6
 
2.3%
5
 
1.9%
Other values (90) 176
66.9%
ASCII
ValueCountFrequency (%)
; 3
100.0%
None
ValueCountFrequency (%)
· 1
100.0%

업종코드
Text

Distinct26
Distinct (%)86.7%
Missing0
Missing (%)0.0%
Memory size372.0 B

Length

Max length6
Median length6
Mean length6
Min length6

Characters and Unicode

Total characters180
Distinct characters21
Distinct categories2
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique23
Unique (%)76.7%

Sample

1st rowI56111
2nd rowG47911
3rd rowG47411
4th rowI56221
5th rowR91139
ValueCountFrequency (%)
i56111 3
 
10.0%
h49302 2
 
6.7%
i56194 2
 
6.7%
f42412 1
 
3.3%
g46800 1
 
3.3%
c13229 1
 
3.3%
g47122 1
 
3.3%
l68221 1
 
3.3%
c15220 1
 
3.3%
p85631 1
 
3.3%
Other values (16) 16
53.3%

Most occurring characters

ValueCountFrequency (%)
1 37
20.6%
2 22
12.2%
4 16
8.9%
9 15
8.3%
6 14
 
7.8%
5 13
 
7.2%
0 11
 
6.1%
3 10
 
5.6%
G 7
 
3.9%
7 6
 
3.3%
Other values (11) 29
16.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 150
83.3%
Uppercase Letter 30
 
16.7%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
G 7
23.3%
I 6
20.0%
P 4
13.3%
H 4
13.3%
C 3
10.0%
R 1
 
3.3%
F 1
 
3.3%
S 1
 
3.3%
J 1
 
3.3%
M 1
 
3.3%
Decimal Number
ValueCountFrequency (%)
1 37
24.7%
2 22
14.7%
4 16
10.7%
9 15
10.0%
6 14
 
9.3%
5 13
 
8.7%
0 11
 
7.3%
3 10
 
6.7%
7 6
 
4.0%
8 6
 
4.0%

Most occurring scripts

ValueCountFrequency (%)
Common 150
83.3%
Latin 30
 
16.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
G 7
23.3%
I 6
20.0%
P 4
13.3%
H 4
13.3%
C 3
10.0%
R 1
 
3.3%
F 1
 
3.3%
S 1
 
3.3%
J 1
 
3.3%
M 1
 
3.3%
Common
ValueCountFrequency (%)
1 37
24.7%
2 22
14.7%
4 16
10.7%
9 15
10.0%
6 14
 
9.3%
5 13
 
8.7%
0 11
 
7.3%
3 10
 
6.7%
7 6
 
4.0%
8 6
 
4.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 180
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 37
20.6%
2 22
12.2%
4 16
8.9%
9 15
8.3%
6 14
 
7.8%
5 13
 
7.2%
0 11
 
6.1%
3 10
 
5.6%
G 7
 
3.9%
7 6
 
3.3%
Other values (11) 29
16.1%

기업번호
Text

UNIQUE 

Distinct30
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Memory size372.0 B

Length

Max length24
Median length24
Mean length24
Min length24

Characters and Unicode

Total characters720
Distinct characters65
Distinct categories5
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique30
Unique (%)100.0%

Sample

1st row73tl2sMW1fjxTXbiobt4rA==
2nd rowXbESSl9sSWGkN7nuFGU2bw==
3rd rowbutx0PewcZAl3VyA4nMmIQ==
4th row3/1vcHkdj5lNw/C57y/3ow==
5th rowgQt0A+NjDwW6QFMvqR7KLA==
ValueCountFrequency (%)
73tl2smw1fjxtxbiobt4ra 1
 
3.3%
xbessl9sswgkn7nufgu2bw 1
 
3.3%
ofwhslix4exl4e38la16ug 1
 
3.3%
dyeecll0me+ou/jwzy23rw 1
 
3.3%
9dcllgiwh+qtfhqnqqtdug 1
 
3.3%
8di1gzlkxizztg0gmpztsq 1
 
3.3%
5gcyvzwgjrbt19rv1sgqbg 1
 
3.3%
4rwweqiord3szta7xmy0eg 1
 
3.3%
4oaptv1jxkcz2xmac8nvdq 1
 
3.3%
1m6yzoldusjvibnf6sijkw 1
 
3.3%
Other values (20) 20
66.7%

Most occurring characters

ValueCountFrequency (%)
= 60
 
8.3%
g 20
 
2.8%
1 18
 
2.5%
w 16
 
2.2%
A 16
 
2.2%
Q 16
 
2.2%
3 15
 
2.1%
c 15
 
2.1%
l 15
 
2.1%
6 14
 
1.9%
Other values (55) 515
71.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 284
39.4%
Uppercase Letter 245
34.0%
Decimal Number 112
 
15.6%
Math Symbol 68
 
9.4%
Other Punctuation 11
 
1.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
g 20
 
7.0%
w 16
 
5.6%
c 15
 
5.3%
l 15
 
5.3%
i 14
 
4.9%
b 14
 
4.9%
o 13
 
4.6%
n 13
 
4.6%
z 12
 
4.2%
e 12
 
4.2%
Other values (16) 140
49.3%
Uppercase Letter
ValueCountFrequency (%)
A 16
 
6.5%
Q 16
 
6.5%
V 13
 
5.3%
X 12
 
4.9%
F 12
 
4.9%
L 12
 
4.9%
D 11
 
4.5%
N 11
 
4.5%
T 11
 
4.5%
M 10
 
4.1%
Other values (16) 121
49.4%
Decimal Number
ValueCountFrequency (%)
1 18
16.1%
3 15
13.4%
6 14
12.5%
2 14
12.5%
4 10
8.9%
0 10
8.9%
7 9
8.0%
9 9
8.0%
8 7
 
6.2%
5 6
 
5.4%
Math Symbol
ValueCountFrequency (%)
= 60
88.2%
+ 8
 
11.8%
Other Punctuation
ValueCountFrequency (%)
/ 11
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 529
73.5%
Common 191
 
26.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
g 20
 
3.8%
w 16
 
3.0%
A 16
 
3.0%
Q 16
 
3.0%
c 15
 
2.8%
l 15
 
2.8%
i 14
 
2.6%
b 14
 
2.6%
o 13
 
2.5%
V 13
 
2.5%
Other values (42) 377
71.3%
Common
ValueCountFrequency (%)
= 60
31.4%
1 18
 
9.4%
3 15
 
7.9%
6 14
 
7.3%
2 14
 
7.3%
/ 11
 
5.8%
4 10
 
5.2%
0 10
 
5.2%
7 9
 
4.7%
9 9
 
4.7%
Other values (3) 21
 
11.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 720
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
= 60
 
8.3%
g 20
 
2.8%
1 18
 
2.5%
w 16
 
2.2%
A 16
 
2.2%
Q 16
 
2.2%
3 15
 
2.1%
c 15
 
2.1%
l 15
 
2.1%
6 14
 
1.9%
Other values (55) 515
71.5%

Interactions

[Four interaction plots; images not included]

Correlations

Variable | 성별코드 | 연령대코드 | 사업자등록번호 | 신청년월 | 개업일자 | 과세유형명 | 사업소득과세금액 | 근로소득과세금액 | 부가가치과세금액 | 신청금액 | 자금종류대분류명 | 업종대분류명 | 업종중분류명 | 업종소분류명 | 업종코드 | 기업번호
성별코드 | 1.000 | 0.000 | 1.000 | 0.000 | 1.000 | 0.164 | 0.000 | 0.000 | 0.300 | 0.000 | 0.000 | 0.534 | 0.665 | 0.905 | 0.905 | 1.000
연령대코드 | 0.000 | 1.000 | 1.000 | 0.081 | 1.000 | 0.000 | 0.308 | 0.000 | 0.000 | 0.000 | 0.514 | 0.375 | 0.856 | 0.726 | 0.726 | 1.000
사업자등록번호 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
신청년월 | 0.000 | 0.081 | 1.000 | 1.000 | 1.000 | 0.000 | 0.131 | 0.691 | 0.000 | 0.673 | 0.475 | 0.000 | 0.749 | 0.000 | 0.000 | 1.000
개업일자 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
과세유형명 | 0.164 | 0.000 | 1.000 | 0.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.745 | 0.943 | 1.000 | 1.000 | 1.000
사업소득과세금액 | 0.000 | 0.308 | 1.000 | 0.131 | 1.000 | 0.000 | 1.000 | 1.000 | 0.000 | 0.798 | 0.264 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000
근로소득과세금액 | 0.000 | 0.000 | 1.000 | 0.691 | 1.000 | 0.000 | 1.000 | 1.000 | 0.638 | 0.000 | 0.377 | 0.000 | 0.569 | 0.000 | 0.000 | 1.000
부가가치과세금액 | 0.300 | 0.000 | 1.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.638 | 1.000 | 0.000 | 0.000 | 0.595 | 0.958 | 1.000 | 1.000 | 1.000
신청금액 | 0.000 | 0.000 | 1.000 | 0.673 | 1.000 | 0.000 | 0.798 | 0.000 | 0.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000
자금종류대분류명 | 0.000 | 0.514 | 1.000 | 0.475 | 1.000 | 0.000 | 0.264 | 0.377 | 0.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000
업종대분류명 | 0.534 | 0.375 | 1.000 | 0.000 | 1.000 | 0.745 | 0.000 | 0.000 | 0.595 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
업종중분류명 | 0.665 | 0.856 | 1.000 | 0.749 | 1.000 | 0.943 | 0.000 | 0.569 | 0.958 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
업종소분류명 | 0.905 | 0.726 | 1.000 | 0.000 | 1.000 | 1.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
업종코드 | 0.905 | 0.726 | 1.000 | 0.000 | 1.000 | 1.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
기업번호 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Variable | 성별코드 | 업종대분류명 | 사업소득과세금액 | 근로소득과세금액 | 자금종류대분류명 | 연령대코드 | 과세유형명 | 이전과세년도
성별코드 | 1.000 | 0.413 | 0.000 | 0.000 | 0.000 | 0.000 | 0.261 | 1.000
업종대분류명 | 0.413 | 1.000 | 0.000 | 0.000 | 0.000 | 0.168 | 0.496 | 1.000
사업소득과세금액 | 0.000 | 0.000 | 1.000 | 0.945 | 0.167 | 0.189 | 0.000 | 1.000
근로소득과세금액 | 0.000 | 0.000 | 0.945 | 1.000 | 0.430 | 0.000 | 0.000 | 1.000
자금종류대분류명 | 0.000 | 0.000 | 0.167 | 0.430 | 1.000 | 0.332 | 0.000 | 1.000
연령대코드 | 0.000 | 0.168 | 0.189 | 0.000 | 0.332 | 1.000 | 0.000 | 1.000
과세유형명 | 0.261 | 0.496 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000
이전과세년도 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Variable | 부가가치과세금액 | 신청금액 | 성별코드 | 연령대코드 | 과세유형명 | 이전과세년도 | 사업소득과세금액 | 근로소득과세금액 | 자금종류대분류명 | 업종대분류명
부가가치과세금액 | 1.000 | 0.135 | 0.185 | 0.000 | 0.000 | 1.000 | 0.000 | 0.483 | 0.000 | 0.298
신청금액 | 0.135 | 1.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.535 | 0.000 | 0.906 | 0.000
성별코드 | 0.185 | 0.000 | 1.000 | 0.000 | 0.261 | 1.000 | 0.000 | 0.000 | 0.000 | 0.413
연령대코드 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 1.000 | 0.189 | 0.000 | 0.332 | 0.168
과세유형명 | 0.000 | 0.000 | 0.261 | 0.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.496
이전과세년도 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
사업소득과세금액 | 0.000 | 0.535 | 0.000 | 0.189 | 0.000 | 1.000 | 1.000 | 0.945 | 0.167 | 0.000
근로소득과세금액 | 0.483 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.945 | 1.000 | 0.430 | 0.000
자금종류대분류명 | 0.000 | 0.906 | 0.000 | 0.332 | 0.000 | 1.000 | 0.167 | 0.430 | 1.000 | 0.000
업종대분류명 | 0.298 | 0.000 | 0.413 | 0.168 | 0.496 | 1.000 | 0.000 | 0.000 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

# | 기준년월 | 성별코드 | 연령대코드 | 사업자등록번호 | 시도명 | 신청년월 | 개업일자 | 과세유형명 | 이전과세년도 | 사업소득과세금액 | 근로소득과세금액 | 부가가치과세금액 | 신청금액 | 자금종류대분류명 | 자금종류중분류명 | 업종대분류명 | 업종중분류명 | 업종소분류명 | 업종코드 | 기업번호
0 | 2020-10 | F | 30 | 28419***** | 경기도 | 2020-07 | 2019-03-27 | 일반과세자 | 2019 | 378520 | 17390 | 0 | 40000000 | 일반자금 | 소상공인지원자금 | I숙박및음식점업(55~56) | 음식점업 | 한식일반음식점업 | I56111 | 73tl2sMW1fjxTXbiobt4rA==
1 | 2020-10 | F | 40 | 19129***** | 경기도 | 2020-07 | 2016-03-14 | 일반과세자 | 2019 | 0 | 30370 | 0 | 50000000 | 경기도자금 | 소상공인지원자금 | G도매및소매업(45~47) | 무점포소매업 | 전자상거래소매중개업 | G47911 | XbESSl9sSWGkN7nuFGU2bw==
2 | 2020-10 | M | 40 | 39626***** | 경기도 | 2020-07 | 2018-09-21 | 일반과세자 | 2019 | 0 | 77800 | 6234910 | 50000000 | 경기도자금 | 소상공인지원자금 | G도매및소매업(45~47) | 섬유;의복;신발및가죽제품소매업 | 남자용겉옷소매업 | G47411 | butx0PewcZAl3VyA4nMmIQ==
3 | 2020-10 | F | 40 | 67325***** | 경기도 | 2020-08 | 2016-01-25 | 간이과세자 | <NA> | 0 | 0 | 0 | 10000000 | 경기도자금 | 소상공인지원자금 | I숙박및음식점업(55~56) | 주점및비알콜음료점업 | 커피전문점 | I56221 | 3/1vcHkdj5lNw/C57y/3ow==
4 | 2020-10 | F | 40 | 44915***** | 경기도 | 2020-08 | 2017-11-16 | 일반과세자 | 2019 | 0 | 0 | 769250 | 10000000 | 경기도자금 | 소상공인지원자금 | R예술;스포츠및여가관련서비스업(90~91) | 스포츠서비스업 | 그외기타스포츠시설운영업 | R91139 | gQt0A+NjDwW6QFMvqR7KLA==
5 | 2020-10 | M | 50 | 21594***** | 경기도 | 2020-08 | 2016-02-11 | 부가가치세 면세사업자 | <NA> | 0 | 0 | 0 | 10000000 | 경기도자금 | 소상공인지원자금 | P교육서비스업(85) | 일반교습학원 | 일반교과학원 | P85501 | sy3zXhFXgn4+dZnWLpggiQ==
6 | 2020-10 | M | 40 | 63116***** | 경기도 | 2020-08 | 2019-12-30 | 일반과세자 | 2019 | 0 | 0 | 0 | 15000000 | 경기도자금 | 소상공인지원자금 | H운수업(49~52) | 도로화물운송업 | 개별화물자동차운송업 | H49303 | nLfc2891UI6d+7s1cwzL3Q==
7 | 2020-10 | F | 40 | 12495***** | 경기도 | 2020-08 | 2014-11-04 | 부가가치세 면세사업자 | 2019 | 0 | 0 | 0 | 15000000 | 경기도자금 | 소상공인지원자금 | P교육서비스업(85) | 기타교육기관 | 음악학원 | P85621 | xnlhV907/kM/AxLreqKaoA==
8 | 2020-10 | M | 40 | 12839***** | 경기도 | 2020-08 | 2013-12-02 | 일반과세자 | 2019 | 0 | 0 | 102630 | 18000000 | 경기도자금 | 소상공인지원자금 | H운수업(49~52) | 도로화물운송업 | 용달화물자동차운송업 | H49302 | on/VbPMr22Vfeogi94kJzA==
9 | 2020-10 | M | 40 | 11025***** | 경기도 | 2020-08 | 2020-02-04 | 일반과세자 | <NA> | 0 | 0 | 0 | 20000000 | 경기도자금 | 소상공인지원자금 | H운수업(49~52) | 도로화물운송업 | 일반화물자동차운송업 | H49301 | BX321bAalUBx5LhZPTeYEg==

# | 기준년월 | 성별코드 | 연령대코드 | 사업자등록번호 | 시도명 | 신청년월 | 개업일자 | 과세유형명 | 이전과세년도 | 사업소득과세금액 | 근로소득과세금액 | 부가가치과세금액 | 신청금액 | 자금종류대분류명 | 자금종류중분류명 | 업종대분류명 | 업종중분류명 | 업종소분류명 | 업종코드 | 기업번호
20 | 2020-10 | M | 40 | 32024***** | 경기도 | 2020-08 | 2015-11-17 | 일반과세자 | 2019 | 0 | 0 | 74370 | 45000000 | 경기도자금 | 소상공인지원자금 | I숙박및음식점업(55~56) | 음식점업 | 김밥및기타간이음식점업 | I56194 | NI/G2EIDSSuhpc9O1RI1TQ==
21 | 2020-10 | M | 50 | 18422***** | 경기도 | 2020-08 | 2016-04-01 | 일반과세자 | 2019 | 0 | 0 | 8590340 | 50000000 | 경기도자금 | 소상공인지원자금 | J정보통신업(58~63) | 컴퓨터프로그래밍;시스템통합및관리업 | 컴퓨터시스템통합자문및구축서비스업 | J62021 | 1M6YZoLdusjVIBnf6sijkw==
22 | 2020-10 | M | 40 | 79613***** | 경기도 | 2020-08 | 2016-09-09 | 일반과세자 | <NA> | 0 | 0 | 0 | 50000000 | 경기도자금 | 소상공인지원자금 | M전문;과학및기술서비스업(70~73) | 전문디자인업 | 시각디자인업 | M73203 | 4oApTv1JxkCz2xMAc8nVDQ==
23 | 2020-10 | F | 30 | 27968***** | 경기도 | 2020-08 | 2020-03-18 | 일반과세자 | 2019 | 0 | 0 | 0 | 50000000 | 경기도자금 | 소상공인지원자금 | I숙박및음식점업(55~56) | 음식점업 | 한식일반음식점업 | I56111 | 4rWWEqiorD3szTA7XmY0eg==
24 | 2020-10 | M | 40 | 12494***** | 경기도 | 2020-08 | 2012-03-12 | 부가가치세 면세사업자 | 2019 | 0 | 0 | 0 | 50000000 | 경기도자금 | 소상공인지원자금 | P교육서비스업(85) | 기타교육기관 | 외국어학원 | P85631 | 5GcYvZWgjRbt19RV1SGQbg==
25 | 2020-10 | F | 20 | 14303***** | 경기도 | 2020-08 | 2020-01-01 | 일반과세자 | 2019 | 0 | 0 | 0 | 50000000 | 경기도자금 | 소상공인지원자금 | C제조업(10~34) | 신발및신발부분품제조업 | 신발부분품제조업 | C15220 | 8di1gzLKXIzztg0GMpztsQ==
26 | 2020-10 | F | 50 | 22710***** | 경기도 | 2020-08 | 2019-04-01 | 일반과세자 | 2019 | 0 | 0 | 1048220 | 50000000 | 경기도자금 | 소상공인지원자금 | L부동산업및임대업(68) | 부동산관련서비스업 | 부동산중개및대리업 | L68221 | 9DclLGiWH+qTFHqNqqTDug==
27 | 2020-10 | M | 40 | 73616***** | 경기도 | 2020-08 | 2016-01-23 | 일반과세자 | 2019 | 0 | 340 | 19852810 | 50000000 | 경기도자금 | 소상공인지원자금 | G도매및소매업(45~47) | 종합소매업 | 체인화편의점 | G47122 | DyEEclL0me+oU/JwZy23Rw==
28 | 2020-10 | M | 30 | 14104***** | 경기도 | 2020-07 | 2014-01-02 | 일반과세자 | 2019 | 0 | 0 | 1303000 | 30000000 | 일반자금 | 소상공인지원자금 | C제조업(10~34) | 직물직조및직물제품제조업 | 기타직물제품제조업 | C13229 | ofwhSlIx4eXl4e38lA16ug==
29 | 2020-10 | M | 30 | 46592***** | 경기도 | 2020-08 | 2019-06-21 | 부가가치세 면세사업자 | 2019 | 0 | 0 | 0 | 50000000 | 경기도자금 | 소상공인지원자금 | P교육서비스업(85) | 기타교육기관 | 컴퓨터학원 | P85691 | EHjopkYT6cUja3Jgv7VN6Q==