Overview

Dataset statistics

Number of variables: 9
Number of observations: 24
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.9 KiB
Average record size in memory: 82.5 B

Variable types

Categorical: 4
Text: 2
Numeric: 3

Dataset

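Description: Provides the yearly status of local tax taxation and non-taxation (exemption) in Suseong-gu, Daegu Metropolitan City, broken down by tax item; intended for gauging the scale of tax benefits granted to citizens.
Author: 대구광역시 수성구
URL: https://www.data.go.kr/data/15079173/fileData.do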

Alerts

시도명 has constant value "대구광역시" (Constant)
시군구명 has constant value "수성구" (Constant)
자치단체코드 has constant value "27260" (Constant)
과세건수 is highly overall correlated with 비과세건수 and 1 other field (High correlation)
비과세건수 is highly overall correlated with 과세건수 and 1 other field (High correlation)
비과세금액 is highly overall correlated with 과세건수 and 1 other field (High correlation)
과세건수 has 5 (20.8%) zeros (Zeros)
비과세건수 has 10 (41.7%) zeros (Zeros)
비과세금액 has 10 (41.7%) zeros (Zeros)

Reproduction

Analysis started: 2024-04-21 01:50:45.493741
Analysis finished: 2024-04-21 01:50:48.182100
Duration: 2.69 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시도명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 4.2%
Missing: 0
Missing (%): 0.0%
Memory size: 324.0 B
대구광역시: 24

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 대구광역시
2nd row: 대구광역시
3rd row: 대구광역시
4th row: 대구광역시
5th row: 대구광역시

Common Values

Value | Count | Frequency (%)
대구광역시 | 24 | 100.0%

Length

Histogram of lengths of the category

시군구명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 4.2%
Missing: 0
Missing (%): 0.0%
Memory size: 324.0 B
수성구: 24

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 수성구
2nd row: 수성구
3rd row: 수성구
4th row: 수성구
5th row: 수성구

Common Values

Value | Count | Frequency (%)
수성구 | 24 | 100.0%

Length

Histogram of lengths of the category

자치단체코드
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 4.2%
Missing: 0
Missing (%): 0.0%
Memory size: 324.0 B
27260: 24

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 27260
2nd row: 27260
3rd row: 27260
4th row: 27260
5th row: 27260

Common Values

Value | Count | Frequency (%)
27260 | 24 | 100.0%

Length

Histogram of lengths of the category

과세년도
Categorical

Distinct: 2
Distinct (%): 8.3%
Missing: 0
Missing (%): 0.0%
Memory size: 324.0 B
2021: 12
2022: 12

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021
2nd row: 2021
3rd row: 2021
4th row: 2021
5th row: 2021

Common Values

Value | Count | Frequency (%)
2021 | 12 | 50.0%
2022 | 12 | 50.0%

Length

Histogram of lengths of the category


세목명
Text

Distinct: 12
Distinct (%): 50.0%
Missing: 0
Missing (%): 0.0%
Memory size: 324.0 B

Length

Max length: 7
Median length: 5
Mean length: 4.25
Min length: 3

Characters and Unicode

Total characters: 102
Distinct characters: 31
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 취득세
2nd row: 주민세
3rd row: 재산세
4th row: 자동차세
5th row: 레저세

Value | Count | Frequency (%)
취득세 | 2 | 8.3%
주민세 | 2 | 8.3%
재산세 | 2 | 8.3%
자동차세 | 2 | 8.3%
레저세 | 2 | 8.3%
담배소비세 | 2 | 8.3%
지방소비세 | 2 | 8.3%
등록면허세 | 2 | 8.3%
도시계획세 | 2 | 8.3%
지역자원시설세 | 2 | 8.3%
Other values (2) | 4 | 16.7%

Most occurring characters

Value | Count | Frequency (%)
세 | 24 | 23.5%
소 | 6 | 5.9%
지 | 6 | 5.9%
방 | 4 | 3.9%
비 | 4 | 3.9%
득 | 4 | 3.9%
자 | 4 | 3.9%
시 | 4 | 3.9%
(lost) | 2 | 2.0%
(lost) | 2 | 2.0%
Other values (21) | 42 | 41.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 102 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 102 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 102 | 100.0%

과세건수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 20
Distinct (%): 83.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 187930.79
Minimum: 0
Maximum: 910033
Zeros: 5
Zeros (%): 20.8%
Negative: 0
Negative (%): 0.0%
Memory size: 348.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 8.5
Median: 131405.5
Q3: 238288.25
95-th percentile: 822132.85
Maximum: 910033
Range: 910033
Interquartile range (IQR): 238279.75
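
The quartiles above can be checked by hand, because all 24 과세건수 values are recoverable from the value listings in this section: five zeros plus the 19 distinct non-zero values. The following is a minimal sketch using only Python's standard library. It assumes the report interpolates linearly between neighbouring order statistics, which is what `statistics.quantiles(..., method="inclusive")` does; the variable names are illustrative and not part of the report.

```python
from statistics import mean, quantiles

# The 24 과세건수 values: five zeros plus the 19 distinct non-zero values
# that appear in the minimum/maximum value listings of this section.
values = [0] * 5 + [
    7, 9, 45, 41646, 47805, 111789, 113278, 149533, 172504, 172564,
    174027, 213281, 219510, 294623, 299697, 337303, 346642, 906043, 910033,
]

# "inclusive" treats the data as the full population and interpolates
# linearly between order statistics (assumed to match the report).
q1, median, q3 = quantiles(values, n=4, method="inclusive")
iqr = q3 - q1
zero_share = values.count(0) / len(values)

print(f"Q1={q1}, median={median}, Q3={q3}, IQR={iqr}")
print(f"mean={mean(values):.2f}, zeros={zero_share:.1%}")
```

With these inputs the sketch yields Q1 = 8.5, median = 131405.5, Q3 = 238288.25 and IQR = 238279.75, matching the figures listed above.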

Descriptive statistics

Standard deviation: 250842.12
Coefficient of variation (CV): 1.334758
Kurtosis: 4.4108619
Mean: 187930.79
Median Absolute Deviation (MAD): 131397.5
Skewness: 2.0943693
Sum: 4510339
Variance: 6.2921769 × 10^10
Monotonicity: Not monotonic
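
The median absolute deviation is the least familiar figure in this list, so a short worked check may help. This sketch assumes MAD here means the unscaled median of absolute deviations from the median, and it reuses the 24 과세건수 values recoverable from the value listings in this section; the names are illustrative only.

```python
from statistics import median

# The 24 과세건수 values: five zeros plus the 19 distinct non-zero values.
values = [0] * 5 + [
    7, 9, 45, 41646, 47805, 111789, 113278, 149533, 172504, 172564,
    174027, 213281, 219510, 294623, 299697, 337303, 346642, 906043, 910033,
]

center = median(values)  # median of the raw values
# Unscaled MAD: median of the absolute distances to that median.
mad = median(abs(v - center) for v in values)
value_range = max(values) - min(values)

print(f"median={center}, MAD={mad}, range={value_range}")
```

Under that assumption the result is 131397.5, in agreement with the reported MAD. The MAD is nearly as large as the median itself because the five zeros and a few very large counts sit far from the centre.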
Histogram with fixed size bins (bins=20)

Common values

Value | Count | Frequency (%)
0 | 5 | 20.8%
47805 | 1 | 4.2%
174027 | 1 | 4.2%
906043 | 1 | 4.2%
172504 | 1 | 4.2%
299697 | 1 | 4.2%
113278 | 1 | 4.2%
9 | 1 | 4.2%
45 | 1 | 4.2%
337303 | 1 | 4.2%
Other values (10) | 10 | 41.7%

Minimum 10 values

Value | Count | Frequency (%)
0 | 5 | 20.8%
7 | 1 | 4.2%
9 | 1 | 4.2%
45 | 1 | 4.2%
41646 | 1 | 4.2%
47805 | 1 | 4.2%
111789 | 1 | 4.2%
113278 | 1 | 4.2%
149533 | 1 | 4.2%
172504 | 1 | 4.2%

Maximum 10 values

Value | Count | Frequency (%)
910033 | 1 | 4.2%
906043 | 1 | 4.2%
346642 | 1 | 4.2%
337303 | 1 | 4.2%
299697 | 1 | 4.2%
294623 | 1 | 4.2%
219510 | 1 | 4.2%
213281 | 1 | 4.2%
174027 | 1 | 4.2%
172564 | 1 | 4.2%

과세금액
Text

Distinct: 20
Distinct (%): 83.3%
Missing: 0
Missing (%): 0.0%
Memory size: 324.0 B

Length

Max length: 15
Median length: 11
Mean length: 9.625
Min length: 1

Characters and Unicode

Total characters: 231
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 19
Unique (%): 79.2%

Sample

1st row: ="308000000000"
2nd row: 7189582000
3rd row: ="128000000000"
4th row: 55280320000
5th row: 0

Value | Count | Frequency (%)
0 | 5 | 20.8%
308000000000 | 1 | 4.2%
1508000000 | 1 | 4.2%
178000000000 | 1 | 4.2%
9516000000 | 1 | 4.2%
11885000000 | 1 | 4.2%
12339000000 | 1 | 4.2%
220000000 | 1 | 4.2%
54000000000 | 1 | 4.2%
142000000000 | 1 | 4.2%
Other values (10) | 10 | 41.7%
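
과세금액 is profiled as Text rather than as a number because some cells are wrapped as `="308000000000"`, which looks like a spreadsheet text-escape for long digit strings. A minimal sketch of how such cells could be normalised to integers before analysis is shown below. The helper name `parse_amount` is hypothetical, and the sketch assumes that the wrapper is always exactly `="…"` around plain digits.

```python
import re

# Matches the spreadsheet-style wrapper ="<digits>" seen in the sample rows.
_WRAPPED = re.compile(r'^="(\d+)"$')


def parse_amount(raw: str) -> int:
    """Return the amount as an int, unwrapping ="..." if present.

    Hypothetical helper: raises ValueError for anything that is not
    plain digits once the optional wrapper has been removed.
    """
    text = raw.strip()
    match = _WRAPPED.match(text)
    if match:
        text = match.group(1)
    return int(text)


# The five sample rows shown for 과세금액.
samples = ['="308000000000"', "7189582000", '="128000000000"', "55280320000", "0"]
amounts = [parse_amount(s) for s in samples]
print(amounts)
```

Once every cell parses this way, the column could be profiled as a numeric variable alongside 과세건수, 비과세건수 and 비과세금액.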

Most occurring characters

Value | Count | Frequency (%)
0 | 128 | 55.4%
2 | 14 | 6.1%
1 | 13 | 5.6%
" | 12 | 5.2%
5 | 12 | 5.2%
3 | 10 | 4.3%
8 | 10 | 4.3%
9 | 9 | 3.9%
6 | 7 | 3.0%
= | 6 | 2.6%
Other values (2) | 10 | 4.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 213 | 92.2%
Other Punctuation | 12 | 5.2%
Math Symbol | 6 | 2.6%

Most frequent character per category

Decimal Number

Value | Count | Frequency (%)
0 | 128 | 60.1%
2 | 14 | 6.6%
1 | 13 | 6.1%
5 | 12 | 5.6%
3 | 10 | 4.7%
8 | 10 | 4.7%
9 | 9 | 4.2%
6 | 7 | 3.3%
4 | 6 | 2.8%
7 | 4 | 1.9%

Other Punctuation

Value | Count | Frequency (%)
" | 12 | 100.0%

Math Symbol

Value | Count | Frequency (%)
= | 6 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 231 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 231 | 100.0%

비과세건수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 15
Distinct (%): 62.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16359.75
Minimum: 0
Maximum: 90556
Zeros: 10
Zeros (%): 41.7%
Negative: 0
Negative (%): 0.0%
Memory size: 348.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 2162
Q3: 19117.75
95-th percentile: 78868.7
Maximum: 90556
Range: 90556
Interquartile range (IQR): 19117.75

Descriptive statistics

Standard deviation: 27064.411
Coefficient of variation (CV): 1.6543292
Kurtosis: 2.3426244
Mean: 16359.75
Median Absolute Deviation (MAD): 2162
Skewness: 1.7874622
Sum: 392634
Variance: 7.3248237 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=15)

Common values

Value | Count | Frequency (%)
0 | 10 | 41.7%
8143 | 1 | 4.2%
43706 | 1 | 4.2%
84353 | 1 | 4.2%
40957 | 1 | 4.2%
4307 | 1 | 4.2%
5381 | 1 | 4.2%
9 | 1 | 4.2%
8727 | 1 | 4.2%
47791 | 1 | 4.2%
Other values (5) | 5 | 20.8%

Minimum 10 values

Value | Count | Frequency (%)
0 | 10 | 41.7%
9 | 1 | 4.2%
17 | 1 | 4.2%
4307 | 1 | 4.2%
5321 | 1 | 4.2%
5381 | 1 | 4.2%
8143 | 1 | 4.2%
8727 | 1 | 4.2%
11838 | 1 | 4.2%
40957 | 1 | 4.2%

Maximum 10 values

Value | Count | Frequency (%)
90556 | 1 | 4.2%
84353 | 1 | 4.2%
47791 | 1 | 4.2%
43706 | 1 | 4.2%
41528 | 1 | 4.2%
40957 | 1 | 4.2%
11838 | 1 | 4.2%
8727 | 1 | 4.2%
8143 | 1 | 4.2%
5381 | 1 | 4.2%

비과세금액
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 15
Distinct (%): 62.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.9466274 × 10^9
Minimum: 0
Maximum: 4.7751012 × 10^10
Zeros: 10
Zeros (%): 41.7%
Negative: 0
Negative (%): 0.0%
Memory size: 348.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 87637000
Q3: 1.645101 × 10^9
95-th percentile: 4.3036125 × 10^10
Maximum: 4.7751012 × 10^10
Range: 4.7751012 × 10^10
Interquartile range (IQR): 1.645101 × 10^9

Descriptive statistics

Standard deviation: 1.5054649 × 10^10
Coefficient of variation (CV): 2.1671882
Kurtosis: 2.7933699
Mean: 6.9466274 × 10^9
Median Absolute Deviation (MAD): 87637000
Skewness: 2.0660746
Sum: 1.6671906 × 10^11
Variance: 2.2664246 × 10^20
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=15)

Common values

Value | Count | Frequency (%)
0 | 10 | 41.7%
29671623000 | 1 | 4.2%
2840933000 | 1 | 4.2%
44485658000 | 1 | 4.2%
1411101000 | 1 | 4.2%
213963000 | 1 | 4.2%
818965000 | 1 | 4.2%
1000 | 1 | 4.2%
34822106000 | 1 | 4.2%
2347101000 | 1 | 4.2%
Other values (5) | 5 | 20.8%

Minimum 10 values

Value | Count | Frequency (%)
0 | 10 | 41.7%
1000 | 1 | 4.2%
2000 | 1 | 4.2%
175272000 | 1 | 4.2%
213963000 | 1 | 4.2%
818965000 | 1 | 4.2%
870029000 | 1 | 4.2%
1311292000 | 1 | 4.2%
1411101000 | 1 | 4.2%
2347101000 | 1 | 4.2%

Maximum 10 values

Value | Count | Frequency (%)
47751012000 | 1 | 4.2%
44485658000 | 1 | 4.2%
34822106000 | 1 | 4.2%
29671623000 | 1 | 4.2%
2840933000 | 1 | 4.2%
2347101000 | 1 | 4.2%
1411101000 | 1 | 4.2%
1311292000 | 1 | 4.2%
870029000 | 1 | 4.2%
818965000 | 1 | 4.2%

Interactions

[Figures: pairwise interaction plots of the numeric variables]

Correlations

Variable | 과세년도 | 세목명 | 과세건수 | 과세금액 | 비과세건수 | 비과세금액
과세년도 | 1.000 | 0.000 | 0.000 | 0.253 | 0.000 | 0.000
세목명 | 0.000 | 1.000 | 1.000 | 0.882 | 0.840 | 0.900
과세건수 | 0.000 | 1.000 | 1.000 | 1.000 | 0.874 | 0.571
과세금액 | 0.253 | 0.882 | 1.000 | 1.000 | 1.000 | 1.000
비과세건수 | 0.000 | 0.840 | 0.874 | 1.000 | 1.000 | 0.533
비과세금액 | 0.000 | 0.900 | 0.571 | 1.000 | 0.533 | 1.000

Variable | 과세건수 | 비과세건수 | 비과세금액 | 과세년도
과세건수 | 1.000 | 0.636 | 0.580 | 0.000
비과세건수 | 0.636 | 1.000 | 0.953 | 0.000
비과세금액 | 0.580 | 0.953 | 1.000 | 0.000
과세년도 | 0.000 | 0.000 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

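First rows

Index | 시도명 | 시군구명 | 자치단체코드 | 과세년도 | 세목명 | 과세건수 | 과세금액 | 비과세건수 | 비과세금액
0 | 대구광역시 | 수성구 | 27260 | 2021 | 취득세 | 47805 | ="308000000000" | 8143 | 29671623000
1 | 대구광역시 | 수성구 | 27260 | 2021 | 주민세 | 172564 | 7189582000 | 43706 | 2840933000
2 | 대구광역시 | 수성구 | 27260 | 2021 | 재산세 | 213281 | ="128000000000" | 84353 | 44485658000
3 | 대구광역시 | 수성구 | 27260 | 2021 | 자동차세 | 346642 | 55280320000 | 40957 | 1411101000
4 | 대구광역시 | 수성구 | 27260 | 2021 | 레저세 | 0 | 0 | 0 | 0
5 | 대구광역시 | 수성구 | 27260 | 2021 | 담배소비세 | 0 | 0 | 0 | 0
6 | 대구광역시 | 수성구 | 27260 | 2021 | 지방소비세 | 7 | 4681799000 | 0 | 0
7 | 대구광역시 | 수성구 | 27260 | 2021 | 등록면허세 | 111789 | 13553233000 | 4307 | 213963000
8 | 대구광역시 | 수성구 | 27260 | 2021 | 도시계획세 | 0 | 0 | 0 | 0
9 | 대구광역시 | 수성구 | 27260 | 2021 | 지역자원시설세 | 294623 | 9259621000 | 5381 | 818965000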
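
Last rows

Index | 시도명 | 시군구명 | 자치단체코드 | 과세년도 | 세목명 | 과세건수 | 과세금액 | 비과세건수 | 비과세금액
14 | 대구광역시 | 수성구 | 27260 | 2022 | 재산세 | 219510 | ="142000000000" | 90556 | 47751012000
15 | 대구광역시 | 수성구 | 27260 | 2022 | 자동차세 | 337303 | 54000000000 | 41528 | 1311292000
16 | 대구광역시 | 수성구 | 27260 | 2022 | 레저세 | 45 | 220000000 | 0 | 0
17 | 대구광역시 | 수성구 | 27260 | 2022 | 담배소비세 | 0 | 0 | 0 | 0
18 | 대구광역시 | 수성구 | 27260 | 2022 | 지방소비세 | 9 | 12339000000 | 0 | 0
19 | 대구광역시 | 수성구 | 27260 | 2022 | 등록면허세 | 113278 | 11885000000 | 11838 | 175272000
20 | 대구광역시 | 수성구 | 27260 | 2022 | 도시계획세 | 0 | 0 | 0 | 0
21 | 대구광역시 | 수성구 | 27260 | 2022 | 지역자원시설세 | 299697 | 9516000000 | 5321 | 870029000
22 | 대구광역시 | 수성구 | 27260 | 2022 | 지방소득세 | 172504 | ="178000000000" | 0 | 0
23 | 대구광역시 | 수성구 | 27260 | 2022 | 교육세 | 906043 | 50627000000 | 17 | 2000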