Overview

Dataset statistics

Number of variables: 8
Number of observations: 93
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.3 KiB
Average record size in memory: 69.4 B

Variable types

Categorical: 5
Text: 1
Numeric: 2

Dataset

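Description: This dataset reports local tax levies broken down by the type of taxable object that forms the tax base. It provides the tax year, tax item name, tax source type name, number of levies, and levied amount.
URL: https://www.data.go.kr/data/15079771/fileData.do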

Alerts

시도명 has constant value "경상북도" (Constant)
시군구명 has constant value "청도군" (Constant)
자치단체코드 has constant value "47820" (Constant)
부과건수 is highly overall correlated with 부과금액 and 1 other field (High correlation)
부과금액 is highly overall correlated with 부과건수 (High correlation)
세목명 is highly overall correlated with 부과건수 (High correlation)
부과건수 has 22 (23.7%) zeros (Zeros)
부과금액 has 22 (23.7%) zeros (Zeros)
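
The Constant and Zeros alerts can be checked from the summary counts alone. The following is a minimal sketch, not the profiler's own code: the helper name `share` and the dictionary of distinct counts are illustrative, and the numbers are copied from this report.

```python
def share(count: int, total: int) -> float:
    """Return count / total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


N_OBSERVATIONS = 93  # "Number of observations" in the overview

# Zeros alert: 부과건수 and 부과금액 each contain 22 zero entries.
zeros_pct = share(22, N_OBSERVATIONS)

# Constant alert: a column is constant when it has exactly one distinct value.
# Distinct counts as listed for the categorical variables in this report.
distinct_counts = {
    "시도명": 1,
    "시군구명": 1,
    "자치단체코드": 1,
    "과세년도": 2,
    "세목명": 13,
}
constant_columns = [name for name, n in distinct_counts.items() if n == 1]

print(zeros_pct)
print(constant_columns)
```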

Reproduction

Analysis started: 2023-12-12 21:27:48.710980
Analysis finished: 2023-12-12 21:27:49.629885
Duration: 0.92 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
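
A report of this kind can be regenerated along the following lines. This is a hedged sketch rather than the configuration actually used: the function name `build_report`, the file names, the report title, and the `cp949` encoding (common for Korean public-data CSV files) are all assumptions. The third-party imports are placed inside the function so the module can be loaded even where `pandas` and `ydata-profiling` are not installed.

```python
from pathlib import Path


def build_report(csv_path, output_html="report.html", encoding="cp949"):
    """Profile a CSV file and write the HTML report (illustrative helper).

    csv_path, output_html and encoding are assumptions; adjust them to the
    file downloaded from the dataset URL.
    """
    # Lazy imports: both packages are third-party dependencies.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    profile = ProfileReport(df, title="Local tax levy profile")
    profile.to_file(output_html)
    return Path(output_html)
```

Calling `build_report("levies.csv")` with a hypothetical local copy of the data would write `report.html` next to the script.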

Variables

시도명
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 876.0 B
경상북도: 93

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 경상북도
2nd row: 경상북도
3rd row: 경상북도
4th row: 경상북도
5th row: 경상북도

Common Values

Value | Count | Frequency (%)
경상북도 | 93 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Plot of the value counts listed under Common Values

시군구명
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 876.0 B
청도군: 93

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 청도군
2nd row: 청도군
3rd row: 청도군
4th row: 청도군
5th row: 청도군

Common Values

Value | Count | Frequency (%)
청도군 | 93 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Plot of the value counts listed under Common Values

자치단체코드
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 876.0 B
47820: 93

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 47820
2nd row: 47820
3rd row: 47820
4th row: 47820
5th row: 47820

Common Values

Value | Count | Frequency (%)
47820 | 93 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Plot of the value counts listed under Common Values

과세년도
Categorical

Distinct: 2
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 876.0 B
2020: 47
2021: 46

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020
2nd row: 2020
3rd row: 2020
4th row: 2020
5th row: 2020

Common Values

Value | Count | Frequency (%)
2020 | 47 | 50.5%
2021 | 46 | 49.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Plot of the value counts listed under Common Values

세목명
Categorical

HIGH CORRELATION

Distinct: 13
Distinct (%): 14.0%
Missing: 0
Missing (%): 0.0%
Memory size: 876.0 B
취득세: 18
주민세: 16
자동차세: 14
재산세: 10
레저세: 8
Other values (8): 27

Length

Max length: 7
Median length: 3
Mean length: 3.7311828
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 담배소비세
2nd row: 교육세
3rd row: 도시계획세
4th row: 취득세
5th row: 취득세

Common Values

Value | Count | Frequency (%)
취득세 | 18 | 19.4%
주민세 | 16 | 17.2%
자동차세 | 14 | 15.1%
재산세 | 10 | 10.8%
레저세 | 8 | 8.6%
지방소득세 | 8 | 8.6%
지역자원시설세 | 5 | 5.4%
등록면허세 | 4 | 4.3%
담배소비세 | 2 | 2.2%
교육세 | 2 | 2.2%
Other values (3) | 6 | 6.5%
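
The frequency column for 세목명 is each count divided by the 93 observations. A short sketch that recomputes it from the counts listed in this report (the variable names are illustrative):

```python
# Counts of 세목명 as listed under Common Values.
counts = {
    "취득세": 18,
    "주민세": 16,
    "자동차세": 14,
    "재산세": 10,
    "레저세": 8,
    "지방소득세": 8,
    "지역자원시설세": 5,
    "등록면허세": 4,
    "담배소비세": 2,
    "교육세": 2,
    "Other values (3)": 6,
}

total = sum(counts.values())  # should equal the number of observations

# Frequency in percent, rounded to one decimal place as in the report.
frequency = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, pct in frequency.items():
    print(f"{name} | {counts[name]} | {pct}%")
```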

Length

Histogram of lengths of the category

세원 유형명
Text

Distinct: 50
Distinct (%): 53.8%
Missing: 0
Missing (%): 0.0%
Memory size: 876.0 B

Length

Max length: 11
Median length: 8
Mean length: 6.0322581
Min length: 2

Characters and Unicode

Total characters: 561
Distinct characters: 74
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
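
As an illustration of those character properties, the general category of each character can be read with Python's standard `unicodedata` module. This is a minimal sketch over the five sample values of this variable, not the profiler's implementation. The two-letter codes map to the category names used in this report: `Lo` is Other Letter, `Ps` is Open Punctuation, `Pe` is Close Punctuation, and `Nd` is Decimal Number.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in values."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample rows shown for this variable.
sample = ["담배소비세", "교육세", "도시계획세", "건축물", "주택(개별)"]
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```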

Unique

Unique: 7
Unique (%): 7.5%

Sample

1st row: 담배소비세
2nd row: 교육세
3rd row: 도시계획세
4th row: 건축물
5th row: 주택(개별)
Value | Count | Frequency (%)
담배소비세 | 2 | 2.2%
주택(단독 | 2 | 2.2%
주민세(종합소득 | 2 | 2.2%
승합 | 2 | 2.2%
교육세 | 2 | 2.2%
기타승용 | 2 | 2.2%
승용 | 2 | 2.2%
주민세(종업원분 | 2 | 2.2%
주민세(특별징수 | 2 | 2.2%
체납 | 2 | 2.2%
Other values (40) | 73 | 78.5%

Most occurring characters

Value | Count | Frequency (%)
[?] | 55 | 9.8%
) | 49 | 8.7%
( | 49 | 8.7%
[?] | 27 | 4.8%
[?] | 24 | 4.3%
[?] | 19 | 3.4%
[?] | 18 | 3.2%
[?] | 16 | 2.9%
[?] | 12 | 2.1%
[?] | 11 | 2.0%
Other values (64) | 281 | 50.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 461 | 82.2%
Close Punctuation | 49 | 8.7%
Open Punctuation | 49 | 8.7%
Decimal Number | 2 | 0.4%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[?] | 55 | 11.9%
[?] | 27 | 5.9%
[?] | 24 | 5.2%
[?] | 19 | 4.1%
[?] | 18 | 3.9%
[?] | 16 | 3.5%
[?] | 12 | 2.6%
[?] | 11 | 2.4%
[?] | 11 | 2.4%
[?] | 10 | 2.2%
Other values (61) | 258 | 56.0%

Close Punctuation
Value | Count | Frequency (%)
) | 49 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 49 | 100.0%

Decimal Number
Value | Count | Frequency (%)
3 | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 461 | 82.2%
Common | 100 | 17.8%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[?] | 55 | 11.9%
[?] | 27 | 5.9%
[?] | 24 | 5.2%
[?] | 19 | 4.1%
[?] | 18 | 3.9%
[?] | 16 | 3.5%
[?] | 12 | 2.6%
[?] | 11 | 2.4%
[?] | 11 | 2.4%
[?] | 10 | 2.2%
Other values (61) | 258 | 56.0%

Common
Value | Count | Frequency (%)
) | 49 | 49.0%
( | 49 | 49.0%
3 | 2 | 2.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 461 | 82.2%
ASCII | 100 | 17.8%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[?] | 55 | 11.9%
[?] | 27 | 5.9%
[?] | 24 | 5.2%
[?] | 19 | 4.1%
[?] | 18 | 3.9%
[?] | 16 | 3.5%
[?] | 12 | 2.6%
[?] | 11 | 2.4%
[?] | 11 | 2.4%
[?] | 10 | 2.2%
Other values (61) | 258 | 56.0%

ASCII
Value | Count | Frequency (%)
) | 49 | 49.0%
( | 49 | 49.0%
3 | 2 | 2.0%

부과건수
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 66
Distinct (%): 71.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8011.6774
Minimum: 0
Maximum: 128380
Zeros: 22
Zeros (%): 23.7%
Negative: 0
Negative (%): 0.0%
Memory size: 969.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 156
Q3: 4455
95-th percentile: 51763.6
Maximum: 128380
Range: 128380
Interquartile range (IQR): 4454

Descriptive statistics

Standard deviation: 22069.43
Coefficient of variation (CV): 2.7546579
Kurtosis: 17.892593
Mean: 8011.6774
Median Absolute Deviation (MAD): 156
Skewness: 4.0756896
Sum: 745086
Variance: 4.8705975 × 10^8
Monotonicity: Not monotonic
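
Several of the descriptive statistics for 부과건수 are tied together by simple identities: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. A small sketch that checks the reported figures against each other (inputs copied from this report, variable names illustrative):

```python
import math

n_observations = 93
total = 745086   # Sum
std = 22069.43   # Standard deviation (already rounded in the report)

mean = total / n_observations  # reported as 8011.6774
cv = std / mean                # reported as 2.7546579
variance = std ** 2            # reported as 4.8705975 × 10^8

# Tolerances allow for the rounding of the printed standard deviation.
print(round(mean, 4))
print(math.isclose(cv, 2.7546579, abs_tol=1e-6))
print(math.isclose(variance, 4.8705975e8, rel_tol=1e-6))
```

The mean (8011.6774) is far above the median (156), which is consistent with the strong positive skewness of 4.08 reported for this variable.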
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
0 | 22 | 23.7%
1 | 3 | 3.2%
12 | 2 | 2.2%
7 | 2 | 2.2%
94 | 2 | 2.2%
4 | 2 | 2.2%
273 | 1 | 1.1%
9 | 1 | 1.1%
7083 | 1 | 1.1%
14 | 1 | 1.1%
Other values (56) | 56 | 60.2%

Minimum 10 values

Value | Count | Frequency (%)
0 | 22 | 23.7%
1 | 3 | 3.2%
3 | 1 | 1.1%
4 | 2 | 2.2%
6 | 1 | 1.1%
7 | 2 | 2.2%
9 | 1 | 1.1%
12 | 2 | 2.2%
13 | 1 | 1.1%
14 | 1 | 1.1%

Maximum 10 values

Value | Count | Frequency (%)
128380 | 1 | 1.1%
124095 | 1 | 1.1%
70753 | 1 | 1.1%
69503 | 1 | 1.1%
52018 | 1 | 1.1%
51594 | 1 | 1.1%
27142 | 1 | 1.1%
26888 | 1 | 1.1%
22232 | 1 | 1.1%
21639 | 1 | 1.1%

부과금액
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 72
Distinct (%): 77.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.2227198 × 10^9
Minimum: 0
Maximum: 9.795309 × 10^9
Zeros: 22
Zeros (%): 23.7%
Negative: 0
Negative (%): 0.0%
Memory size: 969.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 81000
median: 1.24349 × 10^8
Q3: 1.765624 × 10^9
95-th percentile: 5.1375414 × 10^9
Maximum: 9.795309 × 10^9
Range: 9.795309 × 10^9
Interquartile range (IQR): 1.765543 × 10^9

Descriptive statistics

Standard deviation: 2.0540716 × 10^9
Coefficient of variation (CV): 1.6799201
Kurtosis: 5.4053628
Mean: 1.2227198 × 10^9
Median Absolute Deviation (MAD): 1.24349 × 10^8
Skewness: 2.2900641
Sum: 1.1371294 × 10^11
Variance: 4.21921 × 10^18
Monotonicity: Not monotonic
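
The quantile-based figures for 부과금액 can be cross-checked in the same spirit: the range is the maximum minus the minimum, and the interquartile range is Q3 minus Q1. The sketch below uses the values printed in this report, written as Python floats in scientific notation; the variable names are illustrative.

```python
minimum = 0
q1 = 81_000
q3 = 1.765624e9       # Q3, printed as 1.765624 × 10^9
maximum = 9.795309e9  # Maximum, printed as 9.795309 × 10^9

value_range = maximum - minimum  # reported Range: 9.795309 × 10^9
iqr = q3 - q1                    # reported IQR: 1.765543 × 10^9

print(f"{value_range:.6e}")
print(f"{iqr:.6e}")
```

Because Q1 is only 81,000 while Q3 exceeds 1.7 billion, the IQR is almost equal to Q3 itself.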
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
0 | 22 | 23.7%
2945508000 | 1 | 1.1%
2206829000 | 1 | 1.1%
1942000 | 1 | 1.1%
3754149000 | 1 | 1.1%
906283000 | 1 | 1.1%
364000 | 1 | 1.1%
3891717000 | 1 | 1.1%
878866000 | 1 | 1.1%
80628000 | 1 | 1.1%
Other values (62) | 62 | 66.7%

Minimum 10 values

Value | Count | Frequency (%)
0 | 22 | 23.7%
10000 | 1 | 1.1%
81000 | 1 | 1.1%
364000 | 1 | 1.1%
438000 | 1 | 1.1%
545000 | 1 | 1.1%
1379000 | 1 | 1.1%
1942000 | 1 | 1.1%
2346000 | 1 | 1.1%
4859000 | 1 | 1.1%

Maximum 10 values

Value | Count | Frequency (%)
9795309000 | 1 | 1.1%
8338600000 | 1 | 1.1%
8336033000 | 1 | 1.1%
7423944000 | 1 | 1.1%
5494218000 | 1 | 1.1%
4899757000 | 1 | 1.1%
4611901000 | 1 | 1.1%
4255563000 | 1 | 1.1%
3891717000 | 1 | 1.1%
3754149000 | 1 | 1.1%

Interactions


Correlations

Variable | 과세년도 | 세목명 | 세원 유형명 | 부과건수 | 부과금액
과세년도 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000
세목명 | 0.000 | 1.000 | 1.000 | 0.868 | 0.774
세원 유형명 | 0.000 | 1.000 | 1.000 | 0.997 | 0.852
부과건수 | 0.000 | 0.868 | 0.997 | 1.000 | 0.736
부과금액 | 0.000 | 0.774 | 0.852 | 0.736 | 1.000

Variable | 과세년도 | 세목명
과세년도 | 1.000 | 0.000
세목명 | 0.000 | 1.000

Variable | 부과건수 | 부과금액 | 과세년도 | 세목명
부과건수 | 1.000 | 0.784 | 0.000 | 0.640
부과금액 | 0.784 | 1.000 | 0.000 | 0.462
과세년도 | 0.000 | 0.000 | 1.000 | 0.000
세목명 | 0.640 | 0.462 | 0.000 | 1.000
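
The High correlation alerts near the top of the report are consistent with applying a cutoff to the last matrix above (부과건수, 부과금액, 과세년도, 세목명). The sketch below assumes a cutoff of 0.5; that value is an assumption chosen because it reproduces the three alerts, not a setting stated in this report, and the function name is illustrative.

```python
columns = ["부과건수", "부과금액", "과세년도", "세목명"]
matrix = [
    [1.000, 0.784, 0.000, 0.640],
    [0.784, 1.000, 0.000, 0.462],
    [0.000, 0.000, 1.000, 0.000],
    [0.640, 0.462, 0.000, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, not taken from the report


def high_correlations(columns, matrix, threshold):
    """Map each column to the other columns it correlates with at or above threshold."""
    flagged = {}
    for i, name in enumerate(columns):
        partners = [
            columns[j]
            for j, value in enumerate(matrix[i])
            if j != i and value >= threshold
        ]
        if partners:
            flagged[name] = partners
    return flagged


flagged = high_correlations(columns, matrix, THRESHOLD)
for name, partners in flagged.items():
    print(name, "->", ", ".join(partners))
```

Under this assumption 부과건수 is flagged with two partners while 부과금액 and 세목명 are each flagged only with 부과건수, because 0.462 falls below the cutoff.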

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

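Row | 시도명 | 시군구명 | 자치단체코드 | 과세년도 | 세목명 | 세원 유형명 | 부과건수부과금액
0 | 경상북도 | 청도군 | 47820 | 2020 | 담배소비세 | 담배소비세 | 2732945508000
1 | 경상북도 | 청도군 | 47820 | 2020 | 교육세 | 교육세 | 1240954611901000
2 | 경상북도 | 청도군 | 47820 | 2020 | 도시계획세 | 도시계획세 | 00
3 | 경상북도 | 청도군 | 47820 | 2020 | 취득세 | 건축물 | 9811232086000
4 | 경상북도 | 청도군 | 47820 | 2020 | 취득세 | 주택(개별) | 13051816040000
5 | 경상북도 | 청도군 | 47820 | 2020 | 취득세 | 주택(단독) | 220270740000
6 | 경상북도 | 청도군 | 47820 | 2020 | 취득세 | 기타 | 74863000
7 | 경상북도 | 청도군 | 47820 | 2020 | 취득세 | 항공기 | 00
8 | 경상북도 | 청도군 | 47820 | 2020 | 취득세 | 기계장비 | 9449997000
9 | 경상북도 | 청도군 | 47820 | 2020 | 취득세 | 차량 | 35302906804000

Row | 시도명 | 시군구명 | 자치단체코드 | 과세년도 | 세목명 | 세원 유형명 | 부과건수부과금액
83 | 경상북도 | 청도군 | 47820 | 2021 | 지방소득세 | 지방소득세(특별징수) | 62961813183000
84 | 경상북도 | 청도군 | 47820 | 2021 | 지방소득세 | 지방소득세(법인소득) | 7311197824000
85 | 경상북도 | 청도군 | 47820 | 2021 | 지방소득세 | 지방소득세(양도소득) | 11172376737000
86 | 경상북도 | 청도군 | 47820 | 2021 | 지방소득세 | 지방소득세(종합소득) | 3938579673000
87 | 경상북도 | 청도군 | 47820 | 2021 | 등록면허세 | 등록면허세(면허) | 8960124349000
88 | 경상북도 | 청도군 | 47820 | 2021 | 등록면허세 | 등록면허세(등록) | 134241103966000
89 | 경상북도 | 청도군 | 47820 | 2021 | 지역자원시설세 | 지역자원시설세(소방) | 13856534643000
90 | 경상북도 | 청도군 | 47820 | 2021 | 지역자원시설세 | 지역자원시설세(시설) | 00
91 | 경상북도 | 청도군 | 47820 | 2021 | 지역자원시설세 | 지역자원시설세(특자) | 9423451000
92 | 경상북도 | 청도군 | 47820 | 2021 | 체납 | 체납 | 515942725712000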