Overview

Dataset statistics

Number of variables: 8
Number of observations: 93
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.3 KiB
Average record size in memory: 69.4 B
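
The average record size follows from the total size and the row count. A minimal arithmetic check, assuming the reported 6.3 KiB is itself a rounded figure (so the result is only approximate):

```python
# Figures copied from the Dataset statistics block.
total_size_kib = 6.3   # "Total size in memory", already rounded in the report
n_observations = 93    # "Number of observations"

# 1 KiB = 1024 bytes; dividing by the row count gives bytes per record.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{avg_record_bytes:.1f} B")  # prints "69.4 B"
```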

Variable types

Categorical: 5
Text: 1
Numeric: 2

Dataset

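Description: Provides the status of local tax levies broken down by type of taxable object (the tax source). Intended as baseline data for reviewing whether the tax burden is equitable across property types and for identifying the targets of regulatory policy in real estate and related fields.
Author: Sasang-gu, Busan Metropolitan City (부산광역시 사상구)
URL: https://www.data.go.kr/data/15079674/fileData.do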

Alerts

시도명 has constant value "부산광역시" (Constant)
시군구명 has constant value "사상구" (Constant)
자치단체코드 has constant value "26530" (Constant)
부과건수 is highly overall correlated with 부과금액 and 1 other field (High correlation)
부과금액 is highly overall correlated with 부과건수 (High correlation)
세목명 is highly overall correlated with 부과건수 (High correlation)
부과건수 has 27 (29.0%) zeros (Zeros)
부과금액 has 27 (29.0%) zeros (Zeros)
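
The Constant and Zeros alerts can be reproduced with a few lines of plain Python. The sketch below is a hypothetical helper, not the profiler's own implementation. The non-zero values in `counts` are placeholders; only the number of zeros (27 of 93) is taken from the report.

```python
from typing import List, Sequence


def column_alerts(values: Sequence) -> List[str]:
    """Return report-style tags for one column (hypothetical helper)."""
    alerts = []
    n = len(values)
    # A column with exactly one distinct value is constant.
    if n and len(set(values)) == 1:
        alerts.append("Constant")
    # Count exact zeros among numeric entries only.
    zeros = sum(1 for v in values if isinstance(v, (int, float)) and v == 0)
    if zeros:
        alerts.append(f"Zeros: {zeros} ({zeros / n:.1%})")
    return alerts


sido = ["부산광역시"] * 93                # constant column, as in 시도명
counts = [0] * 27 + list(range(1, 67))  # 27 zeros plus 66 placeholder values

print(column_alerts(sido))    # ['Constant']
print(column_alerts(counts))  # ['Zeros: 27 (29.0%)']
```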

Reproduction

Analysis started: 2024-04-21 01:52:09.098571
Analysis finished: 2024-04-21 01:52:11.505516
Duration: 2.41 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
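
A report of this kind can be regenerated roughly as follows. This is a sketch under assumptions: the file names, the report title and the `cp949` encoding are illustrative, since the report does not record how the source file was loaded or which settings `config.json` contained. It requires `pandas` and `ydata-profiling` to be installed.

```python
def build_report(csv_path: str, html_path: str, encoding: str = "cp949") -> None:
    """Profile a CSV file and write the result as an HTML report.

    csv_path, html_path and the cp949 default are illustrative assumptions.
    """
    # Imported lazily so this module loads even without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    profile = ProfileReport(df, title="Profiling report")
    profile.to_file(html_path)


# Example call with hypothetical file names:
# build_report("local_tax_levies.csv", "report.html")
```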

Variables

시도명
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 876.0 B

Value       Count
부산광역시  93

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 부산광역시
2nd row: 부산광역시
3rd row: 부산광역시
4th row: 부산광역시
5th row: 부산광역시

Common Values

Value       Count  Frequency (%)
부산광역시  93     100.0%

Length

Figure: Histogram of lengths of the category

시군구명
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 876.0 B

Value   Count
사상구  93

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 사상구
2nd row: 사상구
3rd row: 사상구
4th row: 사상구
5th row: 사상구

Common Values

Value   Count  Frequency (%)
사상구  93     100.0%

Length

Figure: Histogram of lengths of the category

자치단체코드
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 876.0 B

Value  Count
26530  93

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 26530
2nd row: 26530
3rd row: 26530
4th row: 26530
5th row: 26530

Common Values

Value  Count  Frequency (%)
26530  93     100.0%

Length

Figure: Histogram of lengths of the category

과세년도
Categorical

Distinct: 2
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 876.0 B

Value  Count
2020   47
2021   46

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020
2nd row: 2020
3rd row: 2020
4th row: 2020
5th row: 2020

Common Values

Value  Count  Frequency (%)
2020   47     50.5%
2021   46     49.5%

Length

Figure: Histogram of lengths of the category

세목명
Categorical

HIGH CORRELATION

Distinct: 13
Distinct (%): 14.0%
Missing: 0
Missing (%): 0.0%
Memory size: 876.0 B

Value             Count
취득세            18
주민세            16
자동차세          14
재산세            10
지방소득세        8
Other values (8)  27

Length

Max length: 7
Median length: 3
Mean length: 3.7311828
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 담배소비세
2nd row: 교육세
3rd row: 도시계획세
4th row: 취득세
5th row: 취득세

Common Values

Value             Count  Frequency (%)
취득세            18     19.4%
주민세            16     17.2%
자동차세          14     15.1%
재산세            10     10.8%
지방소득세        8      8.6%
레저세            8      8.6%
지역자원시설세    5      5.4%
등록면허세        4      4.3%
담배소비세        2      2.2%
교육세            2      2.2%
Other values (3)  6      6.5%

Length

Figure: Histogram of lengths of the category

세원 유형명
Text

Distinct: 50
Distinct (%): 53.8%
Missing: 0
Missing (%): 0.0%
Memory size: 876.0 B

Length

Max length: 11
Median length: 8
Mean length: 6.0322581
Min length: 2

Characters and Unicode

Total characters: 561
Distinct characters: 74
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7
Unique (%): 7.5%

Sample

1st row: 담배소비세
2nd row: 교육세
3rd row: 도시계획세
4th row: 건축물
5th row: 주택(개별)

Words

Value                 Count  Frequency (%)
담배소비세            2      2.2%
도시계획세            2      2.2%
경륜                  2      2.2%
주민세(종합소득       2      2.2%
지방소득세(특별징수   2      2.2%
지방소득세(법인소득   2      2.2%
지방소득세(양도소득   2      2.2%
지방소득세(종합소득   2      2.2%
지방소비세            2      2.2%
승합                  2      2.2%
Other values (40)     73     78.5%

Most occurring characters

Value              Count  Frequency (%)
(Hangul)           55     9.8%
(                  49     8.7%
)                  49     8.7%
(Hangul)           27     4.8%
(Hangul)           24     4.3%
(Hangul)           19     3.4%
(Hangul)           18     3.2%
(Hangul)           16     2.9%
(Hangul)           12     2.1%
(Hangul)           11     2.0%
Other values (64)  281    50.1%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       461    82.2%
Open Punctuation   49     8.7%
Close Punctuation  49     8.7%
Decimal Number     2      0.4%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
(Hangul)           55     11.9%
(Hangul)           27     5.9%
(Hangul)           24     5.2%
(Hangul)           19     4.1%
(Hangul)           18     3.9%
(Hangul)           16     3.5%
(Hangul)           12     2.6%
(Hangul)           11     2.4%
(Hangul)           11     2.4%
(Hangul)           10     2.2%
Other values (61)  258    56.0%

Open Punctuation
Value  Count  Frequency (%)
(      49     100.0%

Close Punctuation
Value  Count  Frequency (%)
)      49     100.0%

Decimal Number
Value  Count  Frequency (%)
3      2      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  461    82.2%
Common  100    17.8%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
(Hangul)           55     11.9%
(Hangul)           27     5.9%
(Hangul)           24     5.2%
(Hangul)           19     4.1%
(Hangul)           18     3.9%
(Hangul)           16     3.5%
(Hangul)           12     2.6%
(Hangul)           11     2.4%
(Hangul)           11     2.4%
(Hangul)           10     2.2%
Other values (61)  258    56.0%

Common
Value  Count  Frequency (%)
(      49     49.0%
)      49     49.0%
3      2      2.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  461    82.2%
ASCII   100    17.8%

Most frequent character per block

Hangul
Value              Count  Frequency (%)
(Hangul)           55     11.9%
(Hangul)           27     5.9%
(Hangul)           24     5.2%
(Hangul)           19     4.1%
(Hangul)           18     3.9%
(Hangul)           16     3.5%
(Hangul)           12     2.6%
(Hangul)           11     2.4%
(Hangul)           11     2.4%
(Hangul)           10     2.2%
Other values (61)  258    56.0%

ASCII
Value  Count  Frequency (%)
(      49     49.0%
)      49     49.0%
3      2      2.0%

부과건수
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 67
Distinct (%): 72.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27227.86
Minimum: 0
Maximum: 439256
Zeros: 27
Zeros (%): 29.0%
Negative: 0
Negative (%): 0.0%
Memory size: 969.0 B

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 992
Q3: 20917
95th percentile: 140296.6
Maximum: 439256
Range: 439256
Interquartile range (IQR): 20917

Descriptive statistics

Standard deviation: 71111.926
Coefficient of variation (CV): 2.6117339
Kurtosis: 23.047782
Mean: 27227.86
Median Absolute Deviation (MAD): 992
Skewness: 4.486352
Sum: 2532191
Variance: 5.0569061 × 10^9
Monotonicity: Not monotonic
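
Several of the 부과건수 figures are tied together by standard identities: mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared, and IQR = Q3 - Q1. The sketch below checks that the reported values agree with each other to within rounding. It only re-uses numbers printed in the report and does not recompute anything from the raw data.

```python
import math

# Values copied from the 부과건수 statistics.
n = 93
total = 2532191       # Sum
mean = 27227.86       # Mean
std = 71111.926       # Standard deviation
q1, q3 = 0, 20917     # Q1, Q3

derived_mean = total / n
cv = std / mean
variance = std ** 2
iqr = q3 - q1

print(f"mean     = {derived_mean:.2f}")
print(f"CV       = {cv:.4f}")
print(f"variance = {variance:.4e}")
print(f"IQR      = {iqr}")

# The tolerance allows for the rounding already present in the report.
assert math.isclose(derived_mean, mean, rel_tol=1e-6)
assert math.isclose(cv, 2.6117339, rel_tol=1e-6)
assert math.isclose(variance, 5.0569061e9, rel_tol=1e-6)
```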
Figure: Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
0                  27     29.0%
1428               1      1.1%
80821              1      1.1%
20917              1      1.1%
23782              1      1.1%
108                1      1.1%
21042              1      1.1%
80138              1      1.1%
24                 1      1.1%
439256             1      1.1%
Other values (57)  57     61.3%

Minimum 10 values

Value  Count  Frequency (%)
0      27     29.0%
6      1      1.1%
7      1      1.1%
14     1      1.1%
17     1      1.1%
24     1      1.1%
29     1      1.1%
60     1      1.1%
72     1      1.1%
108    1      1.1%

Maximum 10 values

Value   Count  Frequency (%)
439256  1      1.1%
435941  1      1.1%
147993  1      1.1%
145488  1      1.1%
143152  1      1.1%
138393  1      1.1%
112066  1      1.1%
110606  1      1.1%
81312   1      1.1%
80821   1      1.1%

부과금액
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 67
Distinct (%): 72.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.4283064 × 10^9
Minimum: 0
Maximum: 3.3426454 × 10^10
Zeros: 27
Zeros (%): 29.0%
Negative: 0
Negative (%): 0.0%
Memory size: 969.0 B

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 3.82443 × 10^8
Q3: 5.426785 × 10^9
95th percentile: 2.0191231 × 10^10
Maximum: 3.3426454 × 10^10
Range: 3.3426454 × 10^10
Interquartile range (IQR): 5.426785 × 10^9

Descriptive statistics

Standard deviation: 7.2700072 × 10^9
Coefficient of variation (CV): 1.6417128
Kurtosis: 4.4332459
Mean: 4.4283064 × 10^9
Median Absolute Deviation (MAD): 3.82443 × 10^8
Skewness: 2.1204013
Sum: 4.1183249 × 10^11
Variance: 5.2853004 × 10^19
Monotonicity: Not monotonic
Figure: Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
0                  27     29.0%
27640117000        1      1.1%
812543000          1      1.1%
2592898000         1      1.1%
7567106000         1      1.1%
10449000           1      1.1%
33426454000        1      1.1%
10349221000        1      1.1%
11407000           1      1.1%
16147473000        1      1.1%
Other values (57)  57     61.3%

Minimum 10 values

Value     Count  Frequency (%)
0         27     29.0%
5105000   1      1.1%
8905000   1      1.1%
10449000  1      1.1%
10682000  1      1.1%
11039000  1      1.1%
11407000  1      1.1%
12330000  1      1.1%
16107000  1      1.1%
18512000  1      1.1%

Maximum 10 values

Value        Count  Frequency (%)
33426454000  1      1.1%
30743515000  1      1.1%
27640117000  1      1.1%
21740631000  1      1.1%
21005336000  1      1.1%
19648494000  1      1.1%
17014458000  1      1.1%
16147473000  1      1.1%
15259866000  1      1.1%
14947966000  1      1.1%

Interactions

Figure: four interaction plots

Correlations

             과세년도  세목명  세원 유형명  부과건수  부과금액
과세년도     1.000     0.000   0.000        0.000     0.000
세목명       0.000     1.000   1.000        0.836     0.533
세원 유형명  0.000     1.000   1.000        1.000     0.941
부과건수     0.000     0.836   1.000        1.000     0.697
부과금액     0.000     0.533   0.941        0.697     1.000

          세목명  과세년도
세목명    1.000   0.000
과세년도  0.000   1.000

          부과건수  부과금액  과세년도  세목명
부과건수  1.000     0.853     0.000     0.621
부과금액  0.853     1.000     0.000     0.253
과세년도  0.000     0.000     1.000     0.000
세목명    0.621     0.253     0.000     1.000
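
The High correlation alerts can be related back to the last matrix above (부과건수, 부과금액, 과세년도, 세목명). The sketch below lists the variable pairs whose coefficient reaches a cut-off. The 0.5 threshold is an assumption, since the report does not state the value it used, and `high_pairs` is a hypothetical helper. Under that assumption, the two pairs it finds are the same ones named in the alerts.

```python
from itertools import combinations

# Last correlation matrix of the report, row and column order as printed.
names = ["부과건수", "부과금액", "과세년도", "세목명"]
matrix = [
    [1.000, 0.853, 0.000, 0.621],
    [0.853, 1.000, 0.000, 0.253],
    [0.000, 0.000, 1.000, 0.000],
    [0.621, 0.253, 0.000, 1.000],
]
THRESHOLD = 0.5  # assumed cut-off; not stated in the report


def high_pairs(names, matrix, threshold):
    """Return (a, b, coefficient) for each pair at or above the threshold."""
    return [
        (names[i], names[j], matrix[i][j])
        for i, j in combinations(range(len(names)), 2)
        if abs(matrix[i][j]) >= threshold
    ]


for a, b, r in high_pairs(names, matrix, THRESHOLD):
    print(f"{a} ~ {b}: {r:.3f}")
# 부과건수 ~ 부과금액: 0.853
# 부과건수 ~ 세목명: 0.621
```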

Missing values

Figure: A simple visualization of nullity by column.
Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

    시도명      시군구명  자치단체코드  과세년도  세목명      세원 유형명  부과건수  부과금액
0   부산광역시  사상구    26530         2020      담배소비세  담배소비세   0         0
1   부산광역시  사상구    26530         2020      교육세      교육세       439256    16147473000
2   부산광역시  사상구    26530         2020      도시계획세  도시계획세   0         0
3   부산광역시  사상구    26530         2020      취득세      건축물       1042      8008873000
4   부산광역시  사상구    26530         2020      취득세      주택(개별)   9695729013000
5   부산광역시  사상구    26530         2020      취득세      주택(단독)   36439116931000
6   부산광역시  사상구    26530         2020      취득세      기타         72336760000
7   부산광역시  사상구    26530         2020      취득세      항공기       0         0
8   부산광역시  사상구    26530         2020      취득세      기계장비     14        5105000
9   부산광역시  사상구    26530         2020      취득세      차량         1931243405000

Last rows

    시도명      시군구명  자치단체코드  과세년도  세목명          세원 유형명           부과건수  부과금액
83  부산광역시  사상구    26530         2021      지역자원시설세  지역자원시설세(시설)  0         0
84  부산광역시  사상구    26530         2021      지역자원시설세  지역자원시설세(특자)  364       34779000
85  부산광역시  사상구    26530         2021      자동차세        자동차세(주행)        0         0
86  부산광역시  사상구    26530         2021      자동차세        3륜이하               954       10682000
87  부산광역시  사상구    26530         2021      자동차세        특수                  887       29252000
88  부산광역시  사상구    26530         2021      자동차세        화물                  17152475400000
89  부산광역시  사상구    26530         2021      자동차세        승합                  2472127665000
90  부산광역시  사상구    26530         2021      자동차세        기타승용              388       24675000
91  부산광역시  사상구    26530         2021      자동차세        승용                  110606    14871672000
92  부산광역시  사상구    26530         2021      체납            체납                  138393    5119210000

Where the 부과건수 and 부과금액 digits run together and cannot be separated unambiguously (rows 4, 5, 6, 9, 88, 89), the combined digits are shown unsplit.