Overview

Dataset statistics

Number of variables: 8
Number of observations: 44
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.1 KiB
Average record size in memory: 71.0 B

Variable types

Categorical: 5
Text: 1
Numeric: 2

Dataset

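Description: Status by tax source type (2022). Provides the status of local tax levies by type of taxable object that serves as a tax source. Intended as baseline data for reviewing the equity of tax burden levels across object types and for identifying the targets of regulatory policy in related fields such as real estate.
URL: https://www.data.go.kr/data/15078841/fileData.do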

Alerts

시도명 has constant value "대전광역시" (Constant)
시군구명 has constant value "서구" (Constant)
자치단체코드 has constant value "30170" (Constant)
과세년도 has constant value "2022" (Constant)
부과건수 is highly overall correlated with 부과금액 and 1 other field (High correlation)
부과금액 is highly overall correlated with 부과건수 (High correlation)
세목명 is highly overall correlated with 부과건수 (High correlation)
세원 유형명 has unique values (Unique)
부과건수 has 10 (22.7%) zeros (Zeros)
부과금액 has 10 (22.7%) zeros (Zeros)
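
The percentages quoted in the alerts follow from the 44 observations reported in the overview. A minimal Python sketch of that arithmetic is shown here; the helper name `share` is illustrative and is not part of ydata-profiling.

```python
N_OBSERVATIONS = 44  # "Number of observations" from the overview


def share(count: int, total: int = N_OBSERVATIONS) -> str:
    """Format count/total as a percentage with one decimal place."""
    return f"{100 * count / total:.1f}%"


print(share(10))  # rows where 부과건수 (or 부과금액) equals zero
print(share(1))   # one distinct value in each constant column
print(share(44))  # 세원 유형명: every value is distinct
```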

Reproduction

Analysis started: 2023-12-12 07:31:22.019032
Analysis finished: 2023-12-12 07:31:22.874641
Duration: 0.86 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
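
A report of this kind can typically be regenerated along the following lines. This is a sketch under assumptions, not the configuration actually used: the function name `build_report`, the file paths, the report title, and the encoding are all hypothetical, and the original `config.json` settings are not reproduced.

```python
def build_report(csv_path: str, html_path: str = "report.html", encoding: str = "utf-8") -> None:
    """Profile a CSV file and write an HTML report (sketch; see assumptions above)."""
    # Imported inside the function so this module can be loaded even where
    # the third-party packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    # The title is a placeholder chosen for this example.
    profile = ProfileReport(df, title="세원유형별현황 (2022)")
    profile.to_file(html_path)
```

A call such as `build_report("tax_source_types_2022.csv", encoding="cp949")` would be one way to use it; both the file name and the encoding there are guesses and should be replaced with the values that match the downloaded file.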

Variables

시도명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Memory size: 484.0 B

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 대전광역시
2nd row: 대전광역시
3rd row: 대전광역시
4th row: 대전광역시
5th row: 대전광역시

Common Values

Value        Count  Frequency (%)
대전광역시   44     100.0%

Length

Figure: Histogram of lengths of the category

시군구명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Memory size: 484.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서구
2nd row: 서구
3rd row: 서구
4th row: 서구
5th row: 서구

Common Values

Value  Count  Frequency (%)
서구   44     100.0%

Length

Figure: Histogram of lengths of the category

자치단체코드
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Memory size: 484.0 B

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 30170
2nd row: 30170
3rd row: 30170
4th row: 30170
5th row: 30170

Common Values

Value  Count  Frequency (%)
30170  44     100.0%

Length

Figure: Histogram of lengths of the category

과세년도
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Memory size: 484.0 B

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022
2nd row: 2022
3rd row: 2022
4th row: 2022
5th row: 2022

Common Values

Value  Count  Frequency (%)
2022   44     100.0%

Length

Figure: Histogram of lengths of the category

세목명
Categorical

HIGH CORRELATION 

Distinct: 11
Distinct (%): 25.0%
Missing: 0
Missing (%): 0.0%
Memory size: 484.0 B

Length

Max length: 7
Median length: 3
Mean length: 3.7272727
Min length: 2

Unique

Unique: 3
Unique (%): 6.8%

Sample

1st row: 지방소득세
2nd row: 지방소득세
3rd row: 지방소득세
4th row: 지방소득세
5th row: 지방소비세

Common Values

Value            Count  Frequency (%)
취득세           9      20.5%
자동차세         7      15.9%
주민세           7      15.9%
재산세           5      11.4%
지방소득세       4      9.1%
레저세           4      9.1%
지역자원시설세   3      6.8%
등록면허세       2      4.5%
지방소비세       1      2.3%
교육세           1      2.3%

Length

Figure: Histogram of lengths of the category

세원 유형명
Text

UNIQUE 

Distinct: 44
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 484.0 B

Length

Max length: 11
Median length: 8
Mean length: 6.0681818
Min length: 2

Characters and Unicode

Total characters: 267
Distinct characters: 70
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 44
Unique (%): 100.0%

Sample

1st row: 지방소득세(특별징수)
2nd row: 지방소득세(법인소득)
3rd row: 지방소득세(양도소득)
4th row: 지방소득세(종합소득)
5th row: 지방소비세

Words

Value                  Count  Frequency (%)
지방소득세(특별징수    1      2.3%
지방소득세(법인소득    1      2.3%
재산세(토지            1      2.3%
기타승용               1      2.3%
승용                   1      2.3%
등록면허세(면허        1      2.3%
등록면허세(등록        1      2.3%
지역자원시설세(소방    1      2.3%
지역자원시설세(시설    1      2.3%
지역자원시설세(특자    1      2.3%
Other values (34)      34     77.3%

Most occurring characters

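Value              Count  Frequency (%)
[not preserved]    25     9.4%
(                  24     9.0%
)                  24     9.0%
[not preserved]    13     4.9%
[not preserved]    11     4.1%
[not preserved]    10     3.7%
[not preserved]    9      3.4%
[not preserved]    7      2.6%
[not preserved]    6      2.2%
[not preserved]    5      1.9%
Other values (60)  133    49.8%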

Most occurring categories

Value              Count  Frequency (%)
Other Letter       218    81.6%
Open Punctuation   24     9.0%
Close Punctuation  24     9.0%
Decimal Number     1      0.4%

Most frequent character per category

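Other Letter

Value              Count  Frequency (%)
[not preserved]    25     11.5%
[not preserved]    13     6.0%
[not preserved]    11     5.0%
[not preserved]    10     4.6%
[not preserved]    9      4.1%
[not preserved]    7      3.2%
[not preserved]    6      2.8%
[not preserved]    5      2.3%
[not preserved]    5      2.3%
[not preserved]    5      2.3%
Other values (57)  122    56.0%

Open Punctuation

Value  Count  Frequency (%)
(      24     100.0%

Close Punctuation

Value  Count  Frequency (%)
)      24     100.0%

Decimal Number

Value  Count  Frequency (%)
3      1      100.0%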

Most occurring scripts

Value   Count  Frequency (%)
Hangul  218    81.6%
Common  49     18.4%

Most frequent character per script

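Hangul

Value              Count  Frequency (%)
[not preserved]    25     11.5%
[not preserved]    13     6.0%
[not preserved]    11     5.0%
[not preserved]    10     4.6%
[not preserved]    9      4.1%
[not preserved]    7      3.2%
[not preserved]    6      2.8%
[not preserved]    5      2.3%
[not preserved]    5      2.3%
[not preserved]    5      2.3%
Other values (57)  122    56.0%

Common

Value  Count  Frequency (%)
(      24     49.0%
)      24     49.0%
3      1      2.0%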

Most occurring blocks

Value   Count  Frequency (%)
Hangul  218    81.6%
ASCII   49     18.4%

Most frequent character per block

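Hangul

Value              Count  Frequency (%)
[not preserved]    25     11.5%
[not preserved]    13     6.0%
[not preserved]    11     5.0%
[not preserved]    10     4.6%
[not preserved]    9      4.1%
[not preserved]    7      3.2%
[not preserved]    6      2.8%
[not preserved]    5      2.3%
[not preserved]    5      2.3%
[not preserved]    5      2.3%
Other values (57)  122    56.0%

ASCII

Value  Count  Frequency (%)
(      24     49.0%
)      24     49.0%
3      1      2.0%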

부과건수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 35
Distinct (%): 79.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 58745.5
Minimum: 0
Maximum: 907820
Zeros: 10
Zeros (%): 22.7%
Negative: 0
Negative (%): 0.0%
Memory size: 528.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 7.5
Median: 2276.5
Q3: 39715.5
95-th percentile: 279070.9
Maximum: 907820
Range: 907820
Interquartile range (IQR): 39708
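
The fractional Q1 of 7.5 is consistent with linear interpolation between neighbouring order statistics, which is the pandas default; that the report uses this method is an assumption here. The sketch below rebuilds Q1 from the smallest 부과건수 values listed for this variable. The function name `interpolated_quantile` and the `None` placeholders for values not listed in the report are illustrative.

```python
from typing import Optional, Sequence


def interpolated_quantile(sorted_values: Sequence[Optional[float]], q: float) -> float:
    """Quantile by linear interpolation between the two nearest order statistics."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    low_value, high_value = sorted_values[lower], sorted_values[upper]
    if low_value is None or high_value is None:
        raise ValueError("quantile falls on values that are not listed in the report")
    return low_value + fraction * (high_value - low_value)


# The 19 smallest 부과건수 values from the report (ten zeros, then the
# smallest non-zero values); the remaining 25 are left as None placeholders.
counts = [0] * 10 + [3, 9, 11, 21, 29, 34, 82, 456, 1158] + [None] * 25

q1 = interpolated_quantile(counts, 0.25)
print(q1)            # first quartile
print(39715.5 - q1)  # interquartile range, using the Q3 reported above
```

With 44 observations the 25% position is 0.25 × 43 = 10.75, so Q1 sits three quarters of the way from the 11th smallest value (3) to the 12th (9).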

Descriptive statistics

Standard deviation: 152113.76
Coefficient of variation (CV): 2.5893687
Kurtosis: 23.127544
Mean: 58745.5
Median Absolute Deviation (MAD): 2276.5
Skewness: 4.438575
Sum: 2584802
Variance: 2.3138595 × 10^10
Monotonicity: Not monotonic
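
Several of these figures are tied together by simple identities: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. A short Python check, using the rounded values as printed in the report (so agreement is only approximate):

```python
n = 44           # number of observations
total = 2584802  # Sum
std = 152113.76  # Standard deviation, as rounded in the report

mean = total / n      # should reproduce the reported Mean
cv = std / mean       # coefficient of variation
variance = std ** 2   # variance from the standard deviation

print(mean)
print(round(cv, 4))
print(f"{variance:.4e}")
```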
Figure: Histogram with fixed size bins (bins=35)

Common values

Value              Count  Frequency (%)
0                  10     22.7%
86549              1      2.3%
22342              1      2.3%
5525               1      2.3%
286630             1      2.3%
51948              1      2.3%
81493              1      2.3%
301202             1      2.3%
456                1      2.3%
165086             1      2.3%
Other values (25)  25     56.8%

Minimum 10 values

Value  Count  Frequency (%)
0      10     22.7%
3      1      2.3%
9      1      2.3%
11     1      2.3%
21     1      2.3%
29     1      2.3%
34     1      2.3%
82     1      2.3%
456    1      2.3%
1158   1      2.3%

Maximum 10 values

Value   Count  Frequency (%)
907820  1      2.3%
301202  1      2.3%
286630  1      2.3%
236236  1      2.3%
179398  1      2.3%
165086  1      2.3%
114864  1      2.3%
86549   1      2.3%
81493   1      2.3%
51948   1      2.3%

부과금액
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 35
Distinct (%): 79.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.1128881 × 10^10
Minimum: 0
Maximum: 7.0582485 × 10^10
Zeros: 10
Zeros (%): 22.7%
Negative: 0
Negative (%): 0.0%
Memory size: 528.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1915000
Median: 5.378405 × 10^8
Q3: 1.8653201 × 10^10
95-th percentile: 4.1759468 × 10^10
Maximum: 7.0582485 × 10^10
Range: 7.0582485 × 10^10
Interquartile range (IQR): 1.8651286 × 10^10

Descriptive statistics

Standard deviation: 1.6648966 × 10^10
Coefficient of variation (CV): 1.4960144
Kurtosis: 2.7928356
Mean: 1.1128881 × 10^10
Median Absolute Deviation (MAD): 5.378405 × 10^8
Skewness: 1.7151965
Sum: 4.8967078 × 10^11
Variance: 2.7718809 × 10^20
Monotonicity: Not monotonic
Figure: Histogram with fixed size bins (bins=35)

Common values

Value              Count  Frequency (%)
0                  10     22.7%
48112683000        1      2.3%
534188000          1      2.3%
290861000          1      2.3%
41105899000        1      2.3%
2107469000         1      2.3%
7548704000         1      2.3%
11395162000        1      2.3%
13099000           1      2.3%
41874804000        1      2.3%
Other values (25)  25     56.8%

Minimum 10 values

Value      Count  Frequency (%)
0          10     22.7%
928000     1      2.3%
2244000    1      2.3%
8056000    1      2.3%
13099000   1      2.3%
24793000   1      2.3%
37575000   1      2.3%
71732000   1      2.3%
180876000  1      2.3%
283543000  1      2.3%

Maximum 10 values

Value        Count  Frequency (%)
70582485000  1      2.3%
48112683000  1      2.3%
41874804000  1      2.3%
41105899000  1      2.3%
34283442000  1      2.3%
33081944000  1      2.3%
31882217000  1      2.3%
30158846000  1      2.3%
19742994000  1      2.3%
19597693000  1      2.3%

Interactions


Correlations

             세목명  세원 유형명  부과건수  부과금액
세목명       1.000   1.000        0.849     0.533
세원 유형명  1.000   1.000        1.000     1.000
부과건수     0.849   1.000        1.000     0.477
부과금액     0.533   1.000        0.477     1.000

          부과건수  부과금액  세목명
부과건수  1.000     0.823     0.621
부과금액  0.823     1.000     0.271
세목명    0.621     0.271     1.000

Missing values

Figure: A simple visualization of nullity by column.
Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

    시도명      시군구명  자치단체코드  과세년도  세목명      세원 유형명           부과건수  부과금액
0   대전광역시  서구      30170         2022      지방소득세  지방소득세(특별징수)  86549     48112683000
1   대전광역시  서구      30170         2022      지방소득세  지방소득세(법인소득)  5982      70582485000
2   대전광역시  서구      30170         2022      지방소득세  지방소득세(양도소득)  6486      18568221000
3   대전광역시  서구      30170         2022      지방소득세  지방소득세(종합소득)  114864    18908142000
4   대전광역시  서구      30170         2022      지방소비세  지방소비세            9         14034010000
5   대전광역시  서구      30170         2022      교육세      교육세                907820    33081944000
6   대전광역시  서구      30170         2022      취득세      건축물                1848      16405335000
7   대전광역시  서구      30170         2022      취득세      주택(개별)            1314      19742994000
8   대전광역시  서구      30170         2022      취득세      주택(단독)            6682      34283442000
9   대전광역시  서구      30170         2022      취득세      기타                  21        337906000

Last rows

    시도명      시군구명  자치단체코드  과세년도  세목명  세원 유형명       부과건수  부과금액
34  대전광역시  서구      30170         2022      재산세  재산세(선박)      82        928000
35  대전광역시  서구      30170         2022      재산세  재산세(건축물)    39572     14584028000
36  대전광역시  서구      30170         2022      주민세  주민세(사업소분)  24882     2997340000
37  대전광역시  서구      30170         2022      주민세  주민세(개인분)    179398    1796148000
38  대전광역시  서구      30170         2022      주민세  주민세(종업원분)  3695      8575922000
39  대전광역시  서구      30170         2022      주민세  주민세(특별징수)  0         0
40  대전광역시  서구      30170         2022      주민세  주민세(법인세분)  0         0
41  대전광역시  서구      30170         2022      주민세  주민세(양도소득)  0         0
42  대전광역시  서구      30170         2022      주민세  주민세(종합소득)  0         0
43  대전광역시  서구      30170         2022      체납    체납              236236    19597693000
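
Since the dataset is described as baseline data for comparing tax burden across source types, one natural derived figure is the average levied amount per case (부과금액 divided by 부과건수). This ratio is not part of the profiling report; the sketch below is only an illustration, using a few rows whose counts and amounts also appear in the extreme-value listings above. The function name `amount_per_levy` is hypothetical, and rows with zero levies are reported as not applicable rather than divided.

```python
from typing import Optional

# (세원 유형명, 부과건수, 부과금액) for a few rows whose values also appear
# in the minimum/maximum value listings of the report.
rows = [
    ("지방소득세(특별징수)", 86549, 48112683000),
    ("교육세", 907820, 33081944000),
    ("재산세(선박)", 82, 928000),
    ("주민세(특별징수)", 0, 0),
    ("체납", 236236, 19597693000),
]


def amount_per_levy(count: int, amount: int) -> Optional[float]:
    """Average levied amount per case, or None when nothing was levied."""
    if count == 0:
        return None
    return amount / count


for name, count, amount in rows:
    average = amount_per_levy(count, amount)
    label = "n/a" if average is None else f"{average:,.0f}"
    print(f"{name}: {label}")
```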