Overview

Dataset statistics

Number of variables: 8
Number of observations: 46
Missing cells: 20
Missing cells (%): 5.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.2 KiB
Average record size in memory: 70.9 B
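
The missing-cell percentage follows from the counts above. A minimal sketch of the arithmetic, assuming the percentage is taken over all cells (variables × observations); the variable names are illustrative:

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 8
n_observations = 46
missing_cells = 20

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

Under that assumption the result rounds to the 5.4% reported above.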

Variable types

Categorical: 5
Text: 1
Numeric: 2

Dataset

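Description: Data on local tax assessments, broken down by the type of taxable object that serves as the tax source for local taxation. It provides fields such as tax item name (세목명), tax source type name (세원 유형명), number of assessments (부과건수), and assessed amount (부과금액).
Author: 전라남도 여수시 (Yeosu-si, Jeollanam-do)
URL: https://www.data.go.kr/data/15079191/fileData.do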

Alerts

시도명 (province name) has constant value "전라남도" (Constant)
시군구명 (city/county/district name) has constant value "여수시" (Constant)
자치단체코드 (local government code) has constant value "46130" (Constant)
과세년도 (tax year) has constant value "2022" (Constant)
부과건수 is highly overall correlated with 세목명 (High correlation)
세목명 is highly overall correlated with 부과건수 (High correlation)
부과건수 has 10 (21.7%) missing values (Missing)
부과금액 has 10 (21.7%) missing values (Missing)
세원 유형명 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 12:35:31.957433
Analysis finished: 2023-12-12 12:35:32.860291
Duration: 0.9 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
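
A report like this one can be regenerated with ydata-profiling. The sketch below is illustrative only: the function name `build_profile`, the CSV path, the `cp949` encoding, and the report title are assumptions that do not come from the report, and the original `config.json` settings are not reproduced.

```python
def build_profile(csv_path, html_path="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report.

    All argument defaults are illustrative assumptions, not values
    taken from the original analysis.
    """
    # Imported lazily so this module loads even when the optional
    # packages (pandas, ydata-profiling) are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Local tax assessment profile")
    report.to_file(html_path)
    return report
```

Calling `build_profile("local_tax_2022.csv")` (a hypothetical file name) would write `report.html` using the library's default settings.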

Variables

시도명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 500.0 B

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전라남도
2nd row: 전라남도
3rd row: 전라남도
4th row: 전라남도
5th row: 전라남도

Common Values

Value | Count | Frequency (%)
전라남도 | 46 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; 전라남도 accounts for all 46 rows (100.0%)]

시군구명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 500.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 여수시
2nd row: 여수시
3rd row: 여수시
4th row: 여수시
5th row: 여수시

Common Values

Value | Count | Frequency (%)
여수시 | 46 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; 여수시 accounts for all 46 rows (100.0%)]

자치단체코드
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 500.0 B

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 46130
2nd row: 46130
3rd row: 46130
4th row: 46130
5th row: 46130

Common Values

Value | Count | Frequency (%)
46130 | 46 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; 46130 accounts for all 46 rows (100.0%)]

과세년도
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 500.0 B

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022
2nd row: 2022
3rd row: 2022
4th row: 2022
5th row: 2022

Common Values

Value | Count | Frequency (%)
2022 | 46 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; 2022 accounts for all 46 rows (100.0%)]

세목명
Categorical

HIGH CORRELATION 

Distinct: 13
Distinct (%): 28.3%
Missing: 0
Missing (%): 0.0%
Memory size: 500.0 B

Length

Max length: 7
Median length: 3
Mean length: 3.7826087
Min length: 2

Unique

Unique: 5
Unique (%): 10.9%

Sample

1st row: 담배소비세 (tobacco consumption tax)
2nd row: 교육세 (education tax)
3rd row: 도시계획세 (urban planning tax)
4th row: 취득세 (acquisition tax)
5th row: 취득세 (acquisition tax)

Common Values

Value | Count | Frequency (%)
취득세 (acquisition tax) | 9 | 19.6%
자동차세 (automobile tax) | 7 | 15.2%
주민세 (resident tax) | 7 | 15.2%
재산세 (property tax) | 5 | 10.9%
레저세 (leisure tax) | 4 | 8.7%
지방소득세 (local income tax) | 4 | 8.7%
지역자원시설세 (regional resource facility tax) | 3 | 6.5%
등록면허세 (registration and license tax) | 2 | 4.3%
담배소비세 (tobacco consumption tax) | 1 | 2.2%
교육세 (education tax) | 1 | 2.2%
Other values (3) | 3 | 6.5%

Length

[Figure: Histogram of lengths of the category]

세원 유형명
Text

UNIQUE 

Distinct: 46
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 500.0 B

Length

Max length: 11
Median length: 8
Mean length: 6.0217391
Min length: 2

Characters and Unicode

Total characters: 277
Distinct characters: 73
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
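
As an illustration of those character properties, the sketch below counts Unicode general categories for one sample value of this column using Python's standard `unicodedata` module. It is not the profiler's own implementation, and the choice of sample value is arbitrary.

```python
import unicodedata
from collections import Counter

# One value from the 세원 유형명 sample rows.
value = "주택(개별)"

# General category codes: Lo = Other Letter, Ps = Open Punctuation,
# Pe = Close Punctuation (the names used in the tables of this section).
categories = Counter(unicodedata.category(ch) for ch in value)

print(dict(categories))
```

For this value the four Hangul syllables fall under Other Letter and the parentheses under Open and Close Punctuation.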

Unique

Unique: 46
Unique (%): 100.0%

Sample

1st row: 담배소비세
2nd row: 교육세
3rd row: 도시계획세
4th row: 건축물
5th row: 주택(개별)

Words

Value | Count | Frequency (%)
담배소비세 | 1 | 2.2%
주민세(양도소득 | 1 | 2.2%
지역자원시설세(특자 | 1 | 2.2%
승합 | 1 | 2.2%
기타승용 | 1 | 2.2%
승용 | 1 | 2.2%
주민세(사업소분 | 1 | 2.2%
주민세(개인분 | 1 | 2.2%
주민세(종업원분 | 1 | 2.2%
주민세(특별징수 | 1 | 2.2%
Other values (36) | 36 | 78.3%

Most occurring characters

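Value | Count | Frequency (%)
(Hangul character, not preserved) | 27 | 9.7%
) | 24 | 8.7%
( | 24 | 8.7%
(Hangul character, not preserved) | 14 | 5.1%
(Hangul character, not preserved) | 11 | 4.0%
(Hangul character, not preserved) | 10 | 3.6%
(Hangul character, not preserved) | 9 | 3.2%
(Hangul character, not preserved) | 7 | 2.5%
(Hangul character, not preserved) | 6 | 2.2%
(Hangul character, not preserved) | 5 | 1.8%
Other values (63) | 140 | 50.5%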

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 228 | 82.3%
Close Punctuation | 24 | 8.7%
Open Punctuation | 24 | 8.7%
Decimal Number | 1 | 0.4%

Most frequent character per category

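Other Letter
Value | Count | Frequency (%)
(Hangul character, not preserved) | 27 | 11.8%
(Hangul character, not preserved) | 14 | 6.1%
(Hangul character, not preserved) | 11 | 4.8%
(Hangul character, not preserved) | 10 | 4.4%
(Hangul character, not preserved) | 9 | 3.9%
(Hangul character, not preserved) | 7 | 3.1%
(Hangul character, not preserved) | 6 | 2.6%
(Hangul character, not preserved) | 5 | 2.2%
(Hangul character, not preserved) | 5 | 2.2%
(Hangul character, not preserved) | 5 | 2.2%
Other values (60) | 129 | 56.6%

Close Punctuation
Value | Count | Frequency (%)
) | 24 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 24 | 100.0%

Decimal Number
Value | Count | Frequency (%)
3 | 1 | 100.0%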

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 228 | 82.3%
Common | 49 | 17.7%

Most frequent character per script

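Hangul
Value | Count | Frequency (%)
(Hangul character, not preserved) | 27 | 11.8%
(Hangul character, not preserved) | 14 | 6.1%
(Hangul character, not preserved) | 11 | 4.8%
(Hangul character, not preserved) | 10 | 4.4%
(Hangul character, not preserved) | 9 | 3.9%
(Hangul character, not preserved) | 7 | 3.1%
(Hangul character, not preserved) | 6 | 2.6%
(Hangul character, not preserved) | 5 | 2.2%
(Hangul character, not preserved) | 5 | 2.2%
(Hangul character, not preserved) | 5 | 2.2%
Other values (60) | 129 | 56.6%

Common
Value | Count | Frequency (%)
) | 24 | 49.0%
( | 24 | 49.0%
3 | 1 | 2.0%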

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 228 | 82.3%
ASCII | 49 | 17.7%

Most frequent character per block

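Hangul
Value | Count | Frequency (%)
(Hangul character, not preserved) | 27 | 11.8%
(Hangul character, not preserved) | 14 | 6.1%
(Hangul character, not preserved) | 11 | 4.8%
(Hangul character, not preserved) | 10 | 4.4%
(Hangul character, not preserved) | 9 | 3.9%
(Hangul character, not preserved) | 7 | 3.1%
(Hangul character, not preserved) | 6 | 2.6%
(Hangul character, not preserved) | 5 | 2.2%
(Hangul character, not preserved) | 5 | 2.2%
(Hangul character, not preserved) | 5 | 2.2%
Other values (60) | 129 | 56.6%

ASCII
Value | Count | Frequency (%)
) | 24 | 49.0%
( | 24 | 49.0%
3 | 1 | 2.0%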

부과건수
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 36
Distinct (%): 100.0%
Missing: 10
Missing (%): 21.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 40173.028
Minimum: 9
Maximum: 477865
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 546.0 B

Quantile statistics

Minimum: 9
5-th percentile: 11.75
Q1: 945.75
median: 4223.5
Q3: 47714.25
95-th percentile: 130568.5
Maximum: 477865
Range: 477856
Interquartile range (IQR): 46768.5
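
The range and interquartile range can be checked directly against the other quantile values. A small sketch using the figures listed above (variable names are illustrative):

```python
# Quantile figures for 부과건수 copied from the list above.
minimum, maximum = 9, 477865
q1, q3 = 945.75, 47714.25

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # IQR = Q3 - Q1

print(value_range, iqr)
```

Both results agree with the reported Range (477856) and IQR (46768.5).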

Descriptive statistics

Standard deviation: 86693.646
Coefficient of variation (CV): 2.1580063
Kurtosis: 19.055232
Mean: 40173.028
Median Absolute Deviation (MAD): 4200.5
Skewness: 4.015206
Sum: 1446229
Variance: 7.5157883 × 10^9
Monotonicity: Not monotonic
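
Several of these descriptive statistics are tied together by simple identities. The sketch below recomputes the mean, coefficient of variation, and variance from the reported sum and standard deviation, assuming the 36 non-missing values (46 observations minus 10 missing) are the denominator; small differences come from the rounding of the published standard deviation.

```python
# Reported figures for 부과건수.
total = 1446229        # Sum
count = 46 - 10        # non-missing observations (assumed denominator)
std = 86693.646        # Standard deviation (rounded in the report)

mean = total / count   # Mean = Sum / count
cv = std / mean        # Coefficient of variation = std / mean
variance = std ** 2    # Variance = std squared

print(f"mean={mean:.3f} cv={cv:.7f} variance={variance:.7e}")
```

The recomputed values match the reported mean, CV, and variance to the displayed precision or very close to it.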

[Figure: Histogram with fixed size bins (bins=36)]

Common values

Value | Count | Frequency (%)
25833 | 1 | 2.2%
1853 | 1 | 2.2%
176947 | 1 | 2.2%
16839 | 1 | 2.2%
4044 | 1 | 2.2%
47665 | 1 | 2.2%
4403 | 1 | 2.2%
3869 | 1 | 2.2%
47862 | 1 | 2.2%
9 | 1 | 2.2%
Other values (26) | 26 | 56.5%
(Missing) | 10 | 21.7%

Minimum 10 values

Value | Count | Frequency (%)
9 | 1 | 2.2%
11 | 1 | 2.2%
12 | 1 | 2.2%
34 | 1 | 2.2%
44 | 1 | 2.2%
233 | 1 | 2.2%
365 | 1 | 2.2%
645 | 1 | 2.2%
723 | 1 | 2.2%
1020 | 1 | 2.2%

Maximum 10 values

Value | Count | Frequency (%)
477865 | 1 | 2.2%
176947 | 1 | 2.2%
115109 | 1 | 2.2%
112845 | 1 | 2.2%
106862 | 1 | 2.2%
97598 | 1 | 2.2%
64107 | 1 | 2.2%
58923 | 1 | 2.2%
47862 | 1 | 2.2%
47665 | 1 | 2.2%

부과금액
Real number (ℝ)

MISSING 

Distinct: 36
Distinct (%): 100.0%
Missing: 10
Missing (%): 21.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.7760891 × 10^10
Minimum: 11384000
Maximum: 1.50541 × 10^11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 546.0 B

Quantile statistics

Minimum: 11384000
5-th percentile: 40898750
Q1: 9.2673625 × 10^8
median: 1.0396194 × 10^10
Q3: 2.5789833 × 10^10
95-th percentile: 5.2838631 × 10^10
Maximum: 1.50541 × 10^11
Range: 1.5052962 × 10^11
Interquartile range (IQR): 2.4863097 × 10^10

Descriptive statistics

Standard deviation: 2.7582235 × 10^10
Coefficient of variation (CV): 1.5529759
Kurtosis: 15.274135
Mean: 1.7760891 × 10^10
Median Absolute Deviation (MAD): 1.0010634 × 10^10
Skewness: 3.4549876
Sum: 6.3939206 × 10^11
Variance: 7.6077968 × 10^20
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=36)]

Common values

Value | Count | Frequency (%)
775156000 | 1 | 2.2%
122576000 | 1 | 2.2%
25785827000 | 1 | 2.2%
3990426000 | 1 | 2.2%
14950332000 | 1 | 2.2%
51134262000 | 1 | 2.2%
150541000000 | 1 | 2.2%
9197736000 | 1 | 2.2%
10402703000 | 1 | 2.2%
14310118000 | 1 | 2.2%
Other values (26) | 26 | 56.5%
(Missing) | 10 | 21.7%

Minimum 10 values

Value | Count | Frequency (%)
11384000 | 1 | 2.2%
35687000 | 1 | 2.2%
42636000 | 1 | 2.2%
107508000 | 1 | 2.2%
113399000 | 1 | 2.2%
122576000 | 1 | 2.2%
242461000 | 1 | 2.2%
528659000 | 1 | 2.2%
775156000 | 1 | 2.2%
977263000 | 1 | 2.2%

Maximum 10 values

Value | Count | Frequency (%)
150541000000 | 1 | 2.2%
57951738000 | 1 | 2.2%
51134262000 | 1 | 2.2%
40572278000 | 1 | 2.2%
38879219000 | 1 | 2.2%
33340817000 | 1 | 2.2%
30866064000 | 1 | 2.2%
26143814000 | 1 | 2.2%
25801852000 | 1 | 2.2%
25785827000 | 1 | 2.2%

Interactions

[Figures: four interaction plots for the numeric variables 부과건수 and 부과금액]

Correlations

[Figure: correlation heatmap]

Variable | 세목명 | 세원 유형명 | 부과건수 | 부과금액
세목명 | 1.000 | 1.000 | 0.821 | 0.000
세원 유형명 | 1.000 | 1.000 | 1.000 | 1.000
부과건수 | 0.821 | 1.000 | 1.000 | 0.289
부과금액 | 0.000 | 1.000 | 0.289 | 1.000

[Figure: correlation heatmap]

Variable | 부과건수 | 부과금액 | 세목명
부과건수 | 1.000 | 0.409 | 0.557
부과금액 | 0.409 | 1.000 | 0.000
세목명 | 0.557 | 0.000 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
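
One common way to quantify nullity correlation is the Pearson correlation between per-column missingness indicators. The sketch below applies that idea to 부과건수 and 부과금액 using only the 20 rows visible in the Sample section (indices 0 to 9 and 36 to 45), where rows 2 and 7 show `<NA>` in both columns. It is an illustration under those assumptions, not the profiler's own code, and it says nothing about the 26 rows that are not shown.

```python
import math

# Row indices visible in the Sample section of this report.
visible_rows = list(range(0, 10)) + list(range(36, 46))

# Rows showing <NA> in each column, read off the Sample section.
missing_count_rows = {2, 7}    # 부과건수
missing_amount_rows = {2, 7}   # 부과금액

# 1 = missing, 0 = present.
count_null = [int(i in missing_count_rows) for i in visible_rows]
amount_null = [int(i in missing_amount_rows) for i in visible_rows]


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


nullity_corr = pearson(count_null, amount_null)
print(f"nullity correlation on visible rows: {nullity_corr:.3f}")
```

On the visible rows the two columns are missing together, so the indicators are perfectly correlated.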

Sample

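First rows

Index | 시도명 | 시군구명 | 자치단체코드 | 과세년도 | 세목명 | 세원 유형명 | 부과건수 | 부과금액
0 | 전라남도 | 여수시 | 46130 | 2022 | 담배소비세 | 담배소비세 | 645 | 23173070000
1 | 전라남도 | 여수시 | 46130 | 2022 | 교육세 | 교육세 | 477865 | 38879219000
2 | 전라남도 | 여수시 | 46130 | 2022 | 도시계획세 | 도시계획세 | <NA> | <NA>
3 | 전라남도 | 여수시 | 46130 | 2022 | 취득세 | 건축물 | 3710 | 57951738000
4 | 전라남도 | 여수시 | 46130 | 2022 | 취득세 | 주택(개별) | 2547 | 11153797000
5 | 전라남도 | 여수시 | 46130 | 2022 | 취득세 | 주택(단독) | 5842 | 30866064000
6 | 전라남도 | 여수시 | 46130 | 2022 | 취득세 | 기타 | 233 | 1357582000
7 | 전라남도 | 여수시 | 46130 | 2022 | 취득세 | 항공기 | <NA> | <NA>
8 | 전라남도 | 여수시 | 46130 | 2022 | 취득세 | 기계장비 | 723 | 977263000
9 | 전라남도 | 여수시 | 46130 | 2022 | 취득세 | 차량 | 20732 | 25801852000

Last rows

Index | 시도명 | 시군구명 | 자치단체코드 | 과세년도 | 세목명 | 세원 유형명 | 부과건수 | 부과금액
36 | 전라남도 | 여수시 | 46130 | 2022 | 지방소득세 | 지방소득세(법인소득) | 4403 | 150541000000
37 | 전라남도 | 여수시 | 46130 | 2022 | 지방소득세 | 지방소득세(양도소득) | 3869 | 9197736000
38 | 전라남도 | 여수시 | 46130 | 2022 | 지방소득세 | 지방소득세(종합소득) | 47862 | 10402703000
39 | 전라남도 | 여수시 | 46130 | 2022 | 지방소비세 | 지방소비세 | 9 | 14310118000
40 | 전라남도 | 여수시 | 46130 | 2022 | 등록면허세 | 등록면허세(면허) | 64107 | 1356331000
41 | 전라남도 | 여수시 | 46130 | 2022 | 등록면허세 | 등록면허세(등록) | 58923 | 8930338000
42 | 전라남도 | 여수시 | 46130 | 2022 | 지역자원시설세 | 지역자원시설세(소방) | 106862 | 13510253000
43 | 전라남도 | 여수시 | 46130 | 2022 | 지역자원시설세 | 지역자원시설세(시설) | 44 | 1358930000
44 | 전라남도 | 여수시 | 46130 | 2022 | 지역자원시설세 | 지역자원시설세(특자) | 1348 | 35687000
45 | 전라남도 | 여수시 | 46130 | 2022 | 체납 | 체납 | 115109 | 10389686000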