Overview

Dataset statistics

Number of variables: 9
Number of observations: 63
Missing cells: 80
Missing cells (%): 14.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.9 KiB
Average record size in memory: 80.1 B
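
The headline figures can be cross-checked against each other. The short sketch below uses only numbers printed in this report (the per-column missing counts come from the Alerts section) and confirms that the missing-cell percentage and the total memory size follow from the other statistics.

```python
# Figures copied from the Overview and Alerts sections of this report.
n_variables = 9
n_observations = 63
missing_per_column = {"과세건수": 16, "과세금액": 16, "비과세건수": 24, "비과세금액": 24}

# Missing cells are the sum of the per-column missing counts.
missing_cells = sum(missing_per_column.values())
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Total memory is the average record size times the number of rows.
avg_record_bytes = 80.1
total_kib = avg_record_bytes * n_observations / 1024

print(missing_cells, total_cells, round(missing_pct, 1), round(total_kib, 1))
# prints: 80 567 14.1 4.9
```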

Variable types

Categorical: 5
Numeric: 4

Dataset

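Description: Local tax assessment data for 서천군 (Seocheon-gun) from 2017 to 2022, broken down by tax item: number of taxed cases (과세건수), taxed amount (과세금액), number of non-taxed cases (비과세건수), and non-taxed amount (비과세금액).
Author: 충청남도 (Chungcheongnam-do)
URL: https://alldam.chungnam.go.kr/index.chungnam?menuCd=DOM_000000201001001001&st=&cds=&orgCd=&apiType=&isOpen=Y&pageIndex=347&beforeMenuCd=DOM_000000201001001000&publicdatapk=15080474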

Alerts

자치단체코드 has constant value "44770" (Constant)
과세년도 is highly overall correlated with 시도명 and 1 other field (High correlation)
시도명 is highly overall correlated with 시군구명 and 2 other fields (High correlation)
시군구명 is highly overall correlated with 시도명 and 2 other fields (High correlation)
과세건수 is highly overall correlated with 세목명 (High correlation)
과세금액 is highly overall correlated with 비과세금액 (High correlation)
비과세건수 is highly overall correlated with 비과세금액 and 1 other field (High correlation)
비과세금액 is highly overall correlated with 과세금액 and 1 other field (High correlation)
세목명 is highly overall correlated with 과세건수 and 3 other fields (High correlation)
과세건수 has 16 (25.4%) missing values (Missing)
과세금액 has 16 (25.4%) missing values (Missing)
비과세건수 has 24 (38.1%) missing values (Missing)
비과세금액 has 24 (38.1%) missing values (Missing)

Reproduction

Analysis started: 2024-01-09 22:47:34.023868
Analysis finished: 2024-01-09 22:47:36.103145
Duration: 2.08 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
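
A report of this kind could be regenerated along the following lines. This is a minimal sketch, not the configuration actually used: the file paths are hypothetical, the report title is made up, and reading 자치단체코드 and 과세년도 as strings is an assumption inferred from the fact that this report classifies them as Categorical.

```python
def build_report(csv_path: str, html_path: str) -> None:
    """Profile a CSV file and write an HTML report (paths are hypothetical)."""
    # Imported lazily so that defining this function does not require
    # pandas or ydata-profiling to be installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: code-like columns are read as text so they are profiled
    # as categorical rather than numeric.
    df = pd.read_csv(csv_path, dtype={"자치단체코드": str, "과세년도": str})

    profile = ProfileReport(df, title="Local tax assessment profile")
    profile.to_file(html_path)
```

Calling `build_report("local_tax.csv", "report.html")` with a real input file would write the HTML report next to it.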

Variables

시도명
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Memory size: 636.0 B

Value | Count
충청남도 | 51
충청남도 | 12

Length

Max length: 6
Median length: 4
Mean length: 4.3809524
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 충청남도
2nd row: 충청남도
3rd row: 충청남도
4th row: 충청남도
5th row: 충청남도

Common Values

Value | Count | Frequency (%)
충청남도 | 51 | 81.0%
충청남도 | 12 | 19.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
충청남도 | 63 | 100.0%
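
The two visually identical values (51 and 12 rows) together with the length statistics suggest that 12 rows hold a six-character variant of the same name. The sketch below reproduces the reported mean length and shows how stripping would merge the variants into the single 63-row value shown in the plot table. That the two extra characters are surrounding whitespace is an assumption; the report does not say what they are.

```python
from collections import Counter

clean = "충청남도"          # 4 characters, 51 rows
padded = clean + "  "       # hypothetical padding; the real extra characters are unknown
values = [clean] * 51 + [padded] * 12

# Mean of the string lengths: (51 * 4 + 12 * 6) / 63
mean_length = sum(len(v) for v in values) / len(values)

raw_counts = Counter(values)                         # two distinct raw values
stripped_counts = Counter(v.strip() for v in values)  # one value after stripping

print(round(mean_length, 7))                  # prints: 4.3809524
print(len(raw_counts), len(stripped_counts))  # prints: 2 1
```

If the assumption holds, normalizing the column before profiling would also remove the near-perfect correlation between 시도명, 시군구명, and 과세년도 that the alerts report.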

시군구명
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Memory size: 636.0 B

Value | Count
서천군 | 51
서천군 | 12

Length

Max length: 5
Median length: 3
Mean length: 3.3809524
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서천군
2nd row: 서천군
3rd row: 서천군
4th row: 서천군
5th row: 서천군

Common Values

Value | Count | Frequency (%)
서천군 | 51 | 81.0%
서천군 | 12 | 19.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
서천군 | 63 | 100.0%

자치단체코드
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 636.0 B

Value | Count
44770 | 63

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 44770
2nd row: 44770
3rd row: 44770
4th row: 44770
5th row: 44770

Common Values

Value | Count | Frequency (%)
44770 | 63 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
44770 | 63 | 100.0%

과세년도
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 7.9%
Missing: 0
Missing (%): 0.0%
Memory size: 636.0 B

Value | Count
2018 | 13
2017 | 13
2021 | 13
2019 | 12
2020 | 12

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2018
2nd row: 2018
3rd row: 2018
4th row: 2018
5th row: 2018

Common Values

Value | Count | Frequency (%)
2018 | 13 | 20.6%
2017 | 13 | 20.6%
2021 | 13 | 20.6%
2019 | 12 | 19.0%
2020 | 12 | 19.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2018 | 13 | 20.6%
2017 | 13 | 20.6%
2021 | 13 | 20.6%
2019 | 12 | 19.0%
2020 | 12 | 19.0%

세목명
Categorical

HIGH CORRELATION 

Distinct: 25
Distinct (%): 39.7%
Missing: 0
Missing (%): 0.0%
Memory size: 636.0 B

Value | Count
취득세 | 4
지방소비세 | 4
등록세 | 4
지방소득세 | 4
지역자원시설세 | 4
Other values (20) | 43

Length

Max length: 9
Median length: 7
Mean length: 4.5396825
Min length: 3

Unique

Unique: 12
Unique (%): 19.0%

Sample

1st row: 취득세
2nd row: 등록세
3rd row: 주민세
4th row: 재산세
5th row: 자동차세

Common Values

Value | Count | Frequency (%)
취득세 | 4 | 6.3%
지방소비세 | 4 | 6.3%
등록세 | 4 | 6.3%
지방소득세 | 4 | 6.3%
지역자원시설세 | 4 | 6.3%
등록면허세 | 4 | 6.3%
교육세 | 4 | 6.3%
담배소비세 | 4 | 6.3%
자동차세 | 4 | 6.3%
재산세 | 4 | 6.3%
Other values (15) | 23 | 36.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
취득세 | 5 | 7.9%
지방소비세 | 5 | 7.9%
지방소득세 | 5 | 7.9%
지역자원시설세 | 5 | 7.9%
등록면허세 | 5 | 7.9%
교육세 | 5 | 7.9%
담배소비세 | 5 | 7.9%
자동차세 | 5 | 7.9%
재산세 | 5 | 7.9%
주민세 | 5 | 7.9%
Other values (3) | 13 | 20.6%

과세건수
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 47
Distinct (%): 100.0%
Missing: 16
Missing (%): 25.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 39596.319
Minimum: 6
Maximum: 149650
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 699.0 B

Quantile statistics

Minimum: 6
5-th percentile: 82.8
Q1: 11946.5
Median: 27236
Q3: 41090.5
95-th percentile: 145214.6
Maximum: 149650
Range: 149644
Interquartile range (IQR): 29144

Descriptive statistics

Standard deviation: 45058.165
Coefficient of variation (CV): 1.1379382
Kurtosis: 1.0928006
Mean: 39596.319
Median Absolute Deviation (MAD): 15128
Skewness: 1.500553
Sum: 1861027
Variance: 2.0302382 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=47)

Common Values

Value | Count | Frequency (%)
27527 | 1 | 1.6%
145355 | 1 | 1.6%
12108 | 1 | 1.6%
27697 | 1 | 1.6%
91824 | 1 | 1.6%
41302 | 1 | 1.6%
278 | 1 | 1.6%
6 | 1 | 1.6%
30521 | 1 | 1.6%
13747 | 1 | 1.6%
Other values (37) | 37 | 58.7%
(Missing) | 16 | 25.4%

Minimum 10 values

Value | Count | Frequency (%)
6 | 1 | 1.6%
7 | 1 | 1.6%
81 | 1 | 1.6%
87 | 1 | 1.6%
107 | 1 | 1.6%
278 | 1 | 1.6%
476 | 1 | 1.6%
10684 | 1 | 1.6%
10833 | 1 | 1.6%
10911 | 1 | 1.6%

Maximum 10 values

Value | Count | Frequency (%)
149650 | 1 | 1.6%
148379 | 1 | 1.6%
145355 | 1 | 1.6%
144887 | 1 | 1.6%
144679 | 1 | 1.6%
91824 | 1 | 1.6%
91586 | 1 | 1.6%
90175 | 1 | 1.6%
90072 | 1 | 1.6%
88458 | 1 | 1.6%
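
Several of the 과세건수 statistics are derived from one another, which makes them easy to sanity-check. The sketch below recomputes the range, IQR, mean, coefficient of variation, and variance from the other figures printed in this section; small differences in the last digits are expected because the inputs are already rounded.

```python
# Figures copied from the 과세건수 section of this report.
q1, q3 = 11946.5, 41090.5
minimum, maximum = 6, 149650
total = 1861027
n_non_missing = 63 - 16      # 63 rows, 16 missing
std = 45058.165

value_range = maximum - minimum   # reported: 149644
iqr = q3 - q1                     # reported: 29144
mean = total / n_non_missing      # reported: 39596.319
cv = std / mean                   # reported: 1.1379382
variance = std ** 2               # reported: 2.0302382 × 10^9

print(value_range, iqr, round(mean, 3), round(cv, 5), f"{variance:.4e}")
```

Note that the mean divides by the 47 non-missing values, not by all 63 rows.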

과세금액
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 47
Distinct (%): 100.0%
Missing: 16
Missing (%): 25.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.3077051 × 10^9
Minimum: 8.14323 × 10^8
Maximum: 3.7975004 × 10^10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 699.0 B

Quantile statistics

Minimum: 8.14323 × 10^8
5-th percentile: 9.891975 × 10^8
Q1: 1.921467 × 10^9
Median: 5.04442 × 10^9
Q3: 7.6511575 × 10^9
95-th percentile: 1.4966159 × 10^10
Maximum: 3.7975004 × 10^10
Range: 3.7160681 × 10^10
Interquartile range (IQR): 5.7296905 × 10^9

Descriptive statistics

Standard deviation: 6.3322247 × 10^9
Coefficient of variation (CV): 1.0038872
Kurtosis: 13.315973
Mean: 6.3077051 × 10^9
Median Absolute Deviation (MAD): 3.044814 × 10^9
Skewness: 3.0964691
Sum: 2.9646214 × 10^11
Variance: 4.009707 × 10^19
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=47)

Common Values

Value | Count | Frequency (%)
1758263000 | 1 | 1.6%
4775902000 | 1 | 1.6%
20889475000 | 1 | 1.6%
2009116000 | 1 | 1.6%
5926641000 | 1 | 1.6%
7464934000 | 1 | 1.6%
4605933000 | 1 | 1.6%
7557100000 | 1 | 1.6%
1584474000 | 1 | 1.6%
972504000 | 1 | 1.6%
Other values (37) | 37 | 58.7%
(Missing) | 16 | 25.4%

Minimum 10 values

Value | Count | Frequency (%)
814323000 | 1 | 1.6%
869240000 | 1 | 1.6%
972504000 | 1 | 1.6%
1028149000 | 1 | 1.6%
1077374000 | 1 | 1.6%
1175249000 | 1 | 1.6%
1228950000 | 1 | 1.6%
1311099000 | 1 | 1.6%
1584474000 | 1 | 1.6%
1615267000 | 1 | 1.6%

Maximum 10 values

Value | Count | Frequency (%)
37975004000 | 1 | 1.6%
20889475000 | 1 | 1.6%
15665717000 | 1 | 1.6%
13333858000 | 1 | 1.6%
12887476000 | 1 | 1.6%
10784848000 | 1 | 1.6%
9240270000 | 1 | 1.6%
8665054000 | 1 | 1.6%
8501343000 | 1 | 1.6%
8381626000 | 1 | 1.6%

비과세건수
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 39
Distinct (%): 100.0%
Missing: 24
Missing (%): 38.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 6648.4103
Minimum: 2
Maximum: 29527
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 699.0 B

Quantile statistics

Minimum: 2
5-th percentile: 5.7
Q1: 1888
Median: 3352
Q3: 7561.5
95-th percentile: 28393.3
Maximum: 29527
Range: 29525
Interquartile range (IQR): 5673.5

Descriptive statistics

Standard deviation: 8829.0161
Coefficient of variation (CV): 1.3279891
Kurtosis: 2.462706
Mean: 6648.4103
Median Absolute Deviation (MAD): 3254
Skewness: 1.9115937
Sum: 259288
Variance: 77951526
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=39)

Common Values

Value | Count | Frequency (%)
6 | 1 | 1.6%
92 | 1 | 1.6%
2007 | 1 | 1.6%
2 | 1 | 1.6%
6593 | 1 | 1.6%
29527 | 1 | 1.6%
7844 | 1 | 1.6%
4062 | 1 | 1.6%
2890 | 1 | 1.6%
93 | 1 | 1.6%
Other values (29) | 29 | 46.0%
(Missing) | 24 | 38.1%

Minimum 10 values

Value | Count | Frequency (%)
2 | 1 | 1.6%
3 | 1 | 1.6%
6 | 1 | 1.6%
7 | 1 | 1.6%
92 | 1 | 1.6%
93 | 1 | 1.6%
98 | 1 | 1.6%
123 | 1 | 1.6%
350 | 1 | 1.6%
1860 | 1 | 1.6%

Maximum 10 values

Value | Count | Frequency (%)
29527 | 1 | 1.6%
28567 | 1 | 1.6%
28374 | 1 | 1.6%
28280 | 1 | 1.6%
26639 | 1 | 1.6%
8000 | 1 | 1.6%
7872 | 1 | 1.6%
7844 | 1 | 1.6%
7766 | 1 | 1.6%
7621 | 1 | 1.6%
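
The percentages in this section use two different denominators, which can be confusing at first glance. Judging from the printed figures, Distinct (%) appears to be taken over the non-missing values, while the frequency columns are taken over all 63 rows. The sketch below checks both readings against the numbers reported for 비과세건수.

```python
# Figures copied from the 비과세건수 section of this report.
n_rows = 63
missing = 24
distinct = 39
non_missing = n_rows - missing

# Distinct (%) matches only when divided by the non-missing count.
distinct_pct = 100 * distinct / non_missing      # reported: 100.0%

# Frequency columns are relative to all rows, missing ones included.
single_value_pct = 100 * 1 / n_rows              # reported: 1.6%
other_values_pct = 100 * 29 / n_rows             # reported: 46.0%
missing_pct = 100 * missing / n_rows             # reported: 38.1%

# The listed rows account for every observation: 10 shown + 29 other + 24 missing.
accounted_rows = 10 + 29 + missing

print(distinct_pct, round(single_value_pct, 1), round(other_values_pct, 1),
      round(missing_pct, 1), accounted_rows)
# prints: 100.0 1.6 46.0 38.1 63
```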

비과세금액
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 39
Distinct (%): 100.0%
Missing: 24
Missing (%): 38.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.1605154 × 10^9
Minimum: 7000
Maximum: 5.931871 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 699.0 B

Quantile statistics

Minimum: 7000
5-th percentile: 8900
Q1: 8213500
Median: 1.73109 × 10^8
Q3: 1.5196815 × 10^9
95-th percentile: 4.7109581 × 10^9
Maximum: 5.931871 × 10^9
Range: 5.931864 × 10^9
Interquartile range (IQR): 1.511468 × 10^9

Descriptive statistics

Standard deviation: 1.8833343 × 10^9
Coefficient of variation (CV): 1.622843
Kurtosis: 0.057974948
Mean: 1.1605154 × 10^9
Median Absolute Deviation (MAD): 1.69842 × 10^8
Skewness: 1.3293191
Sum: 4.5260101 × 10^10
Variance: 3.5469481 × 10^18
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=39)

Common Values

Value | Count | Frequency (%)
1947000 | 1 | 1.6%
7000 | 1 | 1.6%
4852304000 | 1 | 1.6%
553000 | 1 | 1.6%
3267000 | 1 | 1.6%
4492685000 | 1 | 1.6%
259460000 | 1 | 1.6%
101123000 | 1 | 1.6%
186701000 | 1 | 1.6%
9000 | 1 | 1.6%
Other values (29) | 29 | 46.0%
(Missing) | 24 | 38.1%

Minimum 10 values

Value | Count | Frequency (%)
7000 | 1 | 1.6%
8000 | 1 | 1.6%
9000 | 1 | 1.6%
11000 | 1 | 1.6%
31000 | 1 | 1.6%
553000 | 1 | 1.6%
1161000 | 1 | 1.6%
1947000 | 1 | 1.6%
2163000 | 1 | 1.6%
3267000 | 1 | 1.6%

Maximum 10 values

Value | Count | Frequency (%)
5931871000 | 1 | 1.6%
4852304000 | 1 | 1.6%
4695253000 | 1 | 1.6%
4492685000 | 1 | 1.6%
4319655000 | 1 | 1.6%
4173076000 | 1 | 1.6%
4005693000 | 1 | 1.6%
3823535000 | 1 | 1.6%
3388856000 | 1 | 1.6%
2744921000 | 1 | 1.6%

Interactions

[16 interaction plots; images not preserved in this text export]

Correlations

 | 시도명 | 시군구명 | 과세년도 | 세목명 | 과세건수 | 과세금액 | 비과세건수 | 비과세금액
시도명 | 1.000 | 0.997 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000
시군구명 | 0.997 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000
과세년도 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
세목명 | 1.000 | 1.000 | 0.000 | 1.000 | 0.927 | 0.767 | 0.980 | 0.222
과세건수 | 0.000 | 0.000 | 0.000 | 0.927 | 1.000 | 0.684 | 0.882 | 0.576
과세금액 | 0.000 | 0.000 | 0.000 | 0.767 | 0.684 | 1.000 | 0.670 | 0.860
비과세건수 | 0.000 | 0.000 | 0.000 | 0.980 | 0.882 | 0.670 | 1.000 | 0.585
비과세금액 | 0.000 | 0.000 | 0.000 | 0.222 | 0.576 | 0.860 | 0.585 | 1.000

 | 과세년도 | 시도명 | 세목명 | 시군구명
과세년도 | 1.000 | 0.975 | 0.000 | 0.975
시도명 | 0.975 | 1.000 | 0.789 | 0.948
세목명 | 0.000 | 0.789 | 1.000 | 0.789
시군구명 | 0.975 | 0.948 | 0.789 | 1.000

 | 과세건수 | 과세금액 | 비과세건수 | 비과세금액 | 시도명 | 시군구명 | 과세년도 | 세목명
과세건수 | 1.000 | -0.050 | 0.225 | -0.306 | 0.000 | 0.000 | 0.000 | 0.624
과세금액 | -0.050 | 1.000 | -0.007 | 0.549 | 0.000 | 0.000 | 0.000 | 0.405
비과세건수 | 0.225 | -0.007 | 1.000 | 0.524 | 0.000 | 0.000 | 0.000 | 0.791
비과세금액 | -0.306 | 0.549 | 0.524 | 1.000 | 0.000 | 0.000 | 0.000 | 0.088
시도명 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.948 | 0.975 | 0.789
시군구명 | 0.000 | 0.000 | 0.000 | 0.000 | 0.948 | 1.000 | 0.975 | 0.789
과세년도 | 0.000 | 0.000 | 0.000 | 0.000 | 0.975 | 0.975 | 1.000 | 0.000
세목명 | 0.624 | 0.405 | 0.791 | 0.088 | 0.789 | 0.789 | 0.000 | 1.000
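
The "High correlation" alerts near the top of the report can be reproduced from the last matrix above. The sketch below lists, for each variable, the other variables whose absolute coefficient is at least 0.5. The 0.5 cutoff is an assumption; it is not stated in this report, but it yields exactly the partner counts given in the alerts. The helper name `correlated_fields` is illustrative.

```python
cols = ["과세건수", "과세금액", "비과세건수", "비과세금액",
        "시도명", "시군구명", "과세년도", "세목명"]

# Rows copied from the last correlation matrix in this section.
matrix = [
    [1.000, -0.050, 0.225, -0.306, 0.000, 0.000, 0.000, 0.624],
    [-0.050, 1.000, -0.007, 0.549, 0.000, 0.000, 0.000, 0.405],
    [0.225, -0.007, 1.000, 0.524, 0.000, 0.000, 0.000, 0.791],
    [-0.306, 0.549, 0.524, 1.000, 0.000, 0.000, 0.000, 0.088],
    [0.000, 0.000, 0.000, 0.000, 1.000, 0.948, 0.975, 0.789],
    [0.000, 0.000, 0.000, 0.000, 0.948, 1.000, 0.975, 0.789],
    [0.000, 0.000, 0.000, 0.000, 0.975, 0.975, 1.000, 0.000],
    [0.624, 0.405, 0.791, 0.088, 0.789, 0.789, 0.000, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, chosen because it matches the alerts


def correlated_fields(name: str) -> list[str]:
    """Return the other columns whose |coefficient| with `name` meets the cutoff."""
    i = cols.index(name)
    return [cols[j] for j, v in enumerate(matrix[i])
            if j != i and abs(v) >= THRESHOLD]


for name in cols:
    print(name, "->", correlated_fields(name))
```

For example, 세목명 comes out with four partners (과세건수, 비과세건수, 시도명, 시군구명), which agrees with the alert "과세건수 and 3 other fields".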

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
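
Nullity correlation can be illustrated as the Pearson correlation between per-column "is missing" indicators. The sketch below uses only the missingness pattern of the first ten rows listed in the Sample section (indices 0 to 9), so the coefficients are illustrative and are not the values shown in the report's heatmap. The `pearson` helper is written out by hand to keep the example dependency-free.

```python
import math

# 1 = value missing, 0 = value present, for sample rows 0-9.
missing = {
    "과세건수":   [0, 1, 0, 0, 0, 1, 0, 1, 0, 1],
    "과세금액":   [0, 1, 0, 0, 0, 1, 0, 1, 0, 1],
    "비과세건수": [0, 0, 0, 0, 0, 1, 1, 1, 0, 1],
    "비과세금액": [0, 0, 0, 0, 0, 1, 1, 1, 0, 1],
}


def pearson(x: list[int], y: list[int]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Columns that are always missing together correlate perfectly.
same_pair = pearson(missing["과세건수"], missing["과세금액"])
# Columns that are often, but not always, missing together correlate partially.
cross_pair = pearson(missing["과세건수"], missing["비과세건수"])

print(round(same_pair, 3), round(cross_pair, 3))
# prints: 1.0 0.583
```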

Sample

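First rows

index | 시도명 | 시군구명 | 자치단체코드 | 과세년도 | 세목명 | 과세건수 | 과세금액 | 비과세건수 | 비과세금액
0 | 충청남도 | 서천군 | 44770 | 2018 | 취득세 | 10833 | 12887476000 | 1860 | 3388856000
1 | 충청남도 | 서천군 | 44770 | 2018 | 등록세 | <NA> | <NA> | 6 | 1947000
2 | 충청남도 | 서천군 | 44770 | 2018 | 주민세 | 27527 | 1758263000 | 7502 | 18351000
3 | 충청남도 | 서천군 | 44770 | 2018 | 재산세 | 90072 | 5248223000 | 28280 | 4005693000
4 | 충청남도 | 서천군 | 44770 | 2018 | 자동차세 | 40914 | 10784848000 | 7621 | 282947000
5 | 충청남도 | 서천군 | 44770 | 2018 | 레저세 | <NA> | <NA> | <NA> | <NA>
6 | 충청남도 | 서천군 | 44770 | 2018 | 담배소비세 | 87 | 3905661000 | <NA> | <NA>
7 | 충청남도 | 서천군 | 44770 | 2018 | 지방소비세 | <NA> | <NA> | <NA> | <NA>
8 | 충청남도 | 서천군 | 44770 | 2018 | 등록면허세 | 27636 | 1028149000 | 3694 | 77966000
9 | 충청남도 | 서천군 | 44770 | 2018 | 도시계획세 | <NA> | <NA> | <NA> | <NA>

Last rows

index | 시도명 | 시군구명 | 자치단체코드 | 과세년도 | 세목명 | 과세건수 | 과세금액 | 비과세건수 | 비과세금액
53 | 충청남도 | 서천군 | 44770 | 2021 | 재산세 | 91586 | 6777592000 | 28374 | 4695253000
54 | 충청남도 | 서천군 | 44770 | 2021 | 자동차세 | 41131 | 8665054000 | 7872 | 252879000
55 | 충청남도 | 서천군 | 44770 | 2021 | 레저세 | <NA> | <NA> | <NA> | <NA>
56 | 충청남도 | 서천군 | 44770 | 2021 | 담배소비세 | 476 | 4321448000 | <NA> | <NA>
57 | 충청남도 | 서천군 | 44770 | 2021 | 지방소비세 | 7 | 7625455000 | <NA> | <NA>
58 | 충청남도 | 서천군 | 44770 | 2021 | 등록면허세 | 30112 | 1311099000 | 4150 | 83794000
59 | 충청남도 | 서천군 | 44770 | 2021 | 도시계획세 | <NA> | <NA> | <NA> | <NA>
60 | 충청남도 | 서천군 | 44770 | 2021 | 지역자원시설세 | 14068 | 2123743000 | 2980 | 192712000
61 | 충청남도 | 서천군 | 44770 | 2021 | 지방소득세 | 15253 | 8265661000 | <NA> | <NA>
62 | 충청남도 | 서천군 | 44770 | 2021 | 교육세 | 148379 | 6118785000 | 123 | 11000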