Overview

Dataset statistics

Number of variables: 10
Number of observations: 331
Missing cells: 48
Missing cells (%): 1.5%
Duplicate rows: 1
Duplicate rows (%): 0.3%
Total size in memory: 27.6 KiB
Average record size in memory: 85.4 B
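
The percentage and per-record figures above follow from the raw counts. The short sketch below recomputes them as a sanity check; the variable names are illustrative, and it assumes that the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB = 1024 bytes.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 10
n_observations = 331
missing_cells = 48
duplicate_rows = 1
total_size_kib = 27.6

# Assumption: percentages are relative to all cells / all rows.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

The 48 missing cells also equal 12 rows × 4 columns, which matches the four variables that each report 12 missing values.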

Variable types

Categorical: 6
Boolean: 1
Numeric: 3

Dataset

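Description: Yearly local tax payment status, extracted from local tax statistics by tax item that are based on local tax levy and collection records for the most recent three years.
Author: 충청북도 진천군 (Jincheon-gun, Chungcheongbuk-do)
URL: https://www.data.go.kr/data/15079487/fileData.do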

Alerts

Dataset has 1 (0.3%) duplicate rows (Duplicates)
납부매체전자고지여부 is highly overall correlated with 시도명 and 3 other fields (High correlation)
납부매체 is highly overall correlated with 시도명 and 3 other fields (High correlation)
시도명 is highly overall correlated with 납부건수 and 8 other fields (High correlation)
자치단체코드 is highly overall correlated with 납부건수 and 8 other fields (High correlation)
납부년도 is highly overall correlated with 시도명 and 2 other fields (High correlation)
세목명 is highly overall correlated with 시도명 and 2 other fields (High correlation)
시군구명 is highly overall correlated with 납부건수 and 8 other fields (High correlation)
납부건수 is highly overall correlated with 납부금액 and 4 other fields (High correlation)
납부금액 is highly overall correlated with 납부건수 and 4 other fields (High correlation)
납부매체비율 is highly overall correlated with 납부건수 and 4 other fields (High correlation)
시도명 is highly imbalanced (77.5%) (Imbalance)
시군구명 is highly imbalanced (77.5%) (Imbalance)
자치단체코드 is highly imbalanced (77.5%) (Imbalance)
납부매체전자고지여부 has 12 (3.6%) missing values (Missing)
납부건수 has 12 (3.6%) missing values (Missing)
납부금액 has 12 (3.6%) missing values (Missing)
납부매체비율 has 12 (3.6%) missing values (Missing)
납부매체비율 has 15 (4.5%) zeros (Zeros)
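
The 77.5% imbalance figure can be reproduced from the class counts 319 and 12. The sketch below uses one formula that yields this value: one minus the Shannon entropy of the class distribution divided by the maximum possible entropy. That ydata-profiling computes its score in exactly this way is an assumption; the function name is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    A column with fewer than two classes returns 0.0 by convention in this sketch.
    """
    if len(counts) < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(counts))


# 시도명, 시군구명 and 자치단체코드 each have 319 real values and 12 "<NA>" entries.
score = imbalance_score([319, 12])
print(f"{100 * score:.1f}%")
```

Because the 12 "<NA>" entries are counted as a second category, these three columns are flagged as imbalanced rather than as constant.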

Reproduction

Analysis started: 2024-03-30 07:33:38.763486
Analysis finished: 2024-03-30 07:33:45.270324
Duration: 6.51 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
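
The reported duration is the difference between the two timestamps. A minimal check using only the Python standard library:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2024-03-30 07:33:38.763486")
finished = datetime.fromisoformat("2024-03-30 07:33:45.270324")

duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```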

Variables

시도명
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

충청북도: 319
<NA>: 12

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 충청북도
2nd row: 충청북도
3rd row: 충청북도
4th row: 충청북도
5th row: 충청북도

Common Values

Value | Count | Frequency (%)
충청북도 | 319 | 96.4%
<NA> | 12 | 3.6%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

시군구명
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

진천군: 319
<NA>: 12

Length

Max length: 4
Median length: 3
Mean length: 3.0362538
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 진천군
2nd row: 진천군
3rd row: 진천군
4th row: 진천군
5th row: 진천군

Common Values

Value | Count | Frequency (%)
진천군 | 319 | 96.4%
<NA> | 12 | 3.6%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

자치단체코드
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

43750: 319
<NA>: 12

Length

Max length: 5
Median length: 5
Mean length: 4.9637462
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 43750
2nd row: 43750
3rd row: 43750
4th row: 43750
5th row: 43750

Common Values

Value | Count | Frequency (%)
43750 | 319 | 96.4%
<NA> | 12 | 3.6%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

납부년도
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

2020: 83
2022: 82
2021: 79
2019: 75
<NA>: 12

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2019
2nd row: 2019
3rd row: 2019
4th row: 2019
5th row: 2019

Common Values

Value | Count | Frequency (%)
2020 | 83 | 25.1%
2022 | 82 | 24.8%
2021 | 79 | 23.9%
2019 | 75 | 22.7%
<NA> | 12 | 3.6%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

세목명
Categorical

HIGH CORRELATION 

Distinct: 15
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

자동차세: 44
주민세: 44
재산세: 43
등록면허세: 40
지방소득세: 36
Other values (10): 124

Length

Max length: 7
Median length: 5
Mean length: 3.9697885
Min length: 3

Unique

Unique: 2
Unique (%): 0.6%

Sample

1st row: 등록면허세
2nd row: 자동차세
3rd row: 자동차세
4th row: 재산세
5th row: 재산세

Common Values

Value | Count | Frequency (%)
자동차세 | 44 | 13.3%
주민세 | 44 | 13.3%
재산세 | 43 | 13.0%
등록면허세 | 40 | 12.1%
지방소득세 | 36 | 10.9%
취득세 | 35 | 10.6%
등록세 | 23 | 6.9%
지역자원시설세 | 16 | 4.8%
면허세 | 12 | 3.6%
담배소비세 | 12 | 3.6%
Other values (5) | 26 | 7.9%

Length

[Figure: histogram of lengths of the category]

납부매체
Categorical

HIGH CORRELATION 

Distinct: 11
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

가상계좌: 41
은행창구: 39
ARS: 38
위택스: 36
기타: 33
Other values (6): 144

Length

Max length: 5
Median length: 4
Mean length: 3.9274924
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ARS
2nd row: ARS
3rd row: ARS
4th row: ARS
5th row: ARS

Common Values

Value | Count | Frequency (%)
가상계좌 | 41 | 12.4%
은행창구 | 39 | 11.8%
ARS | 38 | 11.5%
위택스 | 36 | 10.9%
기타 | 33 | 10.0%
인터넷지로 | 32 | 9.7%
지자체방문 | 32 | 9.7%
자동화기기 | 31 | 9.4%
페이사납부 | 21 | 6.3%
자동이체 | 16 | 4.8%

Length

[Figure: histogram of lengths of the category]

납부매체전자고지여부
Boolean

HIGH CORRELATION  MISSING 

Distinct: 2
Distinct (%): 0.6%
Missing: 12
Missing (%): 3.6%
Memory size: 794.0 B

Value | Count | Frequency (%)
False | 160 | 48.3%
True | 159 | 48.0%
(Missing) | 12 | 3.6%

납부건수
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 251
Distinct (%): 78.7%
Missing: 12
Missing (%): 3.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 3604.9091
Minimum: 1
Maximum: 51015
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 30
Median: 887
Q3: 3258
95-th percentile: 15993
Maximum: 51015
Range: 51014
Interquartile range (IQR): 3228

Descriptive statistics

Standard deviation: 7386.8769
Coefficient of variation (CV): 2.049116
Kurtosis: 15.849477
Mean: 3604.9091
Median Absolute Deviation (MAD): 883
Skewness: 3.720115
Sum: 1149966
Variance: 54565950
Monotonicity: Not monotonic
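
Several of the 납부건수 statistics are simple functions of the others. The sketch below recomputes the mean, coefficient of variation, IQR, range, and variance from the reported figures; it assumes the mean is taken over the 319 non-missing values, and the variable names are illustrative.

```python
# Figures copied from the 납부건수 statistics.
n_observations = 331
n_missing = 12
total_sum = 1149966
std_dev = 7386.8769
q1, q3 = 30, 3258
minimum, maximum = 1, 51015

# Assumption: missing values are excluded before averaging.
n_valid = n_observations - n_missing
mean = total_sum / n_valid

cv = std_dev / mean             # coefficient of variation
iqr = q3 - q1                   # interquartile range
value_range = maximum - minimum
variance = std_dev ** 2

print(f"mean={mean:.4f} cv={cv:.6f} iqr={iqr} range={value_range} variance={variance:.0f}")
```

A CV above 2 together with a median (887) far below the mean (about 3605) is consistent with the strong right skew reported for this variable.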
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
1 | 14 | 4.2%
3 | 12 | 3.6%
2 | 8 | 2.4%
6 | 7 | 2.1%
4 | 6 | 1.8%
9 | 5 | 1.5%
30 | 4 | 1.2%
5 | 4 | 1.2%
14 | 3 | 0.9%
12 | 3 | 0.9%
Other values (241) | 253 | 76.4%
(Missing) | 12 | 3.6%

Minimum 10 values

Value | Count | Frequency (%)
1 | 14 | 4.2%
2 | 8 | 2.4%
3 | 12 | 3.6%
4 | 6 | 1.8%
5 | 4 | 1.2%
6 | 7 | 2.1%
7 | 2 | 0.6%
8 | 1 | 0.3%
9 | 5 | 1.5%
10 | 1 | 0.3%

Maximum 10 values

Value | Count | Frequency (%)
51015 | 1 | 0.3%
47797 | 1 | 0.3%
42833 | 1 | 0.3%
38347 | 1 | 0.3%
37916 | 1 | 0.3%
34885 | 1 | 0.3%
32047 | 1 | 0.3%
29564 | 1 | 0.3%
28677 | 1 | 0.3%
26541 | 1 | 0.3%

납부금액
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 318
Distinct (%): 99.7%
Missing: 12
Missing (%): 3.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.3916572 × 10^9
Minimum: 2140
Maximum: 5.0644438 × 10^10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.0 KiB

Quantile statistics

Minimum: 2140
5-th percentile: 28842
Q1: 5805415
Median: 1.6336036 × 10^8
Q3: 2.290966 × 10^9
95-th percentile: 1.2282418 × 10^10
Maximum: 5.0644438 × 10^10
Range: 5.0644435 × 10^10
Interquartile range (IQR): 2.2851606 × 10^9

Descriptive statistics

Standard deviation: 5.1989288 × 10^9
Coefficient of variation (CV): 2.1737767
Kurtosis: 28.121686
Mean: 2.3916572 × 10^9
Median Absolute Deviation (MAD): 1.6330899 × 10^8
Skewness: 4.3381396
Sum: 7.6293865 × 10^11
Variance: 2.7028861 × 10^19
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
3090 | 2 | 0.6%
15883420350 | 1 | 0.3%
284560 | 1 | 0.3%
435547390 | 1 | 0.3%
38820330 | 1 | 0.3%
16307400 | 1 | 0.3%
457000 | 1 | 0.3%
7192350 | 1 | 0.3%
105919380 | 1 | 0.3%
267871920 | 1 | 0.3%
Other values (308) | 308 | 93.1%
(Missing) | 12 | 3.6%

Minimum 10 values

Value | Count | Frequency (%)
2140 | 1 | 0.3%
2960 | 1 | 0.3%
3090 | 2 | 0.6%
6150 | 1 | 0.3%
6180 | 1 | 0.3%
8000 | 1 | 0.3%
8400 | 1 | 0.3%
11330 | 1 | 0.3%
11580 | 1 | 0.3%
12600 | 1 | 0.3%

Maximum 10 values

Value | Count | Frequency (%)
50644437580 | 1 | 0.3%
33077498540 | 1 | 0.3%
21870794670 | 1 | 0.3%
21601482060 | 1 | 0.3%
19572798830 | 1 | 0.3%
18825732560 | 1 | 0.3%
18500591600 | 1 | 0.3%
17049466090 | 1 | 0.3%
16896579410 | 1 | 0.3%
15883420350 | 1 | 0.3%

납부매체비율
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 244
Distinct (%): 76.5%
Missing: 12
Missing (%): 3.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.539122
Minimum: 0
Maximum: 77.01
Zeros: 15
Zeros (%): 4.5%
Negative: 0
Negative (%): 0.0%
Memory size: 3.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.01
Q1: 0.215
Median: 9.83
Q3: 19.645
95-th percentile: 38.713
Maximum: 77.01
Range: 77.01
Interquartile range (IQR): 19.43

Descriptive statistics

Standard deviation: 13.778951
Coefficient of variation (CV): 1.0988769
Kurtosis: 2.0833522
Mean: 12.539122
Median Absolute Deviation (MAD): 9.62
Skewness: 1.3585008
Sum: 3999.98
Variance: 189.8595
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0.0 | 15 | 4.5%
0.01 | 10 | 3.0%
0.02 | 7 | 2.1%
0.03 | 6 | 1.8%
0.09 | 5 | 1.5%
0.04 | 4 | 1.2%
0.14 | 4 | 1.2%
0.13 | 4 | 1.2%
0.12 | 3 | 0.9%
0.05 | 3 | 0.9%
Other values (234) | 258 | 77.9%
(Missing) | 12 | 3.6%

Minimum 10 values

Value | Count | Frequency (%)
0.0 | 15 | 4.5%
0.01 | 10 | 3.0%
0.02 | 7 | 2.1%
0.03 | 6 | 1.8%
0.04 | 4 | 1.2%
0.05 | 3 | 0.9%
0.06 | 3 | 0.9%
0.07 | 2 | 0.6%
0.08 | 1 | 0.3%
0.09 | 5 | 1.5%

Maximum 10 values

Value | Count | Frequency (%)
77.01 | 1 | 0.3%
63.28 | 1 | 0.3%
61.76 | 1 | 0.3%
57.31 | 1 | 0.3%
55.68 | 1 | 0.3%
48.51 | 1 | 0.3%
47.88 | 1 | 0.3%
46.47 | 1 | 0.3%
46.04 | 1 | 0.3%
46.01 | 1 | 0.3%

Interactions

[Figures: 9 interaction plots; images not preserved]

Correlations

[Figure: correlation heatmap; image not preserved]

 | 납부년도 | 세목명 | 납부매체 | 납부매체전자고지여부 | 납부건수 | 납부금액 | 납부매체비율
납부년도 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
세목명 | 0.000 | 1.000 | 0.164 | 0.050 | 0.000 | 0.453 | 0.482
납부매체 | 0.000 | 0.164 | 1.000 | 0.996 | 0.572 | 0.289 | 0.506
납부매체전자고지여부 | 0.000 | 0.050 | 0.996 | 1.000 | 0.196 | 0.064 | 0.196
납부건수 | 0.000 | 0.000 | 0.572 | 0.196 | 1.000 | 0.460 | 0.593
납부금액 | 0.000 | 0.453 | 0.289 | 0.064 | 0.460 | 1.000 | 0.180
납부매체비율 | 0.000 | 0.482 | 0.506 | 0.196 | 0.593 | 0.180 | 1.000

[Figure: correlation heatmap; image not preserved]

 | 납부매체전자고지여부 | 납부매체 | 시도명 | 자치단체코드 | 납부년도 | 세목명 | 시군구명
납부매체전자고지여부 | 1.000 | 0.931 | 1.000 | 1.000 | 0.000 | 0.037 | 1.000
납부매체 | 0.931 | 1.000 | 1.000 | 1.000 | 0.000 | 0.065 | 1.000
시도명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
자치단체코드 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
납부년도 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000 | 0.000 | 1.000
세목명 | 0.037 | 0.065 | 1.000 | 1.000 | 0.000 | 1.000 | 1.000
시군구명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

[Figure: correlation heatmap; image not preserved]

 | 납부건수 | 납부금액 | 납부매체비율 | 시도명 | 시군구명 | 자치단체코드 | 납부년도 | 세목명 | 납부매체 | 납부매체전자고지여부
납부건수 | 1.000 | 0.809 | 0.818 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.207 | 0.148
납부금액 | 0.809 | 1.000 | 0.618 | 1.000 | 1.000 | 1.000 | 0.000 | 0.182 | 0.150 | 0.069
납부매체비율 | 0.818 | 0.618 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.216 | 0.176 | 0.148
시도명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
시군구명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
자치단체코드 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
납부년도 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000
세목명 | 0.000 | 0.182 | 0.216 | 1.000 | 1.000 | 1.000 | 0.000 | 1.000 | 0.065 | 0.037
납부매체 | 0.207 | 0.150 | 0.176 | 1.000 | 1.000 | 1.000 | 0.000 | 0.065 | 1.000 | 0.931
납부매체전자고지여부 | 0.148 | 0.069 | 0.148 | 1.000 | 1.000 | 1.000 | 0.000 | 0.037 | 0.931 | 1.000
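
The "High correlation" alerts appear to follow from the last (10 × 10) matrix: counting the coefficients in a row that reach a cutoff gives the number of fields named in the corresponding alert. The sketch below checks this for 납부건수. The 0.5 cutoff is an assumption (any cutoff between 0.216 and 0.618 gives the same result for this matrix), and the function name is illustrative.

```python
THRESHOLD = 0.5  # assumed cutoff, not stated in the report

# Row for 납부건수 from the last correlation matrix (self-correlation omitted).
row = {
    "납부금액": 0.809,
    "납부매체비율": 0.818,
    "시도명": 1.000,
    "시군구명": 1.000,
    "자치단체코드": 1.000,
    "납부년도": 0.000,
    "세목명": 0.000,
    "납부매체": 0.207,
    "납부매체전자고지여부": 0.148,
}


def highly_correlated(coefficients, threshold=THRESHOLD):
    """Return the field names whose coefficient is at or above the threshold."""
    return [name for name, value in coefficients.items() if value >= threshold]


fields = highly_correlated(row)
print(f"납부건수 is highly correlated with {fields[0]} and {len(fields) - 1} other fields")
```

The perfect 1.000 coefficients for 시도명, 시군구명, and 자치단체코드 most likely reflect that these columns only distinguish the 319 real rows from the 12 empty ones, so they carry no analytical signal.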

Missing values

[Figure: count] A simple visualization of nullity by column.
[Figure: matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure: heatmap] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

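(index) | 시도명 | 시군구명 | 자치단체코드 | 납부년도 | 세목명 | 납부매체 | 납부매체전자고지여부 | 납부건수 | 납부금액 | 납부매체비율
0 | 충청북도 | 진천군 | 43750 | 2019 | 등록면허세 | ARS | N | 66 | 748710 | 1.98
1 | 충청북도 | 진천군 | 43750 | 2019 | 자동차세 | ARS | N | 2062 | 377299380 | 61.76
2 | 충청북도 | 진천군 | 43750 | 2019 | 자동차세 | ARS | Y | 6 | 302690 | 0.18
3 | 충청북도 | 진천군 | 43750 | 2019 | 재산세 | ARS | N | 745 | 118491820 | 22.31
4 | 충청북도 | 진천군 | 43750 | 2019 | 재산세 | ARS | Y | 3 | 51370 | 0.09
5 | 충청북도 | 진천군 | 43750 | 2019 | 주민세 | ARS | N | 406 | 5987630 | 12.16
6 | 충청북도 | 진천군 | 43750 | 2019 | 주민세 | ARS | Y | 17 | 189970 | 0.51
7 | 충청북도 | 진천군 | 43750 | 2019 | 지방소득세 | ARS | N | 24 | 6314540 | 0.72
8 | 충청북도 | 진천군 | 43750 | 2019 | 취득세 | ARS | N | 10 | 29654700 | 0.3
9 | 충청북도 | 진천군 | 43750 | 2019 | 등록면허세 | 가상계좌 | Y | 10814 | 210779560 | 9.63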
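
(index) | 시도명 | 시군구명 | 자치단체코드 | 납부년도 | 세목명 | 납부매체 | 납부매체전자고지여부 | 납부건수 | 납부금액 | 납부매체비율
321 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
322 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
323 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
324 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
325 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
326 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
327 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
328 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
329 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
330 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>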
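
The first nine sample rows (index 0 to 8) cover every ARS record for 2019, and they suggest how 납부매체비율 is defined: each value matches that row's share of 납부건수 within its year and payment medium. The sketch below checks this. The counts are read from the fused digits of the sample rows, which is an interpretation rather than something the report states, and the definition itself is a hypothesis.

```python
# (세목명, 납부매체전자고지여부, 납부건수, reported 납부매체비율) for the ARS / 2019 rows.
# Assumption: counts were separated from the fused sample rows by hand.
ars_2019 = [
    ("등록면허세", "N", 66, 1.98),
    ("자동차세", "N", 2062, 61.76),
    ("자동차세", "Y", 6, 0.18),
    ("재산세", "N", 745, 22.31),
    ("재산세", "Y", 3, 0.09),
    ("주민세", "N", 406, 12.16),
    ("주민세", "Y", 17, 0.51),
    ("지방소득세", "N", 24, 0.72),
    ("취득세", "N", 10, 0.3),
]

total_count = sum(count for _, _, count, _ in ars_2019)

# Largest gap between the recomputed share and the reported ratio.
max_gap = max(
    abs(100 * count / total_count - reported)
    for _, _, count, reported in ars_2019
)

print(f"total 납부건수: {total_count}, largest gap: {max_gap:.3f} percentage points")
```

If the hypothesis holds, the ratios in every (year, medium) group sum to roughly 100, which fits the reported column sum of 3999.98 for 4 years × 10 payment media.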

Duplicate rows

Most frequently occurring

(index) | 시도명 | 시군구명 | 자치단체코드 | 납부년도 | 세목명 | 납부매체 | 납부매체전자고지여부 | 납부건수 | 납부금액 | 납부매체비율 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 12
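
The single duplicated row is the fully empty record, which occurs 12 times and also accounts for every missing value in the report. The sketch below shows how such rows could be counted and dropped before analysis. It uses stand-in data with the same shape as the report (319 distinct placeholder records plus 12 empty rows), not the real values.

```python
from collections import Counter

N_COLUMNS = 10

# Stand-in data: 319 distinct placeholder records plus 12 fully empty rows.
records = [tuple(f"r{i}c{j}" for j in range(N_COLUMNS)) for i in range(319)]
empty_rows = [(None,) * N_COLUMNS for _ in range(12)]
rows = records + empty_rows

# Distinct rows that occur more than once, with their occurrence counts.
occurrences = Counter(rows)
duplicated = {row: n for row, n in occurrences.items() if n > 1}

# Keep only rows that contain at least one non-missing value.
cleaned = [row for row in rows if any(value is not None for value in row)]

print(f"distinct duplicated rows: {len(duplicated)}")
print(f"rows after dropping empty records: {len(cleaned)}")
```

With the empty rows removed, the "<NA>" category would disappear from the six categorical columns, and the imbalance and perfect-correlation alerts for 시도명, 시군구명, and 자치단체코드 would likely be replaced by constant-value findings.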