Overview

Dataset statistics

Number of variables: 8
Number of observations: 33
Missing cells: 3
Missing cells (%): 1.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.4 KiB
Average record size in memory: 73.0 B
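
The percentages and sizes in this table can be cross-checked by hand. The following is a minimal sketch that recomputes them from the reported counts; the variable names are illustrative and not part of the report, and the memory figure is reconstructed from the rounded 73.0 B average, so it is approximate.

```python
# Figures copied from the Dataset statistics table.
n_variables = 8
n_observations = 33
missing_cells = 3
avg_record_bytes = 73.0  # rounded in the report, so totals are approximate

# A cell is one (row, column) position in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Total memory implied by the average record size, converted to KiB.
total_kib = avg_record_bytes * n_observations / 1024

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"total size: {total_kib:.1f} KiB")
```

The three missing cells are the two missing 미수액 values and the one missing 징수율 value listed under Alerts.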

Variable types

Categorical: 3
Numeric: 4
Text: 1

Dataset

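Description: Provides information on the local tax collection status of Hadong-gun, Gyeongsangnam-do (serial number, category, tax item, target amount, imposed amount, collected amount, uncollected amount, collection rate, etc.).
Author: Hadong-gun, Gyeongsangnam-do
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15085755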

Alerts

High correlation: 목표액 is highly overall correlated with 부과액 and 2 other fields
High correlation: 부과액 is highly overall correlated with 목표액 and 3 other fields
High correlation: 징수액 is highly overall correlated with 목표액 and 3 other fields
High correlation: 구분 is highly overall correlated with 부과액 and 2 other fields
High correlation: 세목별 is highly overall correlated with 목표액 and 3 other fields
Missing: 미수액 has 2 (6.1%) missing values
Missing: 징수율 has 1 (3.0%) missing values
Unique: 부과액 has unique values
Unique: 징수액 has unique values
Zeros: 미수액 has 1 (3.0%) zeros

Reproduction

Analysis started: 2023-12-11 00:05:35.315612
Analysis finished: 2023-12-11 00:05:37.187066
Duration: 1.87 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연도 (year)
Categorical

Distinct: 3
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Memory size: 396.0 B

2020: 11
2021: 11
2022: 11

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020
2nd row: 2020
3rd row: 2020
4th row: 2020
5th row: 2020

Common Values

Value | Count | Frequency (%)
2020 | 11 | 33.3%
2021 | 11 | 33.3%
2022 | 11 | 33.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values 2020, 2021 and 2022, each with 11 rows (33.3%)]

구분 (category)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 6.1%
Missing: 0
Missing (%): 0.0%
Memory size: 396.0 B

군세 (county tax): 18
도세 (provincial tax): 15

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 도세
2nd row: 도세
3rd row: 도세
4th row: 도세
5th row: 도세

Common Values

Value | Count | Frequency (%)
군세 | 18 | 54.5%
도세 | 15 | 45.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values 군세 (18 rows, 54.5%) and 도세 (15 rows, 45.5%)]

세목별 (tax item)
Categorical

HIGH CORRELATION

Distinct: 10
Distinct (%): 30.3%
Missing: 0
Missing (%): 0.0%
Memory size: 396.0 B

과년도수입: 6
취득세: 3
등록면허세: 3
지역자원시설세: 3
지방교육세: 3
Other values (5): 15

Length

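Max length: 7
Median length: 5
Mean length: 4.5454545
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 취득세
2nd row: 등록면허세
3rd row: 지역자원시설세
4th row: 지방교육세
5th row: 과년도수입

Common Values

Value | Count | Frequency (%)
과년도수입 (prior-year revenue) | 6 | 18.2%
취득세 (acquisition tax) | 3 | 9.1%
등록면허세 (registration and license tax) | 3 | 9.1%
지역자원시설세 (local resource and facility tax) | 3 | 9.1%
지방교육세 (local education tax) | 3 | 9.1%
주민세 (resident tax) | 3 | 9.1%
재산세 (property tax) | 3 | 9.1%
자동차세 (automobile tax) | 3 | 9.1%
담배소비세 (tobacco consumption tax) | 3 | 9.1%
지방소득세 (local income tax) | 3 | 9.1%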

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; 과년도수입 has 6 rows (18.2%) and each of the other nine tax items has 3 rows (9.1%)]

목표액 (target amount)
Real number (ℝ)

HIGH CORRELATION

Distinct: 31
Distinct (%): 93.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4154.3333
Minimum: 38
Maximum: 13187
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 429.0 B

Quantile statistics

Minimum: 38
5-th percentile: 111.4
Q1: 1200
Median: 3998
Q3: 5710
95-th percentile: 11868.6
Maximum: 13187
Range: 13149
Interquartile range (IQR): 4510

Descriptive statistics

Standard deviation: 3645.9415
Coefficient of variation (CV): 0.87762372
Kurtosis: 0.5754328
Mean: 4154.3333
Median Absolute Deviation (MAD): 2721
Skewness: 1.0477338
Sum: 137093
Variance: 13292889
Monotonicity: Not monotonic
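
Several of the 목표액 statistics are derived from the others by simple arithmetic. As a sanity check, the sketch below recomputes the range, IQR, mean, coefficient of variation and variance from the primary figures reported here; the variable names are illustrative, and small differences come from the report's rounding.

```python
# Primary figures copied from the 목표액 statistics in this report.
minimum, maximum = 38, 13187
q1, q3 = 1200, 5710
total, n_observations = 137093, 33
mean, std = 4154.3333, 3645.9415

value_range = maximum - minimum        # spread between the extremes
iqr = q3 - q1                          # spread of the middle 50%
mean_from_sum = total / n_observations # mean is sum divided by row count
cv = std / mean                        # relative dispersion
variance = std ** 2                    # variance is the squared std

print(value_range, iqr)
print(round(mean_from_sum, 4), round(cv, 6), round(variance))
```

A coefficient of variation close to 0.88 means the standard deviation is almost as large as the mean, which matches the wide gap between the 5-th and 95-th percentiles.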

[Figure: Histogram with fixed size bins (bins=31)]

Common Values

Value | Count | Frequency (%)
1200 | 2 | 6.1%
450 | 2 | 6.1%
1226 | 1 | 3.0%
4600 | 1 | 3.0%
2850 | 1 | 3.0%
5850 | 1 | 3.0%
5000 | 1 | 3.0%
58 | 1 | 3.0%
3969 | 1 | 3.0%
6223 | 1 | 3.0%
Other values (21) | 21 | 63.6%

Minimum 10 values

Value | Count | Frequency (%)
38 | 1 | 3.0%
58 | 1 | 3.0%
147 | 1 | 3.0%
350 | 1 | 3.0%
450 | 2 | 6.1%
1130 | 1 | 3.0%
1150 | 1 | 3.0%
1200 | 2 | 6.1%
1226 | 1 | 3.0%
1277 | 1 | 3.0%

Maximum 10 values

Value | Count | Frequency (%)
13187 | 1 | 3.0%
12714 | 1 | 3.0%
11305 | 1 | 3.0%
10028 | 1 | 3.0%
8697 | 1 | 3.0%
6223 | 1 | 3.0%
5850 | 1 | 3.0%
5750 | 1 | 3.0%
5710 | 1 | 3.0%
5000 | 1 | 3.0%

부과액 (imposed amount)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 33
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4358.6364
Minimum: -483
Maximum: 12507
Zeros: 0
Zeros (%): 0.0%
Negative: 1
Negative (%): 3.0%
Memory size: 429.0 B

Quantile statistics

Minimum: -483
5-th percentile: 783.8
Q1: 1226
Median: 4069
Q3: 6273
95-th percentile: 11892.8
Maximum: 12507
Range: 12990
Interquartile range (IQR): 5047

Descriptive statistics

Standard deviation: 3407.9232
Coefficient of variation (CV): 0.7818783
Kurtosis: 0.41313954
Mean: 4358.6364
Median Absolute Deviation (MAD): 2757
Skewness: 0.88521437
Sum: 143835
Variance: 11613940
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=33)]

Common Values

Value | Count | Frequency (%)
12507 | 1 | 3.0%
4184 | 1 | 3.0%
2940 | 1 | 3.0%
5233 | 1 | 3.0%
1071 | 1 | 3.0%
12359 | 1 | 3.0%
1113 | 1 | 3.0%
7449 | 1 | 3.0%
1312 | 1 | 3.0%
1140 | 1 | 3.0%
Other values (23) | 23 | 69.7%

Minimum 10 values

Value | Count | Frequency (%)
-483 | 1 | 3.0%
494 | 1 | 3.0%
977 | 1 | 3.0%
1071 | 1 | 3.0%
1101 | 1 | 3.0%
1113 | 1 | 3.0%
1140 | 1 | 3.0%
1223 | 1 | 3.0%
1226 | 1 | 3.0%
1247 | 1 | 3.0%

Maximum 10 values

Value | Count | Frequency (%)
12507 | 1 | 3.0%
12359 | 1 | 3.0%
11582 | 1 | 3.0%
7449 | 1 | 3.0%
7421 | 1 | 3.0%
7047 | 1 | 3.0%
6665 | 1 | 3.0%
6351 | 1 | 3.0%
6273 | 1 | 3.0%
5683 | 1 | 3.0%

징수액 (collected amount)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 33
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4066.4545
Minimum: -765
Maximum: 12276
Zeros: 0
Zeros (%): 0.0%
Negative: 2
Negative (%): 6.1%
Memory size: 429.0 B

Quantile statistics

Minimum: -765
5-th percentile: -200.2
Q1: 1133
Median: 3913
Q3: 5964
95-th percentile: 11262.4
Maximum: 12276
Range: 13041
Interquartile range (IQR): 4831

Descriptive statistics

Standard deviation: 3443.811
Coefficient of variation (CV): 0.84688293
Kurtosis: 0.10600485
Mean: 4066.4545
Median Absolute Deviation (MAD): 2707
Skewness: 0.71572564
Sum: 134193
Variance: 11859834
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=33)]

Common Values

Value | Count | Frequency (%)
11086 | 1 | 3.0%
4081 | 1 | 3.0%
2940 | 1 | 3.0%
4679 | 1 | 3.0%
376 | 1 | 3.0%
12276 | 1 | 3.0%
1107 | 1 | 3.0%
7426 | 1 | 3.0%
1050 | 1 | 3.0%
1133 | 1 | 3.0%
Other values (23) | 23 | 69.7%

Minimum 10 values

Value | Count | Frequency (%)
-765 | 1 | 3.0%
-694 | 1 | 3.0%
129 | 1 | 3.0%
144 | 1 | 3.0%
376 | 1 | 3.0%
970 | 1 | 3.0%
1050 | 1 | 3.0%
1107 | 1 | 3.0%
1133 | 1 | 3.0%
1200 | 1 | 3.0%

Maximum 10 values

Value | Count | Frequency (%)
12276 | 1 | 3.0%
11527 | 1 | 3.0%
11086 | 1 | 3.0%
7426 | 1 | 3.0%
7407 | 1 | 3.0%
7036 | 1 | 3.0%
6366 | 1 | 3.0%
6091 | 1 | 3.0%
5964 | 1 | 3.0%
5493 | 1 | 3.0%

미수액 (uncollected amount)
Real number (ℝ)

MISSING, ZEROS

Distinct: 29
Distinct (%): 93.5%
Missing: 2
Missing (%): 6.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 441.51613
Minimum: 0
Maximum: 6800
Zeros: 1
Zeros (%): 3.0%
Negative: 0
Negative (%): 0.0%
Memory size: 429.0 B

Quantile statistics

Minimum: 0
5-th percentile: 6
Q1: 24
Median: 129
Q3: 285
95-th percentile: 1103
Maximum: 6800
Range: 6800
Interquartile range (IQR): 261

Descriptive statistics

Standard deviation: 1216.081
Coefficient of variation (CV): 2.7543297
Kurtosis: 27.13677
Mean: 441.51613
Median Absolute Deviation (MAD): 113
Skewness: 5.0920393
Sum: 13687
Variance: 1478852.9
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=29)]

Common Values

Value | Count | Frequency (%)
129 | 2 | 6.1%
6 | 2 | 6.1%
256 | 1 | 3.0%
785 | 1 | 3.0%
294 | 1 | 3.0%
0 | 1 | 3.0%
220 | 1 | 3.0%
190 | 1 | 3.0%
18 | 1 | 3.0%
225 | 1 | 3.0%
Other values (19) | 19 | 57.6%
(Missing) | 2 | 6.1%

Minimum 10 values

Value | Count | Frequency (%)
0 | 1 | 3.0%
6 | 2 | 6.1%
10 | 1 | 3.0%
13 | 1 | 3.0%
16 | 1 | 3.0%
18 | 1 | 3.0%
23 | 1 | 3.0%
25 | 1 | 3.0%
55 | 1 | 3.0%
83 | 1 | 3.0%

Maximum 10 values

Value | Count | Frequency (%)
6800 | 1 | 3.0%
1421 | 1 | 3.0%
785 | 1 | 3.0%
589 | 1 | 3.0%
537 | 1 | 3.0%
525 | 1 | 3.0%
296 | 1 | 3.0%
294 | 1 | 3.0%
276 | 1 | 3.0%
256 | 1 | 3.0%
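
The skewness (about 5.09) and kurtosis (about 27.1) reported for 미수액 suggest that a handful of large values dominate the distribution. The sketch below applies Tukey's 1.5 × IQR rule to the largest reported values. The rule is a common convention chosen here for illustration, not something the report itself applies, and the variable names are hypothetical.

```python
# Quartiles and the ten largest 미수액 values, copied from this report.
q1, q3 = 24, 285
largest_values = [6800, 1421, 785, 589, 537, 525, 296, 294, 276, 256]

# Tukey's rule: anything above Q3 + 1.5 * IQR is treated as a high outlier.
iqr = q3 - q1
upper_fence = q3 + 1.5 * iqr
outliers = [value for value in largest_values if value > upper_fence]

print(f"upper fence: {upper_fence}")
print(f"values above the fence: {outliers}")
```

Under this rule three values exceed the fence, and the largest of them is more than fifty times the median of 129, so it is worth checking against the source data before using the column's mean.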

징수율 (collection rate)
Text

MISSING

Distinct: 30
Distinct (%): 93.8%
Missing: 1
Missing (%): 3.0%
Memory size: 396.0 B

Length

Max length: 8
Median length: 6
Mean length: 6.125
Min length: 5

Characters and Unicode

Total characters: 196
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 29
Unique (%): 90.6%

Sample

1st row: 88.64%
2nd row: 99.39%
3rd row: 99.84%
4th row: 94.94%
5th row: 97.88%
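
Because every value carries a trailing % sign, the profiler typed 징수율 as Text rather than as a number. A minimal sketch of converting such strings to floats follows; `parse_rate` is a hypothetical helper, and treating the literal `<NA>` as missing is an assumption based on how this report prints missing values.

```python
from typing import Optional


def parse_rate(text: Optional[str]) -> Optional[float]:
    """Convert a string such as '88.64%' to 88.64; return None if missing."""
    if text is None:
        return None
    cleaned = text.strip()
    # Assumption: empty strings and the literal '<NA>' both mean "missing".
    if cleaned in ("", "<NA>"):
        return None
    return float(cleaned.rstrip("%"))


# Values taken from the samples shown in this report, plus a missing marker.
samples = ["88.64%", "99.39%", "99.84%", "94.94%", "97.88%", "-154.86%", "<NA>"]
rates = [parse_rate(sample) for sample in samples]
print(rates)
```

Once converted, the column can be profiled as a numeric variable, which would also surface the negative rate as a value needing review.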

Value | Count | Frequency (%)
100.00 | 3 | 9.4%
6.64 | 1 | 3.1%
98.61 | 1 | 3.1%
95.07 | 1 | 3.1%
96.06 | 1 | 3.1%
96.66 | 1 | 3.1%
98.56 | 1 | 3.1%
80.03 | 1 | 3.1%
97.54 | 1 | 3.1%
99.69 | 1 | 3.1%
Other values (20) | 20 | 62.5%

Most occurring characters

Value | Count | Frequency (%)
9 | 35 | 17.9%
. | 32 | 16.3%
% | 32 | 16.3%
0 | 19 | 9.7%
6 | 15 | 7.7%
1 | 13 | 6.6%
8 | 13 | 6.6%
4 | 10 | 5.1%
5 | 10 | 5.1%
7 | 8 | 4.1%
Other values (3) | 9 | 4.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 131 | 66.8%
Other Punctuation | 64 | 32.7%
Dash Punctuation | 1 | 0.5%

Most frequent character per category

Decimal Number

Value | Count | Frequency (%)
9 | 35 | 26.7%
0 | 19 | 14.5%
6 | 15 | 11.5%
1 | 13 | 9.9%
8 | 13 | 9.9%
4 | 10 | 7.6%
5 | 10 | 7.6%
7 | 8 | 6.1%
3 | 7 | 5.3%
2 | 1 | 0.8%

Other Punctuation

Value | Count | Frequency (%)
. | 32 | 50.0%
% | 32 | 50.0%

Dash Punctuation

Value | Count | Frequency (%)
- | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 196 | 100.0%

Most frequent character per script

Common

Value | Count | Frequency (%)
9 | 35 | 17.9%
. | 32 | 16.3%
% | 32 | 16.3%
0 | 19 | 9.7%
6 | 15 | 7.7%
1 | 13 | 6.6%
8 | 13 | 6.6%
4 | 10 | 5.1%
5 | 10 | 5.1%
7 | 8 | 4.1%
Other values (3) | 9 | 4.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 196 | 100.0%

Most frequent character per block

ASCII

Value | Count | Frequency (%)
9 | 35 | 17.9%
. | 32 | 16.3%
% | 32 | 16.3%
0 | 19 | 9.7%
6 | 15 | 7.7%
1 | 13 | 6.6%
8 | 13 | 6.6%
4 | 10 | 5.1%
5 | 10 | 5.1%
7 | 8 | 4.1%
Other values (3) | 9 | 4.6%

Interactions

[Figures: 16 pairwise interaction plots; image content not recoverable]

Correlations

[Figure: correlation heatmap]

Variable | 연도 | 구분 | 세목별 | 목표액 | 부과액 | 징수액 | 미수액 | 징수율
연도 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.070 | 0.000
구분 | 0.000 | 1.000 | 0.976 | 0.350 | 0.865 | 0.846 | 0.096 | 1.000
세목별 | 0.000 | 0.976 | 1.000 | 0.877 | 0.926 | 0.952 | 0.000 | 1.000
목표액 | 0.000 | 0.350 | 0.877 | 1.000 | 0.945 | 0.945 | 0.000 | 1.000
부과액 | 0.000 | 0.865 | 0.926 | 0.945 | 1.000 | 1.000 | 0.863 | 1.000
징수액 | 0.000 | 0.846 | 0.952 | 0.945 | 1.000 | 1.000 | 0.000 | 1.000
미수액 | 0.070 | 0.096 | 0.000 | 0.000 | 0.863 | 0.000 | 1.000 | 1.000
징수율 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

[Figure: correlation heatmap]

Variable | 연도 | 세목별 | 구분
연도 | 1.000 | 0.000 | 0.000
세목별 | 0.000 | 1.000 | 0.743
구분 | 0.000 | 0.743 | 1.000

[Figure: correlation heatmap]

Variable | 목표액 | 부과액 | 징수액 | 미수액 | 연도 | 구분 | 세목별
목표액 | 1.000 | 0.926 | 0.965 | -0.141 | 0.000 | 0.222 | 0.646
부과액 | 0.926 | 1.000 | 0.976 | -0.051 | 0.000 | 0.577 | 0.762
징수액 | 0.965 | 0.976 | 1.000 | -0.151 | 0.000 | 0.587 | 0.821
미수액 | -0.141 | -0.051 | -0.151 | 1.000 | 0.026 | 0.021 | 0.000
연도 | 0.000 | 0.000 | 0.000 | 0.026 | 1.000 | 0.000 | 0.000
구분 | 0.222 | 0.577 | 0.587 | 0.021 | 0.000 | 1.000 | 0.743
세목별 | 0.646 | 0.762 | 0.821 | 0.000 | 0.000 | 0.743 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

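First rows

Row | 연도 | 구분 | 세목별 | 목표액 | 부과액 | 징수액 | 미수액 | 징수율
0 | 2020 | 도세 | 취득세 | 12714 | 12507 | 11086 | 1421 | 88.64%
1 | 2020 | 도세 | 등록면허세 | 1226 | 1140 | 1133 | 6800 | 99.39%
2 | 2020 | 도세 | 지역자원시설세 | 10028 | 7047 | 7036 | 10 | 99.84%
3 | 2020 | 도세 | 지방교육세 | 3998 | 4069 | 3863 | 204 | 94.94%
4 | 2020 | 도세 | 과년도수입 | 147 | -483 | -694 | 129 | <NA>
5 | 2020 | 군세 | 주민세 | 1150 | 1226 | 1200 | 25 | 97.88%
6 | 2020 | 군세 | 재산세 | 4800 | 5201 | 5074 | 126 | 97.56%
7 | 2020 | 군세 | 자동차세 | 5710 | 6665 | 6366 | 296 | 95.51%
8 | 2020 | 군세 | 담배소비세 | 2730 | 2922 | 2922 | <NA> | 100.00%
9 | 2020 | 군세 | 지방소득세 | 4312 | 5588 | 5430 | 129 | 97.17%

Last rows

Row | 연도 | 구분 | 세목별 | 목표액 | 부과액 | 징수액 | 미수액 | 징수율
23 | 2022 | 도세 | 등록면허세 | 1130 | 1113 | 1107 | 6 | 99.46%
24 | 2022 | 도세 | 지역자원시설세 | 6223 | 7449 | 7426 | 23 | 99.69%
25 | 2022 | 도세 | 지방교육세 | 3969 | 4184 | 4081 | 103 | 97.54%
26 | 2022 | 도세 | 과년도수입 | 58 | 1312 | 1050 | 225 | 80.03%
27 | 2022 | 군세 | 주민세 | 1200 | 1247 | 1229 | 18 | 98.56%
28 | 2022 | 군세 | 재산세 | 5000 | 5683 | 5493 | 190 | 96.66%
29 | 2022 | 군세 | 자동차세 | 5850 | 5582 | 5362 | 220 | 96.06%
30 | 2022 | 군세 | 담배소비세 | 2850 | 3011 | 3011 | 0 | 100.00%
31 | 2022 | 군세 | 지방소득세 | 4600 | 6273 | 5964 | 294 | 95.07%
32 | 2022 | 군세 | 과년도수입 | 450 | 494 | -765 | 785 | -154.86%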
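
The sample rows make it possible to test how the columns relate to one another. The sketch below assumes, based only on these sample values and not on any documentation from the publisher, that 징수율 equals 징수액 divided by 부과액 times 100, and that 미수액 would equal 부과액 minus 징수액 if nothing else were deducted. The function and variable names are illustrative.

```python
# (row, 세목별, 부과액, 징수액, 미수액, 징수율) copied from the sample rows.
rows = [
    (0, "취득세", 12507, 11086, 1421, "88.64%"),
    (1, "등록면허세", 1140, 1133, 6800, "99.39%"),
    (26, "과년도수입", 1312, 1050, 225, "80.03%"),
    (32, "과년도수입", 494, -765, 785, "-154.86%"),
]


def recomputed_rate(imposed: int, collected: int) -> float:
    # Assumption: 징수율 = 징수액 / 부과액 * 100, rounded to two decimals.
    return round(100 * collected / imposed, 2)


rate_matches = []
uncollected_gaps = {}
for index, item, imposed, collected, uncollected, rate_text in rows:
    reported = float(rate_text.rstrip("%"))
    rate_matches.append(abs(recomputed_rate(imposed, collected) - reported) <= 0.01)
    # Assumption: with no other deductions, 미수액 = 부과액 - 징수액.
    uncollected_gaps[index] = uncollected - (imposed - collected)

print(rate_matches)
print(uncollected_gaps)
```

For these four rows the recomputed rate agrees with the reported 징수율, while the reported 미수액 matches the simple difference only for row 0. Row 1 is off by several thousand, which lines up with the 6800 maximum seen in the 미수액 statistics and is a candidate for a data-entry check.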