Overview

Dataset statistics

Number of variables: 9
Number of observations: 39
Missing cells: 28
Missing cells (%): 8.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.1 KiB
Average record size in memory: 81.3 B
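
The missing-cell percentage can be checked from the other figures in this block. The snippet below is a minimal sketch, assuming that "cells" means variables times observations; the variable names are illustrative and not part of the report.

```python
# Figures copied from the Dataset statistics block.
n_variables = 9
n_observations = 39
missing_cells = 28

# Assumption: total cells = variables x observations.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(total_cells)            # 351
print(f"{missing_pct:.1f}%")  # 8.0%
```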

Variable types

Categorical: 5
Numeric: 4

Dataset

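Description: Annual status of local tax taxation and non-taxation over three years (2019 to 2021), broken down by tax item. The data provides the number of taxed cases, the taxed amount, the number of non-taxed cases, and the non-taxed amount.
Author: Naju-si, Jeollanam-do (전라남도 나주시)
URL: https://www.data.go.kr/data/15079501/fileData.do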

Alerts

시도명 has constant value "전라남도" (Constant)
시군구명 has constant value "나주시" (Constant)
자치단체코드 has constant value "46170" (Constant)
과세건수 is highly overall correlated with 비과세건수 and 1 other field (High correlation)
과세금액 is highly overall correlated with 세목명 (High correlation)
비과세건수 is highly overall correlated with 과세건수 and 2 other fields (High correlation)
비과세금액 is highly overall correlated with 비과세건수 (High correlation)
세목명 is highly overall correlated with 과세건수 and 2 other fields (High correlation)
과세건수 has 7 (17.9%) missing values (Missing)
과세금액 has 7 (17.9%) missing values (Missing)
비과세건수 has 7 (17.9%) missing values (Missing)
비과세금액 has 7 (17.9%) missing values (Missing)
과세건수 has 3 (7.7%) zeros (Zeros)
과세금액 has 3 (7.7%) zeros (Zeros)
비과세건수 has 8 (20.5%) zeros (Zeros)
비과세금액 has 8 (20.5%) zeros (Zeros)
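
The percentages in the Missing and Zeros alerts are counts divided by the 39 observations. The following sketch recomputes them; the dictionary keys and variable names are illustrative labels, not identifiers from the report.

```python
n_rows = 39  # Number of observations from the Overview

# Counts copied from the alerts; the keys are illustrative labels.
alert_counts = {
    "missing (each numeric column)": 7,
    "zeros (과세건수, 과세금액)": 3,
    "zeros (비과세건수, 비과세금액)": 8,
}

# Share of all rows, formatted to one decimal place like the report.
alert_pct = {name: f"{100 * count / n_rows:.1f}%"
             for name, count in alert_counts.items()}

for name, pct in alert_pct.items():
    print(name, pct)

# Four numeric columns with 7 missing values each account for
# the 28 missing cells reported in the Overview.
missing_cells = 4 * alert_counts["missing (each numeric column)"]
print(missing_cells)  # 28
```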

Reproduction

Analysis started: 2024-04-21 02:14:45.458972
Analysis finished: 2024-04-21 02:14:49.106769
Duration: 3.65 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
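
The reported duration follows from the two timestamps. A minimal check using only the standard library (variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2024-04-21 02:14:45.458972")
finished = datetime.fromisoformat("2024-04-21 02:14:49.106769")

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")  # 3.65 seconds
```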

Variables

시도명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 440.0 B
전라남도: 39

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전라남도
2nd row: 전라남도
3rd row: 전라남도
4th row: 전라남도
5th row: 전라남도

Common Values

Value | Count | Frequency (%)
전라남도 | 39 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

시군구명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 440.0 B
나주시: 39

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 나주시
2nd row: 나주시
3rd row: 나주시
4th row: 나주시
5th row: 나주시

Common Values

Value | Count | Frequency (%)
나주시 | 39 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

자치단체코드
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 440.0 B
46170: 39

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 46170
2nd row: 46170
3rd row: 46170
4th row: 46170
5th row: 46170

Common Values

Value | Count | Frequency (%)
46170 | 39 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

과세년도
Categorical

Distinct: 3
Distinct (%): 7.7%
Missing: 0
Missing (%): 0.0%
Memory size: 440.0 B
2019: 13
2020: 13
2021: 13

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2019
2nd row: 2019
3rd row: 2019
4th row: 2019
5th row: 2019

Common Values

Value | Count | Frequency (%)
2019 | 13 | 33.3%
2020 | 13 | 33.3%
2021 | 13 | 33.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

세목명
Categorical

HIGH CORRELATION 

Distinct: 13
Distinct (%): 33.3%
Missing: 0
Missing (%): 0.0%
Memory size: 440.0 B
취득세: 3
등록세: 3
주민세: 3
재산세: 3
자동차세: 3
Other values (8): 24

Length

Max length: 7
Median length: 5
Mean length: 4.1538462
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 취득세
2nd row: 등록세
3rd row: 주민세
4th row: 재산세
5th row: 자동차세

Common Values

Value | Count | Frequency (%)
취득세 | 3 | 7.7%
등록세 | 3 | 7.7%
주민세 | 3 | 7.7%
재산세 | 3 | 7.7%
자동차세 | 3 | 7.7%
레저세 | 3 | 7.7%
담배소비세 | 3 | 7.7%
지방소비세 | 3 | 7.7%
등록면허세 | 3 | 7.7%
도시계획세 | 3 | 7.7%
Other values (3) | 9 | 23.1%

Length

[Figure: Histogram of lengths of the category]

과세건수
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 30
Distinct (%): 93.8%
Missing: 7
Missing (%): 17.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 77333.625
Minimum: 0
Maximum: 327549
Zeros: 3
Zeros (%): 7.7%
Negative: 0
Negative (%): 0.0%
Memory size: 479.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 19167
Median: 51528.5
Q3: 95893
95-th percentile: 312266
Maximum: 327549
Range: 327549
Interquartile range (IQR): 76726

Descriptive statistics

Standard deviation: 90899.956
Coefficient of variation (CV): 1.175426
Kurtosis: 2.6992209
Mean: 77333.625
Median Absolute Deviation (MAD): 45275.5
Skewness: 1.7846418
Sum: 2474676
Variance: 8.2628019 × 10^9
Monotonicity: Not monotonic
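
Several of the 과세건수 statistics are derived from one another, which gives a quick consistency check on the reported values. The sketch below assumes the usual definitions (mean = sum / non-missing count, CV = standard deviation / mean, IQR = Q3 - Q1, variance = standard deviation squared); the variable names are illustrative.

```python
# Values copied from the 과세건수 statistics in this report.
n_observations = 39
n_missing = 7
total = 2474676
std = 90899.956
q1, q3 = 19167, 95893

n_valid = n_observations - n_missing  # rows that actually hold a value
mean = total / n_valid                # reported Mean: 77333.625
cv = std / mean                       # reported CV: 1.175426
iqr = q3 - q1                         # reported IQR: 76726
variance = std ** 2                   # reported Variance: 8.2628019 x 10^9

print(n_valid, mean, iqr)
print(f"{cv:.6f}")
print(f"{variance:.4e}")
```

Small differences in the last digit of the variance are expected because the standard deviation in the report is already rounded.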
[Figure: Histogram with fixed size bins (bins=30)]

Common values

Value | Count | Frequency (%)
0 | 3 | 7.7%
70608 | 1 | 2.6%
327549 | 1 | 2.6%
45294 | 1 | 2.6%
52935 | 1 | 2.6%
73734 | 1 | 2.6%
7 | 1 | 2.6%
477 | 1 | 2.6%
102470 | 1 | 2.6%
160087 | 1 | 2.6%
Other values (20) | 20 | 51.3%
(Missing) | 7 | 17.9%

Minimum 10 values

Value | Count | Frequency (%)
0 | 3 | 7.7%
6 | 1 | 2.6%
7 | 1 | 2.6%
80 | 1 | 2.6%
274 | 1 | 2.6%
477 | 1 | 2.6%
25397 | 1 | 2.6%
26381 | 1 | 2.6%
29518 | 1 | 2.6%
35425 | 1 | 2.6%

Maximum 10 values

Value | Count | Frequency (%)
327549 | 1 | 2.6%
315753 | 1 | 2.6%
309413 | 1 | 2.6%
160087 | 1 | 2.6%
156404 | 1 | 2.6%
153800 | 1 | 2.6%
102470 | 1 | 2.6%
98626 | 1 | 2.6%
94982 | 1 | 2.6%
73734 | 1 | 2.6%

과세금액
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 30
Distinct (%): 93.8%
Missing: 7
Missing (%): 17.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.7286754 × 10^10
Minimum: 0
Maximum: 5.826368 × 10^10
Zeros: 3
Zeros (%): 7.7%
Negative: 0
Negative (%): 0.0%
Memory size: 479.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 4.9580328 × 10^9
Median: 1.011269 × 10^10
Q3: 2.807986 × 10^10
95-th percentile: 4.6670513 × 10^10
Maximum: 5.826368 × 10^10
Range: 5.826368 × 10^10
Interquartile range (IQR): 2.3121828 × 10^10

Descriptive statistics

Standard deviation: 1.5724467 × 10^10
Coefficient of variation (CV): 0.90962519
Kurtosis: 0.17665811
Mean: 1.7286754 × 10^10
Median Absolute Deviation (MAD): 5.7389755 × 10^9
Skewness: 1.0102334
Sum: 5.5317612 × 10^11
Variance: 2.4725885 × 10^20
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=30)]

Common values

Value | Count | Frequency (%)
0 | 3 | 7.7%
4962978000 | 1 | 2.6%
15500633000 | 1 | 2.6%
36877328000 | 1 | 2.6%
4943197000 | 1 | 2.6%
4883519000 | 1 | 2.6%
10067081000 | 1 | 2.6%
8033901000 | 1 | 2.6%
27565251000 | 1 | 2.6%
29623689000 | 1 | 2.6%
Other values (20) | 20 | 51.3%
(Missing) | 7 | 17.9%

Minimum 10 values

Value | Count | Frequency (%)
0 | 3 | 7.7%
4268465000 | 1 | 2.6%
4478965000 | 1 | 2.6%
4516876000 | 1 | 2.6%
4883519000 | 1 | 2.6%
4943197000 | 1 | 2.6%
4962978000 | 1 | 2.6%
6127221000 | 1 | 2.6%
6331017000 | 1 | 2.6%
7232592000 | 1 | 2.6%

Maximum 10 values

Value | Count | Frequency (%)
58263680000 | 1 | 2.6%
50954808000 | 1 | 2.6%
43165181000 | 1 | 2.6%
36877328000 | 1 | 2.6%
35833210000 | 1 | 2.6%
32465726000 | 1 | 2.6%
31818028000 | 1 | 2.6%
29623689000 | 1 | 2.6%
27565251000 | 1 | 2.6%
25754021000 | 1 | 2.6%

비과세건수
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 25
Distinct (%): 78.1%
Missing: 7
Missing (%): 17.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 7258.875
Minimum: 0
Maximum: 40074
Zeros: 8
Zeros (%): 20.5%
Negative: 0
Negative (%): 0.0%
Memory size: 479.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 18
Median: 2845.5
Q3: 7936.25
95-th percentile: 39387.7
Maximum: 40074
Range: 40074
Interquartile range (IQR): 7918.25

Descriptive statistics

Standard deviation: 11597.568
Coefficient of variation (CV): 1.5977087
Kurtosis: 3.9335068
Mean: 7258.875
Median Absolute Deviation (MAD): 2845.5
Skewness: 2.1564804
Sum: 232284
Variance: 1.3450358 × 10^8
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=25)]

Common values

Value | Count | Frequency (%)
0 | 8 | 20.5%
5409 | 1 | 2.6%
112 | 1 | 2.6%
2874 | 1 | 2.6%
5909 | 1 | 2.6%
16144 | 1 | 2.6%
40074 | 1 | 2.6%
8696 | 1 | 2.6%
44 | 1 | 2.6%
4990 | 1 | 2.6%
Other values (15) | 15 | 38.5%
(Missing) | 7 | 17.9%

Minimum 10 values

Value | Count | Frequency (%)
0 | 8 | 20.5%
24 | 1 | 2.6%
40 | 1 | 2.6%
44 | 1 | 2.6%
83 | 1 | 2.6%
112 | 1 | 2.6%
125 | 1 | 2.6%
2808 | 1 | 2.6%
2817 | 1 | 2.6%
2874 | 1 | 2.6%

Maximum 10 values

Value | Count | Frequency (%)
40074 | 1 | 2.6%
39611 | 1 | 2.6%
39205 | 1 | 2.6%
16144 | 1 | 2.6%
15124 | 1 | 2.6%
13925 | 1 | 2.6%
11741 | 1 | 2.6%
8696 | 1 | 2.6%
7683 | 1 | 2.6%
5909 | 1 | 2.6%

비과세금액
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 25
Distinct (%): 78.1%
Missing: 7
Missing (%): 17.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.9438804 × 10^9
Minimum: 0
Maximum: 2.4961374 × 10^10
Zeros: 8
Zeros (%): 20.5%
Negative: 0
Negative (%): 0.0%
Memory size: 479.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 5250
Median: 44039500
Q3: 6.9686975 × 10^8
95-th percentile: 1.3685655 × 10^10
Maximum: 2.4961374 × 10^10
Range: 2.4961374 × 10^10
Interquartile range (IQR): 6.968645 × 10^8

Descriptive statistics

Standard deviation: 6.1912936 × 10^9
Coefficient of variation (CV): 2.1031063
Kurtosis: 4.3853652
Mean: 2.9438804 × 10^9
Median Absolute Deviation (MAD): 44039500
Skewness: 2.2069352
Sum: 9.4204174 × 10^10
Variance: 3.8332117 × 10^19
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=25)]

Common values

Value | Count | Frequency (%)
0 | 8 | 20.5%
405563000 | 1 | 2.6%
9000 | 1 | 2.6%
460951000 | 1 | 2.6%
296460000 | 1 | 2.6%
693751000 | 1 | 2.6%
13897126000 | 1 | 2.6%
24607000 | 1 | 2.6%
18473000 | 1 | 2.6%
24961374000 | 1 | 2.6%
Other values (15) | 15 | 38.5%
(Missing) | 7 | 17.9%

Minimum 10 values

Value | Count | Frequency (%)
0 | 8 | 20.5%
7000 | 1 | 2.6%
9000 | 1 | 2.6%
10000 | 1 | 2.6%
1629000 | 1 | 2.6%
18473000 | 1 | 2.6%
21127000 | 1 | 2.6%
23544000 | 1 | 2.6%
24607000 | 1 | 2.6%
63472000 | 1 | 2.6%

Maximum 10 values

Value | Count | Frequency (%)
24961374000 | 1 | 2.6%
13897126000 | 1 | 2.6%
13512634000 | 1 | 2.6%
13242973000 | 1 | 2.6%
12913074000 | 1 | 2.6%
11056808000 | 1 | 2.6%
726160000 | 1 | 2.6%
706226000 | 1 | 2.6%
693751000 | 1 | 2.6%
460951000 | 1 | 2.6%

Interactions

[Figures: 16 pairwise interaction plots; images not recoverable]

Correlations

[Figure: correlation heatmap]

Variable | 과세년도 | 세목명 | 과세건수 | 과세금액 | 비과세건수 | 비과세금액
과세년도 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
세목명 | 0.000 | 1.000 | 0.972 | 0.850 | 0.909 | 0.712
과세건수 | 0.000 | 0.972 | 1.000 | 0.809 | 0.923 | 0.564
과세금액 | 0.000 | 0.850 | 0.809 | 1.000 | 0.684 | 0.967
비과세건수 | 0.000 | 0.909 | 0.923 | 0.684 | 1.000 | 0.631
비과세금액 | 0.000 | 0.712 | 0.564 | 0.967 | 0.631 | 1.000

[Figure: correlation heatmap]

Variable | 과세년도 | 세목명
과세년도 | 1.000 | 0.000
세목명 | 0.000 | 1.000

[Figure: correlation heatmap]

Variable | 과세건수 | 과세금액 | 비과세건수 | 비과세금액 | 과세년도 | 세목명
과세건수 | 1.000 | 0.246 | 0.664 | 0.407 | 0.000 | 0.830
과세금액 | 0.246 | 1.000 | 0.123 | 0.216 | 0.000 | 0.558
비과세건수 | 0.664 | 0.123 | 1.000 | 0.854 | 0.000 | 0.683
비과세금액 | 0.407 | 0.216 | 0.854 | 1.000 | 0.000 | 0.443
과세년도 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000
세목명 | 0.830 | 0.558 | 0.683 | 0.443 | 0.000 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
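
Nullity correlation can be illustrated with the 20 rows listed in the Sample section, where the four numeric columns are `<NA>` in the same rows (indices 5, 7, 9, 31, and 35). The sketch below computes a plain Pearson correlation between missing/present indicators. It is an assumption that this matches how the heatmap is computed, and `pearson` and `nullity` are illustrative names, not part of the report or its tooling.

```python
from math import sqrt

# Row indices shown in the Sample section (first 10 and last 10 rows).
sample_rows = list(range(0, 10)) + list(range(29, 39))
# Rows where the four numeric columns are <NA> in that sample.
missing_rows = {5, 7, 9, 31, 35}

numeric_cols = ["과세건수", "과세금액", "비과세건수", "비과세금액"]
# 1 = value missing, 0 = value present.
nullity = {col: [1 if r in missing_rows else 0 for r in sample_rows]
           for col in numeric_cols}
nullity["세목명"] = [0] * len(sample_rows)  # never missing

def pearson(x, y):
    """Pearson correlation; None when either series has zero variance."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)

# Columns that are missing in exactly the same rows correlate perfectly.
print(pearson(nullity["과세건수"], nullity["비과세금액"]))
# A column with no missing values has no defined nullity correlation.
print(pearson(nullity["세목명"], nullity["과세건수"]))
```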

Sample

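First rows

Index | 시도명 | 시군구명 | 자치단체코드 | 과세년도 | 세목명 | 과세건수 | 과세금액 | 비과세건수 | 비과세금액
0 | 전라남도 | 나주시 | 46170 | 2019 | 취득세 | 25397 | 43165181000 | 4336 | 11056808000
1 | 전라남도 | 나주시 | 46170 | 2019 | 등록세 | 0 | 0 | 40 | 21127000
2 | 전라남도 | 나주시 | 46170 | 2019 | 주민세 | 63713 | 6127221000 | 11741 | 63472000
3 | 전라남도 | 나주시 | 46170 | 2019 | 재산세 | 153800 | 24794854000 | 39205 | 13242973000
4 | 전라남도 | 나주시 | 46170 | 2019 | 자동차세 | 94982 | 32465726000 | 13925 | 726160000
5 | 전라남도 | 나주시 | 46170 | 2019 | 레저세 | <NA> | <NA> | <NA> | <NA>
6 | 전라남도 | 나주시 | 46170 | 2019 | 담배소비세 | 80 | 7371880000 | 0 | 0
7 | 전라남도 | 나주시 | 46170 | 2019 | 지방소비세 | <NA> | <NA> | <NA> | <NA>
8 | 전라남도 | 나주시 | 46170 | 2019 | 등록면허세 | 67945 | 4478965000 | 5867 | 305361000
9 | 전라남도 | 나주시 | 46170 | 2019 | 도시계획세 | <NA> | <NA> | <NA> | <NA>

Last rows

Index | 시도명 | 시군구명 | 자치단체코드 | 과세년도 | 세목명 | 과세건수 | 과세금액 | 비과세건수 | 비과세금액
29 | 전라남도 | 나주시 | 46170 | 2021 | 재산세 | 160087 | 29623689000 | 40074 | 13897126000
30 | 전라남도 | 나주시 | 46170 | 2021 | 자동차세 | 102470 | 27565251000 | 16144 | 693751000
31 | 전라남도 | 나주시 | 46170 | 2021 | 레저세 | <NA> | <NA> | <NA> | <NA>
32 | 전라남도 | 나주시 | 46170 | 2021 | 담배소비세 | 477 | 8033901000 | 0 | 0
33 | 전라남도 | 나주시 | 46170 | 2021 | 지방소비세 | 7 | 10067081000 | 0 | 0
34 | 전라남도 | 나주시 | 46170 | 2021 | 등록면허세 | 73734 | 4883519000 | 5909 | 296460000
35 | 전라남도 | 나주시 | 46170 | 2021 | 도시계획세 | <NA> | <NA> | <NA> | <NA>
36 | 전라남도 | 나주시 | 46170 | 2021 | 지역자원시설세 | 52935 | 4943197000 | 2874 | 460951000
37 | 전라남도 | 나주시 | 46170 | 2021 | 지방소득세 | 45294 | 36877328000 | 0 | 0
38 | 전라남도 | 나주시 | 46170 | 2021 | 교육세 | 327549 | 15500633000 | 112 | 9000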