Overview

Dataset statistics

Number of variables: 8
Number of observations: 64
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.5 KiB
Average record size in memory: 72.1 B
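
As a quick consistency check on the figures above, the total in-memory size should be roughly the average record size multiplied by the number of observations. The sketch below assumes the reported 72.1 B is a rounded value, so the product only approximates the reported total.

```python
# Figures copied from the dataset statistics above.
n_observations = 64
avg_record_bytes = 72.1  # rounded in the report, so the result is approximate

total_bytes = n_observations * avg_record_bytes
total_kib = total_bytes / 1024  # 1 KiB = 1024 bytes

print(f"{total_bytes:.1f} B is about {total_kib:.1f} KiB")
```

The result agrees with the reported total size to one decimal place.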

Variable types

Categorical: 2
Numeric: 6

Dataset

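Description: National tax refund status provided as national tax statistics, broken down by region and other categories (comprehensive income tax, corporate tax, value-added tax, capital gains tax, inheritance and gift tax, etc.)
URL: https://www.data.go.kr/data/3059447/fileData.do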

Alerts

종합소득세(백만원) is highly overall correlated with 법인세(백만원) and 4 other fields (High correlation)
법인세(백만원) is highly overall correlated with 종합소득세(백만원) and 5 other fields (High correlation)
부가가치세(백만원) is highly overall correlated with 종합소득세(백만원) and 4 other fields (High correlation)
양도소득세(백만원) is highly overall correlated with 종합소득세(백만원) and 4 other fields (High correlation)
상속_증여세(백만원) is highly overall correlated with 종합소득세(백만원) and 5 other fields (High correlation)
기타(백만원) is highly overall correlated with 종합소득세(백만원) and 4 other fields (High correlation)
시도별 is highly overall correlated with 법인세(백만원) and 1 other field (High correlation)
법인세(백만원) has 11 (17.2%) zeros (Zeros)
부가가치세(백만원) has 1 (1.6%) zeros (Zeros)
양도소득세(백만원) has 12 (18.8%) zeros (Zeros)
상속_증여세(백만원) has 15 (23.4%) zeros (Zeros)
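
The zero percentages in these alerts follow directly from the zero counts and the 64 observations. A minimal sketch, assuming the percentage is simply the count divided by the number of rows and formatted to one decimal place:

```python
n_rows = 64  # number of observations in the report

# Zero counts copied from the alerts above.
zero_counts = {
    "법인세(백만원)": 11,
    "부가가치세(백만원)": 1,
    "양도소득세(백만원)": 12,
    "상속_증여세(백만원)": 15,
}

# Share of zero-valued rows per column, formatted to one decimal place.
zero_share = {
    col: f"{count / n_rows * 100:.1f}%" for col, count in zero_counts.items()
}

for col, share in zero_share.items():
    print(col, share)
```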

Reproduction

Analysis started: 2023-12-12 05:12:50.263039
Analysis finished: 2023-12-12 05:12:54.428611
Duration: 4.17 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
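
The reported duration can be recovered from the two timestamps. The sketch below uses only the Python standard library and assumes both timestamps are in the same time zone.

```python
from datetime import datetime

# Timestamps copied from the reproduction details above.
started = datetime.fromisoformat("2023-12-12 05:12:50.263039")
finished = datetime.fromisoformat("2023-12-12 05:12:54.428611")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```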

Variables

구분
Categorical

Distinct: 4
Distinct (%): 6.2%
Missing: 0
Missing (%): 0.0%
Memory size: 644.0 B

발생액: 16
지급액: 16
충당액: 16
미처리: 16

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 발생액
2nd row: 발생액
3rd row: 발생액
4th row: 발생액
5th row: 발생액

Common Values

Value | Count | Frequency (%)
발생액 | 16 | 25.0%
지급액 | 16 | 25.0%
충당액 | 16 | 25.0%
미처리 | 16 | 25.0%

Length

Figure: Histogram of lengths of the category


시도별
Categorical

HIGH CORRELATION 

Distinct: 16
Distinct (%): 25.0%
Missing: 0
Missing (%): 0.0%
Memory size: 644.0 B

서울: 4
인천: 4
경기: 4
강원: 4
대전: 4
Other values (11): 44

Length

Max length: 7
Median length: 4
Mean length: 4.1875
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울
2nd row: 인천
3rd row: 경기
4th row: 강원
5th row: 대전

Common Values

Value | Count | Frequency (%)
서울 | 4 | 6.2%
인천 | 4 | 6.2%
경기 | 4 | 6.2%
강원 | 4 | 6.2%
대전 | 4 | 6.2%
충북 | 4 | 6.2%
충남 세종 | 4 | 6.2%
광주 | 4 | 6.2%
전북 | 4 | 6.2%
전남 | 4 | 6.2%
Other values (6) | 24 | 37.5%

Length

Figure: Histogram of lengths of the category

Value | Count | Frequency (%)
서울 | 4 | 5.9%
전북 | 4 | 5.9%
경남 | 4 | 5.9%
울산 | 4 | 5.9%
부산 | 4 | 5.9%
경북 | 4 | 5.9%
대구 | 4 | 5.9%
전남 | 4 | 5.9%
광주 | 4 | 5.9%
인천 | 4 | 5.9%
Other values (7) | 28 | 41.2%

종합소득세(백만원)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 62
Distinct (%): 96.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 99223.469
Minimum: 16
Maximum: 931873
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 708.0 B

Quantile statistics

Minimum: 16
5-th percentile: 28.15
Q1: 1572
Median: 35067.5
Q3: 85679
95-th percentile: 756353.35
Maximum: 931873
Range: 931857
Interquartile range (IQR): 84107

Descriptive statistics

Standard deviation: 213505.86
Coefficient of variation (CV): 2.1517677
Kurtosis: 10.266552
Mean: 99223.469
Median Absolute Deviation (MAD): 34979.5
Skewness: 3.3263994
Sum: 6350302
Variance: 4.5584752 × 10^10
Monotonicity: Not monotonic
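
Several of the statistics reported for 종합소득세(백만원) are simple functions of the others, which makes them easy to sanity-check. The sketch below recomputes the mean, range, IQR, coefficient of variation, and variance from the reported sum, extremes, quartiles, and standard deviation. Because the inputs are rounded in the report, the results match only up to rounding.

```python
# Values copied from the 종합소득세(백만원) statistics in this report.
n = 64            # number of observations
total = 6350302   # Sum
minimum, maximum = 16, 931873
q1, q3 = 1572, 85679
std = 213505.86   # Standard deviation (rounded in the report)

mean = total / n                 # arithmetic mean
value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # variance is the squared standard deviation

print(f"mean={mean:.3f} range={value_range} iqr={iqr}")
print(f"cv={cv:.7f} variance={variance:.7e}")
```

A coefficient of variation above 2, together with the large gap between the median and the mean, is consistent with the strong right skew reported for this column.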
Figure: Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
19 | 2 | 3.1%
40 | 2 | 3.1%
890505 | 1 | 1.6%
10222 | 1 | 1.6%
40292 | 1 | 1.6%
3850 | 1 | 1.6%
3609 | 1 | 1.6%
2426 | 1 | 1.6%
4009 | 1 | 1.6%
3354 | 1 | 1.6%
Other values (52) | 52 | 81.2%

Value | Count | Frequency (%)
16 | 1 | 1.6%
19 | 2 | 3.1%
28 | 1 | 1.6%
29 | 1 | 1.6%
31 | 1 | 1.6%
37 | 1 | 1.6%
40 | 2 | 3.1%
50 | 1 | 1.6%
65 | 1 | 1.6%
67 | 1 | 1.6%

Value | Count | Frequency (%)
931873 | 1 | 1.6%
891297 | 1 | 1.6%
890505 | 1 | 1.6%
855586 | 1 | 1.6%
194035 | 1 | 1.6%
185354 | 1 | 1.6%
183582 | 1 | 1.6%
178837 | 1 | 1.6%
138371 | 1 | 1.6%
137571 | 1 | 1.6%

법인세(백만원)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 54
Distinct (%): 84.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 234234.22
Minimum: -26
Maximum: 4720013
Zeros: 11
Zeros (%): 17.2%
Negative: 1
Negative (%): 1.6%
Memory size: 708.0 B

Quantile statistics

Minimum: -26
5-th percentile: 0
Q1: 1092.5
Median: 42579
Q3: 108220.75
95-th percentile: 976003.15
Maximum: 4720013
Range: 4720039
Interquartile range (IQR): 107128.25

Descriptive statistics

Standard deviation: 827957.04
Coefficient of variation (CV): 3.5347399
Kurtosis: 25.754439
Mean: 234234.22
Median Absolute Deviation (MAD): 42579
Skewness: 5.0787231
Sum: 14990990
Variance: 6.8551285 × 10^11
Monotonicity: Not monotonic
Figure: Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0 | 11 | 17.2%
4720013 | 1 | 1.6%
16018 | 1 | 1.6%
133930 | 1 | 1.6%
27479 | 1 | 1.6%
95323 | 1 | 1.6%
6302 | 1 | 1.6%
56233 | 1 | 1.6%
1458 | 1 | 1.6%
1745 | 1 | 1.6%
Other values (44) | 44 | 68.8%

Value | Count | Frequency (%)
-26 | 1 | 1.6%
0 | 11 | 17.2%
1 | 1 | 1.6%
2 | 1 | 1.6%
19 | 1 | 1.6%
35 | 1 | 1.6%
1445 | 1 | 1.6%
1458 | 1 | 1.6%
1745 | 1 | 1.6%
2276 | 1 | 1.6%

Value | Count | Frequency (%)
4720013 | 1 | 1.6%
4624716 | 1 | 1.6%
1167172 | 1 | 1.6%
1110904 | 1 | 1.6%
211565 | 1 | 1.6%
210147 | 1 | 1.6%
203846 | 1 | 1.6%
201528 | 1 | 1.6%
181024 | 1 | 1.6%
164345 | 1 | 1.6%

부가가치세(백만원)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 60
Distinct (%): 93.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2613283.8
Minimum: 0
Maximum: 26045597
Zeros: 1
Zeros (%): 1.6%
Negative: 0
Negative (%): 0.0%
Memory size: 708.0 B

Quantile statistics

Minimum: 0
5-th percentile: 4
Q1: 2696.5
Median: 221309.5
Q3: 2584809
95-th percentile: 20232945
Maximum: 26045597
Range: 26045597
Interquartile range (IQR): 2582112.5

Descriptive statistics

Standard deviation: 5909891.6
Coefficient of variation (CV): 2.261481
Kurtosis: 10.33577
Mean: 2613283.8
Median Absolute Deviation (MAD): 221306.5
Skewness: 3.3204768
Sum: 1.6725016 × 10^8
Variance: 3.4926819 × 10^13
Monotonicity: Not monotonic
Figure: Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
12 | 2 | 3.1%
4 | 2 | 3.1%
3 | 2 | 3.1%
6 | 2 | 3.1%
7244 | 1 | 1.6%
6378 | 1 | 1.6%
8343 | 1 | 1.6%
9259 | 1 | 1.6%
11152 | 1 | 1.6%
26045597 | 1 | 1.6%
Other values (50) | 50 | 78.1%

Value | Count | Frequency (%)
0 | 1 | 1.6%
3 | 2 | 3.1%
4 | 2 | 3.1%
5 | 1 | 1.6%
6 | 2 | 3.1%
7 | 1 | 1.6%
12 | 2 | 3.1%
13 | 1 | 1.6%
14 | 1 | 1.6%
15 | 1 | 1.6%

Value | Count | Frequency (%)
26045597 | 1 | 1.6%
25982185 | 1 | 1.6%
22985252 | 1 | 1.6%
22872310 | 1 | 1.6%
5276543 | 1 | 1.6%
5265378 | 1 | 1.6%
4324226 | 1 | 1.6%
4307729 | 1 | 1.6%
4166444 | 1 | 1.6%
4156663 | 1 | 1.6%

양도소득세(백만원)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 52
Distinct (%): 81.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13354.75
Minimum: 0
Maximum: 195308
Zeros: 12
Zeros (%): 18.8%
Negative: 0
Negative (%): 0.0%
Memory size: 708.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 57.75
Median: 4072
Q3: 10260.75
95-th percentile: 82255.6
Maximum: 195308
Range: 195308
Interquartile range (IQR): 10203

Descriptive statistics

Standard deviation: 36442.201
Coefficient of variation (CV): 2.728782
Kurtosis: 17.920123
Mean: 13354.75
Median Absolute Deviation (MAD): 4071.5
Skewness: 4.2133699
Sum: 854704
Variance: 1.328034 × 10^9
Monotonicity: Not monotonic
Figure: Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0 | 12 | 18.8%
1 | 2 | 3.1%
195308 | 1 | 1.6%
166 | 1 | 1.6%
5343 | 1 | 1.6%
9369 | 1 | 1.6%
2033 | 1 | 1.6%
7050 | 1 | 1.6%
416 | 1 | 1.6%
5201 | 1 | 1.6%
Other values (42) | 42 | 65.6%

Value | Count | Frequency (%)
0 | 12 | 18.8%
1 | 2 | 3.1%
2 | 1 | 1.6%
45 | 1 | 1.6%
62 | 1 | 1.6%
113 | 1 | 1.6%
152 | 1 | 1.6%
166 | 1 | 1.6%
241 | 1 | 1.6%
275 | 1 | 1.6%

Value | Count | Frequency (%)
195308 | 1 | 1.6%
188255 | 1 | 1.6%
98949 | 1 | 1.6%
93748 | 1 | 1.6%
17132 | 1 | 1.6%
16495 | 1 | 1.6%
14471 | 1 | 1.6%
14195 | 1 | 1.6%
12829 | 1 | 1.6%
12745 | 1 | 1.6%

상속_증여세(백만원)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 50
Distinct (%): 78.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5210.6562
Minimum: 0
Maximum: 94774
Zeros: 15
Zeros (%): 23.4%
Negative: 0
Negative (%): 0.0%
Memory size: 708.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 10.25
Median: 831.5
Q3: 3101.75
95-th percentile: 21103.25
Maximum: 94774
Range: 94774
Interquartile range (IQR): 3091.5

Descriptive statistics

Standard deviation: 16049.73
Coefficient of variation (CV): 3.0801744
Kurtosis: 24.68086
Mean: 5210.6562
Median Absolute Deviation (MAD): 831.5
Skewness: 4.9156729
Sum: 333482
Variance: 2.5759384 × 10^8
Monotonicity: Not monotonic
Figure: Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0 | 15 | 23.4%
94774 | 1 | 1.6%
53 | 1 | 1.6%
6539 | 1 | 1.6%
4748 | 1 | 1.6%
5990 | 1 | 1.6%
1449 | 1 | 1.6%
9471 | 1 | 1.6%
175 | 1 | 1.6%
1331 | 1 | 1.6%
Other values (40) | 40 | 62.5%

Value | Count | Frequency (%)
0 | 15 | 23.4%
2 | 1 | 1.6%
13 | 1 | 1.6%
14 | 1 | 1.6%
24 | 1 | 1.6%
32 | 1 | 1.6%
53 | 1 | 1.6%
149 | 1 | 1.6%
151 | 1 | 1.6%
175 | 1 | 1.6%

Value | Count | Frequency (%)
94774 | 1 | 1.6%
85304 | 1 | 1.6%
24487 | 1 | 1.6%
23156 | 1 | 1.6%
9471 | 1 | 1.6%
8593 | 1 | 1.6%
8462 | 1 | 1.6%
8442 | 1 | 1.6%
6539 | 1 | 1.6%
6180 | 1 | 1.6%

기타(백만원)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 60
Distinct (%): 93.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 288281.12
Minimum: 1
Maximum: 2383468
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 708.0 B

Quantile statistics

Minimum: 1
5-th percentile: 5
Q1: 6427.5
Median: 135252.5
Q3: 338885.5
95-th percentile: 1413202.6
Maximum: 2383468
Range: 2383467
Interquartile range (IQR): 332458

Descriptive statistics

Standard deviation: 487011.27
Coefficient of variation (CV): 1.6893623
Kurtosis: 9.7733562
Mean: 288281.12
Median Absolute Deviation (MAD): 135242.5
Skewness: 3.0385736
Sum: 18449992
Variance: 2.3717998 × 10^11
Monotonicity: Not monotonic
Figure: Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
5 | 3 | 4.7%
6 | 2 | 3.1%
1 | 2 | 3.1%
2383468 | 1 | 1.6%
20830 | 1 | 1.6%
149048 | 1 | 1.6%
179180 | 1 | 1.6%
11328 | 1 | 1.6%
13932 | 1 | 1.6%
23186 | 1 | 1.6%
Other values (50) | 50 | 78.1%

Value | Count | Frequency (%)
1 | 2 | 3.1%
3 | 1 | 1.6%
5 | 3 | 4.7%
6 | 2 | 3.1%
8 | 1 | 1.6%
12 | 1 | 1.6%
13 | 1 | 1.6%
26 | 1 | 1.6%
27 | 1 | 1.6%
48 | 1 | 1.6%

Value | Count | Frequency (%)
2383468 | 1 | 1.6%
2261743 | 1 | 1.6%
1735719 | 1 | 1.6%
1556451 | 1 | 1.6%
601462 | 1 | 1.6%
591118 | 1 | 1.6%
567014 | 1 | 1.6%
544400 | 1 | 1.6%
523543 | 1 | 1.6%
464840 | 1 | 1.6%

Interactions

Figure: 36 pairwise interaction plots of the numeric variables

Correlations

Variable | 구분 | 시도별 | 종합소득세(백만원) | 법인세(백만원) | 부가가치세(백만원) | 양도소득세(백만원) | 상속_증여세(백만원) | 기타(백만원)
구분 | 1.000 | 0.000 | 0.561 | 0.000 | 0.275 | 0.000 | 0.000 | 0.622
시도별 | 0.000 | 1.000 | 0.707 | 0.750 | 0.780 | 0.624 | 0.750 | 0.567
종합소득세(백만원) | 0.561 | 0.707 | 1.000 | 0.662 | 0.745 | 0.878 | 0.662 | 0.849
법인세(백만원) | 0.000 | 0.750 | 0.662 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
부가가치세(백만원) | 0.275 | 0.780 | 0.745 | 1.000 | 1.000 | 0.832 | 1.000 | 0.874
양도소득세(백만원) | 0.000 | 0.624 | 0.878 | 1.000 | 0.832 | 1.000 | 1.000 | 1.000
상속_증여세(백만원) | 0.000 | 0.750 | 0.662 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
기타(백만원) | 0.622 | 0.567 | 0.849 | 1.000 | 0.874 | 1.000 | 1.000 | 1.000

Variable | 시도별 | 구분
시도별 | 1.000 | 0.000
구분 | 0.000 | 1.000

Variable | 종합소득세(백만원) | 법인세(백만원) | 부가가치세(백만원) | 양도소득세(백만원) | 상속_증여세(백만원) | 기타(백만원) | 구분 | 시도별
종합소득세(백만원) | 1.000 | 0.957 | 0.953 | 0.952 | 0.900 | 0.965 | 0.245 | 0.359
법인세(백만원) | 0.957 | 1.000 | 0.960 | 0.927 | 0.901 | 0.953 | 0.000 | 0.503
부가가치세(백만원) | 0.953 | 0.960 | 1.000 | 0.929 | 0.876 | 0.972 | 0.224 | 0.485
양도소득세(백만원) | 0.952 | 0.927 | 0.929 | 1.000 | 0.892 | 0.927 | 0.000 | 0.297
상속_증여세(백만원) | 0.900 | 0.901 | 0.876 | 0.892 | 1.000 | 0.866 | 0.000 | 0.503
기타(백만원) | 0.965 | 0.953 | 0.972 | 0.927 | 0.866 | 1.000 | 0.443 | 0.280
구분 | 0.245 | 0.000 | 0.224 | 0.000 | 0.000 | 0.443 | 1.000 | 0.000
시도별 | 0.359 | 0.503 | 0.485 | 0.297 | 0.503 | 0.280 | 0.000 | 1.000
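
To read a matrix like this, it can help to list only the strongest pairs. The sketch below copies the numeric-to-numeric coefficients from the last correlation matrix in this section and prints the pairs at or above a cutoff. The 0.95 threshold is a hypothetical value chosen for illustration; it is not the threshold the profiling tool uses for its alerts.

```python
from itertools import combinations

# Coefficients copied from the last correlation matrix in this section
# (numeric columns only; the matrix is symmetric).
columns = [
    "종합소득세(백만원)", "법인세(백만원)", "부가가치세(백만원)",
    "양도소득세(백만원)", "상속_증여세(백만원)", "기타(백만원)",
]
matrix = [
    [1.000, 0.957, 0.953, 0.952, 0.900, 0.965],
    [0.957, 1.000, 0.960, 0.927, 0.901, 0.953],
    [0.953, 0.960, 1.000, 0.929, 0.876, 0.972],
    [0.952, 0.927, 0.929, 1.000, 0.892, 0.927],
    [0.900, 0.901, 0.876, 0.892, 1.000, 0.866],
    [0.965, 0.953, 0.972, 0.927, 0.866, 1.000],
]

THRESHOLD = 0.95  # hypothetical cutoff, for illustration only

# Visit each unordered pair once and keep those at or above the cutoff.
strong_pairs = [
    (columns[i], columns[j], matrix[i][j])
    for i, j in combinations(range(len(columns)), 2)
    if matrix[i][j] >= THRESHOLD
]
strong_pairs.sort(key=lambda pair: pair[2], reverse=True)

for left, right, coef in strong_pairs:
    print(f"{left} ~ {right}: {coef:.3f}")
```

With every numeric pair at 0.866 or higher, the tax columns appear to move together across regions, which is what the high-correlation alerts point to.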

Missing values

Figure: A simple visualization of nullity by column.
Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.

Sample

구분시도별종합소득세(백만원)법인세(백만원)부가가치세(백만원)양도소득세(백만원)상속_증여세(백만원)기타(백만원)
0발생액서울890505472001326045597195308947742383468
1발생액인천1940352101474324226127453065601462
2발생액경기93187311671722298525298949244871735719
3발생액강원567978136079545256231203275970
4발생액대전7943485231912469116421265248208
5발생액충북65947719081930144144711366293024
6발생액충남 세종1096591810245276543171324438448081
7발생액광주9191110797296359476822167277465
8발생액전북647646561011607616138601355871
9발생액전남5652113419834169774917433337580
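
Row | 구분 | 시도별 | 종합소득세(백만원) | 법인세(백만원) | 부가가치세(백만원) | 양도소득세(백만원) | 상속_증여세(백만원) | 기타(백만원)
54 | 미처리 | 충남 세종 | 65 | 0 | 12 | 0 | 0 | 26
55 | 미처리 | 광주 | 109 | 0 | 6 | 0 | 2 | 5
56 | 미처리 | 전북 | 29 | 0 | 3 | 0 | 0 | 6
57 | 미처리 | 전남 | 50 | 0 | 14 | 0 | 0 | 1
58 | 미처리 | 대구 | 19 | 0 | 3 | 45 | 0 | 3
59 | 미처리 | 경북 | 19 | 19 | 12 | 1 | 0 | 5
60 | 미처리 | 부산 | 40 | 2 | 33 | 1 | 0 | 27
61 | 미처리 | 울산 | 28 | 0 | 6 | 0 | 0 | 6
62 | 미처리 | 경남 | 37 | 0 | 7 | 0 | 0 | 13
63 | 미처리 | 제주 | 16 | 0 | 0 | 0 | 0 | 8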