Overview

Dataset statistics

Number of variables: 6
Number of observations: 41
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.2 KiB
Average record size in memory: 55.2 B

Variable types

Numeric: 4
Text: 2

Dataset

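Description: Data on the tax revenue administered by the National Tax Service (국세청) of Korea, with the columns 연도 (year), 국세 (national tax), 국세청세수 (National Tax Service revenue), 일반회계 (general account), 특별회계 (special account), and 구성비 (composition ratio). Amounts in the revenue table are in units of 100 million won (억원). Items for which no statistics were produced are marked "**".
URL: https://www.data.go.kr/data/15070687/fileData.do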
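
Because items without statistics are written as "**", the two account columns are profiled as Text and the report counts zero missing cells. The following is a minimal sketch of treating "**" as missing while loading. The inline CSV layout and the helper name `parse_amount` are assumptions made for illustration, not part of the published file.

```python
import csv
import io

# Three rows copied from the report's sample; the CSV layout itself is assumed.
RAW = """연도,국세(억원),국세청세수(억원),일반회계(억원),특별회계(억원),구성비
1966,951,700,**,**,73.6
2021,3440782,3344714,3242768,101946,97.2
2022,3959393,3842495,3748346,94149,97.0
"""


def parse_amount(cell):
    """Return an int amount in 억원, or None for the '**' placeholder."""
    cell = cell.strip()
    return None if cell == "**" else int(cell)


rows = []
for record in csv.DictReader(io.StringIO(RAW)):
    rows.append({
        "year": int(record["연도"]),
        "general": parse_amount(record["일반회계(억원)"]),
        "special": parse_amount(record["특별회계(억원)"]),
    })

# Count rows whose general-account amount is a placeholder.
missing = sum(1 for r in rows if r["general"] is None)
print(missing)
```

Parsed this way, the placeholder rows would show up as missing values rather than as text, which may be closer to what the description intends.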

Alerts

연도 is highly overall correlated with 국세(억원) and 2 other fields (High correlation)
국세(억원) is highly overall correlated with 연도 and 2 other fields (High correlation)
국세청세수(억원) is highly overall correlated with 연도 and 2 other fields (High correlation)
구성비 is highly overall correlated with 연도 and 2 other fields (High correlation)
연도 has unique values (Unique)
국세(억원) has unique values (Unique)
국세청세수(억원) has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 05:01:14.778370
Analysis finished: 2023-12-12 05:01:16.969618
Duration: 2.19 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연도
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 41
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2001.1951
Minimum: 1966
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 501.0 B

Quantile statistics

Minimum: 1966
5-th percentile: 1980
Q1: 1992
Median: 2002
Q3: 2012
95-th percentile: 2020
Maximum: 2022
Range: 56
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 13.63125
Coefficient of variation (CV): 0.0068115547
Kurtosis: -0.026508039
Mean: 2001.1951
Median Absolute Deviation (MAD): 10
Skewness: -0.53348863
Sum: 82049
Variance: 185.81098
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=41)
Value              Count  Frequency (%)
1966               1      2.4%
2013               1      2.4%
2005               1      2.4%
2006               1      2.4%
2007               1      2.4%
2008               1      2.4%
2009               1      2.4%
2010               1      2.4%
2011               1      2.4%
2012               1      2.4%
Other values (31)  31     75.6%

Value  Count  Frequency (%)
1966   1      2.4%
1970   1      2.4%
1980   1      2.4%
1985   1      2.4%
1986   1      2.4%
1987   1      2.4%
1988   1      2.4%
1989   1      2.4%
1990   1      2.4%
1991   1      2.4%

Value  Count  Frequency (%)
2022   1      2.4%
2021   1      2.4%
2020   1      2.4%
2019   1      2.4%
2018   1      2.4%
2017   1      2.4%
2016   1      2.4%
2015   1      2.4%
2014   1      2.4%
2013   1      2.4%
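
The summary figures for 연도 can be cross-checked with the standard library. This sketch assumes the 41 years are 1966, 1970, 1980, and every year from 1985 to 2022; the report lists only some of them explicitly, but this reconstruction agrees with the reported count of 41 and sum of 82049. It also assumes the report's variance uses the sample (n - 1) denominator.

```python
import statistics

# Assumption: the full list of years, reconstructed from the report's
# extreme values, count (41) and sum (82049).
years = [1966, 1970, 1980] + list(range(1985, 2023))

mean = statistics.mean(years)
variance = statistics.variance(years)  # sample variance, n - 1 denominator
std = statistics.stdev(years)
cv = std / mean                        # coefficient of variation
median = statistics.median(years)
# Median absolute deviation: median of |x - median|.
mad = statistics.median([abs(y - median) for y in years])
strictly_increasing = all(a < b for a, b in zip(years, years[1:]))

print(len(years), sum(years), median, mad, strictly_increasing)
print(round(mean, 4), round(variance, 5), round(cv, 10))
```

If the reconstruction is right, these values should line up with the Mean, Variance, Coefficient of variation, Median, MAD and Monotonicity entries for this variable.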

국세(억원)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 41
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1269700
Minimum: 951
Maximum: 3959393
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 501.0 B

Quantile statistics

Minimum: 951
5-th percentile: 58077
Q1: 352184
Median: 1039678
Q3: 2019065
95-th percentile: 2935704
Maximum: 3959393
Range: 3958442
Interquartile range (IQR): 1666881

Descriptive statistics

Standard deviation: 1049310.1
Coefficient of variation (CV): 0.82642363
Kurtosis: -0.29535391
Mean: 1269700
Median Absolute Deviation (MAD): 771204
Skewness: 0.7305905
Sum: 52057702
Variance: 1.1010517 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=41)
Value              Count  Frequency (%)
951                1      2.4%
2019065            1      2.4%
1274657            1      2.4%
1380443            1      2.4%
1614591            1      2.4%
1673060            1      2.4%
1645407            1      2.4%
1777184            1      2.4%
1923812            1      2.4%
2030149            1      2.4%
Other values (31)  31     75.6%

Value    Count  Frequency (%)
951      1      2.4%
3648     1      2.4%
58077    1      2.4%
118764   1      2.4%
136063   1      2.4%
163437   1      2.4%
194842   1      2.4%
212341   1      2.4%
268474   1      2.4%
303198   1      2.4%

Value    Count  Frequency (%)
3959393  1      2.4%
3440782  1      2.4%
2935704  1      2.4%
2934543  1      2.4%
2855462  1      2.4%
2653849  1      2.4%
2425617  1      2.4%
2178851  1      2.4%
2055198  1      2.4%
2030149  1      2.4%

국세청세수(억원)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 41
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1202395
Minimum: 700
Maximum: 3842495
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 501.0 B

Quantile statistics

Minimum: 700
5-th percentile: 42177
Q1: 320853
Median: 966166
Q3: 1902353
95-th percentile: 2844127
Maximum: 3842495
Range: 3841795
Interquartile range (IQR): 1581500

Descriptive statistics

Standard deviation: 1021055.3
Coefficient of variation (CV): 0.84918462
Kurtosis: -0.21473444
Mean: 1202395
Median Absolute Deviation (MAD): 739388
Skewness: 0.77172193
Sum: 49298194
Variance: 1.042554 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=41)
Value              Count  Frequency (%)
700                1      2.4%
1902353            1      2.4%
1204237            1      2.4%
1302609            1      2.4%
1530628            1      2.4%
1575286            1      2.4%
1543305            1      2.4%
1660149            1      2.4%
1801532            1      2.4%
1920926            1      2.4%
Other values (31)  31     75.6%

Value    Count  Frequency (%)
700      1      2.4%
2838     1      2.4%
42177    1      2.4%
89416    1      2.4%
100990   1      2.4%
120159   1      2.4%
150838   1      2.4%
180780   1      2.4%
226778   1      2.4%
269854   1      2.4%

Value    Count  Frequency (%)
3842495  1      2.4%
3344714  1      2.4%
2844127  1      2.4%
2835355  1      2.4%
2772753  1      2.4%
2555932  1      2.4%
2333291  1      2.4%
2081615  1      2.4%
1957271  1      2.4%
1920926  1      2.4%

일반회계(억원)
Text

Distinct: 22
Distinct (%): 53.7%
Missing: 0
Missing (%): 0.0%
Memory size: 460.0 B

Length

Max length: 7
Median length: 6
Mean length: 4.5121951
Min length: 2

Characters and Unicode

Total characters: 185
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 21
Unique (%): 51.2%

Sample

1st row: **
2nd row: **
3rd row: **
4th row: **
5th row: **
Value              Count  Frequency (%)
**                 20     48.8%
876844             1      2.4%
3242768            1      2.4%
2692186            1      2.4%
2781488            1      2.4%
2770757            1      2.4%
2499810            1      2.4%
2276690            1      2.4%
2023357            1      2.4%
1906128            1      2.4%
Other values (12)  12     29.3%

Most occurring characters

Value  Count  Frequency (%)
*      40     21.6%
1      21     11.4%
8      18     9.7%
7      18     9.7%
2      17     9.2%
6      14     7.6%
4      14     7.6%
0      13     7.0%
3      13     7.0%
9      11     5.9%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     145    78.4%
Other Punctuation  40     21.6%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      21     14.5%
8      18     12.4%
7      18     12.4%
2      17     11.7%
6      14     9.7%
4      14     9.7%
0      13     9.0%
3      13     9.0%
9      11     7.6%
5      6      4.1%
Other Punctuation
Value  Count  Frequency (%)
*      40     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  185    100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
*      40     21.6%
1      21     11.4%
8      18     9.7%
7      18     9.7%
2      17     9.2%
6      14     7.6%
4      14     7.6%
0      13     7.0%
3      13     7.0%
9      11     5.9%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  185    100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
*      40     21.6%
1      21     11.4%
8      18     9.7%
7      18     9.7%
2      17     9.2%
6      14     7.6%
4      14     7.6%
0      13     7.0%
3      13     7.0%
9      11     5.9%

특별회계(억원)
Text

Distinct: 22
Distinct (%): 53.7%
Missing: 0
Missing (%): 0.0%
Memory size: 460.0 B

Length

Max length: 6
Median length: 5
Mean length: 3.5609756
Min length: 2

Characters and Unicode

Total characters: 146
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 21
Unique (%): 51.2%

Sample

1st row: **
2nd row: **
3rd row: **
4th row: **
5th row: **
Value              Count  Frequency (%)
**                 20     48.8%
89322              1      2.4%
101946             1      2.4%
80567              1      2.4%
62639              1      2.4%
64598              1      2.4%
56122              1      2.4%
56601              1      2.4%
58258              1      2.4%
51143              1      2.4%
Other values (12)  12     29.3%

Most occurring characters

Value  Count  Frequency (%)
*      40     27.4%
5      18     12.3%
4      16     11.0%
8      14     9.6%
6      12     8.2%
1      10     6.8%
9      9      6.2%
3      8      5.5%
2      7      4.8%
7      7      4.8%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     106    72.6%
Other Punctuation  40     27.4%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
5      18     17.0%
4      16     15.1%
8      14     13.2%
6      12     11.3%
1      10     9.4%
9      9      8.5%
3      8      7.5%
2      7      6.6%
7      7      6.6%
0      5      4.7%
Other Punctuation
Value  Count  Frequency (%)
*      40     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  146    100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
*      40     27.4%
5      18     12.3%
4      16     11.0%
8      14     9.6%
6      12     8.2%
1      10     6.8%
9      9      6.2%
3      8      5.5%
2      7      4.8%
7      7      4.8%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  146    100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
*      40     27.4%
5      18     12.3%
4      16     11.0%
8      14     9.6%
6      12     8.2%
1      10     6.8%
9      9      6.2%
3      8      5.5%
2      7      4.8%
7      7      4.8%

구성비
Real number (ℝ)

HIGH CORRELATION 

Distinct: 34
Distinct (%): 82.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 90.268293
Minimum: 72.6
Maximum: 97.2
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 501.0 B

Quantile statistics

Minimum: 72.6
5-th percentile: 73.6
Q1: 91
Median: 93.4
Q3: 94.6
95-th percentile: 97
Maximum: 97.2
Range: 24.6
Interquartile range (IQR): 3.6

Descriptive statistics

Standard deviation: 7.5828569
Coefficient of variation (CV): 0.084003549
Kurtosis: 0.63467429
Mean: 90.268293
Median Absolute Deviation (MAD): 2.3
Skewness: -1.4340626
Sum: 3701
Variance: 57.49972
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=34)
Value              Count  Frequency (%)
91.1               3      7.3%
93.2               3      7.3%
93.4               2      4.9%
94.2               2      4.9%
93.6               2      4.9%
96.9               1      2.4%
96.6               1      2.4%
96.3               1      2.4%
96.2               1      2.4%
94.5               1      2.4%
Other values (24)  24     58.5%

Value  Count  Frequency (%)
72.6   1      2.4%
73.5   1      2.4%
73.6   1      2.4%
74.2   1      2.4%
75.3   1      2.4%
77.4   1      2.4%
77.8   1      2.4%
84.5   1      2.4%
85.1   1      2.4%
89.0   1      2.4%

Value  Count  Frequency (%)
97.2   1      2.4%
97.1   1      2.4%
97.0   1      2.4%
96.9   1      2.4%
96.6   1      2.4%
96.3   1      2.4%
96.2   1      2.4%
95.5   1      2.4%
95.2   1      2.4%
94.8   1      2.4%

Interactions

[16 interaction plots (images) omitted]

Correlations

                   연도    국세(억원)  국세청세수(억원)  일반회계(억원)  특별회계(억원)  구성비
연도               1.000   0.856       0.829             0.000           0.000           0.925
국세(억원)         0.856   1.000       0.998             0.963           0.963           0.000
국세청세수(억원)   0.829   0.998       1.000             0.963           0.963           0.346
일반회계(억원)     0.000   0.963       0.963             1.000           1.000           0.000
특별회계(억원)     0.000   0.963       0.963             1.000           1.000           0.000
구성비             0.925   0.000       0.346             0.000           0.000           1.000

                   연도    국세(억원)  국세청세수(억원)  구성비
연도               1.000   0.999       0.999             0.964
국세(억원)         0.999   1.000       1.000             0.962
국세청세수(억원)   0.999   1.000       1.000             0.962
구성비             0.964   0.962       0.962             1.000
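
The report does not state which coefficient each matrix uses. As one illustration of how a pairwise value of this kind can be computed, here is a sketch of Spearman rank correlation between 연도 and 국세(억원), restricted to the 20 rows listed in the report's Sample section. The function name `spearman_no_ties` is hypothetical, the shortcut formula applies only when there are no tied values, and the result is not expected to reproduce the 41-row figures.

```python
def spearman_no_ties(xs, ys):
    """Spearman rank correlation for equal-length samples without tied values."""
    if len(xs) != len(ys):
        raise ValueError("samples must have the same length")
    if len(set(xs)) != len(xs) or len(set(ys)) != len(ys):
        raise ValueError("this shortcut formula assumes no ties")
    n = len(xs)
    # Rank 1 is the smallest value in each sample.
    rank_x = {v: r for r, v in enumerate(sorted(xs), start=1)}
    rank_y = {v: r for r, v in enumerate(sorted(ys), start=1)}
    d_squared = sum((rank_x[x] - rank_y[y]) ** 2 for x, y in zip(xs, ys))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# 연도 and 국세(억원) for the first 10 and last 10 rows of the dataset.
years = [1966, 1970, 1980, 1985, 1986, 1987, 1988, 1989, 1990, 1991,
         2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022]
national_tax = [951, 3648, 58077, 118764, 136063, 163437, 194842, 212341,
                268474, 303198, 2019065, 2055198, 2178851, 2425617, 2653849,
                2935704, 2934543, 2855462, 3440782, 3959393]

rho = spearman_no_ties(years, national_tax)
print(round(rho, 3))
```

The only rank disagreements in these rows come from 2018 to 2020, where 국세(억원) falls while 연도 rises, which is also why the variable is reported as not monotonic.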

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

    연도   국세(억원)  국세청세수(억원)  일반회계(억원)  특별회계(억원)  구성비
0   1966   951         700               **              **              73.6
1   1970   3648        2838              **              **              77.8
2   1980   58077       42177             **              **              72.6
3   1985   118764      89416             **              **              75.3
4   1986   136063      100990            **              **              74.2
5   1987   163437      120159            **              **              73.5
6   1988   194842      150838            **              **              77.4
7   1989   212341      180780            **              **              85.1
8   1990   268474      226778            **              **              84.5
9   1991   303198      269854            **              **              89.0

Last rows

    연도   국세(억원)  국세청세수(억원)  일반회계(억원)  특별회계(억원)  구성비
31  2013   2019065     1902353           1848208         54145           94.2
32  2014   2055198     1957271           1906128         51143           95.2
33  2015   2178851     2081615           2023357         58258           95.5
34  2016   2425617     2333291           2276690         56601           96.2
35  2017   2653849     2555932           2499810         56122           96.3
36  2018   2935704     2835355           2770757         64598           96.6
37  2019   2934543     2844127           2781488         62639           96.9
38  2020   2855462     2772753           2692186         80567           97.1
39  2021   3440782     3344714           3242768         101946          97.2
40  2022   3959393     3842495           3748346         94149           97.0
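
In the rows shown, 구성비 appears to equal 국세청세수 as a percentage of 국세, rounded to one decimal place, and the general and special accounts appear to sum to 국세청세수. Both relationships are inferred from the numbers and are not stated by the source. A minimal sketch that checks them on three of the rows above (the function name `check_row` is hypothetical):

```python
# (year, national tax, NTS revenue, general account, special account, ratio)
# Amounts are in 억원; values copied from the last rows of the sample.
rows = [
    (2020, 2855462, 2772753, 2692186, 80567, 97.1),
    (2021, 3440782, 3344714, 3242768, 101946, 97.2),
    (2022, 3959393, 3842495, 3748346, 94149, 97.0),
]


def check_row(row):
    """Return (ratio_matches, accounts_match) for one sample row."""
    year, national, nts, general, special, ratio = row
    share = round(nts / national * 100, 1)  # NTS revenue as % of national tax
    return share == ratio, general + special == nts


results = [check_row(row) for row in rows]
for row, (ratio_ok, accounts_ok) in zip(rows, results):
    print(row[0], ratio_ok, accounts_ok)
```

Rows where the account columns contain "**" cannot be checked for the sum, so only the ratio test would apply to them.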