Overview

Dataset statistics

Number of variables: 7
Number of observations: 109
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.7 KiB
Average record size in memory: 63.2 B

Variable types

Text: 1
Categorical: 1
Numeric: 5

Dataset

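Description: Earned income percentile data (top 1% at 1,000-quantile resolution)
- 인원: number of persons
- 총급여액: total salary (100 million KRW)
- 근로소득금액: earned income amount (100 million KRW)
- 소득공제액: income deductions (100 million KRW), computed as earned income deduction + personal deduction + pension insurance premium deduction + special income deduction + other income deductions - amount exceeding the income deduction limit
- 과세표준: tax base (100 million KRW)
- 결정세액: determined tax (100 million KRW)
Author: 국세청 (National Tax Service)
URL: https://www.data.go.kr/data/15082063/fileData.do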

Alerts

총급여 is highly overall correlated with 근로소득금액 and 3 other fields [High correlation]
근로소득금액 is highly overall correlated with 총급여 and 3 other fields [High correlation]
소득공제액 is highly overall correlated with 총급여 and 3 other fields [High correlation]
과세표준 is highly overall correlated with 총급여 and 3 other fields [High correlation]
결정세액 is highly overall correlated with 총급여 and 3 other fields [High correlation]
구분 has unique values [Unique]
총급여 has unique values [Unique]
근로소득금액 has unique values [Unique]
소득공제액 has unique values [Unique]
과세표준 has 7 (6.4%) zeros [Zeros]
결정세액 has 18 (16.5%) zeros [Zeros]
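
The zero percentages in the last two alerts follow from the zero counts and the 109 observations reported in the dataset statistics. A minimal Python sketch of that arithmetic; the helper name `zero_share` is illustrative and is not part of ydata-profiling:

```python
# Zero counts and row count as reported in this profile.
N_OBSERVATIONS = 109
ZERO_COUNTS = {"과세표준": 7, "결정세액": 18}


def zero_share(zeros: int, total: int) -> float:
    """Return the percentage of zero entries, rounded to one decimal place."""
    return round(100 * zeros / total, 1)


zero_shares = {
    name: zero_share(count, N_OBSERVATIONS)
    for name, count in ZERO_COUNTS.items()
}
print(zero_shares)
```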

Reproduction

Analysis started: 2024-03-14 17:24:55.852982
Analysis finished: 2024-03-14 17:25:05.107741
Duration: 9.25 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

구분
Text

UNIQUE 

Distinct: 109
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1000.0 B

Length

Max length: 9
Median length: 6
Mean length: 6.2110092
Min length: 5

Characters and Unicode

Total characters: 677
Distinct characters: 16
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
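
As a rough illustration of how such character properties can be obtained, the sketch below rebuilds the 109 labels and tallies their Unicode general categories with Python's standard `unicodedata` module. The label list is an assumption inferred from the sample rows (ten `상위 0.x% 내` labels followed by `상위2%내` through `상위100%내`); it is not read from the source file.

```python
import unicodedata
from collections import Counter

# Assumed reconstruction of the 구분 column (not loaded from the real file):
# ten labels in 0.1% steps, then one label per percent from 2% to 100%.
labels = [f"상위 {k / 10:.1f}% 내" for k in range(1, 11)]
labels += [f"상위{p}%내" for p in range(2, 101)]

text = "".join(labels)

# Two-letter general category codes: Lo = Other Letter, Nd = Decimal Number,
# Po = Other Punctuation, Zs = Space Separator.
category_counts = Counter(unicodedata.category(ch) for ch in text)

total_characters = len(text)
distinct_characters = len(set(text))
mean_length = total_characters / len(labels)

print(total_characters, distinct_characters, round(mean_length, 7))
print(category_counts.most_common())
```

Under this assumption the totals agree with the figures in this section: 677 characters, 16 distinct characters, and four categories.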

Unique

Unique: 109
Unique (%): 100.0%

Sample

1st row: 상위 0.1% 내
2nd row: 상위 0.2% 내
3rd row: 상위 0.3% 내
4th row: 상위 0.4% 내
5th row: 상위 0.5% 내

Value               Count   Frequency (%)
상위                10      7.8%
내                  10      7.8%
상위60%내           1       0.8%
상위71%내           1       0.8%
상위70%내           1       0.8%
상위69%내           1       0.8%
상위68%내           1       0.8%
상위67%내           1       0.8%
상위66%내           1       0.8%
상위65%내           1       0.8%
Other values (101)  101     78.3%

Most occurring characters

Value               Count   Frequency (%)
상                  109     16.1%
위                  109     16.1%
%                   109     16.1%
내                  109     16.1%
1                   22      3.2%
0                   21      3.1%
4                   21      3.1%
5                   21      3.1%
3                   21      3.1%
6                   21      3.1%
Other values (6)    114     16.8%

Most occurring categories

Value               Count   Frequency (%)
Other Letter        327     48.3%
Decimal Number      211     31.2%
Other Punctuation   119     17.6%
Space Separator     20      3.0%

Most frequent character per category

Decimal Number
Value     Count   Frequency (%)
1         22      10.4%
0         21      10.0%
4         21      10.0%
5         21      10.0%
3         21      10.0%
6         21      10.0%
7         21      10.0%
8         21      10.0%
9         21      10.0%
2         21      10.0%

Other Letter
Value     Count   Frequency (%)
상        109     33.3%
위        109     33.3%
내        109     33.3%

Other Punctuation
Value     Count   Frequency (%)
%         109     91.6%
.         10      8.4%

Space Separator
Value     Count   Frequency (%)
(space)   20      100.0%

Most occurring scripts

Value     Count   Frequency (%)
Common    350     51.7%
Hangul    327     48.3%

Most frequent character per script

Common
Value              Count   Frequency (%)
%                  109     31.1%
1                  22      6.3%
0                  21      6.0%
4                  21      6.0%
5                  21      6.0%
3                  21      6.0%
6                  21      6.0%
7                  21      6.0%
8                  21      6.0%
9                  21      6.0%
Other values (3)   51      14.6%

Hangul
Value              Count   Frequency (%)
상                 109     33.3%
위                 109     33.3%
내                 109     33.3%

Most occurring blocks

Value     Count   Frequency (%)
ASCII     350     51.7%
Hangul    327     48.3%

Most frequent character per block

Hangul
Value              Count   Frequency (%)
상                 109     33.3%
위                 109     33.3%
내                 109     33.3%

ASCII
Value              Count   Frequency (%)
%                  109     31.1%
1                  22      6.3%
0                  21      6.0%
4                  21      6.0%
5                  21      6.0%
3                  21      6.0%
6                  21      6.0%
7                  21      6.0%
8                  21      6.0%
9                  21      6.0%
Other values (3)   51      14.6%

인원
Categorical

Distinct: 4
Distinct (%): 3.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1000.0 B

Value     Count
205396    85
205397    14
20540     6
20539     4

Length

Max length: 6
Median length: 6
Mean length: 5.9082569
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20539
2nd row: 20540
3rd row: 20539
4th row: 20540
5th row: 20540

Common Values

Value     Count   Frequency (%)
205396    85      78.0%
205397    14      12.8%
20540     6       5.5%
20539     4       3.7%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: bar chart of the value counts for 인원]

총급여
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 109
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 79400.495
Minimum: 433
Maximum: 339553
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 433
5-th percentile: 7833.6
Q1: 39637
Median: 61515
Q3: 102613
95-th percentile: 203910.4
Maximum: 339553
Range: 339120
Interquartile range (IQR): 62976

Descriptive statistics

Standard deviation: 63459.94
Coefficient of variation (CV): 0.79923859
Kurtosis: 2.8759159
Mean: 79400.495
Median Absolute Deviation (MAD): 29692
Skewness: 1.5574539
Sum: 8654654
Variance: 4.027164 × 10^9
Monotonicity: Not monotonic
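
Several of the 총급여 figures are derived from others in the same panels. As a quick consistency check, this sketch recomputes the range, interquartile range, coefficient of variation, and variance from the reported minimum, maximum, quartiles, mean, and standard deviation. The variable names are illustrative.

```python
# Values copied from the 총급여 statistics in this report.
minimum, maximum = 433, 339553
q1, q3 = 39637, 102613
mean, std = 79400.495, 63459.94

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance is the squared standard deviation

print(value_range, iqr, round(cv, 6), f"{variance:.4e}")
```

Small differences in the last digits are expected, because the inputs are themselves rounded in the report.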
Histogram with fixed size bins (bins=50)

Common values
Value               Count   Frequency (%)
202921              1       0.9%
52961               1       0.9%
42689               1       0.9%
44141               1       0.9%
45387               1       0.9%
46648               1       0.9%
47242               1       0.9%
47853               1       0.9%
48801               1       0.9%
49330               1       0.9%
Other values (99)   99      90.8%

Minimum 10 values
Value     Count   Frequency (%)
433       1       0.9%
1781      1       0.9%
3183      1       0.9%
4429      1       0.9%
5809      1       0.9%
7302      1       0.9%
8631      1       0.9%
10117     1       0.9%
11685     1       0.9%
13001     1       0.9%

Maximum 10 values
Value     Count   Frequency (%)
339553    1       0.9%
285277    1       0.9%
254864    1       0.9%
234303    1       0.9%
217721    1       0.9%
204570    1       0.9%
202921    1       0.9%
194353    1       0.9%
185584    1       0.9%
177859    1       0.9%

근로소득금액
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 109
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 61085.771
Minimum: 130
Maximum: 306645
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 130
5-th percentile: 2350.2
Q1: 24244
Median: 44377
Q3: 79906
95-th percentile: 182047.6
Maximum: 306645
Range: 306515
Interquartile range (IQR): 55662

Descriptive statistics

Standard deviation: 57531.038
Coefficient of variation (CV): 0.94180752
Kurtosis: 3.6520725
Mean: 61085.771
Median Absolute Deviation (MAD): 26464
Skewness: 1.7737694
Sum: 6658349
Variance: 3.3098204 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value               Count   Frequency (%)
198917              1       0.9%
34234               1       0.9%
25502               1       0.9%
26736               1       0.9%
27796               1       0.9%
28868               1       0.9%
29373               1       0.9%
29892               1       0.9%
30697               1       0.9%
31148               1       0.9%
Other values (99)   99      90.8%

Minimum 10 values
Value     Count   Frequency (%)
130       1       0.9%
534       1       0.9%
955       1       0.9%
1329      1       0.9%
1743      1       0.9%
2191      1       0.9%
2589      1       0.9%
3070      1       0.9%
3930      1       0.9%
4720      1       0.9%

Maximum 10 values
Value     Count   Frequency (%)
306645    1       0.9%
253399    1       0.9%
223582    1       0.9%
203432    1       0.9%
198917    1       0.9%
187180    1       0.9%
174349    1       0.9%
164610    1       0.9%
156280    1       0.9%
148941    1       0.9%

소득공제액
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 109
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 33994.339
Minimum: 433
Maximum: 74304
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 433
5-th percentile: 7460.8
Q1: 18570
Median: 32954
Q3: 48147
95-th percentile: 64354.2
Maximum: 74304
Range: 73871
Interquartile range (IQR): 29577

Descriptive statistics

Standard deviation: 18823.669
Coefficient of variation (CV): 0.55372953
Kurtosis: -0.91732801
Mean: 33994.339
Median Absolute Deviation (MAD): 14655
Skewness: 0.13945614
Sum: 3705383
Variance: 3.5433053 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value               Count   Frequency (%)
13206               1       0.9%
30843               1       0.9%
26317               1       0.9%
27165               1       0.9%
27558               1       0.9%
28201               1       0.9%
28919               1       0.9%
28683               1       0.9%
29902               1       0.9%
28829               1       0.9%
Other values (99)   99      90.8%

Minimum 10 values
Value     Count   Frequency (%)
433       1       0.9%
1781      1       0.9%
3183      1       0.9%
4429      1       0.9%
5809      1       0.9%
7302      1       0.9%
7699      1       0.9%
7748      1       0.9%
7837      1       0.9%
7916      1       0.9%

Maximum 10 values
Value     Count   Frequency (%)
74304     1       0.9%
70644     1       0.9%
68309     1       0.9%
67732     1       0.9%
65952     1       0.9%
64603     1       0.9%
63981     1       0.9%
63804     1       0.9%
63075     1       0.9%
62261     1       0.9%

과세표준
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 103
Distinct (%): 94.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 45406.183
Minimum: 0
Maximum: 265250
Zeros: 7
Zeros (%): 6.4%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 14992
Median: 31576
Q3: 57727
95-th percentile: 147048.6
Maximum: 265250
Range: 265250
Interquartile range (IQR): 42735

Descriptive statistics

Standard deviation: 48556.587
Coefficient of variation (CV): 1.0693827
Kurtosis: 5.1204881
Mean: 45406.183
Median Absolute Deviation (MAD): 20365
Skewness: 2.086993
Sum: 4949274
Variance: 2.3577421 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value               Count   Frequency (%)
0                   7       6.4%
189715              1       0.9%
24533               1       0.9%
19170               1       0.9%
18899               1       0.9%
20501               1       0.9%
20294               1       0.9%
20908               1       0.9%
21499               1       0.9%
22119               1       0.9%
Other values (93)   93      85.3%

Minimum 10 values
Value     Count   Frequency (%)
0         7       6.4%
21        1       0.9%
357       1       0.9%
1055      1       0.9%
1758      1       0.9%
2390      1       0.9%
3114      1       0.9%
3863      1       0.9%
4563      1       0.9%
5326      1       0.9%

Maximum 10 values
Value     Count   Frequency (%)
265250    1       0.9%
214634    1       0.9%
189715    1       0.9%
186554    1       0.9%
166571    1       0.9%
151769    1       0.9%
139968    1       0.9%
130372    1       0.9%
121780    1       0.9%
114784    1       0.9%

결정세액
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 92
Distinct (%): 84.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5427.1651
Minimum: 0
Maximum: 72145
Zeros: 18
Zeros (%): 16.5%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 135
Median: 827
Q3: 6185
95-th percentile: 23494.4
Maximum: 72145
Range: 72145
Interquartile range (IQR): 6050

Descriptive statistics

Standard deviation: 10937.154
Coefficient of variation (CV): 2.0152609
Kurtosis: 17.441005
Mean: 5427.1651
Median Absolute Deviation (MAD): 827
Skewness: 3.7838279
Sum: 591561
Variance: 1.1962133 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value               Count   Frequency (%)
0                   18      16.5%
72145               1       0.9%
599                 1       0.9%
322                 1       0.9%
351                 1       0.9%
370                 1       0.9%
394                 1       0.9%
423                 1       0.9%
454                 1       0.9%
498                 1       0.9%
Other values (82)   82      75.2%

Minimum 10 values
Value     Count   Frequency (%)
0         18      16.5%
1         1       0.9%
6         1       0.9%
15        1       0.9%
32        1       0.9%
52        1       0.9%
70        1       0.9%
86        1       0.9%
106       1       0.9%
122       1       0.9%

Maximum 10 values
Value     Count   Frequency (%)
72145     1       0.9%
58046     1       0.9%
40658     1       0.9%
31378     1       0.9%
25771     1       0.9%
24304     1       0.9%
22280     1       0.9%
19567     1       0.9%
17492     1       0.9%
17314     1       0.9%

Interactions

[Figure: 25 pairwise scatter plots of the numeric variables]

Correlations

               인원     총급여   근로소득금액   소득공제액   과세표준   결정세액
인원           1.000    0.000    0.000          0.483        0.098      0.759
총급여         0.000    1.000    0.983          0.935        0.993      0.916
근로소득금액   0.000    0.983    1.000          0.951        0.979      0.942
소득공제액     0.483    0.935    0.951          1.000        0.903      0.654
과세표준       0.098    0.993    0.979          0.903        1.000      0.916
결정세액       0.759    0.916    0.942          0.654        0.916      1.000

               총급여   근로소득금액   소득공제액   과세표준   결정세액   인원
총급여         1.000    0.988          0.908        0.964      0.873      0.000
근로소득금액   0.988    1.000          0.852        0.993      0.933      0.000
소득공제액     0.908    0.852          1.000        0.795      0.637      0.289
과세표준       0.964    0.993          0.795        1.000      0.966      0.050
결정세액       0.873    0.933          0.637        0.966      1.000      0.417
인원           0.000    0.000          0.289        0.050      0.417      1.000
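
The High correlation alerts can be related back to the first matrix in this section. The sketch below flags, for each numeric variable, the other numeric variables whose coefficient meets a cut-off. The 0.5 cut-off and the restriction to the five numeric columns are assumptions, chosen because they reproduce the alert counts; they are not taken from this report's configuration.

```python
# Coefficients between the five numeric variables, copied from the first matrix.
NUMERIC = ["총급여", "근로소득금액", "소득공제액", "과세표준", "결정세액"]
MATRIX = [
    [1.000, 0.983, 0.935, 0.993, 0.916],
    [0.983, 1.000, 0.951, 0.979, 0.942],
    [0.935, 0.951, 1.000, 0.903, 0.654],
    [0.993, 0.979, 0.903, 1.000, 0.916],
    [0.916, 0.942, 0.654, 0.916, 1.000],
]
THRESHOLD = 0.5  # assumed cut-off, not read from config.json


def correlated_partners(names, matrix, threshold):
    """Map each variable to the other variables with |coefficient| >= threshold."""
    return {
        name: [
            other
            for j, other in enumerate(names)
            if j != i and abs(matrix[i][j]) >= threshold
        ]
        for i, name in enumerate(names)
    }


partners = correlated_partners(NUMERIC, MATRIX, THRESHOLD)
for name, others in partners.items():
    print(f"{name}: {len(others)} highly correlated fields")
```

With these assumptions every numeric variable has four partners, which matches the "and 3 other fields" wording of the alerts.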

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Row   구분            인원     총급여   근로소득금액   소득공제액   과세표준   결정세액
0     상위 0.1% 내    20539    202921   198917         13206        189715     72145
1     상위 0.2% 내    20540    85586    81555          9801         75785      24304
2     상위 0.3% 내    20539    67201    63297          8978         58224      17492
3     상위 0.4% 내    20540    57912    54184          8497         49414      14199
4     상위 0.5% 내    20540    51903    48285          8201         43702      12073
5     상위 0.6% 내    20539    47924    44377          8001         39923      10666
6     상위 0.7% 내    20540    44818    41328          7916         36901      9548
7     상위 0.8% 내    20539    42476    39033          7837         34639      8696
8     상위 0.9% 내    20540    40673    37260          7748         32925      8066
9     상위 1.0% 내    20540    39154    35769          7699         31455      7521

Row   구분            인원     총급여   근로소득금액   소득공제액   과세표준   결정세액
99    상위91%내       205396   13001    4720           11947        1055       0
100   상위92%내       205396   11685    3930           11328        357        0
101   상위93%내       205397   10117    3070           10096        21         0
102   상위94%내       205396   8631     2589           8631         0          0
103   상위95%내       205396   7302     2191           7302         0          0
104   상위96%내       205396   5809     1743           5809         0          0
105   상위97%내       205396   4429     1329           4429         0          0
106   상위98%내       205396   3183     955            3183         0          0
107   상위99%내       205396   1781     534            1781         0          0
108   상위100%내      205397   433      130            433          0          0
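
Because the amounts are bracket totals (in units of 100 million KRW, per the dataset description) while 인원 is a head count, dividing one by the other gives a per-person average. A small sketch using the first three sample rows; the tuple layout and the function name are illustrative.

```python
# (구분, 인원, 총급여) for the first three sample rows; amounts in 100 million KRW.
rows = [
    ("상위 0.1% 내", 20539, 202921),
    ("상위 0.2% 내", 20540, 85586),
    ("상위 0.3% 내", 20539, 67201),
]


def per_person(total_amount: float, persons: int) -> float:
    """Average amount per person, in the same unit as total_amount."""
    return total_amount / persons


averages = {label: per_person(total, persons) for label, persons, total in rows}
for label, value in averages.items():
    print(f"{label}: {value:.2f}")
```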