Overview

Dataset statistics

Number of variables: 6
Number of observations: 133
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.9 KiB
Average record size in memory: 53.0 B

Variable types

Categorical: 1
Text: 1
Numeric: 4

Dataset

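Description: Status of delinquent tax resolution by tax office as of the end of 2022. The delinquent amount under resolution (정리중체납) is the amount subject to resolution (정리대상체납액) minus the cash-resolved amount (현금정리) and the amount with resolution deferred (정리보류). Unit: million KRW.
Author: 공공데이터포털 (Public Data Portal)
URL: https://www.data.go.kr/data/15113723/fileData.do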
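
The relationship stated in the description can be checked against the data. The following is a minimal sketch that uses a handful of rows taken from the Sample section of this report; the helper names `outstanding` and `check_rows` are illustrative and not part of the dataset or of any library.

```python
# Rows copied from the Sample section of this report, in million KRW:
# (세무서, 정리대상체납액, 현금정리, 정리중체납, 정리보류)
SAMPLE_ROWS = [
    ("종로", 2111, 821, 914, 376),
    ("남대문", 2185, 276, 588, 1321),
    ("마포", 3337, 1224, 1495, 618),
    ("거창", 419, 173, 147, 99),
    ("제주", 6196, 1981, 3105, 1110),
]


def outstanding(total, cash, deferred):
    """Amount still under resolution, per the dataset description."""
    return total - cash - deferred


def check_rows(rows):
    """Return the names of rows whose reported 정리중체납 disagrees."""
    return [name for name, total, cash, pending, deferred in rows
            if outstanding(total, cash, deferred) != pending]


mismatches = check_rows(SAMPLE_ROWS)
print(mismatches)  # an empty list means every checked row satisfies the identity

# The column sums reported under each variable can be checked the same way.
sum_check = outstanding(329848, 114082, 60093)
print(sum_check)
```

The same subtraction applied to the reported column sums gives the reported sum of 정리중체납, which suggests the identity holds across the dataset and not only for the rows shown.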

Alerts

정리대상체납액 is highly overall correlated with 현금정리 and 2 other fields (High correlation)
현금정리 is highly overall correlated with 정리대상체납액 and 2 other fields (High correlation)
정리중체납 is highly overall correlated with 정리대상체납액 and 2 other fields (High correlation)
정리보류 is highly overall correlated with 정리대상체납액 and 2 other fields (High correlation)
세무서 has unique values (Unique)

Reproduction

Analysis started: 2024-04-17 16:23:11.898027
Analysis finished: 2024-04-17 16:23:13.290276
Duration: 1.39 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
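
As a small worked check, the reported duration follows from the two timestamps above. This sketch uses only the Python standard library; the format string is an assumption about how the timestamps are laid out.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of this report.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2024-04-17 16:23:11.898027", FMT)
finished = datetime.strptime("2024-04-17 16:23:13.290276", FMT)

# Elapsed wall-clock time in seconds, rounded to two decimals for display.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```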

Variables

지방청
Categorical

Distinct: 7
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB

Value  Count
서울청  28
중부청  25
부산청  19
대전청  17
인천청  15
Other values (2)  29

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울청
2nd row: 서울청
3rd row: 서울청
4th row: 서울청
5th row: 서울청

Common Values

Value  Count  Frequency (%)
서울청  28  21.1%
중부청  25  18.8%
부산청  19  14.3%
대전청  17  12.8%
인천청  15  11.3%
광주청  15  11.3%
대구청  14  10.5%
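
The frequency column is each count divided by the number of observations. A minimal sketch of that arithmetic, assuming the counts listed above and rounding to one decimal place as the report appears to do; `frequency_table` is an illustrative helper, not a ydata-profiling function.

```python
# Counts copied from the Common Values table for 지방청.
COUNTS = {
    "서울청": 28, "중부청": 25, "부산청": 19, "대전청": 17,
    "인천청": 15, "광주청": 15, "대구청": 14,
}


def frequency_table(counts):
    """Return (value, count, percent) rows, largest count first."""
    total = sum(counts.values())
    rows = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    return [(value, n, round(100 * n / total, 1)) for value, n in rows]


table = frequency_table(COUNTS)
for value, n, pct in table:
    print(f"{value}  {n}  {pct}%")

# Share of distinct values among all observations (reported as Distinct (%)).
distinct_pct = round(100 * len(COUNTS) / sum(COUNTS.values()), 1)
print(distinct_pct)
```

The counts add up to the 133 observations, and the two smallest groups (15 and 14) account for the "Other values (2)" total of 29 in the variable summary.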

Length

Histogram of lengths of the category

세무서
Text

UNIQUE 

Distinct: 133
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.2 KiB

Length

Max length: 4
Median length: 2
Mean length: 2.2481203
Min length: 2

Characters and Unicode

Total characters: 299
Distinct characters: 98
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 133
Unique (%): 100.0%

Sample

1st row: 종로
2nd row: 남대문
3rd row: 마포
4th row: 용산
5th row: 영등포

Value  Count  Frequency (%)
종로  1  0.8%
세종  1  0.8%
여수  1  0.8%
순천  1  0.8%
해남  1  0.8%
나주  1  0.8%
목포  1  0.8%
광산  1  0.8%
서광주  1  0.8%
북광주  1  0.8%
Other values (123)  123  92.5%

Most occurring characters

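Value  Count  Frequency (%)
(missing)  20  6.7%
(missing)  19  6.4%
(missing)  16  5.4%
(missing)  14  4.7%
(missing)  11  3.7%
(missing)  10  3.3%
(missing)  10  3.3%
(missing)  9  3.0%
(missing)  8  2.7%
(missing)  8  2.7%
Other values (88)  174  58.2%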

Most occurring categories

Value  Count  Frequency (%)
Other Letter  299  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  299  100.0%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  299  100.0%

정리대상체납액
Real number (ℝ)

HIGH CORRELATION 

Distinct: 131
Distinct (%): 98.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2480.0602
Minimum: 244
Maximum: 7636
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.3 KiB

Quantile statistics

Minimum: 244
5-th percentile: 422.6
Q1: 1498
Median: 2161
Q3: 3117
95-th percentile: 5774.6
Maximum: 7636
Range: 7392
Interquartile range (IQR): 1619

Descriptive statistics

Standard deviation: 1575.3384
Coefficient of variation (CV): 0.63520168
Kurtosis: 0.44425629
Mean: 2480.0602
Median Absolute Deviation (MAD): 877
Skewness: 0.94440513
Sum: 329848
Variance: 2481691
Monotonicity: Not monotonic
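
Several of the figures above are derived from the others. The sketch below recomputes them from the reported values for 정리대상체납액 as a consistency check; it works only from the rounded numbers printed in this report, so small floating-point differences from the report's own values are expected.

```python
# Figures copied from the 정리대상체납액 summary in this report (million KRW).
minimum, maximum = 244, 7636
q1, q3 = 1498, 3117
mean, std = 2480.0602, 1575.3384
total, n_obs = 329848, 133

value_range = maximum - minimum   # reported Range: 7392
iqr = q3 - q1                     # reported IQR: 1619
cv = std / mean                   # reported CV: 0.63520168
variance = std ** 2               # reported Variance: 2481691
mean_from_sum = total / n_obs     # reported Mean: 2480.0602

print(value_range, iqr, round(cv, 4), round(variance), round(mean_from_sum, 4))
```

The same relationships (range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared) apply to the other three numeric variables.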

Histogram with fixed size bins (bins=50)

Common Values
Value  Count  Frequency (%)
1871  2  1.5%
2049  2  1.5%
2111  1  0.8%
1662  1  0.8%
1260  1  0.8%
1690  1  0.8%
1499  1  0.8%
697  1  0.8%
366  1  0.8%
2465  1  0.8%
Other values (121)  121  91.0%

Minimum 10 values
Value  Count  Frequency (%)
244  1  0.8%
259  1  0.8%
341  1  0.8%
364  1  0.8%
366  1  0.8%
401  1  0.8%
419  1  0.8%
425  1  0.8%
440  1  0.8%
471  1  0.8%

Maximum 10 values
Value  Count  Frequency (%)
7636  1  0.8%
6327  1  0.8%
6196  1  0.8%
6041  1  0.8%
6003  1  0.8%
5970  1  0.8%
5825  1  0.8%
5741  1  0.8%
5587  1  0.8%
5572  1  0.8%

현금정리
Real number (ℝ)

HIGH CORRELATION 

Distinct: 127
Distinct (%): 95.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 857.7594
Minimum: 105
Maximum: 2768
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.3 KiB

Quantile statistics

Minimum: 105
5-th percentile: 183
Q1: 528
Median: 755
Q3: 1128
95-th percentile: 1898.8
Maximum: 2768
Range: 2663
Interquartile range (IQR): 600

Descriptive statistics

Standard deviation: 518.47413
Coefficient of variation (CV): 0.60445171
Kurtosis: 0.81439126
Mean: 857.7594
Median Absolute Deviation (MAD): 263
Skewness: 0.97718839
Sum: 114082
Variance: 268815.43
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values
Value  Count  Frequency (%)
735  2  1.5%
183  2  1.5%
577  2  1.5%
567  2  1.5%
690  2  1.5%
158  2  1.5%
564  1  0.8%
874  1  0.8%
345  1  0.8%
432  1  0.8%
Other values (117)  117  88.0%

Minimum 10 values
Value  Count  Frequency (%)
105  1  0.8%
114  1  0.8%
158  2  1.5%
165  1  0.8%
173  1  0.8%
183  2  1.5%
198  1  0.8%
211  1  0.8%
229  1  0.8%
236  1  0.8%

Maximum 10 values
Value  Count  Frequency (%)
2768  1  0.8%
2185  1  0.8%
2064  1  0.8%
2036  1  0.8%
1981  1  0.8%
1936  1  0.8%
1909  1  0.8%
1892  1  0.8%
1867  1  0.8%
1836  1  0.8%

정리중체납
Real number (ℝ)

HIGH CORRELATION 

Distinct: 129
Distinct (%): 97.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1170.4737
Minimum: 103
Maximum: 4007
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.3 KiB

Quantile statistics

Minimum: 103
5-th percentile: 165.2
Q1: 618
Median: 1003
Q3: 1495
95-th percentile: 2773.6
Maximum: 4007
Range: 3904
Interquartile range (IQR): 877

Descriptive statistics

Standard deviation: 810.46194
Coefficient of variation (CV): 0.69242218
Kurtosis: 1.1773575
Mean: 1170.4737
Median Absolute Deviation (MAD): 428
Skewness: 1.1471499
Sum: 155673
Variance: 656848.55
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values
Value  Count  Frequency (%)
914  2  1.5%
304  2  1.5%
176  2  1.5%
1088  2  1.5%
125  1  0.8%
557  1  0.8%
646  1  0.8%
575  1  0.8%
302  1  0.8%
1275  1  0.8%
Other values (119)  119  89.5%

Minimum 10 values
Value  Count  Frequency (%)
103  1  0.8%
112  1  0.8%
125  1  0.8%
141  1  0.8%
145  1  0.8%
147  1  0.8%
149  1  0.8%
176  2  1.5%
190  1  0.8%
206  1  0.8%

Maximum 10 values
Value  Count  Frequency (%)
4007  1  0.8%
3595  1  0.8%
3380  1  0.8%
3198  1  0.8%
3105  1  0.8%
3038  1  0.8%
3013  1  0.8%
2614  1  0.8%
2567  1  0.8%
2564  1  0.8%

정리보류
Real number (ℝ)

HIGH CORRELATION 

Distinct: 124
Distinct (%): 93.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 451.82707
Minimum: 33
Maximum: 2167
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.3 KiB

Quantile statistics

Minimum: 33
5-th percentile: 68.8
Q1: 233
Median: 357
Q3: 528
95-th percentile: 1311.4
Maximum: 2167
Range: 2134
Interquartile range (IQR): 295

Descriptive statistics

Standard deviation: 371.03017
Coefficient of variation (CV): 0.82117738
Kurtosis: 4.8771239
Mean: 451.82707
Median Absolute Deviation (MAD): 151
Skewness: 2.0092535
Sum: 60093
Variance: 137663.39
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values
Value  Count  Frequency (%)
132  3  2.3%
1323  2  1.5%
305  2  1.5%
420  2  1.5%
341  2  1.5%
361  2  1.5%
252  2  1.5%
254  2  1.5%
299  1  0.8%
83  1  0.8%
Other values (114)  114  85.7%

Minimum 10 values
Value  Count  Frequency (%)
33  1  0.8%
36  1  0.8%
42  1  0.8%
50  1  0.8%
58  1  0.8%
66  1  0.8%
67  1  0.8%
70  1  0.8%
77  1  0.8%
83  1  0.8%

Maximum 10 values
Value  Count  Frequency (%)
2167  1  0.8%
1809  1  0.8%
1670  1  0.8%
1429  1  0.8%
1323  2  1.5%
1321  1  0.8%
1305  1  0.8%
1197  1  0.8%
1175  1  0.8%
1110  1  0.8%

Interactions


Correlations

지방청  정리대상체납액  현금정리  정리중체납  정리보류
지방청  1.000  0.452  0.253  0.404  0.274
정리대상체납액  0.452  1.000  0.914  0.955  0.923
현금정리  0.253  0.914  1.000  0.826  0.805
정리중체납  0.404  0.955  0.826  1.000  0.837
정리보류  0.274  0.923  0.805  0.837  1.000

정리대상체납액  현금정리  정리중체납  정리보류  지방청
정리대상체납액  1.000  0.960  0.976  0.882  0.247
현금정리  0.960  1.000  0.924  0.799  0.133
정리중체납  0.976  0.924  1.000  0.822  0.214
정리보류  0.882  0.799  0.822  1.000  0.139
지방청  0.247  0.133  0.214  0.139  1.000
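
To relate the matrices to the high-correlation alerts, the following sketch lists, for each variable, the other variables whose coefficient meets a cutoff. It uses the first matrix above. The cutoff of 0.5 is an assumption chosen for illustration, since the report does not state the threshold it applied, and `highly_correlated` is a hypothetical helper rather than a ydata-profiling function.

```python
# First correlation matrix from this report; rows and columns follow NAMES.
NAMES = ["지방청", "정리대상체납액", "현금정리", "정리중체납", "정리보류"]
MATRIX = [
    [1.000, 0.452, 0.253, 0.404, 0.274],
    [0.452, 1.000, 0.914, 0.955, 0.923],
    [0.253, 0.914, 1.000, 0.826, 0.805],
    [0.404, 0.955, 0.826, 1.000, 0.837],
    [0.274, 0.923, 0.805, 0.837, 1.000],
]
THRESHOLD = 0.5  # assumed cutoff for illustration; not stated in the report


def highly_correlated(names, matrix, threshold):
    """Map each variable to the other variables whose |r| meets the cutoff."""
    result = {}
    for i, name in enumerate(names):
        result[name] = [names[j] for j, r in enumerate(matrix[i])
                        if j != i and abs(r) >= threshold]
    return result


partners = highly_correlated(NAMES, MATRIX, THRESHOLD)
for name, others in partners.items():
    print(name, len(others), others)
```

Under this assumed cutoff, each of the four numeric variables is paired with the other three, which matches the "and 2 other fields" wording of the alerts, while 지방청 is paired with none.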

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

지방청  세무서  정리대상체납액  현금정리  정리중체납  정리보류
0  서울청  종로  2111  821  914  376
1  서울청  남대문  2185  276  588  1321
2  서울청  마포  3337  1224  1495  618
3  서울청  용산  3318  1495  1518  305
4  서울청  영등포  3166  1227  1542  397
5  서울청  동작  2344  747  1177  420
6  서울청  강서  3418  1217  1632  569
7  서울청  서대문  2049  503  1293  253
8  서울청  은평  1498  534  751  213
9  서울청  구로  2638  926  1184  528

Last rows

지방청  세무서  정리대상체납액  현금정리  정리중체납  정리보류
123  부산청  거창  419  173  147  99
124  부산청  통영  2389  573  1282  534
125  부산청  진주  2060  730  897  433
126  부산청  해운대  2790  980  1467  343
127  부산청  김해  4296  1521  2062  713
128  부산청  양산  1645  577  721  347
129  부산청  제주  6196  1981  3105  1110
130  부산청  수영  1859  606  907  346
131  부산청  동울산  2482  905  1133  444
132  부산청  금정  1871  638  872  361