Overview

Dataset statistics

Number of variables: 8
Number of observations: 27
Missing cells: 18
Missing cells (%): 8.3%
Duplicate rows: 1
Duplicate rows (%): 3.7%
Total size in memory: 2.0 KiB
Average record size in memory: 75.7 B
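
The percentages and sizes above follow from the raw counts. A minimal Python sketch of that arithmetic, using only the figures reported in this section (the variable names are illustrative and not part of the report):

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 8
n_observations = 27
missing_cells = 18
duplicate_rows = 1
avg_record_size_bytes = 75.7

total_cells = n_variables * n_observations            # 8 * 27 cells
missing_pct = 100 * missing_cells / total_cells       # share of empty cells
duplicate_pct = 100 * duplicate_rows / n_observations
total_size_kib = avg_record_size_bytes * n_observations / 1024

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Total size: {total_size_kib:.1f} KiB")
```

The 18 missing cells also match the alerts: six columns with 3 missing values each.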

Variable types

Categorical: 2
Numeric: 6

Dataset

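Description: Data on the collection rate of delinquent local taxes in Hanam-si, Gyeonggi-do. It provides items such as adjusted carried-over arrears (이월체납조정액), amount collected (징수액), amount written off (결손액), amount outstanding (미수액), and settlement rate (정리율). (Unit: million KRW)
Author: 경기도 하남시 (Hanam-si, Gyeonggi-do)
URL: https://www.data.go.kr/data/15107161/fileData.do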

Alerts

Dataset has 1 (3.7%) duplicate rows [Duplicates]
연도 is highly overall correlated with (unnamed column) and 6 other fields [High correlation]
구분 is highly overall correlated with 이월체납조정액 and 3 other fields [High correlation]
(unnamed column) is highly overall correlated with 정리액(징수액) and 3 other fields [High correlation]
이월체납조정액 is highly overall correlated with 정리액(징수액) and 4 other fields [High correlation]
정리액(징수액) is highly overall correlated with (unnamed column) and 4 other fields [High correlation]
정리액(결손액) is highly overall correlated with (unnamed column) and 4 other fields [High correlation]
미수액 is highly overall correlated with 이월체납조정액 and 3 other fields [High correlation]
정리율 is highly overall correlated with (unnamed column) and 3 other fields [High correlation]
(unnamed column) has 3 (11.1%) missing values [Missing]
이월체납조정액 has 3 (11.1%) missing values [Missing]
정리액(징수액) has 3 (11.1%) missing values [Missing]
정리액(결손액) has 3 (11.1%) missing values [Missing]
미수액 has 3 (11.1%) missing values [Missing]
정리율 has 3 (11.1%) missing values [Missing]

Reproduction

Analysis started: 2024-04-13 13:06:00.016406
Analysis finished: 2024-04-13 13:06:10.673783
Duration: 10.66 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연도
Categorical

HIGH CORRELATION 

Distinct2
Distinct (%)7.4%
Missing0
Missing (%)0.0%
Memory size344.0 B

Length

Max length4
Median length4
Mean length4
Min length4

Unique

Unique0
Unique (%)0.0%

Sample

1st row2023
2nd row2023
3rd row2023
4th row2023
5th row2023

Common Values

ValueCountFrequency (%)
2023 24
88.9%
<NA> 3
 
11.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
2023 24
88.9%
na 3
 
11.1%


(unnamed column)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct12
Distinct (%)50.0%
Missing3
Missing (%)11.1%
Infinite0
Infinite (%)0.0%
Mean6.5
Minimum1
Maximum12
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size371.0 B

Quantile statistics

Minimum1
5-th percentile1.15
Q13.75
median6.5
Q39.25
95-th percentile11.85
Maximum12
Range11
Interquartile range (IQR)5.5

Descriptive statistics

Standard deviation3.5262987
Coefficient of variation (CV)0.54250749
Kurtosis-1.2156934
Mean6.5
Median Absolute Deviation (MAD)3
Skewness0
Sum156
Variance12.434783
MonotonicityIncreasing
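
The value counts reported for this column show each value from 1 to 12 occurring twice among the 24 non-missing observations. Under that assumption, the quantile and descriptive statistics above can be recomputed with Python's standard `statistics` module. The `inclusive` quantile method is chosen here because it reproduces the reported quartiles, not because the report documents which method it uses.

```python
import statistics

# Assumption: values 1..12, each appearing twice (24 non-missing rows).
values = sorted(list(range(1, 13)) * 2)

# Quartiles by linear interpolation over the observed data.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")

mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, divides by n - 1
std_dev = statistics.stdev(values)       # square root of the sample variance
cv = std_dev / mean                      # coefficient of variation
mad = statistics.median(abs(v - median) for v in values)

print("Q1, median, Q3:", q1, median, q3)
print("IQR:", q3 - q1)
print("Variance:", round(variance, 6))
print("MAD:", mad)
```

With these inputs the quartiles come out as 3.75, 6.5 and 9.25, matching the table above.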
Histogram with fixed size bins (bins=12)
ValueCountFrequency (%)
1 2
7.4%
2 2
7.4%
3 2
7.4%
4 2
7.4%
5 2
7.4%
6 2
7.4%
7 2
7.4%
8 2
7.4%
9 2
7.4%
10 2
7.4%
Other values (2) 4
14.8%
(Missing) 3
11.1%
ValueCountFrequency (%)
1 2
7.4%
2 2
7.4%
3 2
7.4%
4 2
7.4%
5 2
7.4%
6 2
7.4%
7 2
7.4%
8 2
7.4%
9 2
7.4%
10 2
7.4%
ValueCountFrequency (%)
12 2
7.4%
11 2
7.4%
10 2
7.4%
9 2
7.4%
8 2
7.4%
7 2
7.4%
6 2
7.4%
5 2
7.4%
4 2
7.4%
3 2
7.4%

구분
Categorical

HIGH CORRELATION 

Distinct3
Distinct (%)11.1%
Missing0
Missing (%)0.0%
Memory size344.0 B

Length

Max length4
Median length2
Mean length2.2222222
Min length2

Unique

Unique0
Unique (%)0.0%

Sample

1st row시세
2nd row도세
3rd row시세
4th row도세
5th row시세

Common Values

ValueCountFrequency (%)
시세 12
44.4%
도세 12
44.4%
<NA> 3
 
11.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
시세 12
44.4%
도세 12
44.4%
na 3
 
11.1%

이월체납조정액
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct24
Distinct (%)100.0%
Missing3
Missing (%)11.1%
Infinite0
Infinite (%)0.0%
Mean14008.083
Minimum6235
Maximum21973
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size371.0 B

Quantile statistics

Minimum6235
5-th percentile6245.6
Q16280
median13902.5
Q321727.25
95-th percentile21925.25
Maximum21973
Range15738
Interquartile range (IQR)15447.25

Descriptive statistics

Standard deviation7889.2383
Coefficient of variation (CV)0.56319184
Kurtosis-2.1892408
Mean14008.083
Median Absolute Deviation (MAD)7655
Skewness0.00071484203
Sum336194
Variance62240081
MonotonicityNot monotonic
Histogram with fixed size bins (bins=24)
ValueCountFrequency (%)
21791 1
 
3.7%
21973 1
 
3.7%
6358 1
 
3.7%
21932 1
 
3.7%
6346 1
 
3.7%
21887 1
 
3.7%
6326 1
 
3.7%
21840 1
 
3.7%
6314 1
 
3.7%
21844 1
 
3.7%
Other values (14) 14
51.9%
(Missing) 3
 
11.1%
ValueCountFrequency (%)
6235 1
3.7%
6245 1
3.7%
6249 1
3.7%
6255 1
3.7%
6264 1
3.7%
6268 1
3.7%
6284 1
3.7%
6287 1
3.7%
6314 1
3.7%
6326 1
3.7%
ValueCountFrequency (%)
21973 1
3.7%
21932 1
3.7%
21887 1
3.7%
21844 1
3.7%
21840 1
3.7%
21791 1
3.7%
21706 1
3.7%
21678 1
3.7%
21624 1
3.7%
21559 1
3.7%

정리액(징수액)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct24
Distinct (%)100.0%
Missing3
Missing (%)11.1%
Infinite0
Infinite (%)0.0%
Mean3718.9167
Minimum605
Maximum7572
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size371.0 B

Quantile statistics

Minimum605
5-th percentile1035.4
Q12168.75
median2580
Q35933.75
95-th percentile7282.5
Maximum7572
Range6967
Interquartile range (IQR)3765

Descriptive statistics

Standard deviation2291.7067
Coefficient of variation (CV)0.61622963
Kurtosis-1.3373447
Mean3718.9167
Median Absolute Deviation (MAD)1376.5
Skewness0.46531999
Sum89254
Variance5251919.4
MonotonicityNot monotonic
Histogram with fixed size bins (bins=24)
ValueCountFrequency (%)
6236 1
 
3.7%
7572 1
 
3.7%
2591 1
 
3.7%
7317 1
 
3.7%
2569 1
 
3.7%
7087 1
 
3.7%
2532 1
 
3.7%
6857 1
 
3.7%
2499 1
 
3.7%
6651 1
 
3.7%
Other values (14) 14
51.9%
(Missing) 3
 
11.1%
ValueCountFrequency (%)
605 1
3.7%
997 1
3.7%
1253 1
3.7%
1453 1
3.7%
1544 1
3.7%
1655 1
3.7%
2340 1
3.7%
2404 1
3.7%
2463 1
3.7%
2499 1
3.7%
ValueCountFrequency (%)
7572 1
3.7%
7317 1
3.7%
7087 1
3.7%
6857 1
3.7%
6651 1
3.7%
6236 1
3.7%
5833 1
3.7%
5264 1
3.7%
4744 1
3.7%
4006 1
3.7%

정리액(결손액)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct21
Distinct (%)87.5%
Missing3
Missing (%)11.1%
Infinite0
Infinite (%)0.0%
Mean366.91667
Minimum1
Maximum1242
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size371.0 B

Quantile statistics

Minimum1
5-th percentile2
Q111
median186.5
Q3786.25
95-th percentile857.7
Maximum1242
Range1241
Interquartile range (IQR)775.25

Descriptive statistics

Standard deviation395.21507
Coefficient of variation (CV)1.0771249
Kurtosis-1.0237773
Mean366.91667
Median Absolute Deviation (MAD)184.5
Skewness0.62267453
Sum8806
Variance156194.95
MonotonicityNot monotonic
Histogram with fixed size bins (bins=21)
ValueCountFrequency (%)
2 3
 
11.1%
823 2
 
7.4%
15 1
 
3.7%
1242 1
 
3.7%
858 1
 
3.7%
831 1
 
3.7%
856 1
 
3.7%
774 1
 
3.7%
674 1
 
3.7%
524 1
 
3.7%
Other values (11) 11
40.7%
(Missing) 3
 
11.1%
ValueCountFrequency (%)
1 1
 
3.7%
2 3
11.1%
3 1
 
3.7%
5 1
 
3.7%
13 1
 
3.7%
15 1
 
3.7%
33 1
 
3.7%
48 1
 
3.7%
67 1
 
3.7%
170 1
 
3.7%
ValueCountFrequency (%)
1242 1
3.7%
858 1
3.7%
856 1
3.7%
831 1
3.7%
823 2
7.4%
774 1
3.7%
674 1
3.7%
524 1
3.7%
490 1
3.7%
347 1
3.7%

미수액
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct24
Distinct (%)100.0%
Missing3
Missing (%)11.1%
Infinite0
Infinite (%)0.0%
Mean9922.25
Minimum2909
Maximum19870
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size371.0 B

Quantile statistics

Minimum2909
5-th percentile2928.5
Q13909.75
median9399
Q315180.25
95-th percentile18477.1
Maximum19870
Range16961
Interquartile range (IQR)11270.5

Descriptive statistics

Standard deviation6198.4354
Coefficient of variation (CV)0.62470059
Kurtosis-1.8223221
Mean9922.25
Median Absolute Deviation (MAD)5585.5
Skewness0.13869043
Sum238134
Variance38420601
MonotonicityNot monotonic
Histogram with fixed size bins (bins=24)
ValueCountFrequency (%)
15065 1
 
3.7%
13159 1
 
3.7%
2909 1
 
3.7%
13784 1
 
3.7%
2921 1
 
3.7%
14026 1
 
3.7%
2971 1
 
3.7%
14309 1
 
3.7%
2992 1
 
3.7%
14669 1
 
3.7%
Other values (14) 14
51.9%
(Missing) 3
 
11.1%
ValueCountFrequency (%)
2909 1
3.7%
2921 1
3.7%
2971 1
3.7%
2992 1
3.7%
3757 1
3.7%
3870 1
3.7%
3923 1
3.7%
4597 1
3.7%
4794 1
3.7%
5029 1
3.7%
ValueCountFrequency (%)
19870 1
3.7%
18652 1
3.7%
17486 1
3.7%
16710 1
3.7%
16211 1
3.7%
15526 1
3.7%
15065 1
3.7%
14669 1
3.7%
14309 1
3.7%
14026 1
3.7%

정리율
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct24
Distinct (%)100.0%
Missing3
Missing (%)11.1%
Infinite0
Infinite (%)0.0%
Mean31.33625
Minimum7.35
Maximum54.25
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size371.0 B

Quantile statistics

Minimum7.35
5-th percentile10.2205
Q122.0325
median31.86
Q338.765
95-th percentile53.8305
Maximum54.25
Range46.9
Interquartile range (IQR)16.7325

Descriptive statistics

Standard deviation13.776929
Coefficient of variation (CV)0.4396483
Kurtosis-0.69395668
Mean31.33625
Median Absolute Deviation (MAD)8.415
Skewness0.12997533
Sum752.07
Variance189.80377
MonotonicityNot monotonic
Histogram with fixed size bins (bins=24)
ValueCountFrequency (%)
30.87 1
 
3.7%
40.11 1
 
3.7%
54.25 1
 
3.7%
37.15 1
 
3.7%
53.97 1
 
3.7%
35.92 1
 
3.7%
53.04 1
 
3.7%
34.48 1
 
3.7%
52.61 1
 
3.7%
32.85 1
 
3.7%
Other values (14) 14
51.9%
(Missing) 3
 
11.1%
ValueCountFrequency (%)
7.35 1
3.7%
9.7 1
3.7%
13.17 1
3.7%
15.95 1
3.7%
18.89 1
3.7%
19.97 1
3.7%
22.72 1
3.7%
23.28 1
3.7%
25.22 1
3.7%
26.51 1
3.7%
ValueCountFrequency (%)
54.25 1
3.7%
53.97 1
3.7%
53.04 1
3.7%
52.61 1
3.7%
40.11 1
3.7%
39.74 1
3.7%
38.44 1
3.7%
37.41 1
3.7%
37.15 1
3.7%
35.92 1
3.7%

Interactions


Correlations

                (unnamed)  구분   이월체납조정액  정리액(징수액)  정리액(결손액)  미수액  정리율
(unnamed)       1.000      0.000  0.000           0.509           0.666           0.310   0.782
구분            0.000      1.000  0.990           0.987           0.551           1.000   0.187
이월체납조정액  0.000      0.990  1.000           0.984           0.497           1.000   0.236
정리액(징수액)  0.509      0.987  0.984           1.000           0.755           0.926   0.865
정리액(결손액)  0.666      0.551  0.497           0.755           1.000           0.530   0.821
미수액          0.310      1.000  1.000           0.926           0.530           1.000   0.600
정리율          0.782      0.187  0.236           0.865           0.821           0.600   1.000
      연도   구분
연도  1.000  1.000
구분  1.000  1.000
                (unnamed)  이월체납조정액  정리액(징수액)  정리액(결손액)  미수액  정리율  연도   구분
(unnamed)       1.000      0.429           0.603           0.846           -0.499  0.926   1.000  0.000
이월체납조정액  0.429      1.000           0.928           0.704           0.537   0.128   1.000  0.913
정리액(징수액)  0.603      0.928           1.000           0.768           0.367   0.338   1.000  0.721
정리액(결손액)  0.846      0.704           0.768           1.000           -0.108  0.691   1.000  0.339
미수액          -0.499     0.537           0.367           -0.108          1.000   -0.712  1.000  0.905
정리율          0.926      0.128           0.338           0.691           -0.712  1.000   1.000  0.036
연도            1.000      1.000           1.000           1.000           1.000   1.000   1.000  1.000
구분            0.000      0.913           0.721           0.339           0.905   0.036   1.000  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows

index  연도  (unnamed)  구분  이월체납조정액  정리액(징수액)  정리액(결손액)  미수액  정리율
0      2023  1          시세  6245            605             1               5639    9.7
1      2023  1          도세  21447           1544            33              19870   7.35
2      2023  2          시세  6264            997             2               5265    15.95
3      2023  2          도세  21482           2782            48              18652   13.17
4      2023  3          시세  6284            1253            2               5029    19.97
5      2023  3          도세  21559           4006            67              17486   18.89
6      2023  4          시세  6249            1453            2               4794    23.28
7      2023  4          도세  21624           4744            170             16710   22.72
8      2023  5          시세  6255            1655            3               4597    26.51
9      2023  5          도세  21678           5264            203             16211   25.22

Last rows

index  연도  (unnamed)  구분  이월체납조정액  정리액(징수액)  정리액(결손액)  미수액  정리율
17     2023  9          도세  21840           6857            674             14309   34.48
18     2023  10         시세  6326            2532            823             2971    53.04
19     2023  10         도세  21887           7087            774             14026   35.92
20     2023  11         시세  6346            2569            856             2921    53.97
21     2023  11         도세  21932           7317            831             13784   37.15
22     2023  12         시세  6358            2591            858             2909    54.25
23     2023  12         도세  21973           7572            1242            13159   40.11
24     <NA>  <NA>       <NA>  <NA>            <NA>            <NA>            <NA>    <NA>
25     <NA>  <NA>       <NA>  <NA>            <NA>            <NA>            <NA>    <NA>
26     <NA>  <NA>       <NA>  <NA>            <NA>            <NA>            <NA>    <NA>
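
In the sample rows above, the columns appear to satisfy two accounting identities: 미수액 equals 이월체납조정액 minus both 정리액 columns, and 정리율 equals the sum of the two 정리액 columns as a percentage of 이월체납조정액. These identities are inferred from the sample values and are not stated by the data provider. A short Python check on three of the rows follows; the English field names are illustrative, and treating the unnamed column as the month is an assumption.

```python
# Each tuple: (month [assumed], 구분, 이월체납조정액, 정리액(징수액),
#              정리액(결손액), 미수액, 정리율), copied from the sample rows.
rows = [
    (1, "시세", 6245, 605, 1, 5639, 9.7),
    (1, "도세", 21447, 1544, 33, 19870, 7.35),
    (12, "도세", 21973, 7572, 1242, 13159, 40.11),
]

def outstanding(carried, collected, written_off):
    """Amount still uncollected after collections and write-offs."""
    return carried - collected - written_off

def settlement_rate(carried, collected, written_off):
    """Share of carried-over arrears settled, in percent (2 decimals)."""
    return round(100 * (collected + written_off) / carried, 2)

for month, tax, carried, collected, written_off, unpaid, rate in rows:
    # Both inferred identities must hold for the row to count as consistent.
    assert outstanding(carried, collected, written_off) == unpaid
    assert settlement_rate(carried, collected, written_off) == rate
    print(month, tax, "consistent")
```

A check like this can flag transcription errors when the flattened rows are split back into columns, since a wrong split would break at least one identity.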

Duplicate rows

Most frequently occurring

index  연도  (unnamed)  구분  이월체납조정액  정리액(징수액)  정리액(결손액)  미수액  정리율  # duplicates
0      <NA>  <NA>       <NA>  <NA>            <NA>            <NA>            <NA>    <NA>    3
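
The single duplicate group reported here consists of the three rows (indices 24 to 26) in which every field is <NA>. A minimal Python sketch of how such groups could be counted with the standard library is shown below. It illustrates the idea only and is not ydata-profiling's implementation; `NA` and the miniature `rows` list are stand-ins built from the sample rows.

```python
from collections import Counter

NA = None  # stand-in for a missing value

# Miniature of the dataset: two populated rows plus the three empty rows.
rows = [
    ("2023", 12, "시세", 6358, 2591, 858, 2909, 54.25),
    ("2023", 12, "도세", 21973, 7572, 1242, 13159, 40.11),
    (NA,) * 8,
    (NA,) * 8,
    (NA,) * 8,
]

# Count identical rows, then keep only those that occur more than once.
counts = Counter(rows)
duplicate_groups = {row: n for row, n in counts.items() if n > 1}

for row, n in duplicate_groups.items():
    print(row, "occurs", n, "times")
```

Here one group is found with a count of 3, which mirrors the "# duplicates" value in the table above.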