Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 768
Duplicate rows (%): 7.7%
Total size in memory: 742.2 KiB
Average record size in memory: 76.0 B
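
The percentage and per-record figures above follow from the raw counts. A minimal sketch of that arithmetic, with the counts copied from the table (the variable names are illustrative, not part of the report):

```python
# Counts copied from the "Dataset statistics" table above.
n_rows = 10_000
n_duplicates = 768
total_kib = 742.2

# Share of duplicate rows, as a percentage.
duplicate_pct = 100 * n_duplicates / n_rows

# Average record size in bytes (1 KiB = 1024 bytes).
avg_record_bytes = total_kib * 1024 / n_rows

print(f"{duplicate_pct:.1f}%")      # 7.7%
print(f"{avg_record_bytes:.1f} B")  # 76.0 B
```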

Variable types

Numeric: 4
Categorical: 3
DateTime: 1

Dataset

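Description: Provides status information (year, region, levy date, levy amount, surcharge, amount received, etc.) on environmental improvement charge (환경개선부담금) transactions paid through real-time virtual account collection in Guri-si, Gyeonggi-do.
URL: https://www.data.go.kr/data/15088831/fileData.do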

Alerts

데이터기준일자 has constant value "2023-07-01" (Constant)
Dataset has 768 (7.7%) duplicate rows (Duplicates)
구분 is highly overall correlated with 분기 and 1 other field (High correlation)
지역코드 is highly overall correlated with 구분 (High correlation)
분기 is highly overall correlated with 구분 (High correlation)
부과금 is highly overall correlated with 수납금 (High correlation)
수납금 is highly overall correlated with 부과금 (High correlation)
부과금 is highly skewed (γ1 = 28.85623165) (Skewed)
가산금 is highly skewed (γ1 = 24.91471922) (Skewed)
수납금 is highly skewed (γ1 = 28.80232433) (Skewed)
가산금 has 5275 (52.8%) zeros (Zeros)
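
The Skewed alerts report γ1, a skewness coefficient. As a rough sketch, assuming the population moment form γ1 = m3 / m2^(3/2) (the profiler may apply a sample-size correction, so its values can differ slightly), skewness can be computed as follows. The data here is made up and the function name `skewness` is illustrative.

```python
import math

def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / math.pow(m2, 1.5)

# Toy data, not taken from the dataset: many small values and one huge one.
toy = [10] * 99 + [10_000]
symmetric = [1, 2, 3, 4, 5]

print(skewness(toy) > 9)    # True: a single extreme value dominates
print(skewness(symmetric))  # 0.0
```

A handful of very large charges among thousands of small ones produces the same effect, which is consistent with the maxima and medians reported for 부과금 and 수납금.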

Reproduction

Analysis started: 2023-12-12 07:10:07.599608
Analysis finished: 2023-12-12 07:10:10.804674
Duration: 3.21 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
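
The reported duration is the difference between the two timestamps. A small sketch that re-derives it with the standard library (the format string is an assumption that matches the timestamps as printed above):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2023-12-12 07:10:07.599608", FMT)
finished = datetime.strptime("2023-12-12 07:10:10.804674", FMT)

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 3.21 seconds
```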

Variables

분기
Real number (ℝ)

HIGH CORRELATION 

Distinct: 30
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20087.059
Minimum: 20011
Maximum: 20152
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20011
5-th percentile: 20031
Q1: 20071
Median: 20101
Q3: 20102
95-th percentile: 20141
Maximum: 20152
Range: 141
Interquartile range (IQR): 31

Descriptive statistics

Standard deviation: 31.106747
Coefficient of variation (CV): 0.0015485964
Kurtosis: -0.064231173
Mean: 20087.059
Median Absolute Deviation (MAD): 19
Skewness: -0.40111287
Sum: 2.0087059 × 10^8
Variance: 967.62973
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=30)
Value | Count | Frequency (%)
20101 | 2136 | 21.4%
20102 | 1525 | 15.2%
20092 | 497 | 5.0%
20091 | 409 | 4.1%
20071 | 381 | 3.8%
20082 | 373 | 3.7%
20072 | 370 | 3.7%
20081 | 356 | 3.6%
20061 | 324 | 3.2%
20062 | 292 | 2.9%
Other values (20) | 3337 | 33.4%

Value | Count | Frequency (%)
20011 | 54 | 0.5%
20012 | 142 | 1.4%
20021 | 140 | 1.4%
20022 | 138 | 1.4%
20031 | 190 | 1.9%
20032 | 158 | 1.6%
20041 | 209 | 2.1%
20042 | 246 | 2.5%
20051 | 280 | 2.8%
20052 | 265 | 2.6%

Value | Count | Frequency (%)
20152 | 144 | 1.4%
20151 | 137 | 1.4%
20142 | 153 | 1.5%
20141 | 155 | 1.6%
20132 | 182 | 1.8%
20131 | 150 | 1.5%
20122 | 155 | 1.6%
20121 | 159 | 1.6%
20112 | 136 | 1.4%
20111 | 144 | 1.4%
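
The 분기 values look like a four-digit year followed by a one-digit period (for example 20101), although the report treats the column as a plain real number. A minimal sketch of splitting such a code, assuming that encoding holds (the function name `split_period` is hypothetical):

```python
def split_period(code: int) -> tuple[int, int]:
    """Split a code such as 20101 into (year, period)."""
    year, period = divmod(code, 10)
    return year, period

print(split_period(20101))  # (2010, 1)
print(split_period(20152))  # (2015, 2)
```

Under that reading, figures such as the mean (20087.059) describe the codes rather than a meaningful quantity, so treating the column as categorical or ordinal may be more appropriate.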

구분
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
자동차: 5966
시설물: 4034

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 자동차
2nd row: 시설물
3rd row: 자동차
4th row: 자동차
5th row: 시설물

Common Values

Value | Count | Frequency (%)
자동차 | 5966 | 59.7%
시설물 | 4034 | 40.3%

Length

Histogram of lengths of the category

지역코드
Categorical

HIGH CORRELATION 

Distinct: 15
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
구리시 인창동: 2222
구리시 수택동: 1523
구리시 교문동: 1477
수택1동: 1097
교문1동: 602
Other values (10): 3079

Length

Max length: 7
Median length: 7
Mean length: 5.6941
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 구리시 인창동
2nd row: 인창동
3rd row: 구리시 수택동
4th row: 구리시 수택동
5th row: 교문1동

Common Values

Value | Count | Frequency (%)
구리시 인창동 | 2222 | 22.2%
구리시 수택동 | 1523 | 15.2%
구리시 교문동 | 1477 | 14.8%
수택1동 | 1097 | 11.0%
교문1동 | 602 | 6.0%
수택3동 | 592 | 5.9%
수택2동 | 531 | 5.3%
인창동 | 464 | 4.6%
동구동 | 425 | 4.2%
구리시 토평동 | 373 | 3.7%
Other values (5) | 694 | 6.9%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
구리시 | 5966 | 37.4%
인창동 | 2686 | 16.8%
수택동 | 1523 | 9.5%
교문동 | 1477 | 9.3%
수택1동 | 1097 | 6.9%
교문1동 | 602 | 3.8%
수택3동 | 592 | 3.7%
수택2동 | 531 | 3.3%
동구동 | 425 | 2.7%
토평동 | 373 | 2.3%
Other values (4) | 694 | 4.3%
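
The word counts above suggest that the same district appears both with and without the 구리시 prefix (인창동: 2222 + 464 = 2686). A minimal sketch of normalizing the labels before grouping, assuming the prefix carries no extra information (the helper `normalize_region` is hypothetical):

```python
from collections import Counter

PREFIX = "구리시 "

def normalize_region(label: str) -> str:
    """Drop the city prefix so '구리시 인창동' and '인창동' compare equal."""
    label = label.strip()
    return label[len(PREFIX):] if label.startswith(PREFIX) else label

# Counts copied from the Common Values table for 지역코드.
raw_counts = {"구리시 인창동": 2222, "인창동": 464, "구리시 수택동": 1523}

merged = Counter()
for label, count in raw_counts.items():
    merged[normalize_region(label)] += count

print(merged["인창동"])  # 2686
```

The prefix count (5966) matches the number of 자동차 rows, which may be why 구분 and 지역코드 appear almost perfectly correlated. That is an inference from the counts, not something the report states.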

부과일
DateTime

Distinct: 71
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2001-03-10 00:00:00
Maximum: 2015-06-30 00:00:00
Histogram with fixed size bins (bins=50)

부과금
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 4243
Distinct (%): 42.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 81365.049
Minimum: 0
Maximum: 24740970
Zeros: 44
Zeros (%): 0.4%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 8750
Q1: 24380
Median: 38450
Q3: 56720
95-th percentile: 178325.5
Maximum: 24740970
Range: 24740970
Interquartile range (IQR): 32340

Descriptive statistics

Standard deviation: 530953.11
Coefficient of variation (CV): 6.5255673
Kurtosis: 993.39045
Mean: 81365.049
Median Absolute Deviation (MAD): 16045
Skewness: 28.856232
Sum: 8.1365049 × 10^8
Variance: 2.819112 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
58900 | 173 | 1.7%
42070 | 167 | 1.7%
15580 | 137 | 1.4%
32410 | 134 | 1.3%
40510 | 105 | 1.1%
33650 | 105 | 1.1%
38950 | 101 | 1.0%
19470 | 99 | 1.0%
59930 | 91 | 0.9%
31160 | 89 | 0.9%
Other values (4233) | 8799 | 88.0%

Value | Count | Frequency (%)
0 | 44 | 0.4%
3020 | 1 | < 0.1%
3050 | 2 | < 0.1%
3060 | 1 | < 0.1%
3090 | 2 | < 0.1%
3110 | 2 | < 0.1%
3170 | 1 | < 0.1%
3230 | 1 | < 0.1%
3280 | 3 | < 0.1%
3290 | 1 | < 0.1%

Value | Count | Frequency (%)
24740970 | 1 | < 0.1%
18633230 | 1 | < 0.1%
17562130 | 1 | < 0.1%
17138650 | 1 | < 0.1%
13697710 | 1 | < 0.1%
12662920 | 1 | < 0.1%
12000430 | 1 | < 0.1%
11544980 | 1 | < 0.1%
9996170 | 1 | < 0.1%
8791720 | 1 | < 0.1%
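
Several of the 부과금 statistics in this section can be re-derived from the other reported values. A minimal sketch of that arithmetic, with inputs copied from the section (the variable names are illustrative):

```python
# Values copied from the 부과금 statistics in this section.
q1, q3 = 24380, 56720
minimum, maximum = 0, 24740970
mean, std = 81365.049, 530953.11
median = 38450

iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
cv = std / mean                  # coefficient of variation

print(iqr)                # 32340
print(value_range)        # 24740970
print(round(cv, 4))       # 6.5256
print(mean > 2 * median)  # True
```

The mean being more than twice the median is consistent with the Skewed alert: a few very large charges pull the mean upward.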

가산금
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 548
Distinct (%): 5.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1045.66
Minimum: 0
Maximum: 134650
Zeros: 5275
Zeros (%): 52.8%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1730
95-th percentile: 3050
Maximum: 134650
Range: 134650
Interquartile range (IQR): 1730

Descriptive statistics

Standard deviation: 2814.033
Coefficient of variation (CV): 2.6911549
Kurtosis: 977.33018
Mean: 1045.66
Median Absolute Deviation (MAD): 0
Skewness: 24.914719
Sum: 10456600
Variance: 7918781.6
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 5275 | 52.8%
2140 | 118 | 1.2%
1640 | 96 | 1.0%
2940 | 84 | 0.8%
2100 | 80 | 0.8%
2290 | 75 | 0.8%
970 | 63 | 0.6%
770 | 62 | 0.6%
1620 | 61 | 0.6%
2230 | 59 | 0.6%
Other values (538) | 4027 | 40.3%

Value | Count | Frequency (%)
0 | 5275 | 52.8%
110 | 1 | < 0.1%
130 | 2 | < 0.1%
150 | 6 | 0.1%
160 | 5 | 0.1%
170 | 4 | < 0.1%
180 | 4 | < 0.1%
190 | 7 | 0.1%
200 | 5 | 0.1%
210 | 8 | 0.1%

Value | Count | Frequency (%)
134650 | 1 | < 0.1%
122400 | 1 | < 0.1%
79320 | 1 | < 0.1%
71820 | 1 | < 0.1%
47360 | 1 | < 0.1%
39770 | 1 | < 0.1%
37450 | 1 | < 0.1%
36640 | 1 | < 0.1%
30750 | 1 | < 0.1%
29070 | 1 | < 0.1%
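
Because more than half of the 가산금 values are zero (5275 of 10000), both the median and the median absolute deviation (MAD) collapse to 0. A small sketch on made-up data with a similar share of zeros (the helper `mad` is illustrative):

```python
from statistics import median

def mad(values):
    """Median absolute deviation around the median."""
    med = median(values)
    return median([abs(v - med) for v in values])

# Toy data: 53 zeros out of 100 values, mimicking the 52.8% zero share.
toy = [0] * 53 + list(range(100, 4800, 100))

print(len(toy))     # 100
print(median(toy))  # 0.0
print(mad(toy))     # 0.0
```

For a column like this, MAD says little about spread, so the IQR or the upper percentiles are more informative.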

수납금
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 4330
Distinct (%): 43.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 82614.918
Minimum: 0
Maximum: 24740970
Zeros: 24
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 9080
Q1: 25047.5
Median: 39185
Q3: 57710
95-th percentile: 181005.5
Maximum: 24740970
Range: 24740970
Interquartile range (IQR): 32662.5

Descriptive statistics

Standard deviation: 531275.09
Coefficient of variation (CV): 6.4307403
Kurtosis: 990.73337
Mean: 82614.918
Median Absolute Deviation (MAD): 16275
Skewness: 28.802324
Sum: 8.2614918 × 10^8
Variance: 2.8225322 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
42070 | 93 | 0.9%
58900 | 92 | 0.9%
15580 | 84 | 0.8%
61840 | 81 | 0.8%
32410 | 79 | 0.8%
44170 | 74 | 0.7%
33650 | 66 | 0.7%
42810 | 64 | 0.6%
45000 | 59 | 0.6%
48280 | 59 | 0.6%
Other values (4320) | 9249 | 92.5%

Value | Count | Frequency (%)
0 | 24 | 0.2%
3020 | 1 | < 0.1%
3050 | 1 | < 0.1%
3110 | 1 | < 0.1%
3170 | 1 | < 0.1%
3200 | 1 | < 0.1%
3210 | 1 | < 0.1%
3230 | 1 | < 0.1%
3240 | 2 | < 0.1%
3260 | 1 | < 0.1%

Value | Count | Frequency (%)
24740970 | 1 | < 0.1%
18633230 | 1 | < 0.1%
17562130 | 1 | < 0.1%
17138650 | 1 | < 0.1%
13697710 | 1 | < 0.1%
12662920 | 1 | < 0.1%
12000430 | 1 | < 0.1%
11544980 | 1 | < 0.1%
9996170 | 1 | < 0.1%
8791720 | 1 | < 0.1%

데이터기준일자
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
2023-07-01: 10000

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-07-01
2nd row: 2023-07-01
3rd row: 2023-07-01
4th row: 2023-07-01
5th row: 2023-07-01

Common Values

Value | Count | Frequency (%)
2023-07-01 | 10000 | 100.0%

Length

Histogram of lengths of the category

Interactions

[Figures: pairwise interaction plots; images not included in this text version]

Correlations

Variable | 분기 | 구분 | 지역코드 | 부과일 | 부과금 | 가산금 | 수납금
분기 | 1.000 | 0.840 | 0.539 | 1.000 | 0.070 | 0.044 | 0.070
구분 | 0.840 | 1.000 | 1.000 | 0.985 | 0.050 | 0.087 | 0.051
지역코드 | 0.539 | 1.000 | 1.000 | 0.722 | 0.120 | 0.106 | 0.125
부과일 | 1.000 | 0.985 | 0.722 | 1.000 | 0.276 | 0.109 | 0.276
부과금 | 0.070 | 0.050 | 0.120 | 0.276 | 1.000 | 0.248 | 1.000
가산금 | 0.044 | 0.087 | 0.106 | 0.109 | 0.248 | 1.000 | 0.442
수납금 | 0.070 | 0.051 | 0.125 | 0.276 | 1.000 | 0.442 | 1.000

Variable | 구분 | 지역코드
구분 | 1.000 | 0.999
지역코드 | 0.999 | 1.000
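
The extracted text does not say which coefficient this two-variable matrix uses. One common association measure for two categorical variables such as 구분 and 지역코드 is Cramér's V. The sketch below is an uncorrected version on made-up data, offered as an assumption about the kind of measure involved rather than the profiler's exact formula; the function name `cramers_v` is illustrative.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two categorical sequences.

    Requires at least two distinct categories in each sequence.
    """
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row = Counter(xs)
    col = Counter(ys)
    chi2 = 0.0
    for r in row:
        for c in col:
            expected = row[r] * col[c] / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    return math.sqrt(chi2 / (n * k))

# Toy data: one variable fully determines the other, then no relation at all.
print(cramers_v(["a", "a", "b", "b"], ["p", "p", "q", "q"]))  # 1.0
print(cramers_v(["a", "a", "b", "b"], ["p", "q", "p", "q"]))  # 0.0
```

A value near 1, like the 0.999 reported here, means that knowing one variable almost determines the other.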

Variable | 분기 | 부과금 | 가산금 | 수납금 | 구분 | 지역코드
분기 | 1.000 | 0.031 | -0.216 | 0.017 | 0.668 | 0.232
부과금 | 0.031 | 1.000 | 0.150 | 0.996 | 0.050 | 0.049
가산금 | -0.216 | 0.150 | 1.000 | 0.186 | 0.063 | 0.050
수납금 | 0.017 | 0.996 | 0.186 | 1.000 | 0.051 | 0.051
구분 | 0.668 | 0.050 | 0.063 | 0.051 | 1.000 | 0.999
지역코드 | 0.232 | 0.049 | 0.050 | 0.051 | 0.999 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index) | 분기 | 구분 | 지역코드 | 부과일 | 부과금 | 가산금 | 수납금 | 데이터기준일자
45057 | 20061 | 자동차 | 구리시 인창동 | 2006-03-10 | 31970 | 1590 | 33560 | 2023-07-01
5219 | 20032 | 시설물 | 인창동 | 2003-09-10 | 6460 | 0 | 6460 | 2023-07-01
60544 | 20092 | 자동차 | 구리시 수택동 | 2009-09-10 | 27260 | 1360 | 28620 | 2023-07-01
78189 | 20101 | 자동차 | 구리시 수택동 | 2010-02-28 | 59670 | 0 | 59670 | 2023-07-01
25527 | 20112 | 시설물 | 교문1동 | 2011-06-30 | 137390 | 0 | 137390 | 2023-07-01
11916 | 20061 | 시설물 | 수택2동 | 2006-01-02 | 330410 | 0 | 330410 | 2023-07-01
52547 | 20081 | 자동차 | 구리시 인창동 | 2008-03-10 | 38450 | 1920 | 40370 | 2023-07-01
60889 | 20092 | 자동차 | 구리시 교문동 | 2009-09-10 | 38950 | 1940 | 40890 | 2023-07-01
20981 | 20092 | 시설물 | 수택3동 | 2009-08-31 | 66910 | 3340 | 70250 | 2023-07-01
57478 | 20091 | 자동차 | 구리시 수택동 | 2009-03-10 | 37950 | 1890 | 39840 | 2023-07-01

(index) | 분기 | 구분 | 지역코드 | 부과일 | 부과금 | 가산금 | 수납금 | 데이터기준일자
11628 | 20061 | 시설물 | 수택1동 | 2006-01-02 | 21140 | 0 | 21140 | 2023-07-01
49654 | 20071 | 자동차 | 구리시 인창동 | 2007-03-12 | 19700 | 980 | 20680 | 2023-07-01
32865 | 20141 | 시설물 | 교문1동 | 2013-12-30 | 35370 | 0 | 35370 | 2023-07-01
27008 | 20121 | 시설물 | 교문1동 | 2012-03-09 | 130570 | 6520 | 137090 | 2023-07-01
30998 | 20132 | 시설물 | 동구동 | 2013-06-19 | 59210 | 0 | 59210 | 2023-07-01
59458 | 20092 | 자동차 | 구리시 인창동 | 2009-09-10 | 31160 | 1550 | 32710 | 2023-07-01
86158 | 20102 | 자동차 | 구리시 인창동 | 2010-09-10 | 44950 | 0 | 44950 | 2023-07-01
40492 | 20041 | 자동차 | 구리시 교문동 | 2004-03-10 | 29790 | 1480 | 31270 | 2023-07-01
3940 | 20031 | 시설물 | 인창동 | 2003-03-10 | 39480 | 0 | 39480 | 2023-07-01
67019 | 20101 | 자동차 | 구리시 인창동 | 2010-02-28 | 73990 | 0 | 73990 | 2023-07-01
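
In every sampled row shown here, 수납금 equals 부과금 plus 가산금. The column means (81365.049 + 1045.66 ≠ 82614.918) indicate that this cannot hold for all 10000 rows, so the sketch below only checks a few rows copied from the sample. The tuple layout is an illustrative choice.

```python
# (부과금, 가산금, 수납금) copied from four rows of the sample above.
rows = [
    (31970, 1590, 33560),
    (6460, 0, 6460),
    (27260, 1360, 28620),
    (130570, 6520, 137090),
]

# Keep the rows where the amount received differs from levy + surcharge.
mismatches = [r for r in rows if r[0] + r[1] != r[2]]
print(len(mismatches))  # 0
```

Running the same check over the full dataset would show which rows break the pattern, which could help explain the near-perfect but not exact correlation between 부과금 and 수납금.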

Duplicate rows

Most frequently occurring

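(index) | 분기 | 구분 | 지역코드 | 부과일 | 부과금 | 가산금 | 수납금 | 데이터기준일자 | # duplicates
575 | 20101 | 자동차 | 구리시 인창동 | 2010-02-28 | 58900 | 0 | 58900 | 2023-07-01 | 39
557 | 20101 | 자동차 | 구리시 인창동 | 2010-02-28 | 42070 | 0 | 42070 | 2023-07-01 | 38
543 | 20101 | 자동차 | 구리시 인창동 | 2010-02-28 | 32410 | 0 | 32410 | 2023-07-01 | 36
749 | 20102 | 자동차 | 구리시 인창동 | 2010-09-10 | 59930 | 0 | 59930 | 2023-07-01 | 36
529 | 20101 | 자동차 | 구리시 인창동 | 2010-02-28 | 15580 | 0 | 15580 | 2023-07-01 | 32
718 | 20102 | 자동차 | 구리시 인창동 | 2010-09-10 | 32970 | 0 | 32970 | 2023-07-01 | 30
545 | 20101 | 자동차 | 구리시 인창동 | 2010-02-28 | 33650 | 0 | 33650 | 2023-07-01 | 26
553 | 20101 | 자동차 | 구리시 인창동 | 2010-02-28 | 40510 | 0 | 40510 | 2023-07-01 | 24
720 | 20102 | 자동차 | 구리시 인창동 | 2010-09-10 | 34240 | 0 | 34240 | 2023-07-01 | 24
498 | 20101 | 자동차 | 구리시 수택동 | 2010-02-28 | 42070 | 0 | 42070 | 2023-07-01 | 23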
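
Duplicate groups like these can be counted by treating each full row as a key. A minimal sketch with the standard library on made-up, shortened rows; it illustrates the idea and is not how the profiler itself is implemented.

```python
from collections import Counter

# Toy rows of (분기, 구분, 부과금); real rows would include every column.
rows = [
    ("20101", "자동차", 58900),
    ("20101", "자동차", 58900),
    ("20101", "자동차", 42070),
    ("20102", "자동차", 59930),
    ("20101", "자동차", 58900),
]

counts = Counter(rows)

# Keep only rows that occur more than once.
duplicate_groups = {row: n for row, n in counts.items() if n > 1}

for row, n in sorted(duplicate_groups.items(), key=lambda kv: -kv[1]):
    print(row, n)
```

Since no transaction identifier is among the profiled columns, identical rows may be distinct payments of the same standard amount on the same date rather than data-entry errors. That is a possibility to verify against the source data, not a conclusion of the report.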