Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 450
Duplicate rows (%): 4.5%
Total size in memory: 742.2 KiB
Average record size in memory: 76.0 B
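The percentage and per-record figures above follow directly from the raw counts. A minimal arithmetic check, using only numbers copied from this table (the variable names are illustrative):

```python
# Figures copied from the Dataset statistics table.
n_rows = 10_000
duplicate_rows = 450
total_size_kib = 742.2

# Share of rows flagged as duplicates, in percent.
duplicate_pct = 100 * duplicate_rows / n_rows

# Average record size in bytes (1 KiB = 1024 bytes).
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"{duplicate_pct:.1f}%")       # 4.5%
print(f"{avg_record_bytes:.1f} B")   # 76.0 B
```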

Variable types

Numeric: 4
DateTime: 3
Text: 1

Dataset

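Description: Provides status information (receipt date, revenue item, amounts due within and after the payment deadline, value-added tax, etc.) on non-tax revenue (administrative fines, facility usage fees, waste disposal fees, etc.) arising within Guri-si, Gyeonggi-do.
URL: https://www.data.go.kr/data/15090123/fileData.do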

Alerts

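데이터기준일자 (data reference date) has constant value "2023-07-01" (Constant)
Dataset has 450 (4.5%) duplicate rows (Duplicates)
납기내금액 (amount due within the payment deadline) is highly skewed (γ1 = 74.24817224) (Skewed)
부가가치세 (value-added tax) is highly skewed (γ1 = 39.14749711) (Skewed)
납기후금액 (amount due after the payment deadline) has 6178 (61.8%) zeros (Zeros)
부가가치세 (value-added tax) has 9288 (92.9%) zeros (Zeros)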
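The Skewed and Zeros alerts rest on two simple quantities: the skewness coefficient γ1 and the share of zero values. The sketch below computes the population (moment-based) skewness on hypothetical toy data; the profiling tool may apply a bias-corrected estimator, so its figures need not match this formula to the last digit.

```python
def skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def zero_share(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


# Hypothetical toy column: mostly zeros plus one very large amount.
amounts = [0] * 9 + [1_000_000]

print(skewness(amounts))    # positive: long right tail
print(zero_share(amounts))  # 0.9
```

A single large value among many zeros is enough to push γ1 well above zero, which is the pattern behind the alerts for 납기내금액 and 부가가치세.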

Reproduction

Analysis started: 2023-12-12 22:34:28.004012
Analysis finished: 2023-12-12 22:34:30.268484
Duration: 2.26 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
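A report of this kind can be regenerated with a few lines of Python. This is a minimal sketch, not the configuration actually used here: the function name `build_report`, the file paths, the report title, and the default encoding are all assumptions, and the original `config.json` settings are not reproduced.

```python
def build_report(csv_path, html_path="report.html", encoding="utf-8"):
    """Profile a CSV file and write an HTML report to html_path."""
    # Local imports keep this module importable where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; adjust it to match the downloaded file.
    df = pd.read_csv(csv_path, encoding=encoding)

    # Default settings; the title is a placeholder.
    report = ProfileReport(df, title="Profiling report")
    report.to_file(html_path)
    return html_path
```

Calling `build_report("data.csv")` with a hypothetical local copy of the dataset would write `report.html` to the working directory.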

Variables

부과년도 (levy year)
Real number (ℝ)

Distinct: 28
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2016.5337
Minimum: 1995
Maximum: 2023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1995
5-th percentile: 2008
Q1: 2014
median: 2017
Q3: 2020
95-th percentile: 2022
Maximum: 2023
Range: 28
Interquartile range (IQR): 6
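Range and IQR are simple differences of the quantiles listed above (2023 - 1995 = 28 and 2020 - 2014 = 6). As a rough sketch of how such quantile statistics are derived from raw values, the following uses Python's standard library on a hypothetical toy sample; the interpolation method used by the profiling tool is not stated here, so "inclusive" is an assumption.

```python
import statistics

# Hypothetical toy sample of years (not the profiled column).
years = [2008, 2014, 2016, 2017, 2018, 2020, 2022]

# Quartile cut points with linear interpolation between data points.
q1, median, q3 = statistics.quantiles(years, n=4, method="inclusive")

value_range = max(years) - min(years)
iqr = q3 - q1

# Median absolute deviation: median distance from the median.
mad = statistics.median(abs(y - median) for y in years)

print(q1, median, q3)         # 2015.0 2017.0 2019.0
print(value_range, iqr, mad)  # 14 4.0 3.0

# The same differences applied to the figures reported above.
reported_range = 2023 - 1995
reported_iqr = 2020 - 2014
```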

Descriptive statistics

Standard deviation: 4.2413752
Coefficient of variation (CV): 0.0021032999
Kurtosis: 1.403713
Mean: 2016.5337
Median Absolute Deviation (MAD): 3
Skewness: -1.1109062
Sum: 20165337
Variance: 17.989263
Monotonicity: Not monotonic
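Several of these descriptive statistics are tied together by simple identities: CV is the standard deviation divided by the mean, variance is the squared standard deviation, and the mean is the sum divided by the number of observations. A small consistency check using only the figures reported for this variable (the variable names are illustrative):

```python
import math

# Figures copied from the statistics reported for this variable.
n = 10_000
total = 20_165_337
mean = 2016.5337
std = 4.2413752
variance = 17.989263
cv = 0.0021032999

# Each identity holds up to the rounding of the printed values.
print(math.isclose(total / n, mean))                    # True
print(math.isclose(std / mean, cv, rel_tol=1e-6))       # True
print(math.isclose(std ** 2, variance, rel_tol=1e-6))   # True
```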
Histogram with fixed size bins (bins=28)
Value              Count  Frequency (%)
2020               1189   11.9%
2021               1183   11.8%
2019               1039   10.4%
2018               976    9.8%
2017               877    8.8%
2016               777    7.8%
2015               691    6.9%
2014               641    6.4%
2013               487    4.9%
2022               439    4.4%
Other values (18)  1701   17.0%

Value  Count  Frequency (%)
1995   1      < 0.1%
1997   3      < 0.1%
1998   7      0.1%
1999   18     0.2%
2000   7      0.1%
2001   13     0.1%
2002   22     0.2%
2003   39     0.4%
2004   60     0.6%
2005   54     0.5%

Value  Count  Frequency (%)
2023   87     0.9%
2022   439    4.4%
2021   1183   11.8%
2020   1189   11.9%
2019   1039   10.4%
2018   976    9.8%
2017   877    8.8%
2016   777    7.8%
2015   691    6.9%
2014   641    6.4%

등록일자 (registration date)
Date

Distinct: 3256
Distinct (%): 32.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2009-05-28 00:00:00
Maximum: 2023-03-31 00:00:00
Histogram with fixed size bins (bins=50)

수납일자 (receipt date)
Date

Distinct: 3257
Distinct (%): 32.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2009-05-28 00:00:00
Maximum: 2023-03-31 00:00:00
Histogram with fixed size bins (bins=50)

과목 (revenue item)
Text

Distinct: 168
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 25
Median length: 21
Mean length: 11.1673
Min length: 3

Characters and Unicode

Total characters: 111673
Distinct characters: 191
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
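As an illustration of the character properties mentioned above, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories in a hypothetical sample string (not a value taken from the dataset); `Lo`, `Ps`, `Pe`, and `Zs` are the standard codes for Other Letter, Open Punctuation, Close Punctuation, and Space Separator.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count Unicode general categories of the characters in text."""
    return Counter(unicodedata.category(ch) for ch in text)


# Hypothetical sample: Hangul text with a space and a parenthesised word.
sample = "주차요금 (예시)"
counts = category_counts(sample)

print(counts["Lo"])  # Hangul syllables count as Other Letter
print(counts["Ps"], counts["Pe"], counts["Zs"])
```

Script and block statistics need lookup tables that the standard library does not provide, so they are left out of this sketch.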

Unique

Unique: 48
Unique (%): 0.5%

Sample

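1st row: 계속도로점용료 (continuing road occupancy fee)
2nd row: 음식물쓰레기수거운반처리수수료 (food waste collection, transport and disposal fee)
3rd row: 자동차손해배상보장법위반과태료 (fine for violating the Guarantee of Automobile Accident Compensation Act)
4th row: 자동차검사지연과태료 (fine for delayed vehicle inspection)
5th row: 자동차손해배상보장법위반과태료 (fine for violating the Guarantee of Automobile Accident Compensation Act)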
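Value  Count  Frequency (%)
자동차손해배상보장법위반과태료 (fine for violating the Guarantee of Automobile Accident Compensation Act)  2430  24.2%
자동차검사지연과태료 (fine for delayed vehicle inspection)  2171  21.7%
장애인주차구역위반과태료 (fine for violating a disabled parking zone)  1094  10.9%
음식물쓰레기수거운반처리수수료 (food waste collection, transport and disposal fee)  757  7.6%
그외수입 (other revenue)  417  4.2%
이행강제금 (compliance enforcement charge)  354  3.5%
계속도로점용료 (continuing road occupancy fee)  299  3.0%
시군구재산사용료 (municipal property usage fee)  247  2.5%
주차요금 (parking fee)  231  2.3%
재활용품수거판매수입 (revenue from collection and sale of recyclables)  178  1.8%
Other values (159)  1848  18.4%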

Most occurring characters

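Value               Count  Frequency (%)
(missing)           8194   7.3%
(missing)           6623   5.9%
(missing)           6472   5.8%
(missing)           6184   5.5%
(missing)           4915   4.4%
(missing)           4885   4.4%
(missing)           4864   4.4%
(missing)           4149   3.7%
(missing)           3561   3.2%
(missing)           3487   3.1%
Other values (181)  58339  52.2%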

Most occurring categories

Value              Count   Frequency (%)
Other Letter       111485  99.8%
Open Punctuation   81      0.1%
Close Punctuation  81      0.1%
Space Separator    26      < 0.1%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
(missing)           8194   7.3%
(missing)           6623   5.9%
(missing)           6472   5.8%
(missing)           6184   5.5%
(missing)           4915   4.4%
(missing)           4885   4.4%
(missing)           4864   4.4%
(missing)           4149   3.7%
(missing)           3561   3.2%
(missing)           3487   3.1%
Other values (178)  58151  52.2%

Open Punctuation
Value  Count  Frequency (%)
(      81     100.0%

Close Punctuation
Value  Count  Frequency (%)
)      81     100.0%

Space Separator
Value    Count  Frequency (%)
(space)  26     100.0%

Most occurring scripts

Value   Count   Frequency (%)
Hangul  111485  99.8%
Common  188     0.2%

Most frequent character per script

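Hangul
Value               Count  Frequency (%)
(missing)           8194   7.3%
(missing)           6623   5.9%
(missing)           6472   5.8%
(missing)           6184   5.5%
(missing)           4915   4.4%
(missing)           4885   4.4%
(missing)           4864   4.4%
(missing)           4149   3.7%
(missing)           3561   3.2%
(missing)           3487   3.1%
Other values (178)  58151  52.2%

Common
Value    Count  Frequency (%)
(        81     43.1%
)        81     43.1%
(space)  26     13.8%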

Most occurring blocks

Value   Count   Frequency (%)
Hangul  111485  99.8%
ASCII   188     0.2%

Most frequent character per block

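Hangul
Value               Count  Frequency (%)
(missing)           8194   7.3%
(missing)           6623   5.9%
(missing)           6472   5.8%
(missing)           6184   5.5%
(missing)           4915   4.4%
(missing)           4885   4.4%
(missing)           4864   4.4%
(missing)           4149   3.7%
(missing)           3561   3.2%
(missing)           3487   3.1%
Other values (178)  58151  52.2%

ASCII
Value    Count  Frequency (%)
(        81     43.1%
)        81     43.1%
(space)  26     13.8%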

납기내금액 (amount due within the payment deadline)
Real number (ℝ)

SKEWED 

Distinct: 3566
Distinct (%): 35.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1192293.8
Minimum: 20
Maximum: 2.7 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20
5-th percentile: 8000
Q1: 16000
median: 56900
Q3: 238042.5
95-th percentile: 1745000
Maximum: 2.7 × 10^9
Range: 2.7 × 10^9
Interquartile range (IQR): 222042.5

Descriptive statistics

Standard deviation: 30456146
Coefficient of variation (CV): 25.544162
Kurtosis: 6289.0133
Mean: 1192293.8
Median Absolute Deviation (MAD): 42830
Skewness: 74.248172
Sum: 1.1922938 × 10^10
Variance: 9.2757684 × 10^14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
16000                804    8.0%
80000                739    7.4%
20000                571    5.7%
12000                311    3.1%
15000                224    2.2%
40000                173    1.7%
100000               157    1.6%
300000               118    1.2%
200000               116    1.2%
50000                112    1.1%
Other values (3556)  6675   66.8%

Value  Count  Frequency (%)
20     2      < 0.1%
30     1      < 0.1%
40     1      < 0.1%
50     2      < 0.1%
60     1      < 0.1%
70     1      < 0.1%
80     1      < 0.1%
90     1      < 0.1%
100    1      < 0.1%
117    1      < 0.1%

Value       Count  Frequency (%)
2700000000  1      < 0.1%
858863700   1      < 0.1%
836673000   1      < 0.1%
249910750   1      < 0.1%
226223560   1      < 0.1%
219634530   1      < 0.1%
217459930   4      < 0.1%
167328540   1      < 0.1%
147698700   1      < 0.1%
127863970   1      < 0.1%

납기후금액 (amount due after the payment deadline)
Real number (ℝ)

ZEROS 

Distinct: 514
Distinct (%): 5.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 54288.831
Minimum: 0
Maximum: 10300000
Zeros: 6178
Zeros (%): 61.8%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 16000
95-th percentile: 160800
Maximum: 10300000
Range: 10300000
Interquartile range (IQR): 16000

Descriptive statistics

Standard deviation: 323254.53
Coefficient of variation (CV): 5.9543469
Kurtosis: 379.7389
Mean: 54288.831
Median Absolute Deviation (MAD): 0
Skewness: 16.653548
Sum: 5.4288831 × 10^8
Variance: 1.0449349 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0                   6178   61.8%
16000               800    8.0%
80000               704    7.0%
12000               308    3.1%
21000               224    2.2%
40000               136    1.4%
15750               128    1.3%
20600               85     0.9%
50000               73     0.7%
100000              70     0.7%
Other values (504)  1294   12.9%

Value  Count  Frequency (%)
0      6178   61.8%
2400   3      < 0.1%
2500   1      < 0.1%
3150   1      < 0.1%
3810   1      < 0.1%
3840   1      < 0.1%
4000   33     0.3%
4070   1      < 0.1%
4320   1      < 0.1%
4800   1      < 0.1%

Value     Count  Frequency (%)
10300000  1      < 0.1%
9528220   1      < 0.1%
9322420   1      < 0.1%
8464950   1      < 0.1%
6880810   1      < 0.1%
6466850   1      < 0.1%
6159340   1      < 0.1%
5273600   1      < 0.1%
4880000   1      < 0.1%
4548580   1      < 0.1%

부가가치세 (value-added tax)
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 548
Distinct (%): 5.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20034.037
Minimum: 0
Maximum: 20565770
Zeros: 9288
Zeros (%): 92.9%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 22710.5
Maximum: 20565770
Range: 20565770
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 495696.95
Coefficient of variation (CV): 24.742739
Kurtosis: 1565.1238
Mean: 20034.037
Median Absolute Deviation (MAD): 0
Skewness: 39.147497
Sum: 2.0034037 × 10^8
Variance: 2.4571547 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
0                   9288   92.9%
44480               11     0.1%
190110              11     0.1%
30680               11     0.1%
191930              9      0.1%
79290               9      0.1%
72530               8      0.1%
87610               7      0.1%
31240               6      0.1%
134670              6      0.1%
Other values (538)  634    6.3%

Value  Count  Frequency (%)
0      9288   92.9%
200    2      < 0.1%
230    1      < 0.1%
280    1      < 0.1%
320    1      < 0.1%
340    1      < 0.1%
370    1      < 0.1%
450    1      < 0.1%
570    1      < 0.1%
580    1      < 0.1%

Value     Count  Frequency (%)
20565770  1      < 0.1%
19966770  1      < 0.1%
19769070  4      < 0.1%
5395000   1      < 0.1%
4099900   1      < 0.1%
1472500   1      < 0.1%
1425000   1      < 0.1%
1420640   1      < 0.1%
1411360   1      < 0.1%
1336300   1      < 0.1%

데이터기준일자 (data reference date)
Date

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2023-07-01 00:00:00
Maximum: 2023-07-01 00:00:00
Histogram with fixed size bins (bins=1)

Interactions


Correlations

            부과년도  납기내금액  납기후금액  부가가치세
부과년도    1.000     0.000       0.000       0.011
납기내금액  0.000     1.000       0.000       0.000
납기후금액  0.000     0.000       1.000       0.000
부가가치세  0.011     0.000       0.000       1.000

            부과년도  납기내금액  납기후금액  부가가치세
부과년도    1.000     0.084       0.178       0.006
납기내금액  0.084     1.000       -0.097      0.265
납기후금액  0.178     -0.097      1.000       0.007
부가가치세  0.006     0.265       0.007       1.000
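This extract does not preserve which coefficient each of the two matrices reports, so the following is only a generic sketch of how a symmetric pairwise matrix like these is assembled. It uses Pearson's r as an assumed example, applied to hypothetical toy columns `a`, `b`, and `c` rather than the profiled data.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Nested dict with the pairwise coefficient for every column pair."""
    names = list(columns)
    return {r: {c: pearson(columns[r], columns[c]) for c in names}
            for r in names}


# Hypothetical toy columns: b rises with a, c falls as a rises.
toy = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8], "c": [4, 3, 2, 1]}
matrix = correlation_matrix(toy)

for name, row in matrix.items():
    print(name, [round(value, 3) for value in row.values()])
```

The diagonal is 1 by construction and the matrix is symmetric, which matches the layout of the two tables above.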

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index  부과년도  등록일자  수납일자  과목  납기내금액  납기후금액  부가가치세  데이터기준일자
27160  2016  2016-12-27  2016-12-27  계속도로점용료  8130  0  720  2023-07-01
59219  2020  2020-12-14  2020-12-14  음식물쓰레기수거운반처리수수료  16190  0  0  2023-07-01
10758  2013  2013-12-30  2013-12-30  자동차손해배상보장법위반과태료  78750  0  0  2023-07-01
55099  2020  2020-07-17  2020-07-17  자동차검사지연과태료  16000  16000  0  2023-07-01
35583  2017  2018-03-27  2018-03-27  자동차손해배상보장법위반과태료  5450  0  0  2023-07-01
14138  2006  2014-09-15  2014-09-15  자동차검사지연과태료  34400  0  0  2023-07-01
41637  2019  2019-01-10  2019-01-10  음식물쓰레기수거운반처리수수료  11400  0  0  2023-07-01
12642  2007  2014-05-22  2014-05-22  자동차손해배상보장법위반과태료  49000  0  0  2023-07-01
34352  2018  2018-01-20  2018-01-20  자동차손해배상보장법위반과태료  21000  21630  0  2023-07-01
5347  2010  2012-06-29  2012-06-29  자동차손해배상보장법위반과태료  77940  0  0  2023-07-01

Index  부과년도  등록일자  수납일자  과목  납기내금액  납기후금액  부가가치세  데이터기준일자
68147  2021  2021-11-10  2021-11-10  장애인주차구역위반과태료  80000  80000  0  2023-07-01
32636  2011  2017-10-27  2017-10-27  자동차검사지연과태료  35400  0  0  2023-07-01
69226  2021  2021-12-18  2021-12-18  장애인주차구역위반과태료  80000  80000  0  2023-07-01
53727  2020  2020-05-29  2020-05-29  음식물쓰레기수거운반처리수수료  449250  0  0  2023-07-01
56794  2020  2020-09-18  2020-09-18  음식물쓰레기수거운반처리수수료  16100  0  0  2023-07-01
71231  2022  2022-05-09  2022-05-09  쓰레기불법투기과태료  846650  0  0  2023-07-01
40791  2018  2018-11-29  2018-11-29  장애인주차구역위반과태료  80000  80000  0  2023-07-01
67384  2021  2021-10-18  2021-10-18  주차요금  20000  0  0  2023-07-01
43908  2012  2019-04-25  2019-04-25  이행강제금  250000  0  0  2023-07-01
53020  2020  2020-04-30  2020-04-30  자동차손해배상보장법위반과태료  12000  12000  0  2023-07-01

Duplicate rows

Most frequently occurring

Index  부과년도  등록일자  수납일자  과목  납기내금액  납기후금액  부가가치세  데이터기준일자  # duplicates
153  2017  2017-11-15  2017-11-15  부동산거래신고의무위반과태료  200000  200000  0  2023-07-01  17
266  2020  2020-01-28  2020-01-28  장애인주차구역위반과태료  80000  80000  0  2023-07-01  7
333  2020  2020-09-10  2020-09-10  자동차손해배상보장법위반과태료  12000  12000  0  2023-07-01  6
408  2021  2021-05-31  2021-05-31  시군구재산임대료  30000  0  0  2023-07-01  6
118  2017  2017-02-20  2017-02-20  자동차검사지연과태료  16000  16000  0  2023-07-01  5
282  2020  2020-03-31  2020-03-31  시군구재산임대료  30000  0  0  2023-07-01  5
304  2020  2020-06-10  2020-06-10  자동차검사지연과태료  16000  16000  0  2023-07-01  5
409  2021  2021-05-31  2021-05-31  주차요금  20000  0  0  2023-07-01  5
28  2012  2012-12-31  2012-12-31  자동차검사지연과태료  20000  21000  0  2023-07-01  4
81  2016  2016-06-30  2016-06-30  자동차검사지연과태료  16000  16000  0  2023-07-01  4
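A "# duplicates" column like the one above can be obtained by counting how often each complete row occurs and keeping the rows seen more than once. This is a minimal sketch with hypothetical toy rows (the item labels `fee-A` to `fee-C` are invented), not necessarily how the profiling tool computes it.

```python
from collections import Counter

# Hypothetical toy rows: (levy year, receipt date, item, amount).
rows = [
    (2017, "2017-11-15", "fee-A", 200000),
    (2017, "2017-11-15", "fee-A", 200000),
    (2017, "2017-11-15", "fee-A", 200000),
    (2020, "2020-01-28", "fee-B", 80000),
    (2021, "2021-05-31", "fee-C", 30000),
    (2021, "2021-05-31", "fee-C", 30000),
]

# Rows are tuples, so identical rows hash to the same key.
occurrences = Counter(rows)

# Keep only rows that occur more than once, most frequent first.
duplicated = [(row, n) for row, n in occurrences.most_common() if n > 1]

for row, n in duplicated:
    print(row, n)
```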