Overview

Dataset statistics

Number of variables: 9
Number of observations: 2983
Missing cells: 1468
Missing cells (%): 5.5%
Duplicate rows: 17
Duplicate rows (%): 0.6%
Total size in memory: 224.4 KiB
Average record size in memory: 77.0 B
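
The percentages and the average record size follow from the raw counts. A minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes; the variable names are illustrative only:

```python
# Figures copied from the dataset statistics.
n_variables = 9
n_observations = 2983
missing_cells = 1468
duplicate_rows = 17
total_memory_kib = 224.4

# Assumption: percentages are relative to all cells and all rows.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the three results agree with the reported 5.5%, 0.6% and 77.0 B.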

Variable types

Categorical: 1
Numeric: 4
Text: 1
DateTime: 2
Boolean: 1

Dataset

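Description: Provides fields such as tax item name, content, registration date, period setting, and amount setting from the installment payment records of the Osan City local tax ARS card payment system.
Author: Osan City, Gyeonggi-do
URL: https://www.data.go.kr/data/15090250/fileData.do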

Alerts

Dataset has 17 (0.6%) duplicate rows (Duplicates)
수정일자 (modification date) has 1468 (49.2%) missing values (Missing)
달설정 (month setting) is highly skewed (γ1 = 37.71166911) (Skewed)

Reproduction

Analysis started: 2023-12-12 17:23:16.414494
Analysis finished: 2023-12-12 17:23:19.396993
Duration: 2.98 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
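
A report of this kind can be regenerated with the ydata-profiling package. The following is only a sketch: the file name `installments.csv`, the `cp949` encoding, the report title, and the helper name `build_report` are all assumptions made for illustration, and the configuration actually used for this report is the one in config.json.

```python
def build_report(csv_path="installments.csv", out_path="report.html"):
    """Profile a CSV file and write an HTML report (illustrative helper)."""
    # Imported here so the sketch can be loaded without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the file is encoded as cp949, which is common for Korean
    # public datasets; adjust if the actual file differs.
    df = pd.read_csv(csv_path, encoding="cp949")

    # Default settings; the original run may have used other options.
    report = ProfileReport(df, title="Installment payment records")
    report.to_file(out_path)
    return out_path
```

Calling `build_report()` with the path of the downloaded file writes the HTML report to `report.html`.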

Variables

분납구분 (installment type)
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 23.4 KiB
Value counts: 1 (1686), 2 (1297)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      1686   56.5%
2      1297   43.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Bar chart of the value counts listed under Common Values.

총금액 (total amount)
Real number (ℝ)

Distinct: 2671
Distinct (%): 89.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1921184.6
Minimum: 10000
Maximum: 1.23 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 26.3 KiB

Quantile statistics

Minimum: 10000
5-th percentile: 236839
Q1: 600510
Median: 1093000
Q3: 2135760
95-th percentile: 4944702
Maximum: 1.23 × 10^8
Range: 1.2299 × 10^8
Interquartile range (IQR): 1535250
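
The range and IQR are simple differences of the quantiles listed here. The sketch below recomputes them; the upper fence at Q3 + 1.5 × IQR is a common rule of thumb for flagging large values and is added purely as an illustration, since the report itself does not use it.

```python
# Quantiles of 총금액 as reported; the exact maximum (123000000) is the
# largest value listed for this variable.
minimum, maximum = 10_000, 123_000_000
q1, q3 = 600_510, 2_135_760

value_range = maximum - minimum   # reported as 1.2299 × 10^8
iqr = q3 - q1                     # reported as 1535250

# Illustration only (not part of the report): Tukey's upper fence.
upper_fence = q3 + 1.5 * iqr

print(value_range, iqr, upper_fence)
```

Under this rule, amounts above the fence (about 4.44 million) would be treated as unusually large, which is below the reported 95-th percentile of 4944702.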

Descriptive statistics

Standard deviation: 4005507.3
Coefficient of variation (CV): 2.0849153
Kurtosis: 331.71864
Mean: 1921184.6
Median Absolute Deviation (MAD): 629970
Skewness: 14.464479
Sum: 5.7308937 × 10^9
Variance: 1.6044089 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
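
Several of the descriptive statistics are tied together by simple identities: CV = standard deviation / mean, variance = standard deviation², and sum = mean × number of observations. A small consistency check, using the rounded figures from the report (so agreement is only expected to several significant digits):

```python
n = 2983            # number of observations
mean = 1921184.6    # reported mean of 총금액
std = 4005507.3     # reported standard deviation

cv = std / mean         # coefficient of variation
variance = std ** 2     # variance is the squared standard deviation
total = mean * n        # sum of all values

print(f"CV       = {cv:.7f}")
print(f"Variance = {variance:.7e}")
print(f"Sum      = {total:.7e}")
```

A CV above 2 means the standard deviation is more than twice the mean, consistent with the long right tail indicated by the skewness and the maximum.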
Common values

Value                Count  Frequency (%)
200000               8      0.3%
1410630              6      0.2%
4893410              6      0.2%
1200000              6      0.2%
1628370              5      0.2%
2293570              5      0.2%
747710               5      0.2%
4164700              5      0.2%
828080               5      0.2%
904330               4      0.1%
Other values (2661)  2928   98.2%

Minimum 10 values

Value  Count  Frequency (%)
10000  1      < 0.1%
17460  1      < 0.1%
20000  2      0.1%
23540  1      < 0.1%
31000  1      < 0.1%
39380  1      < 0.1%
40000  1      < 0.1%
50000  1      < 0.1%
50770  1      < 0.1%
51070  1      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
123000000  1      < 0.1%
53186000   2      0.1%
52092690   1      < 0.1%
48969080   1      < 0.1%
46611470   1      < 0.1%
46226820   1      < 0.1%
40964625   1      < 0.1%
35310810   1      < 0.1%
35125600   1      < 0.1%
33025840   1      < 0.1%

설정금액 (set amount)
Real number (ℝ)

Distinct: 210
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 217365.38
Minimum: 0
Maximum: 12000000
Zeros: 7
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 26.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 50000
Q1: 100000
Median: 150000
Q3: 200000
95-th percentile: 500000
Maximum: 12000000
Range: 12000000
Interquartile range (IQR): 100000

Descriptive statistics

Standard deviation: 392138.89
Coefficient of variation (CV): 1.8040541
Kurtosis: 422.94402
Mean: 217365.38
Median Absolute Deviation (MAD): 50000
Skewness: 16.855822
Sum: 6.4840092 × 10^8
Variance: 1.5377291 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value               Count  Frequency (%)
100000              927    31.1%
200000              576    19.3%
50000               272    9.1%
300000              269    9.0%
150000              200    6.7%
500000              96     3.2%
250000              62     2.1%
400000              54     1.8%
30000               26     0.9%
70000               23     0.8%
Other values (200)  478    16.0%

Minimum 10 values

Value  Count  Frequency (%)
0      7      0.2%
3      2      0.1%
10     2      0.1%
20     8      0.3%
22     2      0.1%
30     8      0.3%
34     4      0.1%
5000   1      < 0.1%
10000  11     0.4%
12889  2      0.1%

Maximum 10 values

Value     Count  Frequency (%)
12000000  1      < 0.1%
10000000  1      < 0.1%
5000000   2      0.1%
3000000   4      0.1%
2581140   1      < 0.1%
2511090   1      < 0.1%
2047090   1      < 0.1%
2000000   13     0.4%
1900000   1      < 0.1%
1800000   1      < 0.1%

세목 (tax item)
Text

Distinct: 278
Distinct (%): 9.3%
Missing: 0
Missing (%): 0.0%
Memory size: 23.4 KiB

Length

Max length: 23
Median length: 19
Mean length: 5.9946363
Min length: 1

Characters and Unicode

Total characters: 17882
Distinct characters: 127
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

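Unique: 159
Unique (%): 5.3%

Sample

1st row: 자동차세 (automobile tax)
2nd row: 취득세 (acquisition tax)
3rd row: 자동차세 (automobile tax)
4th row: 자동차세 (automobile tax)
5th row: 등록세 (registration tax)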
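Value                                              Count  Frequency (%)
자동차세 (automobile tax)                          780    20.1%
세외수입 (non-tax revenue)                         613    15.8%
지방세 (local tax)                                 346    8.9%
자동차세외 (automobile tax and others)             334    8.6%
(value missing in source)                          298    7.7%
환경개선부담금 (environmental improvement charge)  136    3.5%
지방소득세 (local income tax)                      115    3.0%
과태료 (administrative fine)                       95     2.5%
재산세 (property tax)                              80     2.1%
(value missing in source)                          76     2.0%
Other values (188)                                 998    25.8%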

Most occurring characters

Value               Count  Frequency (%)
(lost)              3005   16.8%
(lost)              1687   9.4%
(lost)              1386   7.8%
(lost)              1219   6.8%
(lost)              1215   6.8%
(space)             894    5.0%
(lost)              841    4.7%
(lost)              840    4.7%
(lost)              764    4.3%
(lost)              734    4.1%
Other values (117)  5297   29.6%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       16253  90.9%
Space Separator    894    5.0%
Other Punctuation  618    3.5%
Decimal Number     70     0.4%
Math Symbol        19     0.1%
Close Punctuation  14     0.1%
Open Punctuation   14     0.1%

Most frequent character per category

Other Letter

Value               Count  Frequency (%)
(lost)              3005   18.5%
(lost)              1687   10.4%
(lost)              1386   8.5%
(lost)              1219   7.5%
(lost)              1215   7.5%
(lost)              841    5.2%
(lost)              840    5.2%
(lost)              764    4.7%
(lost)              734    4.5%
(lost)              287    1.8%
Other values (100)  4275   26.3%

Decimal Number

Value  Count  Frequency (%)
1      18     25.7%
5      14     20.0%
3      12     17.1%
4      6      8.6%
6      6      8.6%
2      6      8.6%
7      5      7.1%
0      2      2.9%
8      1      1.4%

Other Punctuation

Value  Count  Frequency (%)
,      582    94.2%
.      34     5.5%
&      1      0.2%
/      1      0.2%

Space Separator

Value    Count  Frequency (%)
(space)  894    100.0%

Math Symbol

Value  Count  Frequency (%)
+      19     100.0%

Close Punctuation

Value  Count  Frequency (%)
)      14     100.0%

Open Punctuation

Value  Count  Frequency (%)
(      14     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  16253  90.9%
Common  1629   9.1%

Most frequent character per script

Hangul

Value               Count  Frequency (%)
(lost)              3005   18.5%
(lost)              1687   10.4%
(lost)              1386   8.5%
(lost)              1219   7.5%
(lost)              1215   7.5%
(lost)              841    5.2%
(lost)              840    5.2%
(lost)              764    4.7%
(lost)              734    4.5%
(lost)              287    1.8%
Other values (100)  4275   26.3%

Common

Value             Count  Frequency (%)
(space)           894    54.9%
,                 582    35.7%
.                 34     2.1%
+                 19     1.2%
1                 18     1.1%
)                 14     0.9%
5                 14     0.9%
(                 14     0.9%
3                 12     0.7%
4                 6      0.4%
Other values (7)  22     1.4%

Most occurring blocks

Value        Count  Frequency (%)
Hangul       16252  90.9%
ASCII        1629   9.1%
Compat Jamo  1      < 0.1%

Most frequent character per block

Hangul

Value              Count  Frequency (%)
(lost)             3005   18.5%
(lost)             1687   10.4%
(lost)             1386   8.5%
(lost)             1219   7.5%
(lost)             1215   7.5%
(lost)             841    5.2%
(lost)             840    5.2%
(lost)             764    4.7%
(lost)             734    4.5%
(lost)             287    1.8%
Other values (99)  4274   26.3%

ASCII

Value             Count  Frequency (%)
(space)           894    54.9%
,                 582    35.7%
.                 34     2.1%
+                 19     1.2%
1                 18     1.1%
)                 14     0.9%
5                 14     0.9%
(                 14     0.9%
3                 12     0.7%
4                 6      0.4%
Other values (7)  22     1.4%

Compat Jamo

Value   Count  Frequency (%)
(lost)  1      100.0%

달설정 (month setting)
Real number (ℝ)

SKEWED

Distinct: 19
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.1716393
Minimum: 1
Maximum: 130
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 26.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 1
95-th percentile: 1
Maximum: 130
Range: 129
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 2.7180323
Coefficient of variation (CV): 2.3198542
Kurtosis: 1713.456
Mean: 1.1716393
Median Absolute Deviation (MAD): 0
Skewness: 37.711669
Sum: 3495
Variance: 7.3876998
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=19)
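
Because 달설정 has only 19 distinct values, its full frequency table can be assembled from the value listings of this variable, and the mean, variance and skewness can be recomputed from it. The sketch below assumes the bias-adjusted Fisher-Pearson estimator (the one pandas uses for `skew`); that the report uses the same estimator is an assumption, although the result does reproduce the reported skewness.

```python
import math

# Frequency table of 달설정: value -> count (19 distinct values).
counts = {1: 2926, 2: 8, 3: 10, 4: 9, 5: 4, 6: 2, 7: 2, 8: 1, 9: 2, 10: 5,
          11: 2, 14: 5, 15: 1, 18: 1, 19: 1, 22: 1, 30: 1, 39: 1, 130: 1}

n = sum(counts.values())                          # number of observations
total = sum(v * c for v, c in counts.items())     # sum of all values
mean = total / n

# Second and third central moments (population form, divided by n).
m2 = sum(c * (v - mean) ** 2 for v, c in counts.items()) / n
m3 = sum(c * (v - mean) ** 3 for v, c in counts.items()) / n

variance = m2 * n / (n - 1)                       # sample variance (ddof=1)
g1 = m3 / m2 ** 1.5                               # unadjusted skewness
skewness = g1 * math.sqrt(n * (n - 1)) / (n - 2)  # bias-adjusted skewness

print(n, total, round(mean, 7), round(variance, 7), round(skewness, 4))
```

The single observation at 130, against 2926 observations at 1, accounts for most of the third moment, which is why the skewness is so large.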
Common values

Value             Count  Frequency (%)
1                 2926   98.1%
3                 10     0.3%
4                 9      0.3%
2                 8      0.3%
10                5      0.2%
14                5      0.2%
5                 4      0.1%
9                 2      0.1%
11                2      0.1%
6                 2      0.1%
Other values (9)  10     0.3%

Minimum 10 values

Value  Count  Frequency (%)
1      2926   98.1%
2      8      0.3%
3      10     0.3%
4      9      0.3%
5      4      0.1%
6      2      0.1%
7      2      0.1%
8      1      < 0.1%
9      2      0.1%
10     5      0.2%

Maximum 10 values

Value  Count  Frequency (%)
130    1      < 0.1%
39     1      < 0.1%
30     1      < 0.1%
22     1      < 0.1%
19     1      < 0.1%
18     1      < 0.1%
15     1      < 0.1%
14     5      0.2%
11     2      0.1%
10     5      0.2%

일설정 (day setting)
Real number (ℝ)

Distinct: 33
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.266845
Minimum: 1
Maximum: 256
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 26.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 7
Median: 15
Q3: 25
95-th percentile: 30
Maximum: 256
Range: 255
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 12.026653
Coefficient of variation (CV): 0.73933527
Kurtosis: 103.1205
Mean: 16.266845
Median Absolute Deviation (MAD): 10
Skewness: 5.1760825
Sum: 48524
Variance: 144.64037
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=33)
Common values

Value              Count  Frequency (%)
1                  520    17.4%
30                 447    15.0%
15                 302    10.1%
25                 290    9.7%
20                 275    9.2%
10                 264    8.9%
5                  102    3.4%
29                 62     2.1%
28                 59     2.0%
11                 58     1.9%
Other values (23)  604    20.2%

Minimum 10 values

Value  Count  Frequency (%)
1      520    17.4%
2      28     0.9%
3      22     0.7%
4      19     0.6%
5      102    3.4%
6      45     1.5%
7      29     1.0%
8      20     0.7%
9      19     0.6%
10     264    8.9%

Maximum 10 values

Value  Count  Frequency (%)
256    1      < 0.1%
255    1      < 0.1%
31     19     0.6%
30     447    15.0%
29     62     2.1%
28     59     2.0%
27     46     1.5%
26     50     1.7%
25     290    9.7%
24     19     0.6%

등록일자 (registration date)
Date

Distinct: 687
Distinct (%): 23.0%
Missing: 0
Missing (%): 0.0%
Memory size: 23.4 KiB
Minimum: 2014-10-07 00:00:00
Maximum: 2023-08-11 00:00:00
Histogram with fixed size bins (bins=50)

수정일자 (modification date)
Date

MISSING

Distinct: 469
Distinct (%): 31.0%
Missing: 1468
Missing (%): 49.2%
Memory size: 23.4 KiB
Minimum: 2014-10-08 00:00:00
Maximum: 2023-08-11 00:00:00
Histogram with fixed size bins (bins=50)

삭제여부 (deleted flag)
Boolean

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

Value  Count  Frequency (%)
False  1640   55.0%
True   1343   45.0%

Interactions

(16 interaction plots; images not reproduced.)

Correlations

Correlation matrix 1

          분납구분  총금액  설정금액  달설정  일설정  삭제여부
분납구분  1.000     0.039   0.057     0.000   0.057   0.238
총금액    0.039     1.000   0.559     0.000   0.000   0.000
설정금액  0.057     0.559   1.000     0.000   0.000   0.000
달설정    0.000     0.000   0.000     1.000   0.000   0.041
일설정    0.057     0.000   0.000     0.000   1.000   0.000
삭제여부  0.238     0.000   0.000     0.041   0.000   1.000

Correlation matrix 2

          분납구분  삭제여부
분납구분  1.000     0.153
삭제여부  0.153     1.000

Correlation matrix 3

          총금액  설정금액  달설정  일설정  분납구분  삭제여부
총금액    1.000   0.279     -0.001  0.027   0.028     0.000
설정금액  0.279   1.000     0.079   0.022   0.041     0.000
달설정    -0.001  0.079     1.000   0.009   0.000     0.027
일설정    0.027   0.022     0.009   1.000   0.095     0.000
분납구분  0.028   0.041     0.000   0.095   1.000     0.153
삭제여부  0.000   0.000     0.027   0.000   0.153     1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

      분납구분  총금액   설정금액  세목        달설정  일설정  등록일자    수정일자    삭제여부
0     1         2382900  100000    자동차세    1       1       2014-10-07  2015-06-01  Y
1     1         5951000  500000    취득세      1       1       2014-10-07  2015-05-14  Y
2     1         188620   100000    자동차세    1       1       2014-10-07  2015-04-23  N
3     1         839080   100000    자동차세    1       1       2014-10-07  2015-08-25  Y
4     1         173600   100000    등록세      1       1       2014-10-07  <NA>        N
5     2         901560   300000    과태료      1       1       2014-10-07  <NA>        N
6     1         212210   50000     지방소득세  1       1       2014-10-07  <NA>        N
7     1         325080   100000    자동차세    1       1       2014-10-07  <NA>        N
8     1         1042420  100000    재산세      1       1       2014-10-07  <NA>        N
9     1         810440   200000    자동차세    1       1       2014-10-07  <NA>        N

Last rows

      분납구분  총금액   설정금액  세목        달설정  일설정  등록일자    수정일자    삭제여부
2973  1         414490   300000    지방세      2       1       2023-06-22  <NA>        N
2974  1         2575000  150000    자동차세    1       25      2023-08-11  <NA>        N
2975  1         3400000  100000    자동차      1       25      2023-08-11  <NA>        N
2976  1         7716860  100000    (blank)     1       25      2023-08-11  <NA>        N
2977  1         3887280  200000    (blank)     1       25      2023-08-11  <NA>        N
2978  1         1966430  100000    (blank)     1       20      2023-08-11  <NA>        N
2979  1         4447410  100000    (blank)     1       25      2023-08-11  <NA>        N
2980  1         4300000  100000    (blank)     1       25      2023-08-11  2023-08-11  Y
2981  1         3598820  200000    결손        1       25      2023-08-11  <NA>        N
2982  1         4060300  300000    (blank)     1       14      2023-08-11  <NA>        N

Duplicate rows

Most frequently occurring

    분납구분  총금액   설정금액  세목              달설정  일설정  등록일자    수정일자    삭제여부  # duplicates
13  2         1779680  30        손해배상          1       30      2014-12-15  <NA>        N         4
15  2         4164700  34        세외수입, 지방세  1       11      2015-01-14  <NA>        N         4
16  2         4893410  30        손해배상,주정차   1       30      2014-12-15  <NA>        N         4
0   1         200000   100000    테스트            1       13      2014-10-15  2014-10-16  Y         3
7   1         2085340  20        자동차세          1       15      2015-07-01  <NA>        N         3
10  2         536400   0         주정차위반        1       10      2017-04-07  <NA>        N         3
1   1         690030   690030    자동차세          4       15      2015-02-13  2015-02-13  Y         2
2   1         747710   10        재산세            1       11      2015-05-12  <NA>        N         2
3   1         768500   0         자동차세          1       30      2015-06-11  <NA>        N         2
4   1         904330   20        지방세등          1       15      2015-05-19  <NA>        N         2
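
A listing like the one above can be produced by counting identical rows. The following is a minimal sketch using only the standard library; the helper name `duplicate_groups` and the sample rows are hypothetical, and the report's own implementation may differ.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return (row, count) pairs for rows that occur more than once,
    most frequent first; ties keep their first-seen order."""
    counts = Counter(tuple(row) for row in rows)
    duplicated = [(row, c) for row, c in counts.items() if c > 1]
    return sorted(duplicated, key=lambda pair: pair[1], reverse=True)


# Hypothetical rows for illustration; not taken from the dataset.
rows = [
    (1, 200000, 100000, "A"),
    (2, 536400, 0, "B"),
    (1, 200000, 100000, "A"),
    (1, 200000, 100000, "A"),
    (2, 536400, 0, "B"),
    (1, 768500, 0, "C"),
]

groups = duplicate_groups(rows)
for row, count in groups:
    print(row, count)
```

Rows are compared on all columns at once, so two records count as duplicates only if every field matches.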