Overview

Dataset statistics

Number of variables: 10
Number of observations: 2815
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 408
Duplicate rows (%): 14.5%
Total size in memory: 231.0 KiB
Average record size in memory: 84.0 B
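
The two derived figures in this table can be checked from the raw counts. The sketch below only redoes the arithmetic with the numbers reported above; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_rows = 2815
n_duplicate_rows = 408
total_size_kib = 231.0

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = 100 * n_duplicate_rows / n_rows

# Average record size in bytes (1 KiB = 1024 bytes).
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both results round to the reported values (14.5% and 84.0 B).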

Variable types

DateTime: 3
Numeric: 3
Categorical: 3
Text: 1

Dataset

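Description: Refund processing records from the online bulky-waste disposal reporting and processing system of Chungju-si, Chungcheongbuk-do. Fields: 환불처리일자 (refund processing date), 환불접수일자 (refund receipt date), 환불금액 (refund amount), 환불처리사유 (refund reason), 환불결제수단 (refund payment method), 금액 (amount), 수량 (quantity), 단가 (unit price), 품목코드 (item code), 데이터기준일자 (data reference date).
URL: https://www.data.go.kr/data/15122270/fileData.do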

Alerts

수량 has constant value "1" (Constant)
데이터기준일자 has constant value "2023-08-31" (Constant)
Dataset has 408 (14.5%) duplicate rows (Duplicates)
금액 is highly overall correlated with 단가 (High correlation)
단가 is highly overall correlated with 금액 (High correlation)
환불처리사유 is highly overall correlated with 환불결제수단 (High correlation)
환불결제수단 is highly overall correlated with 환불처리사유 (High correlation)
환불처리사유 is highly imbalanced (91.1%) (Imbalance)
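
The imbalance alert can be read as "how far the class distribution is from uniform". A minimal sketch follows, assuming the score is one minus the Shannon entropy normalized by its maximum; this is an assumption about how the profiler computes it, not something the report states. The function name and the toy counts are hypothetical, and the 91.1% figure is not reproduced because the report does not list all 41 class counts.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (bits) of k class counts."""
    k = len(counts)
    if k <= 1:
        return 0.0  # a single class has no distribution to compare against
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Toy inputs (not taken from the dataset): uniform classes vs. one dominant class.
balanced = imbalance_score([10, 10, 10, 10])
skewed = imbalance_score([97, 1, 1, 1])
print(f"balanced={balanced:.3f}, skewed={skewed:.3f}")
```

A score near 0 means the classes are evenly spread; a score near 1 means a single class dominates, as 미수거 does here with 95.5% of the rows.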

Reproduction

Analysis started: 2023-12-12 07:39:17.353990
Analysis finished: 2023-12-12 07:39:19.206408
Duration: 1.85 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
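
A report of this kind is typically produced with a few lines of Python. The sketch below is hedged: the function name, file paths, report title, and the `cp949` encoding are assumptions for illustration, and the actual `config.json` used for this run is not shown in the report. The third-party imports sit inside the function so that defining it does not require the libraries to be installed.

```python
def build_report(csv_path, out_path="refund_profile.html"):
    """Profile the refund CSV and write an HTML report.

    Requires pandas and ydata-profiling to be installed when called.
    """
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(
        csv_path,
        encoding="cp949",  # assumption: try "utf-8" if decoding fails
        parse_dates=["환불처리일자", "환불접수일자", "데이터기준일자"],
    )
    report = ProfileReport(df, title="Chungju bulky-waste refund records")
    report.to_file(out_path)
    return out_path
```

Calling `build_report("refunds.csv")` with a hypothetical local copy of the file would write the HTML report next to the script.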

Variables

환불처리일자 (refund processing date)
Date

Distinct: 234
Distinct (%): 8.3%
Missing: 0
Missing (%): 0.0%
Memory size: 22.1 KiB
Minimum: 2023-01-01 00:00:00
Maximum: 2023-09-01 00:00:00
Histogram with fixed size bins (bins=50)

환불접수일자 (refund receipt date)
Date

Distinct: 105
Distinct (%): 3.7%
Missing: 0
Missing (%): 0.0%
Memory size: 22.1 KiB
Minimum: 2023-01-02 00:00:00
Maximum: 2023-09-04 00:00:00
Histogram with fixed size bins (bins=50)

환불금액 (refund amount)
Real number (ℝ)

Distinct: 58
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23198.934
Minimum: 1000
Maximum: 168000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 24.9 KiB

Quantile statistics

Minimum: 1000
5th percentile: 2000
Q1: 5000
Median: 11000
Q3: 24000
95th percentile: 94000
Maximum: 168000
Range: 167000
Interquartile range (IQR): 19000

Descriptive statistics

Standard deviation: 32296.127
Coefficient of variation (CV): 1.3921384
Kurtosis: 7.067053
Mean: 23198.934
Median Absolute Deviation (MAD): 7000
Skewness: 2.5957792
Sum: 65305000
Variance: 1.0430398 × 10^9
Monotonicity: Not monotonic
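
Several of these statistics are derived from others in the same tables, which makes them easy to sanity-check. The sketch below recomputes the coefficient of variation, IQR, range, and variance for 환불금액 from the reported mean, standard deviation, quartiles, and extremes; the variable names are illustrative.

```python
# Values copied from the quantile and descriptive statistics of 환불금액.
mean, std = 23198.934, 32296.127
q1, q3 = 5000, 24000
minimum, maximum = 1000, 168000

cv = std / mean                   # coefficient of variation
iqr = q3 - q1                     # interquartile range
value_range = maximum - minimum   # range
variance = std ** 2               # variance is the squared standard deviation

print(f"CV={cv:.7f}, IQR={iqr}, range={value_range}, variance={variance:.4e}")
```

A CV above 1 together with a skewness of about 2.6 indicates a long right tail: most refunds are small, while a few reach 168000.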
Histogram with fixed size bins (bins=50)
Common values
Value              Count  Frequency (%)
6000               203    7.2%
2000               202    7.2%
4000               188    6.7%
3000               179    6.4%
8000               175    6.2%
5000               157    5.6%
10000              153    5.4%
12000              132    4.7%
14000              83     2.9%
9000               71     2.5%
Other values (48)  1272   45.2%

Minimum 10 values
Value   Count  Frequency (%)
1000    2      0.1%
2000    202    7.2%
3000    179    6.4%
4000    188    6.7%
5000    157    5.6%
5500    1      < 0.1%
6000    203    7.2%
7000    69     2.5%
8000    175    6.2%
9000    71     2.5%

Maximum 10 values
Value   Count  Frequency (%)
168000  55     2.0%
119000  28     1.0%
101000  18     0.6%
95000   25     0.9%
94000   35     1.2%
92000   28     1.0%
82000   25     0.9%
80000   17     0.6%
79000   20     0.7%
78000   23     0.8%

환불처리사유 (refund reason)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 41
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 22.1 KiB

Value                          Count
미수거                         2689
클린센터에 민원인이 직접 입고  15
없어짐                         10
환불                           9
재접수                         8
Other values (36)              84

Length

Max length: 39
Median length: 3
Mean length: 3.19254
Min length: 2

Unique

Unique: 16
Unique (%): 0.6%

Sample

1st row: 미수거
2nd row: 미수거
3rd row: 미수거
4th row: 미수거
5th row: 미수거

Common Values

Value                          Count  Frequency (%)
미수거                         2689   95.5%
클린센터에 민원인이 직접 입고  15     0.5%
없어짐                         10     0.4%
환불                           9      0.3%
재접수                         8      0.3%
다시 접수처리                  7      0.2%
변심                           7      0.2%
착오송금                       6      0.2%
제접수함                       5      0.2%
금가면처리                     5      0.2%
Other values (31)              54     1.9%

Length

Histogram of lengths of the category
Value              Count  Frequency (%)
미수거             2693   92.6%
민원인이           16     0.6%
직접               16     0.6%
입고               15     0.5%
클린센터에         15     0.5%
없어짐             11     0.4%
재접수             9      0.3%
환불               9      0.3%
다시               8      0.3%
접수처리           7      0.2%
Other values (52)  110    3.8%

환불결제수단 (refund payment method)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.1 KiB

Value     Count
카드      1594
계좌이체  1204
방문      17

Length

Max length: 4
Median length: 2
Mean length: 2.8554174
Min length: 2
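
The mean length is a count-weighted average of the string lengths of the three categories. The sketch below recomputes it from the category counts reported for 환불결제수단; the dictionary name is illustrative, and lengths are measured in Unicode code points after NFC normalization.

```python
import unicodedata

# Category counts copied from the 환불결제수단 value table.
counts = {"카드": 1594, "계좌이체": 1204, "방문": 17}

total = sum(counts.values())

# Weight each category's length (in code points, NFC-normalized) by its count.
mean_length = sum(
    len(unicodedata.normalize("NFC", value)) * n for value, n in counts.items()
) / total

print(f"rows={total}, mean length={mean_length:.7f}")
```

The result matches the reported mean length of 2.8554174, and the counts add up to all 2815 observations.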

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 카드
2nd row: 카드
3rd row: 카드
4th row: 카드
5th row: 카드

Common Values

Value     Count  Frequency (%)
카드      1594   56.6%
계좌이체  1204   42.8%
방문      17     0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value     Count  Frequency (%)
카드      1594   56.6%
계좌이체  1204   42.8%
방문      17     0.6%

금액 (amount)
Real number (ℝ)

HIGH CORRELATION

Distinct: 15
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3665.0089
Minimum: 1000
Maximum: 30000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 24.9 KiB

Quantile statistics

Minimum: 1000
5th percentile: 2000
Q1: 2000
Median: 3000
Q3: 4000
95th percentile: 8000
Maximum: 30000
Range: 29000
Interquartile range (IQR): 2000

Descriptive statistics

Standard deviation: 2403.5114
Coefficient of variation (CV): 0.6557996
Kurtosis: 17.052494
Mean: 3665.0089
Median Absolute Deviation (MAD): 1000
Skewness: 2.9968206
Sum: 10317000
Variance: 5776866.9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=15)
Common values
Value             Count  Frequency (%)
2000              1092   38.8%
3000              657    23.3%
4000              344    12.2%
5000              250    8.9%
6000              171    6.1%
8000              106    3.8%
10000             96     3.4%
1000              40     1.4%
7000              22     0.8%
9000              12     0.4%
Other values (5)  25     0.9%

Minimum 10 values
Value  Count  Frequency (%)
1000   40     1.4%
2000   1092   38.8%
3000   657    23.3%
3500   1      < 0.1%
4000   344    12.2%
5000   250    8.9%
5500   3      0.1%
6000   171    6.1%
7000   22     0.8%
8000   106    3.8%

Maximum 10 values
Value  Count  Frequency (%)
30000  2      0.1%
20000  7      0.2%
15000  12     0.4%
10000  96     3.4%
9000   12     0.4%
8000   106    3.8%
7000   22     0.8%
6000   171    6.1%
5500   3      0.1%
5000   250    8.9%

수량 (quantity)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.1 KiB

Value  Count
1      2815

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      2815   100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
1      2815   100.0%

단가 (unit price)
Real number (ℝ)

HIGH CORRELATION

Distinct: 15
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3665.0089
Minimum: 1000
Maximum: 30000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 24.9 KiB

Quantile statistics

Minimum: 1000
5th percentile: 2000
Q1: 2000
Median: 3000
Q3: 4000
95th percentile: 8000
Maximum: 30000
Range: 29000
Interquartile range (IQR): 2000

Descriptive statistics

Standard deviation: 2403.5114
Coefficient of variation (CV): 0.6557996
Kurtosis: 17.052494
Mean: 3665.0089
Median Absolute Deviation (MAD): 1000
Skewness: 2.9968206
Sum: 10317000
Variance: 5776866.9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=15)
Common values
Value             Count  Frequency (%)
2000              1092   38.8%
3000              657    23.3%
4000              344    12.2%
5000              250    8.9%
6000              171    6.1%
8000              106    3.8%
10000             96     3.4%
1000              40     1.4%
7000              22     0.8%
9000              12     0.4%
Other values (5)  25     0.9%

Minimum 10 values
Value  Count  Frequency (%)
1000   40     1.4%
2000   1092   38.8%
3000   657    23.3%
3500   1      < 0.1%
4000   344    12.2%
5000   250    8.9%
5500   3      0.1%
6000   171    6.1%
7000   22     0.8%
8000   106    3.8%

Maximum 10 values
Value  Count  Frequency (%)
30000  2      0.1%
20000  7      0.2%
15000  12     0.4%
10000  96     3.4%
9000   12     0.4%
8000   106    3.8%
7000   22     0.8%
6000   171    6.1%
5500   3      0.1%
5000   250    8.9%

품목코드 (item code)
Text

Distinct: 217
Distinct (%): 7.7%
Missing: 0
Missing (%): 0.0%
Memory size: 22.1 KiB

Length

Max length: 10
Median length: 7
Mean length: 6.8284192
Min length: 2

Characters and Unicode

Total characters: 19222
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
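
Python's standard `unicodedata` module exposes these per-character properties, which is enough to reproduce the category breakdown in miniature. The sketch below classifies the characters of the five sample item codes listed in this section; the variable names are illustrative, and `Nd` and `Pd` are the Unicode general-category codes for "Decimal Number" and "Dash Punctuation".

```python
import unicodedata
from collections import Counter

# The five sample values shown for 품목코드.
codes = ["2010012", "2010038", "2020057", "2020057", "2020065"]

# Count the Unicode general category of every character in every code.
category_counts = Counter(
    unicodedata.category(ch) for code in codes for ch in code
)

# The hyphen found in a few codes falls into a different category.
dash_category = unicodedata.category("-")

print(category_counts, dash_category)
```

All 35 characters in the sample are decimal digits, which is consistent with the full column being more than 99.9% "Decimal Number" with only 4 dash characters.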

Unique

Unique: 31
Unique (%): 1.1%

Sample

1st row: 2010012
2nd row: 2010038
3rd row: 2020057
4th row: 2020057
5th row: 2020065
Value               Count  Frequency (%)
2020055             166    5.9%
2020053             119    4.2%
2090115             110    3.9%
2020054             93     3.3%
2010012             70     2.5%
2010001             68     2.4%
2020074             63     2.2%
2090071             52     1.8%
2020002             51     1.8%
2020070             48     1.7%
Other values (207)  1975   70.2%

Most occurring characters

Value  Count  Frequency (%)
0      8610   44.8%
2      4741   24.7%
1      1228   6.4%
9      1089   5.7%
5      889    4.6%
7      624    3.2%
3      601    3.1%
8      558    2.9%
4      502    2.6%
6      376    2.0%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    19218  > 99.9%
Dash Punctuation  4      < 0.1%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      8610   44.8%
2      4741   24.7%
1      1228   6.4%
9      1089   5.7%
5      889    4.6%
7      624    3.2%
3      601    3.1%
8      558    2.9%
4      502    2.6%
6      376    2.0%

Dash Punctuation
Value  Count  Frequency (%)
-      4      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  19222  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      8610   44.8%
2      4741   24.7%
1      1228   6.4%
9      1089   5.7%
5      889    4.6%
7      624    3.2%
3      601    3.1%
8      558    2.9%
4      502    2.6%
6      376    2.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  19222  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      8610   44.8%
2      4741   24.7%
1      1228   6.4%
9      1089   5.7%
5      889    4.6%
7      624    3.2%
3      601    3.1%
8      558    2.9%
4      502    2.6%
6      376    2.0%

데이터기준일자 (data reference date)
Date

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.1 KiB
Minimum: 2023-08-31 00:00:00
Maximum: 2023-08-31 00:00:00
Histogram with fixed size bins (bins=1)

Interactions

[Figures: nine pairwise interaction plots; images not preserved in this text version]

Correlations

Variable      환불금액  환불처리사유  환불결제수단  금액   단가
환불금액      1.000     0.482         0.425         0.170  0.170
환불처리사유  0.482     1.000         0.862         0.706  0.706
환불결제수단  0.425     0.862         1.000         0.029  0.029
금액          0.170     0.706         0.029         1.000  1.000
단가          0.170     0.706         0.029         1.000  1.000

Variable      환불결제수단  환불처리사유
환불결제수단  1.000         0.665
환불처리사유  0.665         1.000

Variable      환불금액  금액   단가   환불처리사유  환불결제수단
환불금액      1.000     0.129  0.129  0.204         0.299
금액          0.129     1.000  1.000  0.348         0.018
단가          0.129     1.000  1.000  0.348         0.018
환불처리사유  0.204     0.348  0.348  1.000         0.665
환불결제수단  0.299     0.018  0.018  0.665         1.000
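
The coefficient of 1.000 between 금액 and 단가 is what one would expect if 금액 equals 수량 × 단가, because 수량 is constant at 1; that relationship is an assumption, although the two columns do have identical summary statistics and identical values in every sample row. The sketch below computes a Pearson coefficient on the first ten sample rows. Pearson is used only for illustration, since the report does not say which coefficient each matrix contains, and the function name is hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# 금액 and 단가 from the first ten rows of the Sample section.
amount = [2000, 3000, 8000, 8000, 9000, 4000, 2000, 2000, 3000, 4000]
unit_price = [2000, 3000, 8000, 8000, 9000, 4000, 2000, 2000, 3000, 4000]

r = pearson(amount, unit_price)
print(f"r = {r:.3f}")
```

Because the columns carry the same information, one of them (and the constant 수량) could be dropped before modelling without losing anything.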

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
Index  환불처리일자  환불접수일자  환불금액  환불처리사유  환불결제수단  금액  수량  단가  품목코드  데이터기준일자
0      2023-01-01    2023-01-02    7000      미수거        카드          2000  1     2000  2010012   2023-08-31
1      2023-01-01    2023-01-02    7000      미수거        카드          3000  1     3000  2010038   2023-08-31
2      2023-01-01    2023-01-02    29000     미수거        카드          8000  1     8000  2020057   2023-08-31
3      2023-01-01    2023-01-02    29000     미수거        카드          8000  1     8000  2020057   2023-08-31
4      2023-01-01    2023-01-02    29000     미수거        카드          9000  1     9000  2020065   2023-08-31
5      2023-01-01    2023-01-02    29000     미수거        카드          4000  1     4000  2010009   2023-08-31
6      2023-01-01    2023-01-02    29000     미수거        계좌이체      2000  1     2000  2020048   2023-08-31
7      2023-01-01    2023-01-02    29000     미수거        계좌이체      2000  1     2000  2020055   2023-08-31
8      2023-01-01    2023-01-02    29000     미수거        계좌이체      3000  1     3000  2020060   2023-08-31
9      2023-01-01    2023-01-02    29000     미수거        계좌이체      4000  1     4000  2020037   2023-08-31

Last rows
Index  환불처리일자  환불접수일자  환불금액  환불처리사유  환불결제수단  금액  수량  단가  품목코드  데이터기준일자
2805   2023-08-31    2023-09-03    21000     미수거        카드          5000  1     5000  2010043   2023-08-31
2806   2023-08-31    2023-09-03    21000     미수거        카드          2000  1     2000  2010012   2023-08-31
2807   2023-08-31    2023-09-03    21000     미수거        카드          3000  1     3000  2010004   2023-08-31
2808   2023-08-31    2023-09-03    21000     미수거        카드          4000  1     4000  2090005   2023-08-31
2809   2023-08-31    2023-09-03    21000     미수거        카드          2000  1     2000  2090115   2023-08-31
2810   2023-08-31    2023-09-04    8000      미수거        카드          2000  1     2000  2090115   2023-08-31
2811   2023-08-31    2023-09-04    8000      미수거        카드          2000  1     2000  2090115   2023-08-31
2812   2023-08-31    2023-09-04    8000      미수거        카드          2000  1     2000  2090115   2023-08-31
2813   2023-08-31    2023-09-04    8000      미수거        카드          2000  1     2000  2090115   2023-08-31
2814   2023-09-01    2023-09-03    2000      미수거        카드          2000  1     2000  2020068   2023-08-31

Duplicate rows

Most frequently occurring

Index  환불처리일자  환불접수일자  환불금액  환불처리사유                   환불결제수단  금액  수량  단가  품목코드  데이터기준일자  # duplicates
337    2023-07-31    2023-08-07    82000     미수거                         카드          3000  1     3000  2020053   2023-08-31      21
58     2023-02-17    2023-02-19    45000     클린센터에 민원인이 직접 입고  방문          3000  1     3000  2090044   2023-08-31      15
80     2023-02-27    2023-03-02    75000     미수거                         계좌이체      5000  1     5000  2090063   2023-08-31      14
267    2023-06-15    2023-06-18    58000     미수거                         계좌이체      4000  1     4000  2020009   2023-08-31      14
381    2023-08-16    2023-08-20    28000     미수거                         카드          2000  1     2000  2090115   2023-08-31      14
357    2023-08-06    2023-08-07    94000     미수거                         카드          2000  1     2000  2020054   2023-08-31      11
142    2023-04-04    2023-04-06    34000     미수거                         계좌이체      2000  1     2000  2090053   2023-08-31      10
174    2023-04-28    2023-04-30    20000     미수거                         계좌이체      2000  1     2000  2090115   2023-08-31      10
319    2023-07-16    2023-07-23    119000    미수거                         계좌이체      2000  1     2000  2020002   2023-08-31      10
331    2023-07-24    2023-07-30    20000     미수거                         카드          2000  1     2000  2020055   2023-08-31      10
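
Duplicate detection of this kind amounts to grouping identical rows and counting each group. The sketch below does that with `collections.Counter` on seven rows taken from the Sample section (indices 2 to 4 and 2810 to 2813); the variable names are illustrative. It reports both the number of distinct row patterns that repeat and the number of surplus copies, because the report does not state which of the two its "Duplicate rows" figure measures.

```python
from collections import Counter

# Rows copied from the Sample section, one tuple per record.
head = ("2023-01-01", "2023-01-02", 29000, "미수거", "카드")
tail = ("2023-08-31", "2023-09-04", 8000, "미수거", "카드", 2000, 1, 2000, "2090115", "2023-08-31")
rows = [
    head + (8000, 1, 8000, "2020057", "2023-08-31"),  # index 2
    head + (8000, 1, 8000, "2020057", "2023-08-31"),  # index 3, same as index 2
    head + (9000, 1, 9000, "2020065", "2023-08-31"),  # index 4, unique here
    tail, tail, tail, tail,                           # indices 2810 to 2813
]

group_sizes = Counter(rows)

# Row patterns that occur more than once, with their occurrence counts.
duplicated_groups = {row: n for row, n in group_sizes.items() if n > 1}

# Copies beyond the first occurrence of each pattern.
surplus_copies = sum(n - 1 for n in duplicated_groups.values())

print(len(duplicated_groups), surplus_copies)
```

In this dataset each record appears to describe a single item within a refund, so repeated rows may be legitimate multi-item refunds rather than data-entry errors; that reading is an interpretation, not something the report confirms.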