Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 634.8 KiB
Average record size in memory: 65.0 B
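
The average record size follows from the total memory footprint divided by the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes; the helper name `average_record_size` is illustrative and not part of ydata-profiling:

```python
def average_record_size(total_kib: float, n_rows: int) -> float:
    """Return the mean in-memory size of one row, in bytes."""
    return total_kib * 1024 / n_rows

# Figures copied from the dataset statistics.
avg_bytes = average_record_size(634.8, 10_000)
duplicate_share = 1 / 10_000  # one duplicate row out of 10000 observations

print(f"{avg_bytes:.1f} B")      # average record size
print(f"{duplicate_share:.2%}")  # share of duplicate rows
```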

Variable types

Text: 3
Numeric: 1
Categorical: 1
DateTime: 2

Dataset

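Description: Status of reports filed with the Gyeongju City Bulky Waste Online Center (경주시 대형폐기물온라인센터), covering the discarded item name, item specification, quantity, amount, payment method, registration date, collection date, and so on.
Author: Gyeongju City, Gyeongsangbuk-do (경상북도 경주시)
URL: https://www.data.go.kr/data/15062552/fileData.do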

Alerts

Dataset has 1 (< 0.1%) duplicate rows [Duplicates]
결제방법 (payment method) is highly imbalanced (61.5%) [Imbalance]

Reproduction

Analysis started: 2024-03-14 16:39:51.444836
Analysis finished: 2024-03-14 16:39:53.096291
Duration: 1.65 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

배출폐기물 (discarded item)
Text

Distinct: 2113
Distinct (%): 21.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 92
Median length: 71
Mean length: 7.3062
Min length: 2

Characters and Unicode

Total characters: 73062
Distinct characters: 167
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
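
As a rough illustration of how such per-character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the report's own implementation; the helper name `category_counts` is hypothetical, and the three input strings are sample values from this variable.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over every character in the values."""
    counts = Counter()
    for value in values:
        counts.update(unicodedata.category(ch) for ch in value)
    return counts

# "Lo" = Other Letter (Hangul syllables), "Po" = Other Punctuation,
# "Zs" = Space Separator.
sample = ["자전거", "카펫, 매트", "변기"]
counts = category_counts(sample)
print(counts)
```

The two-letter codes map onto the category names used in the tables of this report, for example `Nd` for Decimal Number and `Ps`/`Pe` for Open and Close Punctuation.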

Unique

Unique: 1769
Unique (%): 17.7%

Sample

1st row: 자전거 (bicycle)
2nd row: 카펫, 매트 (carpet, mat)
3rd row: 변기 (toilet)
4th row: 침대목재, 매트리스 (bed frame wood, mattress)
5th row: 기타03 (other 03)
Value | Count | Frequency (%)
의자 (chair) | 812 | 6.3%
침대목재 (bed frame wood) | 811 | 6.3%
매트리스 (mattress) | 727 | 5.6%
소파 (sofa) | 550 | 4.2%
매트 (mat) | 441 | 3.4%
장난감류 (toys) | 347 | 2.7%
가방 (bag) | 343 | 2.6%
장식장 (display cabinet) | 311 | 2.4%
인형류 (dolls) | 306 | 2.4%
서랍장 (chest of drawers) | 280 | 2.2%
Other values (2204) | 8037 | 62.0%

Most occurring characters

ValueCountFrequency (%)
, 9335
 
12.8%
3745
 
5.1%
2965
 
4.1%
2654
 
3.6%
2447
 
3.3%
2359
 
3.2%
2312
 
3.2%
2044
 
2.8%
1735
 
2.4%
1674
 
2.3%
Other values (157) 41792
57.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 56858 | 77.8%
Other Punctuation | 9335 | 12.8%
Space Separator | 2965 | 4.1%
Decimal Number | 1988 | 2.7%
Open Punctuation | 730 | 1.0%
Close Punctuation | 730 | 1.0%
Uppercase Letter | 456 | 0.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3745
 
6.6%
2654
 
4.7%
2447
 
4.3%
2359
 
4.1%
2312
 
4.1%
2044
 
3.6%
1735
 
3.1%
1674
 
2.9%
1637
 
2.9%
1558
 
2.7%
Other values (138) 34693
61.0%
Decimal Number
Value | Count | Frequency (%)
0 | 983 | 49.4%
2 | 366 | 18.4%
1 | 304 | 15.3%
3 | 188 | 9.5%
4 | 60 | 3.0%
5 | 57 | 2.9%
6 | 13 | 0.7%
7 | 11 | 0.6%
8 | 4 | 0.2%
9 | 2 | 0.1%
Uppercase Letter
Value | Count | Frequency (%)
V | 222 | 48.7%
T | 222 | 48.7%
F | 4 | 0.9%
P | 4 | 0.9%
R | 4 | 0.9%
Other Punctuation
Value | Count | Frequency (%)
, | 9335 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 2965 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 730 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 730 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 56858 | 77.8%
Common | 15748 | 21.6%
Latin | 456 | 0.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3745
 
6.6%
2654
 
4.7%
2447
 
4.3%
2359
 
4.1%
2312
 
4.1%
2044
 
3.6%
1735
 
3.1%
1674
 
2.9%
1637
 
2.9%
1558
 
2.7%
Other values (138) 34693
61.0%
Common
Value | Count | Frequency (%)
, | 9335 | 59.3%
(space) | 2965 | 18.8%
0 | 983 | 6.2%
( | 730 | 4.6%
) | 730 | 4.6%
2 | 366 | 2.3%
1 | 304 | 1.9%
3 | 188 | 1.2%
4 | 60 | 0.4%
5 | 57 | 0.4%
Other values (4) | 30 | 0.2%
Latin
Value | Count | Frequency (%)
V | 222 | 48.7%
T | 222 | 48.7%
F | 4 | 0.9%
P | 4 | 0.9%
R | 4 | 0.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 56858 | 77.8%
ASCII | 16204 | 22.2%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
, | 9335 | 57.6%
(space) | 2965 | 18.3%
0 | 983 | 6.1%
( | 730 | 4.5%
) | 730 | 4.5%
2 | 366 | 2.3%
1 | 304 | 1.9%
V | 222 | 1.4%
T | 222 | 1.4%
3 | 188 | 1.2%
Other values (9) | 159 | 1.0%
Hangul
ValueCountFrequency (%)
3745
 
6.6%
2654
 
4.7%
2447
 
4.3%
2359
 
4.1%
2312
 
4.1%
2044
 
3.6%
1735
 
3.1%
1674
 
2.9%
1637
 
2.9%
1558
 
2.7%
Other values (138) 34693
61.0%

배출규격 (item specification)
Text

Distinct: 2111
Distinct (%): 21.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 110
Median length: 95
Mean length: 11.2451
Min length: 1

Characters and Unicode

Total characters: 112451
Distinct characters: 135
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1743
Unique (%): 17.4%

Sample

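1st row: 아동용 3발 (children's, three-wheeled)
2nd row: 한 묶음 (one bundle)
3rd row: 모든 규격 (all sizes)
4th row: 1인용(매트리스+목재) (single, mattress + wooden frame)
5th row: 각종 품목 (various items)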
Value | Count | Frequency (%)
이상 (or more) | 2698 | 8.6%
규격 (size) | 2332 | 7.4%
모든 (all) | 2332 | 7.4%
미만 (less than) | 2225 | 7.1%
1m | 1618 | 5.2%
각종 (various) | 994 | 3.2%
품목 (items) | 993 | 3.2%
높이 (height) | 870 | 2.8%
철재의자 (steel chair) | 756 | 2.4%
목재 (wood) | 756 | 2.4%
Other values (99) | 15747 | 50.3%

Most occurring characters

ValueCountFrequency (%)
22591
 
20.1%
5034
 
4.5%
4510
 
4.0%
1 3744
 
3.3%
m 3487
 
3.1%
2867
 
2.5%
2867
 
2.5%
2703
 
2.4%
2646
 
2.4%
2637
 
2.3%
Other values (125) 59365
52.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 67931 | 60.4%
Space Separator | 22591 | 20.1%
Decimal Number | 11182 | 9.9%
Lowercase Letter | 6104 | 5.4%
Open Punctuation | 1988 | 1.8%
Close Punctuation | 1988 | 1.8%
Math Symbol | 546 | 0.5%
Other Punctuation | 65 | 0.1%
Other Symbol | 56 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5034
 
7.4%
4510
 
6.6%
2867
 
4.2%
2867
 
4.2%
2703
 
4.0%
2646
 
3.9%
2637
 
3.9%
2557
 
3.8%
2448
 
3.6%
2265
 
3.3%
Other values (101) 37397
55.1%
Decimal Number
Value | Count | Frequency (%)
1 | 3744 | 33.5%
0 | 2193 | 19.6%
2 | 1860 | 16.6%
4 | 1041 | 9.3%
3 | 746 | 6.7%
6 | 634 | 5.7%
9 | 521 | 4.7%
5 | 400 | 3.6%
8 | 43 | 0.4%
Lowercase Letter
ValueCountFrequency (%)
m 3487
57.1%
c 1555
25.5%
g 488
 
8.0%
k 488
 
8.0%
86
 
1.4%
Other Symbol
ValueCountFrequency (%)
24
42.9%
17
30.4%
15
26.8%
Math Symbol
Value | Count | Frequency (%)
+ | 378 | 69.2%
× | 168 | 30.8%
Other Punctuation
Value | Count | Frequency (%)
/ | 61 | 93.8%
. | 4 | 6.2%
Space Separator
Value | Count | Frequency (%)
(space) | 22591 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 1988 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 1988 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 67931 | 60.4%
Common | 38502 | 34.2%
Latin | 6018 | 5.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5034
 
7.4%
4510
 
6.6%
2867
 
4.2%
2867
 
4.2%
2703
 
4.0%
2646
 
3.9%
2637
 
3.9%
2557
 
3.8%
2448
 
3.6%
2265
 
3.3%
Other values (101) 37397
55.1%
Common
Value | Count | Frequency (%)
(space) | 22591 | 58.7%
1 | 3744 | 9.7%
0 | 2193 | 5.7%
( | 1988 | 5.2%
) | 1988 | 5.2%
2 | 1860 | 4.8%
4 | 1041 | 2.7%
3 | 746 | 1.9%
6 | 634 | 1.6%
9 | 521 | 1.4%
Other values (10) | 1196 | 3.1%
Latin
Value | Count | Frequency (%)
m | 3487 | 57.9%
c | 1555 | 25.8%
g | 488 | 8.1%
k | 488 | 8.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 67931 | 60.4%
ASCII | 44210 | 39.3%
None | 168 | 0.1%
Letterlike Symbols | 86 | 0.1%
CJK Compat | 56 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 22591 | 51.1%
1 | 3744 | 8.5%
m | 3487 | 7.9%
0 | 2193 | 5.0%
( | 1988 | 4.5%
) | 1988 | 4.5%
2 | 1860 | 4.2%
c | 1555 | 3.5%
4 | 1041 | 2.4%
3 | 746 | 1.7%
Other values (9) | 3017 | 6.8%
Hangul
ValueCountFrequency (%)
5034
 
7.4%
4510
 
6.6%
2867
 
4.2%
2867
 
4.2%
2703
 
4.0%
2646
 
3.9%
2637
 
3.9%
2557
 
3.8%
2448
 
3.6%
2265
 
3.3%
Other values (101) 37397
55.1%
None
ValueCountFrequency (%)
× 168
100.0%
Letterlike Symbols
ValueCountFrequency (%)
86
100.0%
CJK Compat
ValueCountFrequency (%)
24
42.9%
17
30.4%
15
26.8%

수량 (quantity)
Text

Distinct: 363
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 25
Median length: 1
Mean length: 2.1413
Min length: 1

Characters and Unicode

Total characters: 21413
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 261
Unique (%): 2.6%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 2
5th row: 1
Value | Count | Frequency (%)
1 | 6324 | 63.2%
1,1 | 1078 | 10.8%
2 | 713 | 7.1%
1,1,1 | 377 | 3.8%
3 | 152 | 1.5%
1,1,1,1 | 131 | 1.3%
2,1 | 125 | 1.2%
1,2 | 102 | 1.0%
4 | 76 | 0.8%
1,1,1,1,1 | 57 | 0.6%
Other values (332) | 865 | 8.6%
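
Values such as `1,1` and `2,1` suggest that 수량 holds one quantity per discarded item, joined by commas, which is why the column is typed as Text rather than Numeric. A minimal sketch of turning such a string into numbers, assuming that reading is correct; the helper name `parse_quantities` is hypothetical, and the trailing-comma case is included because some rows appear to end with a comma:

```python
def parse_quantities(raw: str) -> list[int]:
    """Split a comma-separated quantity string into integers, skipping empty parts."""
    return [int(part) for part in raw.split(",") if part.strip()]

for raw in ["1", "1,1", "2,1", "1,1,1,1,"]:
    quantities = parse_quantities(raw)
    print(raw, "->", quantities, "total:", sum(quantities))
```

Summing the parsed list gives a per-record item count that can be profiled as a numeric variable.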

Most occurring characters

Value | Count | Frequency (%)
1 | 13201 | 61.6%
, | 5718 | 26.7%
2 | 1644 | 7.7%
3 | 427 | 2.0%
4 | 178 | 0.8%
5 | 88 | 0.4%
0 | 60 | 0.3%
6 | 43 | 0.2%
7 | 27 | 0.1%
8 | 15 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 15695 | 73.3%
Other Punctuation | 5718 | 26.7%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 13201 | 84.1%
2 | 1644 | 10.5%
3 | 427 | 2.7%
4 | 178 | 1.1%
5 | 88 | 0.6%
0 | 60 | 0.4%
6 | 43 | 0.3%
7 | 27 | 0.2%
8 | 15 | 0.1%
9 | 12 | 0.1%
Other Punctuation
Value | Count | Frequency (%)
, | 5718 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 21413 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 13201 | 61.6%
, | 5718 | 26.7%
2 | 1644 | 7.7%
3 | 427 | 2.0%
4 | 178 | 0.8%
5 | 88 | 0.4%
0 | 60 | 0.3%
6 | 43 | 0.2%
7 | 27 | 0.1%
8 | 15 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 21413 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 13201 | 61.6%
, | 5718 | 26.7%
2 | 1644 | 7.7%
3 | 427 | 2.0%
4 | 178 | 0.8%
5 | 88 | 0.4%
0 | 60 | 0.3%
6 | 43 | 0.2%
7 | 27 | 0.1%
8 | 15 | 0.1%

결제금액 (payment amount)
Real number (ℝ)

Distinct: 106
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7606.4
Minimum: 1000
Maximum: 235000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1000
5th percentile: 2000
Q1: 3000
Median: 4000
Q3: 8000
95th percentile: 23000
Maximum: 235000
Range: 234000
Interquartile range (IQR): 5000

Descriptive statistics

Standard deviation: 10803.136
Coefficient of variation (CV): 1.4202693
Kurtosis: 59.436968
Mean: 7606.4
Median Absolute Deviation (MAD): 2000
Skewness: 5.9969718
Sum: 76064000
Variance: 1.1670775 × 10^8
Monotonicity: Not monotonic
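
Several of these figures are derived from one another, so they can be cross-checked by hand. The sketch below recomputes the coefficient of variation, variance, sum, range, and IQR from the reported mean, standard deviation, extremes, and quartiles. It uses only numbers printed in this report, and the variable names are illustrative.

```python
import math

n = 10_000                       # number of observations
mean = 7606.4
std = 10803.136
minimum, maximum = 1000, 235000
q1, q3 = 3000, 8000

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
total = mean * n                 # sum of all values
value_range = maximum - minimum
iqr = q3 - q1

print(f"CV={cv:.7f} variance={variance:.7e} sum={total:.0f}")
print(f"range={value_range} IQR={iqr}")
```

A CV above 1 together with a skewness near 6 points to a long right tail: most payments fall between 2000 and 8000, while a handful exceed 100000.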
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
2000 | 2189 | 21.9%
4000 | 1458 | 14.6%
3000 | 1358 | 13.6%
5000 | 864 | 8.6%
6000 | 778 | 7.8%
8000 | 631 | 6.3%
10000 | 607 | 6.1%
12000 | 260 | 2.6%
7000 | 222 | 2.2%
15000 | 209 | 2.1%
Other values (96) | 1424 | 14.2%

Minimum 10 values
Value | Count | Frequency (%)
1000 | 127 | 1.3%
2000 | 2189 | 21.9%
3000 | 1358 | 13.6%
4000 | 1458 | 14.6%
5000 | 864 | 8.6%
6000 | 778 | 7.8%
7000 | 222 | 2.2%
8000 | 631 | 6.3%
9000 | 162 | 1.6%
10000 | 607 | 6.1%

Maximum 10 values
Value | Count | Frequency (%)
235000 | 1 | < 0.1%
189000 | 1 | < 0.1%
152000 | 1 | < 0.1%
143000 | 1 | < 0.1%
138000 | 1 | < 0.1%
130000 | 1 | < 0.1%
124000 | 1 | < 0.1%
122000 | 1 | < 0.1%
121000 | 1 | < 0.1%
115000 | 1 | < 0.1%

결제방법 (payment method)
Categorical

IMBALANCE

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

카드결제 (card payment): 8506
계좌이체 (bank transfer): 1493
현장결제 (on-site payment): 1

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 카드결제
2nd row: 카드결제
3rd row: 카드결제
4th row: 카드결제
5th row: 카드결제

Common Values

Value | Count | Frequency (%)
카드결제 (card payment) | 8506 | 85.1%
계좌이체 (bank transfer) | 1493 | 14.9%
현장결제 (on-site payment) | 1 | < 0.1%
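
The 61.5% imbalance reported for this variable is consistent with one minus the normalized Shannon entropy of the class shares. Whether the profiler computes it in exactly this way is an assumption here. The sketch below applies that formula to the three counts, and the function name `imbalance_score` is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no distribution to balance
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Counts for 카드결제, 계좌이체, 현장결제.
score = imbalance_score([8506, 1493, 1])
print(f"{score:.1%}")
```

A score of 0 would mean perfectly even classes and 1 a single dominant class. The lone 현장결제 record raises the score because it adds a third class that is almost empty.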

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
카드결제 (card payment) | 8506 | 85.1%
계좌이체 (bank transfer) | 1493 | 14.9%
현장결제 (on-site payment) | 1 | < 0.1%

등록일 (registration date)
DateTime

Distinct: 9878
Distinct (%): 98.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2022-09-01 07:31:00
Maximum: 2024-01-30 08:34:00
Histogram with fixed size bins (bins=50)

수거요청일 (requested collection date)
DateTime

Distinct: 443
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2022-09-02 00:00:00
Maximum: 2024-01-30 00:00:00
Histogram with fixed size bins (bins=50)

Interactions

[Figure: interactions plot]

Correlations

 | 결제금액 | 결제방법
결제금액 | 1.000 | 0.055
결제방법 | 0.055 | 1.000

 | 결제금액 | 결제방법
결제금액 | 1.000 | 0.024
결제방법 | 0.024 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

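(index) | 배출폐기물 (discarded item) | 배출규격 (item specification) | 수량 (quantity) | 결제금액 (payment amount) | 결제방법 (payment method) | 등록일 (registration date) | 수거요청일 (requested collection date)
12752 | 자전거 | 아동용 3발 | 1 | 2000 | 카드결제 | 2023-08-16 9:48 | 2023-08-17
8718 | 카펫, 매트 | 한 묶음 | 1 | 4000 | 카드결제 | 2023-05-16 10:11 | 2023-05-17
17378 | 변기 | 모든 규격 | 1 | 3000 | 카드결제 | 2023-11-22 21:36 | 2023-11-23
1039 | 침대목재, 매트리스 | 1인용(매트리스+목재) | 2 | 20000 | 카드결제 | 2022-09-30 13:15 | 2022-10-04
14841 | 기타03 | 각종 품목 | 1 | 3000 | 카드결제 | 2023-09-26 8:57 | 2023-09-27
20191 | 장식장 | 높이 1m 미만 | 1 | 4000 | 카드결제 | 2024-01-26 22:17 | 2024-01-27
9884 | 가방 | 여행용 가방(캐리어) | 1 | 2000 | 카드결제 | 2023-06-16 12:31 | 2023-06-17
8468 | 선풍기 | 모든 규격 | 2 | 4000 | 카드결제 | 2023-05-09 13:50 | 2023-05-10
13039 | 기타02,기타02,기타02,기타02, | 각종 품목 각종 품목 각종 품목 각종 품목 | 1,1,1,1, | 8000 | 카드결제 | 2023-08-21 14:04 | 2023-08-22
3065 | 유모차, 보행기 | 모든 규격 | 1 | 2000 | 카드결제 | 2022-12-08 23:59 | 2022-12-10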
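
(index) | 배출폐기물 (discarded item) | 배출규격 (item specification) | 수량 (quantity) | 결제금액 (payment amount) | 결제방법 (payment method) | 등록일 (registration date) | 수거요청일 (requested collection date)
4753 | 침대목재, 매트리스 | 2인용 매트리스 | 1 | 8000 | 카드결제 | 2023-02-01 17:33 | 2023-02-02
2831 | 소파,장식장 | 3인용 이상 높이 1m 이상 | 1,1 | 16000 | 카드결제 | 2022-12-01 7:12 | 2022-12-02
1660 | 장롱,문짝,의자 | 폭 90cm 이상 1짝 나무문 목재 철재의자 | 1,2,1 | 25000 | 계좌이체 | 2022-10-20 15:22 | 2022-10-22
3162 | 유모차, 보행기 | 모든 규격 | 1 | 2000 | 카드결제 | 2022-12-12 19:12 | 2022-12-14
6434 | 싱크찬장,캐비닛 | 2칸 3단 이하 | 1,1 | 5000 | 카드결제 | 2023-03-13 8:48 | 2023-03-14
18248 | 거울,캐비닛 | 1m 이상 3단 이하 | 1,1 | 6000 | 카드결제 | 2023-12-16 22:02 | 2023-12-18
13979 | 전기장판 | 모든 규격 | 1 | 4000 | 카드결제 | 2023-09-08 13:30 | 2023-09-09
18414 | 소파,기타01,식탁 | 3인용 이상 각종 품목 6인용 미만(대리석) | 1,1,1 | 19000 | 카드결제 | 2023-12-20 14:17 | 2023-12-26
11381 | 장롱 | 폭 90cm 미만 1짝 | 1 | 12000 | 카드결제 | 2023-07-19 22:54 | 2023-07-20
12267 | 의자 | 사무용의자 | 2 | 6000 | 카드결제 | 2023-08-05 17:17 | 2023-08-07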

Duplicate rows

Most frequently occurring

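(index) | 배출폐기물 (discarded item) | 배출규격 (item specification) | 수량 (quantity) | 결제금액 (payment amount) | 결제방법 (payment method) | 등록일 (registration date) | 수거요청일 (requested collection date) | # duplicates
0 | 카시트 | 모든 규격 | 1 | 2000 | 카드결제 | 2022-10-26 14:55 | 2022-10-27 | 2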
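
A row counts as a duplicate when every column matches another row exactly. A minimal sketch of that check using only the standard library; the helper name `duplicate_rows` is hypothetical, and the example rows are hand-typed from this report's tables, so the split between quantity and amount is an assumption:

```python
from collections import Counter

def duplicate_rows(rows):
    """Return {row: occurrences} for rows that appear more than once."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}

rows = [
    ("카시트", "모든 규격", "1", 2000, "카드결제", "2022-10-26 14:55", "2022-10-27"),
    ("카시트", "모든 규격", "1", 2000, "카드결제", "2022-10-26 14:55", "2022-10-27"),
    ("변기", "모든 규격", "1", 3000, "카드결제", "2023-11-22 21:36", "2023-11-23"),
]
dupes = duplicate_rows(rows)
for row, n in dupes.items():
    print(n, row)
```

Because 등록일 is recorded to the minute, two identical rows most likely reflect the same request submitted twice rather than two separate disposals, though the report alone cannot confirm that.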