Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1388
Duplicate rows (%): 13.9%
Total size in memory: 644.5 KiB
Average record size in memory: 66.0 B
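
The two derived figures above can be checked by hand. The sketch below assumes the percentage is duplicate rows divided by observations and that the average record size is total memory divided by observations, with 1 KiB = 1024 bytes; the helper names are illustrative, not part of ydata-profiling.

```python
KIB = 1024  # bytes per kibibyte


def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage rounded to one decimal."""
    return round(100 * part / whole, 1)


def average_record_size(total_kib: float, rows: int) -> float:
    """Average bytes per row, rounded to one decimal."""
    return round(total_kib * KIB / rows, 1)


duplicate_pct = percent(1388, 10000)              # duplicate rows / observations
record_bytes = average_record_size(644.5, 10000)  # total size / observations
print(duplicate_pct, record_bytes)
```

The same `percent` helper reproduces the 7.8% zeros figure reported for 단가 (783 of 10000).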

Variable types

Categorical: 3
DateTime: 1
Text: 2
Numeric: 1

Dataset

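Description: Disposal reports filed through the online bulky-waste disposal system of Ansan-si, Gyeonggi-do. The dataset lists the payment status (결제구분), receipt date (접수일자), item (품목), specification (규격), unit price (단가), quantity (수량), and data reference date (데이터기준일자), among other fields.
URL: https://www.data.go.kr/data/15042356/fileData.do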

Alerts

수량 has constant value "1" (Constant)
데이터기준일자 has constant value "2023-03-08" (Constant)
Dataset has 1388 (13.9%) duplicate rows (Duplicates)
단가 has 783 (7.8%) zeros (Zeros)

Reproduction

Analysis started: 2023-12-13 00:39:17.141130
Analysis finished: 2023-12-13 00:39:17.723680
Duration: 0.58 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

결제구분
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
결제 (paid): 7671
미결제 (unpaid): 2329

Length

Max length: 3
Median length: 2
Mean length: 2.2329
Min length: 2
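
The mean length follows directly from the two value counts. A minimal sketch, assuming length is counted in Unicode code points of the precomposed Hangul strings (the function name is illustrative):

```python
# Value counts taken from the 결제구분 section of this report.
value_counts = {"결제": 7671, "미결제": 2329}


def mean_length(counts: dict) -> float:
    """Average string length, weighting each value by its row count."""
    total_chars = sum(len(value) * n for value, n in counts.items())
    total_rows = sum(counts.values())
    return total_chars / total_rows


lengths = [len(value) for value in value_counts]
print(mean_length(value_counts), min(lengths), max(lengths))
```

With 7671 two-character values and 2329 three-character values, the total is 22329 characters over 10000 rows, which matches the reported 2.2329.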

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 결제
2nd row: 결제
3rd row: 결제
4th row: 결제
5th row: 미결제

Common Values

Value | Count | Frequency (%)
결제 | 7671 | 76.7%
미결제 | 2329 | 23.3%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: Common values (결제 7671, 76.7%; 미결제 2329, 23.3%)

접수일자
DateTime

Distinct: 432
Distinct (%): 4.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2022-01-01 00:00:00
Maximum: 2023-03-08 00:00:00

Figure: Histogram with fixed size bins (bins=50)
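
The DateTime statistics above report 432 distinct values between 2022-01-01 and 2023-03-08. The sketch below counts the calendar days in that range; if the values carry no time component other than 00:00:00 (an assumption based on the minimum and maximum shown), a match would suggest that every day in the range has at least one record.

```python
from datetime import date


def inclusive_days(start: date, end: date) -> int:
    """Number of calendar days from start to end, counting both endpoints."""
    return (end - start).days + 1


span = inclusive_days(date(2022, 1, 1), date(2023, 3, 8))
distinct_reported = 432  # "Distinct" value from the report
print(span, span == distinct_reported)
```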

품목
Text

Distinct: 148
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 8
Mean length: 3.2092
Min length: 1

Characters and Unicode

Total characters: 32092
Distinct characters: 202
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
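
As a rough illustration of how such per-category counts can be obtained, the sketch below uses Python's standard `unicodedata` module. It is not the profiler's own implementation. The first five values are the sample rows of this variable and the sixth is a 품목 value from the duplicate-rows table; the middle dot is assumed to be U+00B7.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this section.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Po": "Other Punctuation",
    "Zs": "Space Separator",
}


def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


values = ["서랍장", "의자", "나무묶음", "나무묶음", "거울", "샌드위치\u00b7전기판넬"]
counts = category_counts(values)
print(counts)
```

Hangul syllables fall under "Other Letter" because Hangul has no upper or lower case, which is why that category dominates the tables that follow.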

Unique

Unique: 6
Unique (%): 0.1%

Sample

1st row: 서랍장
2nd row: 의자
3rd row: 나무묶음
4th row: 나무묶음
5th row: 거울
Value | English gloss | Count | Frequency (%)
의자 | chair | 1478 | 14.7%
서랍장 | chest of drawers | 641 | 6.4%
소파 | sofa | 614 | 6.1%
매트리스 | mattress | 459 | 4.6%
책상 | desk | 397 | 3.9%
나무묶음 | bundle of wood | 345 | 3.4%
소화기 | fire extinguisher | 331 | 3.3%
장롱 | wardrobe | 303 | 3.0%
책장 | bookcase | 255 | 2.5%
식탁 | dining table | 254 | 2.5%
Other values (142) | | 4984 | 49.5%

Most occurring characters

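Value | Count | Frequency (%)
(not preserved) | 2151 | 6.7%
(not preserved) | 1722 | 5.4%
(not preserved) | 1490 | 4.6%
(not preserved) | 1215 | 3.8%
(not preserved) | 1086 | 3.4%
(not preserved) | 900 | 2.8%
(not preserved) | 888 | 2.8%
(not preserved) | 788 | 2.5%
(not preserved) | 751 | 2.3%
(not preserved) | 749 | 2.3%
Other values (192) | 20352 | 63.4%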

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 31036 | 96.7%
Other Punctuation | 621 | 1.9%
Uppercase Letter | 374 | 1.2%
Space Separator | 61 | 0.2%

Most frequent character per category

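Other Letter
Value | Count | Frequency (%)
(not preserved) | 2151 | 6.9%
(not preserved) | 1722 | 5.5%
(not preserved) | 1490 | 4.8%
(not preserved) | 1215 | 3.9%
(not preserved) | 1086 | 3.5%
(not preserved) | 900 | 2.9%
(not preserved) | 888 | 2.9%
(not preserved) | 788 | 2.5%
(not preserved) | 751 | 2.4%
(not preserved) | 749 | 2.4%
Other values (186) | 19296 | 62.2%

Uppercase Letter
Value | Count | Frequency (%)
V | 185 | 49.5%
T | 181 | 48.4%
D | 8 | 2.1%

Other Punctuation
Value | Count | Frequency (%)
· | 348 | 56.0%
, | 273 | 44.0%

Space Separator
Value | Count | Frequency (%)
(space) | 61 | 100.0%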

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 31036 | 96.7%
Common | 682 | 2.1%
Latin | 374 | 1.2%

Most frequent character per script

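Hangul
Value | Count | Frequency (%)
(not preserved) | 2151 | 6.9%
(not preserved) | 1722 | 5.5%
(not preserved) | 1490 | 4.8%
(not preserved) | 1215 | 3.9%
(not preserved) | 1086 | 3.5%
(not preserved) | 900 | 2.9%
(not preserved) | 888 | 2.9%
(not preserved) | 788 | 2.5%
(not preserved) | 751 | 2.4%
(not preserved) | 749 | 2.4%
Other values (186) | 19296 | 62.2%

Common
Value | Count | Frequency (%)
· | 348 | 51.0%
, | 273 | 40.0%
(space) | 61 | 8.9%

Latin
Value | Count | Frequency (%)
V | 185 | 49.5%
T | 181 | 48.4%
D | 8 | 2.1%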

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 31036 | 96.7%
ASCII | 708 | 2.2%
None | 348 | 1.1%

Most frequent character per block

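Hangul
Value | Count | Frequency (%)
(not preserved) | 2151 | 6.9%
(not preserved) | 1722 | 5.5%
(not preserved) | 1490 | 4.8%
(not preserved) | 1215 | 3.9%
(not preserved) | 1086 | 3.5%
(not preserved) | 900 | 2.9%
(not preserved) | 888 | 2.9%
(not preserved) | 788 | 2.5%
(not preserved) | 751 | 2.4%
(not preserved) | 749 | 2.4%
Other values (186) | 19296 | 62.2%

None
Value | Count | Frequency (%)
· | 348 | 100.0%

ASCII
Value | Count | Frequency (%)
, | 273 | 38.6%
V | 185 | 26.1%
T | 181 | 25.6%
(space) | 61 | 8.6%
D | 8 | 1.1%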

규격
Text

Distinct: 115
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 16
Median length: 14
Mean length: 4.652
Min length: 1

Characters and Unicode

Total characters: 46520
Distinct characters: 111
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 14
Unique (%): 0.1%

Sample

1st row: 4단 이하
2nd row: 1인용
3rd row: 10kg이내
4th row: 10kg이내
5th row: 110x120cm이상
Value | English gloss | Count | Frequency (%)
1인용 | for one person | 1619 | 13.1%
이상 | or more | 1189 | 9.6%
미만 | less than | 1025 | 8.3%
이하 | or less | 872 | 7.0%
높이 | height | 681 | 5.5%
1m | 1 m | 628 | 5.1%
4단 | 4 tiers | 595 | 4.8%
120cm | 120 cm | 528 | 4.3%
일반용 | general use | 390 | 3.1%
모든규격 | all sizes | 362 | 2.9%
Other values (88) | | 4498 | 36.3%

Most occurring characters

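Value | Count | Frequency (%)
(space) | 5811 | 12.5%
1 | 4726 | 10.2%
(not preserved) | 3239 | 7.0%
(not preserved) | 2997 | 6.4%
(not preserved) | 2700 | 5.8%
0 | 2274 | 4.9%
m | 2141 | 4.6%
(not preserved) | 1460 | 3.1%
2 | 1397 | 3.0%
(not preserved) | 1212 | 2.6%
Other values (101) | 18563 | 39.9%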

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 23230 | 49.9%
Decimal Number | 11350 | 24.4%
Space Separator | 5811 | 12.5%
Lowercase Letter | 5262 | 11.3%
Other Punctuation | 556 | 1.2%
Close Punctuation | 221 | 0.5%
Uppercase Letter | 86 | 0.2%
Other Symbol | 4 | < 0.1%

Most frequent character per category

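Other Letter
Value | Count | Frequency (%)
(not preserved) | 3239 | 13.9%
(not preserved) | 2997 | 12.9%
(not preserved) | 2700 | 11.6%
(not preserved) | 1460 | 6.3%
(not preserved) | 1212 | 5.2%
(not preserved) | 1212 | 5.2%
(not preserved) | 896 | 3.9%
(not preserved) | 872 | 3.8%
(not preserved) | 683 | 2.9%
(not preserved) | 495 | 2.1%
Other values (80) | 7464 | 32.1%

Decimal Number
Value | Count | Frequency (%)
1 | 4726 | 41.6%
0 | 2274 | 20.0%
2 | 1397 | 12.3%
3 | 988 | 8.7%
4 | 759 | 6.7%
5 | 584 | 5.1%
8 | 248 | 2.2%
9 | 199 | 1.8%
6 | 157 | 1.4%
7 | 18 | 0.2%

Lowercase Letter
Value | Count | Frequency (%)
m | 2141 | 40.7%
c | 1118 | 21.2%
k | 704 | 13.4%
g | 704 | 13.4%
x | 595 | 11.3%

Other Punctuation
Value | Count | Frequency (%)
. | 331 | 59.5%
, | 225 | 40.5%

Space Separator
Value | Count | Frequency (%)
(space) | 5811 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 221 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
L | 86 | 100.0%

Other Symbol
Value | Count | Frequency (%)
(not preserved) | 4 | 100.0%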

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 23230 | 49.9%
Common | 17942 | 38.6%
Latin | 5348 | 11.5%

Most frequent character per script

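Hangul
Value | Count | Frequency (%)
(not preserved) | 3239 | 13.9%
(not preserved) | 2997 | 12.9%
(not preserved) | 2700 | 11.6%
(not preserved) | 1460 | 6.3%
(not preserved) | 1212 | 5.2%
(not preserved) | 1212 | 5.2%
(not preserved) | 896 | 3.9%
(not preserved) | 872 | 3.8%
(not preserved) | 683 | 2.9%
(not preserved) | 495 | 2.1%
Other values (80) | 7464 | 32.1%

Common
Value | Count | Frequency (%)
(space) | 5811 | 32.4%
1 | 4726 | 26.3%
0 | 2274 | 12.7%
2 | 1397 | 7.8%
3 | 988 | 5.5%
4 | 759 | 4.2%
5 | 584 | 3.3%
. | 331 | 1.8%
8 | 248 | 1.4%
, | 225 | 1.3%
Other values (5) | 599 | 3.3%

Latin
Value | Count | Frequency (%)
m | 2141 | 40.0%
c | 1118 | 20.9%
k | 704 | 13.2%
g | 704 | 13.2%
x | 595 | 11.1%
L | 86 | 1.6%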

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 23286 | 50.1%
Hangul | 23230 | 49.9%
CJK Compat | 4 | < 0.1%

Most frequent character per block

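ASCII
Value | Count | Frequency (%)
(space) | 5811 | 25.0%
1 | 4726 | 20.3%
0 | 2274 | 9.8%
m | 2141 | 9.2%
2 | 1397 | 6.0%
c | 1118 | 4.8%
3 | 988 | 4.2%
4 | 759 | 3.3%
k | 704 | 3.0%
g | 704 | 3.0%
Other values (10) | 2664 | 11.4%

Hangul
Value | Count | Frequency (%)
(not preserved) | 3239 | 13.9%
(not preserved) | 2997 | 12.9%
(not preserved) | 2700 | 11.6%
(not preserved) | 1460 | 6.3%
(not preserved) | 1212 | 5.2%
(not preserved) | 1212 | 5.2%
(not preserved) | 896 | 3.9%
(not preserved) | 872 | 3.8%
(not preserved) | 683 | 2.9%
(not preserved) | 495 | 2.1%
Other values (80) | 7464 | 32.1%

CJK Compat
Value | Count | Frequency (%)
(not preserved) | 4 | 100.0%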

단가
Real number (ℝ)

ZEROS 

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3695.6
Minimum: 0
Maximum: 13000
Zeros: 783
Zeros (%): 7.8%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2000
Median: 2000
Q3: 5000
95-th percentile: 11000
Maximum: 13000
Range: 13000
Interquartile range (IQR): 3000

Descriptive statistics

Standard deviation: 2992.1958
Coefficient of variation (CV): 0.80966442
Kurtosis: 1.3798018
Mean: 3695.6
Median Absolute Deviation (MAD): 1000
Skewness: 1.3407719
Sum: 36956000
Variance: 8953236
Monotonicity: Not monotonic
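
Because 단가 takes only eight distinct values, most of these statistics can be recomputed from the value counts listed in this section. The sketch below uses Python's `statistics` module and assumes the report uses the sample (n - 1) variance, which is consistent with the figures shown; it is a cross-check, not the profiler's own code.

```python
import statistics

# Value counts for 단가 as listed in this report.
value_counts = {
    2000: 4194, 5000: 2113, 8000: 981, 3000: 835,
    0: 783, 1000: 492, 11000: 334, 13000: 268,
}

# Expand the counts back into one value per row (10000 rows in total).
values = [value for value, n in value_counts.items() for _ in range(n)]

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
std = statistics.stdev(values)
cv = std / mean                          # coefficient of variation
q1, median, q3 = statistics.quantiles(values, n=4)
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, round(variance), round(std, 4), round(cv, 4))
print(q1, median, q3, q3 - q1, mad)
```

Skewness and kurtosis are left out because the standard library has no direct function for them.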
Figure: Histogram with fixed size bins (bins=8)

Values sorted by count:
Value | Count | Frequency (%)
2000 | 4194 | 41.9%
5000 | 2113 | 21.1%
8000 | 981 | 9.8%
3000 | 835 | 8.3%
0 | 783 | 7.8%
1000 | 492 | 4.9%
11000 | 334 | 3.3%
13000 | 268 | 2.7%

Values sorted ascending:
Value | Count | Frequency (%)
0 | 783 | 7.8%
1000 | 492 | 4.9%
2000 | 4194 | 41.9%
3000 | 835 | 8.3%
5000 | 2113 | 21.1%
8000 | 981 | 9.8%
11000 | 334 | 3.3%
13000 | 268 | 2.7%

Values sorted descending:
Value | Count | Frequency (%)
13000 | 268 | 2.7%
11000 | 334 | 3.3%
8000 | 981 | 9.8%
5000 | 2113 | 21.1%
3000 | 835 | 8.3%
2000 | 4194 | 41.9%
1000 | 492 | 4.9%
0 | 783 | 7.8%

수량
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
1: 10000

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 10000 | 100.0%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: Common values (1: 10000, 100.0%)

데이터기준일자
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
2023-03-08: 10000

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-03-08
2nd row: 2023-03-08
3rd row: 2023-03-08
4th row: 2023-03-08
5th row: 2023-03-08

Common Values

Value | Count | Frequency (%)
2023-03-08 | 10000 | 100.0%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: Common values (2023-03-08: 10000, 100.0%)

Interactions


Correlations

 | 결제구분 | 단가
결제구분 | 1.000 | 0.082
단가 | 0.082 | 1.000

 | 단가 | 결제구분
단가 | 1.000 | 0.088
결제구분 | 0.088 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

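Index | 결제구분 | 접수일자 | 품목 | 규격 | 단가 | 수량 | 데이터기준일자
9379 | 결제 | 2023-01-08 | 서랍장 | 4단 이하 | 2000 | 1 | 2023-03-08
17171 | 결제 | 2022-11-06 | 의자 | 1인용 | 2000 | 1 | 2023-03-08
46703 | 결제 | 2022-01-01 | 나무묶음 | 10kg이내 | 1000 | 1 | 2023-03-08
7853 | 결제 | 2023-01-20 | 나무묶음 | 10kg이내 | 1000 | 1 | 2023-03-08
45045 | 미결제 | 2022-02-10 | 거울 | 110x120cm이상 | 5000 | 1 | 2023-03-08
15380 | 결제 | 2022-11-22 | 의자 | 1인용 | 2000 | 1 | 2023-03-08
43832 | 결제 | 2022-02-19 | 파렛트 | (none shown) | 5000 | 1 | 2023-03-08
38569 | 결제 | 2022-03-30 | 고무통 | 70x80cm 이하 | 2000 | 1 | 2023-03-08
45439 | 미결제 | 2022-02-05 | 매트리스 | (none shown) | 8000 | 1 | 2023-03-08
38473 | 결제 | 2022-03-31 | 책상 | 일반용 | 5000 | 1 | 2023-03-08

Index | 결제구분 | 접수일자 | 품목 | 규격 | 단가 | 수량 | 데이터기준일자
5833 | 미결제 | 2023-02-06 | 욕조유아욕조 | (split unclear) | 2000 | 1 | 2023-03-08
2270 | 미결제 | 2023-02-24 | 나무묶음 | 10kg이내 | 1000 | 1 | 2023-03-08
36086 | 미결제 | 2022-04-19 | 침대 | 1인용세트 | 11000 | 1 | 2023-03-08
24594 | 결제 | 2022-09-14 | 소파 | 3인용 | 5000 | 1 | 2023-03-08
17523 | 결제 | 2022-11-02 | 거울 | 110x120cm미만 | 2000 | 1 | 2023-03-08
7386 | 결제 | 2023-01-26 | 식탁 | 의자별도 | 3000 | 1 | 2023-03-08
17041 | 결제 | 2022-11-07 | 침대틀 | 1,2인용 | 8000 | 1 | 2023-03-08
32535 | 결제 | 2022-05-16 | 액자 | 높이 1m 미만 | 2000 | 1 | 2023-03-08
3979 | 결제 | 2023-02-15 | 컴퓨터모니터 | (split unclear) | 0 | 1 | 2023-03-08
38345 | 결제 | 2022-04-01 | 냉장고 | 300L 이상 | 8000 | 1 | 2023-03-08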

Duplicate rows

Most frequently occurring

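Index | 결제구분 | 접수일자 | 품목 | 규격 | 단가 | 수량 | 데이터기준일자 | # duplicates
1280 | 미결제 | 2022-12-27 | 소파 | 3인용 | 5000 | 1 | 2023-03-08 | 110
1207 | 미결제 | 2022-09-16 | 소화기 | 3.3kg 미만 | 2000 | 1 | 2023-03-08 | 95
1208 | 미결제 | 2022-09-16 | 소화기 | 3.3kg 이상 | 3000 | 1 | 2023-03-08 | 61
1243 | 미결제 | 2022-11-15 | 샌드위치·전기판넬 | 높이 2m 이하 | 5000 | 1 | 2023-03-08 | 33
73 | 결제 | 2022-02-23 | 의자 | 1인용 | 2000 | 1 | 2023-03-08 | 29
40 | 결제 | 2022-02-16 | 의자 | 1인용 | 2000 | 1 | 2023-03-08 | 27
1058 | 결제 | 2023-02-24 | 의자 | 1인용 | 2000 | 1 | 2023-03-08 | 27
555 | 결제 | 2022-09-16 | 소화기 | 3.3kg 미만 | 2000 | 1 | 2023-03-08 | 26
586 | 결제 | 2022-09-26 | 나무묶음 | 10kg이내 | 1000 | 1 | 2023-03-08 | 26
918 | 결제 | 2023-01-20 | 협탁보조책상 | (split unclear) | 2000 | 1 | 2023-03-08 | 25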
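
A table like this can be approximated by counting identical rows. The sketch below is a minimal stdlib illustration, assuming "# duplicates" means the number of times an identical row occurs; the row multiplicities in the example are made up for demonstration and are not taken from the dataset.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return (row, count) pairs for rows that occur more than once, most frequent first."""
    counts = Counter(rows)
    repeated = [(row, n) for row, n in counts.items() if n > 1]
    return sorted(repeated, key=lambda pair: pair[1], reverse=True)


# Hypothetical mini-dataset: each row is a tuple of the seven columns.
chair = ("결제", "2022-02-23", "의자", "1인용", 2000, 1, "2023-03-08")
sofa = ("미결제", "2022-12-27", "소파", "3인용", 5000, 1, "2023-03-08")
mirror = ("미결제", "2022-02-10", "거울", "110x120cm이상", 5000, 1, "2023-03-08")
rows = [chair, sofa, chair, sofa, sofa, mirror]

groups = duplicate_groups(rows)
for row, n in groups:
    print(n, row)
```

Rows must be hashable for `Counter`, which is why each record is stored as a tuple rather than a list.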