Overview

Dataset statistics

Number of variables: 5
Number of observations: 290
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 32
Duplicate rows (%): 11.0%
Total size in memory: 11.7 KiB
Average record size in memory: 41.5 B

Variable types

Text: 2
Categorical: 2
Numeric: 1

Dataset

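Description: Dong-gu, Daegu Metropolitan City: Business Waste Generator Registration Status, 2020-06-03 (대구광역시 동구_사업장폐기물 배출자 신고현황_20200603)
Author: Dong-gu, Daegu Metropolitan City (대구광역시 동구)
URL: http://data.daegu.go.kr/open/data/dataView.do?dataSetId=15060399&dataSetDetailId=150603991b90d5ca55a57&provdMethod=FILE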

Alerts

Dataset has 32 (11.0%) duplicate rows [Duplicates]
생활계구분 is highly overall correlated with 폐기물 종류 [High correlation]
폐기물 종류 is highly overall correlated with 생활계구분 [High correlation]

Reproduction

Analysis started: 2023-12-10 19:45:40.655155
Analysis finished: 2023-12-10 19:45:41.454940
Duration: 0.8 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
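
The reported duration can be cross-checked from the two timestamps. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the report.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block of this report.
started = datetime.fromisoformat("2023-12-10 19:45:40.655155")
finished = datetime.fromisoformat("2023-12-10 19:45:41.454940")

# Elapsed wall-clock time in seconds, then rounded to one decimal as in the report.
elapsed = (finished - started).total_seconds()
print(f"{elapsed:.6f} s, rounded: {elapsed:.1f} s")
```

The difference rounds to the 0.8 seconds shown in the Duration field.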

Variables

상호 (business name)
Text

Distinct: 69
Distinct (%): 23.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 25
Median length: 15
Mean length: 10.306897
Min length: 4

Characters and Unicode

Total characters: 2989
Distinct characters: 200
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
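
As a rough illustration of how such character properties can be read per code point, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, and this is not necessarily how the profiling tool computes its tables. The module returns two-letter codes (`Lo` = Other Letter, `Ps` = Open Punctuation, `Pe` = Close Punctuation, `Nd` = Decimal Number); script and block lookups are not available in the standard library and are left out.

```python
import unicodedata
from collections import Counter

def category_counts(text: str) -> Counter:
    """Count the Unicode general category (e.g. 'Lo', 'Ps') of each character."""
    return Counter(unicodedata.category(ch) for ch in text)

# Two business names taken from the Sample rows of this variable.
print(category_counts("(주)노비아갈라"))
print(category_counts("공군제11전투비행단"))
```

Summing such counters over all 290 values would give a table of the same shape as "Most occurring categories".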

Unique

Unique: 18
Unique (%): 6.2%

Sample

1st row: 강남병원
2nd row: 강남병원
3rd row: 공군제11전투비행단
4th row: (주)노비아갈라
5th row: (주)노비아갈라

Value | Count | Frequency (%)
롯데쇼핑(주)롯데마트대구율하점 | 19 | 5.4%
주)신세계동대구복합환승센터 | 17 | 4.9%
주)이마트 | 16 | 4.6%
반야월점 | 16 | 4.6%
대한민국상이군경회 | 13 | 3.7%
대구사업소 | 13 | 3.7%
홈플러스(주 | 10 | 2.9%
공군군수사령부 | 8 | 2.3%
대구공업고등학교 | 8 | 2.3%
이시아폴리스점 | 7 | 2.0%
Other values (68) | 223 | 63.7%
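
The tokens in this table (for example `주)이마트` and `홈플러스(주`) look like the result of splitting each value on whitespace and then stripping punctuation from the ends of each token. The sketch below reproduces that behaviour under that assumption; `word_counts` is a hypothetical helper, and the input list is illustrative rather than the full column.

```python
import string
from collections import Counter

def word_counts(values):
    """Split each value on whitespace and strip ASCII punctuation from token edges."""
    counts = Counter()
    for value in values:
        for token in value.split():
            token = token.strip(string.punctuation)
            if token:  # skip tokens that consisted only of punctuation
                counts[token] += 1
    return counts

# Illustrative inputs; punctuation inside a token is kept, only the edges are stripped.
names = ["(주)이마트 반야월점", "(주)이마트 반야월점", "홈플러스(주)"]
counts = word_counts(names)
print(counts.most_common())
```

Under this rule the leading `(` of `(주)이마트` and the trailing `)` of `홈플러스(주)` are dropped, which matches the unbalanced parentheses seen in the table.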

Most occurring characters

ValueCountFrequency (%)
( 165
 
5.5%
) 165
 
5.5%
140
 
4.7%
134
 
4.5%
100
 
3.3%
64
 
2.1%
62
 
2.1%
60
 
2.0%
56
 
1.9%
56
 
1.9%
Other values (190) 1987
66.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2519 | 84.3%
Open Punctuation | 165 | 5.5%
Close Punctuation | 165 | 5.5%
Space Separator | 60 | 2.0%
Lowercase Letter | 49 | 1.6%
Uppercase Letter | 16 | 0.5%
Other Punctuation | 8 | 0.3%
Decimal Number | 7 | 0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
140
 
5.6%
134
 
5.3%
100
 
4.0%
64
 
2.5%
62
 
2.5%
56
 
2.2%
56
 
2.2%
52
 
2.1%
46
 
1.8%
46
 
1.8%
Other values (176) 1763
70.0%
Lowercase Letter
Value | Count | Frequency (%)
t | 14 | 28.6%
l | 14 | 28.6%
a | 7 | 14.3%
o | 7 | 14.3%
e | 7 | 14.3%

Uppercase Letter
Value | Count | Frequency (%)
M | 7 | 43.8%
L | 7 | 43.8%
G | 1 | 6.2%
Y | 1 | 6.2%

Open Punctuation
Value | Count | Frequency (%)
( | 165 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 165 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 60 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 8 | 100.0%

Decimal Number
Value | Count | Frequency (%)
1 | 7 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2519 | 84.3%
Common | 405 | 13.5%
Latin | 65 | 2.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
140
 
5.6%
134
 
5.3%
100
 
4.0%
64
 
2.5%
62
 
2.5%
56
 
2.2%
56
 
2.2%
52
 
2.1%
46
 
1.8%
46
 
1.8%
Other values (176) 1763
70.0%
Latin
Value | Count | Frequency (%)
t | 14 | 21.5%
l | 14 | 21.5%
M | 7 | 10.8%
a | 7 | 10.8%
L | 7 | 10.8%
o | 7 | 10.8%
e | 7 | 10.8%
G | 1 | 1.5%
Y | 1 | 1.5%

Common
Value | Count | Frequency (%)
( | 165 | 40.7%
) | 165 | 40.7%
(space) | 60 | 14.8%
. | 8 | 2.0%
1 | 7 | 1.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2519 | 84.3%
ASCII | 470 | 15.7%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
( | 165 | 35.1%
) | 165 | 35.1%
(space) | 60 | 12.8%
t | 14 | 3.0%
l | 14 | 3.0%
. | 8 | 1.7%
1 | 7 | 1.5%
M | 7 | 1.5%
a | 7 | 1.5%
L | 7 | 1.5%
Other values (4) | 16 | 3.4%

Hangul
ValueCountFrequency (%)
140
 
5.6%
134
 
5.3%
100
 
4.0%
64
 
2.5%
62
 
2.5%
56
 
2.2%
56
 
2.2%
52
 
2.1%
46
 
1.8%
46
 
1.8%
Other values (176) 1763
70.0%

생활계구분 (household/facility classification)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

생활계 (household-type): 221
배출시설계 (discharge-facility-type): 69

Length

Max length: 5
Median length: 3
Mean length: 3.4758621
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 생활계
2nd row: 생활계
3rd row: 생활계
4th row: 생활계
5th row: 생활계

Common Values

Value | Count | Frequency (%)
생활계 | 221 | 76.2%
배출시설계 | 69 | 23.8%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

폐기물 종류 (waste type)
Categorical

HIGH CORRELATION

Distinct: 37
Distinct (%): 12.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

폐합성수지류: 84
음식물류폐기물: 31
폐합성고무류: 21
폐가구류, 폐도장목, 폐목재포장재, 폐전선드럼: 19
그 밖의 폐합성고분자화합물: 18
Other values (32): 117

Length

Max length: 25
Median length: 22
Mean length: 7.9931034
Min length: 3

Unique

Unique: 11
Unique (%): 3.8%

Sample

1st row: 폐합성수지류
2nd row: 음식물류폐기물
3rd row: 음식물류폐기물
4th row: 폐합성수지류
5th row: 음식물류 폐기물

Common Values

Value | Count | Frequency (%)
폐합성수지류 | 84 | 29.0%
음식물류폐기물 | 31 | 10.7%
폐합성고무류 | 21 | 7.2%
폐가구류, 폐도장목, 폐목재포장재, 폐전선드럼 | 19 | 6.6%
그 밖의 폐합성고분자화합물 | 18 | 6.2%
폐합성섬유 | 16 | 5.5%
폐종이류 | 15 | 5.2%
그 밖의 폐목재류 | 12 | 4.1%
음식물류 폐기물 | 7 | 2.4%
폐식용유 | 5 | 1.7%
Other values (27) | 62 | 21.4%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
폐합성수지류 | 84 | 18.5%
그 | 47 | 10.4%
밖의 | 47 | 10.4%
음식물류폐기물 | 31 | 6.8%
폐합성고무류 | 21 | 4.6%
폐가구류 | 19 | 4.2%
폐도장목 | 19 | 4.2%
폐목재포장재 | 19 | 4.2%
폐전선드럼 | 19 | 4.2%
폐합성고분자화합물 | 18 | 4.0%
Other values (34) | 130 | 28.6%

배출량(톤) (discharge amount, tonnes)
Real number (ℝ)

Distinct: 63
Distinct (%): 21.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 124.81862
Minimum: 0.5
Maximum: 2500
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 KiB

Quantile statistics

Minimum: 0.5
5-th percentile: 1
Q1: 12
median: 48
Q3: 100
95-th percentile: 425.7
Maximum: 2500
Range: 2499.5
Interquartile range (IQR): 88
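
The range and IQR follow directly from the quantiles. The sketch below recomputes them and, as an additional assumption not taken from the report, derives Tukey's 1.5 × IQR fences, a common rule of thumb for flagging extreme values.

```python
# Quantile values copied from the Quantile statistics block of this variable.
minimum, q1, median, q3, maximum = 0.5, 12.0, 48.0, 100.0, 2500.0

value_range = maximum - minimum   # spread between the two extremes
iqr = q3 - q1                     # spread of the middle 50% of rows

# Tukey's 1.5 * IQR fences (an assumed outlier rule, not defined by the report).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
print(value_range, iqr, lower_fence, upper_fence)
```

With these inputs the upper fence is 232, so the 95-th percentile (425.7) and the maximum (2500) both lie far outside it, consistent with the strong right skew reported for this variable.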

Descriptive statistics

Standard deviation: 297.68529
Coefficient of variation (CV): 2.384943
Kurtosis: 35.376278
Mean: 124.81862
Median Absolute Deviation (MAD): 36
Skewness: 5.522655
Sum: 36197.4
Variance: 88616.533
Monotonicity: Not monotonic
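
Several of these figures are tied together by simple identities: mean = sum / n, CV = standard deviation / mean, and variance = standard deviation squared. A minimal consistency check, using only the values printed in this report (variable names are illustrative):

```python
import math

n = 290            # Number of observations
total = 36197.4    # Sum
mean = 124.81862   # Mean
std = 297.68529    # Standard deviation

# Each derived quantity is recomputed from the reported inputs.
recomputed_mean = total / n
cv = std / mean
variance = std ** 2
print(round(recomputed_mean, 5), round(cv, 6), round(variance, 3))
```

The recomputed values agree with the reported mean, CV and variance up to rounding of the printed inputs.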
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
60.0 | 32 | 11.0%
12.0 | 27 | 9.3%
1.0 | 24 | 8.3%
36.0 | 22 | 7.6%
48.0 | 10 | 3.4%
24.0 | 10 | 3.4%
72.0 | 10 | 3.4%
150.0 | 10 | 3.4%
300.0 | 9 | 3.1%
54.0 | 9 | 3.1%
Other values (53) | 127 | 43.8%

Minimum 10 values

Value | Count | Frequency (%)
0.5 | 1 | 0.3%
0.6 | 4 | 1.4%
0.96 | 1 | 0.3%
1.0 | 24 | 8.3%
1.2 | 1 | 0.3%
2.0 | 3 | 1.0%
2.5 | 2 | 0.7%
3.0 | 5 | 1.7%
3.6 | 1 | 0.3%
3.8 | 1 | 0.3%

Maximum 10 values

Value | Count | Frequency (%)
2500.0 | 1 | 0.3%
2400.0 | 1 | 0.3%
2000.0 | 1 | 0.3%
1920.0 | 1 | 0.3%
1300.0 | 1 | 0.3%
1000.0 | 1 | 0.3%
960.0 | 2 | 0.7%
840.0 | 1 | 0.3%
600.0 | 3 | 1.0%
540.0 | 1 | 0.3%

사업장도로명주소 (workplace street address)
Text

Distinct: 68
Distinct (%): 23.4%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 36
Median length: 34.5
Mean length: 24.203448
Min length: 17

Characters and Unicode

Total characters: 7019
Distinct characters: 110
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 18
Unique (%): 6.2%

Sample

1st row: 대구광역시 동구 동촌로 207 (방촌동)
2nd row: 대구광역시 동구 동촌로 207 (방촌동)
3rd row: 대구광역시 동구 아양로 352, 201호 (입석동)
4th row: 대구광역시 동구 동촌로 91 (검사동)
5th row: 대구광역시 동구 동촌로 91 (검사동)

Value | Count | Frequency (%)
대구광역시 | 290 | 19.5%
동구 | 290 | 19.5%
안심로 | 39 | 2.6%
동촌로 | 36 | 2.4%
각산동 | 34 | 2.3%
신서동 | 33 | 2.2%
신암동 | 29 | 1.9%
방촌동 | 28 | 1.9%
신천동 | 27 | 1.8%
아양로 | 20 | 1.3%
Other values (121) | 664 | 44.6%

Most occurring characters

ValueCountFrequency (%)
1200
17.1%
693
 
9.9%
609
 
8.7%
333
 
4.7%
307
 
4.4%
293
 
4.2%
290
 
4.1%
288
 
4.1%
) 276
 
3.9%
( 276
 
3.9%
Other values (100) 2454
35.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4161 | 59.3%
Space Separator | 1200 | 17.1%
Decimal Number | 1005 | 14.3%
Close Punctuation | 276 | 3.9%
Open Punctuation | 276 | 3.9%
Dash Punctuation | 55 | 0.8%
Other Punctuation | 46 | 0.7%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
693
16.7%
609
14.6%
333
 
8.0%
307
 
7.4%
293
 
7.0%
290
 
7.0%
288
 
6.9%
94
 
2.3%
78
 
1.9%
71
 
1.7%
Other values (84) 1105
26.6%
Decimal Number
Value | Count | Frequency (%)
1 | 193 | 19.2%
2 | 118 | 11.7%
3 | 102 | 10.1%
9 | 101 | 10.0%
5 | 93 | 9.3%
0 | 86 | 8.6%
8 | 85 | 8.5%
7 | 83 | 8.3%
4 | 80 | 8.0%
6 | 64 | 6.4%

Other Punctuation
Value | Count | Frequency (%)
, | 40 | 87.0%
. | 6 | 13.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1200 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 276 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 276 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 55 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4161 | 59.3%
Common | 2858 | 40.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
693
16.7%
609
14.6%
333
 
8.0%
307
 
7.4%
293
 
7.0%
290
 
7.0%
288
 
6.9%
94
 
2.3%
78
 
1.9%
71
 
1.7%
Other values (84) 1105
26.6%
Common
Value | Count | Frequency (%)
(space) | 1200 | 42.0%
) | 276 | 9.7%
( | 276 | 9.7%
1 | 193 | 6.8%
2 | 118 | 4.1%
3 | 102 | 3.6%
9 | 101 | 3.5%
5 | 93 | 3.3%
0 | 86 | 3.0%
8 | 85 | 3.0%
Other values (6) | 328 | 11.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 4161 | 59.3%
ASCII | 2858 | 40.7%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 1200 | 42.0%
) | 276 | 9.7%
( | 276 | 9.7%
1 | 193 | 6.8%
2 | 118 | 4.1%
3 | 102 | 3.6%
9 | 101 | 3.5%
5 | 93 | 3.3%
0 | 86 | 3.0%
8 | 85 | 3.0%
Other values (6) | 328 | 11.5%

Hangul
ValueCountFrequency (%)
693
16.7%
609
14.6%
333
 
8.0%
307
 
7.4%
293
 
7.0%
290
 
7.0%
288
 
6.9%
94
 
2.3%
78
 
1.9%
71
 
1.7%
Other values (84) 1105
26.6%

Interactions


Correlations

 | 상호 | 생활계구분 | 폐기물 종류 | 배출량(톤) | 사업장도로명주소
상호 | 1.000 | 0.997 | 0.744 | 0.864 | 1.000
생활계구분 | 0.997 | 1.000 | 0.836 | 0.000 | 1.000
폐기물 종류 | 0.744 | 0.836 | 1.000 | 0.000 | 0.760
배출량(톤) | 0.864 | 0.000 | 0.000 | 1.000 | 0.864
사업장도로명주소 | 1.000 | 1.000 | 0.760 | 0.864 | 1.000

 | 폐기물 종류 | 생활계구분
폐기물 종류 | 1.000 | 0.693
생활계구분 | 0.693 | 1.000

 | 배출량(톤) | 생활계구분 | 폐기물 종류
배출량(톤) | 1.000 | 0.000 | 0.000
생활계구분 | 0.000 | 1.000 | 0.693
폐기물 종류 | 0.000 | 0.693 | 1.000
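
The report does not state which association measure produced each matrix. For two categorical columns such as 생활계구분 and 폐기물 종류, one widely used measure is Cramér's V; the sketch below is an uncorrected, standard-library implementation given purely as an illustration. The function name `cramers_v` and the toy inputs are hypothetical, and the result is not expected to reproduce the figures above.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed count per (x, y) cell
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    chi2 = 0.0
    for x, r in row_totals.items():
        for y, c in col_totals.items():
            expected = r * c / n   # cell count expected under independence
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Toy data: y is fully determined by x in the first case, unrelated in the second.
print(cramers_v(["a", "a", "b", "b"], ["p", "p", "q", "q"]))
print(cramers_v(["a", "a", "b", "b"], ["p", "q", "p", "q"]))
```

Values near 1 indicate that one column largely determines the other, which is the situation the High correlation alert describes for these two variables.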

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 상호 | 생활계구분 | 폐기물 종류 | 배출량(톤) | 사업장도로명주소
0 | 강남병원 | 생활계 | 폐합성수지류 | 60.0 | 대구광역시 동구 동촌로 207 (방촌동)
1 | 강남병원 | 생활계 | 음식물류폐기물 | 48.0 | 대구광역시 동구 동촌로 207 (방촌동)
2 | 공군제11전투비행단 | 생활계 | 음식물류폐기물 | 540.0 | 대구광역시 동구 아양로 352, 201호 (입석동)
3 | (주)노비아갈라 | 생활계 | 폐합성수지류 | 70.0 | 대구광역시 동구 동촌로 91 (검사동)
4 | (주)노비아갈라 | 생활계 | 음식물류 폐기물 | 300.0 | 대구광역시 동구 동촌로 91 (검사동)
5 | (주)노비아갈라 | 생활계 | 폐합성수지류 | 80.0 | 대구광역시 동구 동촌로 91 (검사동)
6 | 강동중학교 | 생활계 | 음식물류 폐기물 | 24.0 | 대구광역시 동구 동호로 112 (신서동, 강동중학교)
7 | 강동중학교 | 생활계 | 폐합성수지류 | 60.0 | 대구광역시 동구 동호로 112 (신서동, 강동중학교)
8 | 강동중학교 | 생활계 | 폐합성수지류 | 60.0 | 대구광역시 동구 동호로 112 (신서동, 강동중학교)
9 | 강동중학교 | 생활계 | 그 밖의 폐목재류 | 36.0 | 대구광역시 동구 동호로 112 (신서동, 강동중학교)

Last rows

Index | 상호 | 생활계구분 | 폐기물 종류 | 배출량(톤) | 사업장도로명주소
280 | 경북재향군인회각산동지점 | 배출시설계 | 그 밖의 폐합성고분자화합물 | 150.0 | 대구광역시 동구 안심로65길 14 (각산동)
281 | 경북재향군인회각산동지점 | 배출시설계 | 폐합성고무류 | 75.0 | 대구광역시 동구 안심로65길 14 (각산동)
282 | 신성상사 | 배출시설계 | 그 밖의 폐합성고분자화합물 | 180.0 | 대구광역시 동구 안심로65길 14 (각산동)
283 | 신성상사 | 배출시설계 | 그 밖의 폐합성고분자화합물 | 150.0 | 대구광역시 동구 안심로65길 14 (각산동)
284 | 신성상사 | 배출시설계 | 폐합성고무류 | 30.0 | 대구광역시 동구 안심로65길 14 (각산동)
285 | 신성상사 | 배출시설계 | 그 밖의 폐합성고분자화합물 | 150.0 | 대구광역시 동구 안심로65길 14 (각산동)
286 | 대구그린파워(주) | 배출시설계 | 폐수처리오니 | 15.0 | 대구광역시 동구 매여로 14 (율암동)
287 | 대구그린파워(주) | 배출시설계 | 폐합성수지류 | 30.0 | 대구광역시 동구 매여로 14 (율암동)
288 | 대구그린파워(주) | 배출시설계 | 그 밖의 폐목재류 | 15.0 | 대구광역시 동구 매여로 14 (율암동)
289 | 대경맨홀 | 배출시설계 | 폐콘크리트 | 300.0 | 대구광역시 동구 둔산로40길 21 (방촌동)

Duplicate rows

Most frequently occurring

Index | 상호 | 생활계구분 | 폐기물 종류 | 배출량(톤) | 사업장도로명주소 | # duplicates
8 | (주)이마트 반야월점 | 생활계 | 폐합성수지류 | 204.0 | 대구광역시 동구 안심로 389-2 (신서동) | 3
17 | 대한민국상이군경회 대구사업소 | 배출시설계 | 그 밖의 폐합성고분자화합물 | 150.0 | 대구광역시 동구 각산동 298-17 | 3
18 | 대한민국상이군경회 대구사업소 | 배출시설계 | 폐흡착제 | 36.0 | 대구광역시 동구 각산동 298-17 | 3
26 | 롯데쇼핑(주)롯데마트대구율하점 | 생활계 | 폐합성수지류 | 360.0 | 대구광역시 동구 안심로 80 (율하동) | 3
0 | (의)열경의료재단 | 생활계 | 폐합성수지류 | 54.0 | 대구광역시 동구 화랑로 81 (효목동) | 2
1 | (주)스타하우스 | 생활계 | 폐합성수지류 | 60.0 | 대구광역시 동구 동촌로 316 (방촌동) | 2
2 | (주)신세계동대구복합환승센터 | 생활계 | 폐합성고무류 | 36.0 | 대구광역시 동구 동부로 149 (신천동, 동대구역복합환승센터) | 2
3 | (주)신세계동대구복합환승센터 | 생활계 | 폐합성섬유 | 36.0 | 대구광역시 동구 동부로 149 (신천동, 동대구역복합환승센터) | 2
4 | (주)유창알앤씨 | 생활계 | 폐합성수지류 | 360.0 | 대구광역시 동구 공항로31길 26 (불로동) | 2
5 | (주)이마트 반야월점 | 생활계 | 폐가구류, 폐도장목, 폐목재포장재, 폐전선드럼 | 12.0 | 대구광역시 동구 안심로 389-2 (신서동) | 2
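
A duplicate row here means a row whose values match another row in all five columns. The sketch below counts such rows with the standard library, using rows 6 to 9 of the Sample section as input. It reports two different counts (distinct repeated rows and surplus copies) because the report does not say which definition its "Duplicate rows" figure uses; the variable names are illustrative.

```python
from collections import Counter

# Rows 6-9 of the Sample section, as tuples of
# (상호, 생활계구분, 폐기물 종류, 배출량(톤), 사업장도로명주소).
address = "대구광역시 동구 동호로 112 (신서동, 강동중학교)"
rows = [
    ("강동중학교", "생활계", "음식물류 폐기물", 24.0, address),
    ("강동중학교", "생활계", "폐합성수지류", 60.0, address),
    ("강동중학교", "생활계", "폐합성수지류", 60.0, address),
    ("강동중학교", "생활계", "그 밖의 폐목재류", 36.0, address),
]

counts = Counter(rows)
# Distinct rows that occur more than once, with their occurrence counts.
duplicate_groups = {row: c for row, c in counts.items() if c > 1}
# Copies beyond the first occurrence of each repeated row.
surplus_rows = sum(c - 1 for c in duplicate_groups.values())
print(len(duplicate_groups), surplus_rows)
```

In this small excerpt, rows 7 and 8 are identical, giving one repeated row and one surplus copy. Whether such repeats are data-entry duplicates or separate filings for the same waste stream cannot be determined from the report alone.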