Overview

Dataset statistics

Number of variables: 3
Number of observations: 1291
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 251
Duplicate rows (%): 19.4%
Total size in memory: 30.4 KiB
Average record size in memory: 24.1 B
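
The two derived figures above (duplicate percentage and average record size) follow from the raw counts. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes and rounding to one decimal place; neither convention is stated in the report itself:

```python
# Raw figures copied from the dataset statistics above.
n_rows = 1291
n_duplicate_rows = 251
total_memory_kib = 30.4

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = 100 * n_duplicate_rows / n_rows

# Average bytes per record, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_memory_kib * 1024 / n_rows

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Both results round to the values the report shows, which suggests the percentage is taken over all 1291 observations.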

Variable types

Text: 3

Dataset

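Description: Status of workplaces in Haman-gun, Gyeongsangnam-do that have filed workplace-waste discharge reports. Fields: business name, waste type, and location (road-name address). Related to Article 17 of the Wastes Control Act (폐기물관리법).
Author: Haman-gun, Gyeongsangnam-do (경상남도 함안군)
URL: https://www.data.go.kr/data/15062044/fileData.do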

Alerts

Dataset has 251 (19.4%) duplicate rows (alert type: Duplicates)

Reproduction

Analysis started: 2023-12-12 08:46:28.985128
Analysis finished: 2023-12-12 08:46:29.558414
Duration: 0.57 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
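
The reported duration can be checked against the two timestamps. A small sketch using only the standard library; the format string is an assumption based on how the timestamps are printed above:

```python
from datetime import datetime

# Timestamp layout assumed from the values printed in this section.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-12 08:46:28.985128", FMT)
finished = datetime.strptime("2023-12-12 08:46:29.558414", FMT)

# Elapsed wall-clock time between the start and end of the analysis.
duration_seconds = (finished - started).total_seconds()

print(f"Duration: {duration_seconds:.2f} seconds")
```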

Variables

상호 (business name)
Text

Distinct: 365
Distinct (%): 28.3%
Missing: 0
Missing (%): 0.0%
Memory size: 10.2 KiB

Length

Max length: 20
Median length: 17
Mean length: 8.6498838
Min length: 3
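
As an illustration of how such length statistics are obtained, the sketch below computes them for the five sample values of this variable listed in the Sample subsection. It assumes length is counted in Unicode code points after NFC normalization, which is an assumption rather than something the report states:

```python
import unicodedata
from statistics import mean, median

# The five sample values shown for this variable.
sample_names = [
    "(주)나산전기산업 함안공장",
    "(주)나산전기산업 함안공장",
    "(주)현대티엠씨",
    "주식회사 부경콘크리트",
    "엔티케이산업(주)",
]

# Normalize so each Hangul syllable counts as one code point.
lengths = [len(unicodedata.normalize("NFC", name)) for name in sample_names]

length_stats = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": mean(lengths),
    "min": min(lengths),
}
print(length_stats)
```

These numbers describe only the five sample rows, not the full 1291-row column.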

Characters and Unicode

Total characters: 11167
Distinct characters: 292
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
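
To make the idea of character properties concrete, here is a minimal sketch that tallies the general category of each character in a string with Python's `unicodedata` module. The `CATEGORY_NAMES` mapping and the `category_counts` helper are illustrative names, not part of the profiling tool; the mapping covers only the categories that appear in this report:

```python
import unicodedata
from collections import Counter

# Two-letter Unicode general categories mapped to the labels used in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Pc": "Connector Punctuation",
    "Po": "Other Punctuation",
}


def category_counts(text):
    """Count characters per Unicode general category."""
    counts = Counter()
    for ch in unicodedata.normalize("NFC", text):
        code = unicodedata.category(ch)
        # Fall back to the raw two-letter code for unmapped categories.
        counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


print(category_counts("(주)현대티엠씨"))
```

Summing such counters over every value in a column would give a table of the same shape as "Most occurring categories".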

Unique

Unique: 119
Unique (%): 9.2%
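
The gap between "Distinct" (365) and "Unique" (119) suggests the two measure different things: distinct appears to count different values, while unique appears to count values that occur exactly once. A sketch under that assumption, with `distinct_and_unique` as a hypothetical helper name:

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    n_distinct = len(counts)
    n_unique = sum(1 for n in counts.values() if n == 1)
    return n_distinct, n_unique


# The five sample values shown for this variable.
sample_names = [
    "(주)나산전기산업 함안공장",
    "(주)나산전기산업 함안공장",
    "(주)현대티엠씨",
    "주식회사 부경콘크리트",
    "엔티케이산업(주)",
]
n_distinct, n_unique = distinct_and_unique(sample_names)
print(n_distinct, n_unique)

# The reported percentages are consistent with dividing by the 1291 observations.
distinct_pct = 100 * 365 / 1291
unique_pct = 100 * 119 / 1291
print(f"{distinct_pct:.1f}% {unique_pct:.1f}%")
```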

Sample

1st row: (주)나산전기산업 함안공장
2nd row: (주)나산전기산업 함안공장
3rd row: (주)현대티엠씨
4th row: 주식회사 부경콘크리트
5th row: 엔티케이산업(주)
Value | Count | Frequency (%)
주)서진인바이러테크 | 45 | 3.2%
주)칠서알씨 | 30 | 2.1%
일산실업(주)칠서에탄올공장 | 25 | 1.8%
엠함안(주 | 21 | 1.5%
한국주강(주 | 17 | 1.2%
삼영엠텍(주 | 16 | 1.1%
주)아시아그린 | 16 | 1.1%
주)한국특강 | 15 | 1.1%
신성에코(주 | 15 | 1.1%
주)약동산업 | 14 | 1.0%
Other values (373) | 1206 | 84.9%
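
Tokens such as "주)서진인바이러테크" and "엠함안(주" look like the result of splitting each value on whitespace and then stripping punctuation from both ends of every token. The sketch below reproduces that behaviour; it is one tokenization consistent with the table above, not a confirmed description of what the profiling tool does, and `word_counts` is a hypothetical helper name:

```python
import string
from collections import Counter


def word_counts(values):
    """Split on whitespace, strip ASCII punctuation from token ends, count tokens."""
    counts = Counter()
    for value in values:
        for token in value.split():
            token = token.strip(string.punctuation)
            if token:  # skip tokens that were punctuation only
                counts[token] += 1
    return counts


counts = word_counts(["(주)서진인바이러테크", "엠함안(주)", "(주)나산전기산업 함안공장"])
print(counts)
```

Only the outer parenthesis is removed because stripping stops at the first non-punctuation character, which is why the inner ")" or "(" survives in the tokens.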

Most occurring characters

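Value | Count | Frequency (%)
[glyph missing] | 1139 | 10.2%
( | 1090 | 9.8%
) | 1090 | 9.8%
[glyph missing] | 305 | 2.7%
[glyph missing] | 278 | 2.5%
[glyph missing] | 273 | 2.4%
[glyph missing] | 260 | 2.3%
[glyph missing] | 247 | 2.2%
[glyph missing] | 202 | 1.8%
[glyph missing] | 171 | 1.5%
Other values (282) | 6112 | 54.7%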

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 8605 | 77.1%
Open Punctuation | 1096 | 9.8%
Close Punctuation | 1096 | 9.8%
Uppercase Letter | 177 | 1.6%
Space Separator | 129 | 1.2%
Decimal Number | 39 | 0.3%
Dash Punctuation | 10 | 0.1%
Lowercase Letter | 10 | 0.1%
Other Punctuation | 5 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[glyph missing] | 1139 | 13.2%
[glyph missing] | 305 | 3.5%
[glyph missing] | 278 | 3.2%
[glyph missing] | 273 | 3.2%
[glyph missing] | 260 | 3.0%
[glyph missing] | 247 | 2.9%
[glyph missing] | 202 | 2.3%
[glyph missing] | 171 | 2.0%
[glyph missing] | 170 | 2.0%
[glyph missing] | 167 | 1.9%
Other values (252) | 5393 | 62.7%

Uppercase Letter
Value | Count | Frequency (%)
S | 27 | 15.3%
I | 17 | 9.6%
B | 16 | 9.0%
M | 13 | 7.3%
E | 13 | 7.3%
C | 12 | 6.8%
H | 11 | 6.2%
F | 11 | 6.2%
N | 10 | 5.6%
R | 10 | 5.6%
Other values (7) | 37 | 20.9%

Decimal Number
Value | Count | Frequency (%)
2 | 20 | 51.3%
1 | 14 | 35.9%
3 | 3 | 7.7%
8 | 1 | 2.6%
6 | 1 | 2.6%

Open Punctuation
Value | Count | Frequency (%)
( | 1090 | 99.5%
[ | 6 | 0.5%

Close Punctuation
Value | Count | Frequency (%)
) | 1090 | 99.5%
] | 6 | 0.5%

Space Separator
Value | Count | Frequency (%)
(space) | 129 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 10 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
o | 10 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 5 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 8605 | 77.1%
Common | 2375 | 21.3%
Latin | 187 | 1.7%

Most frequent character per script

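Hangul
Value | Count | Frequency (%)
[glyph missing] | 1139 | 13.2%
[glyph missing] | 305 | 3.5%
[glyph missing] | 278 | 3.2%
[glyph missing] | 273 | 3.2%
[glyph missing] | 260 | 3.0%
[glyph missing] | 247 | 2.9%
[glyph missing] | 202 | 2.3%
[glyph missing] | 171 | 2.0%
[glyph missing] | 170 | 2.0%
[glyph missing] | 167 | 1.9%
Other values (252) | 5393 | 62.7%

Latin
Value | Count | Frequency (%)
S | 27 | 14.4%
I | 17 | 9.1%
B | 16 | 8.6%
M | 13 | 7.0%
E | 13 | 7.0%
C | 12 | 6.4%
H | 11 | 5.9%
F | 11 | 5.9%
N | 10 | 5.3%
R | 10 | 5.3%
Other values (8) | 47 | 25.1%

Common
Value | Count | Frequency (%)
( | 1090 | 45.9%
) | 1090 | 45.9%
(space) | 129 | 5.4%
2 | 20 | 0.8%
1 | 14 | 0.6%
- | 10 | 0.4%
[ | 6 | 0.3%
] | 6 | 0.3%
. | 5 | 0.2%
3 | 3 | 0.1%
Other values (2) | 2 | 0.1%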

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 8605 | 77.1%
ASCII | 2562 | 22.9%
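
A Unicode block is a contiguous range of code points, so block membership can be decided from the code point alone. The sketch below assumes that the report's "ASCII" label corresponds to Basic Latin (U+0000 to U+007F), "Hangul" to Hangul Syllables (U+AC00 to U+D7AF), and "Compat Jamo" (a label used for another variable in this report) to Hangul Compatibility Jamo (U+3130 to U+318F). The label-to-block mapping and the helper names are assumptions:

```python
import unicodedata
from collections import Counter

# (label, first code point, last code point); only blocks named in this report.
BLOCKS = [
    ("ASCII", 0x0000, 0x007F),
    ("Compat Jamo", 0x3130, 0x318F),
    ("Hangul", 0xAC00, 0xD7AF),
]


def block_of(ch):
    """Return the block label for a character, or "None" if it is not listed."""
    cp = ord(ch)
    for name, start, end in BLOCKS:
        if start <= cp <= end:
            return name
    return "None"


def block_counts(text):
    """Count characters per block after NFC normalization."""
    return Counter(block_of(ch) for ch in unicodedata.normalize("NFC", text))


print(block_counts("(주)현대티엠씨"))
```

For this variable the two block counts in the table add up to the 11167 total characters reported earlier (8605 + 2562).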

Most frequent character per block

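Hangul
Value | Count | Frequency (%)
[glyph missing] | 1139 | 13.2%
[glyph missing] | 305 | 3.5%
[glyph missing] | 278 | 3.2%
[glyph missing] | 273 | 3.2%
[glyph missing] | 260 | 3.0%
[glyph missing] | 247 | 2.9%
[glyph missing] | 202 | 2.3%
[glyph missing] | 171 | 2.0%
[glyph missing] | 170 | 2.0%
[glyph missing] | 167 | 1.9%
Other values (252) | 5393 | 62.7%

ASCII
Value | Count | Frequency (%)
( | 1090 | 42.5%
) | 1090 | 42.5%
(space) | 129 | 5.0%
S | 27 | 1.1%
2 | 20 | 0.8%
I | 17 | 0.7%
B | 16 | 0.6%
1 | 14 | 0.5%
M | 13 | 0.5%
E | 13 | 0.5%
Other values (20) | 133 | 5.2%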

폐기물 종류 (waste type)
Text

Distinct: 89
Distinct (%): 6.9%
Missing: 0
Missing (%): 0.0%
Memory size: 10.2 KiB

Length

Max length: 88
Median length: 64
Mean length: 12.699458
Min length: 1

Characters and Unicode

Total characters: 16395
Distinct characters: 184
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 26
Unique (%): 2.0%

Sample

1st row: 폐수처리오니
2nd row: 폐합성수지류(폐염화비닐수지류는 제외한다)
3rd row: 폐활성탄
4th row: 폐콘크리트
5th row: 폐합성수지류(폐염화비닐수지류는 제외한다)
Value | Count | Frequency (%)
[glyph missing] | 529 | 17.1%
밖의 | 529 | 17.1%
제외한다 | 331 | 10.7%
폐합성수지류(폐염화비닐수지류는 | 320 | 10.4%
분진 | 208 | 6.7%
광재류 | 99 | 3.2%
폐수처리오니 | 96 | 3.1%
폐목재류 | 51 | 1.7%
폐기물 | 37 | 1.2%
폐토사 | 32 | 1.0%
Other values (150) | 853 | 27.6%

Most occurring characters

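Value | Count | Frequency (%)
(space) | 1888 | 11.5%
[glyph missing] | 1246 | 7.6%
[glyph missing] | 872 | 5.3%
[glyph missing] | 809 | 4.9%
[glyph missing] | 694 | 4.2%
[glyph missing] | 551 | 3.4%
[glyph missing] | 540 | 3.3%
[glyph missing] | 529 | 3.2%
[glyph missing] | 443 | 2.7%
[glyph missing] | 401 | 2.4%
Other values (174) | 8422 | 51.4%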

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 13694 | 83.5%
Space Separator | 1888 | 11.5%
Close Punctuation | 368 | 2.2%
Open Punctuation | 368 | 2.2%
Connector Punctuation | 76 | 0.5%
Decimal Number | 1 | < 0.1%

Most frequent character per category

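Other Letter
Value | Count | Frequency (%)
[glyph missing] | 1246 | 9.1%
[glyph missing] | 872 | 6.4%
[glyph missing] | 809 | 5.9%
[glyph missing] | 694 | 5.1%
[glyph missing] | 551 | 4.0%
[glyph missing] | 540 | 3.9%
[glyph missing] | 529 | 3.9%
[glyph missing] | 443 | 3.2%
[glyph missing] | 401 | 2.9%
[glyph missing] | 381 | 2.8%
Other values (167) | 7228 | 52.8%

Close Punctuation
Value | Count | Frequency (%)
) | 367 | 99.7%
[glyph missing] | 1 | 0.3%

Open Punctuation
Value | Count | Frequency (%)
( | 367 | 99.7%
[glyph missing] | 1 | 0.3%

Space Separator
Value | Count | Frequency (%)
(space) | 1888 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 76 | 100.0%

Decimal Number
Value | Count | Frequency (%)
2 | 1 | 100.0%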

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 13694 | 83.5%
Common | 2701 | 16.5%

Most frequent character per script

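Hangul
Value | Count | Frequency (%)
[glyph missing] | 1246 | 9.1%
[glyph missing] | 872 | 6.4%
[glyph missing] | 809 | 5.9%
[glyph missing] | 694 | 5.1%
[glyph missing] | 551 | 4.0%
[glyph missing] | 540 | 3.9%
[glyph missing] | 529 | 3.9%
[glyph missing] | 443 | 3.2%
[glyph missing] | 401 | 2.9%
[glyph missing] | 381 | 2.8%
Other values (167) | 7228 | 52.8%

Common
Value | Count | Frequency (%)
(space) | 1888 | 69.9%
) | 367 | 13.6%
( | 367 | 13.6%
_ | 76 | 2.8%
[glyph missing] | 1 | < 0.1%
2 | 1 | < 0.1%
[glyph missing] | 1 | < 0.1%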

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 13653 | 83.3%
ASCII | 2699 | 16.5%
Compat Jamo | 41 | 0.3%
None | 2 | < 0.1%

Most frequent character per block

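ASCII
Value | Count | Frequency (%)
(space) | 1888 | 70.0%
) | 367 | 13.6%
( | 367 | 13.6%
_ | 76 | 2.8%
2 | 1 | < 0.1%

Hangul
Value | Count | Frequency (%)
[glyph missing] | 1246 | 9.1%
[glyph missing] | 872 | 6.4%
[glyph missing] | 809 | 5.9%
[glyph missing] | 694 | 5.1%
[glyph missing] | 551 | 4.0%
[glyph missing] | 540 | 4.0%
[glyph missing] | 529 | 3.9%
[glyph missing] | 443 | 3.2%
[glyph missing] | 401 | 2.9%
[glyph missing] | 381 | 2.8%
Other values (166) | 7187 | 52.6%

Compat Jamo
Value | Count | Frequency (%)
[glyph missing] | 41 | 100.0%

None
Value | Count | Frequency (%)
[glyph missing] | 1 | 50.0%
[glyph missing] | 1 | 50.0%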

사업장도로명주소 (workplace road-name address)
Text

Distinct: 339
Distinct (%): 26.3%
Missing: 0
Missing (%): 0.0%
Memory size: 10.2 KiB

Length

Max length: 54
Median length: 37
Mean length: 23.088304
Min length: 1

Characters and Unicode

Total characters: 29807
Distinct characters: 226
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 99
Unique (%): 7.7%

Sample

1st row: 경상남도 함안군 군북면 함안산단5길 7
2nd row: 경상남도 함안군 군북면 함안산단5길 7
3rd row: 경상남도 함안군 가야읍 남문길 66 (주)현대티엠씨
4th row: 경상남도 함안군 군북면 장지1길 193
5th row: 경상남도 함안군 칠서면 구포1길 30
Value | Count | Frequency (%)
경상남도 | 1261 | 18.9%
함안군 | 1261 | 18.9%
군북면 | 348 | 5.2%
칠서면 | 319 | 4.8%
법수면 | 215 | 3.2%
대부로 | 103 | 1.5%
대산면 | 98 | 1.5%
윤외공단길 | 93 | 1.4%
칠원읍 | 92 | 1.4%
장백로 | 64 | 1.0%
Other values (458) | 2806 | 42.1%

Most occurring characters

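Value | Count | Frequency (%)
(space) | 5433 | 18.2%
[glyph missing] | 1609 | 5.4%
[glyph missing] | 1517 | 5.1%
[glyph missing] | 1497 | 5.0%
[glyph missing] | 1300 | 4.4%
[glyph missing] | 1293 | 4.3%
[glyph missing] | 1261 | 4.2%
[glyph missing] | 1261 | 4.2%
[glyph missing] | 1140 | 3.8%
1 | 909 | 3.0%
Other values (216) | 12587 | 42.2%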

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 19120 | 64.1%
Space Separator | 5433 | 18.2%
Decimal Number | 4191 | 14.1%
Dash Punctuation | 388 | 1.3%
Open Punctuation | 281 | 0.9%
Close Punctuation | 281 | 0.9%
Connector Punctuation | 91 | 0.3%
Uppercase Letter | 17 | 0.1%
Other Punctuation | 5 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[glyph missing] | 1609 | 8.4%
[glyph missing] | 1517 | 7.9%
[glyph missing] | 1497 | 7.8%
[glyph missing] | 1300 | 6.8%
[glyph missing] | 1293 | 6.8%
[glyph missing] | 1261 | 6.6%
[glyph missing] | 1261 | 6.6%
[glyph missing] | 1140 | 6.0%
[glyph missing] | 651 | 3.4%
[glyph missing] | 631 | 3.3%
Other values (194) | 6960 | 36.4%

Decimal Number
Value | Count | Frequency (%)
1 | 909 | 21.7%
2 | 603 | 14.4%
3 | 461 | 11.0%
9 | 380 | 9.1%
5 | 366 | 8.7%
4 | 338 | 8.1%
8 | 310 | 7.4%
0 | 305 | 7.3%
6 | 303 | 7.2%
7 | 216 | 5.2%

Uppercase Letter
Value | Count | Frequency (%)
N | 5 | 29.4%
E | 4 | 23.5%
G | 4 | 23.5%
K | 2 | 11.8%
T | 1 | 5.9%
J | 1 | 5.9%

Space Separator
Value | Count | Frequency (%)
(space) | 5433 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 388 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 281 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 281 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 91 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
: | 5 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 19120 | 64.1%
Common | 10670 | 35.8%
Latin | 17 | 0.1%

Most frequent character per script

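Hangul
Value | Count | Frequency (%)
[glyph missing] | 1609 | 8.4%
[glyph missing] | 1517 | 7.9%
[glyph missing] | 1497 | 7.8%
[glyph missing] | 1300 | 6.8%
[glyph missing] | 1293 | 6.8%
[glyph missing] | 1261 | 6.6%
[glyph missing] | 1261 | 6.6%
[glyph missing] | 1140 | 6.0%
[glyph missing] | 651 | 3.4%
[glyph missing] | 631 | 3.3%
Other values (194) | 6960 | 36.4%

Common
Value | Count | Frequency (%)
(space) | 5433 | 50.9%
1 | 909 | 8.5%
2 | 603 | 5.7%
3 | 461 | 4.3%
- | 388 | 3.6%
9 | 380 | 3.6%
5 | 366 | 3.4%
4 | 338 | 3.2%
8 | 310 | 2.9%
0 | 305 | 2.9%
Other values (6) | 1177 | 11.0%

Latin
Value | Count | Frequency (%)
N | 5 | 29.4%
E | 4 | 23.5%
G | 4 | 23.5%
K | 2 | 11.8%
T | 1 | 5.9%
J | 1 | 5.9%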

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 19120 | 64.1%
ASCII | 10687 | 35.9%

Most frequent character per block

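ASCII
Value | Count | Frequency (%)
(space) | 5433 | 50.8%
1 | 909 | 8.5%
2 | 603 | 5.6%
3 | 461 | 4.3%
- | 388 | 3.6%
9 | 380 | 3.6%
5 | 366 | 3.4%
4 | 338 | 3.2%
8 | 310 | 2.9%
0 | 305 | 2.9%
Other values (12) | 1194 | 11.2%

Hangul
Value | Count | Frequency (%)
[glyph missing] | 1609 | 8.4%
[glyph missing] | 1517 | 7.9%
[glyph missing] | 1497 | 7.8%
[glyph missing] | 1300 | 6.8%
[glyph missing] | 1293 | 6.8%
[glyph missing] | 1261 | 6.6%
[glyph missing] | 1261 | 6.6%
[glyph missing] | 1140 | 6.0%
[glyph missing] | 651 | 3.4%
[glyph missing] | 631 | 3.3%
Other values (194) | 6960 | 36.4%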

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completeness.
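
The numbers behind such nullity plots are per-column counts of missing cells. A minimal sketch, assuming rows are held as dictionaries and that a cell counts as missing when it is `None` or absent; `missing_per_column` is a hypothetical helper and only two sample rows are used:

```python
COLUMNS = ["상호", "폐기물 종류", "사업장도로명주소"]

# Two rows copied from the sample shown in this report.
sample_rows = [
    {"상호": "(주)현대티엠씨", "폐기물 종류": "폐활성탄",
     "사업장도로명주소": "경상남도 함안군 가야읍 남문길 66 (주)현대티엠씨"},
    {"상호": "주식회사 부경콘크리트", "폐기물 종류": "폐콘크리트",
     "사업장도로명주소": "경상남도 함안군 군북면 장지1길 193"},
]


def missing_per_column(rows, columns):
    """Count cells that are None or absent in each column."""
    return {col: sum(1 for row in rows if row.get(col) is None) for col in columns}


missing = missing_per_column(sample_rows, COLUMNS)
total_missing = sum(missing.values())
print(missing, total_missing)
```

A total of zero for every column matches the "Missing cells: 0" figure in the overview.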

Sample

First rows

Index | 상호 | 폐기물 종류 | 사업장도로명주소
0 | (주)나산전기산업 함안공장 | 폐수처리오니 | 경상남도 함안군 군북면 함안산단5길 7
1 | (주)나산전기산업 함안공장 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 경상남도 함안군 군북면 함안산단5길 7
2 | (주)현대티엠씨 | 폐활성탄 | 경상남도 함안군 가야읍 남문길 66 (주)현대티엠씨
3 | 주식회사 부경콘크리트 | 폐콘크리트 | 경상남도 함안군 군북면 장지1길 193
4 | 엔티케이산업(주) | 폐합성수지류(폐염화비닐수지류는 제외한다) | 경상남도 함안군 칠서면 구포1길 30
5 | (주)동아 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 경상남도 함안군 군북면 함안산단6길 97
6 | 베르데산업(주)(종합) | 폐콘크리트 | 경상남도 함안군 군북면 석교천길 270-2
7 | (주)범진기업 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 경상남도 함안군 법수면 법수로 358-64
8 | (주)성진에이앤에이 | 그 밖의 분진 | 경상남도 함안군 군북면 삼봉로 30 (주)성진에이앤에이
9 | (주)성진에이앤에이 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 경상남도 함안군 군북면 삼봉로 30 (주)성진에이앤에이

Last rows

Index | 상호 | 폐기물 종류 | 사업장도로명주소
1281 | 삼신정밀(주) | 그 밖의 광재류 | 경상남도 함안군 군북면 함안산단4길 3
1282 | 삼신정밀(주) | 폐토사 | 경상남도 함안군 군북면 함안산단4길 3
1283 | 삼신정밀(주) | 그 밖의 분진 | 경상남도 함안군 군북면 함안산단4길 3
1284 | 삼신정밀(주) | 그 밖의 광재류 | 경상남도 함안군 군북면 함안산단4길 3
1285 | 삼신정밀(주) | 그 밖의 광재류 | 경상남도 함안군 군북면 함안산단4길 3
1286 | 삼신정밀(주) | 화학점결폐주물사 | 경상남도 함안군 군북면 함안산단4길 3
1287 | 삼신정밀(주) | 그 밖의 광재류 | 경상남도 함안군 군북면 함안산단4길 3
1288 | 삼신정밀(주) | 폐합성수지류(폐염화비닐수지류는 제외한다) | 경상남도 함안군 군북면 함안산단4길 3
1289 | 삼신정밀(주) | 폐활성탄 | 경상남도 함안군 군북면 함안산단4길 3
1290 | 삼신정밀(주) | 그 밖의 분진 | 경상남도 함안군 군북면 함안산단4길 3

Duplicate rows

Most frequently occurring

Index | 상호 | 폐기물 종류 | 사업장도로명주소 | # duplicates
207 | 일산실업(주)칠서에탄올공장 | 그 밖의 폐수처리오니 | 경상남도 함안군 칠서면 대부로 551 | 16
91 | (주)칠서알씨 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 경상남도 함안군 칠서면 공단서4길 11 | 12
85 | (주)진주스틸 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 경상남도 함안군 법수면 강주1길 100-48 | 10
34 | (주)서진인바이러테크 | 그 밖의 공정오니 | 경상남도 함안군 법수면 윤외공단길 26-99 (주)서진인바이러테크 | 9
90 | (주)칠서알씨 | 폐폴리우레탄폼류 | 경상남도 함안군 칠서면 공단서4길 11 | 8
194 | 엠함안(주) | 그 밖의 폐목재류 | 경상남도 함안군 대산면 대부로 420_ 엠함안주식회사 | 8
35 | (주)서진인바이러테크 | 그 밖의 광재류 | 경상남도 함안군 법수면 윤외공단길 26-99 (주)서진인바이러테크 | 7
36 | (주)서진인바이러테크 | 그 밖의 무기성오니 | 경상남도 함안군 법수면 윤외공단길 26-99 (주)서진인바이러테크 | 7
82 | (주)주일에코텍 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 경상남도 함안군 대산면 대부로 398 ((주)오코) | 7
37 | (주)서진인바이러테크 | 그 밖의 분진 | 경상남도 함안군 법수면 윤외공단길 26-99 (주)서진인바이러테크 | 6
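
A ranking like the one above can be produced by counting identical rows and keeping those that occur more than once. The sketch below does this for rows 1281 to 1290 of the Sample section. It assumes the "# duplicates" figure is the total number of occurrences of a row, which the report does not state, and `most_frequent_duplicates` is a hypothetical helper name:

```python
from collections import Counter

NAME = "삼신정밀(주)"
ADDRESS = "경상남도 함안군 군북면 함안산단4길 3"

# Waste types of rows 1281 to 1290, in order.
waste_types = [
    "그 밖의 광재류", "폐토사", "그 밖의 분진", "그 밖의 광재류", "그 밖의 광재류",
    "화학점결폐주물사", "그 밖의 광재류",
    "폐합성수지류(폐염화비닐수지류는 제외한다)", "폐활성탄", "그 밖의 분진",
]
rows = [(NAME, waste, ADDRESS) for waste in waste_types]


def most_frequent_duplicates(rows, top=10):
    """Return (row, occurrences) pairs for rows seen more than once, most frequent first."""
    counts = Counter(rows)
    repeated = [(row, n) for row, n in counts.items() if n > 1]
    return sorted(repeated, key=lambda item: item[1], reverse=True)[:top]


top_duplicates = most_frequent_duplicates(rows)
for row, n in top_duplicates:
    print(row[1], n)
```

Rows are compared as whole tuples, so two records count as duplicates only when all three fields match exactly.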