Overview

Dataset statistics

Number of variables: 6
Number of observations: 10000
Missing cells: 33348
Missing cells (%): 55.6%
Duplicate rows: 83
Duplicate rows (%): 0.8%
Total size in memory: 546.9 KiB
Average record size in memory: 56.0 B
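
As a quick consistency check, the percentages above can be recomputed from the raw counts. The sketch below assumes that the missing-cell percentage is missing cells divided by (variables × observations) and that 1 KiB is 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics above.
n_variables = 6
n_observations = 10_000
missing_cells = 33_348
total_size_kib = 546.9

# Assumption: every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")        # 55.6%
print(f"Average record size: {avg_record_bytes:.1f} B")  # 56.0 B
```

The alerts list four columns with 8337 missing values each, and 4 × 8337 = 33348, which accounts for every missing cell.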

Variable types

Categorical: 2
Text: 3
DateTime: 1

Dataset

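Description: Waste discharge reports filed by business sites in Muan-gun, Jeollanam-do. The data covers business-site category, waste type, address, registration date, and related fields.
URL: https://www.data.go.kr/data/15081044/fileData.do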

Alerts

데이터기준일자 has constant value "" (Constant)
Dataset has 83 (0.8%) duplicate rows (Duplicates)
구분 is highly overall correlated with 폐기물 종류 (High correlation)
폐기물 종류 is highly overall correlated with 구분 (High correlation)
폐기물 종류 is highly imbalanced (79.1%) (Imbalance)
상호 has 8337 (83.4%) missing values (Missing)
사업장도로명주소 has 8337 (83.4%) missing values (Missing)
사업장지번주소 has 8337 (83.4%) missing values (Missing)
데이터기준일자 has 8337 (83.4%) missing values (Missing)

Reproduction

Analysis started: 2023-12-12 14:21:30.335661
Analysis finished: 2023-12-12 14:21:31.516249
Duration: 1.18 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
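
The reported duration can be checked against the two timestamps. This is a minimal sketch using only the Python standard library; it assumes both timestamps are in the same time zone.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section above.
started = datetime.fromisoformat("2023-12-12 14:21:30.335661")
finished = datetime.fromisoformat("2023-12-12 14:21:31.516249")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()

print(f"{duration:.2f} seconds")  # 1.18 seconds
```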

Variables

구분
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
<NA> | 8337
지정 | 1663

Length

Max length: 4
Median length: 4
Mean length: 3.6674
Min length: 2
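
The mean length of 3.6674 can be reproduced from the value counts reported for this variable, on the assumption that the missing marker is counted as the literal four-character string "<NA>". A minimal sketch (names are illustrative):

```python
# Value counts reported for 구분; "<NA>" is treated as an ordinary string here.
value_counts = {"<NA>": 8337, "지정": 1663}

total_rows = sum(value_counts.values())
total_chars = sum(len(value) * count for value, count in value_counts.items())

mean_length = total_chars / total_rows
min_length = min(len(value) for value in value_counts)
max_length = max(len(value) for value in value_counts)

print(mean_length, min_length, max_length)  # 3.6674 2 4
```

This also suggests why the variable shows 0 missing values: the placeholder appears to be stored as text rather than as a true null.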

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: 지정
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 8337 | 83.4%
지정 | 1663 | 16.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 8337 | 83.4%
지정 | 1663 | 16.6%

상호
Text

MISSING 

Distinct: 375
Distinct (%): 22.5%
Missing: 8337
Missing (%): 83.4%
Memory size: 156.2 KiB

Length

Max length: 25
Median length: 23
Mean length: 6.4485869
Min length: 1

Characters and Unicode

Total characters: 10724
Distinct characters: 284
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
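
To illustrate what these character properties look like in practice, the sketch below classifies one 상호 sample value from this report by Unicode general category using Python's standard `unicodedata` module. The mapping of category codes to long names covers only the codes that occur in this sample, and the Hangul check via character names is a rough stand-in for a real script lookup, which the standard library does not provide.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in this sample.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}

# A 상호 sample value from this report, normalized to precomposed syllables.
sample = unicodedata.normalize("NFC", "무안군청(건설교통과)")

# Count characters per Unicode general category.
category_counts = Counter(unicodedata.category(ch) for ch in sample)
for code, count in category_counts.items():
    print(CATEGORY_NAMES.get(code, code), count)

# Assumption: a character whose Unicode name starts with "HANGUL"
# is treated as belonging to the Hangul script.
hangul_count = sum(unicodedata.name(ch, "").startswith("HANGUL") for ch in sample)
print("Hangul characters:", hangul_count)
```

For this value the nine syllables fall under Other Letter and the two parentheses under Open and Close Punctuation, which mirrors the categories that dominate the tables for this variable.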

Unique

Unique: 274
Unique (%): 16.5%

Sample

1st row: 개인
2nd row: 개인
3rd row: 개인
4th row: 개인
5th row: 무안군청(건설교통과)

Value | Count | Frequency (%)
개인 | 343 | 17.7%
유한회사 | 187 | 9.7%
유)남해환경 | 75 | 3.9%
진응건설(주 | 70 | 3.6%
주)동양환경 | 68 | 3.5%
신성건설 | 66 | 3.4%
유)신성건설 | 62 | 3.2%
미래개발 | 59 | 3.0%
유)서부환경 | 57 | 2.9%
주)대양환경산업건설 | 48 | 2.5%
Other values (390) | 900 | 46.5%

Most occurring characters

ValueCountFrequency (%)
) 677
 
6.3%
( 676
 
6.3%
496
 
4.6%
485
 
4.5%
468
 
4.4%
427
 
4.0%
393
 
3.7%
384
 
3.6%
374
 
3.5%
372
 
3.5%
Other values (274) 5972
55.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 9049 | 84.4%
Close Punctuation | 679 | 6.3%
Open Punctuation | 678 | 6.3%
Space Separator | 277 | 2.6%
Decimal Number | 36 | 0.3%
Connector Punctuation | 4 | < 0.1%
Other Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
496
 
5.5%
485
 
5.4%
468
 
5.2%
427
 
4.7%
393
 
4.3%
384
 
4.2%
374
 
4.1%
372
 
4.1%
280
 
3.1%
249
 
2.8%
Other values (261) 5121
56.6%
Decimal Number
ValueCountFrequency (%)
9 16
44.4%
1 10
27.8%
8 7
19.4%
6 1
 
2.8%
5 1
 
2.8%
3 1
 
2.8%
Close Punctuation
ValueCountFrequency (%)
) 677
99.7%
] 2
 
0.3%
Open Punctuation
ValueCountFrequency (%)
( 676
99.7%
[ 2
 
0.3%
Space Separator
ValueCountFrequency (%)
277
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 4
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 9049 | 84.4%
Common | 1675 | 15.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
496
 
5.5%
485
 
5.4%
468
 
5.2%
427
 
4.7%
393
 
4.3%
384
 
4.2%
374
 
4.1%
372
 
4.1%
280
 
3.1%
249
 
2.8%
Other values (261) 5121
56.6%
Common
ValueCountFrequency (%)
) 677
40.4%
( 676
40.4%
277
16.5%
9 16
 
1.0%
1 10
 
0.6%
8 7
 
0.4%
_ 4
 
0.2%
] 2
 
0.1%
[ 2
 
0.1%
6 1
 
0.1%
Other values (3) 3
 
0.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 9049 | 84.4%
ASCII | 1675 | 15.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
) 677
40.4%
( 676
40.4%
277
16.5%
9 16
 
1.0%
1 10
 
0.6%
8 7
 
0.4%
_ 4
 
0.2%
] 2
 
0.1%
[ 2
 
0.1%
6 1
 
0.1%
Other values (3) 3
 
0.2%
Hangul
ValueCountFrequency (%)
496
 
5.5%
485
 
5.4%
468
 
5.2%
427
 
4.7%
393
 
4.3%
384
 
4.2%
374
 
4.1%
372
 
4.1%
280
 
3.1%
249
 
2.8%
Other values (261) 5121
56.6%

폐기물 종류
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 46
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
<NA> | 8337
흩날릴 우려가 없는 폐석면 | 490
석면의 제거작업에 사용된 모든 비닐시트ㆍ방진마스크ㆍ작업복ㆍ집진필터 등 | 476
슬레이트 등 고형화되어 있어 흩날릴 우려가 없는 것 | 189
석면의 제거작업에 사용된 바닥비닐시트ㆍ방진마스크ㆍ작업복 등 | 138
Other values (41) | 370

Length

Max length: 81
Median length: 4
Mean length: 7.5169
Min length: 1

Unique

Unique: 15
Unique (%): 0.1%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: 슬레이트 등 고형화되어 있어 흩날릴 우려가 없는 것
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 8337 | 83.4%
흩날릴 우려가 없는 폐석면 | 490 | 4.9%
석면의 제거작업에 사용된 모든 비닐시트ㆍ방진마스크ㆍ작업복ㆍ집진필터 등 | 476 | 4.8%
슬레이트 등 고형화되어 있어 흩날릴 우려가 없는 것 | 189 | 1.9%
석면의 제거작업에 사용된 바닥비닐시트ㆍ방진마스크ㆍ작업복 등 | 138 | 1.4%
흩날릴 우려가 있는 폐석면 | 105 | 1.1%
건조고형물의 함량을 기준으로 하여 석면이 1퍼센트 이상 함유된 제품ㆍ설비(뿜칠로 사용된 것을 포함한다) 등의 해체ㆍ제거 시 발생되는 것 | 46 | 0.5%
손상성폐기물 | 30 | 0.3%
일반의료폐기물 | 28 | 0.3%
폐유 | 28 | 0.3%
Other values (36) | 133 | 1.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
na 8337
49.1%
808
 
4.8%
우려가 785
 
4.6%
흩날릴 785
 
4.6%
없는 679
 
4.0%
사용된 664
 
3.9%
석면의 618
 
3.6%
제거작업에 618
 
3.6%
폐석면 604
 
3.6%
모든 476
 
2.8%
Other values (97) 2617
 
15.4%

사업장도로명주소
Text

MISSING

Distinct: 657
Distinct (%): 39.5%
Missing: 8337
Missing (%): 83.4%
Memory size: 156.2 KiB

Length

Max length: 53
Median length: 45
Mean length: 22.335538
Min length: 1

Characters and Unicode

Total characters: 37144
Distinct characters: 308
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 542
Unique (%): 32.6%

Sample

1st row: (blank)
2nd row: 전라남도 무안군 해제면 매안1길 60-4
3rd row: 전라남도 무안군 삼향읍 덕영길 36
4th row: 전라남도 무안군 무안읍 교촌길 142
5th row: 전라남도 무안군 무안읍 무안로 530_ 무안군청

Value | Count | Frequency (%)
전라남도 | 1499 | 18.5%
무안군 | 1401 | 17.2%
삼향읍 | 538 | 6.6%
무안읍 | 246 | 3.0%
청계면 | 207 | 2.5%
삼향중앙로 | 165 | 2.0%
일로읍 | 158 | 1.9%
영산로 | 138 | 1.7%
삼일로 | 134 | 1.6%
260 | 128 | 1.6%
Other values (1059) | 3510 | 43.2%

Most occurring characters

ValueCountFrequency (%)
6744
18.2%
1864
 
5.0%
1829
 
4.9%
1776
 
4.8%
1595
 
4.3%
1514
 
4.1%
1506
 
4.1%
1429
 
3.8%
1 1172
 
3.2%
1146
 
3.1%
Other values (298) 16569
44.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 22743 | 61.2%
Space Separator | 6744 | 18.2%
Decimal Number | 5899 | 15.9%
Dash Punctuation | 759 | 2.0%
Open Punctuation | 431 | 1.2%
Close Punctuation | 431 | 1.2%
Connector Punctuation | 130 | 0.3%
Math Symbol | 4 | < 0.1%
Uppercase Letter | 3 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1864
 
8.2%
1829
 
8.0%
1776
 
7.8%
1595
 
7.0%
1514
 
6.7%
1506
 
6.6%
1429
 
6.3%
1146
 
5.0%
948
 
4.2%
919
 
4.0%
Other values (279) 8217
36.1%
Decimal Number
ValueCountFrequency (%)
1 1172
19.9%
2 935
15.9%
5 694
11.8%
0 689
11.7%
3 539
9.1%
6 516
8.7%
4 494
8.4%
7 304
 
5.2%
8 296
 
5.0%
9 260
 
4.4%
Uppercase Letter
ValueCountFrequency (%)
S 1
33.3%
G 1
33.3%
I 1
33.3%
Space Separator
ValueCountFrequency (%)
6744
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 759
100.0%
Open Punctuation
ValueCountFrequency (%)
( 431
100.0%
Close Punctuation
ValueCountFrequency (%)
) 431
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 130
100.0%
Math Symbol
ValueCountFrequency (%)
~ 4
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 22743 | 61.2%
Common | 14398 | 38.8%
Latin | 3 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1864
 
8.2%
1829
 
8.0%
1776
 
7.8%
1595
 
7.0%
1514
 
6.7%
1506
 
6.6%
1429
 
6.3%
1146
 
5.0%
948
 
4.2%
919
 
4.0%
Other values (279) 8217
36.1%
Common
ValueCountFrequency (%)
6744
46.8%
1 1172
 
8.1%
2 935
 
6.5%
- 759
 
5.3%
5 694
 
4.8%
0 689
 
4.8%
3 539
 
3.7%
6 516
 
3.6%
4 494
 
3.4%
( 431
 
3.0%
Other values (6) 1425
 
9.9%
Latin
ValueCountFrequency (%)
S 1
33.3%
G 1
33.3%
I 1
33.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 22743 | 61.2%
ASCII | 14401 | 38.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6744
46.8%
1 1172
 
8.1%
2 935
 
6.5%
- 759
 
5.3%
5 694
 
4.8%
0 689
 
4.8%
3 539
 
3.7%
6 516
 
3.6%
4 494
 
3.4%
( 431
 
3.0%
Other values (9) 1428
 
9.9%
Hangul
ValueCountFrequency (%)
1864
 
8.2%
1829
 
8.0%
1776
 
7.8%
1595
 
7.0%
1514
 
6.7%
1506
 
6.6%
1429
 
6.3%
1146
 
5.0%
948
 
4.2%
919
 
4.0%
Other values (279) 8217
36.1%

사업장지번주소
Text

MISSING 

Distinct: 659
Distinct (%): 39.6%
Missing: 8337
Missing (%): 83.4%
Memory size: 156.2 KiB

Length

Max length: 45
Median length: 40
Mean length: 22.301263
Min length: 1

Characters and Unicode

Total characters: 37087
Distinct characters: 273
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 534
Unique (%): 32.1%

Sample

1st row: 전라남도 무안군 운남면 성내리 560
2nd row: 전라남도 무안군 해제면 창매리 343-19
3rd row: 전라남도 무안군 삼향읍 지산리 667
4th row: 전라남도 무안군 무안읍 교촌리 889-5
5th row: 전라남도 무안군 무안읍 성동리 712-1 무안군청

Value | Count | Frequency (%)
전라남도 | 1511 | 18.6%
무안군 | 1414 | 17.4%
삼향읍 | 534 | 6.6%
무안읍 | 272 | 3.3%
청계면 | 210 | 2.6%
용포리 | 173 | 2.1%
일로읍 | 158 | 1.9%
유교리 | 132 | 1.6%
308-6 | 128 | 1.6%
성남리 | 116 | 1.4%
Other values (949) | 3489 | 42.9%

Most occurring characters

ValueCountFrequency (%)
8265
22.3%
1912
 
5.2%
1726
 
4.7%
1717
 
4.6%
1569
 
4.2%
1523
 
4.1%
1520
 
4.1%
1439
 
3.9%
1426
 
3.8%
1 1070
 
2.9%
Other values (263) 14920
40.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 21632 | 58.3%
Space Separator | 8265 | 22.3%
Decimal Number | 5916 | 16.0%
Dash Punctuation | 883 | 2.4%
Close Punctuation | 190 | 0.5%
Open Punctuation | 190 | 0.5%
Connector Punctuation | 9 | < 0.1%
Math Symbol | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1912
 
8.8%
1726
 
8.0%
1717
 
7.9%
1569
 
7.3%
1523
 
7.0%
1520
 
7.0%
1439
 
6.7%
1426
 
6.6%
967
 
4.5%
612
 
2.8%
Other values (247) 7221
33.4%
Decimal Number
ValueCountFrequency (%)
1 1070
18.1%
3 848
14.3%
5 766
12.9%
2 623
10.5%
6 522
8.8%
0 521
8.8%
4 419
 
7.1%
7 407
 
6.9%
8 406
 
6.9%
9 334
 
5.6%
Space Separator
ValueCountFrequency (%)
8265
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 883
100.0%
Close Punctuation
ValueCountFrequency (%)
) 190
100.0%
Open Punctuation
ValueCountFrequency (%)
( 190
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 9
100.0%
Math Symbol
ValueCountFrequency (%)
~ 2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 21632 | 58.3%
Common | 15455 | 41.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1912
 
8.8%
1726
 
8.0%
1717
 
7.9%
1569
 
7.3%
1523
 
7.0%
1520
 
7.0%
1439
 
6.7%
1426
 
6.6%
967
 
4.5%
612
 
2.8%
Other values (247) 7221
33.4%
Common
ValueCountFrequency (%)
8265
53.5%
1 1070
 
6.9%
- 883
 
5.7%
3 848
 
5.5%
5 766
 
5.0%
2 623
 
4.0%
6 522
 
3.4%
0 521
 
3.4%
4 419
 
2.7%
7 407
 
2.6%
Other values (6) 1131
 
7.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 21632 | 58.3%
ASCII | 15455 | 41.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8265
53.5%
1 1070
 
6.9%
- 883
 
5.7%
3 848
 
5.5%
5 766
 
5.0%
2 623
 
4.0%
6 522
 
3.4%
0 521
 
3.4%
4 419
 
2.7%
7 407
 
2.6%
Other values (6) 1131
 
7.3%
Hangul
ValueCountFrequency (%)
1912
 
8.8%
1726
 
8.0%
1717
 
7.9%
1569
 
7.3%
1523
 
7.0%
1520
 
7.0%
1439
 
6.7%
1426
 
6.6%
967
 
4.5%
612
 
2.8%
Other values (247) 7221
33.4%

데이터기준일자
Date

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 0.1%
Missing: 8337
Missing (%): 83.4%
Memory size: 156.2 KiB
Minimum: 2023-04-20 00:00:00
Maximum: 2023-04-20 00:00:00
Histogram with fixed size bins (bins=1)

Correlations

 | 폐기물 종류
폐기물 종류 | 1.000

 | 구분 | 폐기물 종류
구분 | 1.000 | 1.000
폐기물 종류 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
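
One way to read nullity correlation is as the Pearson correlation between per-column "is missing" indicators. The sketch below works on a few hypothetical toy rows (not records from this dataset) and uses a hand-written `pearson` helper; both are assumptions made for illustration and may differ from how the report computes its heatmap.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical toy rows: None marks a missing cell.
rows = [
    {"상호": "A", "사업장지번주소": "addr 1"},
    {"상호": None, "사업장지번주소": None},
    {"상호": None, "사업장지번주소": None},
    {"상호": "B", "사업장지번주소": None},
]

# 1 where the cell is missing, 0 where it is present.
null_name = [int(row["상호"] is None) for row in rows]
null_addr = [int(row["사업장지번주소"] is None) for row in rows]

partial = pearson(null_name, null_addr)  # columns that are often missing together
perfect = pearson(null_name, null_name)  # columns that are always missing together

print(round(partial, 3), perfect)  # 0.577 1.0
```

In this report four columns share the same count of 8337 missing values, and the duplicate-rows table lists 8337 fully empty rows, so those columns would be expected to show a nullity correlation close to 1.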

Sample

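 | 구분 | 상호 | 폐기물 종류 | 사업장도로명주소 | 사업장지번주소 | 데이터기준일자
28140 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
20459 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
9091 | 지정 | 개인 | 슬레이트 등 고형화되어 있어 흩날릴 우려가 없는 것 |  | 전라남도 무안군 운남면 성내리 560 | 2023-04-20
45772 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
47000 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
57575 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
34912 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
50158 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
63127 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
46166 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
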
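 | 구분 | 상호 | 폐기물 종류 | 사업장도로명주소 | 사업장지번주소 | 데이터기준일자
34578 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
31313 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
45810 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
24452 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
20801 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
24712 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
19332 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
38844 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
5174 | 지정 | (주)대양환경산업건설 | 흩날릴 우려가 없는 폐석면 |  | 전라남도 무안군 일로읍 죽산리 219-5 (박선심) | 2023-04-20
61612 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>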

Duplicate rows

Most frequently occurring

 | 구분 | 상호 | 폐기물 종류 | 사업장도로명주소 | 사업장지번주소 | 데이터기준일자 | # duplicates
82 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 8337
81 | 지정 | 진응건설(주) | 흩날릴 우려가 없는 폐석면 | 전라남도 여수시 여수산단로 86-24 (주삼동) | 전라남도 여수시 주삼동 183 | 2023-04-20 | 41
72 | 지정 | 유한회사 신성건설 | 흩날릴 우려가 없는 폐석면 | 전라남도 무안군 삼향읍 삼일로 260 | 전라남도 무안군 삼향읍 용포리 308-6 | 2023-04-20 | 36
71 | 지정 | 유한회사 신성건설 | 석면의 제거작업에 사용된 모든 비닐시트ㆍ방진마스크ㆍ작업복ㆍ집진필터 등 | 전라남도 무안군 삼향읍 삼일로 260 | 전라남도 무안군 삼향읍 용포리 308-6 | 2023-04-20 | 30
80 | 지정 | 진응건설(주) | 석면의 제거작업에 사용된 모든 비닐시트ㆍ방진마스크ㆍ작업복ㆍ집진필터 등 | 전라남도 여수시 여수산단로 86-24 (주삼동) | 전라남도 여수시 주삼동 183 | 2023-04-20 | 29
13 | 지정 | (유)서부환경 | 석면의 제거작업에 사용된 모든 비닐시트ㆍ방진마스크ㆍ작업복ㆍ집진필터 등 | 전라남도 무안군 삼향읍 삼향중앙로 140-53 | 전라남도 무안군 삼향읍 유교리 350-5 | 2023-04-20 | 25
22 | 지정 | (유)신성건설 | 흩날릴 우려가 없는 폐석면 | 전라남도 무안군 삼향읍 삼일로 260 | 전라남도 무안군 삼향읍 용포리 308-6 | 2023-04-20 | 24
23 | 지정 | (유)신성건설 | 흩날릴 우려가 있는 폐석면 | 전라남도 무안군 삼향읍 삼일로 260 | 전라남도 무안군 삼향읍 용포리 308-6 | 2023-04-20 | 20
36 | 지정 | (주)승달건설 | 흩날릴 우려가 없는 폐석면 | 전라남도 무안군 삼향읍 삼향중앙로 277 | 전라남도 무안군 삼향읍 용포리 752-2 | 2023-04-20 | 20
9 | 지정 | (유)남해환경 | 흩날릴 우려가 있는 폐석면 | 전라남도 무안군 삼향읍 삼향중앙로 140-51 (유)남해환경 | 전라남도 무안군 삼향읍 유교리 350-1 (유)남해환경 | 2023-04-20 | 19
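
A table like this one can be built by counting identical rows and keeping those that occur more than once. The sketch below uses hypothetical two-column toy rows rather than the real records, and the names are illustrative; it is one plausible way to tally duplicates, not necessarily how the report does it.

```python
from collections import Counter

# Hypothetical toy rows (구분, 폐기물 종류); None marks a missing cell.
rows = [
    (None, None),
    ("지정", "폐유"),
    (None, None),
    ("지정", "폐유"),
    (None, None),
    ("지정", "손상성폐기물"),
]

# Count how often each distinct row occurs.
row_counts = Counter(rows)

# Keep only rows that occur more than once, most frequent first.
duplicate_groups = sorted(
    ((row, count) for row, count in row_counts.items() if count > 1),
    key=lambda item: item[1],
    reverse=True,
)

for row, count in duplicate_groups:
    print(row, count)
```

Under this reading, the row indices in the table (the highest shown is 82) would be consistent with the 83 duplicate rows in the overview being distinct duplicated rows, of which the fully empty row is by far the most frequent.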