Overview

Dataset statistics

Number of variables: 3
Number of observations: 366
Missing cells: 92
Missing cells (%): 8.4%
Duplicate rows: 3
Duplicate rows (%): 0.8%
Total size in memory: 8.7 KiB
Average record size in memory: 24.4 B
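
The two percentages above follow from the raw counts. As a minimal sketch in plain Python, with the figures copied from the table rather than recomputed from the data, the missing-cell share is taken over all cells (variables times observations) and the duplicate share over all rows. The variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" table (not recomputed from data).
n_variables = 3
n_observations = 366
missing_cells = 92
duplicate_rows = 3

total_cells = n_variables * n_observations            # 3 * 366 = 1098 cells
missing_pct = 100 * missing_cells / total_cells       # share of empty cells
duplicate_pct = 100 * duplicate_rows / n_observations  # share of duplicate rows

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```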

Variable types

Text: 2
Categorical: 1

Dataset

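Description: Publishes the list of business sites in Sejong Special Self-Governing City (세종특별자치시) that discharge industrial waste. Each record consists of the business site name (사업장명), the business site address (사업장 주소), and a category (구분).
Author: Sejong Special Self-Governing City (세종특별자치시)
URL: https://www.data.go.kr/data/15060324/fileData.do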

Alerts

Dataset has 3 (0.8%) duplicate rows [Duplicates]
사업장명 has 46 (12.6%) missing values [Missing]
사업장 주소 has 46 (12.6%) missing values [Missing]

Reproduction

Analysis started: 2023-12-23 08:03:42.749477
Analysis finished: 2023-12-23 08:03:45.459879
Duration: 2.71 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

사업장명 (business site name)
Text

MISSING 

Distinct: 299
Distinct (%): 93.4%
Missing: 46
Missing (%): 12.6%
Memory size: 3.0 KiB

Length

Max length: 25
Median length: 21
Mean length: 9.5625
Min length: 3

Characters and Unicode

Total characters: 3060
Distinct characters: 288
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
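
As a rough illustration of what the category counts in this section measure, the sketch below uses Python's standard `unicodedata` module to tally the general category of every character in three of the sample names listed under Sample. It is an assumed equivalent, not the profiler's own code. The two-letter codes correspond to the names used in the report: `Lo` is Other Letter, `Ps` is Open Punctuation, `Pe` is Close Punctuation, and `Zs` is Space Separator.

```python
import unicodedata
from collections import Counter

# Three names taken from the Sample rows of this variable.
names = ["(주)포스코퓨처엠", "주식회사 도원콘크리트", "신안포장산업(주)"]

# Look up the Unicode general category of each character and count them.
category_counts = Counter(
    unicodedata.category(ch) for name in names for ch in name
)

for category, count in category_counts.most_common():
    print(category, count)
```

Hangul syllables fall under `Lo`, which is why Other Letter dominates the category tables for this variable.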

Unique

Unique: 287
Unique (%): 89.7%
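
The report does not define Distinct and Unique. A common reading, assumed here, is that Distinct counts the different non-missing values and Unique counts the values that occur exactly once. The percentages are consistent with a denominator of 320 non-missing entries (366 observations minus 46 missing). The sketch below shows both counts on a small hypothetical list and then checks the reported percentages.

```python
from collections import Counter

# Hypothetical values for illustration; None stands in for a missing entry.
values = ["A", "B", "B", "C", None]

present = [v for v in values if v is not None]
counts = Counter(present)

n_distinct = len(counts)                               # different values: A, B, C
n_unique = sum(1 for n in counts.values() if n == 1)   # values seen once: A, C

# Check the reported percentages against the non-missing count.
n_present = 366 - 46                                   # 320 non-missing entries
distinct_pct = round(100 * 299 / n_present, 1)
unique_pct = round(100 * 287 / n_present, 1)
print(n_distinct, n_unique, distinct_pct, unique_pct)
```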

Sample

1st row: (주)포스코퓨처엠
2nd row: 주식회사 도원콘크리트
3rd row: 신안포장산업(주)
4th row: 국제금속주식회사
5th row: 에이치엘비(주)헬스케어 2공장
Value | Count | Frequency (%)
주식회사 | 30 | 7.4%
세종공장 | 10 | 2.5%
세종특별자치시시설관리공단 | 5 | 1.2%
엔백 | 4 | 1.0%
주)동양에이케이코리아 | 3 | 0.7%
테크로스환경서비스 | 3 | 0.7%
주)두현이엔씨 | 3 | 0.7%
입주기업체협의회 | 3 | 0.7%
세종점 | 3 | 0.7%
주)포스코퓨처엠 | 3 | 0.7%
Other values (320) | 339 | 83.5%
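
Tokens such as 주)포스코퓨처엠 suggest that names are split on whitespace and that punctuation is stripped only from the two ends of each token, which removes the leading "(" but keeps the inner ")". That rule is an inference from the table, not something the report states. The sketch below applies it to the five sample rows, and the helper name `tokenize` is hypothetical. The frequencies are relative to the total token count: the counts in the table sum to 406, and 30 / 406 is about 7.4%.

```python
import string
from collections import Counter

# The five sample names shown for this variable.
names = [
    "(주)포스코퓨처엠",
    "주식회사 도원콘크리트",
    "신안포장산업(주)",
    "국제금속주식회사",
    "에이치엘비(주)헬스케어 2공장",
]

def tokenize(name):
    """Split on whitespace, then strip ASCII punctuation from both ends."""
    tokens = []
    for raw in name.split():
        token = raw.strip(string.punctuation)
        if token:  # skip tokens that were punctuation only
            tokens.append(token)
    return tokens

word_counts = Counter(token for name in names for token in tokenize(name))
total_tokens = sum(word_counts.values())

for word, count in word_counts.most_common():
    print(word, count, f"{100 * count / total_tokens:.1f}%")
```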

Most occurring characters

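Value | Count | Frequency (%)
[unrecovered] | 264 | 8.6%
) | 230 | 7.5%
( | 229 | 7.5%
(space) | 86 | 2.8%
[unrecovered] | 74 | 2.4%
[unrecovered] | 73 | 2.4%
[unrecovered] | 68 | 2.2%
[unrecovered] | 65 | 2.1%
[unrecovered] | 63 | 2.1%
[unrecovered] | 62 | 2.0%
Other values (278) | 1846 | 60.3%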

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2493 | 81.5%
Close Punctuation | 230 | 7.5%
Open Punctuation | 229 | 7.5%
Space Separator | 86 | 2.8%
Decimal Number | 10 | 0.3%
Uppercase Letter | 7 | 0.2%
Other Symbol | 2 | 0.1%
Lowercase Letter | 2 | 0.1%
Other Punctuation | 1 | < 0.1%

Most frequent character per category

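Other Letter
Value | Count | Frequency (%)
[unrecovered] | 264 | 10.6%
[unrecovered] | 74 | 3.0%
[unrecovered] | 73 | 2.9%
[unrecovered] | 68 | 2.7%
[unrecovered] | 65 | 2.6%
[unrecovered] | 63 | 2.5%
[unrecovered] | 62 | 2.5%
[unrecovered] | 49 | 2.0%
[unrecovered] | 48 | 1.9%
[unrecovered] | 44 | 1.8%
Other values (260) | 1683 | 67.5%

Uppercase Letter
Value | Count | Frequency (%)
C | 1 | 14.3%
V | 1 | 14.3%
O | 1 | 14.3%
P | 1 | 14.3%
A | 1 | 14.3%
S | 1 | 14.3%
K | 1 | 14.3%

Decimal Number
Value | Count | Frequency (%)
2 | 4 | 40.0%
1 | 3 | 30.0%
9 | 2 | 20.0%
3 | 1 | 10.0%

Lowercase Letter
Value | Count | Frequency (%)
t | 1 | 50.0%
k | 1 | 50.0%

Close Punctuation
Value | Count | Frequency (%)
) | 230 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 229 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 86 | 100.0%

Other Symbol
Value | Count | Frequency (%)
[unrecovered] | 2 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
· | 1 | 100.0%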

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2495 | 81.5%
Common | 556 | 18.2%
Latin | 9 | 0.3%

Most frequent character per script

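Hangul
Value | Count | Frequency (%)
[unrecovered] | 264 | 10.6%
[unrecovered] | 74 | 3.0%
[unrecovered] | 73 | 2.9%
[unrecovered] | 68 | 2.7%
[unrecovered] | 65 | 2.6%
[unrecovered] | 63 | 2.5%
[unrecovered] | 62 | 2.5%
[unrecovered] | 49 | 2.0%
[unrecovered] | 48 | 1.9%
[unrecovered] | 44 | 1.8%
Other values (261) | 1685 | 67.5%

Latin
Value | Count | Frequency (%)
C | 1 | 11.1%
V | 1 | 11.1%
O | 1 | 11.1%
P | 1 | 11.1%
A | 1 | 11.1%
t | 1 | 11.1%
k | 1 | 11.1%
S | 1 | 11.1%
K | 1 | 11.1%

Common
Value | Count | Frequency (%)
) | 230 | 41.4%
( | 229 | 41.2%
(space) | 86 | 15.5%
2 | 4 | 0.7%
1 | 3 | 0.5%
9 | 2 | 0.4%
3 | 1 | 0.2%
· | 1 | 0.2%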

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2493 | 81.5%
ASCII | 564 | 18.4%
None | 3 | 0.1%

Most frequent character per block

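Hangul
Value | Count | Frequency (%)
[unrecovered] | 264 | 10.6%
[unrecovered] | 74 | 3.0%
[unrecovered] | 73 | 2.9%
[unrecovered] | 68 | 2.7%
[unrecovered] | 65 | 2.6%
[unrecovered] | 63 | 2.5%
[unrecovered] | 62 | 2.5%
[unrecovered] | 49 | 2.0%
[unrecovered] | 48 | 1.9%
[unrecovered] | 44 | 1.8%
Other values (260) | 1683 | 67.5%

ASCII
Value | Count | Frequency (%)
) | 230 | 40.8%
( | 229 | 40.6%
(space) | 86 | 15.2%
2 | 4 | 0.7%
1 | 3 | 0.5%
9 | 2 | 0.4%
C | 1 | 0.2%
V | 1 | 0.2%
O | 1 | 0.2%
P | 1 | 0.2%
Other values (6) | 6 | 1.1%

None
Value | Count | Frequency (%)
[unrecovered] | 2 | 66.7%
· | 1 | 33.3%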

사업장 주소 (business site address)
Text

MISSING 

Distinct: 285
Distinct (%): 89.1%
Missing: 46
Missing (%): 12.6%
Memory size: 3.0 KiB

Length

Max length: 24
Median length: 22
Mean length: 20.3375
Min length: 14
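
The mean length is consistent with dividing the total character count reported for this variable (6508) by the number of non-missing entries (366 observations minus 46 missing). The sketch below is only an arithmetic check on the figures printed in this report, and the variable names are illustrative.

```python
# Figures copied from this report for the 사업장 주소 variable.
total_characters = 6508
n_observations = 366
n_missing = 46

n_present = n_observations - n_missing        # 320 non-missing addresses
mean_length = total_characters / n_present    # characters per address

print(n_present, mean_length)
```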

Characters and Unicode

Total characters: 6508
Distinct characters: 155
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 256
Unique (%): 80.0%

Sample

1st row: 세종특별자치시 소정면 고등리 748
2nd row: 세종특별자치시 전동면 수회길 28
3rd row: 세종특별자치시 소정면 소정산단2로 9
4th row: 세종특별자치시 전동면 전동로 313
5th row: 세종특별자치시 연동면 청연로 442-48
Value | Count | Frequency (%)
세종특별자치시 | 320 | 25.4%
전의면 | 63 | 5.0%
부강면 | 48 | 3.8%
전동면 | 46 | 3.7%
연동면 | 33 | 2.6%
연서면 | 29 | 2.3%
산단길 | 27 | 2.1%
소정면 | 26 | 2.1%
조치원읍 | 17 | 1.4%
노장공단길 | 14 | 1.1%
Other values (371) | 635 | 50.5%

Most occurring characters

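Value | Count | Frequency (%)
(space) | 969 | 14.9%
[unrecovered] | 337 | 5.2%
[unrecovered] | 332 | 5.1%
[unrecovered] | 329 | 5.1%
[unrecovered] | 325 | 5.0%
[unrecovered] | 320 | 4.9%
[unrecovered] | 320 | 4.9%
[unrecovered] | 320 | 4.9%
[unrecovered] | 280 | 4.3%
1 | 221 | 3.4%
Other values (145) | 2755 | 42.3%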

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4383 | 67.3%
Decimal Number | 1038 | 15.9%
Space Separator | 969 | 14.9%
Dash Punctuation | 118 | 1.8%

Most frequent character per category

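Other Letter
Value | Count | Frequency (%)
[unrecovered] | 337 | 7.7%
[unrecovered] | 332 | 7.6%
[unrecovered] | 329 | 7.5%
[unrecovered] | 325 | 7.4%
[unrecovered] | 320 | 7.3%
[unrecovered] | 320 | 7.3%
[unrecovered] | 320 | 7.3%
[unrecovered] | 280 | 6.4%
[unrecovered] | 174 | 4.0%
[unrecovered] | 134 | 3.1%
Other values (133) | 1512 | 34.5%

Decimal Number
Value | Count | Frequency (%)
1 | 221 | 21.3%
2 | 161 | 15.5%
4 | 121 | 11.7%
5 | 101 | 9.7%
3 | 94 | 9.1%
7 | 89 | 8.6%
0 | 76 | 7.3%
6 | 61 | 5.9%
8 | 57 | 5.5%
9 | 57 | 5.5%

Space Separator
Value | Count | Frequency (%)
(space) | 969 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 118 | 100.0%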

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4383 | 67.3%
Common | 2125 | 32.7%

Most frequent character per script

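Hangul
Value | Count | Frequency (%)
[unrecovered] | 337 | 7.7%
[unrecovered] | 332 | 7.6%
[unrecovered] | 329 | 7.5%
[unrecovered] | 325 | 7.4%
[unrecovered] | 320 | 7.3%
[unrecovered] | 320 | 7.3%
[unrecovered] | 320 | 7.3%
[unrecovered] | 280 | 6.4%
[unrecovered] | 174 | 4.0%
[unrecovered] | 134 | 3.1%
Other values (133) | 1512 | 34.5%

Common
Value | Count | Frequency (%)
(space) | 969 | 45.6%
1 | 221 | 10.4%
2 | 161 | 7.6%
4 | 121 | 5.7%
- | 118 | 5.6%
5 | 101 | 4.8%
3 | 94 | 4.4%
7 | 89 | 4.2%
0 | 76 | 3.6%
6 | 61 | 2.9%
Other values (2) | 114 | 5.4%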

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 4383 | 67.3%
ASCII | 2125 | 32.7%

Most frequent character per block

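ASCII
Value | Count | Frequency (%)
(space) | 969 | 45.6%
1 | 221 | 10.4%
2 | 161 | 7.6%
4 | 121 | 5.7%
- | 118 | 5.6%
5 | 101 | 4.8%
3 | 94 | 4.4%
7 | 89 | 4.2%
0 | 76 | 3.6%
6 | 61 | 2.9%
Other values (2) | 114 | 5.4%

Hangul
Value | Count | Frequency (%)
[unrecovered] | 337 | 7.7%
[unrecovered] | 332 | 7.6%
[unrecovered] | 329 | 7.5%
[unrecovered] | 325 | 7.4%
[unrecovered] | 320 | 7.3%
[unrecovered] | 320 | 7.3%
[unrecovered] | 320 | 7.3%
[unrecovered] | 280 | 6.4%
[unrecovered] | 174 | 4.0%
[unrecovered] | 134 | 3.1%
Other values (133) | 1512 | 34.5%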

구분 (category)
Categorical

Distinct: 3
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB
배출시설계: 268
비배출시설계: 52
<NA>: 46

Length

Max length: 6
Median length: 5
Mean length: 5.0163934
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 배출시설계
2nd row: 배출시설계
3rd row: 배출시설계
4th row: 배출시설계
5th row: 배출시설계

Common Values

Value | Count | Frequency (%)
배출시설계 | 268 | 73.2%
비배출시설계 | 52 | 14.2%
<NA> | 46 | 12.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
배출시설계 | 268 | 73.2%
비배출시설계 | 52 | 14.2%
na | 46 | 12.6%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
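
One way to read nullity correlation, assumed here, is as the Pearson correlation between per-column missing-value indicators (1 where a cell is missing, 0 otherwise). The sketch below computes it in plain Python for two columns. The four rows are a small illustration modelled on this dataset, where 사업장명 and 사업장 주소 are missing together in the trailing rows, and the helper name `pearson` is hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# (사업장명, 사업장 주소) pairs; None stands in for <NA>.
rows = [
    ("(주)포스코퓨처엠", "세종특별자치시 소정면 고등리 748"),
    ("서부자원", "세종특별자치시 연서면 부동리 202"),
    (None, None),
    (None, None),
]

name_missing = [int(name is None) for name, _ in rows]
address_missing = [int(address is None) for _, address in rows]

nullity_corr = pearson(name_missing, address_missing)
print(nullity_corr)
```

A value of 1 means the two columns are always missing in the same rows. A column with no missing values has a constant indicator, so its correlation is undefined and this helper would divide by zero for it.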

Sample

Row | 사업장명 | 사업장 주소 | 구분
0 | (주)포스코퓨처엠 | 세종특별자치시 소정면 고등리 748 | 배출시설계
1 | 주식회사 도원콘크리트 | 세종특별자치시 전동면 수회길 28 | 배출시설계
2 | 신안포장산업(주) | 세종특별자치시 소정면 소정산단2로 9 | 배출시설계
3 | 국제금속주식회사 | 세종특별자치시 전동면 전동로 313 | 배출시설계
4 | 에이치엘비(주)헬스케어 2공장 | 세종특별자치시 연동면 청연로 442-48 | 배출시설계
5 | (주)에스엠엔지니어링 | 세종특별자치시 종합운동장로 11-13 | 배출시설계
6 | 주식회사 명성철거 | 세종특별자치시 연서면 부국길 28 | 배출시설계
7 | (사)세종미래일반산업단지 입주기업체협의회 | 세종특별자치시 전의면 미래산단3로 24 | 배출시설계
8 | (주)원익머트리얼즈 | 세종특별자치시 전의면 산단길 21-125 | 배출시설계
9 | 서부자원 | 세종특별자치시 연서면 부동리 202 | 비배출시설계
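
Row | 사업장명 | 사업장 주소 | 구분
356 | <NA> | <NA> | <NA>
357 | <NA> | <NA> | <NA>
358 | <NA> | <NA> | <NA>
359 | <NA> | <NA> | <NA>
360 | <NA> | <NA> | <NA>
361 | <NA> | <NA> | <NA>
362 | <NA> | <NA> | <NA>
363 | <NA> | <NA> | <NA>
364 | <NA> | <NA> | <NA>
365 | <NA> | <NA> | <NA>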

Duplicate rows

Most frequently occurring

Row | 사업장명 | 사업장 주소 | 구분 | # duplicates
2 | <NA> | <NA> | <NA> | 46
0 | 에스원텍(주) | 세종특별자치시 전동면 배일길 45 | 배출시설계 | 2
1 | 한국수자원공사 청주권관리단 | 세종특별자치시 연기면 수문강길 281-27 | 배출시설계 | 2
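
A duplicate table of this shape can be sketched by counting identical rows and keeping those that occur more than once. The example below is a minimal illustration in plain Python, not the profiler's implementation. It rebuilds the three duplicated rows from the table with their reported counts and adds one non-duplicated row from the sample for contrast.

```python
from collections import Counter

# Rows as (사업장명, 사업장 주소, 구분) tuples; None stands in for <NA>.
duplicated_a = ("에스원텍(주)", "세종특별자치시 전동면 배일길 45", "배출시설계")
duplicated_b = ("한국수자원공사 청주권관리단", "세종특별자치시 연기면 수문강길 281-27", "배출시설계")
empty_row = (None, None, None)
single_row = ("서부자원", "세종특별자치시 연서면 부동리 202", "비배출시설계")

rows = [duplicated_a] * 2 + [duplicated_b] * 2 + [empty_row] * 46 + [single_row]

# Count identical rows and keep only those seen more than once.
row_counts = Counter(rows)
duplicate_groups = {row: n for row, n in row_counts.items() if n > 1}

# List the duplicated rows, most frequent first.
for row, n in sorted(duplicate_groups.items(), key=lambda item: -item[1]):
    print(n, row)
```

The three groups found here match the "Duplicate rows: 3" figure in the overview, which suggests that figure counts distinct duplicated rows rather than every repeated record.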