Overview

Dataset statistics

Number of variables: 7
Number of observations: 527
Missing cells: 510
Missing cells (%): 13.8%
Duplicate rows: 8
Duplicate rows (%): 1.5%
Total size in memory: 28.9 KiB
Average record size in memory: 56.3 B
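
The two percentage rows can be reproduced from the counts. The sketch below assumes that missing cells are divided by the total number of cells (observations times variables) and duplicate rows by the number of observations; the variable names are illustrative and do not come from the report.

```python
# Counts copied from the dataset statistics above.
n_variables = 7
n_observations = 527
missing_cells = 510
duplicate_rows = 8

# Assumption: the denominator for missing cells is rows x columns.
total_cells = n_observations * n_variables
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"duplicate rows: {duplicate_pct:.1f}%")
```

Under these assumptions the results round to the 13.8% and 1.5% shown above.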

Variable types

Categorical: 3
Text: 2
DateTime: 2

Dataset

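Description: Provides enforcement status information (facility type, facility name, inspection date, etc.) for environmental pollutant emission facilities located in Guri-si, Gyeonggi-do, and for the businesses that operate them.
URL: https://www.data.go.kr/data/15051004/fileData.do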

Alerts

시군명 has constant value "구리시" (Constant)
관리기관명 has constant value "경기도 구리시청" (Constant)
데이터기준일자 has constant value "2023-06-01" (Constant)
Dataset has 8 (1.5%) duplicate rows (Duplicates)
시설구분 is highly imbalanced (54.7%) (Imbalance)
특이사항 has 510 (96.8%) missing values (Missing)
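
The 54.7% imbalance figure for 시설구분 is consistent with one minus the normalized Shannon entropy of the four class counts listed in the 시설구분 section (386, 136, 3, 2). The sketch below is a reading of that number, not a statement about the tool's internals; `imbalance_score` is an illustrative name, not a library function.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits
    of the class distribution and k is the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # A single class carries no entropy to normalize against.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Class counts of 시설구분 as listed in this report.
counts = [386, 136, 3, 2]
score = imbalance_score(counts)
print(f"{score:.1%}")
```

A perfectly balanced variable scores 0 on this measure, and the score approaches 1 as a single class dominates.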

Reproduction

Analysis started: 2023-12-12 12:05:13.447704
Analysis finished: 2023-12-12 12:05:13.986871
Duration: 0.54 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시군명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.2 KiB
구리시: 527

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 구리시
2nd row: 구리시
3rd row: 구리시
4th row: 구리시
5th row: 구리시

Common Values

Value | Count | Frequency (%)
구리시 | 527 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
구리시 | 527 | 100.0%

시설구분
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 4.2 KiB
폐수배출업소관리: 386
대기배출업소관리: 136
대기배출업소관리(무허가): 3
소음진동관리: 2

Length

Max length: 13
Median length: 8
Mean length: 8.0208729
Min length: 6
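
These length statistics can be checked against the value counts of 시설구분. The sketch assumes that length means the number of Unicode code points, which is what Python's `len` returns and which equals the visible character count for these precomposed Hangul strings.

```python
import statistics

# Value counts copied from the 시설구분 summary in this report.
value_counts = {
    "폐수배출업소관리": 386,
    "대기배출업소관리": 136,
    "대기배출업소관리(무허가)": 3,
    "소음진동관리": 2,
}

# One length per observation, so frequent values weigh more in the mean.
lengths = [len(value) for value, count in value_counts.items() for _ in range(count)]

max_length = max(lengths)
median_length = statistics.median(lengths)
mean_length = statistics.mean(lengths)
min_length = min(lengths)

print(max_length, median_length, round(mean_length, 7), min_length)
```

The weighted sum of lengths is 4227 over 527 observations, which gives the mean reported above.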

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 폐수배출업소관리
2nd row: 폐수배출업소관리
3rd row: 폐수배출업소관리
4th row: 폐수배출업소관리
5th row: 폐수배출업소관리

Common Values

Value | Count | Frequency (%)
폐수배출업소관리 | 386 | 73.2%
대기배출업소관리 | 136 | 25.8%
대기배출업소관리(무허가) | 3 | 0.6%
소음진동관리 | 2 | 0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
폐수배출업소관리 | 386 | 73.2%
대기배출업소관리 | 136 | 25.8%
대기배출업소관리(무허가 | 3 | 0.6%
소음진동관리 | 2 | 0.4%

시설명
Text

Distinct: 193
Distinct (%): 36.6%
Missing: 0
Missing (%): 0.0%
Memory size: 4.2 KiB

Length

Max length: 25
Median length: 20
Mean length: 8.7324478
Min length: 2

Characters and Unicode

Total characters: 4602
Distinct characters: 269
Distinct categories: 10
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
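
As an illustration of these character properties, the sketch below classifies the characters of the first sample value of 시설명 by Unicode general category. It uses Python's standard `unicodedata` module, which exposes the general category but not the script or block, so only the category breakdown is shown. The mapping from two-letter codes to the long names used in this report covers only the codes that occur in the example.

```python
import unicodedata
from collections import Counter

# Long names, as used in this report, for the codes in this example.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Zs": "Space Separator",
}

# First sample value of 시설명 in this report.
name = "(주)능마을 보금자리주유소"

# Count characters per general category code.
categories = Counter(unicodedata.category(ch) for ch in name)

for code, count in categories.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
```

Hangul syllables fall under Other Letter, which is why that category dominates the tables in this section.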

Unique

Unique: 64
Unique (%): 12.1%

Sample

1st row: (주)능마을 보금자리주유소
2nd row: (주)성림에너지
3rd row: 구리LPG충전소
4th row: 유워시셀프세차
5th row: 코스모자동차공업사

Value | Count | Frequency (%)
구리현대자동차공업사 | 16 | 2.5%
하이카구리토평점 | 11 | 1.7%
구리농수산물도매시장관리공사 | 9 | 1.4%
제일공업사 | 9 | 1.4%
성신양회(주)구리공장 | 8 | 1.2%
주)미래교통 | 8 | 1.2%
우원개발(주 | 7 | 1.1%
구리농수산물공사 | 7 | 1.1%
애니카랜드 | 7 | 1.1%
신성사 | 7 | 1.1%
Other values (214) | 563 | 86.3%

Most occurring characters

Value | Count | Frequency (%)
 | 232 | 5.0%
 | 167 | 3.6%
( | 159 | 3.5%
) | 159 | 3.5%
 | 148 | 3.2%
 | 143 | 3.1%
 | 133 | 2.9%
 | 131 | 2.8%
 | 114 | 2.5%
 | 110 | 2.4%
Other values (259) | 3106 | 67.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4005 | 87.0%
Open Punctuation | 162 | 3.5%
Close Punctuation | 162 | 3.5%
Space Separator | 131 | 2.8%
Uppercase Letter | 64 | 1.4%
Decimal Number | 32 | 0.7%
Other Symbol | 17 | 0.4%
Dash Punctuation | 11 | 0.2%
Other Punctuation | 9 | 0.2%
Lowercase Letter | 9 | 0.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 232 | 5.8%
 | 167 | 4.2%
 | 148 | 3.7%
 | 143 | 3.6%
 | 133 | 3.3%
 | 114 | 2.8%
 | 110 | 2.7%
 | 103 | 2.6%
 | 81 | 2.0%
 | 81 | 2.0%
Other values (228) | 2693 | 67.2%

Uppercase Letter
Value | Count | Frequency (%)
K | 12 | 18.8%
S | 12 | 18.8%
L | 9 | 14.1%
G | 8 | 12.5%
P | 8 | 12.5%
B | 4 | 6.2%
I | 4 | 6.2%
C | 4 | 6.2%
T | 2 | 3.1%
D | 1 | 1.6%

Decimal Number
Value | Count | Frequency (%)
1 | 11 | 34.4%
0 | 5 | 15.6%
2 | 5 | 15.6%
7 | 4 | 12.5%
8 | 3 | 9.4%
5 | 3 | 9.4%
9 | 1 | 3.1%

Lowercase Letter
Value | Count | Frequency (%)
s | 3 | 33.3%
k | 3 | 33.3%
m | 1 | 11.1%
p | 1 | 11.1%
a | 1 | 11.1%

Open Punctuation
Value | Count | Frequency (%)
( | 159 | 98.1%
[ | 3 | 1.9%

Close Punctuation
Value | Count | Frequency (%)
) | 159 | 98.1%
] | 3 | 1.9%

Other Punctuation
Value | Count | Frequency (%)
& | 8 | 88.9%
; | 1 | 11.1%

Space Separator
Value | Count | Frequency (%)
 | 131 | 100.0%

Other Symbol
Value | Count | Frequency (%)
 | 17 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 11 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4016 | 87.3%
Common | 507 | 11.0%
Latin | 73 | 1.6%
Han | 6 | 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 232 | 5.8%
 | 167 | 4.2%
 | 148 | 3.7%
 | 143 | 3.6%
 | 133 | 3.3%
 | 114 | 2.8%
 | 110 | 2.7%
 | 103 | 2.6%
 | 81 | 2.0%
 | 81 | 2.0%
Other values (228) | 2704 | 67.3%

Common
Value | Count | Frequency (%)
( | 159 | 31.4%
) | 159 | 31.4%
 | 131 | 25.8%
- | 11 | 2.2%
1 | 11 | 2.2%
& | 8 | 1.6%
0 | 5 | 1.0%
2 | 5 | 1.0%
7 | 4 | 0.8%
8 | 3 | 0.6%
Other values (5) | 11 | 2.2%

Latin
Value | Count | Frequency (%)
K | 12 | 16.4%
S | 12 | 16.4%
L | 9 | 12.3%
G | 8 | 11.0%
P | 8 | 11.0%
B | 4 | 5.5%
I | 4 | 5.5%
C | 4 | 5.5%
s | 3 | 4.1%
k | 3 | 4.1%
Other values (5) | 6 | 8.2%

Han
Value | Count | Frequency (%)
 | 6 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3999 | 86.9%
ASCII | 580 | 12.6%
None | 17 | 0.4%
CJK | 6 | 0.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 232 | 5.8%
 | 167 | 4.2%
 | 148 | 3.7%
 | 143 | 3.6%
 | 133 | 3.3%
 | 114 | 2.9%
 | 110 | 2.8%
 | 103 | 2.6%
 | 81 | 2.0%
 | 81 | 2.0%
Other values (227) | 2687 | 67.2%

ASCII
Value | Count | Frequency (%)
( | 159 | 27.4%
) | 159 | 27.4%
 | 131 | 22.6%
K | 12 | 2.1%
S | 12 | 2.1%
- | 11 | 1.9%
1 | 11 | 1.9%
L | 9 | 1.6%
& | 8 | 1.4%
G | 8 | 1.4%
Other values (20) | 60 | 10.3%

None
Value | Count | Frequency (%)
 | 17 | 100.0%

CJK
Value | Count | Frequency (%)
 | 6 | 100.0%

점검일자
Date

Distinct: 190
Distinct (%): 36.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.2 KiB
Minimum: 2012-04-05 00:00:00
Maximum: 2022-05-24 00:00:00
Histogram with fixed size bins (bins=50)
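
For a sense of scale, the sketch below computes the span between the minimum and maximum inspection dates and the resulting width of each histogram bin. It assumes the 50 bins are of equal width and cover exactly the range from minimum to maximum, which the caption suggests but does not spell out.

```python
from datetime import datetime

# Minimum and maximum of 점검일자 as reported above.
minimum = datetime(2012, 4, 5)
maximum = datetime(2022, 5, 24)
bins = 50

# Total covered period and the width of one equal-size bin.
span = maximum - minimum
bin_width_days = span.days / bins

print(f"span: {span.days} days")
print(f"bin width: {bin_width_days:.2f} days")
```

Under that assumption each bar of the histogram aggregates roughly two and a half months of inspections.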

특이사항
Text

MISSING 

Distinct: 12
Distinct (%): 70.6%
Missing: 510
Missing (%): 96.8%
Memory size: 4.2 KiB

Length

Max length: 34
Median length: 23
Mean length: 17.058824
Min length: 5

Characters and Unicode

Total characters: 290
Distinct characters: 71
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8
Unique (%): 47.1%
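
The distinct and unique percentages for 특이사항 match the counts only when divided by the 17 non-missing values (527 observations minus 510 missing), not by all 527 rows. The sketch below checks that arithmetic and shows the difference between the two notions on the five sample rows listed in this section: distinct counts different values, unique counts values that occur exactly once. The choice of denominator is inferred from the numbers, not stated by the report.

```python
from collections import Counter

# Counts copied from the 특이사항 summary above.
n_observations = 527
n_missing = 510
n_distinct = 12
n_unique = 8

# Assumption: percentages are relative to non-missing values.
n_present = n_observations - n_missing
distinct_pct = 100 * n_distinct / n_present
unique_pct = 100 * n_unique / n_present
print(f"{distinct_pct:.1f}% distinct, {unique_pct:.1f}% unique")

# The five sample rows of 특이사항 shown in this section.
sample = [
    "유량계고장 운영일지 거짓작성 방지시설 미운영",
    "외부급수로 인한 유입 방류량 차이 있으며 방류량계 교체 권고.",
    "개선명령(배출허용기준초과)",
    "개선명령(배출허용기준초과)",
    "무단 방류",
]
counts = Counter(sample)
sample_distinct = len(counts)                                 # different values
sample_unique = sum(1 for c in counts.values() if c == 1)     # values seen once
print(sample_distinct, sample_unique)
```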

Sample

1st row: 유량계고장 운영일지 거짓작성 방지시설 미운영
2nd row: 외부급수로 인한 유입 방류량 차이 있으며 방류량계 교체 권고.
3rd row: 개선명령(배출허용기준초과)
4th row: 개선명령(배출허용기준초과)
5th row: 무단 방류

Value | Count | Frequency (%)
미신고 | 5 | 9.8%
변경신고 | 4 | 7.8%
미이행(행정처분 | 4 | 7.8%
운영 | 3 | 5.9%
대기배출시설 | 3 | 5.9%
운영일지 | 3 | 5.9%
폐수배출시설 | 2 | 3.9%
 | 2 | 3.9%
도장시설(행정처분 | 2 | 3.9%
개선명령(배출허용기준초과 | 2 | 3.9%
Other values (21) | 21 | 41.2%

Most occurring characters

Value | Count | Frequency (%)
 | 34 | 11.7%
 | 14 | 4.8%
 | 12 | 4.1%
 | 12 | 4.1%
( | 12 | 4.1%
) | 12 | 4.1%
 | 10 | 3.4%
 | 9 | 3.1%
 | 9 | 3.1%
 | 9 | 3.1%
Other values (61) | 157 | 54.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 229 | 79.0%
Space Separator | 34 | 11.7%
Open Punctuation | 12 | 4.1%
Close Punctuation | 12 | 4.1%
Other Punctuation | 3 | 1.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 14 | 6.1%
 | 12 | 5.2%
 | 12 | 5.2%
 | 10 | 4.4%
 | 9 | 3.9%
 | 9 | 3.9%
 | 9 | 3.9%
 | 9 | 3.9%
 | 8 | 3.5%
 | 7 | 3.1%
Other values (56) | 130 | 56.8%

Other Punctuation
Value | Count | Frequency (%)
: | 2 | 66.7%
. | 1 | 33.3%

Space Separator
Value | Count | Frequency (%)
 | 34 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 12 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 12 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 229 | 79.0%
Common | 61 | 21.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 14 | 6.1%
 | 12 | 5.2%
 | 12 | 5.2%
 | 10 | 4.4%
 | 9 | 3.9%
 | 9 | 3.9%
 | 9 | 3.9%
 | 9 | 3.9%
 | 8 | 3.5%
 | 7 | 3.1%
Other values (56) | 130 | 56.8%

Common
Value | Count | Frequency (%)
 | 34 | 55.7%
( | 12 | 19.7%
) | 12 | 19.7%
: | 2 | 3.3%
. | 1 | 1.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 229 | 79.0%
ASCII | 61 | 21.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
 | 34 | 55.7%
( | 12 | 19.7%
) | 12 | 19.7%
: | 2 | 3.3%
. | 1 | 1.6%

Hangul
Value | Count | Frequency (%)
 | 14 | 6.1%
 | 12 | 5.2%
 | 12 | 5.2%
 | 10 | 4.4%
 | 9 | 3.9%
 | 9 | 3.9%
 | 9 | 3.9%
 | 9 | 3.9%
 | 8 | 3.5%
 | 7 | 3.1%
Other values (56) | 130 | 56.8%

관리기관명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.2 KiB
경기도 구리시청: 527

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 경기도 구리시청
2nd row: 경기도 구리시청
3rd row: 경기도 구리시청
4th row: 경기도 구리시청
5th row: 경기도 구리시청

Common Values

Value | Count | Frequency (%)
경기도 구리시청 | 527 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
경기도 | 527 | 50.0%
구리시청 | 527 | 50.0%

데이터기준일자
Date

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.2 KiB
Minimum: 2023-06-01 00:00:00
Maximum: 2023-06-01 00:00:00
Histogram with fixed size bins (bins=1)

Correlations

 | 시설구분 | 특이사항
시설구분 | 1.000 | 0.911
특이사항 | 0.911 | 1.000
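
The report does not say here which coefficient produced the 0.911. For pairs of categorical variables, an association measure such as Cramér's V is a common choice, so the sketch below shows a plain (uncorrected) Cramér's V on small made-up inputs. Both the choice of measure and the toy data are assumptions for illustration; the code does not reproduce the value in the matrix.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Plain Cramér's V between two equally long sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed cell counts
    row = Counter(xs)              # marginal counts of xs
    col = Counter(ys)              # marginal counts of ys

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, row_total in row.items():
        for y, col_total in col.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row), len(col)) - 1
    if k == 0:
        # A constant variable has no measurable association.
        return 0.0
    return math.sqrt(chi2 / (n * k))


# Hypothetical toy data: perfectly associated labels, then independent ones.
perfect = cramers_v(["a"] * 10 + ["b"] * 10, ["x"] * 10 + ["y"] * 10)
independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(perfect, independent)
```

With 510 of 527 values missing in 특이사항, any such coefficient rests on at most 17 paired observations, so a high value should be read with caution.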

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

 | 시군명 | 시설구분 | 시설명 | 점검일자 | 특이사항 | 관리기관명 | 데이터기준일자
0 | 구리시 | 폐수배출업소관리 | (주)능마을 보금자리주유소 | 2022-05-24 | <NA> | 경기도 구리시청 | 2023-06-01
1 | 구리시 | 폐수배출업소관리 | (주)성림에너지 | 2022-05-24 | <NA> | 경기도 구리시청 | 2023-06-01
2 | 구리시 | 폐수배출업소관리 | 구리LPG충전소 | 2022-05-20 | <NA> | 경기도 구리시청 | 2023-06-01
3 | 구리시 | 폐수배출업소관리 | 유워시셀프세차 | 2022-04-29 | <NA> | 경기도 구리시청 | 2023-06-01
4 | 구리시 | 폐수배출업소관리 | 코스모자동차공업사 | 2022-04-27 | <NA> | 경기도 구리시청 | 2023-06-01
5 | 구리시 | 폐수배출업소관리 | 하이카구리토평점 | 2022-04-20 | <NA> | 경기도 구리시청 | 2023-06-01
6 | 구리시 | 폐수배출업소관리 | 한국타이어경기대리점 | 2022-04-19 | <NA> | 경기도 구리시청 | 2023-06-01
7 | 구리시 | 폐수배출업소관리 | 농수산실업 | 2022-04-18 | 유량계고장 운영일지 거짓작성 방지시설 미운영 | 경기도 구리시청 | 2023-06-01
8 | 구리시 | 폐수배출업소관리 | (주)케이피앤지 | 2022-04-13 | <NA> | 경기도 구리시청 | 2023-06-01
9 | 구리시 | 폐수배출업소관리 | 금호자동차공업사 | 2022-04-13 | <NA> | 경기도 구리시청 | 2023-06-01

Last rows

 | 시군명 | 시설구분 | 시설명 | 점검일자 | 특이사항 | 관리기관명 | 데이터기준일자
517 | 구리시 | 폐수배출업소관리 | 사노주유소 | 2012-06-25 | <NA> | 경기도 구리시청 | 2023-06-01
518 | 구리시 | 폐수배출업소관리 | 담터주유소 | 2012-06-25 | <NA> | 경기도 구리시청 | 2023-06-01
519 | 구리시 | 폐수배출업소관리 | 구리LPG충전소 | 2012-06-25 | <NA> | 경기도 구리시청 | 2023-06-01
520 | 구리시 | 폐수배출업소관리 | 보금자리주유소 | 2012-06-25 | <NA> | 경기도 구리시청 | 2023-06-01
521 | 구리시 | 대기배출업소관리 | (주)현대아이엔에스 | 2012-05-31 | <NA> | 경기도 구리시청 | 2023-06-01
522 | 구리시 | 대기배출업소관리 | 해창개발(주)-용마터널 종점 | 2012-04-16 | 가동개시신고 미이행(행정처분) | 경기도 구리시청 | 2023-06-01
523 | 구리시 | 폐수배출업소관리 | 구리농수산물공사 | 2012-04-05 | <NA> | 경기도 구리시청 | 2023-06-01
524 | 구리시 | 폐수배출업소관리 | 토평정수장 | 2012-04-05 | <NA> | 경기도 구리시청 | 2023-06-01
525 | 구리시 | 폐수배출업소관리 | 한국석유공사 구리지사 | 2012-04-05 | <NA> | 경기도 구리시청 | 2023-06-01
526 | 구리시 | 폐수배출업소관리 | 해창개발(주)-용마터널 종점 | 2012-04-05 | <NA> | 경기도 구리시청 | 2023-06-01

Duplicate rows

Most frequently occurring

 | 시군명 | 시설구분 | 시설명 | 점검일자 | 특이사항 | 관리기관명 | 데이터기준일자 | # duplicates
0 | 구리시 | 대기배출업소관리 | (주)교문자동차공업사 | 2022-01-20 | <NA> | 경기도 구리시청 | 2023-06-01 | 2
1 | 구리시 | 대기배출업소관리 | (주)금강고속 | 2022-01-20 | <NA> | 경기도 구리시청 | 2023-06-01 | 2
2 | 구리시 | 대기배출업소관리 | 우진프라스틱 제1공장 | 2014-09-16 | <NA> | 경기도 구리시청 | 2023-06-01 | 2
3 | 구리시 | 대기배출업소관리 | 협신자동차공업사 | 2013-07-23 | <NA> | 경기도 구리시청 | 2023-06-01 | 2
4 | 구리시 | 소음진동관리 | 우진프라스틱 제1공장 | 2014-09-16 | <NA> | 경기도 구리시청 | 2023-06-01 | 2
5 | 구리시 | 폐수배출업소관리 | 구리현대자동차공업사 | 2013-03-05 | <NA> | 경기도 구리시청 | 2023-06-01 | 2
6 | 구리시 | 폐수배출업소관리 | 대웅세차장 | 2013-03-05 | <NA> | 경기도 구리시청 | 2023-06-01 | 2
7 | 구리시 | 폐수배출업소관리 | 동일자동차 공업사 | 2013-03-05 | <NA> | 경기도 구리시청 | 2023-06-01 | 2
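
A duplicate here means a row whose values match another row in every column. The sketch below counts such rows with the standard library on a small hand-built subset; the rows are abbreviated to three columns and arranged for illustration, so they are not the full dataset. It also shows why the same facility inspected on the same date under two different 시설구분 values does not count as a duplicate of itself.

```python
from collections import Counter

# Hand-built illustrative subset: (시설구분, 시설명, 점검일자).
rows = [
    ("대기배출업소관리", "(주)금강고속", "2022-01-20"),
    ("대기배출업소관리", "(주)금강고속", "2022-01-20"),
    ("대기배출업소관리", "우진프라스틱 제1공장", "2014-09-16"),
    ("소음진동관리", "우진프라스틱 제1공장", "2014-09-16"),
    ("폐수배출업소관리", "대웅세차장", "2013-03-05"),
    ("폐수배출업소관리", "대웅세차장", "2013-03-05"),
]

# Rows are tuples, so identical rows collapse onto the same Counter key.
occurrences = Counter(rows)
duplicates = {row: n for row, n in occurrences.items() if n > 1}

for row, n in duplicates.items():
    print(row, n)
```

In this subset two rows are reported, each occurring twice, while the two 우진프라스틱 제1공장 rows differ in 시설구분 and are kept apart.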