Overview

Dataset statistics

Number of variables: 9
Number of observations: 111
Missing cells: 20
Missing cells (%): 2.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.9 KiB
Average record size in memory: 73.2 B
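
As a quick sanity check, the missing-cell percentage can be reproduced from the other figures in this report. The sketch below assumes the percentage is the number of missing cells divided by variables times observations, and takes the per-column missing counts (13 and 7) from the Alerts section. The variable names are illustrative.

```python
# Figures copied from the report; the variable names are illustrative.
n_variables = 9
n_observations = 111
missing_per_column = {
    "당해연도정기검사여부": 13,
    "당해연도누출검사여부": 7,
}

total_cells = n_variables * n_observations        # 9 * 111 cells
missing_cells = sum(missing_per_column.values())  # 13 + 7 cells
missing_pct = 100 * missing_cells / total_cells

print(f"{missing_cells} of {total_cells} cells missing ({missing_pct:.1f}%)")
```

Under that assumption the two columns with missing values account for all 20 missing cells.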

Variable types

Text: 2
Categorical: 3
DateTime: 2
Boolean: 2

Dataset

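Description: Provides information on the status of facilities subject to specific soil contamination management in Yeoju-si, Gyeonggi-do. Columns: 상호 (business name), 소재지도로명주소 (road-name address), 용도 (use), 설치일자 (installation date), 정기검사대상구분 (periodic inspection category), 당해연도정기검사여부 (periodic inspection done in the current year), 누출검사대상여부 (subject to leak inspection), 당해연도누출검사여부 (leak inspection done in the current year), 데이터기준일자 (data reference date).
Author: Yeoju-si, Gyeonggi-do (경기도 여주시)
URL: https://www.data.go.kr/data/15049195/fileData.do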

Alerts

데이터기준일자 has constant value "" (Constant)
정기검사대상구분 is highly overall correlated with 누출검사대상여부 (High correlation)
누출검사대상여부 is highly overall correlated with 정기검사대상구분 (High correlation)
누출검사대상여부 is highly imbalanced (66.1%) (Imbalance)
당해연도정기검사여부 has 13 (11.7%) missing values (Missing)
당해연도누출검사여부 has 7 (6.3%) missing values (Missing)
상호 has unique values (Unique)
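
The Imbalance alert reports a score of 66.1% for 누출검사대상여부. One common way to define such a score, and the one assumed in this sketch, is one minus the Shannon entropy of the class distribution, normalized by its maximum (log2 of the number of classes). The function name `imbalance_score` is illustrative, and the counts 104 and 7 come from that variable's Common Values table.

```python
import math

def imbalance_score(counts):
    """Return 1 - normalized Shannon entropy: 0 = balanced, 1 = a single class."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # convention chosen for this sketch
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# Class counts for 누출검사대상여부: 대상 = 104, 면제 = 7.
score = imbalance_score([104, 7])
print(f"{score:.3f}")
```

Under this definition the score comes out at roughly 0.66, in line with the alert.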

Reproduction

Analysis started: 2023-12-11 23:42:35.320156
Analysis finished: 2023-12-11 23:42:36.146699
Duration: 0.83 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

상호
Text

UNIQUE 

Distinct: 111
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1020.0 B

Length

Max length: 25
Median length: 21
Mean length: 7.6486486
Min length: 3

Characters and Unicode

Total characters: 849
Distinct characters: 200
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
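
As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The two names come from this variable's Sample rows. The helper `category_counts` is hypothetical and is not how the report itself was produced. The two-letter codes correspond to the category names used in this report: `Lo` is Other Letter, `So` is Other Symbol, `Zs` is Space Separator, and `Lu` is Uppercase Letter.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories (e.g. 'Lo', 'Zs') over all characters."""
    counts = Counter()
    for value in values:
        counts.update(unicodedata.category(ch) for ch in value)
    return counts

# Two names taken from the Sample rows of this variable.
names = ["현대오일뱅크㈜ 직영 탑골주유소", "SK네트웍스㈜ 남한강주유소"]
counts = category_counts(names)
for code, n in counts.most_common():
    print(code, n)
```

Hangul syllables fall under `Lo`, which is why Other Letter dominates the category table for this variable.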

Unique

Unique: 111
Unique (%): 100.0%

Sample

1st row: 현대오일뱅크㈜ 직영 탑골주유소
2nd row: 석봉주유소
3rd row: 흥천주유소
4th row: 대림주유소
5th row: SK네트웍스㈜ 남한강주유소

Value | Count | Frequency (%)
현대오일뱅크㈜ | 2 | 1.5%
여주공장 | 2 | 1.5%
주식회사 | 2 | 1.5%
직영 | 2 | 1.5%
여주지점 | 1 | 0.8%
금강웰빙주유소 | 1 | 0.8%
더케이소피아그린 | 1 | 0.8%
남강주유소 | 1 | 0.8%
산북바다주유소 | 1 | 0.8%
능서주유소·능서충전소 | 1 | 0.8%
Other values (119) | 119 | 89.5%

Most occurring characters

Value | Count | Frequency (%)
 | 99 | 11.7%
 | 81 | 9.5%
 | 81 | 9.5%
 | 30 | 3.5%
 | 22 | 2.6%
 | 18 | 2.1%
 | 18 | 2.1%
 | 13 | 1.5%
 | 10 | 1.2%
 | 9 | 1.1%
Other values (190) | 468 | 55.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 757 | 89.2%
Other Symbol | 30 | 3.5%
Space Separator | 22 | 2.6%
Uppercase Letter | 16 | 1.9%
Close Punctuation | 7 | 0.8%
Open Punctuation | 7 | 0.8%
Other Punctuation | 5 | 0.6%
Decimal Number | 5 | 0.6%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 99 | 13.1%
 | 81 | 10.7%
 | 81 | 10.7%
 | 18 | 2.4%
 | 18 | 2.4%
 | 13 | 1.7%
 | 10 | 1.3%
 | 9 | 1.2%
 | 9 | 1.2%
 | 9 | 1.2%
Other values (172) | 410 | 54.2%

Uppercase Letter
Value | Count | Frequency (%)
C | 7 | 43.8%
S | 3 | 18.8%
K | 3 | 18.8%
I | 2 | 12.5%
E | 1 | 6.2%

Decimal Number
Value | Count | Frequency (%)
5 | 1 | 20.0%
6 | 1 | 20.0%
8 | 1 | 20.0%
0 | 1 | 20.0%
1 | 1 | 20.0%

Other Punctuation
Value | Count | Frequency (%)
, | 2 | 40.0%
& | 1 | 20.0%
· | 1 | 20.0%
. | 1 | 20.0%

Other Symbol
Value | Count | Frequency (%)
 | 30 | 100.0%

Space Separator
Value | Count | Frequency (%)
 | 22 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 7 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 7 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 787 | 92.7%
Common | 46 | 5.4%
Latin | 16 | 1.9%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 99 | 12.6%
 | 81 | 10.3%
 | 81 | 10.3%
 | 30 | 3.8%
 | 18 | 2.3%
 | 18 | 2.3%
 | 13 | 1.7%
 | 10 | 1.3%
 | 9 | 1.1%
 | 9 | 1.1%
Other values (173) | 419 | 53.2%

Common
Value | Count | Frequency (%)
 | 22 | 47.8%
) | 7 | 15.2%
( | 7 | 15.2%
, | 2 | 4.3%
& | 1 | 2.2%
5 | 1 | 2.2%
6 | 1 | 2.2%
· | 1 | 2.2%
8 | 1 | 2.2%
0 | 1 | 2.2%
Other values (2) | 2 | 4.3%

Latin
Value | Count | Frequency (%)
C | 7 | 43.8%
S | 3 | 18.8%
K | 3 | 18.8%
I | 2 | 12.5%
E | 1 | 6.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 757 | 89.2%
ASCII | 61 | 7.2%
None | 31 | 3.7%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 99 | 13.1%
 | 81 | 10.7%
 | 81 | 10.7%
 | 18 | 2.4%
 | 18 | 2.4%
 | 13 | 1.7%
 | 10 | 1.3%
 | 9 | 1.2%
 | 9 | 1.2%
 | 9 | 1.2%
Other values (172) | 410 | 54.2%

None
Value | Count | Frequency (%)
 | 30 | 96.8%
· | 1 | 3.2%

ASCII
Value | Count | Frequency (%)
 | 22 | 36.1%
C | 7 | 11.5%
) | 7 | 11.5%
( | 7 | 11.5%
S | 3 | 4.9%
K | 3 | 4.9%
, | 2 | 3.3%
I | 2 | 3.3%
E | 1 | 1.6%
& | 1 | 1.6%
Other values (6) | 6 | 9.8%

소재지도로명주소
Text

Distinct: 110
Distinct (%): 99.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1020.0 B

Length

Max length: 26
Median length: 23
Mean length: 18.900901
Min length: 14

Characters and Unicode

Total characters: 2098
Distinct characters: 86
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 109
Unique (%): 98.2%
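
The report distinguishes Distinct (110) from Unique (109). Assuming, as the figures suggest, that Distinct counts different values and Unique counts values that occur exactly once, a column of 111 rows in which one address appears twice produces exactly these numbers. The sketch below uses hypothetical placeholder values, not the real addresses, and `distinct_and_unique` is an illustrative helper.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique): different values, and values seen exactly once."""
    freq = Counter(values)
    distinct = len(freq)
    unique = sum(1 for n in freq.values() if n == 1)
    return distinct, unique

# Hypothetical column shaped like this one: 109 one-off values plus one value used twice.
values = [f"address-{i}" for i in range(109)] + ["shared-address"] * 2
distinct, unique = distinct_and_unique(values)
print(len(values), distinct, unique)
```

The repeated value still counts toward Distinct but not toward Unique, which explains the gap of one between the two figures.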

Sample

1st row: 경기도 여주시 여주남로 158
2nd row: 경기도 여주시 북내면 여양2로 567
3rd row: 경기도 여주시 흥천면 효자로 87
4th row: 경기도 여주시 금사면 금품1로 436
5th row: 경기도 여주시 세종로 432

Value | Count | Frequency (%)
경기도 | 111 | 21.2%
여주시 | 111 | 21.2%
가남읍 | 24 | 4.6%
대신면 | 13 | 2.5%
경충대로 | 11 | 2.1%
여주남로 | 11 | 2.1%
세종대왕면 | 9 | 1.7%
강천면 | 9 | 1.7%
여양로 | 9 | 1.7%
흥천면 | 8 | 1.5%
Other values (147) | 207 | 39.6%

Most occurring characters

Value | Count | Frequency (%)
 | 412 | 19.6%
 | 151 | 7.2%
 | 124 | 5.9%
 | 122 | 5.8%
 | 114 | 5.4%
 | 111 | 5.3%
 | 111 | 5.3%
 | 104 | 5.0%
1 | 74 | 3.5%
 | 56 | 2.7%
Other values (76) | 719 | 34.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1308 | 62.3%
Space Separator | 412 | 19.6%
Decimal Number | 366 | 17.4%
Dash Punctuation | 12 | 0.6%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 151 | 11.5%
 | 124 | 9.5%
 | 122 | 9.3%
 | 114 | 8.7%
 | 111 | 8.5%
 | 111 | 8.5%
 | 104 | 8.0%
 | 56 | 4.3%
 | 40 | 3.1%
 | 37 | 2.8%
Other values (64) | 338 | 25.8%

Decimal Number
Value | Count | Frequency (%)
1 | 74 | 20.2%
2 | 48 | 13.1%
6 | 38 | 10.4%
7 | 36 | 9.8%
3 | 34 | 9.3%
8 | 30 | 8.2%
4 | 30 | 8.2%
5 | 29 | 7.9%
0 | 28 | 7.7%
9 | 19 | 5.2%

Space Separator
Value | Count | Frequency (%)
 | 412 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 12 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1308 | 62.3%
Common | 790 | 37.7%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 151 | 11.5%
 | 124 | 9.5%
 | 122 | 9.3%
 | 114 | 8.7%
 | 111 | 8.5%
 | 111 | 8.5%
 | 104 | 8.0%
 | 56 | 4.3%
 | 40 | 3.1%
 | 37 | 2.8%
Other values (64) | 338 | 25.8%

Common
Value | Count | Frequency (%)
 | 412 | 52.2%
1 | 74 | 9.4%
2 | 48 | 6.1%
6 | 38 | 4.8%
7 | 36 | 4.6%
3 | 34 | 4.3%
8 | 30 | 3.8%
4 | 30 | 3.8%
5 | 29 | 3.7%
0 | 28 | 3.5%
Other values (2) | 31 | 3.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1308 | 62.3%
ASCII | 790 | 37.7%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
 | 412 | 52.2%
1 | 74 | 9.4%
2 | 48 | 6.1%
6 | 38 | 4.8%
7 | 36 | 4.6%
3 | 34 | 4.3%
8 | 30 | 3.8%
4 | 30 | 3.8%
5 | 29 | 3.7%
0 | 28 | 3.5%
Other values (2) | 31 | 3.9%

Hangul
Value | Count | Frequency (%)
 | 151 | 11.5%
 | 124 | 9.5%
 | 122 | 9.3%
 | 114 | 8.7%
 | 111 | 8.5%
 | 111 | 8.5%
 | 104 | 8.0%
 | 56 | 4.3%
 | 40 | 3.1%
 | 37 | 2.8%
Other values (64) | 338 | 25.8%

용도
Categorical

Distinct: 4
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1020.0 B

Length

Max length: 3
Median length: 3
Mean length: 2.990991
Min length: 2

Unique

Unique: 1
Unique (%): 0.9%

Sample

1st row: 주유소
2nd row: 주유소
3rd row: 주유소
4th row: 주유소
5th row: 주유소

Common Values

Value | Count | Frequency (%)
주유소 | 82 | 73.9%
산업용 | 18 | 16.2%
난방용 | 10 | 9.0%
기타 | 1 | 0.9%

Length

Histogram of lengths of the category


설치일자
Date

Distinct: 107
Distinct (%): 96.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1020.0 B
Minimum: 1978-12-16 00:00:00
Maximum: 2022-04-08 00:00:00
Histogram with fixed size bins (bins=50)

정기검사대상구분
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1020.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 면제
2nd row: 대상
3rd row: 매년
4th row: 매년
5th row: 대상

Common Values

Value | Count | Frequency (%)
대상 | 68 | 61.3%
매년 | 28 | 25.2%
면제 | 15 | 13.5%

Length

Histogram of lengths of the category


당해연도정기검사여부
Boolean

MISSING 

Distinct: 2
Distinct (%): 2.0%
Missing: 13
Missing (%): 11.7%
Memory size: 354.0 B

Value | Count | Frequency (%)
False | 56 | 50.5%
True | 42 | 37.8%
(Missing) | 13 | 11.7%

누출검사대상여부
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1020.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 대상
2nd row: 대상
3rd row: 대상
4th row: 대상
5th row: 대상

Common Values

Value | Count | Frequency (%)
대상 | 104 | 93.7%
면제 | 7 | 6.3%

Length

Histogram of lengths of the category


당해연도누출검사여부
Boolean

MISSING 

Distinct: 2
Distinct (%): 1.9%
Missing: 7
Missing (%): 6.3%
Memory size: 354.0 B

Value | Count | Frequency (%)
False | 88 | 79.3%
True | 16 | 14.4%
(Missing) | 7 | 6.3%

데이터기준일자
Date

CONSTANT 

Distinct: 1
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1020.0 B
Minimum: 2023-11-15 00:00:00
Maximum: 2023-11-15 00:00:00
Histogram with fixed size bins (bins=1)

Correlations

Variable | 용도 | 정기검사대상구분 | 당해연도정기검사여부 | 누출검사대상여부 | 당해연도누출검사여부
용도 | 1.000 | 0.139 | 0.082 | 0.544 | 0.000
정기검사대상구분 | 0.139 | 1.000 | 0.126 | 0.413 | 0.000
당해연도정기검사여부 | 0.082 | 0.126 | 1.000 | 0.000 | 0.000
누출검사대상여부 | 0.544 | 0.413 | 0.000 | 1.000 | 0.000
당해연도누출검사여부 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000

Variable | 누출검사대상여부 | 당해연도정기검사여부 | 용도 | 당해연도누출검사여부 | 정기검사대상구분
누출검사대상여부 | 1.000 | 0.000 | 0.368 | 0.000 | 0.645
당해연도정기검사여부 | 0.000 | 1.000 | 0.050 | 0.000 | 0.207
용도 | 0.368 | 0.050 | 1.000 | 0.000 | 0.130
당해연도누출검사여부 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000
정기검사대상구분 | 0.645 | 0.207 | 0.130 | 0.000 | 1.000

Variable | 용도 | 정기검사대상구분 | 당해연도정기검사여부 | 누출검사대상여부 | 당해연도누출검사여부
용도 | 1.000 | 0.130 | 0.050 | 0.368 | 0.000
정기검사대상구분 | 0.130 | 1.000 | 0.207 | 0.645 | 0.000
당해연도정기검사여부 | 0.050 | 0.207 | 1.000 | 0.000 | 0.000
누출검사대상여부 | 0.368 | 0.645 | 0.000 | 1.000 | 0.000
당해연도누출검사여부 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000
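
The report does not say which coefficient fills these matrices. For pairs of categorical variables, one widely used association measure is Cramér's V, which rescales the chi-squared statistic of a contingency table to the range 0 to 1. The sketch below is a plain implementation without bias correction, run on a hypothetical cross-tabulation. It is not claimed to reproduce the values above, and `cramers_v` is an illustrative name.

```python
import math

def cramers_v(table):
    """Cramér's V for a contingency table given as a list of rows of counts.

    Assumes every row and every column has a non-zero total.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Hypothetical 3x2 cross-tabulation (not taken from the dataset).
table = [[30, 10], [10, 30], [20, 20]]
print(round(cramers_v(table), 3))
```

A value of 0 means the two variables look independent in the table, and 1 means one variable fully determines the other.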

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
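
Nullity correlation can be sketched as the Pearson correlation between the "is present" indicators of two columns. The helper below is a minimal illustration under that assumption. `nullity_correlation` is a hypothetical name, the example columns are made up, and `None` stands for a missing value.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 'is present' indicators of two columns."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # undefined when a column is never or always missing
    return cov / math.sqrt(var_a * var_b)

# Hypothetical columns: missing in the same rows, then in opposite rows.
same_rows = nullity_correlation([1, None, 3, None], ["a", None, "c", None])
opposite_rows = nullity_correlation([1, None, 3, None], [None, "b", None, "d"])
print(same_rows, opposite_rows)
```

A value near 1 means two columns tend to be missing together, a value near -1 means one is present when the other is missing, and columns with no missing values have no defined nullity correlation.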

Sample

First rows

Index | 상호 | 소재지도로명주소 | 용도 | 설치일자 | 정기검사대상구분 | 당해연도정기검사여부 | 누출검사대상여부 | 당해연도누출검사여부 | 데이터기준일자
0 | 현대오일뱅크㈜ 직영 탑골주유소 | 경기도 여주시 여주남로 158 | 주유소 | 2016-06-03 | 면제 | N | 대상 | <NA> | 2023-11-15
1 | 석봉주유소 | 경기도 여주시 북내면 여양2로 567 | 주유소 | 1991-09-13 | 대상 | Y | 대상 | N | 2023-11-15
2 | 흥천주유소 | 경기도 여주시 흥천면 효자로 87 | 주유소 | 1990-12-19 | 매년 | N | 대상 | N | 2023-11-15
3 | 대림주유소 | 경기도 여주시 금사면 금품1로 436 | 주유소 | 1995-11-02 | 매년 | N | 대상 | N | 2023-11-15
4 | SK네트웍스㈜ 남한강주유소 | 경기도 여주시 세종로 432 | 주유소 | 1991-12-07 | 대상 | Y | 대상 | Y | 2023-11-15
5 | 청기와주유소 | 경기도 여주시 여주남로 48 | 주유소 | 1993-06-29 | 대상 | Y | 대상 | N | 2023-11-15
6 | 강남주유소 | 경기도 여주시 대신면 여양로 681 | 주유소 | 1994-04-23 | 매년 | N | 대상 | Y | 2023-11-15
7 | 강천주유소 | 경기도 여주시 강천면 강문로 460 | 주유소 | 1992-08-26 | 대상 | Y | 대상 | N | 2023-11-15
8 | 초현주유소 | 경기도 여주시 대신면 대신1로 186 | 주유소 | 1992-12-24 | 매년 | Y | 대상 | N | 2023-11-15
9 | 세경SK주유소 | 경기도 여주시 대신면 여양로 1801 | 주유소 | 1992-05-04 | 매년 | N | 대상 | N | 2023-11-15

Last rows

Index | 상호 | 소재지도로명주소 | 용도 | 설치일자 | 정기검사대상구분 | 당해연도정기검사여부 | 누출검사대상여부 | 당해연도누출검사여부 | 데이터기준일자
101 | 금강레미콘㈜ | 경기도 여주시 가남읍 설가로 662 | 산업용 | 2013-06-05 | 대상 | N | 대상 | N | 2023-11-15
102 | 한라엔컴㈜ | 경기도 여주시 대신면 여양1로 302 | 산업용 | 2013-10-10 | 매년 | N | 대상 | Y | 2023-11-15
103 | 여주농협클린주유소 | 경기도 여주시 세종로 421-7 | 주유소 | 2014-11-26 | 면제 | <NA> | 대상 | N | 2023-11-15
104 | 한일시멘트㈜여주공장 | 경기도 여주시 장여로 1463-81 | 산업용 | 2015-04-06 | 대상 | N | 대상 | N | 2023-11-15
105 | 현대오일뱅크㈜ | 경기도 여주시 여주남로 148 | 기타 | 2016-07-08 | 대상 | N | 대상 | N | 2023-11-15
106 | 대신농협클린주유소 | 경기도 여주시 대신면 여양로 1537 | 주유소 | 2016-06-13 | 면제 | <NA> | 대상 | N | 2023-11-15
107 | ㈜다성레미콘 여주지점 | 경기도 여주시 흥천면 흥천로 308-85 | 산업용 | 2016-09-26 | 매년 | Y | 대상 | N | 2023-11-15
108 | ㈜대광아스콘 | 경기도 여주시 가남읍 여주남로 761 | 주유소 | 2017-04-12 | 대상 | N | 대상 | N | 2023-11-15
109 | 북여주IC주유소 | 경기도 여주시 흥천면 이여로 1067 | 주유소 | 2017-04-12 | 면제 | <NA> | 대상 | N | 2023-11-15
110 | 대진주유소 | 경기도 여주시 강천면 강문로 344 | 주유소 | 2022-04-08 | 대상 | N | 대상 | N | 2023-11-15
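
The sample rows show the two inspection-status columns as Y, N, or <NA>, while the Variables section reports them as Boolean with True and False counts. A minimal sketch of that mapping follows, assuming Y means True and N means False. `parse_flag` is a hypothetical helper, not part of ydata-profiling.

```python
def parse_flag(raw):
    """Map 'Y'/'N' to True/False and the '<NA>' marker (or None) to None."""
    if raw is None or raw == "<NA>":
        return None
    mapping = {"Y": True, "N": False}
    if raw not in mapping:
        raise ValueError(f"unexpected flag value: {raw!r}")
    return mapping[raw]

# 당해연도정기검사여부 for rows 0, 1 and 103 of the sample.
flags = [parse_flag(v) for v in ["N", "Y", "<NA>"]]
print(flags)  # [False, True, None]
```

Raising on unexpected values, rather than silently coercing them, keeps a stray code from being counted as False.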