Overview

Dataset statistics

Number of variables: 5
Number of observations: 66
Missing cells: 30
Missing cells (%): 9.1%
Duplicate rows: 1
Duplicate rows (%): 1.5%
Total size in memory: 2.8 KiB
Average record size in memory: 43.0 B
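
The two percentages above follow from the counts. A minimal Python sketch of that arithmetic, assuming the denominators are total cells (variables times observations) and total observations; the variable names are illustrative and not part of the report:

```python
# Counts copied from the dataset statistics above.
n_variables = 5
n_observations = 66
missing_cells = 30
duplicate_rows = 1

# Assumption: missing cells are measured against every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: duplicate rows are measured against the number of observations.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
```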

Variable types

Categorical: 2
Text: 2
DateTime: 1

Dataset

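Description: Status data on self-inspection businesses operating environmental pollutant emission facilities within Jeju-si, Jeju Special Self-Governing Province. Fields: 해당연도 (year), 사업장명칭 (business name), 소재지 (location), 지정분야 (designated field).
URL: https://www.data.go.kr/data/15017605/fileData.do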

Alerts

데이터기준일자 has constant value "2023-01-05" (Constant)
Dataset has 1 (1.5%) duplicate rows (Duplicates)
해당연도 is highly overall correlated with 지정분야 (High correlation)
지정분야 is highly overall correlated with 해당연도 (High correlation)
사업장명칭 has 10 (15.2%) missing values (Missing)
소재지 has 10 (15.2%) missing values (Missing)
데이터기준일자 has 10 (15.2%) missing values (Missing)

Reproduction

Analysis started: 2023-12-12 13:05:45.855014
Analysis finished: 2023-12-12 13:05:46.395177
Duration: 0.54 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

해당연도
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 660.0 B
Top categories: 2022 (56), <NA> (10)

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022
2nd row: 2022
3rd row: 2022
4th row: 2022
5th row: 2022

Common Values

Value | Count | Frequency (%)
2022 | 56 | 84.8%
<NA> | 10 | 15.2%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]
Value | Count | Frequency (%)
2022 | 56 | 84.8%
<NA> | 10 | 15.2%

사업장명칭
Text

MISSING 

Distinct: 56
Distinct (%): 100.0%
Missing: 10
Missing (%): 15.2%
Memory size: 660.0 B

Length

Max length: 28
Median length: 16
Mean length: 8.5178571
Min length: 4

Characters and Unicode

Total characters: 477
Distinct characters: 137
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
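
As an illustration of these character properties, the sketch below uses Python's standard `unicodedata` module to count general categories for two business names taken from the sample rows of this report. It is not the profiler's own code; the helper name `category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count Unicode general categories (e.g. 'Lo', 'Lu') in a string."""
    return Counter(unicodedata.category(ch) for ch in text)


# Two values of 사업장명칭 that appear in the sample rows of this report.
names = ["제주OK충전소", "(주)광동"]

totals = Counter()
for name in names:
    totals += category_counts(name)

# 'Lo' = Other Letter, 'Lu' = Uppercase Letter,
# 'Ps' = Open Punctuation, 'Pe' = Close Punctuation.
print(dict(totals))
```

Hangul syllables fall under Other Letter, which is consistent with that category dominating the tables for this variable.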

Unique

Unique: 56
Unique (%): 100.0%

Sample

1st row: 소명충전소
2nd row: 용담주유소
3rd row: 제주OK충전소
4th row: 공단주유소
5th row: 제주시농협동부주유소

Value | Count | Frequency (%)
주식회사 | 3 | 4.6%
용담주유소 | 1 | 1.5%
한국공항공사 | 1 | 1.5%
남동주유소 | 1 | 1.5%
㈜삼광개발 | 1 | 1.5%
성신주유소 | 1 | 1.5%
제주특별자치도 | 1 | 1.5%
제주도개인택시운송사업조합 | 1 | 1.5%
lpg충전소 | 1 | 1.5%
동아주유소 | 1 | 1.5%
Other values (53) | 53 | 81.5%

Most occurring characters

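Value | Count | Frequency (%)
[unrecovered] | 40 | 8.4%
[unrecovered] | 22 | 4.6%
[unrecovered] | 22 | 4.6%
[unrecovered] | 18 | 3.8%
[unrecovered] | 18 | 3.8%
[unrecovered] | 17 | 3.6%
[unrecovered] | 16 | 3.4%
[unrecovered] | 15 | 3.1%
[unrecovered] | 15 | 3.1%
[unrecovered] | 14 | 2.9%
Other values (127) | 280 | 58.7%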

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 433 | 90.8%
Close Punctuation | 10 | 2.1%
Open Punctuation | 10 | 2.1%
Space Separator | 9 | 1.9%
Uppercase Letter | 8 | 1.7%
Other Symbol | 7 | 1.5%

Most frequent character per category

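Other Letter
Value | Count | Frequency (%)
[unrecovered] | 40 | 9.2%
[unrecovered] | 22 | 5.1%
[unrecovered] | 22 | 5.1%
[unrecovered] | 18 | 4.2%
[unrecovered] | 18 | 4.2%
[unrecovered] | 17 | 3.9%
[unrecovered] | 16 | 3.7%
[unrecovered] | 15 | 3.5%
[unrecovered] | 15 | 3.5%
[unrecovered] | 14 | 3.2%
Other values (118) | 236 | 54.5%

Uppercase Letter
Value | Count | Frequency (%)
G | 2 | 25.0%
L | 2 | 25.0%
P | 2 | 25.0%
O | 1 | 12.5%
K | 1 | 12.5%

Close Punctuation
Value | Count | Frequency (%)
) | 10 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 10 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 9 | 100.0%

Other Symbol
Value | Count | Frequency (%)
[unrecovered] | 7 | 100.0%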

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 440 | 92.2%
Common | 29 | 6.1%
Latin | 8 | 1.7%

Most frequent character per script

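Hangul
Value | Count | Frequency (%)
[unrecovered] | 40 | 9.1%
[unrecovered] | 22 | 5.0%
[unrecovered] | 22 | 5.0%
[unrecovered] | 18 | 4.1%
[unrecovered] | 18 | 4.1%
[unrecovered] | 17 | 3.9%
[unrecovered] | 16 | 3.6%
[unrecovered] | 15 | 3.4%
[unrecovered] | 15 | 3.4%
[unrecovered] | 14 | 3.2%
Other values (119) | 243 | 55.2%

Latin
Value | Count | Frequency (%)
G | 2 | 25.0%
L | 2 | 25.0%
P | 2 | 25.0%
O | 1 | 12.5%
K | 1 | 12.5%

Common
Value | Count | Frequency (%)
) | 10 | 34.5%
( | 10 | 34.5%
(space) | 9 | 31.0%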

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 433 | 90.8%
ASCII | 37 | 7.8%
None | 7 | 1.5%

Most frequent character per block

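Hangul
Value | Count | Frequency (%)
[unrecovered] | 40 | 9.2%
[unrecovered] | 22 | 5.1%
[unrecovered] | 22 | 5.1%
[unrecovered] | 18 | 4.2%
[unrecovered] | 18 | 4.2%
[unrecovered] | 17 | 3.9%
[unrecovered] | 16 | 3.7%
[unrecovered] | 15 | 3.5%
[unrecovered] | 15 | 3.5%
[unrecovered] | 14 | 3.2%
Other values (118) | 236 | 54.5%

ASCII
Value | Count | Frequency (%)
) | 10 | 27.0%
( | 10 | 27.0%
(space) | 9 | 24.3%
G | 2 | 5.4%
L | 2 | 5.4%
P | 2 | 5.4%
O | 1 | 2.7%
K | 1 | 2.7%

None
Value | Count | Frequency (%)
[unrecovered] | 7 | 100.0%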

소재지
Text

MISSING 

Distinct: 54
Distinct (%): 96.4%
Missing: 10
Missing (%): 15.2%
Memory size: 660.0 B

Length

Max length: 25
Median length: 24
Mean length: 19.267857
Min length: 17

Characters and Unicode

Total characters: 1079
Distinct characters: 68
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 53
Unique (%): 94.6%

Sample

1st row: 제주특별자치도 제주시 한북로 313
2nd row: 제주특별자치도 제주시 용담로 136
3rd row: 제주특별자치도 제주시 일주서로 7588
4th row: 제주특별자치도 제주시 연삼로 727
5th row: 제주특별자치도 제주시 번영로 417

Value | Count | Frequency (%)
제주특별자치도 | 56 | 24.6%
제주시 | 56 | 24.6%
일주서로 | 5 | 2.2%
선반로 | 4 | 1.8%
연삼로 | 4 | 1.8%
공항로 | 3 | 1.3%
2 | 3 | 1.3%
남광로 | 3 | 1.3%
동광로 | 3 | 1.3%
서광로 | 3 | 1.3%
Other values (78) | 88 | 38.6%

Most occurring characters

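Value | Count | Frequency (%)
(space) | 172 | 15.9%
[unrecovered] | 118 | 10.9%
[unrecovered] | 113 | 10.5%
[unrecovered] | 58 | 5.4%
[unrecovered] | 56 | 5.2%
[unrecovered] | 56 | 5.2%
[unrecovered] | 56 | 5.2%
[unrecovered] | 56 | 5.2%
[unrecovered] | 56 | 5.2%
[unrecovered] | 50 | 4.6%
Other values (58) | 288 | 26.7%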

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 756 | 70.1%
Space Separator | 172 | 15.9%
Decimal Number | 148 | 13.7%
Dash Punctuation | 3 | 0.3%

Most frequent character per category

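Other Letter
Value | Count | Frequency (%)
[unrecovered] | 118 | 15.6%
[unrecovered] | 113 | 14.9%
[unrecovered] | 58 | 7.7%
[unrecovered] | 56 | 7.4%
[unrecovered] | 56 | 7.4%
[unrecovered] | 56 | 7.4%
[unrecovered] | 56 | 7.4%
[unrecovered] | 56 | 7.4%
[unrecovered] | 50 | 6.6%
[unrecovered] | 10 | 1.3%
Other values (46) | 127 | 16.8%

Decimal Number
Value | Count | Frequency (%)
1 | 29 | 19.6%
7 | 19 | 12.8%
2 | 18 | 12.2%
5 | 16 | 10.8%
3 | 14 | 9.5%
4 | 14 | 9.5%
6 | 12 | 8.1%
8 | 10 | 6.8%
9 | 9 | 6.1%
0 | 7 | 4.7%

Space Separator
Value | Count | Frequency (%)
(space) | 172 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 3 | 100.0%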

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 756 | 70.1%
Common | 323 | 29.9%

Most frequent character per script

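Hangul
Value | Count | Frequency (%)
[unrecovered] | 118 | 15.6%
[unrecovered] | 113 | 14.9%
[unrecovered] | 58 | 7.7%
[unrecovered] | 56 | 7.4%
[unrecovered] | 56 | 7.4%
[unrecovered] | 56 | 7.4%
[unrecovered] | 56 | 7.4%
[unrecovered] | 56 | 7.4%
[unrecovered] | 50 | 6.6%
[unrecovered] | 10 | 1.3%
Other values (46) | 127 | 16.8%

Common
Value | Count | Frequency (%)
(space) | 172 | 53.3%
1 | 29 | 9.0%
7 | 19 | 5.9%
2 | 18 | 5.6%
5 | 16 | 5.0%
3 | 14 | 4.3%
4 | 14 | 4.3%
6 | 12 | 3.7%
8 | 10 | 3.1%
9 | 9 | 2.8%
Other values (2) | 10 | 3.1%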

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 756 | 70.1%
ASCII | 323 | 29.9%

Most frequent character per block

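ASCII
Value | Count | Frequency (%)
(space) | 172 | 53.3%
1 | 29 | 9.0%
7 | 19 | 5.9%
2 | 18 | 5.6%
5 | 16 | 5.0%
3 | 14 | 4.3%
4 | 14 | 4.3%
6 | 12 | 3.7%
8 | 10 | 3.1%
9 | 9 | 2.8%
Other values (2) | 10 | 3.1%

Hangul
Value | Count | Frequency (%)
[unrecovered] | 118 | 15.6%
[unrecovered] | 113 | 14.9%
[unrecovered] | 58 | 7.7%
[unrecovered] | 56 | 7.4%
[unrecovered] | 56 | 7.4%
[unrecovered] | 56 | 7.4%
[unrecovered] | 56 | 7.4%
[unrecovered] | 56 | 7.4%
[unrecovered] | 50 | 6.6%
[unrecovered] | 10 | 1.3%
Other values (46) | 127 | 16.8%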

지정분야
Categorical

HIGH CORRELATION 

Distinct: 9
Distinct (%): 13.6%
Missing: 0
Missing (%): 0.0%
Memory size: 660.0 B
Top categories: 수질분야5 (36), <NA> (10), 대기분야5+기타분야 (8), 대기분야4+기타분야 (6), 대기분야4+수질분야4 (2), Other values (4)

Length

Max length: 11
Median length: 5
Mean length: 6.1666667
Min length: 4

Unique

Unique: 4
Unique (%): 6.1%

Sample

1st row: 수질분야5
2nd row: 수질분야5
3rd row: 수질분야5
4th row: 수질분야5
5th row: 수질분야5

Common Values

Value | Count | Frequency (%)
수질분야5 | 36 | 54.5%
<NA> | 10 | 15.2%
대기분야5+기타분야 | 8 | 12.1%
대기분야4+기타분야 | 6 | 9.1%
대기분야4+수질분야4 | 2 | 3.0%
기타분야 | 1 | 1.5%
대기분야5+수질분야5 | 1 | 1.5%
대기분야4 | 1 | 1.5%
대기분야5 | 1 | 1.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]
Value | Count | Frequency (%)
수질분야5 | 36 | 54.5%
<NA> | 10 | 15.2%
대기분야5+기타분야 | 8 | 12.1%
대기분야4+기타분야 | 6 | 9.1%
대기분야4+수질분야4 | 2 | 3.0%
기타분야 | 1 | 1.5%
대기분야5+수질분야5 | 1 | 1.5%
대기분야4 | 1 | 1.5%
대기분야5 | 1 | 1.5%

데이터기준일자
Date

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 1.8%
Missing: 10
Missing (%): 15.2%
Memory size: 660.0 B
Minimum: 2023-01-05 00:00:00
Maximum: 2023-01-05 00:00:00
[Figure: Histogram with fixed size bins (bins=1)]

Correlations

[Figure: correlation heatmap]
 | 사업장명칭 | 소재지 | 지정분야
사업장명칭 | 1.000 | 1.000 | 1.000
소재지 | 1.000 | 1.000 | 1.000
지정분야 | 1.000 | 1.000 | 1.000

[Figure: correlation heatmap]
 | 해당연도 | 지정분야
해당연도 | 1.000 | 1.000
지정분야 | 1.000 | 1.000

[Figure: correlation heatmap]
 | 해당연도 | 지정분야
해당연도 | 1.000 | 1.000
지정분야 | 1.000 | 1.000
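
The report does not say at this point which association measure produced these values. As an assumption, the sketch below computes plain (uncorrected) Cramér's V between 해당연도 and 지정분야, rebuilding the pairs from the Common Values counts and assuming that every <NA> in one column coincides with <NA> in the other, as the last sample rows suggest. The function name `cramers_v` is hypothetical.

```python
import math
from collections import Counter


def cramers_v(pairs):
    """Uncorrected Cramér's V for a list of (x, y) categorical pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    row = Counter(x for x, _ in pairs)
    col = Counter(y for _, y in pairs)
    chi2 = 0.0
    for x in row:
        for y in col:
            expected = row[x] * col[y] / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    return math.sqrt(chi2 / (n * k))


# Counts of 지정분야 for the 56 rows where 해당연도 is 2022.
field_counts = {
    "수질분야5": 36, "대기분야5+기타분야": 8, "대기분야4+기타분야": 6,
    "대기분야4+수질분야4": 2, "기타분야": 1, "대기분야5+수질분야5": 1,
    "대기분야4": 1, "대기분야5": 1,
}
pairs = [("2022", field) for field, c in field_counts.items() for _ in range(c)]
# Assumption: the 10 <NA> rows are <NA> in both columns.
pairs += [("<NA>", "<NA>")] * 10

v = cramers_v(pairs)
print(round(v, 3))
```

Under these assumptions the association is perfect because knowing 지정분야 fully determines whether 해당연도 is 2022 or <NA>, so the high correlation most likely reflects the empty rows rather than a relationship in the data.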

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
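
Nullity correlation can be sketched as the Pearson correlation between per-column missing-value indicators. The example below is an illustration in plain Python, not the profiler's implementation; the row layout (56 filled rows followed by 10 empty ones) is an assumption based on the counts and sample rows in this report, and `pearson` is a hypothetical helper.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length numeric lists, or None if undefined."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is never missing has a constant indicator.
        return None
    return cov / math.sqrt(var_a * var_b)


# 1 marks a missing cell, 0 a filled one.
name_null = [0] * 56 + [1] * 10   # 사업장명칭: 10 missing
addr_null = [0] * 56 + [1] * 10   # 소재지: 10 missing
year_null = [0] * 66              # 해당연도: reported as 0 missing

r_name_addr = pearson(name_null, addr_null)
r_year_name = pearson(year_null, name_null)
print(r_name_addr, r_year_name)
```

Columns that are missing in exactly the same rows correlate perfectly, while a column with no missing values has no defined nullity correlation.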

Sample

Index | 해당연도 | 사업장명칭 | 소재지 | 지정분야 | 데이터기준일자
0 | 2022 | 소명충전소 | 제주특별자치도 제주시 한북로 313 | 수질분야5 | 2023-01-05
1 | 2022 | 용담주유소 | 제주특별자치도 제주시 용담로 136 | 수질분야5 | 2023-01-05
2 | 2022 | 제주OK충전소 | 제주특별자치도 제주시 일주서로 7588 | 수질분야5 | 2023-01-05
3 | 2022 | 공단주유소 | 제주특별자치도 제주시 연삼로 727 | 수질분야5 | 2023-01-05
4 | 2022 | 제주시농협동부주유소 | 제주특별자치도 제주시 번영로 417 | 수질분야5 | 2023-01-05
5 | 2022 | 정우오라LPG충전소 | 제주특별자치도 제주시 오남로 150 | 수질분야5 | 2023-01-05
6 | 2022 | (주)광동 | 제주특별자치도 제주시 일주서로 7832 | 수질분야5 | 2023-01-05
7 | 2022 | 내트럭자동세차 | 제주특별자치도 제주시 번영로 345 | 수질분야5 | 2023-01-05
8 | 2022 | 엔크린주유소 | 제주특별자치도 제주시 일주서로 7650 | 수질분야5 | 2023-01-05
9 | 2022 | 임덕방사선과의원 | 제주특별자치도 제주시 중앙로 77 | 기타분야 | 2023-01-05

Index | 해당연도 | 사업장명칭 | 소재지 | 지정분야 | 데이터기준일자
56 | <NA> | <NA> | <NA> | <NA> | <NA>
57 | <NA> | <NA> | <NA> | <NA> | <NA>
58 | <NA> | <NA> | <NA> | <NA> | <NA>
59 | <NA> | <NA> | <NA> | <NA> | <NA>
60 | <NA> | <NA> | <NA> | <NA> | <NA>
61 | <NA> | <NA> | <NA> | <NA> | <NA>
62 | <NA> | <NA> | <NA> | <NA> | <NA>
63 | <NA> | <NA> | <NA> | <NA> | <NA>
64 | <NA> | <NA> | <NA> | <NA> | <NA>
65 | <NA> | <NA> | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

Index | 해당연도 | 사업장명칭 | 소재지 | 지정분야 | 데이터기준일자 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | 10
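
The only duplicated row is the fully empty one, so dropping empty rows would also remove the duplicates. A minimal sketch in plain Python, using a hypothetical miniature of the table (three filled rows and three empty ones, with fewer columns than the real data):

```python
from collections import Counter

# Hypothetical miniature of the dataset; None stands for <NA>.
rows = [
    ("2022", "소명충전소", "수질분야5"),
    ("2022", "용담주유소", "수질분야5"),
    ("2022", "임덕방사선과의원", "기타분야"),
    (None, None, None),
    (None, None, None),
    (None, None, None),
]

# Group identical rows and keep the groups that occur more than once.
counts = Counter(rows)
duplicate_groups = {row: n for row, n in counts.items() if n > 1}

# Drop rows in which every field is empty.
cleaned = [row for row in rows if any(value is not None for value in row)]

print(duplicate_groups)
print(len(cleaned))
```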