Overview

Dataset statistics

Number of variables: 5
Number of observations: 203
Missing cells: 356
Missing cells (%): 35.1%
Duplicate rows: 1
Duplicate rows (%): 0.5%
Total size in memory: 8.3 KiB
Average record size in memory: 41.7 B
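
The missing-cell percentage follows from the counts above: 356 missing cells out of 5 × 203 = 1,015 cells in total. A minimal Python sketch of that arithmetic, using only figures quoted in this report (the variable names are illustrative):

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 5
n_observations = 203
missing_cells = 356

total_cells = n_variables * n_observations        # 5 * 203 = 1015
missing_pct = 100 * missing_cells / total_cells   # about 35.07

# Four columns each report 89 missing values (see "Alerts").
cells_from_alerts = 4 * 89

print(f"{missing_pct:.1f}%")
print(cells_from_alerts == missing_cells)
```

The second check shows that the four columns flagged as Missing account for all 356 missing cells.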

Variable types

Numeric: 1
Text: 3
Categorical: 1

Dataset

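Description: This dataset lists the air pollutant emission facilities located within the jurisdiction of Jung-gu, Incheon Metropolitan City. File name: 인천광역시_중구_대기오염물질 배출시설 현황. File contents: business name, category, road-name address, etc.
Author: 인천광역시 중구 (Jung-gu, Incheon Metropolitan City)
URL: https://www.data.go.kr/data/15087839/fileData.do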

Alerts

Dataset has 1 (0.5%) duplicate rows [Duplicates]
연번 is highly overall correlated with 데이터 기준일자 [High correlation]
데이터 기준일자 is highly overall correlated with 연번 [High correlation]
연번 has 89 (43.8%) missing values [Missing]
업소명 has 89 (43.8%) missing values [Missing]
구분 has 89 (43.8%) missing values [Missing]
도로명 주소 has 89 (43.8%) missing values [Missing]
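
The four Missing alerts report the same count (89), and the "Duplicate rows" table at the end of this report shows one all-`<NA>` row repeated 89 times, which suggests that the same rows are empty in every column. A minimal sketch of how such rows could be detected and dropped in plain Python; the `rows` list is made-up illustrative data, not the real file, and `is_empty` is a hypothetical helper:

```python
# Column names as they appear in this report.
COLUMNS = ["연번", "업소명", "구분", "도로명 주소"]

# Hypothetical rows standing in for the real file; None marks a missing cell.
rows = [
    {"연번": 1, "업소명": "facility A", "구분": "category A", "도로명 주소": "address A"},
    {"연번": None, "업소명": None, "구분": None, "도로명 주소": None},
    {"연번": None, "업소명": None, "구분": None, "도로명 주소": None},
    {"연번": 2, "업소명": "facility B", "구분": None, "도로명 주소": "address B"},
]

def is_empty(row):
    """True when every listed column of the row is missing."""
    return all(row[col] is None for col in COLUMNS)

empty_rows = [r for r in rows if is_empty(r)]
kept_rows = [r for r in rows if not is_empty(r)]

print(len(empty_rows), len(kept_rows))
```

A row with only some cells missing (the last one here) is kept, so the filter removes only fully empty records.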

Reproduction

Analysis started: 2024-04-17 17:31:23.358773
Analysis finished: 2024-04-17 17:31:23.890574
Duration: 0.53 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
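
The reported duration is the difference between the two timestamps. A short sketch of that subtraction, with both timestamps copied from this section (the format string is the only thing added):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2024-04-17 17:31:23.358773", FMT)
finished = datetime.strptime("2024-04-17 17:31:23.890574", FMT)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```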

Variables

연번
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 114
Distinct (%): 100.0%
Missing: 89
Missing (%): 43.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 63.745614
Minimum: 1
Maximum: 203
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 6.65
Q1: 29.25
Median: 57.5
Q3: 85.75
95-th percentile: 197.35
Maximum: 203
Range: 202
Interquartile range (IQR): 56.5

Descriptive statistics

Standard deviation: 47.768396
Coefficient of variation (CV): 0.74935973
Kurtosis: 2.2108269
Mean: 63.745614
Median Absolute Deviation (MAD): 28.5
Skewness: 1.393919
Sum: 7267
Variance: 2281.8197
Monotonicity: Strictly increasing
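
Several of these figures are tied together by simple identities: the mean is the sum divided by the number of non-missing values (203 − 89 = 114), the CV is the standard deviation divided by the mean, the variance is the standard deviation squared, and the IQR is Q3 − Q1. A small sketch that recomputes them from the reported inputs; the variable names are illustrative, and small differences come from rounding in the report:

```python
# Inputs copied from the 연번 statistics in this report.
count = 203 - 89                   # non-missing observations
total = 7267                       # Sum
std = 47.768396                    # Standard deviation
q1, q3 = 29.25, 85.75
minimum, maximum = 1, 203

mean = total / count               # reported: 63.745614
cv = std / mean                    # reported: 0.74935973
variance = std ** 2                # reported: 2281.8197
iqr = q3 - q1                      # reported: 56.5
value_range = maximum - minimum    # reported: 202

for label, value in [("mean", mean), ("cv", cv), ("variance", variance),
                     ("iqr", iqr), ("range", value_range)]:
    print(label, round(value, 6))
```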
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
87 | 1 | 0.5%
85 | 1 | 0.5%
84 | 1 | 0.5%
83 | 1 | 0.5%
82 | 1 | 0.5%
81 | 1 | 0.5%
80 | 1 | 0.5%
79 | 1 | 0.5%
78 | 1 | 0.5%
77 | 1 | 0.5%
Other values (104) | 104 | 51.2%
(Missing) | 89 | 43.8%

Value | Count | Frequency (%)
1 | 1 | 0.5%
2 | 1 | 0.5%
3 | 1 | 0.5%
4 | 1 | 0.5%
5 | 1 | 0.5%
6 | 1 | 0.5%
7 | 1 | 0.5%
8 | 1 | 0.5%
9 | 1 | 0.5%
10 | 1 | 0.5%

Value | Count | Frequency (%)
203 | 1 | 0.5%
202 | 1 | 0.5%
201 | 1 | 0.5%
200 | 1 | 0.5%
199 | 1 | 0.5%
198 | 1 | 0.5%
197 | 1 | 0.5%
196 | 1 | 0.5%
106 | 1 | 0.5%
105 | 1 | 0.5%

업소명
Text

MISSING 

Distinct: 107
Distinct (%): 93.9%
Missing: 89
Missing (%): 43.8%
Memory size: 1.7 KiB

Length

Max length: 25
Median length: 14
Mean length: 8.0526316
Min length: 3

Characters and Unicode

Total characters: 918
Distinct characters: 208
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
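
As an illustration of those character properties, the sketch below counts Unicode general categories for one value of this column using Python's standard `unicodedata` module. It is not the profiler's own implementation; the string is the first sample value of 업소명 shown in this report.

```python
import unicodedata
from collections import Counter

# First sample value of 업소명 in this report.
name = "씨제이제일제당㈜인천3공장"

# General category per character: "Lo" = Other Letter,
# "So" = Other Symbol, "Nd" = Decimal Number.
categories = Counter(unicodedata.category(ch) for ch in name)
print(categories)
```

Hangul syllables fall under Other Letter, the enclosed ㈜ sign under Other Symbol, and the digit under Decimal Number, which matches the category names used in the tables for this variable.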

Unique

Unique: 106
Unique (%): 93.0%

Sample

1st row: 씨제이제일제당㈜인천3공장
2nd row: ㈜삼양사 인천2공장
3rd row: 씨제이제일제당㈜인천냉동식품공장
4th row: 대한제당㈜
5th row: 씨제이제일제당㈜ 인천1공장
Value | Count | Frequency (%)
보일러 | 8 | 6.1%
㈜한진 | 2 | 1.5%
씨제이제일제당㈜ | 2 | 1.5%
인천2공장 | 2 | 1.5%
사단법인 | 1 | 0.8%
(not captured) | 1 | 0.8%
케이준오토 | 1 | 0.8%
정석기업㈜인천관리사무소-신관 | 1 | 0.8%
쌍용레미콘㈜인천파쇄장 | 1 | 0.8%
인천관광공사 | 1 | 0.8%
Other values (111) | 111 | 84.7%

Most occurring characters

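Value | Count | Frequency (%)
(not captured) | 78 | 8.5%
(not captured) | 41 | 4.5%
(not captured) | 34 | 3.7%
(not captured) | 28 | 3.1%
(not captured) | 27 | 2.9%
(not captured) | 27 | 2.9%
(not captured) | 24 | 2.6%
(not captured) | 23 | 2.5%
(not captured) | 22 | 2.4%
(not captured) | 22 | 2.4%
Other values (198) | 592 | 64.5%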

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 785 | 85.5%
Other Symbol | 78 | 8.5%
Space Separator | 28 | 3.1%
Uppercase Letter | 10 | 1.1%
Decimal Number | 6 | 0.7%
Close Punctuation | 4 | 0.4%
Open Punctuation | 4 | 0.4%
Dash Punctuation | 2 | 0.2%
Other Punctuation | 1 | 0.1%

Most frequent character per category

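Other Letter
Value | Count | Frequency (%)
(not captured) | 41 | 5.2%
(not captured) | 34 | 4.3%
(not captured) | 27 | 3.4%
(not captured) | 27 | 3.4%
(not captured) | 24 | 3.1%
(not captured) | 23 | 2.9%
(not captured) | 22 | 2.8%
(not captured) | 22 | 2.8%
(not captured) | 19 | 2.4%
(not captured) | 18 | 2.3%
Other values (183) | 528 | 67.3%

Uppercase Letter
Value | Count | Frequency (%)
S | 4 | 40.0%
G | 2 | 20.0%
L | 1 | 10.0%
N | 1 | 10.0%
U | 1 | 10.0%
J | 1 | 10.0%

Decimal Number
Value | Count | Frequency (%)
2 | 3 | 50.0%
1 | 2 | 33.3%
3 | 1 | 16.7%

Other Symbol
Value | Count | Frequency (%)
(not captured) | 78 | 100.0%

Space Separator
Value | Count | Frequency (%)
(not captured) | 28 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 4 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 4 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
& | 1 | 100.0%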

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 863 | 94.0%
Common | 45 | 4.9%
Latin | 10 | 1.1%

Most frequent character per script

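Hangul
Value | Count | Frequency (%)
(not captured) | 78 | 9.0%
(not captured) | 41 | 4.8%
(not captured) | 34 | 3.9%
(not captured) | 27 | 3.1%
(not captured) | 27 | 3.1%
(not captured) | 24 | 2.8%
(not captured) | 23 | 2.7%
(not captured) | 22 | 2.5%
(not captured) | 22 | 2.5%
(not captured) | 19 | 2.2%
Other values (184) | 546 | 63.3%

Common
Value | Count | Frequency (%)
(not captured) | 28 | 62.2%
) | 4 | 8.9%
( | 4 | 8.9%
2 | 3 | 6.7%
- | 2 | 4.4%
1 | 2 | 4.4%
& | 1 | 2.2%
3 | 1 | 2.2%

Latin
Value | Count | Frequency (%)
S | 4 | 40.0%
G | 2 | 20.0%
L | 1 | 10.0%
N | 1 | 10.0%
U | 1 | 10.0%
J | 1 | 10.0%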

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 785 | 85.5%
None | 78 | 8.5%
ASCII | 55 | 6.0%

Most frequent character per block

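None
Value | Count | Frequency (%)
(not captured) | 78 | 100.0%

Hangul
Value | Count | Frequency (%)
(not captured) | 41 | 5.2%
(not captured) | 34 | 4.3%
(not captured) | 27 | 3.4%
(not captured) | 27 | 3.4%
(not captured) | 24 | 3.1%
(not captured) | 23 | 2.9%
(not captured) | 22 | 2.8%
(not captured) | 22 | 2.8%
(not captured) | 19 | 2.4%
(not captured) | 18 | 2.3%
Other values (183) | 528 | 67.3%

ASCII
Value | Count | Frequency (%)
(not captured) | 28 | 50.9%
) | 4 | 7.3%
S | 4 | 7.3%
( | 4 | 7.3%
2 | 3 | 5.5%
- | 2 | 3.6%
G | 2 | 3.6%
1 | 2 | 3.6%
L | 1 | 1.8%
& | 1 | 1.8%
Other values (4) | 4 | 7.3%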

구분
Text

MISSING 

Distinct: 61
Distinct (%): 53.5%
Missing: 89
Missing (%): 43.8%
Memory size: 1.7 KiB

Length

Max length: 37
Median length: 16
Mean length: 7.7192982
Min length: 2

Characters and Unicode

Total characters: 880
Distinct characters: 153
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 50
Unique (%): 43.9%

Sample

1st row: 식료품제조
2nd row: 식료품제조
3rd row: 식료품제조
4th row: 식료품제조
5th row: 식료품제조
Value | Count | Frequency (%)
자동차정비업 | 21 | 12.8%
자동차종합수리업 | 13 | 7.9%
(not captured) | 9 | 5.5%
식료품제조 | 8 | 4.9%
모래 | 7 | 4.3%
자갈채취업 | 7 | 4.3%
사료제조업 | 5 | 3.0%
세차시설 | 5 | 3.0%
제조업 | 4 | 2.4%
자동차정비 | 3 | 1.8%
Other values (73) | 82 | 50.0%

Most occurring characters

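Value | Count | Frequency (%)
(not captured) | 81 | 9.2%
(not captured) | 67 | 7.6%
(not captured) | 52 | 5.9%
(not captured) | 51 | 5.8%
(not captured) | 48 | 5.5%
(not captured) | 35 | 4.0%
(not captured) | 28 | 3.2%
(not captured) | 27 | 3.1%
(not captured) | 25 | 2.8%
(not captured) | 20 | 2.3%
Other values (143) | 446 | 50.7%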

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 804 | 91.4%
Space Separator | 67 | 7.6%
Other Symbol | 4 | 0.5%
Close Punctuation | 2 | 0.2%
Open Punctuation | 2 | 0.2%
Other Punctuation | 1 | 0.1%

Most frequent character per category

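Other Letter
Value | Count | Frequency (%)
(not captured) | 81 | 10.1%
(not captured) | 52 | 6.5%
(not captured) | 51 | 6.3%
(not captured) | 48 | 6.0%
(not captured) | 35 | 4.4%
(not captured) | 28 | 3.5%
(not captured) | 27 | 3.4%
(not captured) | 25 | 3.1%
(not captured) | 20 | 2.5%
(not captured) | 20 | 2.5%
Other values (138) | 417 | 51.9%

Space Separator
Value | Count | Frequency (%)
(not captured) | 67 | 100.0%

Other Symbol
Value | Count | Frequency (%)
(not captured) | 4 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 2 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 2 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 1 | 100.0%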

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 808 | 91.8%
Common | 72 | 8.2%

Most frequent character per script

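Hangul
Value | Count | Frequency (%)
(not captured) | 81 | 10.0%
(not captured) | 52 | 6.4%
(not captured) | 51 | 6.3%
(not captured) | 48 | 5.9%
(not captured) | 35 | 4.3%
(not captured) | 28 | 3.5%
(not captured) | 27 | 3.3%
(not captured) | 25 | 3.1%
(not captured) | 20 | 2.5%
(not captured) | 20 | 2.5%
Other values (139) | 421 | 52.1%

Common
Value | Count | Frequency (%)
(not captured) | 67 | 93.1%
) | 2 | 2.8%
( | 2 | 2.8%
, | 1 | 1.4%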

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 804 | 91.4%
ASCII | 72 | 8.2%
None | 4 | 0.5%

Most frequent character per block

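Hangul
Value | Count | Frequency (%)
(not captured) | 81 | 10.1%
(not captured) | 52 | 6.5%
(not captured) | 51 | 6.3%
(not captured) | 48 | 6.0%
(not captured) | 35 | 4.4%
(not captured) | 28 | 3.5%
(not captured) | 27 | 3.4%
(not captured) | 25 | 3.1%
(not captured) | 20 | 2.5%
(not captured) | 20 | 2.5%
Other values (138) | 417 | 51.9%

ASCII
Value | Count | Frequency (%)
(not captured) | 67 | 93.1%
) | 2 | 2.8%
( | 2 | 2.8%
, | 1 | 1.4%

None
Value | Count | Frequency (%)
(not captured) | 4 | 100.0%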

도로명 주소
Text

MISSING 

Distinct: 107
Distinct (%): 93.9%
Missing: 89
Missing (%): 43.8%
Memory size: 1.7 KiB

Length

Max length: 29
Median length: 27
Mean length: 19.368421
Min length: 7

Characters and Unicode

Total characters: 2208
Distinct characters: 75
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 101
Unique (%): 88.6%

Sample

1st row: 인천광역시 중구 축항대로87번길 30
2nd row: 인천광역시 중구 축항대로290번길 121
3rd row: 인천광역시 중구 서해대로 168
4th row: 인천광역시 중구 월미로 116
5th row: 인천광역시 중구 아암대로 20
Value | Count | Frequency (%)
중구 | 110 | 24.4%
인천광역시 | 106 | 23.5%
서해대로 | 21 | 4.7%
축항대로296번길 | 9 | 2.0%
월미로 | 9 | 2.0%
서해대로94번길 | 6 | 1.3%
축항대로290번길 | 6 | 1.3%
서해대로180번길 | 5 | 1.1%
10 | 4 | 0.9%
서해대로179번길 | 3 | 0.7%
Other values (137) | 172 | 38.1%

Most occurring characters

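Value | Count | Frequency (%)
(not captured) | 337 | 15.3%
(not captured) | 113 | 5.1%
(not captured) | 110 | 5.0%
(not captured) | 110 | 5.0%
(not captured) | 110 | 5.0%
(not captured) | 110 | 5.0%
(not captured) | 106 | 4.8%
(not captured) | 106 | 4.8%
(not captured) | 106 | 4.8%
1 | 102 | 4.6%
Other values (65) | 898 | 40.7%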

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1373 | 62.2%
Decimal Number | 464 | 21.0%
Space Separator | 337 | 15.3%
Dash Punctuation | 12 | 0.5%
Close Punctuation | 10 | 0.5%
Open Punctuation | 10 | 0.5%
Other Punctuation | 2 | 0.1%

Most frequent character per category

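Other Letter
Value | Count | Frequency (%)
(not captured) | 113 | 8.2%
(not captured) | 110 | 8.0%
(not captured) | 110 | 8.0%
(not captured) | 110 | 8.0%
(not captured) | 110 | 8.0%
(not captured) | 106 | 7.7%
(not captured) | 106 | 7.7%
(not captured) | 106 | 7.7%
(not captured) | 79 | 5.8%
(not captured) | 65 | 4.7%
Other values (50) | 358 | 26.1%

Decimal Number
Value | Count | Frequency (%)
1 | 102 | 22.0%
2 | 60 | 12.9%
6 | 51 | 11.0%
9 | 50 | 10.8%
3 | 47 | 10.1%
0 | 45 | 9.7%
4 | 36 | 7.8%
8 | 32 | 6.9%
7 | 21 | 4.5%
5 | 20 | 4.3%

Space Separator
Value | Count | Frequency (%)
(not captured) | 337 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 12 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 10 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 10 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 2 | 100.0%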

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1373 | 62.2%
Common | 835 | 37.8%

Most frequent character per script

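Hangul
Value | Count | Frequency (%)
(not captured) | 113 | 8.2%
(not captured) | 110 | 8.0%
(not captured) | 110 | 8.0%
(not captured) | 110 | 8.0%
(not captured) | 110 | 8.0%
(not captured) | 106 | 7.7%
(not captured) | 106 | 7.7%
(not captured) | 106 | 7.7%
(not captured) | 79 | 5.8%
(not captured) | 65 | 4.7%
Other values (50) | 358 | 26.1%

Common
Value | Count | Frequency (%)
(not captured) | 337 | 40.4%
1 | 102 | 12.2%
2 | 60 | 7.2%
6 | 51 | 6.1%
9 | 50 | 6.0%
3 | 47 | 5.6%
0 | 45 | 5.4%
4 | 36 | 4.3%
8 | 32 | 3.8%
7 | 21 | 2.5%
Other values (5) | 54 | 6.5%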

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1373 | 62.2%
ASCII | 835 | 37.8%

Most frequent character per block

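ASCII
Value | Count | Frequency (%)
(not captured) | 337 | 40.4%
1 | 102 | 12.2%
2 | 60 | 7.2%
6 | 51 | 6.1%
9 | 50 | 6.0%
3 | 47 | 5.6%
0 | 45 | 5.4%
4 | 36 | 4.3%
8 | 32 | 3.8%
7 | 21 | 2.5%
Other values (5) | 54 | 6.5%

Hangul
Value | Count | Frequency (%)
(not captured) | 113 | 8.2%
(not captured) | 110 | 8.0%
(not captured) | 110 | 8.0%
(not captured) | 110 | 8.0%
(not captured) | 110 | 8.0%
(not captured) | 106 | 7.7%
(not captured) | 106 | 7.7%
(not captured) | 106 | 7.7%
(not captured) | 79 | 5.8%
(not captured) | 65 | 4.7%
Other values (50) | 358 | 26.1%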

데이터 기준일자
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB
Value | Count
2023-09-11 | 106
<NA> | 97

Length

Max length: 10
Median length: 10
Mean length: 7.1330049
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-09-11
2nd row: 2023-09-11
3rd row: 2023-09-11
4th row: 2023-09-11
5th row: 2023-09-11

Common Values

Value | Count | Frequency (%)
2023-09-11 | 106 | 52.2%
<NA> | 97 | 47.8%
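
The report counts 0 missing values for this column, yet 97 rows hold the value `<NA>` and the minimum length is 4. That is consistent with `<NA>` being stored as a four-character string rather than as a true missing value; this reading is an inference from the figures above, not something the report states. Under that assumption, a small sketch of the length arithmetic and of one way to normalise such placeholders (`normalise` and its placeholder list are hypothetical):

```python
# Counts copied from "Common Values"; treating "<NA>" as a literal string is an assumption.
counts = {"2023-09-11": 106, "<NA>": 97}

total = sum(counts.values())  # 203 observations
mean_length = sum(len(value) * n for value, n in counts.items()) / total
print(mean_length)            # compare with the reported mean length, 7.1330049

def normalise(value, placeholders=("<NA>", "na", "")):
    """Map placeholder strings to None so they count as missing (hypothetical helper)."""
    return None if value in placeholders else value

values = ["2023-09-11", "<NA>"]
print([normalise(v) for v in values])
```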

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2023-09-11 | 106 | 52.2%
na | 97 | 47.8%

Interactions


Correlations

Variable | 연번 | 구분
연번 | 1.000 | 0.898
구분 | 0.898 | 1.000

Variable | 연번 | 데이터 기준일자
연번 | 1.000 | 1.000
데이터 기준일자 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
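
One common way to express nullity correlation is the Pearson correlation between the "is present" indicators of two columns. The sketch below implements that definition in plain Python as an illustration; it is not taken from the profiler's source, and the two columns are made-up data.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Undefined when a column is fully present or fully missing.
        return float("nan")
    return cov / math.sqrt(var_a * var_b)

# Made-up columns that are missing in exactly the same rows.
names = ["a", None, "b", None, "c"]
addresses = ["x", None, "y", None, "z"]
print(nullity_correlation(names, addresses))
```

Columns that are missing in exactly the same rows, as in this toy example, give a value close to 1; columns where one is present exactly when the other is missing give a value close to -1.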

Sample

Index | 연번 | 업소명 | 구분 | 도로명 주소 | 데이터 기준일자
0 | 1 | 씨제이제일제당㈜인천3공장 | 식료품제조 | 인천광역시 중구 축항대로87번길 30 | 2023-09-11
1 | 2 | ㈜삼양사 인천2공장 | 식료품제조 | 인천광역시 중구 축항대로290번길 121 | 2023-09-11
2 | 3 | 씨제이제일제당㈜인천냉동식품공장 | 식료품제조 | 인천광역시 중구 서해대로 168 | 2023-09-11
3 | 4 | 대한제당㈜ | 식료품제조 | 인천광역시 중구 월미로 116 | 2023-09-11
4 | 5 | 씨제이제일제당㈜ 인천1공장 | 식료품제조 | 인천광역시 중구 아암대로 20 | 2023-09-11
5 | 6 | 씨제이제일제당㈜ 인천2공장 | 식료품제조 | 인천광역시 중구 서해대로140번길 49 | 2023-09-11
6 | 7 | 제일사료㈜인천공장 | 사료제조업 | 인천광역시 중구 서해대로209번길 69 | 2023-09-11
7 | 8 | 티에스사료㈜ | 사료제조업 | 인천광역시 중구 월미로 181 | 2023-09-11
8 | 9 | GS칼텍스㈜인천물류센터 | 저유업 | 인천광역시 중구 월미로 182 | 2023-09-11
9 | 10 | 인천기독병원 | 병원 | 인천광역시 중구 답동로30번길 10 | 2023-09-11

Index | 연번 | 업소명 | 구분 | 도로명 주소 | 데이터 기준일자
193 | <NA> | <NA> | <NA> | <NA> | <NA>
194 | <NA> | <NA> | <NA> | <NA> | <NA>
195 | 196 | 보일러 | 지투호텔 | 서울특별시 중구 수표로 24(저동2가) | <NA>
196 | 197 | 보일러 | 케이비부동산신탁㈜ | 서울특별시 중구 퇴계로 65(회현동1가) 외 32필지 | <NA>
197 | 198 | 보일러 | (주)하나은행 | 서울특별시 중구 을지로 35(을지로1가) | <NA>
198 | 199 | 보일러 | 재단법인 천주교쌘볼수도원유지재단(샽트르 성 바오로 수녀회 서울관구) | 서읕특별시 중구 명동길 74-2(명동2가) | <NA>
199 | 200 | 보일러 | 더유니스타 주식회사 | 을지로4가 261-4 | <NA>
200 | 201 | 보일러 | 해성산업㈜ | 남대문로4가 17-19 | <NA>
201 | 202 | 보일러 | ㈜케이티에스테이트 | 을지로 238 | <NA>
202 | 203 | 보일러 | 씨제이㈜ | 소월로2길 12 | <NA>
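
The last rows of the sample carry addresses that start with 서울특별시 or have no city prefix at all, while the dataset description says the data covers Jung-gu, Incheon. A hedged sketch of a prefix check that would flag such rows for review; the expected prefix and the function name are assumptions chosen for illustration:

```python
EXPECTED_PREFIX = "인천광역시 중구"  # assumed from the dataset description

def flag_unexpected(addresses, prefix=EXPECTED_PREFIX):
    """Return the non-missing addresses that do not start with the expected prefix."""
    return [a for a in addresses if a is not None and not a.startswith(prefix)]

# Addresses copied from the sample rows of this report; None stands for <NA>.
addresses = [
    "인천광역시 중구 월미로 116",
    "서울특별시 중구 수표로 24(저동2가)",
    "을지로 238",
    None,
]
flagged = flag_unexpected(addresses)
print(flagged)
```

A prefix match is deliberately strict: it also flags short addresses with no city name, which need manual review rather than automatic correction.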

Duplicate rows

Most frequently occurring

Index | 연번 | 업소명 | 구분 | 도로명 주소 | 데이터 기준일자 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | 89