Overview

Dataset statistics

Number of variables: 5
Number of observations: 166
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.8 KiB
Average record size in memory: 41.8 B

Variable types

Numeric: 1
Text: 3
Categorical: 1

Dataset

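Description: Status of business sites that emit air pollutants in Bupyeong-gu, Incheon Metropolitan City (인천광역시 부평구).
Example columns: 연번 (serial number), 사업장명 (business site name), 소재지 (address), 사업장 업종 (business type), 종별 (class).
Example row: 인그리디언코리아(유) 부평공장, 인천광역시 부평구 백범로 584 (십정동), 전분 및 당류 제조시설, 허1
Author: Bupyeong-gu, Incheon Metropolitan City (인천광역시 부평구)
URL: https://data.incheon.go.kr/findData/publicDataDetail?dataId=15081065&srcSe=7661IVAWM27C61E190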

Alerts

연번 is highly overall correlated with 종별 (High correlation)
종별 is highly overall correlated with 연번 (High correlation)
연번 has unique values (Unique)

Reproduction

Analysis started: 2024-04-06 09:42:02.947297
Analysis finished: 2024-04-06 09:42:03.551063
Duration: 0.6 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
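
The reported duration can be checked directly from the two timestamps. The sketch below uses only the Python standard library; the variable names are illustrative and not part of the report.

```python
from datetime import datetime

# Timestamps copied from the "Analysis started" / "Analysis finished" fields.
started = datetime.fromisoformat("2024-04-06 09:42:02.947297")
finished = datetime.fromisoformat("2024-04-06 09:42:03.551063")

# Elapsed wall-clock time in seconds; the report rounds it to one decimal.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.1f} seconds")
```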

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 166
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 83.5
Minimum: 1
Maximum: 166
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 9.25
Q1: 42.25
Median: 83.5
Q3: 124.75
95-th percentile: 157.75
Maximum: 166
Range: 165
Interquartile range (IQR): 82.5

Descriptive statistics

Standard deviation: 48.064193
Coefficient of variation (CV): 0.57561908
Kurtosis: -1.2
Mean: 83.5
Median Absolute Deviation (MAD): 41.5
Skewness: 0
Sum: 13861
Variance: 2310.1667
Monotonicity: Strictly increasing
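
Because 연번 is strictly increasing with 166 distinct values between 1 and 166, it must be the sequence 1, 2, ..., 166, so the statistics above can be reproduced by hand. The sketch below assumes the sample variance (n - 1 denominator) and linearly interpolated percentiles; these are assumptions about how the profiler computes them, chosen because they are consistent with the reported figures.

```python
import statistics

values = list(range(1, 167))  # 연번 runs from 1 to 166 with no gaps

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, n - 1 denominator
std_dev = statistics.stdev(values)
cv = std_dev / mean                      # coefficient of variation

# The "inclusive" method interpolates linearly between the min and the max.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
twentieths = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]

# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, round(variance, 4), round(std_dev, 6), round(cv, 8))
print(p5, q1, median, q3, p95, q3 - q1, mad)
```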
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1 | 1 | 0.6%
106 | 1 | 0.6%
108 | 1 | 0.6%
109 | 1 | 0.6%
110 | 1 | 0.6%
111 | 1 | 0.6%
112 | 1 | 0.6%
113 | 1 | 0.6%
114 | 1 | 0.6%
115 | 1 | 0.6%
Other values (156) | 156 | 94.0%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 0.6%
2 | 1 | 0.6%
3 | 1 | 0.6%
4 | 1 | 0.6%
5 | 1 | 0.6%
6 | 1 | 0.6%
7 | 1 | 0.6%
8 | 1 | 0.6%
9 | 1 | 0.6%
10 | 1 | 0.6%

Maximum 10 values
Value | Count | Frequency (%)
166 | 1 | 0.6%
165 | 1 | 0.6%
164 | 1 | 0.6%
163 | 1 | 0.6%
162 | 1 | 0.6%
161 | 1 | 0.6%
160 | 1 | 0.6%
159 | 1 | 0.6%
158 | 1 | 0.6%
157 | 1 | 0.6%

사업장명
Text

Distinct: 165
Distinct (%): 99.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 KiB

Length

Max length: 24
Median length: 19
Mean length: 7.3192771
Min length: 2

Characters and Unicode

Total characters: 1215
Distinct characters: 258
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
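
As an illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, and this is not necessarily how the profiler computes its tables. `unicodedata.category` returns two-letter abbreviations (`Lo` = Other Letter, `Zs` = Space Separator, `Ps` / `Pe` = Open / Close Punctuation, `So` = Other Symbol), whereas the report shows the long names.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count the Unicode general category of every character in text."""
    return Counter(unicodedata.category(ch) for ch in text)


# First sample value of the 사업장명 column.
sample = "인그리디언코리아(유) 부평공장"
counts = category_counts(sample)
print(counts)

# The enclosed symbol ㈜ is classified as Other Symbol (So), not as a letter.
print(category_counts("㈜서울피막"))
```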

Unique

Unique: 164
Unique (%): 98.8%

Sample

1st row: 인그리디언코리아(유) 부평공장
2nd row: 한국지엠 주식회사
3rd row: 학교법인 가톨릭학원 가톨릭대학교인천성모병원
4th row: (주)진영알엔에치
5th row: 근로복지공단 인천병원

Value | Count | Frequency (%)
주식회사 | 6 | 2.9%
명신산업 | 2 | 1.0%
근로복지공단 | 2 | 1.0%
2공장 | 2 | 1.0%
아민모터스 | 1 | 0.5%
제일산업 | 1 | 0.5%
건창산업 | 1 | 0.5%
신원화학㈜ | 1 | 0.5%
차오름모터스 | 1 | 0.5%
㈜엘지생활건강 | 1 | 0.5%
Other values (186) | 186 | 91.2%

Most occurring characters

Value | Count | Frequency (%)
(blank) | 48 | 4.0%
(blank) | 46 | 3.8%
(blank) | 43 | 3.5%
(blank) | 32 | 2.6%
(blank) | 31 | 2.6%
(blank) | 29 | 2.4%
(blank) | 29 | 2.4%
(blank) | 29 | 2.4%
(blank) | 24 | 2.0%
(blank) | 24 | 2.0%
Other values (248) | 880 | 72.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1049 | 86.3%
Space Separator | 46 | 3.8%
Other Symbol | 43 | 3.5%
Open Punctuation | 18 | 1.5%
Close Punctuation | 18 | 1.5%
Decimal Number | 13 | 1.1%
Lowercase Letter | 13 | 1.1%
Uppercase Letter | 13 | 1.1%
Other Punctuation | 1 | 0.1%
Dash Punctuation | 1 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(blank) | 48 | 4.6%
(blank) | 32 | 3.1%
(blank) | 31 | 3.0%
(blank) | 29 | 2.8%
(blank) | 29 | 2.8%
(blank) | 29 | 2.8%
(blank) | 24 | 2.3%
(blank) | 24 | 2.3%
(blank) | 24 | 2.3%
(blank) | 22 | 2.1%
Other values (219) | 757 | 72.2%

Uppercase Letter
Value | Count | Frequency (%)
U | 2 | 15.4%
B | 2 | 15.4%
E | 1 | 7.7%
R | 1 | 7.7%
C | 1 | 7.7%
S | 1 | 7.7%
K | 1 | 7.7%
V | 1 | 7.7%
M | 1 | 7.7%
T | 1 | 7.7%

Lowercase Letter
Value | Count | Frequency (%)
o | 3 | 23.1%
r | 3 | 23.1%
t | 2 | 15.4%
e | 1 | 7.7%
s | 1 | 7.7%
h | 1 | 7.7%
g | 1 | 7.7%
i | 1 | 7.7%

Decimal Number
Value | Count | Frequency (%)
2 | 5 | 38.5%
1 | 5 | 38.5%
4 | 2 | 15.4%
3 | 1 | 7.7%

Space Separator
Value | Count | Frequency (%)
(space) | 46 | 100.0%

Other Symbol
Value | Count | Frequency (%)
(blank) | 43 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 18 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 18 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
& | 1 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1092 | 89.9%
Common | 97 | 8.0%
Latin | 26 | 2.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(blank) | 48 | 4.4%
(blank) | 43 | 3.9%
(blank) | 32 | 2.9%
(blank) | 31 | 2.8%
(blank) | 29 | 2.7%
(blank) | 29 | 2.7%
(blank) | 29 | 2.7%
(blank) | 24 | 2.2%
(blank) | 24 | 2.2%
(blank) | 24 | 2.2%
Other values (220) | 779 | 71.3%

Latin
Value | Count | Frequency (%)
o | 3 | 11.5%
r | 3 | 11.5%
U | 2 | 7.7%
B | 2 | 7.7%
t | 2 | 7.7%
E | 1 | 3.8%
R | 1 | 3.8%
C | 1 | 3.8%
S | 1 | 3.8%
K | 1 | 3.8%
Other values (9) | 9 | 34.6%

Common
Value | Count | Frequency (%)
(space) | 46 | 47.4%
( | 18 | 18.6%
) | 18 | 18.6%
2 | 5 | 5.2%
1 | 5 | 5.2%
4 | 2 | 2.1%
& | 1 | 1.0%
3 | 1 | 1.0%
- | 1 | 1.0%

Most occurring blocks

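Value | Count | Frequency (%)
Hangul | 1049 | 86.3%
ASCII | 123 | 10.1%
None | 43 | 3.5%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(blank) | 48 | 4.6%
(blank) | 32 | 3.1%
(blank) | 31 | 3.0%
(blank) | 29 | 2.8%
(blank) | 29 | 2.8%
(blank) | 29 | 2.8%
(blank) | 24 | 2.3%
(blank) | 24 | 2.3%
(blank) | 24 | 2.3%
(blank) | 22 | 2.1%
Other values (219) | 757 | 72.2%

ASCII
Value | Count | Frequency (%)
(space) | 46 | 37.4%
( | 18 | 14.6%
) | 18 | 14.6%
2 | 5 | 4.1%
1 | 5 | 4.1%
o | 3 | 2.4%
r | 3 | 2.4%
U | 2 | 1.6%
4 | 2 | 1.6%
B | 2 | 1.6%
Other values (18) | 19 | 15.4%

None
Value | Count | Frequency (%)
(blank) | 43 | 100.0%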

소재지
Text

Distinct: 152
Distinct (%): 91.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 KiB

Length

Max length: 41
Median length: 33
Mean length: 26.277108
Min length: 19

Characters and Unicode

Total characters: 4362
Distinct characters: 88
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 148
Unique (%): 89.2%

Sample

1st row: 인천광역시 부평구 백범로 584 (십정동)
2nd row: 인천광역시 부평구 부평대로 233 (청천동)
3rd row: 인천광역시 부평구 동수로 56(부평동)
4th row: 인천광역시 부평구 부평북로 50(청천동)
5th row: 인천광역시 부평구 무네미로 446(구산동)

Value | Count | Frequency (%)
인천광역시 | 166 | 21.9%
부평구 | 166 | 21.9%
십정동 | 30 | 4.0%
청천동 | 25 | 3.3%
부평북로 | 19 | 2.5%
서달로298번길 | 16 | 2.1%
백범로578번길 | 14 | 1.8%
백범로 | 14 | 1.8%
청천마차로 | 11 | 1.4%
65 | 10 | 1.3%
Other values (211) | 288 | 37.9%

Most occurring characters

Value | Count | Frequency (%)
(blank) | 697 | 16.0%
(blank) | 281 | 6.4%
(blank) | 218 | 5.0%
(blank) | 210 | 4.8%
(blank) | 171 | 3.9%
(blank) | 170 | 3.9%
(blank) | 167 | 3.8%
(blank) | 167 | 3.8%
(blank) | 167 | 3.8%
(blank) | 166 | 3.8%
Other values (78) | 1948 | 44.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2565 | 58.8%
Decimal Number | 713 | 16.3%
Space Separator | 697 | 16.0%
Open Punctuation | 164 | 3.8%
Close Punctuation | 164 | 3.8%
Dash Punctuation | 40 | 0.9%
Other Punctuation | 17 | 0.4%
Uppercase Letter | 2 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(blank) | 281 | 11.0%
(blank) | 218 | 8.5%
(blank) | 210 | 8.2%
(blank) | 171 | 6.7%
(blank) | 170 | 6.6%
(blank) | 167 | 6.5%
(blank) | 167 | 6.5%
(blank) | 167 | 6.5%
(blank) | 166 | 6.5%
(blank) | 164 | 6.4%
Other values (61) | 684 | 26.7%

Decimal Number
Value | Count | Frequency (%)
1 | 110 | 15.4%
2 | 97 | 13.6%
8 | 77 | 10.8%
5 | 76 | 10.7%
4 | 69 | 9.7%
3 | 66 | 9.3%
7 | 61 | 8.6%
6 | 56 | 7.9%
0 | 53 | 7.4%
9 | 48 | 6.7%

Uppercase Letter
Value | Count | Frequency (%)
B | 1 | 50.0%
C | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 697 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 164 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 164 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 40 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 17 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2565 | 58.8%
Common | 1795 | 41.2%
Latin | 2 | < 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(blank) | 281 | 11.0%
(blank) | 218 | 8.5%
(blank) | 210 | 8.2%
(blank) | 171 | 6.7%
(blank) | 170 | 6.6%
(blank) | 167 | 6.5%
(blank) | 167 | 6.5%
(blank) | 167 | 6.5%
(blank) | 166 | 6.5%
(blank) | 164 | 6.4%
Other values (61) | 684 | 26.7%

Common
Value | Count | Frequency (%)
(space) | 697 | 38.8%
( | 164 | 9.1%
) | 164 | 9.1%
1 | 110 | 6.1%
2 | 97 | 5.4%
8 | 77 | 4.3%
5 | 76 | 4.2%
4 | 69 | 3.8%
3 | 66 | 3.7%
7 | 61 | 3.4%
Other values (5) | 214 | 11.9%

Latin
Value | Count | Frequency (%)
B | 1 | 50.0%
C | 1 | 50.0%

Most occurring blocks

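Value | Count | Frequency (%)
Hangul | 2565 | 58.8%
ASCII | 1797 | 41.2%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 697 | 38.8%
( | 164 | 9.1%
) | 164 | 9.1%
1 | 110 | 6.1%
2 | 97 | 5.4%
8 | 77 | 4.3%
5 | 76 | 4.2%
4 | 69 | 3.8%
3 | 66 | 3.7%
7 | 61 | 3.4%
Other values (7) | 216 | 12.0%

Hangul
Value | Count | Frequency (%)
(blank) | 281 | 11.0%
(blank) | 218 | 8.5%
(blank) | 210 | 8.2%
(blank) | 171 | 6.7%
(blank) | 170 | 6.6%
(blank) | 167 | 6.5%
(blank) | 167 | 6.5%
(blank) | 167 | 6.5%
(blank) | 166 | 6.5%
(blank) | 164 | 6.4%
Other values (61) | 684 | 26.7%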

사업장 업종
Text

Distinct: 78
Distinct (%): 47.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 KiB

Length

Max length: 40
Median length: 22
Mean length: 9.0301205
Min length: 1

Characters and Unicode

Total characters: 1499
Distinct characters: 147
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 64
Unique (%): 38.6%

Sample

1st row: 전분 및 당류 제조시설
2nd row: 승용차 및 여객용 자동차 제조업
3rd row: 병원
4th row: 금속가공제품 제조시설
5th row: 병원시설

Value | Count | Frequency (%)
도장및기타피막처리업 | 21 | 7.3%
자동차종합수리업 | 20 | 6.9%
(blank) | 17 | 5.9%
도금시설 | 14 | 4.9%
자동차정비업 | 13 | 4.5%
기타 | 13 | 4.5%
제조업 | 13 | 4.5%
그외 | 8 | 2.8%
도금업 | 7 | 2.4%
금속가공업 | 7 | 2.4%
Other values (116) | 155 | 53.8%

Most occurring characters

Value | Count | Frequency (%)
(blank) | 130 | 8.7%
(blank) | 122 | 8.1%
(blank) | 62 | 4.1%
(blank) | 62 | 4.1%
(blank) | 57 | 3.8%
(blank) | 52 | 3.5%
(blank) | 51 | 3.4%
(blank) | 48 | 3.2%
(blank) | 47 | 3.1%
(blank) | 46 | 3.1%
Other values (137) | 822 | 54.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1362 | 90.9%
Space Separator | 122 | 8.1%
Other Punctuation | 6 | 0.4%
Open Punctuation | 2 | 0.1%
Close Punctuation | 2 | 0.1%
Decimal Number | 2 | 0.1%
Uppercase Letter | 2 | 0.1%
Lowercase Letter | 1 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(blank) | 130 | 9.5%
(blank) | 62 | 4.6%
(blank) | 62 | 4.6%
(blank) | 57 | 4.2%
(blank) | 52 | 3.8%
(blank) | 51 | 3.7%
(blank) | 48 | 3.5%
(blank) | 47 | 3.5%
(blank) | 46 | 3.4%
(blank) | 43 | 3.2%
Other values (128) | 764 | 56.1%

Other Punctuation
Value | Count | Frequency (%)
, | 5 | 83.3%
· | 1 | 16.7%

Uppercase Letter
Value | Count | Frequency (%)
U | 1 | 50.0%
V | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 122 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 2 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 2 | 100.0%

Decimal Number
Value | Count | Frequency (%)
1 | 2 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
s | 1 | 100.0%

Most occurring scripts

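Value | Count | Frequency (%)
Hangul | 1362 | 90.9%
Common | 134 | 8.9%
Latin | 3 | 0.2%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(blank) | 130 | 9.5%
(blank) | 62 | 4.6%
(blank) | 62 | 4.6%
(blank) | 57 | 4.2%
(blank) | 52 | 3.8%
(blank) | 51 | 3.7%
(blank) | 48 | 3.5%
(blank) | 47 | 3.5%
(blank) | 46 | 3.4%
(blank) | 43 | 3.2%
Other values (128) | 764 | 56.1%

Common
Value | Count | Frequency (%)
(space) | 122 | 91.0%
, | 5 | 3.7%
( | 2 | 1.5%
) | 2 | 1.5%
1 | 2 | 1.5%
· | 1 | 0.7%

Latin
Value | Count | Frequency (%)
U | 1 | 33.3%
V | 1 | 33.3%
s | 1 | 33.3%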

Most occurring blocks

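Value | Count | Frequency (%)
Hangul | 1362 | 90.9%
ASCII | 136 | 9.1%
None | 1 | 0.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(blank) | 130 | 9.5%
(blank) | 62 | 4.6%
(blank) | 62 | 4.6%
(blank) | 57 | 4.2%
(blank) | 52 | 3.8%
(blank) | 51 | 3.7%
(blank) | 48 | 3.5%
(blank) | 47 | 3.5%
(blank) | 46 | 3.4%
(blank) | 43 | 3.2%
Other values (128) | 764 | 56.1%

ASCII
Value | Count | Frequency (%)
(space) | 122 | 89.7%
, | 5 | 3.7%
( | 2 | 1.5%
) | 2 | 1.5%
1 | 2 | 1.5%
U | 1 | 0.7%
V | 1 | 0.7%
s | 1 | 0.7%

None
Value | Count | Frequency (%)
· | 1 | 100.0%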

종별
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 0.6%

Sample

1st row: 허1
2nd row: 허1
3rd row: 신3
4th row: 신4
5th row: 신4

Common Values

Value | Count | Frequency (%)
신5 | 88 | 53.0%
신4 | 49 | 29.5%
허5 | 17 | 10.2%
허4 | 9 | 5.4%
허1 | 2 | 1.2%
신3 | 1 | 0.6%
0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
신5 | 88 | 53.0%
신4 | 49 | 29.5%
허5 | 17 | 10.2%
허4 | 9 | 5.4%
허1 | 2 | 1.2%
신3 | 1 | 0.6%

Interactions


Correlations

Variable | 연번 | 사업장 업종 | 종별
연번 | 1.000 | 0.795 | 0.740
사업장 업종 | 0.795 | 1.000 | 0.946
종별 | 0.740 | 0.946 | 1.000

Variable | 연번 | 종별
연번 | 1.000 | 0.507
종별 | 0.507 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
index | 연번 | 사업장명 | 소재지 | 사업장 업종 | 종별
0 | 1 | 인그리디언코리아(유) 부평공장 | 인천광역시 부평구 백범로 584 (십정동) | 전분 및 당류 제조시설 | 허1
1 | 2 | 한국지엠 주식회사 | 인천광역시 부평구 부평대로 233 (청천동) | 승용차 및 여객용 자동차 제조업 | 허1
2 | 3 | 학교법인 가톨릭학원 가톨릭대학교인천성모병원 | 인천광역시 부평구 동수로 56(부평동) | 병원 | 신3
3 | 4 | (주)진영알엔에치 | 인천광역시 부평구 부평북로 50(청천동) | 금속가공제품 제조시설 | 신4
4 | 5 | 근로복지공단 인천병원 | 인천광역시 부평구 무네미로 446(구산동) | 병원시설 | 신4
5 | 6 | 서부사료㈜ | 인천광역시 부평구 부평북로 325(갈산동) | 동물용 사료 및 조제식품 제조업 | 신4
6 | 7 | ㈜서울피막 | 인천광역시 부평구 장제로 419, 421(갈산동) | 도금착색 및 기타표면처리 강재제조업 | 신4
7 | 8 | 인천탁주제조제1공장 | 인천광역시 부평구 안남로433번길 26(청천동) | s | 신4
8 | 9 | ㈜일신자동차 | 인천광역시 부평구 서촌로 23(일신동) | 자동차정비업 | 신4
9 | 10 | (주)삼성금속 | 인천광역시 부평구 부평북로 149(청천동) | 금속가공제품 제조시설 | 신4

Last rows
index | 연번 | 사업장명 | 소재지 | 사업장 업종 | 종별
156 | 157 | 현대건설㈜캠프마켓 | 인천광역시 부평구 부흥로 144번길 15(산곡동) | 토양 및 지하수 정화업 | 허5
157 | 158 | 1급 계양스카이 자동차공업사 | 인천광역시 부평구 청천동 373-37 | 자동차종합수리업 | 신5
158 | 159 | 재단법인 인천광역시 부평구 문화재단 | 인천광역시 부평구 아트센터로 166(십정동) | 공연장대관업 | 신5
159 | 160 | 근로복지공단 인천북부지사 | 인천광역시 부평구 무네미로 478(구산동) | 금융 및 보험업 | 신5
160 | 161 | 갑도물산(주) 부평엠에이치타워 | 인천광역시 부평구 시장로7(부평동) | 부동산 임대업 | 신5
161 | 162 | 썬텍 | 인천광역시 부평구 부평북로 245(청천동) | 전기용 탄소제품 및 절연제품 제조업 | 신5
162 | 163 | ㈜에스앤에이이알 | 인천광역시 부평구 가좌로84번길 67(십정동) | 지정 외 폐기물 처리업 | 신4
163 | 164 | KS 모터스 | 인천광역시 부평구 부평북로 9(청천동) 4층 | 자동차종합수리업 | 신5
164 | 165 | 현대 U V | 인천광역시 부평구 청천마차로 184(청천동), 3층 | 플라스틱 용기 U V 코팅 | 신5
165 | 166 | 부평남부체육센터 | 인천광역시 부평구 부평동 663-30외 1필지 | 공공기관 | 신4