Overview

Dataset statistics

Number of variables: 5
Number of observations: 202
Missing cells: 99
Missing cells (%): 9.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 8.2 KiB
Average record size in memory: 41.7 B

Variable types

Numeric: 1
Text: 4

Dataset

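Description: Information on businesses that handle chemical substances in Gumi-si, Gyeongsangbuk-do. It provides the business name, the address of the business's depot or the address where the business is located, the business's telephone number, and the industry classification name.
URL: https://www.data.go.kr/data/15039541/fileData.do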

Alerts

주소 (address) has 6 (3.0%) missing values [Missing]
전화번호 (telephone number) has 11 (5.4%) missing values [Missing]
분류명 (classification name) has 82 (40.6%) missing values [Missing]
연번 (serial number) has unique values [Unique]
업체명 (business name) has unique values [Unique]
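
The missing-value figures can be cross-checked with a little arithmetic. The sketch below uses only counts reported in this document (202 rows, 5 variables, and the per-column missing counts from the alerts). The variable names are illustrative and not part of the report.

```python
# Counts copied from the report; the names below are illustrative.
n_rows, n_cols = 202, 5
missing_per_column = {"주소": 6, "전화번호": 11, "분류명": 82}

# Per-column share of missing values, relative to the number of rows.
column_pct = {
    name: round(100 * count / n_rows, 1)
    for name, count in missing_per_column.items()
}

# Dataset-level share, relative to the total number of cells (rows x columns).
missing_cells = sum(missing_per_column.values())
missing_cells_pct = round(100 * missing_cells / (n_rows * n_cols), 1)

print(column_pct)
print(missing_cells, missing_cells_pct)
```

The per-column percentages are taken over rows, while the dataset-level percentage is taken over all 1010 cells, which is why 99 missing cells amount to 9.8%.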

Reproduction

Analysis started: 2023-12-12 03:24:27.282409
Analysis finished: 2023-12-12 03:24:28.211109
Duration: 0.93 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
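
The reported duration follows directly from the two timestamps. A minimal check with the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamps copied from the Reproduction panel.
started = datetime.fromisoformat("2023-12-12 03:24:27.282409")
finished = datetime.fromisoformat("2023-12-12 03:24:28.211109")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```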

Variables

연번
Real number (ℝ)

UNIQUE 

Distinct: 202
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 101.5
Minimum: 1
Maximum: 202
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 1
5th percentile: 11.05
Q1: 51.25
Median: 101.5
Q3: 151.75
95th percentile: 191.95
Maximum: 202
Range: 201
Interquartile range (IQR): 100.5

Descriptive statistics

Standard deviation: 58.456537
Coefficient of variation (CV): 0.57592647
Kurtosis: -1.2
Mean: 101.5
Median Absolute Deviation (MAD): 50.5
Skewness: 0
Sum: 20503
Variance: 3417.1667
Monotonicity: Strictly increasing
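
These statistics are what one would expect from a plain running index. The sketch below assumes 연번 holds the integers 1 through 202 (consistent with the reported minimum, maximum, distinct count, and sum, but an assumption nonetheless) and recomputes the figures with Python's standard library. Quartiles use linear interpolation (`method="inclusive"`), which matches the reported Q1 and Q3.

```python
import statistics

# Assumption: 연번 is the integer sequence 1..202.
values = list(range(1, 203))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (divides by n - 1)
std_dev = statistics.stdev(values)       # sample standard deviation
cv = std_dev / mean                      # coefficient of variation

# Quartiles with linear interpolation between data points.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, round(variance, 4), round(std_dev, 6), round(cv, 8))
print(q1, median, q3, iqr, mad)
```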
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | 0.5%
140 | 1 | 0.5%
130 | 1 | 0.5%
131 | 1 | 0.5%
132 | 1 | 0.5%
133 | 1 | 0.5%
134 | 1 | 0.5%
135 | 1 | 0.5%
136 | 1 | 0.5%
137 | 1 | 0.5%
Other values (192) | 192 | 95.0%
Value | Count | Frequency (%)
1 | 1 | 0.5%
2 | 1 | 0.5%
3 | 1 | 0.5%
4 | 1 | 0.5%
5 | 1 | 0.5%
6 | 1 | 0.5%
7 | 1 | 0.5%
8 | 1 | 0.5%
9 | 1 | 0.5%
10 | 1 | 0.5%
Value | Count | Frequency (%)
202 | 1 | 0.5%
201 | 1 | 0.5%
200 | 1 | 0.5%
199 | 1 | 0.5%
198 | 1 | 0.5%
197 | 1 | 0.5%
196 | 1 | 0.5%
195 | 1 | 0.5%
194 | 1 | 0.5%
193 | 1 | 0.5%

업체명
Text

UNIQUE 

Distinct: 202
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Length

Max length: 24
Median length: 14
Mean length: 9.0148515
Min length: 2

Characters and Unicode

Total characters: 1821
Distinct characters: 239
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
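
As an illustration of these character properties, the sketch below classifies the characters of the five 업체명 sample rows shown in this report by Unicode general category, using Python's standard `unicodedata` module. The `CATEGORY_NAMES` mapping is a hand-written helper covering only the categories listed for this variable. The counts it produces describe these five rows, not the full column.

```python
import unicodedata
from collections import Counter

# The five sample rows shown for 업체명 in this report.
sample_names = [
    "(주)케이엠씨",
    "금오화공약품",
    "(주)남선알미늄 구미공장",
    "대성화공",
    "(주)한국이차전지",
]

# Long names for the two-letter general-category codes that occur here.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Zs": "Space Separator",
    "So": "Other Symbol",
}

# Count every character by its general category.
category_counts = Counter(
    CATEGORY_NAMES.get(unicodedata.category(ch), unicodedata.category(ch))
    for name in sample_names
    for ch in name
)

for category, count in category_counts.most_common():
    print(category, count)
```

Hangul syllables fall under Other Letter, which is why that category dominates. The enclosed form ㈜ that appears in some names is a single code point classified as Other Symbol, unlike the three-character spelling (주).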

Unique

Unique: 202
Unique (%): 100.0%

Sample

1st row: (주)케이엠씨
2nd row: 금오화공약품
3rd row: (주)남선알미늄 구미공장
4th row: 대성화공
5th row: (주)한국이차전지
Value | Count | Frequency (%)
구미공장 | 9 | 3.5%
구미2공장 | 5 | 1.9%
도레이첨단소재(주 | 4 | 1.5%
구미지점 | 4 | 1.5%
2공장 | 4 | 1.5%
주)원익큐엔씨 | 4 | 1.5%
sk실트론(주 | 3 | 1.2%
1공장 | 3 | 1.2%
구미사업장 | 3 | 1.2%
구미1공장 | 3 | 1.2%
Other values (203) | 217 | 83.8%

Most occurring characters

ValueCountFrequency (%)
) 151
 
8.3%
( 150
 
8.2%
145
 
8.0%
60
 
3.3%
60
 
3.3%
54
 
3.0%
51
 
2.8%
47
 
2.6%
43
 
2.4%
40
 
2.2%
Other values (229) 1020
56.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1410 | 77.4%
Close Punctuation | 151 | 8.3%
Open Punctuation | 151 | 8.3%
Space Separator | 60 | 3.3%
Decimal Number | 25 | 1.4%
Uppercase Letter | 20 | 1.1%
Other Symbol | 4 | 0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
145
 
10.3%
60
 
4.3%
54
 
3.8%
51
 
3.6%
47
 
3.3%
43
 
3.0%
40
 
2.8%
32
 
2.3%
29
 
2.1%
28
 
2.0%
Other values (210) 881
62.5%
Uppercase Letter
ValueCountFrequency (%)
C 4
20.0%
K 4
20.0%
S 4
20.0%
A 2
10.0%
M 1
 
5.0%
B 1
 
5.0%
N 1
 
5.0%
V 1
 
5.0%
T 1
 
5.0%
E 1
 
5.0%
Decimal Number
ValueCountFrequency (%)
2 13
52.0%
1 7
28.0%
3 3
 
12.0%
4 2
 
8.0%
Open Punctuation
ValueCountFrequency (%)
( 150
99.3%
[ 1
 
0.7%
Close Punctuation
ValueCountFrequency (%)
) 151
100.0%
Space Separator
ValueCountFrequency (%)
60
100.0%
Other Symbol
ValueCountFrequency (%)
4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 1414
77.6%
Common 387
 
21.3%
Latin 20
 
1.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
145
 
10.3%
60
 
4.2%
54
 
3.8%
51
 
3.6%
47
 
3.3%
43
 
3.0%
40
 
2.8%
32
 
2.3%
29
 
2.1%
28
 
2.0%
Other values (211) 885
62.6%
Latin
ValueCountFrequency (%)
C 4
20.0%
K 4
20.0%
S 4
20.0%
A 2
10.0%
M 1
 
5.0%
B 1
 
5.0%
N 1
 
5.0%
V 1
 
5.0%
T 1
 
5.0%
E 1
 
5.0%
Common
ValueCountFrequency (%)
) 151
39.0%
( 150
38.8%
60
 
15.5%
2 13
 
3.4%
1 7
 
1.8%
3 3
 
0.8%
4 2
 
0.5%
[ 1
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 1410
77.4%
ASCII 407
 
22.4%
None 4
 
0.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
) 151
37.1%
( 150
36.9%
60
 
14.7%
2 13
 
3.2%
1 7
 
1.7%
C 4
 
1.0%
K 4
 
1.0%
S 4
 
1.0%
3 3
 
0.7%
4 2
 
0.5%
Other values (8) 9
 
2.2%
Hangul
ValueCountFrequency (%)
145
 
10.3%
60
 
4.3%
54
 
3.8%
51
 
3.6%
47
 
3.3%
43
 
3.0%
40
 
2.8%
32
 
2.3%
29
 
2.1%
28
 
2.0%
Other values (210) 881
62.5%
None
ValueCountFrequency (%)
4
100.0%

주소
Text

MISSING 

Distinct: 183
Distinct (%): 93.4%
Missing: 6
Missing (%): 3.0%
Memory size: 1.7 KiB
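
Note that 183 distinct values out of 202 rows would be 90.6%, not 93.4%. The reported percentages match when the denominator is the number of non-missing rows (196 here). This is inferred from the numbers in this report rather than taken from the tool's documentation. The sketch below repeats the check for the three text variables that have missing values, using the distinct and unique counts reported in their respective sections.

```python
n_rows = 202

# name: (distinct, unique, missing), all copied from this report.
reported = {
    "주소": (183, 172, 6),
    "전화번호": (186, 181, 11),
    "분류명": (71, 65, 82),
}

percentages = {}
for name, (distinct, unique, missing) in reported.items():
    present = n_rows - missing  # rows that actually hold a value
    percentages[name] = (
        round(100 * distinct / present, 1),  # Distinct (%)
        round(100 * unique / present, 1),    # Unique (%)
    )

print(percentages)
```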

Length

Max length: 32
Median length: 27.5
Mean length: 22.515306
Min length: 15

Characters and Unicode

Total characters: 4413
Distinct characters: 99
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 172
Unique (%): 87.8%

Sample

1st row: 경상북도 구미시 3공단3로 40 (시미동)
2nd row: 경상북도 구미시 1공단로 86-43 (공단동)
3rd row: 경상북도 구미시 수출대로9길 80 (공단동)
4th row: 경상북도 구미시 신비로 186 (비산동)
5th row: 경상북도 구미시 1공단로6길 134 (공단동)
Value | Count | Frequency (%)
경상북도 | 196 | 21.9%
구미시 | 194 | 21.7%
공단동 | 26 | 2.9%
산동면 | 25 | 2.8%
옥계2공단로 | 15 | 1.7%
1공단로6길 | 13 | 1.5%
3공단3로 | 11 | 1.2%
수출대로 | 11 | 1.2%
산호대로 | 10 | 1.1%
3공단2로 | 10 | 1.1%
Other values (260) | 384 | 42.9%

Most occurring characters

ValueCountFrequency (%)
699
 
15.8%
220
 
5.0%
220
 
5.0%
210
 
4.8%
204
 
4.6%
197
 
4.5%
197
 
4.5%
1 197
 
4.5%
196
 
4.4%
186
 
4.2%
Other values (89) 1887
42.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2568 | 58.2%
Decimal Number | 850 | 19.3%
Space Separator | 699 | 15.8%
Open Punctuation | 102 | 2.3%
Close Punctuation | 101 | 2.3%
Dash Punctuation | 92 | 2.1%
Other Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
220
 
8.6%
220
 
8.6%
210
 
8.2%
204
 
7.9%
197
 
7.7%
197
 
7.7%
196
 
7.6%
186
 
7.2%
150
 
5.8%
138
 
5.4%
Other values (74) 650
25.3%
Decimal Number
ValueCountFrequency (%)
1 197
23.2%
2 126
14.8%
3 116
13.6%
4 75
 
8.8%
7 67
 
7.9%
5 61
 
7.2%
6 60
 
7.1%
0 53
 
6.2%
8 51
 
6.0%
9 44
 
5.2%
Space Separator
ValueCountFrequency (%)
699
100.0%
Open Punctuation
ValueCountFrequency (%)
( 102
100.0%
Close Punctuation
ValueCountFrequency (%)
) 101
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 92
100.0%
Other Punctuation
ValueCountFrequency (%)
\ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 2568
58.2%
Common 1845
41.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
220
 
8.6%
220
 
8.6%
210
 
8.2%
204
 
7.9%
197
 
7.7%
197
 
7.7%
196
 
7.6%
186
 
7.2%
150
 
5.8%
138
 
5.4%
Other values (74) 650
25.3%
Common
ValueCountFrequency (%)
699
37.9%
1 197
 
10.7%
2 126
 
6.8%
3 116
 
6.3%
( 102
 
5.5%
) 101
 
5.5%
- 92
 
5.0%
4 75
 
4.1%
7 67
 
3.6%
5 61
 
3.3%
Other values (5) 209
 
11.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 2568
58.2%
ASCII 1845
41.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
699
37.9%
1 197
 
10.7%
2 126
 
6.8%
3 116
 
6.3%
( 102
 
5.5%
) 101
 
5.5%
- 92
 
5.0%
4 75
 
4.1%
7 67
 
3.6%
5 61
 
3.3%
Other values (5) 209
 
11.3%
Hangul
ValueCountFrequency (%)
220
 
8.6%
220
 
8.6%
210
 
8.2%
204
 
7.9%
197
 
7.7%
197
 
7.7%
196
 
7.6%
186
 
7.2%
150
 
5.8%
138
 
5.4%
Other values (74) 650
25.3%

전화번호
Text

MISSING 

Distinct: 186
Distinct (%): 97.4%
Missing: 11
Missing (%): 5.4%
Memory size: 1.7 KiB

Length

Max length: 13
Median length: 12
Mean length: 12.04712
Min length: 12

Characters and Unicode

Total characters: 2301
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 181
Unique (%): 94.8%

Sample

1st row: 054-472-0051
2nd row: 054-461-5191
3rd row: 054-460-0321
4th row: 054-464-2187
5th row: 054-463-7091
Value | Count | Frequency (%)
054-465-6657 | 2 | 1.0%
054-470-9122 | 2 | 1.0%
054-714-3028 | 2 | 1.0%
070-4495-0008 | 2 | 1.0%
054-974-6028 | 2 | 1.0%
054-971-9800 | 1 | 0.5%
054-473-4657 | 1 | 0.5%
054-472-5855 | 1 | 0.5%
054-974-4000 | 1 | 0.5%
054-715-3200 | 1 | 0.5%
Other values (176) | 176 | 92.1%

Most occurring characters

Value | Count | Frequency (%)
4 | 429 | 18.6%
0 | 382 | 16.6%
- | 382 | 16.6%
5 | 290 | 12.6%
7 | 182 | 7.9%
1 | 149 | 6.5%
6 | 135 | 5.9%
2 | 127 | 5.5%
8 | 95 | 4.1%
3 | 70 | 3.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1919 | 83.4%
Dash Punctuation | 382 | 16.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 429
22.4%
0 382
19.9%
5 290
15.1%
7 182
9.5%
1 149
 
7.8%
6 135
 
7.0%
2 127
 
6.6%
8 95
 
5.0%
3 70
 
3.6%
9 60
 
3.1%
Dash Punctuation
ValueCountFrequency (%)
- 382
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2301
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 429
18.6%
0 382
16.6%
- 382
16.6%
5 290
12.6%
7 182
7.9%
1 149
 
6.5%
6 135
 
5.9%
2 127
 
5.5%
8 95
 
4.1%
3 70
 
3.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2301
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 429
18.6%
0 382
16.6%
- 382
16.6%
5 290
12.6%
7 182
7.9%
1 149
 
6.5%
6 135
 
5.9%
2 127
 
5.5%
8 95
 
4.1%
3 70
 
3.0%

분류명
Text

MISSING 

Distinct: 71
Distinct (%): 59.2%
Missing: 82
Missing (%): 40.6%
Memory size: 1.7 KiB

Length

Max length: 20
Median length: 18
Mean length: 7
Min length: 2

Characters and Unicode

Total characters: 840
Distinct characters: 135
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
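
The reported mean lengths of the four text variables are consistent with dividing each variable's total character count by its number of non-missing rows. The sketch below checks this using only figures from this report. The dictionary layout is illustrative.

```python
n_rows = 202

# name: (total characters, missing rows, reported mean length)
reported = {
    "업체명": (1821, 0, 9.0148515),
    "주소": (4413, 6, 22.515306),
    "전화번호": (2301, 11, 12.04712),
    "분류명": (840, 82, 7.0),
}

# Mean length = total characters / rows that hold a value.
mean_lengths = {
    name: total_chars / (n_rows - missing)
    for name, (total_chars, missing, _) in reported.items()
}

for name, computed in mean_lengths.items():
    print(name, round(computed, 6), reported[name][2])
```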

Unique

Unique: 65
Unique (%): 54.2%

Sample

1st row: 기타 기초무기
2nd row: 제조
3rd row: 제조업
4th row: 도매.제조
5th row: 제조업
ValueCountFrequency (%)
제조업 43
22.2%
페인트판매업 13
 
6.7%
제조 10
 
5.2%
기타 9
 
4.6%
8
 
4.1%
7
 
3.6%
전자부품제조업 3
 
1.5%
화공약품 3
 
1.5%
3
 
1.5%
기초 2
 
1.0%
Other values (88) 93
47.9%

Most occurring characters

ValueCountFrequency (%)
90
 
10.7%
82
 
9.8%
79
 
9.4%
74
 
8.8%
45
 
5.4%
22
 
2.6%
18
 
2.1%
15
 
1.8%
15
 
1.8%
14
 
1.7%
Other values (125) 386
46.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 735 | 87.5%
Space Separator | 74 | 8.8%
Decimal Number | 11 | 1.3%
Other Punctuation | 7 | 0.8%
Close Punctuation | 4 | 0.5%
Open Punctuation | 4 | 0.5%
Uppercase Letter | 3 | 0.4%
Dash Punctuation | 2 | 0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
90
 
12.2%
82
 
11.2%
79
 
10.7%
45
 
6.1%
22
 
3.0%
18
 
2.4%
15
 
2.0%
15
 
2.0%
14
 
1.9%
14
 
1.9%
Other values (108) 341
46.4%
Decimal Number
ValueCountFrequency (%)
0 3
27.3%
5 3
27.3%
2 1
 
9.1%
9 1
 
9.1%
7 1
 
9.1%
1 1
 
9.1%
4 1
 
9.1%
Other Punctuation
ValueCountFrequency (%)
. 3
42.9%
/ 3
42.9%
, 1
 
14.3%
Uppercase Letter
ValueCountFrequency (%)
L 1
33.3%
C 1
33.3%
D 1
33.3%
Space Separator
ValueCountFrequency (%)
74
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4
100.0%
Open Punctuation
ValueCountFrequency (%)
( 4
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 735
87.5%
Common 102
 
12.1%
Latin 3
 
0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
90
 
12.2%
82
 
11.2%
79
 
10.7%
45
 
6.1%
22
 
3.0%
18
 
2.4%
15
 
2.0%
15
 
2.0%
14
 
1.9%
14
 
1.9%
Other values (108) 341
46.4%
Common
ValueCountFrequency (%)
74
72.5%
) 4
 
3.9%
( 4
 
3.9%
0 3
 
2.9%
5 3
 
2.9%
. 3
 
2.9%
/ 3
 
2.9%
- 2
 
2.0%
2 1
 
1.0%
9 1
 
1.0%
Other values (4) 4
 
3.9%
Latin
ValueCountFrequency (%)
L 1
33.3%
C 1
33.3%
D 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 735
87.5%
ASCII 105
 
12.5%

Most frequent character per block

Hangul
ValueCountFrequency (%)
90
 
12.2%
82
 
11.2%
79
 
10.7%
45
 
6.1%
22
 
3.0%
18
 
2.4%
15
 
2.0%
15
 
2.0%
14
 
1.9%
14
 
1.9%
Other values (108) 341
46.4%
ASCII
ValueCountFrequency (%)
74
70.5%
) 4
 
3.8%
( 4
 
3.8%
0 3
 
2.9%
5 3
 
2.9%
. 3
 
2.9%
/ 3
 
2.9%
- 2
 
1.9%
2 1
 
1.0%
9 1
 
1.0%
Other values (7) 7
 
6.7%

Interactions


Correlations

 | 연번 | 분류명
연번 | 1.000 | 0.831
분류명 | 0.831 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

Index | 연번 | 업체명 | 주소 | 전화번호 | 분류명
0 | 1 | (주)케이엠씨 | 경상북도 구미시 3공단3로 40 (시미동) | 054-472-0051 | 기타 기초무기
1 | 2 | 금오화공약품 | 경상북도 구미시 1공단로 86-43 (공단동) | 054-461-5191 | 제조
2 | 3 | (주)남선알미늄 구미공장 | 경상북도 구미시 수출대로9길 80 (공단동) | 054-460-0321 | 제조업
3 | 4 | 대성화공 | 경상북도 구미시 신비로 186 (비산동) | 054-464-2187 | 도매.제조
4 | 5 | (주)한국이차전지 | 경상북도 구미시 1공단로6길 134 (공단동) | 054-463-7091 | 제조업
5 | 6 | 대한화공약품 | 경상북도 구미시 왕산로5길 14-4 (임은동) | 054-453-1454 | 도.소매
6 | 7 | 도레이첨단소재(주) 구미1공장 | 경상북도 구미시 구미대로 102(공단동) | 054-469-4767 | 사진용화학제품 및 감광재료제조업
7 | 8 | 도레이첨단소재(주) 구미4공장 | 경상북도 구미시 4공단로 249-29(금전동) | 054-479-6642 | 제조업
8 | 9 | 도레이첨단소재(주) | 경상북도 구미시 3공단2로 300(임수동) | 054-479-6153 | 제조업
9 | 10 | 도레이첨단소재(주) 구미2공장 | 경상북도 구미시 1공단로4길 141-11(공단동) | 054-469-4776 | 제조업
Index | 연번 | 업체명 | 주소 | 전화번호 | 분류명
192 | 193 | ㈜뉴나노 | <NA> | <NA> | <NA>
193 | 194 | 구미맑은물(주) | 경상북도 구미시 오태8길 27-12 | 054-715-3300 | <NA>
194 | 195 | 새론(주) | <NA> | <NA> | <NA>
195 | 196 | 바세로 | <NA> | 054-461-2600 | <NA>
196 | 197 | ㈜태영테크폴 | <NA> | 054-974-6028 | <NA>
197 | 198 | ㈜이레테크 | 경상북도 구미시 산동읍 첨단기업로 127-33 | 054-461-9620 | 반도체부품제조업
198 | 199 | (주)원익큐엔씨 황상지점(제조업) | <NA> | <NA> | <NA>
199 | 200 | (주)에프엑스티 | 경상북도 구미시 1공단로4길 38-43 | 054-604-3200 | 제조업
200 | 201 | (주)재영텍 | 경상북도 구미시 1공단로 10길 90 | 054-462-4006 | 기타 기초 무기 화학물질 제조업
201 | 202 | [주)엘지에이치와이비씨엠 | 경상북도 구미시 산동읍 동곡리 산70-2 | 054-717-0029 | 축전지 제조업