Overview

Dataset statistics

Number of variables: 4
Number of observations: 2227
Missing cells: 26
Missing cells (%): 0.3%
Duplicate rows: 2
Duplicate rows (%): 0.1%
Total size in memory: 69.7 KiB
Average record size in memory: 32.1 B
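
The percentages above follow directly from the counts. The sketch below is a quick sanity check, assuming the missing-cell percentage is taken over all cells (variables × observations) and the duplicate percentage over all rows; the variable names are illustrative.

```python
# Counts copied from the dataset statistics above.
n_variables = 4
n_observations = 2227
missing_cells = 26
duplicate_rows = 2
memory_kib = 69.7  # already rounded in the report

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
# Only approximate, because the memory total is itself a rounded figure.
avg_record_bytes = memory_kib * 1024 / n_observations

print(f"missing cells:  {missing_pct:.1f}%")
print(f"duplicate rows: {duplicate_pct:.1f}%")
print(f"avg record:     about {avg_record_bytes:.0f} B")
```

The record size recomputed this way lands slightly below the reported 32.1 B, which is consistent with the memory total having been rounded down to 69.7 KiB.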

Variable types

Categorical: 1
Text: 3

Dataset

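Description: Data on the status of noise and vibration emission facilities in Gimhae-si, Gyeongsangnam-do. It provides the noise/vibration classification (구분), business name (업체명), industry type (업종), and location address (소재지주소).
Author: Gimhae-si, Gyeongsangnam-do (경상남도 김해시)
URL: https://www.data.go.kr/data/15093331/fileData.do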

Alerts

Dataset has 2 (0.1%) duplicate rows (Duplicates)
구분 is highly imbalanced (70.4%) (Imbalance)
업종 has 26 (1.2%) missing values (Missing)
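
The 70.4% imbalance figure can be approximately reproduced from the 구분 value counts listed in the Common Values table. The sketch below assumes the score is one minus the Shannon entropy of the class frequencies, normalised by the log of the number of classes; that formula is an assumption about how the alert is computed, and `imbalance_score` is an illustrative name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k): 0 means perfectly balanced, 1 means a single class.

    H is the Shannon entropy of the class frequencies, k the number of classes.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)

# Counts of the 7 distinct 구분 values, as listed in the Common Values table.
gubun_counts = [1780, 405, 36, 3, 1, 1, 1]
score = imbalance_score(gubun_counts)
print(f"imbalance: {score:.3f}")
```

Under this assumption the result lands at roughly 0.70, in line with the alert.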

Reproduction

Analysis started: 2023-12-12 12:27:25.443508
Analysis finished: 2023-12-12 12:27:26.242315
Duration: 0.8 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
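
A report of this kind can be regenerated along the following lines. This is a minimal sketch, not the exact invocation used here: the CSV path, the output file name, the report title, and the `cp949` encoding default are all assumptions, since the report does not state them.

```python
def build_report(csv_path, output_html="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report.

    csv_path, output_html and the cp949 default are illustrative assumptions;
    the original file name and encoding are not given in this report.
    """
    # Imported lazily so the function can be defined without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Gimhae noise and vibration emission facilities")
    report.to_file(output_html)
    return report
```

Calling `build_report("facilities.csv")` with a locally downloaded copy of the file (the name is hypothetical) would write `report.html` to the working directory.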

Variables

구분
Categorical

IMBALANCE 

Distinct: 7
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 17.5 KiB
소음 | 1780
소음진동 | 405
진동 | 36
소음음 | 3
소음 | 1
Other values (2) | 2

Length

Max length: 5
Median length: 2
Mean length: 2.3682084
Min length: 2
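
These length statistics can be recomputed from the value counts of 구분. The sketch below assumes that the second "소음" entry (count 1) is three characters long, for example because of a trailing whitespace character; the report does not say so, but that assumption is what makes the mean agree.

```python
from statistics import median

# (value length, row count) pairs for 구분. The (3, 1) entry is the assumed
# whitespace variant of "소음"; (3, 3) is "소음음"; the two (5, 1) entries are
# "소음,진동" and "소음.진동".
length_counts = [(2, 1780), (4, 405), (2, 36), (3, 3), (3, 1), (5, 1), (5, 1)]

# Expand to one length per row, then summarise.
lengths = [length for length, count in length_counts for _ in range(count)]
mean_length = sum(lengths) / len(lengths)

print("min:", min(lengths))
print("max:", max(lengths))
print("median:", median(lengths))
print("mean:", round(mean_length, 7))
```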

Unique

Unique: 3
Unique (%): 0.1%

Sample

1st row: 소음
2nd row: 소음
3rd row: 소음
4th row: 소음
5th row: 소음

Common Values

Value | Count | Frequency (%)
소음 | 1780 | 79.9%
소음진동 | 405 | 18.2%
진동 | 36 | 1.6%
소음음 | 3 | 0.1%
소음 | 1 | < 0.1%
소음,진동 | 1 | < 0.1%
소음.진동 | 1 | < 0.1%
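
Several of the rare values look like spelling or punctuation variants of the three main labels. If they are to be merged before analysis, a cleanup step could look like the sketch below. Treating them as variants is an assumption rather than something the report states, and `CANONICAL` and `normalize_gubun` are hypothetical names.

```python
import re

# Hypothetical canonical labels; folding the rare spellings into them is an assumption.
CANONICAL = {"소음", "진동", "소음진동"}

def normalize_gubun(value):
    """Strip whitespace, commas and periods, and fix the doubled-syllable spelling."""
    cleaned = re.sub(r"[\s,.]", "", value)  # "소음,진동" -> "소음진동"
    if cleaned == "소음음":  # assumed typo for "소음"
        cleaned = "소음"
    # Leave anything unrecognised untouched instead of guessing.
    return cleaned if cleaned in CANONICAL else value

for raw in ["소음", "소음 ", "소음음", "소음,진동", "소음.진동", "진동"]:
    print(repr(raw), "->", repr(normalize_gubun(raw)))
```

Values that do not reduce to one of the three labels are returned unchanged, so the step never invents a category.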

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
소음 | 1781 | 80.0%
소음진동 | 405 | 18.2%
진동 | 36 | 1.6%
소음음 | 3 | 0.1%
소음,진동 | 1 | < 0.1%
소음.진동 | 1 | < 0.1%

업체명
Text

Distinct: 2120
Distinct (%): 95.2%
Missing: 0
Missing (%): 0.0%
Memory size: 17.5 KiB

Length

Max length: 21
Median length: 20
Mean length: 5.5931747
Min length: 2

Characters and Unicode

Total characters: 12456
Distinct characters: 462
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
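
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories over two of the sample 업체명 values; `category_counts` is an illustrative helper, not part of the profiling tool.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories (e.g. 'Lo', 'So') over all characters."""
    return Counter(unicodedata.category(ch) for text in values for ch in text)

# Two of the sample 업체명 values shown in this report.
names = ["서진섬유㈜", "㈜디엔씨"]
counts = category_counts(names)

for code, n in counts.most_common():
    print(code, n)
```

Here `Lo` is the "Other Letter" category (Hangul syllables) and `So` is "Other Symbol", which is where the ㈜ character falls.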

Unique

Unique: 2027
Unique (%): 91.0%

Sample

1st row: 청룡산업주식회사
2nd row: 서진섬유㈜
3rd row: 남성정밀㈜
4th row: ㈜디엔씨
5th row: 대신선재공업사
Value | Count | Frequency (%)
주식회사 | 63 | 2.6%
김해공장 | 12 | 0.5%
2공장 | 11 | 0.5%
김해지점 | 6 | 0.3%
대원산업 | 5 | 0.2%
제2공장 | 4 | 0.2%
㈜티엠씨 | 4 | 0.2%
한림공장 | 3 | 0.1%
고모텍㈜ | 3 | 0.1%
㈜바이저 | 3 | 0.1%
Other values (2147) | 2273 | 95.2%

Most occurring characters

ValueCountFrequency (%)
1273
 
10.2%
431
 
3.5%
361
 
2.9%
359
 
2.9%
298
 
2.4%
255
 
2.0%
237
 
1.9%
228
 
1.8%
228
 
1.8%
212
 
1.7%
Other values (452) 8574
68.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 10460 | 84.0%
Other Symbol | 1273 | 10.2%
Uppercase Letter | 327 | 2.6%
Space Separator | 162 | 1.3%
Close Punctuation | 63 | 0.5%
Open Punctuation | 63 | 0.5%
Decimal Number | 52 | 0.4%
Other Punctuation | 45 | 0.4%
Dash Punctuation | 6 | < 0.1%
Lowercase Letter | 5 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
431
 
4.1%
361
 
3.5%
359
 
3.4%
298
 
2.8%
255
 
2.4%
237
 
2.3%
228
 
2.2%
228
 
2.2%
212
 
2.0%
179
 
1.7%
Other values (411) 7672
73.3%
Uppercase Letter
ValueCountFrequency (%)
C 34
 
10.4%
S 30
 
9.2%
E 30
 
9.2%
T 24
 
7.3%
M 23
 
7.0%
N 23
 
7.0%
P 17
 
5.2%
G 17
 
5.2%
H 16
 
4.9%
I 15
 
4.6%
Other values (13) 98
30.0%
Other Punctuation
ValueCountFrequency (%)
. 31
68.9%
& 9
 
20.0%
/ 2
 
4.4%
, 1
 
2.2%
: 1
 
2.2%
? 1
 
2.2%
Lowercase Letter
ValueCountFrequency (%)
o 2
40.0%
n 1
20.0%
r 1
20.0%
h 1
20.0%
Decimal Number
ValueCountFrequency (%)
2 35
67.3%
1 11
 
21.2%
3 6
 
11.5%
Other Symbol
ValueCountFrequency (%)
1273
100.0%
Space Separator
ValueCountFrequency (%)
162
100.0%
Close Punctuation
ValueCountFrequency (%)
) 63
100.0%
Open Punctuation
ValueCountFrequency (%)
( 63
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 11733 | 94.2%
Common | 391 | 3.1%
Latin | 332 | 2.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1273
 
10.8%
431
 
3.7%
361
 
3.1%
359
 
3.1%
298
 
2.5%
255
 
2.2%
237
 
2.0%
228
 
1.9%
228
 
1.9%
212
 
1.8%
Other values (412) 7851
66.9%
Latin
ValueCountFrequency (%)
C 34
 
10.2%
S 30
 
9.0%
E 30
 
9.0%
T 24
 
7.2%
M 23
 
6.9%
N 23
 
6.9%
P 17
 
5.1%
G 17
 
5.1%
H 16
 
4.8%
I 15
 
4.5%
Other values (17) 103
31.0%
Common
ValueCountFrequency (%)
162
41.4%
) 63
 
16.1%
( 63
 
16.1%
2 35
 
9.0%
. 31
 
7.9%
1 11
 
2.8%
& 9
 
2.3%
- 6
 
1.5%
3 6
 
1.5%
/ 2
 
0.5%
Other values (3) 3
 
0.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 10460 | 84.0%
None | 1273 | 10.2%
ASCII | 723 | 5.8%

Most frequent character per block

None
ValueCountFrequency (%)
1273
100.0%
Hangul
ValueCountFrequency (%)
431
 
4.1%
361
 
3.5%
359
 
3.4%
298
 
2.8%
255
 
2.4%
237
 
2.3%
228
 
2.2%
228
 
2.2%
212
 
2.0%
179
 
1.7%
Other values (411) 7672
73.3%
ASCII
ValueCountFrequency (%)
162
22.4%
) 63
 
8.7%
( 63
 
8.7%
2 35
 
4.8%
C 34
 
4.7%
. 31
 
4.3%
S 30
 
4.1%
E 30
 
4.1%
T 24
 
3.3%
M 23
 
3.2%
Other values (30) 228
31.5%

업종
Text

MISSING 

Distinct: 939
Distinct (%): 42.7%
Missing: 26
Missing (%): 1.2%
Memory size: 17.5 KiB

Length

Max length: 83
Median length: 42
Mean length: 11.468878
Min length: 2

Characters and Unicode

Total characters: 25243
Distinct characters: 323
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 681
Unique (%): 30.9%

Sample

1st row: 금속제품제조
2nd row: 섬유제조
3rd row: 조립금속제품
4th row: 화합물및화학제품제조
5th row: 기타금속제품
Value | Count | Frequency (%)
제조업 | 109 | 3.9%
선박구성부분품제조업 | 106 | 3.8%
 | 102 | 3.7%
그외기타자동차부품제조업 | 59 | 2.1%
 | 50 | 1.8%
도장및기타피막처리업 | 43 | 1.5%
금속조립구조재제조업 | 42 | 1.5%
기타자동차부품제조업 | 34 | 1.2%
금속제품제조 | 33 | 1.2%
조립금속제품제조 | 32 | 1.1%
Other values (1022) | 2183 | 78.2%

Most occurring characters

ValueCountFrequency (%)
2426
 
9.6%
2113
 
8.4%
1831
 
7.3%
1047
 
4.1%
979
 
3.9%
641
 
2.5%
606
 
2.4%
601
 
2.4%
590
 
2.3%
554
 
2.2%
Other values (313) 13855
54.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 22423 | 88.8%
Decimal Number | 1437 | 5.7%
Space Separator | 641 | 2.5%
Open Punctuation | 274 | 1.1%
Close Punctuation | 273 | 1.1%
Other Punctuation | 190 | 0.8%
Uppercase Letter | 3 | < 0.1%
Control | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2426
 
10.8%
2113
 
9.4%
1831
 
8.2%
1047
 
4.7%
979
 
4.4%
606
 
2.7%
601
 
2.7%
590
 
2.6%
554
 
2.5%
407
 
1.8%
Other values (294) 11269
50.3%
Decimal Number
ValueCountFrequency (%)
2 433
30.1%
1 332
23.1%
9 213
14.8%
3 187
13.0%
5 82
 
5.7%
0 79
 
5.5%
4 70
 
4.9%
8 18
 
1.3%
7 12
 
0.8%
6 11
 
0.8%
Uppercase Letter
ValueCountFrequency (%)
G 1
33.3%
N 1
33.3%
L 1
33.3%
Other Punctuation
ValueCountFrequency (%)
, 187
98.4%
. 3
 
1.6%
Space Separator
ValueCountFrequency (%)
641
100.0%
Open Punctuation
ValueCountFrequency (%)
( 274
100.0%
Close Punctuation
ValueCountFrequency (%)
) 273
100.0%
Control
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 22423 | 88.8%
Common | 2817 | 11.2%
Latin | 3 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2426
 
10.8%
2113
 
9.4%
1831
 
8.2%
1047
 
4.7%
979
 
4.4%
606
 
2.7%
601
 
2.7%
590
 
2.6%
554
 
2.5%
407
 
1.8%
Other values (294) 11269
50.3%
Common
ValueCountFrequency (%)
641
22.8%
2 433
15.4%
1 332
11.8%
( 274
9.7%
) 273
9.7%
9 213
 
7.6%
3 187
 
6.6%
, 187
 
6.6%
5 82
 
2.9%
0 79
 
2.8%
Other values (6) 116
 
4.1%
Latin
ValueCountFrequency (%)
G 1
33.3%
N 1
33.3%
L 1
33.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 22422 | 88.8%
ASCII | 2820 | 11.2%
Compat Jamo | 1 | < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2426
 
10.8%
2113
 
9.4%
1831
 
8.2%
1047
 
4.7%
979
 
4.4%
606
 
2.7%
601
 
2.7%
590
 
2.6%
554
 
2.5%
407
 
1.8%
Other values (293) 11268
50.3%
ASCII
ValueCountFrequency (%)
641
22.7%
2 433
15.4%
1 332
11.8%
( 274
9.7%
) 273
9.7%
9 213
 
7.6%
3 187
 
6.6%
, 187
 
6.6%
5 82
 
2.9%
0 79
 
2.8%
Other values (9) 119
 
4.2%
Compat Jamo
ValueCountFrequency (%)
1
100.0%

소재지주소
Text

Distinct: 2136
Distinct (%): 95.9%
Missing: 0
Missing (%): 0.0%
Memory size: 17.5 KiB

Length

Max length: 58
Median length: 39
Mean length: 24.594971
Min length: 15

Characters and Unicode

Total characters: 54773
Distinct characters: 126
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2047
Unique (%): 91.9%

Sample

1st row: 경상남도 김해시 상동면 동북로1109번길 30
2nd row: 경상남도 김해시 장유로 167-41
3rd row: 경상남도 김해시 상동면 상동로 460
4th row: 경상남도 김해시 진영읍 김해대로 118-51
5th row: 경상남도 김해시 진례면 진례로83번길 1
Value | Count | Frequency (%)
경상남도 | 2227 | 20.1%
김해시 | 2227 | 20.1%
한림면 | 682 | 6.2%
진례면 | 344 | 3.1%
상동면 | 305 | 2.8%
주촌면 | 272 | 2.5%
생림면 | 254 | 2.3%
진영읍 | 242 | 2.2%
고모로 | 107 | 1.0%
상동로 | 91 | 0.8%
Other values (1758) | 4326 | 39.1%
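
A word-frequency table like this one can be approximated by splitting each address on whitespace and counting the tokens. The sketch below does so for the five sample 소재지주소 values shown in this report; whitespace tokenisation is an assumption about how the words were obtained.

```python
from collections import Counter

# The five sample 소재지주소 values listed in this report.
addresses = [
    "경상남도 김해시 상동면 동북로1109번길 30",
    "경상남도 김해시 장유로 167-41",
    "경상남도 김해시 상동면 상동로 460",
    "경상남도 김해시 진영읍 김해대로 118-51",
    "경상남도 김해시 진례면 진례로83번길 1",
]

# Assumption: words are obtained by splitting on whitespace.
word_counts = Counter(word for address in addresses for word in address.split())
total_words = sum(word_counts.values())

for word, count in word_counts.most_common(3):
    print(word, count, f"{100 * count / total_words:.1f}%")
```

Because every address starts with the same province and city, those two tokens occur once per row, which matches their count of 2227 in the full table.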

Most occurring characters

ValueCountFrequency (%)
9897
18.1%
2680
 
4.9%
2570
 
4.7%
2570
 
4.7%
1 2349
 
4.3%
2229
 
4.1%
2229
 
4.1%
2227
 
4.1%
2227
 
4.1%
2162
 
3.9%
Other values (116) 23633
43.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 31849 | 58.1%
Decimal Number | 11789 | 21.5%
Space Separator | 9897 | 18.1%
Dash Punctuation | 1176 | 2.1%
Other Punctuation | 18 | < 0.1%
Open Punctuation | 14 | < 0.1%
Close Punctuation | 14 | < 0.1%
Uppercase Letter | 14 | < 0.1%
Math Symbol | 1 | < 0.1%
Control | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2680
 
8.4%
2570
 
8.1%
2570
 
8.1%
2229
 
7.0%
2229
 
7.0%
2227
 
7.0%
2227
 
7.0%
2162
 
6.8%
1860
 
5.8%
1281
 
4.0%
Other values (92) 9814
30.8%
Decimal Number
ValueCountFrequency (%)
1 2349
19.9%
2 1604
13.6%
3 1282
10.9%
4 1167
9.9%
5 1064
9.0%
9 956
8.1%
6 938
 
8.0%
7 892
 
7.6%
0 810
 
6.9%
8 727
 
6.2%
Uppercase Letter
ValueCountFrequency (%)
B 5
35.7%
A 4
28.6%
L 3
21.4%
C 2
 
14.3%
Other Punctuation
ValueCountFrequency (%)
, 16
88.9%
: 2
 
11.1%
Open Punctuation
ValueCountFrequency (%)
( 13
92.9%
[ 1
 
7.1%
Close Punctuation
ValueCountFrequency (%)
) 13
92.9%
] 1
 
7.1%
Space Separator
ValueCountFrequency (%)
9897
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1176
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%
Control
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 31849 | 58.1%
Common | 22910 | 41.8%
Latin | 14 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2680
 
8.4%
2570
 
8.1%
2570
 
8.1%
2229
 
7.0%
2229
 
7.0%
2227
 
7.0%
2227
 
7.0%
2162
 
6.8%
1860
 
5.8%
1281
 
4.0%
Other values (92) 9814
30.8%
Common
ValueCountFrequency (%)
9897
43.2%
1 2349
 
10.3%
2 1604
 
7.0%
3 1282
 
5.6%
- 1176
 
5.1%
4 1167
 
5.1%
5 1064
 
4.6%
9 956
 
4.2%
6 938
 
4.1%
7 892
 
3.9%
Other values (10) 1585
 
6.9%
Latin
ValueCountFrequency (%)
B 5
35.7%
A 4
28.6%
L 3
21.4%
C 2
 
14.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 31849 | 58.1%
ASCII | 22924 | 41.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9897
43.2%
1 2349
 
10.2%
2 1604
 
7.0%
3 1282
 
5.6%
- 1176
 
5.1%
4 1167
 
5.1%
5 1064
 
4.6%
9 956
 
4.2%
6 938
 
4.1%
7 892
 
3.9%
Other values (14) 1599
 
7.0%
Hangul
ValueCountFrequency (%)
2680
 
8.4%
2570
 
8.1%
2570
 
8.1%
2229
 
7.0%
2229
 
7.0%
2227
 
7.0%
2227
 
7.0%
2162
 
6.8%
1860
 
5.8%
1281
 
4.0%
Other values (92) 9814
30.8%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
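
The per-column nullity that these plots summarise is simply a count of missing cells in each column. The sketch below is a minimal version using three rows from the Sample section, with `None` standing in for the `<NA>` cell; the variable names are illustrative.

```python
# Three rows from the Sample section; None stands in for the <NA> cell in row 2217.
columns = ["구분", "업체명", "업종", "소재지주소"]
rows = [
    ("소음", "청룡산업주식회사", "금속제품제조", "경상남도 김해시 상동면 동북로1109번길 30"),
    ("소음", "서진섬유㈜", "섬유제조", "경상남도 김해시 장유로 167-41"),
    ("소음진동", "㈜상우기업", None, "경상남도 김해시 진례면 고모로180번길 109-60"),
]

# Count missing cells column by column.
missing_per_column = {
    name: sum(row[i] is None for row in rows) for i, name in enumerate(columns)
}
print(missing_per_column)
```

In the full dataset only 업종 has missing values (26), so the other three columns would show as fully populated.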

Sample

Index | 구분 | 업체명 | 업종 | 소재지주소
0 | 소음 | 청룡산업주식회사 | 금속제품제조 | 경상남도 김해시 상동면 동북로1109번길 30
1 | 소음 | 서진섬유㈜ | 섬유제조 | 경상남도 김해시 장유로 167-41
2 | 소음 | 남성정밀㈜ | 조립금속제품 | 경상남도 김해시 상동면 상동로 460
3 | 소음 | ㈜디엔씨 | 화합물및화학제품제조 | 경상남도 김해시 진영읍 김해대로 118-51
4 | 소음 | 대신선재공업사 | 기타금속제품 | 경상남도 김해시 진례면 진례로83번길 1
5 | 진동 | ㈜동양산업 | 비금속광물제품제조업 | 경상남도 김해시 상동면 동북로437번길 58-62
6 | 소음 | 삼전비료 | 기타비료및질소화합물제조업 | 경상남도 김해시 상동면 상동로 161
7 | 소음 | ㈜대왕레미콘 | 비금속광물 | 경상남도 김해시 상동면 상동로 818-5
8 | 소음 | ㈜제일거울공예 | 유리가공 | 경상남도 김해시 생림면 나전로 276
9 | 소음 | ㈜지석 | 비금속광물 | 경상남도 김해시 진영읍 본산로86번길 21

Index | 구분 | 업체명 | 업종 | 소재지주소
2217 | 소음진동 | ㈜상우기업 | <NA> | 경상남도 김해시 진례면 고모로180번길 109-60
2218 | 소음 | 유원산업㈜ 김해생림1공장 | 기타 물품 취급장비 제조업 외 | 경상남도 김해시 생림면 생림대로928번길 81
2219 | 소음 | ㈜에스엠디테크 | 금속가공제품제조업외 | 경상남도 김해시 한림면 김해대로916번길 54-42
2220 | 소음 | KHM테크 1공장 | 그 외 기타 자동차부품제조업 | 경상남도 김해시 주촌면 서부로1541번길 147 외 5
2221 | 소음 | KHM테크 2공장 | 그 외 기타 자동차부품제조업 | 경상남도 김해시 주촌면 서부로1541번길 150 외 2
2222 | 소음 | ㈜한국티엠아이 | 선박구성부분품제조업 | 경상남도 김해시 주촌면 서부로1548-44
2223 | 소음 | 제이에스테크 | 육상금속골조구조재제조업 | 경상남도 김해시 진례면 고모로526번길 26-20
2224 | 소음 | 세홍철강㈜ | 그외기타분류안된 금속가공제품제조업 | 경상남도 김해시 한림면 한림로343번길 33-35
2225 | 소음 | ㈜서호하이테크 | 자동차 차체용 신품 부품 제조업 | 경상남도 김해시 주촌면 서부로1499번길 22-8
2226 | 소음 | 피앤씨산업 | 끈 및 로프 제조업 | 경상남도 김해시 안곡로59-1

Duplicate rows

Most frequently occurring

Index | 구분 | 업체명 | 업종 | 소재지주소 | # duplicates
0 | 소음 | ㈜유일에프에이 | 금속조립구조제제조업 | 경상남도 김해시 한림면 김해대로916번길 36 | 2
1 | 소음 | 케이비구조산업㈜ | 금속조립구조재제조업 | 경상남도 김해시 생림면 상동로 30-60 | 2
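
Exact duplicate rows of this kind can be found by counting identical rows. The sketch below uses a small toy set of (구분, 업체명) pairs; the data is illustrative, apart from the two business names taken from the table above, and a real check would compare all four columns.

```python
from collections import Counter

# Toy rows of (구분, 업체명); illustrative only.
rows = [
    ("소음", "㈜유일에프에이"),
    ("소음", "케이비구조산업㈜"),
    ("소음", "㈜유일에프에이"),
    ("소음", "케이비구조산업㈜"),
    ("진동", "㈜동양산업"),
]

row_counts = Counter(rows)
# Keep only rows that occur more than once, with their occurrence count.
duplicated = {row: n for row, n in row_counts.items() if n > 1}
# Number of rows beyond the first occurrence of each duplicated row.
extra_rows = sum(n - 1 for n in duplicated.values())

for row, n in duplicated.items():
    print(row, n)
print("extra rows:", extra_rows)
```

With two distinct rows that each appear twice, both the number of duplicated rows and the number of surplus copies come out as 2, so either reading is consistent with the headline count.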