Overview

Dataset statistics

Number of variables: 9
Number of observations: 831
Missing cells: 655
Missing cells (%): 8.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 61.8 KiB
Average record size in memory: 76.2 B
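The derived figures in this table follow from the raw counts. A minimal sketch, assuming the per-column missing counts reported in the Variables section (92, 280, 280, and 3) and assuming that 1 KiB means 1024 bytes:

```python
# Counts copied from the report; the keys are the dataset's own column names.
n_variables = 9
n_observations = 831
missing_per_column = {
    "전화번호": 92,
    "폐수발생량(㎥/일)": 280,
    "폐수처리방법": 280,
    "업종": 3,
}

total_cells = n_variables * n_observations
missing_cells = sum(missing_per_column.values())
missing_pct = 100 * missing_cells / total_cells

# Assumption: the report's "KiB" is 1024 bytes.
total_bytes = 61.8 * 1024
avg_record_bytes = total_bytes / n_observations

print(missing_cells, f"{missing_pct:.1f}%", f"{avg_record_bytes:.1f} B")
# prints: 655 8.8% 76.2 B
```

The four per-column counts add up to the 655 missing cells, so no other column contributes real missing values.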

Variable types

Numeric: 2
Text: 5
Categorical: 2

Dataset

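Description: Lists the air and wastewater emission facilities in Yangsan City, Gyeongsangnam-do, with business name, site address, industry type, phone number, and related fields, so that the status of emission facilities can be reviewed by administrative district (eup, myeon, dong).
Author: Yangsan City, Gyeongsangnam-do (경상남도 양산시)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=3040406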

Alerts

High correlation: 폐수발생량(㎥/일) is highly overall correlated with 수질
High correlation: 수질 is highly overall correlated with 폐수발생량(㎥/일)
Imbalance: 수질 is highly imbalanced (51.9%)
Missing: 전화번호 has 92 (11.1%) missing values
Missing: 폐수발생량(㎥/일) has 280 (33.7%) missing values
Missing: 폐수처리방법 has 280 (33.7%) missing values
Unique: 일련번호 has unique values
Zeros: 폐수발생량(㎥/일) has 34 (4.1%) zeros

Reproduction

Analysis started: 2023-12-11 00:27:27.554489
Analysis finished: 2023-12-11 00:27:28.642476
Duration: 1.09 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
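A report of this kind can be regenerated with ydata-profiling. The sketch below assumes the dataset has been downloaded as a CSV file; the function name `build_report`, the file paths, and the encoding are placeholders and do not come from the report.

```python
def build_report(csv_path, html_path, encoding="utf-8"):
    """Profile a CSV file and write the HTML report (hypothetical helper)."""
    # Imported lazily so the function can be defined without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    profile = ProfileReport(df, title="Dataset profile")
    profile.to_file(html_path)
    return profile


# Example call with made-up file names:
# build_report("yangsan_emission_facilities.csv", "report.html")
```

Korean public datasets are often published in CP949 or EUC-KR, so the `encoding` argument may need to be changed for the real file.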

Variables

일련번호 (serial number)
Real number (ℝ)

UNIQUE

Distinct: 831
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 416
Minimum: 1
Maximum: 831
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 42.5
Q1: 208.5
Median: 416
Q3: 623.5
95-th percentile: 789.5
Maximum: 831
Range: 830
Interquartile range (IQR): 415
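These quantiles are what linear interpolation gives for a column holding the integers 1 to 831. That content is an assumption: it is consistent with the reported minimum, maximum, distinct count, and monotonicity, but the report does not state it outright. A minimal sketch using only the standard library:

```python
import statistics

# Assumption: the column is exactly 1, 2, ..., 831.
values = list(range(1, 832))

# method="inclusive" interpolates linearly between the sample minimum and maximum.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
ventiles = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = ventiles[0], ventiles[-1]

iqr = q3 - q1
value_range = max(values) - min(values)

print(p5, q1, median, q3, p95, iqr, value_range)
# prints: 42.5 208.5 416.0 623.5 789.5 415.0 830
```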

Descriptive statistics

Standard deviation: 240.03333
Coefficient of variation (CV): 0.5770032
Kurtosis: -1.2
Mean: 416
Median Absolute Deviation (MAD): 208
Skewness: 0
Sum: 345696
Variance: 57616
Monotonicity: Strictly increasing
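The descriptive statistics can be checked the same way. The sketch assumes the column is the integers 1 to 831 (not stated in the report, but consistent with it) and that the variance is the sample variance with an n - 1 denominator:

```python
import math
import statistics

# Assumption: the column is exactly 1, 2, ..., 831.
values = list(range(1, 832))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
std = math.sqrt(variance)
cv = std / mean                           # coefficient of variation

# Median absolute deviation: median of |x - median(x)|.
center = statistics.median(values)
mad = statistics.median(abs(v - center) for v in values)

print(total, mean, variance, round(std, 5), round(cv, 7), mad)
```

The results agree with the reported sum (345696), mean (416), variance (57616), standard deviation (240.03333), CV (0.5770032), and MAD (208).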
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1 1
 
0.1%
2 1
 
0.1%
549 1
 
0.1%
550 1
 
0.1%
551 1
 
0.1%
552 1
 
0.1%
553 1
 
0.1%
554 1
 
0.1%
555 1
 
0.1%
556 1
 
0.1%
Other values (821) 821
98.8%
ValueCountFrequency (%)
1 1
0.1%
2 1
0.1%
3 1
0.1%
4 1
0.1%
5 1
0.1%
6 1
0.1%
7 1
0.1%
8 1
0.1%
9 1
0.1%
10 1
0.1%
ValueCountFrequency (%)
831 1
0.1%
830 1
0.1%
829 1
0.1%
828 1
0.1%
827 1
0.1%
826 1
0.1%
825 1
0.1%
824 1
0.1%
823 1
0.1%
822 1
0.1%

업소명 (business name)
Text

Distinct: 816
Distinct (%): 98.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.6 KiB

Length

Max length: 27
Median length: 20
Mean length: 6.5415162
Min length: 2

Characters and Unicode

Total characters: 5436
Distinct characters: 385
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
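For every text variable in this report, the mean length equals the total character count divided by the number of non-missing rows. A short check using figures copied from the Variables section (that missing rows are excluded from the denominator is an inference from the numbers, not something the report states):

```python
n_rows = 831

# column: (total characters, missing rows, reported mean length)
text_columns = {
    "업소명": (5436, 0, 6.5415162),
    "소재지": (16058, 0, 19.323706),
    "업종": (6371, 3, 7.6944444),
    "전화번호": (8873, 92, 12.006766),
    "폐수처리방법": (3010, 280, 5.4627949),
}

mean_lengths = {}
for name, (total_chars, missing, reported) in text_columns.items():
    mean_lengths[name] = total_chars / (n_rows - missing)
    print(name, round(mean_lengths[name], 7), reported)
```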

Unique

Unique: 802
Unique (%): 96.5%

Sample

1st row: 명광기업㈜
2nd row: ㈜동호산업
3rd row: 에스케이정밀㈜
4th row: 한미공업사
5th row: (주)화승소재
Value | Count | Frequency (%)
제2공장 | 6 | 0.7%
양산시 | 4 | 0.4%
양산지점 | 4 | 0.4%
태광산업 | 3 | 0.3%
양산공장 | 3 | 0.3%
덕성인더스트리㈜ | 2 | 0.2%
현대드럼 | 2 | 0.2%
부산지점 | 2 | 0.2%
㈜서비스코리아 | 2 | 0.2%
㈜신성에너지 | 2 | 0.2%
Other values (853) | 871 | 96.7%

Most occurring characters

ValueCountFrequency (%)
347
 
6.4%
204
 
3.8%
152
 
2.8%
151
 
2.8%
) 127
 
2.3%
( 127
 
2.3%
120
 
2.2%
105
 
1.9%
98
 
1.8%
98
 
1.8%
Other values (375) 3907
71.9%

Most occurring categories

ValueCountFrequency (%)
Other Letter 4559
83.9%
Other Symbol 347
 
6.4%
Close Punctuation 127
 
2.3%
Open Punctuation 127
 
2.3%
Space Separator 120
 
2.2%
Uppercase Letter 86
 
1.6%
Decimal Number 41
 
0.8%
Other Punctuation 27
 
0.5%
Dash Punctuation 2
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
204
 
4.5%
152
 
3.3%
151
 
3.3%
105
 
2.3%
98
 
2.1%
98
 
2.1%
95
 
2.1%
93
 
2.0%
93
 
2.0%
92
 
2.0%
Other values (344) 3378
74.1%
Uppercase Letter
ValueCountFrequency (%)
C 17
19.8%
T 9
10.5%
S 8
9.3%
R 6
 
7.0%
D 6
 
7.0%
H 5
 
5.8%
A 5
 
5.8%
M 5
 
5.8%
I 4
 
4.7%
P 3
 
3.5%
Other values (9) 18
20.9%
Decimal Number
ValueCountFrequency (%)
2 27
65.9%
1 9
 
22.0%
3 3
 
7.3%
4 2
 
4.9%
Other Punctuation
ValueCountFrequency (%)
. 22
81.5%
& 4
 
14.8%
1
 
3.7%
Other Symbol
ValueCountFrequency (%)
347
100.0%
Close Punctuation
ValueCountFrequency (%)
) 127
100.0%
Open Punctuation
ValueCountFrequency (%)
( 127
100.0%
Space Separator
ValueCountFrequency (%)
120
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 4906
90.3%
Common 444
 
8.2%
Latin 86
 
1.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
347
 
7.1%
204
 
4.2%
152
 
3.1%
151
 
3.1%
105
 
2.1%
98
 
2.0%
98
 
2.0%
95
 
1.9%
93
 
1.9%
93
 
1.9%
Other values (345) 3470
70.7%
Latin
ValueCountFrequency (%)
C 17
19.8%
T 9
10.5%
S 8
9.3%
R 6
 
7.0%
D 6
 
7.0%
H 5
 
5.8%
A 5
 
5.8%
M 5
 
5.8%
I 4
 
4.7%
P 3
 
3.5%
Other values (9) 18
20.9%
Common
ValueCountFrequency (%)
) 127
28.6%
( 127
28.6%
120
27.0%
2 27
 
6.1%
. 22
 
5.0%
1 9
 
2.0%
& 4
 
0.9%
3 3
 
0.7%
- 2
 
0.5%
4 2
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 4559
83.9%
ASCII 529
 
9.7%
None 348
 
6.4%

Most frequent character per block

None
ValueCountFrequency (%)
347
99.7%
1
 
0.3%
Hangul
ValueCountFrequency (%)
204
 
4.5%
152
 
3.3%
151
 
3.3%
105
 
2.3%
98
 
2.1%
98
 
2.1%
95
 
2.1%
93
 
2.0%
93
 
2.0%
92
 
2.0%
Other values (344) 3378
74.1%
ASCII
ValueCountFrequency (%)
) 127
24.0%
( 127
24.0%
120
22.7%
2 27
 
5.1%
. 22
 
4.2%
C 17
 
3.2%
T 9
 
1.7%
1 9
 
1.7%
S 8
 
1.5%
R 6
 
1.1%
Other values (19) 57
10.8%

소재지 (site address)
Text

Distinct: 795
Distinct (%): 95.7%
Missing: 0
Missing (%): 0.0%
Memory size: 6.6 KiB

Length

Max length: 43
Median length: 36
Mean length: 19.323706
Min length: 13

Characters and Unicode

Total characters: 16058
Distinct characters: 118
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 766
Unique (%): 92.2%

Sample

1st row: 경상남도 양산시 교동 114-2
2nd row: 경상남도 양산시 교동 117
3rd row: 경상남도 양산시 교동 117-11
4th row: 경상남도 양산시 교동 129
5th row: 경상남도 양산시 교동 147-1
Value | Count | Frequency (%)
경상남도 | 831 | 23.1%
양산시 | 828 | 23.0%
상북면 | 120 | 3.3%
어곡동 | 80 | 2.2%
북정동 | 79 | 2.2%
소토리 | 75 | 2.1%
소주동 | 75 | 2.1%
유산동 | 62 | 1.7%
산막동 | 60 | 1.7%
주남동 | 42 | 1.2%
Other values (893) | 1349 | 37.5%

Most occurring characters

ValueCountFrequency (%)
2789
17.4%
1067
 
6.6%
956
 
6.0%
883
 
5.5%
834
 
5.2%
832
 
5.2%
832
 
5.2%
831
 
5.2%
680
 
4.2%
1 648
 
4.0%
Other values (108) 5706
35.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 9299
57.9%
Decimal Number 3256
 
20.3%
Space Separator 2789
 
17.4%
Dash Punctuation 594
 
3.7%
Uppercase Letter 41
 
0.3%
Other Punctuation 34
 
0.2%
Close Punctuation 22
 
0.1%
Open Punctuation 22
 
0.1%
Lowercase Letter 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1067
11.5%
956
10.3%
883
9.5%
834
9.0%
832
8.9%
832
8.9%
831
8.9%
680
 
7.3%
244
 
2.6%
195
 
2.1%
Other values (85) 1945
20.9%
Decimal Number
ValueCountFrequency (%)
1 648
19.9%
2 466
14.3%
3 370
11.4%
4 319
9.8%
8 258
 
7.9%
5 250
 
7.7%
6 249
 
7.6%
7 247
 
7.6%
9 241
 
7.4%
0 208
 
6.4%
Uppercase Letter
ValueCountFrequency (%)
L 19
46.3%
B 18
43.9%
A 1
 
2.4%
I 1
 
2.4%
C 1
 
2.4%
D 1
 
2.4%
Other Punctuation
ValueCountFrequency (%)
, 18
52.9%
. 16
47.1%
Space Separator
ValueCountFrequency (%)
2789
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 594
100.0%
Close Punctuation
ValueCountFrequency (%)
) 22
100.0%
Open Punctuation
ValueCountFrequency (%)
( 22
100.0%
Lowercase Letter
ValueCountFrequency (%)
w 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 9299
57.9%
Common 6717
41.8%
Latin 42
 
0.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1067
11.5%
956
10.3%
883
9.5%
834
9.0%
832
8.9%
832
8.9%
831
8.9%
680
 
7.3%
244
 
2.6%
195
 
2.1%
Other values (85) 1945
20.9%
Common
ValueCountFrequency (%)
2789
41.5%
1 648
 
9.6%
- 594
 
8.8%
2 466
 
6.9%
3 370
 
5.5%
4 319
 
4.7%
8 258
 
3.8%
5 250
 
3.7%
6 249
 
3.7%
7 247
 
3.7%
Other values (6) 527
 
7.8%
Latin
ValueCountFrequency (%)
L 19
45.2%
B 18
42.9%
w 1
 
2.4%
A 1
 
2.4%
I 1
 
2.4%
C 1
 
2.4%
D 1
 
2.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 9299
57.9%
ASCII 6759
42.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2789
41.3%
1 648
 
9.6%
- 594
 
8.8%
2 466
 
6.9%
3 370
 
5.5%
4 319
 
4.7%
8 258
 
3.8%
5 250
 
3.7%
6 249
 
3.7%
7 247
 
3.7%
Other values (13) 569
 
8.4%
Hangul
ValueCountFrequency (%)
1067
11.5%
956
10.3%
883
9.5%
834
9.0%
832
8.9%
832
8.9%
831
8.9%
680
 
7.3%
244
 
2.6%
195
 
2.1%
Other values (85) 1945
20.9%

업종 (industry type)
Text

Distinct: 371
Distinct (%): 44.8%
Missing: 3
Missing (%): 0.4%
Memory size: 6.6 KiB

Length

Max length: 30
Median length: 22
Mean length: 7.6944444
Min length: 2

Characters and Unicode

Total characters: 6371
Distinct characters: 226
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 266
Unique (%): 32.1%

Sample

1st row: 기타자동차부품제조업
2nd row: 플라스틱호스제조업
3rd row: 비철금속
4th row: 장비수선.세차
5th row: 고무및플라스틱
ValueCountFrequency (%)
세차시설 76
 
6.8%
73
 
6.6%
고무및플라스틱 31
 
2.8%
장비수선.세차 24
 
2.2%
플라스틱 23
 
2.1%
기타화학 22
 
2.0%
제조업 21
 
1.9%
조립금속 20
 
1.8%
도금업 19
 
1.7%
비금속광물 16
 
1.4%
Other values (421) 785
70.7%

Most occurring characters

ValueCountFrequency (%)
470
 
7.4%
326
 
5.1%
296
 
4.6%
295
 
4.6%
237
 
3.7%
206
 
3.2%
199
 
3.1%
169
 
2.7%
163
 
2.6%
159
 
2.5%
Other values (216) 3851
60.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 5963
93.6%
Space Separator 296
 
4.6%
Other Punctuation 62
 
1.0%
Open Punctuation 23
 
0.4%
Close Punctuation 23
 
0.4%
Decimal Number 4
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
470
 
7.9%
326
 
5.5%
295
 
4.9%
237
 
4.0%
206
 
3.5%
199
 
3.3%
169
 
2.8%
163
 
2.7%
159
 
2.7%
139
 
2.3%
Other values (208) 3600
60.4%
Other Punctuation
ValueCountFrequency (%)
. 32
51.6%
, 29
46.8%
· 1
 
1.6%
Decimal Number
ValueCountFrequency (%)
1 2
50.0%
3 2
50.0%
Space Separator
ValueCountFrequency (%)
296
100.0%
Open Punctuation
ValueCountFrequency (%)
( 23
100.0%
Close Punctuation
ValueCountFrequency (%)
) 23
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 5963
93.6%
Common 408
 
6.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
470
 
7.9%
326
 
5.5%
295
 
4.9%
237
 
4.0%
206
 
3.5%
199
 
3.3%
169
 
2.8%
163
 
2.7%
159
 
2.7%
139
 
2.3%
Other values (208) 3600
60.4%
Common
ValueCountFrequency (%)
296
72.5%
. 32
 
7.8%
, 29
 
7.1%
( 23
 
5.6%
) 23
 
5.6%
1 2
 
0.5%
3 2
 
0.5%
· 1
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
Hangul 5963
93.6%
ASCII 407
 
6.4%
None 1
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
470
 
7.9%
326
 
5.5%
295
 
4.9%
237
 
4.0%
206
 
3.5%
199
 
3.3%
169
 
2.8%
163
 
2.7%
159
 
2.7%
139
 
2.3%
Other values (208) 3600
60.4%
ASCII
ValueCountFrequency (%)
296
72.7%
. 32
 
7.9%
, 29
 
7.1%
( 23
 
5.7%
) 23
 
5.7%
1 2
 
0.5%
3 2
 
0.5%
None
ValueCountFrequency (%)
· 1
100.0%

대기 (air)
Categorical

Distinct: 6
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 6.6 KiB

Length

Max length: 4
Median length: 1
Mean length: 1.9747292
Min length: 1
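A maximum length of 4 together with zero missing values suggests that `<NA>` is stored in this column as a literal four-character string rather than as a true missing value. That reading is an inference, but it reproduces the reported mean length from the counts in the Common Values table:

```python
# Category counts copied from the Common Values table for 대기.
counts = {"5": 318, "<NA>": 270, "4": 188, "2": 24, "3": 19, "1": 12}

total = sum(counts.values())
# Weight each label's string length by how often it occurs.
mean_length = sum(len(label) * n for label, n in counts.items()) / total

print(total, round(mean_length, 7))
# prints: 831 1.9747292
```

If the 270 `<NA>` entries were converted to real missing values before profiling, the report would instead list them under Missing.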

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4
2nd row: 5
3rd row: 5
4th row: 4
5th row: 2

Common Values

Value | Count | Frequency (%)
5 | 318 | 38.3%
<NA> | 270 | 32.5%
4 | 188 | 22.6%
2 | 24 | 2.9%
3 | 19 | 2.3%
1 | 12 | 1.4%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: category counts, as listed under Common Values]

수질 (water quality)
Categorical

HIGH CORRELATION  IMBALANCE

Distinct: 6
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 6.6 KiB

Length

Max length: 4
Median length: 1
Mean length: 2.0036101
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 5
2nd row: 5
3rd row: <NA>
4th row: <NA>
5th row: 5

Common Values

Value | Count | Frequency (%)
5 | 514 | 61.9%
<NA> | 278 | 33.5%
4 | 21 | 2.5%
3 | 8 | 1.0%
2 | 6 | 0.7%
1 | 4 | 0.5%
0.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: category counts, as listed under Common Values]

전화번호 (phone number)
Text

MISSING

Distinct: 701
Distinct (%): 94.9%
Missing: 92
Missing (%): 11.1%
Memory size: 6.6 KiB
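Note that Distinct (%) is taken relative to the non-missing rows, while Missing (%) is taken relative to all 831 rows. The check below uses figures copied from this report; the choice of denominator is inferred from the numbers rather than stated in the report.

```python
n_rows = 831

# column: (distinct values, missing rows, reported Distinct (%))
columns = {
    "업종": (371, 3, 44.8),
    "전화번호": (701, 92, 94.9),
    "폐수발생량(㎥/일)": (267, 280, 48.5),
    "폐수처리방법": (65, 280, 11.8),
}

distinct_pct = {}
for name, (distinct, missing, reported) in columns.items():
    # Denominator is the number of non-missing rows, not the total row count.
    distinct_pct[name] = 100 * distinct / (n_rows - missing)
    print(name, round(distinct_pct[name], 1), reported)
```

For 전화번호, dividing by all 831 rows would give 84.4% instead of the reported 94.9%.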

Length

Max length: 14
Median length: 12
Mean length: 12.006766
Min length: 12

Characters and Unicode

Total characters: 8873
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 668
Unique (%): 90.4%

Sample

1st row: 055-383-8700
2nd row: 055-388-6966
3rd row: 055-387-0477
4th row: 055-384-0012
5th row: 055-370-3247
Value | Count | Frequency (%)
055-388-1101 | 3 | 0.4%
055-388-3319 | 3 | 0.4%
055-388-7100 | 3 | 0.4%
055-366-9991 | 3 | 0.4%
055-386-0500 | 3 | 0.4%
055-389-1900 | 2 | 0.3%
055-374-9566 | 2 | 0.3%
055-386-4061 | 2 | 0.3%
055-366-2601 | 2 | 0.3%
055-367-3641 | 2 | 0.3%
Other values (691) | 714 | 96.6%
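Almost every value has the 12-character layout seen in the sample rows, while the maximum length of 14 indicates that at least one entry deviates from it. A minimal sketch for flagging such entries follows; the pattern, the helper name `is_standard_phone`, and the last test value are illustrative assumptions, not part of the dataset.

```python
import re

# Assumed layout of the dominant format: 3 digits, 3 digits, 4 digits.
PHONE_PATTERN = re.compile(r"\d{3}-\d{3}-\d{4}")


def is_standard_phone(value):
    """Return True when the whole string matches the assumed NNN-NNN-NNNN layout."""
    return value is not None and PHONE_PATTERN.fullmatch(value) is not None


# Two values from the sample rows, a missing value, and one made-up irregular entry.
samples = ["055-383-8700", "055-388-6966", None, "055-000-0000~1"]
flags = [is_standard_phone(s) for s in samples]

print(flags)
# prints: [True, True, False, False]
```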

Most occurring characters

ValueCountFrequency (%)
5 1915
21.6%
- 1478
16.7%
0 1258
14.2%
3 1067
12.0%
8 645
 
7.3%
6 535
 
6.0%
1 516
 
5.8%
7 460
 
5.2%
2 378
 
4.3%
4 364
 
4.1%
Other values (2) 257
 
2.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7394
83.3%
Dash Punctuation 1478
 
16.7%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 1915
25.9%
0 1258
17.0%
3 1067
14.4%
8 645
 
8.7%
6 535
 
7.2%
1 516
 
7.0%
7 460
 
6.2%
2 378
 
5.1%
4 364
 
4.9%
9 256
 
3.5%
Dash Punctuation
ValueCountFrequency (%)
- 1478
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8873
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 1915
21.6%
- 1478
16.7%
0 1258
14.2%
3 1067
12.0%
8 645
 
7.3%
6 535
 
6.0%
1 516
 
5.8%
7 460
 
5.2%
2 378
 
4.3%
4 364
 
4.1%
Other values (2) 257
 
2.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8873
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 1915
21.6%
- 1478
16.7%
0 1258
14.2%
3 1067
12.0%
8 645
 
7.3%
6 535
 
6.0%
1 516
 
5.8%
7 460
 
5.2%
2 378
 
4.3%
4 364
 
4.1%
Other values (2) 257
 
2.9%

폐수발생량(㎥/일) (wastewater generated, m³/day)
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS

Distinct: 267
Distinct (%): 48.5%
Missing: 280
Missing (%): 33.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 73.763281
Minimum: 0
Maximum: 7055.76
Zeros: 34
Zeros (%): 4.1%
Negative: 0
Negative (%): 0.0%
Memory size: 7.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.43
Median: 2.6
Q3: 12
95-th percentile: 140.01
Maximum: 7055.76
Range: 7055.76
Interquartile range (IQR): 11.57

Descriptive statistics

Standard deviation: 453.00226
Coefficient of variation (CV): 6.1412976
Kurtosis: 126.96938
Mean: 73.763281
Median Absolute Deviation (MAD): 2.5
Skewness: 10.304228
Sum: 40643.568
Variance: 205211.05
Monotonicity: Not monotonic
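Several of these figures are tied together by simple arithmetic, which gives a quick consistency check. The sketch uses only numbers copied from this report. Two points are inferred from those numbers rather than stated: the mean is taken over the 551 non-missing rows, and Zeros (%) is taken over all 831 rows.

```python
n_rows = 831
n_missing = 280
n_valid = n_rows - n_missing      # rows with a recorded value

total = 40643.568                 # reported Sum
std = 453.00226                   # reported standard deviation
q1, q3 = 0.43, 12                 # reported quartiles
n_zeros = 34

mean = total / n_valid            # mean over non-missing rows only
cv = std / mean                   # coefficient of variation
variance = std ** 2
iqr = q3 - q1
zeros_pct = 100 * n_zeros / n_rows

print(n_valid, round(mean, 4), round(cv, 4), round(variance, 1),
      round(iqr, 2), round(zeros_pct, 1))
```

A standard deviation about six times the mean, together with a median of 2.6 against a maximum of 7055.76, reflects the strong right skew reported above.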
Histogram with fixed size bins (bins=50)
Most common values

Value | Count | Frequency (%)
0.0 | 34 | 4.1%
2.0 | 22 | 2.6%
3.0 | 20 | 2.4%
1.0 | 15 | 1.8%
0.1 | 14 | 1.7%
0.5 | 11 | 1.3%
10.0 | 11 | 1.3%
20.0 | 10 | 1.2%
11.0 | 8 | 1.0%
12.0 | 8 | 1.0%
Other values (257) | 398 | 47.9%
(Missing) | 280 | 33.7%

Smallest values

Value | Count | Frequency (%)
0.0 | 34 | 4.1%
0.01 | 4 | 0.5%
0.011 | 1 | 0.1%
0.02 | 4 | 0.5%
0.024 | 1 | 0.1%
0.04 | 1 | 0.1%
0.05 | 2 | 0.2%
0.06 | 1 | 0.1%
0.066 | 1 | 0.1%
0.06675 | 1 | 0.1%

Largest values

Value | Count | Frequency (%)
7055.76 | 1 | 0.1%
3990.1 | 1 | 0.1%
3563.0 | 1 | 0.1%
3200.0 | 1 | 0.1%
2817.195 | 1 | 0.1%
2700.0 | 1 | 0.1%
1800.0 | 1 | 0.1%
1573.0 | 1 | 0.1%
1296.5 | 1 | 0.1%
1184.0 | 1 | 0.1%

폐수처리방법 (wastewater treatment method)
Text

MISSING

Distinct: 65
Distinct (%): 11.8%
Missing: 280
Missing (%): 33.7%
Memory size: 6.6 KiB

Length

Max length: 35
Median length: 29
Mean length: 5.4627949
Min length: 2

Characters and Unicode

Total characters: 3010
Distinct characters: 81
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 42
Unique (%): 7.6%

Sample

1st row: 위탁
2nd row: 물리적처리
3rd row: 위탁
4th row: 물리화학적처리
5th row: 화승알앤에이병합처리
Value | Count | Frequency (%)
물리화학적처리 | 149 | 23.2%
위탁 | 147 | 22.9%
재이용 | 61 | 9.5%
위탁처리 | 43 | 6.7%
전량재이용 | 37 | 5.8%
물리화학 | 34 | 5.3%
물리적처리 | 21 | 3.3%
전량위탁처리 | 16 | 2.5%
폐수종말처리장 | 14 | 2.2%
유입 | 14 | 2.2%
Other values (48) | 107 | 16.6%

Most occurring characters

ValueCountFrequency (%)
516
17.1%
285
9.5%
255
 
8.5%
226
 
7.5%
223
 
7.4%
223
 
7.4%
209
 
6.9%
195
 
6.5%
111
 
3.7%
110
 
3.7%
Other values (71) 657
21.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 2874
95.5%
Space Separator 104
 
3.5%
Other Punctuation 9
 
0.3%
Open Punctuation 8
 
0.3%
Close Punctuation 7
 
0.2%
Uppercase Letter 3
 
0.1%
Math Symbol 2
 
0.1%
Decimal Number 2
 
0.1%
Other Symbol 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
516
18.0%
285
9.9%
255
8.9%
226
7.9%
223
7.8%
223
7.8%
209
7.3%
195
 
6.8%
111
 
3.9%
110
 
3.8%
Other values (59) 521
18.1%
Uppercase Letter
ValueCountFrequency (%)
C 1
33.3%
S 1
33.3%
M 1
33.3%
Other Punctuation
ValueCountFrequency (%)
, 8
88.9%
: 1
 
11.1%
Decimal Number
ValueCountFrequency (%)
1 1
50.0%
2 1
50.0%
Space Separator
ValueCountFrequency (%)
104
100.0%
Open Punctuation
ValueCountFrequency (%)
( 8
100.0%
Close Punctuation
ValueCountFrequency (%)
) 7
100.0%
Math Symbol
ValueCountFrequency (%)
+ 2
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 2875
95.5%
Common 132
 
4.4%
Latin 3
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
516
17.9%
285
9.9%
255
8.9%
226
7.9%
223
7.8%
223
7.8%
209
7.3%
195
 
6.8%
111
 
3.9%
110
 
3.8%
Other values (60) 522
18.2%
Common
ValueCountFrequency (%)
104
78.8%
( 8
 
6.1%
, 8
 
6.1%
) 7
 
5.3%
+ 2
 
1.5%
: 1
 
0.8%
1 1
 
0.8%
2 1
 
0.8%
Latin
ValueCountFrequency (%)
C 1
33.3%
S 1
33.3%
M 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 2874
95.5%
ASCII 135
 
4.5%
None 1
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
516
18.0%
285
9.9%
255
8.9%
226
7.9%
223
7.8%
223
7.8%
209
7.3%
195
 
6.8%
111
 
3.9%
110
 
3.8%
Other values (59) 521
18.1%
ASCII
ValueCountFrequency (%)
104
77.0%
( 8
 
5.9%
, 8
 
5.9%
) 7
 
5.2%
+ 2
 
1.5%
: 1
 
0.7%
C 1
 
0.7%
S 1
 
0.7%
M 1
 
0.7%
1 1
 
0.7%
None
ValueCountFrequency (%)
1
100.0%

Interactions


Correlations

Variable | 일련번호 | 대기 | 수질 | 폐수발생량(㎥/일) | 폐수처리방법
일련번호 | 1.000 | 0.267 | 0.000 | 0.000 | 0.704
대기 | 0.267 | 1.000 | 0.525 | 0.323 | 0.514
수질 | 0.000 | 0.525 | 1.000 | 0.800 | 0.775
폐수발생량(㎥/일) | 0.000 | 0.323 | 0.800 | 1.000 | 0.829
폐수처리방법 | 0.704 | 0.514 | 0.775 | 0.829 | 1.000

Variable | 수질 | 대기
수질 | 1.000 | 0.219
대기 | 0.219 | 1.000

Variable | 일련번호 | 폐수발생량(㎥/일) | 대기 | 수질
일련번호 | 1.000 | -0.082 | 0.113 | 0.000
폐수발생량(㎥/일) | -0.082 | 1.000 | 0.212 | 0.679
대기 | 0.113 | 0.212 | 1.000 | 0.219
수질 | 0.000 | 0.679 | 0.219 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

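First rows

Index | 일련번호 | 업소명 | 소재지 | 업종 | 대기 | 수질 | 전화번호 | 폐수발생량(㎥/일) | 폐수처리방법
0 | 1 | 명광기업㈜ | 경상남도 양산시 교동 114-2 | 기타자동차부품제조업 | 4 | 5 | 055-383-8700 | 0.13 | 위탁
1 | 2 | ㈜동호산업 | 경상남도 양산시 교동 117 | 플라스틱호스제조업 | 5 | 5 | 055-388-6966 | 0.34 | 물리적처리
2 | 3 | 에스케이정밀㈜ | 경상남도 양산시 교동 117-11 | 비철금속 | 5 | <NA> | 055-387-0477 | <NA> | <NA>
3 | 4 | 한미공업사 | 경상남도 양산시 교동 129 | 장비수선.세차 | 4 | <NA> | 055-384-0012 | <NA> | <NA>
4 | 5 | (주)화승소재 | 경상남도 양산시 교동 147-1 | 고무및플라스틱 | 2 | 5 | 055-370-3247 | 9.2 | 위탁
5 | 6 | (주)화승R&A | 경상남도 양산시 교동 147-1 | 고무제품제조 | 2 | 4 | 055-370-3242 | 190.0 | 물리화학적처리
6 | 7 | 대창산업 | 경상남도 양산시 교동 147-1 | 고무제품제조 | 5 | <NA> | 055-364-2756 | <NA> | <NA>
7 | 8 | 승호 | 경상남도 양산시 교동 147-1 | 고무제품제조 | 5 | <NA> | 055-381-8198 | <NA> | <NA>
8 | 9 | ㈜화승엑스윌 | 경상남도 양산시 교동 147-1 | 고무제품제조 | 4 | 5 | 055-370-3242 | 0.27 | 화승알앤에이병합처리
9 | 10 | 태광산업 | 경상남도 양산시 교동 147-1 | 고무제품제조 | 5 | <NA> | 055-381-1780 | <NA> | <NA>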

Last rows

Index | 일련번호 | 업소명 | 소재지 | 업종 | 대기 | 수질 | 전화번호 | 폐수발생량(㎥/일) | 폐수처리방법
821 | 822 | (주)동흥포장 | 경상남도 양산시 평산동 163-3 | 종이제품 | 4 | 5 | 055-365-4732 | 0.1 | 위탁
822 | 823 | 대남FM자동차서비스 | 경상남도 양산시 평산동 18-3 | 자동차정비 | 4 | <NA> | 055-365-1234 | <NA> | <NA>
823 | 824 | 골든24시셀프세차장 | 경상남도 양산시 평산동 19B-4L(w모텔 앞) | 세차시설 | <NA> | 5 | <NA> | 12.0 | 물리화학
824 | 825 | 혜인요양병원 | 경상남도 양산시 평산동 31-5, 23번지 | 일반 병원 | <NA> | 5 | 055-385-7551 | 0.02 | 위탁
825 | 826 | 영창목재산업 | 경상남도 양산시 평산동 58-4 | 목제가공업 | 4 | <NA> | 055-366-2123 | <NA> | <NA>
826 | 827 | 웅상정비센터 | 경상남도 양산시 평산동 76-4 | 자동차정비업 | 4 | <NA> | 055-364-8285 | <NA> | <NA>
827 | 828 | 양산시 웅상정수장 | 경상남도 양산시 평산동 800 | 수도사업시설 | <NA> | 3 | 055-382-9005 | 594.0 | 물리,화학,생물
828 | 829 | 문화세차장 | 경상남도 양산시 평산동 9-1 | 세차시설 | <NA> | 5 | <NA> | 2.0 | 물리화학
829 | 830 | 은진자동차종합정비 | 경상남도 양산시 평산동 토지구획 26B 9L | 세차시설 | 5 | <NA> | 055-388-0670 | <NA> | <NA>
830 | 831 | 평산셀프 | 경상남도 양산시 평산동 108-7 | 운수장비수선 및 세차시설 | <NA> | 5 | <NA> | 5.0 | 물리화학