Overview

Dataset statistics

Number of variables: 7
Number of observations: 576
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 32.2 KiB
Average record size in memory: 57.2 B

Variable types

Numeric: 1
DateTime: 1
Categorical: 3
Text: 2

Dataset

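Description: Enforcement status of environmental pollutant-emitting facilities in Gyeongsangnam-do. Includes fields such as inspection date (지도점검일자), business category (업종명), company name (업체명칭), location (소재지), inspection result (점검결과), and inspection type (점검구분).
Author: 경상남도 (Gyeongsangnam-do)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15047244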

Alerts

점검구분 is highly imbalanced (67.3%) [Imbalance]
번호 has unique values [Unique]
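
The 67.3% figure in the imbalance alert can be reproduced from the 점검구분 counts reported further down (516, 56, 4). The sketch below assumes the score is one minus the Shannon entropy normalized by the maximum entropy for that number of classes; the function name `imbalance_score` is hypothetical and this is not taken from the profiler's source.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a perfectly balanced variable, 1 for a single class.

    H is the Shannon entropy (in bits) of the class proportions and k is the
    number of classes. Assumed formula, chosen because it matches the report.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Counts for 점검구분 as listed in the report
class_counts = {"정기": 516, "수시": 56, "기타": 4}
score = imbalance_score(list(class_counts.values()))
print(f"{score:.1%}")
```

A score near 1 means almost all rows fall into one class, which matters if 점검구분 is later used as a target or a grouping key.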

Reproduction

Analysis started: 2023-08-15 04:10:36.577822
Analysis finished: 2023-08-15 04:10:38.023608
Duration: 1.45 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

번호
Real number (ℝ)

UNIQUE 

Distinct: 576
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 288.5
Minimum: 1
Maximum: 576
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.2 KiB

Quantile statistics

Minimum: 1
5th percentile: 29.75
Q1: 144.75
Median: 288.5
Q3: 432.25
95th percentile: 547.25
Maximum: 576
Range: 575
Interquartile range (IQR): 287.5
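
Since 번호 is simply the sequence 1 to 576, the quantiles above can be checked by hand. The sketch below assumes linear interpolation between neighbouring order statistics, which reproduces the reported values; the helper name `percentile` is hypothetical.

```python
def percentile(sorted_values, q):
    """Linearly interpolated q-th percentile (0 <= q <= 100) of a sorted list."""
    pos = (len(sorted_values) - 1) * q / 100   # fractional index into the data
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * frac

values = list(range(1, 577))  # 번호 runs from 1 to 576

p5 = percentile(values, 5)
q1 = percentile(values, 25)
median = percentile(values, 50)
q3 = percentile(values, 75)
p95 = percentile(values, 95)
iqr = q3 - q1

print(p5, q1, median, q3, p95, iqr)
```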

Descriptive statistics

Standard deviation: 166.42115
Coefficient of variation (CV): 0.57684975
Kurtosis: -1.2
Mean: 288.5
Median Absolute Deviation (MAD): 144
Skewness: 0
Sum: 166176
Variance: 27696
Monotonicity: Strictly increasing
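
The descriptive statistics for 번호 follow directly from the sequence 1 to 576. A minimal check with the standard library, assuming the report uses the sample (n - 1) variance, which is what matches the reported 27696:

```python
import statistics

values = list(range(1, 577))  # 번호 runs from 1 to 576

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
stdev = statistics.stdev(values)
cv = stdev / mean                        # coefficient of variation

# Median absolute deviation: median of |x - median(x)|
med = statistics.median(values)
mad = statistics.median(abs(x - med) for x in values)

print(total, mean, variance, round(stdev, 5), round(cv, 8), mad)
```

A row identifier behaves like a discrete uniform variable, which is why the skewness is 0 and these statistics carry no information about the inspections themselves.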
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1 | 1 | 0.2%
290 | 1 | 0.2%
382 | 1 | 0.2%
383 | 1 | 0.2%
384 | 1 | 0.2%
385 | 1 | 0.2%
386 | 1 | 0.2%
387 | 1 | 0.2%
388 | 1 | 0.2%
389 | 1 | 0.2%
Other values (566) | 566 | 98.3%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 0.2%
2 | 1 | 0.2%
3 | 1 | 0.2%
4 | 1 | 0.2%
5 | 1 | 0.2%
6 | 1 | 0.2%
7 | 1 | 0.2%
8 | 1 | 0.2%
9 | 1 | 0.2%
10 | 1 | 0.2%

Maximum 10 values
Value | Count | Frequency (%)
576 | 1 | 0.2%
575 | 1 | 0.2%
574 | 1 | 0.2%
573 | 1 | 0.2%
572 | 1 | 0.2%
571 | 1 | 0.2%
570 | 1 | 0.2%
569 | 1 | 0.2%
568 | 1 | 0.2%
567 | 1 | 0.2%

지도점검일자
DateTime

Distinct: 171
Distinct (%): 29.7%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 KiB
Minimum: 2015-01-14 00:00:00
Maximum: 2015-12-30 00:00:00
Histogram with fixed size bins (bins=50)

업종명
Categorical

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 KiB
대기배출업소관리: 314
폐수배출업소관리: 262

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 폐수배출업소관리
2nd row: 폐수배출업소관리
3rd row: 폐수배출업소관리
4th row: 폐수배출업소관리
5th row: 폐수배출업소관리

Common Values

Value | Count | Frequency (%)
대기배출업소관리 | 314 | 54.5%
폐수배출업소관리 | 262 | 45.5%
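
As a rough illustration of how a common-values table like this is derived, the sketch below counts category values and converts the counts to percentages of all rows. The helper name `frequency_table` is hypothetical, and the input list is rebuilt from the counts reported above rather than loaded from the dataset.

```python
from collections import Counter

def frequency_table(values):
    """Return (value, count, percent) tuples, most frequent first."""
    counts = Counter(values)
    total = sum(counts.values())
    return [
        (value, count, round(100 * count / total, 1))
        for value, count in counts.most_common()
    ]

# Reconstructed from the reported counts for 업종명
values = ["대기배출업소관리"] * 314 + ["폐수배출업소관리"] * 262
table = frequency_table(values)
for value, count, percent in table:
    print(f"{value} | {count} | {percent}%")
```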

Length

Histogram of lengths of the category


업체명칭
Text

Distinct: 99
Distinct (%): 17.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 KiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 4032
Distinct characters: 95
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
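
To illustrate the kind of character property this refers to, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module, using one of the masked company names from the sample rows. The mapping to the long category names used in this report covers only the categories that occur in the example, and `category_counts` is a hypothetical helper.

```python
import unicodedata
from collections import Counter

# Short general-category codes mapped to the long names used in this report
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Po": "Other Punctuation",
    "Ps": "Open Punctuation",
    "So": "Other Symbol",
}

def category_counts(text):
    """Count characters in text by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

sample = "(주*****"  # masked company name as it appears in the sample rows
counts = category_counts(sample)
for code, count in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
```

The asterisks used for masking fall under Other Punctuation, which is why that category dominates the character counts for this variable.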

Unique

Unique: 24
Unique (%): 4.2%

Sample

1st row: 효성*****
2nd row: 현대*****
3rd row: (주*****
4th row: 넥센*****
5th row: (주*****
ValueCountFrequency (%)
154
26.7%
현대 28
 
4.9%
고려 24
 
4.2%
두산 23
 
4.0%
한국 16
 
2.8%
에스 12
 
2.1%
넥센 11
 
1.9%
하이 10
 
1.7%
㈜동 10
 
1.7%
무림 9
 
1.6%
Other values (90) 280
48.5%

Most occurring characters

ValueCountFrequency (%)
* 2880
71.4%
163
 
4.0%
( 155
 
3.8%
59
 
1.5%
47
 
1.2%
34
 
0.8%
31
 
0.8%
29
 
0.7%
28
 
0.7%
27
 
0.7%
Other values (85) 579
 
14.4%

Most occurring categories

Value | Count | Frequency (%)
Other Punctuation | 2882 | 71.5%
Other Letter | 923 | 22.9%
Open Punctuation | 155 | 3.8%
Other Symbol | 47 | 1.2%
Uppercase Letter | 24 | 0.6%
Space Separator | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
163
 
17.7%
59
 
6.4%
34
 
3.7%
31
 
3.4%
29
 
3.1%
28
 
3.0%
27
 
2.9%
27
 
2.9%
25
 
2.7%
24
 
2.6%
Other values (73) 476
51.6%
Uppercase Letter
ValueCountFrequency (%)
C 6
25.0%
K 6
25.0%
S 6
25.0%
G 2
 
8.3%
B 2
 
8.3%
T 1
 
4.2%
L 1
 
4.2%
Other Punctuation
ValueCountFrequency (%)
* 2880
99.9%
& 2
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 155
100.0%
Other Symbol
ValueCountFrequency (%)
47
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 3038 | 75.3%
Hangul | 970 | 24.1%
Latin | 24 | 0.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
163
 
16.8%
59
 
6.1%
47
 
4.8%
34
 
3.5%
31
 
3.2%
29
 
3.0%
28
 
2.9%
27
 
2.8%
27
 
2.8%
25
 
2.6%
Other values (74) 500
51.5%
Latin
ValueCountFrequency (%)
C 6
25.0%
K 6
25.0%
S 6
25.0%
G 2
 
8.3%
B 2
 
8.3%
T 1
 
4.2%
L 1
 
4.2%
Common
ValueCountFrequency (%)
* 2880
94.8%
( 155
 
5.1%
& 2
 
0.1%
1
 
< 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 3062 | 75.9%
Hangul | 923 | 22.9%
None | 47 | 1.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
* 2880
94.1%
( 155
 
5.1%
C 6
 
0.2%
K 6
 
0.2%
S 6
 
0.2%
G 2
 
0.1%
B 2
 
0.1%
& 2
 
0.1%
T 1
 
< 0.1%
1
 
< 0.1%
Hangul
ValueCountFrequency (%)
163
 
17.7%
59
 
6.4%
34
 
3.7%
31
 
3.4%
29
 
3.1%
28
 
3.0%
27
 
2.9%
27
 
2.9%
25
 
2.7%
24
 
2.6%
Other values (73) 476
51.6%
None
ValueCountFrequency (%)
47
100.0%

소재지
Categorical

Distinct: 19
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 KiB
경상남도 창원시**** ** **: 258
경상남도 양산시**** ** **: 76
경상남도 함안군**** ** **: 56
경상남도 진주시**** ** **: 52
경상남도 창원시: 37
Other values (14): 97

Length

Max length: 18
Median length: 18
Mean length: 17.027778
Min length: 8

Unique

Unique: 3
Unique (%): 0.5%

Sample

1st row: 경상남도 창원시
2nd row: 경상남도 창원시**** ** **
3rd row: 경상남도 창원시
4th row: 경상남도 창녕군**** ** **
5th row: 경상남도 양산시**** ** **

Common Values

Value | Count | Frequency (%)
경상남도 창원시**** ** ** | 258 | 44.8%
경상남도 양산시**** ** ** | 76 | 13.2%
경상남도 함안군**** ** ** | 56 | 9.7%
경상남도 진주시**** ** ** | 52 | 9.0%
경상남도 창원시 | 37 | 6.4%
경상남도 사천시**** ** ** | 24 | 4.2%
경상남도 창녕군**** ** ** | 17 | 3.0%
경상남도 거제시**** ** ** | 13 | 2.3%
경상남도 김해시**** ** ** | 10 | 1.7%
경상남도 마산시**** ** ** | 8 | 1.4%
Other values (9) | 25 | 4.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
1040
47.4%
경상남도 576
26.3%
창원시 295
 
13.5%
양산시 78
 
3.6%
함안군 63
 
2.9%
진주시 55
 
2.5%
사천시 24
 
1.1%
창녕군 22
 
1.0%
거제시 14
 
0.6%
김해시 10
 
0.5%
Other values (4) 15
 
0.7%

점 검 결 과
Text

Distinct: 142
Distinct (%): 24.7%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 KiB

Length

Max length: 100
Median length: 11
Mean length: 17.920139
Min length: 6

Characters and Unicode

Total characters: 10322
Distinct characters: 261
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 128
Unique (%): 22.2%

Sample

1st row: 특이사항 발견치 못함,- 가동시작 확인.-채수 : 4L*1, 1L*3
2nd row: 특이사항 발견치 못함,- 미채수 : 가동시작 대상 배설시설 전량위탁
3rd row: 특이사항 발견치 못함,* 가동시작 확인, 미채수(전량위탁)
4th row: 특이사항 발견치 못함,4L*3, 1L*6,* 기본배출부과금 부과를 위한 채수 병행
5th row: 특이사항 발견치 못함,시료 미채수(별도배출허용기준 면제 사업장)
ValueCountFrequency (%)
못함 491
19.9%
특이사항 454
18.4%
발견치 430
17.5%
49
 
2.0%
확인치 38
 
1.5%
위반사항 30
 
1.2%
29
 
1.2%
징구 22
 
0.9%
특이사항확인치 19
 
0.8%
따른 19
 
0.8%
Other values (433) 881
35.8%

Most occurring characters

ValueCountFrequency (%)
1899
18.4%
543
 
5.3%
538
 
5.2%
531
 
5.1%
528
 
5.1%
509
 
4.9%
509
 
4.9%
487
 
4.7%
457
 
4.4%
450
 
4.4%
Other values (251) 3871
37.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 7530 | 73.0%
Space Separator | 1899 | 18.4%
Decimal Number | 337 | 3.3%
Other Punctuation | 309 | 3.0%
Open Punctuation | 74 | 0.7%
Close Punctuation | 71 | 0.7%
Uppercase Letter | 42 | 0.4%
Dash Punctuation | 38 | 0.4%
Math Symbol | 11 | 0.1%
Lowercase Letter | 10 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
543
 
7.2%
538
 
7.1%
531
 
7.1%
528
 
7.0%
509
 
6.8%
509
 
6.8%
487
 
6.5%
457
 
6.1%
450
 
6.0%
136
 
1.8%
Other values (212) 2842
37.7%
Decimal Number
ValueCountFrequency (%)
1 134
39.8%
4 56
16.6%
2 40
 
11.9%
3 35
 
10.4%
0 28
 
8.3%
5 23
 
6.8%
7 12
 
3.6%
6 4
 
1.2%
9 3
 
0.9%
8 2
 
0.6%
Other Punctuation
ValueCountFrequency (%)
, 154
49.8%
. 63
20.4%
* 52
 
16.8%
: 20
 
6.5%
' 10
 
3.2%
4
 
1.3%
/ 3
 
1.0%
\ 2
 
0.6%
· 1
 
0.3%
Uppercase Letter
ValueCountFrequency (%)
L 36
85.7%
T 2
 
4.8%
C 1
 
2.4%
H 1
 
2.4%
S 1
 
2.4%
M 1
 
2.4%
Open Punctuation
ValueCountFrequency (%)
( 72
97.3%
[ 1
 
1.4%
1
 
1.4%
Close Punctuation
ValueCountFrequency (%)
) 69
97.2%
] 1
 
1.4%
1
 
1.4%
Math Symbol
ValueCountFrequency (%)
> 7
63.6%
~ 3
27.3%
= 1
 
9.1%
Lowercase Letter
ValueCountFrequency (%)
x 9
90.0%
m 1
 
10.0%
Space Separator
ValueCountFrequency (%)
1899
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 38
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 7530 | 73.0%
Common | 2740 | 26.5%
Latin | 52 | 0.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
543
 
7.2%
538
 
7.1%
531
 
7.1%
528
 
7.0%
509
 
6.8%
509
 
6.8%
487
 
6.5%
457
 
6.1%
450
 
6.0%
136
 
1.8%
Other values (212) 2842
37.7%
Common
ValueCountFrequency (%)
1899
69.3%
, 154
 
5.6%
1 134
 
4.9%
( 72
 
2.6%
) 69
 
2.5%
. 63
 
2.3%
4 56
 
2.0%
* 52
 
1.9%
2 40
 
1.5%
- 38
 
1.4%
Other values (21) 163
 
5.9%
Latin
ValueCountFrequency (%)
L 36
69.2%
x 9
 
17.3%
T 2
 
3.8%
C 1
 
1.9%
H 1
 
1.9%
S 1
 
1.9%
M 1
 
1.9%
m 1
 
1.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 7522 | 72.9%
ASCII | 2784 | 27.0%
Compat Jamo | 8 | 0.1%
Punctuation | 4 | < 0.1%
None | 3 | < 0.1%
CJK Compat | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1899
68.2%
, 154
 
5.5%
1 134
 
4.8%
( 72
 
2.6%
) 69
 
2.5%
. 63
 
2.3%
4 56
 
2.0%
* 52
 
1.9%
2 40
 
1.4%
- 38
 
1.4%
Other values (24) 207
 
7.4%
Hangul
ValueCountFrequency (%)
543
 
7.2%
538
 
7.2%
531
 
7.1%
528
 
7.0%
509
 
6.8%
509
 
6.8%
487
 
6.5%
457
 
6.1%
450
 
6.0%
136
 
1.8%
Other values (210) 2834
37.7%
Compat Jamo
ValueCountFrequency (%)
7
87.5%
1
 
12.5%
Punctuation
ValueCountFrequency (%)
4
100.0%
CJK Compat
ValueCountFrequency (%)
1
100.0%
None
ValueCountFrequency (%)
1
33.3%
· 1
33.3%
1
33.3%

점검구분
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 KiB
정기: 516
수시: 56
기타: 4

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 수시
2nd row: 정기
3rd row: 정기
4th row: 정기
5th row: 정기

Common Values

Value | Count | Frequency (%)
정기 | 516 | 89.6%
수시 | 56 | 9.7%
기타 | 4 | 0.7%

Length

Histogram of lengths of the category


Interactions


Correlations

 | 번호 | 업종명 | 업체명칭 | 소재지 | 점검구분
번호 | 1.000 | 0.206 | 0.617 | 0.369 | 0.328
업종명 | 0.206 | 1.000 | 0.000 | 0.000 | 0.047
업체명칭 | 0.617 | 0.000 | 1.000 | 0.940 | 0.614
소재지 | 0.369 | 0.000 | 0.940 | 1.000 | 0.142
점검구분 | 0.328 | 0.047 | 0.614 | 0.142 | 1.000

 | 업종명 | 점검구분 | 소재지
업종명 | 1.000 | 0.077 | 0.000
점검구분 | 0.077 | 1.000 | 0.074
소재지 | 0.000 | 0.074 | 1.000

 | 번호 | 업종명 | 소재지 | 점검구분
번호 | 1.000 | 0.157 | 0.145 | 0.206
업종명 | 0.157 | 1.000 | 0.000 | 0.077
소재지 | 0.145 | 0.000 | 1.000 | 0.074
점검구분 | 0.206 | 0.077 | 0.074 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
 | 번호 | 지도점검일자 | 업종명 | 업체명칭 | 소재지 | 점 검 결 과 | 점검구분
0 | 1 | 2015-01-14 | 폐수배출업소관리 | 효성***** | 경상남도 창원시 | 특이사항 발견치 못함,- 가동시작 확인.-채수 : 4L*1, 1L*3 | 수시
1 | 2 | 2015-01-19 | 폐수배출업소관리 | 현대***** | 경상남도 창원시**** ** ** | 특이사항 발견치 못함,- 미채수 : 가동시작 대상 배설시설 전량위탁 | 정기
2 | 3 | 2015-01-21 | 폐수배출업소관리 | (주***** | 경상남도 창원시 | 특이사항 발견치 못함,* 가동시작 확인, 미채수(전량위탁) | 정기
3 | 4 | 2015-01-21 | 폐수배출업소관리 | 넥센***** | 경상남도 창녕군**** ** ** | 특이사항 발견치 못함,4L*3, 1L*6,* 기본배출부과금 부과를 위한 채수 병행 | 정기
4 | 5 | 2015-01-26 | 폐수배출업소관리 | (주***** | 경상남도 양산시**** ** ** | 특이사항 발견치 못함,시료 미채수(별도배출허용기준 면제 사업장) | 정기
5 | 6 | 2015-01-26 | 폐수배출업소관리 | (주***** | 경상남도 양산시**** ** ** | 특이사항 발견치 못함,시료 채수 1리터*2 | 정기
6 | 7 | 2015-01-27 | 대기배출업소관리 | 아시***** | 경상남도 창원시**** ** ** | 대기오염 배출시설 및 방지시설 운영기록 미보존 및 미작성, 위반확인서 징구 | 정기
7 | 8 | 2015-01-28 | 대기배출업소관리 | 한국***** | 경상남도 창원시**** ** ** | 특이사항 발견하지 못함 | 정기
8 | 9 | 2015-01-28 | 폐수배출업소관리 | 영흥***** | 경상남도 창원시**** ** ** | 특이사항 발견치 못함,- 4L*1, 1L*1 | 정기
9 | 10 | 2015-01-30 | 폐수배출업소관리 | (주***** | 경상남도 창원시**** ** ** | 특이사항 발견치 못함,- 미채수(방류량 부족) | 정기

Last rows
 | 번호 | 지도점검일자 | 업종명 | 업체명칭 | 소재지 | 점 검 결 과 | 점검구분
566 | 567 | 2015-11-23 | 폐수배출업소관리 | (주***** | 경상남도 창원시**** ** ** | 가동시작개시 및 폐수시설 지도점검, 시료 미채수 (위탁처리) | 정기
567 | 568 | 2015-12-23 | 폐수배출업소관리 | 삼신***** | 경상남도 함안군 | 특이사항 발견치 못함 | 정기
568 | 569 | 2015-12-23 | 폐수배출업소관리 | 진주***** | 경상남도 진주시**** ** ** | 특이사항 발견치 못함 | 정기
569 | 570 | 2015-12-21 | 폐수배출업소관리 | 대우***** | 경상남도 거제시**** ** ** | 특이사항 발견치 못함 | 정기
570 | 571 | 2015-12-30 | 폐수배출업소관리 | 남강***** | 경상남도 진주시**** ** ** | 특이사항 발견치 못함 | 정기
571 | 572 | 2015-12-30 | 폐수배출업소관리 | (주***** | 경상남도 함안군**** ** ** | 특이사항 발견치 못함 | 정기
572 | 573 | 2015-12-30 | 폐수배출업소관리 | 동일***** | 경상남도 진주시**** ** ** | 특이사항 발견치 못함 | 정기
573 | 574 | 2015-12-30 | 폐수배출업소관리 | 무림***** | 경상남도 진주시**** ** ** | 특이사항 발견치 못함 | 정기
574 | 575 | 2015-12-21 | 대기배출업소관리 | GS***** | 경상남도 진주시**** ** ** | 특이사항발견치 못함 | 정기
575 | 576 | 2015-12-21 | 대기배출업소관리 | 한국***** | 경상남도 거제시 | 특이사항 발견치 못함 | 정기