Overview

Dataset statistics

Number of variables: 7
Number of observations: 285
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 16.3 KiB
Average record size in memory: 58.5 B

Variable types

Numeric: 1
Categorical: 2
Text: 2
DateTime: 2

Dataset

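Description: Lists the wastewater-discharging businesses located in Gwangsan-gu, Gwangju Metropolitan City, with business type (업종), business name (업체명), address (주소), discharge volume class (배출량단위(종)), registration date (등록일자), and data reference date (데이터기준일자).
URL: https://www.data.go.kr/data/15032226/fileData.do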

Alerts

데이터기준일자 has constant value "2022-12-31" (Constant)
업종 is highly overall correlated with 배출량단위(종) (High correlation)
배출량단위(종) is highly overall correlated with 업종 (High correlation)
배출량단위(종) is highly imbalanced (85.3%) (Imbalance)
연번 has unique values (Unique)
업체명 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 09:54:23.121499
Analysis finished: 2023-12-12 09:54:23.775033
Duration: 0.65 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
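
A report of this kind could be regenerated roughly as follows. This is a sketch under stated assumptions: the `build_report` helper, the CSV path, the output file name, and the `cp949` encoding are hypothetical and do not come from the report. The two date column names are taken from the Sample section, and `ProfileReport` with `to_file` is the usual ydata-profiling entry point.

```python
def build_report(csv_path, output_html="report.html", encoding="cp949"):
    """Profile a CSV export of the dataset and write an HTML report.

    Imports are done lazily so that defining this helper does not
    require pandas or ydata-profiling to be installed.
    """
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the file is a CSV whose date columns are named as in
    # the report's Sample section.
    df = pd.read_csv(
        csv_path,
        encoding=encoding,
        parse_dates=["등록일자", "데이터기준일자"],
    )
    report = ProfileReport(df, title="Wastewater-discharging businesses")
    report.to_file(output_html)
    return report


# Hypothetical usage (file name is an assumption):
# build_report("gwangsan_wastewater.csv")
```

The exact variable types in the output depend on how the columns are parsed, so a rerun may not match this report line for line.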

Variables

연번
Real number (ℝ)

UNIQUE 

Distinct: 285
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 143
Minimum: 1
Maximum: 285
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 15.2
Q1: 72
Median: 143
Q3: 214
95-th percentile: 270.8
Maximum: 285
Range: 284
Interquartile range (IQR): 142

Descriptive statistics

Standard deviation: 82.416625
Coefficient of variation (CV): 0.57634003
Kurtosis: -1.2
Mean: 143
Median Absolute Deviation (MAD): 71
Skewness: 0
Sum: 40755
Variance: 6792.5
Monotonicity: Strictly increasing
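
The figures above are consistent with 연번 being the plain sequence 1 to 285. The sketch below recomputes them with the Python standard library under that assumption. It also assumes that the variance uses the sample (n - 1) denominator and that the percentiles use linear interpolation, which is what the `inclusive` method of `statistics.quantiles` does.

```python
import statistics

# Assumption: 연번 is the integer sequence 1..285 (min 1, max 285,
# 285 distinct values, strictly increasing).
values = list(range(1, 286))

total = sum(values)
mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)  # sample variance (n - 1)
std_dev = statistics.stdev(values)
cv = std_dev / mean  # coefficient of variation
mad = statistics.median(abs(v - median) for v in values)

# Quartiles and 5th/95th percentiles with linear interpolation.
quartiles = statistics.quantiles(values, n=4, method="inclusive")
ventiles = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = ventiles[0], ventiles[-1]
iqr = quartiles[2] - quartiles[0]

strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

print(total, mean, variance, round(std_dev, 6), round(cv, 8), mad)
print(p5, quartiles, p95, iqr, strictly_increasing)
```

Since the column behaves like a row counter, these statistics describe the numbering rather than any property of the businesses.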
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
1 | 1 | 0.4%
189 | 1 | 0.4%
195 | 1 | 0.4%
194 | 1 | 0.4%
193 | 1 | 0.4%
192 | 1 | 0.4%
191 | 1 | 0.4%
190 | 1 | 0.4%
188 | 1 | 0.4%
197 | 1 | 0.4%
Other values (275) | 275 | 96.5%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 0.4%
2 | 1 | 0.4%
3 | 1 | 0.4%
4 | 1 | 0.4%
5 | 1 | 0.4%
6 | 1 | 0.4%
7 | 1 | 0.4%
8 | 1 | 0.4%
9 | 1 | 0.4%
10 | 1 | 0.4%

Maximum 10 values
Value | Count | Frequency (%)
285 | 1 | 0.4%
284 | 1 | 0.4%
283 | 1 | 0.4%
282 | 1 | 0.4%
281 | 1 | 0.4%
280 | 1 | 0.4%
279 | 1 | 0.4%
278 | 1 | 0.4%
277 | 1 | 0.4%
276 | 1 | 0.4%

업종
Categorical

HIGH CORRELATION 

Distinct: 31
Distinct (%): 10.9%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB
주유소(세차): 87
세차장: 82
공업사: 21
충전소(세차): 19
(기타)요양병원: 10
Other values (26): 66

Length

Max length: 21
Median length: 19
Mean length: 6.4105263
Min length: 3

Unique

Unique: 12
Unique (%): 4.2%

Sample

1st row: (식품)도축업
2nd row: (비금)콘크리트제품제조
3rd row: (비금)콘크리트제품제조
4th row: 세차장
5th row: (식품)식료품제조업

Common Values

Value | Count | Frequency (%)
주유소(세차) | 87 | 30.5%
세차장 | 82 | 28.8%
공업사 | 21 | 7.4%
충전소(세차) | 19 | 6.7%
(기타)요양병원 | 10 | 3.5%
(기타)일반병원 | 7 | 2.5%
(비금)폐기물중간처리업(재활용전문) | 7 | 2.5%
차고지(세차) | 6 | 2.1%
(비금)콘크리트제품제조 | 5 | 1.8%
(비금)건설폐기물중간처리업(재활용전문) | 5 | 1.8%
Other values (21) | 36 | 12.6%
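
Several 업종 values start with a parenthesised prefix such as (식품), (비금), or (기타), which looks like a broader grouping placed in front of the specific business type. That reading is an assumption, not something the report states. Under it, the prefix can be split off with a regular expression. The `split_industry` helper below is hypothetical.

```python
import re

# Matches a leading "(group)" followed by the rest of the value.
PREFIX_RE = re.compile(r"^\((?P<group>[^)]+)\)(?P<name>.+)$")


def split_industry(value):
    """Return (group, name); group is None when there is no leading prefix."""
    match = PREFIX_RE.match(value)
    if match:
        return match.group("group"), match.group("name")
    return None, value


for raw in ["(식품)도축업", "주유소(세차)", "(비금)폐기물중간처리업(재활용전문)"]:
    print(raw, "->", split_industry(raw))
```

Values such as 주유소(세차) carry their parentheses at the end, so they are returned unchanged with no group.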

Length

Histogram of lengths of the category

Word frequencies (whitespace-separated tokens; leading and trailing punctuation appears to be stripped, so 주유소(세차) is listed as 주유소(세차):
Value | Count | Frequency (%)
주유소(세차 | 87 | 29.3%
세차장 | 82 | 27.6%
공업사 | 21 | 7.1%
충전소(세차 | 19 | 6.4%
기타)요양병원 | 10 | 3.4%
기타)일반병원 | 7 | 2.4%
비금)폐기물중간처리업(재활용전문 | 7 | 2.4%
차고지(세차 | 6 | 2.0%
비금)콘크리트제품제조 | 5 | 1.7%
비금)건설폐기물중간처리업(재활용전문 | 5 | 1.7%
Other values (28) | 48 | 16.2%

업체명
Text

UNIQUE 

Distinct: 285
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 31
Median length: 19
Mean length: 7.645614
Min length: 2

Characters and Unicode

Total characters: 2179
Distinct characters: 325
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
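
As an illustration of those character properties, the sketch below counts Unicode general categories for one business name from the Sample section using the standard `unicodedata` module. The `category_counts` helper is hypothetical. The two-letter codes correspond to the long names used in this report, for example `Lo` for Other Letter, `So` for Other Symbol, and `Zs` for Space Separator.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters in text by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# One of the longest business names in the Sample section.
name = "동광건설㈜ 광주지역전기공급시설전력구공사(수완~하남) #1"
counts = category_counts(name)

for code, count in counts.most_common():
    print(code, count)
```

Summing such counters over every value in a column would give a table like "Most occurring categories", although the report may compute it differently internally.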

Unique

Unique: 285
Unique (%): 100.0%

Sample

1st row: 삼국산업
2nd row: 현대개발㈜
3rd row: ㈜남도
4th row: 장세차장
5th row: ㈜남양냉동식품

Word frequencies (whitespace-separated tokens; leading and trailing punctuation appears to be stripped):
Value | Count | Frequency (%)
컴인워시 | 3 | 0.9%
주식회사 | 2 | 0.6%
농업회사법인 | 2 | 0.6%
동광건설㈜ | 2 | 0.6%
광주지역전기공급시설전력구공사(수완~하남 | 2 | 0.6%
현대오일뱅크㈜직영 | 2 | 0.6%
세차장 | 2 | 0.6%
첨단점 | 2 | 0.6%
㈜제일에너지 | 2 | 0.6%
광주올카 | 1 | 0.3%
Other values (308) | 308 | 93.9%

Most occurring characters

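Value | Count | Frequency (%)
(not preserved) | 112 | 5.1%
(not preserved) | 100 | 4.6%
(not preserved) | 95 | 4.4%
(not preserved) | 73 | 3.4%
(not preserved) | 52 | 2.4%
(not preserved) | 45 | 2.1%
(space) | 43 | 2.0%
(not preserved) | 42 | 1.9%
(not preserved) | 35 | 1.6%
(not preserved) | 35 | 1.6%
Other values (315) | 1547 | 71.0%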

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1891 | 86.8%
Uppercase Letter | 76 | 3.5%
Other Symbol | 73 | 3.4%
Space Separator | 43 | 2.0%
Close Punctuation | 27 | 1.2%
Open Punctuation | 27 | 1.2%
Lowercase Letter | 21 | 1.0%
Decimal Number | 11 | 0.5%
Other Punctuation | 7 | 0.3%
Math Symbol | 2 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 112 | 5.9%
(not preserved) | 100 | 5.3%
(not preserved) | 95 | 5.0%
(not preserved) | 52 | 2.7%
(not preserved) | 45 | 2.4%
(not preserved) | 42 | 2.2%
(not preserved) | 35 | 1.9%
(not preserved) | 35 | 1.9%
(not preserved) | 31 | 1.6%
(not preserved) | 29 | 1.5%
Other values (269) | 1315 | 69.5%

Uppercase Letter
Value | Count | Frequency (%)
G | 11 | 14.5%
L | 10 | 13.2%
K | 9 | 11.8%
P | 9 | 11.8%
S | 7 | 9.2%
C | 4 | 5.3%
I | 4 | 5.3%
D | 4 | 5.3%
J | 3 | 3.9%
A | 3 | 3.9%
Other values (8) | 12 | 15.8%

Lowercase Letter
Value | Count | Frequency (%)
l | 3 | 14.3%
i | 3 | 14.3%
e | 3 | 14.3%
t | 2 | 9.5%
f | 2 | 9.5%
s | 2 | 9.5%
x | 1 | 4.8%
n | 1 | 4.8%
m | 1 | 4.8%
o | 1 | 4.8%
Other values (2) | 2 | 9.5%

Decimal Number
Value | Count | Frequency (%)
1 | 3 | 27.3%
2 | 3 | 27.3%
3 | 2 | 18.2%
5 | 1 | 9.1%
6 | 1 | 9.1%
9 | 1 | 9.1%

Other Punctuation
Value | Count | Frequency (%)
# | 3 | 42.9%
& | 2 | 28.6%
. | 1 | 14.3%
, | 1 | 14.3%

Other Symbol
Value | Count | Frequency (%)
(not preserved) | 73 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 43 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 27 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 27 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 2 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1964 | 90.1%
Common | 118 | 5.4%
Latin | 97 | 4.5%

Most frequent character per script

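Hangul
Value | Count | Frequency (%)
(not preserved) | 112 | 5.7%
(not preserved) | 100 | 5.1%
(not preserved) | 95 | 4.8%
(not preserved) | 73 | 3.7%
(not preserved) | 52 | 2.6%
(not preserved) | 45 | 2.3%
(not preserved) | 42 | 2.1%
(not preserved) | 35 | 1.8%
(not preserved) | 35 | 1.8%
(not preserved) | 31 | 1.6%
Other values (270) | 1344 | 68.4%

Latin
Value | Count | Frequency (%)
G | 11 | 11.3%
L | 10 | 10.3%
K | 9 | 9.3%
P | 9 | 9.3%
S | 7 | 7.2%
C | 4 | 4.1%
I | 4 | 4.1%
D | 4 | 4.1%
l | 3 | 3.1%
J | 3 | 3.1%
Other values (20) | 33 | 34.0%

Common
Value | Count | Frequency (%)
(space) | 43 | 36.4%
) | 27 | 22.9%
( | 27 | 22.9%
# | 3 | 2.5%
1 | 3 | 2.5%
2 | 3 | 2.5%
& | 2 | 1.7%
3 | 2 | 1.7%
~ | 2 | 1.7%
5 | 1 | 0.8%
Other values (5) | 5 | 4.2%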

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1891 | 86.8%
ASCII | 215 | 9.9%
None | 73 | 3.4%

Most frequent character per block

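Hangul
Value | Count | Frequency (%)
(not preserved) | 112 | 5.9%
(not preserved) | 100 | 5.3%
(not preserved) | 95 | 5.0%
(not preserved) | 52 | 2.7%
(not preserved) | 45 | 2.4%
(not preserved) | 42 | 2.2%
(not preserved) | 35 | 1.9%
(not preserved) | 35 | 1.9%
(not preserved) | 31 | 1.6%
(not preserved) | 29 | 1.5%
Other values (269) | 1315 | 69.5%

None
Value | Count | Frequency (%)
(not preserved) | 73 | 100.0%

ASCII
Value | Count | Frequency (%)
(space) | 43 | 20.0%
) | 27 | 12.6%
( | 27 | 12.6%
G | 11 | 5.1%
L | 10 | 4.7%
K | 9 | 4.2%
P | 9 | 4.2%
S | 7 | 3.3%
C | 4 | 1.9%
I | 4 | 1.9%
Other values (35) | 64 | 29.8%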

주소
Text

Distinct: 279
Distinct (%): 97.9%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 24
Median length: 23
Mean length: 18.150877
Min length: 15

Characters and Unicode

Total characters: 5173
Distinct characters: 99
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 273
Unique (%): 95.8%
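
Distinct and Unique measure different things: Distinct counts how many different values occur, while Unique counts the values that occur exactly once. With 285 rows, 279 distinct addresses, and 273 unique ones, the remaining 6 addresses must cover 12 rows, so each of them appears exactly twice. The sketch below shows the two counts on a small made-up list. The `distinct_and_unique` helper and the sample data are hypothetical.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Made-up data: "B" is shared by two rows.
addresses = ["A", "B", "B", "C"]
distinct, unique = distinct_and_unique(addresses)
print(distinct, unique)

# Consistency check against the reported figures for 주소.
rows, reported_distinct, reported_unique = 285, 279, 273
repeated_values = reported_distinct - reported_unique
rows_on_repeated_values = rows - reported_unique
print(repeated_values, rows_on_repeated_values)
```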

Sample

1st row: 광주광역시 광산구 어등대로 539-11
2nd row: 광주광역시 광산구 풍영정길 201
3rd row: 광주광역시 광산구 임곡로 794
4th row: 광주광역시 광산구 광산로67번길 51
5th row: 광주광역시 광산구 소촌로85번길 2

Word frequencies (whitespace-separated tokens):
Value | Count | Frequency (%)
광주광역시 | 285 | 25.0%
광산구 | 285 | 25.0%
사암로 | 30 | 2.6%
동곡로 | 25 | 2.2%
북문대로 | 17 | 1.5%
목련로 | 15 | 1.3%
임방울대로 | 12 | 1.1%
왕버들로 | 9 | 0.8%
송도로 | 9 | 0.8%
상무대로 | 8 | 0.7%
Other values (323) | 446 | 39.1%

Most occurring characters

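Value | Count | Frequency (%)
(not preserved) | 861 | 16.6%
(space) | 856 | 16.5%
(not preserved) | 296 | 5.7%
(not preserved) | 286 | 5.5%
(not preserved) | 285 | 5.5%
(not preserved) | 285 | 5.5%
(not preserved) | 285 | 5.5%
(not preserved) | 250 | 4.8%
1 | 153 | 3.0%
2 | 122 | 2.4%
Other values (89) | 1494 | 28.9%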

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3362 | 65.0%
Decimal Number | 909 | 17.6%
Space Separator | 856 | 16.5%
Dash Punctuation | 42 | 0.8%
Other Punctuation | 2 | < 0.1%
Uppercase Letter | 2 | < 0.1%

Most frequent character per category

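Other Letter
Value | Count | Frequency (%)
(not preserved) | 861 | 25.6%
(not preserved) | 296 | 8.8%
(not preserved) | 286 | 8.5%
(not preserved) | 285 | 8.5%
(not preserved) | 285 | 8.5%
(not preserved) | 285 | 8.5%
(not preserved) | 250 | 7.4%
(not preserved) | 62 | 1.8%
(not preserved) | 60 | 1.8%
(not preserved) | 52 | 1.5%
Other values (75) | 640 | 19.0%

Decimal Number
Value | Count | Frequency (%)
1 | 153 | 16.8%
2 | 122 | 13.4%
3 | 118 | 13.0%
5 | 88 | 9.7%
6 | 87 | 9.6%
4 | 79 | 8.7%
7 | 73 | 8.0%
8 | 69 | 7.6%
9 | 67 | 7.4%
0 | 53 | 5.8%

Space Separator
Value | Count | Frequency (%)
(space) | 856 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 42 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 2 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
B | 2 | 100.0%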

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3362 | 65.0%
Common | 1809 | 35.0%
Latin | 2 | < 0.1%

Most frequent character per script

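Hangul
Value | Count | Frequency (%)
(not preserved) | 861 | 25.6%
(not preserved) | 296 | 8.8%
(not preserved) | 286 | 8.5%
(not preserved) | 285 | 8.5%
(not preserved) | 285 | 8.5%
(not preserved) | 285 | 8.5%
(not preserved) | 250 | 7.4%
(not preserved) | 62 | 1.8%
(not preserved) | 60 | 1.8%
(not preserved) | 52 | 1.5%
Other values (75) | 640 | 19.0%

Common
Value | Count | Frequency (%)
(space) | 856 | 47.3%
1 | 153 | 8.5%
2 | 122 | 6.7%
3 | 118 | 6.5%
5 | 88 | 4.9%
6 | 87 | 4.8%
4 | 79 | 4.4%
7 | 73 | 4.0%
8 | 69 | 3.8%
9 | 67 | 3.7%
Other values (3) | 97 | 5.4%

Latin
Value | Count | Frequency (%)
B | 2 | 100.0%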

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3362 | 65.0%
ASCII | 1811 | 35.0%

Most frequent character per block

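Hangul
Value | Count | Frequency (%)
(not preserved) | 861 | 25.6%
(not preserved) | 296 | 8.8%
(not preserved) | 286 | 8.5%
(not preserved) | 285 | 8.5%
(not preserved) | 285 | 8.5%
(not preserved) | 285 | 8.5%
(not preserved) | 250 | 7.4%
(not preserved) | 62 | 1.8%
(not preserved) | 60 | 1.8%
(not preserved) | 52 | 1.5%
Other values (75) | 640 | 19.0%

ASCII
Value | Count | Frequency (%)
(space) | 856 | 47.3%
1 | 153 | 8.4%
2 | 122 | 6.7%
3 | 118 | 6.5%
5 | 88 | 4.9%
6 | 87 | 4.8%
4 | 79 | 4.4%
7 | 73 | 4.0%
8 | 69 | 3.8%
9 | 67 | 3.7%
Other values (4) | 99 | 5.5%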

배출량단위(종)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB
5: 279
4: 6

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4
2nd row: 5
3rd row: 5
4th row: 5
5th row: 4

Common Values

ValueCountFrequency (%)
5 279
97.9%
4 6
 
2.1%
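
The 85.3% imbalance reported for this column can be reproduced from the two counts above if the score is taken to be one minus the normalised Shannon entropy of the class distribution. That definition is an assumption about how the report computes it, although it does match the published figure. The `imbalance_score` helper is hypothetical.

```python
from math import log2


def imbalance_score(counts):
    """Return 1 - H / log2(k) for class counts; 0 is balanced, 1 is one-sided."""
    total = sum(counts)
    k = len(counts)
    if k <= 1:
        return 0.0
    entropy = -sum((c / total) * log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / log2(k)


# Class 5 has 279 rows and class 4 has 6 rows.
score = imbalance_score([279, 6])
print(f"{score:.1%}")
```

A perfectly balanced two-class column would score 0, so a value this high means almost every business falls into class 5.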

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
5 | 279 | 97.9%
4 | 6 | 2.1%

등록일자
Date

Distinct: 273
Distinct (%): 95.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB
Minimum: 1978-12-21 00:00:00
Maximum: 2022-11-28 00:00:00
Histogram with fixed size bins (bins=50)

데이터기준일자
Date

CONSTANT 

Distinct: 1
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB
Minimum: 2022-12-31 00:00:00
Maximum: 2022-12-31 00:00:00
Histogram with fixed size bins (bins=1)

Interactions

[Figure: interaction plot]

Correlations

Variable | 연번 | 업종 | 배출량단위(종)
연번 | 1.000 | 0.676 | 0.107
업종 | 0.676 | 1.000 | 0.886
배출량단위(종) | 0.107 | 0.886 | 1.000

Variable | 배출량단위(종) | 업종
배출량단위(종) | 1.000 | 0.764
업종 | 0.764 | 1.000

Variable | 연번 | 업종 | 배출량단위(종)
연번 | 1.000 | 0.298 | 0.080
업종 | 0.298 | 1.000 | 0.764
배출량단위(종) | 0.080 | 0.764 | 1.000
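
The report does not say which coefficient each matrix uses. One common measure of association between two categorical columns such as 업종 and 배출량단위(종) is Cramér's V, sketched below without bias correction. Whether any of the matrices above was computed this way is an assumption. The `cramers_v` helper and the toy data are hypothetical, and the reported values cannot be reproduced without the raw rows.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    k = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or k == 0:
        return 0.0
    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for r, r_count in row_totals.items():
        for c, c_count in col_totals.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return sqrt(chi2 / (n * k))


# Toy data: the class is fully determined by the business type.
perfect = cramers_v(["a", "a", "b", "b"], [4, 4, 5, 5])
# Toy data: the class is independent of the business type.
independent = cramers_v(["a", "a", "b", "b"], [4, 5, 4, 5])
print(perfect, independent)
```

Values near 1 indicate that knowing one column largely determines the other, which is the situation the High correlation alert flags.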

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
(index) | 연번 | 업종 | 업체명 | 주소 | 배출량단위(종) | 등록일자 | 데이터기준일자
0 | 1 | (식품)도축업 | 삼국산업 | 광주광역시 광산구 어등대로 539-11 | 4 | 1978-12-21 | 2022-12-31
1 | 2 | (비금)콘크리트제품제조 | 현대개발㈜ | 광주광역시 광산구 풍영정길 201 | 5 | 1994-03-21 | 2022-12-31
2 | 3 | (비금)콘크리트제품제조 | ㈜남도 | 광주광역시 광산구 임곡로 794 | 5 | 1989-05-22 | 2022-12-31
3 | 4 | 세차장 | 장세차장 | 광주광역시 광산구 광산로67번길 51 | 5 | 1990-04-26 | 2022-12-31
4 | 5 | (식품)식료품제조업 | ㈜남양냉동식품 | 광주광역시 광산구 소촌로85번길 2 | 4 | 1993-05-27 | 2022-12-31
5 | 6 | 공업사 | 애니카랜드우산점 | 광주광역시 광산구 우산로 72 | 5 | 1991-04-24 | 2022-12-31
6 | 7 | (기타)기타금속 | ㈜원영금속 | 광주광역시 광산구 소촌로123번길 21 | 5 | 1992-01-21 | 2022-12-31
7 | 8 | 세차장 | 뉴현대세차장 | 광주광역시 광산구 사암로 341 | 5 | 1992-08-31 | 2022-12-31
8 | 9 | 공업사 | 금호.로케트하남대리점 | 광주광역시 광산구 용아로 352 | 5 | 1995-05-11 | 2022-12-31
9 | 10 | 세차장 | 태양세차장 | 광주광역시 광산구 사암로339번길 2 | 5 | 1993-05-04 | 2022-12-31

Last rows
(index) | 연번 | 업종 | 업체명 | 주소 | 배출량단위(종) | 등록일자 | 데이터기준일자
275 | 276 | 세차장 | 삐까뻔쩍손세차광택 | 광주광역시 광산구 하남마항로12번길 14 | 5 | 2022-05-31 | 2022-12-31
276 | 277 | (기타)일반 전기 공사업 | 동광건설㈜ 광주지역전기공급시설전력구공사(수완~하남) #1 | 광주광역시 광산구 수완동 898 | 4 | 2022-06-13 | 2022-12-31
277 | 278 | (기타)일반 전기 공사업 | 동광건설㈜ 광주지역전기공급시설전력구공사(수완~하남) #2 | 광주광역시 광산구 장덕동 1101 | 5 | 2022-06-13 | 2022-12-31
278 | 279 | 세차장 | 그린워시 | 광주광역시 광산구 상무대로 57 | 5 | 2022-06-20 | 2022-12-31
279 | 280 | 세차장 | DK워시광주점 | 광주광역시 광산구 북문대로 595 | 5 | 2022-06-29 | 2022-12-31
280 | 281 | 주유소(세차) | 남선석유㈜임방울셀프주유소 | 광주광역시 광산구 임방울대로 14 | 5 | 2022-09-20 | 2022-12-31
281 | 282 | 세차장 | 솔릭개발㈜ | 광주광역시 광산구 송정동 836-6 | 5 | 2022-10-06 | 2022-12-31
282 | 283 | (비금)폐기물중간처리업(재활용전문) | 광일산업 | 광주광역시 광산구 삼도로 2-2 | 5 | 2022-07-28 | 2022-12-31
283 | 284 | 세차장 | 송정대양자동차공업사 | 광주광역시 광산구 동곡로 817, B동 | 5 | 2022-10-24 | 2022-12-31
284 | 285 | 세차장 | K모터스 | 광주광역시 광산구 상완길 101 | 5 | 2022-11-28 | 2022-12-31