Overview

Dataset statistics

Number of variables: 5
Number of observations: 2939
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 486
Duplicate rows (%): 16.5%
Total size in memory: 117.8 KiB
Average record size in memory: 41.0 B
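
The two derived figures in this table follow directly from the raw counts. A minimal sketch that recomputes them from the values reported above (the variable names are illustrative, not part of the report):

```python
# Values copied from the dataset statistics in this report.
n_rows = 2939
n_duplicate_rows = 486
total_size_kib = 117.8

# Share of rows flagged as duplicates, as a percentage.
duplicate_pct = 100 * n_duplicate_rows / n_rows

# Average record size in bytes (1 KiB = 1024 bytes).
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both results round to the figures shown in the table (16.5% and 41.0 B).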

Variable types

Text: 3
DateTime: 1
Numeric: 1

Dataset

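Description: Status of business-site waste discharger reports within Asan City (아산시), listing the business name (상호), report date (신고일), waste type (폐기물종류), discharge amount in tonnes (배출량(톤)), and business-site address (사업장주소).
URL: https://www.data.go.kr/data/15060432/fileData.do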

Alerts

Dataset has 486 (16.5%) duplicate rows (Duplicates)

Reproduction

Analysis started: 2023-12-12 12:53:18.919824
Analysis finished: 2023-12-12 12:53:19.692939
Duration: 0.77 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
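
The reported duration is simply the difference between the two timestamps. A small sketch, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamps copied from the reproduction details in this report.
started = datetime.fromisoformat("2023-12-12 12:53:18.919824")
finished = datetime.fromisoformat("2023-12-12 12:53:19.692939")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")
```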

Variables

상호 (business name)
Text

Distinct: 866
Distinct (%): 29.5%
Missing: 0
Missing (%): 0.0%
Memory size: 23.1 KiB

Length

Max length: 31
Median length: 21
Mean length: 9.8342974
Min length: 1

Characters and Unicode

Total characters: 28903
Distinct characters: 440
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
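
As an illustration of how such character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own implementation. The helper name `category_counts` is hypothetical, and the standard library exposes general categories only (script and block lookups would need another source).

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for text in values:
        counts.update(unicodedata.category(ch) for ch in text)
    return counts

# Four values taken from the 상호 sample rows shown in this report.
sample = ["(주)일심스틸", "(주)한샘할인마트", "주식회사 제이스텍", "통자원"]
counts = category_counts(sample)

# "Lo" = Other Letter, "Ps" = Open Punctuation,
# "Pe" = Close Punctuation, "Zs" = Space Separator.
print(counts)
```

Hangul syllables fall under "Lo" (Other Letter), which is why that category dominates the tables for the text variables.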

Unique

Unique: 320
Unique (%): 10.9%

Sample

1st row: (주)일심스틸
2nd row: (주)한샘할인마트
3rd row: (주)한샘할인마트
4th row: 주식회사 제이스텍
5th row: 통자원

Value | Count | Frequency (%)
주식회사 | 186 | 4.9%
삼성디스플레이(주 | 119 | 3.1%
아산공장 | 101 | 2.6%
사단법인 | 86 | 2.2%
아산디스플레이시티1 | 81 | 2.1%
일반산업단지 | 81 | 2.1%
입주기업체협의회 | 81 | 2.1%
아산지점 | 53 | 1.4%
1사업장 | 50 | 1.3%
아산2 | 46 | 1.2%
Other values (911) | 2944 | 76.9%

Most occurring characters

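Value | Count | Frequency (%)
[not preserved] | 2627 | 9.1%
) | 2351 | 8.1%
( | 2349 | 8.1%
[not preserved] | 983 | 3.4%
(space) | 889 | 3.1%
[not preserved] | 844 | 2.9%
[not preserved] | 819 | 2.8%
[not preserved] | 769 | 2.7%
[not preserved] | 595 | 2.1%
[not preserved] | 511 | 1.8%
Other values (430) | 16166 | 55.9%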

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 22683 | 78.5%
Close Punctuation | 2438 | 8.4%
Open Punctuation | 2436 | 8.4%
Space Separator | 889 | 3.1%
Decimal Number | 252 | 0.9%
Uppercase Letter | 165 | 0.6%
Dash Punctuation | 27 | 0.1%
Connector Punctuation | 5 | < 0.1%
Other Punctuation | 5 | < 0.1%
Lowercase Letter | 3 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[not preserved] | 2627 | 11.6%
[not preserved] | 983 | 4.3%
[not preserved] | 844 | 3.7%
[not preserved] | 819 | 3.6%
[not preserved] | 769 | 3.4%
[not preserved] | 595 | 2.6%
[not preserved] | 511 | 2.3%
[not preserved] | 457 | 2.0%
[not preserved] | 424 | 1.9%
[not preserved] | 366 | 1.6%
Other values (398) | 14288 | 63.0%

Uppercase Letter
Value | Count | Frequency (%)
D | 40 | 24.2%
S | 23 | 13.9%
K | 23 | 13.9%
C | 21 | 12.7%
L | 19 | 11.5%
I | 8 | 4.8%
N | 6 | 3.6%
B | 6 | 3.6%
M | 5 | 3.0%
A | 4 | 2.4%
Other values (5) | 10 | 6.1%

Decimal Number
Value | Count | Frequency (%)
1 | 146 | 57.9%
2 | 75 | 29.8%
7 | 23 | 9.1%
3 | 5 | 2.0%
4 | 3 | 1.2%

Lowercase Letter
Value | Count | Frequency (%)
s | 1 | 33.3%
k | 1 | 33.3%
y | 1 | 33.3%

Close Punctuation
Value | Count | Frequency (%)
) | 2351 | 96.4%
] | 87 | 3.6%

Open Punctuation
Value | Count | Frequency (%)
( | 2349 | 96.4%
[ | 87 | 3.6%

Other Punctuation
Value | Count | Frequency (%)
. | 4 | 80.0%
/ | 1 | 20.0%

Space Separator
Value | Count | Frequency (%)
(space) | 889 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 27 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 5 | 100.0%

Most occurring scripts

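Value | Count | Frequency (%)
Hangul | 22683 | 78.5%
Common | 6052 | 20.9%
Latin | 168 | 0.6%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[not preserved] | 2627 | 11.6%
[not preserved] | 983 | 4.3%
[not preserved] | 844 | 3.7%
[not preserved] | 819 | 3.6%
[not preserved] | 769 | 3.4%
[not preserved] | 595 | 2.6%
[not preserved] | 511 | 2.3%
[not preserved] | 457 | 2.0%
[not preserved] | 424 | 1.9%
[not preserved] | 366 | 1.6%
Other values (398) | 14288 | 63.0%

Latin
Value | Count | Frequency (%)
D | 40 | 23.8%
S | 23 | 13.7%
K | 23 | 13.7%
C | 21 | 12.5%
L | 19 | 11.3%
I | 8 | 4.8%
N | 6 | 3.6%
B | 6 | 3.6%
M | 5 | 3.0%
A | 4 | 2.4%
Other values (8) | 13 | 7.7%

Common
Value | Count | Frequency (%)
) | 2351 | 38.8%
( | 2349 | 38.8%
(space) | 889 | 14.7%
1 | 146 | 2.4%
[ | 87 | 1.4%
] | 87 | 1.4%
2 | 75 | 1.2%
- | 27 | 0.4%
7 | 23 | 0.4%
3 | 5 | 0.1%
Other values (4) | 13 | 0.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 22683 | 78.5%
ASCII | 6220 | 21.5%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[not preserved] | 2627 | 11.6%
[not preserved] | 983 | 4.3%
[not preserved] | 844 | 3.7%
[not preserved] | 819 | 3.6%
[not preserved] | 769 | 3.4%
[not preserved] | 595 | 2.6%
[not preserved] | 511 | 2.3%
[not preserved] | 457 | 2.0%
[not preserved] | 424 | 1.9%
[not preserved] | 366 | 1.6%
Other values (398) | 14288 | 63.0%

ASCII
Value | Count | Frequency (%)
) | 2351 | 37.8%
( | 2349 | 37.8%
(space) | 889 | 14.3%
1 | 146 | 2.3%
[ | 87 | 1.4%
] | 87 | 1.4%
2 | 75 | 1.2%
D | 40 | 0.6%
- | 27 | 0.4%
7 | 23 | 0.4%
Other values (22) | 146 | 2.3%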

신고일 (report date)
DateTime

Distinct: 782
Distinct (%): 26.6%
Missing: 0
Missing (%): 0.0%
Memory size: 23.1 KiB
Minimum: 1993-04-17 00:00:00
Maximum: 2023-03-13 00:00:00
Histogram with fixed size bins (bins=50)

폐기물 종류 (waste type)
Text

Distinct: 146
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 23.1 KiB

Length

Max length: 84
Median length: 64
Mean length: 15.86526
Min length: 1

Characters and Unicode

Total characters: 46628
Distinct characters: 222
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 39
Unique (%): 1.3%

Sample

1st row: 폐합성수지류(폐염화비닐수지류는 제외한다)
2nd row: 폐합성수지류(폐염화비닐수지류는 제외한다)
3rd row: 폐합성수지류(폐염화비닐수지류는 제외한다)
4th row: 폐합성수지류(폐염화비닐수지류는 제외한다)
5th row: 폐합성수지류(폐염화비닐수지류는 제외한다)

Value | Count | Frequency (%)
제외한다 | 1270 | 19.3%
폐합성수지류(폐염화비닐수지류는 | 1239 | 18.8%
[not preserved] | 524 | 8.0%
밖의 | 524 | 8.0%
폐수처리오니 | 225 | 3.4%
폐합성수지류 | 223 | 3.4%
말한다 | 95 | 1.4%
분진 | 80 | 1.2%
공정오니 | 66 | 1.0%
폐활성탄 | 59 | 0.9%
Other values (219) | 2271 | 34.5%

Most occurring characters

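Value | Count | Frequency (%)
[not preserved] | 4010 | 8.6%
(space) | 3650 | 7.8%
[not preserved] | 3102 | 6.7%
[not preserved] | 3019 | 6.5%
[not preserved] | 2877 | 6.2%
[not preserved] | 1779 | 3.8%
[not preserved] | 1624 | 3.5%
[not preserved] | 1547 | 3.3%
( | 1428 | 3.1%
) | 1428 | 3.1%
Other values (212) | 22164 | 47.5%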

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 39849 | 85.5%
Space Separator | 3650 | 7.8%
Open Punctuation | 1429 | 3.1%
Close Punctuation | 1429 | 3.1%
Connector Punctuation | 253 | 0.5%
Decimal Number | 11 | < 0.1%
Other Punctuation | 7 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[not preserved] | 4010 | 10.1%
[not preserved] | 3102 | 7.8%
[not preserved] | 3019 | 7.6%
[not preserved] | 2877 | 7.2%
[not preserved] | 1779 | 4.5%
[not preserved] | 1624 | 4.1%
[not preserved] | 1547 | 3.9%
[not preserved] | 1427 | 3.6%
[not preserved] | 1389 | 3.5%
[not preserved] | 1337 | 3.4%
Other values (201) | 17738 | 44.5%

Decimal Number
Value | Count | Frequency (%)
1 | 6 | 54.5%
2 | 4 | 36.4%
8 | 1 | 9.1%

Open Punctuation
Value | Count | Frequency (%)
( | 1428 | 99.9%
[not preserved] | 1 | 0.1%

Close Punctuation
Value | Count | Frequency (%)
) | 1428 | 99.9%
[not preserved] | 1 | 0.1%

Other Punctuation
Value | Count | Frequency (%)
. | 5 | 71.4%
· | 2 | 28.6%

Space Separator
Value | Count | Frequency (%)
(space) | 3650 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 253 | 100.0%

Most occurring scripts

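Value | Count | Frequency (%)
Hangul | 39849 | 85.5%
Common | 6779 | 14.5%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[not preserved] | 4010 | 10.1%
[not preserved] | 3102 | 7.8%
[not preserved] | 3019 | 7.6%
[not preserved] | 2877 | 7.2%
[not preserved] | 1779 | 4.5%
[not preserved] | 1624 | 4.1%
[not preserved] | 1547 | 3.9%
[not preserved] | 1427 | 3.6%
[not preserved] | 1389 | 3.5%
[not preserved] | 1337 | 3.4%
Other values (201) | 17738 | 44.5%

Common
Value | Count | Frequency (%)
(space) | 3650 | 53.8%
( | 1428 | 21.1%
) | 1428 | 21.1%
_ | 253 | 3.7%
1 | 6 | 0.1%
. | 5 | 0.1%
2 | 4 | 0.1%
· | 2 | < 0.1%
[not preserved] | 1 | < 0.1%
8 | 1 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 39745 | 85.2%
ASCII | 6775 | 14.5%
Compat Jamo | 104 | 0.2%
None | 4 | < 0.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[not preserved] | 4010 | 10.1%
[not preserved] | 3102 | 7.8%
[not preserved] | 3019 | 7.6%
[not preserved] | 2877 | 7.2%
[not preserved] | 1779 | 4.5%
[not preserved] | 1624 | 4.1%
[not preserved] | 1547 | 3.9%
[not preserved] | 1427 | 3.6%
[not preserved] | 1389 | 3.5%
[not preserved] | 1337 | 3.4%
Other values (200) | 17634 | 44.4%

ASCII
Value | Count | Frequency (%)
(space) | 3650 | 53.9%
( | 1428 | 21.1%
) | 1428 | 21.1%
_ | 253 | 3.7%
1 | 6 | 0.1%
. | 5 | 0.1%
2 | 4 | 0.1%
8 | 1 | < 0.1%

Compat Jamo
Value | Count | Frequency (%)
[not preserved] | 104 | 100.0%

None
Value | Count | Frequency (%)
· | 2 | 50.0%
[not preserved] | 1 | 25.0%
[not preserved] | 1 | 25.0%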

배출량(톤) (discharge amount, tonnes)
Real number (ℝ)

Distinct: 280
Distinct (%): 9.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 903.6791
Minimum: 0
Maximum: 150000
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 26.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 6
Q1: 30
Median: 72
Q3: 300
95-th percentile: 3600
Maximum: 150000
Range: 150000
Interquartile range (IQR): 270

Descriptive statistics

Standard deviation: 4612.2847
Coefficient of variation (CV): 5.1038966
Kurtosis: 433.88099
Mean: 903.6791
Median Absolute Deviation (MAD): 58
Skewness: 17.172406
Sum: 2655912.9
Variance: 21273170
Monotonicity: Not monotonic
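
Several of these statistics are related by simple identities, which makes the table easy to sanity-check. The sketch below recomputes the mean, variance, coefficient of variation, IQR, and range from the other reported values. The variable names are illustrative, and small differences are expected because the inputs are already rounded.

```python
# Values copied from the statistics reported for 배출량(톤).
n = 2939                 # number of observations
total = 2655912.9        # Sum
std = 4612.2847          # Standard deviation
q1, q3 = 30, 300         # Q1 and Q3
minimum, maximum = 0, 150000

mean = total / n                 # Mean = Sum / n
variance = std ** 2              # Variance = (standard deviation)^2
cv = std / mean                  # Coefficient of variation = std / mean
iqr = q3 - q1                    # Interquartile range = Q3 - Q1
value_range = maximum - minimum  # Range = Maximum - Minimum

print(f"mean={mean:.4f} variance={variance:.0f} cv={cv:.4f}")
print(f"iqr={iqr} range={value_range}")
```

The mean (about 904) sits far above the median (72), which is consistent with the large skewness and kurtosis: a few very large discharge amounts dominate the sum.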
Histogram with fixed size bins (bins=50)
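
To make the caption concrete, here is a minimal sketch of equal-width binning, assuming the bins evenly span the range from the minimum to the maximum value. The helper `fixed_width_histogram` is hypothetical and is not the routine the profiler uses.

```python
def fixed_width_histogram(values, bins=50):
    """Count values into `bins` equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        if width == 0:
            idx = 0  # all values identical: everything lands in the first bin
        else:
            # The maximum value is placed in the last bin.
            idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts, width

# A handful of 배출량(톤) values that appear in the tables of this report.
values = [0.0, 12.0, 24.0, 36.0, 60.0, 120.0, 3600.0, 150000.0]
counts, width = fixed_width_histogram(values, bins=50)
print(width, counts[0], counts[1], counts[-1])
```

With a maximum of 150000 and 50 bins, each bin is 3000 tonnes wide, so almost every observation in a distribution this skewed ends up in the first bin.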
Common values
Value | Count | Frequency (%)
60.0 | 249 | 8.5%
120.0 | 200 | 6.8%
36.0 | 157 | 5.3%
24.0 | 128 | 4.4%
12.0 | 116 | 3.9%
30.0 | 114 | 3.9%
50.0 | 94 | 3.2%
100.0 | 91 | 3.1%
10.0 | 88 | 3.0%
240.0 | 80 | 2.7%
Other values (270) | 1622 | 55.2%

Minimum 10 values
Value | Count | Frequency (%)
0.0 | 1 | < 0.1%
0.025 | 1 | < 0.1%
0.1 | 2 | 0.1%
0.18 | 1 | < 0.1%
0.3 | 1 | < 0.1%
0.4 | 1 | < 0.1%
0.48 | 1 | < 0.1%
0.5 | 2 | 0.1%
0.6 | 4 | 0.1%
0.72 | 3 | 0.1%

Maximum 10 values
Value | Count | Frequency (%)
150000.0 | 1 | < 0.1%
72000.0 | 1 | < 0.1%
66000.0 | 1 | < 0.1%
60000.0 | 1 | < 0.1%
51000.0 | 1 | < 0.1%
50000.0 | 2 | 0.1%
36000.0 | 3 | 0.1%
33200.0 | 1 | < 0.1%
30000.0 | 5 | 0.2%
25000.0 | 1 | < 0.1%

사업장도로명주소 (business-site road-name address)
Text

Distinct: 667
Distinct (%): 22.7%
Missing: 0
Missing (%): 0.0%
Memory size: 23.1 KiB

Length

Max length: 44
Median length: 40
Mean length: 20.949983
Min length: 1

Characters and Unicode

Total characters: 61572
Distinct characters: 254
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 217
Unique (%): 7.4%

Sample

1st row: 충청남도 아산시 둔포면 봉신로 138-28
2nd row: 충청남도 아산시 둔포면 중앙공원로 52
3rd row: 충청남도 아산시 둔포면 중앙공원로 52
4th row: 충청남도 아산시 음봉면 산동로 433-15 (주)제이스텍
5th row: 충청남도 아산시 음봉면 음봉면로 286

Value | Count | Frequency (%)
아산시 | 2616 | 19.6%
충청남도 | 2565 | 19.2%
둔포면 | 567 | 4.2%
탕정면 | 407 | 3.0%
음봉면 | 305 | 2.3%
인주면 | 284 | 2.1%
영인면 | 266 | 2.0%
신창면 | 166 | 1.2%
삼성로 | 159 | 1.2%
인주산단로 | 143 | 1.1%
Other values (805) | 5874 | 44.0%

Most occurring characters

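Value | Count | Frequency (%)
(space) | 11175 | 18.1%
[not preserved] | 3455 | 5.6%
[not preserved] | 3103 | 5.0%
[not preserved] | 2693 | 4.4%
[not preserved] | 2672 | 4.3%
[not preserved] | 2650 | 4.3%
[not preserved] | 2641 | 4.3%
[not preserved] | 2577 | 4.2%
1 | 2393 | 3.9%
[not preserved] | 2259 | 3.7%
Other values (244) | 25954 | 42.2%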

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 38188 | 62.0%
Space Separator | 11175 | 18.1%
Decimal Number | 10149 | 16.5%
Dash Punctuation | 997 | 1.6%
Close Punctuation | 477 | 0.8%
Open Punctuation | 477 | 0.8%
Connector Punctuation | 88 | 0.1%
Uppercase Letter | 18 | < 0.1%
Other Punctuation | 3 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[not preserved] | 3455 | 9.0%
[not preserved] | 3103 | 8.1%
[not preserved] | 2693 | 7.1%
[not preserved] | 2672 | 7.0%
[not preserved] | 2650 | 6.9%
[not preserved] | 2641 | 6.9%
[not preserved] | 2577 | 6.7%
[not preserved] | 2259 | 5.9%
[not preserved] | 2208 | 5.8%
[not preserved] | 841 | 2.2%
Other values (221) | 13089 | 34.3%

Decimal Number
Value | Count | Frequency (%)
1 | 2393 | 23.6%
2 | 1523 | 15.0%
3 | 1044 | 10.3%
0 | 906 | 8.9%
4 | 831 | 8.2%
8 | 796 | 7.8%
7 | 788 | 7.8%
6 | 731 | 7.2%
5 | 723 | 7.1%
9 | 414 | 4.1%

Uppercase Letter
Value | Count | Frequency (%)
C | 5 | 27.8%
A | 3 | 16.7%
S | 2 | 11.1%
M | 2 | 11.1%
D | 2 | 11.1%
T | 2 | 11.1%
K | 2 | 11.1%

Space Separator
Value | Count | Frequency (%)
(space) | 11175 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 997 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 477 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 477 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 88 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 3 | 100.0%

Most occurring scripts

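Value | Count | Frequency (%)
Hangul | 38188 | 62.0%
Common | 23366 | 37.9%
Latin | 18 | < 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[not preserved] | 3455 | 9.0%
[not preserved] | 3103 | 8.1%
[not preserved] | 2693 | 7.1%
[not preserved] | 2672 | 7.0%
[not preserved] | 2650 | 6.9%
[not preserved] | 2641 | 6.9%
[not preserved] | 2577 | 6.7%
[not preserved] | 2259 | 5.9%
[not preserved] | 2208 | 5.8%
[not preserved] | 841 | 2.2%
Other values (221) | 13089 | 34.3%

Common
Value | Count | Frequency (%)
(space) | 11175 | 47.8%
1 | 2393 | 10.2%
2 | 1523 | 6.5%
3 | 1044 | 4.5%
- | 997 | 4.3%
0 | 906 | 3.9%
4 | 831 | 3.6%
8 | 796 | 3.4%
7 | 788 | 3.4%
6 | 731 | 3.1%
Other values (6) | 2182 | 9.3%

Latin
Value | Count | Frequency (%)
C | 5 | 27.8%
A | 3 | 16.7%
S | 2 | 11.1%
M | 2 | 11.1%
D | 2 | 11.1%
T | 2 | 11.1%
K | 2 | 11.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 38188 | 62.0%
ASCII | 23384 | 38.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 11175 | 47.8%
1 | 2393 | 10.2%
2 | 1523 | 6.5%
3 | 1044 | 4.5%
- | 997 | 4.3%
0 | 906 | 3.9%
4 | 831 | 3.6%
8 | 796 | 3.4%
7 | 788 | 3.4%
6 | 731 | 3.1%
Other values (13) | 2200 | 9.4%

Hangul
Value | Count | Frequency (%)
[not preserved] | 3455 | 9.0%
[not preserved] | 3103 | 8.1%
[not preserved] | 2693 | 7.1%
[not preserved] | 2672 | 7.0%
[not preserved] | 2650 | 6.9%
[not preserved] | 2641 | 6.9%
[not preserved] | 2577 | 6.7%
[not preserved] | 2259 | 5.9%
[not preserved] | 2208 | 5.8%
[not preserved] | 841 | 2.2%
Other values (221) | 13089 | 34.3%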

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
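
The counts behind such a nullity display can be sketched in a few lines. This is an illustration only: the helper `count_present` is hypothetical, and it assumes a missing value is represented as `None`.

```python
def count_present(rows, columns):
    """Count non-missing values per column; None is treated as missing."""
    present = {col: 0 for col in columns}
    for row in rows:
        for col in columns:
            if row.get(col) is not None:
                present[col] += 1
    return present

columns = ["상호", "신고일", "폐기물 종류", "배출량(톤)", "사업장도로명주소"]

# The first two rows of the sample shown in this report.
rows = [
    dict(zip(columns, ["(주)일심스틸", "2023-03-13",
                       "폐합성수지류(폐염화비닐수지류는 제외한다)", 50.0,
                       "충청남도 아산시 둔포면 봉신로 138-28"])),
    dict(zip(columns, ["(주)한샘할인마트", "2023-03-07",
                       "폐합성수지류(폐염화비닐수지류는 제외한다)", 24.0,
                       "충청남도 아산시 둔포면 중앙공원로 52"])),
]

present = count_present(rows, columns)
missing_cells = len(rows) * len(columns) - sum(present.values())
print(present, missing_cells)
```

For this dataset every column is fully populated, which matches the "Missing cells: 0" figure in the overview.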

Sample

First rows
Index | 상호 | 신고일 | 폐기물 종류 | 배출량(톤) | 사업장도로명주소
0 | (주)일심스틸 | 2023-03-13 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 50.0 | 충청남도 아산시 둔포면 봉신로 138-28
1 | (주)한샘할인마트 | 2023-03-07 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 24.0 | 충청남도 아산시 둔포면 중앙공원로 52
2 | (주)한샘할인마트 | 2023-03-07 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 24.0 | 충청남도 아산시 둔포면 중앙공원로 52
3 | 주식회사 제이스텍 | 2023-02-28 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 250.0 | 충청남도 아산시 음봉면 산동로 433-15 (주)제이스텍
4 | 통자원 | 2023-02-14 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 120.0 | 충청남도 아산시 음봉면 음봉면로 286
5 | 주식회사 무창 | 2023-02-10 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 120.0 | 충청남도 아산시 선장면 서부남로 199-28
6 | (주)엠플러스 | 2023-02-07 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 120.0 | 충청남도 아산시 인주면 인주산단로 123-16
7 | (주)샘텍 아산지점 | 2023-01-20 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 36.0 | 충청남도 아산시 둔포면 아산밸리로388번길 167-40
8 | 주식회사 앤피텍 | 2023-01-04 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 12.0 | 충청남도 아산시 둔포면 아산밸리로406번길 50 (주)동광
9 | (주)케이비아이에이스텍 아산2공장 | 2023-01-02 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 120.0 | 충청남도 아산시 염치읍 아산로 645-24

Last rows
Index | 상호 | 신고일 | 폐기물 종류 | 배출량(톤) | 사업장도로명주소
2929 | (주)두원공조 | 2003-06-30 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 80.0 | 충청남도 아산시 음봉면 연암율금로 43
2930 | (주)두원공조 | 2003-06-30 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 80.0 | 충청남도 아산시 음봉면 연암율금로 43
2931 | (주)두원공조 | 2003-06-30 | 폐가구류_ 폐도장목_ 폐목재포장재_ 폐전선드럼(원목상태의 깨끗한 목재를 말한다) | 500.0 | 충청남도 아산시 음봉면 연암율금로 43
2932 | (주)두원공조 | 2003-06-30 | 폐수처리오니 | 50.0 | 충청남도 아산시 음봉면 연암율금로 43
2933 | (주)두원공조 | 2003-06-30 | 폐수처리오니 | 50.0 | 충청남도 아산시 음봉면 연암율금로 43
2934 | (주)두원공조 | 2003-06-30 | 그 밖의 공정오니 | 12.0 | 충청남도 아산시 음봉면 연암율금로 43
2935 | (주)두원공조 | 2003-06-30 | 그 밖의 분진 | 3.0 | 충청남도 아산시 음봉면 연암율금로 43
2936 | (주)두원공조 | 2003-06-30 | 폐흡착제 | 12.0 | 충청남도 아산시 음봉면 연암율금로 43
2937 | (주)두원공조 | 2003-06-30 | 그 밖의 분진 | 3.0 | 충청남도 아산시 음봉면 연암율금로 43
2938 | (주)두원공조 | 2003-06-30 | 그 밖의 공정오니 | 12.0 | 충청남도 아산시 음봉면 연암율금로 43

Duplicate rows

Most frequently occurring

Index | 상호 | 신고일 | 폐기물 종류 | 배출량(톤) | 사업장도로명주소 | # duplicates
16 | (주)그린아산 | 2002-03-15 | 음식물류폐기물처리잔재물(액상의 경우만 해당한다) | 3600.0 | 충청남도 아산시 둔포면 충무로 1611-316 [address and count fused in source]
33 | (주)뉴세종테크 | 2022-05-23 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 500.0 | 충청남도 아산시 선장면 아산만로 638-4 | 6
101 | (주)세신 | 2015-10-13 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 150.0 | 충청남도 아산시 둔포면 이화서길 30-1 | 6
198 | (주)하나메탈코리아 | 2008-09-12 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 12.0 | 충청남도 아산시 둔포면 운교길126번길 31 | 6
280 | 사단법인 아산디스플레이시티1 일반산업단지 입주기업체협의회 | 2002-04-25 | 그 밖의 폐수처리오니 | 3000.0 | 충청남도 아산시 탕정면 삼성로 11-8 | 6
318 | 삼성디스플레이(주)[정배수지] | 2020-12-24 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 10.0 |  | 6
138 | (주)온세계 아산지점 | 2019-06-24 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 100.0 | 충청남도 아산시 둔포면 윤보선로 423 (외2필지) | 5
149 | (주)이엔에프테크놀로지 아산공장 | 2009-03-20 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 40.0 | 충청남도 아산시 인주면 인주산단로 123-38 | 5
155 | (주)제이앤이 아산공장 | 2012-02-21 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 120.0 | 충청남도 아산시 영인면 장영실로 729 | 5
162 | (주)케이씨씨글라스 아산공장 | 2000-04-27 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 300.0 | 충청남도 아산시 염치읍 아산로 658-33 | 5
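
A duplicate row here means a row whose five fields all match another row. The following is a minimal sketch of how such groups can be found with the standard library. The helper `duplicate_groups` is hypothetical and is not the profiler's own routine.

```python
from collections import Counter

def duplicate_groups(rows):
    """Return {row: occurrences} for rows that appear more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

waste_type = "폐합성수지류(폐염화비닐수지류는 제외한다)"

# The first four rows of the sample shown in this report, as tuples of
# (상호, 신고일, 폐기물 종류, 배출량(톤), 사업장도로명주소).
rows = [
    ("(주)일심스틸", "2023-03-13", waste_type, 50.0,
     "충청남도 아산시 둔포면 봉신로 138-28"),
    ("(주)한샘할인마트", "2023-03-07", waste_type, 24.0,
     "충청남도 아산시 둔포면 중앙공원로 52"),
    ("(주)한샘할인마트", "2023-03-07", waste_type, 24.0,
     "충청남도 아산시 둔포면 중앙공원로 52"),
    ("주식회사 제이스텍", "2023-02-28", waste_type, 250.0,
     "충청남도 아산시 음봉면 산동로 433-15 (주)제이스텍"),
]

groups = duplicate_groups(rows)
for row, n in groups.items():
    print(n, row[0], row[1])
```

In this small sample, rows 1 and 2 form a single duplicate group with two occurrences.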