Overview

Dataset statistics

Number of variables: 3
Number of observations: 514
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 3
Duplicate rows (%): 0.6%
Total size in memory: 12.2 KiB
Average record size in memory: 24.3 B
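
The derived figures above follow from the raw counts. The short sketch below recomputes the duplicate-row percentage and the average record size from the reported totals; the helper name `percent` is illustrative and not part of the report.

```python
# Figures copied from the dataset statistics above.
n_rows = 514
duplicate_rows = 3
total_kib = 12.2


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


duplicate_pct = percent(duplicate_rows, n_rows)
# 1 KiB = 1024 bytes, so the average record size is total bytes / rows.
avg_record_bytes = round(total_kib * 1024 / n_rows, 1)

print(duplicate_pct, avg_record_bytes)
```

Both values agree with the reported 0.6% and 24.3 B.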

Variable types

Text: 3

Alerts

Dataset has 3 (0.6%) duplicate rows (Duplicates)

Reproduction

Analysis started: 2024-01-09 21:31:29.139675
Analysis finished: 2024-01-09 21:31:29.530119
Duration: 0.39 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

상호 (business name)
Text

Distinct: 258
Distinct (%): 50.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Length

Max length: 26
Median length: 19
Mean length: 9.0038911
Min length: 3
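
As a minimal sketch of how such length statistics can be computed, the snippet below measures the five sample values of this variable and then cross-checks the reported figures. The function name `length_stats` is illustrative. The mean matches total characters divided by rows (4628 / 514), but a median of 19 cannot hold for 514 rows with that mean and a minimum of 3, so the median is probably computed over a different set of values (an assumption; the report does not say).

```python
from statistics import mean, median


def length_stats(values):
    """Return max, median, mean and min of the string lengths."""
    lengths = [len(v) for v in values]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }


# The five sample rows shown for this variable.
sample = [
    "(유)에치비",
    "(유)에치비",
    "(유)에프엑스인터내셔널",
    "(재)충남테크노파크바이오센터",
    "(주) 대영환경",
]
stats = length_stats(sample)

# Cross-check against the reported totals.
n_rows, total_chars = 514, 4628
reported_mean = total_chars / n_rows

# If the median of 514 lengths were 19, the upper 257 lengths would each be
# at least 19 and the lower 257 at least the minimum (3).
min_possible_total = 257 * 19 + 257 * 3

print(stats, round(reported_mean, 7), min_possible_total > total_chars)
```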

Characters and Unicode

Total characters: 4628
Distinct characters: 277
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
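
The category property can be read with Python's standard `unicodedata` module. The sketch below counts general categories for two of the sample values; the `CATEGORY_NAMES` mapping from two-letter codes to the long names used in this report is written by hand for illustration and covers only the categories that appear here.

```python
import unicodedata
from collections import Counter

# Hand-written mapping from general-category codes to long names
# (only the categories that occur in this report).
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pc": "Connector Punctuation",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
    "So": "Other Symbol",
    "Zs": "Space Separator",
}


def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


counts = category_counts(["(유)에치비", "(주) 대영환경"])
print(counts)
```

Scripts and blocks are not exposed by `unicodedata`, so they need a separate lookup table or library.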

Unique

Unique: 122
Unique (%): 23.7%
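
Distinct and Unique measure different things: distinct counts the different values present, while unique counts the values that occur exactly once. A minimal sketch, using the five sample rows of this variable (the function name `distinct_and_unique` is illustrative):

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    freq = Counter(values)
    distinct = len(freq)
    unique = sum(1 for n in freq.values() if n == 1)
    return distinct, unique


sample = [
    "(유)에치비",
    "(유)에치비",
    "(유)에프엑스인터내셔널",
    "(재)충남테크노파크바이오센터",
    "(주) 대영환경",
]
distinct, unique = distinct_and_unique(sample)

# The reported percentages are the counts divided by the 514 observations.
distinct_pct = round(100 * 258 / 514, 1)
unique_pct = round(100 * 122 / 514, 1)
print(distinct, unique, distinct_pct, unique_pct)
```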

Sample

1st row: (유)에치비
2nd row: (유)에치비
3rd row: (유)에프엑스인터내셔널
4th row: (재)충남테크노파크바이오센터
5th row: (주) 대영환경
Value | Count | Frequency (%)
주식회사 | 38 | 5.9%
주)에코프라임 | 9 | 1.4%
천광산업(주)예산공장 | 8 | 1.3%
예산지점 | 8 | 1.3%
예산공장 | 7 | 1.1%
주)유티아이예산지점 | 7 | 1.1%
대영환경 | 6 | 0.9%
예일환경(주 | 6 | 0.9%
대산산업주식회사 | 6 | 0.9%
예당공장 | 6 | 0.9%
Other values (273) | 538 | 84.2%
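
Tokens such as `주)에코프라임` and `예일환경(주` suggest that values are split on whitespace and that punctuation is then stripped only from the ends of each token. The sketch below reproduces that behaviour under this assumption; it is inferred from the table above, not taken from the report's configuration, and the function name `word_counts` is illustrative.

```python
import string
from collections import Counter


def word_counts(values):
    """Split on whitespace, strip edge punctuation, and count the tokens."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            # Only leading and trailing punctuation is removed, so an
            # inner ")" or "(" survives inside the token.
            word = token.strip(string.punctuation + string.whitespace)
            if word:
                counts[word] += 1
    return counts


counts = word_counts(["(주) 대영환경", "(주)에코프라임", "예일환경(주)"])
print(counts)
```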

Most occurring characters

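Value | Count | Frequency (%)
(Hangul character) | 398 | 8.6%
( | 354 | 7.6%
) | 354 | 7.6%
(Hangul character) | 157 | 3.4%
(space) | 125 | 2.7%
(Hangul character) | 120 | 2.6%
(Hangul character) | 104 | 2.2%
(Hangul character) | 100 | 2.2%
(Hangul character) | 100 | 2.2%
(Hangul character) | 90 | 1.9%
Other values (267) | 2726 | 58.9%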

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3764 | 81.3%
Open Punctuation | 354 | 7.6%
Close Punctuation | 354 | 7.6%
Space Separator | 125 | 2.7%
Decimal Number | 14 | 0.3%
Uppercase Letter | 9 | 0.2%
Other Punctuation | 7 | 0.2%
Other Symbol | 1 | < 0.1%

Most frequent character per category

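Other Letter
Value | Count | Frequency (%)
(Hangul character) | 398 | 10.6%
(Hangul character) | 157 | 4.2%
(Hangul character) | 120 | 3.2%
(Hangul character) | 104 | 2.8%
(Hangul character) | 100 | 2.7%
(Hangul character) | 100 | 2.7%
(Hangul character) | 90 | 2.4%
(Hangul character) | 86 | 2.3%
(Hangul character) | 74 | 2.0%
(Hangul character) | 65 | 1.7%
Other values (253) | 2470 | 65.6%

Uppercase Letter
Value | Count | Frequency (%)
P | 3 | 33.3%
O | 2 | 22.2%
R | 2 | 22.2%
G | 1 | 11.1%
S | 1 | 11.1%

Decimal Number
Value | Count | Frequency (%)
2 | 7 | 50.0%
1 | 4 | 28.6%
3 | 3 | 21.4%

Other Punctuation
Value | Count | Frequency (%)
/ | 5 | 71.4%
& | 2 | 28.6%

Open Punctuation
Value | Count | Frequency (%)
( | 354 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 354 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 125 | 100.0%

Other Symbol
Value | Count | Frequency (%)
(character not preserved) | 1 | 100.0%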

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3765 | 81.4%
Common | 854 | 18.5%
Latin | 9 | 0.2%

Most frequent character per script

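Hangul
Value | Count | Frequency (%)
(Hangul character) | 398 | 10.6%
(Hangul character) | 157 | 4.2%
(Hangul character) | 120 | 3.2%
(Hangul character) | 104 | 2.8%
(Hangul character) | 100 | 2.7%
(Hangul character) | 100 | 2.7%
(Hangul character) | 90 | 2.4%
(Hangul character) | 86 | 2.3%
(Hangul character) | 74 | 2.0%
(Hangul character) | 65 | 1.7%
Other values (254) | 2471 | 65.6%

Common
Value | Count | Frequency (%)
( | 354 | 41.5%
) | 354 | 41.5%
(space) | 125 | 14.6%
2 | 7 | 0.8%
/ | 5 | 0.6%
1 | 4 | 0.5%
3 | 3 | 0.4%
& | 2 | 0.2%

Latin
Value | Count | Frequency (%)
P | 3 | 33.3%
O | 2 | 22.2%
R | 2 | 22.2%
G | 1 | 11.1%
S | 1 | 11.1%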

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3764 | 81.3%
ASCII | 863 | 18.6%
None | 1 | < 0.1%

Most frequent character per block

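Hangul
Value | Count | Frequency (%)
(Hangul character) | 398 | 10.6%
(Hangul character) | 157 | 4.2%
(Hangul character) | 120 | 3.2%
(Hangul character) | 104 | 2.8%
(Hangul character) | 100 | 2.7%
(Hangul character) | 100 | 2.7%
(Hangul character) | 90 | 2.4%
(Hangul character) | 86 | 2.3%
(Hangul character) | 74 | 2.0%
(Hangul character) | 65 | 1.7%
Other values (253) | 2470 | 65.6%

ASCII
Value | Count | Frequency (%)
( | 354 | 41.0%
) | 354 | 41.0%
(space) | 125 | 14.5%
2 | 7 | 0.8%
/ | 5 | 0.6%
1 | 4 | 0.5%
P | 3 | 0.3%
3 | 3 | 0.3%
O | 2 | 0.2%
R | 2 | 0.2%
Other values (3) | 4 | 0.5%

None
Value | Count | Frequency (%)
(character not preserved) | 1 | 100.0%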

폐기물 종류 (waste type)
Text

Distinct: 97
Distinct (%): 18.9%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Length

Max length: 84
Median length: 53.5
Mean length: 12.406615
Min length: 2

Characters and Unicode

Total characters: 6377
Distinct characters: 193
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 40
Unique (%): 7.8%

Sample

1st row: 그 밖의 식물성잔재물
2nd row: 폐합성수지류(폐염화비닐수지류는 제외한다)
3rd row: 폐합성수지류(폐염화비닐수지류는 제외한다)
4th row: 폐활성탄
5th row: 건축현장 폐목재(원목상태의 깨끗한 목재를 말한다)
Value | Count | Frequency (%)
제외한다 | 141 | 13.5%
폐합성수지류(폐염화비닐수지류는 | 134 | 12.8%
(word not preserved) | 104 | 10.0%
밖의 | 104 | 10.0%
폐수처리오니 | 33 | 3.2%
폐합성수지류 | 31 | 3.0%
분진 | 27 | 2.6%
폐활성탄 | 20 | 1.9%
폐목재류 | 19 | 1.8%
폐기물 | 15 | 1.4%
Other values (150) | 415 | 39.8%

Most occurring characters

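Value | Count | Frequency (%)
(Hangul character) | 567 | 8.9%
(space) | 532 | 8.3%
(Hangul character) | 365 | 5.7%
(Hangul character) | 360 | 5.6%
(Hangul character) | 326 | 5.1%
(Hangul character) | 241 | 3.8%
(Hangul character) | 195 | 3.1%
(Hangul character) | 173 | 2.7%
(Hangul character) | 173 | 2.7%
) | 162 | 2.5%
Other values (183) | 3283 | 51.5%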

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 5481 | 85.9%
Space Separator | 532 | 8.3%
Close Punctuation | 164 | 2.6%
Open Punctuation | 164 | 2.6%
Connector Punctuation | 26 | 0.4%
Decimal Number | 9 | 0.1%
Other Punctuation | 1 | < 0.1%

Most frequent character per category

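Other Letter
Value | Count | Frequency (%)
(Hangul character) | 567 | 10.3%
(Hangul character) | 365 | 6.7%
(Hangul character) | 360 | 6.6%
(Hangul character) | 326 | 5.9%
(Hangul character) | 241 | 4.4%
(Hangul character) | 195 | 3.6%
(Hangul character) | 173 | 3.2%
(Hangul character) | 173 | 3.2%
(Hangul character) | 160 | 2.9%
(Hangul character) | 151 | 2.8%
Other values (172) | 2770 | 50.5%

Decimal Number
Value | Count | Frequency (%)
1 | 4 | 44.4%
8 | 2 | 22.2%
7 | 2 | 22.2%
2 | 1 | 11.1%

Close Punctuation
Value | Count | Frequency (%)
) | 162 | 98.8%
(character not preserved) | 2 | 1.2%

Open Punctuation
Value | Count | Frequency (%)
( | 162 | 98.8%
(character not preserved) | 2 | 1.2%

Space Separator
Value | Count | Frequency (%)
(space) | 532 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 26 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 1 | 100.0%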

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 5481 | 85.9%
Common | 896 | 14.1%

Most frequent character per script

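Hangul
Value | Count | Frequency (%)
(Hangul character) | 567 | 10.3%
(Hangul character) | 365 | 6.7%
(Hangul character) | 360 | 6.6%
(Hangul character) | 326 | 5.9%
(Hangul character) | 241 | 4.4%
(Hangul character) | 195 | 3.6%
(Hangul character) | 173 | 3.2%
(Hangul character) | 173 | 3.2%
(Hangul character) | 160 | 2.9%
(Hangul character) | 151 | 2.8%
Other values (172) | 2770 | 50.5%

Common
Value | Count | Frequency (%)
(space) | 532 | 59.4%
) | 162 | 18.1%
( | 162 | 18.1%
_ | 26 | 2.9%
1 | 4 | 0.4%
(character not preserved) | 2 | 0.2%
(character not preserved) | 2 | 0.2%
8 | 2 | 0.2%
7 | 2 | 0.2%
2 | 1 | 0.1%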

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 5470 | 85.8%
ASCII | 892 | 14.0%
Compat Jamo | 11 | 0.2%
None | 4 | 0.1%
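
Block membership depends only on a character's code point. The sketch below classifies characters into the block labels seen in this report using three ranges from the Unicode block list (Basic Latin, Hangul Compatibility Jamo, Hangul Syllables). The `BLOCK_RANGES` table is a hand-written subset, and returning `"None"` for anything outside it is an assumption that mirrors the report's label rather than a documented rule.

```python
from collections import Counter

# Hand-written subset of Unicode blocks, labelled with the short names
# used in this report.
BLOCK_RANGES = [
    (0x0000, 0x007F, "ASCII"),        # Basic Latin
    (0x3130, 0x318F, "Compat Jamo"),  # Hangul Compatibility Jamo
    (0xAC00, 0xD7AF, "Hangul"),       # Hangul Syllables
]


def block_of(ch: str) -> str:
    """Return the block label for one character, or "None" if not listed."""
    code_point = ord(ch)
    for low, high, name in BLOCK_RANGES:
        if low <= code_point <= high:
            return name
    return "None"


def block_counts(values):
    """Count characters per block across all values."""
    return Counter(block_of(ch) for value in values for ch in value)


counts = block_counts(["폐활성탄", "그 밖의 폐기물"])
print(counts)
```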

Most frequent character per block

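Hangul
Value | Count | Frequency (%)
(Hangul character) | 567 | 10.4%
(Hangul character) | 365 | 6.7%
(Hangul character) | 360 | 6.6%
(Hangul character) | 326 | 6.0%
(Hangul character) | 241 | 4.4%
(Hangul character) | 195 | 3.6%
(Hangul character) | 173 | 3.2%
(Hangul character) | 173 | 3.2%
(Hangul character) | 160 | 2.9%
(Hangul character) | 151 | 2.8%
Other values (171) | 2759 | 50.4%

ASCII
Value | Count | Frequency (%)
(space) | 532 | 59.6%
) | 162 | 18.2%
( | 162 | 18.2%
_ | 26 | 2.9%
1 | 4 | 0.4%
8 | 2 | 0.2%
7 | 2 | 0.2%
2 | 1 | 0.1%
. | 1 | 0.1%

Compat Jamo
Value | Count | Frequency (%)
(character not preserved) | 11 | 100.0%

None
Value | Count | Frequency (%)
(character not preserved) | 2 | 50.0%
(character not preserved) | 2 | 50.0%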

사업장소재지 (business location)
Text

Distinct: 245
Distinct (%): 47.7%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Length

Max length: 43
Median length: 36
Mean length: 23.361868
Min length: 19

Characters and Unicode

Total characters: 12008
Distinct characters: 190
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 112
Unique (%): 21.8%

Sample

1st row: 충청남도 예산군 고덕면 예당산단4길 67
2nd row: 충청남도 예산군 고덕면 예당산단4길 67
3rd row: 충청남도 예산군 봉산면 한천로 311-4
4th row: 충청남도 예산군 삽교읍 산단3길 226
5th row: 충청남도 예산군 신암면 신원탄중길 216-48
Value | Count | Frequency (%)
충청남도 | 514 | 19.5%
예산군 | 512 | 19.4%
고덕면 | 84 | 3.2%
신암면 | 81 | 3.1%
예산읍 | 74 | 2.8%
삽교읍 | 69 | 2.6%
오가면 | 52 | 2.0%
응봉면 | 45 | 1.7%
추사로 | 42 | 1.6%
대술면 | 38 | 1.4%
Other values (382) | 1129 | 42.8%

Most occurring characters

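Value | Count | Frequency (%)
(space) | 2251 | 18.7%
(Hangul character) | 823 | 6.9%
(Hangul character) | 682 | 5.7%
(Hangul character) | 544 | 4.5%
(Hangul character) | 525 | 4.4%
(Hangul character) | 516 | 4.3%
(Hangul character) | 516 | 4.3%
(Hangul character) | 514 | 4.3%
1 | 394 | 3.3%
(Hangul character) | 369 | 3.1%
Other values (180) | 4874 | 40.6%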

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 7466 | 62.2%
Space Separator | 2251 | 18.7%
Decimal Number | 1914 | 15.9%
Dash Punctuation | 250 | 2.1%
Close Punctuation | 62 | 0.5%
Open Punctuation | 62 | 0.5%
Uppercase Letter | 3 | < 0.1%

Most frequent character per category

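Other Letter
Value | Count | Frequency (%)
(Hangul character) | 823 | 11.0%
(Hangul character) | 682 | 9.1%
(Hangul character) | 544 | 7.3%
(Hangul character) | 525 | 7.0%
(Hangul character) | 516 | 6.9%
(Hangul character) | 516 | 6.9%
(Hangul character) | 514 | 6.9%
(Hangul character) | 369 | 4.9%
(Hangul character) | 277 | 3.7%
(Hangul character) | 201 | 2.7%
Other values (163) | 2499 | 33.5%

Decimal Number
Value | Count | Frequency (%)
1 | 394 | 20.6%
3 | 264 | 13.8%
2 | 261 | 13.6%
6 | 179 | 9.4%
5 | 176 | 9.2%
4 | 149 | 7.8%
7 | 145 | 7.6%
0 | 141 | 7.4%
9 | 109 | 5.7%
8 | 96 | 5.0%

Uppercase Letter
Value | Count | Frequency (%)
M | 1 | 33.3%
F | 1 | 33.3%
K | 1 | 33.3%

Space Separator
Value | Count | Frequency (%)
(space) | 2251 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 250 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 62 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 62 | 100.0%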

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 7466 | 62.2%
Common | 4539 | 37.8%
Latin | 3 | < 0.1%

Most frequent character per script

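Hangul
Value | Count | Frequency (%)
(Hangul character) | 823 | 11.0%
(Hangul character) | 682 | 9.1%
(Hangul character) | 544 | 7.3%
(Hangul character) | 525 | 7.0%
(Hangul character) | 516 | 6.9%
(Hangul character) | 516 | 6.9%
(Hangul character) | 514 | 6.9%
(Hangul character) | 369 | 4.9%
(Hangul character) | 277 | 3.7%
(Hangul character) | 201 | 2.7%
Other values (163) | 2499 | 33.5%

Common
Value | Count | Frequency (%)
(space) | 2251 | 49.6%
1 | 394 | 8.7%
3 | 264 | 5.8%
2 | 261 | 5.8%
- | 250 | 5.5%
6 | 179 | 3.9%
5 | 176 | 3.9%
4 | 149 | 3.3%
7 | 145 | 3.2%
0 | 141 | 3.1%
Other values (4) | 329 | 7.2%

Latin
Value | Count | Frequency (%)
M | 1 | 33.3%
F | 1 | 33.3%
K | 1 | 33.3%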

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 7466 | 62.2%
ASCII | 4542 | 37.8%

Most frequent character per block

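ASCII
Value | Count | Frequency (%)
(space) | 2251 | 49.6%
1 | 394 | 8.7%
3 | 264 | 5.8%
2 | 261 | 5.7%
- | 250 | 5.5%
6 | 179 | 3.9%
5 | 176 | 3.9%
4 | 149 | 3.3%
7 | 145 | 3.2%
0 | 141 | 3.1%
Other values (7) | 332 | 7.3%

Hangul
Value | Count | Frequency (%)
(Hangul character) | 823 | 11.0%
(Hangul character) | 682 | 9.1%
(Hangul character) | 544 | 7.3%
(Hangul character) | 525 | 7.0%
(Hangul character) | 516 | 6.9%
(Hangul character) | 516 | 6.9%
(Hangul character) | 514 | 6.9%
(Hangul character) | 369 | 4.9%
(Hangul character) | 277 | 3.7%
(Hangul character) | 201 | 2.7%
Other values (163) | 2499 | 33.5%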

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
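
The per-column nullity shown in these plots reduces to counting missing entries in each column. A minimal sketch, assuming rows are held as dictionaries and that a missing cell is represented by `None` (both are assumptions made for illustration; the function name `missing_per_column` is hypothetical):

```python
def missing_per_column(rows, columns):
    """Count cells equal to None in each column."""
    return {
        column: sum(1 for row in rows if row.get(column) is None)
        for column in columns
    }


columns = ["상호", "폐기물 종류", "사업장소재지"]
rows = [
    {
        "상호": "(유)에치비",
        "폐기물 종류": "그 밖의 식물성잔재물",
        "사업장소재지": "충청남도 예산군 고덕면 예당산단4길 67",
    },
    {
        "상호": "(재)충남테크노파크바이오센터",
        "폐기물 종류": "폐활성탄",
        "사업장소재지": "충청남도 예산군 삽교읍 산단3길 226",
    },
]
missing = missing_per_column(rows, columns)
print(missing)
```

For this dataset every column reports zero missing cells, so both plots are fully filled.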

Sample

Index | 상호 | 폐기물 종류 | 사업장소재지
0 | (유)에치비 | 그 밖의 식물성잔재물 | 충청남도 예산군 고덕면 예당산단4길 67
1 | (유)에치비 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 충청남도 예산군 고덕면 예당산단4길 67
2 | (유)에프엑스인터내셔널 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 충청남도 예산군 봉산면 한천로 311-4
3 | (재)충남테크노파크바이오센터 | 폐활성탄 | 충청남도 예산군 삽교읍 산단3길 226
4 | (주) 대영환경 | 건축현장 폐목재(원목상태의 깨끗한 목재를 말한다) | 충청남도 예산군 신암면 신원탄중길 216-48
5 | (주) 대영환경 | 건축현장 폐목재(접착제_ 페인트_ 기름_ 콘크리트 등의 물질이 사용된 목재를 말한다) | 충청남도 예산군 신암면 신원탄중길 216-48
6 | (주) 대영환경 | 그 밖의 폐기물 | 충청남도 예산군 신암면 신원탄중길 216-48
7 | (주) 대영환경 | 무기성오니류 | 충청남도 예산군 신암면 신원탄중길 216-48
8 | (주) 대영환경 | 폐발포합성수지 | 충청남도 예산군 신암면 신원탄중길 216-48
9 | (주) 대영환경 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 충청남도 예산군 신암면 신원탄중길 216-48

Index | 상호 | 폐기물 종류 | 사업장소재지
504 | 한국비앤텍주식회사 | 광재류 | 충청남도 예산군 고덕면 오추길 2-6
505 | 한국비앤텍주식회사 | 폐합성수지류 | 충청남도 예산군 고덕면 오추길 2-6
506 | 한국전기통신공사예산전화국 | 기타 | 충청남도 예산군 예산읍 예산리 722-15
507 | 한국전력공사 예산지사 | 폐전주(폐애자_ 폐근가 및 폐합성수지제 커버류 등을 포함한다) | 충청남도 예산군 예산읍 금오대로 151 (한국전력공사)
508 | 한국철도시설공단 충청지역본부 | 폐목재류 | 충청남도 예산군 예산읍 역전로 23-9
509 | 한라산업 | 폐가구류_ 폐도장목_ 폐목재포장재_ 폐전선드럼(접착제_ 페인트_ 기름_ 콘크리트 등의 물질이 사용된 목재를 말한다) | 충청남도 예산군 예산읍 역전로 66
510 | 한성실업(주) | 폐합성수지류 | 충청남도 예산군 응봉면 응봉로 50-19
511 | 한조케미칼 | 폐수처리오니 | 충청남도 예산군 예산읍 충서로 1354
512 | 현대에스티(주) | 폐합성수지류 | 충청남도 예산군 오가면 예산산업단지로 73
513 | 현대제철 주식회사 | 그 밖의 폐합성고분자화합물(합성수지류로 피복된 폐전선을 포함한다) | 충청남도 예산군 삽교읍 산단3길 107

Duplicate rows

Most frequently occurring

Index | 상호 | 폐기물 종류 | 사업장소재지 | # duplicates
0 | (주)두비원 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 충청남도 예산군 덕산면 남은들로 167-25 | 2
1 | 금광카서비스 | 폐유 | 충청남도 예산군 오가면 신원리 537 | 2
2 | 주식회사 영우티피 | 폐합성수지류(폐염화비닐수지류는 제외한다) | 충청남도 예산군 신암면 추사로 235-28 | 2
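
A duplicate-row table like the one above can be derived by counting identical rows. The sketch below is a plain-Python illustration, not the report's own implementation; it returns both the repeated rows with their occurrence counts and the number of surplus copies. In this dataset three rows each occur twice, so both readings of "3 duplicate rows" (three repeated rows, or three surplus copies) give the same figure. The function name `duplicate_summary` is hypothetical.

```python
from collections import Counter


def duplicate_summary(rows):
    """Return ({row: occurrences} for repeated rows, number of surplus copies)."""
    freq = Counter(rows)
    repeated = {row: n for row, n in freq.items() if n > 1}
    surplus = sum(n - 1 for n in repeated.values())
    return repeated, surplus


# Rows as (상호, 폐기물 종류, 사업장소재지) tuples; the second row is repeated.
rows = [
    ("(유)에치비", "그 밖의 식물성잔재물", "충청남도 예산군 고덕면 예당산단4길 67"),
    ("금광카서비스", "폐유", "충청남도 예산군 오가면 신원리 537"),
    ("금광카서비스", "폐유", "충청남도 예산군 오가면 신원리 537"),
]
repeated, surplus = duplicate_summary(rows)
print(repeated, surplus)
```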