Overview

Dataset statistics

Number of variables: 11
Number of observations: 10000
Missing cells: 2005
Missing cells (%): 1.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 947.3 KiB
Average record size in memory: 97.0 B
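
The two derived figures in this block can be checked from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (variables × observations) and that the average record size is the total memory divided by the number of observations; both assumptions agree with the numbers reported here.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 11
n_observations = 10_000
missing_cells = 2_005
total_size_kib = 947.3

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes, averaged over all rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```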

Variable types

Numeric: 1
Categorical: 3
Text: 6
DateTime: 1

Dataset

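Description: Industrial waste generator reports for Boseong-gun, Jeollanam-do (business name, waste type, contact number, treatment method, reference year, address, etc.).
URL: https://www.data.go.kr/data/15060279/fileData.do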

Alerts

데이터기준일자 has constant value "2023-05-16" (Constant)
처리방법 is highly overall correlated with 폐기물구분 (High correlation)
폐기물구분 is highly overall correlated with 연번 and 1 other field (High correlation)
연번 is highly overall correlated with 폐기물구분 (High correlation)
처리방법 is highly imbalanced (58.7%) (Imbalance)
연락처 has 1993 (19.9%) missing values (Missing)
연번 has unique values (Unique)
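
The Imbalance alert condenses a categorical column into one percentage. The report does not state the formula; the sketch below assumes the score is one minus the Shannon entropy of the class counts divided by the maximum possible entropy. The helper name `imbalance_score` is hypothetical. It is applied to the two class counts reported for 폐기물구분, because the full 43-class breakdown of 처리방법 is not listed in this report.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k) for class counts.

    0 means perfectly balanced; values near 1 mean one class dominates.
    Columns with fewer than two classes return 0 by convention.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Class counts reported for 폐기물구분 (지정폐기물, 사업장일반폐기물).
print(round(imbalance_score([8305, 1695]), 3))

# A perfectly balanced two-class column scores 0.
print(imbalance_score([5000, 5000]))
```

Under this assumption, a column where one class holds most of the rows (such as 처리방법, where one value covers 61.2%) moves the score towards 1.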

Reproduction

Analysis started: 2023-12-12 13:26:03.290382
Analysis finished: 2023-12-12 13:26:04.870109
Duration: 1.58 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
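
A report of this kind could be regenerated roughly as follows. This is a sketch, not the configuration actually used: it assumes pandas and ydata-profiling are installed, and the file name `boseong_waste.csv`, the `cp949` encoding, and the report title are hypothetical placeholders that should be checked against the downloaded file.

```python
def build_report(csv_path, output_html="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report; returns the output path."""
    # Imported lazily so the function can be defined without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the source file is a CSV; the encoding is a guess.
    df = pd.read_csv(csv_path, encoding=encoding)

    # The title is arbitrary; it only labels the generated HTML page.
    report = ProfileReport(df, title="Waste generator report profile")
    report.to_file(output_html)
    return output_html

# Example call (hypothetical file name):
# build_report("boseong_waste.csv")
```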

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6447.9641
Minimum: 2
Maximum: 12948
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2
5-th percentile: 632.95
Q1: 3251.75
Median: 6459.5
Q3: 9638.25
95-th percentile: 12255.05
Maximum: 12948
Range: 12946
Interquartile range (IQR): 6386.5

Descriptive statistics

Standard deviation: 3713.7351
Coefficient of variation (CV): 0.57595469
Kurtosis: -1.1911476
Mean: 6447.9641
Median Absolute Deviation (MAD): 3193
Skewness: -0.0040360882
Sum: 64479641
Variance: 13791829
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
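
Several of the statistics reported for 연번 are simple functions of the others. The sketch below recomputes them from the rounded figures in this report, so small differences in the last digits are expected. The bin width assumes the 50 histogram bins evenly span the observed range.

```python
# Values copied from the 연번 statistics in this report.
minimum, maximum = 2, 12948
q1, q3 = 3251.75, 9638.25
mean, std = 6447.9641, 3713.7351
n, bins = 10_000, 50

value_range = maximum - minimum   # reported Range: 12946
iqr = q3 - q1                     # reported IQR: 6386.5
cv = std / mean                   # reported CV: 0.57595469
variance = std ** 2               # reported Variance: 13791829 (std is rounded here)
total = mean * n                  # reported Sum: 64479641
bin_width = value_range / bins    # assumed width of each histogram bin

print(value_range, iqr, round(cv, 4), round(variance), round(total), bin_width)
```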
Value | Count | Frequency (%)
882 | 1 | < 0.1%
3425 | 1 | < 0.1%
6963 | 1 | < 0.1%
2399 | 1 | < 0.1%
3689 | 1 | < 0.1%
12095 | 1 | < 0.1%
694 | 1 | < 0.1%
4928 | 1 | < 0.1%
7866 | 1 | < 0.1%
1049 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Value | Count | Frequency (%)
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 1 | < 0.1%
12 | 1 | < 0.1%
14 | 1 | < 0.1%

Value | Count | Frequency (%)
12948 | 1 | < 0.1%
12946 | 1 | < 0.1%
12945 | 1 | < 0.1%
12944 | 1 | < 0.1%
12943 | 1 | < 0.1%
12942 | 1 | < 0.1%
12941 | 1 | < 0.1%
12940 | 1 | < 0.1%
12938 | 1 | < 0.1%
12935 | 1 | < 0.1%

폐기물구분
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 8
Median length: 5
Mean length: 5.5085
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 사업장일반폐기물
2nd row: 지정폐기물
3rd row: 지정폐기물
4th row: 지정폐기물
5th row: 사업장일반폐기물

Common Values

Value | Count | Frequency (%)
지정폐기물 | 8305 | 83.0%
사업장일반폐기물 | 1695 | 17.0%

상호명
Text

Distinct: 1873
Distinct (%): 18.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 22
Median length: 18
Mean length: 6.2215
Min length: 1

Characters and Unicode

Total characters: 62215
Distinct characters: 421
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 876
Unique (%): 8.8%

Sample

1st row: 명일산업개발
2nd row: (주)남부환경개발
3rd row: 개인
4th row: 문충현
5th row: 신화종합건설(주)
Value | Count | Frequency (%)
개인 | 1708 | 15.9%
용산건설(주 | 442 | 4.1%
주식회사 | 430 | 4.0%
동남환경건설(주 | 414 | 3.9%
동해종합건설(주 | 392 | 3.7%
㈜부성산업개발 | 380 | 3.5%
㈜민성건설 | 325 | 3.0%
성동건설(주 | 263 | 2.5%
주)정운건설 | 258 | 2.4%
주)민성건설 | 243 | 2.3%
Other values (1916) | 5875 | 54.8%
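
The word table above lists tokens with an outer parenthesis missing, such as names ending in "(주" or beginning with "주)". A plausible explanation, offered as an assumption about the profiler rather than something this report documents, is that each value is split on whitespace and ASCII punctuation is then stripped from both ends of every token. The sketch below reproduces that behaviour with the standard library; `tokenize` is a hypothetical helper, and the input names are taken from the sample rows of this report.

```python
import string
from collections import Counter

# Characters removed from both ends of each token (ASCII only).
STRIP_CHARS = string.punctuation + string.whitespace

def tokenize(value):
    """Split on whitespace, then strip ASCII punctuation from both ends of each token."""
    tokens = (tok.strip(STRIP_CHARS) for tok in value.split())
    return [tok for tok in tokens if tok]

names = ["신화종합건설(주)", "(주)남부환경개발", "㈜민성건설"]
counts = Counter(tok for name in names for tok in tokenize(name))
print(counts)
```

Because ㈜ is a single non-ASCII symbol rather than ASCII punctuation, it is not stripped. That is consistent with ㈜민성건설 and 주)민성건설 being counted as separate tokens in the table.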

Most occurring characters

ValueCountFrequency (%)
4294
 
6.9%
) 4079
 
6.6%
( 4074
 
6.5%
3950
 
6.3%
3825
 
6.1%
2968
 
4.8%
2629
 
4.2%
2043
 
3.3%
1387
 
2.2%
1377
 
2.2%
Other values (411) 31589
50.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 52435
84.3%
Close Punctuation 4079
 
6.6%
Open Punctuation 4074
 
6.5%
Space Separator 766
 
1.2%
Other Symbol 760
 
1.2%
Decimal Number 64
 
0.1%
Uppercase Letter 26
 
< 0.1%
Other Punctuation 9
 
< 0.1%
Modifier Symbol 2
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4294
 
8.2%
3950
 
7.5%
3825
 
7.3%
2968
 
5.7%
2629
 
5.0%
2043
 
3.9%
1387
 
2.6%
1377
 
2.6%
1149
 
2.2%
925
 
1.8%
Other values (385) 27888
53.2%
Uppercase Letter
ValueCountFrequency (%)
K 5
19.2%
O 4
15.4%
S 3
11.5%
H 2
 
7.7%
A 2
 
7.7%
I 2
 
7.7%
T 2
 
7.7%
P 2
 
7.7%
G 1
 
3.8%
L 1
 
3.8%
Other values (2) 2
 
7.7%
Decimal Number
ValueCountFrequency (%)
0 20
31.2%
1 17
26.6%
9 8
 
12.5%
2 7
 
10.9%
3 6
 
9.4%
8 5
 
7.8%
6 1
 
1.6%
Other Punctuation
ValueCountFrequency (%)
. 7
77.8%
% 2
 
22.2%
Close Punctuation
ValueCountFrequency (%)
) 4079
100.0%
Open Punctuation
ValueCountFrequency (%)
( 4074
100.0%
Space Separator
ValueCountFrequency (%)
766
100.0%
Other Symbol
ValueCountFrequency (%)
760
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 53195
85.5%
Common 8994
 
14.5%
Latin 26
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4294
 
8.1%
3950
 
7.4%
3825
 
7.2%
2968
 
5.6%
2629
 
4.9%
2043
 
3.8%
1387
 
2.6%
1377
 
2.6%
1149
 
2.2%
925
 
1.7%
Other values (386) 28648
53.9%
Common
ValueCountFrequency (%)
) 4079
45.4%
( 4074
45.3%
766
 
8.5%
0 20
 
0.2%
1 17
 
0.2%
9 8
 
0.1%
2 7
 
0.1%
. 7
 
0.1%
3 6
 
0.1%
8 5
 
0.1%
Other values (3) 5
 
0.1%
Latin
ValueCountFrequency (%)
K 5
19.2%
O 4
15.4%
S 3
11.5%
H 2
 
7.7%
A 2
 
7.7%
I 2
 
7.7%
T 2
 
7.7%
P 2
 
7.7%
G 1
 
3.8%
L 1
 
3.8%
Other values (2) 2
 
7.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 52435
84.3%
ASCII 9020
 
14.5%
None 760
 
1.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
4294
 
8.2%
3950
 
7.5%
3825
 
7.3%
2968
 
5.7%
2629
 
5.0%
2043
 
3.9%
1387
 
2.6%
1377
 
2.6%
1149
 
2.2%
925
 
1.8%
Other values (385) 27888
53.2%
ASCII
ValueCountFrequency (%)
) 4079
45.2%
( 4074
45.2%
766
 
8.5%
0 20
 
0.2%
1 17
 
0.2%
9 8
 
0.1%
2 7
 
0.1%
. 7
 
0.1%
3 6
 
0.1%
8 5
 
0.1%
Other values (15) 31
 
0.3%
None
ValueCountFrequency (%)
760
100.0%

폐기물종류
Text

Distinct: 127
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 81
Median length: 79
Mean length: 21.1102
Min length: 1

Characters and Unicode

Total characters: 211102
Distinct characters: 247
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 43
Unique (%): 0.4%

Sample

1st row: 사업장폐기물
2nd row: 건조고형물의 함량을 기준으로 하여 석면이 1퍼센트 이상 함유된 제품ㆍ설비(뿜칠로 사용된 것을 포함한다) 등의 해체ㆍ제거 시 발생되는 것
3rd row: 석면의 제거작업에 사용된 바닥비닐시트ㆍ방진마스크ㆍ작업복 등
4th row: 흩날릴 우려가 없는 폐석면
5th row: 폐목재류 1등급
Value | Count | Frequency (%)
우려가 | 4202 | 9.6%
흩날릴 | 4202 | 9.6%
 | 3635 | 8.3%
폐석면 | 3488 | 7.9%
없는 | 3165 | 7.2%
사용된 | 2905 | 6.6%
제거작업에 | 2826 | 6.4%
석면의 | 2826 | 6.4%
비닐시트ㆍ방진마스크ㆍ작업복ㆍ집진필터 | 2424 | 5.5%
모든 | 2424 | 5.5%
Other values (227) | 11820 | 26.9%

Most occurring characters

ValueCountFrequency (%)
33919
 
16.1%
8350
 
4.0%
6451
 
3.1%
6398
 
3.0%
6125
 
2.9%
5694
 
2.7%
5654
 
2.7%
5296
 
2.5%
4540
 
2.2%
4523
 
2.1%
Other values (237) 124152
58.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 175290
83.0%
Space Separator 33919
 
16.1%
Connector Punctuation 630
 
0.3%
Close Punctuation 505
 
0.2%
Open Punctuation 505
 
0.2%
Decimal Number 175
 
0.1%
Lowercase Letter 54
 
< 0.1%
Other Punctuation 24
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
8350
 
4.8%
6451
 
3.7%
6398
 
3.6%
6125
 
3.5%
5694
 
3.2%
5654
 
3.2%
5296
 
3.0%
4540
 
2.6%
4523
 
2.6%
4445
 
2.5%
Other values (217) 117814
67.2%
Decimal Number
ValueCountFrequency (%)
1 148
84.6%
2 12
 
6.9%
0 9
 
5.1%
3 3
 
1.7%
8 3
 
1.7%
Lowercase Letter
ValueCountFrequency (%)
e 18
33.3%
s 9
16.7%
a 9
16.7%
r 9
16.7%
g 9
16.7%
Close Punctuation
ValueCountFrequency (%)
) 493
97.6%
] 9
 
1.8%
3
 
0.6%
Open Punctuation
ValueCountFrequency (%)
( 493
97.6%
[ 9
 
1.8%
3
 
0.6%
Other Punctuation
ValueCountFrequency (%)
. 18
75.0%
· 6
 
25.0%
Space Separator
ValueCountFrequency (%)
33919
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 630
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 175290
83.0%
Common 35758
 
16.9%
Latin 54
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
8350
 
4.8%
6451
 
3.7%
6398
 
3.6%
6125
 
3.5%
5694
 
3.2%
5654
 
3.2%
5296
 
3.0%
4540
 
2.6%
4523
 
2.6%
4445
 
2.5%
Other values (217) 117814
67.2%
Common
ValueCountFrequency (%)
33919
94.9%
_ 630
 
1.8%
) 493
 
1.4%
( 493
 
1.4%
1 148
 
0.4%
. 18
 
0.1%
2 12
 
< 0.1%
] 9
 
< 0.1%
0 9
 
< 0.1%
[ 9
 
< 0.1%
Other values (5) 18
 
0.1%
Latin
ValueCountFrequency (%)
e 18
33.3%
s 9
16.7%
a 9
16.7%
r 9
16.7%
g 9
16.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 166940
79.1%
ASCII 35800
 
17.0%
Compat Jamo 8350
 
4.0%
None 12
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
33919
94.7%
_ 630
 
1.8%
) 493
 
1.4%
( 493
 
1.4%
1 148
 
0.4%
e 18
 
0.1%
. 18
 
0.1%
2 12
 
< 0.1%
] 9
 
< 0.1%
s 9
 
< 0.1%
Other values (7) 51
 
0.1%
Compat Jamo
ValueCountFrequency (%)
8350
100.0%
Hangul
ValueCountFrequency (%)
6451
 
3.9%
6398
 
3.8%
6125
 
3.7%
5694
 
3.4%
5654
 
3.4%
5296
 
3.2%
4540
 
2.7%
4523
 
2.7%
4445
 
2.7%
4250
 
2.5%
Other values (216) 113564
68.0%
None
ValueCountFrequency (%)
· 6
50.0%
3
25.0%
3
25.0%

연락처
Text

MISSING 

Distinct: 633
Distinct (%): 7.9%
Missing: 1993
Missing (%): 19.9%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 12
Mean length: 8.0178594
Min length: 1

Characters and Unicode

Total characters: 64199
Distinct characters: 16
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 254
Unique (%): 3.2%

Sample

1st row: 061-857-1441
2nd row: 061-337-4501
3rd row:
4th row: 062-226-0321
5th row: 061-858-7081
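
For a column with missing values, the percentages above do not all share the same denominator. The sketch below, which uses only figures reported for 연락처, suggests that Missing (%) is relative to all rows while Distinct (%), Unique (%) and Mean length are relative to the non-missing rows. This reading is an inference from the numbers, not something the report states.

```python
# Figures copied from the 연락처 statistics in this report.
n_rows = 10_000
n_missing = 1_993
n_distinct = 633
n_unique = 254
total_characters = 64_199

n_present = n_rows - n_missing                # 8007 non-missing values

missing_pct = 100 * n_missing / n_rows        # share of all rows
distinct_pct = 100 * n_distinct / n_present   # share of non-missing rows
unique_pct = 100 * n_unique / n_present       # share of non-missing rows
mean_length = total_characters / n_present    # characters per non-missing value

print(f"{missing_pct:.1f}% {distinct_pct:.1f}% {unique_pct:.1f}% {mean_length:.4f}")
```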
Value | Count | Frequency (%)
061-852-0537 | 599 | 11.7%
061-858-7081 | 448 | 8.7%
061-858-4404 | 422 | 8.2%
061-857-6336 | 401 | 7.8%
061-853-3303 | 340 | 6.6%
061-852-3920 | 258 | 5.0%
062-528-0404 | 253 | 4.9%
061-472-8501 | 154 | 3.0%
061-471-2913 | 112 | 2.2%
061-851-0406 | 109 | 2.1%
Other values (625) | 2025 | 39.5%

Most occurring characters

ValueCountFrequency (%)
- 10198
15.9%
0 9770
15.2%
8 6823
10.6%
6 6742
10.5%
1 6690
10.4%
5 6267
9.8%
3 4825
7.5%
2 3283
 
5.1%
4 3086
 
4.8%
2900
 
4.5%
Other values (6) 3615
 
5.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 51085
79.6%
Dash Punctuation 10198
 
15.9%
Space Separator 2900
 
4.5%
Other Letter 15
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 9770
19.1%
8 6823
13.4%
6 6742
13.2%
1 6690
13.1%
5 6267
12.3%
3 4825
9.4%
2 3283
 
6.4%
4 3086
 
6.0%
7 2886
 
5.6%
9 713
 
1.4%
Other Letter
ValueCountFrequency (%)
5
33.3%
5
33.3%
5
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 10198
100.0%
Space Separator
ValueCountFrequency (%)
2900
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 64184
> 99.9%
Hangul 15
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
- 10198
15.9%
0 9770
15.2%
8 6823
10.6%
6 6742
10.5%
1 6690
10.4%
5 6267
9.8%
3 4825
7.5%
2 3283
 
5.1%
4 3086
 
4.8%
2900
 
4.5%
Other values (3) 3600
 
5.6%
Hangul
ValueCountFrequency (%)
5
33.3%
5
33.3%
5
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 64184
> 99.9%
Hangul 15
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 10198
15.9%
0 9770
15.2%
8 6823
10.6%
6 6742
10.5%
1 6690
10.4%
5 6267
9.8%
3 4825
7.5%
2 3283
 
5.1%
4 3086
 
4.8%
2900
 
4.5%
Other values (3) 3600
 
5.6%
Hangul
ValueCountFrequency (%)
5
33.3%
5
33.3%
5
33.3%

운반자명
Text

Distinct: 862
Distinct (%): 8.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 4
Mean length: 4.0007
Min length: 1

Characters and Unicode

Total characters: 40007
Distinct characters: 217
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 370
Unique (%): 3.7%

Sample

1st row: (주)빛고을환경
2nd row: 1.3
3rd row: 0.01
4th row: 1.95
5th row: 푸른보성영농조합법인
Value | Count | Frequency (%)
0.01 | 1309 | 13.1%
0.02 | 804 | 8.0%
1 | 632 | 6.3%
0.05 | 527 | 5.3%
2.99 | 451 | 4.5%
0.3 | 424 | 4.2%
0 | 416 | 4.2%
0.12 | 179 | 1.8%
푸른보성영농조합법인 | 156 | 1.6%
2 | 120 | 1.2%
Other values (848) | 4977 | 49.8%

Most occurring characters

ValueCountFrequency (%)
0 7779
19.4%
. 6924
17.3%
1 3830
 
9.6%
2 2646
 
6.6%
9 1587
 
4.0%
3 1295
 
3.2%
5 1108
 
2.8%
1017
 
2.5%
1010
 
2.5%
( 912
 
2.3%
Other values (207) 11899
29.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 20810
52.0%
Other Letter 10316
25.8%
Other Punctuation 6925
 
17.3%
Open Punctuation 912
 
2.3%
Close Punctuation 912
 
2.3%
Space Separator 71
 
0.2%
Connector Punctuation 42
 
0.1%
Uppercase Letter 19
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1017
 
9.9%
1010
 
9.8%
779
 
7.6%
538
 
5.2%
426
 
4.1%
377
 
3.7%
333
 
3.2%
292
 
2.8%
292
 
2.8%
224
 
2.2%
Other values (180) 5028
48.7%
Uppercase Letter
ValueCountFrequency (%)
N 3
15.8%
H 3
15.8%
D 3
15.8%
E 2
10.5%
T 2
10.5%
S 1
 
5.3%
I 1
 
5.3%
G 1
 
5.3%
J 1
 
5.3%
K 1
 
5.3%
Decimal Number
ValueCountFrequency (%)
0 7779
37.4%
1 3830
18.4%
2 2646
 
12.7%
9 1587
 
7.6%
3 1295
 
6.2%
5 1108
 
5.3%
8 790
 
3.8%
4 643
 
3.1%
6 621
 
3.0%
7 511
 
2.5%
Other Punctuation
ValueCountFrequency (%)
. 6924
> 99.9%
· 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 912
100.0%
Close Punctuation
ValueCountFrequency (%)
) 912
100.0%
Space Separator
ValueCountFrequency (%)
71
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 42
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 29672
74.2%
Hangul 10316
 
25.8%
Latin 19
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1017
 
9.9%
1010
 
9.8%
779
 
7.6%
538
 
5.2%
426
 
4.1%
377
 
3.7%
333
 
3.2%
292
 
2.8%
292
 
2.8%
224
 
2.2%
Other values (180) 5028
48.7%
Common
ValueCountFrequency (%)
0 7779
26.2%
. 6924
23.3%
1 3830
12.9%
2 2646
 
8.9%
9 1587
 
5.3%
3 1295
 
4.4%
5 1108
 
3.7%
( 912
 
3.1%
) 912
 
3.1%
8 790
 
2.7%
Other values (6) 1889
 
6.4%
Latin
ValueCountFrequency (%)
N 3
15.8%
H 3
15.8%
D 3
15.8%
E 2
10.5%
T 2
10.5%
S 1
 
5.3%
I 1
 
5.3%
G 1
 
5.3%
J 1
 
5.3%
K 1
 
5.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 29690
74.2%
Hangul 10316
 
25.8%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 7779
26.2%
. 6924
23.3%
1 3830
12.9%
2 2646
 
8.9%
9 1587
 
5.3%
3 1295
 
4.4%
5 1108
 
3.7%
( 912
 
3.1%
) 912
 
3.1%
8 790
 
2.7%
Other values (16) 1907
 
6.4%
Hangul
ValueCountFrequency (%)
1017
 
9.9%
1010
 
9.8%
779
 
7.6%
538
 
5.2%
426
 
4.1%
377
 
3.7%
333
 
3.2%
292
 
2.8%
292
 
2.8%
224
 
2.2%
Other values (180) 5028
48.7%
None
ValueCountFrequency (%)
· 1
100.0%

처리업소명
Text

Distinct: 493
Distinct (%): 4.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 58
Median length: 24
Mean length: 8.2368
Min length: 1

Characters and Unicode

Total characters: 82368
Distinct characters: 256
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 216
Unique (%): 2.2%

Sample

1st row: 인선ENT
2nd row: 인선ENT(주)
3rd row: 주식회사그린바이로
4th row: 한맥테코산업(주)
5th row: 푸른보성영농조합법인
Value | Count | Frequency (%)
한맥테코산업(주 | 1783 | 17.5%
주)와이엔텍 | 959 | 9.4%
㈜와이엔텍 | 857 | 8.4%
인선이엔티(주)광양 | 622 | 6.1%
승우산업개발(주 | 619 | 6.1%
한맥테코산업(주)율촌사업소 | 514 | 5.1%
주)지엠이앤씨 | 344 | 3.4%
㈜하나이앤에스 | 305 | 3.0%
에코시스템(주 | 282 | 2.8%
푸른보성영농조합법인 | 158 | 1.6%
Other values (484) | 3730 | 36.7%

Most occurring characters

ValueCountFrequency (%)
7598
 
9.2%
) 7352
 
8.9%
( 7347
 
8.9%
4278
 
5.2%
4086
 
5.0%
3598
 
4.4%
2954
 
3.6%
2647
 
3.2%
2593
 
3.1%
2435
 
3.0%
Other values (246) 37480
45.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 65438
79.4%
Close Punctuation 7352
 
8.9%
Open Punctuation 7347
 
8.9%
Other Symbol 1517
 
1.8%
Uppercase Letter 339
 
0.4%
Space Separator 247
 
0.3%
Lowercase Letter 45
 
0.1%
Decimal Number 36
 
< 0.1%
Dash Punctuation 30
 
< 0.1%
Connector Punctuation 16
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
7598
 
11.6%
4278
 
6.5%
4086
 
6.2%
3598
 
5.5%
2954
 
4.5%
2647
 
4.0%
2593
 
4.0%
2435
 
3.7%
2426
 
3.7%
1868
 
2.9%
Other values (224) 30955
47.3%
Uppercase Letter
ValueCountFrequency (%)
C 74
21.8%
K 74
21.8%
E 63
18.6%
T 63
18.6%
N 63
18.6%
J 1
 
0.3%
S 1
 
0.3%
Lowercase Letter
ValueCountFrequency (%)
e 31
68.9%
k 6
 
13.3%
c 6
 
13.3%
t 1
 
2.2%
n 1
 
2.2%
Decimal Number
ValueCountFrequency (%)
2 32
88.9%
1 3
 
8.3%
9 1
 
2.8%
Close Punctuation
ValueCountFrequency (%)
) 7352
100.0%
Open Punctuation
ValueCountFrequency (%)
( 7347
100.0%
Other Symbol
ValueCountFrequency (%)
1517
100.0%
Space Separator
ValueCountFrequency (%)
247
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 30
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 16
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 66955
81.3%
Common 15029
 
18.2%
Latin 384
 
0.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
7598
 
11.3%
4278
 
6.4%
4086
 
6.1%
3598
 
5.4%
2954
 
4.4%
2647
 
4.0%
2593
 
3.9%
2435
 
3.6%
2426
 
3.6%
1868
 
2.8%
Other values (225) 32472
48.5%
Latin
ValueCountFrequency (%)
C 74
19.3%
K 74
19.3%
E 63
16.4%
T 63
16.4%
N 63
16.4%
e 31
8.1%
k 6
 
1.6%
c 6
 
1.6%
t 1
 
0.3%
n 1
 
0.3%
Other values (2) 2
 
0.5%
Common
ValueCountFrequency (%)
) 7352
48.9%
( 7347
48.9%
247
 
1.6%
2 32
 
0.2%
- 30
 
0.2%
_ 16
 
0.1%
1 3
 
< 0.1%
. 1
 
< 0.1%
9 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 65436
79.4%
ASCII 15413
 
18.7%
None 1517
 
1.8%
Compat Jamo 2
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
7598
 
11.6%
4278
 
6.5%
4086
 
6.2%
3598
 
5.5%
2954
 
4.5%
2647
 
4.0%
2593
 
4.0%
2435
 
3.7%
2426
 
3.7%
1868
 
2.9%
Other values (222) 30953
47.3%
ASCII
ValueCountFrequency (%)
) 7352
47.7%
( 7347
47.7%
247
 
1.6%
C 74
 
0.5%
K 74
 
0.5%
E 63
 
0.4%
T 63
 
0.4%
N 63
 
0.4%
2 32
 
0.2%
e 31
 
0.2%
Other values (11) 67
 
0.4%
None
ValueCountFrequency (%)
1517
100.0%
Compat Jamo
ValueCountFrequency (%)
1
50.0%
1
50.0%

처리방법
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 43
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 19
Median length: 13
Mean length: 11.5185
Min length: 1

Unique

Unique: 14
Unique (%): 0.1%

Sample

1st row: 매립(관리형매립시설)
2nd row: 매립(민간관리형매립시설)
3rd row: 매립(민간관리형매립시설)
4th row: 매립(민간관리형매립시설)
5th row: 재활용(파쇄.분쇄)

Common Values

Value | Count | Frequency (%)
매립(민간관리형매립시설) | 6121 | 61.2%
중간처분(고형화) | 1115 | 11.2%
중간처분(일반소각) | 723 | 7.2%
재활용(파쇄.분쇄) | 558 | 5.6%
파쇄.절단 | 274 | 2.7%
중간처분(파쇄.분쇄) | 264 | 2.6%
재활용(중간가공폐기물 제조) | 232 | 2.3%
재활용(기타) | 147 | 1.5%
소각 | 138 | 1.4%
중간처분(고온소각) | 65 | 0.7%
Other values (33) | 363 | 3.6%

Value | Count | Frequency (%)
매립(민간관리형매립시설 | 6121 | 59.1%
중간처분(고형화 | 1115 | 10.8%
중간처분(일반소각 | 723 | 7.0%
재활용(파쇄.분쇄 | 558 | 5.4%
파쇄.절단 | 274 | 2.6%
제조 | 271 | 2.6%
중간처분(파쇄.분쇄 | 264 | 2.5%
재활용(중간가공폐기물 | 232 | 2.2%
재활용(기타 | 147 | 1.4%
소각 | 138 | 1.3%
Other values (37) | 516 | 5.0%

사업장주소
Text

Distinct: 2996
Distinct (%): 30.0%
Missing: 12
Missing (%): 0.1%
Memory size: 156.2 KiB

Length

Max length: 54
Median length: 50
Mean length: 21.630857
Min length: 1

Characters and Unicode

Total characters: 216049
Distinct characters: 395
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1463
Unique (%): 14.6%

Sample

1st row: 전라남도 보성군 벌교읍 장좌리 711-7
2nd row: 전라남도 나주시 다도면 송학리 190
3rd row: 전라남도 보성군 벌교읍 월곡길 47
4th row: 전라남도 보성군 노동면 신천길 121
5th row: 전라남도 화순군 화순읍 광덕리 370
Value | Count | Frequency (%)
전라남도 | 9629 | 19.1%
보성군 | 8833 | 17.6%
벌교읍 | 2722 | 5.4%
보성읍 | 2488 | 4.9%
조성면 | 868 | 1.7%
문덕면 | 805 | 1.6%
장운길 | 714 | 1.4%
송재로 | 613 | 1.2%
원동길 | 595 | 1.2%
40 | 583 | 1.2%
Other values (3223) | 22456 | 44.6%

Most occurring characters

ValueCountFrequency (%)
43833
20.3%
13247
 
6.1%
12084
 
5.6%
10235
 
4.7%
9877
 
4.6%
9824
 
4.5%
9676
 
4.5%
9404
 
4.4%
1 5965
 
2.8%
5350
 
2.5%
Other values (385) 86554
40.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 135038
62.5%
Space Separator 43833
 
20.3%
Decimal Number 30834
 
14.3%
Dash Punctuation 4670
 
2.2%
Close Punctuation 639
 
0.3%
Open Punctuation 639
 
0.3%
Connector Punctuation 329
 
0.2%
Uppercase Letter 58
 
< 0.1%
Lowercase Letter 6
 
< 0.1%
Math Symbol 2
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
13247
 
9.8%
12084
 
8.9%
10235
 
7.6%
9877
 
7.3%
9824
 
7.3%
9676
 
7.2%
9404
 
7.0%
5350
 
4.0%
4812
 
3.6%
4136
 
3.1%
Other values (353) 46393
34.4%
Uppercase Letter
ValueCountFrequency (%)
Y 21
36.2%
D 21
36.2%
C 3
 
5.2%
K 3
 
5.2%
S 3
 
5.2%
T 2
 
3.4%
M 1
 
1.7%
A 1
 
1.7%
L 1
 
1.7%
B 1
 
1.7%
Decimal Number
ValueCountFrequency (%)
1 5965
19.3%
2 4620
15.0%
3 4032
13.1%
4 2842
9.2%
7 2769
9.0%
6 2542
8.2%
0 2412
7.8%
8 2399
7.8%
5 1826
 
5.9%
9 1427
 
4.6%
Lowercase Letter
ValueCountFrequency (%)
i 2
33.3%
o 2
33.3%
l 2
33.3%
Math Symbol
ValueCountFrequency (%)
~ 1
50.0%
1
50.0%
Space Separator
ValueCountFrequency (%)
43833
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4670
100.0%
Close Punctuation
ValueCountFrequency (%)
) 639
100.0%
Open Punctuation
ValueCountFrequency (%)
( 639
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 329
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 135038
62.5%
Common 80947
37.5%
Latin 64
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
13247
 
9.8%
12084
 
8.9%
10235
 
7.6%
9877
 
7.3%
9824
 
7.3%
9676
 
7.2%
9404
 
7.0%
5350
 
4.0%
4812
 
3.6%
4136
 
3.1%
Other values (353) 46393
34.4%
Common
ValueCountFrequency (%)
43833
54.2%
1 5965
 
7.4%
- 4670
 
5.8%
2 4620
 
5.7%
3 4032
 
5.0%
4 2842
 
3.5%
7 2769
 
3.4%
6 2542
 
3.1%
0 2412
 
3.0%
8 2399
 
3.0%
Other values (8) 4863
 
6.0%
Latin
ValueCountFrequency (%)
Y 21
32.8%
D 21
32.8%
C 3
 
4.7%
K 3
 
4.7%
S 3
 
4.7%
T 2
 
3.1%
i 2
 
3.1%
o 2
 
3.1%
l 2
 
3.1%
M 1
 
1.6%
Other values (4) 4
 
6.2%

Most occurring blocks

ValueCountFrequency (%)
Hangul 135038
62.5%
ASCII 81010
37.5%
Math Operators 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
43833
54.1%
1 5965
 
7.4%
- 4670
 
5.8%
2 4620
 
5.7%
3 4032
 
5.0%
4 2842
 
3.5%
7 2769
 
3.4%
6 2542
 
3.1%
0 2412
 
3.0%
8 2399
 
3.0%
Other values (21) 4926
 
6.1%
Hangul
ValueCountFrequency (%)
13247
 
9.8%
12084
 
8.9%
10235
 
7.6%
9877
 
7.3%
9824
 
7.3%
9676
 
7.2%
9404
 
7.0%
5350
 
4.0%
4812
 
3.6%
4136
 
3.1%
Other values (353) 46393
34.4%
Math Operators
ValueCountFrequency (%)
1
100.0%

신고기준년도
DateTime

Distinct: 1906
Distinct (%): 19.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 1998-10-26 00:00:00
Maximum: 2023-05-10 00:00:00
Histogram with fixed size bins (bins=50)

데이터기준일자
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-05-16
2nd row: 2023-05-16
3rd row: 2023-05-16
4th row: 2023-05-16
5th row: 2023-05-16

Common Values

Value | Count | Frequency (%)
2023-05-16 | 10000 | 100.0%

Interactions


Correlations

 | 연번 | 폐기물구분 | 처리방법
연번 | 1.000 | 0.993 | 0.809
폐기물구분 | 0.993 | 1.000 | 0.969
처리방법 | 0.809 | 0.969 | 1.000

 | 처리방법 | 폐기물구분
처리방법 | 1.000 | 0.918
폐기물구분 | 0.918 | 1.000

 | 연번 | 폐기물구분 | 처리방법
연번 | 1.000 | 0.925 | 0.436
폐기물구분 | 0.925 | 1.000 | 0.918
처리방법 | 0.436 | 0.918 | 1.000
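
The matrices above include pairs of categorical variables, for which an ordinary linear correlation is not defined. One widely used association measure for such pairs is Cramér's V. Whether any of these matrices was computed with it is an assumption, since the report does not name the coefficients. The sketch below implements the uncorrected form with the standard library; `cramers_v` is a hypothetical helper and the contingency tables are made-up illustrations, not counts from this dataset.

```python
import math

def cramers_v(table):
    """Cramér's V (no bias correction) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    k = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or k <= 0:
        return 0.0
    # Pearson chi-square statistic: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))

# Hypothetical counts, not taken from the dataset.
perfect = [[50, 0], [0, 50]]        # each row category maps to one column category
independent = [[25, 25], [25, 25]]  # no association at all
print(cramers_v(perfect), cramers_v(independent))
```

Values close to 1, such as the 0.918 reported between 처리방법 and 폐기물구분, would under this reading mean that one variable almost determines the other.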

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

Index | 연번 | 폐기물구분 | 상호명 | 폐기물종류 | 연락처 | 운반자명 | 처리업소명 | 처리방법 | 사업장주소 | 신고기준년도 | 데이터기준일자
878 | 882 | 사업장일반폐기물 | 명일산업개발 | 사업장폐기물 | 061-857-1441 | (주)빛고을환경 | 인선ENT | 매립(관리형매립시설) | 전라남도 보성군 벌교읍 장좌리 711-7 | 2008-05-16 | 2023-05-16
11796 | 11830 | 지정폐기물 | (주)남부환경개발 | 건조고형물의 함량을 기준으로 하여 석면이 1퍼센트 이상 함유된 제품ㆍ설비(뿜칠로 사용된 것을 포함한다) 등의 해체ㆍ제거 시 발생되는 것 | 061-337-4501 | 1.3 | 인선ENT(주) | 매립(민간관리형매립시설) | 전라남도 나주시 다도면 송학리 190 | 2008-08-29 | 2023-05-16
10526 | 10559 | 지정폐기물 | 개인 | 석면의 제거작업에 사용된 바닥비닐시트ㆍ방진마스크ㆍ작업복 등 |  | 0.01 | 주식회사그린바이로 | 매립(민간관리형매립시설) | 전라남도 보성군 벌교읍 월곡길 47 | 2015-06-11 | 2023-05-16
8829 | 8862 | 지정폐기물 | 문충현 | 흩날릴 우려가 없는 폐석면 | <NA> | 1.95 | 한맥테코산업(주) | 매립(민간관리형매립시설) | 전라남도 보성군 노동면 신천길 121 | 2018-06-22 | 2023-05-16
739 | 742 | 사업장일반폐기물 | 신화종합건설(주) | 폐목재류 1등급 | 062-226-0321 | 푸른보성영농조합법인 | 푸른보성영농조합법인 | 재활용(파쇄.분쇄) | 전라남도 화순군 화순읍 광덕리 370 | 2013-05-07 | 2023-05-16
4874 | 4907 | 지정폐기물 | ㈜민성건설 | 흩날릴 우려가 없는 폐석면 | 061-858-7081 | 3.98 | 한맥테코산업(주) | 매립(민간관리형매립시설) | 전라남도 보성군 벌교읍 원동길 40 | 2021-12-09 | 2023-05-16
9518 | 9551 | 지정폐기물 | (주)성륜건설 | 흩날릴 우려가 없는 폐석면 |  | 0.74 | (주)와이엔텍 | 매립(민간관리형매립시설) | 전라남도 나주시 봉황면 도천로 146-12 | 2017-06-05 | 2023-05-16
6763 | 6796 | 지정폐기물 | 개인 | 흩날릴 우려가 없는 폐석면 | <NA> | 1.18 | 한맥테코산업(주) | 매립(민간관리형매립시설) | 전라남도 순천시 장선배기길 89_ 103동 406호 (조례동_ 금당대림아파트) | 2020-09-25 | 2023-05-16
12601 | 12692 | 지정폐기물 | 예당연합의원 | 폐합성수지류 | 061-853-0374 | 0 | 한국환경개발 | 기타 | 전라남도 보성군 득량면 예당리 430-2 | 2001-01-08 | 2023-05-16
1199 | 1213 | 사업장일반폐기물 | 개인 | 건설폐재류 |  | 동남환경건설(주) | 동남환경건설(주) | 재활용(파쇄.분쇄) | 전라남도 보성군 벌교읍 벌교리 867-4 | 2004-10-19 | 2023-05-16

Index | 연번 | 폐기물구분 | 상호명 | 폐기물종류 | 연락처 | 운반자명 | 처리업소명 | 처리방법 | 사업장주소 | 신고기준년도 | 데이터기준일자
1406 | 1427 | 사업장일반폐기물 | (주)선인수도기계 | 건설폐기물 |  | (주)미래환경산업개발 | (주)미래환경산업개발 | 재활용(파쇄.분쇄) | 전라남도 보성군 보성읍 용문리 227-2 | 2004-01-14 | 2023-05-16
578 | 580 | 사업장일반폐기물 | 한국어촌어항협회 | 폐합성수지류(폐염화비닐수지류는 제외한다) |  | 초당환경 | 초당환경 | 재활용(중간가공폐기물 제조) |  | 2017-10-13 | 2023-05-16
12095 | 12180 | 지정폐기물 | 현대농기계 | 폐유 |  | 0.12 | 호남자원 | 재활용(기타) | 전라남도 보성군 벌교읍 낙성리 2-1 | 2004-08-04 | 2023-05-16
5875 | 5908 | 지정폐기물 | 동남환경건설(주) | 흩날릴 우려가 없는 폐석면 | 061-853-3303 | 2.99 | 에코시스템(주) | 매립(민간관리형매립시설) | 전라남도 보성군 조성면 녹색로 4073 | 2021-04-28 | 2023-05-16
3517 | 3550 | 지정폐기물 | (유)산돌건설 | 석면의 제거작업에 사용된 모든 비닐시트ㆍ방진마스크ㆍ작업복ㆍ집진필터 등 | 061-853-0226 | 0.01 | 인선이엔티(주)광양 | 매립(민간관리형매립시설) | 전라남도 보성군 겸백면 사곡길 11 | 2022-04-28 | 2023-05-16
7950 | 7983 | 지정폐기물 | 동남환경건설(주) | 석면의 제거작업에 사용된 모든 비닐시트ㆍ방진마스크ㆍ작업복ㆍ집진필터 등 | 061-853-3303 | 0.02 | (주)와이엔텍 | 매립(민간관리형매립시설) | 전라남도 보성군 조성면 녹색로 4073 | 2019-07-03 | 2023-05-16
11338 | 11372 | 지정폐기물 | 보성군청 | 슬레이트 등 고형화되어 있어 흩날릴 우려가 없는 것 | 061-850-5552 | 2.8 | 한맥테코산업(주)율촌사업소 | 매립(민간관리형매립시설) | 전라남도 보성군 보성읍 송재로 165 | 2012-09-19 | 2023-05-16
10384 | 10417 | 지정폐기물 | 보성군청 | 석면의 제거작업에 사용된 바닥비닐시트ㆍ방진마스크ㆍ작업복 등 | 061-850-5161 | 0.1 | (주)와이엔텍 | 매립(민간관리형매립시설) | 전라남도 보성군 보성읍 송재로 165 | 2015-08-19 | 2023-05-16
8128 | 8161 | 지정폐기물 | 동남환경건설(주) | 석면의 제거작업에 사용된 모든 비닐시트ㆍ방진마스크ㆍ작업복ㆍ집진필터 등 | 061-853-3303 | 0.02 | ㈜와이엔텍 | 매립(민간관리형매립시설) | 전라남도 보성군 조성면 녹색로 4073 | 2019-05-14 | 2023-05-16
5792 | 5825 | 지정폐기물 | 동해종합건설(주) | 흩날릴 우려가 없는 폐석면 | 062-528-0404 | 2.99 | 에코시스템(주) | 매립(민간관리형매립시설) | 전라남도 보성군 보성읍 갱맹골길 173 | 2021-05-10 | 2023-05-16