Overview

Dataset statistics

Number of variables: 6
Number of observations: 45
Missing cells: 50
Missing cells (%): 18.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.2 KiB
Average record size in memory: 50.9 B
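
The missing-cell percentage can be cross-checked by hand. The sketch below assumes the per-column missing counts that the report lists under Alerts (2, 43, and 5) and the 6 x 45 table shape; the variable names are illustrative and are not part of the report.

```python
# Counts copied from this report (Overview and Alerts sections).
n_variables = 6
n_observations = 45
missing_per_column = {"도로명주소": 2, "지번주소": 43, "연락처": 5}

# A "cell" is one value of one variable in one observation.
total_cells = n_variables * n_observations
missing_cells = sum(missing_per_column.values())
missing_cells_pct = round(100 * missing_cells / total_cells, 1)

# Per-column percentages are relative to the number of observations.
missing_pct_per_column = {
    name: round(100 * count / n_observations, 1)
    for name, count in missing_per_column.items()
}

print(missing_cells, missing_cells_pct)
print(missing_pct_per_column)
```

The totals agree with the figures above: 50 missing cells out of 270, or 18.5%.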

Variable types

Text: 5
Categorical: 1

Dataset

Description: Data on air pollutant emitting facilities (Class 4 to Class 5) in Cheongyang-gun, Chungcheongnam-do, covering each facility's business name, location, industry type, and class.
Author: Chungcheongnam-do (충청남도)
URL: https://alldam.chungnam.go.kr/index.chungnam?menuCd=DOM_000000201001001001&st=&cds=&orgCd=&apiType=&isOpen=Y&pageIndex=339&beforeMenuCd=DOM_000000201001001000&publicdatapk=15083641

Alerts

도로명주소 (road-name address) has 2 (4.4%) missing values [Missing]
지번주소 (lot-number address) has 43 (95.6%) missing values [Missing]
연락처 (contact number) has 5 (11.1%) missing values [Missing]
상 호 (business name) has unique values [Unique]

Reproduction

Analysis started: 2024-01-09 21:24:37.309813
Analysis finished: 2024-01-09 21:24:37.864304
Duration: 0.55 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
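
As a quick sanity check, the reported duration can be recomputed from the two timestamps. This is a minimal sketch using only the standard library; it assumes both timestamps are in the same time zone.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of this report.
started = datetime.fromisoformat("2024-01-09 21:24:37.309813")
finished = datetime.fromisoformat("2024-01-09 21:24:37.864304")

# Subtracting two datetimes yields a timedelta.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```

The difference is 0.554491 seconds, which rounds to the 0.55 seconds shown above.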

Variables

상 호
Text

UNIQUE 

Distinct: 45
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 492.0 B

Length

Max length: 17
Median length: 12
Mean length: 7
Min length: 3

Characters and Unicode

Total characters: 315
Distinct characters: 122
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
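
To illustrate how such character properties can be read, the sketch below counts Unicode general categories for the five sample values of 상 호 shown in this report, using Python's standard `unicodedata` module. It is not the profiler's own implementation, and it covers categories only: script and block lookups are not available in the standard library.

```python
import unicodedata
from collections import Counter

# The five sample values of the 상 호 column shown in this report.
sample_names = [
    "칠갑농산㈜",
    "진흥자동차공업사",
    "영농조합법인청양골미곡종합처리장",
    "㈜칠갑정비",
    "청양금호정비",
]

# unicodedata.category returns a two-letter code, for example
# "Lo" (Other Letter) for Hangul syllables and "So" (Other Symbol) for ㈜.
category_counts = Counter(
    unicodedata.category(ch) for name in sample_names for ch in name
)

for code, count in category_counts.most_common():
    print(code, count)
```

In this small sample only two categories occur, Other Letter and Other Symbol, which are also the two largest categories for the full column.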

Unique

Unique: 45
Unique (%): 100.0%

Sample

1st row: 칠갑농산㈜
2nd row: 진흥자동차공업사
3rd row: 영농조합법인청양골미곡종합처리장
4th row: ㈜칠갑정비
5th row: 청양금호정비

Value | Count | Frequency (%)
칠갑농산㈜ | 1 | 2.1%
진흥자동차공업사 | 1 | 2.1%
한스텍 | 1 | 2.1%
㈜뉴콘 | 1 | 2.1%
금강개발산업㈜ | 1 | 2.1%
㈜우리에프엔비 | 1 | 2.1%
청양군양돈액비유통센타영농조합법인 | 1 | 2.1%
농업회사법인㈜칠성바이오 | 1 | 2.1%
㈜삼진레미콘 | 1 | 2.1%
㈜하은산업 | 1 | 2.1%
Other values (38) | 38 | 79.2%

Most occurring characters

ValueCountFrequency (%)
34
 
10.8%
10
 
3.2%
10
 
3.2%
10
 
3.2%
8
 
2.5%
7
 
2.2%
7
 
2.2%
7
 
2.2%
6
 
1.9%
6
 
1.9%
Other values (112) 210
66.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 271
86.0%
Other Symbol 34
 
10.8%
Uppercase Letter 4
 
1.3%
Space Separator 3
 
1.0%
Decimal Number 2
 
0.6%
Other Punctuation 1
 
0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
10
 
3.7%
10
 
3.7%
10
 
3.7%
8
 
3.0%
7
 
2.6%
7
 
2.6%
7
 
2.6%
6
 
2.2%
6
 
2.2%
6
 
2.2%
Other values (105) 194
71.6%
Uppercase Letter
ValueCountFrequency (%)
S 2
50.0%
M 2
50.0%
Decimal Number
ValueCountFrequency (%)
2 1
50.0%
1 1
50.0%
Other Symbol
ValueCountFrequency (%)
34
100.0%
Space Separator
ValueCountFrequency (%)
3
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 305
96.8%
Common 6
 
1.9%
Latin 4
 
1.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
34
 
11.1%
10
 
3.3%
10
 
3.3%
10
 
3.3%
8
 
2.6%
7
 
2.3%
7
 
2.3%
7
 
2.3%
6
 
2.0%
6
 
2.0%
Other values (106) 200
65.6%
Common
ValueCountFrequency (%)
3
50.0%
2 1
 
16.7%
1 1
 
16.7%
. 1
 
16.7%
Latin
ValueCountFrequency (%)
S 2
50.0%
M 2
50.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 271
86.0%
None 34
 
10.8%
ASCII 10
 
3.2%

Most frequent character per block

None
ValueCountFrequency (%)
34
100.0%
Hangul
ValueCountFrequency (%)
10
 
3.7%
10
 
3.7%
10
 
3.7%
8
 
3.0%
7
 
2.6%
7
 
2.6%
7
 
2.6%
6
 
2.2%
6
 
2.2%
6
 
2.2%
Other values (105) 194
71.6%
ASCII
ValueCountFrequency (%)
3
30.0%
S 2
20.0%
M 2
20.0%
2 1
 
10.0%
1 1
 
10.0%
. 1
 
10.0%

업종
Text

Distinct: 43
Distinct (%): 95.6%
Missing: 0
Missing (%): 0.0%
Memory size: 492.0 B

Length

Max length: 22
Median length: 15
Mean length: 11
Min length: 3

Characters and Unicode

Total characters: 495
Distinct characters: 104
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 41
Unique (%): 91.1%

Sample

1st row: 음식료품제조시설
2nd row: 운수장비수선시설(자동차정비)
3rd row: 도정업
4th row: 도장시설
5th row: 도장(건조)시설

Value | Count | Frequency (%)
(value not captured) | 9 | 11.5%
기타화학물질제조업 | 2 | 2.6%
차체및특장차제조업 | 2 | 2.6%
1 | 2 | 2.6%
(value not captured) | 1 | 1.3%
전기발전업 | 1 | 1.3%
유제품제조업 | 1 | 1.3%
광물제동제조 | 1 | 1.3%
외기타비금속 | 1 | 1.3%
자동차정비업 | 1 | 1.3%
Other values (57) | 57 | 73.1%

Most occurring characters

ValueCountFrequency (%)
41
 
8.3%
33
 
6.7%
30
 
6.1%
28
 
5.7%
18
 
3.6%
17
 
3.4%
16
 
3.2%
16
 
3.2%
16
 
3.2%
14
 
2.8%
Other values (94) 266
53.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 443
89.5%
Space Separator 33
 
6.7%
Other Punctuation 8
 
1.6%
Open Punctuation 4
 
0.8%
Close Punctuation 4
 
0.8%
Decimal Number 3
 
0.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
41
 
9.3%
30
 
6.8%
28
 
6.3%
18
 
4.1%
17
 
3.8%
16
 
3.6%
16
 
3.6%
16
 
3.6%
14
 
3.2%
12
 
2.7%
Other values (88) 235
53.0%
Decimal Number
ValueCountFrequency (%)
1 2
66.7%
3 1
33.3%
Space Separator
ValueCountFrequency (%)
33
100.0%
Other Punctuation
ValueCountFrequency (%)
, 8
100.0%
Open Punctuation
ValueCountFrequency (%)
( 4
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 443
89.5%
Common 52
 
10.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
41
 
9.3%
30
 
6.8%
28
 
6.3%
18
 
4.1%
17
 
3.8%
16
 
3.6%
16
 
3.6%
16
 
3.6%
14
 
3.2%
12
 
2.7%
Other values (88) 235
53.0%
Common
ValueCountFrequency (%)
33
63.5%
, 8
 
15.4%
( 4
 
7.7%
) 4
 
7.7%
1 2
 
3.8%
3 1
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
Hangul 443
89.5%
ASCII 52
 
10.5%

Most frequent character per block

Hangul
ValueCountFrequency (%)
41
 
9.3%
30
 
6.8%
28
 
6.3%
18
 
4.1%
17
 
3.8%
16
 
3.6%
16
 
3.6%
16
 
3.6%
14
 
3.2%
12
 
2.7%
Other values (88) 235
53.0%
ASCII
ValueCountFrequency (%)
33
63.5%
, 8
 
15.4%
( 4
 
7.7%
) 4
 
7.7%
1 2
 
3.8%
3 1
 
1.9%

규모
Categorical

Distinct: 2
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Memory size: 492.0 B
5종: 27
4종: 18

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 5종
2nd row: 5종
3rd row: 5종
4th row: 5종
5th row: 4종

Common Values

Value | Count | Frequency (%)
5종 | 27 | 60.0%
4종 | 18 | 40.0%

Length

Histogram of lengths of the category


도로명주소
Text

MISSING 

Distinct: 42
Distinct (%): 97.7%
Missing: 2
Missing (%): 4.4%
Memory size: 492.0 B

Length

Max length: 25
Median length: 23
Mean length: 22.395349
Min length: 18

Characters and Unicode

Total characters: 963
Distinct characters: 65
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 41
Unique (%): 95.3%

Sample

1st row: 충청남도 청양군 청양읍 충절로 872-6
2nd row: 충청남도 청양군 청양읍 중앙로6길 1
3rd row: 충청남도 청양군 청양읍 칠갑산로 120
4th row: 충청남도 청양군 청양읍 칠갑산로 343
5th row: 충청남도 청양군 청양읍 충절로 1355-18

Value | Count | Frequency (%)
충청남도 | 43 | 20.0%
청양군 | 43 | 20.0%
비봉면 | 10 | 4.7%
운곡면 | 10 | 4.7%
청양읍 | 8 | 3.7%
화성면 | 6 | 2.8%
작은한술길 | 5 | 2.3%
정산면 | 5 | 2.3%
충절로 | 4 | 1.9%
록평용당로 | 4 | 1.9%
Other values (58) | 77 | 35.8%

Most occurring characters

ValueCountFrequency (%)
172
17.9%
101
 
10.5%
51
 
5.3%
50
 
5.2%
44
 
4.6%
43
 
4.5%
43
 
4.5%
35
 
3.6%
1 31
 
3.2%
- 28
 
2.9%
Other values (55) 365
37.9%

Most occurring categories

ValueCountFrequency (%)
Other Letter 592
61.5%
Space Separator 172
 
17.9%
Decimal Number 171
 
17.8%
Dash Punctuation 28
 
2.9%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
101
17.1%
51
 
8.6%
50
 
8.4%
44
 
7.4%
43
 
7.3%
43
 
7.3%
35
 
5.9%
26
 
4.4%
17
 
2.9%
11
 
1.9%
Other values (43) 171
28.9%
Decimal Number
ValueCountFrequency (%)
1 31
18.1%
5 24
14.0%
3 22
12.9%
7 19
11.1%
2 18
10.5%
4 17
9.9%
6 12
 
7.0%
8 12
 
7.0%
9 10
 
5.8%
0 6
 
3.5%
Space Separator
ValueCountFrequency (%)
172
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 28
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 592
61.5%
Common 371
38.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
101
17.1%
51
 
8.6%
50
 
8.4%
44
 
7.4%
43
 
7.3%
43
 
7.3%
35
 
5.9%
26
 
4.4%
17
 
2.9%
11
 
1.9%
Other values (43) 171
28.9%
Common
ValueCountFrequency (%)
172
46.4%
1 31
 
8.4%
- 28
 
7.5%
5 24
 
6.5%
3 22
 
5.9%
7 19
 
5.1%
2 18
 
4.9%
4 17
 
4.6%
6 12
 
3.2%
8 12
 
3.2%
Other values (2) 16
 
4.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 592
61.5%
ASCII 371
38.5%

Most frequent character per block

ASCII
ValueCountFrequency (%)
172
46.4%
1 31
 
8.4%
- 28
 
7.5%
5 24
 
6.5%
3 22
 
5.9%
7 19
 
5.1%
2 18
 
4.9%
4 17
 
4.6%
6 12
 
3.2%
8 12
 
3.2%
Other values (2) 16
 
4.3%
Hangul
ValueCountFrequency (%)
101
17.1%
51
 
8.6%
50
 
8.4%
44
 
7.4%
43
 
7.3%
43
 
7.3%
35
 
5.9%
26
 
4.4%
17
 
2.9%
11
 
1.9%
Other values (43) 171
28.9%

지번주소
Text

MISSING 

Distinct: 2
Distinct (%): 100.0%
Missing: 43
Missing (%): 95.6%
Memory size: 492.0 B

Length

Max length: 23
Median length: 22.5
Mean length: 22.5
Min length: 22

Characters and Unicode

Total characters: 45
Distinct characters: 24
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: 충청남도 청양군 목면 화양리 46, 48
2nd row: 충청남도 청양군 장평면 분향리 797-32

Value | Count | Frequency (%)
충청남도 | 2 | 18.2%
청양군 | 2 | 18.2%
목면 | 1 | 9.1%
화양리 | 1 | 9.1%
46 | 1 | 9.1%
48 | 1 | 9.1%
장평면 | 1 | 9.1%
분향리 | 1 | 9.1%
797-32 | 1 | 9.1%

Most occurring characters

ValueCountFrequency (%)
9
20.0%
4
 
8.9%
3
 
6.7%
2
 
4.4%
2
 
4.4%
2
 
4.4%
2
 
4.4%
2
 
4.4%
2
 
4.4%
4 2
 
4.4%
Other values (14) 15
33.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 25
55.6%
Space Separator 9
 
20.0%
Decimal Number 9
 
20.0%
Dash Punctuation 1
 
2.2%
Other Punctuation 1
 
2.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4
16.0%
3
12.0%
2
8.0%
2
8.0%
2
8.0%
2
8.0%
2
8.0%
2
8.0%
1
 
4.0%
1
 
4.0%
Other values (4) 4
16.0%
Decimal Number
ValueCountFrequency (%)
4 2
22.2%
7 2
22.2%
3 1
11.1%
9 1
11.1%
6 1
11.1%
8 1
11.1%
2 1
11.1%
Space Separator
ValueCountFrequency (%)
9
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%
Other Punctuation
ValueCountFrequency (%)
, 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 25
55.6%
Common 20
44.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4
16.0%
3
12.0%
2
8.0%
2
8.0%
2
8.0%
2
8.0%
2
8.0%
2
8.0%
1
 
4.0%
1
 
4.0%
Other values (4) 4
16.0%
Common
ValueCountFrequency (%)
9
45.0%
4 2
 
10.0%
7 2
 
10.0%
3 1
 
5.0%
- 1
 
5.0%
9 1
 
5.0%
6 1
 
5.0%
8 1
 
5.0%
, 1
 
5.0%
2 1
 
5.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 25
55.6%
ASCII 20
44.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9
45.0%
4 2
 
10.0%
7 2
 
10.0%
3 1
 
5.0%
- 1
 
5.0%
9 1
 
5.0%
6 1
 
5.0%
8 1
 
5.0%
, 1
 
5.0%
2 1
 
5.0%
Hangul
ValueCountFrequency (%)
4
16.0%
3
12.0%
2
8.0%
2
8.0%
2
8.0%
2
8.0%
2
8.0%
2
8.0%
1
 
4.0%
1
 
4.0%
Other values (4) 4
16.0%

연락처
Text

MISSING 

Distinct: 37
Distinct (%): 92.5%
Missing: 5
Missing (%): 11.1%
Memory size: 492.0 B

Length

Max length: 13
Median length: 12
Mean length: 12.025
Min length: 12

Characters and Unicode

Total characters: 481
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 34
Unique (%): 85.0%
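
The gap between Distinct (37) and Unique (34) is consistent with three phone numbers each appearing twice. The sketch below shows the distinction on a small subset, assuming "distinct" means different values and "unique" means values that occur exactly once. The list holds only the ten most frequent values from this report's frequency table, so its totals are smaller than the column's.

```python
from collections import Counter

# Subset of the 연락처 column: the ten most frequent values from the report,
# repeated according to their reported counts.
phone_numbers = [
    "041-943-7300", "041-943-7300",
    "041-942-8523", "041-942-8523",
    "041-940-5700", "041-940-5700",
    "041-942-0114", "041-943-7436", "041-943-4681", "041-942-7234",
    "041-943-3922", "041-943-9082", "041-943-2631",
]

counts = Counter(phone_numbers)

# Distinct: how many different values occur at all.
n_distinct = len(counts)
# Unique: how many values occur exactly once.
n_unique = sum(1 for c in counts.values() if c == 1)

print(n_distinct, n_unique)
```

Applied to the full column, the same logic gives 34 unique values plus 3 repeated ones, for 37 distinct values across 40 non-missing entries.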

Sample

1st row: 041-943-6670
2nd row: 041-943-2631
3rd row: 041-943-8040
4th row: 041-942-5700
5th row: 041-942-0067

Value | Count | Frequency (%)
041-943-7300 | 2 | 5.0%
041-942-8523 | 2 | 5.0%
041-940-5700 | 2 | 5.0%
041-942-0114 | 1 | 2.5%
041-943-7436 | 1 | 2.5%
041-943-4681 | 1 | 2.5%
041-942-7234 | 1 | 2.5%
041-943-3922 | 1 | 2.5%
041-943-9082 | 1 | 2.5%
041-943-2631 | 1 | 2.5%
Other values (27) | 27 | 67.5%

Most occurring characters

ValueCountFrequency (%)
4 101
21.0%
- 80
16.6%
0 76
15.8%
1 55
11.4%
9 48
10.0%
2 34
 
7.1%
3 33
 
6.9%
6 17
 
3.5%
7 16
 
3.3%
8 11
 
2.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 401
83.4%
Dash Punctuation 80
 
16.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 101
25.2%
0 76
19.0%
1 55
13.7%
9 48
12.0%
2 34
 
8.5%
3 33
 
8.2%
6 17
 
4.2%
7 16
 
4.0%
8 11
 
2.7%
5 10
 
2.5%
Dash Punctuation
ValueCountFrequency (%)
- 80
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 481
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 101
21.0%
- 80
16.6%
0 76
15.8%
1 55
11.4%
9 48
10.0%
2 34
 
7.1%
3 33
 
6.9%
6 17
 
3.5%
7 16
 
3.3%
8 11
 
2.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 481
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 101
21.0%
- 80
16.6%
0 76
15.8%
1 55
11.4%
9 48
10.0%
2 34
 
7.1%
3 33
 
6.9%
6 17
 
3.5%
7 16
 
3.3%
8 11
 
2.3%

Correlations

Variable | 상 호 | 업종 | 규모 | 도로명주소 | 지번주소 | 연락처
상 호 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 1.000
업종 | 1.000 | 1.000 | 1.000 | 0.985 | 0.000 | 0.951
규모 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000
도로명주소 | 1.000 | 0.985 | 0.000 | 1.000 | NaN | 1.000
지번주소 | 0.000 | 0.000 | 0.000 | NaN | 1.000 | NaN
연락처 | 1.000 | 0.951 | 0.000 | 1.000 | NaN | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
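
One way to read nullity correlation is as the Pearson correlation between per-column missingness indicators (1 if a value is missing, 0 otherwise). The sketch below is a minimal standard-library illustration on hypothetical toy records. The column names `road`, `lot`, and `phone` and all values are made up, and this is not necessarily how the report's heatmap was computed.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A column that is always present (or always missing) has no variance.
        return float("nan")
    return cov / sqrt(var_x * var_y)

# Hypothetical toy records; None marks a missing value.
rows = [
    {"road": "A", "lot": None, "phone": "1"},
    {"road": None, "lot": "B", "phone": None},
    {"road": "C", "lot": None, "phone": "3"},
    {"road": None, "lot": "D", "phone": "4"},
]

def missing_indicator(column):
    """Return 1 where the column is missing and 0 where it is present."""
    return [1 if row[column] is None else 0 for row in rows]

road_vs_lot = pearson(missing_indicator("road"), missing_indicator("lot"))
road_vs_phone = pearson(missing_indicator("road"), missing_indicator("phone"))

print(road_vs_lot, road_vs_phone)
```

In the toy data `road` and `lot` are never missing together and never present together, so their nullity correlation is -1. A value near 0 would mean the two columns go missing independently.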

Sample

Index | 상 호 | 업종 | 규모 | 도로명주소 | 지번주소 | 연락처
0 | 칠갑농산㈜ | 음식료품제조시설 | 5종 | 충청남도 청양군 청양읍 충절로 872-6 | <NA> | 041-943-6670
1 | 진흥자동차공업사 | 운수장비수선시설(자동차정비) | 5종 | 충청남도 청양군 청양읍 중앙로6길 1 | <NA> | 041-943-2631
2 | 영농조합법인청양골미곡종합처리장 | 도정업 | 5종 | 충청남도 청양군 청양읍 칠갑산로 120 | <NA> | 041-943-8040
3 | ㈜칠갑정비 | 도장시설 | 5종 | 충청남도 청양군 청양읍 칠갑산로 343 | <NA> | 041-942-5700
4 | 청양금호정비 | 도장(건조)시설 | 4종 | 충청남도 청양군 청양읍 충절로 1355-18 | <NA> | 041-942-0067
5 | SM인더스트리㈜오토모티브사업부 | 기타자동차부품제조시설 | 5종 | 충청남도 청양군 청양읍 충절로 1259-130 | <NA> | 041-940-5700
6 | 청양군청 | 공통시설(보일러) | 5종 | 충청남도 청양군 청양읍 문화예술로 222 | <NA> | 041-940-2646
7 | SM케미칼㈜ | 합성수지 및 기타플라스틱물질제조업외 1 | 4종 | 충청남도 청양군 청양읍 충절로 1259-130 | <NA> | 041-940-5700
8 | ㈜디.아이청양지점 | 그외기타비금속광물제품제조업 | 5종 | 충청남도 청양군 운곡면 신대길 379 | <NA> | 041-944-0656
9 | 에이씨엠텍㈜ | 기타화학제품제조시설 | 5종 | 충청남도 청양군 운곡면 중묵운곡로 398-27 | <NA> | 041-943-1917

Index | 상 호 | 업종 | 규모 | 도로명주소 | 지번주소 | 연락처
35 | 제일레미콘㈜ | 레미콘제조시설 | 4종 | 충청남도 청양군 비봉면 록평용당로 575-20 | <NA> | 041-943-7300
36 | ㈜삼화그린텍 | 단백질 및 배합사료 제조 | 4종 | 충청남도 청양군 비봉면 록평용당로 347 | <NA> | 041-942-9624
37 | 제일아스콘㈜ | 아스콘제조업 | 4종 | 충청남도 청양군 비봉면 록평용당로 575-19 | <NA> | 041-943-7300
38 | ㈜대경에너텍 | 절삭가공및유사처리업(탈사시설) | 4종 | 충청남도 청양군 비봉면 작은한술길 48-16 | <NA> | 041-943-9082
39 | ㈜충청콘크리트 | 시멘트,석회,플라스틱 및그제품제조시설 | 5종 | 충청남도 청양군 비봉면 작은한술길 48-30 | <NA> | 041-943-3922
40 | ㈜진에너텍1공장 | 라이터연소물및흡연용품제조업 | 5종 | 충청남도 청양군 비봉면 작은한술길 48-39 | <NA> | 041-942-7234
41 | 화성농업협동조합 | 곡물도정업 | 5종 | 충청남도 청양군 비봉면 배암실길 17 | <NA> | 041-943-4681
42 | ㈜보민환경 | 건설폐기물중간처리업 | 5종 | 충청남도 청양군 비봉면 록평용당로 656-32 | <NA> | 041-943-7436
43 | ㈜수이노베이션 | 라이터, 연소물 및 흡연용품 제조업외 1 | 4종 | 충청남도 청양군 비봉면 작은한술길 48-27 | <NA> | 041-942-0114
44 | 케이씨그린에너지㈜ | 지정 외 폐기물처리업 | 4종 | 충청남도 청양군 비봉면 작은한술길 48-44 | <NA> | 041-943-0097