Overview

Dataset statistics

Number of variables: 9
Number of observations: 124
Missing cells: 14
Missing cells (%): 1.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 8.8 KiB
Average record size in memory: 73.1 B
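The missing-cell percentage follows from the counts above: the table has 9 × 124 cells, of which 14 are missing. A minimal sketch of that arithmetic (the variable names are illustrative and do not come from the report):

```python
# Figures taken from the "Dataset statistics" block of this report.
n_variables = 9
n_observations = 124
missing_cells = 14

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```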

Variable types

Text: 3
Categorical: 5
DateTime: 1

Dataset

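Description: Status data on air-emission business sites located in Seocheon-gun (서천군), providing the company name (업체명), location (소재지), phone number (전화번호), industry classification (분류업종), facility class (종별), and related fields.
Author: Chungcheongnam-do (충청남도)
URL: https://alldam.chungnam.go.kr/index.chungnam?menuCd=DOM_000000201001001001&st=&cds=&orgCd=&apiType=&isOpen=Y&pageIndex=337&beforeMenuCd=DOM_000000201001001000&publicdatapk=15083946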

Alerts

데이터기준일 has constant value "" [Constant]
구분 is highly overall correlated with 분류업종 and 2 other fields [High correlation]
공단명 is highly overall correlated with 구분 and 1 other fields [High correlation]
종별 is highly overall correlated with 분류업종 and 1 other fields [High correlation]
비고(휴업 등) is highly overall correlated with 공단명 and 2 other fields [High correlation]
분류업종 is highly overall correlated with 종별 and 1 other fields [High correlation]
비고(휴업 등) is highly imbalanced (50.1%) [Imbalance]
전화번호(지역번호 041) has 14 (11.3%) missing values [Missing]
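The 50.1% imbalance figure for 비고(휴업 등) can be reproduced from the value counts reported for that variable (103, 16 and 5). The sketch below assumes the score is one minus the normalized Shannon entropy of the class frequencies. That formula is an assumption about how the tool scores imbalance, and `imbalance_score` is a hypothetical helper name, not the library's API.

```python
from math import log2

def imbalance_score(counts):
    """Return 1 - normalized Shannon entropy (assumed definition).

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # a single class has no entropy to normalize
    entropy = -sum((c / total) * log2(c / total) for c in counts if c > 0)
    return 1 - entropy / log2(n_classes)

# Counts of <NA>, 휴업 and 가동개시 전 from the Common Values table of 비고(휴업 등).
score = imbalance_score([103, 16, 5])
print(f"{score:.1%}")
```

Under this assumption the result agrees with the alert above when rounded to one decimal place.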

Reproduction

Analysis started: 2024-01-09 22:26:41.012476
Analysis finished: 2024-01-09 22:26:41.804710
Duration: 0.79 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
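The reported duration is simply the difference between the two timestamps. A small sketch using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2024-01-09 22:26:41.012476")
finished = datetime.fromisoformat("2024-01-09 22:26:41.804710")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```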

Variables

업체명
Text

Distinct: 123
Distinct (%): 99.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 20
Median length: 15
Mean length: 7.6290323
Min length: 3

Characters and Unicode

Total characters: 946
Distinct characters: 217
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 122
Unique (%): 98.4%
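"Distinct" and "Unique" measure different things: distinct counts the different values present, while unique counts the values that occur exactly once. With 124 observations, 123 distinct values and 122 unique values, exactly one value must appear twice. A minimal sketch of the distinction on toy data (the helper name and the sample list are hypothetical):

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Toy data: "B" repeats, so it is distinct but not unique.
names = ["A", "B", "B", "C"]
print(distinct_and_unique(names))
```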

Sample

1st row: 동진자동차공업사
2nd row: 동진차유리
3rd row: 주식회사 삼남
4th row: 장미세차장
5th row: 계동산업

Value | Count | Frequency (%)
주식회사 | 7 | 4.7%
충청남도 | 3 | 2.0%
㈜우양 | 2 | 1.4%
서천공장 | 2 | 1.4%
2공장 | 1 | 0.7%
㈜더존날 | 1 | 0.7%
㈜기업과사람들 | 1 | 0.7%
서천한산식품 | 1 | 0.7%
그린세차장 | 1 | 0.7%
서천소방서 | 1 | 0.7%
Other values (128) | 128 | 86.5%

Most occurring characters

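Value | Count | Frequency (%)
(glyph lost) | 35 | 3.7%
(glyph lost) | 33 | 3.5%
(glyph lost) | 28 | 3.0%
(glyph lost) | 25 | 2.6%
(glyph lost) | 25 | 2.6%
(glyph lost) | 25 | 2.6%
(glyph lost) | 23 | 2.4%
(glyph lost) | 22 | 2.3%
(glyph lost) | 21 | 2.2%
(glyph lost) | 16 | 1.7%
Other values (207) | 693 | 73.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 847 | 89.5%
Other Symbol | 35 | 3.7%
Space Separator | 25 | 2.6%
Open Punctuation | 14 | 1.5%
Close Punctuation | 14 | 1.5%
Decimal Number | 9 | 1.0%
Uppercase Letter | 2 | 0.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(glyph lost) | 33 | 3.9%
(glyph lost) | 28 | 3.3%
(glyph lost) | 25 | 3.0%
(glyph lost) | 25 | 3.0%
(glyph lost) | 23 | 2.7%
(glyph lost) | 22 | 2.6%
(glyph lost) | 21 | 2.5%
(glyph lost) | 16 | 1.9%
(glyph lost) | 16 | 1.9%
(glyph lost) | 15 | 1.8%
Other values (195) | 623 | 73.6%
Decimal Number
Value | Count | Frequency (%)
2 | 3 | 33.3%
1 | 2 | 22.2%
6 | 1 | 11.1%
3 | 1 | 11.1%
8 | 1 | 11.1%
4 | 1 | 11.1%
Uppercase Letter
Value | Count | Frequency (%)
K | 1 | 50.0%
S | 1 | 50.0%
Other Symbol
Value | Count | Frequency (%)
(glyph lost) | 35 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 25 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 14 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 14 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 882 | 93.2%
Common | 62 | 6.6%
Latin | 2 | 0.2%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(glyph lost) | 35 | 4.0%
(glyph lost) | 33 | 3.7%
(glyph lost) | 28 | 3.2%
(glyph lost) | 25 | 2.8%
(glyph lost) | 25 | 2.8%
(glyph lost) | 23 | 2.6%
(glyph lost) | 22 | 2.5%
(glyph lost) | 21 | 2.4%
(glyph lost) | 16 | 1.8%
(glyph lost) | 16 | 1.8%
Other values (196) | 638 | 72.3%
Common
Value | Count | Frequency (%)
(space) | 25 | 40.3%
( | 14 | 22.6%
) | 14 | 22.6%
2 | 3 | 4.8%
1 | 2 | 3.2%
6 | 1 | 1.6%
3 | 1 | 1.6%
8 | 1 | 1.6%
4 | 1 | 1.6%
Latin
Value | Count | Frequency (%)
K | 1 | 50.0%
S | 1 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 847 | 89.5%
ASCII | 64 | 6.8%
None | 35 | 3.7%

Most frequent character per block

None
Value | Count | Frequency (%)
(glyph lost) | 35 | 100.0%
Hangul
Value | Count | Frequency (%)
(glyph lost) | 33 | 3.9%
(glyph lost) | 28 | 3.3%
(glyph lost) | 25 | 3.0%
(glyph lost) | 25 | 3.0%
(glyph lost) | 23 | 2.7%
(glyph lost) | 22 | 2.6%
(glyph lost) | 21 | 2.5%
(glyph lost) | 16 | 1.9%
(glyph lost) | 16 | 1.9%
(glyph lost) | 15 | 1.8%
Other values (195) | 623 | 73.6%
ASCII
Value | Count | Frequency (%)
(space) | 25 | 39.1%
( | 14 | 21.9%
) | 14 | 21.9%
2 | 3 | 4.7%
1 | 2 | 3.1%
6 | 1 | 1.6%
3 | 1 | 1.6%
8 | 1 | 1.6%
4 | 1 | 1.6%
K | 1 | 1.6%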

소재지
Text

Distinct: 120
Distinct (%): 96.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 28
Median length: 27
Mean length: 22.266129
Min length: 18

Characters and Unicode

Total characters: 2761
Distinct characters: 85
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 116
Unique (%): 93.5%

Sample

1st row: 충청남도 서천군 마서면 송내리 498-14,6
2nd row: 충청남도 서천군 마서면 송내리 498-14,6
3rd row: 충청남도 서천군 판교면 복대리 19-4외3필지
4th row: 충청남도 서천군 서천읍 군사리 171-1
5th row: 충청남도 서천군 서면 신합리 327-1, 311-3

Value | Count | Frequency (%)
서천군 | 130 | 20.6%
충청남도 | 124 | 19.7%
장항읍 | 39 | 6.2%
종천면 | 23 | 3.7%
서천읍 | 14 | 2.2%
원수리 | 13 | 2.1%
마서면 | 12 | 1.9%
서면 | 11 | 1.7%
석촌리 | 11 | 1.7%
군사리 | 6 | 1.0%
Other values (190) | 247 | 39.2%

Most occurring characters

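Value | Count | Frequency (%)
(space) | 506 | 18.3%
(glyph lost) | 178 | 6.4%
(glyph lost) | 173 | 6.3%
(glyph lost) | 137 | 5.0%
(glyph lost) | 131 | 4.7%
(glyph lost) | 128 | 4.6%
(glyph lost) | 127 | 4.6%
(glyph lost) | 124 | 4.5%
(glyph lost) | 107 | 3.9%
1 | 99 | 3.6%
Other values (75) | 1051 | 38.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1663 | 60.2%
Space Separator | 506 | 18.3%
Decimal Number | 492 | 17.8%
Dash Punctuation | 90 | 3.3%
Other Punctuation | 10 | 0.4%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(glyph lost) | 178 | 10.7%
(glyph lost) | 173 | 10.4%
(glyph lost) | 137 | 8.2%
(glyph lost) | 131 | 7.9%
(glyph lost) | 128 | 7.7%
(glyph lost) | 127 | 7.6%
(glyph lost) | 124 | 7.5%
(glyph lost) | 107 | 6.4%
(glyph lost) | 70 | 4.2%
(glyph lost) | 53 | 3.2%
Other values (62) | 435 | 26.2%
Decimal Number
Value | Count | Frequency (%)
1 | 99 | 20.1%
3 | 64 | 13.0%
2 | 55 | 11.2%
4 | 52 | 10.6%
5 | 49 | 10.0%
9 | 41 | 8.3%
6 | 38 | 7.7%
7 | 37 | 7.5%
8 | 33 | 6.7%
0 | 24 | 4.9%
Space Separator
Value | Count | Frequency (%)
(space) | 506 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 90 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
, | 10 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1663 | 60.2%
Common | 1098 | 39.8%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(glyph lost) | 178 | 10.7%
(glyph lost) | 173 | 10.4%
(glyph lost) | 137 | 8.2%
(glyph lost) | 131 | 7.9%
(glyph lost) | 128 | 7.7%
(glyph lost) | 127 | 7.6%
(glyph lost) | 124 | 7.5%
(glyph lost) | 107 | 6.4%
(glyph lost) | 70 | 4.2%
(glyph lost) | 53 | 3.2%
Other values (62) | 435 | 26.2%
Common
Value | Count | Frequency (%)
(space) | 506 | 46.1%
1 | 99 | 9.0%
- | 90 | 8.2%
3 | 64 | 5.8%
2 | 55 | 5.0%
4 | 52 | 4.7%
5 | 49 | 4.5%
9 | 41 | 3.7%
6 | 38 | 3.5%
7 | 37 | 3.4%
Other values (3) | 67 | 6.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1663 | 60.2%
ASCII | 1098 | 39.8%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 506 | 46.1%
1 | 99 | 9.0%
- | 90 | 8.2%
3 | 64 | 5.8%
2 | 55 | 5.0%
4 | 52 | 4.7%
5 | 49 | 4.5%
9 | 41 | 3.7%
6 | 38 | 3.5%
7 | 37 | 3.4%
Other values (3) | 67 | 6.1%
Hangul
Value | Count | Frequency (%)
(glyph lost) | 178 | 10.7%
(glyph lost) | 173 | 10.4%
(glyph lost) | 137 | 8.2%
(glyph lost) | 131 | 7.9%
(glyph lost) | 128 | 7.7%
(glyph lost) | 127 | 7.6%
(glyph lost) | 124 | 7.5%
(glyph lost) | 107 | 6.4%
(glyph lost) | 70 | 4.2%
(glyph lost) | 53 | 3.2%
Other values (62) | 435 | 26.2%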

전화번호(지역번호 041)
Text

Distinct: 103
Distinct (%): 93.6%
Missing: 14
Missing (%): 11.3%
Memory size: 1.1 KiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 880
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 97
Unique (%): 88.2%

Sample

1st row: 956-0472
2nd row: 956-1618
3rd row: 951-9563
4th row: 951-9000
5th row: 952-6070

Value | Count | Frequency (%)
956-7171 | 3 | 2.7%
953-8227 | 2 | 1.8%
956-8808 | 2 | 1.8%
953-2327 | 2 | 1.8%
956-3711 | 2 | 1.8%
952-6119 | 2 | 1.8%
957-0535 | 1 | 0.9%
951-8008 | 1 | 0.9%
953-8811 | 1 | 0.9%
956-8736 | 1 | 0.9%
Other values (93) | 93 | 84.5%

Most occurring characters

Value | Count | Frequency (%)
5 | 146 | 16.6%
9 | 138 | 15.7%
- | 110 | 12.5%
1 | 89 | 10.1%
6 | 69 | 7.8%
3 | 66 | 7.5%
0 | 64 | 7.3%
8 | 62 | 7.0%
2 | 61 | 6.9%
7 | 45 | 5.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 770 | 87.5%
Dash Punctuation | 110 | 12.5%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
5 | 146 | 19.0%
9 | 138 | 17.9%
1 | 89 | 11.6%
6 | 69 | 9.0%
3 | 66 | 8.6%
0 | 64 | 8.3%
8 | 62 | 8.1%
2 | 61 | 7.9%
7 | 45 | 5.8%
4 | 30 | 3.9%
Dash Punctuation
Value | Count | Frequency (%)
- | 110 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 880 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
5 | 146 | 16.6%
9 | 138 | 15.7%
- | 110 | 12.5%
1 | 89 | 10.1%
6 | 69 | 7.8%
3 | 66 | 7.5%
0 | 64 | 7.3%
8 | 62 | 7.0%
2 | 61 | 6.9%
7 | 45 | 5.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 880 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
5 | 146 | 16.6%
9 | 138 | 15.7%
- | 110 | 12.5%
1 | 89 | 10.1%
6 | 69 | 7.8%
3 | 66 | 7.5%
0 | 64 | 7.3%
8 | 62 | 7.0%
2 | 61 | 6.9%
7 | 45 | 5.1%

분류업종
Categorical

HIGH CORRELATION 

Distinct: 41
Distinct (%): 33.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 17
Median length: 15
Mean length: 6.2983871
Min length: 3

Unique

Unique: 30
Unique (%): 24.2%

Sample

1st row: 자동차정비업
2nd row: 자동차세차업
3rd row: 아스콘제조업
4th row: 자동차세차업
5th row: 규사코팅업

Common Values

Value | Count | Frequency (%)
식품제조업 | 36 | 29.0%
자동차세차업 | 25 | 20.2%
자동차정비업 | 6 | 4.8%
비금속광물제조업 | 6 | 4.8%
레미콘제조업 | 5 | 4.0%
소방행정 | 3 | 2.4%
그외 기타 플라스틱 제품 제조업 | 3 | 2.4%
도정업 | 3 | 2.4%
아스콘제조업 | 3 | 2.4%
금속제품제조 | 2 | 1.6%
Other values (31) | 32 | 25.8%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
식품제조업 | 36 | 25.5%
자동차세차업 | 25 | 17.7%
자동차정비업 | 6 | 4.3%
비금속광물제조업 | 6 | 4.3%
레미콘제조업 | 5 | 3.5%
기타 | 4 | 2.8%
제조업 | 4 | 2.8%
제품 | 3 | 2.1%
아스콘제조업 | 3 | 2.1%
도정업 | 3 | 2.1%
Other values (38) | 46 | 32.6%

공단명
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 4.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 10
Median length: 4
Mean length: 4.9919355
Min length: 4
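The length statistics for 공단명 can be reproduced from the value counts listed under Common Values for this variable, provided that `<NA>` is counted as a literal four-character string. That would also explain why the variable reports 0 missing values, although this reading is an inference from the numbers and not something the report states. A minimal sketch:

```python
# Value counts copied from the Common Values table of 공단명.
counts = {
    "<NA>": 91,
    "종천1농공단지": 14,
    "장항원수농공단지": 13,
    "장항국가생태산업단지": 3,
    "종천2농공단지": 2,
    "서면김가공특화단지": 1,
}

total = sum(counts.values())

# Weighted mean of string lengths, counting "<NA>" as an ordinary 4-character value.
mean_len = sum(len(value) * n for value, n in counts.items()) / total
max_len = max(len(value) for value in counts)
min_len = min(len(value) for value in counts)

print(total, round(mean_len, 7), max_len, min_len)
```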

Unique

Unique: 1
Unique (%): 0.8%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 91 | 73.4%
종천1농공단지 | 14 | 11.3%
장항원수농공단지 | 13 | 10.5%
장항국가생태산업단지 | 3 | 2.4%
종천2농공단지 | 2 | 1.6%
서면김가공특화단지 | 1 | 0.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 91 | 73.4%
종천1농공단지 | 14 | 11.3%
장항원수농공단지 | 13 | 10.5%
장항국가생태산업단지 | 3 | 2.4%
종천2농공단지 | 2 | 1.6%
서면김가공특화단지 | 1 | 0.8%

종별
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 4
Median length: 1
Mean length: 2.2822581
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4
2nd row: <NA>
3rd row: 3
4th row: <NA>
5th row: 3

Common Values

Value | Count | Frequency (%)
<NA> | 53 | 42.7%
5 | 27 | 21.8%
4 | 24 | 19.4%
(blank) | 12 | 9.7%
3 | 8 | 6.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 53 | 47.3%
5 | 27 | 24.1%
4 | 24 | 21.4%
3 | 8 | 7.1%

구분
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 4
Median length: 4
Mean length: 3.0483871
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 신고
2nd row: <NA>
3rd row: 신고
4th row: <NA>
5th row: 신고

Common Values

Value | Count | Frequency (%)
<NA> | 65 | 52.4%
신고 | 56 | 45.2%
허가 | 3 | 2.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 65 | 52.4%
신고 | 56 | 45.2%
허가 | 3 | 2.4%

비고(휴업 등)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Length

Max length: 6
Median length: 4
Mean length: 3.8225806
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 103 | 83.1%
휴업 | 16 | 12.9%
가동개시 전 | 5 | 4.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 103 | 79.8%
휴업 | 16 | 12.4%
가동개시 | 5 | 3.9%
전 | 5 | 3.9%

데이터기준일
Date

CONSTANT 

Distinct: 1
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB
Minimum: 2021-07-26 00:00:00
Maximum: 2021-07-26 00:00:00
Histogram with fixed size bins (bins=1)

Correlations

Variable | 분류업종 | 공단명 | 종별 | 구분 | 비고(휴업 등)
분류업종 | 1.000 | 0.541 | 0.934 | 1.000 | 0.000
공단명 | 0.541 | 1.000 | 0.000 | 0.804 | 1.000
종별 | 0.934 | 0.000 | 1.000 | 0.199 | NaN
구분 | 1.000 | 0.804 | 0.199 | 1.000 | NaN
비고(휴업 등) | 0.000 | 1.000 | NaN | NaN | 1.000

Variable | 구분 | 공단명 | 종별 | 분류업종 | 비고(휴업 등)
구분 | 1.000 | 0.559 | 0.322 | 0.701 | 1.000
공단명 | 0.559 | 1.000 | 0.000 | 0.237 | 1.000
종별 | 0.322 | 0.000 | 1.000 | 0.583 | 1.000
분류업종 | 0.701 | 0.237 | 0.583 | 1.000 | 0.000
비고(휴업 등) | 1.000 | 1.000 | 1.000 | 0.000 | 1.000

Variable | 분류업종 | 공단명 | 종별 | 구분 | 비고(휴업 등)
분류업종 | 1.000 | 0.237 | 0.583 | 0.701 | 0.000
공단명 | 0.237 | 1.000 | 0.000 | 0.559 | 1.000
종별 | 0.583 | 0.000 | 1.000 | 0.322 | 1.000
구분 | 0.701 | 0.559 | 0.322 | 1.000 | 1.000
비고(휴업 등) | 0.000 | 1.000 | 1.000 | 1.000 | 1.000
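
The report does not say which coefficient each matrix uses. For pairs of categorical columns such as these, one common association measure is Cramér's V, which ranges from 0 (no association) to 1 (one column determines the other). The sketch below is the plain, uncorrected textbook form on toy data. It is an illustration under that assumption, not a reconstruction of the values above, and `cramers_v` is a hypothetical helper name.

```python
from collections import Counter
from math import sqrt

def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equal-length sequences of category labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed counts per (x, y) cell
    row = Counter(xs)              # marginal counts for xs
    col = Counter(ys)              # marginal counts for ys

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, rx in row.items():
        for y, cy in col.items():
            expected = rx * cy / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row), len(col)) - 1
    if k == 0:
        return float("nan")  # undefined when either column is constant
    return sqrt(chi2 / (n * k))

# Toy data: the first pair is perfectly associated, the second is independent.
perfect = cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"])
independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(perfect, independent)
```

With only 124 rows and many sparse category combinations, coefficients of exactly 1.000 or 0.000 are best treated as indicative, not conclusive.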

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 업체명 | 소재지 | 전화번호(지역번호 041) | 분류업종 | 공단명 | 종별 | 구분 | 비고(휴업 등) | 데이터기준일
0 | 동진자동차공업사 | 충청남도 서천군 마서면 송내리 498-14,6 | 956-0472 | 자동차정비업 | <NA> | 4 | 신고 | <NA> | 2021-07-26
1 | 동진차유리 | 충청남도 서천군 마서면 송내리 498-14,6 | 956-1618 | 자동차세차업 | <NA> | <NA> | <NA> | <NA> | 2021-07-26
2 | 주식회사 삼남 | 충청남도 서천군 판교면 복대리 19-4외3필지 | 951-9563 | 아스콘제조업 | <NA> | 3 | 신고 | <NA> | 2021-07-26
3 | 장미세차장 | 충청남도 서천군 서천읍 군사리 171-1 | <NA> | 자동차세차업 | <NA> | <NA> | <NA> | <NA> | 2021-07-26
4 | 계동산업 | 충청남도 서천군 서면 신합리 327-1, 311-3 | 951-9000 | 규사코팅업 | <NA> | 3 | 신고 | <NA> | 2021-07-26
5 | ㈜모헨즈서천공장 | 충청남도 서천군 비인면 장포리 6-3 | 952-6070 | 레미콘제조업 | <NA> | 4 | 신고 | <NA> | 2021-07-26
6 | 대진레미콘 | 충청남도 서천군 종천면 당정리 64-2 | 953-2580 | 레미콘제조업 | <NA> | 4 | 신고 | <NA> | 2021-07-26
7 | 공단주유소 | 충청남도 서천군 장항읍 신창리 303-15 | 956-0777 | 자동차세차업 | <NA> | (blank) | <NA> | <NA> | 2021-07-26
8 | 서해양만 | 충청남도 서천군 마서면 남전리 360-1 | 950-6000 | 양식업 | <NA> | 5 | 신고 | 휴업 | 2021-07-26
9 | 서광식품 | 충청남도 서천군 장항읍 신창리 164-1 | 956-0873 | 식품제조업 | <NA> | (blank) | <NA> | 휴업 | 2021-07-26

Index | 업체명 | 소재지 | 전화번호(지역번호 041) | 분류업종 | 공단명 | 종별 | 구분 | 비고(휴업 등) | 데이터기준일
114 | 국립해양생물자원관 | 충청남도 서천군 장항읍 송림리 913 | <NA> | 해양연구업 | <NA> | 4 | 신고 | <NA> | 2021-07-26
115 | ㈜일아그린 | 충청남도 서천군 종천면 충서로 513-90 | 953-8227 | 폐기물종합재활용업 | <NA> | 5 | 신고 | <NA> | 2021-07-26
116 | ㈜허스델리 | 충청남도 서천군 장항읍 장항산단23길5 | 956-8118 | 식품제조업 | 장항국가생태산업단지 | 4 | 신고 | <NA> | 2021-07-26
117 | 한산농협주유소 | 충청남도 서천군 서천군 기산면 충절로 966 | 951-9510 | 자동차세차업 | <NA> | <NA> | <NA> | <NA> | 2021-07-26
118 | 충청남도서천교육지원청(서천학생수영장) | 충청남도 서천군 서천군 장항읍 성주길 6-21 | 956-1491 | 교육행정 | <NA> | 5 | 신고 | <NA> | 2021-07-26
119 | ㈜우양 | 충청남도 서천군 서천군 장항읍 옥남리 1040 | 956-7171 | 식품제조업 | 장항국가생태산업단지 | <NA> | <NA> | 가동개시 전 | 2021-07-26
120 | 충남마른김가공수산업협동조합 | 충청남도 서천군 서천군 서면 월리 423 | 953-9811 | 식품제조업 | 서면김가공특화단지 | <NA> | <NA> | 가동개시 전 | 2021-07-26
121 | ㈜티엠 | 충청남도 서천군 서천군 종천공단길 14번길 13 | 973-9920 | 그외 기타 플라스틱 제품 제조업 | 종천1농공단지 | 5 | 신고 | <NA> | 2021-07-26
122 | ㈜ 일광폴리머 서천2공장 | 충청남도 서천군 종천면 종천공단길62번길 21 | 953-2327 | 도장 및 기타 피막처리업 | 종천2농공단지 | 5 | 신고 | <NA> | 2021-07-26
123 | 엘에스메탈㈜ | 충청남도 서천군 장항읍 화송길 123 | 955-3114 | 금속제품제조가공업 | <NA> | 3 | 허가 | <NA> | 2021-07-26