Overview

Dataset statistics

Number of variables: 8
Number of observations: 4014
Missing cells: 983
Missing cells (%): 3.1%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 251.0 KiB
Average record size in memory: 64.0 B
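
The derived figures in this block can be checked from the raw counts. The sketch below is a minimal check, assuming the missing percentage is taken over all cells (variables × observations) and the average record size is total memory divided by the number of observations. Both definitions are assumptions that happen to agree with the reported values; they are not taken from the tool's source.

```python
# Raw counts copied from the Dataset statistics block.
n_variables = 8
n_observations = 4014
missing_cells = 983
duplicate_rows = 1
total_memory_kib = 251.0

# Assumption: missing % is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: duplicate % is computed over the number of observations.
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: average record size is total memory divided by observations.
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"duplicate rows: {duplicate_pct:.3f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```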

Variable types

Categorical: 3
Text: 5

Dataset

Description: Status data on Cheonan City manufacturers (industry, company name, main products, contact details); it covers manufacturers registered as factories in Cheonan City.
Author: Chungcheongnam-do (충청남도)
URL: https://alldam.chungnam.go.kr/index.chungnam?menuCd=DOM_000000201001001001&st=&cds=&orgCd=&apiType=&isOpen=Y&pageIndex=415&beforeMenuCd=DOM_000000201001001000&publicdatapk=15005040

Alerts

Duplicates: Dataset has 1 (< 0.1%) duplicate row
High correlation: 공장구분 is highly overall correlated with 시도
High correlation: 시도 is highly overall correlated with 시군구 and 1 other field
High correlation: 시군구 is highly overall correlated with 시도
Imbalance: 시도 is highly imbalanced (95.9%)
Imbalance: 공장구분 is highly imbalanced (66.5%)
Missing: 전화번호 has 747 (18.6%) missing values
Missing: 공장대표주소(도로명) has 217 (5.4%) missing values
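
The two imbalance percentages can be reproduced from the category counts listed under each variable. The sketch below assumes the score is one minus the Shannon entropy of the class distribution divided by its maximum possible value. That definition is an assumption here: it matches the 95.9% and 66.5% reported above, but it is not quoted from the tool. The function name `imbalance_score` is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        # A single class has no distribution to compare against.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)


# Counts taken from the 시도 and 공장구분 sections of this report.
sido_score = imbalance_score([3996, 18])
factory_type_score = imbalance_score([3765, 249])

print(f"시도: {sido_score:.1%}")
print(f"공장구분: {factory_type_score:.1%}")
```

A score near 100% means almost every row falls into one category, which is why 시도 carries little information for this dataset.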

Reproduction

Analysis started: 2024-01-09 21:20:43.093271
Analysis finished: 2024-01-09 21:20:44.410928
Duration: 1.32 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
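
A report of this kind can be regenerated with the library named above. The sketch below is a minimal outline, not the configuration actually used for this report (that is in config.json). The function name `build_profile`, the report title, the file paths, and the choice to read every column as text are all assumptions made for illustration.

```python
def build_profile(csv_path, html_path, encoding="utf-8"):
    """Profile a CSV file and write the HTML report to html_path."""
    # Imported lazily so this module loads even without the third-party packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: read every column as text, since the report above
    # lists only Categorical and Text variables.
    df = pd.read_csv(csv_path, dtype=str, encoding=encoding)

    # The title is a placeholder chosen for this sketch.
    profile = ProfileReport(df, title="Cheonan manufacturers")
    profile.to_file(html_path)
    return profile
```

A call might look like build_profile("manufacturers.csv", "report.html", encoding="cp949"), where both file names and the cp949 encoding are hypothetical; use whatever encoding the downloaded file actually has.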

Variables

시도
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 31.5 KiB
충청남도: 3996
<NA>: 18

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 충청남도
2nd row: 충청남도
3rd row: 충청남도
4th row: 충청남도
5th row: 충청남도

Common Values

Value | Count | Frequency (%)
충청남도 | 3996 | 99.6%
<NA> | 18 | 0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
충청남도 | 3996 | 99.6%
na | 18 | 0.4%

시군구
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 31.5 KiB
천안시 서북구: 2593
천안시 동남구: 1403
<NA>: 18

Length

Max length: 7
Median length: 7
Mean length: 6.9865471
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 천안시 서북구
2nd row: 천안시 서북구
3rd row: 천안시 동남구
4th row: 천안시 서북구
5th row: 천안시 동남구

Common Values

Value | Count | Frequency (%)
천안시 서북구 | 2593 | 64.6%
천안시 동남구 | 1403 | 35.0%
<NA> | 18 | 0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
천안시 | 3996 | 49.9%
서북구 | 2593 | 32.4%
동남구 | 1403 | 17.5%
na | 18 | 0.2%

회사명
Text

Distinct: 3756
Distinct (%): 93.6%
Missing: 0
Missing (%): 0.0%
Memory size: 31.5 KiB

Length

Max length: 28
Median length: 23
Mean length: 7.6763827
Min length: 2

Characters and Unicode

Total characters: 30813
Distinct characters: 636
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3523
Unique (%): 87.8%

Sample

1st row: 수덕산업
2nd row: 플렉스폼코리아(유)천안2공장(성거)
3rd row: (사)두리장애인복지회 두리다담사업단
4th row: (사)우리들행복나눔장애인복지회(화장지사업단)
5th row: (사)한마음장애인복지회
ValueCountFrequency (%)
주식회사 541
 
11.2%
농업회사법인 41
 
0.8%
제2공장 35
 
0.7%
천안공장 23
 
0.5%
14
 
0.3%
천안지점 12
 
0.2%
2공장 11
 
0.2%
제3공장 8
 
0.2%
제1공장 6
 
0.1%
천안 6
 
0.1%
Other values (3775) 4144
85.6%

Most occurring characters

ValueCountFrequency (%)
2891
 
9.4%
( 2319
 
7.5%
) 2319
 
7.5%
1045
 
3.4%
865
 
2.8%
856
 
2.8%
771
 
2.5%
702
 
2.3%
649
 
2.1%
644
 
2.1%
Other values (626) 17752
57.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 24893
80.8%
Open Punctuation 2320
 
7.5%
Close Punctuation 2320
 
7.5%
Space Separator 856
 
2.8%
Uppercase Letter 196
 
0.6%
Decimal Number 130
 
0.4%
Lowercase Letter 35
 
0.1%
Other Punctuation 34
 
0.1%
Other Symbol 19
 
0.1%
Dash Punctuation 9
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2891
 
11.6%
1045
 
4.2%
865
 
3.5%
771
 
3.1%
702
 
2.8%
649
 
2.6%
644
 
2.6%
483
 
1.9%
437
 
1.8%
428
 
1.7%
Other values (568) 15978
64.2%
Uppercase Letter
ValueCountFrequency (%)
S 22
11.2%
E 21
10.7%
N 21
10.7%
G 17
 
8.7%
C 14
 
7.1%
A 12
 
6.1%
I 10
 
5.1%
T 10
 
5.1%
M 10
 
5.1%
B 8
 
4.1%
Other values (14) 51
26.0%
Lowercase Letter
ValueCountFrequency (%)
c 7
20.0%
e 5
14.3%
o 4
11.4%
y 4
11.4%
n 4
11.4%
u 3
8.6%
h 2
 
5.7%
i 1
 
2.9%
r 1
 
2.9%
t 1
 
2.9%
Other values (3) 3
8.6%
Decimal Number
ValueCountFrequency (%)
2 81
62.3%
3 18
 
13.8%
1 17
 
13.1%
4 5
 
3.8%
9 2
 
1.5%
0 2
 
1.5%
5 2
 
1.5%
6 2
 
1.5%
7 1
 
0.8%
Other Punctuation
ValueCountFrequency (%)
. 23
67.6%
& 6
 
17.6%
, 4
 
11.8%
/ 1
 
2.9%
Open Punctuation
ValueCountFrequency (%)
( 2319
> 99.9%
[ 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 2319
> 99.9%
] 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
856
100.0%
Other Symbol
ValueCountFrequency (%)
19
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 9
100.0%
Control
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 24912
80.8%
Common 5670
 
18.4%
Latin 231
 
0.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2891
 
11.6%
1045
 
4.2%
865
 
3.5%
771
 
3.1%
702
 
2.8%
649
 
2.6%
644
 
2.6%
483
 
1.9%
437
 
1.8%
428
 
1.7%
Other values (569) 15997
64.2%
Latin
ValueCountFrequency (%)
S 22
 
9.5%
E 21
 
9.1%
N 21
 
9.1%
G 17
 
7.4%
C 14
 
6.1%
A 12
 
5.2%
I 10
 
4.3%
T 10
 
4.3%
M 10
 
4.3%
B 8
 
3.5%
Other values (27) 86
37.2%
Common
ValueCountFrequency (%)
( 2319
40.9%
) 2319
40.9%
856
 
15.1%
2 81
 
1.4%
. 23
 
0.4%
3 18
 
0.3%
1 17
 
0.3%
- 9
 
0.2%
& 6
 
0.1%
4 5
 
0.1%
Other values (10) 17
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 24893
80.8%
ASCII 5901
 
19.2%
None 19
 
0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2891
 
11.6%
1045
 
4.2%
865
 
3.5%
771
 
3.1%
702
 
2.8%
649
 
2.6%
644
 
2.6%
483
 
1.9%
437
 
1.8%
428
 
1.7%
Other values (568) 15978
64.2%
ASCII
ValueCountFrequency (%)
( 2319
39.3%
) 2319
39.3%
856
 
14.5%
2 81
 
1.4%
. 23
 
0.4%
S 22
 
0.4%
E 21
 
0.4%
N 21
 
0.4%
3 18
 
0.3%
G 17
 
0.3%
Other values (47) 204
 
3.5%
None
ValueCountFrequency (%)
19
100.0%

공장구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 31.5 KiB
개별입지: 3765
개획입지: 249

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 개별입지
2nd row: 개별입지
3rd row: 개별입지
4th row: 개별입지
5th row: 개별입지

Common Values

Value | Count | Frequency (%)
개별입지 | 3765 | 93.8%
개획입지 | 249 | 6.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
개별입지 | 3765 | 93.8%
개획입지 | 249 | 6.2%

전화번호
Text

MISSING 

Distinct: 2877
Distinct (%): 88.1%
Missing: 747
Missing (%): 18.6%
Memory size: 31.5 KiB

Length

Max length: 14
Median length: 12
Mean length: 11.987756
Min length: 2

Characters and Unicode

Total characters: 39164
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2562
Unique (%): 78.4%
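
The percentages for 전화번호 use two different denominators, which is easy to miss when a column has many missing values. The sketch below assumes Missing (%) is taken over all observations while Distinct (%) and Unique (%) are taken over non-missing values only. This is an assumption, but it reproduces the three figures reported for this variable. The helper `pct` is illustrative.

```python
def pct(part, whole):
    """Percentage of part in whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Counts copied from the 전화번호 section and the dataset statistics.
n_observations = 4014
missing = 747
distinct = 2877
unique = 2562

# Values that are actually present in the column.
non_missing = n_observations - missing

missing_pct = pct(missing, n_observations)   # over all rows
distinct_pct = pct(distinct, non_missing)    # over present values only
unique_pct = pct(unique, non_missing)        # over present values only

print(non_missing, missing_pct, distinct_pct, unique_pct)
```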

Sample

1st row: 041-902-1889
2nd row: 041-551-2999
3rd row: 041-587-5771
4th row: 041-552-1460
5th row: 041-583-8815
ValueCountFrequency (%)
041-553-4336 15
 
0.5%
041 11
 
0.3%
041-575-4044 5
 
0.2%
063-243-4444 5
 
0.2%
041-581-5400 4
 
0.1%
041-588-0500 4
 
0.1%
041-568-0022 4
 
0.1%
041-523-8990 3
 
0.1%
041-585-6700 3
 
0.1%
041-552-4042 3
 
0.1%
Other values (2867) 3210
98.3%
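
The frequency list above includes the bare token 041 alongside complete numbers, and the length statistics show a minimum length of 2, so some entries are not full phone numbers. The sketch below is one way to flag such entries. The pattern (a 2-3 digit area code, a 3-4 digit exchange, and a 4 digit line number, separated by hyphens) is an assumption about the intended format, and `is_full_number` is an illustrative name.

```python
import re

# Assumed format: area code (2-3 digits) - exchange (3-4 digits) - line (4 digits).
PHONE_RE = re.compile(r"\d{2,3}-\d{3,4}-\d{4}")


def is_full_number(value):
    """Return True when the whole string matches the assumed phone format."""
    return PHONE_RE.fullmatch(value) is not None


# Values taken from the frequency list in this section.
samples = ["041-902-1889", "063-243-4444", "041"]
flags = {value: is_full_number(value) for value in samples}
print(flags)
```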

Most occurring characters

ValueCountFrequency (%)
- 6494
16.6%
0 5905
15.1%
1 5205
13.3%
5 4835
12.3%
4 4685
12.0%
8 2503
 
6.4%
2 2442
 
6.2%
6 2024
 
5.2%
3 1983
 
5.1%
7 1835
 
4.7%
Other values (4) 1253
 
3.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 32649
83.4%
Dash Punctuation 6494
 
16.6%
Uppercase Letter 21
 
0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 5905
18.1%
1 5205
15.9%
5 4835
14.8%
4 4685
14.3%
8 2503
7.7%
2 2442
7.5%
6 2024
 
6.2%
3 1983
 
6.1%
7 1835
 
5.6%
9 1232
 
3.8%
Uppercase Letter
ValueCountFrequency (%)
A 7
33.3%
R 7
33.3%
S 7
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 6494
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 39143
99.9%
Latin 21
 
0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
- 6494
16.6%
0 5905
15.1%
1 5205
13.3%
5 4835
12.4%
4 4685
12.0%
8 2503
 
6.4%
2 2442
 
6.2%
6 2024
 
5.2%
3 1983
 
5.1%
7 1835
 
4.7%
Latin
ValueCountFrequency (%)
A 7
33.3%
R 7
33.3%
S 7
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 39164
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 6494
16.6%
0 5905
15.1%
1 5205
13.3%
5 4835
12.3%
4 4685
12.0%
8 2503
 
6.4%
2 2442
 
6.2%
6 2024
 
5.2%
3 1983
 
5.1%
7 1835
 
4.7%
Other values (4) 1253
 
3.2%

생산품
Text

Distinct: 3418
Distinct (%): 85.3%
Missing: 8
Missing (%): 0.2%
Memory size: 31.5 KiB

Length

Max length: 65
Median length: 55
Mean length: 10.672991
Min length: 1

Characters and Unicode

Total characters: 42756
Distinct characters: 793
Distinct categories: 15
Distinct scripts: 4
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3169
Unique (%): 79.1%

Sample

1st row: 코팅제룸(무정전pc판)
2nd row: 자동차용 내장재
3rd row: 종량제봉투, 위생팩, 비닐장갑
4th row: 화장지
5th row: 수배전반, 제어장치, CCTV, LED
ValueCountFrequency (%)
226
 
2.9%
반도체 172
 
2.2%
128
 
1.6%
부품 123
 
1.6%
85
 
1.1%
장비 70
 
0.9%
자동차부품 62
 
0.8%
자동차 50
 
0.6%
반도체장비 48
 
0.6%
플라스틱 43
 
0.5%
Other values (4549) 6825
87.1%

Most occurring characters

ValueCountFrequency (%)
3880
 
9.1%
, 1878
 
4.4%
1282
 
3.0%
868
 
2.0%
848
 
2.0%
831
 
1.9%
665
 
1.6%
658
 
1.5%
651
 
1.5%
598
 
1.4%
Other values (783) 30597
71.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 31603
73.9%
Space Separator 3880
 
9.1%
Uppercase Letter 2980
 
7.0%
Other Punctuation 1974
 
4.6%
Lowercase Letter 1247
 
2.9%
Close Punctuation 461
 
1.1%
Open Punctuation 459
 
1.1%
Decimal Number 92
 
0.2%
Dash Punctuation 48
 
0.1%
Control 7
 
< 0.1%
Other values (5) 5
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1282
 
4.1%
868
 
2.7%
848
 
2.7%
831
 
2.6%
665
 
2.1%
658
 
2.1%
651
 
2.1%
598
 
1.9%
587
 
1.9%
560
 
1.8%
Other values (700) 24055
76.1%
Uppercase Letter
ValueCountFrequency (%)
C 312
 
10.5%
E 292
 
9.8%
P 287
 
9.6%
L 275
 
9.2%
D 225
 
7.6%
S 176
 
5.9%
A 168
 
5.6%
T 158
 
5.3%
R 128
 
4.3%
O 116
 
3.9%
Other values (16) 843
28.3%
Lowercase Letter
ValueCountFrequency (%)
e 145
11.6%
i 111
 
8.9%
r 99
 
7.9%
a 98
 
7.9%
t 95
 
7.6%
o 93
 
7.5%
n 75
 
6.0%
l 73
 
5.9%
c 66
 
5.3%
s 60
 
4.8%
Other values (16) 332
26.6%
Decimal Number
ValueCountFrequency (%)
2 25
27.2%
1 17
18.5%
3 14
15.2%
4 13
14.1%
0 10
 
10.9%
5 6
 
6.5%
8 3
 
3.3%
9 3
 
3.3%
7 1
 
1.1%
Other Punctuation
ValueCountFrequency (%)
, 1878
95.1%
. 42
 
2.1%
/ 35
 
1.8%
· 6
 
0.3%
' 6
 
0.3%
& 3
 
0.2%
: 3
 
0.2%
1
 
0.1%
Control
ValueCountFrequency (%)
3
42.9%
3
42.9%
1
 
14.3%
Close Punctuation
ValueCountFrequency (%)
) 456
98.9%
] 5
 
1.1%
Open Punctuation
ValueCountFrequency (%)
( 454
98.9%
[ 5
 
1.1%
Space Separator
ValueCountFrequency (%)
3880
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 48
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%
Other Number
ValueCountFrequency (%)
² 1
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 31598
73.9%
Common 6926
 
16.2%
Latin 4227
 
9.9%
Han 5
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1282
 
4.1%
868
 
2.7%
848
 
2.7%
831
 
2.6%
665
 
2.1%
658
 
2.1%
651
 
2.1%
598
 
1.9%
587
 
1.9%
560
 
1.8%
Other values (697) 24050
76.1%
Latin
ValueCountFrequency (%)
C 312
 
7.4%
E 292
 
6.9%
P 287
 
6.8%
L 275
 
6.5%
D 225
 
5.3%
S 176
 
4.2%
A 168
 
4.0%
T 158
 
3.7%
e 145
 
3.4%
R 128
 
3.0%
Other values (42) 2061
48.8%
Common
ValueCountFrequency (%)
3880
56.0%
, 1878
27.1%
) 456
 
6.6%
( 454
 
6.6%
- 48
 
0.7%
. 42
 
0.6%
/ 35
 
0.5%
2 25
 
0.4%
1 17
 
0.2%
3 14
 
0.2%
Other values (21) 77
 
1.1%
Han
ValueCountFrequency (%)
2
40.0%
2
40.0%
1
20.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 31596
73.9%
ASCII 11144
 
26.1%
None 8
 
< 0.1%
CJK 5
 
< 0.1%
Compat Jamo 2
 
< 0.1%
Punctuation 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3880
34.8%
, 1878
16.9%
) 456
 
4.1%
( 454
 
4.1%
C 312
 
2.8%
E 292
 
2.6%
P 287
 
2.6%
L 275
 
2.5%
D 225
 
2.0%
S 176
 
1.6%
Other values (69) 2909
26.1%
Hangul
ValueCountFrequency (%)
1282
 
4.1%
868
 
2.7%
848
 
2.7%
831
 
2.6%
665
 
2.1%
658
 
2.1%
651
 
2.1%
598
 
1.9%
587
 
1.9%
560
 
1.8%
Other values (696) 24048
76.1%
None
ValueCountFrequency (%)
· 6
75.0%
² 1
 
12.5%
1
 
12.5%
CJK
ValueCountFrequency (%)
2
40.0%
2
40.0%
1
20.0%
Compat Jamo
ValueCountFrequency (%)
2
100.0%
Punctuation
ValueCountFrequency (%)
1
100.0%

업종명
Text

Distinct: 977
Distinct (%): 24.4%
Missing: 11
Missing (%): 0.3%
Memory size: 31.5 KiB

Length

Max length: 35
Median length: 29
Mean length: 17.553085
Min length: 3

Characters and Unicode

Total characters: 70265
Distinct characters: 344
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 491
Unique (%): 12.3%

Sample

1st row: 도장 및 기타 피막처리업
2nd row: 그 외 자동차용 신품 부품 제조업
3rd row: 플라스틱 필름 제조업 외 1 종
4th row: 위생용 종이제품 제조업
5th row: 배전반 및 전기 자동제어반 제조업 외 4 종
ValueCountFrequency (%)
제조업 3605
 
16.0%
2169
 
9.6%
1581
 
7.0%
1496
 
6.6%
기타 1034
 
4.6%
1 872
 
3.9%
588
 
2.6%
기계 468
 
2.1%
제조용 338
 
1.5%
금속 286
 
1.3%
Other values (685) 10080
44.8%

Most occurring characters

ValueCountFrequency (%)
18516
26.4%
5081
 
7.2%
4558
 
6.5%
4241
 
6.0%
2766
 
3.9%
2237
 
3.2%
1636
 
2.3%
1525
 
2.2%
1429
 
2.0%
1381
 
2.0%
Other values (334) 26895
38.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 49572
70.6%
Space Separator 18516
 
26.4%
Decimal Number 1619
 
2.3%
Other Punctuation 472
 
0.7%
Open Punctuation 43
 
0.1%
Close Punctuation 43
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5081
 
10.2%
4558
 
9.2%
4241
 
8.6%
2766
 
5.6%
2237
 
4.5%
1636
 
3.3%
1525
 
3.1%
1429
 
2.9%
1381
 
2.8%
1095
 
2.2%
Other values (318) 23623
47.7%
Decimal Number
ValueCountFrequency (%)
1 910
56.2%
2 282
 
17.4%
3 213
 
13.2%
4 97
 
6.0%
5 46
 
2.8%
6 36
 
2.2%
7 22
 
1.4%
0 6
 
0.4%
8 6
 
0.4%
9 1
 
0.1%
Other Punctuation
ValueCountFrequency (%)
, 450
95.3%
. 20
 
4.2%
· 2
 
0.4%
Space Separator
ValueCountFrequency (%)
18516
100.0%
Open Punctuation
ValueCountFrequency (%)
( 43
100.0%
Close Punctuation
ValueCountFrequency (%)
) 43
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 49572
70.6%
Common 20693
29.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5081
 
10.2%
4558
 
9.2%
4241
 
8.6%
2766
 
5.6%
2237
 
4.5%
1636
 
3.3%
1525
 
3.1%
1429
 
2.9%
1381
 
2.8%
1095
 
2.2%
Other values (318) 23623
47.7%
Common
ValueCountFrequency (%)
18516
89.5%
1 910
 
4.4%
, 450
 
2.2%
2 282
 
1.4%
3 213
 
1.0%
4 97
 
0.5%
5 46
 
0.2%
( 43
 
0.2%
) 43
 
0.2%
6 36
 
0.2%
Other values (6) 57
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 49540
70.5%
ASCII 20691
29.4%
Compat Jamo 32
 
< 0.1%
None 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
18516
89.5%
1 910
 
4.4%
, 450
 
2.2%
2 282
 
1.4%
3 213
 
1.0%
4 97
 
0.5%
5 46
 
0.2%
( 43
 
0.2%
) 43
 
0.2%
6 36
 
0.2%
Other values (5) 55
 
0.3%
Hangul
ValueCountFrequency (%)
5081
 
10.3%
4558
 
9.2%
4241
 
8.6%
2766
 
5.6%
2237
 
4.5%
1636
 
3.3%
1525
 
3.1%
1429
 
2.9%
1381
 
2.8%
1095
 
2.2%
Other values (317) 23591
47.6%
Compat Jamo
ValueCountFrequency (%)
32
100.0%
None
ValueCountFrequency (%)
· 2
100.0%

공장대표주소(도로명)
Text

MISSING

Distinct: 3511
Distinct (%): 92.5%
Missing: 217
Missing (%): 5.4%
Memory size: 31.5 KiB

Length

Max length: 96
Median length: 64
Mean length: 30.676323
Min length: 7

Characters and Unicode

Total characters: 116478
Distinct characters: 433
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3286
Unique (%): 86.5%

Sample

1st row: 충청남도 천안시 서북구 성환읍 복모리 280 외 1필지
2nd row: 충청남도 천안시 서북구 성거읍 봉주로 275
3rd row: 충청남도 천안시 동남구 수신면 발산1길 281
4th row: 충청남도 천안시서북구 성환읍 매주리 607
5th row: 충청남도 천안시 동남구 성남면 석곡3길 60, (2,3동)
ValueCountFrequency (%)
충청남도 3796
 
14.9%
천안시 3211
 
12.6%
서북구 1952
 
7.6%
동남구 1258
 
4.9%
직산읍 792
 
3.1%
726
 
2.8%
천안시서북구 502
 
2.0%
성환읍 462
 
1.8%
입장면 366
 
1.4%
성남면 344
 
1.3%
Other values (3524) 12140
47.5%
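
The token counts above show 천안시서북구 (502 occurrences) written without a space next to the spaced form 천안시 서북구, so the same district appears under two spellings. The sketch below is one possible normalization, assuming only the two districts named in the 시군구 column need handling. The function name `normalize_district` is illustrative.

```python
import re

# Assumption: only the two districts listed under 시군구 occur in fused form.
_FUSED_DISTRICT = re.compile(r"천안시(서북구|동남구)")


def normalize_district(address):
    """Insert the missing space between 천안시 and the district name."""
    return _FUSED_DISTRICT.sub(r"천안시 \1", address)


# Both addresses are taken from the Sample rows of this variable.
fixed = normalize_district("충청남도 천안시서북구 성환읍 매주리 607")
unchanged = normalize_district("충청남도 천안시 서북구 성거읍 봉주로 275")
print(fixed)
print(unchanged)
```

Addresses that already contain the space are left untouched because the pattern only matches the fused spelling.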

Most occurring characters

ValueCountFrequency (%)
21755
 
18.7%
5662
 
4.9%
4347
 
3.7%
4083
 
3.5%
3942
 
3.4%
3918
 
3.4%
1 3885
 
3.3%
3859
 
3.3%
3842
 
3.3%
3827
 
3.3%
Other values (423) 57358
49.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 72490
62.2%
Space Separator 21755
 
18.7%
Decimal Number 16633
 
14.3%
Dash Punctuation 1617
 
1.4%
Open Punctuation 1411
 
1.2%
Close Punctuation 1410
 
1.2%
Other Punctuation 773
 
0.7%
Uppercase Letter 327
 
0.3%
Lowercase Letter 47
 
< 0.1%
Math Symbol 14
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5662
 
7.8%
4347
 
6.0%
4083
 
5.6%
3942
 
5.4%
3918
 
5.4%
3859
 
5.3%
3842
 
5.3%
3827
 
5.3%
2563
 
3.5%
2484
 
3.4%
Other values (365) 33963
46.9%
Uppercase Letter
ValueCountFrequency (%)
A 61
18.7%
B 51
15.6%
M 50
15.3%
C 31
9.5%
N 20
 
6.1%
S 19
 
5.8%
I 16
 
4.9%
E 15
 
4.6%
T 14
 
4.3%
G 10
 
3.1%
Other values (11) 40
12.2%
Lowercase Letter
ValueCountFrequency (%)
e 7
14.9%
o 6
12.8%
r 5
10.6%
a 5
10.6%
t 4
8.5%
i 3
6.4%
n 3
6.4%
g 3
6.4%
m 2
 
4.3%
p 2
 
4.3%
Other values (7) 7
14.9%
Decimal Number
ValueCountFrequency (%)
1 3885
23.4%
2 2642
15.9%
3 2066
12.4%
5 1576
9.5%
4 1473
 
8.9%
0 1252
 
7.5%
6 1168
 
7.0%
7 950
 
5.7%
9 822
 
4.9%
8 799
 
4.8%
Other Punctuation
ValueCountFrequency (%)
, 763
98.7%
. 8
 
1.0%
/ 1
 
0.1%
: 1
 
0.1%
Space Separator
ValueCountFrequency (%)
21755
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1617
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1411
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1410
100.0%
Math Symbol
ValueCountFrequency (%)
~ 14
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 72491
62.2%
Common 43613
37.4%
Latin 374
 
0.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5662
 
7.8%
4347
 
6.0%
4083
 
5.6%
3942
 
5.4%
3918
 
5.4%
3859
 
5.3%
3842
 
5.3%
3827
 
5.3%
2563
 
3.5%
2484
 
3.4%
Other values (366) 33964
46.9%
Latin
ValueCountFrequency (%)
A 61
16.3%
B 51
13.6%
M 50
13.4%
C 31
 
8.3%
N 20
 
5.3%
S 19
 
5.1%
I 16
 
4.3%
E 15
 
4.0%
T 14
 
3.7%
G 10
 
2.7%
Other values (28) 87
23.3%
Common
ValueCountFrequency (%)
21755
49.9%
1 3885
 
8.9%
2 2642
 
6.1%
3 2066
 
4.7%
- 1617
 
3.7%
5 1576
 
3.6%
4 1473
 
3.4%
( 1411
 
3.2%
) 1410
 
3.2%
0 1252
 
2.9%
Other values (9) 4526
 
10.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 72490
62.2%
ASCII 43987
37.8%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
21755
49.5%
1 3885
 
8.8%
2 2642
 
6.0%
3 2066
 
4.7%
- 1617
 
3.7%
5 1576
 
3.6%
4 1473
 
3.3%
( 1411
 
3.2%
) 1410
 
3.2%
0 1252
 
2.8%
Other values (47) 4900
 
11.1%
Hangul
ValueCountFrequency (%)
5662
 
7.8%
4347
 
6.0%
4083
 
5.6%
3942
 
5.4%
3918
 
5.4%
3859
 
5.3%
3842
 
5.3%
3827
 
5.3%
2563
 
3.5%
2484
 
3.4%
Other values (365) 33963
46.9%
None
ValueCountFrequency (%)
1
100.0%

Correlations

Variable | 시군구 | 공장구분
시군구 | 1.000 | 0.096
공장구분 | 0.096 | 1.000

Variable | 공장구분 | 시도 | 시군구
공장구분 | 1.000 | 1.000 | 0.061
시도 | 1.000 | 1.000 | 1.000
시군구 | 0.061 | 1.000 | 1.000

Variable | 시도 | 시군구 | 공장구분
시도 | 1.000 | 1.000 | 1.000
시군구 | 1.000 | 1.000 | 0.061
공장구분 | 1.000 | 0.061 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
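
Nullity correlation can be illustrated by turning each column into a 0/1 "is missing" indicator and correlating the indicators. The sketch below is a minimal illustration using Pearson correlation; the toy rows, the column names phone and address, and the helper names are all hypothetical and are not taken from this dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A column that is never (or always) missing has no defined correlation.
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def nullity_correlation(rows, col_a, col_b):
    """Correlate the missing-value indicators of two columns."""
    missing_a = [1 if row[col_a] is None else 0 for row in rows]
    missing_b = [1 if row[col_b] is None else 0 for row in rows]
    return pearson(missing_a, missing_b)


# Hypothetical toy data: both columns are missing on exactly the same rows.
together = [
    {"phone": None, "address": None},
    {"phone": "041-000-0000", "address": "A"},
    {"phone": None, "address": None},
    {"phone": "041-000-0001", "address": "B"},
]
# Hypothetical toy data: address is missing exactly when phone is present.
opposite = [
    {"phone": None, "address": "A"},
    {"phone": "041-000-0000", "address": None},
    {"phone": None, "address": "B"},
    {"phone": "041-000-0001", "address": None},
]

corr_together = nullity_correlation(together, "phone", "address")
corr_opposite = nullity_correlation(opposite, "phone", "address")
print(corr_together, corr_opposite)
```

A value near 1 means two columns tend to be missing on the same rows, a value near -1 means one is present when the other is missing, and a value near 0 means the missingness patterns are unrelated.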

Sample

First rows

Index | 시도 | 시군구 | 회사명 | 공장구분 | 전화번호 | 생산품 | 업종명 | 공장대표주소(도로명)
0 | 충청남도 | 천안시 서북구 | 수덕산업 | 개별입지 | <NA> | 코팅제룸(무정전pc판) | 도장 및 기타 피막처리업 | 충청남도 천안시 서북구 성환읍 복모리 280 외 1필지
1 | 충청남도 | 천안시 서북구 | 플렉스폼코리아(유)천안2공장(성거) | 개별입지 | 041-902-1889 | 자동차용 내장재 | 그 외 자동차용 신품 부품 제조업 | 충청남도 천안시 서북구 성거읍 봉주로 275
2 | 충청남도 | 천안시 동남구 | (사)두리장애인복지회 두리다담사업단 | 개별입지 | 041-551-2999 | 종량제봉투, 위생팩, 비닐장갑 | 플라스틱 필름 제조업 외 1 종 | 충청남도 천안시 동남구 수신면 발산1길 281
3 | 충청남도 | 천안시 서북구 | (사)우리들행복나눔장애인복지회(화장지사업단) | 개별입지 | 041-587-5771 | 화장지 | 위생용 종이제품 제조업 | 충청남도 천안시서북구 성환읍 매주리 607
4 | 충청남도 | 천안시 동남구 | (사)한마음장애인복지회 | 개별입지 | 041-552-1460 | 수배전반, 제어장치, CCTV, LED | 배전반 및 전기 자동제어반 제조업 외 4 종 | 충청남도 천안시 동남구 성남면 석곡3길 60, (2,3동)
5 | 충청남도 | 천안시 동남구 | (사)한마음장애인복지회 한마음사업단 | 개별입지 | 041-583-8815 | 헤드레스트커버, 전기차 충전기 | 위생용 종이제품 제조업 외 4 종 | 충청남도 천안시 동남구 수신면 발산1길 171
6 | 충청남도 | 천안시 동남구 | (유)선양지연지점 | 개별입지 | 041-559-6056 | 문구용지 | 문구용 종이제품 제조업 | 충청남도 천안시 동남구 광덕면 세종로 4186
7 | 충청남도 | 천안시 동남구 | (유)성진 | 개별입지 | 041-559-6217 | 문구용지 | 문구용 종이제품 제조업 | 충청남도 천안시 동남구 광덕면 세종로 4186
8 | 충청남도 | 천안시 동남구 | (유)엔비오그린 | 개별입지 | <NA> | 미생물 | 그 외 기타 분류 안된 화학제품 제조업 외 1 종 | 충청남도 천안시 동남구 병천면 개목고개길 41
9 | 충청남도 | 천안시 서북구 | (유)트윈위더스 | 개별입지 | 041-588-0500 | 투명전도성 필름 | 플라스틱 적층, 도포 및 기타 표면처리 제품 제조업 | 충청남도 천안시 서북구 입장면 홍천당곡길 70

Last rows

Index | 시도 | 시군구 | 회사명 | 공장구분 | 전화번호 | 생산품 | 업종명 | 공장대표주소(도로명)
4004 | 충청남도 | 천안시 서북구 | 효신금속(주) | 개별입지 | 041-582-4261 | P.V.D 이온플레이팀 | 도금업 | 충청남도 천안시 서북구 성환읍 홍경길 100 (효신금속(주))
4005 | 충청남도 | 천안시 동남구 | 후드메이트(주) | 개별입지 | 041-552-6063 | 조미료 | 기타 식품 첨가물 제조업 | 충청남도 천안시동남구 안서동 산 120-1번지 호서대 신기술창업센타 304
4006 | 충청남도 | 천안시 서북구 | 휴나팩(주) | 개별입지 | <NA> | 플라스틱포대 및 봉투 | 플라스틱 포대, 봉투 및 유사제품 제조업 | 충청남도 천안시 서북구 입장면 가산리 642 외 1필지
4007 | 충청남도 | 천안시 서북구 | 휴림로봇(주) | 개별입지 | 041-590-1737 | 산업용 및 지능형 로봇 | 산업용 로봇 제조업 | 충청남도 천안시 서북구 직산읍 4산단6길 27
4008 | 충청남도 | 천안시 서북구 | 휴먼이엔티주식회사 | 개별입지 | 041-622-0118 | 패널,인방제,바닥제구조물, 디자인형울타리 | 콘크리트 타일, 기와, 벽돌 및 블록 제조업 외 4 종 | 충청남도 천안시 서북구 성거읍 망향로 903-6
4009 | 충청남도 | 천안시 서북구 | 휴코시스 주식회사 | 개별입지 | 041-523-9077 | 전력변환전원장치 | 기타 전기 변환장치 제조업 | 충청남도 천안시 서북구 백석공단1로 10, A동513호(백석동, 천안 미래에이스하이테크시티)
4010 | 충청남도 | 천안시 동남구 | 흥림농산 | 개별입지 | <NA> | 참기름,압착식용유 | 식물성 유지 제조업 | 충청남도 천안시 동남구 수신면 장산동길 168-27
4011 | 충청남도 | 천안시 서북구 | 희성폴리머(주) | 개별입지 | 041-559-1010 | 포장재, 광고지, 천막지 | 기타 인쇄업 외 3 종 | 충청남도 천안시 서북구 성환읍 천안대로 2131 (성환읍) 외 1필지
4012 | 충청남도 | 천안시 서북구 | 희영 | 개별입지 | 041-583-1677 | 금형제조업 | 주형 및 금형 제조업 | 충청남도 천안시 서북구 직산읍 금곡로 141 ((주)그린테크산업) (총 3 필지) 외 2필지
4013 | 충청남도 | 천안시 서북구 | 히트텍(주) | 개별입지 | 041-584-8881 | 차량공조 열교환기 | 공기 조화장치 제조업 | 충청남도 천안시서북구 직산읍 마정리 522번지

Duplicate rows

Most frequently occurring

Index | 시도 | 시군구 | 회사명 | 공장구분 | 전화번호 | 생산품 | 업종명 | 공장대표주소(도로명) | # duplicates
0 | 충청남도 | 천안시 서북구 | (주)화이버옵틱코리아 | 개별입지 | 041-587-9911 | 조명장치및광센서 | 일반용조명장치제조업 외 1 종 | 충청남도 천안시서북구 직산읍 삼은리 43-5번지 충남테크노파크천안밸리 생산관2109호 | 2