Overview

Dataset statistics

Number of variables: 8
Number of observations: 2249
Missing cells: 749
Missing cells (%): 4.2%
Duplicate rows: 2
Duplicate rows (%): 0.1%
Total size in memory: 142.9 KiB
Average record size in memory: 65.1 B
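
The percentages and the average record size above follow from the raw counts. The sketch below reproduces them, assuming that "missing cells (%)" is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 8
n_observations = 2249
missing_cells = 749
duplicate_rows = 2
total_size_kib = 142.9


def pct(part: float, whole: float) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


total_cells = n_variables * n_observations            # every cell in the table
missing_pct = pct(missing_cells, total_cells)         # share of empty cells
duplicate_pct = pct(duplicate_rows, n_observations)   # share of repeated rows
# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = round(total_size_kib * 1024 / n_observations, 1)

print(missing_pct, duplicate_pct, avg_record_bytes)
```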

Variable types

Text: 7
Numeric: 1

Dataset

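Description: Factory registration records published by Pocheon-si, Gyeonggi-do, covering company name, industry name, address (road-name and lot-number), contact number, number of employees, products, and other fields.
Author: Pocheon-si, Gyeonggi-do
URL: https://www.data.go.kr/data/15020785/fileData.do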

Alerts

Dataset has 2 (0.1%) duplicate rows [Duplicates]
원자재 has 741 (32.9%) missing values [Missing]
종업원수(총인원) has 34 (1.5%) zeros [Zeros]

Reproduction

Analysis started: 2024-03-15 02:25:03.027600
Analysis finished: 2024-03-15 02:25:06.415282
Duration: 3.39 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
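
As a quick consistency check, the reported duration can be recomputed from the two timestamps. This is a minimal sketch using only the standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" section.
started = datetime.fromisoformat("2024-03-15 02:25:03.027600")
finished = datetime.fromisoformat("2024-03-15 02:25:06.415282")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```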

Variables

회사명
Text

Distinct: 2184
Distinct (%): 97.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.7 KiB

Length

Max length: 28
Median length: 23
Mean length: 7.0600267
Min length: 2

Characters and Unicode

Total characters: 15878
Distinct characters: 590
Distinct categories: 11
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
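
To illustrate how such per-category counts can be obtained, here is a minimal sketch using Python's standard `unicodedata` module. It covers only the general category (the standard library does not expose script or block), the name mapping is a hand-written subset, and `category_counts` is a hypothetical helper, not the profiler's own code.

```python
import unicodedata
from collections import Counter

# Readable names for some general-category codes (subset only;
# any other code is reported as-is).
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Zs": "Space Separator",
    "So": "Other Symbol",
    "Sm": "Math Symbol",
    "Sk": "Modifier Symbol",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Two values taken from the sample rows of this variable.
sample = ["(주)성원목재", "주식회사 대명코리아"]
counts = category_counts(sample)
print(counts.most_common())
```

Hangul syllables fall under "Other Letter", which is why that category dominates the tables for the Korean text columns.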

Unique

Unique: 2125
Unique (%): 94.5%

Sample

1st row: (주)성원목재
2nd row: 주식회사 대명코리아
3rd row: (주)동양봉제기계
4th row: (사)우리들행복나눔(종합가구사업단)
5th row: (사)한국교통장애인협회 사업부
ValueCountFrequency (%)
주식회사 270
 
10.2%
농업회사법인 18
 
0.7%
11
 
0.4%
제2공장 11
 
0.4%
명성기업 6
 
0.2%
2공장 5
 
0.2%
원일상공 4
 
0.2%
포천공장 4
 
0.2%
포천지점 4
 
0.2%
tex 3
 
0.1%
Other values (2217) 2305
87.3%

Most occurring characters

ValueCountFrequency (%)
1317
 
8.3%
) 1031
 
6.5%
( 1030
 
6.5%
432
 
2.7%
411
 
2.6%
404
 
2.5%
381
 
2.4%
364
 
2.3%
359
 
2.3%
338
 
2.1%
Other values (580) 9811
61.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 13074
82.3%
Close Punctuation 1031
 
6.5%
Open Punctuation 1030
 
6.5%
Space Separator 411
 
2.6%
Uppercase Letter 209
 
1.3%
Decimal Number 48
 
0.3%
Lowercase Letter 32
 
0.2%
Other Punctuation 28
 
0.2%
Other Symbol 9
 
0.1%
Dash Punctuation 5
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1317
 
10.1%
432
 
3.3%
404
 
3.1%
381
 
2.9%
364
 
2.8%
359
 
2.7%
338
 
2.6%
262
 
2.0%
227
 
1.7%
214
 
1.6%
Other values (523) 8776
67.1%
Uppercase Letter
ValueCountFrequency (%)
E 24
 
11.5%
S 17
 
8.1%
N 17
 
8.1%
C 15
 
7.2%
T 14
 
6.7%
L 11
 
5.3%
O 11
 
5.3%
I 11
 
5.3%
G 10
 
4.8%
D 9
 
4.3%
Other values (16) 70
33.5%
Lowercase Letter
ValueCountFrequency (%)
o 6
18.8%
e 4
12.5%
s 4
12.5%
a 3
9.4%
c 2
 
6.2%
d 2
 
6.2%
k 2
 
6.2%
n 2
 
6.2%
i 2
 
6.2%
t 1
 
3.1%
Other values (4) 4
12.5%
Decimal Number
ValueCountFrequency (%)
2 26
54.2%
1 11
22.9%
3 4
 
8.3%
4 3
 
6.2%
5 2
 
4.2%
9 1
 
2.1%
7 1
 
2.1%
Other Punctuation
ValueCountFrequency (%)
. 23
82.1%
& 2
 
7.1%
, 2
 
7.1%
· 1
 
3.6%
Close Punctuation
ValueCountFrequency (%)
) 1031
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1030
100.0%
Space Separator
ValueCountFrequency (%)
411
100.0%
Other Symbol
ValueCountFrequency (%)
9
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 13082
82.4%
Common 2554
 
16.1%
Latin 241
 
1.5%
Han 1
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1317
 
10.1%
432
 
3.3%
404
 
3.1%
381
 
2.9%
364
 
2.8%
359
 
2.7%
338
 
2.6%
262
 
2.0%
227
 
1.7%
214
 
1.6%
Other values (523) 8784
67.1%
Latin
ValueCountFrequency (%)
E 24
 
10.0%
S 17
 
7.1%
N 17
 
7.1%
C 15
 
6.2%
T 14
 
5.8%
L 11
 
4.6%
O 11
 
4.6%
I 11
 
4.6%
G 10
 
4.1%
D 9
 
3.7%
Other values (30) 102
42.3%
Common
ValueCountFrequency (%)
) 1031
40.4%
( 1030
40.3%
411
 
16.1%
2 26
 
1.0%
. 23
 
0.9%
1 11
 
0.4%
- 5
 
0.2%
3 4
 
0.2%
4 3
 
0.1%
& 2
 
0.1%
Other values (6) 8
 
0.3%
Han
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 13072
82.3%
ASCII 2794
 
17.6%
None 10
 
0.1%
Compat Jamo 1
 
< 0.1%
CJK 1
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1317
 
10.1%
432
 
3.3%
404
 
3.1%
381
 
2.9%
364
 
2.8%
359
 
2.7%
338
 
2.6%
262
 
2.0%
227
 
1.7%
214
 
1.6%
Other values (521) 8774
67.1%
ASCII
ValueCountFrequency (%)
) 1031
36.9%
( 1030
36.9%
411
 
14.7%
2 26
 
0.9%
E 24
 
0.9%
. 23
 
0.8%
S 17
 
0.6%
N 17
 
0.6%
C 15
 
0.5%
T 14
 
0.5%
Other values (45) 186
 
6.7%
None
ValueCountFrequency (%)
9
90.0%
· 1
 
10.0%
Compat Jamo
ValueCountFrequency (%)
1
100.0%
CJK
ValueCountFrequency (%)
1
100.0%

공장주소(도로명)
Text

Distinct: 2201
Distinct (%): 98.0%
Missing: 3
Missing (%): 0.1%
Memory size: 17.7 KiB

Length

Max length: 81
Median length: 45
Mean length: 25.731077
Min length: 14

Characters and Unicode

Total characters: 57792
Distinct characters: 239
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2159
Unique (%): 96.1%

Sample

1st row: 경기도 포천시 내촌면 부마로282번길 42
2nd row: 경기도 포천시 가산면 정교리 12-11 외 1필지
3rd row: 경기도 포천시 금강로 2927 (내촌면) 1층
4th row: 경기도 포천시 내촌면 작은넙고개1길 84, 가동
5th row: 경기도 포천시 가산면 정금로 183-23, 마,라동 (가산면)
ValueCountFrequency (%)
경기도 2245
 
16.6%
포천시 2245
 
16.6%
680
 
5.0%
가산면 610
 
4.5%
1필지 353
 
2.6%
군내면 280
 
2.1%
소흘읍 272
 
2.0%
내촌면 252
 
1.9%
신북면 176
 
1.3%
2필지 143
 
1.1%
Other values (2024) 6261
46.3%

Most occurring characters

ValueCountFrequency (%)
11272
19.5%
1 2430
 
4.2%
2395
 
4.1%
2371
 
4.1%
2371
 
4.1%
2297
 
4.0%
2277
 
3.9%
2248
 
3.9%
1703
 
2.9%
2 1694
 
2.9%
Other values (229) 26734
46.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 32839
56.8%
Space Separator 11272
 
19.5%
Decimal Number 10782
 
18.7%
Dash Punctuation 1112
 
1.9%
Open Punctuation 635
 
1.1%
Close Punctuation 635
 
1.1%
Other Punctuation 461
 
0.8%
Uppercase Letter 52
 
0.1%
Other Symbol 3
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2395
 
7.3%
2371
 
7.2%
2371
 
7.2%
2297
 
7.0%
2277
 
6.9%
2248
 
6.8%
1703
 
5.2%
1351
 
4.1%
1287
 
3.9%
947
 
2.9%
Other values (203) 13592
41.4%
Decimal Number
ValueCountFrequency (%)
1 2430
22.5%
2 1694
15.7%
3 1216
11.3%
4 1015
9.4%
5 910
 
8.4%
9 724
 
6.7%
8 721
 
6.7%
6 716
 
6.6%
7 690
 
6.4%
0 666
 
6.2%
Uppercase Letter
ValueCountFrequency (%)
A 18
34.6%
B 12
23.1%
C 11
21.2%
D 7
 
13.5%
F 1
 
1.9%
H 1
 
1.9%
E 1
 
1.9%
J 1
 
1.9%
Other Punctuation
ValueCountFrequency (%)
, 458
99.3%
. 3
 
0.7%
Space Separator
ValueCountFrequency (%)
11272
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1112
100.0%
Open Punctuation
ValueCountFrequency (%)
( 635
100.0%
Close Punctuation
ValueCountFrequency (%)
) 635
100.0%
Other Symbol
ValueCountFrequency (%)
3
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 32839
56.8%
Common 24901
43.1%
Latin 52
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2395
 
7.3%
2371
 
7.2%
2371
 
7.2%
2297
 
7.0%
2277
 
6.9%
2248
 
6.8%
1703
 
5.2%
1351
 
4.1%
1287
 
3.9%
947
 
2.9%
Other values (203) 13592
41.4%
Common
ValueCountFrequency (%)
11272
45.3%
1 2430
 
9.8%
2 1694
 
6.8%
3 1216
 
4.9%
- 1112
 
4.5%
4 1015
 
4.1%
5 910
 
3.7%
9 724
 
2.9%
8 721
 
2.9%
6 716
 
2.9%
Other values (8) 3091
 
12.4%
Latin
ValueCountFrequency (%)
A 18
34.6%
B 12
23.1%
C 11
21.2%
D 7
 
13.5%
F 1
 
1.9%
H 1
 
1.9%
E 1
 
1.9%
J 1
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
Hangul 32839
56.8%
ASCII 24950
43.2%
CJK Compat 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
11272
45.2%
1 2430
 
9.7%
2 1694
 
6.8%
3 1216
 
4.9%
- 1112
 
4.5%
4 1015
 
4.1%
5 910
 
3.6%
9 724
 
2.9%
8 721
 
2.9%
6 716
 
2.9%
Other values (15) 3140
 
12.6%
Hangul
ValueCountFrequency (%)
2395
 
7.3%
2371
 
7.2%
2371
 
7.2%
2297
 
7.0%
2277
 
6.9%
2248
 
6.8%
1703
 
5.2%
1351
 
4.1%
1287
 
3.9%
947
 
2.9%
Other values (203) 13592
41.4%
CJK Compat
ValueCountFrequency (%)
3
100.0%

공장주소(지번)
Text

Distinct: 2212
Distinct (%): 98.4%
Missing: 0
Missing (%): 0.0%
Memory size: 17.7 KiB

Length

Max length: 73
Median length: 46
Mean length: 24.417519
Min length: 14

Characters and Unicode

Total characters: 54915
Distinct characters: 184
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2177
Unique (%): 96.8%

Sample

1st row: 경기도 포천시 내촌면 마명리 110-8번지
2nd row: 경기도 포천시 가산면 정교리 12-11 외 1필지
3rd row: 경기도 포천시 내촌면 음현리 688-1 1층
4th row: 경기도 포천시 내촌면 진목리 915-9번지 가동
5th row: 경기도 포천시 가산면 정교리 211-3번지 마,라동
ValueCountFrequency (%)
경기도 2249
17.4%
포천시 2249
17.4%
675
 
5.2%
가산면 621
 
4.8%
1필지 352
 
2.7%
군내면 294
 
2.3%
소흘읍 275
 
2.1%
내촌면 253
 
2.0%
금현리 198
 
1.5%
신북면 185
 
1.4%
Other values (2297) 5567
43.1%

Most occurring characters

ValueCountFrequency (%)
10673
19.4%
2294
 
4.2%
2254
 
4.1%
2254
 
4.1%
2252
 
4.1%
2251
 
4.1%
2249
 
4.1%
1 2080
 
3.8%
2035
 
3.7%
2032
 
3.7%
Other values (174) 24541
44.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 32144
58.5%
Space Separator 10673
 
19.4%
Decimal Number 9835
 
17.9%
Dash Punctuation 1825
 
3.3%
Other Punctuation 166
 
0.3%
Open Punctuation 108
 
0.2%
Close Punctuation 108
 
0.2%
Uppercase Letter 52
 
0.1%
Other Symbol 3
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2294
 
7.1%
2254
 
7.0%
2254
 
7.0%
2252
 
7.0%
2251
 
7.0%
2249
 
7.0%
2035
 
6.3%
2032
 
6.3%
1687
 
5.2%
1289
 
4.0%
Other values (148) 11547
35.9%
Decimal Number
ValueCountFrequency (%)
1 2080
21.1%
2 1220
12.4%
3 1125
11.4%
4 1102
11.2%
5 917
9.3%
6 760
 
7.7%
8 760
 
7.7%
7 678
 
6.9%
0 637
 
6.5%
9 556
 
5.7%
Uppercase Letter
ValueCountFrequency (%)
A 18
34.6%
B 12
23.1%
C 11
21.2%
D 7
 
13.5%
F 1
 
1.9%
H 1
 
1.9%
E 1
 
1.9%
J 1
 
1.9%
Other Punctuation
ValueCountFrequency (%)
, 159
95.8%
. 7
 
4.2%
Space Separator
ValueCountFrequency (%)
10673
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1825
100.0%
Open Punctuation
ValueCountFrequency (%)
( 108
100.0%
Close Punctuation
ValueCountFrequency (%)
) 108
100.0%
Other Symbol
ValueCountFrequency (%)
3
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 32144
58.5%
Common 22719
41.4%
Latin 52
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2294
 
7.1%
2254
 
7.0%
2254
 
7.0%
2252
 
7.0%
2251
 
7.0%
2249
 
7.0%
2035
 
6.3%
2032
 
6.3%
1687
 
5.2%
1289
 
4.0%
Other values (148) 11547
35.9%
Common
ValueCountFrequency (%)
10673
47.0%
1 2080
 
9.2%
- 1825
 
8.0%
2 1220
 
5.4%
3 1125
 
5.0%
4 1102
 
4.9%
5 917
 
4.0%
6 760
 
3.3%
8 760
 
3.3%
7 678
 
3.0%
Other values (8) 1579
 
7.0%
Latin
ValueCountFrequency (%)
A 18
34.6%
B 12
23.1%
C 11
21.2%
D 7
 
13.5%
F 1
 
1.9%
H 1
 
1.9%
E 1
 
1.9%
J 1
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
Hangul 32144
58.5%
ASCII 22768
41.5%
CJK Compat 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
10673
46.9%
1 2080
 
9.1%
- 1825
 
8.0%
2 1220
 
5.4%
3 1125
 
4.9%
4 1102
 
4.8%
5 917
 
4.0%
6 760
 
3.3%
8 760
 
3.3%
7 678
 
3.0%
Other values (15) 1628
 
7.2%
Hangul
ValueCountFrequency (%)
2294
 
7.1%
2254
 
7.0%
2254
 
7.0%
2252
 
7.0%
2251
 
7.0%
2249
 
7.0%
2035
 
6.3%
2032
 
6.3%
1687
 
5.2%
1289
 
4.0%
Other values (148) 11547
35.9%
CJK Compat
ValueCountFrequency (%)
3
100.0%

업종명
Text

Distinct: 654
Distinct (%): 29.1%
Missing: 0
Missing (%): 0.0%
Memory size: 17.7 KiB

Length

Max length: 34
Median length: 28
Mean length: 16.435749
Min length: 3

Characters and Unicode

Total characters: 36964
Distinct characters: 312
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 327
Unique (%): 14.5%

Sample

1st row: 일반 제재업
2nd row: 수동식 식품 가공기기 및 금속 주방용기 제조업 외 1 종
3rd row: 기타 섬유, 의복 및 가죽 가공 기계 제조업
4th row: 기타 목재가구 제조업 외 6 종
5th row: 구조용 금속 판제품 및 공작물 제조업 외 3 종
ValueCountFrequency (%)
제조업 1917
 
16.2%
996
 
8.4%
952
 
8.0%
765
 
6.5%
기타 702
 
5.9%
1 439
 
3.7%
플라스틱 214
 
1.8%
187
 
1.6%
편조원단 186
 
1.6%
목재가구 182
 
1.5%
Other values (561) 5296
44.7%

Most occurring characters

ValueCountFrequency (%)
9589
25.9%
2598
 
7.0%
2372
 
6.4%
2294
 
6.2%
1154
 
3.1%
996
 
2.7%
984
 
2.7%
782
 
2.1%
729
 
2.0%
659
 
1.8%
Other values (302) 14807
40.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 26205
70.9%
Space Separator 9589
 
25.9%
Decimal Number 799
 
2.2%
Other Punctuation 315
 
0.9%
Close Punctuation 28
 
0.1%
Open Punctuation 28
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2598
 
9.9%
2372
 
9.1%
2294
 
8.8%
1154
 
4.4%
996
 
3.8%
984
 
3.8%
782
 
3.0%
729
 
2.8%
659
 
2.5%
607
 
2.3%
Other values (287) 13030
49.7%
Decimal Number
ValueCountFrequency (%)
1 472
59.1%
2 138
 
17.3%
3 69
 
8.6%
4 45
 
5.6%
5 25
 
3.1%
6 14
 
1.8%
7 14
 
1.8%
0 9
 
1.1%
8 8
 
1.0%
9 5
 
0.6%
Other Punctuation
ValueCountFrequency (%)
, 306
97.1%
. 9
 
2.9%
Space Separator
ValueCountFrequency (%)
9589
100.0%
Close Punctuation
ValueCountFrequency (%)
) 28
100.0%
Open Punctuation
ValueCountFrequency (%)
( 28
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 26205
70.9%
Common 10759
29.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2598
 
9.9%
2372
 
9.1%
2294
 
8.8%
1154
 
4.4%
996
 
3.8%
984
 
3.8%
782
 
3.0%
729
 
2.8%
659
 
2.5%
607
 
2.3%
Other values (287) 13030
49.7%
Common
ValueCountFrequency (%)
9589
89.1%
1 472
 
4.4%
, 306
 
2.8%
2 138
 
1.3%
3 69
 
0.6%
4 45
 
0.4%
) 28
 
0.3%
( 28
 
0.3%
5 25
 
0.2%
6 14
 
0.1%
Other values (5) 45
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 26182
70.8%
ASCII 10759
29.1%
Compat Jamo 23
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9589
89.1%
1 472
 
4.4%
, 306
 
2.8%
2 138
 
1.3%
3 69
 
0.6%
4 45
 
0.4%
) 28
 
0.3%
( 28
 
0.3%
5 25
 
0.2%
6 14
 
0.1%
Other values (5) 45
 
0.4%
Hangul
ValueCountFrequency (%)
2598
 
9.9%
2372
 
9.1%
2294
 
8.8%
1154
 
4.4%
996
 
3.8%
984
 
3.8%
782
 
3.0%
729
 
2.8%
659
 
2.5%
607
 
2.3%
Other values (286) 13007
49.7%
Compat Jamo
ValueCountFrequency (%)
23
100.0%

전화번호
Text

Distinct: 2090
Distinct (%): 92.9%
Missing: 0
Missing (%): 0.0%
Memory size: 17.7 KiB

Length

Max length: 13
Median length: 12
Mean length: 11.991996
Min length: 9

Characters and Unicode

Total characters: 26970
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1967
Unique (%): 87.5%

Sample

1st row: 031-533-8740
2nd row: 031-595-7997
3rd row: 031-531-3579
4th row: 031-591-8353
5th row: 031-544-3133
ValueCountFrequency (%)
031-535-3030 13
 
0.6%
031-542-6526 6
 
0.3%
031-536-8308 5
 
0.2%
031-541-8722 4
 
0.2%
031-532-0948 4
 
0.2%
031-534-9500 4
 
0.2%
031-544-6781 3
 
0.1%
031-969-9873 3
 
0.1%
031-541-0171 3
 
0.1%
031-543-3411 3
 
0.1%
Other values (2080) 2201
97.9%

Most occurring characters

ValueCountFrequency (%)
- 4491
16.7%
3 4292
15.9%
1 3600
13.3%
0 3567
13.2%
5 3009
11.2%
4 2299
8.5%
2 1557
 
5.8%
7 1101
 
4.1%
8 1084
 
4.0%
6 1064
 
3.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 22479
83.3%
Dash Punctuation 4491
 
16.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 4292
19.1%
1 3600
16.0%
0 3567
15.9%
5 3009
13.4%
4 2299
10.2%
2 1557
 
6.9%
7 1101
 
4.9%
8 1084
 
4.8%
6 1064
 
4.7%
9 906
 
4.0%
Dash Punctuation
ValueCountFrequency (%)
- 4491
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 26970
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 4491
16.7%
3 4292
15.9%
1 3600
13.3%
0 3567
13.2%
5 3009
11.2%
4 2299
8.5%
2 1557
 
5.8%
7 1101
 
4.1%
8 1084
 
4.0%
6 1064
 
3.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 26970
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 4491
16.7%
3 4292
15.9%
1 3600
13.3%
0 3567
13.2%
5 3009
11.2%
4 2299
8.5%
2 1557
 
5.8%
7 1101
 
4.1%
8 1084
 
4.0%
6 1064
 
3.9%

종업원수(총인원)
Real number (ℝ)

ZEROS 

Distinct: 84
Distinct (%): 3.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.798133
Minimum: 0
Maximum: 290
Zeros: 34
Zeros (%): 1.5%
Negative: 0
Negative (%): 0.0%
Memory size: 19.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 4
median: 7
Q3: 13
95-th percentile: 35
Maximum: 290
Range: 290
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 16.647117
Coefficient of variation (CV): 1.4109959
Kurtosis: 73.882407
Mean: 11.798133
Median Absolute Deviation (MAD): 4
Skewness: 6.681058
Sum: 26534
Variance: 277.12649
Monotonicity: Not monotonic
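
Several of these statistics are tied together by simple identities: mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean, and IQR = Q3 − Q1. The sketch below recomputes them from the reported figures as a sanity check; it assumes n is the 2249 observations from the overview (this column has no missing values), and the variable names are illustrative.

```python
# Figures copied from the report for 종업원수(총인원).
n = 2249            # observations (no missing values in this column)
total = 26534       # reported Sum
std = 16.647117     # reported standard deviation
q1, q3 = 4, 13      # reported quartiles

mean = total / n        # should match the reported Mean
variance = std ** 2     # should match the reported Variance
cv = std / mean         # coefficient of variation
iqr = q3 - q1           # interquartile range

print(f"mean={mean:.6f} variance={variance:.5f} cv={cv:.7f} iqr={iqr}")
```

A CV well above 1, together with the large skewness and kurtosis, is consistent with a long right tail: most factories report only a handful of employees, while a few report hundreds.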
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
5 | 247 | 11.0%
3 | 210 | 9.3%
4 | 194 | 8.6%
10 | 157 | 7.0%
6 | 153 | 6.8%
2 | 139 | 6.2%
7 | 123 | 5.5%
8 | 119 | 5.3%
12 | 91 | 4.0%
9 | 75 | 3.3%
Other values (74) | 741 | 32.9%

Value | Count | Frequency (%)
0 | 34 | 1.5%
1 | 62 | 2.8%
2 | 139 | 6.2%
3 | 210 | 9.3%
4 | 194 | 8.6%
5 | 247 | 11.0%
6 | 153 | 6.8%
7 | 123 | 5.5%
8 | 119 | 5.3%
9 | 75 | 3.3%

Value | Count | Frequency (%)
290 | 1 | < 0.1%
240 | 1 | < 0.1%
210 | 1 | < 0.1%
176 | 1 | < 0.1%
160 | 1 | < 0.1%
140 | 1 | < 0.1%
138 | 1 | < 0.1%
133 | 1 | < 0.1%
130 | 1 | < 0.1%
114 | 1 | < 0.1%

생산품
Text

Distinct: 1813
Distinct (%): 80.8%
Missing: 5
Missing (%): 0.2%
Memory size: 17.7 KiB

Length

Max length: 85
Median length: 56
Mean length: 8.3092692
Min length: 1

Characters and Unicode

Total characters: 18646
Distinct characters: 681
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1657
Unique (%): 73.8%

Sample

1st row: 각재판재
2nd row: 주방기기, 씽크대 등
3rd row: 봉제기계
4th row: 가구,의자,소파,침대
5th row: 철물구조, 장애인보조기구,배전반
ValueCountFrequency (%)
76
 
2.1%
원단 63
 
1.8%
62
 
1.8%
가구 41
 
1.2%
목재가구 32
 
0.9%
플라스틱 27
 
0.8%
마스크 25
 
0.7%
니트원단 23
 
0.7%
염색 13
 
0.4%
편조원단 13
 
0.4%
Other values (2344) 3160
89.4%

Most occurring characters

ValueCountFrequency (%)
1302
 
7.0%
, 1205
 
6.5%
512
 
2.7%
388
 
2.1%
379
 
2.0%
305
 
1.6%
304
 
1.6%
267
 
1.4%
262
 
1.4%
250
 
1.3%
Other values (671) 13472
72.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 15314
82.1%
Space Separator 1302
 
7.0%
Other Punctuation 1238
 
6.6%
Uppercase Letter 310
 
1.7%
Open Punctuation 195
 
1.0%
Close Punctuation 194
 
1.0%
Lowercase Letter 80
 
0.4%
Decimal Number 10
 
0.1%
Dash Punctuation 3
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
512
 
3.3%
388
 
2.5%
379
 
2.5%
305
 
2.0%
304
 
2.0%
267
 
1.7%
262
 
1.7%
250
 
1.6%
242
 
1.6%
240
 
1.6%
Other values (611) 12165
79.4%
Uppercase Letter
ValueCountFrequency (%)
P 57
18.4%
C 41
13.2%
E 40
12.9%
V 24
7.7%
L 24
7.7%
T 20
 
6.5%
S 20
 
6.5%
D 18
 
5.8%
B 9
 
2.9%
A 9
 
2.9%
Other values (14) 48
15.5%
Lowercase Letter
ValueCountFrequency (%)
p 15
18.8%
e 14
17.5%
l 10
12.5%
t 9
11.2%
c 4
 
5.0%
o 3
 
3.8%
u 3
 
3.8%
a 3
 
3.8%
v 3
 
3.8%
d 2
 
2.5%
Other values (9) 14
17.5%
Decimal Number
ValueCountFrequency (%)
4 2
20.0%
9 2
20.0%
2 2
20.0%
1 1
10.0%
8 1
10.0%
0 1
10.0%
5 1
10.0%
Other Punctuation
ValueCountFrequency (%)
, 1205
97.3%
. 29
 
2.3%
/ 3
 
0.2%
: 1
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 194
99.5%
[ 1
 
0.5%
Close Punctuation
ValueCountFrequency (%)
) 193
99.5%
] 1
 
0.5%
Space Separator
ValueCountFrequency (%)
1302
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 15314
82.1%
Common 2942
 
15.8%
Latin 390
 
2.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
512
 
3.3%
388
 
2.5%
379
 
2.5%
305
 
2.0%
304
 
2.0%
267
 
1.7%
262
 
1.7%
250
 
1.6%
242
 
1.6%
240
 
1.6%
Other values (611) 12165
79.4%
Latin
ValueCountFrequency (%)
P 57
14.6%
C 41
 
10.5%
E 40
 
10.3%
V 24
 
6.2%
L 24
 
6.2%
T 20
 
5.1%
S 20
 
5.1%
D 18
 
4.6%
p 15
 
3.8%
e 14
 
3.6%
Other values (33) 117
30.0%
Common
ValueCountFrequency (%)
1302
44.3%
, 1205
41.0%
( 194
 
6.6%
) 193
 
6.6%
. 29
 
1.0%
- 3
 
0.1%
/ 3
 
0.1%
4 2
 
0.1%
9 2
 
0.1%
2 2
 
0.1%
Other values (7) 7
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
Hangul 15314
82.1%
ASCII 3332
 
17.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1302
39.1%
, 1205
36.2%
( 194
 
5.8%
) 193
 
5.8%
P 57
 
1.7%
C 41
 
1.2%
E 40
 
1.2%
. 29
 
0.9%
V 24
 
0.7%
L 24
 
0.7%
Other values (50) 223
 
6.7%
Hangul
ValueCountFrequency (%)
512
 
3.3%
388
 
2.5%
379
 
2.5%
305
 
2.0%
304
 
2.0%
267
 
1.7%
262
 
1.7%
250
 
1.6%
242
 
1.6%
240
 
1.6%
Other values (611) 12165
79.4%

원자재
Text

MISSING 

Distinct: 1142
Distinct (%): 75.7%
Missing: 741
Missing (%): 32.9%
Memory size: 17.7 KiB

Length

Max length: 479
Median length: 46
Mean length: 9.5537135
Min length: 1

Characters and Unicode

Total characters: 14407
Distinct characters: 553
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1038
Unique (%): 68.8%

Sample

1st row: 목재
2nd row: 스테인레스
3rd row: MDF,가공목재,금속자재
4th row: 철판 등
5th row: 합판, 가구부자재, 천, 내자, 철판, 파이프
ValueCountFrequency (%)
127
 
4.4%
철판 74
 
2.6%
원사 73
 
2.5%
원단 57
 
2.0%
목재 44
 
1.5%
알루미늄 36
 
1.3%
합판 35
 
1.2%
플라스틱 34
 
1.2%
mdf 34
 
1.2%
32
 
1.1%
Other values (1380) 2326
81.0%

Most occurring characters

ValueCountFrequency (%)
1854
 
12.9%
, 1552
 
10.8%
402
 
2.8%
319
 
2.2%
P 305
 
2.1%
284
 
2.0%
253
 
1.8%
227
 
1.6%
178
 
1.2%
175
 
1.2%
Other values (543) 8858
61.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 9026
62.7%
Space Separator 1854
 
12.9%
Other Punctuation 1616
 
11.2%
Uppercase Letter 1299
 
9.0%
Lowercase Letter 252
 
1.7%
Open Punctuation 144
 
1.0%
Close Punctuation 144
 
1.0%
Decimal Number 55
 
0.4%
Dash Punctuation 14
 
0.1%
Math Symbol 3
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
402
 
4.5%
319
 
3.5%
284
 
3.1%
253
 
2.8%
227
 
2.5%
178
 
2.0%
175
 
1.9%
169
 
1.9%
149
 
1.7%
149
 
1.7%
Other values (477) 6721
74.5%
Uppercase Letter
ValueCountFrequency (%)
P 305
23.5%
E 121
 
9.3%
D 111
 
8.5%
B 101
 
7.8%
C 96
 
7.4%
S 94
 
7.2%
M 90
 
6.9%
F 69
 
5.3%
L 65
 
5.0%
A 53
 
4.1%
Other values (13) 194
14.9%
Lowercase Letter
ValueCountFrequency (%)
p 26
 
10.3%
s 26
 
10.3%
l 24
 
9.5%
e 24
 
9.5%
o 19
 
7.5%
a 16
 
6.3%
c 15
 
6.0%
b 14
 
5.6%
n 12
 
4.8%
y 11
 
4.4%
Other values (12) 65
25.8%
Decimal Number
ValueCountFrequency (%)
0 15
27.3%
4 9
16.4%
3 9
16.4%
2 6
 
10.9%
6 6
 
10.9%
1 5
 
9.1%
5 3
 
5.5%
8 2
 
3.6%
Other Punctuation
ValueCountFrequency (%)
, 1552
96.0%
. 51
 
3.2%
/ 9
 
0.6%
& 3
 
0.2%
' 1
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 143
99.3%
[ 1
 
0.7%
Close Punctuation
ValueCountFrequency (%)
) 143
99.3%
] 1
 
0.7%
Math Symbol
ValueCountFrequency (%)
+ 2
66.7%
~ 1
33.3%
Space Separator
ValueCountFrequency (%)
1854
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 14
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 9026
62.7%
Common 3830
26.6%
Latin 1551
 
10.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
402
 
4.5%
319
 
3.5%
284
 
3.1%
253
 
2.8%
227
 
2.5%
178
 
2.0%
175
 
1.9%
169
 
1.9%
149
 
1.7%
149
 
1.7%
Other values (477) 6721
74.5%
Latin
ValueCountFrequency (%)
P 305
19.7%
E 121
 
7.8%
D 111
 
7.2%
B 101
 
6.5%
C 96
 
6.2%
S 94
 
6.1%
M 90
 
5.8%
F 69
 
4.4%
L 65
 
4.2%
A 53
 
3.4%
Other values (35) 446
28.8%
Common
ValueCountFrequency (%)
1854
48.4%
, 1552
40.5%
( 143
 
3.7%
) 143
 
3.7%
. 51
 
1.3%
0 15
 
0.4%
- 14
 
0.4%
/ 9
 
0.2%
4 9
 
0.2%
3 9
 
0.2%
Other values (11) 31
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
Hangul 9024
62.6%
ASCII 5381
37.3%
Compat Jamo 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1854
34.5%
, 1552
28.8%
P 305
 
5.7%
( 143
 
2.7%
) 143
 
2.7%
E 121
 
2.2%
D 111
 
2.1%
B 101
 
1.9%
C 96
 
1.8%
S 94
 
1.7%
Other values (56) 861
16.0%
Hangul
ValueCountFrequency (%)
402
 
4.5%
319
 
3.5%
284
 
3.1%
253
 
2.8%
227
 
2.5%
178
 
2.0%
175
 
1.9%
169
 
1.9%
149
 
1.7%
149
 
1.7%
Other values (475) 6719
74.5%
Compat Jamo
ValueCountFrequency (%)
1
50.0%
1
50.0%

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
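
One common way to define nullity correlation is the Pearson correlation between the present/absent indicators of two columns. The sketch below implements that definition with the standard library only; it is an assumption about the measure, not the profiler's own code, and both the helper name `nullity_correlation` and the toy columns are made up for illustration.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns.

    A value is treated as missing when it is None. Returns NaN when either
    column is entirely present or entirely missing (zero variance).
    """
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")
    return cov / sqrt(var_a * var_b)


# Toy columns (made-up values, not taken from the dataset).
products = ["a", None, "c", "d", None, "f"]
materials = ["x", None, "z", None, None, "w"]
result = nullity_correlation(products, materials)
print(round(result, 3))
```

A value near 1 means the two columns tend to be missing together, a value near -1 means one is present when the other is absent, and a value near 0 means missingness in one says little about the other.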

Sample

index | 회사명 | 공장주소(도로명) | 공장주소(지번) | 업종명 | 전화번호 | 종업원수(총인원) | 생산품 | 원자재
0 | (주)성원목재 | 경기도 포천시 내촌면 부마로282번길 42 | 경기도 포천시 내촌면 마명리 110-8번지 | 일반 제재업 | 031-533-8740 | 2 | 각재판재 | 목재
1 | 주식회사 대명코리아 | 경기도 포천시 가산면 정교리 12-11 외 1필지 | 경기도 포천시 가산면 정교리 12-11 외 1필지 | 수동식 식품 가공기기 및 금속 주방용기 제조업 외 1 종 | 031-595-7997 | 30 | 주방기기, 씽크대 등 | 스테인레스
2 | (주)동양봉제기계 | 경기도 포천시 금강로 2927 (내촌면) 1층 | 경기도 포천시 내촌면 음현리 688-1 1층 | 기타 섬유, 의복 및 가죽 가공 기계 제조업 | 031-531-3579 | 8 | 봉제기계 | <NA>
3 | (사)우리들행복나눔(종합가구사업단) | 경기도 포천시 내촌면 작은넙고개1길 84, 가동 | 경기도 포천시 내촌면 진목리 915-9번지 가동 | 기타 목재가구 제조업 외 6 종 | 031-591-8353 | 12 | 가구,의자,소파,침대 | MDF,가공목재,금속자재
4 | (사)한국교통장애인협회 사업부 | 경기도 포천시 가산면 정금로 183-23, 마,라동 (가산면) | 경기도 포천시 가산면 정교리 211-3번지 마,라동 | 구조용 금속 판제품 및 공작물 제조업 외 3 종 | 031-544-3133 | 7 | 철물구조, 장애인보조기구,배전반 | 철판 등
5 | (사)한국척수장애인협회(가구사업소) | 경기도 포천시 가산면 마정로 61 | 경기도 포천시 가산면 마산리 299-8번지 | 기타 목재가구 제조업 외 4 종 | 031-544-0410 | 18 | 사무용가구(완제품), 붙받이장(반제품) | 합판, 가구부자재, 천, 내자, 철판, 파이프
6 | (재)경기대진테크노파크 | 경기도 포천시 가산면 포천로912번길 147-2 | 경기도 포천시 가산면 마전리 565-13번지 | 기타 목재가구 제조업 | 031-539-5000 | 3 | 목재가구 | 목재
7 | (주) 강원그린석재 포천공장 | 경기도 포천시 영중면 가영로535번길 119 | 경기도 포천시 영중면 영송리 711-13번지 | 건설용 석제품 제조업 | 031-534-4160 | 5 | 도로경계석 | 원석
8 | (주) 거산 | 경기도 포천시 군내면 명산리 1 외 1필지 | 경기도 포천시 군내면 명산리 1 외 1필지 | 그 외 기타 플라스틱 제품 제조업 | 031-536-2253 | 7 | 비닐장갑 비닐롤백 | HDPEH포리에칠엔
9 | (주) 로드텍이엔지 | 경기도 포천시 죽엽산로237번길 35 (소흘읍) 외 1필지 | 경기도 포천시 소흘읍 고모리 474-3 외 1필지 | 일반저울 제조업 외 1 종 | 031-541-5849 | 6 | 전자저울 | 철판,앵글,로드셀

index | 회사명 | 공장주소(도로명) | 공장주소(지번) | 업종명 | 전화번호 | 종업원수(총인원) | 생산품 | 원자재
2239 | 효성섬유 | 경기도 포천시 소흘읍 화합로300번길 33 | 경기도 포천시 소흘읍 송우리 265번지 | 편조원단 제조업 | 031-543-3218 | 4 | 원단 | <NA>
2240 | 효성섬유 | 경기도 포천시 가산면 정금로473번길 63 | 경기도 포천시 가산면 금현리 422-9번지 | 편조원단 제조업 | 031-543-1395 | 20 | 원단 | 편직원단
2241 | 효원석재산업(주) | 경기도 포천시 일동면 금강로 4380 (일동면) 외 3필지 | 경기도 포천시 일동면 기산리 435-7번지 외 3필지 | 건설용 석제품 제조업 외 1 종 | 031-531-1593 | 8 | 건축자재, 전시조감품등 | 석재
2242 | 효진산업 | 경기도 포천시 가산면 너배기1길 12 | 경기도 포천시 가산면 정교리 105-1번지 | 기타 목재가구 제조업 | 031-533-5371 | 5 | 목재가구 | MDF, PB, 합판
2243 | 효천푸드 | 경기도 포천시 가산면 마전리 614 | 경기도 포천시 가산면 마전리 614 | 면류, 마카로니 및 유사식품 제조업 | 031-541-8377 | 12 | 국수 | 밀가루
2244 | 효천푸드 | 경기도 포천시 가산면 포천로887번길 73 (총 2 필지) 외 1필지 | 경기도 포천시 가산면 감암리 162-1번지 외 1필지 | 면류, 마카로니 및 유사식품 제조업 외 2 종 | 031-541-8377 | 12 | 국수 | <NA>
2245 | 후덕한금속(주) | 경기도 포천시 화합로 248 (동교동) (총 2 필지) | 경기도 포천시 동교동 504-6번지 | 동 압연, 압출 및 연신제품 제조업 | 031-541-3763 | 36 | 동관 | <NA>
2246 | 흥덕텍스타일 | 경기도 포천시 설운동 559-2 | 경기도 포천시 설운동 559-2 | 편조원단 제조업 | 031-541-4560 | 3 | 니트, 니트원단 | 원사
2247 | 희훈산업 | 경기도 포천시 정금로162번길 32 (소흘읍) | 경기도 포천시 소흘읍 고모리 1-2 | 플라스틱 시트 및 판 제조업 | 031-542-3315 | 8 | p.p SHEET | P.P
2248 | 히포물산 | 경기도 포천시 소흘읍 소흘로 28 | 경기도 포천시 소흘읍 무봉리 262-1번지 | 스타킹 및 기타 양말 제조업 | 031-543-1600 | 10 | 양말 | <NA>

Duplicate rows

Most frequently occurring

index | 회사명 | 공장주소(도로명) | 공장주소(지번) | 업종명 | 전화번호 | 종업원수(총인원) | 생산품 | 원자재 | # duplicates
0 | 대진정밀 | 경기도 포천시 가산면 시우동길 59-2 | 경기도 포천시 가산면 가산리 480-1번지 | 자동차 차체용 신품 부품 제조업 외 1 종 | 02-955-7067 | 4 | 자동차부품 | <NA> | 2
1 | 주식회사 제이씨솔루션 | 경기도 포천시 가산면 너배기길 34-21 | 경기도 포천시 가산면 금현리 1132-33 | 전시 및 광고용 조명장치 제조업 외 1 종 | 031-969-9873 | 8 | 안내전광판, 스코어보그전광판 | LED모듈, SMPS, 컨트롤러 | 2
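
A table like this can be derived by counting how often each complete row occurs and keeping the rows seen more than once. The sketch below does so with `collections.Counter`; `duplicate_groups` is a hypothetical helper, and the rows are shortened to two columns (회사명, 전화번호) for readability, whereas a real check would compare every column.

```python
from collections import Counter


def duplicate_groups(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Rows shortened to (회사명, 전화번호); values taken from the tables above.
rows = [
    ("대진정밀", "02-955-7067"),
    ("주식회사 제이씨솔루션", "031-969-9873"),
    ("효성섬유", "031-543-3218"),
    ("대진정밀", "02-955-7067"),
    ("주식회사 제이씨솔루션", "031-969-9873"),
]

groups = duplicate_groups(rows)
for row, n in groups.items():
    print(row, n)
```

Rows are represented as tuples so that they are hashable and can be used as `Counter` keys.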