Overview

Dataset statistics

Number of variables: 9
Number of observations: 856
Missing cells: 2469
Missing cells (%): 32.0%
Duplicate rows: 45
Duplicate rows (%): 5.3%
Total size in memory: 60.3 KiB
Average record size in memory: 72.2 B
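
The two percentages in this table follow from the raw counts. The sketch below recomputes them, assuming the missing-cell share is taken over all cells (variables × observations) and the duplicate share over all rows; the variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 9
n_observations = 856
missing_cells = 2469
duplicate_rows = 45

# Assumption: every variable contributes one cell per observation.
total_cells = n_variables * n_observations

missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```

Both values round to the figures reported above (32.0% and 5.3%).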

Variable types

Text: 6
DateTime: 2
Categorical: 1

Dataset

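Description: Status of overseas manufacturing plants (high-pressure gas cylinders, refrigeration units, etc.) that have passed review so that their products can be distributed in Korea. Fields include registration number, company name, registration type, address, and representative.
Author: Korea Gas Safety Corporation (한국가스안전공사)
URL: https://www.data.go.kr/data/15043845/fileData.do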

Alerts

Dataset has 45 (5.3%) duplicate rows | Duplicates
업소(Company Name) has 390 (45.6%) missing values | Missing
최초등록일(Initial Date) has 391 (45.7%) missing values | Missing
만료일(Expiry Date) has 391 (45.7%) missing values | Missing
등록종류(Type) has 336 (39.3%) missing values | Missing
제조기준(Standard) has 181 (21.1%) missing values | Missing
소재지(Address) has 390 (45.6%) missing values | Missing
대표자(Representative) has 390 (45.6%) missing values | Missing

Reproduction

Analysis started: 2023-12-12 16:03:53.504352
Analysis finished: 2023-12-12 16:03:54.260697
Duration: 0.76 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

등록번호(Cert_No)
Text

Distinct: 466
Distinct (%): 54.4%
Missing: 0
Missing (%): 0.0%
Memory size: 6.8 KiB

Length

Max length: 10
Median length: 9
Mean length: 9.1028037
Min length: 8

Characters and Unicode

Total characters: 7792
Distinct characters: 16
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
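
As a rough illustration of how such per-category counts can be obtained, the sketch below uses Python's standard `unicodedata` module on three registration numbers taken from the Sample block. The helper `category_counts` and the `CATEGORY_NAMES` mapping are hypothetical names introduced here, not part of the profiling tool. `unicodedata` exposes general categories only; script and block lookups would need another source.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in the sample values.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
}

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the two-letter code for categories not listed above.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

sample = ["제 ES-1 호", "제 ES-2 호", "제 ES-7 호"]
counts = category_counts(sample)
for name, count in counts.most_common():
    print(name, count)
```

Each sample value contributes two Hangul letters, two spaces, two uppercase letters, one dash, and one digit, which mirrors the five categories reported for this variable.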

Unique

Unique: 251
Unique (%): 29.3%

Sample

1st row: 제 ES-1 호
2nd row: 제 ES-1 호
3rd row: 제 ES-2 호
4th row: 제 ES-7 호
5th row: 제 ES-7 호
Value | Count | Frequency (%)
제 | 856 | 44.2%
호 | 224 | 11.6%
es-522 | 8 | 0.4%
es-27 | 8 | 0.4%
es-19 | 8 | 0.4%
es-201호 | 8 | 0.4%
es-209호 | 7 | 0.4%
es-9 | 7 | 0.4%
es-112호 | 7 | 0.4%
es-163호 | 6 | 0.3%
Other values (458) | 797 | 41.2%
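
The tokens in this table (for example `es-201호`) suggest that values are lowercased and split on whitespace before counting. The sketch below reproduces that behaviour under this assumption; `word_frequencies` is a hypothetical helper, not the profiling tool's own function.

```python
from collections import Counter

def word_frequencies(values):
    """Count lowercased, whitespace-separated tokens, skipping missing values."""
    counts = Counter()
    for value in values:
        if value is None:
            continue
        counts.update(value.lower().split())
    return counts

# Two spacing variants of the registration number appear in the data.
values = ["제 ES-1 호", "제 ES-1 호", "제 ES-201호"]
freq = word_frequencies(values)
total = sum(freq.values())
for token, count in freq.most_common():
    print(token, count, f"{100 * count / total:.1f}%")
```

Because the trailing 호 is sometimes attached to the number and sometimes separated by a space, it is counted as its own token only in the spaced variant. This is why 호 appears far less often than 제 in the table.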

Most occurring characters

Value | Count | Frequency (%)
(space) | 1080 | 13.9%
제 | 856 | 11.0%
E | 856 | 11.0%
S | 856 | 11.0%
- | 856 | 11.0%
호 | 856 | 11.0%
2 | 388 | 5.0%
1 | 351 | 4.5%
4 | 324 | 4.2%
3 | 291 | 3.7%
Other values (6) | 1078 | 13.8%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 2432 | 31.2%
Other Letter | 1712 | 22.0%
Uppercase Letter | 1712 | 22.0%
Space Separator | 1080 | 13.9%
Dash Punctuation | 856 | 11.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 388
16.0%
1 351
14.4%
4 324
13.3%
3 291
12.0%
5 276
11.3%
0 173
7.1%
7 169
6.9%
9 161
6.6%
6 155
 
6.4%
8 144
 
5.9%
Other Letter
ValueCountFrequency (%)
856
50.0%
856
50.0%
Uppercase Letter
ValueCountFrequency (%)
E 856
50.0%
S 856
50.0%
Space Separator
ValueCountFrequency (%)
1080
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 856
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 4368 | 56.1%
Hangul | 1712 | 22.0%
Latin | 1712 | 22.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1080
24.7%
- 856
19.6%
2 388
 
8.9%
1 351
 
8.0%
4 324
 
7.4%
3 291
 
6.7%
5 276
 
6.3%
0 173
 
4.0%
7 169
 
3.9%
9 161
 
3.7%
Other values (2) 299
 
6.8%
Hangul
ValueCountFrequency (%)
856
50.0%
856
50.0%
Latin
ValueCountFrequency (%)
E 856
50.0%
S 856
50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 6080 | 78.0%
Hangul | 1712 | 22.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1080
17.8%
E 856
14.1%
S 856
14.1%
- 856
14.1%
2 388
 
6.4%
1 351
 
5.8%
4 324
 
5.3%
3 291
 
4.8%
5 276
 
4.5%
0 173
 
2.8%
Other values (4) 629
10.3%
Hangul
ValueCountFrequency (%)
856
50.0%
856
50.0%

업소(Company Name)
Text

MISSING 

Distinct: 443
Distinct (%): 95.1%
Missing: 390
Missing (%): 45.6%
Memory size: 6.8 KiB

Length

Max length: 90
Median length: 53
Mean length: 30.291845
Min length: 6

Characters and Unicode

Total characters: 14116
Distinct characters: 69
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 421
Unique (%): 90.3%

Sample

1st row: KANTO KOATSU-YOKI MFG. CO., LTD.
2nd row: BENKAN KIKOH Corporation
3rd row: Taylor-Wharton Malaysia Sdn. Bhd.
4th row: Faber Industrie spa
5th row: SCI(Structural Composites Industries)
ValueCountFrequency (%)
ltd 127
 
6.5%
co 110
 
5.6%
gmbh 59
 
3.0%
equipment 35
 
1.8%
inc 34
 
1.7%
31
 
1.6%
co.,ltd 26
 
1.3%
engineering 26
 
1.3%
corporation 23
 
1.2%
company 21
 
1.1%
Other values (759) 1476
75.0%

Most occurring characters

ValueCountFrequency (%)
1523
 
10.8%
n 775
 
5.5%
e 694
 
4.9%
a 673
 
4.8%
i 670
 
4.7%
o 601
 
4.3%
t 522
 
3.7%
r 519
 
3.7%
C 436
 
3.1%
. 424
 
3.0%
Other values (59) 7279
51.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7445
52.7%
Uppercase Letter 4310
30.5%
Space Separator 1523
 
10.8%
Other Punctuation 641
 
4.5%
Open Punctuation 77
 
0.5%
Close Punctuation 77
 
0.5%
Dash Punctuation 29
 
0.2%
Math Symbol 8
 
0.1%
Decimal Number 5
 
< 0.1%
Other Letter 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 775
 
10.4%
e 694
 
9.3%
a 673
 
9.0%
i 670
 
9.0%
o 601
 
8.1%
t 522
 
7.0%
r 519
 
7.0%
s 367
 
4.9%
l 296
 
4.0%
u 283
 
3.8%
Other values (16) 2045
27.5%
Uppercase Letter
ValueCountFrequency (%)
C 436
 
10.1%
E 355
 
8.2%
A 347
 
8.1%
S 337
 
7.8%
L 334
 
7.7%
I 287
 
6.7%
T 266
 
6.2%
H 196
 
4.5%
R 187
 
4.3%
N 187
 
4.3%
Other values (16) 1378
32.0%
Other Punctuation
ValueCountFrequency (%)
. 424
66.1%
, 179
27.9%
& 35
 
5.5%
/ 2
 
0.3%
: 1
 
0.2%
Decimal Number
ValueCountFrequency (%)
2 2
40.0%
7 1
20.0%
1 1
20.0%
3 1
20.0%
Open Punctuation
ValueCountFrequency (%)
( 75
97.4%
[ 2
 
2.6%
Close Punctuation
ValueCountFrequency (%)
) 75
97.4%
] 2
 
2.6%
Space Separator
ValueCountFrequency (%)
1523
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 29
100.0%
Math Symbol
ValueCountFrequency (%)
+ 8
100.0%
Other Letter
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11755
83.3%
Common 2360
 
16.7%
Hangul 1
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 775
 
6.6%
e 694
 
5.9%
a 673
 
5.7%
i 670
 
5.7%
o 601
 
5.1%
t 522
 
4.4%
r 519
 
4.4%
C 436
 
3.7%
s 367
 
3.1%
E 355
 
3.0%
Other values (42) 6143
52.3%
Common
ValueCountFrequency (%)
1523
64.5%
. 424
 
18.0%
, 179
 
7.6%
( 75
 
3.2%
) 75
 
3.2%
& 35
 
1.5%
- 29
 
1.2%
+ 8
 
0.3%
/ 2
 
0.1%
2 2
 
0.1%
Other values (6) 8
 
0.3%
Hangul
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14115
> 99.9%
Hangul 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1523
 
10.8%
n 775
 
5.5%
e 694
 
4.9%
a 673
 
4.8%
i 670
 
4.7%
o 601
 
4.3%
t 522
 
3.7%
r 519
 
3.7%
C 436
 
3.1%
. 424
 
3.0%
Other values (58) 7278
51.6%
Hangul
ValueCountFrequency (%)
1
100.0%

최초등록일(Initial Date)
DateTime

MISSING

Distinct: 393
Distinct (%): 84.5%
Missing: 391
Missing (%): 45.7%
Memory size: 6.8 KiB
Minimum: 2003-06-10 00:00:00
Maximum: 2023-06-15 00:00:00
Histogram with fixed size bins (bins=50)
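
Fixed-size binning of dates can be sketched as follows. This is an illustrative equal-width scheme over the observed range, not necessarily the exact routine behind the plot; `fixed_width_bins` is a hypothetical helper and the three dates are toy inputs that include the reported minimum and maximum.

```python
from datetime import date

def fixed_width_bins(values, bins=50):
    """Count dates per equal-width bin spanning min(values)..max(values)."""
    lo, hi = min(values), max(values)
    span = (hi - lo).days
    counts = [0] * bins
    for v in values:
        if span == 0:
            idx = 0  # all values identical: everything lands in the first bin
        else:
            # Floor of the relative position; the maximum goes in the last bin.
            idx = min((v - lo).days * bins // span, bins - 1)
        counts[idx] += 1
    return counts

dates = [date(2003, 6, 10), date(2003, 7, 10), date(2023, 6, 15)]
counts = fixed_width_bins(dates, bins=50)
print(counts[0], counts[-1])
```

With 50 bins over roughly twenty years, each bin covers about five months, so the two 2003 dates share the first bin and the 2023 date falls in the last.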

만료일(Expiry Date)
DateTime

MISSING

Distinct: 413
Distinct (%): 88.8%
Missing: 391
Missing (%): 45.7%
Memory size: 6.8 KiB
Minimum: 2009-08-19 00:00:00
Maximum: 2026-07-20 00:00:00
Histogram with fixed size bins (bins=50)

등록종류(Type)
Text

MISSING 

Distinct163
Distinct (%)31.3%
Missing336
Missing (%)39.3%
Memory size6.8 KiB

Length

Max length104
Median length85
Mean length43.098077
Min length2

Characters and Unicode

Total characters22411
Distinct characters84
Distinct categories10 ?
Distinct scripts3 ?
Distinct blocks5 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique116 ?
Unique (%)22.3%

Sample

1st rowSeamless Steel Cylinder(Spinning Type)
2nd rowWelded Cylinder (Internal Volume under 500L)
3rd rowSeamless Steel Cylinder(Spinning Type)
4th rowCryogenic Gas Cylinders(with internal volume under 500ℓ)
5th rowSeamless Steel Cylinders
ValueCountFrequency (%)
pressure 274
 
10.7%
tower 170
 
6.7%
others 163
 
6.4%
for 152
 
6.0%
drum 146
 
5.7%
exchanger 122
 
4.8%
high-pressure 101
 
4.0%
vessels(heat 99
 
3.9%
gases 95
 
3.7%
vessels(reactor 82
 
3.2%
Other values (154) 1148
45.0%

Most occurring characters

ValueCountFrequency (%)
e 3254
14.5%
s 2432
 
10.9%
r 2201
 
9.8%
2058
 
9.2%
a 865
 
3.9%
t 801
 
3.6%
o 783
 
3.5%
l 736
 
3.3%
u 684
 
3.1%
i 670
 
3.0%
Other values (74) 7927
35.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 16441
73.4%
Uppercase Letter 2315
 
10.3%
Space Separator 2058
 
9.2%
Other Punctuation 569
 
2.5%
Open Punctuation 375
 
1.7%
Close Punctuation 374
 
1.7%
Dash Punctuation 133
 
0.6%
Decimal Number 101
 
0.5%
Other Letter 40
 
0.2%
Math Symbol 5
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 3254
19.8%
s 2432
14.8%
r 2201
13.4%
a 865
 
5.3%
t 801
 
4.9%
o 783
 
4.8%
l 736
 
4.5%
u 684
 
4.2%
i 670
 
4.1%
n 580
 
3.5%
Other values (15) 3435
20.9%
Other Letter
ValueCountFrequency (%)
7
17.5%
7
17.5%
2
 
5.0%
2
 
5.0%
2
 
5.0%
2
 
5.0%
2
 
5.0%
1
 
2.5%
1
 
2.5%
1
 
2.5%
Other values (13) 13
32.5%
Uppercase Letter
ValueCountFrequency (%)
P 306
13.2%
V 296
12.8%
T 233
10.1%
R 197
8.5%
H 193
8.3%
D 189
8.2%
O 174
7.5%
S 168
7.3%
C 157
6.8%
E 146
6.3%
Other values (9) 256
11.1%
Decimal Number
ValueCountFrequency (%)
0 64
63.4%
5 32
31.7%
1 2
 
2.0%
2 1
 
1.0%
4 1
 
1.0%
3 1
 
1.0%
Other Punctuation
ValueCountFrequency (%)
, 566
99.5%
: 2
 
0.4%
& 1
 
0.2%
Math Symbol
ValueCountFrequency (%)
< 3
60.0%
1
 
20.0%
> 1
 
20.0%
Open Punctuation
ValueCountFrequency (%)
( 374
99.7%
1
 
0.3%
Space Separator
ValueCountFrequency (%)
2058
100.0%
Close Punctuation
ValueCountFrequency (%)
) 374
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 133
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 18751
83.7%
Common 3620
 
16.2%
Hangul 40
 
0.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 3254
17.4%
s 2432
13.0%
r 2201
 
11.7%
a 865
 
4.6%
t 801
 
4.3%
o 783
 
4.2%
l 736
 
3.9%
u 684
 
3.6%
i 670
 
3.6%
n 580
 
3.1%
Other values (33) 5745
30.6%
Hangul
ValueCountFrequency (%)
7
17.5%
7
17.5%
2
 
5.0%
2
 
5.0%
2
 
5.0%
2
 
5.0%
2
 
5.0%
1
 
2.5%
1
 
2.5%
1
 
2.5%
Other values (13) 13
32.5%
Common
ValueCountFrequency (%)
2058
56.9%
, 566
 
15.6%
( 374
 
10.3%
) 374
 
10.3%
- 133
 
3.7%
0 64
 
1.8%
5 32
 
0.9%
5
 
0.1%
< 3
 
0.1%
1 2
 
0.1%
Other values (8) 9
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 22364
99.8%
Hangul 40
 
0.2%
Letterlike Symbols 5
 
< 0.1%
Math Operators 1
 
< 0.1%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 3254
14.6%
s 2432
 
10.9%
r 2201
 
9.8%
2058
 
9.2%
a 865
 
3.9%
t 801
 
3.6%
o 783
 
3.5%
l 736
 
3.3%
u 684
 
3.1%
i 670
 
3.0%
Other values (48) 7880
35.2%
Hangul
ValueCountFrequency (%)
7
17.5%
7
17.5%
2
 
5.0%
2
 
5.0%
2
 
5.0%
2
 
5.0%
2
 
5.0%
1
 
2.5%
1
 
2.5%
1
 
2.5%
Other values (13) 13
32.5%
Letterlike Symbols
ValueCountFrequency (%)
5
100.0%
Math Operators
ValueCountFrequency (%)
1
100.0%
None
ValueCountFrequency (%)
1
100.0%

제조기준(Standard)
Text

MISSING

Distinct: 179
Distinct (%): 26.5%
Missing: 181
Missing (%): 21.1%
Memory size: 6.8 KiB

Length

Max length73
Median length35
Mean length14.937778
Min length5

Characters and Unicode

Total characters10083
Distinct characters83
Distinct categories11 ?
Distinct scripts4 ?
Distinct blocks4 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique116 ?
Unique (%)17.2%

Sample

1st rowHPGSL
2nd rowHPGSL
3rd rowHPGSL
4th rowDOT-4L
5th rowPED+EN ISO9809-1
ValueCountFrequency (%)
asme 330
19.1%
sec.viii 299
17.3%
div.1 233
13.5%
91
 
5.3%
ped 69
 
4.0%
ad2000 49
 
2.8%
kgs 49
 
2.8%
div.2 46
 
2.7%
en 24
 
1.4%
tped 24
 
1.4%
Other values (182) 510
29.6%

Most occurring characters

ValueCountFrequency (%)
1057
 
10.5%
I 1004
 
10.0%
S 804
 
8.0%
. 656
 
6.5%
D 580
 
5.8%
A 571
 
5.7%
E 548
 
5.4%
1 496
 
4.9%
M 339
 
3.4%
e 331
 
3.3%
Other values (73) 3697
36.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 5023
49.8%
Decimal Number 1413
 
14.0%
Lowercase Letter 1382
 
13.7%
Space Separator 1057
 
10.5%
Other Punctuation 912
 
9.0%
Math Symbol 123
 
1.2%
Dash Punctuation 91
 
0.9%
Open Punctuation 22
 
0.2%
Close Punctuation 22
 
0.2%
Other Letter 22
 
0.2%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 1004
20.0%
S 804
16.0%
D 580
11.5%
A 571
11.4%
E 548
10.9%
M 339
 
6.7%
V 316
 
6.3%
P 188
 
3.7%
G 112
 
2.2%
T 99
 
2.0%
Other values (14) 462
9.2%
Lowercase Letter
ValueCountFrequency (%)
e 331
24.0%
i 324
23.4%
c 323
23.4%
v 312
22.6%
t 13
 
0.9%
n 12
 
0.9%
r 12
 
0.9%
a 10
 
0.7%
o 8
 
0.6%
p 5
 
0.4%
Other values (13) 32
 
2.3%
Other Letter
ValueCountFrequency (%)
2
9.1%
2
9.1%
2
9.1%
2
9.1%
2
9.1%
2
9.1%
2
9.1%
2
9.1%
1
 
4.5%
1
 
4.5%
Other values (4) 4
18.2%
Decimal Number
ValueCountFrequency (%)
1 496
35.1%
0 240
17.0%
2 224
15.9%
3 132
 
9.3%
4 79
 
5.6%
6 70
 
5.0%
9 69
 
4.9%
5 44
 
3.1%
7 30
 
2.1%
8 29
 
2.1%
Other Punctuation
ValueCountFrequency (%)
. 656
71.9%
, 239
 
26.2%
/ 13
 
1.4%
& 3
 
0.3%
: 1
 
0.1%
Letter Number
ValueCountFrequency (%)
12
75.0%
4
 
25.0%
Space Separator
ValueCountFrequency (%)
1057
100.0%
Math Symbol
ValueCountFrequency (%)
+ 123
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 91
100.0%
Open Punctuation
ValueCountFrequency (%)
( 22
100.0%
Close Punctuation
ValueCountFrequency (%)
) 22
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6421
63.7%
Common 3640
36.1%
Han 20
 
0.2%
Katakana 2
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 1004
15.6%
S 804
12.5%
D 580
9.0%
A 571
8.9%
E 548
8.5%
M 339
 
5.3%
e 331
 
5.2%
i 324
 
5.0%
c 323
 
5.0%
V 316
 
4.9%
Other values (39) 1281
20.0%
Common
ValueCountFrequency (%)
1057
29.0%
. 656
18.0%
1 496
13.6%
0 240
 
6.6%
, 239
 
6.6%
2 224
 
6.2%
3 132
 
3.6%
+ 123
 
3.4%
- 91
 
2.5%
4 79
 
2.2%
Other values (10) 303
 
8.3%
Han
ValueCountFrequency (%)
2
10.0%
2
10.0%
2
10.0%
2
10.0%
2
10.0%
2
10.0%
2
10.0%
2
10.0%
1
5.0%
1
5.0%
Other values (2) 2
10.0%
Katakana
ValueCountFrequency (%)
1
50.0%
1
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10045
99.6%
CJK 20
 
0.2%
Number Forms 16
 
0.2%
Katakana 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1057
 
10.5%
I 1004
 
10.0%
S 804
 
8.0%
. 656
 
6.5%
D 580
 
5.8%
A 571
 
5.7%
E 548
 
5.5%
1 496
 
4.9%
M 339
 
3.4%
e 331
 
3.3%
Other values (57) 3659
36.4%
Number Forms
ValueCountFrequency (%)
12
75.0%
4
 
25.0%
CJK
ValueCountFrequency (%)
2
10.0%
2
10.0%
2
10.0%
2
10.0%
2
10.0%
2
10.0%
2
10.0%
2
10.0%
1
5.0%
1
5.0%
Other values (2) 2
10.0%
Katakana
ValueCountFrequency (%)
1
50.0%
1
50.0%

소재지(Address)
Text

MISSING 

Distinct463
Distinct (%)99.4%
Missing390
Missing (%)45.6%
Memory size6.8 KiB

Length

Max length152
Median length101
Mean length58.684549
Min length29

Characters and Unicode

Total characters27347
Distinct characters80
Distinct categories13 ?
Distinct scripts3 ?
Distinct blocks4 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique460 ?
Unique (%)98.7%

Sample

1st row153-1, Tottori-machi, Maebashi-shi, Gunma-pref., JAPAN
2nd row1-1, Fuso-cho, Amagasaki-shi, Hyogo-pref., JAPAN
3rd rowLot. No. PT 5073 Jalan Jangur 28/43, Hicom Industrial Estate 40400 Shah Alam, Selangor Malaysia
4th rowVia dell'Industria 23-33043 Cividale del Friuli, Udine-Italy
5th row 336 Enterprise Place, Pomona, CA 91768 USA
ValueCountFrequency (%)
road 107
 
2.8%
china 106
 
2.7%
usa 74
 
1.9%
germany 56
 
1.5%
city 55
 
1.4%
japan 49
 
1.3%
province 48
 
1.2%
industrial 42
 
1.1%
41
 
1.1%
district 37
 
1.0%
Other values (1974) 3246
84.1%

Most occurring characters

ValueCountFrequency (%)
3506
 
12.8%
a 1885
 
6.9%
n 1469
 
5.4%
i 1347
 
4.9%
e 1233
 
4.5%
, 1153
 
4.2%
o 1021
 
3.7%
r 861
 
3.1%
t 745
 
2.7%
h 658
 
2.4%
Other values (70) 13469
49.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 13769
50.3%
Uppercase Letter 5077
 
18.6%
Space Separator 3506
 
12.8%
Decimal Number 3113
 
11.4%
Other Punctuation 1444
 
5.3%
Dash Punctuation 348
 
1.3%
Open Punctuation 40
 
0.1%
Close Punctuation 40
 
0.1%
Final Punctuation 5
 
< 0.1%
Modifier Symbol 2
 
< 0.1%
Other values (3) 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1885
13.7%
n 1469
10.7%
i 1347
9.8%
e 1233
 
9.0%
o 1021
 
7.4%
r 861
 
6.3%
t 745
 
5.4%
h 658
 
4.8%
u 601
 
4.4%
s 597
 
4.3%
Other values (17) 3352
24.3%
Uppercase Letter
ValueCountFrequency (%)
A 523
 
10.3%
S 438
 
8.6%
C 369
 
7.3%
N 334
 
6.6%
R 307
 
6.0%
I 295
 
5.8%
P 238
 
4.7%
T 223
 
4.4%
D 208
 
4.1%
L 187
 
3.7%
Other values (17) 1955
38.5%
Decimal Number
ValueCountFrequency (%)
1 538
17.3%
0 481
15.5%
2 376
12.1%
3 325
10.4%
5 309
9.9%
4 300
9.6%
6 223
7.2%
7 213
 
6.8%
8 191
 
6.1%
9 157
 
5.0%
Other Punctuation
ValueCountFrequency (%)
, 1153
79.8%
. 242
 
16.8%
/ 25
 
1.7%
' 9
 
0.6%
& 8
 
0.6%
: 4
 
0.3%
# 3
 
0.2%
Space Separator
ValueCountFrequency (%)
3506
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 348
100.0%
Open Punctuation
ValueCountFrequency (%)
( 40
100.0%
Close Punctuation
ValueCountFrequency (%)
) 40
100.0%
Final Punctuation
ValueCountFrequency (%)
5
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 2
100.0%
Other Letter
ValueCountFrequency (%)
º 1
100.0%
Initial Punctuation
ValueCountFrequency (%)
1
100.0%
Math Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 18846
68.9%
Common 8500
31.1%
Greek 1
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1885
 
10.0%
n 1469
 
7.8%
i 1347
 
7.1%
e 1233
 
6.5%
o 1021
 
5.4%
r 861
 
4.6%
t 745
 
4.0%
h 658
 
3.5%
u 601
 
3.2%
s 597
 
3.2%
Other values (44) 8429
44.7%
Common
ValueCountFrequency (%)
3506
41.2%
, 1153
 
13.6%
1 538
 
6.3%
0 481
 
5.7%
2 376
 
4.4%
- 348
 
4.1%
3 325
 
3.8%
5 309
 
3.6%
4 300
 
3.5%
. 242
 
2.8%
Other values (15) 922
 
10.8%
Greek
ValueCountFrequency (%)
Φ 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 27335
> 99.9%
Punctuation 6
 
< 0.1%
None 5
 
< 0.1%
Math Operators 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3506
 
12.8%
a 1885
 
6.9%
n 1469
 
5.4%
i 1347
 
4.9%
e 1233
 
4.5%
, 1153
 
4.2%
o 1021
 
3.7%
r 861
 
3.1%
t 745
 
2.7%
h 658
 
2.4%
Other values (64) 13457
49.2%
Punctuation
ValueCountFrequency (%)
5
83.3%
1
 
16.7%
None
ValueCountFrequency (%)
ß 3
60.0%
Φ 1
 
20.0%
º 1
 
20.0%
Math Operators
ValueCountFrequency (%)
1
100.0%

대표자(Representative)
Text

MISSING

Distinct: 431
Distinct (%): 92.5%
Missing: 390
Missing (%): 45.6%
Memory size: 6.8 KiB

Length

Max length45
Median length33
Mean length14.306867
Min length3

Characters and Unicode

Total characters6667
Distinct characters63
Distinct categories8 ?
Distinct scripts3 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique397 ?
Unique (%)85.2%

Sample

1st rowToshiyuki Yabata
2nd rowMANABU ITCHODA
3rd rowConnie Liew Yoke Kien
4th rowAlberto Agnoletti
5th rowAmit Sedha
ValueCountFrequency (%)
mr 13
 
1.2%
dr 9
 
0.9%
michael 8
 
0.8%
zhang 8
 
0.8%
jiang 7
 
0.7%
john 6
 
0.6%
chen 6
 
0.6%
thomas 6
 
0.6%
li 6
 
0.6%
steve 5
 
0.5%
Other values (745) 982
93.0%

Most occurring characters

ValueCountFrequency (%)
610
 
9.1%
a 516
 
7.7%
i 432
 
6.5%
n 418
 
6.3%
e 404
 
6.1%
r 316
 
4.7%
o 279
 
4.2%
h 200
 
3.0%
u 185
 
2.8%
g 180
 
2.7%
Other values (53) 3127
46.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4020
60.3%
Uppercase Letter 1938
29.1%
Space Separator 610
 
9.1%
Other Punctuation 83
 
1.2%
Dash Punctuation 5
 
0.1%
Open Punctuation 4
 
0.1%
Close Punctuation 4
 
0.1%
Other Letter 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 516
12.8%
i 432
10.7%
n 418
10.4%
e 404
10.0%
r 316
 
7.9%
o 279
 
6.9%
h 200
 
5.0%
u 185
 
4.6%
g 180
 
4.5%
l 175
 
4.4%
Other values (16) 915
22.8%
Uppercase Letter
ValueCountFrequency (%)
A 164
 
8.5%
M 141
 
7.3%
S 118
 
6.1%
I 117
 
6.0%
R 115
 
5.9%
N 99
 
5.1%
T 97
 
5.0%
E 93
 
4.8%
L 92
 
4.7%
H 91
 
4.7%
Other values (16) 811
41.8%
Other Punctuation
ValueCountFrequency (%)
. 70
84.3%
, 10
 
12.0%
& 2
 
2.4%
' 1
 
1.2%
Other Letter
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%
Space Separator
ValueCountFrequency (%)
610
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5
100.0%
Open Punctuation
ValueCountFrequency (%)
( 4
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5958
89.4%
Common 706
 
10.6%
Hangul 3
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 516
 
8.7%
i 432
 
7.3%
n 418
 
7.0%
e 404
 
6.8%
r 316
 
5.3%
o 279
 
4.7%
h 200
 
3.4%
u 185
 
3.1%
g 180
 
3.0%
l 175
 
2.9%
Other values (42) 2853
47.9%
Common
ValueCountFrequency (%)
610
86.4%
. 70
 
9.9%
, 10
 
1.4%
- 5
 
0.7%
( 4
 
0.6%
) 4
 
0.6%
& 2
 
0.3%
' 1
 
0.1%
Hangul
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6664
> 99.9%
Hangul 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
610
 
9.2%
a 516
 
7.7%
i 432
 
6.5%
n 418
 
6.3%
e 404
 
6.1%
r 316
 
4.7%
o 279
 
4.2%
h 200
 
3.0%
u 185
 
2.8%
g 180
 
2.7%
Other values (50) 3124
46.9%
Hangul
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

국가(Country)
Categorical

Distinct: 41
Distinct (%): 4.8%
Missing: 0
Missing (%): 0.0%
Memory size: 6.8 KiB

Value | Count
<NA> | 390
China | 109
USA | 81
Germany | 56
Japan | 49
Other values (36) | 171

Length

Max length16
Median length12
Mean length4.6985981
Min length3

Unique

Unique10 ?
Unique (%)1.2%

Sample

1st rowJapan
2nd row<NA>
3rd rowJapan
4th rowMalaysia
5th row<NA>

Common Values

Value | Count | Frequency (%)
<NA> | 390 | 45.6%
China | 109 | 12.7%
USA | 81 | 9.5%
Germany | 56 | 6.5%
Japan | 49 | 5.7%
Italy | 36 | 4.2%
France | 24 | 2.8%
India | 12 | 1.4%
U.K | 10 | 1.2%
Austria | 8 | 0.9%
Other values (31) | 81 | 9.5%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
na | 390 | 45.5%
china | 110 | 12.8%
usa | 81 | 9.4%
germany | 58 | 6.8%
japan | 49 | 5.7%
italy | 38 | 4.4%
france | 24 | 2.8%
india | 12 | 1.4%
u.k | 10 | 1.2%
austria | 8 | 0.9%
Other values (27) | 78 | 9.1%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
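
One common way to compute a nullity correlation is the Pearson correlation between the presence indicators of two columns. The sketch below assumes that definition; `nullity_correlation` is a hypothetical helper, and the toy columns follow the presence pattern of the first five sample rows, with `None` standing in for `<NA>`.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None  # undefined when a column is fully present or fully missing
    return cov / sqrt(var_a * var_b)

company = ["KANTO", None, "BENKAN", "Taylor-Wharton", None]
address = ["Gunma", None, "Hyogo", "Selangor", None]
standard = ["HPGSL", "HPGSL", "HPGSL", "DOT-4L", None]

print(nullity_correlation(company, address))
print(nullity_correlation(company, standard))
```

Columns that are always missing together, such as company name and address here, score 1. A column that is filled in on some of the rows where the other is empty scores lower.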

Sample

 | 등록번호(Cert_No) | 업소(Company Name) | 최초등록일(Initial Date) | 만료일(Expiry Date) | 등록종류(Type) | 제조기준(Standard) | 소재지(Address) | 대표자(Representative) | 국가(Country)
0 | 제 ES-1 호 | KANTO KOATSU-YOKI MFG. CO., LTD. | 2003-07-10 | 2024-07-09 | Seamless Steel Cylinder(Spinning Type) | HPGSL | 153-1, Tottori-machi, Maebashi-shi, Gunma-pref., JAPAN | Toshiyuki Yabata | Japan
1 | 제 ES-1 호 | <NA> | <NA> | <NA> | Welded Cylinder (Internal Volume under 500L) | HPGSL | <NA> | <NA> | <NA>
2 | 제 ES-2 호 | BENKAN KIKOH Corporation | 2003-07-10 | 2024-07-09 | Seamless Steel Cylinder(Spinning Type) | HPGSL | 1-1, Fuso-cho, Amagasaki-shi, Hyogo-pref., JAPAN | MANABU ITCHODA | Japan
3 | 제 ES-7 호 | Taylor-Wharton Malaysia Sdn. Bhd. | 2003-09-08 | 2024-09-07 | Cryogenic Gas Cylinders(with internal volume under 500ℓ) | DOT-4L | Lot. No. PT 5073 Jalan Jangur 28/43, Hicom Industrial Estate 40400 Shah Alam, Selangor Malaysia | Connie Liew Yoke Kien | Malaysia
4 | 제 ES-7 호 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
5 | 제 ES-9 호 | Faber Industrie spa | 2003-08-19 | 2024-08-18 | Seamless Steel Cylinders | PED+EN ISO9809-1 | Via dell'Industria 23-33043 Cividale del Friuli, Udine-Italy | Alberto Agnoletti | Italy
6 | 제 ES-9 호 | <NA> | <NA> | <NA> | <NA> | TPED+EN ISO9809-1 | <NA> | <NA> | <NA>
7 | 제 ES-9 호 | <NA> | <NA> | <NA> | Composite Material Cylinders(steel liner) | ANSI/NGV-2(CNG-1,2,3) | <NA> | <NA> | <NA>
8 | 제 ES-9 호 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
9 | 제 ES-9 호 | <NA> | <NA> | <NA> | Composite Material Cylinders(Non-metal Liner) | ANSI/NGV-2(CNG-4) | <NA> | <NA> | <NA>

 | 등록번호(Cert_No) | 업소(Company Name) | 최초등록일(Initial Date) | 만료일(Expiry Date) | 등록종류(Type) | 제조기준(Standard) | 소재지(Address) | 대표자(Representative) | 국가(Country)
846 | 제 ES-551 호 | Applied Cryo Technologies, INC | 2023-05-04 | 2026-05-03 | Storage Tank | ASME Sec.VIII Div.1 | 7150 Almeda Genoa Rd, Houston TX. 77075, USA | Robert Ernull | USA
847 | 제 ES-551 호 | <NA> | <NA> | <NA> | Pressure Vessels(Drum, Tower, Reactor, Others) | ASME Sec.VIII Div.1 | <NA> | <NA> | <NA>
848 | 제 ES-551 호 | <NA> | <NA> | <NA> | High-Pressure Gas Tanks Fixed on Vehicles | ASME Sec. XII | <NA> | <NA> | <NA>
849 | 제 ES-552 호 | WEKA AG | 2023-03-17 | 2026-03-16 | Emergency Cut-off Devices for High-pressure Gases | PED+DIN EN 16668 | Schrlistrasse 8, CH-8344 Bretswil, SWITZERLAND | Pascal Erni | SWITZERLAND
850 | 제 ES-552 호 | <NA> | <NA> | <NA> | <NA> | PED+DIN EN 12516-2 | <NA> | <NA> | <NA>
851 | 제 ES-552 호 | <NA> | <NA> | <NA> | <NA> | PED+DIN EN 1779 | <NA> | <NA> | <NA>
852 | 제 ES-553 호 | ACME Cryogenics, Inc | 2023-03-09 | 2026-03-08 | Emergency Cut-off Devices for High-pressure Gases | ASME B16.34 | 2801 Mitchell Ave Allentown, PA 18103, USA | Chad thomas | USA
853 | 제 ES-553 호 | <NA> | <NA> | <NA> | <NA> | ASME B31.12 | <NA> | <NA> | <NA>
854 | 제 ES-554 호 | Japan Steel Works M&E, Inc. Muroran Plant | 2023-05-16 | 2026-05-15 | Pressure vessels(Seamless vessel) | HPGSA | 4 Chatsu-Cho Muroran, Hokkaido, 051-8505, Japan | Kengo Takeya | Japan
855 | 제 ES-555 호 | HANGZHOU YUHANG ZHANGSHAN STEEL CYLINDER CO.,LTD. | 2023-06-15 | 2026-06-14 | Welded Cylinders(Internal volume < 500L) | KGS AC211 | Renhe Industrial Zone, Yuhang, Hangzhou, Zhejiang Province P.R. China 311107 | Chen Hongwei | China

Duplicate rows

Most frequently occurring

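 | 등록번호(Cert_No) | 업소(Company Name) | 최초등록일(Initial Date) | 만료일(Expiry Date) | 등록종류(Type) | 제조기준(Standard) | 소재지(Address) | 대표자(Representative) | 국가(Country) | # duplicates
43 | 제 ES-522 호 | <NA> | <NA> | <NA> | <NA> | ASME Sec.VIII Div.1 | <NA> | <NA> | <NA> | 4
24 | 제 ES-303호 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 3
0 | 제 ES-105 호 | <NA> | <NA> | <NA> | <NA> | AD2000 | <NA> | <NA> | <NA> | 2
1 | 제 ES-105 호 | <NA> | <NA> | <NA> | <NA> | CODAP | <NA> | <NA> | <NA> | 2
2 | 제 ES-109호 | <NA> | <NA> | <NA> | <NA> | ASME Sec.VIII, Div.1 | <NA> | <NA> | <NA> | 2
3 | 제 ES-126호 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2
4 | 제 ES-163호 | <NA> | <NA> | <NA> | <NA> | ASME Sec.VIII, Div.2 | <NA> | <NA> | <NA> | 2
5 | 제 ES-163호 | <NA> | <NA> | <NA> | <NA> | PED + AD2000 | <NA> | <NA> | <NA> | 2
6 | 제 ES-166호 | <NA> | <NA> | <NA> | <NA> | JHPGSC | <NA> | <NA> | <NA> | 2
7 | 제 ES-187호 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2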
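
A duplicate table of this kind can be approximated by counting identical rows. The sketch below is a simplified illustration: `duplicate_counts` is a hypothetical helper, the toy rows keep only the registration number and standard columns, and `None` stands in for `<NA>`.

```python
from collections import Counter

def duplicate_counts(rows):
    """Return {row: occurrences} for every row that appears more than once."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}

# Toy rows reduced to (Cert_No, Standard); the real table compares all columns.
rows = [
    ("제 ES-105 호", "AD2000"),
    ("제 ES-105 호", "AD2000"),
    ("제 ES-105 호", "CODAP"),
    ("제 ES-105 호", "CODAP"),
    ("제 ES-303호", None),
    ("제 ES-303호", None),
    ("제 ES-303호", None),
    ("제 ES-1 호", "HPGSL"),
]
dupes = duplicate_counts(rows)
for row, n in sorted(dupes.items(), key=lambda item: -item[1]):
    print(row, n)
```

Most duplicates in this dataset are continuation rows that carry only a registration number and, at most, a standard, so rows for the same certificate become identical once the other fields are empty.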