Overview

Dataset statistics

Number of variables: 5
Number of observations: 52
Missing cells: 12
Missing cells (%): 4.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.2 KiB
Average record size in memory: 43.5 B
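As a quick cross-check of the figures above, the following sketch recomputes the missing-cell share and the total memory size from the reported counts. The function name is illustrative and not part of the profiler.

```python
def missing_cell_pct(n_missing: int, n_rows: int, n_cols: int) -> float:
    """Share of missing cells in percent, taken over all rows x columns."""
    return 100.0 * n_missing / (n_rows * n_cols)


# Counts taken from the dataset statistics in this report.
pct = missing_cell_pct(n_missing=12, n_rows=52, n_cols=5)

# Average record size (bytes) times number of observations, expressed in KiB.
total_kib = 43.5 * 52 / 1024

print(f"{pct:.1f}%")            # 4.6%
print(f"{total_kib:.1f} KiB")   # 2.2 KiB
```

The 12 missing cells are spread over 52 x 5 = 260 cells, which gives the reported 4.6%.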

Variable types

Numeric: 1
Text: 3
Categorical: 1

Dataset

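Description: Following enforcement of the Hydrogen Act, statutory inspections of hydrogen products begin in February 2022. This dataset lists the manufacturers subject to those inspections (company name, hydrogen product, address, phone number) and is published to show the current state of the hydrogen product industry.
Author: Korea Gas Safety Corporation (한국가스안전공사)
URL: https://www.data.go.kr/data/15091488/fileData.do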

Alerts

번호 (number) is highly overall correlated with 수소용품 (hydrogen product) [High correlation]
수소용품 is highly overall correlated with 번호 [High correlation]
대표전화 (main phone number) has 12 (23.1%) missing values [Missing]
번호 has unique values [Unique]
주소 (address) has unique values [Unique]

Reproduction

Analysis started: 2023-12-16 15:54:12.336778
Analysis finished: 2023-12-16 15:54:14.954918
Duration: 2.62 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
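The reported duration follows directly from the two timestamps. A minimal check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of this report.
started = datetime.fromisoformat("2023-12-16 15:54:12.336778")
finished = datetime.fromisoformat("2023-12-16 15:54:14.954918")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 2.62 seconds
```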

Variables

번호
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 52
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.5
Minimum: 1
Maximum: 52
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 600.0 B

Quantile statistics

Minimum: 1
5-th percentile: 3.55
Q1: 13.75
Median: 26.5
Q3: 39.25
95-th percentile: 49.45
Maximum: 52
Range: 51
Interquartile range (IQR): 25.5
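Because 번호 is simply the sequence 1 to 52, the quantiles above can be reproduced by hand. The sketch below assumes linear interpolation between order statistics, which corresponds to `method="inclusive"` in Python's `statistics.quantiles`; the report does not state which interpolation the profiler uses, so treat that as an assumption.

```python
from statistics import median, quantiles

values = list(range(1, 53))  # 번호 runs from 1 to 52

# 19 cut points at 5% steps; "inclusive" interpolates linearly between
# neighbouring order statistics, treating min and max as the 0% and 100% points.
cuts = quantiles(values, n=20, method="inclusive")
p5, q1, q3, p95 = cuts[0], cuts[4], cuts[14], cuts[18]

print(p5, q1, median(values), q3, p95)  # 3.55 13.75 26.5 39.25 49.45
print(q3 - q1)                          # 25.5 (interquartile range)
print(max(values) - min(values))        # 51 (range)
```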

Descriptive statistics

Standard deviation: 15.154757
Coefficient of variation (CV): 0.57187763
Kurtosis: -1.2
Mean: 26.5
Median Absolute Deviation (MAD): 13
Skewness: 0
Sum: 1378
Variance: 229.66667
Monotonicity: Strictly increasing
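The descriptive statistics can likewise be recomputed from the sequence 1 to 52. This is a sketch under two assumptions that the report itself does not state: the variance and standard deviation use the sample (n - 1) denominator, and the kurtosis is the bias-corrected excess kurtosis as computed by pandas.

```python
from statistics import mean, median, stdev, variance

values = list(range(1, 53))
n = len(values)

mu = mean(values)                # 26.5
var = variance(values)           # sample variance, n - 1 denominator
sd = stdev(values)
cv = sd / mu                     # coefficient of variation
med = median(values)
mad = median(abs(v - med) for v in values)  # median absolute deviation

# Bias-corrected excess kurtosis built from population central moments.
m2 = sum((v - mu) ** 2 for v in values) / n
m4 = sum((v - mu) ** 4 for v in values) / n
g2 = m4 / m2**2 - 3
kurt = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)

print(sum(values), mu, mad)      # 1378 26.5 13.0
print(round(var, 5), round(sd, 6), round(cv, 8), round(kurt, 6))
```

Under these assumptions the values agree with the table. For any evenly spaced sequence, this kurtosis estimator evaluates to -1.2 and the skewness is 0 by symmetry.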
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | 1.9%
28 | 1 | 1.9%
30 | 1 | 1.9%
31 | 1 | 1.9%
32 | 1 | 1.9%
33 | 1 | 1.9%
34 | 1 | 1.9%
35 | 1 | 1.9%
36 | 1 | 1.9%
37 | 1 | 1.9%
Other values (42) | 42 | 80.8%

Value | Count | Frequency (%)
1 | 1 | 1.9%
2 | 1 | 1.9%
3 | 1 | 1.9%
4 | 1 | 1.9%
5 | 1 | 1.9%
6 | 1 | 1.9%
7 | 1 | 1.9%
8 | 1 | 1.9%
9 | 1 | 1.9%
10 | 1 | 1.9%

Value | Count | Frequency (%)
52 | 1 | 1.9%
51 | 1 | 1.9%
50 | 1 | 1.9%
49 | 1 | 1.9%
48 | 1 | 1.9%
47 | 1 | 1.9%
46 | 1 | 1.9%
45 | 1 | 1.9%
44 | 1 | 1.9%
43 | 1 | 1.9%

업체명
Text

Distinct: 51
Distinct (%): 98.1%
Missing: 0
Missing (%): 0.0%
Memory size: 548.0 B

Length

Max length: 24
Median length: 10
Mean length: 6.2307692
Min length: 3

Characters and Unicode

Total characters: 324
Distinct characters: 130
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
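To illustrate what such character properties look like, the sketch below counts Unicode general categories for the five sample company names listed in this report, using Python's standard `unicodedata` module. It is only an illustration of the idea; the profiler's own implementation may differ.

```python
import unicodedata
from collections import Counter

# The five sample 업체명 values shown in this report.
# "\u321c" is the single code point for the ㈜ symbol.
names = [
    "바이오프랜즈",
    "원익머터리얼즈",
    "제이앤케이히터",
    "\u321c신넥앤테크",
    "\u321c원일티엔아이",
]

# General category per character: "Lo" = Other Letter, "So" = Other Symbol.
category_counts = Counter(
    unicodedata.category(ch) for name in names for ch in name
)
print(category_counts)  # Counter({'Lo': 31, 'So': 2})
```

Hangul syllables fall under Other Letter, while the ㈜ symbol is an Other Symbol, which matches the categories reported for this column.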

Unique

Unique: 50
Unique (%): 96.2%

Sample

1st row: 바이오프랜즈
2nd row: 원익머터리얼즈
3rd row: 제이앤케이히터
4th row: ㈜신넥앤테크
5th row: ㈜원일티엔아이

Value | Count | Frequency (%)
에이치앤파워 | 2 | 3.4%
sk | 2 | 3.4%
대성히트에너시스㈜ | 1 | 1.7%
동아퓨얼셀 | 1 | 1.7%
케이퓨얼셀 | 1 | 1.7%
㈜수소에너젠 | 1 | 1.7%
엘켐텍 | 1 | 1.7%
한화솔루션 | 1 | 1.7%
이엠솔루션 | 1 | 1.7%
주식회사 | 1 | 1.7%
Other values (46) | 46 | 79.3%

Most occurring characters

ValueCountFrequency (%)
19
 
5.9%
18
 
5.6%
10
 
3.1%
10
 
3.1%
8
 
2.5%
7
 
2.2%
7
 
2.2%
7
 
2.2%
7
 
2.2%
7
 
2.2%
Other values (120) 224
69.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 271
83.6%
Other Symbol 19
 
5.9%
Uppercase Letter 14
 
4.3%
Lowercase Letter 9
 
2.8%
Space Separator 8
 
2.5%
Close Punctuation 1
 
0.3%
Other Punctuation 1
 
0.3%
Open Punctuation 1
 
0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
18
 
6.6%
10
 
3.7%
10
 
3.7%
7
 
2.6%
7
 
2.6%
7
 
2.6%
7
 
2.6%
7
 
2.6%
6
 
2.2%
6
 
2.2%
Other values (98) 186
68.6%
Uppercase Letter
ValueCountFrequency (%)
S 4
28.6%
E 2
14.3%
K 2
14.3%
X 1
 
7.1%
T 1
 
7.1%
H 1
 
7.1%
P 1
 
7.1%
G 1
 
7.1%
N 1
 
7.1%
Lowercase Letter
ValueCountFrequency (%)
e 2
22.2%
y 1
11.1%
v 1
11.1%
r 1
11.1%
s 1
11.1%
l 1
11.1%
u 1
11.1%
g 1
11.1%
Other Symbol
ValueCountFrequency (%)
19
100.0%
Space Separator
ValueCountFrequency (%)
8
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%
Other Punctuation
ValueCountFrequency (%)
& 1
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 290
89.5%
Latin 23
 
7.1%
Common 11
 
3.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
19
 
6.6%
18
 
6.2%
10
 
3.4%
10
 
3.4%
7
 
2.4%
7
 
2.4%
7
 
2.4%
7
 
2.4%
7
 
2.4%
6
 
2.1%
Other values (99) 192
66.2%
Latin
ValueCountFrequency (%)
S 4
17.4%
E 2
 
8.7%
K 2
 
8.7%
e 2
 
8.7%
X 1
 
4.3%
T 1
 
4.3%
y 1
 
4.3%
v 1
 
4.3%
r 1
 
4.3%
s 1
 
4.3%
Other values (7) 7
30.4%
Common
ValueCountFrequency (%)
8
72.7%
) 1
 
9.1%
& 1
 
9.1%
( 1
 
9.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 271
83.6%
ASCII 34
 
10.5%
None 19
 
5.9%

Most frequent character per block

None
ValueCountFrequency (%)
19
100.0%
Hangul
ValueCountFrequency (%)
18
 
6.6%
10
 
3.7%
10
 
3.7%
7
 
2.6%
7
 
2.6%
7
 
2.6%
7
 
2.6%
7
 
2.6%
6
 
2.2%
6
 
2.2%
Other values (98) 186
68.6%
ASCII
ValueCountFrequency (%)
8
23.5%
S 4
 
11.8%
E 2
 
5.9%
K 2
 
5.9%
e 2
 
5.9%
X 1
 
2.9%
T 1
 
2.9%
y 1
 
2.9%
v 1
 
2.9%
r 1
 
2.9%
Other values (11) 11
32.4%

수소용품
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 5.8%
Missing: 0
Missing (%): 0.0%
Memory size: 548.0 B
수전해설비: 20
연료전지: 20
수소추출설비: 12

Length

Max length: 6
Median length: 5
Mean length: 4.8461538
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 수소추출설비
2nd row: 수소추출설비
3rd row: 수소추출설비
4th row: 수소추출설비
5th row: 수소추출설비

Common Values

Value | Count | Frequency (%)
수전해설비 (water electrolysis equipment) | 20 | 38.5%
연료전지 (fuel cell) | 20 | 38.5%
수소추출설비 (hydrogen extraction equipment) | 12 | 23.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
수전해설비 | 20 | 38.5%
연료전지 | 20 | 38.5%
수소추출설비 | 12 | 23.1%

주소
Text

UNIQUE 

Distinct: 52
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 548.0 B

Length

Max length: 43
Median length: 29.5
Mean length: 22.596154
Min length: 9

Characters and Unicode

Total characters: 1175
Distinct characters: 186
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 52
Unique (%): 100.0%

Sample

1st row: 대전광역시 유성구 테크노 2로 199
2nd row: 충북 청주시 청원구 오창읍 양청3길 30
3rd row: 충청남도 당진시 우강면 다량길 60 (공장)
4th row: 대전시 유성구 유성대로 1662 바이오벤처타운 503,504호
5th row: 경기도 김포시 양촌읍 황금 1로 150번지

Value | Count | Frequency (%)
경기도 | 18 | 6.6%
유성구 | 7 | 2.6%
서울시 | 6 | 2.2%
대전광역시 | 4 | 1.5%
성남시 | 4 | 1.5%
강남구 | 3 | 1.1%
서울특별시 | 3 | 1.1%
권선구 | 2 | 0.7%
충남 | 2 | 0.7%
전라북도 | 2 | 0.7%
Other values (193) | 222 | 81.3%

Most occurring characters

ValueCountFrequency (%)
222
 
18.9%
48
 
4.1%
48
 
4.1%
1 43
 
3.7%
35
 
3.0%
2 30
 
2.6%
27
 
2.3%
6 26
 
2.2%
24
 
2.0%
0 24
 
2.0%
Other values (176) 648
55.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 701
59.7%
Space Separator 222
 
18.9%
Decimal Number 214
 
18.2%
Uppercase Letter 15
 
1.3%
Dash Punctuation 6
 
0.5%
Other Punctuation 5
 
0.4%
Close Punctuation 5
 
0.4%
Open Punctuation 5
 
0.4%
Lowercase Letter 2
 
0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
48
 
6.8%
48
 
6.8%
35
 
5.0%
27
 
3.9%
24
 
3.4%
23
 
3.3%
20
 
2.9%
19
 
2.7%
18
 
2.6%
17
 
2.4%
Other values (149) 422
60.2%
Decimal Number
ValueCountFrequency (%)
1 43
20.1%
2 30
14.0%
6 26
12.1%
0 24
11.2%
5 21
9.8%
3 20
9.3%
7 18
8.4%
4 15
 
7.0%
9 9
 
4.2%
8 8
 
3.7%
Uppercase Letter
ValueCountFrequency (%)
N 2
13.3%
D 2
13.3%
K 2
13.3%
I 2
13.3%
T 2
13.3%
F 1
6.7%
R 1
6.7%
A 1
6.7%
B 1
6.7%
S 1
6.7%
Lowercase Letter
ValueCountFrequency (%)
k 1
50.0%
s 1
50.0%
Space Separator
ValueCountFrequency (%)
222
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6
100.0%
Other Punctuation
ValueCountFrequency (%)
, 5
100.0%
Close Punctuation
ValueCountFrequency (%)
) 5
100.0%
Open Punctuation
ValueCountFrequency (%)
( 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 701
59.7%
Common 457
38.9%
Latin 17
 
1.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
48
 
6.8%
48
 
6.8%
35
 
5.0%
27
 
3.9%
24
 
3.4%
23
 
3.3%
20
 
2.9%
19
 
2.7%
18
 
2.6%
17
 
2.4%
Other values (149) 422
60.2%
Common
ValueCountFrequency (%)
222
48.6%
1 43
 
9.4%
2 30
 
6.6%
6 26
 
5.7%
0 24
 
5.3%
5 21
 
4.6%
3 20
 
4.4%
7 18
 
3.9%
4 15
 
3.3%
9 9
 
2.0%
Other values (5) 29
 
6.3%
Latin
ValueCountFrequency (%)
N 2
11.8%
D 2
11.8%
K 2
11.8%
I 2
11.8%
T 2
11.8%
F 1
5.9%
R 1
5.9%
A 1
5.9%
B 1
5.9%
S 1
5.9%
Other values (2) 2
11.8%

Most occurring blocks

ValueCountFrequency (%)
Hangul 701
59.7%
ASCII 474
40.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
222
46.8%
1 43
 
9.1%
2 30
 
6.3%
6 26
 
5.5%
0 24
 
5.1%
5 21
 
4.4%
3 20
 
4.2%
7 18
 
3.8%
4 15
 
3.2%
9 9
 
1.9%
Other values (17) 46
 
9.7%
Hangul
ValueCountFrequency (%)
48
 
6.8%
48
 
6.8%
35
 
5.0%
27
 
3.9%
24
 
3.4%
23
 
3.3%
20
 
2.9%
19
 
2.7%
18
 
2.6%
17
 
2.4%
Other values (149) 422
60.2%

대표전화
Text

MISSING 

Distinct: 39
Distinct (%): 97.5%
Missing: 12
Missing (%): 23.1%
Memory size: 548.0 B

Length

Max length: 13
Median length: 12
Mean length: 12.075
Min length: 11

Characters and Unicode

Total characters: 483
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 38
Unique (%): 95.0%
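The percentages for this column are consistent with being computed over the 40 non-missing values rather than all 52 rows: 39 / 40 = 97.5% distinct and 38 / 40 = 95.0% unique. The sketch below shows that reading on a tiny made-up list; the helper name is hypothetical, and counting over non-missing values only is an inference from the numbers, not something the report states.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (distinct, unique, distinct %, unique %) over non-missing values.

    "Distinct" counts different values; "unique" counts values seen exactly once.
    """
    present = [v for v in values if v is not None]
    counts = Counter(present)
    n = len(present)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    if n == 0:
        return distinct, unique, 0.0, 0.0
    return distinct, unique, 100.0 * distinct / n, 100.0 * unique / n


# Toy input: one repeated number, one number seen once, one missing entry.
toy = ["042-350-7330", "042-350-7330", "070-4613-4900", None]
print(distinct_and_unique(toy)[:2])       # (2, 1)

# Reported counts for 대표전화: 52 rows, 12 missing, 39 distinct, 38 unique.
non_missing = 52 - 12
print(100.0 * 39 / non_missing)           # 97.5
print(100.0 * 38 / non_missing)           # 95.0
```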

Sample

1st row: 02-6739-1100
2nd row: 043-210-4600
3rd row: 02-6011-4250
4th row: 044-863-0311
5th row: 031-498-0521

Value | Count | Frequency (%)
042-350-7330 | 2 | 5.0%
070-4613-4900 | 1 | 2.5%
053-263-8988 | 1 | 2.5%
051-263-3459 | 1 | 2.5%
061-902-4075 | 1 | 2.5%
042-931-0715 | 1 | 2.5%
031-494-0720 | 1 | 2.5%
080-600-6000 | 1 | 2.5%
031-8007-3516 | 1 | 2.5%
02-2018-5114 | 1 | 2.5%
Other values (29) | 29 | 72.5%

Most occurring characters

ValueCountFrequency (%)
0 98
20.3%
- 80
16.6%
1 63
13.0%
2 49
10.1%
3 48
9.9%
4 32
 
6.6%
5 30
 
6.2%
6 25
 
5.2%
8 22
 
4.6%
9 20
 
4.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 403
83.4%
Dash Punctuation 80
 
16.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 98
24.3%
1 63
15.6%
2 49
12.2%
3 48
11.9%
4 32
 
7.9%
5 30
 
7.4%
6 25
 
6.2%
8 22
 
5.5%
9 20
 
5.0%
7 16
 
4.0%
Dash Punctuation
ValueCountFrequency (%)
- 80
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 483
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 98
20.3%
- 80
16.6%
1 63
13.0%
2 49
10.1%
3 48
9.9%
4 32
 
6.6%
5 30
 
6.2%
6 25
 
5.2%
8 22
 
4.6%
9 20
 
4.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 483
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 98
20.3%
- 80
16.6%
1 63
13.0%
2 49
10.1%
3 48
9.9%
4 32
 
6.6%
5 30
 
6.2%
6 25
 
5.2%
8 22
 
4.6%
9 20
 
4.1%

Interactions

[Figure: interactions plot; image not preserved in this extract]

Correlations

 | 번호 | 업체명 | 수소용품 | 주소 | 대표전화
번호 | 1.000 | 0.941 | 0.950 | 1.000 | 0.962
업체명 | 0.941 | 1.000 | 0.621 | 1.000 | 1.000
수소용품 | 0.950 | 0.621 | 1.000 | 1.000 | 0.716
주소 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
대표전화 | 0.962 | 1.000 | 0.716 | 1.000 | 1.000

 | 번호 | 수소용품
번호 | 1.000 | 0.872
수소용품 | 0.872 | 1.000
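The report does not say which association measure produced these coefficients. One common choice for pairs involving a categorical variable is Cramér's V, computed from a contingency table (with the numeric variable binned first). The pure-Python sketch below assumes that measure, without bias correction, and uses made-up toy data; it is not expected to reproduce the 0.872 above.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long label sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_totals = Counter(xs)
    y_totals = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, x_count in x_totals.items():
        for y, y_count in y_totals.items():
            expected = x_count * y_count / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(x_totals), len(y_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Toy case 1: rows sorted by product, so the index bin determines the product.
bins = ["low"] * 4 + ["high"] * 4
products = ["수소추출설비"] * 4 + ["연료전지"] * 4
print(cramers_v(bins, products))  # 1.0

# Toy case 2: products alternate, so the index bin carries no information.
bins2 = ["low", "low", "high", "high"]
products2 = ["수소추출설비", "연료전지", "수소추출설비", "연료전지"]
print(cramers_v(bins2, products2))  # 0.0
```

The first toy case mirrors what the sample rows suggest for this dataset: the first rows are all 수소추출설비 and the last rows are all 연료전지, so a running row number is strongly associated with the product type simply because the file appears to be sorted by product.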

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

 | 번호 | 업체명 | 수소용품 | 주소 | 대표전화
0 | 1 | 바이오프랜즈 | 수소추출설비 | 대전광역시 유성구 테크노 2로 199 | 02-6739-1100
1 | 2 | 원익머터리얼즈 | 수소추출설비 | 충북 청주시 청원구 오창읍 양청3길 30 | 043-210-4600
2 | 3 | 제이앤케이히터 | 수소추출설비 | 충청남도 당진시 우강면 다량길 60 (공장) | 02-6011-4250
3 | 4 | ㈜신넥앤테크 | 수소추출설비 | 대전시 유성구 유성대로 1662 바이오벤처타운 503,504호 | 044-863-0311
4 | 5 | ㈜원일티엔아이 | 수소추출설비 | 경기도 김포시 양촌읍 황금 1로 150번지 | 031-498-0521
5 | 6 | 파나시아 | 수소추출설비 | 부산 강서구 미음산단 3로 55 | 051-831-1010
6 | 7 | 현대로템㈜ | 수소추출설비 | 경기도 의왕시 철도박물관로 37 | 031-8090-8114
7 | 8 | 디알퓨얼셀 | 수소추출설비 | 경기도 성남시 중원구 갈마치로 215, 금강펜테리움IT타워 B동 305호 | <NA>
8 | 9 | 케이테크 | 수소추출설비 | 충남 당진시 우강면 다랑길 62 | <NA>
9 | 10 | 엘지화학 | 수소추출설비 | 충청남도 서산시 대산읍 | 041-661-2114

 | 번호 | 업체명 | 수소용품 | 주소 | 대표전화
42 | 43 | STX에너지솔루션㈜ | 연료전지 | 대구광역시 달서구 성서공단로 275 2F | 053-263-8988
43 | 44 | ㈜미코파워 | 연료전지 | 경기도 안성시 공단 2로 23 | 031-610-7940
44 | 45 | ㈜코텍에너지 | 연료전지 | 경기도 성남시 판교로 700 테크노파크 D동 506호 | 02-809-6200
45 | 46 | ㈜두산퓨얼셀파워 | 연료전지 | 경기도 화성시 향남읍 제약단지로 75 | 031-781-0475
46 | 47 | 범한퓨얼셀㈜ | 연료전지 | 경상남도 창원시 마산회원구 자유무역4길 61 | 055-224-0500
47 | 48 | 에이치앤파워 | 연료전지 | 대전광역시 유성구 대학로 291 N9동 | 042-350-7330
48 | 49 | ㈜가온셀 | 연료전지 | 전라북도 완주군 봉동읍 완주산단6로212 | 063-262-0522
49 | 50 | 현대모비스㈜ | 연료전지 | (우)135-911 서울특별시 강남구 테헤란로 203 | 02-2018-5114
50 | 51 | ㈜씨엔엘에너지 | 연료전지 | 서울시 동대문구 회기로 2길 4 1층 | 02-921-2312
51 | 52 | ㈜두산모빌리티이노베이션 | 연료전지 | 경기도 용인시 수지구 수지로 112 | 031-270-1703