Overview

Dataset statistics

Number of variables: 3
Number of observations: 4542
Missing cells: 712
Missing cells (%): 5.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 106.6 KiB
Average record size in memory: 24.0 B

Variable types

Text: 3

Dataset

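Description: Data released in response to a public request. It lists LPG retailers nationwide (business name, address, serial number) and is published to help people in their everyday lives.
Author: 한국가스안전공사 (Korea Gas Safety Corporation)
URL: https://www.data.go.kr/data/15091481/fileData.do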

Alerts

일련번호 has 712 (15.7%) missing values [Missing]
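
The two missing-value percentages in this report use different denominators: the dataset-level figure divides by all cells, while the alert divides by the rows of the single affected column. A minimal sketch of that arithmetic, using only figures quoted in this report (the variable names are illustrative):

```python
# Figures copied from the Overview and Alerts sections of this report.
n_variables = 3
n_observations = 4542
missing_cells = 712  # all of them fall in the 일련번호 column

# Dataset level: share of missing cells among all cells.
total_cells = n_variables * n_observations
missing_cells_pct = 100 * missing_cells / total_cells

# Column level: share of missing values among the rows of one column.
column_missing_pct = 100 * missing_cells / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_cells_pct:.1f}%")
print(f"일련번호 missing: {column_missing_pct:.1f}%")
```

Rounded to one decimal place, these come out at 5.2% and 15.7%, matching the reported values.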

Reproduction

Analysis started: 2024-03-15 01:26:40.610310
Analysis finished: 2024-03-15 01:26:41.776816
Duration: 1.17 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업소명
Text

Distinct: 3164
Distinct (%): 69.7%
Missing: 0
Missing (%): 0.0%
Memory size: 35.6 KiB

Length

Max length: 23
Median length: 22
Mean length: 5.5424923
Min length: 2
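
The mean lengths reported for the three variables are consistent with dividing each column's total character count by its number of non-missing values. The check below uses only figures from this report. That missing values are left out of the denominator is an assumption inferred from the numbers, not something the report states.

```python
# (total characters, observations, missing values) per column, copied from this report.
columns = {
    "업소명": (25174, 4542, 0),
    "주소": (88532, 4542, 0),
    "일련번호": (26800, 4542, 712),
}

def mean_length(total_chars, n_obs, n_missing):
    # Assumption: missing values contribute no characters and are
    # excluded from the denominator.
    return total_chars / (n_obs - n_missing)

mean_lengths = {name: mean_length(*stats) for name, stats in columns.items()}
for name, value in mean_lengths.items():
    print(f"{name}: {value:.6f}")
```

The results agree with the reported mean lengths of 5.5424923, 19.491854, and 6.997389 to within rounding.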

Characters and Unicode

Total characters: 25174
Distinct characters: 459
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
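
As a rough illustration of how such category counts can be derived, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module over the five sample names shown in this report. It is not the profiler's own implementation. The function name `category_counts` is hypothetical. The standard library exposes the general category but not the script or block properties, so only categories are counted here.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over all characters in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            # Two-letter codes: "Lo" = Other Letter, "Lu" = Uppercase Letter,
            # "Ps" = Open Punctuation, "Pe" = Close Punctuation, and so on.
            counts[unicodedata.category(ch)] += 1
    return counts

# The five sample values of the 업소명 column listed in this report.
sample = [
    "(주)태성산업가스경기충전소",
    "대륙가스",
    "뉴PKS종합가스",
    "에스앤디(주)",
    "(유)서울가스",
]
counts = category_counts(sample)
for code, n in counts.most_common():
    print(code, n)
```

For these five names the tally is 31 Lo, 3 Lu, 3 Ps, and 3 Pe, which mirrors the dominance of Other Letter (Hangul syllables) in the full column.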

Unique

Unique: 2727
Unique (%): 60.0%

Sample

1st row: (주)태성산업가스경기충전소
2nd row: 대륙가스
3rd row: 뉴PKS종합가스
4th row: 에스앤디(주)
5th row: (유)서울가스

Value | Count | Frequency (%)
현대가스 | 52 | 1.1%
대성가스 | 40 | 0.9%
안전가스 | 38 | 0.8%
우리가스 | 37 | 0.8%
삼성가스 | 35 | 0.8%
제일가스 | 34 | 0.7%
중앙가스 | 27 | 0.6%
한국가스 | 24 | 0.5%
금성가스 | 21 | 0.5%
대한가스 | 20 | 0.4%
Other values (3178) | 4277 | 92.9%

Most occurring characters

Value | Count | Frequency (%)
 | 3861 | 15.3%
 | 3817 | 15.2%
 | 777 | 3.1%
 | 728 | 2.9%
 | 652 | 2.6%
 | 645 | 2.6%
 | 568 | 2.3%
 | 567 | 2.3%
) | 560 | 2.2%
( | 559 | 2.2%
Other values (449) | 12440 | 49.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 23689 | 94.1%
Close Punctuation | 560 | 2.2%
Open Punctuation | 559 | 2.2%
Uppercase Letter | 196 | 0.8%
Space Separator | 63 | 0.3%
Other Punctuation | 53 | 0.2%
Decimal Number | 32 | 0.1%
Lowercase Letter | 14 | 0.1%
Dash Punctuation | 7 | < 0.1%
Connector Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 3861 | 16.3%
 | 3817 | 16.1%
 | 777 | 3.3%
 | 728 | 3.1%
 | 652 | 2.8%
 | 645 | 2.7%
 | 568 | 2.4%
 | 567 | 2.4%
 | 546 | 2.3%
 | 506 | 2.1%
Other values (404) | 11022 | 46.5%

Uppercase Letter
Value | Count | Frequency (%)
S | 50 | 25.5%
K | 38 | 19.4%
G | 27 | 13.8%
L | 17 | 8.7%
P | 14 | 7.1%
M | 8 | 4.1%
E | 6 | 3.1%
T | 5 | 2.6%
O | 4 | 2.0%
N | 4 | 2.0%
Other values (8) | 23 | 11.7%

Decimal Number
Value | Count | Frequency (%)
8 | 11 | 34.4%
3 | 5 | 15.6%
2 | 4 | 12.5%
5 | 4 | 12.5%
1 | 3 | 9.4%
6 | 2 | 6.2%
4 | 2 | 6.2%
9 | 1 | 3.1%

Lowercase Letter
Value | Count | Frequency (%)
s | 2 | 14.3%
o | 2 | 14.3%
g | 2 | 14.3%
r | 2 | 14.3%
i | 2 | 14.3%
a | 2 | 14.3%
b | 1 | 7.1%
d | 1 | 7.1%

Other Punctuation
Value | Count | Frequency (%)
. | 22 | 41.5%
, | 12 | 22.6%
& | 8 | 15.1%
/ | 5 | 9.4%
· | 4 | 7.5%
: | 2 | 3.8%

Close Punctuation
Value | Count | Frequency (%)
) | 560 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 559 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 63 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 7 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 23689 | 94.1%
Common | 1275 | 5.1%
Latin | 210 | 0.8%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 3861 | 16.3%
 | 3817 | 16.1%
 | 777 | 3.3%
 | 728 | 3.1%
 | 652 | 2.8%
 | 645 | 2.7%
 | 568 | 2.4%
 | 567 | 2.4%
 | 546 | 2.3%
 | 506 | 2.1%
Other values (404) | 11022 | 46.5%

Latin
Value | Count | Frequency (%)
S | 50 | 23.8%
K | 38 | 18.1%
G | 27 | 12.9%
L | 17 | 8.1%
P | 14 | 6.7%
M | 8 | 3.8%
E | 6 | 2.9%
T | 5 | 2.4%
O | 4 | 1.9%
N | 4 | 1.9%
Other values (16) | 37 | 17.6%

Common
Value | Count | Frequency (%)
) | 560 | 43.9%
( | 559 | 43.8%
(space) | 63 | 4.9%
. | 22 | 1.7%
, | 12 | 0.9%
8 | 11 | 0.9%
& | 8 | 0.6%
- | 7 | 0.5%
3 | 5 | 0.4%
/ | 5 | 0.4%
Other values (9) | 23 | 1.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 23689 | 94.1%
ASCII | 1481 | 5.9%
None | 4 | < 0.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 3861 | 16.3%
 | 3817 | 16.1%
 | 777 | 3.3%
 | 728 | 3.1%
 | 652 | 2.8%
 | 645 | 2.7%
 | 568 | 2.4%
 | 567 | 2.4%
 | 546 | 2.3%
 | 506 | 2.1%
Other values (404) | 11022 | 46.5%

ASCII
Value | Count | Frequency (%)
) | 560 | 37.8%
( | 559 | 37.7%
(space) | 63 | 4.3%
S | 50 | 3.4%
K | 38 | 2.6%
G | 27 | 1.8%
. | 22 | 1.5%
L | 17 | 1.1%
P | 14 | 0.9%
, | 12 | 0.8%
Other values (34) | 119 | 8.0%

None
Value | Count | Frequency (%)
· | 4 | 100.0%

주소
Text

Distinct: 4171
Distinct (%): 91.8%
Missing: 0
Missing (%): 0.0%
Memory size: 35.6 KiB

Length

Max length: 43
Median length: 37
Mean length: 19.491854
Min length: 11

Characters and Unicode

Total characters: 88532
Distinct characters: 493
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3

Unique

Unique: 3964
Unique (%): 87.3%

Sample

1st row: 경기 용인시 기흥구 중부대로 14-9 (영덕동)
2nd row: 충북 청주시 상당구 1순환로 1498-2
3rd row: 충북 청주시 청원구 오창읍 중부로 1481
4th row: 울산 남구 장생포고래로 317
5th row: 전북 전주시 완산구 안행7길 8 (효자동1가)

Value | Count | Frequency (%)
경기 | 673 | 3.0%
경남 | 667 | 3.0%
경북 | 652 | 2.9%
전남 | 369 | 1.7%
충남 | 342 | 1.5%
전북 | 313 | 1.4%
충북 | 274 | 1.2%
부산 | 260 | 1.2%
강원 | 254 | 1.1%
대구 | 249 | 1.1%
Other values (7123) | 18125 | 81.7%

Most occurring characters

Value | Count | Frequency (%)
(space) | 17739 | 20.0%
1 | 3491 | 3.9%
 | 3124 | 3.5%
 | 2584 | 2.9%
2 | 2325 | 2.6%
 | 2231 | 2.5%
 | 2209 | 2.5%
 | 2067 | 2.3%
 | 1957 | 2.2%
3 | 1782 | 2.0%
Other values (483) | 49023 | 55.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 52330 | 59.1%
Space Separator | 17739 | 20.0%
Decimal Number | 16075 | 18.2%
Dash Punctuation | 1270 | 1.4%
Open Punctuation | 515 | 0.6%
Close Punctuation | 514 | 0.6%
Other Punctuation | 53 | 0.1%
Uppercase Letter | 33 | < 0.1%
Lowercase Letter | 2 | < 0.1%
Other Symbol | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 3124 | 6.0%
 | 2584 | 4.9%
 | 2231 | 4.3%
 | 2209 | 4.2%
 | 2067 | 3.9%
 | 1957 | 3.7%
 | 1637 | 3.1%
 | 1606 | 3.1%
 | 1550 | 3.0%
 | 1420 | 2.7%
Other values (456) | 31945 | 61.0%

Decimal Number
Value | Count | Frequency (%)
1 | 3491 | 21.7%
2 | 2325 | 14.5%
3 | 1782 | 11.1%
4 | 1488 | 9.3%
5 | 1402 | 8.7%
6 | 1281 | 8.0%
7 | 1199 | 7.5%
8 | 1111 | 6.9%
9 | 1055 | 6.6%
0 | 941 | 5.9%

Uppercase Letter
Value | Count | Frequency (%)
P | 7 | 21.2%
G | 7 | 21.2%
L | 7 | 21.2%
A | 6 | 18.2%
B | 3 | 9.1%
C | 2 | 6.1%
T | 1 | 3.0%

Other Punctuation
Value | Count | Frequency (%)
, | 44 | 83.0%
. | 7 | 13.2%
: | 2 | 3.8%

Lowercase Letter
Value | Count | Frequency (%)
k | 1 | 50.0%
s | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 17739 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1270 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 515 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 514 | 100.0%

Other Symbol
Value | Count | Frequency (%)
 | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 52331 | 59.1%
Common | 36166 | 40.9%
Latin | 35 | < 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 3124 | 6.0%
 | 2584 | 4.9%
 | 2231 | 4.3%
 | 2209 | 4.2%
 | 2067 | 3.9%
 | 1957 | 3.7%
 | 1637 | 3.1%
 | 1606 | 3.1%
 | 1550 | 3.0%
 | 1420 | 2.7%
Other values (457) | 31946 | 61.0%

Common
Value | Count | Frequency (%)
(space) | 17739 | 49.0%
1 | 3491 | 9.7%
2 | 2325 | 6.4%
3 | 1782 | 4.9%
4 | 1488 | 4.1%
5 | 1402 | 3.9%
6 | 1281 | 3.5%
- | 1270 | 3.5%
7 | 1199 | 3.3%
8 | 1111 | 3.1%
Other values (7) | 3078 | 8.5%

Latin
Value | Count | Frequency (%)
P | 7 | 20.0%
G | 7 | 20.0%
L | 7 | 20.0%
A | 6 | 17.1%
B | 3 | 8.6%
C | 2 | 5.7%
T | 1 | 2.9%
k | 1 | 2.9%
s | 1 | 2.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 52330 | 59.1%
ASCII | 36201 | 40.9%
None | 1 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 17739 | 49.0%
1 | 3491 | 9.6%
2 | 2325 | 6.4%
3 | 1782 | 4.9%
4 | 1488 | 4.1%
5 | 1402 | 3.9%
6 | 1281 | 3.5%
- | 1270 | 3.5%
7 | 1199 | 3.3%
8 | 1111 | 3.1%
Other values (16) | 3113 | 8.6%

Hangul
Value | Count | Frequency (%)
 | 3124 | 6.0%
 | 2584 | 4.9%
 | 2231 | 4.3%
 | 2209 | 4.2%
 | 2067 | 3.9%
 | 1957 | 3.7%
 | 1637 | 3.1%
 | 1606 | 3.1%
 | 1550 | 3.0%
 | 1420 | 2.7%
Other values (456) | 31945 | 61.0%

None
Value | Count | Frequency (%)
 | 1 | 100.0%

일련번호
Text

MISSING 

Distinct: 3765
Distinct (%): 98.3%
Missing: 712
Missing (%): 15.7%
Memory size: 35.6 KiB

Length

Max length: 7
Median length: 7
Mean length: 6.997389
Min length: 6

Characters and Unicode

Total characters: 26800
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 3706
Unique (%): 96.8%

Sample

1st row: 838400
2nd row: 841600
3rd row: 1541736
4th row: 1567601
5th row: 1570394

Value | Count | Frequency (%)
8528204 | 5 | 0.1%
8668812 | 3 | 0.1%
5849704 | 3 | 0.1%
8664488 | 3 | 0.1%
7322525 | 2 | 0.1%
6538282 | 2 | 0.1%
3380209 | 2 | 0.1%
7467722 | 2 | 0.1%
3387618 | 2 | 0.1%
8642000 | 2 | 0.1%
Other values (3755) | 3804 | 99.3%

Most occurring characters

Value | Count | Frequency (%)
3 | 3427 | 12.8%
2 | 3199 | 11.9%
5 | 3064 | 11.4%
4 | 2834 | 10.6%
8 | 2746 | 10.2%
7 | 2545 | 9.5%
0 | 2537 | 9.5%
6 | 2534 | 9.5%
1 | 2274 | 8.5%
9 | 1638 | 6.1%
Other values (2) | 2 | < 0.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 26798 | > 99.9%
Other Punctuation | 2 | < 0.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
3 | 3427 | 12.8%
2 | 3199 | 11.9%
5 | 3064 | 11.4%
4 | 2834 | 10.6%
8 | 2746 | 10.2%
7 | 2545 | 9.5%
0 | 2537 | 9.5%
6 | 2534 | 9.5%
1 | 2274 | 8.5%
9 | 1638 | 6.1%

Other Punctuation
Value | Count | Frequency (%)
/ | 1 | 50.0%
, | 1 | 50.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 26800 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
3 | 3427 | 12.8%
2 | 3199 | 11.9%
5 | 3064 | 11.4%
4 | 2834 | 10.6%
8 | 2746 | 10.2%
7 | 2545 | 9.5%
0 | 2537 | 9.5%
6 | 2534 | 9.5%
1 | 2274 | 8.5%
9 | 1638 | 6.1%
Other values (2) | 2 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 26800 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
3 | 3427 | 12.8%
2 | 3199 | 11.9%
5 | 3064 | 11.4%
4 | 2834 | 10.6%
8 | 2746 | 10.2%
7 | 2545 | 9.5%
0 | 2537 | 9.5%
6 | 2534 | 9.5%
1 | 2274 | 8.5%
9 | 1638 | 6.1%
Other values (2) | 2 | < 0.1%

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
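
To make the idea concrete, here is a minimal sketch of what a nullity matrix encodes: one row per record and one flag per column that says whether the cell is present. It is not the plotting code used by the report. The helper names and the three toy records are hypothetical, and only the column names come from this dataset.

```python
def nullity_matrix(rows, columns):
    """Return one list of booleans per row: True where the cell is present."""
    return [[row.get(col) is not None for col in columns] for row in rows]

def missing_per_column(rows, columns):
    """Count missing (None) cells in each column."""
    return {col: sum(row.get(col) is None for row in rows) for col in columns}

columns = ["업소명", "주소", "일련번호"]

# Hypothetical placeholder records, not taken from the dataset.
rows = [
    {"업소명": "name-1", "주소": "address-1", "일련번호": None},
    {"업소명": "name-2", "주소": "address-2", "일련번호": "0000001"},
    {"업소명": "name-3", "주소": "address-3", "일련번호": None},
]

# Text rendering: "#" marks a present cell, "." a missing one.
for present in nullity_matrix(rows, columns):
    print("".join("#" if p else "." for p in present))

print(missing_per_column(rows, columns))
```

In this report only the 일련번호 column would show gaps in such a matrix, since the other two columns have no missing values.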

Sample

 | 업소명 | 주소 | 일련번호
0 | (주)태성산업가스경기충전소 | 경기 용인시 기흥구 중부대로 14-9 (영덕동) | <NA>
1 | 대륙가스 | 충북 청주시 상당구 1순환로 1498-2 | <NA>
2 | 뉴PKS종합가스 | 충북 청주시 청원구 오창읍 중부로 1481 | <NA>
3 | 에스앤디(주) | 울산 남구 장생포고래로 317 | <NA>
4 | (유)서울가스 | 전북 전주시 완산구 안행7길 8 (효자동1가) | <NA>
5 | (유)전주에너지그린한국가스 | 전북 전주시 덕진구 신흥마을길 177 | <NA>
6 | 린나이가스 | 경남 창원시 마산합포구 구산면 해양관광로 1489 | <NA>
7 | (유)예향에너지 | 광주 광산구 고봉로 241 | <NA>
8 | 부경산업가스 | 경남 함안군 군북면 여명로 159-1 | <NA>
9 | 동일가스산업 | 울산 울주군 서생면 용연길 240 | <NA>

 | 업소명 | 주소 | 일련번호
4532 | 원가스산업 | 경기 파주시 파평면 장승배기로 206 | <NA>
4533 | 세종가스 | 경기 파주시 탄현면 금승리 173,172-19 | <NA>
4534 | 에스케이(SK)가스 | 경기 양주시 은현면 봉암리 49-5 | <NA>
4535 | (주)현대가스텍 삼성가스지점 | 경기 파주시 광탄면 보광로 1430 가 | <NA>
4536 | 현대가스텍 삼성가스지점 | 경기 파주시 광탄면 보광로 1430 | <NA>
4537 | 동일산업가스 | 경기 파주시 월롱산로 46-1 | <NA>
4538 | 동방가스산업 | 경기 파주시 문산읍 사목로 63( 사목리281-1,282-2) | <NA>
4539 | 우성종합가스 | 경기 양주시 백석읍 중앙로 68 | <NA>
4540 | 정인에너지 | 경기 양주시 백석읍 월암로 74-3, 74-2 | <NA>
4541 | 신평화가스 | 충북 충주시 신니면 참샘길 187 | <NA>