Overview

Dataset statistics

Number of variables: 3
Number of observations: 394
Missing cells: 216
Missing cells (%): 18.3%
Duplicate rows: 1
Duplicate rows (%): 0.3%
Total size in memory: 9.4 KiB
Average record size in memory: 24.3 B
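
The percentage rows follow from the counts in the same table. The sketch below recomputes them, assuming that "Missing cells (%)" is taken over all cells (variables times observations) and "Duplicate rows (%)" over all observations; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics in this report.
n_variables = 3
n_observations = 394
missing_cells = 216
duplicate_rows = 1

# Assumption: the denominator for missing cells is every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: the denominator for duplicate rows is the number of observations.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```

Under these assumptions the recomputed values round to the 18.3% and 0.3% reported above.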

Variable types

Text: 3

Dataset

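Description: Data on ready-to-eat food manufacturing and processing businesses (즉석식품제조가공업체) in Gijang-gun, providing fields such as business name (업소명), location (소재지), and telephone number. Locations are given as road-name addresses, and the location telephone field (소재지전화) may be blank where the data was not collected.
Author: Gijang-gun, Busan Metropolitan City (부산광역시 기장군)
URL: https://www.data.go.kr/data/15047916/fileData.do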

Alerts

Dataset has 1 (0.3%) duplicate rows [Duplicates]
소재지전화 has 215 (54.6%) missing values [Missing]

Reproduction

Analysis started: 2023-12-12 04:22:00.792649
Analysis finished: 2023-12-12 04:22:01.409561
Duration: 0.62 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업소명 (business name)
Text

Distinct: 390
Distinct (%): 99.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.2 KiB

Length

Max length: 25
Median length: 20
Mean length: 6.6979695
Min length: 1
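
As a rough illustration of how such length statistics can be derived, the sketch below measures the five values listed in this variable's Sample section. It assumes length is counted in Unicode code points, which is what Python's `len` returns; the results describe only these five values, not the full 394-row column.

```python
from statistics import mean, median

# The five sample values shown for this variable (not the full column).
names = ["경주상회", "송정기름집", "안동참기름", "기장상회", "일광참기름상회"]

# len() counts code points, so each precomposed Hangul syllable counts as 1.
lengths = [len(name) for name in names]

length_stats = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": mean(lengths),
    "min": min(lengths),
}
print(length_stats)
```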

Characters and Unicode

Total characters: 2639
Distinct characters: 442
Distinct categories: 10
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
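
A minimal sketch of this kind of analysis, using Python's standard `unicodedata` module to count general categories per character. The two input strings are business names taken from the Sample table of this report; the `CATEGORY_NAMES` mapping is an illustrative helper that translates the two-letter codes into the long names used in the tables here, and covers only the categories that appear in this report.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter", "Ll": "Lowercase Letter", "Lu": "Uppercase Letter",
    "Zs": "Space Separator", "Ps": "Open Punctuation", "Pe": "Close Punctuation",
    "Nd": "Decimal Number", "Po": "Other Punctuation", "Pd": "Dash Punctuation",
    "Nl": "Letter Number", "Sm": "Math Symbol",
}

# Two business names from the Sample table (not the full column).
values = ["카포티(Capote)", "마켓 샤퀴테리"]

# Count the general category of every character across all values.
category_counts = Counter(
    unicodedata.category(ch) for value in values for ch in value
)

# Re-key by long name; unknown codes fall back to the code itself.
named_counts = Counter(
    {CATEGORY_NAMES.get(code, code): n for code, n in category_counts.items()}
)

for name, n in named_counts.most_common():
    print(f"{name}: {n}")
```

Script and block membership are not exposed by `unicodedata`, so those breakdowns would need a separate lookup table or library.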

Unique

Unique: 386
Unique (%): 98.0%
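
The figures in this report are consistent with "Distinct" meaning the number of different values and "Unique" meaning the number of values that occur exactly once (for this variable, 390 - 386 = 4 values occur more than once). The sketch below shows that distinction on a small toy list; the list and variable names are illustrative assumptions, not data from the report.

```python
from collections import Counter

# Toy values for illustration only; None stands for a missing cell.
values = ["a", "b", "b", "c", None]

# Missing cells are excluded before counting.
present = [v for v in values if v is not None]
counts = Counter(present)

n_distinct = len(counts)                              # different values
n_unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

print(f"Distinct: {n_distinct}, Unique: {n_unique}")
```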

Sample

1st row: 경주상회
2nd row: 송정기름집
3rd row: 안동참기름
4th row: 기장상회
5th row: 일광참기름상회
Value | Count | Frequency (%)
주식회사 | 10 | 2.0%
정관점 | 7 | 1.4%
기장점 | 5 | 1.0%
반찬 | 3 | 0.6%
담꾹 | 3 | 0.6%
부산정관점 | 3 | 0.6%
행복밥상금계리 | 2 | 0.4%
주)근해유통 | 2 | 0.4%
세븐일레븐 | 2 | 0.4%
3호점 | 2 | 0.4%
Other values (444) | 454 | 92.1%

Most occurring characters

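Value | Count | Frequency (%)
(space) | 99 | 3.8%
(character not shown) | 49 | 1.9%
(character not shown) | 48 | 1.8%
( | 45 | 1.7%
) | 45 | 1.7%
(character not shown) | 44 | 1.7%
(character not shown) | 43 | 1.6%
(character not shown) | 42 | 1.6%
(character not shown) | 41 | 1.6%
(character not shown) | 36 | 1.4%
Other values (432) | 2147 | 81.4%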

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2265 | 85.8%
Space Separator | 99 | 3.8%
Lowercase Letter | 97 | 3.7%
Uppercase Letter | 52 | 2.0%
Open Punctuation | 45 | 1.7%
Close Punctuation | 45 | 1.7%
Decimal Number | 29 | 1.1%
Other Punctuation | 4 | 0.2%
Dash Punctuation | 2 | 0.1%
Letter Number | 1 | < 0.1%

Most frequent character per category

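Other Letter
Value | Count | Frequency (%)
(character not shown) | 49 | 2.2%
(character not shown) | 48 | 2.1%
(character not shown) | 44 | 1.9%
(character not shown) | 43 | 1.9%
(character not shown) | 42 | 1.9%
(character not shown) | 41 | 1.8%
(character not shown) | 36 | 1.6%
(character not shown) | 35 | 1.5%
(character not shown) | 34 | 1.5%
(character not shown) | 34 | 1.5%
Other values (375) | 1859 | 82.1%

Lowercase Letter
Value | Count | Frequency (%)
e | 17 | 17.5%
o | 10 | 10.3%
a | 10 | 10.3%
m | 8 | 8.2%
r | 6 | 6.2%
y | 5 | 5.2%
t | 5 | 5.2%
b | 5 | 5.2%
k | 5 | 5.2%
i | 4 | 4.1%
Other values (10) | 22 | 22.7%

Uppercase Letter
Value | Count | Frequency (%)
A | 7 | 13.5%
E | 6 | 11.5%
C | 5 | 9.6%
F | 5 | 9.6%
B | 4 | 7.7%
M | 4 | 7.7%
O | 3 | 5.8%
D | 3 | 5.8%
L | 2 | 3.8%
G | 2 | 3.8%
Other values (10) | 11 | 21.2%

Decimal Number
Value | Count | Frequency (%)
0 | 6 | 20.7%
1 | 5 | 17.2%
8 | 3 | 10.3%
4 | 3 | 10.3%
9 | 3 | 10.3%
6 | 3 | 10.3%
2 | 3 | 10.3%
3 | 2 | 6.9%
5 | 1 | 3.4%

Other Punctuation
Value | Count | Frequency (%)
& | 2 | 50.0%
, | 1 | 25.0%
' | 1 | 25.0%

Space Separator
Value | Count | Frequency (%)
(space) | 99 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 45 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 45 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%

Letter Number
Value | Count | Frequency (%)
(character not shown) | 1 | 100.0%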

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2263 | 85.8%
Common | 224 | 8.5%
Latin | 150 | 5.7%
Han | 2 | 0.1%

Most frequent character per script

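Hangul
Value | Count | Frequency (%)
(character not shown) | 49 | 2.2%
(character not shown) | 48 | 2.1%
(character not shown) | 44 | 1.9%
(character not shown) | 43 | 1.9%
(character not shown) | 42 | 1.9%
(character not shown) | 41 | 1.8%
(character not shown) | 36 | 1.6%
(character not shown) | 35 | 1.5%
(character not shown) | 34 | 1.5%
(character not shown) | 34 | 1.5%
Other values (373) | 1857 | 82.1%

Latin
Value | Count | Frequency (%)
e | 17 | 11.3%
o | 10 | 6.7%
a | 10 | 6.7%
m | 8 | 5.3%
A | 7 | 4.7%
E | 6 | 4.0%
r | 6 | 4.0%
y | 5 | 3.3%
t | 5 | 3.3%
b | 5 | 3.3%
Other values (31) | 71 | 47.3%

Common
Value | Count | Frequency (%)
(space) | 99 | 44.2%
( | 45 | 20.1%
) | 45 | 20.1%
0 | 6 | 2.7%
1 | 5 | 2.2%
8 | 3 | 1.3%
4 | 3 | 1.3%
9 | 3 | 1.3%
6 | 3 | 1.3%
2 | 3 | 1.3%
Other values (6) | 9 | 4.0%

Han
Value | Count | Frequency (%)
(character not shown) | 1 | 50.0%
(character not shown) | 1 | 50.0%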

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2263 | 85.8%
ASCII | 373 | 14.1%
CJK | 2 | 0.1%
Number Forms | 1 | < 0.1%

Most frequent character per block

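ASCII
Value | Count | Frequency (%)
(space) | 99 | 26.5%
( | 45 | 12.1%
) | 45 | 12.1%
e | 17 | 4.6%
o | 10 | 2.7%
a | 10 | 2.7%
m | 8 | 2.1%
A | 7 | 1.9%
E | 6 | 1.6%
0 | 6 | 1.6%
Other values (46) | 120 | 32.2%

Hangul
Value | Count | Frequency (%)
(character not shown) | 49 | 2.2%
(character not shown) | 48 | 2.1%
(character not shown) | 44 | 1.9%
(character not shown) | 43 | 1.9%
(character not shown) | 42 | 1.9%
(character not shown) | 41 | 1.8%
(character not shown) | 36 | 1.6%
(character not shown) | 35 | 1.5%
(character not shown) | 34 | 1.5%
(character not shown) | 34 | 1.5%
Other values (373) | 1857 | 82.1%

CJK
Value | Count | Frequency (%)
(character not shown) | 1 | 50.0%
(character not shown) | 1 | 50.0%

Number Forms
Value | Count | Frequency (%)
(character not shown) | 1 | 100.0%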

소재지(도로명) (location, road-name address)
Text

Distinct: 383
Distinct (%): 97.5%
Missing: 1
Missing (%): 0.3%
Memory size: 3.2 KiB

Length

Max length: 56
Median length: 49
Mean length: 30.063613
Min length: 19

Characters and Unicode

Total characters: 11815
Distinct characters: 221
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 374
Unique (%): 95.2%

Sample

1st row: 부산광역시 기장군 기장읍 대라리 64-6
2nd row: 부산광역시 기장군 철마면 두송길 33-5, 1층
3rd row: 부산광역시 기장군 정관읍 정관1로 18, 123동 B-103호 (이지 더원1차 아파트)
4th row: 부산광역시 기장군 기장읍 읍내로104번길 19
5th row: 부산광역시 기장군 일광읍 일광로 128
Value | Count | Frequency (%)
부산광역시 | 393 | 15.4%
기장군 | 393 | 15.4%
정관읍 | 150 | 5.9%
기장읍 | 142 | 5.6%
1층 | 132 | 5.2%
일광읍 | 63 | 2.5%
장안읍 | 30 | 1.2%
기장해안로 | 28 | 1.1%
정관로 | 22 | 0.9%
101호 | 20 | 0.8%
Other values (568) | 1173 | 46.1%

Most occurring characters

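Value | Count | Frequency (%)
(space) | 2154 | 18.2%
(character not shown) | 625 | 5.3%
(character not shown) | 587 | 5.0%
1 | 587 | 5.0%
(character not shown) | 478 | 4.0%
(character not shown) | 432 | 3.7%
(character not shown) | 412 | 3.5%
(character not shown) | 401 | 3.4%
(character not shown) | 400 | 3.4%
(character not shown) | 393 | 3.3%
Other values (211) | 5346 | 45.2%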

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 7002 | 59.3%
Space Separator | 2154 | 18.2%
Decimal Number | 2027 | 17.2%
Other Punctuation | 296 | 2.5%
Dash Punctuation | 103 | 0.9%
Open Punctuation | 86 | 0.7%
Close Punctuation | 86 | 0.7%
Uppercase Letter | 56 | 0.5%
Lowercase Letter | 4 | < 0.1%
Math Symbol | 1 | < 0.1%

Most frequent character per category

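Other Letter
Value | Count | Frequency (%)
(character not shown) | 625 | 8.9%
(character not shown) | 587 | 8.4%
(character not shown) | 478 | 6.8%
(character not shown) | 432 | 6.2%
(character not shown) | 412 | 5.9%
(character not shown) | 401 | 5.7%
(character not shown) | 400 | 5.7%
(character not shown) | 393 | 5.6%
(character not shown) | 393 | 5.6%
(character not shown) | 315 | 4.5%
Other values (181) | 2566 | 36.6%

Uppercase Letter
Value | Count | Frequency (%)
B | 28 | 50.0%
A | 12 | 21.4%
H | 3 | 5.4%
L | 3 | 5.4%
D | 3 | 5.4%
C | 2 | 3.6%
K | 1 | 1.8%
W | 1 | 1.8%
J | 1 | 1.8%
S | 1 | 1.8%

Decimal Number
Value | Count | Frequency (%)
1 | 587 | 29.0%
2 | 269 | 13.3%
0 | 232 | 11.4%
3 | 205 | 10.1%
4 | 174 | 8.6%
5 | 161 | 7.9%
6 | 127 | 6.3%
7 | 111 | 5.5%
8 | 90 | 4.4%
9 | 71 | 3.5%

Lowercase Letter
Value | Count | Frequency (%)
a | 2 | 50.0%
z | 1 | 25.0%
l | 1 | 25.0%

Space Separator
Value | Count | Frequency (%)
(space) | 2154 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 296 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 103 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 86 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 86 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 1 | 100.0%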

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 7002 | 59.3%
Common | 4753 | 40.2%
Latin | 60 | 0.5%

Most frequent character per script

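Hangul
Value | Count | Frequency (%)
(character not shown) | 625 | 8.9%
(character not shown) | 587 | 8.4%
(character not shown) | 478 | 6.8%
(character not shown) | 432 | 6.2%
(character not shown) | 412 | 5.9%
(character not shown) | 401 | 5.7%
(character not shown) | 400 | 5.7%
(character not shown) | 393 | 5.6%
(character not shown) | 393 | 5.6%
(character not shown) | 315 | 4.5%
Other values (181) | 2566 | 36.6%

Common
Value | Count | Frequency (%)
(space) | 2154 | 45.3%
1 | 587 | 12.4%
, | 296 | 6.2%
2 | 269 | 5.7%
0 | 232 | 4.9%
3 | 205 | 4.3%
4 | 174 | 3.7%
5 | 161 | 3.4%
6 | 127 | 2.7%
7 | 111 | 2.3%
Other values (6) | 437 | 9.2%

Latin
Value | Count | Frequency (%)
B | 28 | 46.7%
A | 12 | 20.0%
H | 3 | 5.0%
L | 3 | 5.0%
D | 3 | 5.0%
a | 2 | 3.3%
C | 2 | 3.3%
K | 1 | 1.7%
W | 1 | 1.7%
J | 1 | 1.7%
Other values (4) | 4 | 6.7%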

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 7002 | 59.3%
ASCII | 4813 | 40.7%

Most frequent character per block

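ASCII
Value | Count | Frequency (%)
(space) | 2154 | 44.8%
1 | 587 | 12.2%
, | 296 | 6.2%
2 | 269 | 5.6%
0 | 232 | 4.8%
3 | 205 | 4.3%
4 | 174 | 3.6%
5 | 161 | 3.3%
6 | 127 | 2.6%
7 | 111 | 2.3%
Other values (20) | 497 | 10.3%

Hangul
Value | Count | Frequency (%)
(character not shown) | 625 | 8.9%
(character not shown) | 587 | 8.4%
(character not shown) | 478 | 6.8%
(character not shown) | 432 | 6.2%
(character not shown) | 412 | 5.9%
(character not shown) | 401 | 5.7%
(character not shown) | 400 | 5.7%
(character not shown) | 393 | 5.6%
(character not shown) | 393 | 5.6%
(character not shown) | 315 | 4.5%
Other values (181) | 2566 | 36.6%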

소재지전화 (location telephone)
Text

MISSING 

Distinct: 174
Distinct (%): 97.2%
Missing: 215
Missing (%): 54.6%
Memory size: 3.2 KiB

Length

Max length: 14
Median length: 12
Mean length: 12.03352
Min length: 12

Characters and Unicode

Total characters: 2154
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 169
Unique (%): 94.4%

Sample

1st row: 051-721-2179
2nd row: 051-515-1796
3rd row: 051-721-2537
4th row: 051-721-1612
5th row: 051-727-0548
Value | Count | Frequency (%)
051-727-2721 | 2 | 1.1%
051-728-4160 | 2 | 1.1%
051-922-2500 | 2 | 1.1%
051-727-3714 | 2 | 1.1%
051-609-8265 | 2 | 1.1%
051-722-2018 | 1 | 0.6%
051-721-2179 | 1 | 0.6%
051-724-2828 | 1 | 0.6%
051-728-0116 | 1 | 0.6%
051-721-2934 | 1 | 0.6%
Other values (164) | 164 | 91.6%

Most occurring characters

Value | Count | Frequency (%)
- | 358 | 16.6%
7 | 297 | 13.8%
0 | 286 | 13.3%
2 | 280 | 13.0%
1 | 269 | 12.5%
5 | 264 | 12.3%
8 | 95 | 4.4%
3 | 86 | 4.0%
4 | 82 | 3.8%
9 | 74 | 3.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1796 | 83.4%
Dash Punctuation | 358 | 16.6%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
7 | 297 | 16.5%
0 | 286 | 15.9%
2 | 280 | 15.6%
1 | 269 | 15.0%
5 | 264 | 14.7%
8 | 95 | 5.3%
3 | 86 | 4.8%
4 | 82 | 4.6%
9 | 74 | 4.1%
6 | 63 | 3.5%

Dash Punctuation
Value | Count | Frequency (%)
- | 358 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2154 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
- | 358 | 16.6%
7 | 297 | 13.8%
0 | 286 | 13.3%
2 | 280 | 13.0%
1 | 269 | 12.5%
5 | 264 | 12.3%
8 | 95 | 4.4%
3 | 86 | 4.0%
4 | 82 | 3.8%
9 | 74 | 3.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2154 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 358 | 16.6%
7 | 297 | 13.8%
0 | 286 | 13.3%
2 | 280 | 13.0%
1 | 269 | 12.5%
5 | 264 | 12.3%
8 | 95 | 4.4%
3 | 86 | 4.0%
4 | 82 | 3.8%
9 | 74 | 3.4%

Missing values

Figure (count bar chart): A simple visualization of nullity by column.
Figure (nullity matrix): The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
Figure (correlation heatmap): The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
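
The idea behind these nullity views can be sketched without plotting. The example below builds a 0/1 missing indicator per column from the first five rows of the Sample table in this report, counts missing cells per column, and computes a Pearson correlation between two indicators. The helper names `nullity` and `pearson` are illustrative, and using a plain Pearson correlation on the indicators is an assumption about how nullity correlation is measured.

```python
from math import sqrt

COLUMNS = ["업소명", "소재지(도로명)", "소재지전화"]

# First five rows of the Sample table; None stands for <NA>.
rows = [
    ("경주상회", "부산광역시 기장군 기장읍 대라리 64-6", "051-721-2179"),
    ("송정기름집", "부산광역시 기장군 철마면 두송길 33-5, 1층", None),
    ("안동참기름", "부산광역시 기장군 정관읍 정관1로 18, 123동 B-103호 (이지 더원1차 아파트)", "051-515-1796"),
    ("기장상회", "부산광역시 기장군 기장읍 읍내로104번길 19", "051-721-2537"),
    ("일광참기름상회", "부산광역시 기장군 일광읍 일광로 128", None),
]


def nullity(table, col):
    """Return one 0/1 flag per row: 1 if the cell in column `col` is missing."""
    return [1 if row[col] is None else 0 for row in table]


def pearson(xs, ys):
    """Pearson correlation of two equal-length series; None if either is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # a column with no variation in nullity has no defined correlation
    return cov / sqrt(var_x * var_y)


missing_per_column = {name: sum(nullity(rows, i)) for i, name in enumerate(COLUMNS)}
name_vs_phone = pearson(nullity(rows, 0), nullity(rows, 2))

print(missing_per_column)
print(name_vs_phone)
```

In these five rows only 소재지전화 has missing cells, so its correlation with the fully populated 업소명 column is undefined and the helper returns `None`.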

Sample

Index | 업소명 | 소재지(도로명) | 소재지전화
0 | 경주상회 | 부산광역시 기장군 기장읍 대라리 64-6 | 051-721-2179
1 | 송정기름집 | 부산광역시 기장군 철마면 두송길 33-5, 1층 | <NA>
2 | 안동참기름 | 부산광역시 기장군 정관읍 정관1로 18, 123동 B-103호 (이지 더원1차 아파트) | 051-515-1796
3 | 기장상회 | 부산광역시 기장군 기장읍 읍내로104번길 19 | 051-721-2537
4 | 일광참기름상회 | 부산광역시 기장군 일광읍 일광로 128 | <NA>
5 | 안동기름집 | 부산광역시 기장군 일광읍 기장해안로 1291 | <NA>
6 | 하서떡방앗간 | 부산광역시 기장군 기장읍 차성남로65번길 4 | 051-721-1612
7 | 칠암제분업 | 부산광역시 기장군 일광읍 일광로 646-1 | 051-727-0548
8 | 송정떡방앗간 | 부산광역시 기장군 철마면 여락송정로 334-16, 1층 | 051-508-4422
9 | 풍년상회 | 부산광역시 기장군 기장읍 차성로287번길 10, 1층 | 051-721-2022

Index | 업소명 | 소재지(도로명) | 소재지전화
384 | 민유통 | 부산광역시 기장군 정관읍 정관5로 50, 정관 홈플러스 지하1층 | <NA>
385 | 마켓 샤퀴테리 | 부산광역시 기장군 기장읍 기장해안로 267-7, 클리퍼C동 1층 | <NA>
386 | 카포티(Capote) | 부산광역시 기장군 기장읍 기장해안로 267-17, 엘피크리스탈(메인)동 2층 | <NA>
387 | Le Blanc Bakery(르블랑 베이커리) | 부산광역시 기장군 기장읍 기장해안로 267-17, 엘피크리스탈(메인동), 2층 | <NA>
388 | 주식회사 현승에프앤디 | 부산광역시 기장군 정관읍 산단1로 133, 탑마트 정관점 1층 | <NA>
389 | 주경식품 | 부산광역시 기장군 일광읍 기장대로 673, 메가마트 1층 | <NA>
390 | 현재상사 | 부산광역시 기장군 정관읍 산단1로 133, 탑마트 | <NA>
391 | 과일비행기 | 부산광역시 기장군 정관읍 가동1길 23-5 | <NA>
392 | (주)미트벨리(탑마트-서부점 內) | 부산광역시 기장군 기장읍 읍내로 49, 탑마트 | <NA>
393 | 솔화담 | 부산광역시 기장군 정관읍 모전1길 70-1, 101호 | <NA>

Duplicate rows

Most frequently occurring

Index | 업소명 | 소재지(도로명) | 소재지전화 | # duplicates
0 | 롯데쇼핑(주)롯데마트동부산점 | 부산광역시 기장군 기장읍 기장해안로 147, 2층 (롯데몰동부산점) | 051-922-2500 | 2
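
A minimal sketch of how fully duplicated rows like this one can be detected, treating each row as a tuple and counting occurrences. The three-row input is an illustrative subset built from values shown in this report (the duplicated row is listed twice to mimic its two occurrences), and interpreting "Duplicate rows: 1" as the number of extra copies is an assumption.

```python
from collections import Counter

lotte = (
    "롯데쇼핑(주)롯데마트동부산점",
    "부산광역시 기장군 기장읍 기장해안로 147, 2층 (롯데몰동부산점)",
    "051-922-2500",
)

# Illustrative subset: the duplicated row twice, plus one other row from the Sample table.
rows = [
    lotte,
    ("경주상회", "부산광역시 기장군 기장읍 대라리 64-6", "051-721-2179"),
    lotte,
]

# Rows are hashable tuples, so identical rows collapse onto the same key.
row_counts = Counter(rows)

# Rows that occur more than once, with their occurrence counts ("# duplicates").
duplicated = {row: n for row, n in row_counts.items() if n > 1}

# Assumption: the headline duplicate count is the number of extra copies.
extra_copies = sum(n - 1 for n in duplicated.values())

for row, n in duplicated.items():
    print(row[0], n)
print("Extra copies:", extra_copies)
```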