Overview

Dataset statistics

Number of variables: 3
Number of observations: 203
Missing cells: 84
Missing cells (%): 13.8%
Duplicate rows: 1
Duplicate rows (%): 0.5%
Total size in memory: 4.9 KiB
Average record size in memory: 24.6 B
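
The percentages in these statistics follow from the raw counts. The sketch below reproduces the arithmetic, assuming the missing-cell percentage is taken over all cells (rows times columns) and the duplicate percentage over rows. The variable names are illustrative and not part of the report.

```python
# Counts copied from the dataset statistics.
n_rows, n_cols = 203, 3        # observations, variables
missing_cells = 84
duplicate_rows = 1
avg_record_bytes = 24.6        # average record size in memory, in bytes

# Assumption: missing cells are measured against every cell in the table.
missing_cells_pct = 100 * missing_cells / (n_rows * n_cols)
# Assumption: duplicate rows are measured against the number of rows.
duplicate_rows_pct = 100 * duplicate_rows / n_rows
# Total size implied by the average record size, converted to KiB.
total_kib = avg_record_bytes * n_rows / 1024

print(f"Missing cells (%): {missing_cells_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_rows_pct:.1f}%")
print(f"Total size in memory: {total_kib:.1f} KiB")
```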

Variable types

Text: 3

Dataset

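Description: Data on the designation status of tobacco retailers located in Goseong-gun, Gyeongsangnam-do, providing fields such as business name (업소명), address (주소), and telephone number (전화번호).
Author: Goseong-gun, Gyeongsangnam-do (경상남도 고성군)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15021178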

Alerts

Dataset has 1 (0.5%) duplicate rows [Duplicates]
전화번호 has 84 (41.4%) missing values [Missing]

Reproduction

Analysis started: 2024-04-20 18:11:13.682797
Analysis finished: 2024-04-20 18:11:14.203087
Duration: 0.52 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
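
A report of this kind could be regenerated along the following lines. This is a minimal sketch, not the exact command used for this report: the function name `build_report`, the default output path, and the choice to read every column as a nullable string are assumptions, and the original run also used a `config.json` that is not reproduced here.

```python
def build_report(csv_path, out_path="report.html", encoding="utf-8"):
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so the function can be defined without the optional packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: all columns are read as nullable strings, so empty
    # telephone numbers appear as <NA>.
    df = pd.read_csv(csv_path, dtype="string", encoding=encoding)

    # Build the profile and write it out as a standalone HTML file.
    profile = ProfileReport(df, title="Overview")
    profile.to_file(out_path)
    return profile
```

Calling `build_report("retailers.csv")` with a hypothetical local copy of the dataset would write `report.html` to the working directory.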

Variables

업소명
Text

Distinct: 196
Distinct (%): 96.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Length

Max length: 21
Median length: 15
Mean length: 6.270936
Min length: 2

Characters and Unicode

Total characters: 1273
Distinct characters: 243
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
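
As an illustration of these character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using one business name from the Sample section. The mapping from two-letter category codes to the labels used in this report is written out by hand, and the function name is illustrative.

```python
import unicodedata
from collections import Counter

# Two-letter general category codes mapped to the labels used in this report.
CATEGORY_LABELS = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Nl": "Letter Number",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
}

def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

counts = category_counts("GS25 고성광장점")
for code, n in counts.most_common():
    print(CATEGORY_LABELS.get(code, code), n)
```

Hangul syllables fall under Other Letter, which is why that category dominates the text variables in this report.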

Unique

Unique: 189
Unique (%): 93.1%

Sample

1st row: SK 종합상사
2nd row: 하이
3rd row: GS25 고성광장점
4th row: 이마트24 고성점
5th row: 또또24시편의점

| Value | Count | Frequency (%) |
| 씨유 | 10 | 4.0% |
| gs25 | 8 | 3.2% |
| 고성점 | 3 | 1.2% |
| 매점 | 3 | 1.2% |
| 세븐일레븐 | 3 | 1.2% |
| 주)연합진흥 | 2 | 0.8% |
| 삼락휴게소매점 | 2 | 0.8% |
| 하나로마트 | 2 | 0.8% |
| 이마트24 | 2 | 0.8% |
| 학우사 | 2 | 0.8% |
| Other values (203) | 210 | 85.0% |
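
The word frequencies count whitespace-separated tokens rather than whole values, and entries such as `gs25` suggest the text is lowercased first. The sketch below shows that kind of count on the ten business names from the head of the Sample table. The lowercase-and-split rule is an assumption about the tokenization, not a confirmed description of the tool's internals.

```python
from collections import Counter

# Business names from the first ten rows of the Sample table.
names = [
    "SK 종합상사", "하이", "GS25 고성광장점", "이마트24 고성점",
    "또또24시편의점", "국군복지단 부산지원본부", "산들바다관광농원",
    "씨유 고성동외타운점", "세븐일레븐 고성스마일점", "이마트24 고성송학점",
]

def word_counts(values):
    """Count lowercased, whitespace-separated tokens across all values."""
    counts = Counter()
    for value in values:
        counts.update(value.lower().split())
    return counts

counts = word_counts(names)
total = sum(counts.values())
for word, n in counts.most_common(3):
    # Frequency is the token count relative to all tokens.
    print(word, n, f"{100 * n / total:.1f}%")
```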

Most occurring characters

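| Value | Count | Frequency (%) |
|  | 60 | 4.7% |
|  | 58 | 4.6% |
|  | 50 | 3.9% |
| (space) | 44 | 3.5% |
|  | 40 | 3.1% |
|  | 37 | 2.9% |
|  | 34 | 2.7% |
|  | 31 | 2.4% |
|  | 25 | 2.0% |
|  | 23 | 1.8% |
| Other values (233) | 871 | 68.4% |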

Most occurring categories

| Value | Count | Frequency (%) |
| Other Letter | 1149 | 90.3% |
| Space Separator | 44 | 3.5% |
| Decimal Number | 31 | 2.4% |
| Uppercase Letter | 31 | 2.4% |
| Open Punctuation | 8 | 0.6% |
| Close Punctuation | 8 | 0.6% |
| Letter Number | 1 | 0.1% |
| Other Punctuation | 1 | 0.1% |

Most frequent character per category

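Other Letter
| Value | Count | Frequency (%) |
|  | 60 | 5.2% |
|  | 58 | 5.0% |
|  | 50 | 4.4% |
|  | 40 | 3.5% |
|  | 37 | 3.2% |
|  | 34 | 3.0% |
|  | 31 | 2.7% |
|  | 25 | 2.2% |
|  | 23 | 2.0% |
|  | 20 | 1.7% |
| Other values (217) | 771 | 67.1% |

Uppercase Letter
| Value | Count | Frequency (%) |
| S | 13 | 41.9% |
| G | 12 | 38.7% |
| K | 2 | 6.5% |
| L | 1 | 3.2% |
| P | 1 | 3.2% |
| U | 1 | 3.2% |
| C | 1 | 3.2% |

Decimal Number
| Value | Count | Frequency (%) |
| 2 | 15 | 48.4% |
| 5 | 11 | 35.5% |
| 4 | 4 | 12.9% |
| 1 | 1 | 3.2% |

Space Separator
| Value | Count | Frequency (%) |
| (space) | 44 | 100.0% |

Open Punctuation
| Value | Count | Frequency (%) |
| ( | 8 | 100.0% |

Close Punctuation
| Value | Count | Frequency (%) |
| ) | 8 | 100.0% |

Letter Number
| Value | Count | Frequency (%) |
|  | 1 | 100.0% |

Other Punctuation
| Value | Count | Frequency (%) |
| . | 1 | 100.0% |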

Most occurring scripts

| Value | Count | Frequency (%) |
| Hangul | 1149 | 90.3% |
| Common | 92 | 7.2% |
| Latin | 32 | 2.5% |

Most frequent character per script

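Hangul
| Value | Count | Frequency (%) |
|  | 60 | 5.2% |
|  | 58 | 5.0% |
|  | 50 | 4.4% |
|  | 40 | 3.5% |
|  | 37 | 3.2% |
|  | 34 | 3.0% |
|  | 31 | 2.7% |
|  | 25 | 2.2% |
|  | 23 | 2.0% |
|  | 20 | 1.7% |
| Other values (217) | 771 | 67.1% |

Common
| Value | Count | Frequency (%) |
| (space) | 44 | 47.8% |
| 2 | 15 | 16.3% |
| 5 | 11 | 12.0% |
| ( | 8 | 8.7% |
| ) | 8 | 8.7% |
| 4 | 4 | 4.3% |
| 1 | 1 | 1.1% |
| . | 1 | 1.1% |

Latin
| Value | Count | Frequency (%) |
| S | 13 | 40.6% |
| G | 12 | 37.5% |
| K | 2 | 6.2% |
| L | 1 | 3.1% |
| P | 1 | 3.1% |
| U | 1 | 3.1% |
| C | 1 | 3.1% |
|  | 1 | 3.1% |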

Most occurring blocks

| Value | Count | Frequency (%) |
| Hangul | 1149 | 90.3% |
| ASCII | 123 | 9.7% |
| Number Forms | 1 | 0.1% |

Most frequent character per block

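Hangul
| Value | Count | Frequency (%) |
|  | 60 | 5.2% |
|  | 58 | 5.0% |
|  | 50 | 4.4% |
|  | 40 | 3.5% |
|  | 37 | 3.2% |
|  | 34 | 3.0% |
|  | 31 | 2.7% |
|  | 25 | 2.2% |
|  | 23 | 2.0% |
|  | 20 | 1.7% |
| Other values (217) | 771 | 67.1% |

ASCII
| Value | Count | Frequency (%) |
| (space) | 44 | 35.8% |
| 2 | 15 | 12.2% |
| S | 13 | 10.6% |
| G | 12 | 9.8% |
| 5 | 11 | 8.9% |
| ( | 8 | 6.5% |
| ) | 8 | 6.5% |
| 4 | 4 | 3.3% |
| K | 2 | 1.6% |
| L | 1 | 0.8% |
| Other values (5) | 5 | 4.1% |

Number Forms
| Value | Count | Frequency (%) |
|  | 1 | 100.0% |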

주소
Text

Distinct: 200
Distinct (%): 98.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Length

Max length: 40
Median length: 36
Mean length: 22.70936
Min length: 18

Characters and Unicode

Total characters: 4610
Distinct characters: 149
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 197
Unique (%): 97.0%

Sample

1st row: 경상남도 고성군 하이면 공룡로 10
2nd row: 경상남도 고성군 하이면 하이로 243, 1층
3rd row: 경상남도 고성군 고성읍 동외로 137-1
4th row: 경상남도 고성군 고성읍 남포로 49
5th row: 경상남도 고성군 고성읍 중앙로 33

| Value | Count | Frequency (%) |
| 경상남도 | 203 | 19.4% |
| 고성군 | 203 | 19.4% |
| 고성읍 | 87 | 8.3% |
| 회화면 | 21 | 2.0% |
| 동해면 | 18 | 1.7% |
| 하이면 | 15 | 1.4% |
| 남해안대로 | 15 | 1.4% |
| 거류면 | 13 | 1.2% |
| 동외로 | 10 | 1.0% |
| 성내로 | 10 | 1.0% |
| Other values (303) | 452 | 43.2% |

Most occurring characters

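| Value | Count | Frequency (%) |
| (space) | 908 | 19.7% |
|  | 309 | 6.7% |
|  | 298 | 6.5% |
|  | 231 | 5.0% |
|  | 222 | 4.8% |
|  | 207 | 4.5% |
|  | 204 | 4.4% |
|  | 203 | 4.4% |
| 1 | 167 | 3.6% |
|  | 146 | 3.2% |
| Other values (139) | 1715 | 37.2% |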

Most occurring categories

| Value | Count | Frequency (%) |
| Other Letter | 2871 | 62.3% |
| Space Separator | 908 | 19.7% |
| Decimal Number | 728 | 15.8% |
| Dash Punctuation | 60 | 1.3% |
| Close Punctuation | 15 | 0.3% |
| Open Punctuation | 15 | 0.3% |
| Other Punctuation | 11 | 0.2% |
| Uppercase Letter | 2 | < 0.1% |

Most frequent character per category

Other Letter
| Value | Count | Frequency (%) |
|  | 309 | 10.8% |
|  | 298 | 10.4% |
|  | 231 | 8.0% |
|  | 222 | 7.7% |
|  | 207 | 7.2% |
|  | 204 | 7.1% |
|  | 203 | 7.1% |
|  | 146 | 5.1% |
|  | 116 | 4.0% |
|  | 87 | 3.0% |
| Other values (122) | 848 | 29.5% |

Decimal Number
| Value | Count | Frequency (%) |
| 1 | 167 | 22.9% |
| 3 | 93 | 12.8% |
| 2 | 87 | 12.0% |
| 5 | 72 | 9.9% |
| 4 | 68 | 9.3% |
| 6 | 62 | 8.5% |
| 7 | 56 | 7.7% |
| 9 | 45 | 6.2% |
| 0 | 42 | 5.8% |
| 8 | 36 | 4.9% |

Uppercase Letter
| Value | Count | Frequency (%) |
| S | 1 | 50.0% |
| D | 1 | 50.0% |

Space Separator
| Value | Count | Frequency (%) |
| (space) | 908 | 100.0% |

Dash Punctuation
| Value | Count | Frequency (%) |
| - | 60 | 100.0% |

Close Punctuation
| Value | Count | Frequency (%) |
| ) | 15 | 100.0% |

Open Punctuation
| Value | Count | Frequency (%) |
| ( | 15 | 100.0% |

Other Punctuation
| Value | Count | Frequency (%) |
| , | 11 | 100.0% |

Most occurring scripts

| Value | Count | Frequency (%) |
| Hangul | 2871 | 62.3% |
| Common | 1737 | 37.7% |
| Latin | 2 | < 0.1% |

Most frequent character per script

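Hangul
| Value | Count | Frequency (%) |
|  | 309 | 10.8% |
|  | 298 | 10.4% |
|  | 231 | 8.0% |
|  | 222 | 7.7% |
|  | 207 | 7.2% |
|  | 204 | 7.1% |
|  | 203 | 7.1% |
|  | 146 | 5.1% |
|  | 116 | 4.0% |
|  | 87 | 3.0% |
| Other values (122) | 848 | 29.5% |

Common
| Value | Count | Frequency (%) |
| (space) | 908 | 52.3% |
| 1 | 167 | 9.6% |
| 3 | 93 | 5.4% |
| 2 | 87 | 5.0% |
| 5 | 72 | 4.1% |
| 4 | 68 | 3.9% |
| 6 | 62 | 3.6% |
| - | 60 | 3.5% |
| 7 | 56 | 3.2% |
| 9 | 45 | 2.6% |
| Other values (5) | 119 | 6.9% |

Latin
| Value | Count | Frequency (%) |
| S | 1 | 50.0% |
| D | 1 | 50.0% |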

Most occurring blocks

| Value | Count | Frequency (%) |
| Hangul | 2871 | 62.3% |
| ASCII | 1739 | 37.7% |

Most frequent character per block

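ASCII
| Value | Count | Frequency (%) |
| (space) | 908 | 52.2% |
| 1 | 167 | 9.6% |
| 3 | 93 | 5.3% |
| 2 | 87 | 5.0% |
| 5 | 72 | 4.1% |
| 4 | 68 | 3.9% |
| 6 | 62 | 3.6% |
| - | 60 | 3.5% |
| 7 | 56 | 3.2% |
| 9 | 45 | 2.6% |
| Other values (7) | 121 | 7.0% |

Hangul
| Value | Count | Frequency (%) |
|  | 309 | 10.8% |
|  | 298 | 10.4% |
|  | 231 | 8.0% |
|  | 222 | 7.7% |
|  | 207 | 7.2% |
|  | 204 | 7.1% |
|  | 203 | 7.1% |
|  | 146 | 5.1% |
|  | 116 | 4.0% |
|  | 87 | 3.0% |
| Other values (122) | 848 | 29.5% |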

전화번호
Text

MISSING 

Distinct: 117
Distinct (%): 98.3%
Missing: 84
Missing (%): 41.4%
Memory size: 1.7 KiB

Length

Max length: 13
Median length: 12
Mean length: 12.033613
Min length: 12
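
The mean length reported for each variable is consistent with dividing its total character count by the number of non-missing values, which matters here because 84 telephone numbers are missing. The sketch below checks this with figures copied from the report. The helper name and the non-missing-denominator rule are assumptions inferred from the numbers, not documented behaviour.

```python
N_ROWS = 203  # number of observations in the dataset

def mean_length(total_characters, n_missing, n_rows=N_ROWS):
    """Mean string length over the non-missing values only."""
    return total_characters / (n_rows - n_missing)

# (total characters, missing values) for each variable, as reported.
stats = {
    "업소명": (1273, 0),
    "주소": (4610, 0),
    "전화번호": (1432, 84),
}
means = {name: mean_length(*values) for name, values in stats.items()}
for name, value in means.items():
    print(name, round(value, 6))
```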

Characters and Unicode

Total characters: 1432
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 115
Unique (%): 96.6%

Sample

1st row: 055-835-4727
2nd row: 055-674-1859
3rd row: 070-8832-1648
4th row: 055-672-0082
5th row: 055-673-2895

| Value | Count | Frequency (%) |
| 055-673-4422 | 2 | 1.7% |
| 055-670-2800 | 2 | 1.7% |
| 055-672-6153 | 1 | 0.8% |
| 055-672-6016 | 1 | 0.8% |
| 055-673-8210 | 1 | 0.8% |
| 055-672-4119 | 1 | 0.8% |
| 055-672-2150 | 1 | 0.8% |
| 055-672-0022 | 1 | 0.8% |
| 055-673-0886 | 1 | 0.8% |
| 055-674-2380 | 1 | 0.8% |
| Other values (107) | 107 | 89.9% |

Most occurring characters

| Value | Count | Frequency (%) |
| 5 | 283 | 19.8% |
| - | 238 | 16.6% |
| 0 | 192 | 13.4% |
| 6 | 158 | 11.0% |
| 7 | 152 | 10.6% |
| 2 | 103 | 7.2% |
| 3 | 90 | 6.3% |
| 4 | 80 | 5.6% |
| 1 | 53 | 3.7% |
| 8 | 51 | 3.6% |

Most occurring categories

| Value | Count | Frequency (%) |
| Decimal Number | 1194 | 83.4% |
| Dash Punctuation | 238 | 16.6% |

Most frequent character per category

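Decimal Number
| Value | Count | Frequency (%) |
| 5 | 283 | 23.7% |
| 0 | 192 | 16.1% |
| 6 | 158 | 13.2% |
| 7 | 152 | 12.7% |
| 2 | 103 | 8.6% |
| 3 | 90 | 7.5% |
| 4 | 80 | 6.7% |
| 1 | 53 | 4.4% |
| 8 | 51 | 4.3% |
| 9 | 32 | 2.7% |

Dash Punctuation
| Value | Count | Frequency (%) |
| - | 238 | 100.0% |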

Most occurring scripts

| Value | Count | Frequency (%) |
| Common | 1432 | 100.0% |

Most frequent character per script

Common
| Value | Count | Frequency (%) |
| 5 | 283 | 19.8% |
| - | 238 | 16.6% |
| 0 | 192 | 13.4% |
| 6 | 158 | 11.0% |
| 7 | 152 | 10.6% |
| 2 | 103 | 7.2% |
| 3 | 90 | 6.3% |
| 4 | 80 | 5.6% |
| 1 | 53 | 3.7% |
| 8 | 51 | 3.6% |

Most occurring blocks

| Value | Count | Frequency (%) |
| ASCII | 1432 | 100.0% |

Most frequent character per block

ASCII
| Value | Count | Frequency (%) |
| 5 | 283 | 19.8% |
| - | 238 | 16.6% |
| 0 | 192 | 13.4% |
| 6 | 158 | 11.0% |
| 7 | 152 | 10.6% |
| 2 | 103 | 7.2% |
| 3 | 90 | 6.3% |
| 4 | 80 | 5.6% |
| 1 | 53 | 3.7% |
| 8 | 51 | 3.6% |

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
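
Both views can be approximated without any plotting. The sketch below counts missing values per column and prints a text version of a nullity matrix for the first five rows of the Sample table. Representing `<NA>` as `None` and using `#` and `.` as markers are illustrative choices, not the tool's own rendering.

```python
COLUMNS = ("업소명", "주소", "전화번호")

# First five rows of the Sample table; <NA> is represented as None.
rows = [
    ("SK 종합상사", "경상남도 고성군 하이면 공룡로 10", "055-835-4727"),
    ("하이", "경상남도 고성군 하이면 하이로 243, 1층", None),
    ("GS25 고성광장점", "경상남도 고성군 고성읍 동외로 137-1", None),
    ("이마트24 고성점", "경상남도 고성군 고성읍 남포로 49", None),
    ("또또24시편의점", "경상남도 고성군 고성읍 중앙로 33", "055-674-1859"),
]

# Nullity by column: number of missing values in each column.
missing = {
    col: sum(row[i] is None for row in rows)
    for i, col in enumerate(COLUMNS)
}

# Nullity matrix: one line per row, '#' for present and '.' for missing.
matrix = ["".join("." if value is None else "#" for value in row) for row in rows]

print(missing)
print("\n".join(matrix))
```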

Sample

| Index | 업소명 | 주소 | 전화번호 |
| 0 | SK 종합상사 | 경상남도 고성군 하이면 공룡로 10 | 055-835-4727 |
| 1 | 하이 | 경상남도 고성군 하이면 하이로 243, 1층 | <NA> |
| 2 | GS25 고성광장점 | 경상남도 고성군 고성읍 동외로 137-1 | <NA> |
| 3 | 이마트24 고성점 | 경상남도 고성군 고성읍 남포로 49 | <NA> |
| 4 | 또또24시편의점 | 경상남도 고성군 고성읍 중앙로 33 | 055-674-1859 |
| 5 | 국군복지단 부산지원본부 | 경상남도 고성군 상리면 부포리 192-5 | 070-8832-1648 |
| 6 | 산들바다관광농원 | 경상남도 고성군 동해면 외산로 330 | 055-672-0082 |
| 7 | 씨유 고성동외타운점 | 경상남도 고성군 고성읍 동외로168번길 69 | <NA> |
| 8 | 세븐일레븐 고성스마일점 | 경상남도 고성군 고성읍 성내로 14, 2층 | <NA> |
| 9 | 이마트24 고성송학점 | 경상남도 고성군 고성읍 중앙로43번길 73-13 | <NA> |

| Index | 업소명 | 주소 | 전화번호 |
| 193 | 연화주유소 매점 | 경상남도 고성군 개천면 북평리 314-11 | 055-672-6655 |
| 194 | 가천상회 | 경상남도 고성군 개천면 가천리 787 | 055-673-2562 |
| 195 | 아지매상회 | 경상남도 고성군 거류면 당동리 127 | <NA> |
| 196 | 용산상회 | 경상남도 고성군 거류면 용산리 430-7 | <NA> |
| 197 | 거산상회 | 경상남도 고성군 거류면 동해로 526 | <NA> |
| 198 | 동림상회 | 경상남도 고성군 동해면 봉암2길 93-1 | <NA> |
| 199 | 돈막상회 | 경상남도 고성군 동해면 외산3길 98-8 | <NA> |
| 200 | 장산상회 | 경상남도 고성군 마암면 장산리 150 | 055-672-6014 |
| 201 | 만물상회 | 경상남도 고성군 구만면 효락리 847-2 | 055-672-3017 |
| 202 | 성내슈퍼 | 경상남도 고성군 고성읍 중앙로15번길 19 | <NA> |

Duplicate rows

Most frequently occurring

| Index | 업소명 | 주소 | 전화번호 | # duplicates |
| 0 | 삼락휴게소매점 | 경상남도 고성군 마암면 남해안대로 3390 | 055-673-4422 | 2 |
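
A row is counted as a duplicate when all three fields match another row exactly. The sketch below shows one way to find such rows with the standard library. The small `rows` list is illustrative: it contains the duplicated record twice plus one other record from the Sample table, not the full dataset.

```python
from collections import Counter

# Illustrative subset: the duplicated record twice, plus one unrelated record.
rows = [
    ("삼락휴게소매점", "경상남도 고성군 마암면 남해안대로 3390", "055-673-4422"),
    ("장산상회", "경상남도 고성군 마암면 장산리 150", "055-672-6014"),
    ("삼락휴게소매점", "경상남도 고성군 마암면 남해안대로 3390", "055-673-4422"),
]

def duplicate_rows(records):
    """Return each record that occurs more than once, with its count."""
    counts = Counter(records)
    return {record: n for record, n in counts.items() if n > 1}

duplicates = duplicate_rows(rows)
for record, n in duplicates.items():
    print(record, n)
```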