Overview

Dataset statistics

Number of variables: 4
Number of observations: 2489
Missing cells: 573
Missing cells (%): 5.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 77.9 KiB
Average record size in memory: 32.1 B
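
The two missing-value percentages in this report use different denominators: the overview figure is per cell (rows times columns), while the per-variable figure is per row. A minimal sketch of that arithmetic, using only the counts reported here (the function names are illustrative, not part of ydata-profiling):

```python
def missing_cell_pct(missing_cells: int, n_rows: int, n_cols: int) -> float:
    """Share of missing cells over the whole table, in percent."""
    return 100.0 * missing_cells / (n_rows * n_cols)


def missing_row_pct(missing_values: int, n_rows: int) -> float:
    """Share of missing values within a single column, in percent."""
    return 100.0 * missing_values / n_rows


# Counts taken from the report: 573 missing cells, 2489 rows, 4 columns.
table_pct = missing_cell_pct(573, 2489, 4)
column_pct = missing_row_pct(573, 2489)
print(f"table: {table_pct:.1f}%  column: {column_pct:.1f}%")
```

Because every missing cell sits in one column, the column-level rate is four times the table-level rate.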

Variable types

Categorical: 1
Text: 3

Dataset

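Description: Data on general restaurants (일반음식점) and rest restaurants (휴게음식점) licensed in Seosan City (서산시). The source lists the following fields: business type name, business name, location, business floor area, telephone number, business category name, and data reference date.
Author: 충청남도 (Chungcheongnam-do)
URL: https://alldam.chungnam.go.kr/index.chungnam?menuCd=DOM_000000201001001001&st=&cds=&orgCd=&apiType=&isOpen=Y&pageIndex=447&beforeMenuCd=DOM_000000201001001000&publicdatapk=15000815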

Alerts

업종명 has constant value "일반음식점" (Constant)
소재지전화 has 573 (23.0%) missing values (Missing)
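
The two alerts can be reproduced with a simple per-column check. The sketch below is an illustration in plain Python, not ydata-profiling's own alert logic (which applies configurable thresholds); `column_alerts` and the placeholder phone numbers are hypothetical.

```python
from typing import List, Optional, Sequence


def column_alerts(name: str, values: Sequence[Optional[str]]) -> List[str]:
    """Return alert strings for a constant column or one with missing values."""
    alerts = []
    present = [v for v in values if v is not None]
    # A column is constant when all non-missing values are identical.
    if present and len(set(present)) == 1:
        alerts.append(f'{name} has constant value "{present[0]}"')
    n_missing = len(values) - len(present)
    if n_missing:
        pct = 100.0 * n_missing / len(values)
        alerts.append(f"{name} has {n_missing} ({pct:.1f}%) missing values")
    return alerts


# 업종명 holds the same value in all 2489 rows.
type_alerts = column_alerts("업종명", ["일반음식점"] * 2489)
# Placeholder numbers (hypothetical) stand in for the 1916 present phone values.
phones = [f"041-000-{i:04d}" for i in range(1916)] + [None] * 573
phone_alerts = column_alerts("소재지전화", phones)
print(type_alerts + phone_alerts)
```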

Reproduction

Analysis started: 2024-01-09 21:07:00.616580
Analysis finished: 2024-01-09 21:07:01.179467
Duration: 0.56 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
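
A report like this one could be regenerated roughly as follows. This is a sketch under assumptions: the CSV path, the report title, and the choice to read every column as a string are hypothetical, and the exact settings used for the original run are in config.json, not here. The third-party imports are deferred into the function so the metadata helper can be used on its own.

```python
def dataset_metadata() -> dict:
    """Dataset block shown in the report overview (values from this report)."""
    return {
        "description": "General and rest restaurants licensed in Seosan City",
        "author": "충청남도",
        "url": (
            "https://alldam.chungnam.go.kr/index.chungnam"
            "?menuCd=DOM_000000201001001001&st=&cds=&orgCd=&apiType="
            "&isOpen=Y&pageIndex=447"
            "&beforeMenuCd=DOM_000000201001001000&publicdatapk=15000815"
        ),
    }


def build_report(csv_path: str, out_path: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report (requires pandas, ydata-profiling)."""
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: all columns are read as text, matching the Text/Categorical types above.
    df = pd.read_csv(csv_path, dtype=str)
    profile = ProfileReport(df, title="Seosan restaurants", dataset=dataset_metadata())
    profile.to_file(out_path)


metadata = dataset_metadata()
```

Calling `build_report("restaurants.csv")` (a hypothetical file name) would write `report.html` next to the script.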

Variables

업종명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 19.6 KiB
일반음식점: 2489

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%
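
"Distinct" and "Unique" measure different things, which is why a constant column can show Distinct 1 and Unique 0. Distinct counts the different values present; Unique counts the values that occur exactly once. A minimal sketch of both counts (the helper name is illustrative):

```python
from collections import Counter
from typing import Iterable, Tuple


def distinct_and_unique(values: Iterable[str]) -> Tuple[int, int]:
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# 업종명 repeats one value 2489 times: one distinct value, none occurring once.
constant_result = distinct_and_unique(["일반음식점"] * 2489)
print(constant_result)
```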

Sample

1st row: 일반음식점
2nd row: 일반음식점
3rd row: 일반음식점
4th row: 일반음식점
5th row: 일반음식점

Common Values

Value | Count | Frequency (%)
일반음식점 | 2489 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

업소명
Text

Distinct: 2449
Distinct (%): 98.4%
Missing: 0
Missing (%): 0.0%
Memory size: 19.6 KiB

Length

Max length: 23
Median length: 19
Mean length: 5.7147449
Min length: 1
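
The length statistics can be recomputed directly from the raw strings, which is also a reasonable way to cross-check the figures above: a median of 19 is hard to reconcile with a mean of about 5.71, since half the values being at least 19 characters long would force a higher mean. A minimal sketch using the five sample names from this report, assuming length is counted in Unicode code points (as Python's `len` does):

```python
from statistics import mean, median
from typing import Dict, Sequence


def length_summary(values: Sequence[str]) -> Dict[str, float]:
    """Max, median, mean and min string length, counted in code points."""
    lengths = [len(v) for v in values]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }


# First five values of 업소명, taken from the Sample block of this report.
names = ["송남회관", "영성각", "버드내식당", "태광식당", "서산식당"]
summary = length_summary(names)
print(summary)
```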

Characters and Unicode

Total characters: 14224
Distinct characters: 702
Distinct categories: 8
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
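
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for two business names from this report; the mapping covers only the eight categories listed for this variable, and the helper name is illustrative.

```python
import unicodedata
from collections import Counter
from typing import Iterable

# Two-letter general category codes mapped to the names used in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
}


def category_counts(values: Iterable[str]) -> Counter:
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


counts = category_counts(["육감만족(성연점)", "한우암소갈비(우미관)"])
print(counts.most_common())
```

Hangul syllables fall under "Other Letter", which is why that category dominates the tables for this variable.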

Unique

Unique: 2413
Unique (%): 96.9%

Sample

1st row: 송남회관
2nd row: 영성각
3rd row: 버드내식당
4th row: 태광식당
5th row: 서산식당
Value | Count | Frequency (%)
서산점 | 44 | 1.5%
대산점 | 15 | 0.5%
호수공원점 | 13 | 0.5%
서산호수공원점 | 11 | 0.4%
동문점 | 10 | 0.4%
해미점 | 9 | 0.3%
예천점 | 9 | 0.3%
읍내점 | 7 | 0.2%
2호점 | 7 | 0.2%
처갓집양념치킨 | 5 | 0.2%
Other values (2549) | 2711 | 95.4%

Most occurring characters

ValueCountFrequency (%)
407
 
2.9%
352
 
2.5%
349
 
2.5%
310
 
2.2%
290
 
2.0%
289
 
2.0%
262
 
1.8%
220
 
1.5%
212
 
1.5%
205
 
1.4%
Other values (692) 11328
79.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 13403 | 94.2%
Space Separator | 352 | 2.5%
Close Punctuation | 166 | 1.2%
Open Punctuation | 166 | 1.2%
Decimal Number | 96 | 0.7%
Other Punctuation | 27 | 0.2%
Uppercase Letter | 10 | 0.1%
Lowercase Letter | 4 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
407
 
3.0%
349
 
2.6%
310
 
2.3%
290
 
2.2%
289
 
2.2%
262
 
2.0%
220
 
1.6%
212
 
1.6%
205
 
1.5%
203
 
1.5%
Other values (663) 10656
79.5%
Decimal Number
ValueCountFrequency (%)
2 24
25.0%
1 16
16.7%
0 12
12.5%
9 11
11.5%
5 10
10.4%
4 8
 
8.3%
7 6
 
6.2%
8 4
 
4.2%
3 3
 
3.1%
6 2
 
2.1%
Uppercase Letter
ValueCountFrequency (%)
B 2
20.0%
C 2
20.0%
H 2
20.0%
K 1
10.0%
D 1
10.0%
S 1
10.0%
W 1
10.0%
Other Punctuation
ValueCountFrequency (%)
. 12
44.4%
& 9
33.3%
· 4
 
14.8%
? 1
 
3.7%
' 1
 
3.7%
Lowercase Letter
ValueCountFrequency (%)
r 1
25.0%
t 1
25.0%
o 1
25.0%
y 1
25.0%
Space Separator
ValueCountFrequency (%)
352
100.0%
Close Punctuation
ValueCountFrequency (%)
) 166
100.0%
Open Punctuation
ValueCountFrequency (%)
( 166
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 13402 | 94.2%
Common | 807 | 5.7%
Latin | 14 | 0.1%
Han | 1 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
407
 
3.0%
349
 
2.6%
310
 
2.3%
290
 
2.2%
289
 
2.2%
262
 
2.0%
220
 
1.6%
212
 
1.6%
205
 
1.5%
203
 
1.5%
Other values (662) 10655
79.5%
Common
ValueCountFrequency (%)
352
43.6%
) 166
20.6%
( 166
20.6%
2 24
 
3.0%
1 16
 
2.0%
0 12
 
1.5%
. 12
 
1.5%
9 11
 
1.4%
5 10
 
1.2%
& 9
 
1.1%
Other values (8) 29
 
3.6%
Latin
ValueCountFrequency (%)
B 2
14.3%
C 2
14.3%
H 2
14.3%
K 1
7.1%
D 1
7.1%
r 1
7.1%
S 1
7.1%
t 1
7.1%
o 1
7.1%
y 1
7.1%
Han
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 13402 | 94.2%
ASCII | 817 | 5.7%
None | 4 | < 0.1%
CJK | 1 | < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
407
 
3.0%
349
 
2.6%
310
 
2.3%
290
 
2.2%
289
 
2.2%
262
 
2.0%
220
 
1.6%
212
 
1.6%
205
 
1.5%
203
 
1.5%
Other values (662) 10655
79.5%
ASCII
ValueCountFrequency (%)
352
43.1%
) 166
20.3%
( 166
20.3%
2 24
 
2.9%
1 16
 
2.0%
0 12
 
1.5%
. 12
 
1.5%
9 11
 
1.3%
5 10
 
1.2%
& 9
 
1.1%
Other values (18) 39
 
4.8%
None
ValueCountFrequency (%)
· 4
100.0%
CJK
ValueCountFrequency (%)
1
100.0%

소재지(도로명)
Text

Distinct: 2201
Distinct (%): 88.4%
Missing: 0
Missing (%): 0.0%
Memory size: 19.6 KiB

Length

Max length: 66
Median length: 54
Mean length: 26.186018
Min length: 18

Characters and Unicode

Total characters: 65177
Distinct characters: 292
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1975
Unique (%): 79.3%

Sample

1st row: 충청남도 서산시 음암면 도당큰말길 6-4
2nd row: 충청남도 서산시 해미면 남문1로 40-1
3rd row: 충청남도 서산시 운산면 해운로 1205-1
4th row: 충청남도 서산시 해미면 읍성마을4길 17-8
5th row: 충청남도 서산시 운산면 용장리 406번지
Value | Count | Frequency (%)
충청남도 | 2489 | 17.0%
서산시 | 2489 | 17.0%
1층 | 1161 | 7.9%
동문동 | 518 | 3.5%
읍내동 | 391 | 2.7%
대산읍 | 374 | 2.5%
해미면 | 235 | 1.6%
예천동 | 204 | 1.4%
2층 | 157 | 1.1%
석림동 | 148 | 1.0%
Other values (1511) | 6506 | 44.3%
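
The table above counts words rather than whole address strings, which is why 충청남도 and 서산시 each appear 2489 times, once per row. A minimal sketch of that kind of count, assuming simple whitespace tokenization (the profiler's exact tokenizer may differ) and using the five sample addresses from this report:

```python
from collections import Counter
from typing import Iterable


def word_counts(values: Iterable[str]) -> Counter:
    """Count whitespace-separated tokens across all values."""
    counts = Counter()
    for value in values:
        counts.update(value.split())
    return counts


addresses = [
    "충청남도 서산시 음암면 도당큰말길 6-4",
    "충청남도 서산시 해미면 남문1로 40-1",
    "충청남도 서산시 운산면 해운로 1205-1",
    "충청남도 서산시 해미면 읍성마을4길 17-8",
    "충청남도 서산시 운산면 용장리 406번지",
]
address_words = word_counts(addresses)
print(address_words.most_common(4))
```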

Most occurring characters

ValueCountFrequency (%)
13756
21.1%
1 3593
 
5.5%
3048
 
4.7%
2673
 
4.1%
2672
 
4.1%
2650
 
4.1%
2589
 
4.0%
2541
 
3.9%
2490
 
3.8%
2281
 
3.5%
Other values (282) 26884
41.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 37047 | 56.8%
Space Separator | 13756 | 21.1%
Decimal Number | 10457 | 16.0%
Open Punctuation | 1603 | 2.5%
Close Punctuation | 1603 | 2.5%
Dash Punctuation | 598 | 0.9%
Uppercase Letter | 60 | 0.1%
Math Symbol | 23 | < 0.1%
Other Punctuation | 22 | < 0.1%
Lowercase Letter | 8 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3048
 
8.2%
2673
 
7.2%
2672
 
7.2%
2650
 
7.2%
2589
 
7.0%
2541
 
6.9%
2490
 
6.7%
2281
 
6.2%
2011
 
5.4%
1421
 
3.8%
Other values (250) 12671
34.2%
Decimal Number
ValueCountFrequency (%)
1 3593
34.4%
2 1499
14.3%
3 1134
 
10.8%
4 786
 
7.5%
5 691
 
6.6%
6 624
 
6.0%
7 565
 
5.4%
8 526
 
5.0%
9 522
 
5.0%
0 517
 
4.9%
Uppercase Letter
ValueCountFrequency (%)
D 19
31.7%
B 17
28.3%
A 17
28.3%
C 5
 
8.3%
T 1
 
1.7%
F 1
 
1.7%
Other Punctuation
ValueCountFrequency (%)
@ 13
59.1%
/ 4
 
18.2%
· 2
 
9.1%
. 2
 
9.1%
* 1
 
4.5%
Lowercase Letter
ValueCountFrequency (%)
e 6
75.0%
c 1
 
12.5%
h 1
 
12.5%
Open Punctuation
ValueCountFrequency (%)
( 1602
99.9%
[ 1
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 1602
99.9%
] 1
 
0.1%
Math Symbol
ValueCountFrequency (%)
~ 22
95.7%
1
 
4.3%
Space Separator
ValueCountFrequency (%)
13756
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 598
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 37047 | 56.8%
Common | 28062 | 43.1%
Latin | 68 | 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3048
 
8.2%
2673
 
7.2%
2672
 
7.2%
2650
 
7.2%
2589
 
7.0%
2541
 
6.9%
2490
 
6.7%
2281
 
6.2%
2011
 
5.4%
1421
 
3.8%
Other values (250) 12671
34.2%
Common
ValueCountFrequency (%)
13756
49.0%
1 3593
 
12.8%
( 1602
 
5.7%
) 1602
 
5.7%
2 1499
 
5.3%
3 1134
 
4.0%
4 786
 
2.8%
5 691
 
2.5%
6 624
 
2.2%
- 598
 
2.1%
Other values (13) 2177
 
7.8%
Latin
ValueCountFrequency (%)
D 19
27.9%
B 17
25.0%
A 17
25.0%
e 6
 
8.8%
C 5
 
7.4%
c 1
 
1.5%
T 1
 
1.5%
h 1
 
1.5%
F 1
 
1.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 37047 | 56.8%
ASCII | 28127 | 43.2%
None | 2 | < 0.1%
Math Operators | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
13756
48.9%
1 3593
 
12.8%
( 1602
 
5.7%
) 1602
 
5.7%
2 1499
 
5.3%
3 1134
 
4.0%
4 786
 
2.8%
5 691
 
2.5%
6 624
 
2.2%
- 598
 
2.1%
Other values (20) 2242
 
8.0%
Hangul
ValueCountFrequency (%)
3048
 
8.2%
2673
 
7.2%
2672
 
7.2%
2650
 
7.2%
2589
 
7.0%
2541
 
6.9%
2490
 
6.7%
2281
 
6.2%
2011
 
5.4%
1421
 
3.8%
Other values (250) 12671
34.2%
None
ValueCountFrequency (%)
· 2
100.0%
Math Operators
ValueCountFrequency (%)
1
100.0%

소재지전화
Text

MISSING 

Distinct: 1890
Distinct (%): 98.6%
Missing: 573
Missing (%): 23.0%
Memory size: 19.6 KiB

Length

Max length: 14
Median length: 12
Mean length: 12.017223
Min length: 9

Characters and Unicode

Total characters: 23025
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1865
Unique (%): 97.3%

Sample

1st row: 041-688-2047
2nd row: 041-665-3271
3rd row: 041-665-9187
4th row: 041-667-7373
5th row: 041-665-5643
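
The sample values all follow an area-code, exchange, subscriber layout, while the reported lengths (9 to 14 characters) suggest that other layouts exist in the column. A format check along the following lines could separate the two groups. The pattern is an assumption chosen to fit the samples shown here, not a rule taken from the dataset's documentation.

```python
import re

# Assumed layout: 2-4 digit prefix, 3-4 digit exchange, 4 digit subscriber number.
PHONE_RE = re.compile(r"\d{2,4}-\d{3,4}-\d{4}")


def is_well_formed(phone: str) -> bool:
    """True when the whole string matches the assumed hyphenated layout."""
    return PHONE_RE.fullmatch(phone) is not None


samples = [
    "041-688-2047",
    "041-665-3271",
    "041-665-9187",
    "041-667-7373",
    "041-665-5643",
]
all_ok = all(is_well_formed(p) for p in samples)
print(all_ok)
```

Values that fail the check are candidates for manual review, not necessarily errors.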
Value | Count | Frequency (%)
041-688-8814 | 3 | 0.2%
041-920-9292 | 2 | 0.1%
041-688-9280 | 2 | 0.1%
041-681-6206 | 2 | 0.1%
041-663-4414 | 2 | 0.1%
041-662-8699 | 2 | 0.1%
041-688-5724 | 2 | 0.1%
041-669-5500 | 2 | 0.1%
041-669-5803 | 2 | 0.1%
041-665-1015 | 2 | 0.1%
Other values (1880) | 1895 | 98.9%

Most occurring characters

Value | Count | Frequency (%)
6 | 4155 | 18.0%
- | 3831 | 16.6%
0 | 2862 | 12.4%
1 | 2773 | 12.0%
4 | 2699 | 11.7%
8 | 1518 | 6.6%
2 | 1137 | 4.9%
9 | 1079 | 4.7%
5 | 1069 | 4.6%
3 | 962 | 4.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 19194 | 83.4%
Dash Punctuation | 3831 | 16.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
6 4155
21.6%
0 2862
14.9%
1 2773
14.4%
4 2699
14.1%
8 1518
 
7.9%
2 1137
 
5.9%
9 1079
 
5.6%
5 1069
 
5.6%
3 962
 
5.0%
7 940
 
4.9%
Dash Punctuation
ValueCountFrequency (%)
- 3831
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 23025
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
6 4155
18.0%
- 3831
16.6%
0 2862
12.4%
1 2773
12.0%
4 2699
11.7%
8 1518
 
6.6%
2 1137
 
4.9%
9 1079
 
4.7%
5 1069
 
4.6%
3 962
 
4.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23025
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6 4155
18.0%
- 3831
16.6%
0 2862
12.4%
1 2773
12.0%
4 2699
11.7%
8 1518
 
6.6%
2 1137
 
4.9%
9 1079
 
4.7%
5 1069
 
4.6%
3 962
 
4.2%

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completeness.
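
The same information can be inspected without a plot. The sketch below counts missing values per column and prints a small text version of a nullity matrix for the first ten rows shown in the Sample section, with `None` standing in for `<NA>`. Only two of the four columns are included for brevity, and the function names are illustrative.

```python
from typing import Dict, List, Optional, Sequence

Row = Dict[str, Optional[str]]


def nullity_counts(rows: Sequence[Row], columns: Sequence[str]) -> Dict[str, int]:
    """Number of missing (None) values per column."""
    return {col: sum(1 for row in rows if row[col] is None) for col in columns}


def nullity_matrix(rows: Sequence[Row], columns: Sequence[str]) -> List[str]:
    """One text line per row: '#' for a present value, '.' for a missing one."""
    return ["".join("." if row[col] is None else "#" for col in columns) for row in rows]


columns = ["업소명", "소재지전화"]
phones = [None, "041-688-2047", None, None, None,
          "041-665-3271", "041-665-9187", "041-667-7373", None, "041-665-5643"]
names = ["송남회관", "영성각", "버드내식당", "태광식당", "서산식당",
         "현대한우촌", "부자집", "인지반점", "여로집", "한일식당"]
rows = [{"업소명": n, "소재지전화": p} for n, p in zip(names, phones)]

missing = nullity_counts(rows, columns)
matrix = nullity_matrix(rows, columns)
print(missing)
print("\n".join(matrix))
```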

Sample

(index) | 업종명 | 업소명 | 소재지(도로명) | 소재지전화
0 | 일반음식점 | 송남회관 | 충청남도 서산시 음암면 도당큰말길 6-4 | <NA>
1 | 일반음식점 | 영성각 | 충청남도 서산시 해미면 남문1로 40-1 | 041-688-2047
2 | 일반음식점 | 버드내식당 | 충청남도 서산시 운산면 해운로 1205-1 | <NA>
3 | 일반음식점 | 태광식당 | 충청남도 서산시 해미면 읍성마을4길 17-8 | <NA>
4 | 일반음식점 | 서산식당 | 충청남도 서산시 운산면 용장리 406번지 | <NA>
5 | 일반음식점 | 현대한우촌 | 충청남도 서산시 율지17로 21 (동문동) | 041-665-3271
6 | 일반음식점 | 부자집 | 충청남도 서산시 안견로 181 (동문동) | 041-665-9187
7 | 일반음식점 | 인지반점 | 충청남도 서산시 인지면 무학로 1693 | 041-667-7373
8 | 일반음식점 | 여로집 | 충청남도 서산시 동문동 986번지 | <NA>
9 | 일반음식점 | 한일식당 | 충청남도 서산시 번화1로 12 (읍내동) | 041-665-5643

(index) | 업종명 | 업소명 | 소재지(도로명) | 소재지전화
2479 | 일반음식점 | 미선 | 충청남도 서산시 대산읍 구진로 16-1 1층 | <NA>
2480 | 일반음식점 | 누룽지통닭 | 충청남도 서산시 대산읍 정자동5로 27 1층 | 041-663-2333
2481 | 일반음식점 | 부자순대국 | 충청남도 서산시 서령로 217 (온석동) | 041-663-5916
2482 | 일반음식점 | 육감만족(성연점) | 충청남도 서산시 성연면 성연로 210 2층 | 041-665-9212
2483 | 일반음식점 | 족가족가숯불구이족발 | 충청남도 서산시 성연면 성연5로 29-1 1층 | 041-664-2232
2484 | 일반음식점 | 비스트로리꼬 | 충청남도 서산시 호수공원4로 44 1층 (예천동) | <NA>
2485 | 일반음식점 | 바다횟집 | 충청남도 서산시 성연면 성연5로 25 | 041-663-8100
2486 | 일반음식점 | 왕더푸 | 충청남도 서산시 고운로 107 1층 (읍내동) | <NA>
2487 | 일반음식점 | 배달의쌀국수 | 충청남도 서산시 중앙로 98-1 가동 1층 (동문동) | <NA>
2488 | 일반음식점 | 한우암소갈비(우미관) | 충청남도 서산시 해미면 읍성마을1길 4-42 1층 | <NA>