Overview

Dataset statistics

Number of variables: 4
Number of observations: 2696
Missing cells: 908
Missing cells (%): 8.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 84.4 KiB
Average record size in memory: 32.0 B
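
The missing-cell figures can be cross-checked against the per-variable counts reported further down. A minimal sketch, assuming the percentage is missing cells divided by all cells (rows times columns); the dictionary simply restates the "Missing" counts from each variable section:

```python
# Per-variable "Missing" counts as reported in the Variables section.
missing_per_column = {
    "업종명": 0,
    "업소명": 0,
    "소재지(도로명)": 1,
    "소재지전화": 907,
}
n_observations = 2696

# Total missing cells is the sum over all columns.
missing_cells = sum(missing_per_column.values())

# Share of missing cells among all cells (rows x columns).
total_cells = n_observations * len(missing_per_column)
missing_pct = 100 * missing_cells / total_cells

print(f"Missing cells: {missing_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
```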

Variable types

Categorical: 1
Text: 3

Dataset

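Description: Data on general restaurants (일반음식점) and light-refreshment restaurants (휴게음식점) licensed in Seosan City. The fields are business category name (업종명), business name (업소명), location (소재지), business floor area (영업장면적), telephone number (전화번호), business type name (업태명), and data reference date (데이터기준일).
Author: Seosan City, Chungcheongnam-do (충청남도 서산시)
URL: https://www.data.go.kr/data/15000815/fileData.do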

Alerts

업종명 has constant value "일반음식점" (Constant)
소재지전화 has 907 (33.6%) missing values (Missing)
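
The two alerts correspond to simple per-column checks. The sketch below is an illustration only, not the profiling tool's actual implementation: the function name `column_alerts` is hypothetical, `None` stands in for a missing value, and any missing value triggers the alert (the real tool applies configurable thresholds).

```python
def column_alerts(values):
    """Return alert labels for one column; None marks a missing value."""
    alerts = []
    present = [v for v in values if v is not None]
    # A column is constant when all present values are identical.
    if len(set(present)) == 1:
        alerts.append("Constant")
    # Assumption: flag the column as soon as one value is missing.
    if len(present) < len(values):
        alerts.append("Missing")
    return alerts


# First five rows of two columns, taken from the sample at the end of the report.
business_type = ["일반음식점"] * 5
phone = [None, None, "041-665-3271", "041-665-9187", "041-665-5643"]

print(column_alerts(business_type))  # ['Constant']
print(column_alerts(phone))          # ['Missing']
```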

Reproduction

Analysis started: 2023-12-13 00:00:29.196845
Analysis finished: 2023-12-13 00:00:29.951208
Duration: 0.75 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
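
The reported duration is the difference between the two timestamps. A short check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2023-12-13 00:00:29.196845")
finished = datetime.fromisoformat("2023-12-13 00:00:29.951208")

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```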

Variables

업종명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 21.2 KiB
일반음식점: 2696

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%
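
"Distinct" and "Unique" measure different things, which is why a constant column shows Distinct 1 but Unique 0. A minimal sketch, assuming "distinct" counts the different values and "unique" counts the values that occur exactly once (the helper name is illustrative):

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# The constant column: one distinct value, repeated on every row.
print(distinct_and_unique(["일반음식점"] * 2696))  # (1, 0)

# A small made-up column for contrast: "b" repeats, "a" and "c" do not.
print(distinct_and_unique(["a", "b", "b", "c"]))  # (3, 2)
```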

Sample

1st row: 일반음식점
2nd row: 일반음식점
3rd row: 일반음식점
4th row: 일반음식점
5th row: 일반음식점

Common Values

Value | Count | Frequency (%)
일반음식점 | 2696 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: bar chart of the common values; 일반음식점 accounts for all 2696 rows (100.0%)]

업소명
Text

Distinct: 2675
Distinct (%): 99.2%
Missing: 0
Missing (%): 0.0%
Memory size: 21.2 KiB

Length

Max length: 32
Median length: 22
Mean length: 6.2073442
Min length: 1

Characters and Unicode

Total characters: 16735
Distinct characters: 802
Distinct categories: 11
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
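
As an illustration of the category counts reported below, the general category of each character can be read with Python's standard `unicodedata` module. This is a sketch rather than the profiling tool's own code; the example string is one of the addresses from the report's sample rows, and the name mapping covers only the categories that occur in it.

```python
import unicodedata
from collections import Counter

# Long names for the two-letter general-category codes used in this example.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}


def category_counts(text):
    """Count characters in text by Unicode general category."""
    return Counter(unicodedata.category(ch) for ch in text)


address = "충청남도 서산시 율지17로 21 (동문동)"
counts = category_counts(address)
for code, count in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
```

Hangul syllables fall under "Other Letter", which is why that category dominates the Korean text columns in this report.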

Unique

Unique: 2656
Unique (%): 98.5%

Sample

1st row: 송남회관
2nd row: 버드내식당
3rd row: 현대한우촌
4th row: 부자집
5th row: 한일식당

Value | Count | Frequency (%)
서산점 | 100 | 3.1%
서산호수공원점 | 25 | 0.8%
대산점 | 22 | 0.7%
예천점 | 16 | 0.5%
성연점 | 15 | 0.5%
서산대산점 | 12 | 0.4%
동문점 | 12 | 0.4%
서산테크노밸리점 | 11 | 0.3%
서산성연점 | 10 | 0.3%
해미점 | 10 | 0.3%
Other values (2795) | 3044 | 92.9%

Most occurring characters

ValueCountFrequency (%)
581
 
3.5%
580
 
3.5%
467
 
2.8%
404
 
2.4%
331
 
2.0%
296
 
1.8%
273
 
1.6%
248
 
1.5%
231
 
1.4%
( 226
 
1.4%
Other values (792) 13098
78.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 15124 | 90.4%
Space Separator | 581 | 3.5%
Open Punctuation | 226 | 1.4%
Close Punctuation | 226 | 1.4%
Lowercase Letter | 210 | 1.3%
Uppercase Letter | 183 | 1.1%
Decimal Number | 120 | 0.7%
Other Punctuation | 59 | 0.4%
Math Symbol | 3 | < 0.1%
Dash Punctuation | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
580
 
3.8%
467
 
3.1%
404
 
2.7%
331
 
2.2%
296
 
2.0%
273
 
1.8%
248
 
1.6%
231
 
1.5%
212
 
1.4%
208
 
1.4%
Other values (719) 11874
78.5%
Uppercase Letter
ValueCountFrequency (%)
C 22
 
12.0%
B 20
 
10.9%
O 18
 
9.8%
E 15
 
8.2%
A 11
 
6.0%
H 10
 
5.5%
S 8
 
4.4%
M 7
 
3.8%
T 7
 
3.8%
K 7
 
3.8%
Other values (15) 58
31.7%
Lowercase Letter
ValueCountFrequency (%)
e 31
14.8%
o 24
11.4%
a 18
 
8.6%
c 14
 
6.7%
s 14
 
6.7%
r 14
 
6.7%
i 11
 
5.2%
h 10
 
4.8%
n 10
 
4.8%
f 10
 
4.8%
Other values (13) 54
25.7%
Decimal Number
ValueCountFrequency (%)
1 25
20.8%
2 22
18.3%
3 14
11.7%
9 13
10.8%
0 13
10.8%
8 8
 
6.7%
4 8
 
6.7%
6 7
 
5.8%
5 5
 
4.2%
7 5
 
4.2%
Other Punctuation
ValueCountFrequency (%)
& 31
52.5%
. 15
25.4%
, 6
 
10.2%
' 3
 
5.1%
· 2
 
3.4%
: 1
 
1.7%
/ 1
 
1.7%
Math Symbol
ValueCountFrequency (%)
+ 1
33.3%
< 1
33.3%
> 1
33.3%
Space Separator
ValueCountFrequency (%)
581
100.0%
Open Punctuation
ValueCountFrequency (%)
( 226
100.0%
Close Punctuation
ValueCountFrequency (%)
) 226
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 15115 | 90.3%
Common | 1218 | 7.3%
Latin | 393 | 2.3%
Han | 9 | 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
580
 
3.8%
467
 
3.1%
404
 
2.7%
331
 
2.2%
296
 
2.0%
273
 
1.8%
248
 
1.6%
231
 
1.5%
212
 
1.4%
208
 
1.4%
Other values (710) 11865
78.5%
Latin
ValueCountFrequency (%)
e 31
 
7.9%
o 24
 
6.1%
C 22
 
5.6%
B 20
 
5.1%
O 18
 
4.6%
a 18
 
4.6%
E 15
 
3.8%
c 14
 
3.6%
s 14
 
3.6%
r 14
 
3.6%
Other values (38) 203
51.7%
Common
ValueCountFrequency (%)
581
47.7%
( 226
 
18.6%
) 226
 
18.6%
& 31
 
2.5%
1 25
 
2.1%
2 22
 
1.8%
. 15
 
1.2%
3 14
 
1.1%
9 13
 
1.1%
0 13
 
1.1%
Other values (15) 52
 
4.3%
Han
ValueCountFrequency (%)
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 15115 | 90.3%
ASCII | 1609 | 9.6%
CJK | 9 | 0.1%
None | 2 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
581
36.1%
( 226
 
14.0%
) 226
 
14.0%
e 31
 
1.9%
& 31
 
1.9%
1 25
 
1.6%
o 24
 
1.5%
2 22
 
1.4%
C 22
 
1.4%
B 20
 
1.2%
Other values (62) 401
24.9%
Hangul
ValueCountFrequency (%)
580
 
3.8%
467
 
3.1%
404
 
2.7%
331
 
2.2%
296
 
2.0%
273
 
1.8%
248
 
1.6%
231
 
1.5%
212
 
1.4%
208
 
1.4%
Other values (710) 11865
78.5%
None
ValueCountFrequency (%)
· 2
100.0%
CJK
ValueCountFrequency (%)
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%

소재지(도로명)
Text

Distinct: 2440
Distinct (%): 90.5%
Missing: 1
Missing (%): < 0.1%
Memory size: 21.2 KiB

Length

Max length: 66
Median length: 55
Mean length: 26.899814
Min length: 17

Characters and Unicode

Total characters: 72495
Distinct characters: 325
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2231
Unique (%): 82.8%

Sample

1st row: 충청남도 서산시 음암면 도당큰말길 6-4
2nd row: 충청남도 서산시 운산면 해운로 1205-1
3rd row: 충청남도 서산시 율지17로 21 (동문동)
4th row: 충청남도 서산시 안견로 181 (동문동)
5th row: 충청남도 서산시 번화1로 12 (읍내동)

Value | Count | Frequency (%)
충청남도 | 2695 | 16.7%
서산시 | 2695 | 16.7%
1층 | 1349 | 8.4%
동문동 | 477 | 3.0%
대산읍 | 399 | 2.5%
읍내동 | 370 | 2.3%
예천동 | 251 | 1.6%
해미면 | 243 | 1.5%
성연면 | 185 | 1.1%
석림동 | 163 | 1.0%
Other values (1606) | 7317 | 45.3%

Most occurring characters

ValueCountFrequency (%)
13490
18.6%
1 4213
 
5.8%
3293
 
4.5%
2918
 
4.0%
2886
 
4.0%
2864
 
4.0%
2791
 
3.8%
2748
 
3.8%
2700
 
3.7%
2484
 
3.4%
Other values (315) 32108
44.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 40833 | 56.3%
Space Separator | 13490 | 18.6%
Decimal Number | 12003 | 16.6%
Other Punctuation | 1989 | 2.7%
Open Punctuation | 1668 | 2.3%
Close Punctuation | 1668 | 2.3%
Dash Punctuation | 697 | 1.0%
Uppercase Letter | 89 | 0.1%
Math Symbol | 54 | 0.1%
Lowercase Letter | 4 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3293
 
8.1%
2918
 
7.1%
2886
 
7.1%
2864
 
7.0%
2791
 
6.8%
2748
 
6.7%
2700
 
6.6%
2484
 
6.1%
2234
 
5.5%
1666
 
4.1%
Other values (284) 14249
34.9%
Decimal Number
ValueCountFrequency (%)
1 4213
35.1%
2 1737
14.5%
3 1235
 
10.3%
4 914
 
7.6%
5 756
 
6.3%
0 736
 
6.1%
6 692
 
5.8%
7 618
 
5.1%
9 566
 
4.7%
8 536
 
4.5%
Uppercase Letter
ValueCountFrequency (%)
A 34
38.2%
B 27
30.3%
D 17
19.1%
C 6
 
6.7%
S 3
 
3.4%
F 1
 
1.1%
T 1
 
1.1%
Other Punctuation
ValueCountFrequency (%)
, 1971
99.1%
@ 11
 
0.6%
/ 3
 
0.2%
. 2
 
0.1%
· 2
 
0.1%
Lowercase Letter
ValueCountFrequency (%)
e 2
50.0%
h 1
25.0%
c 1
25.0%
Math Symbol
ValueCountFrequency (%)
~ 53
98.1%
1
 
1.9%
Space Separator
ValueCountFrequency (%)
13490
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1668
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1668
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 697
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 40833 | 56.3%
Common | 31569 | 43.5%
Latin | 93 | 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3293
 
8.1%
2918
 
7.1%
2886
 
7.1%
2864
 
7.0%
2791
 
6.8%
2748
 
6.7%
2700
 
6.6%
2484
 
6.1%
2234
 
5.5%
1666
 
4.1%
Other values (284) 14249
34.9%
Common
ValueCountFrequency (%)
13490
42.7%
1 4213
 
13.3%
, 1971
 
6.2%
2 1737
 
5.5%
( 1668
 
5.3%
) 1668
 
5.3%
3 1235
 
3.9%
4 914
 
2.9%
5 756
 
2.4%
0 736
 
2.3%
Other values (11) 3181
 
10.1%
Latin
ValueCountFrequency (%)
A 34
36.6%
B 27
29.0%
D 17
18.3%
C 6
 
6.5%
S 3
 
3.2%
e 2
 
2.2%
F 1
 
1.1%
h 1
 
1.1%
T 1
 
1.1%
c 1
 
1.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 40833 | 56.3%
ASCII | 31659 | 43.7%
None | 2 | < 0.1%
Math Operators | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
13490
42.6%
1 4213
 
13.3%
, 1971
 
6.2%
2 1737
 
5.5%
( 1668
 
5.3%
) 1668
 
5.3%
3 1235
 
3.9%
4 914
 
2.9%
5 756
 
2.4%
0 736
 
2.3%
Other values (19) 3271
 
10.3%
Hangul
ValueCountFrequency (%)
3293
 
8.1%
2918
 
7.1%
2886
 
7.1%
2864
 
7.0%
2791
 
6.8%
2748
 
6.7%
2700
 
6.6%
2484
 
6.1%
2234
 
5.5%
1666
 
4.1%
Other values (284) 14249
34.9%
None
ValueCountFrequency (%)
· 2
100.0%
Math Operators
ValueCountFrequency (%)
1
100.0%

소재지전화
Text

MISSING 

Distinct: 1763
Distinct (%): 98.5%
Missing: 907
Missing (%): 33.6%
Memory size: 21.2 KiB

Length

Max length: 14
Median length: 12
Mean length: 12.01621
Min length: 10

Characters and Unicode

Total characters: 21497
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
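
For the phone-number column the character inventory is tiny: only digits and hyphens, hence two categories and a single ASCII block. A quick sketch that confirms this on the five sample values listed in this section, using the standard `unicodedata` module (the variable names are illustrative):

```python
import unicodedata

# The five sample values shown for 소재지전화.
sample_phones = [
    "041-665-3271",
    "041-665-9187",
    "041-665-5643",
    "041-664-2442",
    "041-665-2246",
]

# Every distinct character across the samples.
chars = set("".join(sample_phones))

# General categories present: "Nd" (Decimal Number) and "Pd" (Dash Punctuation).
categories = {unicodedata.category(ch) for ch in chars}
lengths = [len(p) for p in sample_phones]

print(sorted(categories))  # ['Nd', 'Pd']
print(all(ch.isascii() for ch in chars))
print(lengths)
```

All five samples are 12 characters long, which matches the reported median length.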

Unique

Unique: 1739
Unique (%): 97.2%

Sample

1st row: 041-665-3271
2nd row: 041-665-9187
3rd row: 041-665-5643
4th row: 041-664-2442
5th row: 041-665-2246

Value | Count | Frequency (%)
041-689-7727 | 3 | 0.2%
041-688-8814 | 3 | 0.2%
041-665-8329 | 2 | 0.1%
041-669-5500 | 2 | 0.1%
041-667-9208 | 2 | 0.1%
041-666-2251 | 2 | 0.1%
041-688-9294 | 2 | 0.1%
041-920-9292 | 2 | 0.1%
041-665-1015 | 2 | 0.1%
041-669-6444 | 2 | 0.1%
Other values (1753) | 1767 | 98.8%

Most occurring characters

ValueCountFrequency (%)
6 3869
18.0%
- 3578
16.6%
0 2687
12.5%
1 2557
11.9%
4 2502
11.6%
8 1464
 
6.8%
2 1089
 
5.1%
9 1058
 
4.9%
5 950
 
4.4%
3 884
 
4.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 17919 | 83.4%
Dash Punctuation | 3578 | 16.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
6 3869
21.6%
0 2687
15.0%
1 2557
14.3%
4 2502
14.0%
8 1464
 
8.2%
2 1089
 
6.1%
9 1058
 
5.9%
5 950
 
5.3%
3 884
 
4.9%
7 859
 
4.8%
Dash Punctuation
ValueCountFrequency (%)
- 3578
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 21497 | 100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
6 3869
18.0%
- 3578
16.6%
0 2687
12.5%
1 2557
11.9%
4 2502
11.6%
8 1464
 
6.8%
2 1089
 
5.1%
9 1058
 
4.9%
5 950
 
4.4%
3 884
 
4.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 21497 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6 3869
18.0%
- 3578
16.6%
0 2687
12.5%
1 2557
11.9%
4 2502
11.6%
8 1464
 
6.8%
2 1089
 
5.1%
9 1058
 
4.9%
5 950
 
4.4%
3 884
 
4.1%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
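
One way to compute a nullity correlation is the Pearson correlation between the presence indicators (1 = present, 0 = missing) of two columns. The sketch below assumes that definition; the function name and the toy columns are hypothetical and are not taken from this dataset.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Undefined when a column is always present or always missing.
        return None
    return cov / sqrt(var_a * var_b)


# Hypothetical toy columns; None marks a missing value.
toy_phone = ["x", "x", None, None, "x", None]
toy_address = ["y", None, None, None, "y", None]

r = nullity_correlation(toy_phone, toy_address)
print(f"{r:.3f}")
```

A value near 1 means the two columns tend to be missing on the same rows; a column with no missing values at all has no defined nullity correlation, which the sketch signals by returning `None`.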

Sample

First rows

Index | 업종명 | 업소명 | 소재지(도로명) | 소재지전화
0 | 일반음식점 | 송남회관 | 충청남도 서산시 음암면 도당큰말길 6-4 | <NA>
1 | 일반음식점 | 버드내식당 | 충청남도 서산시 운산면 해운로 1205-1 | <NA>
2 | 일반음식점 | 현대한우촌 | 충청남도 서산시 율지17로 21 (동문동) | 041-665-3271
3 | 일반음식점 | 부자집 | 충청남도 서산시 안견로 181 (동문동) | 041-665-9187
4 | 일반음식점 | 한일식당 | 충청남도 서산시 번화1로 12 (읍내동) | 041-665-5643
5 | 일반음식점 | 자금성 | 충청남도 서산시 성연면 성연로 215-1 | 041-664-2442
6 | 일반음식점 | 동양식당 | 충청남도 서산시 고운로 166-4 (동문동) | 041-665-2246
7 | 일반음식점 | 영국수산식당 | 충청남도 서산시 읍내동 150 | <NA>
8 | 일반음식점 | 서부식당 | 충청남도 서산시 번화2로 8-1 (읍내동) | 041-669-6225
9 | 일반음식점 | 터미널식당 | 충청남도 서산시 안견로 190 (동문동) | 041-667-4599

Last rows

Index | 업종명 | 업소명 | 소재지(도로명) | 소재지전화
2686 | 일반음식점 | 맘스터치 서산대산점 | 충청남도 서산시 대산읍 충의로 1909-2, 세아빌딩 102호 | <NA>
2687 | 일반음식점 | 정자동6로 | 충청남도 서산시 대산읍 정자동6로 26, 1층 | <NA>
2688 | 일반음식점 | 나희네 | 충청남도 서산시 시장4길 36, 1층 (동문동) | <NA>
2689 | 일반음식점 | 마라공방 | 충청남도 서산시 호수공원10로 7, 배터라이프 1층 (예천동) | <NA>
2690 | 일반음식점 | 샤인호프 | 충청남도 서산시 시장4길 26, 2층 (동문동) | <NA>
2691 | 일반음식점 | 아리아컨벤션 | 충청남도 서산시 번화2로 38, 1층 (동문동) | <NA>
2692 | 일반음식점 | 갈비명가 궁 대산기은점 | 충청남도 서산시 대산읍 명지1로 270-5, 2층 | <NA>
2693 | 일반음식점 | 에스빠 | 충청남도 서산시 읍내1로 35, 1층 8호 (읍내동) | <NA>
2694 | 일반음식점 | 바다가에서 | 충청남도 서산시 율지13로 16, 1층 (동문동) | <NA>
2695 | 일반음식점 | 서산국화축제장 상설식당 | 충청남도 서산시 고북면 가구리 624-67 | <NA>