Overview

Dataset statistics

Number of variables: 4
Number of observations: 511
Missing cells: 146
Missing cells (%): 7.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 16.1 KiB
Average record size in memory: 32.3 B
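
The derived figures in these statistics follow from the raw counts. A minimal sketch in Python, assuming the percentages are plain ratios rounded to one decimal place and that 1 KiB is 1024 bytes; the variable names are illustrative and not part of the report:

```python
# Raw figures taken from the dataset statistics in this report.
n_variables = 4
n_observations = 511
missing_cells = 146
total_size_kib = 16.1

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

# Share of all cells that are missing.
missing_pct = round(missing_cells / total_cells * 100, 1)

# Average bytes per record, assuming 1 KiB = 1024 bytes.
avg_record_bytes = round(total_size_kib * 1024 / n_observations, 1)

# If every missing cell sits in a single column, this is that column's share.
single_column_missing_pct = round(missing_cells / n_observations * 100, 1)

print(total_cells, missing_pct, avg_record_bytes, single_column_missing_pct)
```

The last value corresponds to the 28.6% reported for 소재지전화, the only column the report flags as having missing values.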

Variable types

Categorical: 1
Text: 3

Dataset

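Description: Information on public hygiene businesses licensed to operate in Seosan (barbershops, hair salons, laundries, bathhouses, saunas, nail art shops, skin care shops). It provides the business type (업종명), business name (업소명), business address (업소소재지), and location details (소재지).
Author: Chungcheongnam-do (충청남도)
URL: https://alldam.chungnam.go.kr/index.chungnam?menuCd=DOM_000000201001001001&st=&cds=&orgCd=&apiType=&isOpen=Y&pageIndex=445&beforeMenuCd=DOM_000000201001001000&publicdatapk=15000677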

Alerts

업종명 is highly imbalanced (57.1%) [Imbalance]
소재지전화 has 146 (28.6%) missing values [Missing]

Reproduction

Analysis started: 2024-01-09 21:32:03.354829
Analysis finished: 2024-01-09 21:32:03.734019
Duration: 0.38 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업종명
Categorical

IMBALANCE 

Distinct: 14
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

미용업(일반): 360
미용업(피부): 69
미용업(손톱ㆍ발톱): 26
미용업: 21
미용업(종합): 7
Other values (9): 28

Length

Max length: 31
Median length: 7
Mean length: 7.6751468
Min length: 3

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: 미용업(일반)
2nd row: 미용업(일반)
3rd row: 미용업(일반)
4th row: 미용업(일반)
5th row: 미용업(일반)

Common Values

Value | Count | Frequency (%)
미용업(일반) | 360 | 70.5%
미용업(피부) | 69 | 13.5%
미용업(손톱ㆍ발톱) | 26 | 5.1%
미용업 | 21 | 4.1%
미용업(종합) | 7 | 1.4%
미용업(피부), 미용업(손톱ㆍ발톱) | 7 | 1.4%
미용업(일반), 미용업(피부) | 4 | 0.8%
미용업(손톱ㆍ발톱), 미용업(화장ㆍ분장) | 4 | 0.8%
미용업(일반), 미용업(손톱ㆍ발톱) | 3 | 0.6%
미용업(피부), 미용업(화장ㆍ분장) | 3 | 0.6%
Other values (4) | 7 | 1.4%
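
The 57.1% imbalance figure reported for 업종명 can be reproduced from these counts under one assumption: that the score is 1 minus the Shannon entropy of the category distribution, divided by log2 of the number of categories. The report itself does not state this definition, so treat the sketch below as a plausibility check rather than the tool's documented method. The split of "Other values (4)" into 2, 2, 2, 1 is also an inference: it is the only way four categories with at most 3 occurrences each can sum to 7 while leaving exactly one unique value, as the Unique count for this variable indicates.

```python
import math

# Category counts from the Common Values table. The last four entries are
# inferred (see the note above); the report only gives their combined count.
counts = [360, 69, 26, 21, 7, 7, 4, 4, 3, 3, 2, 2, 2, 1]


def imbalance_score(counts):
    """Return 1 minus the Shannon entropy normalised by log2(number of classes)."""
    n_classes = len(counts)
    if n_classes <= 1:
        # A single class has zero entropy; treat it as balanced.
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)


score = imbalance_score(counts)
print(f"{score:.1%}")
```

Under these assumptions the score rounds to 57.1%, which agrees with the alert. A score of 0% would mean all 14 categories are equally frequent, and values near 100% mean one category dominates.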

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
미용업(일반 | 370 | 68.5%
미용업(피부 | 86 | 15.9%
미용업(손톱ㆍ발톱 | 42 | 7.8%
미용업 | 21 | 3.9%
미용업(화장ㆍ분장 | 14 | 2.6%
미용업(종합 | 7 | 1.3%

업소명
Text

Distinct: 507
Distinct (%): 99.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Length

Max length: 20
Median length: 18
Mean length: 5.5362035
Min length: 1

Characters and Unicode

Total characters: 2829
Distinct characters: 379
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
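
As an illustration of these character properties, the sketch below counts Unicode general categories for a single string using Python's standard `unicodedata` module. The string is one address value from this report's sample rows. Only general categories are covered, since the standard library does not expose script or block names; how the profiling tool derives those is not shown in this report.

```python
import unicodedata
from collections import Counter

# One address value copied from the sample rows of this report.
sample = "충청남도 서산시 고운로 129 (읍내동)"

# unicodedata.category returns the two-letter general category code:
# Lo = Other Letter, Zs = Space Separator, Nd = Decimal Number,
# Ps = Open Punctuation, Pe = Close Punctuation.
category_counts = Counter(unicodedata.category(ch) for ch in sample)

for code, count in category_counts.most_common():
    print(code, count)
```

The two-letter codes map onto the category names used in the tables of this report, for example `Lo` for "Other Letter" (which covers Hangul syllables) and `Zs` for "Space Separator".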

Unique

Unique: 503
Unique (%): 98.4%

Sample

1st row: 제일미용실
2nd row: 임명숙헤어미용실
3rd row: 쎄느미용실
4th row: 신정미용실
5th row: 명동미용실
Value | Count | Frequency (%)
헤어샵 | 8 | 1.4%
미용실 | 7 | 1.2%
헤어 | 5 | 0.9%
네일 | 4 | 0.7%
서산점 | 4 | 0.7%
피부관리실 | 2 | 0.3%
헤어갤러리 | 2 | 0.3%
헤어살롱 | 2 | 0.3%
반헤어 | 2 | 0.3%
 | 2 | 0.3%
Other values (533) | 539 | 93.4%

Most occurring characters

ValueCountFrequency (%)
248
 
8.8%
234
 
8.3%
130
 
4.6%
102
 
3.6%
93
 
3.3%
79
 
2.8%
69
 
2.4%
67
 
2.4%
66
 
2.3%
60
 
2.1%
Other values (369) 1681
59.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2713 | 95.9%
Space Separator | 66 | 2.3%
Close Punctuation | 13 | 0.5%
Open Punctuation | 13 | 0.5%
Lowercase Letter | 8 | 0.3%
Decimal Number | 6 | 0.2%
Other Punctuation | 5 | 0.2%
Uppercase Letter | 4 | 0.1%
Dash Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
248
 
9.1%
234
 
8.6%
130
 
4.8%
102
 
3.8%
93
 
3.4%
79
 
2.9%
69
 
2.5%
67
 
2.5%
60
 
2.2%
41
 
1.5%
Other values (349) 1590
58.6%
Lowercase Letter
ValueCountFrequency (%)
i 2
25.0%
l 1
12.5%
v 1
12.5%
e 1
12.5%
h 1
12.5%
a 1
12.5%
r 1
12.5%
Uppercase Letter
ValueCountFrequency (%)
O 1
25.0%
S 1
25.0%
E 1
25.0%
Z 1
25.0%
Decimal Number
ValueCountFrequency (%)
2 4
66.7%
1 1
 
16.7%
3 1
 
16.7%
Other Punctuation
ValueCountFrequency (%)
, 4
80.0%
& 1
 
20.0%
Space Separator
ValueCountFrequency (%)
66
100.0%
Close Punctuation
ValueCountFrequency (%)
) 13
100.0%
Open Punctuation
ValueCountFrequency (%)
( 13
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2713 | 95.9%
Common | 104 | 3.7%
Latin | 12 | 0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
248
 
9.1%
234
 
8.6%
130
 
4.8%
102
 
3.8%
93
 
3.4%
79
 
2.9%
69
 
2.5%
67
 
2.5%
60
 
2.2%
41
 
1.5%
Other values (349) 1590
58.6%
Latin
ValueCountFrequency (%)
i 2
16.7%
O 1
8.3%
l 1
8.3%
v 1
8.3%
e 1
8.3%
h 1
8.3%
a 1
8.3%
r 1
8.3%
S 1
8.3%
E 1
8.3%
Common
ValueCountFrequency (%)
66
63.5%
) 13
 
12.5%
( 13
 
12.5%
, 4
 
3.8%
2 4
 
3.8%
1 1
 
1.0%
3 1
 
1.0%
- 1
 
1.0%
& 1
 
1.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2713 | 95.9%
ASCII | 116 | 4.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
248
 
9.1%
234
 
8.6%
130
 
4.8%
102
 
3.8%
93
 
3.4%
79
 
2.9%
69
 
2.5%
67
 
2.5%
60
 
2.2%
41
 
1.5%
Other values (349) 1590
58.6%
ASCII
ValueCountFrequency (%)
66
56.9%
) 13
 
11.2%
( 13
 
11.2%
, 4
 
3.4%
2 4
 
3.4%
i 2
 
1.7%
1 1
 
0.9%
3 1
 
0.9%
- 1
 
0.9%
O 1
 
0.9%
Other values (10) 10
 
8.6%

업소소재지(도로명)
Text

Distinct: 496
Distinct (%): 97.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Length

Max length: 57
Median length: 48
Mean length: 28.191781
Min length: 19

Characters and Unicode

Total characters: 14406
Distinct characters: 209
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 481
Unique (%): 94.1%

Sample

1st row: 충청남도 서산시 고운로 129 (읍내동)
2nd row: 충청남도 서산시 번화1로 45 (동문동)
3rd row: 충청남도 서산시 번화2로 33 (동문동)
4th row: 충청남도 서산시 번화2로 28 (읍내동)
5th row: 충청남도 서산시 읍내동 183,214-23번지 1호
Value | Count | Frequency (%)
충청남도 | 511 | 16.3%
서산시 | 511 | 16.3%
1층 | 242 | 7.7%
동문동 | 189 | 6.0%
읍내동 | 74 | 2.4%
2층 | 61 | 1.9%
석림동 | 50 | 1.6%
예천동 | 45 | 1.4%
상가동 | 37 | 1.2%
고운로 | 29 | 0.9%
Other values (553) | 1381 | 44.1%

Most occurring characters

ValueCountFrequency (%)
2648
18.4%
744
 
5.2%
1 726
 
5.0%
568
 
3.9%
567
 
3.9%
565
 
3.9%
560
 
3.9%
528
 
3.7%
516
 
3.6%
514
 
3.6%
Other values (199) 6470
44.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 7986 | 55.4%
Space Separator | 2648 | 18.4%
Decimal Number | 2278 | 15.8%
Close Punctuation | 471 | 3.3%
Open Punctuation | 471 | 3.3%
Other Punctuation | 440 | 3.1%
Dash Punctuation | 99 | 0.7%
Uppercase Letter | 12 | 0.1%
Lowercase Letter | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
744
 
9.3%
568
 
7.1%
567
 
7.1%
565
 
7.1%
560
 
7.0%
528
 
6.6%
516
 
6.5%
514
 
6.4%
431
 
5.4%
339
 
4.2%
Other values (177) 2654
33.2%
Decimal Number
ValueCountFrequency (%)
1 726
31.9%
2 384
16.9%
3 234
 
10.3%
4 179
 
7.9%
0 155
 
6.8%
6 146
 
6.4%
5 138
 
6.1%
9 111
 
4.9%
7 105
 
4.6%
8 100
 
4.4%
Uppercase Letter
ValueCountFrequency (%)
A 5
41.7%
C 3
25.0%
B 2
 
16.7%
J 1
 
8.3%
K 1
 
8.3%
Other Punctuation
ValueCountFrequency (%)
, 439
99.8%
@ 1
 
0.2%
Space Separator
ValueCountFrequency (%)
2648
100.0%
Close Punctuation
ValueCountFrequency (%)
) 471
100.0%
Open Punctuation
ValueCountFrequency (%)
( 471
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 99
100.0%
Lowercase Letter
ValueCountFrequency (%)
e 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 7986 | 55.4%
Common | 6407 | 44.5%
Latin | 13 | 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
744
 
9.3%
568
 
7.1%
567
 
7.1%
565
 
7.1%
560
 
7.0%
528
 
6.6%
516
 
6.5%
514
 
6.4%
431
 
5.4%
339
 
4.2%
Other values (177) 2654
33.2%
Common
ValueCountFrequency (%)
2648
41.3%
1 726
 
11.3%
) 471
 
7.4%
( 471
 
7.4%
, 439
 
6.9%
2 384
 
6.0%
3 234
 
3.7%
4 179
 
2.8%
0 155
 
2.4%
6 146
 
2.3%
Other values (6) 554
 
8.6%
Latin
ValueCountFrequency (%)
A 5
38.5%
C 3
23.1%
B 2
 
15.4%
e 1
 
7.7%
J 1
 
7.7%
K 1
 
7.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 7986 | 55.4%
ASCII | 6420 | 44.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2648
41.2%
1 726
 
11.3%
) 471
 
7.3%
( 471
 
7.3%
, 439
 
6.8%
2 384
 
6.0%
3 234
 
3.6%
4 179
 
2.8%
0 155
 
2.4%
6 146
 
2.3%
Other values (12) 567
 
8.8%
Hangul
ValueCountFrequency (%)
744
 
9.3%
568
 
7.1%
567
 
7.1%
565
 
7.1%
560
 
7.0%
528
 
6.6%
516
 
6.5%
514
 
6.4%
431
 
5.4%
339
 
4.2%
Other values (177) 2654
33.2%

소재지전화
Text

MISSING 

Distinct: 361
Distinct (%): 98.9%
Missing: 146
Missing (%): 28.6%
Memory size: 4.1 KiB

Length

Max length: 14
Median length: 14
Mean length: 13.99726
Min length: 13

Characters and Unicode

Total characters: 5109
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 357
Unique (%): 97.8%

Sample

1st row 041- 664-7558
2nd row 041- 665-3594
3rd row 041- 665-4492
4th row 041- 664-0230
5th row 041- 665-8862
Value | Count | Frequency (%)
041 | 354 | 41.2%
669 | 27 | 3.1%
665 | 20 | 2.3%
667 | 16 | 1.9%
668 | 13 | 1.5%
681 | 12 | 1.4%
666 | 10 | 1.2%
664 | 8 | 0.9%
663 | 8 | 0.9%
662 | 7 | 0.8%
Other values (367) | 385 | 44.8%

Most occurring characters

Value | Count | Frequency (%)
6 | 813 | 15.9%
- | 730 | 14.3%
(space) | 721 | 14.1%
0 | 567 | 11.1%
4 | 539 | 10.6%
1 | 531 | 10.4%
8 | 232 | 4.5%
5 | 224 | 4.4%
7 | 220 | 4.3%
3 | 192 | 3.8%
Other values (2) | 340 | 6.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 3658 | 71.6%
Dash Punctuation | 730 | 14.3%
Space Separator | 721 | 14.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
6 813
22.2%
0 567
15.5%
4 539
14.7%
1 531
14.5%
8 232
 
6.3%
5 224
 
6.1%
7 220
 
6.0%
3 192
 
5.2%
9 179
 
4.9%
2 161
 
4.4%
Dash Punctuation
ValueCountFrequency (%)
- 730
100.0%
Space Separator
ValueCountFrequency (%)
721
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 5109 | 100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
6 813
15.9%
- 730
14.3%
721
14.1%
0 567
11.1%
4 539
10.6%
1 531
10.4%
8 232
 
4.5%
5 224
 
4.4%
7 220
 
4.3%
3 192
 
3.8%
Other values (2) 340
6.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 5109 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6 813
15.9%
- 730
14.3%
721
14.1%
0 567
11.1%
4 539
10.6%
1 531
10.4%
8 232
 
4.5%
5 224
 
4.4%
7 220
 
4.3%
3 192
 
3.8%
Other values (2) 340
6.7%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 업종명 | 업소명 | 업소소재지(도로명) | 소재지전화
0 | 미용업(일반) | 제일미용실 | 충청남도 서산시 고운로 129 (읍내동) | 041- 664-7558
1 | 미용업(일반) | 임명숙헤어미용실 | 충청남도 서산시 번화1로 45 (동문동) | 041- 665-3594
2 | 미용업(일반) | 쎄느미용실 | 충청남도 서산시 번화2로 33 (동문동) | 041- 665-4492
3 | 미용업(일반) | 신정미용실 | 충청남도 서산시 번화2로 28 (읍내동) | 041- 664-0230
4 | 미용업(일반) | 명동미용실 | 충청남도 서산시 읍내동 183,214-23번지 1호 | 041- 665-8862
5 | 미용업(일반) | 진주미용실 | 충청남도 서산시 시장4길 21 (동문동) | 041- 667-2893
6 | 미용업(일반) | 나라헤어샵 | 충청남도 서산시 번화2로 21-1 (읍내동) | 041- 664-1445
7 | 미용업(일반) | 수지미용실 | 충청남도 서산시 운산면 운암로 1046-1 | 041- 663-4400
8 | 미용업(일반) | 평화미용실 | 충청남도 서산시 해미면 읍내리 333번지 2호 | 041- 688-2812
9 | 미용업(일반) | 정머리방 | 충청남도 서산시 시장4길 21 (동문동) | 041- 664-0478

Index | 업종명 | 업소명 | 업소소재지(도로명) | 소재지전화
501 | 미용업(피부), 미용업(손톱ㆍ발톱) | 어텀네일 | 충청남도 서산시 동헌로 38-1, 1층 (석남동) | <NA>
502 | 미용업(일반) | 김미용실 | 충청남도 서산시 중앙로 51, 2동 (동문동) | <NA>
503 | 미용업(피부) | 채우다 | 충청남도 서산시 율지3로 36, 1층 108호 (동문동) | <NA>
504 | 미용업(손톱ㆍ발톱), 미용업(화장ㆍ분장) | 오늘,네일샵 | 충청남도 서산시 대산읍 충의로 1889, 1층 3호 | <NA>
505 | 미용업(피부) | 영테라피 | 충청남도 서산시 금남로 36, 1층 (동문동) | <NA>
506 | 미용업(피부) | 곱다샵 | 충청남도 서산시 탑동1로 17, 1층 (동문동) | 041 -664 -1110
507 | 미용업(일반) | 연헤어 | 충청남도 서산시 명륜1로 85, 1층 (읍내동) | <NA>
508 | 미용업(일반) | 크러쉬온이나 서산점 | 충청남도 서산시 고운로 109, 2층 (읍내동) | 041-669 -5178
509 | 미용업(손톱ㆍ발톱), 미용업(화장ㆍ분장) | 핑크타임 | 충청남도 서산시 번화1로 23, 1층 (동문동) | <NA>
510 | 미용업(피부) | 민스킨케어 | 충청남도 서산시 연당1로 3-4 (읍내동) | <NA>
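
The 소재지전화 values in the sample rows place spaces inconsistently around the hyphens (for example "041- 664-7558", "041 -664 -1110" and "041-669 -5178"). A minimal sketch of one possible cleanup, assuming whitespace carries no meaning in these numbers; `normalize_phone` is a hypothetical helper and is not part of the dataset or of the profiling tool:

```python
import re


def normalize_phone(raw):
    """Remove all whitespace from a phone string; pass missing values through."""
    if raw is None:
        return None
    return re.sub(r"\s+", "", raw)


# Values copied from the sample rows; None stands in for <NA>.
samples = [" 041- 664-7558", "041 -664 -1110", "041-669 -5178", None]
cleaned = [normalize_phone(s) for s in samples]
print(cleaned)
```

Normalising before profiling would likely change the length and distinct-value statistics for this column, so it is worth deciding whether the raw or cleaned form is the one to report.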