Overview

Dataset statistics

Number of variables: 4
Number of observations: 700
Missing cells: 143
Missing cells (%): 5.1%
Duplicate rows: 1
Duplicate rows (%): 0.1%
Total size in memory: 22.0 KiB
Average record size in memory: 32.2 B
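
The two percentages can be checked from the counts. The sketch below assumes that missing cells are divided by the total cell count (observations times variables) and duplicate rows by the number of observations; the helper name `pct` is illustrative and not part of the profiling tool.

```python
# Counts copied from the dataset statistics.
n_variables = 4
n_observations = 700
missing_cells = 143
duplicate_rows = 1


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


total_cells = n_variables * n_observations
missing_pct = pct(missing_cells, total_cells)
duplicate_pct = pct(duplicate_rows, n_observations)
print(missing_pct, duplicate_pct)
```

Under these assumptions the results agree with the reported 5.1% and 0.1%.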

Variable types

Categorical: 1
Text: 3

Dataset

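Description: Public hygiene businesses in Sacheon-si (lodging, barber and beauty, bathhouse, laundry, and building sanitation management)
Author: Sacheon-si, Gyeongsangnam-do
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15045275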

Alerts

Dataset has 1 (0.1%) duplicate rows [Duplicates]
소재지전화 (location phone number) has 143 (20.4%) missing values [Missing]

Reproduction

Analysis started: 2023-12-10 23:52:07.121146
Analysis finished: 2023-12-10 23:52:07.572309
Duration: 0.45 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
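
The duration is the difference between the two timestamps. A minimal sketch with the standard library, assuming both timestamps share one time zone and the format string `FMT` (chosen here to match how they are printed):

```python
from datetime import datetime

# Format assumed from the printed timestamps.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-10 23:52:07.121146", FMT)
finished = datetime.strptime("2023-12-10 23:52:07.572309", FMT)

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```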

Variables

업종명 (business category)
Categorical

Distinct: 18
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

미용업(일반): 269
숙박업(일반): 154
이용업: 61
미용업(피부): 52
세탁업: 50
Other values (13): 114

Length

Max length: 31
Median length: 7
Mean length: 6.7471429
Min length: 3

Unique

Unique: 3
Unique (%): 0.4%

Sample

1st row: 숙박업(일반)
2nd row: 숙박업(일반)
3rd row: 숙박업(일반)
4th row: 숙박업(일반)
5th row: 숙박업(일반)

Common Values

Value | Count | Frequency (%)
미용업(일반) | 269 | 38.4%
숙박업(일반) | 154 | 22.0%
이용업 | 61 | 8.7%
미용업(피부) | 52 | 7.4%
세탁업 | 50 | 7.1%
목욕장업 | 40 | 5.7%
미용업(손톱ㆍ발톱) | 24 | 3.4%
건물위생관리업 | 17 | 2.4%
미용업(일반), 미용업(손톱ㆍ발톱) | 10 | 1.4%
숙박업(생활) | 7 | 1.0%
Other values (8) | 16 | 2.3%
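
The Frequency (%) column appears to be each count divided by the 700 observations. The sketch below recomputes it from the counts listed above; the dictionary name `counts` and the one-decimal rounding are assumptions made for illustration.

```python
# Counts copied from the Common Values table for 업종명.
counts = {
    "미용업(일반)": 269,
    "숙박업(일반)": 154,
    "이용업": 61,
    "미용업(피부)": 52,
    "세탁업": 50,
    "목욕장업": 40,
    "미용업(손톱ㆍ발톱)": 24,
    "건물위생관리업": 17,
    "미용업(일반), 미용업(손톱ㆍ발톱)": 10,
    "숙박업(생활)": 7,
    "Other values (8)": 16,
}

total = sum(counts.values())

# Share of each value, as a percentage with one decimal place.
shares = {value: round(100 * n / total, 1) for value, n in counts.items()}

for value, share in shares.items():
    print(f"{value}: {share}%")
```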

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
미용업(일반 | 284 | 39.1%
숙박업(일반 | 154 | 21.2%
이용업 | 61 | 8.4%
미용업(피부 | 60 | 8.3%
세탁업 | 50 | 6.9%
미용업(손톱ㆍ발톱 | 43 | 5.9%
목욕장업 | 40 | 5.5%
건물위생관리업 | 17 | 2.3%
미용업(화장ㆍ분장 | 8 | 1.1%
숙박업(생활 | 7 | 1.0%

업소명 (business name)
Text

Distinct: 685
Distinct (%): 97.9%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

Length

Max length: 18
Median length: 16
Mean length: 5.5214286
Min length: 1

Characters and Unicode

Total characters: 3865
Distinct characters: 473
Distinct categories: 10
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
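
As an illustration of such a property, Python's standard `unicodedata` module exposes the general category of each code point; the report's "Other Letter" and "Space Separator" correspond to the codes `Lo` and `Zs`. The sketch below counts categories over the five sample values listed for this variable. The function name `category_counts` is hypothetical, and scripts and blocks are left out because `unicodedata` does not provide them.

```python
import unicodedata
from collections import Counter

# The five sample values shown for this variable.
names = ["삼풍여인숙", "금성여관", "남일여인숙", "신진여인숙", "한려장여관"]


def category_counts(values):
    """Count Unicode general categories (e.g. 'Lo', 'Zs') over all characters."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


counts = category_counts(names)
print(counts)

# A mixed example: brackets, Hangul, space, Latin capital, digit.
mixed = category_counts(["(주) A1"])
print(mixed)
```

Every character in the five samples is a Hangul syllable, so only `Lo` appears for them; the mixed string shows how punctuation, spaces, Latin letters and digits fall into separate categories.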

Unique

Unique: 671
Unique (%): 95.9%

Sample

1st row: 삼풍여인숙
2nd row: 금성여관
3rd row: 남일여인숙
4th row: 신진여인숙
5th row: 한려장여관
ValueCountFrequency (%)
미용실 35
 
4.1%
헤어 7
 
0.8%
모텔 7
 
0.8%
3
 
0.4%
머리방 3
 
0.4%
현대이용원 3
 
0.4%
현대탕 2
 
0.2%
이지뷰티 2
 
0.2%
헤어아트 2
 
0.2%
v모텔 2
 
0.2%
Other values (751) 780
92.2%

Most occurring characters

ValueCountFrequency (%)
155
 
4.0%
148
 
3.8%
146
 
3.8%
146
 
3.8%
130
 
3.4%
109
 
2.8%
108
 
2.8%
93
 
2.4%
93
 
2.4%
76
 
2.0%
Other values (463) 2661
68.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3429 | 88.7%
Space Separator | 146 | 3.8%
Lowercase Letter | 109 | 2.8%
Uppercase Letter | 93 | 2.4%
Other Punctuation | 26 | 0.7%
Open Punctuation | 23 | 0.6%
Close Punctuation | 23 | 0.6%
Decimal Number | 14 | 0.4%
Dash Punctuation | 1 | < 0.1%
Letter Number | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
155
 
4.5%
148
 
4.3%
146
 
4.3%
130
 
3.8%
109
 
3.2%
108
 
3.1%
93
 
2.7%
93
 
2.7%
76
 
2.2%
70
 
2.0%
Other values (406) 2301
67.1%
Uppercase Letter
ValueCountFrequency (%)
S 15
16.1%
H 8
 
8.6%
O 8
 
8.6%
M 7
 
7.5%
A 6
 
6.5%
N 5
 
5.4%
T 5
 
5.4%
E 5
 
5.4%
J 4
 
4.3%
L 4
 
4.3%
Other values (12) 26
28.0%
Lowercase Letter
ValueCountFrequency (%)
i 13
11.9%
e 12
11.0%
a 12
11.0%
n 10
9.2%
h 8
 
7.3%
l 8
 
7.3%
o 8
 
7.3%
y 6
 
5.5%
d 5
 
4.6%
r 4
 
3.7%
Other values (8) 23
21.1%
Other Punctuation
ValueCountFrequency (%)
? 10
38.5%
# 5
19.2%
& 5
19.2%
. 4
 
15.4%
, 1
 
3.8%
' 1
 
3.8%
Decimal Number
ValueCountFrequency (%)
1 5
35.7%
2 4
28.6%
3 2
 
14.3%
0 1
 
7.1%
5 1
 
7.1%
6 1
 
7.1%
Space Separator
ValueCountFrequency (%)
146
100.0%
Open Punctuation
ValueCountFrequency (%)
( 23
100.0%
Close Punctuation
ValueCountFrequency (%)
) 23
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%
Letter Number
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3427 | 88.7%
Common | 233 | 6.0%
Latin | 203 | 5.3%
Han | 2 | 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
155
 
4.5%
148
 
4.3%
146
 
4.3%
130
 
3.8%
109
 
3.2%
108
 
3.2%
93
 
2.7%
93
 
2.7%
76
 
2.2%
70
 
2.0%
Other values (405) 2299
67.1%
Latin
ValueCountFrequency (%)
S 15
 
7.4%
i 13
 
6.4%
e 12
 
5.9%
a 12
 
5.9%
n 10
 
4.9%
h 8
 
3.9%
l 8
 
3.9%
H 8
 
3.9%
o 8
 
3.9%
O 8
 
3.9%
Other values (31) 101
49.8%
Common
ValueCountFrequency (%)
146
62.7%
( 23
 
9.9%
) 23
 
9.9%
? 10
 
4.3%
# 5
 
2.1%
& 5
 
2.1%
1 5
 
2.1%
. 4
 
1.7%
2 4
 
1.7%
3 2
 
0.9%
Other values (6) 6
 
2.6%
Han
ValueCountFrequency (%)
2
100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3427 | 88.7%
ASCII | 435 | 11.3%
CJK | 2 | 0.1%
Number Forms | 1 | < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
155
 
4.5%
148
 
4.3%
146
 
4.3%
130
 
3.8%
109
 
3.2%
108
 
3.2%
93
 
2.7%
93
 
2.7%
76
 
2.2%
70
 
2.0%
Other values (405) 2299
67.1%
ASCII
ValueCountFrequency (%)
146
33.6%
( 23
 
5.3%
) 23
 
5.3%
S 15
 
3.4%
i 13
 
3.0%
e 12
 
2.8%
a 12
 
2.8%
n 10
 
2.3%
? 10
 
2.3%
h 8
 
1.8%
Other values (46) 163
37.5%
CJK
ValueCountFrequency (%)
2
100.0%
Number Forms
ValueCountFrequency (%)
1
100.0%

업소소재지(도로명) (business address, road name)
Text

Distinct: 651
Distinct (%): 93.0%
Missing: 0
Missing (%): 0.0%
Memory size: 5.6 KiB

Length

Max length: 47
Median length: 45
Mean length: 23.632857
Min length: 16

Characters and Unicode

Total characters: 16543
Distinct characters: 221
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 611
Unique (%): 87.3%

Sample

1st row: 경상남도 사천시 중앙시장1길 11-7 (선구동)
2nd row: 경상남도 사천시 한내6길 80 (동금동)
3rd row: 경상남도 사천시 망산공원길 6-8 (선구동)
4th row: 경상남도 사천시 한내5길 101-3 (선구동)
5th row: 경상남도 사천시 한내5길 85 (동금동)
Value | Count | Frequency (%)
경상남도 | 700 | 18.7%
사천시 | 700 | 18.7%
사천읍 | 176 | 4.7%
벌리동 | 100 | 2.7%
동금동 | 73 | 2.0%
진삼로 | 54 | 1.4%
선구동 | 53 | 1.4%
향촌동 | 33 | 0.9%
사남면 | 33 | 0.9%
서금동 | 26 | 0.7%
Other values (652) | 1789 | 47.9%

Most occurring characters

ValueCountFrequency (%)
3037
18.4%
956
 
5.8%
913
 
5.5%
784
 
4.7%
745
 
4.5%
740
 
4.5%
718
 
4.3%
705
 
4.3%
657
 
4.0%
1 621
 
3.8%
Other values (211) 6667
40.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 9955 | 60.2%
Space Separator | 3037 | 18.4%
Decimal Number | 2355 | 14.2%
Open Punctuation | 421 | 2.5%
Close Punctuation | 421 | 2.5%
Dash Punctuation | 214 | 1.3%
Other Punctuation | 140 | 0.8%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
956
 
9.6%
913
 
9.2%
784
 
7.9%
745
 
7.5%
740
 
7.4%
718
 
7.2%
705
 
7.1%
657
 
6.6%
423
 
4.2%
280
 
2.8%
Other values (192) 3034
30.5%
Decimal Number
ValueCountFrequency (%)
1 621
26.4%
2 324
13.8%
4 222
 
9.4%
3 209
 
8.9%
0 189
 
8.0%
6 187
 
7.9%
5 187
 
7.9%
7 169
 
7.2%
8 124
 
5.3%
9 123
 
5.2%
Other Punctuation
ValueCountFrequency (%)
, 138
98.6%
@ 1
 
0.7%
* 1
 
0.7%
Open Punctuation
ValueCountFrequency (%)
( 420
99.8%
[ 1
 
0.2%
Close Punctuation
ValueCountFrequency (%)
) 420
99.8%
] 1
 
0.2%
Space Separator
ValueCountFrequency (%)
3037
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 214
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 9955 | 60.2%
Common | 6588 | 39.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
956
 
9.6%
913
 
9.2%
784
 
7.9%
745
 
7.5%
740
 
7.4%
718
 
7.2%
705
 
7.1%
657
 
6.6%
423
 
4.2%
280
 
2.8%
Other values (192) 3034
30.5%
Common
ValueCountFrequency (%)
3037
46.1%
1 621
 
9.4%
( 420
 
6.4%
) 420
 
6.4%
2 324
 
4.9%
4 222
 
3.4%
- 214
 
3.2%
3 209
 
3.2%
0 189
 
2.9%
6 187
 
2.8%
Other values (9) 745
 
11.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 9955 | 60.2%
ASCII | 6588 | 39.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3037
46.1%
1 621
 
9.4%
( 420
 
6.4%
) 420
 
6.4%
2 324
 
4.9%
4 222
 
3.4%
- 214
 
3.2%
3 209
 
3.2%
0 189
 
2.9%
6 187
 
2.8%
Other values (9) 745
 
11.3%
Hangul
ValueCountFrequency (%)
956
 
9.6%
913
 
9.2%
784
 
7.9%
745
 
7.5%
740
 
7.4%
718
 
7.2%
705
 
7.1%
657
 
6.6%
423
 
4.2%
280
 
2.8%
Other values (192) 3034
30.5%

소재지전화 (location phone number)
Text

MISSING

Distinct: 547
Distinct (%): 98.2%
Missing: 143
Missing (%): 20.4%
Memory size: 5.6 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 6684
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 540
Unique (%): 96.9%

Sample

1st row: 055-835-0922
2nd row: 055-833-2238
3rd row: 055-834-3333
4th row: 055-833-3331
5th row: 055-833-2773
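
Every non-missing value has length 12 and only digits and hyphens occur, which is consistent with a fixed `NNN-NNN-NNNN` layout. The sketch below checks the sample values against that layout and cross-checks the character total. The pattern `PHONE_RE` and the helper `is_valid_phone` are assumptions inferred from the statistics, not a rule stated by the data publisher.

```python
import re

# Assumed layout: 3-digit area code, 3-digit exchange, 4-digit line number.
PHONE_RE = re.compile(r"[0-9]{3}-[0-9]{3}-[0-9]{4}")


def is_valid_phone(value):
    """Return True when value is a string that fully matches the assumed layout."""
    return isinstance(value, str) and PHONE_RE.fullmatch(value) is not None


samples = ["055-835-0922", "055-833-2238", "055-834-3333", "055-833-3331", "055-833-2773"]
all_valid = all(is_valid_phone(s) for s in samples)

# Cross-check: 700 rows minus 143 missing values, 12 characters each.
non_missing = 700 - 143
expected_total_characters = non_missing * 12
print(all_valid, expected_total_characters)
```

The cross-check gives 6684, the total character count reported for this variable.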
Value | Count | Frequency (%)
055-832-9800 | 5 | 0.9%
055-833-5020 | 2 | 0.4%
055-854-3669 | 2 | 0.4%
055-855-0422 | 2 | 0.4%
055-852-6504 | 2 | 0.4%
055-850-1520 | 2 | 0.4%
055-855-9292 | 2 | 0.4%
055-832-5558 | 1 | 0.2%
055-835-3434 | 1 | 0.2%
055-852-6470 | 1 | 0.2%
Other values (537) | 537 | 96.4%

Most occurring characters

ValueCountFrequency (%)
5 1752
26.2%
- 1114
16.7%
0 869
13.0%
8 792
11.8%
3 691
 
10.3%
2 406
 
6.1%
4 263
 
3.9%
1 219
 
3.3%
7 216
 
3.2%
6 194
 
2.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 5570 | 83.3%
Dash Punctuation | 1114 | 16.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
5 1752
31.5%
0 869
15.6%
8 792
14.2%
3 691
 
12.4%
2 406
 
7.3%
4 263
 
4.7%
1 219
 
3.9%
7 216
 
3.9%
6 194
 
3.5%
9 168
 
3.0%
Dash Punctuation
ValueCountFrequency (%)
- 1114
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 6684
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
5 1752
26.2%
- 1114
16.7%
0 869
13.0%
8 792
11.8%
3 691
 
10.3%
2 406
 
6.1%
4 263
 
3.9%
1 219
 
3.3%
7 216
 
3.2%
6 194
 
2.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6684
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5 1752
26.2%
- 1114
16.7%
0 869
13.0%
8 792
11.8%
3 691
 
10.3%
2 406
 
6.1%
4 263
 
3.9%
1 219
 
3.3%
7 216
 
3.2%
6 194
 
2.9%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
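
Since the plots themselves are not reproduced here, the following sketch shows the idea behind both views on four rows taken from this report's sample (rows 696 to 699), with `None` standing in for `<NA>`. The helper names `missing_by_column` and `nullity_matrix` are hypothetical.

```python
COLUMNS = ["업종명", "업소명", "업소소재지(도로명)", "소재지전화"]

# Rows 696-699 of the report's sample; None stands in for <NA>.
rows = [
    ("건물위생관리업", "손모아클리닝", "경상남도 사천시 새시장길 19, 가동 16호 (동금동, 경남아파트상가)", "055-855-0422"),
    ("건물위생관리업", "매일청소", "경상남도 사천시 사천읍 읍내로 68, 102호 (대성프린스)", None),
    ("건물위생관리업", "위너HS", "경상남도 사천시 사천읍 사천읍성로 39", None),
    ("건물위생관리업", "(주)하얀", "경상남도 사천시 사남면 진삼로 1128", None),
]


def missing_by_column(rows, columns):
    """Count None cells per column (the per-column nullity view)."""
    return {col: sum(row[i] is None for row in rows) for i, col in enumerate(columns)}


def nullity_matrix(rows):
    """One string per row: '#' for a present cell, '.' for a missing one."""
    return ["".join("." if cell is None else "#" for cell in row) for row in rows]


missing = missing_by_column(rows, COLUMNS)
matrix = nullity_matrix(rows)
print(missing)
print("\n".join(matrix))
```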

Sample

Index | 업종명 | 업소명 | 업소소재지(도로명) | 소재지전화
0 | 숙박업(일반) | 삼풍여인숙 | 경상남도 사천시 중앙시장1길 11-7 (선구동) | 055-835-0922
1 | 숙박업(일반) | 금성여관 | 경상남도 사천시 한내6길 80 (동금동) | 055-833-2238
2 | 숙박업(일반) | 남일여인숙 | 경상남도 사천시 망산공원길 6-8 (선구동) | 055-834-3333
3 | 숙박업(일반) | 신진여인숙 | 경상남도 사천시 한내5길 101-3 (선구동) | 055-833-3331
4 | 숙박업(일반) | 한려장여관 | 경상남도 사천시 한내5길 85 (동금동) | 055-833-2773
5 | 숙박업(일반) | 산수장여관 | 경상남도 사천시 벌리한들길 84 (벌리동) | 055-833-6712
6 | 숙박업(일반) | 행운장여관 | 경상남도 사천시 팔포1길 6-3 (선구동) | 055-833-2644
7 | 숙박업(일반) | 산호장여관 | 경상남도 사천시 수남길 33 (선구동) | 055-833-5305
8 | 숙박업(일반) | 신 유일장여관 | 경상남도 사천시 수남길 27 (선구동) | 055-833-5760
9 | 숙박업(일반) | 한일여인숙 | 경상남도 사천시 갈대샘길 30-6 (선구동) | 055-833-4417

Index | 업종명 | 업소명 | 업소소재지(도로명) | 소재지전화
690 | 건물위생관리업 | (주)청보 | 경상남도 사천시 구미3길 104 (송포동) | 055-854-7576
691 | 건물위생관리업 | 한솔환경산업(주) | 경상남도 사천시 벌용길 118 (용강동) | 055-833-8350
692 | 건물위생관리업 | 연호건설(주) | 경상남도 사천시 축동면 사천대로 2136, 2층 | 055-753-5900
693 | 건물위생관리업 | (주) 도현건설 | 경상남도 사천시 사남면 진삼로 1116 | 055-573-8504
694 | 건물위생관리업 | 진우크린 | 경상남도 사천시 사남면 진삼로 1102 | 055-853-7001
695 | 건물위생관리업 | 와이에이치이엔씨(주) | 경상남도 사천시 축동면 사천대로 2136 | 055-855-6400
696 | 건물위생관리업 | 손모아클리닝 | 경상남도 사천시 새시장길 19, 가동 16호 (동금동, 경남아파트상가) | 055-855-0422
697 | 건물위생관리업 | 매일청소 | 경상남도 사천시 사천읍 읍내로 68, 102호 (대성프린스) | <NA>
698 | 건물위생관리업 | 위너HS | 경상남도 사천시 사천읍 사천읍성로 39 | <NA>
699 | 건물위생관리업 | (주)하얀 | 경상남도 사천시 사남면 진삼로 1128 | <NA>

Duplicate rows

Most frequently occurring

Index | 업종명 | 업소명 | 업소소재지(도로명) | 소재지전화 | # duplicates
0 | 미용업(일반) | 봄날헤어 | 경상남도 사천시 벌리1길 16, 1층 105호 (벌리동, 비룡데파트) | <NA> | 2
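
A duplicate here means a row whose four values all repeat, including the missing phone number. The sketch below finds such rows with the standard library. It assumes missing values compare equal to each other (modelled as `None`); the function name `duplicate_rows` and the three-row input are illustrative only.

```python
from collections import Counter


def duplicate_rows(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# The duplicated row from the table above; None stands in for <NA>.
dup = ("미용업(일반)", "봄날헤어", "경상남도 사천시 벌리1길 16, 1층 105호 (벌리동, 비룡데파트)", None)
# An unrelated row from the sample, included for contrast.
other = ("숙박업(일반)", "삼풍여인숙", "경상남도 사천시 중앙시장1길 11-7 (선구동)", "055-835-0922")

result = duplicate_rows([dup, other, dup])
print(result)
```

The count of 2 matches the "# duplicates" column, which appears to report total occurrences, while the overview's "Duplicate rows: 1" appears to count only the surplus copy.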