Overview

Dataset statistics

Number of variables: 5
Number of observations: 593
Missing cells: 726
Missing cells (%): 24.5%
Duplicate rows: 5
Duplicate rows (%): 0.8%
Total size in memory: 23.3 KiB
Average record size in memory: 40.2 B
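
The percentage figures follow directly from the raw counts. As a minimal sketch (the variable names are illustrative and not part of the report), the arithmetic can be checked like this:

```python
# Counts copied from the dataset statistics.
n_variables = 5
n_observations = 593
missing_cells = 726
duplicate_rows = 5

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

# Share of empty cells and share of duplicated rows, as percentages.
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```

Rounded to one decimal place, both values agree with the percentages listed in the report.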

Variable types

Text: 4
Categorical: 1

Dataset

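Description: Provides data on barbershop businesses (이용업) in Gimhae-si, Gyeongsangnam-do, including business name (사업장명), location phone number (소재지전화), lot-number address (지번주소), road-name address (도로명주소), and business status (영업상태).
Author: Gimhae-si, Gyeongsangnam-do (경상남도 김해시)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15033427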

Alerts

Dataset has 5 (0.8%) duplicate rows [Duplicates]
영업상태 is highly imbalanced (50.1%) [Imbalance]
소재지전화 has 463 (78.1%) missing values [Missing]
영업소 주소(도로명) has 259 (43.7%) missing values [Missing]
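
The 50.1% imbalance figure for 영업상태 is consistent with one minus the Shannon entropy of the class distribution, normalised by the logarithm of the number of classes. That formula is an assumption about how the score was derived; the report does not state it. The counts (457, 135, 1) come from the Common Values table for 영업상태, and the function name `imbalance_score` is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the class
    distribution and k is the number of classes. A uniform distribution
    gives 0; the score approaches 1 as one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)

# Class counts for 영업상태: 폐업, 영업/정상, <NA>.
score = imbalance_score([457, 135, 1])
print(f"{score:.1%}")
```

Under that assumption, the near-empty third class (a single `<NA>` row) contributes noticeably to the score, because it raises the number of classes to three without adding much entropy.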

Reproduction

Analysis started: 2023-12-11 00:11:52.499238
Analysis finished: 2023-12-11 00:11:53.205466
Duration: 0.71 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

사업장명
Text

Distinct: 462
Distinct (%): 78.0%
Missing: 1
Missing (%): 0.2%
Memory size: 4.8 KiB

Length

Max length: 16
Median length: 5
Mean length: 5.7128378
Min length: 3

Characters and Unicode

Total characters: 3382
Distinct characters: 306
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
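
As a minimal sketch of how such properties can be read, Python's standard `unicodedata` module exposes the general category of each code point (for example `Lo` for Other Letter, `Lu` for Uppercase Letter, `Zs` for Space Separator). The helper name `category_counts` and the third sample string are illustrative assumptions; the report does not say how it computes these figures internally.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over all characters in the strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

# First two sample rows of 사업장명 plus one made-up mixed-script name.
names = ["연지사우나이용원", "주공이용원", "ABC 이용원(1)"]
counts = category_counts(names)

for category, count in counts.most_common():
    print(category, count)
```

Hangul syllables fall under `Lo`, which is why Other Letter dominates the category tables for the name and address columns.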

Unique

Unique: 368
Unique (%): 62.2%

Sample

1st row: 연지사우나이용원
2nd row: 주공이용원
3rd row: 삼계이용원
4th row: 목화탕이용원
5th row: 덕삼이용원

Value | Count | Frequency (%)
명천탕이용원 | 5 | 0.8%
동원이용원 | 5 | 0.8%
이용원 | 5 | 0.8%
중앙이용원 | 4 | 0.7%
명동이용원 | 4 | 0.7%
현대이용원 | 4 | 0.7%
퀸즈헤나 | 4 | 0.7%
한일이용원 | 3 | 0.5%
태후사랑 | 3 | 0.5%
광명이용원 | 3 | 0.5%
Other values (462) | 569 | 93.4%

Most occurring characters

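Value | Count | Frequency (%)
(glyph lost) | 517 | 15.3%
(glyph lost) | 496 | 14.7%
(glyph lost) | 489 | 14.5%
(glyph lost) | 77 | 2.3%
(glyph lost) | 58 | 1.7%
(glyph lost) | 55 | 1.6%
(glyph lost) | 47 | 1.4%
(glyph lost) | 44 | 1.3%
(glyph lost) | 42 | 1.2%
(glyph lost) | 41 | 1.2%
Other values (296) | 1516 | 44.8%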

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3327 | 98.4%
Uppercase Letter | 24 | 0.7%
Space Separator | 17 | 0.5%
Close Punctuation | 5 | 0.1%
Open Punctuation | 5 | 0.1%
Other Punctuation | 2 | 0.1%
Decimal Number | 2 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(glyph lost) | 517 | 15.5%
(glyph lost) | 496 | 14.9%
(glyph lost) | 489 | 14.7%
(glyph lost) | 77 | 2.3%
(glyph lost) | 58 | 1.7%
(glyph lost) | 55 | 1.7%
(glyph lost) | 47 | 1.4%
(glyph lost) | 44 | 1.3%
(glyph lost) | 42 | 1.3%
(glyph lost) | 41 | 1.2%
Other values (275) | 1461 | 43.9%

Uppercase Letter
Value | Count | Frequency (%)
B | 4 | 16.7%
O | 3 | 12.5%
P | 2 | 8.3%
A | 2 | 8.3%
S | 2 | 8.3%
R | 2 | 8.3%
K | 1 | 4.2%
C | 1 | 4.2%
E | 1 | 4.2%
Z | 1 | 4.2%
Other values (5) | 5 | 20.8%

Decimal Number
Value | Count | Frequency (%)
1 | 1 | 50.0%
2 | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 17 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 5 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 5 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 2 | 100.0%

Most occurring scripts

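Value | Count | Frequency (%)
Hangul | 3327 | 98.4%
Common | 31 | 0.9%
Latin | 24 | 0.7%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(glyph lost) | 517 | 15.5%
(glyph lost) | 496 | 14.9%
(glyph lost) | 489 | 14.7%
(glyph lost) | 77 | 2.3%
(glyph lost) | 58 | 1.7%
(glyph lost) | 55 | 1.7%
(glyph lost) | 47 | 1.4%
(glyph lost) | 44 | 1.3%
(glyph lost) | 42 | 1.3%
(glyph lost) | 41 | 1.2%
Other values (275) | 1461 | 43.9%

Latin
Value | Count | Frequency (%)
B | 4 | 16.7%
O | 3 | 12.5%
P | 2 | 8.3%
A | 2 | 8.3%
S | 2 | 8.3%
R | 2 | 8.3%
K | 1 | 4.2%
C | 1 | 4.2%
E | 1 | 4.2%
Z | 1 | 4.2%
Other values (5) | 5 | 20.8%

Common
Value | Count | Frequency (%)
(space) | 17 | 54.8%
) | 5 | 16.1%
( | 5 | 16.1%
. | 2 | 6.5%
1 | 1 | 3.2%
2 | 1 | 3.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3327 | 98.4%
ASCII | 55 | 1.6%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(glyph lost) | 517 | 15.5%
(glyph lost) | 496 | 14.9%
(glyph lost) | 489 | 14.7%
(glyph lost) | 77 | 2.3%
(glyph lost) | 58 | 1.7%
(glyph lost) | 55 | 1.7%
(glyph lost) | 47 | 1.4%
(glyph lost) | 44 | 1.3%
(glyph lost) | 42 | 1.3%
(glyph lost) | 41 | 1.2%
Other values (275) | 1461 | 43.9%

ASCII
Value | Count | Frequency (%)
(space) | 17 | 30.9%
) | 5 | 9.1%
( | 5 | 9.1%
B | 4 | 7.3%
O | 3 | 5.5%
P | 2 | 3.6%
. | 2 | 3.6%
A | 2 | 3.6%
S | 2 | 3.6%
R | 2 | 3.6%
Other values (11) | 11 | 20.0%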

소재지전화
Text

MISSING 

Distinct: 127
Distinct (%): 97.7%
Missing: 463
Missing (%): 78.1%
Memory size: 4.8 KiB

Length

Max length: 13
Median length: 13
Mean length: 12.676923
Min length: 12

Characters and Unicode

Total characters: 1648
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 124
Unique (%): 95.4%

Sample

1st row: 055-328-2827
2nd row: 055-322-2850
3rd row: 055-322-4834
4th row: 055-333-5690
5th row: 055-333-8513

Value | Count | Frequency (%)
055-326-7418 | 2 | 1.5%
055-339-5519 | 2 | 1.5%
055-332-7799 | 2 | 1.5%
055-335-7255 | 2 | 1.5%
055-314-0766 | 2 | 1.5%
055-334-8025 | 2 | 1.5%
055-313-9492 | 2 | 1.5%
055-326-9900 | 1 | 0.8%
055-334-9072 | 1 | 0.8%
055-341-4238 | 1 | 0.8%
Other values (113) | 113 | 86.9%

Most occurring characters

Value | Count | Frequency (%)
5 | 335 | 20.3%
- | 260 | 15.8%
3 | 232 | 14.1%
0 | 180 | 10.9%
2 | 144 | 8.7%
(space) | 88 | 5.3%
1 | 78 | 4.7%
4 | 76 | 4.6%
7 | 68 | 4.1%
8 | 66 | 4.0%
Other values (2) | 121 | 7.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1300 | 78.9%
Dash Punctuation | 260 | 15.8%
Space Separator | 88 | 5.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
5 | 335 | 25.8%
3 | 232 | 17.8%
0 | 180 | 13.8%
2 | 144 | 11.1%
1 | 78 | 6.0%
4 | 76 | 5.8%
7 | 68 | 5.2%
8 | 66 | 5.1%
6 | 62 | 4.8%
9 | 59 | 4.5%

Dash Punctuation
Value | Count | Frequency (%)
- | 260 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 88 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1648 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
5 | 335 | 20.3%
- | 260 | 15.8%
3 | 232 | 14.1%
0 | 180 | 10.9%
2 | 144 | 8.7%
(space) | 88 | 5.3%
1 | 78 | 4.7%
4 | 76 | 4.6%
7 | 68 | 4.1%
8 | 66 | 4.0%
Other values (2) | 121 | 7.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1648 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
5 | 335 | 20.3%
- | 260 | 15.8%
3 | 232 | 14.1%
0 | 180 | 10.9%
2 | 144 | 8.7%
(space) | 88 | 5.3%
1 | 78 | 4.7%
4 | 76 | 4.6%
7 | 68 | 4.1%
8 | 66 | 4.0%
Other values (2) | 121 | 7.3%

영업소 주소(도로명)
Text

MISSING

Distinct: 313
Distinct (%): 93.7%
Missing: 259
Missing (%): 43.7%
Memory size: 4.8 KiB

Length

Max length: 57
Median length: 44
Mean length: 30.065868
Min length: 18

Characters and Unicode

Total characters: 10042
Distinct characters: 209
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 294
Unique (%): 88.0%

Sample

1st row: 경상남도 김해시 우암로 175, A401호 (내동)
2nd row: 경상남도 김해시 김해대로2529번길 70 (어방동)
3rd row: 경상남도 김해시 호계로 513, 1층 4호 (동상동)
4th row: 경상남도 김해시 호계로 481-1 (동상동)
5th row: 경상남도 김해시 가락로125번길 36 (대성동)

Value | Count | Frequency (%)
경상남도 | 334 | 16.6%
김해시 | 334 | 16.6%
외동 | 33 | 1.6%
1층 | 31 | 1.5%
진영읍 | 30 | 1.5%
3층 | 29 | 1.4%
내동 | 24 | 1.2%
2층 | 23 | 1.1%
부원동 | 22 | 1.1%
삼계동 | 20 | 1.0%
Other values (528) | 1130 | 56.2%

Most occurring characters

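Value | Count | Frequency (%)
(space) | 1678 | 16.7%
1 | 452 | 4.5%
(glyph lost) | 378 | 3.8%
(glyph lost) | 375 | 3.7%
(glyph lost) | 366 | 3.6%
(glyph lost) | 356 | 3.5%
(glyph lost) | 348 | 3.5%
(glyph lost) | 339 | 3.4%
(glyph lost) | 335 | 3.3%
(glyph lost) | 334 | 3.3%
Other values (199) | 5081 | 50.6%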

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 5702 | 56.8%
Decimal Number | 1738 | 17.3%
Space Separator | 1678 | 16.7%
Open Punctuation | 295 | 2.9%
Close Punctuation | 295 | 2.9%
Other Punctuation | 239 | 2.4%
Dash Punctuation | 85 | 0.8%
Uppercase Letter | 10 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(glyph lost) | 378 | 6.6%
(glyph lost) | 375 | 6.6%
(glyph lost) | 366 | 6.4%
(glyph lost) | 356 | 6.2%
(glyph lost) | 348 | 6.1%
(glyph lost) | 339 | 5.9%
(glyph lost) | 335 | 5.9%
(glyph lost) | 334 | 5.9%
(glyph lost) | 334 | 5.9%
(glyph lost) | 201 | 3.5%
Other values (178) | 2336 | 41.0%

Decimal Number
Value | Count | Frequency (%)
1 | 452 | 26.0%
2 | 297 | 17.1%
3 | 180 | 10.4%
0 | 158 | 9.1%
5 | 137 | 7.9%
4 | 129 | 7.4%
7 | 113 | 6.5%
6 | 104 | 6.0%
8 | 87 | 5.0%
9 | 81 | 4.7%

Uppercase Letter
Value | Count | Frequency (%)
A | 5 | 50.0%
C | 1 | 10.0%
S | 1 | 10.0%
M | 1 | 10.0%
I | 1 | 10.0%
D | 1 | 10.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1678 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 295 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 295 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 239 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 85 | 100.0%

Most occurring scripts

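Value | Count | Frequency (%)
Hangul | 5702 | 56.8%
Common | 4330 | 43.1%
Latin | 10 | 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(glyph lost) | 378 | 6.6%
(glyph lost) | 375 | 6.6%
(glyph lost) | 366 | 6.4%
(glyph lost) | 356 | 6.2%
(glyph lost) | 348 | 6.1%
(glyph lost) | 339 | 5.9%
(glyph lost) | 335 | 5.9%
(glyph lost) | 334 | 5.9%
(glyph lost) | 334 | 5.9%
(glyph lost) | 201 | 3.5%
Other values (178) | 2336 | 41.0%

Common
Value | Count | Frequency (%)
(space) | 1678 | 38.8%
1 | 452 | 10.4%
2 | 297 | 6.9%
( | 295 | 6.8%
) | 295 | 6.8%
, | 239 | 5.5%
3 | 180 | 4.2%
0 | 158 | 3.6%
5 | 137 | 3.2%
4 | 129 | 3.0%
Other values (5) | 470 | 10.9%

Latin
Value | Count | Frequency (%)
A | 5 | 50.0%
C | 1 | 10.0%
S | 1 | 10.0%
M | 1 | 10.0%
I | 1 | 10.0%
D | 1 | 10.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 5702 | 56.8%
ASCII | 4340 | 43.2%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 1678 | 38.7%
1 | 452 | 10.4%
2 | 297 | 6.8%
( | 295 | 6.8%
) | 295 | 6.8%
, | 239 | 5.5%
3 | 180 | 4.1%
0 | 158 | 3.6%
5 | 137 | 3.2%
4 | 129 | 3.0%
Other values (11) | 480 | 11.1%

Hangul
Value | Count | Frequency (%)
(glyph lost) | 378 | 6.6%
(glyph lost) | 375 | 6.6%
(glyph lost) | 366 | 6.4%
(glyph lost) | 356 | 6.2%
(glyph lost) | 348 | 6.1%
(glyph lost) | 339 | 5.9%
(glyph lost) | 335 | 5.9%
(glyph lost) | 334 | 5.9%
(glyph lost) | 334 | 5.9%
(glyph lost) | 201 | 3.5%
Other values (178) | 2336 | 41.0%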

영업소 주소(지번)
Text

Distinct: 523
Distinct (%): 88.6%
Missing: 3
Missing (%): 0.5%
Memory size: 4.8 KiB

Length

Max length: 48
Median length: 40
Mean length: 23.023729
Min length: 15

Characters and Unicode

Total characters: 13584
Distinct characters: 232
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 474
Unique (%): 80.3%

Sample

1st row: 경상남도 김해시 내동 1071-1 A401호
2nd row: 경상남도 김해시 외동 1250-3 주공상가 202호
3rd row: 경상남도 김해시 삼계동 1155-1
4th row: 경상남도 김해시 지내동 20 B 7L 준공업지구
5th row: 경상남도 김해시 어방동 1097-16

Value | Count | Frequency (%)
경상남도 | 590 | 20.7%
김해시 | 590 | 20.7%
외동 | 58 | 2.0%
진영읍 | 48 | 1.7%
내동 | 44 | 1.5%
부원동 | 42 | 1.5%
삼방동 | 40 | 1.4%
삼정동 | 35 | 1.2%
어방동 | 35 | 1.2%
대청동 | 28 | 1.0%
Other values (717) | 1340 | 47.0%

Most occurring characters

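Value | Count | Frequency (%)
(space) | 2804 | 20.6%
1 | 721 | 5.3%
(glyph lost) | 678 | 5.0%
(glyph lost) | 608 | 4.5%
(glyph lost) | 595 | 4.4%
(glyph lost) | 595 | 4.4%
(glyph lost) | 593 | 4.4%
(glyph lost) | 592 | 4.4%
(glyph lost) | 591 | 4.4%
(glyph lost) | 584 | 4.3%
Other values (222) | 5223 | 38.4%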

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 7311 | 53.8%
Decimal Number | 2910 | 21.4%
Space Separator | 2804 | 20.6%
Dash Punctuation | 517 | 3.8%
Uppercase Letter | 25 | 0.2%
Other Punctuation | 13 | 0.1%
Open Punctuation | 2 | < 0.1%
Close Punctuation | 2 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(glyph lost) | 678 | 9.3%
(glyph lost) | 608 | 8.3%
(glyph lost) | 595 | 8.1%
(glyph lost) | 595 | 8.1%
(glyph lost) | 593 | 8.1%
(glyph lost) | 592 | 8.1%
(glyph lost) | 591 | 8.1%
(glyph lost) | 584 | 8.0%
(glyph lost) | 114 | 1.6%
(glyph lost) | 104 | 1.4%
Other values (198) | 2257 | 30.9%

Decimal Number
Value | Count | Frequency (%)
1 | 721 | 24.8%
2 | 426 | 14.6%
0 | 273 | 9.4%
3 | 270 | 9.3%
6 | 252 | 8.7%
4 | 239 | 8.2%
5 | 226 | 7.8%
7 | 207 | 7.1%
9 | 161 | 5.5%
8 | 135 | 4.6%

Uppercase Letter
Value | Count | Frequency (%)
A | 11 | 44.0%
B | 5 | 20.0%
L | 4 | 16.0%
M | 1 | 4.0%
S | 1 | 4.0%
D | 1 | 4.0%
I | 1 | 4.0%
C | 1 | 4.0%

Other Punctuation
Value | Count | Frequency (%)
, | 12 | 92.3%
. | 1 | 7.7%

Space Separator
Value | Count | Frequency (%)
(space) | 2804 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 517 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 2 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 2 | 100.0%

Most occurring scripts

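Value | Count | Frequency (%)
Hangul | 7311 | 53.8%
Common | 6248 | 46.0%
Latin | 25 | 0.2%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(glyph lost) | 678 | 9.3%
(glyph lost) | 608 | 8.3%
(glyph lost) | 595 | 8.1%
(glyph lost) | 595 | 8.1%
(glyph lost) | 593 | 8.1%
(glyph lost) | 592 | 8.1%
(glyph lost) | 591 | 8.1%
(glyph lost) | 584 | 8.0%
(glyph lost) | 114 | 1.6%
(glyph lost) | 104 | 1.4%
Other values (198) | 2257 | 30.9%

Common
Value | Count | Frequency (%)
(space) | 2804 | 44.9%
1 | 721 | 11.5%
- | 517 | 8.3%
2 | 426 | 6.8%
0 | 273 | 4.4%
3 | 270 | 4.3%
6 | 252 | 4.0%
4 | 239 | 3.8%
5 | 226 | 3.6%
7 | 207 | 3.3%
Other values (6) | 313 | 5.0%

Latin
Value | Count | Frequency (%)
A | 11 | 44.0%
B | 5 | 20.0%
L | 4 | 16.0%
M | 1 | 4.0%
S | 1 | 4.0%
D | 1 | 4.0%
I | 1 | 4.0%
C | 1 | 4.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 7311 | 53.8%
ASCII | 6273 | 46.2%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 2804 | 44.7%
1 | 721 | 11.5%
- | 517 | 8.2%
2 | 426 | 6.8%
0 | 273 | 4.4%
3 | 270 | 4.3%
6 | 252 | 4.0%
4 | 239 | 3.8%
5 | 226 | 3.6%
7 | 207 | 3.3%
Other values (14) | 338 | 5.4%

Hangul
Value | Count | Frequency (%)
(glyph lost) | 678 | 9.3%
(glyph lost) | 608 | 8.3%
(glyph lost) | 595 | 8.1%
(glyph lost) | 595 | 8.1%
(glyph lost) | 593 | 8.1%
(glyph lost) | 592 | 8.1%
(glyph lost) | 591 | 8.1%
(glyph lost) | 584 | 8.0%
(glyph lost) | 114 | 1.6%
(glyph lost) | 104 | 1.4%
Other values (198) | 2257 | 30.9%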

영업상태
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 4.8 KiB

폐업: 457
영업/정상: 135
<NA>: 1

Length

Max length: 5
Median length: 2
Mean length: 2.6863406
Min length: 2

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: 폐업
2nd row: 폐업
3rd row: 폐업
4th row: 폐업
5th row: 폐업

Common Values

Value | Count | Frequency (%)
폐업 | 457 | 77.1%
영업/정상 | 135 | 22.8%
<NA> | 1 | 0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
폐업 | 457 | 77.1%
영업/정상 | 135 | 22.8%
na | 1 | 0.2%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
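
As a rough illustration of nullity correlation, the sketch below computes the Pearson correlation between the presence indicators of 소재지전화 and 영업소 주소(도로명), using only rows 0 to 9 shown in the Sample section. The helper `pearson` and the choice of Pearson correlation are assumptions for illustration; ten rows are far too few to reproduce the heatmap, which is computed over the whole dataset.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# 1 = value present, 0 = <NA>, for rows 0-9 of the Sample section.
phone_present = [0, 1, 0, 0, 1, 1, 0, 0, 0, 0]   # 소재지전화
road_present = [1, 0, 0, 0, 1, 0, 1, 0, 0, 0]    # 영업소 주소(도로명)

nullity_corr = pearson(phone_present, road_present)
print(f"{nullity_corr:.3f}")
```

A value near 0 means the presence of one column says little about the presence of the other; values near 1 or -1 would indicate that the two columns tend to be filled in together or alternately.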

Sample

First rows

Row | 사업장명 | 소재지전화 | 영업소 주소(도로명) | 영업소 주소(지번) | 영업상태
0 | 연지사우나이용원 | <NA> | 경상남도 김해시 우암로 175, A401호 (내동) | 경상남도 김해시 내동 1071-1 A401호 | 폐업
1 | 주공이용원 | 055-328-2827 | <NA> | 경상남도 김해시 외동 1250-3 주공상가 202호 | 폐업
2 | 삼계이용원 | <NA> | <NA> | 경상남도 김해시 삼계동 1155-1 | 폐업
3 | 목화탕이용원 | <NA> | <NA> | 경상남도 김해시 지내동 20 B 7L 준공업지구 | 폐업
4 | 덕삼이용원 | 055-322-2850 | 경상남도 김해시 김해대로2529번길 70 (어방동) | 경상남도 김해시 어방동 1097-16 | 폐업
5 | 태평양이용원 | 055-322-4834 | <NA> | 경상남도 김해시 안동 259-16 ,17 | 폐업
6 | 동광이용원 | <NA> | 경상남도 김해시 호계로 513, 1층 4호 (동상동) | 경상남도 김해시 동상동 803-6 1층 4호 | 영업/정상
7 | 대웅이용원 | <NA> | <NA> | 경상남도 김해시 동상동 675-15 | 폐업
8 | 동부이용원 | <NA> | <NA> | 경상남도 김해시 동상동 595 | 폐업
9 | 옥수이용원 | <NA> | <NA> | 경상남도 김해시 동상동 953-2 | 폐업

Last rows

Row | 사업장명 | 소재지전화 | 영업소 주소(도로명) | 영업소 주소(지번) | 영업상태
583 | 채플린 | <NA> | 경상남도 김해시 삼계로205번길 20, 1층 (삼계동) | 경상남도 김해시 삼계동 1444-5 1층 | 영업/정상
584 | 롯데이용원 | <NA> | 경상남도 김해시 가야로451번길 6-11, 2층 201호 (동상동) | 경상남도 김해시 동상동 1101-4 2층 201호 | 영업/정상
585 | 폭포수사우나이용원 | <NA> | 경상남도 김해시 계동로 129-31, 4층 (대청동) | 경상남도 김해시 대청동 510 , 4층 | 폐업
586 | 영오탕이용원 | <NA> | 경상남도 김해시 내외중앙로 105, 영오탕 내 3층 (내동) | 경상남도 김해시 내동 1128-4 | 영업/정상
587 | 덕삼탕 내 이용원 | <NA> | 경상남도 김해시 김해대로2529번길 70, 3층 (어방동) | 경상남도 김해시 어방동 1097-17 3층 | 영업/정상
588 | 신사쌀롱 | <NA> | 경상남도 김해시 덕정로40번길 13-20, 1층 일부호 (관동동) | 경상남도 김해시 관동동 427-7 | 영업/정상
589 | 엔젤바버샵 | <NA> | 경상남도 김해시 능동로 27, 퓨전스포츠타운 106호 (삼문동) | 경상남도 김해시 삼문동 79-3 퓨전스포츠타운 106호 | 영업/정상
590 | 라이프 | <NA> | 경상남도 김해시 진영읍 진영로 132, 3층 | 경상남도 김해시 진영읍 진영리 265-1 | 영업/정상
591 | 오리엔트바버샵 | <NA> | 경상남도 김해시 율하숲길 24, 112호 (장유동) | 경상남도 김해시 장유동 843-6 112호 | 영업/정상
592 | <NA> | <NA> | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

Row | 사업장명 | 소재지전화 | 영업소 주소(도로명) | 영업소 주소(지번) | 영업상태 | # duplicates
0 | 경보스포츠이용원 | <NA> | <NA> | 경상남도 김해시 내동 1117-1 13층 | 폐업 | 2
1 | 수성이용원 | 055-326-7418 | <NA> | 경상남도 김해시 동상동 792-8 | 폐업 | 2
2 | 아람이용원 | <NA> | 경상남도 김해시 함박로101번길 20, 3층 303호 (외동, 혜민빌딩) | 경상남도 김해시 외동 1252-3 혜민빌딩 303호 | 폐업 | 2
3 | 유성탕이용원 | <NA> | <NA> | 경상남도 김해시 대성동 111-1 | 폐업 | 2
4 | 해피죤이용원 | <NA> | 경상남도 김해시 함박로 120 (외동, 한국아파트상가동 지하1층) | 경상남도 김해시 외동 1261-9 한국아파트상가동 지하1층 | 폐업 | 2