Overview

Dataset statistics

Number of variables: 5
Number of observations: 821
Missing cells: 353
Missing cells (%): 8.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 33.0 KiB
Average record size in memory: 41.2 B
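
The two derived figures above can be checked from the raw counts. The following is a minimal sketch, assuming the percentage is missing cells divided by variables × observations and that 1 KiB is 1024 bytes; the function names are illustrative and are not part of ydata-profiling.

```python
def missing_cell_pct(missing_cells: int, n_vars: int, n_obs: int) -> float:
    """Share of missing cells among all cells, in percent."""
    return 100 * missing_cells / (n_vars * n_obs)


def avg_record_size_bytes(total_kib: float, n_obs: int) -> float:
    """Average bytes per row, treating 1 KiB as 1024 bytes (assumption)."""
    return total_kib * 1024 / n_obs


print(round(missing_cell_pct(353, 5, 821), 1))     # 8.6
print(round(avg_record_size_bytes(33.0, 821), 1))  # 41.2
```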

Variable types

Categorical: 1
Text: 3
Numeric: 1

Dataset

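Description: Data on public sanitation businesses in Gijang-gun, Busan Metropolitan City (lodging, public bathhouses, barbershops, laundries, building sanitation management, general hairdressing, and others). It provides fields such as business name, location, telephone number, and postal code.
URL: https://www.data.go.kr/data/15112990/fileData.do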

Alerts

Missing: 우편번호(도로명) has 15 (1.8%) missing values
Missing: 소재지전화 has 338 (41.2%) missing values

Reproduction

Analysis started: 2023-12-12 04:08:14.006864
Analysis finished: 2023-12-12 04:08:15.073616
Duration: 1.07 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
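
The reported duration follows from the two timestamps above. This is a small sketch using only the standard library; the format string is an assumption based on how the timestamps are printed in this report.

```python
from datetime import datetime

# Assumed timestamp layout, matching the values printed in the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-12 04:08:14.006864", FMT)
finished = datetime.strptime("2023-12-12 04:08:15.073616", FMT)

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 1.07 seconds
```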

Variables

업종명 (business type)
Categorical

Distinct: 21
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 6.5 KiB

Value | Count
일반미용업 | 302
건물위생관리업 | 96
피부미용업 | 76
네일미용업 | 57
세탁업 | 57
Other values (16) | 233

Length

Max length: 23
Median length: 5
Mean length: 5.8915956
Min length: 3

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 숙박업(일반)
2nd row: 숙박업(일반)
3rd row: 숙박업(일반)
4th row: 숙박업(일반)
5th row: 숙박업(일반)

Common Values

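Value | Count | Frequency (%)
일반미용업 (general hairdressing) | 302 | 36.8%
건물위생관리업 (building sanitation management) | 96 | 11.7%
피부미용업 (skin care) | 76 | 9.3%
네일미용업 (nail care) | 57 | 6.9%
세탁업 (laundry) | 57 | 6.9%
숙박업(일반) (lodging, general) | 56 | 6.8%
이용업 (barbering) | 52 | 6.3%
목욕장업 (public bathhouse) | 26 | 3.2%
숙박업(생활) (lodging, residential) | 20 | 2.4%
화장 분장 미용업 (makeup and styling) | 15 | 1.8%
Other values (11) | 64 | 7.8%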
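
The frequency column is each count divided by the 821 observations. A minimal sketch that reproduces the percentages and the "Other values" row from the ten listed counts; the variable names are illustrative.

```python
from collections import Counter

# Counts of the ten most common 업종명 values, copied from the table above.
counts = Counter({
    "일반미용업": 302, "건물위생관리업": 96, "피부미용업": 76, "네일미용업": 57,
    "세탁업": 57, "숙박업(일반)": 56, "이용업": 52, "목욕장업": 26,
    "숙박업(생활)": 20, "화장 분장 미용업": 15,
})
N_ROWS = 821  # number of observations; this column has no missing values

# Everything not covered by the top ten falls into "Other values".
other = N_ROWS - sum(counts.values())
freq = {value: round(100 * n / N_ROWS, 1) for value, n in counts.most_common()}

print(freq["일반미용업"])                       # 36.8
print(other, round(100 * other / N_ROWS, 1))  # 64 7.8
```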

Length

[Figure: histogram of lengths of the category]
Value | Count | Frequency (%)
일반미용업 | 330 | 33.6%
피부미용업 | 98 | 10.0%
건물위생관리업 | 96 | 9.8%
네일미용업 | 88 | 9.0%
세탁업 | 57 | 5.8%
숙박업(일반 | 56 | 5.7%
이용업 | 52 | 5.3%
미용업 | 51 | 5.2%
화장 | 50 | 5.1%
분장 | 50 | 5.1%
Other values (3) | 55 | 5.6%
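
The table above appears to count whitespace-separated tokens rather than whole category values, which would explain entries such as "화장", "분장", and the truncated "숙박업(일반". The sketch below shows one tokenization that yields tokens of that shape. It is an assumption for illustration, not ydata-profiling's confirmed implementation, and `tokenize` is a hypothetical helper.

```python
import string
from collections import Counter


def tokenize(value: str) -> list[str]:
    """Split on whitespace, then strip ASCII punctuation from both ends of each token."""
    tokens = (tok.strip(string.punctuation) for tok in value.split())
    return [tok for tok in tokens if tok]


# Two 업종명 values taken from the Sample section of this report.
rows = ["숙박업(일반)", "일반미용업+ 네일미용업+ 화장 분장 미용업"]
word_counts = Counter(tok for row in rows for tok in tokenize(row))

print(tokenize("숙박업(일반)"))  # ['숙박업(일반']
print(word_counts["화장"])       # 1
```

Only the trailing parenthesis is stripped because the opening one sits inside the token, which matches the "숙박업(일반" row.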

업소명 (business name)
Text

Distinct: 808
Distinct (%): 98.4%
Missing: 0
Missing (%): 0.0%
Memory size: 6.5 KiB

Length

Max length: 42
Median length: 28
Mean length: 6.7320341
Min length: 2

Characters and Unicode

Total characters: 5527
Distinct characters: 518
Distinct categories: 10
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
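
As an illustration of these character properties, the sketch below counts Unicode general categories for one business name from the Sample section, using Python's standard `unicodedata` module. The `CATEGORY_NAMES` mapping covers only the codes needed here and is written out by hand; the report's own implementation may differ.

```python
import unicodedata
from collections import Counter

# Long names for a few general-category codes (hand-written, partial mapping).
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Zs": "Space Separator",
    "Nd": "Decimal Number",
}


def category_counts(text: str) -> Counter:
    """Count characters per Unicode general category."""
    return Counter(
        CATEGORY_NAMES.get(unicodedata.category(ch), unicodedata.category(ch))
        for ch in text
    )


counts = category_counts("아이더블유(I.W)호텔")
print(counts["Other Letter"])      # 7 Hangul syllables
print(counts["Uppercase Letter"])  # 2 ("I" and "W")
```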

Unique

Unique: 796
Unique (%): 97.0%

Sample

1st row: 조인장모텔
2nd row: 올리브모텔
3rd row: 늘봄모텔
4th row: 팔레스
5th row: 해변여관

Value | Count | Frequency (%)
주식회사 | 12 | 1.2%
hair | 10 | 1.0%
에스테틱 | 9 | 0.9%
헤어 | 8 | 0.8%
정관점 | 8 | 0.8%
nail | 5 | 0.5%
beauty | 5 | 0.5%
오시리아 | 5 | 0.5%
네일 | 4 | 0.4%
뷰티 | 4 | 0.4%
Other values (904) | 944 | 93.1%

Most occurring characters

ValueCountFrequency (%)
229
 
4.1%
220
 
4.0%
193
 
3.5%
148
 
2.7%
) 117
 
2.1%
117
 
2.1%
( 116
 
2.1%
93
 
1.7%
85
 
1.5%
79
 
1.4%
Other values (508) 4130
74.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4440 | 80.3%
Lowercase Letter | 315 | 5.7%
Uppercase Letter | 267 | 4.8%
Space Separator | 193 | 3.5%
Close Punctuation | 117 | 2.1%
Open Punctuation | 116 | 2.1%
Other Punctuation | 48 | 0.9%
Decimal Number | 25 | 0.5%
Dash Punctuation | 4 | 0.1%
Math Symbol | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
229
 
5.2%
220
 
5.0%
148
 
3.3%
117
 
2.6%
93
 
2.1%
85
 
1.9%
79
 
1.8%
77
 
1.7%
71
 
1.6%
68
 
1.5%
Other values (440) 3253
73.3%
Lowercase Letter
ValueCountFrequency (%)
a 45
14.3%
i 39
12.4%
e 33
10.5%
n 26
8.3%
r 23
 
7.3%
l 21
 
6.7%
o 20
 
6.3%
y 15
 
4.8%
h 15
 
4.8%
s 15
 
4.8%
Other values (14) 63
20.0%
Uppercase Letter
ValueCountFrequency (%)
H 24
 
9.0%
A 23
 
8.6%
N 20
 
7.5%
S 18
 
6.7%
E 18
 
6.7%
M 18
 
6.7%
I 17
 
6.4%
B 16
 
6.0%
T 13
 
4.9%
O 13
 
4.9%
Other values (13) 87
32.6%
Decimal Number
ValueCountFrequency (%)
2 10
40.0%
1 3
 
12.0%
5 3
 
12.0%
3 3
 
12.0%
7 2
 
8.0%
0 2
 
8.0%
9 1
 
4.0%
6 1
 
4.0%
Other Punctuation
ValueCountFrequency (%)
, 12
25.0%
# 11
22.9%
. 9
18.8%
& 7
14.6%
' 4
 
8.3%
: 3
 
6.2%
2
 
4.2%
Math Symbol
ValueCountFrequency (%)
~ 1
50.0%
+ 1
50.0%
Space Separator
ValueCountFrequency (%)
193
100.0%
Close Punctuation
ValueCountFrequency (%)
) 117
100.0%
Open Punctuation
ValueCountFrequency (%)
( 116
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4436 | 80.3%
Latin | 582 | 10.5%
Common | 505 | 9.1%
Han | 4 | 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
229
 
5.2%
220
 
5.0%
148
 
3.3%
117
 
2.6%
93
 
2.1%
85
 
1.9%
79
 
1.8%
77
 
1.7%
71
 
1.6%
68
 
1.5%
Other values (436) 3249
73.2%
Latin
ValueCountFrequency (%)
a 45
 
7.7%
i 39
 
6.7%
e 33
 
5.7%
n 26
 
4.5%
H 24
 
4.1%
r 23
 
4.0%
A 23
 
4.0%
l 21
 
3.6%
N 20
 
3.4%
o 20
 
3.4%
Other values (37) 308
52.9%
Common
ValueCountFrequency (%)
193
38.2%
) 117
23.2%
( 116
23.0%
, 12
 
2.4%
# 11
 
2.2%
2 10
 
2.0%
. 9
 
1.8%
& 7
 
1.4%
' 4
 
0.8%
- 4
 
0.8%
Other values (11) 22
 
4.4%
Han
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 4436 | 80.3%
ASCII | 1085 | 19.6%
CJK | 4 | 0.1%
None | 2 | < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
229
 
5.2%
220
 
5.0%
148
 
3.3%
117
 
2.6%
93
 
2.1%
85
 
1.9%
79
 
1.8%
77
 
1.7%
71
 
1.6%
68
 
1.5%
Other values (436) 3249
73.2%
ASCII
ValueCountFrequency (%)
193
17.8%
) 117
 
10.8%
( 116
 
10.7%
a 45
 
4.1%
i 39
 
3.6%
e 33
 
3.0%
n 26
 
2.4%
H 24
 
2.2%
r 23
 
2.1%
A 23
 
2.1%
Other values (57) 446
41.1%
None
ValueCountFrequency (%)
2
100.0%
CJK
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

영업소 주소(도로명) (business address, road-name)
Text

Distinct: 790
Distinct (%): 96.2%
Missing: 0
Missing (%): 0.0%
Memory size: 6.5 KiB

Length

Max length: 58
Median length: 51
Mean length: 30.313033
Min length: 19

Characters and Unicode

Total characters: 24887
Distinct characters: 260
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
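
The mean length reported above equals the total character count divided by the number of non-missing values. A quick check, assuming all 821 rows are present for this column (its missing count is 0); `mean_length` is an illustrative helper.

```python
def mean_length(total_chars: int, n_values: int) -> float:
    """Average string length over the non-missing values."""
    return total_chars / n_values


# Address column: 24887 characters over 821 values.
print(f"{mean_length(24887, 821):.6f}")  # 30.313033
# Business-name column: 5527 characters over 821 values.
print(f"{mean_length(5527, 821):.7f}")   # 6.7320341
```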

Unique

Unique: 762
Unique (%): 92.8%

Sample

1st row: 부산광역시 기장군 기장읍 차성동로 52-14
2nd row: 부산광역시 기장군 기장읍 대변로 20
3rd row: 부산광역시 기장군 일광읍 삼성3길 53-3
4th row: 부산광역시 기장군 기장읍 대변로 24
5th row: 부산광역시 기장군 일광읍 삼성3길 27

Value | Count | Frequency (%)
부산광역시 | 821 | 15.4%
기장군 | 821 | 15.4%
기장읍 | 336 | 6.3%
정관읍 | 284 | 5.3%
1층 | 181 | 3.4%
일광읍 | 101 | 1.9%
장안읍 | 84 | 1.6%
정관로 | 81 | 1.5%
2층 | 69 | 1.3%
차성로 | 39 | 0.7%
Other values (899) | 2506 | 47.1%

Most occurring characters

ValueCountFrequency (%)
4510
 
18.1%
1301
 
5.2%
1191
 
4.8%
1 1142
 
4.6%
964
 
3.9%
885
 
3.6%
860
 
3.5%
841
 
3.4%
829
 
3.3%
821
 
3.3%
Other values (250) 11543
46.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 14627 | 58.8%
Space Separator | 4510 | 18.1%
Decimal Number | 4472 | 18.0%
Other Punctuation | 603 | 2.4%
Dash Punctuation | 194 | 0.8%
Close Punctuation | 190 | 0.8%
Open Punctuation | 190 | 0.8%
Uppercase Letter | 78 | 0.3%
Math Symbol | 21 | 0.1%
Letter Number | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1301
 
8.9%
1191
 
8.1%
964
 
6.6%
885
 
6.1%
860
 
5.9%
841
 
5.7%
829
 
5.7%
821
 
5.6%
816
 
5.6%
677
 
4.6%
Other values (221) 5442
37.2%
Decimal Number
ValueCountFrequency (%)
1 1142
25.5%
2 687
15.4%
3 587
13.1%
0 491
11.0%
4 386
 
8.6%
5 335
 
7.5%
6 254
 
5.7%
8 226
 
5.1%
7 213
 
4.8%
9 151
 
3.4%
Uppercase Letter
ValueCountFrequency (%)
B 47
60.3%
A 12
 
15.4%
T 5
 
6.4%
L 3
 
3.8%
H 3
 
3.8%
C 2
 
2.6%
I 2
 
2.6%
S 2
 
2.6%
D 1
 
1.3%
P 1
 
1.3%
Other Punctuation
ValueCountFrequency (%)
, 598
99.2%
@ 4
 
0.7%
& 1
 
0.2%
Space Separator
ValueCountFrequency (%)
4510
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 194
100.0%
Close Punctuation
ValueCountFrequency (%)
) 190
100.0%
Open Punctuation
ValueCountFrequency (%)
( 190
100.0%
Math Symbol
ValueCountFrequency (%)
~ 21
100.0%
Letter Number
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 14627 | 58.8%
Common | 10180 | 40.9%
Latin | 80 | 0.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1301
 
8.9%
1191
 
8.1%
964
 
6.6%
885
 
6.1%
860
 
5.9%
841
 
5.7%
829
 
5.7%
821
 
5.6%
816
 
5.6%
677
 
4.6%
Other values (221) 5442
37.2%
Common
ValueCountFrequency (%)
4510
44.3%
1 1142
 
11.2%
2 687
 
6.7%
, 598
 
5.9%
3 587
 
5.8%
0 491
 
4.8%
4 386
 
3.8%
5 335
 
3.3%
6 254
 
2.5%
8 226
 
2.2%
Other values (8) 964
 
9.5%
Latin
ValueCountFrequency (%)
B 47
58.8%
A 12
 
15.0%
T 5
 
6.2%
L 3
 
3.8%
H 3
 
3.8%
C 2
 
2.5%
2
 
2.5%
I 2
 
2.5%
S 2
 
2.5%
D 1
 
1.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 14627 | 58.8%
ASCII | 10258 | 41.2%
Number Forms | 2 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4510
44.0%
1 1142
 
11.1%
2 687
 
6.7%
, 598
 
5.8%
3 587
 
5.7%
0 491
 
4.8%
4 386
 
3.8%
5 335
 
3.3%
6 254
 
2.5%
8 226
 
2.2%
Other values (18) 1042
 
10.2%
Hangul
ValueCountFrequency (%)
1301
 
8.9%
1191
 
8.1%
964
 
6.6%
885
 
6.1%
860
 
5.9%
841
 
5.7%
829
 
5.7%
821
 
5.6%
816
 
5.6%
677
 
4.6%
Other values (221) 5442
37.2%
Number Forms
ValueCountFrequency (%)
2
100.0%

우편번호(도로명) (postal code, road-name address)
Real number (ℝ)

MISSING

Distinct: 70
Distinct (%): 8.7%
Missing: 15
Missing (%): 1.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 46042.36
Minimum: 46002
Maximum: 46084
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.3 KiB

Quantile statistics

Minimum: 46002
5th percentile: 46008
Q1: 46019
Median: 46044
Q3: 46063
95th percentile: 46080
Maximum: 46084
Range: 82
Interquartile range (IQR): 44
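
The range and IQR above are simple differences of the listed quantiles. A short worked check, with the quantile values copied from this section:

```python
# Quantiles of 우편번호(도로명) as listed above.
quantiles = {"min": 46002, "q1": 46019, "median": 46044, "q3": 46063, "max": 46084}

value_range = quantiles["max"] - quantiles["min"]  # maximum minus minimum
iqr = quantiles["q3"] - quantiles["q1"]            # third quartile minus first quartile

print(value_range, iqr)  # 82 44
```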

Descriptive statistics

Standard deviation: 23.606654
Coefficient of variation (CV): 0.00051271599
Kurtosis: -1.3176025
Mean: 46042.36
Median Absolute Deviation (MAD): 22
Skewness: -0.013622606
Sum: 37110142
Variance: 557.27411
Monotonicity: Not monotonic
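
Several of these statistics are tied together by simple identities: CV is the standard deviation divided by the mean, variance is the squared standard deviation, and the sum is the mean times the number of non-missing values (821 − 15 = 806). The sketch below checks them from the rounded figures printed in this report, so agreement is only expected up to rounding.

```python
import math

mean = 46042.36   # rounded mean as printed above
std = 23.606654   # standard deviation as printed above
n = 821 - 15      # observations minus missing values

cv = std / mean
variance = std ** 2
total = mean * n

print(f"{cv:.8f}")        # 0.00051272
print(f"{variance:.3f}")  # 557.274
print(round(total))       # 37110142
```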
[Figure: histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
46008 | 49 | 6.0%
46048 | 46 | 5.6%
46066 | 41 | 5.0%
46044 | 37 | 4.5%
46015 | 36 | 4.4%
46061 | 36 | 4.4%
46017 | 32 | 3.9%
46036 | 25 | 3.0%
46055 | 24 | 2.9%
46056 | 22 | 2.7%
Other values (60) | 458 | 55.8%

Minimum 10 values

Value | Count | Frequency (%)
46002 | 5 | 0.6%
46004 | 1 | 0.1%
46007 | 22 | 2.7%
46008 | 49 | 6.0%
46009 | 2 | 0.2%
46010 | 9 | 1.1%
46011 | 2 | 0.2%
46012 | 7 | 0.9%
46013 | 19 | 2.3%
46014 | 2 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
46084 | 10 | 1.2%
46083 | 5 | 0.6%
46082 | 10 | 1.2%
46081 | 5 | 0.6%
46080 | 13 | 1.6%
46079 | 15 | 1.8%
46078 | 3 | 0.4%
46076 | 3 | 0.4%
46074 | 10 | 1.2%
46073 | 4 | 0.5%

소재지전화 (business telephone)
Text

MISSING

Distinct: 474
Distinct (%): 98.1%
Missing: 338
Missing (%): 41.2%
Memory size: 6.5 KiB
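
The two percentages above use different denominators: the missing share is relative to all 821 rows, while the distinct share matches the figures only when computed over the 483 non-missing values. A minimal sketch of that arithmetic, with `pct` as an illustrative helper:

```python
def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


n_rows, n_missing, n_distinct = 821, 338, 474
n_present = n_rows - n_missing  # 483 non-missing phone numbers

print(pct(n_missing, n_rows))      # 41.2
print(pct(n_distinct, n_present))  # 98.1
```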

Length

Max length: 13
Median length: 12
Mean length: 12.020704
Min length: 12

Characters and Unicode

Total characters: 5806
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 465
Unique (%): 96.3%

Sample

1st row: 051-721-2212
2nd row: 051-721-7390
3rd row: 051-724-3360
4th row: 051-721-5988
5th row: 051-721-1882
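
The length statistics (12 to 13 characters) and the character set (digits and hyphens only) suggest a uniform format for this column. The sketch below validates values against one such pattern. The regular expression is an assumption inferred from the values shown in this report, not a rule stated by the data provider.

```python
import re

# Assumed pattern: 2-3 digit area/service code, 3-4 digit exchange, 4-digit line number.
PHONE_RE = re.compile(r"\d{2,3}-\d{3,4}-\d{4}")


def is_valid_phone(value: str) -> bool:
    """True when the whole string matches the assumed hyphenated format."""
    return PHONE_RE.fullmatch(value) is not None


# Two values from this column plus one deliberately unhyphenated string.
for sample in ["051-721-2212", "070-7766-2222", "0517212212"]:
    print(sample, len(sample), is_valid_phone(sample))
```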

Value | Count | Frequency (%)
051-724-3457 | 2 | 0.4%
051-724-6688 | 2 | 0.4%
051-721-1235 | 2 | 0.4%
070-7766-2222 | 2 | 0.4%
051-722-7712 | 2 | 0.4%
051-727-2072 | 2 | 0.4%
051-721-5283 | 2 | 0.4%
051-727-3655 | 2 | 0.4%
051-727-9991 | 2 | 0.4%
051-727-3993 | 1 | 0.2%
Other values (464) | 464 | 96.1%

Most occurring characters

Value | Count | Frequency (%)
- | 966 | 16.6%
7 | 781 | 13.5%
0 | 774 | 13.3%
1 | 742 | 12.8%
5 | 728 | 12.5%
2 | 721 | 12.4%
8 | 272 | 4.7%
4 | 248 | 4.3%
3 | 220 | 3.8%
6 | 181 | 3.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 4840 | 83.4%
Dash Punctuation | 966 | 16.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
7 781
16.1%
0 774
16.0%
1 742
15.3%
5 728
15.0%
2 721
14.9%
8 272
 
5.6%
4 248
 
5.1%
3 220
 
4.5%
6 181
 
3.7%
9 173
 
3.6%
Dash Punctuation
ValueCountFrequency (%)
- 966
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5806
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 966
16.6%
7 781
13.5%
0 774
13.3%
1 742
12.8%
5 728
12.5%
2 721
12.4%
8 272
 
4.7%
4 248
 
4.3%
3 220
 
3.8%
6 181
 
3.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5806
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 966
16.6%
7 781
13.5%
0 774
13.3%
1 742
12.8%
5 728
12.5%
2 721
12.4%
8 272
 
4.7%
4 248
 
4.3%
3 220
 
3.8%
6 181
 
3.1%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]
Variable | 업종명 | 우편번호(도로명)
업종명 | 1.000 | 0.568
우편번호(도로명) | 0.568 | 1.000

[Figure: correlation heatmap]
Variable | 우편번호(도로명) | 업종명
우편번호(도로명) | 1.000 | 0.246
업종명 | 0.246 | 1.000

Missing values

[Figure: nullity bar chart] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure: nullity correlation heatmap] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
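
Nullity by column is simply the count of missing entries per variable. The sketch below computes it for two columns of the last ten rows listed in the Sample section, with `None` standing in for `<NA>`; `missing_by_column` is an illustrative helper, not a ydata-profiling function.

```python
def missing_by_column(rows: list[dict]) -> dict:
    """Count None entries per column across a list of row dictionaries."""
    columns = rows[0].keys()
    return {col: sum(row[col] is None for row in rows) for col in columns}


# Rows 811-820 of the dataset as shown in the Sample section (<NA> written as None).
zips = [46013, 46008, 46017, 46048, 46015, 46044, 46048, 46024, 46022, 46008]
phones = [None, None, "051-727-1707", None, None,
          None, "051-722-1551", None, None, None]
rows = [{"우편번호(도로명)": z, "소재지전화": p} for z, p in zip(zips, phones)]

missing = missing_by_column(rows)
print(missing)  # {'우편번호(도로명)': 0, '소재지전화': 8}
```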

Sample

First rows

Index | 업종명 | 업소명 | 영업소 주소(도로명) | 우편번호(도로명) | 소재지전화
0 | 숙박업(일반) | 조인장모텔 | 부산광역시 기장군 기장읍 차성동로 52-14 | 46065 | 051-721-2212
1 | 숙박업(일반) | 올리브모텔 | 부산광역시 기장군 기장읍 대변로 20 | 46081 | 051-721-7390
2 | 숙박업(일반) | 늘봄모텔 | 부산광역시 기장군 일광읍 삼성3길 53-3 | 46044 | 051-724-3360
3 | 숙박업(일반) | 팔레스 | 부산광역시 기장군 기장읍 대변로 24 | 46081 | 051-721-5988
4 | 숙박업(일반) | 해변여관 | 부산광역시 기장군 일광읍 삼성3길 27 | 46044 | 051-721-1882
5 | 숙박업(일반) | 베스트루이스해밀턴호텔 기장점(BEST LOUIS HAMILTON HOTEL) | 부산광역시 기장군 기장읍 반송로 1545 | 46058 | 051-721-2203
6 | 숙박업(일반) | 아이더블유(I.W)호텔 | 부산광역시 기장군 기장읍 연화길 76 | 46079 | 051-724-9201
7 | 숙박업(일반) | 호텔오월로 | 부산광역시 기장군 기장읍 차성로 365 | 46057 | 051-722-4561
8 | 숙박업(일반) | 호텔오즈 | 부산광역시 기장군 기장읍 기장해안로 530 | 46079 | 051-722-9030
9 | 숙박업(일반) | 초콜릿 | 부산광역시 기장군 기장읍 연화길 32 | 46082 | 051-723-2988

Last rows

Index | 업종명 | 업소명 | 영업소 주소(도로명) | 우편번호(도로명) | 소재지전화
811 | 네일미용업+ 화장 분장 미용업 | 메이드미(Mademi) | 부산광역시 기장군 정관읍 가동3로 24, 1층 | 46013 | <NA>
812 | 네일미용업+ 화장 분장 미용업 | 디오 | 부산광역시 기장군 정관읍 정관1로 18, 125동 B105호 (이지 더원1차 아파트) | 46008 | <NA>
813 | 일반미용업+ 피부미용업+ 네일미용업 | 어썸헤어 by 수미지 | 부산광역시 기장군 정관읍 정관5로 12, 228동 134호 (동원로얄듀크2차) | 46017 | 051-727-1707
814 | 일반미용업+ 피부미용업+ 네일미용업 | #빛나담 뷰티테라스 | 부산광역시 기장군 일광읍 해빛1로 72, 일광골드타워II 401호 | 46048 | <NA>
815 | 일반미용업+ 네일미용업+ 화장 분장 미용업 | 살롱드주헤어(Salon de Ju 헤어) | 부산광역시 기장군 정관읍 정관로 545, 504동 지하1층 103호 (이진 캐스빌 아파트) | 46015 | <NA>
816 | 일반미용업+ 네일미용업+ 화장 분장 미용업 | 프리미엄 이가자 일광점 | 부산광역시 기장군 일광읍 해송1로 36, 201,202호 | 46044 | <NA>
817 | 피부미용업+ 네일미용업+ 화장 분장 미용업 | 나비네일 | 부산광역시 기장군 일광읍 해빛5로 21-3, 2층 202호 | 46048 | 051-722-1551
818 | 피부미용업+ 네일미용업+ 화장 분장 미용업 | 소노뷰티 | 부산광역시 기장군 정관읍 달산2길 25-3, 1층 | 46024 | <NA>
819 | 피부미용업+ 네일미용업+ 화장 분장 미용업 | 즐거운네일 | 부산광역시 기장군 정관읍 정관덕산길 80, 상가동 6호 (정관 파스텔라 타운하우스) | 46022 | <NA>
820 | 피부미용업+ 네일미용업+ 화장 분장 미용업 | 어게인뷰티 | 부산광역시 기장군 정관읍 모전로 78-1, 1층 | 46008 | <NA>