Overview

Dataset statistics

Number of variables: 4
Number of observations: 9493
Missing cells: 1
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 306.1 KiB
Average record size in memory: 33.0 B
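
The derived figures in this table follow from the raw counts. The sketch below recomputes them; it assumes that "Missing cells (%)" is taken over all cells (variables × observations) and that the average record size is total memory divided by the number of rows, which matches the reported values but is not confirmed by the report itself.

```python
# Figures copied from the dataset statistics table.
n_variables = 4
n_observations = 9493
missing_cells = 1
total_size_kib = 306.1

# Assumption: percentages are taken over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: average record size = total memory / number of rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(total_cells)
print(f"{missing_pct:.4f}%")
print(f"{avg_record_bytes:.1f} B")
```

A single missing cell out of 37,972 is far below the 0.1% display threshold, which is why the report shows "< 0.1%".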

Variable types

Numeric: 1
Categorical: 1
Text: 2

Dataset

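Description: This file contains data on hair salons (미용실) and bathhouses (목욕탕) in Incheon Metropolitan City. It provides each establishment's business name, detailed address, and related information.
URL: https://www.data.go.kr/data/15061611/fileData.do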

Alerts

구분 is highly imbalanced (82.9%) | Imbalance
연번 has unique values | Unique
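
The 82.9% imbalance figure is consistent with an entropy-based score, 1 - H / log2(k), applied to the two class counts reported for 구분 (9252 and 241). The sketch below is a minimal check under that assumption; `imbalance_score` is a name introduced here for illustration and is not claimed to be the profiler's own function.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the base-2 Shannon entropy of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no balance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts for 구분 as reported: 미용실 = 9252, 목욕탕 = 241.
score = imbalance_score([9252, 241])
print(f"{score:.1%}")
```

A perfectly balanced column scores 0 and a column dominated by one class approaches 1, so a value near 0.83 flags 구분 as heavily skewed toward 미용실.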

Reproduction

Analysis started: 2023-12-12 12:17:15.401407
Analysis finished: 2023-12-12 12:17:16.923443
Duration: 1.52 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번 (serial number)
Real number (ℝ)

UNIQUE

Distinct: 9493
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4747
Minimum: 1
Maximum: 9493
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 83.6 KiB

Quantile statistics

Minimum: 1
5th percentile: 475.6
Q1: 2374
Median: 4747
Q3: 7120
95th percentile: 9018.4
Maximum: 9493
Range: 9492
Interquartile range (IQR): 4746
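
The fractional percentiles (475.6 and 9018.4) are what linear interpolation between neighbouring order statistics produces. The sketch below reproduces the table under two assumptions: that 연번 is exactly the integers 1 to 9493, and that quantiles are interpolated linearly. The `quantile` helper is written for this illustration only.

```python
def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted sequence, with 0 <= q <= 1."""
    pos = q * (len(sorted_values) - 1)          # fractional index into the data
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])

# Assumption: 연번 runs from 1 to 9493 with no gaps.
serial = list(range(1, 9494))

for label, q in [("5th", 0.05), ("Q1", 0.25), ("median", 0.5), ("Q3", 0.75), ("95th", 0.95)]:
    print(label, round(quantile(serial, q), 1))

iqr = quantile(serial, 0.75) - quantile(serial, 0.25)
print("IQR", iqr)
```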

Descriptive statistics

Standard deviation: 2740.5374
Coefficient of variation (CV): 0.57731986
Kurtosis: -1.2
Mean: 4747
Median Absolute Deviation (MAD): 2373
Skewness: 0
Sum: 45063271
Variance: 7510545.2
Monotonicity: Strictly increasing
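
Because 연번 looks like a plain running index, most of these statistics can be checked with the standard library. The sketch below assumes the column holds the integers 1 to 9493 and that variance and standard deviation use the sample (n - 1) denominator; both assumptions reproduce the reported figures. Kurtosis and skewness are not recomputed here.

```python
import statistics

# Assumption: 연번 is the sequence 1..9493.
serial = list(range(1, 9494))

total = sum(serial)
mean = statistics.mean(serial)
variance = statistics.variance(serial)   # sample variance, n - 1 denominator
std = statistics.stdev(serial)           # sample standard deviation
cv = std / mean                          # coefficient of variation
median = statistics.median(serial)
mad = statistics.median(abs(x - median) for x in serial)  # median absolute deviation

print(total, mean, round(variance, 1), round(std, 4), round(cv, 8), mad)
```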
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1 | 1 | < 0.1%
6325 | 1 | < 0.1%
6327 | 1 | < 0.1%
6328 | 1 | < 0.1%
6329 | 1 | < 0.1%
6330 | 1 | < 0.1%
6331 | 1 | < 0.1%
6332 | 1 | < 0.1%
6333 | 1 | < 0.1%
6334 | 1 | < 0.1%
Other values (9483) | 9483 | 99.9%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
9493 | 1 | < 0.1%
9492 | 1 | < 0.1%
9491 | 1 | < 0.1%
9490 | 1 | < 0.1%
9489 | 1 | < 0.1%
9488 | 1 | < 0.1%
9487 | 1 | < 0.1%
9486 | 1 | < 0.1%
9485 | 1 | < 0.1%
9484 | 1 | < 0.1%

구분 (category)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 74.3 KiB
미용실 (hair salon): 9252
목욕탕 (bathhouse): 241

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 목욕탕
2nd row: 목욕탕
3rd row: 미용실
4th row: 목욕탕
5th row: 목욕탕

Common Values

Value | Count | Frequency (%)
미용실 | 9252 | 97.5%
목욕탕 | 241 | 2.5%

Length

Histogram of lengths of the category

상호명 (business name)
Text

Distinct: 7990
Distinct (%): 84.2%
Missing: 0
Missing (%): 0.0%
Memory size: 74.3 KiB

Length

Max length: 44
Median length: 33
Mean length: 6.3695354
Min length: 1

Characters and Unicode

Total characters: 60466
Distinct characters: 932
Distinct categories: 15
Distinct scripts: 4
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
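
As a rough illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own code; the `CATEGORY_NAMES` mapping and the `category_counts` helper are introduced here, and the input is the five sample business names shown in this report.

```python
import unicodedata
from collections import Counter

# Subset of general-category codes mapped to the long names used in the report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
}

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

names = ["유성목욕탕", "대양목욕탕", "터미널대중탕", "시휴재", "청솔불한증막"]
print(category_counts(names))
```

Hangul syllables fall under "Other Letter", which is why that category dominates the tables for this column.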

Unique

Unique: 7200
Unique (%): 75.8%
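
"Distinct" and "Unique" measure different things: distinct counts how many different values occur, while unique counts only the values that occur exactly once. The sketch below shows the distinction on the five sample 구분 values from this report; `distinct_and_unique` is a helper name made up for this illustration.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    freq = Counter(values)
    distinct = len(freq)
    unique = sum(1 for count in freq.values() if count == 1)
    return distinct, unique

sample = ["목욕탕", "목욕탕", "미용실", "목욕탕", "목욕탕"]
distinct, unique = distinct_and_unique(sample)
print(distinct, unique)
```

Applied to 상호명, the gap between 7990 distinct and 7200 unique names means 790 names are shared by two or more establishments.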

Sample

1st row: 유성목욕탕
2nd row: 대양목욕탕
3rd row: 터미널대중탕
4th row: 시휴재
5th row: 청솔불한증막

Value | Count | Frequency (%)
헤어 | 229 | 1.9%
hair | 116 | 1.0%
nail | 94 | 0.8%
네일 | 93 | 0.8%
미용실 | 90 | 0.8%
헤어샵 | 50 | 0.4%
salon | 38 | 0.3%
에스테틱 | 36 | 0.3%
(not captured) | 35 | 0.3%
리안헤어 | 35 | 0.3%
Other values (8300) | 10995 | 93.1%

Most occurring characters

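Value | Count | Frequency (%)
(not captured) | 4140 | 6.8%
(not captured) | 3992 | 6.6%
(space) | 2319 | 3.8%
(not captured) | 1686 | 2.8%
(not captured) | 1301 | 2.2%
(not captured) | 1296 | 2.1%
(not captured) | 1273 | 2.1%
(not captured) | 1232 | 2.0%
(not captured) | 1185 | 2.0%
) | 1001 | 1.7%
Other values (922) | 41041 | 67.9%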

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 48258 | 79.8%
Lowercase Letter | 3603 | 6.0%
Uppercase Letter | 3134 | 5.2%
Space Separator | 2319 | 3.8%
Close Punctuation | 1057 | 1.7%
Open Punctuation | 1056 | 1.7%
Other Punctuation | 549 | 0.9%
Decimal Number | 419 | 0.7%
Dash Punctuation | 38 | 0.1%
Connector Punctuation | 18 | < 0.1%
Other values (5) | 15 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not captured) | 4140 | 8.6%
(not captured) | 3992 | 8.3%
(not captured) | 1686 | 3.5%
(not captured) | 1301 | 2.7%
(not captured) | 1296 | 2.7%
(not captured) | 1273 | 2.6%
(not captured) | 1232 | 2.6%
(not captured) | 1185 | 2.5%
(not captured) | 956 | 2.0%
(not captured) | 950 | 2.0%
Other values (831) | 30247 | 62.7%

Lowercase Letter
Value | Count | Frequency (%)
a | 519 | 14.4%
i | 417 | 11.6%
e | 365 | 10.1%
l | 298 | 8.3%
n | 294 | 8.2%
o | 283 | 7.9%
r | 231 | 6.4%
h | 171 | 4.7%
s | 150 | 4.2%
y | 142 | 3.9%
Other values (16) | 733 | 20.3%

Uppercase Letter
Value | Count | Frequency (%)
A | 361 | 11.5%
N | 287 | 9.2%
H | 241 | 7.7%
I | 213 | 6.8%
S | 205 | 6.5%
L | 193 | 6.2%
O | 186 | 5.9%
E | 180 | 5.7%
B | 150 | 4.8%
R | 141 | 4.5%
Other values (16) | 977 | 31.2%

Other Punctuation
Value | Count | Frequency (%)
& | 142 | 25.9%
, | 112 | 20.4%
. | 101 | 18.4%
# | 100 | 18.2%
' | 45 | 8.2%
: | 22 | 4.0%
· | 10 | 1.8%
; | 6 | 1.1%
! | 3 | 0.5%
(not captured) | 3 | 0.5%
Other values (3) | 5 | 0.9%

Decimal Number
Value | Count | Frequency (%)
1 | 90 | 21.5%
2 | 83 | 19.8%
0 | 61 | 14.6%
3 | 47 | 11.2%
5 | 31 | 7.4%
6 | 29 | 6.9%
4 | 21 | 5.0%
9 | 20 | 4.8%
7 | 19 | 4.5%
8 | 18 | 4.3%

Math Symbol
Value | Count | Frequency (%)
+ | 4 | 50.0%
= | 2 | 25.0%
~ | 1 | 12.5%
× | 1 | 12.5%

Close Punctuation
Value | Count | Frequency (%)
) | 1001 | 94.7%
] | 56 | 5.3%

Open Punctuation
Value | Count | Frequency (%)
( | 1000 | 94.7%
[ | 56 | 5.3%

Modifier Symbol
Value | Count | Frequency (%)
´ | 1 | 50.0%
` | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 2319 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 38 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 18 | 100.0%

Letter Number
Value | Count | Frequency (%)
(not captured) | 2 | 100.0%

Other Symbol
Value | Count | Frequency (%)
° | 2 | 100.0%

Other Number
Value | Count | Frequency (%)
(not captured) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 48210 | 79.7%
Latin | 6739 | 11.1%
Common | 5469 | 9.0%
Han | 48 | 0.1%

Most frequent character per script

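Hangul
Value | Count | Frequency (%)
(not captured) | 4140 | 8.6%
(not captured) | 3992 | 8.3%
(not captured) | 1686 | 3.5%
(not captured) | 1301 | 2.7%
(not captured) | 1296 | 2.7%
(not captured) | 1273 | 2.6%
(not captured) | 1232 | 2.6%
(not captured) | 1185 | 2.5%
(not captured) | 956 | 2.0%
(not captured) | 950 | 2.0%
Other values (815) | 30199 | 62.6%

Latin
Value | Count | Frequency (%)
a | 519 | 7.7%
i | 417 | 6.2%
e | 365 | 5.4%
A | 361 | 5.4%
l | 298 | 4.4%
n | 294 | 4.4%
N | 287 | 4.3%
o | 283 | 4.2%
H | 241 | 3.6%
r | 231 | 3.4%
Other values (43) | 3443 | 51.1%

Common
Value | Count | Frequency (%)
(space) | 2319 | 42.4%
) | 1001 | 18.3%
( | 1000 | 18.3%
& | 142 | 2.6%
, | 112 | 2.0%
. | 101 | 1.8%
# | 100 | 1.8%
1 | 90 | 1.6%
2 | 83 | 1.5%
0 | 61 | 1.1%
Other values (28) | 460 | 8.4%

Han
Value | Count | Frequency (%)
(not captured) | 26 | 54.2%
(not captured) | 4 | 8.3%
(not captured) | 2 | 4.2%
(not captured) | 2 | 4.2%
(not captured) | 2 | 4.2%
(not captured) | 2 | 4.2%
(not captured) | 1 | 2.1%
(not captured) | 1 | 2.1%
(not captured) | 1 | 2.1%
(not captured) | 1 | 2.1%
Other values (6) | 6 | 12.5%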

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 48207 | 79.7%
ASCII | 12185 | 20.2%
CJK | 48 | 0.1%
None | 21 | < 0.1%
Compat Jamo | 3 | < 0.1%
Number Forms | 2 | < 0.1%

Most frequent character per block

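Hangul
Value | Count | Frequency (%)
(not captured) | 4140 | 8.6%
(not captured) | 3992 | 8.3%
(not captured) | 1686 | 3.5%
(not captured) | 1301 | 2.7%
(not captured) | 1296 | 2.7%
(not captured) | 1273 | 2.6%
(not captured) | 1232 | 2.6%
(not captured) | 1185 | 2.5%
(not captured) | 956 | 2.0%
(not captured) | 950 | 2.0%
Other values (812) | 30196 | 62.6%

ASCII
Value | Count | Frequency (%)
(space) | 2319 | 19.0%
) | 1001 | 8.2%
( | 1000 | 8.2%
a | 519 | 4.3%
i | 417 | 3.4%
e | 365 | 3.0%
A | 361 | 3.0%
l | 298 | 2.4%
n | 294 | 2.4%
N | 287 | 2.4%
Other values (72) | 5324 | 43.7%

CJK
Value | Count | Frequency (%)
(not captured) | 26 | 54.2%
(not captured) | 4 | 8.3%
(not captured) | 2 | 4.2%
(not captured) | 2 | 4.2%
(not captured) | 2 | 4.2%
(not captured) | 2 | 4.2%
(not captured) | 1 | 2.1%
(not captured) | 1 | 2.1%
(not captured) | 1 | 2.1%
(not captured) | 1 | 2.1%
Other values (6) | 6 | 12.5%

None
Value | Count | Frequency (%)
· | 10 | 47.6%
(not captured) | 3 | 14.3%
(not captured) | 2 | 9.5%
° | 2 | 9.5%
´ | 1 | 4.8%
(not captured) | 1 | 4.8%
(not captured) | 1 | 4.8%
× | 1 | 4.8%

Number Forms
Value | Count | Frequency (%)
(not captured) | 2 | 100.0%

Compat Jamo
Value | Count | Frequency (%)
(not captured) | 1 | 33.3%
(not captured) | 1 | 33.3%
(not captured) | 1 | 33.3%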

주소 (address)
Text

Distinct: 8972
Distinct (%): 94.5%
Missing: 1
Missing (%): < 0.1%
Memory size: 74.3 KiB

Length

Max length: 53
Median length: 46
Mean length: 23.391698
Min length: 6

Characters and Unicode

Total characters: 222034
Distinct characters: 580
Distinct categories: 12
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8652
Unique (%): 91.2%

Sample

1st row: 강화군 강화읍 강화대로 431
2nd row: 강화군 강화읍 청하동길 14
3rd row: 강화군 강화읍 중앙로 43
4th row: 강화군 내가면 강화서로 26
5th row: 강화군 선원면 중앙로 246

Value | Count | Frequency (%)
남동구 | 1813 | 3.7%
부평구 | 1723 | 3.6%
서구 | 1696 | 3.5%
미추홀구 | 1440 | 3.0%
1호 | 1270 | 2.6%
연수구 | 1070 | 2.2%
1층 | 1053 | 2.2%
계양구 | 936 | 1.9%
부평동 | 774 | 1.6%
2호 | 656 | 1.4%
Other values (6765) | 35983 | 74.3%
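
Entries such as 1호 and 1층 suggest that the word table is built by splitting each address on whitespace and counting the resulting tokens. The sketch below does exactly that on the five sample addresses from this report; whitespace tokenization is an assumption, and `word_counts` is a helper name introduced for this illustration.

```python
from collections import Counter

def word_counts(values):
    """Count whitespace-separated tokens across all strings."""
    counts = Counter()
    for value in values:
        counts.update(value.split())
    return counts

addresses = [
    "강화군 강화읍 강화대로 431",
    "강화군 강화읍 청하동길 14",
    "강화군 강화읍 중앙로 43",
    "강화군 내가면 강화서로 26",
    "강화군 선원면 중앙로 246",
]
top = word_counts(addresses).most_common(3)
print(top)
```

On the full column, the same kind of count would put district names such as 남동구 and 부평구 at the top, since nearly every address starts with one.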

Most occurring characters

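Value | Count | Frequency (%)
(space) | 38922 | 17.5%
1 | 14421 | 6.5%
(not captured) | 12187 | 5.5%
(not captured) | 10246 | 4.6%
(not captured) | 9996 | 4.5%
(not captured) | 9404 | 4.2%
(not captured) | 9250 | 4.2%
2 | 7929 | 3.6%
0 | 6068 | 2.7%
3 | 5431 | 2.4%
Other values (570) | 98180 | 44.2%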

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 120795 | 54.4%
Decimal Number | 55938 | 25.2%
Space Separator | 38922 | 17.5%
Dash Punctuation | 3204 | 1.4%
Uppercase Letter | 1049 | 0.5%
Other Punctuation | 743 | 0.3%
Open Punctuation | 582 | 0.3%
Close Punctuation | 581 | 0.3%
Lowercase Letter | 182 | 0.1%
Math Symbol | 29 | < 0.1%
Other values (2) | 9 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not captured) | 12187 | 10.1%
(not captured) | 10246 | 8.5%
(not captured) | 9996 | 8.3%
(not captured) | 9404 | 7.8%
(not captured) | 9250 | 7.7%
(not captured) | 4296 | 3.6%
(not captured) | 2763 | 2.3%
(not captured) | 2613 | 2.2%
(not captured) | 2213 | 1.8%
(not captured) | 2095 | 1.7%
Other values (501) | 55732 | 46.1%

Uppercase Letter
Value | Count | Frequency (%)
A | 166 | 15.8%
B | 162 | 15.4%
S | 75 | 7.1%
C | 74 | 7.1%
E | 64 | 6.1%
I | 57 | 5.4%
K | 51 | 4.9%
D | 42 | 4.0%
L | 40 | 3.8%
V | 39 | 3.7%
Other values (15) | 279 | 26.6%

Lowercase Letter
Value | Count | Frequency (%)
e | 63 | 34.6%
s | 22 | 12.1%
r | 17 | 9.3%
a | 17 | 9.3%
d | 14 | 7.7%
y | 12 | 6.6%
k | 10 | 5.5%
t | 8 | 4.4%
c | 3 | 1.6%
i | 3 | 1.6%
Other values (8) | 13 | 7.1%

Decimal Number
Value | Count | Frequency (%)
1 | 14421 | 25.8%
2 | 7929 | 14.2%
0 | 6068 | 10.8%
3 | 5431 | 9.7%
4 | 4456 | 8.0%
5 | 4268 | 7.6%
6 | 3920 | 7.0%
9 | 3178 | 5.7%
7 | 3145 | 5.6%
8 | 3122 | 5.6%

Other Punctuation
Value | Count | Frequency (%)
, | 657 | 88.4%
@ | 46 | 6.2%
. | 15 | 2.0%
' | 14 | 1.9%
/ | 9 | 1.2%
& | 2 | 0.3%

Open Punctuation
Value | Count | Frequency (%)
( | 581 | 99.8%
[ | 1 | 0.2%

Close Punctuation
Value | Count | Frequency (%)
) | 580 | 99.8%
] | 1 | 0.2%

Letter Number
Value | Count | Frequency (%)
(not captured) | 6 | 75.0%
(not captured) | 2 | 25.0%

Space Separator
Value | Count | Frequency (%)
(space) | 38922 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 3204 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 29 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 120795 | 54.4%
Common | 100000 | 45.0%
Latin | 1239 | 0.6%

Most frequent character per script

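Hangul
Value | Count | Frequency (%)
(not captured) | 12187 | 10.1%
(not captured) | 10246 | 8.5%
(not captured) | 9996 | 8.3%
(not captured) | 9404 | 7.8%
(not captured) | 9250 | 7.7%
(not captured) | 4296 | 3.6%
(not captured) | 2763 | 2.3%
(not captured) | 2613 | 2.2%
(not captured) | 2213 | 1.8%
(not captured) | 2095 | 1.7%
Other values (501) | 55732 | 46.1%

Latin
Value | Count | Frequency (%)
A | 166 | 13.4%
B | 162 | 13.1%
S | 75 | 6.1%
C | 74 | 6.0%
E | 64 | 5.2%
e | 63 | 5.1%
I | 57 | 4.6%
K | 51 | 4.1%
D | 42 | 3.4%
L | 40 | 3.2%
Other values (35) | 445 | 35.9%

Common
Value | Count | Frequency (%)
(space) | 38922 | 38.9%
1 | 14421 | 14.4%
2 | 7929 | 7.9%
0 | 6068 | 6.1%
3 | 5431 | 5.4%
4 | 4456 | 4.5%
5 | 4268 | 4.3%
6 | 3920 | 3.9%
- | 3204 | 3.2%
9 | 3178 | 3.2%
Other values (14) | 8203 | 8.2%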

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 120795 | 54.4%
ASCII | 101231 | 45.6%
Number Forms | 8 | < 0.1%

Most frequent character per block

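ASCII
Value | Count | Frequency (%)
(space) | 38922 | 38.4%
1 | 14421 | 14.2%
2 | 7929 | 7.8%
0 | 6068 | 6.0%
3 | 5431 | 5.4%
4 | 4456 | 4.4%
5 | 4268 | 4.2%
6 | 3920 | 3.9%
- | 3204 | 3.2%
9 | 3178 | 3.1%
Other values (57) | 9434 | 9.3%

Hangul
Value | Count | Frequency (%)
(not captured) | 12187 | 10.1%
(not captured) | 10246 | 8.5%
(not captured) | 9996 | 8.3%
(not captured) | 9404 | 7.8%
(not captured) | 9250 | 7.7%
(not captured) | 4296 | 3.6%
(not captured) | 2763 | 2.3%
(not captured) | 2613 | 2.2%
(not captured) | 2213 | 1.8%
(not captured) | 2095 | 1.7%
Other values (501) | 55732 | 46.1%

Number Forms
Value | Count | Frequency (%)
(not captured) | 6 | 75.0%
(not captured) | 2 | 25.0%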

Interactions

[Figure: interactions plot]

Correlations

Variable | 연번 | 구분
연번 | 1.000 | 0.624
구분 | 0.624 | 1.000

Variable | 연번 | 구분
연번 | 1.000 | 0.483
구분 | 0.483 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
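
The per-column nullity behind such a chart is a simple count of empty cells. The sketch below tallies it for a few rows; the first two rows come from the sample in this report, while the missing 주소 in the third row is a hypothetical gap added purely for illustration, and `missing_per_column` is a helper name introduced here.

```python
def missing_per_column(rows, columns):
    """Count cells that are None or empty strings, column by column."""
    return {col: sum(1 for row in rows if row.get(col) in (None, "")) for col in columns}

columns = ["연번", "구분", "상호명", "주소"]
rows = [
    {"연번": 1, "구분": "목욕탕", "상호명": "유성목욕탕", "주소": "강화군 강화읍 강화대로 431"},
    {"연번": 2, "구분": "목욕탕", "상호명": "대양목욕탕", "주소": "강화군 강화읍 청하동길 14"},
    # Hypothetical gap: the real row has an address; None is used here only to show a missing cell.
    {"연번": 3, "구분": "미용실", "상호명": "터미널대중탕", "주소": None},
]
nullity = missing_per_column(rows, columns)
print(nullity)
```

In the actual dataset the only missing cell is a single 주소 value, so three of the four bars in the nullity chart are complete.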

Sample

First rows
Index | 연번 | 구분 | 상호명 | 주소
0 | 1 | 목욕탕 | 유성목욕탕 | 강화군 강화읍 강화대로 431
1 | 2 | 목욕탕 | 대양목욕탕 | 강화군 강화읍 청하동길 14
2 | 3 | 미용실 | 터미널대중탕 | 강화군 강화읍 중앙로 43
3 | 4 | 목욕탕 | 시휴재 | 강화군 내가면 강화서로 26
4 | 5 | 목욕탕 | 청솔불한증막 | 강화군 선원면 중앙로 246
5 | 6 | 목욕탕 | 약수천사우나 | 강화군 길상면 강화동로 23
6 | 7 | 목욕탕 | 삼화목욕탕 | 강화군 강화읍 송악길 6
7 | 8 | 목욕탕 | 강화해수랜드 | 강화군 길상면 해안남로 13-12
8 | 9 | 목욕탕 | 강화리빙스파랜드 | 강화군 강화읍 갑룡길73번길 1
9 | 10 | 목욕탕 | 교동목욕탕 | 강화군 교동면 대룡안길45번길 23

Last rows
Index | 연번 | 구분 | 상호명 | 주소
9483 | 9484 | 미용실 | 루씨르 네일 | 서구 가정동 504번지 6호 -107
9484 | 9485 | 미용실 | 오오샵 검단점 | 서구 왕길동 662번지 6호 장수프라자
9485 | 9486 | 미용실 | (blank) | 서구 청라동 165번지 11호
9486 | 9487 | 미용실 | 블링제이뷰티 | 서구 가정동 606번지 3호 아트프라자
9487 | 9488 | 미용실 | 청라한네일 | 서구 청라동 165번지 12호 지젤엠청라
9488 | 9489 | 미용실 | 라라뷰티 | 서구 검암동 606번지 1호 준암프라자
9489 | 9490 | 미용실 | 리블로셀검단아라점 | 서구 원당동 0번지 우미린 더 시그니처
9490 | 9491 | 미용실 | 네일루 | 서구 불로동 788번지 8호
9491 | 9492 | 미용실 | 모드니네일 | 서구 청라동 165번지 12호 지젤엠청라-38
9492 | 9493 | 미용실 | 오늘네일 | 서구 가정동 546번지 한신그랜드힐빌리지