Overview

Dataset statistics

Number of variables: 4
Number of observations: 718
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 23.3 KiB
Average record size in memory: 33.2 B

Variable types

Numeric: 1
Categorical: 1
Text: 2

Dataset

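Description: Provides the list of excellent public-hygiene businesses designated and managed by Ulsan Metropolitan City and its districts and counties (business type, business name, road-name address).
Author: Ulsan Metropolitan City (울산광역시)
URL: https://www.data.go.kr/data/3043786/fileData.do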

Alerts

연번 is highly overall correlated with 업종 (High correlation)
업종 is highly overall correlated with 연번 (High correlation)
연번 has unique values (Unique)

Reproduction

Analysis started: 2024-03-15 00:39:09.276151
Analysis finished: 2024-03-15 00:39:10.948444
Duration: 1.67 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 718
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 359.5
Minimum: 1
Maximum: 718
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 36.85
Q1: 180.25
Median: 359.5
Q3: 538.75
95-th percentile: 682.15
Maximum: 718
Range: 717
Interquartile range (IQR): 358.5

Descriptive statistics

Standard deviation: 207.41303
Coefficient of variation (CV): 0.57694863
Kurtosis: -1.2
Mean: 359.5
Median Absolute Deviation (MAD): 179.5
Skewness: 0
Sum: 258121
Variance: 43020.167
Monotonicity: Strictly increasing
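
The figures above are consistent with 연번 being a plain running index from 1 to 718. The sketch below assumes exactly that (it does not read the raw file) and recomputes them with Python's standard library. The sample variance (n - 1 denominator) and the inclusive quantile method are assumptions, chosen because they reproduce the reported values.

```python
import statistics

# Assumption: 연번 holds exactly the integers 1..718, one per row.
values = list(range(1, 719))

total = sum(values)                      # reported Sum
mean = statistics.mean(values)           # reported Mean
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
std = variance ** 0.5                    # reported Standard deviation
cv = std / mean                          # Coefficient of variation

# Inclusive quantiles interpolate linearly between the minimum and maximum.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, round(variance, 3), round(std, 5), round(cv, 8))
print(q1, median, q3, iqr, mad)
```

For a running index 1..n, the sum is n(n + 1)/2 and the sample variance is n(n + 1)/12, which is why these statistics carry no information beyond the row count.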
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
1 | 1 | 0.1%
495 | 1 | 0.1%
475 | 1 | 0.1%
476 | 1 | 0.1%
477 | 1 | 0.1%
478 | 1 | 0.1%
479 | 1 | 0.1%
480 | 1 | 0.1%
481 | 1 | 0.1%
482 | 1 | 0.1%
Other values (708) | 708 | 98.6%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%
5 | 1 | 0.1%
6 | 1 | 0.1%
7 | 1 | 0.1%
8 | 1 | 0.1%
9 | 1 | 0.1%
10 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
718 | 1 | 0.1%
717 | 1 | 0.1%
716 | 1 | 0.1%
715 | 1 | 0.1%
714 | 1 | 0.1%
713 | 1 | 0.1%
712 | 1 | 0.1%
711 | 1 | 0.1%
710 | 1 | 0.1%
709 | 1 | 0.1%

업종
Categorical

HIGH CORRELATION 

Distinct: 25
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Memory size: 5.7 KiB

일반미용업: 299
미용업(일반): 97
네일미용업: 93
피부미용업: 71
이용업: 54
Other values (20): 104

Length

Max length: 28
Median length: 5
Mean length: 5.7089136
Min length: 3

Unique

Unique: 3
Unique (%): 0.4%

Sample

1st row: 이용업
2nd row: 이용업
3rd row: 이용업
4th row: 이용업
5th row: 이용업

Common Values

Value | Count | Frequency (%)
일반미용업 | 299 | 41.6%
미용업(일반) | 97 | 13.5%
네일미용업 | 93 | 13.0%
피부미용업 | 71 | 9.9%
이용업 | 54 | 7.5%
네일아트업 | 18 | 2.5%
종합미용업 | 14 | 1.9%
미용업(네일) | 9 | 1.3%
화장.분장미용업 | 9 | 1.3%
네일미용업, 화장ㆍ분장 미용업 | 7 | 1.0%
Other values (15) | 47 | 6.5%
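
The Frequency (%) column is each count divided by the 718 observations. As a quick check, the sketch below recomputes the percentages from the counts listed in the table. The dictionary is typed in by hand from the report, not loaded from the source file.

```python
# Counts copied from the Common Values table for 업종.
counts = {
    "일반미용업": 299,
    "미용업(일반)": 97,
    "네일미용업": 93,
    "피부미용업": 71,
    "이용업": 54,
    "네일아트업": 18,
    "종합미용업": 14,
    "미용업(네일)": 9,
    "화장.분장미용업": 9,
    "네일미용업, 화장ㆍ분장 미용업": 7,
    "Other values (15)": 47,
}

total = sum(counts.values())  # should equal the number of observations

# Share of each label, rounded to one decimal as in the report.
percentages = {label: round(100 * n / total, 1) for label, n in counts.items()}

for label, pct in percentages.items():
    print(f"{label}: {pct}%")
```

Several labels (for example 일반미용업 and 미용업(일반)) look like spelling variants of one business type. If that reading is right, normalizing the labels before analysis would reduce the 25 distinct values, but the report itself does not confirm it.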

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
일반미용업 | 304 | 39.5%
네일미용업 | 109 | 14.2%
미용업(일반 | 102 | 13.2%
피부미용업 | 82 | 10.6%
이용업 | 54 | 7.0%
화장ㆍ분장 | 20 | 2.6%
미용업 | 20 | 2.6%
네일아트업 | 18 | 2.3%
종합미용업 | 14 | 1.8%
미용업(네일 | 14 | 1.8%
Other values (5) | 33 | 4.3%

업소명
Text

Distinct: 708
Distinct (%): 98.6%
Missing: 0
Missing (%): 0.0%
Memory size: 5.7 KiB

Length

Max length: 26
Median length: 22
Mean length: 6.1601671
Min length: 1

Characters and Unicode

Total characters: 4423
Distinct characters: 487
Distinct categories: 11
Distinct scripts: 4
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
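
As an illustration of that idea, the sketch below counts Unicode general categories with Python's standard `unicodedata` module for two of the sample business names shown in this report. It is a minimal example and not necessarily how the profiling tool computes its tables. The `LABELS` mapping from two-letter category codes to the names used in this report is written out by hand.

```python
import unicodedata
from collections import Counter

# Two sample values of 업소명 taken from this report.
names = ["W이용원", "에스에이치(S.H)이용원"]

# General-category codes mapped to the long names used in the report.
LABELS = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Nd": "Decimal Number",
}

def category_counts(texts):
    """Count characters per Unicode general category across all texts."""
    counts = Counter()
    for text in texts:
        # NFC keeps each Hangul syllable as one precomposed code point.
        for ch in unicodedata.normalize("NFC", text):
            counts[unicodedata.category(ch)] += 1
    return counts

counts = category_counts(names)
for code, n in counts.most_common():
    print(LABELS.get(code, code), n)
```

Hangul syllables fall under Other Letter (`Lo`) because the script has no upper or lower case, which is why that category dominates the tables below.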

Unique

Unique: 698
Unique (%): 97.2%
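
Distinct and Unique measure different things: Distinct counts how many different values occur, while Unique is understood here to count values that occur exactly once. The sketch below shows the difference on a small hypothetical list (the names in `sample` are invented for illustration), then applies the same arithmetic to the reported totals.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    freq = Counter(values)
    distinct = len(freq)
    unique = sum(1 for n in freq.values() if n == 1)
    return distinct, unique

# Hypothetical business names, not taken from the dataset.
sample = ["헤어A", "헤어B", "헤어B", "네일C"]
result = distinct_and_unique(sample)
print(result)

# Reported totals for 업소명.
rows, distinct, unique = 718, 708, 698
repeated_values = distinct - unique   # names that occur more than once
rows_in_repeats = rows - unique       # rows carrying those names
print(repeated_values, rows_in_repeats)
```

With 10 repeated names covering 20 rows, each repeated name appears twice on average, so duplicated business names are rare in this dataset.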

Sample

1st row: W이용원
2nd row: 아담남성컷트
3rd row: 남성컷트클럽
4th row: 금메달남성커트전문점
5th row: 에스에이치(S.H)이용원

Value | Count | Frequency (%)
헤어 | 10 | 1.2%
hair | 5 | 0.6%
헤어르뽀렘 | 5 | 0.6%
에스테틱 | 5 | 0.6%
네일 | 5 | 0.6%
nail | 5 | 0.6%
미용실 | 4 | 0.5%
예다연탑헤어 | 3 | 0.3%
beauty | 3 | 0.3%
by | 3 | 0.3%
Other values (794) | 815 | 94.4%

Most occurring characters

ValueCountFrequency (%)
279
 
6.3%
269
 
6.1%
145
 
3.3%
134
 
3.0%
111
 
2.5%
110
 
2.5%
95
 
2.1%
71
 
1.6%
70
 
1.6%
) 61
 
1.4%
Other values (477) 3078
69.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3597 | 81.3%
Lowercase Letter | 320 | 7.2%
Uppercase Letter | 181 | 4.1%
Space Separator | 145 | 3.3%
Close Punctuation | 61 | 1.4%
Open Punctuation | 61 | 1.4%
Other Punctuation | 37 | 0.8%
Decimal Number | 17 | 0.4%
Dash Punctuation | 2 | < 0.1%
Connector Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
279
 
7.8%
269
 
7.5%
134
 
3.7%
111
 
3.1%
110
 
3.1%
95
 
2.6%
71
 
2.0%
70
 
1.9%
59
 
1.6%
45
 
1.3%
Other values (411) 2354
65.4%
Lowercase Letter
ValueCountFrequency (%)
a 41
12.8%
i 33
10.3%
l 28
 
8.8%
o 26
 
8.1%
e 24
 
7.5%
n 22
 
6.9%
y 20
 
6.2%
t 16
 
5.0%
r 16
 
5.0%
b 16
 
5.0%
Other values (13) 78
24.4%
Uppercase Letter
ValueCountFrequency (%)
N 19
 
10.5%
O 19
 
10.5%
H 15
 
8.3%
A 14
 
7.7%
S 12
 
6.6%
I 12
 
6.6%
B 12
 
6.6%
E 10
 
5.5%
J 8
 
4.4%
R 7
 
3.9%
Other values (13) 53
29.3%
Decimal Number
ValueCountFrequency (%)
2 4
23.5%
0 3
17.6%
7 2
11.8%
8 2
11.8%
1 2
11.8%
9 2
11.8%
3 1
 
5.9%
5 1
 
5.9%
Other Punctuation
ValueCountFrequency (%)
# 12
32.4%
. 10
27.0%
& 6
16.2%
' 4
 
10.8%
, 4
 
10.8%
: 1
 
2.7%
Space Separator
ValueCountFrequency (%)
145
100.0%
Close Punctuation
ValueCountFrequency (%)
) 61
100.0%
Open Punctuation
ValueCountFrequency (%)
( 61
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%
Math Symbol
ValueCountFrequency (%)
= 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3595 | 81.3%
Latin | 501 | 11.3%
Common | 325 | 7.3%
Han | 2 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
279
 
7.8%
269
 
7.5%
134
 
3.7%
111
 
3.1%
110
 
3.1%
95
 
2.6%
71
 
2.0%
70
 
1.9%
59
 
1.6%
45
 
1.3%
Other values (409) 2352
65.4%
Latin
ValueCountFrequency (%)
a 41
 
8.2%
i 33
 
6.6%
l 28
 
5.6%
o 26
 
5.2%
e 24
 
4.8%
n 22
 
4.4%
y 20
 
4.0%
N 19
 
3.8%
O 19
 
3.8%
t 16
 
3.2%
Other values (36) 253
50.5%
Common
ValueCountFrequency (%)
145
44.6%
) 61
18.8%
( 61
18.8%
# 12
 
3.7%
. 10
 
3.1%
& 6
 
1.8%
' 4
 
1.2%
2 4
 
1.2%
, 4
 
1.2%
0 3
 
0.9%
Other values (10) 15
 
4.6%
Han
ValueCountFrequency (%)
1
50.0%
1
50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3595 | 81.3%
ASCII | 826 | 18.7%
CJK | 2 | < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
279
 
7.8%
269
 
7.5%
134
 
3.7%
111
 
3.1%
110
 
3.1%
95
 
2.6%
71
 
2.0%
70
 
1.9%
59
 
1.6%
45
 
1.3%
Other values (409) 2352
65.4%
ASCII
ValueCountFrequency (%)
145
 
17.6%
) 61
 
7.4%
( 61
 
7.4%
a 41
 
5.0%
i 33
 
4.0%
l 28
 
3.4%
o 26
 
3.1%
e 24
 
2.9%
n 22
 
2.7%
y 20
 
2.4%
Other values (56) 365
44.2%
CJK
ValueCountFrequency (%)
1
50.0%
1
50.0%

소재지(도로명)
Text

Distinct: 709
Distinct (%): 98.7%
Missing: 0
Missing (%): 0.0%
Memory size: 5.7 KiB

Length

Max length: 60
Median length: 55
Mean length: 30.090529
Min length: 20

Characters and Unicode

Total characters: 21605
Distinct characters: 315
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 700
Unique (%): 97.5%

Sample

1st row: 울산광역시 중구 산전길 16, 4층 (남외동)
2nd row: 울산광역시 중구 구교12길 15, 1층 (반구동)
3rd row: 울산광역시 중구 서원2길 51, 1층 (반구동)
4th row: 울산광역시 중구 반구정15길 8, 105동 106호 (반구동, 반구 신동아파밀리에)
5th row: 울산광역시 중구 번영로 417, 4층 (복산동)

Value | Count | Frequency (%)
울산광역시 | 718 | 15.7%
1층 | 273 | 6.0%
중구 | 264 | 5.8%
동구 | 135 | 2.9%
남구 | 130 | 2.8%
울주군 | 117 | 2.6%
2층 | 92 | 2.0%
북구 | 73 | 1.6%
삼산동 | 69 | 1.5%
범서읍 | 52 | 1.1%
Other values (1059) | 2657 | 58.0%

Most occurring characters

ValueCountFrequency (%)
3997
 
18.5%
1 1148
 
5.3%
943
 
4.4%
892
 
4.1%
848
 
3.9%
731
 
3.4%
722
 
3.3%
721
 
3.3%
686
 
3.2%
, 675
 
3.1%
Other values (305) 10242
47.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 11944 | 55.3%
Space Separator | 3997 | 18.5%
Decimal Number | 3529 | 16.3%
Other Punctuation | 679 | 3.1%
Open Punctuation | 654 | 3.0%
Close Punctuation | 654 | 3.0%
Dash Punctuation | 117 | 0.5%
Uppercase Letter | 21 | 0.1%
Lowercase Letter | 5 | < 0.1%
Math Symbol | 4 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
943
 
7.9%
892
 
7.5%
848
 
7.1%
731
 
6.1%
722
 
6.0%
721
 
6.0%
686
 
5.7%
520
 
4.4%
478
 
4.0%
328
 
2.7%
Other values (267) 5075
42.5%
Decimal Number
ValueCountFrequency (%)
1 1148
32.5%
2 554
15.7%
0 331
 
9.4%
3 298
 
8.4%
4 275
 
7.8%
5 251
 
7.1%
6 200
 
5.7%
8 181
 
5.1%
7 165
 
4.7%
9 126
 
3.6%
Uppercase Letter
ValueCountFrequency (%)
A 6
28.6%
B 4
19.0%
C 3
14.3%
M 2
 
9.5%
K 2
 
9.5%
H 1
 
4.8%
D 1
 
4.8%
J 1
 
4.8%
P 1
 
4.8%
Other Punctuation
ValueCountFrequency (%)
, 675
99.4%
* 1
 
0.1%
@ 1
 
0.1%
& 1
 
0.1%
' 1
 
0.1%
Lowercase Letter
ValueCountFrequency (%)
e 1
20.0%
k 1
20.0%
s 1
20.0%
m 1
20.0%
i 1
20.0%
Close Punctuation
ValueCountFrequency (%)
) 652
99.7%
] 1
 
0.2%
} 1
 
0.2%
Open Punctuation
ValueCountFrequency (%)
( 653
99.8%
[ 1
 
0.2%
Space Separator
ValueCountFrequency (%)
3997
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 117
100.0%
Math Symbol
ValueCountFrequency (%)
~ 4
100.0%
Letter Number
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 11944 | 55.3%
Common | 9634 | 44.6%
Latin | 27 | 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
943
 
7.9%
892
 
7.5%
848
 
7.1%
731
 
6.1%
722
 
6.0%
721
 
6.0%
686
 
5.7%
520
 
4.4%
478
 
4.0%
328
 
2.7%
Other values (267) 5075
42.5%
Common
ValueCountFrequency (%)
3997
41.5%
1 1148
 
11.9%
, 675
 
7.0%
( 653
 
6.8%
) 652
 
6.8%
2 554
 
5.8%
0 331
 
3.4%
3 298
 
3.1%
4 275
 
2.9%
5 251
 
2.6%
Other values (13) 800
 
8.3%
Latin
ValueCountFrequency (%)
A 6
22.2%
B 4
14.8%
C 3
11.1%
M 2
 
7.4%
K 2
 
7.4%
H 1
 
3.7%
D 1
 
3.7%
1
 
3.7%
e 1
 
3.7%
k 1
 
3.7%
Other values (5) 5
18.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 11944 | 55.3%
ASCII | 9660 | 44.7%
Number Forms | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3997
41.4%
1 1148
 
11.9%
, 675
 
7.0%
( 653
 
6.8%
) 652
 
6.7%
2 554
 
5.7%
0 331
 
3.4%
3 298
 
3.1%
4 275
 
2.8%
5 251
 
2.6%
Other values (27) 826
 
8.6%
Hangul
ValueCountFrequency (%)
943
 
7.9%
892
 
7.5%
848
 
7.1%
731
 
6.1%
722
 
6.0%
721
 
6.0%
686
 
5.7%
520
 
4.4%
478
 
4.0%
328
 
2.7%
Other values (267) 5075
42.5%
Number Forms
ValueCountFrequency (%)
1
100.0%

Interactions


Correlations

 | 연번 | 업종
연번 | 1.000 | 0.854
업종 | 0.854 | 1.000

 | 연번 | 업종
연번 | 1.000 | 0.503
업종 | 0.503 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

 | 연번 | 업종 | 업소명 | 소재지(도로명)
0 | 1 | 이용업 | W이용원 | 울산광역시 중구 산전길 16, 4층 (남외동)
1 | 2 | 이용업 | 아담남성컷트 | 울산광역시 중구 구교12길 15, 1층 (반구동)
2 | 3 | 이용업 | 남성컷트클럽 | 울산광역시 중구 서원2길 51, 1층 (반구동)
3 | 4 | 이용업 | 금메달남성커트전문점 | 울산광역시 중구 반구정15길 8, 105동 106호 (반구동, 반구 신동아파밀리에)
4 | 5 | 이용업 | 에스에이치(S.H)이용원 | 울산광역시 중구 번영로 417, 4층 (복산동)
5 | 6 | 이용업 | 호수이용원 | 울산광역시 중구 북부순환도로 862, 케이제이 스포츠센터 3층 (서동)
6 | 7 | 이용업 | 퀸즈헤나(반구점) | 울산광역시 중구 서원1길 76, 1층 (반구동)
7 | 8 | 이용업 | 컷트박스 | 울산광역시 중구 동천2길 10 (서동)
8 | 9 | 이용업 | 탑사우나내이용원 | 울산광역시 중구 북부순환도로 180, 제2동 3층 (태화동)
9 | 10 | 이용업 | 한일이용원 | 울산광역시 중구 학성공원13길 4 (학성동)

Last rows

 | 연번 | 업종 | 업소명 | 소재지(도로명)
708 | 709 | 네일미용업 | 네일니드유 | 울산광역시 울주군 언양읍 내곡능골길 13-15, 1층
709 | 710 | 네일미용업 | 쏨's 뷰티 네일 | 울산광역시 울주군 범서읍 점촌4길 25, 일신파크맨션 상가동 214호
710 | 711 | 네일미용업 | 지야네일 | 울산광역시 울주군 온산읍 영남6길 9, 1층 2호
711 | 712 | 네일미용업 | 앙쥬네일 | 울산광역시 울주군 범서읍 대동길 36-1, 천상동아아파트 104동 1층 106호
712 | 713 | 네일미용업 | 예담뷰티 | 울산광역시 울주군 삼남읍 울산역로 274, 301동 122호 (울산역 신도시 동문굿모닝힐)
713 | 714 | 네일미용업 | 오아이네일 | 울산광역시 울주군 범서읍 굴화1길 7-39, 1층
714 | 715 | 네일미용업 | Mio beauty# | 울산광역시 울주군 청량읍 상남길 71-27, 1층
715 | 716 | 네일미용업 | 쏭네일 | 울산광역시 울주군 언양읍 동부6길 7, 1층
716 | 717 | 네일미용업 | 레푸스울산범서점 | 울산광역시 울주군 범서읍 울밀로 2816, 2층
717 | 718 | 네일미용업 | 일상네일 | 울산광역시 울주군 범서읍 장검1길 58, 1층