Overview

Dataset statistics

Number of variables: 5
Number of observations: 810
Missing cells: 336
Missing cells (%): 8.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 32.6 KiB
Average record size in memory: 41.2 B
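
The percentage and the per-record size follow from the other figures in this block. The short check below is only a sketch that reuses the numbers printed above; the variable names are illustrative.

```python
# Figures copied from the Dataset statistics block.
n_variables = 5
n_observations = 810
missing_cells = 336
total_size_kib = 32.6

# Missing cells are expressed as a share of all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size is the total memory divided by the number of rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```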

Variable types

Numeric: 1
Categorical: 1
Text: 3

Dataset

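Description: Data on barber shops (이용업) and beauty businesses (미용업) within Seo-gu, Daegu Metropolitan City. It provides fields such as business type (업종명), business name (업소명), location (소재지), and business phone number (업소전화번호).
Author: Seo-gu, Daegu Metropolitan City (대구광역시 서구)
URL: https://www.data.go.kr/data/15054531/fileData.do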

Alerts

연번 is highly overall correlated with 업종명 (High correlation)
업종명 is highly overall correlated with 연번 (High correlation)
업종명 is highly imbalanced (56.9%) (Imbalance)
소재지전화 has 336 (41.5%) missing values (Missing)
연번 has unique values (Unique)

Reproduction

Analysis started: 2024-03-14 13:04:52.292129
Analysis finished: 2024-03-14 13:04:53.676153
Duration: 1.38 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
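
A report like this one can be regenerated along the following lines. This is a minimal sketch, not the exact invocation used here: the CSV path, the output file name, the report title, and the `cp949` encoding are all assumptions, and the original run may have used a custom configuration.

```python
def build_report(csv_path, html_path="report.html", encoding="cp949"):
    """Profile a CSV export of the dataset and write an HTML report.

    csv_path, html_path, and encoding are hypothetical inputs; adjust them
    to match the file actually downloaded from the dataset URL.
    """
    # Imported lazily so that defining this helper does not require
    # pandas or ydata-profiling to be installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Overview")  # title is illustrative
    report.to_file(html_path)
    return report
```

Calling `build_report("barber_beauty_shops.csv")` (a hypothetical file name) would write `report.html` next to the script.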

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 810
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 405.5
Minimum: 1
Maximum: 810
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.2 KiB

Quantile statistics

Minimum: 1
5th percentile: 41.45
Q1: 203.25
Median: 405.5
Q3: 607.75
95th percentile: 769.55
Maximum: 810
Range: 809
Interquartile range (IQR): 404.5

Descriptive statistics

Standard deviation: 233.97115
Coefficient of variation (CV): 0.57699421
Kurtosis: -1.2
Mean: 405.5
Median Absolute Deviation (MAD): 202.5
Skewness: 0
Sum: 328455
Variance: 54742.5
Monotonicity: Strictly increasing
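
Because 연번 is a strictly increasing sequence with 810 unique values between 1 and 810, it is simply the integers 1 to 810, and the statistics above can be recomputed directly. The sketch below assumes linearly interpolated percentiles and the sample (n - 1) variance, which is what the reported figures are consistent with; the `percentile` helper is illustrative.

```python
import math
import statistics

values = list(range(1, 811))  # 연번: 1, 2, ..., 810

def percentile(sorted_values, q):
    """Linearly interpolated percentile for q in [0, 100]."""
    pos = (len(sorted_values) - 1) * q / 100
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = math.sqrt(variance)
median = statistics.median(values)
# Median absolute deviation: median distance from the median.
mad = statistics.median(abs(v - median) for v in values)
cv = std / mean

print("Sum:", sum(values))
print("Mean:", mean, "Variance:", variance)
print("Std:", round(std, 5), "CV:", round(cv, 8))
print("MAD:", mad)
print("Q1:", round(percentile(values, 25), 2),
      "Q3:", round(percentile(values, 75), 2))
```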
Histogram with fixed size bins (bins=50)
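| Value | Count | Frequency (%) |
| 1 | 1 | 0.1% |
| 545 | 1 | 0.1% |
| 535 | 1 | 0.1% |
| 536 | 1 | 0.1% |
| 537 | 1 | 0.1% |
| 538 | 1 | 0.1% |
| 539 | 1 | 0.1% |
| 540 | 1 | 0.1% |
| 541 | 1 | 0.1% |
| 542 | 1 | 0.1% |
| Other values (800) | 800 | 98.8% |

| Value | Count | Frequency (%) |
| 1 | 1 | 0.1% |
| 2 | 1 | 0.1% |
| 3 | 1 | 0.1% |
| 4 | 1 | 0.1% |
| 5 | 1 | 0.1% |
| 6 | 1 | 0.1% |
| 7 | 1 | 0.1% |
| 8 | 1 | 0.1% |
| 9 | 1 | 0.1% |
| 10 | 1 | 0.1% |

| Value | Count | Frequency (%) |
| 810 | 1 | 0.1% |
| 809 | 1 | 0.1% |
| 808 | 1 | 0.1% |
| 807 | 1 | 0.1% |
| 806 | 1 | 0.1% |
| 805 | 1 | 0.1% |
| 804 | 1 | 0.1% |
| 803 | 1 | 0.1% |
| 802 | 1 | 0.1% |
| 801 | 1 | 0.1% |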

업종명
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 14
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 6.5 KiB

일반미용업: 570
이용업: 96
네일미용업: 49
피부미용업: 43
화장ㆍ분장 미용업: 12
Other values (9): 40

Length

Max length: 23
Median length: 5
Mean length: 5.1950617
Min length: 3

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 이용업
2nd row: 이용업
3rd row: 이용업
4th row: 이용업
5th row: 이용업

Common Values

| Value | Count | Frequency (%) |
| 일반미용업 | 570 | 70.4% |
| 이용업 | 96 | 11.9% |
| 네일미용업 | 49 | 6.0% |
| 피부미용업 | 43 | 5.3% |
| 화장ㆍ분장 미용업 | 12 | 1.5% |
| 종합미용업 | 8 | 1.0% |
| 네일미용업, 화장ㆍ분장 미용업 | 8 | 1.0% |
| 피부미용업, 네일미용업 | 5 | 0.6% |
| 일반미용업, 피부미용업 | 4 | 0.5% |
| 일반미용업, 네일미용업 | 4 | 0.5% |
| Other values (4) | 11 | 1.4% |
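
The Imbalance alert for 업종명 (56.9%) can be approximated from these counts. The sketch below assumes the score is one minus the normalized Shannon entropy of the category counts; that definition is an assumption about the tool, not something this report states. The report also only gives a combined count of 11 for the last four categories, so the 3/3/3/2 split is invented purely for illustration and the result is approximate.

```python
import math

# Ten counts from the Common Values table, plus an ASSUMED 3/3/3/2 split
# of the 11 observations in "Other values (4)".
counts = [570, 96, 49, 43, 12, 8, 8, 5, 4, 4, 3, 3, 3, 2]

def imbalance_score(counts):
    """Return 1 - H / H_max, with H the Shannon entropy of the counts."""
    if len(counts) <= 1:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    max_entropy = math.log2(len(counts))  # entropy of a uniform distribution
    return 1 - entropy / max_entropy

score = imbalance_score(counts)
print(f"Imbalance score: {score:.1%}")
```

A perfectly balanced column scores 0 under this definition, and a column dominated by one category, as here with 일반미용업 at 70.4%, moves toward 1.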

Length

Histogram of lengths of the category
| Value | Count | Frequency (%) |
| 일반미용업 | 579 | 66.3% |
| 이용업 | 96 | 11.0% |
| 네일미용업 | 70 | 8.0% |
| 피부미용업 | 60 | 6.9% |
| 미용업 | 31 | 3.6% |
| 화장ㆍ분장 | 29 | 3.3% |
| 종합미용업 | 8 | 0.9% |

업소명
Text

Distinct: 765
Distinct (%): 94.4%
Missing: 0
Missing (%): 0.0%
Memory size: 6.5 KiB

Length

Max length: 22
Median length: 20
Mean length: 5.6333333
Min length: 1

Characters and Unicode

Total characters: 4563
Distinct characters: 457
Distinct categories: 9
Distinct scripts: 4
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 725
Unique (%): 89.5%

Sample

1st row: 대성이용소
2nd row: 골드
3rd row: 성주이용소
4th row: 달성이용원
5th row: 제일이용소
| Value | Count | Frequency (%) |
| hair | 6 | 0.7% |
| nail | 5 | 0.6% |
| 헤어 | 5 | 0.6% |
| 미용실 | 4 | 0.5% |
|  | 4 | 0.5% |
| 헤어샵 | 4 | 0.5% |
| 진미용실 | 4 | 0.5% |
| 헤어스케치 | 3 | 0.3% |
| 영미용실 | 3 | 0.3% |
| beauty | 3 | 0.3% |
| Other values (797) | 845 | 95.4% |

Most occurring characters

| Value | Count | Frequency (%) |
|  | 303 | 6.6% |
|  | 300 | 6.6% |
|  | 274 | 6.0% |
|  | 250 | 5.5% |
|  | 205 | 4.5% |
|  | 132 | 2.9% |
|  | 86 | 1.9% |
|  | 83 | 1.8% |
|  | 80 | 1.8% |
| (space) | 76 | 1.7% |
| Other values (447) | 2774 | 60.8% |

Most occurring categories

| Value | Count | Frequency (%) |
| Other Letter | 3929 | 86.1% |
| Uppercase Letter | 198 | 4.3% |
| Lowercase Letter | 192 | 4.2% |
| Space Separator | 76 | 1.7% |
| Open Punctuation | 55 | 1.2% |
| Close Punctuation | 55 | 1.2% |
| Other Punctuation | 35 | 0.8% |
| Decimal Number | 21 | 0.5% |
| Dash Punctuation | 2 | < 0.1% |

Most frequent character per category

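Other Letter
| Value | Count | Frequency (%) |
|  | 303 | 7.7% |
|  | 300 | 7.6% |
|  | 274 | 7.0% |
|  | 250 | 6.4% |
|  | 205 | 5.2% |
|  | 132 | 3.4% |
|  | 86 | 2.2% |
|  | 83 | 2.1% |
|  | 80 | 2.0% |
|  | 68 | 1.7% |
| Other values (384) | 2148 | 54.7% |

Uppercase Letter
| Value | Count | Frequency (%) |
| I | 18 | 9.1% |
| A | 18 | 9.1% |
| N | 18 | 9.1% |
| O | 17 | 8.6% |
| H | 14 | 7.1% |
| L | 11 | 5.6% |
| S | 11 | 5.6% |
| B | 10 | 5.1% |
| R | 9 | 4.5% |
| D | 9 | 4.5% |
| Other values (13) | 63 | 31.8% |

Lowercase Letter
| Value | Count | Frequency (%) |
| a | 26 | 13.5% |
| i | 25 | 13.0% |
| e | 22 | 11.5% |
| n | 15 | 7.8% |
| l | 14 | 7.3% |
| o | 14 | 7.3% |
| s | 14 | 7.3% |
| r | 8 | 4.2% |
| u | 8 | 4.2% |
| d | 7 | 3.6% |
| Other values (12) | 39 | 20.3% |

Decimal Number
| Value | Count | Frequency (%) |
| 7 | 4 | 19.0% |
| 5 | 4 | 19.0% |
| 8 | 4 | 19.0% |
| 0 | 3 | 14.3% |
| 1 | 3 | 14.3% |
| 2 | 1 | 4.8% |
| 3 | 1 | 4.8% |
| 9 | 1 | 4.8% |

Other Punctuation
| Value | Count | Frequency (%) |
| & | 14 | 40.0% |
| # | 6 | 17.1% |
| , | 4 | 11.4% |
| ' | 4 | 11.4% |
| . | 4 | 11.4% |
| : | 3 | 8.6% |

Space Separator
| Value | Count | Frequency (%) |
| (space) | 76 | 100.0% |

Open Punctuation
| Value | Count | Frequency (%) |
| ( | 55 | 100.0% |

Close Punctuation
| Value | Count | Frequency (%) |
| ) | 55 | 100.0% |

Dash Punctuation
| Value | Count | Frequency (%) |
| - | 2 | 100.0% |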

Most occurring scripts

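| Value | Count | Frequency (%) |
| Hangul | 3927 | 86.1% |
| Latin | 390 | 8.5% |
| Common | 244 | 5.3% |
| Han | 2 | < 0.1% |

Most frequent character per script

Hangul
| Value | Count | Frequency (%) |
|  | 303 | 7.7% |
|  | 300 | 7.6% |
|  | 274 | 7.0% |
|  | 250 | 6.4% |
|  | 205 | 5.2% |
|  | 132 | 3.4% |
|  | 86 | 2.2% |
|  | 83 | 2.1% |
|  | 80 | 2.0% |
|  | 68 | 1.7% |
| Other values (382) | 2146 | 54.6% |

Latin
| Value | Count | Frequency (%) |
| a | 26 | 6.7% |
| i | 25 | 6.4% |
| e | 22 | 5.6% |
| I | 18 | 4.6% |
| A | 18 | 4.6% |
| N | 18 | 4.6% |
| O | 17 | 4.4% |
| n | 15 | 3.8% |
| l | 14 | 3.6% |
| H | 14 | 3.6% |
| Other values (35) | 203 | 52.1% |

Common
| Value | Count | Frequency (%) |
| (space) | 76 | 31.1% |
| ( | 55 | 22.5% |
| ) | 55 | 22.5% |
| & | 14 | 5.7% |
| # | 6 | 2.5% |
| 7 | 4 | 1.6% |
| 5 | 4 | 1.6% |
| 8 | 4 | 1.6% |
| , | 4 | 1.6% |
| ' | 4 | 1.6% |
| Other values (8) | 18 | 7.4% |

Han
| Value | Count | Frequency (%) |
|  | 1 | 50.0% |
|  | 1 | 50.0% |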

Most occurring blocks

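| Value | Count | Frequency (%) |
| Hangul | 3927 | 86.1% |
| ASCII | 634 | 13.9% |
| CJK | 2 | < 0.1% |

Most frequent character per block

Hangul
| Value | Count | Frequency (%) |
|  | 303 | 7.7% |
|  | 300 | 7.6% |
|  | 274 | 7.0% |
|  | 250 | 6.4% |
|  | 205 | 5.2% |
|  | 132 | 3.4% |
|  | 86 | 2.2% |
|  | 83 | 2.1% |
|  | 80 | 2.0% |
|  | 68 | 1.7% |
| Other values (382) | 2146 | 54.6% |

ASCII
| Value | Count | Frequency (%) |
| (space) | 76 | 12.0% |
| ( | 55 | 8.7% |
| ) | 55 | 8.7% |
| a | 26 | 4.1% |
| i | 25 | 3.9% |
| e | 22 | 3.5% |
| I | 18 | 2.8% |
| A | 18 | 2.8% |
| N | 18 | 2.8% |
| O | 17 | 2.7% |
| Other values (53) | 304 | 47.9% |

CJK
| Value | Count | Frequency (%) |
|  | 1 | 50.0% |
|  | 1 | 50.0% |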

영업소 주소(도로명)
Text

Distinct: 795
Distinct (%): 98.1%
Missing: 0
Missing (%): 0.0%
Memory size: 6.5 KiB

Length

Max length: 60
Median length: 52
Mean length: 27.611111
Min length: 21

Characters and Unicode

Total characters: 22365
Distinct characters: 131
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 782
Unique (%): 96.5%

Sample

1st row: 대구광역시 서구 당산로53길 3, 1층 (내당동)
2nd row: 대구광역시 서구 문화로66길 17, 1층 (비산동)
3rd row: 대구광역시 서구 국채보상로57길 39, 1층 (평리동)
4th row: 대구광역시 서구 고성로 116-1 (원대동1가)
5th row: 대구광역시 서구 달서로 63 (비산동)
| Value | Count | Frequency (%) |
| 대구광역시 | 810 | 17.8% |
| 서구 | 810 | 17.8% |
| 1층 | 260 | 5.7% |
| 평리동 | 258 | 5.7% |
| 비산동 | 243 | 5.3% |
| 내당동 | 205 | 4.5% |
| 중리동 | 54 | 1.2% |
| 평리로 | 31 | 0.7% |
| 국채보상로 | 29 | 0.6% |
| 2층 | 22 | 0.5% |
| Other values (639) | 1827 | 40.2% |

Most occurring characters

| Value | Count | Frequency (%) |
| (space) | 3740 | 16.7% |
|  | 1812 | 8.1% |
|  | 1047 | 4.7% |
|  | 1036 | 4.6% |
| 1 | 966 | 4.3% |
|  | 860 | 3.8% |
|  | 818 | 3.7% |
|  | 814 | 3.6% |
|  | 813 | 3.6% |
| ( | 813 | 3.6% |
| Other values (121) | 9646 | 43.1% |

Most occurring categories

| Value | Count | Frequency (%) |
| Other Letter | 12632 | 56.5% |
| Decimal Number | 3762 | 16.8% |
| Space Separator | 3740 | 16.7% |
| Open Punctuation | 813 | 3.6% |
| Close Punctuation | 813 | 3.6% |
| Other Punctuation | 395 | 1.8% |
| Dash Punctuation | 187 | 0.8% |
| Uppercase Letter | 19 | 0.1% |
| Lowercase Letter | 4 | < 0.1% |

Most frequent character per category

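Other Letter
| Value | Count | Frequency (%) |
|  | 1812 | 14.3% |
|  | 1047 | 8.3% |
|  | 1036 | 8.2% |
|  | 860 | 6.8% |
|  | 818 | 6.5% |
|  | 814 | 6.4% |
|  | 813 | 6.4% |
|  | 807 | 6.4% |
|  | 606 | 4.8% |
|  | 406 | 3.2% |
| Other values (96) | 3613 | 28.6% |

Decimal Number
| Value | Count | Frequency (%) |
| 1 | 966 | 25.7% |
| 3 | 480 | 12.8% |
| 2 | 477 | 12.7% |
| 6 | 331 | 8.8% |
| 4 | 320 | 8.5% |
| 5 | 300 | 8.0% |
| 7 | 283 | 7.5% |
| 0 | 245 | 6.5% |
| 8 | 202 | 5.4% |
| 9 | 158 | 4.2% |

Uppercase Letter
| Value | Count | Frequency (%) |
| A | 7 | 36.8% |
| B | 4 | 21.1% |
| C | 2 | 10.5% |
| D | 2 | 10.5% |
| K | 1 | 5.3% |
| T | 1 | 5.3% |
| E | 1 | 5.3% |
| X | 1 | 5.3% |

Other Punctuation
| Value | Count | Frequency (%) |
| , | 394 | 99.7% |
| / | 1 | 0.3% |

Space Separator
| Value | Count | Frequency (%) |
| (space) | 3740 | 100.0% |

Open Punctuation
| Value | Count | Frequency (%) |
| ( | 813 | 100.0% |

Close Punctuation
| Value | Count | Frequency (%) |
| ) | 813 | 100.0% |

Dash Punctuation
| Value | Count | Frequency (%) |
| - | 187 | 100.0% |

Lowercase Letter
| Value | Count | Frequency (%) |
| e | 4 | 100.0% |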

Most occurring scripts

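| Value | Count | Frequency (%) |
| Hangul | 12632 | 56.5% |
| Common | 9710 | 43.4% |
| Latin | 23 | 0.1% |

Most frequent character per script

Hangul
| Value | Count | Frequency (%) |
|  | 1812 | 14.3% |
|  | 1047 | 8.3% |
|  | 1036 | 8.2% |
|  | 860 | 6.8% |
|  | 818 | 6.5% |
|  | 814 | 6.4% |
|  | 813 | 6.4% |
|  | 807 | 6.4% |
|  | 606 | 4.8% |
|  | 406 | 3.2% |
| Other values (96) | 3613 | 28.6% |

Common
| Value | Count | Frequency (%) |
| (space) | 3740 | 38.5% |
| 1 | 966 | 9.9% |
| ( | 813 | 8.4% |
| ) | 813 | 8.4% |
| 3 | 480 | 4.9% |
| 2 | 477 | 4.9% |
| , | 394 | 4.1% |
| 6 | 331 | 3.4% |
| 4 | 320 | 3.3% |
| 5 | 300 | 3.1% |
| Other values (6) | 1076 | 11.1% |

Latin
| Value | Count | Frequency (%) |
| A | 7 | 30.4% |
| B | 4 | 17.4% |
| e | 4 | 17.4% |
| C | 2 | 8.7% |
| D | 2 | 8.7% |
| K | 1 | 4.3% |
| T | 1 | 4.3% |
| E | 1 | 4.3% |
| X | 1 | 4.3% |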

Most occurring blocks

| Value | Count | Frequency (%) |
| Hangul | 12632 | 56.5% |
| ASCII | 9733 | 43.5% |

Most frequent character per block

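ASCII
| Value | Count | Frequency (%) |
| (space) | 3740 | 38.4% |
| 1 | 966 | 9.9% |
| ( | 813 | 8.4% |
| ) | 813 | 8.4% |
| 3 | 480 | 4.9% |
| 2 | 477 | 4.9% |
| , | 394 | 4.0% |
| 6 | 331 | 3.4% |
| 4 | 320 | 3.3% |
| 5 | 300 | 3.1% |
| Other values (15) | 1099 | 11.3% |

Hangul
| Value | Count | Frequency (%) |
|  | 1812 | 14.3% |
|  | 1047 | 8.3% |
|  | 1036 | 8.2% |
|  | 860 | 6.8% |
|  | 818 | 6.5% |
|  | 814 | 6.4% |
|  | 813 | 6.4% |
|  | 807 | 6.4% |
|  | 606 | 4.8% |
|  | 406 | 3.2% |
| Other values (96) | 3613 | 28.6% |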

소재지전화
Text

MISSING 

Distinct: 472
Distinct (%): 99.6%
Missing: 336
Missing (%): 41.5%
Memory size: 6.5 KiB

Length

Max length: 13
Median length: 12
Mean length: 12.012658
Min length: 9

Characters and Unicode

Total characters: 5694
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 470
Unique (%): 99.2%

Sample

1st row: 053-561-7989
2nd row: 053-566-9025
3rd row: 053-358-5297
4th row: 053-552-5379
5th row: 053-555-2965
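
Since 소재지전화 contains only digits and dashes, its values can be split into parts with a simple pattern. The sketch below infers the pattern from the five sample rows alone, so it is an assumption rather than a documented format; the reported lengths of 9 to 13 characters suggest that some values will not match it. `PHONE_RE` and `split_phone` are illustrative names.

```python
import re

# Pattern inferred from the sample rows only (an assumption): area code,
# exchange, and line number separated by dashes.
PHONE_RE = re.compile(r"^(\d{2,3})-(\d{3,4})-(\d{4})$")

def split_phone(value):
    """Return (area, exchange, line), or None if missing or unmatched."""
    if value is None:
        return None
    match = PHONE_RE.match(value.strip())
    return match.groups() if match else None

# The five sample rows plus one missing value (41.5% of rows are missing).
samples = ["053-561-7989", "053-566-9025", "053-358-5297",
           "053-552-5379", "053-555-2965", None]

for raw in samples:
    print(raw, "->", split_phone(raw))
```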
| Value | Count | Frequency (%) |
| 053-553-3367 | 2 | 0.4% |
| 053-563-7909 | 2 | 0.4% |
| 053-562-9758 | 1 | 0.2% |
| 053-556-8053 | 1 | 0.2% |
| 053-553-8760 | 1 | 0.2% |
| 053-523-4748 | 1 | 0.2% |
| 053-322-4688 | 1 | 0.2% |
| 053-527-6377 | 1 | 0.2% |
| 053-555-4991 | 1 | 0.2% |
| 053-524-0895 | 1 | 0.2% |
| Other values (462) | 462 | 97.5% |

Most occurring characters

| Value | Count | Frequency (%) |
| 5 | 1301 | 22.8% |
| - | 948 | 16.6% |
| 3 | 808 | 14.2% |
| 0 | 682 | 12.0% |
| 6 | 381 | 6.7% |
| 2 | 361 | 6.3% |
| 7 | 319 | 5.6% |
| 1 | 254 | 4.5% |
| 8 | 231 | 4.1% |
| 4 | 228 | 4.0% |

Most occurring categories

| Value | Count | Frequency (%) |
| Decimal Number | 4746 | 83.4% |
| Dash Punctuation | 948 | 16.6% |

Most frequent character per category

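Decimal Number
| Value | Count | Frequency (%) |
| 5 | 1301 | 27.4% |
| 3 | 808 | 17.0% |
| 0 | 682 | 14.4% |
| 6 | 381 | 8.0% |
| 2 | 361 | 7.6% |
| 7 | 319 | 6.7% |
| 1 | 254 | 5.4% |
| 8 | 231 | 4.9% |
| 4 | 228 | 4.8% |
| 9 | 181 | 3.8% |

Dash Punctuation
| Value | Count | Frequency (%) |
| - | 948 | 100.0% |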

Most occurring scripts

| Value | Count | Frequency (%) |
| Common | 5694 | 100.0% |

Most frequent character per script

Common
| Value | Count | Frequency (%) |
| 5 | 1301 | 22.8% |
| - | 948 | 16.6% |
| 3 | 808 | 14.2% |
| 0 | 682 | 12.0% |
| 6 | 381 | 6.7% |
| 2 | 361 | 6.3% |
| 7 | 319 | 5.6% |
| 1 | 254 | 4.5% |
| 8 | 231 | 4.1% |
| 4 | 228 | 4.0% |

Most occurring blocks

| Value | Count | Frequency (%) |
| ASCII | 5694 | 100.0% |

Most frequent character per block

ASCII
| Value | Count | Frequency (%) |
| 5 | 1301 | 22.8% |
| - | 948 | 16.6% |
| 3 | 808 | 14.2% |
| 0 | 682 | 12.0% |
| 6 | 381 | 6.7% |
| 2 | 361 | 6.3% |
| 7 | 319 | 5.6% |
| 1 | 254 | 4.5% |
| 8 | 231 | 4.1% |
| 4 | 228 | 4.0% |

Interactions


Correlations

|  | 연번 | 업종명 |
| 연번 | 1.000 | 0.815 |
| 업종명 | 0.815 | 1.000 |

|  | 연번 | 업종명 |
| 연번 | 1.000 | 0.504 |
| 업종명 | 0.504 | 1.000 |

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

| (index) | 연번 | 업종명 | 업소명 | 영업소 주소(도로명) | 소재지전화 |
| 0 | 1 | 이용업 | 대성이용소 | 대구광역시 서구 당산로53길 3, 1층 (내당동) | 053-561-7989 |
| 1 | 2 | 이용업 | 골드 | 대구광역시 서구 문화로66길 17, 1층 (비산동) | <NA> |
| 2 | 3 | 이용업 | 성주이용소 | 대구광역시 서구 국채보상로57길 39, 1층 (평리동) | 053-566-9025 |
| 3 | 4 | 이용업 | 달성이용원 | 대구광역시 서구 고성로 116-1 (원대동1가) | 053-358-5297 |
| 4 | 5 | 이용업 | 제일이용소 | 대구광역시 서구 달서로 63 (비산동) | 053-552-5379 |
| 5 | 6 | 이용업 | 뉴타운이용소 | 대구광역시 서구 서대구로11길 49 (내당동) | 053-555-2965 |
| 6 | 7 | 이용업 | 구내복지이용소 | 대구광역시 서구 국채보상로 257 (평리동) | 053-562-3340 |
| 7 | 8 | 이용업 | 성원이용소 | 대구광역시 서구 평리로 214 (내당동) | 053-565-3869 |
| 8 | 9 | 이용업 | 형제이용소 | 대구광역시 서구 평리로83길 11-1 (평리동) | 053-552-4500 |
| 9 | 10 | 이용업 | 광명이용소 | 대구광역시 서구 국채보상로88길 27 (내당동) | 053-564-2647 |

| (index) | 연번 | 업종명 | 업소명 | 영업소 주소(도로명) | 소재지전화 |
| 800 | 801 | 네일미용업, 화장ㆍ분장 미용업 | 내니네일 | 대구광역시 서구 달서로4길 7, 1층 (내당동) | <NA> |
| 801 | 802 | 네일미용업, 화장ㆍ분장 미용업 | 2호점네일 | 대구광역시 서구 평리로 276, 1층 (내당동) | <NA> |
| 802 | 803 | 네일미용업, 화장ㆍ분장 미용업 | 핑크손(PINK SON) | 대구광역시 서구 당산로 241, 1층 (내당동) | <NA> |
| 803 | 804 | 네일미용업, 화장ㆍ분장 미용업 | 아이조아네일 | 대구광역시 서구 달구벌대로361길 41, 501동 2층 208호 (내당동, e편한세상두류역) | <NA> |
| 804 | 805 | 네일미용업, 화장ㆍ분장 미용업 | 더예쁨 | 대구광역시 서구 당산로47길 23, 1층 (내당동) | <NA> |
| 805 | 806 | 네일미용업, 화장ㆍ분장 미용업 | 뷰티화이트 | 대구광역시 서구 통학로21길 1, 1층 (평리동) | <NA> |
| 806 | 807 | 피부미용업, 네일미용업, 화장ㆍ분장 미용업 | 민낯뷰티본점 | 대구광역시 서구 서대구로 45 (내당동) | <NA> |
| 807 | 808 | 피부미용업, 네일미용업, 화장ㆍ분장 미용업 | 네일콩스(Nail Kong's) | 대구광역시 서구 달구벌대로361길 5-2, 1층 (내당동) | <NA> |
| 808 | 809 | 피부미용업, 네일미용업, 화장ㆍ분장 미용업 | 루아뷰티(Ru:a) | 대구광역시 서구 서대구로3길 70, 1층 (내당동) | <NA> |
| 809 | 810 | 피부미용업, 네일미용업, 화장ㆍ분장 미용업 | 바로네일(BARO NAIL) | 대구광역시 서구 당산로 216, 1층 (내당동) | <NA> |