Overview

Dataset statistics

Number of variables: 6
Number of observations: 384
Missing cells: 992
Missing cells (%): 43.1%
Duplicate rows: 1
Duplicate rows (%): 0.3%
Total size in memory: 18.1 KiB
Average record size in memory: 48.3 B
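
The derived figures above follow from the raw counts. A minimal sketch of that arithmetic, using only numbers quoted in this report; the variable names are illustrative, and the memory total is itself a rounded value, so the last digit could differ:

```python
# Figures copied from the dataset statistics above.
n_variables = 6
n_observations = 384
missing_cells = 992
total_memory_kib = 18.1  # already rounded in the report

total_cells = n_variables * n_observations        # every cell in the table
missing_pct = 100 * missing_cells / total_cells   # share of empty cells
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

The 992 missing cells also equal 4 x 248, which matches the four columns that each report 248 missing values.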

Variable types

Categorical: 2
Text: 3
DateTime: 1

Dataset

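Description: This dataset provides information on beauty businesses (미용업) in Geumsan-gun, Chungcheongnam-do, with the columns 업종구분 (business type), 업소명 (business name), 행정구역 (administrative district), 주소 (address), 전화번호 (phone number), and 데이터기준일자 (data reference date).
Author: 충청남도 금산군 (Geumsan-gun, Chungcheongnam-do)
URL: https://www.data.go.kr/data/15099795/fileData.do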

Alerts

데이터기준일자 has constant value "" [Constant]
Dataset has 1 (0.3%) duplicate rows [Duplicates]
업종구분 is highly imbalanced (53.5%) [Imbalance]
행정구역 is highly imbalanced (55.0%) [Imbalance]
업소명 has 248 (64.6%) missing values [Missing]
주소 has 248 (64.6%) missing values [Missing]
전화번호 has 248 (64.6%) missing values [Missing]
데이터기준일자 has 248 (64.6%) missing values [Missing]
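
The report does not state how the imbalance percentages are computed. One formula that appears to reproduce both values is one minus the normalised Shannon entropy of the value counts. The sketch below assumes that formula; `imbalance_score` is a hypothetical helper, and the counts are taken from the Common Values tables of this report, with each of the three "Other values" of 업종구분 counted once.

```python
from math import log

def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the counts.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    k = len(counts)
    if k < 2:
        return 0.0  # convention chosen for this sketch only
    total = sum(counts)
    entropy = -sum((c / total) * log(c / total) for c in counts if c > 0)
    return 1 - entropy / log(k)

# Counts per distinct value, including the "<NA>" placeholder.
district_counts = [248, 112, 17, 3, 2, 1, 1]                           # 행정구역
business_type_counts = [248, 68, 31, 15, 6, 6, 3, 2, 1, 1, 1, 1, 1]    # 업종구분

print(f"행정구역: {imbalance_score(district_counts):.1%}")
print(f"업종구분: {imbalance_score(business_type_counts):.1%}")
```

Both columns are dominated by the 248 `<NA>` entries, so the scores would probably change considerably once those rows are removed.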

Reproduction

Analysis started: 2023-12-12 06:38:06.970704
Analysis finished: 2023-12-12 06:38:07.601510
Duration: 0.63 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업종구분
Categorical

IMBALANCE 

Distinct: 13
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.1 KiB

Value | Count
<NA> | 248
일반미용업 | 68
미용업 | 31
피부미용업 | 15
종합미용업 | 6
Other values (8) | 16

Length

Max length: 23
Median length: 4
Mean length: 4.4921875
Min length: 3

Unique

Unique: 5
Unique (%): 1.3%

Sample

1st row: 미용업
2nd row: 미용업
3rd row: 미용업
4th row: 미용업
5th row: 미용업

Common Values

Value | Count | Frequency (%)
<NA> | 248 | 64.6%
일반미용업 | 68 | 17.7%
미용업 | 31 | 8.1%
피부미용업 | 15 | 3.9%
종합미용업 | 6 | 1.6%
네일미용업 | 6 | 1.6%
피부미용업, 네일미용업 | 3 | 0.8%
일반미용업, 화장ㆍ분장 미용업 | 2 | 0.5%
일반미용업, 네일미용업 | 1 | 0.3%
피부미용업, 화장ㆍ분장 미용업 | 1 | 0.3%
Other values (3) | 3 | 0.8%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
na | 248 | 61.5%
일반미용업 | 73 | 18.1%
미용업 | 37 | 9.2%
피부미용업 | 21 | 5.2%
네일미용업 | 12 | 3.0%
종합미용업 | 6 | 1.5%
화장ㆍ분장 | 6 | 1.5%

업소명
Text

MISSING 

Distinct: 136
Distinct (%): 100.0%
Missing: 248
Missing (%): 64.6%
Memory size: 3.1 KiB

Length

Max length: 24
Median length: 18
Mean length: 5.3676471
Min length: 2

Characters and Unicode

Total characters: 730
Distinct characters: 213
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 136
Unique (%): 100.0%

Sample

1st row: 정헤어샵
2nd row: 나드리미용실
3rd row: 현대미용실
4th row: 새봄미용실
5th row: 희나리미용실

Value | Count | Frequency (%)
헤어 | 4 | 2.7%
대구미용실 | 1 | 0.7%
모모쌀롱 | 1 | 0.7%
퀸즈피부샵 | 1 | 0.7%
비단헤어 | 1 | 0.7%
비즈 | 1 | 0.7%
복수미용실 | 1 | 0.7%
경헤어 | 1 | 0.7%
예가헤어샵 | 1 | 0.7%
선혜미용실 | 1 | 0.7%
Other values (135) | 135 | 91.2%

Most occurring characters

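Value | Count | Frequency (%)
(not recovered) | 56 | 7.7%
(not recovered) | 52 | 7.1%
(not recovered) | 50 | 6.8%
(not recovered) | 50 | 6.8%
(not recovered) | 48 | 6.6%
(not recovered) | 15 | 2.1%
(not recovered) | 13 | 1.8%
(not recovered) | 13 | 1.8%
(space) | 12 | 1.6%
(not recovered) | 7 | 1.0%
Other values (203) | 414 | 56.7%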

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 667 | 91.4%
Uppercase Letter | 27 | 3.7%
Lowercase Letter | 13 | 1.8%
Space Separator | 12 | 1.6%
Other Punctuation | 5 | 0.7%
Open Punctuation | 3 | 0.4%
Close Punctuation | 3 | 0.4%

Most frequent character per category

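Other Letter
Value | Count | Frequency (%)
(not recovered) | 56 | 8.4%
(not recovered) | 52 | 7.8%
(not recovered) | 50 | 7.5%
(not recovered) | 50 | 7.5%
(not recovered) | 48 | 7.2%
(not recovered) | 15 | 2.2%
(not recovered) | 13 | 1.9%
(not recovered) | 13 | 1.9%
(not recovered) | 7 | 1.0%
(not recovered) | 7 | 1.0%
Other values (172) | 356 | 53.4%

Uppercase Letter
Value | Count | Frequency (%)
B | 3 | 11.1%
E | 3 | 11.1%
J | 3 | 11.1%
M | 3 | 11.1%
S | 2 | 7.4%
A | 2 | 7.4%
V | 2 | 7.4%
U | 2 | 7.4%
R | 2 | 7.4%
H | 1 | 3.7%
Other values (4) | 4 | 14.8%

Lowercase Letter
Value | Count | Frequency (%)
a | 2 | 15.4%
b | 2 | 15.4%
m | 1 | 7.7%
c | 1 | 7.7%
o | 1 | 7.7%
e | 1 | 7.7%
n | 1 | 7.7%
u | 1 | 7.7%
t | 1 | 7.7%
y | 1 | 7.7%

Other Punctuation
Value | Count | Frequency (%)
, | 3 | 60.0%
& | 1 | 20.0%
. | 1 | 20.0%

Space Separator
Value | Count | Frequency (%)
(space) | 12 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 3 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 3 | 100.0%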

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 667 | 91.4%
Latin | 40 | 5.5%
Common | 23 | 3.2%

Most frequent character per script

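Hangul
Value | Count | Frequency (%)
(not recovered) | 56 | 8.4%
(not recovered) | 52 | 7.8%
(not recovered) | 50 | 7.5%
(not recovered) | 50 | 7.5%
(not recovered) | 48 | 7.2%
(not recovered) | 15 | 2.2%
(not recovered) | 13 | 1.9%
(not recovered) | 13 | 1.9%
(not recovered) | 7 | 1.0%
(not recovered) | 7 | 1.0%
Other values (172) | 356 | 53.4%

Latin
Value | Count | Frequency (%)
B | 3 | 7.5%
E | 3 | 7.5%
J | 3 | 7.5%
M | 3 | 7.5%
S | 2 | 5.0%
a | 2 | 5.0%
A | 2 | 5.0%
V | 2 | 5.0%
U | 2 | 5.0%
R | 2 | 5.0%
Other values (15) | 16 | 40.0%

Common
Value | Count | Frequency (%)
(space) | 12 | 52.2%
( | 3 | 13.0%
) | 3 | 13.0%
, | 3 | 13.0%
& | 1 | 4.3%
. | 1 | 4.3%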

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 667 | 91.4%
ASCII | 63 | 8.6%

Most frequent character per block

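Hangul
Value | Count | Frequency (%)
(not recovered) | 56 | 8.4%
(not recovered) | 52 | 7.8%
(not recovered) | 50 | 7.5%
(not recovered) | 50 | 7.5%
(not recovered) | 48 | 7.2%
(not recovered) | 15 | 2.2%
(not recovered) | 13 | 1.9%
(not recovered) | 13 | 1.9%
(not recovered) | 7 | 1.0%
(not recovered) | 7 | 1.0%
Other values (172) | 356 | 53.4%

ASCII
Value | Count | Frequency (%)
(space) | 12 | 19.0%
B | 3 | 4.8%
E | 3 | 4.8%
( | 3 | 4.8%
) | 3 | 4.8%
J | 3 | 4.8%
, | 3 | 4.8%
M | 3 | 4.8%
S | 2 | 3.2%
a | 2 | 3.2%
Other values (21) | 26 | 41.3%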

행정구역
Categorical

IMBALANCE 

Distinct: 7
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 3.1 KiB

Value | Count
<NA> | 248
금산읍 | 112
추부면 | 17
진산면 | 3
복수면 | 2
Other values (2) | 2

Length

Max length: 4
Median length: 4
Mean length: 3.6458333
Min length: 3

Unique

Unique: 2
Unique (%): 0.5%

Sample

1st row: 금산읍
2nd row: 금산읍
3rd row: 금산읍
4th row: 금산읍
5th row: 금산읍

Common Values

Value | Count | Frequency (%)
<NA> | 248 | 64.6%
금산읍 | 112 | 29.2%
추부면 | 17 | 4.4%
진산면 | 3 | 0.8%
복수면 | 2 | 0.5%
부리면 | 1 | 0.3%
제원면 | 1 | 0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 248 | 64.6%
금산읍 | 112 | 29.2%
추부면 | 17 | 4.4%
진산면 | 3 | 0.8%
복수면 | 2 | 0.5%
부리면 | 1 | 0.3%
제원면 | 1 | 0.3%

주소
Text

MISSING 

Distinct: 134
Distinct (%): 98.5%
Missing: 248
Missing (%): 64.6%
Memory size: 3.1 KiB

Length

Max length: 41
Median length: 35
Mean length: 22.382353
Min length: 18

Characters and Unicode

Total characters: 3044
Distinct characters: 97
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 132
Unique (%): 97.1%

Sample

1st row: 충청남도 금산군 금산읍 상리 89-3
2nd row: 충청남도 금산군 금산읍 금산천길 150
3rd row: 충청남도 금산군 금산읍 인삼로 110
4th row: 충청남도 금산군 금산읍 금산로 1479
5th row: 충청남도 금산군 금산읍 상리 177-10

Value | Count | Frequency (%)
충청남도 | 136 | 18.5%
금산군 | 136 | 18.5%
금산읍 | 112 | 15.3%
1층 | 18 | 2.5%
추부면 | 17 | 2.3%
비단로 | 15 | 2.0%
상리 | 11 | 1.5%
인삼로 | 8 | 1.1%
금산로 | 8 | 1.1%
비호로 | 8 | 1.1%
Other values (173) | 265 | 36.1%

Most occurring characters

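Value | Count | Frequency (%)
(space) | 624 | 20.5%
(not recovered) | 272 | 8.9%
(not recovered) | 263 | 8.6%
(not recovered) | 141 | 4.6%
(not recovered) | 140 | 4.6%
(not recovered) | 140 | 4.6%
(not recovered) | 137 | 4.5%
(not recovered) | 136 | 4.5%
(not recovered) | 115 | 3.8%
1 | 115 | 3.8%
Other values (87) | 961 | 31.6%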

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1881 | 61.8%
Space Separator | 624 | 20.5%
Decimal Number | 454 | 14.9%
Other Punctuation | 36 | 1.2%
Dash Punctuation | 36 | 1.2%
Close Punctuation | 4 | 0.1%
Open Punctuation | 4 | 0.1%
Uppercase Letter | 4 | 0.1%
Math Symbol | 1 | < 0.1%

Most frequent character per category

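Other Letter
Value | Count | Frequency (%)
(not recovered) | 272 | 14.5%
(not recovered) | 263 | 14.0%
(not recovered) | 141 | 7.5%
(not recovered) | 140 | 7.4%
(not recovered) | 140 | 7.4%
(not recovered) | 137 | 7.3%
(not recovered) | 136 | 7.2%
(not recovered) | 115 | 6.1%
(not recovered) | 75 | 4.0%
(not recovered) | 36 | 1.9%
Other values (70) | 426 | 22.6%

Decimal Number
Value | Count | Frequency (%)
1 | 115 | 25.3%
2 | 54 | 11.9%
3 | 52 | 11.5%
6 | 46 | 10.1%
5 | 41 | 9.0%
4 | 36 | 7.9%
0 | 32 | 7.0%
7 | 32 | 7.0%
9 | 25 | 5.5%
8 | 21 | 4.6%

Space Separator
Value | Count | Frequency (%)
(space) | 624 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 36 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 36 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 4 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 4 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
I | 4 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 1 | 100.0%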

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1881 | 61.8%
Common | 1159 | 38.1%
Latin | 4 | 0.1%

Most frequent character per script

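Hangul
Value | Count | Frequency (%)
(not recovered) | 272 | 14.5%
(not recovered) | 263 | 14.0%
(not recovered) | 141 | 7.5%
(not recovered) | 140 | 7.4%
(not recovered) | 140 | 7.4%
(not recovered) | 137 | 7.3%
(not recovered) | 136 | 7.2%
(not recovered) | 115 | 6.1%
(not recovered) | 75 | 4.0%
(not recovered) | 36 | 1.9%
Other values (70) | 426 | 22.6%

Common
Value | Count | Frequency (%)
(space) | 624 | 53.8%
1 | 115 | 9.9%
2 | 54 | 4.7%
3 | 52 | 4.5%
6 | 46 | 4.0%
5 | 41 | 3.5%
4 | 36 | 3.1%
, | 36 | 3.1%
- | 36 | 3.1%
0 | 32 | 2.8%
Other values (6) | 87 | 7.5%

Latin
Value | Count | Frequency (%)
I | 4 | 100.0%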

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1881 | 61.8%
ASCII | 1163 | 38.2%

Most frequent character per block

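ASCII
Value | Count | Frequency (%)
(space) | 624 | 53.7%
1 | 115 | 9.9%
2 | 54 | 4.6%
3 | 52 | 4.5%
6 | 46 | 4.0%
5 | 41 | 3.5%
4 | 36 | 3.1%
, | 36 | 3.1%
- | 36 | 3.1%
0 | 32 | 2.8%
Other values (7) | 91 | 7.8%

Hangul
Value | Count | Frequency (%)
(not recovered) | 272 | 14.5%
(not recovered) | 263 | 14.0%
(not recovered) | 141 | 7.5%
(not recovered) | 140 | 7.4%
(not recovered) | 140 | 7.4%
(not recovered) | 137 | 7.3%
(not recovered) | 136 | 7.2%
(not recovered) | 115 | 6.1%
(not recovered) | 75 | 4.0%
(not recovered) | 36 | 1.9%
Other values (70) | 426 | 22.6%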

전화번호
Text

MISSING 

Distinct: 104
Distinct (%): 76.5%
Missing: 248
Missing (%): 64.6%
Memory size: 3.1 KiB

Length

Max length: 14
Median length: 14
Mean length: 11.080882
Min length: 2

Characters and Unicode

Total characters: 1507
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 103
Unique (%): 75.7%

Sample

1st row: 041- 752-1631
2nd row: 041- 752-2971
3rd row: 041- 754-6387
4th row: 041- 752-2646
5th row: 041- 752-1135

Value | Count | Frequency (%)
041 | 102 | 34.9%
없음 | 33 | 11.3%
751 | 20 | 6.8%
754 | 14 | 4.8%
752 | 11 | 3.8%
753 | 7 | 2.4%
4348 | 1 | 0.3%
8686 | 1 | 0.3%
7227 | 1 | 0.3%
753-1858 | 1 | 0.3%
Other values (101) | 101 | 34.6%
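
The sample rows show stray spaces inside the numbers ("041- 752-1631"), and 33 entries hold the placeholder 없음 ("none") instead of a number. A minimal clean-up sketch under those two observations; `normalize_phone` and its placeholder list are hypothetical and are not part of the dataset or of ydata-profiling:

```python
import re

# Strings treated as "no phone number"; 없음 and <NA> appear in this report,
# the empty string is an extra assumption.
PLACEHOLDERS = {"없음", "<NA>", ""}

def normalize_phone(raw):
    """Strip all whitespace and map placeholder strings to None."""
    if raw is None:
        return None
    compact = re.sub(r"\s+", "", raw)
    return None if compact in PLACEHOLDERS else compact

print(normalize_phone(" 041- 752-1631"))
print(normalize_phone("없음"))
```

Treating 없음 as missing would probably raise the missing count for 전화번호 from 248 to 281, since the profiler currently counts those 33 entries as ordinary text.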

Most occurring characters

Value | Count | Frequency (%)
- | 206 | 13.7%
(space) | 205 | 13.6%
1 | 176 | 11.7%
4 | 168 | 11.1%
7 | 159 | 10.6%
0 | 148 | 9.8%
5 | 144 | 9.6%
2 | 62 | 4.1%
3 | 57 | 3.8%
6 | 44 | 2.9%
Other values (4) | 138 | 9.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1030 | 68.3%
Dash Punctuation | 206 | 13.7%
Space Separator | 205 | 13.6%
Other Letter | 66 | 4.4%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 176 | 17.1%
4 | 168 | 16.3%
7 | 159 | 15.4%
0 | 148 | 14.4%
5 | 144 | 14.0%
2 | 62 | 6.0%
3 | 57 | 5.5%
6 | 44 | 4.3%
9 | 36 | 3.5%
8 | 36 | 3.5%

Other Letter
Value | Count | Frequency (%)
없 | 33 | 50.0%
음 | 33 | 50.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 206 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 205 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1441 | 95.6%
Hangul | 66 | 4.4%

Most frequent character per script

Common
Value | Count | Frequency (%)
- | 206 | 14.3%
(space) | 205 | 14.2%
1 | 176 | 12.2%
4 | 168 | 11.7%
7 | 159 | 11.0%
0 | 148 | 10.3%
5 | 144 | 10.0%
2 | 62 | 4.3%
3 | 57 | 4.0%
6 | 44 | 3.1%
Other values (2) | 72 | 5.0%

Hangul
Value | Count | Frequency (%)
없 | 33 | 50.0%
음 | 33 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1441 | 95.6%
Hangul | 66 | 4.4%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 206 | 14.3%
(space) | 205 | 14.2%
1 | 176 | 12.2%
4 | 168 | 11.7%
7 | 159 | 11.0%
0 | 148 | 10.3%
5 | 144 | 10.0%
2 | 62 | 4.3%
3 | 57 | 4.0%
6 | 44 | 3.1%
Other values (2) | 72 | 5.0%

Hangul
Value | Count | Frequency (%)
없 | 33 | 50.0%
음 | 33 | 50.0%

데이터기준일자
Date

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 0.7%
Missing: 248
Missing (%): 64.6%
Memory size: 3.1 KiB
Minimum: 2022-03-31 00:00:00
Maximum: 2022-03-31 00:00:00
Histogram with fixed size bins (bins=1)

Correlations

Variable | 업종구분 | 행정구역
업종구분 | 1.000 | 0.332
행정구역 | 0.332 | 1.000

Variable | 업종구분 | 행정구역
업종구분 | 1.000 | 0.131
행정구역 | 0.131 | 1.000

Variable | 업종구분 | 행정구역
업종구분 | 1.000 | 0.131
행정구역 | 0.131 | 1.000
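
The report does not name the coefficient behind these matrices. For two categorical columns a common choice is Cramér's V, so the sketch below assumes that measure. It computes the plain, uncorrected statistic on made-up toy data; `cramers_v` is a hypothetical helper and is not expected to reproduce 0.332 or 0.131.

```python
from collections import Counter
from math import sqrt

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed cell counts
    rows = Counter(xs)             # marginal counts of the first column
    cols = Counter(ys)             # marginal counts of the second column
    chi2 = 0.0
    for r, r_count in rows.items():
        for c, c_count in cols.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(rows), len(cols)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Toy data: the second column is fully determined by the first.
business_type = ["미용업", "미용업", "피부미용업", "피부미용업"]
district = ["금산읍", "금산읍", "추부면", "추부면"]
print(cramers_v(business_type, district))
```

Because 248 of the 384 rows carry `<NA>` in both columns, any association measure computed on the raw data is likely driven mostly by those empty rows.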

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
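
Nullity correlation can be illustrated as the Pearson correlation between the "is present" indicators of two columns. This is a minimal sketch under that assumption; `nullity_correlation` is a hypothetical helper, `None` stands in for a missing cell, and the rows are toy data in the style of this dataset.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # undefined when a column is never or always missing
    return cov / sqrt(var_a * var_b)

# Toy rows: both columns are empty in exactly the same rows.
names = ["정헤어샵", None, None]
phones = ["041- 752-1631", None, None]
print(nullity_correlation(names, phones))
```

In this dataset the four columns with missing values each lack 248 entries, and the tail of the sample shows rows that are empty across all columns, so their pairwise nullity correlations are presumably at or near 1.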

Sample

Index | 업종구분 | 업소명 | 행정구역 | 주소 | 전화번호 | 데이터기준일자
0 | 미용업 | 정헤어샵 | 금산읍 | 충청남도 금산군 금산읍 상리 89-3 | 041- 752-1631 | 2022-03-31
1 | 미용업 | 나드리미용실 | 금산읍 | 충청남도 금산군 금산읍 금산천길 150 | 041- 752-2971 | 2022-03-31
2 | 미용업 | 현대미용실 | 금산읍 | 충청남도 금산군 금산읍 인삼로 110 | 041- 754-6387 | 2022-03-31
3 | 미용업 | 새봄미용실 | 금산읍 | 충청남도 금산군 금산읍 금산로 1479 | 041- 752-2646 | 2022-03-31
4 | 미용업 | 희나리미용실 | 금산읍 | 충청남도 금산군 금산읍 상리 177-10 | 041- 752-1135 | 2022-03-31
5 | 미용업 | 조하미용실 | 금산읍 | 충청남도 금산군 금산읍 비선길 15 | 041- 753-5866 | 2022-03-31
6 | 미용업 | 은진미용실 | 금산읍 | 충청남도 금산군 금산읍 하옥리 335-15 | 041- 753-7568 | 2022-03-31
7 | 미용업 | 임마누엘미용실 | 금산읍 | 충청남도 금산군 금산읍 금산천길 106, 금산시장 3동 4호 | 041- 754-5575 | 2022-03-31
8 | 미용업 | 이영림헤어공간 | 추부면 | 충청남도 금산군 추부면 하마전로 50, 1층 | 041- 752-4736 | 2022-03-31
9 | 미용업 | 대구미용실 | 금산읍 | 충청남도 금산군 금산읍 비호로 26 | 041- 751-0479 | 2022-03-31

Index | 업종구분 | 업소명 | 행정구역 | 주소 | 전화번호 | 데이터기준일자
374 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
375 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
376 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
377 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
378 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
379 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
380 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
381 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
382 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
383 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

Index | 업종구분 | 업소명 | 행정구역 | 주소 | 전화번호 | 데이터기준일자 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 248
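
The single duplicated pattern is the all-`<NA>` row, which occurs 248 times. A minimal sketch of how such rows could be counted and dropped before further analysis; the helper names and the toy rows are hypothetical, and treating the literal string "<NA>" as the empty marker is an assumption about how the raw file encodes blanks.

```python
from collections import Counter

NA = "<NA>"  # placeholder as printed in this report (assumed encoding)

def duplicated_patterns(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

def drop_empty_rows(rows):
    """Keep only rows that have at least one non-placeholder value."""
    return [row for row in rows if any(value != NA for value in row)]

# Toy rows (업종구분, 업소명, 행정구역) followed by three empty rows.
rows = [
    ("미용업", "정헤어샵", "금산읍"),
    ("미용업", "나드리미용실", "금산읍"),
    (NA, NA, NA),
    (NA, NA, NA),
    (NA, NA, NA),
]
patterns = duplicated_patterns(rows)
cleaned = drop_empty_rows(rows)
print(patterns)
print(len(cleaned))
```

Applied to the full dataset, dropping the 248 empty rows would leave 384 - 248 = 136 records, which matches the 136 distinct 업소명 values reported above.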