Overview

Dataset statistics

Number of variables: 5
Number of observations: 78
Missing cells: 3
Missing cells (%): 0.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.2 KiB
Average record size in memory: 41.7 B
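
The percentages and sizes above follow from simple arithmetic on the raw counts. The sketch below re-derives two of them; it assumes that the missing-cell percentage is taken over all cells (variables × observations) and that the total size is the average record size times the number of observations. Both are assumptions about how the report computes its figures, not something the report states.

```python
# Figures copied from the "Dataset statistics" block of the report.
n_variables = 5
n_observations = 78
missing_cells = 3
avg_record_bytes = 41.7

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: total size = average record size * number of records.
total_kib = avg_record_bytes * n_observations / 1024

print(f"Missing cells: {missing_pct:.1f}%")   # Missing cells: 0.8%
print(f"Total size: {total_kib:.1f} KiB")     # Total size: 3.2 KiB
```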

Variable types

Categorical: 2
Text: 3

Dataset

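Description: Data on the status of lodging establishments in Eumseong-gun (음성군), providing the business type (업종명), business name (업소명), location address (소재지 주소), location phone number (소재지 전화번호), and data reference date (데이터기준일자).
URL: https://www.data.go.kr/data/3073540/fileData.do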

Alerts

기준일자 has constant value "2023-08-21" (Constant)
업종명 is highly imbalanced (76.5%) (Imbalance)
소재지전화 has 3 (3.8%) missing values (Missing)
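
The 76.5% imbalance figure can be reproduced from the two class counts of 업종명 (75 and 3) if the score is defined as one minus the normalised Shannon entropy. That definition is an assumption made for this sketch; it happens to match the reported value, but the report itself does not state the formula. The function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts.

    0.0 means perfectly balanced classes, 1.0 means a single class holds everything.
    Assumed definition; not confirmed by the report.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# Class counts of 업종명 taken from the report: 75 and 3 out of 78 rows.
score = imbalance_score([75, 3])
print(f"{score:.1%}")  # 76.5%
```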

Reproduction

Analysis started: 2023-12-12 04:02:56.148114
Analysis finished: 2023-12-12 04:02:56.927703
Duration: 0.78 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업종명
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 756.0 B

Value | Count
숙박업(일반) | 75
숙박업(생활) | 3

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 숙박업(일반)
2nd row: 숙박업(일반)
3rd row: 숙박업(일반)
4th row: 숙박업(일반)
5th row: 숙박업(일반)

Common Values

Value | Count | Frequency (%)
숙박업(일반) | 75 | 96.2%
숙박업(생활) | 3 | 3.8%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values of 업종명]

업소명
Text

Distinct: 76
Distinct (%): 97.4%
Missing: 0
Missing (%): 0.0%
Memory size: 756.0 B

Length

Max length: 12
Median length: 9
Mean length: 4.9102564
Min length: 2

Characters and Unicode

Total characters: 383
Distinct characters: 139
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
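
As a rough illustration of these character properties, the sketch below counts Unicode general categories for one of the sample values using Python's standard `unicodedata` module. The mapping from two-letter category codes to the long names used in this report is written out by hand, and the helper name `category_counts` is hypothetical; this is not how the report itself was produced.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
    "Sm": "Math Symbol",
}

def category_counts(text):
    """Count characters in `text` by Unicode general category (long name)."""
    codes = Counter(unicodedata.category(ch) for ch in text)
    return {CATEGORY_NAMES.get(code, code): n for code, n in codes.items()}

# One of the sample business names from the report.
counts = category_counts("스카이모텔(SKY모텔)")
print(counts)
```

For this value the Hangul syllables fall under Other Letter, the Latin capitals under Uppercase Letter, and the parentheses under Open and Close Punctuation, which are the same category names that appear in the tables of this report.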

Unique

Unique: 74
Unique (%): 94.9%

Sample

1st row: 황금장
2nd row: 그린힐모텔
3rd row: 스카이모텔(SKY모텔)
4th row: 로얄모텔
5th row: 음성파크

Value | Count | Frequency (%)
그린파크 | 2 | 2.3%
아침 | 2 | 2.3%
모텔 | 2 | 2.3%
지중해의 | 2 | 2.3%
월드파크 | 2 | 2.3%
숲속정원 | 1 | 1.2%
서울장여관 | 1 | 1.2%
mk | 1 | 1.2%
산호장여관 | 1 | 1.2%
두바이모텔 | 1 | 1.2%
Other values (71) | 71 | 82.6%

Most occurring characters

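Value | Count | Frequency (%)
(not recovered) | 48 | 12.5%
(not recovered) | 34 | 8.9%
(not recovered) | 12 | 3.1%
(not recovered) | 11 | 2.9%
(not recovered) | 10 | 2.6%
(not recovered) | 10 | 2.6%
(not recovered) | 10 | 2.6%
(not recovered) | 10 | 2.6%
(not recovered) | 10 | 2.6%
(not recovered) | 8 | 2.1%
Other values (129) | 220 | 57.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 353 | 92.2%
Uppercase Letter | 15 | 3.9%
Space Separator | 8 | 2.1%
Open Punctuation | 3 | 0.8%
Close Punctuation | 3 | 0.8%
Other Punctuation | 1 | 0.3%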

Most frequent character per category

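Other Letter
Value | Count | Frequency (%)
(not recovered) | 48 | 13.6%
(not recovered) | 34 | 9.6%
(not recovered) | 12 | 3.4%
(not recovered) | 11 | 3.1%
(not recovered) | 10 | 2.8%
(not recovered) | 10 | 2.8%
(not recovered) | 10 | 2.8%
(not recovered) | 10 | 2.8%
(not recovered) | 10 | 2.8%
(not recovered) | 8 | 2.3%
Other values (113) | 190 | 53.8%

Uppercase Letter
Value | Count | Frequency (%)
J | 3 | 20.0%
K | 2 | 13.3%
D | 1 | 6.7%
A | 1 | 6.7%
B | 1 | 6.7%
M | 1 | 6.7%
V | 1 | 6.7%
R | 1 | 6.7%
Y | 1 | 6.7%
S | 1 | 6.7%
Other values (2) | 2 | 13.3%

Space Separator
Value | Count | Frequency (%)
(space) | 8 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 3 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 3 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
& | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 353 | 92.2%
Common | 15 | 3.9%
Latin | 15 | 3.9%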

Most frequent character per script

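Hangul
Value | Count | Frequency (%)
(not recovered) | 48 | 13.6%
(not recovered) | 34 | 9.6%
(not recovered) | 12 | 3.4%
(not recovered) | 11 | 3.1%
(not recovered) | 10 | 2.8%
(not recovered) | 10 | 2.8%
(not recovered) | 10 | 2.8%
(not recovered) | 10 | 2.8%
(not recovered) | 10 | 2.8%
(not recovered) | 8 | 2.3%
Other values (113) | 190 | 53.8%

Latin
Value | Count | Frequency (%)
J | 3 | 20.0%
K | 2 | 13.3%
D | 1 | 6.7%
A | 1 | 6.7%
B | 1 | 6.7%
M | 1 | 6.7%
V | 1 | 6.7%
R | 1 | 6.7%
Y | 1 | 6.7%
S | 1 | 6.7%
Other values (2) | 2 | 13.3%

Common
Value | Count | Frequency (%)
(space) | 8 | 53.3%
( | 3 | 20.0%
) | 3 | 20.0%
& | 1 | 6.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 353 | 92.2%
ASCII | 30 | 7.8%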

Most frequent character per block

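Hangul
Value | Count | Frequency (%)
(not recovered) | 48 | 13.6%
(not recovered) | 34 | 9.6%
(not recovered) | 12 | 3.4%
(not recovered) | 11 | 3.1%
(not recovered) | 10 | 2.8%
(not recovered) | 10 | 2.8%
(not recovered) | 10 | 2.8%
(not recovered) | 10 | 2.8%
(not recovered) | 10 | 2.8%
(not recovered) | 8 | 2.3%
Other values (113) | 190 | 53.8%

ASCII
Value | Count | Frequency (%)
(space) | 8 | 26.7%
J | 3 | 10.0%
( | 3 | 10.0%
) | 3 | 10.0%
K | 2 | 6.7%
D | 1 | 3.3%
A | 1 | 3.3%
B | 1 | 3.3%
M | 1 | 3.3%
& | 1 | 3.3%
Other values (6) | 6 | 20.0%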

영업소 주소(도로명)
Text

Distinct: 77
Distinct (%): 98.7%
Missing: 0
Missing (%): 0.0%
Memory size: 756.0 B

Length

Max length: 30
Median length: 28
Mean length: 22.448718
Min length: 18

Characters and Unicode

Total characters: 1751
Distinct characters: 85
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 76
Unique (%): 97.4%

Sample

1st row: 충청북도 음성군 음성읍 한불로 44
2nd row: 충청북도 음성군 음성읍 충청대로 1488 외 3필지
3rd row: 충청북도 음성군 음성읍 중앙로 319
4th row: 충청북도 음성군 음성읍 중앙로 126-9
5th row: 충청북도 음성군 음성읍 음성천서길 167

Value | Count | Frequency (%)
충청북도 | 78 | 19.6%
음성군 | 78 | 19.6%
금왕읍 | 21 | 5.3%
음성읍 | 14 | 3.5%
대소면 | 13 | 3.3%
감곡면 | 10 | 2.5%
맹동면 | 7 | 1.8%
대금로 | 7 | 1.8%
충청대로 | 6 | 1.5%
오태로 | 6 | 1.5%
Other values (129) | 158 | 39.7%

Most occurring characters

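Value | Count | Frequency (%)
(space) | 320 | 18.3%
(not recovered) | 112 | 6.4%
(not recovered) | 100 | 5.7%
(not recovered) | 84 | 4.8%
(not recovered) | 84 | 4.8%
(not recovered) | 79 | 4.5%
(not recovered) | 78 | 4.5%
(not recovered) | 78 | 4.5%
1 | 70 | 4.0%
(not recovered) | 63 | 3.6%
Other values (75) | 683 | 39.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1087 | 62.1%
Space Separator | 320 | 18.3%
Decimal Number | 302 | 17.2%
Dash Punctuation | 36 | 2.1%
Other Punctuation | 3 | 0.2%
Uppercase Letter | 2 | 0.1%
Math Symbol | 1 | 0.1%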

Most frequent character per category

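Other Letter
Value | Count | Frequency (%)
(not recovered) | 112 | 10.3%
(not recovered) | 100 | 9.2%
(not recovered) | 84 | 7.7%
(not recovered) | 84 | 7.7%
(not recovered) | 79 | 7.3%
(not recovered) | 78 | 7.2%
(not recovered) | 78 | 7.2%
(not recovered) | 63 | 5.8%
(not recovered) | 43 | 4.0%
(not recovered) | 35 | 3.2%
Other values (60) | 331 | 30.5%

Decimal Number
Value | Count | Frequency (%)
1 | 70 | 23.2%
2 | 38 | 12.6%
3 | 34 | 11.3%
6 | 31 | 10.3%
5 | 30 | 9.9%
4 | 28 | 9.3%
7 | 23 | 7.6%
9 | 18 | 6.0%
8 | 18 | 6.0%
0 | 12 | 4.0%

Space Separator
Value | Count | Frequency (%)
(space) | 320 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 36 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 3 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
C | 2 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1087 | 62.1%
Common | 662 | 37.8%
Latin | 2 | 0.1%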

Most frequent character per script

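Hangul
Value | Count | Frequency (%)
(not recovered) | 112 | 10.3%
(not recovered) | 100 | 9.2%
(not recovered) | 84 | 7.7%
(not recovered) | 84 | 7.7%
(not recovered) | 79 | 7.3%
(not recovered) | 78 | 7.2%
(not recovered) | 78 | 7.2%
(not recovered) | 63 | 5.8%
(not recovered) | 43 | 4.0%
(not recovered) | 35 | 3.2%
Other values (60) | 331 | 30.5%

Common
Value | Count | Frequency (%)
(space) | 320 | 48.3%
1 | 70 | 10.6%
2 | 38 | 5.7%
- | 36 | 5.4%
3 | 34 | 5.1%
6 | 31 | 4.7%
5 | 30 | 4.5%
4 | 28 | 4.2%
7 | 23 | 3.5%
9 | 18 | 2.7%
Other values (4) | 34 | 5.1%

Latin
Value | Count | Frequency (%)
C | 2 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1087 | 62.1%
ASCII | 664 | 37.9%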

Most frequent character per block

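ASCII
Value | Count | Frequency (%)
(space) | 320 | 48.2%
1 | 70 | 10.5%
2 | 38 | 5.7%
- | 36 | 5.4%
3 | 34 | 5.1%
6 | 31 | 4.7%
5 | 30 | 4.5%
4 | 28 | 4.2%
7 | 23 | 3.5%
9 | 18 | 2.7%
Other values (5) | 36 | 5.4%

Hangul
Value | Count | Frequency (%)
(not recovered) | 112 | 10.3%
(not recovered) | 100 | 9.2%
(not recovered) | 84 | 7.7%
(not recovered) | 84 | 7.7%
(not recovered) | 79 | 7.3%
(not recovered) | 78 | 7.2%
(not recovered) | 78 | 7.2%
(not recovered) | 63 | 5.8%
(not recovered) | 43 | 4.0%
(not recovered) | 35 | 3.2%
Other values (60) | 331 | 30.5%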

소재지전화
Text

MISSING 

Distinct: 74
Distinct (%): 98.7%
Missing: 3
Missing (%): 3.8%
Memory size: 756.0 B

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 900
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 73
Unique (%): 97.3%

Sample

1st row: 043-873-6767
2nd row: 043-873-2111
3rd row: 043-872-0084
4th row: 043-872-0700
5th row: 043-872-2640

Value | Count | Frequency (%)
043-877-8232 | 2 | 2.7%
043-882-9901 | 1 | 1.3%
043-877-2319 | 1 | 1.3%
043-877-0114 | 1 | 1.3%
043-877-4679 | 1 | 1.3%
0438-8212-95 | 1 | 1.3%
043-877-1400 | 1 | 1.3%
043-883-0227 | 1 | 1.3%
043-883-4380 | 1 | 1.3%
043-878-8505 | 1 | 1.3%
Other values (64) | 64 | 85.3%
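
One value in the table above, 0438-8212-95, has its hyphens in different positions from the other listed numbers even though it has the same length of 12 characters. A minimal sketch of a format check that would flag such entries follows. The 3-3-4 digit pattern is an assumption inferred from the sample values shown here, not a general rule for Korean phone numbers, and the helper name `malformed_numbers` is hypothetical.

```python
import re

# Assumed layout for this dataset only: 3 digits, 3 digits, 4 digits (e.g. 043-873-6767).
PHONE_PATTERN = re.compile(r"\d{3}-\d{3}-\d{4}")

def malformed_numbers(values):
    """Return the non-missing values that do not match the assumed layout."""
    return [v for v in values if v is not None and not PHONE_PATTERN.fullmatch(v)]

# A few values copied from the report; None stands in for a missing cell.
sample = ["043-877-8232", "043-882-9901", "0438-8212-95", None]
print(malformed_numbers(sample))  # ['0438-8212-95']
```

Missing cells are skipped rather than flagged, since the report already tracks them separately under the Missing alert.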

Most occurring characters

Value | Count | Frequency (%)
- | 150 | 16.7%
8 | 137 | 15.2%
0 | 123 | 13.7%
3 | 119 | 13.2%
4 | 91 | 10.1%
7 | 89 | 9.9%
1 | 56 | 6.2%
2 | 48 | 5.3%
6 | 32 | 3.6%
5 | 29 | 3.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 750 | 83.3%
Dash Punctuation | 150 | 16.7%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
8 | 137 | 18.3%
0 | 123 | 16.4%
3 | 119 | 15.9%
4 | 91 | 12.1%
7 | 89 | 11.9%
1 | 56 | 7.5%
2 | 48 | 6.4%
6 | 32 | 4.3%
5 | 29 | 3.9%
9 | 26 | 3.5%

Dash Punctuation
Value | Count | Frequency (%)
- | 150 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 900 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
- | 150 | 16.7%
8 | 137 | 15.2%
0 | 123 | 13.7%
3 | 119 | 13.2%
4 | 91 | 10.1%
7 | 89 | 9.9%
1 | 56 | 6.2%
2 | 48 | 5.3%
6 | 32 | 3.6%
5 | 29 | 3.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 900 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 150 | 16.7%
8 | 137 | 15.2%
0 | 123 | 13.7%
3 | 119 | 13.2%
4 | 91 | 10.1%
7 | 89 | 9.9%
1 | 56 | 6.2%
2 | 48 | 5.3%
6 | 32 | 3.6%
5 | 29 | 3.2%

기준일자
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 756.0 B

Value | Count
2023-08-21 | 78

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-08-21
2nd row: 2023-08-21
3rd row: 2023-08-21
4th row: 2023-08-21
5th row: 2023-08-21

Common Values

Value | Count | Frequency (%)
2023-08-21 | 78 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values of 기준일자]

Correlations

Variable | 업종명 | 업소명 | 영업소 주소(도로명) | 소재지전화
업종명 | 1.000 | 1.000 | 1.000 | 1.000
업소명 | 1.000 | 1.000 | 0.995 | 0.995
영업소 주소(도로명) | 1.000 | 0.995 | 1.000 | 0.998
소재지전화 | 1.000 | 0.995 | 0.998 | 1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

Row | 업종명 | 업소명 | 영업소 주소(도로명) | 소재지전화 | 기준일자
0 | 숙박업(일반) | 황금장 | 충청북도 음성군 음성읍 한불로 44 | 043-873-6767 | 2023-08-21
1 | 숙박업(일반) | 그린힐모텔 | 충청북도 음성군 음성읍 충청대로 1488 외 3필지 | 043-873-2111 | 2023-08-21
2 | 숙박업(일반) | 스카이모텔(SKY모텔) | 충청북도 음성군 음성읍 중앙로 319 | 043-872-0084 | 2023-08-21
3 | 숙박업(일반) | 로얄모텔 | 충청북도 음성군 음성읍 중앙로 126-9 | 043-872-0700 | 2023-08-21
4 | 숙박업(일반) | 음성파크 | 충청북도 음성군 음성읍 음성천서길 167 | 043-872-2640 | 2023-08-21
5 | 숙박업(일반) | J모텔 | 충청북도 음성군 음성읍 음성천동길 108 | 043-872-2026 | 2023-08-21
6 | 숙박업(일반) | 설성파크 | 충청북도 음성군 음성읍 음성로 79 | 043-873-2572 | 2023-08-21
7 | 숙박업(일반) | 자바라무인텔 | 충청북도 음성군 음성읍 음성로 545-12 | 043-872-6997 | 2023-08-21
8 | 숙박업(일반) | 음성관광호텔 | 충청북도 음성군 음성읍 음성로 194 | 043-873-8881 | 2023-08-21
9 | 숙박업(일반) | 대원장여관 | 충청북도 음성군 음성읍 시장로115번길 5 | 043-872-1122 | 2023-08-21

Row | 업종명 | 업소명 | 영업소 주소(도로명) | 소재지전화 | 기준일자
68 | 숙박업(일반) | 제일여인숙 | 충청북도 음성군 감곡면 장감로143번길 3-6 | 043-881-2041 | 2023-08-21
69 | 숙박업(일반) | 롯데여관 | 충청북도 음성군 감곡면 장감로132번길 32-2 | 043-881-3735 | 2023-08-21
70 | 숙박업(일반) | 온천장여관 | 충청북도 음성군 감곡면 장감로131번길 5 | 043-881-2461 | 2023-08-21
71 | 숙박업(일반) | 골드파크 | 충청북도 음성군 감곡면 장감로124번길 6 | 043-881-3367 | 2023-08-21
72 | 숙박업(일반) | 서울장여관 | 충청북도 음성군 감곡면 왕장안길 17 | 043-872-2253 | 2023-08-21
73 | 숙박업(생활) | 숲속정원 | 충청북도 음성군 감곡면 사곡길69번길 185-45 나동 | <NA> | 2023-08-21
74 | 숙박업(일반) | 발리무인텔 | 충청북도 음성군 감곡면 사곡길 66 | <NA> | 2023-08-21
75 | 숙박업(일반) | 썬모텔 | 충청북도 음성군 감곡면 북부로 146 | 043-881-0041 | 2023-08-21
76 | 숙박업(일반) | 그린파크 | 충청북도 음성군 감곡면 가곡로 642 | 043-881-6689 | 2023-08-21
77 | 숙박업(일반) | 몽마르뜨모텔 | 충청북도 음성군 감곡면 가곡로 417-3 | 043-882-9400 | 2023-08-21