Overview

Dataset statistics

Number of variables: 4
Number of observations: 45
Missing cells: 15
Missing cells (%): 8.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.6 KiB
Average record size in memory: 35.9 B
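
The missing-cell percentage can be checked by hand from the other figures in this table. The following is a minimal sketch of that arithmetic; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics table.
n_variables = 4
n_observations = 45
missing_cells = 15

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{missing_cells} of {total_cells} cells missing ({missing_pct:.1f}%)")
```

All 15 missing cells belong to a single variable, so the per-variable rate (15 of 45) is four times the dataset-wide rate.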

Variable types

Text: 3
Categorical: 1

Dataset

Description: Information on guesthouse (minbak) facilities in Yesan-gun (business name, phone number, number of rooms, address)
Author: Chungcheongnam-do (충청남도)
URL: https://alldam.chungnam.go.kr/index.chungnam?menuCd=DOM_000000201001001001&st=&cds=&orgCd=&apiType=&isOpen=Y&pageIndex=409&beforeMenuCd=DOM_000000201001001000&publicdatapk=15049862

Alerts

연락처 (contact number) has 15 (33.3%) missing values [Missing]
시설명 (facility name) has unique values [Unique]

Reproduction

Analysis started: 2024-01-09 22:50:26.414112
Analysis finished: 2024-01-09 22:50:26.777323
Duration: 0.36 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
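
The reported duration follows from the two timestamps. This is a small sketch of the subtraction using only the standard library; it does not rerun the profiling itself.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-01-09 22:50:26.414112")
finished = datetime.fromisoformat("2024-01-09 22:50:26.777323")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```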

Variables

시설명 (facility name)
Text

UNIQUE 

Distinct: 45
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 492.0 B

Length

Max length: 12
Median length: 8
Mean length: 4.3555556
Min length: 2

Characters and Unicode

Total characters: 196
Distinct characters: 110
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
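
As an illustration of such character properties, the sketch below counts Unicode general categories for the first five sample values of this variable using Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, and the module exposes general categories only, not scripts or blocks, so this covers just one of the three breakdowns the report shows.

```python
import unicodedata
from collections import Counter

# First five sample values of the 시설명 variable, copied from the report.
names = ["숲과 집", "산마루", "대흥민박", "해피트리", "가야산펜션"]

def category_counts(values):
    """Count Unicode general categories (such as 'Lo' or 'Zs') over all characters."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)

counts = category_counts(names)
for category, count in counts.most_common():
    print(category, count)
```

Here `Lo` is "Other Letter" (Hangul syllables fall in this category) and `Zs` is "Space Separator", the same names the report uses in its category tables.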

Unique

Unique: 45
Unique (%): 100.0%

Sample

1st row: 숲과 집
2nd row: 산마루
3rd row: 대흥민박
4th row: 해피트리
5th row: 가야산펜션

Value                Count  Frequency (%)
숲과                 1      2.1%
하늘채펜션           1      2.1%
가야사의하루         1      2.1%
알콩달콩             1      2.1%
뉴캐슬               1      2.1%
가야펜션             1      2.1%
용고랑               1      2.1%
하늘정원             1      2.1%
해피죤               1      2.1%
숲속의정원           1      2.1%
Other values (37)    37     78.7%

Most occurring characters

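Value                Count  Frequency (%)
(Hangul syllable)    8      4.1%
(Hangul syllable)    7      3.6%
(Hangul syllable)    7      3.6%
(Hangul syllable)    6      3.1%
(Hangul syllable)    5      2.6%
(Hangul syllable)    4      2.0%
(Hangul syllable)    4      2.0%
(Hangul syllable)    4      2.0%
(Hangul syllable)    4      2.0%
(Hangul syllable)    4      2.0%
Other values (100)   143    73.0%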

Most occurring categories

Value                Count  Frequency (%)
Other Letter         183    93.4%
Uppercase Letter     8      4.1%
Space Separator      2      1.0%
Close Punctuation    1      0.5%
Open Punctuation     1      0.5%
Decimal Number       1      0.5%

Most frequent character per category

Other Letter
Value                Count  Frequency (%)
(Hangul syllable)    8      4.4%
(Hangul syllable)    7      3.8%
(Hangul syllable)    7      3.8%
(Hangul syllable)    6      3.3%
(Hangul syllable)    5      2.7%
(Hangul syllable)    4      2.2%
(Hangul syllable)    4      2.2%
(Hangul syllable)    4      2.2%
(Hangul syllable)    4      2.2%
(Hangul syllable)    4      2.2%
Other values (90)    130    71.0%

Uppercase Letter
Value                Count  Frequency (%)
E                    3      37.5%
M                    1      12.5%
N                    1      12.5%
W                    1      12.5%
T                    1      12.5%
B                    1      12.5%

Space Separator
Value                Count  Frequency (%)
(space)              2      100.0%

Close Punctuation
Value                Count  Frequency (%)
)                    1      100.0%

Open Punctuation
Value                Count  Frequency (%)
(                    1      100.0%

Decimal Number
Value                Count  Frequency (%)
7                    1      100.0%

Most occurring scripts

Value                Count  Frequency (%)
Hangul               183    93.4%
Latin                8      4.1%
Common               5      2.6%

Most frequent character per script

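Hangul
Value                Count  Frequency (%)
(Hangul syllable)    8      4.4%
(Hangul syllable)    7      3.8%
(Hangul syllable)    7      3.8%
(Hangul syllable)    6      3.3%
(Hangul syllable)    5      2.7%
(Hangul syllable)    4      2.2%
(Hangul syllable)    4      2.2%
(Hangul syllable)    4      2.2%
(Hangul syllable)    4      2.2%
(Hangul syllable)    4      2.2%
Other values (90)    130    71.0%

Latin
Value                Count  Frequency (%)
E                    3      37.5%
M                    1      12.5%
N                    1      12.5%
W                    1      12.5%
T                    1      12.5%
B                    1      12.5%

Common
Value                Count  Frequency (%)
(space)              2      40.0%
)                    1      20.0%
(                    1      20.0%
7                    1      20.0%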

Most occurring blocks

Value                Count  Frequency (%)
Hangul               183    93.4%
ASCII                13     6.6%

Most frequent character per block

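Hangul
Value                Count  Frequency (%)
(Hangul syllable)    8      4.4%
(Hangul syllable)    7      3.8%
(Hangul syllable)    7      3.8%
(Hangul syllable)    6      3.3%
(Hangul syllable)    5      2.7%
(Hangul syllable)    4      2.2%
(Hangul syllable)    4      2.2%
(Hangul syllable)    4      2.2%
(Hangul syllable)    4      2.2%
(Hangul syllable)    4      2.2%
Other values (90)    130    71.0%

ASCII
Value                Count  Frequency (%)
E                    3      23.1%
(space)              2      15.4%
M                    1      7.7%
)                    1      7.7%
N                    1      7.7%
W                    1      7.7%
T                    1      7.7%
B                    1      7.7%
(                    1      7.7%
7                    1      7.7%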

연락처 (contact number)
Text

MISSING 

Distinct: 29
Distinct (%): 96.7%
Missing: 15
Missing (%): 33.3%
Memory size: 492.0 B

Length

Max length: 13
Median length: 12
Mean length: 12.166667
Min length: 12

Characters and Unicode

Total characters: 365
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 28
Unique (%): 93.3%

Sample

1st row: 041-331-3312
2nd row: 070-4197-0815
3rd row: 070-4200-0247
4th row: 070-5088-0132
5th row: 041-337-6000

Value                Count  Frequency (%)
070-5088-0132        2      6.7%
041-337-7495         1      3.3%
041-337-0001         1      3.3%
041-337-6163         1      3.3%
041-333-1110         1      3.3%
041-331-3533         1      3.3%
041-332-2540         1      3.3%
041-335-5071         1      3.3%
041-337-1153         1      3.3%
041-338-7755         1      3.3%
Other values (19)    19     63.3%
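
The sample values suggest a fixed phone-number shape, which would be consistent with the reported minimum and maximum lengths of 12 and 13. The sketch below checks values against that shape; the pattern and the helper `is_valid_phone` are assumptions inferred from the sample rows, not something the report or the data publisher specifies.

```python
import re

# Assumed shape: 3-digit prefix starting with 0, then 3 or 4 digits, then 4 digits.
PHONE_RE = re.compile(r"0\d{2}-\d{3,4}-\d{4}")

# The five sample values shown for the 연락처 variable.
samples = [
    "041-331-3312",
    "070-4197-0815",
    "070-4200-0247",
    "070-5088-0132",
    "041-337-6000",
]

def is_valid_phone(value):
    """Return True when value is a string matching the assumed pattern."""
    return isinstance(value, str) and PHONE_RE.fullmatch(value) is not None

for value in samples + [None]:
    print(value, is_valid_phone(value))
```

Missing values are treated as invalid here; a real cleaning step might instead keep them and report them separately.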

Most occurring characters

Value                Count  Frequency (%)
3                    73     20.0%
-                    60     16.4%
0                    59     16.2%
1                    46     12.6%
4                    36     9.9%
7                    30     8.2%
8                    17     4.7%
5                    16     4.4%
2                    14     3.8%
6                    9      2.5%

Most occurring categories

Value                Count  Frequency (%)
Decimal Number       305    83.6%
Dash Punctuation     60     16.4%

Most frequent character per category

Decimal Number
Value                Count  Frequency (%)
3                    73     23.9%
0                    59     19.3%
1                    46     15.1%
4                    36     11.8%
7                    30     9.8%
8                    17     5.6%
5                    16     5.2%
2                    14     4.6%
6                    9      3.0%
9                    5      1.6%

Dash Punctuation
Value                Count  Frequency (%)
-                    60     100.0%

Most occurring scripts

Value                Count  Frequency (%)
Common               365    100.0%

Most frequent character per script

Common
Value                Count  Frequency (%)
3                    73     20.0%
-                    60     16.4%
0                    59     16.2%
1                    46     12.6%
4                    36     9.9%
7                    30     8.2%
8                    17     4.7%
5                    16     4.4%
2                    14     3.8%
6                    9      2.5%

Most occurring blocks

Value                Count  Frequency (%)
ASCII                365    100.0%

Most frequent character per block

ASCII
Value                Count  Frequency (%)
3                    73     20.0%
-                    60     16.4%
0                    59     16.2%
1                    46     12.6%
4                    36     9.9%
7                    30     8.2%
8                    17     4.7%
5                    16     4.4%
2                    14     3.8%
6                    9      2.5%

객실수 (number of rooms)
Categorical

Distinct: 5
Distinct (%): 11.1%
Missing: 0
Missing (%): 0.0%
Memory size: 492.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 3
3rd row: 3
4th row: 2
5th row: 5

Common Values

Value                Count  Frequency (%)
3                    14     31.1%
2                    10     22.2%
4                    9      20.0%
5                    6      13.3%
1                    6      13.3%
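
The frequencies in this table are each count divided by the 45 observations. The sketch below reproduces them from the counts and, assuming the category labels can be read as integer room counts (the report itself treats them as categories), also derives the total number of rooms.

```python
# Counts copied from the Common Values table; keys are the category labels.
room_counts = {"3": 14, "2": 10, "4": 9, "5": 6, "1": 6}

total = sum(room_counts.values())
for value, count in room_counts.items():
    print(f"{value}: {count} ({100 * count / total:.1f}%)")

# Assumption: each label is a number of rooms, so it can be cast to int.
total_rooms = sum(int(value) * count for value, count in room_counts.items())
print(f"total rooms: {total_rooms}, mean per facility: {total_rooms / total:.2f}")
```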

Length

Histogram of lengths of the category

Common Values (Plot)

Value                Count  Frequency (%)
3                    14     31.1%
2                    10     22.2%
4                    9      20.0%
5                    6      13.3%
1                    6      13.3%

주소 (address)
Text

Distinct: 43
Distinct (%): 95.6%
Missing: 0
Missing (%): 0.0%
Memory size: 492.0 B

Length

Max length: 24
Median length: 23
Mean length: 21.6
Min length: 19

Characters and Unicode

Total characters: 972
Distinct characters: 67
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 41
Unique (%): 91.1%

Sample

1st row: 충청남도 예산군 삽교읍 삽교역로 89
2nd row: 충청남도 예산군 덕산면 덕산온천로 2
3rd row: 충청남도 예산군 대흥면 예당로 813-7
4th row: 충청남도 예산군 덕산면 대치난길15-10
5th row: 충청남도 예산군 덕산면 남은들로 68

Value                Count  Frequency (%)
충청남도             45     20.1%
예산군               45     20.1%
덕산면               28     12.5%
응봉면               7      3.1%
예당관광로           5      2.2%
대치남길             5      2.2%
대치6길              5      2.2%
대흥면               5      2.2%
예당로               4      1.8%
남은들로             3      1.3%
Other values (63)    72     32.1%
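
The entries in this table look like whitespace-separated tokens of the address strings rather than whole addresses. The sketch below applies that tokenisation to the five sample addresses of this variable; whether the profiling tool splits text exactly this way is an assumption, and the helper `token_counts` is illustrative.

```python
from collections import Counter

# The five sample addresses shown for the 주소 variable, copied verbatim.
addresses = [
    "충청남도 예산군 삽교읍 삽교역로 89",
    "충청남도 예산군 덕산면 덕산온천로 2",
    "충청남도 예산군 대흥면 예당로 813-7",
    "충청남도 예산군 덕산면 대치난길15-10",
    "충청남도 예산군 덕산면 남은들로 68",
]

def token_counts(values):
    """Split each value on whitespace and count the resulting tokens."""
    return Counter(token for value in values for token in value.split())

counts = token_counts(addresses)
for token, count in counts.most_common(3):
    print(token, count)
```

The fourth sample address has no space between the road name and the building number, so under this tokenisation it yields the single token `대치난길15-10` instead of a road-name token and a number token.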

Most occurring characters

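Value                Count  Frequency (%)
(space)              179    18.4%
(Hangul syllable)    78     8.0%
(Hangul syllable)    56     5.8%
(Hangul syllable)    55     5.7%
(Hangul syllable)    45     4.6%
(Hangul syllable)    45     4.6%
(Hangul syllable)    45     4.6%
(Hangul syllable)    45     4.6%
(Hangul syllable)    44     4.5%
(Hangul syllable)    33     3.4%
Other values (57)    347    35.7%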

Most occurring categories

Value                Count  Frequency (%)
Other Letter         629    64.7%
Space Separator      179    18.4%
Decimal Number       146    15.0%
Dash Punctuation     18     1.9%

Most frequent character per category

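Other Letter
Value                Count  Frequency (%)
(Hangul syllable)    78     12.4%
(Hangul syllable)    56     8.9%
(Hangul syllable)    55     8.7%
(Hangul syllable)    45     7.2%
(Hangul syllable)    45     7.2%
(Hangul syllable)    45     7.2%
(Hangul syllable)    45     7.2%
(Hangul syllable)    44     7.0%
(Hangul syllable)    33     5.2%
(Hangul syllable)    23     3.7%
Other values (45)    160    25.4%

Decimal Number
Value                Count  Frequency (%)
1                    24     16.4%
2                    22     15.1%
6                    19     13.0%
5                    18     12.3%
8                    16     11.0%
4                    14     9.6%
3                    13     8.9%
7                    9      6.2%
0                    6      4.1%
9                    5      3.4%

Space Separator
Value                Count  Frequency (%)
(space)              179    100.0%

Dash Punctuation
Value                Count  Frequency (%)
-                    18     100.0%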

Most occurring scripts

Value                Count  Frequency (%)
Hangul               629    64.7%
Common               343    35.3%

Most frequent character per script

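Hangul
Value                Count  Frequency (%)
(Hangul syllable)    78     12.4%
(Hangul syllable)    56     8.9%
(Hangul syllable)    55     8.7%
(Hangul syllable)    45     7.2%
(Hangul syllable)    45     7.2%
(Hangul syllable)    45     7.2%
(Hangul syllable)    45     7.2%
(Hangul syllable)    44     7.0%
(Hangul syllable)    33     5.2%
(Hangul syllable)    23     3.7%
Other values (45)    160    25.4%

Common
Value                Count  Frequency (%)
(space)              179    52.2%
1                    24     7.0%
2                    22     6.4%
6                    19     5.5%
-                    18     5.2%
5                    18     5.2%
8                    16     4.7%
4                    14     4.1%
3                    13     3.8%
7                    9      2.6%
Other values (2)     11     3.2%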

Most occurring blocks

Value                Count  Frequency (%)
Hangul               629    64.7%
ASCII                343    35.3%

Most frequent character per block

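ASCII
Value                Count  Frequency (%)
(space)              179    52.2%
1                    24     7.0%
2                    22     6.4%
6                    19     5.5%
-                    18     5.2%
5                    18     5.2%
8                    16     4.7%
4                    14     4.1%
3                    13     3.8%
7                    9      2.6%
Other values (2)     11     3.2%

Hangul
Value                Count  Frequency (%)
(Hangul syllable)    78     12.4%
(Hangul syllable)    56     8.9%
(Hangul syllable)    55     8.7%
(Hangul syllable)    45     7.2%
(Hangul syllable)    45     7.2%
(Hangul syllable)    45     7.2%
(Hangul syllable)    45     7.2%
(Hangul syllable)    44     7.0%
(Hangul syllable)    33     5.2%
(Hangul syllable)    23     3.7%
Other values (45)    160    25.4%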

Correlations

          시설명   연락처   객실수   주소
시설명    1.000    1.000    1.000    1.000
연락처    1.000    1.000    1.000    0.990
객실수    1.000    1.000    1.000    0.956
주소      1.000    0.990    0.956    1.000
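
The report does not name the association measure behind this matrix. Assuming it is a Cramér's V style statistic for categorical pairs, the sketch below shows why a column whose values are all distinct (as 시설명 is) scores 1.0 against any other column. The function `cramers_v` is a plain, uncorrected illustration, not the profiling tool's implementation, which may apply a bias correction.

```python
from collections import Counter
from math import sqrt

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long sequences of labels."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    joint = Counter(zip(xs, ys))
    chi2 = 0.0
    for r, r_count in row_totals.items():
        for c, c_count in col_totals.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association
    return sqrt(chi2 / (n * k))

# Hypothetical toy data: an all-distinct column against a repeating one.
unique_ids = ["a", "b", "c", "d", "e", "f"]
rooms = ["3", "3", "2", "2", "5", "5"]
print(cramers_v(unique_ids, rooms))
```

With nearly unique text columns and only 45 rows, values at or near 1.000 are therefore expected and say little about any real relationship between the variables.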

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
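
A nullity matrix is just a boolean table marking which cells are missing. The sketch below builds one for three records taken from the report's sample rows (indices 0, 8 and 9), representing `<NA>` as `None`; the variable names are illustrative.

```python
# Three records copied from the report's sample rows; <NA> is represented as None.
rows = [
    {"시설명": "숲과 집", "연락처": None, "객실수": "3",
     "주소": "충청남도 예산군 삽교읍 삽교역로 89"},
    {"시설명": "피플앤도그힐링", "연락처": "041-331-3312", "객실수": "1",
     "주소": "충청남도 예산군 신양면 귀곡동절길 28"},
    {"시설명": "덕산힐링하우스", "연락처": "070-4197-0815", "객실수": "2",
     "주소": "충청남도 예산군 덕산면 흥덕서로 943-12"},
]

columns = list(rows[0])

# One row of flags per record: True where the cell is missing.
nullity_matrix = [[row[col] is None for col in columns] for row in rows]

# Column-wise missing counts, the quantity shown in the per-column bar chart.
missing_per_column = {
    col: sum(flags[i] for flags in nullity_matrix)
    for i, col in enumerate(columns)
}
print(missing_per_column)
```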

Sample

First rows

    시설명            연락처           객실수  주소
0   숲과 집           <NA>             3       충청남도 예산군 삽교읍 삽교역로 89
1   산마루            <NA>             3       충청남도 예산군 덕산면 덕산온천로 2
2   대흥민박          <NA>             3       충청남도 예산군 대흥면 예당로 813-7
3   해피트리          <NA>             2       충청남도 예산군 덕산면 대치난길15-10
4   가야산펜션        <NA>             5       충청남도 예산군 덕산면 남은들로 68
5   참살이황토집      <NA>             3       충청남도 예산군 대흥면 동서길 86
6   초록수채화        <NA>             3       충청남도 예산군 덕산면 대치남길 15-13
7   글로리아          <NA>             4       충청남도 예산군 대흥면 임존성길 64
8   피플앤도그힐링    041-331-3312     1       충청남도 예산군 신양면 귀곡동절길 28
9   덕산힐링하우스    070-4197-0815    2       충청남도 예산군 덕산면 흥덕서로 943-12

Last rows

    시설명            연락처           객실수  주소
35  예당노블레스      <NA>             3       충청남도 예산군 응봉면 예당로 1127
36  인디아나          <NA>             2       충청남도 예산군 덕산면 대치남길 25
37  꽃밭들            041-337-1153     2       충청남도 예산군 봉산면 화전1길 35-1
38  양천              041-335-5071     3       충청남도 예산군 응봉면 예당관광로 244
39  돌고래            041-332-2540     4       충청남도 예산군 응봉면 예당관광로 180
40  예촌사랑          041-331-3533     5       충청남도 예산군 응봉면 예당관광로 205
41  붕어나라          041-333-1110     3       충청남도 예산군 광시면 예당남로 62-32
42  황토              041-337-6163     4       충청남도 예산군 덕산면 가루실길 123-5
43  김가              041-337-5138     3       충청남도 예산군 덕산면 가루실안길 39-7
44  호반팬션          <NA>             4       충청남도 예산군 대흥면 예당로 781-14