Overview

Dataset statistics

Number of variables: 4
Number of observations: 85
Missing cells: 39
Missing cells (%): 11.5%
Duplicate rows: 1
Duplicate rows (%): 1.2%
Total size in memory: 2.8 KiB
Average record size in memory: 33.6 B
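
The missing-cell share above can be cross-checked by hand: 39 missing cells out of 85 × 4 = 340 cells. The sketch below is only an illustration under stated assumptions: `missing_cell_stats` is a hypothetical helper, not part of ydata-profiling, and the rows are placeholder data built to mirror the reported counts (72 filled rows, 13 rows with three missing cells), not the real records.

```python
from typing import Optional, Sequence, Tuple

Row = Sequence[Optional[str]]


def missing_cell_stats(rows: Sequence[Row]) -> Tuple[int, float]:
    """Return (missing cell count, missing share in percent, 1 decimal)."""
    total_cells = sum(len(row) for row in rows)
    missing = sum(1 for row in rows for cell in row if cell is None)
    share = 100.0 * missing / total_cells if total_cells else 0.0
    return missing, round(share, 1)


# Placeholder rows (assumption): 72 complete records plus 13 records whose
# first three cells are missing, matching the counts in the statistics above.
filled = [("name", "address", "055-000-0000", "2021-05-06")] * 72
empty = [(None, None, None, "<NA>")] * 13

print(missing_cell_stats(filled + empty))  # (39, 11.5)
```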

Variable types

Text: 3
Categorical: 1

Dataset

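Description: Status of medical and pharmaceutical establishments (pharmacies, hospitals and clinics, public health centers, etc.) within Hamyang-gun, Gyeongsangnam-do. Each record consists of the establishment name (업소명), location address (소재지), contact number (전화번호), and data reference date (데이터기준일자).
Author: Hamyang-gun, Gyeongsangnam-do (경상남도 함양군)
URL: https://www.data.go.kr/data/3064792/fileData.do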

Alerts

Dataset has 1 (1.2%) duplicate rows (Duplicates)
업소명 has 13 (15.3%) missing values (Missing)
소재지 has 13 (15.3%) missing values (Missing)
전화번호 has 13 (15.3%) missing values (Missing)

Reproduction

Analysis started: 2023-12-12 09:25:11.138576
Analysis finished: 2023-12-12 09:25:11.774339
Duration: 0.64 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업소명
Text

MISSING 

Distinct: 72
Distinct (%): 100.0%
Missing: 13
Missing (%): 15.3%
Memory size: 812.0 B

Length

Max length: 12
Median length: 11
Mean length: 6.3472222
Min length: 3
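
The reported median length (11) sits well above the mean (about 6.35), so recomputing the length statistics directly from the column is a cheap cross-check. A minimal sketch, assuming missing values are represented as `None` and excluded; `length_stats` is a hypothetical helper, and the input is only the five sample names shown in this report plus one missing entry.

```python
import statistics


def length_stats(values):
    """Summarise string lengths, skipping missing (None) entries."""
    lengths = [len(v) for v in values if v is not None]
    return {
        "max": max(lengths),
        "median": statistics.median(lengths),
        "mean": statistics.mean(lengths),
        "min": min(lengths),
    }


# Sample values from this report; None stands in for a missing cell.
names = ["함양군보건소", "마천보건지소", "휴천보건지소", "유림보건지소", "수동보건지소", None]
print(length_stats(names))
```

Whether the profiler counts missing entries or a string such as "<NA>" towards these lengths is not stated in the report, so a mismatch with the figures above would point at that convention first.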

Characters and Unicode

Total characters: 457
Distinct characters: 120
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
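
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for one address from this report; `category_counts` is a hypothetical helper, and the two-letter codes map to the names used in the report (`Lo` = Other Letter, `Zs` = Space Separator, `Nd` = Decimal Number, `Pd` = Dash Punctuation).

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category code (e.g. 'Lo', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)


counts = category_counts("경상남도 함양군 마천면 천왕봉로 1144-2")
for code, n in counts.most_common():
    print(code, n)
```

The standard library does not expose script or block names, so the script and block breakdowns would need a separate lookup table or a third-party package.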

Unique

Unique: 72
Unique (%): 100.0%

Sample

1st row: 함양군보건소
2nd row: 마천보건지소
3rd row: 휴천보건지소
4th row: 유림보건지소
5th row: 수동보건지소
Value | Count | Frequency (%)
하약국 | 2 | 2.7%
마천보건지소 | 1 | 1.4%
단아미소치과의원 | 1 | 1.4%
새동산약국 | 1 | 1.4%
미래온누리약국 | 1 | 1.4%
안의치과의원 | 1 | 1.4%
현대치과의원 | 1 | 1.4%
박애치과의원 | 1 | 1.4%
상아치과의원 | 1 | 1.4%
효치과의원 | 1 | 1.4%
Other values (63) | 63 | 85.1%

Most occurring characters

Value | Count | Frequency (%)
 | 40 | 8.8%
 | 31 | 6.8%
 | 25 | 5.5%
 | 23 | 5.0%
 | 23 | 5.0%
 | 21 | 4.6%
 | 20 | 4.4%
 | 18 | 3.9%
 | 12 | 2.6%
 | 11 | 2.4%
Other values (110) | 233 | 51.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 455 | 99.6%
Space Separator | 2 | 0.4%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 40 | 8.8%
 | 31 | 6.8%
 | 25 | 5.5%
 | 23 | 5.1%
 | 23 | 5.1%
 | 21 | 4.6%
 | 20 | 4.4%
 | 18 | 4.0%
 | 12 | 2.6%
 | 11 | 2.4%
Other values (109) | 231 | 50.8%
Space Separator
Value | Count | Frequency (%)
(space) | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 455 | 99.6%
Common | 2 | 0.4%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 40 | 8.8%
 | 31 | 6.8%
 | 25 | 5.5%
 | 23 | 5.1%
 | 23 | 5.1%
 | 21 | 4.6%
 | 20 | 4.4%
 | 18 | 4.0%
 | 12 | 2.6%
 | 11 | 2.4%
Other values (109) | 231 | 50.8%
Common
Value | Count | Frequency (%)
(space) | 2 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 455 | 99.6%
ASCII | 2 | 0.4%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 40 | 8.8%
 | 31 | 6.8%
 | 25 | 5.5%
 | 23 | 5.1%
 | 23 | 5.1%
 | 21 | 4.6%
 | 20 | 4.4%
 | 18 | 4.0%
 | 12 | 2.6%
 | 11 | 2.4%
Other values (109) | 231 | 50.8%
ASCII
Value | Count | Frequency (%)
(space) | 2 | 100.0%

소재지
Text

MISSING 

Distinct: 68
Distinct (%): 94.4%
Missing: 13
Missing (%): 15.3%
Memory size: 812.0 B

Length

Max length: 37
Median length: 31
Mean length: 21.902778
Min length: 18

Characters and Unicode

Total characters: 1577
Distinct characters: 94
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 64
Unique (%): 88.9%

Sample

1st row: 경상남도 함양군 함양읍 한들로 141
2nd row: 경상남도 함양군 마천면 천왕봉로 1144-2
3rd row: 경상남도 함양군 휴천면 함양남서로 513
4th row: 경상남도 함양군 유림면 천왕봉로 2872-4
5th row: 경상남도 함양군 수동면 수동내동길 15
Value | Count | Frequency (%)
경상남도 | 72 | 18.9%
함양군 | 72 | 18.9%
함양읍 | 42 | 11.1%
용평중앙길 | 14 | 3.7%
2층 | 11 | 2.9%
안의면 | 10 | 2.6%
고운로 | 7 | 1.8%
함양로 | 7 | 1.8%
용평길 | 4 | 1.1%
서상면 | 4 | 1.1%
Other values (103) | 137 | 36.1%

Most occurring characters

Value | Count | Frequency (%)
(space) | 308 | 19.5%
 | 123 | 7.8%
 | 123 | 7.8%
 | 79 | 5.0%
 | 76 | 4.8%
 | 72 | 4.6%
 | 72 | 4.6%
 | 72 | 4.6%
1 | 51 | 3.2%
 | 43 | 2.7%
Other values (84) | 558 | 35.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1028 | 65.2%
Space Separator | 308 | 19.5%
Decimal Number | 201 | 12.7%
Dash Punctuation | 16 | 1.0%
Other Punctuation | 14 | 0.9%
Close Punctuation | 4 | 0.3%
Open Punctuation | 4 | 0.3%
Uppercase Letter | 2 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 123 | 12.0%
 | 123 | 12.0%
 | 79 | 7.7%
 | 76 | 7.4%
 | 72 | 7.0%
 | 72 | 7.0%
 | 72 | 7.0%
 | 43 | 4.2%
 | 42 | 4.1%
 | 31 | 3.0%
Other values (67) | 295 | 28.7%
Decimal Number
Value | Count | Frequency (%)
1 | 51 | 25.4%
2 | 32 | 15.9%
3 | 31 | 15.4%
4 | 23 | 11.4%
7 | 19 | 9.5%
0 | 12 | 6.0%
6 | 10 | 5.0%
8 | 10 | 5.0%
5 | 8 | 4.0%
9 | 5 | 2.5%
Uppercase Letter
Value | Count | Frequency (%)
A | 1 | 50.0%
B | 1 | 50.0%
Space Separator
Value | Count | Frequency (%)
(space) | 308 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 16 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
, | 14 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 4 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 4 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1028 | 65.2%
Common | 547 | 34.7%
Latin | 2 | 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 123 | 12.0%
 | 123 | 12.0%
 | 79 | 7.7%
 | 76 | 7.4%
 | 72 | 7.0%
 | 72 | 7.0%
 | 72 | 7.0%
 | 43 | 4.2%
 | 42 | 4.1%
 | 31 | 3.0%
Other values (67) | 295 | 28.7%
Common
Value | Count | Frequency (%)
(space) | 308 | 56.3%
1 | 51 | 9.3%
2 | 32 | 5.9%
3 | 31 | 5.7%
4 | 23 | 4.2%
7 | 19 | 3.5%
- | 16 | 2.9%
, | 14 | 2.6%
0 | 12 | 2.2%
6 | 10 | 1.8%
Other values (5) | 31 | 5.7%
Latin
Value | Count | Frequency (%)
A | 1 | 50.0%
B | 1 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1028 | 65.2%
ASCII | 549 | 34.8%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 308 | 56.1%
1 | 51 | 9.3%
2 | 32 | 5.8%
3 | 31 | 5.6%
4 | 23 | 4.2%
7 | 19 | 3.5%
- | 16 | 2.9%
, | 14 | 2.6%
0 | 12 | 2.2%
6 | 10 | 1.8%
Other values (7) | 33 | 6.0%
Hangul
Value | Count | Frequency (%)
 | 123 | 12.0%
 | 123 | 12.0%
 | 79 | 7.7%
 | 76 | 7.4%
 | 72 | 7.0%
 | 72 | 7.0%
 | 72 | 7.0%
 | 43 | 4.2%
 | 42 | 4.1%
 | 31 | 3.0%
Other values (67) | 295 | 28.7%

전화번호
Text

MISSING 

Distinct: 72
Distinct (%): 100.0%
Missing: 13
Missing (%): 15.3%
Memory size: 812.0 B

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 864
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 72
Unique (%): 100.0%

Sample

1st row: 055-960-8010
2nd row: 055-960-5360
3rd row: 055-960-5361
4th row: 055-960-5362
5th row: 055-960-5363
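
Every non-missing value in this column is 12 characters long and uses only digits and hyphens, which suggests a fixed `NNN-NNN-NNNN` layout. A minimal validation sketch under that assumption; the pattern and the `is_valid_phone` helper are illustrative, not something the profiler applies.

```python
import re

# Assumed layout: three digits, three digits, four digits, hyphen-separated.
# [0-9] is used instead of \d so that only ASCII digits are accepted.
PHONE_PATTERN = re.compile(r"[0-9]{3}-[0-9]{3}-[0-9]{4}")


def is_valid_phone(value: str) -> bool:
    """Return True when the whole string matches the assumed phone layout."""
    return PHONE_PATTERN.fullmatch(value) is not None


samples = ["055-960-8010", "055-960-5360", "0559608010", "055-960-801"]
for s in samples:
    print(s, is_valid_phone(s))
```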
Value | Count | Frequency (%)
055-960-5360 | 1 | 1.4%
055-960-5361 | 1 | 1.4%
055-964-0067 | 1 | 1.4%
055-963-9886 | 1 | 1.4%
055-963-6108 | 1 | 1.4%
055-963-1280 | 1 | 1.4%
055-963-4058 | 1 | 1.4%
055-964-0038 | 1 | 1.4%
055-962-2875 | 1 | 1.4%
055-962-2922 | 1 | 1.4%
Other values (62) | 62 | 86.1%

Most occurring characters

Value | Count | Frequency (%)
5 | 187 | 21.6%
- | 144 | 16.7%
0 | 139 | 16.1%
6 | 94 | 10.9%
9 | 92 | 10.6%
3 | 59 | 6.8%
2 | 44 | 5.1%
7 | 33 | 3.8%
8 | 25 | 2.9%
1 | 24 | 2.8%
Other values (1) | 23 | 2.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 720 | 83.3%
Dash Punctuation | 144 | 16.7%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
5 | 187 | 26.0%
0 | 139 | 19.3%
6 | 94 | 13.1%
9 | 92 | 12.8%
3 | 59 | 8.2%
2 | 44 | 6.1%
7 | 33 | 4.6%
8 | 25 | 3.5%
1 | 24 | 3.3%
4 | 23 | 3.2%
Dash Punctuation
Value | Count | Frequency (%)
- | 144 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 864 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
5 | 187 | 21.6%
- | 144 | 16.7%
0 | 139 | 16.1%
6 | 94 | 10.9%
9 | 92 | 10.6%
3 | 59 | 6.8%
2 | 44 | 5.1%
7 | 33 | 3.8%
8 | 25 | 2.9%
1 | 24 | 2.8%
Other values (1) | 23 | 2.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 864 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
5 | 187 | 21.6%
- | 144 | 16.7%
0 | 139 | 16.1%
6 | 94 | 10.9%
9 | 92 | 10.6%
3 | 59 | 6.8%
2 | 44 | 5.1%
7 | 33 | 3.8%
8 | 25 | 2.9%
1 | 24 | 2.8%
Other values (1) | 23 | 2.7%

데이터기준일자
Categorical

Distinct: 2
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Memory size: 812.0 B

Value | Count
2021-05-06 | 72
<NA> | 13

Length

Max length: 10
Median length: 10
Mean length: 9.0823529
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021-05-06
2nd row: 2021-05-06
3rd row: 2021-05-06
4th row: 2021-05-06
5th row: 2021-05-06

Common Values

Value | Count | Frequency (%)
2021-05-06 | 72 | 84.7%
<NA> | 13 | 15.3%
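
A frequency table like the one above can be rebuilt with `collections.Counter`. The sketch assumes the column holds 72 copies of the date and 13 literal "<NA>" strings, as the counts suggest; `value_frequencies` is a hypothetical helper name.

```python
from collections import Counter


def value_frequencies(values):
    """Return (value, count, percent) tuples, most frequent first."""
    total = len(values)
    return [
        (value, count, round(100.0 * count / total, 1))
        for value, count in Counter(values).most_common()
    ]


# Placeholder column matching the reported counts (assumption).
dates = ["2021-05-06"] * 72 + ["<NA>"] * 13
for value, count, percent in value_frequencies(dates):
    print(f"{value} | {count} | {percent}%")
```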

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2021-05-06 | 72 | 84.7%
<NA> | 13 | 15.3%

Correlations

Variable | 업소명 | 소재지 | 전화번호
업소명 | 1.000 | 1.000 | 1.000
소재지 | 1.000 | 1.000 | 1.000
전화번호 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
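
One way to read nullity correlation is as the Pearson correlation between per-column "is missing" indicators. The sketch below implements that reading with the standard library only; it is an assumption about the measure, not the profiler's own code, and `nullity_correlation` and the placeholder columns are hypothetical.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Undefined when a column is never (or always) missing.
        return float("nan")
    return cov / sqrt(var_a * var_b)


# Placeholder columns (assumption): both are missing in the same 13 rows.
names = ["a"] * 72 + [None] * 13
phones = ["055-000-0000"] * 72 + [None] * 13
print(nullity_correlation(names, phones))
```

When two columns are missing in exactly the same rows, as the all-<NA> rows in this dataset suggest, the value comes out at (approximately) 1.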

Sample

Index | 업소명 | 소재지 | 전화번호 | 데이터기준일자
0 | 함양군보건소 | 경상남도 함양군 함양읍 한들로 141 | 055-960-8010 | 2021-05-06
1 | 마천보건지소 | 경상남도 함양군 마천면 천왕봉로 1144-2 | 055-960-5360 | 2021-05-06
2 | 휴천보건지소 | 경상남도 함양군 휴천면 함양남서로 513 | 055-960-5361 | 2021-05-06
3 | 유림보건지소 | 경상남도 함양군 유림면 천왕봉로 2872-4 | 055-960-5362 | 2021-05-06
4 | 수동보건지소 | 경상남도 함양군 수동면 수동내동길 15 | 055-960-5363 | 2021-05-06
5 | 지곡보건지소 | 경상남도 함양군 지곡면 함양로 1882-1 | 055-960-5364 | 2021-05-06
6 | 안의보건지소 | 경상남도 함양군 안의면 금성길 16 | 055-960-5365 | 2021-05-06
7 | 서하보건지소 | 경상남도 함양군 서하면 송계앞길 7 | 055-960-5366 | 2021-05-06
8 | 서상보건지소 | 경상남도 함양군 서상면 서상로 307 | 055-960-6367 | 2021-05-06
9 | 백전보건지소 | 경상남도 함양군 백전면 함양남서로 2355 | 055-960-5368 | 2021-05-06

Index | 업소명 | 소재지 | 전화번호 | 데이터기준일자
75 | <NA> | <NA> | <NA> | <NA>
76 | <NA> | <NA> | <NA> | <NA>
77 | <NA> | <NA> | <NA> | <NA>
78 | <NA> | <NA> | <NA> | <NA>
79 | <NA> | <NA> | <NA> | <NA>
80 | <NA> | <NA> | <NA> | <NA>
81 | <NA> | <NA> | <NA> | <NA>
82 | <NA> | <NA> | <NA> | <NA>
83 | <NA> | <NA> | <NA> | <NA>
84 | <NA> | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

Index | 업소명 | 소재지 | 전화번호 | 데이터기준일자 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | 13
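
The table above reports one distinct row pattern that occurs 13 times. A minimal sketch of how such a summary could be derived, assuming rows are tuples and fully empty rows hold `None` in every cell; `duplicate_summary` and the placeholder rows are hypothetical.

```python
from collections import Counter


def duplicate_summary(rows):
    """Return {row: occurrences} for every row that appears more than once."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# Placeholder data (assumption): two distinct records plus 13 empty rows.
rows = [
    ("함양군보건소", "경상남도 함양군 함양읍 한들로 141", "055-960-8010", "2021-05-06"),
    ("마천보건지소", "경상남도 함양군 마천면 천왕봉로 1144-2", "055-960-5360", "2021-05-06"),
] + [(None, None, None, None)] * 13

print(duplicate_summary(rows))
```

Dropping the fully empty rows before profiling would likely clear both the duplicate alert and the three missing-value alerts, since all 13 missing values per column appear to come from those rows.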