Overview

Dataset statistics

Number of variables: 6
Number of observations: 76
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.8 KiB
Average record size in memory: 50.7 B

Variable types

Numeric: 1
Text: 3
Categorical: 1
DateTime: 1

Dataset

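Description: Data on pharmacies registered within Seosan-si, Chungcheongnam-do, with fields such as medical institution name, address, and telephone number.
Author: Seosan-si, Chungcheongnam-do (충청남도 서산시)
URL: https://www.data.go.kr/data/15000700/fileData.do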

Alerts

데이터기준일 has constant value "" [Constant]
연번 is highly overall correlated with 비고 [High correlation]
비고 is highly overall correlated with 연번 [High correlation]
비고 is highly imbalanced (65.0%) [Imbalance]
연번 has unique values [Unique]
의료기관명 has unique values [Unique]
전화번호 has unique values [Unique]
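
The 65.0% imbalance figure for 비고 can be reproduced from the two value counts the report lists for that column (71 and 5). The sketch below assumes the score is one minus the Shannon entropy of the class distribution, normalized by the maximum entropy for that number of classes. The function name `imbalance_score` is illustrative, and whether ydata-profiling computes the alert with exactly this formula is an assumption.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max for a list of class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class
    dominates. Assumed formula, not confirmed by the report itself.
    """
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # convention chosen here: a single class has no balance to measure
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(n_classes)


score = imbalance_score([71, 5])  # counts of the two 비고 values
print(f"{score:.1%}")
```

Under this assumption the two counts give a score that rounds to 65.0%, matching the alert.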

Reproduction

Analysis started: 2023-12-12 08:58:01.519159
Analysis finished: 2023-12-12 08:58:02.254721
Duration: 0.74 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
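
As a quick consistency check, the reported duration follows from the two timestamps above. This is only a minimal sketch using the standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section
started = datetime.fromisoformat("2023-12-12 08:58:01.519159")
finished = datetime.fromisoformat("2023-12-12 08:58:02.254721")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```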

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 76
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 38.5
Minimum: 1
Maximum: 76
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 816.0 B

Quantile statistics

Minimum: 1
5th percentile: 4.75
Q1: 19.75
Median: 38.5
Q3: 57.25
95th percentile: 72.25
Maximum: 76
Range: 75
Interquartile range (IQR): 37.5
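
These quantiles are what linear interpolation gives if 연번 is simply the integers 1 through 76. The report implies that sequence (76 unique values, minimum 1, maximum 76, strictly increasing) but does not list it, so it is treated as an assumption here. A minimal sketch with the standard library's `statistics.quantiles`, using its `"inclusive"` method:

```python
import statistics

values = list(range(1, 77))  # assumed: 연번 is 1, 2, ..., 76

# "inclusive" interpolates linearly between the sample minimum and maximum
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
twentieths = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]

iqr = q3 - q1
value_range = max(values) - min(values)
print(p5, q1, median, q3, p95, value_range, iqr)
```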

Descriptive statistics

Standard deviation: 22.083176
Coefficient of variation (CV): 0.57358899
Kurtosis: -1.2
Mean: 38.5
Median Absolute Deviation (MAD): 19
Skewness: 0
Sum: 2926
Variance: 487.66667
Monotonicity: Strictly increasing
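
Most of the descriptive statistics can be recomputed from the same assumed sequence 1 through 76. The sequence is inferred from the report, not listed in it. The sketch below uses the sample variance (n - 1 denominator), which is the convention that matches the reported 487.66667. Kurtosis and skewness are left out because the exact estimator used is not stated.

```python
import math
import statistics

values = list(range(1, 77))  # assumed: 연번 is 1, 2, ..., 76

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std = math.sqrt(variance)
cv = std / mean                         # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)  # median absolute deviation
strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

print(total, mean, round(variance, 5), round(std, 6), round(cv, 8), mad, strictly_increasing)
```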
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
1 | 1 | 1.3%
50 | 1 | 1.3%
57 | 1 | 1.3%
56 | 1 | 1.3%
55 | 1 | 1.3%
54 | 1 | 1.3%
53 | 1 | 1.3%
52 | 1 | 1.3%
51 | 1 | 1.3%
49 | 1 | 1.3%
Other values (66) | 66 | 86.8%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 1.3%
2 | 1 | 1.3%
3 | 1 | 1.3%
4 | 1 | 1.3%
5 | 1 | 1.3%
6 | 1 | 1.3%
7 | 1 | 1.3%
8 | 1 | 1.3%
9 | 1 | 1.3%
10 | 1 | 1.3%

Maximum 10 values

Value | Count | Frequency (%)
76 | 1 | 1.3%
75 | 1 | 1.3%
74 | 1 | 1.3%
73 | 1 | 1.3%
72 | 1 | 1.3%
71 | 1 | 1.3%
70 | 1 | 1.3%
69 | 1 | 1.3%
68 | 1 | 1.3%
67 | 1 | 1.3%

의료기관명 (medical institution name)
Text

UNIQUE 

Distinct: 76
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 740.0 B

Length

Max length: 10
Median length: 4
Mean length: 4.5526316
Min length: 3

Characters and Unicode

Total characters: 346
Distinct characters: 127
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 76
Unique (%): 100.0%

Sample

1st row: 녹십자약국
2nd row: 대산약국
3rd row: 대산프라자약국
4th row: 우성약국
5th row: 백세약국

Value | Count | Frequency (%)
녹십자약국 | 1 | 1.3%
도미당약국 | 1 | 1.3%
오성약국 | 1 | 1.3%
박용두약국 | 1 | 1.3%
부춘약국 | 1 | 1.3%
대영약국 | 1 | 1.3%
성모약국 | 1 | 1.3%
더블유스토어대흥약국 | 1 | 1.3%
일등약국 | 1 | 1.3%
대산약국 | 1 | 1.3%
Other values (67) | 67 | 87.0%

Most occurring characters

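Value | Count | Frequency (%)
(not preserved) | 76 | 22.0%
(not preserved) | 76 | 22.0%
(not preserved) | 6 | 1.7%
(not preserved) | 6 | 1.7%
(not preserved) | 6 | 1.7%
(not preserved) | 5 | 1.4%
(not preserved) | 5 | 1.4%
(not preserved) | 4 | 1.2%
(not preserved) | 4 | 1.2%
(not preserved) | 4 | 1.2%
Other values (117) | 154 | 44.5%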

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 345 | 99.7%
Space Separator | 1 | 0.3%

Most frequent character per category

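Other Letter
Value | Count | Frequency (%)
(not preserved) | 76 | 22.0%
(not preserved) | 76 | 22.0%
(not preserved) | 6 | 1.7%
(not preserved) | 6 | 1.7%
(not preserved) | 6 | 1.7%
(not preserved) | 5 | 1.4%
(not preserved) | 5 | 1.4%
(not preserved) | 4 | 1.2%
(not preserved) | 4 | 1.2%
(not preserved) | 4 | 1.2%
Other values (116) | 153 | 44.3%

Space Separator
Value | Count | Frequency (%)
(space) | 1 | 100.0%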

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 345 | 99.7%
Common | 1 | 0.3%

Most frequent character per script

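Hangul
Value | Count | Frequency (%)
(not preserved) | 76 | 22.0%
(not preserved) | 76 | 22.0%
(not preserved) | 6 | 1.7%
(not preserved) | 6 | 1.7%
(not preserved) | 6 | 1.7%
(not preserved) | 5 | 1.4%
(not preserved) | 5 | 1.4%
(not preserved) | 4 | 1.2%
(not preserved) | 4 | 1.2%
(not preserved) | 4 | 1.2%
Other values (116) | 153 | 44.3%

Common
Value | Count | Frequency (%)
(space) | 1 | 100.0%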

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 345 | 99.7%
ASCII | 1 | 0.3%

Most frequent character per block

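Hangul
Value | Count | Frequency (%)
(not preserved) | 76 | 22.0%
(not preserved) | 76 | 22.0%
(not preserved) | 6 | 1.7%
(not preserved) | 6 | 1.7%
(not preserved) | 6 | 1.7%
(not preserved) | 5 | 1.4%
(not preserved) | 5 | 1.4%
(not preserved) | 4 | 1.2%
(not preserved) | 4 | 1.2%
(not preserved) | 4 | 1.2%
Other values (116) | 153 | 44.3%

ASCII
Value | Count | Frequency (%)
(space) | 1 | 100.0%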

주소 (address)
Text

Distinct: 75
Distinct (%): 98.7%
Missing: 0
Missing (%): 0.0%
Memory size: 740.0 B

Length

Max length: 30
Median length: 26
Mean length: 16.565789
Min length: 9

Characters and Unicode

Total characters: 1259
Distinct characters: 108
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
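
To illustrate how such character properties can be read off a value, the sketch below uses Python's standard `unicodedata.category` on the first sample address from this report. The `CATEGORY_NAMES` mapping and the `category_counts` helper are illustrative names, not part of ydata-profiling. Script and block lookups are not covered, because the standard library does not expose them.

```python
import unicodedata
from collections import Counter

# Readable names for the general-category codes that occur in this report
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Lu": "Uppercase Letter",
}


def category_counts(text):
    """Count the characters of `text` per Unicode general category."""
    codes = (unicodedata.category(ch) for ch in text)
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)


counts = category_counts("대산읍 매남길 13-14")
for name, n in counts.most_common():
    print(name, n)
```

Hangul syllables fall under Other Letter, which is why that category dominates the address column.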

Unique

Unique: 74
Unique (%): 97.4%
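
The gap between Distinct (75) and Unique (74) is what one repeated value produces: "distinct" counts different values, while "unique" counts values that occur exactly once. The sketch below shows this with hypothetical placeholder strings; the real addresses are not used, and `distinct_and_unique` is an illustrative helper.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values occurring exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Hypothetical stand-in for the column: 76 rows in which one address appears twice
addresses = [f"address-{i}" for i in range(75)] + ["address-0"]

distinct, unique = distinct_and_unique(addresses)
n = len(addresses)
print(f"Distinct: {distinct} ({distinct / n:.1%}), Unique: {unique} ({unique / n:.1%})")
```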

Sample

1st row: 대산읍 매남길 13-14
2nd row: 대산읍 구진천로 7
3rd row: 대산읍 구진천로 12
4th row: 대산읍 구진로 6
5th row: 대산읍 독곶1로 20

Value | Count | Frequency (%)
동문동 | 34 | 12.1%
안견로 | 18 | 6.4%
충청남도 | 15 | 5.3%
서산시 | 14 | 5.0%
1층 | 8 | 2.8%
시장6로 | 6 | 2.1%
예천동 | 6 | 2.1%
석림동 | 5 | 1.8%
해미면 | 5 | 1.8%
대산읍 | 5 | 1.8%
Other values (135) | 165 | 58.7%

Most occurring characters

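Value | Count | Frequency (%)
(space) | 205 | 16.3%
(not preserved) | 91 | 7.2%
1 | 85 | 6.8%
(not preserved) | 70 | 5.6%
) | 54 | 4.3%
( | 54 | 4.3%
(not preserved) | 39 | 3.1%
2 | 31 | 2.5%
, | 26 | 2.1%
3 | 25 | 2.0%
Other values (98) | 579 | 46.0%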

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 634 | 50.4%
Decimal Number | 272 | 21.6%
Space Separator | 205 | 16.3%
Close Punctuation | 54 | 4.3%
Open Punctuation | 54 | 4.3%
Other Punctuation | 26 | 2.1%
Dash Punctuation | 11 | 0.9%
Uppercase Letter | 3 | 0.2%

Most frequent character per category

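Other Letter
Value | Count | Frequency (%)
(not preserved) | 91 | 14.4%
(not preserved) | 70 | 11.0%
(not preserved) | 39 | 6.2%
(not preserved) | 24 | 3.8%
(not preserved) | 23 | 3.6%
(not preserved) | 22 | 3.5%
(not preserved) | 18 | 2.8%
(not preserved) | 18 | 2.8%
(not preserved) | 18 | 2.8%
(not preserved) | 16 | 2.5%
Other values (80) | 295 | 46.5%

Decimal Number
Value | Count | Frequency (%)
1 | 85 | 31.2%
2 | 31 | 11.4%
3 | 25 | 9.2%
6 | 24 | 8.8%
0 | 22 | 8.1%
5 | 22 | 8.1%
7 | 18 | 6.6%
4 | 17 | 6.2%
9 | 15 | 5.5%
8 | 13 | 4.8%

Uppercase Letter
Value | Count | Frequency (%)
S | 1 | 33.3%
J | 1 | 33.3%
B | 1 | 33.3%

Space Separator
Value | Count | Frequency (%)
(space) | 205 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 54 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 54 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 26 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 11 | 100.0%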

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 634 | 50.4%
Common | 622 | 49.4%
Latin | 3 | 0.2%

Most frequent character per script

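Hangul
Value | Count | Frequency (%)
(not preserved) | 91 | 14.4%
(not preserved) | 70 | 11.0%
(not preserved) | 39 | 6.2%
(not preserved) | 24 | 3.8%
(not preserved) | 23 | 3.6%
(not preserved) | 22 | 3.5%
(not preserved) | 18 | 2.8%
(not preserved) | 18 | 2.8%
(not preserved) | 18 | 2.8%
(not preserved) | 16 | 2.5%
Other values (80) | 295 | 46.5%

Common
Value | Count | Frequency (%)
(space) | 205 | 33.0%
1 | 85 | 13.7%
) | 54 | 8.7%
( | 54 | 8.7%
2 | 31 | 5.0%
, | 26 | 4.2%
3 | 25 | 4.0%
6 | 24 | 3.9%
0 | 22 | 3.5%
5 | 22 | 3.5%
Other values (5) | 74 | 11.9%

Latin
Value | Count | Frequency (%)
S | 1 | 33.3%
J | 1 | 33.3%
B | 1 | 33.3%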

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 634 | 50.4%
ASCII | 625 | 49.6%

Most frequent character per block

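ASCII
Value | Count | Frequency (%)
(space) | 205 | 32.8%
1 | 85 | 13.6%
) | 54 | 8.6%
( | 54 | 8.6%
2 | 31 | 5.0%
, | 26 | 4.2%
3 | 25 | 4.0%
6 | 24 | 3.8%
0 | 22 | 3.5%
5 | 22 | 3.5%
Other values (8) | 77 | 12.3%

Hangul
Value | Count | Frequency (%)
(not preserved) | 91 | 14.4%
(not preserved) | 70 | 11.0%
(not preserved) | 39 | 6.2%
(not preserved) | 24 | 3.8%
(not preserved) | 23 | 3.6%
(not preserved) | 22 | 3.5%
(not preserved) | 18 | 2.8%
(not preserved) | 18 | 2.8%
(not preserved) | 18 | 2.8%
(not preserved) | 16 | 2.5%
Other values (80) | 295 | 46.5%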

전화번호 (telephone number)
Text

UNIQUE 

Distinct: 76
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 740.0 B

Length

Max length: 13
Median length: 12
Mean length: 11.868421
Min length: 1

Characters and Unicode

Total characters: 902
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 76
Unique (%): 100.0%

Sample

1st row: 041-663-9844
2nd row: 041-681-8700
3rd row: 041-663-6446
4th row: 041-681-3893
5th row: 041-669-9896

Value | Count | Frequency (%)
041-663-9844 | 1 | 1.3%
041-665-2521 | 1 | 1.3%
041-665-5955 | 1 | 1.3%
041-665-4786 | 1 | 1.3%
041-665-3253 | 1 | 1.3%
041-665-5434 | 1 | 1.3%
041-669-4948 | 1 | 1.3%
041-666-0777 | 1 | 1.3%
041-664-4000 | 1 | 1.3%
041-681-8700 | 1 | 1.3%
Other values (66) | 66 | 86.8%

Most occurring characters

Value | Count | Frequency (%)
6 | 156 | 17.3%
- | 150 | 16.6%
0 | 119 | 13.2%
4 | 119 | 13.2%
1 | 111 | 12.3%
8 | 50 | 5.5%
5 | 47 | 5.2%
2 | 41 | 4.5%
3 | 38 | 4.2%
9 | 35 | 3.9%
Other values (2) | 36 | 4.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 751 | 83.3%
Dash Punctuation | 150 | 16.6%
Uppercase Letter | 1 | 0.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
6 | 156 | 20.8%
0 | 119 | 15.8%
4 | 119 | 15.8%
1 | 111 | 14.8%
8 | 50 | 6.7%
5 | 47 | 6.3%
2 | 41 | 5.5%
3 | 38 | 5.1%
9 | 35 | 4.7%
7 | 35 | 4.7%

Dash Punctuation
Value | Count | Frequency (%)
- | 150 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
X | 1 | 100.0%
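
A minimum length of 1 and a single uppercase X among otherwise numeric characters suggest that at least one 전화번호 value is a placeholder, not a phone number. The sketch below shows one way to flag such values. It assumes a well-formed number looks like area code, exchange, and four-digit line number separated by hyphens; that pattern is an assumption, and the value "X" in the sample list is a hypothetical stand-in, since the report does not show the offending row.

```python
import re

# Assumed shape of a valid number, e.g. 041-663-9844
PHONE_PATTERN = re.compile(r"0\d{1,2}-\d{3,4}-\d{4}")


def invalid_phone_numbers(values):
    """Return the values that do not fully match the assumed phone pattern."""
    return [v for v in values if not PHONE_PATTERN.fullmatch(v)]


# The first two values come from the report's sample rows; "X" is hypothetical
sample = ["041-663-9844", "041-681-8700", "X"]
flagged = invalid_phone_numbers(sample)
print(flagged)
```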

Most occurring scripts

Value | Count | Frequency (%)
Common | 901 | 99.9%
Latin | 1 | 0.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
6 | 156 | 17.3%
- | 150 | 16.6%
0 | 119 | 13.2%
4 | 119 | 13.2%
1 | 111 | 12.3%
8 | 50 | 5.5%
5 | 47 | 5.2%
2 | 41 | 4.6%
3 | 38 | 4.2%
9 | 35 | 3.9%

Latin
Value | Count | Frequency (%)
X | 1 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 902 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
6 | 156 | 17.3%
- | 150 | 16.6%
0 | 119 | 13.2%
4 | 119 | 13.2%
1 | 111 | 12.3%
8 | 50 | 5.5%
5 | 47 | 5.2%
2 | 41 | 4.5%
3 | 38 | 4.2%
9 | 35 | 3.9%
Other values (2) | 36 | 4.0%

비고 (remarks)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 740.0 B

Value | Count
<NA> | 71
한약국 | 5

Length

Max length: 4
Median length: 4
Mean length: 3.9342105
Min length: 3
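
The length statistics are consistent with `<NA>` being stored as a literal four-character string and not as a true missing value, which would also explain why the column reports zero missing cells. That reading is an inference from the numbers, not something the report states. A minimal check using the two value counts shown for this column:

```python
# Value counts for 비고 as shown in this report
counts = {"<NA>": 71, "한약국": 5}

total = sum(counts.values())
mean_length = sum(len(value) * n for value, n in counts.items()) / total
shares = {value: n / total for value, n in counts.items()}

print(round(mean_length, 7))
print({value: f"{share:.1%}" for value, share in shares.items()})
```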

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 71 | 93.4%
한약국 | 5 | 6.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 71 | 93.4%
한약국 | 5 | 6.6%

데이터기준일 (data reference date)
Date

CONSTANT 

Distinct: 1
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 740.0 B
Minimum: 2022-10-27 00:00:00
Maximum: 2022-10-27 00:00:00
Histogram with fixed size bins (bins=1)

Interactions


Correlations

 | 연번 | 의료기관명 | 주소 | 전화번호
연번 | 1.000 | 1.000 | 1.000 | 1.000
의료기관명 | 1.000 | 1.000 | 1.000 | 1.000
주소 | 1.000 | 1.000 | 1.000 | 1.000
전화번호 | 1.000 | 1.000 | 1.000 | 1.000

 | 연번 | 비고
연번 | 1.000 | 1.000
비고 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

(index) | 연번 | 의료기관명 | 주소 | 전화번호 | 비고 | 데이터기준일
0 | 1 | 녹십자약국 | 대산읍 매남길 13-14 | 041-663-9844 | <NA> | 2022-10-27
1 | 2 | 대산약국 | 대산읍 구진천로 7 | 041-681-8700 | <NA> | 2022-10-27
2 | 3 | 대산프라자약국 | 대산읍 구진천로 12 | 041-663-6446 | <NA> | 2022-10-27
3 | 4 | 우성약국 | 대산읍 구진로 6 | 041-681-3893 | <NA> | 2022-10-27
4 | 5 | 백세약국 | 대산읍 독곶1로 20 | 041-669-9896 | <NA> | 2022-10-27
5 | 6 | 해오름약국 | 인지면 무학로 1735 | 041-664-4243 | <NA> | 2022-10-27
6 | 7 | 부석약국 | 부석면 취평2길 14 | 041-662-0087 | <NA> | 2022-10-27
7 | 8 | 지곡백화점약국 | 지곡면 충의로 1192 | 041-662-2808 | <NA> | 2022-10-27
8 | 9 | 열린약국 | 지곡면 충의로 762-78, 105호 (늘푸른오스카빌) | 041-662-5968 | <NA> | 2022-10-27
9 | 10 | 바닷가약국 | 지곡면 화천1로 58, B동 | 041-674-9929 | <NA> | 2022-10-27

Last rows

(index) | 연번 | 의료기관명 | 주소 | 전화번호 | 비고 | 데이터기준일
66 | 67 | 서문약국 | 충청남도 서산시 안견로 191 (동문동) | 041-681-3464 | <NA> | 2022-10-27
67 | 68 | 참약국 | 충청남도 서산시 시장6로 30-15 (동문동) | 041-666-0039 | <NA> | 2022-10-27
68 | 69 | 희망찬약국 | 충청남도 서산시 시장8로 1 (동문동) | 041-667-9219 | <NA> | 2022-10-27
69 | 70 | 천사약국 | 충청남도 서산시 안견로 216 (동문동) | 041-668-0142 | <NA> | 2022-10-27
70 | 71 | 연수약국 | 충청남도 서산시 시장6로 31 (동문동) | 041-681-8614 | <NA> | 2022-10-27
71 | 72 | 흰돌온누리약국 | 충청남도 서산시 율지6로 5 (동문동) | 041-667-1516 | <NA> | 2022-10-27
72 | 73 | 삼화약국 | 충청남도 서산시 번화2로 32 (동문동) | 041-665-2124 | <NA> | 2022-10-27
73 | 74 | 읍내약국 | 충청남도 서산시 시장1로 9-1 (동문동) | 041-666-5700 | <NA> | 2022-10-27
74 | 75 | 삼성약국 | 충청남도 서산시 벌말1길 60, 1,2층 | 041-665-2404 | <NA> | 2022-10-27
75 | 76 | 새봄약국 | 충청남도 율지8로 7-6, 1층 102,103호 | 041-662-3361 | <NA> | 2022-10-27