Overview

Dataset statistics

Number of variables: 2
Number of observations: 419
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 2
Duplicate rows (%): 0.5%
Total size in memory: 6.7 KiB
Average record size in memory: 16.3 B

Variable types

Text: 1
Categorical: 1

Dataset

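Description: Provides data on 35 items for 429 medical institutions, including institution name, institution category, address, website address, medical staff, floor area, institution introduction, coordinates, specialties, insurance information, and transportation. (Russian)
Author: Gangnam-gu, Seoul (서울특별시 강남구)
URL: https://www.data.go.kr/data/15072587/fileData.do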

Alerts

Dataset has 2 (0.5%) duplicate rows (Duplicates)

Reproduction

Analysis started: 2023-12-12 17:31:38.373047
Analysis finished: 2023-12-12 17:31:38.607801
Duration: 0.23 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

기관명 (institution name)
Text

Distinct: 407
Distinct (%): 97.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.4 KiB

Length

Max length: 58
Median length: 44
Mean length: 27.699284
Min length: 5

Characters and Unicode

Total characters: 11606
Distinct characters: 171
Distinct categories: 13
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
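
As a rough illustration of the statement above, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` and the mapping `CATEGORY_NAMES` are assumptions made for this example, and the two input strings are taken from the sample rows of this report. It is not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

# Short general-category codes mapped to the long names used in this report.
# Codes missing from the mapping are reported under their short code.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

sample = [
    "Клиника пластической хирургии Тако",
    "Отель Форхил (Foreheal)",
]
counts = category_counts(sample)
for name, count in counts.most_common():
    print(name, count)
```

Dividing each count by the total number of characters gives the kind of percentages listed under "Most occurring categories".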

Unique

Unique: 400
Unique (%): 95.5%
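
The report lists "Distinct" and "Unique" separately. A minimal sketch of the difference, assuming "distinct" means the number of different values and "unique" means the number of values that occur exactly once. The helper `distinct_and_unique` and the toy list `toy_values` are hypothetical; the percentages at the end use only the counts shown in this report.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique

# Hypothetical toy column: "A" occurs once, "B" twice, "C" three times.
toy_values = ["A", "B", "B", "C", "C", "C"]
distinct, unique = distinct_and_unique(toy_values)

# Percentages recomputed from the counts in this report (419 observations).
distinct_pct = round(100 * 407 / 419, 1)
unique_pct = round(100 * 400 / 419, 1)
print(distinct, unique, distinct_pct, unique_pct)
```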

Sample

1st row: Клиника пластической хирургии Тако
2nd row: Офтальмологическая клиника Bright St Mary
3rd row: Клиника пластической хирургии Hyundai Aesthetics
4th row: Клиника пластической хирургии Deesse
5th row: Клиника MD Центр маммопластики

Value | Count | Frequency (%)
клиника | 187 | 12.8%
хирургии | 91 | 6.2%
пластической | 88 | 6.0%
clinic | 84 | 5.7%
plastic | 30 | 2.0%
surgery | 30 | 2.0%
стоматологическая | 30 | 2.0%
центр | 21 | 1.4%
больница | 17 | 1.2%
дерматологическая | 16 | 1.1%
Other values (564) | 871 | 59.5%
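
The word counts above can be approximated with a few lines of standard-library Python. This is a sketch under the assumption that words are obtained by lowercasing each value and splitting on whitespace, which the lowercase entries in the table suggest but the report does not state. The input is the five sample rows listed earlier, and `word_counts` is a name chosen for this example.

```python
from collections import Counter

sample_rows = [
    "Клиника пластической хирургии Тако",
    "Офтальмологическая клиника Bright St Mary",
    "Клиника пластической хирургии Hyundai Aesthetics",
    "Клиника пластической хирургии Deesse",
    "Клиника MD Центр маммопластики",
]

# Lowercase each value and split on whitespace (assumed tokenization).
word_counts = Counter(
    word for row in sample_rows for word in row.lower().split()
)
total_words = sum(word_counts.values())

for word, count in word_counts.most_common(3):
    print(word, count, f"{100 * count / total_words:.1f}%")
```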

Most occurring characters

Value | Count | Frequency (%)
(space) | 1093 | 9.4%
и | 1076 | 9.3%
а | 623 | 5.4%
л | 486 | 4.2%
к | 484 | 4.2%
о | 482 | 4.2%
н | 427 | 3.7%
с | 365 | 3.1%
е | 361 | 3.1%
р | 351 | 3.0%
Other values (161) | 5858 | 50.5%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 8972 | 77.3%
Uppercase Letter | 1355 | 11.7%
Space Separator | 1093 | 9.4%
Other Punctuation | 81 | 0.7%
Other Letter | 47 | 0.4%
Decimal Number | 19 | 0.2%
Dash Punctuation | 15 | 0.1%
Modifier Symbol | 7 | 0.1%
Open Punctuation | 7 | 0.1%
Close Punctuation | 7 | 0.1%
Other values (3) | 3 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
и | 1076 | 12.0%
а | 623 | 6.9%
л | 486 | 5.4%
к | 484 | 5.4%
о | 482 | 5.4%
н | 427 | 4.8%
с | 365 | 4.1%
е | 361 | 4.0%
р | 351 | 3.9%
i | 338 | 3.8%
Other values (47) | 3979 | 44.3%

Uppercase Letter
Value | Count | Frequency (%)
К | 165 | 12.2%
C | 111 | 8.2%
S | 95 | 7.0%
С | 88 | 6.5%
P | 46 | 3.4%
О | 42 | 3.1%
Д | 39 | 2.9%
D | 38 | 2.8%
A | 35 | 2.6%
M | 35 | 2.6%
Other values (44) | 661 | 48.8%
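
Other Letter
Value | Count | Frequency (%)
(not preserved) | 3 | 6.4%
(not preserved) | 3 | 6.4%
(not preserved) | 2 | 4.3%
(not preserved) | 2 | 4.3%
(not preserved) | 2 | 4.3%
(not preserved) | 2 | 4.3%
(not preserved) | 2 | 4.3%
(not preserved) | 2 | 4.3%
(not preserved) | 1 | 2.1%
(not preserved) | 1 | 2.1%
Other values (27) | 27 | 57.4%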

Decimal Number
Value | Count | Frequency (%)
3 | 3 | 15.8%
1 | 3 | 15.8%
2 | 3 | 15.8%
8 | 2 | 10.5%
9 | 2 | 10.5%
6 | 2 | 10.5%
0 | 2 | 10.5%
5 | 1 | 5.3%
4 | 1 | 5.3%

Other Punctuation
Value | Count | Frequency (%)
? | 59 | 72.8%
& | 7 | 8.6%
" | 6 | 7.4%
. | 6 | 7.4%
, | 2 | 2.5%
' | 1 | 1.2%

Space Separator
Value | Count | Frequency (%)
(space) | 1093 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 15 | 100.0%

Modifier Symbol
Value | Count | Frequency (%)
` | 7 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 7 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 7 | 100.0%

Final Punctuation
Value | Count | Frequency (%)
(not preserved) | 1 | 100.0%

Math Symbol
Value | Count | Frequency (%)
+ | 1 | 100.0%

Initial Punctuation
Value | Count | Frequency (%)
(not preserved) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Cyrillic | 7185 | 61.9%
Latin | 3142 | 27.1%
Common | 1232 | 10.6%
Hangul | 47 | 0.4%
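
Python's standard `unicodedata` module does not expose the Unicode Script property, so the sketch below uses a heuristic instead: it reads the first word of each character's Unicode name and treats anything that is not Cyrillic, Latin, or Hangul as "Common". The names `KNOWN_SCRIPTS` and `script_of` and the mixed-script test string are assumptions for illustration. The heuristic can misclassify characters, and it is not how the report computed the table above.

```python
import unicodedata
from collections import Counter

# Name prefixes recognised by this heuristic (assumption, not exhaustive).
KNOWN_SCRIPTS = {"CYRILLIC": "Cyrillic", "LATIN": "Latin", "HANGUL": "Hangul"}

def script_of(ch):
    """Guess a script label from the first word of the character's Unicode name."""
    try:
        first_word = unicodedata.name(ch).split()[0]
    except ValueError:
        # Raised for characters that have no Unicode name.
        return "Unknown"
    return KNOWN_SCRIPTS.get(first_word, "Common")

# Illustrative mixed-script string (not a row from the dataset).
text = "Отель JBIS 강남"
script_counts = Counter(script_of(ch) for ch in text)
print(script_counts)
```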

Most frequent character per script

Cyrillic
Value | Count | Frequency (%)
и | 1076 | 15.0%
а | 623 | 8.7%
л | 486 | 6.8%
к | 484 | 6.7%
о | 482 | 6.7%
н | 427 | 5.9%
с | 365 | 5.1%
е | 361 | 5.0%
р | 351 | 4.9%
т | 308 | 4.3%
Other values (50) | 2222 | 30.9%

Latin
Value | Count | Frequency (%)
i | 338 | 10.8%
e | 256 | 8.1%
n | 254 | 8.1%
l | 227 | 7.2%
a | 188 | 6.0%
c | 153 | 4.9%
r | 144 | 4.6%
t | 139 | 4.4%
o | 129 | 4.1%
C | 111 | 3.5%
Other values (41) | 1203 | 38.3%
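
Hangul
Value | Count | Frequency (%)
(not preserved) | 3 | 6.4%
(not preserved) | 3 | 6.4%
(not preserved) | 2 | 4.3%
(not preserved) | 2 | 4.3%
(not preserved) | 2 | 4.3%
(not preserved) | 2 | 4.3%
(not preserved) | 2 | 4.3%
(not preserved) | 2 | 4.3%
(not preserved) | 1 | 2.1%
(not preserved) | 1 | 2.1%
Other values (27) | 27 | 57.4%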

Common
Value | Count | Frequency (%)
(space) | 1093 | 88.7%
? | 59 | 4.8%
- | 15 | 1.2%
` | 7 | 0.6%
& | 7 | 0.6%
( | 7 | 0.6%
) | 7 | 0.6%
" | 6 | 0.5%
. | 6 | 0.5%
3 | 3 | 0.2%
Other values (13) | 22 | 1.8%

Most occurring blocks

Value | Count | Frequency (%)
Cyrillic | 7185 | 61.9%
ASCII | 4372 | 37.7%
Hangul | 47 | 0.4%
Punctuation | 2 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 1093 | 25.0%
i | 338 | 7.7%
e | 256 | 5.9%
n | 254 | 5.8%
l | 227 | 5.2%
a | 188 | 4.3%
c | 153 | 3.5%
r | 144 | 3.3%
t | 139 | 3.2%
o | 129 | 3.0%
Other values (62) | 1451 | 33.2%

Cyrillic
Value | Count | Frequency (%)
и | 1076 | 15.0%
а | 623 | 8.7%
л | 486 | 6.8%
к | 484 | 6.7%
о | 482 | 6.7%
н | 427 | 5.9%
с | 365 | 5.1%
е | 361 | 5.0%
р | 351 | 4.9%
т | 308 | 4.3%
Other values (50) | 2222 | 30.9%
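
Hangul
Value | Count | Frequency (%)
(not preserved) | 3 | 6.4%
(not preserved) | 3 | 6.4%
(not preserved) | 2 | 4.3%
(not preserved) | 2 | 4.3%
(not preserved) | 2 | 4.3%
(not preserved) | 2 | 4.3%
(not preserved) | 2 | 4.3%
(not preserved) | 2 | 4.3%
(not preserved) | 1 | 2.1%
(not preserved) | 1 | 2.1%
Other values (27) | 27 | 57.4%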

Punctuation
Value | Count | Frequency (%)
(not preserved) | 1 | 50.0%
(not preserved) | 1 | 50.0%

기관분류
Categorical

Distinct: 16
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Memory size: 3.4 KiB

Пластика: 141
Стоматология: 60
Косметология: 45
Другое: 26
СПА и эстетика, и т. д.: 22
Other values (11): 125

Length

Max length: 29
Median length: 28
Mean length: 11.909308
Min length: 3

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: Пластика
2nd row: Офтальмология
3rd row: Пластика
4th row: Пластика
5th row: Пластика

Common Values

Value | Count | Frequency (%)
Пластика | 141 | 33.7%
Стоматология | 60 | 14.3%
Косметология | 45 | 10.7%
Другое | 26 | 6.2%
СПА и эстетика, и т. д. | 22 | 5.3%
Офтальмология | 21 | 5.0%
Восточная терапия | 21 | 5.0%
Общее обследование | 19 | 4.5%
Лечение позвоночника/суставов | 18 | 4.3%
Гостиницы | 16 | 3.8%
Other values (6) | 30 | 7.2%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
пластика | 141 | 23.7%
стоматология | 60 | 10.1%
косметология | 45 | 7.6%
и | 44 | 7.4%
другое | 26 | 4.4%
спа | 22 | 3.7%
эстетика | 22 | 3.7%
т | 22 | 3.7%
д | 22 | 3.7%
офтальмология | 21 | 3.5%
Other values (16) | 170 | 28.6%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
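
The nullity views summarise which cells are missing in each column. A minimal sketch of that tally, assuming rows are plain dictionaries and that only `None` counts as missing. The helper `missing_per_column` is a name chosen for this example. The first two rows are taken from this report's sample; the third is a hypothetical incomplete row added only to show a non-zero count, since the profiled dataset itself has 0 missing cells.

```python
def missing_per_column(rows, columns):
    """Count cells that are None (or absent) in each column."""
    counts = {col: 0 for col in columns}
    for row in rows:
        for col in columns:
            if row.get(col) is None:
                counts[col] += 1
    return counts

columns = ["기관명", "기관분류"]
rows = [
    {"기관명": "Отель Триа", "기관분류": "Гостиницы"},
    {"기관명": "Отель JBIS", "기관분류": "Гостиницы"},
    # Hypothetical incomplete row, not present in the dataset.
    {"기관명": "Отель Граммос", "기관분류": None},
]

missing = missing_per_column(rows, columns)
total_missing = sum(missing.values())
print(missing, total_missing)
```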

Sample

Index | 기관명 | 기관분류
0 | Клиника пластической хирургии Тако | Пластика
1 | Офтальмологическая клиника Bright St Mary | Офтальмология
2 | Клиника пластической хирургии Hyundai Aesthetics | Пластика
3 | Клиника пластической хирургии Deesse | Пластика
4 | Клиника MD Центр маммопластики | Пластика
5 | Клиника пластической хирургии Батанг | Пластика
6 | Клиника пластической хирургии Примиум | Пластика
7 | Клиника пластической хирургии Опера | Пластика
8 | Стоматологическая клиника Соджун | Стоматология
9 | Клиника пластической хирургии | Пластика

Index | 기관명 | 기관분류
409 | Отель Форхил (Foreheal) | Гостиницы
410 | Отель Каннам Фэмили (Gangnam Family) | Гостиницы
411 | Отель Граммос | Гостиницы
412 | HOTEL THE DESIGNERS | Гостиницы
413 | Отель Триа | Гостиницы
414 | Отель Бест Вестерн Премио Кангнам | Гостиницы
415 | Отель Новотель Амбассадор Кангнам | Гостиницы
416 | Отель Ритц-Карлтон | Гостиницы
417 | Отель JBIS | Гостиницы
418 | Отель Оквуд Премио Коекс центр | Гостиницы

Duplicate rows

Most frequently occurring

Index | 기관명 | 기관분류 | # duplicates
1 | Стоматологическая клиника | Стоматология | 3
0 | Клиника пластической хирургии О Се Вон | Пластика | 2
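
The duplicate-row table can be reproduced in miniature with `collections.Counter`. This is a sketch, assuming the report's figure of 2 duplicate rows refers to the number of distinct rows that occur more than once, which is consistent with the two entries listed here. The two duplicated rows are copied from this table with their reported multiplicities, and one non-duplicated row from the sample is added for contrast.

```python
from collections import Counter

# Each row is a (기관명, 기관분류) tuple.
rows = [
    ("Стоматологическая клиника", "Стоматология"),
    ("Стоматологическая клиника", "Стоматология"),
    ("Стоматологическая клиника", "Стоматология"),
    ("Клиника пластической хирургии О Се Вон", "Пластика"),
    ("Клиника пластической хирургии О Се Вон", "Пластика"),
    ("Отель Триа", "Гостиницы"),
]

row_counts = Counter(rows)
# Keep only rows that occur more than once, with their multiplicity.
duplicates = {row: n for row, n in row_counts.items() if n > 1}

# Share of the 419 observations, using the report's count of 2.
duplicate_pct = round(100 * 2 / 419, 1)

for row, n in sorted(duplicates.items(), key=lambda item: -item[1]):
    print(row, n)
```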