Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 468.8 KiB
Average record size in memory: 48.0 B
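The derived figures in this block follow from the raw counts. The sketch below re-derives two of them. It assumes that 1 KiB is 1024 bytes and that the average record size is the total memory divided by the number of observations. The report does not state either rule, but both reproduce the values shown.

```python
# Re-derive the summary figures from the reported raw counts.
# Assumptions (not stated in the report): 1 KiB = 1024 bytes, and the
# average record size is total memory / number of observations.
n_observations = 10_000
n_duplicate_rows = 1
total_memory_kib = 468.8

duplicate_pct = n_duplicate_rows / n_observations * 100
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"Duplicate rows: {duplicate_pct:.2f}%")           # displayed as "< 0.1%"
print(f"Average record size: {avg_record_bytes:.1f} B")
```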

Variable types

Text: 2
Categorical: 2
DateTime: 1

Dataset

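Description: Closure status of healthcare institutions (institution name, location, institution type, closure date, etc.). Updated once a year. Contains the closure records accumulated up to the reference date.
Author: 건강보험심사평가원 (Health Insurance Review & Assessment Service)
URL: https://www.data.go.kr/data/15051056/fileData.do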

Alerts

Dataset has 1 (< 0.1%) duplicate rows [Duplicates]
요양종별 is highly imbalanced (54.1%) [Imbalance]

Reproduction

Analysis started: 2024-03-14 23:27:35.450473
Analysis finished: 2024-03-14 23:27:36.973817
Duration: 1.52 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
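The reported duration can be checked against the two timestamps. A minimal sketch using only the standard library, with the timestamps copied from this section:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of the report.
started = datetime.fromisoformat("2024-03-14 23:27:35.450473")
finished = datetime.fromisoformat("2024-03-14 23:27:36.973817")

duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```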

Variables

요양기관명
Text

Distinct: 7599
Distinct (%): 76.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 32
Median length: 24
Mean length: 6.3578
Min length: 3

Characters and Unicode

Total characters: 63578
Distinct characters: 669
Distinct categories: 9
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
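As an illustration of the character properties mentioned here, the sketch below counts Unicode general categories with Python's `unicodedata` module. The helper name `category_counts` is hypothetical, and the two input strings are institution names that appear elsewhere in this report. It is not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

# Two institution names that appear in the report's sample and duplicate rows.
names = ["현대중공업(주)해양부속의원", "척척의원,한의원"]

def category_counts(values):
    """Count Unicode general categories over all characters in `values`."""
    counts = Counter()
    for value in values:
        # Normalize to NFC so each Hangul syllable is a single code point.
        for ch in unicodedata.normalize("NFC", value):
            counts[unicodedata.category(ch)] += 1
    return counts

counts = category_counts(names)
# "Lo" = Other Letter, "Ps" = Open Punctuation,
# "Pe" = Close Punctuation, "Po" = Other Punctuation
for category, count in counts.most_common():
    print(category, count)
```

Hangul syllables fall under "Other Letter" (Lo), which is why that category dominates the tables for the text columns.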

Unique

Unique: 6542
Unique (%): 65.4%
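Distinct and Unique measure different things, which is why 6542 is smaller than 7599. As far as the numbers suggest, Distinct counts the different values in the column, while Unique counts only the values that occur exactly once. A small sketch of that reading follows; `distinct_and_unique` is a hypothetical helper and the toy column is not taken from the dataset.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    n_distinct = len(counts)
    n_unique = sum(1 for c in counts.values() if c == 1)
    return n_distinct, n_unique

# Toy column for illustration only.
sample = ["우리약국", "우리약국", "삼성의원", "소생약국", "열린의원", "삼성의원"]
print(distinct_and_unique(sample))
```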

Sample

1st row: 소생약국
2nd row: 열린의원
3rd row: 온누리기쁨약국
4th row: 초읍한의원
5th row: 모자약국
Value | Count | Frequency (%)
우리약국 | 31 | 0.3%
삼성의원 | 22 | 0.2%
서울약국 | 20 | 0.2%
건강약국 | 20 | 0.2%
중앙약국 | 17 | 0.2%
우리들약국 | 17 | 0.2%
사랑약국 | 16 | 0.2%
현대약국 | 16 | 0.2%
유디치과의원 | 16 | 0.2%
경희한의원 | 15 | 0.1%
Other values (7643) | 9901 | 98.1%
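The percentages in this table are not taken over the 10000 rows: the counts add up to 10091, which suggests that each name is split on whitespace and every token is counted. The sketch below works under that assumption. `word_frequencies` is a hypothetical helper, and the consistency check uses only the figures printed in the table.

```python
from collections import Counter

def word_frequencies(values):
    """Split each value on whitespace and return (token, count, share) rows."""
    tokens = Counter(tok for value in values for tok in value.split())
    total = sum(tokens.values())
    return [(tok, n, n / total) for tok, n in tokens.most_common()]

# Consistency check against the counts in the table.
top_ten = [31, 22, 20, 20, 17, 17, 16, 16, 16, 15]
other_values = 9901
total_tokens = sum(top_ten) + other_values
print(total_tokens)
print(f"{31 / total_tokens:.1%}")
print(f"{other_values / total_tokens:.1%}")
```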

Most occurring characters

Count | Frequency (%)
6856 | 10.8%
6474 | 10.2%
3480 | 5.5%
3402 | 5.4%
3024 | 4.8%
2127 | 3.3%
1280 | 2.0%
836 | 1.3%
821 | 1.3%
704 | 1.1%
Other values (659): 34574 (54.4%)

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 63037 | 99.1%
Decimal Number | 178 | 0.3%
Space Separator | 92 | 0.1%
Uppercase Letter | 78 | 0.1%
Close Punctuation | 68 | 0.1%
Open Punctuation | 63 | 0.1%
Lowercase Letter | 35 | 0.1%
Other Punctuation | 17 | < 0.1%
Dash Punctuation | 10 | < 0.1%

Most frequent character per category

Other Letter
Count | Frequency (%)
6856 | 10.9%
6474 | 10.3%
3480 | 5.5%
3402 | 5.4%
3024 | 4.8%
2127 | 3.4%
1280 | 2.0%
836 | 1.3%
821 | 1.3%
704 | 1.1%
Other values (608): 34033 (54.0%)

Uppercase Letter
Value | Count | Frequency (%)
S | 11 | 14.1%
M | 11 | 14.1%
D | 7 | 9.0%
K | 5 | 6.4%
I | 5 | 6.4%
U | 5 | 6.4%
H | 4 | 5.1%
W | 4 | 5.1%
C | 4 | 5.1%
J | 3 | 3.8%
Other values (12) | 19 | 24.4%

Lowercase Letter
Value | Count | Frequency (%)
e | 11 | 31.4%
i | 4 | 11.4%
m | 4 | 11.4%
c | 4 | 11.4%
o | 3 | 8.6%
r | 3 | 8.6%
d | 1 | 2.9%
t | 1 | 2.9%
l | 1 | 2.9%
n | 1 | 2.9%
Other values (2) | 2 | 5.7%

Decimal Number
Value | Count | Frequency (%)
5 | 38 | 21.3%
3 | 33 | 18.5%
6 | 31 | 17.4%
2 | 28 | 15.7%
1 | 21 | 11.8%
0 | 13 | 7.3%
8 | 8 | 4.5%
4 | 4 | 2.2%
9 | 2 | 1.1%

Other Punctuation
Value | Count | Frequency (%)
& | 6 | 35.3%
, | 5 | 29.4%
. | 4 | 23.5%
· | 2 | 11.8%

Space Separator
Value | Count | Frequency (%)
(space) | 92 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 68 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 63 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 10 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 63036 | 99.1%
Common | 428 | 0.7%
Latin | 113 | 0.2%
Han | 1 | < 0.1%

Most frequent character per script

Hangul
Count | Frequency (%)
6856 | 10.9%
6474 | 10.3%
3480 | 5.5%
3402 | 5.4%
3024 | 4.8%
2127 | 3.4%
1280 | 2.0%
836 | 1.3%
821 | 1.3%
704 | 1.1%
Other values (607): 34032 (54.0%)

Latin
Value | Count | Frequency (%)
S | 11 | 9.7%
M | 11 | 9.7%
e | 11 | 9.7%
D | 7 | 6.2%
K | 5 | 4.4%
I | 5 | 4.4%
U | 5 | 4.4%
i | 4 | 3.5%
m | 4 | 3.5%
H | 4 | 3.5%
Other values (24) | 46 | 40.7%

Common
Value | Count | Frequency (%)
(space) | 92 | 21.5%
) | 68 | 15.9%
( | 63 | 14.7%
5 | 38 | 8.9%
3 | 33 | 7.7%
6 | 31 | 7.2%
2 | 28 | 6.5%
1 | 21 | 4.9%
0 | 13 | 3.0%
- | 10 | 2.3%
Other values (7) | 31 | 7.2%

Han
Count | Frequency (%)
1 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 63036 | 99.1%
ASCII | 539 | 0.8%
None | 2 | < 0.1%
CJK | 1 | < 0.1%

Most frequent character per block

Hangul
Count | Frequency (%)
6856 | 10.9%
6474 | 10.3%
3480 | 5.5%
3402 | 5.4%
3024 | 4.8%
2127 | 3.4%
1280 | 2.0%
836 | 1.3%
821 | 1.3%
704 | 1.1%
Other values (607): 34032 (54.0%)

ASCII
Value | Count | Frequency (%)
(space) | 92 | 17.1%
) | 68 | 12.6%
( | 63 | 11.7%
5 | 38 | 7.1%
3 | 33 | 6.1%
6 | 31 | 5.8%
2 | 28 | 5.2%
1 | 21 | 3.9%
0 | 13 | 2.4%
S | 11 | 2.0%
Other values (40) | 141 | 26.2%

None
Value | Count | Frequency (%)
· | 2 | 100.0%

CJK
Count | Frequency (%)
1 | 100.0%

요양종별
Categorical

IMBALANCE 

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
의원 | 5925
약국 | 3381
병원 | 628
종합병원 | 20
보건진료소 | 19
Other values (2) | 27

Length

Max length: 5
Median length: 2
Mean length: 2.0134
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 약국
2nd row: 의원
3rd row: 약국
4th row: 의원
5th row: 약국

Common Values

Value | Count | Frequency (%)
의원 | 5925 | 59.2%
약국 | 3381 | 33.8%
병원 | 628 | 6.3%
종합병원 | 20 | 0.2%
보건진료소 | 19 | 0.2%
조산원 | 17 | 0.2%
보건지소 | 10 | 0.1%
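The 54.1% in the imbalance alert for 요양종별 is consistent with one minus the normalized Shannon entropy of the category shares. The report itself does not define the score, so the formula below is an assumption that happens to reproduce the number, and `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the class shares."""
    if len(counts) < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    max_entropy = math.log2(len(counts))
    return 1 - entropy / max_entropy

# Category counts from the Common Values table for 요양종별.
counts = [5925, 3381, 628, 20, 19, 17, 10]
print(f"{imbalance_score(counts):.1%}")
```

A score of 0 means all categories are equally frequent, and a score near 1 means a single category dominates. Here two of the seven categories cover about 93% of the rows.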

Length

Histogram of lengths of the category


시도명
Categorical

Distinct: 17
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
서울 | 2849
경기 | 1986
부산 | 717
경남 | 535
인천 | 467
Other values (12) | 3446

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 광주
2nd row: 전북
3rd row: 서울
4th row: 부산
5th row: 충남

Common Values

Value | Count | Frequency (%)
서울 | 2849 | 28.5%
경기 | 1986 | 19.9%
부산 | 717 | 7.2%
경남 | 535 | 5.3%
인천 | 467 | 4.7%
대구 | 446 | 4.5%
경북 | 423 | 4.2%
충남 | 408 | 4.1%
전북 | 356 | 3.6%
전남 | 321 | 3.2%
Other values (7) | 1492 | 14.9%

Length

Histogram of lengths of the category

시군구명
Text

Distinct: 236
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 14
Median length: 3
Mean length: 3.605
Min length: 2

Characters and Unicode

Total characters: 36050
Distinct characters: 146
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 서구
2nd row: 익산시
3rd row: 강북구
4th row: 부산진구
5th row: 천안시(천안서북구,동남구)
Value | Count | Frequency (%)
강남구 | 376 | 3.3%
중구 | 321 | 2.8%
서구 | 302 | 2.6%
북구 | 278 | 2.4%
남구 | 229 | 2.0%
성남시 | 224 | 2.0%
동구 | 206 | 1.8%
송파구 | 175 | 1.5%
고양시 | 173 | 1.5%
서초구 | 171 | 1.5%
Other values (233) | 8991 | 78.6%

Most occurring characters

Count | Frequency (%)
6923 | 19.2%
4085 | 11.3%
1446 | 4.0%
1143 | 3.2%
1063 | 2.9%
1020 | 2.8%
943 | 2.6%
878 | 2.4%
876 | 2.4%
823 | 2.3%
Other values (136): 16850 (46.7%)

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 34442 | 95.5%
Space Separator | 1446 | 4.0%
Close Punctuation | 57 | 0.2%
Open Punctuation | 57 | 0.2%
Other Punctuation | 48 | 0.1%

Most frequent character per category

Other Letter
Count | Frequency (%)
6923 | 20.1%
4085 | 11.9%
1143 | 3.3%
1063 | 3.1%
1020 | 3.0%
943 | 2.7%
878 | 2.5%
876 | 2.5%
823 | 2.4%
757 | 2.2%
Other values (132): 15931 (46.3%)

Space Separator
Value | Count | Frequency (%)
(space) | 1446 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 57 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 57 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 48 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 34442 | 95.5%
Common | 1608 | 4.5%

Most frequent character per script

Hangul
Count | Frequency (%)
6923 | 20.1%
4085 | 11.9%
1143 | 3.3%
1063 | 3.1%
1020 | 3.0%
943 | 2.7%
878 | 2.5%
876 | 2.5%
823 | 2.4%
757 | 2.2%
Other values (132): 15931 (46.3%)

Common
Value | Count | Frequency (%)
(space) | 1446 | 89.9%
) | 57 | 3.5%
( | 57 | 3.5%
, | 48 | 3.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 34442 | 95.5%
ASCII | 1608 | 4.5%

Most frequent character per block

Hangul
Count | Frequency (%)
6923 | 20.1%
4085 | 11.9%
1143 | 3.3%
1063 | 3.1%
1020 | 3.0%
943 | 2.7%
878 | 2.5%
876 | 2.5%
823 | 2.4%
757 | 2.2%
Other values (132): 15931 (46.3%)

ASCII
Value | Count | Frequency (%)
(space) | 1446 | 89.9%
) | 57 | 3.5%
( | 57 | 3.5%
, | 48 | 3.0%

폐업일자
DateTime

Distinct: 4815
Distinct (%): 48.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2000-01-01 00:00:00
Maximum: 2023-12-31 00:00:00
Histogram with fixed size bins (bins=50)
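To make the fixed-size binning concrete, the sketch below splits the range between the reported minimum and maximum into 50 equal-width bins and finds the bin for a single date. `fixed_width_bin` is a hypothetical helper, and the exact edge handling used by the plotting library may differ.

```python
from datetime import date

def fixed_width_bin(day, lo, hi, bins=50):
    """Return the 0-based index of the equal-width bin that contains `day`."""
    span_days = (hi - lo).days
    if span_days == 0:
        return 0
    index = int((day - lo).days / span_days * bins)
    return min(index, bins - 1)  # the maximum date falls into the last bin

lo, hi = date(2000, 1, 1), date(2023, 12, 31)  # reported Minimum and Maximum
print((hi - lo).days / 50)                     # bin width in days
print(fixed_width_bin(date(2021, 12, 1), lo, hi))  # a closure date from the sample rows
```

With this range each bin covers roughly 175 days, so a bar in the histogram corresponds to a little under half a year of closures.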

Correlations

Variable | 요양종별 | 시도명
요양종별 | 1.000 | 0.137
시도명 | 0.137 | 1.000

Variable | 시도명 | 요양종별
시도명 | 1.000 | 0.062
요양종별 | 0.062 | 1.000

Variable | 요양종별 | 시도명
요양종별 | 1.000 | 0.062
시도명 | 0.062 | 1.000
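These matrices give a pairwise association between the two categorical columns, but this extract does not name the measure behind each one. Cramér's V is a common choice for categorical pairs. The sketch below only illustrates that measure on toy data: it is an assumption, not the report's confirmed method, and it is not claimed to reproduce 0.137 or 0.062. `cramers_v` is a hypothetical helper.

```python
import math
from collections import Counter

def cramers_v(pairs):
    """Plain (uncorrected) Cramér's V for a list of (a, b) category pairs."""
    n = len(pairs)
    row_totals = Counter(a for a, _ in pairs)
    col_totals = Counter(b for _, b in pairs)
    observed = Counter(pairs)
    k = min(len(row_totals), len(col_totals))
    if n == 0 or k < 2:
        return 0.0
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            chi2 += (observed[(a, b)] - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (k - 1)))

# Toy data: type fully determined by region, then no relation at all.
dependent = [("의원", "서울")] * 5 + [("약국", "부산")] * 5
independent = [("의원", "서울"), ("의원", "부산"), ("약국", "서울"), ("약국", "부산")]
print(cramers_v(dependent), cramers_v(independent))
```

Values near 0, like those reported here, indicate that institution type and region are only weakly associated.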

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

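Index | 요양기관명 | 요양종별 | 시도명 | 시군구명 | 폐업일자
83311 | 소생약국 | 약국 | 광주 | 서구 | 2021-12-01
68299 | 열린의원 | 의원 | 전북 | 익산시 | 2017-08-22
30131 | 온누리기쁨약국 | 약국 | 서울 | 강북구 | 2007-12-31
16359 | 초읍한의원 | 의원 | 부산 | 부산진구 | 2004-06-02
9737 | 모자약국 | 약국 | 충남 | 천안시(천안서북구,동남구) | 2002-09-03
45341 | 화수목한의원 | 의원 | 대전 | 서구 | 2011-09-26
85338 | 송천한빛의원 | 의원 | 전북 | 전주시 덕진구 | 2022-06-11
40097 | 순정성형외과의원 | 의원 | 서울 | 강남구 | 2010-06-05
41286 | 머리채한의원 | 의원 | 부산 | 수영구 | 2010-10-04
70876 | 하늘온누리약국 | 약국 | 충남 | 천안시 서북구 | 2018-04-29

Index | 요양기관명 | 요양종별 | 시도명 | 시군구명 | 폐업일자
88960 | 정요양병원 | 병원 | 서울 | 도봉구 | 2023-05-09
60775 | 효약국 | 약국 | 경기 | 수원시 장안구 | 2015-08-06
25211 | 제비약국 | 약국 | 서울 | 강남구 | 2006-11-03
53412 | 호산나요양병원 | 병원 | 경기 | 화성시 | 2013-08-16
49415 | 에스덴치과의원 | 의원 | 서울 | 성동구 | 2012-09-03
21380 | 제정치과의원 | 의원 | 서울 | 강동구 | 2005-11-21
3117 | 이원기소아과의원 | 의원 | 경기 | 군포시 | 2000-10-09
66032 | 다래약국 | 약국 | 서울 | 강남구 | 2017-01-13
74956 | 현대중공업(주)해양부속의원 | 의원 | 울산 | 동구 | 2019-06-05
39696 | 인화약국 | 약국 | 부산 | 사하구 | 2010-05-02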

Duplicate rows

Most frequently occurring

Index | 요양기관명 | 요양종별 | 시도명 | 시군구명 | 폐업일자 | # duplicates
0 | 척척의원,한의원 | 의원 | 부산 | 연제구 | 2023-01-05 | 2
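The overview reports 1 duplicate row while this table shows 2 in the "# duplicates" column. Both are consistent if the row occurs twice and only the extra copy is counted as a duplicate, although the report does not spell this out. A minimal sketch of finding such rows follows; `duplicate_rows` is a hypothetical helper and the third row is included only as a non-duplicate example.

```python
from collections import Counter

def duplicate_rows(rows):
    """Return {row: count} for every row that occurs more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

rows = [
    ("척척의원,한의원", "의원", "부산", "연제구", "2023-01-05"),
    ("척척의원,한의원", "의원", "부산", "연제구", "2023-01-05"),
    ("소생약국", "약국", "광주", "서구", "2021-12-01"),
]
for row, n in duplicate_rows(rows).items():
    print(row, n)
```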