gimi9 Pandas Profiling

Dataset statistics

Number of variables	2
Number of observations	438
Missing cells	0
Missing cells (%)	0.0%
Duplicate rows	4
Duplicate rows (%)	0.9%
Total size in memory	7.0 KiB
Average record size in memory	16.3 B

Variable types

Text	1
Categorical	1

Dataset

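Description	This dataset provides the institution name and institution category for some 400 medical institutions located in Gangnam-gu, Seoul. For details, please contact the Tourism Promotion Division of Gangnam-gu, Seoul.
Author	Gangnam-gu, Seoul (서울특별시 강남구)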
URL	https://www.data.go.kr/data/15071686/fileData.do

Alerts

Dataset has 4 (0.9%) duplicate rows

Reproduction

Analysis started	2024-04-20 15:37:22.316835
Analysis finished	2024-04-20 15:37:22.745968
Duration	0.43 seconds
Software version	ydata-profiling v4.5.1

기관명 (institution name)
Text

Distinct	421
Distinct (%)	96.1%
Missing	0
Missing (%)	0.0%
Memory size	3.5 KiB

Length

Max length	20
Median length	15
Mean length	8.0182648
Min length	3

Characters and Unicode

Total characters	3512
Distinct characters	364
Distinct categories	9
Distinct scripts	3
Distinct blocks	3

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
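As a rough illustration of how such character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` and the three sample strings (taken from rows shown in this report) are illustrative choices; this is not necessarily how the profiling tool computes its tables.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters per Unicode general category (e.g. 'Lo', 'Zs')."""
    counts = Counter()
    for value in values:
        for ch in value:
            # unicodedata.category returns the two-letter general category code.
            counts[unicodedata.category(ch)] += 1
    return counts


# Small hand-picked sample, not the full 438-row column.
sample = ["(주)강남패밀리호텔", "노보텔 앰배서더 강남", "㈜컨벤션헤리츠호텔포레힐지점"]
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```

The two-letter codes correspond to the category names used in the tables: `Lo` is Other Letter, `Zs` is Space Separator, `So` is Other Symbol, `Ps` and `Pe` are Open and Close Punctuation.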

Unique

Unique	414
Unique (%)	94.5%
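The report lists both "Distinct" (421) and "Unique" (414) for this column. A minimal sketch of the difference, assuming "distinct" means the number of different values and "unique" means the number of values that occur exactly once; the function name `distinct_and_unique` and the four-item sample are hypothetical.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique


# Hand-made sample: one name repeated, two names appearing once.
names = ["기쁨병원", "기쁨병원", "소중치과", "호텔그라모스"]
distinct, unique = distinct_and_unique(names)
print(distinct, unique)
```

Under that reading, the percentages in the report are these counts divided by the 438 observations (421/438 ≈ 96.1%, 414/438 ≈ 94.5%).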

Sample

1st row	타코성형외과
2nd row	밝은성모안과
3rd row	현대미학성형외과
4th row	청담여신성형외과
5th row	엠디클리닉 가슴성형센터

Value	Count	Frequency (%)
성형외과의원	33	5.2%
성형외과	20	3.2%
치과의원	16	2.5%
피부과의원	13	2.1%
의원	12	1.9%
한의원	11	1.7%
안과의원	8	1.3%
강남세브란스병원	7	1.1%
삼성서울병원	6	0.9%
치과병원	5	0.8%
Other values (469)	501	79.3%

Most occurring characters

Value	Count	Frequency (%)
과	291	8.3%
원	262	7.5%
(space)	213	6.1%
의	209	6.0%
성	158	4.5%
외	145	4.1%
형	143	4.1%
스	84	2.4%
이	75	2.1%
치	66	1.9%
Other values (354)	1866	53.1%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	3185	90.7%
Space Separator	213	6.1%
Uppercase Letter	70	2.0%
Decimal Number	18	0.5%
Other Symbol	8	0.2%
Close Punctuation	6	0.2%
Open Punctuation	6	0.2%
Lowercase Letter	4	0.1%
Other Punctuation	2	0.1%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
과	291	9.1%
원	262	8.2%
의	209	6.6%
성	158	5.0%
외	145	4.6%
형	143	4.5%
스	84	2.6%
이	75	2.4%
치	66	2.1%
병	53	1.7%
Other values (314)	1699	53.3%

Uppercase Letter

Value	Count	Frequency (%)
K	8	11.4%
C	7	10.0%
P	6	8.6%
J	6	8.6%
B	6	8.6%
Y	5	7.1%
S	4	5.7%
I	3	4.3%
W	3	4.3%
A	3	4.3%
Other values (13)	19	27.1%

Decimal Number

Value	Count	Frequency (%)
3	3	16.7%
1	3	16.7%
9	2	11.1%
8	2	11.1%
2	2	11.1%
6	2	11.1%
0	2	11.1%
4	1	5.6%
5	1	5.6%

Lowercase Letter

Value	Count	Frequency (%)
m	3	75.0%
c	1	25.0%

Other Punctuation

Value	Count	Frequency (%)
/	1	50.0%
&	1	50.0%

Space Separator

Value	Count	Frequency (%)
(space)	213	100.0%

Other Symbol

Value	Count	Frequency (%)
㈜	8	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	6	100.0%

Open Punctuation

Value	Count	Frequency (%)
(	6	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Hangul	3193	90.9%
Common	245	7.0%
Latin	74	2.1%

Most frequent character per script

Hangul

Value	Count	Frequency (%)
과	291	9.1%
원	262	8.2%
의	209	6.5%
성	158	4.9%
외	145	4.5%
형	143	4.5%
스	84	2.6%
이	75	2.3%
치	66	2.1%
병	53	1.7%
Other values (315)	1707	53.5%

Latin

Value	Count	Frequency (%)
K	8	10.8%
C	7	9.5%
P	6	8.1%
J	6	8.1%
B	6	8.1%
Y	5	6.8%
S	4	5.4%
I	3	4.1%
W	3	4.1%
m	3	4.1%
Other values (15)	23	31.1%

Common

Value	Count	Frequency (%)
(space)	213	86.9%
)	6	2.4%
(	6	2.4%
3	3	1.2%
1	3	1.2%
9	2	0.8%
8	2	0.8%
2	2	0.8%
6	2	0.8%
0	2	0.8%
Other values (4)	4	1.6%

Most occurring blocks

Value	Count	Frequency (%)
Hangul	3185	90.7%
ASCII	319	9.1%
None	8	0.2%

Most frequent character per block

Hangul

Value	Count	Frequency (%)
과	291	9.1%
원	262	8.2%
의	209	6.6%
성	158	5.0%
외	145	4.6%
형	143	4.5%
스	84	2.6%
이	75	2.4%
치	66	2.1%
병	53	1.7%
Other values (314)	1699	53.3%

ASCII

Value	Count	Frequency (%)
(space)	213	66.8%
K	8	2.5%
C	7	2.2%
P	6	1.9%
)	6	1.9%
J	6	1.9%
(	6	1.9%
B	6	1.9%
Y	5	1.6%
S	4	1.3%
Other values (29)	52	16.3%

None

Value	Count	Frequency (%)
㈜	8	100.0%

기관분류 (institution category)
Categorical

Distinct	16
Distinct (%)	3.7%
Missing	0
Missing (%)	0.0%
Memory size	3.5 KiB

성형외과	148
치과	63
피부미용	45
기타	27
한방진료	23
Other values (11)	132

Length

Max length	18
Median length	4
Mean length	4.2671233
Min length	2

Unique

Unique	1
Unique (%)	0.2%

Sample

1st row	성형외과
2nd row	안과
3rd row	성형외과
4th row	성형외과
5th row	성형외과

Common Values

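Value	Count	Frequency (%)
성형외과 (plastic surgery)	148	33.8%
치과 (dentistry)	63	14.4%
피부미용 (skin care)	45	10.3%
기타 (other)	27	6.2%
한방진료 (Korean medicine)	23	5.3%
스파, 쇼핑, 유치업체 / 기타 (spa, shopping, patient-attraction agencies / other)	22	5.0%
안과 (ophthalmology)	21	4.8%
종합검진 (comprehensive health checkup)	19	4.3%
척추/관절치료 (spine/joint treatment)	18	4.1%
호텔 (hotel)	16	3.7%
Other values (6)	36	8.2%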

Length

[Figure: histogram of lengths of the category]

Value	Count	Frequency (%)
성형외과	148	28.1%
치과	63	12.0%
기타	49	9.3%
피부미용	45	8.6%
한방진료	23	4.4%
스파	22	4.2%
쇼핑	22	4.2%
유치업체	22	4.2%
(empty)	22	4.2%
안과	21	4.0%
Other values (9)	89	16.9%

Missing values

A simple visualization of nullity by column.

Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

First rows

	기관명	기관분류
0	타코성형외과	성형외과
1	밝은성모안과	안과
2	현대미학성형외과	성형외과
3	청담여신성형외과	성형외과
4	엠디클리닉 가슴성형센터	성형외과
5	바탕성형외과	성형외과
6	프리미어성형외과의원	성형외과
7	오페라성형외과의원	성형외과
8	소중치과	치과
9	포비성형외과	성형외과

Last rows

	기관명	기관분류
428	㈜컨벤션헤리츠호텔포레힐지점	호텔
429	(주)강남패밀리호텔	호텔
430	호텔그라모스	호텔
431	호텔더디자이너스	호텔
432	트리아관광호텔	호텔
433	베스트웨스턴 프리미어 강남호텔	호텔
434	노보텔 앰배서더 강남	호텔
435	호텔리츠칼튼서울	호텔
436	제이비스관광호텔	호텔
437	오크우드프리미어코엑스센터	호텔

Most frequently occurring

	기관명	기관분류	# duplicates
0	글로비성형외과	성형외과	2
1	기쁨병원	종합검진	2
2	이문원 한의원	한방진료	2
3	하늘체한의원	한방진료	2
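The duplicate table above can be approximated with a simple count over (기관명, 기관분류) pairs. The sketch below is a stdlib-only illustration on a hand-made five-row sample built from rows shown in this report; `duplicate_summary` is a hypothetical helper, and the exact definition of "duplicate rows" used by the profiling tool is an assumption here.

```python
from collections import Counter


def duplicate_summary(rows):
    """Return the repeated rows with their counts, and the number of surplus copies."""
    counts = Counter(rows)
    # Keep only rows that appear more than once.
    repeated = {row: n for row, n in counts.items() if n > 1}
    # Surplus copies: every occurrence beyond the first of each repeated row.
    extra_rows = sum(n - 1 for n in repeated.values())
    return repeated, extra_rows


# Rows are (기관명, 기관분류) tuples; a small sample, not the full dataset.
rows = [
    ("글로비성형외과", "성형외과"),
    ("기쁨병원", "종합검진"),
    ("글로비성형외과", "성형외과"),
    ("소중치과", "치과"),
    ("기쁨병원", "종합검진"),
]
repeated, extra_rows = duplicate_summary(rows)

for row, n in repeated.items():
    print(row, n)
print("surplus copies:", extra_rows)
```

With four rows each appearing twice, as in the table above, both the number of repeated rows and the number of surplus copies come out to 4, which matches the "Duplicate rows 4" figure under either reading.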
