gimi9 Pandas Profiling

Dataset statistics

Number of variables	3
Number of observations	1337
Missing cells	0
Missing cells (%)	0.0%
Duplicate rows	0
Duplicate rows (%)	0.0%
Total size in memory	31.5 KiB
Average record size in memory	24.1 B
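The average record size above is consistent with dividing the total in-memory size by the number of observations. The check below is a minimal sketch that assumes 1 KiB = 1024 bytes; the helper name `average_record_size` is illustrative and is not part of the profiling tool.

```python
def average_record_size(total_kib: float, n_rows: int) -> float:
    """Return the average number of bytes per row for a table of total_kib KiB."""
    return total_kib * 1024 / n_rows


# Figures taken from the "Dataset statistics" block above.
avg = average_record_size(31.5, 1337)
print(f"{avg:.1f} B")  # 24.1 B
```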

Variable types

Text	2
Categorical	1

Dataset

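Description	부산광역시해운대구_소독의무시설현황_20200630 (Haeundae-gu, Busan Metropolitan City: status of facilities subject to mandatory disinfection, 2020-06-30)
Author	부산광역시 해운대구 (Haeundae-gu, Busan Metropolitan City)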
URL	http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=3075595

Reproduction

Analysis started	2023-12-10 16:28:47.942285
Analysis finished	2023-12-10 16:28:48.666558
Duration	0.72 seconds
Software version	ydata-profiling v4.5.1
Download configuration	config.json

대상시설 (target facility)
Text

Distinct	1330
Distinct (%)	99.5%
Missing	0
Missing (%)	0.0%
Memory size	10.6 KiB

Length

Max length	45
Median length	37
Mean length	9.3186238
Min length	1

Characters and Unicode

Total characters	12459
Distinct characters	638
Distinct categories	11
Distinct scripts	3
Distinct blocks	5

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
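As a rough illustration of how such per-category counts can be obtained, the sketch below uses Python's standard `unicodedata` module on the five sample rows of this variable. The `CATEGORY_NAMES` mapping and the `category_counts` helper are assumptions made for this example; the report itself may compute its figures differently, and scripts and blocks are not covered because the standard library does not expose them.

```python
import unicodedata
from collections import Counter

# Readable names for the general-category codes that occur in this report.
# Codes missing from this mapping are reported as the raw two-letter code.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Nl": "Letter Number",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "So": "Other Symbol",
    "Sm": "Math Symbol",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# The five sample rows listed for this variable.
sample = [
    "라비드아틀란호텔",
    "라마다 앙코르 해운대 호텔",
    "소사이어티에스 호텔",
    "선트리 호텔",
    "휘겔리",
]
for name, count in category_counts(sample).most_common():
    print(name, count)
```

Hangul syllables fall under "Other Letter", which is why that category dominates the tables for this variable.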

Unique

Unique	1323
Unique (%)	99.0%
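The difference between the Distinct and Unique figures can be sketched as follows, assuming the usual reading in which Distinct counts the different values and Unique counts the values that occur exactly once. The helper `distinct_and_unique` and the toy list are hypothetical; only the final percentages use numbers from this report.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Made-up toy data: "B" is repeated, so it is distinct but not unique.
toy = ["A", "B", "B", "C"]
print(distinct_and_unique(toy))  # (3, 2)

# Percentages for this variable, relative to the 1337 observations.
print(f"{1330 / 1337:.1%}")  # 99.5%
print(f"{1323 / 1337:.1%}")  # 99.0%
```

Under that reading, the figures are consistent with seven facility names each appearing twice: 1323 + 7 × 2 = 1337 rows and 1323 + 7 = 1330 distinct values.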

Sample

1st row	라비드아틀란호텔
2nd row	라마다 앙코르 해운대 호텔
3rd row	소사이어티에스 호텔
4th row	선트리 호텔
5th row	휘겔리

Value	Count	Frequency (%)
급식실	44	2.2%
호텔	35	1.7%
해운대	28	1.4%
포함	19	0.9%
해운대점	18	0.9%
hotel	16	0.8%
모텔	11	0.5%
주상복합	10	0.5%
부산	9	0.4%
스타벅스	7	0.3%
Other values (1678)	1821	90.2%

Most occurring characters

Value	Count	Frequency (%)
(space)	711	5.7%
)	362	2.9%
(	362	2.9%
스	305	2.4%
대	278	2.2%
이	268	2.2%
텔	242	1.9%
운	228	1.8%
해	221	1.8%
호	182	1.5%
Other values (628)	9300	74.6%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	9926	79.7%
Space Separator	711	5.7%
Uppercase Letter	534	4.3%
Close Punctuation	363	2.9%
Open Punctuation	363	2.9%
Decimal Number	269	2.2%
Lowercase Letter	174	1.4%
Other Punctuation	75	0.6%
Other Symbol	34	0.3%
Dash Punctuation	8	0.1%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
스	305	3.1%
대	278	2.8%
이	268	2.7%
텔	242	2.4%
운	228	2.3%
해	221	2.2%
호	182	1.8%
센	140	1.4%
산	137	1.4%
아	136	1.4%
Other values (554)	7789	78.5%

Uppercase Letter

Value	Count	Frequency (%)
T	46	8.6%
L	45	8.4%
E	45	8.4%
O	40	7.5%
S	37	6.9%
N	35	6.6%
H	34	6.4%
C	29	5.4%
A	29	5.4%
I	24	4.5%
Other values (15)	170	31.8%

Lowercase Letter

Value	Count	Frequency (%)
o	25	14.4%
e	23	13.2%
n	15	8.6%
l	14	8.0%
a	13	7.5%
t	12	6.9%
i	12	6.9%
m	9	5.2%
s	8	4.6%
r	8	4.6%
Other values (13)	35	20.1%

Decimal Number

Value	Count	Frequency (%)
2	71	26.4%
1	58	21.6%
3	47	17.5%
0	22	8.2%
4	20	7.4%
9	14	5.2%
7	12	4.5%
5	11	4.1%
6	10	3.7%
8	4	1.5%

Other Punctuation

Value	Count	Frequency (%)
,	51	68.0%
.	11	14.7%
&	7	9.3%
※	3	4.0%
!	1	1.3%
/	1	1.3%
#	1	1.3%

Close Punctuation

Value	Count	Frequency (%)
)	362	99.7%
]	1	0.3%

Open Punctuation

Value	Count	Frequency (%)
(	362	99.7%
[	1	0.3%

Letter Number

Value	Count	Frequency (%)
Ⅰ	1	50.0%
Ⅱ	1	50.0%

Space Separator

Value	Count	Frequency (%)
(space)	711	100.0%

Other Symbol

Value	Count	Frequency (%)
㈜	34	100.0%

Dash Punctuation

Value	Count	Frequency (%)
-	8	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Hangul	9960	79.9%
Common	1789	14.4%
Latin	710	5.7%

Most frequent character per script

Hangul

Value	Count	Frequency (%)
스	305	3.1%
대	278	2.8%
이	268	2.7%
텔	242	2.4%
운	228	2.3%
해	221	2.2%
호	182	1.8%
센	140	1.4%
산	137	1.4%
아	136	1.4%
Other values (555)	7823	78.5%

Latin

Value	Count	Frequency (%)
T	46	6.5%
L	45	6.3%
E	45	6.3%
O	40	5.6%
S	37	5.2%
N	35	4.9%
H	34	4.8%
C	29	4.1%
A	29	4.1%
o	25	3.5%
Other values (40)	345	48.6%

Common

Value	Count	Frequency (%)
(space)	711	39.7%
)	362	20.2%
(	362	20.2%
2	71	4.0%
1	58	3.2%
,	51	2.9%
3	47	2.6%
0	22	1.2%
4	20	1.1%
9	14	0.8%
Other values (13)	71	4.0%

Most occurring blocks

Value	Count	Frequency (%)
Hangul	9926	79.7%
ASCII	2494	20.0%
None	34	0.3%
Punctuation	3	< 0.1%
Number Forms	2	< 0.1%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
(space)	711	28.5%
)	362	14.5%
(	362	14.5%
2	71	2.8%
1	58	2.3%
,	51	2.0%
3	47	1.9%
T	46	1.8%
L	45	1.8%
E	45	1.8%
Other values (60)	696	27.9%

Hangul

Value	Count	Frequency (%)
스	305	3.1%
대	278	2.8%
이	268	2.7%
텔	242	2.4%
운	228	2.3%
해	221	2.2%
호	182	1.8%
센	140	1.4%
산	137	1.4%
아	136	1.4%
Other values (554)	7789	78.5%

None

Value	Count	Frequency (%)
㈜	34	100.0%

Punctuation

Value	Count	Frequency (%)
※	3	100.0%

Number Forms

Value	Count	Frequency (%)
Ⅰ	1	50.0%
Ⅱ	1	50.0%

주소 (address)
Text

Distinct	1316
Distinct (%)	98.4%
Missing	0
Missing (%)	0.0%
Memory size	10.6 KiB

Length

Max length	44
Median length	35
Mean length	18.444278
Min length	6

Characters and Unicode

Total characters	24660
Distinct characters	305
Distinct categories	10
Distinct scripts	3
Distinct blocks	3

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique	1296
Unique (%)	96.9%

Sample

1st row	구남로 37 (중동)
2nd row	구남로 9 (우동)
3rd row	구남로12번길 37(우동)
4th row	달맞이길 209 (중동)
5th row	달맞이길62번가길 37, 6층

Value	Count	Frequency (%)
우동	392	8.5%
중동	232	5.0%
좌동	221	4.8%
재송동	105	2.3%
해운대해변로	92	2.0%
	79	1.7%
해운대로	73	1.6%
송정동	71	1.5%
반여동	60	1.3%
좌동순환로	38	0.8%
Other values (1526)	3247	70.4%

Most occurring characters

Value	Count	Frequency (%)
(space)	3549	14.4%
1	1584	6.4%
동	1465	5.9%
2	1022	4.1%
(	948	3.8%
)	947	3.8%
로	945	3.8%
3	707	2.9%
,	694	2.8%
4	603	2.4%
Other values (295)	12196	49.5%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	11463	46.5%
Decimal Number	6521	26.4%
Space Separator	3549	14.4%
Open Punctuation	948	3.8%
Close Punctuation	947	3.8%
Other Punctuation	713	2.9%
Dash Punctuation	393	1.6%
Math Symbol	60	0.2%
Uppercase Letter	58	0.2%
Lowercase Letter	8	< 0.1%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
동	1465	12.8%
로	945	8.2%
해	530	4.6%
길	492	4.3%
우	459	4.0%
대	454	4.0%
번	436	3.8%
운	399	3.5%
층	390	3.4%
송	386	3.4%
Other values (264)	5507	48.0%

Decimal Number

Value	Count	Frequency (%)
1	1584	24.3%
2	1022	15.7%
3	707	10.8%
4	603	9.2%
0	494	7.6%
6	463	7.1%
5	455	7.0%
7	433	6.6%
9	391	6.0%
8	369	5.7%

Uppercase Letter

Value	Count	Frequency (%)
A	14	24.1%
B	12	20.7%
C	11	19.0%
E	8	13.8%
P	5	8.6%
N	4	6.9%
I	1	1.7%
K	1	1.7%
J	1	1.7%
L	1	1.7%

Other Punctuation

Value	Count	Frequency (%)
,	694	97.3%
.	9	1.3%
@	8	1.1%
·	2	0.3%

Lowercase Letter

Value	Count	Frequency (%)
e	7	87.5%
s	1	12.5%

Space Separator

Value	Count	Frequency (%)
(space)	3549	100.0%

Open Punctuation

Value	Count	Frequency (%)
(	948	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	947	100.0%

Dash Punctuation

Value	Count	Frequency (%)
-	393	100.0%

Math Symbol

Value	Count	Frequency (%)
~	60	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Common	13131	53.2%
Hangul	11463	46.5%
Latin	66	0.3%

Most frequent character per script

Hangul

Value	Count	Frequency (%)
동	1465	12.8%
로	945	8.2%
해	530	4.6%
길	492	4.3%
우	459	4.0%
대	454	4.0%
번	436	3.8%
운	399	3.5%
층	390	3.4%
송	386	3.4%
Other values (264)	5507	48.0%

Common

Value	Count	Frequency (%)
(space)	3549	27.0%
1	1584	12.1%
2	1022	7.8%
(	948	7.2%
)	947	7.2%
3	707	5.4%
,	694	5.3%
4	603	4.6%
0	494	3.8%
6	463	3.5%
Other values (9)	2120	16.1%

Latin

Value	Count	Frequency (%)
A	14	21.2%
B	12	18.2%
C	11	16.7%
E	8	12.1%
e	7	10.6%
P	5	7.6%
N	4	6.1%
I	1	1.5%
s	1	1.5%
K	1	1.5%
Other values (2)	2	3.0%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	13195	53.5%
Hangul	11463	46.5%
None	2	< 0.1%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
(space)	3549	26.9%
1	1584	12.0%
2	1022	7.7%
(	948	7.2%
)	947	7.2%
3	707	5.4%
,	694	5.3%
4	603	4.6%
0	494	3.7%
6	463	3.5%
Other values (20)	2184	16.6%

Hangul

Value	Count	Frequency (%)
동	1465	12.8%
로	945	8.2%
해	530	4.6%
길	492	4.3%
우	459	4.0%
대	454	4.0%
번	436	3.8%
운	399	3.5%
층	390	3.4%
송	386	3.4%
Other values (264)	5507	48.0%

None

Value	Count	Frequency (%)
·	2	100.0%

유형 (type)
Categorical

Distinct	12
Distinct (%)	0.9%
Missing	0
Missing (%)	0.0%
Memory size	10.6 KiB

2식품접객업소	379
11대형건축물	312
1숙박업소	229
13공동주택	117
12어린이집 및유치원	97
Other values (7)	203

Length

Max length	11
Median length	7
Mean length	6.4106208
Min length	3

Unique

Unique	0
Unique (%)	0.0%

Sample

1st row	1숙박업소
2nd row	1숙박업소
3rd row	1숙박업소
4th row	1숙박업소
5th row	1숙박업소

Common Values

Value	Count	Frequency (%)
2식품접객업소	379	28.3%
11대형건축물	312	23.3%
1숙박업소	229	17.1%
13공동주택	117	8.8%
12어린이집 및유치원	97	7.3%
9학교	70	5.2%
5병원	38	2.8%
3교통시설	30	2.2%
6집단급식소	29	2.2%
4대형유통	28	2.1%
Other values (2)	8	0.6%
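The Frequency (%) column in the table above is each count divided by the 1337 observations. The sketch below reproduces it from the listed counts; the function name `frequency_table` is illustrative, and the counts are copied from the table rather than recomputed from the raw data.

```python
def frequency_table(counts):
    """Map each label to its share of the total, as a percentage rounded to 0.1."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Counts copied from the "Common Values" table for this variable.
counts = {
    "2식품접객업소": 379,
    "11대형건축물": 312,
    "1숙박업소": 229,
    "13공동주택": 117,
    "12어린이집 및유치원": 97,
    "9학교": 70,
    "5병원": 38,
    "3교통시설": 30,
    "6집단급식소": 29,
    "4대형유통": 28,
    "Other values (2)": 8,
}

print(sum(counts.values()))  # 1337
for label, pct in frequency_table(counts).items():
    print(f"{label}\t{pct}%")
```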

Length

Histogram of lengths of the category (figure not preserved)

Words

Value	Count	Frequency (%)
2식품접객업소	379	26.4%
11대형건축물	312	21.8%
1숙박업소	229	16.0%
13공동주택	117	8.2%
12어린이집	97	6.8%
및유치원	97	6.8%
9학교	70	4.9%
5병원	38	2.6%
3교통시설	30	2.1%
6집단급식소	29	2.0%
Other values (3)	36	2.5%

Missing values

Count	A simple visualization of nullity by column.
Matrix	Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

	대상시설	주소	유형
0	라비드아틀란호텔	구남로 37 (중동)	1숙박업소
1	라마다 앙코르 해운대 호텔	구남로 9 (우동)	1숙박업소
2	소사이어티에스 호텔	구남로12번길 37(우동)	1숙박업소
3	선트리 호텔	달맞이길 209 (중동)	1숙박업소
4	휘겔리	달맞이길62번가길 37, 6층	1숙박업소
5	해운대비치	달맞이길62번길 53, 6~7층 (중동)	1숙박업소
6	그랑빌	달맞이길62번길78	1숙박업소
7	센텀프리미어호텔	센텀1로 17 (우동)	1숙박업소
8	호텔 메리케이 센텀	센텀3로 20 (우동)	1숙박업소
9	MRINE K POOL VILLA (마린케이풀빌라)	송정광어골로 3(송정동)	1숙박업소

Last rows

	대상시설	주소	유형
1327	신재초등학교 (급식실)	해운대로 81번길 55( 재송2동 1024 )	9학교
1328	부산국제외국어고등학교	해운대로469번길 50 (우동)	9학교
1329	부산문화여자고등학교	해운대로469번길 50 (우동)	9학교
1330	해운대공업고등학교 (급식실)	해운대로469번길 96 (우동)	9학교
1331	해강중학교 (급식실)	해운대해변로 17( 우2동 1417-1번지 )	9학교
1332	해강고등학교 (급식실)	해운대해변로 33 (우동, 지상1층 )	9학교
1333	해강초등학교 (급식실)	해운대해변로 43( 우1동 1388 )	9학교
1334	부산골프고등학교	반여동 1183-25	9학교
1335	동부산대학	운봉길 60(반송동)	9학교
1336	센텀고등학교 (급식실)	해운대로 246 (재송동)	9학교
