gimi9 Pandas Profiling

Dataset statistics

Number of variables	1
Number of observations	26
Missing cells	10
Missing cells (%)	38.5%
Duplicate rows	2
Duplicate rows (%)	7.7%
Total size in memory	340.0 B
Average record size in memory	13.1 B
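
The percentage and average figures above follow directly from the raw counts. A minimal sketch, using only the numbers shown in this table; the variable and function names are illustrative and not part of ydata-profiling:

```python
# Counts copied from the "Dataset statistics" table.
n_variables = 1
n_observations = 26
missing_cells = 10
duplicate_rows = 2
total_bytes = 340.0


def pct(part: float, whole: float) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Missing cells are measured against all cells (rows x columns).
missing_pct = pct(missing_cells, n_observations * n_variables)
# Duplicate rows are measured against the number of rows.
duplicate_pct = pct(duplicate_rows, n_observations)
# Average record size is total memory divided by the number of rows.
avg_record_bytes = round(total_bytes / n_observations, 1)

print(missing_pct, duplicate_pct, avg_record_bytes)
```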

Variable types

Text	1

Dataset

Description	Status of horse riding facilities (승마장현황)
Author	Jeollabuk-do (전라북도)
URL	https://www.bigdatahub.go.kr/opendata/dataSet/detail.nm?contentId=37&rlik=49451aebf056b486&serviceId=202751

Alerts

Dataset has 2 (7.7%) duplicate rows	Duplicates
`;[1]` has 10 (38.5%) missing values	Missing

Reproduction

Analysis started	2024-03-14 00:09:58.106383
Analysis finished	2024-03-14 00:09:58.216612
Duration	0.11 seconds
Software version	ydata-profiling v4.5.1
Download configuration	config.json

;[1]
Text

MISSING

Distinct	15
Distinct (%)	93.8%
Missing	10
Missing (%)	38.5%
Memory size	340.0 B

Length

Max length	76
Median length	3
Mean length	7.3125
Min length	1
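
The mean length is consistent with the totals reported elsewhere in this report: 117 total characters over 16 non-missing values gives 7.3125. As a rough sketch of how such length statistics can be computed, the snippet below uses only the non-missing values visible under "Last rows", so its results differ from the table above; the variable names are illustrative.

```python
from statistics import mean, median

# Non-missing values visible in the "Last rows" sample of this report.
# The full set of 16 non-missing values is not shown, so these numbers
# will not match the Length table.
values = ["남원시", "김제시", "장수군", "고창군", "부안군", ";"]

# len() on a Python str counts code points, which is the unit used here.
lengths = [len(v) for v in values]

length_stats = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": mean(lengths),
    "min": min(lengths),
}
print(length_stats)
```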

Characters and Unicode

Total characters	117
Distinct characters	39
Distinct categories	7
Distinct scripts	2
Distinct blocks	3

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
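
As an illustration of these character properties, the general category of each code point can be looked up with Python's standard `unicodedata` module. This is a sketch of the idea, not the code ydata-profiling runs; `category_counts` and `CATEGORY_NAMES` are hypothetical helper names, and the input is the 3rd sample row of this column.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Zs": "Space Separator",
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Ps": "Open Punctuation",
    "Pi": "Initial Punctuation",
}


def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category."""
    codes = (unicodedata.category(ch) for ch in text)
    # Fall back to the two-letter code for categories not listed above.
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)


# The 3rd sample row of the profiled column.
sample = "(‘16년 6월말 현재) (개소, 두)"
counts = category_counts(sample)
print(counts.most_common())
```

Scripts and blocks are not exposed by `unicodedata`, so reproducing those tables would need a different data source.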

Unique

Unique	14
Unique (%)	87.5%

Sample

1st row
2nd row	;
3rd row	(‘16년 6월말 현재) (개소, 두)
4th row	;[2]
5th row	시군

Value	Count	Frequency (%)
	2	10.0%
전주시	1	5.0%
고창군	1	5.0%
장수군	1	5.0%
김제시	1	5.0%
남원시	1	5.0%
정읍시	1	5.0%
익산시	1	5.0%
군산시	1	5.0%
9	1	5.0%
Other values (9)	9	45.0%

Most occurring characters

Value	Count	Frequency (%)
(space)	61	52.1%
시	7	6.0%
군	5	4.3%
)	3	2.6%
;	3	2.6%
(	3	2.6%
6	2	1.7%
산	2	1.7%
원	1	0.9%
익	1	0.9%
Other values (29)	29	24.8%

Most occurring categories

Value	Count	Frequency (%)
Space Separator	61	52.1%
Other Letter	38	32.5%
Decimal Number	5	4.3%
Close Punctuation	4	3.4%
Other Punctuation	4	3.4%
Open Punctuation	4	3.4%
Initial Punctuation	1	0.9%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
시	7	18.4%
군	5	13.2%
산	2	5.3%
원	1	2.6%
익	1	2.6%
정	1	2.6%
읍	1	2.6%
남	1	2.6%
김	1	2.6%
전	1	2.6%
Other values (17)	17	44.7%

Decimal Number

Value	Count	Frequency (%)
6	2	40.0%
9	1	20.0%
2	1	20.0%
1	1	20.0%

Close Punctuation

Value	Count	Frequency (%)
)	3	75.0%
]	1	25.0%

Other Punctuation

Value	Count	Frequency (%)
;	3	75.0%
,	1	25.0%

Open Punctuation

Value	Count	Frequency (%)
(	3	75.0%
[	1	25.0%

Space Separator

Value	Count	Frequency (%)
(space)	61	100.0%

Initial Punctuation

Value	Count	Frequency (%)
‘	1	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Common	79	67.5%
Hangul	38	32.5%

Most frequent character per script

Hangul

Value	Count	Frequency (%)
시	7	18.4%
군	5	13.2%
산	2	5.3%
원	1	2.6%
익	1	2.6%
정	1	2.6%
읍	1	2.6%
남	1	2.6%
김	1	2.6%
전	1	2.6%
Other values (17)	17	44.7%

Common

Value	Count	Frequency (%)
(space)	61	77.2%
)	3	3.8%
;	3	3.8%
(	3	3.8%
6	2	2.5%
9	1	1.3%
‘	1	1.3%
]	1	1.3%
2	1	1.3%
[	1	1.3%
Other values (2)	2	2.5%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	78	66.7%
Hangul	38	32.5%
Punctuation	1	0.9%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
(space)	61	78.2%
)	3	3.8%
;	3	3.8%
(	3	3.8%
6	2	2.6%
9	1	1.3%
]	1	1.3%
2	1	1.3%
[	1	1.3%
,	1	1.3%

Hangul

Value	Count	Frequency (%)
시	7	18.4%
군	5	13.2%
산	2	5.3%
원	1	2.6%
익	1	2.6%
정	1	2.6%
읍	1	2.6%
남	1	2.6%
김	1	2.6%
전	1	2.6%
Other values (17)	17	44.7%

Punctuation

Value	Count	Frequency (%)
‘	1	100.0%

Missing values

Count	A simple visualization of nullity by column.
Matrix	A nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
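
The idea behind a nullity display can be sketched in plain text: each row becomes one mark, filled where a value is present and empty where it is missing. The snippet below is an illustrative stand-in, not the plotting code used by the report; `nullity_strip` is a hypothetical helper, and the input is rows 16 to 25 as listed under "Last rows", with `None` standing in for `<NA>`.

```python
# Rows 16-25 of the profiled column, as listed under "Last rows".
# None stands in for <NA>.
last_rows = ["남원시", None, "김제시", None, "장수군",
             None, None, "고창군", "부안군", ";"]


def nullity_strip(column) -> str:
    """Render one column as a strip: '#' = present, '.' = missing."""
    return "".join("." if value is None else "#" for value in column)


strip = nullity_strip(last_rows)
missing = strip.count(".")
print(strip, f"{missing}/{len(last_rows)} missing")
```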

First rows

	;[1]
0	
1	;
2	(‘16년 6월말 현재) (개소, 두)
3	;[2]
4	시군
5	<NA>
6	계 (9)
7	전주시
8	군산시
9	익산시

Last rows

	;[1]
16	남원시
17	<NA>
18	김제시
19	<NA>
20	장수군
21	<NA>
22	<NA>
23	고창군
24	부안군
25	;

Most frequently occurring

	;[1]	# duplicates
1	<NA>	10
0	;	2
