gimi9 Pandas Profiling

Dataset statistics

Number of variables	1
Number of observations	1727
Missing cells	1716
Missing cells (%)	99.4%
Duplicate rows	3
Duplicate rows (%)	0.2%
Total size in memory	13.6 KiB
Average record size in memory	8.1 B

Variable types

Text	1

Dataset

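Description	Management of geographical indication certification, review, and related tasks (registration number, registered name, registration date, target region, planned production volume, composition status, etc.)
Author	국립농산물품질관리원 (National Agricultural Products Quality Management Service)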
URL	https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20220204000000001691

Alerts

Dataset has 3 (0.2%) duplicate rows	Duplicates
`ࡱ` has 1716 (99.4%) missing values	Missing

Reproduction

Analysis started	2024-03-23 07:27:13.599076
Analysis finished	2024-03-23 07:27:13.864470
Duration	0.27 seconds
Software version	ydata-profiling v4.5.1
Download configuration	config.json

ࡱ
Text

MISSING

Distinct	9
Distinct (%)	81.8%
Missing	1716
Missing (%)	99.4%
Memory size	13.6 KiB

Length

Max length	5
Median length	1
Mean length	2
Min length	1

Characters and Unicode

Total characters	22
Distinct characters	12
Distinct categories	7
Distinct scripts	2
Distinct blocks	3

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
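As a rough illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not necessarily how ydata-profiling computes its tables, and the sample strings are hypothetical, not the dataset's actual values. The module returns two-letter codes rather than the long names used in this report: `Cc` is Control, `Zs` is Space Separator, `Nd` is Decimal Number, `Po` is Other Punctuation, `Mn` is Nonspacing Mark, `Ps` is Open Punctuation, and `Pe` is Close Punctuation.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in the given strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Hypothetical sample values (assumption: chosen only to cover several categories).
sample = ["@", "0", "( )", "\x00\x01", "\u0331"]
counts = category_counts(sample)

for code, count in counts.most_common():
    print(code, count)
```

Control characters and combining marks dominating such a count is a common sign that a binary file was parsed as text.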

Unique

Unique	7
Unique (%)	63.6%

Sample

1st row
2nd row	@
3rd row	)̱
4th row	0
5th row	̀0

Value	Count	Frequency (%)
	2	18.2%
	2	18.2%
	1	9.1%
@	1	9.1%
̱	1	9.1%
0	1	9.1%
̀0	1	9.1%
	1	9.1%
	1	9.1%

Most occurring characters

Value	Count	Frequency (%)
	6	27.3%
	4	18.2%
	2	9.1%
0	2	9.1%
	1	4.5%
@	1	4.5%
	1	4.5%
)	1	4.5%
̱	1	4.5%
̀	1	4.5%
Other values (2)	2	9.1%

Most occurring categories

Value	Count	Frequency (%)
Control	10	45.5%
Space Separator	4	18.2%
Decimal Number	2	9.1%
Other Punctuation	2	9.1%
Nonspacing Mark	2	9.1%
Close Punctuation	1	4.5%
Open Punctuation	1	4.5%

Most frequent character per category

Control

Value	Count	Frequency (%)
	6	60.0%
	2	20.0%
	1	10.0%
	1	10.0%

Other Punctuation

Value	Count	Frequency (%)
@	1	50.0%
.	1	50.0%

Nonspacing Mark

Value	Count	Frequency (%)
̱	1	50.0%
̀	1	50.0%

Space Separator

Value	Count	Frequency (%)
	4	100.0%

Decimal Number

Value	Count	Frequency (%)
0	2	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	1	100.0%

Open Punctuation

Value	Count	Frequency (%)
(	1	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Common	20	90.9%
Inherited	2	9.1%

Most frequent character per script

Common

Value	Count	Frequency (%)
	6	30.0%
	4	20.0%
	2	10.0%
0	2	10.0%
	1	5.0%
@	1	5.0%
	1	5.0%
)	1	5.0%
(	1	5.0%
.	1	5.0%

Inherited

Value	Count	Frequency (%)
̱	1	50.0%
̀	1	50.0%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	19	86.4%
Diacriticals	2	9.1%
None	1	4.5%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
	6	31.6%
	4	21.1%
	2	10.5%
0	2	10.5%
	1	5.3%
@	1	5.3%
)	1	5.3%
(	1	5.3%
.	1	5.3%

None

Value	Count	Frequency (%)
	1	100.0%

Diacriticals

Value	Count	Frequency (%)
̱	1	50.0%
̀	1	50.0%

Count

A simple visualization of nullity by column.

Matrix

A nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
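The idea behind a nullity display can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper names `nullity_rows` and `missing_percent` and the two-column sample are hypothetical, and the report's actual plots are produced by ydata-profiling, not by this code. The last call reuses the counts reported above (1716 missing cells out of 1727 observations).

```python
def nullity_rows(rows, columns):
    """Return one string per row: '#' for a present value, '.' for a missing one."""
    return [
        "".join("." if row.get(col) is None else "#" for col in columns)
        for row in rows
    ]


def missing_percent(missing, total):
    """Share of missing cells as a percentage rounded to one decimal place."""
    return round(100 * missing / total, 1)


# Hypothetical two-column sample; None stands in for <NA>.
sample = [
    {"a": None, "b": "x"},
    {"a": "@", "b": None},
    {"a": None, "b": None},
]
matrix = nullity_rows(sample, ["a", "b"])
print("\n".join(matrix))

# Counts taken from the report: 1716 missing cells out of 1727 observations.
print(missing_percent(1716, 1727))
```

Each output line corresponds to one row, so long vertical runs of `.` make a mostly empty column easy to spot.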

First rows

	ࡱ
0	<NA>
1	<NA>
2	<NA>
3	<NA>
4	<NA>
5	<NA>
6	<NA>
7	<NA>
8	<NA>
9	<NA>

Last rows

	ࡱ
1717	<NA>
1718	<NA>
1719	<NA>
1720	<NA>
1721	<NA>
1722	<NA>
1723	<NA>
1724	<NA>
1725	<NA>
1726	<NA>

Most frequently occurring

	ࡱ	# duplicates
2	<NA>	1716
0		2
1		2
