gimi9 Pandas Profiling

Dataset statistics

Number of variables	5
Number of observations	40
Missing cells	28
Missing cells (%)	14.0%
Duplicate rows	1
Duplicate rows (%)	2.5%
Total size in memory	1.7 KiB
Average record size in memory	43.3 B
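The percentages above follow directly from the counts. As a minimal sketch (the variable names are illustrative and not part of the report), the arithmetic can be checked against the per-column missing counts listed in the Alerts section of this report:

```python
# Figures copied from the "Dataset statistics" table of this report.
n_variables = 5
n_observations = 40
missing_cells = 28
duplicate_rows = 1

total_cells = n_variables * n_observations             # 5 * 40 = 200 cells
missing_pct = 100 * missing_cells / total_cells         # share of empty cells
duplicate_pct = 100 * duplicate_rows / n_observations   # share of duplicate rows

# Per-column missing counts as reported in the Alerts section.
per_column_missing = {
    "* 주요 FTA별 특혜관세 적용현황": 19,
    "Unnamed: 2": 3,
    "Unnamed: 3": 3,
    "Unnamed: 4": 3,
}
# The column counts should add up to the dataset-level total.
assert sum(per_column_missing.values()) == missing_cells

print(f"{missing_pct:.1f}% missing cells, {duplicate_pct:.1f}% duplicate rows")
```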

Variable types

Text	1
Categorical	1
Unsupported	3

Dataset

Description	Data on the application of preferential tariffs under major FTAs and on trade performance with each country implementing an FTA. See the attached file for details.
URL	https://www.data.go.kr/data/15121019/fileData.do

Alerts

Dataset has 1 (2.5%) duplicate rows	Duplicates
`* 주요 FTA별 특혜관세 적용현황` has 19 (47.5%) missing values	Missing
`Unnamed: 2` has 3 (7.5%) missing values	Missing
`Unnamed: 3` has 3 (7.5%) missing values	Missing
`Unnamed: 4` has 3 (7.5%) missing values	Missing
`Unnamed: 2` is an unsupported type, check if it needs cleaning or further analysis	Unsupported
`Unnamed: 3` is an unsupported type, check if it needs cleaning or further analysis	Unsupported
`Unnamed: 4` is an unsupported type, check if it needs cleaning or further analysis	Unsupported

Reproduction

Analysis started	2023-12-12 07:57:13.920068
Analysis finished	2023-12-12 07:57:14.441820
Duration	0.52 seconds
Software version	ydata-profiling v4.5.1
Download configuration	config.json

* 주요 FTA별 특혜관세 적용현황
Text

MISSING

Distinct	21
Distinct (%)	100.0%
Missing	19
Missing (%)	47.5%
Memory size	452.0 B

Length

Max length	60
Median length	17
Mean length	15.333333
Min length	5
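The length figures are plain summary statistics over the character length of each non-missing value. The sketch below assumes that definition; `length_stats` is a hypothetical helper, and the sample holds only three values taken from the report, so its output differs from the full-column figures above. The last line checks that the reported mean is numerically the reported total of 322 characters divided by the 21 non-missing values.

```python
from statistics import mean, median

def length_stats(values):
    """Summarise string lengths, skipping missing (None) entries."""
    lengths = [len(v) for v in values if v is not None]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }

# Three values from the report's sample rows, with missing cells in between.
sample = ["협정(발효일)", None, "칠레(’04.4.)", None, "EFTA(’06.9.)"]
stats = length_stats(sample)
print(stats)

# Reported totals: 322 characters over 21 non-missing values.
reported_mean = round(322 / 21, 6)
print(reported_mean)
```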

Characters and Unicode

Total characters	322
Distinct characters	105
Distinct categories	10
Distinct scripts	3
Distinct blocks	3

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
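As a rough illustration of how such character properties can be read, the sketch below uses Python's standard `unicodedata` module to count general categories in one sample value from this column. It is not necessarily how the profiling tool computes its tables. The `CATEGORY_NAMES` mapping is written out by hand for the ten categories that appear in this report, and `category_counts` is a hypothetical helper. Script and block properties are not exposed by `unicodedata`, so they are left out.

```python
import unicodedata
from collections import Counter

# Two-letter general-category codes mapped to the long names used in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pf": "Final Punctuation",
    "Cc": "Control",
    "Sm": "Math Symbol",
}

def category_counts(text):
    """Count Unicode general categories over the characters of text."""
    return Counter(unicodedata.category(ch) for ch in text)

# Second sample row of the column; the apostrophe is U+2019.
counts = category_counts("칠레(\u201904.4.)")
for code, n in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), n)
```

The right single quotation mark (U+2019) falls under Final Punctuation, which is why that category appears in the tables for this column.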

Unique

Unique	21
Unique (%)	100.0%

Sample

1st row	협정(발효일)
2nd row	칠레(’04.4.)
3rd row	EFTA(’06.9.)
4th row	ASEAN(’07.6.)
5th row	인도(’10.1.)

Value	Count	Frequency (%)
	2	4.3%
수출(수입)액	2	4.3%
활용률	2	4.3%
협정(발효일	1	2.2%
제외하고	1	2.2%
별도	1	2.2%
협정이	1	2.2%
있는	1	2.2%
asean	1	2.2%
호주	1	2.2%
Other values (33)	33	71.7%

Most occurring characters

Value	Count	Frequency (%)
.	34	10.6%
(space)	24	7.5%
1	22	6.8%
(	21	6.5%
)	21	6.5%
’	17	5.3%
2	10	3.1%
5	6	1.9%
국	6	1.9%
A	6	1.9%
Other values (95)	155	48.1%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	111	34.5%
Decimal Number	56	17.4%
Other Punctuation	41	12.7%
Uppercase Letter	27	8.4%
Space Separator	24	7.5%
Open Punctuation	21	6.5%
Close Punctuation	21	6.5%
Final Punctuation	17	5.3%
Control	3	0.9%
Math Symbol	1	0.3%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
국	6	5.4%
수	4	3.6%
중	4	3.6%
용	3	2.7%
일	3	2.7%
질	2	1.8%
랜	2	1.8%
뉴	2	1.8%
드	2	1.8%
혜	2	1.8%
Other values (64)	81	73.0%

Decimal Number

Value	Count	Frequency (%)
1	22	39.3%
2	10	17.9%
5	6	10.7%
0	4	7.1%
4	3	5.4%
6	3	5.4%
3	3	5.4%
7	3	5.4%
9	1	1.8%
8	1	1.8%

Uppercase Letter

Value	Count	Frequency (%)
A	6	22.2%
E	6	22.2%
P	2	7.4%
R	2	7.4%
C	2	7.4%
F	2	7.4%
T	2	7.4%
S	2	7.4%
N	2	7.4%
U	1	3.7%

Other Punctuation

Value	Count	Frequency (%)
.	34	82.9%
,	3	7.3%
*	2	4.9%
/	1	2.4%
※	1	2.4%

Space Separator

Value	Count	Frequency (%)
(space)	24	100.0%

Open Punctuation

Value	Count	Frequency (%)
(	21	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	21	100.0%

Final Punctuation

Value	Count	Frequency (%)
’	17	100.0%

Control

Value	Count	Frequency (%)
(control character)	3	100.0%

Math Symbol

Value	Count	Frequency (%)
=	1	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Common	184	57.1%
Hangul	111	34.5%
Latin	27	8.4%

Most frequent character per script

Hangul

Value	Count	Frequency (%)
국	6	5.4%
수	4	3.6%
중	4	3.6%
용	3	2.7%
일	3	2.7%
질	2	1.8%
랜	2	1.8%
뉴	2	1.8%
드	2	1.8%
혜	2	1.8%
Other values (64)	81	73.0%

Common

Value	Count	Frequency (%)
.	34	18.5%
(space)	24	13.0%
1	22	12.0%
(	21	11.4%
)	21	11.4%
’	17	9.2%
2	10	5.4%
5	6	3.3%
0	4	2.2%
4	3	1.6%
Other values (11)	22	12.0%

Latin

Value	Count	Frequency (%)
A	6	22.2%
E	6	22.2%
P	2	7.4%
R	2	7.4%
C	2	7.4%
F	2	7.4%
T	2	7.4%
S	2	7.4%
N	2	7.4%
U	1	3.7%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	193	59.9%
Hangul	111	34.5%
Punctuation	18	5.6%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
.	34	17.6%
(space)	24	12.4%
1	22	11.4%
(	21	10.9%
)	21	10.9%
2	10	5.2%
5	6	3.1%
A	6	3.1%
E	6	3.1%
0	4	2.1%
Other values (19)	39	20.2%

Punctuation

Value	Count	Frequency (%)
’	17	94.4%
※	1	5.6%

Hangul

Value	Count	Frequency (%)
국	6	5.4%
수	4	3.6%
중	4	3.6%
용	3	2.7%
일	3	2.7%
질	2	1.8%
랜	2	1.8%
뉴	2	1.8%
드	2	1.8%
혜	2	1.8%
Other values (64)	81	73.0%

Unnamed: 1
Categorical

Distinct	5
Distinct (%)	12.5%
Missing	0
Missing (%)	0.0%
Memory size	452.0 B

수출	18
수입	18
<NA>	2
FTA 활용률(%)	1
구분	1

Length

Max length	10
Median length	2
Mean length	2.3
Min length	2

Unique

Unique	2
Unique (%)	5.0%

Sample

1st row	FTA 활용률(%)
2nd row	구분
3rd row	수출
4th row	수입
5th row	수출

Common Values

Value	Count	Frequency (%)
수출	18	45.0%
수입	18	45.0%
<NA>	2	5.0%
FTA 활용률(%)	1	2.5%
구분	1	2.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value	Count	Frequency (%)
수출	18	43.9%
수입	18	43.9%
na	2	4.9%
fta	1	2.4%
활용률	1	2.4%
구분	1	2.4%

Unnamed: 2
Unsupported

MISSING REJECTED UNSUPPORTED

Missing	3
Missing (%)	7.5%
Memory size	452.0 B

Unnamed: 3
Unsupported

MISSING REJECTED UNSUPPORTED

Missing	3
Missing (%)	7.5%
Memory size	452.0 B

Unnamed: 4
Unsupported

MISSING REJECTED UNSUPPORTED

Missing	3
Missing (%)	7.5%
Memory size	452.0 B

Phik (φk)

Table

	* 주요 FTA별 특혜관세 적용현황	Unnamed: 1
* 주요 FTA별 특혜관세 적용현황	1.000	1.000
Unnamed: 1	1.000	1.000

A simple visualization of nullity by column.

The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
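One common way to compute nullity correlation is the Pearson correlation between 0/1 "is missing" indicators of two columns. The sketch below assumes that definition, and `nullity_correlation` is a hypothetical helper rather than the tool's own function. The missing-row positions are also an assumption: they follow the pattern visible in the first and last rows of this report (odd rows 1 to 37 for the first column; rows 0, 38 and 39 for `Unnamed: 2` to `Unnamed: 4`), which matches the reported counts of 19 and 3.

```python
from math import sqrt

def nullity_correlation(a, b):
    """Pearson correlation of two equal-length 0/1 missing-value indicators."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

n_rows = 40
# Assumed positions of missing cells (1 = missing), see the lead-in.
first_col = [1 if i % 2 == 1 and i <= 37 else 0 for i in range(n_rows)]
unnamed_2 = [1 if i in (0, 38, 39) else 0 for i in range(n_rows)]
unnamed_3 = list(unnamed_2)

same = nullity_correlation(unnamed_2, unnamed_3)   # identical patterns
mixed = nullity_correlation(first_col, unnamed_2)  # never missing together
print(same, mixed)
```

Columns that are missing in exactly the same rows correlate at 1, while columns that are never missing together correlate negatively.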

First rows

	* 주요 FTA별 특혜관세 적용현황	Unnamed: 1	Unnamed: 2	Unnamed: 3	Unnamed: 4
0	협정(발효일)	FTA 활용률(%)	NaN	NaN	NaN
1	<NA>	구분	’20	’21	’22
2	칠레(’04.4.)	수출	68.6	63.6	66.3
3	<NA>	수입	99.1	99.3	98.6
4	EFTA(’06.9.)	수출	80.1	71	61.4
5	<NA>	수입	76.9	69.6	75.7
6	ASEAN(’07.6.)	수출	49.2	52	58.1
7	<NA>	수입	81.5	82.7	84.4
8	인도(’10.1.)	수출	74.6	77.8	79.5
9	<NA>	수입	55.6	54	55.2

Last rows

	* 주요 FTA별 특혜관세 적용현황	Unnamed: 1	Unnamed: 2	Unnamed: 3	Unnamed: 4
30	영국(’21.1.)	수출	-	90.2	89.3
31	<NA>	수입	-	67.3	63.9
32	중미5개국(’21.3.)	수출	-	19.3	25.9
33	<NA>	수입	-	79.8	68.9
34	RCEP*(일본)(’22.2.)	수출	-	-	39.1
35	<NA>	수입	-	-	25.5
36	전체 평균	수출	74.8	75.7	75.5
37	<NA>	수입	81.5	80.3	78.6
38	* RCEP 회원국 중 별도 협정이 있는 ASEAN, 호주, 중국, 뉴질랜드는 제외하고 일본만 활용률 계산	<NA>	NaN	NaN	NaN
39	※ FTA 활용률 = 실제로 특혜관세를 적용받은 수출(수입)액/ 특혜관세 대상품목의 수출(수입)액	<NA>	NaN	NaN	NaN
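The sample rows show a spreadsheet-style layout: the agreement name appears only on the export (수출) row of each pair, and the yearly columns mix numbers with a "-" placeholder. That mix is a plausible reason for the Unsupported alerts, though the report does not state it. The sketch below shows one possible cleanup under those assumptions, using only the standard library. `tidy` is a hypothetical helper, the input rows are copied from the tables above, and the header rows (0 and 1) and footnote rows (38 and 39) are assumed to have been dropped beforehand.

```python
def tidy(rows):
    """Forward-fill the agreement name and parse rates; '-' becomes None."""
    cleaned = []
    current = None
    for agreement, direction, *rates in rows:
        if agreement is not None:
            current = agreement  # remember the last agreement name seen
        parsed = [None if r == "-" else float(r) for r in rates]
        cleaned.append((current, direction, *parsed))
    return cleaned

# Rows 2-3 and 30-31 of the report; None stands for <NA>.
raw = [
    ("칠레(’04.4.)", "수출", "68.6", "63.6", "66.3"),
    (None, "수입", "99.1", "99.3", "98.6"),
    ("영국(’21.1.)", "수출", "-", "90.2", "89.3"),
    (None, "수입", "-", "67.3", "63.9"),
]
clean = tidy(raw)
for row in clean:
    print(row)
```

After this step each row carries its own agreement name and the three yearly columns hold only floats or None, which a profiler can treat as numeric.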

Most frequently occurring

	* 주요 FTA별 특혜관세 적용현황	Unnamed: 1	# duplicates
0	<NA>	수입	18
