Overview

Dataset statistics

Number of variables: 16
Number of observations: 331
Missing cells: 315
Missing cells (%): 5.9%
Duplicate rows: 21
Duplicate rows (%): 6.3%
Total size in memory: 41.5 KiB
Average record size in memory: 128.4 B
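
The percentages and the average record size can be re-derived from the counts in this table. The sketch below assumes the denominators are variables × observations for missing cells, and observations for duplicate rows and record size; the report does not state this, but the results agree with the printed figures.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 16
n_observations = 331
missing_cells = 315
duplicate_rows = 21
total_size_kib = 41.5

# Assumed denominators (not stated in the report).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```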

Variable types

Text: 1
Categorical: 1
Unsupported: 14

Dataset

Description: Fruit processing details, status of processing companies, etc.
Author: Ministry of Agriculture, Food and Rural Affairs (농림축산식품부)
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20210930000000001620

Alerts

Dataset has 21 (6.3%) duplicate rows (Duplicates)
시도(1) has 315 (95.2%) missing values (Missing)
2019 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
2019.1 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
2019.2 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
2019.3 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
2019.4 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
2019.5 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
2019.6 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
2019.7 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
2019.8 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
2019.9 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
2019.10 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
2019.11 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
2019.12 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
2019.13 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
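
One possible reading of the "unsupported type" alerts, based on the Sample section of this report: row 0 repeats the column header and carries the product labels (통조림, 쥬스, ...), so each 2019.x column mixes one text cell with numeric cells. The sketch below is an illustration under that assumption, not the publisher's cleaning procedure. The rows are toy data shaped like the sample, and `promote_header` is a hypothetical helper.

```python
# Toy rows shaped like the report's Sample table; the column subset and
# the values are illustrative, not the real dataset.
raw_rows = [
    ["시도(1)", "과실류(1)", "통조림", "쥬스"],  # row 0 repeats the header
    ["서울특별시", "사과", "0", "1.3"],
    [None, "포도", "0", "0"],
]


def promote_header(rows):
    """Use row 0 as the header and parse the amount columns as floats."""
    header, *body = rows
    records = []
    for row in body:
        record = {}
        for position, (label, cell) in enumerate(zip(header, row)):
            # The first two columns are identifiers; the rest are amounts.
            record[label] = cell if position < 2 else float(cell)
        records.append(record)
    return list(header), records


columns, records = promote_header(raw_rows)
print(columns)
print(records[0])
```

With the header row removed, every amount column holds only numbers, which a profiler can then treat as numeric.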

Reproduction

Analysis started: 2023-12-11 03:03:57.011977
Analysis finished: 2023-12-11 03:03:57.651718
Duration: 0.64 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시도(1)
Text

MISSING

Distinct: 16
Distinct (%): 100.0%
Missing: 315
Missing (%): 95.2%
Memory size: 2.7 KiB

Length

Max length: 7
Median length: 5
Mean length: 4.625
Min length: 3

Characters and Unicode

Total characters: 74
Distinct characters: 33
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
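
As a rough illustration of how such category counts can be obtained, the sketch below uses Python's standard `unicodedata` module on two values copied from this variable's sample rows. It is not the profiler's own implementation, and it covers only general categories (script and block lookups are not available in the standard library).

```python
import unicodedata
from collections import Counter

# Two values copied from the Sample rows of this variable (not the whole column).
values = ["시도(1)", "서울특별시"]

# General category of every character:
# "Lo" = Other Letter, "Nd" = Decimal Number,
# "Ps" = Open Punctuation, "Pe" = Close Punctuation.
category_counts = Counter(
    unicodedata.category(ch) for value in values for ch in value
)

for code, count in category_counts.most_common():
    print(code, count)
```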

Unique

Unique: 16
Unique (%): 100.0%

Sample

1st row: 시도(1)
2nd row: 서울특별시
3rd row: 대구광역시
4th row: 인천광역시
5th row: 대전광역시

| Value | Count | Frequency (%) |
|---|---|---|
| 경기도 | 1 | 6.2% |
| 대구광역시 | 1 | 6.2% |
| 인천광역시 | 1 | 6.2% |
| 대전광역시 | 1 | 6.2% |
| 울산광역시 | 1 | 6.2% |
| 세종특별자치시 | 1 | 6.2% |
| 서울특별시 | 1 | 6.2% |
| 충청북도 | 1 | 6.2% |
| 시도(1 | 1 | 6.2% |
| 충청남도 | 1 | 6.2% |
| Other values (6) | 6 | 37.5% |

Most occurring characters

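| Value | Count | Frequency (%) |
|---|---|---|
|  | 10 | 13.5% |
|  | 7 | 9.5% |
|  | 4 | 5.4% |
|  | 4 | 5.4% |
|  | 3 | 4.1% |
|  | 3 | 4.1% |
|  | 3 | 4.1% |
|  | 3 | 4.1% |
|  | 3 | 4.1% |
|  | 3 | 4.1% |
| Other values (23) | 31 | 41.9% |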

Most occurring categories

| Value | Count | Frequency (%) |
|---|---|---|
| Other Letter | 71 | 95.9% |
| Decimal Number | 1 | 1.4% |
| Close Punctuation | 1 | 1.4% |
| Open Punctuation | 1 | 1.4% |

Most frequent character per category

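Other Letter

| Value | Count | Frequency (%) |
|---|---|---|
|  | 10 | 14.1% |
|  | 7 | 9.9% |
|  | 4 | 5.6% |
|  | 4 | 5.6% |
|  | 3 | 4.2% |
|  | 3 | 4.2% |
|  | 3 | 4.2% |
|  | 3 | 4.2% |
|  | 3 | 4.2% |
|  | 3 | 4.2% |
| Other values (20) | 28 | 39.4% |

Decimal Number

| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 1 | 100.0% |

Close Punctuation

| Value | Count | Frequency (%) |
|---|---|---|
| ) | 1 | 100.0% |

Open Punctuation

| Value | Count | Frequency (%) |
|---|---|---|
| ( | 1 | 100.0% |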

Most occurring scripts

| Value | Count | Frequency (%) |
|---|---|---|
| Hangul | 71 | 95.9% |
| Common | 3 | 4.1% |

Most frequent character per script

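Hangul

| Value | Count | Frequency (%) |
|---|---|---|
|  | 10 | 14.1% |
|  | 7 | 9.9% |
|  | 4 | 5.6% |
|  | 4 | 5.6% |
|  | 3 | 4.2% |
|  | 3 | 4.2% |
|  | 3 | 4.2% |
|  | 3 | 4.2% |
|  | 3 | 4.2% |
|  | 3 | 4.2% |
| Other values (20) | 28 | 39.4% |

Common

| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 1 | 33.3% |
| ) | 1 | 33.3% |
| ( | 1 | 33.3% |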

Most occurring blocks

| Value | Count | Frequency (%) |
|---|---|---|
| Hangul | 71 | 95.9% |
| ASCII | 3 | 4.1% |

Most frequent character per block

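Hangul

| Value | Count | Frequency (%) |
|---|---|---|
|  | 10 | 14.1% |
|  | 7 | 9.9% |
|  | 4 | 5.6% |
|  | 4 | 5.6% |
|  | 3 | 4.2% |
|  | 3 | 4.2% |
|  | 3 | 4.2% |
|  | 3 | 4.2% |
|  | 3 | 4.2% |
|  | 3 | 4.2% |
| Other values (20) | 28 | 39.4% |

ASCII

| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 1 | 33.3% |
| ) | 1 | 33.3% |
| ( | 1 | 33.3% |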

과실류(1)
Categorical

Distinct: 23
Distinct (%): 6.9%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

| Value | Count |
|---|---|
| 머루 | 15 |
| 사과 | 15 |
|  | 15 |
| 포도 | 15 |
| 감귤 | 15 |
| Other values (18) | 256 |

Length

Max length: 6
Median length: 2
Mean length: 2.3746224
Min length: 1

Unique

Unique: 1
Unique (%): 0.3%

Sample

1st row: 과실류(1)
2nd row: 사과
3rd row:
4th row: 포도
5th row: 감귤

Common Values

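| Value | Count | Frequency (%) |
|---|---|---|
| 머루 | 15 | 4.5% |
| 사과 | 15 | 4.5% |
|  | 15 | 4.5% |
| 포도 | 15 | 4.5% |
| 감귤 | 15 | 4.5% |
| 단감 | 15 | 4.5% |
| 복숭아 | 15 | 4.5% |
| 유자 | 15 | 4.5% |
|  | 15 | 4.5% |
| 복분자 | 15 | 4.5% |
| Other values (13) | 181 | 54.7% |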

Length

Histogram of lengths of the category

| Value | Count | Frequency (%) |
|---|---|---|
| 머루 | 15 | 4.5% |
| 오디 | 15 | 4.5% |
| 커피 | 15 | 4.5% |
| 블루베리 | 15 | 4.5% |
| 아로니아 | 15 | 4.5% |
| 살구 | 15 | 4.5% |
| 무화과 | 15 | 4.5% |
| 파인애플 | 15 | 4.5% |
| 키위 | 15 | 4.5% |
| 오미자 | 15 | 4.5% |
| Other values (13) | 181 | 54.7% |

| Variable | Type | Flags | Missing | Missing (%) | Memory size |
|---|---|---|---|---|---|
| 2019 | Unsupported | REJECTED, UNSUPPORTED | 0 | 0.0% | 2.7 KiB |
| 2019.1 | Unsupported | REJECTED, UNSUPPORTED | 0 | 0.0% | 2.7 KiB |
| 2019.2 | Unsupported | REJECTED, UNSUPPORTED | 0 | 0.0% | 2.7 KiB |
| 2019.3 | Unsupported | REJECTED, UNSUPPORTED | 0 | 0.0% | 2.7 KiB |
| 2019.4 | Unsupported | REJECTED, UNSUPPORTED | 0 | 0.0% | 2.7 KiB |
| 2019.5 | Unsupported | REJECTED, UNSUPPORTED | 0 | 0.0% | 2.7 KiB |
| 2019.6 | Unsupported | REJECTED, UNSUPPORTED | 0 | 0.0% | 2.7 KiB |
| 2019.7 | Unsupported | REJECTED, UNSUPPORTED | 0 | 0.0% | 2.7 KiB |
| 2019.8 | Unsupported | REJECTED, UNSUPPORTED | 0 | 0.0% | 2.7 KiB |
| 2019.9 | Unsupported | REJECTED, UNSUPPORTED | 0 | 0.0% | 2.7 KiB |
| 2019.10 | Unsupported | REJECTED, UNSUPPORTED | 0 | 0.0% | 2.7 KiB |
| 2019.11 | Unsupported | REJECTED, UNSUPPORTED | 0 | 0.0% | 2.7 KiB |
| 2019.12 | Unsupported | REJECTED, UNSUPPORTED | 0 | 0.0% | 2.7 KiB |
| 2019.13 | Unsupported | REJECTED, UNSUPPORTED | 0 | 0.0% | 2.7 KiB |

Correlations

|  | 시도(1) | 과실류(1) |
|---|---|---|
| 시도(1) | 1.000 | 1.000 |
| 과실류(1) | 1.000 | 1.000 |

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

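|  | 시도(1) | 과실류(1) | 2019 | 2019.1 | 2019.2 | 2019.3 | 2019.4 | 2019.5 | 2019.6 | 2019.7 | 2019.8 | 2019.9 | 2019.10 | 2019.11 | 2019.12 | 2019.13 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 시도(1) | 과실류(1) | 통조림 | 쥬스 | 넥타 | 식초 | 음료 | 조미 | 사료 | 즙 | 청 | 분말 | 차 | 건조 | 유음료 및 커피 | 기타 |
| 1 | 서울특별시 | 사과 | 0 | 0 | 0 | 1.3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 0 |
| 2 | <NA> |  | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | <NA> | 포도 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | <NA> | 감귤 | 0 | 0 | 0 | 0 | 0 | 0 | 0.2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 5 | <NA> | 단감 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 6 | <NA> | 복숭아 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 7 | <NA> | 유자 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 8 | <NA> |  | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 9 | <NA> | 복분자 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |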
시도(1)과실류(1)20192019.12019.22019.32019.42019.52019.62019.72019.82019.92019.102019.112019.122019.13
321<NA>망고00000000000000
322<NA>오미자001.2000120000000
323<NA>키위00000000000000
324<NA>파인애플00000000000000
325<NA>무화과00000000000000
326<NA>살구00000000000000
327<NA>아로니아00000000000000
328<NA>블루베리0000000.50000000
329<NA>커피00000000000000
330<NA>기타014718000000454.8000

Duplicate rows

Most frequently occurring

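|  | 시도(1) | 과실류(1) | # duplicates |
|---|---|---|---|
| 0 | <NA> |  | 15 |
| 1 | <NA> | 감귤 | 15 |
| 2 | <NA> | 기타 | 15 |
| 3 | <NA> | 단감 | 15 |
| 4 | <NA> | 망고 | 15 |
| 5 | <NA> | 매실 | 15 |
| 6 | <NA> | 머루 | 15 |
| 7 | <NA> | 무화과 | 15 |
| 8 | <NA> |  | 15 |
| 9 | <NA> | 복분자 | 15 |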
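
The duplicate groups listed here all have a missing 시도(1). That is consistent with the region being written only on the first row of each regional block, so that (missing region, fruit) pairs repeat once per region. The sketch below illustrates that reading on toy data: it counts duplicates over the two columns shown in the table, then again after carrying the region down. The rows and both helper functions are hypothetical, not part of the report.

```python
from collections import Counter

# Toy (시도, 과실류) pairs: the region appears only on the first row of each block.
pairs = [
    ("서울특별시", "사과"), (None, "포도"), (None, "감귤"),
    ("대구광역시", "사과"), (None, "포도"), (None, "감귤"),
    ("인천광역시", "사과"), (None, "포도"), (None, "감귤"),
]


def duplicate_groups(rows):
    """Return {row: count} for rows that occur more than once."""
    return {row: n for row, n in Counter(rows).items() if n > 1}


def forward_fill(rows):
    """Carry the last seen region down into rows where it is missing."""
    filled, current = [], None
    for sido, fruit in rows:
        current = sido if sido is not None else current
        filled.append((current, fruit))
    return filled


before = duplicate_groups(pairs)
after = duplicate_groups(forward_fill(pairs))
print(before)
print(after)
```

On the toy data, every fruit without a region forms a duplicate group before filling and none remain afterwards. Whether the same holds for the full dataset would need to be checked against the source file.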