Overview

Dataset statistics

Number of variables: 16
Number of observations: 353
Missing cells: 336
Missing cells (%): 5.9%
Duplicate rows: 21
Duplicate rows (%): 5.9%
Total size in memory: 44.3 KiB
Average record size in memory: 128.4 B
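Both percentages follow from the counts above: missing cells are measured against all cells (variables times observations), duplicate rows against all observations. A quick check in Python, with illustrative variable names:

```python
# Figures copied from the dataset statistics above.
n_variables = 16
n_observations = 353
missing_cells = 336
duplicate_rows = 21

total_cells = n_variables * n_observations            # every cell in the table
missing_pct = 100 * missing_cells / total_cells       # share of empty cells
duplicate_pct = 100 * duplicate_rows / n_observations  # share of duplicate rows

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"duplicate rows: {duplicate_pct:.1f}%")
```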

Variable types

Text: 1
Categorical: 1
Unsupported: 14

Dataset

Description: Fruit processing details, status of processing companies, etc. (과실류 가공내역, 가공업체 현황 등)
Author: Ministry of Agriculture, Food and Rural Affairs (농림축산식품부)
URL: https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20210930000000001620

Alerts

Duplicates: Dataset has 21 (5.9%) duplicate rows
Missing: 시도 has 336 (95.2%) missing values
Unsupported: 2021 is an unsupported type; check whether it needs cleaning or further analysis
Unsupported: 2021.1 is an unsupported type; check whether it needs cleaning or further analysis
Unsupported: 2021.2 is an unsupported type; check whether it needs cleaning or further analysis
Unsupported: 2021.3 is an unsupported type; check whether it needs cleaning or further analysis
Unsupported: 2021.4 is an unsupported type; check whether it needs cleaning or further analysis
Unsupported: 2021.5 is an unsupported type; check whether it needs cleaning or further analysis
Unsupported: 2021.6 is an unsupported type; check whether it needs cleaning or further analysis
Unsupported: 2021.7 is an unsupported type; check whether it needs cleaning or further analysis
Unsupported: 2021.8 is an unsupported type; check whether it needs cleaning or further analysis
Unsupported: 2021.9 is an unsupported type; check whether it needs cleaning or further analysis
Unsupported: 2021.10 is an unsupported type; check whether it needs cleaning or further analysis
Unsupported: 2021.11 is an unsupported type; check whether it needs cleaning or further analysis
Unsupported: 2021.12 is an unsupported type; check whether it needs cleaning or further analysis
Unsupported: 2021.13 is an unsupported type; check whether it needs cleaning or further analysis
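The 14 unsupported columns are named 2021, 2021.1, ..., 2021.13, which is the pattern pandas produces when a header row repeats the same label. The first row of the sample also holds product labels (통조림, 쥬스, ...) rather than numbers. Together this suggests the source file has two header rows and the second was read as data, leaving text mixed into otherwise numeric columns. A minimal sketch of one way to clean such a layout with the standard library follows. The two-header CSV layout is an assumption, `load_two_header_csv` is a hypothetical helper, and the inline excerpt is shortened to four product columns:

```python
import csv
import io

# Shortened, illustrative excerpt: the real file has 14 product columns.
# The two-header layout is an assumption inferred from the report.
RAW = """시도,과실류,2021,2021,2021,2021
시도,과실류,통조림,쥬스,넥타,식초
서울특별시,사과,0,0,0,0
,포도,0,0,0,0
,감귤,0,0,0,0.8
"""

def load_two_header_csv(text):
    """Hypothetical helper: name columns from both header rows and
    convert the product columns to floats."""
    rows = list(csv.reader(io.StringIO(text)))
    year_row, label_row, data_rows = rows[0], rows[1], rows[2:]
    records = []
    for row in data_rows:
        # Empty region cells become None so they count as missing.
        record = {label_row[0]: row[0] or None, label_row[1]: row[1]}
        for year, label, value in zip(year_row[2:], label_row[2:], row[2:]):
            # Combine year and product label, e.g. "2021 통조림".
            record[f"{year} {label}"] = float(value)
        records.append(record)
    return records

records = load_two_header_csv(RAW)
for record in records:
    print(record)
```

Once the label row is no longer part of the data, each product column contains only numbers and can be profiled as numeric.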

Reproduction

Analysis started: 2023-12-11 03:04:00.111801
Analysis finished: 2023-12-11 03:04:00.590800
Duration: 0.48 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시도
Text

MISSING 

Distinct: 17
Distinct (%): 100.0%
Missing: 336
Missing (%): 95.2%
Memory size: 2.9 KiB

Length

Max length: 7
Median length: 5
Mean length: 4.4705882
Min length: 2

Characters and Unicode

Total characters: 76
Distinct characters: 30
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 17
Unique (%): 100.0%

Sample

1st row: 시도
2nd row: 서울특별시
3rd row: 대구광역시
4th row: 인천광역시
5th row: 광주광역시
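Only 17 of the 353 values in 시도 are present. This is consistent with a spreadsheet layout in which the region is written once at the top of each block of fruit rows; that reading is an assumption, not something the report states. Under that assumption, a forward fill restores the region on every row. A minimal sketch in plain Python follows, where `forward_fill` is a hypothetical helper and the column is a toy illustration. In pandas, `Series.ffill()` does the same job.

```python
def forward_fill(values):
    """Replace each None with the most recent non-None value."""
    filled, last = [], None
    for value in values:
        if value is not None:
            last = value
        filled.append(last)
    return filled

# Toy column: a region name appears once, then blanks until the next region.
sido = ["서울특별시", None, None, "대구광역시", None]
print(forward_fill(sido))
```

Leading blanks stay as None because there is no earlier value to copy.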
Value | Count | Frequency (%)
시도 | 1 | 5.9%
충청북도 | 1 | 5.9%
강원도 | 1 | 5.9%
경상남도 | 1 | 5.9%
경상북도 | 1 | 5.9%
전라남도 | 1 | 5.9%
전라북도 | 1 | 5.9%
충청남도 | 1 | 5.9%
경기도 | 1 | 5.9%
대구광역시 | 1 | 5.9%
Other values (7) | 7 | 41.2%

Most occurring characters

Value | Count | Frequency (%)
 | 10 | 13.2%
 | 8 | 10.5%
 | 6 | 7.9%
 | 5 | 6.6%
 | 3 | 3.9%
 | 3 | 3.9%
 | 3 | 3.9%
 | 3 | 3.9%
 | 3 | 3.9%
 | 3 | 3.9%
Other values (20) | 29 | 38.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 76 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 10 | 13.2%
 | 8 | 10.5%
 | 6 | 7.9%
 | 5 | 6.6%
 | 3 | 3.9%
 | 3 | 3.9%
 | 3 | 3.9%
 | 3 | 3.9%
 | 3 | 3.9%
 | 3 | 3.9%
Other values (20) | 29 | 38.2%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 76 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 10 | 13.2%
 | 8 | 10.5%
 | 6 | 7.9%
 | 5 | 6.6%
 | 3 | 3.9%
 | 3 | 3.9%
 | 3 | 3.9%
 | 3 | 3.9%
 | 3 | 3.9%
 | 3 | 3.9%
Other values (20) | 29 | 38.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 76 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 10 | 13.2%
 | 8 | 10.5%
 | 6 | 7.9%
 | 5 | 6.6%
 | 3 | 3.9%
 | 3 | 3.9%
 | 3 | 3.9%
 | 3 | 3.9%
 | 3 | 3.9%
 | 3 | 3.9%
Other values (20) | 29 | 38.2%

과실류
Categorical

Distinct: 23
Distinct (%): 6.5%
Missing: 0
Missing (%): 0.0%
Memory size: 2.9 KiB
Value | Count
머루 | 16
사과 | 16
 | 16
포도 | 16
감귤 | 16
Other values (18) | 273

Length

Max length: 4
Median length: 2
Mean length: 2.3654391
Min length: 1

Unique

Unique: 1
Unique (%): 0.3%

Sample

1st row: 과실류
2nd row: 사과
3rd row: 
4th row: 포도
5th row: 감귤

Common Values

Value | Count | Frequency (%)
머루 | 16 | 4.5%
사과 | 16 | 4.5%
 | 16 | 4.5%
포도 | 16 | 4.5%
감귤 | 16 | 4.5%
단감 | 16 | 4.5%
복숭아 | 16 | 4.5%
유자 | 16 | 4.5%
 | 16 | 4.5%
복분자 | 16 | 4.5%
Other values (13) | 193 | 54.7%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
머루 | 16 | 4.5%
오디 | 16 | 4.5%
커피 | 16 | 4.5%
블루베리 | 16 | 4.5%
아로니아 | 16 | 4.5%
살구 | 16 | 4.5%
무화과 | 16 | 4.5%
파인애플 | 16 | 4.5%
키위 | 16 | 4.5%
오미자 | 16 | 4.5%
Other values (13) | 193 | 54.7%

Variable | Type | Status | Missing | Missing (%) | Memory size
2021 | Unsupported | Rejected, Unsupported | 0 | 0.0% | 2.9 KiB
2021.1 | Unsupported | Rejected, Unsupported | 0 | 0.0% | 2.9 KiB
2021.2 | Unsupported | Rejected, Unsupported | 0 | 0.0% | 2.9 KiB
2021.3 | Unsupported | Rejected, Unsupported | 0 | 0.0% | 2.9 KiB
2021.4 | Unsupported | Rejected, Unsupported | 0 | 0.0% | 2.9 KiB
2021.5 | Unsupported | Rejected, Unsupported | 0 | 0.0% | 2.9 KiB
2021.6 | Unsupported | Rejected, Unsupported | 0 | 0.0% | 2.9 KiB
2021.7 | Unsupported | Rejected, Unsupported | 0 | 0.0% | 2.9 KiB
2021.8 | Unsupported | Rejected, Unsupported | 0 | 0.0% | 2.9 KiB
2021.9 | Unsupported | Rejected, Unsupported | 0 | 0.0% | 2.9 KiB
2021.10 | Unsupported | Rejected, Unsupported | 0 | 0.0% | 2.9 KiB
2021.11 | Unsupported | Rejected, Unsupported | 0 | 0.0% | 2.9 KiB
2021.12 | Unsupported | Rejected, Unsupported | 0 | 0.0% | 2.9 KiB
2021.13 | Unsupported | Rejected, Unsupported | 0 | 0.0% | 2.9 KiB

Correlations

 | 시도 | 과실류
시도 | 1.000 | 1.000
과실류 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

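 | 시도 | 과실류 | 2021 | 2021.1 | 2021.2 | 2021.3 | 2021.4 | 2021.5 | 2021.6 | 2021.7 | 2021.8 | 2021.9 | 2021.10 | 2021.11 | 2021.12 | 2021.13
0 | 시도 | 과실류 | 통조림 | 쥬스 | 넥타 | 식초 | 음료 | 조미 | 사료 | 즙 | 청 | 분말 | 차 | 건조 | 유음료 및 커피 | 기타
1 | 서울특별시 | 사과 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.5 | 1.1 | 0 | 0.7
2 | <NA> |  | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
3 | <NA> | 포도 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
4 | <NA> | 감귤 | 0 | 0 | 0 | 0.8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
5 | <NA> | 단감 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
6 | <NA> | 복숭아 | 0 | 0 | 0 | 0.1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
7 | <NA> | 유자 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
8 | <NA> |  | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
9 | <NA> | 복분자 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0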
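 | 시도 | 과실류 | 2021 | 2021.1 | 2021.2 | 2021.3 | 2021.4 | 2021.5 | 2021.6 | 2021.7 | 2021.8 | 2021.9 | 2021.10 | 2021.11 | 2021.12 | 2021.13
343 | <NA> | 망고 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
344 | <NA> | 오미자 | 001600000000000
345 | <NA> | 키위 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
346 | <NA> | 파인애플 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
347 | <NA> | 무화과 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
348 | <NA> | 살구 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
349 | <NA> | 아로니아 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
350 | <NA> | 블루베리 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
351 | <NA> | 커피 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
352 | <NA> | 기타 | 0440000000000000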

Duplicate rows

Most frequently occurring

 | 시도 | 과실류 | # duplicates
0 | <NA> |  | 16
1 | <NA> | 감귤 | 16
2 | <NA> | 기타 | 16
3 | <NA> | 단감 | 16
4 | <NA> | 망고 | 16
5 | <NA> | 매실 | 16
6 | <NA> | 머루 | 16
7 | <NA> | 무화과 | 16
8 | <NA> |  | 16
9 | <NA> | 복분자 | 16
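
The table lists only 시도 and 과실류, which suggests the duplicate check ran on the two supported columns alone. With 시도 blank on most rows, each fruit then repeats once per region block, which would explain the counts of 16. A rough sketch of that effect on toy rows (names taken from the report, layout assumed, `duplicated_groups` a hypothetical helper) counts repeated (시도, 과실류) pairs before and after filling the region down:

```python
from collections import Counter

# Toy rows in the assumed layout: region named once per block, then blank.
rows = [
    ("서울특별시", "사과"), (None, "포도"), (None, "감귤"),
    ("대구광역시", "사과"), (None, "포도"), (None, "감귤"),
]

def duplicated_groups(pairs):
    """Return the pairs that occur more than once, with their counts."""
    return {pair: n for pair, n in Counter(pairs).items() if n > 1}

before = duplicated_groups(rows)

# Fill the region down so each row carries its own region name.
filled, last = [], None
for sido, fruit in rows:
    last = sido if sido is not None else last
    filled.append((last, fruit))
after = duplicated_groups(filled)

print(before)  # pairs with a blank region repeat across blocks
print(after)   # no repeats once every row has a region
```

If the real file follows this layout, the duplicate alert is a side effect of the blank region cells rather than a sign of repeated records.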