Overview

Dataset statistics

Number of variables: 8
Number of observations: 294
Missing cells: 214
Missing cells (%): 9.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 18.5 KiB
Average record size in memory: 64.4 B
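
The derived figures in this block follow from the raw counts. The short sketch below recomputes them, assuming the missing-cell percentage is taken over all cells (variables times observations) and the average record size is the total memory divided by the number of observations; the variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 8
n_observations = 294
missing_cells = 214
total_size_kib = 18.5

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes, averaged over all observations.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both results round to the values shown in the report (9.1% and 64.4 B).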

Variable types

Categorical: 2
Text: 4
Boolean: 2

Dataset

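Description: Status of individual agricultural and livestock product brands by city and province, surveyed as of the end of 2011
Author: Ministry of Agriculture, Food and Rural Affairs (농림축산식품부)
URL: https://www.data.go.kr/data/15055115/fileData.do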

Alerts

신규여부 is highly imbalanced (82.0%) [Imbalance]
등록번호 has 214 (72.8%) missing values [Missing]
브랜드 명 has unique values [Unique]
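
The imbalance figure can be checked by hand. The sketch below assumes the score is one minus the Shannon entropy of the class shares, normalised by the logarithm of the number of classes; the report itself does not state the formula, and `imbalance_score` is an illustrative name rather than a library function.

```python
import math

def imbalance_score(counts):
    """Return 1 minus the normalised Shannon entropy of the class counts."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        # Illustrative choice: a single class is reported as 0.0 here.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# 신규여부 has 286 False and 8 True values.
score = imbalance_score([286, 8])
print(f"{score:.1%}")
```

Under that assumption the 286/8 split of 신규여부 gives 82.0%, matching the alert, while a perfectly balanced column scores 0.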

Reproduction

Analysis started: 2023-12-12 11:54:29.880392
Analysis finished: 2023-12-12 11:54:30.754182
Duration: 0.87 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
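
A report of this kind is typically produced with a few lines of Python. The following is a minimal sketch, not the configuration actually used here: the CSV path, the output path and the report title are hypothetical, and the settings stored in config.json are not reproduced.

```python
def build_report(csv_path, output_path="report.html"):
    """Profile a CSV file and write an HTML report (paths are hypothetical)."""
    # Imported lazily so this module loads even without the optional packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    # The title is a placeholder; the original report's title is not known.
    report = ProfileReport(df, title="Brand status profile")
    report.to_file(output_path)
    return output_path
```

Calling `build_report("brands.csv")` with a local copy of the dataset would write `report.html`; files from data.go.kr may need an explicit `encoding` argument to `pd.read_csv`.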

Variables

시군
Categorical

Distinct: 12
Distinct (%): 4.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Value | Count
충주시 | 70
괴산군 | 65
제천시 | 30
영동군 | 24
옥천군 | 23
Other values (7) | 82

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 1
Unique (%): 0.3%

Sample

1st row: 청주시
2nd row: 청주시
3rd row: 청주시
4th row: 청주시
5th row: 청주시

Common Values

Value | Count | Frequency (%)
충주시 | 70 | 23.8%
괴산군 | 65 | 22.1%
제천시 | 30 | 10.2%
영동군 | 24 | 8.2%
옥천군 | 23 | 7.8%
청원군 | 22 | 7.5%
진천군 | 22 | 7.5%
보은군 | 18 | 6.1%
음성군 | 7 | 2.4%
청주시 | 6 | 2.0%
Other values (2) | 7 | 2.4%

Length

Histogram of lengths of the category

브랜드 명
Text

UNIQUE 

Distinct: 294
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 12
Median length: 11
Mean length: 4.7142857
Min length: 1

Characters and Unicode

Total characters: 1386
Distinct characters: 312
Distinct categories: 6
Distinct scripts: 4
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
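
As a small illustration of how such properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The two-letter codes it returns correspond to the category names used in this report (for example `Lo` is Other Letter, `Zs` is Space Separator, `Lu` is Uppercase Letter). The helper name `category_counts` is illustrative, and this is not necessarily how the profiling tool computes its tables.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over every character in the given strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

# The five sample values of 브랜드 명 shown in this section.
sample = ["직지쌀", "유근형녹용", "청주양봉", "송절채소", "까치네"]
print(dict(category_counts(sample)))
```

All 19 characters in these five sample values are Hangul syllables, so they fall into the single category `Lo`.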

Unique

Unique: 294
Unique (%): 100.0%

Sample

1st row: 직지쌀
2nd row: 유근형녹용
3rd row: 청주양봉
4th row: 송절채소
5th row: 까치네

Value | Count | Frequency (%)
장령산골 | 2 | 0.6%
신선고을 | 1 | 0.3%
돌실사과 | 1 | 0.3%
삼누리 | 1 | 0.3%
한삼인 | 1 | 0.3%
사미랑홍삼포크 | 1 | 0.3%
질벌뜰쌀 | 1 | 0.3%
새로미쌀 | 1 | 0.3%
질흙소담미 | 1 | 0.3%
직지쌀 | 1 | 0.3%
Other values (297) | 297 | 96.4%

Most occurring characters

Value | Count | Frequency (%)
 | 46 | 3.3%
 | 44 | 3.2%
 | 36 | 2.6%
 | 28 | 2.0%
 | 27 | 1.9%
 | 24 | 1.7%
 | 22 | 1.6%
 | 21 | 1.5%
 | 20 | 1.4%
 | 20 | 1.4%
Other values (302) | 1098 | 79.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1358 | 98.0%
Space Separator | 14 | 1.0%
Lowercase Letter | 6 | 0.4%
Open Punctuation | 3 | 0.2%
Close Punctuation | 3 | 0.2%
Uppercase Letter | 2 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 46 | 3.4%
 | 44 | 3.2%
 | 36 | 2.7%
 | 28 | 2.1%
 | 27 | 2.0%
 | 24 | 1.8%
 | 22 | 1.6%
 | 21 | 1.5%
 | 20 | 1.5%
 | 20 | 1.5%
Other values (296) | 1070 | 78.8%

Lowercase Letter
Value | Count | Frequency (%)
o | 4 | 66.7%
d | 2 | 33.3%

Space Separator
Value | Count | Frequency (%)
(space) | 14 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 3 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 3 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
G | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1355 | 97.8%
Common | 20 | 1.4%
Latin | 8 | 0.6%
Han | 3 | 0.2%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 46 | 3.4%
 | 44 | 3.2%
 | 36 | 2.7%
 | 28 | 2.1%
 | 27 | 2.0%
 | 24 | 1.8%
 | 22 | 1.6%
 | 21 | 1.5%
 | 20 | 1.5%
 | 20 | 1.5%
Other values (293) | 1067 | 78.7%

Common
Value | Count | Frequency (%)
(space) | 14 | 70.0%
( | 3 | 15.0%
) | 3 | 15.0%

Latin
Value | Count | Frequency (%)
o | 4 | 50.0%
d | 2 | 25.0%
G | 2 | 25.0%

Han
Value | Count | Frequency (%)
 | 1 | 33.3%
 | 1 | 33.3%
 | 1 | 33.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1355 | 97.8%
ASCII | 28 | 2.0%
CJK | 3 | 0.2%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 46 | 3.4%
 | 44 | 3.2%
 | 36 | 2.7%
 | 28 | 2.1%
 | 27 | 2.0%
 | 24 | 1.8%
 | 22 | 1.6%
 | 21 | 1.5%
 | 20 | 1.5%
 | 20 | 1.5%
Other values (293) | 1067 | 78.7%

ASCII
Value | Count | Frequency (%)
(space) | 14 | 50.0%
o | 4 | 14.3%
( | 3 | 10.7%
) | 3 | 10.7%
d | 2 | 7.1%
G | 2 | 7.1%

CJK
Value | Count | Frequency (%)
 | 1 | 33.3%
 | 1 | 33.3%
 | 1 | 33.3%

브랜드 사용자
Text

Distinct: 232
Distinct (%): 78.9%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 24
Median length: 15
Mean length: 6.7040816
Min length: 2

Characters and Unicode

Total characters: 1971
Distinct characters: 257
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 3

Unique

Unique: 199
Unique (%): 67.7%

Sample

1st row: 청주농협
2nd row: 충북사슴영농조합법인
3rd row: 청주양봉연구회
4th row: 송절채소작목반
5th row: 까치네작목반

Value | Count | Frequency (%)
개인 | 9 | 2.9%
농협 | 7 | 2.2%
rpc | 7 | 2.2%
장양영농조합법인 | 5 | 1.6%
삼두rpc | 4 | 1.3%
군자농협 | 4 | 1.3%
주덕농협신니지점 | 4 | 1.3%
노은정미소 | 4 | 1.3%
작목반 | 4 | 1.3%
산계뜰영농조합 | 3 | 1.0%
Other values (230) | 261 | 83.7%

Most occurring characters

Value | Count | Frequency (%)
 | 140 | 7.1%
 | 86 | 4.4%
 | 82 | 4.2%
 | 75 | 3.8%
 | 65 | 3.3%
 | 61 | 3.1%
 | 56 | 2.8%
 | 55 | 2.8%
 | 52 | 2.6%
 | 45 | 2.3%
Other values (247) | 1254 | 63.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1877 | 95.2%
Uppercase Letter | 42 | 2.1%
Space Separator | 19 | 1.0%
Open Punctuation | 10 | 0.5%
Close Punctuation | 10 | 0.5%
Other Punctuation | 7 | 0.4%
Other Symbol | 6 | 0.3%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 140 | 7.5%
 | 86 | 4.6%
 | 82 | 4.4%
 | 75 | 4.0%
 | 65 | 3.5%
 | 61 | 3.2%
 | 56 | 3.0%
 | 55 | 2.9%
 | 52 | 2.8%
 | 45 | 2.4%
Other values (239) | 1160 | 61.8%

Uppercase Letter
Value | Count | Frequency (%)
C | 14 | 33.3%
P | 14 | 33.3%
R | 14 | 33.3%

Space Separator
Value | Count | Frequency (%)
(space) | 19 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 10 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 10 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 7 | 100.0%

Other Symbol
Value | Count | Frequency (%)
 | 6 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1883 | 95.5%
Common | 46 | 2.3%
Latin | 42 | 2.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 140 | 7.4%
 | 86 | 4.6%
 | 82 | 4.4%
 | 75 | 4.0%
 | 65 | 3.5%
 | 61 | 3.2%
 | 56 | 3.0%
 | 55 | 2.9%
 | 52 | 2.8%
 | 45 | 2.4%
Other values (240) | 1166 | 61.9%

Common
Value | Count | Frequency (%)
(space) | 19 | 41.3%
( | 10 | 21.7%
) | 10 | 21.7%
, | 7 | 15.2%

Latin
Value | Count | Frequency (%)
C | 14 | 33.3%
P | 14 | 33.3%
R | 14 | 33.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1877 | 95.2%
ASCII | 88 | 4.5%
None | 6 | 0.3%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 140 | 7.5%
 | 86 | 4.6%
 | 82 | 4.4%
 | 75 | 4.0%
 | 65 | 3.5%
 | 61 | 3.2%
 | 56 | 3.0%
 | 55 | 2.9%
 | 52 | 2.8%
 | 45 | 2.4%
Other values (239) | 1160 | 61.8%

ASCII
Value | Count | Frequency (%)
(space) | 19 | 21.6%
C | 14 | 15.9%
P | 14 | 15.9%
R | 14 | 15.9%
( | 10 | 11.4%
) | 10 | 11.4%
, | 7 | 8.0%

None
Value | Count | Frequency (%)
 | 6 | 100.0%

부류명
Categorical

Distinct: 14
Distinct (%): 4.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Value | Count
식량작물 | 115
과실류 | 53
농산가공 | 44
축산물 | 20
과채류 | 19
Other values (9) | 43

Length

Max length: 5
Median length: 4
Mean length: 3.5306122
Min length: 2

Unique

Unique: 2
Unique (%): 0.7%

Sample

1st row: 식량작물
2nd row: 축산물
3rd row: 축산물
4th row: 채소류
5th row: 채소류

Common Values

Value | Count | Frequency (%)
식량작물 | 115 | 39.1%
과실류 | 53 | 18.0%
농산가공 | 44 | 15.0%
축산물 | 20 | 6.8%
과채류 | 19 | 6.5%
채소류 | 16 | 5.4%
임산물 | 9 | 3.1%
공통 | 6 | 2.0%
특작류 | 3 | 1.0%
기타 | 3 | 1.0%
Other values (4) | 6 | 2.0%

Length

Histogram of lengths of the category

주품목
Text

Distinct: 82
Distinct (%): 27.9%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 5
Median length: 4
Mean length: 1.952381
Min length: 1

Characters and Unicode

Total characters: 574
Distinct characters: 117
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 51
Unique (%): 17.3%

Sample

1st row:
2nd row: 녹용
3rd row: 축산물
4th row: 채소류
5th row: 채소류

Value | Count | Frequency (%)
 | 96 | 32.7%
사과 | 26 | 8.8%
포도 | 12 | 4.1%
복숭아 | 11 | 3.7%
한우 | 10 | 3.4%
오이 | 8 | 2.7%
고추장 | 7 | 2.4%
잡곡 | 6 | 2.0%
곶감 | 6 | 2.0%
된장 | 5 | 1.7%
Other values (72) | 107 | 36.4%

Most occurring characters

Value | Count | Frequency (%)
 | 99 | 17.2%
 | 29 | 5.1%
 | 26 | 4.5%
 | 17 | 3.0%
 | 16 | 2.8%
 | 13 | 2.3%
 | 13 | 2.3%
 | 12 | 2.1%
 | 12 | 2.1%
 | 11 | 1.9%
Other values (107) | 326 | 56.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 574 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 99 | 17.2%
 | 29 | 5.1%
 | 26 | 4.5%
 | 17 | 3.0%
 | 16 | 2.8%
 | 13 | 2.3%
 | 13 | 2.3%
 | 12 | 2.1%
 | 12 | 2.1%
 | 11 | 1.9%
Other values (107) | 326 | 56.8%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 574 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 99 | 17.2%
 | 29 | 5.1%
 | 26 | 4.5%
 | 17 | 3.0%
 | 16 | 2.8%
 | 13 | 2.3%
 | 13 | 2.3%
 | 12 | 2.1%
 | 12 | 2.1%
 | 11 | 1.9%
Other values (107) | 326 | 56.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 574 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 99 | 17.2%
 | 29 | 5.1%
 | 26 | 4.5%
 | 17 | 3.0%
 | 16 | 2.8%
 | 13 | 2.3%
 | 13 | 2.3%
 | 12 | 2.1%
 | 12 | 2.1%
 | 11 | 1.9%
Other values (107) | 326 | 56.8%

등록여부
Boolean

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 426.0 B

Value | Count | Frequency (%)
False | 214 | 72.8%
True | 80 | 27.2%

등록번호
Text

MISSING 

Distinct: 80
Distinct (%): 100.0%
Missing: 214
Missing (%): 72.8%
Memory size: 2.4 KiB

Length

Max length: 15
Median length: 10
Mean length: 10.125
Min length: 9

Characters and Unicode

Total characters: 810
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 80
Unique (%): 100.0%

Sample

1st row: 40-0578959
2nd row: 40-0380751
3rd row: 40-05727432
4th row: 41-0176522
5th row: 41-0198764

Value | Count | Frequency (%)
40-0473289 | 1 | 1.2%
40-0462090 | 1 | 1.2%
40-0003562 | 1 | 1.2%
40-0029424 | 1 | 1.2%
40-0467842 | 1 | 1.2%
40-0803832 | 1 | 1.2%
40-0810152 | 1 | 1.2%
40-0639932 | 1 | 1.2%
45-0017636 | 1 | 1.2%
40-0535607 | 1 | 1.2%
Other values (72) | 72 | 87.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 210 | 25.9%
4 | 133 | 16.4%
- | 81 | 10.0%
6 | 64 | 7.9%
1 | 59 | 7.3%
5 | 55 | 6.8%
7 | 48 | 5.9%
2 | 42 | 5.2%
8 | 41 | 5.1%
3 | 37 | 4.6%
Other values (4) | 40 | 4.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 725 | 89.5%
Dash Punctuation | 81 | 10.0%
Space Separator | 2 | 0.2%
Other Letter | 2 | 0.2%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 210 | 29.0%
4 | 133 | 18.3%
6 | 64 | 8.8%
1 | 59 | 8.1%
5 | 55 | 7.6%
7 | 48 | 6.6%
2 | 42 | 5.8%
8 | 41 | 5.7%
3 | 37 | 5.1%
9 | 36 | 5.0%

Other Letter
Value | Count | Frequency (%)
 | 1 | 50.0%
 | 1 | 50.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 81 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 808 | 99.8%
Hangul | 2 | 0.2%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 210 | 26.0%
4 | 133 | 16.5%
- | 81 | 10.0%
6 | 64 | 7.9%
1 | 59 | 7.3%
5 | 55 | 6.8%
7 | 48 | 5.9%
2 | 42 | 5.2%
8 | 41 | 5.1%
3 | 37 | 4.6%
Other values (2) | 38 | 4.7%

Hangul
Value | Count | Frequency (%)
 | 1 | 50.0%
 | 1 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 808 | 99.8%
Hangul | 2 | 0.2%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 210 | 26.0%
4 | 133 | 16.5%
- | 81 | 10.0%
6 | 64 | 7.9%
1 | 59 | 7.3%
5 | 55 | 6.8%
7 | 48 | 5.9%
2 | 42 | 5.2%
8 | 41 | 5.1%
3 | 37 | 4.6%
Other values (2) | 38 | 4.7%

Hangul
Value | Count | Frequency (%)
 | 1 | 50.0%
 | 1 | 50.0%

신규여부
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 426.0 B

Value | Count | Frequency (%)
False | 286 | 97.3%
True | 8 | 2.7%

Correlations

 | 시군 | 부류명 | 주품목 | 등록여부 | 등록번호 | 신규여부
시군 | 1.000 | 0.552 | 0.671 | 0.454 | 1.000 | 0.000
부류명 | 0.552 | 1.000 | 0.999 | 0.479 | 1.000 | 0.171
주품목 | 0.671 | 0.999 | 1.000 | 0.516 | 1.000 | 0.632
등록여부 | 0.454 | 0.479 | 0.516 | 1.000 | NaN | 0.034
등록번호 | 1.000 | 1.000 | 1.000 | NaN | 1.000 | 1.000
신규여부 | 0.000 | 0.171 | 0.632 | 0.034 | 1.000 | 1.000

 | 부류명 | 신규여부 | 등록여부 | 시군
부류명 | 1.000 | 0.130 | 0.368 | 0.242
신규여부 | 0.130 | 1.000 | 0.021 | 0.000
등록여부 | 0.368 | 0.021 | 1.000 | 0.347
시군 | 0.242 | 0.000 | 0.347 | 1.000

 | 시군 | 부류명 | 등록여부 | 신규여부
시군 | 1.000 | 0.242 | 0.347 | 0.000
부류명 | 0.242 | 1.000 | 0.368 | 0.130
등록여부 | 0.347 | 0.368 | 1.000 | 0.021
신규여부 | 0.000 | 0.130 | 0.021 | 1.000
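
The report does not name the association measure behind these matrices. For categorical pairs a common choice is Cramér's V, so the sketch below shows a plain (uncorrected) version of it as an assumption. It is not expected to reproduce the exact figures above, since a profiling tool may apply a bias correction or a different measure. The function name `cramers_v` is illustrative.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equal-length sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_totals = Counter(xs)
    y_totals = Counter(ys)
    k = min(len(x_totals), len(y_totals))
    if n == 0 or k < 2:
        # Undefined when either variable is constant.
        return float("nan")
    chi2 = 0.0
    for x, x_count in x_totals.items():
        for y, y_count in y_totals.items():
            expected = x_count * y_count / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (k - 1)))

# Toy inputs: a perfectly associated pair and an independent pair.
print(cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"]))
print(cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"]))
```

The value ranges from 0 (no association) to 1 (one variable determines the other). An identifier-like column such as 등록번호, which is distinct on every non-missing row, trivially scores 1.000 against other columns.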

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
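
Both views can be derived from a simple per-cell missing flag. The sketch below is a plain-Python illustration, assuming `None` stands for `<NA>`. The five rows are the first five values of 등록여부 and 등록번호 from the Sample section, and the helper names are illustrative.

```python
# First five rows of two columns; None stands for <NA>.
rows = [
    {"등록여부": "Y", "등록번호": "40-0578959"},
    {"등록여부": "Y", "등록번호": "40-0380751"},
    {"등록여부": "N", "등록번호": None},
    {"등록여부": "N", "등록번호": None},
    {"등록여부": "N", "등록번호": None},
]

def nullity_by_column(rows):
    """Count missing values per column (the bar-chart view)."""
    columns = list(rows[0].keys())
    return {col: sum(row[col] is None for row in rows) for col in columns}

def nullity_matrix(rows):
    """Return one list of booleans per row, True where a value is present."""
    columns = list(rows[0].keys())
    return [[row[col] is not None for col in columns] for row in rows]

print(nullity_by_column(rows))
for line in nullity_matrix(rows):
    print("".join("#" if present else "." for present in line))
```

In this small slice every missing 등록번호 coincides with 등록여부 being N, which is consistent with the full-data counts (214 missing registration numbers and 214 False values).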

Sample

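 | 시군 | 브랜드 명 | 브랜드 사용자 | 부류명 | 주품목 | 등록여부 | 등록번호 | 신규여부
0 | 청주시 | 직지쌀 | 청주농협 | 식량작물 |  | Y | 40-0578959 | N
1 | 청주시 | 유근형녹용 | 충북사슴영농조합법인 | 축산물 | 녹용 | Y | 40-0380751 | N
2 | 청주시 | 청주양봉 | 청주양봉연구회 | 축산물 | 축산물 | N | <NA> | N
3 | 청주시 | 송절채소 | 송절채소작목반 | 채소류 | 채소류 | N | <NA> | N
4 | 청주시 | 까치네 | 까치네작목반 | 채소류 | 채소류 | N | <NA> | N
5 | 청주시 | 산성것대 | 산성것대작목반 | 채소류 | 채소류 | N | <NA> | N
6 | 충주시 | 토옥 | 충북원협 | 과실류 | 사과 | Y | 40-05727432 | N
7 | 충주시 | 프레샤인 | 충북원협 | 과실류 | 사과 | Y | 41-0176522 | N
8 | 충주시 | 청명주 | 중원당,청명주양미사 | 농산가공 | 주류 | N | <NA> | N
9 | 충주시 | 장길영 충주사과감자떡 | 충주사과감자떡 | 농산가공 | 감자떡 | N | <NA> | Y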
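
 | 시군 | 브랜드 명 | 브랜드 사용자 | 부류명 | 주품목 | 등록여부 | 등록번호 | 신규여부
284 | 괴산군 | 해밀터 | 해밀터영농조합법인 | 농산가공 | 고추장 | N | <NA> | N
285 | 괴산군 | 햇살옹 | 바심종합식품 | 농산가공 | 고추장 | Y | 40-0844265 | N
286 | 음성군 | 햇사레미 | 감곡농협 | 식량작물 |  | N | <NA> | N
287 | 음성군 | 우농쌀 | 우농 RPC | 식량작물 |  | N | <NA> | N
288 | 음성군 | 천하제일미 | 우농 RPC | 식량작물 |  | N | <NA> | N
289 | 음성군 | 금상미 | 썬그레인 | 식량작물 |  | N | <NA> | N
290 | 음성군 | 청풍미 | 썬그레인 | 식량작물 |  | N | <NA> | N
291 | 음성군 | 좋은땅의선물 | 썬그레인 | 식량작물 |  | N | <NA> | N
292 | 음성군 | 스테비아쌀정든고향미 | 대소친환경쌀연구회 | 식량작물 |  | N | <NA> | N
293 | 단양군 | 단양특산물소백산죽령사과 | 단양과수영농조합법인 | 과실류 | 사과 | Y | 40-0514544 | N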