Overview

Dataset statistics

Number of variables: 4
Number of observations: 3508
Missing cells: 3
Missing cells (%): < 0.1%
Duplicate rows: 9
Duplicate rows (%): 0.3%
Total size in memory: 109.8 KiB
Average record size in memory: 32.0 B
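
The two percentages above follow directly from the counts. As a quick arithmetic check (a sketch only; the variable names are illustrative and not part of the report):

```python
# Counts copied from the dataset statistics above.
n_variables = 4
n_observations = 3508
missing_cells = 3
duplicate_rows = 9

# Missing cells are measured against every cell (rows x columns).
missing_pct = 100 * missing_cells / (n_observations * n_variables)

# Duplicate rows are measured against the number of rows.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"missing cells: {missing_pct:.3f}%")
print(f"duplicate rows: {duplicate_pct:.1f}%")
```

The missing-cell share is far below 0.1%, which is why the report shows it as "< 0.1%".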

Variable types

Text: 3
Categorical: 1

Dataset

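Description: Provides data on the current status of seed managers (종자관리사) who carry out seed business registration and seed certification, including seed manager registration number, managed item, name, and address.
URL: https://www.data.go.kr/data/15008338/fileData.do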

Alerts

Dataset has 9 (0.3%) duplicate rows (Duplicates)

Reproduction

Analysis started: 2023-12-12 16:18:45.905572
Analysis finished: 2023-12-12 16:18:46.404305
Duration: 0.5 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

등록번호
Text

Distinct: 3499
Distinct (%): 99.7%
Missing: 0
Missing (%): 0.0%
Memory size: 27.5 KiB

Length

Max length: 14
Median length: 12
Mean length: 12.368871
Min length: 9

Characters and Unicode

Total characters: 43390
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
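
As an illustration of these properties, Python's standard `unicodedata` module reports the general category of each code point. The sketch below tallies categories over three registration numbers taken from this variable's sample rows; the helper name `category_counts` is illustrative and not part of the report.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over every character in values."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

# Three registration numbers from the sample rows of this variable.
sample = ["2022-11-003179", "2022-11-003178", "2022-22-003177"]
counts = category_counts(sample)

for category, n in counts.most_common():
    print(category, n)
```

Here `Nd` is Decimal Number and `Pd` is Dash Punctuation, the two categories reported for this variable.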

Unique

Unique: 3490
Unique (%): 99.5%

Sample

1st row: 2022-11-003179
2nd row: 2022-11-003178
3rd row: 2022-22-003177
4th row: 2022-11-003176
5th row: 2022-11-003175

Value | Count | Frequency (%)
2007-11-936 | 2 | 0.1%
2007-11-938 | 2 | 0.1%
2007-22-250 | 2 | 0.1%
2007-22-251 | 2 | 0.1%
2007-11-934 | 2 | 0.1%
2007-11-937 | 2 | 0.1%
2007-11-939 | 2 | 0.1%
2007-11-935 | 2 | 0.1%
2007-11-940 | 2 | 0.1%
2007-11-932 | 1 | < 0.1%
Other values (3489) | 3489 | 99.5%

Most occurring characters

Value | Count | Frequency (%)
1 | 9796 | 22.6%
0 | 8373 | 19.3%
2 | 7879 | 18.2%
- | 7016 | 16.2%
9 | 2371 | 5.5%
8 | 1663 | 3.8%
3 | 1411 | 3.3%
6 | 1249 | 2.9%
7 | 1246 | 2.9%
5 | 1209 | 2.8%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 36374 | 83.8%
Dash Punctuation | 7016 | 16.2%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 9796 | 26.9%
0 | 8373 | 23.0%
2 | 7879 | 21.7%
9 | 2371 | 6.5%
8 | 1663 | 4.6%
3 | 1411 | 3.9%
6 | 1249 | 3.4%
7 | 1246 | 3.4%
5 | 1209 | 3.3%
4 | 1177 | 3.2%

Dash Punctuation
Value | Count | Frequency (%)
- | 7016 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 43390 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 9796 | 22.6%
0 | 8373 | 19.3%
2 | 7879 | 18.2%
- | 7016 | 16.2%
9 | 2371 | 5.5%
8 | 1663 | 3.8%
3 | 1411 | 3.3%
6 | 1249 | 2.9%
7 | 1246 | 2.9%
5 | 1209 | 2.8%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 43390 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 9796 | 22.6%
0 | 8373 | 19.3%
2 | 7879 | 18.2%
- | 7016 | 16.2%
9 | 2371 | 5.5%
8 | 1663 | 3.8%
3 | 1411 | 3.3%
6 | 1249 | 2.9%
7 | 1246 | 2.9%
5 | 1209 | 2.8%

관리품목
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 27.5 KiB
일반: 2707
버섯: 801

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 일반
2nd row: 일반
3rd row: 버섯
4th row: 일반
5th row: 일반

Common Values

Value | Count | Frequency (%)
일반 | 2707 | 77.2%
버섯 | 801 | 22.8%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
Value | Count | Frequency (%)
일반 | 2707 | 77.2%
버섯 | 801 | 22.8%

성명
Text

Distinct: 161
Distinct (%): 4.6%
Missing: 2
Missing (%): 0.1%
Memory size: 27.5 KiB

Length

Max length: 7
Median length: 3
Mean length: 3.0008557
Min length: 2

Characters and Unicode

Total characters: 10521
Distinct characters: 101
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2

Unique

Unique: 40
Unique (%): 1.1%

Sample

1st row: 홍00
2nd row: 신00
3rd row: 이00
4th row: 최00
5th row: 김00

Value | Count | Frequency (%)
김oo | 422 | 12.0%
이oo | 321 | 9.2%
김00 | 290 | 8.3%
이00 | 186 | 5.3%
박oo | 168 | 4.8%
박00 | 128 | 3.7%
정oo | 115 | 3.3%
최oo | 90 | 2.6%
조oo | 75 | 2.1%
정00 | 72 | 2.1%
Other values (151) | 1639 | 46.7%
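
The counts above show the same surname masked two ways, with the letter o (김oo) and with the digit 0 (김00), so one masked name is split across two entries. A minimal sketch of merging them, assuming the two spellings are equivalent (the report does not say so) and using a hypothetical helper `normalize_mask`:

```python
import re
from collections import Counter

# Counts copied from the table above (only surnames listed in both spellings).
observed = {
    "김oo": 422, "김00": 290,
    "이oo": 321, "이00": 186,
    "박oo": 168, "박00": 128,
    "정oo": 115, "정00": 72,
}

def normalize_mask(name: str) -> str:
    """Rewrite a trailing mask of o/O/0 characters as zeros, keeping its length."""
    return re.sub(r"[oO0]+$", lambda m: "0" * len(m.group()), name.strip())

merged = Counter()
for name, count in observed.items():
    merged[normalize_mask(name)] += count

for name, count in merged.most_common():
    print(name, count)
```

The merged totals (712, 507, 296, 187) equal the four largest Hangul character counts reported for this variable. A name written in Latin letters that happens to end in o would also be rewritten, so real data would need a stricter rule.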

Most occurring characters

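Value | Count | Frequency (%)
o | 4184 | 39.8%
0 | 2825 | 26.9%
[character lost] | 712 | 6.8%
[character lost] | 507 | 4.8%
[character lost] | 296 | 2.8%
[character lost] | 187 | 1.8%
[character lost] | 156 | 1.5%
[character lost] | 129 | 1.2%
[character lost] | 87 | 0.8%
[character lost] | 81 | 0.8%
Other values (91) | 1357 | 12.9%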

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 4184 | 39.8%
Other Letter | 3504 | 33.3%
Decimal Number | 2825 | 26.9%
Uppercase Letter | 7 | 0.1%
Space Separator | 1 | < 0.1%

Most frequent character per category

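Other Letter
Value | Count | Frequency (%)
[character lost] | 712 | 20.3%
[character lost] | 507 | 14.5%
[character lost] | 296 | 8.4%
[character lost] | 187 | 5.3%
[character lost] | 156 | 4.5%
[character lost] | 129 | 3.7%
[character lost] | 87 | 2.5%
[character lost] | 81 | 2.3%
[character lost] | 76 | 2.2%
[character lost] | 71 | 2.0%
Other values (81) | 1202 | 34.3%

Uppercase Letter
Value | Count | Frequency (%)
K | 1 | 14.3%
I | 1 | 14.3%
M | 1 | 14.3%
G | 1 | 14.3%
N | 1 | 14.3%
A | 1 | 14.3%
W | 1 | 14.3%

Lowercase Letter
Value | Count | Frequency (%)
o | 4184 | 100.0%

Decimal Number
Value | Count | Frequency (%)
0 | 2825 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1 | 100.0%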

Most occurring scripts

Value | Count | Frequency (%)
Latin | 4191 | 39.8%
Hangul | 3504 | 33.3%
Common | 2826 | 26.9%

Most frequent character per script

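Hangul
Value | Count | Frequency (%)
[character lost] | 712 | 20.3%
[character lost] | 507 | 14.5%
[character lost] | 296 | 8.4%
[character lost] | 187 | 5.3%
[character lost] | 156 | 4.5%
[character lost] | 129 | 3.7%
[character lost] | 87 | 2.5%
[character lost] | 81 | 2.3%
[character lost] | 76 | 2.2%
[character lost] | 71 | 2.0%
Other values (81) | 1202 | 34.3%

Latin
Value | Count | Frequency (%)
o | 4184 | 99.8%
K | 1 | < 0.1%
I | 1 | < 0.1%
M | 1 | < 0.1%
G | 1 | < 0.1%
N | 1 | < 0.1%
A | 1 | < 0.1%
W | 1 | < 0.1%

Common
Value | Count | Frequency (%)
0 | 2825 | > 99.9%
(space) | 1 | < 0.1%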

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 7017 | 66.7%
Hangul | 3504 | 33.3%

Most frequent character per block

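ASCII
Value | Count | Frequency (%)
o | 4184 | 59.6%
0 | 2825 | 40.3%
K | 1 | < 0.1%
I | 1 | < 0.1%
M | 1 | < 0.1%
(space) | 1 | < 0.1%
G | 1 | < 0.1%
N | 1 | < 0.1%
A | 1 | < 0.1%
W | 1 | < 0.1%

Hangul
Value | Count | Frequency (%)
[character lost] | 712 | 20.3%
[character lost] | 507 | 14.5%
[character lost] | 296 | 8.4%
[character lost] | 187 | 5.3%
[character lost] | 156 | 4.5%
[character lost] | 129 | 3.7%
[character lost] | 87 | 2.5%
[character lost] | 81 | 2.3%
[character lost] | 76 | 2.2%
[character lost] | 71 | 2.0%
Other values (81) | 1202 | 34.3%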

주소
Text

Distinct: 399
Distinct (%): 11.4%
Missing: 1
Missing (%): < 0.1%
Memory size: 27.5 KiB

Length

Max length: 10
Median length: 3
Mean length: 3.3926433
Min length: 2

Characters and Unicode

Total characters: 11898
Distinct characters: 144
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 135
Unique (%): 3.8%

Sample

1st row: 포천시
2nd row: 포천시
3rd row: 부여군
4th row: 여주시
5th row: 서울특별시 관악구

Value | Count | Frequency (%)
경북 | 239 | 6.7%
전남 | 180 | 5.1%
경기도 | 167 | 4.7%
충남 | 150 | 4.2%
충북 | 142 | 4.0%
경남 | 138 | 3.9%
경기 | 127 | 3.6%
전북 | 98 | 2.7%
강원 | 91 | 2.6%
대구 | 62 | 1.7%
Other values (240) | 2170 | 60.9%
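
The counts above list both 경기도 (167) and 경기 (127), which look like the full and abbreviated names of the same province. A hedged sketch of folding abbreviations into one spelling follows; the alias table and the helper `normalize_region` are hypothetical and cover only pairs that are visible in this report (경기/경기도 here, 강원/강원도 in the sample rows).

```python
from collections import Counter

# Hypothetical alias table: abbreviated province name -> full name.
ALIASES = {"경기": "경기도", "강원": "강원도"}

def normalize_region(address: str) -> str:
    """Expand an abbreviated province name in the first token of an address."""
    tokens = address.split()
    if tokens and tokens[0] in ALIASES:
        tokens[0] = ALIASES[tokens[0]]
    return " ".join(tokens)

# Token counts copied from the table above.
observed = {"경기도": 167, "경기": 127, "강원": 91}

merged = Counter()
for token, count in observed.items():
    merged[normalize_region(token)] += count

print(merged["경기도"])
print(merged["강원도"])
```

Only the first token is rewritten, so an address such as 서울특별시 관악구 passes through unchanged.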

Most occurring characters

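Value | Count | Frequency (%)
(space) | 2254 | 18.9%
[character lost] | 1130 | 9.5%
[character lost] | 781 | 6.6%
[character lost] | 575 | 4.8%
[character lost] | 555 | 4.7%
[character lost] | 482 | 4.1%
[character lost] | 411 | 3.5%
[character lost] | 398 | 3.3%
[character lost] | 387 | 3.3%
[character lost] | 361 | 3.0%
Other values (134) | 4564 | 38.4%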

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 9643 | 81.0%
Space Separator | 2254 | 18.9%
Decimal Number | 1 | < 0.1%

Most frequent character per category

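Other Letter
Value | Count | Frequency (%)
[character lost] | 1130 | 11.7%
[character lost] | 781 | 8.1%
[character lost] | 575 | 6.0%
[character lost] | 555 | 5.8%
[character lost] | 482 | 5.0%
[character lost] | 411 | 4.3%
[character lost] | 398 | 4.1%
[character lost] | 387 | 4.0%
[character lost] | 361 | 3.7%
[character lost] | 311 | 3.2%
Other values (132) | 4252 | 44.1%

Space Separator
Value | Count | Frequency (%)
(space) | 2254 | 100.0%

Decimal Number
Value | Count | Frequency (%)
1 | 1 | 100.0%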

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 9643 | 81.0%
Common | 2255 | 19.0%

Most frequent character per script

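Hangul
Value | Count | Frequency (%)
[character lost] | 1130 | 11.7%
[character lost] | 781 | 8.1%
[character lost] | 575 | 6.0%
[character lost] | 555 | 5.8%
[character lost] | 482 | 5.0%
[character lost] | 411 | 4.3%
[character lost] | 398 | 4.1%
[character lost] | 387 | 4.0%
[character lost] | 361 | 3.7%
[character lost] | 311 | 3.2%
Other values (132) | 4252 | 44.1%

Common
Value | Count | Frequency (%)
(space) | 2254 | > 99.9%
1 | 1 | < 0.1%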

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 9643 | 81.0%
ASCII | 2255 | 19.0%

Most frequent character per block

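ASCII
Value | Count | Frequency (%)
(space) | 2254 | > 99.9%
1 | 1 | < 0.1%

Hangul
Value | Count | Frequency (%)
[character lost] | 1130 | 11.7%
[character lost] | 781 | 8.1%
[character lost] | 575 | 6.0%
[character lost] | 555 | 5.8%
[character lost] | 482 | 5.0%
[character lost] | 411 | 4.3%
[character lost] | 398 | 4.1%
[character lost] | 387 | 4.0%
[character lost] | 361 | 3.7%
[character lost] | 311 | 3.2%
Other values (132) | 4252 | 44.1%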

Missing values

[Figure: count] A simple visualization of nullity by column.
[Figure: matrix] The nullity matrix is a data-dense display that lets you quickly spot patterns in data completeness.
[Figure: heatmap] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
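
To make the idea of nullity correlation concrete, the sketch below computes it for two columns, assuming the measure is the Pearson correlation between missing-value indicators (the report does not state the formula). The function name and the toy columns are hypothetical; they are not rows from this dataset.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns."""
    a = [1 if v is None else 0 for v in col_a]
    b = [1 if v is None else 0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None  # undefined when a column is never (or always) missing
    return cov / sqrt(var_a * var_b)

# Hypothetical toy columns; None marks a missing cell.
names = ["홍00", None, "이00", None]
addresses = ["포천시", None, "부여군", "여주시"]

r = nullity_correlation(names, addresses)
print(round(r, 3))
```

A value near 1 means the two columns tend to be missing in the same rows, a value near -1 means one tends to be present when the other is missing, and a column with no missing cells has no defined correlation.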

Sample

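First rows

Index | 등록번호 | 관리품목 | 성명 | 주소
0 | 2022-11-003179 | 일반 | 홍00 | 포천시
1 | 2022-11-003178 | 일반 | 신00 | 포천시
2 | 2022-22-003177 | 버섯 | 이00 | 부여군
3 | 2022-11-003176 | 일반 | 최00 | 여주시
4 | 2022-11-003175 | 일반 | 김00 | 서울특별시 관악구
5 | 2022-11-003174 | 일반 | 김00 | 이천시
6 | 2022-11-003173 | 일반 | 김00 | 홍천군
7 | 2022-11-003172 | 일반 | 최00 | 고령군
8 | 2022-11-003171 | 일반 | 나00 | 서울특별시 성북구
9 | 2022-11-003170 | 일반 | 김00 | 홍성군

Last rows

Index | 등록번호 | 관리품목 | 성명 | 주소
3498 | 1998-11-06 | 일반 | 안oo | 강원도
3499 | 1998-22-1 | 버섯 | 민oo | 충남
3500 | 1998-22-2 | 버섯 | 유oo | 충북
3501 | 1998-22-3 | 버섯 | 민oo | 충남
3502 | 1998-22-4 | 버섯 | 백oo | 충남
3503 | 1998-22-9 | 버섯 | 곽oo | 강원
3504 | 1998-22-10 | 버섯 | 송oo | 전남
3505 | 1998-22-11 | 버섯 | 윤oo | 전남
3506 | 1998-22-12 | 버섯 | 이oo | 경남
3507 | 1998-22-17 | 버섯 | 주oo | 전남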
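
In the rows shown above, each 등록번호 has three dash-separated parts, and the middle part is 11 wherever 관리품목 is 일반 and 22 wherever it is 버섯. The sketch below splits a number into those parts and checks that pairing on a few of the sample rows. Reading the parts as a year, an item code, and a serial number is an assumption, as is the helper name `parse_registration`.

```python
import re

# Four digits, two digits, then a serial of any length (zero-padded or not).
PATTERN = re.compile(r"^(\d{4})-(\d{2})-(\d+)$")

def parse_registration(number: str):
    """Return (first part, middle part, serial) or None if the shape differs."""
    match = PATTERN.match(number)
    if match is None:
        return None
    first, middle, serial = match.groups()
    return int(first), middle, int(serial)

# (등록번호, 관리품목) pairs copied from the sample rows above.
rows = [
    ("2022-11-003179", "일반"),
    ("2022-22-003177", "버섯"),
    ("1998-11-06", "일반"),
    ("1998-22-1", "버섯"),
]

# Collect which 관리품목 values occur with each middle part.
items_by_code = {}
for number, item in rows:
    _, code, _ = parse_registration(number)
    items_by_code.setdefault(code, set()).add(item)

print(items_by_code)
```

Whether the pairing holds for all 3508 rows cannot be confirmed from the sample alone; running the same check over the full file would settle it.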

Duplicate rows

Most frequently occurring

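Index | 등록번호 | 관리품목 | 성명 | 주소 | # duplicates
0 | 2007-11-934 | 일반 | 박oo | 광주 | 2
1 | 2007-11-935 | 일반 | 박oo | 광주 | 2
2 | 2007-11-936 | 일반 | 김oo | 경북 | 2
3 | 2007-11-937 | 일반 | 양oo | 전남 | 2
4 | 2007-11-938 | 일반 | 서oo | 경기 | 2
5 | 2007-11-939 | 일반 | 박oo | 충남 | 2
6 | 2007-11-940 | 일반 | 현oo | 제주 | 2
7 | 2007-22-250 | 버섯 | 고oo | 경기 | 2
8 | 2007-22-251 | 버섯 | 최oo | 경기 | 2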
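
Fully duplicated rows like these can be located by counting identical tuples. The following is a minimal standard-library sketch, not the method the profiling tool uses; the duplicated rows are copied from the table above and the final row from the sample section. In pandas, the built-in `DataFrame.duplicated` and `DataFrame.drop_duplicates` methods serve the same purpose.

```python
from collections import Counter

# Rows as (등록번호, 관리품목, 성명, 주소) tuples.
rows = [
    ("2007-11-934", "일반", "박oo", "광주"),
    ("2007-11-934", "일반", "박oo", "광주"),
    ("2007-22-250", "버섯", "고oo", "경기"),
    ("2007-22-250", "버섯", "고oo", "경기"),
    ("1998-22-1", "버섯", "민oo", "충남"),
]

# Count how often each complete row occurs and keep those seen more than once.
counts = Counter(rows)
duplicates = {row: n for row, n in counts.items() if n > 1}

# dict.fromkeys keeps the first occurrence of each row, in original order.
deduplicated = list(dict.fromkeys(rows))

print(len(duplicates), len(deduplicated))
```

Because the comparison covers all four fields, two rows that share a 등록번호 but differ in any other field would not be flagged.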