Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 5
Duplicate rows (%): 0.1%
Total size in memory: 468.8 KiB
Average record size in memory: 48.0 B
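
The derived figures in these statistics follow from the raw counts. The sketch below redoes that arithmetic using only numbers taken from this report; the variable names are illustrative, and the idea that the report rounds percentages to one decimal place is an assumption.

```python
# Raw figures copied from the dataset statistics in this report.
n_rows = 10_000
n_columns = 5
duplicate_rows = 5
total_size_kib = 468.8

# Share of duplicate rows in percent (the report displays it as 0.1%).
duplicate_pct = duplicate_rows / n_rows * 100

# Average record size in bytes: total memory divided by the row count.
avg_record_bytes = total_size_kib * 1024 / n_rows

# Number of cells that the missing-cell count is measured against.
n_cells = n_rows * n_columns

print(f"duplicate rows: {duplicate_pct:.2f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
print(f"cells: {n_cells}")
```

The exact duplicate share is 0.05%, so the 0.1% shown in the report is a rounded figure.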

Variable types

DateTime: 1
Text: 4

Dataset

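Description: List of filings submitted to the Korea Seed and Variety Service (국립종자원) for importing, producing, or selling seeds of plants that it oversees as variety-protection plants, organized by the issue date of the filing certificate
URL: https://www.data.go.kr/data/3036312/fileData.do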

Alerts

Dataset has 5 (0.1%) duplicate rows | Duplicates

Reproduction

Analysis started: 2023-12-12 08:11:18.711364
Analysis finished: 2023-12-12 08:11:20.264475
Duration: 1.55 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
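
A report of this kind can be regenerated with ydata-profiling. The following is a minimal sketch, not the exact procedure used here: the function name `build_report`, the CSV path, the report title, and the use of default settings are all assumptions, and the settings stored in config.json are not reproduced.

```python
from pathlib import Path


def build_report(csv_path: str, out_path: str = "report.html") -> Path:
    """Profile a CSV file and write the HTML report to out_path."""
    # Imported lazily so this module loads even when the optional
    # third-party packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the CSV has a header row and a readable encoding;
    # pass encoding=... to read_csv if the file needs it.
    df = pd.read_csv(csv_path)

    # Default settings; the original run may have used different ones.
    report = ProfileReport(df, title="Profiling report")
    report.to_file(out_path)
    return Path(out_path)
```

Calling `build_report("seeds.csv")` (a hypothetical file name) would write `report.html` next to the script.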

Variables

Certificate issue date (신고필증 발급일)
DateTime

Distinct: 2484
Distinct (%): 24.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 1997-12-31 00:00:00
Maximum: 2022-12-28 00:00:00
Histogram with fixed size bins (bins=50)
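
To make "fixed size bins" concrete, the sketch below splits the reported date range into 50 equal-width bins. The report does not state how it computes the edges, so equal-width spacing between the minimum and maximum is an assumption, and `fixed_bin_edges` is a hypothetical helper.

```python
from datetime import datetime


def fixed_bin_edges(start: datetime, end: datetime, bins: int = 50) -> list:
    """Return bins + 1 equally spaced edges from start to end."""
    width = (end - start) / bins  # timedelta covering one bin
    return [start + i * width for i in range(bins + 1)]


# Minimum and maximum as listed for this variable.
minimum = datetime(1997, 12, 31)
maximum = datetime(2022, 12, 28)

edges = fixed_bin_edges(minimum, maximum)
print(len(edges), "edges, bin width:", edges[1] - edges[0])
```

Each bin spans roughly half a year of issue dates.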

Certificate number (신고필증번호)
Text

Distinct: 9611
Distinct (%): 96.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 20
Median length: 17
Mean length: 16.1973
Min length: 14
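
The length statistics are plain summaries of string lengths. As an illustration, the sketch below computes them for the five sample values listed for this variable, then checks that the reported mean length equals the reported total character count divided by the number of observations. The five values are only a small sample, so their statistics differ from the full-column figures.

```python
from statistics import mean, median

# The five sample values shown for this variable in the report.
samples = [
    "02-0004-2012-96",
    "04-0004-1999-1068",
    "02-0004-2015-62",
    "02-0109-2016-3",
    "04-0005-1999-978",
]

lengths = [len(s) for s in samples]
summary = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": mean(lengths),
    "min": min(lengths),
}
print(summary)

# Full-column mean length: total characters / number of observations.
report_mean = 161973 / 10_000
print(report_mean)
```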

Characters and Unicode

Total characters: 161973
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
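
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories over two of the sample values; `category_counts` is a hypothetical helper, and the short codes map to the names used in this report (`Nd` is Decimal Number, `Pd` is Dash Punctuation, `Zs` is Space Separator).

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in values."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


counts = category_counts(["02-0004-2012-96", "04-0004-1999-1068"])
print(counts)
```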

Unique

Unique: 9522
Unique (%): 95.2%
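
"Distinct" and "Unique" are easy to confuse. The sketch below assumes the usual reading: distinct counts how many different values occur, while unique counts the values that occur exactly once. The report does not spell out its definitions, so treat this as an interpretation; `distinct_and_unique` is a hypothetical helper.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique


# One value repeated twice plus one value seen once.
example = ["07-0002-1997-5", "07-0002-1997-5", "02-0109-2016-3"]
print(distinct_and_unique(example))
```

Under this reading, unique can never exceed distinct, which matches the figures reported for every text variable here.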

Sample

1st row: 02-0004-2012-96
2nd row: 04-0004-1999-1068
3rd row: 02-0004-2015-62
4th row: 02-0109-2016-3
5th row: 04-0005-1999-978
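
The sample values share a four-part, dash-separated shape. The sketch below parses that shape with a regular expression. The field names `group`, `code`, `year`, and `serial` are hypothetical: the report does not document what each part means, and reading the third part as a year is an assumption based on it matching the issue-date year in the sample rows. Surrounding whitespace is stripped because the character statistics show that the column contains spaces.

```python
import re
from typing import NamedTuple, Optional

# Two digits, four digits, four digits, then one or more digits.
CERT_PATTERN = re.compile(r"(\d{2})-(\d{4})-(\d{4})-(\d+)")


class CertNumber(NamedTuple):
    group: str   # hypothetical name for the first part
    code: str    # hypothetical name for the second part
    year: int    # assumption: the third part is a year
    serial: int  # hypothetical name for the last part


def parse_cert_number(text: str) -> Optional[CertNumber]:
    """Parse a certificate number, or return None if it does not match."""
    match = CERT_PATTERN.fullmatch(text.strip())
    if match is None:
        return None
    group, code, year, serial = match.groups()
    return CertNumber(group, code, int(year), int(serial))


print(parse_cert_number("04-0004-1999-1068"))
```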
Value | Count | Frequency (%)
07-0002-1997-5 | 22 | 0.2%
07-0003-1997-1 | 17 | 0.2%
03-0002-1997-6 | 17 | 0.2%
07-0002-1997-3 | 16 | 0.2%
07-0001-1997-3 | 16 | 0.2%
07-0002-1997-4 | 14 | 0.1%
07-0003-1997-2 | 13 | 0.1%
07-0001-1997-2 | 12 | 0.1%
07-0002-1997-2 | 12 | 0.1%
07-0002-1997-9 | 11 | 0.1%
Other values (9601) | 9850 | 98.5%

Most occurring characters

Value | Count | Frequency (%)
0 | 44629 | 27.6%
- | 30000 | 18.5%
2 | 17755 | 11.0%
1 | 16079 | 9.9%
(space) | 14367 | 8.9%
9 | 10791 | 6.7%
4 | 8385 | 5.2%
3 | 5899 | 3.6%
7 | 4604 | 2.8%
5 | 3620 | 2.2%
Other values (2) | 5844 | 3.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 117606 | 72.6%
Dash Punctuation | 30000 | 18.5%
Space Separator | 14367 | 8.9%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 44629 | 37.9%
2 | 17755 | 15.1%
1 | 16079 | 13.7%
9 | 10791 | 9.2%
4 | 8385 | 7.1%
3 | 5899 | 5.0%
7 | 4604 | 3.9%
5 | 3620 | 3.1%
6 | 3023 | 2.6%
8 | 2821 | 2.4%
Dash Punctuation
Value | Count | Frequency (%)
- | 30000 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 14367 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 161973 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 44629 | 27.6%
- | 30000 | 18.5%
2 | 17755 | 11.0%
1 | 16079 | 9.9%
(space) | 14367 | 8.9%
9 | 10791 | 6.7%
4 | 8385 | 5.2%
3 | 5899 | 3.6%
7 | 4604 | 2.8%
5 | 3620 | 2.2%
Other values (2) | 5844 | 3.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 161973 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 44629 | 27.6%
- | 30000 | 18.5%
2 | 17755 | 11.0%
1 | 16079 | 9.9%
(space) | 14367 | 8.9%
9 | 10791 | 6.7%
4 | 8385 | 5.2%
3 | 5899 | 3.6%
7 | 4604 | 2.8%
5 | 3620 | 2.2%
Other values (2) | 5844 | 3.6%

Crop name (작물명)
Text

Distinct: 768
Distinct (%): 7.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 27
Median length: 20
Mean length: 3.3462
Min length: 1

Characters and Unicode

Total characters: 33462
Distinct characters: 486
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 292
Unique (%): 2.9%

Sample

1st row: 고추
2nd row: 카네이션
3rd row: 고추
4th row: 호박(서양계X동양계)
5th row: 백합
Value | Count | Frequency (%)
고추 | 616 | 6.1%
튤립 | 358 | 3.6%
 | 321 | 3.2%
양파 | 290 | 2.9%
사과 | 282 | 2.8%
백합 | 277 | 2.8%
느타리버섯 | 256 | 2.5%
카네이션 | 254 | 2.5%
배추 | 253 | 2.5%
토마토 | 247 | 2.5%
Other values (780) | 6918 | 68.7%

Most occurring characters

ValueCountFrequency (%)
1338
 
4.0%
1222
 
3.7%
887
 
2.7%
832
 
2.5%
806
 
2.4%
802
 
2.4%
796
 
2.4%
777
 
2.3%
( 630
 
1.9%
) 630
 
1.9%
Other values (476) 24742
73.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 32018 | 95.7%
Open Punctuation | 630 | 1.9%
Close Punctuation | 630 | 1.9%
Space Separator | 73 | 0.2%
Other Punctuation | 53 | 0.2%
Uppercase Letter | 49 | 0.1%
Math Symbol | 7 | < 0.1%
Lowercase Letter | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1338
 
4.2%
1222
 
3.8%
887
 
2.8%
832
 
2.6%
806
 
2.5%
802
 
2.5%
796
 
2.5%
777
 
2.4%
597
 
1.9%
587
 
1.8%
Other values (469) 23374
73.0%
Open Punctuation
Value | Count | Frequency (%)
( | 630 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 630 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 73 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
, | 53 | 100.0%
Uppercase Letter
Value | Count | Frequency (%)
X | 49 | 100.0%
Math Symbol
Value | Count | Frequency (%)
+ | 7 | 100.0%
Lowercase Letter
Value | Count | Frequency (%)
x | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 32018 | 95.7%
Common | 1393 | 4.2%
Latin | 51 | 0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1338
 
4.2%
1222
 
3.8%
887
 
2.8%
832
 
2.6%
806
 
2.5%
802
 
2.5%
796
 
2.5%
777
 
2.4%
597
 
1.9%
587
 
1.8%
Other values (469) 23374
73.0%
Common
Value | Count | Frequency (%)
( | 630 | 45.2%
) | 630 | 45.2%
(space) | 73 | 5.2%
, | 53 | 3.8%
+ | 7 | 0.5%
Latin
Value | Count | Frequency (%)
X | 49 | 96.1%
x | 2 | 3.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 32018 | 95.7%
ASCII | 1444 | 4.3%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1338
 
4.2%
1222
 
3.8%
887
 
2.8%
832
 
2.6%
806
 
2.5%
802
 
2.5%
796
 
2.5%
777
 
2.4%
597
 
1.9%
587
 
1.8%
Other values (469) 23374
73.0%
ASCII
Value | Count | Frequency (%)
( | 630 | 43.6%
) | 630 | 43.6%
(space) | 73 | 5.1%
, | 53 | 3.7%
X | 49 | 3.4%
+ | 7 | 0.5%
x | 2 | 0.1%

Variety name (품종명)
Text

Distinct: 7073
Distinct (%): 70.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 17
Median length: 15
Mean length: 4.4564
Min length: 1

Characters and Unicode

Total characters: 44564
Distinct characters: 911
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6308
Unique (%): 63.1%

Sample

1st row: 피알농가사랑
2nd row: 이브
3rd row: 명콤비
4th row: 썬스타
5th row: 라이온
Value | Count | Frequency (%)
일반종 | 427 | 4.2%
커몬 | 173 | 1.7%
재래종 | 76 | 0.8%
신고 | 39 | 0.4%
차랑 | 37 | 0.4%
혼합종 | 37 | 0.4%
후지 | 35 | 0.3%
야생종자 | 30 | 0.3%
캠벨어리 | 29 | 0.3%
추희 | 27 | 0.3%
Other values (7110) | 9214 | 91.0%

Most occurring characters

ValueCountFrequency (%)
1650
 
3.7%
1469
 
3.3%
1131
 
2.5%
772
 
1.7%
770
 
1.7%
749
 
1.7%
715
 
1.6%
699
 
1.6%
629
 
1.4%
621
 
1.4%
Other values (901) 35359
79.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 41821 | 93.8%
Decimal Number | 2166 | 4.9%
Uppercase Letter | 250 | 0.6%
Space Separator | 129 | 0.3%
Dash Punctuation | 93 | 0.2%
Other Punctuation | 50 | 0.1%
Lowercase Letter | 37 | 0.1%
Letter Number | 7 | < 0.1%
Close Punctuation | 5 | < 0.1%
Open Punctuation | 5 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1650
 
3.9%
1469
 
3.5%
1131
 
2.7%
772
 
1.8%
770
 
1.8%
749
 
1.8%
715
 
1.7%
699
 
1.7%
629
 
1.5%
621
 
1.5%
Other values (840) 32616
78.0%
Uppercase Letter
Value | Count | Frequency (%)
R | 62 | 24.8%
I | 21 | 8.4%
P | 21 | 8.4%
C | 19 | 7.6%
M | 15 | 6.0%
K | 14 | 5.6%
S | 14 | 5.6%
L | 14 | 5.6%
Y | 11 | 4.4%
B | 10 | 4.0%
Other values (14) | 49 | 19.6%
Lowercase Letter
Value | Count | Frequency (%)
o | 7 | 18.9%
a | 4 | 10.8%
y | 4 | 10.8%
r | 4 | 10.8%
e | 3 | 8.1%
s | 2 | 5.4%
u | 2 | 5.4%
b | 2 | 5.4%
i | 2 | 5.4%
k | 1 | 2.7%
Other values (6) | 6 | 16.2%
Decimal Number
Value | Count | Frequency (%)
1 | 520 | 24.0%
2 | 394 | 18.2%
0 | 326 | 15.1%
3 | 206 | 9.5%
5 | 199 | 9.2%
4 | 130 | 6.0%
7 | 118 | 5.4%
8 | 112 | 5.2%
6 | 89 | 4.1%
9 | 72 | 3.3%
Other Punctuation
Value | Count | Frequency (%)
. | 23 | 46.0%
: | 16 | 32.0%
, | 11 | 22.0%
Letter Number
ValueCountFrequency (%)
3
42.9%
3
42.9%
1
 
14.3%
Space Separator
Value | Count | Frequency (%)
(space) | 129 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 93 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 5 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 5 | 100.0%
Math Symbol
Value | Count | Frequency (%)
+ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 41821 | 93.8%
Common | 2449 | 5.5%
Latin | 294 | 0.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1650
 
3.9%
1469
 
3.5%
1131
 
2.7%
772
 
1.8%
770
 
1.8%
749
 
1.8%
715
 
1.7%
699
 
1.7%
629
 
1.5%
621
 
1.5%
Other values (840) 32616
78.0%
Latin
Value | Count | Frequency (%)
R | 62 | 21.1%
I | 21 | 7.1%
P | 21 | 7.1%
C | 19 | 6.5%
M | 15 | 5.1%
K | 14 | 4.8%
S | 14 | 4.8%
L | 14 | 4.8%
Y | 11 | 3.7%
B | 10 | 3.4%
Other values (33) | 93 | 31.6%
Common
Value | Count | Frequency (%)
1 | 520 | 21.2%
2 | 394 | 16.1%
0 | 326 | 13.3%
3 | 206 | 8.4%
5 | 199 | 8.1%
4 | 130 | 5.3%
(space) | 129 | 5.3%
7 | 118 | 4.8%
8 | 112 | 4.6%
- | 93 | 3.8%
Other values (8) | 222 | 9.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 41821 | 93.8%
ASCII | 2736 | 6.1%
Number Forms | 7 | < 0.1%
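
A Unicode block is a contiguous range of code points, so block membership can be checked with simple range comparisons. The sketch below covers only the three blocks that appear in this report. The ranges come from the Unicode Standard; mapping the report's "Hangul" label to the Hangul Syllables block is an assumption, and `block_of` is a hypothetical helper, not the lookup the report itself uses.

```python
import unicodedata
from collections import Counter

# (label, first code point, last code point)
BLOCKS = [
    ("ASCII", 0x0000, 0x007F),
    ("Number Forms", 0x2150, 0x218F),
    ("Hangul", 0xAC00, 0xD7AF),  # assumption: the Hangul Syllables block
]


def block_of(ch: str) -> str:
    """Return the block label for one character, or 'Other'."""
    code_point = ord(ch)
    for name, first, last in BLOCKS:
        if first <= code_point <= last:
            return name
    return "Other"


# A variety name from the sample rows that mixes Latin and Hangul.
counts = Counter(block_of(ch) for ch in "YR만월")
print(counts)

# Roman numerals live in Number Forms and have category Letter Number (Nl).
roman_two = "\u2161"
print(block_of(roman_two), unicodedata.category(roman_two))
```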

Most frequent character per block

Hangul
ValueCountFrequency (%)
1650
 
3.9%
1469
 
3.5%
1131
 
2.7%
772
 
1.8%
770
 
1.8%
749
 
1.8%
715
 
1.7%
699
 
1.7%
629
 
1.5%
621
 
1.5%
Other values (840) 32616
78.0%
ASCII
Value | Count | Frequency (%)
1 | 520 | 19.0%
2 | 394 | 14.4%
0 | 326 | 11.9%
3 | 206 | 7.5%
5 | 199 | 7.3%
4 | 130 | 4.8%
(space) | 129 | 4.7%
7 | 118 | 4.3%
8 | 112 | 4.1%
- | 93 | 3.4%
Other values (48) | 509 | 18.6%
Number Forms
ValueCountFrequency (%)
3
42.9%
3
42.9%
1
 
14.3%

Filer (신고인)
Text

Distinct: 1111
Distinct (%): 11.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 21
Median length: 17
Mean length: 8.6316
Min length: 1

Characters and Unicode

Total characters: 86316
Distinct characters: 395
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 379
Unique (%): 3.8%

Sample

1st row: (주)제농 에스앤티 농업회사법인
2nd row: 한국화훼종묘
3rd row: (주)팜한농
4th row: 주식회사 생농 농업회사법인
5th row: 에이스화훼통상
Value | Count | Frequency (%)
농업회사법인 | 2322 | 16.6%
주식회사 | 1208 | 8.6%
한미종묘 | 435 | 3.1%
대양화훼종묘(주 | 334 | 2.4%
중앙화훼종묘(주 | 317 | 2.3%
사카타코리아(주 | 294 | 2.1%
주)팜한농 | 283 | 2.0%
한국다끼이(주 | 252 | 1.8%
아시아종묘(주 | 221 | 1.6%
신농화훼종묘(주 | 182 | 1.3%
Other values (1116) | 8138 | 58.2%
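
Entries such as `주)팜한농` and `대양화훼종묘(주` in the frequency list for this variable look like word tokens rather than whole values. One tokenization that reproduces them is to split each value on whitespace and then strip punctuation from both ends of every token. The sketch below shows that approach; it is an interpretation of the output, not a confirmed description of the tool's procedure, and `word_counts` is a hypothetical helper.

```python
import string
from collections import Counter


def word_counts(values):
    """Count whitespace-separated tokens after trimming edge punctuation."""
    counts = Counter()
    for value in values:
        for token in value.split():
            # Only leading and trailing punctuation is removed, so a
            # bracket in the middle of a token survives.
            word = token.strip(string.punctuation)
            if word:  # assumption: empty tokens are dropped
                counts[word] += 1
    return counts


counts = word_counts(["(주)팜한농", "대양화훼종묘(주)", "주식회사 생농 농업회사법인"])
print(counts)
```

This also explains why the counts in that list add up to more than the number of observations: a single value can contribute several tokens.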

Most occurring characters

ValueCountFrequency (%)
5770
 
6.7%
4709
 
5.5%
4570
 
5.3%
) 4419
 
5.1%
( 4418
 
5.1%
3988
 
4.6%
3980
 
4.6%
3750
 
4.3%
3315
 
3.8%
2866
 
3.3%
Other values (385) 44531
51.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 73376 | 85.0%
Close Punctuation | 4419 | 5.1%
Open Punctuation | 4418 | 5.1%
Space Separator | 3988 | 4.6%
Uppercase Letter | 83 | 0.1%
Decimal Number | 24 | < 0.1%
Dash Punctuation | 7 | < 0.1%
Other Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5770
 
7.9%
4709
 
6.4%
4570
 
6.2%
3980
 
5.4%
3750
 
5.1%
3315
 
4.5%
2866
 
3.9%
2820
 
3.8%
2678
 
3.6%
1521
 
2.1%
Other values (372) 37397
51.0%
Uppercase Letter
Value | Count | Frequency (%)
S | 40 | 48.2%
K | 40 | 48.2%
O | 1 | 1.2%
M | 1 | 1.2%
C | 1 | 1.2%
Decimal Number
Value | Count | Frequency (%)
1 | 11 | 45.8%
2 | 11 | 45.8%
8 | 2 | 8.3%
Close Punctuation
Value | Count | Frequency (%)
) | 4419 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 4418 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 3988 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 7 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
. | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 73376 | 85.0%
Common | 12857 | 14.9%
Latin | 83 | 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5770
 
7.9%
4709
 
6.4%
4570
 
6.2%
3980
 
5.4%
3750
 
5.1%
3315
 
4.5%
2866
 
3.9%
2820
 
3.8%
2678
 
3.6%
1521
 
2.1%
Other values (372) 37397
51.0%
Common
Value | Count | Frequency (%)
) | 4419 | 34.4%
( | 4418 | 34.4%
(space) | 3988 | 31.0%
1 | 11 | 0.1%
2 | 11 | 0.1%
- | 7 | 0.1%
8 | 2 | < 0.1%
. | 1 | < 0.1%
Latin
Value | Count | Frequency (%)
S | 40 | 48.2%
K | 40 | 48.2%
O | 1 | 1.2%
M | 1 | 1.2%
C | 1 | 1.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 73376 | 85.0%
ASCII | 12940 | 15.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
5770
 
7.9%
4709
 
6.4%
4570
 
6.2%
3980
 
5.4%
3750
 
5.1%
3315
 
4.5%
2866
 
3.9%
2820
 
3.8%
2678
 
3.6%
1521
 
2.1%
Other values (372) 37397
51.0%
ASCII
Value | Count | Frequency (%)
) | 4419 | 34.1%
( | 4418 | 34.1%
(space) | 3988 | 30.8%
S | 40 | 0.3%
K | 40 | 0.3%
1 | 11 | 0.1%
2 | 11 | 0.1%
- | 7 | 0.1%
8 | 2 | < 0.1%
. | 1 | < 0.1%
Other values (3) | 3 | < 0.1%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
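
A nullity matrix is just a grid of present/absent flags, one row per record and one column per variable. The sketch below builds such a grid and the per-column missing counts from a list of dictionaries. The helper names and the two-column example data are hypothetical, and treating only `None` as missing is an assumption.

```python
def nullity_matrix(rows, columns):
    """Return one list of booleans per row: True where a value is present."""
    return [[row.get(col) is not None for col in columns] for row in rows]


def missing_counts(rows, columns):
    """Return the number of missing values in each column."""
    matrix = nullity_matrix(rows, columns)
    return {
        col: sum(1 for flags in matrix if not flags[i])
        for i, col in enumerate(columns)
    }


# Hypothetical two-column example with one missing value.
columns = ["crop", "variety"]
rows = [
    {"crop": "고추", "variety": "명콤비"},
    {"crop": "백합", "variety": None},
]
print(nullity_matrix(rows, columns))
print(missing_counts(rows, columns))
```

For this dataset every count would be zero, since the report lists no missing cells.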

Sample

Index | Certificate issue date (신고필증 발급일) | Certificate number (신고필증번호) | Crop name (작물명) | Variety name (품종명) | Filer (신고인)
20920 | 2012-11-12 | 02-0004-2012-96 | 고추 | 피알농가사랑 | (주)제농 에스앤티 농업회사법인
45724 | 1999-07-06 | 04-0004-1999-1068 | 카네이션 | 이브 | 한국화훼종묘
14508 | 2015-10-16 | 02-0004-2015-62 | 고추 | 명콤비 | (주)팜한농
13378 | 2016-05-09 | 02-0109-2016-3 | 호박(서양계X동양계) | 썬스타 | 주식회사 생농 농업회사법인
46285 | 1999-06-21 | 04-0005-1999-978 | 백합 | 라이온 | 에이스화훼통상
48690 | 1999-01-30 | 04-0074-1999-36 | 코스모스 | 센세이션믹스시리즈 | 중앙화훼종묘(주)
27002 | 2009-09-16 | 03-0012-2009-18 | 일본자두 | 대석조생 | 최성복
38124 | 2003-06-24 | 04-0027-2003-17 | 프리뮬라 | 하야가와폴리안사블루 | 남도원예
40679 | 2001-07-26 | 02-0003-2001-10 | 양배추 | YR만월 | 최원찬
21091 | 2012-09-25 | 03-0071-2012-4 | 블루베리(래빗아이) | 핑크레모네이드 | 이귀완
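Index | Certificate issue date (신고필증 발급일) | Certificate number (신고필증번호) | Crop name (작물명) | Variety name (품종명) | Filer (신고인)
55992 | 1998-12-16 | 04-0004-1998-16 | 카네이션 | 그린발렌타인 | 피꾸이앤조넨
46047 | 1999-06-21 | 04-0003-1999-436 | 국화 | 래믹옐로우 | 동서원예사
256 | 2022-12-01 | 03-0157-2022-1 | 산복숭아 | 일반종 | 민선미
42461 | 2000-05-16 | 07-0002-2000-194 | 느타리버섯 | 명월 | 화천종균배양소
18062 | 2014-02-27 | 03-0009-2014-26 | 체리(감과양앵두) | 석홍금 | 문경페칸영농조합
19746 | 2013-04-26 | 02-0014-2013-17 | 시금치 | 신벤츠 | (주)팜한농
60363 | 1997-12-31 | 02-0008-1997-101 | 수박 | 명물수박 | 정춘식
26540 | 2009-11-17 | 02-0006-2009-10 | 오이 | 보람 | 류현정
41188 | 2001-03-30 | 04-0037-2001-36 | 거베라 | 크리스타롱 | 임육택
29331 | 2007-11-22 | 01-0015-2007-3 | 메밀 | 양절 | 중앙화훼종묘(주)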

Duplicate rows

Most frequently occurring

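Index | Certificate issue date (신고필증 발급일) | Certificate number (신고필증번호) | Crop name (작물명) | Variety name (품종명) | Filer (신고인) | # duplicates
0 | 1997-12-31 | 02-0001-1997-12 | | 의성반청무 | (주)팜한농 | 2
1 | 1997-12-31 | 02-0001-1997-8 | | 중국청피무 | (주)팜한농 | 2
2 | 1997-12-31 | 02-0012-1997-2 | 당근 | 여름5촌 | (주)팜한농 | 2
3 | 1997-12-31 | 07-0001-1997-3 | 양송이버섯 | 505호 | 수원균농종균배양소 | 2
4 | 1997-12-31 | 07-0003-1997-1 | 영지버섯 | 영지1호 | 한재버섯종균배양소 | 2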
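
Duplicate rows like these can be found by counting identical records. The sketch below is one straightforward way to do it with the standard library, assuming that two rows count as duplicates only when every column matches; `duplicate_groups` is a hypothetical helper, and the second example row is made up for contrast.

```python
from collections import Counter


def duplicate_groups(rows):
    """Map each row that occurs more than once to its occurrence count."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


carrot = ("1997-12-31", "02-0012-1997-2", "당근", "여름5촌", "(주)팜한농")
other = ("2015-10-16", "02-0004-2015-62", "고추", "명콤비", "(주)팜한농")

groups = duplicate_groups([carrot, other, carrot])
print(groups)
```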