Overview

Dataset statistics

Number of variables: 8
Number of observations: 2704
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 174.4 KiB
Average record size in memory: 66.0 B
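
The two memory figures are consistent with each other: dividing the total in-memory size by the number of observations gives the average record size. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes (the variable names are illustrative and not part of the report):

```python
# Figures copied from the dataset statistics in this report.
total_size_kib = 174.4
n_observations = 2704

# Assumption: 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B per record")
```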

Variable types

Numeric: 2
DateTime: 3
Text: 3

Dataset

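Description: Status data on new plant varieties whose plant variety protection rights have lapsed, publishing the lapse date, application date, application number, variety name, and right holder of each variety.
URL: https://www.data.go.kr/data/15008321/fileData.do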

Alerts

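High correlation: 연번 (serial number) is highly overall correlated with 공보호수 (gazette issue number)
High correlation: 공보호수 (gazette issue number) is highly overall correlated with 연번 (serial number)
Unique: 연번 (serial number) has unique values
Unique: 출원번호 (application number) has unique values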

Reproduction

Analysis started: 2023-12-12 06:04:33.887505
Analysis finished: 2023-12-12 06:04:35.182747
Duration: 1.3 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
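
The reported duration is the difference between the two timestamps, rounded to one decimal place. A small sketch of that check, using only the timestamps printed in this section (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-12 06:04:33.887505")
finished = datetime.fromisoformat("2023-12-12 06:04:35.182747")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.6f} s, reported as {duration_s:.1f} s")
```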

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 2704
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1352.5
Minimum: 1
Maximum: 2704
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 136.15
Q1: 676.75
Median: 1352.5
Q3: 2028.25
95-th percentile: 2568.85
Maximum: 2704
Range: 2703
Interquartile range (IQR): 1351.5

Descriptive statistics

Standard deviation: 780.72189
Coefficient of variation (CV): 0.57724354
Kurtosis: -1.2
Mean: 1352.5
Median Absolute Deviation (MAD): 676
Skewness: 0
Sum: 3657160
Variance: 609526.67
Monotonicity: Strictly increasing
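
These figures are what one would expect if 연번 is simply the integers 1 through 2704 (2704 distinct values, minimum 1, maximum 2704, strictly increasing). The sketch below recomputes several of them under that assumption. The `percentile` helper is a hypothetical linear-interpolation implementation written for this example, not the report's own code.

```python
import statistics

# Assumption: the column holds exactly the integers 1, 2, ..., 2704.
values = list(range(1, 2705))

def percentile(sorted_values, q):
    """Linear-interpolation percentile for q in [0, 1]."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)
median = statistics.median(values)
# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)
p05 = percentile(values, 0.05)
q1 = percentile(values, 0.25)
q3 = percentile(values, 0.75)

print(total, mean, round(variance, 2), round(std_dev, 5), mad)
print(round(p05, 2), q1, q3, q3 - q1)
```

Comparing the printed values with the quantile and descriptive statistics is a quick way to confirm that the column is a plain row counter with no gaps.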
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1 | 1 | < 0.1%
1808 | 1 | < 0.1%
1800 | 1 | < 0.1%
1801 | 1 | < 0.1%
1802 | 1 | < 0.1%
1803 | 1 | < 0.1%
1804 | 1 | < 0.1%
1805 | 1 | < 0.1%
1806 | 1 | < 0.1%
1807 | 1 | < 0.1%
Other values (2694) | 2694 | 99.6%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
2704 | 1 | < 0.1%
2703 | 1 | < 0.1%
2702 | 1 | < 0.1%
2701 | 1 | < 0.1%
2700 | 1 | < 0.1%
2699 | 1 | < 0.1%
2698 | 1 | < 0.1%
2697 | 1 | < 0.1%
2696 | 1 | < 0.1%
2695 | 1 | < 0.1%

소멸일 (lapse date)
DateTime

Distinct: 1156
Distinct (%): 42.8%
Missing: 0
Missing (%): 0.0%
Memory size: 21.3 KiB
Minimum: 2000-12-19 00:00:00
Maximum: 2022-12-29 00:00:00
Histogram with fixed size bins (bins=50)

출원일 (application date)
DateTime

Distinct: 924
Distinct (%): 34.2%
Missing: 0
Missing (%): 0.0%
Memory size: 21.3 KiB
Minimum: 1998-03-06 00:00:00
Maximum: 2020-05-27 00:00:00
Histogram with fixed size bins (bins=50)

출원번호 (application number)
Text

UNIQUE

Distinct: 2704
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 21.3 KiB

Length

Max length: 14
Median length: 14
Mean length: 14
Min length: 14

Characters and Unicode

Total characters: 37856
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
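
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for one 출원번호 value taken from this report. The helper name `category_counts` is hypothetical, and this is not necessarily how the profiling tool computes its tables.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

# First sample value of the 출원번호 column in this report.
sample = "10-1998-000211"
counts = category_counts(sample)

# "Nd" is Decimal Number and "Pd" is Dash Punctuation.
print(counts)
```

With 12 digits and 2 dashes in every 14-character value, 2704 rows give 32448 decimal numbers and 5408 dashes, which matches the category totals reported for this variable.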

Unique

Unique: 2704
Unique (%): 100.0%

Sample

1st row: 10-1998-000211
2nd row: 10-1998-000076
3rd row: 10-1998-000172
4th row: 10-1998-000181
5th row: 10-1998-000097
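
Every value is 14 characters long and the samples all follow a two-digit, four-digit, six-digit layout separated by dashes. A minimal validation sketch under that assumption follows. The pattern, the function name, and the reading of the middle field as a year are inferred from the samples, not stated in the source.

```python
import re

# Assumed layout, inferred from the sample rows: NN-NNNN-NNNNNN.
APPLICATION_NO = re.compile(r"^([0-9]{2})-([0-9]{4})-([0-9]{6})$")

def parse_application_number(value):
    """Return (prefix, year, sequence), or None if the value does not match."""
    match = APPLICATION_NO.match(value)
    if match is None:
        return None
    prefix, year, sequence = match.groups()
    # Assumption: the middle field is a year and the last a running number.
    return prefix, int(year), int(sequence)

print(parse_application_number("10-1998-000211"))
print(parse_application_number("10-98-211"))
```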

Value | Count | Frequency (%)
10-1998-000211 | 1 | < 0.1%
10-2014-000417 | 1 | < 0.1%
10-2001-000079 | 1 | < 0.1%
10-2006-000377 | 1 | < 0.1%
10-2016-000401 | 1 | < 0.1%
10-2013-000119 | 1 | < 0.1%
10-2010-000059 | 1 | < 0.1%
10-2014-000250 | 1 | < 0.1%
10-2010-000242 | 1 | < 0.1%
10-2010-000240 | 1 | < 0.1%
Other values (2694) | 2694 | 99.6%

Most occurring characters

Value | Count | Frequency (%)
0 | 16291 | 43.0%
- | 5408 | 14.3%
1 | 5028 | 13.3%
2 | 4004 | 10.6%
3 | 1313 | 3.5%
9 | 1208 | 3.2%
4 | 1149 | 3.0%
5 | 1087 | 2.9%
8 | 863 | 2.3%
6 | 764 | 2.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 32448 | 85.7%
Dash Punctuation | 5408 | 14.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 16291 | 50.2%
1 | 5028 | 15.5%
2 | 4004 | 12.3%
3 | 1313 | 4.0%
9 | 1208 | 3.7%
4 | 1149 | 3.5%
5 | 1087 | 3.3%
8 | 863 | 2.7%
6 | 764 | 2.4%
7 | 741 | 2.3%

Dash Punctuation
Value | Count | Frequency (%)
- | 5408 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 37856 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 16291 | 43.0%
- | 5408 | 14.3%
1 | 5028 | 13.3%
2 | 4004 | 10.6%
3 | 1313 | 3.5%
9 | 1208 | 3.2%
4 | 1149 | 3.0%
5 | 1087 | 2.9%
8 | 863 | 2.3%
6 | 764 | 2.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 37856 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 16291 | 43.0%
- | 5408 | 14.3%
1 | 5028 | 13.3%
2 | 4004 | 10.6%
3 | 1313 | 3.5%
9 | 1208 | 3.2%
4 | 1149 | 3.0%
5 | 1087 | 2.9%
8 | 863 | 2.3%
6 | 764 | 2.0%

품종명 (variety name)
Text

Distinct: 2649
Distinct (%): 98.0%
Missing: 0
Missing (%): 0.0%
Memory size: 21.3 KiB

Length

Max length: 15
Median length: 13
Mean length: 4.1715976
Min length: 1

Characters and Unicode

Total characters: 11280
Distinct characters: 650
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2598
Unique (%): 96.1%

Sample

1st row: 횡성옥
2nd row: 동진
3rd row: 탑골
4th row: 두산8호
5th row: 소백

Value | Count | Frequency (%)
케이에스 | 10 | 0.4%
핑크 | 4 | 0.1%
황금 | 3 | 0.1%
챠밍걸 | 3 | 0.1%
새올 | 3 | 0.1%
장수 | 3 | 0.1%
샛별 | 3 | 0.1%
레드스타 | 2 | 0.1%
금향 | 2 | 0.1%
온누리 | 2 | 0.1%
Other values (2648) | 2694 | 98.7%

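Most occurring characters

Value | Count | Frequency (%)
[not preserved] | 514 | 4.6%
[not preserved] | 483 | 4.3%
[not preserved] | 241 | 2.1%
1 | 197 | 1.7%
[not preserved] | 185 | 1.6%
[not preserved] | 176 | 1.6%
[not preserved] | 169 | 1.5%
0 | 163 | 1.4%
[not preserved] | 162 | 1.4%
[not preserved] | 152 | 1.3%
Other values (640) | 8838 | 78.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 10259 | 90.9%
Decimal Number | 863 | 7.7%
Uppercase Letter | 82 | 0.7%
Dash Punctuation | 47 | 0.4%
Space Separator | 26 | 0.2%
Lowercase Letter | 3 | < 0.1%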

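Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[not preserved] | 514 | 5.0%
[not preserved] | 483 | 4.7%
[not preserved] | 241 | 2.3%
[not preserved] | 185 | 1.8%
[not preserved] | 176 | 1.7%
[not preserved] | 169 | 1.6%
[not preserved] | 162 | 1.6%
[not preserved] | 152 | 1.5%
[not preserved] | 138 | 1.3%
[not preserved] | 137 | 1.3%
Other values (613) | 7902 | 77.0%

Uppercase Letter
Value | Count | Frequency (%)
R | 28 | 34.1%
C | 15 | 18.3%
S | 8 | 9.8%
B | 6 | 7.3%
K | 6 | 7.3%
Y | 5 | 6.1%
P | 4 | 4.9%
W | 4 | 4.9%
A | 3 | 3.7%
G | 1 | 1.2%
Other values (2) | 2 | 2.4%

Decimal Number
Value | Count | Frequency (%)
1 | 197 | 22.8%
0 | 163 | 18.9%
2 | 117 | 13.6%
3 | 76 | 8.8%
9 | 76 | 8.8%
5 | 62 | 7.2%
4 | 55 | 6.4%
7 | 41 | 4.8%
6 | 40 | 4.6%
8 | 36 | 4.2%

Lowercase Letter
Value | Count | Frequency (%)
r | 1 | 33.3%
h | 1 | 33.3%
m | 1 | 33.3%

Dash Punctuation
Value | Count | Frequency (%)
- | 47 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 26 | 100.0%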

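Most occurring scripts

Value | Count | Frequency (%)
Hangul | 10259 | 90.9%
Common | 936 | 8.3%
Latin | 85 | 0.8%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[not preserved] | 514 | 5.0%
[not preserved] | 483 | 4.7%
[not preserved] | 241 | 2.3%
[not preserved] | 185 | 1.8%
[not preserved] | 176 | 1.7%
[not preserved] | 169 | 1.6%
[not preserved] | 162 | 1.6%
[not preserved] | 152 | 1.5%
[not preserved] | 138 | 1.3%
[not preserved] | 137 | 1.3%
Other values (613) | 7902 | 77.0%

Latin
Value | Count | Frequency (%)
R | 28 | 32.9%
C | 15 | 17.6%
S | 8 | 9.4%
B | 6 | 7.1%
K | 6 | 7.1%
Y | 5 | 5.9%
P | 4 | 4.7%
W | 4 | 4.7%
A | 3 | 3.5%
G | 1 | 1.2%
Other values (5) | 5 | 5.9%

Common
Value | Count | Frequency (%)
1 | 197 | 21.0%
0 | 163 | 17.4%
2 | 117 | 12.5%
3 | 76 | 8.1%
9 | 76 | 8.1%
5 | 62 | 6.6%
4 | 55 | 5.9%
- | 47 | 5.0%
7 | 41 | 4.4%
6 | 40 | 4.3%
Other values (2) | 62 | 6.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 10259 | 90.9%
ASCII | 1021 | 9.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[not preserved] | 514 | 5.0%
[not preserved] | 483 | 4.7%
[not preserved] | 241 | 2.3%
[not preserved] | 185 | 1.8%
[not preserved] | 176 | 1.7%
[not preserved] | 169 | 1.6%
[not preserved] | 162 | 1.6%
[not preserved] | 152 | 1.5%
[not preserved] | 138 | 1.3%
[not preserved] | 137 | 1.3%
Other values (613) | 7902 | 77.0%

ASCII
Value | Count | Frequency (%)
1 | 197 | 19.3%
0 | 163 | 16.0%
2 | 117 | 11.5%
3 | 76 | 7.4%
9 | 76 | 7.4%
5 | 62 | 6.1%
4 | 55 | 5.4%
- | 47 | 4.6%
7 | 41 | 4.0%
6 | 40 | 3.9%
Other values (17) | 147 | 14.4%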

보호권자 (right holder)
Text

Distinct: 423
Distinct (%): 15.6%
Missing: 0
Missing (%): 0.0%
Memory size: 21.3 KiB

Length

Max length: 35
Median length: 28
Mean length: 10.153107
Min length: 2

Characters and Unicode

Total characters: 27454
Distinct characters: 394
Distinct categories: 12
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 156
Unique (%): 5.8%

Sample

1st row: 농촌진흥청장
2nd row: 농촌진흥청장
3rd row: 농촌진흥청장
4th row: 농촌진흥청장
5th row: 농촌진흥청장

Value | Count | Frequency (%)
농촌진흥청 | 250 | 4.6%
주식회사 | 155 | 2.9%
비.브이 | 144 | 2.7%
농업회사법인 | 138 | 2.6%
로즈 | 136 | 2.5%
농촌진흥청장 | 129 | 2.4%
너스리즈 | 128 | 2.4%
케이세이 | 127 | 2.4%
홀딩 | 119 | 2.2%
아이엔시 | 113 | 2.1%
Other values (580) | 3940 | 73.2%

Most occurring characters

Value | Count | Frequency (%)
(space) | 2703 | 9.8%
[not preserved] | 1314 | 4.8%
[not preserved] | 900 | 3.3%
[not preserved] | 671 | 2.4%
. | 580 | 2.1%
[not preserved] | 568 | 2.1%
( | 475 | 1.7%
[not preserved] | 474 | 1.7%
) | 452 | 1.6%
[not preserved] | 448 | 1.6%
Other values (384) | 18869 | 68.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 19032 | 69.3%
Space Separator | 2703 | 9.8%
Lowercase Letter | 2497 | 9.1%
Uppercase Letter | 1436 | 5.2%
Other Punctuation | 773 | 2.8%
Open Punctuation | 475 | 1.7%
Close Punctuation | 452 | 1.6%
Dash Punctuation | 65 | 0.2%
Other Symbol | 9 | < 0.1%
Decimal Number | 8 | < 0.1%
Other values (2) | 4 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[not preserved] | 1314 | 6.9%
[not preserved] | 900 | 4.7%
[not preserved] | 671 | 3.5%
[not preserved] | 568 | 3.0%
[not preserved] | 474 | 2.5%
[not preserved] | 448 | 2.4%
[not preserved] | 412 | 2.2%
[not preserved] | 408 | 2.1%
[not preserved] | 391 | 2.1%
[not preserved] | 389 | 2.0%
Other values (322) | 13057 | 68.6%

Lowercase Letter
Value | Count | Frequency (%)
e | 409 | 16.4%
n | 258 | 10.3%
i | 253 | 10.1%
r | 204 | 8.2%
a | 196 | 7.8%
s | 165 | 6.6%
u | 118 | 4.7%
l | 109 | 4.4%
o | 98 | 3.9%
d | 83 | 3.3%
Other values (15) | 604 | 24.2%

Uppercase Letter
Value | Count | Frequency (%)
R | 180 | 12.5%
L | 143 | 10.0%
B | 119 | 8.3%
V | 119 | 8.3%
A | 111 | 7.7%
O | 87 | 6.1%
F | 75 | 5.2%
S | 69 | 4.8%
D | 62 | 4.3%
N | 59 | 4.1%
Other values (13) | 412 | 28.7%

Other Punctuation
Value | Count | Frequency (%)
. | 580 | 75.0%
, | 104 | 13.5%
/ | 54 | 7.0%
& | 34 | 4.4%
· | 1 | 0.1%

Decimal Number
Value | Count | Frequency (%)
1 | 4 | 50.0%
2 | 4 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 2703 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 475 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 452 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 65 | 100.0%

Other Symbol
Value | Count | Frequency (%)
[not preserved] | 9 | 100.0%

Math Symbol
Value | Count | Frequency (%)
+ | 2 | 100.0%

Modifier Symbol
Value | Count | Frequency (%)
` | 2 | 100.0%

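Most occurring scripts

Value | Count | Frequency (%)
Hangul | 19041 | 69.4%
Common | 4480 | 16.3%
Latin | 3933 | 14.3%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[not preserved] | 1314 | 6.9%
[not preserved] | 900 | 4.7%
[not preserved] | 671 | 3.5%
[not preserved] | 568 | 3.0%
[not preserved] | 474 | 2.5%
[not preserved] | 448 | 2.4%
[not preserved] | 412 | 2.2%
[not preserved] | 408 | 2.1%
[not preserved] | 391 | 2.1%
[not preserved] | 389 | 2.0%
Other values (323) | 13066 | 68.6%

Latin
Value | Count | Frequency (%)
e | 409 | 10.4%
n | 258 | 6.6%
i | 253 | 6.4%
r | 204 | 5.2%
a | 196 | 5.0%
R | 180 | 4.6%
s | 165 | 4.2%
L | 143 | 3.6%
B | 119 | 3.0%
V | 119 | 3.0%
Other values (38) | 1887 | 48.0%

Common
Value | Count | Frequency (%)
(space) | 2703 | 60.3%
. | 580 | 12.9%
( | 475 | 10.6%
) | 452 | 10.1%
, | 104 | 2.3%
- | 65 | 1.5%
/ | 54 | 1.2%
& | 34 | 0.8%
1 | 4 | 0.1%
2 | 4 | 0.1%
Other values (3) | 5 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 19032 | 69.3%
ASCII | 8412 | 30.6%
None | 10 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 2703 | 32.1%
. | 580 | 6.9%
( | 475 | 5.6%
) | 452 | 5.4%
e | 409 | 4.9%
n | 258 | 3.1%
i | 253 | 3.0%
r | 204 | 2.4%
a | 196 | 2.3%
R | 180 | 2.1%
Other values (50) | 2702 | 32.1%

Hangul
Value | Count | Frequency (%)
[not preserved] | 1314 | 6.9%
[not preserved] | 900 | 4.7%
[not preserved] | 671 | 3.5%
[not preserved] | 568 | 3.0%
[not preserved] | 474 | 2.5%
[not preserved] | 448 | 2.4%
[not preserved] | 412 | 2.2%
[not preserved] | 408 | 2.1%
[not preserved] | 391 | 2.1%
[not preserved] | 389 | 2.0%
Other values (322) | 13057 | 68.6%

None
Value | Count | Frequency (%)
[not preserved] | 9 | 90.0%
· | 1 | 10.0%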

공보호수 (gazette issue number)
Real number (ℝ)

HIGH CORRELATION

Distinct: 144
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 213.3628
Minimum: 80
Maximum: 300
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.9 KiB

Quantile statistics

Minimum: 80
5-th percentile: 114
Q1: 164
Median: 230
Q3: 261
95-th percentile: 291
Maximum: 300
Range: 220
Interquartile range (IQR): 97

Descriptive statistics

Standard deviation: 57.478807
Coefficient of variation (CV): 0.2693947
Kurtosis: -1.108557
Mean: 213.3628
Median Absolute Deviation (MAD): 45
Skewness: -0.35755607
Sum: 576933
Variance: 3303.8132
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
252 | 111 | 4.1%
164 | 91 | 3.4%
261 | 86 | 3.2%
234 | 85 | 3.1%
226 | 81 | 3.0%
161 | 66 | 2.4%
148 | 56 | 2.1%
175 | 56 | 2.1%
114 | 53 | 2.0%
264 | 51 | 1.9%
Other values (134) | 1968 | 72.8%

Minimum 10 values

Value | Count | Frequency (%)
80 | 6 | 0.2%
81 | 4 | 0.1%
83 | 1 | < 0.1%
89 | 2 | 0.1%
90 | 5 | 0.2%
91 | 1 | < 0.1%
97 | 5 | 0.2%
99 | 1 | < 0.1%
101 | 22 | 0.8%
102 | 4 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
300 | 20 | 0.7%
298 | 8 | 0.3%
297 | 12 | 0.4%
296 | 4 | 0.1%
295 | 12 | 0.4%
294 | 19 | 0.7%
293 | 21 | 0.8%
292 | 23 | 0.9%
291 | 21 | 0.8%
290 | 10 | 0.4%

소멸공개결정일 (lapse publication decision date)
DateTime

Distinct: 144
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Memory size: 21.3 KiB
Minimum: 2005-03-15 00:00:00
Maximum: 2023-07-15 00:00:00
Histogram with fixed size bins (bins=50)

Interactions


Correlations

 | 연번 | 공보호수
연번 | 1.000 | 0.974
공보호수 | 0.974 | 1.000

 | 연번 | 공보호수
연번 | 1.000 | 0.997
공보호수 | 0.997 | 1.000
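
To make the idea of a correlation coefficient concrete, the sketch below computes a Pearson coefficient for 연번 and 공보호수 using only the 20 rows shown in the Sample section of this report. This is an illustration under stated assumptions: the extract does not say which coefficient each matrix holds, and 20 rows out of 2704 will not reproduce the reported 0.974 or 0.997. The `pearson` function is a hypothetical helper written for this example.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# 연번 and 공보호수 from the first 10 and last 10 sample rows of the report.
serial = list(range(1, 11)) + list(range(2695, 2705))
gazette = [121] * 10 + [300] * 8 + [296, 300]

r = pearson(serial, gazette)
print(f"Pearson r on the 20 sample rows: {r:.4f}")
```

Because both columns grow together from the oldest to the newest records, even this tiny subset shows a coefficient close to 1, which is in line with the high-correlation alert.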

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

(index) | 연번 | 소멸일 | 출원일 | 출원번호 | 품종명 | 보호권자 | 공보호수 | 소멸공개결정일
0 | 1 | 2000-12-19 | 1998-12-04 | 10-1998-000211 | 횡성옥 | 농촌진흥청장 | 121 | 2008-08-15
1 | 2 | 2001-12-31 | 1998-11-20 | 10-1998-000076 | 동진 | 농촌진흥청장 | 121 | 2008-08-15
2 | 3 | 2001-12-31 | 1998-12-04 | 10-1998-000172 | 탑골 | 농촌진흥청장 | 121 | 2008-08-15
3 | 4 | 2001-12-31 | 1998-12-04 | 10-1998-000181 | 두산8호 | 농촌진흥청장 | 121 | 2008-08-15
4 | 5 | 2002-11-30 | 1998-11-20 | 10-1998-000097 | 소백 | 농촌진흥청장 | 121 | 2008-08-15
5 | 6 | 2002-11-30 | 1998-11-20 | 10-1998-000096 | 오대벼 | 농촌진흥청장 | 121 | 2008-08-15
6 | 7 | 2002-11-30 | 1998-12-04 | 10-1998-000185 | 은파 | 농촌진흥청장 | 121 | 2008-08-15
7 | 8 | 2002-12-10 | 1998-11-20 | 10-1998-000128 | 송학 | 농촌진흥청장 | 121 | 2008-08-15
8 | 9 | 2003-03-20 | 1998-11-10 | 10-1998-000025 | 유명 | 농촌진흥청장 | 121 | 2008-08-15
9 | 10 | 2003-06-28 | 1998-10-31 | 10-1998-000023 | 삼관왕 | 세미니스코리아㈜ | 121 | 2008-08-15

Last rows

(index) | 연번 | 소멸일 | 출원일 | 출원번호 | 품종명 | 보호권자 | 공보호수 | 소멸공개결정일
2694 | 2695 | 2022-12-10 | 2017-08-17 | 10-2017-000422 | 삼국향 | 새만금생명공학센타 | 300 | 2023-07-15
2695 | 2696 | 2022-12-11 | 2014-06-17 | 10-2014-000351 | 귀부인 | 상현영농조합법인 이온종묘 | 300 | 2023-07-15
2696 | 2697 | 2022-12-11 | 2014-06-20 | 10-2014-000357 | 우아미 | 상현영농조합법인 이온종묘 | 300 | 2023-07-15
2697 | 2698 | 2022-12-12 | 2011-10-05 | 10-2011-000459 | 명문플러스 | 장춘종묘사 | 300 | 2023-07-15
2698 | 2699 | 2022-12-18 | 2010-08-30 | 10-2010-000405 | 보레아스 | 그래프 브리딩 에이/에스 | 300 | 2023-07-15
2699 | 2700 | 2022-12-20 | 2016-10-06 | 10-2016-000472 | 헬씨감마 | 농업회사법인 (주)농우바이오 | 300 | 2023-07-15
2700 | 2701 | 2022-12-20 | 2017-10-10 | 10-2017-000497 | 헬씨킹 | 농업회사법인 (주)농우바이오 | 300 | 2023-07-15
2701 | 2702 | 2022-12-20 | 2017-12-06 | 10-2017-000634 | 엠이에프더블유7004 | (주)팜한농 | 300 | 2023-07-15
2702 | 2703 | 2022-12-23 | 2005-11-30 | 10-2005-000483 | 선웨이 | 플로리스트 | 296 | 2023-03-15
2703 | 2704 | 2022-12-29 | 2015-07-06 | 10-2015-000434 | 케이디93 | 농업회사법인 주식회사 코레곤 | 300 | 2023-07-15
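
One way to work with rows like these is to load each record into a typed structure and derive quantities such as the time between application and lapse. The sketch below is illustrative only: the `LapsedVariety` class and its English field names are hypothetical stand-ins for the Korean column names, and the values are copied from the first sample row.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class LapsedVariety:
    """One record of the dataset; field names are illustrative translations."""
    serial: int                       # 연번
    lapse_date: date                  # 소멸일
    application_date: date            # 출원일
    application_no: str               # 출원번호
    variety_name: str                 # 품종명
    right_holder: str                 # 보호권자
    gazette_issue: int                # 공보호수
    publication_decision_date: date   # 소멸공개결정일

    def days_from_application_to_lapse(self) -> int:
        """Number of days between the application date and the lapse date."""
        return (self.lapse_date - self.application_date).days

# First sample row of the report.
first = LapsedVariety(
    serial=1,
    lapse_date=date(2000, 12, 19),
    application_date=date(1998, 12, 4),
    application_no="10-1998-000211",
    variety_name="횡성옥",
    right_holder="농촌진흥청장",
    gazette_issue=121,
    publication_decision_date=date(2008, 8, 15),
)

print(first.days_from_application_to_lapse())
```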