Overview

Dataset statistics

Number of variables: 5
Number of observations: 178
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1
Duplicate rows (%): 0.6%
Total size in memory: 7.6 KiB
Average record size in memory: 43.7 B
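
The percentages and the per-record size above follow from simple ratios. A minimal sketch that recomputes them from the reported figures, assuming 1 KiB = 1024 bytes; because the total size is itself rounded to 7.6 KiB, the results agree only to the displayed precision.

```python
# Figures copied from the Dataset statistics table above.
n_variables = 5
n_observations = 178
missing_cells = 0
duplicate_rows = 1
total_size_kib = 7.6  # rounded in the report, so results are approximate

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_size_kib * 1024 / n_observations  # assumes 1 KiB = 1024 B

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```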

Variable types

Categorical: 1
Text: 1
Numeric: 3

Dataset

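Description: This dataset is processed metadata from the Integrated Resource Management System (자원통합관리시스템). It covers the seed history records registered in the system, including seedling age, tree species, and work type.
URL: https://www.data.go.kr/data/15116322/fileData.do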

Alerts

Dataset has 1 (0.6%) duplicate rows (Duplicates)

Reproduction

Analysis started: 2023-12-12 19:42:53.521233
Analysis finished: 2023-12-12 19:42:54.797044
Duration: 1.28 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
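
The duration is the difference between the two timestamps, and a report of this kind can be regenerated with ydata-profiling. The sketch below is an illustration, not the configuration actually used: `duration_seconds` and `build_report` are hypothetical helpers, the CSV path, output path, and report title are placeholders, and the source file's encoding is not stated in the report.

```python
from datetime import datetime


def duration_seconds(started: str, finished: str) -> float:
    """Elapsed seconds between two timestamps in the report's format."""
    fmt = "%Y-%m-%d %H:%M:%S.%f"
    delta = datetime.strptime(finished, fmt) - datetime.strptime(started, fmt)
    return delta.total_seconds()


def build_report(csv_path: str, out_path: str = "report.html") -> None:
    """Hypothetical helper: profile a CSV export of the dataset."""
    # Imported lazily so the timing helper works without these packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)  # pass encoding=... if the file is not UTF-8
    ProfileReport(df, title="Seed history profile").to_file(out_path)


elapsed = duration_seconds("2023-12-12 19:42:53.521233", "2023-12-12 19:42:54.797044")
print(f"{elapsed:.2f} seconds")
```

Calling `build_report` requires pandas and ydata-profiling to be installed; the timing check runs on the standard library alone.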

Variables

종자공급원 (seed source)
Categorical

Distinct: 2
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB
채종임분: 125
채종림: 53

Length

Max length: 4
Median length: 4
Mean length: 3.7022472
Min length: 3
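
The length statistics follow directly from the two category counts. A minimal sketch, assuming lengths are measured in characters (precomposed Hangul syllables), which is consistent with the reported minimum of 3 and maximum of 4:

```python
from statistics import mean, median

# Category counts from the 종자공급원 table above.
counts = {"채종임분": 125, "채종림": 53}

# One length per observation, measured in characters (not bytes).
lengths = [len(value) for value, n in counts.items() for _ in range(n)]

print("max:", max(lengths))
print("median:", median(lengths))
print("mean:", round(mean(lengths), 7))
print("min:", min(lengths))
```

The mean works out to (125 × 4 + 53 × 3) / 178 = 659 / 178, which rounds to the reported 3.7022472.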

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 채종림
2nd row: 채종임분
3rd row: 채종임분
4th row: 채종임분
5th row: 채종임분

Common Values

Value      Count  Frequency (%)
채종임분   125    70.2%
채종림     53     29.8%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot: share of each category; 채종임분 125 (70.2%), 채종림 53 (29.8%)]

수종 (tree species)
Text

Distinct: 53
Distinct (%): 29.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Length

Max length: 7
Median length: 5
Mean length: 3.9213483
Min length: 2

Characters and Unicode

Total characters: 698
Distinct characters: 88
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
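
These character properties can be inspected with Python's standard `unicodedata` module. A minimal sketch using the sample value 소나무; `describe` is a hypothetical helper. The standard library exposes neither script nor block, so the character name is shown instead as a rough stand-in. The category code `Lo` is the abbreviation for Other Letter.

```python
import unicodedata


def describe(text: str) -> list[tuple[str, str, str]]:
    """Return (character, general category, Unicode name) for each code point."""
    return [(ch, unicodedata.category(ch), unicodedata.name(ch)) for ch in text]


for ch, category, name in describe("소나무"):
    print(ch, category, name)
```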

Unique

Unique: 23
Unique (%): 12.9%

Sample

1st row: 소나무
2nd row: 백합나무
3rd row: 백합나무
4th row: 상수리나무
5th row: 상수리나무

Value               Count  Frequency (%)
낙엽송              19     10.7%
소나무              14     7.9%
상수리나무          9      5.1%
백합나무            9      5.1%
느티나무            8      4.5%
자작나무            8      4.5%
스트로브잣나무      8      4.5%
편백                6      3.4%
굴참나무            6      3.4%
전나무              6      3.4%
Other values (43)   85     47.8%

Most occurring characters

Value               Count  Frequency (%)
                    143    20.5%
                    143    20.5%
                    20     2.9%
                    19     2.7%
                    19     2.7%
                    18     2.6%
                    14     2.0%
                    14     2.0%
                    13     1.9%
                    11     1.6%
Other values (78)   284    40.7%

Most occurring categories

Value          Count  Frequency (%)
Other Letter   698    100.0%

Most frequent character per category

Other Letter
Same values as the table under Most occurring characters, since all 698 characters are in the Other Letter category.

Most occurring scripts

Value    Count  Frequency (%)
Hangul   698    100.0%

Most frequent character per script

Hangul
Same values as the table under Most occurring characters, since all 698 characters are in the Hangul script.

Most occurring blocks

Value    Count  Frequency (%)
Hangul   698    100.0%

Most frequent character per block

Hangul
Same values as the table under Most occurring characters, since all 698 characters are in the Hangul block.

수령 (tree age)
Real number (ℝ)

Distinct: 53
Distinct (%): 29.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 35.870787
Minimum: 8
Maximum: 200
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 8
5-th percentile: 15
Q1: 27
Median: 35
Q3: 41.75
95-th percentile: 59.15
Maximum: 200
Range: 192
Interquartile range (IQR): 14.75

Descriptive statistics

Standard deviation: 18.688349
Coefficient of variation (CV): 0.52099078
Kurtosis: 33.81254
Mean: 35.870787
Median Absolute Deviation (MAD): 8
Skewness: 4.2688283
Sum: 6385
Variance: 349.2544
Monotonicity: Not monotonic
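
Several of the statistics above are derived from the others. A minimal sketch that recomputes them from the reported figures; the inputs are rounded as displayed, so the results match only to that precision.

```python
# Figures copied from the 수령 tables above.
n = 178
total = 6385
minimum, maximum = 8, 200
q1, q3 = 27, 41.75
std = 18.688349

mean = total / n               # Sum / number of observations
value_range = maximum - minimum
iqr = q3 - q1                  # interquartile range
cv = std / mean                # coefficient of variation
variance = std ** 2

print(f"mean={mean:.6f} range={value_range} IQR={iqr}")
print(f"CV={cv:.5f} variance={variance:.4f}")
```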
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
40                  10     5.6%
35                  10     5.6%
30                  9      5.1%
38                  8      4.5%
27                  7      3.9%
36                  7      3.9%
31                  7      3.9%
28                  7      3.9%
50                  5      2.8%
16                  5      2.8%
Other values (43)   103    57.9%

Minimum 10 values

Value  Count  Frequency (%)
8      2      1.1%
9      1      0.6%
10     2      1.1%
11     1      0.6%
13     1      0.6%
15     3      1.7%
16     5      2.8%
17     2      1.1%
18     1      0.6%
19     4      2.2%

Maximum 10 values

Value  Count  Frequency (%)
200    1      0.6%
106    1      0.6%
84     1      0.6%
80     1      0.6%
66     1      0.6%
65     1      0.6%
60     3      1.7%
59     1      0.6%
57     1      0.6%
56     2      1.1%

조성연도 (year of establishment)
Real number (ℝ)

Distinct: 19
Distinct (%): 10.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2010.2416
Minimum: 1986
Maximum: 2016
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 1986
5-th percentile: 2003
Q1: 2007
Median: 2009.5
Q3: 2015
95-th percentile: 2016
Maximum: 2016
Range: 30
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 5.6601553
Coefficient of variation (CV): 0.0028156593
Kurtosis: 3.9845472
Mean: 2010.2416
Median Absolute Deviation (MAD): 3.5
Skewness: -1.6136644
Sum: 357823
Variance: 32.037358
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=19)

Common values

Value              Count  Frequency (%)
2015               51     28.7%
2009               24     13.5%
2006               23     12.9%
2016               19     10.7%
2008               14     7.9%
2007               13     7.3%
2012               9      5.1%
2013               5      2.8%
2003               4      2.2%
2014               3      1.7%
Other values (9)   13     7.3%

Minimum 10 values

Value  Count  Frequency (%)
1986   2      1.1%
1992   3      1.7%
1994   1      0.6%
1996   1      0.6%
1997   1      0.6%
2003   4      2.2%
2004   1      0.6%
2005   2      1.1%
2006   23     12.9%
2007   13     7.3%

Maximum 10 values

Value  Count  Frequency (%)
2016   19     10.7%
2015   51     28.7%
2014   3      1.7%
2013   5      2.8%
2012   9      5.1%
2011   1      0.6%
2010   1      0.6%
2009   24     13.5%
2008   14     7.9%
2007   13     7.3%
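
Because 조성연도 has only 19 distinct values, the value tables above cover the whole column, so its summary statistics can be rebuilt from the counts alone. A minimal sketch, assuming the reported variance is the sample variance (divisor n - 1), which is what `statistics.variance` computes:

```python
from statistics import mean, median, variance

# Year -> count, merged from the value tables above.
year_counts = {
    1986: 2, 1992: 3, 1994: 1, 1996: 1, 1997: 1, 2003: 4, 2004: 1,
    2005: 2, 2006: 23, 2007: 13, 2008: 14, 2009: 24, 2010: 1, 2011: 1,
    2012: 9, 2013: 5, 2014: 3, 2015: 51, 2016: 19,
}

# Expand the counts back into one value per observation.
years = [year for year, n in sorted(year_counts.items()) for _ in range(n)]

# "Other values" is whatever remains after the ten largest counts.
top10 = sorted(year_counts.values(), reverse=True)[:10]
other_distinct = len(year_counts) - len(top10)
other_total = len(years) - sum(top10)

print("observations:", len(years), "sum:", sum(years))
print("mean:", round(mean(years), 4), "median:", median(years))
print("variance:", round(variance(years), 6))
print("other values:", other_distinct, "distinct,", other_total, "rows")
```

Two years are tied at a count of 3 for the tenth place, but the tie does not change the size of the remainder.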

본수 (number of trees)
Real number (ℝ)

Distinct: 117
Distinct (%): 65.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1638.5618
Minimum: 2
Maximum: 22000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.7 KiB

Quantile statistics

Minimum: 2
5-th percentile: 30.4
Q1: 250
Median: 800
Q3: 1750
95-th percentile: 6018
Maximum: 22000
Range: 21998
Interquartile range (IQR): 1500

Descriptive statistics

Standard deviation: 2542.9404
Coefficient of variation (CV): 1.5519344
Kurtosis: 24.690285
Mean: 1638.5618
Median Absolute Deviation (MAD): 695
Skewness: 4.0447581
Sum: 291664
Variance: 6466546
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
500                  7      3.9%
1000                 6      3.4%
1500                 6      3.4%
600                  5      2.8%
300                  5      2.8%
900                  5      2.8%
3000                 5      2.8%
50                   4      2.2%
1200                 4      2.2%
840                  3      1.7%
Other values (107)   128    71.9%

Minimum 10 values

Value  Count  Frequency (%)
2      1      0.6%
10     1      0.6%
17     2      1.1%
18     1      0.6%
20     2      1.1%
21     1      0.6%
27     1      0.6%
31     1      0.6%
32     1      0.6%
40     2      1.1%

Maximum 10 values

Value  Count  Frequency (%)
22000  1      0.6%
12500  1      0.6%
9225   1      0.6%
8262   1      0.6%
8000   1      0.6%
7900   1      0.6%
7250   1      0.6%
7000   1      0.6%
6120   1      0.6%
6000   1      0.6%
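
The non-integer 5th percentile (30.4) is consistent with linear interpolation between neighbouring sorted values, and the smallest and largest values listed above are enough to check both tail percentiles. A minimal sketch under that assumption; `percentile` is a hypothetical helper that only works for ranks covered by the two tables.

```python
import math

n = 178  # number of observations

# (value, count) pairs for the smallest and largest values listed above.
lowest = [(2, 1), (10, 1), (17, 2), (18, 1), (20, 2),
          (21, 1), (27, 1), (31, 1), (32, 1), (40, 2)]
highest = [(22000, 1), (12500, 1), (9225, 1), (8262, 1), (8000, 1),
           (7900, 1), (7250, 1), (7000, 1), (6120, 1), (6000, 1)]

# Map 0-based rank in the sorted column -> value, for the recoverable ranks.
low_values = [v for v, c in lowest for _ in range(c)]
high_values = sorted(v for v, c in highest for _ in range(c))
known = dict(enumerate(low_values))
known.update({n - len(high_values) + i: v for i, v in enumerate(high_values)})


def percentile(q: float) -> float:
    """Linear interpolation between the two ranks surrounding (n - 1) * q."""
    pos = (n - 1) * q
    lower = math.floor(pos)
    upper = min(lower + 1, n - 1)
    frac = pos - lower
    return known[lower] + frac * (known[upper] - known[lower])


print(round(percentile(0.05), 2), round(percentile(0.95), 2))
```

For the 5th percentile the position is 177 × 0.05 = 8.85, which falls between the values 27 and 31 and gives 27 + 0.85 × 4 = 30.4.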

Interactions

[Figures: 9 pairwise interaction plots]

Correlations

             종자공급원  수종    수령    조성연도  본수
종자공급원   1.000       0.493   0.000   0.583     0.139
수종         0.493       1.000   0.270   0.229     0.000
수령         0.000       0.270   1.000   0.612     0.000
조성연도     0.583       0.229   0.612   1.000     0.000
본수         0.139       0.000   0.000   0.000     1.000

             수령     조성연도  본수     종자공급원
수령         1.000    -0.268    -0.164   0.000
조성연도     -0.268   1.000     0.282    0.431
본수         -0.164   0.282     1.000    0.146
종자공급원   0.000    0.431     0.146    1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

     종자공급원  수종          수령  조성연도  본수
0    채종림      소나무        22    1994      485
1    채종임분    백합나무      35    2008      216
2    채종임분    백합나무      45    2008      20
3    채종임분    상수리나무    44    2009      50
4    채종임분    상수리나무    36    2009      180
5    채종임분    자작나무      23    2009      500
6    채종임분    물푸레나무    25    2009      1000
7    채종임분    백합나무      48    2011      95
8    채종림      고로쇠나무    35    2006      1600
9    채종림      음나무        40    2006      1200

Last rows

     종자공급원  수종          수령  조성연도  본수
168  채종임분    층층나무      29    2007      625
169  채종임분    편백          35    2007      1200
170  채종임분    편백          36    2007      500
171  채종임분    편백          35    2012      840
172  채종임분    편백          24    2015      8000
173  채종임분    독일가문비    37    2015      280
174  채종임분    황칠나무      13    2015      1000
175  채종임분    후박나무      17    2015      675
176  채종임분    황칠나무      11    2015      63
177  채종임분    편백          33    2015      1200

Duplicate rows

Most frequently occurring

     종자공급원  수종       수령  조성연도  본수  # duplicates
0    채종임분    참죽나무   38    2013      17    2
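
A row is flagged as a duplicate when every column value matches another row. A minimal sketch of that check on a handful of rows: the first two repeat the duplicated row listed above and the other two come from the Sample section, so this is an illustration on a small subset rather than the full dataset.

```python
from collections import Counter

# Rows in (종자공급원, 수종, 수령, 조성연도, 본수) order.
rows = [
    ("채종임분", "참죽나무", 38, 2013, 17),
    ("채종임분", "참죽나무", 38, 2013, 17),
    ("채종림", "소나무", 22, 1994, 485),
    ("채종임분", "백합나무", 35, 2008, 216),
]

# A row counts as duplicated when the full tuple of values occurs more than once.
duplicates = {row: n for row, n in Counter(rows).items() if n > 1}

for row, n in duplicates.items():
    print(row, n)
```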