Overview

Dataset statistics

Number of variables: 6
Number of observations: 5034
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 761
Duplicate rows (%): 15.1%
Total size in memory: 250.8 KiB
Average record size in memory: 51.0 B
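
The two derived figures above can be checked against the raw counts. This is a minimal sketch of that arithmetic, assuming the duplicate percentage is duplicate rows divided by observations and the average record size is total memory divided by observations (with 1 KiB = 1024 B); the variable names are illustrative only.

```python
# Recompute the derived overview figures from the raw counts in the report.
n_observations = 5034
n_duplicates = 761
total_memory_kib = 250.8

# Share of rows flagged as duplicates, in percent.
duplicate_pct = 100 * n_duplicates / n_observations

# Average bytes per record, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Both values round to the figures shown in the report (15.1% and 51.0 B).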

Variable types

Numeric: 3
Text: 2
DateTime: 1

Dataset

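Description: Provides detailed data on trees killed by oak wilt disease (참나무시들음병), a general forest pest and disease category. Fields: 지역X좌표 (region X coordinate), 지역Y좌표 (region Y coordinate), 국가지점번호 (national point number), 법정동코드 (legal-dong code), PNU코드 (PNU code), 조사일자 (survey date)
Author: Korea Forest Service (산림청)
URL: https://www.data.go.kr/data/15120578/fileData.do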

Alerts

Dataset has 761 (15.1%) duplicate rows (Duplicates)
지역X좌표 is highly overall correlated with PNU코드 (High correlation)
PNU코드 is highly overall correlated with 지역X좌표 (High correlation)

Reproduction

Analysis started: 2023-12-12 20:31:59.529238
Analysis finished: 2023-12-12 20:32:01.138774
Duration: 1.61 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
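
A report like this one could be regenerated roughly as sketched below. This is not the configuration that produced the report (that lives in config.json, which is not reproduced here). The function name `build_report`, the CSV path, the `cp949` encoding, the report title, and the use of the `author` key in the dataset metadata are all assumptions made for illustration.

```python
def build_report(csv_path: str, output_path: str = "report.html") -> None:
    """Profile the CSV at csv_path and write an HTML report to output_path."""
    # Third-party imports stay inside the function so that defining it does
    # not require pandas or ydata-profiling to be installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the source file is CP949-encoded, which is common for
    # Korean public-data CSV files; adjust if the file is UTF-8.
    df = pd.read_csv(csv_path, encoding="cp949", parse_dates=["조사일자"])

    report = ProfileReport(
        df,
        title="참나무시들음병 고사목 상세정보",  # hypothetical title
        dataset={
            "description": "일반병해충 참나무시들음병 고사목 상세정보",
            "author": "산림청",
            "url": "https://www.data.go.kr/data/15120578/fileData.do",
        },
    )
    report.to_file(output_path)
```

Calling `build_report("oak_wilt.csv")` with a hypothetical local copy of the file would write `report.html` to the working directory.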

Variables

지역X좌표
Real number (ℝ)

HIGH CORRELATION 

Distinct: 2718
Distinct (%): 54.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 214738.59
Minimum: 179302
Maximum: 2015722
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.4 KiB

Quantile statistics

Minimum: 179302
5-th percentile: 179816.7
Q1: 193730
Median: 205793.5
Q3: 223623.75
95-th percentile: 281184.4
Maximum: 2015722
Range: 1836420
Interquartile range (IQR): 29893.75
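
The quantiles above make it easy to see how far the maximum sits from the bulk of the data. The sketch below applies the conventional 1.5 × IQR (Tukey) fences to the reported quartiles. The report itself does not use this rule, so it is offered only as one possible way to flag suspect coordinates.

```python
# Quartiles and extremes of 지역X좌표 as reported above.
q1, q3 = 193730.0, 223623.75
minimum, maximum = 179302.0, 2015722.0

iqr = q3 - q1

# Conventional Tukey fences: values outside these bounds are candidate outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(f"IQR: {iqr}")
print(f"Fences: [{lower_fence}, {upper_fence}]")
print(f"Minimum inside fences: {minimum >= lower_fence}")
print(f"Maximum inside fences: {maximum <= upper_fence}")
print(f"Maximum lies {(maximum - q3) / iqr:.1f} IQRs above Q3")
```

Under this rule the minimum falls inside the fences while the maximum falls far outside them, which is consistent with the very large skewness and kurtosis reported for this variable.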

Descriptive statistics

Standard deviation: 42472.054
Coefficient of variation (CV): 0.19778492
Kurtosis: 644.69716
Mean: 214738.59
Median Absolute Deviation (MAD): 16353
Skewness: 16.398076
Sum: 1.0809941 × 10^9
Variance: 1.8038753 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                Count  Frequency (%)
193753               12     0.2%
235109               10     0.2%
219538               10     0.2%
220125               10     0.2%
235479               10     0.2%
235458               9      0.2%
219534               8      0.2%
193737               8      0.2%
235478               8      0.2%
235456               8      0.2%
Other values (2708)  4941   98.2%

Minimum 10 values

Value   Count  Frequency (%)
179302  1      < 0.1%
179304  3      0.1%
179306  2      < 0.1%
179309  2      < 0.1%
179310  1      < 0.1%
179311  2      < 0.1%
179312  1      < 0.1%
179313  4      0.1%
179315  4      0.1%
179316  1      < 0.1%

Maximum 10 values

Value    Count  Frequency (%)
2015722  1      < 0.1%
404781   1      < 0.1%
404753   1      < 0.1%
401416   1      < 0.1%
401126   1      < 0.1%
401099   1      < 0.1%
400983   1      < 0.1%
400963   1      < 0.1%
400945   1      < 0.1%
400939   1      < 0.1%

지역Y좌표
Real number (ℝ)

Distinct: 3074
Distinct (%): 61.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 539527.65
Minimum: 258801
Maximum: 627144
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.4 KiB

Quantile statistics

Minimum: 258801
5-th percentile: 507028
Q1: 521113.5
Median: 540581
Q3: 565535
95-th percentile: 578138.35
Maximum: 627144
Range: 368343
Interquartile range (IQR): 44421.5

Descriptive statistics

Standard deviation: 37928.863
Coefficient of variation (CV): 0.070300129
Kurtosis: 20.950014
Mean: 539527.65
Median Absolute Deviation (MAD): 19601.5
Skewness: -3.5706326
Sum: 2.7159822 × 10^9
Variance: 1.4385987 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                Count  Frequency (%)
565532               12     0.2%
565737               12     0.2%
565615               12     0.2%
565717               10     0.2%
551278               10     0.2%
565647               10     0.2%
565848               10     0.2%
551281               10     0.2%
565713               8      0.2%
541078               8      0.2%
Other values (3064)  4932   98.0%

Minimum 10 values

Value   Count  Frequency (%)
258801  1      < 0.1%
286306  1      < 0.1%
286307  1      < 0.1%
286308  1      < 0.1%
286309  1      < 0.1%
286310  1      < 0.1%
286311  1      < 0.1%
286312  1      < 0.1%
286313  1      < 0.1%
286314  1      < 0.1%

Maximum 10 values

Value   Count  Frequency (%)
627144  1      < 0.1%
627140  1      < 0.1%
627079  1      < 0.1%
627043  1      < 0.1%
626967  1      < 0.1%
626966  1      < 0.1%
626953  1      < 0.1%
626813  1      < 0.1%
626165  1      < 0.1%
626163  1      < 0.1%

국가지점번호
Text

Distinct: 2707
Distinct (%): 53.8%
Missing: 0
Missing (%): 0.0%
Memory size: 39.5 KiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 50340
Distinct characters: 15
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1470
Unique (%): 29.2%

Sample

1st row: 라사34058610
2nd row: 라사34058610
3rd row: 라사34058610
4th row: 라사34058610
5th row: 라사31588267

Common values

Value                Count  Frequency (%)
다라33948700         21     0.4%
다사75486555         20     0.4%
다사76056572         18     0.4%
다사76046576         14     0.3%
다사75446574         14     0.3%
다사90925143         13     0.3%
다사75366556         12     0.2%
다사90975132         11     0.2%
다사90915142         10     0.2%
다사90935136         10     0.2%
Other values (2697)  4891   97.2%
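
The length and character statistics above (every value is 10 characters long, with Hangul letters making up exactly 20% of all characters) are consistent with each 국가지점번호 consisting of two Hangul letters followed by eight digits. The sketch below validates and splits a value on that pattern. Treating the eight digits as two four-digit halves is an assumption based on the usual layout of such grid references, and `PointNumber` and `parse_point_number` are hypothetical names.

```python
import re
from typing import NamedTuple


class PointNumber(NamedTuple):
    grid: str    # two Hangul grid letters, e.g. "라사"
    first4: str  # first four digits (assumed to be the easting part)
    last4: str   # last four digits (assumed to be the northing part)


# Two Hangul syllables followed by exactly eight ASCII digits.
_POINT_PATTERN = re.compile(r"([가-힣]{2})([0-9]{4})([0-9]{4})")


def parse_point_number(value: str) -> PointNumber:
    """Split a 국가지점번호 string into grid letters and two digit groups."""
    match = _POINT_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"not a valid point number: {value!r}")
    return PointNumber(*match.groups())


print(parse_point_number("라사34058610"))
```

Values that do not match the pattern raise `ValueError`, which makes malformed entries easy to isolate before any further processing.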

Most occurring characters

Value             Count  Frequency (%)
5                 5346   10.6%
4                 5137   10.2%
[character lost]  4886   9.7%
[character lost]  4681   9.3%
1                 4450   8.8%
6                 4230   8.4%
9                 3965   7.9%
7                 3956   7.9%
3                 3801   7.6%
2                 3267   6.5%
Other values (5)  6621   13.2%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  40272  80.0%
Other Letter    10068  20.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
5      5346   13.3%
4      5137   12.8%
1      4450   11.0%
6      4230   10.5%
9      3965   9.8%
7      3956   9.8%
3      3801   9.4%
2      3267   8.1%
0      3240   8.0%
8      2880   7.2%

Other Letter
Value             Count  Frequency (%)
[character lost]  4886   48.5%
[character lost]  4681   46.5%
[character lost]  337    3.3%
[character lost]  108    1.1%
[character lost]  56     0.6%

Most occurring scripts

Value   Count  Frequency (%)
Common  40272  80.0%
Hangul  10068  20.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
5      5346   13.3%
4      5137   12.8%
1      4450   11.0%
6      4230   10.5%
9      3965   9.8%
7      3956   9.8%
3      3801   9.4%
2      3267   8.1%
0      3240   8.0%
8      2880   7.2%

Hangul
Value             Count  Frequency (%)
[character lost]  4886   48.5%
[character lost]  4681   46.5%
[character lost]  337    3.3%
[character lost]  108    1.1%
[character lost]  56     0.6%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   40272  80.0%
Hangul  10068  20.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
5      5346   13.3%
4      5137   12.8%
1      4450   11.0%
6      4230   10.5%
9      3965   9.8%
7      3956   9.8%
3      3801   9.4%
2      3267   8.1%
0      3240   8.0%
8      2880   7.2%

Hangul
Value             Count  Frequency (%)
[character lost]  4886   48.5%
[character lost]  4681   46.5%
[character lost]  337    3.3%
[character lost]  108    1.1%
[character lost]  56     0.6%

법정동코드
Text

Distinct: 132
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 39.5 KiB

Length

Max length: 10
Median length: 10
Mean length: 9.9992054
Min length: 6

Characters and Unicode

Total characters: 50336
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 21
Unique (%): 0.4%

Sample

1st row: 5111031027
2nd row: 5111031027
3rd row: 5111031027
4th row: 5111031027
5th row: 5111031030

Common values

Value               Count  Frequency (%)
4183033024          727    14.4%
2820010400          480    9.5%
4136026223          412    8.2%
4163010800          351    7.0%
4121010400          327    6.5%
4111314100          307    6.1%
4111313400          280    5.6%
4121010600          270    5.4%
4146125627          226    4.5%
4111113900          150    3.0%
Other values (122)  1504   29.9%
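
The length statistics above report a minimum length of 6, and the total of 50336 characters is four short of 5034 × 10, so at least one 법정동코드 value is not a ten-digit code. The character tables for this variable also list four lowercase letters (l twice, n, u). A minimal sketch for isolating such values is shown below. The function name `find_malformed_codes` and the example value `"41null"` are hypothetical, since the report does not show the offending value itself.

```python
import re

# A well-formed 법정동코드 is assumed to be exactly ten ASCII digits.
_TEN_DIGITS = re.compile(r"[0-9]{10}")


def find_malformed_codes(codes):
    """Return the values that are not exactly ten ASCII digits."""
    return [code for code in codes if _TEN_DIGITS.fullmatch(code) is None]


# The first two values come from the report; "41null" is a made-up example
# of a six-character value containing letters.
sample = ["5111031027", "4183033024", "41null"]
print(find_malformed_codes(sample))
```

Running the same check over the full column would show exactly which rows need correcting before the code is used as a join key.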

Most occurring characters

Value             Count  Frequency (%)
0                 13825  27.5%
1                 11904  23.6%
4                 6881   13.7%
3                 5755   11.4%
2                 5226   10.4%
6                 2558   5.1%
8                 1819   3.6%
5                 1226   2.4%
7                 611    1.2%
9                 527    1.0%
Other values (3)  4      < 0.1%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    50332  > 99.9%
Lowercase Letter  4      < 0.1%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      13825  27.5%
1      11904  23.7%
4      6881   13.7%
3      5755   11.4%
2      5226   10.4%
6      2558   5.1%
8      1819   3.6%
5      1226   2.4%
7      611    1.2%
9      527    1.0%

Lowercase Letter
Value  Count  Frequency (%)
l      2      50.0%
n      1      25.0%
u      1      25.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  50332  > 99.9%
Latin   4      < 0.1%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      13825  27.5%
1      11904  23.7%
4      6881   13.7%
3      5755   11.4%
2      5226   10.4%
6      2558   5.1%
8      1819   3.6%
5      1226   2.4%
7      611    1.2%
9      527    1.0%

Latin
Value  Count  Frequency (%)
l      2      50.0%
n      1      25.0%
u      1      25.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  50336  100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
0                 13825  27.5%
1                 11904  23.6%
4                 6881   13.7%
3                 5755   11.4%
2                 5226   10.4%
6                 2558   5.1%
8                 1819   3.6%
5                 1226   2.4%
7                 611    1.2%
9                 527    1.0%
Other values (3)  4      < 0.1%

PNU코드
Real number (ℝ)

HIGH CORRELATION 

Distinct: 392
Distinct (%): 7.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.0587638 × 10^18
Minimum: 2.6290106 × 10^18
Maximum: 5.181035 × 10^18
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 44.4 KiB

Quantile statistics

Minimum: 2.6290106 × 10^18
5-th percentile: 2.8200104 × 10^18
Q1: 4.1113141 × 10^18
Median: 4.1360262 × 10^18
Q3: 4.1630108 × 10^18
95-th percentile: 5.1401802 × 10^18
Maximum: 5.181035 × 10^18
Range: 2.5520244 × 10^18
Interquartile range (IQR): 5.16967 × 10^16

Descriptive statistics

Standard deviation: 4.8878039 × 10^17
Coefficient of variation (CV): 0.12042593
Kurtosis: 2.8565961
Mean: 4.0587638 × 10^18
Median Absolute Deviation (MAD): 2.4712823 × 10^16
Skewness: -1.0627132
Sum: -7.1753251 × 10^18
Variance: 2.3890627 × 10^35
Monotonicity: Not monotonic
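
The reported Sum is negative even though the minimum is positive and the column contains no negative values. One plausible explanation, which the report does not state, is that the 19-digit codes were summed in a signed 64-bit integer and wrapped around. The sketch below illustrates the effect using the three most frequent PNU코드 values and their counts from the report. `wrap_int64` is a hypothetical helper that mimics two's-complement wraparound, and the result is not expected to match the reported Sum because only part of the data is used.

```python
INT64_MAX = 2**63 - 1


def wrap_int64(value: int) -> int:
    """Reduce a Python int to the value a signed 64-bit integer would hold."""
    return (value + 2**63) % 2**64 - 2**63


# Three most frequent PNU코드 values and their counts, taken from the report.
top_values = {
    4183033024200510000: 724,
    4136026223200500000: 409,
    2820010400200010001: 234,
}

# Python integers are arbitrary precision, so this sum is exact.
exact_sum = sum(value * count for value, count in top_values.items())

# What the same sum would look like after signed 64-bit wraparound.
wrapped_sum = wrap_int64(exact_sum)

print(f"Exact sum exceeds int64 range: {exact_sum > INT64_MAX}")
print(f"Exact sum:   {exact_sum}")
print(f"Wrapped sum: {wrapped_sum}")
```

Because PNU코드 is an identifier rather than a quantity, statistics such as Sum, Mean, and Variance carry little meaning for it; reading the column as text avoids the issue altogether.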
Histogram with fixed size bins (bins=50)
Common values

Value                Count  Frequency (%)
4183033024200510000  724    14.4%
4136026223200500000  409    8.1%
2820010400200010001  234    4.6%
4111313400200230000  192    3.8%
2820010400200040002  179    3.6%
4111314100200530000  154    3.1%
4146125627200740008  146    2.9%
4121010600200310006  104    2.1%
4163010800200480000  103    2.0%
4159013200200800000  100    2.0%
Other values (382)   2689   53.4%

Minimum 10 values

Value                Count  Frequency (%)
2629010600200530001  5      0.1%
2629010600200530019  2      < 0.1%
2629010600200530032  4      0.1%
2632010300105780000  1      < 0.1%
2632010400201290009  1      < 0.1%
2632010500200120000  2      < 0.1%
2820010400100010000  1      < 0.1%
2820010400100760000  2      < 0.1%
2820010400200010001  234    4.6%
2820010400200010002  4      0.1%

Maximum 10 values

Value                Count  Frequency (%)
5181035023200010000  2      < 0.1%
5181035022201560001  2      < 0.1%
5181035021203050017  1      < 0.1%
5181035021203050000  2      < 0.1%
5181033025200710125  1      < 0.1%
5181033024202040000  4      0.1%
5181033023203190001  8      0.2%
5181032024202620012  6      0.1%
5181032024202590000  1      < 0.1%
5181032024200120062  1      < 0.1%

조사일자
DateTime

Distinct: 90
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 39.5 KiB
Minimum: 2017-05-22 00:00:00
Maximum: 2021-09-02 00:00:00
Histogram with fixed size bins (bins=50)

Interactions

[Figures: nine pairwise interaction plots of the numeric variables]

Correlations

           지역X좌표  지역Y좌표  PNU코드  조사일자
지역X좌표  1.000      0.688      0.731    0.912
지역Y좌표  0.688      1.000      0.887    0.983
PNU코드    0.731      0.887      1.000    0.984
조사일자   0.912      0.983      0.984    1.000

           지역X좌표  지역Y좌표  PNU코드
지역X좌표  1.000      0.261      0.803
지역Y좌표  0.261      1.000      0.463
PNU코드    0.803      0.463      1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

      지역X좌표  지역Y좌표  국가지점번호  법정동코드  PNU코드              조사일자
0     278061     586400     라사34058610  5111031027  5111031027200010021  2017-05-22
1     278061     586400     라사34058610  5111031027  5111031027200010021  2017-05-22
2     278061     586400     라사34058610  5111031027  5111031027200010021  2017-05-22
3     278061     586400     라사34058610  5111031027  5111031027200010021  2017-05-22
4     275605     582957     라사31588267  5111031030  5111031030200530000  2017-05-22
5     254513     568326     라사10426816  5111034027  5111034027200040000  2017-05-22
6     254513     568326     라사10426816  5111034027  5111034027200040000  2017-05-22
7     254513     568326     라사10426816  5111034027  5111034027200040000  2017-05-22
8     254513     568326     라사10426816  5111034027  5111034027200040000  2017-05-22
9     254513     568326     라사10426816  5111034027  5111034027200040000  2017-05-22

Last rows

      지역X좌표  지역Y좌표  국가지점번호  법정동코드  PNU코드              조사일자
5024  333935     535965     라사89633539  5177025028  5177025028200020001  2021-08-30
5025  334045     535748     라사89743518  5177025028  5177025028200020001  2021-08-31
5026  391682     517126     마사47251626  5123033036  5123033036201040000  2021-09-01
5027  391682     517126     마사47251626  5123033036  5123033036201040000  2021-09-01
5028  338177     535401     라사93863481  5177025028  5177025028200020001  2021-09-01
5029  336980     535164     라사92673458  5177025028  5177025028200020001  2021-09-01
5030  337745     535032     라사93433444  5177025029  5177025029204000000  2021-09-01
5031  337743     535041     라사93433445  5177025029  5177025029204000000  2021-09-01
5032  338642     536254     라사94333566  5177025028  5177025028200020001  2021-09-02
5033  342176     538215     라사97883760  5177025029  5177025029200010000  2021-09-02
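
In the sample rows, the first ten digits of each PNU코드 equal the 법정동코드 of the same row, so the two columns are largely redundant. The sketch below splits a 19-digit PNU string into parts. The layout used here (10-digit 법정동코드, 1-digit land-type flag, 4-digit main lot number, 4-digit sub lot number) is an assumption based on the common PNU convention and is not stated in the report; `Pnu` and `split_pnu` are hypothetical names.

```python
from typing import NamedTuple


class Pnu(NamedTuple):
    legal_dong_code: str  # first 10 digits (matches 법정동코드 in the sample)
    land_type: str        # 1 digit (assumed land-type flag)
    main_number: str      # 4 digits (assumed main lot number)
    sub_number: str       # 4 digits (assumed sub lot number)


def split_pnu(pnu: str) -> Pnu:
    """Split a 19-digit PNU string into its assumed components."""
    if len(pnu) != 19 or not (pnu.isascii() and pnu.isdigit()):
        raise ValueError(f"expected 19 ASCII digits, got {pnu!r}")
    return Pnu(pnu[:10], pnu[10], pnu[11:15], pnu[15:])


# First sample row of the report: 법정동코드 5111031027.
parts = split_pnu("5111031027200010021")
print(parts)
print(parts.legal_dong_code == "5111031027")
```

Keeping the code as a string preserves all 19 digits and makes this kind of slicing straightforward.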

Duplicate rows

Most frequently occurring

     지역X좌표  지역Y좌표  국가지점번호  법정동코드  PNU코드              조사일자    # duplicates
26   179463     286665     다라33948700  2920010600  2920010600200670021  2018-07-13  7
27   179464     286666     다라33948700  2920010600  2920010600200670021  2018-07-13  7
28   179465     286667     다라33948700  2920010600  2920010600200670021  2018-07-13  7
29   179466     286668     다라33948701  2920010600  2920010600200670021  2018-07-13  7
717  254513     568326     라사10426816  5111034027  5111034027200040000  2017-05-22  5
268  211870     512149     다사67491223  4159013200  4159013200200800000  2018-07-17  4
285  212003     528807     다사67712888  4113510200  4113510200200660000  2018-07-30  4
510  220144     565700     다사76056572  4136026222  4136026222201160000  2018-09-13  4
626  222694     506346     다사78280638  4146125627  4146125627200740008  2018-02-08  4
647  224185     507647     다사79780767  4146125627  4146125627200660001  2018-05-20  4
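
A "duplicate rows" figure can be counted in more than one way, and this report does not say which definition its total of 761 follows. The sketch below computes three common variants using only the standard library, applied to the ten first rows shown in the Sample section. `duplicate_summary` is a hypothetical helper, not part of ydata-profiling.

```python
from collections import Counter


def duplicate_summary(rows):
    """Count duplicated rows under three common definitions."""
    counts = Counter(rows)
    groups = {row: n for row, n in counts.items() if n > 1}
    return {
        # Distinct row values that occur more than once.
        "duplicate_groups": len(groups),
        # Every row that belongs to such a group.
        "rows_in_groups": sum(groups.values()),
        # Rows that would be dropped if one copy of each group were kept.
        "surplus_rows": sum(n - 1 for n in groups.values()),
    }


# The ten first rows from the Sample section of the report.
row_a = (278061, 586400, "라사34058610", "5111031027", "5111031027200010021", "2017-05-22")
row_b = (275605, 582957, "라사31588267", "5111031030", "5111031030200530000", "2017-05-22")
row_c = (254513, 568326, "라사10426816", "5111034027", "5111034027200040000", "2017-05-22")
rows = [row_a] * 4 + [row_b] + [row_c] * 5

summary = duplicate_summary(rows)
print(summary)
```

For these ten rows the three definitions give 2, 9, and 7 respectively, which shows how much the headline number depends on the definition chosen.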