Overview

Dataset statistics

Number of variables: 5
Number of observations: 3183
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 236
Duplicate rows (%): 7.4%
Total size in memory: 130.7 KiB
Average record size in memory: 42.0 B
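
The two derived figures in these statistics follow from the raw counts. The snippet below is a minimal sketch, assuming that the duplicate percentage is the duplicate count divided by the number of observations and that the average record size is the total memory divided by the number of observations. The variable names are illustrative and are not taken from the profiling tool.

```python
# Values copied from the dataset statistics of this report.
n_observations = 3183
n_duplicate_rows = 236
total_memory_kib = 130.7

# Assumed definitions (not stated by the report itself).
duplicate_pct = 100 * n_duplicate_rows / n_observations
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"Duplicate rows (%): {duplicate_pct:.1f}")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Under these assumptions both results round to the values shown in the report (7.4% and 42.0 B).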

Variable types

Categorical: 1
DateTime: 1
Text: 1
Numeric: 2

Dataset

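Description: For each roadkill incident that occurred within a national park, this dataset provides the national park name, the detailed location of the incident, the date of occurrence, and the species involved.
Author: 국립공원공단 (Korea National Park Service)
URL: https://www.data.go.kr/data/3068387/fileData.do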

Alerts

Duplicates: Dataset has 236 (7.4%) duplicate rows
High correlation: 경도 is highly overall correlated with 위도 and 1 other field
High correlation: 위도 is highly overall correlated with 경도 and 1 other field
High correlation: 국립공원명 is highly overall correlated with 경도 and 1 other field

Reproduction

Analysis started: 2024-01-06 12:20:03.741340
Analysis finished: 2024-01-06 12:20:07.537760
Duration: 3.8 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

국립공원명 (national park name)
Categorical

HIGH CORRELATION

Distinct: 20
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 25.0 KiB

Value | Count
지리산 | 756
오대산 | 549
소백산 | 241
내장산 | 239
덕유산 | 221
Other values (15) | 1177

Length

Max length: 5
Median length: 3
Mean length: 3.1105875
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 한려해상
2nd row: 한려해상
3rd row: 한려해상
4th row: 한려해상
5th row: 한려해상

Common Values

Value | Count | Frequency (%)
지리산 | 756 | 23.8%
오대산 | 549 | 17.2%
소백산 | 241 | 7.6%
내장산 | 239 | 7.5%
덕유산 | 221 | 6.9%
속리산 | 193 | 6.1%
설악산 | 189 | 5.9%
변산반도 | 132 | 4.1%
한려해상 | 128 | 4.0%
월악산 | 103 | 3.2%
Other values (10) | 432 | 13.6%

Length

[Figure: Histogram of lengths of the category]

조사일자 (survey date)
DateTime

Distinct: 1630
Distinct (%): 51.2%
Missing: 0
Missing (%): 0.0%
Memory size: 25.0 KiB
Minimum: 2011-01-10 00:00:00
Maximum: 2023-11-20 00:00:00

[Figure: Histogram with fixed size bins (bins=50)]

자원명 (resource name: the species involved)
Text

Distinct: 98
Distinct (%): 3.1%
Missing: 0
Missing (%): 0.0%
Memory size: 25.0 KiB

Length

Max length: 9
Median length: 3
Mean length: 3.2613886
Min length: 1

Characters and Unicode

Total characters: 10381
Distinct characters: 139
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
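
As an illustration of what such character properties look like, the sketch below counts characters and their Unicode general categories with Python's standard `unicodedata` module. It is not the profiling tool's own implementation. The helper name `character_profile` is hypothetical, and the input is just the five sample rows listed for this variable, not the full column.

```python
import unicodedata
from collections import Counter


def character_profile(values):
    """Count characters and Unicode general categories across strings.

    Hypothetical helper for illustration only.
    """
    chars = Counter()
    for value in values:
        # Normalise so each Hangul syllable is a single code point.
        chars.update(unicodedata.normalize("NFC", value))
    categories = Counter()
    for ch, count in chars.items():
        categories[unicodedata.category(ch)] += count
    return chars, categories


# The five sample rows shown for this variable.
sample = ["멧새", "청설모", "청설모", "능구렁이", "고라니"]
chars, categories = character_profile(sample)

print(sum(chars.values()))      # total characters in the sample
print(len(chars))               # distinct characters in the sample
print(dict(categories))         # "Lo" is the general category "Other Letter"
print(unicodedata.name("청"))   # code point names begin with "HANGUL SYLLABLE"
```

The standard library exposes the general category and the code point name but not the script or block, so those two properties would need an additional data source.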

Unique

Unique: 26
Unique (%): 0.8%

Sample

1st row: 멧새
2nd row: 청설모
3rd row: 청설모
4th row: 능구렁이
5th row: 고라니

Common values

Value | Count | Frequency (%)
다람쥐 | 1128 | 35.4%
고라니 | 230 | 7.2%
청설모 | 223 | 7.0%
너구리 | 150 | 4.7%
누룩뱀 | 144 | 4.5%
유혈목이 | 139 | 4.4%
능구렁이 | 109 | 3.4%
쇠살모사 | 95 | 3.0%
두꺼비 | 85 | 2.7%
족제비 | 79 | 2.5%
Other values (88) | 801 | 25.2%

Most occurring characters

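Character | Count | Frequency (%)
(not preserved) | 1161 | 11.2%
(not preserved) | 1140 | 11.0%
(not preserved) | 1128 | 10.9%
(not preserved) | 421 | 4.1%
(not preserved) | 393 | 3.8%
(not preserved) | 378 | 3.6%
(not preserved) | 291 | 2.8%
(not preserved) | 263 | 2.5%
(not preserved) | 232 | 2.2%
(not preserved) | 230 | 2.2%
Other values (129) | 4744 | 45.7%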

Most occurring categories

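Category | Count | Frequency (%)
Other Letter | 10381 | 100.0%

Most frequent character per category

Other Letter

Character | Count | Frequency (%)
(not preserved) | 1161 | 11.2%
(not preserved) | 1140 | 11.0%
(not preserved) | 1128 | 10.9%
(not preserved) | 421 | 4.1%
(not preserved) | 393 | 3.8%
(not preserved) | 378 | 3.6%
(not preserved) | 291 | 2.8%
(not preserved) | 263 | 2.5%
(not preserved) | 232 | 2.2%
(not preserved) | 230 | 2.2%
Other values (129) | 4744 | 45.7%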

Most occurring scripts

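Script | Count | Frequency (%)
Hangul | 10381 | 100.0%

Most frequent character per script

Hangul

Character | Count | Frequency (%)
(not preserved) | 1161 | 11.2%
(not preserved) | 1140 | 11.0%
(not preserved) | 1128 | 10.9%
(not preserved) | 421 | 4.1%
(not preserved) | 393 | 3.8%
(not preserved) | 378 | 3.6%
(not preserved) | 291 | 2.8%
(not preserved) | 263 | 2.5%
(not preserved) | 232 | 2.2%
(not preserved) | 230 | 2.2%
Other values (129) | 4744 | 45.7%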

Most occurring blocks

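Block | Count | Frequency (%)
Hangul | 10381 | 100.0%

Most frequent character per block

Hangul

Character | Count | Frequency (%)
(not preserved) | 1161 | 11.2%
(not preserved) | 1140 | 11.0%
(not preserved) | 1128 | 10.9%
(not preserved) | 421 | 4.1%
(not preserved) | 393 | 3.8%
(not preserved) | 378 | 3.6%
(not preserved) | 291 | 2.8%
(not preserved) | 263 | 2.5%
(not preserved) | 232 | 2.2%
(not preserved) | 230 | 2.2%
Other values (129) | 4744 | 45.7%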

경도 (longitude)
Real number (ℝ)

HIGH CORRELATION

Distinct: 95
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 127.87738
Minimum: 125.41874
Maximum: 129.35559
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 28.1 KiB

Quantile statistics

Minimum: 125.41874
5-th percentile: 126.5971
Q1: 127.50059
Median: 127.82359
Q3: 128.52153
95-th percentile: 128.65107
Maximum: 129.35559
Range: 3.9368513
Interquartile range (IQR): 1.0209418

Descriptive statistics

Standard deviation: 0.67790714
Coefficient of variation (CV): 0.0053012277
Kurtosis: -0.49317114
Mean: 127.87738
Median Absolute Deviation (MAD): 0.5917593
Skewness: -0.3723323
Sum: 407033.71
Variance: 0.45955809
Monotonicity: Not monotonic
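
Several of the 경도 statistics are simple functions of one another. The sketch below recomputes them from the reported values, assuming the usual definitions (range = maximum minus minimum, IQR = Q3 minus Q1, CV = standard deviation divided by mean, variance = standard deviation squared). The inputs are the rounded figures printed in this report, so small differences in the last digits are expected.

```python
import math

# Values copied from the quantile and descriptive statistics of 경도.
minimum, maximum = 125.41874, 129.35559
q1, q3 = 127.50059, 128.52153
mean, std = 127.87738, 0.67790714

value_range = maximum - minimum   # reported: 3.9368513
iqr = q3 - q1                     # reported: 1.0209418
cv = std / mean                   # reported: 0.0053012277
variance = std ** 2               # reported: 0.45955809

for name, computed, reported in [
    ("Range", value_range, 3.9368513),
    ("IQR", iqr, 1.0209418),
    ("CV", cv, 0.0053012277),
    ("Variance", variance, 0.45955809),
]:
    # Compare with a relative tolerance because the inputs are rounded.
    matches = math.isclose(computed, reported, rel_tol=1e-5)
    print(f"{name}: computed={computed:.7g} reported={reported} match={matches}")
```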
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
128.6169457 | 279 | 8.8%
127.5005907 | 274 | 8.6%
128.5852584 | 180 | 5.7%
127.5668684 | 157 | 4.9%
127.6366693 | 137 | 4.3%
127.7678166 | 137 | 4.3%
128.391064 | 116 | 3.6%
126.5970999 | 94 | 3.0%
128.5809816 | 88 | 2.8%
128.421133 | 85 | 2.7%
Other values (85) | 1636 | 51.4%

Minimum 10 values

Value | Count | Frequency (%)
125.4187399 | 5 | 0.2%
125.8953789 | 1 | < 0.1%
126.1598171 | 4 | 0.1%
126.1611309 | 4 | 0.1%
126.2872804 | 2 | 0.1%
126.3287751 | 32 | 1.0%
126.3439414 | 1 | < 0.1%
126.4672665 | 1 | < 0.1%
126.5669846 | 1 | < 0.1%
126.582075 | 35 | 1.1%

Maximum 10 values

Value | Count | Frequency (%)
129.3555912 | 7 | 0.2%
129.3529929 | 6 | 0.2%
129.3485815 | 16 | 0.5%
129.2196397 | 9 | 0.3%
129.1797786 | 9 | 0.3%
129.1090618 | 10 | 0.3%
128.958275 | 3 | 0.1%
128.9270511 | 36 | 1.1%
128.6746923 | 51 | 1.6%
128.6668004 | 2 | 0.1%

위도 (latitude)
Real number (ℝ)

HIGH CORRELATION

Distinct: 95
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 36.317464
Minimum: 34.2025
Maximum: 38.217
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 28.1 KiB

Quantile statistics

Minimum: 34.2025
5-th percentile: 34.7709
Q1: 35.3576
Median: 35.9326
Q3: 37.347619
95-th percentile: 38.101
Maximum: 38.217
Range: 4.0145
Interquartile range (IQR): 1.9900189

Descriptive statistics

Standard deviation: 1.070412
Coefficient of variation (CV): 0.029473754
Kurtosis: -1.2914972
Mean: 36.317464
Median Absolute Deviation (MAD): 0.6571
Skewness: 0.28142578
Sum: 115598.49
Variance: 1.1457818
Monotonicity: Not monotonic
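
For readers unfamiliar with the MAD, the sketch below shows one common definition: the median of the absolute deviations from the median, with no scaling constant. Whether the report applies exactly this unscaled form is an assumption. The function name is hypothetical, and the input is only five of the 위도 values listed in this report, so the result is not expected to reproduce the reported 0.6571.

```python
from statistics import median


def median_absolute_deviation(values):
    """Unscaled MAD: median of |x - median(x)| (hypothetical helper)."""
    values = list(values)
    center = median(values)
    return median(abs(v - center) for v in values)


# Five 위도 values taken from this report, for illustration only.
latitudes = [37.7688, 35.2982, 35.3637, 35.9326, 38.101]
print(median_absolute_deviation(latitudes))
```

Unlike the standard deviation, this measure is barely affected by a few extreme values, which is why profiling reports often list both.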
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
37.7688 | 279 | 8.8%
35.2982 | 274 | 8.6%
37.77809511 | 180 | 5.7%
35.3637 | 157 | 4.9%
35.31619123 | 137 | 4.3%
35.9326 | 137 | 4.3%
38.101 | 116 | 3.6%
35.6598 | 94 | 3.0%
37.7553 | 88 | 2.8%
36.9076 | 85 | 2.7%
Other values (85) | 1636 | 51.4%

Minimum 10 values

Value | Count | Frequency (%)
34.2025 | 22 | 0.7%
34.3645 | 4 | 0.1%
34.46792173 | 8 | 0.3%
34.497 | 1 | < 0.1%
34.6692 | 5 | 0.2%
34.7155 | 5 | 0.2%
34.7326 | 2 | 0.1%
34.74428893 | 9 | 0.3%
34.7519 | 1 | < 0.1%
34.75226396 | 51 | 1.6%

Maximum 10 values

Value | Count | Frequency (%)
38.217 | 18 | 0.6%
38.2127 | 13 | 0.4%
38.179 | 6 | 0.2%
38.1711 | 6 | 0.2%
38.13378028 | 30 | 0.9%
38.101 | 116 | 3.6%
37.77809511 | 180 | 5.7%
37.7688 | 279 | 8.8%
37.7553 | 88 | 2.8%
37.6945 | 2 | 0.1%

Interactions

[Figures: 4 interaction plots (images not preserved)]

Correlations

(variable) | 국립공원명 | 자원명 | 경도 | 위도
국립공원명 | 1.000 | 0.686 | 0.995 | 1.000
자원명 | 0.686 | 1.000 | 0.695 | 0.629
경도 | 0.995 | 0.695 | 1.000 | 0.941
위도 | 1.000 | 0.629 | 0.941 | 1.000

(variable) | 경도 | 위도 | 국립공원명
경도 | 1.000 | 0.612 | 0.871
위도 | 0.612 | 1.000 | 0.960
국립공원명 | 0.871 | 0.960 | 1.000
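
The report does not state which association measure produced these values. One common choice for pairs of categorical fields is Cramér's V, and the sketch below shows how it could be computed from raw pairs, without bias correction. This is an assumption for illustration, not the tool's documented method. The function name and the "south"/"north" latitude bands are hypothetical.

```python
import math
from collections import Counter


def cramers_v(pairs):
    """Cramér's V for a list of (x, y) category pairs, no bias correction."""
    n = len(pairs)
    joint = Counter(pairs)
    row_totals = Counter(x for x, _ in pairs)
    col_totals = Counter(y for _, y in pairs)
    k = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or k <= 0:
        return 0.0  # undefined when a field has fewer than two categories
    chi2 = 0.0
    for x, row_count in row_totals.items():
        for y, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))


# Hypothetical data: each park falls entirely in one latitude band,
# so the park name fully determines the band.
pairs = [
    ("지리산", "south"), ("지리산", "south"),
    ("오대산", "north"), ("오대산", "north"),
    ("설악산", "north"),
]
print(round(cramers_v(pairs), 3))
```

When one field fully determines the other, as in this toy input, the measure reaches its maximum of 1. That is one plausible reading of the 1.000 reported between 국립공원명 and 위도, since each park covers a limited range of latitudes.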

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows

(index) | 국립공원명 | 조사일자 | 자원명 | 경도 | 위도
0 | 한려해상 | 2021-10-20 | 멧새 | 128.674692 | 34.752264
1 | 한려해상 | 2021-10-13 | 청설모 | 128.674692 | 34.752264
2 | 한려해상 | 2021-10-11 | 청설모 | 128.674692 | 34.752264
3 | 한려해상 | 2021-09-02 | 능구렁이 | 128.674692 | 34.752264
4 | 한려해상 | 2021-07-27 | 고라니 | 128.674692 | 34.752264
5 | 한려해상 | 2020-10-29 | 고라니 | 128.674692 | 34.752264
6 | 한려해상 | 2020-11-15 | 청설모 | 128.674692 | 34.752264
7 | 한려해상 | 2020-10-18 | 까치 | 128.674692 | 34.752264
8 | 한려해상 | 2020-09-27 | 족제비 | 128.674692 | 34.752264
9 | 한려해상 | 2020-09-26 | 다람쥐 | 128.674692 | 34.752264

Last rows

(index) | 국립공원명 | 조사일자 | 자원명 | 경도 | 위도
3173 | 가야산 | 2013-07-24 | 호랑지빠귀 | 128.09991 | 35.807853
3174 | 가야산 | 2013-06-08 | 다람쥐 | 128.09991 | 35.807853
3175 | 가야산 | 2012-09-12 | 청설모 | 128.09991 | 35.807853
3176 | 가야산 | 2012-09-03 | 유혈목이 | 128.09991 | 35.807853
3177 | 가야산 | 2012-05-04 | 너구리 | 128.09991 | 35.807853
3178 | 가야산 | 2012-04-29 | 멧비둘기 | 128.09991 | 35.807853
3179 | 가야산 | 2012-02-27 | 노루 | 128.09991 | 35.807853
3180 | 가야산 | 2011-09-09 | 다람쥐 | 128.09991 | 35.807853
3181 | 가야산 | 2011-09-05 | 능구렁이 | 128.09991 | 35.807853
3182 | 가야산 | 2011-07-09 | 오소리 | 128.09991 | 35.807853

Duplicate rows

Most frequently occurring

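(index) | 국립공원명 | 조사일자 | 자원명 | 경도 | 위도 | # duplicates
59 | 소백산 | 2021-03-24 | 두꺼비 | 128.521533 | 36.9056 | 12
183 | 지리산 | 2017-06-26 | 다람쥐 | 127.500591 | 35.2982 | 9
63 | 소백산 | 2022-03-30 | 두꺼비 | 128.521533 | 36.9056 | 7
161 | 지리산 | 2012-10-02 | 다람쥐 | 127.566868 | 35.3637 | 6
211 | 지리산 | 2020-07-29 | 다람쥐 | 127.500591 | 35.2982 | 6
6 | 내장산 | 2016-06-20 | 다람쥐 | 126.846113 | 35.4476 | 5
60 | 소백산 | 2021-09-21 | 두꺼비 | 128.521533 | 36.9056 | 4
85 | 오대산 | 2011-07-17 | 다람쥐 | 128.616946 | 37.7688 | 4
173 | 지리산 | 2014-09-16 | 다람쥐 | 127.500591 | 35.2982 | 4
181 | 지리산 | 2017-06-12 | 다람쥐 | 127.500591 | 35.2982 | 4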
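
A table like this can be approximated by counting identical rows. The sketch below is a minimal standard-library version, assuming that "# duplicates" is the number of times an identical row occurs. The report does not say how that column or the overall figure of 236 is derived. The function name is hypothetical, and the input holds a handful of rows modelled on this report, not the real dataset.

```python
from collections import Counter


def most_frequent_duplicates(rows, top=10):
    """Return (row, occurrences) for rows seen more than once, most frequent first."""
    counts = Counter(rows)
    repeated = [(row, count) for row, count in counts.items() if count > 1]
    repeated.sort(key=lambda item: item[1], reverse=True)
    return repeated[:top]


# Illustrative rows: (국립공원명, 조사일자, 자원명, 경도, 위도).
rows = [
    ("소백산", "2021-03-24", "두꺼비", 128.521533, 36.9056),
    ("소백산", "2021-03-24", "두꺼비", 128.521533, 36.9056),
    ("소백산", "2021-03-24", "두꺼비", 128.521533, 36.9056),
    ("지리산", "2017-06-26", "다람쥐", 127.500591, 35.2982),
    ("지리산", "2017-06-26", "다람쥐", 127.500591, 35.2982),
    ("가야산", "2011-07-09", "오소리", 128.09991, 35.807853),
]

for row, count in most_frequent_duplicates(rows):
    print(row, count)
```

Because the coordinates take only 95 distinct values across 3183 rows, two animals of the same species recorded in the same park on the same day produce identical rows. Repeated rows may therefore be genuine separate incidents rather than data-entry errors.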