Overview

Dataset statistics

Number of variables: 6
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.1 KiB
Average record size in memory: 52.3 B

Variable types

Categorical: 5
Numeric: 1

Alerts

anals_trget_year is highly overall correlated with arrrg_rate and 3 other fields (High correlation)
one_area_nm is highly overall correlated with arrrg_rate and 3 other fields (High correlation)
anals_trget_mt is highly overall correlated with arrrg_rate and 3 other fields (High correlation)
two_area_nm is highly overall correlated with anals_trget_year and 2 other fields (High correlation)
arrrg_rate is highly overall correlated with anals_trget_year and 2 other fields (High correlation)
anals_trget_year is highly imbalanced (80.6%) (Imbalance)
anals_trget_mt is highly imbalanced (80.6%) (Imbalance)
one_area_nm is highly imbalanced (80.6%) (Imbalance)
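
The 80.6% imbalance figure can be reproduced from the 97/3 split that all three flagged columns share. The sketch below assumes the score is one minus the Shannon entropy of the class frequencies, normalized by the maximum entropy for that number of classes. The function name imbalance_score is illustrative and is not taken from the report.

```python
from math import log2


def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform split, near 1 for a dominant class.

    Assumption: this normalized-entropy definition is what the report uses;
    it is checked here only against the 80.6% value shown in the alerts.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * log2(c / total) for c in counts if c > 0)
    return 1 - entropy / log2(k)


# 97 rows in the majority class, 3 in the minority class
score = imbalance_score([97, 3])
print(f"{score:.1%}")
```

Under this definition a perfectly balanced two-class column scores 0, so a value above 80% signals that one class carries almost all of the rows.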

Reproduction

Analysis started: 2023-12-10 10:10:16.157289
Analysis finished: 2023-12-10 10:10:17.256096
Duration: 1.1 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

anals_trget_year
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value    Count
2018     97
2020     3

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2018
2nd row: 2020
3rd row: 2018
4th row: 2018
5th row: 2018

Common Values

Value    Count    Frequency (%)
2018     97       97.0%
2020     3        3.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]

anals_trget_mt
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value    Count
1        97
12       3

Length

Max length: 2
Median length: 1
Mean length: 1.03
Min length: 1
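
The length statistics follow directly from the value counts: 97 one-character values ("1") and 3 two-character values ("12"). A minimal sketch, assuming lengths are counted in characters of the stored string:

```python
from statistics import median

# Value counts for anals_trget_mt as listed in this report
value_counts = {"1": 97, "12": 3}

# Expand to one length per row
lengths = [len(value) for value, count in value_counts.items() for _ in range(count)]

stats = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": sum(lengths) / len(lengths),
    "min": min(lengths),
}
print(stats)
```

The mean of 1.03 is simply (97 × 1 + 3 × 2) / 100.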

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 12
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value    Count    Frequency (%)
1        97       97.0%
12       3        3.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]

one_area_nm
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value       Count
강원도      97
충청북도    3

Length

Max length: 4
Median length: 3
Mean length: 3.03
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강원도
2nd row: 충청북도
3rd row: 강원도
4th row: 강원도
5th row: 강원도

Common Values

Value       Count    Frequency (%)
강원도      97       97.0%
충청북도    3        3.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot]

two_area_nm
Categorical

HIGH CORRELATION 

Distinct: 11
Distinct (%): 11.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value               Count
고성군              11
삼척시              11
속초시              11
양구군              11
양양군              11
Other values (6)    45

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 1
Unique (%): 1.0%
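
"Distinct" and "Unique" measure different things here: distinct counts the different values, while unique counts the values that occur exactly once. The sketch below illustrates this with the two_area_nm counts. Ten of the counts come from the Common Values table of this report. The eleventh entry, 인제군 with a count of 1, is an inference (the ten listed values sum to 99, and 인제군 appears in the last sample row), not a figure the report states.

```python
# Counts for two_area_nm; the 인제군 entry is inferred, see the note above.
counts = {
    "고성군": 11, "삼척시": 11, "속초시": 11, "양구군": 11, "양양군": 11,
    "영월군": 11, "원주시": 11, "동해시": 10, "강릉시": 9, "충주시": 3,
    "인제군": 1,  # assumption: the single value not shown in the table
}

total = sum(counts.values())
distinct = len(counts)                                # number of different values
unique = sum(1 for c in counts.values() if c == 1)    # values seen exactly once

# Rows that fall under "Other values" once the five largest groups are shown
top5 = sorted(counts.values(), reverse=True)[:5]
other_rows = total - sum(top5)
other_values = distinct - len(top5)

print(distinct, unique, other_values, other_rows)
```

With these counts the "Other values (6)" row accounts for 45 of the 100 observations, which agrees with the overview for this variable.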

Sample

1st row: 강릉시
2nd row: 충주시
3rd row: 강릉시
4th row: 강릉시
5th row: 강릉시

Common Values

Value     Count    Frequency (%)
고성군    11       11.0%
삼척시    11       11.0%
속초시    11       11.0%
양구군    11       11.0%
양양군    11       11.0%
영월군    11       11.0%
원주시    11       11.0%
동해시    10       10.0%
강릉시    9        9.0%
충주시    3        3.0%

Length

[Figure: Histogram of lengths of the category]

kdc_nm
Categorical

Distinct: 11
Distinct (%): 11.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value               Count
역사                11
예술                10
미상                9
사회과학            9
기술과학            9
Other values (6)    52

Length

Max length: 4
Median length: 2
Mean length: 2.54
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 미상
2nd row: 사회과학
3rd row: 역사
4th row: 종교
5th row: 기술과학

Common Values

Value (English gloss)           Count    Frequency (%)
역사 (History)                  11       11.0%
예술 (Arts)                     10       10.0%
미상 (Unknown)                  9        9.0%
사회과학 (Social sciences)      9        9.0%
기술과학 (Technology)           9        9.0%
문학 (Literature)               9        9.0%
철학 (Philosophy)               9        9.0%
총류 (General works)            9        9.0%
자연과학 (Natural sciences)     9        9.0%
종교 (Religion)                 8        8.0%

Length

[Figure: Histogram of lengths of the category]

arrrg_rate
Real number (ℝ)

HIGH CORRELATION 

Distinct: 75
Distinct (%): 75.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.16882
Minimum: 0.011
Maximum: 0.364
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0.011
5-th percentile: 0.09335
Q1: 0.14125
Median: 0.1685
Q3: 0.19825
95-th percentile: 0.26225
Maximum: 0.364
Range: 0.353
Interquartile range (IQR): 0.057

Descriptive statistics

Standard deviation: 0.053021699
Coefficient of variation (CV): 0.31407238
Kurtosis: 2.5042167
Mean: 0.16882
Median Absolute Deviation (MAD): 0.0295
Skewness: 0.30847656
Sum: 16.882
Variance: 0.0028113006
Monotonicity: Not monotonic
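
Several of the arrrg_rate statistics are derived from others, so they can be cross-checked against one another. The sketch below recomputes the range, IQR, coefficient of variation, variance, and sum from the reported mean, standard deviation, extremes, and quartiles. It uses only the figures printed in this report. The variable names are illustrative.

```python
from math import isclose

# Figures copied from the arrrg_rate statistics in this report
n = 100
mean, std = 0.16882, 0.053021699
minimum, maximum = 0.011, 0.364
q1, q3 = 0.14125, 0.19825

derived = {
    "range": maximum - minimum,   # reported: 0.353
    "iqr": q3 - q1,               # reported: 0.057
    "cv": std / mean,             # reported: 0.31407238
    "variance": std ** 2,         # reported: 0.0028113006
    "sum": mean * n,              # reported: 16.882
}

reported = {
    "range": 0.353,
    "iqr": 0.057,
    "cv": 0.31407238,
    "variance": 0.0028113006,
    "sum": 16.882,
}

# True when every derived value agrees with the reported one
consistent = all(isclose(derived[k], reported[k], rel_tol=1e-6) for k in reported)
print(consistent)
```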
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value                Count    Frequency (%)
0.175                4        4.0%
0.177                3        3.0%
0.188                3        3.0%
0.149                3        3.0%
0.145                3        3.0%
0.159                2        2.0%
0.201                2        2.0%
0.168                2        2.0%
0.162                2        2.0%
0.185                2        2.0%
Other values (65)    74       74.0%

Minimum 10 values

Value    Count    Frequency (%)
0.011    1        1.0%
0.028    1        1.0%
0.051    1        1.0%
0.055    1        1.0%
0.081    1        1.0%
0.094    1        1.0%
0.102    1        1.0%
0.103    1        1.0%
0.109    1        1.0%
0.116    2        2.0%

Maximum 10 values

Value    Count    Frequency (%)
0.364    1        1.0%
0.321    1        1.0%
0.282    1        1.0%
0.274    1        1.0%
0.267    1        1.0%
0.262    1        1.0%
0.247    1        1.0%
0.229    1        1.0%
0.227    1        1.0%
0.222    1        1.0%

Interactions

[Figure: Interactions plot]

Correlations

Correlation matrix 1 of 3

                    anals_trget_year    anals_trget_mt    one_area_nm    two_area_nm    kdc_nm    arrrg_rate
anals_trget_year    1.000               0.963             0.963          1.000          0.000     0.960
anals_trget_mt      0.963               1.000             0.963          1.000          0.000     0.960
one_area_nm         0.963               0.963             1.000          1.000          0.000     0.960
two_area_nm         1.000               1.000             1.000          1.000          0.000     0.604
kdc_nm              0.000               0.000             0.000          0.000          1.000     0.000
arrrg_rate          0.960               0.960             0.960          0.604          0.000     1.000

Correlation matrix 2 of 3

                    anals_trget_year    one_area_nm    anals_trget_mt    kdc_nm    two_area_nm
anals_trget_year    1.000               0.826          0.826             0.000     0.953
one_area_nm         0.826               1.000          0.826             0.000     0.953
anals_trget_mt      0.826               0.826          1.000             0.000     0.953
kdc_nm              0.000               0.000          0.000             1.000     0.000
two_area_nm         0.953               0.953          0.953             0.000     1.000

Correlation matrix 3 of 3

                    arrrg_rate    anals_trget_year    anals_trget_mt    one_area_nm    two_area_nm    kdc_nm
arrrg_rate          1.000         0.793               0.793             0.793          0.306          0.000
anals_trget_year    0.793         1.000               0.826             0.826          0.953          0.000
anals_trget_mt      0.793         0.826               1.000             0.826          0.953          0.000
one_area_nm         0.793         0.826               0.826             1.000          0.953          0.000
two_area_nm         0.306         0.953               0.953             0.953          1.000          0.000
kdc_nm              0.000         0.000               0.000             0.000          0.000          1.000
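
The report text does not name the measure behind each matrix. One candidate for the categorical entries of 0.826 and 0.953 is a bias-corrected Cramér's V, with a Yates continuity correction applied to 2×2 tables. The sketch below tests that idea under two labeled assumptions: the three 2020 rows are exactly the three 충청북도 rows (as the sample rows suggest), and the two_area_nm counts include an inferred single 인제군 row. The function cramers_v is a hypothetical helper written for this illustration, not the library's own code.

```python
from math import sqrt


def cramers_v(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows.

    Assumption: Yates continuity correction is applied only when the table
    has one degree of freedom (2x2).
    """
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    r, k = len(row_tot), len(col_tot)
    yates = (r - 1) * (k - 1) == 1

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            diff = abs(observed - expected)
            if yates:
                diff = max(diff - 0.5, 0.0)
            chi2 += diff * diff / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


# anals_trget_year (rows: 2018, 2020) vs one_area_nm (cols: 강원도, 충청북도),
# assuming the two columns co-occur perfectly
year_vs_area = [[97, 0], [0, 3]]

# two_area_nm (11 rows) vs anals_trget_year (cols: 2018, 2020),
# assuming only 충주시 falls in 2020 and 인제군 has a single row
two_area_vs_year = [[11, 0]] * 7 + [[10, 0], [9, 0], [1, 0], [0, 3]]

v_year_area = cramers_v(year_vs_area)
v_two_area_year = cramers_v(two_area_vs_year)
print(round(v_year_area, 3), round(v_two_area_year, 3))
```

Under these assumptions the sketch yields 0.826 and 0.953, matching the corresponding categorical entries. This suggests, without proving, that the coefficients below 1 for perfectly associated columns come from the corrections rather than from any real disagreement between the columns.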

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows

index    anals_trget_year    anals_trget_mt    one_area_nm    two_area_nm    kdc_nm      arrrg_rate
0        2018                1                 강원도         강릉시         미상        0.177
1        2020                12                충청북도       충주시         사회과학    0.028
2        2018                1                 강원도         강릉시         역사        0.217
3        2018                1                 강원도         강릉시         종교        0.152
4        2018                1                 강원도         강릉시         기술과학    0.188
5        2018                1                 강원도         강릉시         문학        0.175
6        2018                1                 강원도         강릉시         철학        0.211
7        2020                12                충청북도       충주시         예술        0.109
8        2018                1                 강원도         강릉시         예술        0.116
9        2018                1                 강원도         강릉시         총류        0.198

Last rows

index    anals_trget_year    anals_trget_mt    one_area_nm    two_area_nm    kdc_nm      arrrg_rate
90       2018                1                 강원도         원주시         언어        0.187
91       2018                1                 강원도         원주시         예술        0.177
92       2018                1                 강원도         원주시         역사        0.169
93       2018                1                 강원도         원주시         사회과학    0.205
94       2018                1                 강원도         원주시         미상        0.184
95       2018                1                 강원도         원주시         철학        0.227
96       2018                1                 강원도         원주시         종교        0.188
97       2018                1                 강원도         원주시         기술과학    0.214
98       2018                1                 강원도         원주시         자연과학    0.166
99       2018                1                 강원도         인제군         역사        0.262