Overview

Dataset statistics

Number of variables: 8
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.9 KiB
Average record size in memory: 70.3 B
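As a quick sanity check on the figures above, the sketch below recomputes the total cell count and relates the average record size to the total memory size. The variable names are illustrative, and the inputs are the rounded values shown in the report, so agreement is only expected at the displayed precision.

```python
# Values copied from the dataset statistics above (already rounded by the report).
n_variables = 8
n_observations = 100
missing_cells = 0
avg_record_size_bytes = 70.3

# Total number of cells and the share of them that is missing.
total_cells = n_variables * n_observations
missing_pct = 100.0 * missing_cells / total_cells

# Average record size times the row count should reproduce the total size.
total_size_kib = avg_record_size_bytes * n_observations / 1024

print(total_cells)
print(f"{missing_pct:.1f}%")
print(f"{total_size_kib:.1f} KiB")
```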

Variable types

Categorical: 5
Numeric: 3

Alerts

anals_trget_year has constant value "2019" (Constant)
anals_trget_mt is highly overall correlated with area_nm (High correlation)
area_nm is highly overall correlated with anals_trget_mt (High correlation)
lon_co is highly overall correlated with lon_mber_co (High correlation)
lon_mber_co is highly overall correlated with lon_co (High correlation)
read_qy is highly overall correlated with age_flag_nm (High correlation)
age_flag_nm is highly overall correlated with read_qy (High correlation)
anals_trget_mt is highly imbalanced (80.6%) (Imbalance)
lon_co has unique values (Unique)
lon_mber_co has unique values (Unique)
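The 80.6% imbalance figure for anals_trget_mt can be reproduced from its value counts (97 and 3). The sketch below assumes the score is defined as one minus the normalized Shannon entropy of the class frequencies. The report does not state this definition, so treat it as an assumption; `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the base-2 Shannon entropy of the class frequencies.

    Assumed definition: 0 means perfectly balanced classes, 1 means fully concentrated.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # A single class has no entropy to normalize; returning 0.0 is a choice made here.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Value counts of anals_trget_mt: "1" occurs 97 times, "12" occurs 3 times.
score = imbalance_score([97, 3])
print(f"{score:.1%}")
```

Under this definition an evenly split two-class column scores 0, so a value near 0.8 indicates that one class dominates.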

Reproduction

Analysis started: 2023-12-10 09:57:20.033648
Analysis finished: 2023-12-10 09:57:23.662427
Duration: 3.63 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
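The reported duration is the difference between the two timestamps above. A minimal check using only the standard library:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the Reproduction block.
started = datetime.strptime("2023-12-10 09:57:20.033648", FMT)
finished = datetime.strptime("2023-12-10 09:57:23.662427", FMT)

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```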

Variables

anals_trget_year
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value counts:
2019: 100

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2019
2nd row: 2019
3rd row: 2019
4th row: 2019
5th row: 2019

Common Values

Value  Count  Frequency (%)
2019   100    100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

anals_trget_mt
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value counts:
1: 97
12: 3

Length

Max length: 2
Median length: 1
Mean length: 1.03
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 12
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      97     97.0%
12     3      3.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

area_nm
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value counts:
경상남도: 18
경상북도: 18
광주광역시: 18
경기도: 17
강원도: 16
Other values (2): 13

Length

Max length: 5
Median length: 4
Mean length: 3.95
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강원도
2nd row: 충청북도
3rd row: 강원도
4th row: 강원도
5th row: 강원도

Common Values

Value       Count  Frequency (%)
경상남도    18     18.0%
경상북도    18     18.0%
광주광역시  18     18.0%
경기도      17     17.0%
강원도      16     16.0%
대구광역시  10     10.0%
충청북도    3      3.0%
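The length statistics for area_nm follow directly from the value counts. The sketch below expands the counts into one string length per row and recomputes max, median, mean, and min. It assumes lengths are counted in Unicode code points on precomposed (NFC) Hangul, which is how Python's `len` treats the literals written here.

```python
from statistics import median

# Value counts copied from the Common Values table for area_nm.
counts = {
    "경상남도": 18,
    "경상북도": 18,
    "광주광역시": 18,
    "경기도": 17,
    "강원도": 16,
    "대구광역시": 10,
    "충청북도": 3,
}

# One length per row; len() counts code points, not bytes.
lengths = [len(value) for value, n in counts.items() for _ in range(n)]

max_len = max(lengths)
median_len = median(lengths)
mean_len = sum(lengths) / len(lengths)
min_len = min(lengths)

print(max_len, median_len, mean_len, min_len)
```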

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

age_flag_nm
Categorical

HIGH CORRELATION 

Distinct: 9
Distinct (%): 9.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value counts:
유아(6-7): 12
초등(8-13): 12
60대이상: 12
20대: 12
영유아(0-5): 11
Other values (4): 41

Length

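Max length: 10
Median length: 8
Mean length: 5.64
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 영유아(0-5)
2nd row: 50대
3rd row: 유아(6-7)
4th row: 유아(6-7)
5th row: 초등(8-13)

Common Values

Value          Count  Frequency (%)
유아(6-7)      12     12.0%
초등(8-13)     12     12.0%
60대이상       12     12.0%
20대           12     12.0%
영유아(0-5)    11     11.0%
50대           11     11.0%
청소년(14-19)  11     11.0%
30대           10     10.0%
40대           9      9.0%

The values are Korean age-group labels: 영유아(0-5) = infants and toddlers (0-5), 유아(6-7) = young children (6-7), 초등(8-13) = elementary school age (8-13), 청소년(14-19) = adolescents (14-19), 20대 to 50대 = 20s to 50s, 60대이상 = 60s and older.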

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

sexdstn_flag_nm
Categorical

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value counts:
남자: 51
여자: 49

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 남자
2nd row: 여자
3rd row: 남자
4th row: 여자
5th row: 남자

Common Values

Value  Count  Frequency (%)
남자   51     51.0%
여자   49     49.0%

The values are Korean labels: 남자 = male, 여자 = female.

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

lon_co
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 36095.2
Minimum: 1244
Maximum: 474241
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1244
5-th percentile: 2454.1
Q1: 5575.75
Median: 11456
Q3: 37086.5
95-th percentile: 123230.55
Maximum: 474241
Range: 472997
Interquartile range (IQR): 31510.75
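The 5th and 95th percentiles of lon_co can be reproduced from the ten smallest and ten largest values listed for this variable, because with 100 observations both percentiles fall between ranks covered by those lists. The sketch assumes linear interpolation between neighboring order statistics (position q * (n - 1)), which the report does not state explicitly. `interpolated_quantile` is a hypothetical helper.

```python
def interpolated_quantile(q, n, value_at_rank):
    """Quantile by linear interpolation; value_at_rank maps 0-based sorted ranks to values."""
    pos = q * (n - 1)
    lo = int(pos)
    frac = pos - lo
    if frac == 0:
        return value_at_rank[lo]
    return value_at_rank[lo] + frac * (value_at_rank[lo + 1] - value_at_rank[lo])

n = 100
# Ten smallest (ascending) and ten largest (descending) lon_co values from this section.
smallest = [1244, 1670, 1909, 2102, 2437, 2455, 2772, 2876, 3025, 3216]
largest = [474241, 391536, 275445, 261094, 136522, 122531, 112470, 111277, 101173, 84250]

# Ranks 0..9 come from the smallest list, ranks 99..90 from the largest list.
ranks = {i: v for i, v in enumerate(smallest)}
ranks.update({n - 1 - i: v for i, v in enumerate(largest)})

p5 = interpolated_quantile(0.05, n, ranks)
p95 = interpolated_quantile(0.95, n, ranks)
print(round(p5, 2), round(p95, 2))
```

The same calculation applied to the extreme values of lon_mber_co and read_qy matches their reported 5th and 95th percentiles as well.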

Descriptive statistics

Standard deviation: 72584.984
Coefficient of variation (CV): 2.0109318
Kurtosis: 19.701627
Mean: 36095.2
Median Absolute Deviation (MAD): 7061
Skewness: 4.1939077
Sum: 3609520
Variance: 5.2685799 × 10^9
Monotonicity: Not monotonic
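Several of the descriptive statistics for lon_co are linked by simple identities: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, and the variance is the square of the standard deviation. The sketch below checks these using the reported (rounded) values, so it compares with a small relative tolerance.

```python
import math

# Reported values for lon_co.
n = 100
total = 3609520
mean = 36095.2
std = 72584.984

mean_from_sum = total / n   # mean = sum / n
cv = std / mean             # coefficient of variation
variance = std ** 2         # variance = std^2

print(mean_from_sum, round(cv, 7), f"{variance:.7e}")
```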
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value              Count  Frequency (%)
1909               1      1.0%
12734              1      1.0%
3025               1      1.0%
1244               1      1.0%
1670               1      1.0%
4831               1      1.0%
7635               1      1.0%
9853               1      1.0%
8047               1      1.0%
42872              1      1.0%
Other values (90)  90     90.0%

Minimum 10 values

Value  Count  Frequency (%)
1244   1      1.0%
1670   1      1.0%
1909   1      1.0%
2102   1      1.0%
2437   1      1.0%
2455   1      1.0%
2772   1      1.0%
2876   1      1.0%
3025   1      1.0%
3216   1      1.0%

Maximum 10 values

Value   Count  Frequency (%)
474241  1      1.0%
391536  1      1.0%
275445  1      1.0%
261094  1      1.0%
136522  1      1.0%
122531  1      1.0%
112470  1      1.0%
111277  1      1.0%
101173  1      1.0%
84250   1      1.0%

lon_mber_co
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4929.08
Minimum: 143
Maximum: 54100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 143
5-th percentile: 281.55
Q1: 900.5
Median: 1754
Q3: 4265.25
95-th percentile: 21597.65
Maximum: 54100
Range: 53957
Interquartile range (IQR): 3364.75

Descriptive statistics

Standard deviation: 8789.6808
Coefficient of variation (CV): 1.7832295
Kurtosis: 14.195331
Mean: 4929.08
Median Absolute Deviation (MAD): 1125.5
Skewness: 3.5073081
Sum: 492908
Variance: 77258488
Monotonicity: Not monotonic
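The spread measures for lon_mber_co can be re-derived from the quantiles listed for it: the range is maximum minus minimum, and the IQR is Q3 minus Q1. The sketch also compares mean and median. A mean well above the median is what one would expect given the strongly positive skewness reported, though that reading is an interpretation added here, not a statement from the report.

```python
# Reported quantiles and mean for lon_mber_co.
minimum, q1, med, q3, maximum = 143, 900.5, 1754, 4265.25, 54100
mean = 4929.08

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
mean_to_median = mean / med       # > 1 suggests a long right tail

print(value_range, iqr, round(mean_to_median, 2))
```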
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value              Count  Frequency (%)
240                1      1.0%
1832               1      1.0%
319                1      1.0%
143                1      1.0%
178                1      1.0%
649                1      1.0%
1058               1      1.0%
1676               1      1.0%
1341               1      1.0%
6113               1      1.0%
Other values (90)  90     90.0%

Minimum 10 values

Value  Count  Frequency (%)
143    1      1.0%
178    1      1.0%
240    1      1.0%
253    1      1.0%
273    1      1.0%
282    1      1.0%
319    1      1.0%
339    1      1.0%
369    1      1.0%
383    1      1.0%

Maximum 10 values

Value  Count  Frequency (%)
54100  1      1.0%
46160  1      1.0%
27486  1      1.0%
26201  1      1.0%
24783  1      1.0%
21430  1      1.0%
20989  1      1.0%
19782  1      1.0%
18692  1      1.0%
14795  1      1.0%

read_qy
Real number (ℝ)

HIGH CORRELATION 

Distinct: 99
Distinct (%): 99.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.40772
Minimum: 3.813
Maximum: 14.761
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 3.813
5-th percentile: 4.04705
Q1: 5.411
Median: 6.8975
Q3: 8.7065
95-th percentile: 13.0322
Maximum: 14.761
Range: 10.948
Interquartile range (IQR): 3.2955

Descriptive statistics

Standard deviation: 2.5957488
Coefficient of variation (CV): 0.3504113
Kurtosis: 0.50705404
Mean: 7.40772
Median Absolute Deviation (MAD): 1.672
Skewness: 0.89471649
Sum: 740.772
Variance: 6.7379118
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value              Count  Frequency (%)
6.901              2      2.0%
7.954              1      1.0%
7.561              1      1.0%
9.483              1      1.0%
8.699              1      1.0%
9.382              1      1.0%
7.444              1      1.0%
7.216              1      1.0%
5.879              1      1.0%
6.001              1      1.0%
Other values (89)  89     89.0%

Minimum 10 values

Value  Count  Frequency (%)
3.813  1      1.0%
3.818  1      1.0%
3.915  1      1.0%
3.973  1      1.0%
4.029  1      1.0%
4.048  1      1.0%
4.156  1      1.0%
4.219  1      1.0%
4.237  1      1.0%
4.402  1      1.0%

Maximum 10 values

Value   Count  Frequency (%)
14.761  1      1.0%
14.345  1      1.0%
13.589  1      1.0%
13.536  1      1.0%
13.492  1      1.0%
13.008  1      1.0%
12.836  1      1.0%
12.772  1      1.0%
11.599  1      1.0%
11.403  1      1.0%

Interactions

[Figures: interaction plots (nine panels)]

Correlations

[Figure: correlation heatmap]

                 anals_trget_mt  area_nm  age_flag_nm  sexdstn_flag_nm  lon_co  lon_mber_co  read_qy
anals_trget_mt   1.000           1.000    0.195        0.000            0.000   0.000        0.000
area_nm          1.000           1.000    0.000        0.000            0.472   0.422        0.445
age_flag_nm      0.195           0.000    1.000        0.000            0.221   0.346        0.799
sexdstn_flag_nm  0.000           0.000    0.000        1.000            0.000   0.133        0.000
lon_co           0.000           0.472    0.221        0.000            1.000   0.979        0.579
lon_mber_co      0.000           0.422    0.346        0.133            0.979   1.000        0.468
read_qy          0.000           0.445    0.799        0.000            0.579   0.468        1.000

[Figure: correlation heatmap]

                 anals_trget_mt  area_nm  sexdstn_flag_nm  age_flag_nm
anals_trget_mt   1.000           0.974    0.000            0.185
area_nm          0.974           1.000    0.000            0.000
sexdstn_flag_nm  0.000           0.000    1.000            0.000
age_flag_nm      0.185           0.000    0.000            1.000

[Figure: correlation heatmap]

                 lon_co  lon_mber_co  read_qy  anals_trget_mt  area_nm  age_flag_nm  sexdstn_flag_nm
lon_co           1.000   0.961        0.042    0.000           0.303    0.112        0.000
lon_mber_co      0.961   1.000        -0.187   0.000           0.241    0.177        0.090
read_qy          0.042   -0.187       1.000    0.000           0.238    0.524        0.000
anals_trget_mt   0.000   0.000        0.000    1.000           0.974    0.185        0.000
area_nm          0.303   0.241        0.238    0.974           1.000    0.000        0.000
age_flag_nm      0.112   0.177        0.524    0.185           0.000    1.000        0.000
sexdstn_flag_nm  0.000   0.090        0.000    0.000           0.000    0.000        1.000
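The 0.974 reported between anals_trget_mt and area_nm is consistent with a bias-corrected Cramér's V. The report does not name the statistic, so that is an assumption. The sketch below computes it from a contingency table built under a second assumption: that all three 충청북도 rows have anals_trget_mt = 12 and every other row has anals_trget_mt = 1, which matches the value counts and the sample rows but is not stated outright. `cramers_v_corrected` is a hypothetical helper.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic (no continuity correction).
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(table), len(table[0])
    phi2 = chi2 / n
    # Bias correction of phi^2 and of the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0

# Rows: anals_trget_mt = "1", then "12".
# Columns: 경상남도, 경상북도, 광주광역시, 경기도, 강원도, 대구광역시, 충청북도.
table = [
    [18, 18, 18, 17, 16, 10, 0],
    [0, 0, 0, 0, 0, 0, 3],
]

v = cramers_v_corrected(table)
print(round(v, 3))
```

Without the bias correction, a perfectly nested pair like this one would score exactly 1. The correction pulls the value slightly below 1 for a sample of 100 rows.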

Missing values

[Figure: nullity by column]
A simple visualization of nullity by column.
[Figure: nullity matrix]
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completeness.

Sample

First rows

index  anals_trget_year  anals_trget_mt  area_nm     age_flag_nm    sexdstn_flag_nm  lon_co  lon_mber_co  read_qy
0      2019              1               강원도      영유아(0-5)    남자             1909    240          7.954
1      2019              12              충청북도    50대           여자             9665    1448         6.675
2      2019              1               강원도      유아(6-7)      남자             2437    282          8.642
3      2019              1               강원도      유아(6-7)      여자             2102    273          7.7
4      2019              1               강원도      초등(8-13)     남자             13338   1606         8.305
5      2019              1               강원도      초등(8-13)     여자             14905   1862         8.005
6      2019              1               강원도      청소년(14-19)  남자             5019    1121         4.477
7      2019              12              충청북도    60대이상       남자             7763    937          8.285
8      2019              1               강원도      20대           남자             4040    904          4.469
9      2019              1               강원도      20대           여자             6616    1503         4.402

Last rows

index  anals_trget_year  anals_trget_mt  area_nm     age_flag_nm    sexdstn_flag_nm  lon_co  lon_mber_co  read_qy
90     2019              1               대구광역시  영유아(0-5)    남자             5875    398          14.761
91     2019              1               대구광역시  영유아(0-5)    여자             4863    339          14.345
92     2019              1               대구광역시  유아(6-7)      남자             10416   772          13.492
93     2019              1               대구광역시  유아(6-7)      여자             8893    657          13.536
94     2019              1               대구광역시  초등(8-13)     남자             50805   4380         11.599
95     2019              1               대구광역시  초등(8-13)     여자             50038   4388         11.403
96     2019              1               대구광역시  청소년(14-19)  남자             12956   2483         5.218
97     2019              1               대구광역시  청소년(14-19)  여자             20096   3905         5.146
98     2019              1               대구광역시  20대           남자             10778   2675         4.029
99     2019              1               대구광역시  20대           여자             20552   4871         4.219
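In the twenty sample rows shown, read_qy equals lon_co divided by lon_mber_co to three decimal places. The report does not state this relationship, so it is an observation from the sample only and may not hold for the other eighty rows. The sketch below checks it with a half-unit rounding tolerance.

```python
# (lon_co, lon_mber_co, read_qy) for the first ten and last ten rows of the sample.
rows = [
    (1909, 240, 7.954), (9665, 1448, 6.675), (2437, 282, 8.642), (2102, 273, 7.7),
    (13338, 1606, 8.305), (14905, 1862, 8.005), (5019, 1121, 4.477), (7763, 937, 8.285),
    (4040, 904, 4.469), (6616, 1503, 4.402),
    (5875, 398, 14.761), (4863, 339, 14.345), (10416, 772, 13.492), (8893, 657, 13.536),
    (50805, 4380, 11.599), (50038, 4388, 11.403), (12956, 2483, 5.218), (20096, 3905, 5.146),
    (10778, 2675, 4.029), (20552, 4871, 4.219),
]

# A row is a mismatch when the ratio differs from read_qy by more than rounding to 3 decimals.
TOLERANCE = 0.0005 + 1e-9
mismatches = [
    (lon_co, members, read_qy)
    for lon_co, members, read_qy in rows
    if abs(lon_co / members - read_qy) > TOLERANCE
]

print(len(rows), len(mismatches))
```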