Overview

Dataset statistics

Number of variables: 8
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.9 KiB
Average record size in memory: 70.3 B

Variable types

Categorical: 5
Numeric: 3

Alerts

anals_trget_year has constant value "2020" (Constant)
anals_trget_mt is highly overall correlated with area_nm (High correlation)
area_nm is highly overall correlated with anals_trget_mt (High correlation)
lon_mber_co is highly overall correlated with all_mber_co (High correlation)
all_mber_co is highly overall correlated with lon_mber_co and 1 other field (High correlation)
read_rt is highly overall correlated with all_mber_co (High correlation)
anals_trget_mt is highly imbalanced (80.6%) (Imbalance)
all_mber_co has unique values (Unique)
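
The 80.6% imbalance figure for anals_trget_mt can be reproduced from the 97/3 split reported for that variable. The sketch below assumes the score is one minus the Shannon entropy of the class frequencies, divided by the maximum entropy for that number of classes. The formula and the helper name `imbalance_score` are assumptions that happen to match the reported value; the report itself does not state how the score is computed.

```python
from math import log2


def imbalance_score(counts):
    """Return 1 - H / H_max for a list of class counts (assumed formula)."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        # A single class has no entropy to normalise against.
        return 0.0
    entropy = -sum((c / total) * log2(c / total) for c in counts if c > 0)
    return 1 - entropy / log2(n_classes)


# Counts taken from the report: value 1 occurs 97 times, value 12 occurs 3 times.
score = imbalance_score([97, 3])
print(f"{score:.1%}")
```

Under the same assumption, the near-even 51/49 split of sexdstn_flag_nm scores close to zero, which would explain why that variable carries no Imbalance alert.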

Reproduction

Analysis started: 2023-12-10 10:14:58.625178
Analysis finished: 2023-12-10 10:15:01.604543
Duration: 2.98 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
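
As a quick consistency check, the reported duration follows from the two timestamps. This is a minimal sketch using only the Python standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2023-12-10 10:14:58.625178")
finished = datetime.fromisoformat("2023-12-10 10:15:01.604543")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```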

Variables

anals_trget_year
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
2020: 100

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020
2nd row: 2020
3rd row: 2020
4th row: 2020
5th row: 2020

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 2020 | 100 | 100.0% |

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; it repeats the Common Values table]

anals_trget_mt
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
1: 97
12: 3

Length

Max length: 2
Median length: 1
Mean length: 1.03
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 12
3rd row: 1
4th row: 1
5th row: 1

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 97 | 97.0% |
| 12 | 3 | 3.0% |

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; it repeats the Common Values table]

area_nm
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
경상남도: 18
경상북도: 18
광주광역시: 18
경기도: 17
강원도: 16
Other values (2): 13

Length

Max length: 5
Median length: 4
Mean length: 3.95
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강원도
2nd row: 충청북도
3rd row: 강원도
4th row: 강원도
5th row: 강원도

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 경상남도 | 18 | 18.0% |
| 경상북도 | 18 | 18.0% |
| 광주광역시 | 18 | 18.0% |
| 경기도 | 17 | 17.0% |
| 강원도 | 16 | 16.0% |
| 대구광역시 | 10 | 10.0% |
| 충청북도 | 3 | 3.0% |

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; it repeats the Common Values table]

age_flag_nm
Categorical

Distinct: 9
Distinct (%): 9.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
유아(6-7): 12
초등(8-13): 12
60대이상: 12
20대: 12
영유아(0-5): 11
Other values (4): 41

Length

Max length: 10
Median length: 8
Mean length: 5.64
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 영유아(0-5)
2nd row: 50대
3rd row: 유아(6-7)
4th row: 유아(6-7)
5th row: 초등(8-13)

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 유아(6-7) | 12 | 12.0% |
| 초등(8-13) | 12 | 12.0% |
| 60대이상 | 12 | 12.0% |
| 20대 | 12 | 12.0% |
| 영유아(0-5) | 11 | 11.0% |
| 50대 | 11 | 11.0% |
| 청소년(14-19) | 11 | 11.0% |
| 30대 | 10 | 10.0% |
| 40대 | 9 | 9.0% |
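
The length statistics for age_flag_nm can be cross-checked against the Common Values counts. The sketch below assumes that "length" means the number of Unicode code points in each label, which is what Python's `len` returns for a string; under that assumption the minimum, maximum and mean match the reported 3, 10 and 5.64. The median is left out because the report does not say how it is aggregated.

```python
# Value counts copied from the Common Values table for age_flag_nm.
value_counts = {
    "유아(6-7)": 12,
    "초등(8-13)": 12,
    "60대이상": 12,
    "20대": 12,
    "영유아(0-5)": 11,
    "50대": 11,
    "청소년(14-19)": 11,
    "30대": 10,
    "40대": 9,
}

n_rows = sum(value_counts.values())
lengths = {label: len(label) for label in value_counts}

min_length = min(lengths.values())
max_length = max(lengths.values())
# Weight each label's length by how many rows carry that label.
mean_length = sum(lengths[label] * count for label, count in value_counts.items()) / n_rows

print(min_length, max_length, round(mean_length, 2))
```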

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; it repeats the Common Values table]

sexdstn_flag_nm
Categorical

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
남자: 51
여자: 49

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 남자
2nd row: 여자
3rd row: 남자
4th row: 여자
5th row: 남자

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 남자 | 51 | 51.0% |
| 여자 | 49 | 49.0% |

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; it repeats the Common Values table]

lon_mber_co
Real number (ℝ)

HIGH CORRELATION 

Distinct: 98
Distinct (%): 98.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4517.23
Minimum: 149
Maximum: 47021
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 149
5-th percentile: 245.95
Q1: 785.75
Median: 1649.5
Q3: 3714.25
95-th percentile: 21714.25
Maximum: 47021
Range: 46872
Interquartile range (IQR): 2928.5

Descriptive statistics

Standard deviation: 7969.3694
Coefficient of variation (CV): 1.764216
Kurtosis: 13.197307
Mean: 4517.23
Median Absolute Deviation (MAD): 1108.5
Skewness: 3.4115903
Sum: 451723
Variance: 63510849
Monotonicity: Not monotonic
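
Several of the lon_mber_co figures are derived from others in the same block, so they can be checked against each other. The sketch below uses the usual definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared, mean = sum / n); the dictionary keys are illustrative names, and small differences are expected because the report rounds its inputs.

```python
from math import isclose

# Figures copied from the statistics reported for lon_mber_co.
reported = {
    "n": 100, "sum": 451723, "mean": 4517.23,
    "min": 149, "max": 47021, "q1": 785.75, "q3": 3714.25,
    "std": 7969.3694,
}

value_range = reported["max"] - reported["min"]   # reported: 46872
iqr = reported["q3"] - reported["q1"]             # reported: 2928.5
cv = reported["std"] / reported["mean"]           # reported: 1.764216
variance = reported["std"] ** 2                   # reported: 63510849
mean_from_sum = reported["sum"] / reported["n"]   # reported: 4517.23

print(value_range, iqr, round(cv, 6), round(variance), mean_from_sum)
```

A mean (4517.23) well above the median (1649.5) is consistent with the strong positive skewness reported for this variable.
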
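[Figure: Histogram with fixed size bins (bins=50)]

Common values

| Value | Count | Frequency (%) |
|---|---|---|
| 1097 | 2 | 2.0% |
| 537 | 2 | 2.0% |
| 182 | 1 | 1.0% |
| 1828 | 1 | 1.0% |
| 149 | 1 | 1.0% |
| 231 | 1 | 1.0% |
| 744 | 1 | 1.0% |
| 1753 | 1 | 1.0% |
| 1368 | 1 | 1.0% |
| 5931 | 1 | 1.0% |
| Other values (88) | 88 | 88.0% |

Minimum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 149 | 1 | 1.0% |
| 182 | 1 | 1.0% |
| 208 | 1 | 1.0% |
| 231 | 1 | 1.0% |
| 245 | 1 | 1.0% |
| 246 | 1 | 1.0% |
| 325 | 1 | 1.0% |
| 341 | 1 | 1.0% |
| 352 | 1 | 1.0% |
| 360 | 1 | 1.0% |

Maximum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 47021 | 1 | 1.0% |
| 43087 | 1 | 1.0% |
| 25058 | 1 | 1.0% |
| 24017 | 1 | 1.0% |
| 23315 | 1 | 1.0% |
| 21630 | 1 | 1.0% |
| 18422 | 1 | 1.0% |
| 17558 | 1 | 1.0% |
| 16944 | 1 | 1.0% |
| 12833 | 1 | 1.0% |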

all_mber_co
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 91588.43
Minimum: 239
Maximum: 886851
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 239
5-th percentile: 760.35
Q1: 11544.25
Median: 28754
Q3: 61825.75
95-th percentile: 564609.7
Maximum: 886851
Range: 886612
Interquartile range (IQR): 50281.5

Descriptive statistics

Standard deviation: 181144.62
Coefficient of variation (CV): 1.9778112
Kurtosis: 9.1447058
Mean: 91588.43
Median Absolute Deviation (MAD): 21328.5
Skewness: 3.0952621
Sum: 9158843
Variance: 3.2813374 × 10^10
Monotonicity: Not monotonic
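[Figure: Histogram with fixed size bins (bins=50)]

Common values

| Value | Count | Frequency (%) |
|---|---|---|
| 873 | 1 | 1.0% |
| 38033 | 1 | 1.0% |
| 766 | 1 | 1.0% |
| 239 | 1 | 1.0% |
| 292 | 1 | 1.0% |
| 11619 | 1 | 1.0% |
| 12408 | 1 | 1.0% |
| 33037 | 1 | 1.0% |
| 22193 | 1 | 1.0% |
| 63381 | 1 | 1.0% |
| Other values (90) | 90 | 90.0% |

Minimum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 239 | 1 | 1.0% |
| 292 | 1 | 1.0% |
| 596 | 1 | 1.0% |
| 637 | 1 | 1.0% |
| 653 | 1 | 1.0% |
| 766 | 1 | 1.0% |
| 873 | 1 | 1.0% |
| 1021 | 1 | 1.0% |
| 1174 | 1 | 1.0% |
| 1207 | 1 | 1.0% |

Maximum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 886851 | 1 | 1.0% |
| 838395 | 1 | 1.0% |
| 765872 | 1 | 1.0% |
| 682848 | 1 | 1.0% |
| 667755 | 1 | 1.0% |
| 559181 | 1 | 1.0% |
| 459320 | 1 | 1.0% |
| 363170 | 1 | 1.0% |
| 327289 | 1 | 1.0% |
| 228854 | 1 | 1.0% |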

read_rt
Real number (ℝ)

HIGH CORRELATION 

Distinct: 99
Distinct (%): 99.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.4559
Minimum: 1.405
Maximum: 79.11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1.405
5-th percentile: 2.02575
Q1: 4.062
Median: 6.1715
Q3: 14.72075
95-th percentile: 41.41195
Maximum: 79.11
Range: 77.705
Interquartile range (IQR): 10.65875

Descriptive statistics

Standard deviation: 14.270117
Coefficient of variation (CV): 1.1456512
Kurtosis: 6.1783564
Mean: 12.4559
Median Absolute Deviation (MAD): 3.5735
Skewness: 2.3718639
Sum: 1245.59
Variance: 203.63624
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

| Value | Count | Frequency (%) |
|---|---|---|
| 5.608 | 2 | 2.0% |
| 20.848 | 1 | 1.0% |
| 6.936 | 1 | 1.0% |
| 46.997 | 1 | 1.0% |
| 62.343 | 1 | 1.0% |
| 79.11 | 1 | 1.0% |
| 6.403 | 1 | 1.0% |
| 8.841 | 1 | 1.0% |
| 5.306 | 1 | 1.0% |
| 6.164 | 1 | 1.0% |
| Other values (89) | 89 | 89.0% |

Minimum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 1.405 | 1 | 1.0% |
| 1.496 | 1 | 1.0% |
| 1.604 | 1 | 1.0% |
| 1.722 | 1 | 1.0% |
| 1.888 | 1 | 1.0% |
| 2.033 | 1 | 1.0% |
| 2.352 | 1 | 1.0% |
| 2.438 | 1 | 1.0% |
| 2.484 | 1 | 1.0% |
| 2.492 | 1 | 1.0% |

Maximum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 79.11 | 1 | 1.0% |
| 62.343 | 1 | 1.0% |
| 55.259 | 1 | 1.0% |
| 49.77 | 1 | 1.0% |
| 46.997 | 1 | 1.0% |
| 41.118 | 1 | 1.0% |
| 41.107 | 1 | 1.0% |
| 39.934 | 1 | 1.0% |
| 38.386 | 1 | 1.0% |
| 35.459 | 1 | 1.0% |

Interactions

[Figures: 9 interaction plots]

Correlations

[Figure: correlation heatmap]

| | anals_trget_mt | area_nm | age_flag_nm | sexdstn_flag_nm | lon_mber_co | all_mber_co | read_rt |
|---|---|---|---|---|---|---|---|
| anals_trget_mt | 1.000 | 1.000 | 0.195 | 0.000 | 0.000 | 0.000 | 0.000 |
| area_nm | 1.000 | 1.000 | 0.000 | 0.000 | 0.604 | 0.426 | 0.374 |
| age_flag_nm | 0.195 | 0.000 | 1.000 | 0.000 | 0.161 | 0.254 | 0.807 |
| sexdstn_flag_nm | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 |
| lon_mber_co | 0.000 | 0.604 | 0.161 | 0.000 | 1.000 | 0.871 | 0.000 |
| all_mber_co | 0.000 | 0.426 | 0.254 | 0.000 | 0.871 | 1.000 | 0.000 |
| read_rt | 0.000 | 0.374 | 0.807 | 0.000 | 0.000 | 0.000 | 1.000 |

[Figure: correlation heatmap]

| | sexdstn_flag_nm | age_flag_nm | anals_trget_mt | area_nm |
|---|---|---|---|---|
| sexdstn_flag_nm | 1.000 | 0.000 | 0.000 | 0.000 |
| age_flag_nm | 0.000 | 1.000 | 0.185 | 0.000 |
| anals_trget_mt | 0.000 | 0.185 | 1.000 | 0.974 |
| area_nm | 0.000 | 0.000 | 0.974 | 1.000 |
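
The 0.974 between anals_trget_mt and area_nm is the value a bias-corrected Cramér's V gives when month 12 occurs only in 충청북도. The sketch below rests on two assumptions: that this matrix holds bias-corrected Cramér's V (the extracted text does not name the method), and that all three 충청북도 rows are the three month-12 rows (the two visible in the sample are, and the counts match). The function name `cramers_v_corrected` is illustrative.

```python
from math import sqrt


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic, without continuity correction.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(table), len(table[0])
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


# Columns: 경상남도, 경상북도, 광주광역시, 경기도, 강원도, 대구광역시, 충청북도
table = [
    [18, 18, 18, 17, 16, 10, 0],  # anals_trget_mt == 1
    [0, 0, 0, 0, 0, 0, 3],        # anals_trget_mt == 12 (assumed to be all 충청북도)
]

v = cramers_v_corrected(table)
print(round(v, 3))
```

Without the bias correction this table would give exactly 1.0; the correction is what pulls a perfect association in a small sample down to 0.974.
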

[Figure: correlation heatmap]

| | lon_mber_co | all_mber_co | read_rt | anals_trget_mt | area_nm | age_flag_nm | sexdstn_flag_nm |
|---|---|---|---|---|---|---|---|
| lon_mber_co | 1.000 | 0.819 | -0.249 | 0.000 | 0.249 | 0.086 | 0.000 |
| all_mber_co | 0.819 | 1.000 | -0.692 | 0.000 | 0.228 | 0.119 | 0.000 |
| read_rt | -0.249 | -0.692 | 1.000 | 0.000 | 0.203 | 0.381 | 0.000 |
| anals_trget_mt | 0.000 | 0.000 | 0.000 | 1.000 | 0.974 | 0.185 | 0.000 |
| area_nm | 0.249 | 0.228 | 0.203 | 0.974 | 1.000 | 0.000 | 0.000 |
| age_flag_nm | 0.086 | 0.119 | 0.381 | 0.185 | 0.000 | 1.000 | 0.000 |
| sexdstn_flag_nm | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 |
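
The High correlation alerts in the Overview line up with the last matrix above once a cut-off is applied to the absolute values. The sketch below assumes a threshold of 0.5; that value is an assumption chosen because it reproduces the alerts, and the report text does not state it.

```python
HIGH_CORRELATION_THRESHOLD = 0.5  # assumed cut-off, not stated in the report

columns = [
    "lon_mber_co", "all_mber_co", "read_rt", "anals_trget_mt",
    "area_nm", "age_flag_nm", "sexdstn_flag_nm",
]

# Values copied row by row from the last correlation matrix.
matrix = [
    [1.000, 0.819, -0.249, 0.000, 0.249, 0.086, 0.000],
    [0.819, 1.000, -0.692, 0.000, 0.228, 0.119, 0.000],
    [-0.249, -0.692, 1.000, 0.000, 0.203, 0.381, 0.000],
    [0.000, 0.000, 0.000, 1.000, 0.974, 0.185, 0.000],
    [0.249, 0.228, 0.203, 0.974, 1.000, 0.000, 0.000],
    [0.086, 0.119, 0.381, 0.185, 0.000, 1.000, 0.000],
    [0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 1.000],
]

# Walk the upper triangle only, so each pair is reported once.
flagged = [
    (columns[i], columns[j], matrix[i][j])
    for i in range(len(columns))
    for j in range(i + 1, len(columns))
    if abs(matrix[i][j]) >= HIGH_CORRELATION_THRESHOLD
]

for left, right, value in flagged:
    print(f"{left} <-> {right}: {value:+.3f}")
```

The three flagged pairs are the same three relationships named in the alerts: lon_mber_co with all_mber_co, all_mber_co with read_rt, and anals_trget_mt with area_nm.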

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

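First rows

| | anals_trget_year | anals_trget_mt | area_nm | age_flag_nm | sexdstn_flag_nm | lon_mber_co | all_mber_co | read_rt |
|---|---|---|---|---|---|---|---|---|
| 0 | 2020 | 1 | 강원도 | 영유아(0-5) | 남자 | 182 | 873 | 20.848 |
| 1 | 2020 | 12 | 충청북도 | 50대 | 여자 | 814 | 32660 | 2.492 |
| 2 | 2020 | 1 | 강원도 | 유아(6-7) | 남자 | 246 | 1243 | 19.791 |
| 3 | 2020 | 1 | 강원도 | 유아(6-7) | 여자 | 208 | 1021 | 20.372 |
| 4 | 2020 | 1 | 강원도 | 초등(8-13) | 남자 | 1515 | 11320 | 13.383 |
| 5 | 2020 | 1 | 강원도 | 초등(8-13) | 여자 | 1775 | 9675 | 18.346 |
| 6 | 2020 | 1 | 강원도 | 청소년(14-19) | 남자 | 987 | 18049 | 5.468 |
| 7 | 2020 | 12 | 충청북도 | 60대이상 | 남자 | 537 | 12458 | 4.31 |
| 8 | 2020 | 1 | 강원도 | 20대 | 남자 | 793 | 33721 | 2.352 |
| 9 | 2020 | 1 | 강원도 | 20대 | 여자 | 1386 | 44900 | 3.087 |

Last rows

| | anals_trget_year | anals_trget_mt | area_nm | age_flag_nm | sexdstn_flag_nm | lon_mber_co | all_mber_co | read_rt |
|---|---|---|---|---|---|---|---|---|
| 90 | 2020 | 1 | 대구광역시 | 영유아(0-5) | 남자 | 325 | 653 | 49.77 |
| 91 | 2020 | 1 | 대구광역시 | 영유아(0-5) | 여자 | 352 | 637 | 55.259 |
| 92 | 2020 | 1 | 대구광역시 | 유아(6-7) | 남자 | 634 | 1788 | 35.459 |
| 93 | 2020 | 1 | 대구광역시 | 유아(6-7) | 여자 | 623 | 1623 | 38.386 |
| 94 | 2020 | 1 | 대구광역시 | 초등(8-13) | 남자 | 3614 | 18420 | 19.62 |
| 95 | 2020 | 1 | 대구광역시 | 초등(8-13) | 여자 | 3820 | 17948 | 21.284 |
| 96 | 2020 | 1 | 대구광역시 | 청소년(14-19) | 남자 | 1862 | 21435 | 8.687 |
| 97 | 2020 | 1 | 대구광역시 | 청소년(14-19) | 여자 | 3083 | 26881 | 11.469 |
| 98 | 2020 | 1 | 대구광역시 | 20대 | 남자 | 1899 | 37890 | 5.012 |
| 99 | 2020 | 1 | 대구광역시 | 20대 | 여자 | 3679 | 81396 | 4.52 |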
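
In the sample rows, read_rt appears to be lon_mber_co expressed as a percentage of all_mber_co. That relationship is inferred from the numbers and is not stated anywhere in the report, so the sketch below treats it as an assumption and simply checks it against the twenty visible rows. It would also explain why read_rt correlates negatively with all_mber_co.

```python
# (lon_mber_co, all_mber_co, read_rt) copied from the first and last sample rows.
sample_rows = [
    (182, 873, 20.848), (814, 32660, 2.492), (246, 1243, 19.791),
    (208, 1021, 20.372), (1515, 11320, 13.383), (1775, 9675, 18.346),
    (987, 18049, 5.468), (537, 12458, 4.31), (793, 33721, 2.352),
    (1386, 44900, 3.087), (325, 653, 49.77), (352, 637, 55.259),
    (634, 1788, 35.459), (623, 1623, 38.386), (3614, 18420, 19.62),
    (3820, 17948, 21.284), (1862, 21435, 8.687), (3083, 26881, 11.469),
    (1899, 37890, 5.012), (3679, 81396, 4.52),
]

# Largest gap between the reported rate and 100 * lon / all over the sample.
max_abs_error = max(
    abs(100 * lon / total - read_rt) for lon, total, read_rt in sample_rows
)
print(f"largest deviation: {max_abs_error:.4f} percentage points")
```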