Overview

Dataset statistics

Number of variables: 7
Number of observations: 66
Missing cells: 24
Missing cells (%): 5.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.2 KiB
Average record size in memory: 65.0 B
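
The missing-cell percentage follows from the other figures in this block. A minimal sketch of the arithmetic, assuming the percentage is taken over all cells (variables times observations); the variable names are illustrative only:

```python
# Values copied from the dataset statistics in this report.
n_variables = 7
n_observations = 66
missing_cells = 24

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```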

Variable types

Categorical: 4
Numeric: 3

Dataset

Description: Sample
Author: Kyungpook National University Industry-Academic Cooperation Foundation (경북대학교 산학협력단)
URL: https://www.bigdata-coast.kr/gdsInfo/gdsInfoDetail.do?gdsCd=CT02KNU001

Alerts

High correlation: PRSR_REVISN_VAL is highly overall correlated with WTEM_REVISN_VAL
High correlation: WTEM_REVISN_VAL is highly overall correlated with PRSR_REVISN_VAL
Imbalance: CYCLE_NO is highly imbalanced (88.7%)
Imbalance: LA is highly imbalanced (88.7%)
Imbalance: LO is highly imbalanced (88.7%)
Imbalance: YMD is highly imbalanced (88.7%)
Missing: SLNTY_REVISN_VAL has 12 (18.2%) missing values
Missing: WTEM_REVISN_VAL has 12 (18.2%) missing values
Unique: PRSR_REVISN_VAL has unique values
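
The 88.7% imbalance figure is consistent with a score of one minus the normalized Shannon entropy of the category counts (65 and 1, as listed for each flagged variable). That the profiler derives the score this way is an assumption; the function name below is hypothetical, and the sketch only shows that the formula reproduces the reported number:

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the counts.

    Assumed formula, not confirmed by the report: 0 means perfectly
    balanced classes, values near 1 mean one class holds almost every row.
    """
    n_classes = len(counts)
    if n_classes < 2:
        # The score is not defined for a single class; return 0.0 here.
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(n_classes)


# Counts shared by CYCLE_NO, LA, LO and YMD: 65 rows of "<NA>", 1 real value.
score = imbalance_score([65, 1])
print(f"{100 * score:.1f}%")
```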

Reproduction

Analysis started: 2024-03-13 12:52:26.115310
Analysis finished: 2024-03-13 12:52:28.299656
Duration: 2.18 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

CYCLE_NO
Categorical

Alerts: Imbalance

Distinct: 2
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 660.0 B

Value   Count
<NA>    65
1       1

Length

Max length: 4
Median length: 4
Mean length: 3.9545455
Min length: 1

Unique

Unique: 1
Unique (%): 1.5%

Sample

1st row: 1
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value   Count   Frequency (%)
<NA>    65      98.5%
1       1       1.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; legend: na 65 (98.5%), 1 1 (1.5%)]
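
The length statistics suggest why the report shows 0 missing values for CYCLE_NO even though 65 of 66 rows read <NA>: the placeholder appears to have been loaded as a literal four-character string. A small check, assuming that interpretation, reproduces the reported mean length:

```python
# Category strings and their counts, as listed under Common Values.
value_counts = {"<NA>": 65, "1": 1}

total_rows = sum(value_counts.values())
total_chars = sum(len(value) * count for value, count in value_counts.items())
mean_length = total_chars / total_rows

print(round(mean_length, 7))
```

If these placeholders are meant to count as missing, they would likely need to be converted to real null values before profiling.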

LA
Categorical

Alerts: Imbalance

Distinct: 2
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 660.0 B

Value   Count
<NA>    65
38.52   1

Length

Max length: 5
Median length: 4
Mean length: 4.0151515
Min length: 4

Unique

Unique: 1
Unique (%): 1.5%

Sample

1st row: 38.52
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value   Count   Frequency (%)
<NA>    65      98.5%
38.52   1       1.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; legend: na 65 (98.5%), 38.52 1 (1.5%)]

LO
Categorical

Alerts: Imbalance

Distinct: 2
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 660.0 B

Value   Count
<NA>    65
129.8   1

Length

Max length: 5
Median length: 4
Mean length: 4.0151515
Min length: 4

Unique

Unique: 1
Unique (%): 1.5%

Sample

1st row: 129.8
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value   Count   Frequency (%)
<NA>    65      98.5%
129.8   1       1.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; legend: na 65 (98.5%), 129.8 1 (1.5%)]

PRSR_REVISN_VAL
Real number (ℝ)

Alerts: High correlation, Unique

Distinct: 66
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 374.36364
Minimum: 1
Maximum: 825
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 726.0 B

Quantile statistics

Minimum: 1
5-th percentile: 4.25
Q1: 78.25
Median: 300.5
Q3: 706.75
95-th percentile: 821.75
Maximum: 825
Range: 824
Interquartile range (IQR): 628.5

Descriptive statistics

Standard deviation: 315.1711
Coefficient of variation (CV): 0.84188491
Kurtosis: -1.5553842
Mean: 374.36364
Median Absolute Deviation (MAD): 280
Skewness: 0.2741635
Sum: 24708
Variance: 99332.82
Monotonicity: Strictly increasing
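
Several of the PRSR_REVISN_VAL statistics are derived from others in the same block. The sketch below recomputes them from the reported values as a consistency check; it assumes the usual definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared), and small differences are expected because the inputs are already rounded:

```python
# Values copied from the quantile and descriptive statistics of PRSR_REVISN_VAL.
minimum, maximum = 1, 825
q1, q3 = 78.25, 706.75
mean, std = 374.36364, 315.1711

value_range = maximum - minimum   # reported: 824
iqr = q3 - q1                     # reported: 628.5
cv = std / mean                   # reported: 0.84188491
variance = std ** 2               # reported: 99332.82

print(value_range, iqr, round(cv, 6), round(variance, 2))
```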
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count   Frequency (%)
1                   1       1.5%
738                 1       1.5%
388                 1       1.5%
413                 1       1.5%
438                 1       1.5%
463                 1       1.5%
488                 1       1.5%
512                 1       1.5%
538                 1       1.5%
563                 1       1.5%
Other values (56)   56      84.8%

Minimum 10 values

Value   Count   Frequency (%)
1       1       1.5%
2       1       1.5%
3       1       1.5%
4       1       1.5%
5       1       1.5%
6       1       1.5%
7       1       1.5%
8       1       1.5%
9       1       1.5%
10      1       1.5%

Maximum 10 values

Value   Count   Frequency (%)
825     1       1.5%
824     1       1.5%
823     1       1.5%
822     1       1.5%
821     1       1.5%
820     1       1.5%
819     1       1.5%
818     1       1.5%
817     1       1.5%
815     1       1.5%

SLNTY_REVISN_VAL
Real number (ℝ)

Alerts: Missing

Distinct: 51
Distinct (%): 94.4%
Missing: 12
Missing (%): 18.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 33.921745
Minimum: 32.569286
Maximum: 34.41819
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 726.0 B

Quantile statistics

Minimum: 32.569286
5-th percentile: 32.745476
Q1: 34.027235
Median: 34.062225
Q3: 34.175447
95-th percentile: 34.322198
Maximum: 34.41819
Range: 1.8489037
Interquartile range (IQR): 0.14821243

Descriptive statistics

Standard deviation: 0.45749785
Coefficient of variation (CV): 0.013486861
Kurtosis: 2.6328905
Mean: 33.921745
Median Absolute Deviation (MAD): 0.074491501
Skewness: -1.8656865
Sum: 1831.7742
Variance: 0.20930428
Monotonicity: Not monotonic
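
The two percentages reported for SLNTY_REVISN_VAL appear to use different denominators: the missing share matches a calculation over all 66 observations, while the distinct share matches one over the 54 non-missing values only. A brief check, assuming those denominators:

```python
# Counts copied from the SLNTY_REVISN_VAL summary and the dataset statistics.
n_observations = 66
n_missing = 12
n_distinct = 51

n_present = n_observations - n_missing           # non-missing values
missing_pct = 100 * n_missing / n_observations   # share of all rows
distinct_pct = 100 * n_distinct / n_present      # share of non-missing rows

print(f"missing {missing_pct:.1f}%, distinct {distinct_pct:.1f}%")
```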
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count   Frequency (%)
34.0692253112793    2       3.0%
34.068229675293     2       3.0%
34.0612297058105    2       3.0%
34.0442276000977    1       1.5%
34.1302185058594    1       1.5%
34.0962219238281    1       1.5%
34.0422210693359    1       1.5%
34.0302276611328    1       1.5%
34.0142211914062    1       1.5%
34.0202331542969    1       1.5%
Other values (41)   41      62.1%
(Missing)           12      18.2%

Minimum 10 values

Value               Count   Frequency (%)
32.5692863464355    1       1.5%
32.6132736206055    1       1.5%
32.6232833862305    1       1.5%
32.8112716674805    1       1.5%
33.1172409057617    1       1.5%
33.1492500305176    1       1.5%
33.2202453613281    1       1.5%
33.3862419128418    1       1.5%
33.5482330322266    1       1.5%
33.5932388305664    1       1.5%

Maximum 10 values

Value               Count   Frequency (%)
34.4181900024414    1       1.5%
34.4151954650879    1       1.5%
34.3481941223145    1       1.5%
34.3082008361816    1       1.5%
34.2942085266113    1       1.5%
34.2522125244141    1       1.5%
34.2252044677734    1       1.5%
34.2192039489746    1       1.5%
34.2082023620605    1       1.5%
34.1982078552246    1       1.5%

WTEM_REVISN_VAL
Real number (ℝ)

Alerts: High correlation, Missing

Distinct: 54
Distinct (%): 100.0%
Missing: 12
Missing (%): 18.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.6611851
Minimum: 0.43799999
Maximum: 24.771999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 726.0 B

Quantile statistics

Minimum: 0.43799999
5-th percentile: 0.52089998
Q1: 0.8505
Median: 7.1919999
Q3: 14.9735
95-th percentile: 24.106699
Maximum: 24.771999
Range: 24.333999
Interquartile range (IQR): 14.123

Descriptive statistics

Standard deviation: 8.5887251
Coefficient of variation (CV): 0.99163393
Kurtosis: -0.94157401
Mean: 8.6611851
Median Absolute Deviation (MAD): 6.3979999
Skewness: 0.73362845
Sum: 467.704
Variance: 73.766198
Monotonicity: Strictly decreasing
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count   Frequency (%)
0.806999981403351   1       1.5%
4.01399993896484    1       1.5%
3.08899998664856    1       1.5%
2.37899994850159    1       1.5%
1.66600000858307    1       1.5%
1.34500002861023    1       1.5%
1.19599997997284    1       1.5%
1.09500002861023    1       1.5%
0.995999991893768   1       1.5%
0.924000024795532   1       1.5%
Other values (44)   44      66.7%
(Missing)           12      18.2%

Minimum 10 values

Value               Count   Frequency (%)
0.437999993562698   1       1.5%
0.477999985218048   1       1.5%
0.504000008106232   1       1.5%
0.529999971389771   1       1.5%
0.561999976634979   1       1.5%
0.578000009059906   1       1.5%
0.606000006198883   1       1.5%
0.635999977588654   1       1.5%
0.671999990940094   1       1.5%
0.714999973773956   1       1.5%

Maximum 10 values

Value               Count   Frequency (%)
24.7719993591309    1       1.5%
24.5289993286133    1       1.5%
24.3419990539551    1       1.5%
23.9799995422363    1       1.5%
23.7269992828369    1       1.5%
23.2180004119873    1       1.5%
22.1709995269775    1       1.5%
21.4629993438721    1       1.5%
21.2980003356934    1       1.5%
21.2639999389648    1       1.5%

YMD
Categorical

Alerts: Imbalance

Distinct: 2
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 660.0 B

Value      Count
<NA>       65
20180719   1

Length

Max length: 8
Median length: 4
Mean length: 4.0606061
Min length: 4

Unique

Unique: 1
Unique (%): 1.5%

Sample

1st row: 20180719
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value      Count   Frequency (%)
<NA>       65      98.5%
20180719   1       1.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; legend: na 65 (98.5%), 20180719 1 (1.5%)]

Interactions

[Figures: nine pairwise interaction plots]

Correlations

[Figure: correlation heatmap]

                   PRSR_REVISN_VAL   SLNTY_REVISN_VAL   WTEM_REVISN_VAL
PRSR_REVISN_VAL    1.000             0.410              0.835
SLNTY_REVISN_VAL   0.410             1.000              0.726
WTEM_REVISN_VAL    0.835             0.726              1.000

[Figure: correlation heatmap]

           LO      CYCLE_NO   YMD     LA
LO         1.000   NaN        NaN     NaN
CYCLE_NO   NaN     1.000      NaN     NaN
YMD        NaN     NaN        1.000   NaN
LA         NaN     NaN        NaN     1.000

[Figure: correlation heatmap]

                   PRSR_REVISN_VAL   SLNTY_REVISN_VAL   WTEM_REVISN_VAL   CYCLE_NO   LA      LO      YMD
PRSR_REVISN_VAL    1.000             0.223              -1.000            NaN        NaN     NaN     NaN
SLNTY_REVISN_VAL   0.223             1.000              -0.223            NaN        NaN     NaN     NaN
WTEM_REVISN_VAL    -1.000            -0.223             1.000             NaN        NaN     NaN     NaN
CYCLE_NO           NaN               NaN                NaN               1.000      NaN     NaN     NaN
LA                 NaN               NaN                NaN               NaN        1.000   NaN     NaN
LO                 NaN               NaN                NaN               NaN        NaN     1.000   NaN
YMD                NaN               NaN                NaN               NaN        NaN     NaN     1.000
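
The -1.000 entry between PRSR_REVISN_VAL and WTEM_REVISN_VAL in the last matrix is what a rank correlation yields when one series is strictly increasing and the other strictly decreasing, as the monotonicity fields for these two variables report. Which coefficient each matrix uses is not stated in this text, so treating it as Spearman's rho is an assumption. The sketch below applies the no-ties Spearman formula to the first ten rows listed in the Sample section; the function and variable names are illustrative:

```python
def spearman_rho(xs, ys):
    """Spearman rank correlation for equal-length sequences without ties."""
    n = len(xs)

    def ranks(values):
        # Rank 1 goes to the smallest value; assumes all values are distinct.
        order = sorted(range(n), key=lambda i: values[i])
        result = [0] * n
        for rank, index in enumerate(order, start=1):
            result[index] = rank
        return result

    rx, ry = ranks(xs), ranks(ys)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# First ten rows of PRSR_REVISN_VAL and WTEM_REVISN_VAL from the Sample section.
prsr = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
wtem = [24.771999, 24.528999, 24.341999, 23.98, 23.726999,
        23.218, 22.171, 21.462999, 21.298, 21.264]

rho = spearman_rho(prsr, wtem)
print(rho)
```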

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows

     CYCLE_NO   LA      LO      PRSR_REVISN_VAL   SLNTY_REVISN_VAL   WTEM_REVISN_VAL   YMD
0    1          38.52   129.8   1                 32.613274          24.771999         20180719
1    <NA>       <NA>    <NA>    2                 32.623283          24.528999         <NA>
2    <NA>       <NA>    <NA>    3                 32.569286          24.341999         <NA>
3    <NA>       <NA>    <NA>    4                 32.811272          23.98             <NA>
4    <NA>       <NA>    <NA>    5                 33.117241          23.726999         <NA>
5    <NA>       <NA>    <NA>    6                 33.14925           23.218            <NA>
6    <NA>       <NA>    <NA>    7                 33.220245          22.171            <NA>
7    <NA>       <NA>    <NA>    8                 33.386242          21.462999         <NA>
8    <NA>       <NA>    <NA>    9                 33.548233          21.298            <NA>
9    <NA>       <NA>    <NA>    10                33.593239          21.264            <NA>

Last rows

     CYCLE_NO   LA      LO      PRSR_REVISN_VAL   SLNTY_REVISN_VAL   WTEM_REVISN_VAL   YMD
56   <NA>       <NA>    <NA>    815               <NA>               <NA>              <NA>
57   <NA>       <NA>    <NA>    817               <NA>               <NA>              <NA>
58   <NA>       <NA>    <NA>    818               <NA>               <NA>              <NA>
59   <NA>       <NA>    <NA>    819               <NA>               <NA>              <NA>
60   <NA>       <NA>    <NA>    820               <NA>               <NA>              <NA>
61   <NA>       <NA>    <NA>    821               <NA>               <NA>              <NA>
62   <NA>       <NA>    <NA>    822               <NA>               <NA>              <NA>
63   <NA>       <NA>    <NA>    823               <NA>               <NA>              <NA>
64   <NA>       <NA>    <NA>    824               <NA>               <NA>              <NA>
65   <NA>       <NA>    <NA>    825               <NA>               <NA>              <NA>