Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 507.8 KiB
Average record size in memory: 52.0 B
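
The average record size is the total memory footprint divided by the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes; the function name is illustrative and not part of the report:

```python
def average_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Return the mean in-memory size per row, in bytes (1 KiB = 1024 B assumed)."""
    return total_kib * 1024 / n_rows

# Figures taken from the dataset statistics of this report.
size = average_record_size_bytes(507.8, 10_000)
print(f"{size:.1f} B")
```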

Variable types

Categorical: 2
Numeric: 3

Dataset

Description: Sample
Author: 오아시스비즈니스
URL: https://www.bigdata-realestate.kr/rebpp/usr/prd/prdInfoDetail.do?req_productId=68

Alerts

data_strd_ym has constant value "202306" (Constant)
pnu is highly overall correlated with legaldong_cd (High correlation)
legaldong_cd is highly overall correlated with pnu (High correlation)
sopsrt_spl_dims is highly skewed (γ1 = 41.28143173) (Skewed)
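
The skewness alert refers to the coefficient γ1. The sketch below computes a bias-adjusted Fisher-Pearson skewness on small made-up lists; that the report uses exactly this estimator is an assumption, and the function name and data are hypothetical.

```python
import math

def sample_skewness(values):
    """Bias-adjusted Fisher-Pearson skewness (assumed estimator, see lead-in)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant data has no asymmetry
    g1 = m3 / m2 ** 1.5
    return g1 * math.sqrt(n * (n - 1)) / (n - 2)

# Hypothetical data: a symmetric list and one with a single large value.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [0.1, 0.2, 0.3, 0.5, 100.0]
print(sample_skewness(symmetric), sample_skewness(right_tailed))
```

A positive value indicates a long right tail, which is the pattern the sopsrt_spl_dims quantiles show (median 1.9, maximum 37446.75).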

Reproduction

Analysis started: 2023-12-11 22:32:28.903887
Analysis finished: 2023-12-11 22:32:31.835603
Duration: 2.93 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
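
The reported duration can be recomputed from the two timestamps. A short sketch using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2023-12-11 22:32:28.903887")
finished = datetime.fromisoformat("2023-12-11 22:32:31.835603")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```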

Variables

data_strd_ym
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
202306: 10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202306
2nd row: 202306
3rd row: 202306
4th row: 202306
5th row: 202306

Common Values

Value   Count  Frequency (%)
202306  10000  100.0%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: common values plot (202306: 10000 rows, 100.0%)

pnu
Real number (ℝ)

HIGH CORRELATION 

Distinct: 8116
Distinct (%): 81.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.7105373 × 10^18
Minimum: 1.1110101 × 10^18
Maximum: 5.013032 × 10^18
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1.1110101 × 10^18
5-th percentile: 1.144012 × 10^18
Q1: 2.9200109 × 10^18
Median: 4.1480102 × 10^18
Q3: 4.5130134 × 10^18
95-th percentile: 4.833034 × 10^18
Maximum: 5.013032 × 10^18
Range: 3.9020219 × 10^18
Interquartile range (IQR): 1.5930025 × 10^18

Descriptive statistics

Standard deviation: 1.1552794 × 10^18
Coefficient of variation (CV): 0.31135096
Kurtosis: 0.1883042
Mean: 3.7105373 × 10^18
Median Absolute Deviation (MAD): 5.3601487 × 10^17
Skewness: -1.157321
Sum: 8.9703299 × 10^18
Variance: 1.3346704 × 10^36
Monotonicity: Not monotonic
Figure: Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
4111710200106050000  11     0.1%
4136011100109750001  10     0.1%
4113310100135020000  8      0.1%
4817013700102250001  8      0.1%
1123010400105910053  8      0.1%
4128110500109500000  8      0.1%
4311311300122700000  8      0.1%
4136011200161430000  8      0.1%
4111313600106530000  8      0.1%
4136011100110050000  8      0.1%
Other values (8106)  9915   99.2%

Value                Count  Frequency (%)
1111010100100520146  1      < 0.1%
1111011000101520000  1      < 0.1%
1111011000101530000  2      < 0.1%
1111011100100470449  1      < 0.1%
1111012100100010102  3      < 0.1%
1111012100101710000  3      < 0.1%
1111013100101870005  1      < 0.1%
1111013200100080001  1      < 0.1%
1111013500100070018  1      < 0.1%
1111013500100190022  1      < 0.1%

Value                Count  Frequency (%)
5013032023114120005  1      < 0.1%
5013032022113080000  1      < 0.1%
5013032021108390007  1      < 0.1%
5013032021107740001  1      < 0.1%
5013032021106330002  2      < 0.1%
5013032021105550003  1      < 0.1%
5013032021105440007  2      < 0.1%
5013032021105000004  1      < 0.1%
5013031030104800025  1      < 0.1%
5013031028108650000  1      < 0.1%
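
The Sum reported for pnu (8.9703299 × 10^18) is far smaller than the Mean multiplied by the 10000 observations (about 3.71 × 10^22). One possible explanation, offered here as an assumption rather than something the report states, is that the sum was accumulated in a 64-bit integer and wrapped around modulo 2^64. The sketch below checks whether the numbers are consistent with that reading; the constant names are illustrative.

```python
N_ROWS = 10_000
MEAN_PNU = 37_105_373 * 10**11   # 3.7105373e18, rounded as shown in the report
REPORTED_SUM = 8.9703299e18      # Sum as shown in the report

true_sum = MEAN_PNU * N_ROWS     # exact, Python integers do not overflow
wrapped_sum = true_sum % 2**64   # value a 64-bit accumulator would hold (assumption)

print(f"mean * n  = {true_sum:.7e}")
print(f"mod 2**64 = {wrapped_sum:.7e}")
print(f"reported  = {REPORTED_SUM:.7e}")
```

Because the mean is only known to eight significant digits, the wrapped value can be expected to match the reported Sum only approximately.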

legaldong_cd
Real number (ℝ)

HIGH CORRELATION 

Distinct: 2242
Distinct (%): 22.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37105373
Minimum: 11110101
Maximum: 50130320
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 11110101
5-th percentile: 11440120
Q1: 29200109
Median: 41480102
Q3: 45130134
95-th percentile: 48330340
Maximum: 50130320
Range: 39020219
Interquartile range (IQR): 15930025

Descriptive statistics

Standard deviation: 11552794
Coefficient of variation (CV): 0.31135096
Kurtosis: 0.18830421
Mean: 37105373
Median Absolute Deviation (MAD): 5360148.5
Skewness: -1.157321
Sum: 3.7105373 × 10^11
Variance: 1.3346704 × 10^14
Monotonicity: Not monotonic
Figure: Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
41360112             84     0.8%
41590253             75     0.8%
41630114             69     0.7%
48250132             61     0.6%
31200126             55     0.5%
41220128             52     0.5%
43111124             50     0.5%
44200253             43     0.4%
44210109             41     0.4%
41590262             41     0.4%
Other values (2232)  9429   94.3%

Value     Count  Frequency (%)
11110101  1      < 0.1%
11110110  3      < 0.1%
11110111  1      < 0.1%
11110121  6      0.1%
11110131  1      < 0.1%
11110132  1      < 0.1%
11110135  2      < 0.1%
11110137  1      < 0.1%
11110141  2      < 0.1%
11110151  1      < 0.1%

Value     Count  Frequency (%)
50130320  10     0.1%
50130310  14     0.1%
50130259  5      0.1%
50130253  13     0.1%
50130250  9      0.1%
50130121  1      < 0.1%
50130120  4      < 0.1%
50130116  4      < 0.1%
50130115  2      < 0.1%
50130114  1      < 0.1%
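
In the rows of the Sample section, legaldong_cd equals the first 8 digits of the 19-digit pnu, which would account for the correlation of 1.000 between the two columns. The sketch below assumes this layout holds for every row; the function name is illustrative. It also shows that a 19-digit identifier does not survive a round trip through a 64-bit float.

```python
def legaldong_from_pnu(pnu: int) -> int:
    """Return the leading 8 digits of a 19-digit pnu (assumed layout)."""
    return pnu // 10**11

# (pnu, legaldong_cd) pairs copied from the Sample section.
sample_pairs = [
    (4159013400104550010, 41590134),
    (2820010800105510002, 28200108),
    (1156013200100980012, 11560132),
]
for pnu, legaldong_cd in sample_pairs:
    assert legaldong_from_pnu(pnu) == legaldong_cd

# Integers above 2**53 are not all representable as float64.
pnu = 4159013400104550010
round_trip_ok = int(float(pnu)) == pnu
print(round_trip_ok)  # False: near 4e18 a float64 only holds multiples of 512
```

Treating pnu as a string or integer identifier rather than a real number avoids this loss of precision.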

induty_cd
Categorical

Distinct: 42
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
A03: 1711
A01: 1449
C01: 702
A08: 664
B02: 611
Other values (37): 4863

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: A03
2nd row: C01
3rd row: A03
4th row: A01
5th row: B05

Common Values

Value              Count  Frequency (%)
A03                1711   17.1%
A01                1449   14.5%
C01                702    7.0%
A08                664    6.6%
B02                611    6.1%
C06                540    5.4%
C05                528    5.3%
B05                494    4.9%
B10                347    3.5%
C07                316    3.2%
Other values (32)  2638   26.4%
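
The "Other values" row and the percentages follow from the counts and the 10000 observations. A small sketch that recomputes them from the ten most common induty_cd values listed in this report:

```python
N_ROWS = 10_000

# Counts of the ten most common induty_cd values, copied from the report.
top_counts = {
    "A03": 1711, "A01": 1449, "C01": 702, "A08": 664, "B02": 611,
    "C06": 540, "C05": 528, "B05": 494, "B10": 347, "C07": 316,
}

# Everything not in the top ten falls into "Other values".
other_count = N_ROWS - sum(top_counts.values())
shares = {code: round(100 * count / N_ROWS, 1) for code, count in top_counts.items()}
other_share = round(100 * other_count / N_ROWS, 1)

print(other_count, other_share)
```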

Length

Figure: Histogram of lengths of the category

sopsrt_spl_dims
Real number (ℝ)

SKEWED 

Distinct: 2978
Distinct (%): 29.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 56.480129
Minimum: 0.01
Maximum: 37446.75
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0.01
5-th percentile: 0.13
Q1: 0.56
Median: 1.9
Q3: 10.55
95-th percentile: 176.9565
Maximum: 37446.75
Range: 37446.74
Interquartile range (IQR): 9.99

Descriptive statistics

Standard deviation: 582.11835
Coefficient of variation (CV): 10.306604
Kurtosis: 2178.7223
Mean: 56.480129
Median Absolute Deviation (MAD): 1.65
Skewness: 41.281432
Sum: 564801.29
Variance: 338861.77
Monotonicity: Not monotonic
Figure: Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
0.17                 72     0.7%
0.27                 72     0.7%
0.15                 66     0.7%
0.07                 62     0.6%
0.45                 62     0.6%
0.22                 60     0.6%
0.1                  60     0.6%
0.16                 58     0.6%
0.25                 57     0.6%
0.29                 57     0.6%
Other values (2968)  9374   93.7%

Value  Count  Frequency (%)
0.01   15     0.1%
0.02   14     0.1%
0.03   18     0.2%
0.04   41     0.4%
0.05   29     0.3%
0.06   35     0.4%
0.07   62     0.6%
0.08   36     0.4%
0.09   56     0.6%
0.1    60     0.6%

Value     Count  Frequency (%)
37446.75  1      < 0.1%
20900.52  1      < 0.1%
20729.84  1      < 0.1%
19895.29  1      < 0.1%
9010.21   1      < 0.1%
8141.51   1      < 0.1%
7826.09   1      < 0.1%
7397.72   1      < 0.1%
6409.75   1      < 0.1%
5653.27   1      < 0.1%
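
Several of the sopsrt_spl_dims statistics are derived from others: CV is the standard deviation divided by the mean, variance is the squared standard deviation, IQR is Q3 minus Q1, and range is maximum minus minimum. A short sketch that recomputes them from the rounded figures in this report, so small differences in the last digit are to be expected:

```python
# Figures copied from the sopsrt_spl_dims statistics (rounded as reported).
mean, std = 56.480129, 582.11835
q1, q3 = 0.56, 10.55
minimum, maximum = 0.01, 37446.75

cv = std / mean                  # coefficient of variation
variance = std ** 2
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum

print(f"CV={cv:.6f} variance={variance:.2f} IQR={iqr:.2f} range={value_range:.2f}")
```

A CV above 10 together with a median of 1.9 and a mean of about 56.5 shows how strongly a few very large values dominate this column.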

Interactions

Figures: 9 interaction plots (images only)

Correlations

                 pnu    legaldong_cd  induty_cd  sopsrt_spl_dims
pnu              1.000  1.000         0.204      0.061
legaldong_cd     1.000  1.000         0.204      0.061
induty_cd        0.204  0.204         1.000      0.000
sopsrt_spl_dims  0.061  0.061         0.000      1.000

                 pnu    legaldong_cd  sopsrt_spl_dims  induty_cd
pnu              1.000  1.000         0.387            0.079
legaldong_cd     1.000  1.000         0.387            0.079
sopsrt_spl_dims  0.387  0.387         1.000            0.000
induty_cd        0.079  0.079         0.000            1.000
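
The high-correlation alert can be reproduced by scanning a correlation matrix for pairs at or above a cut-off. The sketch below uses the first matrix; the threshold of 0.5 and the function name are assumptions, since the report does not state the cut-off it applied.

```python
# Upper triangle of the first correlation matrix in this report.
corr = {
    ("pnu", "legaldong_cd"): 1.000,
    ("pnu", "induty_cd"): 0.204,
    ("pnu", "sopsrt_spl_dims"): 0.061,
    ("legaldong_cd", "induty_cd"): 0.204,
    ("legaldong_cd", "sopsrt_spl_dims"): 0.061,
    ("induty_cd", "sopsrt_spl_dims"): 0.000,
}

THRESHOLD = 0.5  # assumed cut-off, not stated in the report

def highly_correlated(pairs, threshold):
    """Return the variable pairs whose absolute correlation meets the threshold."""
    return [pair for pair, r in pairs.items() if abs(r) >= threshold]

flagged = highly_correlated(corr, THRESHOLD)
print(flagged)
```

With these values only the pnu and legaldong_cd pair is flagged, which matches the two High correlation alerts.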

Missing values

Figure: A simple visualization of nullity by column.
Figure: Nullity matrix, a data-dense display which lets you quickly pick out patterns in data completion.

Sample

       data_strd_ym  pnu                  legaldong_cd  induty_cd  sopsrt_spl_dims
17096  202306        4159013400104550010  41590134      A03        0.63
6539   202306        2820010800105510002  28200108      C01        16.26
27638  202306        4728011000108620002  47280110      A03        13.07
6534   202306        2820010700102050000  28200107      A01        16.92
2315   202306        1156013200100980012  11560132      B05        2.02
13140  202306        4131010200102300000  41310102      C05        246.66
24847  202306        4613031022112020004  46130310      A03        0.8
28255  202306        4785032036106970000  47850320      A03        32.26
10746  202306        4111710700100470012  41117107      B19        0.24
10168  202306        4111312600112580001  41113126      B06        0.28

       data_strd_ym  pnu                  legaldong_cd  induty_cd  sopsrt_spl_dims
13224  202306        4136011100108210000  41360111      A01        0.62
10374  202306        4111313600105200007  41113136      A04        0.29
1740   202306        1150010100102740002  11500101      B05        0.78
28393  202306        4812110200107760003  48121102      A05        2.03
17733  202306        4159026227103180004  41590262      C06        10.75
14988  202306        4146110500105800002  41461105      A03        68.31
11875  202306        4122012400106790002  41220124      C05        0.23
17329  202306        4159025321100240000  41590253      B02        0.37
6548   202306        2820011000106730003  28200110      A01        7.84
9946   202306        3611025025100190003  36110250      C07        1.25