Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 507.8 KiB
Average record size in memory: 52.0 B
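The average record size follows from the total size and the row count. The sketch below redoes that arithmetic, assuming 1 KiB = 1024 bytes; the function name is illustrative and not part of any profiling library.

```python
KIB = 1024  # bytes per KiB (assumption: binary kibibytes)


def average_record_size(total_kib: float, n_rows: int) -> float:
    """Return the mean number of bytes per row."""
    return total_kib * KIB / n_rows


# Figures taken from the dataset statistics above.
size = average_record_size(507.8, 10_000)
print(f"{size:.1f} B")
```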

Variable types

Categorical: 2
Numeric: 3

Dataset

Description: Sample
Author: 오아시스비즈니스
URL: https://www.bigdata-realestate.kr/rebpp/usr/prd/prdInfoDetail.do?req_productId=73

Alerts

data_strd_ym has constant value "202306" (Constant)
pnu is highly overall correlated with legaldong_cd (High correlation)
legaldong_cd is highly overall correlated with pnu (High correlation)
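The perfect correlation between pnu and legaldong_cd is what one would expect if one column is derived from the other. In the sample rows of this report, the 19-digit pnu appears to begin with the 8-digit legaldong_cd. The sketch below checks that assumption on a few of those rows; the helper name and the 11-digit suffix length are inferred from the sample, not taken from any documentation of the dataset.

```python
# (pnu, legaldong_cd) pairs copied from the Sample section of this report.
SAMPLE_ROWS = [
    (1123011000100780042, 11230110),
    (1123010200101120001, 11230102),
    (1121510500106250027, 11215105),
    (1114015800100380023, 11140158),
    (1126010600108170000, 11260106),
]

# Assumption: dropping the last 11 digits of pnu leaves legaldong_cd.
SUFFIX_DIGITS = 11


def legaldong_from_pnu(pnu: int) -> int:
    """Return the leading digits of pnu by integer division."""
    return pnu // 10**SUFFIX_DIGITS


# Collect every row where the derived code disagrees with the stored one.
mismatches = [(p, c) for p, c in SAMPLE_ROWS if legaldong_from_pnu(p) != c]
print(mismatches)
```

If the list stays empty on the full dataset as well, one of the two columns is redundant for modelling purposes, and pnu is better treated as an identifier than as a numeric variable.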

Reproduction

Analysis started: 2023-12-11 22:30:12.868804
Analysis finished: 2023-12-11 22:30:14.777404
Duration: 1.91 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

data_strd_ym
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value   Count
202306  10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202306
2nd row: 202306
3rd row: 202306
4th row: 202306
5th row: 202306

Common Values

Value   Count  Frequency (%)
202306  10000  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
202306  10000  100.0%

pnu
Real number (ℝ)

HIGH CORRELATION 

Distinct: 8937
Distinct (%): 89.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.123939 × 10^18
Minimum: 1.1110101 × 10^18
Maximum: 1.1380108 × 10^18
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1.1110101 × 10^18
5-th percentile: 1.1110155 × 10^18
Q1: 1.117013 × 10^18
median: 1.1230106 × 10^18
Q3: 1.1305101 × 10^18
95-th percentile: 1.1380103 × 10^18
Maximum: 1.1380108 × 10^18
Range: 2.70007 × 10^16
Interquartile range (IQR): 1.34971 × 10^16

Descriptive statistics

Standard deviation: 8.0088459 × 10^15
Coefficient of variation (CV): 0.0071256947
Kurtosis: -1.016044
Mean: 1.123939 × 10^18
Median Absolute Deviation (MAD): 6.003 × 10^15
Skewness: 0.021695156
Sum: 5.3224724 × 10^18 (as reported; inconsistent with Mean × 10000 ≈ 1.123939 × 10^22, likely a 64-bit integer overflow)
Variance: 6.4141612 × 10^31
Monotonicity: Not monotonic
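The reported Sum for pnu is smaller than ten times the Mean, which cannot hold for 10000 positive values. One plausible explanation, offered here as an assumption rather than a confirmed diagnosis of the profiler, is that the 19-digit values were summed in a signed 64-bit integer and wrapped around. The sketch below reproduces that wraparound from the rounded Mean; `wrap_int64` is an illustrative helper, not a library function.

```python
INT64_MODULUS = 2**64
INT64_MAX = 2**63 - 1


def wrap_int64(value: int) -> int:
    """Reduce an exact integer to what a signed 64-bit integer would hold."""
    wrapped = value % INT64_MODULUS
    return wrapped - INT64_MODULUS if wrapped > INT64_MAX else wrapped


n_rows = 10_000
# Mean as printed in the report (1.123939 × 10^18), so only ~7 digits are exact.
reported_mean = 1_123_939_000_000_000_000

exact_sum = reported_mean * n_rows      # Python integers do not overflow
wrapped_sum = wrap_int64(exact_sum)     # what 64-bit accumulation would keep

print(f"exact sum   ~ {exact_sum:.6e}")
print(f"wrapped sum ~ {wrapped_sum:.4e}")
```

Because the Mean is rounded, the wrapped result can only be compared with the reported Sum to about three significant digits. Agreement at that precision supports the overflow reading but does not prove it.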
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
1129013800103200000  8      0.1%
1114016500125450000  7      0.1%
1120010700103460000  7      0.1%
1135010600105060000  6      0.1%
1120010100108110000  6      0.1%
1120011400106560334  5      0.1%
1114014800100180185  5      0.1%
1117011200101330003  5      0.1%
1117011000100710088  4      < 0.1%
1138010700107700001  4      < 0.1%
Other values (8927)  9943   99.4%

Value                Count  Frequency (%)
1111010100100660000  1      < 0.1%
1111010200100010028  1      < 0.1%
1111010200100310000  1      < 0.1%
1111010200100660000  1      < 0.1%
1111010400100310000  1      < 0.1%
1111010400100400002  1      < 0.1%
1111010400100540000  1      < 0.1%
1111010400100620000  1      < 0.1%
1111010500100990000  1      < 0.1%
1111010500101290002  1      < 0.1%

Value                Count  Frequency (%)
1138010800100010045  1      < 0.1%
1138010800100010007  1      < 0.1%
1138010800100010001  1      < 0.1%
1138010700107700001  4      < 0.1%
1138010700107670000  4      < 0.1%
1138010700107620000  1      < 0.1%
1138010700107610000  2      < 0.1%
1138010700107600000  2      < 0.1%
1138010700107590000  3      < 0.1%
1138010700107540001  1      < 0.1%

legaldong_cd
Real number (ℝ)

HIGH CORRELATION 

Distinct: 285
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11239390
Minimum: 11110101
Maximum: 11380108
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 11110101
5-th percentile: 11110155
Q1: 11170130
median: 11230106
Q3: 11305101
95-th percentile: 11380103
Maximum: 11380108
Range: 270007
Interquartile range (IQR): 134971

Descriptive statistics

Standard deviation: 80088.459
Coefficient of variation (CV): 0.0071256947
Kurtosis: -1.016044
Mean: 11239390
Median Absolute Deviation (MAD): 60030
Skewness: 0.021695156
Sum: 1.123939 × 10^11
Variance: 6.4141612 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
11260101            347    3.5%
11350105            333    3.3%
11305103            313    3.1%
11305101            278    2.8%
11215105            242    2.4%
11230106            236    2.4%
11215101            230    2.3%
11320107            223    2.2%
11200115            195    1.9%
11350103            186    1.9%
Other values (275)  7417   74.2%

Value     Count  Frequency (%)
11110101  1      < 0.1%
11110102  3      < 0.1%
11110104  4      < 0.1%
11110105  4      < 0.1%
11110106  9      0.1%
11110107  13     0.1%
11110108  20     0.2%
11110109  3      < 0.1%
11110110  9      0.1%
11110111  4      < 0.1%

Value     Count  Frequency (%)
11380108  3      < 0.1%
11380107  175    1.8%
11380106  82     0.8%
11380105  41     0.4%
11380104  159    1.6%
11380103  117    1.2%
11380102  73     0.7%
11380101  15     0.1%
11350106  121    1.2%
11350105  333    3.3%

induty_cd
Categorical

Distinct: 41
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value              Count
A01                1658
A03                1270
C01                914
B02                680
C05                561
Other values (36)  4917

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: A03
2nd row: A01
3rd row: C05
4th row: A03
5th row: B13

Common Values

Value              Count  Frequency (%)
A01                1658   16.6%
A03                1270   12.7%
C01                914    9.1%
B02                680    6.8%
C05                561    5.6%
B01                434    4.3%
C03                409    4.1%
C06                379    3.8%
C07                343    3.4%
B05                340    3.4%
Other values (31)  3012   30.1%
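The Frequency (%) column and the "Other values" row can be recomputed from the counts and the total of 10000 observations. The sketch below does this for the ten listed induty_cd codes; the variable names are illustrative.

```python
TOTAL_ROWS = 10_000

# Counts of the ten most frequent induty_cd values, as listed above.
top_counts = {
    "A01": 1658, "A03": 1270, "C01": 914, "B02": 680, "C05": 561,
    "B01": 434, "C03": 409, "C06": 379, "C07": 343, "B05": 340,
}

# Everything not in the top ten falls into the "Other values" row.
other_count = TOTAL_ROWS - sum(top_counts.values())

# Share of each code, formatted to one decimal place like the report.
frequencies = {code: f"{n / TOTAL_ROWS:.1%}" for code, n in top_counts.items()}
frequencies["Other values"] = f"{other_count / TOTAL_ROWS:.1%}"

for code, share in frequencies.items():
    print(code, share)
```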

Length

Histogram of lengths of the category
Value              Count  Frequency (%)
a01                1658   16.6%
a03                1270   12.7%
c01                914    9.1%
b02                680    6.8%
c05                561    5.6%
b01                434    4.3%
c03                409    4.1%
c06                379    3.8%
c07                343    3.4%
b05                340    3.4%
Other values (31)  3012   30.1%

fpop_scor
Real number (ℝ)

Distinct: 2905
Distinct (%): 29.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 53.031994
Minimum: 2.56
Maximum: 91.81
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2.56
5-th percentile: 26.48
Q1: 44.3075
median: 54.745
Q3: 63.62
95-th percentile: 73.2
Maximum: 91.81
Range: 89.25
Interquartile range (IQR): 19.3125

Descriptive statistics

Standard deviation: 14.185904
Coefficient of variation (CV): 0.26749709
Kurtosis: 0.082983609
Mean: 53.031994
Median Absolute Deviation (MAD): 9.365
Skewness: -0.58183371
Sum: 530319.94
Variance: 201.23988
Monotonicity: Not monotonic
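Several of the fpop_scor statistics are simple functions of others, which gives a quick consistency check on the report. The sketch below recomputes the range, IQR, coefficient of variation, and variance from the printed values, using the usual definitions (CV = standard deviation / mean, variance = standard deviation squared). Small differences in the last digits are expected because the inputs are already rounded.

```python
import math

# Values as printed in the fpop_scor sections of this report.
mean = 53.031994
std = 14.185904
q1, q3 = 44.3075, 63.62
minimum, maximum = 2.56, 91.81

value_range = maximum - minimum   # spread between the extremes
iqr = q3 - q1                     # spread of the middle 50%
cv = std / mean                   # relative dispersion
variance = std ** 2               # squared standard deviation

print(f"range={value_range:.2f} iqr={iqr:.4f} cv={cv:.6f} variance={variance:.4f}")
```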
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
63.62                39     0.4%
62.54                32     0.3%
64.0                 31     0.3%
65.19                30     0.3%
59.41                28     0.3%
66.49                27     0.3%
64.78                27     0.3%
58.27                26     0.3%
58.54                26     0.3%
67.73                26     0.3%
Other values (2895)  9708   97.1%

Value  Count  Frequency (%)
2.56   1      < 0.1%
3.8    1      < 0.1%
4.25   1      < 0.1%
4.3    1      < 0.1%
4.73   1      < 0.1%
4.81   1      < 0.1%
4.91   1      < 0.1%
5.25   1      < 0.1%
5.32   1      < 0.1%
5.55   1      < 0.1%

Value  Count  Frequency (%)
91.81  1      < 0.1%
88.83  1      < 0.1%
88.49  1      < 0.1%
88.16  1      < 0.1%
87.29  1      < 0.1%
86.47  2      < 0.1%
86.3   1      < 0.1%
85.81  1      < 0.1%
85.76  1      < 0.1%
85.63  1      < 0.1%

Interactions

[Figure: nine interaction plots]

Correlations

              pnu    legaldong_cd  induty_cd  fpop_scor
pnu           1.000  1.000         0.292      0.438
legaldong_cd  1.000  1.000         0.292      0.438
induty_cd     0.292  0.292         1.000      0.331
fpop_scor     0.438  0.438         0.331      1.000

              pnu    legaldong_cd  fpop_scor  induty_cd
pnu           1.000  1.000         0.268      0.104
legaldong_cd  1.000  1.000         0.268      0.104
fpop_scor     0.268  0.268         1.000      0.119
induty_cd     0.104  0.104         0.119      1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

       data_strd_ym  pnu                  legaldong_cd  induty_cd  fpop_scor
46027  202306        1123011000100780042  11230110      A03        65.99
39102  202306        1123010200101120001  11230102      A01        61.2
36056  202306        1121510500106250027  11215105      C05        49.39
13636  202306        1114015800100380023  11140158      A03        52.13
54534  202306        1126010600108170000  11260106      B13        78.0
82718  202306        1138010300105420003  11380103      A01        71.03
25381  202306        1120010700103460000  11200107      B11        56.6
77485  202306        1135010500103230009  11350105      A03        78.03
40842  202306        1123010400101990007  11230104      A01        43.06
5143   202306        1111016300100980005  11110163      C01        67.31

       data_strd_ym  pnu                  legaldong_cd  induty_cd  fpop_scor
84589  202306        1138010600100150119  11380106      C06        69.21
74789  202306        1135010200104690028  11350102      A14        58.69
83550  202306        1138010400104540028  11380104      A03        65.99
18262  202306        1117011000100100023  11170110      B14        19.43
78322  202306        1135010500106030002  11350105      A05        42.21
5406   202306        1111016300103320029  11110163      C03        41.57
14565  202306        1114016200102360078  11140162      C03        21.51
52089  202306        1126010400101800016  11260104      B02        81.15
15463  202306        1114016200104130005  11140162      B03        59.05
19110  202306        1117011800100950005  11170118      A01        36.18