Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 507.8 KiB
Average record size in memory: 52.0 B
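
The average record size follows from the total size and the row count. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes and using only the figures printed above (the variable names are illustrative):

```python
# Figures copied from the dataset statistics above.
total_kib = 507.8      # Total size in memory, in KiB
n_rows = 10_000        # Number of observations

# Convert KiB to bytes and divide by the number of rows.
avg_bytes = total_kib * 1024 / n_rows
print(f"{avg_bytes:.1f} B")
```

Rounded to one decimal place, the result agrees with the reported 52.0 B.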

Variable types

Categorical: 2
Numeric: 3

Dataset

Description: Sample
Author: 오아시스비즈니스 (Oasis Business)
URL: https://www.bigdata-realestate.kr/rebpp/usr/prd/prdInfoDetail.do?req_productId=69

Alerts

data_strd_ym has constant value "202307" (Constant)
pnu is highly correlated overall with legaldong_cd (High correlation)
legaldong_cd is highly correlated overall with pnu (High correlation)
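
The perfect correlation between pnu and legaldong_cd has a simple likely cause: in the rows shown in the Sample section of this report, legaldong_cd equals the first 8 digits of pnu. The sketch below checks that relationship on a few of those rows. It is an observation about the sample only, not a documented property of the dataset, and the helper name `legaldong_prefix` is hypothetical.

```python
# (pnu, legaldong_cd) pairs copied from the Sample section of this report.
sample_pairs = [
    (1111010800101540008, 11110108),
    (1117011300100030004, 11170113),
    (1121510100101720054, 11215101),
    (1135010500112930000, 11350105),
]


def legaldong_prefix(pnu: int, width: int = 8) -> int:
    """Return the leading `width` digits of a pnu value (hypothetical helper)."""
    return int(str(pnu)[:width])


# Collect any pair where the prefix does not match the reported code.
mismatches = [
    (pnu, code) for pnu, code in sample_pairs if legaldong_prefix(pnu) != code
]
print(mismatches)
```

If the relationship holds for the full dataset, one of the two columns is redundant for modelling purposes.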

Reproduction

Analysis started: 2023-12-11 22:32:08.553730
Analysis finished: 2023-12-11 22:32:10.249249
Duration: 1.7 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
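
The reported duration can be recomputed from the two timestamps. A minimal sketch using only the standard library and the values printed above:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-11 22:32:08.553730")
finished = datetime.fromisoformat("2023-12-11 22:32:10.249249")

# Elapsed wall-clock time in seconds.
elapsed = (finished - started).total_seconds()
print(f"{elapsed:.1f} seconds")
```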

Variables

data_strd_ym
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value    Count
202307   10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202307
2nd row: 202307
3rd row: 202307
4th row: 202307
5th row: 202307

Common Values

Value    Count   Frequency (%)
202307   10000   100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value    Count   Frequency (%)
202307   10000   100.0%

pnu
Real number (ℝ)

HIGH CORRELATION 

Distinct: 8842
Distinct (%): 88.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.1237021 × 10^18
Minimum: 1.1110101 × 10^18
Maximum: 1.1380104 × 10^18
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1.1110101 × 10^18
5-th percentile: 1.1110154 × 10^18
Q1: 1.117013 × 10^18
Median: 1.1230106 × 10^18
Q3: 1.1305101 × 10^18
95-th percentile: 1.1350105 × 10^18
Maximum: 1.1380104 × 10^18
Range: 2.70003 × 10^16
Interquartile range (IQR): 1.34971 × 10^16

Descriptive statistics

Standard deviation: 7.7425255 × 10^15
Coefficient of variation (CV): 0.0068901939
Kurtosis: -1.0329391
Mean: 1.1237021 × 10^18
Median Absolute Deviation (MAD): 6.0029 × 10^15
Skewness: -0.033289354
Sum: 2.954012 × 10^18
Variance: 5.9946701 × 10^31
Monotonicity: Not monotonic
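
The reported Sum for pnu is smaller than it can be for 10000 values that are all at least as large as the listed minimum. The sketch below makes that comparison. One possible explanation, which the report does not confirm, is that the column is stored as a signed 64-bit integer and the sum wrapped around. The storage type is an assumption.

```python
INT64_MAX = 2**63 - 1                 # largest signed 64-bit integer

n_rows = 10_000                       # Number of observations
smallest_pnu = 1111010100100480000    # smallest pnu value listed in this report
reported_sum = 2.954012e18            # Sum as printed in the report

# Every value is >= smallest_pnu, so the true sum is at least this large.
sum_lower_bound = n_rows * smallest_pnu

print(sum_lower_bound > INT64_MAX)    # the true sum does not fit in int64
print(reported_sum < sum_lower_bound) # the printed Sum is below the lower bound
```

Python integers have arbitrary precision, so the lower bound itself is computed exactly here.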
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
1114016500125450000   7       0.1%
1120010200110700000   7       0.1%
1135010500106510000   6       0.1%
1126010600103970000   5       0.1%
1114017100102000000   5       0.1%
1114016200108550000   5       0.1%
1120010100108110000   5       0.1%
1129013800103200000   5       0.1%
1135010300104410026   4       < 0.1%
1126010200105000000   4       < 0.1%
Other values (8832)   9947    99.5%

Value                 Count   Frequency (%)
1111010100100480000   1       < 0.1%
1111010200100360000   1       < 0.1%
1111010200100580001   1       < 0.1%
1111010400100530000   2       < 0.1%
1111010500101330000   1       < 0.1%
1111010500101360000   1       < 0.1%
1111010500101530001   1       < 0.1%
1111010600100070004   1       < 0.1%
1111010600100100000   1       < 0.1%
1111010600100350084   1       < 0.1%

Value                 Count   Frequency (%)
1138010400105030019   1       < 0.1%
1138010400105030012   1       < 0.1%
1138010400105030004   1       < 0.1%
1138010400105020033   1       < 0.1%
1138010400105020026   1       < 0.1%
1138010400105020004   1       < 0.1%
1138010400105010003   1       < 0.1%
1138010400104990010   1       < 0.1%
1138010400104980007   1       < 0.1%
1138010400104980004   3       < 0.1%

legaldong_cd
Real number (ℝ)

HIGH CORRELATION 

Distinct: 285
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11237021
Minimum: 11110101
Maximum: 11380104
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 11110101
5-th percentile: 11110154
Q1: 11170130
Median: 11230106
Q3: 11305101
95-th percentile: 11350105
Maximum: 11380104
Range: 270003
Interquartile range (IQR): 134971
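
Range and IQR are simple differences of the quantiles listed above. A minimal sketch that recomputes both from the printed values:

```python
# Quantile statistics for legaldong_cd, copied from the report.
minimum = 11110101
q1 = 11170130
q3 = 11305101
maximum = 11380104

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # IQR = Q3 - Q1

print(value_range, iqr)
```

Because legaldong_cd is an administrative code rather than a measured quantity, these differences describe the spread of code values and carry no physical meaning.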

Descriptive statistics

Standard deviation: 77425.255
Coefficient of variation (CV): 0.0068901939
Kurtosis: -1.0329391
Mean: 11237021
Median Absolute Deviation (MAD): 60029
Skewness: -0.033289354
Sum: 1.1237021 × 10^11
Variance: 5.9946701 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count   Frequency (%)
11350105             391     3.9%
11260101             341     3.4%
11305103             333     3.3%
11305101             311     3.1%
11215105             266     2.7%
11215101             245     2.5%
11230106             239     2.4%
11320107             229     2.3%
11350103             228     2.3%
11200115             204     2.0%
Other values (275)   7213    72.1%

Value      Count   Frequency (%)
11110101   1       < 0.1%
11110102   2       < 0.1%
11110104   2       < 0.1%
11110105   3       < 0.1%
11110106   13      0.1%
11110107   7       0.1%
11110108   15      0.1%
11110109   3       < 0.1%
11110110   6       0.1%
11110111   7       0.1%

Value      Count   Frequency (%)
11380104   134     1.3%
11380103   118     1.2%
11380102   60      0.6%
11380101   13      0.1%
11350106   159     1.6%
11350105   391     3.9%
11350104   32      0.3%
11350103   228     2.3%
11350102   111     1.1%
11320108   104     1.0%

induty_cd
Categorical

Distinct: 41
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value               Count
A01                 1501
A03                 1226
C01                 808
B02                 661
C05                 534
Other values (36)   5270

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: A14
2nd row: A07
3rd row: C03
4th row: A07
5th row: B01

Common Values

Value               Count   Frequency (%)
A01                 1501    15.0%
A03                 1226    12.3%
C01                 808     8.1%
B02                 661     6.6%
C05                 534     5.3%
B01                 415     4.2%
C07                 398     4.0%
C03                 383     3.8%
C06                 364     3.6%
B05                 332     3.3%
Other values (31)   3378    33.8%
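
The counts for induty_cd can be cross-checked against the number of observations and against the shorter top-5 summary shown at the start of this variable's section. A minimal sketch using only the figures printed in this report:

```python
# Top-10 counts for induty_cd, copied from the Common Values table.
top10 = {
    "A01": 1501, "A03": 1226, "C01": 808, "B02": 661, "C05": 534,
    "B01": 415, "C07": 398, "C03": 383, "C06": 364, "B05": 332,
}
other_after_top10 = 3378   # "Other values (31)"
n_rows = 10_000            # Number of observations

# The ten listed categories plus the remainder should cover every row.
total = sum(top10.values()) + other_after_top10

# The overview lists only the first five categories and an "Other values (36)" bucket.
top5_total = sum(list(top10.values())[:5])
other_after_top5 = n_rows - top5_total

print(total, other_after_top5)
```

Both checks are consistent with the report: the counts add up to 10000, and the remainder after the top five categories is the 5270 shown in the overview.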

Length

Histogram of lengths of the category
Value               Count   Frequency (%)
a01                 1501    15.0%
a03                 1226    12.3%
c01                 808     8.1%
b02                 661     6.6%
c05                 534     5.3%
b01                 415     4.2%
c07                 398     4.0%
c03                 383     3.8%
c06                 364     3.6%
b05                 332     3.3%
Other values (31)   3378    33.8%

fpop_scor
Real number (ℝ)

Distinct: 4731
Distinct (%): 47.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 53.36098
Minimum: 1.66
Maximum: 95.03
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1.66
5-th percentile: 24.44
Q1: 43.97
Median: 55.33
Q3: 64.28
95-th percentile: 75.62
Maximum: 95.03
Range: 93.37
Interquartile range (IQR): 20.31

Descriptive statistics

Standard deviation: 15.202201
Coefficient of variation (CV): 0.28489358
Kurtosis: -0.20974282
Mean: 53.36098
Median Absolute Deviation (MAD): 10.055
Skewness: -0.43083298
Sum: 533609.8
Variance: 231.10691
Monotonicity: Not monotonic
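
Several of the descriptive statistics for fpop_scor are related by simple identities: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. The sketch below checks the printed values against those identities. The tolerances are arbitrary choices that allow for the rounding in the report.

```python
import math

# Descriptive statistics for fpop_scor, copied from the report.
mean = 53.36098
std = 15.202201
reported_cv = 0.28489358
reported_variance = 231.10691
reported_sum = 533609.8
n_rows = 10_000

cv = std / mean            # coefficient of variation
variance = std ** 2        # variance is the squared standard deviation
total = mean * n_rows      # sum recovered from the mean

cv_ok = math.isclose(cv, reported_cv, abs_tol=1e-7)
variance_ok = math.isclose(variance, reported_variance, abs_tol=1e-4)
sum_ok = math.isclose(total, reported_sum, abs_tol=0.05)
print(cv_ok, variance_ok, sum_ok)
```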
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
57.21                 10      0.1%
59.79                 9       0.1%
55.08                 9       0.1%
67.53                 9       0.1%
59.19                 9       0.1%
62.29                 9       0.1%
56.26                 9       0.1%
55.45                 8       0.1%
52.05                 8       0.1%
58.12                 8       0.1%
Other values (4721)   9912    99.1%

Value   Count   Frequency (%)
1.66    1       < 0.1%
6.94    1       < 0.1%
6.95    1       < 0.1%
6.97    1       < 0.1%
7.16    1       < 0.1%
8.52    2       < 0.1%
8.55    1       < 0.1%
9.03    1       < 0.1%
9.16    1       < 0.1%
9.42    1       < 0.1%

Value   Count   Frequency (%)
95.03   1       < 0.1%
91.69   1       < 0.1%
91.46   1       < 0.1%
91.33   1       < 0.1%
91.27   1       < 0.1%
91.18   1       < 0.1%
91.09   1       < 0.1%
91.03   1       < 0.1%
90.31   1       < 0.1%
90.22   1       < 0.1%

Interactions

[Figures: 9 pairwise interaction plots]

Correlations

               pnu      legaldong_cd   induty_cd   fpop_scor
pnu            1.000    1.000          0.239       0.403
legaldong_cd   1.000    1.000          0.239       0.403
induty_cd      0.239    0.239          1.000       0.413
fpop_scor      0.403    0.403          0.413       1.000

               pnu      legaldong_cd   fpop_scor   induty_cd
pnu            1.000    1.000          -0.103      0.084
legaldong_cd   1.000    1.000          -0.103      0.084
fpop_scor      -0.103   -0.103         1.000       0.154
induty_cd      0.084    0.084          0.154       1.000
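
The High correlation alerts can be related to the first correlation matrix above by listing the variable pairs whose absolute coefficient reaches a cutoff. The sketch below does that with a hypothetical threshold of 0.9. The report does not state which threshold ydata-profiling applied, so the cutoff is an assumption chosen for illustration.

```python
# First correlation matrix, copied from the report (row and column order as printed).
names = ["pnu", "legaldong_cd", "induty_cd", "fpop_scor"]
matrix = [
    [1.000, 1.000, 0.239, 0.403],
    [1.000, 1.000, 0.239, 0.403],
    [0.239, 0.239, 1.000, 0.413],
    [0.403, 0.403, 0.413, 1.000],
]

THRESHOLD = 0.9  # hypothetical cutoff, not taken from the report

# A correlation matrix should be symmetric; verify before using the upper triangle.
is_symmetric = all(
    matrix[i][j] == matrix[j][i]
    for i in range(len(names))
    for j in range(len(names))
)

# Collect each unordered pair once (upper triangle only).
high_pairs = [
    (names[i], names[j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(matrix[i][j]) >= THRESHOLD
]
print(is_symmetric, high_pairs)
```

With this cutoff the only flagged pair is pnu and legaldong_cd, which matches the two alerts listed in the report.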

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

        data_strd_ym   pnu                   legaldong_cd   induty_cd   fpop_scor
428     202307         1111010800101540008   11110108       A14         45.11
19041   202307         1117011300100030004   11170113       A07         73.46
21471   202307         1117013000100740060   11170130       C03         39.05
31812   202307         1121510100101720054   11215101       A07         63.6
33542   202307         1121510300102000004   11215103       B01         53.02
12875   202307         1114015100100920002   11140151       A03         75.93
12698   202307         1114014800100320000   11140148       B09         65.98
11256   202307         1114012900100150011   11140129       A03         21.89
30447   202307         1120012200100730004   11200122       B15         58.08
11658   202307         1114013600100100000   11140136       C01         52.05

        data_strd_ym   pnu                   legaldong_cd   induty_cd   fpop_scor
34260   202307         1121510300102480039   11215103       A03         43.04
2096    202307         1111013000101710001   11110130       A03         74.66
41197   202307         1123010300110170000   11230103       A01         67.85
56140   202307         1129010300106090001   11290103       A13         36.62
3146    202307         1111013800100060000   11110138       A14         52.57
82300   202307         1135010500112930000   11350105       C01         57.08
64678   202307         1130510100101900002   11305101       A11         52.67
65526   202307         1130510100106790005   11305101       A01         55.72
31819   202307         1121510100101720063   11215101       B02         38.39
24652   202307         1120010500107710005   11200105       C01         50.64