Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 507.8 KiB
Average record size in memory: 52.0 B
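
The average record size can be cross-checked from the other two figures: total memory divided by the number of observations. A minimal sketch of that arithmetic, using only the values reported above (the variable names are illustrative, not part of the report):

```python
# Values copied from the dataset statistics above.
total_size_kib = 507.8
n_observations = 10_000

# 1 KiB = 1024 bytes, so convert before dividing by the row count.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```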

Variable types

Categorical: 2
Numeric: 3

Dataset

Description: Sample
Author: 오아시스비즈니스 (Oasis Business)
URL: https://www.bigdata-realestate.kr/rebpp/usr/prd/prdInfoDetail.do?req_productId=75

Alerts

data_strd_ym has constant value "202307" (Constant)
pnu is highly overall correlated with legaldong_cd (High correlation)
legaldong_cd is highly overall correlated with pnu (High correlation)

Reproduction

Analysis started: 2023-12-11 22:32:01.189494
Analysis finished: 2023-12-11 22:32:04.097199
Duration: 2.91 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

data_strd_ym
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

202307  10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202307
2nd row: 202307
3rd row: 202307
4th row: 202307
5th row: 202307

Common Values

Value   Count  Frequency (%)
202307  10000  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of common values; the single value 202307 accounts for all 10000 rows (100.0%)]

pnu
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9189
Distinct (%): 91.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.1299028 × 10^18
Minimum: 1.1110101 × 10^18
Maximum: 1.1500103 × 10^18
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1.1110101 × 10^18
5-th percentile: 1.1110163 × 10^18
Q1: 1.1215101 × 10^18
median: 1.1290136 × 10^18
Q3: 1.1410112 × 10^18
95-th percentile: 1.1470103 × 10^18
Maximum: 1.1500103 × 10^18
Range: 3.90002 × 10^16
Interquartile range (IQR): 1.95011 × 10^16

Descriptive statistics

Standard deviation: 1.1637862 × 10^16
Coefficient of variation (CV): 0.01029988
Kurtosis: -1.1591914
Mean: 1.1299028 × 10^18
Median Absolute Deviation (MAD): 9.0021 × 10^15
Skewness: 0.057839912
Sum: -8.8265033 × 10^18 (as reported; 10000 positive values cannot sum to a negative number, so this figure is not reliable)
Variance: 1.3543984 × 10^32
Monotonicity: Not monotonic
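
The negative Sum reported for pnu is consistent with signed 64-bit integer overflow: the mean times the row count is about 1.13 × 10^22, far above the int64 maximum of roughly 9.22 × 10^18. The sketch below emulates that wrap-around with Python's arbitrary-precision integers. It assumes the column was summed as int64, which the report does not state, and it starts from the rounded mean, so it can only approximate the reported figure.

```python
INT64_MIN = -(2**63)
UINT64_SPAN = 2**64


def wrap_int64(value: int) -> int:
    """Return the value a signed 64-bit integer would hold after wrap-around."""
    return (value - INT64_MIN) % UINT64_SPAN + INT64_MIN


# Mean of pnu as reported (rounded to 8 significant digits) and the row count.
mean_pnu = 1_129_902_800_000_000_000  # 1.1299028 × 10^18
n_rows = 10_000

true_sum = mean_pnu * n_rows         # about 1.13 × 10^22, exact in Python
wrapped_sum = wrap_int64(true_sum)   # what an int64 accumulator would end up holding

print(f"true sum    ~ {true_sum:.7e}")
print(f"wrapped sum ~ {wrapped_sum:.7e}")
```

The wrapped value lands within the rounding error of the reported Sum, which suggests that the mean (computed in floating point) is trustworthy while the Sum is not. Identifier-like columns such as pnu are usually better profiled as strings, where sums and means are not computed at all.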
Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
1123010200107900000  5      0.1%
1150010200107170000  5      0.1%
1126010600106480000  5      0.1%
1117012900103010053  4      < 0.1%
1135010500112620000  4      < 0.1%
1114010900100010000  4      < 0.1%
1126010600104780022  4      < 0.1%
1120011400106560003  4      < 0.1%
1147010200109070014  4      < 0.1%
1144012700116050000  4      < 0.1%
Other values (9179)  9957   99.6%

Minimum 10 values

Value                Count  Frequency (%)
1111010100100660000  1      < 0.1%
1111010100101310000  1      < 0.1%
1111010200100360000  1      < 0.1%
1111010200100570000  1      < 0.1%
1111010400100530000  1      < 0.1%
1111010400101630001  1      < 0.1%
1111010400101640007  1      < 0.1%
1111010500100980006  1      < 0.1%
1111010500101550000  1      < 0.1%
1111010500101580007  1      < 0.1%

Maximum 10 values

Value                Count  Frequency (%)
1150010300109060014  1      < 0.1%
1150010300109050010  2      < 0.1%
1150010300109050005  1      < 0.1%
1150010300109050004  1      < 0.1%
1150010300109040012  1      < 0.1%
1150010300109040011  1      < 0.1%
1150010300109040001  1      < 0.1%
1150010300109020002  1      < 0.1%
1150010300109010015  1      < 0.1%
1150010300108990027  2      < 0.1%

legaldong_cd
Real number (ℝ)

HIGH CORRELATION 

Distinct: 329
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11299028
Minimum: 11110101
Maximum: 11500103
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 11110101
5-th percentile: 11110163
Q1: 11215101
median: 11290136
Q3: 11410112
95-th percentile: 11470103
Maximum: 11500103
Range: 390002
Interquartile range (IQR): 195011

Descriptive statistics

Standard deviation: 116378.62
Coefficient of variation (CV): 0.01029988
Kurtosis: -1.1591914
Mean: 11299028
Median Absolute Deviation (MAD): 90021
Skewness: 0.057839912
Sum: 1.1299028 × 10^11
Variance: 1.3543984 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
11260101            259    2.6%
11500103            251    2.5%
11470101            248    2.5%
11305103            240    2.4%
11440120            228    2.3%
11470102            226    2.3%
11350105            225    2.2%
11470103            216    2.2%
11305101            204    2.0%
11230106            197    2.0%
Other values (319)  7706   77.1%

Minimum 10 values

Value     Count  Frequency (%)
11110101  2      < 0.1%
11110102  2      < 0.1%
11110104  3      < 0.1%
11110105  3      < 0.1%
11110106  9      0.1%
11110107  9      0.1%
11110108  16     0.2%
11110109  2      < 0.1%
11110110  11     0.1%
11110111  6      0.1%

Maximum 10 values

Value     Count  Frequency (%)
11500103  251    2.5%
11500102  101    1.0%
11500101  45     0.4%
11470103  216    2.2%
11470102  226    2.3%
11470101  248    2.5%
11440127  52     0.5%
11440126  12     0.1%
11440125  81     0.8%
11440124  96     1.0%

induty_cd
Categorical

Distinct: 42
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

A01                1656
A03                1304
C01                816
B02                710
C05                465
Other values (37)  5049

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: A01
2nd row: B10
3rd row: C03
4th row: A12
5th row: A03

Common Values

Value              Count  Frequency (%)
A01                1656   16.6%
A03                1304   13.0%
C01                816    8.2%
B02                710    7.1%
C05                465    4.7%
B01                402    4.0%
C07                353    3.5%
C03                347    3.5%
C06                346    3.5%
B03                321    3.2%
Other values (32)  3280   32.8%
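
The "Other values" row and the frequency column can be rebuilt from the ten listed counts and the 10000 observations. A small sketch under that assumption follows; the helper name `pct` is illustrative, and the report's own rounding of borderline values such as 4.65% may differ from Python's formatting.

```python
# Top-10 counts for induty_cd, copied from the Common Values table above.
top_counts = {
    "A01": 1656, "A03": 1304, "C01": 816, "B02": 710, "C05": 465,
    "B01": 402, "C07": 353, "C03": 347, "C06": 346, "B03": 321,
}
n_rows = 10_000


def pct(count: int, total: int) -> str:
    """Format a count as a percentage of total with one decimal place."""
    return f"{100 * count / total:.1f}%"


# Everything not in the top 10 is folded into a single "Other values" row.
other_count = n_rows - sum(top_counts.values())

for value, count in top_counts.items():
    print(f"{value:<8}{count:>6}  {pct(count, n_rows)}")
print(f"{'Other':<8}{other_count:>6}  {pct(other_count, n_rows)}")
```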

Length

Histogram of lengths of the category

Value              Count  Frequency (%)
a01                1656   16.6%
a03                1304   13.0%
c01                816    8.2%
b02                710    7.1%
c05                465    4.7%
b01                402    4.0%
c07                353    3.5%
c03                347    3.5%
c06                346    3.5%
b03                321    3.2%
Other values (32)  3280   32.8%

gtfc_scor
Real number (ℝ)

Distinct: 4001
Distinct (%): 40.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 48.94822
Minimum: 6.01
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 6.01
5-th percentile: 30.258
Q1: 42.09
median: 48.33
Q3: 53.03
95-th percentile: 73.5625
Maximum: 100
Range: 93.99
Interquartile range (IQR): 10.94

Descriptive statistics

Standard deviation: 12.824083
Coefficient of variation (CV): 0.26199284
Kurtosis: 2.5645892
Mean: 48.94822
Median Absolute Deviation (MAD): 5.61
Skewness: 0.97889925
Sum: 489482.2
Variance: 164.45711
Monotonicity: Not monotonic
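
Several of the gtfc_scor statistics are derived from one another, so they can be checked for internal consistency. The sketch below uses only the figures reported above and the standard definitions (range = max − min, IQR = Q3 − Q1, variance = std², CV = std / mean, sum = mean × n). The tolerance is an arbitrary choice that absorbs the rounding in the report.

```python
import math

# Figures copied from the gtfc_scor statistics above.
n_rows = 10_000
minimum, maximum = 6.01, 100.0
q1, q3 = 42.09, 53.03
mean, std = 48.94822, 12.824083
reported = {
    "range": 93.99,
    "iqr": 10.94,
    "variance": 164.45711,
    "cv": 0.26199284,
    "sum": 489482.2,
}

# Recompute each derived statistic from its definition.
derived = {
    "range": maximum - minimum,
    "iqr": q3 - q1,
    "variance": std**2,
    "cv": std / mean,
    "sum": mean * n_rows,
}

# Compare with a relative tolerance that absorbs the report's rounding.
checks = {
    name: math.isclose(derived[name], reported[name], rel_tol=1e-6)
    for name in reported
}
print(checks)
```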
Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
49.79                455    4.5%
49.78                330    3.3%
49.8                 28     0.3%
47.84                12     0.1%
47.82                11     0.1%
48.69                11     0.1%
47.48                11     0.1%
49.81                10     0.1%
48.31                10     0.1%
44.8                 10     0.1%
Other values (3991)  9112   91.1%

Minimum 10 values

Value  Count  Frequency (%)
6.01   1      < 0.1%
7.93   1      < 0.1%
9.49   1      < 0.1%
9.87   1      < 0.1%
10.94  1      < 0.1%
11.72  1      < 0.1%
11.89  1      < 0.1%
12.02  1      < 0.1%
12.35  1      < 0.1%
12.36  1      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
100.0  9      0.1%
99.99  4      < 0.1%
99.98  1      < 0.1%
99.96  1      < 0.1%
99.95  2      < 0.1%
99.92  1      < 0.1%
99.84  1      < 0.1%
99.83  1      < 0.1%
99.75  1      < 0.1%
99.73  1      < 0.1%

Interactions

[Figures: nine pairwise interaction plots]

Correlations

              pnu    legaldong_cd  induty_cd  gtfc_scor
pnu           1.000  1.000         0.207      0.082
legaldong_cd  1.000  1.000         0.207      0.082
induty_cd     0.207  0.207         1.000      0.477
gtfc_scor     0.082  0.082         0.477      1.000

              pnu    legaldong_cd  gtfc_scor  induty_cd
pnu           1.000  1.000         0.046      0.073
legaldong_cd  1.000  1.000         0.046      0.073
gtfc_scor     0.046  0.046         1.000      0.184
induty_cd     0.073  0.073         0.184      1.000
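
The correlation of 1.000 between pnu and legaldong_cd has a simple explanation in the sample rows of this report: there, legaldong_cd is exactly the first 8 digits of the 19-digit pnu. The sketch below checks that on a handful of those rows. Whether the relation holds for every row of the full dataset is an assumption, and the helper name `derive_legaldong_cd` is illustrative.

```python
# (pnu, legaldong_cd) pairs copied from the Sample section of this report,
# kept as strings so that no digits of the 19-digit identifiers are lost.
sample_rows = [
    ("1121510500105530228", "11215105"),
    ("1132010700106500065", "11320107"),
    ("1111011100101560001", "11110111"),
    ("1138010700100850005", "11380107"),
    ("1135010500102920001", "11350105"),
    ("1126010100101920125", "11260101"),
]

PREFIX_LEN = 8  # assumption: legaldong_cd is the leading 8 digits of pnu


def derive_legaldong_cd(pnu: str) -> str:
    """Return the leading digits of pnu that correspond to legaldong_cd."""
    return pnu[:PREFIX_LEN]


# Collect any rows where the derived code disagrees with the reported one.
mismatches = [
    (pnu, code) for pnu, code in sample_rows
    if derive_legaldong_cd(pnu) != code
]
print("mismatches:", mismatches)
```

If the relation holds throughout, legaldong_cd carries no information beyond pnu, and one of the two columns could be dropped or derived on demand.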

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index)  data_strd_ym  pnu                  legaldong_cd  induty_cd  gtfc_scor
28762    202307        1121510500105530228  11215105      A01        46.5
60422    202307        1132010700106500065  11320107      B10        48.75
501      202307        1111011100101560001  11110111      C03        36.67
70598    202307        1138010700100850005  11380107      A12        54.64
58067    202307        1132010600102640064  11320106      A03        48.69
30537    202307        1121510700100340005  11215107      C05        39.87
33539    202307        1123010400102950009  11230104      C01        49.78
33436    202307        1123010400101410063  11230104      C01        49.78
17211    202307        1117013000101360002  11170130      A12        45.07
80610    202307        1144010800103370012  11440108      B03        46.0

(index)  data_strd_ym  pnu                  legaldong_cd  induty_cd  gtfc_scor
64017    202307        1135010500102920001  11350105      B19        28.39
42493    202307        1126010300103080010  11260103      A01        55.92
17691    202307        1117013100102570010  11170131      C05        40.51
10410    202307        1114015100100190029  11140151      C01        49.79
37802    202307        1123011000100780042  11230110      A03        32.22
65626    202307        1135010500110020000  11350105      B15        58.67
24497    202307        1121510100100970005  11215101      B02        50.26
42679    202307        1126010300103230108  11260103      B02        33.23
65992    202307        1135010500111320001  11350105      B02        49.39
39684    202307        1126010100101920125  11260101      C05        52.29