Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 507.8 KiB
Average record size in memory: 52.0 B
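
The average record size follows from the total memory footprint divided by the number of observations. A minimal check, assuming 1 KiB = 1024 bytes and using only the figures reported above (the variable names are illustrative):

```python
# Figures copied from the dataset statistics above.
total_kib = 507.8   # total size in memory, in KiB
n_rows = 10_000     # number of observations

# Convert KiB to bytes, then spread over all rows.
avg_bytes = total_kib * 1024 / n_rows
print(f"{avg_bytes:.1f} B per record")
```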

Variable types

Categorical: 2
Numeric: 3

Dataset

Description: Sample
Author: 오아시스비즈니스
URL: https://www.bigdata-realestate.kr/rebpp/usr/prd/prdInfoDetail.do?req_productId=66

Alerts

data_strd_ym has constant value "202306" [Constant]
pnu is highly overall correlated with legaldong_cd [High correlation]
legaldong_cd is highly overall correlated with pnu [High correlation]
pul_party_sopsrt_dims is highly skewed (γ1 = 30.25402225) [Skewed]
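
The skewness alert is driven by the γ1 statistic. The sketch below shows one common way to compute it, the bias-adjusted Fisher-Pearson estimator; whether the report uses exactly this estimator is an assumption, and `sample_skewness` and the toy data are hypothetical, not taken from the dataset.

```python
import math

def sample_skewness(values):
    """Bias-adjusted Fisher-Pearson skewness (G1) of a sequence of numbers."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    mean = sum(values) / n
    # Second and third central moments (population form).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    g1 = m3 / m2 ** 1.5
    # Small-sample adjustment applied on top of g1.
    return math.sqrt(n * (n - 1)) / (n - 2) * g1

# Hypothetical toy data: many small values plus one very large one.
right_tailed = [0.01] * 9 + [957.12]
symmetric = [1, 2, 3, 4, 5]
print(sample_skewness(right_tailed))
print(sample_skewness(symmetric))
```

A single extreme value among many small ones is enough to push the statistic strongly positive, while a symmetric sample gives zero.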

Reproduction

Analysis started: 2023-12-11 22:31:59.971019
Analysis finished: 2023-12-11 22:32:03.094336
Duration: 3.12 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
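
The reported duration can be checked against the two timestamps. A small sketch using only the standard library (variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the reproduction details above.
started = datetime.fromisoformat("2023-12-11 22:31:59.971019")
finished = datetime.fromisoformat("2023-12-11 22:32:03.094336")

# Subtracting two datetimes yields a timedelta.
elapsed = (finished - started).total_seconds()
print(f"{elapsed:.2f} seconds")
```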

Variables

data_strd_ym
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value   Count
202306  10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202306
2nd row: 202306
3rd row: 202306
4th row: 202306
5th row: 202306

Common Values

Value   Count  Frequency (%)
202306  10000  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
202306  10000  100.0%

pnu
Real number (ℝ)

HIGH CORRELATION 

Distinct: 8820
Distinct (%): 88.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.1248855 × 10^18
Minimum: 1.1110101 × 10^18
Maximum: 1.1410111 × 10^18
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1.1110101 × 10^18
5-th percentile: 1.1110157 × 10^18
Q1: 1.1200103 × 10^18
median: 1.123011 × 10^18
Q3: 1.1320105 × 10^18
95-th percentile: 1.1380107 × 10^18
Maximum: 1.1410111 × 10^18
Range: 3.0001 × 10^16
Interquartile range (IQR): 1.20002 × 10^16

Descriptive statistics

Standard deviation: 8.4598481 × 10^15
Coefficient of variation (CV): 0.00752063
Kurtosis: -1.0354692
Mean: 1.1248855 × 10^18
Median Absolute Deviation (MAD): 7.4991 × 10^15
Skewness: -0.00078400995
Sum: -3.6584034 × 10^18 (64-bit integer overflow; mean × number of observations gives about 1.1248855 × 10^22)
Variance: 7.1569029 × 10^31
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
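
Every pnu value is a positive 19-digit integer near 1.12 × 10^18, so adding 10,000 of them gives roughly 1.12 × 10^22, far beyond the signed 64-bit maximum of about 9.22 × 10^18. A sum accumulated in 64-bit integers therefore wraps around and can come out negative. The sketch below simulates that wraparound in plain Python; `wrap_int64` is a hypothetical helper, and the inputs are the smallest listed pnu and the row count from this report rather than the raw data.

```python
INT64_MAX = 2**63 - 1

def wrap_int64(value: int) -> int:
    """Reduce an exact Python integer to what a signed 64-bit integer would hold."""
    return (value + 2**63) % 2**64 - 2**63

n_rows = 10_000
pnu_min = 1_111_010_100_100_010_000  # smallest pnu listed in the report

# Even if every row held the smallest value, the exact sum would not fit in int64.
lower_bound = n_rows * pnu_min
print(lower_bound > INT64_MAX)

# One step past the maximum wraps to the most negative representable value.
print(wrap_int64(INT64_MAX + 1))
```

Python integers are arbitrary precision, so summing the column as Python `int` values (or as floats, at the cost of exactness) avoids the wraparound.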

Common Values

Value                Count  Frequency (%)
1123010400105910053  6      0.1%
1120011300105600000  6      0.1%
1111010700100800000  6      0.1%
1129013800103160003  6      0.1%
1132010700101340036  6      0.1%
1129013600102220000  5      0.1%
1121510300105460011  5      0.1%
1120011500102800021  5      0.1%
1123010400100100000  5      0.1%
1111017700102330000  5      0.1%
Other values (8810)  9945   99.5%

Minimum 10 values

Value                Count  Frequency (%)
1111010100100010000  1      < 0.1%
1111010100100500031  1      < 0.1%
1111010100100660000  1      < 0.1%
1111010100101110001  1      < 0.1%
1111010100101310000  1      < 0.1%
1111010200100010028  1      < 0.1%
1111010200100570000  1      < 0.1%
1111010300100130008  1      < 0.1%
1111010400100580000  1      < 0.1%
1111010400100700001  1      < 0.1%

Maximum 10 values

Value                Count  Frequency (%)
1141011100102660006  1      < 0.1%
1141011100102500002  1      < 0.1%
1141011100102460003  1      < 0.1%
1141011100102450011  1      < 0.1%
1141011100102170002  1      < 0.1%
1141011100102140001  1      < 0.1%
1141011100102110000  1      < 0.1%
1141011100101790008  1      < 0.1%
1141011100101760001  1      < 0.1%
1141011100101750001  1      < 0.1%

legaldong_cd
Real number (ℝ)

HIGH CORRELATION 

Distinct: 298
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11248855
Minimum: 11110101
Maximum: 11410111
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 11110101
5-th percentile: 11110157
Q1: 11200103
median: 11230110
Q3: 11320105
95-th percentile: 11380107
Maximum: 11410111
Range: 300010
Interquartile range (IQR): 120002

Descriptive statistics

Standard deviation: 84598.481
Coefficient of variation (CV): 0.00752063
Kurtosis: -1.0354692
Mean: 11248855
Median Absolute Deviation (MAD): 74991
Skewness: -0.00078400981
Sum: 1.1248855 × 10^11
Variance: 7.1569029 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
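
Several of the statistics above are simple functions of the others, which makes the block easy to sanity-check. A minimal sketch using the reported (rounded) figures for legaldong_cd; the variable names are illustrative:

```python
import math

# Reported values for legaldong_cd, copied from the statistics above.
n = 10_000
mean = 11248855
std = 84598.481
minimum, maximum = 11110101, 11410111
q1, q3 = 11200103, 11320105

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
total = mean * n                 # sum, recovered from the (rounded) mean

print(f"CV={cv:.8f} variance={variance:.7e} IQR={iqr} range={value_range} sum={total:.7e}")
```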

Common Values

Value               Count  Frequency (%)
11260101            325    3.2%
11305103            323    3.2%
11350105            319    3.2%
11305101            273    2.7%
11215101            260    2.6%
11230106            218    2.2%
11320107            217    2.2%
11380107            210    2.1%
11215105            204    2.0%
11350103            204    2.0%
Other values (288)  7447   74.5%

Minimum 10 values

Value     Count  Frequency (%)
11110101  5      0.1%
11110102  2      < 0.1%
11110103  1      < 0.1%
11110104  3      < 0.1%
11110105  4      < 0.1%
11110106  8      0.1%
11110107  16     0.2%
11110108  13     0.1%
11110109  7      0.1%
11110110  9      0.1%

Maximum 10 values

Value     Count  Frequency (%)
11410111  34     0.3%
11410110  41     0.4%
11410109  2      < 0.1%
11410108  12     0.1%
11410107  3      < 0.1%
11410106  4      < 0.1%
11410105  12     0.1%
11410104  9      0.1%
11410103  1      < 0.1%
11410102  18     0.2%

induty_cd
Categorical

Distinct: 42
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value              Count
A01                1644
A03                1236
C01                747
B02                676
C05                508
Other values (37)  5189

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: A02
2nd row: C07
3rd row: B22
4th row: B10
5th row: C05

Common Values

Value              Count  Frequency (%)
A01                1644   16.4%
A03                1236   12.4%
C01                747    7.5%
B02                676    6.8%
C05                508    5.1%
C03                384    3.8%
B01                363    3.6%
C06                353    3.5%
B05                350    3.5%
C07                325    3.2%
Other values (32)  3414   34.1%

Length

Histogram of lengths of the category

Value              Count  Frequency (%)
a01                1644   16.4%
a03                1236   12.4%
c01                747    7.5%
b02                676    6.8%
c05                508    5.1%
c03                384    3.8%
b01                363    3.6%
c06                353    3.5%
b05                350    3.5%
c07                325    3.2%
Other values (32)  3414   34.1%

pul_party_sopsrt_dims
Real number (ℝ)

SKEWED 

Distinct: 512
Distinct (%): 5.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.528906
Minimum: 0.01
Maximum: 957.12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0.01
5-th percentile: 0.02
Q1: 0.06
median: 0.14
Q3: 0.34
95-th percentile: 1.71
Maximum: 957.12
Range: 957.11
Interquartile range (IQR): 0.28

Descriptive statistics

Standard deviation: 20.089753
Coefficient of variation (CV): 13.139953
Kurtosis: 1144.6127
Mean: 1.528906
Median Absolute Deviation (MAD): 0.1
Skewness: 30.254022
Sum: 15289.06
Variance: 403.59819
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
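
With a range of 957.11 split into 50 fixed-width bins, each bin is about 19 units wide, while 95% of the values are at most 1.71. The sketch below illustrates the consequence, assuming equal-width bins that span the minimum to the maximum (the default behaviour of `numpy.histogram`; that the report bins this way is an assumption). `bin_index` is a hypothetical helper.

```python
def bin_index(value, lo, hi, bins):
    """Index of the equal-width bin containing value; the last bin includes hi."""
    if not lo <= value <= hi:
        raise ValueError("value outside histogram range")
    width = (hi - lo) / bins
    # Clamp so the maximum falls in the last bin instead of one past it.
    return min(int((value - lo) / width), bins - 1)

# Minimum, maximum and bin count reported for pul_party_sopsrt_dims.
lo, hi, bins = 0.01, 957.12, 50

print((hi - lo) / bins)               # width of one bin
print(bin_index(1.71, lo, hi, bins))  # bin holding the 95th percentile
print(bin_index(hi, lo, hi, bins))    # bin holding the maximum
```

Because the 95th percentile already lands in the first bin, a fixed-bin histogram of this variable is dominated by a single bar; a log-scaled axis is a common alternative for data this skewed.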

Common Values

Value               Count  Frequency (%)
0.02                590    5.9%
0.03                504    5.0%
0.04                484    4.8%
0.06                443    4.4%
0.01                436    4.4%
0.05                421    4.2%
0.07                393    3.9%
0.08                383    3.8%
0.09                298    3.0%
0.12                271    2.7%
Other values (502)  5777   57.8%

Minimum 10 values

Value  Count  Frequency (%)
0.01   436    4.4%
0.02   590    5.9%
0.03   504    5.0%
0.04   484    4.8%
0.05   421    4.2%
0.06   443    4.4%
0.07   393    3.9%
0.08   383    3.8%
0.09   298    3.0%
0.1    266    2.7%

Maximum 10 values

Value   Count  Frequency (%)
957.12  1      < 0.1%
849.17  1      < 0.1%
742.28  1      < 0.1%
586.33  1      < 0.1%
384.92  1      < 0.1%
319.18  1      < 0.1%
304.63  1      < 0.1%
292.96  4      < 0.1%
270.22  1      < 0.1%
245.29  1      < 0.1%

Interactions

[Figures: nine pairwise interaction plots]

Correlations

                       pnu    legaldong_cd  induty_cd  pul_party_sopsrt_dims
pnu                    1.000  0.999         0.196      0.065
legaldong_cd           0.999  1.000         0.194      0.067
induty_cd              0.196  0.194         1.000      0.000
pul_party_sopsrt_dims  0.065  0.067         0.000      1.000

                       pnu    legaldong_cd  pul_party_sopsrt_dims  induty_cd
pnu                    1.000  1.000         0.089                  0.069
legaldong_cd           1.000  1.000         0.090                  0.068
pul_party_sopsrt_dims  0.089  0.090         1.000                  0.000
induty_cd              0.069  0.068         0.000                  1.000
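
The near-perfect correlation between pnu and legaldong_cd has a structural explanation that is visible in the report's sample rows: the first 8 digits of each 19-digit pnu equal that row's legaldong_cd. The sketch below checks this on five (pnu, legaldong_cd) pairs copied from the sample; `leading_code` is a hypothetical helper, and the check says nothing about rows outside the sample.

```python
# (pnu, legaldong_cd) pairs copied from the report's sample rows.
samples = [
    (1138010600100150120, 11380106),
    (1123010300109920029, 11230103),
    (1121510900101250054, 11215109),
    (1120011500102350001, 11200115),
    (1130510100108670099, 11305101),
]

def leading_code(pnu: int, digits: int = 8) -> int:
    """Return the leading `digits` digits of a 19-digit pnu using integer arithmetic."""
    return pnu // 10 ** (19 - digits)

matches = [leading_code(pnu) == code for pnu, code in samples]
print(all(matches))
```

Integer division keeps the computation exact; converting a 19-digit pnu to a 64-bit float first would lose its low-order digits, since such values exceed 2^53.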

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

       data_strd_ym  pnu                  legaldong_cd  induty_cd  pul_party_sopsrt_dims
80673  202306        1138010600100150120  11380106      A02        0.04
38196  202306        1123010300109920029  11230103      C07        0.33
36189  202306        1121510900101250054  11215109      B22        0.25
26177  202306        1120011500102350001  11200115      B10        0.65
61592  202306        1130510100108670099  11305101      C05        0.43
74903  202306        1135010500106690000  11350105      A03        0.16
70645  202306        1132010800106360001  11320108      A03        1.52
1317   202306        1111012200100260000  11110122      A01        0.01
24576  202306        1120011200111140000  11200112      A02        0.49
39290  202306        1123010500100010049  11230105      A01        0.65

       data_strd_ym  pnu                  legaldong_cd  induty_cd  pul_party_sopsrt_dims
56549  202306        1129013500100030251  11290135      C07        0.02
184    202306        1111010600100910050  11110106      A01        0.01
74077  202306        1135010500103660003  11350105      B02        0.92
16416  202306        1117010200100440002  11170102      C03        0.04
62095  202306        1130510200104150038  11305102      A14        0.18
78296  202306        1138010300102970028  11380103      C03        0.35
23062  202306        1120010700100030027  11200107      B01        0.43
73127  202306        1135010400101680001  11350104      A13        0.38
46452  202306        1126010100104970007  11260101      B18        0.62
52308  202306        1129010300106440000  11290103      A01        0.07