Dataset statistics
Number of variables | 4 |
---|---|
Number of observations | 10000 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 322.4 KiB |
Average record size in memory | 33.0 B |
Variable types
Numeric | 2 |
---|---|
Categorical | 2 |
Dataset
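Description | Asset management ledger of Korea South-East Power (한국남동발전). It includes fields such as management number (관리번호), asset unit name (자산단위명), acquisition date (취득일자), opening acquisition cost (기초취득가액), opening accumulated depreciation (기초상각누계액), and opening book value (기초장부가액). |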
---|---|
Author | 한국남동발전㈜ |
URL | https://www.data.go.kr/data/15064135/fileData.do |
취득일자 has a high cardinality: 1364 distinct values | High cardinality |
df_index is highly correlated with 관리번호 | High correlation |
관리번호 is highly correlated with df_index | High correlation |
df_index has unique values | Unique |
관리번호 has unique values | Unique |
Reproduction
Analysis started | 2022-10-03 07:13:18.559136 |
---|---|
Analysis finished | 2022-10-03 07:13:19.913612 |
Duration | 1.35 seconds |
Software version | pandas-profiling v3.2.0 |
Download configuration | config.json |
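A report of this kind can be regenerated roughly as follows. This is a minimal sketch, not the exact procedure behind this report: the helper name `build_report`, the CSV input, the `cp949` default encoding, and the report title are assumptions made for illustration. `ProfileReport` and `to_file` are the pandas-profiling entry points.

```python
def build_report(csv_path: str, html_path: str, encoding: str = "cp949") -> None:
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so this module still loads where the libraries are absent.
    import pandas as pd
    from pandas_profiling import ProfileReport

    # Assumption: the source file is a CSV in the given encoding.
    df = pd.read_csv(csv_path, encoding=encoding)

    # Build the profile and write it out as a standalone HTML page.
    profile = ProfileReport(df, title="Asset ledger profile")
    profile.to_file(html_path)
```

Calling `build_report("assets.csv", "report.html")` (both paths hypothetical) would write the report next to the input file.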
df_index
Numeric
Distinct | 10000 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 11684.0763 |
Minimum | 3 |
---|---|
Maximum | 23402 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 78.2 KiB |
Quantile statistics
Minimum | 3 |
---|---|
5-th percentile | 1153.95 |
Q1 | 5833.75 |
median | 11689.5 |
Q3 | 17558.25 |
95-th percentile | 22239.1 |
Maximum | 23402 |
Range | 23399 |
Interquartile range (IQR) | 11724.5 |
Descriptive statistics
Standard deviation | 6752.891927 |
---|---|
Coefficient of variation (CV) | 0.5779568495 |
Kurtosis | -1.197417564 |
Mean | 11684.0763 |
Median Absolute Deviation (MAD) | 5865.5 |
Skewness | 0.003004149179 |
Sum | 116840763 |
Variance | 45601549.38 |
Monotonicity | Not monotonic |
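Several of the figures in the quantile and descriptive tables are derived from the others, so they can be cross-checked by hand. The sketch below recomputes range, IQR, coefficient of variation, variance, and sum from the reported values; the variable names are illustrative only.

```python
import math

# Values copied from the quantile and descriptive statistics tables.
minimum, maximum = 3, 23402
q1, q3 = 5833.75, 17558.25
mean, std = 11684.0763, 6752.891927
n = 10000  # number of observations

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance
total = mean * n                 # Sum

print(value_range, iqr, round(cv, 6), round(variance, 2), round(total))
```

The recomputed values agree with the reported Range, IQR, CV, Variance, and Sum up to rounding of the inputs.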
Value | Count | Frequency (%) |
11178 | 1 | < 0.1% |
13760 | 1 | < 0.1% |
22419 | 1 | < 0.1% |
5735 | 1 | < 0.1% |
9982 | 1 | < 0.1% |
3480 | 1 | < 0.1% |
13025 | 1 | < 0.1% |
9105 | 1 | < 0.1% |
9572 | 1 | < 0.1% |
2070 | 1 | < 0.1% |
Other values (9990) | 9990 | 99.9% |
Value | Count | Frequency (%) |
3 | 1 | < 0.1% |
7 | 1 | < 0.1% |
10 | 1 | < 0.1% |
11 | 1 | < 0.1% |
16 | 1 | < 0.1% |
17 | 1 | < 0.1% |
24 | 1 | < 0.1% |
25 | 1 | < 0.1% |
30 | 1 | < 0.1% |
32 | 1 | < 0.1% |
Value | Count | Frequency (%) |
23402 | 1 | < 0.1% |
23400 | 1 | < 0.1% |
23396 | 1 | < 0.1% |
23394 | 1 | < 0.1% |
23392 | 1 | < 0.1% |
23382 | 1 | < 0.1% |
23374 | 1 | < 0.1% |
23369 | 1 | < 0.1% |
23365 | 1 | < 0.1% |
23363 | 1 | < 0.1% |
관리번호
Numeric
Distinct | 10000 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 4281111865 |
Minimum | 1010000062 |
---|---|
Maximum | 9010000076 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 88.0 KiB |
Quantile statistics
Minimum | 1010000062 |
---|---|
5-th percentile | 1010002261 |
Q1 | 4010007716 |
median | 4010022478 |
Q3 | 4710007799 |
95-th percentile | 6510010733 |
Maximum | 9010000076 |
Range | 8000000014 |
Interquartile range (IQR) | 700000083.5 |
Descriptive statistics
Standard deviation | 1610727564 |
---|---|
Coefficient of variation (CV) | 0.3762404756 |
Kurtosis | 0.2120058872 |
Mean | 4281111865 |
Median Absolute Deviation (MAD) | 699981277.5 |
Skewness | -0.1173307152 |
Sum | 4.281111865 × 10^13 |
Variance | 2.594443286 × 10^18 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
4010021425 | 1 | < 0.1% |
4710007009 | 1 | < 0.1% |
4010005899 | 1 | < 0.1% |
1010002725 | 1 | < 0.1% |
4010008933 | 1 | < 0.1% |
4710001102 | 1 | < 0.1% |
4010011042 | 1 | < 0.1% |
4010013424 | 1 | < 0.1% |
4010025788 | 1 | < 0.1% |
4010019121 | 1 | < 0.1% |
Other values (9990) | 9990 | 99.9% |
Value | Count | Frequency (%) |
1010000062 | 1 | < 0.1% |
1010000066 | 1 | < 0.1% |
1010000069 | 1 | < 0.1% |
1010000070 | 1 | < 0.1% |
1010000075 | 1 | < 0.1% |
1010000076 | 1 | < 0.1% |
1010000085 | 1 | < 0.1% |
1010000086 | 1 | < 0.1% |
1010000091 | 1 | < 0.1% |
1010000093 | 1 | < 0.1% |
Value | Count | Frequency (%) |
9010000076 | 1 | < 0.1% |
9010000073 | 1 | < 0.1% |
9010000069 | 1 | < 0.1% |
9010000067 | 1 | < 0.1% |
9010000065 | 1 | < 0.1% |
9010000050 | 1 | < 0.1% |
9010000040 | 1 | < 0.1% |
9010000034 | 1 | < 0.1% |
9010000030 | 1 | < 0.1% |
9010000027 | 1 | < 0.1% |
자산단위명
Categorical
Distinct | 15 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 78.2 KiB |
Length
Max length | 8 |
---|---|
Median length | 7 |
Mean length | 4.1017 |
Min length | 2 |
Unique
Unique | 1 |
---|---|
Unique (%) | < 0.1% |
Sample
1st row | 기계장치 |
---|---|
2nd row | 비품 |
3rd row | 기계장치 |
4th row | 토지 |
5th row | 기계장치 |
Common Values
Value | Count | Frequency (%) |
기계장치 | 4226 | 42.3% |
기계장치-저장품 | 1600 | 16.0% |
비품 | 1494 | 14.9% |
토지 | 891 | 8.9% |
건물 | 527 | 5.3% |
공구와기구 | 468 | 4.7% |
구축물 | 453 | 4.5% |
소프트웨어 | 186 | 1.9% |
차량운반구 | 71 | 0.7% |
종합검사원가 | 50 | 0.5% |
Other values (5) | 34 | 0.3% |
취득일자
Categorical
Distinct | 1364 |
---|---|
Distinct (%) | 13.6% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 78.2 KiB |
Length
Max length | 10 |
---|---|
Median length | 10 |
Mean length | 10 |
Min length | 10 |
Unique
Unique | 641 |
---|---|
Unique (%) | 6.4% |
Sample
1st row | 2014-12-31 |
---|---|
2nd row | 2012-12-31 |
3rd row | 2007-10-26 |
4th row | 2008-04-01 |
5th row | 2010-02-28 |
Common Values
Value | Count | Frequency (%) |
2001-04-02 | 1116 | 11.2% |
2010-01-01 | 652 | 6.5% |
2014-06-30 | 324 | 3.2% |
2015-12-31 | 239 | 2.4% |
2015-01-01 | 190 | 1.9% |
2008-04-01 | 167 | 1.7% |
1997-03-01 | 162 | 1.6% |
2019-12-31 | 158 | 1.6% |
2016-08-31 | 156 | 1.6% |
2004-07-12 | 154 | 1.5% |
Other values (1354) | 6682 | 66.8% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r; only the sign of the slope does. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
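The calculation described here can be sketched in plain Python. This is an illustrative implementation of the textbook formula, not the code the report itself runs; the function name `pearson_r` and the sample data are assumptions.

```python
import math


def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator.
    return cov / math.sqrt(var_x * var_y)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]

r_linear = pearson_r(xs, ys)                         # exact linear relation
r_shifted = pearson_r([10 * x + 3 for x in xs], ys)  # location/scale change of X
r_negative = pearson_r(xs, [-y for y in ys])         # slope sign flipped
```

Rescaling and shifting X leaves r unchanged, while negating Y flips its sign, which is the invariance property stated in the text.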
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
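In other words, ρ is Pearson's r applied to ranks. The sketch below illustrates this with average ranks for ties; the helper names `average_ranks`, `pearson_r`, and `spearman_rho` are assumptions for the example and not part of the report's tooling.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance over the product of standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables of X and Y."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]   # monotonic but nonlinear
rho = spearman_rho(xs, ys)  # ranks agree perfectly
r = pearson_r(xs, ys)       # penalised by the curvature
```

On this cubic example the ranks of X and Y coincide, so ρ reaches its maximum while r stays below it.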
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
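The pair-counting definition can be sketched directly. This illustrates the simple variant described in the text (often called tau-a); library implementations may differ, for example SciPy's `kendalltau` defaults to tau-b, which adjusts for ties. The function name `kendall_tau` and the sample data are assumptions.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        product = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if product > 0:
            concordant += 1  # both variables order the pair the same way
        elif product < 0:
            discordant += 1  # the variables order the pair oppositely
        # product == 0 is a tie: counted in the total but in neither group
    return (concordant - discordant) / len(pairs)


tau_same = kendall_tau([1, 2, 3, 4], [10, 20, 30, 40])      # identical ordering
tau_reversed = kendall_tau([1, 2, 3, 4], [40, 30, 20, 10])  # reversed ordering
tau_mixed = kendall_tau([1, 2, 3, 4], [1, 3, 2, 4])         # one swapped pair
```

In the last case five of the six pairs are concordant and one is discordant, giving (5 - 1) / 6.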
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
First rows
Row | df_index | 관리번호 | 자산단위명 | 취득일자 |
---|---|---|---|---|
0 | 11178 | 4010021425 | 기계장치 | 2014-12-31 |
1 | 20046 | 6510004460 | 비품 | 2012-12-31 |
2 | 5229 | 4010005899 | 기계장치 | 2007-10-26 |
3 | 1463 | 1010002725 | 토지 | 2008-04-01 |
4 | 6448 | 4010008933 | 기계장치 | 2010-02-28 |
5 | 15043 | 4710001102 | 기계장치-저장품 | 2010-01-01 |
6 | 7060 | 4010011042 | 기계장치 | 1984-02-01 |
7 | 8293 | 4010013424 | 기계장치 | 2012-07-31 |
8 | 14227 | 4010025788 | 기계장치 | 2019-12-31 |
9 | 17143 | 4710007009 | 기계장치-저장품 | 2017-05-17 |
Last rows
Row | df_index | 관리번호 | 자산단위명 | 취득일자 |
---|---|---|---|---|
9990 | 22906 | 8200000220 | 소프트웨어 | 2012-03-29 |
9991 | 12828 | 4010024000 | 기계장치 | 2016-08-31 |
9992 | 14881 | 4710000860 | 기계장치-저장품 | 2010-01-01 |
9993 | 14705 | 4710000597 | 기계장치-저장품 | 2010-01-01 |
9994 | 14531 | 4710000360 | 기계장치-저장품 | 2010-01-01 |
9995 | 17362 | 4710007457 | 기계장치-저장품 | 2017-06-27 |
9996 | 12896 | 4010024068 | 기계장치 | 2016-08-31 |
9997 | 5871 | 4010007762 | 기계장치 | 2001-04-02 |
9998 | 14962 | 4710000989 | 기계장치-저장품 | 2010-01-01 |
9999 | 20538 | 6510008643 | 비품 | 2015-09-30 |