Overview

Dataset statistics

Number of variables: 10
Number of observations: 10000
Missing cells: 13370
Missing cells (%): 13.4%
Duplicate rows: 3
Duplicate rows (%): < 0.1%
Total size in memory: 888.7 KiB
Average record size in memory: 91.0 B
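
The two derived figures in this list can be rechecked from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes; the variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" list.
n_variables = 10
n_observations = 10_000
missing_cells = 13_370
total_size_kib = 888.7

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Both results round to the reported values (13.4% and 91.0 B).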

Variable types

Numeric: 2
Text: 1
DateTime: 4
Boolean: 1
Categorical: 2

Dataset

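Description: 부산시설공단_영락공원봉안사용현황_20220125 (Busan Infrastructure Corporation, Yeongnak Park columbarium usage status, 2022-01-25)
Author: 부산시설공단 (Busan Infrastructure Corporation)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15067589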

Alerts

Dataset has 3 (< 0.1%) duplicate rows [Duplicates]
봉안번호 is highly overall correlated with 연장차수 [High correlation]
연장차수 is highly overall correlated with 봉안번호 [High correlation]
개장여부 is highly imbalanced (55.1%) [Imbalance]
개장일자 has 9063 (90.6%) missing values [Missing]
신청일자 has 4307 (43.1%) missing values [Missing]
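
The two Missing alerts appear to account for every missing cell in the dataset. The sketch below adds the per-column counts and compares the sum with the dataset-level total. One reading of this, offered as an interpretation rather than something the report states, is that the `<NA>` entries listed for 연장차수 and 사용료 are counted as ordinary category values, since both variables report 0 missing.

```python
# Per-column missing counts taken from the "Missing" alerts.
missing_by_column = {"개장일자": 9063, "신청일자": 4307}

# "Missing cells" and "Number of observations" from the dataset statistics.
reported_missing_cells = 13_370
n_observations = 10_000

total_missing = sum(missing_by_column.values())
print(total_missing == reported_missing_cells)

for column, count in missing_by_column.items():
    print(f"{column}: {100 * count / n_observations:.1f}% missing")
```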

Reproduction

Analysis started: 2023-12-10 17:10:41.532870
Analysis finished: 2023-12-10 17:10:44.906705
Duration: 3.37 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
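
A report of this kind could be regenerated roughly as follows. This is a minimal sketch, not the configuration actually used: the CSV file name, its encoding, and the report title are assumptions, and the original `config.json` settings are not reproduced.

```python
from pathlib import Path


def build_report(csv_path: Path, html_path: Path) -> None:
    """Profile a CSV export of the dataset and write an HTML report."""
    # Imported lazily so this module loads without the third-party packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the portal's CSV is CP949-encoded; use "utf-8" if it is not.
    df = pd.read_csv(csv_path, encoding="cp949")

    # The title is illustrative; it does not appear in the original report.
    report = ProfileReport(df, title="Yeongnak Park columbarium usage")
    report.to_file(html_path)


# Hypothetical file names; neither appears in the report.
# build_report(Path("yeongnak_columbarium.csv"), Path("report.html"))
```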

Variables

봉안번호 (enshrinement number)
Real number (ℝ)

HIGH CORRELATION

Distinct: 9997
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20333527
Minimum: 17
Maximum: 34083543
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 17
5-th percentile: 10101034
Q1: 21118336
Median: 22140984
Q3: 23163388
95-th percentile: 33979420
Maximum: 34083543
Range: 34083526
Interquartile range (IQR): 2045051.8

Descriptive statistics

Standard deviation: 7385654.7
Coefficient of variation (CV): 0.36322545
Kurtosis: 0.87034895
Mean: 20333527
Median Absolute Deviation (MAD): 1022517.5
Skewness: -0.63894205
Sum: 2.0333527 × 10^11
Variance: 5.4547895 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
24183981 | 2 | < 0.1%
22960829 | 2 | < 0.1%
21937931 | 2 | < 0.1%
34082451 | 1 | < 0.1%
10914726 | 1 | < 0.1%
23266749 | 1 | < 0.1%
10609263 | 1 | < 0.1%
22755855 | 1 | < 0.1%
23164095 | 1 | < 0.1%
22449498 | 1 | < 0.1%
Other values (9987) | 9987 | 99.9%

Minimum 10 values

Value | Count | Frequency (%)
17 | 1 | < 0.1%
23 | 1 | < 0.1%
25 | 1 | < 0.1%
28 | 1 | < 0.1%
32 | 1 | < 0.1%
43 | 1 | < 0.1%
44 | 1 | < 0.1%
45 | 1 | < 0.1%
48 | 1 | < 0.1%
67 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
34083543 | 1 | < 0.1%
34083527 | 1 | < 0.1%
34083517 | 1 | < 0.1%
34083515 | 1 | < 0.1%
34083507 | 1 | < 0.1%
34083499 | 1 | < 0.1%
34083479 | 1 | < 0.1%
34083471 | 1 | < 0.1%
34083464 | 1 | < 0.1%
34083443 | 1 | < 0.1%

순번 (sequence number)
Real number (ℝ)

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.2112
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 1
95-th percentile: 2
Maximum: 9
Range: 8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.47393778
Coefficient of variation (CV): 0.39129606
Kurtosis: 14.319518
Mean: 1.2112
Median Absolute Deviation (MAD): 0
Skewness: 2.7745642
Sum: 12112
Variance: 0.22461702
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)

Common values

Value | Count | Frequency (%)
1 | 8120 | 81.2%
2 | 1683 | 16.8%
3 | 173 | 1.7%
4 | 19 | 0.2%
5 | 2 | < 0.1%
6 | 2 | < 0.1%
9 | 1 | < 0.1%

Minimum 10 values

Value | Count | Frequency (%)
1 | 8120 | 81.2%
2 | 1683 | 16.8%
3 | 173 | 1.7%
4 | 19 | 0.2%
5 | 2 | < 0.1%
6 | 2 | < 0.1%
9 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
9 | 1 | < 0.1%
6 | 2 | < 0.1%
5 | 2 | < 0.1%
4 | 19 | 0.2%
3 | 173 | 1.7%
2 | 1683 | 16.8%
1 | 8120 | 81.2%
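
Because 순번 takes only seven distinct values, its summary statistics can be rebuilt from the value counts alone. The sketch below does so; dividing by n − 1 (the sample variance) is an assumption about the profiler's convention, chosen because it is the divisor that matches the reported variance.

```python
import math

# Value counts copied from the 순번 frequency table.
counts = {1: 8120, 2: 1683, 3: 173, 4: 19, 5: 2, 6: 2, 9: 1}

n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n

# Sample variance: weighted squared deviations divided by n - 1.
squared_dev = sum(count * (value - mean) ** 2 for value, count in counts.items())
variance = squared_dev / (n - 1)
std = math.sqrt(variance)
cv = std / mean

print(f"n={n}, sum={total}, mean={mean}")
print(f"variance={variance:.8f}, std={std:.8f}, cv={cv:.8f}")
```

The recomputed mean, variance, standard deviation, and CV agree with the report to the displayed precision.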

봉안정보 (enshrinement information)
Text

Distinct: 9997
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 14
Median length: 14
Mean length: 14
Min length: 14

Characters and Unicode

Total characters: 140000
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9994
Unique (%): 99.9%

Sample

1st row: 01동 08실 14220호
2nd row: 02동 32실 65544호
3rd row: 02동 34실 70772호
4th row: 02동 19실 36967호
5th row: 01동 07실 11374호
Value | Count | Frequency (%)
02동 | 6799 | 22.7%
01동 | 1909 | 6.4%
03동 | 912 | 3.0%
40실 | 476 | 1.6%
39실 | 436 | 1.5%
00실 | 380 | 1.3%
00동 | 380 | 1.3%
22실 | 356 | 1.2%
07실 | 339 | 1.1%
23실 | 316 | 1.1%
Other values (9988) | 17697 | 59.0%

Most occurring characters

Value | Count | Frequency (%)
(space) | 20000 | 14.3%
0 | 19783 | 14.1%
2 | 15823 | 11.3%
1 | 10596 | 7.6%
동 | 10000 | 7.1%
실 | 10000 | 7.1%
호 | 10000 | 7.1%
3 | 9448 | 6.7%
4 | 6408 | 4.6%
5 | 5909 | 4.2%
Other values (4) | 22033 | 15.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 90000 | 64.3%
Other Letter | 30000 | 21.4%
Space Separator | 20000 | 14.3%

Most frequent character per category

Decimal Number

Value | Count | Frequency (%)
0 | 19783 | 22.0%
2 | 15823 | 17.6%
1 | 10596 | 11.8%
3 | 9448 | 10.5%
4 | 6408 | 7.1%
5 | 5909 | 6.6%
7 | 5908 | 6.6%
6 | 5903 | 6.6%
8 | 5294 | 5.9%
9 | 4928 | 5.5%

Other Letter

Value | Count | Frequency (%)
동 | 10000 | 33.3%
실 | 10000 | 33.3%
호 | 10000 | 33.3%

Space Separator

Value | Count | Frequency (%)
(space) | 20000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 110000 | 78.6%
Hangul | 30000 | 21.4%

Most frequent character per script

Common

Value | Count | Frequency (%)
(space) | 20000 | 18.2%
0 | 19783 | 18.0%
2 | 15823 | 14.4%
1 | 10596 | 9.6%
3 | 9448 | 8.6%
4 | 6408 | 5.8%
5 | 5909 | 5.4%
7 | 5908 | 5.4%
6 | 5903 | 5.4%
8 | 5294 | 4.8%

Hangul

Value | Count | Frequency (%)
동 | 10000 | 33.3%
실 | 10000 | 33.3%
호 | 10000 | 33.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 110000 | 78.6%
Hangul | 30000 | 21.4%

Most frequent character per block

ASCII

Value | Count | Frequency (%)
(space) | 20000 | 18.2%
0 | 19783 | 18.0%
2 | 15823 | 14.4%
1 | 10596 | 9.6%
3 | 9448 | 8.6%
4 | 6408 | 5.8%
5 | 5909 | 5.4%
7 | 5908 | 5.4%
6 | 5903 | 5.4%
8 | 5294 | 4.8%

Hangul

Value | Count | Frequency (%)
동 | 10000 | 33.3%
실 | 10000 | 33.3%
호 | 10000 | 33.3%

봉안일자 (enshrinement date)
Date

Distinct: 5003
Distinct (%): 50.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 1974-12-21 00:00:00
Maximum: 2020-09-25 00:00:00
Histogram with fixed size bins (bins=50)

만료일자 (expiry date)
Date

Distinct: 3870
Distinct (%): 38.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 1984-12-20 00:00:00
Maximum: 2035-09-24 00:00:00
Histogram with fixed size bins (bins=50)

개장여부 (whether the remains were removed)
Boolean

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 87.9 KiB

Value | Count | Frequency (%)
False | 9062 | 90.6%
True | 938 | 9.4%
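
The 55.1% imbalance reported for this variable is consistent with one minus the normalised Shannon entropy of the class distribution. The sketch below computes that quantity from the two counts; treating this as the profiler's definition is an assumption, supported here only by the fact that the numbers agree.

```python
import math

# Class counts copied from the 개장여부 frequency table.
counts = {False: 9062, True: 938}

n = sum(counts.values())
probabilities = [count / n for count in counts.values()]

# Shannon entropy in bits, normalised by the maximum entropy log2(k).
entropy = -sum(p * math.log2(p) for p in probabilities)
imbalance = 1 - entropy / math.log2(len(counts))

print(f"Imbalance: {imbalance:.1%}")
```

A perfectly balanced variable would score 0% and a constant one 100%, so 55.1% reflects the roughly 9:1 split between False and True.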

개장일자 (removal date)
Date

MISSING

Distinct: 660
Distinct (%): 70.4%
Missing: 9063
Missing (%): 90.6%
Memory size: 156.2 KiB
Minimum: 1995-01-01 00:00:00
Maximum: 2020-09-24 00:00:00
Histogram with fixed size bins (bins=50)

연장차수 (extension count)
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 2
Mean length: 2.8614
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1회
2nd row: 1회
3rd row: <NA>
4th row: 1회
5th row: 2회

Common Values

Value | Count | Frequency (%)
<NA> | 4307 | 43.1%
1회 | 3775 | 37.8%
2회 | 1787 | 17.9%
3회 | 131 | 1.3%

Length

Histogram of lengths of the category

신청일자 (application date)
Date

MISSING

Distinct: 1821
Distinct (%): 32.0%
Missing: 4307
Missing (%): 43.1%
Memory size: 156.2 KiB
Minimum: 2009-11-10 00:00:00
Maximum: 2020-09-25 00:00:00
Histogram with fixed size bins (bins=50)

사용료 (usage fee)
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 5
Median length: 5
Mean length: 4.3901
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 60000
2nd row: 60000
3rd row: <NA>
4th row: 60000
5th row: 60000

Common Values

Value | Count | Frequency (%)
60000 | 5231 | 52.3%
<NA> | 4307 | 43.1%
0 | 448 | 4.5%
30000 | 14 | 0.1%

Length

Histogram of lengths of the category

Interactions

[Figures: four interaction plots]

Correlations

Variable | 봉안번호 | 순번 | 개장여부 | 연장차수 | 사용료
봉안번호 | 1.000 | 0.301 | 0.295 | 0.872 | 0.249
순번 | 0.301 | 1.000 | 0.066 | 0.077 | 0.184
개장여부 | 0.295 | 0.066 | 1.000 | 0.013 | 0.029
연장차수 | 0.872 | 0.077 | 0.013 | 1.000 | 0.394
사용료 | 0.249 | 0.184 | 0.029 | 0.394 | 1.000

Variable | 연장차수 | 개장여부 | 사용료
연장차수 | 1.000 | 0.021 | 0.144
개장여부 | 0.021 | 1.000 | 0.048
사용료 | 0.144 | 0.048 | 1.000

Variable | 봉안번호 | 순번 | 개장여부 | 연장차수 | 사용료
봉안번호 | 1.000 | -0.176 | 0.212 | 0.567 | 0.080
순번 | -0.176 | 1.000 | 0.071 | 0.051 | 0.125
개장여부 | 0.212 | 0.071 | 1.000 | 0.021 | 0.048
연장차수 | 0.567 | 0.051 | 0.021 | 1.000 | 0.144
사용료 | 0.080 | 0.125 | 0.048 | 0.144 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows

Index | 봉안번호 | 순번 | 봉안정보 | 봉안일자 | 만료일자 | 개장여부 | 개장일자 | 연장차수 | 신청일자 | 사용료
17772 | 10814220 | 1 | 01동 08실 14220호 | 1998-07-25 | 2018-07-24 | N | <NA> | 1회 | 2014-09-10 | 60000
69237 | 23265544 | 1 | 02동 32실 65544호 | 2005-07-16 | 2025-07-15 | N | <NA> | 1회 | 2019-10-11 | 60000
74479 | 23470772 | 1 | 02동 34실 70772호 | 2006-04-11 | 2021-04-10 | N | <NA> | <NA> | <NA> | <NA>
40589 | 21936967 | 1 | 02동 19실 36967호 | 2002-02-10 | 2022-02-09 | N | <NA> | 1회 | 2017-01-12 | 60000
14926 | 10711374 | 1 | 01동 07실 11374호 | 1997-11-21 | 2022-11-20 | N | <NA> | 2회 | 2017-10-30 | 60000
51311 | 22347656 | 2 | 02동 23실 47656호 | 2015-05-26 | 2030-05-25 | N | <NA> | <NA> | <NA> | <NA>
66863 | 23063173 | 1 | 02동 30실 63173호 | 2005-03-16 | 2025-03-15 | N | <NA> | 1회 | 2020-01-29 | 60000
48719 | 22245068 | 1 | 02동 22실 45068호 | 2003-01-10 | 2018-01-09 | Y | 2014-04-05 | <NA> | <NA> | <NA>
47282 | 22243634 | 1 | 02동 22실 43634호 | 2002-11-20 | 2022-11-19 | N | <NA> | 1회 | 2017-09-30 | 60000
2697 | 2698 | 2 | 00동 00실 02698호 | 2018-11-30 | 2028-11-29 | N | <NA> | <NA> | <NA> | <NA>

Last rows

Index | 봉안번호 | 순번 | 봉안정보 | 봉안일자 | 만료일자 | 개장여부 | 개장일자 | 연장차수 | 신청일자 | 사용료
72929 | 23469224 | 1 | 02동 34실 69224호 | 2006-01-23 | 2021-01-22 | N | <NA> | <NA> | <NA> | <NA>
49521 | 22345866 | 1 | 02동 23실 45866호 | 2003-02-06 | 2023-02-05 | N | <NA> | 1회 | 2018-01-18 | 60000
5785 | 10202234 | 1 | 01동 02실 02234호 | 1995-08-22 | 2020-08-21 | Y | 2019-06-09 | 2회 | 2015-02-23 | 60000
82071 | 33977708 | 1 | 03동 39실 77708호 | 2007-04-02 | 2022-04-01 | N | <NA> | <NA> | <NA> | <NA>
86944 | 34082575 | 1 | 03동 40실 82575호 | 2007-12-04 | 2022-12-03 | N | <NA> | <NA> | <NA> | <NA>
66919 | 23063228 | 1 | 02동 30실 63228호 | 2005-03-19 | 2025-03-18 | N | <NA> | 1회 | 2019-09-15 | 60000
54295 | 22550637 | 1 | 02동 25실 50637호 | 2003-08-21 | 2018-08-20 | Y | 2010-10-28 | <NA> | <NA> | <NA>
14753 | 10711201 | 1 | 01동 07실 11201호 | 1998-03-02 | 2023-03-01 | N | <NA> | 2회 | 2017-09-23 | 60000
36949 | 21733333 | 1 | 02동 17실 33333호 | 2001-09-03 | 2021-09-02 | N | <NA> | 1회 | 2016-02-14 | 60000
44166 | 22140527 | 1 | 02동 21실 40527호 | 2002-07-18 | 2022-07-17 | N | <NA> | 1회 | 2017-02-05 | 0
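
In every sample row, 봉안번호 equals the three digit groups of 봉안정보 read as a single integer (for example, 01동 08실 14220호 gives 10814220). The sketch below encodes that observation. It is an inference from the rows shown, not a documented rule of the dataset, and the function name is illustrative.

```python
import re

# Pattern for strings such as "01동 08실 14220호": 2, 2, and 5 digits.
LOCATION = re.compile(r"(\d{2})동 (\d{2})실 (\d{5})호")


def location_to_number(location: str) -> int:
    """Concatenate the three digit groups and read them as one integer."""
    match = LOCATION.fullmatch(location)
    if match is None:
        raise ValueError(f"unexpected 봉안정보 format: {location!r}")
    return int("".join(match.groups()))


# (봉안정보, 봉안번호) pairs copied from the sample rows.
samples = [
    ("01동 08실 14220호", 10814220),
    ("02동 32실 65544호", 23265544),
    ("03동 40실 82575호", 34082575),
    ("00동 00실 02698호", 2698),
]
for location, number in samples:
    print(location, location_to_number(location) == number)
```

If the relationship holds for the full dataset, it would also explain why 봉안번호 and 봉안정보 both have 9997 distinct values.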

Duplicate rows

Most frequently occurring

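Index | 봉안번호 | 순번 | 봉안정보 | 봉안일자 | 만료일자 | 개장여부 | 개장일자 | 연장차수 | 신청일자 | 사용료 | # duplicates
0 | 21937931 | 3 | 02동 19실 37931호 | 2020-05-10 | 2035-05-09 | N | <NA> | <NA> | <NA> | <NA> | 2
1 | 22960829 | 2 | 02동 29실 60829호 | 2020-09-06 | 2035-09-05 | N | <NA> | <NA> | <NA> | <NA> | 2
2 | 24183981 | 2 | 02동 41실 83981호 | 2020-08-07 | 2035-08-06 | N | <NA> | <NA> | <NA> | <NA> | 2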
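
The table lists three rows that each occur twice, which matches the 3 duplicate rows reported in the overview. The sketch below shows one way to count duplicates with the standard library. The rows are reduced to (봉안번호, 순번) pairs for brevity, and the final non-duplicated row is added only for contrast.

```python
from collections import Counter

# Toy rows (봉안번호, 순번); the repeated entries mirror the duplicates table.
rows = [
    (21937931, 3), (21937931, 3),
    (22960829, 2), (22960829, 2),
    (24183981, 2), (24183981, 2),
    (10814220, 1),  # a non-duplicated row taken from the sample
]

occurrences = Counter(rows)

# Rows that appear more than once, with their occurrence counts.
duplicated = {row: n for row, n in occurrences.items() if n > 1}

# Surplus copies: how many rows would be dropped by de-duplication.
extra_rows = sum(n - 1 for n in duplicated.values())

print(f"{len(duplicated)} duplicated rows, {extra_rows} surplus copies")
```

Here the number of duplicated rows and the number of surplus copies are both 3, so either reading of "Duplicate rows" gives the reported figure.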