Overview

Dataset statistics

Number of variables: 10
Number of observations: 10000
Missing cells: 13326
Missing cells (%): 13.3%
Duplicate rows: 3
Duplicate rows (%): < 0.1%
Total size in memory: 888.7 KiB
Average record size in memory: 91.0 B
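
The derived figures in this block can be recomputed from the raw counts. The following is a minimal check, assuming 1 KiB = 1024 bytes and taking the per-column missing counts (9082 for 개장일자, 4244 for 신청일자) from the Alerts section of this report; the variable names are illustrative only.

```python
# Recompute the derived dataset statistics from the raw counts in the report.
n_variables = 10
n_observations = 10_000
missing_by_column = {"개장일자": 9082, "신청일자": 4244}  # from the Alerts section
total_size_kib = 888.7
duplicate_rows = 3

missing_cells = sum(missing_by_column.values())
missing_pct = 100 * missing_cells / (n_variables * n_observations)
avg_record_bytes = total_size_kib * 1024 / n_observations  # assumes 1 KiB = 1024 B
duplicate_pct = 100 * duplicate_rows / n_observations

print(missing_cells)
print(f"{missing_pct:.1f}%")
print(f"{avg_record_bytes:.1f} B")
print(f"{duplicate_pct:.2f}%")
```

The two columns flagged as missing account for all 13326 missing cells, which suggests the `<NA>` entries in the categorical columns are counted as a category rather than as missing.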

Variable types

Numeric: 2
Text: 1
DateTime: 4
Boolean: 1
Categorical: 2

Dataset

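Description: 부산시설공단_영락공원봉안사용현황_20230125 (Busan Infrastructure Corporation, Yeongnak Park columbarium usage status, 2023-01-25)
Author: 부산시설공단 (Busan Infrastructure Corporation)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15067589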

Alerts

Dataset has 3 (< 0.1%) duplicate rows [Duplicates]
봉안번호 (enshrinement number) is highly overall correlated with 연장차수 (number of extensions) [High correlation]
연장차수 is highly overall correlated with 봉안번호 [High correlation]
개장여부 (whether the remains were relocated) is highly imbalanced (55.7%) [Imbalance]
개장일자 (relocation date) has 9082 (90.8%) missing values [Missing]
신청일자 (application date) has 4244 (42.4%) missing values [Missing]

Reproduction

Analysis started: 2023-12-10 17:10:29.262691
Analysis finished: 2023-12-10 17:10:32.791387
Duration: 3.53 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
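
The reported duration follows directly from the two timestamps. A minimal sketch using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2023-12-10 17:10:29.262691")
finished = datetime.fromisoformat("2023-12-10 17:10:32.791387")

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```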

Variables

봉안번호 (enshrinement number)
Real number (ℝ)

HIGH CORRELATION

Distinct: 9997
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20273245
Minimum: 1
Maximum: 34083527
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 10101035
Q1: 21118737
Median: 22140496
Q3: 23062551
95-th percentile: 33978868
Maximum: 34083527
Range: 34083526
Interquartile range (IQR): 1943814.2

Descriptive statistics

Standard deviation: 7294490.2
Coefficient of variation (CV): 0.35980872
Kurtosis: 0.97011686
Mean: 20273245
Median Absolute Deviation (MAD): 1020873.5
Skewness: -0.68379464
Sum: 2.0273245 × 10^11
Variance: 5.3209587 × 10^13
Monotonicity: Not monotonic
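
Several of these statistics are simple functions of the others. The sketch below recomputes the range, IQR, coefficient of variation, and variance from the displayed values; because the report rounds what it displays, the results agree only up to rounding (for example, the IQR comes out as 1943814 rather than 1943814.2).

```python
# Values copied from the quantile and descriptive statistics of 봉안번호.
minimum, maximum = 1, 34083527
q1, q3 = 21118737, 23062551      # displayed rounded to integers
mean, std = 20273245, 7294490.2

value_range = maximum - minimum
iqr = q3 - q1                    # report shows 1943814.2 from unrounded quartiles
cv = std / mean                  # coefficient of variation
variance = std ** 2

print(value_range)
print(iqr)
print(round(cv, 6))
print(f"{variance:.4e}")
```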
Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
22959639              2       < 0.1%
21834337              2       < 0.1%
21322166              2       < 0.1%
21221635              1       < 0.1%
22345904              1       < 0.1%
34080954              1       < 0.1%
23164042              1       < 0.1%
21527469              1       < 0.1%
21526877              1       < 0.1%
22039438              1       < 0.1%
Other values (9987)   9987    99.9%

Minimum 10 values

Value   Count   Frequency (%)
1       1       < 0.1%
12      1       < 0.1%
25      1       < 0.1%
28      1       < 0.1%
43      1       < 0.1%
54      1       < 0.1%
57      1       < 0.1%
59      1       < 0.1%
68      1       < 0.1%
71      1       < 0.1%

Maximum 10 values

Value      Count   Frequency (%)
34083527   1       < 0.1%
34083522   1       < 0.1%
34083504   1       < 0.1%
34083479   1       < 0.1%
34083477   1       < 0.1%
34083475   1       < 0.1%
34083465   1       < 0.1%
34083464   1       < 0.1%
34083429   1       < 0.1%
34083428   1       < 0.1%

순번 (sequence number)
Real number (ℝ)

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.2156
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 1
95-th percentile: 2
Maximum: 9
Range: 8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.47868523
Coefficient of variation (CV): 0.39378516
Kurtosis: 17.725149
Mean: 1.2156
Median Absolute Deviation (MAD): 0
Skewness: 2.9364092
Sum: 12156
Variance: 0.22913955
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)

Common values

Value   Count   Frequency (%)
1       8071    80.7%
2       1739    17.4%
3       171     1.7%
4       10      0.1%
5       5       0.1%
6       2       < 0.1%
8       1       < 0.1%
9       1       < 0.1%

Minimum values

Value   Count   Frequency (%)
1       8071    80.7%
2       1739    17.4%
3       171     1.7%
4       10      0.1%
5       5       0.1%
6       2       < 0.1%
8       1       < 0.1%
9       1       < 0.1%

Maximum values

Value   Count   Frequency (%)
9       1       < 0.1%
8       1       < 0.1%
6       2       < 0.1%
5       5       0.1%
4       10      0.1%
3       171     1.7%
2       1739    17.4%
1       8071    80.7%
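
Because 순번 takes only eight distinct values, its sum, mean, variance, and standard deviation can be reproduced from the frequency table alone. The sketch below does this with the standard library; it assumes the sample variance (dividing by n - 1), which is the convention that matches the reported 0.22913955.

```python
from math import sqrt

# Value counts for 순번 as listed in the frequency table.
counts = {1: 8071, 2: 1739, 3: 171, 4: 10, 5: 5, 6: 2, 8: 1, 9: 1}

n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n

# Sample variance: squared deviations weighted by count, divided by n - 1.
variance = sum(count * (value - mean) ** 2 for value, count in counts.items()) / (n - 1)
std = sqrt(variance)

print(n, total, mean)
print(round(variance, 8), round(std, 8))
```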

봉안정보 (enshrinement location)
Text

Distinct: 9997
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 14
Median length: 14
Mean length: 14
Min length: 14

Characters and Unicode

Total characters: 140000
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9994
Unique (%): 99.9%

Sample

1st row: 02동 12실 21635호
2nd row: 02동 17실 32634호
3rd row: 02동 38실 75193호
4th row: 01동 08실 12534호
5th row: 01동 02실 04031호
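
Every value has the same 14-character layout, so it can be split with a regular expression. The sketch below assumes the layout "two digits + 동, two digits + 실, five digits + 호" (roughly building, room, and niche number; that reading is an assumption). It also checks an observation from the Sample table of this report: for these five rows, concatenating the three digit groups gives the row's 봉안번호. The function names are hypothetical.

```python
import re

# Assumed layout: two-digit 동, two-digit 실, five-digit 호, separated by single spaces.
PATTERN = re.compile(r"^(\d{2})동 (\d{2})실 (\d{5})호$")

def parse_location(text: str) -> tuple[str, str, str]:
    """Split a 봉안정보 string into its three digit groups."""
    match = PATTERN.match(text)
    if match is None:
        raise ValueError(f"unexpected 봉안정보 format: {text!r}")
    return match.groups()

def to_number(text: str) -> int:
    """Concatenate the three digit groups and read them as one integer."""
    return int("".join(parse_location(text)))

# (봉안정보, 봉안번호) pairs taken from the first five sample rows of the report.
sample = [
    ("02동 12실 21635호", 21221635),
    ("02동 17실 32634호", 21732634),
    ("02동 38실 75193호", 23875193),
    ("01동 08실 12534호", 10812534),
    ("01동 02실 04031호", 10204031),
]

for location, number in sample:
    print(parse_location(location), to_number(location) == number)
```

If this relationship holds across the whole dataset, 봉안번호 and 봉안정보 carry the same information, which would be worth confirming before using both as features.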

Value                 Count   Frequency (%)
02동                  6885    22.9%
01동                  1887    6.3%
03동                  845     2.8%
39실                  437     1.5%
40실                  408     1.4%
22실                  386     1.3%
00실                  383     1.3%
00동                  383     1.3%
23실                  334     1.1%
07실                  313     1.0%
Other values (9984)   17739   59.1%

Most occurring characters

Value              Count   Frequency (%)
(space)            20000   14.3%
0                  19753   14.1%
2                  15901   11.4%
1                  10664   7.6%
동                 10000   7.1%
실                 10000   7.1%
호                 10000   7.1%
3                  9344    6.7%
4                  6412    4.6%
6                  5951    4.3%
Other values (4)   21975   15.7%

Most occurring categories

Value             Count   Frequency (%)
Decimal Number    90000   64.3%
Other Letter      30000   21.4%
Space Separator   20000   14.3%

Most frequent character per category

Decimal Number

Value   Count   Frequency (%)
0       19753   21.9%
2       15901   17.7%
1       10664   11.8%
3       9344    10.4%
4       6412    7.1%
6       5951    6.6%
7       5888    6.5%
5       5826    6.5%
8       5241    5.8%
9       5020    5.6%

Other Letter

Value   Count   Frequency (%)
동      10000   33.3%
실      10000   33.3%
호      10000   33.3%

Space Separator

Value     Count   Frequency (%)
(space)   20000   100.0%

Most occurring scripts

Value    Count    Frequency (%)
Common   110000   78.6%
Hangul   30000    21.4%

Most frequent character per script

Common

Value     Count   Frequency (%)
(space)   20000   18.2%
0         19753   18.0%
2         15901   14.5%
1         10664   9.7%
3         9344    8.5%
4         6412    5.8%
6         5951    5.4%
7         5888    5.4%
5         5826    5.3%
8         5241    4.8%

Hangul

Value   Count   Frequency (%)
동      10000   33.3%
실      10000   33.3%
호      10000   33.3%

Most occurring blocks

Value    Count    Frequency (%)
ASCII    110000   78.6%
Hangul   30000    21.4%

Most frequent character per block

ASCII

Value     Count   Frequency (%)
(space)   20000   18.2%
0         19753   18.0%
2         15901   14.5%
1         10664   9.7%
3         9344    8.5%
4         6412    5.8%
6         5951    5.4%
7         5888    5.4%
5         5826    5.3%
8         5241    4.8%

Hangul

Value   Count   Frequency (%)
동      10000   33.3%
실      10000   33.3%
호      10000   33.3%

봉안일자 (enshrinement date)
Date

Distinct: 5071
Distinct (%): 50.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 1974-12-21 00:00:00
Maximum: 2020-09-25 00:00:00
Histogram with fixed size bins (bins=50)

만료일자 (expiry date)
Date

Distinct: 3874
Distinct (%): 38.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 1984-12-20 00:00:00
Maximum: 2035-09-24 00:00:00
Histogram with fixed size bins (bins=50)

개장여부 (whether the remains were relocated)
Boolean

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 87.9 KiB

Value   Count   Frequency (%)
False   9081    90.8%
True    919     9.2%
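
The 55.7% imbalance figure in the alert is consistent with one minus the normalized Shannon entropy of the class frequencies. The sketch below reproduces the number under that assumption; whether this is exactly the formula ydata-profiling applies is not confirmed by the report itself, and `imbalance_score` is a hypothetical helper name.

```python
from math import log2

# Class counts for 개장여부 from the frequency table.
counts = {"False": 9081, "True": 919}

def imbalance_score(class_counts: dict[str, int]) -> float:
    """1 - normalized Shannon entropy: 0 when balanced, near 1 when one class dominates."""
    if len(class_counts) < 2:
        raise ValueError("need at least two classes")
    total = sum(class_counts.values())
    probs = [c / total for c in class_counts.values() if c > 0]
    entropy = -sum(p * log2(p) for p in probs)
    # Dividing by log2(number of classes) scales the entropy into [0, 1].
    return 1 - entropy / log2(len(class_counts))

score = imbalance_score(counts)
print(f"{score:.1%}")
```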

개장일자 (relocation date)
Date

MISSING

Distinct: 640
Distinct (%): 69.7%
Missing: 9082
Missing (%): 90.8%
Memory size: 156.2 KiB
Minimum: 1996-08-14 00:00:00
Maximum: 2020-09-25 00:00:00
Histogram with fixed size bins (bins=50)
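
Note that the distinct percentage for a column with missing values is taken over the non-missing observations, not over all 10000 rows. A minimal check against the figures reported for 개장일자 and for 신청일자 (the helper name is illustrative):

```python
n_observations = 10_000

def distinct_pct(distinct: int, missing: int) -> float:
    """Distinct values as a percentage of the non-missing observations."""
    return 100 * distinct / (n_observations - missing)

print(f"{distinct_pct(640, 9082):.1f}%")   # 개장일자: 640 distinct, 9082 missing
print(f"{distinct_pct(1860, 4244):.1f}%")  # 신청일자: 1860 distinct, 4244 missing
```

Dividing by all 10000 rows instead would give 6.4% and 18.6%, which do not match the report.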

연장차수 (number of extensions)
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 2
Mean length: 2.8488
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2회
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value   Count   Frequency (%)
<NA>    4244    42.4%
1회     3859    38.6%
2회     1760    17.6%
3회     137     1.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count   Frequency (%)
na      4244    42.4%
1회     3859    38.6%
2회     1760    17.6%
3회     137     1.4%

신청일자 (application date)
Date

MISSING

Distinct: 1860
Distinct (%): 32.3%
Missing: 4244
Missing (%): 42.4%
Memory size: 156.2 KiB
Minimum: 2010-01-06 00:00:00
Maximum: 2020-09-25 00:00:00
Histogram with fixed size bins (bins=50)

사용료 (usage fee)
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 5
Median length: 5
Mean length: 4.3808
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 60000
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value   Count   Frequency (%)
60000   5253    52.5%
<NA>    4244    42.4%
0       487     4.9%
30000   16      0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count   Frequency (%)
60000   5253    52.5%
na      4244    42.4%
0       487     4.9%
30000   16      0.2%

Correlations

            봉안번호   순번     개장여부   연장차수   사용료
봉안번호    1.000      0.309    0.307      0.882      0.281
순번        0.309      1.000    0.099      0.073      0.204
개장여부    0.307      0.099    1.000      0.009      0.038
연장차수    0.882      0.073    0.009      1.000      0.390
사용료      0.281      0.204    0.038      0.390      1.000

            연장차수   개장여부   사용료
연장차수    1.000      0.015      0.142
개장여부    0.015      1.000      0.063
사용료      0.142      0.063      1.000

            봉안번호   순번     개장여부   연장차수   사용료
봉안번호    1.000      -0.169   0.221      0.584      0.093
순번        -0.169     1.000    0.075      0.046      0.131
개장여부    0.221      0.075    1.000      0.015      0.063
연장차수    0.584      0.046    0.015      1.000      0.142
사용료      0.093      0.131    0.063      0.142      1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

Columns: 봉안번호 (enshrinement number), 순번 (sequence number), 봉안정보 (enshrinement location), 봉안일자 (enshrinement date), 만료일자 (expiry date), 개장여부 (remains relocated, Y/N), 개장일자 (relocation date), 연장차수 (number of extensions), 신청일자 (application date), 사용료 (usage fee).

First rows

index  봉안번호  순번  봉안정보  봉안일자  만료일자  개장여부  개장일자  연장차수  신청일자  사용료
25204  21221635  1  02동 12실 21635호  1999-12-22  2024-12-21  N  <NA>  2회  2019-12-12  60000
36249  21732634  1  02동 17실 32634호  2001-07-29  2016-07-28  N  <NA>  <NA>  <NA>  <NA>
78909  23875193  1  02동 38실 75193호  2006-12-02  2021-12-01  N  <NA>  <NA>  <NA>  <NA>
16086  10812534  1  01동 08실 12534호  1998-02-27  2013-02-26  N  <NA>  <NA>  <NA>  <NA>
7583   10204031  1  01동 02실 04031호  1996-02-18  2011-02-17  N  <NA>  <NA>  <NA>  <NA>
12488  10608936  1  01동 06실 08936호  1997-03-26  2022-03-25  N  <NA>  2회  2017-01-27  60000
67183  23163492  1  02동 31실 63492호  2005-03-31  2025-03-30  N  <NA>  1회  2019-10-03  30000
19439  10915887  1  01동 09실 15887호  1998-11-30  2023-11-29  N  <NA>  2회  2019-09-12  60000
74136  23470429  1  02동 34실 70429호  2006-03-25  2021-03-24  Y  2017-01-18  <NA>  <NA>  <NA>
54211  22550553  1  02동 25실 50553호  2003-08-18  2023-08-17  N  <NA>  1회  2017-10-04  60000

Last rows

index  봉안번호  순번  봉안정보  봉안일자  만료일자  개장여부  개장일자  연장차수  신청일자  사용료
8734   10305182  1  01동 03실 05182호  1996-06-27  2016-06-26  N  <NA>  1회  2014-07-11  60000
11906  10608354  1  01동 06실 08354호  1997-01-30  2012-01-29  N  <NA>  <NA>  <NA>  <NA>
87876  34083504  1  03동 40실 83504호  2008-01-14  2023-01-13  N  <NA>  <NA>  <NA>  <NA>
3238   3239      1  00동 00실 03239호  2011-02-01  2021-01-31  N  <NA>  <NA>  <NA>  <NA>
6221   10202670  2  01동 02실 02670호  2016-03-28  2031-03-27  N  <NA>  <NA>  <NA>  <NA>
27808  21424224  1  02동 14실 24224호  2000-05-14  2025-05-13  N  <NA>  2회  2020-04-02  60000
65124  23061438  2  02동 30실 61438호  2008-12-17  2023-12-16  N  <NA>  <NA>  <NA>  <NA>
50237  22346582  1  02동 23실 46582호  2003-03-08  2023-03-07  N  <NA>  1회  2017-10-06  0
59603  22755938  1  02동 27실 55938호  2004-03-15  2019-03-14  Y  2014-11-04  <NA>  <NA>  <NA>
66461  23062773  1  02동 30실 62773호  2005-02-24  2025-02-23  N  <NA>  1회  2020-06-22  60000
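
In these sample rows the expiry date appears to follow the enshrinement date by a fixed term: 15 years with no extension, plus 5 years per extension, minus one day. This is only a pattern observed in the sample, not a rule stated by the data provider, and one sample row (00동 00실 03239호) does not fit it. The sketch below encodes that assumption and flags rows that deviate; the constants and function name are hypothetical.

```python
from datetime import date, timedelta

BASE_YEARS = 15          # assumption inferred from the sample rows
YEARS_PER_EXTENSION = 5  # assumption inferred from the sample rows

def expected_expiry(enshrined: date, extensions: int) -> date:
    """Expiry under the assumed rule: anniversary after the term, minus one day."""
    target_year = enshrined.year + BASE_YEARS + YEARS_PER_EXTENSION * extensions
    try:
        anniversary = enshrined.replace(year=target_year)
    except ValueError:
        # Feb 29 in a non-leap target year: fall back to Feb 28.
        anniversary = enshrined.replace(year=target_year, day=28)
    return anniversary - timedelta(days=1)

# (봉안일자, 만료일자, number of extensions) for six of the sample rows.
rows = [
    (date(1999, 12, 22), date(2024, 12, 21), 2),
    (date(2001, 7, 29), date(2016, 7, 28), 0),
    (date(2005, 3, 31), date(2025, 3, 30), 1),
    (date(2006, 3, 25), date(2021, 3, 24), 0),
    (date(2003, 3, 8), date(2023, 3, 7), 1),
    (date(2011, 2, 1), date(2021, 1, 31), 0),  # the 00동 00실 row
]

mismatches = [row for row in rows if expected_expiry(row[0], row[2]) != row[1]]
print(mismatches)
```

A check like this is a quick way to surface records that may follow a different contract term or contain entry errors.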

Duplicate rows

Most frequently occurring

index  봉안번호  순번  봉안정보  봉안일자  만료일자  개장여부  개장일자  연장차수  신청일자  사용료  # duplicates
0  21322166  2  02동 13실 22166호  2020-07-21  2035-07-20  N  <NA>  <NA>  <NA>  <NA>  2
1  21834337  2  02동 18실 34337호  2020-06-19  2035-06-18  N  <NA>  <NA>  <NA>  <NA>  2
2  22959639  2  02동 29실 59639호  2020-05-04  2035-05-03  N  <NA>  <NA>  <NA>  <NA>  2
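
Fully duplicated rows like these can be found by counting whole rows. The sketch below is a standard-library illustration on a small hand-built list: the three duplicated records from the table above (each entered twice to mimic the duplication) plus one ordinary sample row. Only five of the ten columns are included to keep the tuples short.

```python
from collections import Counter

# Rows as tuples: (봉안번호, 순번, 봉안정보, 봉안일자, 만료일자).
rows = [
    (21322166, 2, "02동 13실 22166호", "2020-07-21", "2035-07-20"),
    (21834337, 2, "02동 18실 34337호", "2020-06-19", "2035-06-18"),
    (22959639, 2, "02동 29실 59639호", "2020-05-04", "2035-05-03"),
    (21322166, 2, "02동 13실 22166호", "2020-07-21", "2035-07-20"),
    (21834337, 2, "02동 18실 34337호", "2020-06-19", "2035-06-18"),
    (22959639, 2, "02동 29실 59639호", "2020-05-04", "2035-05-03"),
    (21221635, 1, "02동 12실 21635호", "1999-12-22", "2024-12-21"),
]

# Count identical tuples and keep those that occur more than once.
row_counts = Counter(rows)
duplicates = {row: n for row, n in row_counts.items() if n > 1}

for row, n in duplicates.items():
    print(row[0], n)
```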