Overview

Dataset statistics

Number of variables: 10
Number of observations: 10000
Missing cells: 13293
Missing cells (%): 13.3%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 888.7 KiB
Average record size in memory: 91.0 B
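
The summary figures above can be cross-checked with a little arithmetic. The sketch below assumes that the missing-cell percentage is the number of missing cells divided by variables × observations, and that the average record size is total memory divided by the row count. Both assumptions reproduce the reported values. The per-column missing counts (9099 and 4194) are taken from the Alerts section of this report.

```python
# Figures copied from the report.
n_variables = 10
n_observations = 10_000
missing_by_column = {"개장일자": 9099, "신청일자": 4194}  # from the Alerts section
total_memory_kib = 888.7

# Assumption: missing % = missing cells / (variables * observations).
missing_cells = sum(missing_by_column.values())
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Assumption: average record size = total memory in bytes / observations.
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(missing_cells)               # 13293
print(round(missing_pct, 1))       # 13.3
print(round(avg_record_bytes, 1))  # 91.0
```

The two columns flagged as Missing account for every missing cell in the dataset.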

Variable types

Numeric: 1
Categorical: 3
Text: 1
DateTime: 4
Boolean: 1

Dataset

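Description: Enshrinement records for Busan Yeongnak Park (부산 영락공원), covering enshrinement information (봉안정보), enshrinement date (봉안일자), expiration date (만료일자), exhumation status (개장여부), exhumation date (개장일자), extension count (연장차수), application date (신청일자), usage fee (사용료), and related fields
Author: 부산시설공단 (Busan Infrastructure Corporation)
URL: https://www.data.go.kr/data/15067589/fileData.do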

Alerts

Dataset has 1 (< 0.1%) duplicate rows [Duplicates]
봉안번호 is highly overall correlated with 연장차수 [High correlation]
연장차수 is highly overall correlated with 봉안번호 [High correlation]
순번 is highly imbalanced (60.1%) [Imbalance]
개장여부 is highly imbalanced (56.3%) [Imbalance]
개장일자 has 9099 (91.0%) missing values [Missing]
신청일자 has 4194 (41.9%) missing values [Missing]
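
The two imbalance percentages are consistent with a score of one minus the normalized Shannon entropy of the class counts. The sketch below is inferred from the numbers in this report rather than quoted from the library, and the function name `imbalance_score` is hypothetical. The counts come from the value tables of 순번 and 개장여부.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (base 2)
    of the class shares and k is the number of classes."""
    if len(counts) < 2:
        return 0.0  # a single class has no entropy to normalize
    total = sum(counts)
    shares = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in shares)
    return 1 - entropy / math.log2(len(counts))

sunbeon_counts = [8118, 1680, 188, 14]  # 순번: values 1, 2, 3, 4
gaejang_counts = [9099, 901]            # 개장여부: False, True

print(f"{imbalance_score(sunbeon_counts):.1%}")  # 60.1%
print(f"{imbalance_score(gaejang_counts):.1%}")  # 56.3%
```

A score of 0% would mean perfectly balanced classes, and a score near 100% would mean almost all rows fall in one class.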

Reproduction

Analysis started: 2023-12-12 11:12:45.162104
Analysis finished: 2023-12-12 11:12:47.868835
Duration: 2.71 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

봉안번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9999
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20207251
Minimum: 11
Maximum: 34083548
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 11
5-th percentile: 10100764
Q1: 21118442
median: 22040041
Q3: 23062048
95-th percentile: 33979281
Maximum: 34083548
Range: 34083537
Interquartile range (IQR): 1943606.2

Descriptive statistics

Standard deviation: 7400078.4
Coefficient of variation (CV): 0.36620906
Kurtosis: 0.93317053
Mean: 20207251
Median Absolute Deviation (MAD): 922243.5
Skewness: -0.68993318
Sum: 2.0207251 × 10^11
Variance: 5.4761161 × 10^13
Monotonicity: Not monotonic
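
Several of these statistics are derived from one another, which makes them easy to sanity-check. The sketch below recomputes the coefficient of variation, variance, range, and interquartile range from the rounded values printed in this report, so small differences in the last digits are expected.

```python
# Rounded values as printed in the report for 봉안번호.
mean = 20207251
std = 7400078.4
minimum, maximum = 11, 34083548
q1, q3 = 21118442, 23062048

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range

print(f"{cv:.6f}")        # 0.366209
print(f"{variance:.4e}")  # 5.4761e+13
print(value_range)        # 34083537
print(iqr)                # 1943606
```

The recomputed IQR differs from the reported 1943606.2 by a fraction because the quartiles are displayed as rounded integers.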
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
22345559 | 2 | < 0.1%
10607610 | 1 | < 0.1%
2856 | 1 | < 0.1%
22450236 | 1 | < 0.1%
10711948 | 1 | < 0.1%
33979690 | 1 | < 0.1%
10203473 | 1 | < 0.1%
21936479 | 1 | < 0.1%
23164155 | 1 | < 0.1%
23368989 | 1 | < 0.1%
Other values (9989) | 9989 | 99.9%

Minimum 10 values
Value | Count | Frequency (%)
11 | 1 | < 0.1%
36 | 1 | < 0.1%
42 | 1 | < 0.1%
44 | 1 | < 0.1%
46 | 1 | < 0.1%
51 | 1 | < 0.1%
53 | 1 | < 0.1%
60 | 1 | < 0.1%
68 | 1 | < 0.1%
81 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
34083548 | 1 | < 0.1%
34083536 | 1 | < 0.1%
34083532 | 1 | < 0.1%
34083527 | 1 | < 0.1%
34083524 | 1 | < 0.1%
34083520 | 1 | < 0.1%
34083519 | 1 | < 0.1%
34083516 | 1 | < 0.1%
34083509 | 1 | < 0.1%
34083508 | 1 | < 0.1%

순번
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 1
3rd row: 2
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 8118 | 81.2%
2 | 1680 | 16.8%
3 | 188 | 1.9%
4 | 14 | 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

봉안정보
Text

Distinct: 9999
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 14
Median length: 14
Mean length: 14
Min length: 14

Characters and Unicode

Total characters: 140000
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
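
As an illustration of that idea, the sketch below classifies the characters of the first sample row reported for this variable by Unicode general category, using Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical. The codes `Nd`, `Lo`, and `Zs` correspond to the Decimal Number, Other Letter, and Space Separator categories listed in this report.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count characters by Unicode general category (e.g. Nd, Lo, Zs)."""
    return Counter(unicodedata.category(ch) for ch in text)

sample = "01동 06실 07610호"  # first sample row of this variable
counts = category_counts(sample)

print(len(sample))   # 14
print(dict(counts))  # {'Nd': 9, 'Lo': 3, 'Zs': 2}
```

Multiplying these per-row counts by 10000 rows gives the reported totals of 90000 decimal digits, 30000 letters, and 20000 spaces.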

Unique

Unique: 9998
Unique (%): > 99.9%

Sample

1st row: 01동 06실 07610호
2nd row: 02동 35실 73302호
3rd row: 02동 14실 26425호
4th row: 02동 18실 35215호
5th row: 02동 10실 17114호

Words

Value | Count | Frequency (%)
02동 | 6841 | 22.8%
01동 | 1883 | 6.3%
03동 | 858 | 2.9%
40실 | 436 | 1.5%
39실 | 422 | 1.4%
00실 | 418 | 1.4%
00동 | 418 | 1.4%
07실 | 369 | 1.2%
22실 | 354 | 1.2%
23실 | 328 | 1.1%
Other values (9992) | 17673 | 58.9%

Most occurring characters

Value | Count | Frequency (%)
(space) | 20000 | 14.3%
0 | 19860 | 14.2%
2 | 15871 | 11.3%
1 | 10652 | 7.6%
동 | 10000 | 7.1%
실 | 10000 | 7.1%
호 | 10000 | 7.1%
3 | 9157 | 6.5%
4 | 6497 | 4.6%
7 | 5985 | 4.3%
Other values (4) | 21978 | 15.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 90000 | 64.3%
Other Letter | 30000 | 21.4%
Space Separator | 20000 | 14.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 19860 | 22.1%
2 | 15871 | 17.6%
1 | 10652 | 11.8%
3 | 9157 | 10.2%
4 | 6497 | 7.2%
7 | 5985 | 6.7%
5 | 5888 | 6.5%
6 | 5861 | 6.5%
8 | 5205 | 5.8%
9 | 5024 | 5.6%

Other Letter
Value | Count | Frequency (%)
동 | 10000 | 33.3%
실 | 10000 | 33.3%
호 | 10000 | 33.3%

Space Separator
Value | Count | Frequency (%)
(space) | 20000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 110000 | 78.6%
Hangul | 30000 | 21.4%

Most frequent character per script

Common
Value | Count | Frequency (%)
(space) | 20000 | 18.2%
0 | 19860 | 18.1%
2 | 15871 | 14.4%
1 | 10652 | 9.7%
3 | 9157 | 8.3%
4 | 6497 | 5.9%
7 | 5985 | 5.4%
5 | 5888 | 5.4%
6 | 5861 | 5.3%
8 | 5205 | 4.7%

Hangul
Value | Count | Frequency (%)
동 | 10000 | 33.3%
실 | 10000 | 33.3%
호 | 10000 | 33.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 110000 | 78.6%
Hangul | 30000 | 21.4%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 20000 | 18.2%
0 | 19860 | 18.1%
2 | 15871 | 14.4%
1 | 10652 | 9.7%
3 | 9157 | 8.3%
4 | 6497 | 5.9%
7 | 5985 | 5.4%
5 | 5888 | 5.4%
6 | 5861 | 5.3%
8 | 5205 | 4.7%

Hangul
Value | Count | Frequency (%)
동 | 10000 | 33.3%
실 | 10000 | 33.3%
호 | 10000 | 33.3%

봉안일자
Date

Distinct: 5010
Distinct (%): 50.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 1974-12-21 00:00:00
Maximum: 2020-09-25 00:00:00
Histogram with fixed size bins (bins=50)

만료일자
Date

Distinct: 3805
Distinct (%): 38.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 1984-12-20 00:00:00
Maximum: 2035-09-24 00:00:00
Histogram with fixed size bins (bins=50)

개장여부
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 87.9 KiB

Value | Count | Frequency (%)
False | 9099 | 91.0%
True | 901 | 9.0%

개장일자
Date

MISSING 

Distinct: 656
Distinct (%): 72.8%
Missing: 9099
Missing (%): 91.0%
Memory size: 156.2 KiB
Minimum: 1996-07-24 00:00:00
Maximum: 2020-09-25 00:00:00
Histogram with fixed size bins (bins=50)

연장차수
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 2
Mean length: 2.8388
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1회
2nd row: <NA>
3rd row: <NA>
4th row: 1회
5th row: 1회

Common Values

Value | Count | Frequency (%)
<NA> | 4194 | 41.9%
1회 | 3896 | 39.0%
2회 | 1770 | 17.7%
3회 | 140 | 1.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

신청일자
Date

MISSING 

Distinct: 1819
Distinct (%): 31.3%
Missing: 4194
Missing (%): 41.9%
Memory size: 156.2 KiB
Minimum: 2009-11-27 00:00:00
Maximum: 2020-09-25 00:00:00
Histogram with fixed size bins (bins=50)
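
For the date columns with missing values, the reported Distinct (%) appears to be computed over non-missing rows rather than all 10000 rows. The sketch below tests that reading against the figures for 개장일자 and 신청일자. The helper name `distinct_pct` is hypothetical and the formula is an assumption that happens to reproduce the report.

```python
N_ROWS = 10_000  # number of observations in the report

def distinct_pct(distinct, missing, total=N_ROWS):
    """Distinct values as a percentage of non-missing rows (assumed definition)."""
    return 100 * distinct / (total - missing)

print(round(distinct_pct(656, 9099), 1))   # 개장일자 → 72.8
print(round(distinct_pct(1819, 4194), 1))  # 신청일자 → 31.3
```

Dividing by all 10000 rows instead would give 6.6% and 18.2%, which do not match the report.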

사용료
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 5
Median length: 5
Mean length: 4.3734
Min length: 1
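
The mean lengths reported for 연장차수 (2.8388) and 사용료 (4.3734) are reproduced only if `<NA>` is counted as a four-character string. That suggests missing entries in these two columns were profiled as a literal category, which would also explain why both show Missing 0 alongside 4194 `<NA>` values. The sketch below checks the arithmetic; `mean_length` is a hypothetical helper and the interpretation is an inference from the numbers.

```python
def mean_length(value_counts):
    """Mean string length, weighted by how often each category occurs."""
    total = sum(value_counts.values())
    return sum(len(value) * n for value, n in value_counts.items()) / total

# Value counts copied from the Common Values tables.
extension = {"<NA>": 4194, "1회": 3896, "2회": 1770, "3회": 140}  # 연장차수
fee = {"60000": 5274, "<NA>": 4194, "0": 518, "30000": 14}        # 사용료

print(round(mean_length(extension), 4))  # 2.8388
print(round(mean_length(fee), 4))        # 4.3734
```

Treating these columns as nullable before profiling would likely move the 4194 rows into the Missing count instead.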

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: <NA>
3rd row: <NA>
4th row: 60000
5th row: 60000

Common Values

Value | Count | Frequency (%)
60000 | 5274 | 52.7%
<NA> | 4194 | 41.9%
0 | 518 | 5.2%
30000 | 14 | 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Interactions

[Figure: interactions scatter plot]

Correlations

[Figure: correlation heatmap]
(variable) | 봉안번호 | 순번 | 개장여부 | 연장차수 | 사용료
봉안번호 | 1.000 | 0.359 | 0.298 | 0.892 | 0.288
순번 | 0.359 | 1.000 | 0.103 | 0.059 | 0.172
개장여부 | 0.298 | 0.103 | 1.000 | 0.014 | 0.022
연장차수 | 0.892 | 0.059 | 0.014 | 1.000 | 0.406
사용료 | 0.288 | 0.172 | 0.022 | 0.406 | 1.000

[Figure: correlation heatmap]
(variable) | 사용료 | 개장여부 | 순번 | 연장차수
사용료 | 1.000 | 0.036 | 0.163 | 0.150
개장여부 | 0.036 | 1.000 | 0.068 | 0.024
순번 | 0.163 | 0.068 | 1.000 | 0.055
연장차수 | 0.150 | 0.024 | 0.055 | 1.000

[Figure: correlation heatmap]
(variable) | 봉안번호 | 순번 | 개장여부 | 연장차수 | 사용료
봉안번호 | 1.000 | 0.239 | 0.215 | 0.600 | 0.096
순번 | 0.239 | 1.000 | 0.068 | 0.055 | 0.163
개장여부 | 0.215 | 0.068 | 1.000 | 0.024 | 0.036
연장차수 | 0.600 | 0.055 | 0.024 | 1.000 | 0.150
사용료 | 0.096 | 0.163 | 0.036 | 0.150 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows
(index) | 봉안번호 | 순번 | 봉안정보 | 봉안일자 | 만료일자 | 개장여부 | 개장일자 | 연장차수 | 신청일자 | 사용료
11162 | 10607610 | 2 | 01동 06실 07610호 | 2002-03-01 | 2022-02-28 | N | <NA> | 1회 | 2019-11-04 | 0
77010 | 23573302 | 1 | 02동 35실 73302호 | 2006-08-25 | 2021-08-24 | N | <NA> | <NA> | <NA> | <NA>
30025 | 21426425 | 2 | 02동 14실 26425호 | 2018-04-05 | 2033-04-04 | N | <NA> | <NA> | <NA> | <NA>
38834 | 21835215 | 1 | 02동 18실 35215호 | 2001-11-27 | 2021-11-26 | N | <NA> | 1회 | 2016-11-21 | 60000
20670 | 21017114 | 1 | 02동 10실 17114호 | 1999-03-12 | 2019-03-11 | N | <NA> | 1회 | 2014-01-22 | 60000
42976 | 22039347 | 1 | 02동 20실 39347호 | 2002-05-25 | 2022-05-24 | N | <NA> | 1회 | 2016-09-16 | 60000
6002 | 10202451 | 1 | 01동 02실 02451호 | 1995-09-13 | 2020-09-12 | N | <NA> | 2회 | 2015-08-23 | 60000
82953 | 33978589 | 1 | 03동 39실 78589호 | 2007-05-15 | 2022-05-14 | N | <NA> | <NA> | <NA> | <NA>
66594 | 23062905 | 1 | 02동 30실 62905호 | 2005-03-03 | 2025-03-02 | N | <NA> | 1회 | 2020-01-26 | 60000
20186 | 21016633 | 1 | 02동 10실 16633호 | 1999-02-07 | 2024-02-06 | N | <NA> | 2회 | 2018-09-16 | 60000

Last rows
(index) | 봉안번호 | 순번 | 봉안정보 | 봉안일자 | 만료일자 | 개장여부 | 개장일자 | 연장차수 | 신청일자 | 사용료
21728 | 21118167 | 1 | 02동 11실 18167호 | 1999-05-17 | 2024-05-16 | N | <NA> | 2회 | 2019-03-06 | 60000
27759 | 21424177 | 1 | 02동 14실 24177호 | 2000-05-10 | 2025-05-09 | N | <NA> | 2회 | 2020-01-21 | 60000
44820 | 22141179 | 2 | 02동 21실 41179호 | 2003-09-17 | 2023-09-16 | N | <NA> | 1회 | 2018-09-26 | 60000
811 | 812 | 3 | 00동 00실 00812호 | 2019-11-27 | 2029-11-26 | N | <NA> | <NA> | <NA> | <NA>
48664 | 22245013 | 1 | 02동 22실 45013호 | 2003-01-08 | 2023-01-07 | N | <NA> | 1회 | 2018-02-17 | 60000
15333 | 10711781 | 1 | 01동 07실 11781호 | 1997-12-27 | 2017-12-26 | Y | 2017-11-09 | 1회 | 2012-12-21 | 60000
4619 | 10101068 | 1 | 01동 01실 01068호 | 1995-05-14 | 2020-05-13 | Y | 2017-07-02 | 2회 | 2015-09-27 | 60000
68888 | 23265195 | 1 | 02동 32실 65195호 | 2005-06-26 | 2025-06-25 | N | <NA> | 1회 | 2020-07-08 | 60000
38692 | 21835073 | 1 | 02동 18실 35073호 | 2001-11-22 | 2016-11-21 | N | <NA> | <NA> | <NA> | <NA>
54353 | 22550695 | 1 | 02동 25실 50695호 | 2003-08-23 | 2023-08-22 | N | <NA> | 1회 | 2018-02-14 | 60000
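
In every sample row shown above, 봉안번호 equals the digits of 봉안정보 concatenated and read as an integer (for example, 01동 06실 07610호 gives 10607610). The sketch below parses the text field under that assumption. Whether the pattern holds for all 10000 rows is not verified here, and the function name `location_to_number` and the regular expression are hypothetical.

```python
import re

# Assumed layout: two digits + 동, two digits + 실, five digits + 호,
# separated by single spaces.
PATTERN = re.compile(r"(\d{2})동 (\d{2})실 (\d{5})호")

def location_to_number(location):
    """Concatenate the three digit groups of 봉안정보 and read them as an integer."""
    match = PATTERN.fullmatch(location)
    if match is None:
        raise ValueError(f"unexpected format: {location!r}")
    return int("".join(match.groups()))

print(location_to_number("01동 06실 07610호"))  # 10607610
print(location_to_number("00동 00실 00812호"))  # 812
```

If the relationship holds throughout, the two columns carry the same information, so either could be dropped or used to validate the other.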

Duplicate rows

Most frequently occurring

(index) | 봉안번호 | 순번 | 봉안정보 | 봉안일자 | 만료일자 | 개장여부 | 개장일자 | 연장차수 | 신청일자 | 사용료 | # duplicates
0 | 22345559 | 3 | 02동 23실 45559호 | 2009-03-07 | 2024-03-06 | N | <NA> | <NA> | <NA> | <NA> | 2