Overview

Dataset statistics

Number of variables: 6
Number of observations: 76
Missing cells: 12
Missing cells (%): 2.6%
Duplicate rows: 1
Duplicate rows (%): 1.3%
Total size in memory: 3.8 KiB
Average record size in memory: 51.7 B
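
The percentages in this table follow directly from the counts. The sketch below recomputes them from the reported figures only; the variable names are illustrative and not part of the report. Because the total size is already rounded to 3.8 KiB, the recomputed average record size comes out near 51.2 B rather than the reported 51.7 B.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 6
n_observations = 76
missing_cells = 12
duplicate_rows = 1
total_bytes = 3.8 * 1024  # 3.8 KiB, already rounded in the report

# Every cell is one (row, column) pair.
total_cells = n_variables * n_observations

missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_bytes / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"duplicate rows: {duplicate_pct:.1f}%")
print(f"average record size: about {avg_record_bytes:.1f} B")
```

The 12 missing cells are consistent with 3 rows missing a value in each of the 4 columns flagged as Missing in the alerts.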

Variable types

Text: 2
Numeric: 2
Categorical: 2

Dataset

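Description: 부산광역시_사상구_민방위비상급수시설현황_20230717 (Busan Metropolitan City, Sasang-gu, civil defense emergency water supply facilities, 2023-07-17)
Author: 부산광역시 사상구 (Busan Metropolitan City, Sasang-gu)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=3078758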

Alerts

Dataset has 1 (1.3%) duplicate rows [Duplicates]
용도 is highly overall correlated with 데이터기준일자 [High correlation]
데이터기준일자 is highly overall correlated with 규모(일_톤) and 2 other fields [High correlation]
규모(일_톤) is highly overall correlated with 데이터기준일자 [High correlation]
심도(m) is highly overall correlated with 데이터기준일자 [High correlation]
데이터기준일자 is highly imbalanced (82.6%) [Imbalance]
민방위 비상급수시설명 has 3 (3.9%) missing values [Missing]
소재지 주소 has 3 (3.9%) missing values [Missing]
규모(일_톤) has 3 (3.9%) missing values [Missing]
심도(m) has 3 (3.9%) missing values [Missing]
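
The 82.6% imbalance figure for 데이터기준일자 is consistent with a normalized-entropy score, 1 - H / log2(k), where H is the Shannon entropy of the class frequencies and k is the number of classes. The sketch below assumes that formula; it is checked here only against this report's counts (73, 2, 1), not against the profiler's source, and `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of class counts.

    H is the Shannon entropy (in bits) of the class frequencies and
    k is the number of classes. 0 means perfectly balanced; values
    near 1 mean one class dominates.
    """
    k = len(counts)
    if k <= 1:
        return 0.0  # assumption: a single class is reported as 0 here
    total = sum(counts)
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in counts if c > 0
    )
    return 1 - entropy / math.log2(k)


# Value counts of 데이터기준일자 as listed in this report.
counts = [73, 2, 1]
score = imbalance_score(counts)
print(f"{score:.1%}")
```

With these counts the score rounds to the 82.6% shown in the alert.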

Reproduction

Analysis started: 2023-12-10 16:56:43.252632
Analysis finished: 2023-12-10 16:56:45.259129
Duration: 2.01 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

민방위 비상급수시설명
Text

MISSING

Distinct: 73
Distinct (%): 100.0%
Missing: 3
Missing (%): 3.9%
Memory size: 740.0 B

Length

Max length: 12
Median length: 11
Mean length: 6.3835616
Min length: 3

Characters and Unicode

Total characters: 466
Distinct characters: 137
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
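
To illustrate how such properties can be read off a value, here is a minimal sketch using Python's standard `unicodedata` module on one of the sample facility names. It covers general categories only (scripts and blocks are not exposed by that module), and `category_counts` is an illustrative helper, not the profiler's own code.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters by Unicode general category (e.g. 'Lo', 'Lu', 'Zs')."""
    return Counter(unicodedata.category(ch) for ch in text)


# One of the sample values shown for this variable.
sample = "SK사상공원 셀프주유소"
counts = category_counts(sample)

# 'Lo' = Other Letter (Hangul syllables), 'Lu' = Uppercase Letter,
# 'Zs' = Space Separator.
for category, n in counts.most_common():
    print(category, n)
```

The two-letter codes map to the category names used in this report: `Lo` is Other Letter, `Lu` is Uppercase Letter, `Nd` is Decimal Number, `Ps` and `Pe` are Open and Close Punctuation, and `Zs` is Space Separator.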

Unique

Unique: 73
Unique (%): 100.0%

Sample

1st row: (구)사상구보건소
2nd row: SK사상공원 셀프주유소
3rd row: W모텔
4th row: 강변허브사우나
5th row: 고운어린이집
Value | Count | Frequency (%)
동일2차아파트 | 1 | 1.4%
주례여자고등학교 | 1 | 1.4%
주감중학교 | 1 | 1.4%
우신아파트 | 1 | 1.4%
우성아파트 | 1 | 1.4%
우리주유소 | 1 | 1.4%
우리사우나 | 1 | 1.4%
엘지신주례아파트2단지 | 1 | 1.4%
엘지신주례아파트1단지 | 1 | 1.4%
엄궁한신2차아파트 | 1 | 1.4%
Other values (64) | 64 | 86.5%

Most occurring characters

Value | Count | Frequency (%)
 | 36 | 7.7%
 | 34 | 7.3%
 | 33 | 7.1%
 | 15 | 3.2%
 | 14 | 3.0%
 | 10 | 2.1%
 | 10 | 2.1%
 | 10 | 2.1%
 | 9 | 1.9%
 | 9 | 1.9%
Other values (127) | 286 | 61.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 448 | 96.1%
Decimal Number | 10 | 2.1%
Uppercase Letter | 3 | 0.6%
Close Punctuation | 2 | 0.4%
Open Punctuation | 2 | 0.4%
Space Separator | 1 | 0.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 36 | 8.0%
 | 34 | 7.6%
 | 33 | 7.4%
 | 15 | 3.3%
 | 14 | 3.1%
 | 10 | 2.2%
 | 10 | 2.2%
 | 10 | 2.2%
 | 9 | 2.0%
 | 9 | 2.0%
Other values (117) | 268 | 59.8%

Decimal Number
Value | Count | Frequency (%)
2 | 4 | 40.0%
1 | 3 | 30.0%
4 | 2 | 20.0%
3 | 1 | 10.0%

Uppercase Letter
Value | Count | Frequency (%)
K | 1 | 33.3%
S | 1 | 33.3%
W | 1 | 33.3%

Close Punctuation
Value | Count | Frequency (%)
) | 2 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 2 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 448 | 96.1%
Common | 15 | 3.2%
Latin | 3 | 0.6%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 36 | 8.0%
 | 34 | 7.6%
 | 33 | 7.4%
 | 15 | 3.3%
 | 14 | 3.1%
 | 10 | 2.2%
 | 10 | 2.2%
 | 10 | 2.2%
 | 9 | 2.0%
 | 9 | 2.0%
Other values (117) | 268 | 59.8%

Common
Value | Count | Frequency (%)
2 | 4 | 26.7%
1 | 3 | 20.0%
4 | 2 | 13.3%
) | 2 | 13.3%
( | 2 | 13.3%
(space) | 1 | 6.7%
3 | 1 | 6.7%

Latin
Value | Count | Frequency (%)
K | 1 | 33.3%
S | 1 | 33.3%
W | 1 | 33.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 448 | 96.1%
ASCII | 18 | 3.9%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 36 | 8.0%
 | 34 | 7.6%
 | 33 | 7.4%
 | 15 | 3.3%
 | 14 | 3.1%
 | 10 | 2.2%
 | 10 | 2.2%
 | 10 | 2.2%
 | 9 | 2.0%
 | 9 | 2.0%
Other values (117) | 268 | 59.8%

ASCII
Value | Count | Frequency (%)
2 | 4 | 22.2%
1 | 3 | 16.7%
4 | 2 | 11.1%
) | 2 | 11.1%
( | 2 | 11.1%
(space) | 1 | 5.6%
K | 1 | 5.6%
S | 1 | 5.6%
W | 1 | 5.6%
3 | 1 | 5.6%

소재지 주소
Text

MISSING 

Distinct: 73
Distinct (%): 100.0%
Missing: 3
Missing (%): 3.9%
Memory size: 740.0 B

Length

Max length: 36
Median length: 30
Mean length: 26.178082
Min length: 16

Characters and Unicode

Total characters: 1911
Distinct characters: 67
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 73
Unique (%): 100.0%

Sample

1st row: 부산광역시 사상구 운산로 77-3 (덕포동)
2nd row: 부산광역시 사상구 백양대로 566 (감전동)
3rd row: 부산광역시 사상구 광장로 93번길 6 (괘법동)
4th row: 부산광역시 사상구 백양대로 366 (주례동)
5th row: 부산광역시 사상구 동주로 2-11 (주례동)
Value | Count | Frequency (%)
부산광역시 | 73 | 19.9%
사상구 | 73 | 19.9%
모라동 | 18 | 4.9%
주례동 | 15 | 4.1%
학장동 | 12 | 3.3%
엄궁동 | 11 | 3.0%
백양대로 | 11 | 3.0%
모라로192번길 | 6 | 1.6%
대동로 | 6 | 1.6%
엄궁로 | 6 | 1.6%
Other values (108) | 135 | 36.9%

Most occurring characters

Value | Count | Frequency (%)
(space) | 293 | 15.3%
 | 82 | 4.3%
 | 76 | 4.0%
 | 75 | 3.9%
 | 74 | 3.9%
 | 74 | 3.9%
 | 74 | 3.9%
 | 73 | 3.8%
 | 73 | 3.8%
 | 73 | 3.8%
Other values (57) | 944 | 49.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1153 | 60.3%
Decimal Number | 302 | 15.8%
Space Separator | 293 | 15.3%
Open Punctuation | 71 | 3.7%
Close Punctuation | 70 | 3.7%
Dash Punctuation | 17 | 0.9%
Other Punctuation | 5 | 0.3%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 82 | 7.1%
 | 76 | 6.6%
 | 75 | 6.5%
 | 74 | 6.4%
 | 74 | 6.4%
 | 74 | 6.4%
 | 73 | 6.3%
 | 73 | 6.3%
 | 73 | 6.3%
 | 71 | 6.2%
Other values (42) | 408 | 35.4%

Decimal Number
Value | Count | Frequency (%)
1 | 56 | 18.5%
2 | 38 | 12.6%
4 | 35 | 11.6%
6 | 30 | 9.9%
3 | 30 | 9.9%
0 | 29 | 9.6%
5 | 27 | 8.9%
9 | 25 | 8.3%
8 | 19 | 6.3%
7 | 13 | 4.3%

Space Separator
Value | Count | Frequency (%)
(space) | 293 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 71 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 70 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 17 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 5 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1153 | 60.3%
Common | 758 | 39.7%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 82 | 7.1%
 | 76 | 6.6%
 | 75 | 6.5%
 | 74 | 6.4%
 | 74 | 6.4%
 | 74 | 6.4%
 | 73 | 6.3%
 | 73 | 6.3%
 | 73 | 6.3%
 | 71 | 6.2%
Other values (42) | 408 | 35.4%

Common
Value | Count | Frequency (%)
(space) | 293 | 38.7%
( | 71 | 9.4%
) | 70 | 9.2%
1 | 56 | 7.4%
2 | 38 | 5.0%
4 | 35 | 4.6%
6 | 30 | 4.0%
3 | 30 | 4.0%
0 | 29 | 3.8%
5 | 27 | 3.6%
Other values (5) | 79 | 10.4%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1153 | 60.3%
ASCII | 758 | 39.7%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 293 | 38.7%
( | 71 | 9.4%
) | 70 | 9.2%
1 | 56 | 7.4%
2 | 38 | 5.0%
4 | 35 | 4.6%
6 | 30 | 4.0%
3 | 30 | 4.0%
0 | 29 | 3.8%
5 | 27 | 3.6%
Other values (5) | 79 | 10.4%

Hangul
Value | Count | Frequency (%)
 | 82 | 7.1%
 | 76 | 6.6%
 | 75 | 6.5%
 | 74 | 6.4%
 | 74 | 6.4%
 | 74 | 6.4%
 | 73 | 6.3%
 | 73 | 6.3%
 | 73 | 6.3%
 | 71 | 6.2%
Other values (42) | 408 | 35.4%

규모(일_톤)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 31
Distinct (%): 42.5%
Missing: 3
Missing (%): 3.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 82.493151
Minimum: 25
Maximum: 300
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 816.0 B

Quantile statistics

Minimum: 25
5-th percentile: 30
Q1: 59
Median: 70
Q3: 88
95-th percentile: 227
Maximum: 300
Range: 275
Interquartile range (IQR): 29

Descriptive statistics

Standard deviation: 54.243055
Coefficient of variation (CV): 0.65754616
Kurtosis: 7.721536
Mean: 82.493151
Median Absolute Deviation (MAD): 16
Skewness: 2.7296159
Sum: 6022
Variance: 2942.309
Monotonicity: Not monotonic
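
Several of these statistics are simple functions of the others, which makes for a quick consistency check. The sketch below uses only the values reported for 규모(일_톤); the variable names are illustrative, and the non-missing count of 73 is taken as 76 observations minus the 3 missing values.

```python
# Values reported for 규모(일_톤) in this section.
mean = 82.493151
std = 54.243055
q1, q3 = 59, 88
minimum, maximum = 25, 300
total = 6022
n_present = 76 - 3  # observations minus missing values

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
mean_check = total / n_present   # mean recomputed from the sum

print(f"CV: {cv:.6f}")
print(f"variance: {variance:.3f}")
print(f"IQR: {iqr}, range: {value_range}")
print(f"mean from sum: {mean_check:.6f}")
```

Each recomputed value agrees with the reported one up to rounding, which also confirms that the mean is taken over the 73 non-missing rows.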
Histogram with fixed size bins (bins=31)
Common values
Value | Count | Frequency (%)
70 | 8 | 10.5%
50 | 7 | 9.2%
60 | 6 | 7.9%
69 | 5 | 6.6%
30 | 5 | 6.6%
80 | 4 | 5.3%
65 | 4 | 5.3%
95 | 4 | 5.3%
100 | 4 | 5.3%
59 | 2 | 2.6%
Other values (21) | 24 | 31.6%
(Missing) | 3 | 3.9%

Minimum 10 values
Value | Count | Frequency (%)
25 | 1 | 1.3%
30 | 5 | 6.6%
40 | 2 | 2.6%
48 | 1 | 1.3%
50 | 7 | 9.2%
55 | 1 | 1.3%
59 | 2 | 2.6%
60 | 6 | 7.9%
65 | 4 | 5.3%
68 | 1 | 1.3%

Maximum 10 values
Value | Count | Frequency (%)
300 | 1 | 1.3%
280 | 1 | 1.3%
267 | 1 | 1.3%
260 | 1 | 1.3%
205 | 1 | 1.3%
146 | 1 | 1.3%
113 | 1 | 1.3%
100 | 4 | 5.3%
95 | 4 | 5.3%
92 | 1 | 1.3%

심도(m)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 22
Distinct (%): 30.1%
Missing: 3
Missing (%): 3.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 136.32877
Minimum: 20
Maximum: 500
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 816.0 B

Quantile statistics

Minimum: 20
5-th percentile: 56
Q1: 100
Median: 130
Q3: 150
95-th percentile: 212
Maximum: 500
Range: 480
Interquartile range (IQR): 50

Descriptive statistics

Standard deviation: 66.027783
Coefficient of variation (CV): 0.48432759
Kurtosis: 12.534565
Mean: 136.32877
Median Absolute Deviation (MAD): 28
Skewness: 2.6171296
Sum: 9952
Variance: 4359.6682
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=22)
Common values
Value | Count | Frequency (%)
150 | 18 | 23.7%
100 | 14 | 18.4%
120 | 9 | 11.8%
200 | 6 | 7.9%
130 | 6 | 7.9%
180 | 2 | 2.6%
300 | 2 | 2.6%
110 | 2 | 2.6%
102 | 1 | 1.3%
74 | 1 | 1.3%
Other values (12) | 12 | 15.8%
(Missing) | 3 | 3.9%

Minimum 10 values
Value | Count | Frequency (%)
20 | 1 | 1.3%
25 | 1 | 1.3%
35 | 1 | 1.3%
50 | 1 | 1.3%
60 | 1 | 1.3%
67 | 1 | 1.3%
74 | 1 | 1.3%
80 | 1 | 1.3%
95 | 1 | 1.3%
100 | 14 | 18.4%

Maximum 10 values
Value | Count | Frequency (%)
500 | 1 | 1.3%
300 | 2 | 2.6%
230 | 1 | 1.3%
200 | 6 | 7.9%
180 | 2 | 2.6%
160 | 1 | 1.3%
150 | 18 | 23.7%
130 | 6 | 7.9%
120 | 9 | 11.8%
114 | 1 | 1.3%

용도
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 3.9%
Missing: 0
Missing (%): 0.0%
Memory size: 740.0 B
비음용: 48
음용: 25
<NA>: 3

Length

Max length: 4
Median length: 3
Mean length: 2.7105263
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 비음용
2nd row: 비음용
3rd row: 비음용
4th row: 비음용
5th row: 비음용

Common Values

Value | Count | Frequency (%)
비음용 | 48 | 63.2%
음용 | 25 | 32.9%
<NA> | 3 | 3.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
비음용 | 48 | 63.2%
음용 | 25 | 32.9%
na | 3 | 3.9%

데이터기준일자
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 3.9%
Missing: 0
Missing (%): 0.0%
Memory size: 740.0 B
2023-07-17: 73
<NA>: 2
(blank): 1

Length

Max length: 10
Median length: 10
Mean length: 9.7236842
Min length: 1

Unique

Unique: 1
Unique (%): 1.3%

Sample

1st row: 2023-07-17
2nd row: 2023-07-17
3rd row: 2023-07-17
4th row: 2023-07-17
5th row: 2023-07-17

Common Values

Value | Count | Frequency (%)
2023-07-17 | 73 | 96.1%
<NA> | 2 | 2.6%
(blank) | 1 | 1.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2023-07-17 | 73 | 97.3%
na | 2 | 2.7%

Interactions

[Figure: interaction plots, 4 panels]

Correlations

 | 민방위 비상급수시설명 | 소재지 주소 | 규모(일_톤) | 심도(m) | 용도 | 데이터기준일자
민방위 비상급수시설명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | NaN
소재지 주소 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | NaN
규모(일_톤) | 1.000 | 1.000 | 1.000 | 0.590 | 0.000 | NaN
심도(m) | 1.000 | 1.000 | 0.590 | 1.000 | 0.263 | NaN
용도 | 1.000 | 1.000 | 0.000 | 0.263 | 1.000 | NaN
데이터기준일자 | NaN | NaN | NaN | NaN | NaN | 1.000

 | 용도 | 데이터기준일자
용도 | 1.000 | 1.000
데이터기준일자 | 1.000 | 1.000

 | 규모(일_톤) | 심도(m) | 용도 | 데이터기준일자
규모(일_톤) | 1.000 | -0.025 | 0.000 | 1.000
심도(m) | -0.025 | 1.000 | 0.269 | 1.000
용도 | 0.000 | 0.269 | 1.000 | 1.000
데이터기준일자 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows
 | 민방위 비상급수시설명 | 소재지 주소 | 규모(일_톤) | 심도(m) | 용도 | 데이터기준일자
0 | (구)사상구보건소 | 부산광역시 사상구 운산로 77-3 (덕포동) | 205 | 20 | 비음용 | 2023-07-17
1 | SK사상공원 셀프주유소 | 부산광역시 사상구 백양대로 566 (감전동) | 60 | 150 | 비음용 | 2023-07-17
2 | W모텔 | 부산광역시 사상구 광장로 93번길 6 (괘법동) | 70 | 100 | 비음용 | 2023-07-17
3 | 강변허브사우나 | 부산광역시 사상구 백양대로 366 (주례동) | 300 | 500 | 비음용 | 2023-07-17
4 | 고운어린이집 | 부산광역시 사상구 동주로 2-11 (주례동) | 267 | 25 | 비음용 | 2023-07-17
5 | 광우맨션 | 부산광역시 사상구 가야대로366번길 104 (주례동) | 95 | 110 | 비음용 | 2023-07-17
6 | 구덕고등학교 | 부산광역시 사상구 학감대로 81 (학장동) | 100 | 102 | 음용 | 2023-07-17
7 | 구덕농원 | 부산광역시 사상구 학감대로 2 (학장동, 구덕농원) | 50 | 180 | 음용 | 2023-07-17
8 | 구덕대림아파트 | 부산광역시 사상구 학감대로49번길 62 (학장동) | 48 | 180 | 비음용 | 2023-07-17
9 | 학장극동아파트 | 부산광역시 사상구 학감대로 109-43 (학장동) | 80 | 130 | 비음용 | 2023-07-17

Last rows
 | 민방위 비상급수시설명 | 소재지 주소 | 규모(일_톤) | 심도(m) | 용도 | 데이터기준일자
66 | 한효아파트 | 부산광역시 사상구 백양대로342번길 40 (주례동) | 70 | 200 | 음용 | 2023-07-17
67 | 협성아파트 | 부산광역시 사상구 백양대로950번나길 14 (모라동) | 79 | 150 | 비음용 | 2023-07-17
68 | 화인아파트 | 부산광역시 사상구 백양대로934번길 52-6 (모라동) | 59 | 130 | 비음용 | 2023-07-17
69 | 아름스파 | 부산광역시 사상구 대동로 304(감전동) | 260 | 300 | 비음용 | 2023-07-17
70 | 양지어린이공원옆 | 부산광역시 사상구 양지로8번길 5-22(주례동) | 86 | 100 | 비음용 | 2023-07-17
71 | 야시곡공원앞 | 부산광역시 사상구 백양대로950번나길 93(모라동) | 100 | 50 | 비음용 | 2023-07-17
72 | 미트피아 | 부산광역시 사상구 백양대로 507(주례동) | 70 | 100 | 비음용 | 2023-07-17
73 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
74 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
75 | <NA> | <NA> | <NA> | <NA> | <NA> | (blank)

Duplicate rows

Most frequently occurring

 | 민방위 비상급수시설명 | 소재지 주소 | 규모(일_톤) | 심도(m) | 용도 | 데이터기준일자 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 2
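
The only duplicated row is an entirely empty one, and the three trailing rows (73 to 75) account for every missing value in the report. A minimal sketch of dropping such rows before profiling follows. It uses plain Python lists rather than the original file, the rows are a small excerpt of the sample shown in this report, and treating the last row's date as a whitespace-only string is an assumption based on the minimum length of 1 reported for 데이터기준일자. `is_blank` and `drop_empty_rows` are hypothetical helper names.

```python
def is_blank(cell):
    """Treat None and whitespace-only strings as empty."""
    return cell is None or str(cell).strip() == ""


def drop_empty_rows(rows):
    """Keep only rows that contain at least one non-empty cell."""
    return [row for row in rows if not all(is_blank(c) for c in row)]


# Small excerpt: two real facilities plus the three trailing rows.
rows = [
    ["W모텔", "부산광역시 사상구 광장로 93번길 6 (괘법동)", 70, 100, "비음용", "2023-07-17"],
    ["미트피아", "부산광역시 사상구 백양대로 507(주례동)", 70, 100, "비음용", "2023-07-17"],
    [None, None, None, None, None, None],
    [None, None, None, None, None, None],
    [None, None, None, None, None, " "],  # assumed whitespace-only date
]

cleaned = drop_empty_rows(rows)
print(len(rows), "->", len(cleaned))
```

Applied to the full dataset, a filter of this kind would be expected to leave 73 rows, matching the 73 distinct facility names.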