Overview

Dataset statistics

Number of variables: 5
Number of observations: 26
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.2 KiB
Average record size in memory: 49.1 B

Variable types

Numeric: 3
Text: 1
Categorical: 1

Dataset

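Description: 연번 (serial number), 사업대상지 (project site), 공공임대공급호수 (number of public rental units supplied), 공실 (vacant units), 공실률 (vacancy rate)
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-21308/S/1/datasetView.do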

Alerts

공실률 is highly overall correlated with 공실 (High correlation)
공실 is highly overall correlated with 공실률 (High correlation)
연번 has unique values (Unique)
사업대상지 has unique values (Unique)
공실률 has 20 (76.9%) zeros (Zeros)

Reproduction

Analysis started: 2023-12-11 06:46:56.772219
Analysis finished: 2023-12-11 06:46:57.953285
Duration: 1.18 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

UNIQUE 

Distinct: 26
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.5
Minimum: 1
Maximum: 26
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 366.0 B

Quantile statistics

Minimum: 1
5-th percentile: 2.25
Q1: 7.25
Median: 13.5
Q3: 19.75
95-th percentile: 24.75
Maximum: 26
Range: 25
Interquartile range (IQR): 12.5
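
The quantile figures above are consistent with percentiles computed by linear interpolation between sorted values. The following is a minimal sketch under that assumption; the helper `percentile` is hypothetical and is not taken from the profiling tool itself. It relies only on the fact, stated in this report, that 연번 takes each value from 1 to 26 exactly once.

```python
def percentile(values, q):
    """Return the q-th percentile (0 <= q <= 100) using linear interpolation."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100  # fractional index into the sorted data
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# 연번 is unique and strictly increasing from 1 to 26 (see the statistics above).
serial_numbers = list(range(1, 27))

q1 = percentile(serial_numbers, 25)
q3 = percentile(serial_numbers, 75)
iqr = q3 - q1

print(percentile(serial_numbers, 5), q1, percentile(serial_numbers, 50),
      q3, percentile(serial_numbers, 95), iqr)
```

With 26 evenly spaced values, the 5-th percentile falls at fractional index 1.25, which is why it is reported as 2.25 rather than a whole number.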

Descriptive statistics

Standard deviation: 7.6485293
Coefficient of variation (CV): 0.56655772
Kurtosis: -1.2
Mean: 13.5
Median Absolute Deviation (MAD): 6.5
Skewness: 0
Sum: 351
Variance: 58.5
Monotonicity: Strictly increasing
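
As a cross-check, the descriptive statistics above can be reproduced with the Python standard library. This is a sketch, assuming the reported variance and standard deviation use the sample (n - 1) denominator, which is what the figures suggest; the variable names are illustrative only.

```python
import statistics

# 연번 takes each value from 1 to 26 exactly once.
serial_numbers = list(range(1, 27))

total = sum(serial_numbers)
mean = statistics.mean(serial_numbers)
variance = statistics.variance(serial_numbers)   # sample variance, n - 1 denominator
std_dev = statistics.stdev(serial_numbers)       # square root of the sample variance
cv = std_dev / mean                              # coefficient of variation
median = statistics.median(serial_numbers)
# Median absolute deviation: median of the absolute distances from the median.
mad = statistics.median([abs(x - median) for x in serial_numbers])

print(total, mean, variance, round(std_dev, 7), round(cv, 8), mad)
```

The coefficient of variation is simply the standard deviation divided by the mean, so it is only meaningful here as a scale-free summary of an index column.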
Histogram with fixed size bins (bins=26)
Common values

Value              Count  Frequency (%)
1                  1      3.8%
15                 1      3.8%
26                 1      3.8%
25                 1      3.8%
24                 1      3.8%
23                 1      3.8%
22                 1      3.8%
21                 1      3.8%
20                 1      3.8%
19                 1      3.8%
Other values (16)  16     61.5%

Minimum 10 values

Value  Count  Frequency (%)
1      1      3.8%
2      1      3.8%
3      1      3.8%
4      1      3.8%
5      1      3.8%
6      1      3.8%
7      1      3.8%
8      1      3.8%
9      1      3.8%
10     1      3.8%

Maximum 10 values

Value  Count  Frequency (%)
26     1      3.8%
25     1      3.8%
24     1      3.8%
23     1      3.8%
22     1      3.8%
21     1      3.8%
20     1      3.8%
19     1      3.8%
18     1      3.8%
17     1      3.8%

사업대상지
Text

UNIQUE 

Distinct: 26
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B

Length

Max length: 15
Median length: 14
Mean length: 13.538462
Min length: 11

Characters and Unicode

Total characters: 352
Distinct characters: 65
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
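
To illustrate how such character properties can be read, the sketch below classifies the characters of the first sample row of 사업대상지 using Python's built-in `unicodedata` module. The mapping from two-letter category codes to the labels used in this report (`category_labels`) is an assumption added for illustration; only one row is analysed, so the counts are not the dataset totals.

```python
import unicodedata
from collections import Counter

sample = "광진구 구의동 587-64"  # first sample row of 사업대상지

# Unicode general category codes and the labels this report uses for them.
category_labels = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
}

# Count characters per general category.
categories = Counter(unicodedata.category(ch) for ch in sample)

# Count characters whose Unicode name marks them as Hangul.
hangul_count = sum(1 for ch in sample if unicodedata.name(ch).startswith("HANGUL"))

for code, count in categories.most_common():
    print(category_labels.get(code, code), count)
print("Hangul characters:", hangul_count)
```

In this single row the Hangul syllables fall under Other Letter, while digits, spaces and the hyphen account for the remaining three categories, matching the four distinct categories reported for the whole column.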

Unique

Unique: 26
Unique (%): 100.0%

Sample

1st row: 광진구 구의동 587-64
2nd row: 서대문구 충정로3가 72-1
3rd row: 성동구 용답동 233-1
4th row: 마포구 서교동 395-43
5th row: 강서구 등촌동 648-5

Words

Value              Count  Frequency (%)
강서구             6      7.7%
화곡동             3      3.8%
광진구             2      2.6%
용산구             2      2.6%
구의동             2      2.6%
중랑구             2      2.6%
도봉구             2      2.6%
등촌동             2      2.6%
마포구             2      2.6%
쌍문동             2      2.6%
Other values (53)  53     67.9%

Most occurring characters

Value              Count  Frequency (%)
(space)            52     14.8%
(missing)          28     8.0%
(missing)          27     7.7%
1                  27     7.7%
-                  24     6.8%
3                  15     4.3%
2                  13     3.7%
0                  12     3.4%
7                  12     3.4%
5                  11     3.1%
Other values (55)  131    37.2%

Most occurring categories

Value             Count  Frequency (%)
Other Letter      162    46.0%
Decimal Number    114    32.4%
Space Separator   52     14.8%
Dash Punctuation  24     6.8%

Most frequent character per category

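Other Letter

Value              Count  Frequency (%)
(missing)          28     17.3%
(missing)          27     16.7%
(missing)          10     6.2%
(missing)          9      5.6%
(missing)          4      2.5%
(missing)          4      2.5%
(missing)          3      1.9%
(missing)          3      1.9%
(missing)          3      1.9%
(missing)          3      1.9%
Other values (43)  68     42.0%
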
Decimal Number

Value  Count  Frequency (%)
1      27     23.7%
3      15     13.2%
2      13     11.4%
0      12     10.5%
7      12     10.5%
5      11     9.6%
4      8      7.0%
9      6      5.3%
6      6      5.3%
8      4      3.5%

Space Separator

Value    Count  Frequency (%)
(space)  52     100.0%

Dash Punctuation

Value  Count  Frequency (%)
-      24     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  190    54.0%
Hangul  162    46.0%

Most frequent character per script

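Hangul

Value              Count  Frequency (%)
(missing)          28     17.3%
(missing)          27     16.7%
(missing)          10     6.2%
(missing)          9      5.6%
(missing)          4      2.5%
(missing)          4      2.5%
(missing)          3      1.9%
(missing)          3      1.9%
(missing)          3      1.9%
(missing)          3      1.9%
Other values (43)  68     42.0%
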
Common

Value             Count  Frequency (%)
(space)           52     27.4%
1                 27     14.2%
-                 24     12.6%
3                 15     7.9%
2                 13     6.8%
0                 12     6.3%
7                 12     6.3%
5                 11     5.8%
4                 8      4.2%
9                 6      3.2%
Other values (2)  10     5.3%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   190    54.0%
Hangul  162    46.0%

Most frequent character per block

ASCII

Value             Count  Frequency (%)
(space)           52     27.4%
1                 27     14.2%
-                 24     12.6%
3                 15     7.9%
2                 13     6.8%
0                 12     6.3%
7                 12     6.3%
5                 11     5.8%
4                 8      4.2%
9                 6      3.2%
Other values (2)  10     5.3%

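Hangul

Value              Count  Frequency (%)
(missing)          28     17.3%
(missing)          27     16.7%
(missing)          10     6.2%
(missing)          9      5.6%
(missing)          4      2.5%
(missing)          4      2.5%
(missing)          3      1.9%
(missing)          3      1.9%
(missing)          3      1.9%
(missing)          3      1.9%
Other values (43)  68     42.0%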

공공임대공급호수
Real number (ℝ)

Distinct: 23
Distinct (%): 88.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 65.230769
Minimum: 6
Maximum: 323
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 366.0 B

Quantile statistics

Minimum: 6
5-th percentile: 9
Q1: 19.75
Median: 48.5
Q3: 69.5
95-th percentile: 255.75
Maximum: 323
Range: 317
Interquartile range (IQR): 49.75

Descriptive statistics

Standard deviation: 78.412911
Coefficient of variation (CV): 1.2020847
Kurtosis: 6.0059104
Mean: 65.230769
Median Absolute Deviation (MAD): 26.5
Skewness: 2.4842502
Sum: 1696
Variance: 6148.5846
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=23)
Common values

Value              Count  Frequency (%)
15                 2      7.7%
9                  2      7.7%
49                 2      7.7%
87                 1      3.8%
287                1      3.8%
86                 1      3.8%
48                 1      3.8%
70                 1      3.8%
75                 1      3.8%
28                 1      3.8%
Other values (13)  13     50.0%

Minimum 10 values

Value  Count  Frequency (%)
6      1      3.8%
9      2      7.7%
15     2      7.7%
18     1      3.8%
19     1      3.8%
22     1      3.8%
24     1      3.8%
27     1      3.8%
28     1      3.8%
37     1      3.8%

Maximum 10 values

Value  Count  Frequency (%)
323    1      3.8%
287    1      3.8%
162    1      3.8%
87     1      3.8%
86     1      3.8%
75     1      3.8%
70     1      3.8%
68     1      3.8%
60     1      3.8%
53     1      3.8%

공실
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 15.4%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 3.8%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 2
5th row: 0

Common Values

Value  Count  Frequency (%)
0      18     69.2%
1      4      15.4%
2      3      11.5%
9      1      3.8%
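
The Distinct and Unique figures for 공실 measure different things: Distinct counts how many different values occur, while Unique counts values that occur exactly once. The sketch below recomputes both from the Common Values frequencies; the variable names are illustrative, and the values are kept as strings because the report treats 공실 as categorical.

```python
from collections import Counter

# Frequencies of 공실 as listed under Common Values: value -> count.
vacancy_counts = Counter({"0": 18, "1": 4, "2": 3, "9": 1})

n_rows = sum(vacancy_counts.values())
distinct = len(vacancy_counts)                                      # different values
unique = sum(1 for count in vacancy_counts.values() if count == 1)  # values seen once

distinct_pct = round(100 * distinct / n_rows, 1)
unique_pct = round(100 * unique / n_rows, 1)

# Total vacant units across all sites, reading each category as an integer.
total_vacant = sum(int(value) * count for value, count in vacancy_counts.items())

print(n_rows, distinct, distinct_pct, unique, unique_pct, total_vacant)
```

Only the value 9 appears a single time, which is why Unique is 1 even though four distinct values are present.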

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0      18     69.2%
1      4      15.4%
2      3      11.5%
9      1      3.8%

공실률
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 6
Distinct (%): 23.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.017307692
Minimum: 0
Maximum: 0.17
Zeros: 20
Zeros (%): 76.9%
Negative: 0
Negative (%): 0.0%
Memory size: 366.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0.11
Maximum: 0.17
Range: 0.17
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.043225705
Coefficient of variation (CV): 2.4974852
Kurtosis: 6.701704
Mean: 0.017307692
Median Absolute Deviation (MAD): 0
Skewness: 2.7099134
Sum: 0.45
Variance: 0.0018684615
Monotonicity: Not monotonic
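
Because 공실률 has only six distinct values, its full distribution can be rebuilt from the frequency counts listed for this variable (20 zeros, one each of 0.01, 0.02, 0.03 and 0.17, and two of 0.11). The sketch below does so and recomputes the headline statistics, assuming the sample (n - 1) variance; the names are illustrative.

```python
import statistics

# Value -> count for 공실률, taken from the frequency table of this variable.
rate_counts = {0.0: 20, 0.01: 1, 0.02: 1, 0.03: 1, 0.11: 2, 0.17: 1}

# Expand the counts back into the 26 individual observations.
rates = [value for value, count in rate_counts.items() for _ in range(count)]

mean = statistics.mean(rates)
variance = statistics.variance(rates)   # sample variance, n - 1 denominator
std_dev = statistics.stdev(rates)
cv = std_dev / mean                     # coefficient of variation
zero_share = rate_counts[0.0] / len(rates)

print(len(rates), round(mean, 9), round(variance, 10), round(std_dev, 9),
      round(cv, 7), round(100 * zero_share, 1))
```

With more than three quarters of the observations at zero, the median, both quartiles and the MAD all collapse to 0, so the mean and standard deviation are driven almost entirely by the six non-zero sites.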
Histogram with fixed size bins (bins=6)
Common values

Value  Count  Frequency (%)
0.0    20     76.9%
0.11   2      7.7%
0.01   1      3.8%
0.02   1      3.8%
0.17   1      3.8%
0.03   1      3.8%

Minimum 10 values

Value  Count  Frequency (%)
0.0    20     76.9%
0.01   1      3.8%
0.02   1      3.8%
0.03   1      3.8%
0.11   2      7.7%
0.17   1      3.8%

Maximum 10 values

Value  Count  Frequency (%)
0.17   1      3.8%
0.11   2      7.7%
0.03   1      3.8%
0.02   1      3.8%
0.01   1      3.8%
0.0    20     76.9%

Interactions


Correlations

                    연번    사업대상지  공공임대공급호수  공실    공실률
연번                1.000   1.000       0.388             0.468   0.000
사업대상지          1.000   1.000       1.000             1.000   1.000
공공임대공급호수    0.388   1.000       1.000             0.545   0.000
공실                0.468   1.000       0.545             1.000   0.935
공실률              0.000   1.000       0.000             0.935   1.000

                    연번    공공임대공급호수  공실률   공실
연번                1.000   0.218             -0.012   0.257
공공임대공급호수    0.218   1.000             0.010    0.354
공실률              -0.012  0.010             1.000    0.658
공실                0.257   0.354             0.658    1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index  연번  사업대상지              공공임대공급호수  공실  공실률
0      1     광진구 구의동 587-64     15                0     0.0
1      2     서대문구 충정로3가 72-1  49                0     0.0
2      3     성동구 용답동 233-1      22                0     0.0
3      4     마포구 서교동 395-43     162               2     0.01
4      5     강서구 등촌동 648-5      19                0     0.0
5      6     동작구 노량진동 37-1     37                0     0.0
6      7     강서구 염창동 274-17     49                1     0.02
7      8     강서구 화곡동 401-1      9                 0     0.0
8      9     용산구 한강로2가 2-350   323               1     0.0
9      10    동대문구 휘경동 192-1    9                 1     0.11

Last rows

Index  연번  사업대상지              공공임대공급호수  공실  공실률
16     17    강서구 등촌동 671-1      53                9     0.17
17     18    영등포구 도림동 250-20   18                2     0.11
18     19    중랑구 묵동 176-39       24                0     0.0
19     20    광진구 구의동 593-11     28                0     0.0
20     21    노원구 공릉동 617-3      75                2     0.03
21     22    도봉구 쌍문동 103-6      70                0     0.0
22     23    도봉구 쌍문동 507-1      48                0     0.0
23     24    강남구 논현동 202-7      86                0     0.0
24     25    용산구 원효로 1가 104    287               1     0.0
25     26    강서구 화곡동 1073-11    15                0     0.0
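
In the sample rows, 공실률 appears to equal 공실 divided by 공공임대공급호수, rounded to two decimal places. This is an inference from the visible rows, not something the source states, and it assumes each fused sample row splits into supply, vacancy and rate as (162, 2, 0.01), (323, 1, 0.0) and so on. The sketch below checks that reading against the eight sample rows with at least one vacant unit; `vacancy_rate` is a hypothetical helper.

```python
def vacancy_rate(vacant, supplied):
    """Assumed definition: vacant units / supplied units, rounded to 2 decimals."""
    return round(vacant / supplied, 2)


# (공공임대공급호수, 공실, reported 공실률) for sample rows with at least one vacancy.
rows = [
    (162, 2, 0.01),
    (49, 1, 0.02),
    (323, 1, 0.0),
    (9, 1, 0.11),
    (53, 9, 0.17),
    (18, 2, 0.11),
    (75, 2, 0.03),
    (287, 1, 0.0),
]

# Rows where the assumed formula disagrees with the reported rate.
mismatches = [row for row in rows if vacancy_rate(row[1], row[0]) != row[2]]

# Rows with vacancies whose rate nevertheless rounds down to zero.
rounded_to_zero = [row for row in rows if row[1] > 0 and row[2] == 0.0]

print(mismatches)
print(rounded_to_zero)
```

If this reading is right, it would also explain why 공실률 has 20 zeros while 공실 has only 18: the two largest sites (323 and 287 units, one vacancy each) have rates below 0.005 that round to 0.0.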