Overview

Dataset statistics

Number of variables: 10
Number of observations: 1613
Missing cells: 1528
Missing cells (%): 9.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 132.4 KiB
Average record size in memory: 84.1 B
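
The missing-cell figures can be cross-checked against the per-column counts listed under Alerts. The following is a minimal sketch, assuming those four columns are the only ones with missing cells; the name `missing_by_column` is illustrative and not part of the report.

```python
# Per-column missing counts as listed in the Alerts section of this report.
missing_by_column = {
    "면적(제곱미터)": 59,
    "계약시작일자": 416,
    "계약종료일자": 416,
    "월임대료": 637,
}

n_variables = 10
n_observations = 1613

missing_cells = sum(missing_by_column.values())
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)
print(round(missing_pct, 1))
```

The 416 `<NA>` entries in 영업업종 are not part of this total: the report lists that column with 0 missing values and counts `<NA>` as a category.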

Variable types

Numeric: 3
Categorical: 3
Text: 2
DateTime: 2

Dataset

Description: File download (파일 다운로드)
Author: Seoul Metro (서울교통공사)
URL: https://data.seoul.go.kr/dataList/OA-12927/F/1/datasetView.do

Alerts

호선 has constant value "1" (Constant)
면적(제곱미터) has 59 (3.7%) missing values (Missing)
계약시작일자 has 416 (25.8%) missing values (Missing)
계약종료일자 has 416 (25.8%) missing values (Missing)
월임대료 has 637 (39.5%) missing values (Missing)
면적(제곱미터) is highly skewed (γ1 = 31.31107565) (Skewed)
연번 has unique values (Unique)
상가번호 has unique values (Unique)

Reproduction

Analysis started: 2024-04-29 16:39:31.177748
Analysis finished: 2024-04-29 16:39:32.896880
Duration: 1.72 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번 (serial number)
Real number (ℝ)

UNIQUE

Distinct: 1613
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 807
Minimum: 1
Maximum: 1613
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 81.6
Q1: 404
Median: 807
Q3: 1210
95-th percentile: 1532.4
Maximum: 1613
Range: 1612
Interquartile range (IQR): 806

Descriptive statistics

Standard deviation: 465.77731
Coefficient of variation (CV): 0.57717138
Kurtosis: -1.2
Mean: 807
Median Absolute Deviation (MAD): 403
Skewness: 0
Sum: 1301691
Variance: 216948.5
Monotonicity: Strictly increasing
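
Because 연번 is a strictly increasing sequence with 1613 distinct values, a minimum of 1 and a maximum of 1613, it must be exactly the integers 1 through 1613, so the statistics above can be reproduced with the standard library alone. This is a sketch under two assumptions that happen to match the reported figures: the variance uses the sample (n - 1) denominator, and percentiles are linearly interpolated.

```python
import statistics

# 연번 reconstructed as the integers 1..1613.
values = list(range(1, 1614))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std = statistics.stdev(values)
cv = std / mean                         # coefficient of variation

# Cut points with linear interpolation between neighbouring values.
# With n=20 the first cut point is the 5th percentile and the last the 95th.
twentieths = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")

print(total, mean, variance)
print(round(std, 5), round(cv, 8))
print(p5, q1, median, q3, p95)
```

Agreement on such a column is a quick sanity check that the report was read correctly before trusting its figures for the less regular columns.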
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
1  1  0.1%
1073  1  0.1%
1083  1  0.1%
1082  1  0.1%
1081  1  0.1%
1080  1  0.1%
1079  1  0.1%
1078  1  0.1%
1077  1  0.1%
1076  1  0.1%
Other values (1603)  1603  99.4%

Value  Count  Frequency (%)
1  1  0.1%
2  1  0.1%
3  1  0.1%
4  1  0.1%
5  1  0.1%
6  1  0.1%
7  1  0.1%
8  1  0.1%
9  1  0.1%
10  1  0.1%

Value  Count  Frequency (%)
1613  1  0.1%
1612  1  0.1%
1611  1  0.1%
1610  1  0.1%
1609  1  0.1%
1608  1  0.1%
1607  1  0.1%
1606  1  0.1%
1605  1  0.1%
1604  1  0.1%

상가유형 (shop type)
Categorical

Distinct: 7
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 12.7 KiB

개별(일반): 600
네트워크: 312
67일괄: 262
복합: 217
공실: 154
Other values (2): 68

Length

Max length: 6
Median length: 4
Mean length: 4.3261004
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 개별(일반)
2nd row: 개별(일반)
3rd row: 개별(일반)
4th row: 개별(일반)
5th row: 네트워크

Common Values

Value  Count  Frequency (%)
개별(일반)  600  37.2%
네트워크  312  19.3%
67일괄  262  16.2%
복합  217  13.5%
공실  154  9.5%
소송상가  34  2.1%
개별(대형)  34  2.1%
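
The frequency percentages and the mean category length reported for 상가유형 both follow from the value counts alone. A small sketch, assuming each length is the number of Unicode code points in the category label, which is consistent with the reported mean of 4.3261004:

```python
# Value counts for 상가유형 as listed in the Common Values table.
counts = {
    "개별(일반)": 600,
    "네트워크": 312,
    "67일괄": 262,
    "복합": 217,
    "공실": 154,
    "소송상가": 34,
    "개별(대형)": 34,
}

n_rows = sum(counts.values())

# Relative frequency of each category, in percent with one decimal.
frequency_pct = {
    value: round(100 * count / n_rows, 1) for value, count in counts.items()
}

# Mean label length, weighting each label's length by how often it occurs.
mean_length = sum(len(value) * count for value, count in counts.items()) / n_rows

print(n_rows)
print(frequency_pct)
print(round(mean_length, 7))
```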

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
개별(일반  600  37.2%
네트워크  312  19.3%
67일괄  262  16.2%
복합  217  13.5%
공실  154  9.5%
소송상가  34  2.1%
개별(대형  34  2.1%

호선 (subway line)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.7 KiB

1: 1613

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1  1613  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
1  1613  100.0%

역사명 (station name)
Text

Distinct: 245
Distinct (%): 15.2%
Missing: 0
Missing (%): 0.0%
Memory size: 12.7 KiB

Length

Max length: 13
Median length: 9
Mean length: 4.9981401
Min length: 3

Characters and Unicode

Total characters: 8062
Distinct characters: 209
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 35
Unique (%): 2.2%

Sample

1st row: 서울(1)역
2nd row: 시청(1)역
3rd row: 시청(1)역
4th row: 시청(1)역
5th row: 시청(1)역
Value  Count  Frequency (%)
오목교역  46  2.9%
고속터미널(3)역  39  2.4%
공덕(5)역  29  1.8%
천호(5)역  27  1.7%
잠실(8)역  26  1.6%
사당(4)역  25  1.5%
노원(7)역  22  1.4%
강남구청역  21  1.3%
마들역  20  1.2%
미아사거리역  19  1.2%
Other values (235)  1339  83.0%

Most occurring characters

ValueCountFrequency (%)
1639
 
20.3%
) 561
 
7.0%
( 561
 
7.0%
211
 
2.6%
188
 
2.3%
129
 
1.6%
2 114
 
1.4%
113
 
1.4%
110
 
1.4%
5 107
 
1.3%
Other values (199) 4329
53.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 6337
78.6%
Decimal Number 603
 
7.5%
Close Punctuation 561
 
7.0%
Open Punctuation 561
 
7.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1639
25.9%
211
 
3.3%
188
 
3.0%
129
 
2.0%
113
 
1.8%
110
 
1.7%
98
 
1.5%
96
 
1.5%
94
 
1.5%
86
 
1.4%
Other values (189) 3573
56.4%
Decimal Number
ValueCountFrequency (%)
2 114
18.9%
5 107
17.7%
3 106
17.6%
7 102
16.9%
6 68
11.3%
4 62
10.3%
8 32
 
5.3%
1 12
 
2.0%
Close Punctuation
ValueCountFrequency (%)
) 561
100.0%
Open Punctuation
ValueCountFrequency (%)
( 561
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 6337
78.6%
Common 1725
 
21.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1639
25.9%
211
 
3.3%
188
 
3.0%
129
 
2.0%
113
 
1.8%
110
 
1.7%
98
 
1.5%
96
 
1.5%
94
 
1.5%
86
 
1.4%
Other values (189) 3573
56.4%
Common
ValueCountFrequency (%)
) 561
32.5%
( 561
32.5%
2 114
 
6.6%
5 107
 
6.2%
3 106
 
6.1%
7 102
 
5.9%
6 68
 
3.9%
4 62
 
3.6%
8 32
 
1.9%
1 12
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 6337
78.6%
ASCII 1725
 
21.4%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1639
25.9%
211
 
3.3%
188
 
3.0%
129
 
2.0%
113
 
1.8%
110
 
1.7%
98
 
1.5%
96
 
1.5%
94
 
1.5%
86
 
1.4%
Other values (189) 3573
56.4%
ASCII
ValueCountFrequency (%)
) 561
32.5%
( 561
32.5%
2 114
 
6.6%
5 107
 
6.2%
3 106
 
6.1%
7 102
 
5.9%
6 68
 
3.9%
4 62
 
3.6%
8 32
 
1.9%
1 12
 
0.7%

상가번호 (shop number)
Text

UNIQUE

Distinct: 1613
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 12.7 KiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 11291
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
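
The category names used in the character tables of this report (Decimal Number, Dash Punctuation, Other Letter, and so on) are Unicode general categories. As a rough illustration, Python's standard `unicodedata` module exposes the same property under two-letter codes; this sketch is not necessarily how the report computes its counts, and the helper name `category_counts` is made up for the example. The input strings are sample values copied from the 역사명 and 상가번호 columns.

```python
import unicodedata
from collections import Counter

# Sample values copied from the report.
station_names = ["서울(1)역", "시청(1)역"]
shop_numbers = ["150-107", "151-101", "151-103", "151-104", "151-105"]


def category_counts(values):
    """Count characters by Unicode general category code (e.g. 'Lo', 'Nd')."""
    return Counter(unicodedata.category(ch) for text in values for ch in text)


# Lo = Other Letter, Nd = Decimal Number, Ps/Pe = Open/Close Punctuation,
# Pd = Dash Punctuation, Lu = Uppercase Letter.
print(category_counts(station_names))
print(category_counts(shop_numbers))
print(unicodedata.category("M"))
```

Scripts and blocks are separate Unicode properties that `unicodedata` does not expose, so they are left out of the sketch.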

Unique

Unique: 1613
Unique (%): 100.0%

Sample

1st row: 150-107
2nd row: 151-101
3rd row: 151-103
4th row: 151-104
5th row: 151-105
Value  Count  Frequency (%)
150-107  1  0.1%
639-204  1  0.1%
641-202  1  0.1%
641-201  1  0.1%
641-103  1  0.1%
641-102  1  0.1%
641-101  1  0.1%
640-105  1  0.1%
640-104  1  0.1%
640-103  1  0.1%
Other values (1603)  1603  99.4%

Most occurring characters

ValueCountFrequency (%)
1 2286
20.2%
2 1646
14.6%
- 1613
14.3%
0 1581
14.0%
3 1068
9.5%
4 747
 
6.6%
7 666
 
5.9%
5 653
 
5.8%
6 496
 
4.4%
8 275
 
2.4%
Other values (2) 260
 
2.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 9676
85.7%
Dash Punctuation 1613
 
14.3%
Uppercase Letter 2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 2286
23.6%
2 1646
17.0%
0 1581
16.3%
3 1068
11.0%
4 747
 
7.7%
7 666
 
6.9%
5 653
 
6.7%
6 496
 
5.1%
8 275
 
2.8%
9 258
 
2.7%
Dash Punctuation
ValueCountFrequency (%)
- 1613
100.0%
Uppercase Letter
ValueCountFrequency (%)
M 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 11289
> 99.9%
Latin 2
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
1 2286
20.2%
2 1646
14.6%
- 1613
14.3%
0 1581
14.0%
3 1068
9.5%
4 747
 
6.6%
7 666
 
5.9%
5 653
 
5.8%
6 496
 
4.4%
8 275
 
2.4%
Latin
ValueCountFrequency (%)
M 2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11291
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 2286
20.2%
2 1646
14.6%
- 1613
14.3%
0 1581
14.0%
3 1068
9.5%
4 747
 
6.6%
7 666
 
5.9%
5 653
 
5.8%
6 496
 
4.4%
8 275
 
2.4%
Other values (2) 260
 
2.3%

면적(제곱미터) (area in square meters)
Real number (ℝ)

MISSING  SKEWED

Distinct: 823
Distinct (%): 53.0%
Missing: 59
Missing (%): 3.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.435418
Minimum: 7.61
Maximum: 7475.19
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.3 KiB

Quantile statistics

Minimum: 7.61
5-th percentile: 13.186
Q1: 22.4125
Median: 32
Q3: 44.37
95-th percentile: 100
Maximum: 7475.19
Range: 7467.58
Interquartile range (IQR): 21.9575

Descriptive statistics

Standard deviation: 204.77191
Coefficient of variation (CV): 4.0600815
Kurtosis: 1117.0738
Mean: 50.435418
Median Absolute Deviation (MAD): 10.7
Skewness: 31.311076
Sum: 78376.64
Variance: 41931.534
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
30.0  47  2.9%
33.0  38  2.4%
40.0  29  1.8%
20.0  20  1.2%
35.0  19  1.2%
50.0  18  1.1%
37.0  17  1.1%
32.0  15  0.9%
25.0  15  0.9%
31.0  15  0.9%
Other values (813)  1321  81.9%
(Missing)  59  3.7%

Value  Count  Frequency (%)
7.61  1  0.1%
8.0  1  0.1%
8.15  1  0.1%
8.25  1  0.1%
9.01  1  0.1%
9.05  1  0.1%
9.06  1  0.1%
9.2  1  0.1%
9.36  1  0.1%
9.41  1  0.1%

Value  Count  Frequency (%)
7475.19  1  0.1%
1351.0  1  0.1%
1260.58  1  0.1%
900.39  1  0.1%
871.4  1  0.1%
867.64  1  0.1%
849.0  1  0.1%
808.0  1  0.1%
708.0  1  0.1%
592.0  1  0.1%

영업업종 (business category)
Categorical

Distinct: 12
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 12.7 KiB

<NA>: 416
의류: 272
기타: 197
편의점: 174
식음료: 148
Other values (7): 406

Length

Max length: 5
Median length: 4
Mean length: 2.9578425
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 사무실
2nd row: 의류
3rd row: 기타
4th row: 플라워
5th row: 식음료

Common Values

Value  Count  Frequency (%)
<NA>  416  25.8%
의류  272  16.9%
기타  197  12.2%
편의점  174  10.8%
식음료  148  9.2%
제과  134  8.3%
액세서리  101  6.3%
플라워  52  3.2%
화장품  48  3.0%
사무실  32  2.0%
Other values (2)  39  2.4%

Length

Histogram of lengths of the category
Value  Count  Frequency (%)
na  416  25.8%
의류  272  16.9%
기타  197  12.2%
편의점  174  10.8%
식음료  148  9.2%
제과  134  8.3%
액세서리  101  6.3%
플라워  52  3.2%
화장품  48  3.0%
사무실  32  2.0%
Other values (2)  39  2.4%

계약시작일자 (contract start date)
Date

MISSING

Distinct: 316
Distinct (%): 26.4%
Missing: 416
Missing (%): 25.8%
Memory size: 12.7 KiB
Minimum: 2010-01-28 00:00:00
Maximum: 2021-12-13 00:00:00
Histogram with fixed size bins (bins=50)

계약종료일자 (contract end date)
Date

MISSING

Distinct: 332
Distinct (%): 27.7%
Missing: 416
Missing (%): 25.8%
Memory size: 12.7 KiB
Minimum: 2017-04-27 00:00:00
Maximum: 2027-01-21 00:00:00
Histogram with fixed size bins (bins=50)

월임대료 (monthly rent)
Real number (ℝ)

MISSING

Distinct: 870
Distinct (%): 89.1%
Missing: 637
Missing (%): 39.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 6330099.3
Minimum: 153600
Maximum: 2.8462293 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.3 KiB

Quantile statistics

Minimum: 153600
5-th percentile: 596625
Q1: 1928872.9
Median: 3821348
Q3: 6975317.4
95-th percentile: 15565267
Maximum: 2.8462293 × 10^8
Range: 2.8446933 × 10^8
Interquartile range (IQR): 5046444.5

Descriptive statistics

Standard deviation: 15111661
Coefficient of variation (CV): 2.3872707
Kurtosis: 188.74379
Mean: 6330099.3
Median Absolute Deviation (MAD): 2230000
Skewness: 12.586932
Sum: 6.1781769 × 10^9
Variance: 2.2836229 × 10^14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
2500000.0  5  0.3%
2810000.0  4  0.2%
2300000.0  4  0.2%
1700000.0  4  0.2%
1200000.0  4  0.2%
3550000.0  4  0.2%
2200000.0  4  0.2%
4310000.0  4  0.2%
2150000.0  3  0.2%
4200000.0  3  0.2%
Other values (860)  937  58.1%
(Missing)  637  39.5%

Value  Count  Frequency (%)
153600.0  1  0.1%
186000.0  1  0.1%
233500.0  1  0.1%
252000.0  1  0.1%
300000.0  1  0.1%
302500.0  1  0.1%
311666.6667  1  0.1%
328100.0  1  0.1%
330000.0  1  0.1%
337800.0  1  0.1%

Value  Count  Frequency (%)
284622927.0  1  0.1%
217793378.0  1  0.1%
176500000.0  1  0.1%
152935000.0  1  0.1%
145000000.0  1  0.1%
61517300.0  1  0.1%
55185100.0  1  0.1%
48204012.07  1  0.1%
40100000.0  1  0.1%
29358258.0  1  0.1%

Interactions

[Figure: nine interaction plots; images not preserved]

Correlations

              연번     상가유형  면적(제곱미터)  영업업종  월임대료
연번            1.000  0.476    0.057         0.340    0.000
상가유형        0.476  1.000    0.346         0.569    0.368
면적(제곱미터)  0.057  0.346    1.000         0.179    0.809
영업업종        0.340  0.569    0.179         1.000    0.000
월임대료        0.000  0.368    0.809         0.000    1.000

          영업업종  상가유형
영업업종  1.000    0.356
상가유형  0.356    1.000

              연번     면적(제곱미터)  월임대료  상가유형  영업업종
연번            1.000  0.390         0.098    0.264    0.151
면적(제곱미터)  0.390  1.000         0.382    0.248    0.105
월임대료        0.098  0.382         1.000    0.246    0.000
상가유형        0.264  0.248         0.246    1.000    0.356
영업업종        0.151  0.105         0.000    0.356    1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
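
To make the idea of nullity correlation concrete, the sketch below computes a plain Pearson correlation between missingness flags. It is an illustration under stated assumptions, not the report's own implementation. The first two flag vectors are read off the first ten sample rows of this report, where only the two vacant units (index 7 and 8) lack both contract dates and rent. The third vector is hypothetical and included only for contrast.

```python
from math import sqrt

# Missingness flags (1 = missing) for the first ten sample rows of the report.
start_missing = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]  # 계약시작일자
rent_missing = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]   # 월임대료

# Hypothetical column that is missing in a partly different set of rows.
other_missing = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]


def pearson(xs, ys):
    """Plain Pearson correlation; assumes neither input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


print(pearson(start_missing, rent_missing))   # identical patterns
print(pearson(start_missing, other_missing))  # partial overlap
```

On the full dataset the value for these two columns would be lower than in this ten-row excerpt, since 월임대료 has 637 missing cells while the contract dates have 416.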

Sample

(index)  연번  상가유형  호선  역사명  상가번호  면적(제곱미터)  영업업종  계약시작일자  계약종료일자  월임대료
0  1  개별(일반)  1  서울(1)역  150-107  33.0  사무실  2019-05-08  2024-06-06  527100.0
1  2  개별(일반)  1  시청(1)역  151-101  29.73  의류  2017-04-04  2022-05-03  3858954.0
2  3  개별(일반)  1  시청(1)역  151-103  57.6  기타  2020-02-01  2025-01-31  1858300.0
3  4  개별(일반)  1  시청(1)역  151-104  25.0  플라워  2020-12-31  2026-01-30  2470600.0
4  5  네트워크  1  시청(1)역  151-105  25.0  식음료  2021-06-03  2026-08-02  4145884.24
5  6  개별(일반)  1  시청(1)역  151-106  14.0  액세서리  2017-09-19  2022-11-17  1801800.0
6  7  개별(일반)  1  시청(1)역  151-107  22.0  의류  2020-09-18  2025-10-18  2613800.0
7  8  공실  1  종각역  152-101  36.85  <NA>  <NA>  <NA>  <NA>
8  9  공실  1  종각역  152-104  18.64  <NA>  <NA>  <NA>  <NA>
9  10  개별(일반)  1  종각역  152-105  29.3  편의점  2017-04-18  2022-04-17  6549400.0

(index)  연번  상가유형  호선  역사명  상가번호  면적(제곱미터)  영업업종  계약시작일자  계약종료일자  월임대료
1603  1604  개별(일반)  1  남한산성입구역  822-205  17.0  식음료  2019-10-29  2024-11-27  1666600.0
1604  1605  공실  1  단대오거리역  823-101  42.5  <NA>  <NA>  <NA>  <NA>
1605  1606  개별(일반)  1  단대오거리역  823-102  36.78  기타  2021-01-21  2026-02-20  1700000.0
1606  1607  네트워크  1  단대오거리역  823-201  32.5  편의점  2016-07-25  2021-11-17  8712991.0
1607  1608  공실  1  단대오거리역  823-202  28.97  <NA>  <NA>  <NA>  <NA>
1608  1609  개별(일반)  1  단대오거리역  823-203  54.03  식음료  2018-08-31  2023-09-29  7630000.0
1609  1610  개별(일반)  1  단대오거리역  823-204  75.09  의류  2021-03-18  2026-04-17  3780000.0
1610  1611  네트워크  1  신흥역  824-101  40.0  편의점  2016-07-25  2021-11-17  6124682.0
1611  1612  네트워크  1  수진역  825-101  40.0  편의점  2016-07-25  2021-11-17  5575875.0
1612  1613  네트워크  1  모란역  826-101  50.0  편의점  2016-07-25  2021-11-17  5831070.0