Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 4124
Missing cells (%): 10.3%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 410.2 KiB
Average record size in memory: 42.0 B
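
The percentage and per-record figures follow from the counts in this block. A minimal Python check, using only the numbers reported here (the variable names are illustrative, not part of the report):

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 4
n_observations = 10_000
missing_cells = 4_124
total_size_kib = 410.2

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 1 KiB = 1024 bytes; spread the total over all observations.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```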

Variable types

Categorical: 2
Text: 1
Numeric: 1

Dataset

Description: File download
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15249/A/1/datasetView.do

Alerts

Dataset has 1 (< 0.1%) duplicate rows [Duplicates]
대여소 명 has 2062 (20.6%) missing values [Missing]
대여 건수 has 2062 (20.6%) missing values [Missing]

Reproduction

Analysis started: 2024-05-03 23:56:58.937998
Analysis finished: 2024-05-03 23:57:01.454156
Duration: 2.52 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
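
A report of this kind can be regenerated along the following lines. This is a hedged sketch, not the configuration actually used: the function name, the CSV path, and the report title are assumptions, and the original `config.json` settings are not reproduced.

```python
def build_profile_report(csv_path: str, output_html: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so this module loads even where the third-party
    # packages (pandas, ydata-profiling) are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the export is a CSV readable with pandas' defaults;
    # the real file may need an explicit `encoding=` argument.
    df = pd.read_csv(csv_path)

    # The title is a placeholder, not the title of the original report.
    report = ProfileReport(df, title="Profiling report")
    report.to_file(output_html)
```

Calling `build_profile_report("rentals.csv")` with a hypothetical file name would write `report.html` next to the script.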

Variables

대여소 그룹 (rental station group)
Categorical

Distinct: 27
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
<NA> | 2062
송파구 | 584
강서구 | 547
서초구 | 457
강남구 | 449
Other values (22) | 5901

Length

Max length: 4
Median length: 3
Mean length: 3.2755
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서초구
2nd row: <NA>
3rd row: 양천구
4th row: <NA>
5th row: 마포구

Common Values

Value | Count | Frequency (%)
<NA> | 2062 | 20.6%
송파구 | 584 | 5.8%
강서구 | 547 | 5.5%
서초구 | 457 | 4.6%
강남구 | 449 | 4.5%
영등포구 | 396 | 4.0%
노원구 | 351 | 3.5%
종로구 | 349 | 3.5%
마포구 | 330 | 3.3%
강동구 | 306 | 3.1%
Other values (17) | 4169 | 41.7%

Length

Histogram of lengths of the category

대여소 명 (rental station name)
Text

MISSING

Distinct: 2448
Distinct (%): 30.8%
Missing: 2062
Missing (%): 20.6%
Memory size: 156.2 KiB

Length

Max length: 47
Median length: 29
Mean length: 15.402872
Min length: 3

Characters and Unicode

Total characters: 122268
Distinct characters: 581
Distinct categories: 12
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
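
As an illustration of how such properties can be read per character, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper name and the abbreviated code-to-name mapping are assumptions made for this example; the profiler's own implementation may differ.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in the sample below.
# This mapping is deliberately partial, not the full Unicode list.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
}

def category_counts(text: str) -> Counter:
    """Count the characters of `text` per Unicode general category."""
    counts = Counter()
    for ch in text:
        code = unicodedata.category(ch)  # e.g. "Lo" for a Hangul syllable
        counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

sample = "737. 장수공원"  # 2nd sample row of this variable
counts = category_counts(sample)
print(counts)
```

For this sample the digits fall under Decimal Number, the full stop under Other Punctuation, the blank under Space Separator, and the four Hangul syllables under Other Letter.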

Unique

Unique: 198
Unique (%): 2.5%

Sample

1st row: 2207. 내곡파출소 뒤 정자
2nd row: 737. 장수공원
3rd row: 3005.마포구청 청사내
4th row: 1547. 꿈의숲 롯데캐슬
5th row: 934. 신사동 성당

Value | Count | Frequency (%)
[lost in extraction] | 2113 | 9.2%
출구 | 317 | 1.4%
[lost in extraction] | 303 | 1.3%
입구 | 215 | 0.9%
1번출구 | 188 | 0.8%
사거리 | 169 | 0.7%
교차로 | 156 | 0.7%
[lost in extraction] | 155 | 0.7%
3번출구 | 143 | 0.6%
2번출구 | 138 | 0.6%
Other values (4841) | 18952 | 82.9%

Most occurring characters

Value | Count | Frequency (%)
(space) | 14911 | 12.2%
. | 7950 | 6.5%
1 | 6206 | 5.1%
2 | 5109 | 4.2%
3 | 3880 | 3.2%
4 | 3412 | 2.8%
5 | 2812 | 2.3%
0 | 2768 | 2.3%
6 | 2604 | 2.1%
[lost in extraction] | 2590 | 2.1%
Other values (571) | 70026 | 57.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 63259 | 51.7%
Decimal Number | 33273 | 27.2%
Space Separator | 14911 | 12.2%
Other Punctuation | 8025 | 6.6%
Uppercase Letter | 1107 | 0.9%
Close Punctuation | 743 | 0.6%
Open Punctuation | 743 | 0.6%
Lowercase Letter | 132 | 0.1%
Dash Punctuation | 53 | < 0.1%
Math Symbol | 14 | < 0.1%
Other values (2) | 8 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[lost in extraction] | 2590 | 4.1%
[lost in extraction] | 2425 | 3.8%
[lost in extraction] | 1749 | 2.8%
[lost in extraction] | 1706 | 2.7%
[lost in extraction] | 1550 | 2.5%
[lost in extraction] | 1515 | 2.4%
[lost in extraction] | 1262 | 2.0%
[lost in extraction] | 1104 | 1.7%
[lost in extraction] | 1073 | 1.7%
[lost in extraction] | 1040 | 1.6%
Other values (511) | 47245 | 74.7%

Uppercase Letter
Value | Count | Frequency (%)
S | 127 | 11.5%
K | 123 | 11.1%
T | 94 | 8.5%
C | 94 | 8.5%
A | 80 | 7.2%
L | 75 | 6.8%
G | 75 | 6.8%
D | 68 | 6.1%
M | 64 | 5.8%
P | 59 | 5.3%
Other values (14) | 248 | 22.4%

Lowercase Letter
Value | Count | Frequency (%)
e | 48 | 36.4%
k | 21 | 15.9%
s | 16 | 12.1%
t | 11 | 8.3%
l | 7 | 5.3%
n | 6 | 4.5%
m | 4 | 3.0%
o | 4 | 3.0%
c | 4 | 3.0%
v | 4 | 3.0%
Other values (3) | 7 | 5.3%

Decimal Number
Value | Count | Frequency (%)
1 | 6206 | 18.7%
2 | 5109 | 15.4%
3 | 3880 | 11.7%
4 | 3412 | 10.3%
5 | 2812 | 8.5%
0 | 2768 | 8.3%
6 | 2604 | 7.8%
7 | 2444 | 7.3%
8 | 2078 | 6.2%
9 | 1960 | 5.9%

Other Punctuation
Value | Count | Frequency (%)
. | 7950 | 99.1%
, | 43 | 0.5%
& | 16 | 0.2%
? | 11 | 0.1%
· | 5 | 0.1%

Math Symbol
Value | Count | Frequency (%)
~ | 12 | 85.7%
+ | 2 | 14.3%

Space Separator
Value | Count | Frequency (%)
(space) | 14911 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 743 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 743 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 53 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 6 | 100.0%

Other Symbol
Value | Count | Frequency (%)
[lost in extraction] | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 63261 | 51.7%
Common | 57768 | 47.2%
Latin | 1239 | 1.0%

Most frequent character per script

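Hangul
Value | Count | Frequency (%)
[lost in extraction] | 2590 | 4.1%
[lost in extraction] | 2425 | 3.8%
[lost in extraction] | 1749 | 2.8%
[lost in extraction] | 1706 | 2.7%
[lost in extraction] | 1550 | 2.5%
[lost in extraction] | 1515 | 2.4%
[lost in extraction] | 1262 | 2.0%
[lost in extraction] | 1104 | 1.7%
[lost in extraction] | 1073 | 1.7%
[lost in extraction] | 1040 | 1.6%
Other values (512) | 47247 | 74.7%

Latin
Value | Count | Frequency (%)
S | 127 | 10.3%
K | 123 | 9.9%
T | 94 | 7.6%
C | 94 | 7.6%
A | 80 | 6.5%
L | 75 | 6.1%
G | 75 | 6.1%
D | 68 | 5.5%
M | 64 | 5.2%
P | 59 | 4.8%
Other values (27) | 380 | 30.7%

Common
Value | Count | Frequency (%)
(space) | 14911 | 25.8%
. | 7950 | 13.8%
1 | 6206 | 10.7%
2 | 5109 | 8.8%
3 | 3880 | 6.7%
4 | 3412 | 5.9%
5 | 2812 | 4.9%
0 | 2768 | 4.8%
6 | 2604 | 4.5%
7 | 2444 | 4.2%
Other values (12) | 5672 | 9.8%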

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 63259 | 51.7%
ASCII | 59002 | 48.3%
None | 7 | < 0.1%

Most frequent character per block

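ASCII
Value | Count | Frequency (%)
(space) | 14911 | 25.3%
. | 7950 | 13.5%
1 | 6206 | 10.5%
2 | 5109 | 8.7%
3 | 3880 | 6.6%
4 | 3412 | 5.8%
5 | 2812 | 4.8%
0 | 2768 | 4.7%
6 | 2604 | 4.4%
7 | 2444 | 4.1%
Other values (48) | 6906 | 11.7%

Hangul
Value | Count | Frequency (%)
[lost in extraction] | 2590 | 4.1%
[lost in extraction] | 2425 | 3.8%
[lost in extraction] | 1749 | 2.8%
[lost in extraction] | 1706 | 2.7%
[lost in extraction] | 1550 | 2.5%
[lost in extraction] | 1515 | 2.4%
[lost in extraction] | 1262 | 2.0%
[lost in extraction] | 1104 | 1.7%
[lost in extraction] | 1073 | 1.7%
[lost in extraction] | 1040 | 1.6%
Other values (511) | 47245 | 74.7%

None
Value | Count | Frequency (%)
· | 5 | 71.4%
[lost in extraction] | 2 | 28.6%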

대여 일자 / 월 (rental date / month)
Categorical

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
<NA> | 2062
202106 | 1686
202105 | 1622
202103 | 1561
202104 | 1555

Length

Max length: 6
Median length: 6
Mean length: 5.5876
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202103
2nd row: <NA>
3rd row: 202105
4th row: <NA>
5th row: 202103

Common Values

Value | Count | Frequency (%)
<NA> | 2062 | 20.6%
202106 | 1686 | 16.9%
202105 | 1622 | 16.2%
202103 | 1561 | 15.6%
202104 | 1555 | 15.6%
202102 | 1514 | 15.1%

Length

Histogram of lengths of the category


대여 건수 (rental count)
Real number (ℝ)

MISSING

Distinct: 2676
Distinct (%): 33.7%
Missing: 2062
Missing (%): 20.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 1090.5005
Minimum: 0
Maximum: 17875
Zeros: 3
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 132
Q1: 409
Median: 801
Q3: 1426
95-th percentile: 2933.3
Maximum: 17875
Range: 17875
Interquartile range (IQR): 1017

Descriptive statistics

Standard deviation: 1081.7983
Coefficient of variation (CV): 0.99201997
Kurtosis: 32.731171
Mean: 1090.5005
Median Absolute Deviation (MAD): 458
Skewness: 3.8886539
Sum: 8656393
Variance: 1170287.5
Monotonicity: Not monotonic
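
Several of these figures are tied together by simple identities: the mean is the sum over the non-missing count, CV is the standard deviation over the mean, variance is the squared standard deviation, and IQR is Q3 minus Q1. A small Python cross-check using only numbers reported for this variable (variable names are illustrative):

```python
# Figures copied from this variable's summary in the report.
n_observations = 10_000
n_missing = 2_062
total = 8_656_393           # "Sum"
std_dev = 1081.7983         # "Standard deviation"
q1, q3 = 409, 1426
minimum, maximum = 0, 17_875

n_valid = n_observations - n_missing   # values that are actually present
mean = total / n_valid                 # mean over non-missing values
cv = std_dev / mean                    # coefficient of variation
variance = std_dev ** 2                # variance from the rounded std dev
iqr = q3 - q1                          # interquartile range
value_range = maximum - minimum

print(f"mean={mean:.4f} cv={cv:.5f} variance={variance:.1f} "
      f"iqr={iqr} range={value_range}")
```

Small differences in the last digit of the variance are expected, since the standard deviation is taken here in its rounded, reported form.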
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
264 | 13 | 0.1%
505 | 12 | 0.1%
710 | 12 | 0.1%
197 | 12 | 0.1%
263 | 12 | 0.1%
312 | 12 | 0.1%
634 | 11 | 0.1%
210 | 11 | 0.1%
358 | 11 | 0.1%
308 | 11 | 0.1%
Other values (2666) | 7821 | 78.2%
(Missing) | 2062 | 20.6%

Minimum 10 values
Value | Count | Frequency (%)
0 | 3 | < 0.1%
1 | 9 | 0.1%
2 | 6 | 0.1%
3 | 3 | < 0.1%
4 | 2 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 4 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
17875 | 1 | < 0.1%
16619 | 1 | < 0.1%
15386 | 1 | < 0.1%
15061 | 1 | < 0.1%
14494 | 1 | < 0.1%
11585 | 1 | < 0.1%
11174 | 1 | < 0.1%
10998 | 1 | < 0.1%
10745 | 1 | < 0.1%
10643 | 1 | < 0.1%

Interactions


Correlations

Variable | 대여소 그룹 | 대여 일자 / 월 | 대여 건수
대여소 그룹 | 1.000 | 0.000 | 0.260
대여 일자 / 월 | 0.000 | 1.000 | 0.184
대여 건수 | 0.260 | 0.184 | 1.000

Variable | 대여소 그룹 | 대여 일자 / 월
대여소 그룹 | 1.000 | 0.000
대여 일자 / 월 | 0.000 | 1.000

Variable | 대여 건수 | 대여소 그룹 | 대여 일자 / 월
대여 건수 | 1.000 | 0.100 | 0.107
대여소 그룹 | 0.100 | 1.000 | 0.000
대여 일자 / 월 | 0.107 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
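
One common way to compute a nullity correlation is the Pearson correlation between per-column missingness indicators (1 = missing, 0 = present). The sketch below assumes that definition, which may not match the profiler's exact computation, and applies it to the 대여소 명 and 대여 건수 values of the first ten rows listed in the Sample section; the helper `pearson` is written out for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# (대여소 명, 대여 건수) for ten sample rows; None marks a missing value.
rows = [
    ("2207. 내곡파출소 뒤 정자", 181),
    (None, None),
    ("737. 장수공원", 1255),
    (None, None),
    ("3005.마포구청 청사내", 416),
    ("1547. 꿈의숲 롯데캐슬", 129),
    ("934. 신사동 성당", 393),
    (None, None),
    ("2630.서울방이동 고분군", 573),
    ("2311. 학동로 래미안 아파트 앞", 289),
]

# Turn each column into a 0/1 missingness indicator.
name_missing = [1 if name is None else 0 for name, _ in rows]
count_missing = [1 if count is None else 0 for _, count in rows]

nullity_corr = pearson(name_missing, count_missing)
print(f"nullity correlation: {nullity_corr:.3f}")
```

In these rows the two columns are always missing together, so the indicators are identical and the correlation is at its maximum.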

Sample

Index | 대여소 그룹 | 대여소 명 | 대여 일자 / 월 | 대여 건수
3443 | 서초구 | 2207. 내곡파출소 뒤 정자 | 202103 | 181
12996 | <NA> | <NA> | <NA> | <NA>
8689 | 양천구 | 737. 장수공원 | 202105 | 1255
13053 | <NA> | <NA> | <NA> | <NA>
3304 | 마포구 | 3005.마포구청 청사내 | 202103 | 416
244 | 강북구 | 1547. 꿈의숲 롯데캐슬 | 202102 | 129
6581 | 은평구 | 934. 신사동 성당 | 202104 | 393
12184 | <NA> | <NA> | <NA> | <NA>
1567 | 송파구 | 2630.서울방이동 고분군 | 202102 | 573
2245 | 강남구 | 2311. 학동로 래미안 아파트 앞 | 202103 | 289

Index | 대여소 그룹 | 대여소 명 | 대여 일자 / 월 | 대여 건수
14111 | <NA> | <NA> | <NA> | <NA>
6086 | 송파구 | 1210. 롯데월드타워(잠실역2번출구 쪽) | 202104 | 8336
10475 | 마포구 | 3007.MBC 앞 | 202106 | 1479
10512 | 마포구 | 425. DMC첨단산업센터 | 202106 | 827
5856 | 서초구 | 2287. 능안마을입구 | 202104 | 37
4163 | 용산구 | 854.HID 유족동지회 앞 | 202103 | 98
13855 | <NA> | <NA> | <NA> | <NA>
4730 | 강동구 | 1064.중앙보훈병원역 1번출구 | 202104 | 350
6071 | 성북구 | 1391.성북청소년센터 | 202104 | 597
6184 | 송파구 | 2631.배명고등학교 | 202104 | 1051

Duplicate rows

Most frequently occurring

Index | 대여소 그룹 | 대여소 명 | 대여 일자 / 월 | 대여 건수 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | 2062