Overview

Dataset statistics

Number of variables: 4
Number of observations: 9189
Missing cells: 15
Missing cells (%): < 0.1%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 305.2 KiB
Average record size in memory: 34.0 B
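
The percentage and per-record figures above follow from the raw counts. A minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes; the variable names are illustrative and not part of the report:

```python
# Figures copied from the dataset statistics above.
n_variables = 4
n_observations = 9189
missing_cells = 15
total_size_kib = 305.2

# Assumption: the missing-cell percentage is computed over every cell.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.3f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the missing share is well below 0.1% and the record size rounds to 34.0 B, in line with the figures shown.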

Variable types

Categorical: 1
Text: 1
Numeric: 2

Dataset

Description: File download
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15249/F/1/datasetView.do

Alerts

Duplicates: Dataset has 1 (< 0.1%) duplicate rows
High correlation: 대여일자 is highly overall correlated with 대여건수
High correlation: 대여건수 is highly overall correlated with 대여일자

Reproduction

Analysis started: 2024-03-13 09:53:50.234809
Analysis finished: 2024-03-13 09:53:51.247929
Duration: 1.01 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
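
The reported duration can be checked against the two timestamps. A small sketch using only the Python standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section above.
started = datetime.fromisoformat("2024-03-13 09:53:50.234809")
finished = datetime.fromisoformat("2024-03-13 09:53:51.247929")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} s")
```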

Variables

구분 (category; here, the district)
Categorical

Distinct: 26
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 71.9 KiB

Value | Count
송파구 | 591
강남구 | 582
영등포구 | 528
서초구 | 527
강서구 | 510
Other values (21) | 6451

Length

Max length: 4
Median length: 3
Mean length: 3.0960932
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강남구
2nd row: 강남구
3rd row: 강남구
4th row: 강남구
5th row: 강남구

Common Values

Value | Count | Frequency (%)
송파구 | 591 | 6.4%
강남구 | 582 | 6.3%
영등포구 | 528 | 5.7%
서초구 | 527 | 5.7%
강서구 | 510 | 5.6%
마포구 | 470 | 5.1%
노원구 | 405 | 4.4%
종로구 | 383 | 4.2%
성동구 | 379 | 4.1%
은평구 | 372 | 4.0%
Other values (16) | 4442 | 48.3%
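
The frequency column in the Common Values table is each count divided by the total number of rows. A minimal sketch that recomputes it from the counts listed, assuming one-decimal rounding; the dictionary is just a convenient container and not how the report stores its data:

```python
# Counts copied from the Common Values table for 구분.
counts = {
    "송파구": 591,
    "강남구": 582,
    "영등포구": 528,
    "서초구": 527,
    "강서구": 510,
    "마포구": 470,
    "노원구": 405,
    "종로구": 383,
    "성동구": 379,
    "은평구": 372,
    "Other values (16)": 4442,
}

# The counts should add up to the number of observations (9189).
total = sum(counts.values())

# Frequency in percent, rounded to one decimal place.
freq_pct = {value: round(100 * n / total, 1) for value, n in counts.items()}

for value, pct in freq_pct.items():
    print(f"{value}: {pct}%")
```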

Length

Figure: Histogram of lengths of the category

대여소명 (rental station name)
Text

Distinct: 1546
Distinct (%): 16.8%
Missing: 5
Missing (%): 0.1%
Memory size: 71.9 KiB

Length

Max length: 47
Median length: 31
Mean length: 15.464612
Min length: 8

Characters and Unicode

Total characters: 142027
Distinct characters: 521
Distinct categories: 12
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
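
As an illustration of the character properties mentioned above, the sketch below counts Unicode general categories for one station name taken from the sample rows. It uses Python's standard `unicodedata` module; whether ydata-profiling derives its category counts the same way is an assumption, and the label mapping covers only the categories that occur in this one string:

```python
import unicodedata
from collections import Counter

# One station name from the sample rows of this report.
sample = "2301. 현대고등학교 건너편"

# Two-letter general category code for every character in the string.
category_counts = Counter(unicodedata.category(ch) for ch in sample)

# Readable labels for the codes that occur here.
labels = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
}

for code, n in category_counts.most_common():
    print(f"{labels.get(code, code)}: {n}")
```

Hangul syllables fall under Other Letter, which is why that category dominates the tables for this variable.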

Unique

Unique: 5
Unique (%): 0.1%

Sample

1st row: 2301. 현대고등학교 건너편
2nd row: 2302. 교보타워 버스정류장(신논현역 3번출구 후면)
3rd row: 2303. 논현역 7번출구
4th row: 2304. 신영 ROYAL PALACE 앞
5th row: 2305. MCM 본사 직영점 앞

Value | Count | Frequency (%)
 | 2392 | 8.3%
 | 471 | 1.6%
출구 | 330 | 1.1%
1번출구 | 288 | 1.0%
사거리 | 252 | 0.9%
교차로 | 242 | 0.8%
입구 | 239 | 0.8%
 | 228 | 0.8%
2번출구 | 220 | 0.8%
3번출구 | 196 | 0.7%
Other values (3396) | 24135 | 83.2%

Most occurring characters

Value | Count | Frequency (%)
(space) | 19809 | 13.9%
. | 9220 | 6.5%
1 | 8250 | 5.8%
2 | 6166 | 4.3%
3 | 4348 | 3.1%
5 | 3232 | 2.3%
 | 3178 | 2.2%
0 | 3063 | 2.2%
4 | 2986 | 2.1%
 | 2877 | 2.0%
Other values (511) | 78898 | 55.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 72931 | 51.4%
Decimal Number | 37248 | 26.2%
Space Separator | 19809 | 13.9%
Other Punctuation | 9297 | 6.5%
Uppercase Letter | 1100 | 0.8%
Close Punctuation | 717 | 0.5%
Open Punctuation | 717 | 0.5%
Lowercase Letter | 119 | 0.1%
Dash Punctuation | 53 | < 0.1%
Math Symbol | 24 | < 0.1%
Other values (2) | 12 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 3178 | 4.4%
 | 2877 | 3.9%
 | 2388 | 3.3%
 | 2097 | 2.9%
 | 2067 | 2.8%
 | 1769 | 2.4%
 | 1528 | 2.1%
 | 1268 | 1.7%
 | 1136 | 1.6%
 | 1085 | 1.5%
Other values (453) | 53538 | 73.4%

Uppercase Letter
Value | Count | Frequency (%)
K | 162 | 14.7%
S | 131 | 11.9%
C | 119 | 10.8%
G | 84 | 7.6%
L | 84 | 7.6%
T | 66 | 6.0%
A | 54 | 4.9%
M | 54 | 4.9%
B | 53 | 4.8%
I | 47 | 4.3%
Other values (14) | 246 | 22.4%

Decimal Number
Value | Count | Frequency (%)
1 | 8250 | 22.1%
2 | 6166 | 16.6%
3 | 4348 | 11.7%
5 | 3232 | 8.7%
0 | 3063 | 8.2%
4 | 2986 | 8.0%
6 | 2733 | 7.3%
7 | 2208 | 5.9%
9 | 2140 | 5.7%
8 | 2122 | 5.7%

Lowercase Letter
Value | Count | Frequency (%)
e | 39 | 32.8%
k | 13 | 10.9%
t | 12 | 10.1%
l | 12 | 10.1%
n | 12 | 10.1%
s | 7 | 5.9%
y | 6 | 5.0%
c | 6 | 5.0%
o | 6 | 5.0%
m | 6 | 5.0%

Other Punctuation
Value | Count | Frequency (%)
. | 9220 | 99.2%
, | 36 | 0.4%
? | 18 | 0.2%
& | 12 | 0.1%
@ | 6 | 0.1%
· | 5 | 0.1%

Math Symbol
Value | Count | Frequency (%)
~ | 18 | 75.0%
+ | 6 | 25.0%

Space Separator
Value | Count | Frequency (%)
(space) | 19809 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 717 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 717 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 53 | 100.0%

Other Symbol
Value | Count | Frequency (%)
 | 6 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 6 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 72937 | 51.4%
Common | 67871 | 47.8%
Latin | 1219 | 0.9%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 3178 | 4.4%
 | 2877 | 3.9%
 | 2388 | 3.3%
 | 2097 | 2.9%
 | 2067 | 2.8%
 | 1769 | 2.4%
 | 1528 | 2.1%
 | 1268 | 1.7%
 | 1136 | 1.6%
 | 1085 | 1.5%
Other values (454) | 53544 | 73.4%

Latin
Value | Count | Frequency (%)
K | 162 | 13.3%
S | 131 | 10.7%
C | 119 | 9.8%
G | 84 | 6.9%
L | 84 | 6.9%
T | 66 | 5.4%
A | 54 | 4.4%
M | 54 | 4.4%
B | 53 | 4.3%
I | 47 | 3.9%
Other values (24) | 365 | 29.9%

Common
Value | Count | Frequency (%)
(space) | 19809 | 29.2%
. | 9220 | 13.6%
1 | 8250 | 12.2%
2 | 6166 | 9.1%
3 | 4348 | 6.4%
5 | 3232 | 4.8%
0 | 3063 | 4.5%
4 | 2986 | 4.4%
6 | 2733 | 4.0%
7 | 2208 | 3.3%
Other values (13) | 5856 | 8.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 72931 | 51.4%
ASCII | 69085 | 48.6%
None | 11 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 19809 | 28.7%
. | 9220 | 13.3%
1 | 8250 | 11.9%
2 | 6166 | 8.9%
3 | 4348 | 6.3%
5 | 3232 | 4.7%
0 | 3063 | 4.4%
4 | 2986 | 4.3%
6 | 2733 | 4.0%
7 | 2208 | 3.2%
Other values (46) | 7070 | 10.2%

Hangul
Value | Count | Frequency (%)
 | 3178 | 4.4%
 | 2877 | 3.9%
 | 2388 | 3.3%
 | 2097 | 2.9%
 | 2067 | 2.8%
 | 1769 | 2.4%
 | 1528 | 2.1%
 | 1268 | 1.7%
 | 1136 | 1.6%
 | 1085 | 1.5%
Other values (453) | 53538 | 73.4%

None
Value | Count | Frequency (%)
 | 6 | 54.5%
· | 5 | 45.5%

대여일자 (rental date)
Real number (ℝ)

HIGH CORRELATION

Distinct: 6
Distinct (%): 0.1%
Missing: 5
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 201887.91
Minimum: 201812
Maximum: 201905
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 80.9 KiB

Quantile statistics

Minimum: 201812
5-th percentile: 201812
Q1: 201901
Median: 201903
Q3: 201904
95-th percentile: 201905
Maximum: 201905
Range: 93
Interquartile range (IQR): 3
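
The values 201812 through 201905 look like year-month codes (YYYYMM) stored as plain numbers. That reading is an assumption based on the sample rows, not something the report states. Under it, the numeric range of 93 is an artifact of the encoding, while the calendar span is five months. A short sketch, with `months_between` as a hypothetical helper:

```python
def months_between(start_yyyymm: int, end_yyyymm: int) -> int:
    """Calendar months from start to end, both encoded as YYYYMM integers."""
    start_year, start_month = divmod(start_yyyymm, 100)
    end_year, end_month = divmod(end_yyyymm, 100)
    return (end_year - start_year) * 12 + (end_month - start_month)


# Minimum and maximum copied from the quantile statistics above.
minimum, maximum = 201812, 201905

numeric_range = maximum - minimum                  # what the report calls Range
calendar_span = months_between(minimum, maximum)   # span in months

print(numeric_range, calendar_span)
```

If the column is in fact a year-month code, parsing it as a date before profiling would likely give more meaningful statistics than treating it as a real number.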

Descriptive statistics

Standard deviation: 33.873285
Coefficient of variation (CV): 0.00016778263
Kurtosis: 1.2197688
Mean: 201887.91
Median Absolute Deviation (MAD): 1
Skewness: -1.7913755
Sum: 1.8541386 × 10^9
Variance: 1147.3994
Monotonicity: Increasing
Figure: Histogram with fixed size bins (bins=6)

Common values
Value | Count | Frequency (%)
201903 | 1536 | 16.7%
201905 | 1536 | 16.7%
201904 | 1535 | 16.7%
201901 | 1528 | 16.6%
201902 | 1526 | 16.6%
201812 | 1523 | 16.6%
(Missing) | 5 | 0.1%

Minimum values
Value | Count | Frequency (%)
201812 | 1523 | 16.6%
201901 | 1528 | 16.6%
201902 | 1526 | 16.6%
201903 | 1536 | 16.7%
201904 | 1535 | 16.7%
201905 | 1536 | 16.7%

Maximum values
Value | Count | Frequency (%)
201905 | 1536 | 16.7%
201904 | 1535 | 16.7%
201903 | 1536 | 16.7%
201902 | 1526 | 16.6%
201901 | 1528 | 16.6%
201812 | 1523 | 16.6%

대여건수 (number of rentals)
Real number (ℝ)

HIGH CORRELATION

Distinct: 2033
Distinct (%): 22.1%
Missing: 5
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 636.1679
Minimum: 0
Maximum: 16080
Zeros: 5
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 80.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 74
Q1: 215
Median: 407
Q3: 780
95-th percentile: 1924.85
Maximum: 16080
Range: 16080
Interquartile range (IQR): 565

Descriptive statistics

Standard deviation: 748.96096
Coefficient of variation (CV): 1.1773008
Kurtosis: 56.210929
Mean: 636.1679
Median Absolute Deviation (MAD): 238.5
Skewness: 4.9942804
Sum: 5842566
Variance: 560942.52
Monotonicity: Not monotonic
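
Several of the descriptive statistics above are tied together by simple identities. The sketch below recomputes the mean, coefficient of variation, variance and IQR from the other reported figures. It assumes the mean is taken over non-missing values only and uses the textbook definitions (CV = standard deviation / mean, variance = standard deviation squared), which may differ in detail from what ydata-profiling does internally:

```python
# Figures copied from the 대여건수 statistics above.
n_observations = 9189
n_missing = 5
total = 5842566      # Sum
std = 748.96096      # Standard deviation
q1, q3 = 215, 780    # Quartiles

# Assumption: missing values are excluded before averaging.
n_valid = n_observations - n_missing
mean = total / n_valid

cv = std / mean       # coefficient of variation
variance = std ** 2   # variance as the square of the standard deviation
iqr = q3 - q1         # interquartile range

print(f"mean={mean:.4f} cv={cv:.7f} variance={variance:.2f} iqr={iqr}")
```

The recomputed values agree with the reported ones up to rounding, which supports the assumption that the five missing rows are excluded from the mean.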
Figure: Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
201 | 27 | 0.3%
226 | 22 | 0.2%
275 | 21 | 0.2%
415 | 21 | 0.2%
142 | 21 | 0.2%
198 | 21 | 0.2%
144 | 21 | 0.2%
243 | 20 | 0.2%
228 | 20 | 0.2%
315 | 20 | 0.2%
Other values (2023) | 8970 | 97.6%

Minimum values
Value | Count | Frequency (%)
0 | 5 | 0.1%
1 | 1 | < 0.1%
2 | 5 | 0.1%
3 | 1 | < 0.1%
4 | 2 | < 0.1%
5 | 5 | 0.1%
6 | 5 | 0.1%
7 | 2 | < 0.1%
8 | 1 | < 0.1%
9 | 3 | < 0.1%

Maximum values
Value | Count | Frequency (%)
16080 | 1 | < 0.1%
15588 | 1 | < 0.1%
9996 | 1 | < 0.1%
9977 | 1 | < 0.1%
9366 | 1 | < 0.1%
8719 | 1 | < 0.1%
8378 | 1 | < 0.1%
8081 | 1 | < 0.1%
7795 | 1 | < 0.1%
7344 | 1 | < 0.1%

Interactions


Correlations

Variable | 구분 | 대여일자 | 대여건수
구분 | 1.000 | NaN | 0.146
대여일자 | NaN | 1.000 | NaN
대여건수 | 0.146 | NaN | 1.000

Variable | 대여일자 | 대여건수 | 구분
대여일자 | 1.000 | 0.574 | 0.000
대여건수 | 0.574 | 1.000 | 0.058
구분 | 0.000 | 0.058 | 1.000

Missing values

Figure: A simple visualization of nullity by column.
Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
Figure: The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows
Index | 구분 | 대여소명 | 대여일자 | 대여건수
0 | 강남구 | 2301. 현대고등학교 건너편 | 201812 | 364
1 | 강남구 | 2302. 교보타워 버스정류장(신논현역 3번출구 후면) | 201812 | 500
2 | 강남구 | 2303. 논현역 7번출구 | 201812 | 286
3 | 강남구 | 2304. 신영 ROYAL PALACE 앞 | 201812 | 149
4 | 강남구 | 2305. MCM 본사 직영점 앞 | 201812 | 145
5 | 강남구 | 2306. 압구정역 2번 출구 옆 | 201812 | 457
6 | 강남구 | 2307. 압구정 한양 3차 아파트 | 201812 | 279
7 | 강남구 | 2308. 압구정파출소 앞 | 201812 | 292
8 | 강남구 | 2309. 청담역(우리들병원 앞) | 201812 | 152
9 | 강남구 | 2310. 청담동 맥도날드 옆(위치) | 201812 | 214

Last rows
Index | 구분 | 대여소명 | 대여일자 | 대여건수
9179 | 중랑구 | 1450. 화랑대역 7번출구 | 201905 | 1013
9180 | 중랑구 | 1451. 중랑세무서 | 201905 | 2097
9181 | 중랑구 | 1452. 겸재교 진입부 | 201905 | 1743
9182 | 중랑구 | 1453. 중랑캠핑숲 | 201905 | 218
9183 | 중랑구 | 1454. 한국전력공사(동대문 중랑지사) | 201905 | 1141
9184 | 중랑구 | 1455. 상봉역 2번 출구 | 201905 | 1362
9185 | 중랑구 | 1456. 상아빌딩(우림시장 교차로) | 201905 | 826
9186 | 중랑구 | 1457. 동원사거리 | 201905 | 827
9187 | 중랑구 | 1458. 상봉터미널2 | 201905 | 1421
9188 | 중랑구 | 1459. 용마한신아파트사거리 | 201905 | 447

Duplicate rows

Most frequently occurring

Index | 구분 | 대여소명 | 대여일자 | 대여건수 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | 5
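
A table like the one above can be approximated by counting identical rows. The sketch below does this with the standard library only, on hypothetical toy rows rather than the real dataset, with `None` standing in for `<NA>`. How ydata-profiling itself detects duplicates is not shown in this report, so this is an illustration of the idea, not its implementation:

```python
from collections import Counter

# Hypothetical toy rows: (구분, 대여소명, 대여일자, 대여건수).
# None stands in for <NA>; these are not the actual dataset rows.
rows = [
    ("강남구", "2301. 현대고등학교 건너편", 201812, 364),
    ("강남구", "2303. 논현역 7번출구", 201812, 286),
    (None, None, None, None),
    (None, None, None, None),
    (None, None, None, None),
]

# Count how often each distinct row occurs.
row_counts = Counter(rows)

# Keep only rows that occur more than once, with their occurrence counts.
duplicates = {row: n for row, n in row_counts.items() if n > 1}

print(duplicates)
```

In the report, "Duplicate rows: 1" appears to count distinct duplicated rows, since the single all-`<NA>` row is listed with 5 occurrences.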