Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 4020
Missing cells (%): 10.1%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 410.2 KiB
Average record size in memory: 42.0 B
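
The missing-cell percentage follows from the counts above, since the number of cells is the number of variables times the number of observations. A small check in Python, using only the figures reported in this section (the variable names are illustrative):

```python
# Figures copied from the dataset statistics above.
n_variables = 4
n_observations = 10_000
missing_cells = 4_020

total_cells = n_variables * n_observations   # every variable has one cell per observation
missing_share = missing_cells / total_cells  # fraction of all cells that are missing

# The report shows this share rounded to one decimal place.
print(f"{missing_share:.2%}")
```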

Variable types

Categorical: 2
Text: 1
Numeric: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15249/F/1/datasetView.do

Alerts

Dataset has 1 (< 0.1%) duplicate rows [Duplicates]
대여소 명 has 2010 (20.1%) missing values [Missing]
대여 건수 has 2010 (20.1%) missing values [Missing]

Reproduction

Analysis started: 2024-03-13 09:53:17.684280
Analysis finished: 2024-03-13 09:53:18.522485
Duration: 0.84 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
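
A report of this kind can be regenerated with ydata-profiling. The sketch below is a minimal outline, not the exact configuration used here: the CSV path, the output path, the report title, and the file encoding are all assumptions, and the function name `build_report` is hypothetical.

```python
from pathlib import Path


def build_report(csv_path: str, out_path: str = "report.html") -> Path:
    """Profile a CSV file and write an HTML report (sketch; paths are placeholders)."""
    # Imported lazily so the function can be defined without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; adjust it to match the downloaded file.
    df = pd.read_csv(csv_path, encoding="utf-8")

    report = ProfileReport(
        df,
        title="Profiling report",  # assumed title
        # Dataset metadata as listed in the Overview section.
        dataset={
            "description": "파일 다운로드",
            "author": "서울특별시",
            "url": "https://data.seoul.go.kr/dataList/OA-15249/F/1/datasetView.do",
        },
    )
    report.to_file(out_path)
    return Path(out_path)
```

Calling `build_report("rentals.csv")` (a hypothetical file name) would write `report.html` next to the script.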

Variables

대여소 그룹 (rental station group)
Categorical

Distinct: 28
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
<NA> | 2010
송파구 | 565
강서구 | 538
강남구 | 457
서초구 | 449
Other values (23) | 5981

Length

Max length: 6
Median length: 3
Mean length: 3.2681
Min length: 2

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 강서구
2nd row: <NA>
3rd row: 강남구
4th row: 관악구
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 2010 | 20.1%
송파구 | 565 | 5.7%
강서구 | 538 | 5.4%
강남구 | 457 | 4.6%
서초구 | 449 | 4.5%
영등포구 | 397 | 4.0%
마포구 | 368 | 3.7%
종로구 | 350 | 3.5%
노원구 | 346 | 3.5%
강동구 | 319 | 3.2%
Other values (18) | 4201 | 42.0%

Length

[Figure: histogram of lengths of the category]

Value | Count | Frequency (%)
na | 2010 | 20.1%
송파구 | 565 | 5.6%
강서구 | 538 | 5.4%
강남구 | 457 | 4.6%
서초구 | 449 | 4.5%
영등포구 | 397 | 4.0%
마포구 | 368 | 3.7%
종로구 | 350 | 3.5%
노원구 | 346 | 3.5%
강동구 | 319 | 3.2%
Other values (19) | 4202 | 42.0%

대여소 명 (rental station name)
Text

MISSING 

Distinct: 2453
Distinct (%): 30.7%
Missing: 2010
Missing (%): 20.1%
Memory size: 156.2 KiB

Length

Max length: 47
Median length: 29
Mean length: 15.447434
Min length: 3

Characters and Unicode

Total characters: 123425
Distinct characters: 582
Distinct categories: 12
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
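
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories for one of the sample values of this variable; it is not necessarily how the report itself computes its counts.

```python
import unicodedata
from collections import Counter

# One of the sample values of this variable, normalized so that Hangul
# syllables are single precomposed code points.
text = unicodedata.normalize("NFC", "1178. 개화산역 2번 출구")

# General category of each code point, e.g. "Lo" = Other Letter,
# "Nd" = Decimal Number, "Zs" = Space Separator, "Po" = Other Punctuation.
category_counts = Counter(unicodedata.category(ch) for ch in text)

for code, count in category_counts.most_common():
    print(code, count)
```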

Unique

Unique: 160
Unique (%): 2.0%

Sample

1st row: 1178. 개화산역 2번 출구
2nd row: 2361. 압구정역 교차로
3rd row: 3310.영락고등학교
4th row: 559. 왕십리역 4번 출구 건너편
5th row: 1709. 쌍문역4번출구 주변

Value | Count | Frequency (%)
[illegible] | 2130 | 9.2%
[illegible] | 316 | 1.4%
출구 | 293 | 1.3%
입구 | 220 | 1.0%
1번출구 | 204 | 0.9%
사거리 | 183 | 0.8%
교차로 | 182 | 0.8%
2번출구 | 158 | 0.7%
[illegible] | 151 | 0.7%
3번출구 | 150 | 0.6%
Other values (4851) | 19098 | 82.7%

Most occurring characters

Value | Count | Frequency (%)
(space) | 15095 | 12.2%
. | 8001 | 6.5%
1 | 6290 | 5.1%
2 | 5107 | 4.1%
3 | 3918 | 3.2%
4 | 3468 | 2.8%
5 | 2833 | 2.3%
0 | 2815 | 2.3%
6 | 2614 | 2.1%
[illegible] | 2599 | 2.1%
Other values (572) | 70685 | 57.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 63799 | 51.7%
Decimal Number | 33634 | 27.3%
Space Separator | 15095 | 12.2%
Other Punctuation | 8075 | 6.5%
Uppercase Letter | 1149 | 0.9%
Open Punctuation | 728 | 0.6%
Close Punctuation | 728 | 0.6%
Lowercase Letter | 132 | 0.1%
Dash Punctuation | 54 | < 0.1%
Math Symbol | 20 | < 0.1%
Other values (2) | 11 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[illegible] | 2599 | 4.1%
[illegible] | 2495 | 3.9%
[illegible] | 1835 | 2.9%
[illegible] | 1677 | 2.6%
[illegible] | 1616 | 2.5%
[illegible] | 1577 | 2.5%
[illegible] | 1298 | 2.0%
[illegible] | 1112 | 1.7%
[illegible] | 1102 | 1.7%
[illegible] | 1045 | 1.6%
Other values (512) | 47443 | 74.4%

Uppercase Letter
Value | Count | Frequency (%)
K | 134 | 11.7%
S | 126 | 11.0%
C | 107 | 9.3%
T | 88 | 7.7%
L | 79 | 6.9%
G | 79 | 6.9%
A | 75 | 6.5%
D | 73 | 6.4%
M | 70 | 6.1%
P | 52 | 4.5%
Other values (14) | 266 | 23.2%

Lowercase Letter
Value | Count | Frequency (%)
e | 45 | 34.1%
k | 21 | 15.9%
s | 16 | 12.1%
t | 11 | 8.3%
l | 8 | 6.1%
n | 6 | 4.5%
v | 5 | 3.8%
o | 5 | 3.8%
m | 5 | 3.8%
c | 5 | 3.8%
Other values (3) | 5 | 3.8%

Decimal Number
Value | Count | Frequency (%)
1 | 6290 | 18.7%
2 | 5107 | 15.2%
3 | 3918 | 11.6%
4 | 3468 | 10.3%
5 | 2833 | 8.4%
0 | 2815 | 8.4%
6 | 2614 | 7.8%
7 | 2507 | 7.5%
8 | 2122 | 6.3%
9 | 1960 | 5.8%

Other Punctuation
Value | Count | Frequency (%)
. | 8001 | 99.1%
, | 45 | 0.6%
& | 14 | 0.2%
? | 11 | 0.1%
· | 4 | < 0.1%

Math Symbol
Value | Count | Frequency (%)
~ | 16 | 80.0%
+ | 4 | 20.0%

Space Separator
Value | Count | Frequency (%)
(space) | 15095 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 728 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 728 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 54 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 7 | 100.0%

Other Symbol
Value | Count | Frequency (%)
[illegible] | 4 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 63803 | 51.7%
Common | 58341 | 47.3%
Latin | 1281 | 1.0%

Most frequent character per script

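Hangul
Value | Count | Frequency (%)
[illegible] | 2599 | 4.1%
[illegible] | 2495 | 3.9%
[illegible] | 1835 | 2.9%
[illegible] | 1677 | 2.6%
[illegible] | 1616 | 2.5%
[illegible] | 1577 | 2.5%
[illegible] | 1298 | 2.0%
[illegible] | 1112 | 1.7%
[illegible] | 1102 | 1.7%
[illegible] | 1045 | 1.6%
Other values (513) | 47447 | 74.4%

Latin
Value | Count | Frequency (%)
K | 134 | 10.5%
S | 126 | 9.8%
C | 107 | 8.4%
T | 88 | 6.9%
L | 79 | 6.2%
G | 79 | 6.2%
A | 75 | 5.9%
D | 73 | 5.7%
M | 70 | 5.5%
P | 52 | 4.1%
Other values (27) | 398 | 31.1%

Common
Value | Count | Frequency (%)
(space) | 15095 | 25.9%
. | 8001 | 13.7%
1 | 6290 | 10.8%
2 | 5107 | 8.8%
3 | 3918 | 6.7%
4 | 3468 | 5.9%
5 | 2833 | 4.9%
0 | 2815 | 4.8%
6 | 2614 | 4.5%
7 | 2507 | 4.3%
Other values (12) | 5693 | 9.8%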

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 63799 | 51.7%
ASCII | 59618 | 48.3%
None | 8 | < 0.1%

Most frequent character per block

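ASCII
Value | Count | Frequency (%)
(space) | 15095 | 25.3%
. | 8001 | 13.4%
1 | 6290 | 10.6%
2 | 5107 | 8.6%
3 | 3918 | 6.6%
4 | 3468 | 5.8%
5 | 2833 | 4.8%
0 | 2815 | 4.7%
6 | 2614 | 4.4%
7 | 2507 | 4.2%
Other values (48) | 6970 | 11.7%

Hangul
Value | Count | Frequency (%)
[illegible] | 2599 | 4.1%
[illegible] | 2495 | 3.9%
[illegible] | 1835 | 2.9%
[illegible] | 1677 | 2.6%
[illegible] | 1616 | 2.5%
[illegible] | 1577 | 2.5%
[illegible] | 1298 | 2.0%
[illegible] | 1112 | 1.7%
[illegible] | 1102 | 1.7%
[illegible] | 1045 | 1.6%
Other values (512) | 47443 | 74.4%

None
Value | Count | Frequency (%)
· | 4 | 50.0%
[illegible] | 4 | 50.0%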

대여 일자 / 월 (rental date / month)
Categorical

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
<NA> | 2010
202106 | 1690
202105 | 1646
202104 | 1586
202103 | 1556

Length

Max length: 6
Median length: 6
Mean length: 5.598
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202106
2nd row: <NA>
3rd row: 202106
4th row: 202103
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 2010 | 20.1%
202106 | 1690 | 16.9%
202105 | 1646 | 16.5%
202104 | 1586 | 15.9%
202103 | 1556 | 15.6%
202102 | 1512 | 15.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Value | Count | Frequency (%)
na | 2010 | 20.1%
202106 | 1690 | 16.9%
202105 | 1646 | 16.5%
202104 | 1586 | 15.9%
202103 | 1556 | 15.6%
202102 | 1512 | 15.1%

대여 건수 (rental count)
Real number (ℝ)

MISSING 

Distinct: 2680
Distinct (%): 33.5%
Missing: 2010
Missing (%): 20.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 1095.9637
Minimum: 0
Maximum: 17875
Zeros: 4
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 132
Q1: 410
Median: 806.5
Q3: 1429.75
95-th percentile: 3012
Maximum: 17875
Range: 17875
Interquartile range (IQR): 1019.75

Descriptive statistics

Standard deviation: 1071.974
Coefficient of variation (CV): 0.97811086
Kurtosis: 27.216434
Mean: 1095.9637
Median Absolute Deviation (MAD): 459.5
Skewness: 3.5092486
Sum: 8756750
Variance: 1149128.3
Monotonicity: Not monotonic
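
Several of the figures above are tied together by simple identities: the IQR is Q3 minus Q1, the CV is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum divided by the mean gives the number of non-missing values (10000 observations minus 2010 missing). A short check in Python, using only the reported, rounded values (the variable names are illustrative):

```python
# Figures copied from the quantile and descriptive statistics above.
q1, q3 = 410.0, 1429.75
mean, std = 1095.9637, 1071.974
total = 8_756_750
minimum, maximum = 0, 17_875

iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
n_values = total / mean          # implied number of non-missing values

print(iqr, value_range, round(cv, 4), round(variance, 1), round(n_values))
```

Because the inputs are rounded, the recomputed CV and variance agree with the reported ones only to within rounding error.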
[Figure: histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
263 | 15 | 0.1%
505 | 12 | 0.1%
402 | 11 | 0.1%
353 | 11 | 0.1%
112 | 11 | 0.1%
202 | 11 | 0.1%
916 | 11 | 0.1%
399 | 11 | 0.1%
409 | 11 | 0.1%
198 | 11 | 0.1%
Other values (2670) | 7875 | 78.8%
(Missing) | 2010 | 20.1%

Minimum 10 values
Value | Count | Frequency (%)
0 | 4 | < 0.1%
1 | 10 | 0.1%
2 | 8 | 0.1%
3 | 2 | < 0.1%
4 | 1 | < 0.1%
5 | 2 | < 0.1%
7 | 3 | < 0.1%
8 | 1 | < 0.1%
9 | 2 | < 0.1%
10 | 2 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
17875 | 1 | < 0.1%
15386 | 1 | < 0.1%
15061 | 1 | < 0.1%
14494 | 1 | < 0.1%
12179 | 1 | < 0.1%
10998 | 1 | < 0.1%
10643 | 1 | < 0.1%
9795 | 1 | < 0.1%
9786 | 1 | < 0.1%
9574 | 1 | < 0.1%

Interactions


Correlations

Variable | 대여소 그룹 | 대여 일자 / 월 | 대여 건수
대여소 그룹 | 1.000 | 0.000 | 0.272
대여 일자 / 월 | 0.000 | 1.000 | 0.196
대여 건수 | 0.272 | 0.196 | 1.000

Variable | 대여소 그룹 | 대여 일자 / 월
대여소 그룹 | 1.000 | 0.000
대여 일자 / 월 | 0.000 | 1.000

Variable | 대여 건수 | 대여소 그룹 | 대여 일자 / 월
대여 건수 | 1.000 | 0.094 | 0.114
대여소 그룹 | 0.094 | 1.000 | 0.000
대여 일자 / 월 | 0.114 | 0.000 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
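
One common way to quantify nullity correlation is the Pearson correlation between per-column missingness indicators. The sketch below applies that idea to a few rows from the Sample section, where 대여소 명 and 대여 건수 are always missing together; it is an illustration under that assumption, not necessarily the exact computation behind the heatmap, and the helper `pearson` is defined here for the example.

```python
import math

# Rows taken from the Sample section: (대여소 명, 대여 건수); None marks a missing value.
rows = [
    ("1178. 개화산역 2번 출구", 1574),
    (None, None),
    ("2361. 압구정역 교차로", 992),
    ("3310.영락고등학교", 369),
    (None, None),
]


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# 1 where the value is missing, 0 where it is present.
name_missing = [int(name is None) for name, _ in rows]
count_missing = [int(count is None) for _, count in rows]

nullity_corr = pearson(name_missing, count_missing)
print(nullity_corr)
```

A value close to 1 means the two columns tend to be missing in the same rows.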

Sample

(index) | 대여소 그룹 | 대여소 명 | 대여 일자 / 월 | 대여 건수
9680 | 강서구 | 1178. 개화산역 2번 출구 | 202106 | 1574
12547 | <NA> | <NA> | <NA> | <NA>
9361 | 강남구 | 2361. 압구정역 교차로 | 202106 | 992
2720 | 관악구 | 3310.영락고등학교 | 202103 | 369
13742 | <NA> | <NA> | <NA> | <NA>
5983 | 성동구 | 559. 왕십리역 4번 출구 건너편 | 202104 | 2284
3055 | 도봉구 | 1709. 쌍문역4번출구 주변 | 202103 | 824
13235 | <NA> | <NA> | <NA> | <NA>
10728 | 서초구 | 4306. 서래공원 앞 | 202106 | 810
13065 | <NA> | <NA> | <NA> | <NA>

(index) | 대여소 그룹 | 대여소 명 | 대여 일자 / 월 | 대여 건수
7681 | 노원구 | 1639. 희성오피앙 | 202105 | 1617
3426 | 서대문구 | 3124.DMC센트럴아이파크아파트 | 202103 | 1220
13290 | <NA> | <NA> | <NA> | <NA>
3000 | 노원구 | 1666. 노원소방서인근 | 202103 | 1254
14672 | <NA> | <NA> | <NA> | <NA>
12278 | <NA> | <NA> | <NA> | <NA>
13131 | <NA> | <NA> | <NA> | <NA>
13830 | <NA> | <NA> | <NA> | <NA>
9888 | 광진구 | 3579.광진 캠퍼스시티 | 202106 | 3090
9188 | 중구 | 472.삼일교(시그니쳐 타워) | 202105 | 663

Duplicate rows

Most frequently occurring

(index) | 대여소 그룹 | 대여소 명 | 대여 일자 / 월 | 대여 건수 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | 2010
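
The table lists a single repeated row, the all-missing one. Counting repeated rows can be sketched with the standard library alone; the rows below are a handful taken from the Sample section, and representing each row as a tuple is an assumption made for the example.

```python
from collections import Counter

NA = "<NA>"  # placeholder shown in the report for a missing value

# A few rows from the Sample section, as (group, name, month, count) tuples.
rows = [
    ("강서구", "1178. 개화산역 2번 출구", "202106", 1574),
    (NA, NA, NA, NA),
    ("강남구", "2361. 압구정역 교차로", "202106", 992),
    (NA, NA, NA, NA),
    (NA, NA, NA, NA),
]

row_counts = Counter(rows)

# Keep only rows that occur more than once, with their number of occurrences.
duplicates = {row: n for row, n in row_counts.items() if n > 1}
print(duplicates)
```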