Overview

Dataset statistics

Number of variables: 9
Number of observations: 6735
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 506.6 KiB
Average record size in memory: 77.0 B
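
The average record size is consistent with the total memory footprint divided by the number of observations. A minimal sketch of that arithmetic, using only the two figures reported above (the variable names are illustrative, not taken from the profiling tool):

```python
# Figures copied from the dataset statistics above.
total_size_kib = 506.6   # total size in memory, in KiB
n_observations = 6735    # number of rows

# 1 KiB = 1024 bytes; divide the total footprint by the row count.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")  # prints "77.0 B"
```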

Variable types

Categorical: 2
Text: 4
Numeric: 3

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-12912/S/1/datasetView.do

Alerts

사용일자 has constant value "" (Constant)
등록일자 has constant value "" (Constant)
승차총승객수 is highly overall correlated with 하차총승객수 (High correlation)
하차총승객수 is highly overall correlated with 승차총승객수 (High correlation)
승차총승객수 has 412 (6.1%) zeros (Zeros)
하차총승객수 has 240 (3.6%) zeros (Zeros)
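
The percentages in the two Zeros alerts are the zero counts expressed as a share of all 6735 observations. A small sketch that reproduces them from the reported counts; the helper name `share_pct` is hypothetical, and this says nothing about the threshold the profiling tool uses to raise the alert:

```python
# Counts copied from the alerts above; the total comes from the dataset statistics.
n_observations = 6735
zero_counts = {"승차총승객수": 412, "하차총승객수": 240}

def share_pct(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)

zero_share = {name: share_pct(c, n_observations) for name, c in zero_counts.items()}
for name, pct in zero_share.items():
    print(f"{name}: {pct}% zeros")
```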

Reproduction

Analysis started: 2024-05-18 06:35:12.978083
Analysis finished: 2024-05-18 06:35:19.070891
Duration: 6.09 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
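
The reported duration is the difference between the two timestamps. A short check using only the Python standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the reproduction details above.
started = datetime.fromisoformat("2024-05-18 06:35:12.978083")
finished = datetime.fromisoformat("2024-05-18 06:35:19.070891")

# Subtracting two datetimes gives a timedelta; convert it to seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")  # prints "6.09 seconds"
```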

Variables

사용일자
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.7 KiB

Value     Count
20231201  6735

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20231201
2nd row: 20231201
3rd row: 20231201
4th row: 20231201
5th row: 20231201

Common Values

Value     Count  Frequency (%)
20231201  6735   100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Plot of common values]
Distinct: 86
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 52.7 KiB

Length

Max length: 6
Median length: 4
Mean length: 3.562435
Min length: 3

Characters and Unicode

Total characters: 23993
Distinct characters: 20
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7
Unique (%): 0.1%

Sample

1st row: 100
2nd row: 542
3rd row: 542
4th row: 542
5th row: 542

Value              Count  Frequency (%)
n26                252    3.7%
n37                209    3.1%
542                137    2.0%
9701               127    1.9%
441                126    1.9%
661                125    1.9%
302                124    1.8%
541                122    1.8%
9403               120    1.8%
9408               119    1.8%
Other values (76)  5274   78.3%
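
The table lists the ten most frequent values and folds the remaining 76 distinct values into a single "Other values" row, so the counts still sum to 6735. A minimal sketch of that kind of summary, assuming a plain list of strings as input; the function name `common_values` and the toy data are hypothetical, and this is not necessarily how ydata-profiling builds its table:

```python
from collections import Counter

def common_values(values, top_k=10):
    """Return (value, count, percent) rows for the top_k values plus an 'Other values' row."""
    counts = Counter(values)
    total = sum(counts.values())
    top = counts.most_common(top_k)
    rows = [(value, count, round(100 * count / total, 1)) for value, count in top]
    remaining_distinct = len(counts) - len(top)
    if remaining_distinct > 0:
        # Everything outside the top_k is folded into one summary row.
        other = total - sum(count for _, count in top)
        rows.append((f"Other values ({remaining_distinct})", other, round(100 * other / total, 1)))
    return rows

# Toy data for illustration only; not taken from the dataset.
sample = ["542"] * 3 + ["441"] * 2 + ["302", "661"]
table = common_values(sample, top_k=2)
for row in table:
    print(row)
```

With the toy data, two distinct values fall outside the top two, so the last row is labelled "Other values (2)".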

Most occurring characters

Value              Count  Frequency (%)
7                  3574   14.9%
0                  3245   13.5%
5                  3127   13.0%
1                  2646   11.0%
2                  2405   10.0%
4                  2273   9.5%
3                  2261   9.4%
6                  2219   9.2%
9                  865    3.6%
N                  461    1.9%
Other values (10)  917    3.8%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     23043  96.0%
Uppercase Letter   682    2.8%
Other Letter       250    1.0%
Dash Punctuation   18     0.1%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
7      3574   15.5%
0      3245   14.1%
5      3127   13.6%
1      2646   11.5%
2      2405   10.4%
4      2273   9.9%
3      2261   9.8%
6      2219   9.6%
9      865    3.8%
8      428    1.9%

Other Letter

Value  Count  Frequency (%)
       77     30.8%
       77     30.8%
       47     18.8%
       47     18.8%
       1      0.4%
       1      0.4%

Uppercase Letter

Value  Count  Frequency (%)
N      461    67.6%
B      149    21.8%
A      72     10.6%

Dash Punctuation

Value  Count  Frequency (%)
-      18     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  23061  96.1%
Latin   682    2.8%
Hangul  250    1.0%

Most frequent character per script

Common

Value  Count  Frequency (%)
7      3574   15.5%
0      3245   14.1%
5      3127   13.6%
1      2646   11.5%
2      2405   10.4%
4      2273   9.9%
3      2261   9.8%
6      2219   9.6%
9      865    3.8%
8      428    1.9%

Hangul

Value  Count  Frequency (%)
       77     30.8%
       77     30.8%
       47     18.8%
       47     18.8%
       1      0.4%
       1      0.4%

Latin

Value  Count  Frequency (%)
N      461    67.6%
B      149    21.8%
A      72     10.6%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   23743  99.0%
Hangul  250    1.0%

Most frequent character per block

ASCII

Value             Count  Frequency (%)
7                 3574   15.1%
0                 3245   13.7%
5                 3127   13.2%
1                 2646   11.1%
2                 2405   10.1%
4                 2273   9.6%
3                 2261   9.5%
6                 2219   9.3%
9                 865    3.6%
N                 461    1.9%
Other values (4)  667    2.8%

Hangul

Value  Count  Frequency (%)
       77     30.8%
       77     30.8%
       47     18.8%
       47     18.8%
       1      0.4%
       1      0.4%

Distinct: 88
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 52.7 KiB

Length

Max length: 32
Median length: 21
Mean length: 17.003712
Min length: 12

Characters and Unicode

Total characters: 114520
Distinct characters: 175
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
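
As an illustration of how such character properties can be tallied, the sketch below counts characters by Unicode general category with Python's standard `unicodedata` module. The function name `category_counts` is hypothetical, the input string is the value shown as the first sample row of this variable, and this is not necessarily how the profiling tool computes its figures. The module returns two-letter codes: `Nd` is Decimal Number, `Lo` is Other Letter, `Ps` and `Pe` are Open and Close Punctuation, and `Sm` is Math Symbol. It does not expose scripts or blocks, so those breakdowns are not reproduced here.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count characters across all strings by Unicode general category code."""
    counts = Counter()
    for value in values:
        counts.update(unicodedata.category(ch) for ch in value)
    return counts

# One value from this variable, used purely as an example input.
sample = ["100번(하계동~용산구청)"]
counts = category_counts(sample)
for code, n in counts.most_common():
    print(code, n)
```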

Unique

Unique: 7
Unique (%): 0.1%

Sample

1st row: 100번(하계동~용산구청)
2nd row: 542번(군포버스공영차고지~신사역)
3rd row: 542번(군포버스공영차고지~신사역)
4th row: 542번(군포버스공영차고지~신사역)
5th row: 542번(군포버스공영차고지~신사역)

Value                                 Count  Frequency (%)
542번(군포버스공영차고지~신사역         137    2.0%
n26번(중랑공영차고지~강서공영차고지     127    1.8%
9701번(가좌동~서울역                   127    1.8%
441번(월암공영차고지~신사사거리         126    1.8%
661번(부천상동~영등포역,신세계백화점    125    1.8%
n26번(강서공영차고지~중랑공영차고지     125    1.8%
302번(성남~동대문                      124    1.8%
541번(군포공영차고지~강남역             122    1.7%
9403번(구미동차고지~중곡역              120    1.7%
공영차고지~여의도                       119    1.7%
Other values (81)                     5758   82.1%

Most occurring characters

Value               Count  Frequency (%)
(                   6900   6.0%
)                   6900   6.0%
~                   6735   5.9%
                    6428   5.6%
                    4179   3.6%
                    3851   3.4%
                    3791   3.3%
7                   3574   3.1%
                    3539   3.1%
0                   3245   2.8%
Other values (165)  65378  57.1%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       69233  60.5%
Decimal Number     23189  20.2%
Open Punctuation   6900   6.0%
Close Punctuation  6900   6.0%
Math Symbol        6735   5.9%
Uppercase Letter   718    0.6%
Other Punctuation  552    0.5%
Space Separator    275    0.2%
Dash Punctuation   18     < 0.1%

Most frequent character per category

Other Letter

Value               Count  Frequency (%)
                    6428   9.3%
                    4179   6.0%
                    3851   5.6%
                    3791   5.5%
                    3539   5.1%
                    3016   4.4%
                    2893   4.2%
                    2669   3.9%
                    1744   2.5%
                    1730   2.5%
Other values (144)  35393  51.1%

Decimal Number

Value  Count  Frequency (%)
7      3574   15.4%
0      3245   14.0%
5      3127   13.5%
1      2791   12.0%
2      2405   10.4%
4      2274   9.8%
3      2261   9.8%
6      2219   9.6%
9      865    3.7%
8      428    1.8%

Uppercase Letter

Value  Count  Frequency (%)
N      461    64.2%
B      149    20.8%
A      90     12.5%
K      18     2.5%

Other Punctuation

Value  Count  Frequency (%)
,      415    75.2%
.      137    24.8%

Open Punctuation

Value  Count  Frequency (%)
(      6900   100.0%

Close Punctuation

Value  Count  Frequency (%)
)      6900   100.0%

Math Symbol

Value  Count  Frequency (%)
~      6735   100.0%

Space Separator

Value  Count  Frequency (%)
       275    100.0%

Dash Punctuation

Value  Count  Frequency (%)
-      18     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  69233  60.5%
Common  44569  38.9%
Latin   718    0.6%

Most frequent character per script

Hangul

Value               Count  Frequency (%)
                    6428   9.3%
                    4179   6.0%
                    3851   5.6%
                    3791   5.5%
                    3539   5.1%
                    3016   4.4%
                    2893   4.2%
                    2669   3.9%
                    1744   2.5%
                    1730   2.5%
Other values (144)  35393  51.1%

Common

Value             Count  Frequency (%)
(                 6900   15.5%
)                 6900   15.5%
~                 6735   15.1%
7                 3574   8.0%
0                 3245   7.3%
5                 3127   7.0%
1                 2791   6.3%
2                 2405   5.4%
4                 2274   5.1%
3                 2261   5.1%
Other values (7)  4357   9.8%

Latin

Value  Count  Frequency (%)
N      461    64.2%
B      149    20.8%
A      90     12.5%
K      18     2.5%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  69233  60.5%
ASCII   45287  39.5%

Most frequent character per block

ASCII

Value              Count  Frequency (%)
(                  6900   15.2%
)                  6900   15.2%
~                  6735   14.9%
7                  3574   7.9%
0                  3245   7.2%
5                  3127   6.9%
1                  2791   6.2%
2                  2405   5.3%
4                  2274   5.0%
3                  2261   5.0%
Other values (11)  5075   11.2%

Hangul

Value               Count  Frequency (%)
                    6428   9.3%
                    4179   6.0%
                    3851   5.6%
                    3791   5.5%
                    3539   5.1%
                    3016   4.4%
                    2893   4.2%
                    2669   3.9%
                    1744   2.5%
                    1730   2.5%
Other values (144)  35393  51.1%

표준버스정류장ID
Real number (ℝ)

Distinct: 3604
Distinct (%): 53.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.5555136 × 10^8
Minimum: 1 × 10^8
Maximum: 9.998 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 59.3 KiB

Quantile statistics

Minimum: 1 × 10^8
5-th percentile: 1.0200001 × 10^8
Q1: 1.1200041 × 10^8
median: 1.2100001 × 10^8
Q3: 2.1300002 × 10^8
95-th percentile: 2.2500001 × 10^8
Maximum: 9.998 × 10^8
Range: 8.998 × 10^8
Interquartile range (IQR): 1.0099961 × 10^8
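
The range and interquartile range follow directly from the other quantiles: the range is maximum minus minimum, and the IQR is Q3 minus Q1. A short check using the rounded figures listed above (variable names are illustrative):

```python
# Quantiles copied from the list above (already rounded by the report).
minimum = 1e8
maximum = 9.998e8
q1 = 1.1200041e8
q3 = 2.1300002e8

value_range = maximum - minimum   # spread between the extremes
iqr = q3 - q1                     # spread of the middle 50% of values

print(f"Range: {value_range:.4e}")
print(f"IQR:   {iqr:.7e}")
```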

Descriptive statistics

Standard deviation: 65504791
Coefficient of variation (CV): 0.42111357
Kurtosis: 66.479233
Mean: 1.5555136 × 10^8
Median Absolute Deviation (MAD): 15000005
Skewness: 5.5038033
Sum: 1.0476384 × 10^12
Variance: 4.2908776 × 10^15
Monotonicity: Not monotonic
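
Several of these figures are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. A sketch that recomputes them from the rounded values above; small differences in the last digits are expected because the report's mean is itself rounded:

```python
# Values copied from the statistics above; n comes from the dataset overview.
n = 6735
mean = 1.5555136e8
std = 65504791

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the square of the standard deviation
total = mean * n       # sum of all values

print(f"CV:       {cv:.6f}")
print(f"Variance: {variance:.7e}")
print(f"Sum:      {total:.7e}")
```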
Histogram with fixed size bins (bins=50)

Value                dCount  Frequency (%)
117000003            11      0.2%
121000010            11      0.2%
117000002            10      0.1%
121000004            10      0.1%
121000005            10      0.1%
121000006            10      0.1%
121000007            10      0.1%
121000008            10      0.1%
121000009            10      0.1%
121000012            10      0.1%
Other values (3594)  6633    98.5%

Value      Count  Frequency (%)
100000001  2      < 0.1%
100000002  2      < 0.1%
100000003  2      < 0.1%
100000004  2      < 0.1%
100000005  1      < 0.1%
100000006  1      < 0.1%
100000007  1      < 0.1%
100000008  1      < 0.1%
100000015  1      < 0.1%
100000016  1      < 0.1%

Value      Count  Frequency (%)
999800005  2      < 0.1%
999800003  1      < 0.1%
999033574  4      0.1%
998502269  1      < 0.1%
998501980  2      < 0.1%
998501932  1      < 0.1%
998501931  2      < 0.1%
998001700  2      < 0.1%
990032570  1      < 0.1%
990014944  1      < 0.1%

Distinct: 3571
Distinct (%): 53.0%
Missing: 0
Missing (%): 0.0%
Memory size: 52.7 KiB