Overview

Dataset statistics

Number of variables: 9
Number of observations: 7508
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 564.7 KiB
Average record size in memory: 77.0 B
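
The average record size appears to be the total in-memory size divided by the number of observations. The sketch below checks that arithmetic with the figures reported above; the function name is hypothetical, and it assumes 1 KiB = 1024 bytes.

```python
# Figures copied from the dataset statistics above.
TOTAL_SIZE_KIB = 564.7
N_OBSERVATIONS = 7508


def average_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Average bytes per row, assuming 1 KiB = 1024 bytes."""
    return total_kib * 1024 / n_rows


size = average_record_size_bytes(TOTAL_SIZE_KIB, N_OBSERVATIONS)
print(f"{size:.1f} B")  # prints "77.0 B"
```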

Variable types

Categorical: 2
Text: 4
Numeric: 3

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-12912/S/1/datasetView.do

Alerts

사용일자 (usage date) has constant value "20230201" [Constant]
등록일자 (registration date) has constant value "" [Constant]
승차총승객수 (total boarding passengers) is highly correlated overall with 하차총승객수 (total alighting passengers) [High correlation]
하차총승객수 is highly correlated overall with 승차총승객수 [High correlation]
승차총승객수 has 498 (6.6%) zeros [Zeros]
하차총승객수 has 349 (4.6%) zeros [Zeros]
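
The zero percentages in the alerts can be reproduced from the zero counts and the 7508 observations. This is a minimal sketch of that calculation; `share_percent` is a hypothetical helper, not part of ydata-profiling.

```python
N_OBSERVATIONS = 7508  # from the dataset statistics


def share_percent(count: int, total: int, digits: int = 1) -> float:
    """Share of `count` in `total`, as a percentage rounded to `digits`."""
    return round(100 * count / total, digits)


print(share_percent(498, N_OBSERVATIONS))  # 6.6 (zeros in 승차총승객수)
print(share_percent(349, N_OBSERVATIONS))  # 4.6 (zeros in 하차총승객수)
```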

Reproduction

Analysis started: 2024-04-14 02:06:32.340035
Analysis finished: 2024-04-14 02:06:35.403720
Duration: 3.06 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
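
The reported duration is the difference between the two timestamps. A short check using only the standard library, with the timestamp format string inferred from how the values are printed above:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed format, matching the values shown above

started = datetime.strptime("2024-04-14 02:06:32.340035", FMT)
finished = datetime.strptime("2024-04-14 02:06:35.403720", FMT)

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # prints "3.06 seconds"
```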

Variables

사용일자
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 58.8 KiB

Value     Count
20230201  7508

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20230201
2nd row: 20230201
3rd row: 20230201
4th row: 20230201
5th row: 20230201

Common Values

Value     Count  Frequency (%)
20230201  7508   100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Distinct: 92
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 58.8 KiB

Length

Max length: 6
Median length: 4
Mean length: 3.6172083
Min length: 3

Characters and Unicode

Total characters: 27158
Distinct characters: 23
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
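
The length and character figures for this variable are linked: with no missing values, the mean length should equal the total character count divided by the number of observations. A quick check with the reported numbers:

```python
# Figures copied from this report.
TOTAL_CHARACTERS = 27158
N_OBSERVATIONS = 7508

mean_length = TOTAL_CHARACTERS / N_OBSERVATIONS
print(f"{mean_length:.7f}")  # prints "3.6172083", the reported mean length
```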

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 601
2nd row: 9703
3rd row: 9703
4th row: 9703
5th row: 9703

Value              Count  Frequency (%)
n26                244    3.2%
n15                229    3.1%
n37                201    2.7%
9408               144    1.9%
542                138    1.8%
9403               136    1.8%
9701               127    1.7%
661                125    1.7%
441                124    1.7%
541                123    1.6%
Other values (82)  5917   78.8%
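
The table lists the ten most frequent values and folds the remaining 82 into a single "Other values" row. The sketch below shows one way to build such a summary with `collections.Counter`; `top_k_with_other` is a hypothetical helper and not necessarily how ydata-profiling implements it. The final lines check that the reported counts add up to the 7508 observations.

```python
from collections import Counter


def top_k_with_other(counts: Counter, k: int) -> list[tuple[str, int]]:
    """Return the k most common items plus one aggregated tail row."""
    top = counts.most_common(k)
    n_rest = len(counts) - len(top)
    if n_rest > 0:
        rest_total = sum(counts.values()) - sum(c for _, c in top)
        top.append((f"Other values ({n_rest})", rest_total))
    return top


# Toy data, purely illustrative.
demo = Counter({"a": 5, "b": 3, "c": 1, "d": 1})
print(top_k_with_other(demo, 2))
# [('a', 5), ('b', 3), ('Other values (2)', 2)]

# Consistency check on the counts reported in the table above.
reported_top10 = [244, 229, 201, 144, 138, 136, 127, 125, 124, 123]
assert sum(reported_top10) + 5917 == 7508
```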

Most occurring characters

Value              Count  Frequency (%)
7                  4100   15.1%
1                  3573   13.2%
5                  3449   12.7%
0                  3028   11.1%
6                  2598   9.6%
2                  2484   9.1%
3                  2399   8.8%
4                  2372   8.7%
9                  1007   3.7%
N                  674    2.5%
Other values (13)  1474   5.4%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    25660  94.5%
Uppercase Letter  893    3.3%
Other Letter      590    2.2%
Dash Punctuation  15     0.1%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
7      4100   16.0%
1      3573   13.9%
5      3449   13.4%
0      3028   11.8%
6      2598   10.1%
2      2484   9.7%
3      2399   9.3%
4      2372   9.2%
9      1007   3.9%
8      650    2.5%

Other Letter
Value      Count  Frequency (%)
[missing]  148    25.1%
[missing]  148    25.1%
[missing]  106    18.0%
[missing]  98     16.6%
[missing]  29     4.9%
[missing]  20     3.4%
[missing]  20     3.4%
[missing]  15     2.5%
[missing]  6      1.0%

Uppercase Letter
Value  Count  Frequency (%)
N      674    75.5%
B      148    16.6%
A      71     8.0%

Dash Punctuation
Value  Count  Frequency (%)
-      15     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  25675  94.5%
Latin   893    3.3%
Hangul  590    2.2%

Most frequent character per script

Common
Value  Count  Frequency (%)
7      4100   16.0%
1      3573   13.9%
5      3449   13.4%
0      3028   11.8%
6      2598   10.1%
2      2484   9.7%
3      2399   9.3%
4      2372   9.2%
9      1007   3.9%
8      650    2.5%

Hangul
Value      Count  Frequency (%)
[missing]  148    25.1%
[missing]  148    25.1%
[missing]  106    18.0%
[missing]  98     16.6%
[missing]  29     4.9%
[missing]  20     3.4%
[missing]  20     3.4%
[missing]  15     2.5%
[missing]  6      1.0%

Latin
Value  Count  Frequency (%)
N      674    75.5%
B      148    16.6%
A      71     8.0%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   26568  97.8%
Hangul  590    2.2%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
7                 4100   15.4%
1                 3573   13.4%
5                 3449   13.0%
0                 3028   11.4%
6                 2598   9.8%
2                 2484   9.3%
3                 2399   9.0%
4                 2372   8.9%
9                 1007   3.8%
N                 674    2.5%
Other values (4)  884    3.3%

Hangul
Value      Count  Frequency (%)
[missing]  148    25.1%
[missing]  148    25.1%
[missing]  106    18.0%
[missing]  98     16.6%
[missing]  29     4.9%
[missing]  20     3.4%
[missing]  20     3.4%
[missing]  15     2.5%
[missing]  6      1.0%

Distinct: 96
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 58.8 KiB

Length

Max length: 28
Median length: 21
Mean length: 17.370538
Min length: 12

Characters and Unicode

Total characters: 130418
Distinct characters: 184
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
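
To illustrate how character properties are assigned, the sketch below classifies the characters of one sample value from this report using Python's standard `unicodedata` module. It covers only the general category; the standard library does not expose script or block properties, so those parts of the report are not reproduced here. The mapping from two-letter codes to the long names used in this report is written out by hand.

```python
import unicodedata
from collections import Counter

# Two-letter general-category codes mapped to the names used in this report.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Sm": "Math Symbol",
    "Po": "Other Punctuation",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
}


def category_counts(text: str) -> Counter:
    """Count characters in `text` by Unicode general category name."""
    codes = (unicodedata.category(ch) for ch in text)
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)


sample = "601번(개화동~종로4가)"  # first sample row of this variable
for name, count in category_counts(sample).most_common():
    print(f"{name}: {count}")
```

Hangul syllables fall under Other Letter and the tilde under Math Symbol, which is consistent with the category names that appear for this variable.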

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 601번(개화동~종로4가)
2nd row: 9703번(신성교통차고지~서울역)
3rd row: 9703번(신성교통차고지~서울역)
4th row: 9703번(신성교통차고지~서울역)
5th row: 9703번(신성교통차고지~서울역)

Value                           Count  Frequency (%)
9408번(성남                       144    1.8%
분당~영등포                         144    1.8%
n15번(우이동성원아파트~남태령역            142    1.8%
542번(군포버스공영차고지~신사역            138    1.7%
9403번(성남분당~을지로5가               136    1.7%
9701번(가좌동~서울역                  127    1.6%
661번(부천상동~영등포역,신세계백화점         125    1.6%
441번(월암공영차고지~신사사거리            124    1.6%
n26번(중랑공영차고지~강서공영차고지          123    1.6%
541번(군포공영차고지~강남역              123    1.6%
Other values (90)               6602   83.3%

Most occurring characters

Value               Count  Frequency (%)
)                   7592   5.8%
(                   7592   5.8%
~                   7508   5.8%
[missing]           7114   5.5%
[missing]           4430   3.4%
[missing]           4308   3.3%
[missing]           4124   3.2%
7                   4106   3.1%
[missing]           3914   3.0%
1                   3864   3.0%
Other values (174)  75866  58.2%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       79747  61.1%
Decimal Number     26103  20.0%
Close Punctuation  7592   5.8%
Open Punctuation   7592   5.8%
Math Symbol        7508   5.8%
Uppercase Letter   893    0.7%
Other Punctuation  548    0.4%
Space Separator    420    0.3%
Dash Punctuation   15     < 0.1%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
[missing]           7114   8.9%
[missing]           4430   5.6%
[missing]           4308   5.4%
[missing]           4124   5.2%
[missing]           3914   4.9%
[missing]           3228   4.0%
[missing]           3140   3.9%
[missing]           2767   3.5%
[missing]           1571   2.0%
[missing]           1520   1.9%
Other values (154)  43631  54.7%

Decimal Number
Value  Count  Frequency (%)
7      4106   15.7%
1      3864   14.8%
5      3585   13.7%
0      3028   11.6%
6      2598   10.0%
2      2484   9.5%
3      2399   9.2%
4      2382   9.1%
9      1007   3.9%
8      650    2.5%

Uppercase Letter
Value  Count  Frequency (%)
N      674    75.5%
B      148    16.6%
A      71     8.0%

Other Punctuation
Value  Count  Frequency (%)
,      411    75.0%
.      137    25.0%

Close Punctuation
Value  Count  Frequency (%)
)      7592   100.0%

Open Punctuation
Value  Count  Frequency (%)
(      7592   100.0%

Math Symbol
Value  Count  Frequency (%)
~      7508   100.0%

Space Separator
Value    Count  Frequency (%)
(space)  420    100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      15     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  79747  61.1%
Common  49778  38.2%
Latin   893    0.7%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
[missing]           7114   8.9%
[missing]           4430   5.6%
[missing]           4308   5.4%
[missing]           4124   5.2%
[missing]           3914   4.9%
[missing]           3228   4.0%
[missing]           3140   3.9%
[missing]           2767   3.5%
[missing]           1571   2.0%
[missing]           1520   1.9%
Other values (154)  43631  54.7%

Common
Value             Count  Frequency (%)
)                 7592   15.3%
(                 7592   15.3%
~                 7508   15.1%
7                 4106   8.2%
1                 3864   7.8%
5                 3585   7.2%
0                 3028   6.1%
6                 2598   5.2%
2                 2484   5.0%
3                 2399   4.8%
Other values (7)  5022   10.1%

Latin
Value  Count  Frequency (%)
N      674    75.5%
B      148    16.6%
A      71     8.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  79747  61.1%
ASCII   50671  38.9%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
)                  7592   15.0%
(                  7592   15.0%
~                  7508   14.8%
7                  4106   8.1%
1                  3864   7.6%
5                  3585   7.1%
0                  3028   6.0%
6                  2598   5.1%
2                  2484   4.9%
3                  2399   4.7%
Other values (10)  5915   11.7%

Hangul
Value               Count  Frequency (%)
[missing]           7114   8.9%
[missing]           4430   5.6%
[missing]           4308   5.4%
[missing]           4124   5.2%
[missing]           3914   4.9%
[missing]           3228   4.0%
[missing]           3140   3.9%
[missing]           2767   3.5%
[missing]           1571   2.0%
[missing]           1520   1.9%
Other values (154)  43631  54.7%

표준버스정류장ID
Real number (ℝ)

Distinct: 3908
Distinct (%): 52.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.4838691 × 10^8
Minimum: 1 × 10^8
Maximum: 9.998 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 66.1 KiB

Quantile statistics

Minimum: 1 × 10^8
5-th percentile: 1.010001 × 10^8
Q1: 1.1100093 × 10^8
median: 1.19 × 10^8
Q3: 2.0900001 × 10^8
95-th percentile: 2.2200063 × 10^8
Maximum: 9.998 × 10^8
Range: 8.998 × 10^8
Interquartile range (IQR): 97999074
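
The range and interquartile range follow from the other quantile statistics: range = maximum − minimum and IQR = Q3 − Q1. The sketch below recomputes both from the displayed values. Because Q1 and Q3 are shown rounded to eight significant digits, the recomputed IQR differs from the reported 97999074 by a few units.

```python
# Quantile statistics as displayed above (already rounded for display).
MINIMUM = 1.0e8
MAXIMUM = 9.998e8
Q1 = 1.1100093e8
Q3 = 2.0900001e8

value_range = MAXIMUM - MINIMUM
iqr = Q3 - Q1

print(f"Range = {value_range:.0f}")  # 899800000, i.e. 8.998 × 10^8
print(f"IQR   = {iqr:.0f}")          # 97999080, close to the reported 97999074
```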

Descriptive statistics

Standard deviation: 63734799
Coefficient of variation (CV): 0.42951765
Kurtosis: 72.930351
Mean: 1.4838691 × 10^8
Median Absolute Deviation (MAD): 9999784.5
Skewness: 5.9228161
Sum: 1.114089 × 10^12
Variance: 4.0621246 × 10^15
Monotonicity: Not monotonic
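
Several of these descriptive statistics can be derived from one another: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. The sketch below recomputes them from the displayed (rounded) inputs, so agreement is expected only to roughly six or seven significant digits.

```python
# Figures copied from the statistics above (as displayed, so already rounded).
N_OBSERVATIONS = 7508
MEAN = 1.4838691e8
STD = 63734799.0

cv = STD / MEAN                # coefficient of variation
variance = STD ** 2            # variance is the squared standard deviation
total = MEAN * N_OBSERVATIONS  # sum = mean × number of observations

print(f"CV       = {cv:.6f}")
print(f"Variance = {variance:.5e}")
print(f"Sum      = {total:.5e}")
```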
Histogram with fixed size bins (bins=50)
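
As a rough illustration of fixed-size binning, the sketch below splits the span between the minimum and maximum into equal-width bins and counts the values in each. It is a plain-Python approximation under stated assumptions, not the report's actual plotting code, and `fixed_width_histogram` is a hypothetical name. With the range reported for this variable and 50 bins, each bin would be 8.998 × 10^8 / 50 = 17,996,000 wide.

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equal-width bins spanning min(values)..max(values)."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    width = (hi - lo) / bins
    if width == 0:
        # All values are identical: place everything in the first bin.
        counts[0] = len(values)
        return counts, width
    for v in values:
        # The maximum lies on the last edge, so clamp it into the final bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts, width


# Toy data, purely illustrative.
counts, width = fixed_width_histogram([0, 1, 2, 3, 4, 10], bins=5)
print(counts, width)  # [2, 2, 1, 0, 1] 2.0

# Bin width implied by the reported range and bins=50.
print(8.998e8 / 50)  # 17996000.0
```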

Common values

Value                Count  Frequency (%)
121000005            12     0.2%
100000384            12     0.2%
121000007            12     0.2%
100000380            11     0.1%
121000006            11     0.1%
121000008            11     0.1%
121000009            11     0.1%
121000014            10     0.1%
121000010            10     0.1%
121000013            10     0.1%
Other values (3898)  7398   98.5%

Minimum 10 values

Value      Count  Frequency (%)
100000001  3      < 0.1%
100000002  2      < 0.1%
100000003  2      < 0.1%
100000004  3      < 0.1%
100000005  3      < 0.1%
100000006  1      < 0.1%
100000007  1      < 0.1%
100000008  1      < 0.1%
100000015  1      < 0.1%
100000016  1      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
999800005  2      < 0.1%
999800004  1      < 0.1%
999800003  1      < 0.1%
999033574  4      0.1%
998502944  1      < 0.1%
998502062  1      < 0.1%
998501980  1      < 0.1%
998501932  1      < 0.1%
998501931  1      < 0.1%
998001700  1      < 0.1%

Distinct: 3869
Distinct (%): 51.5%
Missing: 0
Missing (%): 0.0%
Memory size: 58.8 KiB

Length

Max length: 5
Median length: 5
Mean length: 4.9904102
Min length: 1

Characters and Unicode

Total characters: 37468
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1