Overview

Dataset statistics

Number of variables: 2
Number of observations: 631
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 10.6 KiB
Average record size in memory: 17.2 B
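The average record size can be cross-checked from the two figures above it: total in-memory size divided by the number of observations. A minimal sketch, assuming 1 KiB = 1024 bytes and that the report rounds to one decimal place:

```python
# Figures copied from the Dataset statistics table above.
total_size_kib = 10.6
n_observations = 631

# Assumption: 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B")
```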

Variable types

Text: 1
Numeric: 1

Dataset

Description: File download
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15262/F/1/datasetView.do

Alerts

노선명 has unique values (Unique)
ROUTE_ID has unique values (Unique)

Reproduction

Analysis started: 2023-12-11 07:07:48.383135
Analysis finished: 2023-12-11 07:07:48.766362
Duration: 0.38 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

노선명
Text

UNIQUE 

Distinct: 631
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

Length

Max length: 8
Median length: 4
Mean length: 3.9413629
Min length: 2
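These length statistics are plain summaries of per-value string lengths. The sketch below uses a hypothetical helper, `length_summary`, applied to only the first five 노선명 values shown in the report's sample, so its output will not match the full-column figures. It also cross-checks the reported mean against the total character count (2487) listed under Characters and Unicode.

```python
from statistics import mean, median

def length_summary(values):
    """Return max, median, mean, and min string length for a list of strings."""
    lengths = [len(v) for v in values]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }

# Illustrative subset only: the first five 노선명 values from the report's sample.
subset = ["0017", "01A", "01B", "02", "03"]
summary = length_summary(subset)

# Cross-check for the full column: total characters / number of observations.
reported_mean = 2487 / 631
print(summary, round(reported_mean, 7))
```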

Characters and Unicode

Total characters: 2487
Distinct characters: 65
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
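As a rough illustration of how such character properties can be read, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories over three 노선명 values taken from this report. The helper name `category_counts` and the code-to-name mapping are illustrative, and this is not necessarily how ydata-profiling computes its figures.

```python
import unicodedata
from collections import Counter

# Two-letter general category codes and the names used in this report.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Pd": "Dash Punctuation",
}

def category_counts(values):
    """Count Unicode general categories over every character in the given strings."""
    return Counter(unicodedata.category(ch) for v in values for ch in v)

# Illustrative subset of 노선명 values that appear in the report.
sample = ["0017", "01A", "관악01"]
counts = category_counts(sample)

for code, count in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
```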

Unique

Unique: 631
Unique (%): 100.0%

Sample

1st row: 0017
2nd row: 01A
3rd row: 01B
4th row: 02
5th row: 03
Value               Count  Frequency (%)
0017                1      0.2%
강서07              1      0.2%
관악02              1      0.2%
관악03              1      0.2%
관악04              1      0.2%
관악05              1      0.2%
관악06              1      0.2%
관악07              1      0.2%
관악08              1      0.2%
관악01              1      0.2%
Other values (621)  621    98.4%
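The Frequency (%) column is each count divided by the 631 observations, rounded to one decimal place. A minimal sketch (the helper `freq_pct` is hypothetical):

```python
N_OBSERVATIONS = 631  # from the Dataset statistics table

def freq_pct(count, total=N_OBSERVATIONS):
    """Return the share of `count` in `total` as a percentage with one decimal."""
    return round(100 * count / total, 1)

single_value = freq_pct(1)    # each unique 노선명 occurs once
other_values = freq_pct(621)  # the "Other values (621)" row
print(single_value, other_values)
```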

Most occurring characters

Value                Count  Frequency (%)
1                    417    16.8%
0                    324    13.0%
2                    235    9.4%
6                    177    7.1%
3                    172    6.9%
5                    158    6.4%
7                    154    6.2%
4                    148    6.0%
8                    53     2.1%
(character missing)  46     1.8%
Other values (55)    603    24.2%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    1877   75.5%
Other Letter      549    22.1%
Uppercase Letter  47     1.9%
Dash Punctuation  14     0.6%

Most frequent character per category

Other Letter
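Value                Count  Frequency (%)
(character missing)  46     8.4%
(character missing)  41     7.5%
(character missing)  32     5.8%
(character missing)  32     5.8%
(character missing)  31     5.6%
(character missing)  30     5.5%
(character missing)  25     4.6%
(character missing)  25     4.6%
(character missing)  21     3.8%
(character missing)  21     3.8%
Other values (37)    245    44.6%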
Decimal Number
Value  Count  Frequency (%)
1      417    22.2%
0      324    17.3%
2      235    12.5%
6      177    9.4%
3      172    9.2%
5      158    8.4%
7      154    8.2%
4      148    7.9%
8      53     2.8%
9      39     2.1%
Uppercase Letter
Value  Count  Frequency (%)
N      9      19.1%
B      7      14.9%
A      7      14.9%
U      6      12.8%
O      6      12.8%
T      6      12.8%
R      6      12.8%
Dash Punctuation
Value  Count  Frequency (%)
-      14     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  1891   76.0%
Hangul  549    22.1%
Latin   47     1.9%

Most frequent character per script

Hangul
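Value                Count  Frequency (%)
(character missing)  46     8.4%
(character missing)  41     7.5%
(character missing)  32     5.8%
(character missing)  32     5.8%
(character missing)  31     5.6%
(character missing)  30     5.5%
(character missing)  25     4.6%
(character missing)  25     4.6%
(character missing)  21     3.8%
(character missing)  21     3.8%
Other values (37)    245    44.6%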
Common
Value  Count  Frequency (%)
1      417    22.1%
0      324    17.1%
2      235    12.4%
6      177    9.4%
3      172    9.1%
5      158    8.4%
7      154    8.1%
4      148    7.8%
8      53     2.8%
9      39     2.1%
Latin
Value  Count  Frequency (%)
N      9      19.1%
B      7      14.9%
A      7      14.9%
U      6      12.8%
O      6      12.8%
T      6      12.8%
R      6      12.8%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   1938   77.9%
Hangul  549    22.1%
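Block membership depends only on a character's code point. The sketch below separates ASCII from Hangul by range. It assumes that the report's "Hangul" block corresponds to the Unicode Hangul Syllables block (U+AC00 to U+D7AF). The helper `block_of` is hypothetical and is not the profiler's own lookup.

```python
from collections import Counter

def block_of(ch):
    """Classify a character as ASCII, Hangul Syllables, or Other by code point."""
    cp = ord(ch)
    if cp < 0x80:
        return "ASCII"
    # Assumption: the report's "Hangul" block is Hangul Syllables (U+AC00..U+D7AF).
    if 0xAC00 <= cp <= 0xD7AF:
        return "Hangul Syllables"
    return "Other"

# Illustrative 노선명 values that appear in the report.
sample = ["관악01", "01A"]
block_counts = Counter(block_of(ch) for value in sample for ch in value)
print(block_counts)
```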

Most frequent character per block

ASCII
Value             Count  Frequency (%)
1                 417    21.5%
0                 324    16.7%
2                 235    12.1%
6                 177    9.1%
3                 172    8.9%
5                 158    8.2%
7                 154    7.9%
4                 148    7.6%
8                 53     2.7%
9                 39     2.0%
Other values (8)  61     3.1%
Hangul
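Value                Count  Frequency (%)
(character missing)  46     8.4%
(character missing)  41     7.5%
(character missing)  32     5.8%
(character missing)  32     5.8%
(character missing)  31     5.6%
(character missing)  30     5.5%
(character missing)  25     4.6%
(character missing)  25     4.6%
(character missing)  21     3.8%
(character missing)  21     3.8%
Other values (37)    245    44.6%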

ROUTE_ID
Real number (ℝ)

UNIQUE 

Distinct: 631
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.0610204 × 10^8
Minimum: 1.0000002 × 10^8
Maximum: 1.249 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.7 KiB

Quantile statistics

Minimum: 1.0000002 × 10^8
5-th percentile: 1.0010004 × 10^8
Q1: 1.0010022 × 10^8
median: 1.0010058 × 10^8
Q3: 1.129 × 10^8
95-th percentile: 1.219 × 10^8
Maximum: 1.249 × 10^8
Range: 24899986
Interquartile range (IQR): 12799785
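Range and IQR are simple differences: maximum minus minimum, and Q3 minus Q1. The sketch below reproduces the range exactly from the smallest and largest ROUTE_ID values listed later in this report (100000017 and 124900003). Q1 and Q3 are only shown in rounded form, so the IQR can be matched only approximately.

```python
# Exact extremes, taken from the smallest and largest ROUTE_ID values in the report.
minimum = 100_000_017
maximum = 124_900_003
value_range = maximum - minimum

# Q1 and Q3 as displayed (rounded), so the result is approximate.
q1_rounded = 1.0010022e8
q3_rounded = 1.129e8
iqr_approx = q3_rounded - q1_rounded

print(value_range, iqr_approx)
```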

Descriptive statistics

Standard deviation: 7891859.5
Coefficient of variation (CV): 0.0743799
Kurtosis: -0.71808233
Mean: 1.0610204 × 10^8
Median Absolute Deviation (MAD): 546
Skewness: 0.89966654
Sum: 6.695039 × 10^10
Variance: 6.2281447 × 10^13
Monotonicity: Not monotonic
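Several of these figures are tied together by standard identities: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. A quick consistency check on the displayed (rounded) values, so agreement is expected only to a few significant digits:

```python
import math

# Displayed values from the report (already rounded).
std = 7891859.5
mean = 1.0610204e8
n_observations = 631

cv = std / mean               # coefficient of variation
variance = std ** 2           # variance is the squared standard deviation
total = mean * n_observations # sum of all ROUTE_ID values

print(f"CV={cv:.7f} variance={variance:.7e} sum={total:.6e}")
```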
Histogram with fixed size bins (bins=50)
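Fixed-size binning splits the interval from minimum to maximum into equal-width bins and counts the values that fall in each. The sketch below uses a hypothetical helper, `fixed_bin_counts`, and places the maximum value in the last bin, which is a common convention. How the report's own plot handles bin edges is an assumption here.

```python
def fixed_bin_counts(values, bins=50):
    """Count values in `bins` equal-width bins spanning min(values)..max(values)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # Clamp so the maximum value falls into the last bin; guard zero width.
        idx = min(int((v - lo) / width), bins - 1) if width else 0
        counts[idx] += 1
    return counts, width

# Illustrative subset: three of the smallest and three of the largest ROUTE_ID values.
route_ids = [100000017, 100000018, 100100001, 124900001, 124900002, 124900003]
counts, width = fixed_bin_counts(route_ids, bins=50)
print(width, counts[0], counts[-1])
```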
Common values
Value               Count  Frequency (%)
100100124           1      0.2%
120900002           1      0.2%
120900008           1      0.2%
120900003           1      0.2%
120900009           1      0.2%
120900010           1      0.2%
120900004           1      0.2%
120900006           1      0.2%
120900007           1      0.2%
114900004           1      0.2%
Other values (621)  621    98.4%
Minimum 10 values
Value      Count  Frequency (%)
100000017  1      0.2%
100000018  1      0.2%
100100001  1      0.2%
100100002  1      0.2%
100100006  1      0.2%
100100007  1      0.2%
100100008  1      0.2%
100100009  1      0.2%
100100010  1      0.2%
100100011  1      0.2%
Maximum 10 values
Value      Count  Frequency (%)
124900003  1      0.2%
124900002  1      0.2%
124900001  1      0.2%
124000038  1      0.2%
124000036  1      0.2%
124000010  1      0.2%
124000008  1      0.2%
123000011  1      0.2%
123000010  1      0.2%
122900010  1      0.2%

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
     노선명   ROUTE_ID
0    0017    100100124
1    01A     104000007
2    01B     104000008
3    02      100100001
4    03      100100002
5    04      106000002
6    100     100100549
7    101     100100006
8    1014    100100129
9    1017    100100130

Last rows
     노선명   ROUTE_ID
621  종로03  100900010
622  종로05  100900011
623  종로07  100900004
624  종로08  100900005
625  종로09  100900003
626  종로11  100900007
627  종로12  100900009
628  종로13  100900002
629  중랑01  106900001
630  중랑02  106900002