Overview

Dataset statistics

Number of variables: 2
Number of observations: 683
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 11.5 KiB
Average record size in memory: 17.2 B

Variable types

Text: 1
Numeric: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15262/S/1/datasetView.do

Alerts

노선명 has unique values (Unique)
ROUTEID has unique values (Unique)

Reproduction

Analysis started: 2024-05-11 06:16:10.112595
Analysis finished: 2024-05-11 06:16:10.553904
Duration: 0.44 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

노선명 (route name)
Text

UNIQUE 

Distinct: 683
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 KiB

Length

Max length: 8
Median length: 4
Mean length: 3.9487555
Min length: 3

Characters and Unicode

Total characters: 2697
Distinct characters: 70
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
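
As a rough illustration of how such character properties can be tallied, the sketch below counts Unicode general categories with Python's standard unicodedata module over a handful of route names that appear in this report. The helper name category_counts is made up for the example, and this is not necessarily how ydata-profiling computes its own figures.

```python
import unicodedata
from collections import Counter

# A few 노선명 values that appear in this report (not the full column).
route_names = ["0017", "01A", "01B", "강서03", "종로03"]


def category_counts(values):
    """Count Unicode general categories (e.g. 'Nd', 'Lo') over all characters."""
    counts = Counter()
    for value in values:
        # Normalise to NFC so each Hangul syllable is a single code point.
        for ch in unicodedata.normalize("NFC", value):
            counts[unicodedata.category(ch)] += 1
    return counts


counts = category_counts(route_names)
# 'Nd' = Decimal Number, 'Lo' = Other Letter, 'Lu' = Uppercase Letter
for code, n in counts.most_common():
    print(code, n)
```

The two-letter codes map onto the category names used in the tables of this section: Nd is Decimal Number, Lo is Other Letter, Lu is Uppercase Letter, and Pd is Dash Punctuation.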

Unique

Unique: 683
Unique (%): 100.0%
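
The report lists both a Distinct and a Unique count. Assuming the usual reading (distinct = number of different values, unique = number of values that occur exactly once), the two can be told apart with a small sketch like the one below. The function name distinct_and_unique and the second input list are hypothetical.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values occurring exactly once)."""
    counts = Counter(values)
    n_distinct = len(counts)
    n_unique = sum(1 for c in counts.values() if c == 1)
    return n_distinct, n_unique


# Every value differs, as in this column: both numbers equal the row count.
all_different = distinct_and_unique(["0017", "01A", "01B", "0411", "100"])
# A hypothetical column with one repeated value, for contrast.
with_repeat = distinct_and_unique(["0017", "01A", "01A"])
print(all_different, with_repeat)
```

When both counts equal the number of observations, as here (683), the column can serve as a key.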

Sample

1st row: 0017
2nd row: 01A
3rd row: 01B
4th row: 0411
5th row: 100
Value               Count  Frequency (%)
0017                1      0.1%
강서03              1      0.1%
강북08              1      0.1%
관악06              1      0.1%
강북09              1      0.1%
강북10              1      0.1%
강북11              1      0.1%
강북12              1      0.1%
강서01              1      0.1%
강서02              1      0.1%
Other values (673)  673    98.5%

Most occurring characters

Value              Count  Frequency (%)
1                  455    16.9%
0                  378    14.0%
2                  250    9.3%
6                  217    8.0%
3                  186    6.9%
7                  169    6.3%
5                  161    6.0%
4                  152    5.6%
8                  64     2.4%
[glyph lost]       52     1.9%
Other values (60)  613    22.7%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    2070   76.8%
Other Letter      561    20.8%
Uppercase Letter  51     1.9%
Dash Punctuation  15     0.6%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
[glyph lost]       52     9.3%
[glyph lost]       40     7.1%
[glyph lost]       32     5.7%
[glyph lost]       30     5.3%
[glyph lost]       30     5.3%
[glyph lost]       29     5.2%
[glyph lost]       25     4.5%
[glyph lost]       25     4.5%
[glyph lost]       21     3.7%
[glyph lost]       21     3.7%
Other values (42)  256    45.6%

Decimal Number
Value  Count  Frequency (%)
1      455    22.0%
0      378    18.3%
2      250    12.1%
6      217    10.5%
3      186    9.0%
7      169    8.2%
5      161    7.8%
4      152    7.3%
8      64     3.1%
9      38     1.8%

Uppercase Letter
Value  Count  Frequency (%)
N      19     37.3%
A      9      17.6%
B      7      13.7%
U      4      7.8%
R      4      7.8%
T      4      7.8%
O      4      7.8%

Dash Punctuation
Value  Count  Frequency (%)
-      15     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  2085   77.3%
Hangul  561    20.8%
Latin   51     1.9%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
[glyph lost]       52     9.3%
[glyph lost]       40     7.1%
[glyph lost]       32     5.7%
[glyph lost]       30     5.3%
[glyph lost]       30     5.3%
[glyph lost]       29     5.2%
[glyph lost]       25     4.5%
[glyph lost]       25     4.5%
[glyph lost]       21     3.7%
[glyph lost]       21     3.7%
Other values (42)  256    45.6%

Common
Value  Count  Frequency (%)
1      455    21.8%
0      378    18.1%
2      250    12.0%
6      217    10.4%
3      186    8.9%
7      169    8.1%
5      161    7.7%
4      152    7.3%
8      64     3.1%
9      38     1.8%

Latin
Value  Count  Frequency (%)
N      19     37.3%
A      9      17.6%
B      7      13.7%
U      4      7.8%
R      4      7.8%
T      4      7.8%
O      4      7.8%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   2136   79.2%
Hangul  561    20.8%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
1                 455    21.3%
0                 378    17.7%
2                 250    11.7%
6                 217    10.2%
3                 186    8.7%
7                 169    7.9%
5                 161    7.5%
4                 152    7.1%
8                 64     3.0%
9                 38     1.8%
Other values (8)  66     3.1%

Hangul
Value              Count  Frequency (%)
[glyph lost]       52     9.3%
[glyph lost]       40     7.1%
[glyph lost]       32     5.7%
[glyph lost]       30     5.3%
[glyph lost]       30     5.3%
[glyph lost]       29     5.2%
[glyph lost]       25     4.5%
[glyph lost]       25     4.5%
[glyph lost]       21     3.7%
[glyph lost]       21     3.7%
Other values (42)  256    45.6%

ROUTEID
Real number (ℝ)

UNIQUE 

Distinct: 683
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.0661449 × 10^8
Minimum: 1.0000002 × 10^8
Maximum: 1.249 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 1.0000002 × 10^8
5-th percentile: 1.0010004 × 10^8
Q1: 1.0010025 × 10^8
Median: 1.001006 × 10^8
Q3: 1.13 × 10^8
95-th percentile: 1.22 × 10^8
Maximum: 1.249 × 10^8
Range: 24899986
Interquartile range (IQR): 12899753
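
The Range and IQR figures follow from simple differences. The sketch below reproduces the range exactly from the smallest and largest ROUTEID values listed elsewhere in this report, and the IQR only approximately, because the quartiles are shown here in rounded scientific notation. The variable names are chosen for the example.

```python
# Smallest and largest ROUTEID values as listed in this report.
minimum = 100_000_017
maximum = 124_900_003

# Range = maximum - minimum.
value_range = maximum - minimum
print(value_range)  # 24899986

# Quartiles as displayed (rounded): Q1 = 1.0010025e8, Q3 = 1.13e8.
q1_displayed = 100_100_250
q3_displayed = 113_000_000

# IQR = Q3 - Q1; this lands within a few units of the reported 12899753
# because the displayed quartiles are rounded.
iqr_estimate = q3_displayed - q1_displayed
print(iqr_estimate)  # 12899750
```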

Descriptive statistics

Standard deviation: 8157486.3
Coefficient of variation (CV): 0.076513859
Kurtosis: -0.86425872
Mean: 1.0661449 × 10^8
Median Absolute Deviation (MAD): 580
Skewness: 0.80895933
Sum: 7.2817699 × 10^10
Variance: 6.6544583 × 10^13
Monotonicity: Not monotonic
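
Several of these statistics are tied together by standard identities: mean = sum / n, variance = (standard deviation)^2, and CV = standard deviation / mean. The following sketch checks that the reported figures are mutually consistent up to display rounding. The tolerance of 1e-6 is an assumption chosen to absorb the eight-significant-digit display.

```python
import math

# Figures copied from this report (rounded as displayed).
n = 683
mean = 1.0661449e8
std = 8157486.3
total = 7.2817699e10
variance = 6.6544583e13
cv = 0.076513859

# Compare each identity with a relative tolerance that allows for rounding.
checks = {
    "mean = sum / n": math.isclose(total / n, mean, rel_tol=1e-6),
    "variance = std ** 2": math.isclose(std ** 2, variance, rel_tol=1e-6),
    "cv = std / mean": math.isclose(std / mean, cv, rel_tol=1e-6),
}
for name, ok in checks.items():
    print(name, ok)
```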
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
100100124           1      0.1%
115900008           1      0.1%
108900009           1      0.1%
108900001           1      0.1%
108900012           1      0.1%
115900006           1      0.1%
115900003           1      0.1%
115900004           1      0.1%
115900001           1      0.1%
115900005           1      0.1%
Other values (673)  673    98.5%

Minimum 10 values
Value      Count  Frequency (%)
100000017  1      0.1%
100000018  1      0.1%
100100001  1      0.1%
100100006  1      0.1%
100100007  1      0.1%
100100008  1      0.1%
100100009  1      0.1%
100100010  1      0.1%
100100011  1      0.1%
100100012  1      0.1%

Maximum 10 values
Value      Count  Frequency (%)
124900003  1      0.1%
124900002  1      0.1%
124900001  1      0.1%
124000040  1      0.1%
124000039  1      0.1%
124000038  1      0.1%
124000036  1      0.1%
124000016  1      0.1%
124000015  1      0.1%
124000014  1      0.1%

Interactions

[Figure: interactions plot]

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
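
The per-column nullity behind these plots is just a count of missing cells in each column. A minimal sketch, using three rows from this report's sample and treating None as the missing marker (an assumption; the helper name missing_per_column is also made up here):

```python
# A few rows from this report's sample; None would mark a missing cell.
rows = [
    {"노선명": "0017", "ROUTEID": 100100124},
    {"노선명": "01A", "ROUTEID": 100100001},
    {"노선명": "01B", "ROUTEID": 106000004},
]


def missing_per_column(records):
    """Count None values per column across a list of dict records."""
    columns = sorted({key for record in records for key in record})
    return {col: sum(1 for r in records if r.get(col) is None) for col in columns}


nullity = missing_per_column(rows)
print(nullity)
```

For this dataset every count is zero, which matches the Missing cells figure of 0 in the overview.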

Sample

     노선명   ROUTEID
0    0017     100100124
1    01A      100100001
2    01B      106000004
3    0411     104000012
4    100      100100549
5    101      100100006
6    1014     100100129
7    1017     100100130
8    102      100100007
9    1020     100100131

     노선명   ROUTEID
673  종로03   100900010
674  종로05   100900011
675  종로07   100900004
676  종로08   100900005
677  종로09   100900003
678  종로11   100900007
679  종로12   100900009
680  종로13   100900002
681  중랑01   106900001
682  중랑02   106900002