Overview

Dataset statistics

Number of variables: 2
Number of observations: 670
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 11.3 KiB
Average record size in memory: 17.2 B

Variable types

Text: 1
Numeric: 1

Dataset

Description: File download (파일 다운로드)
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15262/F/1/datasetView.do

Alerts

노선명 (route name) has unique values [Unique]
ROUTEID has unique values [Unique]
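
The "unique values" alert amounts to the distinct count matching the number of observations (670 of 670 for both columns). A minimal sketch of that check follows. The helper name `has_unique_values` is hypothetical and is not part of ydata-profiling, and the input uses only the first five route names shown in this report.

```python
def has_unique_values(values):
    """Return True when no value repeats, i.e. distinct count equals row count."""
    values = list(values)
    return len(set(values)) == len(values)


# First five route names from the report's sample rows.
route_names = ["0017", "01", "0411", "100", "101"]

all_unique = has_unique_values(route_names)
# Appending a repeated name breaks uniqueness.
with_repeat = has_unique_values(route_names + ["101"])
```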

Reproduction

Analysis started: 2023-12-11 07:08:01.842437
Analysis finished: 2023-12-11 07:08:02.245973
Duration: 0.4 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

노선명
Text

UNIQUE 

Distinct: 670
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 5.4 KiB

Length

Max length: 12
Median length: 4
Mean length: 3.9641791
Min length: 2

Characters and Unicode

Total characters: 2656
Distinct characters: 75
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
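
As an illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own code, and the helper name `category_counts` is hypothetical. The two-letter codes correspond to the category names used in this report: `Nd` is Decimal Number, `Lo` is Other Letter, `Lu` is Uppercase Letter, `Pd` is Dash Punctuation, `Ps` is Open Punctuation, and `Pe` is Close Punctuation.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in the strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Three route names that appear in this report.
names = ["0017", "강북06", "청와대A01(자율주행)"]
counts = category_counts(names)

for code, n in counts.most_common():
    print(code, n)
```

Hangul syllables fall under `Lo`, which is why the Other Letter count in this report (558) equals the Hangul script count.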

Unique

Unique: 670
Unique (%): 100.0%

Sample

1st row: 0017
2nd row: 01
3rd row: 0411
4th row: 100
5th row: 101
Value               Count  Frequency (%)
0017                1      0.1%
강북06              1      0.1%
강북09              1      0.1%
강북10              1      0.1%
강북11              1      0.1%
강북12              1      0.1%
강서01              1      0.1%
강서02              1      0.1%
강서03              1      0.1%
강서04              1      0.1%
Other values (660)  660    98.5%

Most occurring characters

Value              Count  Frequency (%)
1                  446    16.8%
0                  369    13.9%
2                  241    9.1%
6                  212    8.0%
3                  182    6.9%
7                  164    6.2%
5                  160    6.0%
4                  153    5.8%
8                  58     2.2%
(not recovered)    46     1.7%
Other values (65)  625    23.5%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     2027   76.3%
Other Letter       558    21.0%
Uppercase Letter   54     2.0%
Dash Punctuation   15     0.6%
Open Punctuation   1      < 0.1%
Close Punctuation  1      < 0.1%

Most frequent character per category

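Other Letter
Value              Count  Frequency (%)
(not recovered)    46     8.2%
(not recovered)    41     7.3%
(not recovered)    32     5.7%
(not recovered)    32     5.7%
(not recovered)    31     5.6%
(not recovered)    30     5.4%
(not recovered)    26     4.7%
(not recovered)    24     4.3%
(not recovered)    21     3.8%
(not recovered)    21     3.8%
Other values (45)  254    45.5%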
Decimal Number
Value  Count  Frequency (%)
1      446    22.0%
0      369    18.2%
2      241    11.9%
6      212    10.5%
3      182    9.0%
7      164    8.1%
5      160    7.9%
4      153    7.5%
8      58     2.9%
9      42     2.1%
Uppercase Letter
Value  Count  Frequency (%)
N      17     31.5%
A      7      13.0%
T      6      11.1%
B      6      11.1%
O      6      11.1%
U      6      11.1%
R      6      11.1%
Dash Punctuation
Value  Count  Frequency (%)
-      15     100.0%
Open Punctuation
Value  Count  Frequency (%)
(      1      100.0%
Close Punctuation
Value  Count  Frequency (%)
)      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  2044   77.0%
Hangul  558    21.0%
Latin   54     2.0%

Most frequent character per script

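Hangul
Value              Count  Frequency (%)
(not recovered)    46     8.2%
(not recovered)    41     7.3%
(not recovered)    32     5.7%
(not recovered)    32     5.7%
(not recovered)    31     5.6%
(not recovered)    30     5.4%
(not recovered)    26     4.7%
(not recovered)    24     4.3%
(not recovered)    21     3.8%
(not recovered)    21     3.8%
Other values (45)  254    45.5%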
Common
Value             Count  Frequency (%)
1                 446    21.8%
0                 369    18.1%
2                 241    11.8%
6                 212    10.4%
3                 182    8.9%
7                 164    8.0%
5                 160    7.8%
4                 153    7.5%
8                 58     2.8%
9                 42     2.1%
Other values (3)  17     0.8%
Latin
Value  Count  Frequency (%)
N      17     31.5%
A      7      13.0%
T      6      11.1%
B      6      11.1%
O      6      11.1%
U      6      11.1%
R      6      11.1%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   2098   79.0%
Hangul  558    21.0%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
1                  446    21.3%
0                  369    17.6%
2                  241    11.5%
6                  212    10.1%
3                  182    8.7%
7                  164    7.8%
5                  160    7.6%
4                  153    7.3%
8                  58     2.8%
9                  42     2.0%
Other values (10)  71     3.4%
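Hangul
Value              Count  Frequency (%)
(not recovered)    46     8.2%
(not recovered)    41     7.3%
(not recovered)    32     5.7%
(not recovered)    32     5.7%
(not recovered)    31     5.6%
(not recovered)    30     5.4%
(not recovered)    26     4.7%
(not recovered)    24     4.3%
(not recovered)    21     3.8%
(not recovered)    21     3.8%
Other values (45)  254    45.5%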

ROUTEID
Real number (ℝ)

UNIQUE 

Distinct: 670
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.0635194 × 10^8
Minimum: 1.0000002 × 10^8
Maximum: 1.249 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.0 KiB

Quantile statistics

Minimum: 1.0000002 × 10^8
5-th percentile: 1.0010004 × 10^8
Q1: 1.0010024 × 10^8
median: 1.0010058 × 10^8
Q3: 1.1290001 × 10^8
95-th percentile: 1.2190001 × 10^8
Maximum: 1.249 × 10^8
Range: 24899986
Interquartile range (IQR): 12799770
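
The range and IQR can be cross-checked by simple subtraction. The sketch below assumes the exact extremes are the smallest and largest ROUTEID values listed elsewhere in this report (100000017 and 124900003). The quartiles are only available as rounded display values, so the IQR check is approximate.

```python
# Exact extremes, taken from the smallest and largest ROUTEID listed in the report.
minimum = 100_000_017
maximum = 124_900_003
value_range = maximum - minimum

# Quartiles as displayed (rounded to 8 significant digits).
q1 = 1.0010024e8
q3 = 1.1290001e8
iqr = q3 - q1

print(value_range)
print(iqr)
```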

Descriptive statistics

Standard deviation: 8076998.2
Coefficient of variation (CV): 0.075945941
Kurtosis: -0.77288035
Mean: 1.0635194 × 10^8
Median Absolute Deviation (MAD): 558
Skewness: 0.86693425
Sum: 7.12558 × 10^10
Variance: 6.52379 × 10^13
Monotonicity: Not monotonic
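
Several of these figures follow from one another: the CV is the standard deviation divided by the mean, the variance is the standard deviation squared, and the sum is the mean times the 670 observations. The sketch below recomputes them from the rounded values printed in this report, so the results agree only to the displayed precision.

```python
# Values as displayed in the report (already rounded).
std_dev = 8076998.2
mean = 1.0635194e8
n_observations = 670

cv = std_dev / mean             # coefficient of variation
variance = std_dev ** 2         # variance is the squared standard deviation
total = mean * n_observations   # sum of all ROUTEID values

print(cv, variance, total)
```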
Histogram with fixed size bins (bins=50)
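
A fixed-size-bin histogram splits the span between the minimum and maximum into equal-width intervals and counts the values in each. The sketch below is a plain-Python illustration of that idea, not the profiler's implementation. The function name `fixed_width_histogram` is hypothetical, and the input is a small subset of the ROUTEID values listed in this report rather than all 670.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if hi == lo:
        # Degenerate case: every value is identical, so use a single bin.
        counts[0] = len(values)
        return counts, 0.0
    width = (hi - lo) / bins
    for v in values:
        # The maximum sits on the right edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts, width


# A handful of ROUTEID values from the report (five smallest, five of the largest).
ids = [
    100000017, 100000018, 100000020, 100100001, 100100006,
    124000010, 124000013, 124900001, 124900002, 124900003,
]
counts, width = fixed_width_histogram(ids, bins=50)
```

With this subset the values pile up in the first and last bins, which mirrors the gap between the 1001xxxxx identifiers and the higher prefixes seen in the quantile statistics.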
Common values
Value               Count  Frequency (%)
100100124           1      0.1%
108900004           1      0.1%
108900001           1      0.1%
108900012           1      0.1%
115900006           1      0.1%
115900003           1      0.1%
115900004           1      0.1%
115900001           1      0.1%
115900005           1      0.1%
115900008           1      0.1%
Other values (660)  660    98.5%
Minimum 10 values
Value      Count  Frequency (%)
100000017  1      0.1%
100000018  1      0.1%
100000020  1      0.1%
100100001  1      0.1%
100100006  1      0.1%
100100007  1      0.1%
100100008  1      0.1%
100100009  1      0.1%
100100010  1      0.1%
100100011  1      0.1%
Maximum 10 values
Value      Count  Frequency (%)
124900003  1      0.1%
124900002  1      0.1%
124900001  1      0.1%
124000039  1      0.1%
124000038  1      0.1%
124000036  1      0.1%
124000016  1      0.1%
124000015  1      0.1%
124000013  1      0.1%
124000010  1      0.1%

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
Index  노선명  ROUTEID
0      0017    100100124
1      01      100100001
2      0411    104000012
3      100     100100549
4      101     100100006
5      1014    100100129
6      1017    100100130
7      102     100100007
8      1020    100100131
9      103     100100008
Last rows
Index  노선명               ROUTEID
660    종로03               100900010
661    종로05               100900011
662    종로08               100900005
663    종로09               100900003
664    종로11               100900007
665    종로12               100900009
666    종로13               100900002
667    중랑01               106900001
668    중랑02               106900002
669    청와대A01(자율주행)  100000020