Overview

Dataset statistics

Number of variables: 2
Number of observations: 653
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 11.0 KiB
Average record size in memory: 17.2 B
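
The derived figures in this table follow from the raw ones. The sketch below is only an illustration of that arithmetic, assuming 1 KiB = 1024 bytes and using the rounded total size shown above, so the result is approximate. The function names are hypothetical and are not part of ydata-profiling.

```python
# Figures copied from the "Dataset statistics" table.
total_size_kib = 11.0   # rounded value as displayed in the report
n_observations = 653
n_variables = 2
missing_cells = 0


def average_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Average bytes per row, taking 1 KiB = 1024 bytes."""
    return total_kib * 1024 / n_rows


def missing_cell_pct(missing: int, n_rows: int, n_cols: int) -> float:
    """Share of missing cells over all cells (rows x columns), in percent."""
    return 100.0 * missing / (n_rows * n_cols)


print(f"{average_record_size_bytes(total_size_kib, n_observations):.1f} B")  # 17.2 B
print(f"{missing_cell_pct(missing_cells, n_observations, n_variables):.1f}%")  # 0.0%
```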

Variable types

Text: 1
Numeric: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15262/F/1/datasetView.do

Alerts

노선명 (route name) has unique values (Unique)
ROUTE_ID has unique values (Unique)
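
A "unique values" alert means that no value in the column repeats. A minimal sketch of that check is shown below, using a handful of (노선명, ROUTE_ID) pairs taken from the sample rows of this report. The helper `is_unique` is a hypothetical name, not the profiler's own implementation.

```python
def is_unique(values) -> bool:
    """Return True when no value occurs more than once."""
    values = list(values)
    return len(set(values)) == len(values)


# A few (노선명, ROUTE_ID) pairs copied from the report's sample rows.
sample = [
    ("0017", 100100124),
    ("01", 100100001),
    ("0411", 104000012),
    ("종로03", 100900010),
    ("중랑02", 106900002),
]
names = [name for name, _ in sample]
route_ids = [route_id for _, route_id in sample]

print(is_unique(names), is_unique(route_ids))
```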

Reproduction

Analysis started: 2023-12-11 07:07:52.718922
Analysis finished: 2023-12-11 07:07:53.116927
Duration: 0.4 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
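
A report of this kind can be regenerated roughly as sketched below. This is an assumption-laden outline, not the exact procedure used here: the CSV path, the output path, and the report title are hypothetical, the real `config.json` is not reproduced, and the source file may need an explicit `encoding` argument. It requires the `pandas` and `ydata-profiling` packages.

```python
def build_report(csv_path: str, html_path: str) -> None:
    """Profile a two-column CSV (노선명, ROUTE_ID) and write an HTML report."""
    # Imported lazily so this module still loads where the packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Read 노선명 as text so values such as "0017" keep their leading zeros.
    df = pd.read_csv(csv_path, dtype={"노선명": str})

    profile = ProfileReport(
        df,
        title="Profiling Report",  # assumed title
        dataset={
            "description": "파일 다운로드",
            "author": "서울특별시",
            "url": "https://data.seoul.go.kr/dataList/OA-15262/F/1/datasetView.do",
        },
    )
    profile.to_file(html_path)
```

A call would look like `build_report("routes.csv", "report.html")`, where both file names are placeholders.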

Variables

노선명
Text

UNIQUE 

Distinct: 653
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 KiB

Length

Max length: 8
Median length: 4
Mean length: 3.9433384
Min length: 2
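
The length statistics count characters (Unicode code points) per value. The sketch below computes the same four figures for the five sample values listed in this report, and then checks that the reported mean length is consistent with the report's total of 2575 characters over 653 rows. `length_stats` is a hypothetical helper name.

```python
from statistics import mean, median


def length_stats(values):
    """Max, median, mean, and min string length (in code points)."""
    lengths = [len(v) for v in values]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }


# First five 노선명 values from the report's sample.
first_rows = ["0017", "01", "0411", "100", "101"]
stats = length_stats(first_rows)
print(stats)

# Mean length for the whole column = total characters / number of rows.
reported_mean = 2575 / 653
print(f"{reported_mean:.7f}")  # 3.9433384
```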

Characters and Unicode

Total characters: 2575
Distinct characters: 65
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
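
As a rough illustration of how such character properties can be read, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for three 노선명 values that appear in this report. The mapping from two-letter codes to long names covers only the four categories this report lists, and `category_counts` is a hypothetical helper. Script and block properties are not available in `unicodedata`, so they are left out.

```python
import unicodedata
from collections import Counter

# Long names for the four general categories that occur in this report.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Pd": "Dash Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


counts = category_counts(["강서05-1", "0017", "관악01"])
print(counts)
```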

Unique

Unique: 653
Unique (%): 100.0%

Sample

1st row: 0017
2nd row: 01
3rd row: 0411
4th row: 100
5th row: 101
Value               Count  Frequency (%)
0017                1      0.2%
강남05              1      0.2%
관악01              1      0.2%
강서02              1      0.2%
강서03              1      0.2%
강서04              1      0.2%
강서05              1      0.2%
강서05-1            1      0.2%
강서06              1      0.2%
강서07              1      0.2%
Other values (643)  643    98.5%
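
The percentages in this table are each count divided by the 653 observations, rounded to one decimal place. A minimal sketch of that calculation follows, assuming ordinary rounding. `frequency_pct` is a hypothetical helper name.

```python
def frequency_pct(count: int, total: int) -> float:
    """Share of `count` in `total`, in percent, rounded to one decimal."""
    return round(100 * count / total, 1)


n_observations = 653
print(frequency_pct(1, n_observations))    # 0.2
print(frequency_pct(643, n_observations))  # 98.5
```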

Most occurring characters

Value                              Count  Frequency (%)
1                                  433    16.8%
0                                  344    13.4%
2                                  238    9.2%
6                                  197    7.7%
3                                  181    7.0%
5                                  162    6.3%
7                                  160    6.2%
4                                  153    5.9%
8                                  53     2.1%
(Hangul character, not preserved)  46     1.8%
Other values (55)                  608    23.6%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    1962   76.2%
Other Letter      549    21.3%
Uppercase Letter  50     1.9%
Dash Punctuation  14     0.5%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
(not preserved)    46     8.4%
(not preserved)    41     7.5%
(not preserved)    32     5.8%
(not preserved)    32     5.8%
(not preserved)    31     5.6%
(not preserved)    30     5.5%
(not preserved)    25     4.6%
(not preserved)    25     4.6%
(not preserved)    21     3.8%
(not preserved)    21     3.8%
Other values (37)  245    44.6%

Decimal Number
Value  Count  Frequency (%)
1      433    22.1%
0      344    17.5%
2      238    12.1%
6      197    10.0%
3      181    9.2%
5      162    8.3%
7      160    8.2%
4      153    7.8%
8      53     2.7%
9      41     2.1%

Uppercase Letter
Value  Count  Frequency (%)
N      14     28.0%
B      6      12.0%
A      6      12.0%
R      6      12.0%
U      6      12.0%
O      6      12.0%
T      6      12.0%

Dash Punctuation
Value  Count  Frequency (%)
-      14     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  1976   76.7%
Hangul  549    21.3%
Latin   50     1.9%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
(not preserved)    46     8.4%
(not preserved)    41     7.5%
(not preserved)    32     5.8%
(not preserved)    32     5.8%
(not preserved)    31     5.6%
(not preserved)    30     5.5%
(not preserved)    25     4.6%
(not preserved)    25     4.6%
(not preserved)    21     3.8%
(not preserved)    21     3.8%
Other values (37)  245    44.6%

Common
Value  Count  Frequency (%)
1      433    21.9%
0      344    17.4%
2      238    12.0%
6      197    10.0%
3      181    9.2%
5      162    8.2%
7      160    8.1%
4      153    7.7%
8      53     2.7%
9      41     2.1%

Latin
Value  Count  Frequency (%)
N      14     28.0%
B      6      12.0%
A      6      12.0%
R      6      12.0%
U      6      12.0%
O      6      12.0%
T      6      12.0%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   2026   78.7%
Hangul  549    21.3%

Most frequent character per block

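ASCII
Value             Count  Frequency (%)
1                 433    21.4%
0                 344    17.0%
2                 238    11.7%
6                 197    9.7%
3                 181    8.9%
5                 162    8.0%
7                 160    7.9%
4                 153    7.6%
8                 53     2.6%
9                 41     2.0%
Other values (8)  64     3.2%

Hangul
Value              Count  Frequency (%)
(not preserved)    46     8.4%
(not preserved)    41     7.5%
(not preserved)    32     5.8%
(not preserved)    32     5.8%
(not preserved)    31     5.6%
(not preserved)    30     5.5%
(not preserved)    25     4.6%
(not preserved)    25     4.6%
(not preserved)    21     3.8%
(not preserved)    21     3.8%
Other values (37)  245    44.6%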

ROUTE_ID
Real number (ℝ)

UNIQUE 

Distinct: 653
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.0613721 × 10^8
Minimum: 1.0000002 × 10^8
Maximum: 1.249 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.9 KiB

Quantile statistics

Minimum: 1.0000002 × 10^8
5-th percentile: 1.0010004 × 10^8
Q1: 1.0010023 × 10^8
Median: 1.0010058 × 10^8
Q3: 1.129 × 10^8
95-th percentile: 1.2190001 × 10^8
Maximum: 1.249 × 10^8
Range: 24899986
Interquartile range (IQR): 12799778
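
The range is the maximum minus the minimum, and the IQR is Q3 minus Q1. The sketch below checks the range against the exact smallest and largest ROUTE_ID values listed in this report (100000017 and 124900003). The IQR helper is applied to the ten smallest listed values purely as an illustration, since the full column is not reproduced here. It uses linearly interpolated quartiles (`method="inclusive"`), which is an assumption about how the profiler computes them, and both function names are hypothetical.

```python
from statistics import quantiles


def value_range(values):
    """Maximum minus minimum."""
    return max(values) - min(values)


def iqr(values):
    """Q3 - Q1 with linearly interpolated quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


# Exact extremes taken from the report's smallest/largest ROUTE_ID lists.
print(value_range([100000017, 124900003]))  # 24899986

# Illustration only: the ten smallest ROUTE_ID values listed in the report.
smallest_ten = [
    100000017, 100000018, 100100001, 100100006, 100100007,
    100100008, 100100009, 100100010, 100100011, 100100012,
]
print(iqr(smallest_ten))
```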

Descriptive statistics

Standard deviation: 7939884.2
Coefficient of variation (CV): 0.074807735
Kurtosis: -0.71412248
Mean: 1.0613721 × 10^8
Median Absolute Deviation (MAD): 543
Skewness: 0.90147442
Sum: 6.9307598 × 10^10
Variance: 6.3041761 × 10^13
Monotonicity: Not monotonic
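
Several of these statistics are tied together by simple identities: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. The sketch below checks the reported figures against those identities. Because the inputs are the rounded values shown in the report, agreement is only expected to a few significant digits.

```python
import math

# Rounded values as displayed in the report.
mean = 1.0613721e8
std = 7939884.2
n_observations = 653

cv = std / mean                 # coefficient of variation
variance = std ** 2             # variance from the standard deviation
total = mean * n_observations   # sum from the mean

print(f"CV       = {cv:.6f}")
print(f"Variance = {variance:.4e}")
print(f"Sum      = {total:.4e}")
```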
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
100100124           1      0.2%
104900003           1      0.2%
115900003           1      0.2%
115900004           1      0.2%
115900001           1      0.2%
115900005           1      0.2%
115900008           1      0.2%
115900002           1      0.2%
115900007           1      0.2%
120900005           1      0.2%
Other values (643)  643    98.5%

Minimum 10 values
Value      Count  Frequency (%)
100000017  1      0.2%
100000018  1      0.2%
100100001  1      0.2%
100100006  1      0.2%
100100007  1      0.2%
100100008  1      0.2%
100100009  1      0.2%
100100010  1      0.2%
100100011  1      0.2%
100100012  1      0.2%

Maximum 10 values
Value      Count  Frequency (%)
124900003  1      0.2%
124900002  1      0.2%
124900001  1      0.2%
124000039  1      0.2%
124000038  1      0.2%
124000036  1      0.2%
124000013  1      0.2%
124000010  1      0.2%
124000008  1      0.2%
123000013  1      0.2%

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
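
Both plots summarise, per column, which cells are missing. The underlying count can be sketched in a few lines of plain Python. The example uses three rows from this report's sample plus one hypothetical row with a missing ROUTE_ID, added only to show a non-zero count (the actual dataset has no missing cells). `missing_per_column` is a hypothetical helper name.

```python
def missing_per_column(rows, columns):
    """Count cells that are None (or absent) in each column."""
    return {
        col: sum(1 for row in rows if row.get(col) is None)
        for col in columns
    }


rows = [
    {"노선명": "0017", "ROUTE_ID": 100100124},
    {"노선명": "01", "ROUTE_ID": 100100001},
    {"노선명": "0411", "ROUTE_ID": 104000012},
    {"노선명": "example", "ROUTE_ID": None},  # hypothetical row, not in the dataset
]

nullity = missing_per_column(rows, ["노선명", "ROUTE_ID"])
print(nullity)
```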

Sample

First rows
Index  노선명  ROUTE_ID
0      0017    100100124
1      01      100100001
2      0411    104000012
3      100     100100549
4      101     100100006
5      1014    100100129
6      1017    100100130
7      102     100100007
8      1020    100100131
9      103     100100008

Last rows
Index  노선명  ROUTE_ID
643    종로03  100900010
644    종로05  100900011
645    종로07  100900004
646    종로08  100900005
647    종로09  100900003
648    종로11  100900007
649    종로12  100900009
650    종로13  100900002
651    중랑01  106900001
652    중랑02  106900002