Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 654.3 KiB
Average record size in memory: 67.0 B

Variable types

Categorical: 2
Text: 2
Numeric: 3

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15247/F/1/datasetView.do

Alerts

High correlation: 노선명 is highly correlated overall with 노선ID
High correlation: 노선ID is highly correlated overall with 노선명
Zeros: 시간 has 462 (4.6%) zeros

Reproduction

Analysis started: 2023-12-11 09:54:01.177899
Analysis finished: 2023-12-11 09:54:03.002284
Duration: 1.82 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
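A report of this kind is generated with the `ProfileReport` class of ydata-profiling. The sketch below is a minimal, hedged version, not the configuration actually used here: the function name `build_report`, the report title, the `cp949` encoding, and the 10,000-row random sample are all assumptions that this report does not record.

```python
def build_report(csv_path, html_path, n_rows=10_000, encoding="cp949"):
    """Profile a sample of a CSV file and write an HTML report (sketch)."""
    # Imported lazily so this module still loads where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the source file is a CSV in a Korean legacy encoding.
    df = pd.read_csv(csv_path, encoding=encoding)

    # Assumption: the profiled data was a random sample of the full file.
    sample = df.sample(n=min(n_rows, len(df)), random_state=0)

    # Hypothetical title; the original report title is not preserved.
    profile = ProfileReport(sample, title="Seoul road speed profile")
    profile.to_file(html_path)
    return html_path
```

Calling `build_report("speeds.csv", "report.html")` with a hypothetical input file would write the HTML report next to it.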

Variables

노선명 (route name)
Categorical

HIGH CORRELATION

Distinct: 11
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
강변북로 | 1678
올림픽대로 | 1469
한강교량 | 1418
서부간선로 | 1218
내부순환로 | 1216
Other values (6) | 3001

Length

Max length: 13
Median length: 5
Mean length: 4.8553
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 동부간선로
2nd row: 올림픽대로
3rd row: 강남순환로
4th row: 동부간선로
5th row: 올림픽대로

Common Values

Value | Count | Frequency (%)
강변북로 | 1678 | 16.8%
올림픽대로 | 1469 | 14.7%
한강교량 | 1418 | 14.2%
서부간선로 | 1218 | 12.2%
내부순환로 | 1216 | 12.2%
동부간선로 | 961 | 9.6%
강남순환로 | 797 | 8.0%
분당수서로 | 398 | 4.0%
북부간선로 | 372 | 3.7%
경부고속도로 | 305 | 3.0%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
강변북로 | 1678 | 16.5%
올림픽대로 | 1469 | 14.4%
한강교량 | 1418 | 13.9%
서부간선로 | 1218 | 12.0%
내부순환로 | 1216 | 12.0%
동부간선로 | 961 | 9.5%
강남순환로 | 797 | 7.8%
경부고속도로 | 473 | 4.7%
분당수서로 | 398 | 3.9%
북부간선로 | 372 | 3.7%

노선ID (route ID)
Categorical

HIGH CORRELATION

Distinct: 22
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
LKL3000010 | 849
LKR3000010 | 829
LLL3000010 | 767
LHD3000010 | 717
LLR3000010 | 702
Other values (17) | 6136

Length

Max length: 11
Median length: 10
Mean length: 10.0797
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: LDO3000010
2nd row: LLL3000010
3rd row: LQRE3000010
4th row: LDI3000010
5th row: LLL3000010

Common Values

Value | Count | Frequency (%)
LKL3000010 | 849 | 8.5%
LKR3000010 | 829 | 8.3%
LLL3000010 | 767 | 7.7%
LHD3000010 | 717 | 7.2%
LLR3000010 | 702 | 7.0%
LHU3000010 | 701 | 7.0%
LRI3000010 | 663 | 6.6%
LSN3000010 | 628 | 6.3%
LSS3000010 | 590 | 5.9%
LRO3000010 | 553 | 5.5%
Other values (12) | 3001 | 30.0%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
lkl3000010 | 849 | 8.5%
lkr3000010 | 829 | 8.3%
lll3000010 | 767 | 7.7%
lhd3000010 | 717 | 7.2%
llr3000010 | 702 | 7.0%
lhu3000010 | 701 | 7.0%
lri3000010 | 663 | 6.6%
lsn3000010 | 628 | 6.3%
lss3000010 | 590 | 5.9%
lro3000010 | 553 | 5.5%
Other values (12) | 3001 | 30.0%

구간명 (section name)
Text

Distinct: 259
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 27
Median length: 20
Mean length: 11.6812
Min length: 5

Characters and Unicode

Total characters: 116812
Distinct characters: 138
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 월계1교→녹천교
2nd row: 강동대교남단→암사대교남단
3rd row: 관악터널 중간부→관악터널 출구부
4th row: 노원교→상계교
5th row: 반포대교남단→동작대교남단

Value | Count | Frequency (%)
봉천터널 | 221 | 1.8%
출구부 | 218 | 1.8%
중간부 | 217 | 1.8%
관악터널 | 207 | 1.7%
입구부 | 180 | 1.5%
서초터널 | 175 | 1.4%
정릉터널 | 123 | 1.0%
잠실철교 | 122 | 1.0%
남단 | 114 | 0.9%
성수→성동jc(from | 93 | 0.8%
Other values (267) | 10502 | 86.3%

Most occurring characters

ValueCountFrequency (%)
12741
 
10.9%
9151
 
7.8%
8804
 
7.5%
8559
 
7.3%
4938
 
4.2%
4379
 
3.7%
2721
 
2.3%
C 2367
 
2.0%
2172
 
1.9%
1991
 
1.7%
Other values (128) 58989
50.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 97244 | 83.2%
Math Symbol | 9151 | 7.8%
Uppercase Letter | 5272 | 4.5%
Space Separator | 2172 | 1.9%
Dash Punctuation | 849 | 0.7%
Close Punctuation | 717 | 0.6%
Open Punctuation | 717 | 0.6%
Decimal Number | 411 | 0.4%
Other Punctuation | 279 | 0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
12741
 
13.1%
8804
 
9.1%
8559
 
8.8%
4938
 
5.1%
4379
 
4.5%
2721
 
2.8%
1991
 
2.0%
1720
 
1.8%
1720
 
1.8%
1682
 
1.7%
Other values (110) 47989
49.3%
Uppercase Letter
ValueCountFrequency (%)
C 2367
44.9%
I 1432
27.2%
J 935
 
17.7%
F 93
 
1.8%
R 93
 
1.8%
O 93
 
1.8%
M 93
 
1.8%
D 83
 
1.6%
U 83
 
1.6%
Decimal Number
ValueCountFrequency (%)
1 366
89.1%
6 45
 
10.9%
Other Punctuation
ValueCountFrequency (%)
" 186
66.7%
, 93
33.3%
Math Symbol
ValueCountFrequency (%)
9151
100.0%
Space Separator
ValueCountFrequency (%)
2172
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 849
100.0%
Close Punctuation
ValueCountFrequency (%)
) 717
100.0%
Open Punctuation
ValueCountFrequency (%)
( 717
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 97244 | 83.2%
Common | 14296 | 12.2%
Latin | 5272 | 4.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
12741
 
13.1%
8804
 
9.1%
8559
 
8.8%
4938
 
5.1%
4379
 
4.5%
2721
 
2.8%
1991
 
2.0%
1720
 
1.8%
1720
 
1.8%
1682
 
1.7%
Other values (110) 47989
49.3%
Common
ValueCountFrequency (%)
9151
64.0%
2172
 
15.2%
- 849
 
5.9%
) 717
 
5.0%
( 717
 
5.0%
1 366
 
2.6%
" 186
 
1.3%
, 93
 
0.7%
6 45
 
0.3%
Latin
ValueCountFrequency (%)
C 2367
44.9%
I 1432
27.2%
J 935
 
17.7%
F 93
 
1.8%
R 93
 
1.8%
O 93
 
1.8%
M 93
 
1.8%
D 83
 
1.6%
U 83
 
1.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 97244 | 83.2%
ASCII | 10417 | 8.9%
Arrows | 9151 | 7.8%

Most frequent character per block

Hangul
ValueCountFrequency (%)
12741
 
13.1%
8804
 
9.1%
8559
 
8.8%
4938
 
5.1%
4379
 
4.5%
2721
 
2.8%
1991
 
2.0%
1720
 
1.8%
1720
 
1.8%
1682
 
1.7%
Other values (110) 47989
49.3%
Arrows
ValueCountFrequency (%)
9151
100.0%
ASCII
ValueCountFrequency (%)
C 2367
22.7%
2172
20.9%
I 1432
13.7%
J 935
 
9.0%
- 849
 
8.2%
) 717
 
6.9%
( 717
 
6.9%
1 366
 
3.5%
" 186
 
1.8%
, 93
 
0.9%
Other values (7) 583
 
5.6%
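
The category counts for this variable are consistent with every 구간명 value holding exactly one separator: 9151 arrows (Math Symbol) plus 849 dashes (Dash Punctuation) gives 10000, the number of observations. Under that assumption, a section name can be split into its start and end points. The sketch below is illustrative only: `split_segment` is a hypothetical helper, and the one-separator rule is inferred from the counts rather than stated by the report.

```python
def split_segment(name):
    """Split a section name such as '월계1교→녹천교' into (start, end).

    Assumption: each name contains one separator, either an arrow or a dash.
    """
    for separator in ("→", "-"):
        if separator in name:
            start, end = name.split(separator, 1)
            return start.strip(), end.strip()
    raise ValueError(f"no known separator in {name!r}")


# Values taken from the sample rows of this report.
for raw in ("월계1교→녹천교", "관악터널 중간부→관악터널 출구부", "철산교-광명교"):
    print(split_segment(raw))
```

The arrow is tried first so that a dash inside an arrow-separated name, if one exists, stays part of the start or end point.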

링크ID (link ID)
Text

Distinct: 265
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 11
Median length: 10
Mean length: 10.0797
Min length: 10

Characters and Unicode

Total characters: 100797
Distinct characters: 25
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: LDO2000100
2nd row: LLL2000200
3rd row: LQRE2000040
4th row: LDI2000025
5th row: LLL2000090

Value | Count | Frequency (%)
lss2000080 | 56 | 0.6%
lko2000040 | 54 | 0.5%
ldi2000060 | 52 | 0.5%
lhd2000170 | 52 | 0.5%
lbr2000030 | 51 | 0.5%
lss2000120 | 50 | 0.5%
lfu2000030 | 50 | 0.5%
lhd2000210 | 50 | 0.5%
lkl2000050 | 49 | 0.5%
lfu2000020 | 49 | 0.5%
Other values (255) | 9487 | 94.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 45088 | 44.7%
L | 12929 | 12.8%
2 | 11773 | 11.7%
1 | 6357 | 6.3%
R | 3318 | 3.3%
D | 2289 | 2.3%
S | 1808 | 1.8%
K | 1678 | 1.7%
O | 1615 | 1.6%
I | 1591 | 1.6%
Other values (15) | 12351 | 12.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 70000 | 69.4%
Uppercase Letter | 30797 | 30.6%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
L 12929
42.0%
R 3318
 
10.8%
D 2289
 
7.4%
S 1808
 
5.9%
K 1678
 
5.4%
O 1615
 
5.2%
I 1591
 
5.2%
H 1541
 
5.0%
U 961
 
3.1%
Q 797
 
2.6%
Other values (5) 2270
 
7.4%
Decimal Number
ValueCountFrequency (%)
0 45088
64.4%
2 11773
 
16.8%
1 6357
 
9.1%
5 1402
 
2.0%
3 1260
 
1.8%
4 957
 
1.4%
8 857
 
1.2%
6 805
 
1.1%
9 758
 
1.1%
7 743
 
1.1%

Most occurring scripts

ValueCountFrequency (%)
Common 70000
69.4%
Latin 30797
30.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
L 12929
42.0%
R 3318
 
10.8%
D 2289
 
7.4%
S 1808
 
5.9%
K 1678
 
5.4%
O 1615
 
5.2%
I 1591
 
5.2%
H 1541
 
5.0%
U 961
 
3.1%
Q 797
 
2.6%
Other values (5) 2270
 
7.4%
Common
ValueCountFrequency (%)
0 45088
64.4%
2 11773
 
16.8%
1 6357
 
9.1%
5 1402
 
2.0%
3 1260
 
1.8%
4 957
 
1.4%
8 857
 
1.2%
6 805
 
1.1%
9 758
 
1.1%
7 743
 
1.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 100797
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 45088
44.7%
L 12929
 
12.8%
2 11773
 
11.7%
1 6357
 
6.3%
R 3318
 
3.3%
D 2289
 
2.3%
S 1808
 
1.8%
K 1678
 
1.7%
O 1615
 
1.6%
I 1591
 
1.6%
Other values (15) 12351
 
12.3%
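
The sampled link IDs look like an uppercase prefix followed by seven digits, and the character totals agree with that reading: 70000 decimal digits over 10000 rows is 7 per ID, and 30797 uppercase letters fit 9203 three-letter prefixes plus 797 four-letter ones. A hedged sketch of splitting an ID on that assumed pattern follows; `ID_PATTERN` and `split_id` are illustrative names, and the pattern is inferred from the samples only.

```python
import re

# Assumed layout: one or more uppercase letters, then exactly seven digits.
ID_PATTERN = re.compile(r"^([A-Z]+)(\d{7})$")


def split_id(identifier):
    """Return (prefix, number) for an ID such as 'LDO2000100'."""
    match = ID_PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"unexpected ID format: {identifier!r}")
    return match.group(1), match.group(2)


# Consistency check against the reported character totals.
letters_total = 3 * 9203 + 4 * 797   # three- and four-letter prefixes
digits_total = 7 * 10000             # seven digits per row

print(split_id("LDO2000100"), split_id("LQRE2000040"))
print(letters_total, digits_total)
```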

년월일 (year-month-day)
Real number (ℝ)

Distinct: 14
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20170107
Minimum: 20170101
Maximum: 20170114
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20170101
5-th percentile: 20170101
Q1: 20170104
Median: 20170107
Q3: 20170111
95-th percentile: 20170114
Maximum: 20170114
Range: 13
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 3.9980627
Coefficient of variation (CV): 1.9821722 × 10^-7
Kurtosis: -1.2266152
Mean: 20170107
Median Absolute Deviation (MAD): 4
Skewness: 0.025556209
Sum: 2.0170107 × 10^11
Variance: 15.984505
Monotonicity: Not monotonic
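
The profiler treats 년월일 as a real number, but the minimum (20170101) and maximum (20170114) read like calendar dates written as YYYYMMDD. Assuming that encoding, the reported range of 13 corresponds to a 13-day span, which matches the 14 distinct values. A small sketch of the conversion, with `parse_yyyymmdd` as a hypothetical helper name:

```python
from datetime import date, datetime


def parse_yyyymmdd(value: int) -> date:
    """Convert an integer such as 20170101 to a date (assumes YYYYMMDD)."""
    return datetime.strptime(str(value), "%Y%m%d").date()


first = parse_yyyymmdd(20170101)   # reported minimum
last = parse_yyyymmdd(20170114)    # reported maximum
span_days = (last - first).days

print(first, last, span_days)      # 2017-01-01 2017-01-14 13
```

Once the column is parsed this way, statistics such as the mean or the coefficient of variation are better computed on dates or day offsets than on the raw integers.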
Histogram with fixed size bins (bins=14)

Value | Count | Frequency (%)
20170103 | 777 | 7.8%
20170113 | 742 | 7.4%
20170107 | 741 | 7.4%
20170112 | 741 | 7.4%
20170110 | 741 | 7.4%
20170111 | 736 | 7.4%
20170104 | 736 | 7.4%
20170106 | 734 | 7.3%
20170102 | 732 | 7.3%
20170101 | 728 | 7.3%
Other values (4) | 2592 | 25.9%

Value | Count | Frequency (%)
20170101 | 728 | 7.3%
20170102 | 732 | 7.3%
20170103 | 777 | 7.8%
20170104 | 736 | 7.4%
20170105 | 726 | 7.3%
20170106 | 734 | 7.3%
20170107 | 741 | 7.4%
20170108 | 670 | 6.7%
20170109 | 645 | 6.5%
20170110 | 741 | 7.4%

Value | Count | Frequency (%)
20170114 | 551 | 5.5%
20170113 | 742 | 7.4%
20170112 | 741 | 7.4%
20170111 | 736 | 7.4%
20170110 | 741 | 7.4%
20170109 | 645 | 6.5%
20170108 | 670 | 6.7%
20170107 | 741 | 7.4%
20170106 | 734 | 7.3%
20170105 | 726 | 7.3%

시간 (hour)
Real number (ℝ)

ZEROS

Distinct: 24
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.3552
Minimum: 0
Maximum: 23
Zeros: 462
Zeros (%): 4.6%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 5
Median: 11
Q3: 17
95-th percentile: 22
Maximum: 23
Range: 23
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 6.9085024
Coefficient of variation (CV): 0.60839989
Kurtosis: -1.1905276
Mean: 11.3552
Median Absolute Deviation (MAD): 6
Skewness: 0.0063051697
Sum: 113552
Variance: 47.727406
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=24)

Value | Count | Frequency (%)
0 | 462 | 4.6%
13 | 445 | 4.5%
9 | 443 | 4.4%
16 | 441 | 4.4%
2 | 437 | 4.4%
10 | 432 | 4.3%
1 | 430 | 4.3%
12 | 424 | 4.2%
17 | 424 | 4.2%
18 | 422 | 4.2%
Other values (14) | 5640 | 56.4%

Value | Count | Frequency (%)
0 | 462 | 4.6%
1 | 430 | 4.3%
2 | 437 | 4.4%
3 | 410 | 4.1%
4 | 412 | 4.1%
5 | 403 | 4.0%
6 | 421 | 4.2%
7 | 380 | 3.8%
8 | 412 | 4.1%
9 | 443 | 4.4%

Value | Count | Frequency (%)
23 | 391 | 3.9%
22 | 402 | 4.0%
21 | 387 | 3.9%
20 | 391 | 3.9%
19 | 406 | 4.1%
18 | 422 | 4.2%
17 | 424 | 4.2%
16 | 441 | 4.4%
15 | 421 | 4.2%
14 | 392 | 3.9%

속도 (speed)
Real number (ℝ)

Distinct: 5297
Distinct (%): 53.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 69.570937
Minimum: 7.34
Maximum: 118
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 7.34
5-th percentile: 25.37
Q1: 59.0275
Median: 74.61
Q3: 84.29
95-th percentile: 95.74
Maximum: 118
Range: 110.66
Interquartile range (IQR): 25.2625

Descriptive statistics

Standard deviation: 20.770804
Coefficient of variation (CV): 0.29855576
Kurtosis: 0.11041418
Mean: 69.570937
Median Absolute Deviation (MAD): 11.74
Skewness: -0.864999
Sum: 695709.37
Variance: 431.42629
Monotonicity: Not monotonic
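
Several of the 속도 figures above follow from others by standard definitions: range is maximum minus minimum, IQR is Q3 minus Q1, the coefficient of variation is standard deviation divided by mean, and variance is the standard deviation squared. The short check below recomputes them from the reported values; the dictionary and variable names are illustrative, and small differences come from rounding in the report.

```python
# Values copied from the 속도 statistics in this report.
reported = {
    "min": 7.34,
    "max": 118.0,
    "q1": 59.0275,
    "q3": 84.29,
    "mean": 69.570937,
    "std": 20.770804,
}

value_range = reported["max"] - reported["min"]   # reported: 110.66
iqr = reported["q3"] - reported["q1"]             # reported: 25.2625
cv = reported["std"] / reported["mean"]           # reported: 0.29855576
variance = reported["std"] ** 2                   # reported: 431.42629

print(f"range={value_range:.2f} iqr={iqr:.4f} cv={cv:.6f} var={variance:.3f}")
```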
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
75.65 | 9 | 0.1%
83.5 | 9 | 0.1%
80.58 | 9 | 0.1%
82.52 | 9 | 0.1%
76.26 | 8 | 0.1%
78.34 | 8 | 0.1%
80.1 | 8 | 0.1%
75.93 | 8 | 0.1%
65.13 | 8 | 0.1%
74.89 | 8 | 0.1%
Other values (5287) | 9916 | 99.2%

Value | Count | Frequency (%)
7.34 | 1 | < 0.1%
7.69 | 1 | < 0.1%
8.27 | 1 | < 0.1%
8.36 | 1 | < 0.1%
8.39 | 1 | < 0.1%
8.86 | 1 | < 0.1%
9.0 | 1 | < 0.1%
9.35 | 1 | < 0.1%
9.41 | 1 | < 0.1%
9.46 | 1 | < 0.1%

Value | Count | Frequency (%)
118.0 | 1 | < 0.1%
113.86 | 1 | < 0.1%
112.37 | 1 | < 0.1%
111.31 | 1 | < 0.1%
111.19 | 1 | < 0.1%
110.72 | 1 | < 0.1%
110.53 | 1 | < 0.1%
110.35 | 1 | < 0.1%
109.53 | 1 | < 0.1%
108.12 | 1 | < 0.1%

Interactions

[Figures: nine pairwise interaction plots]

Correlations

 | 노선명 | 노선ID | 년월일 | 시간 | 속도
노선명 | 1.000 | 1.000 | 0.000 | 0.069 | 0.412
노선ID | 1.000 | 1.000 | 0.000 | 0.085 | 0.481
년월일 | 0.000 | 0.000 | 1.000 | 0.058 | 0.125
시간 | 0.069 | 0.085 | 0.058 | 1.000 | 0.525
속도 | 0.412 | 0.481 | 0.125 | 0.525 | 1.000

 | 노선명 | 노선ID
노선명 | 1.000 | 0.999
노선ID | 0.999 | 1.000
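
The 2×2 matrix above covers only the two categorical variables, but the report text does not retain the name of the measure. One common association measure for a pair of categorical variables is Cramér's V, so the sketch below assumes that is what is shown. It is an uncorrected textbook version in plain Python; the profiler's own computation may differ, for example through a bias correction, and `cramers_v` is an illustrative name.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed cell counts
    rows = Counter(xs)             # marginal counts of xs
    cols = Counter(ys)             # marginal counts of ys
    chi2 = 0.0
    for x, row_total in rows.items():
        for y, col_total in cols.items():
            expected = row_total * col_total / n
            chi2 += (joint[(x, y)] - expected) ** 2 / expected
    k = min(len(rows), len(cols)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Toy data: each route ID belongs to exactly one route name.
names = ["강변북로", "강변북로", "올림픽대로", "올림픽대로"]
ids = ["LKL3000010", "LKR3000010", "LLL3000010", "LLR3000010"]
print(cramers_v(names, ids))
```

When every ID maps to a single name, as in the toy data, the measure reaches its maximum, which is in line with the near-1 value reported for 노선명 and 노선ID.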

 | 년월일 | 시간 | 속도 | 노선명 | 노선ID
년월일 | 1.000 | -0.026 | -0.044 | 0.004 | 0.000
시간 | -0.026 | 1.000 | -0.338 | 0.029 | 0.031
속도 | -0.044 | -0.338 | 1.000 | 0.189 | 0.200
노선명 | 0.004 | 0.029 | 0.189 | 1.000 | 0.999
노선ID | 0.000 | 0.031 | 0.200 | 0.999 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 노선명 | 노선ID | 구간명 | 링크ID | 년월일 | 시간 | 속도
85936 | 동부간선로 | LDO3000010 | 월계1교→녹천교 | LDO2000100 | 20170114 | 16 | 17.55
43205 | 올림픽대로 | LLL3000010 | 강동대교남단→암사대교남단 | LLL2000200 | 20170107 | 20 | 77.88
3387 | 강남순환로 | LQRE3000010 | 관악터널 중간부→관악터널 출구부 | LQRE2000040 | 20170101 | 13 | 98.47
72467 | 동부간선로 | LDI3000010 | 노원교→상계교 | LDI2000025 | 20170112 | 12 | 28.59
6197 | 올림픽대로 | LLL3000010 | 반포대교남단→동작대교남단 | LLL2000090 | 20170101 | 23 | 75.62
77612 | 내부순환로 | LRO3000010 | 사근램프→마장램프 | LRO2000025 | 20170113 | 8 | 82.18
16539 | 강변북로 | LKR3000010 | 동호대교북단→성수 | LKO2000120 | 20170103 | 15 | 74.73
43986 | 올림픽대로 | LLR3000010 | 여의하류→여의상류 | LLR2000060 | 20170107 | 23 | 88.49
43659 | 북부간선로 | LBR3000010 | 신내IC→구리시계 | LBR2000050 | 20170107 | 22 | 71.3
58126 | 올림픽대로 | LLR3000010 | 여의하류→여의상류 | LLR2000060 | 20170110 | 5 | 87.82

Index | 노선명 | 노선ID | 구간명 | 링크ID | 년월일 | 시간 | 속도
4811 | 내부순환로 | LRO3000010 | 홍제램프→연희램프 | LRO2000130 | 20170101 | 18 | 69.73
11552 | 강변북로 | LKL3000010 | 영동대교북단→성수 | LKL2000100 | 20170102 | 20 | 83.0
62560 | 서부간선로 | LSN3000010 | 철산교-광명교 | LSN2000110 | 20170110 | 22 | 76.25
13764 | 서부간선로 | LSS3000010 | 광명교→금천교 | LSS2100050 | 20170103 | 4 | 87.28
21347 | 분당수서로 | LDO3000020 | 청담대교남단→청담대교북단 | LHU2000160 | 20170104 | 9 | 75.69
39686 | 동부간선로 | LDO3000010 | 노원교→수락지하차도 | LDO2000140 | 20170107 | 7 | 67.01
54671 | 북부간선로 | LBR3000010 | 종암JC→하월곡 | LBR2000010 | 20170109 | 16 | 69.23
42067 | 동부간선로 | LDO3000010 | 월계1교→녹천교 | LDO2000100 | 20170107 | 16 | 22.38
9921 | 한강교량 | LHU3000010 | 올림픽대교남단→올림픽대교북단 | LHU2000190 | 20170102 | 13 | 74.84
49666 | 동부간선로 | LDI3000010 | 상계교→창동교 | LDI2000030 | 20170108 | 21 | 57.17