Overview

Dataset statistics

Number of variables: 6
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 3
Duplicate rows (%): < 0.1%
Total size in memory: 556.6 KiB
Average record size in memory: 57.0 B
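
The two derived figures in this list can be checked from the others. The short sketch below assumes that the duplicate percentage is the duplicate row count divided by the number of observations, and that the average record size is the total memory divided by the number of observations; the variable names are illustrative only.

```python
# Figures copied from the dataset statistics list.
n_rows = 10_000
duplicate_rows = 3
total_kib = 556.6  # total size in memory, in KiB

# Share of duplicate rows, in percent (the report displays this as "< 0.1%").
duplicate_pct = duplicate_rows / n_rows * 100

# Average record size in bytes (1 KiB = 1024 bytes).
avg_record_bytes = total_kib * 1024 / n_rows

print(f"duplicate rows: {duplicate_pct:.2f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions both values agree with the report: 0.03% falls below the 0.1% display threshold, and the average record size rounds to 57.0 B.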

Variable types

Categorical: 2
Text: 3
Numeric: 1

Dataset

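Description: 부산광역시_지능형교통정보구간레벨패턴정보_20240229 (Busan Metropolitan City, intelligent traffic information section-level pattern information, 2024-02-29)
Author: 부산광역시 (Busan Metropolitan City)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15041722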

Alerts

Duplicates: Dataset has 3 (< 0.1%) duplicate rows
High correlation: 요일 is highly overall correlated with 시분
High correlation: 시분 is highly overall correlated with 요일
Imbalance: 요일 is highly imbalanced (67.4%)
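
The report text does not state how the imbalance percentage is computed. A minimal sketch, assuming the score is one minus the Shannon entropy of the class shares divided by the maximum possible entropy, applied to the 요일 counts given in this report (9404 and 596); the function name is hypothetical.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class shares and k is the number of classes.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Counts of the two 요일 values reported for this dataset.
score = imbalance_score([9404, 596])
print(f"{score:.1%}")
```

Under this assumed definition the score comes out at about 0.674, which is consistent with the 67.4% shown in the alert.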

Reproduction

Analysis started: 2024-03-23 07:03:17.140124
Analysis finished: 2024-03-23 07:03:25.940908
Duration: 8.8 seconds
Software version: ydata-profiling v4.5.1

Variables

요일 (day of week)
Categorical

Alerts: High correlation, Imbalance

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value counts: 일요일 (Sunday) 9404, 월요일 (Monday) 596

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 일요일
2nd row: 일요일
3rd row: 일요일
4th row: 일요일
5th row: 월요일

Common Values

Value    Count    Frequency (%)
일요일    9404    94.0%
월요일    596    6.0%

Length

Histogram of lengths of the category

시분 (time of day, HH:MM)
Categorical

Alerts: High correlation

Distinct: 12
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value counts: 18:00 1368, 18:35 832, 18:45 804, 18:10 791, 18:20 787, other values (7) 5418

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 18:00
2nd row: 18:40
3rd row: 18:00
4th row: 18:30
5th row: 18:00

Common Values

Value    Count    Frequency (%)
18:00    1368    13.7%
18:35    832    8.3%
18:45    804    8.0%
18:10    791    7.9%
18:20    787    7.9%
18:50    785    7.8%
18:05    784    7.8%
18:25    779    7.8%
18:40    769    7.7%
18:30    769    7.7%
Other values (2)    1532    15.3%

Length

Histogram of lengths of the category

구간명 (section name)
Text

Distinct: 842
Distinct (%): 8.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 13
Median length: 12
Mean length: 4.9169
Min length: 3

Characters and Unicode

Total characters: 49169
Distinct characters: 252
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
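
As an illustration of the character properties mentioned above, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own implementation; the helper name is hypothetical, and the two input strings are values that appear elsewhere in this report.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`.

    Two-letter codes map to the names used in the report, for example
    'Lo' = Other Letter, 'Nd' = Decimal Number, 'Lu' = Uppercase Letter,
    'Pd' = Dash Punctuation.
    """
    return Counter(unicodedata.category(ch) for value in values for ch in value)


# Two location strings taken from this report's sample rows.
counts = category_counts(["송정동944-1", "명지IC"])
print(counts)
```

Hangul syllables fall under Other Letter, which is why that category dominates the text variables in this dataset.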

Unique

Unique: 93
Unique (%): 0.9%

Sample

1st row: 동천로
2nd row: 낙동남로
3rd row: 황령대로
4th row: 녹산산업대로
5th row: 구덕로

Value    Count    Frequency (%)
중앙대로    265    2.6%
반송로    178    1.8%
낙동대로    164    1.6%
해운대로    161    1.6%
낙동남로    161    1.6%
가락대로    150    1.5%
기장대로    135    1.4%
백양대로    125    1.2%
번영로    106    1.1%
녹산산업중로    102    1.0%
Other values (832)    8453    84.5%

Most occurring characters

ValueCountFrequency (%)
9705
 
19.7%
3219
 
6.5%
2093
 
4.3%
1894
 
3.9%
1765
 
3.6%
1 1281
 
2.6%
1113
 
2.3%
2 1001
 
2.0%
866
 
1.8%
3 772
 
1.6%
Other values (242) 25460
51.8%

Most occurring categories

Value    Count    Frequency (%)
Other Letter    43118    87.7%
Decimal Number    6015    12.2%
Uppercase Letter    36    0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
9705
22.5%
3219
 
7.5%
2093
 
4.9%
1894
 
4.4%
1765
 
4.1%
1113
 
2.6%
866
 
2.0%
743
 
1.7%
721
 
1.7%
628
 
1.5%
Other values (228) 20371
47.2%
Decimal Number
ValueCountFrequency (%)
1 1281
21.3%
2 1001
16.6%
3 772
12.8%
7 521
8.7%
4 488
 
8.1%
6 472
 
7.8%
8 411
 
6.8%
5 387
 
6.4%
0 383
 
6.4%
9 299
 
5.0%
Uppercase Letter
ValueCountFrequency (%)
A 9
25.0%
P 9
25.0%
E 9
25.0%
C 9
25.0%

Most occurring scripts

Value    Count    Frequency (%)
Hangul    43118    87.7%
Common    6015    12.2%
Latin    36    0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
9705
22.5%
3219
 
7.5%
2093
 
4.9%
1894
 
4.4%
1765
 
4.1%
1113
 
2.6%
866
 
2.0%
743
 
1.7%
721
 
1.7%
628
 
1.5%
Other values (228) 20371
47.2%
Common
ValueCountFrequency (%)
1 1281
21.3%
2 1001
16.6%
3 772
12.8%
7 521
8.7%
4 488
 
8.1%
6 472
 
7.8%
8 411
 
6.8%
5 387
 
6.4%
0 383
 
6.4%
9 299
 
5.0%
Latin
ValueCountFrequency (%)
A 9
25.0%
P 9
25.0%
E 9
25.0%
C 9
25.0%

Most occurring blocks

Value    Count    Frequency (%)
Hangul    43118    87.7%
ASCII    6051    12.3%

Most frequent character per block

Hangul
ValueCountFrequency (%)
9705
22.5%
3219
 
7.5%
2093
 
4.9%
1894
 
4.4%
1765
 
4.1%
1113
 
2.6%
866
 
2.0%
743
 
1.7%
721
 
1.7%
628
 
1.5%
Other values (228) 20371
47.2%
ASCII
ValueCountFrequency (%)
1 1281
21.2%
2 1001
16.5%
3 772
12.8%
7 521
8.6%
4 488
 
8.1%
6 472
 
7.8%
8 411
 
6.8%
5 387
 
6.4%
0 383
 
6.3%
9 299
 
4.9%
Other values (4) 36
 
0.6%

시점 (start point)
Text

Distinct: 2487
Distinct (%): 24.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 16
Median length: 14
Mean length: 6.6891
Min length: 2

Characters and Unicode

Total characters: 66891
Distinct characters: 607
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 351
Unique (%): 3.5%

Sample

1st row: 22번교차로
2nd row: 송정동944-1
3rd row: 덕명여자정보고앞
4th row: 한국산업단지공단부산지역본부
5th row: 성불원

Value    Count    Frequency (%)
속성변화점    92    0.9%
명지ic    28    0.3%
대동화명대교ic    22    0.2%
거제역10번출구    21    0.2%
청강교    19    0.2%
버스정류장    17    0.2%
연화리507    17    0.2%
삼락ic    17    0.2%
교대사거리    17    0.2%
감전ic삼거리    17    0.2%
Other values (2477)    9733    97.3%

Most occurring characters

ValueCountFrequency (%)
2858
 
4.3%
2202
 
3.3%
1680
 
2.5%
1409
 
2.1%
1246
 
1.9%
1198
 
1.8%
1 1169
 
1.7%
1167
 
1.7%
1138
 
1.7%
1104
 
1.7%
Other values (597) 51720
77.3%

Most occurring categories

Value    Count    Frequency (%)
Other Letter    59671    89.2%
Decimal Number    5030    7.5%
Uppercase Letter    1346    2.0%
Dash Punctuation    706    1.1%
Open Punctuation    55    0.1%
Close Punctuation    55    0.1%
Lowercase Letter    22    < 0.1%
Other Punctuation    6    < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2858
 
4.8%
2202
 
3.7%
1680
 
2.8%
1409
 
2.4%
1246
 
2.1%
1198
 
2.0%
1167
 
2.0%
1138
 
1.9%
1104
 
1.9%
1101
 
1.8%
Other values (555) 44568
74.7%
Uppercase Letter
ValueCountFrequency (%)
C 350
26.0%
I 271
20.1%
G 133
 
9.9%
S 109
 
8.1%
T 95
 
7.1%
K 81
 
6.0%
B 44
 
3.3%
U 36
 
2.7%
E 36
 
2.7%
N 32
 
2.4%
Other values (13) 159
11.8%
Decimal Number
ValueCountFrequency (%)
1 1169
23.2%
2 840
16.7%
3 565
11.2%
4 504
10.0%
5 481
9.6%
7 346
 
6.9%
6 326
 
6.5%
9 321
 
6.4%
0 244
 
4.9%
8 234
 
4.7%
Lowercase Letter
ValueCountFrequency (%)
i 6
27.3%
l 6
27.3%
s 5
22.7%
k 5
22.7%
Other Punctuation
ValueCountFrequency (%)
& 5
83.3%
, 1
 
16.7%
Dash Punctuation
ValueCountFrequency (%)
- 706
100.0%
Open Punctuation
ValueCountFrequency (%)
( 55
100.0%
Close Punctuation
ValueCountFrequency (%)
) 55
100.0%

Most occurring scripts

Value    Count    Frequency (%)
Hangul    59671    89.2%
Common    5852    8.7%
Latin    1368    2.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2858
 
4.8%
2202
 
3.7%
1680
 
2.8%
1409
 
2.4%
1246
 
2.1%
1198
 
2.0%
1167
 
2.0%
1138
 
1.9%
1104
 
1.9%
1101
 
1.8%
Other values (555) 44568
74.7%
Latin
ValueCountFrequency (%)
C 350
25.6%
I 271
19.8%
G 133
 
9.7%
S 109
 
8.0%
T 95
 
6.9%
K 81
 
5.9%
B 44
 
3.2%
U 36
 
2.6%
E 36
 
2.6%
N 32
 
2.3%
Other values (17) 181
13.2%
Common
ValueCountFrequency (%)
1 1169
20.0%
2 840
14.4%
- 706
12.1%
3 565
9.7%
4 504
8.6%
5 481
8.2%
7 346
 
5.9%
6 326
 
5.6%
9 321
 
5.5%
0 244
 
4.2%
Other values (5) 350
 
6.0%

Most occurring blocks

Value    Count    Frequency (%)
Hangul    59671    89.2%
ASCII    7220    10.8%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2858
 
4.8%
2202
 
3.7%
1680
 
2.8%
1409
 
2.4%
1246
 
2.1%
1198
 
2.0%
1167
 
2.0%
1138
 
1.9%
1104
 
1.9%
1101
 
1.8%
Other values (555) 44568
74.7%
ASCII
ValueCountFrequency (%)
1 1169
16.2%
2 840
11.6%
- 706
9.8%
3 565
 
7.8%
4 504
 
7.0%
5 481
 
6.7%
C 350
 
4.8%
7 346
 
4.8%
6 326
 
4.5%
9 321
 
4.4%
Other values (32) 1612
22.3%

종점 (end point)
Text

Distinct: 2486
Distinct (%): 24.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 16
Median length: 14
Mean length: 6.7099
Min length: 2

Characters and Unicode

Total characters: 67099
Distinct characters: 604
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 343
Unique (%): 3.4%

Sample

1st row: 부광몰드시스템
2nd row: 산양삼거리
3rd row: 황령터널동측
4th row: 강서소방서
5th row: 충무교차로

Value    Count    Frequency (%)
속성변화점    89    0.9%
명지ic    43    0.4%
덕천ic    25    0.2%
버스정류장    23    0.2%
거제역10번출구    23    0.2%
감전ic삼거리    23    0.2%
청강교    22    0.2%
대동화명대교ic    21    0.2%
제2지하차도    20    0.2%
시랑리706    18    0.2%
Other values (2476)    9693    96.9%

Most occurring characters

ValueCountFrequency (%)
2911
 
4.3%
2189
 
3.3%
1647
 
2.5%
1426
 
2.1%
1301
 
1.9%
1 1233
 
1.8%
1202
 
1.8%
1178
 
1.8%
1138
 
1.7%
1136
 
1.7%
Other values (594) 51738
77.1%

Most occurring categories

Value    Count    Frequency (%)
Other Letter    59575    88.8%
Decimal Number    5153    7.7%
Uppercase Letter    1493    2.2%
Dash Punctuation    734    1.1%
Close Punctuation    59    0.1%
Open Punctuation    59    0.1%
Lowercase Letter    16    < 0.1%
Other Punctuation    10    < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2911
 
4.9%
2189
 
3.7%
1647
 
2.8%
1426
 
2.4%
1301
 
2.2%
1202
 
2.0%
1178
 
2.0%
1138
 
1.9%
1136
 
1.9%
1102
 
1.8%
Other values (552) 44345
74.4%
Uppercase Letter
ValueCountFrequency (%)
C 395
26.5%
I 310
20.8%
G 136
 
9.1%
S 135
 
9.0%
K 93
 
6.2%
T 80
 
5.4%
B 47
 
3.1%
E 39
 
2.6%
N 34
 
2.3%
U 31
 
2.1%
Other values (13) 193
12.9%
Decimal Number
ValueCountFrequency (%)
1 1233
23.9%
2 841
16.3%
3 529
10.3%
5 496
9.6%
4 482
 
9.4%
7 386
 
7.5%
6 328
 
6.4%
9 316
 
6.1%
8 271
 
5.3%
0 271
 
5.3%
Lowercase Letter
ValueCountFrequency (%)
i 6
37.5%
l 6
37.5%
s 2
 
12.5%
k 2
 
12.5%
Other Punctuation
ValueCountFrequency (%)
& 6
60.0%
, 4
40.0%
Dash Punctuation
ValueCountFrequency (%)
- 734
100.0%
Close Punctuation
ValueCountFrequency (%)
) 59
100.0%
Open Punctuation
ValueCountFrequency (%)
( 59
100.0%

Most occurring scripts

Value    Count    Frequency (%)
Hangul    59575    88.8%
Common    6015    9.0%
Latin    1509    2.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2911
 
4.9%
2189
 
3.7%
1647
 
2.8%
1426
 
2.4%
1301
 
2.2%
1202
 
2.0%
1178
 
2.0%
1138
 
1.9%
1136
 
1.9%
1102
 
1.8%
Other values (552) 44345
74.4%
Latin
ValueCountFrequency (%)
C 395
26.2%
I 310
20.5%
G 136
 
9.0%
S 135
 
8.9%
K 93
 
6.2%
T 80
 
5.3%
B 47
 
3.1%
E 39
 
2.6%
N 34
 
2.3%
U 31
 
2.1%
Other values (17) 209
13.9%
Common
ValueCountFrequency (%)
1 1233
20.5%
2 841
14.0%
- 734
12.2%
3 529
8.8%
5 496
8.2%
4 482
 
8.0%
7 386
 
6.4%
6 328
 
5.5%
9 316
 
5.3%
8 271
 
4.5%
Other values (5) 399
 
6.6%

Most occurring blocks

Value    Count    Frequency (%)
Hangul    59575    88.8%
ASCII    7524    11.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2911
 
4.9%
2189
 
3.7%
1647
 
2.8%
1426
 
2.4%
1301
 
2.2%
1202
 
2.0%
1178
 
2.0%
1138
 
1.9%
1136
 
1.9%
1102
 
1.8%
Other values (552) 44345
74.4%
ASCII
ValueCountFrequency (%)
1 1233
16.4%
2 841
11.2%
- 734
9.8%
3 529
 
7.0%
5 496
 
6.6%
4 482
 
6.4%
C 395
 
5.2%
7 386
 
5.1%
6 328
 
4.4%
9 316
 
4.2%
Other values (32) 1784
23.7%

속도 (speed)
Real number (ℝ)

Distinct: 100
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.0339
Minimum: 4
Maximum: 103
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 4
5-th percentile: 11
Q1: 19
Median: 26
Q3: 36
95-th percentile: 66
Maximum: 103
Range: 99
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 16.212897
Coefficient of variation (CV): 0.53981991
Kurtosis: 2.3294777
Mean: 30.0339
Median Absolute Deviation (MAD): 8
Skewness: 1.4274133
Sum: 300339
Variance: 262.85804
Monotonicity: Not monotonic
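
Several of the 속도 statistics are tied together by standard definitions: the mean is the sum over the number of observations, the coefficient of variation is the standard deviation over the mean, the variance is the squared standard deviation, and the range and IQR are simple differences. The sketch below checks those relationships using only figures printed in this report; the variable names are illustrative.

```python
# Figures copied from the 속도 statistics in this report.
n = 10_000                 # number of observations
total = 300_339            # Sum
std = 16.212897            # Standard deviation
q1, q3 = 19, 36            # Q1 and Q3
minimum, maximum = 4, 103  # Minimum and Maximum

mean = total / n                   # arithmetic mean
cv = std / mean                    # coefficient of variation
variance = std ** 2                # variance as squared standard deviation
iqr = q3 - q1                      # interquartile range
value_range = maximum - minimum    # range

print(mean, round(cv, 4), round(variance, 2), iqr, value_range)
```

The recomputed values agree with the reported ones up to rounding, because the standard deviation is itself printed to only six decimal places.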
Histogram with fixed size bins (bins=50)

Common values

Value    Count    Frequency (%)
21    389    3.9%
22    389    3.9%
23    376    3.8%
25    373    3.7%
26    368    3.7%
24    362    3.6%
19    361    3.6%
18    341    3.4%
20    339    3.4%
27    324    3.2%
Other values (90)    6378    63.8%

Minimum 10 values

Value    Count    Frequency (%)
4    49    0.5%
5    26    0.3%
6    41    0.4%
7    58    0.6%
8    84    0.8%
9    94    0.9%
10    122    1.2%
11    96    1.0%
12    133    1.3%
13    178    1.8%

Maximum 10 values

Value    Count    Frequency (%)
103    2    < 0.1%
102    3    < 0.1%
101    2    < 0.1%
100    1    < 0.1%
99    1    < 0.1%
98    1    < 0.1%
97    10    0.1%
96    6    0.1%
95    10    0.1%
94    3    < 0.1%

Correlations

        요일    시분    속도
요일    1.000    0.788    0.014
시분    0.788    1.000    0.000
속도    0.014    0.000    1.000

        시분    요일
시분    1.000    0.632
요일    0.632    1.000

        속도    요일    시분
속도    1.000    0.011    0.000
요일    0.011    1.000    0.632
시분    0.000    0.632    1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index)    요일    시분    구간명    시점    종점    속도
5926    일요일    18:00    동천로    22번교차로    부광몰드시스템    21
64156    일요일    18:40    낙동남로    송정동944-1    산양삼거리    48
1114    일요일    18:00    황령대로    덕명여자정보고앞    황령터널동측    45
52533    일요일    18:30    녹산산업대로    한국산업단지공단부산지역본부    강서소방서    45
95225    월요일    18:00    구덕로    성불원    충무교차로    12
94216    월요일    18:00    황령대로    상공회의소앞    범내골교차로    10
44744    일요일    18:25    낙동남로    명지IC    명지동1-610    70
52003    일요일    18:30    동천로    22번교차로    부광몰드시스템    21
29327    일요일    18:15    기장대로    장안IC북측    장안IC입구남측    63
12681    일요일    18:05    거가대로    장목터널    거가대교    78

(index)    요일    시분    구간명    시점    종점    속도
52833    일요일    18:30    낙동남로    제2녹산교서측    제1녹산교동측    48
29761    일요일    18:15    울산포항고속도로    기장일광IC북측    기장일광IC남측    96
44710    일요일    18:25    남해고속도로    제2낙동대교서측    대저JC    80
59929    일요일    18:35    장인로    진성식당    학장동교차로    29
41386    일요일    18:25    후리소리길    꼬마대통령다대점    바다유치원    20
54551    일요일    18:30    보수대로    영락교회앞사거리    흑교사거리    17
40198    일요일    18:25    황령대로    범4호교    문전교차로    19
10393    일요일    18:05    낙동대로1048번길    감전IC삼거리    CU감전공단점    34
32021    일요일    18:20    동천로    디씨티앞교차로    NC백화점    17
76531    일요일    18:45    녹산산단261로    종합폴스타    동성로라    38

Duplicate rows

Most frequently occurring

(index)    요일    시분    구간명    시점    종점    속도    # duplicates
0    월요일    18:00    기장해안로    시랑리706    시랑리706    32    2
1    일요일    18:20    대청로    청강과선교서측    청강교    28    2
2    일요일    18:30    녹산산단321로    갑을녹산병원    갑을녹산병원    35    2
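
A duplicate-row count of this kind can be reproduced by counting identical records. The sketch below uses a hypothetical miniature dataset: the three duplicated records listed above, each entered twice, plus one unrelated record from the sample. The split of each record into columns is an interpretation of the flattened table, and `collections.Counter` stands in for whatever the profiler uses internally.

```python
from collections import Counter

# Hypothetical miniature dataset; columns are
# (요일, 시분, 구간명, 시점, 종점, 속도).
rows = [
    ("월요일", "18:00", "기장해안로", "시랑리706", "시랑리706", 32),
    ("월요일", "18:00", "기장해안로", "시랑리706", "시랑리706", 32),
    ("일요일", "18:20", "대청로", "청강과선교서측", "청강교", 28),
    ("일요일", "18:20", "대청로", "청강과선교서측", "청강교", 28),
    ("일요일", "18:30", "녹산산단321로", "갑을녹산병원", "갑을녹산병원", 35),
    ("일요일", "18:30", "녹산산단321로", "갑을녹산병원", "갑을녹산병원", 35),
    ("일요일", "18:05", "거가대로", "장목터널", "거가대교", 78),
]

# Count how often each complete record occurs.
counts = Counter(rows)

# Records that occur more than once, with their occurrence counts.
duplicates = {row: n for row, n in counts.items() if n > 1}

# Number of redundant rows: every occurrence beyond the first.
extra_rows = sum(n - 1 for n in duplicates.values())

for row, n in duplicates.items():
    print(n, row)
print("redundant rows:", extra_rows)
```

With three records each appearing twice, three rows are redundant, which is consistent with the "Duplicate rows: 3" figure in the overview.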