Overview

Dataset statistics

Number of variables: 14
Number of observations: 572
Missing cells: 1247
Missing cells (%): 15.6%
Duplicate rows: 1
Duplicate rows (%): 0.2%
Total size in memory: 65.5 KiB
Average record size in memory: 117.2 B

Variable types

Text: 4
Numeric: 2
DateTime: 4
Categorical: 4

Dataset

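Description: Bus stop information for Gimcheon-si, Gyeongsangbuk-do, covering waypoints, terminus, departure times, average operating distance, number of runs, and fares.
Author: Gimcheon-si, Gyeongsangbuk-do
URL: https://www.data.go.kr/data/15091884/fileData.do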

Alerts

Duplicates: Dataset has 1 (0.2%) duplicate rows
High correlation: 중고생(원) is highly overall correlated with 거리 and 4 other fields
High correlation: 데이터기준일 is highly overall correlated with 거리 and 4 other fields
High correlation: 초등학생(원) is highly overall correlated with 거리 and 4 other fields
High correlation: 일반(원) is highly overall correlated with 거리 and 4 other fields
High correlation: 거리 is highly overall correlated with 일반(원) and 3 other fields
High correlation: 횟수 is highly overall correlated with 일반(원) and 3 other fields
Imbalance: 일반(원) is highly imbalanced (91.6%)
Imbalance: 중고생(원) is highly imbalanced (91.6%)
Imbalance: 초등학생(원) is highly imbalanced (91.6%)
Imbalance: 데이터기준일 is highly imbalanced (91.6%)
Missing: 계통번호 has 6 (1.0%) missing values
Missing: 기점 has 6 (1.0%) missing values
Missing: 경유지 has 6 (1.0%) missing values
Missing: 종점 has 6 (1.0%) missing values
Missing: 거리 has 6 (1.0%) missing values
Missing: 횟수 has 6 (1.0%) missing values
Missing: 기점 출발(1) has 6 (1.0%) missing values
Missing: 기점출발(2) has 310 (54.2%) missing values
Missing: 기점출발(3) has 426 (74.5%) missing values
Missing: 기점출발(4) has 469 (82.0%) missing values
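
As a quick consistency check, the per-column missing counts in the alerts can be summed and compared with the dataset-level "Missing cells" figure. The sketch below only uses numbers quoted in this report; the variable names are illustrative.

```python
# Per-column missing counts copied from the "Missing" alerts in this report.
missing_per_column = {
    "계통번호": 6, "기점": 6, "경유지": 6, "종점": 6, "거리": 6, "횟수": 6,
    "기점 출발(1)": 6, "기점출발(2)": 310, "기점출발(3)": 426, "기점출발(4)": 469,
}
n_variables = 14
n_observations = 572

total_missing = sum(missing_per_column.values())
total_cells = n_variables * n_observations
missing_pct = round(100 * total_missing / total_cells, 1)

print(total_missing, total_cells, missing_pct)
```

The four categorical columns contribute nothing to this total because the report lists them with 0 missing values; their six empty entries show up as the category `<NA>` instead.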

Reproduction

Analysis started: 2023-12-12 03:00:01.551583
Analysis finished: 2023-12-12 03:00:04.148254
Duration: 2.6 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

계통번호
Text

MISSING 

Distinct: 467
Distinct (%): 82.5%
Missing: 6
Missing (%): 1.0%
Memory size: 4.6 KiB

Length

Max length: 8
Median length: 7
Mean length: 5.6731449
Min length: 3

Characters and Unicode

Total characters: 3211
Distinct characters: 118
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
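
A minimal sketch of how such per-character properties can be tallied, using Python's standard `unicodedata` module and the five sample values of 계통번호 shown in this report. The helper name `category_counts` is hypothetical, and this is not necessarily how the report itself computes its tables. The two-letter codes map to the report's labels as Lu = Uppercase Letter, Nd = Decimal Number, Pd = Dash Punctuation, Lo = Other Letter (which is where Hangul syllables fall).

```python
import unicodedata
from collections import Counter

# First five sample values of 계통번호 as listed in this report.
samples = ["KTX11", "KTX113-2", "KTX12", "KTX13", "KTX14"]


def category_counts(values):
    """Count Unicode general categories (Lu, Nd, Pd, ...) over all characters."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


counts = category_counts(samples)
print(dict(counts))

# Hangul syllables are classified as "Other Letter" (Lo).
print(unicodedata.category("김"))
```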

Unique

Unique: 418
Unique (%): 73.9%
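
The percentages and mean length for this variable appear to be taken over the 566 non-missing rows rather than all 572 observations. The following sketch checks that reading against the figures quoted above; it is an inference from the numbers, not a statement of the tool's documented behaviour.

```python
n_observations = 572
n_missing = 6
n_present = n_observations - n_missing  # rows that actually hold a value

# Figures quoted for 계통번호 in this report.
distinct, unique, total_chars = 467, 418, 3211

distinct_pct = round(100 * distinct / n_present, 1)
unique_pct = round(100 * unique / n_present, 1)
mean_length = total_chars / n_present

print(distinct_pct, unique_pct, round(mean_length, 7))
```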

Sample

1st row: KTX11
2nd row: KTX113-2
3rd row: KTX12
4th row: KTX13
5th row: KTX14

Value | Count | Frequency (%)
김천22 | 9 | 1.6%
시청22 | 8 | 1.4%
김천11 | 7 | 1.2%
순환1 | 6 | 1.1%
직지11 | 5 | 0.9%
김천13-2 | 5 | 0.9%
봉계15 | 5 | 0.9%
순환2 | 5 | 0.9%
김천15 | 5 | 0.9%
시청222 | 4 | 0.7%
Other values (457) | 507 | 89.6%

Most occurring characters

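Value | Count | Frequency (%)
1 | 472 | 14.7%
- | 371 | 11.6%
2 | 361 | 11.2%
3 | 258 | 8.0%
8 | 231 | 7.2%
(not preserved) | 211 | 6.6%
(not preserved) | 205 | 6.4%
5 | 114 | 3.6%
4 | 82 | 2.6%
6 | 60 | 1.9%
Other values (108) | 846 | 26.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1688 | 52.6%
Other Letter | 1096 | 34.1%
Dash Punctuation | 371 | 11.6%
Uppercase Letter | 51 | 1.6%
Lowercase Letter | 3 | 0.1%
Space Separator | 2 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 211 | 19.3%
(not preserved) | 205 | 18.7%
(not preserved) | 40 | 3.6%
(not preserved) | 37 | 3.4%
(not preserved) | 35 | 3.2%
(not preserved) | 31 | 2.8%
(not preserved) | 29 | 2.6%
(not preserved) | 24 | 2.2%
(not preserved) | 24 | 2.2%
(not preserved) | 24 | 2.2%
Other values (90) | 436 | 39.8%
Decimal Number
Value | Count | Frequency (%)
1 | 472 | 28.0%
2 | 361 | 21.4%
3 | 258 | 15.3%
8 | 231 | 13.7%
5 | 114 | 6.8%
4 | 82 | 4.9%
6 | 60 | 3.6%
9 | 46 | 2.7%
7 | 44 | 2.6%
0 | 20 | 1.2%
Uppercase Letter
Value | Count | Frequency (%)
X | 17 | 33.3%
T | 17 | 33.3%
K | 17 | 33.3%
Lowercase Letter
Value | Count | Frequency (%)
t | 1 | 33.3%
k | 1 | 33.3%
x | 1 | 33.3%
Dash Punctuation
Value | Count | Frequency (%)
- | 371 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2061 | 64.2%
Hangul | 1096 | 34.1%
Latin | 54 | 1.7%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 211 | 19.3%
(not preserved) | 205 | 18.7%
(not preserved) | 40 | 3.6%
(not preserved) | 37 | 3.4%
(not preserved) | 35 | 3.2%
(not preserved) | 31 | 2.8%
(not preserved) | 29 | 2.6%
(not preserved) | 24 | 2.2%
(not preserved) | 24 | 2.2%
(not preserved) | 24 | 2.2%
Other values (90) | 436 | 39.8%
Common
Value | Count | Frequency (%)
1 | 472 | 22.9%
- | 371 | 18.0%
2 | 361 | 17.5%
3 | 258 | 12.5%
8 | 231 | 11.2%
5 | 114 | 5.5%
4 | 82 | 4.0%
6 | 60 | 2.9%
9 | 46 | 2.2%
7 | 44 | 2.1%
Other values (2) | 22 | 1.1%
Latin
Value | Count | Frequency (%)
X | 17 | 31.5%
T | 17 | 31.5%
K | 17 | 31.5%
t | 1 | 1.9%
k | 1 | 1.9%
x | 1 | 1.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2115 | 65.9%
Hangul | 1096 | 34.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 472 | 22.3%
- | 371 | 17.5%
2 | 361 | 17.1%
3 | 258 | 12.2%
8 | 231 | 10.9%
5 | 114 | 5.4%
4 | 82 | 3.9%
6 | 60 | 2.8%
9 | 46 | 2.2%
7 | 44 | 2.1%
Other values (8) | 76 | 3.6%
Hangul
Value | Count | Frequency (%)
(not preserved) | 211 | 19.3%
(not preserved) | 205 | 18.7%
(not preserved) | 40 | 3.6%
(not preserved) | 37 | 3.4%
(not preserved) | 35 | 3.2%
(not preserved) | 31 | 2.8%
(not preserved) | 29 | 2.6%
(not preserved) | 24 | 2.2%
(not preserved) | 24 | 2.2%
(not preserved) | 24 | 2.2%
Other values (90) | 436 | 39.8%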

기점
Text

MISSING 

Distinct: 78
Distinct (%): 13.8%
Missing: 6
Missing (%): 1.0%
Memory size: 4.6 KiB

Length

Max length: 6
Median length: 3
Mean length: 2.6625442
Min length: 2

Characters and Unicode

Total characters: 1507
Distinct characters: 112
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 13
Unique (%): 2.3%

Sample

1st row: ktx
2nd row: ktx
3rd row: ktx
4th row: ktx
5th row: ktx

Value | Count | Frequency (%)
터미널 | 211 | 37.3%
ktx | 36 | 6.4%
시청 | 35 | 6.2%
직지사 | 28 | 4.9%
종상 | 22 | 3.9%
봉계 | 21 | 3.7%
선산 | 10 | 1.8%
용문산 | 8 | 1.4%
못골 | 8 | 1.4%
추풍령 | 8 | 1.4%
Other values (68) | 179 | 31.6%

Most occurring characters

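Value | Count | Frequency (%)
(not preserved) | 213 | 14.1%
(not preserved) | 211 | 14.0%
(not preserved) | 211 | 14.0%
(not preserved) | 43 | 2.9%
(not preserved) | 37 | 2.5%
k | 36 | 2.4%
t | 36 | 2.4%
x | 36 | 2.4%
(not preserved) | 35 | 2.3%
(not preserved) | 34 | 2.3%
Other values (102) | 615 | 40.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1397 | 92.7%
Lowercase Letter | 108 | 7.2%
Decimal Number | 2 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 213 | 15.2%
(not preserved) | 211 | 15.1%
(not preserved) | 211 | 15.1%
(not preserved) | 43 | 3.1%
(not preserved) | 37 | 2.6%
(not preserved) | 35 | 2.5%
(not preserved) | 34 | 2.4%
(not preserved) | 31 | 2.2%
(not preserved) | 30 | 2.1%
(not preserved) | 25 | 1.8%
Other values (98) | 527 | 37.7%
Lowercase Letter
Value | Count | Frequency (%)
k | 36 | 33.3%
t | 36 | 33.3%
x | 36 | 33.3%
Decimal Number
Value | Count | Frequency (%)
4 | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1397 | 92.7%
Latin | 108 | 7.2%
Common | 2 | 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 213 | 15.2%
(not preserved) | 211 | 15.1%
(not preserved) | 211 | 15.1%
(not preserved) | 43 | 3.1%
(not preserved) | 37 | 2.6%
(not preserved) | 35 | 2.5%
(not preserved) | 34 | 2.4%
(not preserved) | 31 | 2.2%
(not preserved) | 30 | 2.1%
(not preserved) | 25 | 1.8%
Other values (98) | 527 | 37.7%
Latin
Value | Count | Frequency (%)
k | 36 | 33.3%
t | 36 | 33.3%
x | 36 | 33.3%
Common
Value | Count | Frequency (%)
4 | 2 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1397 | 92.7%
ASCII | 110 | 7.3%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(not preserved) | 213 | 15.2%
(not preserved) | 211 | 15.1%
(not preserved) | 211 | 15.1%
(not preserved) | 43 | 3.1%
(not preserved) | 37 | 2.6%
(not preserved) | 35 | 2.5%
(not preserved) | 34 | 2.4%
(not preserved) | 31 | 2.2%
(not preserved) | 30 | 2.1%
(not preserved) | 25 | 1.8%
Other values (98) | 527 | 37.7%
ASCII
Value | Count | Frequency (%)
k | 36 | 32.7%
t | 36 | 32.7%
x | 36 | 32.7%
4 | 2 | 1.8%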

경유지
Text

MISSING 

Distinct: 306
Distinct (%): 54.1%
Missing: 6
Missing (%): 1.0%
Memory size: 4.6 KiB

Length

Max length: 79
Median length: 45
Mean length: 20.978799
Min length: 2

Characters and Unicode

Total characters: 11874
Distinct characters: 202
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 186
Unique (%): 32.9%

Sample

1st row: 한국전력기술 무실삼거리 삼각로타리 터미널 김천역 시민탑 김천중고 다수동
2nd row: 용시 남곡 무실삼거리 삼각로타리
3rd row: 한국전력기술 무실삼거리 삼각로타리
4th row: 한국전력기술 무실삼거리 삼각로타리
5th row: 한국전력기술 무실삼거리 삼각로타리

Value | Count | Frequency (%)
김천역 | 171 | 6.2%
무실삼거리 | 114 | 4.1%
터미널 | 112 | 4.1%
삼각로타리 | 109 | 4.0%
시민탑 | 103 | 3.7%
이마트 | 87 | 3.2%
김천중고 | 84 | 3.1%
성의중고 | 76 | 2.8%
다수동 | 74 | 2.7%
지례 | 72 | 2.6%
Other values (185) | 1748 | 63.6%

Most occurring characters

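Value | Count | Frequency (%)
(space) | 2209 | 18.6%
(not preserved) | 561 | 4.7%
(not preserved) | 407 | 3.4%
(not preserved) | 358 | 3.0%
(not preserved) | 332 | 2.8%
(not preserved) | 287 | 2.4%
(not preserved) | 259 | 2.2%
(not preserved) | 252 | 2.1%
(not preserved) | 238 | 2.0%
(not preserved) | 179 | 1.5%
Other values (192) | 6792 | 57.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 9232 | 77.7%
Space Separator | 2209 | 18.6%
Uppercase Letter | 357 | 3.0%
Decimal Number | 64 | 0.5%
Open Punctuation | 3 | < 0.1%
Close Punctuation | 3 | < 0.1%
Other Punctuation | 3 | < 0.1%
Lowercase Letter | 3 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 561 | 6.1%
(not preserved) | 407 | 4.4%
(not preserved) | 358 | 3.9%
(not preserved) | 332 | 3.6%
(not preserved) | 287 | 3.1%
(not preserved) | 259 | 2.8%
(not preserved) | 252 | 2.7%
(not preserved) | 238 | 2.6%
(not preserved) | 179 | 1.9%
(not preserved) | 163 | 1.8%
Other values (176) | 6196 | 67.1%
Uppercase Letter
Value | Count | Frequency (%)
A | 140 | 39.2%
K | 71 | 19.9%
T | 67 | 18.8%
X | 67 | 18.8%
C | 8 | 2.2%
H | 2 | 0.6%
L | 2 | 0.6%
Lowercase Letter
Value | Count | Frequency (%)
x | 1 | 33.3%
t | 1 | 33.3%
k | 1 | 33.3%
Decimal Number
Value | Count | Frequency (%)
2 | 51 | 79.7%
1 | 13 | 20.3%
Space Separator
Value | Count | Frequency (%)
(space) | 2209 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 3 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 3 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
. | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 9232 | 77.7%
Common | 2282 | 19.2%
Latin | 360 | 3.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 561 | 6.1%
(not preserved) | 407 | 4.4%
(not preserved) | 358 | 3.9%
(not preserved) | 332 | 3.6%
(not preserved) | 287 | 3.1%
(not preserved) | 259 | 2.8%
(not preserved) | 252 | 2.7%
(not preserved) | 238 | 2.6%
(not preserved) | 179 | 1.9%
(not preserved) | 163 | 1.8%
Other values (176) | 6196 | 67.1%
Latin
Value | Count | Frequency (%)
A | 140 | 38.9%
K | 71 | 19.7%
T | 67 | 18.6%
X | 67 | 18.6%
C | 8 | 2.2%
H | 2 | 0.6%
L | 2 | 0.6%
x | 1 | 0.3%
t | 1 | 0.3%
k | 1 | 0.3%
Common
Value | Count | Frequency (%)
(space) | 2209 | 96.8%
2 | 51 | 2.2%
1 | 13 | 0.6%
( | 3 | 0.1%
) | 3 | 0.1%
. | 3 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 9232 | 77.7%
ASCII | 2642 | 22.3%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 2209 | 83.6%
A | 140 | 5.3%
K | 71 | 2.7%
T | 67 | 2.5%
X | 67 | 2.5%
2 | 51 | 1.9%
1 | 13 | 0.5%
C | 8 | 0.3%
( | 3 | 0.1%
) | 3 | 0.1%
Other values (6) | 10 | 0.4%
Hangul
Value | Count | Frequency (%)
(not preserved) | 561 | 6.1%
(not preserved) | 407 | 4.4%
(not preserved) | 358 | 3.9%
(not preserved) | 332 | 3.6%
(not preserved) | 287 | 3.1%
(not preserved) | 259 | 2.8%
(not preserved) | 252 | 2.7%
(not preserved) | 238 | 2.6%
(not preserved) | 179 | 1.9%
(not preserved) | 163 | 1.8%
Other values (176) | 6196 | 67.1%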

종점
Text

MISSING 

Distinct: 79
Distinct (%): 14.0%
Missing: 6
Missing (%): 1.0%
Memory size: 4.6 KiB

Length

Max length: 6
Median length: 3
Mean length: 2.6819788
Min length: 2

Characters and Unicode

Total characters: 1518
Distinct characters: 116
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 15
Unique (%): 2.7%

Sample

1st row: 직지사
2nd row: 터미널
3rd row: 터미널
4th row: 터미널
5th row: 터미널

Value | Count | Frequency (%)
터미널 | 223 | 39.4%
ktx | 37 | 6.5%
시청 | 31 | 5.5%
직지사 | 27 | 4.8%
종상 | 22 | 3.9%
봉계 | 20 | 3.5%
선산 | 8 | 1.4%
용문산 | 8 | 1.4%
못골 | 8 | 1.4%
추풍령 | 7 | 1.2%
Other values (69) | 175 | 30.9%

Most occurring characters

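Value | Count | Frequency (%)
(not preserved) | 225 | 14.8%
(not preserved) | 223 | 14.7%
(not preserved) | 223 | 14.7%
(not preserved) | 39 | 2.6%
k | 37 | 2.4%
t | 37 | 2.4%
x | 37 | 2.4%
(not preserved) | 37 | 2.4%
(not preserved) | 33 | 2.2%
(not preserved) | 31 | 2.0%
Other values (106) | 596 | 39.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1403 | 92.4%
Lowercase Letter | 111 | 7.3%
Uppercase Letter | 3 | 0.2%
Decimal Number | 1 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 225 | 16.0%
(not preserved) | 223 | 15.9%
(not preserved) | 223 | 15.9%
(not preserved) | 39 | 2.8%
(not preserved) | 37 | 2.6%
(not preserved) | 33 | 2.4%
(not preserved) | 31 | 2.2%
(not preserved) | 30 | 2.1%
(not preserved) | 28 | 2.0%
(not preserved) | 26 | 1.9%
Other values (99) | 508 | 36.2%
Lowercase Letter
Value | Count | Frequency (%)
k | 37 | 33.3%
t | 37 | 33.3%
x | 37 | 33.3%
Uppercase Letter
Value | Count | Frequency (%)
T | 1 | 33.3%
X | 1 | 33.3%
K | 1 | 33.3%
Decimal Number
Value | Count | Frequency (%)
4 | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1403 | 92.4%
Latin | 114 | 7.5%
Common | 1 | 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 225 | 16.0%
(not preserved) | 223 | 15.9%
(not preserved) | 223 | 15.9%
(not preserved) | 39 | 2.8%
(not preserved) | 37 | 2.6%
(not preserved) | 33 | 2.4%
(not preserved) | 31 | 2.2%
(not preserved) | 30 | 2.1%
(not preserved) | 28 | 2.0%
(not preserved) | 26 | 1.9%
Other values (99) | 508 | 36.2%
Latin
Value | Count | Frequency (%)
k | 37 | 32.5%
t | 37 | 32.5%
x | 37 | 32.5%
T | 1 | 0.9%
X | 1 | 0.9%
K | 1 | 0.9%
Common
Value | Count | Frequency (%)
4 | 1 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1403 | 92.4%
ASCII | 115 | 7.6%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(not preserved) | 225 | 16.0%
(not preserved) | 223 | 15.9%
(not preserved) | 223 | 15.9%
(not preserved) | 39 | 2.8%
(not preserved) | 37 | 2.6%
(not preserved) | 33 | 2.4%
(not preserved) | 31 | 2.2%
(not preserved) | 30 | 2.1%
(not preserved) | 28 | 2.0%
(not preserved) | 26 | 1.9%
Other values (99) | 508 | 36.2%
ASCII
Value | Count | Frequency (%)
k | 37 | 32.2%
t | 37 | 32.2%
x | 37 | 32.2%
T | 1 | 0.9%
4 | 1 | 0.9%
X | 1 | 0.9%
K | 1 | 0.9%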

거리
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 155
Distinct (%): 27.4%
Missing: 6
Missing (%): 1.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.775442
Minimum: 3
Maximum: 59.3
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.2 KiB

Quantile statistics

Minimum: 3
5-th percentile: 4.4
Q1: 10.9
Median: 16.2
Q3: 24.6
95-th percentile: 39.3
Maximum: 59.3
Range: 56.3
Interquartile range (IQR): 13.7

Descriptive statistics

Standard deviation: 11.182788
Coefficient of variation (CV): 0.59560717
Kurtosis: 0.23629813
Mean: 18.775442
Median Absolute Deviation (MAD): 7.25
Skewness: 0.83565498
Sum: 10626.9
Variance: 125.05474
Monotonicity: Not monotonic
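
Several of the statistics above are derived from one another, which gives a cheap way to sanity-check the report. The sketch below recomputes the variance, coefficient of variation, range, IQR, and mean from the other quoted figures for 거리; it assumes the mean is taken over the 566 non-missing rows, which matches the quoted sum.

```python
import math

# Figures quoted for 거리 in this report.
mean, std, total = 18.775442, 11.182788, 10626.9
minimum, maximum, q1, q3 = 3.0, 59.3, 10.9, 24.6
n_present = 572 - 6  # observations minus missing values

variance = std ** 2               # variance is the squared standard deviation
cv = std / mean                   # coefficient of variation
value_range = maximum - minimum   # range
iqr = q3 - q1                     # interquartile range
mean_from_sum = total / n_present

print(round(variance, 5), round(cv, 8), round(value_range, 1),
      round(iqr, 1), round(mean_from_sum, 6))
```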
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
11.7 | 29 | 5.1%
3.0 | 24 | 4.2%
8.0 | 15 | 2.6%
14.7 | 13 | 2.3%
20.5 | 10 | 1.7%
23.4 | 9 | 1.6%
18.9 | 9 | 1.6%
8.9 | 9 | 1.6%
11.0 | 9 | 1.6%
4.7 | 8 | 1.4%
Other values (145) | 431 | 75.3%

Minimum 10 values
Value | Count | Frequency (%)
3.0 | 24 | 4.2%
3.5 | 3 | 0.5%
4.3 | 2 | 0.3%
4.7 | 8 | 1.4%
5.0 | 1 | 0.2%
5.2 | 3 | 0.5%
5.4 | 1 | 0.2%
5.5 | 3 | 0.5%
5.6 | 6 | 1.0%
6.0 | 4 | 0.7%

Maximum 10 values
Value | Count | Frequency (%)
59.3 | 1 | 0.2%
54.2 | 1 | 0.2%
51.3 | 1 | 0.2%
50.8 | 3 | 0.5%
49.3 | 5 | 0.9%
46.2 | 4 | 0.7%
44.2 | 4 | 0.7%
43.2 | 3 | 0.5%
40.0 | 2 | 0.3%
39.6 | 4 | 0.7%

횟수
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 23
Distinct (%): 4.1%
Missing: 6
Missing (%): 1.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.2424028
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 3
95-th percentile: 19
Maximum: 31
Range: 30
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 6.3985953
Coefficient of variation (CV): 1.5082479
Kurtosis: 5.9906569
Mean: 4.2424028
Median Absolute Deviation (MAD): 0
Skewness: 2.5244887
Sum: 2401.2
Variance: 40.942022
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=23)

Common values
Value | Count | Frequency (%)
1.0 | 288 | 50.3%
2.0 | 109 | 19.1%
3.0 | 38 | 6.6%
12.0 | 22 | 3.8%
6.0 | 16 | 2.8%
31.0 | 8 | 1.4%
13.0 | 8 | 1.4%
8.0 | 8 | 1.4%
5.0 | 8 | 1.4%
4.0 | 8 | 1.4%
Other values (13) | 53 | 9.3%

Minimum 10 values
Value | Count | Frequency (%)
1.0 | 288 | 50.3%
2.0 | 109 | 19.1%
3.0 | 38 | 6.6%
4.0 | 8 | 1.4%
5.0 | 8 | 1.4%
6.0 | 16 | 2.8%
7.0 | 4 | 0.7%
8.0 | 8 | 1.4%
9.0 | 3 | 0.5%
10.0 | 3 | 0.5%

Maximum 10 values
Value | Count | Frequency (%)
31.0 | 8 | 1.4%
27.0 | 7 | 1.2%
25.0 | 7 | 1.2%
21.0 | 6 | 1.0%
19.0 | 5 | 0.9%
18.0 | 5 | 0.9%
15.0 | 4 | 0.7%
14.0 | 4 | 0.7%
13.0 | 8 | 1.4%
12.9 | 1 | 0.2%

기점 출발(1)
Date

MISSING 

Distinct: 250
Distinct (%): 44.2%
Missing: 6
Missing (%): 1.0%
Memory size: 4.6 KiB
Minimum: 2023-12-12 06:00:00
Maximum: 2023-12-12 22:45:00
Histogram with fixed size bins (bins=50)

기점출발(2)
Date

MISSING 

Distinct: 177
Distinct (%): 67.6%
Missing: 310
Missing (%): 54.2%
Memory size: 4.6 KiB
Minimum: 2023-12-12 06:25:00
Maximum: 2023-12-12 23:15:00
Histogram with fixed size bins (bins=50)

기점출발(3)
Date

MISSING 

Distinct: 122
Distinct (%): 83.6%
Missing: 426
Missing (%): 74.5%
Memory size: 4.6 KiB
Minimum: 2023-12-12 06:55:00
Maximum: 2023-12-12 22:45:00
Histogram with fixed size bins (bins=50)

기점출발(4)
Date

MISSING 

Distinct: 82
Distinct (%): 79.6%
Missing: 469
Missing (%): 82.0%
Memory size: 4.6 KiB
Minimum: 2023-12-12 07:15:00
Maximum: 2023-12-12 23:38:00
Histogram with fixed size bins (bins=50)

일반(원)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 KiB
1500: 566
<NA>: 6

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1500
2nd row: 1500
3rd row: 1500
4th row: 1500
5th row: 1500

Common Values

Value | Count | Frequency (%)
1500 | 566 | 99.0%
<NA> | 6 | 1.0%
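
The 91.6% imbalance figure reported for this variable is consistent with a normalized Shannon entropy score: one minus the entropy of the class distribution divided by the maximum possible entropy for that number of classes. The sketch below applies that formula to the two counts above; the function name `imbalance_score` is hypothetical, and the formula is an assumption that happens to reproduce the reported value rather than something the report states.

```python
import math


def imbalance_score(counts):
    """Return 1 - H/H_max, where H is the Shannon entropy of the class counts.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)


# Counts for 일반(원): 566 rows with value 1500 and 6 rows with <NA>.
score = imbalance_score([566, 6])
print(f"{score:.1%}")
```

The same two counts occur for 중고생(원), 초등학생(원), and 데이터기준일, which would explain why all four alerts quote an identical percentage.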

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
1500 | 566 | 99.0%
na | 6 | 1.0%

중고생(원)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 KiB
1200: 566
<NA>: 6

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1200
2nd row: 1200
3rd row: 1200
4th row: 1200
5th row: 1200

Common Values

Value | Count | Frequency (%)
1200 | 566 | 99.0%
<NA> | 6 | 1.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
1200 | 566 | 99.0%
na | 6 | 1.0%

초등학생(원)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 KiB
800: 566
<NA>: 6

Length

Max length: 4
Median length: 3
Mean length: 3.0104895
Min length: 3
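
The non-integer mean length is explained if the six missing entries were treated as the four-character string `<NA>` rather than as nulls, which would also account for this column reporting 0 missing values. The sketch below rebuilds the column under that assumption and recomputes the length statistics.

```python
# Rebuild the column as the report appears to have seen it (an assumption):
# 566 real fares plus 6 missing entries stored as the literal string "<NA>".
values = ["800"] * 566 + ["<NA>"] * 6

lengths = [len(v) for v in values]
mean_length = sum(lengths) / len(lengths)

print(min(lengths), max(lengths), round(mean_length, 7))
```

The same arithmetic with a ten-character date string gives (566 × 10 + 6 × 4) / 572 ≈ 9.9370629, matching the mean length reported for 데이터기준일.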

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 800
2nd row: 800
3rd row: 800
4th row: 800
5th row: 800

Common Values

Value | Count | Frequency (%)
800 | 566 | 99.0%
<NA> | 6 | 1.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
800 | 566 | 99.0%
na | 6 | 1.0%

데이터기준일
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 KiB
2023-10-10: 566
<NA>: 6

Length

Max length: 10
Median length: 10
Mean length: 9.9370629
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-10-10
2nd row: 2023-10-10
3rd row: 2023-10-10
4th row: 2023-10-10
5th row: 2023-10-10

Common Values

Value | Count | Frequency (%)
2023-10-10 | 566 | 99.0%
<NA> | 6 | 1.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2023-10-10 | 566 | 99.0%
na | 6 | 1.0%

Interactions


Correlations

Variable | 기점 | 종점 | 거리 | 횟수 | 기점출발(4)
기점 | 1.000 | 0.000 | 0.828 | 0.000 | 0.885
종점 | 0.000 | 1.000 | 0.867 | 0.454 | 0.627
거리 | 0.828 | 0.867 | 1.000 | 0.401 | 0.291
횟수 | 0.000 | 0.454 | 0.401 | 1.000 | 0.000
기점출발(4) | 0.885 | 0.627 | 0.291 | 0.000 | 1.000

Variable | 중고생(원) | 데이터기준일 | 초등학생(원) | 일반(원)
중고생(원) | 1.000 | 1.000 | 1.000 | 1.000
데이터기준일 | 1.000 | 1.000 | 1.000 | 1.000
초등학생(원) | 1.000 | 1.000 | 1.000 | 1.000
일반(원) | 1.000 | 1.000 | 1.000 | 1.000

Variable | 거리 | 횟수 | 일반(원) | 중고생(원) | 초등학생(원) | 데이터기준일
거리 | 1.000 | -0.407 | 1.000 | 1.000 | 1.000 | 1.000
횟수 | -0.407 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
일반(원) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
중고생(원) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
초등학생(원) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
데이터기준일 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
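
Nullity correlation can be illustrated with a small sketch. It assumes the measure is the Pearson correlation between per-column "is missing" indicators, which is a common choice for heatmaps of this kind but is not stated in the report. The data are the departure-time columns from the first ten sample rows of this report, and the helper name `nullity_correlation` is hypothetical.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the is-missing indicators of two columns.

    Returns NaN when either column has no variation in missingness.
    """
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")
    return cov / math.sqrt(var_a * var_b)


# Departure columns from the first ten sample rows (None stands for <NA>).
dep1 = ["14:05", "09:40", "09:10", "20:25", "22:00",
        "09:20", "12:27", "17:30", "07:50", "16:30"]
dep2 = [None, None, "10:15", "21:15", "22:10",
        "09:45", "14:50", "18:45", None, None]
dep4 = [None, None, "16:55", "21:50", "23:38",
        "11:25", "15:50", "20:55", None, None]

print(nullity_correlation(dep2, dep4))  # missing in exactly the same rows
print(nullity_correlation(dep1, dep2))  # dep1 is never missing in this slice
```

In this slice the second and fourth departure columns are missing in exactly the same rows, so their nullity correlation is 1; a column that is always present has no defined nullity correlation.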

Sample

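Index | 계통번호 | 기점 | 경유지 | 종점 | 거리 | 횟수 | 기점 출발(1) | 기점출발(2) | 기점출발(3) | 기점출발(4) | 일반(원) | 중고생(원) | 초등학생(원) | 데이터기준일
0 | KTX11 | ktx | 한국전력기술 무실삼거리 삼각로타리 터미널 김천역 시민탑 김천중고 다수동 | 직지사 | 20.0 | 1.0 | 14:05 | <NA> | <NA> | <NA> | 1500 | 1200 | 800 | 2023-10-10
1 | KTX113-2 | ktx | 용시 남곡 무실삼거리 삼각로타리 | 터미널 | 8.9 | 1.0 | 09:40 | <NA> | <NA> | <NA> | 1500 | 1200 | 800 | 2023-10-10
2 | KTX12 | ktx | 한국전력기술 무실삼거리 삼각로타리 | 터미널 | 8.3 | 12.0 | 09:10 | 10:15 | 12:10 | 16:55 | 1500 | 1200 | 800 | 2023-10-10
3 | KTX13 | ktx | 한국전력기술 무실삼거리 삼각로타리 | 터미널 | 8.3 | 12.0 | 20:25 | 21:15 | 21:45 | 21:50 | 1500 | 1200 | 800 | 2023-10-10
4 | KTX14 | ktx | 한국전력기술 무실삼거리 삼각로타리 | 터미널 | 8.3 | 12.0 | 22:00 | 22:10 | 22:45 | 23:38 | 1500 | 1200 | 800 | 2023-10-10
5 | KTX13-2 | ktx | 용시 남곡 무실삼거리 삼각로타리 | 터미널 | 8.9 | 12.0 | 09:20 | 09:45 | 09:55 | 11:25 | 1500 | 1200 | 800 | 2023-10-10
6 | KTX13-2 | ktx | 용시 남곡 무실삼거리 삼각로타리 | 터미널 | 8.9 | 12.0 | 12:27 | 14:50 | 15:20 | 15:50 | 1500 | 1200 | 800 | 2023-10-10
7 | KTX13-2 | ktx | 용시 남곡 무실삼거리 삼각로타리 | 터미널 | 8.9 | 12.0 | 17:30 | 18:45 | 20:50 | 20:55 | 1500 | 1200 | 800 | 2023-10-10
8 | KTX14-6 | ktx | KCC아파트 우정사업조달센터 신촌 지좌동 | 터미널 | 8.4 | 1.0 | 07:50 | <NA> | <NA> | <NA> | 1500 | 1200 | 800 | 2023-10-10
9 | KTX15 | ktx | 용시 남곡 무실삼거리 삼각로타리 터미널 김천역 한일여중고 코아루A 김천대학 | 봉계 | 16.9 | 1.0 | 16:30 | <NA> | <NA> | <NA> | 1500 | 1200 | 800 | 2023-10-10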
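
Index | 계통번호 | 기점 | 경유지 | 종점 | 거리 | 횟수 | 기점 출발(1) | 기점출발(2) | 기점출발(3) | 기점출발(4) | 일반(원) | 중고생(원) | 초등학생(원) | 데이터기준일
562 | 하강883-2 | 하강 | 중앙고 양천동 황금시장 | 터미널 | 9.5 | 2.0 | 07:10 | 09:05 | <NA> | <NA> | 1500 | 1200 | 800 | 2023-10-10
563 | 해인94 | 해인농원 | 지시 부항면사무소 | 지례 | 27.0 | 1.0 | 13:25 | <NA> | <NA> | <NA> | 1500 | 1200 | 800 | 2023-10-10
564 | 해인95 | 해인농원 | 부항면사무소 지례 구성 황금시장 | 터미널 | 37.0 | 1.0 | 19:05 | <NA> | <NA> | <NA> | 1500 | 1200 | 800 | 2023-10-10
565 | 호동12-2 | 호동 | 성의중고 아랫장터 삼각로타리 | 터미널 | 3.5 | 1.0 | 10:25 | <NA> | <NA> | <NA> | 1500 | 1200 | 800 | 2023-10-10
566 | 호동222 | 호동 | 성의중고 아랫장터 삼각로타리 터미널 현대A 이마트 | 시청 | 6.5 | 2.0 | 07:40 | 14:42 | <NA> | <NA> | 1500 | 1200 | 800 | 2023-10-10
567 | 황새339-2 | 황새울 | 배시내 빗내 대홍A 김천의료원 | 터미널 | 20.5 | 1.0 | 07:00 | <NA> | <NA> | <NA> | 1500 | 1200 | 800 | 2023-10-10
568 | 희곡885-2 | 희곡 | 유촌 지례 김천상고 구성 | 터미널 | 28.1 | 1.0 | 13:25 | <NA> | <NA> | <NA> | 1500 | 1200 | 800 | 2023-10-10
569 | 희곡85-2 | 희곡 | 유촌 | 지례 | 8.1 | 1.0 | 07:20 | <NA> | <NA> | <NA> | 1500 | 1200 | 800 | 2023-10-10
570 | 김천800 | 터미널 | 황금시장 중앙고 구성 | 지례 | 20.0 | 2.0 | 13:30 | 19:30 | <NA> | <NA> | 1500 | 1200 | 800 | 2023-10-10
571 | 지례800 | 지례 | 구성 중앙고 황금시장 | 터미널 | 20.0 | 2.0 | 14:15 | 20:10 | <NA> | <NA> | 1500 | 1200 | 800 | 2023-10-10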

Duplicate rows

Most frequently occurring

Index | 계통번호 | 기점 | 경유지 | 종점 | 거리 | 횟수 | 기점 출발(1) | 기점출발(2) | 기점출발(3) | 기점출발(4) | 일반(원) | 중고생(원) | 초등학생(원) | 데이터기준일 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 6
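
The overview reports 1 (0.2%) duplicate row while this table shows a single all-missing row occurring 6 times. Those figures agree if the headline number counts distinct duplicated row patterns rather than repeated rows. The sketch below illustrates that counting on toy data; it is an interpretation of the numbers, and the three-field rows are a simplified stand-in for the real 14-column records.

```python
from collections import Counter

# Toy stand-in for the dataset: two ordinary routes plus six all-missing rows.
rows = [
    ("KTX11", "ktx", "직지사"),
    ("KTX12", "ktx", "터미널"),
] + [(None, None, None)] * 6

# Group identical rows and keep only the patterns that occur more than once.
group_sizes = Counter(rows)
duplicate_groups = {row: n for row, n in group_sizes.items() if n > 1}

n_duplicates = len(duplicate_groups)  # number of distinct duplicated patterns
print(n_duplicates, duplicate_groups)

# With 572 observations, one duplicated pattern rounds to the reported 0.2%.
print(round(100 * 1 / 572, 1))
```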