Overview

Dataset statistics

Number of variables: 8
Number of observations: 111
Missing cells: 68
Missing cells (%): 7.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.2 KiB
Average record size in memory: 66.2 B
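
The missing-cell percentage can be checked from the counts above. This is a minimal sketch that assumes the percentage is missing cells divided by (variables x observations); the variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" block above.
n_variables = 8
n_observations = 111
missing_cells = 68

# Assumption: the percentage is taken over all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
```

Under that assumption the result rounds to the 7.7% reported above.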

Variable types

Numeric: 1
Categorical: 4
Text: 2
DateTime: 1

Dataset

Description: Legal basis: Articles 17, 26, and 27 of the Act on Promotion of the Provision and Use of Public Data. Provides the status of taxi stands in Daegu Metropolitan City.
URL: https://www.data.go.kr/data/15084387/fileData.do

Alerts

구분 is highly overall correlated with 연번 and 1 other field (High correlation)
비고 is highly overall correlated with 연번 and 3 other fields (High correlation)
소재지(구별) is highly overall correlated with 연번 and 1 other field (High correlation)
유형 is highly overall correlated with 비고 (High correlation)
연번 is highly overall correlated with 구분 and 2 other fields (High correlation)
구분 is highly imbalanced (56.3%) (Imbalance)
설치날짜 has 68 (61.3%) missing values (Missing)
연번 has unique values (Unique)
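
The 56.3% imbalance figure for 구분 is consistent with a normalized-entropy score: one minus the Shannon entropy of the class distribution divided by its maximum possible value. The sketch below assumes that definition (it is not confirmed by this report) and uses the class counts 101 and 10 listed for 구분 under Variables; `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for class counts.

    H is the Shannon entropy in bits and k is the number of classes.
    0 means perfectly balanced, 1 means a single class dominates entirely.
    Assumption: this mirrors how the report's imbalance alert is scored.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Class counts for 구분 as listed in this report: 일반 = 101, 모범 = 10.
score = imbalance_score([101, 10])
print(f"imbalance: {100 * score:.1f}%")
```

With these counts the score rounds to 56.3%, matching the alert, while a 50/50 split would score 0.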

Reproduction

Analysis started: 2023-12-12 13:04:58.426035
Analysis finished: 2023-12-12 13:04:59.174642
Duration: 0.75 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 111
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 56
Minimum: 1
Maximum: 111
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 6.5
Q1: 28.5
Median: 56
Q3: 83.5
95-th percentile: 105.5
Maximum: 111
Range: 110
Interquartile range (IQR): 55

Descriptive statistics

Standard deviation: 32.186954
Coefficient of variation (CV): 0.57476703
Kurtosis: -1.2
Mean: 56
Median Absolute Deviation (MAD): 28
Skewness: 0
Sum: 6216
Variance: 1036
Monotonicity: Strictly increasing
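
Because 연번 is unique, strictly increasing, and spans 1 to 111 over 111 rows, its values can be taken to be the integers 1 through 111. Under that assumption, the following sketch recomputes several of the statistics above with Python's standard library. It uses the sample variance (n - 1 denominator) and linear-interpolation quartiles, which are choices made here to match the reported numbers rather than confirmed details of the profiling tool.

```python
import statistics

# Assumption: 연번 is exactly 1, 2, ..., 111 (unique, strictly increasing).
values = list(range(1, 112))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, n - 1 denominator
std_dev = statistics.stdev(values)       # square root of the sample variance
cv = std_dev / mean                      # coefficient of variation
median = statistics.median(values)
# Median absolute deviation: median of the distances from the median.
mad = statistics.median(abs(v - median) for v in values)
# "inclusive" interpolates linearly between the minimum and maximum.
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")

print(total, mean, variance, round(std_dev, 6), round(cv, 8))
print(median, mad, q1, q3, q3 - q1)
```

The recomputed sum (6216), variance (1036), MAD (28), Q1 (28.5), and Q3 (83.5) agree with the figures listed above.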

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1 | 1 | 0.9%
2 | 1 | 0.9%
83 | 1 | 0.9%
82 | 1 | 0.9%
81 | 1 | 0.9%
80 | 1 | 0.9%
79 | 1 | 0.9%
78 | 1 | 0.9%
77 | 1 | 0.9%
76 | 1 | 0.9%
Other values (101) | 101 | 91.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 0.9%
2 | 1 | 0.9%
3 | 1 | 0.9%
4 | 1 | 0.9%
5 | 1 | 0.9%
6 | 1 | 0.9%
7 | 1 | 0.9%
8 | 1 | 0.9%
9 | 1 | 0.9%
10 | 1 | 0.9%

Maximum 10 values

Value | Count | Frequency (%)
111 | 1 | 0.9%
110 | 1 | 0.9%
109 | 1 | 0.9%
108 | 1 | 0.9%
107 | 1 | 0.9%
106 | 1 | 0.9%
105 | 1 | 0.9%
104 | 1 | 0.9%
103 | 1 | 0.9%
102 | 1 | 0.9%

구분 (category)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1020.0 B

Value | Count
일반 | 101
모범 | 10

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 일반
2nd row: 일반
3rd row: 일반
4th row: 일반
5th row: 일반

Common Values

Value | Count | Frequency (%)
일반 | 101 | 91.0%
모범 | 10 | 9.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
일반 | 101 | 91.0%
모범 | 10 | 9.0%

정류소명 (stand name)
Text

Distinct: 108
Distinct (%): 97.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1020.0 B

Length

Max length: 34
Median length: 23
Mean length: 13.792793
Min length: 5

Characters and Unicode

Total characters: 1531
Distinct characters: 245
Distinct categories: 8
Distinct scripts: 4
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 105
Unique (%): 94.6%

Sample

1st row: 동변동선수촌아파트(802동)앞
2nd row: 서변동 제2공영주차장 앞
3rd row: 칠곡신한은행앞(팔거역 근처)
4th row: 칠곡로즈마리병원앞(팔거역 근처)
5th row: 칠곡홈플러스앞(팔거역 근처)

Words

Value | Count | Frequency (%)
 | 36 | 13.5%
건너 | 8 | 3.0%
건너편 | 6 | 2.2%
동부정류장 | 4 | 1.5%
동대구역 | 4 | 1.5%
근처 | 4 | 1.5%
맞은편 | 3 | 1.1%
대곡 | 3 | 1.1%
홈플러스 | 3 | 1.1%
이마트 | 3 | 1.1%
Other values (179) | 193 | 72.3%

Most occurring characters

Value | Count | Frequency (%)
(space) | 157 | 10.3%
 | 78 | 5.1%
( | 66 | 4.3%
) | 66 | 4.3%
 | 53 | 3.5%
 | 46 | 3.0%
 | 39 | 2.5%
1 | 25 | 1.6%
 | 25 | 1.6%
 | 22 | 1.4%
Other values (235) | 954 | 62.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1132 | 73.9%
Space Separator | 157 | 10.3%
Decimal Number | 96 | 6.3%
Open Punctuation | 66 | 4.3%
Close Punctuation | 66 | 4.3%
Uppercase Letter | 8 | 0.5%
Other Punctuation | 5 | 0.3%
Dash Punctuation | 1 | 0.1%

Most frequent character per category

Other Letter

Value | Count | Frequency (%)
 | 78 | 6.9%
 | 53 | 4.7%
 | 46 | 4.1%
 | 39 | 3.4%
 | 25 | 2.2%
 | 22 | 1.9%
 | 22 | 1.9%
 | 20 | 1.8%
 | 19 | 1.7%
 | 19 | 1.7%
Other values (213) | 789 | 69.7%

Decimal Number

Value | Count | Frequency (%)
1 | 25 | 26.0%
0 | 22 | 22.9%
3 | 14 | 14.6%
2 | 12 | 12.5%
5 | 7 | 7.3%
7 | 6 | 6.2%
8 | 4 | 4.2%
4 | 3 | 3.1%
6 | 2 | 2.1%
9 | 1 | 1.0%

Uppercase Letter

Value | Count | Frequency (%)
C | 2 | 25.0%
E | 2 | 25.0%
X | 1 | 12.5%
O | 1 | 12.5%
K | 1 | 12.5%
I | 1 | 12.5%

Other Punctuation

Value | Count | Frequency (%)
, | 4 | 80.0%
. | 1 | 20.0%

Space Separator

Value | Count | Frequency (%)
(space) | 157 | 100.0%

Open Punctuation

Value | Count | Frequency (%)
( | 66 | 100.0%

Close Punctuation

Value | Count | Frequency (%)
) | 66 | 100.0%

Dash Punctuation

Value | Count | Frequency (%)
- | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1131 | 73.9%
Common | 391 | 25.5%
Latin | 8 | 0.5%
Han | 1 | 0.1%

Most frequent character per script

Hangul

Value | Count | Frequency (%)
 | 78 | 6.9%
 | 53 | 4.7%
 | 46 | 4.1%
 | 39 | 3.4%
 | 25 | 2.2%
 | 22 | 1.9%
 | 22 | 1.9%
 | 20 | 1.8%
 | 19 | 1.7%
 | 19 | 1.7%
Other values (212) | 788 | 69.7%

Common

Value | Count | Frequency (%)
(space) | 157 | 40.2%
( | 66 | 16.9%
) | 66 | 16.9%
1 | 25 | 6.4%
0 | 22 | 5.6%
3 | 14 | 3.6%
2 | 12 | 3.1%
5 | 7 | 1.8%
7 | 6 | 1.5%
8 | 4 | 1.0%
Other values (6) | 12 | 3.1%

Latin

Value | Count | Frequency (%)
C | 2 | 25.0%
E | 2 | 25.0%
X | 1 | 12.5%
O | 1 | 12.5%
K | 1 | 12.5%
I | 1 | 12.5%

Han

Value | Count | Frequency (%)
 | 1 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1131 | 73.9%
ASCII | 399 | 26.1%
CJK | 1 | 0.1%

Most frequent character per block

ASCII

Value | Count | Frequency (%)
(space) | 157 | 39.3%
( | 66 | 16.5%
) | 66 | 16.5%
1 | 25 | 6.3%
0 | 22 | 5.5%
3 | 14 | 3.5%
2 | 12 | 3.0%
5 | 7 | 1.8%
7 | 6 | 1.5%
8 | 4 | 1.0%
Other values (12) | 20 | 5.0%

Hangul

Value | Count | Frequency (%)
 | 78 | 6.9%
 | 53 | 4.7%
 | 46 | 4.1%
 | 39 | 3.4%
 | 25 | 2.2%
 | 22 | 1.9%
 | 22 | 1.9%
 | 20 | 1.8%
 | 19 | 1.7%
 | 19 | 1.7%
Other values (212) | 788 | 69.7%

CJK

Value | Count | Frequency (%)
 | 1 | 100.0%

소재지(구별) (location, by district)
Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): 7.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1020.0 B

Value | Count
대구광역시 동구 | 30
대구광역시 달서구 | 22
대구광역시 북구 | 19
대구광역시 수성구 | 15
대구광역시 서구 | 10
Other values (3) | 15

Length

Max length: 9
Median length: 8
Mean length: 8.3873874
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 대구광역시 북구
2nd row: 대구광역시 북구
3rd row: 대구광역시 북구
4th row: 대구광역시 북구
5th row: 대구광역시 북구

Common Values

Value | Count | Frequency (%)
대구광역시 동구 | 30 | 27.0%
대구광역시 달서구 | 22 | 19.8%
대구광역시 북구 | 19 | 17.1%
대구광역시 수성구 | 15 | 13.5%
대구광역시 서구 | 10 | 9.0%
대구광역시 중구 | 6 | 5.4%
대구광역시 달성군 | 6 | 5.4%
대구광역시 남구 | 3 | 2.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
대구광역시 | 111 | 50.0%
동구 | 30 | 13.5%
달서구 | 22 | 9.9%
북구 | 19 | 8.6%
수성구 | 15 | 6.8%
서구 | 10 | 4.5%
중구 | 6 | 2.7%
달성군 | 6 | 2.7%
남구 | 3 | 1.4%

소재지(주소) (location, address)
Text

Distinct: 101
Distinct (%): 91.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1020.0 B

Length

Max length: 19
Median length: 14
Mean length: 7.9459459
Min length: 5

Characters and Unicode

Total characters: 882
Distinct characters: 109
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 96
Unique (%): 86.5%

Sample

1st row: 동변로 55
2nd row: 서변동 1792
3rd row: 학정로 427
4th row: 팔거천동로 215
5th row: 동암로12길 8

Words

Value | Count | Frequency (%)
동대구로550 | 7 | 3.1%
국채보상로 | 6 | 2.7%
동대구로 | 5 | 2.2%
와룡로 | 4 | 1.8%
527 | 4 | 1.8%
달구벌대로 | 4 | 1.8%
명덕로 | 4 | 1.8%
칠곡중앙대로 | 3 | 1.3%
화랑로 | 3 | 1.3%
아양로 | 3 | 1.3%
Other values (154) | 182 | 80.9%

Most occurring characters

Value | Count | Frequency (%)
(space) | 114 | 12.9%
 | 104 | 11.8%
1 | 55 | 6.2%
5 | 47 | 5.3%
2 | 47 | 5.3%
3 | 36 | 4.1%
0 | 30 | 3.4%
7 | 27 | 3.1%
 | 27 | 3.1%
4 | 26 | 2.9%
Other values (99) | 369 | 41.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 439 | 49.8%
Decimal Number | 322 | 36.5%
Space Separator | 114 | 12.9%
Dash Punctuation | 6 | 0.7%
Other Punctuation | 1 | 0.1%

Most frequent character per category

Other Letter

Value | Count | Frequency (%)
 | 104 | 23.7%
 | 27 | 6.2%
 | 24 | 5.5%
 | 24 | 5.5%
 | 12 | 2.7%
 | 11 | 2.5%
 | 8 | 1.8%
 | 8 | 1.8%
 | 7 | 1.6%
 | 6 | 1.4%
Other values (86) | 208 | 47.4%

Decimal Number

Value | Count | Frequency (%)
1 | 55 | 17.1%
5 | 47 | 14.6%
2 | 47 | 14.6%
3 | 36 | 11.2%
0 | 30 | 9.3%
7 | 27 | 8.4%
4 | 26 | 8.1%
6 | 25 | 7.8%
9 | 20 | 6.2%
8 | 9 | 2.8%

Space Separator

Value | Count | Frequency (%)
(space) | 114 | 100.0%

Dash Punctuation

Value | Count | Frequency (%)
- | 6 | 100.0%

Other Punctuation

Value | Count | Frequency (%)
, | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 443 | 50.2%
Hangul | 439 | 49.8%

Most frequent character per script

Hangul

Value | Count | Frequency (%)
 | 104 | 23.7%
 | 27 | 6.2%
 | 24 | 5.5%
 | 24 | 5.5%
 | 12 | 2.7%
 | 11 | 2.5%
 | 8 | 1.8%
 | 8 | 1.8%
 | 7 | 1.6%
 | 6 | 1.4%
Other values (86) | 208 | 47.4%

Common

Value | Count | Frequency (%)
(space) | 114 | 25.7%
1 | 55 | 12.4%
5 | 47 | 10.6%
2 | 47 | 10.6%
3 | 36 | 8.1%
0 | 30 | 6.8%
7 | 27 | 6.1%
4 | 26 | 5.9%
6 | 25 | 5.6%
9 | 20 | 4.5%
Other values (3) | 16 | 3.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 443 | 50.2%
Hangul | 439 | 49.8%

Most frequent character per block

ASCII

Value | Count | Frequency (%)
(space) | 114 | 25.7%
1 | 55 | 12.4%
5 | 47 | 10.6%
2 | 47 | 10.6%
3 | 36 | 8.1%
0 | 30 | 6.8%
7 | 27 | 6.1%
4 | 26 | 5.9%
6 | 25 | 5.6%
9 | 20 | 4.5%
Other values (3) | 16 | 3.6%

Hangul

Value | Count | Frequency (%)
 | 104 | 23.7%
 | 27 | 6.2%
 | 24 | 5.5%
 | 24 | 5.5%
 | 12 | 2.7%
 | 11 | 2.5%
 | 8 | 1.8%
 | 8 | 1.8%
 | 7 | 1.6%
 | 6 | 1.4%
Other values (86) | 208 | 47.4%

유형 (type)
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1020.0 B

Value | Count
무개 | 74
유무개 | 20
유개 | 17

Length

Max length: 3
Median length: 2
Mean length: 2.1801802
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 무개
2nd row: 무개
3rd row: 무개
4th row: 무개
5th row: 유무개

Common Values

Value | Count | Frequency (%)
무개 | 74 | 66.7%
유무개 | 20 | 18.0%
유개 | 17 | 15.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
무개 | 74 | 66.7%
유무개 | 20 | 18.0%
유개 | 17 | 15.3%

설치날짜 (installation date)
Date

MISSING 

Distinct: 15
Distinct (%): 34.9%
Missing: 68
Missing (%): 61.3%
Memory size: 1020.0 B
Minimum: 2010-09-10 00:00:00
Maximum: 2022-12-14 00:00:00
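
The Distinct (%) figure above appears to be computed over non-missing values rather than over all 111 rows. That reading is an assumption, but the short check below shows it reproduces the reported percentages; the variable names are illustrative.

```python
# Figures copied from the 설치날짜 summary above.
n_rows = 111
n_missing = 68
n_distinct = 15

n_present = n_rows - n_missing
missing_pct = 100 * n_missing / n_rows
# Assumption: the distinct share uses non-missing rows as the denominator.
distinct_pct = 100 * n_distinct / n_present

print(n_present, f"{missing_pct:.1f}%", f"{distinct_pct:.1f}%")
```

Only 43 rows carry an installation date, so 15 distinct dates correspond to 34.9% of the populated values.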
Histogram with fixed size bins (bins=15)

비고 (remarks)
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1020.0 B

Value | Count
<NA> | 75
베이 | 36

Length

Max length: 4
Median length: 4
Mean length: 3.3513514
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 75 | 67.6%
베이 | 36 | 32.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
na | 75 | 67.6%
베이 | 36 | 32.4%

Correlations

Variable | 연번 | 구분 | 소재지(구별) | 유형 | 설치날짜
연번 | 1.000 | 0.996 | 0.858 | 0.217 | 0.436
구분 | 0.996 | 1.000 | 0.226 | 0.126 | 0.122
소재지(구별) | 0.858 | 0.226 | 1.000 | 0.443 | 0.786
유형 | 0.217 | 0.126 | 0.443 | 1.000 | 0.557
설치날짜 | 0.436 | 0.122 | 0.786 | 0.557 | 1.000

Variable | 구분 | 비고 | 소재지(구별) | 유형
구분 | 1.000 | 1.000 | 0.163 | 0.207
비고 | 1.000 | 1.000 | 1.000 | 1.000
소재지(구별) | 0.163 | 1.000 | 1.000 | 0.307
유형 | 0.207 | 1.000 | 0.307 | 1.000

Variable | 연번 | 구분 | 소재지(구별) | 유형 | 비고
연번 | 1.000 | 0.909 | 0.629 | 0.023 | 1.000
구분 | 0.909 | 1.000 | 0.163 | 0.207 | 1.000
소재지(구별) | 0.629 | 0.163 | 1.000 | 0.307 | 1.000
유형 | 0.023 | 0.207 | 0.307 | 1.000 | 1.000
비고 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

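First rows

Index | 연번 | 구분 | 정류소명 | 소재지(구별) | 소재지(주소) | 유형 | 설치날짜 | 비고
0 | 1 | 일반 | 동변동선수촌아파트(802동)앞 | 대구광역시 북구 | 동변로 55 | 무개 | 2014-12-04 | <NA>
1 | 2 | 일반 | 서변동 제2공영주차장 앞 | 대구광역시 북구 | 서변동 1792 | 무개 | 2015-06-10 | <NA>
2 | 3 | 일반 | 칠곡신한은행앞(팔거역 근처) | 대구광역시 북구 | 학정로 427 | 무개 | <NA> | <NA>
3 | 4 | 일반 | 칠곡로즈마리병원앞(팔거역 근처) | 대구광역시 북구 | 팔거천동로 215 | 무개 | <NA> | <NA>
4 | 5 | 일반 | 칠곡홈플러스앞(팔거역 근처) | 대구광역시 북구 | 동암로12길 8 | 유무개 | <NA> | <NA>
5 | 6 | 일반 | 칠곡동서영남아파트 앞(화성타운 맞은편) | 대구광역시 북구 | 구암로 180 | 무개 | <NA> | <NA>
6 | 7 | 일반 | 칠곡화성타운2차(103동)앞 | 대구광역시 북구 | 동천로 6 | 무개 | <NA> | <NA>
7 | 8 | 일반 | 동아아울렛 강북점 앞 | 대구광역시 북구 | 칠곡중앙대로 416 | 무개 | <NA> | <NA>
8 | 9 | 일반 | 칠곡 강비뇨기과 앞(지하차도 전) | 대구광역시 북구 | 칠곡중앙대로 388 | 무개 | <NA> | 베이
9 | 10 | 일반 | 칠곡대구은행 태전동지점 앞(송림스포츠프라자 건물) | 대구광역시 북구 | 칠곡중앙대로 309 | 무개 | <NA> | <NA>

Last rows

Index | 연번 | 구분 | 정류소명 | 소재지(구별) | 소재지(주소) | 유형 | 설치날짜 | 비고
101 | 102 | 모범 | 서대구고속터미날 앞(만평네거리) | 대구광역시 북구 | 팔달로 103 | 유무개 | <NA> | <NA>
102 | 103 | 모범 | 북부정류장 입구 | 대구광역시 서구 | 서대구로 299 | 무개 | <NA> | 베이
103 | 104 | 모범 | 동대구역(신세계백화점 앞) | 대구광역시 동구 | 동대구로550 | 유무개 | 2017-02-05 | 베이
104 | 105 | 모범 | 동부정류장 밑(모범택시) | 대구광역시 동구 | 화랑로 65 | 무개 | <NA> | <NA>
105 | 106 | 모범 | 동대구역 광장(대형,모범) | 대구광역시 동구 | 동대구로550 | 무개 | 2017-08-01 | 베이
106 | 107 | 모범 | 대구공항 앞(일반, 모범택시) | 대구광역시 동구 | 공항로 221 | 유개 | <NA> | <NA>
107 | 108 | 모범 | 대구역 앞(일반, 모범택시) | 대구광역시 중구 | 태평로 161 | 유개 | <NA> | <NA>
108 | 109 | 모범 | 인터불고 호텔 앞(모범택시) | 대구광역시 수성구 | 팔현길 212 | 유개 | <NA> | <NA>
109 | 110 | 모범 | 서대구역사(남측주차장) | 대구광역시 서구 | 와룡로 527 | 유무개 | 2022-03-30 | <NA>
110 | 111 | 모범 | 서대구역사(북측주차장) | 대구광역시 서구 | 와룡로 527 | 유무개 | 2022-03-30 | <NA>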