Overview

Dataset statistics

Number of variables: 8
Number of observations: 106
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.9 KiB
Average record size in memory: 66.2 B

Variable types

Numeric: 1
Categorical: 5
Text: 2

Dataset

Description: 대구광역시_택시승차대현황_20210831 (Daegu Metropolitan City taxi stand status, 2021-08-31)
Author: 대구광역시 (Daegu Metropolitan City)
URL: http://data.daegu.go.kr/open/data/dataView.do?dataSetId=15084387&dataSetDetailId=150843871ab86b474d636&provdMethod=FILE

Alerts

설치년도 is highly overall correlated with 비고 (High correlation)
구분 is highly overall correlated with 연번 and 1 other field (High correlation)
비고 is highly overall correlated with 연번 and 4 other fields (High correlation)
유형 is highly overall correlated with 비고 (High correlation)
소 재 지(구별) is highly overall correlated with 연번 and 1 other field (High correlation)
연번 is highly overall correlated with 구분 and 2 other fields (High correlation)
구분 is highly imbalanced (61.4%) (Imbalance)
연번 has unique values (Unique)

Reproduction

Analysis started: 2024-04-17 14:59:12.193623
Analysis finished: 2024-04-17 14:59:12.776642
Duration: 0.58 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
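
The reported duration follows from the two analysis timestamps. A minimal sketch of that arithmetic, assuming both timestamps share one time zone and the layout printed above; analysis_duration is a hypothetical helper name, not part of ydata-profiling:

```python
from datetime import datetime

# Timestamp layout as printed in the Reproduction block.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def analysis_duration(started: str, finished: str) -> float:
    """Return the elapsed time in seconds between two report timestamps."""
    start = datetime.strptime(started, TIMESTAMP_FORMAT)
    end = datetime.strptime(finished, TIMESTAMP_FORMAT)
    return (end - start).total_seconds()


seconds = analysis_duration(
    "2024-04-17 14:59:12.193623",
    "2024-04-17 14:59:12.776642",
)
print(f"{seconds:.2f} seconds")  # 0.58 seconds
```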

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 106
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 53.5
Minimum: 1
Maximum: 106
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 6.25
Q1: 27.25
Median: 53.5
Q3: 79.75
95-th percentile: 100.75
Maximum: 106
Range: 105
Interquartile range (IQR): 52.5

Descriptive statistics

Standard deviation: 30.743563
Coefficient of variation (CV): 0.57464604
Kurtosis: -1.2
Mean: 53.5
Median Absolute Deviation (MAD): 26.5
Skewness: 0
Sum: 5671
Variance: 945.16667
Monotonicity: Strictly increasing
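
The quantile and descriptive statistics above can be cross-checked with the Python standard library. This is a sketch under one assumption: that 연번 holds exactly the integers 1 to 106, which the minimum, maximum, distinct count, and monotonicity entries suggest. It uses linearly interpolated percentiles and the sample (n - 1) variance, which happen to match the reported figures; it is not the profiler's own code.

```python
import statistics

# Assumption: 연번 is the serial number 1..106, as the minimum, maximum,
# distinct count, and "strictly increasing" entries in the report suggest.
values = list(range(1, 107))

# Percentiles with linear interpolation over the closed range of the data.
twentieths = statistics.quantiles(values, n=20, method="inclusive")
quartiles = statistics.quantiles(values, n=4, method="inclusive")

median = statistics.median(values)
summary = {
    "p5": twentieths[0],
    "q1": quartiles[0],
    "median": median,
    "q3": quartiles[2],
    "p95": twentieths[-1],
    "iqr": quartiles[2] - quartiles[0],
    "mean": statistics.mean(values),
    "variance": statistics.variance(values),  # sample variance (n - 1)
    "std": statistics.stdev(values),
    # Median absolute deviation around the median.
    "mad": statistics.median(abs(v - median) for v in values),
    "sum": sum(values),
}
summary["cv"] = summary["std"] / summary["mean"]

for name, value in summary.items():
    print(f"{name}: {value:.8g}")
```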
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
1 | 1 | 0.9%
81 | 1 | 0.9%
79 | 1 | 0.9%
78 | 1 | 0.9%
77 | 1 | 0.9%
76 | 1 | 0.9%
75 | 1 | 0.9%
74 | 1 | 0.9%
73 | 1 | 0.9%
72 | 1 | 0.9%
Other values (96) | 96 | 90.6%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 0.9%
2 | 1 | 0.9%
3 | 1 | 0.9%
4 | 1 | 0.9%
5 | 1 | 0.9%
6 | 1 | 0.9%
7 | 1 | 0.9%
8 | 1 | 0.9%
9 | 1 | 0.9%
10 | 1 | 0.9%

Maximum 10 values

Value | Count | Frequency (%)
106 | 1 | 0.9%
105 | 1 | 0.9%
104 | 1 | 0.9%
103 | 1 | 0.9%
102 | 1 | 0.9%
101 | 1 | 0.9%
100 | 1 | 0.9%
99 | 1 | 0.9%
98 | 1 | 0.9%
97 | 1 | 0.9%

구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B
일반: 98
모범: 8

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 일반
2nd row: 일반
3rd row: 일반
4th row: 일반
5th row: 일반

Common Values

Value | Count | Frequency (%)
일반 | 98 | 92.5%
모범 | 8 | 7.5%
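
The 61.4% imbalance figure reported for 구분 is consistent with a normalized Shannon-entropy score, 1 - H / log2(k), where H is the entropy of the class shares and k the number of classes. The sketch below assumes that definition and is not taken from the profiler's source; imbalance_score is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for class counts, where H is Shannon entropy."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no imbalance to measure
    probabilities = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probabilities)
    return 1.0 - entropy / math.log2(k)


# Counts for 구분 from the Common Values table: 일반 = 98, 모범 = 8.
score = imbalance_score([98, 8])
print(f"{score:.1%}")  # 61.4%
```

Under this definition a perfectly balanced column scores 0 and a column dominated by one class approaches 1.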

Length

Histogram of lengths of the category


정류소명
Text

Distinct: 105
Distinct (%): 99.1%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Length

Max length: 34
Median length: 23
Mean length: 13.660377
Min length: 5

Characters and Unicode

Total characters: 1448
Distinct characters: 242
Distinct categories: 8
Distinct scripts: 4
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
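
As an illustration of how such character properties can be read, the sketch below counts Unicode general categories for the first sample row of this variable. It uses Python's standard unicodedata module, which is an assumption for illustration and not necessarily what the profiler uses internally. The two-letter codes map to the category names in the report: Lo is Other Letter, Nd is Decimal Number, Ps and Pe are Open and Close Punctuation, Zs is Space Separator.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters by Unicode general category code (e.g. 'Lo', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)


# First sample row listed for this variable.
sample = "동변동선수촌아파트(802동)앞"
counts = category_counts(sample)
print(counts)
```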

Unique

Unique: 104
Unique (%): 98.1%

Sample

1st row: 동변동선수촌아파트(802동)앞
2nd row: 서변동 제2공영주차장 앞
3rd row: 칠곡신한은행앞(팔거역 근처)
4th row: 칠곡로즈마리병원앞(팔거역 근처)
5th row: 칠곡홈플러스앞(팔거역 근처)
ValueCountFrequency (%)
36
 
14.1%
건너 8
 
3.1%
건너편 6
 
2.3%
근처 5
 
2.0%
동부정류장 4
 
1.6%
동대구역 4
 
1.6%
북부정류장 3
 
1.2%
홈플러스 3
 
1.2%
이마트 3
 
1.2%
입구 3
 
1.2%
Other values (170) 181
70.7%

Most occurring characters

ValueCountFrequency (%)
150
 
10.4%
78
 
5.4%
) 61
 
4.2%
( 61
 
4.2%
54
 
3.7%
39
 
2.7%
34
 
2.3%
1 23
 
1.6%
22
 
1.5%
22
 
1.5%
Other values (232) 904
62.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1075 | 74.2%
Space Separator | 150 | 10.4%
Decimal Number | 87 | 6.0%
Close Punctuation | 61 | 4.2%
Open Punctuation | 61 | 4.2%
Uppercase Letter | 8 | 0.6%
Other Punctuation | 5 | 0.3%
Dash Punctuation | 1 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
78
 
7.3%
54
 
5.0%
39
 
3.6%
34
 
3.2%
22
 
2.0%
22
 
2.0%
21
 
2.0%
20
 
1.9%
19
 
1.8%
18
 
1.7%
Other values (210) 748
69.6%
Decimal Number
ValueCountFrequency (%)
1 23
26.4%
0 20
23.0%
3 12
13.8%
2 10
11.5%
5 7
 
8.0%
7 5
 
5.7%
8 4
 
4.6%
4 3
 
3.4%
6 2
 
2.3%
9 1
 
1.1%
Uppercase Letter
ValueCountFrequency (%)
E 2
25.0%
C 2
25.0%
X 1
12.5%
I 1
12.5%
K 1
12.5%
O 1
12.5%
Other Punctuation
ValueCountFrequency (%)
, 4
80.0%
. 1
 
20.0%
Space Separator
ValueCountFrequency (%)
150
100.0%
Close Punctuation
ValueCountFrequency (%)
) 61
100.0%
Open Punctuation
ValueCountFrequency (%)
( 61
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1074 | 74.2%
Common | 365 | 25.2%
Latin | 8 | 0.6%
Han | 1 | 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
78
 
7.3%
54
 
5.0%
39
 
3.6%
34
 
3.2%
22
 
2.0%
22
 
2.0%
21
 
2.0%
20
 
1.9%
19
 
1.8%
18
 
1.7%
Other values (209) 747
69.6%
Common
ValueCountFrequency (%)
150
41.1%
) 61
16.7%
( 61
16.7%
1 23
 
6.3%
0 20
 
5.5%
3 12
 
3.3%
2 10
 
2.7%
5 7
 
1.9%
7 5
 
1.4%
8 4
 
1.1%
Other values (6) 12
 
3.3%
Latin
ValueCountFrequency (%)
E 2
25.0%
C 2
25.0%
X 1
12.5%
I 1
12.5%
K 1
12.5%
O 1
12.5%
Han
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1074 | 74.2%
ASCII | 373 | 25.8%
CJK | 1 | 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
150
40.2%
) 61
16.4%
( 61
16.4%
1 23
 
6.2%
0 20
 
5.4%
3 12
 
3.2%
2 10
 
2.7%
5 7
 
1.9%
7 5
 
1.3%
8 4
 
1.1%
Other values (12) 20
 
5.4%
Hangul
ValueCountFrequency (%)
78
 
7.3%
54
 
5.0%
39
 
3.6%
34
 
3.2%
22
 
2.0%
22
 
2.0%
21
 
2.0%
20
 
1.9%
19
 
1.8%
18
 
1.7%
Other values (209) 747
69.6%
CJK
ValueCountFrequency (%)
1
100.0%

소 재 지(구별)
Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): 7.5%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B
동구: 30
달서구: 22
북구: 18
수성구: 15
중구: 7
Other values (3): 14

Length

Max length: 3
Median length: 2
Mean length: 2.3962264
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 북구
2nd row: 북구
3rd row: 북구
4th row: 북구
5th row: 북구

Common Values

Value | Count | Frequency (%)
동구 | 30 | 28.3%
달서구 | 22 | 20.8%
북구 | 18 | 17.0%
수성구 | 15 | 14.2%
중구 | 7 | 6.6%
서구 | 6 | 5.7%
달성군 | 5 | 4.7%
남구 | 3 | 2.8%
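
The percentages in the Common Values table are each count divided by the 106 observations, and Distinct (%) is the number of distinct values over the same total. A short sketch of that arithmetic from the counts above; frequency_table is a hypothetical helper name:

```python
# Counts copied from the Common Values table for 소 재 지(구별).
district_counts = {
    "동구": 30, "달서구": 22, "북구": 18, "수성구": 15,
    "중구": 7, "서구": 6, "달성군": 5, "남구": 3,
}


def frequency_table(counts):
    """Return (value, count, percent string) rows, most frequent first."""
    total = sum(counts.values())
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [(value, n, f"{100 * n / total:.1f}%") for value, n in ordered]


rows = frequency_table(district_counts)
for value, n, percent in rows:
    print(value, n, percent)

# Share of distinct values among all observations.
total_rows = sum(district_counts.values())
distinct_percent = f"{100 * len(district_counts) / total_rows:.1f}%"
print("Distinct (%):", distinct_percent)
```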

Length

Histogram of lengths of the category


소재지(주소)
Text

Distinct: 98
Distinct (%): 92.5%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Length

Max length: 14
Median length: 13
Mean length: 7.8113208
Min length: 5

Characters and Unicode

Total characters: 828
Distinct characters: 103
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 95
Unique (%): 89.6%

Sample

1st row: 동변로 55
2nd row: 서변동 1792
3rd row: 학정로 427
4th row: 팔거천동로 215
5th row: 동암로12길 8
ValueCountFrequency (%)
동대구로550 7
 
3.3%
국채보상로 6
 
2.9%
동대구로 5
 
2.4%
달구벌대로 4
 
1.9%
명덕로 4
 
1.9%
45 3
 
1.4%
서대구로 3
 
1.4%
칠곡중앙대로 3
 
1.4%
아양로 3
 
1.4%
화원읍 3
 
1.4%
Other values (148) 169
80.5%

Most occurring characters

ValueCountFrequency (%)
104
 
12.6%
99
 
12.0%
1 53
 
6.4%
5 43
 
5.2%
2 42
 
5.1%
3 35
 
4.2%
0 29
 
3.5%
27
 
3.3%
24
 
2.9%
6 24
 
2.9%
Other values (93) 348
42.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 415 | 50.1%
Decimal Number | 303 | 36.6%
Space Separator | 104 | 12.6%
Dash Punctuation | 6 | 0.7%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
99
23.9%
27
 
6.5%
24
 
5.8%
22
 
5.3%
12
 
2.9%
9
 
2.2%
9
 
2.2%
8
 
1.9%
7
 
1.7%
6
 
1.4%
Other values (81) 192
46.3%
Decimal Number
ValueCountFrequency (%)
1 53
17.5%
5 43
14.2%
2 42
13.9%
3 35
11.6%
0 29
9.6%
6 24
7.9%
7 24
7.9%
4 24
7.9%
9 20
 
6.6%
8 9
 
3.0%
Space Separator
ValueCountFrequency (%)
104
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 415 | 50.1%
Common | 413 | 49.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
99
23.9%
27
 
6.5%
24
 
5.8%
22
 
5.3%
12
 
2.9%
9
 
2.2%
9
 
2.2%
8
 
1.9%
7
 
1.7%
6
 
1.4%
Other values (81) 192
46.3%
Common
ValueCountFrequency (%)
104
25.2%
1 53
12.8%
5 43
10.4%
2 42
10.2%
3 35
 
8.5%
0 29
 
7.0%
6 24
 
5.8%
7 24
 
5.8%
4 24
 
5.8%
9 20
 
4.8%
Other values (2) 15
 
3.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 415 | 50.1%
ASCII | 413 | 49.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
104
25.2%
1 53
12.8%
5 43
10.4%
2 42
10.2%
3 35
 
8.5%
0 29
 
7.0%
6 24
 
5.8%
7 24
 
5.8%
4 24
 
5.8%
9 20
 
4.8%
Other values (2) 15
 
3.6%
Hangul
ValueCountFrequency (%)
99
23.9%
27
 
6.5%
24
 
5.8%
22
 
5.3%
12
 
2.9%
9
 
2.2%
9
 
2.2%
8
 
1.9%
7
 
1.7%
6
 
1.4%
Other values (81) 192
46.3%

유형
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B
무개: 74
유개: 17
유무개: 15

Length

Max length: 3
Median length: 2
Mean length: 2.1415094
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 무개
2nd row: 무개
3rd row: 무개
4th row: 무개
5th row: 유무개

Common Values

Value | Count | Frequency (%)
무개 | 74 | 69.8%
유개 | 17 | 16.0%
유무개 | 15 | 14.2%

Length

Histogram of lengths of the category


설치년도
Categorical

HIGH CORRELATION 

Distinct: 13
Distinct (%): 12.3%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B
<NA>: 69
2010-09-10: 9
2014-12-04: 6
2015-06-10: 5
2017-02-05: 4
Other values (8): 13

Length

Max length: 10
Median length: 4
Mean length: 6.0943396
Min length: 4

Unique

Unique: 5
Unique (%): 4.7%

Sample

1st row: 2014-12-04
2nd row: 2015-06-10
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 69 | 65.1%
2010-09-10 | 9 | 8.5%
2014-12-04 | 6 | 5.7%
2015-06-10 | 5 | 4.7%
2017-02-05 | 4 | 3.8%
2015-09-25 | 3 | 2.8%
2017-08-01 | 3 | 2.8%
2011-04-04 | 2 | 1.9%
2019-11-28 | 1 | 0.9%
2012-07-15 | 1 | 0.9%
Other values (3) | 3 | 2.8%
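
The report lists 0 missing values for 설치년도 even though <NA> accounts for 65.1% of rows. The length statistics point to a likely reason: a mean length of 6.0943396 is what results if <NA> is stored as a literal four-character string and every other value is a ten-character date. The sketch below checks that arithmetic from the counts above. Treating the three values folded into "Other values (3)" as ten-character dates is an assumption, and the conclusion about how the column was loaded is an inference, not something the report states.

```python
# Value counts copied from the Common Values table for 설치년도.
value_counts = {
    "<NA>": 69, "2010-09-10": 9, "2014-12-04": 6, "2015-06-10": 5,
    "2017-02-05": 4, "2015-09-25": 3, "2017-08-01": 3, "2011-04-04": 2,
    "2019-11-28": 1, "2012-07-15": 1,
}
# Assumption: the three values folded into "Other values (3)" are also
# 10-character dates, which the reported maximum length of 10 permits.
OTHER_COUNT, OTHER_LENGTH = 3, 10

PLACEHOLDER = "<NA>"  # literal string in the column, not a true null

total = sum(value_counts.values()) + OTHER_COUNT
total_length = sum(len(v) * n for v, n in value_counts.items())
total_length += OTHER_LENGTH * OTHER_COUNT

mean_length = total_length / total
placeholder_share = value_counts[PLACEHOLDER] / total

print(f"mean length: {mean_length:.7f}")              # 6.0943396
print(f"placeholder share: {placeholder_share:.1%}")  # 65.1%
```

If the placeholder were converted to a real null before profiling, the Missing count for this column would presumably rise to 69 and the <NA> category would disappear.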

Length

Histogram of lengths of the category

비고
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B
<NA>: 70
베이: 36

Length

Max length: 4
Median length: 4
Mean length: 3.3207547
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 70 | 66.0%
베이 | 36 | 34.0%

Length

Histogram of lengths of the category


Interactions


Correlations

Variable | 연번 | 구분 | 소 재 지(구별) | 소재지(주소) | 유형 | 설치년도
연번 | 1.000 | 0.957 | 0.852 | 0.932 | 0.352 | 0.456
구분 | 0.957 | 1.000 | 0.000 | 0.000 | 0.094 | 0.000
소 재 지(구별) | 0.852 | 0.000 | 1.000 | 1.000 | 0.391 | 0.711
소재지(주소) | 0.932 | 0.000 | 1.000 | 1.000 | 0.686 | 0.952
유형 | 0.352 | 0.094 | 0.391 | 0.686 | 1.000 | 0.000
설치년도 | 0.456 | 0.000 | 0.711 | 0.952 | 0.000 | 1.000

Variable | 설치년도 | 구분 | 비고 | 유형 | 소 재 지(구별)
설치년도 | 1.000 | 0.000 | 1.000 | 0.000 | 0.399
구분 | 0.000 | 1.000 | 1.000 | 0.154 | 0.000
비고 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
유형 | 0.000 | 0.154 | 1.000 | 1.000 | 0.262
소 재 지(구별) | 0.399 | 0.000 | 1.000 | 0.262 | 1.000

Variable | 연번 | 구분 | 소 재 지(구별) | 유형 | 설치년도 | 비고
연번 | 1.000 | 0.791 | 0.615 | 0.199 | 0.000 | 1.000
구분 | 0.791 | 1.000 | 0.000 | 0.154 | 0.000 | 1.000
소 재 지(구별) | 0.615 | 0.000 | 1.000 | 0.262 | 0.399 | 1.000
유형 | 0.199 | 0.154 | 0.262 | 1.000 | 0.000 | 1.000
설치년도 | 0.000 | 0.000 | 0.399 | 0.000 | 1.000 | 1.000
비고 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

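Index | 연번 | 구분 | 정류소명 | 소 재 지(구별) | 소재지(주소) | 유형 | 설치년도 | 비고
0 | 1 | 일반 | 동변동선수촌아파트(802동)앞 | 북구 | 동변로 55 | 무개 | 2014-12-04 | <NA>
1 | 2 | 일반 | 서변동 제2공영주차장 앞 | 북구 | 서변동 1792 | 무개 | 2015-06-10 | <NA>
2 | 3 | 일반 | 칠곡신한은행앞(팔거역 근처) | 북구 | 학정로 427 | 무개 | <NA> | <NA>
3 | 4 | 일반 | 칠곡로즈마리병원앞(팔거역 근처) | 북구 | 팔거천동로 215 | 무개 | <NA> | <NA>
4 | 5 | 일반 | 칠곡홈플러스앞(팔거역 근처) | 북구 | 동암로12길 8 | 유무개 | <NA> | <NA>
5 | 6 | 일반 | 칠곡동서영남아파트 앞(화성타운 맞은편) | 북구 | 구암로 180 | 무개 | <NA> | <NA>
6 | 7 | 일반 | 칠곡화성타운2차(103동)앞 | 북구 | 동천로 6 | 무개 | <NA> | <NA>
7 | 8 | 일반 | 동아아울렛 강북점 앞 | 북구 | 칠곡중앙대로 416 | 무개 | <NA> | <NA>
8 | 9 | 일반 | 칠곡 강비뇨기과 앞(지하차도 전) | 북구 | 칠곡중앙대로 388 | 무개 | <NA> | 베이
9 | 10 | 일반 | 칠곡대구은행 태전동지점 앞(송림스포츠프라자 건물) | 북구 | 칠곡중앙대로 309 | 무개 | <NA> | <NA>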
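
Index | 연번 | 구분 | 정류소명 | 소 재 지(구별) | 소재지(주소) | 유형 | 설치년도 | 비고
96 | 97 | 일반 | 대구과학관 정문 맞은편 | 달성군 | 유가읍 테크노대로6길 20 | 무개 | 2014-12-04 | <NA>
97 | 98 | 일반 | 테크노폴리스 유가면사무소앞 | 달성군 | 유가읍 테크노상업로 95 | 유개 | 2014-12-04 | 베이
98 | 99 | 모범 | 서대구고속터미날 앞(만평네거리) | 북구 | 팔달로 103 | 유무개 | <NA> | <NA>
99 | 100 | 모범 | 북부정류장 입구 | 서구 | 서대구로 299 | 무개 | <NA> | 베이
100 | 101 | 모범 | 동대구역(신세계백화점 앞) | 동구 | 동대구로550 | 유무개 | 2017-02-05 | 베이
101 | 102 | 모범 | 동부정류장 밑(모범택시) | 동구 | 화랑로 65 | 무개 | <NA> | <NA>
102 | 103 | 모범 | 동대구역 광장(대형,모범) | 동구 | 동대구로550 | 무개 | 2017-08-01 | 베이
103 | 104 | 모범 | 대구공항 앞(일반, 모범택시) | 동구 | 공항로 221 | 유개 | <NA> | <NA>
104 | 105 | 모범 | 대구역 앞(일반, 모범택시) | 중구 | 태평로 161 | 유개 | <NA> | <NA>
105 | 106 | 모범 | 인터불고 호텔 앞(모범택시) | 수성구 | 팔현길 212 | 유개 | <NA> | <NA>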