Overview

Dataset statistics

Number of variables: 6
Number of observations: 91
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.5 KiB
Average record size in memory: 50.5 B

Variable types

Numeric: 1
Categorical: 3
Text: 2

Dataset

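Description: This dataset provides item-level information on taxi stands in Incheon Metropolitan City: district name (군구명), management number (관리번호), location (위치), type (형태), and remarks (비고, which flags combined stands, 복합승차대).
Author: Incheon Metropolitan City (인천광역시)
URL: https://www.data.go.kr/data/15045240/fileData.do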

Alerts

형태 is highly overall correlated with 비고 [High correlation]
비고 is highly overall correlated with 연번 and 2 other fields [High correlation]
(unnamed column) is highly overall correlated with 연번 and 1 other field [High correlation]
연번 is highly overall correlated with (unnamed column) and 1 other field [High correlation]
비고 is highly imbalanced (50.0%) [Imbalance]
연번 has unique values [Unique]
관리번호 has unique values [Unique]
위치 has unique values [Unique]
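
The 50.0% imbalance figure for 비고 can be reproduced from the two class counts reported for that column later in this report (81 and 10). The sketch below assumes the score is one minus the normalized Shannon entropy of the class distribution. That definition is an assumption made here for illustration, not something the report states, and `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (in bits) of the counts.

    Assumed definition: 0 means perfectly balanced classes, 1 means a single
    class holds every observation.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Class counts of 비고 as listed in its Common Values table.
score = imbalance_score([81, 10])
print(f"{score:.1%}")
```

Under this definition a perfectly balanced two-class column scores 0, so a value near 50% indicates that one class clearly dominates without the column being constant.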

Reproduction

Analysis started: 2024-01-06 12:41:01.921893
Analysis finished: 2024-01-06 12:41:03.538843
Duration: 1.62 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 91
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 46
Minimum: 1
Maximum: 91
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 951.0 B

Quantile statistics

Minimum: 1
5th percentile: 5.5
Q1: 23.5
Median: 46
Q3: 68.5
95th percentile: 86.5
Maximum: 91
Range: 90
Interquartile range (IQR): 45
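
The quantile figures above can be checked with the standard library. The sketch assumes that 연번 is the integer sequence 1 to 91, which is consistent with the reported minimum, maximum, distinct count and sum but is not stated outright, and that percentiles are linearly interpolated between the smallest and largest observation (the `inclusive` method of `statistics.quantiles`).

```python
import statistics

# Assumption: 연번 is a gap-free running number from 1 to 91.
values = list(range(1, 92))

# "inclusive" treats the data as containing the population min and max
# and interpolates linearly between neighbouring observations.
quartiles = statistics.quantiles(values, n=4, method="inclusive")
ventiles = statistics.quantiles(values, n=20, method="inclusive")

q1, median, q3 = quartiles
p5, p95 = ventiles[0], ventiles[-1]   # 5th and 95th percentiles
iqr = q3 - q1
value_range = max(values) - min(values)

print(p5, q1, median, q3, p95, value_range, iqr)
```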

Descriptive statistics

Standard deviation: 26.41338
Coefficient of variation (CV): 0.57420392
Kurtosis: -1.2
Mean: 46
Median Absolute Deviation (MAD): 23
Skewness: 0
Sum: 4186
Variance: 697.66667
Monotonicity: Strictly increasing
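
As a worked check of the descriptive statistics, the sketch below recomputes them with the standard library, again assuming that 연번 is the sequence 1 to 91. The variance and standard deviation use the sample form (division by n - 1), which is what matches the reported figures; skewness and kurtosis are left out because the standard library has no function for them.

```python
import statistics

# Assumption: 연번 is a gap-free running number from 1 to 91.
values = list(range(1, 92))

mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, divides by n - 1
std = statistics.stdev(values)           # sample standard deviation
cv = std / mean                          # coefficient of variation
median = statistics.median(values)
# Median absolute deviation: median of the distances to the median.
mad = statistics.median(abs(v - median) for v in values)
total = sum(values)
# Strictly increasing: every value is smaller than its successor.
strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

print(mean, round(variance, 5), round(std, 5), round(cv, 8), mad, total)
print(strictly_increasing)
```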
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | 1.1%
59 | 1 | 1.1%
68 | 1 | 1.1%
67 | 1 | 1.1%
66 | 1 | 1.1%
65 | 1 | 1.1%
64 | 1 | 1.1%
63 | 1 | 1.1%
62 | 1 | 1.1%
61 | 1 | 1.1%
Other values (81) | 81 | 89.0%
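
Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 1.1%
2 | 1 | 1.1%
3 | 1 | 1.1%
4 | 1 | 1.1%
5 | 1 | 1.1%
6 | 1 | 1.1%
7 | 1 | 1.1%
8 | 1 | 1.1%
9 | 1 | 1.1%
10 | 1 | 1.1%

Maximum 10 values
Value | Count | Frequency (%)
91 | 1 | 1.1%
90 | 1 | 1.1%
89 | 1 | 1.1%
88 | 1 | 1.1%
87 | 1 | 1.1%
86 | 1 | 1.1%
85 | 1 | 1.1%
84 | 1 | 1.1%
83 | 1 | 1.1%
82 | 1 | 1.1%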


Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): 8.8%
Missing: 0
Missing (%): 0.0%
Memory size: 860.0 B
연수구: 14
부평구: 14
계양구: 14
서구: 14
남동구: 13
Other values (3): 22

Length

Max length: 4
Median length: 3
Mean length: 2.8681319
Min length: 2

Unique

Unique: 1
Unique (%): 1.1%

Sample

1st row: 중구
2nd row: 중구
3rd row: 중구
4th row: 중구
5th row: 중구

Common Values

Value | Count | Frequency (%)
연수구 | 14 | 15.4%
부평구 | 14 | 15.4%
계양구 | 14 | 15.4%
서구 | 14 | 15.4%
남동구 | 13 | 14.3%
미추홀구 | 12 | 13.2%
중구 | 9 | 9.9%
동구 | 1 | 1.1%
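
The frequency percentages, the mean category length and the single unique value reported for this column all follow from the eight counts above. A minimal sketch of that arithmetic, with the counts copied from the table (the dictionary name `counts` is chosen here for illustration):

```python
# Counts per district, copied from the Common Values table.
counts = {
    "연수구": 14, "부평구": 14, "계양구": 14, "서구": 14,
    "남동구": 13, "미추홀구": 12, "중구": 9, "동구": 1,
}

n = sum(counts.values())  # number of observations

# Frequency (%) is the count divided by the number of observations.
freq_pct = {name: round(100 * c / n, 1) for name, c in counts.items()}

# Mean length weights each category's character length by its count.
mean_length = sum(len(name) * c for name, c in counts.items()) / n

# "Unique" counts the values that occur exactly once.
unique_values = [name for name, c in counts.items() if c == 1]

print(n, freq_pct, round(mean_length, 7), unique_values)
```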

Length

Histogram of lengths of the category

Common Values (Plot)

[Bar chart of the common values; the plotted counts repeat the Common Values table.]

관리번호
Text

UNIQUE 

Distinct: 91
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 860.0 B

Length

Max length: 12
Median length: 12
Mean length: 11.604396
Min length: 11

Characters and Unicode

Total characters: 1056
Distinct characters: 25
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
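
To illustrate how such per-character properties are obtained, the sketch below classifies the characters of the first sample value of this column with Python's `unicodedata` module. The module returns two-letter general-category codes (`Nd`, `Lo`, `Pd`, `Ps`, `Pe`), which correspond to the long names used in this report (Decimal Number, Other Letter, Dash Punctuation, Open Punctuation, Close Punctuation). Whether the profiler uses this module internally is not stated in the report.

```python
import unicodedata
from collections import Counter

# First sample value listed for 관리번호.
sample = "17-일반-01(중)"

# unicodedata.category returns the general category code of one character.
category_counts = Counter(unicodedata.category(ch) for ch in sample)

for code, count in category_counts.most_common():
    print(code, count)
```

Summing such counters over all 91 values would give the per-category totals of the kind shown under "Most occurring categories".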

Unique

Unique: 91
Unique (%): 100.0%

Sample

1st row: 17-일반-01(중)
2nd row: 99-일반-04(중)
3rd row: 17-일반-05(중)
4th row: 10-일반-06(중)
5th row: 10-일반-07(중)
Value | Count | Frequency (%)
17-일반-01(중 | 1 | 1.1%
13-일반-16(남동 | 1 | 1.1%
17-일반-05(계양 | 1 | 1.1%
01-일반-04(계양 | 1 | 1.1%
01-일반-03(계양 | 1 | 1.1%
17-일반-01(계양 | 1 | 1.1%
22-일반-14(부평 | 1 | 1.1%
19-일반-13(부평 | 1 | 1.1%
14-일반-12(부평 | 1 | 1.1%
14-일반-11(부평 | 1 | 1.1%
Other values (81) | 81 | 89.0%

Most occurring characters

Value | Count | Frequency (%)
- | 182 | 17.2%
1 | 125 | 11.8%
) | 91 | 8.6%
(not preserved) | 91 | 8.6%
(not preserved) | 91 | 8.6%
( | 91 | 8.6%
0 | 76 | 7.2%
9 | 32 | 3.0%
7 | 29 | 2.7%
4 | 25 | 2.4%
Other values (15) | 223 | 21.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 364 | 34.5%
Other Letter | 328 | 31.1%
Dash Punctuation | 182 | 17.2%
Close Punctuation | 91 | 8.6%
Open Punctuation | 91 | 8.6%

Most frequent character per category

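Other Letter
Value | Count | Frequency (%)
(not preserved) | 91 | 27.7%
(not preserved) | 91 | 27.7%
(not preserved) | 25 | 7.6%
(not preserved) | 14 | 4.3%
(not preserved) | 14 | 4.3%
(not preserved) | 14 | 4.3%
(not preserved) | 14 | 4.3%
(not preserved) | 14 | 4.3%
(not preserved) | 14 | 4.3%
(not preserved) | 14 | 4.3%
Other values (2) | 23 | 7.0%

Decimal Number
Value | Count | Frequency (%)
1 | 125 | 34.3%
0 | 76 | 20.9%
9 | 32 | 8.8%
7 | 29 | 8.0%
4 | 25 | 6.9%
3 | 21 | 5.8%
6 | 17 | 4.7%
5 | 16 | 4.4%
2 | 15 | 4.1%
8 | 8 | 2.2%

Dash Punctuation
Value | Count | Frequency (%)
- | 182 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 91 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 91 | 100.0%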

Most occurring scripts

Value | Count | Frequency (%)
Common | 728 | 68.9%
Hangul | 328 | 31.1%

Most frequent character per script

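Common
Value | Count | Frequency (%)
- | 182 | 25.0%
1 | 125 | 17.2%
) | 91 | 12.5%
( | 91 | 12.5%
0 | 76 | 10.4%
9 | 32 | 4.4%
7 | 29 | 4.0%
4 | 25 | 3.4%
3 | 21 | 2.9%
6 | 17 | 2.3%
Other values (3) | 39 | 5.4%

Hangul
Value | Count | Frequency (%)
(not preserved) | 91 | 27.7%
(not preserved) | 91 | 27.7%
(not preserved) | 25 | 7.6%
(not preserved) | 14 | 4.3%
(not preserved) | 14 | 4.3%
(not preserved) | 14 | 4.3%
(not preserved) | 14 | 4.3%
(not preserved) | 14 | 4.3%
(not preserved) | 14 | 4.3%
(not preserved) | 14 | 4.3%
Other values (2) | 23 | 7.0%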

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 728 | 68.9%
Hangul | 328 | 31.1%

Most frequent character per block

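ASCII
Value | Count | Frequency (%)
- | 182 | 25.0%
1 | 125 | 17.2%
) | 91 | 12.5%
( | 91 | 12.5%
0 | 76 | 10.4%
9 | 32 | 4.4%
7 | 29 | 4.0%
4 | 25 | 3.4%
3 | 21 | 2.9%
6 | 17 | 2.3%
Other values (3) | 39 | 5.4%

Hangul
Value | Count | Frequency (%)
(not preserved) | 91 | 27.7%
(not preserved) | 91 | 27.7%
(not preserved) | 25 | 7.6%
(not preserved) | 14 | 4.3%
(not preserved) | 14 | 4.3%
(not preserved) | 14 | 4.3%
(not preserved) | 14 | 4.3%
(not preserved) | 14 | 4.3%
(not preserved) | 14 | 4.3%
(not preserved) | 14 | 4.3%
Other values (2) | 23 | 7.0%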

위치
Text

UNIQUE 

Distinct: 91
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 860.0 B

Length

Max length: 20
Median length: 18
Mean length: 14.43956
Min length: 8

Characters and Unicode

Total characters: 1314
Distinct characters: 184
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 91
Unique (%): 100.0%

Sample

1st row: 인현동 동인천역 남광장 앞
2nd row: 연안동 연안여객터미널 앞
3rd row: 신흥동 이마트 동인천점 앞
4th row: 신흥동 출입국관리사무소 앞
5th row: 운서동 운서역광장 앞
Value | Count | Frequency (%)
(not preserved) | 84 | 24.3%
2번출구 | 12 | 3.5%
1번출구 | 7 | 2.0%
4번출구 | 6 | 1.7%
논현동 | 6 | 1.7%
작전동 | 5 | 1.4%
연수동 | 5 | 1.4%
롯데마트 | 5 | 1.4%
홈플러스 | 5 | 1.4%
남광장 | 5 | 1.4%
Other values (155) | 206 | 59.5%

Most occurring characters

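Value | Count | Frequency (%)
(space) | 255 | 19.4%
(not preserved) | 101 | 7.7%
(not preserved) | 84 | 6.4%
(not preserved) | 49 | 3.7%
(not preserved) | 47 | 3.6%
(not preserved) | 36 | 2.7%
(not preserved) | 35 | 2.7%
(not preserved) | 18 | 1.4%
(not preserved) | 16 | 1.2%
2 | 16 | 1.2%
Other values (174) | 657 | 50.0%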

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1000 | 76.1%
Space Separator | 255 | 19.4%
Decimal Number | 51 | 3.9%
Uppercase Letter | 8 | 0.6%

Most frequent character per category

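Other Letter
Value | Count | Frequency (%)
(not preserved) | 101 | 10.1%
(not preserved) | 84 | 8.4%
(not preserved) | 49 | 4.9%
(not preserved) | 47 | 4.7%
(not preserved) | 36 | 3.6%
(not preserved) | 35 | 3.5%
(not preserved) | 18 | 1.8%
(not preserved) | 16 | 1.6%
(not preserved) | 16 | 1.6%
(not preserved) | 15 | 1.5%
Other values (161) | 583 | 58.3%

Decimal Number
Value | Count | Frequency (%)
2 | 16 | 31.4%
1 | 15 | 29.4%
4 | 7 | 13.7%
3 | 4 | 7.8%
5 | 3 | 5.9%
7 | 3 | 5.9%
6 | 2 | 3.9%
8 | 1 | 2.0%

Uppercase Letter
Value | Count | Frequency (%)
G | 3 | 37.5%
V | 2 | 25.0%
C | 2 | 25.0%
S | 1 | 12.5%

Space Separator
Value | Count | Frequency (%)
(space) | 255 | 100.0%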

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1000 | 76.1%
Common | 306 | 23.3%
Latin | 8 | 0.6%

Most frequent character per script

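Hangul
Value | Count | Frequency (%)
(not preserved) | 101 | 10.1%
(not preserved) | 84 | 8.4%
(not preserved) | 49 | 4.9%
(not preserved) | 47 | 4.7%
(not preserved) | 36 | 3.6%
(not preserved) | 35 | 3.5%
(not preserved) | 18 | 1.8%
(not preserved) | 16 | 1.6%
(not preserved) | 16 | 1.6%
(not preserved) | 15 | 1.5%
Other values (161) | 583 | 58.3%

Common
Value | Count | Frequency (%)
(space) | 255 | 83.3%
2 | 16 | 5.2%
1 | 15 | 4.9%
4 | 7 | 2.3%
3 | 4 | 1.3%
5 | 3 | 1.0%
7 | 3 | 1.0%
6 | 2 | 0.7%
8 | 1 | 0.3%

Latin
Value | Count | Frequency (%)
G | 3 | 37.5%
V | 2 | 25.0%
C | 2 | 25.0%
S | 1 | 12.5%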

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1000 | 76.1%
ASCII | 314 | 23.9%

Most frequent character per block

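ASCII
Value | Count | Frequency (%)
(space) | 255 | 81.2%
2 | 16 | 5.1%
1 | 15 | 4.8%
4 | 7 | 2.2%
3 | 4 | 1.3%
5 | 3 | 1.0%
G | 3 | 1.0%
7 | 3 | 1.0%
6 | 2 | 0.6%
V | 2 | 0.6%
Other values (3) | 4 | 1.3%

Hangul
Value | Count | Frequency (%)
(not preserved) | 101 | 10.1%
(not preserved) | 84 | 8.4%
(not preserved) | 49 | 4.9%
(not preserved) | 47 | 4.7%
(not preserved) | 36 | 3.6%
(not preserved) | 35 | 3.5%
(not preserved) | 18 | 1.8%
(not preserved) | 16 | 1.6%
(not preserved) | 16 | 1.6%
(not preserved) | 15 | 1.5%
Other values (161) | 583 | 58.3%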

형태
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 860.0 B
쉘터: 34
폴대: 29
쉘터(신): 28

Length

Max length: 5
Median length: 2
Mean length: 2.9230769
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 쉘터
2nd row: 쉘터
3rd row: 쉘터
4th row: 쉘터
5th row: 쉘터

Common Values

Value | Count | Frequency (%)
쉘터 | 34 | 37.4%
폴대 | 29 | 31.9%
쉘터(신) | 28 | 30.8%

Length

Histogram of lengths of the category

Common Values (Plot)

[Bar chart of the common values; the plotted counts repeat the Common Values table.]

비고
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 860.0 B
<NA>: 81
복합승차대: 10

Length

Max length: 5
Median length: 4
Mean length: 4.1098901
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 복합승차대
2nd row: <NA>
3rd row: 복합승차대
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 81 | 89.0%
복합승차대 | 10 | 11.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Bar chart of the common values; the plotted counts repeat the Common Values table.]

Interactions


Correlations

Variable | 연번 | (unnamed) | 관리번호 | 위치 | 형태
연번 | 1.000 | 0.936 | 1.000 | 1.000 | 0.500
(unnamed) | 0.936 | 1.000 | 1.000 | 1.000 | 0.353
관리번호 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
위치 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
형태 | 0.500 | 0.353 | 1.000 | 1.000 | 1.000

Variable | 형태 | 비고 | (unnamed)
형태 | 1.000 | 1.000 | 0.231
비고 | 1.000 | 1.000 | 1.000
(unnamed) | 0.231 | 1.000 | 1.000

Variable | 연번 | (unnamed) | 형태 | 비고
연번 | 1.000 | 0.799 | 0.353 | 1.000
(unnamed) | 0.799 | 1.000 | 0.231 | 1.000
형태 | 0.353 | 0.231 | 1.000 | 1.000
비고 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

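First rows
Index | 연번 | (unnamed) | 관리번호 | 위치 | 형태 | 비고
0 | 1 | 중구 | 17-일반-01(중) | 인현동 동인천역 남광장 앞 | 쉘터 | 복합승차대
1 | 2 | 중구 | 99-일반-04(중) | 연안동 연안여객터미널 앞 | 쉘터 | <NA>
2 | 3 | 중구 | 17-일반-05(중) | 신흥동 이마트 동인천점 앞 | 쉘터 | 복합승차대
3 | 4 | 중구 | 10-일반-06(중) | 신흥동 출입국관리사무소 앞 | 쉘터 | <NA>
4 | 5 | 중구 | 10-일반-07(중) | 운서동 운서역광장 앞 | 쉘터 | <NA>
5 | 6 | 중구 | 16-일반-08(중) | 운북동 영종역 광장 앞 | 쉘터(신) | <NA>
6 | 7 | 중구 | 16-일반-09(중) | 북성동 인천역광장 앞 | 쉘터(신) | <NA>
7 | 8 | 중구 | 16-일반-10(중) | 사동 신포역 2번출구 앞 | 쉘터(신) | <NA>
8 | 9 | 중구 | 16-일반-14(중) | 숭의동 숭의역 2번출구 앞 | 폴대 | <NA>
9 | 10 | 동구 | 13-일반-01(동) | 송현동 동인천역 북광장 앞 | 쉘터(신) | <NA>

Last rows
Index | 연번 | (unnamed) | 관리번호 | 위치 | 형태 | 비고
81 | 82 | 서구 | 09-일반-08(서) | 서구 원당대로 685번길 | 쉘터 | <NA>
82 | 83 | 서구 | 16-일반-09(서) | 경서동 청라국제도시역 앞 | 쉘터(신) | <NA>
83 | 84 | 서구 | 17-일반-10(서) | 왕길동 검단사거리역 3번출구 앞 | 쉘터(신) | <NA>
84 | 85 | 서구 | 17-일반-11(서) | 마전동 마전역 2번출구 앞 | 쉘터(신) | <NA>
85 | 86 | 서구 | 17-일반-12(서) | 가좌동 가재울역 1번출구 앞 | 폴대 | <NA>
86 | 87 | 서구 | 17-일반-13(서) | 가좌동 가재울역 4번출구 앞 | 폴대 | <NA>
87 | 88 | 서구 | 17-일반-14(서) | 경서동 롯데마트 청라점 앞 | 쉘터(신) | <NA>
88 | 89 | 서구 | 17-일반-15(서) | 경서동 롯데마트 청라점 건너편 | 쉘터(신) | <NA>
89 | 90 | 서구 | 17-일반-16(서) | 경서동 청라엑슬루타워 정문 앞 | 쉘터(신) | <NA>
90 | 91 | 서구 | 20-일반-17(서) | 심곡동 써브웨이 인천서구청점 앞 | 폴대 | <NA>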