Overview

Dataset statistics

Number of variables: 7
Number of observations: 100
Missing cells: 100
Missing cells (%): 14.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.1 KiB
Average record size in memory: 62.3 B
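
The missing-cell percentage follows from the other figures in this table: 100 missing cells out of 7 x 100 cells in total. A minimal Python sketch of that arithmetic is given here; the function name missing_cell_pct is illustrative and is not part of ydata-profiling.

```python
def missing_cell_pct(n_missing: int, n_vars: int, n_obs: int) -> float:
    """Share of missing cells among all cells, as a percentage."""
    total_cells = n_vars * n_obs  # every variable has one cell per observation
    return 100.0 * n_missing / total_cells


# Figures taken from the dataset statistics of this report.
pct = missing_cell_pct(n_missing=100, n_vars=7, n_obs=100)
print(f"{pct:.1f}%")  # 14.3%
```

All 100 missing cells belong to the single column 고객수, which is why one fully empty column out of seven yields about one seventh of all cells.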

Variable types

Categorical: 4
Text: 2
Unsupported: 1

Alerts

거래일자 has constant value "20190702" (Constant)
승차시간 has constant value "8" (Constant)
고객수 has 100 (100.0%) missing values (Missing)
고객수 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)

Reproduction

Analysis started: 2023-12-10 06:39:37.099561
Analysis finished: 2023-12-10 06:39:38.413062
Duration: 1.31 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

거래일자
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Value counts: 20190702 (100)

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%
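
Distinct and Unique are different measures, which is why this column has 1 distinct value but 0 unique ones. The sketch below assumes, as the figures in this report suggest, that Distinct counts the different values present and Unique counts the values that occur exactly once. The function name distinct_and_unique is illustrative only.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (distinct, unique) for a column.

    distinct: number of different values present
    unique:   number of values that occur exactly once
    """
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# A constant column such as 거래일자: one value repeated 100 times.
column = ["20190702"] * 100
print(distinct_and_unique(column))  # (1, 0)
```

Under the same assumption, a column with values a, a, b would report 2 distinct values and 1 unique value.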

Sample

1st row: 20190702
2nd row: 20190702
3rd row: 20190702
4th row: 20190702
5th row: 20190702

Common Values

Value  Count  Frequency (%)
20190702  100  100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

승차시간
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Value counts: 8 (100)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 8
2nd row: 8
3rd row: 8
4th row: 8
5th row: 8

Common Values

Value  Count  Frequency (%)
8  100  100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

승차역
Text

Distinct: 72
Distinct (%): 72.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 8
Median length: 6
Mean length: 5.21
Min length: 2

Characters and Unicode

Total characters: 521
Distinct characters: 141
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
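
As a rough illustration of how such character properties can be tallied, the sketch below counts Unicode general categories with Python's standard unicodedata module. It is not the code ydata-profiling runs; the function name category_counts is hypothetical, and scripts and blocks are left out because unicodedata does not expose them.

```python
import unicodedata
from collections import Counter


def category_counts(strings):
    """Count characters by Unicode general category (e.g. 'Lo', 'Ps', 'Pe', 'Nd')."""
    counts = Counter()
    for s in strings:
        for ch in s:
            counts[unicodedata.category(ch)] += 1
    return counts


# One station name from this report: six Hangul syllables plus a pair of parentheses.
counts = category_counts(["잠실(송파구청)"])
print(counts["Lo"], counts["Ps"], counts["Pe"])  # 6 1 1
```

The two-letter codes correspond to the category names used in this report: Lo is Other Letter, Ps is Open Punctuation, Pe is Close Punctuation, and Nd is Decimal Number.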

Unique

Unique: 59
Unique (%): 59.0%

Sample

1st row: 장지
2nd row: 정자
3rd row: 용마산
4th row: 우장산
5th row: 우장산

Value  Count  Frequency (%)
잠실(송파구청  8  8.0%
신정(은행정  6  6.0%
장승배기  3  3.0%
신정네거리  3  3.0%
우장산  3  3.0%
장한평  3  3.0%
제물포  3  3.0%
장산역  2  2.0%
연산동역  2  2.0%
월드컵경기장  2  2.0%
Other values (62)  65  65.0%

Most occurring characters

Value  Count  Frequency (%)
[?]  26  5.0%
[?]  22  4.2%
[?]  17  3.3%
[?]  17  3.3%
[?]  16  3.1%
)  16  3.1%
(  16  3.1%
[?]  14  2.7%
[?]  12  2.3%
[?]  11  2.1%
Other values (131)  354  67.9%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  482  92.5%
Close Punctuation  16  3.1%
Open Punctuation  16  3.1%
Decimal Number  7  1.3%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
[?]  26  5.4%
[?]  22  4.6%
[?]  17  3.5%
[?]  17  3.5%
[?]  16  3.3%
[?]  14  2.9%
[?]  12  2.5%
[?]  11  2.3%
[?]  11  2.3%
[?]  9  1.9%
Other values (125)  327  67.8%

Decimal Number
Value  Count  Frequency (%)
4  2  28.6%
5  2  28.6%
2  2  28.6%
1  1  14.3%

Close Punctuation
Value  Count  Frequency (%)
)  16  100.0%

Open Punctuation
Value  Count  Frequency (%)
(  16  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  482  92.5%
Common  39  7.5%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
[?]  26  5.4%
[?]  22  4.6%
[?]  17  3.5%
[?]  17  3.5%
[?]  16  3.3%
[?]  14  2.9%
[?]  12  2.5%
[?]  11  2.3%
[?]  11  2.3%
[?]  9  1.9%
Other values (125)  327  67.8%

Common
Value  Count  Frequency (%)
)  16  41.0%
(  16  41.0%
4  2  5.1%
5  2  5.1%
2  2  5.1%
1  1  2.6%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  482  92.5%
ASCII  39  7.5%

Most frequent character per block

Hangul
Value  Count  Frequency (%)
[?]  26  5.4%
[?]  22  4.6%
[?]  17  3.5%
[?]  17  3.5%
[?]  16  3.3%
[?]  14  2.9%
[?]  12  2.5%
[?]  11  2.3%
[?]  11  2.3%
[?]  9  1.9%
Other values (125)  327  67.8%

ASCII
Value  Count  Frequency (%)
)  16  41.0%
(  16  41.0%
4  2  5.1%
5  2  5.1%
2  2  5.1%
1  1  2.6%

하차시간
Categorical

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Value counts: 8 (84), 9 (16)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 8
2nd row: 8
3rd row: 8
4th row: 8
5th row: 8

Common Values

Value  Count  Frequency (%)
8  84  84.0%
9  16  16.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

하차역명
Text

Distinct: 91
Distinct (%): 91.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 9
Median length: 7
Mean length: 5.1
Min length: 2

Characters and Unicode

Total characters: 510
Distinct characters: 173
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 85
Unique (%): 85.0%

Sample

1st row: 봉은사
2nd row: 양재(서초구청)
3rd row: 성수
4th row: 광화문(세종문화
5th row: 목동

Value  Count  Frequency (%)
가산디지털단지  4  4.0%
시청  3  3.0%
강남  2  2.0%
문래  2  2.0%
종로3가  2  2.0%
삼성(무역센터  2  2.0%
장위2동주민센터  1  1.0%
은로초등학교  1  1.0%
탄방  1  1.0%
간석오거리역(8번  1  1.0%
Other values (81)  81  81.0%

Most occurring characters

Value  Count  Frequency (%)
[?]  30  5.9%
[?]  15  2.9%
[?]  14  2.7%
[?]  12  2.4%
[?]  11  2.2%
[?]  11  2.2%
[?]  11  2.2%
(  10  2.0%
[?]  9  1.8%
[?]  8  1.6%
Other values (163)  379  74.3%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  473  92.7%
Decimal Number  11  2.2%
Open Punctuation  10  2.0%
Other Punctuation  8  1.6%
Close Punctuation  6  1.2%
Uppercase Letter  2  0.4%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
[?]  30  6.3%
[?]  15  3.2%
[?]  14  3.0%
[?]  12  2.5%
[?]  11  2.3%
[?]  11  2.3%
[?]  11  2.3%
[?]  9  1.9%
[?]  8  1.7%
[?]  7  1.5%
Other values (153)  345  72.9%

Decimal Number
Value  Count  Frequency (%)
2  4  36.4%
3  3  27.3%
1  2  18.2%
8  1  9.1%
4  1  9.1%

Uppercase Letter
Value  Count  Frequency (%)
K  1  50.0%
T  1  50.0%

Open Punctuation
Value  Count  Frequency (%)
(  10  100.0%

Other Punctuation
Value  Count  Frequency (%)
.  8  100.0%

Close Punctuation
Value  Count  Frequency (%)
)  6  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  473  92.7%
Common  35  6.9%
Latin  2  0.4%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
[?]  30  6.3%
[?]  15  3.2%
[?]  14  3.0%
[?]  12  2.5%
[?]  11  2.3%
[?]  11  2.3%
[?]  11  2.3%
[?]  9  1.9%
[?]  8  1.7%
[?]  7  1.5%
Other values (153)  345  72.9%

Common
Value  Count  Frequency (%)
(  10  28.6%
.  8  22.9%
)  6  17.1%
2  4  11.4%
3  3  8.6%
1  2  5.7%
8  1  2.9%
4  1  2.9%

Latin
Value  Count  Frequency (%)
K  1  50.0%
T  1  50.0%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  473  92.7%
ASCII  37  7.3%

Most frequent character per block

Hangul
Value  Count  Frequency (%)
[?]  30  6.3%
[?]  15  3.2%
[?]  14  3.0%
[?]  12  2.5%
[?]  11  2.3%
[?]  11  2.3%
[?]  11  2.3%
[?]  9  1.9%
[?]  8  1.7%
[?]  7  1.5%
Other values (153)  345  72.9%

ASCII
Value  Count  Frequency (%)
(  10  27.0%
.  8  21.6%
)  6  16.2%
2  4  10.8%
3  3  8.1%
1  2  5.4%
8  1  2.7%
4  1  2.7%
K  1  2.7%
T  1  2.7%

교통수단_분류_코드
Categorical

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Value counts: 1 (51), 0 (49)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1  51  51.0%
0  49  49.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

고객수
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 100
Missing (%): 100.0%
Memory size: 1.0 KiB

Correlations

Variable  승차역  하차시간  하차역명  교통수단_분류_코드
승차역  1.000  0.000  0.995  1.000
하차시간  0.000  1.000  0.000  0.575
하차역명  0.995  0.000  1.000  1.000
교통수단_분류_코드  1.000  0.575  1.000  1.000

Variable  교통수단_분류_코드  하차시간
교통수단_분류_코드  1.000  0.390
하차시간  0.390  1.000

Variable  하차시간  교통수단_분류_코드
하차시간  1.000  0.390
교통수단_분류_코드  0.390  1.000

Missing values

[Figure: nullity count] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
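
The per-column nullity summarised in this section can be approximated with a few lines of plain Python. This is only a sketch: it assumes rows are held as dictionaries and that a missing cell (shown as <NA> in this report) is represented by None; the function name nullity_by_column is hypothetical.

```python
def nullity_by_column(rows, columns):
    """Count missing (None) cells per column."""
    return {col: sum(1 for row in rows if row.get(col) is None) for col in columns}


columns = ["거래일자", "승차시간", "승차역", "하차시간", "하차역명", "교통수단_분류_코드", "고객수"]

# The first two rows of the sample in this report, with <NA> written as None.
rows = [
    dict(zip(columns, ["20190702", "8", "장지", "8", "봉은사", "1", None])),
    dict(zip(columns, ["20190702", "8", "정자", "8", "양재(서초구청)", "1", None])),
]

for col, n_missing in nullity_by_column(rows, columns).items():
    print(col, n_missing)
```

For these two rows only 고객수 has missing cells, which matches the pattern reported for the full 100 rows.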

Sample

Index  거래일자  승차시간  승차역  하차시간  하차역명  교통수단_분류_코드  고객수
0  20190702  8  장지  8  봉은사  1  <NA>
1  20190702  8  정자  8  양재(서초구청)  1  <NA>
2  20190702  8  용마산  8  성수  1  <NA>
3  20190702  8  우장산  8  광화문(세종문화  1  <NA>
4  20190702  8  우장산  8  목동  1  <NA>
5  20190702  8  우장산  8  문래  1  <NA>
6  20190702  8  의정부  9  종로3가  1  <NA>
7  20190702  8  인덕원  9  삼성(무역센터)  1  <NA>
8  20190702  8  잠실역  8  잠실중학교  0  <NA>
9  20190702  8  장산역  8  전포역  1  <NA>

Index  거래일자  승차시간  승차역  하차시간  하차역명  교통수단_분류_코드  고객수
90  20190702  8  잠실(송파구청)  8  방배  1  <NA>
91  20190702  8  잠실(송파구청)  8  양재시민의숲(매  1  <NA>
92  20190702  8  잠실(송파구청)  9  삼성(무역센터)  1  <NA>
93  20190702  8  잠실(송파구청)  9  서울대입구(관악  1  <NA>
94  20190702  8  잠실(송파구청)  9  서초  1  <NA>
95  20190702  8  잠실(송파구청)  9  시청  1  <NA>
96  20190702  8  잠실(송파구청)  9  을지로4가  1  <NA>
97  20190702  8  잠실종합운동장  8  강남경찰서면허시  0  <NA>
98  20190702  8  정릉산장아파트  8  북한산보국문역2  0  <NA>
99  20190702  8  정왕역환승센터  8  산업기술대학  0  <NA>