gimi9 Pandas Profiling

Dataset statistics

Number of variables	3
Number of observations	30
Missing cells	1
Missing cells (%)	1.1%
Duplicate rows	0
Duplicate rows (%)	0.0%
Total size in memory	882.0 B
Average record size in memory	29.4 B
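
The derived figures in this table follow from the raw counts. Here is a minimal sketch of the arithmetic. It assumes the missing-cell percentage is taken over all cells (variables × observations) and the average record size is total memory divided by the number of observations. The variable names are illustrative and not part of the report.

```python
# Raw counts copied from the Dataset statistics table.
n_variables = 3
n_observations = 30
missing_cells = 1
total_bytes = 882.0

# Assumption: percentages are computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: average record size = total memory / number of rows.
avg_record_bytes = total_bytes / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both values round to the figures shown above (1.1% and 29.4 B).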

Variable types

Text	2
Categorical	1

Dataset

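Description	Data on the status of enforcement cameras of the EEB (bus-mounted enforcement system) for bus-only lanes in Daejeon Metropolitan City. It provides the route number (`노선번호`), origin-terminus (`기점-종점`), and number of units (`대수`).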
URL	https://www.data.go.kr/data/15081426/fileData.do

Alerts

`기점-종점` has 1 (3.3%) missing values	Missing
`노선번호` has unique values	Unique

Reproduction

Analysis started	2023-12-12 11:59:31.623938
Analysis finished	2023-12-12 11:59:31.939800
Duration	0.32 seconds
Software version	ydata-profiling v4.5.1
Download configuration	config.json

노선번호
Text

UNIQUE

Distinct	30
Distinct (%)	100.0%
Missing	0
Missing (%)	0.0%
Memory size	372.0 B

Length

Max length	5
Median length	4
Mean length	4.0333333
Min length	4

Characters and Unicode

Total characters	121
Distinct characters	16
Distinct categories	2
Distinct scripts	2
Distinct blocks	2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
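
As a rough illustration of this kind of analysis, Python's standard `unicodedata` module exposes the general category of each code point (it does not expose script or block). The sketch below counts categories over the five sample values of `노선번호`; `Lo` is the abbreviation for Other Letter and `Nd` for Decimal Number. The helper name `category_counts` is hypothetical, and this is not necessarily how the profiling tool computes its tables.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample rows of `노선번호` listed in this report.
sample = ["급행2번", "102번", "103번", "104번", "105번"]
counts = category_counts(sample)

for category, n in counts.most_common():
    print(category, n)
```

Running the same helper over all 30 values would be expected to reproduce the category table for this variable.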

Unique

Unique	30
Unique (%)	100.0%

Sample

1st row	급행2번
2nd row	102번
3rd row	103번
4th row	104번
5th row	105번

Value	Count	Frequency (%)
급행2번	1	3.3%
102번	1	3.3%
802번	1	3.3%
711번	1	3.3%
705번	1	3.3%
703번	1	3.3%
619번	1	3.3%
617번	1	3.3%
613번	1	3.3%
612번	1	3.3%
Other values (20)	20	66.7%

Most occurring characters

Value	Count	Frequency (%)
번	29	24.0%
1	27	22.3%
0	15	12.4%
6	12	9.9%
3	10	8.3%
2	8	6.6%
5	4	3.3%
7	4	3.3%
4	3	2.5%
9	3	2.5%
Other values (6)	6	5.0%

Most occurring categories

Value	Count	Frequency (%)
Decimal Number	87	71.9%
Other Letter	34	28.1%

Most frequent character per category

Decimal Number

Value	Count	Frequency (%)
1	27	31.0%
0	15	17.2%
6	12	13.8%
3	10	11.5%
2	8	9.2%
5	4	4.6%
7	4	4.6%
4	3	3.4%
9	3	3.4%
8	1	1.1%

Other Letter

Value	Count	Frequency (%)
번	29	85.3%
급	1	2.9%
행	1	2.9%
개	1	2.9%
노	1	2.9%
선	1	2.9%

Most occurring scripts

Value	Count	Frequency (%)
Common	87	71.9%
Hangul	34	28.1%

Most frequent character per script

Common

Value	Count	Frequency (%)
1	27	31.0%
0	15	17.2%
6	12	13.8%
3	10	11.5%
2	8	9.2%
5	4	4.6%
7	4	4.6%
4	3	3.4%
9	3	3.4%
8	1	1.1%

Hangul

Value	Count	Frequency (%)
번	29	85.3%
급	1	2.9%
행	1	2.9%
개	1	2.9%
노	1	2.9%
선	1	2.9%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	87	71.9%
Hangul	34	28.1%

Most frequent character per block

Hangul

Value	Count	Frequency (%)
번	29	85.3%
급	1	2.9%
행	1	2.9%
개	1	2.9%
노	1	2.9%
선	1	2.9%

ASCII

Value	Count	Frequency (%)
1	27	31.0%
0	15	17.2%
6	12	13.8%
3	10	11.5%
2	8	9.2%
5	4	4.6%
7	4	4.6%
4	3	3.4%
9	3	3.4%
8	1	1.1%

기점-종점
Text

MISSING

Distinct	29
Distinct (%)	100.0%
Missing	1
Missing (%)	3.3%
Memory size	372.0 B

Length

Max length	14
Median length	13
Mean length	10.724138
Min length	8

Characters and Unicode

Total characters	311
Distinct characters	87
Distinct categories	7
Distinct scripts	3
Distinct blocks	3

Unique

Unique	29
Unique (%)	100.0%

Sample

1st row	봉산동 ↔ 옥계동
2nd row	수통골 ↔ 대전역
3rd row	수통골 ↔ 동춘당
4th row	수통골 ↔ 탄방역
5th row	충대농대 ↔ 비래삼호A

Value	Count	Frequency (%)
↔	29	33.3%
비래동	5	5.7%
봉산동	3	3.4%
수통골	3	3.4%
신탄진	3	3.4%
목원대	3	3.4%
오월드(동물원	3	3.4%
갈마아파트	2	2.3%
대전대	2	2.3%
대전역	2	2.3%
Other values (29)	32	36.8%

Most occurring characters

Value	Count	Frequency (%)
(space)	58	18.6%
↔	29	9.3%
동	26	8.4%
대	18	5.8%
원	9	2.9%
신	8	2.6%
전	7	2.3%
고	6	1.9%
래	6	1.9%
비	6	1.9%
Other values (77)	138	44.4%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	213	68.5%
Space Separator	58	18.6%
Math Symbol	29	9.3%
Close Punctuation	3	1.0%
Open Punctuation	3	1.0%
Uppercase Letter	3	1.0%
Decimal Number	2	0.6%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
동	26	12.2%
대	18	8.5%
원	9	4.2%
신	8	3.8%
전	7	3.3%
고	6	2.8%
래	6	2.8%
비	6	2.8%
역	5	2.3%
산	5	2.3%
Other values (68)	117	54.9%

Uppercase Letter

Value	Count	Frequency (%)
I	1	33.3%
C	1	33.3%
A	1	33.3%

Decimal Number

Value	Count	Frequency (%)
5	1	50.0%
2	1	50.0%

Space Separator

Value	Count	Frequency (%)
(space)	58	100.0%

Math Symbol

Value	Count	Frequency (%)
↔	29	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	3	100.0%

Open Punctuation

Value	Count	Frequency (%)
(	3	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Hangul	213	68.5%
Common	95	30.5%
Latin	3	1.0%

Most frequent character per script

Hangul

Value	Count	Frequency (%)
동	26	12.2%
대	18	8.5%
원	9	4.2%
신	8	3.8%
전	7	3.3%
고	6	2.8%
래	6	2.8%
비	6	2.8%
역	5	2.3%
산	5	2.3%
Other values (68)	117	54.9%

Common

Value	Count	Frequency (%)
(space)	58	61.1%
↔	29	30.5%
)	3	3.2%
(	3	3.2%
5	1	1.1%
2	1	1.1%

Latin

Value	Count	Frequency (%)
I	1	33.3%
C	1	33.3%
A	1	33.3%

Most occurring blocks

Value	Count	Frequency (%)
Hangul	213	68.5%
ASCII	69	22.2%
Arrows	29	9.3%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
(space)	58	84.1%
)	3	4.3%
(	3	4.3%
5	1	1.4%
I	1	1.4%
C	1	1.4%
A	1	1.4%
2	1	1.4%

Arrows

Value	Count	Frequency (%)
↔	29	100.0%

Hangul

Value	Count	Frequency (%)
동	26	12.2%
대	18	8.5%
원	9	4.2%
신	8	3.8%
전	7	3.3%
고	6	2.8%
래	6	2.8%
비	6	2.8%
역	5	2.3%
산	5	2.3%
Other values (68)	117	54.9%

대수
Categorical

Distinct	4
Distinct (%)	13.3%
Missing	0
Missing (%)	0.0%
Memory size	372.0 B

4	19
3	8
5	2
110	1

Length

Max length	3
Median length	1
Mean length	1.0666667
Min length	1

Unique

Unique	1
Unique (%)	3.3%

Sample

1st row	5
2nd row	4
3rd row	5
4th row	4
5th row	4

Common Values

Value	Count	Frequency (%)
4	19	63.3%
3	8	26.7%
5	2	6.7%
110	1	3.3%

Length

Histogram of lengths of the category

Correlations

Phik (φk)

	노선번호	기점-종점	대수
노선번호	1.000	1.000	1.000
기점-종점	1.000	1.000	1.000
대수	1.000	1.000	1.000

Missing values

Count

A simple visualization of nullity by column.

Matrix

The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
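
The charts themselves are not reproduced in this text. As a rough stand-in, the sketch below derives per-column completeness from the missing counts reported for each variable (0, 1, and 0 out of 30 observations) and prints a simple text bar. The function name `completeness` and the bar rendering are illustrative assumptions, not output of the profiling tool.

```python
N_OBS = 30
# Missing counts per variable, as reported in each variable's summary.
MISSING = {"노선번호": 0, "기점-종점": 1, "대수": 0}


def completeness(missing, n_obs):
    """Return {column: (non-null count, percent complete)}."""
    return {
        col: (n_obs - m, 100 * (n_obs - m) / n_obs)
        for col, m in missing.items()
    }


summary = completeness(MISSING, N_OBS)
for col, (present, pct) in summary.items():
    # '#' marks a present value, '.' a missing one (position is not meaningful).
    bar = "#" * present + "." * (N_OBS - present)
    print(f"{col}\t{bar}\t{present}/{N_OBS} ({pct:.1f}%)")
```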

Sample

First rows

	노선번호	기점-종점	대수
0	급행2번	봉산동 ↔ 옥계동	5
1	102번	수통골 ↔ 대전역	4
2	103번	수통골 ↔ 동춘당	5
3	104번	수통골 ↔ 탄방역	4
4	105번	충대농대 ↔ 비래삼호A	4
5	106번	목원대 ↔ 비래동	3
6	113번	서남부터미널 ↔ 학하동	4
7	119번	안산동 ↔ 효동	4
8	201번	원내차고지 ↔ 대전IC	4
9	301번	봉산동 ↔ 오월드(동물원)	4

Last rows

	노선번호	기점-종점	대수
20	611번	신대동 ↔ 세천공원	3
21	612번	동신과학고 ↔ 배재대	4
22	613번	비래동 ↔ 갈마아파트	4
23	617번	비래동 ↔ 변동5	3
24	619번	동신과학고 ↔ 서대전여고	4
25	703번	신탄진 ↔ 정림동	4
26	705번	신탄진 ↔ 대전역동광장	4
27	711번	신탄진 ↔ 대전역	4
28	802번	봉산동 ↔ 보문산	3
29	29개노선	<NA>	110
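
The last row (`29개노선`, literally "29 routes") has no `기점-종점` and a `대수` of 110. According to the Common Values counts, that equals the sum of `대수` over the other 29 rows (19×4 + 8×3 + 2×5 = 110). It therefore appears to be a totals row rather than a route, which may also be why `대수` was typed as Categorical. Below is a minimal sketch of that check and of filtering out such a row, assuming rows are plain tuples. `is_total_row` is a hypothetical helper, and only two of the thirty rows are shown.

```python
# Counts of each `대수` value, as listed under Common Values.
VALUE_COUNTS = {"4": 19, "3": 8, "5": 2, "110": 1}

# Sum of `대수` over every row except the suspected totals row.
route_sum = sum(int(v) * n for v, n in VALUE_COUNTS.items() if v != "110")


def is_total_row(row):
    """Heuristic (an assumption): a totals row has no origin-terminus value."""
    _, endpoints, _ = row
    return endpoints is None


# Two rows copied from the "Last rows" sample; None stands in for <NA>.
rows = [
    ("802번", "봉산동 ↔ 보문산", "3"),
    ("29개노선", None, "110"),
]
routes = [row for row in rows if not is_total_row(row)]

print(route_sum)
print(routes)
```

Dropping the totals row before profiling would leave 29 observations and let `대수` be read as a small integer column.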
