Overview

Dataset statistics

Number of variables: 5
Number of observations: 152
Missing cells: 20
Missing cells (%): 2.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.2 KiB
Average record size in memory: 41.9 B
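
The missing-cell percentage follows from the counts above: 20 missing cells out of 152 × 5 = 760 cells. A minimal sketch of that arithmetic; the variable names are illustrative and not part of the report.

```python
# Figures copied from the Dataset statistics table.
n_variables = 5
n_observations = 152
missing_cells = 20

total_cells = n_variables * n_observations       # 760 cells in total
missing_pct = 100 * missing_cells / total_cells  # share of cells that are missing

print(f"{missing_pct:.1f}%")  # prints 2.6%
```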

Variable types

Text: 3
Categorical: 1
DateTime: 1

Dataset

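Description: Store records (일렬번호 serial number, 점포이름 store name, 전화번호 phone number, 구분 category, 등록일 registration date) for the underground shopping arcade in front of Daejeon Station (Underground 200, Jungang-ro, Dong-gu), operated by the Daejeon Metropolitan City Facilities Management Corporation.
Author: Daejeon Metropolitan City Facilities Management Corporation (대전광역시시설관리공단)
URL: https://www.data.go.kr/data/15123949/fileData.do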

Alerts

구분 is highly imbalanced (54.1%) [Imbalance]
전화번호 has 20 (13.2%) missing values [Missing]
일렬번호 has unique values [Unique]
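
The 54.1% imbalance figure for 구분 is consistent with a normalized-entropy score, 1 − H / log k, computed over the category counts. That formula is an assumption inferred from the numbers in this report, not something the report states. The sketch below uses a hypothetical helper `imbalance_score` to check it against the counts listed under Common Values.

```python
import math

# Category counts for 구분, copied from its Common Values table.
counts = [112, 28, 7, 3, 1, 1]

def imbalance_score(counts):
    """Return 1 - H / log(k): 0 for a uniform column, approaching 1 as one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class leaves nothing to compare against
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)

score = imbalance_score(counts)
print(f"{score:.1%}")  # prints 54.1%
```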

Reproduction

Analysis started: 2023-12-12 12:45:03.956881
Analysis finished: 2023-12-12 12:45:04.519407
Duration: 0.56 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

일렬번호
Text

UNIQUE 

Distinct: 152
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 13
Median length: 13
Mean length: 13
Min length: 13

Characters and Unicode

Total characters: 1976
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
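
As a rough illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below classifies one sample serial from this report. It covers categories only, since scripts and blocks are not available from that module.

```python
import unicodedata
from collections import Counter

# First sample value of 일렬번호 in this report.
value = "MH20190611068"

# unicodedata.category returns the two-letter general category of a
# character, e.g. "Lu" (Uppercase Letter) or "Nd" (Decimal Number).
categories = Counter(unicodedata.category(ch) for ch in value)

print(dict(categories))  # prints {'Lu': 2, 'Nd': 11}
```

Every serial is 13 characters long, so 152 rows give 152 × 11 = 1672 decimal digits and 152 × 2 = 304 uppercase letters, matching the category counts reported for this variable.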

Unique

Unique: 152
Unique (%): 100.0%

Sample

1st row: MH20190611068
2nd row: MH20190611069
3rd row: MH20190611070
4th row: MH20190611071
5th row: MH20190611072
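
The sample serials appear to follow a fixed layout: a two-letter prefix, an eight-digit date, and a three-digit sequence number. This layout is inferred from the samples and is not documented by the source, so the hypothetical helper `parse_serial` below is only a sketch under that assumption.

```python
from datetime import date, datetime

def parse_serial(serial: str) -> tuple[str, date, int]:
    """Split a serial such as 'MH20190611068' into (prefix, date, sequence).

    Assumed layout (inferred, not documented by the source):
    2-letter prefix + YYYYMMDD + 3-digit sequence number.
    """
    if len(serial) != 13:
        raise ValueError(f"expected 13 characters, got {len(serial)}")
    prefix, day, seq = serial[:2], serial[2:10], serial[10:]
    return prefix, datetime.strptime(day, "%Y%m%d").date(), int(seq)

print(parse_serial("MH20190611068"))
```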
Value                Count   Frequency (%)
mh20190611068        1       0.7%
mh20190611173        1       0.7%
mh20190612182        1       0.7%
mh20190611167        1       0.7%
mh20190611168        1       0.7%
mh20190611169        1       0.7%
mh20190611170        1       0.7%
mh20190611171        1       0.7%
mh20190611172        1       0.7%
mh20190611175        1       0.7%
Other values (142)   142     93.4%

Most occurring characters

Value              Count   Frequency (%)
1                  534     27.0%
0                  366     18.5%
2                  222     11.2%
9                  189     9.6%
6                  182     9.2%
M                  152     7.7%
H                  152     7.7%
8                  49      2.5%
7                  39      2.0%
3                  35      1.8%
Other values (2)   56      2.8%

Most occurring categories

Value              Count   Frequency (%)
Decimal Number     1672    84.6%
Uppercase Letter   304     15.4%

Most frequent character per category

Decimal Number
Value   Count   Frequency (%)
1       534     31.9%
0       366     21.9%
2       222     13.3%
9       189     11.3%
6       182     10.9%
8       49      2.9%
7       39      2.3%
3       35      2.1%
4       29      1.7%
5       27      1.6%

Uppercase Letter
Value   Count   Frequency (%)
M       152     50.0%
H       152     50.0%

Most occurring scripts

Value    Count   Frequency (%)
Common   1672    84.6%
Latin    304     15.4%

Most frequent character per script

Common
Value   Count   Frequency (%)
1       534     31.9%
0       366     21.9%
2       222     13.3%
9       189     11.3%
6       182     10.9%
8       49      2.9%
7       39      2.3%
3       35      2.1%
4       29      1.7%
5       27      1.6%

Latin
Value   Count   Frequency (%)
M       152     50.0%
H       152     50.0%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   1976    100.0%

Most frequent character per block

ASCII
Value              Count   Frequency (%)
1                  534     27.0%
0                  366     18.5%
2                  222     11.2%
9                  189     9.6%
6                  182     9.2%
M                  152     7.7%
H                  152     7.7%
8                  49      2.5%
7                  39      2.0%
3                  35      1.8%
Other values (2)   56      2.8%

점포이름
Text

Distinct: 74
Distinct (%): 48.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 9
Median length: 6.5
Mean length: 4.0592105
Min length: 1

Characters and Unicode

Total characters: 617
Distinct characters: 143
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2

Unique

Unique: 28
Unique (%): 18.4%

Sample

1st row: 몽실
2nd row: 몽실
3rd row: 마모스
4th row: 흙비
5th row: 흙비
Value               Count   Frequency (%)
기린전자통신        11      7.1%
삼광전자            5       3.2%
밤블비              5       3.2%
크로커다일          4       2.6%
올포유              4       2.6%
(value not shown)   4       2.6%
여성크로커다일      4       2.6%
청바지코너          3       1.9%
호키랜드            3       1.9%
예시점포            3       1.9%
Other values (65)   108     70.1%

Most occurring characters

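Value                Count   Frequency (%)
(not shown)          30      4.9%
(not shown)          30      4.9%
(not shown)          25      4.1%
(not shown)          24      3.9%
(not shown)          14      2.3%
(not shown)          13      2.1%
(not shown)          13      2.1%
(not shown)          12      1.9%
(not shown)          12      1.9%
(not shown)          11      1.8%
Other values (133)   433     70.2%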

Most occurring categories

Value               Count   Frequency (%)
Other Letter        595     96.4%
Decimal Number      10      1.6%
Uppercase Letter    8       1.3%
Space Separator     2       0.3%
Close Punctuation   1       0.2%
Open Punctuation    1       0.2%

Most frequent character per category

Other Letter
Value                Count   Frequency (%)
(not shown)          30      5.0%
(not shown)          30      5.0%
(not shown)          25      4.2%
(not shown)          24      4.0%
(not shown)          14      2.4%
(not shown)          13      2.2%
(not shown)          13      2.2%
(not shown)          12      2.0%
(not shown)          12      2.0%
(not shown)          11      1.8%
Other values (120)   411     69.1%

Decimal Number
Value   Count   Frequency (%)
2       5       50.0%
6       1       10.0%
1       1       10.0%
0       1       10.0%
5       1       10.0%
4       1       10.0%

Uppercase Letter
Value   Count   Frequency (%)
N       2       25.0%
E       2       25.0%
W       2       25.0%
T       2       25.0%

Space Separator
Value     Count   Frequency (%)
(space)   2       100.0%

Close Punctuation
Value   Count   Frequency (%)
)       1       100.0%

Open Punctuation
Value   Count   Frequency (%)
(       1       100.0%

Most occurring scripts

Value    Count   Frequency (%)
Hangul   595     96.4%
Common   14      2.3%
Latin    8       1.3%

Most frequent character per script

Hangul
Value                Count   Frequency (%)
(not shown)          30      5.0%
(not shown)          30      5.0%
(not shown)          25      4.2%
(not shown)          24      4.0%
(not shown)          14      2.4%
(not shown)          13      2.2%
(not shown)          13      2.2%
(not shown)          12      2.0%
(not shown)          12      2.0%
(not shown)          11      1.8%
Other values (120)   411     69.1%

Common
Value     Count   Frequency (%)
2         5       35.7%
(space)   2       14.3%
6         1       7.1%
1         1       7.1%
0         1       7.1%
5         1       7.1%
4         1       7.1%
)         1       7.1%
(         1       7.1%

Latin
Value   Count   Frequency (%)
N       2       25.0%
E       2       25.0%
W       2       25.0%
T       2       25.0%

Most occurring blocks

Value    Count   Frequency (%)
Hangul   595     96.4%
ASCII    22      3.6%

Most frequent character per block

Hangul
Value                Count   Frequency (%)
(not shown)          30      5.0%
(not shown)          30      5.0%
(not shown)          25      4.2%
(not shown)          24      4.0%
(not shown)          14      2.4%
(not shown)          13      2.2%
(not shown)          13      2.2%
(not shown)          12      2.0%
(not shown)          12      2.0%
(not shown)          11      1.8%
Other values (120)   411     69.1%

ASCII
Value              Count   Frequency (%)
2                  5       22.7%
(space)            2       9.1%
N                  2       9.1%
E                  2       9.1%
W                  2       9.1%
T                  2       9.1%
6                  1       4.5%
1                  1       4.5%
0                  1       4.5%
5                  1       4.5%
Other values (3)   3       13.6%

전화번호
Text

MISSING 

Distinct: 60
Distinct (%): 45.5%
Missing: 20
Missing (%): 13.2%
Memory size: 1.3 KiB

Length

Max length: 10
Median length: 8
Mean length: 7.9166667
Min length: 1

Characters and Unicode

Total characters: 1045
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 19
Unique (%): 14.4%

Sample

1st row: 221-0893
2nd row: 221-0893
3rd row: 242-4075
4th row: 222-5636
5th row: 222-5636
Value               Count   Frequency (%)
633-3701            11      8.3%
255-7838            5       3.8%
222-6999            5       3.8%
222-8525            4       3.0%
252-0036            4       3.0%
257-9064            4       3.0%
256-0842            3       2.3%
221-9867            3       2.3%
256-2990            3       2.3%
257-1754            3       2.3%
Other values (50)   87      65.9%

Most occurring characters

Value              Count   Frequency (%)
2                  223     21.3%
5                  136     13.0%
-                  131     12.5%
6                  90      8.6%
3                  89      8.5%
7                  81      7.8%
0                  70      6.7%
1                  59      5.6%
4                  59      5.6%
9                  54      5.2%
Other values (2)   53      5.1%

Most occurring categories

Value              Count   Frequency (%)
Decimal Number     912     87.3%
Dash Punctuation   131     12.5%
Math Symbol        2       0.2%

Most frequent character per category

Decimal Number
Value   Count   Frequency (%)
2       223     24.5%
5       136     14.9%
6       90      9.9%
3       89      9.8%
7       81      8.9%
0       70      7.7%
1       59      6.5%
4       59      6.5%
9       54      5.9%
8       51      5.6%

Dash Punctuation
Value   Count   Frequency (%)
-       131     100.0%

Math Symbol
Value   Count   Frequency (%)
~       2       100.0%

Most occurring scripts

Value    Count   Frequency (%)
Common   1045    100.0%

Most frequent character per script

Common
Value              Count   Frequency (%)
2                  223     21.3%
5                  136     13.0%
-                  131     12.5%
6                  90      8.6%
3                  89      8.5%
7                  81      7.8%
0                  70      6.7%
1                  59      5.6%
4                  59      5.6%
9                  54      5.2%
Other values (2)   53      5.1%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   1045    100.0%

Most frequent character per block

ASCII
Value              Count   Frequency (%)
2                  223     21.3%
5                  136     13.0%
-                  131     12.5%
6                  90      8.6%
3                  89      8.5%
7                  81      7.8%
0                  70      6.7%
1                  59      5.6%
4                  59      5.6%
9                  54      5.2%
Other values (2)   53      5.1%

구분
Categorical

IMBALANCE 

Distinct: 6
Distinct (%): 3.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB

Length

Max length: 4
Median length: 1
Mean length: 1.0197368
Min length: 1

Unique

Unique: 2
Unique (%): 1.3%

Sample

1st row: 3
2nd row: 3
3rd row: 3
4th row: 3
5th row: 3

Common Values

Value   Count   Frequency (%)
3       112     73.7%
5       28      18.4%
2       7       4.6%
1       3       2.0%
4       1       0.7%
<NA>    1       0.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count   Frequency (%)
3       112     73.7%
5       28      18.4%
2       7       4.6%
1       3       2.0%
4       1       0.7%
na      1       0.7%

등록일
DateTime

Distinct: 12
Distinct (%): 7.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB
Minimum: 2019-06-11 00:00:00
Maximum: 2019-07-31 00:00:00
Histogram with fixed size bins (bins=12)

Correlations

           점포이름   전화번호   구분    등록일
점포이름   1.000      1.000      0.999   0.827
전화번호   1.000      1.000      1.000   0.298
구분       0.999      1.000      1.000   0.375
등록일     0.827      0.298      0.375   1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
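
Both views can be approximated in a few lines of plain Python. The sketch below uses the first ten sample rows of this report, with `None` standing in for `<NA>`. It illustrates the idea only and is not how the report itself renders these plots.

```python
# First ten sample rows from this report; None stands in for <NA>.
columns = ["일렬번호", "점포이름", "전화번호", "구분", "등록일"]
rows = [
    ("MH20190611068", "몽실", "221-0893", "3", "2019-06-11"),
    ("MH20190611069", "몽실", "221-0893", "3", "2019-06-11"),
    ("MH20190611070", "마모스", "242-4075", "3", "2019-06-11"),
    ("MH20190611071", "흙비", "222-5636", "3", "2019-06-11"),
    ("MH20190611072", "흙비", "222-5636", "3", "2019-06-11"),
    ("MH20190611073", "해풍사", "256-2039", "3", "2019-06-11"),
    ("MH20190611074", "보라", "252-1601", "3", "2019-06-11"),
    ("MH20190611075", "보라", None, "3", "2019-06-11"),
    ("MH20190611076", "탐나라", "252-4857", "3", "2019-06-11"),
    ("MH20190611077", "탐나라", "252-4857", "3", "2019-06-11"),
]

# Nullity by column: number of missing cells per variable.
missing_per_column = {
    name: sum(row[i] is None for row in rows)
    for i, name in enumerate(columns)
}

# Nullity matrix: one character per cell, '#' = present, '.' = missing.
matrix = ["".join("." if cell is None else "#" for cell in row) for row in rows]

print(missing_per_column)
print("\n".join(matrix))
```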

Sample

      일렬번호        점포이름     전화번호   구분   등록일
0     MH20190611068   몽실         221-0893   3      2019-06-11
1     MH20190611069   몽실         221-0893   3      2019-06-11
2     MH20190611070   마모스       242-4075   3      2019-06-11
3     MH20190611071   흙비         222-5636   3      2019-06-11
4     MH20190611072   흙비         222-5636   3      2019-06-11
5     MH20190611073   해풍사       256-2039   3      2019-06-11
6     MH20190611074   보라         252-1601   3      2019-06-11
7     MH20190611075   보라         <NA>       3      2019-06-11
8     MH20190611076   탐나라       252-4857   3      2019-06-11
9     MH20190611077   탐나라       252-4857   3      2019-06-11

      일렬번호        점포이름     전화번호   구분   등록일
142   MH20190626238   나열16호     <NA>       2      2019-06-26
143   MH20190627244   크로커다일   256-2990   3      2019-06-27
144   MH20190628262   테스트점포   <NA>       3      2019-06-28
145   MH20190628265   테스트점포   <NA>       3      2019-06-28
146   MH20190628284   예시점포     <NA>       3      2019-06-28
147   MH20190628285   예시점포2    <NA>       3      2019-06-28
148   MH20190705288   예시점포2    <NA>       3      2019-07-05
149   MH20190705289   예시점포     <NA>       3      2019-07-05
150   MH20190708297   예시점포     <NA>       3      2019-07-08
151   MH20190731301   비네아       <NA>       3      2019-07-31