Overview

Dataset statistics

Number of variables: 5
Number of observations: 56
Missing cells: 8
Missing cells (%): 2.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.3 KiB
Average record size in memory: 42.3 B
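
The missing-cell percentage above can be checked by hand: it is the number of missing cells divided by the total number of cells (variables times observations). A minimal sketch of that arithmetic, using only the figures listed above; the function name is illustrative and is not part of ydata-profiling:

```python
def missing_cell_percentage(n_missing: int, n_variables: int, n_observations: int) -> float:
    """Share of empty cells in an n_observations x n_variables table, in percent."""
    total_cells = n_variables * n_observations
    return 100.0 * n_missing / total_cells


# Figures taken from the dataset statistics: 8 missing cells, 5 variables, 56 rows.
pct = missing_cell_percentage(n_missing=8, n_variables=5, n_observations=56)
print(f"{pct:.1f}%")
```

With 8 missing cells out of 5 × 56 = 280, this rounds to the 2.9% reported.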

Variable types

Categorical: 2
Text: 3

Dataset

Description: Printing businesses in Suseong-gu, Daegu Metropolitan City (as of August 2018)
Author: Suseong-gu, Daegu Metropolitan City
URL: http://data.daegu.go.kr/open/data/dataView.do?dataSetId=15054719&dataSetDetailId=15054719286119b22592e&provdMethod=FILE

Alerts

영업상태 is highly overall correlated with 업 종 (High correlation)
업 종 is highly overall correlated with 영업상태 (High correlation)
업 종 is highly imbalanced (87.1%) (Imbalance)
영업상태 is highly imbalanced (87.1%) (Imbalance)
사업체명칭 has 1 (1.8%) missing values (Missing)
사업체소재지(도로명) has 3 (5.4%) missing values (Missing)
전화번호 has 4 (7.1%) missing values (Missing)
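
The 87.1% imbalance figure for 업 종 and 영업상태 is consistent with one minus the normalized Shannon entropy of the value counts (55 and 1 out of 56). The sketch below assumes that definition; the function name is illustrative and should not be read as the library's API:

```python
import math
from typing import Sequence


def imbalance_score(counts: Sequence[int]) -> float:
    """Return 1 - H / log2(k), where H is the base-2 Shannon entropy of the
    class frequencies and k is the number of classes.
    0 means perfectly balanced; values near 1 mean one class dominates."""
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Value counts reported for both categorical variables: 55 and 1.
score = imbalance_score([55, 1])
print(f"{100 * score:.1f}%")
```

Under this assumption, counts of 55 and 1 give about 87.1%, the value shown in the alerts.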

Reproduction

Analysis started: 2024-04-21 03:20:57.031430
Analysis finished: 2024-04-21 03:20:57.949328
Duration: 0.92 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업 종
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 576.0 B

인쇄업: 55
<NA>: 1

Length

Max length: 4
Median length: 3
Mean length: 3.0178571
Min length: 3

Unique

Unique: 1
Unique (%): 1.8%

Sample

1st row: 인쇄업
2nd row: 인쇄업
3rd row: 인쇄업
4th row: 인쇄업
5th row: 인쇄업

Common Values

Value | Count | Frequency (%)
인쇄업 | 55 | 98.2%
<NA> | 1 | 1.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
인쇄업 | 55 | 98.2%
na | 1 | 1.8%

사업체명칭
Text

MISSING 

Distinct: 55
Distinct (%): 100.0%
Missing: 1
Missing (%): 1.8%
Memory size: 576.0 B

Length

Max length: 24
Median length: 14
Mean length: 7.8181818
Min length: 2

Characters and Unicode

Total characters: 430
Distinct characters: 153
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
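
As an illustration of those character properties, the sketch below counts characters by Unicode general category using Python's standard `unicodedata` module. It is not the profiler's own implementation; the helper name is hypothetical, and the input is one business name copied from the sample rows of this report:

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category (e.g. 'Lo', 'Zs', 'Ps')."""
    return Counter(unicodedata.category(ch) for ch in text)


# One value of 사업체명칭 taken from the sample rows of this report.
name = "오샤 주식회사 (OSHA Co. , Ltd)"
counts = category_counts(name)
for category, n in counts.most_common():
    print(category, n)
```

Here `Lo` (Other Letter) covers the Hangul syllables, `Lu` and `Ll` the Latin upper- and lowercase letters, `Zs` the spaces, and `Ps`/`Pe` the opening and closing parentheses. This one name accounts for all six uppercase and three lowercase letters listed in the category tables for this variable.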

Unique

Unique: 55
Unique (%): 100.0%

Sample

1st row: 새한인쇄소
2nd row: 경안당애드컴, 종합인쇄
3rd row: 상문인쇄사
4th row: 이재웅 광고기획
5th row: 팔공기획

Value | Count | Frequency (%)
주식회사 | 5 | 6.2%
도서출판 | 2 | 2.5%
새한인쇄소 | 1 | 1.2%
티플디자인 | 1 | 1.2%
오샤 | 1 | 1.2%
102디자인 | 1 | 1.2%
대한종합인쇄 | 1 | 1.2%
이룸커뮤니케이션즈 | 1 | 1.2%
크리에이티브 | 1 | 1.2%
주)웰메이드 | 1 | 1.2%
Other values (66) | 66 | 81.5%

Most occurring characters

ValueCountFrequency (%)
26
 
6.0%
23
 
5.3%
16
 
3.7%
) 14
 
3.3%
( 13
 
3.0%
13
 
3.0%
11
 
2.6%
11
 
2.6%
10
 
2.3%
10
 
2.3%
Other values (143) 283
65.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 362 | 84.2%
Space Separator | 26 | 6.0%
Close Punctuation | 14 | 3.3%
Open Punctuation | 13 | 3.0%
Uppercase Letter | 6 | 1.4%
Other Punctuation | 3 | 0.7%
Lowercase Letter | 3 | 0.7%
Decimal Number | 3 | 0.7%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
23
 
6.4%
16
 
4.4%
13
 
3.6%
11
 
3.0%
11
 
3.0%
10
 
2.8%
10
 
2.8%
8
 
2.2%
8
 
2.2%
7
 
1.9%
Other values (126) 245
67.7%
Uppercase Letter
ValueCountFrequency (%)
L 1
16.7%
C 1
16.7%
A 1
16.7%
H 1
16.7%
S 1
16.7%
O 1
16.7%
Lowercase Letter
ValueCountFrequency (%)
t 1
33.3%
o 1
33.3%
d 1
33.3%
Decimal Number
ValueCountFrequency (%)
2 1
33.3%
0 1
33.3%
1 1
33.3%
Other Punctuation
ValueCountFrequency (%)
, 2
66.7%
. 1
33.3%
Space Separator
ValueCountFrequency (%)
26
100.0%
Close Punctuation
ValueCountFrequency (%)
) 14
100.0%
Open Punctuation
ValueCountFrequency (%)
( 13
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 362 | 84.2%
Common | 59 | 13.7%
Latin | 9 | 2.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
23
 
6.4%
16
 
4.4%
13
 
3.6%
11
 
3.0%
11
 
3.0%
10
 
2.8%
10
 
2.8%
8
 
2.2%
8
 
2.2%
7
 
1.9%
Other values (126) 245
67.7%
Latin
ValueCountFrequency (%)
t 1
11.1%
L 1
11.1%
o 1
11.1%
C 1
11.1%
A 1
11.1%
H 1
11.1%
S 1
11.1%
O 1
11.1%
d 1
11.1%
Common
ValueCountFrequency (%)
26
44.1%
) 14
23.7%
( 13
22.0%
, 2
 
3.4%
. 1
 
1.7%
2 1
 
1.7%
0 1
 
1.7%
1 1
 
1.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 362 | 84.2%
ASCII | 68 | 15.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
26
38.2%
) 14
20.6%
( 13
19.1%
, 2
 
2.9%
t 1
 
1.5%
L 1
 
1.5%
. 1
 
1.5%
o 1
 
1.5%
C 1
 
1.5%
A 1
 
1.5%
Other values (7) 7
 
10.3%
Hangul
ValueCountFrequency (%)
23
 
6.4%
16
 
4.4%
13
 
3.6%
11
 
3.0%
11
 
3.0%
10
 
2.8%
10
 
2.8%
8
 
2.2%
8
 
2.2%
7
 
1.9%
Other values (126) 245
67.7%

사업체소재지(도로명)
Text

MISSING 

Distinct: 53
Distinct (%): 100.0%
Missing: 3
Missing (%): 5.4%
Memory size: 576.0 B

Length

Max length: 36
Median length: 32
Mean length: 27.528302
Min length: 23

Characters and Unicode

Total characters: 1459
Distinct characters: 76
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 53
Unique (%): 100.0%

Sample

1st row: 대구광역시 수성구 달구벌대로 3076-11 (시지동)
2nd row: 대구광역시 수성구 들안로54길 41 (수성동3가)
3rd row: 대구광역시 수성구 명덕로 477 (범어동)
4th row: 대구광역시 수성구 명덕로75길 22 (수성동1가)
5th row: 대구광역시 수성구 수성로58길 8 (수성동2가)

Value | Count | Frequency (%)
대구광역시 | 53 | 18.2%
수성구 | 53 | 18.2%
범어동 | 8 | 2.7%
3층 | 8 | 2.7%
2층 | 7 | 2.4%
1층 | 6 | 2.1%
두산동 | 6 | 2.1%
황금동 | 5 | 1.7%
만촌동 | 4 | 1.4%
수성로 | 4 | 1.4%
Other values (104) | 138 | 47.3%

Most occurring characters

ValueCountFrequency (%)
258
17.7%
114
 
7.8%
74
 
5.1%
72
 
4.9%
62
 
4.2%
60
 
4.1%
54
 
3.7%
( 54
 
3.7%
) 54
 
3.7%
53
 
3.6%
Other values (66) 604
41.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 835 | 57.2%
Space Separator | 258 | 17.7%
Decimal Number | 223 | 15.3%
Open Punctuation | 54 | 3.7%
Close Punctuation | 54 | 3.7%
Other Punctuation | 29 | 2.0%
Dash Punctuation | 6 | 0.4%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
114
13.7%
74
 
8.9%
72
 
8.6%
62
 
7.4%
60
 
7.2%
54
 
6.5%
53
 
6.3%
53
 
6.3%
53
 
6.3%
25
 
3.0%
Other values (51) 215
25.7%
Decimal Number
ValueCountFrequency (%)
1 44
19.7%
2 40
17.9%
3 31
13.9%
4 23
10.3%
5 22
9.9%
8 18
8.1%
0 14
 
6.3%
7 12
 
5.4%
6 10
 
4.5%
9 9
 
4.0%
Space Separator
ValueCountFrequency (%)
258
100.0%
Open Punctuation
ValueCountFrequency (%)
( 54
100.0%
Close Punctuation
ValueCountFrequency (%)
) 54
100.0%
Other Punctuation
ValueCountFrequency (%)
, 29
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 835 | 57.2%
Common | 624 | 42.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
114
13.7%
74
 
8.9%
72
 
8.6%
62
 
7.4%
60
 
7.2%
54
 
6.5%
53
 
6.3%
53
 
6.3%
53
 
6.3%
25
 
3.0%
Other values (51) 215
25.7%
Common
ValueCountFrequency (%)
258
41.3%
( 54
 
8.7%
) 54
 
8.7%
1 44
 
7.1%
2 40
 
6.4%
3 31
 
5.0%
, 29
 
4.6%
4 23
 
3.7%
5 22
 
3.5%
8 18
 
2.9%
Other values (5) 51
 
8.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 835 | 57.2%
ASCII | 624 | 42.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
258
41.3%
( 54
 
8.7%
) 54
 
8.7%
1 44
 
7.1%
2 40
 
6.4%
3 31
 
5.0%
, 29
 
4.6%
4 23
 
3.7%
5 22
 
3.5%
8 18
 
2.9%
Other values (5) 51
 
8.2%
Hangul
ValueCountFrequency (%)
114
13.7%
74
 
8.9%
72
 
8.6%
62
 
7.4%
60
 
7.2%
54
 
6.5%
53
 
6.3%
53
 
6.3%
53
 
6.3%
25
 
3.0%
Other values (51) 215
25.7%

전화번호
Text

MISSING 

Distinct: 51
Distinct (%): 98.1%
Missing: 4
Missing (%): 7.1%
Memory size: 576.0 B

Length

Max length: 13
Median length: 12
Mean length: 12.057692
Min length: 12

Characters and Unicode

Total characters: 627
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 50
Unique (%): 96.2%

Sample

1st row: 053-791-6881
2nd row: 053-764-6464
3rd row: 053-755-0300
4th row: 053-741-2806
5th row: 053-763-7780

Value | Count | Frequency (%)
053-756-7001 | 2 | 3.8%
053-768-6230 | 1 | 1.9%
070-4231-5386 | 1 | 1.9%
053-768-2061 | 1 | 1.9%
053-765-6886 | 1 | 1.9%
053-767-6009 | 1 | 1.9%
053-752-5523 | 1 | 1.9%
053-811-7127 | 1 | 1.9%
070-4200-4455 | 1 | 1.9%
053-767-4475 | 1 | 1.9%
Other values (41) | 41 | 78.8%

Most occurring characters

ValueCountFrequency (%)
- 104
16.6%
0 95
15.2%
5 94
15.0%
7 75
12.0%
3 71
11.3%
6 54
8.6%
2 40
 
6.4%
4 39
 
6.2%
1 25
 
4.0%
8 22
 
3.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 523 | 83.4%
Dash Punctuation | 104 | 16.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 95
18.2%
5 94
18.0%
7 75
14.3%
3 71
13.6%
6 54
10.3%
2 40
7.6%
4 39
7.5%
1 25
 
4.8%
8 22
 
4.2%
9 8
 
1.5%
Dash Punctuation
ValueCountFrequency (%)
- 104
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 627 | 100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 104
16.6%
0 95
15.2%
5 94
15.0%
7 75
12.0%
3 71
11.3%
6 54
8.6%
2 40
 
6.4%
4 39
 
6.2%
1 25
 
4.0%
8 22
 
3.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 627 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 104
16.6%
0 95
15.2%
5 94
15.0%
7 75
12.0%
3 71
11.3%
6 54
8.6%
2 40
 
6.4%
4 39
 
6.2%
1 25
 
4.0%
8 22
 
3.5%

영업상태
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Memory size: 576.0 B

영업중: 55
<NA>: 1

Length

Max length: 4
Median length: 3
Mean length: 3.0178571
Min length: 3

Unique

Unique: 1
Unique (%): 1.8%

Sample

1st row: 영업중
2nd row: 영업중
3rd row: 영업중
4th row: 영업중
5th row: 영업중

Common Values

Value | Count | Frequency (%)
영업중 | 55 | 98.2%
<NA> | 1 | 1.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
영업중 | 55 | 98.2%
na | 1 | 1.8%

Correlations

Variable | 사업체명칭 | 사업체소재지(도로명) | 전화번호
사업체명칭 | 1.000 | 1.000 | 1.000
사업체소재지(도로명) | 1.000 | 1.000 | 1.000
전화번호 | 1.000 | 1.000 | 1.000

Variable | 영업상태 | 업 종
영업상태 | 1.000 | 1.000
업 종 | 1.000 | 1.000

Variable | 업 종 | 영업상태
업 종 | 1.000 | 1.000
영업상태 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
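
One way to read nullity correlation is as the Pearson correlation between per-column "is missing" indicators. The sketch below assumes that definition and uses a few hypothetical values rather than the actual dataset; the function name is illustrative:

```python
import math
from typing import Optional, Sequence


def nullity_correlation(a: Sequence[Optional[str]], b: Sequence[Optional[str]]) -> float:
    """Pearson correlation between the 'is missing' indicators of two columns."""
    x = [1.0 if v is None else 0.0 for v in a]
    y = [1.0 if v is None else 0.0 for v in b]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    if var_x == 0 or var_y == 0:
        # Undefined when a column is never missing or always missing.
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical columns (not rows from the dataset); None marks a missing cell.
phone = ["p1", None, "p2", None]
address = ["a1", None, "a2", "a3"]
r = nullity_correlation(phone, address)
print(round(r, 3))
```

A value near 1 means the two columns tend to be missing in the same rows; a value near 0 means missingness in one says little about the other.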

Sample

Index | 업 종 | 사업체명칭 | 사업체소재지(도로명) | 전화번호 | 영업상태
0 | 인쇄업 | 새한인쇄소 | 대구광역시 수성구 달구벌대로 3076-11 (시지동) | 053-791-6881 | 영업중
1 | 인쇄업 | 경안당애드컴, 종합인쇄 | 대구광역시 수성구 들안로54길 41 (수성동3가) | 053-764-6464 | 영업중
2 | 인쇄업 | 상문인쇄사 | 대구광역시 수성구 명덕로 477 (범어동) | 053-755-0300 | 영업중
3 | 인쇄업 | 이재웅 광고기획 | 대구광역시 수성구 명덕로75길 22 (수성동1가) | 053-741-2806 | 영업중
4 | 인쇄업 | 팔공기획 | 대구광역시 수성구 수성로58길 8 (수성동2가) | 053-763-7780 | 영업중
5 | 인쇄업 | 화신기획인쇄 | 대구광역시 수성구 동대구로 379 (범어동,태운빌딩 1층) | 053-755-4442 | 영업중
6 | 인쇄업 | 동서기획 | 대구광역시 수성구 들안로 205 (중동) | 053-764-8457 | 영업중
7 | 인쇄업 | 대보커뮤니케이션(주) | 대구광역시 수성구 신천동로84길 31 (수성동4가) | 053-742-5673 | 영업중
8 | 인쇄업 | 일진전산 | 대구광역시 수성구 들안로8길 55 (두산동) | 053-767-4044 | 영업중
9 | 인쇄업 | 삼영종합인쇄소 | 대구광역시 수성구 범어로27길 33 (범어동) | 053-756-7001 | 영업중

Index | 업 종 | 사업체명칭 | 사업체소재지(도로명) | 전화번호 | 영업상태
46 | 인쇄업 | (주)메이트 컴퍼니 | 대구광역시 수성구 들안로 226, 3층 (황금동) | 053-765-2500 | 영업중
47 | 인쇄업 | (주)스타트컴 | 대구광역시 수성구 상록로 44, 2층 (범어동) | 053-252-4523 | 영업중
48 | 인쇄업 | 해디자인 | 대구광역시 수성구 수성로 186-1 (중동) | <NA> | 영업중
49 | 인쇄업 | (주)디자인소울 디자인연구소 | 대구광역시 수성구 동대구로 45, 3층 (두산동, 삼우빌딩) | 053-217-7677 | 영업중
50 | 인쇄업 | (사) 한국장애인케어 대구경북협회 | 대구광역시 수성구 동대구로 45, 3층 (두산동) | 053-214-7677 | 영업중
51 | 인쇄업 | 서원기획 | 대구광역시 수성구 청수로 159, 2층 (황금동) | 053-252-2550 | 영업중
52 | 인쇄업 | 참애드 | 대구광역시 수성구 들안로25길 41 (중동) | 053-426-4226 | 영업중
53 | 인쇄업 | 주식회사 디자인그룹칸 | 대구광역시 수성구 신천동로 130, 3층 (상동) | 053-656-2137 | 영업중
54 | 인쇄업 | 오샤 주식회사 (OSHA Co. , Ltd) | 대구광역시 수성구 무학로 115, 지하 1층 (두산동, 창성빌딩) | 053-656-6332 | 영업중
55 | <NA> | <NA> | <NA> | <NA> | <NA>