Overview

Dataset statistics

Number of variables: 4
Number of observations: 473
Missing cells: 83
Missing cells (%): 4.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 14.9 KiB
Average record size in memory: 32.3 B
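
The derived figures in this table can be checked from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes. Because the total size is itself rounded, the result matches only to the displayed precision.

```python
# Figures copied from the dataset statistics in this report.
n_variables = 4
n_observations = 473
missing_cells = 83
total_size_kib = 14.9

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```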

Variable types

Text: 3
Categorical: 1

Dataset

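Description: 부산광역시중구_출판및인쇄사현황_20230803 (Busan Metropolitan City Jung-gu, status of publishers and printers, 2023-08-03)
Author: 부산광역시 중구 (Jung-gu, Busan Metropolitan City)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=3073842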

Alerts

사업체소재지(도로명) (business location, road-name address) has 76 (16.1%) missing values
사업체소재지(지번) (business location, lot-number address) has 7 (1.5%) missing values

Reproduction

Analysis started: 2023-12-10 16:43:06.116203
Analysis finished: 2023-12-10 16:43:06.608884
Duration: 0.49 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
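
A report of this kind could be regenerated roughly as follows. This is a minimal sketch, not the configuration actually used: the CSV path is hypothetical, the default `read_csv` encoding is an assumption (Korean public data files are often CP949-encoded), and the settings stored in config.json are not reproduced.

```python
def build_report(csv_path: str, out_path: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report to out_path."""
    # Imported lazily so this module loads even where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the source is a CSV file readable with the default encoding;
    # pass encoding="cp949" to read_csv if the Korean text comes out garbled.
    df = pd.read_csv(csv_path)

    # The title reuses the dataset description from this report.
    report = ProfileReport(df, title="부산광역시중구_출판및인쇄사현황_20230803")
    report.to_file(out_path)
```

Calling `build_report("publishers.csv")` (a hypothetical file name) would write `report.html` to the working directory.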

Variables

사업체명칭 (business name)
Text

Distinct: 424
Distinct (%): 89.6%
Missing: 0
Missing (%): 0.0%
Memory size: 3.8 KiB

Length

Max length: 26
Median length: 16
Mean length: 5.8350951
Min length: 1

Characters and Unicode

Total characters: 2760
Distinct characters: 358
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
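
As an illustration of these character properties, the sketch below tallies the Unicode general category of each character in a string using Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical. The standard library exposes the general category (for example `Lo` for Other Letter, `Zs` for Space Separator) but not the script or block, so only the category breakdown is shown.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count the characters of text by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# A business name taken from the sample rows of this report.
counts = category_counts("도서출판 조양")
print(counts)

# A name with parentheses, also from the sample rows.
paren_counts = category_counts("(주) 다영")
print(paren_counts)
```

Hangul syllables fall under `Lo`, the space under `Zs`, and the parentheses under `Ps` (Open Punctuation) and `Pe` (Close Punctuation).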

Unique

Unique: 376
Unique (%): 79.5%
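
The difference between the distinct and unique counts can be sketched as follows, assuming that "distinct" counts the different values present and "unique" counts the values that occur exactly once. The helper name and the toy list are illustrative only, not data from the report.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    n_distinct = len(counts)
    n_unique = sum(1 for c in counts.values() if c == 1)
    return n_distinct, n_unique


# Toy list for illustration only.
toy = ["A", "A", "B", "B", "C", "D"]
n_distinct, n_unique = distinct_and_unique(toy)
print(n_distinct, n_unique)

# The reported percentages follow from the counts and the 473 observations.
distinct_pct = f"{100 * 424 / 473:.1f}"
unique_pct = f"{100 * 376 / 473:.1f}"
print(distinct_pct, unique_pct)
```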

Sample

1st row: 연문씨앤피
2nd row: 시로출판사
3rd row: 태화출판사
4th row: 참한문화사
5th row: 도서출판 조양
Value | Count | Frequency (%)
도서출판 | 62 | 10.6%
주식회사 | 8 | 1.4%
디자인 | 7 | 1.2%
샤인텔 | 3 | 0.5%
예문사 | 3 | 0.5%
서울미디어 | 2 | 0.3%
신지사 | 2 | 0.3%
진영기획 | 2 | 0.3%
태성인쇄사 | 2 | 0.3%
대양사 | 2 | 0.3%
Other values (445) | 491 | 84.1%
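
The word counts above sum to 584, so each frequency is a share of all words rather than of rows (62 / 584 ≈ 10.6%). A minimal sketch of such a word count, assuming names are split on whitespace (the profiler's exact tokenization may differ), applied to the first ten business names from the sample rows of this report:

```python
from collections import Counter

# Business names from the first ten sample rows of this report.
names = [
    "연문씨앤피", "시로출판사", "태화출판사", "참한문화사", "도서출판 조양",
    "명성출판사", "산업교육출판부", "도서출판 청산", "해광", "도서출판 모아",
]

# Assumption: words are whitespace-separated tokens.
word_counts = Counter(word for name in names for word in name.split())
total_words = sum(word_counts.values())

for word, count in word_counts.most_common(3):
    print(f"{word} | {count} | {100 * count / total_words:.1f}%")
```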

Most occurring characters

Value | Count | Frequency (%)
 | 146 | 5.3%
 | 139 | 5.0%
 | 114 | 4.1%
(space) | 112 | 4.1%
 | 105 | 3.8%
 | 96 | 3.5%
 | 73 | 2.6%
 | 71 | 2.6%
 | 71 | 2.6%
 | 63 | 2.3%
Other values (348) | 1770 | 64.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2467 | 89.4%
Space Separator | 112 | 4.1%
Uppercase Letter | 59 | 2.1%
Lowercase Letter | 50 | 1.8%
Close Punctuation | 32 | 1.2%
Open Punctuation | 31 | 1.1%
Decimal Number | 6 | 0.2%
Other Punctuation | 2 | 0.1%
Dash Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 146 | 5.9%
 | 139 | 5.6%
 | 114 | 4.6%
 | 105 | 4.3%
 | 96 | 3.9%
 | 73 | 3.0%
 | 71 | 2.9%
 | 71 | 2.9%
 | 63 | 2.6%
 | 40 | 1.6%
Other values (298) | 1549 | 62.8%

Uppercase Letter
Value | Count | Frequency (%)
N | 8 | 13.6%
D | 4 | 6.8%
B | 4 | 6.8%
M | 4 | 6.8%
R | 4 | 6.8%
E | 4 | 6.8%
I | 4 | 6.8%
P | 3 | 5.1%
T | 3 | 5.1%
O | 3 | 5.1%
Other values (11) | 18 | 30.5%

Lowercase Letter
Value | Count | Frequency (%)
o | 10 | 20.0%
a | 6 | 12.0%
r | 5 | 10.0%
e | 4 | 8.0%
l | 4 | 8.0%
s | 4 | 8.0%
k | 3 | 6.0%
h | 2 | 4.0%
g | 2 | 4.0%
i | 2 | 4.0%
Other values (8) | 8 | 16.0%

Decimal Number
Value | Count | Frequency (%)
0 | 1 | 16.7%
4 | 1 | 16.7%
3 | 1 | 16.7%
6 | 1 | 16.7%
5 | 1 | 16.7%
7 | 1 | 16.7%

Space Separator
Value | Count | Frequency (%)
(space) | 112 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 32 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 31 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
& | 2 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2467 | 89.4%
Common | 184 | 6.7%
Latin | 109 | 3.9%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 146 | 5.9%
 | 139 | 5.6%
 | 114 | 4.6%
 | 105 | 4.3%
 | 96 | 3.9%
 | 73 | 3.0%
 | 71 | 2.9%
 | 71 | 2.9%
 | 63 | 2.6%
 | 40 | 1.6%
Other values (298) | 1549 | 62.8%

Latin
Value | Count | Frequency (%)
o | 10 | 9.2%
N | 8 | 7.3%
a | 6 | 5.5%
r | 5 | 4.6%
e | 4 | 3.7%
l | 4 | 3.7%
D | 4 | 3.7%
B | 4 | 3.7%
M | 4 | 3.7%
s | 4 | 3.7%
Other values (29) | 56 | 51.4%

Common
Value | Count | Frequency (%)
(space) | 112 | 60.9%
) | 32 | 17.4%
( | 31 | 16.8%
& | 2 | 1.1%
0 | 1 | 0.5%
4 | 1 | 0.5%
3 | 1 | 0.5%
6 | 1 | 0.5%
5 | 1 | 0.5%
- | 1 | 0.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2467 | 89.4%
ASCII | 293 | 10.6%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 146 | 5.9%
 | 139 | 5.6%
 | 114 | 4.6%
 | 105 | 4.3%
 | 96 | 3.9%
 | 73 | 3.0%
 | 71 | 2.9%
 | 71 | 2.9%
 | 63 | 2.6%
 | 40 | 1.6%
Other values (298) | 1549 | 62.8%

ASCII
Value | Count | Frequency (%)
(space) | 112 | 38.2%
) | 32 | 10.9%
( | 31 | 10.6%
o | 10 | 3.4%
N | 8 | 2.7%
a | 6 | 2.0%
r | 5 | 1.7%
e | 4 | 1.4%
l | 4 | 1.4%
D | 4 | 1.4%
Other values (40) | 77 | 26.3%

업종 (business type)
Categorical

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.8 KiB
출판사 (publisher): 276
인쇄사 (printer): 197

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 출판사
2nd row: 출판사
3rd row: 출판사
4th row: 출판사
5th row: 출판사

Common Values

Value | Count | Frequency (%)
출판사 | 276 | 58.4%
인쇄사 | 197 | 41.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
출판사 | 276 | 58.4%
인쇄사 | 197 | 41.6%

사업체소재지(도로명) (business location, road-name address)
Text

Distinct: 301
Distinct (%): 75.8%
Missing: 76
Missing (%): 16.1%
Memory size: 3.8 KiB

Length

Max length: 48
Median length: 40
Mean length: 29.763224
Min length: 22

Characters and Unicode

Total characters: 11816
Distinct characters: 149
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 232
Unique (%): 58.4%

Sample

1st row: 부산광역시 중구 대청로135번길 11 (동광동4가)
2nd row: 부산광역시 중구 중앙대로 76 (중앙동4가)
3rd row: 부산광역시 중구 충장대로5번길 50-1 (중앙동4가)
4th row: 부산광역시 중구 40계단길 3 (중앙동4가)
5th row: 부산광역시 중구 복병산길6번길 3 (대청동1가)
Value | Count | Frequency (%)
부산광역시 | 397 | 18.4%
중구 | 397 | 18.4%
동광동4가 | 55 | 2.5%
중앙동4가 | 54 | 2.5%
동광길 | 53 | 2.5%
보수동2가 | 38 | 1.8%
대청로135번길 | 36 | 1.7%
대청동1가 | 36 | 1.7%
대청로 | 34 | 1.6%
복병산길6번길 | 29 | 1.3%
Other values (342) | 1034 | 47.8%

Most occurring characters

Value | Count | Frequency (%)
(space) | 1983 | 16.8%
1 | 579 | 4.9%
 | 575 | 4.9%
 | 554 | 4.7%
 | 547 | 4.6%
 | 434 | 3.7%
) | 417 | 3.5%
( | 417 | 3.5%
 | 415 | 3.5%
 | 410 | 3.5%
Other values (139) | 5485 | 46.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 6580 | 55.7%
Decimal Number | 2080 | 17.6%
Space Separator | 1983 | 16.8%
Close Punctuation | 417 | 3.5%
Open Punctuation | 417 | 3.5%
Other Punctuation | 201 | 1.7%
Dash Punctuation | 127 | 1.1%
Uppercase Letter | 10 | 0.1%
Lowercase Letter | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 575 | 8.7%
 | 554 | 8.4%
 | 547 | 8.3%
 | 434 | 6.6%
 | 415 | 6.3%
 | 410 | 6.2%
 | 400 | 6.1%
 | 397 | 6.0%
 | 385 | 5.9%
 | 318 | 4.8%
Other values (114) | 2145 | 32.6%

Decimal Number
Value | Count | Frequency (%)
1 | 579 | 27.8%
3 | 296 | 14.2%
2 | 289 | 13.9%
4 | 281 | 13.5%
5 | 167 | 8.0%
6 | 115 | 5.5%
0 | 111 | 5.3%
7 | 97 | 4.7%
8 | 79 | 3.8%
9 | 66 | 3.2%

Uppercase Letter
Value | Count | Frequency (%)
S | 2 | 20.0%
C | 2 | 20.0%
K | 1 | 10.0%
B | 1 | 10.0%
N | 1 | 10.0%
A | 1 | 10.0%
P | 1 | 10.0%
T | 1 | 10.0%

Other Punctuation
Value | Count | Frequency (%)
, | 199 | 99.0%
/ | 2 | 1.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1983 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 417 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 417 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 127 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
b | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 6580 | 55.7%
Common | 5225 | 44.2%
Latin | 11 | 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 575 | 8.7%
 | 554 | 8.4%
 | 547 | 8.3%
 | 434 | 6.6%
 | 415 | 6.3%
 | 410 | 6.2%
 | 400 | 6.1%
 | 397 | 6.0%
 | 385 | 5.9%
 | 318 | 4.8%
Other values (114) | 2145 | 32.6%

Common
Value | Count | Frequency (%)
(space) | 1983 | 38.0%
1 | 579 | 11.1%
) | 417 | 8.0%
( | 417 | 8.0%
3 | 296 | 5.7%
2 | 289 | 5.5%
4 | 281 | 5.4%
, | 199 | 3.8%
5 | 167 | 3.2%
- | 127 | 2.4%
Other values (6) | 470 | 9.0%

Latin
Value | Count | Frequency (%)
S | 2 | 18.2%
C | 2 | 18.2%
K | 1 | 9.1%
B | 1 | 9.1%
N | 1 | 9.1%
b | 1 | 9.1%
A | 1 | 9.1%
P | 1 | 9.1%
T | 1 | 9.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 6580 | 55.7%
ASCII | 5236 | 44.3%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 1983 | 37.9%
1 | 579 | 11.1%
) | 417 | 8.0%
( | 417 | 8.0%
3 | 296 | 5.7%
2 | 289 | 5.5%
4 | 281 | 5.4%
, | 199 | 3.8%
5 | 167 | 3.2%
- | 127 | 2.4%
Other values (15) | 481 | 9.2%

Hangul
Value | Count | Frequency (%)
 | 575 | 8.7%
 | 554 | 8.4%
 | 547 | 8.3%
 | 434 | 6.6%
 | 415 | 6.3%
 | 410 | 6.2%
 | 400 | 6.1%
 | 397 | 6.0%
 | 385 | 5.9%
 | 318 | 4.8%
Other values (114) | 2145 | 32.6%

사업체소재지(지번) (business location, lot-number address)
Text

Distinct: 327
Distinct (%): 70.2%
Missing: 7
Missing (%): 1.5%
Memory size: 3.8 KiB

Length

Max length: 37
Median length: 33
Mean length: 21.306867
Min length: 17

Characters and Unicode

Total characters: 9929
Distinct characters: 119
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 244
Unique (%): 52.4%

Sample

1st row: 부산광역시 중구 동광동4가 12-1
2nd row: 부산광역시 중구 대청동3가 8
3rd row: 부산광역시 중구 중앙동4가 25-5
4th row: 부산광역시 중구 보수동1가 8
5th row: 부산광역시 중구 동광동4가 1
Value | Count | Frequency (%)
부산광역시 | 466 | 23.0%
중구 | 466 | 23.0%
동광동4가 | 78 | 3.8%
중앙동4가 | 77 | 3.8%
보수동2가 | 52 | 2.6%
대청동1가 | 44 | 2.2%
중앙동3가 | 33 | 1.6%
1층 | 26 | 1.3%
중앙동2가 | 25 | 1.2%
3층 | 22 | 1.1%
Other values (350) | 738 | 36.4%

Most occurring characters

Value | Count | Frequency (%)
(space) | 2001 | 20.2%
 | 613 | 6.2%
 | 574 | 5.8%
 | 572 | 5.8%
 | 483 | 4.9%
 | 471 | 4.7%
 | 469 | 4.7%
 | 466 | 4.7%
 | 466 | 4.7%
 | 454 | 4.6%
Other values (109) | 3360 | 33.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 5486 | 55.3%
Decimal Number | 2013 | 20.3%
Space Separator | 2001 | 20.2%
Dash Punctuation | 381 | 3.8%
Open Punctuation | 21 | 0.2%
Close Punctuation | 21 | 0.2%
Uppercase Letter | 6 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 613 | 11.2%
 | 574 | 10.5%
 | 572 | 10.4%
 | 483 | 8.8%
 | 471 | 8.6%
 | 469 | 8.5%
 | 466 | 8.5%
 | 466 | 8.5%
 | 454 | 8.3%
 | 147 | 2.7%
Other values (89) | 771 | 14.1%

Decimal Number
Value | Count | Frequency (%)
1 | 417 | 20.7%
2 | 342 | 17.0%
3 | 328 | 16.3%
4 | 308 | 15.3%
5 | 139 | 6.9%
7 | 105 | 5.2%
0 | 99 | 4.9%
9 | 97 | 4.8%
6 | 94 | 4.7%
8 | 84 | 4.2%

Uppercase Letter
Value | Count | Frequency (%)
B | 1 | 16.7%
N | 1 | 16.7%
K | 1 | 16.7%
A | 1 | 16.7%
P | 1 | 16.7%
T | 1 | 16.7%

Space Separator
Value | Count | Frequency (%)
(space) | 2001 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 381 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 21 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 21 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 5486 | 55.3%
Common | 4437 | 44.7%
Latin | 6 | 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 613 | 11.2%
 | 574 | 10.5%
 | 572 | 10.4%
 | 483 | 8.8%
 | 471 | 8.6%
 | 469 | 8.5%
 | 466 | 8.5%
 | 466 | 8.5%
 | 454 | 8.3%
 | 147 | 2.7%
Other values (89) | 771 | 14.1%

Common
Value | Count | Frequency (%)
(space) | 2001 | 45.1%
1 | 417 | 9.4%
- | 381 | 8.6%
2 | 342 | 7.7%
3 | 328 | 7.4%
4 | 308 | 6.9%
5 | 139 | 3.1%
7 | 105 | 2.4%
0 | 99 | 2.2%
9 | 97 | 2.2%
Other values (4) | 220 | 5.0%

Latin
Value | Count | Frequency (%)
B | 1 | 16.7%
N | 1 | 16.7%
K | 1 | 16.7%
A | 1 | 16.7%
P | 1 | 16.7%
T | 1 | 16.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 5486 | 55.3%
ASCII | 4443 | 44.7%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 2001 | 45.0%
1 | 417 | 9.4%
- | 381 | 8.6%
2 | 342 | 7.7%
3 | 328 | 7.4%
4 | 308 | 6.9%
5 | 139 | 3.1%
7 | 105 | 2.4%
0 | 99 | 2.2%
9 | 97 | 2.2%
Other values (10) | 226 | 5.1%

Hangul
Value | Count | Frequency (%)
 | 613 | 11.2%
 | 574 | 10.5%
 | 572 | 10.4%
 | 483 | 8.8%
 | 471 | 8.6%
 | 469 | 8.5%
 | 466 | 8.5%
 | 466 | 8.5%
 | 454 | 8.3%
 | 147 | 2.7%
Other values (89) | 771 | 14.1%

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
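
Nullity correlation can be sketched as the Pearson correlation between the presence indicators of two columns (1 where a value is present, 0 where it is missing). The function name and the toy columns below are hypothetical and are not taken from the dataset; the profiler's own implementation may differ in detail.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation of presence indicators; None if a column never varies."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    # A column that is always present (or always missing) has no variance,
    # so the correlation is undefined.
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)


# Toy columns for illustration only (None marks a missing value).
road = ["a", None, "b", None]
lot = ["x", None, "y", "z"]
r = nullity_correlation(road, lot)
print(r)

# A fully populated column yields an undefined correlation.
undefined = nullity_correlation(road, ["x", "y", "z", "w"])
print(undefined)
```

A value near 1 means the two columns tend to be missing together, a value near -1 means one tends to be present when the other is missing, and a value near 0 means no relationship.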

Sample

 | 사업체명칭 | 업종 | 사업체소재지(도로명) | 사업체소재지(지번)
0 | 연문씨앤피 | 출판사 | 부산광역시 중구 대청로135번길 11 (동광동4가) | 부산광역시 중구 동광동4가 12-1
1 | 시로출판사 | 출판사 | <NA> | 부산광역시 중구 대청동3가 8
2 | 태화출판사 | 출판사 | 부산광역시 중구 중앙대로 76 (중앙동4가) | 부산광역시 중구 중앙동4가 25-5
3 | 참한문화사 | 출판사 | <NA> | 부산광역시 중구 보수동1가 8
4 | 도서출판 조양 | 출판사 | <NA> | 부산광역시 중구 동광동4가 1
5 | 명성출판사 | 출판사 | <NA> | 부산광역시 중구 대청동2가 10
6 | 산업교육출판부 | 출판사 | 부산광역시 중구 충장대로5번길 50-1 (중앙동4가) | 부산광역시 중구 중앙동4가 78-15
7 | 도서출판 청산 | 출판사 | 부산광역시 중구 40계단길 3 (중앙동4가) | 부산광역시 중구 중앙동4가 37-5
8 | 해광 | 출판사 | <NA> | 부산광역시 중구 동광동4가 13
9 | 도서출판 모아 | 출판사 | 부산광역시 중구 복병산길6번길 3 (대청동1가) | 부산광역시 중구 대청동1가 30-5

 | 사업체명칭 | 업종 | 사업체소재지(도로명) | 사업체소재지(지번)
463 | 디자인예원 | 인쇄사 | 부산광역시 중구 대청로 115-5, 3층 (대청동1가) | 부산광역시 중구 대청동1가 33-1
464 | (주) 다영 | 인쇄사 | 부산광역시 중구 동광길 10, 1층 (동광동4가) | 부산광역시 중구 동광동4가 18-8
465 | (주) 핑크로더 | 인쇄사 | 부산광역시 중구 대청로137번길 8-2, 문화빌딩 4층 (중앙동3가) | 부산광역시 중구 중앙동3가 13-3
466 | 명궁출판사 | 인쇄사 | 부산광역시 중구 동광길 15, 1층 (동광동4가) | 부산광역시 중구 동광동4가 25-6
467 | 한일기획사 | 인쇄사 | 부산광역시 중구 대영로215번길 4-2, 지하1층 101호 (영주동, 시즌5) | 부산광역시 중구 영주동 60-24 시즌5
468 | 금아디앤피 | 인쇄사 | 부산광역시 중구 동광길 14, 지하1층 (동광동4가) | 부산광역시 중구 동광동4가 18-4
469 | 한성특수인쇄 | 인쇄사 | 부산광역시 중구 흑교로81번길 16 (보수동3가) | 부산광역시 중구 보수동3가 1-10
470 | 모던광고기획 | 인쇄사 | 부산광역시 중구 대청로135번길 16, 1층 (중앙동4가) | 부산광역시 중구 중앙동4가 37-35
471 | (주)해양문화사 | 인쇄사 | 부산광역시 중구 해관로 63-1, 403호 (중앙동4가) | 부산광역시 중구 중앙동4가 36-9
472 | 선애드 | 인쇄사 | 부산광역시 중구 보수대로 128, 2/지하층 (보수동2가) | 부산광역시 중구 보수동2가 5-3