Overview

Dataset statistics

Number of variables: 6
Number of observations: 68
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.4 KiB
Average record size in memory: 50.9 B

Variable types

Numeric: 1
Categorical: 1
Text: 3
DateTime: 1

Dataset

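Description: 부산광역시영도구_출판및인쇄업등록현황_20230831 (Yeongdo-gu, Busan Metropolitan City: publishing and printing business registration status, as of 2023-08-31)
Author: 부산광역시 영도구 (Yeongdo-gu, Busan Metropolitan City)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=3069124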

Alerts

데이터기준일자 (data reference date) has constant value "2023-08-31" [Constant]
연번 (serial number) is highly overall correlated with 업종 (business type) [High correlation]
업종 is highly overall correlated with 연번 [High correlation]
업종 is highly imbalanced (56.9%) [Imbalance]
연번 has unique values [Unique]

Reproduction

Analysis started: 2023-12-10 16:15:46.930978
Analysis finished: 2023-12-10 16:15:47.900949
Duration: 0.97 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 68
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 34.5
Minimum: 1
Maximum: 68
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 744.0 B

Quantile statistics

Minimum: 1
5-th percentile: 4.35
Q1: 17.75
Median: 34.5
Q3: 51.25
95-th percentile: 64.65
Maximum: 68
Range: 67
Interquartile range (IQR): 33.5
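
The quantile figures above are consistent with linear interpolation over the integers 1 through 68. The sketch below is a minimal check using only Python's standard library. It assumes that 연번 holds exactly the values 1 to 68, which the minimum, maximum, and distinct count suggest but which the report does not state outright.

```python
from statistics import quantiles

# Assumption: 연번 is the sequence 1..68 (not confirmed by the report itself).
serial = list(range(1, 69))

# method="inclusive" treats the data as containing its own minimum and maximum
# and interpolates linearly between neighbouring values.
q1, q2, q3 = quantiles(serial, n=4, method="inclusive")
ventiles = quantiles(serial, n=20, method="inclusive")
p5, p95 = ventiles[0], ventiles[-1]

value_range = max(serial) - min(serial)
iqr = q3 - q1

print("Q1, median, Q3:", q1, q2, q3)
print("5th and 95th percentile:", p5, p95)
print("Range and IQR:", value_range, iqr)
```

Under that assumption the printed values should match the table: 17.75, 34.5, 51.25 for the quartiles, 4.35 and 64.65 for the tail percentiles, and 67 and 33.5 for the range and IQR.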

Descriptive statistics

Standard deviation: 19.77372
Coefficient of variation (CV): 0.5731513
Kurtosis: -1.2
Mean: 34.5
Median Absolute Deviation (MAD): 17
Skewness: 0
Sum: 2346
Variance: 391
Monotonicity: Strictly increasing
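
Most of the descriptive statistics can be reproduced by hand for a plain running index. The following is a minimal sketch, again assuming 연번 is the sequence 1 to 68. It uses the sample variance (n - 1 denominator), which is the convention that yields the reported value of 391. Kurtosis and skewness are left out because the standard library has no function for them.

```python
from statistics import mean, median, stdev, variance

# Assumption: 연번 is the sequence 1..68.
serial = list(range(1, 69))

avg = mean(serial)
var = variance(serial)   # sample variance, n - 1 in the denominator
std = stdev(serial)      # square root of the sample variance
cv = std / avg           # coefficient of variation

# Median absolute deviation: median distance from the median.
center = median(serial)
mad = median(abs(x - center) for x in serial)

total = sum(serial)
strictly_increasing = all(a < b for a, b in zip(serial, serial[1:]))

print(avg, var, round(std, 5), round(cv, 7), mad, total, strictly_increasing)
```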
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
1 | 1 | 1.5%
45 | 1 | 1.5%
51 | 1 | 1.5%
50 | 1 | 1.5%
49 | 1 | 1.5%
48 | 1 | 1.5%
47 | 1 | 1.5%
46 | 1 | 1.5%
44 | 1 | 1.5%
36 | 1 | 1.5%
Other values (58) | 58 | 85.3%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 1.5%
2 | 1 | 1.5%
3 | 1 | 1.5%
4 | 1 | 1.5%
5 | 1 | 1.5%
6 | 1 | 1.5%
7 | 1 | 1.5%
8 | 1 | 1.5%
9 | 1 | 1.5%
10 | 1 | 1.5%

Maximum 10 values
Value | Count | Frequency (%)
68 | 1 | 1.5%
67 | 1 | 1.5%
66 | 1 | 1.5%
65 | 1 | 1.5%
64 | 1 | 1.5%
63 | 1 | 1.5%
62 | 1 | 1.5%
61 | 1 | 1.5%
60 | 1 | 1.5%
59 | 1 | 1.5%

업종
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Memory size: 676.0 B
출판사 (publisher): 62
인쇄사 (printer): 6

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row 출판사
2nd row 출판사
3rd row 출판사
4th row 출판사
5th row 출판사

Common Values

Value | Count | Frequency (%)
출판사 | 62 | 91.2%
인쇄사 | 6 | 8.8%
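
The frequencies above follow directly from the counts, and the 56.9% imbalance figure in the alerts is consistent with a normalized-entropy score. The sketch below assumes the score is 1 minus the Shannon entropy of the class shares divided by the log of the number of classes. The report does not document that formula, so treat it as an assumption that happens to reproduce the number.

```python
from math import log2

# Value counts reported for 업종.
counts = {"출판사": 62, "인쇄사": 6}
total = sum(counts.values())

# Share of each class, as shown in the Frequency (%) column.
shares = {label: n / total for label, n in counts.items()}

# Shannon entropy in bits.
entropy = -sum(p * log2(p) for p in shares.values())

# Assumed imbalance score: 0 means perfectly balanced, 1 means a single class.
imbalance = 1 - entropy / log2(len(counts))

for label, p in shares.items():
    print(f"{label}: {p:.1%}")
print(f"imbalance: {imbalance:.1%}")
```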

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot: value counts of 업종 (출판사 62, 91.2%; 인쇄사 6, 8.8%)]

사업체명칭
Text

Distinct: 65
Distinct (%): 95.6%
Missing: 0
Missing (%): 0.0%
Memory size: 676.0 B

Length

Max length: 17
Median length: 13
Mean length: 9.2205882
Min length: 4

Characters and Unicode

Total characters: 627
Distinct characters: 197
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
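
As a rough illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module for three business names that appear in this report's sample rows. It is not the profiler's own code. The standard library exposes category codes only (`Lo` for Other Letter, `Lu` for Uppercase Letter, `Po` for Other Punctuation, `Ps` and `Pe` for Open and Close Punctuation), so script and block counts would need a separate data source.

```python
import unicodedata
from collections import Counter

# Three business names taken from the report's sample rows.
names = ["국제해양문제연구소", "CL&D", "엠.이.시(MEC)"]


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category code (e.g. 'Lo', 'Lu')."""
    return Counter(unicodedata.category(ch) for ch in text)


totals = Counter()
for name in names:
    totals.update(category_counts(name))

for code, n in totals.most_common():
    print(code, n)
```

Hangul syllables fall under `Lo`, which is why Other Letter dominates the category table for this column.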

Unique

Unique: 62
Unique (%): 91.2%
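
Distinct and Unique measure different things: Distinct counts how many different values occur, while Unique counts values that occur exactly once. With 68 rows, 65 distinct values, and 62 unique values, the remaining 3 values must account for 6 rows, so each appears twice. The sketch below shows the distinction on a hypothetical column built to have the same shape; the names in it are placeholders, not data from the report.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique


# Hypothetical column: 62 one-off names plus 3 names that each appear twice.
column = [f"name_{i}" for i in range(62)] + ["a", "a", "b", "b", "c", "c"]

distinct, unique = distinct_and_unique(column)
print(f"Distinct: {distinct} ({distinct / len(column):.1%})")
print(f"Unique: {unique} ({unique / len(column):.1%})")
```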

Sample

1st row 국제해양문제연구소
2nd row 고신대학교출판부
3rd row 한국해양대학교출판부
4th row 창경사
5th row CL&D
Value | Count | Frequency (%)
도서출판 | 4 | 4.8%
주)애드원플러스 | 2 | 2.4%
동원문화사i | 2 | 2.4%
주)이도시스템 | 2 | 2.4%
기초학문집중력영어연구학회 | 1 | 1.2%
영도 | 1 | 1.2%
책의 | 1 | 1.2%
책읽는저녁 | 1 | 1.2%
범강 | 1 | 1.2%
고래섬 | 1 | 1.2%
Other values (68) | 68 | 81.0%

Most occurring characters

ValueCountFrequency (%)
152
24.2%
14
 
2.2%
13
 
2.1%
( 12
 
1.9%
12
 
1.9%
12
 
1.9%
) 12
 
1.9%
11
 
1.8%
10
 
1.6%
10
 
1.6%
Other values (187) 369
58.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 394 | 62.8%
Space Separator | 152 | 24.2%
Uppercase Letter | 30 | 4.8%
Lowercase Letter | 23 | 3.7%
Open Punctuation | 12 | 1.9%
Close Punctuation | 12 | 1.9%
Other Punctuation | 3 | 0.5%
Dash Punctuation | 1 | 0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
14
 
3.6%
13
 
3.3%
12
 
3.0%
12
 
3.0%
11
 
2.8%
10
 
2.5%
10
 
2.5%
10
 
2.5%
9
 
2.3%
8
 
2.0%
Other values (150) 285
72.3%
Lowercase Letter
ValueCountFrequency (%)
e 3
13.0%
i 3
13.0%
d 2
 
8.7%
r 2
 
8.7%
o 2
 
8.7%
u 1
 
4.3%
q 1
 
4.3%
w 1
 
4.3%
s 1
 
4.3%
n 1
 
4.3%
Other values (6) 6
26.1%
Uppercase Letter
ValueCountFrequency (%)
C 4
13.3%
M 4
13.3%
E 3
10.0%
A 3
10.0%
B 3
10.0%
T 2
 
6.7%
O 2
 
6.7%
L 2
 
6.7%
P 1
 
3.3%
H 1
 
3.3%
Other values (5) 5
16.7%
Other Punctuation
ValueCountFrequency (%)
. 2
66.7%
& 1
33.3%
Space Separator
ValueCountFrequency (%)
152
100.0%
Open Punctuation
ValueCountFrequency (%)
( 12
100.0%
Close Punctuation
ValueCountFrequency (%)
) 12
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 394 | 62.8%
Common | 180 | 28.7%
Latin | 53 | 8.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
14
 
3.6%
13
 
3.3%
12
 
3.0%
12
 
3.0%
11
 
2.8%
10
 
2.5%
10
 
2.5%
10
 
2.5%
9
 
2.3%
8
 
2.0%
Other values (150) 285
72.3%
Latin
ValueCountFrequency (%)
C 4
 
7.5%
M 4
 
7.5%
E 3
 
5.7%
e 3
 
5.7%
A 3
 
5.7%
i 3
 
5.7%
B 3
 
5.7%
T 2
 
3.8%
d 2
 
3.8%
r 2
 
3.8%
Other values (21) 24
45.3%
Common
ValueCountFrequency (%)
152
84.4%
( 12
 
6.7%
) 12
 
6.7%
. 2
 
1.1%
& 1
 
0.6%
- 1
 
0.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 394 | 62.8%
ASCII | 233 | 37.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
152
65.2%
( 12
 
5.2%
) 12
 
5.2%
C 4
 
1.7%
M 4
 
1.7%
E 3
 
1.3%
e 3
 
1.3%
A 3
 
1.3%
i 3
 
1.3%
B 3
 
1.3%
Other values (27) 34
 
14.6%
Hangul
ValueCountFrequency (%)
14
 
3.6%
13
 
3.3%
12
 
3.0%
12
 
3.0%
11
 
2.8%
10
 
2.5%
10
 
2.5%
10
 
2.5%
9
 
2.3%
8
 
2.0%
Other values (150) 285
72.3%

사업체소재지(도로명)
Text

Distinct: 50
Distinct (%): 73.5%
Missing: 0
Missing (%): 0.0%
Memory size: 676.0 B

Length

Max length: 31
Median length: 30
Mean length: 26.029412
Min length: 23

Characters and Unicode

Total characters: 1770
Distinct characters: 62
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 37
Unique (%): 54.4%

Sample

1st row 부산광역시 영도구 태종로 727 (동삼동)
2nd row 부산광역시 영도구 와치로 194 (동삼동)
3rd row 부산광역시 영도구 태종로 727 (동삼동)
4th row 부산광역시 영도구 태종로 594 (동삼동)
5th row 부산광역시 영도구 와치로 194 (동삼동)
Value | Count | Frequency (%)
부산광역시 | 68 | 20.0%
영도구 | 68 | 20.0%
동삼동 | 41 | 12.1%
태종로 | 17 | 5.0%
청학동 | 10 | 2.9%
동삼서로 | 6 | 1.8%
조내기로 | 6 | 1.8%
33 | 5 | 1.5%
봉래동2가 | 5 | 1.5%
52 | 4 | 1.2%
Other values (77) | 110 | 32.4%
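
The word counts above look like the result of splitting each address on whitespace and dropping the surrounding parentheses: 부산광역시 appears 68 times at 20.0%, which implies 340 tokens in total, or five per address. The sketch below applies that tokenization to the five sample rows listed for this column. The splitting rule is an assumption inferred from the table, not the profiler's documented behaviour.

```python
from collections import Counter

# The five sample rows shown for 사업체소재지(도로명).
addresses = [
    "부산광역시 영도구 태종로 727 (동삼동)",
    "부산광역시 영도구 와치로 194 (동삼동)",
    "부산광역시 영도구 태종로 727 (동삼동)",
    "부산광역시 영도구 태종로 594 (동삼동)",
    "부산광역시 영도구 와치로 194 (동삼동)",
]


def tokenize(text: str) -> list[str]:
    # Assumed rule: split on whitespace, then strip leading/trailing parentheses.
    return [token.strip("()") for token in text.split()]


word_counts = Counter(token for addr in addresses for token in tokenize(addr))
total_tokens = sum(word_counts.values())

for word, n in word_counts.most_common():
    print(f"{word} | {n} | {n / total_tokens:.1%}")
```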

Most occurring characters

ValueCountFrequency (%)
411
23.2%
120
 
6.8%
78
 
4.4%
( 68
 
3.8%
68
 
3.8%
68
 
3.8%
68
 
3.8%
68
 
3.8%
68
 
3.8%
68
 
3.8%
Other values (52) 685
38.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1015 | 57.3%
Space Separator | 411 | 23.2%
Decimal Number | 203 | 11.5%
Open Punctuation | 68 | 3.8%
Close Punctuation | 68 | 3.8%
Dash Punctuation | 5 | 0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
120
11.8%
78
 
7.7%
68
 
6.7%
68
 
6.7%
68
 
6.7%
68
 
6.7%
68
 
6.7%
68
 
6.7%
68
 
6.7%
63
 
6.2%
Other values (38) 278
27.4%
Decimal Number
ValueCountFrequency (%)
1 35
17.2%
2 30
14.8%
3 26
12.8%
6 22
10.8%
5 21
10.3%
7 20
9.9%
4 20
9.9%
0 12
 
5.9%
9 11
 
5.4%
8 6
 
3.0%
Space Separator
ValueCountFrequency (%)
411
100.0%
Open Punctuation
ValueCountFrequency (%)
( 68
100.0%
Close Punctuation
ValueCountFrequency (%)
) 68
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1015 | 57.3%
Common | 755 | 42.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
120
11.8%
78
 
7.7%
68
 
6.7%
68
 
6.7%
68
 
6.7%
68
 
6.7%
68
 
6.7%
68
 
6.7%
68
 
6.7%
63
 
6.2%
Other values (38) 278
27.4%
Common
ValueCountFrequency (%)
411
54.4%
( 68
 
9.0%
) 68
 
9.0%
1 35
 
4.6%
2 30
 
4.0%
3 26
 
3.4%
6 22
 
2.9%
5 21
 
2.8%
7 20
 
2.6%
4 20
 
2.6%
Other values (4) 34
 
4.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1015 | 57.3%
ASCII | 755 | 42.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
411
54.4%
( 68
 
9.0%
) 68
 
9.0%
1 35
 
4.6%
2 30
 
4.0%
3 26
 
3.4%
6 22
 
2.9%
5 21
 
2.8%
7 20
 
2.6%
4 20
 
2.6%
Other values (4) 34
 
4.5%
Hangul
ValueCountFrequency (%)
120
11.8%
78
 
7.7%
68
 
6.7%
68
 
6.7%
68
 
6.7%
68
 
6.7%
68
 
6.7%
68
 
6.7%
68
 
6.7%
63
 
6.2%
Other values (38) 278
27.4%

대표자
Text

Distinct: 61
Distinct (%): 89.7%
Missing: 0
Missing (%): 0.0%
Memory size: 676.0 B

Length

Max length: 10
Median length: 3
Mean length: 3.0882353
Min length: 2

Characters and Unicode

Total characters: 210
Distinct characters: 94
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 55
Unique (%): 80.9%

Sample

1st row정문수
2nd row이병수
3rd row박한일
4th row김형주
5th row조상래
Value | Count | Frequency (%)
신경규 | 3 | 4.3%
최정수 | 2 | 2.9%
황병률 | 2 | 2.9%
옥영주 | 2 | 2.9%
한창호 | 2 | 2.9%
최석 | 2 | 2.9%
정문수 | 1 | 1.4%
심호섭 | 1 | 1.4%
조현구 | 1 | 1.4%
김남영 | 1 | 1.4%
Other values (53) | 53 | 75.7%

Most occurring characters

ValueCountFrequency (%)
12
 
5.7%
11
 
5.2%
9
 
4.3%
8
 
3.8%
7
 
3.3%
6
 
2.9%
6
 
2.9%
5
 
2.4%
5
 
2.4%
4
 
1.9%
Other values (84) 137
65.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 205 | 97.6%
Space Separator | 2 | 1.0%
Open Punctuation | 1 | 0.5%
Decimal Number | 1 | 0.5%
Close Punctuation | 1 | 0.5%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
12
 
5.9%
11
 
5.4%
9
 
4.4%
8
 
3.9%
7
 
3.4%
6
 
2.9%
6
 
2.9%
5
 
2.4%
5
 
2.4%
4
 
2.0%
Other values (80) 132
64.4%
Space Separator
ValueCountFrequency (%)
2
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Decimal Number
ValueCountFrequency (%)
1 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 205 | 97.6%
Common | 5 | 2.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
12
 
5.9%
11
 
5.4%
9
 
4.4%
8
 
3.9%
7
 
3.4%
6
 
2.9%
6
 
2.9%
5
 
2.4%
5
 
2.4%
4
 
2.0%
Other values (80) 132
64.4%
Common
ValueCountFrequency (%)
2
40.0%
( 1
20.0%
1 1
20.0%
) 1
20.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 205 | 97.6%
ASCII | 5 | 2.4%

Most frequent character per block

Hangul
ValueCountFrequency (%)
12
 
5.9%
11
 
5.4%
9
 
4.4%
8
 
3.9%
7
 
3.4%
6
 
2.9%
6
 
2.9%
5
 
2.4%
5
 
2.4%
4
 
2.0%
Other values (80) 132
64.4%
ASCII
ValueCountFrequency (%)
2
40.0%
( 1
20.0%
1 1
20.0%
) 1
20.0%

데이터기준일자
Date

CONSTANT 

Distinct: 1
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 676.0 B
Minimum: 2023-08-31 00:00:00
Maximum: 2023-08-31 00:00:00
Histogram with fixed size bins (bins=1)

Interactions


Correlations

Variable | 연번 | 업종 | 사업체명칭 | 사업체소재지(도로명) | 대표자
연번 | 1.000 | 0.988 | 0.799 | 0.741 | 0.374
업종 | 0.988 | 1.000 | 0.000 | 0.000 | 0.000
사업체명칭 | 0.799 | 0.000 | 1.000 | 1.000 | 1.000
사업체소재지(도로명) | 0.741 | 0.000 | 1.000 | 1.000 | 1.000
대표자 | 0.374 | 0.000 | 1.000 | 1.000 | 1.000

Variable | 연번 | 업종
연번 | 1.000 | 0.848
업종 | 0.848 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
(index) | 연번 | 업종 | 사업체명칭 | 사업체소재지(도로명) | 대표자 | 데이터기준일자
0 | 1 | 출판사 | 국제해양문제연구소 | 부산광역시 영도구 태종로 727 (동삼동) | 정문수 | 2023-08-31
1 | 2 | 출판사 | 고신대학교출판부 | 부산광역시 영도구 와치로 194 (동삼동) | 이병수 | 2023-08-31
2 | 3 | 출판사 | 한국해양대학교출판부 | 부산광역시 영도구 태종로 727 (동삼동) | 박한일 | 2023-08-31
3 | 4 | 출판사 | 창경사 | 부산광역시 영도구 태종로 594 (동삼동) | 김형주 | 2023-08-31
4 | 5 | 출판사 | CL&D | 부산광역시 영도구 와치로 194 (동삼동) | 조상래 | 2023-08-31
5 | 6 | 출판사 | 엠.이.시(MEC) | 부산광역시 영도구 절영로85번길 1 (남항동2가) | 임현미 | 2023-08-31
6 | 7 | 출판사 | (재)한국조선해양기자재연구원 | 부산광역시 영도구 해양로 435 (동삼동) | 김정렬 | 2023-08-31
7 | 8 | 출판사 | 도서출판 산하 | 부산광역시 영도구 태종로 170-1 (봉래동4가) | 정진우 | 2023-08-31
8 | 9 | 출판사 | 행복을만드는사람들 | 부산광역시 영도구 상리로 41 (동삼동) | 권오용 | 2023-08-31
9 | 10 | 출판사 | 해양 인문 사회 공동체 | 부산광역시 영도구 동삼서로 61 (동삼동) | 안영숙 | 2023-08-31

Last rows
(index) | 연번 | 업종 | 사업체명칭 | 사업체소재지(도로명) | 대표자 | 데이터기준일자
58 | 59 | 출판사 | 프로젝아트 | 부산광역시 영도구 상리로 1 (동삼동) | 이지혜 | 2023-08-31
59 | 60 | 출판사 | Pequod | 부산광역시 영도구 봉래나루로 33 (대교동1가) | 이용국 | 2023-08-31
60 | 61 | 출판사 | (주) 북앤아트 | 부산광역시 영도구 대교로46번길 46 (봉래동2가) | 박지선 | 2023-08-31
61 | 62 | 출판사 | 꽃기리네 | 부산광역시 영도구 하나길 596 (신선동2가) | 이유미 | 2023-08-31
62 | 63 | 인쇄사 | 대교인쇄사 | 부산광역시 영도구 대평로 7-4 (대평동1가) | 이현영 | 2023-08-31
63 | 64 | 인쇄사 | 해양마크사 | 부산광역시 영도구 태종로 709 (동삼동) | 김창균 | 2023-08-31
64 | 65 | 인쇄사 | 성지문화사 | 부산광역시 영도구 절영로13번길 25 (봉래동2가) | 정향자 | 2023-08-31
65 | 66 | 인쇄사 | (주)애드원플러스 | 부산광역시 영도구 동삼서로 52 (동삼동) | 옥영주 | 2023-08-31
66 | 67 | 인쇄사 | 동원문화사i | 부산광역시 영도구 태종로 662 (동삼동) | 황병률 | 2023-08-31
67 | 68 | 인쇄사 | (주)이도시스템 | 부산광역시 영도구 태종로 736 (동삼동) | 한창호 | 2023-08-31