Overview

Dataset statistics

Number of variables: 4
Number of observations: 196
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.4 KiB
Average record size in memory: 33.7 B

Variable types

Numeric: 1
Text: 2
Categorical: 1

Dataset

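Description: Suyeong-gu, Busan Metropolitan City, publishing and printing business registration status, 2023-06-16 (부산광역시수영구_출판인쇄업등록현황_20230616)
Author: Suyeong-gu, Busan Metropolitan City (부산광역시 수영구)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=3042069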

Alerts

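연번 (serial number) is highly overall correlated with 구분 (business type) [High correlation]
구분 (business type) is highly overall correlated with 연번 (serial number) [High correlation]
구분 (business type) is highly imbalanced (55.7%) [Imbalance]
연번 (serial number) has unique values [Unique]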
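
The 55.7% imbalance figure for 구분 is consistent with a normalized-entropy score: one minus the Shannon entropy of the class shares, divided by the maximum entropy possible for that number of classes. The report does not state the formula, so the definition below is an assumption, and `imbalance_score` is a hypothetical helper name. The class counts (178 and 18) are the ones the report lists for 구분.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max for a list of class counts.

    H is the Shannon entropy (base 2) of the class shares and H_max is
    log2(number of classes). 0.0 means perfectly balanced classes and
    values near 1.0 mean one class dominates. This definition is an
    assumption; it is not taken from the report itself.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # convention chosen here for a single-class column
    shares = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in shares)
    return 1.0 - entropy / math.log2(n_classes)


# Class counts for 구분 as listed in the report: 출판사 = 178, 인쇄사 = 18.
score = imbalance_score([178, 18])
print(f"{score:.1%}")  # prints 55.7%
```

Under this definition a 50/50 split scores 0%, so 55.7% says the two classes are far from evenly represented, which matches the 90.8% / 9.2% split shown for 구분.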

Reproduction

Analysis started: 2023-12-10 16:36:32.710012
Analysis finished: 2023-12-10 16:36:34.569303
Duration: 1.86 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 196
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 98.5
Minimum: 1
Maximum: 196
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 10.75
Q1: 49.75
Median: 98.5
Q3: 147.25
95-th percentile: 186.25
Maximum: 196
Range: 195
Interquartile range (IQR): 97.5

Descriptive statistics

Standard deviation: 56.72448
Coefficient of variation (CV): 0.57588305
Kurtosis: -1.2
Mean: 98.5
Median Absolute Deviation (MAD): 49
Skewness: 0
Sum: 19306
Variance: 3217.6667
Monotonicity: Strictly increasing
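
The quantile and descriptive statistics above are what one would expect if 연번 is simply the integer sequence 1 to 196. That is an assumption, though the reported minimum, maximum, distinct count, sum, and strict monotonicity all agree with it. The sketch below recomputes the figures with the Python standard library; `percentile` is a hypothetical helper that uses linear interpolation between neighbouring values, which may or may not be the exact method the profiling tool applies.

```python
import math
import statistics


def percentile(sorted_values, q):
    """Linear-interpolation percentile for q in [0, 1] over sorted data."""
    position = (len(sorted_values) - 1) * q
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


# Assumption: the serial number column is exactly 1, 2, ..., 196.
values = list(range(1, 197))

median = statistics.median(values)
summary = {
    "sum": sum(values),
    "mean": statistics.mean(values),
    "median": median,
    "variance": statistics.variance(values),  # sample variance (n - 1)
    "std": statistics.stdev(values),          # sample standard deviation
    "p05": percentile(values, 0.05),
    "q1": percentile(values, 0.25),
    "q3": percentile(values, 0.75),
    "p95": percentile(values, 0.95),
    # Median absolute deviation: median of |x - median|.
    "mad": statistics.median(abs(v - median) for v in values),
}
summary["cv"] = summary["std"] / summary["mean"]
summary["iqr"] = summary["q3"] - summary["q1"]

for name, value in summary.items():
    print(f"{name}: {value}")
```

For a consecutive sequence the sample variance reduces to n(n + 1) / 12, which for n = 196 gives 3217.6667, the value shown in the report.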
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
1 | 1 | 0.5%
125 | 1 | 0.5%
127 | 1 | 0.5%
128 | 1 | 0.5%
129 | 1 | 0.5%
130 | 1 | 0.5%
131 | 1 | 0.5%
132 | 1 | 0.5%
133 | 1 | 0.5%
134 | 1 | 0.5%
Other values (186) | 186 | 94.9%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 0.5%
2 | 1 | 0.5%
3 | 1 | 0.5%
4 | 1 | 0.5%
5 | 1 | 0.5%
6 | 1 | 0.5%
7 | 1 | 0.5%
8 | 1 | 0.5%
9 | 1 | 0.5%
10 | 1 | 0.5%

Maximum 10 values

Value | Count | Frequency (%)
196 | 1 | 0.5%
195 | 1 | 0.5%
194 | 1 | 0.5%
193 | 1 | 0.5%
192 | 1 | 0.5%
191 | 1 | 0.5%
190 | 1 | 0.5%
189 | 1 | 0.5%
188 | 1 | 0.5%
187 | 1 | 0.5%

사업체명칭 (business name)
Text

Distinct: 186
Distinct (%): 94.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Length

Max length: 28
Median length: 18
Mean length: 6.9642857
Min length: 1
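
The length figures count characters per value. As a rough illustration, the sketch below computes the same four summaries over the five sample business names shown in this report, then cross-checks the reported mean length against the Total characters figure reported for this variable (1365) divided by the number of observations (196). `length_summary` is a hypothetical helper, and the sketch assumes the strings are stored as precomposed Hangul syllables so that `len` counts one unit per syllable.

```python
import statistics


def length_summary(strings):
    """Return max, median, mean, and min character length of the strings."""
    lengths = [len(s) for s in strings]
    return {
        "max": max(lengths),
        "median": statistics.median(lengths),
        "mean": statistics.mean(lengths),
        "min": min(lengths),
    }


# The five sample values listed for this variable in the report.
sample_names = ["도서출판공신문화사", "삼우기획", "색동이교육", "경제기획", "브레인칠드런"]
print(length_summary(sample_names))

# Cross-check of the reported mean length using the report's own totals.
mean_from_totals = 1365 / 196
print(round(mean_from_totals, 7))
```

The sample-based summary only describes those five rows; the cross-check against the totals is the part that relates to the full column.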

Characters and Unicode

Total characters: 1365
Distinct characters: 324
Distinct categories: 10
Distinct scripts: 4
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
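
As a minimal sketch of how such character properties can be read, the standard-library `unicodedata` module returns the general category code of each code point. The mapping from two-letter codes to the longer names used in this report (for example `Lo` to "Other Letter") is written out by hand here and covers only the categories that appear in the report; `category_counts` is a hypothetical helper. The two input strings are business names taken from the sample rows.

```python
import unicodedata
from collections import Counter

# Hand-written mapping from Unicode general category codes to the
# category names used in the report (only the ones that appear there).
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Pe": "Close Punctuation",
    "Ps": "Open Punctuation",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Nd": "Decimal Number",
    "Nl": "Letter Number",
    "Sm": "Math Symbol",
}


def category_counts(text):
    """Count characters in text by Unicode general category name."""
    counts = Counter()
    for ch in text:
        code = unicodedata.category(ch)
        counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


print(category_counts("(주)브레인스톰"))
print(category_counts("애드-업"))
```

Hangul syllables fall under "Other Letter" because the script has no upper or lower case, which is why that category dominates the counts for this variable.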

Unique

Unique: 177
Unique (%): 90.3%

Sample

1st row: 도서출판공신문화사
2nd row: 삼우기획
3rd row: 색동이교육
4th row: 경제기획
5th row: 브레인칠드런
ValueCountFrequency (%)
도서출판 13
 
4.8%
주식회사 7
 
2.6%
주)아테크 3
 
1.1%
출판사 3
 
1.1%
종합인쇄선명사 2
 
0.7%
푸조와곰솔 2
 
0.7%
2
 
0.7%
로터스 2
 
0.7%
우섬 2
 
0.7%
연구소 2
 
0.7%
Other values (223) 231
85.9%

Most occurring characters

ValueCountFrequency (%)
73
 
5.3%
) 39
 
2.9%
( 38
 
2.8%
33
 
2.4%
33
 
2.4%
32
 
2.3%
31
 
2.3%
29
 
2.1%
26
 
1.9%
22
 
1.6%
Other values (314) 1009
73.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 967 | 70.8%
Lowercase Letter | 150 | 11.0%
Uppercase Letter | 88 | 6.4%
Space Separator | 73 | 5.3%
Close Punctuation | 39 | 2.9%
Open Punctuation | 38 | 2.8%
Other Punctuation | 4 | 0.3%
Decimal Number | 4 | 0.3%
Math Symbol | 1 | 0.1%
Dash Punctuation | 1 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
33
 
3.4%
33
 
3.4%
32
 
3.3%
31
 
3.2%
29
 
3.0%
26
 
2.7%
22
 
2.3%
21
 
2.2%
20
 
2.1%
17
 
1.8%
Other values (257) 703
72.7%
Uppercase Letter
ValueCountFrequency (%)
I 9
 
10.2%
S 8
 
9.1%
E 6
 
6.8%
A 6
 
6.8%
O 6
 
6.8%
C 6
 
6.8%
T 6
 
6.8%
P 5
 
5.7%
B 5
 
5.7%
M 4
 
4.5%
Other values (14) 27
30.7%
Lowercase Letter
ValueCountFrequency (%)
e 17
11.3%
o 15
 
10.0%
n 15
 
10.0%
m 10
 
6.7%
u 10
 
6.7%
t 9
 
6.0%
a 9
 
6.0%
i 9
 
6.0%
r 9
 
6.0%
s 8
 
5.3%
Other values (13) 39
26.0%
Decimal Number
ValueCountFrequency (%)
2 2
50.0%
4 1
25.0%
1 1
25.0%
Other Punctuation
ValueCountFrequency (%)
& 3
75.0%
. 1
 
25.0%
Space Separator
ValueCountFrequency (%)
73
100.0%
Close Punctuation
ValueCountFrequency (%)
) 39
100.0%
Open Punctuation
ValueCountFrequency (%)
( 38
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 957 | 70.1%
Latin | 238 | 17.4%
Common | 160 | 11.7%
Han | 10 | 0.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
33
 
3.4%
33
 
3.4%
32
 
3.3%
31
 
3.2%
29
 
3.0%
26
 
2.7%
22
 
2.3%
21
 
2.2%
20
 
2.1%
17
 
1.8%
Other values (247) 693
72.4%
Latin
ValueCountFrequency (%)
e 17
 
7.1%
o 15
 
6.3%
n 15
 
6.3%
m 10
 
4.2%
u 10
 
4.2%
t 9
 
3.8%
I 9
 
3.8%
a 9
 
3.8%
i 9
 
3.8%
r 9
 
3.8%
Other values (37) 126
52.9%
Common
ValueCountFrequency (%)
73
45.6%
) 39
24.4%
( 38
23.8%
& 3
 
1.9%
2 2
 
1.2%
4 1
 
0.6%
1 1
 
0.6%
+ 1
 
0.6%
- 1
 
0.6%
. 1
 
0.6%
Han
ValueCountFrequency (%)
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 957 | 70.1%
ASCII | 398 | 29.2%
CJK | 10 | 0.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
73
18.3%
) 39
 
9.8%
( 38
 
9.5%
e 17
 
4.3%
o 15
 
3.8%
n 15
 
3.8%
m 10
 
2.5%
u 10
 
2.5%
t 9
 
2.3%
I 9
 
2.3%
Other values (47) 163
41.0%
Hangul
ValueCountFrequency (%)
33
 
3.4%
33
 
3.4%
32
 
3.3%
31
 
3.2%
29
 
3.0%
26
 
2.7%
22
 
2.3%
21
 
2.2%
20
 
2.1%
17
 
1.8%
Other values (247) 693
72.4%
CJK
ValueCountFrequency (%)
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%

구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB
출판사 (publisher): 178
인쇄사 (printer): 18

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 출판사
2nd row: 출판사
3rd row: 출판사
4th row: 출판사
5th row: 출판사

Common Values

Value | Count | Frequency (%)
출판사 | 178 | 90.8%
인쇄사 | 18 | 9.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
출판사 | 178 | 90.8%
인쇄사 | 18 | 9.2%

사업체소재지(도로명) (business location, road-name address)
Text

Distinct: 186
Distinct (%): 94.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Length

Max length: 55
Median length: 46
Mean length: 34.397959
Min length: 23

Characters and Unicode

Total characters: 6742
Distinct characters: 207
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
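
One of those properties is the block a code point belongs to. The standard library has no block lookup, so the sketch below classifies characters by code-point range. The ranges are the Unicode ranges for Basic Latin, Number Forms, CJK Unified Ideographs, and Hangul Syllables; treating the report's short labels ("ASCII", "CJK", "Hangul") as those blocks is an assumption, and `block_counts` is a hypothetical helper. The input is the second sample address from this report.

```python
from collections import Counter

# (label used in the report, first code point, last code point).
# Pairing each short label with a specific Unicode block is an assumption.
BLOCKS = [
    ("ASCII", 0x0000, 0x007F),         # Basic Latin
    ("Number Forms", 0x2150, 0x218F),
    ("CJK", 0x4E00, 0x9FFF),           # CJK Unified Ideographs
    ("Hangul", 0xAC00, 0xD7AF),        # Hangul Syllables
]


def block_of(ch):
    """Return the block label for a character, or 'Other' if not listed."""
    code_point = ord(ch)
    for label, first, last in BLOCKS:
        if first <= code_point <= last:
            return label
    return "Other"


def block_counts(text):
    """Count the characters of text per block label."""
    return Counter(block_of(ch) for ch in text)


address = "부산광역시 수영구 남천동로 36 (남천동)"
print(len(address), block_counts(address))
```

Spaces, digits, and parentheses all land in the ASCII range, which is why that block takes a large share of the address column even though the text itself is Korean.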

Unique

Unique: 177
Unique (%): 90.3%

Sample

1st row: 부산광역시 수영구 연수로391번길 5 (수영동)
2nd row: 부산광역시 수영구 남천동로 36 (남천동)
3rd row: 부산광역시 수영구 연수로249번길 3 (망미동)
4th row: 부산광역시 수영구 수미로26번길 30 (수영동)
5th row: 부산광역시 수영구 남천동로 30 (남천동,명성학원 4층)
Value | Count | Frequency (%)
부산광역시 | 196 | 15.5%
수영구 | 196 | 15.5%
광안동 | 63 | 5.0%
남천동 | 41 | 3.3%
수영로 | 33 | 2.6%
망미동 | 28 | 2.2%
민락동 | 24 | 1.9%
수영동 | 20 | 1.6%
2층 | 15 | 1.2%
광남로 | 15 | 1.2%
Other values (407) | 630 | 50.0%

Most occurring characters

ValueCountFrequency (%)
1108
 
16.4%
321
 
4.8%
297
 
4.4%
288
 
4.3%
256
 
3.8%
1 222
 
3.3%
, 209
 
3.1%
207
 
3.1%
203
 
3.0%
201
 
3.0%
Other values (197) 3430
50.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3838 | 56.9%
Decimal Number | 1139 | 16.9%
Space Separator | 1108 | 16.4%
Other Punctuation | 209 | 3.1%
Close Punctuation | 196 | 2.9%
Open Punctuation | 196 | 2.9%
Dash Punctuation | 43 | 0.6%
Uppercase Letter | 10 | 0.1%
Letter Number | 2 | < 0.1%
Lowercase Letter | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
321
 
8.4%
297
 
7.7%
288
 
7.5%
256
 
6.7%
207
 
5.4%
203
 
5.3%
201
 
5.2%
199
 
5.2%
197
 
5.1%
196
 
5.1%
Other values (172) 1473
38.4%
Decimal Number
ValueCountFrequency (%)
1 222
19.5%
0 167
14.7%
2 149
13.1%
3 120
10.5%
6 113
9.9%
5 91
8.0%
4 87
 
7.6%
8 69
 
6.1%
7 68
 
6.0%
9 53
 
4.7%
Uppercase Letter
ValueCountFrequency (%)
B 3
30.0%
A 1
 
10.0%
W 1
 
10.0%
E 1
 
10.0%
I 1
 
10.0%
V 1
 
10.0%
K 1
 
10.0%
S 1
 
10.0%
Space Separator
ValueCountFrequency (%)
1108
100.0%
Other Punctuation
ValueCountFrequency (%)
, 209
100.0%
Close Punctuation
ValueCountFrequency (%)
) 196
100.0%
Open Punctuation
ValueCountFrequency (%)
( 196
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 43
100.0%
Letter Number
ValueCountFrequency (%)
2
100.0%
Lowercase Letter
ValueCountFrequency (%)
e 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3838 | 56.9%
Common | 2891 | 42.9%
Latin | 13 | 0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
321
 
8.4%
297
 
7.7%
288
 
7.5%
256
 
6.7%
207
 
5.4%
203
 
5.3%
201
 
5.2%
199
 
5.2%
197
 
5.1%
196
 
5.1%
Other values (172) 1473
38.4%
Common
ValueCountFrequency (%)
1108
38.3%
1 222
 
7.7%
, 209
 
7.2%
) 196
 
6.8%
( 196
 
6.8%
0 167
 
5.8%
2 149
 
5.2%
3 120
 
4.2%
6 113
 
3.9%
5 91
 
3.1%
Other values (5) 320
 
11.1%
Latin
ValueCountFrequency (%)
B 3
23.1%
2
15.4%
e 1
 
7.7%
A 1
 
7.7%
W 1
 
7.7%
E 1
 
7.7%
I 1
 
7.7%
V 1
 
7.7%
K 1
 
7.7%
S 1
 
7.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3838 | 56.9%
ASCII | 2902 | 43.0%
Number Forms | 2 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1108
38.2%
1 222
 
7.6%
, 209
 
7.2%
) 196
 
6.8%
( 196
 
6.8%
0 167
 
5.8%
2 149
 
5.1%
3 120
 
4.1%
6 113
 
3.9%
5 91
 
3.1%
Other values (14) 331
 
11.4%
Hangul
ValueCountFrequency (%)
321
 
8.4%
297
 
7.7%
288
 
7.5%
256
 
6.7%
207
 
5.4%
203
 
5.3%
201
 
5.2%
199
 
5.2%
197
 
5.1%
196
 
5.1%
Other values (172) 1473
38.4%
Number Forms
ValueCountFrequency (%)
2
100.0%

Interactions


Correlations

 | 연번 | 구분
연번 | 1.000 | 0.996
구분 | 0.996 | 1.000

 | 연번 | 구분
연번 | 1.000 | 0.921
구분 | 0.921 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

(index) | 연번 | 사업체명칭 | 구분 | 사업체소재지(도로명)
0 | 1 | 도서출판공신문화사 | 출판사 | 부산광역시 수영구 연수로391번길 5 (수영동)
1 | 2 | 삼우기획 | 출판사 | 부산광역시 수영구 남천동로 36 (남천동)
2 | 3 | 색동이교육 | 출판사 | 부산광역시 수영구 연수로249번길 3 (망미동)
3 | 4 | 경제기획 | 출판사 | 부산광역시 수영구 수미로26번길 30 (수영동)
4 | 5 | 브레인칠드런 | 출판사 | 부산광역시 수영구 남천동로 30 (남천동,명성학원 4층)
5 | 6 | 참편한세상 | 출판사 | 부산광역시 수영구 망미로7번길 26 (망미동)
6 | 7 | 메가테크 | 출판사 | 부산광역시 수영구 과정로 52-2 (망미동)
7 | 8 | 레오 | 출판사 | 부산광역시 수영구 수영로594번길 90, 2층 (광안동)
8 | 9 | 애드몰 | 출판사 | 부산광역시 수영구 광남로 38 (남천동,태양빌딩 4층)
9 | 10 | 애드-업 | 출판사 | 부산광역시 수영구 수영로 371 (남천동,부광빌딩 5층)

Last rows

(index) | 연번 | 사업체명칭 | 구분 | 사업체소재지(도로명)
186 | 187 | (주)브레인스톰 | 인쇄사 | 부산광역시 수영구 수영로 650 (광안동)
187 | 188 | 종합인쇄선명사 | 인쇄사 | 부산광역시 수영구 수영로 734-1 (광안동)
188 | 189 | 엔에스네트웍스 | 인쇄사 | 부산광역시 수영구 광남로 168, 2층 (민락동)
189 | 190 | (주)호밀밭 | 인쇄사 | 부산광역시 수영구 수영로 668, 1209호 (광안동, 화목오피스텔)
190 | 191 | 성일문화사 | 인쇄사 | 부산광역시 수영구 망미번영로 29 (광안동)
191 | 192 | 종합인쇄선명사 | 인쇄사 | 부산광역시 수영구 과정로15번길 7, 상가 5호 (망미동, 국화)
192 | 193 | 푸조와곰솔 | 인쇄사 | 부산광역시 수영구 수영성로32번길 28 (수영동)
193 | 194 | 도영하우스 | 인쇄사 | 부산광역시 수영구 수영로 690-5, 수영메디세움 402호 (광안동)
194 | 195 | 도영하우스 | 인쇄사 | 부산광역시 수영구 수영로 690-5, 수영메디세움 402호 (광안동)
195 | 196 | 노리정 인터네셔널 프로젝트 | 인쇄사 | 부산광역시 수영구 수영로408번길 21-1 (남천동)
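
The report counts 0 duplicate rows, yet the entries with 연번 194 and 195 above share the same business name, type, and address. They differ only in 연번, which is unique per row, so a whole-row comparison cannot flag them. The sketch below shows one way to look for such repeats by ignoring the serial number. It uses only the last four sample rows from this report, and `find_repeats` is a hypothetical helper name.

```python
from collections import defaultdict

# (연번, 사업체명칭, 구분, 사업체소재지(도로명)) for the last four sample rows.
rows = [
    (193, "푸조와곰솔", "인쇄사", "부산광역시 수영구 수영성로32번길 28 (수영동)"),
    (194, "도영하우스", "인쇄사", "부산광역시 수영구 수영로 690-5, 수영메디세움 402호 (광안동)"),
    (195, "도영하우스", "인쇄사", "부산광역시 수영구 수영로 690-5, 수영메디세움 402호 (광안동)"),
    (196, "노리정 인터네셔널 프로젝트", "인쇄사", "부산광역시 수영구 수영로408번길 21-1 (남천동)"),
]


def find_repeats(records):
    """Group records by every field except the serial number.

    Returns a dict that maps (name, type, address) to the list of serial
    numbers sharing those values, keeping only groups with two or more rows.
    """
    groups = defaultdict(list)
    for serial, name, kind, address in records:
        groups[(name, kind, address)].append(serial)
    return {key: serials for key, serials in groups.items() if len(serials) > 1}


for (name, kind, address), serials in find_repeats(rows).items():
    print(name, kind, serials)
```

Whether such a repeat is a data-entry duplicate or two legitimate registrations cannot be decided from the report alone; the check only surfaces candidates for review.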