Overview

Dataset statistics

Number of variables: 6
Number of observations: 1747
Missing cells: 11
Missing cells (%): 0.1%
Duplicate rows: 2
Duplicate rows (%): 0.1%
Total size in memory: 82.0 KiB
Average record size in memory: 48.1 B
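
The percentages and the average record size above follow from the raw counts. The sketch below reproduces that arithmetic, assuming the report derives them as simple ratios; the variable names are illustrative and not part of ydata-profiling.

```python
# Raw figures copied from the dataset statistics above.
n_variables = 6
n_observations = 1747
missing_cells = 11
duplicate_rows = 2
total_size_kib = 82.0

# Assumption: each percentage is a plain ratio over all cells or all rows.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: average record size = total bytes / number of rows (1 KiB = 1024 B).
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the rounded results agree with the 0.1%, 0.1% and 48.1 B reported above.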

Variable types

Categorical: 3
Text: 2
DateTime: 1

Dataset

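Description: Data on the current status of publishers and printing shops in Seongnam City (성남시), consisting of fields such as district (구별), neighborhood (동별), business type (업종), business name (상호명), and road-name address (소재지도로명주소).
URL: https://www.data.go.kr/data/15054298/fileData.do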

Alerts

데이터기준일자 has constant value "2023-06-19" (Constant)
Dataset has 2 (0.1%) duplicate rows (Duplicates)
구별 is highly overall correlated with 동별 (High correlation)
동별 is highly overall correlated with 구별 (High correlation)
업종 is highly imbalanced (50.9%) (Imbalance)
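
The 50.9% imbalance figure for 업종 is consistent with a normalized-entropy score, that is, one minus the Shannon entropy of the class frequencies divided by the maximum possible entropy. The sketch below assumes that definition (the report itself does not state the formula) and uses the class counts 1560 and 187 reported for 업종; `imbalance_score` is an illustrative name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): H is the base-2 Shannon entropy of the
    class frequencies and k is the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no balance to measure in this sketch
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts for 업종 as reported in this document: 출판사 1560, 인쇄사 187.
score = imbalance_score([1560, 187])
print(f"{score:.1%}")
```

A perfectly balanced column scores 0 and a column dominated by one class approaches 1, so a value near 0.5 for a two-class column reflects the roughly 89/11 split.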

Reproduction

Analysis started: 2023-12-12 09:15:33.940199
Analysis finished: 2023-12-12 09:15:35.126087
Duration: 1.19 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
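
A report like this one can be regenerated with ydata-profiling. The sketch below is a minimal version under stated assumptions: the source file is a CSV, the file name in the usage line is hypothetical, and the encoding may need to be changed (Korean public-data files are often `cp949`). The report title is also illustrative.

```python
def build_report(csv_path, html_path="report.html", encoding="utf-8"):
    """Profile a CSV file and write the HTML report (sketch)."""
    # Imported lazily so the function can be defined without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: parsing 데이터기준일자 as a date makes it a DateTime variable,
    # as in the variable types listed above.
    df = pd.read_csv(csv_path, encoding=encoding, parse_dates=["데이터기준일자"])
    report = ProfileReport(df, title="Seongnam publishers and printing shops")
    report.to_file(html_path)
    return report
```

Usage, with a hypothetical file name: `build_report("seongnam_publishers.csv", encoding="cp949")`.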

Variables

구별
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 13.8 KiB
분당구: 1187
중원구: 298
수정구: 262

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 수정구
2nd row: 수정구
3rd row: 수정구
4th row: 수정구
5th row: 수정구

Common Values

Value | Count | Frequency (%)
분당구 | 1187 | 67.9%
중원구 | 298 | 17.1%
수정구 | 262 | 15.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

동별
Categorical

HIGH CORRELATION 

Distinct: 41
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Memory size: 13.8 KiB
정자동: 223
삼평동: 143
서현동: 141
야탑동: 141
상대원동: 131
Other values (36): 968

Length

Max length: 4
Median length: 3
Mean length: 3.0875787
Min length: 2

Unique

Unique: 5
Unique (%): 0.3%

Sample

1st row: 고등동
2nd row: 고등동
3rd row: 고등동
4th row: 고등동
5th row: 고등동

Common Values

Value | Count | Frequency (%)
정자동 | 223 | 12.8%
삼평동 | 143 | 8.2%
서현동 | 141 | 8.1%
야탑동 | 141 | 8.1%
상대원동 | 131 | 7.5%
수내동 | 121 | 6.9%
구미동 | 116 | 6.6%
금곡동 | 68 | 3.9%
태평동 | 56 | 3.2%
이매동 | 54 | 3.1%
Other values (31) | 553 | 31.7%

Length

[Figure: Histogram of lengths of the category]

업종
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 13.8 KiB
출판사 (publisher): 1560
인쇄사 (printing shop): 187

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 출판사
2nd row: 출판사
3rd row: 출판사
4th row: 출판사
5th row: 출판사

Common Values

Value | Count | Frequency (%)
출판사 | 1560 | 89.3%
인쇄사 | 187 | 10.7%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

사업장명
Text

Distinct: 1707
Distinct (%): 97.7%
Missing: 0
Missing (%): 0.0%
Memory size: 13.8 KiB

Length

Max length: 34
Median length: 29
Mean length: 7.3514596
Min length: 1

Characters and Unicode

Total characters: 12843
Distinct characters: 735
Distinct categories: 9
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
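
As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not necessarily how the report computes its tables; `category_counts` is an illustrative helper, and the input string is one of this variable's sample values.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count Unicode general categories (e.g. 'Lo', 'Zs', 'Lu') in text."""
    return Counter(unicodedata.category(ch) for ch in text)

# One of this variable's sample values: four Hangul syllables, a space, a Latin capital.
sample = "도서출판 W"
counts = category_counts(sample)
print(counts)
```

The two-letter codes correspond to the category names used in the report: `Lo` is Other Letter, `Zs` is Space Separator, and `Lu` is Uppercase Letter.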

Unique

Unique: 1667
Unique (%): 95.4%

Sample

1st row: 소울하우스
2nd row: 도서출판 W
3rd row: 체스인사이드
4th row: 북앤하우스
5th row: 숙곳
Value | Count | Frequency (%)
주식회사 | 161 | 6.5%
도서출판 | 127 | 5.1%
출판사 | 25 | 1.0%
디자인 | 13 | 0.5%
books | 10 | 0.4%
출판 | 8 | 0.3%
(value not recoverable) | 6 | 0.2%
미디어 | 6 | 0.2%
북스 | 5 | 0.2%
프레스 | 5 | 0.2%
Other values (1960) | 2114 | 85.2%

Most occurring characters

ValueCountFrequency (%)
734
 
5.7%
368
 
2.9%
365
 
2.8%
326
 
2.5%
314
 
2.4%
( 300
 
2.3%
) 300
 
2.3%
235
 
1.8%
234
 
1.8%
195
 
1.5%
Other values (725) 9472
73.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 9581 | 74.6%
Lowercase Letter | 1020 | 7.9%
Uppercase Letter | 792 | 6.2%
Space Separator | 734 | 5.7%
Open Punctuation | 300 | 2.3%
Close Punctuation | 300 | 2.3%
Decimal Number | 57 | 0.4%
Other Punctuation | 52 | 0.4%
Dash Punctuation | 7 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
368
 
3.8%
365
 
3.8%
326
 
3.4%
314
 
3.3%
235
 
2.5%
234
 
2.4%
195
 
2.0%
179
 
1.9%
175
 
1.8%
164
 
1.7%
Other values (653) 7026
73.3%
Lowercase Letter
ValueCountFrequency (%)
o 136
13.3%
e 116
11.4%
a 84
 
8.2%
n 78
 
7.6%
i 72
 
7.1%
s 68
 
6.7%
t 64
 
6.3%
r 59
 
5.8%
l 43
 
4.2%
d 37
 
3.6%
Other values (15) 263
25.8%
Uppercase Letter
ValueCountFrequency (%)
E 66
 
8.3%
A 65
 
8.2%
B 54
 
6.8%
S 53
 
6.7%
T 52
 
6.6%
C 50
 
6.3%
R 43
 
5.4%
O 41
 
5.2%
N 40
 
5.1%
P 39
 
4.9%
Other values (15) 289
36.5%
Decimal Number
ValueCountFrequency (%)
1 16
28.1%
0 9
15.8%
3 9
15.8%
2 8
14.0%
8 4
 
7.0%
7 3
 
5.3%
4 3
 
5.3%
9 2
 
3.5%
5 2
 
3.5%
6 1
 
1.8%
Other Punctuation
ValueCountFrequency (%)
. 24
46.2%
& 15
28.8%
, 6
 
11.5%
' 3
 
5.8%
1
 
1.9%
· 1
 
1.9%
% 1
 
1.9%
! 1
 
1.9%
Space Separator
ValueCountFrequency (%)
734
100.0%
Open Punctuation
ValueCountFrequency (%)
( 300
100.0%
Close Punctuation
ValueCountFrequency (%)
) 300
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 7
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 9548 | 74.3%
Latin | 1812 | 14.1%
Common | 1450 | 11.3%
Han | 33 | 0.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
368
 
3.9%
365
 
3.8%
326
 
3.4%
314
 
3.3%
235
 
2.5%
234
 
2.5%
195
 
2.0%
179
 
1.9%
175
 
1.8%
164
 
1.7%
Other values (622) 6993
73.2%
Latin
ValueCountFrequency (%)
o 136
 
7.5%
e 116
 
6.4%
a 84
 
4.6%
n 78
 
4.3%
i 72
 
4.0%
s 68
 
3.8%
E 66
 
3.6%
A 65
 
3.6%
t 64
 
3.5%
r 59
 
3.3%
Other values (40) 1004
55.4%
Han
ValueCountFrequency (%)
2
 
6.1%
2
 
6.1%
1
 
3.0%
1
 
3.0%
1
 
3.0%
1
 
3.0%
1
 
3.0%
1
 
3.0%
1
 
3.0%
1
 
3.0%
Other values (21) 21
63.6%
Common
ValueCountFrequency (%)
734
50.6%
( 300
20.7%
) 300
20.7%
. 24
 
1.7%
1 16
 
1.1%
& 15
 
1.0%
0 9
 
0.6%
3 9
 
0.6%
2 8
 
0.6%
- 7
 
0.5%
Other values (12) 28
 
1.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 9548 | 74.3%
ASCII | 3260 | 25.4%
CJK | 32 | 0.2%
None | 2 | < 0.1%
CJK Compat Ideographs | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
734
22.5%
( 300
 
9.2%
) 300
 
9.2%
o 136
 
4.2%
e 116
 
3.6%
a 84
 
2.6%
n 78
 
2.4%
i 72
 
2.2%
s 68
 
2.1%
E 66
 
2.0%
Other values (60) 1306
40.1%
Hangul
ValueCountFrequency (%)
368
 
3.9%
365
 
3.8%
326
 
3.4%
314
 
3.3%
235
 
2.5%
234
 
2.5%
195
 
2.0%
179
 
1.9%
175
 
1.8%
164
 
1.7%
Other values (622) 6993
73.2%
CJK
ValueCountFrequency (%)
2
 
6.2%
2
 
6.2%
1
 
3.1%
1
 
3.1%
1
 
3.1%
1
 
3.1%
1
 
3.1%
1
 
3.1%
1
 
3.1%
1
 
3.1%
Other values (20) 20
62.5%
None
ValueCountFrequency (%)
1
50.0%
· 1
50.0%
CJK Compat Ideographs
ValueCountFrequency (%)
1
100.0%

소재지도로명주소
Text

Distinct: 1666
Distinct (%): 96.0%
Missing: 11
Missing (%): 0.6%
Memory size: 13.8 KiB

Length

Max length: 70
Median length: 55
Mean length: 41.545507
Min length: 23

Characters and Unicode

Total characters: 72123
Distinct characters: 413
Distinct categories: 12
Distinct scripts: 4
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1606
Unique (%): 92.5%

Sample

1st row: 경기도 성남시 수정구 고등로 3, 현대지식산업센터 성남고등 A830호 (고등동)
2nd row: 경기도 성남시 수정구 고등로2길 6, 201호 (고등동)
3rd row: 경기도 성남시 수정구 고등로 33, 판교밸리포레자이 306동 1003호 (고등동)
4th row: 경기도 성남시 수정구 고등로 33, 301동 102호 (고등동, 판교밸리포레자이)
5th row: 경기도 성남시 수정구 고등로 57, 110동 309호 (고등동, 고등마을)
Value | Count | Frequency (%)
경기도 | 1736 | 12.3%
성남시 | 1735 | 12.3%
분당구 | 1187 | 8.4%
중원구 | 290 | 2.1%
수정구 | 257 | 1.8%
정자동 | 184 | 1.3%
삼평동 | 140 | 1.0%
서현동 | 111 | 0.8%
정자일로 | 105 | 0.7%
상대원동 | 98 | 0.7%
Other values (2484) | 8282 | 58.6%

Most occurring characters

ValueCountFrequency (%)
12677
 
17.6%
1 2846
 
3.9%
2517
 
3.5%
, 2455
 
3.4%
2060
 
2.9%
2010
 
2.8%
0 1954
 
2.7%
1905
 
2.6%
1852
 
2.6%
1836
 
2.5%
Other values (403) 40011
55.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 40183 | 55.7%
Space Separator | 12677 | 17.6%
Decimal Number | 12484 | 17.3%
Other Punctuation | 2466 | 3.4%
Open Punctuation | 1765 | 2.4%
Close Punctuation | 1765 | 2.4%
Uppercase Letter | 476 | 0.7%
Dash Punctuation | 233 | 0.3%
Lowercase Letter | 41 | 0.1%
Letter Number | 25 | < 0.1%
Other values (2) | 8 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2517
 
6.3%
2060
 
5.1%
2010
 
5.0%
1905
 
4.7%
1852
 
4.6%
1836
 
4.6%
1835
 
4.6%
1778
 
4.4%
1768
 
4.4%
1359
 
3.4%
Other values (347) 21263
52.9%
Uppercase Letter
ValueCountFrequency (%)
B 128
26.9%
A 91
19.1%
C 47
 
9.9%
S 33
 
6.9%
K 30
 
6.3%
H 19
 
4.0%
R 19
 
4.0%
T 16
 
3.4%
I 14
 
2.9%
E 14
 
2.9%
Other values (12) 65
13.7%
Lowercase Letter
ValueCountFrequency (%)
n 12
29.3%
b 5
12.2%
e 5
12.2%
t 4
 
9.8%
w 3
 
7.3%
o 3
 
7.3%
r 3
 
7.3%
c 3
 
7.3%
a 1
 
2.4%
k 1
 
2.4%
Decimal Number
ValueCountFrequency (%)
1 2846
22.8%
0 1954
15.7%
2 1824
14.6%
3 1319
10.6%
4 1071
 
8.6%
5 926
 
7.4%
6 792
 
6.3%
7 676
 
5.4%
8 548
 
4.4%
9 528
 
4.2%
Other Punctuation
ValueCountFrequency (%)
, 2455
99.6%
. 7
 
0.3%
' 2
 
0.1%
/ 1
 
< 0.1%
& 1
 
< 0.1%
Letter Number
ValueCountFrequency (%)
18
72.0%
7
 
28.0%
Space Separator
ValueCountFrequency (%)
12677
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1765
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1765
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 233
100.0%
Math Symbol
ValueCountFrequency (%)
~ 7
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 40182 | 55.7%
Common | 31398 | 43.5%
Latin | 542 | 0.8%
Han | 1 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2517
 
6.3%
2060
 
5.1%
2010
 
5.0%
1905
 
4.7%
1852
 
4.6%
1836
 
4.6%
1835
 
4.6%
1778
 
4.4%
1768
 
4.4%
1359
 
3.4%
Other values (346) 21262
52.9%
Latin
ValueCountFrequency (%)
B 128
23.6%
A 91
16.8%
C 47
 
8.7%
S 33
 
6.1%
K 30
 
5.5%
H 19
 
3.5%
R 19
 
3.5%
18
 
3.3%
T 16
 
3.0%
I 14
 
2.6%
Other values (25) 127
23.4%
Common
ValueCountFrequency (%)
12677
40.4%
1 2846
 
9.1%
, 2455
 
7.8%
0 1954
 
6.2%
2 1824
 
5.8%
( 1765
 
5.6%
) 1765
 
5.6%
3 1319
 
4.2%
4 1071
 
3.4%
5 926
 
2.9%
Other values (11) 2796
 
8.9%
Han
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 40181 | 55.7%
ASCII | 31914 | 44.2%
Number Forms | 25 | < 0.1%
Enclosed Alphanum | 1 | < 0.1%
CJK | 1 | < 0.1%
Compat Jamo | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
12677
39.7%
1 2846
 
8.9%
, 2455
 
7.7%
0 1954
 
6.1%
2 1824
 
5.7%
( 1765
 
5.5%
) 1765
 
5.5%
3 1319
 
4.1%
4 1071
 
3.4%
5 926
 
2.9%
Other values (43) 3312
 
10.4%
Hangul
ValueCountFrequency (%)
2517
 
6.3%
2060
 
5.1%
2010
 
5.0%
1905
 
4.7%
1852
 
4.6%
1836
 
4.6%
1835
 
4.6%
1778
 
4.4%
1768
 
4.4%
1359
 
3.4%
Other values (345) 21261
52.9%
Number Forms
ValueCountFrequency (%)
18
72.0%
7
 
28.0%
Enclosed Alphanum
ValueCountFrequency (%)
1
100.0%
CJK
ValueCountFrequency (%)
1
100.0%
Compat Jamo
ValueCountFrequency (%)
1
100.0%

데이터기준일자
Date

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 13.8 KiB
Minimum: 2023-06-19 00:00:00
Maximum: 2023-06-19 00:00:00
[Figure: Histogram with fixed size bins (bins=1)]

Correlations

Variable | 구별 | 동별 | 업종
구별 | 1.000 | 1.000 | 0.205
동별 | 1.000 | 1.000 | 0.463
업종 | 0.205 | 0.463 | 1.000

Variable | 업종 | 구별 | 동별
업종 | 1.000 | 0.336 | 0.385
구별 | 0.336 | 1.000 | 0.989
동별 | 0.385 | 0.989 | 1.000

Variable | 구별 | 동별 | 업종
구별 | 1.000 | 0.989 | 0.336
동별 | 0.989 | 1.000 | 0.385
업종 | 0.336 | 0.385 | 1.000
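
The report does not label the association measures behind these matrices. For pairs of categorical columns, one common choice is Cramér's V, which rescales the chi-squared statistic of a contingency table to the range 0 to 1. The sketch below is a plain version without bias correction, assuming the table is given as a list of rows; the toy tables are hypothetical and are not taken from the dataset, so the output is not expected to reproduce the values above.

```python
import math

def cramers_v(table):
    """Cramér's V (no bias correction) for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if n > 0 and k > 0 else 0.0

# Hypothetical toy tables, not taken from the dataset.
perfect = [[10, 0], [0, 10]]      # each row category maps to exactly one column category
independent = [[5, 5], [5, 5]]    # row category carries no information about the column

v_perfect = cramers_v(perfect)
v_independent = cramers_v(independent)
print(v_perfect, v_independent)
```

A value close to 1 between 구별 and 동별 would be consistent with each neighborhood lying within a single district, so that knowing 동별 largely determines 구별.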

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

Row | 구별 | 동별 | 업종 | 사업장명 | 소재지도로명주소 | 데이터기준일자
0 | 수정구 | 고등동 | 출판사 | 소울하우스 | 경기도 성남시 수정구 고등로 3, 현대지식산업센터 성남고등 A830호 (고등동) | 2023-06-19
1 | 수정구 | 고등동 | 출판사 | 도서출판 W | 경기도 성남시 수정구 고등로2길 6, 201호 (고등동) | 2023-06-19
2 | 수정구 | 고등동 | 출판사 | 체스인사이드 | 경기도 성남시 수정구 고등로 33, 판교밸리포레자이 306동 1003호 (고등동) | 2023-06-19
3 | 수정구 | 고등동 | 출판사 | 북앤하우스 | 경기도 성남시 수정구 고등로 33, 301동 102호 (고등동, 판교밸리포레자이) | 2023-06-19
4 | 수정구 | 고등동 | 출판사 | 숙곳 | 경기도 성남시 수정구 고등로 57, 110동 309호 (고등동, 고등마을) | 2023-06-19
5 | 수정구 | 고등동 | 출판사 | 사랑의나무 | 경기도 성남시 수정구 고등로 57, 104동 1106호 (고등동, 고등마을) | 2023-06-19
6 | 수정구 | 고등동 | 출판사 | 주식회사 아름담다 | 경기도 성남시 수정구 청계산로 686, 801,802호 (고등동) | 2023-06-19
7 | 수정구 | 고등동 | 출판사 | 비욘드이미지네이션(Beyond Imagination) | 경기도 성남시 수정구 청계산로 686, 반도아이비밸리 지식산업센터 제2층 제215-216호 (고등동) | 2023-06-19
8 | 수정구 | 고등동 | 출판사 | ARCC | 경기도 성남시 수정구 청계산로4길 26-1, 301호 (고등동) | 2023-06-19
9 | 수정구 | 고등동 | 출판사 | 주식회사 앱스트랙트 | 경기도 성남시 수정구 청계산로4길 21, 라오재 1층 (고등동) | 2023-06-19

Row | 구별 | 동별 | 업종 | 사업장명 | 소재지도로명주소 | 데이터기준일자
1737 | 분당구 | 판교동 | 출판사 | 나는별 | 경기도 성남시 분당구 판교로210번길 14, 101호 (판교동) | 2023-06-19
1738 | 분당구 | 판교동 | 출판사 | 킹덤픽쳐스 | 경기도 성남시 분당구 판교로210번길 8-3, 101호 (판교동) | 2023-06-19
1739 | 분당구 | 판교동 | 출판사 | 이스트베이 | 경기도 성남시 분당구 판교원로 207, 501동 2801호 (판교동, 판교원마을5단지아파트) | 2023-06-19
1740 | 분당구 | 판교동 | 출판사 | 포웨이 출판사 | 경기도 성남시 분당구 판교원로 207, 505동 1104호 (판교동,판교원마을) | 2023-06-19
1741 | 분당구 | 판교동 | 출판사 | 최훈 | 경기도 성남시 분당구 판교원로 207, 507동 203호 (판교동, 판교원마을5단지아파트) | 2023-06-19
1742 | 분당구 | 판교동 | 출판사 | JK아카데미 | 경기도 성남시 분당구 판교원로 209, 102호 (판교동, 판교원마을6단지아파트) | 2023-06-19
1743 | 분당구 | 판교동 | 출판사 | 리아북스 | 경기도 성남시 분당구 판교원로 209, 601동 404호 (판교동, 판교원마을6단지아파트) | 2023-06-19
1744 | 분당구 | 판교동 | 출판사 | 도서출판 글보라 | 경기도 성남시 분당구 판교원로 209, 603동 1202호 (판교동, 판교원마을6단지아파트) | 2023-06-19
1745 | 분당구 | 판교동 | 출판사 | 시온출판사 | 경기도 성남시 분당구 판교원로 237, 703동 1101호 (판교동, 판교원마을7단지아파트) | 2023-06-19
1746 | 분당구 | 판교동 | 출판사 | 비케이디씨 | 경기도 성남시 분당구 판교원로 237, 704동 1804호 (판교동, 판교원마을7단지아파트) | 2023-06-19

Duplicate rows

Most frequently occurring

Row | 구별 | 동별 | 업종 | 사업장명 | 소재지도로명주소 | 데이터기준일자 | # duplicates
0 | 분당구 | 정자동 | 출판사 | 마엘(MAEL) | 경기도 성남시 분당구 성남대로331번길 8, 킨스타워 19층 (정자동) | 2023-06-19 | 2
1 | 수정구 | 태평동 | 출판사 | 향원익청 蓮 | 경기도 성남시 수정구 성남대로1258번길 5-4, 2층 (태평동) | 2023-06-19 | 2
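
Duplicate rows like these can be found by counting identical records. The sketch below uses only the standard library and assumes each row is a sequence of column values compared exactly; `duplicate_groups` is an illustrative helper, not the report's implementation, and the third input row is hypothetical.

```python
from collections import Counter

def duplicate_groups(rows):
    """Return {row: count} for every row that occurs more than once."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}

# Toy input: one row from the table above repeated, plus one hypothetical unique row.
rows = [
    ("분당구", "정자동", "출판사", "마엘(MAEL)"),
    ("분당구", "정자동", "출판사", "마엘(MAEL)"),
    ("분당구", "정자동", "출판사", "예시출판"),  # hypothetical, not in the dataset
]
groups = duplicate_groups(rows)
print(groups)
```

Exact comparison means that rows differing only in whitespace or punctuation are not grouped, so near-duplicates would need normalization first.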