Overview

Dataset statistics

Number of variables: 13
Number of observations: 4877
Missing cells: 29296
Missing cells (%): 46.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 533.6 KiB
Average record size in memory: 112.0 B
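
The missing-cell figures above can be cross-checked against the per-column missing counts reported further down. The sketch below is only a consistency check, not part of the profiling tool. It assumes the four unnamed text-variable blocks correspond to 회사명, 생산품, 공장대표주소(지번) and 업종명, which is inferred from the column order in the Sample section.

```python
# Per-column missing counts as reported in the Alerts and Variables sections.
# The names of the four text columns are an assumption inferred from the
# column order shown in the Sample section.
missing_per_column = {
    "순번": 0,
    "회사명": 2,
    "종업원수": 0,
    "생산품": 5,
    "공장대표주소(지번)": 0,
    "대표업종번호": 14,
    "업종명": 14,
    "Unnamed: 7": 4877,
    "Unnamed: 8": 4877,
    "Unnamed: 9": 4877,
    "Unnamed: 10": 4877,
    "Unnamed: 11": 4877,
    "Unnamed: 12": 4876,
}

n_rows = 4877
n_cols = len(missing_per_column)

missing_cells = sum(missing_per_column.values())
missing_pct = 100 * missing_cells / (n_rows * n_cols)

# Average record size: total memory (KiB converted to bytes) divided by rows.
avg_record_bytes = 533.6 * 1024 / n_rows

print(n_cols, missing_cells, f"{missing_pct:.1f}%", f"{avg_record_bytes:.1f} B")
```

Almost all of the 46.2% comes from the six `Unnamed` columns. The seven named columns together account for only 35 missing cells.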

Variable types

Numeric: 3
Text: 5
Unsupported: 5

Dataset

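Description: Data on the status of factory registrations in Gangseo-gu, Busan Metropolitan City (부산광역시 강서구). It provides fields such as company name, address, industry, number of employees, and products.
Author: Gangseo-gu, Busan Metropolitan City (부산광역시 강서구)
URL: https://www.data.go.kr/data/15051978/fileData.do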

Alerts

Unnamed: 12 has constant value "" [Constant]
Unnamed: 7 has 4877 (100.0%) missing values [Missing]
Unnamed: 8 has 4877 (100.0%) missing values [Missing]
Unnamed: 9 has 4877 (100.0%) missing values [Missing]
Unnamed: 10 has 4877 (100.0%) missing values [Missing]
Unnamed: 11 has 4877 (100.0%) missing values [Missing]
Unnamed: 12 has 4876 (> 99.9%) missing values [Missing]
종업원수 is highly skewed (γ1 = 37.1642917) [Skewed]
순번 has unique values [Unique]
Unnamed: 7 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 8 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 9 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 10 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
Unnamed: 11 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
종업원수 has 231 (4.7%) zeros [Zeros]
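
The "needs cleaning" alerts suggest dropping the columns that carry no data. The following is a minimal sketch using only the standard library. The helper `drop_empty_columns` and the two example rows are hypothetical and modeled loosely on the Sample section; with pandas, `DataFrame.dropna(axis=1, how="all")` would do the same job.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]


def drop_empty_columns(rows: List[Row]) -> List[Row]:
    """Return copies of the rows without columns that are missing in every row."""
    if not rows:
        return []
    # Keep a column if at least one row holds a non-empty value for it.
    keep = [
        col
        for col in rows[0].keys()
        if any(row.get(col) not in (None, "") for row in rows)
    ]
    return [{col: row.get(col) for col in keep} for row in rows]


# Hypothetical rows for illustration only.
rows = [
    {"순번": "2", "회사명": "나현섬유", "종업원수": "5", "Unnamed: 7": None},
    {"순번": "3", "회사명": "수정공업사", "종업원수": "0", "Unnamed: 7": None},
]

cleaned = drop_empty_columns(rows)
print(list(cleaned[0].keys()))
```

Note that `Unnamed: 12` would survive this rule in the real dataset, because it holds one non-missing value. Removing it would need a threshold on the missing ratio or an explicit column list.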

Reproduction

Analysis started: 2023-12-12 12:05:47.291288
Analysis finished: 2023-12-12 12:05:50.204099
Duration: 2.91 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

순번 (serial number)
Real number (ℝ)

UNIQUE 

Distinct: 4877
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2439.2678
Minimum: 1
Maximum: 4878
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 43.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 244.8
Q1: 1220
Median: 2439
Q3: 3659
95-th percentile: 4634.2
Maximum: 4878
Range: 4877
Interquartile range (IQR): 2439

Descriptive statistics

Standard deviation: 1408.3526
Coefficient of variation (CV): 0.57736697
Kurtosis: -1.1999772
Mean: 2439.2678
Median Absolute Deviation (MAD): 1220
Skewness: 0.00038816694
Sum: 11896309
Variance: 1983457.2
Monotonicity: Strictly increasing
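
The statistics above describe 4877 distinct, strictly increasing values running from 1 to 4878, so the sequence cannot be a gap-free run of integers. As a small worked example, and assuming the serial numbers are consecutive integers apart from the gap, the reported sum is enough to identify what is missing:

```python
# Figures reported for 순번 in this section.
n = 4877
minimum, maximum = 1, 4878
reported_sum = 11896309

# Sum of the complete integer range minimum..maximum (arithmetic series).
span = maximum - minimum + 1
full_sum = (minimum + maximum) * span // 2

# How many integers are absent, and what they add up to.
gap_count = span - n
missing_total = full_sum - reported_sum

print(gap_count, missing_total)  # 1 3572
```

Under that assumption exactly one serial number is skipped, and it is 3572. This also fits the Sample section, where the offset between the row index and 순번 grows from 1 in the first rows to 2 in the last rows.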
Histogram with fixed size bins (bins=50)
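Common values

| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 1 | < 0.1% |
| 3250 | 1 | < 0.1% |
| 3257 | 1 | < 0.1% |
| 3256 | 1 | < 0.1% |
| 3255 | 1 | < 0.1% |
| 3254 | 1 | < 0.1% |
| 3253 | 1 | < 0.1% |
| 3252 | 1 | < 0.1% |
| 3251 | 1 | < 0.1% |
| 3249 | 1 | < 0.1% |
| Other values (4867) | 4867 | 99.8% |

Minimum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 1 | < 0.1% |
| 2 | 1 | < 0.1% |
| 3 | 1 | < 0.1% |
| 4 | 1 | < 0.1% |
| 5 | 1 | < 0.1% |
| 6 | 1 | < 0.1% |
| 7 | 1 | < 0.1% |
| 8 | 1 | < 0.1% |
| 9 | 1 | < 0.1% |
| 10 | 1 | < 0.1% |

Maximum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 4878 | 1 | < 0.1% |
| 4877 | 1 | < 0.1% |
| 4876 | 1 | < 0.1% |
| 4875 | 1 | < 0.1% |
| 4874 | 1 | < 0.1% |
| 4873 | 1 | < 0.1% |
| 4872 | 1 | < 0.1% |
| 4871 | 1 | < 0.1% |
| 4870 | 1 | < 0.1% |
| 4869 | 1 | < 0.1% |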

회사명 (company name)
Text

Distinct: 4369
Distinct (%): 89.6%
Missing: 2
Missing (%): < 0.1%
Memory size: 38.2 KiB

Length

Max length: 32
Median length: 24
Mean length: 6.7347692
Min length: 2
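
The mean length can be reproduced from the total character count reported under "Characters and Unicode" for this variable. The sketch below does that and also checks whether the reported median length is compatible with the mean. It is a plausibility check on the reported figures only, and the explanation for any discrepancy (for example, how the tool computes the median) is not known from this report.

```python
# Figures reported for this text variable.
n_rows = 4877
n_missing = 2
total_characters = 32832
reported_median_length = 24
min_length = 2

# Mean length = total characters / number of non-missing values.
non_missing = n_rows - n_missing
mean_length = total_characters / non_missing
print(round(mean_length, 7))

# If the median really were 24, at least half of the values would have
# length >= 24 and the rest at least the minimum length, so the mean
# could not fall below this bound.
mean_lower_bound = (reported_median_length + min_length) / 2
print(mean_lower_bound, mean_length >= mean_lower_bound)
```

The mean matches the report, but it lies well below the bound implied by a median of 24. The reported median length therefore appears inconsistent with the other length statistics and is best treated with caution.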

Characters and Unicode

Total characters: 32832
Distinct characters: 592
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
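
As an illustration of such character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories over a few values taken from the Sample rows of this variable. It is not the profiling tool's own implementation, and the helper name `category_counts` is made up for this example.

```python
import unicodedata
from collections import Counter
from typing import Iterable


def category_counts(values: Iterable[str]) -> Counter:
    """Count Unicode general categories over all characters of the given strings."""
    counts: Counter = Counter()
    for value in values:
        counts.update(unicodedata.category(ch) for ch in value)
    return counts


# Four company names from the Sample rows.
sample = ["나현섬유", "수정공업사", "(유)고려금속", "(유)비씨코로텍"]
counts = category_counts(sample)

# Two-letter codes map to the names used in the report, for example
# Lo = Other Letter, Ps = Open Punctuation, Pe = Close Punctuation.
print(counts["Lo"], counts["Ps"], counts["Pe"])  # 20 2 2
```

Hangul syllables fall under "Other Letter" (Lo), which is why that category dominates the tables that follow.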

Unique

Unique: 3949
Unique (%): 81.0%

Sample

1st row: (blank)
2nd row: 나현섬유
3rd row: 수정공업사
4th row: (유)고려금속
5th row: (유)비씨코로텍
ValueCountFrequency (%)
주식회사 274
 
5.1%
제2공장 12
 
0.2%
녹산공장 9
 
0.2%
2공장 9
 
0.2%
주)성광벤드 7
 
0.1%
부산공장 7
 
0.1%
7
 
0.1%
리노공업(주 6
 
0.1%
유한회사 6
 
0.1%
사단법인 5
 
0.1%
Other values (4437) 5023
93.6%

Most occurring characters

ValueCountFrequency (%)
2602
 
7.9%
) 2330
 
7.1%
( 2327
 
7.1%
850
 
2.6%
704
 
2.1%
686
 
2.1%
618
 
1.9%
596
 
1.8%
510
 
1.6%
506
 
1.5%
Other values (582) 21103
64.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 26615
81.1%
Close Punctuation 2330
 
7.1%
Open Punctuation 2327
 
7.1%
Uppercase Letter 819
 
2.5%
Space Separator 506
 
1.5%
Other Punctuation 117
 
0.4%
Lowercase Letter 65
 
0.2%
Decimal Number 52
 
0.2%
Other Symbol 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2602
 
9.8%
850
 
3.2%
704
 
2.6%
686
 
2.6%
618
 
2.3%
596
 
2.2%
510
 
1.9%
483
 
1.8%
463
 
1.7%
449
 
1.7%
Other values (522) 18654
70.1%
Uppercase Letter
ValueCountFrequency (%)
E 107
13.1%
C 80
 
9.8%
G 76
 
9.3%
N 74
 
9.0%
S 71
 
8.7%
T 66
 
8.1%
M 39
 
4.8%
H 33
 
4.0%
K 32
 
3.9%
A 32
 
3.9%
Other values (16) 209
25.5%
Lowercase Letter
ValueCountFrequency (%)
e 10
15.4%
t 8
12.3%
o 7
10.8%
c 7
10.8%
n 6
9.2%
d 4
 
6.2%
a 4
 
6.2%
h 4
 
6.2%
i 3
 
4.6%
k 2
 
3.1%
Other values (7) 10
15.4%
Decimal Number
ValueCountFrequency (%)
2 34
65.4%
1 10
 
19.2%
3 3
 
5.8%
6 1
 
1.9%
0 1
 
1.9%
5 1
 
1.9%
9 1
 
1.9%
4 1
 
1.9%
Other Punctuation
ValueCountFrequency (%)
. 80
68.4%
& 22
 
18.8%
, 10
 
8.5%
/ 4
 
3.4%
1
 
0.9%
Close Punctuation
ValueCountFrequency (%)
) 2330
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2327
100.0%
Space Separator
ValueCountFrequency (%)
506
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 26616
81.1%
Common 5332
 
16.2%
Latin 884
 
2.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2602
 
9.8%
850
 
3.2%
704
 
2.6%
686
 
2.6%
618
 
2.3%
596
 
2.2%
510
 
1.9%
483
 
1.8%
463
 
1.7%
449
 
1.7%
Other values (523) 18655
70.1%
Latin
ValueCountFrequency (%)
E 107
 
12.1%
C 80
 
9.0%
G 76
 
8.6%
N 74
 
8.4%
S 71
 
8.0%
T 66
 
7.5%
M 39
 
4.4%
H 33
 
3.7%
K 32
 
3.6%
A 32
 
3.6%
Other values (33) 274
31.0%
Common
ValueCountFrequency (%)
) 2330
43.7%
( 2327
43.6%
506
 
9.5%
. 80
 
1.5%
2 34
 
0.6%
& 22
 
0.4%
, 10
 
0.2%
1 10
 
0.2%
/ 4
 
0.1%
3 3
 
0.1%
Other values (6) 6
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 26614
81.1%
ASCII 6215
 
18.9%
None 2
 
< 0.1%
Compat Jamo 1
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2602
 
9.8%
850
 
3.2%
704
 
2.6%
686
 
2.6%
618
 
2.3%
596
 
2.2%
510
 
1.9%
483
 
1.8%
463
 
1.7%
449
 
1.7%
Other values (521) 18653
70.1%
ASCII
ValueCountFrequency (%)
) 2330
37.5%
( 2327
37.4%
506
 
8.1%
E 107
 
1.7%
C 80
 
1.3%
. 80
 
1.3%
G 76
 
1.2%
N 74
 
1.2%
S 71
 
1.1%
T 66
 
1.1%
Other values (48) 498
 
8.0%
None
ValueCountFrequency (%)
1
50.0%
1
50.0%
Compat Jamo
ValueCountFrequency (%)
1
100.0%

종업원수 (number of employees)
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 156
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.53291
Minimum: 0
Maximum: 4572
Zeros: 231
Zeros (%): 4.7%
Negative: 0
Negative (%): 0.0%
Memory size: 43.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 3
Median: 6
Q3: 14
95-th percentile: 48
Maximum: 4572
Range: 4572
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 93.485793
Coefficient of variation (CV): 5.6545276
Kurtosis: 1602.485
Mean: 16.53291
Median Absolute Deviation (MAD): 4
Skewness: 37.164292
Sum: 80631
Variance: 8739.5935
Monotonicity: Not monotonic
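
Several of these figures follow from one another, which makes for a quick sanity check. The sketch below recomputes the derived statistics from the reported sum, standard deviation and quantiles. It uses only numbers printed in this section and is not how the profiling tool itself computes them.

```python
from math import isclose

# Figures reported above for 종업원수.
n = 4877
total = 80631
std = 93.485793
q1, median, q3 = 3, 6, 14
minimum, maximum = 0, 4572

mean = total / n                 # Mean = Sum / number of observations
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance is the squared standard deviation
iqr = q3 - q1                    # Interquartile range
value_range = maximum - minimum  # Range

print(f"mean={mean:.5f} cv={cv:.5f} variance={variance:.2f}")
print(f"iqr={iqr} range={value_range} mean/median={mean / median:.2f}")

# The derived values agree with the report to the printed precision.
assert isclose(mean, 16.53291, rel_tol=1e-6)
assert isclose(cv, 5.6545276, rel_tol=1e-6)
assert isclose(variance, 8739.5935, rel_tol=1e-6)
```

The mean is almost three times the median, and the maximum is hundreds of times Q3. A handful of very large factories therefore drives the extreme skewness flagged in the Alerts.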
Histogram with fixed size bins (bins=50)
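Common values

| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 448 | 9.2% |
| 3 | 442 | 9.1% |
| 4 | 431 | 8.8% |
| 2 | 397 | 8.1% |
| 1 | 291 | 6.0% |
| 6 | 282 | 5.8% |
| 7 | 232 | 4.8% |
| 0 | 231 | 4.7% |
| 10 | 213 | 4.4% |
| 8 | 191 | 3.9% |
| Other values (146) | 1719 | 35.2% |

Minimum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 231 | 4.7% |
| 1 | 291 | 6.0% |
| 2 | 397 | 8.1% |
| 3 | 442 | 9.1% |
| 4 | 431 | 8.8% |
| 5 | 448 | 9.2% |
| 6 | 282 | 5.8% |
| 7 | 232 | 4.8% |
| 8 | 191 | 3.9% |
| 9 | 162 | 3.3% |

Maximum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 4572 | 1 | < 0.1% |
| 3471 | 1 | < 0.1% |
| 2213 | 1 | < 0.1% |
| 782 | 1 | < 0.1% |
| 710 | 1 | < 0.1% |
| 500 | 1 | < 0.1% |
| 479 | 1 | < 0.1% |
| 451 | 1 | < 0.1% |
| 410 | 1 | < 0.1% |
| 370 | 1 | < 0.1% |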

생산품 (products)
Text

Distinct: 3685
Distinct (%): 75.6%
Missing: 5
Missing (%): 0.1%
Memory size: 38.2 KiB

Length

Max length: 58
Median length: 46
Mean length: 7.9997947
Min length: 1

Characters and Unicode

Total characters: 38975
Distinct characters: 691
Distinct categories: 11
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3304
Unique (%): 67.8%

Sample

1st row: 임대업
2nd row: 어망
3rd row: 스텐 닛불, 소켓
4th row: 밴(VANE)
5th row: 분사기(세척기)
ValueCountFrequency (%)
271
 
3.2%
225
 
2.6%
부품 149
 
1.7%
자동차부품 113
 
1.3%
밸브 102
 
1.2%
금형 95
 
1.1%
자동차 88
 
1.0%
선박용 84
 
1.0%
78
 
0.9%
산업기계 62
 
0.7%
Other values (3934) 7262
85.1%

Most occurring characters

ValueCountFrequency (%)
3719
 
9.5%
, 1523
 
3.9%
1502
 
3.9%
1241
 
3.2%
934
 
2.4%
728
 
1.9%
587
 
1.5%
580
 
1.5%
536
 
1.4%
496
 
1.3%
Other values (681) 27129
69.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 31252
80.2%
Space Separator 3719
 
9.5%
Other Punctuation 1557
 
4.0%
Uppercase Letter 1188
 
3.0%
Lowercase Letter 820
 
2.1%
Close Punctuation 196
 
0.5%
Open Punctuation 196
 
0.5%
Decimal Number 35
 
0.1%
Dash Punctuation 10
 
< 0.1%
Connector Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1502
 
4.8%
1241
 
4.0%
934
 
3.0%
728
 
2.3%
587
 
1.9%
580
 
1.9%
536
 
1.7%
496
 
1.6%
488
 
1.6%
480
 
1.5%
Other values (609) 23680
75.8%
Uppercase Letter
ValueCountFrequency (%)
E 143
12.0%
L 101
 
8.5%
P 91
 
7.7%
A 86
 
7.2%
C 84
 
7.1%
S 69
 
5.8%
D 68
 
5.7%
R 68
 
5.7%
T 64
 
5.4%
O 63
 
5.3%
Other values (15) 351
29.5%
Lowercase Letter
ValueCountFrequency (%)
e 111
13.5%
t 76
 
9.3%
r 61
 
7.4%
i 61
 
7.4%
a 60
 
7.3%
l 57
 
7.0%
o 56
 
6.8%
s 56
 
6.8%
n 43
 
5.2%
p 32
 
3.9%
Other values (14) 207
25.2%
Decimal Number
ValueCountFrequency (%)
1 10
28.6%
2 6
17.1%
0 6
17.1%
5 4
 
11.4%
9 2
 
5.7%
8 2
 
5.7%
4 2
 
5.7%
3 2
 
5.7%
6 1
 
2.9%
Other Punctuation
ValueCountFrequency (%)
, 1523
97.8%
/ 14
 
0.9%
. 11
 
0.7%
· 2
 
0.1%
' 2
 
0.1%
% 2
 
0.1%
" 2
 
0.1%
& 1
 
0.1%
Space Separator
ValueCountFrequency (%)
3719
100.0%
Close Punctuation
ValueCountFrequency (%)
) 196
100.0%
Open Punctuation
ValueCountFrequency (%)
( 196
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 10
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 31249
80.2%
Common 5715
 
14.7%
Latin 2008
 
5.2%
Han 3
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1502
 
4.8%
1241
 
4.0%
934
 
3.0%
728
 
2.3%
587
 
1.9%
580
 
1.9%
536
 
1.7%
496
 
1.6%
488
 
1.6%
480
 
1.5%
Other values (606) 23677
75.8%
Latin
ValueCountFrequency (%)
E 143
 
7.1%
e 111
 
5.5%
L 101
 
5.0%
P 91
 
4.5%
A 86
 
4.3%
C 84
 
4.2%
t 76
 
3.8%
S 69
 
3.4%
D 68
 
3.4%
R 68
 
3.4%
Other values (39) 1111
55.3%
Common
ValueCountFrequency (%)
3719
65.1%
, 1523
26.6%
) 196
 
3.4%
( 196
 
3.4%
/ 14
 
0.2%
. 11
 
0.2%
- 10
 
0.2%
1 10
 
0.2%
2 6
 
0.1%
0 6
 
0.1%
Other values (13) 24
 
0.4%
Han
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 31248
80.2%
ASCII 7721
 
19.8%
CJK 3
 
< 0.1%
None 2
 
< 0.1%
Compat Jamo 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3719
48.2%
, 1523
19.7%
) 196
 
2.5%
( 196
 
2.5%
E 143
 
1.9%
e 111
 
1.4%
L 101
 
1.3%
P 91
 
1.2%
A 86
 
1.1%
C 84
 
1.1%
Other values (61) 1471
 
19.1%
Hangul
ValueCountFrequency (%)
1502
 
4.8%
1241
 
4.0%
934
 
3.0%
728
 
2.3%
587
 
1.9%
580
 
1.9%
536
 
1.7%
496
 
1.6%
488
 
1.6%
480
 
1.5%
Other values (605) 23676
75.8%
None
ValueCountFrequency (%)
· 2
100.0%
CJK
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%
Compat Jamo
ValueCountFrequency (%)
1
100.0%

공장대표주소(지번) (main factory address, lot-number format)
Text

Distinct: 4233
Distinct (%): 86.8%
Missing: 0
Missing (%): 0.0%
Memory size: 38.2 KiB

Length

Max length: 61
Median length: 60
Mean length: 23.905475
Min length: 18

Characters and Unicode

Total characters: 116587
Distinct characters: 228
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3734
Unique (%): 76.6%

Sample

1st row: 부산광역시 강서구 신호동 210-1번지
2nd row: 부산광역시 강서구 송정동 1742-1번지 1742-1번지
3rd row: 부산광역시 강서구 미음동 0번지 부산신항배후 국제산업물류도시(1단계) 일반산업단지 I33블록 6놋트
4th row: 부산광역시 강서구 강동동 2044-2번지
5th row: 부산광역시 강서구 녹산동 110번지
ValueCountFrequency (%)
부산광역시 4908
23.0%
강서구 4878
22.9%
송정동 1663
 
7.8%
미음동 490
 
2.3%
대저1동 436
 
2.0%
강동동 379
 
1.8%
대저2동 341
 
1.6%
화전동 337
 
1.6%
지사동 308
 
1.4%
번지 227
 
1.1%
Other values (4025) 7370
34.5%

Most occurring characters

ValueCountFrequency (%)
17515
 
15.0%
1 6697
 
5.7%
5501
 
4.7%
5459
 
4.7%
5259
 
4.5%
5234
 
4.5%
5042
 
4.3%
5037
 
4.3%
5014
 
4.3%
4912
 
4.2%
Other values (218) 50917
43.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 67530
57.9%
Decimal Number 25715
 
22.1%
Space Separator 17515
 
15.0%
Dash Punctuation 4454
 
3.8%
Uppercase Letter 860
 
0.7%
Close Punctuation 185
 
0.2%
Open Punctuation 184
 
0.2%
Other Punctuation 89
 
0.1%
Lowercase Letter 54
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5501
 
8.1%
5459
 
8.1%
5259
 
7.8%
5234
 
7.8%
5042
 
7.5%
5037
 
7.5%
5014
 
7.4%
4912
 
7.3%
4908
 
7.3%
4881
 
7.2%
Other values (175) 16283
24.1%
Uppercase Letter
ValueCountFrequency (%)
T 146
17.0%
O 143
16.6%
I 136
15.8%
L 129
15.0%
A 78
9.1%
M 68
7.9%
R 68
7.9%
B 59
6.9%
C 13
 
1.5%
S 9
 
1.0%
Other values (3) 11
 
1.3%
Decimal Number
ValueCountFrequency (%)
1 6697
26.0%
2 3177
12.4%
5 2874
11.2%
3 2425
 
9.4%
6 2122
 
8.3%
7 2003
 
7.8%
4 1944
 
7.6%
0 1713
 
6.7%
8 1390
 
5.4%
9 1370
 
5.3%
Lowercase Letter
ValueCountFrequency (%)
i 40
74.1%
s 3
 
5.6%
o 2
 
3.7%
t 2
 
3.7%
p 2
 
3.7%
c 1
 
1.9%
g 1
 
1.9%
b 1
 
1.9%
a 1
 
1.9%
m 1
 
1.9%
Other Punctuation
ValueCountFrequency (%)
, 84
94.4%
. 2
 
2.2%
/ 1
 
1.1%
& 1
 
1.1%
; 1
 
1.1%
Space Separator
ValueCountFrequency (%)
17515
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4454
100.0%
Close Punctuation
ValueCountFrequency (%)
) 185
100.0%
Open Punctuation
ValueCountFrequency (%)
( 184
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 67530
57.9%
Common 48143
41.3%
Latin 914
 
0.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5501
 
8.1%
5459
 
8.1%
5259
 
7.8%
5234
 
7.8%
5042
 
7.5%
5037
 
7.5%
5014
 
7.4%
4912
 
7.3%
4908
 
7.3%
4881
 
7.2%
Other values (175) 16283
24.1%
Latin
ValueCountFrequency (%)
T 146
16.0%
O 143
15.6%
I 136
14.9%
L 129
14.1%
A 78
8.5%
M 68
7.4%
R 68
7.4%
B 59
6.5%
i 40
 
4.4%
C 13
 
1.4%
Other values (13) 34
 
3.7%
Common
ValueCountFrequency (%)
17515
36.4%
1 6697
 
13.9%
- 4454
 
9.3%
2 3177
 
6.6%
5 2874
 
6.0%
3 2425
 
5.0%
6 2122
 
4.4%
7 2003
 
4.2%
4 1944
 
4.0%
0 1713
 
3.6%
Other values (10) 3219
 
6.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 67530
57.9%
ASCII 49057
42.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
17515
35.7%
1 6697
 
13.7%
- 4454
 
9.1%
2 3177
 
6.5%
5 2874
 
5.9%
3 2425
 
4.9%
6 2122
 
4.3%
7 2003
 
4.1%
4 1944
 
4.0%
0 1713
 
3.5%
Other values (33) 4133
 
8.4%
Hangul
ValueCountFrequency (%)
5501
 
8.1%
5459
 
8.1%
5259
 
7.8%
5234
 
7.8%
5042
 
7.5%
5037
 
7.5%
5014
 
7.4%
4912
 
7.3%
4908
 
7.3%
4881
 
7.2%
Other values (175) 16283
24.1%

대표업종번호 (primary industry code)
Real number (ℝ)

Distinct: 430
Distinct (%): 8.8%
Missing: 14
Missing (%): 0.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 28718.378
Minimum: 10121
Maximum: 96921
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 43.0 KiB

Quantile statistics

Minimum: 10121
5-th percentile: 15219.1
Q1: 25113
Median: 28121
Q3: 29294
95-th percentile: 49301
Maximum: 96921
Range: 86800
Interquartile range (IQR): 4181

Descriptive statistics

Standard deviation: 10630.549
Coefficient of variation (CV): 0.37016535
Kurtosis: 10.738204
Mean: 28718.378
Median Absolute Deviation (MAD): 2210
Skewness: 2.735399
Sum: 1.3965747 × 10^8
Variance: 1.1300856 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
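Common values

| Value | Count | Frequency (%) |
|---|---|---|
| 25924 | 329 | 6.7% |
| 31114 | 308 | 6.3% |
| 29294 | 186 | 3.8% |
| 28123 | 137 | 2.8% |
| 29133 | 137 | 2.8% |
| 30399 | 117 | 2.4% |
| 24132 | 110 | 2.3% |
| 24199 | 107 | 2.2% |
| 68112 | 107 | 2.2% |
| 25113 | 97 | 2.0% |
| Other values (420) | 3228 | 66.2% |

Minimum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 10121 | 6 | 0.1% |
| 10122 | 1 | < 0.1% |
| 10129 | 10 | 0.2% |
| 10211 | 8 | 0.2% |
| 10212 | 2 | < 0.1% |
| 10213 | 2 | < 0.1% |
| 10219 | 6 | 0.1% |
| 10220 | 3 | 0.1% |
| 10301 | 4 | 0.1% |
| 10309 | 6 | 0.1% |

Maximum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 96921 | 1 | < 0.1% |
| 95212 | 1 | < 0.1% |
| 95211 | 13 | 0.3% |
| 94200 | 1 | < 0.1% |
| 94110 | 3 | 0.1% |
| 86102 | 1 | < 0.1% |
| 84224 | 1 | < 0.1% |
| 76390 | 1 | < 0.1% |
| 76190 | 1 | < 0.1% |
| 75912 | 1 | < 0.1% |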

업종명 (industry name)
Text

Distinct: 972
Distinct (%): 20.0%
Missing: 14
Missing (%): 0.3%
Memory size: 38.2 KiB

Length

Max length: 34
Median length: 28
Mean length: 15.939955
Min length: 3

Characters and Unicode

Total characters: 77516
Distinct characters: 340
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 506
Unique (%): 10.4%

Sample

1st row: 비주거용 건물 임대업
2nd row: 어망 및 기타 끈 가공품 제조업
3rd row: 볼트 및 너트류 제조업 외 2종
4th row: 그 외 기타 분류 안된 금속 가공 제품 제조업 외 1종
5th row: 선박 구성 부분품 제조업
ValueCountFrequency (%)
제조업 3709
 
15.4%
2506
 
10.4%
2052
 
8.5%
기타 1056
 
4.4%
1종 1010
 
4.2%
694
 
2.9%
금속 453
 
1.9%
부분품 345
 
1.4%
절삭가공 329
 
1.4%
유사처리업 329
 
1.4%
Other values (726) 11543
48.0%

Most occurring characters

ValueCountFrequency (%)
19163
24.7%
4979
 
6.4%
4581
 
5.9%
4255
 
5.5%
2604
 
3.4%
2521
 
3.3%
2053
 
2.6%
1843
 
2.4%
1477
 
1.9%
1 1195
 
1.5%
Other values (330) 32845
42.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 55832
72.0%
Space Separator 19163
 
24.7%
Decimal Number 2020
 
2.6%
Other Punctuation 479
 
0.6%
Open Punctuation 11
 
< 0.1%
Close Punctuation 11
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4979
 
8.9%
4581
 
8.2%
4255
 
7.6%
2604
 
4.7%
2521
 
4.5%
2053
 
3.7%
1843
 
3.3%
1477
 
2.6%
1160
 
2.1%
1064
 
1.9%
Other values (315) 29295
52.5%
Decimal Number
ValueCountFrequency (%)
1 1195
59.2%
2 312
 
15.4%
3 215
 
10.6%
4 115
 
5.7%
5 61
 
3.0%
6 39
 
1.9%
8 31
 
1.5%
7 20
 
1.0%
9 20
 
1.0%
0 12
 
0.6%
Other Punctuation
ValueCountFrequency (%)
, 472
98.5%
. 7
 
1.5%
Space Separator
ValueCountFrequency (%)
19163
100.0%
Open Punctuation
ValueCountFrequency (%)
( 11
100.0%
Close Punctuation
ValueCountFrequency (%)
) 11
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 55832
72.0%
Common 21684
 
28.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4979
 
8.9%
4581
 
8.2%
4255
 
7.6%
2604
 
4.7%
2521
 
4.5%
2053
 
3.7%
1843
 
3.3%
1477
 
2.6%
1160
 
2.1%
1064
 
1.9%
Other values (315) 29295
52.5%
Common
ValueCountFrequency (%)
19163
88.4%
1 1195
 
5.5%
, 472
 
2.2%
2 312
 
1.4%
3 215
 
1.0%
4 115
 
0.5%
5 61
 
0.3%
6 39
 
0.2%
8 31
 
0.1%
7 20
 
0.1%
Other values (5) 61
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 55815
72.0%
ASCII 21684
 
28.0%
Compat Jamo 17
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
19163
88.4%
1 1195
 
5.5%
, 472
 
2.2%
2 312
 
1.4%
3 215
 
1.0%
4 115
 
0.5%
5 61
 
0.3%
6 39
 
0.2%
8 31
 
0.1%
7 20
 
0.1%
Other values (5) 61
 
0.3%
Hangul
ValueCountFrequency (%)
4979
 
8.9%
4581
 
8.2%
4255
 
7.6%
2604
 
4.7%
2521
 
4.5%
2053
 
3.7%
1843
 
3.3%
1477
 
2.6%
1160
 
2.1%
1064
 
1.9%
Other values (314) 29278
52.5%
Compat Jamo
ValueCountFrequency (%)
17
100.0%

Unnamed: 7
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 4877
Missing (%): 100.0%
Memory size: 43.0 KiB

Unnamed: 8
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 4877
Missing (%): 100.0%
Memory size: 43.0 KiB

Unnamed: 9
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 4877
Missing (%): 100.0%
Memory size: 43.0 KiB

Unnamed: 10
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 4877
Missing (%): 100.0%
Memory size: 43.0 KiB

Unnamed: 11
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 4877
Missing (%): 100.0%
Memory size: 43.0 KiB

Unnamed: 12
Text

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 100.0%
Missing: 4876
Missing (%): > 99.9%
Memory size: 38.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1
Distinct characters: 1
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: `
ValueCountFrequency (%)
1
100.0%

Most occurring characters

ValueCountFrequency (%)
` 1
100.0%

Most occurring categories

ValueCountFrequency (%)
Modifier Symbol 1
100.0%

Most frequent character per category

Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
` 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
` 1
100.0%

Interactions

[Figures: nine interaction plots; the images are not recoverable from the extracted text.]

Correlations

| | 순번 | 종업원수 | 대표업종번호 |
|---|---|---|---|
| 순번 | 1.000 | 0.024 | 0.060 |
| 종업원수 | 0.024 | 1.000 | 0.015 |
| 대표업종번호 | 0.060 | 0.015 | 1.000 |

| | 순번 | 종업원수 | 대표업종번호 |
|---|---|---|---|
| 순번 | 1.000 | -0.189 | 0.006 |
| 종업원수 | -0.189 | 1.000 | -0.099 |
| 대표업종번호 | 0.006 | -0.099 | 1.000 |

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
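
To make the idea of nullity correlation concrete, the sketch below computes a Pearson correlation between the present/missing indicators of two columns. It is a simplified illustration rather than the profiling tool's implementation. The function name `nullity_correlation` and the three short columns are hypothetical.

```python
from math import sqrt
from typing import Optional, Sequence


def nullity_correlation(col_a: Sequence[Optional[object]],
                        col_b: Sequence[Optional[object]]) -> float:
    """Pearson correlation between the presence indicators of two columns."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Undefined when a column is always present or always missing.
        return float("nan")
    return cov / sqrt(var_a * var_b)


# Hypothetical columns: the first two are missing in the same rows,
# the third is missing in a different row.
industry_code = [25924, None, 31114, None, 29294]
industry_name = ["a", None, "b", None, "c"]
product = ["x", "y", None, "z", "w"]

print(nullity_correlation(industry_code, industry_name))
print(nullity_correlation(industry_code, product))
```

Columns that go missing together score close to 1, while columns that are missing in different rows score at or below 0. In this dataset 대표업종번호 and 업종명 both report 14 missing values, so they may well be missing in the same rows, although the report text alone does not confirm that.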

Sample

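First rows

| | 순번 | 회사명 | 종업원수 | 생산품 | 공장대표주소(지번) | 대표업종번호 | 업종명 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | Unnamed: 11 | Unnamed: 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | | 1 | 임대업 | 부산광역시 강서구 신호동 210-1번지 | 68112 | 비주거용 건물 임대업 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
| 1 | 2 | 나현섬유 | 5 | 어망 | 부산광역시 강서구 송정동 1742-1번지 1742-1번지 | 13922 | 어망 및 기타 끈 가공품 제조업 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
| 2 | 3 | 수정공업사 | 0 | 스텐 닛불, 소켓 | 부산광역시 강서구 미음동 0번지 부산신항배후 국제산업물류도시(1단계) 일반산업단지 I33블록 6놋트 | 25941 | 볼트 및 너트류 제조업 외 2종 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
| 3 | 4 | (유)고려금속 | 6 | 밴(VANE) | 부산광역시 강서구 강동동 2044-2번지 | 25999 | 그 외 기타 분류 안된 금속 가공 제품 제조업 외 1종 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
| 4 | 5 | (유)비씨코로텍 | 7 | 분사기(세척기) | 부산광역시 강서구 녹산동 110번지 | 31114 | 선박 구성 부분품 제조업 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
| 5 | 6 | (유)프라이그파이스트 | 42 | 선박용 밸브 원격제어장치 | 부산광역시 강서구 송정동 1480-7번지 | 29299 | 그 외 기타 특수목적용 기계 제조업 외 1종 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
| 6 | 7 | (유)해강운수 | 0 | 주차장 | 부산광역시 강서구 화전동 582-2 번지 | 52915 | 주차장 운영업 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
| 7 | 8 | (재)부산경제진흥원 | 43 | 일반스포츠화,특수화,금형 | 부산광역시 강서구 송정동 1735-1번지 | 15219 | 기타 신발 제조업 외 1종 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
| 8 | 9 | (재)부산경제진흥원 녹산출장소 | 1 | 산업단체, 비주거용 건물 임대업 | 부산광역시 강서구 송정동 1709-2번지 | 94110 | 산업 단체 외 1종 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
| 9 | 10 | (재)부산테크노파크 | 25 | 정보서비스 | 부산광역시 강서구 미음동 1537-1번지 | 63999 | 그 외 기타 정보 서비스업 외 2종 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |

Last rows

| | 순번 | 회사명 | 종업원수 | 생산품 | 공장대표주소(지번) | 대표업종번호 | 업종명 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | Unnamed: 11 | Unnamed: 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4867 | 4869 | 흥일폴리켐 녹산1공장 | 2 | 우레탄 레진 등 | 부산광역시 강서구 송정동 1724-15번지 | 20421 | 계면활성제 제조업 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
| 4868 | 4870 | 희성기연 | 18 | 압연기용 롤러 | 부산광역시 강서구 송정동 1680-4번지 | 29230 | 금속 주조 및 기타 야금용 기계 제조업 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
| 4869 | 4871 | 희정산업 | 4 | 금속판재 | 부산광역시 강서구 신호동 290-6번지 | 25112 | 구조용 금속 판제품 및 공작물 제조업 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
| 4870 | 4872 | 히아브크레인(주) | 5 | 굴절식 크레인 | 부산광역시 강서구 범방2로 36 (미음동) | 29169 | 기타 물품 취급장비 제조업 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
| 4871 | 4873 | 히타엔지니어링 | 4 | 히타,열처리로 | 부산광역시 강서구 대저1동 1358-64번지 | 29150 | 산업용 오븐, 노 및 노용 버너 제조업 외 1종 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
| 4872 | 4874 | 히타치금속한국(주) | 12 | 철도차량부품소재, 금형소재 | 부산광역시 강서구 화전동 586-4번지 | 25924 | 절삭가공 및 유사처리업 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
| 4873 | 4875 | 힐티코리아(주) | 36 | 케이블트레이서포트 | 부산광역시 강서구 녹산동 581-3번지 | 25924 | 절삭가공 및 유사처리업 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
| 4874 | 4876 | 힘코이엔지 | 3 | 기계부품 | 부산광역시 강서구 대저1동 2734-3번지 | 29142 | 기어 및 동력전달장치 제조업 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
| 4875 | 4877 | 힘텍 | 2 | 선박덕트, 가스파이프 등 | 부산광역시 강서구 송정동 1481-2번지 | 31114 | 선박 구성 부분품 제조업 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
| 4876 | 4878 | 힙텍 | 1 | 유압(수압,기압) 테스트용 기계 | 부산광역시 강서구 대저1동 337-67번지 | 29299 | 그 외 기타 특수목적용 기계 제조업 외 1종 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |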