Overview

Dataset statistics

Number of variables: 9
Number of observations: 6911
Missing cells: 11083
Missing cells (%): 17.8%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 499.6 KiB
Average record size in memory: 74.0 B
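
The derived figures in this table can be cross-checked from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (variables × observations) and that the average record size is the total memory divided by the number of observations; the variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 9
n_observations = 6911
missing_cells = 11083
total_size_kib = 499.6

# Assumption: percentages are relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: average record size = total memory / number of rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells:       {total_cells}")
print(f"missing cells (%): {missing_pct:.1f}")
print(f"avg record size B: {avg_record_bytes:.1f}")
```

Under these assumptions both derived values agree with the table after rounding to one decimal place.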

Variable types

Text: 6
Numeric: 2
Categorical: 1

Dataset

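Description: Data on companies located in the Banwol National Industrial Complex (반월국가산업단지). It provides the factory name (공장명), address (주소), industry name (업종명), contact number (연락처), number of employees (종업원수), company classification (기업구분), total headcount (총원), product name (생산품명), and product classification (생산품분류).
Author: Ansan City, Gyeonggi Province (경기도 안산시)
URL: https://www.data.go.kr/data/15112512/fileData.do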

Alerts

Dataset has 1 (< 0.1%) duplicate rows [Duplicates]
종업원수 is highly overall correlated with 총원 and 1 other field [High correlation]
총원 is highly overall correlated with 종업원수 [High correlation]
기업구분 is highly overall correlated with 종업원수 [High correlation]
기업구분 is highly imbalanced (81.8%) [Imbalance]
연락처 has 1244 (18.0%) missing values [Missing]
종업원수 has 1681 (24.3%) missing values [Missing]
총원 has 5996 (86.8%) missing values [Missing]
생산품분류 has 2121 (30.7%) missing values [Missing]
총원 has 125 (1.8%) zeros [Zeros]
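
The missing-value percentages in these alerts follow directly from the counts and the 6911 observations. A minimal check, assuming each percentage is the per-column missing count divided by the number of rows:

```python
n_observations = 6911

# Missing-value counts as listed in the alerts.
missing_counts = {
    "연락처": 1244,
    "종업원수": 1681,
    "총원": 5996,
    "생산품분류": 2121,
}

# Percentage of rows without a value, rounded as in the report.
missing_pct = {
    column: round(100 * count / n_observations, 1)
    for column, count in missing_counts.items()
}

for column, pct in missing_pct.items():
    print(f"{column}: {pct}% missing")
```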

Reproduction

Analysis started: 2023-12-12 13:32:19.741700
Analysis finished: 2023-12-12 13:32:22.349227
Duration: 2.61 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
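
A report of this kind could be regenerated roughly as follows. This is a sketch, not the configuration that was actually used: the file names, the report title and the `cp949` encoding are assumptions, and the original `config.json` is not reproduced here.

```python
def build_report(csv_path: str, html_path: str, encoding: str = "cp949") -> None:
    """Profile a CSV file and write an HTML report.

    The paths and the default encoding are placeholders chosen for
    illustration; adjust them to the downloaded file.
    """
    # Imported lazily so the function can be defined even when the
    # libraries are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    profile = ProfileReport(df, title="Banwol National Industrial Complex companies")
    profile.to_file(html_path)
```

A call such as `build_report("banwol_companies.csv", "report.html")` would then write the report, where both file names are hypothetical.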

Variables

공장명
Text

Distinct: 6346
Distinct (%): 91.8%
Missing: 0
Missing (%): 0.0%
Memory size: 54.1 KiB

Length

Max length: 30
Median length: 24
Mean length: 6.8301259
Min length: 1

Characters and Unicode

Total characters: 47203
Distinct characters: 669
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5880
Unique (%): 85.1%
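
The report lists "Distinct" and "Unique" separately. The sketch below assumes the usual reading: distinct is the number of different values, and unique is the number of values that occur exactly once. The toy column is hypothetical; the two percentages are recomputed from the counts reported for this variable.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique): different values, and values seen exactly once."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique

# Hypothetical toy column, not taken from the dataset.
toy = ["A", "B", "B", "C", "C", "C", "D"]
distinct, unique = distinct_and_unique(toy)

# Percentages for this variable, relative to 6911 observations.
distinct_pct = round(100 * 6346 / 6911, 1)
unique_pct = round(100 * 5880 / 6911, 1)
```

In the toy column there are four distinct values, of which only "A" and "D" are unique.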

Sample

1st row(주)BMB산업
2nd row(주)ICP
3rd row(주)거림
4th row(주)건우정공
5th row(주)건우정밀
Value | Count | Frequency (%)
주식회사 | 156 | 2.1%
tech | 26 | 0.4%
2공장 | 17 | 0.2%
제2공장 | 15 | 0.2%
안산공장 | 13 | 0.2%
안산지점 | 8 | 0.1%
eng | 7 | 0.1%
안산2공장 | 7 | 0.1%
지점 | 6 | 0.1%
주)해성아이다 | 6 | 0.1%
Other values (6389) | 7051 | 96.4%

Most occurring characters

ValueCountFrequency (%)
4271
 
9.0%
( 4097
 
8.7%
) 4096
 
8.7%
1771
 
3.8%
1485
 
3.1%
991
 
2.1%
900
 
1.9%
840
 
1.8%
611
 
1.3%
606
 
1.3%
Other values (659) 27535
58.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 37267 | 79.0%
Open Punctuation | 4097 | 8.7%
Close Punctuation | 4097 | 8.7%
Uppercase Letter | 975 | 2.1%
Space Separator | 402 | 0.9%
Lowercase Letter | 135 | 0.3%
Decimal Number | 114 | 0.2%
Other Punctuation | 89 | 0.2%
Dash Punctuation | 19 | < 0.1%
Other Symbol | 7 | < 0.1%
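
The category labels correspond to Unicode general categories. The following sketch shows how such counts could be obtained with Python's standard `unicodedata` module, applied to the first sample row of this variable. The mapping from two-letter codes to labels is written out by hand here and is not taken from the profiling tool.

```python
import unicodedata
from collections import Counter

# Two-letter general-category codes mapped to the labels used in the table.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Zs": "Space Separator",
    "So": "Other Symbol",
}

def category_counts(text: str) -> Counter:
    """Count the characters of `text` per Unicode general category."""
    counts = Counter()
    for ch in text:
        code = unicodedata.category(ch)
        # Fall back to the raw code for categories not listed above.
        counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

sample = "(주)BMB산업"  # first sample row of the variable
counts = category_counts(sample)
```

For this sample the Hangul syllables fall under Other Letter, which is why that category dominates the table.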

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4271
 
11.5%
1771
 
4.8%
1485
 
4.0%
991
 
2.7%
900
 
2.4%
840
 
2.3%
611
 
1.6%
606
 
1.6%
567
 
1.5%
563
 
1.5%
Other values (595) 24662
66.2%
Uppercase Letter
ValueCountFrequency (%)
E 109
 
11.2%
S 96
 
9.8%
T 84
 
8.6%
C 79
 
8.1%
N 77
 
7.9%
G 59
 
6.1%
M 57
 
5.8%
H 56
 
5.7%
A 44
 
4.5%
P 39
 
4.0%
Other values (15) 275
28.2%
Lowercase Letter
ValueCountFrequency (%)
e 25
18.5%
n 16
11.9%
c 14
10.4%
t 11
8.1%
o 11
8.1%
i 9
 
6.7%
h 9
 
6.7%
r 8
 
5.9%
a 7
 
5.2%
l 5
 
3.7%
Other values (9) 20
14.8%
Decimal Number
ValueCountFrequency (%)
2 70
61.4%
1 18
 
15.8%
3 13
 
11.4%
0 3
 
2.6%
4 3
 
2.6%
6 3
 
2.6%
7 2
 
1.8%
5 1
 
0.9%
9 1
 
0.9%
Other Punctuation
ValueCountFrequency (%)
. 58
65.2%
& 28
31.5%
, 2
 
2.2%
/ 1
 
1.1%
Close Punctuation
ValueCountFrequency (%)
) 4096
> 99.9%
1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 4097
100.0%
Space Separator
ValueCountFrequency (%)
402
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 19
100.0%
Other Symbol
ValueCountFrequency (%)
7
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 37274
79.0%
Common 8819
 
18.7%
Latin 1110
 
2.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4271
 
11.5%
1771
 
4.8%
1485
 
4.0%
991
 
2.7%
900
 
2.4%
840
 
2.3%
611
 
1.6%
606
 
1.6%
567
 
1.5%
563
 
1.5%
Other values (596) 24669
66.2%
Latin
ValueCountFrequency (%)
E 109
 
9.8%
S 96
 
8.6%
T 84
 
7.6%
C 79
 
7.1%
N 77
 
6.9%
G 59
 
5.3%
M 57
 
5.1%
H 56
 
5.0%
A 44
 
4.0%
P 39
 
3.5%
Other values (34) 410
36.9%
Common
ValueCountFrequency (%)
( 4097
46.5%
) 4096
46.4%
402
 
4.6%
2 70
 
0.8%
. 58
 
0.7%
& 28
 
0.3%
- 19
 
0.2%
1 18
 
0.2%
3 13
 
0.1%
0 3
 
< 0.1%
Other values (9) 15
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
Hangul 37267
79.0%
ASCII 9928
 
21.0%
None 8
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
4271
 
11.5%
1771
 
4.8%
1485
 
4.0%
991
 
2.7%
900
 
2.4%
840
 
2.3%
611
 
1.6%
606
 
1.6%
567
 
1.5%
563
 
1.5%
Other values (595) 24662
66.2%
ASCII
ValueCountFrequency (%)
( 4097
41.3%
) 4096
41.3%
402
 
4.0%
E 109
 
1.1%
S 96
 
1.0%
T 84
 
0.8%
C 79
 
0.8%
N 77
 
0.8%
2 70
 
0.7%
G 59
 
0.6%
Other values (52) 759
 
7.6%
None
ValueCountFrequency (%)
7
87.5%
1
 
12.5%

주소
Text

Distinct: 5866
Distinct (%): 85.0%
Missing: 12
Missing (%): 0.2%
Memory size: 54.1 KiB

Length

Max length: 94
Median length: 71
Mean length: 37.866212
Min length: 16

Characters and Unicode

Total characters: 261239
Distinct characters: 427
Distinct categories: 12
Distinct scripts: 3
Distinct blocks: 4

Unique

Unique: 5290
Unique (%): 76.7%

Sample

1st row경기도 안산시 단원구 해안로 77, 505-10 (목내동)
2nd row경기도 안산시 단원구 산단로67번길 134, (16-3-1) (목내동)
3rd row경기도 안산시 단원구 엠티브이12로21번길 18, 시화MTV 4사 106호 (성곡동)
4th row경기도 안산시 단원구 동산로27번길 15,(11B-2L) (원시동)
5th row경기도 안산시 단원구 신원로 314, 디동 (454-1, 16B 19L) (목내동)
Value | Count | Frequency (%)
경기도 | 6895 | 12.7%
안산시 | 6851 | 12.6%
단원구 | 6436 | 11.8%
성곡동 | 2463 | 4.5%
원시동 | 1419 | 2.6%
신길동 | 1015 | 1.9%
산단로 | 957 | 1.8%
목내동 | 873 | 1.6%
별망로 | 482 | 0.9%
상록구 | 415 | 0.8%
Other values (4698) | 26568 | 48.9%

Most occurring characters

ValueCountFrequency (%)
47633
 
18.2%
1 9451
 
3.6%
9418
 
3.6%
9188
 
3.5%
8871
 
3.4%
8513
 
3.3%
8348
 
3.2%
( 7922
 
3.0%
) 7921
 
3.0%
7899
 
3.0%
Other values (417) 136075
52.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 136334 | 52.2%
Decimal Number | 48337 | 18.5%
Space Separator | 47633 | 18.2%
Open Punctuation | 7956 | 3.0%
Close Punctuation | 7955 | 3.0%
Other Punctuation | 7232 | 2.8%
Uppercase Letter | 2887 | 1.1%
Dash Punctuation | 2668 | 1.0%
Letter Number | 181 | 0.1%
Math Symbol | 31 | < 0.1%
Other values (2) | 25 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
9418
 
6.9%
9188
 
6.7%
8871
 
6.5%
8513
 
6.2%
8348
 
6.1%
7899
 
5.8%
7182
 
5.3%
7006
 
5.1%
6949
 
5.1%
6903
 
5.1%
Other values (365) 56057
41.1%
Uppercase Letter
ValueCountFrequency (%)
A 758
26.3%
T 439
15.2%
B 335
11.6%
R 301
 
10.4%
K 296
 
10.3%
F 178
 
6.2%
M 143
 
5.0%
V 130
 
4.5%
L 126
 
4.4%
C 50
 
1.7%
Other values (11) 131
 
4.5%
Decimal Number
ValueCountFrequency (%)
1 9451
19.6%
2 6589
13.6%
0 5275
10.9%
3 5057
10.5%
4 4536
9.4%
5 4499
9.3%
6 3982
8.2%
7 3573
 
7.4%
8 2897
 
6.0%
9 2478
 
5.1%
Lowercase Letter
ValueCountFrequency (%)
v 6
25.0%
t 6
25.0%
m 6
25.0%
f 3
12.5%
c 2
 
8.3%
n 1
 
4.2%
Other Punctuation
ValueCountFrequency (%)
, 7223
99.9%
. 5
 
0.1%
/ 2
 
< 0.1%
& 1
 
< 0.1%
· 1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 7922
99.6%
[ 34
 
0.4%
Close Punctuation
ValueCountFrequency (%)
) 7921
99.6%
] 34
 
0.4%
Letter Number
ValueCountFrequency (%)
180
99.4%
1
 
0.6%
Space Separator
ValueCountFrequency (%)
47633
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2668
100.0%
Math Symbol
ValueCountFrequency (%)
~ 31
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 136334
52.2%
Common 121813
46.6%
Latin 3092
 
1.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
9418
 
6.9%
9188
 
6.7%
8871
 
6.5%
8513
 
6.2%
8348
 
6.1%
7899
 
5.8%
7182
 
5.3%
7006
 
5.1%
6949
 
5.1%
6903
 
5.1%
Other values (365) 56057
41.1%
Latin
ValueCountFrequency (%)
A 758
24.5%
T 439
14.2%
B 335
10.8%
R 301
 
9.7%
K 296
 
9.6%
180
 
5.8%
F 178
 
5.8%
M 143
 
4.6%
V 130
 
4.2%
L 126
 
4.1%
Other values (19) 206
 
6.7%
Common
ValueCountFrequency (%)
47633
39.1%
1 9451
 
7.8%
( 7922
 
6.5%
) 7921
 
6.5%
, 7223
 
5.9%
2 6589
 
5.4%
0 5275
 
4.3%
3 5057
 
4.2%
4 4536
 
3.7%
5 4499
 
3.7%
Other values (13) 15707
 
12.9%

Most occurring blocks

ValueCountFrequency (%)
Hangul 136334
52.2%
ASCII 124723
47.7%
Number Forms 181
 
0.1%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
47633
38.2%
1 9451
 
7.6%
( 7922
 
6.4%
) 7921
 
6.4%
, 7223
 
5.8%
2 6589
 
5.3%
0 5275
 
4.2%
3 5057
 
4.1%
4 4536
 
3.6%
5 4499
 
3.6%
Other values (39) 18617
 
14.9%
Hangul
ValueCountFrequency (%)
9418
 
6.9%
9188
 
6.7%
8871
 
6.5%
8513
 
6.2%
8348
 
6.1%
7899
 
5.8%
7182
 
5.3%
7006
 
5.1%
6949
 
5.1%
6903
 
5.1%
Other values (365) 56057
41.1%
Number Forms
ValueCountFrequency (%)
180
99.4%
1
 
0.6%
None
ValueCountFrequency (%)
· 1
100.0%

업종명
Text

Distinct: 1755
Distinct (%): 25.5%
Missing: 22
Missing (%): 0.3%
Memory size: 54.1 KiB

Length

Max length: 605
Median length: 287
Mean length: 27.467121
Min length: 3

Characters and Unicode

Total characters: 189221
Distinct characters: 353
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 3

Unique

Unique: 1318
Unique (%): 19.1%

Sample

1st row그 외 기타 1차 철강 제조업
2nd row자동차 재제조 부품 제조업,절삭가공 및 유사처리업,그 외 기타 금속가공업,자동차 엔진용 신품 부품 제조업
3rd row인쇄회로기판용 적층판 제조업
4th row주형 및 금형 제조업
5th row자동차용 신품 제동장치 제조업,그 외 자동차용 신품 부품 제조업,자동차용 신품 조향장치 및 현가 장치 제조업,자동차 재제조 부품 제조업
ValueCountFrequency (%)
제조업 5654
 
11.9%
4967
 
10.4%
기타 3247
 
6.8%
1872
 
3.9%
1085
 
2.3%
인쇄회로기판 1042
 
2.2%
제조업,그 735
 
1.5%
금속 713
 
1.5%
기계 687
 
1.4%
전기 619
 
1.3%
Other values (1145) 26931
56.6%

Most occurring characters

ValueCountFrequency (%)
40663
21.5%
13536
 
7.2%
12779
 
6.8%
12211
 
6.5%
9761
 
5.2%
, 6643
 
3.5%
4967
 
2.6%
3705
 
2.0%
3332
 
1.8%
3162
 
1.7%
Other values (343) 78462
41.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 141509
74.8%
Space Separator 40663
 
21.5%
Other Punctuation 6741
 
3.6%
Decimal Number 266
 
0.1%
Close Punctuation 21
 
< 0.1%
Open Punctuation 21
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
13536
 
9.6%
12779
 
9.0%
12211
 
8.6%
9761
 
6.9%
4967
 
3.5%
3705
 
2.6%
3332
 
2.4%
3162
 
2.2%
2588
 
1.8%
2419
 
1.7%
Other values (337) 73049
51.6%
Other Punctuation
ValueCountFrequency (%)
, 6643
98.5%
. 98
 
1.5%
Space Separator
ValueCountFrequency (%)
40663
100.0%
Decimal Number
ValueCountFrequency (%)
1 266
100.0%
Close Punctuation
ValueCountFrequency (%)
) 21
100.0%
Open Punctuation
ValueCountFrequency (%)
( 21
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 141509
74.8%
Common 47712
 
25.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
13536
 
9.6%
12779
 
9.0%
12211
 
8.6%
9761
 
6.9%
4967
 
3.5%
3705
 
2.6%
3332
 
2.4%
3162
 
2.2%
2588
 
1.8%
2419
 
1.7%
Other values (337) 73049
51.6%
Common
ValueCountFrequency (%)
40663
85.2%
, 6643
 
13.9%
1 266
 
0.6%
. 98
 
0.2%
) 21
 
< 0.1%
( 21
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 141473
74.8%
ASCII 47712
 
25.2%
Compat Jamo 36
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
40663
85.2%
, 6643
 
13.9%
1 266
 
0.6%
. 98
 
0.2%
) 21
 
< 0.1%
( 21
 
< 0.1%
Hangul
ValueCountFrequency (%)
13536
 
9.6%
12779
 
9.0%
12211
 
8.6%
9761
 
6.9%
4967
 
3.5%
3705
 
2.6%
3332
 
2.4%
3162
 
2.2%
2588
 
1.8%
2419
 
1.7%
Other values (336) 73013
51.6%
Compat Jamo
ValueCountFrequency (%)
36
100.0%

연락처
Text

MISSING 

Distinct: 5081
Distinct (%): 89.7%
Missing: 1244
Missing (%): 18.0%
Memory size: 54.1 KiB

Length

Max length: 13
Median length: 12
Mean length: 12.019234
Min length: 8

Characters and Unicode

Total characters: 68113
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 4588
Unique (%): 81.0%

Sample

1st row031-491-6400
2nd row031-494-5511
3rd row031-508-0504
4th row031-826-0954
5th row031-491-8665
Value | Count | Frequency (%)
031-490-7294 | 5 | 0.1%
031-408-6987 | 5 | 0.1%
031-493-8555 | 5 | 0.1%
031-493-6561 | 5 | 0.1%
031-480-3849 | 4 | 0.1%
031-415-3653 | 4 | 0.1%
031-492-9315 | 4 | 0.1%
031-491-3536 | 4 | 0.1%
031-494-2222 | 4 | 0.1%
031-432-2121 | 4 | 0.1%
Other values (5071) | 5623 | 99.2%

Most occurring characters

Value | Count | Frequency (%)
- | 11274 | 16.6%
0 | 9989 | 14.7%
3 | 9789 | 14.4%
1 | 9311 | 13.7%
4 | 6969 | 10.2%
9 | 5151 | 7.6%
8 | 3514 | 5.2%
5 | 3328 | 4.9%
2 | 3206 | 4.7%
7 | 3081 | 4.5%
Other values (4) | 2501 | 3.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 56830
83.4%
Dash Punctuation 11274
 
16.6%
Uppercase Letter 9
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 9989
17.6%
3 9789
17.2%
1 9311
16.4%
4 6969
12.3%
9 5151
9.1%
8 3514
 
6.2%
5 3328
 
5.9%
2 3206
 
5.6%
7 3081
 
5.4%
6 2492
 
4.4%
Uppercase Letter
ValueCountFrequency (%)
A 3
33.3%
R 3
33.3%
S 3
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 11274
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 68104
> 99.9%
Latin 9
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
- 11274
16.6%
0 9989
14.7%
3 9789
14.4%
1 9311
13.7%
4 6969
10.2%
9 5151
7.6%
8 3514
 
5.2%
5 3328
 
4.9%
2 3206
 
4.7%
7 3081
 
4.5%
Latin
ValueCountFrequency (%)
A 3
33.3%
R 3
33.3%
S 3
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 68113
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 11274
16.6%
0 9989
14.7%
3 9789
14.4%
1 9311
13.7%
4 6969
10.2%
9 5151
7.6%
8 3514
 
5.2%
5 3328
 
4.9%
2 3206
 
4.7%
7 3081
 
4.5%
Other values (4) 2501
 
3.7%

종업원수
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 209
Distinct (%): 4.0%
Missing: 1681
Missing (%): 24.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 21.304015
Minimum: 0
Maximum: 1500
Zeros: 64
Zeros (%): 0.9%
Negative: 0
Negative (%): 0.0%
Memory size: 60.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 4
Median: 8
Q3: 18
95-th percentile: 80
Maximum: 1500
Range: 1500
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 54.544913
Coefficient of variation (CV): 2.5603114
Kurtosis: 194.25454
Mean: 21.304015
Median Absolute Deviation (MAD): 5
Skewness: 10.915748
Sum: 111420
Variance: 2975.1476
Monotonicity: Not monotonic
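
Several of the statistics for 종업원수 are related by simple identities, which gives a quick consistency check. The sketch assumes the standard definitions (CV = standard deviation / mean, variance = standard deviation squared, sum = mean × number of non-missing rows, IQR = Q3 − Q1); the input values are copied from the report.

```python
# Values copied from the 종업원수 statistics.
n_observations = 6911
n_missing = 1681
mean = 21.304015
std = 54.544913
q1, q3 = 4, 18

n_valid = n_observations - n_missing  # rows that have a value
cv = std / mean                       # coefficient of variation
variance = std ** 2
total = mean * n_valid                # sum over all non-missing rows
iqr = q3 - q1

print(f"valid rows: {n_valid}")
print(f"CV:         {cv:.4f}")
print(f"variance:   {variance:.2f}")
print(f"sum:        {total:.0f}")
print(f"IQR:        {iqr}")
```

Each recomputed value matches the reported one up to rounding of the inputs.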
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
2 | 453 | 6.6%
3 | 452 | 6.5%
4 | 441 | 6.4%
5 | 419 | 6.1%
6 | 331 | 4.8%
7 | 262 | 3.8%
8 | 211 | 3.1%
10 | 198 | 2.9%
9 | 165 | 2.4%
1 | 148 | 2.1%
Other values (199) | 2150 | 31.1%
(Missing) | 1681 | 24.3%

Minimum 10 values

Value | Count | Frequency (%)
0 | 64 | 0.9%
1 | 148 | 2.1%
2 | 453 | 6.6%
3 | 452 | 6.5%
4 | 441 | 6.4%
5 | 419 | 6.1%
6 | 331 | 4.8%
7 | 262 | 3.8%
8 | 211 | 3.1%
9 | 165 | 2.4%

Maximum 10 values

Value | Count | Frequency (%)
1500 | 1 | < 0.1%
1200 | 1 | < 0.1%
870 | 1 | < 0.1%
775 | 1 | < 0.1%
741 | 2 | < 0.1%
621 | 1 | < 0.1%
580 | 1 | < 0.1%
579 | 1 | < 0.1%
559 | 1 | < 0.1%
554 | 1 | < 0.1%

기업구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 54.1 KiB
소기업: 6485
중기업: 388
대기업: 31
<NA>: 7

Length

Max length: 4
Median length: 3
Mean length: 3.0010129
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row소기업
2nd row소기업
3rd row중기업
4th row소기업
5th row소기업

Common Values

Value | Count | Frequency (%)
소기업 | 6485 | 93.8%
중기업 | 388 | 5.6%
대기업 | 31 | 0.4%
<NA> | 7 | 0.1%
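
The 81.8% imbalance reported for 기업구분 in the alerts can be reproduced from these counts. The sketch assumes the score is one minus the Shannon entropy of the class distribution, normalised by the maximum entropy for four classes; this formula is an assumption about how the tool computes it, although it does yield the reported figure.

```python
import math

# Category counts from the "Common Values" table.
counts = {"소기업": 6485, "중기업": 388, "대기업": 31, "<NA>": 7}

total = sum(counts.values())
probabilities = [c / total for c in counts.values()]

# Shannon entropy of the class distribution (natural logarithm).
entropy = -sum(p * math.log(p) for p in probabilities if p > 0)

# Normalise by the maximum entropy for this many classes, then invert:
# 0 means perfectly balanced, 1 means a single class holds every row.
imbalance = 1 - entropy / math.log(len(counts))

print(f"imbalance: {100 * imbalance:.1f}%")
```

With 93.8% of rows in 소기업 the entropy is low, so the score is close to 1.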

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
소기업 | 6485 | 93.8%
중기업 | 388 | 5.6%
대기업 | 31 | 0.4%
na | 7 | 0.1%

총원
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 155
Distinct (%): 16.9%
Missing: 5996
Missing (%): 86.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 44.697268
Minimum: 0
Maximum: 4000
Zeros: 125
Zeros (%): 1.8%
Negative: 0
Negative (%): 0.0%
Memory size: 60.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 5
Median: 13
Q3: 38
95-th percentile: 166.7
Maximum: 4000
Range: 4000
Interquartile range (IQR): 33

Descriptive statistics

Standard deviation: 159.87564
Coefficient of variation (CV): 3.5768549
Kurtosis: 414.67026
Mean: 44.697268
Median Absolute Deviation (MAD): 11
Skewness: 17.680004
Sum: 40898
Variance: 25560.22
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0 | 125 | 1.8%
5 | 52 | 0.8%
6 | 41 | 0.6%
4 | 36 | 0.5%
10 | 34 | 0.5%
3 | 31 | 0.4%
8 | 30 | 0.4%
7 | 27 | 0.4%
9 | 23 | 0.3%
20 | 22 | 0.3%
Other values (145) | 494 | 7.1%
(Missing) | 5996 | 86.8%

Minimum 10 values

Value | Count | Frequency (%)
0 | 125 | 1.8%
1 | 7 | 0.1%
2 | 12 | 0.2%
3 | 31 | 0.4%
4 | 36 | 0.5%
5 | 52 | 0.8%
6 | 41 | 0.6%
7 | 27 | 0.4%
8 | 30 | 0.4%
9 | 23 | 0.3%

Maximum 10 values

Value | Count | Frequency (%)
4000 | 1 | < 0.1%
1100 | 1 | < 0.1%
974 | 1 | < 0.1%
773 | 1 | < 0.1%
638 | 1 | < 0.1%
597 | 1 | < 0.1%
579 | 1 | < 0.1%
577 | 1 | < 0.1%
559 | 1 | < 0.1%
556 | 1 | < 0.1%

생산품명
Text

Distinct: 4802
Distinct (%): 69.6%
Missing: 7
Missing (%): 0.1%
Memory size: 54.1 KiB

Length

Max length: 66
Median length: 54
Mean length: 8.1777231
Min length: 1

Characters and Unicode

Total characters: 56459
Distinct characters: 753
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3

Unique

Unique: 4340
Unique (%): 62.9%

Sample

1st row방음벽 지주, 태양광 골조자재
2nd row모터축, 자동차 엔진용부품 등, 전자, 자동차부품 CNC 절삭가공
3rd rowFPCB 실장
4th row금형
5th row자동차부품
ValueCountFrequency (%)
287
 
2.5%
인쇄회로기판 268
 
2.3%
225
 
1.9%
pcb 169
 
1.5%
부품 162
 
1.4%
금형 149
 
1.3%
자동차부품 149
 
1.3%
기계부품 139
 
1.2%
전자부품 136
 
1.2%
121
 
1.0%
Other values (5285) 9821
84.5%

Most occurring characters

ValueCountFrequency (%)
4729
 
8.4%
2573
 
4.6%
, 2137
 
3.8%
1791
 
3.2%
1349
 
2.4%
1272
 
2.3%
991
 
1.8%
985
 
1.7%
956
 
1.7%
874
 
1.5%
Other values (743) 38802
68.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 44828 | 79.4%
Space Separator | 4729 | 8.4%
Uppercase Letter | 2868 | 5.1%
Other Punctuation | 2197 | 3.9%
Lowercase Letter | 1002 | 1.8%
Close Punctuation | 364 | 0.6%
Open Punctuation | 363 | 0.6%
Decimal Number | 74 | 0.1%
Dash Punctuation | 32 | 0.1%
Math Symbol | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2573
 
5.7%
1791
 
4.0%
1349
 
3.0%
1272
 
2.8%
991
 
2.2%
985
 
2.2%
956
 
2.1%
874
 
1.9%
871
 
1.9%
815
 
1.8%
Other values (667) 32351
72.2%
Uppercase Letter
ValueCountFrequency (%)
C 449
15.7%
P 406
14.2%
B 304
10.6%
E 215
 
7.5%
L 208
 
7.3%
D 188
 
6.6%
F 125
 
4.4%
S 122
 
4.3%
T 113
 
3.9%
A 107
 
3.7%
Other values (16) 631
22.0%
Lowercase Letter
ValueCountFrequency (%)
c 99
 
9.9%
e 98
 
9.8%
p 90
 
9.0%
r 81
 
8.1%
o 75
 
7.5%
b 70
 
7.0%
i 65
 
6.5%
t 62
 
6.2%
l 56
 
5.6%
a 56
 
5.6%
Other values (15) 250
25.0%
Decimal Number
ValueCountFrequency (%)
2 19
25.7%
1 19
25.7%
3 15
20.3%
0 10
13.5%
6 4
 
5.4%
8 2
 
2.7%
5 2
 
2.7%
7 1
 
1.4%
9 1
 
1.4%
4 1
 
1.4%
Other Punctuation
ValueCountFrequency (%)
, 2137
97.3%
. 26
 
1.2%
/ 25
 
1.1%
' 3
 
0.1%
· 3
 
0.1%
% 2
 
0.1%
& 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 362
99.5%
] 2
 
0.5%
Open Punctuation
ValueCountFrequency (%)
( 361
99.4%
[ 2
 
0.6%
Space Separator
ValueCountFrequency (%)
4729
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 32
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 44828
79.4%
Common 7761
 
13.7%
Latin 3870
 
6.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2573
 
5.7%
1791
 
4.0%
1349
 
3.0%
1272
 
2.8%
991
 
2.2%
985
 
2.2%
956
 
2.1%
874
 
1.9%
871
 
1.9%
815
 
1.8%
Other values (667) 32351
72.2%
Latin
ValueCountFrequency (%)
C 449
 
11.6%
P 406
 
10.5%
B 304
 
7.9%
E 215
 
5.6%
L 208
 
5.4%
D 188
 
4.9%
F 125
 
3.2%
S 122
 
3.2%
T 113
 
2.9%
A 107
 
2.8%
Other values (41) 1633
42.2%
Common
ValueCountFrequency (%)
4729
60.9%
, 2137
27.5%
) 362
 
4.7%
( 361
 
4.7%
- 32
 
0.4%
. 26
 
0.3%
/ 25
 
0.3%
2 19
 
0.2%
1 19
 
0.2%
3 15
 
0.2%
Other values (15) 36
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 44828
79.4%
ASCII 11628
 
20.6%
None 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4729
40.7%
, 2137
18.4%
C 449
 
3.9%
P 406
 
3.5%
) 362
 
3.1%
( 361
 
3.1%
B 304
 
2.6%
E 215
 
1.8%
L 208
 
1.8%
D 188
 
1.6%
Other values (65) 2269
19.5%
Hangul
ValueCountFrequency (%)
2573
 
5.7%
1791
 
4.0%
1349
 
3.0%
1272
 
2.8%
991
 
2.2%
985
 
2.2%
956
 
2.1%
874
 
1.9%
871
 
1.9%
815
 
1.8%
Other values (667) 32351
72.2%
None
ValueCountFrequency (%)
· 3
100.0%

생산품분류
Text

MISSING 

Distinct: 710
Distinct (%): 14.8%
Missing: 2121
Missing (%): 30.7%
Memory size: 54.1 KiB

Length

Max length: 40
Median length: 34
Mean length: 8.0286013
Min length: 2

Characters and Unicode

Total characters: 38457
Distinct characters: 218
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2

Unique

Unique: 478
Unique (%): 10.0%

Sample

1st row건축 또는 건축용 재료,조명기기
2nd row자동차 및 자동차부품,동력기계
3rd row가구,전기통신기계
4th row계측제어분석기
5th row자동차 및 자동차부품
ValueCountFrequency (%)
1089
 
12.8%
집적회로 311
 
3.7%
반도체 299
 
3.5%
또는 275
 
3.2%
건축용 275
 
3.2%
화학품 274
 
3.2%
자동차 267
 
3.1%
자동차부품 253
 
3.0%
컴퓨터 245
 
2.9%
계측제어분석기 244
 
2.9%
Other values (750) 4986
58.5%

Most occurring characters

ValueCountFrequency (%)
3942
 
10.3%
3728
 
9.7%
, 1520
 
4.0%
1519
 
3.9%
1373
 
3.6%
1139
 
3.0%
992
 
2.6%
968
 
2.5%
893
 
2.3%
853
 
2.2%
Other values (208) 21530
56.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 32934 | 85.6%
Space Separator | 3728 | 9.7%
Other Punctuation | 1793 | 4.7%
Uppercase Letter | 1 | < 0.1%
Decimal Number | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3942
 
12.0%
1519
 
4.6%
1373
 
4.2%
1139
 
3.5%
992
 
3.0%
968
 
2.9%
893
 
2.7%
853
 
2.6%
841
 
2.6%
727
 
2.2%
Other values (203) 19687
59.8%
Other Punctuation
ValueCountFrequency (%)
, 1520
84.8%
/ 273
 
15.2%
Space Separator
ValueCountFrequency (%)
3728
100.0%
Uppercase Letter
ValueCountFrequency (%)
D 1
100.0%
Decimal Number
ValueCountFrequency (%)
3 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 32934
85.6%
Common 5522
 
14.4%
Latin 1
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3942
 
12.0%
1519
 
4.6%
1373
 
4.2%
1139
 
3.5%
992
 
3.0%
968
 
2.9%
893
 
2.7%
853
 
2.6%
841
 
2.6%
727
 
2.2%
Other values (203) 19687
59.8%
Common
ValueCountFrequency (%)
3728
67.5%
, 1520
27.5%
/ 273
 
4.9%
3 1
 
< 0.1%
Latin
ValueCountFrequency (%)
D 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 32934
85.6%
ASCII 5523
 
14.4%

Most frequent character per block

Hangul
ValueCountFrequency (%)
3942
 
12.0%
1519
 
4.6%
1373
 
4.2%
1139
 
3.5%
992
 
3.0%
968
 
2.9%
893
 
2.7%
853
 
2.6%
841
 
2.6%
727
 
2.2%
Other values (203) 19687
59.8%
ASCII
ValueCountFrequency (%)
3728
67.5%
, 1520
27.5%
/ 273
 
4.9%
D 1
 
< 0.1%
3 1
 
< 0.1%

Interactions

[Figure: interaction plots for the numeric variables 종업원수 and 총원]

Correlations

Variable | 종업원수 | 기업구분 | 총원
종업원수 | 1.000 | 0.711 | 0.759
기업구분 | 0.711 | 1.000 | 0.412
총원 | 0.759 | 0.412 | 1.000

Variable | 종업원수 | 총원 | 기업구분
종업원수 | 1.000 | 0.662 | 0.601
총원 | 0.662 | 1.000 | 0.404
기업구분 | 0.601 | 0.404 | 1.000
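
The "High correlation" alerts in the overview are consistent with flagging every pair whose coefficient exceeds some cut-off. The sketch below applies an assumed threshold of 0.5 to the first matrix; the threshold and the helper name are illustrative and are not taken from the report's configuration.

```python
from itertools import combinations

# First correlation matrix from this section, as a nested dict.
corr = {
    "종업원수": {"종업원수": 1.000, "기업구분": 0.711, "총원": 0.759},
    "기업구분": {"종업원수": 0.711, "기업구분": 1.000, "총원": 0.412},
    "총원": {"종업원수": 0.759, "기업구분": 0.412, "총원": 1.000},
}

THRESHOLD = 0.5  # assumed cut-off, chosen for illustration

def high_correlation_pairs(matrix, threshold):
    """Return (a, b, value) for each distinct pair with |value| >= threshold."""
    pairs = []
    for a, b in combinations(matrix, 2):
        value = matrix[a][b]
        if abs(value) >= threshold:
            pairs.append((a, b, value))
    return pairs

pairs = high_correlation_pairs(corr, THRESHOLD)
for a, b, value in pairs:
    print(f"{a} ~ {b}: {value:.3f}")
```

With this threshold, 종업원수 is paired with both other fields while 기업구분 and 총원 (0.412) are not flagged, which matches the alerts.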

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
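
Nullity correlation can be illustrated by replacing each cell with a present/missing indicator and correlating the indicators. The sketch uses a plain Pearson correlation and hypothetical toy columns; it is not the report's own implementation.

```python
def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns.

    Both columns must contain at least one missing and one present value,
    otherwise the variance is zero and the correlation is undefined.
    """
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

# Hypothetical toy columns (None marks a missing cell), not taken from the dataset.
employees = [10, None, 7, None, 4, 27]
headcount = [18, None, 115, None, None, 27]
phone = [None, "031-000-0001", None, "031-000-0002", None, None]

r_same_direction = nullity_correlation(employees, headcount)
r_opposite = nullity_correlation(employees, phone)
```

A value near 1 means the two columns tend to be missing together; a value near -1 means one is present exactly when the other is missing.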

Sample

Index | 공장명 | 주소 | 업종명 | 연락처 | 종업원수 | 기업구분 | 총원 | 생산품명 | 생산품분류
0 | (주)BMB산업 | 경기도 안산시 단원구 해안로 77, 505-10 (목내동) | 그 외 기타 1차 철강 제조업 | 031-491-6400 | 10 | 소기업 | 18 | 방음벽 지주, 태양광 골조자재 | 건축 또는 건축용 재료,조명기기
1 | (주)ICP | 경기도 안산시 단원구 산단로67번길 134, (16-3-1) (목내동) | 자동차 재제조 부품 제조업,절삭가공 및 유사처리업,그 외 기타 금속가공업,자동차 엔진용 신품 부품 제조업 | 031-494-5511 | 30 | 소기업 | 14 | 모터축, 자동차 엔진용부품 등, 전자, 자동차부품 CNC 절삭가공 | 자동차 및 자동차부품,동력기계
2 | (주)거림 | 경기도 안산시 단원구 엠티브이12로21번길 18, 시화MTV 4사 106호 (성곡동) | 인쇄회로기판용 적층판 제조업 | 031-508-0504 | 7 | 중기업 | 115 | FPCB 실장 | 가구,전기통신기계
3 | (주)건우정공 | 경기도 안산시 단원구 동산로27번길 15,(11B-2L) (원시동) | 주형 및 금형 제조업 | 031-826-0954 | 31 | 소기업 | 42 | 금형 | 계측제어분석기
4 | (주)건우정밀 | 경기도 안산시 단원구 신원로 314, 디동 (454-1, 16B 19L) (목내동) | 자동차용 신품 제동장치 제조업,그 외 자동차용 신품 부품 제조업,자동차용 신품 조향장치 및 현가 장치 제조업,자동차 재제조 부품 제조업 | 031-491-8665 | 4 | 소기업 | 4 | 자동차부품 | 자동차 및 자동차부품
5 | (주)경안파이프 | 경기도 안산시 단원구 번영로94번길 29, 시화공단 4마 403호 (성곡동) | 강관 가공품 및 관 연결구류 제조업,강관 제조업 | 031-432-6060 | 27 | 소기업 | 27 | 구조관파이프 | 건축 또는 건축용 재료
6 | (주)경인양행 | 경기도 안산시 단원구 산단로 26, (10-64) (원시동) | 염료, 조제 무기안료, 유연제 및 기타 착색제 제조업 | 031-491-0111 | 68 | 중기업 | 68 | 염료 | 염료, 안료, 도료
7 | (주)고려제지인쇄사업부 | 경기도 안산시 단원구 번영2로 23, 시화단지 4다 707호 (성곡동) 외 1필지 | 적층, 합성 및 특수표면처리 종이 제조업,오프셋 인쇄업,기타 인쇄업 | 031-497-0084 | 26 | 소기업 | 0 | 도공라이너지, 인쇄품 | 의료위생용품,인쇄 및 제본기계
8 | (주)고려중전기 | 경기도 안산시 단원구 산단로 112-15, 10블럭 (원시동) | 전동기 및 발전기 제조업 | 031-492-8431 | 8 | 소기업 | 8 | 전기모터,발전기 | 발전기 및 모터
9 | (주)고려호이스트 | 경기도 안산시 단원구 엠티브이4로 42, MTV 8사 201호 (목내동, 고려호이스트 공장) | 기타 물품 취급장비 제조업 | 031-431-6030 | 70 | 중기업 | 71 | 호이스트, 크레인 | 운반하역기계

Index | 공장명 | 주소 | 업종명 | 연락처 | 종업원수 | 기업구분 | 총원 | 생산품명 | 생산품분류
6901 | 흙광고기획 | 경기도 안산시 단원구 화정천서로 523, 101호(선부동) | 간판 및 광고물 제조업,전시 및 광고용 조명장치 제조업 | 031-416-0563 | <NA> | 소기업 | <NA> | 사인물, 아크릴, 실사현수막 | <NA>
6902 | 흥상산업 | 경기도 안산시 단원구 능안로 81 (신길동, 안산디지털파크) | <NA> | <NA> | 1 | 소기업 | <NA> | 건물임대 | 건축 또는 건축용 재료
6903 | 흥성사료(주) | 경기도 안산시 단원구 산단로35번길 141, 17블럭 (목내동) | 배합 사료 제조업,단미 사료 및 기타 사료 제조업 | 031-493-1321 | 71 | 중기업 | <NA> | 배합사료 | 식물성물질/재료
6904 | 흥아기업상공(주) | 경기도 안산시 단원구 강촌로139번길 78, 21블럭 (성곡동) | 포장용 플라스틱 성형용기 제조업 | 031-491-0604 | 4 | 중기업 | <NA> | 프라스틱바구니 | 주방용품
6905 | 흥진정밀 | 경기도 안산시 단원구 진흥로10번길 26, 5바 904-2호 (성곡동) | 산업처리공정 제어장비 제조업 | 031-488-8750 | 4 | 소기업 | <NA> | 자동화설비 부품 | 컴퓨터
6906 | 희다이아몬드 | 경기도 안산시 단원구 산단로 107, 서흥테크노밸리313호 (원시동) | 비동력식 수공구 제조업 | 031-508-9642 | 1 | 소기업 | <NA> | 연삭공구 | 주조 및 금속가공기계
6907 | 희상 | 경기도 안산시 단원구 산단로 341, 808호(13-11) (신길동) | 그 외 기타 의료용 기기 제조업 | 031-492-0781 | 2 | 소기업 | <NA> | 뜸봉 | <NA>
6908 | 희승정밀 | 경기도 안산시 단원구 별망로79번길 25, (성곡동) | 주형 및 금형 제조업 | <NA> | 2 | 소기업 | <NA> | 금형제작 및 가공 | <NA>
6909 | 희진엠앤에프 | 경기도 안산시 단원구 해봉로273번길 27, 1049-6번지 (신길동) | 연성 및 기타 인쇄회로기판 제조업,인쇄회로기판용 적층판 제조업,경성 인쇄회로기판 제조업 | <NA> | <NA> | 소기업 | <NA> | PCB | <NA>
6910 | 희창산업 | 경기도 안산시 단원구 산단로20번길 98, 9블럭 (초지동) | 톱 및 호환성 공구 제조업 | 031-495-5228 | 3 | 소기업 | <NA> | 공구 | 주조 및 금속가공기계

Duplicate rows

Most frequently occurring

Index | 공장명 | 주소 | 업종명 | 연락처 | 종업원수 | 기업구분 | 총원 | 생산품명 | 생산품분류 | # duplicates
0 | (주)스탠다드엔지니어링 | 경기도 안산시 단원구 첨단로 679 (초지동) | 산업용 오븐, 노 및 노용 버너 제조업 | 031-494-0750 | 11 | 소기업 | <NA> | 모듈 | <NA> | 2
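
A duplicate count of 2 is consistent with the single duplicate row reported in the overview if the count refers to the number of identical occurrences, one of which is redundant. The sketch below shows one way to find such rows using only the standard library; the records are hypothetical and the helper name is illustrative.

```python
from collections import Counter

def duplicate_rows(rows):
    """Return {row: occurrences} for every row that occurs more than once."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}

# Hypothetical toy records (name, phone), not taken from the dataset.
records = [
    ("Factory A", "031-000-0001"),
    ("Factory B", "031-000-0002"),
    ("Factory A", "031-000-0001"),
]

dupes = duplicate_rows(records)
for row, n in dupes.items():
    print(row, "occurs", n, "times")
```

Rows are compared on all fields at once, so two records count as duplicates only when every column matches.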