Overview

Dataset statistics

Number of variables: 12
Number of observations: 6844
Missing cells: 15
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 675.2 KiB
Average record size in memory: 101.0 B
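The derived figures in this block can be checked from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (variables times observations) and that the average record size is the total memory divided by the number of rows, with 1 KiB = 1024 bytes; the report itself does not spell out these definitions.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 12
n_observations = 6844
missing_cells = 15
total_size_kib = 675.2

# Assumed definitions: share of missing cells over all cells, bytes per row.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.3f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the missing share comes out at roughly 0.018%, which is consistent with the "< 0.1%" shown, and the record size rounds to 101.0 B.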

Variable types

Numeric: 5
Text: 4
Categorical: 3

Dataset

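Description: Data on the status of factory enterprises in Gimhae-si (김해시). It provides the company name, industrial complex name, representative name, knowledge industry center name, main factory address, male employees, female employees, foreign male employees, foreign female employees, industry code, industry name, and zoning district.
Author: 경상남도 김해시 (Gimhae-si, Gyeongsangnam-do)
URL: https://www.data.go.kr/data/15126955/fileData.do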

Alerts

지식산업센터명 is highly overall correlated with 순번 and 5 other fields (High correlation)
용도지역 is highly overall correlated with 지식산업센터명 (High correlation)
순번 is highly overall correlated with 지식산업센터명 (High correlation)
남종업원 is highly overall correlated with 지식산업센터명 (High correlation)
여종업원 is highly overall correlated with 지식산업센터명 (High correlation)
(외남)종업원 is highly overall correlated with 지식산업센터명 (High correlation)
(외여)종업원 is highly overall correlated with 지식산업센터명 (High correlation)
단지명 is highly imbalanced (77.1%) (Imbalance)
지식산업센터명 is highly imbalanced (97.6%) (Imbalance)
(외여)종업원 is highly skewed (γ1 = 27.66399715) (Skewed)
순번 has unique values (Unique)
남종업원 has 87 (1.3%) zeros (Zeros)
여종업원 has 1361 (19.9%) zeros (Zeros)
(외남)종업원 has 6342 (92.7%) zeros (Zeros)
(외여)종업원 has 6744 (98.5%) zeros (Zeros)
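The imbalance percentages can be reproduced for 지식산업센터명, whose two class counts (6828 and 16) are both given in this report. The sketch below assumes the score is one minus the Shannon entropy of the class shares, normalised by the maximum entropy for that number of classes. The report does not state its formula, so `imbalance_score` is a hypothetical helper written for illustration.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (in bits) of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# 지식산업센터명: 6828 rows hold <NA> (counted as a category), 16 rows name the one centre.
score = imbalance_score([6828, 16])
print(f"{score:.1%}")
```

Under this assumption the score lands close to the 97.6% reported in the alert. The same check is not possible for 단지명, because the report lists only its ten most common values.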

Reproduction

Analysis started: 2024-03-14 09:39:47.446185
Analysis finished: 2024-03-14 09:39:56.341245
Duration: 8.9 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
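A report of this kind could be regenerated along the following lines. This is a minimal sketch, not the configuration that was actually used: the CSV file name, the report title, and the encoding are assumptions, and `build_report` is a hypothetical helper. It relies only on `pandas.read_csv` and on `ProfileReport(...).to_file(...)` from ydata-profiling.

```python
def build_report(csv_path, output_path="report.html", encoding="utf-8"):
    """Profile a CSV export of the dataset and write the result as an HTML file."""
    # Imported inside the function so the sketch can be loaded without the packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the download may be cp949-encoded rather than UTF-8;
    # pass encoding="cp949" if the default fails.
    df = pd.read_csv(csv_path, encoding=encoding)

    report = ProfileReport(df, title="Gimhae factory enterprises")  # title is illustrative
    report.to_file(output_path)
    return report


# Example call (hypothetical file name):
# build_report("gimhae_factories.csv", encoding="cp949")
```

The settings stored in config.json are not reproduced here, so a run with default settings may differ in detail from the report shown.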

Variables

순번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 6844
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3422.5
Minimum: 1
Maximum: 6844
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 60.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 343.15
Q1: 1711.75
median: 3422.5
Q3: 5133.25
95-th percentile: 6501.85
Maximum: 6844
Range: 6843
Interquartile range (IQR): 3421.5

Descriptive statistics

Standard deviation: 1975.837
Coefficient of variation (CV): 0.57730809
Kurtosis: -1.2
Mean: 3422.5
Median Absolute Deviation (MAD): 1711
Skewness: 0
Sum: 23423590
Variance: 3903931.7
Monotonicity: Strictly increasing
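Because 순번 is simply the row number 1 to 6844, its statistics follow from the arithmetic of consecutive integers. The sketch below recomputes them with the standard library. It assumes the report uses the sample variance (divisor n - 1) and linearly interpolated percentiles, which is what the printed values suggest; `percentile` is a helper written for this example.

```python
import statistics

n = 6844
values = list(range(1, n + 1))           # 순번 runs 1, 2, ..., 6844

total = sum(values)                      # closed form: n * (n + 1) / 2
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance; closed form: n * (n + 1) / 12
std_dev = statistics.stdev(values)
cv = std_dev / mean                      # coefficient of variation


def percentile(sorted_values, p):
    """Linearly interpolated percentile at fractional position p * (n - 1)."""
    pos = p * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


print(total, mean, round(variance, 1), round(std_dev, 3), round(cv, 6))
print(percentile(values, 0.05), percentile(values, 0.25), percentile(values, 0.75))
```

A kurtosis of -1.2 and a skewness of 0 are likewise what a uniformly spaced sequence is expected to give, so every figure for this column reflects the row index rather than anything about the factories.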
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1 1
 
< 0.1%
4573 1
 
< 0.1%
4571 1
 
< 0.1%
4570 1
 
< 0.1%
4569 1
 
< 0.1%
4568 1
 
< 0.1%
4567 1
 
< 0.1%
4566 1
 
< 0.1%
4565 1
 
< 0.1%
4564 1
 
< 0.1%
Other values (6834) 6834
99.9%
ValueCountFrequency (%)
1 1
< 0.1%
2 1
< 0.1%
3 1
< 0.1%
4 1
< 0.1%
5 1
< 0.1%
6 1
< 0.1%
7 1
< 0.1%
8 1
< 0.1%
9 1
< 0.1%
10 1
< 0.1%
ValueCountFrequency (%)
6844 1
< 0.1%
6843 1
< 0.1%
6842 1
< 0.1%
6841 1
< 0.1%
6840 1
< 0.1%
6839 1
< 0.1%
6838 1
< 0.1%
6837 1
< 0.1%
6836 1
< 0.1%
6835 1
< 0.1%

회사명
Text

Distinct: 6193
Distinct (%): 90.5%
Missing: 0
Missing (%): 0.0%
Memory size: 53.6 KiB

Length

Max length: 33
Median length: 25
Mean length: 6.614699
Min length: 2

Characters and Unicode

Total characters: 45271
Distinct characters: 639
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5664
Unique (%): 82.8%

Sample

1st row 종삼벤드
2nd row( 주)케이지앤에프
3rd row(사)경남교통장애인협회 마산지회
4th row(유)근영금속
5th row(유)성문
ValueCountFrequency (%)
주식회사 364
 
5.0%
tech 12
 
0.2%
농업회사법인 11
 
0.1%
대성산업 7
 
0.1%
영진산업 6
 
0.1%
주)dcf 6
 
0.1%
trek 6
 
0.1%
대원산업 6
 
0.1%
태성산업 6
 
0.1%
김해지점 6
 
0.1%
Other values (6224) 6910
94.1%

Most occurring characters

ValueCountFrequency (%)
3922
 
8.7%
) 3431
 
7.6%
( 3431
 
7.6%
1328
 
2.9%
1178
 
2.6%
1082
 
2.4%
928
 
2.0%
883
 
2.0%
781
 
1.7%
679
 
1.5%
Other values (629) 27628
61.0%

Most occurring categories

ValueCountFrequency (%)
Other Letter 36443
80.5%
Close Punctuation 3431
 
7.6%
Open Punctuation 3431
 
7.6%
Uppercase Letter 1091
 
2.4%
Space Separator 504
 
1.1%
Other Punctuation 152
 
0.3%
Lowercase Letter 109
 
0.2%
Decimal Number 98
 
0.2%
Dash Punctuation 12
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3922
 
10.8%
1328
 
3.6%
1178
 
3.2%
1082
 
3.0%
928
 
2.5%
883
 
2.4%
781
 
2.1%
679
 
1.9%
667
 
1.8%
639
 
1.8%
Other values (570) 24356
66.8%
Uppercase Letter
ValueCountFrequency (%)
E 119
 
10.9%
S 119
 
10.9%
C 99
 
9.1%
T 88
 
8.1%
N 83
 
7.6%
G 76
 
7.0%
M 58
 
5.3%
H 56
 
5.1%
R 42
 
3.8%
P 40
 
3.7%
Other values (15) 311
28.5%
Lowercase Letter
ValueCountFrequency (%)
n 14
12.8%
o 11
10.1%
e 10
 
9.2%
t 9
 
8.3%
c 8
 
7.3%
s 7
 
6.4%
a 7
 
6.4%
g 7
 
6.4%
d 6
 
5.5%
y 4
 
3.7%
Other values (10) 26
23.9%
Other Punctuation
ValueCountFrequency (%)
. 114
75.0%
& 30
 
19.7%
, 5
 
3.3%
/ 2
 
1.3%
: 1
 
0.7%
Decimal Number
ValueCountFrequency (%)
2 63
64.3%
1 21
 
21.4%
3 10
 
10.2%
4 3
 
3.1%
5 1
 
1.0%
Close Punctuation
ValueCountFrequency (%)
) 3431
100.0%
Open Punctuation
ValueCountFrequency (%)
( 3431
100.0%
Space Separator
ValueCountFrequency (%)
504
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 12
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 36443
80.5%
Common 7628
 
16.8%
Latin 1200
 
2.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3922
 
10.8%
1328
 
3.6%
1178
 
3.2%
1082
 
3.0%
928
 
2.5%
883
 
2.4%
781
 
2.1%
679
 
1.9%
667
 
1.8%
639
 
1.8%
Other values (570) 24356
66.8%
Latin
ValueCountFrequency (%)
E 119
 
9.9%
S 119
 
9.9%
C 99
 
8.2%
T 88
 
7.3%
N 83
 
6.9%
G 76
 
6.3%
M 58
 
4.8%
H 56
 
4.7%
R 42
 
3.5%
P 40
 
3.3%
Other values (35) 420
35.0%
Common
ValueCountFrequency (%)
) 3431
45.0%
( 3431
45.0%
504
 
6.6%
. 114
 
1.5%
2 63
 
0.8%
& 30
 
0.4%
1 21
 
0.3%
- 12
 
0.2%
3 10
 
0.1%
, 5
 
0.1%
Other values (4) 7
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 36443
80.5%
ASCII 8828
 
19.5%

Most frequent character per block

Hangul
ValueCountFrequency (%)
3922
 
10.8%
1328
 
3.6%
1178
 
3.2%
1082
 
3.0%
928
 
2.5%
883
 
2.4%
781
 
2.1%
679
 
1.9%
667
 
1.8%
639
 
1.8%
Other values (570) 24356
66.8%
ASCII
ValueCountFrequency (%)
) 3431
38.9%
( 3431
38.9%
504
 
5.7%
E 119
 
1.3%
S 119
 
1.3%
. 114
 
1.3%
C 99
 
1.1%
T 88
 
1.0%
N 83
 
0.9%
G 76
 
0.9%
Other values (49) 764
 
8.7%

단지명
Categorical

IMBALANCE 

Distinct: 25
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 53.6 KiB
<NA>
5909 
김해테크노밸리일반산업단지
 
293
김해GoldenRoot일반산업단지
 
132
김해진영농공단지
 
71
서김해일반산업단지
 
55
Other values (20)
 
384

Length

Max length: 18
Median length: 4
Mean length: 5.0457335
Min length: 4

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row<NA>
2nd row김해AM하이테크일반산업단지
3rd row<NA>
4th row<NA>
5th row김해테크노밸리일반산업단지

Common Values

ValueCountFrequency (%)
<NA> 5909
86.3%
김해테크노밸리일반산업단지 293
 
4.3%
김해GoldenRoot일반산업단지 132
 
1.9%
김해진영농공단지 71
 
1.0%
서김해일반산업단지 55
 
0.8%
김해안하농공단지 44
 
0.6%
김해병동일반산업단지 38
 
0.6%
김해나전농공단지 32
 
0.5%
김해덕암일반산업단지 30
 
0.4%
김해명동일반산업단지 29
 
0.4%
Other values (15) 211
 
3.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
na 5909
86.3%
김해테크노밸리일반산업단지 293
 
4.3%
김해goldenroot일반산업단지 132
 
1.9%
김해진영농공단지 71
 
1.0%
서김해일반산업단지 55
 
0.8%
김해안하농공단지 44
 
0.6%
김해병동일반산업단지 38
 
0.6%
김해나전농공단지 32
 
0.5%
김해덕암일반산업단지 30
 
0.4%
김해명동일반산업단지 29
 
0.4%
Other values (15) 211
 
3.1%

지식산업센터명
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 53.6 KiB
<NA>
6828 
재단법인김해시차세대의생명융합산업지원센터
 
16

Length

Max length: 21
Median length: 4
Mean length: 4.0397428
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row<NA>
2nd row<NA>
3rd row<NA>
4th row<NA>
5th row<NA>

Common Values

ValueCountFrequency (%)
<NA> 6828
99.8%
재단법인김해시차세대의생명융합산업지원센터 16
 
0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
na 6828
99.8%
재단법인김해시차세대의생명융합산업지원센터 16
 
0.2%

공장대표주소
Text

Distinct: 6073
Distinct (%): 88.7%
Missing: 0
Missing (%): 0.0%
Memory size: 53.6 KiB

Length

Max length: 56
Median length: 49
Mean length: 25.241525
Min length: 7

Characters and Unicode

Total characters: 172753
Distinct characters: 267
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5441
Unique (%): 79.5%

Sample

1st row경상남도 김해시 진례면 담안리 1077-13번지
2nd row경상남도 김해시 진례면 하이테크로 22
3rd row경상남도 김해시 주촌면 서부로1541번길 86-98
4th row경상남도 김해시 주촌면 서부로1701번안길 58-199
5th row경상남도 김해시 진례면 테크노밸리1로 123
ValueCountFrequency (%)
김해시 6842
19.5%
경상남도 6841
19.5%
한림면 1361
 
3.9%
진례면 1110
 
3.2%
주촌면 1073
 
3.1%
진영읍 784
 
2.2%
상동면 674
 
1.9%
생림면 523
 
1.5%
안동 242
 
0.7%
어방동 200
 
0.6%
Other values (4275) 15406
43.9%

Most occurring characters

ValueCountFrequency (%)
28319
 
16.4%
7853
 
4.5%
7849
 
4.5%
7838
 
4.5%
1 7222
 
4.2%
6855
 
4.0%
6850
 
4.0%
6849
 
4.0%
6844
 
4.0%
6231
 
3.6%
Other values (257) 80043
46.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 101742
58.9%
Decimal Number 35851
 
20.8%
Space Separator 28319
 
16.4%
Dash Punctuation 3465
 
2.0%
Open Punctuation 1437
 
0.8%
Close Punctuation 1437
 
0.8%
Other Punctuation 298
 
0.2%
Uppercase Letter 182
 
0.1%
Lowercase Letter 22
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
7853
 
7.7%
7849
 
7.7%
7838
 
7.7%
6855
 
6.7%
6850
 
6.7%
6849
 
6.7%
6844
 
6.7%
6231
 
6.1%
4776
 
4.7%
4336
 
4.3%
Other values (221) 35461
34.9%
Uppercase Letter
ValueCountFrequency (%)
B 73
40.1%
L 64
35.2%
A 14
 
7.7%
F 9
 
4.9%
E 5
 
2.7%
D 4
 
2.2%
I 2
 
1.1%
H 2
 
1.1%
S 2
 
1.1%
N 2
 
1.1%
Other values (4) 5
 
2.7%
Decimal Number
ValueCountFrequency (%)
1 7222
20.1%
2 4769
13.3%
3 4032
11.2%
4 3396
9.5%
5 3327
9.3%
6 3052
8.5%
9 2920
8.1%
7 2739
 
7.6%
0 2340
 
6.5%
8 2054
 
5.7%
Lowercase Letter
ValueCountFrequency (%)
o 6
27.3%
l 5
22.7%
k 3
13.6%
t 3
13.6%
c 3
13.6%
b 2
 
9.1%
Other Punctuation
ValueCountFrequency (%)
, 295
99.0%
. 3
 
1.0%
Space Separator
ValueCountFrequency (%)
28319
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3465
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1437
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1437
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 101742
58.9%
Common 70807
41.0%
Latin 204
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
7853
 
7.7%
7849
 
7.7%
7838
 
7.7%
6855
 
6.7%
6850
 
6.7%
6849
 
6.7%
6844
 
6.7%
6231
 
6.1%
4776
 
4.7%
4336
 
4.3%
Other values (221) 35461
34.9%
Latin
ValueCountFrequency (%)
B 73
35.8%
L 64
31.4%
A 14
 
6.9%
F 9
 
4.4%
o 6
 
2.9%
l 5
 
2.5%
E 5
 
2.5%
D 4
 
2.0%
k 3
 
1.5%
t 3
 
1.5%
Other values (10) 18
 
8.8%
Common
ValueCountFrequency (%)
28319
40.0%
1 7222
 
10.2%
2 4769
 
6.7%
3 4032
 
5.7%
- 3465
 
4.9%
4 3396
 
4.8%
5 3327
 
4.7%
6 3052
 
4.3%
9 2920
 
4.1%
7 2739
 
3.9%
Other values (6) 7566
 
10.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 101742
58.9%
ASCII 71011
41.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
28319
39.9%
1 7222
 
10.2%
2 4769
 
6.7%
3 4032
 
5.7%
- 3465
 
4.9%
4 3396
 
4.8%
5 3327
 
4.7%
6 3052
 
4.3%
9 2920
 
4.1%
7 2739
 
3.9%
Other values (26) 7770
 
10.9%
Hangul
ValueCountFrequency (%)
7853
 
7.7%
7849
 
7.7%
7838
 
7.7%
6855
 
6.7%
6850
 
6.7%
6849
 
6.7%
6844
 
6.7%
6231
 
6.1%
4776
 
4.7%
4336
 
4.3%
Other values (221) 35461
34.9%

남종업원
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 118
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.913793
Minimum: 0
Maximum: 638
Zeros: 87
Zeros (%): 1.3%
Negative: 0
Negative (%): 0.0%
Memory size: 60.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 3
median: 6
Q3: 11
95-th percentile: 33
Maximum: 638
Range: 638
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 21.696757
Coefficient of variation (CV): 1.9880125
Kurtosis: 351.16697
Mean: 10.913793
Median Absolute Deviation (MAD): 3
Skewness: 14.905901
Sum: 74694
Variance: 470.74925
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
4 875
12.8%
3 693
 
10.1%
2 600
 
8.8%
5 538
 
7.9%
9 492
 
7.2%
1 407
 
5.9%
6 375
 
5.5%
7 338
 
4.9%
10 313
 
4.6%
8 312
 
4.6%
Other values (108) 1901
27.8%
ValueCountFrequency (%)
0 87
 
1.3%
1 407
5.9%
2 600
8.8%
3 693
10.1%
4 875
12.8%
5 538
7.9%
6 375
5.5%
7 338
 
4.9%
8 312
 
4.6%
9 492
7.2%
ValueCountFrequency (%)
638 1
< 0.1%
625 1
< 0.1%
617 1
< 0.1%
486 1
< 0.1%
400 1
< 0.1%
285 1
< 0.1%
272 1
< 0.1%
254 1
< 0.1%
251 1
< 0.1%
238 1
< 0.1%

여종업원
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 69
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.8834015
Minimum: 0
Maximum: 300
Zeros: 1361
Zeros (%): 19.9%
Negative: 0
Negative (%): 0.0%
Memory size: 60.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 1
Q3: 2
95-th percentile: 10
Maximum: 300
Range: 300
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 9.0761546
Coefficient of variation (CV): 3.1477248
Kurtosis: 395.42987
Mean: 2.8834015
Median Absolute Deviation (MAD): 1
Skewness: 16.166841
Sum: 19734
Variance: 82.376583
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1 3114
45.5%
0 1361
19.9%
2 814
 
11.9%
3 417
 
6.1%
4 245
 
3.6%
5 203
 
3.0%
6 96
 
1.4%
10 94
 
1.4%
7 88
 
1.3%
8 69
 
1.0%
Other values (59) 343
 
5.0%
ValueCountFrequency (%)
0 1361
19.9%
1 3114
45.5%
2 814
 
11.9%
3 417
 
6.1%
4 245
 
3.6%
5 203
 
3.0%
6 96
 
1.4%
7 88
 
1.3%
8 69
 
1.0%
9 45
 
0.7%
ValueCountFrequency (%)
300 1
< 0.1%
280 1
< 0.1%
196 1
< 0.1%
194 1
< 0.1%
150 1
< 0.1%
140 1
< 0.1%
138 1
< 0.1%
120 1
< 0.1%
107 1
< 0.1%
97 1
< 0.1%

(외남)종업원
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 24
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.36177674
Minimum: 0
Maximum: 40
Zeros: 6342
Zeros (%): 92.7%
Negative: 0
Negative (%): 0.0%
Memory size: 60.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 2
Maximum: 40
Range: 40
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.7731266
Coefficient of variation (CV): 4.9011625
Kurtosis: 106.83986
Mean: 0.36177674
Median Absolute Deviation (MAD): 0
Skewness: 8.3633618
Sum: 2476
Variance: 3.1439779
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=24)
ValueCountFrequency (%)
0 6342
92.7%
1 95
 
1.4%
2 84
 
1.2%
3 60
 
0.9%
4 55
 
0.8%
6 41
 
0.6%
5 38
 
0.6%
7 28
 
0.4%
8 26
 
0.4%
10 22
 
0.3%
Other values (14) 53
 
0.8%
ValueCountFrequency (%)
0 6342
92.7%
1 95
 
1.4%
2 84
 
1.2%
3 60
 
0.9%
4 55
 
0.8%
5 38
 
0.6%
6 41
 
0.6%
7 28
 
0.4%
8 26
 
0.4%
9 10
 
0.1%
ValueCountFrequency (%)
40 1
 
< 0.1%
34 1
 
< 0.1%
31 1
 
< 0.1%
28 1
 
< 0.1%
26 1
 
< 0.1%
21 1
 
< 0.1%
18 1
 
< 0.1%
17 4
0.1%
15 4
0.1%
14 5
0.1%

(외여)종업원
Real number (ℝ)

HIGH CORRELATION  SKEWED  ZEROS 

Distinct: 12
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.045149036
Minimum: 0
Maximum: 30
Zeros: 6744
Zeros (%): 98.5%
Negative: 0
Negative (%): 0.0%
Memory size: 60.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 30
Range: 30
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.58013399
Coefficient of variation (CV): 12.849311
Kurtosis: 1147.6491
Mean: 0.045149036
Median Absolute Deviation (MAD): 0
Skewness: 27.663997
Sum: 309
Variance: 0.33655545
Monotonicity: Not monotonic
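The frequency tables in this section list all 12 distinct values of (외여)종업원 with their counts, so the whole column can be rebuilt and its moments recomputed. The sketch below assumes the skewness shown is the bias-adjusted sample skewness (the form pandas reports); that is an inference from the numbers, not something the report states.

```python
import math

# Value -> count for (외여)종업원, read from the frequency tables in this section.
counts = {0: 6744, 1: 41, 2: 24, 3: 10, 4: 4, 5: 6, 6: 4, 7: 2, 8: 2, 9: 3, 11: 3, 30: 1}

n = sum(counts.values())
total = sum(v * c for v, c in counts.items())
mean = total / n

# Central moments in population form (divided by n).
m2 = sum(c * (v - mean) ** 2 for v, c in counts.items()) / n
m3 = sum(c * (v - mean) ** 3 for v, c in counts.items()) / n

sample_variance = m2 * n / (n - 1)
g1 = m3 / m2 ** 1.5                                # moment coefficient of skewness
skewness = g1 * math.sqrt(n * (n - 1)) / (n - 2)   # assumed bias-adjusted form

print(n, total, round(mean, 6), round(sample_variance, 6), round(skewness, 3))
```

The counts add up to the 6844 observations and the values to the reported sum of 309, which suggests the tables are complete. Almost all of the skewness comes from the single factory reporting 30 foreign female employees.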
Histogram with fixed size bins (bins=12)
ValueCountFrequency (%)
0 6744
98.5%
1 41
 
0.6%
2 24
 
0.4%
3 10
 
0.1%
5 6
 
0.1%
6 4
 
0.1%
4 4
 
0.1%
11 3
 
< 0.1%
9 3
 
< 0.1%
8 2
 
< 0.1%
Other values (2) 3
 
< 0.1%
ValueCountFrequency (%)
0 6744
98.5%
1 41
 
0.6%
2 24
 
0.4%
3 10
 
0.1%
4 4
 
0.1%
5 6
 
0.1%
6 4
 
0.1%
7 2
 
< 0.1%
8 2
 
< 0.1%
9 3
 
< 0.1%
ValueCountFrequency (%)
30 1
 
< 0.1%
11 3
 
< 0.1%
9 3
 
< 0.1%
8 2
 
< 0.1%
7 2
 
< 0.1%
6 4
 
0.1%
5 6
 
0.1%
4 4
 
0.1%
3 10
0.1%
2 24
0.4%

업종번호
Text

Distinct: 1640
Distinct (%): 24.0%
Missing: 8
Missing (%): 0.1%
Memory size: 53.6 KiB

Length

Max length: 425
Median length: 5
Mean length: 13.695728
Min length: 5

Characters and Unicode

Total characters: 93624
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1182
Unique (%): 17.3%

Sample

1st row31114
2nd row10795
3rd row22231
4th row24329
5th row31114
ValueCountFrequency (%)
30399 621
 
4.1%
30400 610
 
4.0%
30391 515
 
3.4%
30392 502
 
3.3%
31114 472
 
3.1%
25113 332
 
2.2%
25999 259
 
1.7%
29199 255
 
1.7%
25114 250
 
1.6%
29176 247
 
1.6%
Other values (435) 11265
73.5%
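The word counts above exceed the number of rows because one cell of this column can hold several comma-separated industry codes. The sketch below shows one way to split such cells; `split_codes` is a hypothetical helper, and the two example cells are taken from the sample rows of this report. It also checks that the number of non-missing rows plus the number of commas equals the total of the word-frequency table, assuming every comma separates two codes.

```python
def split_codes(cell):
    """Split one cell such as '30399, 30391' into a list of code strings."""
    if cell is None:
        return []
    return [part.strip() for part in cell.split(",") if part.strip()]


# Two cells from the sample rows of this report, plus a missing value.
cells = ["31114", "30399, 30391, 30392, 30400", None]
codes = [code for cell in cells for code in split_codes(cell)]
print(codes)

# Consistency check against the figures in this section.
non_missing_rows = 6844 - 8          # observations minus missing cells
commas = 8492                        # from the character frequency table
total_codes = non_missing_rows + commas
table_total = 621 + 610 + 515 + 502 + 472 + 332 + 259 + 255 + 250 + 247 + 11265
print(total_codes, table_total)
```

Both totals agree, so the table appears to count individual codes, not rows. The codes are kept as strings because they are identifiers rather than quantities.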

Most occurring characters

ValueCountFrequency (%)
2 20518
21.9%
1 14277
15.2%
9 13325
14.2%
3 10363
11.1%
, 8492
9.1%
8492
9.1%
0 6865
 
7.3%
4 3867
 
4.1%
5 3760
 
4.0%
6 1410
 
1.5%
Other values (2) 2255
 
2.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 76640
81.9%
Other Punctuation 8492
 
9.1%
Space Separator 8492
 
9.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 20518
26.8%
1 14277
18.6%
9 13325
17.4%
3 10363
13.5%
0 6865
 
9.0%
4 3867
 
5.0%
5 3760
 
4.9%
6 1410
 
1.8%
7 1236
 
1.6%
8 1019
 
1.3%
Other Punctuation
ValueCountFrequency (%)
, 8492
100.0%
Space Separator
ValueCountFrequency (%)
8492
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 93624
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 20518
21.9%
1 14277
15.2%
9 13325
14.2%
3 10363
11.1%
, 8492
9.1%
8492
9.1%
0 6865
 
7.3%
4 3867
 
4.1%
5 3760
 
4.0%
6 1410
 
1.5%
Other values (2) 2255
 
2.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 93624
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 20518
21.9%
1 14277
15.2%
9 13325
14.2%
3 10363
11.1%
, 8492
9.1%
8492
9.1%
0 6865
 
7.3%
4 3867
 
4.1%
5 3760
 
4.0%
6 1410
 
1.5%
Other values (2) 2255
 
2.4%

업종명
Text

Distinct: 1135
Distinct (%): 16.6%
Missing: 7
Missing (%): 0.1%
Memory size: 53.6 KiB

Length

Max length: 34
Median length: 28
Mean length: 17.28609
Min length: 3

Characters and Unicode

Total characters: 118185
Distinct characters: 343
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 575
Unique (%): 8.4%

Sample

1st row선박 구성 부분품 제조업
2nd row인삼식품 제조업
3rd row플라스틱 포대, 봉투 및 유사제품 제조업
4th row기타 비철금속 주조업
5th row선박 구성 부분품 제조업
ValueCountFrequency (%)
제조업 5990
 
16.2%
4364
 
11.8%
2493
 
6.7%
기타 1908
 
5.2%
1종 1579
 
4.3%
1465
 
4.0%
금속 796
 
2.2%
신품 625
 
1.7%
부품 622
 
1.7%
자동차용 617
 
1.7%
Other values (708) 16487
44.6%

Most occurring characters

ValueCountFrequency (%)
30112
25.5%
7607
 
6.4%
7102
 
6.0%
7021
 
5.9%
4466
 
3.8%
4066
 
3.4%
3134
 
2.7%
2932
 
2.5%
2493
 
2.1%
2249
 
1.9%
Other values (333) 47003
39.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 84076
71.1%
Space Separator 30112
 
25.5%
Decimal Number 3105
 
2.6%
Other Punctuation 716
 
0.6%
Close Punctuation 88
 
0.1%
Open Punctuation 88
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
7607
 
9.0%
7102
 
8.4%
7021
 
8.4%
4466
 
5.3%
4066
 
4.8%
3134
 
3.7%
2932
 
3.5%
2493
 
3.0%
2249
 
2.7%
1939
 
2.3%
Other values (317) 41067
48.8%
Decimal Number
ValueCountFrequency (%)
1 1729
55.7%
3 525
 
16.9%
2 414
 
13.3%
4 170
 
5.5%
5 91
 
2.9%
6 61
 
2.0%
8 35
 
1.1%
9 27
 
0.9%
0 27
 
0.9%
7 26
 
0.8%
Other Punctuation
ValueCountFrequency (%)
, 699
97.6%
. 16
 
2.2%
· 1
 
0.1%
Space Separator
ValueCountFrequency (%)
30112
100.0%
Close Punctuation
ValueCountFrequency (%)
) 88
100.0%
Open Punctuation
ValueCountFrequency (%)
( 88
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 84076
71.1%
Common 34109
28.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
7607
 
9.0%
7102
 
8.4%
7021
 
8.4%
4466
 
5.3%
4066
 
4.8%
3134
 
3.7%
2932
 
3.5%
2493
 
3.0%
2249
 
2.7%
1939
 
2.3%
Other values (317) 41067
48.8%
Common
ValueCountFrequency (%)
30112
88.3%
1 1729
 
5.1%
, 699
 
2.0%
3 525
 
1.5%
2 414
 
1.2%
4 170
 
0.5%
5 91
 
0.3%
) 88
 
0.3%
( 88
 
0.3%
6 61
 
0.2%
Other values (6) 132
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 84060
71.1%
ASCII 34108
28.9%
Compat Jamo 16
 
< 0.1%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
30112
88.3%
1 1729
 
5.1%
, 699
 
2.0%
3 525
 
1.5%
2 414
 
1.2%
4 170
 
0.5%
5 91
 
0.3%
) 88
 
0.3%
( 88
 
0.3%
6 61
 
0.2%
Other values (5) 131
 
0.4%
Hangul
ValueCountFrequency (%)
7607
 
9.0%
7102
 
8.4%
7021
 
8.4%
4466
 
5.3%
4066
 
4.8%
3134
 
3.7%
2932
 
3.5%
2493
 
3.0%
2249
 
2.7%
1939
 
2.3%
Other values (316) 41051
48.8%
Compat Jamo
ValueCountFrequency (%)
16
100.0%
None
ValueCountFrequency (%)
· 1
100.0%

용도지역
Categorical

HIGH CORRELATION 

Distinct: 26
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 53.6 KiB
관리지역/계획관리지역
2224 
도시지역/공업지역/일반공업지역
1839 
도시지역/공업지역/준공업지역
707 
관리지역
614 
관리지역/관리지역기타
584 
Other values (21)
876 

Length

Max length: 19
Median length: 16
Mean length: 12.746201
Min length: 4

Unique

Unique: 4
Unique (%): 0.1%

Sample

1st row관리지역/계획관리지역
2nd row관리지역/계획관리지역
3rd row도시지역/공업지역/준공업지역
4th row도시지역/주거지역/제2종일반주거지역
5th row도시지역/공업지역/일반공업지역

Common Values

ValueCountFrequency (%)
관리지역/계획관리지역 2224
32.5%
도시지역/공업지역/일반공업지역 1839
26.9%
도시지역/공업지역/준공업지역 707
 
10.3%
관리지역 614
 
9.0%
관리지역/관리지역기타 584
 
8.5%
도시지역/녹지지역/자연녹지지역 394
 
5.8%
도시지역/주거지역/제2종일반주거지역 246
 
3.6%
도시지역/주거지역 64
 
0.9%
도시지역/주거지역/준주거지역 52
 
0.8%
도시지역/주거지역/제1종일반주거지역 26
 
0.4%
Other values (16) 94
 
1.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
관리지역/계획관리지역 2224
32.5%
도시지역/공업지역/일반공업지역 1839
26.9%
도시지역/공업지역/준공업지역 707
 
10.3%
관리지역 614
 
9.0%
관리지역/관리지역기타 584
 
8.5%
도시지역/녹지지역/자연녹지지역 394
 
5.8%
도시지역/주거지역/제2종일반주거지역 246
 
3.6%
도시지역/주거지역 64
 
0.9%
도시지역/주거지역/준주거지역 52
 
0.8%
도시지역/주거지역/제1종일반주거지역 26
 
0.4%
Other values (16) 94
 
1.4%

Interactions

[Interaction plots (25 panels) not reproduced here]

Correlations

Variable | 순번 | 단지명 | 남종업원 | 여종업원 | (외남)종업원 | (외여)종업원 | 용도지역
순번 | 1.000 | 0.311 | 0.052 | 0.000 | 0.058 | 0.000 | 0.125
단지명 | 0.311 | 1.000 | 0.569 | 0.000 | 0.000 | 0.000 | 0.772
남종업원 | 0.052 | 0.569 | 1.000 | 0.658 | 0.184 | 0.088 | 0.000
여종업원 | 0.000 | 0.000 | 0.658 | 1.000 | 0.393 | 0.505 | 0.000
(외남)종업원 | 0.058 | 0.000 | 0.184 | 0.393 | 1.000 | 0.811 | 0.000
(외여)종업원 | 0.000 | 0.000 | 0.088 | 0.505 | 0.811 | 1.000 | 0.000
용도지역 | 0.125 | 0.772 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000

Variable | 단지명 | 지식산업센터명 | 용도지역
단지명 | 1.000 | NaN | 0.427
지식산업센터명 | NaN | 1.000 | 1.000
용도지역 | 0.427 | 1.000 | 1.000

Variable | 순번 | 남종업원 | 여종업원 | (외남)종업원 | (외여)종업원 | 단지명 | 지식산업센터명 | 용도지역
순번 | 1.000 | -0.208 | -0.132 | -0.030 | -0.017 | 0.118 | 1.000 | 0.044
남종업원 | -0.208 | 1.000 | 0.486 | 0.171 | 0.063 | 0.263 | 1.000 | 0.000
여종업원 | -0.132 | 0.486 | 1.000 | 0.179 | 0.133 | 0.000 | 1.000 | 0.000
(외남)종업원 | -0.030 | 0.171 | 0.179 | 1.000 | 0.323 | 0.000 | 1.000 | 0.000
(외여)종업원 | -0.017 | 0.063 | 0.133 | 0.323 | 1.000 | 0.022 | 1.000 | 0.000
단지명 | 0.118 | 0.263 | 0.000 | 0.000 | 0.022 | 1.000 | NaN | 0.427
지식산업센터명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | NaN | 1.000 | 1.000
용도지역 | 0.044 | 0.000 | 0.000 | 0.000 | 0.000 | 0.427 | 1.000 | 1.000
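The "High correlation" alerts can be traced back to the last matrix above. The sketch below flags every pair whose absolute coefficient reaches a cut-off and skips NaN entries. The cut-off of 0.5 is an assumption chosen for illustration, since the report does not state its threshold, and `high_correlations` is a hypothetical helper.

```python
import math

NAN = math.nan
fields = ["순번", "남종업원", "여종업원", "(외남)종업원", "(외여)종업원",
          "단지명", "지식산업센터명", "용도지역"]
# Rows copied from the last correlation matrix in this section.
matrix = [
    [1.000, -0.208, -0.132, -0.030, -0.017, 0.118, 1.000, 0.044],
    [-0.208, 1.000, 0.486, 0.171, 0.063, 0.263, 1.000, 0.000],
    [-0.132, 0.486, 1.000, 0.179, 0.133, 0.000, 1.000, 0.000],
    [-0.030, 0.171, 0.179, 1.000, 0.323, 0.000, 1.000, 0.000],
    [-0.017, 0.063, 0.133, 0.323, 1.000, 0.022, 1.000, 0.000],
    [0.118, 0.263, 0.000, 0.000, 0.022, 1.000, NAN, 0.427],
    [1.000, 1.000, 1.000, 1.000, 1.000, NAN, 1.000, 1.000],
    [0.044, 0.000, 0.000, 0.000, 0.000, 0.427, 1.000, 1.000],
]
THRESHOLD = 0.5  # assumed cut-off, not taken from the report


def high_correlations(names, rows, threshold):
    """Map each field to the other fields whose |coefficient| reaches the threshold."""
    flagged = {}
    for i, name in enumerate(names):
        partners = [
            names[j]
            for j, value in enumerate(rows[i])
            if j != i and not math.isnan(value) and abs(value) >= threshold
        ]
        if partners:
            flagged[name] = partners
    return flagged


flagged = high_correlations(fields, matrix, THRESHOLD)
for name, partners in flagged.items():
    print(name, "->", ", ".join(partners))
```

With this cut-off, 지식산업센터명 is paired with six other fields and each of those six is paired only with 지식산업센터명, which matches the alert list. Coefficients of exactly 1.000 against a column in which 6828 of 6844 rows hold <NA> more likely reflect the near-constant column than a real relationship.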

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

순번회사명단지명지식산업센터명공장대표주소남종업원여종업원(외남)종업원(외여)종업원업종번호업종명용도지역
01종삼벤드<NA><NA>경상남도 김해시 진례면 담안리 1077-13번지430031114선박 구성 부분품 제조업관리지역/계획관리지역
12( 주)케이지앤에프김해AM하이테크일반산업단지<NA>경상남도 김해시 진례면 하이테크로 229290310795인삼식품 제조업관리지역/계획관리지역
23(사)경남교통장애인협회 마산지회<NA><NA>경상남도 김해시 주촌면 서부로1541번길 86-98500022231플라스틱 포대, 봉투 및 유사제품 제조업도시지역/공업지역/준공업지역
34(유)근영금속<NA><NA>경상남도 김해시 주촌면 서부로1701번안길 58-1991010024329기타 비철금속 주조업도시지역/주거지역/제2종일반주거지역
45(유)성문김해테크노밸리일반산업단지<NA>경상남도 김해시 진례면 테크노밸리1로 1231000031114선박 구성 부분품 제조업도시지역/공업지역/일반공업지역
56(유)성문 김해지점김해테크노밸리일반산업단지<NA>경상남도 김해시 진례면 테크노밸리1로 122, (유)성문4950031114선박 구성 부분품 제조업도시지역/공업지역/일반공업지역
67(유)성문 김해지점2공장김해테크노밸리일반산업단지<NA>경상남도 김해시 진례면 테크노밸리로 77-32, 성문000031114선박 구성 부분품 제조업도시지역/공업지역/일반공업지역
78(유)아이티더블유건화김해병동농공단지<NA>경상남도 김해시 한림면 김해대로916번길 154-192890030399, 30391, 30392, 30400그 외 자동차용 신품 부품 제조업 외 3종관리지역/계획관리지역
89(유)영동레미콘<NA><NA>경상남도 김해시 상동면 상동로 862-79410023993비금속광물 분쇄물 생산업관리지역/계획관리지역
910(유)유성강재<NA><NA>경상남도 김해시 상동면 상동로685번길 53-2 (총 3 필지)1020024111제철업관리지역/관리지역기타
순번회사명단지명지식산업센터명공장대표주소남종업원여종업원(외남)종업원(외여)종업원업종번호업종명용도지역
68346835희락푸드(주)<NA><NA>경상남도 김해시 주촌면 서부로1295번길 97-9230010129, 10122, 10301, 10302육류 기타 가공 및 저장처리업 (가금류 제외) 외 3종도시지역/주거지역
68356836희망기업김해테크노밸리일반산업단지<NA>경상남도 김해시 진례면 테크노밸리1로 13430025912금속 단조제품 제조업도시지역/공업지역/일반공업지역
68366837희망복지영남방송(주)<NA><NA>경상남도 김해시 한림면 김해대로916번길 40-58410028519, 29172, 29173, 29174, 29175기타 가정용 전기기기 제조업 외 4종관리지역/계획관리지역
68376838희석종합식품<NA><NA>경상남도 김해시 분성로627번길 53-8 (삼방동)300010713과자류 및 코코아 제품 제조업도시지역/공업지역/일반공업지역
68386839희성산업<NA><NA>경상남도 김해시 진례면 고모로442번길 52-1500028123배전반 및 전기 자동제어반 제조업관리지역/계획관리지역
68396840희성산업<NA><NA>경상남도 김해시 진례면 고모로442번길 71-1 (총 2 필지)410031114, 29111, 29119, 29133, 29141, 29142, 29210, 29223선박 구성 부분품 제조업 외 7종관리지역/계획관리지역
68406841희성섬유<NA><NA>경상남도 김해시 상동면 매리 1045-외1번지200013999그 외 기타 분류 안된 섬유제품 제조업관리지역/관리지역기타
68416842희원산업기계<NA><NA>경상남도 김해시 김해대로2694번길 13-60 (지내동)520029229기타 가공 공작기계 제조업도시지역/공업지역/일반공업지역
68426843희창섬유<NA><NA>경상남도 김해시 상동면 동북로437번길 153-12330013109기타 방적업관리지역/계획관리지역
68436844히팅테크<NA><NA>경상남도 김해시 진례면 고모로324번길 103-65220029150, 28511, 28512, 28520산업용 오븐, 노 및 노용 버너 제조업 외 3종관리지역/계획관리지역