Overview

Dataset statistics

Number of variables: 5
Number of observations: 2363
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 97.0 KiB
Average record size in memory: 42.1 B

Variable types

Numeric: 1
Text: 2
Categorical: 2

Dataset

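Description: Smart factory deployment status in Gyeongsangnam-do. The dataset provides the company name (기업명), industry (업종), location (소재지), and deployment year (구축연도) of smart factories built in Gyeongsangnam-do.
URL: https://www.data.go.kr/data/15075266/fileData.do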

Alerts

연번 is highly overall correlated with 비고(연도) (High correlation)
비고(연도) is highly overall correlated with 연번 (High correlation)
연번 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 04:30:23.173019
Analysis finished: 2023-12-12 04:30:24.186789
Duration: 1.01 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
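
A report of this kind can be regenerated from the source CSV. The following is a minimal sketch, not the exact invocation used for this report: the function name `build_profile_report`, the file paths passed to it, the report title, and the `cp949` encoding are all assumptions made for illustration.

```python
def build_profile_report(csv_path, html_path, encoding="cp949"):
    """Profile a CSV file and write the HTML report to html_path."""
    # Imported lazily so this sketch can be loaded without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the encoding is a guess; Korean public-data CSV files are
    # commonly CP949 or UTF-8, so adjust it to match the downloaded file.
    df = pd.read_csv(csv_path, encoding=encoding)

    # The title is illustrative; default settings are used for everything else.
    report = ProfileReport(df, title="Smart factory deployment status")
    report.to_file(html_path)
    return report
```

Calling `build_profile_report("smart_factory.csv", "report.html")` with a hypothetical local file name would write the HTML report next to the script.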

Variables

연번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 2363
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1182
Minimum: 1
Maximum: 2363
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 20.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 119.1
Q1: 591.5
Median: 1182
Q3: 1772.5
95-th percentile: 2244.9
Maximum: 2363
Range: 2362
Interquartile range (IQR): 1181

Descriptive statistics

Standard deviation: 682.28367
Coefficient of variation (CV): 0.57722814
Kurtosis: -1.2
Mean: 1182
Median Absolute Deviation (MAD): 591
Skewness: 0
Sum: 2793066
Variance: 465511
Monotonicity: Strictly increasing
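
The statistics above are consistent with 연번 being a plain running index. A minimal sketch, assuming the column is exactly the sequence 1 to 2363 and that percentiles use linear interpolation between ranks (the helper `percentile` is illustrative and not part of ydata-profiling):

```python
import math
import statistics


def percentile(sorted_values, p):
    """Return the p-th percentile using linear interpolation between ranks."""
    pos = (len(sorted_values) - 1) * p / 100
    lower = math.floor(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac


# Assumption: 연번 is exactly the running index 1, 2, ..., 2363.
values = list(range(1, 2364))

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation
median = statistics.median(values)
mad = statistics.median([abs(v - median) for v in values])
iqr = percentile(values, 75) - percentile(values, 25)
strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

print(f"sum={sum(values)} mean={mean} variance={variance}")
print(f"std={std:.5f} cv={cv:.8f} mad={mad} iqr={iqr}")
print(f"p5={percentile(values, 5):.1f} p95={percentile(values, 95):.1f}")
print(f"strictly increasing: {strictly_increasing}")
```

Under that assumption the sketch reproduces the reported sum, mean, variance, MAD, and quartiles, which suggests the column carries no information beyond row order.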
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | < 0.1%
1589 | 1 | < 0.1%
1573 | 1 | < 0.1%
1574 | 1 | < 0.1%
1575 | 1 | < 0.1%
1576 | 1 | < 0.1%
1577 | 1 | < 0.1%
1578 | 1 | < 0.1%
1579 | 1 | < 0.1%
1580 | 1 | < 0.1%
Other values (2353) | 2353 | 99.6%

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 1 | < 0.1%

Value | Count | Frequency (%)
2363 | 1 | < 0.1%
2362 | 1 | < 0.1%
2361 | 1 | < 0.1%
2360 | 1 | < 0.1%
2359 | 1 | < 0.1%
2358 | 1 | < 0.1%
2357 | 1 | < 0.1%
2356 | 1 | < 0.1%
2355 | 1 | < 0.1%
2354 | 1 | < 0.1%

기업명
Text

Distinct: 1991
Distinct (%): 84.3%
Missing: 0
Missing (%): 0.0%
Memory size: 18.6 KiB

Length

Max length: 32
Median length: 19
Mean length: 7.2856538
Min length: 2

Characters and Unicode

Total characters: 17216
Distinct characters: 506
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
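
As an illustration of those character properties, the sketch below counts the Unicode general category of each character in one company name taken from this report's sample rows. It uses Python's standard `unicodedata` module; the helper name `category_counts` is hypothetical, and this is not claimed to be how the profiler computes its own tallies.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters in text by Unicode general category (e.g. 'Lo', 'Lu')."""
    return Counter(unicodedata.category(ch) for ch in text)


# Company name copied from the sample rows of this report.
counts = category_counts("(주)갑신RUBBER")
for category, count in counts.most_common():
    print(category, count)
```

The two-letter codes map onto the category names used in this report: `Lo` is Other Letter (Hangul syllables), `Lu` is Uppercase Letter, `Ps` is Open Punctuation, and `Pe` is Close Punctuation.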

Unique

Unique: 1676
Unique (%): 70.9%
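
"Distinct" and "Unique" measure different things: distinct counts how many different values occur, while unique counts the values that occur exactly once. A minimal sketch of the distinction on a hypothetical toy column (the helper `distinct_and_unique` is illustrative only), followed by a check of the percentages reported for this variable:

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (distinct, unique): different values, and values seen exactly once."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Hypothetical toy column: "A" repeats, "B" and "C" occur once each.
print(distinct_and_unique(["A", "A", "B", "C"]))  # (3, 2)

# Percentages reported for this variable, recomputed from 2363 observations.
print(round(1991 / 2363 * 100, 1))  # 84.3
print(round(1676 / 2363 * 100, 1))  # 70.9
```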

Sample

1st row: 티에스테크
2nd row: 하나앤스틸주식회사
3rd row: 에이치앤스틸주식회사
4th row: (주)갑신RUBBER
5th row: (주)삼원양산공장
ValueCountFrequency (%)
주식회사 223
 
8.2%
20
 
0.7%
농업회사법인 18
 
0.7%
창원공장 11
 
0.4%
2공장 6
 
0.2%
함안공장 5
 
0.2%
세진공업(주 5
 
0.2%
풍원공업(주 4
 
0.1%
주)디케이 4
 
0.1%
주)정민기전 4
 
0.1%
Other values (1997) 2436
89.0%

Most occurring characters

ValueCountFrequency (%)
1636
 
9.5%
) 1308
 
7.6%
( 1295
 
7.5%
532
 
3.1%
473
 
2.7%
448
 
2.6%
388
 
2.3%
377
 
2.2%
340
 
2.0%
333
 
1.9%
Other values (496) 10086
58.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 13730 | 79.8%
Close Punctuation | 1308 | 7.6%
Open Punctuation | 1295 | 7.5%
Space Separator | 532 | 3.1%
Uppercase Letter | 217 | 1.3%
Other Symbol | 43 | 0.2%
Decimal Number | 38 | 0.2%
Lowercase Letter | 28 | 0.2%
Other Punctuation | 23 | 0.1%
Connector Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1636
 
11.9%
473
 
3.4%
448
 
3.3%
388
 
2.8%
377
 
2.7%
340
 
2.5%
333
 
2.4%
304
 
2.2%
286
 
2.1%
282
 
2.1%
Other values (446) 8863
64.6%
Uppercase Letter
ValueCountFrequency (%)
E 22
 
10.1%
S 21
 
9.7%
T 21
 
9.7%
N 21
 
9.7%
C 16
 
7.4%
M 16
 
7.4%
G 14
 
6.5%
H 10
 
4.6%
R 9
 
4.1%
L 8
 
3.7%
Other values (12) 59
27.2%
Lowercase Letter
ValueCountFrequency (%)
c 4
14.3%
o 3
10.7%
s 2
 
7.1%
n 2
 
7.1%
a 2
 
7.1%
r 2
 
7.1%
t 2
 
7.1%
h 2
 
7.1%
i 2
 
7.1%
e 2
 
7.1%
Other values (4) 5
17.9%
Decimal Number
ValueCountFrequency (%)
2 29
76.3%
3 5
 
13.2%
8 2
 
5.3%
1 2
 
5.3%
Other Punctuation
ValueCountFrequency (%)
. 14
60.9%
& 6
26.1%
, 2
 
8.7%
/ 1
 
4.3%
Close Punctuation
ValueCountFrequency (%)
) 1308
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1295
100.0%
Space Separator
ValueCountFrequency (%)
532
100.0%
Other Symbol
ValueCountFrequency (%)
43
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 13773 | 80.0%
Common | 3198 | 18.6%
Latin | 245 | 1.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1636
 
11.9%
473
 
3.4%
448
 
3.3%
388
 
2.8%
377
 
2.7%
340
 
2.5%
333
 
2.4%
304
 
2.2%
286
 
2.1%
282
 
2.0%
Other values (447) 8906
64.7%
Latin
ValueCountFrequency (%)
E 22
 
9.0%
S 21
 
8.6%
T 21
 
8.6%
N 21
 
8.6%
C 16
 
6.5%
M 16
 
6.5%
G 14
 
5.7%
H 10
 
4.1%
R 9
 
3.7%
L 8
 
3.3%
Other values (26) 87
35.5%
Common
ValueCountFrequency (%)
) 1308
40.9%
( 1295
40.5%
532
16.6%
2 29
 
0.9%
. 14
 
0.4%
& 6
 
0.2%
3 5
 
0.2%
8 2
 
0.1%
, 2
 
0.1%
1 2
 
0.1%
Other values (3) 3
 
0.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 13730 | 79.8%
ASCII | 3443 | 20.0%
None | 43 | 0.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1636
 
11.9%
473
 
3.4%
448
 
3.3%
388
 
2.8%
377
 
2.7%
340
 
2.5%
333
 
2.4%
304
 
2.2%
286
 
2.1%
282
 
2.1%
Other values (446) 8863
64.6%
ASCII
ValueCountFrequency (%)
) 1308
38.0%
( 1295
37.6%
532
15.5%
2 29
 
0.8%
E 22
 
0.6%
S 21
 
0.6%
T 21
 
0.6%
N 21
 
0.6%
C 16
 
0.5%
M 16
 
0.5%
Other values (39) 162
 
4.7%
None
ValueCountFrequency (%)
43
100.0%

업종
Text

Distinct: 595
Distinct (%): 25.2%
Missing: 0
Missing (%): 0.0%
Memory size: 18.6 KiB

Length

Max length: 29
Median length: 22
Mean length: 12.790944
Min length: 1
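
Length statistics of this kind count characters per value. The sketch below computes them for the first five 업종 values shown in this report's sample rows, and then recomputes the whole-column mean from the reported totals (30225 characters over 2363 observations). It assumes lengths are counted in Unicode code points, which is what Python's `len` does for strings.

```python
import statistics

# First five 업종 values, copied from the sample rows of this report.
sample = [
    "산업용 로봇 제조업",
    "그외 기타 1차 철강 제조업",
    "그외 기타 1차 철강 제조업",
    "산업용 비경화고무제품 제조업",
    "모 방적업",
]

lengths = [len(s) for s in sample]  # len() counts Unicode code points
print("min:", min(lengths), "max:", max(lengths))
print("median:", statistics.median(lengths), "mean:", statistics.mean(lengths))

# Whole-column mean length: total characters / number of observations.
column_mean_length = 30225 / 2363
print(f"{column_mean_length:.6f}")
```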

Characters and Unicode

Total characters: 30225
Distinct characters: 304
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 315
Unique (%): 13.3%

Sample

1st row: 산업용 로봇 제조업
2nd row: 그외 기타 1차 철강 제조업
3rd row: 그외 기타 1차 철강 제조업
4th row: 산업용 비경화고무제품 제조업
5th row: 모 방적업
ValueCountFrequency (%)
제조업 1515
 
17.3%
595
 
6.8%
기타 591
 
6.8%
391
 
4.5%
385
 
4.4%
부품 325
 
3.7%
신품 217
 
2.5%
금속 203
 
2.3%
자동차 185
 
2.1%
자동차용 176
 
2.0%
Other values (733) 4168
47.6%

Most occurring characters

ValueCountFrequency (%)
6473
21.4%
2318
 
7.7%
2151
 
7.1%
2107
 
7.0%
1391
 
4.6%
1289
 
4.3%
636
 
2.1%
627
 
2.1%
621
 
2.1%
569
 
1.9%
Other values (294) 12043
39.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 23406 | 77.4%
Space Separator | 6473 | 21.4%
Other Punctuation | 220 | 0.7%
Close Punctuation | 40 | 0.1%
Decimal Number | 40 | 0.1%
Open Punctuation | 39 | 0.1%
Dash Punctuation | 6 | < 0.1%
Control | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2318
 
9.9%
2151
 
9.2%
2107
 
9.0%
1391
 
5.9%
1289
 
5.5%
636
 
2.7%
627
 
2.7%
621
 
2.7%
569
 
2.4%
546
 
2.3%
Other values (284) 11151
47.6%
Other Punctuation
ValueCountFrequency (%)
, 214
97.3%
. 4
 
1.8%
/ 1
 
0.5%
; 1
 
0.5%
Space Separator
ValueCountFrequency (%)
6473
100.0%
Close Punctuation
ValueCountFrequency (%)
) 40
100.0%
Decimal Number
ValueCountFrequency (%)
1 40
100.0%
Open Punctuation
ValueCountFrequency (%)
( 39
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6
100.0%
Control
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 23406 | 77.4%
Common | 6819 | 22.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2318
 
9.9%
2151
 
9.2%
2107
 
9.0%
1391
 
5.9%
1289
 
5.5%
636
 
2.7%
627
 
2.7%
621
 
2.7%
569
 
2.4%
546
 
2.3%
Other values (284) 11151
47.6%
Common
ValueCountFrequency (%)
6473
94.9%
, 214
 
3.1%
) 40
 
0.6%
1 40
 
0.6%
( 39
 
0.6%
- 6
 
0.1%
. 4
 
0.1%
/ 1
 
< 0.1%
; 1
 
< 0.1%
1
 
< 0.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 23390 | 77.4%
ASCII | 6819 | 22.6%
Compat Jamo | 16 | 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6473
94.9%
, 214
 
3.1%
) 40
 
0.6%
1 40
 
0.6%
( 39
 
0.6%
- 6
 
0.1%
. 4
 
0.1%
/ 1
 
< 0.1%
; 1
 
< 0.1%
1
 
< 0.1%
Hangul
ValueCountFrequency (%)
2318
 
9.9%
2151
 
9.2%
2107
 
9.0%
1391
 
5.9%
1289
 
5.5%
636
 
2.7%
627
 
2.7%
621
 
2.7%
569
 
2.4%
546
 
2.3%
Other values (283) 11135
47.6%
Compat Jamo
ValueCountFrequency (%)
16
100.0%

소재지
Categorical

Distinct: 18
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 18.6 KiB

창원시: 712
김해시: 700
양산시: 280
함안군: 160
진주시: 134
Other values (13): 377

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 창원시
2nd row: 김해시
3rd row: 김해시
4th row: 김해시
5th row: 양산시

Common Values

Value | Count | Frequency (%)
창원시 | 712 | 30.1%
김해시 | 700 | 29.6%
양산시 | 280 | 11.8%
함안군 | 160 | 6.8%
진주시 | 134 | 5.7%
사천시 | 82 | 3.5%
밀양시 | 72 | 3.0%
창녕군 | 67 | 2.8%
거제시 | 30 | 1.3%
의령군 | 25 | 1.1%
Other values (8) | 101 | 4.3%
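
The frequencies above are each count divided by the 2363 observations. A minimal sketch that recomputes them from the listed counts and also derives the "Other values (13)" total shown in the variable summary; the eight smallest categories are reported only in aggregate, so they enter here as a single pooled count.

```python
# Counts copied from the table above.
counts = {
    "창원시": 712, "김해시": 700, "양산시": 280, "함안군": 160, "진주시": 134,
    "사천시": 82, "밀양시": 72, "창녕군": 67, "거제시": 30, "의령군": 25,
}
other_count = 101      # the 8 remaining categories, reported only in aggregate
other_categories = 8

total = sum(counts.values()) + other_count

# Per-category share of all observations, formatted to one decimal place.
shares = {label: f"{n / total:.1%}" for label, n in counts.items()}
for label, share in shares.items():
    print(label, counts[label], share)

# Pooling everything outside the five most frequent categories.
top5_total = sum(sorted(counts.values(), reverse=True)[:5])
pooled_count = total - top5_total
pooled_categories = len(counts) - 5 + other_categories
print(f"Other values ({pooled_categories}): {pooled_count}")
```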

Length

Histogram of lengths of the category

비고(연도)
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 18.6 KiB

2020: 686
2021: 554
2019: 548
2022: 333
2018: 242

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2018
2nd row: 2018
3rd row: 2018
4th row: 2018
5th row: 2018

Common Values

Value | Count | Frequency (%)
2020 | 686 | 29.0%
2021 | 554 | 23.4%
2019 | 548 | 23.2%
2022 | 333 | 14.1%
2018 | 242 | 10.2%

Length

Histogram of lengths of the category

Common Values (Plot)


Interactions


Correlations

Variable | 연번 | 소재지 | 비고(연도)
연번 | 1.000 | 0.105 | 0.997
소재지 | 0.105 | 1.000 | 0.081
비고(연도) | 0.997 | 0.081 | 1.000

Variable | 소재지 | 비고(연도)
소재지 | 1.000 | 0.041
비고(연도) | 0.041 | 1.000

Variable | 연번 | 소재지 | 비고(연도)
연번 | 1.000 | 0.040 | 0.918
소재지 | 0.040 | 1.000 | 0.041
비고(연도) | 0.918 | 0.041 | 1.000
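
The extracted text does not state which coefficient each matrix holds. For a pair of categorical columns such as 소재지 and 비고(연도), one commonly used association measure is Cramér's V. The sketch below computes its plain, uncorrected form on hypothetical contingency tables; it is offered only as an illustration of the idea and is not claimed to reproduce any value above.

```python
import math


def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    # Normalise by the sample size and the smaller table dimension minus one.
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))


# Hypothetical tables: one perfectly associated, one with no association.
print(cramers_v([[10, 0], [0, 10]]))
print(cramers_v([[5, 5], [5, 5]]))
```

The sketch assumes every row and column total is non-zero; a value near 0 indicates little association between the two columns and a value near 1 indicates a strong one.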

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 연번 | 기업명 | 업종 | 소재지 | 비고(연도)
0 | 1 | 티에스테크 | 산업용 로봇 제조업 | 창원시 | 2018
1 | 2 | 하나앤스틸주식회사 | 그외 기타 1차 철강 제조업 | 김해시 | 2018
2 | 3 | 에이치앤스틸주식회사 | 그외 기타 1차 철강 제조업 | 김해시 | 2018
3 | 4 | (주)갑신RUBBER | 산업용 비경화고무제품 제조업 | 김해시 | 2018
4 | 5 | (주)삼원양산공장 | 모 방적업 | 양산시 | 2018
5 | 6 | (주)건화 창원공장 | 토목공사 및 유사용 기계장비 제조업 | 창원시 | 2018
6 | 7 | (주)유니크 | 그외 기타 자동차 부품 제조업 | 김해시 | 2018
7 | 8 | (주)유록 | 도장 및 기타 피막처리업 | 창원시 | 2018
8 | 9 | 한영정밀(주) | 절삭가공 및 유사처리업 | 창원시 | 2018
9 | 10 | 상진정밀 | 자동차용 동력전달장치 제조업 | 창원시 | 2018

Index | 연번 | 기업명 | 업종 | 소재지 | 비고(연도)
2353 | 2354 | (주)문교오엔에스 | 기타 제품 제조업 | 김해시 | 2022
2354 | 2355 | (주)성미 | 선박 구성 부분품 제조업 | 밀양시 | 2022
2355 | 2356 | (유)삼송 창원공장 | 그 외 자동차용 신품 부품 제조업 | 창원시 | 2022
2356 | 2357 | 대한오토텍 | 자동차 차체용 신품 부품 제조업 | 양산시 | 2022
2357 | 2358 | 대성나찌유압공업(주) | 유압 기기 제조업 | 양산시 | 2022
2358 | 2359 | (주)한라공업 | 자동차 차체용 신품 부품 제조업 | 양산시 | 2022
2359 | 2360 | (주)제이에스디 | 그 외 자동차용 신품 부품 제조업 | 김해시 | 2022
2360 | 2361 | 진흥공업(주) | 자동차용 금속 압형제품 제조업 | 김해시 | 2022
2361 | 2362 | (주)디에이치아이 | 선박 구성부분품 제조업 | 사천시 | 2022
2362 | 2363 | 일진산업(주) | 동 압연, 압출 및 연신제품 제조업 | 창원시 | 2022