Overview

Dataset statistics

Number of variables: 6
Number of observations: 65
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.3 KiB
Average record size in memory: 52.0 B

Variable types

Numeric: 2
Text: 3
Categorical: 1

Dataset

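Description: Data on the companies currently occupying the Saemangeum Industrial Complex, providing fields such as company name, location, area, business field, and progress status.
Author: 공공데이터포털 (Public Data Portal)
URL: https://www.data.go.kr/data/15002297/fileData.do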

Alerts

면적(제곱미터) is highly overall correlated with 위치 (High correlation)
위치 is highly overall correlated with 면적(제곱미터) (High correlation)
연번 has unique values (Unique)
회사명칭 has unique values (Unique)

Reproduction

Analysis started: 2024-04-21 13:26:26.680140
Analysis finished: 2024-04-21 13:26:28.899127
Duration: 2.22 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번 (serial number)
Real number (ℝ)

UNIQUE

Distinct: 65
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 33
Minimum: 1
Maximum: 65
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 713.0 B

Quantile statistics

Minimum: 1
5-th percentile: 4.2
Q1: 17
Median: 33
Q3: 49
95-th percentile: 61.8
Maximum: 65
Range: 64
Interquartile range (IQR): 32
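
The quantile figures above are consistent with linear interpolation between order statistics. The following is a minimal sketch, assuming that 연번 holds exactly the integers 1 through 65 (the report shows 65 distinct values between 1 and 65, but not the raw column). The helper name `percentile` is hypothetical and is not part of the profiling tool.

```python
def percentile(values, q):
    """Linear-interpolation percentile for q in [0, 1]."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)              # fractional index into the sorted data
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


serial = list(range(1, 66))                   # assumed contents of 연번

p5 = percentile(serial, 0.05)
q1 = percentile(serial, 0.25)
median = percentile(serial, 0.50)
q3 = percentile(serial, 0.75)
p95 = percentile(serial, 0.95)
iqr = q3 - q1                                 # interquartile range
```

Under that assumption the results agree with the table: roughly 4.2 and 61.8 for the 5th and 95th percentiles, and an IQR of 32.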

Descriptive statistics

Standard deviation: 18.90767
Coefficient of variation (CV): 0.57295971
Kurtosis: -1.2
Mean: 33
Median Absolute Deviation (MAD): 16
Skewness: 0
Sum: 2145
Variance: 357.5
Monotonicity: Strictly increasing
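
Several of these figures can be re-derived with the Python standard library. This is a sketch under the assumption that 연번 is the sequence 1 to 65; it is not the profiling tool's own code. Note that the reported variance (357.5) matches the sample variance, which divides by n - 1.

```python
import statistics

serial = list(range(1, 66))                   # assumed contents of 연번

total = sum(serial)
mean = statistics.mean(serial)
variance = statistics.variance(serial)        # sample variance (n - 1 denominator)
std_dev = statistics.stdev(serial)            # square root of the sample variance
cv = std_dev / mean                           # coefficient of variation
median = statistics.median(serial)
# Median absolute deviation: median of the distances from the median.
mad = statistics.median(abs(x - median) for x in serial)
```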

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
1                   1      1.5%
50                  1      1.5%
36                  1      1.5%
37                  1      1.5%
38                  1      1.5%
39                  1      1.5%
40                  1      1.5%
41                  1      1.5%
42                  1      1.5%
43                  1      1.5%
Other values (55)   55     84.6%

Minimum 10 values

Value               Count  Frequency (%)
1                   1      1.5%
2                   1      1.5%
3                   1      1.5%
4                   1      1.5%
5                   1      1.5%
6                   1      1.5%
7                   1      1.5%
8                   1      1.5%
9                   1      1.5%
10                  1      1.5%

Maximum 10 values

Value               Count  Frequency (%)
65                  1      1.5%
64                  1      1.5%
63                  1      1.5%
62                  1      1.5%
61                  1      1.5%
60                  1      1.5%
59                  1      1.5%
58                  1      1.5%
57                  1      1.5%
56                  1      1.5%

회사명칭 (company name)
Text

UNIQUE

Distinct: 65
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 648.0 B

Length

Max length: 16
Median length: 11
Mean length: 7.3538462
Min length: 3

Characters and Unicode

Total characters: 478
Distinct characters: 159
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
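
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below applies it to the five company names listed in this variable's Sample section. It is an assumption that this mirrors how the report groups characters; the profiling tool's own implementation is not shown in the report.

```python
import unicodedata
from collections import Counter

# First five company names from the Sample list in this report.
names = ["OCI㈜", "OCISE㈜", "도레이첨단소재㈜", "솔베이실리카코리아㈜", "㈜이씨에스"]

# Two-letter general-category codes:
#   Lu = Uppercase Letter, Lo = Other Letter, So = Other Symbol.
category_counts = Counter(
    unicodedata.category(ch) for name in names for ch in name
)
total_chars = sum(len(name) for name in names)
```

For these five names the Hangul syllables fall under Other Letter, the Latin capitals under Uppercase Letter, and the ㈜ sign under Other Symbol.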

Unique

Unique: 65
Unique (%): 100.0%

Sample

1st row: OCI㈜
2nd row: OCISE㈜
3rd row: 도레이첨단소재㈜
4th row: 솔베이실리카코리아㈜
5th row: ㈜이씨에스

Words

Value               Count  Frequency (%)
oci㈜                1      1.5%
㈜풍천엔지니어링      1      1.5%
㈜배터리솔루션        1      1.5%
㈜코스텍             1      1.5%
유)촌빛바이오         1      1.5%
㈜대흥씨씨유          1      1.5%
한국에너지공단        1      1.5%
성일하이텍㈜          1      1.5%
유)도원산업기계       1      1.5%
디앨㈜               1      1.5%
Other values (55)   55     84.6%

Most occurring characters

Value               Count  Frequency (%)
[not recovered]     51     10.7%
[not recovered]     22     4.6%
)                   14     2.9%
[not recovered]     14     2.9%
[not recovered]     14     2.9%
(                   14     2.9%
[not recovered]     12     2.5%
[not recovered]     10     2.1%
[not recovered]     8      1.7%
[not recovered]     8      1.7%
Other values (149)  311    65.1%

Most occurring categories

Value               Count  Frequency (%)
Other Letter        382    79.9%
Other Symbol        51     10.7%
Close Punctuation   14     2.9%
Open Punctuation    14     2.9%
Uppercase Letter    11     2.3%
Decimal Number      4      0.8%
Other Punctuation   2      0.4%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
[not recovered]     22     5.8%
[not recovered]     14     3.7%
[not recovered]     14     3.7%
[not recovered]     12     3.1%
[not recovered]     10     2.6%
[not recovered]     8      2.1%
[not recovered]     8      2.1%
[not recovered]     7      1.8%
[not recovered]     7      1.8%
[not recovered]     7      1.8%
Other values (137)  273    71.5%

Uppercase Letter
Value               Count  Frequency (%)
I                   3      27.3%
C                   2      18.2%
O                   2      18.2%
E                   1      9.1%
S                   1      9.1%
M                   1      9.1%
B                   1      9.1%

Other Symbol
Value               Count  Frequency (%)
[not recovered]     51     100.0%

Close Punctuation
Value               Count  Frequency (%)
)                   14     100.0%

Open Punctuation
Value               Count  Frequency (%)
(                   14     100.0%

Decimal Number
Value               Count  Frequency (%)
2                   4      100.0%

Other Punctuation
Value               Count  Frequency (%)
.                   2      100.0%

Most occurring scripts

Value               Count  Frequency (%)
Hangul              433    90.6%
Common              34     7.1%
Latin               11     2.3%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
[not recovered]     51     11.8%
[not recovered]     22     5.1%
[not recovered]     14     3.2%
[not recovered]     14     3.2%
[not recovered]     12     2.8%
[not recovered]     10     2.3%
[not recovered]     8      1.8%
[not recovered]     8      1.8%
[not recovered]     7      1.6%
[not recovered]     7      1.6%
Other values (138)  280    64.7%

Latin
Value               Count  Frequency (%)
I                   3      27.3%
C                   2      18.2%
O                   2      18.2%
E                   1      9.1%
S                   1      9.1%
M                   1      9.1%
B                   1      9.1%

Common
Value               Count  Frequency (%)
)                   14     41.2%
(                   14     41.2%
2                   4      11.8%
.                   2      5.9%

Most occurring blocks

Value               Count  Frequency (%)
Hangul              382    79.9%
None                51     10.7%
ASCII               45     9.4%

Most frequent character per block

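None
Value               Count  Frequency (%)
[not recovered]     51     100.0%

Hangul
Value               Count  Frequency (%)
[not recovered]     22     5.8%
[not recovered]     14     3.7%
[not recovered]     14     3.7%
[not recovered]     12     3.1%
[not recovered]     10     2.6%
[not recovered]     8      2.1%
[not recovered]     8      2.1%
[not recovered]     7      1.8%
[not recovered]     7      1.8%
[not recovered]     7      1.8%
Other values (137)  273    71.5%

ASCII
Value               Count  Frequency (%)
)                   14     31.1%
(                   14     31.1%
2                   4      8.9%
I                   3      6.7%
.                   2      4.4%
C                   2      4.4%
O                   2      4.4%
E                   1      2.2%
S                   1      2.2%
M                   1      2.2%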

위치 (location)
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): 7.7%
Missing: 0
Missing (%): 0.0%
Memory size: 648.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 1
Unique (%): 1.5%

Sample

1st row: 3공구
2nd row: 2공구
3rd row: 2공구
4th row: 2공구
5th row: 2공구

Common Values

Value               Count  Frequency (%)
2공구                34     52.3%
1공구                24     36.9%
5공구                4      6.2%
6공구                2      3.1%
3공구                1      1.5%
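
The Frequency (%) column is each count divided by the 65 observations. A short sketch that recomputes it from the counts printed above, with illustrative variable names:

```python
# Counts copied from the Common Values table for 위치.
counts = {"2공구": 34, "1공구": 24, "5공구": 4, "6공구": 2, "3공구": 1}

total = sum(counts.values())                            # number of observations
freq = {zone: round(100 * n / total, 1) for zone, n in counts.items()}
distinct_pct = round(100 * len(counts) / total, 1)      # share of distinct values
```

The same division explains the Distinct (%) figure: 5 distinct zones out of 65 rows is about 7.7%.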

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed in the Common Values table]

면적(제곱미터) (area in square meters)
Real number (ℝ)

HIGH CORRELATION

Distinct: 64
Distinct (%): 98.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 56703.835
Minimum: 1653
Maximum: 330000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 713.0 B

Quantile statistics

Minimum: 1653
5-th percentile: 6720.04
Q1: 15200
Median: 33000
Q3: 69991.1
95-th percentile: 184129.7
Maximum: 330000
Range: 328347
Interquartile range (IQR): 54791.1

Descriptive statistics

Standard deviation: 65690.83
Coefficient of variation (CV): 1.1584901
Kurtosis: 5.3582262
Mean: 56703.835
Median Absolute Deviation (MAD): 20438.9
Skewness: 2.2168013
Sum: 3685749.3
Variance: 4.3152852 × 10^9
Monotonicity: Not monotonic
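
The figures reported for 면적(제곱미터) are mutually consistent, which a few lines of arithmetic can confirm. The sketch below only re-derives values from numbers printed in this report (Sum, Standard deviation, Q1, Q3, Minimum, Maximum, and the 65 observations); the variable names are illustrative, and small differences come from the rounding of the printed inputs.

```python
n = 65                          # number of observations
total_area = 3685749.3          # Sum
std_dev = 65690.83              # Standard deviation
q1, q3 = 15200.0, 69991.1       # Q1 and Q3
minimum, maximum = 1653.0, 330000.0

mean = total_area / n           # arithmetic mean
cv = std_dev / mean             # coefficient of variation
variance = std_dev ** 2         # variance is the squared standard deviation
iqr = q3 - q1                   # interquartile range
value_range = maximum - minimum
```

A coefficient of variation above 1, together with a mean well above the median (56703.835 versus 33000), reflects the right-skewed distribution indicated by the positive skewness.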

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
33000.0             2      3.1%
271395.6            1      1.5%
14885.1             1      1.5%
10002.2             1      1.5%
187371.0            1      1.5%
22192.2             1      1.5%
77529.7             1      1.5%
14478.0             1      1.5%
34211.0             1      1.5%
75429.4             1      1.5%
Other values (54)   54     83.1%

Minimum 10 values

Value               Count  Frequency (%)
1653.0              1      1.5%
5000.0              1      1.5%
5180.9              1      1.5%
6612.0              1      1.5%
7152.2              1      1.5%
7296.0              1      1.5%
9125.0              1      1.5%
10002.2             1      1.5%
10709.7             1      1.5%
12083.0             1      1.5%

Maximum 10 values

Value               Count  Frequency (%)
330000.0            1      1.5%
271395.6            1      1.5%
215036.1            1      1.5%
187371.0            1      1.5%
171164.5            1      1.5%
162180.7            1      1.5%
158624.0            1      1.5%
148400.0            1      1.5%
112397.0            1      1.5%
106174.1            1      1.5%

사업분야 (business field)
Text

Distinct: 60
Distinct (%): 92.3%
Missing: 0
Missing (%): 0.0%
Memory size: 648.0 B

Length

Max length: 33
Median length: 25
Mean length: 13.630769
Min length: 5

Characters and Unicode

Total characters: 886
Distinct characters: 184
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 56
Unique (%): 86.2%

Sample

1st row: 폴리실리콘, 카본소재
2nd row: 스팀, 전기(열병합발전소)
3rd row: PPS(자동차, 기계관련 경량화 부품소재) 등 고분자 소재
4th row: 고분산실리카(친환경타이어소재)
5th row: 열교환기, 탱크류

Words

Value               Count  Frequency (%)
이차전지             12     6.5%
[not recovered]     7      3.8%
[not recovered]     6      3.2%
자동차               4      2.2%
제조                 4      2.2%
태양광               4      2.2%
리튬화합물           3      1.6%
제조업               3      1.6%
전해질               3      1.6%
설비                 3      1.6%
Other values (128)  137    73.7%

Most occurring characters

Value               Count  Frequency (%)
(space)             130    14.7%
[not recovered]     39     4.4%
[not recovered]     27     3.0%
[not recovered]     26     2.9%
[not recovered]     26     2.9%
,                   23     2.6%
[not recovered]     21     2.4%
[not recovered]     15     1.7%
[not recovered]     14     1.6%
[not recovered]     14     1.6%
Other values (174)  551    62.2%

Most occurring categories

Value               Count  Frequency (%)
Other Letter        698    78.8%
Space Separator     130    14.7%
Other Punctuation   26     2.9%
Uppercase Letter    11     1.2%
Close Punctuation   10     1.1%
Open Punctuation    10     1.1%
Lowercase Letter    1      0.1%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
[not recovered]     39     5.6%
[not recovered]     27     3.9%
[not recovered]     26     3.7%
[not recovered]     26     3.7%
[not recovered]     21     3.0%
[not recovered]     15     2.1%
[not recovered]     14     2.0%
[not recovered]     14     2.0%
[not recovered]     13     1.9%
[not recovered]     13     1.9%
Other values (159)  490    70.2%

Uppercase Letter
Value               Count  Frequency (%)
S                   2      18.2%
P                   2      18.2%
N                   1      9.1%
C                   1      9.1%
U                   1      9.1%
V                   1      9.1%
T                   1      9.1%
I                   1      9.1%
M                   1      9.1%

Other Punctuation
Value               Count  Frequency (%)
,                   23     88.5%
·                   3      11.5%

Space Separator
Value               Count  Frequency (%)
(space)             130    100.0%

Close Punctuation
Value               Count  Frequency (%)
)                   10     100.0%

Open Punctuation
Value               Count  Frequency (%)
(                   10     100.0%

Lowercase Letter
Value               Count  Frequency (%)
o                   1      100.0%

Most occurring scripts

Value               Count  Frequency (%)
Hangul              698    78.8%
Common              176    19.9%
Latin               12     1.4%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
[not recovered]     39     5.6%
[not recovered]     27     3.9%
[not recovered]     26     3.7%
[not recovered]     26     3.7%
[not recovered]     21     3.0%
[not recovered]     15     2.1%
[not recovered]     14     2.0%
[not recovered]     14     2.0%
[not recovered]     13     1.9%
[not recovered]     13     1.9%
Other values (159)  490    70.2%

Latin
Value               Count  Frequency (%)
S                   2      16.7%
P                   2      16.7%
N                   1      8.3%
C                   1      8.3%
U                   1      8.3%
V                   1      8.3%
T                   1      8.3%
o                   1      8.3%
I                   1      8.3%
M                   1      8.3%

Common
Value               Count  Frequency (%)
(space)             130    73.9%
,                   23     13.1%
)                   10     5.7%
(                   10     5.7%
·                   3      1.7%

Most occurring blocks

Value               Count  Frequency (%)
Hangul              696    78.6%
ASCII               185    20.9%
None                3      0.3%
Compat Jamo         2      0.2%

Most frequent character per block

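ASCII
Value               Count  Frequency (%)
(space)             130    70.3%
,                   23     12.4%
)                   10     5.4%
(                   10     5.4%
S                   2      1.1%
P                   2      1.1%
N                   1      0.5%
C                   1      0.5%
U                   1      0.5%
V                   1      0.5%
Other values (4)    4      2.2%

Hangul
Value               Count  Frequency (%)
[not recovered]     39     5.6%
[not recovered]     27     3.9%
[not recovered]     26     3.7%
[not recovered]     26     3.7%
[not recovered]     21     3.0%
[not recovered]     15     2.2%
[not recovered]     14     2.0%
[not recovered]     14     2.0%
[not recovered]     13     1.9%
[not recovered]     13     1.9%
Other values (158)  488    70.1%

None
Value               Count  Frequency (%)
·                   3      100.0%

Compat Jamo
Value               Count  Frequency (%)
[not recovered]     2      100.0%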

입주계약 (occupancy contract)
Text

Distinct: 36
Distinct (%): 55.4%
Missing: 0
Missing (%): 0.0%
Memory size: 648.0 B

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 520
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 18
Unique (%): 27.7%

Sample

1st row: 2013.03.
2nd row: 2013.09.
3rd row: 2014.01.
4th row: 2015.02.
5th row: 2015.02.

Words

Value               Count  Frequency (%)
2022.12             7      10.8%
2023.03             6      9.2%
2022.03             3      4.6%
2019.12             3      4.6%
2020.12             2      3.1%
2020.01             2      3.1%
2022.09             2      3.1%
2021.11             2      3.1%
2022.05             2      3.1%
2021.01             2      3.1%
Other values (26)   34     52.3%
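
Because this column is stored as fixed-length text of the form `YYYY.MM.`, any time-based analysis needs the strings parsed first. A minimal sketch using the five sample rows above; the helper name `parse_contract_month` is hypothetical, and mapping each value to the first day of its month is an assumption, since the source gives no day.

```python
from collections import Counter
from datetime import date


def parse_contract_month(text):
    """Parse a 'YYYY.MM.' string into a date on the first of that month."""
    year, month = text.rstrip(".").split(".")
    return date(int(year), int(month), 1)   # day 1 is an assumed placeholder


# The five sample rows shown for this variable.
samples = ["2013.03.", "2013.09.", "2014.01.", "2015.02.", "2015.02."]

parsed = [parse_contract_month(s) for s in samples]
per_year = Counter(d.year for d in parsed)  # contracts per calendar year
```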

Most occurring characters

Value               Count  Frequency (%)
2                   155    29.8%
.                   130    25.0%
0                   119    22.9%
1                   58     11.2%
3                   26     5.0%
9                   14     2.7%
5                   5      1.0%
4                   5      1.0%
7                   4      0.8%
6                   2      0.4%

Most occurring categories

Value               Count  Frequency (%)
Decimal Number      390    75.0%
Other Punctuation   130    25.0%

Most frequent character per category

Decimal Number
Value               Count  Frequency (%)
2                   155    39.7%
0                   119    30.5%
1                   58     14.9%
3                   26     6.7%
9                   14     3.6%
5                   5      1.3%
4                   5      1.3%
7                   4      1.0%
6                   2      0.5%
8                   2      0.5%

Other Punctuation
Value               Count  Frequency (%)
.                   130    100.0%

Most occurring scripts

Value               Count  Frequency (%)
Common              520    100.0%

Most frequent character per script

Common
Value               Count  Frequency (%)
2                   155    29.8%
.                   130    25.0%
0                   119    22.9%
1                   58     11.2%
3                   26     5.0%
9                   14     2.7%
5                   5      1.0%
4                   5      1.0%
7                   4      0.8%
6                   2      0.4%

Most occurring blocks

Value               Count  Frequency (%)
ASCII               520    100.0%

Most frequent character per block

ASCII
Value               Count  Frequency (%)
2                   155    29.8%
.                   130    25.0%
0                   119    22.9%
1                   58     11.2%
3                   26     5.0%
9                   14     2.7%
5                   5      1.0%
4                   5      1.0%
7                   4      0.8%
6                   2      0.4%

Interactions

[Figures: four interaction plots]

Correlations

                 연번     회사명칭  위치     면적(제곱미터)  사업분야  입주계약
연번             1.000  1.000    0.000  0.000          0.835    0.982
회사명칭         1.000  1.000    1.000  1.000          1.000    1.000
위치             0.000  1.000    1.000  0.820          0.989    0.000
면적(제곱미터)   0.000  1.000    0.820  1.000          0.000    0.667
사업분야         0.835  1.000    0.989  0.000          1.000    0.984
입주계약         0.982  1.000    0.000  0.667          0.984    1.000

                 연번     면적(제곱미터)  위치
연번             1.000  0.017          0.000
면적(제곱미터)   0.017  1.000          0.634
위치             0.000  0.634          1.000
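
The report does not state which coefficient fills each cell of these matrices. One measure commonly used for a pair of categorical variables is Cramér's V, which also ranges from 0 to 1. The following is an illustrative sketch of the uncorrected statistic, not the profiling tool's implementation; the function name `cramers_v` and the toy inputs are made up for this example.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))      # observed counts per (x, y) cell
    row = Counter(xs)                 # marginal counts of x
    col = Counter(ys)                 # marginal counts of y
    chi2 = 0.0
    for r, r_count in row.items():
        for c, c_count in col.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Toy data (invented for illustration, not taken from the dataset).
perfect = cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"])
independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
```

With a measure of this kind, a column whose values are all distinct (such as 회사명칭) determines every other column exactly, which would be one possible reading of the 1.000 entries in its row.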

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

(index)  연번  회사명칭              위치    면적(제곱미터)  사업분야                                         입주계약
0        1     OCI㈜                 3공구   271395.6       폴리실리콘, 카본소재                             2013.03.
1        2     OCISE㈜               2공구   162180.7       스팀, 전기(열병합발전소)                         2013.09.
2        3     도레이첨단소재㈜       2공구   215036.1       PPS(자동차, 기계관련 경량화 부품소재) 등 고분자 소재  2014.01.
3        4     솔베이실리카코리아㈜   2공구   69991.1        고분산실리카(친환경타이어소재)                   2015.02.
4        5     ㈜이씨에스            2공구   6612.0         열교환기, 탱크류                                 2015.02.
5        6     ㈜네모이엔지          2공구   66000.0        수상태양광부유체                                 2019.01.
6        7     ㈜레나인터내셔널      2공구   76000.0        태양광모듈, 태양광구조물, 에너지 저장장치         2019.04.
7        8     ㈜풍림파마텍          1공구   33000.0        주사기 등 의료용품                               2019.05.
8        9     ㈜테크윈              2공구   19900.0        수상태양광부유체                                 2019.06.
9        10    ㈜우석에이엠테크      2공구   23196.3        스마트 계량기                                    2019.07.

Last rows

(index)  연번  회사명칭                        위치    면적(제곱미터)  사업분야                     입주계약
55       56    ㈜덕산테코피아                  2공구   93036.0        이차전지 전해질              2023.03.
56       57    ㈜하이드로리튬                  1공구   99900.0        이차전지 리튬화합물          2023.03.
57       58    ㈜어반리튬                      1공구   61000.0        이차전지 리튬화합물          2023.03.
58       59    ㈜풍림파마텍(2차)               1공구   13228.3        의료기기                     2023.03.
59       60    지이엠코리아뉴에너지머티리얼즈㈜  6공구   330000.0       이차전지 전구체              2023.03.
60       61    ㈜에코앤드림                    1공구   148400.0       이차전지 양극활물질 전구체   2023.04.
61       62    ㈜리카본솔루션즈                1공구   29000.0        탄소저감설비                 2023.04.
62       63    ㈜이디엘                        5공구   112397.0       이차전지 소재(리튬염)        2023.07.
63       64    ㈜제이아이테크                  2공구   31508.0        반도체용 프리커서 등         2023.07.
64       65    성일하이텍㈜(2차)               2공구   12561.1        이차전지 소재(NCM 액상)      2023.08.