Overview

Dataset statistics

Number of variables: 6
Number of observations: 93
Missing cells: 5
Missing cells (%): 0.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.7 KiB
Average record size in memory: 51.4 B
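The percentages and sizes above follow from the raw counts. A minimal sketch of that arithmetic, using only figures quoted in this block (the variable names are illustrative and not part of the report):

```python
# Figures copied from the "Dataset statistics" block of the report.
n_variables = 6
n_observations = 93
missing_cells = 5
avg_record_size_bytes = 51.4

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

# Share of cells that are empty, as a percentage.
missing_pct = 100 * missing_cells / total_cells

# Total size recovered from the average record size (1 KiB = 1024 B).
total_size_kib = avg_record_size_bytes * n_observations / 1024

print(f"Total cells: {total_cells}")
print(f"Missing cells: {missing_pct:.1f}%")
print(f"Total size: {total_size_kib:.1f} KiB")
```

Rounded to one decimal place, both derived values agree with the figures reported above.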

Variable types

Categorical: 3
Numeric: 2
DateTime: 1

Dataset

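Description: Data on investment-related MOUs that Gyeongsangbuk-do (경상북도) signed with companies. Company names are not disclosed. For MOUs whose 고용인원 (number of employees) is blank, the headcount will be calculated later, once the actual investment is made.
Author: 경상북도 (Gyeongsangbuk-do)
URL: https://www.data.go.kr/data/15062931/fileData.do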
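Because the description says that 고용인원 may be left blank until the real investment is known, a reader loading the file will probably want to treat blanks as "unknown" and not as zero. The sketch below assumes a comma-separated layout with the report's column names; the inline excerpt is hypothetical (two rows mirror rows from the report's sample, the third row with company X사 is invented to show a blank), and `parse_headcount` is an illustrative helper.

```python
import csv
import io

# Hypothetical CSV excerpt. The real file layout is an assumption; the last
# row (X사) is invented purely to demonstrate a blank 고용인원 cell.
raw = """기업명,구분,위치,투자규모(억원),고용인원,체결일자
S사,MOU체결,영주시,2000,200,2021-01-27
A사,MOU체결,김천시,1200,100,2021-02-03
X사,MOU체결,구미시,300,,2022-01-01
"""


def parse_headcount(text):
    """Return an int, or None when the cell is blank (headcount not yet known)."""
    text = text.strip()
    return int(text) if text else None


rows = list(csv.DictReader(io.StringIO(raw)))
headcounts = [parse_headcount(row["고용인원"]) for row in rows]

# Average only over the MOUs whose headcount is already known.
known = [h for h in headcounts if h is not None]
mean_known = sum(known) / len(known)

print(headcounts)
print(mean_known)
```

Keeping blanks as `None` avoids dragging the average down with artificial zeros, which matters here because the report counts a genuine zero separately from the missing values.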

Alerts

구분 has constant value "MOU체결" (Constant)
투자규모(억원) is highly overall correlated with 고용인원 (High correlation)
고용인원 is highly overall correlated with 투자규모(억원) (High correlation)
고용인원 has 5 (5.4%) missing values (Missing)
고용인원 has 1 (1.1%) zeros (Zeros)

Reproduction

Analysis started: 2024-05-04 07:52:06.687523
Analysis finished: 2024-05-04 07:52:09.213732
Duration: 2.53 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
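The reported duration is simply the difference between the two timestamps. A small sketch using the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2024-05-04 07:52:06.687523")
finished = datetime.fromisoformat("2024-05-04 07:52:09.213732")

# Subtracting two datetimes yields a timedelta.
duration_seconds = (finished - started).total_seconds()

print(f"{duration_seconds:.2f} seconds")
```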

Variables

기업명 (company name)
Categorical

Distinct: 27
Distinct (%): 29.0%
Missing: 0
Missing (%): 0.0%
Memory size: 876.0 B

| Value | Count |
|---|---|
| S사 | 17 |
| D사 | 11 |
| P사 | 8 |
| H사 | 8 |
| A사 | 7 |
| Other values (22) | 42 |

Length

Max length: 8
Median length: 2
Mean length: 2.2365591
Min length: 2

Unique

Unique: 12
Unique (%): 12.9%

Sample

1st row: S사
2nd row: S사
3rd row: D사
4th row: A사
5th row: H사

Common Values

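| Value | Count | Frequency (%) |
|---|---|---|
| S사 | 17 | 18.3% |
| D사 | 11 | 11.8% |
| P사 | 8 | 8.6% |
| H사 | 8 | 8.6% |
| A사 | 7 | 7.5% |
| K사 | 5 | 5.4% |
| J사 | 4 | 4.3% |
| E사 | 4 | 4.3% |
| G사 | 4 | 4.3% |
| W사 | 3 | 3.2% |
| Other values (17) | 22 | 23.7% |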

Length

[Figure: histogram of lengths of the category]

Words

| Value | Count | Frequency (%) |
|---|---|---|
| s사 | 20 | 20.2% |
| d사 | 12 | 12.1% |
| p사 | 10 | 10.1% |
| h사 | 10 | 10.1% |
| a사 | 7 | 7.1% |
| k사 | 5 | 5.1% |
| g사 | 5 | 5.1% |
| j사 | 4 | 4.0% |
| e사 | 4 | 4.0% |
| l사 | 4 | 4.0% |
| Other values (12) | 18 | 18.2% |

구분 (category)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 876.0 B

| Value | Count |
|---|---|
| MOU체결 | 93 |

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: MOU체결
2nd row: MOU체결
3rd row: MOU체결
4th row: MOU체결
5th row: MOU체결

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| MOU체결 | 93 | 100.0% |

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Words

| Value | Count | Frequency (%) |
|---|---|---|
| mou체결 | 93 | 100.0% |

위치 (location)
Categorical

Distinct: 15
Distinct (%): 16.1%
Missing: 0
Missing (%): 0.0%
Memory size: 876.0 B

| Value | Count |
|---|---|
| 구미시 | 28 |
| 경주시 | 16 |
| 포항시 | 15 |
| 상주시 | 9 |
| 김천시 | 4 |
| Other values (10) | 21 |

Length

Max length: 5
Median length: 3
Mean length: 3.0215054
Min length: 3

Unique

Unique: 4
Unique (%): 4.3%

Sample

1st row: 청도군
2nd row: 영주시
3rd row: 영덕군
4th row: 김천시
5th row: 경주시

Common Values

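| Value | Count | Frequency (%) |
|---|---|---|
| 구미시 | 28 | 30.1% |
| 경주시 | 16 | 17.2% |
| 포항시 | 15 | 16.1% |
| 상주시 | 9 | 9.7% |
| 김천시 | 4 | 4.3% |
| 안동시 | 4 | 4.3% |
| 영주시 | 3 | 3.2% |
| 영덕군 | 3 | 3.2% |
| 경산시 | 3 | 3.2% |
| 예천군 | 2 | 2.2% |
| Other values (5) | 6 | 6.5% |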

Length

[Figure: histogram of lengths of the category]

Words

| Value | Count | Frequency (%) |
|---|---|---|
| 구미시 | 28 | 30.1% |
| 경주시 | 16 | 17.2% |
| 포항시 | 15 | 16.1% |
| 상주시 | 9 | 9.7% |
| 김천시 | 4 | 4.3% |
| 안동시 | 4 | 4.3% |
| 영주시 | 3 | 3.2% |
| 영덕군 | 3 | 3.2% |
| 경산시 | 3 | 3.2% |
| 예천군 | 2 | 2.2% |
| Other values (5) | 6 | 6.5% |

투자규모(억원) (investment amount, in units of 100 million KRW)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 58
Distinct (%): 62.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2400.6774
Minimum: 100
Maximum: 20000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 969.0 B

Quantile statistics

Minimum: 100
5-th percentile: 150
Q1: 450
median: 1000
Q3: 2500
95-th percentile: 11097
Maximum: 20000
Range: 19900
Interquartile range (IQR): 2050
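Range and IQR are derived from the other quantiles in this block. A short sketch of the two subtractions, using the values quoted above (names are illustrative):

```python
# Quantiles of 투자규모(억원) copied from the block above.
minimum, q1, q3, maximum = 100, 450, 2500, 20000

# Range spans the full data; IQR spans only the middle 50%.
value_range = maximum - minimum
iqr = q3 - q1

print(value_range, iqr)
```

The IQR is roughly a tenth of the range, which is one quick sign that a few very large investments stretch the upper tail.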

Descriptive statistics

Standard deviation: 3691.1442
Coefficient of variation (CV): 1.5375428
Kurtosis: 7.5346246
Mean: 2400.6774
Median Absolute Deviation (MAD): 600
Skewness: 2.6758285
Sum: 223263
Variance: 13624546
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]
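Common values

| Value | Count | Frequency (%) |
|---|---|---|
| 1000 | 6 | 6.5% |
| 800 | 5 | 5.4% |
| 500 | 4 | 4.3% |
| 5000 | 4 | 4.3% |
| 400 | 3 | 3.2% |
| 2000 | 3 | 3.2% |
| 1100 | 3 | 3.2% |
| 100 | 3 | 3.2% |
| 300 | 3 | 3.2% |
| 3000 | 3 | 3.2% |
| Other values (48) | 56 | 60.2% |

Minimum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 100 | 3 | 3.2% |
| 130 | 1 | 1.1% |
| 150 | 2 | 2.2% |
| 200 | 2 | 2.2% |
| 250 | 2 | 2.2% |
| 281 | 1 | 1.1% |
| 300 | 3 | 3.2% |
| 350 | 1 | 1.1% |
| 389 | 1 | 1.1% |
| 395 | 1 | 1.1% |

Maximum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 20000 | 1 | 1.1% |
| 15200 | 1 | 1.1% |
| 14000 | 1 | 1.1% |
| 12360 | 1 | 1.1% |
| 12000 | 1 | 1.1% |
| 10495 | 1 | 1.1% |
| 10000 | 1 | 1.1% |
| 8000 | 2 | 2.2% |
| 6000 | 2 | 2.2% |
| 5500 | 1 | 1.1% |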
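Several of the descriptive statistics for 투자규모(억원) can be cross-checked against one another. The sketch below recomputes the mean, the coefficient of variation and the variance from the sum, the observation count and the standard deviation quoted in the report. Because the inputs are already rounded, the results match the report only up to rounding.

```python
# Figures copied from the report for 투자규모(억원).
total = 223263          # Sum
n = 93                  # Number of observations (no missing values)
std = 3691.1442         # Standard deviation (rounded in the report)

mean = total / n        # arithmetic mean
cv = std / mean         # coefficient of variation = std / mean
variance = std ** 2     # variance is the squared standard deviation

print(f"mean     = {mean:.4f}")
print(f"CV       = {cv:.4f}")
print(f"variance = {variance:.0f}")
```

A CV above 1 means the standard deviation exceeds the mean, which fits the strongly right-skewed distribution reported (skewness 2.68, maximum 20000 against a median of 1000).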

고용인원 (number of employees)
Real number (ℝ)

HIGH CORRELATION, MISSING, ZEROS

Distinct: 45
Distinct (%): 51.1%
Missing: 5
Missing (%): 5.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 148.22727
Minimum: 0
Maximum: 1120
Zeros: 1
Zeros (%): 1.1%
Negative: 0
Negative (%): 0.0%
Memory size: 969.0 B
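The percentages in this block do not all share the same denominator. Judging from the numbers, the missing and zero shares are taken over all 93 rows, while the distinct share and the mean appear to use only the 88 non-missing values. The sketch below reproduces the reported figures under that reading; the sum (13044) is taken from the descriptive statistics of this variable, and the variable names are illustrative.

```python
# Figures copied from the report for 고용인원.
n_rows = 93
n_missing = 5
n_zeros = 1
n_distinct = 45
total = 13044  # "Sum" from the descriptive statistics

n_present = n_rows - n_missing  # rows with a recorded headcount

missing_pct = 100 * n_missing / n_rows        # over all rows
zeros_pct = 100 * n_zeros / n_rows            # over all rows
distinct_pct = 100 * n_distinct / n_present   # over non-missing rows
mean = total / n_present                      # over non-missing rows

print(f"{missing_pct:.1f}% missing, {zeros_pct:.1f}% zeros")
print(f"{distinct_pct:.1f}% distinct, mean {mean:.5f}")
```

Dividing the distinct count by all 93 rows would give about 48.4%, not the reported 51.1%, which supports the non-missing denominator.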

Quantile statistics

Minimum: 0
5-th percentile: 20
Q1: 48.25
median: 91
Q3: 181.5
95-th percentile: 484.95
Maximum: 1120
Range: 1120
Interquartile range (IQR): 133.25

Descriptive statistics

Standard deviation: 195.31147
Coefficient of variation (CV): 1.3176486
Kurtosis: 12.517686
Mean: 148.22727
Median Absolute Deviation (MAD): 51.5
Skewness: 3.3150733
Sum: 13044
Variance: 38146.568
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=45)]
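Common values

| Value | Count | Frequency (%) |
|---|---|---|
| 100 | 11 | 11.8% |
| 50 | 10 | 10.8% |
| 40 | 5 | 5.4% |
| 80 | 4 | 4.3% |
| 250 | 4 | 4.3% |
| 20 | 4 | 4.3% |
| 30 | 4 | 4.3% |
| 200 | 3 | 3.2% |
| 300 | 3 | 3.2% |
| 150 | 2 | 2.2% |
| Other values (35) | 38 | 40.9% |
| (Missing) | 5 | 5.4% |

Minimum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1 | 1.1% |
| 10 | 1 | 1.1% |
| 11 | 1 | 1.1% |
| 12 | 1 | 1.1% |
| 20 | 4 | 4.3% |
| 25 | 1 | 1.1% |
| 29 | 1 | 1.1% |
| 30 | 4 | 4.3% |
| 35 | 1 | 1.1% |
| 39 | 1 | 1.1% |

Maximum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 1120 | 1 | 1.1% |
| 1000 | 1 | 1.1% |
| 933 | 1 | 1.1% |
| 500 | 2 | 2.2% |
| 457 | 1 | 1.1% |
| 300 | 3 | 3.2% |
| 270 | 2 | 2.2% |
| 250 | 4 | 4.3% |
| 240 | 1 | 1.1% |
| 230 | 1 | 1.1% |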

체결일자 (signing date)
DateTime

Distinct: 80
Distinct (%): 86.0%
Missing: 0
Missing (%): 0.0%
Memory size: 876.0 B
Minimum: 2021-01-13 00:00:00
Maximum: 2024-03-15 00:00:00
[Figure: histogram with fixed size bins (bins=50)]
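The minimum and maximum dates give the period the dataset covers, and the distinct count shows how often several MOUs share a signing date. A small sketch with the standard library, using only the figures above (names are illustrative):

```python
from datetime import date

# Earliest and latest signing dates from the block above.
first_signed = date(2021, 1, 13)
last_signed = date(2024, 3, 15)

# Length of the covered period in days.
span_days = (last_signed - first_signed).days

# 80 distinct dates across 93 MOUs.
n_rows, n_distinct = 93, 80
distinct_pct = 100 * n_distinct / n_rows
shared_date_rows = n_rows - n_distinct  # rows beyond the first on each date

print(span_days, f"{distinct_pct:.1f}%", shared_date_rows)
```

So the 93 MOUs span a little over three years, and 13 of them fall on a date that already carries at least one other MOU.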

Interactions

[Figures: four interaction plots]

Correlations

[Figure: correlation heatmap]

| | 기업명 | 위치 | 투자규모(억원) | 고용인원 | 체결일자 |
|---|---|---|---|---|---|
| 기업명 | 1.000 | 0.826 | 0.000 | 0.602 | 0.000 |
| 위치 | 0.826 | 1.000 | 0.000 | 0.386 | 0.978 |
| 투자규모(억원) | 0.000 | 0.000 | 1.000 | 0.794 | 0.591 |
| 고용인원 | 0.602 | 0.386 | 0.794 | 1.000 | 0.966 |
| 체결일자 | 0.000 | 0.978 | 0.591 | 0.966 | 1.000 |

[Figure: correlation heatmap]

| | 기업명 | 위치 |
|---|---|---|
| 기업명 | 1.000 | 0.375 |
| 위치 | 0.375 | 1.000 |

[Figure: correlation heatmap]

| | 투자규모(억원) | 고용인원 | 기업명 | 위치 |
|---|---|---|---|---|
| 투자규모(억원) | 1.000 | 0.510 | 0.000 | 0.000 |
| 고용인원 | 0.510 | 1.000 | 0.276 | 0.187 |
| 기업명 | 0.000 | 0.276 | 1.000 | 0.375 |
| 위치 | 0.000 | 0.187 | 0.375 | 1.000 |
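A correlation matrix like the last one above can be scanned for pairs that exceed a cut-off. The sketch below does this for the four-variable matrix. The threshold of 0.5 is an assumption made for illustration (the report does not state which cut-off produced its alerts), and `high_pairs` is a hypothetical helper, not part of ydata-profiling.

```python
# Four-variable correlation matrix copied from the report.
labels = ["투자규모(억원)", "고용인원", "기업명", "위치"]
matrix = [
    [1.000, 0.510, 0.000, 0.000],
    [0.510, 1.000, 0.276, 0.187],
    [0.000, 0.276, 1.000, 0.375],
    [0.000, 0.187, 0.375, 1.000],
]


def high_pairs(labels, matrix, threshold):
    """List each unordered pair whose absolute correlation reaches the threshold."""
    pairs = []
    for i in range(len(labels)):
        # Only look above the diagonal: the matrix is symmetric.
        for j in range(i + 1, len(labels)):
            if abs(matrix[i][j]) >= threshold:
                pairs.append((labels[i], labels[j], matrix[i][j]))
    return pairs


THRESHOLD = 0.5  # assumed cut-off, not taken from the report
flagged = high_pairs(labels, matrix, THRESHOLD)
print(flagged)
```

With this assumed cut-off, the only flagged pair is 투자규모(억원) and 고용인원, the same pair named in the Alerts section.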

Missing values

[Figure: a simple visualization of nullity by column]
[Figure: nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion]

Sample

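First rows

| | 기업명 | 구분 | 위치 | 투자규모(억원) | 고용인원 | 체결일자 |
|---|---|---|---|---|---|---|
| 0 | S사 | MOU체결 | 청도군 | 150 | 20 | 2021-01-13 |
| 1 | S사 | MOU체결 | 영주시 | 2000 | 200 | 2021-01-27 |
| 2 | D사 | MOU체결 | 영덕군 | 500 | 39 | 2021-01-28 |
| 3 | A사 | MOU체결 | 김천시 | 1200 | 100 | 2021-02-03 |
| 4 | H사 | MOU체결 | 경주시 | 281 | 30 | 2021-02-25 |
| 5 | G사 | MOU체결 | 구미시 | 550 | 100 | 2021-03-04 |
| 6 | S사 | MOU체결 | 구미시 | 250 | 35 | 2021-03-04 |
| 7 | A사 | MOU체결 | 구미시 | 100 | 50 | 2021-03-04 |
| 8 | L사 | MOU체결 | 구미시 | 500 | 30 | 2021-04-01 |
| 9 | D사 | MOU체결 | 구미시 | 100 | 20 | 2021-04-01 |

Last rows

| | 기업명 | 구분 | 위치 | 투자규모(억원) | 고용인원 | 체결일자 |
|---|---|---|---|---|---|---|
| 83 | S사 | MOU체결 | 경주시 | 3300 | 300 | 2023-09-08 |
| 84 | S사 | MOU체결 | 상주시 | 500 | 100 | 2023-09-18 |
| 85 | J사 | MOU체결 | 포항시 | 1000 | 250 | 2023-10-26 |
| 86 | H사 | MOU체결 | 고령군 | 8000 | 200 | 2023-12-05 |
| 87 | A사 | MOU체결 | 구미시 | 600 | 20 | 2023-12-12 |
| 88 | H사 | MOU체결 | 구미시 | 750 | 50 | 2024-01-30 |
| 89 | D사 | MOU체결 | 경주시 | 450 | 80 | 2024-02-06 |
| 90 | A사 | MOU체결 | 경산시 | 2500 | 120 | 2024-02-28 |
| 91 | K사 | MOU체결 | 구미시 | 648 | 72 | 2024-03-05 |
| 92 | I사 | MOU체결 | 구미시 | 3000 | 100 | 2024-03-15 |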
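The "Common Values" tables earlier in the report are plain frequency counts. As a sketch of how such a table is built, the code below counts the 위치 values of the 20 rows shown in this sample (first ten and last ten). These are only the sampled rows, so the shares differ from the full-dataset figures reported for 위치.

```python
from collections import Counter

# 위치 values of the 20 sampled rows above (rows 0-9 and 83-92).
locations = [
    "청도군", "영주시", "영덕군", "김천시", "경주시",
    "구미시", "구미시", "구미시", "구미시", "구미시",
    "경주시", "상주시", "포항시", "고령군", "구미시",
    "구미시", "경주시", "경산시", "구미시", "구미시",
]

counts = Counter(locations)
top_two = counts.most_common(2)

# Print value, count and share of the sampled rows.
for value, count in top_two:
    print(f"{value}\t{count}\t{100 * count / len(locations):.1f}%")
```

구미시 leads in the sample just as it does in the full table, although its share among these 20 rows is higher than the 30.1% reported for all 93.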