Overview

Dataset statistics

Number of variables: 9
Number of observations: 634
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 45.9 KiB
Average record size in memory: 74.2 B

Variable types

Categorical: 6
Text: 1
Boolean: 1
Numeric: 1

Dataset

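Description: Monthly management records for researchers participating in project/task plans, from the research management domain of the Korea Institute of Machinery and Materials (한국기계연구원). The data covers project number, participant, participation year-month, participation start date, participation end date, participating researcher labor cost, and related fields.
URL: https://www.data.go.kr/data/15078045/fileData.do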

Alerts

참여연구원참여적용년월일 has constant value "2022-01-28" (Constant)
참여시작일 has constant value "2022-01-01" (Constant)
집행여부 has constant value "True" (Constant)
작성일 has constant value "2023-07-28" (Constant)
참여종료일 is highly imbalanced (98.3%) (Imbalance)
참여일수 is highly imbalanced (98.3%) (Imbalance)
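
The 98.3% imbalance figure is consistent with a normalized-entropy score, one minus the Shannon entropy of the value counts divided by its maximum for the number of classes. The sketch below assumes that definition; the function name `imbalance_score` is hypothetical and is not taken from the profiling library. It uses the counts the report gives for 참여종료일 (633 rows against 1 row).

```python
import math
from collections import Counter


def imbalance_score(values):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the value distribution and k is the number of distinct values.

    Assumed definition for illustration; 0 means perfectly balanced,
    values near 1 mean one class dominates.
    """
    counts = Counter(values)
    k = len(counts)
    if k <= 1:
        # Assumption: a single class is handled as "constant", not "imbalanced".
        return 0.0
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return 1.0 - entropy / math.log2(k)


# Counts from the report: 633 rows end on 2022-01-31, 1 row on 2022-01-25.
end_dates = ["2022-01-31"] * 633 + ["2022-01-25"]
score = imbalance_score(end_dates)
print(f"{score:.1%}")  # 98.3%
```

The same counts (633 and 1) apply to 참여일수, which is why both columns carry an identical score.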

Reproduction

Analysis started: 2023-12-13 00:49:03.498992
Analysis finished: 2023-12-13 00:49:03.982262
Duration: 0.48 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

사업_과제번호
Categorical

Distinct: 48
Distinct (%): 7.6%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

NK237B  36
NK236I  36
NK240A  33
NK237A  29
NK236C  29
Other values (43)  471

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NK237C
2nd row: NK237C
3rd row: NK237C
4th row: NK237C
5th row: NK237C

Common Values

Value  Count  Frequency (%)
NK237B  36  5.7%
NK236I  36  5.7%
NK240A  33  5.2%
NK237A  29  4.6%
NK236C  29  4.6%
NK238F  27  4.3%
NK238D  26  4.1%
NK237G  25  3.9%
NK238B  24  3.8%
NK236F  23  3.6%
Other values (38)  346  54.6%

Length

Histogram of lengths of the category

Value  Count  Frequency (%)
nk237b  36  5.7%
nk236i  36  5.7%
nk240a  33  5.2%
nk237a  29  4.6%
nk236c  29  4.6%
nk238f  27  4.3%
nk238d  26  4.1%
nk237g  25  3.9%
nk238b  24  3.8%
nk236f  23  3.6%
Other values (38)  346  54.6%

참여자명
Text

Distinct: 105
Distinct (%): 16.6%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 1902
Distinct characters: 106
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
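
As an illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The three input strings are hypothetical examples in the masked `*X*` style of the Sample rows; the helper name `category_counts` is likewise an assumption, not part of the report.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    return Counter(unicodedata.category(ch) for text in values for ch in text)


# Hypothetical masked names; the last one has a space in the middle position.
names = ["*현*", "*해*", "* *"]
counts = category_counts(names)

# 'Po' = Other Punctuation, 'Lo' = Other Letter, 'Zs' = Space Separator
print(counts["Po"], counts["Lo"], counts["Zs"])  # 6 2 1
```

Under this reading, the report's totals fit a three-character mask per name: 1268 Other Punctuation characters are two asterisks for each of the 634 rows, and the remaining 634 middle characters split into 633 Other Letter and 1 Space Separator, for 1902 characters in all.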

Unique

Unique: 35
Unique (%): 5.5%

Sample

1st row: *현*
2nd row: *해*
3rd row: *은*
4th row: *연*
5th row: *상*
ValueCountFrequency (%)
29
 
4.6%
26
 
4.1%
26
 
4.1%
24
 
3.8%
23
 
3.6%
23
 
3.6%
23
 
3.6%
22
 
3.5%
19
 
3.0%
17
 
2.7%
Other values (95) 403
63.5%

Most occurring characters

ValueCountFrequency (%)
* 1268
66.7%
29
 
1.5%
26
 
1.4%
26
 
1.4%
24
 
1.3%
23
 
1.2%
23
 
1.2%
23
 
1.2%
22
 
1.2%
19
 
1.0%
Other values (96) 419
 
22.0%

Most occurring categories

Value  Count  Frequency (%)
Other Punctuation  1268  66.7%
Other Letter  633  33.3%
Space Separator  1  0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
29
 
4.6%
26
 
4.1%
26
 
4.1%
24
 
3.8%
23
 
3.6%
23
 
3.6%
23
 
3.6%
22
 
3.5%
19
 
3.0%
17
 
2.7%
Other values (94) 401
63.3%
Other Punctuation
Value  Count  Frequency (%)
*  1268  100.0%
Space Separator
Value  Count  Frequency (%)
(space)  1  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Common  1269  66.7%
Hangul  633  33.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
29
 
4.6%
26
 
4.1%
26
 
4.1%
24
 
3.8%
23
 
3.6%
23
 
3.6%
23
 
3.6%
22
 
3.5%
19
 
3.0%
17
 
2.7%
Other values (94) 401
63.3%
Common
Value  Count  Frequency (%)
*  1268  99.9%
(space)  1  0.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  1269  66.7%
Hangul  633  33.3%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
*  1268  99.9%
(space)  1  0.1%
Hangul
ValueCountFrequency (%)
29
 
4.6%
26
 
4.1%
26
 
4.1%
24
 
3.8%
23
 
3.6%
23
 
3.6%
23
 
3.6%
22
 
3.5%
19
 
3.0%
17
 
2.7%
Other values (94) 401
63.3%

참여연구원참여적용년월일
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

2022-01-28  634

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022-01-28
2nd row: 2022-01-28
3rd row: 2022-01-28
4th row: 2022-01-28
5th row: 2022-01-28

Common Values

Value  Count  Frequency (%)
2022-01-28  634  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

참여시작일
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

2022-01-01  634

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022-01-01
2nd row: 2022-01-01
3rd row: 2022-01-01
4th row: 2022-01-01
5th row: 2022-01-01

Common Values

Value  Count  Frequency (%)
2022-01-01  634  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

참여종료일
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

2022-01-31  633
2022-01-25  1

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: 2022-01-31
2nd row: 2022-01-31
3rd row: 2022-01-31
4th row: 2022-01-31
5th row: 2022-01-31

Common Values

Value  Count  Frequency (%)
2022-01-31  633  99.8%
2022-01-25  1  0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

참여일수
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

31  633
25  1

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: 31
2nd row: 31
3rd row: 31
4th row: 31
5th row: 31

Common Values

Value  Count  Frequency (%)
31  633  99.8%
25  1  0.2%

Length

Histogram of lengths of the category

Common Values (Plot)
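
The two 참여일수 values (31 and 25) line up with the two 참여종료일 values (2022-01-31 and 2022-01-25) if the day count is taken inclusively from the constant start date 2022-01-01. The sketch below checks that reading; the inclusive-count rule and the helper name `inclusive_days` are assumptions, since the report does not state how the column is derived.

```python
from datetime import date


def inclusive_days(start, end):
    """Number of calendar days from `start` to `end`, counting both ends."""
    return (end - start).days + 1


start = date(2022, 1, 1)  # constant 참여시작일 in the report

full_month = inclusive_days(start, date(2022, 1, 31))
short_month = inclusive_days(start, date(2022, 1, 25))
print(full_month, short_month)  # 31 25
```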

집행여부
Boolean

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 766.0 B

True  634

Value  Count  Frequency (%)
True  634  100.0%

인건비배분금액
Real number (ℝ)

Distinct: 593
Distinct (%): 93.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2933307.8
Minimum: 10000
Maximum: 10077548
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.7 KiB

Quantile statistics

Minimum: 10000
5-th percentile: 139147.15
Q1: 1634333.2
Median: 3141149.5
Q3: 4170540
95-th percentile: 5356451.8
Maximum: 10077548
Range: 10067548
Interquartile range (IQR): 2536206.8
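
The last two rows follow from the others: the range is maximum minus minimum, and the IQR is Q3 minus Q1. A minimal check using only the figures printed above:

```python
# Quantile figures as printed in the report for 인건비배분금액.
minimum = 10000
q1 = 1634333.2
q3 = 4170540
maximum = 10077548

value_range = maximum - minimum  # spread of the full data
iqr = q3 - q1                    # spread of the middle 50%

print(value_range)    # 10067548
print(round(iqr, 1))  # 2536206.8
```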

Descriptive statistics

Standard deviation: 1737364
Coefficient of variation (CV): 0.59228832
Kurtosis: -0.22241845
Mean: 2933307.8
Median Absolute Deviation (MAD): 1274567
Skewness: 0.1364348
Sum: 1.8597172 × 10^9
Variance: 3.0184335 × 10^12
Monotonicity: Not monotonic
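
Several of these figures can be cross-checked from the others: CV is the standard deviation divided by the mean, variance is the standard deviation squared, and the sum is the mean times the 634 observations. The sketch below recomputes them from the rounded values shown in the report, so agreement is only expected to a few significant digits.

```python
n = 634             # number of observations
mean = 2933307.8    # as printed (rounded)
std = 1737364       # as printed (rounded)

cv = std / mean        # coefficient of variation
variance = std ** 2    # square of the standard deviation
total = mean * n       # sum of all values

print(f"CV       = {cv:.4f}")
print(f"Variance = {variance:.4e}")
print(f"Sum      = {total:.4e}")
```

Small differences in the trailing digits against the report come from the rounding of the printed mean and standard deviation.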
Histogram with fixed size bins (bins=50)
Value  Count  Frequency (%)
169863  15  2.4%
84932  6  0.9%
2000000  6  0.9%
46712  4  0.6%
3212449  3  0.5%
254795  3  0.5%
76438  3  0.5%
1167044  2  0.3%
110411  2  0.3%
127397  2  0.3%
Other values (583)  588  92.7%

Value  Count  Frequency (%)
10000  1  0.2%
46712  4  0.6%
67945  1  0.2%
73041  1  0.2%
76438  3  0.5%
81534  1  0.2%
84932  6  0.9%
92151  1  0.2%
93425  1  0.2%
103871  1  0.2%

Value  Count  Frequency (%)
10077548  1  0.2%
8070022  1  0.2%
7978466  1  0.2%
7685452  1  0.2%
7596189  1  0.2%
7571219  1  0.2%
7549732  1  0.2%
7213742  1  0.2%
7125074  1  0.2%
6887436  1  0.2%

작성일
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 5.1 KiB

2023-07-28  634

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-07-28
2nd row: 2023-07-28
3rd row: 2023-07-28
4th row: 2023-07-28
5th row: 2023-07-28

Common Values

Value  Count  Frequency (%)
2023-07-28  634  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Interactions


Correlations

Variable  사업_과제번호  참여종료일  참여일수  인건비배분금액
사업_과제번호  1.000  0.000  0.000  0.692
참여종료일  0.000  1.000  0.705  0.000
참여일수  0.000  0.705  1.000  0.000
인건비배분금액  0.692  0.000  0.000  1.000

Variable  사업_과제번호  참여일수  참여종료일
사업_과제번호  1.000  0.000  0.000
참여일수  0.000  1.000  0.498
참여종료일  0.000  0.498  1.000

Variable  인건비배분금액  사업_과제번호  참여종료일  참여일수
인건비배분금액  1.000  0.304  0.000  0.000
사업_과제번호  0.304  1.000  0.000  0.000
참여종료일  0.000  0.000  1.000  0.498
참여일수  0.000  0.000  0.498  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Row  사업_과제번호  참여자명  참여연구원참여적용년월일  참여시작일  참여종료일  참여일수  집행여부  인건비배분금액  작성일
0  NK237C  *현*  2022-01-28  2022-01-01  2022-01-31  31  Y  4216340  2023-07-28
1  NK237C  *해*  2022-01-28  2022-01-01  2022-01-31  31  Y  2651493  2023-07-28
2  NK237C  *은*  2022-01-28  2022-01-01  2022-01-31  31  Y  1357290  2023-07-28
3  NK237C  *연*  2022-01-28  2022-01-01  2022-01-31  31  Y  1760800  2023-07-28
4  NK237C  *상*  2022-01-28  2022-01-01  2022-01-31  31  Y  1785855  2023-07-28
5  NK237C  *정*  2022-01-28  2022-01-01  2022-01-31  31  Y  3658849  2023-07-28
6  NK237C  *경*  2022-01-28  2022-01-01  2022-01-31  31  Y  3209222  2023-07-28
7  NK237C  *제*  2022-01-28  2022-01-01  2022-01-31  31  Y  4499756  2023-07-28
8  NK237C  *원*  2022-01-28  2022-01-01  2022-01-31  31  Y  3065263  2023-07-28
9  NK237C  *영*  2022-01-28  2022-01-01  2022-01-31  31  Y  5167488  2023-07-28

Last rows

Row  사업_과제번호  참여자명  참여연구원참여적용년월일  참여시작일  참여종료일  참여일수  집행여부  인건비배분금액  작성일
624  NK240A  *용*  2022-01-28  2022-01-01  2022-01-31  31  Y  6505707  2023-07-28
625  NK240A  *건*  2022-01-28  2022-01-01  2022-01-31  31  Y  2943132  2023-07-28
626  NK239I  *준*  2022-01-28  2022-01-01  2022-01-31  31  Y  671299  2023-07-28
627  NK239I  *평*  2022-01-28  2022-01-01  2022-01-31  31  Y  707055  2023-07-28
628  NK239I  *혁*  2022-01-28  2022-01-01  2022-01-31  31  Y  647178  2023-07-28
629  NK239I  *윤*  2022-01-28  2022-01-01  2022-01-31  31  Y  856110  2023-07-28
630  NK239B  *민*  2022-01-28  2022-01-01  2022-01-31  31  Y  659493  2023-07-28
631  NK239B  *기*  2022-01-28  2022-01-01  2022-01-31  31  Y  736356  2023-07-28
632  NK239B  *장*  2022-01-28  2022-01-01  2022-01-31  31  Y  757674  2023-07-28
633  NK239B  *현*  2022-01-28  2022-01-01  2022-01-31  31  Y  904351  2023-07-28