Overview

Dataset statistics

Number of variables: 7
Number of observations: 307
Missing cells: 3
Missing cells (%): 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 17.5 KiB
Average record size in memory: 58.4 B
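
The two derived figures above follow from the raw counts. A minimal sketch of the arithmetic, assuming the missing-cell percentage is taken over all cells (variables times observations) and that 1 KiB is 1024 bytes; the variable names are illustrative only:

```python
# Values copied from the dataset statistics in this report.
n_variables = 7
n_observations = 307
missing_cells = 3
total_size_kib = 17.5

# Share of missing cells over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average bytes per record, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
print(f"{avg_record_bytes:.1f} B per record")
```

Both rounded results agree with the reported 0.1% and 58.4 B.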

Variable types

Text: 3
Categorical: 3
Numeric: 1

Dataset

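Description: Technology-related information as of February 2021.
Author: 한국연구재단 정보통신기획평가원 (National Research Foundation of Korea; Institute of Information & Communications Technology Planning & Evaluation)
URL: https://www.data.go.kr/data/15077418/fileData.do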

Alerts

생성자 has constant value "000000000000000UU0" [Constant]
생성일시 is highly overall correlated with 클러스터분류레벨 [High correlation]
클러스터분류레벨 is highly overall correlated with 생성일시 [High correlation]
클러스터분류 has unique values [Unique]
클러스터분류이름타이틀 has unique values [Unique]

Reproduction

Analysis started: 2023-12-12 22:07:14.177306
Analysis finished: 2023-12-12 22:07:14.630848
Duration: 0.45 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

클러스터분류 (cluster classification)
Text

UNIQUE 

Distinct: 307
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Length

Max length: 6
Median length: 6
Mean length: 5.4104235
Min length: 1

Characters and Unicode

Total characters: 1661
Distinct characters: 20
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
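
As a rough illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories over the first five sample values of this variable; it is not the profiler's own implementation, and `category_counts` is a hypothetical helper name.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            # e.g. "Lu" = Uppercase Letter, "Nd" = Decimal Number
            counts[unicodedata.category(ch)] += 1
    return counts


# First five sample rows of 클러스터분류 from this report.
sample = ["A", "B", "C", "A1", "A2"]
print(category_counts(sample))
```

The two-letter codes map to the category names used in the tables: `Lu` is Uppercase Letter, `Nd` is Decimal Number, and `Pe` is Close Punctuation.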

Unique

Unique: 307
Unique (%): 100.0%

Sample

1st row: A
2nd row: B
3rd row: C
4th row: A1
5th row: A2

Value               Count  Frequency (%)
b2e12                   2   0.7%
a                       1   0.3%
b2e05d                  1   0.3%
b2b01a                  1   0.3%
b2a05a                  1   0.3%
b2a04c                  1   0.3%
b2a04b                  1   0.3%
b2a04a                  1   0.3%
b2a03h                  1   0.3%
b2a03f                  1   0.3%
Other values (296)    296  96.4%

Most occurring characters

Value               Count  Frequency (%)
A                     284  17.1%
0                     262  15.8%
B                     259  15.6%
2                     247  14.9%
1                     218  13.1%
C                     158   9.5%
E                      66   4.0%
3                      47   2.8%
D                      29   1.7%
4                      24   1.4%
Other values (10)      67   4.0%

Most occurring categories

Value               Count  Frequency (%)
Decimal Number        856  51.5%
Uppercase Letter      804  48.4%
Close Punctuation       1   0.1%

Most frequent character per category

Decimal Number

Value               Count  Frequency (%)
0                     262  30.6%
2                     247  28.9%
1                     218  25.5%
3                      47   5.5%
4                      24   2.8%
5                      15   1.8%
7                      15   1.8%
6                      11   1.3%
9                       9   1.1%
8                       8   0.9%

Uppercase Letter

Value               Count  Frequency (%)
A                     284  35.3%
B                     259  32.2%
C                     158  19.7%
E                      66   8.2%
D                      29   3.6%
F                       3   0.4%
G                       2   0.2%
H                       2   0.2%
I                       1   0.1%

Close Punctuation

Value               Count  Frequency (%)
)                       1  100.0%

Most occurring scripts

Value               Count  Frequency (%)
Common                857  51.6%
Latin                 804  48.4%

Most frequent character per script

Common

Value               Count  Frequency (%)
0                     262  30.6%
2                     247  28.8%
1                     218  25.4%
3                      47   5.5%
4                      24   2.8%
5                      15   1.8%
7                      15   1.8%
6                      11   1.3%
9                       9   1.1%
8                       8   0.9%

Latin

Value               Count  Frequency (%)
A                     284  35.3%
B                     259  32.2%
C                     158  19.7%
E                      66   8.2%
D                      29   3.6%
F                       3   0.4%
G                       2   0.2%
H                       2   0.2%
I                       1   0.1%

Most occurring blocks

Value               Count  Frequency (%)
ASCII                1661  100.0%

Most frequent character per block

ASCII

Value               Count  Frequency (%)
A                     284  17.1%
0                     262  15.8%
B                     259  15.6%
2                     247  14.9%
1                     218  13.1%
C                     158   9.5%
E                      66   4.0%
3                      47   2.8%
D                      29   1.7%
4                      24   1.4%
Other values (10)      67   4.0%

생성자 (creator)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Value               Count
000000000000000UU0    307

Length

Max length: 18
Median length: 18
Mean length: 18
Min length: 18

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 000000000000000UU0
2nd row: 000000000000000UU0
3rd row: 000000000000000UU0
4th row: 000000000000000UU0
5th row: 000000000000000UU0

Common Values

Value               Count  Frequency (%)
000000000000000UU0    307  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value               Count  Frequency (%)
000000000000000uu0    307  100.0%

생성일시 (creation timestamp)
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Value                     Count
2015-03-18 오전 9:46:51     135
2015-03-18 오전 9:46:18      75
2015-03-18 오전 9:46:50      66
2015-03-18 오전 9:45:47      21
2015-03-18 오전 9:44:14       7

Length

Max length: 21
Median length: 21
Mean length: 21
Min length: 21

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2015-03-18 오전 9:43:32
2nd row: 2015-03-18 오전 9:43:32
3rd row: 2015-03-18 오전 9:43:32
4th row: 2015-03-18 오전 9:44:14
5th row: 2015-03-18 오전 9:44:14

Common Values

Value                     Count  Frequency (%)
2015-03-18 오전 9:46:51     135  44.0%
2015-03-18 오전 9:46:18      75  24.4%
2015-03-18 오전 9:46:50      66  21.5%
2015-03-18 오전 9:45:47      21   6.8%
2015-03-18 오전 9:44:14       7   2.3%
2015-03-18 오전 9:43:32       3   1.0%
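
Each value carries the Korean meridiem marker 오전 (AM), which is a plausible reason the column was profiled as categorical text rather than as a datetime. A minimal sketch of one way to parse such strings with the standard library, assuming the only markers are 오전 (AM) and 오후 (PM) and the layout is always `YYYY-MM-DD <marker> H:MM:SS`; `parse_korean_timestamp` is a hypothetical helper, not part of the profiler:

```python
from datetime import datetime


def parse_korean_timestamp(text):
    """Parse 'YYYY-MM-DD 오전|오후 H:MM:SS' (12-hour clock) into a datetime."""
    date_part, meridiem, time_part = text.split()
    if meridiem not in ("오전", "오후"):
        raise ValueError(f"unexpected meridiem marker: {meridiem!r}")
    parsed = datetime.strptime(f"{date_part} {time_part}", "%Y-%m-%d %H:%M:%S")
    hour = parsed.hour % 12      # 12 o'clock becomes 0 before any PM shift
    if meridiem == "오후":       # PM: shift into the afternoon
        hour += 12
    return parsed.replace(hour=hour)


print(parse_korean_timestamp("2015-03-18 오전 9:46:51"))
```

Parsed this way, all six distinct values fall within a span of a few minutes on the morning of 2015-03-18.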

Length

Histogram of lengths of the category

Common Values (Plot)

Value               Count  Frequency (%)
2015-03-18            307  33.3%
오전                  307  33.3%
9:46:51               135  14.7%
9:46:18                75   8.1%
9:46:50                66   7.2%
9:45:47                21   2.3%
9:44:14                 7   0.8%
9:43:32                 3   0.3%

상위클러스터분류ID (parent cluster classification ID)
Text

Distinct: 106
Distinct (%): 34.9%
Missing: 3
Missing (%): 1.0%
Memory size: 2.5 KiB

Length

Max length: 5
Median length: 5
Mean length: 4.2072368
Min length: 1

Characters and Unicode

Total characters: 1279
Distinct characters: 15
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 21
Unique (%): 6.9%
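
The report distinguishes distinct values (how many different values occur) from unique values (values that occur exactly once). A small sketch of that distinction, using only the first five sample rows reported for this variable rather than the full column; `distinct_and_unique` is an illustrative name:

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(v for v in values if v is not None)  # skip missing entries
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


sample = ["A", "A", "B", "B", "C"]
print(distinct_and_unique(sample))
```

On the full column, the reported 106 distinct and 21 unique values match the stated 34.9% and 6.9% when divided by the 304 non-missing rows.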

Sample

1st row: A
2nd row: A
3rd row: B
4th row: B
5th row: C

Value               Count  Frequency (%)
b2e                    13   4.3%
a1a                    10   3.3%
b1a02                   9   3.0%
c2b                     8   2.6%
b2a03                   8   2.6%
b2e07                   6   2.0%
c2c                     5   1.6%
b2e09                   5   1.6%
b2e12                   5   1.6%
a1b01                   5   1.6%
Other values (96)     230  75.7%

Most occurring characters

Value               Count  Frequency (%)
2                     226  17.7%
A                     201  15.7%
B                     193  15.1%
0                     190  14.9%
1                     188  14.7%
C                     120   9.4%
E                      58   4.5%
3                      36   2.8%
4                      17   1.3%
7                      12   0.9%
Other values (5)       38   3.0%

Most occurring categories

Value               Count  Frequency (%)
Decimal Number        699  54.7%
Uppercase Letter      580  45.3%

Most frequent character per category

Decimal Number

Value               Count  Frequency (%)
2                     226  32.3%
0                     190  27.2%
1                     188  26.9%
3                      36   5.2%
4                      17   2.4%
7                      12   1.7%
5                      10   1.4%
6                       8   1.1%
9                       7   1.0%
8                       5   0.7%

Uppercase Letter

Value               Count  Frequency (%)
A                     201  34.7%
B                     193  33.3%
C                     120  20.7%
E                      58  10.0%
D                       8   1.4%

Most occurring scripts

Value               Count  Frequency (%)
Common                699  54.7%
Latin                 580  45.3%

Most frequent character per script

Common

Value               Count  Frequency (%)
2                     226  32.3%
0                     190  27.2%
1                     188  26.9%
3                      36   5.2%
4                      17   2.4%
7                      12   1.7%
5                      10   1.4%
6                       8   1.1%
9                       7   1.0%
8                       5   0.7%

Latin

Value               Count  Frequency (%)
A                     201  34.7%
B                     193  33.3%
C                     120  20.7%
E                      58  10.0%
D                       8   1.4%

Most occurring blocks

Value               Count  Frequency (%)
ASCII                1279  100.0%

Most frequent character per block

ASCII

Value               Count  Frequency (%)
2                     226  17.7%
A                     201  15.7%
B                     193  15.1%
0                     190  14.9%
1                     188  14.7%
C                     120   9.4%
E                      58   4.5%
3                      36   2.8%
4                      17   1.3%
7                      12   0.9%
Other values (5)       38   3.0%

클러스터분류레벨 (cluster classification level)
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Value               Count
5                     201
4                      75
3                      21
2                       7
1                       3

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 2
5th row: 2

Common Values

Value               Count  Frequency (%)
5                     201  65.5%
4                      75  24.4%
3                      21   6.8%
2                       7   2.3%
1                       3   1.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value               Count  Frequency (%)
5                     201  65.5%
4                      75  24.4%
3                      21   6.8%
2                       7   2.3%
1                       3   1.0%

클러스터분류이름타이틀 (cluster classification name title)
Text

UNIQUE

Distinct: 307
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.5 KiB

Length

Max length: 17
Median length: 17
Mean length: 16.410423
Min length: 12

Characters and Unicode

Total characters: 5038
Distinct characters: 25
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 307
Unique (%): 100.0%

Sample

1st row: TCD_CLS_ID.A
2nd row: TCD_CLS_ID.B
3rd row: TCD_CLS_ID.C
4th row: TCD_CLS_ID.A1
5th row: TCD_CLS_ID.A2

Value               Count  Frequency (%)
tcd_cls_id.b2e12        2   0.7%
tcd_cls_id.a            1   0.3%
tcd_cls_id.b2e05d       1   0.3%
tcd_cls_id.b2b01a       1   0.3%
tcd_cls_id.b2a05a       1   0.3%
tcd_cls_id.b2a04c       1   0.3%
tcd_cls_id.b2a04b       1   0.3%
tcd_cls_id.b2a04a       1   0.3%
tcd_cls_id.b2a03h       1   0.3%
tcd_cls_id.b2a03f       1   0.3%
Other values (296)    296  96.4%

Most occurring characters

Value               Count  Frequency (%)
C                     772  15.3%
D                     643  12.8%
_                     614  12.2%
I                     308   6.1%
T                     307   6.1%
L                     307   6.1%
S                     307   6.1%
.                     307   6.1%
A                     284   5.6%
0                     262   5.2%
Other values (15)     927  18.4%

Most occurring categories

Value                  Count  Frequency (%)
Uppercase Letter        3260  64.7%
Decimal Number           856  17.0%
Connector Punctuation    614  12.2%
Other Punctuation        307   6.1%
Close Punctuation          1  < 0.1%

Most frequent character per category

Uppercase Letter

Value               Count  Frequency (%)
C                     772  23.7%
D                     643  19.7%
I                     308   9.4%
T                     307   9.4%
L                     307   9.4%
S                     307   9.4%
A                     284   8.7%
B                     259   7.9%
E                      66   2.0%
F                       3   0.1%
Other values (2)        4   0.1%

Decimal Number

Value               Count  Frequency (%)
0                     262  30.6%
2                     247  28.9%
1                     218  25.5%
3                      47   5.5%
4                      24   2.8%
7                      15   1.8%
5                      15   1.8%
6                      11   1.3%
9                       9   1.1%
8                       8   0.9%

Connector Punctuation

Value               Count  Frequency (%)
_                     614  100.0%

Other Punctuation

Value               Count  Frequency (%)
.                     307  100.0%

Close Punctuation

Value               Count  Frequency (%)
)                       1  100.0%

Most occurring scripts

Value               Count  Frequency (%)
Latin                3260  64.7%
Common               1778  35.3%

Most frequent character per script

Common

Value               Count  Frequency (%)
_                     614  34.5%
.                     307  17.3%
0                     262  14.7%
2                     247  13.9%
1                     218  12.3%
3                      47   2.6%
4                      24   1.3%
7                      15   0.8%
5                      15   0.8%
6                      11   0.6%
Other values (3)       18   1.0%

Latin

Value               Count  Frequency (%)
C                     772  23.7%
D                     643  19.7%
I                     308   9.4%
T                     307   9.4%
L                     307   9.4%
S                     307   9.4%
A                     284   8.7%
B                     259   7.9%
E                      66   2.0%
F                       3   0.1%
Other values (2)        4   0.1%

Most occurring blocks

Value               Count  Frequency (%)
ASCII                5038  100.0%

Most frequent character per block

ASCII

Value               Count  Frequency (%)
C                     772  15.3%
D                     643  12.8%
_                     614  12.2%
I                     308   6.1%
T                     307   6.1%
L                     307   6.1%
S                     307   6.1%
.                     307   6.1%
A                     284   5.6%
0                     262   5.2%
Other values (15)     927  18.4%

표시순서 (display order)
Real number (ℝ)

Distinct: 11
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.4820847
Minimum: 0
Maximum: 10
Zeros: 1
Zeros (%): 0.3%
Negative: 0
Negative (%): 0.0%
Memory size: 2.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
Median: 2
Q3: 3
95-th percentile: 6
Maximum: 10
Range: 10
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.7514247
Coefficient of variation (CV): 0.70562648
Kurtosis: 3.1739993
Mean: 2.4820847
Median Absolute Deviation (MAD): 1
Skewness: 1.6997066
Sum: 762
Variance: 3.0674885
Monotonicity: Not monotonic
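
Several of these figures can be re-derived from the value counts reported for this variable. A minimal sketch using Python's `statistics` module, assuming the variance is the sample form (n - 1 denominator); the `frequencies` mapping is copied from this report's frequency table, and the other names are illustrative:

```python
import statistics

# Value -> count, copied from the frequency table of 표시순서 in this report.
frequencies = {0: 1, 1: 106, 2: 88, 3: 50, 4: 28, 5: 14,
               6: 6, 7: 5, 8: 5, 9: 3, 10: 1}

# Expand the table back into the individual observations.
values = [v for v, n in frequencies.items() for _ in range(n)]

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)
cv = std_dev / mean                      # coefficient of variation

print(len(values), sum(values))
print(round(mean, 7), round(variance, 7), round(std_dev, 7), round(cv, 7))
```

The reconstructed list has 307 observations summing to 762, and the sample-variance assumption reproduces the reported variance, standard deviation, and CV.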
Histogram with fixed size bins (bins=11)

Most common values

Value               Count  Frequency (%)
1                     106  34.5%
2                      88  28.7%
3                      50  16.3%
4                      28   9.1%
5                      14   4.6%
6                       6   2.0%
7                       5   1.6%
8                       5   1.6%
9                       3   1.0%
10                      1   0.3%

Smallest values

Value               Count  Frequency (%)
0                       1   0.3%
1                     106  34.5%
2                      88  28.7%
3                      50  16.3%
4                      28   9.1%
5                      14   4.6%
6                       6   2.0%
7                       5   1.6%
8                       5   1.6%
9                       3   1.0%

Largest values

Value               Count  Frequency (%)
10                      1   0.3%
9                       3   1.0%
8                       5   1.6%
7                       5   1.6%
6                       6   2.0%
5                      14   4.6%
4                      28   9.1%
3                      50  16.3%
2                      88  28.7%
1                     106  34.5%

Interactions


Correlations

                   생성일시   클러스터분류레벨   표시순서
생성일시           1.000      1.000              0.000
클러스터분류레벨   1.000      1.000              0.000
표시순서           0.000      0.000              1.000

                   생성일시   클러스터분류레벨
생성일시           1.000      0.998
클러스터분류레벨   0.998      1.000

                   표시순서   생성일시   클러스터분류레벨
표시순서           1.000      0.000      0.000
생성일시           0.000      1.000      0.998
클러스터분류레벨   0.000      0.998      1.000
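
The report does not state which association measure produced each matrix. As one illustration of how two categorical columns can be scored, the sketch below computes Cramér's V from a contingency table, with an optional bias correction (Bergsma's form). The cell counts are an assumption: they pair each 생성일시 value with a single 클러스터분류레벨 by matching the reported frequencies (3, 7, 21, 75, and 66 + 135 = 201), a pairing the sample rows support only for levels 1, 2, and 5. `cramers_v` is a hypothetical helper.

```python
import math


def cramers_v(table, bias_correction=False):
    """Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    phi2 = chi2 / n
    r, k = len(row_totals), len(col_totals)
    if bias_correction:
        # Shrink phi^2 and the table dimensions to offset small-sample bias.
        phi2 = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
        r = r - (r - 1) ** 2 / (n - 1)
        k = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2 / min(r - 1, k - 1))


# Rows: the six 생성일시 values; columns: levels 1..5 (assumed pairing).
table = [
    [3, 0, 0, 0, 0],    # 9:43:32
    [0, 7, 0, 0, 0],    # 9:44:14
    [0, 0, 21, 0, 0],   # 9:45:47
    [0, 0, 0, 75, 0],   # 9:46:18
    [0, 0, 0, 0, 66],   # 9:46:50
    [0, 0, 0, 0, 135],  # 9:46:51
]
print(round(cramers_v(table), 3))
print(round(cramers_v(table, bias_correction=True), 3))
```

Under that assumed pairing the uncorrected statistic is 1 and the corrected one rounds to 0.998, which is consistent with the two values shown in the matrices, although the report itself does not confirm the method.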

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index  클러스터분류  생성자              생성일시                 상위클러스터분류ID  클러스터분류레벨  클러스터분류이름타이틀  표시순서
0      A             000000000000000UU0  2015-03-18 오전 9:43:32  <NA>                1                 TCD_CLS_ID.A            1
1      B             000000000000000UU0  2015-03-18 오전 9:43:32  <NA>                1                 TCD_CLS_ID.B            2
2      C             000000000000000UU0  2015-03-18 오전 9:43:32  <NA>                1                 TCD_CLS_ID.C            3
3      A1            000000000000000UU0  2015-03-18 오전 9:44:14  A                   2                 TCD_CLS_ID.A1           1
4      A2            000000000000000UU0  2015-03-18 오전 9:44:14  A                   2                 TCD_CLS_ID.A2           2
5      B1            000000000000000UU0  2015-03-18 오전 9:44:14  B                   2                 TCD_CLS_ID.B1           1
6      B2            000000000000000UU0  2015-03-18 오전 9:44:14  B                   2                 TCD_CLS_ID.B2           2
7      C1            000000000000000UU0  2015-03-18 오전 9:44:14  C                   2                 TCD_CLS_ID.C1           1
8      C2            000000000000000UU0  2015-03-18 오전 9:44:14  C                   2                 TCD_CLS_ID.C2           2
9      C3            000000000000000UU0  2015-03-18 오전 9:44:14  C                   2                 TCD_CLS_ID.C3           3

Last rows

Index  클러스터분류  생성자              생성일시                 상위클러스터분류ID  클러스터분류레벨  클러스터분류이름타이틀  표시순서
297    C2C04A        000000000000000UU0  2015-03-18 오전 9:46:51  C2C04               5                 TCD_CLS_ID.C2C04A       1
298    C2C04B        000000000000000UU0  2015-03-18 오전 9:46:51  C2C04               5                 TCD_CLS_ID.C2C04B       2
299    C2C04C        000000000000000UU0  2015-03-18 오전 9:46:51  C2C04               5                 TCD_CLS_ID.C2C04C       3
300    C2C04D        000000000000000UU0  2015-03-18 오전 9:46:51  C2C04               5                 TCD_CLS_ID.C2C04D       4
301    C2C05A        000000000000000UU0  2015-03-18 오전 9:46:51  C2C05               5                 TCD_CLS_ID.C2C05A       1
302    C3A01A        000000000000000UU0  2015-03-18 오전 9:46:51  C3A01               5                 TCD_CLS_ID.C3A01A       1
303    C3A01B        000000000000000UU0  2015-03-18 오전 9:46:51  C3A01               5                 TCD_CLS_ID.C3A01B       2
304    C3A01C        000000000000000UU0  2015-03-18 오전 9:46:51  C3A01               5                 TCD_CLS_ID.C3A01C       3
305    C3B01A        000000000000000UU0  2015-03-18 오전 9:46:51  C3B01               5                 TCD_CLS_ID.C3B01A       1
306    C3B01B        000000000000000UU0  2015-03-18 오전 9:46:51  C3B01               5                 TCD_CLS_ID.C3B01B       2
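
The sample rows suggest that each code extends its parent's code, with code lengths 1, 2, 3, 5, and 6 corresponding to levels 1 through 5. A minimal sketch of that reading, assuming the pattern holds for every row (only 20 of the 307 rows are shown, and at least one value contains a stray `)` character); `LEVEL_LENGTHS`, `level_of`, and `parent_of` are hypothetical names:

```python
# Code length at each hierarchy level (inferred from the sample rows).
LEVEL_LENGTHS = [1, 2, 3, 5, 6]


def level_of(code):
    """Return the 1-based level implied by the length of `code`."""
    return LEVEL_LENGTHS.index(len(code)) + 1


def parent_of(code):
    """Return the parent code, or None for a top-level code."""
    level = level_of(code)
    if level == 1:
        return None
    # The parent is the prefix whose length belongs to the previous level.
    return code[: LEVEL_LENGTHS[level - 2]]


for code in ["A", "A1", "C2C04A"]:
    print(code, level_of(code), parent_of(code), "TCD_CLS_ID." + code)
```

As a consistency check on the assumed lengths, weighting them by the reported level counts (3, 7, 21, 75, 201) gives 3·1 + 7·2 + 21·3 + 75·5 + 201·6 = 1661, the total character count reported for 클러스터분류.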