Overview

Dataset statistics

Number of variables: 4
Number of observations: 106
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.5 KiB
Average record size in memory: 34.2 B

Variable types

Categorical: 2
Text: 1
Numeric: 1

Dataset

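Description: A list of code detail records used to manage the science learning content on the National Science Museum (국립중앙과학관) website.
Author: Ministry of Science and ICT, National Science Museum (과학기술정보통신부 국립중앙과학관)
URL: https://www.data.go.kr/data/15067827/fileData.do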

Alerts

우선순위 is highly overall correlated with 코드상세번호 (High correlation)
코드상세번호 is highly overall correlated with 우선순위 (High correlation)

Reproduction

Analysis started: 2023-12-12 22:05:05.371037
Analysis finished: 2023-12-12 22:05:05.730121
Duration: 0.36 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
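
For readers who want to regenerate a report like this one, the following is a minimal sketch of how ydata-profiling is typically invoked. It is not the script that produced this report. The function name `build_report`, the CSV path, the output path, the report title and the UTF-8 encoding default are all assumptions made for illustration; only the column names 코드번호 and 코드상세번호 come from the report itself.

```python
def build_report(csv_path: str, out_path: str = "report.html",
                 encoding: str = "utf-8") -> None:
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so that defining the function does not require
    # pandas or ydata-profiling to be installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # ASSUMPTION: reading the two code columns as strings keeps mixed values
    # such as "1" and "C001" as text, so they are profiled as categorical.
    df = pd.read_csv(
        csv_path,
        encoding=encoding,  # ASSUMPTION: the real file may use another encoding
        dtype={"코드번호": str, "코드상세번호": str},
    )

    # The title is a hypothetical placeholder, not taken from the report.
    profile = ProfileReport(df, title="Code detail list profile")
    profile.to_file(out_path)
```

Calling `build_report("codes.csv")` with a local copy of the dataset (the filename is hypothetical) would write `report.html` next to the script.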

Variables

코드번호 (code number)
Categorical

Distinct: 14
Distinct (%): 13.2%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Top values: I001 (24), 4 (17), I003 (10), I002 (8), 1 (8)
Other values (9): 39

Length

Max length: 4
Median length: 4
Mean length: 3.1226415
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: I002
2nd row: I002
3rd row: I002
4th row: I002
5th row: I003

Common Values

Value | Count | Frequency (%)
I001 | 24 | 22.6%
4 | 17 | 16.0%
I003 | 10 | 9.4%
I002 | 8 | 7.5%
1 | 8 | 7.5%
M001 | 6 | 5.7%
M002 | 6 | 5.7%
M003 | 6 | 5.7%
M004 | 6 | 5.7%
C001 | 4 | 3.8%
Other values (4) | 11 | 10.4%
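
The percentages in this table are each count divided by the number of observations (106), rounded to one decimal place. The short sketch below recomputes them from the counts listed above; the variable and function names are illustrative only.

```python
n_observations = 106  # "Number of observations" in the overview

# Counts copied from the 코드번호 "Common Values" table.
counts = {
    "I001": 24, "4": 17, "I003": 10, "I002": 8, "1": 8,
    "M001": 6, "M002": 6, "M003": 6, "M004": 6, "C001": 4,
    "Other values (4)": 11,
}

def percent(count: int, total: int) -> float:
    """Share of `total` as a percentage, rounded to one decimal place."""
    return round(100 * count / total, 1)

frequencies = {value: percent(c, n_observations) for value, c in counts.items()}
distinct_pct = percent(14, n_observations)  # 14 distinct values

for value, pct in frequencies.items():
    print(f"{value}: {pct}%")
print(f"Distinct (%): {distinct_pct}%")
```

The counts sum to 106, so the table accounts for every row of the dataset.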

Length

[Figure: Histogram of lengths of the category]

코드상세번호 (code detail number)
Categorical

HIGH CORRELATION

Distinct: 36
Distinct (%): 34.0%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Top values: 1 (12), 2 (12), 3 (10), 4 (9), 5 (8)
Other values (31): 55

Length

Max length: 4
Median length: 1
Mean length: 1.5660377
Min length: 1

Unique

Unique: 19
Unique (%): 17.9%

Sample

1st row: 5
2nd row: 6
3rd row: 7
4th row: 8
5th row: 6

Common Values

Value | Count | Frequency (%)
1 | 12 | 11.3%
2 | 12 | 11.3%
3 | 10 | 9.4%
4 | 9 | 8.5%
5 | 8 | 7.5%
6 | 8 | 7.5%
7 | 4 | 3.8%
8 | 4 | 3.8%
9 | 3 | 2.8%
10 | 3 | 2.8%
Other values (26) | 33 | 31.1%

Length

[Figure: Histogram of lengths of the category]

코드명 (code name)
Text

Distinct: 104
Distinct (%): 98.1%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Length

Max length: 11
Median length: 8.5
Mean length: 5.6226415
Min length: 2

Characters and Unicode

Total characters: 596
Distinct characters: 122
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
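
As a rough illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own implementation. The helper name `category_counts` is hypothetical, and the input is only a handful of 코드명 values that appear elsewhere in this report, so the totals do not match the full-column figures.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for text in values:
        # Normalise to NFC so each Hangul syllable is one code point.
        for ch in unicodedata.normalize("NFC", text):
            counts[unicodedata.category(ch)] += 1
    return counts

# A few 코드명 values that appear in this report (not the full column).
sample = ["활자아이콘5", "경상북도(대구)", "우리나라 텃새", "농업/산림"]
counts = category_counts(sample)

# Lo = Other Letter, Nd = Decimal Number, Ps/Pe = Open/Close Punctuation,
# Zs = Space Separator, Po = Other Punctuation.
for code, n in counts.most_common():
    print(code, n)
```

The two-letter codes map onto the category names used in the tables of this section, for example `Lo` for Other Letter and `Nd` for Decimal Number.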

Unique

Unique: 102
Unique (%): 96.2%

Sample

1st row: 활자아이콘5
2nd row: 활자아이콘6
3rd row: 활자아이콘7
4th row: 활자아이콘8
5th row: 로봇아이콘6

Value | Count | Frequency (%)
서울특별시 | 4 | 3.6%
해외과학관 | 2 | 1.8%
로봇 | 2 | 1.8%
우리나라 | 2 | 1.8%
어린이 | 1 | 0.9%
활자아이콘5 | 1 | 0.9%
음악 | 1 | 0.9%
농업/산림 | 1 | 0.9%
천문/지질 | 1 | 0.9%
우주 | 1 | 0.9%
Other values (96) | 96 | 85.7%

Most occurring characters

Value | Count | Frequency (%)
 | 43 | 7.2%
 | 42 | 7.0%
 | 42 | 7.0%
 | 31 | 5.2%
 | 30 | 5.0%
 | 29 | 4.9%
 | 26 | 4.4%
1 | 16 | 2.7%
 | 15 | 2.5%
 | 13 | 2.2%
Other values (112) | 309 | 51.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 504 | 84.6%
Decimal Number | 58 | 9.7%
Open Punctuation | 9 | 1.5%
Close Punctuation | 9 | 1.5%
Space Separator | 6 | 1.0%
Other Punctuation | 6 | 1.0%
Uppercase Letter | 4 | 0.7%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 43 | 8.5%
 | 42 | 8.3%
 | 42 | 8.3%
 | 31 | 6.2%
 | 30 | 6.0%
 | 29 | 5.8%
 | 26 | 5.2%
 | 15 | 3.0%
 | 13 | 2.6%
 | 13 | 2.6%
Other values (94) | 220 | 43.7%

Decimal Number
Value | Count | Frequency (%)
1 | 16 | 27.6%
2 | 10 | 17.2%
4 | 5 | 8.6%
3 | 5 | 8.6%
5 | 4 | 6.9%
6 | 4 | 6.9%
8 | 4 | 6.9%
7 | 4 | 6.9%
0 | 3 | 5.2%
9 | 3 | 5.2%

Uppercase Letter
Value | Count | Frequency (%)
A | 1 | 25.0%
D | 1 | 25.0%
B | 1 | 25.0%
C | 1 | 25.0%

Open Punctuation
Value | Count | Frequency (%)
( | 9 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 9 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 6 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 6 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 504 | 84.6%
Common | 88 | 14.8%
Latin | 4 | 0.7%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 43 | 8.5%
 | 42 | 8.3%
 | 42 | 8.3%
 | 31 | 6.2%
 | 30 | 6.0%
 | 29 | 5.8%
 | 26 | 5.2%
 | 15 | 3.0%
 | 13 | 2.6%
 | 13 | 2.6%
Other values (94) | 220 | 43.7%

Common
Value | Count | Frequency (%)
1 | 16 | 18.2%
2 | 10 | 11.4%
( | 9 | 10.2%
) | 9 | 10.2%
(space) | 6 | 6.8%
/ | 6 | 6.8%
4 | 5 | 5.7%
3 | 5 | 5.7%
5 | 4 | 4.5%
6 | 4 | 4.5%
Other values (4) | 14 | 15.9%

Latin
Value | Count | Frequency (%)
A | 1 | 25.0%
D | 1 | 25.0%
B | 1 | 25.0%
C | 1 | 25.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 504 | 84.6%
ASCII | 92 | 15.4%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 43 | 8.5%
 | 42 | 8.3%
 | 42 | 8.3%
 | 31 | 6.2%
 | 30 | 6.0%
 | 29 | 5.8%
 | 26 | 5.2%
 | 15 | 3.0%
 | 13 | 2.6%
 | 13 | 2.6%
Other values (94) | 220 | 43.7%

ASCII
Value | Count | Frequency (%)
1 | 16 | 17.4%
2 | 10 | 10.9%
( | 9 | 9.8%
) | 9 | 9.8%
(space) | 6 | 6.5%
/ | 6 | 6.5%
4 | 5 | 5.4%
3 | 5 | 5.4%
5 | 4 | 4.3%
6 | 4 | 4.3%
Other values (8) | 18 | 19.6%

우선순위 (priority)
Real number (ℝ)

HIGH CORRELATION

Distinct: 24
Distinct (%): 22.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.5660377
Minimum: 1
Maximum: 24
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 5
Q3: 8.75
95-th percentile: 18.75
Maximum: 24
Range: 23
Interquartile range (IQR): 6.75
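
Fractional values such as Q3 = 8.75 arise from linear interpolation between neighbouring sorted values. The sketch below reproduces the quantiles under one loud assumption: the report's frequency tables for 우선순위 do not list the counts for the values 11 to 14, so they are taken here to occur twice each, which is the split consistent with the reported 106 observations, 24 distinct values and sum of 696. The helper `quantile` is a hand-written illustration, not the profiler's code.

```python
def quantile(sorted_values, q):
    """Quantile by linear interpolation between the two nearest ranks."""
    pos = (len(sorted_values) - 1) * q
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])

# Counts read from the report's frequency tables for 우선순위.
counts = {1: 14, 2: 14, 3: 12, 4: 11, 5: 9, 6: 9, 7: 5, 8: 5, 9: 3, 10: 3,
          15: 2, 16: 2, 17: 2,
          18: 1, 19: 1, 20: 1, 21: 1, 22: 1, 23: 1, 24: 1}
# ASSUMPTION: values 11-14 are not listed individually in the report;
# two occurrences each is inferred from n = 106 and sum = 696.
counts.update({11: 2, 12: 2, 13: 2, 14: 2})

values = sorted(v for v, c in counts.items() for _ in range(c))

q1 = quantile(values, 0.25)
median = quantile(values, 0.50)
q3 = quantile(values, 0.75)
p95 = quantile(values, 0.95)
iqr = q3 - q1
value_range = values[-1] - values[0]

print(q1, median, q3, p95, iqr, value_range)
```

Under that assumption the reconstructed column has 106 values summing to 696, and the interpolated quantiles agree with the figures listed above.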

Descriptive statistics

Standard deviation: 5.6535179
Coefficient of variation (CV): 0.86102427
Kurtosis: 1.0905805
Mean: 6.5660377
Median Absolute Deviation (MAD): 3
Skewness: 1.3392032
Sum: 696
Variance: 31.962264
Monotonicity: Not monotonic
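
Several of these figures are tied together by simple identities: the mean is the sum divided by the number of observations, the variance is the square of the standard deviation, and the coefficient of variation is the standard deviation divided by the mean. The sketch below checks those identities using only numbers printed in this report; the variable names are illustrative.

```python
import math

n = 106            # number of observations
total = 696        # reported Sum
std = 5.6535179    # reported Standard deviation

reported = {"mean": 6.5660377, "variance": 31.962264, "cv": 0.86102427}

computed = {
    "mean": total / n,          # mean = sum / n
    "variance": std ** 2,       # variance = std^2
    "cv": std / (total / n),    # CV = std / mean
}

for name, value in computed.items():
    ok = math.isclose(value, reported[name], rel_tol=1e-6)
    print(f"{name}: computed {value:.7f}, reported {reported[name]}, match={ok}")
```

The comparison uses a relative tolerance because the reported values are rounded to about eight significant digits.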
[Figure: Histogram with fixed size bins (bins=24)]

Common Values

Value | Count | Frequency (%)
1 | 14 | 13.2%
2 | 14 | 13.2%
3 | 12 | 11.3%
4 | 11 | 10.4%
5 | 9 | 8.5%
6 | 9 | 8.5%
7 | 5 | 4.7%
8 | 5 | 4.7%
9 | 3 | 2.8%
10 | 3 | 2.8%
Other values (14) | 21 | 19.8%

Minimum 10 values

Value | Count | Frequency (%)
1 | 14 | 13.2%
2 | 14 | 13.2%
3 | 12 | 11.3%
4 | 11 | 10.4%
5 | 9 | 8.5%
6 | 9 | 8.5%
7 | 5 | 4.7%
8 | 5 | 4.7%
9 | 3 | 2.8%
10 | 3 | 2.8%

Maximum 10 values

Value | Count | Frequency (%)
24 | 1 | 0.9%
23 | 1 | 0.9%
22 | 1 | 0.9%
21 | 1 | 0.9%
20 | 1 | 0.9%
19 | 1 | 0.9%
18 | 1 | 0.9%
17 | 2 | 1.9%
16 | 2 | 1.9%
15 | 2 | 1.9%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]
 | 코드번호 | 코드상세번호 | 우선순위
코드번호 | 1.000 | 0.000 | 0.000
코드상세번호 | 0.000 | 1.000 | 1.000
우선순위 | 0.000 | 1.000 | 1.000

[Figure: correlation heatmap]
 | 코드번호 | 코드상세번호
코드번호 | 1.000 | 0.000
코드상세번호 | 0.000 | 1.000

[Figure: correlation heatmap]
 | 우선순위 | 코드번호 | 코드상세번호
우선순위 | 1.000 | 0.000 | 0.854
코드번호 | 0.000 | 1.000 | 0.000
코드상세번호 | 0.854 | 0.000 | 1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

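First rows

 | 코드번호 | 코드상세번호 | 코드명 | 우선순위
0 | I002 | 5 | 활자아이콘5 | 5
1 | I002 | 6 | 활자아이콘6 | 6
2 | I002 | 7 | 활자아이콘7 | 7
3 | I002 | 8 | 활자아이콘8 | 8
4 | I003 | 6 | 로봇아이콘6 | 6
5 | I003 | 7 | 로봇아이콘7 | 7
6 | I003 | 8 | 로봇아이콘8 | 8
7 | I003 | 9 | 로봇아이콘9 | 9
8 | I003 | 10 | 로봇아이콘10 | 10
9 | 1 | C001 | 공룡 | 1

Last rows

 | 코드번호 | 코드상세번호 | 코드명 | 우선순위
96 | 4 | 14 | 경상북도(대구) | 14
97 | 4 | 15 | 전라남도(광주) | 15
98 | 4 | 16 | 경상남도(울산) | 16
99 | 4 | 17 | 해외과학관 | 17
100 | 1 | C005 | 우리나라 텃새 | 5
101 | 1 | C008 | 축음기 | 8
102 | 1 | C006 | 수의역사 | 6
103 | C001 | 4 | 잡식공룡 | 4
104 | 1 | C004 | 우리나라 성곽축조과학 | 4
105 | 1 | C007 | 컴퓨터 | 7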