Overview

Dataset statistics

Number of variables: 4
Number of observations: 2937
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 295
Duplicate rows (%): 10.0%
Total size in memory: 94.8 KiB
Average record size in memory: 33.0 B

Variable types

Text: 2
Categorical: 1
Numeric: 1

Dataset

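Description: A listing of the statistics table for the science learning content on the National Science Museum (국립중앙과학관) website.
Author: Ministry of Science and ICT (과학기술정보통신부), National Science Museum (국립중앙과학관)
URL: https://www.data.go.kr/data/15067817/fileData.do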

Alerts

Dataset has 295 (10.0%) duplicate rows [Duplicates]

Reproduction

Analysis started: 2023-12-12 23:56:23.223752
Analysis finished: 2023-12-12 23:56:23.610595
Duration: 0.39 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

내용 아이디 (content ID)
Text

Distinct: 154
Distinct (%): 5.2%
Missing: 0
Missing (%): 0.0%
Memory size: 23.1 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 35244
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
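
As a minimal sketch of how such per-character properties can be tallied, the snippet below counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical and this is not necessarily how the profiling tool computes its figures; the input strings are the five sample values of this variable plus one title from the dataset sample.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in `values`.

    Hypothetical helper for illustration only.
    """
    counts = Counter()
    for value in values:
        for ch in value:
            # e.g. "Lu" = Uppercase Letter, "Nd" = Decimal Number,
            # "Lo" = Other Letter, "Zs" = Space Separator, "Po" = Other Punctuation
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample IDs shown in this report: 2 letters + 10 digits each.
id_counts = category_counts([
    "CT0000000069", "CT0000000070", "CT0000000071",
    "CT0000000072", "CT0000000073",
])

# One Korean title from the dataset sample.
title_counts = category_counts(["공룡은 왜 멸종을 했나요?"])

print(dict(id_counts))
print(dict(title_counts))
```

Hangul syllables fall under "Other Letter" (Lo), which is consistent with that category dominating the title column later in this report.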

Unique

Unique: 19
Unique (%): 0.6%

Sample

1st row: CT0000000069
2nd row: CT0000000070
3rd row: CT0000000071
4th row: CT0000000072
5th row: CT0000000073

Value               Count  Frequency (%)
ct0000000129        53     1.8%
ct0000000137        50     1.7%
ct0000000131        50     1.7%
ct0000000130        47     1.6%
ct0000000023        46     1.6%
ct0000000127        44     1.5%
ct0000000021        44     1.5%
ct0000000125        42     1.4%
ct0000000126        41     1.4%
ct0000000079        40     1.4%
Other values (144)  2480   84.4%

Most occurring characters

Value             Count  Frequency (%)
0                 22336  63.4%
C                 2937   8.3%
T                 2937   8.3%
1                 2032   5.8%
3                 896    2.5%
2                 768    2.2%
4                 650    1.8%
5                 624    1.8%
6                 603    1.7%
7                 558    1.6%
Other values (2)  903    2.6%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    29370  83.3%
Uppercase Letter  5874   16.7%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      22336  76.1%
1      2032   6.9%
3      896    3.1%
2      768    2.6%
4      650    2.2%
5      624    2.1%
6      603    2.1%
7      558    1.9%
8      459    1.6%
9      444    1.5%

Uppercase Letter
Value  Count  Frequency (%)
C      2937   50.0%
T      2937   50.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  29370  83.3%
Latin   5874   16.7%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      22336  76.1%
1      2032   6.9%
3      896    3.1%
2      768    2.6%
4      650    2.2%
5      624    2.1%
6      603    2.1%
7      558    1.9%
8      459    1.6%
9      444    1.5%

Latin
Value  Count  Frequency (%)
C      2937   50.0%
T      2937   50.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  35244  100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
0                 22336  63.4%
C                 2937   8.3%
T                 2937   8.3%
1                 2032   5.8%
3                 896    2.5%
2                 768    2.2%
4                 650    1.8%
5                 624    1.8%
6                 603    1.7%
7                 558    1.6%
Other values (2)  903    2.6%

대분류코드 (major category code)
Categorical

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 23.1 KiB
Category counts: C001 1730, C002 766, C003 441

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: C002
2nd row: C002
3rd row: C002
4th row: C002
5th row: C002

Common Values

Value  Count  Frequency (%)
C001   1730   58.9%
C002   766    26.1%
C003   441    15.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: category frequency plot, with the legend values below]
Value  Count  Frequency (%)
c001   1730   58.9%
c002   766    26.1%
c003   441    15.0%

제목_한글명 (Korean title)
Text

Distinct: 154
Distinct (%): 5.2%
Missing: 0
Missing (%): 0.0%
Memory size: 23.1 KiB

Length

Max length: 18
Median length: 15
Mean length: 7.1297242
Min length: 2

Characters and Unicode

Total characters: 20940
Distinct characters: 224
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 19
Unique (%): 0.6%

Sample

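1st row: 정유자 (Jeongyu type)
2nd row: 정리자 (Jeongni type)
3rd row: 정리자체 철활자 (Jeongni-style iron type)
4th row: 전사자 (Jeonsa type)
5th row: 필서체 철활자 (handwriting-style iron type)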

Value (English gloss)                Count  Frequency (%)
공룡은 (dinosaur, topic form)        186    4.3%
금속활자 (metal type)                115    2.6%
가장 (most)                          103    2.4%
공룡이 (dinosaur, subject form)      74     1.7%
어떤 (which)                         74     1.7%
휴보 (Hubo)                          68     1.6%
브라키오사우루스 (Brachiosaurus)     53     1.2%
했나요 (did ...?)                    50     1.1%
[value not shown]                    50     1.1%
덩치가 (body size, subject form)     50     1.1%
Other values (179)                   3541   81.1%
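
The word frequencies above can be approximated with a simple tokenizer. The sketch below assumes that values are lowercased, split on whitespace, and stripped of surrounding ASCII punctuation. That assumption is inferred from the tables in this report (IDs appear as `ct...` and 했나요 appears without its question mark) and is not a confirmed description of the tool's internals. The function name `word_counts` is hypothetical, and the input titles are taken from the dataset sample.

```python
import string
from collections import Counter


def word_counts(titles):
    """Count whitespace-separated tokens, lowercased and stripped of
    leading/trailing ASCII punctuation (an assumed tokenization)."""
    counts = Counter()
    for title in titles:
        for token in title.lower().split():
            token = token.strip(string.punctuation)
            if token:  # skip tokens that were punctuation only
                counts[token] += 1
    return counts


titles = [
    "가장 사나운 공룡은?",
    "가장 덩치가 큰 공룡은?",
    "공룡은 왜 멸종을 했나요?",
    "휴보2 제작사양서",
]
counts = word_counts(titles)

for token, n in counts.most_common(3):
    print(token, n)
```

Because particles stay attached to their nouns under this scheme, 공룡은 and 공룡이 are counted as separate tokens, as they are in the table.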

Most occurring characters

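Value               Count  Frequency (%)
[Hangul character]  1441   6.9%
[space]             1427   6.8%
[Hangul character]  936    4.5%
[Hangul character]  877    4.2%
[Hangul character]  848    4.0%
[Hangul character]  789    3.8%
[Hangul character]  568    2.7%
[Hangul character]  417    2.0%
[Hangul character]  399    1.9%
[Hangul character]  363    1.7%
Other values (214)  12875  61.5%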

Most occurring categories

Value              Count  Frequency (%)
Other Letter       19059  91.0%
Space Separator    1427   6.8%
Other Punctuation  361    1.7%
Decimal Number     92     0.4%
Uppercase Letter   1      < 0.1%

Most frequent character per category

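Other Letter
Value               Count  Frequency (%)
[Hangul character]  1441   7.6%
[Hangul character]  936    4.9%
[Hangul character]  877    4.6%
[Hangul character]  848    4.4%
[Hangul character]  789    4.1%
[Hangul character]  568    3.0%
[Hangul character]  417    2.2%
[Hangul character]  399    2.1%
[Hangul character]  363    1.9%
[Hangul character]  332    1.7%
Other values (209)  12089  63.4%

Decimal Number
Value  Count  Frequency (%)
2      81     88.0%
6      11     12.0%

Space Separator
Value    Count  Frequency (%)
[space]  1427   100.0%

Other Punctuation
Value  Count  Frequency (%)
?      361    100.0%

Uppercase Letter
Value  Count  Frequency (%)
P      1      100.0%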

Most occurring scripts

Value   Count  Frequency (%)
Hangul  19059  91.0%
Common  1880   9.0%
Latin   1      < 0.1%

Most frequent character per script

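Hangul
Value               Count  Frequency (%)
[Hangul character]  1441   7.6%
[Hangul character]  936    4.9%
[Hangul character]  877    4.6%
[Hangul character]  848    4.4%
[Hangul character]  789    4.1%
[Hangul character]  568    3.0%
[Hangul character]  417    2.2%
[Hangul character]  399    2.1%
[Hangul character]  363    1.9%
[Hangul character]  332    1.7%
Other values (209)  12089  63.4%

Common
Value    Count  Frequency (%)
[space]  1427   75.9%
?        361    19.2%
2        81     4.3%
6        11     0.6%

Latin
Value  Count  Frequency (%)
P      1      100.0%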

Most occurring blocks

Value   Count  Frequency (%)
Hangul  19059  91.0%
ASCII   1881   9.0%

Most frequent character per block

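Hangul
Value               Count  Frequency (%)
[Hangul character]  1441   7.6%
[Hangul character]  936    4.9%
[Hangul character]  877    4.6%
[Hangul character]  848    4.4%
[Hangul character]  789    4.1%
[Hangul character]  568    3.0%
[Hangul character]  417    2.2%
[Hangul character]  399    2.1%
[Hangul character]  363    1.9%
[Hangul character]  332    1.7%
Other values (209)  12089  63.4%

ASCII
Value    Count  Frequency (%)
[space]  1427   75.9%
?        361    19.2%
2        81     4.3%
6        11     0.6%
P        1      0.1%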

조회수 (view count)
Real number (ℝ)

Distinct: 18
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.8781069
Minimum: 1
Maximum: 20
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 25.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 2
95-th percentile: 6
Maximum: 20
Range: 19
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 2.0439952
Coefficient of variation (CV): 1.0883274
Kurtosis: 14.750095
Mean: 1.8781069
Median Absolute Deviation (MAD): 0
Skewness: 3.489275
Sum: 5516
Variance: 4.1779163
Monotonicity: Not monotonic
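
Several of these figures can be recomputed from the value counts that this report lists for 조회수 (view count). The sketch below uses only the standard library and assumes the sample variance with an n - 1 denominator, which is the convention that reproduces the reported 4.1779163. The variable names are illustrative.

```python
import math

# Value -> count for 조회수, as listed in this report's frequency tables.
view_counts = {
    1: 2072, 2: 392, 3: 108, 4: 129, 5: 71, 6: 28, 7: 36, 8: 24,
    9: 21, 10: 22, 11: 11, 12: 7, 13: 4, 14: 4, 15: 3, 16: 3,
    18: 1, 20: 1,
}

n = sum(view_counts.values())                       # number of observations
total = sum(v * c for v, c in view_counts.items())  # "Sum"
mean = total / n

# Sample variance (n - 1 denominator); an assumption that matches the report.
variance = sum(c * (v - mean) ** 2 for v, c in view_counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

print(n, total)
print(round(mean, 7), round(variance, 7), round(std, 7), round(cv, 7))
```

The coefficient of variation is the standard deviation divided by the mean, so a value above 1 here reflects a long right tail on a column whose median is 1.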

[Figure: Histogram with fixed size bins (bins=18)]

Common values
Value             Count  Frequency (%)
1                 2072   70.5%
2                 392    13.3%
4                 129    4.4%
3                 108    3.7%
5                 71     2.4%
7                 36     1.2%
6                 28     1.0%
8                 24     0.8%
10                22     0.7%
9                 21     0.7%
Other values (8)  34     1.2%

Minimum 10 values
Value  Count  Frequency (%)
1      2072   70.5%
2      392    13.3%
3      108    3.7%
4      129    4.4%
5      71     2.4%
6      28     1.0%
7      36     1.2%
8      24     0.8%
9      21     0.7%
10     22     0.7%

Maximum 10 values
Value  Count  Frequency (%)
20     1      < 0.1%
18     1      < 0.1%
16     3      0.1%
15     3      0.1%
14     4      0.1%
13     4      0.1%
12     7      0.2%
11     11     0.4%
10     22     0.7%
9      21     0.7%

Interactions


Correlations

            대분류코드  조회수
대분류코드  1.000       0.097
조회수      0.097       1.000

            조회수  대분류코드
조회수      1.000   0.058
대분류코드  0.058   1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

Columns: 내용 아이디 = content ID, 대분류코드 = major category code, 제목_한글명 = Korean title, 조회수 = view count.

First rows
Index  내용 아이디    대분류코드  제목_한글명              조회수
0      CT0000000069   C002        정유자                   7
1      CT0000000070   C002        정리자                   5
2      CT0000000071   C002        정리자체 철활자          9
3      CT0000000072   C002        전사자                   4
4      CT0000000073   C002        필서체 철활자            10
5      CT0000000074   C002        신연활자자               9
6      CT0000000075   C002        구텐베르크 금속활자      10
7      CT0000000075   C002        구텐베르크 금속활자      7
8      CT0000000075   C002        구텐베르크 금속활자      6
9      CT0000000079   C003        휴보2 제작사양서         10

Last rows
Index  내용 아이디    대분류코드  제목_한글명              조회수
2927   CT0000000112   C001        스트루티오미무스         1
2928   CT0000000163   C001        투오지앙고사우루스       1
2929   CT0000000130   C001        가장 사나운 공룡은?      1
2930   CT0000000075   C002        구텐베르크 금속활자      1
2931   CT0000000037   C002        고려시대의 금속활자      1
2932   CT0000000137   C001        공룡은 왜 멸종을 했나요? 1
2933   CT0000000041   C002        흥덕사자                 1
2934   CT0000000032   C002        금속활자 만들기          1
2935   CT0000000073   C002        필서체 철활자            1
2936   CT0000000051   C002        을유자                   20

Duplicate rows

Most frequently occurring

Index  내용 아이디    대분류코드  제목_한글명              조회수  # duplicates
192    CT0000000130   C001        가장 사나운 공룡은?      1       30
198    CT0000000131   C001        가장 덩치가 큰 공룡은?   1       30
9      CT0000000021   C003        휴보                     1       29
15     CT0000000023   C003        키보                     1       28
113    CT0000000079   C003        휴보2 제작사양서         1       28
29     CT0000000027   C002        금속활자란 무엇인가      1       27
52     CT0000000041   C002        흥덕사자                 1       27
188    CT0000000129   C001        브라키오사우루스         1       27
40     CT0000000036   C002        금속활자의 의의          1       26
175    CT0000000126   C001        데이노니쿠스             1       26
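
A table like the one above can be built by grouping identical rows and keeping the groups that occur more than once. The sketch below is a minimal standard-library illustration on toy rows: the values are modeled on the dataset sample, but the repetition counts are made up for the example. How the report itself arrives at the figure of 295 duplicate rows is not stated here, so this should not be read as the tool's exact definition.

```python
from collections import Counter

# Toy rows: (내용 아이디, 대분류코드, 제목_한글명, 조회수). Illustrative only.
rows = [
    ("CT0000000075", "C002", "구텐베르크 금속활자", 10),
    ("CT0000000075", "C002", "구텐베르크 금속활자", 7),
    ("CT0000000075", "C002", "구텐베르크 금속활자", 1),
    ("CT0000000075", "C002", "구텐베르크 금속활자", 1),
    ("CT0000000130", "C001", "가장 사나운 공룡은?", 1),
    ("CT0000000130", "C001", "가장 사나운 공룡은?", 1),
    ("CT0000000130", "C001", "가장 사나운 공룡은?", 1),
]

# A row counts as a duplicate only if every column matches another row.
group_sizes = Counter(rows)
duplicated = {row: size for row, size in group_sizes.items() if size > 1}

# Largest groups first, mirroring "Most frequently occurring".
for row, size in sorted(duplicated.items(), key=lambda item: -item[1]):
    print(row, size)
```

Rows that share a content ID but differ in 조회수 (such as the entries with view counts 10 and 7) are not full-row duplicates, which is why the same ID can appear many times in the data without every occurrence being flagged.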