Dataset statistics
Number of variables | 3 |
---|---|
Number of observations | 1608 |
Missing cells | 388 |
Missing cells (%) | 8.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 37.8 KiB |
Average record size in memory | 24.1 B |
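The derived figures in this table can be checked from the raw counts. The following is a minimal sketch, assuming the missing-cell percentage is taken over all cells (observations times variables) and the average record size is total memory divided by the number of observations; the variable names are illustrative and not part of the report.

```python
# Counts copied from the "Dataset statistics" table.
n_variables = 3
n_observations = 1608
missing_cells = 388
total_memory_kib = 37.8

# Share of missing cells over all cells (rows x columns).
missing_pct = missing_cells / (n_observations * n_variables) * 100

# Average bytes per row: total memory (KiB converted to bytes) over the row count.
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Both results agree with the reported 8.0% and 24.1 B after rounding to one decimal place.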
Variable types
Text | 3 |
---|
Dataset
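Description | List of related sites provided by the science learning content section of the National Science Museum (국립중앙과학관) website. |
---|---|
Author | Ministry of Science and ICT (과학기술정보통신부), National Science Museum (국립중앙과학관) |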
URL | https://www.data.go.kr/data/15067815/fileData.do |
Reproduction
Analysis started | 2023-12-12 05:44:48.339787 |
---|---|
Analysis finished | 2023-12-12 05:44:48.803881 |
Duration | 0.46 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
고유 아이디 (Unique ID)
Text
UNIQUE
Distinct | 1608 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 12.7 KiB |
Value | Count | Frequency (%) |
435 | 1 | 0.1% |
5,028 | 1 | 0.1% |
5,000 | 1 | 0.1% |
2,880 | 1 | 0.1% |
2,879 | 1 | 0.1% |
2,878 | 1 | 0.1% |
2,877 | 1 | 0.1% |
2,876 | 1 | 0.1% |
2,873 | 1 | 0.1% |
1,575 | 1 | 0.1% |
Other values (1598) | 1598 | 99.4% |
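The values above carry thousands separators (for example `5,028`), which is presumably why the column was profiled as Text rather than as a number. A minimal sketch of converting such strings to integers follows; `parse_id` is a hypothetical helper, not something the report or ydata-profiling provides.

```python
def parse_id(text: str) -> int:
    """Convert an ID string such as '5,028' to the integer 5028."""
    # Drop the thousands separator before converting.
    return int(text.replace(",", ""))


# A few values taken from the frequency table above.
sample_ids = ["435", "5,028", "5,000", "2,880", "1,575"]
parsed = [parse_id(s) for s in sample_ids]
print(parsed)
```

After a conversion like this, the column could be profiled as a numeric variable instead of character by character.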
Most occurring characters
Value | Count | Frequency (%) |
1 | 994 | 16.2% |
, | 692 | 11.3% |
2 | 646 | 10.5% |
5 | 596 | 9.7% |
3 | 525 | 8.6% |
4 | 521 | 8.5% |
9 | 466 | 7.6% |
8 | 448 | 7.3% |
7 | 426 | 6.9% |
6 | 418 | 6.8% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 5443 | 88.7% |
Other Punctuation | 692 | 11.3% |
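The category names in this table correspond to Unicode general categories: "Decimal Number" is `Nd` and "Other Punctuation" is `Po`. The sketch below shows one way such counts could be reproduced with Python's standard `unicodedata` module; it is an illustration under that assumption, not the profiler's actual implementation, and `category_counts` is a hypothetical helper.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters by Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Three values from the frequency table of this column.
sample = ["435", "5,028", "5,000"]
counts = category_counts(sample)
total = sum(counts.values())
for category, n in counts.most_common():
    print(category, n, f"{n / total:.1%}")
```

On the full column the same tally would be expected to give the digit and comma totals shown in the table.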
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
1 | 994 | 18.3% |
2 | 646 | 11.9% |
5 | 596 | 10.9% |
3 | 525 | 9.6% |
4 | 521 | 9.6% |
9 | 466 | 8.6% |
8 | 448 | 8.2% |
7 | 426 | 7.8% |
6 | 418 | 7.7% |
0 | 403 | 7.4% |
Other Punctuation
Value | Count | Frequency (%) |
, | 692 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 6135 | 100.0% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
1 | 994 | 16.2% |
, | 692 | 11.3% |
2 | 646 | 10.5% |
5 | 596 | 9.7% |
3 | 525 | 8.6% |
4 | 521 | 8.5% |
9 | 466 | 7.6% |
8 | 448 | 7.3% |
7 | 426 | 6.9% |
6 | 418 | 6.8% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 6135 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
1 | 994 | 16.2% |
, | 692 | 11.3% |
2 | 646 | 10.5% |
5 | 596 | 9.7% |
3 | 525 | 8.6% |
4 | 521 | 8.5% |
9 | 466 | 7.6% |
8 | 448 | 7.3% |
7 | 426 | 6.9% |
6 | 418 | 6.8% |
고유 아이디 2 (Unique ID 2)
Text
Distinct | 723 |
---|---|
Distinct (%) | 45.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 12.7 KiB |
Value | Count | Frequency (%) |
181 | 15 | 0.9% |
1,217 | 12 | 0.7% |
388 | 11 | 0.7% |
314 | 10 | 0.6% |
188 | 10 | 0.6% |
313 | 10 | 0.6% |
378 | 9 | 0.6% |
309 | 9 | 0.6% |
387 | 8 | 0.5% |
1,078 | 8 | 0.5% |
Other values (713) | 1506 | 93.7% |
Most occurring characters
Value | Count | Frequency (%) |
1 | 1249 | 21.0% |
3 | 760 | 12.8% |
2 | 711 | 11.9% |
0 | 665 | 11.2% |
, | 617 | 10.4% |
8 | 362 | 6.1% |
4 | 354 | 5.9% |
9 | 339 | 5.7% |
7 | 315 | 5.3% |
5 | 298 | 5.0% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 5333 | 89.6% |
Other Punctuation | 617 | 10.4% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
1 | 1249 | 23.4% |
3 | 760 | 14.3% |
2 | 711 | 13.3% |
0 | 665 | 12.5% |
8 | 362 | 6.8% |
4 | 354 | 6.6% |
9 | 339 | 6.4% |
7 | 315 | 5.9% |
5 | 298 | 5.6% |
6 | 280 | 5.3% |
Other Punctuation
Value | Count | Frequency (%) |
, | 617 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 5950 | 100.0% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
1 | 1249 | 21.0% |
3 | 760 | 12.8% |
2 | 711 | 11.9% |
0 | 665 | 11.2% |
, | 617 | 10.4% |
8 | 362 | 6.1% |
4 | 354 | 5.9% |
9 | 339 | 5.7% |
7 | 315 | 5.3% |
5 | 298 | 5.0% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 5950 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
1 | 1249 | 21.0% |
3 | 760 | 12.8% |
2 | 711 | 11.9% |
0 | 665 | 11.2% |
, | 617 | 10.4% |
8 | 362 | 6.1% |
4 | 354 | 5.9% |
9 | 339 | 5.7% |
7 | 315 | 5.3% |
5 | 298 | 5.0% |
사이트명 (Site name)
Text
MISSING
Distinct | 809 |
---|---|
Distinct (%) | 66.3% |
Missing | 388 |
Missing (%) | 24.1% |
Memory size | 12.7 KiB |
Length
Max length | 124 |
---|---|
Median length | 57 |
Mean length | 14.279508 |
Min length | 4 |
Characters and Unicode
Total characters | 17421 |
---|---|
Distinct characters | 504 |
Distinct categories | 9 |
Distinct scripts | 3 |
Distinct blocks | 3 |
Unique
Unique | 734 |
---|---|
Unique (%) | 60.2% |
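The percentages for this column are consistent with two different denominators: Missing (%) is taken over all 1608 rows (388 / 1608 = 24.1%), while Distinct (%) and Unique (%) are taken over the 1220 non-missing values (809 / 1220 = 66.3%, 734 / 1220 = 60.2%). The sketch below illustrates that convention on a handful of rows; `summarize_text_column` is a hypothetical helper and `None` stands in for a missing cell.

```python
from collections import Counter


def summarize_text_column(values):
    """Return missing, distinct, and unique counts with percentages.

    Assumes None marks a missing cell and at least one value is present.
    """
    present = [v for v in values if v is not None]
    freq = Counter(present)
    n_total, n_present = len(values), len(present)
    n_missing = n_total - n_present
    n_distinct = len(freq)
    # "Unique" counts values that occur exactly once.
    n_unique = sum(1 for count in freq.values() if count == 1)
    return {
        "missing": n_missing,
        "missing_pct": round(100 * n_missing / n_total, 1),
        "distinct": n_distinct,
        "distinct_pct": round(100 * n_distinct / n_present, 1),
        "unique": n_unique,
        "unique_pct": round(100 * n_unique / n_present, 1),
    }


rows = [
    "위키피디아 - RCA Records",
    "위키피디아 - Extended play",
    "위키피디아 - Edison Records",
    "위키피디아 - Edison Records",
    "위키피디아 - Theremin",
    None,
]
summary = summarize_text_column(rows)
print(summary)
```

Here "distinct" counts different values, while "unique" counts only the values that are never repeated, which is why the second figure is the smaller one.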
Sample
1st row | 위키피디아 - RCA Records |
---|---|
2nd row | 위키피디아 - Extended play |
3rd row | 위키피디아 - Edison Records |
4th row | 위키피디아 - Edison Records |
5th row | 위키피디아 - Theremin |
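The sample rows appear to follow a "source - title" pattern, with the site or encyclopedia name before the hyphen and the article title after it. Assuming that pattern holds for the rest of the column, which the report does not confirm, a value could be split as in the sketch below; `split_site_name` is a hypothetical helper.

```python
from collections import Counter


def split_site_name(value: str) -> tuple[str, str]:
    """Split 'source - title' into (source, title).

    If the separator is absent, the whole value is returned as the source
    and the title is an empty string.
    """
    source, _, title = value.partition(" - ")
    return source.strip(), title.strip()


rows = [
    "위키피디아 - RCA Records",
    "위키피디아 - Extended play",
    "위키피디아 - Edison Records",
    "위키피디아 - Edison Records",
    "위키피디아 - Theremin",
]
pairs = [split_site_name(r) for r in rows]
by_source = Counter(source for source, _ in pairs)
print(pairs[0])
print(by_source)
```

Grouping by the first element gives a per-source count for whichever rows are supplied.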
Value | Count | Frequency (%) |
(blank) | 411 | 13.3% |
위키피디아 | 234 | 7.5% |
두산백과 | 138 | 4.5% |
한국위키피디아 | 80 | 2.6% |
문화재청 | 42 | 1.4% |
네이버지식백과 | 41 | 1.3% |
문화콘텐츠닷컴 | 31 | 1.0% |
한국민족문화대백과 | 30 | 1.0% |
향토문화대전 | 28 | 0.9% |
youtube | 28 | 0.9% |
Other values (1221) | 2038 | 65.7% |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 1882 | 10.8% |
- | 874 | 5.0% |
위 | 446 | 2.6% |
키 | 425 | 2.4% |
과 | 419 | 2.4% |
아 | 390 | 2.2% |
피 | 375 | 2.2% |
디 | 375 | 2.2% |
백 | 342 | 2.0% |
e | 340 | 2.0% |
Other values (494) | 11553 | 66.3% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 10137 | 58.2% |
Lowercase Letter | 3105 | 17.8% |
Space Separator | 1882 | 10.8% |
Uppercase Letter | 1208 | 6.9% |
Dash Punctuation | 874 | 5.0% |
Decimal Number | 122 | 0.7% |
Open Punctuation | 35 | 0.2% |
Close Punctuation | 35 | 0.2% |
Other Punctuation | 23 | 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
위 | 446 | 4.4% |
키 | 425 | 4.2% |
과 | 419 | 4.1% |
아 | 390 | 3.8% |
피 | 375 | 3.7% |
디 | 375 | 3.7% |
백 | 342 | 3.4% |
국 | 274 | 2.7% |
이 | 268 | 2.6% |
리 | 235 | 2.3% |
Other values (421) | 6588 | 65.0% |
Uppercase Letter
Value | Count | Frequency (%) |
D | 149 | 12.3% |
S | 139 | 11.5% |
L | 132 | 10.9% |
N | 115 | 9.5% |
M | 86 | 7.1% |
P | 68 | 5.6% |
C | 65 | 5.4% |
I | 60 | 5.0% |
R | 53 | 4.4% |
B | 48 | 4.0% |
Other values (16) | 293 | 24.3% |
Lowercase Letter
Value | Count | Frequency (%) |
e | 340 | 11.0% |
o | 300 | 9.7% |
a | 272 | 8.8% |
n | 250 | 8.1% |
r | 249 | 8.0% |
i | 213 | 6.9% |
t | 186 | 6.0% |
c | 163 | 5.2% |
u | 160 | 5.2% |
l | 138 | 4.4% |
Other values (14) | 834 | 26.9% |
Decimal Number
Value | Count | Frequency (%) |
0 | 41 | 33.6% |
1 | 28 | 23.0% |
3 | 11 | 9.0% |
2 | 11 | 9.0% |
8 | 9 | 7.4% |
5 | 8 | 6.6% |
6 | 7 | 5.7% |
7 | 3 | 2.5% |
4 | 3 | 2.5% |
9 | 1 | 0.8% |
Other Punctuation
Value | Count | Frequency (%) |
: | 7 | 30.4% |
' | 5 | 21.7% |
, | 3 | 13.0% |
/ | 3 | 13.0% |
& | 1 | 4.3% |
· | 1 | 4.3% |
? | 1 | 4.3% |
? | 1 | 4.3% |
. | 1 | 4.3% |
Space Separator
Value | Count | Frequency (%) |
(space) | 1882 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 874 | 100.0% |
Open Punctuation
Value | Count | Frequency (%) |
( | 35 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 35 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 10137 | 58.2% |
Latin | 4313 | 24.8% |
Common | 2971 | 17.1% |
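The script breakdown groups characters into Hangul, Latin, and Common (spaces, digits, and punctuation shared across scripts). Python's standard library does not expose the Unicode Script property, so the sketch below approximates it from character names; this is a rough heuristic for illustration only, not how the profiler classifies characters, and `rough_script` is a hypothetical helper.

```python
import unicodedata
from collections import Counter


def rough_script(ch: str) -> str:
    """Approximate a character's script from its Unicode name.

    Heuristic: names starting with HANGUL or LATIN map to those scripts;
    everything else (spaces, digits, punctuation) is treated as Common.
    """
    name = unicodedata.name(ch, "")
    if name.startswith("HANGUL"):
        return "Hangul"
    if name.startswith("LATIN"):
        return "Latin"
    return "Common"


value = "위키피디아 - RCA Records"
script_counts = Counter(rough_script(ch) for ch in value)
print(script_counts)
```

For this single value the tally is 5 Hangul syllables, 10 Latin letters, and 4 Common characters (three spaces and one hyphen).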
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
위 | 446 | 4.4% |
키 | 425 | 4.2% |
과 | 419 | 4.1% |
아 | 390 | 3.8% |
피 | 375 | 3.7% |
디 | 375 | 3.7% |
백 | 342 | 3.4% |
국 | 274 | 2.7% |
이 | 268 | 2.6% |
리 | 235 | 2.3% |
Other values (421) | 6588 | 65.0% |
Latin
Value | Count | Frequency (%) |
e | 340 | 7.9% |
o | 300 | 7.0% |
a | 272 | 6.3% |
n | 250 | 5.8% |
r | 249 | 5.8% |
i | 213 | 4.9% |
t | 186 | 4.3% |
c | 163 | 3.8% |
u | 160 | 3.7% |
D | 149 | 3.5% |
Other values (40) | 2031 | 47.1% |
Common
Value | Count | Frequency (%) |
(space) | 1882 | 63.3% |
- | 874 | 29.4% |
0 | 41 | 1.4% |
( | 35 | 1.2% |
) | 35 | 1.2% |
1 | 28 | 0.9% |
3 | 11 | 0.4% |
2 | 11 | 0.4% |
8 | 9 | 0.3% |
5 | 8 | 0.3% |
Other values (13) | 37 | 1.2% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 10137 | 58.2% |
ASCII | 7282 | 41.8% |
None | 2 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 1882 | 25.8% |
- | 874 | 12.0% |
e | 340 | 4.7% |
o | 300 | 4.1% |
a | 272 | 3.7% |
n | 250 | 3.4% |
r | 249 | 3.4% |
i | 213 | 2.9% |
t | 186 | 2.6% |
c | 163 | 2.2% |
Other values (61) | 2553 | 35.1% |
Hangul
Value | Count | Frequency (%) |
위 | 446 | 4.4% |
키 | 425 | 4.2% |
과 | 419 | 4.1% |
아 | 390 | 3.8% |
피 | 375 | 3.7% |
디 | 375 | 3.7% |
백 | 342 | 3.4% |
국 | 274 | 2.7% |
이 | 268 | 2.6% |
리 | 235 | 2.3% |
Other values (421) | 6588 | 65.0% |
None
Value | Count | Frequency (%) |
· | 1 | 50.0% |
? | 1 | 50.0% |
First rows
Row index | 고유 아이디 (Unique ID) | 고유 아이디 2 (Unique ID 2) | 사이트명 (Site name) |
---|---|---|---|
0 | 435 | 189 | 위키피디아 - RCA Records |
1 | 442 | 188 | 위키피디아 - Extended play |
2 | 447 | 186 | 위키피디아 - Edison Records |
3 | 449 | 185 | 위키피디아 - Edison Records |
4 | 459 | 183 | 위키피디아 - Theremin |
5 | 463 | 181 | 위키피디아 - RCA Records |
6 | 471 | 181 | 위키피디아 - Extended play |
7 | 473 | 181 | 한국위키피디아 - 자기 테이프 |
8 | 480 | 177 | 위키피디아 - Gramophone Company |
9 | 501 | 256 | 두산백과 - 컴퓨터 |
Last rows
Row index | 고유 아이디 (Unique ID) | 고유 아이디 2 (Unique ID 2) | 사이트명 (Site name) |
---|---|---|---|
1598 | 2,934 | 1,375 | <NA> |
1599 | 2,958 | 1,399 | <NA> |
1600 | 2,965 | 1,405 | <NA> |
1601 | 2,968 | 1,408 | <NA> |
1602 | 2,972 | 1,413 | <NA> |
1603 | 2,973 | 1,414 | <NA> |
1604 | 2,974 | 1,415 | <NA> |
1605 | 2,976 | 1,417 | <NA> |
1606 | 2,989 | 1,430 | <NA> |
1607 | 2,994 | 1,435 | <NA> |