Overview

Dataset statistics

Number of variables: 8
Number of observations: 49
Missing cells: 12
Missing cells (%): 3.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.3 KiB
Average record size in memory: 69.7 B
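
The missing-cell percentage follows from the counts above: cells are variables times observations. A minimal sketch of that arithmetic in Python (the variable names are illustrative, not part of the report); the same calculation per column gives the 24.5% reported for 첨부파일.

```python
n_variables = 8
n_observations = 49
missing_cells = 12  # all of them fall in the 첨부파일 column

# Dataset-level share: missing cells over all cells.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Column-level share: missing cells over the rows of one column.
column_missing_pct = 100 * missing_cells / n_observations

print(f"{total_cells} cells, {missing_pct:.1f}% missing overall")
print(f"{column_missing_pct:.1f}% missing in the affected column")
```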

Variable types

Numeric: 1
Categorical: 4
Text: 2
Boolean: 1

Dataset

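Description: A list of insects from the science learning content on the National Science Museum (국립중앙과학관) website. Data fields: 콘텐츠 아이디 (content ID), 대분류코드 (major category code), 중분류코드 (middle category code), 콘텐츠 제목 (content title), 이름 (name), 감수자 (reviewer), 공개유무 (public or not), 첨부파일 (attachment). ※ 481 Daedeok-daero, Yuseong-gu, Daejeon (National Science Museum)
URL: https://www.data.go.kr/data/15067836/fileData.do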

Alerts

대분류코드 has constant value "1019" (Constant)
감수자 has constant value "변봉규 교수" (Constant)
공개유무 has constant value "True" (Constant)
중분류코드 is highly overall correlated with 콘텐츠 아이디 and 1 other field (High correlation)
이름 is highly overall correlated with 콘텐츠 아이디 and 1 other field (High correlation)
콘텐츠 아이디 is highly overall correlated with 중분류코드 and 1 other field (High correlation)
첨부파일 has 12 (24.5%) missing values (Missing)
콘텐츠 아이디 has unique values (Unique)
콘텐츠 제목 has unique values (Unique)
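
The Constant, Unique, and Missing alerts above amount to simple per-column counts. A minimal sketch in plain Python, assuming simplified rules: a column is constant when it holds one distinct value, and unique when it has no missing values and no repeats. `classify_columns` is a hypothetical helper, and ydata-profiling's own rules may differ. The four rows are taken from the Sample section, so the labels describe this subset only.

```python
PDF = "https://smart.science.go.kr/upload_data/subject/bugs/pdf/INSECT_E{}.pdf"

COLUMNS = ["콘텐츠 아이디", "대분류코드", "중분류코드", "콘텐츠 제목", "첨부파일"]
# Four rows copied from the Sample section; None stands for a missing value.
RAW = [
    (1411, 1019, 1032, "여치", PDF.format("26")),
    (1420, 1019, 1032, "벼룩", None),
    (1393, 1019, 1031, "흰개미목", PDF.format("08")),
    (1394, 1019, 1031, "강도래목", PDF.format("09")),
]
rows = [dict(zip(COLUMNS, record)) for record in RAW]


def classify_columns(rows):
    """Label each column 'constant', 'unique', or None and count missing values."""
    summary = {}
    for column in rows[0]:
        values = [row[column] for row in rows]
        missing = sum(value is None for value in values)
        distinct = len(set(values))
        if distinct == 1:
            label = "constant"  # every row holds the same value
        elif missing == 0 and distinct == len(values):
            label = "unique"    # no gaps and no repeated value
        else:
            label = None
        summary[column] = {"label": label, "missing": missing}
    return summary


summary = classify_columns(rows)
for column, info in summary.items():
    print(column, info)
```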

Reproduction

Analysis started: 2023-12-11 22:53:18.777881
Analysis finished: 2023-12-11 22:53:19.364282
Duration: 0.59 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
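
A report like this one can be regenerated with ydata-profiling's `ProfileReport`. The following is a sketch, not the configuration actually used here (that is recorded in config.json): the function name `build_report`, the report title, the file names, and the encoding are all assumptions.

```python
def build_report(csv_path, output_path="report.html", encoding="utf-8"):
    """Profile a CSV export of the dataset and write an HTML report."""
    # Imported lazily so this module still loads where the packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    profile = ProfileReport(df, title="Insect list profile")
    profile.to_file(output_path)
    return profile
```

For example, `build_report("insects.csv", encoding="cp949")` would profile a local CSV export; both the file name and the cp949 encoding are guesses that depend on how the file was downloaded.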

Variables

콘텐츠 아이디
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 49
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1410.1224
Minimum: 1386
Maximum: 1437
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 573.0 B

Quantile statistics

Minimum: 1386
5th percentile: 1388.4
Q1: 1398
Median: 1410
Q3: 1422
95th percentile: 1431.6
Maximum: 1437
Range: 51
Interquartile range (IQR): 24

Descriptive statistics

Standard deviation: 14.505218
Coefficient of variation (CV): 0.010286496
Kurtosis: -1.1181116
Mean: 1410.1224
Median Absolute Deviation (MAD): 12
Skewness: 0.053788529
Sum: 69096
Variance: 210.40136
Monotonicity: Not monotonic
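
Several of these figures are derived from one another, which makes them easy to sanity-check. A small sketch in Python using only the values printed in this section (the variable names are illustrative); results agree with the report up to the rounding of the displayed inputs.

```python
import math

minimum, maximum = 1386, 1437
q1, q3 = 1398, 1422
mean, std = 1410.1224, 14.505218
total, n = 69096, 49

value_range = maximum - minimum  # reported as Range
iqr = q3 - q1                    # reported as Interquartile range (IQR)
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
mean_from_sum = total / n        # mean recomputed from Sum and the row count

print(value_range, iqr)
print(round(cv, 6), round(variance, 3), round(mean_from_sum, 4))
```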
Histogram with fixed size bins (bins=49)

Common values

Value | Count | Frequency (%)
1411 | 1 | 2.0%
1414 | 1 | 2.0%
1398 | 1 | 2.0%
1400 | 1 | 2.0%
1402 | 1 | 2.0%
1403 | 1 | 2.0%
1404 | 1 | 2.0%
1406 | 1 | 2.0%
1407 | 1 | 2.0%
1409 | 1 | 2.0%
Other values (39) | 39 | 79.6%

Minimum 10 values

Value | Count | Frequency (%)
1386 | 1 | 2.0%
1387 | 1 | 2.0%
1388 | 1 | 2.0%
1389 | 1 | 2.0%
1390 | 1 | 2.0%
1391 | 1 | 2.0%
1392 | 1 | 2.0%
1393 | 1 | 2.0%
1394 | 1 | 2.0%
1395 | 1 | 2.0%

Maximum 10 values

Value | Count | Frequency (%)
1437 | 1 | 2.0%
1436 | 1 | 2.0%
1432 | 1 | 2.0%
1431 | 1 | 2.0%
1430 | 1 | 2.0%
1429 | 1 | 2.0%
1428 | 1 | 2.0%
1427 | 1 | 2.0%
1426 | 1 | 2.0%
1425 | 1 | 2.0%

대분류코드
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 524.0 B

Value | Count
1019 | 49

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1019
2nd row: 1019
3rd row: 1019
4th row: 1019
5th row: 1019

Common Values

Value | Count | Frequency (%)
1019 | 49 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

중분류코드
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 8.2%
Missing: 0
Missing (%): 0.0%
Memory size: 524.0 B

Value | Count
1032 | 24
1031 | 21
1030 | 2
1029 | 2

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1032
2nd row: 1032
3rd row: 1032
4th row: 1032
5th row: 1031

Common Values

Value | Count | Frequency (%)
1032 | 24 | 49.0%
1031 | 21 | 42.9%
1030 | 2 | 4.1%
1029 | 2 | 4.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

콘텐츠 제목
Text

UNIQUE 

Distinct: 49
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 524.0 B

Length

Max length: 7
Median length: 6
Mean length: 3.7142857
Min length: 1

Characters and Unicode

Total characters: 182
Distinct characters: 81
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
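
As a rough illustration of these character properties, Python's standard `unicodedata` module exposes the general category and name of each code point. The sketch below tallies them for one title from the Sample section (`박각시 나방`); it is only a minimal stand-in, not how ydata-profiling computes its tables.

```python
import unicodedata
from collections import Counter

# One 콘텐츠 제목 value from the sample rows, normalized to precomposed syllables.
title = unicodedata.normalize("NFC", "박각시 나방")

# General category of every character: 'Lo' = Other Letter, 'Zs' = Space Separator.
categories = Counter(unicodedata.category(ch) for ch in title)
print(dict(categories))

# Character names hint at the script: Hangul syllable names start with "HANGUL".
hangul = sum(unicodedata.name(ch).startswith("HANGUL") for ch in title)
print(hangul, "of", len(title), "characters are Hangul")
```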

Unique

Unique: 49
Unique (%): 100.0%

Sample

1st row: 여치
2nd row:
3rd row: 벼룩
4th row: 멋쟁이나비
5th row: 흰개미목

Value | Count | Frequency (%)
나방 | 2 | 3.9%
여치 | 1 | 2.0%
바구미 | 1 | 2.0%
다듬이벌레목 | 1 | 2.0%
노린재목 | 1 | 2.0%
풀잠자리목 | 1 | 2.0%
딱정벌레목 | 1 | 2.0%
부채벌레목 | 1 | 2.0%
밑들이목 | 1 | 2.0%
벼룩목 | 1 | 2.0%
Other values (40) | 40 | 78.4%

Most occurring characters

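Value | Count | Frequency (%)
(not shown) | 25 | 13.7%
(not shown) | 9 | 4.9%
(not shown) | 9 | 4.9%
(not shown) | 8 | 4.4%
(not shown) | 8 | 4.4%
(not shown) | 6 | 3.3%
(not shown) | 5 | 2.7%
(not shown) | 5 | 2.7%
(not shown) | 4 | 2.2%
(not shown) | 4 | 2.2%
Other values (71) | 99 | 54.4%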

Most occurring categories

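Value | Count | Frequency (%)
Other Letter | 180 | 98.9%
Space Separator | 2 | 1.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not shown) | 25 | 13.9%
(not shown) | 9 | 5.0%
(not shown) | 9 | 5.0%
(not shown) | 8 | 4.4%
(not shown) | 8 | 4.4%
(not shown) | 6 | 3.3%
(not shown) | 5 | 2.8%
(not shown) | 5 | 2.8%
(not shown) | 4 | 2.2%
(not shown) | 4 | 2.2%
Other values (70) | 97 | 53.9%

Space Separator
Value | Count | Frequency (%)
(space) | 2 | 100.0%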

Most occurring scripts

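Value | Count | Frequency (%)
Hangul | 180 | 98.9%
Common | 2 | 1.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not shown) | 25 | 13.9%
(not shown) | 9 | 5.0%
(not shown) | 9 | 5.0%
(not shown) | 8 | 4.4%
(not shown) | 8 | 4.4%
(not shown) | 6 | 3.3%
(not shown) | 5 | 2.8%
(not shown) | 5 | 2.8%
(not shown) | 4 | 2.2%
(not shown) | 4 | 2.2%
Other values (70) | 97 | 53.9%

Common
Value | Count | Frequency (%)
(space) | 2 | 100.0%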

Most occurring blocks

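Value | Count | Frequency (%)
Hangul | 180 | 98.9%
ASCII | 2 | 1.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(not shown) | 25 | 13.9%
(not shown) | 9 | 5.0%
(not shown) | 9 | 5.0%
(not shown) | 8 | 4.4%
(not shown) | 8 | 4.4%
(not shown) | 6 | 3.3%
(not shown) | 5 | 2.8%
(not shown) | 5 | 2.8%
(not shown) | 4 | 2.2%
(not shown) | 4 | 2.2%
Other values (70) | 97 | 53.9%

ASCII
Value | Count | Frequency (%)
(space) | 2 | 100.0%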

이름
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 14.3%
Missing: 0
Missing (%): 0.0%
Memory size: 524.0 B

Value | Count
<NA> | 28
매뚜기계열 | 8
밑들이계열 | 5
노린재계열 | 4
딱정벌레계열 | 2
Other values (2) | 2

Length

Max length: 6
Median length: 4
Mean length: 4.4489796
Min length: 3

Unique

Unique: 2
Unique (%): 4.1%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: 매뚜기계열

Common Values

Value | Count | Frequency (%)
<NA> | 28 | 57.1%
매뚜기계열 | 8 | 16.3%
밑들이계열 | 5 | 10.2%
노린재계열 | 4 | 8.2%
딱정벌레계열 | 2 | 4.1%
벌계열 | 1 | 2.0%
풀잠자리계열 | 1 | 2.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
na | 28 | 57.1%
매뚜기계열 | 8 | 16.3%
밑들이계열 | 5 | 10.2%
노린재계열 | 4 | 8.2%
딱정벌레계열 | 2 | 4.1%
벌계열 | 1 | 2.0%
풀잠자리계열 | 1 | 2.0%

감수자
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 524.0 B

Value | Count
변봉규 교수 | 49

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 변봉규 교수
2nd row: 변봉규 교수
3rd row: 변봉규 교수
4th row: 변봉규 교수
5th row: 변봉규 교수

Common Values

Value | Count | Frequency (%)
변봉규 교수 | 49 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
변봉규 | 49 | 50.0%
교수 | 49 | 50.0%

공개유무
Boolean

CONSTANT 

Distinct: 1
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 181.0 B

Value | Count | Frequency (%)
True | 49 | 100.0%

첨부파일
Text

MISSING 

Distinct: 37
Distinct (%): 100.0%
Missing: 12
Missing (%): 24.5%
Memory size: 524.0 B

Length

Max length: 71
Median length: 71
Mean length: 71
Min length: 71

Characters and Unicode

Total characters: 2627
Distinct characters: 40
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
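
The character totals reported for this variable can be cross-checked by hand, because every non-missing value has the same 71-character shape. A minimal sketch with Python's standard `unicodedata` module, assuming all 37 URLs differ only in their two-digit number (as the sample rows suggest):

```python
import unicodedata
from collections import Counter

# One sample value; the other non-missing values are assumed to share its shape.
url = "https://smart.science.go.kr/upload_data/subject/bugs/pdf/INSECT_E26.pdf"
non_missing = 37  # 49 observations minus 12 missing values

# Count general categories in one URL, then scale to all non-missing rows.
per_url = Counter(unicodedata.category(ch) for ch in url)
totals = {category: count * non_missing for category, count in per_url.items()}

print(len(url) * non_missing)  # total characters across the column
print(totals)                  # Ll/Lu/Po/Pc/Nd = lowercase, uppercase, other and connector punctuation, digits
```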

Unique

Unique: 37
Unique (%): 100.0%

Sample

1st row: https://smart.science.go.kr/upload_data/subject/bugs/pdf/INSECT_E26.pdf
2nd row: https://smart.science.go.kr/upload_data/subject/bugs/pdf/INSECT_E08.pdf
3rd row: https://smart.science.go.kr/upload_data/subject/bugs/pdf/INSECT_E09.pdf
4th row: https://smart.science.go.kr/upload_data/subject/bugs/pdf/INSECT_E10.pdf
5th row: https://smart.science.go.kr/upload_data/subject/bugs/pdf/INSECT_E03.pdf

Value | Count | Frequency (%)
https://smart.science.go.kr/upload_data/subject/bugs/pdf/insect_e05.pdf | 1 | 2.7%
https://smart.science.go.kr/upload_data/subject/bugs/pdf/insect_e34.pdf | 1 | 2.7%
https://smart.science.go.kr/upload_data/subject/bugs/pdf/insect_e21.pdf | 1 | 2.7%
https://smart.science.go.kr/upload_data/subject/bugs/pdf/insect_e38.pdf | 1 | 2.7%
https://smart.science.go.kr/upload_data/subject/bugs/pdf/insect_e11.pdf | 1 | 2.7%
https://smart.science.go.kr/upload_data/subject/bugs/pdf/insect_e13.pdf | 1 | 2.7%
https://smart.science.go.kr/upload_data/subject/bugs/pdf/insect_e15.pdf | 1 | 2.7%
https://smart.science.go.kr/upload_data/subject/bugs/pdf/insect_e17.pdf | 1 | 2.7%
https://smart.science.go.kr/upload_data/subject/bugs/pdf/insect_e18.pdf | 1 | 2.7%
https://smart.science.go.kr/upload_data/subject/bugs/pdf/insect_e19.pdf | 1 | 2.7%
Other values (27) | 27 | 73.0%

Most occurring characters

Value | Count | Frequency (%)
/ | 259 | 9.9%
s | 185 | 7.0%
t | 185 | 7.0%
p | 148 | 5.6%
d | 148 | 5.6%
a | 148 | 5.6%
. | 148 | 5.6%
u | 111 | 4.2%
c | 111 | 4.2%
e | 111 | 4.2%
Other values (30) | 1073 | 40.8%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1776 | 67.6%
Other Punctuation | 444 | 16.9%
Uppercase Letter | 259 | 9.9%
Connector Punctuation | 74 | 2.8%
Decimal Number | 74 | 2.8%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
s | 185 | 10.4%
t | 185 | 10.4%
p | 148 | 8.3%
d | 148 | 8.3%
a | 148 | 8.3%
u | 111 | 6.2%
c | 111 | 6.2%
e | 111 | 6.2%
g | 74 | 4.2%
o | 74 | 4.2%
Other values (10) | 481 | 27.1%

Decimal Number
Value | Count | Frequency (%)
2 | 13 | 17.6%
3 | 13 | 17.6%
1 | 12 | 16.2%
0 | 11 | 14.9%
8 | 5 | 6.8%
9 | 5 | 6.8%
4 | 5 | 6.8%
6 | 4 | 5.4%
5 | 3 | 4.1%
7 | 3 | 4.1%

Uppercase Letter
Value | Count | Frequency (%)
E | 74 | 28.6%
S | 37 | 14.3%
N | 37 | 14.3%
I | 37 | 14.3%
C | 37 | 14.3%
T | 37 | 14.3%

Other Punctuation
Value | Count | Frequency (%)
/ | 259 | 58.3%
. | 148 | 33.3%
: | 37 | 8.3%

Connector Punctuation
Value | Count | Frequency (%)
_ | 74 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 2035 | 77.5%
Common | 592 | 22.5%

Most frequent character per script

Latin
Value | Count | Frequency (%)
s | 185 | 9.1%
t | 185 | 9.1%
p | 148 | 7.3%
d | 148 | 7.3%
a | 148 | 7.3%
u | 111 | 5.5%
c | 111 | 5.5%
e | 111 | 5.5%
g | 74 | 3.6%
o | 74 | 3.6%
Other values (16) | 740 | 36.4%

Common
Value | Count | Frequency (%)
/ | 259 | 43.8%
. | 148 | 25.0%
_ | 74 | 12.5%
: | 37 | 6.2%
2 | 13 | 2.2%
3 | 13 | 2.2%
1 | 12 | 2.0%
0 | 11 | 1.9%
8 | 5 | 0.8%
9 | 5 | 0.8%
Other values (4) | 15 | 2.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2627 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
/ | 259 | 9.9%
s | 185 | 7.0%
t | 185 | 7.0%
p | 148 | 5.6%
d | 148 | 5.6%
a | 148 | 5.6%
. | 148 | 5.6%
u | 111 | 4.2%
c | 111 | 4.2%
e | 111 | 4.2%
Other values (30) | 1073 | 40.8%

Interactions

[Figure: interactions plot]

Correlations

 | 콘텐츠 아이디 | 중분류코드 | 콘텐츠 제목 | 이름 | 첨부파일
콘텐츠 아이디 | 1.000 | 0.790 | 1.000 | 0.790 | 1.000
중분류코드 | 0.790 | 1.000 | 1.000 | NaN | 1.000
콘텐츠 제목 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
이름 | 0.790 | NaN | 1.000 | 1.000 | 1.000
첨부파일 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

 | 중분류코드 | 이름
중분류코드 | 1.000 | 1.000
이름 | 1.000 | 1.000

 | 콘텐츠 아이디 | 중분류코드 | 이름
콘텐츠 아이디 | 1.000 | 0.587 | 0.647
중분류코드 | 0.587 | 1.000 | 1.000
이름 | 0.647 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

 | 콘텐츠 아이디 | 대분류코드 | 중분류코드 | 콘텐츠 제목 | 이름 | 감수자 | 공개유무 | 첨부파일
0 | 1411 | 1019 | 1032 | 여치 | <NA> | 변봉규 교수 | Y | https://smart.science.go.kr/upload_data/subject/bugs/pdf/INSECT_E26.pdf
1 | 1412 | 1019 | 1032 |  | <NA> | 변봉규 교수 | Y | <NA>
2 | 1420 | 1019 | 1032 | 벼룩 | <NA> | 변봉규 교수 | Y | <NA>
3 | 1428 | 1019 | 1032 | 멋쟁이나비 | <NA> | 변봉규 교수 | Y | <NA>
4 | 1393 | 1019 | 1031 | 흰개미목 | 매뚜기계열 | 변봉규 교수 | Y | https://smart.science.go.kr/upload_data/subject/bugs/pdf/INSECT_E08.pdf
5 | 1394 | 1019 | 1031 | 강도래목 | 매뚜기계열 | 변봉규 교수 | Y | https://smart.science.go.kr/upload_data/subject/bugs/pdf/INSECT_E09.pdf
6 | 1395 | 1019 | 1031 | 집게벌레목 | 매뚜기계열 | 변봉규 교수 | Y | https://smart.science.go.kr/upload_data/subject/bugs/pdf/INSECT_E10.pdf
7 | 1388 | 1019 | 1030 | 하루살이목 | <NA> | 변봉규 교수 | Y | https://smart.science.go.kr/upload_data/subject/bugs/pdf/INSECT_E03.pdf
8 | 1437 | 1019 | 1032 | 잠자리 | <NA> | 변봉규 교수 | Y | https://smart.science.go.kr/upload_data/subject/bugs/pdf/INSECT_E49.pdf
9 | 1386 | 1019 | 1029 | 좀목 | <NA> | 변봉규 교수 | Y | <NA>

Last rows

 | 콘텐츠 아이디 | 대분류코드 | 중분류코드 | 콘텐츠 제목 | 이름 | 감수자 | 공개유무 | 첨부파일
39 | 1416 | 1019 | 1032 | 반날개 | <NA> | 변봉규 교수 | Y | https://smart.science.go.kr/upload_data/subject/bugs/pdf/INSECT_E31.pdf
40 | 1417 | 1019 | 1032 | 반딧불이 | <NA> | 변봉규 교수 | Y | https://smart.science.go.kr/upload_data/subject/bugs/pdf/INSECT_E32.pdf
41 | 1424 | 1019 | 1032 | 기생파리 | <NA> | 변봉규 교수 | Y | https://smart.science.go.kr/upload_data/subject/bugs/pdf/INSECT_E39.pdf
42 | 1426 | 1019 | 1032 | 흰나비 | <NA> | 변봉규 교수 | Y | <NA>
43 | 1427 | 1019 | 1032 | 모시나비 | <NA> | 변봉규 교수 | Y | <NA>
44 | 1425 | 1019 | 1032 | 파리매 | <NA> | 변봉규 교수 | Y | <NA>
45 | 1430 | 1019 | 1032 | 박각시 나방 | <NA> | 변봉규 교수 | Y | <NA>
46 | 1429 | 1019 | 1032 | 팔랑나비 | <NA> | 변봉규 교수 | Y | <NA>
47 | 1431 | 1019 | 1032 | 주머니 나방 | <NA> | 변봉규 교수 | Y | <NA>
48 | 1432 | 1019 | 1032 | 독나방 | <NA> | 변봉규 교수 | Y | <NA>