Overview

Dataset statistics

Number of variables: 5
Number of observations: 34
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 17
Duplicate rows (%): 50.0%
Total size in memory: 1.5 KiB
Average record size in memory: 44.9 B

Variable types

Numeric: 1
Text: 1
Categorical: 3

Dataset

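Description: File data for the SW-centered society portal (SW중심사회포털), covering file names, post titles, file downloads, and related content
Author: 한국과학창의재단 (Korea Foundation for the Advancement of Science and Creativity)
URL: https://www.data.go.kr/data/15073513/fileData.do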

Alerts

파일다운로드 페이지 has constant value "/um/um01/um0102/um0102.do" (Constant)
Dataset has 17 (50.0%) duplicate rows (Duplicates)
게시물 아이디 is highly overall correlated with 파일 아이디 and 1 other field (High correlation)
파일 명 is highly overall correlated with 파일 아이디 and 1 other field (High correlation)
파일 아이디 is highly overall correlated with 파일 명 and 1 other field (High correlation)

Reproduction

Analysis started: 2023-12-12 08:02:35.263985
Analysis finished: 2023-12-12 08:02:35.898466
Duration: 0.63 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
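
The reported duration can be checked directly from the two timestamps. The sketch below uses only the Python standard library; the variable names are illustrative and not part of the report.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block of this report.
started = datetime.fromisoformat("2023-12-12 08:02:35.263985")
finished = datetime.fromisoformat("2023-12-12 08:02:35.898466")

# Elapsed wall-clock time between start and finish, in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```

Rounded to two decimals, the difference agrees with the 0.63 seconds stated above.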

Variables

파일 아이디
Real number (ℝ)

HIGH CORRELATION 

Distinct: 11
Distinct (%): 32.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.4117647
Minimum: 1
Maximum: 21
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 438.0 B

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 3
Q3: 7
95th percentile: 20.35
Maximum: 21
Range: 20
Interquartile range (IQR): 6
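
The quantiles above can be reproduced from the value counts listed for 파일 아이디 in this report. The sketch below assumes quantiles are computed by linear interpolation between order statistics (the default convention in NumPy and pandas); the report itself does not state the method, and the `quantile` helper is a hypothetical name written for this example.

```python
# Rebuild the 34 observations of 파일 아이디 from the value counts in this report:
# value 1 occurs 14 times; values 2-9, 20 and 21 occur twice each.
counts = {1: 14, **{v: 2 for v in range(2, 10)}, 20: 2, 21: 2}
values = sorted(v for v, c in counts.items() for _ in range(c))

def quantile(sorted_values, q):
    """Quantile by linear interpolation between neighbouring order statistics
    (assumed convention; not confirmed by the report)."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])

q1 = quantile(values, 0.25)
median = quantile(values, 0.50)
q3 = quantile(values, 0.75)
p95 = quantile(values, 0.95)
iqr = q3 - q1
print(q1, median, q3, round(p95, 2), iqr)
```

Under that assumption the 95th percentile falls between the 32nd and 33rd sorted values (20 and 21), which is where the non-integer 20.35 comes from.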

Descriptive statistics

Standard deviation: 6.1993042
Coefficient of variation (CV): 1.1455236
Kurtosis: 1.992302
Mean: 5.4117647
Median Absolute Deviation (MAD): 2
Skewness: 1.7034462
Sum: 184
Variance: 38.431373
Monotonicity: Not monotonic
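
Several of these figures follow directly from the value counts. The sketch below recomputes them with the standard library, assuming the sample (n - 1) definition of variance and standard deviation; skewness and kurtosis are left out because their exact estimator is not stated in the report.

```python
import statistics

# Rebuild the 34 observations of 파일 아이디 from the value counts in this report.
counts = {1: 14, **{v: 2 for v in range(2, 10)}, 20: 2, 21: 2}
values = [v for v, c in counts.items() for _ in range(c)]

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, n - 1 denominator (assumed)
std = statistics.stdev(values)           # square root of the sample variance
cv = std / mean                          # coefficient of variation
med = statistics.median(values)
mad = statistics.median([abs(v - med) for v in values])  # median absolute deviation

print(total, round(mean, 7), round(variance, 6), round(std, 7), round(cv, 7), mad)
```

The coefficient of variation above 1 reflects that the standard deviation exceeds the mean, which is consistent with 14 of the 34 values sitting at the minimum of 1.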
Histogram with fixed size bins (bins=11)

Common Values

Value | Count | Frequency (%)
1 | 14 | 41.2%
9 | 2 | 5.9%
8 | 2 | 5.9%
7 | 2 | 5.9%
6 | 2 | 5.9%
5 | 2 | 5.9%
4 | 2 | 5.9%
3 | 2 | 5.9%
21 | 2 | 5.9%
20 | 2 | 5.9%

Minimum 10 values

Value | Count | Frequency (%)
1 | 14 | 41.2%
2 | 2 | 5.9%
3 | 2 | 5.9%
4 | 2 | 5.9%
5 | 2 | 5.9%
6 | 2 | 5.9%
7 | 2 | 5.9%
8 | 2 | 5.9%
9 | 2 | 5.9%
20 | 2 | 5.9%

Maximum 10 values

Value | Count | Frequency (%)
21 | 2 | 5.9%
20 | 2 | 5.9%
9 | 2 | 5.9%
8 | 2 | 5.9%
7 | 2 | 5.9%
6 | 2 | 5.9%
5 | 2 | 5.9%
4 | 2 | 5.9%
3 | 2 | 5.9%
2 | 2 | 5.9%

게시물 제목
Text

Distinct: 17
Distinct (%): 50.0%
Missing: 0
Missing (%): 0.0%
Memory size: 404.0 B

Length

Max length: 21
Median length: 17
Mean length: 16.882353
Min length: 13

Characters and Unicode

Total characters: 574
Distinct characters: 70
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
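
As an illustration of those character properties, the sketch below looks up the general category of each character in one 게시물 제목 value taken from the Sample rows of this report. It uses Python's standard `unicodedata` module, which exposes general categories but not scripts or blocks, so only the category breakdown is shown.

```python
import unicodedata
from collections import Counter

# One 게시물 제목 value copied from the Sample rows of this report.
title = "[초등교재]생각 쑥쑥 소프트웨어"

# Two-letter general category codes: Lo = Other Letter, Zs = Space Separator,
# Ps = Open Punctuation, Pe = Close Punctuation, Nd = Decimal Number.
category_counts = Counter(unicodedata.category(ch) for ch in title)

for code, count in category_counts.most_common():
    print(code, count)
```

Hangul syllables fall under Other Letter, which is why that category dominates the counts for this variable.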

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: [초등학부모교사용]소프트웨어 마법이야기
2nd row: [초등학부모교사용]소프트웨어 마법이야기
3rd row: [초등교재]생각 쑥쑥 소프트웨어
4th row: [초등교재]생각 쑥쑥 소프트웨어
5th row: [초등교재]뚝딱뚝딱 코딩 공작소

Value | Count | Frequency (%)
교사용]컴퓨팅 | 6 | 6.5%
위한 | 4 | 4.3%
기초다지기 | 4 | 4.3%
알고리즘 | 4 | 4.3%
창의적 | 4 | 4.3%
사고력 | 4 | 4.3%
키우기 | 4 | 4.3%
(not recoverable) | 4 | 4.3%
직업세계 | 4 | 4.3%
공작소 | 4 | 4.3%
Other values (21) | 50 | 54.3%

Most occurring characters

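Value | Count | Frequency (%)
(space) | 58 | 10.1%
[ | 34 | 5.9%
] | 34 | 5.9%
(not recoverable) | 30 | 5.2%
(not recoverable) | 22 | 3.8%
(not recoverable) | 18 | 3.1%
(not recoverable) | 18 | 3.1%
(not recoverable) | 14 | 2.4%
(not recoverable) | 14 | 2.4%
(not recoverable) | 14 | 2.4%
Other values (60) | 318 | 55.4%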

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 444 | 77.4%
Space Separator | 58 | 10.1%
Open Punctuation | 34 | 5.9%
Close Punctuation | 34 | 5.9%
Decimal Number | 4 | 0.7%

Most frequent character per category

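Other Letter
Value | Count | Frequency (%)
(not recoverable) | 30 | 6.8%
(not recoverable) | 22 | 5.0%
(not recoverable) | 18 | 4.1%
(not recoverable) | 18 | 4.1%
(not recoverable) | 14 | 3.2%
(not recoverable) | 14 | 3.2%
(not recoverable) | 14 | 3.2%
(not recoverable) | 12 | 2.7%
(not recoverable) | 12 | 2.7%
(not recoverable) | 12 | 2.7%
Other values (55) | 278 | 62.6%

Decimal Number
Value | Count | Frequency (%)
1 | 2 | 50.0%
2 | 2 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 58 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
[ | 34 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
] | 34 | 100.0%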

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 444 | 77.4%
Common | 130 | 22.6%

Most frequent character per script

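Hangul
Value | Count | Frequency (%)
(not recoverable) | 30 | 6.8%
(not recoverable) | 22 | 5.0%
(not recoverable) | 18 | 4.1%
(not recoverable) | 18 | 4.1%
(not recoverable) | 14 | 3.2%
(not recoverable) | 14 | 3.2%
(not recoverable) | 14 | 3.2%
(not recoverable) | 12 | 2.7%
(not recoverable) | 12 | 2.7%
(not recoverable) | 12 | 2.7%
Other values (55) | 278 | 62.6%

Common
Value | Count | Frequency (%)
(space) | 58 | 44.6%
[ | 34 | 26.2%
] | 34 | 26.2%
1 | 2 | 1.5%
2 | 2 | 1.5%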

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 444 | 77.4%
ASCII | 130 | 22.6%

Most frequent character per block

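ASCII
Value | Count | Frequency (%)
(space) | 58 | 44.6%
[ | 34 | 26.2%
] | 34 | 26.2%
1 | 2 | 1.5%
2 | 2 | 1.5%

Hangul
Value | Count | Frequency (%)
(not recoverable) | 30 | 6.8%
(not recoverable) | 22 | 5.0%
(not recoverable) | 18 | 4.1%
(not recoverable) | 18 | 4.1%
(not recoverable) | 14 | 3.2%
(not recoverable) | 14 | 3.2%
(not recoverable) | 14 | 3.2%
(not recoverable) | 12 | 2.7%
(not recoverable) | 12 | 2.7%
(not recoverable) | 12 | 2.7%
Other values (55) | 278 | 62.6%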

파일 명
Categorical

HIGH CORRELATION 

Distinct: 16
Distinct (%): 47.1%
Missing: 0
Missing (%): 0.0%
Memory size: 404.0 B

Value | Count
middle_teacherbook_2.zip | 4
middle_teacherbook_3.zip | 2
primary_textbook_1.zip | 2
primary_textbook_2.zip | 2
middle_textbook_1.zip | 2
Other values (11) | 22

Length

Max length: 25
Median length: 24
Mean length: 21.705882
Min length: 16

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: middle_teacherbook_3.zip
2nd row: middle_teacherbook_3.zip
3rd row: primary_textbook_1.zip
4th row: primary_textbook_1.zip
5th row: primary_textbook_2.zip

Common Values

Value | Count | Frequency (%)
middle_teacherbook_2.zip | 4 | 11.8%
middle_teacherbook_3.zip | 2 | 5.9%
primary_textbook_1.zip | 2 | 5.9%
primary_textbook_2.zip | 2 | 5.9%
middle_textbook_1.zip | 2 | 5.9%
middle_textbook_2.zip | 2 | 5.9%
primary_teacherbook_1.zip | 2 | 5.9%
primary_teacherbook_2.zip | 2 | 5.9%
algorithm_ad.zip | 2 | 5.9%
algorithm_md.zip | 2 | 5.9%
Other values (6) | 12 | 35.3%

Length

Histogram of lengths of the category

게시물 아이디
Categorical

HIGH CORRELATION 

Distinct: 16
Distinct (%): 47.1%
Missing: 0
Missing (%): 0.0%
Memory size: 404.0 B

Value | Count
TB1 | 4
TB9 | 2
TB8 | 2
TB7 | 2
TB6 | 2
Other values (11) | 22

Length

Max length: 4
Median length: 3
Mean length: 3.3529412
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: TB9
2nd row: TB9
3rd row: TB8
4th row: TB8
5th row: TB7

Common Values

Value | Count | Frequency (%)
TB1 | 4 | 11.8%
TB9 | 2 | 5.9%
TB8 | 2 | 5.9%
TB7 | 2 | 5.9%
TB6 | 2 | 5.9%
TB5 | 2 | 5.9%
TB4 | 2 | 5.9%
TB3 | 2 | 5.9%
EB10 | 2 | 5.9%
EB9 | 2 | 5.9%
Other values (6) | 12 | 35.3%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
tb1 | 4 | 11.8%
tb9 | 2 | 5.9%
tb8 | 2 | 5.9%
tb7 | 2 | 5.9%
tb6 | 2 | 5.9%
tb5 | 2 | 5.9%
tb4 | 2 | 5.9%
tb3 | 2 | 5.9%
eb10 | 2 | 5.9%
eb9 | 2 | 5.9%
Other values (6) | 12 | 35.3%

파일다운로드 페이지
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Memory size: 404.0 B

Value | Count
/um/um01/um0102/um0102.do | 34

Length

Max length: 25
Median length: 25
Mean length: 25
Min length: 25

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: /um/um01/um0102/um0102.do
2nd row: /um/um01/um0102/um0102.do
3rd row: /um/um01/um0102/um0102.do
4th row: /um/um01/um0102/um0102.do
5th row: /um/um01/um0102/um0102.do

Common Values

Value | Count | Frequency (%)
/um/um01/um0102/um0102.do | 34 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
um/um01/um0102/um0102.do | 34 | 100.0%

Interactions


Correlations

 | 파일 아이디 | 게시물 제목 | 파일 명 | 게시물 아이디
파일 아이디 | 1.000 | 1.000 | 1.000 | 1.000
게시물 제목 | 1.000 | 1.000 | 1.000 | 1.000
파일 명 | 1.000 | 1.000 | 1.000 | 1.000
게시물 아이디 | 1.000 | 1.000 | 1.000 | 1.000

 | 게시물 아이디 | 파일 명
게시물 아이디 | 1.000 | 1.000
파일 명 | 1.000 | 1.000

 | 파일 아이디 | 파일 명 | 게시물 아이디
파일 아이디 | 1.000 | 0.788 | 0.788
파일 명 | 0.788 | 1.000 | 1.000
게시물 아이디 | 0.788 | 1.000 | 1.000
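
The report text does not name the association measure behind each matrix. As one way to see why two categorical columns can score exactly 1.000, the sketch below computes a plain (uncorrected) Cramér's V, which is an assumption made for illustration and not necessarily the statistic used here. The `cramers_v` helper is hypothetical; the data are four (게시물 아이디, 파일 명) pairs from the Sample table of this report.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Plain (uncorrected) Cramér's V for two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed cell counts
    row = Counter(xs)              # marginal counts of the first variable
    col = Counter(ys)              # marginal counts of the second variable
    chi2 = 0.0
    for x in row:
        for y in col:
            expected = row[x] * col[y] / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else float("nan")

# Four (게시물 아이디, 파일 명) pairs from the Sample table, each appearing twice.
post_ids = ["TB9", "TB9", "TB8", "TB8", "TB7", "TB7", "TB6", "TB6"]
file_names = [
    "middle_teacherbook_3.zip", "middle_teacherbook_3.zip",
    "primary_textbook_1.zip", "primary_textbook_1.zip",
    "primary_textbook_2.zip", "primary_textbook_2.zip",
    "middle_textbook_1.zip", "middle_textbook_1.zip",
]
v = cramers_v(post_ids, file_names)
print(v)
```

When every value of one column maps to exactly one value of the other, as in these rows, the measure reaches its maximum of 1, which matches the pattern in the matrices above.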

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Row | 파일 아이디 | 게시물 제목 | 파일 명 | 게시물 아이디 | 파일다운로드 페이지
0 | 9 | [초등학부모교사용]소프트웨어 마법이야기 | middle_teacherbook_3.zip | TB9 | /um/um01/um0102/um0102.do
1 | 9 | [초등학부모교사용]소프트웨어 마법이야기 | middle_teacherbook_3.zip | TB9 | /um/um01/um0102/um0102.do
2 | 8 | [초등교재]생각 쑥쑥 소프트웨어 | primary_textbook_1.zip | TB8 | /um/um01/um0102/um0102.do
3 | 8 | [초등교재]생각 쑥쑥 소프트웨어 | primary_textbook_1.zip | TB8 | /um/um01/um0102/um0102.do
4 | 7 | [초등교재]뚝딱뚝딱 코딩 공작소 | primary_textbook_2.zip | TB7 | /um/um01/um0102/um0102.do
5 | 7 | [초등교재]뚝딱뚝딱 코딩 공작소 | primary_textbook_2.zip | TB7 | /um/um01/um0102/um0102.do
6 | 6 | [중등교재]컴퓨팅과 직업세계 | middle_textbook_1.zip | TB6 | /um/um01/um0102/um0102.do
7 | 6 | [중등교재]컴퓨팅과 직업세계 | middle_textbook_1.zip | TB6 | /um/um01/um0102/um0102.do
8 | 5 | [중등교재]프로그래밍과 나 | middle_textbook_2.zip | TB5 | /um/um01/um0102/um0102.do
9 | 5 | [중등교재]프로그래밍과 나 | middle_textbook_2.zip | TB5 | /um/um01/um0102/um0102.do

Last rows

Row | 파일 아이디 | 게시물 제목 | 파일 명 | 게시물 아이디 | 파일다운로드 페이지
24 | 1 | [교사용]컴퓨팅 기초다지기 1단계 | primary_teacher_5.zip | TB15 | /um/um01/um0102/um0102.do
25 | 1 | [교사용]컴퓨팅 기초다지기 2단계 | primary_teacher_6.zip | TB16 | /um/um01/um0102/um0102.do
26 | 1 | [교사용]프로그래밍과 나 | middle_teacherbook_2.zip | TB1 | /um/um01/um0102/um0102.do
27 | 1 | [고등교재]소프트웨어와 친해지기 | primary_teacher_7.zip | TB12 | /um/um01/um0102/um0102.do
28 | 1 | [교사용]소프트웨어와 친해지기 | primary_teacher_8.zip | TB13 | /um/um01/um0102/um0102.do
29 | 1 | [중등교재]컴퓨팅 사고력 키우기 | primary_teacher_3.zip | TB14 | /um/um01/um0102/um0102.do
30 | 1 | [교사용]컴퓨팅 사고력 키우기 | middle_teacherbook_2.zip | TB1 | /um/um01/um0102/um0102.do
31 | 1 | [교사용]컴퓨팅 기초다지기 1단계 | primary_teacher_5.zip | TB15 | /um/um01/um0102/um0102.do
32 | 1 | [교사용]컴퓨팅 기초다지기 2단계 | primary_teacher_6.zip | TB16 | /um/um01/um0102/um0102.do
33 | 1 | [교사용]프로그래밍과 나 | middle_teacherbook_2.zip | TB1 | /um/um01/um0102/um0102.do

Duplicate rows

Most frequently occurring

Row | 파일 아이디 | 게시물 제목 | 파일 명 | 게시물 아이디 | 파일다운로드 페이지 | # duplicates
0 | 1 | [고등교재]소프트웨어와 친해지기 | primary_teacher_7.zip | TB12 | /um/um01/um0102/um0102.do | 2
1 | 1 | [교사용]소프트웨어와 친해지기 | primary_teacher_8.zip | TB13 | /um/um01/um0102/um0102.do | 2
2 | 1 | [교사용]컴퓨팅 기초다지기 1단계 | primary_teacher_5.zip | TB15 | /um/um01/um0102/um0102.do | 2
3 | 1 | [교사용]컴퓨팅 기초다지기 2단계 | primary_teacher_6.zip | TB16 | /um/um01/um0102/um0102.do | 2
4 | 1 | [교사용]컴퓨팅 사고력 키우기 | middle_teacherbook_2.zip | TB1 | /um/um01/um0102/um0102.do | 2
5 | 1 | [교사용]프로그래밍과 나 | middle_teacherbook_2.zip | TB1 | /um/um01/um0102/um0102.do | 2
6 | 1 | [중등교재]컴퓨팅 사고력 키우기 | primary_teacher_3.zip | TB14 | /um/um01/um0102/um0102.do | 2
7 | 2 | [교사용]컴퓨팅과 직업세계 | middle_teacherbook_1.zip | TB2 | /um/um01/um0102/um0102.do | 2
8 | 3 | [교사용]뚝딱뚝딱 코딩 공작소 | primary_teacherbook_2.zip | TB3 | /um/um01/um0102/um0102.do | 2
9 | 4 | [교사용]생각 쑥쑥 소프트웨어 | primary_teacherbook_1.zip | TB4 | /um/um01/um0102/um0102.do | 2
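
A duplicate count of this kind can be sketched with the standard library by counting identical rows. The example below uses four rows copied from the Sample table as a small illustrative subset; reading the report's figure of 17 (50.0%) as "distinct rows that occur more than once, divided by total observations" is an inference from 17 / 34 and is not stated by the report.

```python
from collections import Counter

PAGE = "/um/um01/um0102/um0102.do"

# Four rows from the Sample table (illustrative subset, not the full dataset).
rows = [
    (9, "[초등학부모교사용]소프트웨어 마법이야기", "middle_teacherbook_3.zip", "TB9", PAGE),
    (9, "[초등학부모교사용]소프트웨어 마법이야기", "middle_teacherbook_3.zip", "TB9", PAGE),
    (8, "[초등교재]생각 쑥쑥 소프트웨어", "primary_textbook_1.zip", "TB8", PAGE),
    (8, "[초등교재]생각 쑥쑥 소프트웨어", "primary_textbook_1.zip", "TB8", PAGE),
]

# Count how often each complete row occurs, then keep rows seen more than once.
row_counts = Counter(rows)
duplicated = {row: count for row, count in row_counts.items() if count > 1}

# Share of duplicated distinct rows relative to all observations (assumed definition).
duplicate_share = len(duplicated) / len(rows)
print(len(duplicated), f"{duplicate_share:.1%}")
```

In this subset every row appears exactly twice, the same pattern that yields 17 duplicated rows out of 34 observations in the full dataset.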