Overview

Dataset statistics

Number of variables: 4
Number of observations: 69
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.4 KiB
Average record size in memory: 35.9 B

Variable types

Numeric: 2
Text: 1
Categorical: 1

Dataset

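Description: Provides information on message attachments for the Smart Vocational Training Platform (STEP) of the Online Lifelong Education Institute at Korea University of Technology and Education.
Author: Korea University of Technology and Education (한국기술교육대학교)
URL: https://www.data.go.kr/data/15090991/fileData.do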

Reproduction

Analysis started: 2024-04-16 17:23:44.696322
Analysis finished: 2024-04-16 17:23:45.327187
Duration: 0.63 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

메시지 ID (Message ID)
Real number (ℝ)

Distinct: 47
Distinct (%): 68.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 402187.88
Minimum: 382851
Maximum: 428735
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 753.0 B

Quantile statistics

Minimum: 382851
5-th percentile: 383323
Q1: 383323
Median: 402376
Q3: 420743
95-th percentile: 424863.4
Maximum: 428735
Range: 45884
Interquartile range (IQR): 37420
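
The last two rows are derived from the others: the range is the maximum minus the minimum, and the IQR is Q3 minus Q1. A minimal check in Python, using the values reported above (the variable names are illustrative, not part of the report):

```python
# Quantile values copied from the report for the message ID variable.
minimum, q1, q3, maximum = 382851, 383323, 420743, 428735

value_range = maximum - minimum  # spread of the whole variable
iqr = q3 - q1                    # spread of the middle 50% of observations

print(value_range, iqr)  # 45884 37420
```

Q1 coincides with the 5-th percentile here because the single value 383323 accounts for 21 of the 69 observations.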

Descriptive statistics

Standard deviation: 17285.123
Coefficient of variation (CV): 0.042977731
Kurtosis: -1.7344149
Mean: 402187.88
Median Absolute Deviation (MAD): 19053
Skewness: 0.095456371
Sum: 27750964
Variance: 2.9877547 × 10^8
Monotonicity: Increasing
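
Several of these figures can be cross-checked against each other: the mean is the sum divided by the number of observations, the CV is the standard deviation divided by the mean, and the variance is the squared standard deviation. A short sketch, assuming only the rounded values printed in this report (so small rounding differences are expected):

```python
import math

n = 69            # number of observations (from the dataset statistics)
total = 27750964  # Sum
std = 17285.123   # Standard deviation

mean = total / n      # arithmetic mean
cv = std / mean       # coefficient of variation
variance = std ** 2   # variance as squared standard deviation

print(round(mean, 2))     # 402187.88
print(f"{cv:.6f}")        # 0.042978
print(f"{variance:.4e}")  # 2.9878e+08
```
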
Histogram with fixed size bins (bins=47)

Common values
Value | Count | Frequency (%)
383323 | 21 | 30.4%
404773 | 3 | 4.3%
382851 | 1 | 1.4%
421925 | 1 | 1.4%
420609 | 1 | 1.4%
420737 | 1 | 1.4%
420743 | 1 | 1.4%
420933 | 1 | 1.4%
420937 | 1 | 1.4%
421115 | 1 | 1.4%
Other values (37) | 37 | 53.6%
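
Each Frequency (%) entry is the count divided by the 69 observations in the dataset. A small sketch of that arithmetic; `share` is a hypothetical helper introduced only for this illustration:

```python
N_OBSERVATIONS = 69  # number of observations reported in the overview


def share(count: int, n: int = N_OBSERVATIONS) -> str:
    """Return count / n formatted as a percentage with one decimal."""
    return f"{count / n:.1%}"


# Counts copied from the table above.
for value, count in [(383323, 21), (404773, 3), ("Other values (37)", 37)]:
    print(value, count, share(count))
```
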

Minimum 10 values
Value | Count | Frequency (%)
382851 | 1 | 1.4%
382853 | 1 | 1.4%
383323 | 21 | 30.4%
385699 | 1 | 1.4%
385705 | 1 | 1.4%
386191 | 1 | 1.4%
386194 | 1 | 1.4%
386206 | 1 | 1.4%
386209 | 1 | 1.4%
399868 | 1 | 1.4%

Maximum 10 values
Value | Count | Frequency (%)
428735 | 1 | 1.4%
427415 | 1 | 1.4%
425289 | 1 | 1.4%
424981 | 1 | 1.4%
424687 | 1 | 1.4%
424409 | 1 | 1.4%
424053 | 1 | 1.4%
422935 | 1 | 1.4%
421989 | 1 | 1.4%
421931 | 1 | 1.4%

원본 파일명 (Original file name)
Text

Distinct: 58
Distinct (%): 84.1%
Missing: 0
Missing (%): 0.0%
Memory size: 684.0 B

Length

Max length: 54
Median length: 30
Mean length: 21.492754
Min length: 6

Characters and Unicode

Total characters: 1483
Distinct characters: 223
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
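
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories for the first sample file name in this report; it is not how the profiling tool itself is implemented, only a way to see where counts such as "Other Letter" or "Lowercase Letter" come from:

```python
import unicodedata
from collections import Counter

# First sample row of the file-name column, copied from this report.
name = "소스 수정(loadV template).png"

# unicodedata.category returns two-letter codes, for example:
# Lo = Other Letter (Hangul syllables), Ll = Lowercase Letter,
# Lu = Uppercase Letter, Zs = Space Separator,
# Ps / Pe = Open / Close Punctuation, Po = Other Punctuation.
categories = Counter(unicodedata.category(ch) for ch in name)

for code, count in categories.most_common():
    print(code, count)
```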

Unique

Unique: 48
Unique (%): 69.6%

Sample

1st row: 소스 수정(loadV template).png
2nd row: 소스 수정(loadV template).png
3rd row: 1회차_지식 표현과 논리 - 명제 논리.pdf
4th row: 2회차_지식 표현과 논리 - 기호 논리(1).pdf
5th row: 3회차_지식 표현과 논리 - 기호 논리(2).pdf

Value | Count | Frequency (%)
[not recoverable] | 18 | 7.1%
논리 | 9 | 3.5%
표현과 | 7 | 2.8%
학습 | 5 | 2.0%
수립 | 5 | 2.0%
빅데이터 | 5 | 2.0%
계획 | 3 | 1.2%
온톨로지 | 3 | 1.2%
네트워크와 | 3 | 1.2%
환경에서 | 3 | 1.2%
Other values (147) | 193 | 76.0%

Most occurring characters

Value | Count | Frequency (%)
(space) | 185 | 12.5%
. | 77 | 5.2%
p | 70 | 4.7%
1 | 47 | 3.2%
d | 37 | 2.5%
f | 34 | 2.3%
- | 34 | 2.3%
_ | 34 | 2.3%
[not recoverable] | 25 | 1.7%
[not recoverable] | 24 | 1.6%
Other values (213) | 916 | 61.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 730 | 49.2%
Lowercase Letter | 264 | 17.8%
Space Separator | 185 | 12.5%
Decimal Number | 124 | 8.4%
Other Punctuation | 78 | 5.3%
Dash Punctuation | 34 | 2.3%
Connector Punctuation | 34 | 2.3%
Open Punctuation | 12 | 0.8%
Close Punctuation | 12 | 0.8%
Uppercase Letter | 10 | 0.7%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[not recoverable] | 25 | 3.4%
[not recoverable] | 24 | 3.3%
[not recoverable] | 22 | 3.0%
[not recoverable] | 21 | 2.9%
[not recoverable] | 19 | 2.6%
[not recoverable] | 15 | 2.1%
[not recoverable] | 15 | 2.1%
[not recoverable] | 15 | 2.1%
[not recoverable] | 12 | 1.6%
[not recoverable] | 12 | 1.6%
Other values (169) | 550 | 75.3%

Lowercase Letter
Value | Count | Frequency (%)
p | 70 | 26.5%
d | 37 | 14.0%
f | 34 | 12.9%
h | 20 | 7.6%
w | 20 | 7.6%
g | 15 | 5.7%
a | 11 | 4.2%
t | 11 | 4.2%
j | 10 | 3.8%
n | 7 | 2.7%
Other values (10) | 29 | 11.0%

Decimal Number
Value | Count | Frequency (%)
1 | 47 | 37.9%
2 | 23 | 18.5%
0 | 17 | 13.7%
8 | 8 | 6.5%
7 | 7 | 5.6%
9 | 5 | 4.0%
5 | 5 | 4.0%
4 | 5 | 4.0%
6 | 4 | 3.2%
3 | 3 | 2.4%

Uppercase Letter
Value | Count | Frequency (%)
V | 3 | 30.0%
P | 3 | 30.0%
C | 2 | 20.0%
D | 1 | 10.0%
L | 1 | 10.0%

Other Punctuation
Value | Count | Frequency (%)
. | 77 | 98.7%
, | 1 | 1.3%

Open Punctuation
Value | Count | Frequency (%)
( | 10 | 83.3%
[ | 2 | 16.7%

Close Punctuation
Value | Count | Frequency (%)
) | 10 | 83.3%
] | 2 | 16.7%

Space Separator
Value | Count | Frequency (%)
(space) | 185 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 34 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 34 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 730 | 49.2%
Common | 479 | 32.3%
Latin | 274 | 18.5%

Most frequent character per script

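Hangul
Value | Count | Frequency (%)
[not recoverable] | 25 | 3.4%
[not recoverable] | 24 | 3.3%
[not recoverable] | 22 | 3.0%
[not recoverable] | 21 | 2.9%
[not recoverable] | 19 | 2.6%
[not recoverable] | 15 | 2.1%
[not recoverable] | 15 | 2.1%
[not recoverable] | 15 | 2.1%
[not recoverable] | 12 | 1.6%
[not recoverable] | 12 | 1.6%
Other values (169) | 550 | 75.3%

Latin
Value | Count | Frequency (%)
p | 70 | 25.5%
d | 37 | 13.5%
f | 34 | 12.4%
h | 20 | 7.3%
w | 20 | 7.3%
g | 15 | 5.5%
a | 11 | 4.0%
t | 11 | 4.0%
j | 10 | 3.6%
n | 7 | 2.6%
Other values (15) | 39 | 14.2%

Common
Value | Count | Frequency (%)
(space) | 185 | 38.6%
. | 77 | 16.1%
1 | 47 | 9.8%
- | 34 | 7.1%
_ | 34 | 7.1%
2 | 23 | 4.8%
0 | 17 | 3.5%
( | 10 | 2.1%
) | 10 | 2.1%
8 | 8 | 1.7%
Other values (9) | 34 | 7.1%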

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 753 | 50.8%
Hangul | 730 | 49.2%

Most frequent character per block

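ASCII
Value | Count | Frequency (%)
(space) | 185 | 24.6%
. | 77 | 10.2%
p | 70 | 9.3%
1 | 47 | 6.2%
d | 37 | 4.9%
f | 34 | 4.5%
- | 34 | 4.5%
_ | 34 | 4.5%
2 | 23 | 3.1%
h | 20 | 2.7%
Other values (34) | 192 | 25.5%

Hangul
Value | Count | Frequency (%)
[not recoverable] | 25 | 3.4%
[not recoverable] | 24 | 3.3%
[not recoverable] | 22 | 3.0%
[not recoverable] | 21 | 2.9%
[not recoverable] | 19 | 2.6%
[not recoverable] | 15 | 2.1%
[not recoverable] | 15 | 2.1%
[not recoverable] | 15 | 2.1%
[not recoverable] | 12 | 1.6%
[not recoverable] | 12 | 1.6%
Other values (169) | 550 | 75.3%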

콘텐츠 타입 (Content type)
Categorical

Distinct: 8
Distinct (%): 11.6%
Missing: 0
Missing (%): 0.0%
Memory size: 684.0 B

Value | Count
application/pdf | 34
application/octet-stream | 16
image/jpeg | 10
image/png | 5
application/zip | 1
Other values (3) | 3

Length

Max length: 71
Median length: 15
Mean length: 17.666667
Min length: 9

Unique

Unique: 4
Unique (%): 5.8%

Sample

1st row: image/png
2nd row: image/png
3rd row: application/pdf
4th row: application/pdf
5th row: application/pdf

Common Values

Value | Count | Frequency (%)
application/pdf | 34 | 49.3%
application/octet-stream | 16 | 23.2%
image/jpeg | 10 | 14.5%
image/png | 5 | 7.2%
application/zip | 1 | 1.4%
application/vnd.openxmlformats-officedocument.wordprocessingml.document | 1 | 1.4%
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet | 1 | 1.4%
application/vnd.ms-powerpoint | 1 | 1.4%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of the value counts listed under Common Values]

사이즈 (Size)
Real number (ℝ)

Distinct: 58
Distinct (%): 84.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1642954.5
Minimum: 8062
Maximum: 35963647
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 753.0 B

Quantile statistics

Minimum: 8062
5-th percentile: 13312
Q1: 122517
Median: 407253
Q3: 1090437
95-th percentile: 8523722.2
Maximum: 35963647
Range: 35955585
Interquartile range (IQR): 967920
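
The report does not state a unit for this variable. Assuming the values are file sizes in bytes (an assumption, not something the source confirms), a small formatter makes the quantiles easier to read; `human_size` is a hypothetical helper written for this illustration:

```python
def human_size(num_bytes: float) -> str:
    """Format a byte count using binary (1024-based) units."""
    for unit in ("B", "KiB", "MiB", "GiB"):
        # Stop at the first unit where the value drops below 1024,
        # or at GiB, the largest unit handled here.
        if num_bytes < 1024 or unit == "GiB":
            return f"{num_bytes:.1f} {unit}"
        num_bytes /= 1024


# Values copied from the quantile statistics above (assumed to be bytes).
for label, value in [("Minimum", 8062), ("Median", 407253), ("Maximum", 35963647)]:
    print(label, human_size(value))
```

Under that assumption the median attachment is a few hundred KiB while the largest is tens of MiB, which is consistent with the strong positive skew reported for this variable.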

Descriptive statistics

Standard deviation: 4841407.3
Coefficient of variation (CV): 2.946769
Kurtosis: 38.367199
Mean: 1642954.5
Median Absolute Deviation (MAD): 348373
Skewness: 5.7991325
Sum: 1.1336386 × 10^8
Variance: 2.3439225 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
583874 | 3 | 4.3%
50719 | 2 | 2.9%
8062 | 2 | 2.9%
74587 | 2 | 2.9%
1226752 | 2 | 2.9%
58880 | 2 | 2.9%
2951168 | 2 | 2.9%
267354 | 2 | 2.9%
13312 | 2 | 2.9%
12035938 | 2 | 2.9%
Other values (48) | 48 | 69.6%

Minimum 10 values
Value | Count | Frequency (%)
8062 | 2 | 2.9%
9728 | 1 | 1.4%
13312 | 2 | 2.9%
16896 | 1 | 1.4%
21380 | 1 | 1.4%
25566 | 1 | 1.4%
28160 | 1 | 1.4%
40448 | 1 | 1.4%
50719 | 2 | 2.9%
58880 | 2 | 2.9%

Maximum 10 values
Value | Count | Frequency (%)
35963647 | 1 | 1.4%
12035938 | 2 | 2.9%
11247703 | 1 | 1.4%
4437751 | 1 | 1.4%
2951168 | 2 | 2.9%
2829174 | 1 | 1.4%
1588736 | 1 | 1.4%
1505807 | 1 | 1.4%
1467418 | 1 | 1.4%
1254912 | 1 | 1.4%

Interactions

[Figures: four interaction plots]

Correlations

 | 메시지 ID | 원본 파일명 | 콘텐츠 타입 | 사이즈
메시지 ID | 1.000 | 0.995 | 0.884 | 0.766
원본 파일명 | 0.995 | 1.000 | 1.000 | 1.000
콘텐츠 타입 | 0.884 | 1.000 | 1.000 | 0.000
사이즈 | 0.766 | 1.000 | 0.000 | 1.000

 | 메시지 ID | 사이즈 | 콘텐츠 타입
메시지 ID | 1.000 | 0.073 | 0.489
사이즈 | 0.073 | 1.000 | 0.000
콘텐츠 타입 | 0.489 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
Row | 메시지 ID | 원본 파일명 | 콘텐츠 타입 | 사이즈
0 | 382851 | 소스 수정(loadV template).png | image/png | 50719
1 | 382853 | 소스 수정(loadV template).png | image/png | 50719
2 | 383323 | 1회차_지식 표현과 논리 - 명제 논리.pdf | application/pdf | 227335
3 | 383323 | 2회차_지식 표현과 논리 - 기호 논리(1).pdf | application/pdf | 417078
4 | 383323 | 3회차_지식 표현과 논리 - 기호 논리(2).pdf | application/pdf | 285953
5 | 383323 | 4회차_지식 표현과 논리 - 술어 논리.pdf | application/pdf | 268060
6 | 383323 | 5회차_지식 표현과 논리 - 지식 표현 언어와 일차 논리(1).pdf | application/pdf | 576788
7 | 383323 | 6회차_지식 표현과 논리 - 지식 표현 언어와 일차 논리(2).pdf | application/pdf | 338044
8 | 383323 | 7회차_지식 표현과 논리 - 논리 도출과 논리 프로그래밍.pdf | application/pdf | 798998
9 | 383323 | 8회차_의미 네트워크와 온톨로지 - 지식의 유형과 의미 네트워크.pdf | application/pdf | 689102

Last rows
Row | 메시지 ID | 원본 파일명 | 콘텐츠 타입 | 사이즈
59 | 421931 | 기술대-1.jpg | image/jpeg | 74587
60 | 421989 | 111.jpg | image/jpeg | 122517
61 | 422935 | 12-1.jpg | image/jpeg | 450659
62 | 424053 | 성형용 수지 제조공정 설명서_2021년 7월 18일.docx | application/vnd.openxmlformats-officedocument.wordprocessingml.document | 21380
63 | 424409 | 공정분석표양식_2021년 7월 23일.xlsx | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet | 387516
64 | 424687 | 07.PVD (최종)-김상용.pdf | application/pdf | 1505807
65 | 424981 | 기술유출 디지털포렌식.pdf | application/pdf | 4437751
66 | 425289 | 공정개선활동_2021년 8월 4일.ppt | application/vnd.ms-powerpoint | 1588736
67 | 427415 | 오늘의문장-퇴계이황.jpg | image/jpeg | 123595
68 | 428735 | 12-1.jpg | image/jpeg | 163717