Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 419.9 KiB
Average record size in memory: 43.0 B
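
The average record size follows from the total memory size and the number of observations. A minimal sketch of that arithmetic, using only the figures listed above (the variable names are illustrative):

```python
# Figures copied from the dataset statistics above.
total_size_kib = 419.9      # total size in memory, in KiB
n_observations = 10_000     # number of observations

# 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B per record")
```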

Variable types

Numeric: 2
Categorical: 2

Dataset

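Description: Provides the bulletin-board category data for the smart vocational training platform (STEP) of the Online Lifelong Education Institute at Korea University of Technology and Education (한국기술교육대학교).
Author: Korea University of Technology and Education (한국기술교육대학교)
URL: https://www.data.go.kr/data/15090882/fileData.do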

Alerts

아이디 is highly overall correlated with 게시판 아이디 and 1 other field [High correlation]
게시판 아이디 is highly overall correlated with 아이디 and 1 other field [High correlation]
표시 순서 is highly overall correlated with 카테고리명 [High correlation]
카테고리명 is highly overall correlated with 아이디 and 2 other fields [High correlation]
표시 순서 is highly imbalanced (86.0%) [Imbalance]
아이디 has unique values [Unique]
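
The report does not state how the imbalance percentage is derived. One definition that appears consistent with the 86.0% reported for 표시 순서 is one minus the normalized Shannon entropy of the class frequencies. The sketch below is an assumption along those lines, not a confirmed description of the tool's internals; the function name `imbalance_score` is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the class
    frequencies and H_max is the log of the number of classes."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(n_classes)


# Value counts of 표시 순서 as listed in this report (values 1, 3, 2, 4, 6).
display_order_counts = [9527, 240, 231, 1, 1]
score = imbalance_score(display_order_counts)
print(f"{score:.1%}")
```

A score of 0 means all classes are equally frequent; a score near 1 means a single class dominates.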

Reproduction

Analysis started: 2023-12-12 03:38:22.281215
Analysis finished: 2023-12-12 03:38:23.634492
Duration: 1.35 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
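
A report of this kind can be regenerated with ydata-profiling. The sketch below assumes the dataset has been downloaded as a CSV file; the file path, encoding, report title, and the helper name `build_report` are all assumptions, and the settings stored in config.json are not reproduced here, so the output may differ from this report.

```python
from pathlib import Path


def build_report(csv_path, output_path="report.html", encoding="utf-8"):
    """Profile a CSV file and write the HTML report to output_path."""
    # Third-party dependencies are imported here so that the function can be
    # defined even where pandas or ydata-profiling is not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="STEP board categories")  # title is illustrative
    report.to_file(Path(output_path))
    return report
```

Usage would look like `build_report("board_categories.csv")`, where the file name is hypothetical.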

Variables

아이디 (ID)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 85305.298
Minimum: 32
Maximum: 171440
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 32
5th percentile: 5176.7
Q1: 39412.5
Median: 85721
Q3: 132868.75
95th percentile: 166498.2
Maximum: 171440
Range: 171408
Interquartile range (IQR): 93456.25
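
The range and the interquartile range are derived from the other quantiles in this list. A short sketch of that arithmetic for 아이디, using only the values reported above:

```python
# Quantiles of 아이디 as reported above.
minimum, q1, q3, maximum = 32, 39412.5, 132868.75, 171440

value_range = maximum - minimum   # range = maximum - minimum
iqr = q3 - q1                     # interquartile range = Q3 - Q1

print(value_range, iqr)
```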

Descriptive statistics

Standard deviation: 51286.882
Coefficient of variation (CV): 0.60121567
Kurtosis: -1.2253455
Mean: 85305.298
Median Absolute Deviation (MAD): 46699.5
Skewness: 0.0037436881
Sum: 8.5305298 × 10^8
Variance: 2.6303443 × 10^9
Monotonicity: Not monotonic
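
Several of these figures are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. A brief consistency check in Python, using the rounded values reported for 아이디 (so small rounding differences are expected):

```python
n = 10_000          # number of observations
mean = 85305.298    # reported mean
std = 51286.882     # reported standard deviation

cv = std / mean         # coefficient of variation
variance = std ** 2     # variance
total = mean * n        # sum of all values

print(f"CV={cv:.8f}  variance={variance:.7e}  sum={total:.7e}")
```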
Histogram with fixed size bins (bins=50)
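| Value | Count | Frequency (%) |
|---|---|---|
| 169760 | 1 | < 0.1% |
| 2907 | 1 | < 0.1% |
| 56334 | 1 | < 0.1% |
| 36910 | 1 | < 0.1% |
| 148559 | 1 | < 0.1% |
| 167984 | 1 | < 0.1% |
| 69642 | 1 | < 0.1% |
| 90396 | 1 | < 0.1% |
| 22323 | 1 | < 0.1% |
| 101702 | 1 | < 0.1% |
| Other values (9990) | 9990 | 99.9% |

Smallest 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 32 | 1 | < 0.1% |
| 37 | 1 | < 0.1% |
| 41 | 1 | < 0.1% |
| 92 | 1 | < 0.1% |
| 137 | 1 | < 0.1% |
| 148 | 1 | < 0.1% |
| 151 | 1 | < 0.1% |
| 157 | 1 | < 0.1% |
| 164 | 1 | < 0.1% |
| 165 | 1 | < 0.1% |

Largest 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 171440 | 1 | < 0.1% |
| 171426 | 1 | < 0.1% |
| 171418 | 1 | < 0.1% |
| 171407 | 1 | < 0.1% |
| 171392 | 1 | < 0.1% |
| 171380 | 1 | < 0.1% |
| 171371 | 1 | < 0.1% |
| 171364 | 1 | < 0.1% |
| 171363 | 1 | < 0.1% |
| 171360 | 1 | < 0.1% |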

게시판 아이디 (board ID)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 7642
Distinct (%): 76.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 58836.513
Minimum: 3
Maximum: 400181
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 3
5th percentile: 19554.9
Q1: 34282
Median: 37357
Q3: 40366.25
95th percentile: 391891.25
Maximum: 400181
Range: 400178
Interquartile range (IQR): 6084.25
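
The quantiles above show a narrow central band with long tails on both sides. As an illustration, the sketch below applies Tukey's 1.5 × IQR fences to the reported quartiles. This rule is a common convention and not something the report itself computes.

```python
# Quantiles of 게시판 아이디 as reported above.
q1, q3 = 34282.0, 40366.25
p5, p95 = 19554.9, 391891.25

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print(lower_fence, upper_fence)
# Both the 5th and the 95th percentile fall outside the fences, so under this
# convention a sizeable share of the values would be flagged as outlying.
print(p5 < lower_fence, p95 > upper_fence)
```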

Descriptive statistics

Standard deviation: 85129.203
Coefficient of variation (CV): 1.4468771
Kurtosis: 10.324488
Mean: 58836.513
Median Absolute Deviation (MAD): 3037
Skewness: 3.4244717
Sum: 5.8836513 × 10^8
Variance: 7.2469812 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
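| Value | Count | Frequency (%) |
|---|---|---|
| 35282 | 6 | 0.1% |
| 40173 | 5 | 0.1% |
| 33673 | 5 | 0.1% |
| 38843 | 5 | 0.1% |
| 36914 | 5 | 0.1% |
| 33940 | 4 | < 0.1% |
| 41894 | 4 | < 0.1% |
| 34490 | 4 | < 0.1% |
| 41693 | 4 | < 0.1% |
| 34938 | 4 | < 0.1% |
| Other values (7632) | 9954 | 99.5% |

Smallest 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 2 | < 0.1% |
| 4 | 1 | < 0.1% |
| 16940 | 1 | < 0.1% |
| 16976 | 1 | < 0.1% |
| 16985 | 1 | < 0.1% |
| 16995 | 1 | < 0.1% |
| 16996 | 1 | < 0.1% |
| 16999 | 1 | < 0.1% |
| 17005 | 2 | < 0.1% |
| 17012 | 1 | < 0.1% |

Largest 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 400181 | 1 | < 0.1% |
| 400161 | 1 | < 0.1% |
| 400146 | 1 | < 0.1% |
| 400126 | 1 | < 0.1% |
| 400101 | 1 | < 0.1% |
| 400081 | 1 | < 0.1% |
| 400066 | 1 | < 0.1% |
| 400056 | 2 | < 0.1% |
| 400051 | 1 | < 0.1% |
| 400021 | 1 | < 0.1% |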

표시 순서 (display order)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

| Value | Count |
|---|---|
| 1 | 9527 |
| 3 | 240 |
| 2 | 231 |
| 4 | 1 |
| 6 | 1 |

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 3
5th row: 1

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 9527 | 95.3% |
| 3 | 240 | 2.4% |
| 2 | 231 | 2.3% |
| 4 | 1 | < 0.1% |
| 6 | 1 | < 0.1% |
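
The "Distinct" and "Unique" figures for 표시 순서 measure different things: distinct counts how many different values occur, while unique counts the values that occur exactly once. A small sketch that recovers both from the Common Values counts above (variable names are illustrative):

```python
# Counts copied from the Common Values table for 표시 순서.
value_counts = {"1": 9527, "3": 240, "2": 231, "4": 1, "6": 1}

n_rows = sum(value_counts.values())
n_distinct = len(value_counts)                                # different values
n_unique = sum(1 for c in value_counts.values() if c == 1)    # values seen once
top_share = max(value_counts.values()) / n_rows               # share of the mode

print(n_rows, n_distinct, n_unique, f"{top_share:.1%}")
```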

Length

Histogram of lengths of the category

Common Values (Plot)

| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 9527 | 95.3% |
| 3 | 240 | 2.4% |
| 2 | 231 | 2.3% |
| 4 | 1 | < 0.1% |
| 6 | 1 | < 0.1% |

카테고리명 (category name)
Categorical

HIGH CORRELATION 

Distinct: 19
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

| Value | Count |
|---|---|
| 기타 | 1050 |
| 프로그램 | 938 |
| 사이트이용 | 937 |
| 수강신청 | 933 |
| 수료증발급 | 930 |
| Other values (14) | 5212 |

Length

Max length: 14
Median length: 7
Mean length: 4.4143
Min length: 2

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: 공정1
2nd row: 수료증발급
3rd row: 회원
4th row: 기타
5th row: 회원

Common Values

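| Value | Count | Frequency (%) |
|---|---|---|
| 기타 (other) | 1050 | 10.5% |
| 프로그램 (program) | 938 | 9.4% |
| 사이트이용 (site usage) | 937 | 9.4% |
| 수강신청 (course registration) | 933 | 9.3% |
| 수료증발급 (certificate issuance) | 930 | 9.3% |
| 참고자료 (reference materials) | 928 | 9.3% |
| 시스템관련내용 (system-related) | 914 | 9.1% |
| 회원 (membership) | 911 | 9.1% |
| 학습관련내용 (learning-related) | 894 | 8.9% |
| 수강취소 (course cancellation) | 880 | 8.8% |
| Other values (9) | 685 | 6.9% |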

Length

Histogram of lengths of the category
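| Value | Count | Frequency (%) |
|---|---|---|
| 기타 | 1050 | 9.5% |
| 프로그램 | 938 | 8.5% |
| 사이트이용 | 937 | 8.5% |
| 수강신청 | 933 | 8.5% |
| 수료증발급 | 930 | 8.5% |
| 참고자료 | 928 | 8.4% |
| 시스템관련내용 | 914 | 8.3% |
| 회원 | 911 | 8.3% |
| 학습관련내용 | 894 | 8.1% |
| 수강취소 | 880 | 8.0% |
| Other values (15) | 1685 | 15.3% |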

Interactions

[Figures: four pairwise interaction plots, not reproduced here]

Correlations

| | 아이디 | 게시판 아이디 | 표시 순서 | 카테고리명 |
|---|---|---|---|---|
| 아이디 | 1.000 | 0.600 | 0.642 | 0.958 |
| 게시판 아이디 | 0.600 | 1.000 | 0.582 | 0.839 |
| 표시 순서 | 0.642 | 0.582 | 1.000 | 0.929 |
| 카테고리명 | 0.958 | 0.839 | 0.929 | 1.000 |

| | 카테고리명 | 표시 순서 |
|---|---|---|
| 카테고리명 | 1.000 | 0.779 |
| 표시 순서 | 0.779 | 1.000 |

| | 아이디 | 게시판 아이디 | 표시 순서 | 카테고리명 |
|---|---|---|---|---|
| 아이디 | 1.000 | 0.651 | 0.320 | 0.792 |
| 게시판 아이디 | 0.651 | 1.000 | 0.385 | 0.527 |
| 표시 순서 | 0.320 | 0.385 | 1.000 | 0.779 |
| 카테고리명 | 0.792 | 0.527 | 0.779 | 1.000 |
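
The report does not say which association measure underlies each matrix. Purely as an illustration, Cramér's V is one standard measure of association between two categorical columns such as 카테고리명 and 표시 순서. The sketch below is a minimal, uncorrected version run on made-up toy labels; the function name `cramers_v` and the data are assumptions, and the result is not expected to reproduce the figures above.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equal-length sequences of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))       # observed cell counts
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    k = min(len(row_totals), len(col_totals))
    if n == 0 or k < 2:
        return 0.0
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (k - 1)))


# Made-up toy labels, not taken from the dataset.
category = ["a", "a", "b", "b", "c", "c"]
order = ["1", "1", "2", "2", "3", "3"]
print(cramers_v(category, order))
```

Values close to 1 indicate that one column almost determines the other; values close to 0 indicate little association.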

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

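| (index) | 아이디 | 게시판 아이디 | 표시 순서 | 카테고리명 |
|---|---|---|---|---|
| 93918 | 169760 | 397363 | 1 | 공정1 |
| 55451 | 101618 | 36551 | 1 | 수료증발급 |
| 25639 | 41298 | 41763 | 1 | 회원 |
| 92004 | 167834 | 394122 | 3 | 기타 |
| 19856 | 35451 | 35916 | 1 | 회원 |
| 64020 | 117824 | 36374 | 1 | 기타 |
| 45331 | 83833 | 35149 | 1 | 수강취소 |
| 60653 | 106872 | 41805 | 1 | 수료증발급 |
| 20124 | 35724 | 36189 | 1 | 회원 |
| 26454 | 49584 | 33666 | 1 | 사이트이용 |

| (index) | 아이디 | 게시판 아이디 | 표시 순서 | 카테고리명 |
|---|---|---|---|---|
| 52703 | 98753 | 33686 | 1 | 수료증발급 |
| 42414 | 73352 | 41051 | 1 | 수강신청 |
| 65654 | 119474 | 38024 | 1 | 기타 |
| 65353 | 119173 | 37723 | 1 | 기타 |
| 91930 | 167760 | 394002 | 1 | 학습 내용 |
| 37175 | 68049 | 35748 | 1 | 수강신청 |
| 36229 | 67061 | 34760 | 1 | 수강신청 |
| 57981 | 104176 | 39109 | 1 | 수료증발급 |
| 24042 | 39690 | 40155 | 1 | 회원 |
| 9004 | 16864 | 17329 | 1 | 프로그램 |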