Overview

Dataset statistics

Number of variables: 3
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 332.0 KiB
Average record size in memory: 34.0 B

Variable types

Numeric: 2
Boolean: 1

Dataset

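Description: Provides information on user terms-of-service items for the smart vocational training platform (STEP) of the Online Lifelong Education Institute, Korea University of Technology and Education.
Author: Korea University of Technology and Education (한국기술교육대학교)
URL: https://www.data.go.kr/data/15090972/fileData.do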

Alerts

Consent status (동의 여부) is highly imbalanced (63.3%) [Imbalance]
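
The 63.3% figure can be reproduced from the class counts reported for this variable (9296 True, 704 False) if the score is taken to be one minus the normalized Shannon entropy. That definition is an assumption about how the profiler scores imbalance, not something the report states; the sketch below only shows that it is consistent with the numbers here. The function name `imbalance_score` is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - H/H_max, where H is the Shannon entropy of the class shares.

    0 means perfectly balanced classes; 1 means one class holds every row.
    Assumed definition, chosen because it matches the reported 63.3%.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2 or total == 0:
        return 0.0
    # Empty classes contribute nothing to the entropy, so skip them.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# Counts taken from the Boolean variable in this report.
score = imbalance_score([9296, 704])
print(f"{score:.1%}")
```

Under this definition a 50/50 split scores 0, so a value above 60% signals that one class dominates, even though the raw split here is 93.0% to 7.0%.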

Reproduction

Analysis started: 2023-12-12 16:05:08.067170
Analysis finished: 2023-12-12 16:05:09.324093
Duration: 1.26 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
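
A report like this one can be regenerated with ydata-profiling. The sketch below assumes the CSV behind the dataset URL has been downloaded locally. The helper name `build_report`, the report title, and the `cp949` default encoding are assumptions for illustration. The `dataset` metadata keys are meant to mirror the Description, Author, and URL fields shown in this report and should be checked against the installed version.

```python
def build_report(csv_path, html_path, encoding="cp949"):
    """Profile a CSV file and write the HTML report to html_path."""
    # Imported lazily so this module still loads where the libraries are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; adjust it to match the downloaded file.
    df = pd.read_csv(csv_path, encoding=encoding)

    report = ProfileReport(
        df,
        title="STEP terms-of-service items",  # hypothetical title
        dataset={
            "description": "User terms-of-service items for STEP",
            "author": "한국기술교육대학교",
            "url": "https://www.data.go.kr/data/15090972/fileData.do",
        },
    )
    report.to_file(html_path)
    return report
```

Calling `build_report("step_terms.csv", "step_terms_report.html")`, with both file names hypothetical, would write the HTML report next to the script.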

Variables

User index (사용자 인덱스)
Real number (ℝ)

Distinct: 8406
Distinct (%): 84.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 528416.79
Minimum: 320
Maximum: 879934
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 320
5th percentile: 72791
Q1: 266143
Median: 604711.5
Q3: 790121.5
95th percentile: 861922.05
Maximum: 879934
Range: 879614
Interquartile range (IQR): 523978.5

Descriptive statistics

Standard deviation: 280578.04
Coefficient of variation (CV): 0.53097866
Kurtosis: -1.4112159
Mean: 528416.79
Median Absolute Deviation (MAD): 226784.5
Skewness: -0.34096657
Sum: 5.2841679 × 10^9
Variance: 7.8724037 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                 Count  Frequency (%)
190323                    4  < 0.1%
787469                    4  < 0.1%
163884                    4  < 0.1%
669711                    4  < 0.1%
416625                    4  < 0.1%
634439                    4  < 0.1%
448137                    3  < 0.1%
102186                    3  < 0.1%
476895                    3  < 0.1%
219015                    3  < 0.1%
Other values (8396)    9964  99.6%
Minimum 10 values
Value  Count  Frequency (%)
320        1  < 0.1%
470        2  < 0.1%
481        1  < 0.1%
484        1  < 0.1%
499        1  < 0.1%
565        1  < 0.1%
1686       1  < 0.1%
1858       1  < 0.1%
1899       1  < 0.1%
2038       1  < 0.1%
Maximum 10 values
Value   Count  Frequency (%)
879934      1  < 0.1%
879913      1  < 0.1%
879848      1  < 0.1%
879798      1  < 0.1%
879784      1  < 0.1%
879741      1  < 0.1%
879669      1  < 0.1%
879655      2  < 0.1%
879590      1  < 0.1%
879533      1  < 0.1%

Terms-of-service item ID (약관 서비스 아이템 아이디)
Real number (ℝ)

Distinct: 33
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 36.5991
Minimum: 1
Maximum: 145
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 28
Q1: 31
Median: 34
Q3: 37
95th percentile: 82
Maximum: 145
Range: 144
Interquartile range (IQR): 6
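
The narrow IQR next to a 95th percentile of 82 suggests a tight core with a long upper tail. One common way to make that concrete is the 1.5 × IQR rule (Tukey fences), sketched below with the quartiles above. The rule is a convention introduced here for illustration and is not applied by the report. Because this variable is an identifier, a value outside the fences means a rarely used item, not a measurement error. The function name `tukey_fences` is illustrative.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return the (lower, upper) bounds Q1 - k*IQR and Q3 + k*IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles reported above for this variable.
lower, upper = tukey_fences(31, 37)
print(lower, upper)

# Reported minimum, 5th percentile, median, 95th percentile, and maximum.
for value in (1, 28, 34, 82, 145):
    position = "outside" if value < lower or value > upper else "inside"
    print(value, position)
```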

Descriptive statistics

Standard deviation: 15.569053
Coefficient of variation (CV): 0.42539443
Kurtosis: 9.2412777
Mean: 36.5991
Median Absolute Deviation (MAD): 3
Skewness: 2.8957329
Sum: 365991
Variance: 242.39542
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=33)
Common values
Value              Count  Frequency (%)
28                  2256  22.6%
37                  2246  22.5%
34                  2236  22.4%
31                  2160  21.6%
82                   649  6.5%
1                     51  0.5%
62                    49  0.5%
91                    46  0.5%
20                    44  0.4%
23                    33  0.3%
Other values (23)    230  2.3%
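
The frequency column is each count divided by the number of observations. The sketch below recomputes the shares from the counts in the table above, checks that they add up to the 10000 observations, and sums the four dominant item IDs. The key `"other"` and the variable names are illustrative.

```python
# Counts from the table above; "other" stands for the 23 remaining values.
counts = {28: 2256, 37: 2246, 34: 2236, 31: 2160, 82: 649,
          1: 51, 62: 49, 91: 46, 20: 44, 23: 33, "other": 230}

total = sum(counts.values())
shares = {value: 100 * count / total for value, count in counts.items()}

# Combined share of the four item IDs that dominate the column.
top4_share = sum(shares[v] for v in (28, 37, 34, 31))

print(total)
for value, share in shares.items():
    print(value, f"{share:.1f}%")
print(f"top four: {top4_share:.1f}%")
```

Four item IDs account for nearly nine in ten rows, which is why the quartiles sit so close together.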
Minimum 10 values
Value  Count  Frequency (%)
1         51  0.5%
4          3  < 0.1%
7          8  0.1%
10         5  0.1%
17        31  0.3%
20        44  0.4%
23        33  0.3%
26        33  0.3%
28      2256  22.6%
31      2160  21.6%
Maximum 10 values
Value  Count  Frequency (%)
145        2  < 0.1%
136        1  < 0.1%
131       11  0.1%
127       10  0.1%
123       27  0.3%
119        2  < 0.1%
100       13  0.1%
95         5  0.1%
91        46  0.5%
85        19  0.2%

Consent status (동의 여부)
Boolean

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 87.9 KiB
Value  Count  Frequency (%)
True    9296  93.0%
False    704  7.0%

Interactions

[Figure: interaction plots, four panels]

Correlations

[Figure: correlation heatmap]
                                                  User index  Terms-of-service item ID  Consent status
User index (사용자 인덱스)                               1.000                     0.177           0.411
Terms-of-service item ID (약관 서비스 아이템 아이디)     0.177                     1.000           0.190
Consent status (동의 여부)                               0.411                     0.190           1.000

[Figure: correlation heatmap]
                                                  User index  Terms-of-service item ID  Consent status
User index (사용자 인덱스)                               1.000                    -0.089           0.316
Terms-of-service item ID (약관 서비스 아이템 아이디)    -0.089                     1.000           0.190
Consent status (동의 여부)                               0.316                     0.190           1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

User index (사용자 인덱스)  Terms-of-service item ID (약관 서비스 아이템 아이디)  Consent status (동의 여부)
4533453957428Y
9824387378528Y
1962522028537Y
11252988337Y
7285878350237Y
6875476912837N
4282048226834Y
9469486132028Y
5231263443962Y
9652686782631Y
User index (사용자 인덱스)  Terms-of-service item ID (약관 서비스 아이템 아이디)  Consent status (동의 여부)
7935780665228Y
7203178039637N
3264232565382Y
9197185154428Y
1219612566128Y
915410323028Y
8346582039734Y
4126946182334Y
3667238286837Y
1128911528082Y