Overview

Dataset statistics

Number of variables: 4
Number of observations: 3207
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 103.5 KiB
Average record size in memory: 33.0 B
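
The average record size can be checked against the other figures in this table. The short sketch below assumes only that 1 KiB is 1024 bytes and that the cell count is rows times columns. All values are copied from the report.

```python
# Values copied from the "Dataset statistics" table.
n_variables = 4
n_observations = 3207
total_size_kib = 103.5

# Average bytes per record: total bytes divided by the number of rows.
total_bytes = total_size_kib * 1024
avg_record_bytes = total_bytes / n_observations

# Total number of cells is rows x columns; the report lists 0 missing cells.
n_cells = n_variables * n_observations
missing_pct = 100 * 0 / n_cells

print(f"average record size: {avg_record_bytes:.1f} B")
print(f"total cells: {n_cells}, missing: {missing_pct:.1f}%")
```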

Variable types

Numeric: 1
DateTime: 1
Categorical: 1
Boolean: 1

Dataset

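Description: Data from the Environmental Measurement Analyst (환경측정분석사) qualification management system, providing the status of registered members (operators and general users), examinee numbers, and related records.
URL: https://www.data.go.kr/data/15041757/fileData.do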

Alerts

회원상태 (member status) has constant value "True" (Constant)
회원종류 (member type) is highly imbalanced (99.2%) (Imbalance)
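
The report does not say how the 99.2% imbalance figure is computed. One plausible reading, assumed here, is one minus the Shannon entropy of the class counts, normalized by the maximum entropy for that number of classes. The sketch below applies that formula to the two 회원종류 counts in this report (3205 and 2). The function name `imbalance_score` is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform split, near 1 when one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # A single class has no entropy to normalize; treat it as not imbalanced.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Counts of the two 회원종류 values taken from this report.
score = imbalance_score([3205, 2])
print(f"{score:.1%}")
```

Under this assumption the score rounds to the 99.2% shown in the alert, which suggests the formula is at least consistent with the report.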

Reproduction

Analysis started: 2023-12-12 19:28:31.674141
Analysis finished: 2023-12-12 19:28:32.155564
Duration: 0.48 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
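
A report like this one could be regenerated roughly as follows. This is a minimal sketch, not the configuration actually used: the function name `build_report`, the file paths, the `cp949` encoding, and the report title are all assumptions, and the original `config.json` settings are not reproduced.

```python
def build_report(csv_path, html_path, encoding="cp949"):
    """Profile a CSV file and write an HTML report.

    Requires pandas and ydata-profiling. The default encoding is an
    assumption about the source file, not something the report states.
    """
    # Imported lazily so the function can be defined without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # parse_dates makes 시험시작일 a datetime column rather than plain text.
    df = pd.read_csv(csv_path, encoding=encoding, parse_dates=["시험시작일"])
    report = ProfileReport(df, title="Profile report")
    report.to_file(html_path)
    return report
```

A call such as `build_report("members.csv", "report.html")` would then write the HTML file; both file names are placeholders.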

Variables

수험번호 (examinee number)
Real number (ℝ)

Distinct: 3205
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 54912643
Minimum: 12010001
Maximum: 81400069
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 28.3 KiB

Quantile statistics

Minimum: 12010001
5th percentile: 12090039
Q1: 31000138
Median: 61500271
Q3: 71400075
95th percentile: 81200104
Maximum: 81400069
Range: 69390068
Interquartile range (IQR): 40399938

Descriptive statistics

Standard deviation: 23238724
Coefficient of variation (CV): 0.42319441
Kurtosis: -1.2194817
Mean: 54912643
Median Absolute Deviation (MAD): 19520060
Skewness: -0.46297347
Sum: 1.7610485 × 10^11
Variance: 5.4003828 × 10^14
Monotonicity: Not monotonic
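
Several of these statistics are derived from one another, so they can be cross-checked. The sketch below recomputes the coefficient of variation, variance, sum, and range from the reported mean, standard deviation, row count, and extremes. Because the inputs are the rounded values shown in the report, the results agree only to about six significant digits.

```python
# Rounded values copied from the report.
mean = 54912643
std = 23238724
n = 3207
minimum, maximum = 12010001, 81400069

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
total = mean * n                 # sum of all values
value_range = maximum - minimum  # range

print(f"CV       = {cv:.4f}")
print(f"variance = {variance:.4e}")
print(f"sum      = {total:.4e}")
print(f"range    = {value_range}")
```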
Histogram with fixed size bins (bins=50)
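
"Fixed size bins" means the interval from the minimum to the maximum is split into 50 equal-width bins. The sketch below is a plain-Python illustration of that idea, not the plotting code the report uses. The function name `fixed_width_histogram` is illustrative, and putting the maximum value in the last bin is an assumed convention.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if lo == hi:
        # All values identical: put everything in the first bin.
        counts[0] = len(values)
        return counts, 0.0
    width = (hi - lo) / bins
    for v in values:
        # The maximum would index one past the end, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts, width


# Bin width implied by the reported minimum and maximum of 수험번호.
bin_width = (81400069 - 12010001) / 50
print(bin_width)
```

With the reported extremes, each of the 50 bins spans about 1.39 million examinee-number units.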
Common values

Value                 Count   Frequency (%)
31000002              2       0.1%
31000003              2       0.1%
51500003              1       < 0.1%
31000058              1       < 0.1%
31000008              1       < 0.1%
31000007              1       < 0.1%
31000076              1       < 0.1%
31000069              1       < 0.1%
31000067              1       < 0.1%
31000066              1       < 0.1%
Other values (3195)   3195    99.6%

Minimum 10 values

Value                 Count   Frequency (%)
12010001              1       < 0.1%
12010002              1       < 0.1%
12030001              1       < 0.1%
12030002              1       < 0.1%
12030003              1       < 0.1%
12040001              1       < 0.1%
12040002              1       < 0.1%
12040003              1       < 0.1%
12040004              1       < 0.1%
12050001              1       < 0.1%

Maximum 10 values

Value                 Count   Frequency (%)
81400069              1       < 0.1%
81400068              1       < 0.1%
81400067              1       < 0.1%
81400065              1       < 0.1%
81400064              1       < 0.1%
81400063              1       < 0.1%
81400061              1       < 0.1%
81400059              1       < 0.1%
81400058              1       < 0.1%
81400057              1       < 0.1%

시험시작일 (exam start date)
DateTime

Distinct: 140
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Memory size: 25.2 KiB
Minimum: 2009-12-02 00:00:00
Maximum: 2022-05-21 00:00:00
Histogram with fixed size bins (bins=50)

회원종류 (member type)
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 25.2 KiB
일반사용자 (general user): 3205
운영자 (operator): 2

Length

Max length: 5
Median length: 5
Mean length: 4.9987527
Min length: 3
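
These length statistics follow directly from the two category values and their counts. The sketch below assumes that length is measured in characters, which matches Python's `len` on these strings, and recomputes the four figures from the counts given in this report.

```python
from statistics import mean, median

# Category values and their counts, as listed for 회원종류.
counts = {"일반사용자": 3205, "운영자": 2}

# One length entry per row: 3205 strings of 5 characters, 2 strings of 3.
lengths = [len(value) for value, count in counts.items() for _ in range(count)]

print("max:", max(lengths))
print("median:", median(lengths))
print("mean:", round(mean(lengths), 7))
print("min:", min(lengths))
```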

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 일반사용자
2nd row: 일반사용자
3rd row: 일반사용자
4th row: 일반사용자
5th row: 일반사용자

Common Values

Value        Count   Frequency (%)
일반사용자   3205    99.9%
운영자       2       0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Plot of value counts: 일반사용자 3205 (99.9%), 운영자 2 (0.1%)

회원상태 (member status)
Boolean

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value   Count   Frequency (%)
True    3207    100.0%

Interactions


Correlations

            수험번호   회원종류
수험번호    1.000      0.049
회원종류    0.049      1.000

            수험번호   회원종류
수험번호    1.000      0.036
회원종류    0.036      1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

Row    수험번호   시험시작일   회원종류     회원상태
0      51500003   2022-05-21   일반사용자   Y
1      51500004   2022-05-21   일반사용자   Y
2      51500005   2022-05-21   일반사용자   Y
3      51500006   2022-05-21   일반사용자   Y
4      51500008   2022-05-21   일반사용자   Y
5      51500010   2022-05-21   일반사용자   Y
6      51500012   2022-05-21   일반사용자   Y
7      51500015   2022-05-21   일반사용자   Y
8      51500016   2022-05-21   일반사용자   Y
9      51500017   2022-05-21   일반사용자   Y

Last rows

Row    수험번호   시험시작일   회원종류     회원상태
3197   12050001   2013-10-12   일반사용자   Y
3198   12040004   2012-10-13   일반사용자   Y
3199   12040003   2012-10-13   일반사용자   Y
3200   12040002   2012-10-13   일반사용자   Y
3201   12040001   2012-10-13   일반사용자   Y
3202   12030003   2011-08-27   일반사용자   Y
3203   12030002   2011-08-27   일반사용자   Y
3204   12030001   2011-08-27   일반사용자   Y
3205   12010002   2010-12-04   일반사용자   Y
3206   12010001   2010-12-04   일반사용자   Y