Overview

Dataset statistics

Number of variables: 10
Number of observations: 2924
Missing cells: 1
Missing cells (%): < 0.1%
Duplicate rows: 531
Duplicate rows (%): 18.2%
Total size in memory: 240.0 KiB
Average record size in memory: 84.0 B
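
The derived figures in this table follow from the raw counts. A minimal sketch of that arithmetic; the variable names are illustrative and not part of the report, and 1 KiB is taken as 1024 bytes:

```python
# Raw figures copied from the dataset statistics above.
n_rows = 2924
n_duplicate_rows = 531
total_memory_kib = 240.0

# Share of rows flagged as duplicates, as a percentage.
duplicate_pct = 100 * n_duplicate_rows / n_rows

# Average bytes per record: total memory (1 KiB = 1024 B) spread over all rows.
avg_record_bytes = total_memory_kib * 1024 / n_rows

print(f"{duplicate_pct:.1f}%")
print(f"{avg_record_bytes:.1f} B")
```

Both results round to the values shown in the table (18.2% and 84.0 B).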

Variable types

DateTime: 1
Categorical: 3
Unsupported: 3
Numeric: 3

Dataset

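Description: We at the Korea Road Traffic Authority (도로교통공단) run special traffic safety education for people who have received administrative dispositions on their driver's licenses for traffic law violations, accidents, drunk driving, reckless driving, retaliatory driving, and similar offenses. This dataset contains the education schedule and reservation information for the elderly driver education program.
Author: 도로교통공단 (Korea Road Traffic Authority)
URL: https://www.data.go.kr/data/15087808/fileData.do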

Alerts

Dataset has 531 (18.2%) duplicate rows [Duplicates]
강의실정원 is highly overall correlated with 지부코드 [High correlation]
지부코드 is highly overall correlated with 강의실정원 [High correlation]
시간표구분 is highly imbalanced (99.6%) [Imbalance]
교육반코드 is highly imbalanced (56.0%) [Imbalance]
강의실번호 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
강의시작시간 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
강의종료시간 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
강의실정원 has 450 (15.4%) zeros [Zeros]
예약정원 has 2129 (72.8%) zeros [Zeros]
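
The report does not say how the imbalance percentages are computed. One definition that reproduces both figures is one minus the normalized Shannon entropy of the category counts. The sketch below assumes that definition; the function name is hypothetical, and the counts come from the 시간표구분 and 교육반코드 sections of this report.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits) of the counts."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single category carries no class balance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Category counts copied from this report (assumed definition, see above).
print(round(100 * imbalance_score([2923, 1]), 1))    # 시간표구분
print(round(100 * imbalance_score([2658, 266]), 1))  # 교육반코드
```

A score of 0 means perfectly balanced categories; a score near 1 means almost every row falls in one category.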

Reproduction

Analysis started: 2023-12-12 13:14:12.168604
Analysis finished: 2023-12-12 13:14:13.545981
Duration: 1.38 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
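
A report of this kind can be regenerated with ydata-profiling. The sketch below is not the configuration used for this run (that lives in config.json). It assumes the data is available as a CSV file, and the function name, output path, report title, and `cp949` encoding are all illustrative guesses.

```python
def build_report(csv_path, html_path="report.html", encoding="cp949"):
    """Profile a CSV file and write an HTML report (sketch under the assumptions above)."""
    # Imported lazily so the function can be defined without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; adjust it to match the downloaded file.
    df = pd.read_csv(csv_path, encoding=encoding)

    # Build the profile with default settings and save it as a standalone HTML page.
    report = ProfileReport(df, title="Overview")
    report.to_file(html_path)
    return html_path
```

Calling `build_report("data.csv")` with a real file path would write `report.html` next to the script.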

Variables

교육일자
DateTime

Distinct: 236
Distinct (%): 8.1%
Missing: 0
Missing (%): 0.0%
Memory size: 23.0 KiB
Minimum: 2020-01-02 00:00:00
Maximum: 2020-12-30 00:00:00
Histogram with fixed size bins (bins=50)
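
Fixed-size binning splits the span between the minimum and maximum into equal-width intervals. A minimal sketch using the date range and bin count reported for this variable; the function name is hypothetical, and the plotting library may place its edges slightly differently:

```python
from datetime import datetime

def fixed_bin_edges(start, end, bins):
    """Return bins + 1 equally spaced edges between start and end."""
    width = (end - start) / bins
    return [start + i * width for i in range(bins + 1)]

# Minimum, maximum, and bin count as reported above.
edges = fixed_bin_edges(datetime(2020, 1, 2), datetime(2020, 12, 30), 50)
print(len(edges), edges[1] - edges[0])
```

With 363 days between the two dates, each of the 50 bins covers a little over a week.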

지부코드
Categorical

HIGH CORRELATION 

Distinct: 35
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 23.0 KiB

Value | Count
전북시험장 | 379
도봉시험장 | 334
강남시험장 | 196
제주시험장 | 163
대구시험장 | 163
Other values (30) | 1689

Length

Max length: 8
Median length: 5
Mean length: 5.0123119
Min length: 4

Unique

Unique: 2
Unique (%): 0.1%

Sample

1st row: 도봉시험장
2nd row: 도봉시험장
3rd row: 제주시험장
4th row: 제주시험장
5th row: 전남시험장

Common Values

Value | Count | Frequency (%)
전북시험장 | 379 | 13.0%
도봉시험장 | 334 | 11.4%
강남시험장 | 196 | 6.7%
제주시험장 | 163 | 5.6%
대구시험장 | 163 | 5.6%
포항시험장 | 143 | 4.9%
북부시험장 | 131 | 4.5%
강서시험장 | 130 | 4.4%
문경시험장 | 111 | 3.8%
인천시험장 | 108 | 3.7%
Other values (25) | 1066 | 36.5%

Length

Histogram of lengths of the category

시간표구분
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 23.0 KiB

Value | Count
4 | 2923
1 | 1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 4
2nd row: 4
3rd row: 4
4th row: 4
5th row: 4

Common Values

Value | Count | Frequency (%)
4 | 2923 | > 99.9%
1 | 1 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)


교육반코드
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 23.0 KiB

Value | Count
고령자교육(의무) | 2658
고령자교육(권장) | 266

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 고령자교육(의무)
2nd row: 고령자교육(의무)
3rd row: 고령자교육(의무)
4th row: 고령자교육(의무)
5th row: 고령자교육(의무)

Common Values

Value | Count | Frequency (%)
고령자교육(의무) | 2658 | 90.9%
고령자교육(권장) | 266 | 9.1%

Length

Histogram of lengths of the category

Common Values (Plot)


강의실번호
Unsupported

REJECTED  UNSUPPORTED 

Missing: 0
Missing (%): 0.0%
Memory size: 23.0 KiB

강의시작시간
Unsupported

REJECTED  UNSUPPORTED 

Missing: 0
Missing (%): 0.0%
Memory size: 23.0 KiB

순번
Real number (ℝ)

Distinct: 44
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.7592339
Minimum: 1
Maximum: 44
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 25.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1
Q3: 3
95-th percentile: 24
Maximum: 44
Range: 43
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 7.9286612
Coefficient of variation (CV): 1.6659532
Kurtosis: 5.4358189
Mean: 4.7592339
Median Absolute Deviation (MAD): 0
Skewness: 2.4161975
Sum: 13916
Variance: 62.863668
Monotonicity: Not monotonic
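
Several of these statistics are simple functions of the others. As a quick consistency check, the sketch below recomputes them from the figures reported for 순번; the variable names are illustrative.

```python
# Figures copied from the 순번 statistics above.
mean, std = 4.7592339, 7.9286612
q1, q3 = 1, 3
minimum, maximum = 1, 44
total, n_rows = 13916, 2924

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
mean_from_sum = total / n_rows   # mean recomputed from the sum

print(round(cv, 4), round(variance, 3), iqr, value_range, round(mean_from_sum, 4))
```

Each recomputed value agrees with the reported one to the precision shown in the report.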
Histogram with fixed size bins (bins=44)

Common Values

Value | Count | Frequency (%)
1 | 1997 | 68.3%
2 | 184 | 6.3%
7 | 43 | 1.5%
8 | 40 | 1.4%
5 | 40 | 1.4%
6 | 38 | 1.3%
15 | 34 | 1.2%
12 | 34 | 1.2%
14 | 33 | 1.1%
9 | 32 | 1.1%
Other values (34) | 449 | 15.4%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1997 | 68.3%
2 | 184 | 6.3%
3 | 32 | 1.1%
4 | 31 | 1.1%
5 | 40 | 1.4%
6 | 38 | 1.3%
7 | 43 | 1.5%
8 | 40 | 1.4%
9 | 32 | 1.1%
10 | 17 | 0.6%

Maximum 10 values

Value | Count | Frequency (%)
44 | 1 | < 0.1%
43 | 1 | < 0.1%
42 | 3 | 0.1%
41 | 2 | 0.1%
40 | 1 | < 0.1%
39 | 7 | 0.2%
38 | 6 | 0.2%
37 | 7 | 0.2%
36 | 8 | 0.3%
35 | 7 | 0.2%

강의종료시간
Unsupported

REJECTED  UNSUPPORTED 

Missing: 0
Missing (%): 0.0%
Memory size: 23.0 KiB
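
The Unsupported alert asks whether these columns need cleaning. Judging by the sample rows, 강의시작시간 and 강의종료시간 hold clock times such as 09:30:00. A minimal sketch, assuming the values can be read as HH:MM:SS strings (the function name is hypothetical), that turns a start and end pair into a duration:

```python
from datetime import datetime, timedelta

def lecture_duration(start: str, end: str) -> timedelta:
    """Return end - start for two HH:MM:SS clock times on the same day."""
    fmt = "%H:%M:%S"
    return datetime.strptime(end, fmt) - datetime.strptime(start, fmt)

# Start and end times from the first sample row of this report.
print(lecture_duration("09:30:00", "11:30:00"))  # prints 2:00:00
```

Once the times are in a parseable form, the column can be profiled as a numeric duration or a time of day instead of being rejected.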

강의실정원
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 33
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.780096
Minimum: 0
Maximum: 100
Zeros: 450
Zeros (%): 15.4%
Negative: 0
Negative (%): 0.0%
Memory size: 25.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 9
median: 15
Q3: 30
95-th percentile: 35
Maximum: 100
Range: 100
Interquartile range (IQR): 21

Descriptive statistics

Standard deviation: 14.333351
Coefficient of variation (CV): 0.85418768
Kurtosis: 11.688623
Mean: 16.780096
Median Absolute Deviation (MAD): 8
Skewness: 2.3829006
Sum: 49065
Variance: 205.44495
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=33)

Common Values

Value | Count | Frequency (%)
30 | 523 | 17.9%
0 | 450 | 15.4%
15 | 396 | 13.5%
10 | 217 | 7.4%
17 | 185 | 6.3%
16 | 175 | 6.0%
35 | 155 | 5.3%
11 | 109 | 3.7%
12 | 91 | 3.1%
14 | 86 | 2.9%
Other values (23) | 537 | 18.4%

Minimum 10 values

Value | Count | Frequency (%)
0 | 450 | 15.4%
1 | 19 | 0.6%
2 | 52 | 1.8%
3 | 34 | 1.2%
4 | 4 | 0.1%
5 | 20 | 0.7%
6 | 47 | 1.6%
7 | 74 | 2.5%
8 | 15 | 0.5%
9 | 22 | 0.8%

Maximum 10 values

Value | Count | Frequency (%)
100 | 36 | 1.2%
39 | 1 | < 0.1%
37 | 2 | 0.1%
36 | 30 | 1.0%
35 | 155 | 5.3%
34 | 1 | < 0.1%
31 | 2 | 0.1%
30 | 523 | 17.9%
29 | 5 | 0.2%
28 | 55 | 1.9%

예약정원
Real number (ℝ)

ZEROS 

Distinct: 34
Distinct (%): 1.2%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.8693124
Minimum: 0
Maximum: 50
Zeros: 2129
Zeros (%): 72.8%
Negative: 0
Negative (%): 0.0%
Memory size: 25.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 28
Maximum: 50
Range: 50
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 8.3713045
Coefficient of variation (CV): 2.1635122
Kurtosis: 6.3272603
Mean: 3.8693124
Median Absolute Deviation (MAD): 0
Skewness: 2.5196693
Sum: 11310
Variance: 70.07874
Monotonicity: Not monotonic
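
This variable has one missing value, and the reported mean is consistent with dividing the sum by the number of non-missing values rather than by the total row count. A short check using the figures above (variable names are illustrative):

```python
# Figures copied from the 예약정원 statistics and the dataset overview.
n_rows = 2924
n_missing = 1
total = 11310

mean_over_all_rows = total / n_rows
mean_over_present = total / (n_rows - n_missing)

print(round(mean_over_all_rows, 4))
print(round(mean_over_present, 4))
```

Only the second value matches the reported mean of 3.8693124, so the missing row is excluded from the average instead of being counted as zero.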
Histogram with fixed size bins (bins=34)

Common Values

Value | Count | Frequency (%)
0 | 2129 | 72.8%
15 | 123 | 4.2%
1 | 84 | 2.9%
10 | 80 | 2.7%
30 | 77 | 2.6%
7 | 56 | 1.9%
2 | 40 | 1.4%
16 | 37 | 1.3%
17 | 35 | 1.2%
11 | 34 | 1.2%
Other values (24) | 228 | 7.8%

Minimum 10 values

Value | Count | Frequency (%)
0 | 2129 | 72.8%
1 | 84 | 2.9%
2 | 40 | 1.4%
3 | 17 | 0.6%
4 | 4 | 0.1%
5 | 16 | 0.5%
6 | 15 | 0.5%
7 | 56 | 1.9%
8 | 9 | 0.3%
9 | 25 | 0.9%

Maximum 10 values

Value | Count | Frequency (%)
50 | 9 | 0.3%
37 | 1 | < 0.1%
36 | 13 | 0.4%
35 | 26 | 0.9%
34 | 1 | < 0.1%
31 | 1 | < 0.1%
30 | 77 | 2.6%
29 | 5 | 0.2%
28 | 19 | 0.6%
27 | 1 | < 0.1%

Interactions


Correlations

 | 지부코드 | 시간표구분 | 교육반코드 | 순번 | 강의실정원 | 예약정원
지부코드 | 1.000 | 0.000 | 0.460 | 0.629 | 0.905 | 0.699
시간표구분 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.058
교육반코드 | 0.460 | 0.000 | 1.000 | 0.155 | 0.171 | 0.049
순번 | 0.629 | 0.000 | 0.155 | 1.000 | 0.242 | 0.104
강의실정원 | 0.905 | 0.000 | 0.171 | 0.242 | 1.000 | 0.689
예약정원 | 0.699 | 0.058 | 0.049 | 0.104 | 0.689 | 1.000

 | 시간표구분 | 교육반코드 | 지부코드
시간표구분 | 1.000 | 0.000 | 0.000
교육반코드 | 0.000 | 1.000 | 0.388
지부코드 | 0.000 | 0.388 | 1.000

 | 순번 | 강의실정원 | 예약정원 | 지부코드 | 시간표구분 | 교육반코드
순번 | 1.000 | -0.048 | -0.090 | 0.268 | 0.000 | 0.119
강의실정원 | -0.048 | 1.000 | 0.043 | 0.699 | 0.000 | 0.233
예약정원 | -0.090 | 0.043 | 1.000 | 0.307 | 0.044 | 0.070
지부코드 | 0.268 | 0.699 | 0.307 | 1.000 | 0.000 | 0.388
시간표구분 | 0.000 | 0.000 | 0.044 | 0.000 | 1.000 | 0.000
교육반코드 | 0.119 | 0.233 | 0.070 | 0.388 | 0.000 | 1.000
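
To read a matrix like the first (six-variable) table in this section, it helps to list only the strongest off-diagonal pairs. The sketch below does that for the values in that table. The 0.9 cutoff is an illustrative choice, not the threshold the report used for its High correlation alert, and the function name is hypothetical.

```python
from itertools import combinations

# Values copied from the first (six-variable) correlation table in this section.
names = ["지부코드", "시간표구분", "교육반코드", "순번", "강의실정원", "예약정원"]
matrix = [
    [1.000, 0.000, 0.460, 0.629, 0.905, 0.699],
    [0.000, 1.000, 0.000, 0.000, 0.000, 0.058],
    [0.460, 0.000, 1.000, 0.155, 0.171, 0.049],
    [0.629, 0.000, 0.155, 1.000, 0.242, 0.104],
    [0.905, 0.000, 0.171, 0.242, 1.000, 0.689],
    [0.699, 0.058, 0.049, 0.104, 0.689, 1.000],
]

def pairs_at_or_above(cutoff):
    """List (name_a, name_b, value) for off-diagonal pairs with |value| >= cutoff."""
    return [
        (names[i], names[j], matrix[i][j])
        for i, j in combinations(range(len(names)), 2)
        if abs(matrix[i][j]) >= cutoff
    ]

CUTOFF = 0.9  # illustrative choice, not taken from the report
for a, b, value in pairs_at_or_above(CUTOFF):
    print(a, b, value)
```

At this cutoff the only pair listed is 지부코드 and 강의실정원, the same pair named in the Alerts section.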

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

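First rows

 | 교육일자 | 지부코드 | 시간표구분 | 교육반코드 | 강의실번호 | 강의시작시간 | 순번 | 강의종료시간 | 강의실정원 | 예약정원
0 | 2020-01-02 | 도봉시험장 | 4 | 고령자교육(의무) | 1 | 09:30:00 | 1 | 11:30:00 | 30 | 30
1 | 2020-01-02 | 도봉시험장 | 4 | 고령자교육(의무) | 2 | 14:00:00 | 1 | 16:00:00 | 20 | 20
2 | 2020-01-02 | 제주시험장 | 4 | 고령자교육(의무) | A | 10:00:00 | 1 | 12:00:00 | 12 | 11
3 | 2020-01-02 | 제주시험장 | 4 | 고령자교육(의무) | B | 14:00:00 | 1 | 16:00:00 | 12 | 11
4 | 2020-01-02 | 전남시험장 | 4 | 고령자교육(의무) | 1 | 09:30:00 | 1 | 11:30:00 | 15 | 15
5 | 2020-01-02 | 전남시험장 | 4 | 고령자교육(의무) | 1 | 14:00:00 | 2 | 16:00:00 | 15 | 15
6 | 2020-01-02 | 북부시험장 | 4 | 고령자교육(의무) | 1 | 10:00:00 | 1 | 12:00:00 | 7 | 7
7 | 2020-01-02 | 북부시험장 | 4 | 고령자교육(의무) | 2 | 14:00:00 | 1 | 16:00:00 | 7 | 7
8 | 2020-01-02 | 강남시험장 | 4 | 고령자교육(의무) | 1 | 09:30:00 | 1 | 11:30:00 | 35 | 35
9 | 2020-01-02 | 강남시험장 | 4 | 고령자교육(의무) | 2 | 14:00:00 | 1 | 16:00:00 | 35 | 35

Last rows

 | 교육일자 | 지부코드 | 시간표구분 | 교육반코드 | 강의실번호 | 강의시작시간 | 순번 | 강의종료시간 | 강의실정원 | 예약정원
2914 | 2020-12-23 | 강남시험장 | 4 | 고령자교육(의무) | A | 09:30:00 | 1 | 11:30:00 | 2 | 2
2915 | 2020-12-23 | 강남시험장 | 4 | 고령자교육(의무) | B | 14:00:00 | 1 | 16:00:00 | 2 | 2
2916 | 2020-12-23 | 안동교육장 | 4 | 고령자교육(의무) | 1 | 10:00:00 | 1 | 12:00:00 | 10 | 6
2917 | 2020-12-24 | 원주시험장 | 4 | 고령자교육(의무) | 1 | 10:00:00 | 1 | 12:00:00 | 100 | 1
2918 | 2020-12-24 | 북부시험장 | 4 | 고령자교육(의무) | A | 10:00:00 | 1 | 12:00:00 | 5 | 5
2919 | 2020-12-24 | 울산경남지부 | 4 | 고령자교육(의무) | 1 | 10:00:00 | 1 | 12:00:00 | 0 | 0
2920 | 2020-12-24 | 북부시험장 | 4 | 고령자교육(의무) | 1 | 14:00:00 | 1 | 16:00:00 | 8 | 8
2921 | 2020-12-28 | 광주전남지부 | 4 | 고령자교육(의무) | 1 | 10:00:00 | 1 | 12:00:00 | 10 | 9
2922 | 2020-12-28 | 광주전남지부 | 4 | 고령자교육(의무) | 1 | 14:00:00 | 2 | 16:00:00 | 10 | 10
2923 | 2020-12-30 | 인천지부 | 4 | 고령자교육(의무) | 1 | 10:00:00 | 1 | 12:00:00 | 0 | 0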

Duplicate rows

Most frequently occurring

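 | 교육일자 | 지부코드 | 시간표구분 | 교육반코드 | 순번 | 강의실정원 | 예약정원 | # duplicates
327 | 2020-04-01 | 인천시험장 | 4 | 고령자교육(의무) | 1 | 17 | 0 | 3
0 | 2020-01-02 | 강남시험장 | 4 | 고령자교육(의무) | 1 | 35 | 35 | 2
1 | 2020-01-02 | 북부시험장 | 4 | 고령자교육(의무) | 1 | 7 | 7 | 2
2 | 2020-01-02 | 안산시험장 | 4 | 고령자교육(의무) | 1 | 12 | 12 | 2
3 | 2020-01-02 | 용인시험장 | 4 | 고령자교육(의무) | 1 | 36 | 36 | 2
4 | 2020-01-02 | 전북시험장 | 4 | 고령자교육(의무) | 1 | 15 | 15 | 2
5 | 2020-01-02 | 제주시험장 | 4 | 고령자교육(의무) | 1 | 12 | 11 | 2
6 | 2020-01-02 | 포항시험장 | 4 | 고령자교육(의무) | 1 | 16 | 16 | 2
7 | 2020-01-03 | 대구시험장 | 4 | 고령자교육(의무) | 1 | 30 | 30 | 2
8 | 2020-01-03 | 도봉시험장 | 4 | 고령자교육(의무) | 1 | 30 | 30 | 2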
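
The duplicate table lists only seven of the ten columns; the three Unsupported columns (강의실번호, 강의시작시간, 강의종료시간) are absent. This suggests, though the report does not state it, that duplicates were counted over the supported columns only. The sketch below tests that reading on the first ten sample rows of this report; the helper name is hypothetical.

```python
from collections import Counter

# First ten sample rows of this report, in column order:
# 교육일자, 지부코드, 시간표구분, 교육반코드, 강의실번호, 강의시작시간,
# 순번, 강의종료시간, 강의실정원, 예약정원
rows = [
    ("2020-01-02", "도봉시험장", 4, "고령자교육(의무)", "1", "09:30:00", 1, "11:30:00", 30, 30),
    ("2020-01-02", "도봉시험장", 4, "고령자교육(의무)", "2", "14:00:00", 1, "16:00:00", 20, 20),
    ("2020-01-02", "제주시험장", 4, "고령자교육(의무)", "A", "10:00:00", 1, "12:00:00", 12, 11),
    ("2020-01-02", "제주시험장", 4, "고령자교육(의무)", "B", "14:00:00", 1, "16:00:00", 12, 11),
    ("2020-01-02", "전남시험장", 4, "고령자교육(의무)", "1", "09:30:00", 1, "11:30:00", 15, 15),
    ("2020-01-02", "전남시험장", 4, "고령자교육(의무)", "1", "14:00:00", 2, "16:00:00", 15, 15),
    ("2020-01-02", "북부시험장", 4, "고령자교육(의무)", "1", "10:00:00", 1, "12:00:00", 7, 7),
    ("2020-01-02", "북부시험장", 4, "고령자교육(의무)", "2", "14:00:00", 1, "16:00:00", 7, 7),
    ("2020-01-02", "강남시험장", 4, "고령자교육(의무)", "1", "09:30:00", 1, "11:30:00", 35, 35),
    ("2020-01-02", "강남시험장", 4, "고령자교육(의무)", "2", "14:00:00", 1, "16:00:00", 35, 35),
]

UNSUPPORTED = {4, 5, 7}  # positions of 강의실번호, 강의시작시간, 강의종료시간

def supported_key(row):
    """Drop the unsupported columns so rows are compared on the remaining seven."""
    return tuple(v for i, v in enumerate(row) if i not in UNSUPPORTED)

counts = Counter(supported_key(r) for r in rows)
duplicates = {key: n for key, n in counts.items() if n > 1}

for key, n in duplicates.items():
    print(key[1], n)
```

On the full ten columns every sample row is distinct. On the seven supported columns the 제주시험장, 북부시험장, and 강남시험장 rows collapse into pairs, which matches their entries in the table above, while the 도봉시험장 and 전남시험장 rows stay distinct.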