Overview

Dataset statistics

Number of variables: 10
Number of observations: 2924
Missing cells: 1
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 240.0 KiB
Average record size in memory: 84.0 B
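
The derived figures in this block follow from the raw counts by simple arithmetic. The sketch below recomputes them; it assumes that the missing-cell percentage is taken over all cells (variables times observations) and that 1 KiB is 1024 bytes. The variable names are illustrative only.

```python
# Figures copied from the dataset statistics above.
n_variables = 10
n_observations = 2924
missing_cells = 1
total_size_kib = 240.0

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.4f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

A single missing cell out of 29240 is far below 0.1%, which is why the report shows "< 0.1%" rather than a rounded value.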

Variable types

DateTime: 3
Categorical: 4
Numeric: 3

Dataset

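Description: 도로교통공단 (the Korea Road Traffic Authority) runs special traffic safety education for drivers who have received administrative dispositions on their licenses for traffic violations, accidents, drunk driving, reckless driving, retaliatory driving, and similar offenses. This dataset contains the education schedule and reservation information for elderly driver education.
Author: 도로교통공단
URL: https://www.data.go.kr/data/15087808/fileData.do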

Alerts

High correlation: 강의실정원 is highly overall correlated with 지부코드
High correlation: 지부코드 is highly overall correlated with 강의실정원 and 1 other field
High correlation: 강의실번호 is highly overall correlated with 지부코드
Imbalance: 시간표구분 is highly imbalanced (99.6%)
Imbalance: 교육반코드 is highly imbalanced (56.0%)
Zeros: 강의실정원 has 450 (15.4%) zeros
Zeros: 예약정원 has 2129 (72.8%) zeros
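
The two imbalance percentages can be reproduced from the class counts reported further down for 시간표구분 (2923 and 1) and 교육반코드 (2658 and 266). The sketch below assumes the score is one minus the Shannon entropy of the class frequencies, normalised by the maximum entropy for that number of classes. This formula matches both reported figures, but the report itself does not state it, and `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class frequencies and k is the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class carries no imbalance in this sketch
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Class counts taken from the variable sections of this report.
print(f"{imbalance_score([2923, 1]):.1%}")    # 99.6%
print(f"{imbalance_score([2658, 266]):.1%}")  # 56.0%
```

A score near 100% means almost every row falls in one class, as with 시간표구분, where only one row differs from the rest.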

Reproduction

Analysis started: 2023-12-12 13:14:05.835927
Analysis finished: 2023-12-12 13:14:07.576196
Duration: 1.74 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

교육일자
DateTime

Distinct: 236
Distinct (%): 8.1%
Missing: 0
Missing (%): 0.0%
Memory size: 23.0 KiB
Minimum: 2020-01-02 00:00:00
Maximum: 2020-12-30 00:00:00
Histogram with fixed size bins (bins=50)

지부코드
Categorical

HIGH CORRELATION 

Distinct: 35
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 23.0 KiB

전북시험장: 379
도봉시험장: 334
강남시험장: 196
제주시험장: 163
대구시험장: 163
Other values (30): 1689

Length

Max length: 8
Median length: 5
Mean length: 5.0123119
Min length: 4

Unique

Unique: 2
Unique (%): 0.1%

Sample

1st row: 도봉시험장
2nd row: 도봉시험장
3rd row: 제주시험장
4th row: 제주시험장
5th row: 전남시험장

Common Values

Value  Count  Frequency (%)
전북시험장  379  13.0%
도봉시험장  334  11.4%
강남시험장  196  6.7%
제주시험장  163  5.6%
대구시험장  163  5.6%
포항시험장  143  4.9%
북부시험장  131  4.5%
강서시험장  130  4.4%
문경시험장  111  3.8%
인천시험장  108  3.7%
Other values (25)  1066  36.5%

Length

Histogram of lengths of the category

Words

Value  Count  Frequency (%)
전북시험장  379  13.0%
도봉시험장  334  11.4%
강남시험장  196  6.7%
제주시험장  163  5.6%
대구시험장  163  5.6%
포항시험장  143  4.9%
북부시험장  131  4.5%
강서시험장  130  4.4%
문경시험장  111  3.8%
인천시험장  108  3.7%
Other values (25)  1066  36.5%

시간표구분
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 23.0 KiB

4: 2923
1: 1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 4
2nd row: 4
3rd row: 4
4th row: 4
5th row: 4

Common Values

Value  Count  Frequency (%)
4  2923  > 99.9%
1  1  < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value  Count  Frequency (%)
4  2923  > 99.9%
1  1  < 0.1%

교육반코드
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 23.0 KiB

고령자교육(의무): 2658
고령자교육(권장): 266

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 고령자교육(의무)
2nd row: 고령자교육(의무)
3rd row: 고령자교육(의무)
4th row: 고령자교육(의무)
5th row: 고령자교육(의무)

Common Values

Value  Count  Frequency (%)
고령자교육(의무)  2658  90.9%
고령자교육(권장)  266  9.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value  Count  Frequency (%)
고령자교육(의무  2658  90.9%
고령자교육(권장  266  9.1%

강의실번호
Categorical

HIGH CORRELATION 

Distinct: 9
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 23.0 KiB

1: 1553
2: 978
3: 199
A: 81
B: 75
Other values (4): 38

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 2
Unique (%): 0.1%

Sample

1st row: 1
2nd row: 2
3rd row: A
4th row: B
5th row: 1

Common Values

Value  Count  Frequency (%)
1  1553  53.1%
2  978  33.4%
3  199  6.8%
A  81  2.8%
B  75  2.6%
4  33  1.1%
5  3  0.1%
6  1  < 0.1%
9  1  < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value  Count  Frequency (%)
1  1553  53.1%
2  978  33.4%
3  199  6.8%
a  81  2.8%
b  75  2.6%
4  33  1.1%
5  3  0.1%
6  1  < 0.1%
9  1  < 0.1%

강의시작시간
DateTime

Distinct: 13
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 23.0 KiB
Minimum: 2023-12-12 09:00:00
Maximum: 2023-12-12 16:00:00
Histogram with fixed size bins (bins=13)
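
The Minimum and Maximum just above carry the date 2023-12-12, which is the day the analysis ran, while the sample rows at the end of the report show this column as clock times such as 09:30. One plausible reading, an assumption rather than something the report states, is that the time strings were parsed as full timestamps and picked up the current date. A minimal sketch of parsing such values as time-of-day only, so that no date is attached, might look like this (`parse_clock` is a hypothetical helper):

```python
from datetime import datetime, time


def parse_clock(text: str) -> time:
    """Parse an 'HH:MM' string into a time-of-day value with no date part."""
    return datetime.strptime(text, "%H:%M").time()


# Start times as they appear in the sample rows of this report.
starts = [parse_clock(t) for t in ["09:30", "14:00", "10:00"]]
print(min(starts), max(starts))
```

Keeping the column as a time of day avoids a range that silently changes with the day the profile is generated.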

순번
Real number (ℝ)

Distinct: 44
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.7592339
Minimum: 1
Maximum: 44
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 25.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 3
95-th percentile: 24
Maximum: 44
Range: 43
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 7.9286612
Coefficient of variation (CV): 1.6659532
Kurtosis: 5.4358189
Mean: 4.7592339
Median Absolute Deviation (MAD): 0
Skewness: 2.4161975
Sum: 13916
Variance: 62.863668
Monotonicity: Not monotonic
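
Several of the descriptive statistics for 순번 are tied together by simple identities: the mean is the sum divided by the number of non-missing rows, the variance is the square of the standard deviation, and the coefficient of variation is the standard deviation divided by the mean. The sketch below checks those identities against the reported figures; the tolerances are arbitrary choices that allow for rounding in the report.

```python
import math

# Figures copied from the 순번 section of this report.
n_rows = 2924          # no missing values for this variable
total = 13916
mean = 4.7592339
std = 7.9286612
variance = 62.863668
cv = 1.6659532

# Mean = Sum / number of non-missing rows.
assert math.isclose(total / n_rows, mean, abs_tol=1e-6)

# Variance = (standard deviation) squared.
assert math.isclose(std ** 2, variance, abs_tol=1e-4)

# Coefficient of variation = standard deviation / mean.
assert math.isclose(std / mean, cv, abs_tol=1e-6)

print("reported statistics are mutually consistent")
```

A CV above 1 together with a median of 1 reflects the long right tail: most rows have 순번 equal to 1, while a few reach 44.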
Histogram with fixed size bins (bins=44)

Common values

Value  Count  Frequency (%)
1  1997  68.3%
2  184  6.3%
7  43  1.5%
8  40  1.4%
5  40  1.4%
6  38  1.3%
15  34  1.2%
12  34  1.2%
14  33  1.1%
9  32  1.1%
Other values (34)  449  15.4%

Minimum 10 values

Value  Count  Frequency (%)
1  1997  68.3%
2  184  6.3%
3  32  1.1%
4  31  1.1%
5  40  1.4%
6  38  1.3%
7  43  1.5%
8  40  1.4%
9  32  1.1%
10  17  0.6%

Maximum 10 values

Value  Count  Frequency (%)
44  1  < 0.1%
43  1  < 0.1%
42  3  0.1%
41  2  0.1%
40  1  < 0.1%
39  7  0.2%
38  6  0.2%
37  7  0.2%
36  8  0.3%
35  7  0.2%

강의종료시간
DateTime

Distinct: 14
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 23.0 KiB
Minimum: 2023-12-12 11:00:00
Maximum: 2023-12-12 18:00:00
Histogram with fixed size bins (bins=14)

강의실정원
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 33
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.780096
Minimum: 0
Maximum: 100
Zeros: 450
Zeros (%): 15.4%
Negative: 0
Negative (%): 0.0%
Memory size: 25.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 9
Median: 15
Q3: 30
95-th percentile: 35
Maximum: 100
Range: 100
Interquartile range (IQR): 21

Descriptive statistics

Standard deviation: 14.333351
Coefficient of variation (CV): 0.85418768
Kurtosis: 11.688623
Mean: 16.780096
Median Absolute Deviation (MAD): 8
Skewness: 2.3829006
Sum: 49065
Variance: 205.44495
Monotonicity: Not monotonic
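
The quantile figures for 강의실정원 can be used for a quick outlier screen. The sketch below applies the common 1.5 × IQR rule (Tukey fences). That rule is brought in here for illustration and is not part of the report, and `tukey_fences` is a hypothetical helper.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) fences at k * IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles and extremes copied from the 강의실정원 section.
q1, q3 = 9, 30
minimum, maximum = 0, 100

lower, upper = tukey_fences(q1, q3)
print(f"IQR = {q3 - q1}, fences = ({lower}, {upper})")
print("maximum beyond upper fence:", maximum > upper)
print("minimum beyond lower fence:", minimum < lower)
```

Under this rule the capacity of 100 lies well above the upper fence of 61.5, so those rows are worth a manual look before any modelling. The zeros stay inside the lower fence and need a separate check.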
Histogram with fixed size bins (bins=33)

Common values

Value  Count  Frequency (%)
30  523  17.9%
0  450  15.4%
15  396  13.5%
10  217  7.4%
17  185  6.3%
16  175  6.0%
35  155  5.3%
11  109  3.7%
12  91  3.1%
14  86  2.9%
Other values (23)  537  18.4%

Minimum 10 values

Value  Count  Frequency (%)
0  450  15.4%
1  19  0.6%
2  52  1.8%
3  34  1.2%
4  4  0.1%
5  20  0.7%
6  47  1.6%
7  74  2.5%
8  15  0.5%
9  22  0.8%

Maximum 10 values

Value  Count  Frequency (%)
100  36  1.2%
39  1  < 0.1%
37  2  0.1%
36  30  1.0%
35  155  5.3%
34  1  < 0.1%
31  2  0.1%
30  523  17.9%
29  5  0.2%
28  55  1.9%

예약정원
Real number (ℝ)

ZEROS 

Distinct: 34
Distinct (%): 1.2%
Missing: 1
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.8693124
Minimum: 0
Maximum: 50
Zeros: 2129
Zeros (%): 72.8%
Negative: 0
Negative (%): 0.0%
Memory size: 25.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1
95-th percentile: 28
Maximum: 50
Range: 50
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 8.3713045
Coefficient of variation (CV): 2.1635122
Kurtosis: 6.3272603
Mean: 3.8693124
Median Absolute Deviation (MAD): 0
Skewness: 2.5196693
Sum: 11310
Variance: 70.07874
Monotonicity: Not monotonic
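
Because 예약정원 has one missing value and is dominated by zeros, the overall mean says little about sessions that actually have a reservation quota. The sketch below recomputes the reported mean over the non-missing rows and then derives the mean over non-zero rows only. The second figure is a derived illustration and does not appear in the report.

```python
# Figures copied from the 예약정원 section of this report.
n_rows = 2924
n_missing = 1
n_zeros = 2129
total = 11310

n_valid = n_rows - n_missing          # rows that contribute to the mean
mean_all = total / n_valid            # matches the reported Mean
zero_pct = 100 * n_zeros / n_rows     # share of rows equal to zero

n_nonzero = n_valid - n_zeros
mean_nonzero = total / n_nonzero      # mean over rows with a non-zero quota

print(f"mean over valid rows: {mean_all:.7f}")
print(f"zeros: {zero_pct:.1f}%")
print(f"mean over non-zero rows: {mean_nonzero:.2f}")
```

The gap between the two means shows how strongly the zeros pull the reported mean down.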
Histogram with fixed size bins (bins=34)

Common values

Value  Count  Frequency (%)
0  2129  72.8%
15  123  4.2%
1  84  2.9%
10  80  2.7%
30  77  2.6%
7  56  1.9%
2  40  1.4%
16  37  1.3%
17  35  1.2%
11  34  1.2%
Other values (24)  228  7.8%

Minimum 10 values

Value  Count  Frequency (%)
0  2129  72.8%
1  84  2.9%
2  40  1.4%
3  17  0.6%
4  4  0.1%
5  16  0.5%
6  15  0.5%
7  56  1.9%
8  9  0.3%
9  25  0.9%

Maximum 10 values

Value  Count  Frequency (%)
50  9  0.3%
37  1  < 0.1%
36  13  0.4%
35  26  0.9%
34  1  < 0.1%
31  1  < 0.1%
30  77  2.6%
29  5  0.2%
28  19  0.6%
27  1  < 0.1%

Interactions


Correlations

Variable  지부코드  시간표구분  교육반코드  강의실번호  강의시작시간  순번  강의종료시간  강의실정원  예약정원
지부코드  1.000  0.000  0.460  0.850  0.871  0.629  0.855  0.905  0.699
시간표구분  0.000  1.000  0.000  0.000  0.000  0.000  0.000  0.000  0.058
교육반코드  0.460  0.000  1.000  0.315  0.205  0.155  0.914  0.171  0.049
강의실번호  0.850  0.000  0.315  1.000  0.546  0.123  0.594  0.237  0.223
강의시작시간  0.871  0.000  0.205  0.546  1.000  0.247  0.964  0.550  0.304
순번  0.629  0.000  0.155  0.123  0.247  1.000  0.252  0.242  0.104
강의종료시간  0.855  0.000  0.914  0.594  0.964  0.252  1.000  0.579  0.381
강의실정원  0.905  0.000  0.171  0.237  0.550  0.242  0.579  1.000  0.689
예약정원  0.699  0.058  0.049  0.223  0.304  0.104  0.381  0.689  1.000

Variable  시간표구분  교육반코드  강의실번호  지부코드
시간표구분  1.000  0.000  0.000  0.000
교육반코드  0.000  1.000  0.314  0.388
강의실번호  0.000  0.314  1.000  0.514
지부코드  0.000  0.388  0.514  1.000

Variable  순번  강의실정원  예약정원  지부코드  시간표구분  교육반코드  강의실번호
순번  1.000  -0.048  -0.090  0.268  0.000  0.119  0.056
강의실정원  -0.048  1.000  0.043  0.699  0.000  0.233  0.150
예약정원  -0.090  0.043  1.000  0.307  0.044  0.070  0.070
지부코드  0.268  0.699  0.307  1.000  0.000  0.388  0.514
시간표구분  0.000  0.000  0.044  0.000  1.000  0.000  0.000
교육반코드  0.119  0.233  0.070  0.388  0.000  1.000  0.314
강의실번호  0.056  0.150  0.070  0.514  0.000  0.314  1.000
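
The high-correlation alerts near the top of the report line up with the last matrix above if a pair is flagged whenever its absolute coefficient reaches 0.5. That cut-off is an assumption chosen because it reproduces the alerts; the report does not state it. A minimal sketch, with `high_pairs` as a hypothetical helper:

```python
THRESHOLD = 0.5  # assumed cut-off, not stated in the report

names = ["순번", "강의실정원", "예약정원", "지부코드", "시간표구분", "교육반코드", "강의실번호"]
matrix = [
    [1.000, -0.048, -0.090, 0.268, 0.000, 0.119, 0.056],
    [-0.048, 1.000, 0.043, 0.699, 0.000, 0.233, 0.150],
    [-0.090, 0.043, 1.000, 0.307, 0.044, 0.070, 0.070],
    [0.268, 0.699, 0.307, 1.000, 0.000, 0.388, 0.514],
    [0.000, 0.000, 0.044, 0.000, 1.000, 0.000, 0.000],
    [0.119, 0.233, 0.070, 0.388, 0.000, 1.000, 0.314],
    [0.056, 0.150, 0.070, 0.514, 0.000, 0.314, 1.000],
]


def high_pairs(names, matrix, threshold):
    """List each unordered pair whose absolute coefficient meets the threshold."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):  # upper triangle only
            if abs(matrix[i][j]) >= threshold:
                pairs.append((names[i], names[j], matrix[i][j]))
    return pairs


for a, b, value in high_pairs(names, matrix, THRESHOLD):
    print(f"{a} ~ {b}: {value:.3f}")
```

Both flagged pairs involve 지부코드, which is consistent with the alert that 지부코드 is highly correlated with 강의실정원 and one other field.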

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index  교육일자  지부코드  시간표구분  교육반코드  강의실번호  강의시작시간  순번  강의종료시간  강의실정원  예약정원
0  2020-01-02  도봉시험장  4  고령자교육(의무)  1  09:30  1  11:30  30  30
1  2020-01-02  도봉시험장  4  고령자교육(의무)  2  14:00  1  16:00  20  20
2  2020-01-02  제주시험장  4  고령자교육(의무)  A  10:00  1  12:00  12  11
3  2020-01-02  제주시험장  4  고령자교육(의무)  B  14:00  1  16:00  12  11
4  2020-01-02  전남시험장  4  고령자교육(의무)  1  09:30  1  11:30  15  15
5  2020-01-02  전남시험장  4  고령자교육(의무)  1  14:00  2  16:00  15  15
6  2020-01-02  북부시험장  4  고령자교육(의무)  1  10:00  1  12:00  7  7
7  2020-01-02  북부시험장  4  고령자교육(의무)  2  14:00  1  16:00  7  7
8  2020-01-02  강남시험장  4  고령자교육(의무)  1  09:30  1  11:30  35  35
9  2020-01-02  강남시험장  4  고령자교육(의무)  2  14:00  1  16:00  35  35

Last rows

Index  교육일자  지부코드  시간표구분  교육반코드  강의실번호  강의시작시간  순번  강의종료시간  강의실정원  예약정원
2914  2020-12-23  강남시험장  4  고령자교육(의무)  A  09:30  1  11:30  2  2
2915  2020-12-23  강남시험장  4  고령자교육(의무)  B  14:00  1  16:00  2  2
2916  2020-12-23  안동교육장  4  고령자교육(의무)  1  10:00  1  12:00  10  6
2917  2020-12-24  원주시험장  4  고령자교육(의무)  1  10:00  1  12:00  100  1
2918  2020-12-24  북부시험장  4  고령자교육(의무)  A  10:00  1  12:00  5  5
2919  2020-12-24  울산경남지부  4  고령자교육(의무)  1  10:00  1  12:00  0  0
2920  2020-12-24  북부시험장  4  고령자교육(의무)  1  14:00  1  16:00  8  8
2921  2020-12-28  광주전남지부  4  고령자교육(의무)  1  10:00  1  12:00  10  9
2922  2020-12-28  광주전남지부  4  고령자교육(의무)  1  14:00  2  16:00  10  10
2923  2020-12-30  인천지부  4  고령자교육(의무)  1  10:00  1  12:00  0  0
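
A natural quantity to derive from the sample rows is how much of the room capacity (강의실정원) the reservation quota (예약정원) uses. Since 강의실정원 is zero in 15.4% of rows, the division needs a guard. The sketch below uses a few rows from the sample; the `Session` class, its English field names and `fill_rate` are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    branch: str     # 지부코드
    capacity: int   # 강의실정원
    reserved: int   # 예약정원


def fill_rate(session: Session) -> Optional[float]:
    """Return reserved / capacity, or None when capacity is zero."""
    if session.capacity == 0:
        return None  # avoid dividing by a zero capacity
    return session.reserved / session.capacity


# A few rows taken from the sample above.
sessions = [
    Session("도봉시험장", 30, 30),
    Session("제주시험장", 12, 11),
    Session("안동교육장", 10, 6),
    Session("인천지부", 0, 0),
]

for s in sessions:
    rate = fill_rate(s)
    label = "n/a" if rate is None else f"{rate:.0%}"
    print(f"{s.branch}: {label}")
```

Returning `None` for zero-capacity rows keeps them visible in later analysis instead of silently turning them into 0% or raising an error.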