Overview

Dataset statistics

Number of variables: 9
Number of observations: 6670
Missing cells: 1
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.7 KiB
Average record size in memory: 75.0 B
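
The derived figures in this block follow from the raw counts. A minimal Python sketch, assuming "missing cells (%)" is taken over all cells (variables × observations) and that 1 KiB = 1024 B; the variable names are illustrative and not part of the report:

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 9
n_observations = 6670
missing_cells = 1
total_size_kib = 488.7

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.4f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

A single missing cell out of roughly sixty thousand is why the report shows "< 0.1%" rather than a rounded zero.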

Variable types

Numeric: 2
DateTime: 3
Categorical: 3
Text: 1

Dataset

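Description: The Korea Road Traffic Authority (도로교통공단) conducts traffic safety education for people who have received administrative dispositions on their driver's licenses for traffic law violations, accidents, drunk driving, reckless driving, retaliatory driving, and similar offenses.
Author: 도로교통공단 (Korea Road Traffic Authority)
URL: https://www.data.go.kr/data/15087818/fileData.do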

Alerts

교육구분_1 is highly overall correlated with 교육구분 (High correlation)
교육구분 is highly overall correlated with 교육구분_1 (High correlation)
실적번호 has unique values (Unique)
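
The two high-correlation alerts are what one would expect if 교육구분 is a numeric code and 교육구분_1 is its text label. The sketch below checks that reading. It assumes every row follows the code-to-label pairing visible in the report's sample rows, which the report itself does not state; the counts are the ones listed for each variable.

```python
from collections import Counter

# Assumed pairing of 교육구분 codes with 교육구분_1 labels (taken from the
# report's sample rows), with the counts shown for each category.
pair_counts = {
    ("1", "어린이 교통안전교육"): 4896,
    ("3", "기관/단체, 기업체 등 교통안전교육"): 1271,
    ("4", "노인 교통안전교육"): 466,
    ("2", "청소년 교통안전교육"): 37,
}

# A one-to-one mapping means no code and no label appears in two pairs.
codes = Counter(code for code, _ in pair_counts)
labels = Counter(label for _, label in pair_counts)
is_one_to_one = (
    all(c == 1 for c in codes.values())
    and all(c == 1 for c in labels.values())
)

total = sum(pair_counts.values())
shares = {code: round(100 * n / total, 1) for (code, _), n in pair_counts.items()}

print(is_one_to_one, total, shares)
```

If the pairing holds for all rows, one of the two columns carries no extra information and could be dropped before modelling.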

Reproduction

Analysis started: 2023-12-12 01:33:45.645158
Analysis finished: 2023-12-12 01:33:47.684503
Duration: 2.04 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
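
The reported duration is the difference between the two timestamps. A small check using only the Python standard library; it assumes both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2023-12-12 01:33:45.645158")
finished = datetime.fromisoformat("2023-12-12 01:33:47.684503")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```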

Variables

실적번호
Real number (ℝ)

UNIQUE 

Distinct: 6670
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 781388.04
Minimum: 777624
Maximum: 791798
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 58.8 KiB

Quantile statistics

Minimum: 777624
5-th percentile: 778296.45
Q1: 779703.25
Median: 781387.5
Q3: 783062.75
95-th percentile: 784452.55
Maximum: 791798
Range: 14174
Interquartile range (IQR): 3359.5

Descriptive statistics

Standard deviation: 1970.7477
Coefficient of variation (CV): 0.0025221114
Kurtosis: -0.89477471
Mean: 781388.04
Median Absolute Deviation (MAD): 1680
Skewness: 0.060654578
Sum: 5.2118582 × 10^9
Variance: 3883846.3
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
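
Several of the 실적번호 statistics can be cross-checked against each other. The sketch below recomputes the range, IQR, coefficient of variation, variance and sum from the other reported values, using the standard textbook definitions. That the report uses exactly these definitions is an assumption, and small differences are expected because the inputs are rounded.

```python
# Values copied from the 실적번호 statistics.
n = 6670
minimum, maximum = 777624, 791798
q1, q3 = 779703.25, 783062.75
mean = 781388.04
std = 1970.7477

value_range = maximum - minimum   # reported: 14174
iqr = q3 - q1                     # reported: 3359.5
cv = std / mean                   # reported: 0.0025221114
variance = std ** 2               # reported: 3883846.3
total = mean * n                  # reported: 5.2118582 × 10^9

print(value_range, iqr, cv, variance, total)
```

The very small CV, together with the strictly increasing monotonicity, fits an identifier column rather than a measurement.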
Value | Count | Frequency (%)
777624 | 1 | < 0.1%
782647 | 1 | < 0.1%
782509 | 1 | < 0.1%
782508 | 1 | < 0.1%
782507 | 1 | < 0.1%
782506 | 1 | < 0.1%
782505 | 1 | < 0.1%
782504 | 1 | < 0.1%
782503 | 1 | < 0.1%
782502 | 1 | < 0.1%
Other values (6660) | 6660 | 99.9%

Minimum 10 values

Value | Count | Frequency (%)
777624 | 1 | < 0.1%
777625 | 1 | < 0.1%
777863 | 1 | < 0.1%
777864 | 1 | < 0.1%
777865 | 1 | < 0.1%
777867 | 1 | < 0.1%
777868 | 1 | < 0.1%
777959 | 1 | < 0.1%
777961 | 1 | < 0.1%
777966 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
791798 | 1 | < 0.1%
790306 | 1 | < 0.1%
790305 | 1 | < 0.1%
789147 | 1 | < 0.1%
789143 | 1 | < 0.1%
788112 | 1 | < 0.1%
785803 | 1 | < 0.1%
785802 | 1 | < 0.1%
785189 | 1 | < 0.1%
784854 | 1 | < 0.1%

교육일자
DateTime

Distinct: 278
Distinct (%): 4.2%
Missing: 1
Missing (%): < 0.1%
Memory size: 52.2 KiB
Minimum: 2020-01-02 00:00:00
Maximum: 2020-12-30 00:00:00
Histogram with fixed size bins (bins=50)

지부코드
Categorical

Distinct: 13
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 52.2 KiB

Value | Count
부산지부 | 1096
광주전남지부 | 1001
대구지부 | 942
울산경남지부 | 675
강원지부 | 582
Other values (8) | 2374

Length

Max length: 8
Median length: 4
Mean length: 4.7316342
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 인천지부
2nd row: 인천지부
3rd row: 인천지부
4th row: 인천지부
5th row: 인천지부

Common Values

Value | Count | Frequency (%)
부산지부 | 1096 | 16.4%
광주전남지부 | 1001 | 15.0%
대구지부 | 942 | 14.1%
울산경남지부 | 675 | 10.1%
강원지부 | 582 | 8.7%
충북지부 | 450 | 6.7%
제주지부 | 433 | 6.5%
대전세종충남지부 | 382 | 5.7%
경북지부 | 381 | 5.7%
전북지부 | 312 | 4.7%
Other values (3) | 416 | 6.2%

Length

Histogram of lengths of the category

교육구분
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.2 KiB

Value | Count
1 | 4896
3 | 1271
4 | 466
2 | 37

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 3
3rd row: 4
4th row: 4
5th row: 4

Common Values

Value | Count | Frequency (%)
1 | 4896 | 73.4%
3 | 1271 | 19.1%
4 | 466 | 7.0%
2 | 37 | 0.6%

Length

Histogram of lengths of the category

Common Values (Plot)


교육구분_1
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.2 KiB

Value | Count
어린이 교통안전교육 | 4896
기관/단체, 기업체 등 교통안전교육 | 1271
노인 교통안전교육 | 466
청소년 교통안전교육 | 37

Length

Max length: 19
Median length: 10
Mean length: 11.645127
Min length: 9

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 기관/단체, 기업체 등 교통안전교육
2nd row: 기관/단체, 기업체 등 교통안전교육
3rd row: 노인 교통안전교육
4th row: 노인 교통안전교육
5th row: 노인 교통안전교육

Common Values

Value | Count | Frequency (%)
어린이 교통안전교육 | 4896 | 73.4%
기관/단체, 기업체 등 교통안전교육 | 1271 | 19.1%
노인 교통안전교육 | 466 | 7.0%
청소년 교통안전교육 | 37 | 0.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
교통안전교육 | 6670 | 42.0%
어린이 | 4896 | 30.8%
기관/단체 | 1271 | 8.0%
기업체 | 1271 | 8.0%
등 | 1271 | 8.0%
노인 | 466 | 2.9%
청소년 | 37 | 0.2%

교육장소
Text

Distinct: 2382
Distinct (%): 35.7%
Missing: 0
Missing (%): 0.0%
Memory size: 52.2 KiB

Length

Max length: 23
Median length: 21
Mean length: 7.9910045
Min length: 2

Characters and Unicode

Total characters: 53300
Distinct characters: 559
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1098
Unique (%): 16.5%

Sample

1st row: 171연대
2nd row: 172연대
3rd row: 인천노인인력개발센터
4th row: 인천노인인력개발센터
5th row: 인천노인인력개발센터

Value | Count | Frequency (%)
어린이집 | 172 | 2.1%
목포 | 168 | 2.1%
유치원 | 133 | 1.6%
병설유치원 | 129 | 1.6%
양재교육장(법규준수 | 112 | 1.4%
삼다자동차운전학원(시청각 | 100 | 1.2%
제일자동차운전학원(시청각 | 100 | 1.2%
동구청 | 94 | 1.2%
지부 | 86 | 1.1%
광양 | 70 | 0.9%
Other values (2425) | 6904 | 85.6%

Most occurring characters

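Value | Count | Frequency (%)
(character not preserved) | 2766 | 5.2%
(character not preserved) | 2461 | 4.6%
(character not preserved) | 2447 | 4.6%
(character not preserved) | 2386 | 4.5%
(character not preserved) | 2312 | 4.3%
(character not preserved) | 1661 | 3.1%
(character not preserved) | 1643 | 3.1%
(character not preserved) | 1486 | 2.8%
(space) | 1403 | 2.6%
(character not preserved) | 1109 | 2.1%
Other values (549) | 33626 | 63.1%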

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 50359 | 94.5%
Space Separator | 1403 | 2.6%
Close Punctuation | 508 | 1.0%
Open Punctuation | 508 | 1.0%
Decimal Number | 279 | 0.5%
Uppercase Letter | 152 | 0.3%
Lowercase Letter | 36 | 0.1%
Connector Punctuation | 28 | 0.1%
Other Punctuation | 24 | < 0.1%
Dash Punctuation | 3 | < 0.1%

Most frequent character per category

Other Letter
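Value | Count | Frequency (%)
(character not preserved) | 2766 | 5.5%
(character not preserved) | 2461 | 4.9%
(character not preserved) | 2447 | 4.9%
(character not preserved) | 2386 | 4.7%
(character not preserved) | 2312 | 4.6%
(character not preserved) | 1661 | 3.3%
(character not preserved) | 1643 | 3.3%
(character not preserved) | 1486 | 3.0%
(character not preserved) | 1109 | 2.2%
(character not preserved) | 1104 | 2.2%
Other values (507) | 30984 | 61.5%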

Uppercase Letter
Value | Count | Frequency (%)
C | 28 | 18.4%
A | 23 | 15.1%
Y | 19 | 12.5%
M | 13 | 8.6%
G | 11 | 7.2%
S | 10 | 6.6%
W | 9 | 5.9%
E | 6 | 3.9%
L | 6 | 3.9%
O | 4 | 2.6%
Other values (9) | 23 | 15.1%

Decimal Number
Value | Count | Frequency (%)
2 | 81 | 29.0%
1 | 79 | 28.3%
3 | 49 | 17.6%
5 | 22 | 7.9%
6 | 15 | 5.4%
4 | 14 | 5.0%
7 | 9 | 3.2%
8 | 5 | 1.8%
9 | 3 | 1.1%
0 | 2 | 0.7%

Lowercase Letter
Value | Count | Frequency (%)
m | 16 | 44.4%
e | 10 | 27.8%
o | 8 | 22.2%
a | 1 | 2.8%
i | 1 | 2.8%

Other Punctuation
Value | Count | Frequency (%)
@ | 20 | 83.3%
, | 3 | 12.5%
(character not preserved) | 1 | 4.2%

Space Separator
Value | Count | Frequency (%)
(space) | 1403 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 508 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 508 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 28 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 50359 | 94.5%
Common | 2753 | 5.2%
Latin | 188 | 0.4%

Most frequent character per script

Hangul
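Value | Count | Frequency (%)
(character not preserved) | 2766 | 5.5%
(character not preserved) | 2461 | 4.9%
(character not preserved) | 2447 | 4.9%
(character not preserved) | 2386 | 4.7%
(character not preserved) | 2312 | 4.6%
(character not preserved) | 1661 | 3.3%
(character not preserved) | 1643 | 3.3%
(character not preserved) | 1486 | 3.0%
(character not preserved) | 1109 | 2.2%
(character not preserved) | 1104 | 2.2%
Other values (507) | 30984 | 61.5%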

Latin
Value | Count | Frequency (%)
C | 28 | 14.9%
A | 23 | 12.2%
Y | 19 | 10.1%
m | 16 | 8.5%
M | 13 | 6.9%
G | 11 | 5.9%
S | 10 | 5.3%
e | 10 | 5.3%
W | 9 | 4.8%
o | 8 | 4.3%
Other values (14) | 41 | 21.8%

Common
Value | Count | Frequency (%)
(space) | 1403 | 51.0%
) | 508 | 18.5%
( | 508 | 18.5%
2 | 81 | 2.9%
1 | 79 | 2.9%
3 | 49 | 1.8%
_ | 28 | 1.0%
5 | 22 | 0.8%
@ | 20 | 0.7%
6 | 15 | 0.5%
Other values (8) | 40 | 1.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 50359 | 94.5%
ASCII | 2940 | 5.5%
None | 1 | < 0.1%

Most frequent character per block

Hangul
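Value | Count | Frequency (%)
(character not preserved) | 2766 | 5.5%
(character not preserved) | 2461 | 4.9%
(character not preserved) | 2447 | 4.9%
(character not preserved) | 2386 | 4.7%
(character not preserved) | 2312 | 4.6%
(character not preserved) | 1661 | 3.3%
(character not preserved) | 1643 | 3.3%
(character not preserved) | 1486 | 3.0%
(character not preserved) | 1109 | 2.2%
(character not preserved) | 1104 | 2.2%
Other values (507) | 30984 | 61.5%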

ASCII
Value | Count | Frequency (%)
(space) | 1403 | 47.7%
) | 508 | 17.3%
( | 508 | 17.3%
2 | 81 | 2.8%
1 | 79 | 2.7%
3 | 49 | 1.7%
_ | 28 | 1.0%
C | 28 | 1.0%
A | 23 | 0.8%
5 | 22 | 0.7%
Other values (31) | 211 | 7.2%

None
Value | Count | Frequency (%)
(character not preserved) | 1 | 100.0%

교육인원
Real number (ℝ)

Distinct: 177
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 31.516192
Minimum: 1
Maximum: 1000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 58.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 6
Q1: 13
Median: 22.5
Q3: 37
95-th percentile: 90
Maximum: 1000
Range: 999
Interquartile range (IQR): 24

Descriptive statistics

Standard deviation: 38.907937
Coefficient of variation (CV): 1.234538
Kurtosis: 161.0529
Mean: 31.516192
Median Absolute Deviation (MAD): 10.5
Skewness: 9.2208869
Sum: 210213
Variance: 1513.8275
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
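
The 교육인원 statistics describe a strongly right-skewed variable: the mean (31.5) sits well above the median (22.5) and the maximum is 1000. The sketch below recomputes the mean and CV from the reported sum and standard deviation, then applies the common 1.5 × IQR upper-fence rule of thumb. That rule is not part of the report and is used here only as an illustration.

```python
# Values copied from the 교육인원 statistics.
n = 6670
total = 210213
std = 38.907937
q1, median, q3 = 13, 22.5, 37

mean = total / n   # reported: 31.516192
cv = std / mean    # reported: 1.234538
iqr = q3 - q1      # reported: 24

# Rule-of-thumb upper fence (an assumption, not the report's method).
upper_fence = q3 + 1.5 * iqr

# The ten largest values listed in the report.
reported_top_values = [1000, 972, 800, 554, 500, 459, 437, 436, 425, 420]
above_fence = [v for v in reported_top_values if v > upper_fence]

print(mean, cv, upper_fence, len(above_fence))
```

All ten listed maxima exceed the fence, which is in line with the high skewness and kurtosis. Whether they are data-entry errors or genuinely large sessions cannot be decided from the report alone.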
Value | Count | Frequency (%)
10 | 293 | 4.4%
20 | 288 | 4.3%
24 | 268 | 4.0%
15 | 254 | 3.8%
25 | 253 | 3.8%
11 | 214 | 3.2%
12 | 209 | 3.1%
30 | 184 | 2.8%
9 | 183 | 2.7%
13 | 181 | 2.7%
Other values (167) | 4343 | 65.1%

Minimum 10 values

Value | Count | Frequency (%)
1 | 30 | 0.4%
2 | 42 | 0.6%
3 | 34 | 0.5%
4 | 25 | 0.4%
5 | 95 | 1.4%
6 | 132 | 2.0%
7 | 131 | 2.0%
8 | 165 | 2.5%
9 | 183 | 2.7%
10 | 293 | 4.4%

Maximum 10 values

Value | Count | Frequency (%)
1000 | 1 | < 0.1%
972 | 1 | < 0.1%
800 | 1 | < 0.1%
554 | 1 | < 0.1%
500 | 2 | < 0.1%
459 | 2 | < 0.1%
437 | 1 | < 0.1%
436 | 1 | < 0.1%
425 | 1 | < 0.1%
420 | 1 | < 0.1%

교육시작시간
DateTime

Distinct: 70
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 52.2 KiB
Minimum: 2023-12-12 05:30:00
Maximum: 2023-12-12 20:00:00
Histogram with fixed size bins (bins=50)

교육종료시간
DateTime

Distinct: 73
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.2 KiB
Minimum: 2023-12-12 09:10:00
Maximum: 2023-12-12 20:40:00
Histogram with fixed size bins (bins=50)

Interactions


Correlations

Variable | 실적번호 | 지부코드 | 교육구분 | 교육구분_1 | 교육인원 | 교육시작시간 | 교육종료시간
실적번호 | 1.000 | 0.617 | 0.425 | 0.425 | 0.096 | 0.452 | 0.479
지부코드 | 0.617 | 1.000 | 0.594 | 0.594 | 0.176 | 0.714 | 0.778
교육구분 | 0.425 | 0.594 | 1.000 | 1.000 | 0.388 | 0.808 | 0.861
교육구분_1 | 0.425 | 0.594 | 1.000 | 1.000 | 0.388 | 0.808 | 0.861
교육인원 | 0.096 | 0.176 | 0.388 | 0.388 | 1.000 | 0.598 | 0.665
교육시작시간 | 0.452 | 0.714 | 0.808 | 0.808 | 0.598 | 1.000 | 0.989
교육종료시간 | 0.479 | 0.778 | 0.861 | 0.861 | 0.665 | 0.989 | 1.000

Variable | 지부코드 | 교육구분_1 | 교육구분
지부코드 | 1.000 | 0.390 | 0.390
교육구분_1 | 0.390 | 1.000 | 1.000
교육구분 | 0.390 | 1.000 | 1.000

Variable | 실적번호 | 교육인원 | 지부코드 | 교육구분 | 교육구분_1
실적번호 | 1.000 | -0.118 | 0.322 | 0.285 | 0.285
교육인원 | -0.118 | 1.000 | 0.080 | 0.182 | 0.182
지부코드 | 0.322 | 0.080 | 1.000 | 0.390 | 0.390
교육구분 | 0.285 | 0.182 | 0.390 | 1.000 | 1.000
교육구분_1 | 0.285 | 0.182 | 0.390 | 1.000 | 1.000
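
Every matrix above reports 1.000 between 교육구분 and 교육구분_1. The extracted text does not say which association measure each matrix uses, so the sketch below is only an illustration with plain (uncorrected) Cramér's V. It assumes the two columns pair up exactly as in the report's sample rows, and the `cramers_v` helper is written for this example, not taken from the profiling library.

```python
import math

# Assumed contingency counts: rows are 교육구분 codes 1, 2, 3, 4 and
# columns are the matching 교육구분_1 labels, with no off-diagonal rows.
table = [
    [4896, 0, 0, 0],
    [0, 37, 0, 0],
    [0, 0, 1271, 0],
    [0, 0, 0, 466],
]


def cramers_v(observed):
    """Uncorrected Cramér's V for a contingency table given as nested lists."""
    n = sum(sum(row) for row in observed)
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals))
    return math.sqrt(chi2 / (n * (k - 1)))


v = cramers_v(table)
print(round(v, 3))
```

A value of 1 means that knowing one column determines the other, which is the situation the high-correlation alerts point to.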

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

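First rows

Index | 실적번호 | 교육일자 | 지부코드 | 교육구분 | 교육구분_1 | 교육장소 | 교육인원 | 교육시작시간 | 교육종료시간
0 | 777624 | 2020-01-07 | 인천지부 | 3 | 기관/단체, 기업체 등 교통안전교육 | 171연대 | 162 | 10:00 | 12:00
1 | 777625 | 2020-01-07 | 인천지부 | 3 | 기관/단체, 기업체 등 교통안전교육 | 172연대 | 43 | 14:00 | 16:00
2 | 777863 | 2020-01-09 | 인천지부 | 4 | 노인 교통안전교육 | 인천노인인력개발센터 | 94 | 14:00 | 15:00
3 | 777864 | 2020-01-08 | 인천지부 | 4 | 노인 교통안전교육 | 인천노인인력개발센터 | 233 | 11:00 | 12:00
4 | 777865 | 2020-01-08 | 인천지부 | 4 | 노인 교통안전교육 | 인천노인인력개발센터 | 262 | 14:00 | 15:00
5 | 777867 | 2020-01-10 | 경북지부 | 2 | 청소년 교통안전교육 | 구미 보호관찰소 | 5 | 13:00 | 16:00
6 | 777868 | 2020-01-10 | 대전세종충남지부 | 4 | 노인 교통안전교육 | 당진시 노인복지관 | 50 | 13:30 | 14:30
7 | 777959 | 2020-01-13 | 인천지부 | 3 | 기관/단체, 기업체 등 교통안전교육 | 계양구청 | 31 | 10:00 | 12:00
8 | 777961 | 2020-01-15 | 인천지부 | 4 | 노인 교통안전교육 | 인천노인인력개발센터 | 25 | 10:00 | 11:00
9 | 777966 | 2020-01-16 | 인천지부 | 3 | 기관/단체, 기업체 등 교통안전교육 | 안산보호관찰소 | 60 | 10:00 | 12:00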
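
Last rows

Index | 실적번호 | 교육일자 | 지부코드 | 교육구분 | 교육구분_1 | 교육장소 | 교육인원 | 교육시작시간 | 교육종료시간
6660 | 784854 | 2020-11-11 | 울산경남지부 | 1 | 어린이 교통안전교육 | 하늘숲어린이집 외 2개소 | 59 | 10:00 | 12:00
6661 | 785189 | 2020-03-10 | 울산경남지부 | 1 | 어린이 교통안전교육 | 정지유치원 | 42 | 11:00 | 12:00
6662 | 785802 | 2020-04-20 | 대구지부 | 1 | 어린이 교통안전교육 | 달서초등학교병설유치원 | 31 | 10:30 | 11:30
6663 | 785803 | 2020-04-20 | 대구지부 | 1 | 어린이 교통안전교육 | 달서초등학교병설유치원 | 31 | 11:30 | 12:30
6664 | 788112 | 2020-05-24 | 울산경남지부 | 1 | 어린이 교통안전교육 | 예다움어린이집 | 17 | 10:30 | 11:30
6665 | 789143 | 2020-12-04 | 대구지부 | 1 | 어린이 교통안전교육 | 함지초등병설유치원 | 20 | 10:00 | 11:00
6666 | 789147 | 2020-11-30 | 대구지부 | 1 | 어린이 교통안전교육 | 문화유치원 | 58 | 13:00 | 14:00
6667 | 790305 | 2020-11-23 | 인천지부 | 3 | 기관/단체, 기업체 등 교통안전교육 | 한국가스기술공사 인천지사 교육장 | 15 | 16:00 | 17:00
6668 | 790306 | 2020-12-18 | 인천지부 | 3 | 기관/단체, 기업체 등 교통안전교육 | 비대면교육 | 63 | 11:00 | 12:00
6669 | 791798 | 2020-03-26 | 경기지부 | 1 | 어린이 교통안전교육 | 효동초등학교 | 509:50 (교육인원 and 교육시작시간 fused in the source) | 11:20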