Overview

Dataset statistics

Number of variables: 7
Number of observations: 84
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.1 KiB
Average record size in memory: 62.6 B

Variable types

Categorical: 2
Numeric: 5

Dataset

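Description: Traffic accident statistics by road type (일반국도, 고속국도, etc.) and by time of day. The figures cover traffic accidents investigated and processed by the police; only accidents with human casualties are counted. Based on data from the Traffic Accident Analysis System (http://taas.koroad.or.kr).
URL: https://www.data.go.kr/data/15070257/fileData.do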

Alerts

[High correlation] 사고건수 is highly overall correlated with 사망자수 and 3 other fields
[High correlation] 사망자수 is highly overall correlated with 사고건수 and 3 other fields
[High correlation] 중상자수 is highly overall correlated with 사고건수 and 3 other fields
[High correlation] 경상자수 is highly overall correlated with 사고건수 and 3 other fields
[High correlation] 부상신고자수 is highly overall correlated with 사고건수 and 3 other fields
[Zeros] 사망자수 has 1 (1.2%) zeros

Reproduction

Analysis started: 2023-12-12 22:23:04.187009
Analysis finished: 2023-12-12 22:23:06.561003
Duration: 2.37 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
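
The timestamps above are enough to check the reported duration, and a report of this kind can be regenerated from the source CSV. The following is a minimal Python sketch, not the configuration actually used for this run: the function name `build_report`, the CSV path, and the report title are hypothetical, and `pandas.read_csv` may need an explicit `encoding` argument for this file. `ProfileReport` and its `to_file` method come from ydata-profiling.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def elapsed_seconds(started: str, finished: str) -> float:
    """Return the number of seconds between two report timestamps."""
    start = datetime.strptime(started, TIMESTAMP_FORMAT)
    finish = datetime.strptime(finished, TIMESTAMP_FORMAT)
    return (finish - start).total_seconds()


def build_report(csv_path: str, html_path: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Third-party imports live inside the function so that the timing
    # helper above still works when these packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)  # assumption: default encoding works
    ProfileReport(df, title="Traffic accidents by road type and time").to_file(html_path)


# Timestamps copied from the Reproduction section.
duration = elapsed_seconds("2023-12-12 22:23:04.187009",
                           "2023-12-12 22:23:06.561003")
print(round(duration, 2))
```

Calling `build_report("accidents.csv")` with a local copy of the dataset would write `report.html`; the file name here is a placeholder.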

Variables

도로종류 (road type)
Categorical

Distinct: 7
Distinct (%): 8.3%
Missing: 0
Missing (%): 0.0%
Memory size: 804.0 B

일반국도: 12
지방도: 12
특별광역시도: 12
시도: 12
군도: 12
Other values (2): 24

Length

Max length: 6
Median length: 4
Mean length: 3.2857143
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 일반국도
2nd row: 일반국도
3rd row: 일반국도
4th row: 일반국도
5th row: 일반국도

Common Values

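Value | Count | Frequency (%)
일반국도 (general national road) | 12 | 14.3%
지방도 (provincial road) | 12 | 14.3%
특별광역시도 (special/metropolitan city road) | 12 | 14.3%
시도 (city road) | 12 | 14.3%
군도 (county road) | 12 | 14.3%
고속국도 (expressway) | 12 | 14.3%
기타 (other) | 12 | 14.3%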

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values of 도로종류; all seven categories have 12 rows (14.3%) each]

시간대 (time slot, two-hour intervals)
Categorical

Distinct: 12
Distinct (%): 14.3%
Missing: 0
Missing (%): 0.0%
Memory size: 804.0 B

00시-02시: 7
02시-04시: 7
04시-06시: 7
06시-08시: 7
08시-10시: 7
Other values (7): 49

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 00시-02시
2nd row: 02시-04시
3rd row: 04시-06시
4th row: 06시-08시
5th row: 08시-10시

Common Values

Value | Count | Frequency (%)
00시-02시 | 7 | 8.3%
02시-04시 | 7 | 8.3%
04시-06시 | 7 | 8.3%
06시-08시 | 7 | 8.3%
08시-10시 | 7 | 8.3%
10시-12시 | 7 | 8.3%
12시-14시 | 7 | 8.3%
14시-16시 | 7 | 8.3%
16시-18시 | 7 | 8.3%
18시-20시 | 7 | 8.3%
Other values (2) | 14 | 16.7%

Length

[Figure: histogram of lengths of the category]
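
The percentages reported for the two categorical variables follow directly from the counts and the 84 observations. The short Python sketch below redoes that arithmetic with numbers copied from this report; the helper name `pct` is only illustrative. It also checks that 7 road types times 12 time slots gives 84, which is consistent with each road type and time slot combination appearing exactly once.

```python
N_ROWS = 84          # number of observations
ROAD_TYPES = 7       # distinct values of 도로종류
TIME_SLOTS = 12      # distinct values of 시간대


def pct(count: int, total: int = N_ROWS) -> float:
    """Share of `count` in `total`, as a percentage rounded to one decimal."""
    return round(100 * count / total, 1)


print(pct(ROAD_TYPES))   # Distinct (%) for 도로종류
print(pct(TIME_SLOTS))   # Distinct (%) for 시간대
print(pct(12))           # frequency of each road type (12 rows)
print(pct(7))            # frequency of each time slot (7 rows)
print(pct(14))           # "Other values (2)" for 시간대 (2 x 7 rows)
print(ROAD_TYPES * TIME_SLOTS == N_ROWS)
```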

사고건수 (number of accidents)
Real number (ℝ)

HIGH CORRELATION

Distinct: 83
Distinct (%): 98.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2343.2857
Minimum: 39
Maximum: 10750
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 888.0 B

Quantile statistics

Minimum: 39
5-th percentile: 171.1
Q1: 516.75
Median: 1290
Q3: 2512.75
95-th percentile: 8506.65
Maximum: 10750
Range: 10711
Interquartile range (IQR): 1996

Descriptive statistics

Standard deviation: 2761.5601
Coefficient of variation (CV): 1.1784991
Kurtosis: 1.3374414
Mean: 2343.2857
Median Absolute Deviation (MAD): 897.5
Skewness: 1.5822691
Sum: 196836
Variance: 7626214.2
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
219 | 2 | 2.4%
692 | 1 | 1.2%
923 | 1 | 1.2%
150 | 1 | 1.2%
183 | 1 | 1.2%
274 | 1 | 1.2%
492 | 1 | 1.2%
878 | 1 | 1.2%
1058 | 1 | 1.2%
1027 | 1 | 1.2%
Other values (73) | 73 | 86.9%

Minimum 10 values

Value | Count | Frequency (%)
39 | 1 | 1.2%
84 | 1 | 1.2%
98 | 1 | 1.2%
150 | 1 | 1.2%
169 | 1 | 1.2%
183 | 1 | 1.2%
185 | 1 | 1.2%
192 | 1 | 1.2%
219 | 2 | 2.4%
274 | 1 | 1.2%

Maximum 10 values

Value | Count | Frequency (%)
10750 | 1 | 1.2%
10167 | 1 | 1.2%
9041 | 1 | 1.2%
8893 | 1 | 1.2%
8520 | 1 | 1.2%
8431 | 1 | 1.2%
7653 | 1 | 1.2%
7588 | 1 | 1.2%
7526 | 1 | 1.2%
7131 | 1 | 1.2%
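
Several of the summary figures reported for 사고건수 are tied together by simple identities: mean = sum / n, range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, and variance = standard deviation squared. The Python sketch below recomputes them from the values printed in this report, as a consistency check only; small differences in the last digits come from the rounding of the printed inputs.

```python
# Values copied from the 사고건수 section of this report.
n = 84                      # number of observations
total = 196836              # Sum
minimum, maximum = 39, 10750
q1, q3 = 516.75, 2512.75
std = 2761.5601             # Standard deviation

mean = total / n            # reported Mean: 2343.2857
value_range = maximum - minimum   # reported Range: 10711
iqr = q3 - q1               # reported IQR: 1996
cv = std / mean             # reported CV: 1.1784991
variance = std ** 2         # reported Variance: 7626214.2

print(f"mean={mean:.4f} range={value_range} iqr={iqr:g}")
print(f"cv={cv:.7f} variance={variance:.1f}")
```

The same identities can be applied to the other four numeric variables by substituting their reported values.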

사망자수 (number of deaths)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 50
Distinct (%): 59.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.559524
Minimum: 0
Maximum: 89
Zeros: 1
Zeros (%): 1.2%
Negative: 0
Negative (%): 0.0%
Memory size: 888.0 B

Quantile statistics

Minimum: 0
5-th percentile: 4.15
Q1: 15.75
Median: 29.5
Q3: 47
95-th percentile: 72.95
Maximum: 89
Range: 89
Interquartile range (IQR): 31.25

Descriptive statistics

Standard deviation: 21.451178
Coefficient of variation (CV): 0.6588296
Kurtosis: -0.26223961
Mean: 32.559524
Median Absolute Deviation (MAD): 14.5
Skewness: 0.66106243
Sum: 2735
Variance: 460.15304
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
67 | 3 | 3.6%
38 | 3 | 3.6%
40 | 3 | 3.6%
9 | 3 | 3.6%
15 | 3 | 3.6%
18 | 3 | 3.6%
19 | 3 | 3.6%
4 | 3 | 3.6%
14 | 2 | 2.4%
47 | 2 | 2.4%
Other values (40) | 56 | 66.7%

Minimum 10 values

Value | Count | Frequency (%)
0 | 1 | 1.2%
3 | 1 | 1.2%
4 | 3 | 3.6%
5 | 1 | 1.2%
7 | 2 | 2.4%
8 | 1 | 1.2%
9 | 3 | 3.6%
10 | 1 | 1.2%
11 | 1 | 1.2%
12 | 2 | 2.4%

Maximum 10 values

Value | Count | Frequency (%)
89 | 1 | 1.2%
85 | 1 | 1.2%
81 | 1 | 1.2%
74 | 2 | 2.4%
67 | 3 | 3.6%
65 | 2 | 2.4%
64 | 1 | 1.2%
60 | 1 | 1.2%
53 | 1 | 1.2%
51 | 2 | 2.4%

중상자수 (number of serious injuries)
Real number (ℝ)

HIGH CORRELATION

Distinct: 83
Distinct (%): 98.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 615.65476
Minimum: 9
Maximum: 2331
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 888.0 B

Quantile statistics

Minimum: 9
5-th percentile: 56.9
Q1: 159
Median: 385
Q3: 709.75
95-th percentile: 2053.05
Maximum: 2331
Range: 2322
Interquartile range (IQR): 550.75

Descriptive statistics

Standard deviation: 648.88153
Coefficient of variation (CV): 1.0539698
Kurtosis: 0.66327135
Mean: 615.65476
Median Absolute Deviation (MAD): 245
Skewness: 1.3921276
Sum: 51715
Variance: 421047.24
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
96 | 2 | 2.4%
230 | 1 | 1.2%
392 | 1 | 1.2%
56 | 1 | 1.2%
76 | 1 | 1.2%
167 | 1 | 1.2%
315 | 1 | 1.2%
390 | 1 | 1.2%
365 | 1 | 1.2%
371 | 1 | 1.2%
Other values (73) | 73 | 86.9%

Minimum 10 values

Value | Count | Frequency (%)
9 | 1 | 1.2%
28 | 1 | 1.2%
44 | 1 | 1.2%
48 | 1 | 1.2%
56 | 1 | 1.2%
62 | 1 | 1.2%
67 | 1 | 1.2%
73 | 1 | 1.2%
76 | 1 | 1.2%
79 | 1 | 1.2%

Maximum 10 values

Value | Count | Frequency (%)
2331 | 1 | 1.2%
2325 | 1 | 1.2%
2147 | 1 | 1.2%
2098 | 1 | 1.2%
2064 | 1 | 1.2%
1991 | 1 | 1.2%
1863 | 1 | 1.2%
1815 | 1 | 1.2%
1814 | 1 | 1.2%
1809 | 1 | 1.2%

경상자수 (number of minor injuries)
Real number (ℝ)

HIGH CORRELATION

Distinct: 81
Distinct (%): 96.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2528.9286
Minimum: 29
Maximum: 11410
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 888.0 B

Quantile statistics

Minimum: 29
5-th percentile: 165.15
Q1: 675.25
Median: 1318
Q3: 3106.75
95-th percentile: 9174.85
Maximum: 11410
Range: 11381
Interquartile range (IQR): 2431.5

Descriptive statistics

Standard deviation: 2948.8308
Coefficient of variation (CV): 1.1660396
Kurtosis: 1.3240529
Mean: 2528.9286
Median Absolute Deviation (MAD): 913.5
Skewness: 1.5718066
Sum: 212430
Variance: 8695602.9
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
329 | 2 | 2.4%
969 | 2 | 2.4%
833 | 2 | 2.4%
306 | 1 | 1.2%
168 | 1 | 1.2%
201 | 1 | 1.2%
251 | 1 | 1.2%
467 | 1 | 1.2%
991 | 1 | 1.2%
953 | 1 | 1.2%
Other values (71) | 71 | 84.5%

Minimum 10 values

Value | Count | Frequency (%)
29 | 1 | 1.2%
70 | 1 | 1.2%
102 | 1 | 1.2%
146 | 1 | 1.2%
165 | 1 | 1.2%
166 | 1 | 1.2%
168 | 1 | 1.2%
191 | 1 | 1.2%
201 | 1 | 1.2%
228 | 1 | 1.2%

Maximum 10 values

Value | Count | Frequency (%)
11410 | 1 | 1.2%
11021 | 1 | 1.2%
9704 | 1 | 1.2%
9434 | 1 | 1.2%
9193 | 1 | 1.2%
9072 | 1 | 1.2%
8143 | 1 | 1.2%
7936 | 1 | 1.2%
7916 | 1 | 1.2%
7826 | 1 | 1.2%

부상신고자수 (number of reported injuries)
Real number (ℝ)

HIGH CORRELATION

Distinct: 76
Distinct (%): 90.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 210.21429
Minimum: 6
Maximum: 918
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 888.0 B

Quantile statistics

Minimum: 6
5-th percentile: 20.15
Q1: 61.25
Median: 121.5
Q3: 259.5
95-th percentile: 736.7
Maximum: 918
Range: 912
Interquartile range (IQR): 198.25

Descriptive statistics

Standard deviation: 225.18571
Coefficient of variation (CV): 1.0712198
Kurtosis: 1.5783652
Mean: 210.21429
Median Absolute Deviation (MAD): 81.5
Skewness: 1.5841069
Sum: 17658
Variance: 50708.604
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
114 | 3 | 3.6%
59 | 2 | 2.4%
34 | 2 | 2.4%
137 | 2 | 2.4%
9 | 2 | 2.4%
560 | 2 | 2.4%
22 | 2 | 2.4%
74 | 1 | 1.2%
97 | 1 | 1.2%
91 | 1 | 1.2%
Other values (66) | 66 | 78.6%

Minimum 10 values

Value | Count | Frequency (%)
6 | 1 | 1.2%
9 | 2 | 2.4%
13 | 1 | 1.2%
20 | 1 | 1.2%
21 | 1 | 1.2%
22 | 2 | 2.4%
23 | 1 | 1.2%
24 | 1 | 1.2%
27 | 1 | 1.2%
28 | 1 | 1.2%

Maximum 10 values

Value | Count | Frequency (%)
918 | 1 | 1.2%
860 | 1 | 1.2%
776 | 1 | 1.2%
741 | 1 | 1.2%
740 | 1 | 1.2%
718 | 1 | 1.2%
667 | 1 | 1.2%
623 | 1 | 1.2%
587 | 1 | 1.2%
560 | 2 | 2.4%

Interactions

[Figures: 25 interaction plots; images not preserved]

Correlations

[Figure: correlation heatmap]

Variable | 도로종류 | 시간대 | 사고건수 | 사망자수 | 중상자수 | 경상자수 | 부상신고자수
도로종류 | 1.000 | 0.000 | 0.648 | 0.695 | 0.600 | 0.662 | 0.566
시간대 | 0.000 | 1.000 | 0.000 | 0.000 | 0.357 | 0.000 | 0.000
사고건수 | 0.648 | 0.000 | 1.000 | 0.675 | 0.974 | 0.993 | 0.980
사망자수 | 0.695 | 0.000 | 0.675 | 1.000 | 0.710 | 0.615 | 0.676
중상자수 | 0.600 | 0.357 | 0.974 | 0.710 | 1.000 | 0.963 | 0.954
경상자수 | 0.662 | 0.000 | 0.993 | 0.615 | 0.963 | 1.000 | 0.987
부상신고자수 | 0.566 | 0.000 | 0.980 | 0.676 | 0.954 | 0.987 | 1.000

[Figure: correlation heatmap]

Variable | 시간대 | 도로종류
시간대 | 1.000 | 0.000
도로종류 | 0.000 | 1.000

[Figure: correlation heatmap]

Variable | 사고건수 | 사망자수 | 중상자수 | 경상자수 | 부상신고자수 | 도로종류 | 시간대
사고건수 | 1.000 | 0.824 | 0.986 | 0.987 | 0.971 | 0.408 | 0.000
사망자수 | 0.824 | 1.000 | 0.872 | 0.816 | 0.779 | 0.436 | 0.000
중상자수 | 0.986 | 0.872 | 1.000 | 0.972 | 0.954 | 0.365 | 0.151
경상자수 | 0.987 | 0.816 | 0.972 | 1.000 | 0.988 | 0.422 | 0.000
부상신고자수 | 0.971 | 0.779 | 0.954 | 0.988 | 1.000 | 0.337 | 0.000
도로종류 | 0.408 | 0.436 | 0.365 | 0.422 | 0.337 | 1.000 | 0.000
시간대 | 0.000 | 0.000 | 0.151 | 0.000 | 0.000 | 0.000 | 1.000
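
The report does not state which coefficient each matrix holds. Purely as an illustration of how one pairwise entry can be computed, the Python sketch below evaluates Spearman's rank correlation (Pearson correlation of the ranks), which is an assumed choice and not necessarily the one used here. The inputs are 사고건수 and 경상자수 as read from the first ten rows of the Sample section, so the result is not expected to match the matrices, which are computed from all 84 rows. The function names are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the group of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# First ten sample rows (일반국도, 00시-02시 through 18시-20시).
accidents = [692, 350, 486, 1305, 2286, 2164, 2224, 2436, 2743, 2778]
minor_injuries = [751, 329, 455, 1349, 2587, 2541, 2613, 3101, 3401, 3124]

rho = spearman(accidents, minor_injuries)
print(round(rho, 3))
```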

Missing values

[Figure: a simple visualization of nullity by column]
[Figure: nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion]

Sample

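First rows

Row | 도로종류 | 시간대 | 사고건수 | 사망자수 | 중상자수 | 경상자수 | 부상신고자수
0 | 일반국도 | 00시-02시 | 692 | 29 | 230 | 751 | 69
1 | 일반국도 | 02시-04시 | 350 | 23 | 121 | 329 | 36
2 | 일반국도 | 04시-06시 | 486 | 37 | 182 | 455 | 43
3 | 일반국도 | 06시-08시 | 1305 | 30 | 401 | 1349 | 114
4 | 일반국도 | 08시-10시 | 2286 | 38 | 575 | 2587 | 215
5 | 일반국도 | 10시-12시 | 2164 | 40 | 715 | 2541 | 213
6 | 일반국도 | 12시-14시 | 2224 | 43 | 678 | 2613 | 218
7 | 일반국도 | 14시-16시 | 2436 | 38 | 708 | 3101 | 257
8 | 일반국도 | 16시-18시 | 2743 | 49 | 662 | 3401 | 269
9 | 일반국도 | 18시-20시 | 2778 | 53 | 721 | 3124 | 285

Last rows

Row | 도로종류 | 시간대 | 사고건수 | 사망자수 | 중상자수 | 경상자수 | 부상신고자수
74 | 기타 | 04시-06시 | 219 | 7 | 73 | 191 | 21
75 | 기타 | 06시-08시 | 665 | 15 | 168 | 616 | 59
76 | 기타 | 08시-10시 | 1346 | 9 | 348 | 1297 | 102
77 | 기타 | 10시-12시 | 1354 | 19 | 355 | 1360 | 114
78 | 기타 | 12시-14시 | 1543 | 15 | 387 | 1559 | 159
79 | 기타 | 14시-16시 | 1552 | 14 | 373 | 1571 | 172
80 | 기타 | 16시-18시 | 1832 | 23 | 413 | 1820 | 203
81 | 기타 | 18시-20시 | 1781 | 8 | 396 | 1735 | 182
82 | 기타 | 20시-22시 | 1233 | 10 | 299 | 1171 | 114
83 | 기타 | 22시-24시 | 766 | 4 | 186 | 768 | 81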