Overview

Dataset statistics

Number of variables: 8
Number of observations: 158
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 10.9 KiB
Average record size in memory: 70.8 B

Variable types

Categorical: 1
Text: 1
Numeric: 6

Dataset

Description: Expressway traffic accidents by category (2018) [부문별 고속도로 교통사고(2018)]
Author: 도로교통공단 (Korea Road Traffic Authority, KoROAD)
URL: https://www.data.go.kr/data/15094161/fileData.do

Alerts

발생건수 is highly overall correlated with 사망자수 and 4 other fields (High correlation)
사망자수 is highly overall correlated with 발생건수 and 3 other fields (High correlation)
부상자수 is highly overall correlated with 발생건수 and 4 other fields (High correlation)
중상 is highly overall correlated with 발생건수 and 4 other fields (High correlation)
경상 is highly overall correlated with 발생건수 and 4 other fields (High correlation)
부상신고 is highly overall correlated with 발생건수 and 3 other fields (High correlation)
사망자수 has 60 (38.0%) zeros (Zeros)
중상 has 19 (12.0%) zeros (Zeros)
경상 has 6 (3.8%) zeros (Zeros)
부상신고 has 71 (44.9%) zeros (Zeros)
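
The zero percentages in the alerts follow from the zero counts and the 158 observations reported in the overview. A minimal Python sketch of that arithmetic; the names ZERO_COUNTS and zero_share are illustrative and not part of the report.

```python
# Zero counts copied from the alerts; 158 is the reported number of observations.
N_OBSERVATIONS = 158
ZERO_COUNTS = {"사망자수": 60, "중상": 19, "경상": 6, "부상신고": 71}


def zero_share(count: int, total: int = N_OBSERVATIONS) -> float:
    """Return the share of zero-valued rows as a percentage, rounded to one decimal."""
    return round(100 * count / total, 1)


zero_shares = {name: zero_share(count) for name, count in ZERO_COUNTS.items()}

for name, share in zero_shares.items():
    print(f"{name}: {share}% zeros")
```

Nearly half of the 부상신고 rows and more than a third of the 사망자수 rows are zero, which is worth keeping in mind when reading the means and skewness values of those variables.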

Reproduction

Analysis started: 2024-04-20 14:55:13.443436
Analysis finished: 2024-04-20 14:55:22.475848
Duration: 9.03 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
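
The reported duration can be checked against the two timestamps. A small sketch using only the standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamp layout used by the "Analysis started" and "Analysis finished" lines.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-04-20 14:55:13.443436", FMT)
finished = datetime.strptime("2024-04-20 14:55:22.475848", FMT)

# Elapsed wall-clock time between the two timestamps, in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```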

Variables

시도 (province or metropolitan city)
Categorical

Distinct: 16
Distinct (%): 10.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 KiB

Value  Count
경기  27
경북  17
전남  14
경남  14
충남  13
Other values (11)  73

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 2
Unique (%): 1.3%

Sample

1st row: 서울
2nd row: 서울
3rd row: 서울
4th row: 서울
5th row: 서울

Common Values

Value  Count  Frequency (%)
경기  27  17.1%
경북  17  10.8%
전남  14  8.9%
경남  14  8.9%
충남  13  8.2%
전북  13  8.2%
충북  11  7.0%
서울  10  6.3%
강원  10  6.3%
인천  8  5.1%
Other values (6)  21  13.3%

Length

Histogram of lengths of the category

시군구 (city, county, or district)
Text

Distinct: 150
Distinct (%): 94.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 KiB

Length

Max length: 7
Median length: 3
Mean length: 2.9620253
Min length: 2

Characters and Unicode

Total characters: 468
Distinct characters: 111
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 145
Unique (%): 91.8%
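
"Distinct" counts how many different values occur, while "Unique" counts values that occur exactly once. The sketch below, a check under that reading, ties the two figures together using the repeated district names listed in this variable's value counts (동구, 서구, 북구, 강서구, 중구) and also recomputes the mean length. All names in the code are illustrative.

```python
N_OBSERVATIONS = 158
TOTAL_CHARACTERS = 468
DISTINCT_VALUES = 150

# District names reported with a count above 1 in the value counts of this variable.
REPEATED = {"동구": 3, "서구": 3, "북구": 3, "강서구": 2, "중구": 2}


def percent(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to one decimal."""
    return round(100 * part / whole, 1)


# Values that occur exactly once are the distinct values minus the repeated ones.
unique_values = DISTINCT_VALUES - len(REPEATED)
# Every row is either one of the repeated names or a value that occurs once.
rows_accounted_for = sum(REPEATED.values()) + unique_values

mean_length = TOTAL_CHARACTERS / N_OBSERVATIONS
distinct_pct = percent(DISTINCT_VALUES, N_OBSERVATIONS)
unique_pct = percent(unique_values, N_OBSERVATIONS)

print(unique_values, rows_accounted_for, round(mean_length, 7), distinct_pct, unique_pct)
```

The repeats are generic district names such as 동구 that exist in several cities, so the text column alone does not identify a row; it needs to be read together with 시도.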

Sample

1st row: 성북구
2nd row: 강서구
3rd row: 강남구
4th row: 강동구
5th row: 송파구

Common Values

Value  Count  Frequency (%)
동구  3  1.9%
서구  3  1.9%
북구  3  1.9%
강서구  2  1.3%
중구  2  1.3%
함평군  1  0.6%
영천시  1  0.6%
영광군  1  0.6%
구례군  1  0.6%
보성군  1  0.6%
Other values (140)  140  88.6%

Most occurring characters

Value  Count  Frequency (%)
[?]  68  14.5%
[?]  58  12.4%
[?]  38  8.1%
[?]  18  3.8%
[?]  17  3.6%
[?]  16  3.4%
[?]  16  3.4%
[?]  13  2.8%
[?]  9  1.9%
[?]  9  1.9%
Other values (101)  206  44.0%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  466  99.6%
Close Punctuation  1  0.2%
Open Punctuation  1  0.2%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
[?]  68  14.6%
[?]  58  12.4%
[?]  38  8.2%
[?]  18  3.9%
[?]  17  3.6%
[?]  16  3.4%
[?]  16  3.4%
[?]  13  2.8%
[?]  9  1.9%
[?]  9  1.9%
Other values (99)  204  43.8%

Close Punctuation
Value  Count  Frequency (%)
)  1  100.0%

Open Punctuation
Value  Count  Frequency (%)
(  1  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  466  99.6%
Common  2  0.4%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
[?]  68  14.6%
[?]  58  12.4%
[?]  38  8.2%
[?]  18  3.9%
[?]  17  3.6%
[?]  16  3.4%
[?]  16  3.4%
[?]  13  2.8%
[?]  9  1.9%
[?]  9  1.9%
Other values (99)  204  43.8%

Common
Value  Count  Frequency (%)
)  1  50.0%
(  1  50.0%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  466  99.6%
ASCII  2  0.4%

Most frequent character per block

Hangul
Value  Count  Frequency (%)
[?]  68  14.6%
[?]  58  12.4%
[?]  38  8.2%
[?]  18  3.9%
[?]  17  3.6%
[?]  16  3.4%
[?]  16  3.4%
[?]  13  2.8%
[?]  9  1.9%
[?]  9  1.9%
Other values (99)  204  43.8%

ASCII
Value  Count  Frequency (%)
)  1  50.0%
(  1  50.0%

발생건수 (number of accidents)
Real number (ℝ)

HIGH CORRELATION

Distinct: 63
Distinct (%): 39.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25.816456
Minimum: 1
Maximum: 317
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4.25
median: 13.5
Q3: 30.75
95-th percentile: 93.3
Maximum: 317
Range: 316
Interquartile range (IQR): 26.5

Descriptive statistics

Standard deviation: 38.473556
Coefficient of variation (CV): 1.4902726
Kurtosis: 22.989331
Mean: 25.816456
Median Absolute Deviation (MAD): 10.5
Skewness: 3.9891947
Sum: 4079
Variance: 1480.2145
Monotonicity: Not monotonic
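
Several of the 발생건수 figures are linked by simple identities: the mean is the sum divided by the number of observations, the range is maximum minus minimum, the IQR is Q3 minus Q1, the variance is the squared standard deviation, and the CV is the standard deviation divided by the mean. A short sketch that recomputes them from the reported values; the constant names are illustrative.

```python
# Figures copied from the report for 발생건수.
N_OBSERVATIONS = 158
TOTAL = 4079
STD_DEV = 38.473556
MINIMUM, MAXIMUM = 1, 317
Q1, Q3 = 4.25, 30.75

mean = TOTAL / N_OBSERVATIONS   # reported as 25.816456
value_range = MAXIMUM - MINIMUM  # reported as 316
iqr = Q3 - Q1                    # reported as 26.5
variance = STD_DEV ** 2          # reported as 1480.2145
cv = STD_DEV / mean              # reported as 1.4902726

print(round(mean, 6), value_range, iqr, round(variance, 4), round(cv, 7))
```

A CV near 1.5 together with a median of 13.5 against a mean of about 25.8 points to a strongly right-skewed distribution, consistent with the reported skewness.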

Histogram with fixed size bins (bins=50)

Common Values

Value  Count  Frequency (%)
1  14  8.9%
2  11  7.0%
3  10  6.3%
5  7  4.4%
8  6  3.8%
11  6  3.8%
16  5  3.2%
9  5  3.2%
4  5  3.2%
17  5  3.2%
Other values (53)  84  53.2%

Minimum values

Value  Count  Frequency (%)
1  14  8.9%
2  11  7.0%
3  10  6.3%
4  5  3.2%
5  7  4.4%
6  4  2.5%
7  4  2.5%
8  6  3.8%
9  5  3.2%
10  4  2.5%

Maximum values

Value  Count  Frequency (%)
317  1  0.6%
197  1  0.6%
152  1  0.6%
118  1  0.6%
110  1  0.6%
105  1  0.6%
102  1  0.6%
95  1  0.6%
93  1  0.6%
88  1  0.6%

사망자수 (number of deaths)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 9
Distinct (%): 5.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.5949367
Minimum: 0
Maximum: 8
Zeros: 60
Zeros (%): 38.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 2
95-th percentile: 6
Maximum: 8
Range: 8
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.9353828
Coefficient of variation (CV): 1.2134543
Kurtosis: 1.9825869
Mean: 1.5949367
Median Absolute Deviation (MAD): 1
Skewness: 1.5226984
Sum: 252
Variance: 3.7457067
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=9)

Common Values

Value  Count  Frequency (%)
0  60  38.0%
1  38  24.1%
2  21  13.3%
3  18  11.4%
5  6  3.8%
4  6  3.8%
7  4  2.5%
8  3  1.9%
6  2  1.3%

Minimum values

Value  Count  Frequency (%)
0  60  38.0%
1  38  24.1%
2  21  13.3%
3  18  11.4%
4  6  3.8%
5  6  3.8%
6  2  1.3%
7  4  2.5%
8  3  1.9%

Maximum values

Value  Count  Frequency (%)
8  3  1.9%
7  4  2.5%
6  2  1.3%
5  6  3.8%
4  6  3.8%
3  18  11.4%
2  21  13.3%
1  38  24.1%
0  60  38.0%
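
Because 사망자수 has only nine distinct values, its value counts list the whole distribution, so the summary statistics can be rebuilt from them. The sketch below expands the counts into a list and recomputes the figures with Python's statistics module. The reported variance matches the sample (n - 1) formula, which is an observation from the numbers rather than something the report states; the variable names are illustrative.

```python
import statistics

# value -> count, copied from the 사망자수 value counts.
FREQUENCIES = {0: 60, 1: 38, 2: 21, 3: 18, 4: 6, 5: 6, 6: 2, 7: 4, 8: 3}

# Expand the frequency table into one entry per observation.
values = [value for value, count in sorted(FREQUENCIES.items()) for _ in range(count)]

n = len(values)                          # reported: 158 observations
total = sum(values)                      # reported Sum: 252
mean = statistics.mean(values)           # reported: 1.5949367
variance = statistics.variance(values)   # sample variance (n - 1); reported: 3.7457067
std_dev = statistics.stdev(values)       # reported: 1.9353828
median = statistics.median(values)       # reported: 1
cv = std_dev / mean                      # reported: 1.2134543
# Median absolute deviation: median of the absolute distances from the median.
mad = statistics.median([abs(value - median) for value in values])

print(n, total, round(mean, 7), round(variance, 7), round(std_dev, 7), median, mad)
```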

부상자수 (number of injured)
Real number (ℝ)

HIGH CORRELATION

Distinct: 88
Distinct (%): 55.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 56.411392
Minimum: 1
Maximum: 760
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1.85
Q1: 8
median: 27
Q3: 68.25
95-th percentile: 197.3
Maximum: 760
Range: 759
Interquartile range (IQR): 60.25

Descriptive statistics

Standard deviation: 88.042269
Coefficient of variation (CV): 1.5607179
Kurtosis: 27.068975
Mean: 56.411392
Median Absolute Deviation (MAD): 23
Skewness: 4.2865339
Sum: 8913
Variance: 7751.4411
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values

Value  Count  Frequency (%)
2  9  5.7%
1  8  5.1%
5  7  4.4%
4  7  4.4%
26  5  3.2%
8  4  2.5%
17  4  2.5%
10  4  2.5%
3  4  2.5%
51  4  2.5%
Other values (78)  102  64.6%

Minimum values

Value  Count  Frequency (%)
1  8  5.1%
2  9  5.7%
3  4  2.5%
4  7  4.4%
5  7  4.4%
6  1  0.6%
7  2  1.3%
8  4  2.5%
9  4  2.5%
10  4  2.5%

Maximum values

Value  Count  Frequency (%)
760  1  0.6%
368  1  0.6%
342  1  0.6%
315  1  0.6%
281  1  0.6%
270  1  0.6%
234  1  0.6%
199  1  0.6%
197  1  0.6%
180  1  0.6%

중상 (serious injuries)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 39
Distinct (%): 24.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.335443
Minimum: 0
Maximum: 138
Zeros: 19
Zeros (%): 12.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1.25
median: 7
Q3: 16.75
95-th percentile: 41.45
Maximum: 138
Range: 138
Interquartile range (IQR): 15.5

Descriptive statistics

Standard deviation: 17.467572
Coefficient of variation (CV): 1.4160474
Kurtosis: 18.50577
Mean: 12.335443
Median Absolute Deviation (MAD): 6
Skewness: 3.5412445
Sum: 1949
Variance: 305.11606
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=39)

Common Values

Value  Count  Frequency (%)
1  21  13.3%
0  19  12.0%
5  10  6.3%
2  9  5.7%
4  9  5.7%
9  8  5.1%
8  7  4.4%
17  7  4.4%
3  6  3.8%
15  5  3.2%
Other values (29)  57  36.1%

Minimum values

Value  Count  Frequency (%)
0  19  12.0%
1  21  13.3%
2  9  5.7%
3  6  3.8%
4  9  5.7%
5  10  6.3%
6  4  2.5%
7  2  1.3%
8  7  4.4%
9  8  5.1%

Maximum values

Value  Count  Frequency (%)
138  1  0.6%
77  1  0.6%
68  2  1.3%
67  1  0.6%
51  1  0.6%
45  1  0.6%
44  1  0.6%
41  1  0.6%
40  1  0.6%
39  1  0.6%

경상 (minor injuries)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 73
Distinct (%): 46.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 38.829114
Minimum: 0
Maximum: 494
Zeros: 6
Zeros (%): 3.8%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 5.25
median: 19
Q3: 48.75
95-th percentile: 131.55
Maximum: 494
Range: 494
Interquartile range (IQR): 43.5

Descriptive statistics

Standard deviation: 59.648567
Coefficient of variation (CV): 1.5361815
Kurtosis: 22.986021
Mean: 38.829114
Median Absolute Deviation (MAD): 16
Skewness: 3.9752081
Sum: 6135
Variance: 3557.9515
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values

Value  Count  Frequency (%)
1  9  5.7%
3  7  4.4%
2  7  4.4%
0  6  3.8%
8  6  3.8%
5  6  3.8%
11  5  3.2%
7  5  3.2%
4  5  3.2%
32  4  2.5%
Other values (63)  98  62.0%

Minimum values

Value  Count  Frequency (%)
0  6  3.8%
1  9  5.7%
2  7  4.4%
3  7  4.4%
4  5  3.2%
5  6  3.8%
6  2  1.3%
7  5  3.2%
8  6  3.8%
9  4  2.5%

Maximum values

Value  Count  Frequency (%)
494  1  0.6%
242  1  0.6%
236  1  0.6%
232  1  0.6%
228  1  0.6%
166  1  0.6%
159  1  0.6%
146  1  0.6%
129  1  0.6%
124  1  0.6%

부상신고 (reported injuries)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 27
Distinct (%): 17.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.2468354
Minimum: 0
Maximum: 128
Zeros: 71
Zeros (%): 44.9%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 4
95-th percentile: 23.15
Maximum: 128
Range: 128
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 14.283609
Coefficient of variation (CV): 2.7223284
Kurtosis: 40.595
Mean: 5.2468354
Median Absolute Deviation (MAD): 1
Skewness: 5.7497192
Sum: 829
Variance: 204.02149
Monotonicity: Not monotonic
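
The injury columns appear to be nested: the reported sums of 중상, 경상, and 부상신고 add up to the reported sum of 부상자수. That reading is an inference from the numbers, not something the report states. A minimal check, with illustrative names:

```python
# "Sum" values copied from the descriptive statistics of each variable.
REPORTED_SUMS = {"부상자수": 8913, "중상": 1949, "경상": 6135, "부상신고": 829}
COMPONENTS = ("중상", "경상", "부상신고")

# Total injuries implied by adding the three severity columns.
component_total = sum(REPORTED_SUMS[name] for name in COMPONENTS)
# Share of each severity class in the total, as a percentage.
component_shares = {
    name: round(100 * REPORTED_SUMS[name] / REPORTED_SUMS["부상자수"], 1)
    for name in COMPONENTS
}

print(component_total, component_shares)
```

If the columns are nested in this way, the very high correlations between 부상자수 and its components are partly structural, since the total is built from the parts.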

Histogram with fixed size bins (bins=27)

Common Values

Value  Count  Frequency (%)
0  71  44.9%
1  22  13.9%
2  13  8.2%
4  10  6.3%
3  8  5.1%
9  4  2.5%
6  4  2.5%
5  3  1.9%
15  3  1.9%
10  2  1.3%
Other values (17)  18  11.4%

Minimum values

Value  Count  Frequency (%)
0  71  44.9%
1  22  13.9%
2  13  8.2%
3  8  5.1%
4  10  6.3%
5  3  1.9%
6  4  2.5%
7  1  0.6%
8  1  0.6%
9  4  2.5%

Maximum values

Value  Count  Frequency (%)
128  1  0.6%
80  1  0.6%
64  1  0.6%
46  1  0.6%
33  1  0.6%
30  1  0.6%
26  1  0.6%
24  1  0.6%
23  1  0.6%
22  1  0.6%

Interactions

[36 interaction plots, one per ordered pair of the six numeric variables]

Correlations

Variable  시도  발생건수  사망자수  부상자수  중상  경상  부상신고
시도  1.000  0.000  0.000  0.133  0.000  0.289  0.000
발생건수  0.000  1.000  0.594  0.884  0.973  0.843  0.966
사망자수  0.000  0.594  1.000  0.594  0.633  0.583  0.498
부상자수  0.133  0.884  0.594  1.000  0.918  0.985  0.802
중상  0.000  0.973  0.633  0.918  1.000  0.832  0.923
경상  0.289  0.843  0.583  0.985  0.832  1.000  0.778
부상신고  0.000  0.966  0.498  0.802  0.923  0.778  1.000

Variable  발생건수  사망자수  부상자수  중상  경상  부상신고  시도
발생건수  1.000  0.564  0.976  0.919  0.966  0.782  0.000
사망자수  0.564  1.000  0.544  0.584  0.537  0.328  0.000
부상자수  0.976  0.544  1.000  0.932  0.991  0.800  0.057
중상  0.919  0.584  0.932  1.000  0.888  0.684  0.000
경상  0.966  0.537  0.991  0.888  1.000  0.786  0.136
부상신고  0.782  0.328  0.800  0.684  0.786  1.000  0.000
시도  0.000  0.000  0.057  0.000  0.136  0.000  1.000
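
The "High correlation" alerts can be related to the first matrix above by counting, for each variable, how many other variables exceed a cutoff. The sketch below assumes a cutoff of 0.5; that value is an assumption chosen because it reproduces the alert counts, and the actual setting belongs to the report's configuration, which is not shown here. The function and constant names are illustrative.

```python
NAMES = ["시도", "발생건수", "사망자수", "부상자수", "중상", "경상", "부상신고"]

# Rows of the first correlation matrix, in the same order as NAMES.
MATRIX = [
    [1.000, 0.000, 0.000, 0.133, 0.000, 0.289, 0.000],
    [0.000, 1.000, 0.594, 0.884, 0.973, 0.843, 0.966],
    [0.000, 0.594, 1.000, 0.594, 0.633, 0.583, 0.498],
    [0.133, 0.884, 0.594, 1.000, 0.918, 0.985, 0.802],
    [0.000, 0.973, 0.633, 0.918, 1.000, 0.832, 0.923],
    [0.289, 0.843, 0.583, 0.985, 0.832, 1.000, 0.778],
    [0.000, 0.966, 0.498, 0.802, 0.923, 0.778, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, not taken from the report


def highly_correlated(names, matrix, threshold):
    """Map each variable to the other variables whose coefficient reaches the threshold."""
    result = {}
    for i, name in enumerate(names):
        partners = [
            names[j]
            for j, value in enumerate(matrix[i])
            if j != i and value >= threshold
        ]
        if partners:
            result[name] = partners
    return result


high = highly_correlated(NAMES, MATRIX, THRESHOLD)
partner_counts = {name: len(partners) for name, partners in high.items()}

for name, partners in high.items():
    print(f"{name}: {len(partners)} fields ({', '.join(partners)})")
```

Under this cutoff, 사망자수 and 부상신고 each have four partners because their mutual coefficient (0.498) falls just below it, which matches the "and 3 other fields" wording in their alerts, while 시도 has none.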

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

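First rows

Index  시도  시군구  발생건수  사망자수  부상자수  중상  경상  부상신고
0  서울  성북구  1  0  1  0  0  1
1  서울  강서구  1  0  2  1  1  0
2  서울  강남구  1  0  1  1  0  0
3  서울  강동구  37  1  85  10  60  15
4  서울  송파구  24  1  49  5  38  6
5  서울  서초구  41  1  127  21  76  30
6  서울  양천구  3  0  4  1  3  0
7  서울  중랑구  3  0  10  0  9  1
8  서울  노원구  6  0  18  6  7  5
9  서울  금천구  3  1  4  0  4  0

Last rows

Index  시도  시군구  발생건수  사망자수  부상자수  중상  경상  부상신고
148  인천  계양구  76  1  132  29  88  15
149  광주  북구  20  1  44  5  38  1
150  광주  광산구  7  0  17  1  15  1
151  대전  동구  14  3  29  9  19  1
152  대전  중구  1  0  1  1  0  0
153  대전  서구  5  1  14  3  11  0
154  대전  유성구  26  2  52  20  32  0
155  대전  대덕구  21  2  34  9  24  1
156  울산  울주군  47  5  79  18  57  4
157  세종  세종  4  0  9  1  8  0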
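
The sample rows can be sanity-checked at row level: in each row, 부상자수 should equal 중상 + 경상 + 부상신고 if the injury columns are nested. The values below are one reading of six of the sample rows, with the columns in the order 발생건수, 사망자수, 부상자수, 중상, 경상, 부상신고. Both the nesting and the way the digits are split are assumptions, and the names in the code are illustrative.

```python
# (시도, 시군구, 발생건수, 사망자수, 부상자수, 중상, 경상, 부상신고)
SAMPLE_ROWS = [
    ("서울", "성북구", 1, 0, 1, 0, 0, 1),
    ("서울", "강동구", 37, 1, 85, 10, 60, 15),
    ("서울", "서초구", 41, 1, 127, 21, 76, 30),
    ("인천", "계양구", 76, 1, 132, 29, 88, 15),
    ("대전", "유성구", 26, 2, 52, 20, 32, 0),
    ("울산", "울주군", 47, 5, 79, 18, 57, 4),
]


def injuries_consistent(row) -> bool:
    """True when the injured total equals the sum of the three severity columns."""
    _, _, _, _, injured, serious, minor, reported = row
    return injured == serious + minor + reported


inconsistent_rows = [row for row in SAMPLE_ROWS if not injuries_consistent(row)]
all_consistent = not inconsistent_rows

print("all rows consistent:", all_consistent)
```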