Overview

Dataset statistics

Number of variables: 8
Number of observations: 229
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 15.8 KiB
Average record size in memory: 70.6 B
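
The average record size follows from the two figures above it. A minimal sketch of that arithmetic, using only the numbers printed in this block (the variable names are illustrative and not part of the report):

```python
# Figures copied from the "Dataset statistics" block.
total_size_kib = 15.8    # "Total size in memory", in KiB (already rounded)
n_observations = 229     # "Number of observations"

# 1 KiB = 1024 bytes; dividing by the row count gives bytes per record.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"about {avg_record_bytes:.2f} B per record")
```

The result lands within a tenth of a byte of the reported 70.6 B; the small gap is expected because the total size is itself rounded to one decimal place.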

Variable types

Categorical: 1
Text: 1
Numeric: 6

Dataset

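Description: Fatal traffic accidents by category (2018) (original title: 부문별 사망 교통사고(2018))
Author: 도로교통공단 (Korea Road Traffic Authority, KoROAD)
URL: https://www.data.go.kr/data/15094168/fileData.do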

Alerts

발생건수 is highly correlated overall with 사망자수 and 3 other fields (High correlation)
사망자수 is highly correlated overall with 발생건수 and 3 other fields (High correlation)
부상자수 is highly correlated overall with 발생건수 and 3 other fields (High correlation)
중상 is highly correlated overall with 발생건수 and 3 other fields (High correlation)
경상 is highly correlated overall with 발생건수 and 3 other fields (High correlation)
부상자수 has 29 (12.7%) zeros (Zeros)
중상 has 54 (23.6%) zeros (Zeros)
경상 has 57 (24.9%) zeros (Zeros)
부상신고 has 182 (79.5%) zeros (Zeros)
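
The percentages in the zero alerts are each column's zero count divided by the 229 observations. A short sketch that recomputes them from the counts quoted in the alerts (the dictionary and variable names are illustrative):

```python
n_rows = 229  # "Number of observations"

# Zero counts as quoted in the alerts.
zero_counts = {"부상자수": 29, "중상": 54, "경상": 57, "부상신고": 182}

# Share of rows that are zero, as a percentage rounded to one decimal.
zero_share = {name: round(100 * count / n_rows, 1)
              for name, count in zero_counts.items()}

for name, share in zero_share.items():
    print(f"{name}: {share}% zeros")
```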

Reproduction

Analysis started: 2023-12-12 21:57:51.671416
Analysis finished: 2023-12-12 21:57:55.749437
Duration: 4.08 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
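
The reported duration is the difference between the two timestamps. A minimal check with the standard library, using the timestamps exactly as printed:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2023-12-12 21:57:51.671416")
finished = datetime.fromisoformat("2023-12-12 21:57:55.749437")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```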

Variables

시도 (province / metropolitan city)
Categorical

Distinct: 17
Distinct (%): 7.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Value  Count
경기  31
서울  25
경북  23
전남  22
강원  18
Other values (12)  110

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 0.4%

Sample

1st row: 서울
2nd row: 서울
3rd row: 서울
4th row: 서울
5th row: 서울

Common Values

Value  Count  Frequency (%)
경기  31  13.5%
서울  25  10.9%
경북  23  10.0%
전남  22  9.6%
강원  18  7.9%
경남  18  7.9%
부산  16  7.0%
충남  15  6.6%
전북  14  6.1%
충북  11  4.8%
Other values (7)  36  15.7%

Length

[Figure: Histogram of lengths of the category]

시군구 (city / county / district)
Text

Distinct: 206
Distinct (%): 90.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Length

Max length: 7
Median length: 3
Mean length: 2.9388646
Min length: 2

Characters and Unicode

Total characters: 673
Distinct characters: 133
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
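
As a rough illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below classifies the five sample values listed for this variable plus the two parentheses, and rechecks the mean length from the totals above. It is not how the profiling tool itself computes these figures, and the variable names are illustrative.

```python
import unicodedata
from collections import Counter

# The five sample values listed for this variable.
sample = ["종로구", "중구", "용산구", "성동구", "동대문구"]

# General category of every character: "Lo" is Other Letter.
category_counts = Counter(
    unicodedata.category(ch) for value in sample for ch in value
)
print(category_counts)

# "Ps" is Open Punctuation, "Pe" is Close Punctuation.
punctuation = {ch: unicodedata.category(ch) for ch in "()"}
print(punctuation)

# Mean length = total characters / number of observations.
mean_length = round(673 / 229, 7)
print(mean_length)
```

Hangul syllables fall in the Other Letter category, which matches the report's breakdown of 671 Other Letter characters plus one opening and one closing parenthesis.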

Unique

Unique: 199
Unique (%): 86.9%

Sample

1st row: 종로구
2nd row: 중구
3rd row: 용산구
4th row: 성동구
5th row: 동대문구

Value  Count  Frequency (%)
중구  6  2.6%
동구  6  2.6%
서구  5  2.2%
남구  5  2.2%
북구  4  1.7%
강서구  2  0.9%
고성군  2  0.9%
곡성군  1  0.4%
화순군  1  0.4%
보성군  1  0.4%
Other values (196)  196  85.6%

Most occurring characters

(character glyphs not preserved in this export; rows in rank order)
Count  Frequency (%)
85     12.6%
78     11.6%
74     11.0%
22     3.3%
20     3.0%
18     2.7%
18     2.7%
17     2.5%
16     2.4%
13     1.9%
Other values (123): 312 (46.4%)

Most occurring categories

Value              Count  Frequency (%)
Other Letter       671    99.7%
Open Punctuation   1      0.1%
Close Punctuation  1      0.1%

Most frequent character per category

Other Letter
(character glyphs not preserved in this export; rows in rank order)
Count  Frequency (%)
85     12.7%
78     11.6%
74     11.0%
22     3.3%
20     3.0%
18     2.7%
18     2.7%
17     2.5%
16     2.4%
13     1.9%
Other values (121): 310 (46.2%)

Open Punctuation
Value  Count  Frequency (%)
(      1      100.0%

Close Punctuation
Value  Count  Frequency (%)
)      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  671    99.7%
Common  2      0.3%

Most frequent character per script

Hangul
(character glyphs not preserved in this export; rows in rank order)
Count  Frequency (%)
85     12.7%
78     11.6%
74     11.0%
22     3.3%
20     3.0%
18     2.7%
18     2.7%
17     2.5%
16     2.4%
13     1.9%
Other values (121): 310 (46.2%)

Common
Value  Count  Frequency (%)
(      1      50.0%
)      1      50.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  671    99.7%
ASCII   2      0.3%

Most frequent character per block

Hangul
(character glyphs not preserved in this export; rows in rank order)
Count  Frequency (%)
85     12.7%
78     11.6%
74     11.0%
22     3.3%
20     3.0%
18     2.7%
18     2.7%
17     2.5%
16     2.4%
13     1.9%
Other values (121): 310 (46.2%)

ASCII
Value  Count  Frequency (%)
(      1      50.0%
)      1      50.0%

발생건수 (number of accidents)
Real number (ℝ)

HIGH CORRELATION

Distinct: 44
Distinct (%): 19.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.969432
Minimum: 1
Maximum: 84
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 9
median: 13
Q3: 20
95-th percentile: 39
Maximum: 84
Range: 83
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 11.813386
Coefficient of variation (CV): 0.73974988
Kurtosis: 6.0728713
Mean: 15.969432
Median Absolute Deviation (MAD): 5
Skewness: 1.9814265
Sum: 3657
Variance: 139.55608
Monotonicity: Not monotonic
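
Several of the 발생건수 figures above are derived from one another. A small sketch, assuming the usual definitions (mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 − Q1, range = maximum − minimum), recomputes them from the printed values; the variable names are illustrative:

```python
# Values copied from the 발생건수 statistics.
n = 229                # number of observations
total = 3657           # Sum
std = 11.813386        # Standard deviation
q1, q3 = 9, 20         # Q1 and Q3
minimum, maximum = 1, 84

mean = total / n               # reported as 15.969432
cv = std / mean                # reported as 0.73974988
variance = std ** 2            # reported as 139.55608
iqr = q3 - q1                  # reported as 11
value_range = maximum - minimum  # reported as 83

print(mean, cv, variance, iqr, value_range)
```

The recomputed values agree with the report up to the rounding of the printed standard deviation.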
[Figure: Histogram with fixed size bins (bins=44)]

Common Values
Value  Count  Frequency (%)
11     17     7.4%
6      16     7.0%
13     14     6.1%
10     12     5.2%
9      12     5.2%
5      11     4.8%
14     11     4.8%
12     10     4.4%
17     10     4.4%
8      9      3.9%
Other values (34): 107 (46.7%)

Minimum values
Value  Count  Frequency (%)
1      5      2.2%
2      6      2.6%
3      5      2.2%
5      11     4.8%
6      16     7.0%
7      4      1.7%
8      9      3.9%
9      12     5.2%
10     12     5.2%
11     17     7.4%

Maximum values
Value  Count  Frequency (%)
84     1      0.4%
64     1      0.4%
57     1      0.4%
52     1      0.4%
51     1      0.4%
48     1      0.4%
42     2      0.9%
41     2      0.9%
40     1      0.4%
39     2      0.9%

사망자수 (number of deaths)
Real number (ℝ)

HIGH CORRELATION

Distinct: 46
Distinct (%): 20.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.510917
Minimum: 1
Maximum: 86
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 9
median: 14
Q3: 20
95-th percentile: 39.6
Maximum: 86
Range: 85
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 12.156729
Coefficient of variation (CV): 0.73628431
Kurtosis: 5.8153256
Mean: 16.510917
Median Absolute Deviation (MAD): 6
Skewness: 1.9326029
Sum: 3781
Variance: 147.78606
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=46)]

Common Values
Value  Count  Frequency (%)
11     20     8.7%
6      15     6.6%
14     14     6.1%
10     12     5.2%
13     12     5.2%
5      10     4.4%
9      10     4.4%
15     9      3.9%
20     9      3.9%
17     8      3.5%
Other values (36): 110 (48.0%)

Minimum values
Value  Count  Frequency (%)
1      5      2.2%
2      6      2.6%
3      5      2.2%
5      10     4.4%
6      15     6.6%
7      4      1.7%
8      8      3.5%
9      10     4.4%
10     12     5.2%
11     20     8.7%

Maximum values
Value  Count  Frequency (%)
86     1      0.4%
64     1      0.4%
59     1      0.4%
54     1      0.4%
53     1      0.4%
51     1      0.4%
45     1      0.4%
42     2      0.9%
41     2      0.9%
40     1      0.4%

부상자수 (number of injured)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 35
Distinct (%): 15.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.3624454
Minimum: 0
Maximum: 53
Zeros: 29
Zeros (%): 12.7%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
median: 5
Q3: 11
95-th percentile: 28
Maximum: 53
Range: 53
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 9.3463843
Coefficient of variation (CV): 1.1176616
Kurtosis: 4.0384874
Mean: 8.3624454
Median Absolute Deviation (MAD): 4
Skewness: 1.8621054
Sum: 1915
Variance: 87.354899
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=35)]

Common Values
Value  Count  Frequency (%)
0      29     12.7%
3      25     10.9%
1      24     10.5%
4      20     8.7%
10     14     6.1%
2      13     5.7%
8      13     5.7%
7      11     4.8%
5      8      3.5%
6      8      3.5%
Other values (25): 64 (27.9%)

Minimum values
Value  Count  Frequency (%)
0      29     12.7%
1      24     10.5%
2      13     5.7%
3      25     10.9%
4      20     8.7%
5      8      3.5%
6      8      3.5%
7      11     4.8%
8      13     5.7%
9      5      2.2%

Maximum values
Value  Count  Frequency (%)
53     1      0.4%
46     1      0.4%
43     1      0.4%
42     1      0.4%
33     2      0.9%
32     1      0.4%
31     1      0.4%
30     1      0.4%
29     2      0.9%
28     2      0.9%

중상 (serious injuries)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 21
Distinct (%): 9.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.5196507
Minimum: 0
Maximum: 21
Zeros: 54
Zeros (%): 23.6%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 2
Q3: 5
95-th percentile: 12
Maximum: 21
Range: 21
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 4.0192309
Coefficient of variation (CV): 1.1419403
Kurtosis: 3.5072183
Mean: 3.5196507
Median Absolute Deviation (MAD): 2
Skewness: 1.7800639
Sum: 806
Variance: 16.154217
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=21)]

Common Values
Value  Count  Frequency (%)
0      54     23.6%
1      37     16.2%
2      31     13.5%
3      23     10.0%
4      20     8.7%
6      15     6.6%
5      10     4.4%
8      8      3.5%
7      8      3.5%
9      4      1.7%
Other values (11): 19 (8.3%)

Minimum values
Value  Count  Frequency (%)
0      54     23.6%
1      37     16.2%
2      31     13.5%
3      23     10.0%
4      20     8.7%
5      10     4.4%
6      15     6.6%
7      8      3.5%
8      8      3.5%
9      4      1.7%

Maximum values
Value  Count  Frequency (%)
21     1      0.4%
19     1      0.4%
18     1      0.4%
17     2      0.9%
16     1      0.4%
15     1      0.4%
14     1      0.4%
13     2      0.9%
12     3      1.3%
11     3      1.3%

경상 (minor injuries)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 23
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.4323144
Minimum: 0
Maximum: 34
Zeros: 57
Zeros (%): 24.9%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 2
Q3: 6
95-th percentile: 14.6
Maximum: 34
Range: 34
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 5.8422992
Coefficient of variation (CV): 1.3181148
Kurtosis: 6.3433578
Mean: 4.4323144
Median Absolute Deviation (MAD): 2
Skewness: 2.2548477
Sum: 1015
Variance: 34.13246
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=23)]

Common Values
Value  Count  Frequency (%)
0      57     24.9%
1      36     15.7%
2      26     11.4%
3      25     10.9%
4      16     7.0%
7      9      3.9%
6      8      3.5%
10     8      3.5%
13     7      3.1%
9      6      2.6%
Other values (13): 31 (13.5%)

Minimum values
Value  Count  Frequency (%)
0      57     24.9%
1      36     15.7%
2      26     11.4%
3      25     10.9%
4      16     7.0%
5      5      2.2%
6      8      3.5%
7      9      3.9%
8      4      1.7%
9      6      2.6%

Maximum values
Value  Count  Frequency (%)
34     1      0.4%
32     1      0.4%
30     1      0.4%
23     2      0.9%
21     2      0.9%
20     1      0.4%
19     1      0.4%
15     3      1.3%
14     4      1.7%
13     7      3.1%

부상신고 (reported injuries)
Real number (ℝ)

ZEROS

Distinct: 8
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.41048035
Minimum: 0
Maximum: 10
Zeros: 182
Zeros (%): 79.5%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 2
Maximum: 10
Range: 10
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.1344193
Coefficient of variation (CV): 2.7636386
Kurtosis: 28.191592
Mean: 0.41048035
Median Absolute Deviation (MAD): 0
Skewness: 4.6335678
Sum: 94
Variance: 1.2869072
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=8)]

Common Values
Value  Count  Frequency (%)
0      182    79.5%
1      26     11.4%
2      13     5.7%
4      3      1.3%
6      2      0.9%
3      1      0.4%
5      1      0.4%
10     1      0.4%

Minimum values
Value  Count  Frequency (%)
0      182    79.5%
1      26     11.4%
2      13     5.7%
3      1      0.4%
4      3      1.3%
5      1      0.4%
6      2      0.9%
10     1      0.4%

Maximum values
Value  Count  Frequency (%)
10     1      0.4%
6      2      0.9%
5      1      0.4%
4      3      1.3%
3      1      0.4%
2      13     5.7%
1      26     11.4%
0      182    79.5%

Interactions

[Figures: 36 pairwise interaction plots]

Correlations

[Figure: correlation matrix plot]

          시도      발생건수  사망자수  부상자수  중상      경상      부상신고
시도      1.000     0.322     0.315     0.240     0.370     0.296     0.000
발생건수  0.322     1.000     0.998     0.735     0.610     0.863     0.683
사망자수  0.315     0.998     1.000     0.746     0.637     0.857     0.570
부상자수  0.240     0.735     0.746     1.000     0.912     0.890     0.580
중상      0.370     0.610     0.637     0.912     1.000     0.723     0.395
경상      0.296     0.863     0.857     0.890     0.723     1.000     0.475
부상신고  0.000     0.683     0.570     0.580     0.395     0.475     1.000

[Figure: correlation matrix plot]

          발생건수  사망자수  부상자수  중상      경상      부상신고  시도
발생건수  1.000     0.995     0.741     0.633     0.694     0.290     0.131
사망자수  0.995     1.000     0.750     0.652     0.693     0.287     0.133
부상자수  0.741     0.750     1.000     0.891     0.899     0.358     0.092
중상      0.633     0.652     0.891     1.000     0.648     0.225     0.150
경상      0.694     0.693     0.899     0.648     1.000     0.252     0.119
부상신고  0.290     0.287     0.358     0.225     0.252     1.000     0.000
시도      0.131     0.133     0.092     0.150     0.119     0.000     1.000
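
The coefficients above were computed by the profiling tool over all 229 rows, and this text does not say which correlation measure each matrix uses. As a minimal sketch of what a pairwise coefficient measures, the following computes a plain Pearson correlation between 발생건수 and 사망자수 using only the ten head rows from the Sample section. The `pearson` helper is written here for illustration, and its result is not expected to reproduce the figures above.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# 발생건수 and 사망자수 for the first ten sample rows (index 0 to 9).
accidents = [6, 9, 12, 9, 14, 14, 6, 12, 5, 11]
deaths = [6, 9, 14, 9, 14, 14, 6, 12, 5, 11]

r = pearson(accidents, deaths)
print(f"Pearson r on ten rows: {r:.3f}")
```

Even on this small subset the two columns move almost in lockstep, which is consistent with the High correlation alert raised for both.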

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

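First rows

     시도  시군구    발생건수  사망자수  부상자수  중상  경상  부상신고
0    서울  종로구    6         6         2         0     2     0
1    서울  중구      9         9         3         2     0     1
2    서울  용산구    12        14        1         1     0     0
3    서울  성동구    9         9         1         0     1     0
4    서울  동대문구  14        14        0         0     0     0
5    서울  성북구    14        14        3         2     1     0
6    서울  도봉구    6         6         0         0     0     0
7    서울  은평구    12        12        1         1     0     0
8    서울  서대문구  5         5         0         0     0     0
9    서울  마포구    11        11        1         1     0     0

Last rows

     시도  시군구    발생건수  사망자수  부상자수  중상  경상  부상신고
219  대전  중구      12        12        2         0     2     0
220  대전  서구      22        22        11        3     7     1
221  대전  유성구    21        22        8         2     6     0
222  대전  대덕구    13        13        3         0     3     0
223  울산  중구      10        10        1         1     0     0
224  울산  남구      23        23        17        4     13    0
225  울산  동구      6         6         0         0     0     0
226  울산  북구      15        16        43        11    32    0
227  울산  울주군    23        24        24        11    12    1
228  세종  세종      17        20        3         2     1     0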
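
When a report like this is flattened to plain text, the six count columns of a sample row can run together, for example `15164311320` for 울산 북구. The sketch below enumerates every way to cut such a digit string into six counts and keeps the cuts that pass two checks. Both checks are assumptions rather than statements from the report: that 부상자수 equals 중상 + 경상 + 부상신고 (suggested by the column sums, 806 + 1015 + 94 = 1915), and that 사망자수 is at least 발생건수 because the dataset covers fatal accidents. The function name is hypothetical.

```python
from itertools import combinations

def split_fused_counts(digits, n_fields=6):
    """Return every cut of `digits` into n_fields counts that passes the checks."""
    candidates = []
    for cuts in combinations(range(1, len(digits)), n_fields - 1):
        bounds = (0, *cuts, len(digits))
        parts = [digits[i:j] for i, j in zip(bounds, bounds[1:])]
        # Reject pieces with a leading zero such as "03".
        if any(len(p) > 1 and p[0] == "0" for p in parts):
            continue
        accidents, deaths, injured, serious, minor, reported = map(int, parts)
        # Assumed consistency checks (not stated in the report itself).
        if injured == serious + minor + reported and deaths >= accidents:
            candidates.append(
                (accidents, deaths, injured, serious, minor, reported)
            )
    return candidates

print(split_fused_counts("662020"))        # 서울 종로구
print(split_fused_counts("15164311320"))   # 울산 북구
```

A six-digit string such as `662020` admits only one cut, while longer strings may leave more than one candidate, so the output is best treated as a shortlist to review by eye.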