Overview

Dataset statistics

Number of variables: 8
Number of observations: 39
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.8 KiB
Average record size in memory: 73.4 B

Variable types

Categorical: 2
Text: 1
Numeric: 5

Dataset

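Description: Large-scale traffic accidents by category, 2018 (original title: 부문별 대형 교통사고(2018))
Author: 도로교통공단 (Korea Road Traffic Authority)
URL: https://www.data.go.kr/data/15094164/fileData.do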

Alerts

High correlation: 사망자수 is highly correlated overall with 부상자수 and 1 other field
High correlation: 부상자수 is highly correlated overall with 사망자수 and 3 other fields
High correlation: 경상 is highly correlated overall with 사망자수 and 2 other fields
High correlation: 부상신고 is highly correlated overall with 부상자수 and 1 other field
High correlation: 발생건수 is highly correlated overall with 부상자수 and 2 other fields
Imbalance: 발생건수 is highly imbalanced (62.6%)
Zeros: 사망자수 has 21 (53.8%) zeros
Zeros: 부상자수 has 2 (5.1%) zeros
Zeros: 중상 has 9 (23.1%) zeros
Zeros: 경상 has 7 (17.9%) zeros
Zeros: 부상신고 has 26 (66.7%) zeros

Reproduction

Analysis started: 2023-12-11 23:18:57.732916
Analysis finished: 2023-12-11 23:19:00.756575
Duration: 3.02 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시도
Categorical

Distinct: 14
Distinct (%): 35.9%
Missing: 0
Missing (%): 0.0%
Memory size: 444.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 6
Unique (%): 15.4%

Sample

1st row: 서울
2nd row: 서울
3rd row: 서울
4th row: 서울
5th row: 서울

Common Values

Value             Count  Frequency (%)
경기              8      20.5%
서울              6      15.4%
충남              6      15.4%
전남              4      10.3%
충북              3      7.7%
강원              2      5.1%
경남              2      5.1%
대전              2      5.1%
부산              1      2.6%
전북              1      2.6%
Other values (4)  4      10.3%
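
The report distinguishes Distinct (how many different values occur) from Unique (how many values occur exactly once). The sketch below recomputes both for 시도 from the counts in the Common Values table. It assumes the four regions folded into "Other values (4)" are 경북, 대구, 광주 and 울산, which are the only 시도 values in the Sample rows that the table does not list; the variable names are illustrative.

```python
from collections import Counter

# Counts copied from the Common Values table for 시도.
# Assumption: the four "Other values" are the regions that show up in the
# Sample rows but not in the table, each occurring once.
counts = Counter({
    "경기": 8, "서울": 6, "충남": 6, "전남": 4, "충북": 3,
    "강원": 2, "경남": 2, "대전": 2, "부산": 1, "전북": 1,
    "경북": 1, "대구": 1, "광주": 1, "울산": 1,
})

n_rows = sum(counts.values())                       # number of observations
distinct = len(counts)                              # different values seen
unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

distinct_pct = f"{distinct / n_rows:.1%}"
unique_pct = f"{unique / n_rows:.1%}"
print(n_rows, distinct, distinct_pct, unique, unique_pct)
```

The result agrees with the figures reported for this variable (14 distinct, 6 unique, out of 39 rows).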

Length

[Figure: Histogram of lengths of the category]

시군구
Text

Distinct: 38
Distinct (%): 97.4%
Missing: 0
Missing (%): 0.0%
Memory size: 444.0 B

Length

Max length: 4
Median length: 3
Mean length: 2.9487179
Min length: 2

Characters and Unicode

Total characters: 115
Distinct characters: 41
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 37
Unique (%): 94.9%

Sample

1st row: 강서구
2nd row: 강남구
3rd row: 서초구
4th row: 양천구
5th row: 강북구

Common Values

Value              Count  Frequency (%)
북구               2      5.1%
전주시             1      2.6%
동구               1      2.6%
순천시             1      2.6%
아산시             1      2.6%
공주시             1      2.6%
서산시             1      2.6%
홍성군             1      2.6%
예산군             1      2.6%
여수시             1      2.6%
Other values (28)  28     71.8%

Most occurring characters

Value (characters not preserved)  Count  Frequency (%)
-                                 18     15.7%
-                                 12     10.4%
-                                 9      7.8%
-                                 8      7.0%
-                                 6      5.2%
-                                 6      5.2%
-                                 4      3.5%
-                                 4      3.5%
-                                 3      2.6%
-                                 3      2.6%
Other values (31)                 42     36.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 115
100.0%

Most frequent character per category

Other Letter
Value (characters not preserved)  Count  Frequency (%)
-                                 18     15.7%
-                                 12     10.4%
-                                 9      7.8%
-                                 8      7.0%
-                                 6      5.2%
-                                 6      5.2%
-                                 4      3.5%
-                                 4      3.5%
-                                 3      2.6%
-                                 3      2.6%
Other values (31)                 42     36.5%

Most occurring scripts

ValueCountFrequency (%)
Hangul 115
100.0%

Most frequent character per script

Hangul
Value (characters not preserved)  Count  Frequency (%)
-                                 18     15.7%
-                                 12     10.4%
-                                 9      7.8%
-                                 8      7.0%
-                                 6      5.2%
-                                 6      5.2%
-                                 4      3.5%
-                                 4      3.5%
-                                 3      2.6%
-                                 3      2.6%
Other values (31)                 42     36.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 115
100.0%

Most frequent character per block

Hangul
Value (characters not preserved)  Count  Frequency (%)
-                                 18     15.7%
-                                 12     10.4%
-                                 9      7.8%
-                                 8      7.0%
-                                 6      5.2%
-                                 6      5.2%
-                                 4      3.5%
-                                 4      3.5%
-                                 3      2.6%
-                                 3      2.6%
Other values (31)                 42     36.5%

발생건수
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 4
Distinct (%): 10.3%
Missing: 0
Missing (%): 0.0%
Memory size: 444.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 2.6%

Sample

1st row: 2
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      34     87.2%
2      2      5.1%
3      2      5.1%
4      1      2.6%
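
The 62.6% imbalance reported for 발생건수 can be reproduced from these four counts. The sketch below assumes the score is one minus the Shannon entropy of the class frequencies, normalised by the maximum entropy for that number of classes. That definition is an assumption rather than something the report states, although it does yield the reported figure. The function name `imbalance_score` is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class
    dominates. With fewer than two classes the score is set to 0.0 here,
    which is a convention chosen for this sketch.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)

# Counts for the values 1, 2, 3 and 4 of 발생건수, from the table above.
score = imbalance_score([34, 2, 2, 1])
score_pct = f"{score:.1%}"
print(score_pct)
```

A perfectly even split such as `imbalance_score([10, 10])` gives 0.0, so the score grows as a single value (here, 1 in 34 of 39 rows) takes over.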

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

사망자수
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 6
Distinct (%): 15.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.3846154
Minimum: 0
Maximum: 8
Zeros: 21
Zeros (%): 53.8%
Negative: 0
Negative (%): 0.0%
Memory size: 483.0 B

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 3
95th percentile: 4
Maximum: 8
Range: 8
Interquartile range (IQR): 3
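
These quantiles can be rebuilt from the value counts listed for 사망자수. The sketch below assumes linear interpolation between order statistics (the "inclusive" method of Python's `statistics.quantiles`). The report does not state its interpolation rule, so treat the agreement as a consistency check rather than a description of the tool's internals.

```python
import statistics

# Value -> count for 사망자수, copied from the frequency listing in this section.
freq = {0: 21, 1: 3, 2: 1, 3: 11, 4: 2, 8: 1}
values = sorted(v for v, n in freq.items() for _ in range(n))

# Quartiles: three cut points for n=4.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")

# 5th and 95th percentiles: first and last of the 19 cut points for n=20.
twentieths = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]

iqr = q3 - q1
value_range = max(values) - min(values)
print(p5, q1, median, q3, p95, iqr, value_range)
```

With 21 of the 39 observations equal to zero, the minimum, Q1 and median all collapse to 0, so the IQR equals Q3.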

Descriptive statistics

Standard deviation: 1.8298028
Coefficient of variation (CV): 1.3215242
Kurtosis: 2.6892328
Mean: 1.3846154
Median Absolute Deviation (MAD): 0
Skewness: 1.4306092
Sum: 54
Variance: 3.3481781
Monotonicity: Not monotonic
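
Several of these figures follow directly from the value counts for 사망자수. The sketch below recomputes them, assuming the variance and standard deviation use the sample (n - 1) denominator. The report does not say which denominator it uses, but n - 1 reproduces its numbers. Kurtosis and skewness are left out because their exact estimator is not given.

```python
import statistics

# Value -> count for 사망자수, copied from the frequency listing in this section.
freq = {0: 21, 1: 3, 2: 1, 3: 11, 4: 2, 8: 1}
values = [v for v, n in freq.items() for _ in range(n)]

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std = statistics.stdev(values)          # square root of the sample variance
cv = std / mean                         # coefficient of variation

# Median absolute deviation: median distance from the median.
med = statistics.median(values)
mad = statistics.median(abs(v - med) for v in values)

zeros_pct = f"{values.count(0) / len(values):.1%}"
print(total, mean, variance, std, cv, mad, zeros_pct)
```

The CV above 1 and the MAD of 0 both reflect the same thing: more than half of the rows record no deaths, while a few rows carry most of the total.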
[Figure: Histogram with fixed size bins (bins=6)]

Common Values

Value  Count  Frequency (%)
0      21     53.8%
3      11     28.2%
1      3      7.7%
4      2      5.1%
8      1      2.6%
2      1      2.6%

Minimum values

Value  Count  Frequency (%)
0      21     53.8%
1      3      7.7%
2      1      2.6%
3      11     28.2%
4      2      5.1%
8      1      2.6%

Maximum values

Value  Count  Frequency (%)
8      1      2.6%
4      2      5.1%
3      11     28.2%
2      1      2.6%
1      3      7.7%
0      21     53.8%

부상자수
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 24
Distinct (%): 61.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 24.666667
Minimum: 0
Maximum: 98
Zeros: 2
Zeros (%): 5.1%
Negative: 0
Negative (%): 0.0%
Memory size: 483.0 B

Quantile statistics

Minimum: 0
5th percentile: 0.9
Q1: 5
Median: 22
Q3: 31.5
95th percentile: 59.4
Maximum: 98
Range: 98
Interquartile range (IQR): 26.5

Descriptive statistics

Standard deviation: 21.665317
Coefficient of variation (CV): 0.87832367
Kurtosis: 2.9025913
Mean: 24.666667
Median Absolute Deviation (MAD): 15
Skewness: 1.381658
Sum: 962
Variance: 469.38596
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=24)]

Common Values

Value              Count  Frequency (%)
1                  5      12.8%
21                 4      10.3%
31                 3      7.7%
19                 2      5.1%
27                 2      5.1%
26                 2      5.1%
0                  2      5.1%
5                  2      5.1%
2                  2      5.1%
49                 1      2.6%
Other values (14)  14     35.9%

Minimum values

Value  Count  Frequency (%)
0      2      5.1%
1      5      12.8%
2      2      5.1%
5      2      5.1%
11     1      2.6%
19     2      5.1%
20     1      2.6%
21     4      10.3%
22     1      2.6%
24     1      2.6%

Maximum values

Value  Count  Frequency (%)
98     1      2.6%
81     1      2.6%
57     1      2.6%
49     1      2.6%
45     1      2.6%
42     1      2.6%
40     1      2.6%
39     1      2.6%
37     1      2.6%
32     1      2.6%

중상
Real number (ℝ)

ZEROS

Distinct: 14
Distinct (%): 35.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.4871795
Minimum: 0
Maximum: 19
Zeros: 9
Zeros (%): 23.1%
Negative: 0
Negative (%): 0.0%
Memory size: 483.0 B

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 1
Median: 2
Q3: 6
95th percentile: 14.2
Maximum: 19
Range: 19
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 5.154823
Coefficient of variation (CV): 1.1487891
Kurtosis: 0.76944953
Mean: 4.4871795
Median Absolute Deviation (MAD): 2
Skewness: 1.3042831
Sum: 175
Variance: 26.5722
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=14)]

Common Values

Value             Count  Frequency (%)
0                 9      23.1%
1                 6      15.4%
2                 5      12.8%
4                 4      10.3%
5                 3      7.7%
3                 2      5.1%
14                2      5.1%
10                2      5.1%
11                1      2.6%
19                1      2.6%
Other values (4)  4      10.3%

Minimum values

Value  Count  Frequency (%)
0      9      23.1%
1      6      15.4%
2      5      12.8%
3      2      5.1%
4      4      10.3%
5      3      7.7%
7      1      2.6%
8      1      2.6%
10     2      5.1%
11     1      2.6%

Maximum values

Value  Count  Frequency (%)
19     1      2.6%
16     1      2.6%
14     2      5.1%
13     1      2.6%
11     1      2.6%
10     2      5.1%
8      1      2.6%
7      1      2.6%
5      3      7.7%
4      4      10.3%

경상
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 24
Distinct (%): 61.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.307692
Minimum: 0
Maximum: 78
Zeros: 7
Zeros (%): 17.9%
Negative: 0
Negative (%): 0.0%
Memory size: 483.0 B

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 1.5
Median: 14
Q3: 24.5
95th percentile: 42.2
Maximum: 78
Range: 78
Interquartile range (IQR): 23

Descriptive statistics

Standard deviation: 16.463922
Coefficient of variation (CV): 1.0095801
Kurtosis: 3.7405294
Mean: 16.307692
Median Absolute Deviation (MAD): 12
Skewness: 1.5203036
Sum: 636
Variance: 271.06073
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=24)]

Common Values

Value              Count  Frequency (%)
0                  7      17.9%
18                 4      10.3%
1                  3      7.7%
2                  2      5.1%
24                 2      5.1%
29                 2      5.1%
10                 2      5.1%
35                 1      2.6%
6                  1      2.6%
42                 1      2.6%
Other values (14)  14     35.9%

Minimum values

Value  Count  Frequency (%)
0      7      17.9%
1      3      7.7%
2      2      5.1%
4      1      2.6%
6      1      2.6%
8      1      2.6%
9      1      2.6%
10     2      5.1%
11     1      2.6%
14     1      2.6%

Maximum values

Value  Count  Frequency (%)
78     1      2.6%
44     1      2.6%
42     1      2.6%
36     1      2.6%
35     1      2.6%
29     2      5.1%
28     1      2.6%
27     1      2.6%
25     1      2.6%
24     2      5.1%

부상신고
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 12
Distinct (%): 30.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.8717949
Minimum: 0
Maximum: 38
Zeros: 26
Zeros (%): 66.7%
Negative: 0
Negative (%): 0.0%
Memory size: 483.0 B

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 2.5
95th percentile: 22.5
Maximum: 38
Range: 38
Interquartile range (IQR): 2.5

Descriptive statistics

Standard deviation: 8.6242604
Coefficient of variation (CV): 2.227458
Kurtosis: 6.8997878
Mean: 3.8717949
Median Absolute Deviation (MAD): 0
Skewness: 2.6545946
Sum: 151
Variance: 74.377868
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=12)]

Common Values

Value             Count  Frequency (%)
0                 26     66.7%
3                 2      5.1%
1                 2      5.1%
11                1      2.6%
19                1      2.6%
6                 1      2.6%
2                 1      2.6%
4                 1      2.6%
22                1      2.6%
27                1      2.6%
Other values (2)  2      5.1%

Minimum values

Value  Count  Frequency (%)
0      26     66.7%
1      2      5.1%
2      1      2.6%
3      2      5.1%
4      1      2.6%
6      1      2.6%
11     1      2.6%
14     1      2.6%
19     1      2.6%
22     1      2.6%

Maximum values

Value  Count  Frequency (%)
38     1      2.6%
27     1      2.6%
22     1      2.6%
19     1      2.6%
14     1      2.6%
11     1      2.6%
6      1      2.6%
4      1      2.6%
3      2      5.1%
2      1      2.6%

Interactions

[Figures: 25 pairwise interaction plots; images not preserved]

Correlations

[Figure: correlation plot]

          시도    시군구  발생건수  사망자수  부상자수  중상    경상    부상신고
시도      1.000   0.000   0.000     0.734     0.000     0.332   0.112   0.000
시군구    0.000   1.000   1.000     0.000     0.860     0.000   0.000   1.000
발생건수  0.000   1.000   1.000     0.000     0.997     0.721   0.774   0.685
사망자수  0.734   0.000   0.000     1.000     0.762     0.838   0.405   0.000
부상자수  0.000   0.860   0.997     0.762     1.000     0.741   0.833   0.623
중상      0.332   0.000   0.721     0.838     0.741     1.000   0.636   0.000
경상      0.112   0.000   0.774     0.405     0.833     0.636   1.000   0.597
부상신고  0.000   1.000   0.685     0.000     0.623     0.000   0.597   1.000

[Figure: correlation plot]

          시도    발생건수
시도      1.000   0.000
발생건수  0.000   1.000

[Figure: correlation plot]

          사망자수  부상자수  중상    경상    부상신고  시도    발생건수
사망자수  1.000     -0.583    -0.184  -0.588  -0.329    0.404   0.000
부상자수  -0.583    1.000     0.391   0.921   0.575     0.000   0.777
중상      -0.184    0.391     1.000   0.372   -0.002    0.076   0.473
경상      -0.588    0.921     0.372   1.000   0.342     0.000   0.630
부상신고  -0.329    0.575     -0.002  0.342   1.000     0.000   0.583
시도      0.404     0.000     0.076   0.000   0.000     1.000   0.000
발생건수  0.000     0.777     0.473   0.630   0.583     0.000   1.000
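
The High correlation alerts in the Overview line up with the last matrix above if a field is flagged whenever the absolute coefficient with another field exceeds 0.5. Both the 0.5 cut-off and the choice of this matrix are inferred from the numbers, not stated in the report, and the names `THRESHOLD` and `highly_correlated` are illustrative. A minimal sketch under those assumptions:

```python
# Field order and coefficients copied from the last correlation matrix above.
fields = ["사망자수", "부상자수", "중상", "경상", "부상신고", "시도", "발생건수"]
matrix = [
    [1.000, -0.583, -0.184, -0.588, -0.329, 0.404, 0.000],
    [-0.583, 1.000, 0.391, 0.921, 0.575, 0.000, 0.777],
    [-0.184, 0.391, 1.000, 0.372, -0.002, 0.076, 0.473],
    [-0.588, 0.921, 0.372, 1.000, 0.342, 0.000, 0.630],
    [-0.329, 0.575, -0.002, 0.342, 1.000, 0.000, 0.583],
    [0.404, 0.000, 0.076, 0.000, 0.000, 1.000, 0.000],
    [0.000, 0.777, 0.473, 0.630, 0.583, 0.000, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, chosen because it reproduces the alerts

def highly_correlated(fields, matrix, threshold):
    """Map each field to the other fields whose |coefficient| exceeds threshold."""
    flagged = {}
    for i, name in enumerate(fields):
        partners = [
            fields[j]
            for j, r in enumerate(matrix[i])
            if j != i and abs(r) > threshold
        ]
        if partners:
            flagged[name] = partners
    return flagged

flagged = highly_correlated(fields, matrix, THRESHOLD)
for name, partners in flagged.items():
    print(name, "->", ", ".join(partners))
```

Under this rule 중상 and 시도 are not flagged, which is consistent with the absence of a High correlation alert for them.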

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

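First rows

    시도  시군구  발생건수  사망자수  부상자수  중상  경상  부상신고
0   서울  강서구  2         0         49        11    35    3
1   서울  강남구  1         0         21        [digits 1911; split across 중상/경상/부상신고 not recoverable]
2   서울  서초구  1         0         21        0     2     19
3   서울  양천구  1         0         31        3     22    6
4   서울  강북구  1         0         32        4     28    0
5   서울  금천구  1         0         40        14    24    2
6   부산  동래구  1         1         19        5     14    0
7   경기  부천시  1         0         27        5     18    4
8   경기  평택시  3         3         45        0     23    22
9   경기  화성시  3         1         81        10    44    27

Last rows

    시도  시군구  발생건수  사망자수  부상자수  중상  경상  부상신고
29  전남  영암군  1         8         11        7     4     0
30  전남  진도군  1         0         22        0     8     14
31  경북  울진군  1         3         1         1     0     0
32  경남  양산시  1         3         1         0     1     0
33  경남  고성군  1         3         5         4     1     0
34  대구  수성구  1         3         0         0     0     0
35  광주  북구    1         0         42        0     42    0
36  대전  동구    1         0         26        16    10    0
37  대전  유성구  1         0         21        3     18    0
38  울산  북구    1         2         37        8     29    0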