Overview

Dataset statistics

Number of variables: 8
Number of observations: 226
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 15.6 KiB
Average record size in memory: 70.6 B

Variable types

Categorical: 2
Text: 1
Numeric: 5

Dataset

Description: Hit-and-run traffic accidents by category (2018) (부문별 뺑소니 교통사고(2018))
Author: 도로교통공단 (Korea Road Traffic Authority)
URL: https://www.data.go.kr/data/15094167/fileData.do

Alerts

발생건수 is highly overall correlated with 부상자수 and 3 other fields (High correlation)
부상자수 is highly overall correlated with 발생건수 and 3 other fields (High correlation)
중상 is highly overall correlated with 발생건수 and 3 other fields (High correlation)
경상 is highly overall correlated with 발생건수 and 3 other fields (High correlation)
부상신고 is highly overall correlated with 발생건수 and 4 other fields (High correlation)
사망자수 is highly overall correlated with 부상신고 (High correlation)
부상자수 has 3 (1.3%) zeros (Zeros)
중상 has 18 (8.0%) zeros (Zeros)
경상 has 6 (2.7%) zeros (Zeros)
부상신고 has 138 (61.1%) zeros (Zeros)
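
The percentages in the Zeros alerts can be reproduced from the zero counts and the 226 observations reported in the overview. The following is a minimal sketch of that arithmetic; the names `zero_counts` and `zero_share` are illustrative and are not part of ydata-profiling.

```python
# Zero counts copied from the Zeros alerts; 226 is the reported number of observations.
N_OBSERVATIONS = 226
zero_counts = {"부상자수": 3, "중상": 18, "경상": 6, "부상신고": 138}


def zero_share(count: int, total: int = N_OBSERVATIONS) -> float:
    """Return the percentage of zero-valued rows, rounded to one decimal place."""
    return round(100 * count / total, 1)


zero_percentages = {name: zero_share(count) for name, count in zero_counts.items()}

for name, pct in zero_percentages.items():
    print(f"{name}: {pct}%")
```

The computed shares agree with the alert text (1.3%, 8.0%, 2.7% and 61.1%).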

Reproduction

Analysis started: 2024-04-20 17:05:16.447551
Analysis finished: 2024-04-20 17:05:23.964047
Duration: 7.52 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
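
As a quick cross-check, the reported duration follows from the two timestamps. This sketch uses only the Python standard library and assumes both timestamps are in the same time zone.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2024-04-20 17:05:16.447551")
finished = datetime.fromisoformat("2024-04-20 17:05:23.964047")

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```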

Variables

시도
Categorical

Distinct: 17
Distinct (%): 7.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Value  Count
경기  31
서울  25
전남  22
경북  22
강원  17
Other values (12)  109

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 0.4%

Sample

1st row: 서울
2nd row: 서울
3rd row: 서울
4th row: 서울
5th row: 서울

Common Values

Value  Count  Frequency (%)
경기  31  13.7%
서울  25  11.1%
전남  22  9.7%
경북  22  9.7%
강원  17  7.5%
경남  17  7.5%
부산  16  7.1%
충남  15  6.6%
전북  14  6.2%
충북  11  4.9%
Other values (7)  36  15.9%

Length

Histogram of lengths of the category

시군구
Text

Distinct: 204
Distinct (%): 90.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Length

Max length: 7
Median length: 3
Mean length: 2.9380531
Min length: 2

Characters and Unicode

Total characters: 664
Distinct characters: 132
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
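
The reported mean length is consistent with the total character count divided by the number of observations. This sketch shows that arithmetic under the assumption that every one of the 226 rows contributes to the total (the variable has no missing values).

```python
total_characters = 664  # "Total characters" reported for this text variable
n_observations = 226    # "Number of observations" from the overview

# Mean string length is total characters divided by number of rows.
mean_length = total_characters / n_observations
print(f"{mean_length:.7f}")
```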

Unique

Unique: 198
Unique (%): 87.6%

Sample

1st row: 종로구
2nd row: 중구
3rd row: 용산구
4th row: 성동구
5th row: 동대문구

Value  Count  Frequency (%)
중구  6  2.7%
동구  6  2.7%
서구  5  2.2%
남구  5  2.2%
북구  4  1.8%
강서구  2  0.9%
구례군  1  0.4%
장흥군  1  0.4%
화순군  1  0.4%
보성군  1  0.4%
Other values (194)  194  85.8%

Most occurring characters

Value  Count  Frequency (%)
[?]  81  12.2%
[?]  78  11.7%
[?]  74  11.1%
[?]  22  3.3%
[?]  20  3.0%
[?]  18  2.7%
[?]  17  2.6%
[?]  17  2.6%
[?]  16  2.4%
[?]  13  2.0%
Other values (122)  308  46.4%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  662  99.7%
Open Punctuation  1  0.2%
Close Punctuation  1  0.2%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
[?]  81  12.2%
[?]  78  11.8%
[?]  74  11.2%
[?]  22  3.3%
[?]  20  3.0%
[?]  18  2.7%
[?]  17  2.6%
[?]  17  2.6%
[?]  16  2.4%
[?]  13  2.0%
Other values (120)  306  46.2%

Open Punctuation
Value  Count  Frequency (%)
(  1  100.0%

Close Punctuation
Value  Count  Frequency (%)
)  1  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  662  99.7%
Common  2  0.3%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
[?]  81  12.2%
[?]  78  11.8%
[?]  74  11.2%
[?]  22  3.3%
[?]  20  3.0%
[?]  18  2.7%
[?]  17  2.6%
[?]  17  2.6%
[?]  16  2.4%
[?]  13  2.0%
Other values (120)  306  46.2%

Common
Value  Count  Frequency (%)
(  1  50.0%
)  1  50.0%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  662  99.7%
ASCII  2  0.3%

Most frequent character per block

Hangul
Value  Count  Frequency (%)
[?]  81  12.2%
[?]  78  11.8%
[?]  74  11.2%
[?]  22  3.3%
[?]  20  3.0%
[?]  18  2.7%
[?]  17  2.6%
[?]  17  2.6%
[?]  16  2.4%
[?]  13  2.0%
Other values (120)  306  46.2%

ASCII
Value  Count  Frequency (%)
(  1  50.0%
)  1  50.0%

발생건수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 83
Distinct (%): 36.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 33.632743
Minimum: 1
Maximum: 261
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 KiB

Quantile statistics

Minimum: 1
5th percentile: 2
Q1: 6
Median: 18.5
Q3: 39
95th percentile: 120.5
Maximum: 261
Range: 260
Interquartile range (IQR): 33

Descriptive statistics

Standard deviation: 43.232833
Coefficient of variation (CV): 1.2854388
Kurtosis: 7.7024378
Mean: 33.632743
Median Absolute Deviation (MAD): 13.5
Skewness: 2.580525
Sum: 7601
Variance: 1869.0779
Monotonicity: Not monotonic
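
Several of the 발생건수 statistics are related by simple identities: the mean is the sum over the number of observations, the variance is the squared standard deviation, the coefficient of variation is the standard deviation over the mean, and the range and IQR are differences of quantiles. The sketch below recomputes them from the reported figures. Because the standard deviation is itself a rounded value, the results only match to the printed precision.

```python
# Figures copied from the 발생건수 statistics; n comes from the overview.
total, n = 7601, 226
std = 43.232833
minimum, q1, q3, maximum = 1, 6, 39, 261

mean = total / n              # arithmetic mean
variance = std ** 2           # variance is the squared standard deviation
cv = std / mean               # coefficient of variation
iqr = q3 - q1                 # interquartile range
value_range = maximum - minimum

print(f"mean={mean:.6f} variance={variance:.4f} cv={cv:.7f}")
print(f"iqr={iqr} range={value_range}")
```

A coefficient of variation above 1 together with a mean (33.6) well above the median (18.5) is consistent with the strong right skew reported for this variable.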
Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
3  15  6.6%
5  14  6.2%
15  11  4.9%
6  11  4.9%
2  10  4.4%
4  6  2.7%
27  6  2.7%
14  6  2.7%
21  5  2.2%
1  5  2.2%
Other values (73)  137  60.6%

Minimum 10 values

Value  Count  Frequency (%)
1  5  2.2%
2  10  4.4%
3  15  6.6%
4  6  2.7%
5  14  6.2%
6  11  4.9%
7  4  1.8%
8  3  1.3%
9  4  1.8%
10  4  1.8%

Maximum 10 values

Value  Count  Frequency (%)
261  1  0.4%
233  1  0.4%
224  1  0.4%
182  1  0.4%
174  1  0.4%
166  1  0.4%
165  1  0.4%
161  1  0.4%
158  1  0.4%
143  1  0.4%

사망자수
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Value  Count
0  150
1  56
2  10
3  9
4  1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 0.4%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0  150  66.4%
1  56  24.8%
2  10  4.4%
3  9  4.0%
4  1  0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of the common values of 사망자수]

부상자수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 99
Distinct (%): 43.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 49.110619
Minimum: 0
Maximum: 351
Zeros: 3
Zeros (%): 1.3%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 3
Q1: 9.25
Median: 29.5
Q3: 55.75
95th percentile: 183
Maximum: 351
Range: 351
Interquartile range (IQR): 46.5

Descriptive statistics

Standard deviation: 61.390562
Coefficient of variation (CV): 1.2500466
Kurtosis: 7.0135164
Mean: 49.110619
Median Absolute Deviation (MAD): 22.5
Skewness: 2.4856907
Sum: 11099
Variance: 3768.801
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
7  11  4.9%
4  9  4.0%
2  8  3.5%
5  7  3.1%
20  7  3.1%
3  7  3.1%
15  6  2.7%
6  6  2.7%
38  5  2.2%
24  5  2.2%
Other values (89)  155  68.6%

Minimum 10 values

Value  Count  Frequency (%)
0  3  1.3%
2  8  3.5%
3  7  3.1%
4  9  4.0%
5  7  3.1%
6  6  2.7%
7  11  4.9%
8  4  1.8%
9  2  0.9%
10  3  1.3%

Maximum 10 values

Value  Count  Frequency (%)
351  1  0.4%
335  1  0.4%
297  1  0.4%
272  1  0.4%
270  1  0.4%
264  1  0.4%
231  1  0.4%
221  1  0.4%
220  1  0.4%
207  1  0.4%

중상
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 39
Distinct (%): 17.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.2345133
Minimum: 0
Maximum: 61
Zeros: 18
Zeros (%): 8.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 2
Median: 6
Q3: 13
95th percentile: 28.75
Maximum: 61
Range: 61
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 10.403113
Coefficient of variation (CV): 1.126547
Kurtosis: 6.2874271
Mean: 9.2345133
Median Absolute Deviation (MAD): 5
Skewness: 2.2341827
Sum: 2087
Variance: 108.22476
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=39)

Common values

Value  Count  Frequency (%)
1  23  10.2%
2  23  10.2%
0  18  8.0%
4  18  8.0%
3  17  7.5%
7  14  6.2%
6  11  4.9%
8  11  4.9%
12  8  3.5%
13  8  3.5%
Other values (29)  75  33.2%

Minimum 10 values

Value  Count  Frequency (%)
0  18  8.0%
1  23  10.2%
2  23  10.2%
3  17  7.5%
4  18  8.0%
5  8  3.5%
6  11  4.9%
7  14  6.2%
8  11  4.9%
9  3  1.3%

Maximum 10 values

Value  Count  Frequency (%)
61  1  0.4%
58  1  0.4%
49  1  0.4%
48  1  0.4%
46  1  0.4%
42  1  0.4%
38  1  0.4%
35  1  0.4%
34  2  0.9%
31  1  0.4%

경상
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 86
Distinct (%): 38.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 38.49115
Minimum: 0
Maximum: 301
Zeros: 6
Zeros (%): 2.7%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 1
Q1: 7
Median: 21.5
Q3: 42
95th percentile: 145.75
Maximum: 301
Range: 301
Interquartile range (IQR): 35

Descriptive statistics

Standard deviation: 50.288787
Coefficient of variation (CV): 1.3065026
Kurtosis: 7.8218684
Mean: 38.49115
Median Absolute Deviation (MAD): 16.5
Skewness: 2.6149802
Sum: 8699
Variance: 2528.9621
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
2  10  4.4%
5  10  4.4%
17  9  4.0%
18  8  3.5%
3  8  3.5%
4  8  3.5%
1  8  3.5%
0  6  2.7%
31  5  2.2%
12  5  2.2%
Other values (76)  149  65.9%

Minimum 10 values

Value  Count  Frequency (%)
0  6  2.7%
1  8  3.5%
2  10  4.4%
3  8  3.5%
4  8  3.5%
5  10  4.4%
6  4  1.8%
7  4  1.8%
8  5  2.2%
9  1  0.4%

Maximum 10 values

Value  Count  Frequency (%)
301  1  0.4%
251  1  0.4%
244  1  0.4%
239  1  0.4%
237  1  0.4%
230  1  0.4%
181  1  0.4%
167  2  0.9%
160  1  0.4%
158  1  0.4%

부상신고
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 15
Distinct (%): 6.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.3849558
Minimum: 0
Maximum: 23
Zeros: 138
Zeros (%): 61.1%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 1
95th percentile: 6.75
Maximum: 23
Range: 23
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 3.1633378
Coefficient of variation (CV): 2.2840714
Kurtosis: 18.735517
Mean: 1.3849558
Median Absolute Deviation (MAD): 0
Skewness: 3.9547966
Sum: 313
Variance: 10.006706
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=15)

Common values

Value  Count  Frequency (%)
0  138  61.1%
1  37  16.4%
2  18  8.0%
3  6  2.7%
4  6  2.7%
6  5  2.2%
5  4  1.8%
8  3  1.3%
7  2  0.9%
11  2  0.9%
Other values (5)  5  2.2%

Minimum 10 values

Value  Count  Frequency (%)
0  138  61.1%
1  37  16.4%
2  18  8.0%
3  6  2.7%
4  6  2.7%
5  4  1.8%
6  5  2.2%
7  2  0.9%
8  3  1.3%
11  2  0.9%

Maximum 10 values

Value  Count  Frequency (%)
23  1  0.4%
19  1  0.4%
18  1  0.4%
16  1  0.4%
12  1  0.4%
11  2  0.9%
8  3  1.3%
7  2  0.9%
6  5  2.2%
5  4  1.8%

Interactions

[Figure: grid of 25 pairwise interaction plots]

Correlations

          시도     발생건수  사망자수  부상자수  중상     경상     부상신고
시도        1.000  0.362  0.306  0.467  0.332  0.371  0.000
발생건수      0.362  1.000  0.494  0.929  0.851  0.916  0.812
사망자수      0.306  0.494  1.000  0.746  0.779  0.702  0.858
부상자수      0.467  0.929  0.746  1.000  0.950  0.982  0.897
중상        0.332  0.851  0.779  0.950  1.000  0.924  0.908
경상        0.371  0.916  0.702  0.982  0.924  1.000  0.878
부상신고      0.000  0.812  0.858  0.897  0.908  0.878  1.000
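
The High correlation alerts near the top of the report are consistent with applying a cutoff to the 7 × 7 matrix above. The sketch below lists, for each variable, the other variables whose coefficient is at least 0.8. The cutoff of 0.8 is an assumption chosen because it reproduces the alerts; the report itself does not state the threshold, and `correlated_fields` is an illustrative helper, not a ydata-profiling function.

```python
columns = ["시도", "발생건수", "사망자수", "부상자수", "중상", "경상", "부상신고"]

# Coefficients copied row by row from the 7 x 7 correlation matrix.
matrix = [
    [1.000, 0.362, 0.306, 0.467, 0.332, 0.371, 0.000],
    [0.362, 1.000, 0.494, 0.929, 0.851, 0.916, 0.812],
    [0.306, 0.494, 1.000, 0.746, 0.779, 0.702, 0.858],
    [0.467, 0.929, 0.746, 1.000, 0.950, 0.982, 0.897],
    [0.332, 0.851, 0.779, 0.950, 1.000, 0.924, 0.908],
    [0.371, 0.916, 0.702, 0.982, 0.924, 1.000, 0.878],
    [0.000, 0.812, 0.858, 0.897, 0.908, 0.878, 1.000],
]

THRESHOLD = 0.8  # assumed cutoff, not stated in the report


def correlated_fields(threshold: float = THRESHOLD) -> dict:
    """Map each variable to the other variables at or above the threshold."""
    result = {}
    for i, name in enumerate(columns):
        partners = [
            columns[j]
            for j, value in enumerate(matrix[i])
            if j != i and value >= threshold
        ]
        if partners:
            result[name] = partners
    return result


for name, partners in correlated_fields().items():
    print(f"{name}: {', '.join(partners)}")
```

Under this assumption 발생건수, 부상자수, 중상 and 경상 each have four partners, 부상신고 has five, 사망자수 has only 부상신고, and 시도 has none, which matches the counts given in the alerts.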

          시도     사망자수
시도        1.000  0.157
사망자수      0.157  1.000

          발생건수  부상자수  중상     경상     부상신고  시도     사망자수
발생건수      1.000  0.989  0.909  0.977  0.612  0.124  0.316
부상자수      0.989  1.000  0.906  0.991  0.606  0.198  0.398
중상        0.909  0.906  1.000  0.850  0.552  0.132  0.429
경상        0.977  0.991  0.850  1.000  0.575  0.150  0.361
부상신고      0.612  0.606  0.552  0.575  1.000  0.000  0.520
시도        0.124  0.198  0.132  0.150  0.000  1.000  0.157
사망자수      0.316  0.398  0.429  0.361  0.520  0.157  1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

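First rows

#  시도  시군구  발생건수  사망자수  부상자수  중상  경상  부상신고
0  서울  종로구  17  0  28  6  20  2
1  서울  중구  13  0  17  4  11  2
2  서울  용산구  41  1  57  15  42  0
3  서울  성동구  24  0  44  4  40  0
4  서울  동대문구  33  0  45  12  33  0
5  서울  성북구  29  1  54  8  45  1
6  서울  도봉구  14  0  22  3  19  0
7  서울  은평구  21  0  33  9  22  2
8  서울  서대문구  15  0  32  3  27  2
9  서울  마포구  42  0  61  11  47  3

Last rows

#  시도  시군구  발생건수  사망자수  부상자수  중상  경상  부상신고
216  대전  중구  34  0  49  8  40  1
217  대전  서구  69  2  123  16  107  0
218  대전  유성구  57  1  103  13  90  0
219  대전  대덕구  27  1  46  7  38  1
220  울산  중구  37  1  51  10  39  2
221  울산  남구  56  3  99  27  71  1
222  울산  동구  21  1  24  7  17  0
223  울산  북구  41  0  67  14  53  0
224  울산  울주군  41  2  57  17  40  0
225  세종  세종  19  0  27  4  23  0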
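
The reported column sums suggest that 부상자수 (number injured) is the total of 중상 (serious injuries), 경상 (minor injuries) and 부상신고 (reported injuries): 2087 + 8699 + 313 = 11099. The sketch below checks the same identity on the 20 sample rows. The per-row values are my reading of the flattened sample rows, where the digit runs had to be split into columns, and the English glosses of the column names are an interpretation, so treat this as a consistency check rather than a documented property of the dataset.

```python
# (시도, 시군구, 발생건수, 사망자수, 부상자수, 중상, 경상, 부상신고)
sample_rows = [
    ("서울", "종로구", 17, 0, 28, 6, 20, 2),
    ("서울", "중구", 13, 0, 17, 4, 11, 2),
    ("서울", "용산구", 41, 1, 57, 15, 42, 0),
    ("서울", "성동구", 24, 0, 44, 4, 40, 0),
    ("서울", "동대문구", 33, 0, 45, 12, 33, 0),
    ("서울", "성북구", 29, 1, 54, 8, 45, 1),
    ("서울", "도봉구", 14, 0, 22, 3, 19, 0),
    ("서울", "은평구", 21, 0, 33, 9, 22, 2),
    ("서울", "서대문구", 15, 0, 32, 3, 27, 2),
    ("서울", "마포구", 42, 0, 61, 11, 47, 3),
    ("대전", "중구", 34, 0, 49, 8, 40, 1),
    ("대전", "서구", 69, 2, 123, 16, 107, 0),
    ("대전", "유성구", 57, 1, 103, 13, 90, 0),
    ("대전", "대덕구", 27, 1, 46, 7, 38, 1),
    ("울산", "중구", 37, 1, 51, 10, 39, 2),
    ("울산", "남구", 56, 3, 99, 27, 71, 1),
    ("울산", "동구", 21, 1, 24, 7, 17, 0),
    ("울산", "북구", 41, 0, 67, 14, 53, 0),
    ("울산", "울주군", 41, 2, 57, 17, 40, 0),
    ("세종", "세종", 19, 0, 27, 4, 23, 0),
]


def injuries_consistent(row: tuple) -> bool:
    """True when 부상자수 equals 중상 + 경상 + 부상신고 for the row."""
    _, _, _, _, injured, serious, minor, reported = row
    return injured == serious + minor + reported


# Rows that break the identity (expected to be empty for the sample).
inconsistent = [row for row in sample_rows if not injuries_consistent(row)]

# Same identity on the reported column sums (Sum fields of each variable).
column_sums_match = 2087 + 8699 + 313 == 11099

print(f"inconsistent rows: {len(inconsistent)}, column sums match: {column_sums_match}")
```

This identity also helps explain why 부상자수, 중상 and 경상 are so strongly correlated with one another.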