Overview

Dataset statistics

Number of variables: 7
Number of observations: 229
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 13.8 KiB
Average record size in memory: 61.6 B

Variable types

Categorical: 1
Text: 1
Numeric: 5

Dataset

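Description: Statistics on traffic accidents investigated and handled by the police; only accidents involving personal injury are counted. The data give, by province or metropolitan city (시도) and by city, county, or district (시군구), the number of accidents (사고건수), deaths (사망자수), serious injuries (중상자수), minor injuries (경상자수), and reported injuries (부상신고자수). Based on data from the Traffic Accident Analysis System (http://taas.koroad.or.kr).
URL: https://www.data.go.kr/data/15070297/fileData.do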

Alerts

사고건수 is highly overall correlated with 사망자수 and 3 other fields (High correlation)
사망자수 is highly overall correlated with 사고건수 and 2 other fields (High correlation)
중상자수 is highly overall correlated with 사고건수 and 3 other fields (High correlation)
경상자수 is highly overall correlated with 사고건수 and 3 other fields (High correlation)
부상신고자수 is highly overall correlated with 사고건수 and 2 other fields (High correlation)
부상신고자수 has 6 (2.6%) zeros (Zeros)

Reproduction

Analysis started: 2023-12-12 23:34:49.772850
Analysis finished: 2023-12-12 23:34:52.536811
Duration: 2.76 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
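
The reported duration can be checked from the two timestamps above. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the report.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2023-12-12 23:34:49.772850")
finished = datetime.fromisoformat("2023-12-12 23:34:52.536811")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```

Rounded to two decimals, the difference agrees with the Duration line.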

Variables

시도
Categorical

Distinct: 17
Distinct (%): 7.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

경기                31
서울                25
경북                23
전남                22
강원                18
Other values (12)   110

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 0.4%

Sample

1st row: 서울
2nd row: 서울
3rd row: 서울
4th row: 서울
5th row: 서울

Common Values

Value               Count  Frequency (%)
경기                31     13.5%
서울                25     10.9%
경북                23     10.0%
전남                22     9.6%
강원                18     7.9%
경남                18     7.9%
부산                16     7.0%
충남                15     6.6%
전북                14     6.1%
충북                11     4.8%
Other values (7)    36     15.7%
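
The Frequency (%) column is each count divided by the 229 observations. A small sketch, assuming the counts in the Common Values table above are complete (the dictionary below is transcribed from that table; the names `counts` and `percentages` are illustrative):

```python
# Counts copied from the Common Values table for 시도.
counts = {
    "경기": 31, "서울": 25, "경북": 23, "전남": 22, "강원": 18,
    "경남": 18, "부산": 16, "충남": 15, "전북": 14, "충북": 11,
    "Other values (7)": 36,
}

# The counts should add up to the number of observations.
total = sum(counts.values())

# Share of each value, rounded to one decimal as in the report.
percentages = {value: round(100 * n / total, 1) for value, n in counts.items()}

for value, pct in percentages.items():
    print(f"{value}: {pct}%")
```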

Length

Histogram of lengths of the category

시군구
Text

Distinct: 207
Distinct (%): 90.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Length

Max length: 7
Median length: 3
Mean length: 2.9519651
Min length: 2

Characters and Unicode

Total characters: 676
Distinct characters: 135
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
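
The mean length reported for the 시군구 column follows from the total character count and the number of observations. A minimal sketch of that arithmetic, with illustrative variable names:

```python
# Values copied from the report.
total_characters = 676   # Total characters in the 시군구 column
n_observations = 229     # Number of observations in the dataset

# Average number of characters per value.
mean_length = total_characters / n_observations
print(f"{mean_length:.7f}")
```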

Unique

Unique: 200
Unique (%): 87.3%

Sample

1st row: 종로구
2nd row: 중구
3rd row: 용산구
4th row: 성동구
5th row: 동대문구

Value               Count  Frequency (%)
중구                6      2.6%
동구                6      2.6%
서구                5      2.2%
남구                4      1.7%
북구                4      1.7%
강서구              2      0.9%
고성군              2      0.9%
구례군              1      0.4%
담양군              1      0.4%
곡성군              1      0.4%
Other values (197)  197    86.0%

Most occurring characters

Value               Count  Frequency (%)
                    85     12.6%
                    79     11.7%
                    74     10.9%
                    22     3.3%
                    20     3.0%
                    18     2.7%
                    18     2.7%
                    17     2.5%
                    16     2.4%
                    13     1.9%
Other values (125)  314    46.4%

Most occurring categories

Value               Count  Frequency (%)
Other Letter        674    99.7%
Close Punctuation   1      0.1%
Open Punctuation    1      0.1%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
                    85     12.6%
                    79     11.7%
                    74     11.0%
                    22     3.3%
                    20     3.0%
                    18     2.7%
                    18     2.7%
                    17     2.5%
                    16     2.4%
                    13     1.9%
Other values (123)  312    46.3%

Close Punctuation
Value               Count  Frequency (%)
)                   1      100.0%

Open Punctuation
Value               Count  Frequency (%)
(                   1      100.0%

Most occurring scripts

Value               Count  Frequency (%)
Hangul              674    99.7%
Common              2      0.3%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
                    85     12.6%
                    79     11.7%
                    74     11.0%
                    22     3.3%
                    20     3.0%
                    18     2.7%
                    18     2.7%
                    17     2.5%
                    16     2.4%
                    13     1.9%
Other values (123)  312    46.3%

Common
Value               Count  Frequency (%)
)                   1      50.0%
(                   1      50.0%

Most occurring blocks

Value               Count  Frequency (%)
Hangul              674    99.7%
ASCII               2      0.3%

Most frequent character per block

Hangul
Value               Count  Frequency (%)
                    85     12.6%
                    79     11.7%
                    74     11.0%
                    22     3.3%
                    20     3.0%
                    18     2.7%
                    18     2.7%
                    17     2.5%
                    16     2.4%
                    13     1.9%
Other values (123)  312    46.3%

ASCII
Value               Count  Frequency (%)
)                   1      50.0%
(                   1      50.0%

사고건수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 208
Distinct (%): 90.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 859.54585
Minimum: 15
Maximum: 4705
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 KiB

Quantile statistics

Minimum: 15
5th percentile: 92.4
Q1: 209
Median: 584
Q3: 1165
95th percentile: 2733.6
Maximum: 4705
Range: 4690
Interquartile range (IQR): 956
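
Range and IQR are derived from the other quantile statistics: Range is Maximum minus Minimum, and IQR is Q3 minus Q1. The sketch below recomputes both for all five numeric columns, using the quantile values reported in this and the later variable sections; the helper name `spread` is illustrative.

```python
# (Minimum, Q1, Q3, Maximum) for each numeric column, copied from the report.
quantiles = {
    "사고건수": (15, 209, 1165, 4705),
    "사망자수": (0, 6, 15, 61),
    "중상자수": (9, 85, 285, 1085),
    "경상자수": (13, 213, 1231, 5055),
    "부상신고자수": (0, 12, 99, 619),
}


def spread(minimum, q1, q3, maximum):
    """Return (range, interquartile range) for one column."""
    return maximum - minimum, q3 - q1


for name, values in quantiles.items():
    value_range, iqr = spread(*values)
    print(f"{name}: range={value_range}, IQR={iqr}")
```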

Descriptive statistics

Standard deviation: 882.68801
Coefficient of variation (CV): 1.0269237
Kurtosis: 3.3517926
Mean: 859.54585
Median Absolute Deviation (MAD): 426
Skewness: 1.7968782
Sum: 196836
Variance: 779138.13
Monotonicity: Not monotonic
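
Several of these figures are tied together by simple identities: the mean is Sum divided by the number of observations, the variance is the square of the standard deviation, and the CV is the standard deviation divided by the mean. A minimal sketch of that check for 사고건수 follows. Because the inputs are the rounded values printed in the report, the last digit may differ slightly.

```python
# Values copied from the report for 사고건수.
n = 229            # Number of observations
total = 196836     # Sum
std = 882.68801    # Standard deviation (as printed, already rounded)

mean = total / n        # arithmetic mean
variance = std ** 2     # variance is the squared standard deviation
cv = std / mean         # coefficient of variation

print(f"mean={mean:.5f} variance={variance:.2f} cv={cv:.7f}")
```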
Histogram with fixed size bins (bins=50)

Common Values

Value               Count  Frequency (%)
152                 3      1.3%
113                 3      1.3%
445                 3      1.3%
278                 2      0.9%
154                 2      0.9%
79                  2      0.9%
198                 2      0.9%
196                 2      0.9%
1384                2      0.9%
228                 2      0.9%
Other values (198)  206    90.0%

Minimum 10 values

Value               Count  Frequency (%)
15                  1      0.4%
26                  1      0.4%
40                  1      0.4%
47                  2      0.9%
62                  1      0.4%
75                  1      0.4%
79                  2      0.9%
87                  1      0.4%
89                  1      0.4%
92                  1      0.4%

Maximum 10 values

Value               Count  Frequency (%)
4705                1      0.4%
4039                1      0.4%
3709                1      0.4%
3685                1      0.4%
3599                1      0.4%
3588                1      0.4%
3562                1      0.4%
3508                1      0.4%
3457                1      0.4%
3230                1      0.4%

사망자수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 38
Distinct (%): 16.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.943231
Minimum: 0
Maximum: 61
Zeros: 2
Zeros (%): 0.9%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 2.4
Q1: 6
Median: 9
Q3: 15
95th percentile: 27
Maximum: 61
Range: 61
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 9.3335515
Coefficient of variation (CV): 0.78149298
Kurtosis: 7.0084215
Mean: 11.943231
Median Absolute Deviation (MAD): 4
Skewness: 2.2166718
Sum: 2735
Variance: 87.115184
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=38)

Common Values

Value               Count  Frequency (%)
9                   20     8.7%
8                   19     8.3%
10                  19     8.3%
6                   15     6.6%
3                   14     6.1%
5                   13     5.7%
7                   11     4.8%
4                   11     4.8%
13                  10     4.4%
11                  9      3.9%
Other values (28)   88     38.4%

Minimum 10 values

Value               Count  Frequency (%)
0                   2      0.9%
1                   1      0.4%
2                   9      3.9%
3                   14     6.1%
4                   11     4.8%
5                   13     5.7%
6                   15     6.6%
7                   11     4.8%
8                   19     8.3%
9                   20     8.7%

Maximum 10 values

Value               Count  Frequency (%)
61                  1      0.4%
56                  1      0.4%
52                  1      0.4%
47                  1      0.4%
43                  1      0.4%
39                  1      0.4%
37                  1      0.4%
35                  1      0.4%
32                  1      0.4%
29                  1      0.4%

중상자수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 183
Distinct (%): 79.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 225.82969
Minimum: 9
Maximum: 1085
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 KiB

Quantile statistics

Minimum: 9
5th percentile: 36
Q1: 85
Median: 180
Q3: 285
95th percentile: 661
Maximum: 1085
Range: 1076
Interquartile range (IQR): 200

Descriptive statistics

Standard deviation: 197.11792
Coefficient of variation (CV): 0.87286095
Kurtosis: 3.622745
Mean: 225.82969
Median Absolute Deviation (MAD): 96
Skewness: 1.8028932
Sum: 51715
Variance: 38855.475
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value               Count  Frequency (%)
175                 3      1.3%
100                 3      1.3%
40                  3      1.3%
79                  3      1.3%
70                  3      1.3%
105                 2      0.9%
36                  2      0.9%
215                 2      0.9%
77                  2      0.9%
56                  2      0.9%
Other values (173)  204    89.1%

Minimum 10 values

Value               Count  Frequency (%)
9                   1      0.4%
10                  1      0.4%
17                  2      0.9%
18                  2      0.9%
29                  1      0.4%
33                  2      0.9%
34                  1      0.4%
35                  1      0.4%
36                  2      0.9%
37                  1      0.4%

Maximum 10 values

Value               Count  Frequency (%)
1085                1      0.4%
1037                1      0.4%
859                 1      0.4%
855                 1      0.4%
850                 1      0.4%
828                 1      0.4%
795                 1      0.4%
769                 1      0.4%
732                 1      0.4%
685                 1      0.4%

경상자수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 214
Distinct (%): 93.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 927.64192
Minimum: 13
Maximum: 5055
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 KiB

Quantile statistics

Minimum: 13
5th percentile: 83
Q1: 213
Median: 620
Q3: 1231
95th percentile: 2888.4
Maximum: 5055
Range: 5042
Interquartile range (IQR): 1018

Descriptive statistics

Standard deviation: 988.10144
Coefficient of variation (CV): 1.0651755
Kurtosis: 3.2963188
Mean: 927.64192
Median Absolute Deviation (MAD): 451
Skewness: 1.823133
Sum: 212430
Variance: 976344.46
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value               Count  Frequency (%)
169                 3      1.3%
94                  2      0.9%
1239                2      0.9%
131                 2      0.9%
123                 2      0.9%
579                 2      0.9%
391                 2      0.9%
1243                2      0.9%
2775                2      0.9%
77                  2      0.9%
Other values (204)  208    90.8%

Minimum 10 values

Value               Count  Frequency (%)
13                  1      0.4%
31                  1      0.4%
34                  1      0.4%
43                  1      0.4%
48                  1      0.4%
53                  1      0.4%
59                  1      0.4%
64                  1      0.4%
75                  1      0.4%
77                  2      0.9%

Maximum 10 values

Value               Count  Frequency (%)
5055                1      0.4%
4595                1      0.4%
4296                1      0.4%
4181                1      0.4%
4035                1      0.4%
3857                1      0.4%
3832                1      0.4%
3797                1      0.4%
3760                1      0.4%
3700                1      0.4%

부상신고자수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 121
Distinct (%): 52.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 77.10917
Minimum: 0
Maximum: 619
Zeros: 6
Zeros (%): 2.6%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 KiB

Quantile statistics

Minimum: 0
5th percentile: 1
Q1: 12
Median: 38
Q3: 99
95th percentile: 283
Maximum: 619
Range: 619
Interquartile range (IQR): 87

Descriptive statistics

Standard deviation: 102.36896
Coefficient of variation (CV): 1.3275848
Kurtosis: 6.7504041
Mean: 77.10917
Median Absolute Deviation (MAD): 32
Skewness: 2.3934009
Sum: 17658
Variance: 10479.405
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value               Count  Frequency (%)
2                   8      3.5%
1                   7      3.1%
19                  6      2.6%
0                   6      2.6%
5                   6      2.6%
6                   6      2.6%
41                  5      2.2%
15                  5      2.2%
12                  5      2.2%
52                  4      1.7%
Other values (111)  171    74.7%

Minimum 10 values

Value               Count  Frequency (%)
0                   6      2.6%
1                   7      3.1%
2                   8      3.5%
3                   4      1.7%
4                   4      1.7%
5                   6      2.6%
6                   6      2.6%
7                   2      0.9%
8                   4      1.7%
9                   1      0.4%

Maximum 10 values

Value               Count  Frequency (%)
619                 1      0.4%
498                 1      0.4%
491                 1      0.4%
477                 1      0.4%
453                 1      0.4%
377                 1      0.4%
363                 1      0.4%
352                 2      0.9%
332                 1      0.4%
299                 1      0.4%

Interactions

[Figure: 5 x 5 grid of pairwise interaction plots for the numeric variables 사고건수, 사망자수, 중상자수, 경상자수, 부상신고자수]

Correlations

              시도      사고건수  사망자수  중상자수  경상자수  부상신고자수
시도          1.000     0.546     0.355     0.493     0.538     0.565
사고건수      0.546     1.000     0.915     0.862     0.980     0.872
사망자수      0.355     0.915     1.000     0.726     0.860     0.834
중상자수      0.493     0.862     0.726     1.000     0.848     0.718
경상자수      0.538     0.980     0.860     0.848     1.000     0.873
부상신고자수  0.565     0.872     0.834     0.718     0.873     1.000

              사고건수  사망자수  중상자수  경상자수  부상신고자수  시도
사고건수      1.000     0.556     0.960     0.992     0.879         0.242
사망자수      0.556     1.000     0.610     0.546     0.423         0.142
중상자수      0.960     0.610     1.000     0.936     0.842         0.216
경상자수      0.992     0.546     0.936     1.000     0.862         0.238
부상신고자수  0.879     0.423     0.842     0.862     1.000         0.254
시도          0.242     0.142     0.216     0.238     0.254         1.000
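
The High correlation alerts near the top of the report can be related to the second matrix above. The sketch below flags, for each field, every other field whose coefficient is at least 0.5. Both the cut-off of 0.5 and the choice of the second matrix as the source of the alerts are assumptions made for illustration; they happen to reproduce the field counts in the alerts, but the report does not state either. The function name `highly_correlated` is hypothetical.

```python
# Second correlation matrix from the report, transcribed row by row.
fields = ["사고건수", "사망자수", "중상자수", "경상자수", "부상신고자수", "시도"]
matrix = [
    [1.000, 0.556, 0.960, 0.992, 0.879, 0.242],
    [0.556, 1.000, 0.610, 0.546, 0.423, 0.142],
    [0.960, 0.610, 1.000, 0.936, 0.842, 0.216],
    [0.992, 0.546, 0.936, 1.000, 0.862, 0.238],
    [0.879, 0.423, 0.842, 0.862, 1.000, 0.254],
    [0.242, 0.142, 0.216, 0.238, 0.254, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, not stated in the report


def highly_correlated(names, rows, threshold):
    """Map each field to the other fields whose |coefficient| >= threshold."""
    result = {}
    for i, name in enumerate(names):
        partners = [
            names[j]
            for j, value in enumerate(rows[i])
            if j != i and abs(value) >= threshold
        ]
        if partners:
            result[name] = partners
    return result


alerts = highly_correlated(fields, matrix, THRESHOLD)
for name, partners in alerts.items():
    print(f"{name}: {len(partners)} field(s) -> {', '.join(partners)}")
```

Under these assumptions 시도 is not flagged, and 사망자수 and 부상신고자수 each pair with three fields, which matches the "and 2 other fields" wording of their alerts.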

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

     시도  시군구    사고건수  사망자수  중상자수  경상자수  부상신고자수
0    서울  종로구    974       3         224       1006      121
1    서울  중구      943       2         246       866       100
2    서울  용산구    1014      11        250       1037      43
3    서울  성동구    948       8         226       941       55
4    서울  동대문구  1534      12        367       1308      168
5    서울  성북구    1288      8         298       1231      97
6    서울  도봉구    522       2         172       471       62
7    서울  은평구    918       9         273       895       88
8    서울  서대문구  974       5         211       911       181
9    서울  마포구    1065      10        246       1121      116

Last rows

     시도  시군구    사고건수  사망자수  중상자수  경상자수  부상신고자수
219  대전  중구      1024      5         221       1217      31
220  대전  서구      2302      9         470       2717      52
221  대전  유성구    1645      13        336       2031      105
222  대전  대덕구    923       10        190       1056      55
223  울산  중구      603       4         181       649       50
224  울산  남구      1094      5         336       997       90
225  울산  동구      501       2         181       410       31
226  울산  북구      802       10        244       910       93
227  울산  울주군    660       11        239       647       119
228  세종  세종시    932       17        234       744       352