Overview

Dataset statistics

Number of variables: 10
Number of observations: 7009
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 588.8 KiB
Average record size in memory: 86.0 B
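
The derived figures in these statistics follow from the raw ones by simple arithmetic. The sketch below re-derives the average record size and the missing-cell percentage from the numbers reported above; the variable names are illustrative and are not part of the profiling tool.

```python
# Figures copied from the dataset statistics in this report.
total_size_kib = 588.8   # total size in memory, KiB
n_observations = 7009    # number of rows
n_variables = 10         # number of columns
missing_cells = 0

# A KiB is 1024 bytes, so the per-row size is total bytes / rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

# Missing-cell share is relative to all cells (rows x columns).
total_cells = n_observations * n_variables
missing_pct = 100 * missing_cells / total_cells

print(f"{avg_record_bytes:.1f} B per record")
print(f"{missing_pct:.1f}% of {total_cells} cells missing")
```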

Variable types

DateTime: 1
Numeric: 4
Categorical: 4
Text: 1

Dataset

Description: Traffic accidents in Jeonju-si, Jeonbuk, by day, hour, and legal dong (법정동), 2017-2019
Author: Korea Road Traffic Authority (도로교통공단)
URL: https://www.data.go.kr/data/15094462/fileData.do

Alerts

발생지_시도 has constant value "전북" (Constant)
발생지_시군구 has constant value "전주시" (Constant)
중상자수 is highly overall correlated with 경상자수 (High correlation)
경상자수 is highly overall correlated with 중상자수 (High correlation)
사고건수 is highly imbalanced (94.3%) (Imbalance)
사망자수 is highly imbalanced (91.8%) (Imbalance)
부상신고자수 is highly skewed (γ1 = 45.20319818) (Skewed)
발생시간 has 259 (3.7%) zeros (Zeros)
중상자수 has 5098 (72.7%) zeros (Zeros)
경상자수 has 1659 (23.7%) zeros (Zeros)
부상신고자수 has 6904 (98.5%) zeros (Zeros)
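
The two imbalance percentages above are consistent with a score of one minus the normalized Shannon entropy of the class shares. The sketch below assumes that definition (it is inferred from the reported numbers, not quoted from the tool) and uses the class counts listed in the 사고건수 and 사망자수 sections of this report. The function name and the single-class convention are hypothetical.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k) for class counts, where H is Shannon entropy.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class is scored 0 in this sketch
    total = sum(counts)
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)


# Class counts copied from this report.
accident_counts = [6963, 46]     # 사고건수: values 1 and 2
death_counts = [6888, 117, 4]    # 사망자수: values 0, 1 and 2

print(f"사고건수: {imbalance_score(accident_counts):.1%}")
print(f"사망자수: {imbalance_score(death_counts):.1%}")
```

Under this definition, both columns score above 0.9 because a single value accounts for more than 98% of the rows.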

Reproduction

Analysis started: 2023-12-12 18:11:33.047919
Analysis finished: 2023-12-12 18:11:36.404454
Duration: 3.36 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

발생일 (date of occurrence)
DateTime

Distinct: 1089
Distinct (%): 15.5%
Missing: 0
Missing (%): 0.0%
Memory size: 54.9 KiB
Minimum: 2017-01-01 00:00:00
Maximum: 2019-12-31 00:00:00
Histogram with fixed size bins (bins=50)

발생시간 (hour of occurrence)
Real number (ℝ)

ZEROS

Distinct: 24
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.50749
Minimum: 0
Maximum: 23
Zeros: 259
Zeros (%): 3.7%
Negative: 0
Negative (%): 0.0%
Memory size: 61.7 KiB

Quantile statistics

Minimum: 0
5th percentile: 1
Q1: 9
Median: 14
Q3: 19
95th percentile: 22
Maximum: 23
Range: 23
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 6.432272
Coefficient of variation (CV): 0.47620038
Kurtosis: -0.74917127
Mean: 13.50749
Median Absolute Deviation (MAD): 5
Skewness: -0.45525636
Sum: 94674
Variance: 41.374123
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=24)

Common Values

Value | Count | Frequency (%)
18 | 500 | 7.1%
17 | 445 | 6.3%
19 | 408 | 5.8%
20 | 391 | 5.6%
22 | 364 | 5.2%
16 | 356 | 5.1%
21 | 355 | 5.1%
8 | 352 | 5.0%
14 | 343 | 4.9%
15 | 335 | 4.8%
Other values (14) | 3160 | 45.1%

Smallest Values

Value | Count | Frequency (%)
0 | 259 | 3.7%
1 | 212 | 3.0%
2 | 121 | 1.7%
3 | 81 | 1.2%
4 | 115 | 1.6%
5 | 140 | 2.0%
6 | 161 | 2.3%
7 | 199 | 2.8%
8 | 352 | 5.0%
9 | 322 | 4.6%

Largest Values

Value | Count | Frequency (%)
23 | 327 | 4.7%
22 | 364 | 5.2%
21 | 355 | 5.1%
20 | 391 | 5.6%
19 | 408 | 5.8%
18 | 500 | 7.1%
17 | 445 | 6.3%
16 | 356 | 5.1%
15 | 335 | 4.8%
14 | 343 | 4.9%

발생지_시도 (province of occurrence)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 54.9 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전북
2nd row: 전북
3rd row: 전북
4th row: 전북
5th row: 전북

Common Values

Value | Count | Frequency (%)
전북 | 7009 | 100.0%

Length

Histogram of lengths of the category

발생지_시군구 (city, county, or district of occurrence)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 54.9 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전주시
2nd row: 전주시
3rd row: 전주시
4th row: 전주시
5th row: 전주시

Common Values

Value | Count | Frequency (%)
전주시 | 7009 | 100.0%

Length

Histogram of lengths of the category

법정동명 (legal dong name)
Text

Distinct: 82
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 54.9 KiB

Length

Max length: 6
Median length: 5
Mean length: 4.4407191
Min length: 2

Characters and Unicode

Total characters: 31125
Distinct characters: 64
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 효자동3가
2nd row: 효자동3가
3rd row: 효자동3가
4th row: 삼천동1가
5th row: 중화산동2가

Common Values

Value | Count | Frequency (%)
효자동3가 | 461 | 6.6%
금암동 | 449 | 6.4%
중화산동2가 | 434 | 6.2%
인후동1가 | 425 | 6.1%
서신동 | 357 | 5.1%
효자동2가 | 306 | 4.4%
진북동 | 298 | 4.3%
효자동1가 | 292 | 4.2%
송천동2가 | 276 | 3.9%
덕진동1가 | 268 | 3.8%
Other values (72) | 3443 | 49.1%

Most occurring characters

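Value | Count | Frequency (%)
[character lost] | 7124 | 22.9%
[character lost] | 4758 | 15.3%
1 | 2003 | 6.4%
2 | 1824 | 5.9%
[character lost] | 1059 | 3.4%
[character lost] | 1059 | 3.4%
3 | 835 | 2.7%
[character lost] | 818 | 2.6%
[character lost] | 781 | 2.5%
[character lost] | 770 | 2.5%
Other values (54) | 10094 | 32.4%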

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 26435 | 84.9%
Decimal Number | 4690 | 15.1%

Most frequent character per category

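Other Letter

Value | Count | Frequency (%)
[character lost] | 7124 | 26.9%
[character lost] | 4758 | 18.0%
[character lost] | 1059 | 4.0%
[character lost] | 1059 | 4.0%
[character lost] | 818 | 3.1%
[character lost] | 781 | 3.0%
[character lost] | 770 | 2.9%
[character lost] | 725 | 2.7%
[character lost] | 722 | 2.7%
[character lost] | 685 | 2.6%
Other values (50) | 7934 | 30.0%

Decimal Number

Value | Count | Frequency (%)
1 | 2003 | 42.7%
2 | 1824 | 38.9%
3 | 835 | 17.8%
4 | 28 | 0.6%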

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 26435 | 84.9%
Common | 4690 | 15.1%

Most frequent character per script

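Hangul

Value | Count | Frequency (%)
[character lost] | 7124 | 26.9%
[character lost] | 4758 | 18.0%
[character lost] | 1059 | 4.0%
[character lost] | 1059 | 4.0%
[character lost] | 818 | 3.1%
[character lost] | 781 | 3.0%
[character lost] | 770 | 2.9%
[character lost] | 725 | 2.7%
[character lost] | 722 | 2.7%
[character lost] | 685 | 2.6%
Other values (50) | 7934 | 30.0%

Common

Value | Count | Frequency (%)
1 | 2003 | 42.7%
2 | 1824 | 38.9%
3 | 835 | 17.8%
4 | 28 | 0.6%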

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 26435 | 84.9%
ASCII | 4690 | 15.1%

Most frequent character per block

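Hangul

Value | Count | Frequency (%)
[character lost] | 7124 | 26.9%
[character lost] | 4758 | 18.0%
[character lost] | 1059 | 4.0%
[character lost] | 1059 | 4.0%
[character lost] | 818 | 3.1%
[character lost] | 781 | 3.0%
[character lost] | 770 | 2.9%
[character lost] | 725 | 2.7%
[character lost] | 722 | 2.7%
[character lost] | 685 | 2.6%
Other values (50) | 7934 | 30.0%

ASCII

Value | Count | Frequency (%)
1 | 2003 | 42.7%
2 | 1824 | 38.9%
3 | 835 | 17.8%
4 | 28 | 0.6%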

사고건수 (number of accidents)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 54.9 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 6963 | 99.3%
2 | 46 | 0.7%

Length

Histogram of lengths of the category

사망자수 (number of deaths)
Categorical

IMBALANCE

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 54.9 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 6888 | 98.3%
1 | 117 | 1.7%
2 | 4 | 0.1%

Length

Histogram of lengths of the category

중상자수 (number of seriously injured)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.30803253
Minimum: 0
Maximum: 7
Zeros: 5098
Zeros (%): 72.7%
Negative: 0
Negative (%): 0.0%
Memory size: 61.7 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 1
95th percentile: 1
Maximum: 7
Range: 7
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.55465519
Coefficient of variation (CV): 1.8006384
Kurtosis: 10.239584
Mean: 0.30803253
Median Absolute Deviation (MAD): 0
Skewness: 2.2918397
Sum: 2159
Variance: 0.30764238
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)

Common Values

Value | Count | Frequency (%)
0 | 5098 | 72.7%
1 | 1717 | 24.5%
2 | 157 | 2.2%
3 | 26 | 0.4%
4 | 9 | 0.1%
7 | 2 | < 0.1%
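
Because 중상자수 takes only six distinct values, its descriptive statistics can be re-derived from the value counts alone. The sketch below does so under the assumption that the reported variance is the sample variance (divisor n - 1), which is what reproduces the reported figure; the variable names are illustrative.

```python
import math

# Value -> count for 중상자수, copied from the frequency table in this report.
freq = {0: 5098, 1: 1717, 2: 157, 3: 26, 4: 9, 7: 2}

n = sum(freq.values())                        # number of observations
total = sum(v * c for v, c in freq.items())   # the "Sum" statistic
mean = total / n

# Sample variance: squared deviations weighted by count, divided by n - 1.
variance = sum(c * (v - mean) ** 2 for v, c in freq.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean                               # coefficient of variation
zeros_pct = 100 * freq[0] / n                 # share of zero values

print(f"n={n} sum={total} mean={mean:.8f}")
print(f"variance={variance:.8f} std={std:.8f} cv={cv:.7f}")
print(f"zeros={zeros_pct:.1f}%")
```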

경상자수 (number of slightly injured)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 15
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.2134399
Minimum: 0
Maximum: 19
Zeros: 1659
Zeros (%): 23.7%
Negative: 0
Negative (%): 0.0%
Memory size: 61.7 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 1
Median: 1
Q3: 2
95th percentile: 3
Maximum: 19
Range: 19
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.2005453
Coefficient of variation (CV): 0.98937356
Kurtosis: 25.792609
Mean: 1.2134399
Median Absolute Deviation (MAD): 0
Skewness: 3.1670835
Sum: 8505
Variance: 1.4413091
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=15)

Common Values

Value | Count | Frequency (%)
1 | 3527 | 50.3%
0 | 1659 | 23.7%
2 | 1112 | 15.9%
3 | 383 | 5.5%
4 | 187 | 2.7%
5 | 88 | 1.3%
6 | 25 | 0.4%
7 | 12 | 0.2%
8 | 6 | 0.1%
11 | 3 | < 0.1%
Other values (5) | 7 | 0.1%

Smallest Values

Value | Count | Frequency (%)
0 | 1659 | 23.7%
1 | 3527 | 50.3%
2 | 1112 | 15.9%
3 | 383 | 5.5%
4 | 187 | 2.7%
5 | 88 | 1.3%
6 | 25 | 0.4%
7 | 12 | 0.2%
8 | 6 | 0.1%
9 | 1 | < 0.1%

Largest Values

Value | Count | Frequency (%)
19 | 1 | < 0.1%
18 | 2 | < 0.1%
13 | 2 | < 0.1%
12 | 1 | < 0.1%
11 | 3 | < 0.1%
9 | 1 | < 0.1%
8 | 6 | 0.1%
7 | 12 | 0.2%
6 | 25 | 0.4%
5 | 88 | 1.3%

부상신고자수 (number of reported injuries)
Real number (ℝ)

SKEWED, ZEROS

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.019403624
Minimum: 0
Maximum: 17
Zeros: 6904
Zeros (%): 98.5%
Negative: 0
Negative (%): 0.0%
Memory size: 61.7 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 0
95th percentile: 0
Maximum: 17
Range: 17
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.25490651
Coefficient of variation (CV): 13.137057
Kurtosis: 2847.3977
Mean: 0.019403624
Median Absolute Deviation (MAD): 0
Skewness: 45.203198
Sum: 136
Variance: 0.064977327
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)

Common Values

Value | Count | Frequency (%)
0 | 6904 | 98.5%
1 | 95 | 1.4%
2 | 6 | 0.1%
17 | 1 | < 0.1%
4 | 1 | < 0.1%
3 | 1 | < 0.1%
5 | 1 | < 0.1%
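
The extreme skewness of 부상신고자수 comes from a handful of large values against 6,904 zeros, most notably the single row with 17 reported injuries. The sketch below recomputes the coefficient from the value counts, assuming the reported figure is the bias-adjusted Fisher-Pearson sample skewness; that choice of estimator is an assumption inferred from the numbers, and the variable names are illustrative.

```python
import math

# Value -> count for 부상신고자수, copied from the frequency table in this report.
freq = {0: 6904, 1: 95, 2: 6, 3: 1, 4: 1, 5: 1, 17: 1}

n = sum(freq.values())
mean = sum(v * c for v, c in freq.items()) / n

# Second and third central moments (population form, divisor n).
m2 = sum(c * (v - mean) ** 2 for v, c in freq.items()) / n
m3 = sum(c * (v - mean) ** 3 for v, c in freq.items()) / n

g1 = m3 / m2 ** 1.5                                 # biased moment estimator
skewness = g1 * math.sqrt(n * (n - 1)) / (n - 2)    # bias-adjusted estimator

print(f"skewness = {skewness:.4f}")
```

Dropping the single value 17 from `freq` and rerunning shows how much of the skew one outlying row contributes.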

Correlations

Variable | 발생시간 | 법정동명 | 사고건수 | 사망자수 | 중상자수 | 경상자수 | 부상신고자수
발생시간 | 1.000 | 0.223 | 0.027 | 0.079 | 0.050 | 0.057 | 0.000
법정동명 | 0.223 | 1.000 | 0.032 | 0.206 | 0.347 | 0.152 | 0.037
사고건수 | 0.027 | 0.032 | 1.000 | 0.000 | 0.168 | 0.146 | 0.074
사망자수 | 0.079 | 0.206 | 0.000 | 1.000 | 0.079 | 0.059 | 0.000
중상자수 | 0.050 | 0.347 | 0.168 | 0.079 | 1.000 | 0.210 | 0.000
경상자수 | 0.057 | 0.152 | 0.146 | 0.059 | 0.210 | 1.000 | 0.267
부상신고자수 | 0.000 | 0.037 | 0.074 | 0.000 | 0.000 | 0.267 | 1.000

Variable | 사고건수 | 사망자수
사고건수 | 1.000 | 0.000
사망자수 | 0.000 | 1.000

Variable | 발생시간 | 중상자수 | 경상자수 | 부상신고자수 | 사고건수 | 사망자수
발생시간 | 1.000 | -0.022 | 0.016 | 0.008 | 0.021 | 0.047
중상자수 | -0.022 | 1.000 | -0.596 | -0.060 | 0.121 | 0.033
경상자수 | 0.016 | -0.596 | 1.000 | -0.078 | 0.109 | 0.037
부상신고자수 | 0.008 | -0.060 | -0.078 | 1.000 | 0.049 | 0.000
사고건수 | 0.021 | 0.121 | 0.109 | 0.049 | 1.000 | 0.000
사망자수 | 0.047 | 0.033 | 0.037 | 0.000 | 0.000 | 1.000
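
The only strong coefficient in the last of the three matrices above is the one between 중상자수 and 경상자수, which matches the high-correlation alert. The sketch below scans that matrix for pairs whose absolute coefficient reaches a cutoff. The cutoff of 0.5 and the helper name `strong_pairs` are assumptions made for illustration, not settings read from this report's configuration.

```python
# Labels and coefficients copied from the last correlation matrix in this report.
labels = ["발생시간", "중상자수", "경상자수", "부상신고자수", "사고건수", "사망자수"]
matrix = [
    [1.000, -0.022, 0.016, 0.008, 0.021, 0.047],
    [-0.022, 1.000, -0.596, -0.060, 0.121, 0.033],
    [0.016, -0.596, 1.000, -0.078, 0.109, 0.037],
    [0.008, -0.060, -0.078, 1.000, 0.049, 0.000],
    [0.021, 0.121, 0.109, 0.049, 1.000, 0.000],
    [0.047, 0.033, 0.037, 0.000, 0.000, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff for calling a pair "highly correlated"


def strong_pairs(labels, matrix, threshold):
    """Return (a, b, r) for each pair above the diagonal with |r| >= threshold."""
    pairs = []
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            if abs(matrix[i][j]) >= threshold:
                pairs.append((labels[i], labels[j], matrix[i][j]))
    return pairs


pairs = strong_pairs(labels, matrix, THRESHOLD)
for a, b, r in pairs:
    print(f"{a} vs {b}: {r:+.3f}")
```

The coefficient is negative: rows with more seriously injured people tend to record fewer slightly injured people, and vice versa.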

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

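First rows

Index | 발생일 | 발생시간 | 발생지_시도 | 발생지_시군구 | 법정동명 | 사고건수 | 사망자수 | 중상자수 | 경상자수 | 부상신고자수
0 | 2017-01-01 | 1 | 전북 | 전주시 | 효자동3가 | 1 | 0 | 0 | 1 | 0
1 | 2017-01-01 | 4 | 전북 | 전주시 | 효자동3가 | 1 | 0 | 1 | 3 | 0
2 | 2017-01-01 | 5 | 전북 | 전주시 | 효자동3가 | 1 | 0 | 1 | 3 | 0
3 | 2017-01-01 | 6 | 전북 | 전주시 | 삼천동1가 | 1 | 0 | 1 | 0 | 0
4 | 2017-01-01 | 8 | 전북 | 전주시 | 중화산동2가 | 1 | 0 | 0 | 1 | 0
5 | 2017-01-01 | 18 | 전북 | 전주시 | 송천동2가 | 1 | 0 | 0 | 1 | 0
6 | 2017-01-01 | 18 | 전북 | 전주시 | 효자동2가 | 1 | 0 | 0 | 3 | 0
7 | 2017-01-01 | 19 | 전북 | 전주시 | 서신동 | 1 | 0 | 0 | 1 | 0
8 | 2017-01-02 | 5 | 전북 | 전주시 | 효자동1가 | 1 | 0 | 0 | 1 | 0
9 | 2017-01-02 | 6 | 전북 | 전주시 | 평화동2가 | 1 | 1 | 0 | 0 | 0

Last rows

Index | 발생일 | 발생시간 | 발생지_시도 | 발생지_시군구 | 법정동명 | 사고건수 | 사망자수 | 중상자수 | 경상자수 | 부상신고자수
6999 | 2019-12-30 | 12 | 전북 | 전주시 | 효자동3가 | 1 | 0 | 1 | 0 | 0
7000 | 2019-12-30 | 15 | 전북 | 전주시 | 평화동2가 | 1 | 0 | 0 | 1 | 0
7001 | 2019-12-31 | 0 | 전북 | 전주시 | 서노송동 | 1 | 0 | 0 | 1 | 0
7002 | 2019-12-31 | 6 | 전북 | 전주시 | 덕진동2가 | 1 | 0 | 0 | 1 | 0
7003 | 2019-12-31 | 8 | 전북 | 전주시 | 중화산동2가 | 1 | 0 | 0 | 1 | 0
7004 | 2019-12-31 | 12 | 전북 | 전주시 | 효자동3가 | 1 | 0 | 1 | 0 | 0
7005 | 2019-12-31 | 14 | 전북 | 전주시 | 인후동2가 | 1 | 0 | 0 | 1 | 0
7006 | 2019-12-31 | 17 | 전북 | 전주시 | 산정동 | 1 | 0 | 0 | 2 | 0
7007 | 2019-12-31 | 17 | 전북 | 전주시 | 효자동3가 | 1 | 0 | 0 | 1 | 0
7008 | 2019-12-31 | 20 | 전북 | 전주시 | 진북동 | 1 | 0 | 0 | 1 | 0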