Overview

Dataset statistics

Number of variables: 8
Number of observations: 228
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 15.7 KiB
Average record size in memory: 70.6 B

Variable types

Categorical: 2
Text: 1
Numeric: 5

Dataset

Description: Child traffic accidents by category (부문별 어린이 교통사고), 2018
Author: 도로교통공단 (Korea Road Traffic Authority)
URL: https://www.data.go.kr/data/15094169/fileData.do

Alerts

발생건수 is highly overall correlated with 부상자수 and 3 other fields (High correlation)
부상자수 is highly overall correlated with 발생건수 and 3 other fields (High correlation)
중상 is highly overall correlated with 발생건수 and 3 other fields (High correlation)
경상 is highly overall correlated with 발생건수 and 3 other fields (High correlation)
부상신고 is highly overall correlated with 발생건수 and 3 other fields (High correlation)
사망자수 is highly imbalanced (59.9%) (Imbalance)
중상 has 16 (7.0%) zeros (Zeros)
부상신고 has 66 (28.9%) zeros (Zeros)

Reproduction

Analysis started: 2023-12-12 02:28:38.376104
Analysis finished: 2023-12-12 02:28:41.609805
Duration: 3.23 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시도
Categorical

Distinct: 17
Distinct (%): 7.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Value  Count
경기  31
서울  25
경북  23
전남  22
강원  18
Other values (12)  109

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 0.4%

Sample

1st row: 서울
2nd row: 서울
3rd row: 서울
4th row: 서울
5th row: 서울

Common Values

Value  Count  Frequency (%)
경기  31  13.6%
서울  25  11.0%
경북  23  10.1%
전남  22  9.6%
강원  18  7.9%
경남  17  7.5%
부산  16  7.0%
충남  15  6.6%
전북  14  6.1%
충북  11  4.8%
Other values (7)  36  15.8%

Length

Figure: Histogram of lengths of the category

시군구
Text

Distinct: 205
Distinct (%): 89.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Length

Max length: 7
Median length: 3
Mean length: 2.9385965
Min length: 2

Characters and Unicode

Total characters: 670
Distinct characters: 133
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 198
Unique (%): 86.8%

Sample

1st row: 종로구
2nd row: 중구
3rd row: 용산구
4th row: 성동구
5th row: 동대문구

Value  Count  Frequency (%)
중구  6  2.6%
동구  6  2.6%
서구  5  2.2%
남구  5  2.2%
북구  4  1.8%
강서구  2  0.9%
고성군  2  0.9%
곡성군  1  0.4%
화순군  1  0.4%
보성군  1  0.4%
Other values (195)  195  85.5%

Most occurring characters

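Value  Count  Frequency (%)
(unknown)  84  12.5%
(unknown)  78  11.6%
(unknown)  74  11.0%
(unknown)  22  3.3%
(unknown)  20  3.0%
(unknown)  18  2.7%
(unknown)  18  2.7%
(unknown)  17  2.5%
(unknown)  16  2.4%
(unknown)  13  1.9%
Other values (123)  310  46.3%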

Most occurring categories

Value  Count  Frequency (%)
Other Letter  668  99.7%
Open Punctuation  1  0.1%
Close Punctuation  1  0.1%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
(unknown)  84  12.6%
(unknown)  78  11.7%
(unknown)  74  11.1%
(unknown)  22  3.3%
(unknown)  20  3.0%
(unknown)  18  2.7%
(unknown)  18  2.7%
(unknown)  17  2.5%
(unknown)  16  2.4%
(unknown)  13  1.9%
Other values (121)  308  46.1%

Open Punctuation
Value  Count  Frequency (%)
(  1  100.0%

Close Punctuation
Value  Count  Frequency (%)
)  1  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  668  99.7%
Common  2  0.3%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
(unknown)  84  12.6%
(unknown)  78  11.7%
(unknown)  74  11.1%
(unknown)  22  3.3%
(unknown)  20  3.0%
(unknown)  18  2.7%
(unknown)  18  2.7%
(unknown)  17  2.5%
(unknown)  16  2.4%
(unknown)  13  1.9%
Other values (121)  308  46.1%

Common
Value  Count  Frequency (%)
(  1  50.0%
)  1  50.0%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  668  99.7%
ASCII  2  0.3%

Most frequent character per block

Hangul
Value  Count  Frequency (%)
(unknown)  84  12.6%
(unknown)  78  11.7%
(unknown)  74  11.1%
(unknown)  22  3.3%
(unknown)  20  3.0%
(unknown)  18  2.7%
(unknown)  18  2.7%
(unknown)  17  2.5%
(unknown)  16  2.4%
(unknown)  13  1.9%
Other values (121)  308  46.1%

ASCII
Value  Count  Frequency (%)
(  1  50.0%
)  1  50.0%

발생건수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 95
Distinct (%): 41.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 43.899123
Minimum: 1
Maximum: 266
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 11
Median: 28.5
Q3: 58.5
95-th percentile: 150.75
Maximum: 266
Range: 265
Interquartile range (IQR): 47.5

Descriptive statistics

Standard deviation: 46.76206
Coefficient of variation (CV): 1.0652163
Kurtosis: 4.0170525
Mean: 43.899123
Median Absolute Deviation (MAD): 19.5
Skewness: 1.9270699
Sum: 10009
Variance: 2186.6902
Monotonicity: Not monotonic

Figure: Histogram with fixed size bins (bins=50)

Common Values

Value  Count  Frequency (%)
4  10  4.4%
9  9  3.9%
5  8  3.5%
7  7  3.1%
6  6  2.6%
3  6  2.6%
47  6  2.6%
21  6  2.6%
29  5  2.2%
13  5  2.2%
Other values (85)  160  70.2%

Minimum 10 values

Value  Count  Frequency (%)
1  1  0.4%
2  3  1.3%
3  6  2.6%
4  10  4.4%
5  8  3.5%
6  6  2.6%
7  7  3.1%
8  3  1.3%
9  9  3.9%
10  2  0.9%

Maximum 10 values

Value  Count  Frequency (%)
266  1  0.4%
217  1  0.4%
200  1  0.4%
192  1  0.4%
188  2  0.9%
181  2  0.9%
168  1  0.4%
162  1  0.4%
160  1  0.4%
156  1  0.4%
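
Several of the figures reported for 발생건수 follow from the others: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, the IQR is Q3 minus Q1, the range is maximum minus minimum, and the mean is the sum divided by the number of observations. The sketch below is not part of the report; it simply recomputes these values from the numbers printed above as a sanity check. The dictionary keys and the `derived` helper are names chosen for illustration.

```python
import math

# Summary statistics copied from the 발생건수 block of the report.
reported = {
    "mean": 43.899123,
    "std": 46.76206,
    "variance": 2186.6902,
    "cv": 1.0652163,
    "q1": 11.0,
    "q3": 58.5,
    "iqr": 47.5,
    "minimum": 1.0,
    "maximum": 266.0,
    "range": 265.0,
    "sum": 10009.0,
    "n": 228,  # number of observations from the Overview
}


def derived(stats):
    """Recompute the statistics that follow from other reported values."""
    return {
        "cv": stats["std"] / stats["mean"],
        "variance": stats["std"] ** 2,
        "iqr": stats["q3"] - stats["q1"],
        "range": stats["maximum"] - stats["minimum"],
        "mean": stats["sum"] / stats["n"],
    }


recomputed = derived(reported)
for name, value in recomputed.items():
    # The report rounds to about 8 significant digits, so compare loosely.
    ok = math.isclose(value, reported[name], rel_tol=1e-6)
    print(f"{name}: reported={reported[name]} recomputed={value:.7f} match={ok}")
```

The same check can be applied to the other numeric variables by swapping in their reported values.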

사망자수
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Value  Count
0  198
1  26
2  4

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0  198  86.8%
1  26  11.4%
2  4  1.8%
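
The 59.9% imbalance figure in the alert for 사망자수 is consistent with a score of one minus the normalized Shannon entropy of the class frequencies, that is, 1 - H / log2(k) for k classes. The sketch below assumes that definition (the report itself does not state the formula) and applies it to the counts in the table above. The function name `imbalance_score` is chosen for illustration only.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the base-2 Shannon entropy of the
    class frequencies and k is the number of classes.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # Assumption for this sketch: a single-class column scores 0.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Class counts for 사망자수 (0, 1 and 2 deaths) from the Common Values table.
score = imbalance_score([198, 26, 4])
print(f"{score:.1%}")  # prints 59.9%
```

A perfectly balanced column, for example `imbalance_score([76, 76, 76])`, gives 0 under this definition.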

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: Plot of the common values (0: 198, 1: 26, 2: 4)

부상자수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 108
Distinct (%): 47.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 55.013158
Minimum: 1
Maximum: 356
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5
Q1: 14.75
Median: 35
Q3: 73
95-th percentile: 180.65
Maximum: 356
Range: 355
Interquartile range (IQR): 58.25

Descriptive statistics

Standard deviation: 59.317965
Coefficient of variation (CV): 1.0782505
Kurtosis: 4.7095154
Mean: 55.013158
Median Absolute Deviation (MAD): 24.5
Skewness: 2.0312527
Sum: 12543
Variance: 3518.621
Monotonicity: Not monotonic

Figure: Histogram with fixed size bins (bins=50)

Common Values

Value  Count  Frequency (%)
10  9  3.9%
5  8  3.5%
6  7  3.1%
25  6  2.6%
15  6  2.6%
79  5  2.2%
11  5  2.2%
4  5  2.2%
9  4  1.8%
8  4  1.8%
Other values (98)  169  74.1%

Minimum 10 values

Value  Count  Frequency (%)
1  1  0.4%
2  3  1.3%
3  1  0.4%
4  5  2.2%
5  8  3.5%
6  7  3.1%
7  4  1.8%
8  4  1.8%
9  4  1.8%
10  9  3.9%

Maximum 10 values

Value  Count  Frequency (%)
356  1  0.4%
268  1  0.4%
262  1  0.4%
252  1  0.4%
240  1  0.4%
235  1  0.4%
234  1  0.4%
220  1  0.4%
214  1  0.4%
200  2  0.9%

중상
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 32
Distinct (%): 14.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.5701754
Minimum: 0
Maximum: 49
Zeros: 16
Zeros (%): 7.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
Median: 5
Q3: 10
95-th percentile: 24.65
Maximum: 49
Range: 49
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 8.2006439
Coefficient of variation (CV): 1.0832832
Kurtosis: 5.1248402
Mean: 7.5701754
Median Absolute Deviation (MAD): 3
Skewness: 2.0817404
Sum: 1726
Variance: 67.25056
Monotonicity: Not monotonic

Figure: Histogram with fixed size bins (bins=32)

Common Values

Value  Count  Frequency (%)
1  29  12.7%
2  22  9.6%
3  20  8.8%
5  19  8.3%
4  16  7.0%
0  16  7.0%
6  15  6.6%
8  13  5.7%
7  11  4.8%
9  9  3.9%
Other values (22)  58  25.4%

Minimum 10 values

Value  Count  Frequency (%)
0  16  7.0%
1  29  12.7%
2  22  9.6%
3  20  8.8%
4  16  7.0%
5  19  8.3%
6  15  6.6%
7  11  4.8%
8  13  5.7%
9  9  3.9%

Maximum 10 values

Value  Count  Frequency (%)
49  1  0.4%
40  2  0.9%
35  1  0.4%
33  1  0.4%
30  1  0.4%
29  4  1.8%
28  1  0.4%
25  1  0.4%
24  2  0.9%
23  1  0.4%

경상
Real number (ℝ)

HIGH CORRELATION 

Distinct: 96
Distinct (%): 42.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.092105
Minimum: 0
Maximum: 295
Zeros: 2
Zeros (%): 0.9%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 9.75
Median: 25
Q3: 51.25
95-th percentile: 142.6
Maximum: 295
Range: 295
Interquartile range (IQR): 41.5

Descriptive statistics

Standard deviation: 43.874323
Coefficient of variation (CV): 1.0943382
Kurtosis: 6.0253356
Mean: 40.092105
Median Absolute Deviation (MAD): 17
Skewness: 2.1478582
Sum: 9141
Variance: 1924.9562
Monotonicity: Not monotonic

Figure: Histogram with fixed size bins (bins=50)

Common Values

Value  Count  Frequency (%)
4  12  5.3%
7  9  3.9%
9  9  3.9%
17  8  3.5%
3  8  3.5%
15  6  2.6%
16  6  2.6%
8  6  2.6%
24  6  2.6%
33  5  2.2%
Other values (86)  153  67.1%

Minimum 10 values

Value  Count  Frequency (%)
0  2  0.9%
1  2  0.9%
2  2  0.9%
3  8  3.5%
4  12  5.3%
5  4  1.8%
6  3  1.3%
7  9  3.9%
8  6  2.6%
9  9  3.9%

Maximum 10 values

Value  Count  Frequency (%)
295  1  0.4%
195  1  0.4%
180  1  0.4%
171  1  0.4%
161  1  0.4%
160  1  0.4%
159  1  0.4%
153  1  0.4%
150  1  0.4%
145  2  0.9%

부상신고
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 38
Distinct (%): 16.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.3508772
Minimum: 0
Maximum: 62
Zeros: 66
Zeros (%): 28.9%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 3
Q3: 9.25
95-th percentile: 33.3
Maximum: 62
Range: 62
Interquartile range (IQR): 9.25

Descriptive statistics

Standard deviation: 11.128787
Coefficient of variation (CV): 1.5139401
Kurtosis: 7.0723518
Mean: 7.3508772
Median Absolute Deviation (MAD): 3
Skewness: 2.5221099
Sum: 1676
Variance: 123.84991
Monotonicity: Not monotonic

Figure: Histogram with fixed size bins (bins=38)

Common Values

Value  Count  Frequency (%)
0  66  28.9%
1  23  10.1%
2  15  6.6%
5  14  6.1%
3  13  5.7%
4  12  5.3%
8  9  3.9%
7  8  3.5%
6  8  3.5%
11  7  3.1%
Other values (28)  53  23.2%

Minimum 10 values

Value  Count  Frequency (%)
0  66  28.9%
1  23  10.1%
2  15  6.6%
3  13  5.7%
4  12  5.3%
5  14  6.1%
6  8  3.5%
7  8  3.5%
8  9  3.9%
9  3  1.3%

Maximum 10 values

Value  Count  Frequency (%)
62  1  0.4%
59  1  0.4%
57  1  0.4%
45  1  0.4%
44  1  0.4%
43  1  0.4%
41  2  0.9%
35  3  1.3%
34  1  0.4%
32  1  0.4%

Interactions

Figure: Interaction plots for pairs of the five numeric variables (25 panels)

Correlations

          시도     발생건수  사망자수  부상자수  중상     경상     부상신고
시도      1.000  0.448  0.210  0.472  0.368  0.400  0.577
발생건수  0.448  1.000  0.413  0.946  0.938  0.920  0.702
사망자수  0.210  0.413  1.000  0.565  0.372  0.395  0.426
부상자수  0.472  0.946  0.565  1.000  0.856  0.943  0.849
중상      0.368  0.938  0.372  0.856  1.000  0.829  0.572
경상      0.400  0.920  0.395  0.943  0.829  1.000  0.687
부상신고  0.577  0.702  0.426  0.849  0.572  0.687  1.000

          시도     사망자수
시도      1.000  0.109
사망자수  0.109  1.000

          발생건수  부상자수  중상     경상     부상신고  시도     사망자수
발생건수  1.000  0.990  0.876  0.975  0.795  0.189  0.275
부상자수  0.990  1.000  0.860  0.987  0.807  0.205  0.295
중상      0.876  0.860  1.000  0.815  0.613  0.149  0.237
경상      0.975  0.987  0.815  1.000  0.745  0.175  0.271
부상신고  0.795  0.807  0.613  0.745  1.000  0.266  0.212
시도      0.189  0.205  0.149  0.175  0.266  1.000  0.109
사망자수  0.275  0.295  0.237  0.271  0.212  0.109  1.000
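
The "High correlation" alerts each name one field plus "3 other fields", which matches what the last correlation matrix above gives if every off-diagonal coefficient at or above a cutoff is flagged. The sketch below illustrates that reading. The cutoff of 0.5 is an assumption made for this example (the report does not state the threshold it used), and `highly_correlated` is a hypothetical helper, not a ydata-profiling function.

```python
# Last correlation matrix from the report, in the row/column order shown there.
names = ["발생건수", "부상자수", "중상", "경상", "부상신고", "시도", "사망자수"]
matrix = [
    [1.000, 0.990, 0.876, 0.975, 0.795, 0.189, 0.275],
    [0.990, 1.000, 0.860, 0.987, 0.807, 0.205, 0.295],
    [0.876, 0.860, 1.000, 0.815, 0.613, 0.149, 0.237],
    [0.975, 0.987, 0.815, 1.000, 0.745, 0.175, 0.271],
    [0.795, 0.807, 0.613, 0.745, 1.000, 0.266, 0.212],
    [0.189, 0.205, 0.149, 0.175, 0.266, 1.000, 0.109],
    [0.275, 0.295, 0.237, 0.271, 0.212, 0.109, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff for this sketch, not taken from the report


def highly_correlated(names, matrix, threshold):
    """Map each variable to the other variables whose absolute
    correlation with it is at or above the threshold."""
    flagged = {}
    for i, row in enumerate(matrix):
        partners = [
            names[j]
            for j, value in enumerate(row)
            if j != i and abs(value) >= threshold
        ]
        if partners:
            flagged[names[i]] = partners
    return flagged


flagged = highly_correlated(names, matrix, THRESHOLD)
for name, partners in flagged.items():
    print(f"{name} -> {', '.join(partners)}")
```

Under this assumption the five numeric count variables are each flagged with four partners, while 시도 and 사망자수 are not flagged, which agrees with the Alerts section.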

Missing values

Figure: A simple visualization of nullity by column.
Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

     시도  시군구    발생건수  사망자수  부상자수  중상  경상  부상신고
0    서울  종로구    28  0  33  1  27  5
1    서울  중구      23  0  25  1  16  8
2    서울  용산구    34  0  39  6  25  8
3    서울  성동구    28  0  31  4  20  7
4    서울  동대문구  42  0  46  3  33  10
5    서울  성북구    64  0  82  16  57  9
6    서울  도봉구    34  0  38  2  29  7
7    서울  은평구    56  0  84  5  54  25
8    서울  서대문구  32  0  36  7  22  7
9    서울  마포구    45  0  50  7  37  6

Last rows

     시도  시군구    발생건수  사망자수  부상자수  중상  경상  부상신고
218  대전  중구      47  0  60  7  45  8
219  대전  서구      117  1  142  18  117  7
220  대전  유성구    96  0  133  13  95  25
221  대전  대덕구    25  0  34  5  24  5
222  울산  중구      26  0  27  10  16  1
223  울산  남구      43  0  49  9  36  4
224  울산  동구      21  0  21  6  15  0
225  울산  북구      29  0  30  6  24  0
226  울산  울주군    35  0  51  11  29  11
227  세종  세종      49  0  61  18  39  4
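
Reading each sample row as (시도, 시군구, 발생건수, 사망자수, 부상자수, 중상, 경상, 부상신고), the injured total 부상자수 appears to equal 중상 + 경상 + 부상신고 in every row shown, and the reported column sums agree with that (1726 + 9141 + 1676 = 12543). The sketch below checks this on the twenty sample rows only; it is an observation about the sample, not a documented rule of the dataset, and `injured_mismatches` is a name chosen for illustration.

```python
# Sample rows from the report:
# (시도, 시군구, 발생건수, 사망자수, 부상자수, 중상, 경상, 부상신고)
rows = [
    ("서울", "종로구", 28, 0, 33, 1, 27, 5),
    ("서울", "중구", 23, 0, 25, 1, 16, 8),
    ("서울", "용산구", 34, 0, 39, 6, 25, 8),
    ("서울", "성동구", 28, 0, 31, 4, 20, 7),
    ("서울", "동대문구", 42, 0, 46, 3, 33, 10),
    ("서울", "성북구", 64, 0, 82, 16, 57, 9),
    ("서울", "도봉구", 34, 0, 38, 2, 29, 7),
    ("서울", "은평구", 56, 0, 84, 5, 54, 25),
    ("서울", "서대문구", 32, 0, 36, 7, 22, 7),
    ("서울", "마포구", 45, 0, 50, 7, 37, 6),
    ("대전", "중구", 47, 0, 60, 7, 45, 8),
    ("대전", "서구", 117, 1, 142, 18, 117, 7),
    ("대전", "유성구", 96, 0, 133, 13, 95, 25),
    ("대전", "대덕구", 25, 0, 34, 5, 24, 5),
    ("울산", "중구", 26, 0, 27, 10, 16, 1),
    ("울산", "남구", 43, 0, 49, 9, 36, 4),
    ("울산", "동구", 21, 0, 21, 6, 15, 0),
    ("울산", "북구", 29, 0, 30, 6, 24, 0),
    ("울산", "울주군", 35, 0, 51, 11, 29, 11),
    ("세종", "세종", 49, 0, 61, 18, 39, 4),
]


def injured_mismatches(rows):
    """Return the rows where 부상자수 differs from 중상 + 경상 + 부상신고."""
    return [r for r in rows if r[4] != r[5] + r[6] + r[7]]


mismatches = injured_mismatches(rows)
print(f"rows checked: {len(rows)}, mismatches: {len(mismatches)}")

# Column sums reported in the Descriptive statistics blocks.
reported_sums = {"부상자수": 12543, "중상": 1726, "경상": 9141, "부상신고": 1676}
parts = reported_sums["중상"] + reported_sums["경상"] + reported_sums["부상신고"]
print(f"sum of parts: {parts}, reported 부상자수 sum: {reported_sums['부상자수']}")
```

If the relationship holds across all 228 rows, it would also help explain why 부상자수 is so strongly correlated with 경상, its largest component.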