Overview

Dataset statistics

Number of variables: 7
Number of observations: 6052
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 149
Duplicate rows (%): 2.5%
Total size in memory: 342.9 KiB
Average record size in memory: 58.0 B
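
The two derived figures in this table follow from the raw counts. A minimal sketch of that arithmetic, using only values copied from the table (the variable names are illustrative, not part of the report):

```python
# Values copied from the Dataset statistics table.
n_rows = 6052
n_duplicate_rows = 149
total_size_kib = 342.9

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = 100 * n_duplicate_rows / n_rows

# Average record size in bytes (1 KiB = 1024 bytes).
avg_record_bytes = total_size_kib * 1024 / n_rows

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Both results round to the figures shown in the table (2.5% and 58.0 B).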

Variable types

Categorical: 4
Numeric: 1
Text: 2

Dataset

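Description: Enforcement records for illegal parking and stopping near fire hydrants in Seo-gu, Daejeon Metropolitan City (violation year, violation month, violation time, enforcement location, report type, latitude, longitude).
Author: Seo-gu, Daejeon Metropolitan City (대전광역시 서구)
URL: https://www.data.go.kr/data/15075607/fileData.do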

Alerts

위반연도 has constant value "2023" (Constant)
Dataset has 149 (2.5%) duplicate rows (Duplicates)
단속동명 is highly overall correlated with 신고구분 (High correlation)
신고구분 is highly overall correlated with 단속동명 (High correlation)
차량분류 is highly imbalanced (76.7%) (Imbalance)
신고구분 is highly imbalanced (76.6%) (Imbalance)
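
The report does not say how the imbalance percentage is defined. One definition that reproduces both figures is 1 minus the Shannon entropy of the class shares divided by the maximum possible entropy, log2 of the number of classes. The sketch below assumes that definition; the function name is illustrative, and the counts are copied from the Common Values tables of the two variables.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (bits) of the class shares.

    Assumed definition: 0 means perfectly balanced classes, 1 means a single class.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts copied from the Common Values tables.
vehicle_class_counts = [5475, 453, 104, 12, 8]   # 차량분류
report_type_counts = [5376, 547, 99, 23, 6, 1]   # 신고구분

print(f"차량분류: {imbalance_score(vehicle_class_counts):.1%}")
print(f"신고구분: {imbalance_score(report_type_counts):.1%}")
```

Under this assumption the scores come out at 76.7% and 76.6%, matching the two Imbalance alerts.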

Reproduction

Analysis started: 2024-03-14 13:38:05.773566
Analysis finished: 2024-03-14 13:38:07.105530
Duration: 1.33 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
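
A report of this kind can be regenerated with ydata-profiling. The following is a minimal sketch, not the configuration actually used: the function name, file paths, title, and CSV encoding are all assumptions, and the `config.json` referenced above is not reproduced here.

```python
def build_report(csv_path, html_path="report.html"):
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported inside the function so the sketch loads even if the packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the CSV is CP949-encoded; change the encoding if the file differs.
    df = pd.read_csv(csv_path, encoding="cp949")

    report = ProfileReport(
        df,
        title="Illegal parking near fire hydrants, Seo-gu, Daejeon",  # illustrative title
        dataset={
            "description": "Enforcement records for illegal parking near fire hydrants",
            "author": "대전광역시 서구",
            "url": "https://www.data.go.kr/data/15075607/fileData.do",
        },
    )
    report.to_file(html_path)
    return report
```

Calling `build_report("violations.csv")` with a local copy of the file (hypothetical name) would write `report.html` next to the script.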

Variables

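차량분류 (vehicle class)
Categorical

IMBALANCE

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 47.4 KiB

승용 (passenger car): 5475
화물4톤이하 (freight truck, 4 tons or less): 453
승합 (van or minibus): 104
건설,중기,특수 (construction, heavy equipment, special purpose): 12
화물4톤초과 (freight truck, over 4 tons): 8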

Length

Max length: 8
Median length: 2
Mean length: 2.3165896
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 승용
2nd row: 승용
3rd row: 승용
4th row: 승용
5th row: 승용

Common Values

Value | Count | Frequency (%)
승용 | 5475 | 90.5%
화물4톤이하 | 453 | 7.5%
승합 | 104 | 1.7%
건설,중기,특수 | 12 | 0.2%
화물4톤초과 | 8 | 0.1%

Length

Histogram of lengths of the category

위반연도 (violation year)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 47.4 KiB

2023: 6052

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023
2nd row: 2023
3rd row: 2023
4th row: 2023
5th row: 2023

Common Values

Value | Count | Frequency (%)
2023 | 6052 | 100.0%

Length

Histogram of lengths of the category

위반월 (violation month)
Real number (ℝ)

Distinct: 12
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.9320886
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 53.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 7
Q3: 10
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.5198305
Coefficient of variation (CV): 0.50775901
Kurtosis: -1.2408912
Mean: 6.9320886
Median Absolute Deviation (MAD): 3
Skewness: -0.14872661
Sum: 41953
Variance: 12.389206
Monotonicity: Increasing

Histogram with fixed size bins (bins=12)
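
Several of the descriptive statistics can be recomputed from the per-month counts alone. The sketch below takes the counts from the frequency tables for this variable and derives the sum, mean, variance, standard deviation, and coefficient of variation. It assumes the sample variance (divisor n - 1), which is the convention that agrees with the reported figures; the variable names are illustrative.

```python
import math

# Month -> number of violations, copied from the 위반월 frequency tables.
month_counts = {1: 447, 2: 415, 3: 513, 4: 445, 5: 435, 6: 452,
                7: 438, 8: 507, 9: 574, 10: 620, 11: 490, 12: 716}

n = sum(month_counts.values())                            # number of observations
total = sum(m * c for m, c in month_counts.items())       # "Sum" in the report
mean = total / n

# Sample variance: squared deviations weighted by count, divided by n - 1.
variance = sum(c * (m - mean) ** 2 for m, c in month_counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean                                           # coefficient of variation

print(n, total)
print(f"mean={mean:.7f} variance={variance:.6f} std={std:.7f} cv={cv:.8f}")
```

The recomputed values agree with the table to the precision shown there, which is a quick way to check that the frequency tables and the summary statistics describe the same column.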
Common values

Value | Count | Frequency (%)
12 | 716 | 11.8%
10 | 620 | 10.2%
9 | 574 | 9.5%
3 | 513 | 8.5%
8 | 507 | 8.4%
11 | 490 | 8.1%
6 | 452 | 7.5%
1 | 447 | 7.4%
4 | 445 | 7.4%
7 | 438 | 7.2%
Other values (2) | 850 | 14.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 447 | 7.4%
2 | 415 | 6.9%
3 | 513 | 8.5%
4 | 445 | 7.4%
5 | 435 | 7.2%
6 | 452 | 7.5%
7 | 438 | 7.2%
8 | 507 | 8.4%
9 | 574 | 9.5%
10 | 620 | 10.2%

Maximum 10 values

Value | Count | Frequency (%)
12 | 716 | 11.8%
11 | 490 | 8.1%
10 | 620 | 10.2%
9 | 574 | 9.5%
8 | 507 | 8.4%
7 | 438 | 7.2%
6 | 452 | 7.5%
5 | 435 | 7.2%
4 | 445 | 7.4%
3 | 513 | 8.5%

위반시간 (violation time)
Text

Distinct: 1893
Distinct (%): 31.3%
Missing: 0
Missing (%): 0.0%
Memory size: 47.4 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 54468
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 877
Unique (%): 14.5%

Sample

1st row: 01:10:00
2nd row: 07:43:00
3rd row: 11:53:00
4th row: 12:57:00
5th row: 13:39:00

Value | Count | Frequency (%)
18:35:00 | 16 | 0.3%
13:22:00 | 16 | 0.3%
12:32:00 | 15 | 0.2%
19:31:00 | 15 | 0.2%
20:36:00 | 15 | 0.2%
18:11:00 | 14 | 0.2%
20:33:00 | 14 | 0.2%
18:24:00 | 14 | 0.2%
21:46:00 | 14 | 0.2%
20:06:00 | 14 | 0.2%
Other values (1893) | 5928 | 97.6%
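
The character statistics reported for this variable hint at two cleaning steps before the times can be parsed. Every value is 9 characters long and the space character occurs 6052 times, once per row, so each value appears to carry one extra space. The colon occurs 12081 times rather than 2 × 6052 = 12104, and the dash occurs 23 times, which suggests that 23 separators were entered as dashes. The sketch below assumes exactly that reading (a dash stands in for a colon); the function name is illustrative.

```python
from datetime import datetime, time
from typing import Optional

def parse_violation_time(raw: str) -> Optional[time]:
    """Parse a 위반시간 string such as ' 01:10:00'; return None if it cannot be parsed."""
    # Strip surrounding whitespace, then treat a dash as a mistyped colon (assumption).
    cleaned = raw.strip().replace("-", ":")
    try:
        return datetime.strptime(cleaned, "%H:%M:%S").time()
    except ValueError:
        return None

print(parse_violation_time(" 01:10:00"))
print(parse_violation_time(" 12-30:00"))  # hypothetical malformed value
print(parse_violation_time("not a time"))
```

Returning `None` instead of raising keeps unparseable rows visible, so they can be counted and inspected rather than silently dropped.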

Most occurring characters

Value | Count | Frequency (%)
0 | 14321 | 26.3%
: | 12081 | 22.2%
1 | 6155 | 11.3%
(space) | 6052 | 11.1%
2 | 3878 | 7.1%
3 | 2465 | 4.5%
5 | 2252 | 4.1%
4 | 2216 | 4.1%
9 | 1497 | 2.7%
8 | 1307 | 2.4%
Other values (3) | 2244 | 4.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 36312 | 66.7%
Other Punctuation | 12081 | 22.2%
Space Separator | 6052 | 11.1%
Dash Punctuation | 23 | < 0.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 14321 | 39.4%
1 | 6155 | 17.0%
2 | 3878 | 10.7%
3 | 2465 | 6.8%
5 | 2252 | 6.2%
4 | 2216 | 6.1%
9 | 1497 | 4.1%
8 | 1307 | 3.6%
7 | 1139 | 3.1%
6 | 1082 | 3.0%

Other Punctuation
Value | Count | Frequency (%)
: | 12081 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 6052 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 23 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 54468 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 14321 | 26.3%
: | 12081 | 22.2%
1 | 6155 | 11.3%
(space) | 6052 | 11.1%
2 | 3878 | 7.1%
3 | 2465 | 4.5%
5 | 2252 | 4.1%
4 | 2216 | 4.1%
9 | 1497 | 2.7%
8 | 1307 | 2.4%
Other values (3) | 2244 | 4.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 54468 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 14321 | 26.3%
: | 12081 | 22.2%
1 | 6155 | 11.3%
(space) | 6052 | 11.1%
2 | 3878 | 7.1%
3 | 2465 | 4.5%
5 | 2252 | 4.1%
4 | 2216 | 4.1%
9 | 1497 | 2.7%
8 | 1307 | 2.4%
Other values (3) | 2244 | 4.1%

단속위치 (enforcement location)
Text

Distinct: 702
Distinct (%): 11.6%
Missing: 0
Missing (%): 0.0%
Memory size: 47.4 KiB

Length

Max length: 13
Median length: 12
Mean length: 6.7199273
Min length: 3

Characters and Unicode

Total characters: 40669
Distinct characters: 162
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 281
Unique (%): 4.6%

Sample

1st row: 1268 부근
2nd row: 984 부근
3rd row: 821 부근
4th row: 2161 부근
5th row: 707 부근

Value | Count | Frequency (%)
부근 | 5378 | 44.9%
821 | 313 | 2.6%
1528 | 239 | 2.0%
822 | 163 | 1.4%
후문 | 156 | 1.3%
은하수아파트 | 154 | 1.3%
429 | 153 | 1.3%
279 | 116 | 1.0%
(blank) | 106 | 0.9%
1486 | 101 | 0.8%
Other values (699) | 5104 | 42.6%

Most occurring characters

ValueCountFrequency (%)
5961
14.7%
5476
13.5%
5472
13.5%
1 3804
9.4%
2 2699
 
6.6%
8 2214
 
5.4%
5 2203
 
5.4%
4 1737
 
4.3%
6 1642
 
4.0%
0 1442
 
3.5%
Other values (152) 8019
19.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 19194 | 47.2%
Other Letter | 14913 | 36.7%
Space Separator | 5961 | 14.7%
Dash Punctuation | 496 | 1.2%
Uppercase Letter | 80 | 0.2%
Close Punctuation | 11 | < 0.1%
Open Punctuation | 11 | < 0.1%
Lowercase Letter | 3 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5476
36.7%
5472
36.7%
251
 
1.7%
171
 
1.1%
166
 
1.1%
164
 
1.1%
161
 
1.1%
158
 
1.1%
156
 
1.0%
155
 
1.0%
Other values (135) 2583
17.3%
Decimal Number
ValueCountFrequency (%)
1 3804
19.8%
2 2699
14.1%
8 2214
11.5%
5 2203
11.5%
4 1737
9.0%
6 1642
8.6%
0 1442
 
7.5%
7 1232
 
6.4%
9 1162
 
6.1%
3 1059
 
5.5%
Uppercase Letter
ValueCountFrequency (%)
M 40
50.0%
K 40
50.0%
Space Separator
ValueCountFrequency (%)
5961
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 496
100.0%
Close Punctuation
ValueCountFrequency (%)
) 11
100.0%
Open Punctuation
ValueCountFrequency (%)
( 11
100.0%
Lowercase Letter
ValueCountFrequency (%)
e 3
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 25673 | 63.1%
Hangul | 14913 | 36.7%
Latin | 83 | 0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5476
36.7%
5472
36.7%
251
 
1.7%
171
 
1.1%
166
 
1.1%
164
 
1.1%
161
 
1.1%
158
 
1.1%
156
 
1.0%
155
 
1.0%
Other values (135) 2583
17.3%
Common
ValueCountFrequency (%)
5961
23.2%
1 3804
14.8%
2 2699
10.5%
8 2214
 
8.6%
5 2203
 
8.6%
4 1737
 
6.8%
6 1642
 
6.4%
0 1442
 
5.6%
7 1232
 
4.8%
9 1162
 
4.5%
Other values (4) 1577
 
6.1%
Latin
ValueCountFrequency (%)
M 40
48.2%
K 40
48.2%
e 3
 
3.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 25756 | 63.3%
Hangul | 14913 | 36.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5961
23.1%
1 3804
14.8%
2 2699
10.5%
8 2214
 
8.6%
5 2203
 
8.6%
4 1737
 
6.7%
6 1642
 
6.4%
0 1442
 
5.6%
7 1232
 
4.8%
9 1162
 
4.5%
Other values (7) 1660
 
6.4%
Hangul
ValueCountFrequency (%)
5476
36.7%
5472
36.7%
251
 
1.7%
171
 
1.1%
166
 
1.1%
164
 
1.1%
161
 
1.1%
158
 
1.1%
156
 
1.0%
155
 
1.0%
Other values (135) 2583
17.3%

단속동명 (enforcement neighborhood)
Categorical

HIGH CORRELATION

Distinct: 25
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 47.4 KiB

둔산동: 1190
월평동: 1123
갈마동: 843
탄방동: 668
괴정동: 548
Other values (20): 1680

Length

Max length: 4
Median length: 3
Mean length: 3.0071051
Min length: 2

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: 갈마동
2nd row: 복수동
3rd row: 갈마동
4th row: 둔산동
5th row: 탄방동

Common Values

Value | Count | Frequency (%)
둔산동 | 1190 | 19.7%
월평동 | 1123 | 18.6%
갈마동 | 843 | 13.9%
탄방동 | 668 | 11.0%
괴정동 | 548 | 9.1%
관저동 | 410 | 6.8%
도안동 | 372 | 6.1%
만년동 | 292 | 4.8%
도마동 | 234 | 3.9%
용문동 | 71 | 1.2%
Other values (15) | 301 | 5.0%

Length

Histogram of lengths of the category

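신고구분 (report type)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 47.4 KiB

안전신문고 (Safety e-Report citizen reports): 5376
고정형CCTV (fixed CCTV): 547
주행형CCTV (vehicle-mounted CCTV): 99
PDA: 23
버스장착형CCTV (bus-mounted CCTV): 6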

Length

Max length: 9
Median length: 5
Mean length: 5.2105089
Min length: 3

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 안전신문고
2nd row: 안전신문고
3rd row: 안전신문고
4th row: 안전신문고
5th row: 안전신문고

Common Values

Value | Count | Frequency (%)
안전신문고 | 5376 | 88.8%
고정형CCTV | 547 | 9.0%
주행형CCTV | 99 | 1.6%
PDA | 23 | 0.4%
버스장착형CCTV | 6 | 0.1%
시청주행형CCTV | 1 | < 0.1%

Length

Histogram of lengths of the category

Interactions


Correlations

Variable | 차량분류 | 위반월 | 단속동명 | 신고구분
차량분류 | 1.000 | 0.034 | 0.175 | 0.000
위반월 | 0.034 | 1.000 | 0.197 | 0.103
단속동명 | 0.175 | 0.197 | 1.000 | 0.804
신고구분 | 0.000 | 0.103 | 0.804 | 1.000

Variable | 단속동명 | 신고구분 | 차량분류
단속동명 | 1.000 | 0.521 | 0.076
신고구분 | 0.521 | 1.000 | 0.000
차량분류 | 0.076 | 0.000 | 1.000

Variable | 위반월 | 차량분류 | 단속동명 | 신고구분
위반월 | 1.000 | 0.014 | 0.071 | 0.054
차량분류 | 0.014 | 1.000 | 0.076 | 0.000
단속동명 | 0.071 | 0.076 | 1.000 | 0.521
신고구분 | 0.054 | 0.000 | 0.521 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

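First rows

Row | 차량분류 | 위반연도 | 위반월 | 위반시간 | 단속위치 | 단속동명 | 신고구분
0 | 승용 | 2023 | 1 | 01:10:00 | 1268 부근 | 갈마동 | 안전신문고
1 | 승용 | 2023 | 1 | 07:43:00 | 984 부근 | 복수동 | 안전신문고
2 | 승용 | 2023 | 1 | 11:53:00 | 821 부근 | 갈마동 | 안전신문고
3 | 승용 | 2023 | 1 | 12:57:00 | 2161 부근 | 둔산동 | 안전신문고
4 | 승용 | 2023 | 1 | 13:39:00 | 707 부근 | 탄방동 | 안전신문고
5 | 승용 | 2023 | 1 | 13:59:00 | 423- 21 부근 | 괴정동 | 안전신문고
6 | 승용 | 2023 | 1 | 14:03:00 | 807 부근 | 월평동 | 안전신문고
7 | 승용 | 2023 | 1 | 17:00:00 | 2085 부근 | 도안동 | 안전신문고
8 | 승용 | 2023 | 1 | 17:15:00 | 703 부근 | 정림동 | 안전신문고
9 | 승용 | 2023 | 1 | 20:01:00 | 1260 부근 | 월평동 | 안전신문고

Last rows

Row | 차량분류 | 위반연도 | 위반월 | 위반시간 | 단속위치 | 단속동명 | 신고구분
6042 | 승용 | 2023 | 12 | 15:15:00 | 은하수아파트 후문 | 둔산동 | 고정형CCTV
6043 | 승용 | 2023 | 12 | 19:33:35 | 은하수아파트 후문 | 둔산동 | 고정형CCTV
6044 | 승용 | 2023 | 12 | 10:07:41 | 은하수아파트 후문 | 둔산동 | 고정형CCTV
6045 | 승용 | 2023 | 12 | 19:43:29 | 은하수아파트 후문 | 둔산동 | 고정형CCTV
6046 | 승용 | 2023 | 12 | 08:00:01 | 허룡약국 | 괴정동 | 고정형CCTV
6047 | 승용 | 2023 | 12 | 18:53:38 | 허룡약국 | 괴정동 | 고정형CCTV
6048 | 승용 | 2023 | 12 | 17:59:07 | 괴정약국 | 괴정동 | 고정형CCTV
6049 | 승용 | 2023 | 12 | 19:05:07 | 괴정약국 | 괴정동 | 고정형CCTV
6050 | 승용 | 2023 | 12 | 15:30:41 | 괴정약국 | 괴정동 | 고정형CCTV
6051 | 승용 | 2023 | 12 | 17:23:57 | 괴정약국 | 괴정동 | 고정형CCTV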

Duplicate rows

Most frequently occurring

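Row | 차량분류 | 위반연도 | 위반월 | 위반시간 | 단속위치 | 단속동명 | 신고구분 | # duplicates
20 | 승용 | 2023 | 3 | 09:32:00 | 821 부근 | 갈마동 | 안전신문고 | 3
22 | 승용 | 2023 | 3 | 10:41:00 | 429 부근 | 괴정동 | 안전신문고 | 3
25 | 승용 | 2023 | 3 | 17:59:00 | 1528 부근 | 둔산동 | 안전신문고 | 3
50 | 승용 | 2023 | 5 | 19:01:00 | 429 부근 | 괴정동 | 안전신문고 | 3
62 | 승용 | 2023 | 6 | 12:55:00 | 800 부근 | 탄방동 | 안전신문고 | 3
97 | 승용 | 2023 | 10 | 09:06:00 | 265 부근 | 용문동 | 안전신문고 | 3
110 | 승용 | 2023 | 11 | 10:16:00 | 2162 부근 | 둔산동 | 안전신문고 | 3
128 | 승용 | 2023 | 12 | 11:27:00 | 1909 부근 | 관저동 | 안전신문고 | 3
131 | 승용 | 2023 | 12 | 11:38:00 | 1909 부근 | 관저동 | 안전신문고 | 3
0 | 승용 | 2023 | 1 | 05:01:00 | 386 부근 | 월평동 | 안전신문고 | 2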
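
A table like this one groups identical rows and counts how often each occurs. The sketch below shows that grouping with the standard library on a small hand-built list; the rows are illustrative, not the full dataset, and the function name is hypothetical. It reports both the number of duplicated groups and the number of surplus rows, since the report does not state which of the two its "Duplicate rows" figure counts.

```python
from collections import Counter

def duplicate_groups(rows):
    """Return {row: occurrences} for every row that appears more than once."""
    counts = Counter(tuple(r) for r in rows)
    return {row: n for row, n in counts.items() if n > 1}

# Illustrative rows in the column order of the table (not the real file).
triple = ("승용", 2023, 3, "09:32:00", "821 부근", "갈마동", "안전신문고")
double = ("승용", 2023, 1, "05:01:00", "386 부근", "월평동", "안전신문고")
single = ("승용", 2023, 1, "01:10:00", "1268 부근", "갈마동", "안전신문고")
rows = [triple, triple, triple, double, double, single]

groups = duplicate_groups(rows)
n_groups = len(groups)                           # distinct rows that repeat
n_surplus = sum(n - 1 for n in groups.values())  # rows beyond the first copy

print(n_groups, n_surplus)
```

Whether repeated records are true duplicates or separate reports of the same violation within the same minute cannot be decided from these columns alone, so deduplication is a judgment call for the analyst.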