Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 1904
Missing cells (%): 4.8%
Duplicate rows: 151
Duplicate rows (%): 1.5%
Total size in memory: 410.2 KiB
Average record size in memory: 42.0 B
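
The percentages and the average record size follow directly from the raw counts. A minimal sketch of that arithmetic, using only the figures listed above (the variable and function names are illustrative, not part of ydata-profiling):

```python
# Recompute the derived overview figures from the raw counts in the report.
n_vars = 4
n_obs = 10_000
missing_cells = 1_904
duplicate_rows = 151
total_size_kib = 410.2


def pct(part: float, whole: float) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


missing_pct = pct(missing_cells, n_vars * n_obs)  # share of all cells
duplicate_pct = pct(duplicate_rows, n_obs)        # share of all rows
avg_record_bytes = round(total_size_kib * 1024 / n_obs, 1)

print(missing_pct, duplicate_pct, avg_record_bytes)
```

Note that the missing-cell percentage is taken over all cells (variables × observations), not over rows.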

Variable types

Categorical: 1
Numeric: 2
Text: 1

Dataset

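Description: This dataset records vehicle-related administrative fines imposed in Dong-gu, Daegu Metropolitan City. It covers each type of vehicle fine, including illegal parking and stopping (주정차위반), violations of the motor vehicle liability insurance law (자동차손해배상보장법위반), and late vehicle inspection fines (검사지연과태료), and includes the fine amount, the time of the parking violation, and the location of the parking violation.
URL: https://www.data.go.kr/data/15118611/fileData.do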

Alerts

Dataset has 151 (1.5%) duplicate rows [Duplicates]
위반내역1 is highly overall correlated with 세목명 [High correlation]
세목명 is highly overall correlated with 위반내역1 [High correlation]
세목명 is highly imbalanced (73.0%) [Imbalance]
위반내역1 has 952 (9.5%) missing values [Missing]
위반내역2 has 952 (9.5%) missing values [Missing]
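
The 73.0% imbalance figure is consistent with one minus the normalized Shannon entropy of the category counts. The sketch below assumes that definition (the report itself does not state the formula) and uses the 세목명 counts listed in the Variables section; `imbalance_score` is an illustrative name.

```python
import math


def imbalance_score(counts):
    """1 - H / log(k): 0 for a uniform split, close to 1 when one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # Undefined for fewer than two classes; return 0 by convention here.
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)


# Counts of the four distinct 세목명 values, as listed in the report.
counts = [9049, 655, 293, 3]
score = imbalance_score(counts)
print(f"{score:.1%}")
```

With these counts the score comes out at roughly 0.73, which matches the alert.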

Reproduction

Analysis started: 2023-12-12 08:03:09.289501
Analysis finished: 2023-12-12 08:03:10.699055
Duration: 1.41 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
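
The reported duration is simply the difference between the two timestamps. A small sketch of that check with the standard library (the format string is an assumption based on how the timestamps are printed above):

```python
from datetime import datetime

# Timestamp layout as printed in the report: date, time, microseconds.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-12 08:03:09.289501", FMT)
finished = datetime.strptime("2023-12-12 08:03:10.699055", FMT)

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```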

Variables

세목명
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

주정차위반: 9049
자동차검사지연과태료: 655
자동차손해배상보장법위반과태료: 293
<NA>: 3

Length

Max length: 15
Median length: 5
Mean length: 5.6202
Min length: 4
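
These length statistics can be reproduced from the four category values and their counts. The sketch below assumes length is measured in characters (Python `len` on the string) and that the literal string `<NA>` is one of the four values, as the counts suggest:

```python
from statistics import median

# Category values and counts as listed for 세목명.
value_counts = {
    "주정차위반": 9049,
    "자동차검사지연과태료": 655,
    "자동차손해배상보장법위반과태료": 293,
    "<NA>": 3,
}

n_rows = sum(value_counts.values())

# Count-weighted mean of the string lengths.
mean_len = sum(len(v) * c for v, c in value_counts.items()) / n_rows

# Expand to one length per row to take min, median and max.
all_lengths = [len(v) for v, c in value_counts.items() for _ in range(c)]

print(max(all_lengths), median(all_lengths), mean_len, min(all_lengths))
```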

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 주정차위반
2nd row: 주정차위반
3rd row: 주정차위반
4th row: 자동차손해배상보장법위반과태료
5th row: 주정차위반

Common Values

Value    Count    Frequency (%)
주정차위반    9049    90.5%
자동차검사지연과태료    655    6.6%
자동차손해배상보장법위반과태료    293    2.9%
<NA>    3    < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value    Count    Frequency (%)
주정차위반    9049    90.5%
자동차검사지연과태료    655    6.6%
자동차손해배상보장법위반과태료    293    2.9%
na    3    < 0.1%

본세
Real number (ℝ)

Distinct: 110
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40405.36
Minimum: 0
Maximum: 1000000
Zeros: 3
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 32000
Q1: 32000
Median: 32000
Q3: 40000
95-th percentile: 96000
Maximum: 1000000
Range: 1000000
Interquartile range (IQR): 8000

Descriptive statistics

Standard deviation: 41569.499
Coefficient of variation (CV): 1.0288115
Kurtosis: 197.35165
Mean: 40405.36
Median Absolute Deviation (MAD): 0
Skewness: 11.977202
Sum: 4.040536 × 10^8
Variance: 1.7280232 × 10^9
Monotonicity: Not monotonic
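
Several of these figures are tied together by simple identities: CV is the standard deviation divided by the mean, variance is the standard deviation squared, the sum is the mean times the number of non-missing rows, and the IQR is Q3 minus Q1. A short sketch that checks the reported values against each other (variable names are illustrative):

```python
import math

# Figures reported for 본세 (no missing values, so n is the full row count).
n = 10_000
mean = 40405.36
std = 41569.499
q1, q3 = 32000, 40000

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance
total = mean * n       # sum of all values
iqr = q3 - q1          # interquartile range

print(cv, variance, total, iqr)
```

A CV slightly above 1 together with a MAD of 0 reflects how concentrated the column is: more than half of the rows share the single value 32000, while a few very large fines stretch the standard deviation.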
Histogram with fixed size bins (bins=50)

Common values

Value    Count    Frequency (%)
32000    6374    63.7%
40000    2306    23.1%
96000    323    3.2%
16000    204    2.0%
12000    134    1.3%
50000    93    0.9%
64000    79    0.8%
20000    68    0.7%
120000    63    0.6%
15000    50    0.5%
Other values (100)    306    3.1%

Minimum 10 values

Value    Count    Frequency (%)
0    3    < 0.1%
5000    3    < 0.1%
6250    1    < 0.1%
10000    2    < 0.1%
11000    1    < 0.1%
12000    134    1.3%
14670    1    < 0.1%
15000    50    0.5%
16000    204    2.0%
16800    3    < 0.1%

Maximum 10 values

Value    Count    Frequency (%)
1000000    1    < 0.1%
930400    1    < 0.1%
900000    5    0.1%
720000    3    < 0.1%
698400    1    < 0.1%
600000    1    < 0.1%
520000    1    < 0.1%
480000    12    0.1%
453120    1    < 0.1%
450000    1    < 0.1%

위반내역1
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 8788
Distinct (%): 97.1%
Missing: 952
Missing (%): 9.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.022014 × 10^11
Minimum: 1.9930316 × 10^11
Maximum: 2.0230614 × 10^11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1.9930316 × 10^11
5-th percentile: 2.0210619 × 10^11
Q1: 2.022031 × 10^11
Median: 2.0220715 × 10^11
Q3: 2.0221214 × 10^11
95-th percentile: 2.0230428 × 10^11
Maximum: 2.0230614 × 10^11
Range: 3.0029803 × 10^9
Interquartile range (IQR): 9040094

Descriptive statistics

Standard deviation: 1.3221304 × 10^8
Coefficient of variation (CV): 0.00065386805
Kurtosis: 125.48506
Mean: 2.022014 × 10^11
Median Absolute Deviation (MAD): 4870448.5
Skewness: -8.8804016
Sum: 1.8295183 × 10^15
Variance: 1.7480287 × 10^16
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value    Count    Frequency (%)
202202091440    3    < 0.1%
202112271453    3    < 0.1%
202206291538    3    < 0.1%
202303071505    3    < 0.1%
202203141010    3    < 0.1%
202207251447    3    < 0.1%
202204150817    3    < 0.1%
202201071056    3    < 0.1%
202203301440    3    < 0.1%
202208300943    3    < 0.1%
Other values (8778)    9018    90.2%
(Missing)    952    9.5%

Minimum 10 values

Value    Count    Frequency (%)
199303161625    1    < 0.1%
199703031545    1    < 0.1%
199802271337    1    < 0.1%
200012280932    1    < 0.1%
200101171425    1    < 0.1%
200203191541    1    < 0.1%
200204241111    1    < 0.1%
200205300937    1    < 0.1%
200212011252    1    < 0.1%
200312290956    1    < 0.1%

Maximum 10 values

Value    Count    Frequency (%)
202306141959    1    < 0.1%
202306141723    1    < 0.1%
202306141710    1    < 0.1%
202306141709    1    < 0.1%
202306141404    1    < 0.1%
202306132129    1    < 0.1%
202306131012    1    < 0.1%
202306121856    1    < 0.1%
202306121705    1    < 0.1%
202306121639    1    < 0.1%
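
The 위반내역1 values look like timestamps packed into 12-digit integers (YYYYMMDDHHMM), which would explain why numeric summaries such as the mean or variance are hard to interpret for this column. That reading is an assumption based on the values shown, not something the report states. Under that assumption, a value could be decoded as follows (`parse_violation_time` is an illustrative helper):

```python
from datetime import datetime


def parse_violation_time(value: int) -> datetime:
    """Decode a 12-digit YYYYMMDDHHMM integer into a datetime (assumed layout)."""
    return datetime.strptime(str(value), "%Y%m%d%H%M")


# Smallest and largest values listed in the extreme-value tables.
earliest = parse_violation_time(199303161625)
latest = parse_violation_time(202306141959)

print(earliest.isoformat(), latest.isoformat())
```

Converting the column to a datetime type before profiling would likely give more meaningful statistics than treating it as a real number.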

위반내역2
Text

MISSING 

Distinct: 1222
Distinct (%): 13.5%
Missing: 952
Missing (%): 9.5%
Memory size: 156.2 KiB

Length

Max length: 17
Median length: 13
Mean length: 8.1993811
Min length: 3

Characters and Unicode

Total characters: 74188
Distinct characters: 261
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
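
As a rough illustration of how such character properties can be read, the sketch below counts the Unicode general category of each character in one of the sample values using Python's standard `unicodedata` module. The two-letter codes map to the category names used in this report (`Lo` = Other Letter, `Nd` = Decimal Number, `Zs` = Space Separator, `Pd` = Dash Punctuation); `category_counts` is an illustrative helper, not the profiler's own code.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category code (e.g. 'Lo', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)


# One of the sample values of 위반내역2.
counts = category_counts("지저동 761-30 부근")
print(counts)
```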

Unique

Unique: 653
Unique (%): 7.2%

Sample

1st row: 팔공로51길
2nd row: 화랑로9길
3rd row: 성동고가교
4th row: 지저동 761-30 부근
5th row: 검사동 해동로 197

Value    Count    Frequency (%)
부근    1975    11.7%
(not preserved)    685    4.1%
이시아    365    2.2%
신암동    329    1.9%
(not preserved)    306    1.8%
성동고가교    306    1.8%
중앙고속    306    1.8%
신천동    298    1.8%
동대구역    264    1.6%
신세계백화점    262    1.6%
Other values (1257)    11783    69.8%

Most occurring characters

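Value    Count    Frequency (%)
(space)    7873    10.6%
(not preserved)    3897    5.3%
(not preserved)    2525    3.4%
(not preserved)    2489    3.4%
1    2306    3.1%
(not preserved)    2152    2.9%
(not preserved)    2019    2.7%
(not preserved)    1698    2.3%
5    1507    2.0%
(not preserved)    1442    1.9%
Other values (251)    46280    62.4%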

Most occurring categories

Value    Count    Frequency (%)
Other Letter    53318    71.9%
Decimal Number    11424    15.4%
Space Separator    7873    10.6%
Dash Punctuation    1119    1.5%
Open Punctuation    157    0.2%
Close Punctuation    157    0.2%
Uppercase Letter    138    0.2%
Other Punctuation    2    < 0.1%

Most frequent character per category

Other Letter

Value    Count    Frequency (%)
(not preserved)    3897    7.3%
(not preserved)    2525    4.7%
(not preserved)    2489    4.7%
(not preserved)    2152    4.0%
(not preserved)    2019    3.8%
(not preserved)    1698    3.2%
(not preserved)    1442    2.7%
(not preserved)    1365    2.6%
(not preserved)    1109    2.1%
(not preserved)    981    1.8%
Other values (226)    33641    63.1%

Decimal Number

Value    Count    Frequency (%)
1    2306    20.2%
5    1507    13.2%
2    1376    12.0%
3    1307    11.4%
4    968    8.5%
6    876    7.7%
7    807    7.1%
9    787    6.9%
8    781    6.8%
0    709    6.2%

Uppercase Letter

Value    Count    Frequency (%)
M    51    37.0%
T    23    16.7%
P    23    16.7%
A    23    16.7%
K    10    7.2%
D    5    3.6%
U    1    0.7%
I    1    0.7%
S    1    0.7%

Other Punctuation

Value    Count    Frequency (%)
.    1    50.0%
,    1    50.0%

Space Separator

Value    Count    Frequency (%)
(space)    7873    100.0%

Dash Punctuation

Value    Count    Frequency (%)
-    1119    100.0%

Open Punctuation

Value    Count    Frequency (%)
(    157    100.0%

Close Punctuation

Value    Count    Frequency (%)
)    157    100.0%

Most occurring scripts

Value    Count    Frequency (%)
Hangul    53318    71.9%
Common    20732    27.9%
Latin    138    0.2%

Most frequent character per script

Hangul

Value    Count    Frequency (%)
(not preserved)    3897    7.3%
(not preserved)    2525    4.7%
(not preserved)    2489    4.7%
(not preserved)    2152    4.0%
(not preserved)    2019    3.8%
(not preserved)    1698    3.2%
(not preserved)    1442    2.7%
(not preserved)    1365    2.6%
(not preserved)    1109    2.1%
(not preserved)    981    1.8%
Other values (226)    33641    63.1%

Common

Value    Count    Frequency (%)
(space)    7873    38.0%
1    2306    11.1%
5    1507    7.3%
2    1376    6.6%
3    1307    6.3%
-    1119    5.4%
4    968    4.7%
6    876    4.2%
7    807    3.9%
9    787    3.8%
Other values (6)    1806    8.7%

Latin

Value    Count    Frequency (%)
M    51    37.0%
T    23    16.7%
P    23    16.7%
A    23    16.7%
K    10    7.2%
D    5    3.6%
U    1    0.7%
I    1    0.7%
S    1    0.7%

Most occurring blocks

Value    Count    Frequency (%)
Hangul    53318    71.9%
ASCII    20870    28.1%

Most frequent character per block

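ASCII

Value    Count    Frequency (%)
(space)    7873    37.7%
1    2306    11.0%
5    1507    7.2%
2    1376    6.6%
3    1307    6.3%
-    1119    5.4%
4    968    4.6%
6    876    4.2%
7    807    3.9%
9    787    3.8%
Other values (15)    1944    9.3%

Hangul

Value    Count    Frequency (%)
(not preserved)    3897    7.3%
(not preserved)    2525    4.7%
(not preserved)    2489    4.7%
(not preserved)    2152    4.0%
(not preserved)    2019    3.8%
(not preserved)    1698    3.2%
(not preserved)    1442    2.7%
(not preserved)    1365    2.6%
(not preserved)    1109    2.1%
(not preserved)    981    1.8%
Other values (226)    33641    63.1%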

Interactions


Correlations

Variable    세목명    본세    위반내역1
세목명    1.000    0.392    NaN
본세    0.392    1.000    0.000
위반내역1    NaN    0.000    1.000

Variable    본세    위반내역1    세목명
본세    1.000    -0.254    0.256
위반내역1    -0.254    1.000    1.000
세목명    0.256    1.000    1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

Index    세목명    본세    위반내역1    위반내역2
37896    주정차위반    32000    202206291446    팔공로51길
41754    주정차위반    32000    202207191044    화랑로9길
22830    주정차위반    32000    202204191416    성동고가교
18907    자동차손해배상보장법위반과태료    900000    <NA>    <NA>
76445    주정차위반    32000    202301091256    지저동 761-30 부근
49776    주정차위반    40000    202207151623    검사동 해동로 197
71151    주정차위반    32000    202301031053    반야월북로
96353    주정차위반    40000    202305121442    송라로28길
44726    주정차위반    40000    202101211536    첨단로8길
76272    주정차위반    32000    202302031451    봉무동 1591

Index    세목명    본세    위반내역1    위반내역2
85148    주정차위반    32000    202302271106    팔공로 227
96913    주정차위반    120000    202303301043    동촌로58길
97729    자동차손해배상보장법위반과태료    12000    <NA>    <NA>
84006    자동차손해배상보장법위반과태료    291000    <NA>    <NA>
4455    주정차위반    32000    202201121743    대구은행 강촌지점
17660    주정차위반    32000    202203301415    반야월북로11길
2810    주정차위반    32000    202201111529    방촌동 화랑로 405
17376    주정차위반    32000    202203231647    효목동 64-15 부근
85182    주정차위반    40000    202210310711    현대시티 아울렛 후문
75642    주정차위반    96000    202301051500    화랑로25길

Duplicate rows

Most frequently occurring

Index    세목명    본세    위반내역1    위반내역2    # duplicates
4    자동차검사지연과태료    32000    <NA>    <NA>    269
0    자동차검사지연과태료    16000    <NA>    <NA>    147
27    자동차손해배상보장법위반과태료    12000    <NA>    <NA>    134
1    자동차검사지연과태료    20000    <NA>    <NA>    65
28    자동차손해배상보장법위반과태료    15000    <NA>    <NA>    50
23    자동차검사지연과태료    300000    <NA>    <NA>    32
5    자동차검사지연과태료    40000    <NA>    <NA>    30
22    자동차검사지연과태료    240000    <NA>    <NA>    12
24    자동차검사지연과태료    480000    <NA>    <NA>    12
9    자동차검사지연과태료    80000    <NA>    <NA>    10