Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 13868
Missing cells (%): 27.7%
Duplicate rows: 39
Duplicate rows (%): 0.4%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
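
The percentages above can be re-derived from the raw counts. The following is a minimal sketch in Python that uses only figures reported in this overview. The variable names are illustrative, and it assumes that "Missing cells (%)" means missing cells divided by rows times columns, which the report does not state explicitly.

```python
# Re-derive the overview percentages from the reported counts.
n_vars = 5
n_obs = 10_000
missing_cells = 13_868
duplicate_rows = 39
total_bytes = 488.3 * 1024  # 488.3 KiB expressed in bytes

# Assumption: percentage of missing cells = missing / (rows * columns).
missing_pct = 100 * missing_cells / (n_vars * n_obs)
duplicate_pct = 100 * duplicate_rows / n_obs
avg_record_bytes = total_bytes / n_obs

# 위도 and 경도 each report 6934 missing values; together they
# account for every missing cell in the dataset.
missing_per_coordinate = 6_934
coordinates_explain_all = 2 * missing_per_coordinate == missing_cells

print(f"missing cells: {missing_pct:.1f}%")
print(f"duplicate rows: {duplicate_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
print(f"coordinates explain all missing cells: {coordinates_explain_all}")
```

The last check shows that the two coordinate columns are the only source of missing data.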

Variable types

DateTime: 1
Text: 1
Categorical: 1
Numeric: 2

Dataset

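Description: Data on illegal parking and stopping violations in Uijeongbu-si, Gyeonggi-do. The fields include enforcement date and time (단속일시), enforcement location (단속위치), enforcement type (단속구분), latitude (위도), and longitude (경도).
Author: Uijeongbu-si, Gyeonggi-do
URL: https://www.data.go.kr/data/15017431/fileData.do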

Alerts

Duplicates: Dataset has 39 (0.4%) duplicate rows
High correlation: 위도 is highly overall correlated with 단속구분
High correlation: 경도 is highly overall correlated with 단속구분
High correlation: 단속구분 is highly overall correlated with 위도 and 1 other field
Missing: 위도 has 6934 (69.3%) missing values
Missing: 경도 has 6934 (69.3%) missing values

Reproduction

Analysis started: 2023-12-12 10:58:57.985773
Analysis finished: 2023-12-12 10:58:59.510154
Duration: 1.52 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
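
The reported duration follows directly from the two timestamps. A short sketch with the standard library, using the timestamps from this section (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-12 10:58:57.985773")
finished = datetime.fromisoformat("2023-12-12 10:58:59.510154")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")
```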

Variables

단속일시
DateTime

Distinct: 9287
Distinct (%): 92.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2023-01-01 00:42:00
Maximum: 2023-09-15 13:29:00
Histogram with fixed size bins (bins=50)

단속장소
Text

Distinct: 1339
Distinct (%): 13.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 22
Median length: 18
Mean length: 13.261
Min length: 5

Characters and Unicode

Total characters: 132610
Distinct characters: 282
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
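
To make the category names used in this section concrete, here is a small sketch that classifies the characters of one sample value with Python's standard `unicodedata` module. It illustrates the general idea only and is not a description of how the profiling tool computes its counts. The two-letter codes map to the category names in the tables: `Lo` is Other Letter, `Lu` is Uppercase Letter, `Nd` is Decimal Number, `Ps` and `Pe` are Open and Close Punctuation, and `Pd` is Dash Punctuation.

```python
import unicodedata
from collections import Counter

# One of the sample values of this variable.
sample = "노바웨딩홀(T-71)"

# Count the Unicode general category of every character.
categories = Counter(unicodedata.category(ch) for ch in sample)

for code, count in categories.most_common():
    print(code, count)

# Character names identify Hangul syllables.
first_char_name = unicodedata.name(sample[0])
print(first_char_name)
```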

Unique

Unique: 612
Unique (%): 6.1%

Sample

1st row: 신곡동 장곡로 부근
2nd row: 낙양동 민락로 부근
3rd row: 의정부동 669
4th row: (신곡동)(양주방면 신곡동 산 87-6)
5th row: 노바웨딩홀(T-71)

Value | Count | Frequency (%)
부근 | 3066 | 13.8%
신곡동 | 1769 | 7.9%
의정부동 | 1503 | 6.8%
(value not recovered) | 813 | 3.7%
금오동 | 585 | 2.6%
호원동 | 544 | 2.4%
민락동 | 512 | 2.3%
신곡동)(양주방면 | 386 | 1.7%
87-6 | 386 | 1.7%
94-1 | 365 | 1.6%
Other values (1360) | 12327 | 55.4%
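
Entries such as "신곡동)(양주방면" and "87-6" in the word table suggest that values are split on whitespace and that punctuation is trimmed only from the edges of each token. The sketch below reproduces that behaviour on the five sample rows. It is an assumption inferred from the table, not a confirmed description of the tool, and `tokenize` is a hypothetical helper.

```python
import string
from collections import Counter

# The five sample rows of this variable.
rows = [
    "신곡동 장곡로 부근",
    "낙양동 민락로 부근",
    "의정부동 669",
    "(신곡동)(양주방면 신곡동 산 87-6)",
    "노바웨딩홀(T-71)",
]

def tokenize(text):
    """Lowercase, split on whitespace, trim punctuation from token edges."""
    strip_chars = string.punctuation + string.whitespace
    tokens = (tok.strip(strip_chars) for tok in text.lower().split())
    return [tok for tok in tokens if tok]

word_counts = Counter(tok for row in rows for tok in tokenize(row))

print(tokenize(rows[3]))
print(word_counts.most_common(2))
```

Because only the edges are trimmed, the parentheses inside "(신곡동)(양주방면" survive, which matches the odd-looking token in the table.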

Most occurring characters

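Value | Count | Frequency (%)
(space) | 12268 | 9.3%
(Hangul character, not recovered) | 7415 | 5.6%
- | 6200 | 4.7%
) | 5431 | 4.1%
( | 5430 | 4.1%
1 | 5264 | 4.0%
(Hangul character, not recovered) | 5019 | 3.8%
T | 3759 | 2.8%
(Hangul character, not recovered) | 3355 | 2.5%
2 | 3316 | 2.5%
Other values (272) | 75153 | 56.7%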

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 72507 | 54.7%
Decimal Number | 26658 | 20.1%
Space Separator | 12268 | 9.3%
Dash Punctuation | 6200 | 4.7%
Close Punctuation | 5431 | 4.1%
Open Punctuation | 5430 | 4.1%
Uppercase Letter | 4108 | 3.1%
Lowercase Letter | 7 | < 0.1%
Other Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not recovered) | 7415 | 10.2%
(not recovered) | 5019 | 6.9%
(not recovered) | 3355 | 4.6%
(not recovered) | 3207 | 4.4%
(not recovered) | 3044 | 4.2%
(not recovered) | 2781 | 3.8%
(not recovered) | 1991 | 2.7%
(not recovered) | 1932 | 2.7%
(not recovered) | 1884 | 2.6%
(not recovered) | 1791 | 2.5%
Other values (247) | 40088 | 55.3%

Decimal Number
Value | Count | Frequency (%)
1 | 5264 | 19.7%
2 | 3316 | 12.4%
7 | 3252 | 12.2%
4 | 2760 | 10.4%
6 | 2492 | 9.3%
8 | 2228 | 8.4%
3 | 2108 | 7.9%
5 | 1815 | 6.8%
0 | 1721 | 6.5%
9 | 1702 | 6.4%

Uppercase Letter
Value | Count | Frequency (%)
T | 3759 | 91.5%
S | 165 | 4.0%
G | 148 | 3.6%
K | 14 | 0.3%
A | 7 | 0.2%
L | 5 | 0.1%
P | 4 | 0.1%
R | 3 | 0.1%
H | 3 | 0.1%

Space Separator
Value | Count | Frequency (%)
(space) | 12268 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 6200 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 5431 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 5430 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
e | 7 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 72507 | 54.7%
Common | 55988 | 42.2%
Latin | 4115 | 3.1%

Most frequent character per script

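Hangul
Value | Count | Frequency (%)
(not recovered) | 7415 | 10.2%
(not recovered) | 5019 | 6.9%
(not recovered) | 3355 | 4.6%
(not recovered) | 3207 | 4.4%
(not recovered) | 3044 | 4.2%
(not recovered) | 2781 | 3.8%
(not recovered) | 1991 | 2.7%
(not recovered) | 1932 | 2.7%
(not recovered) | 1884 | 2.6%
(not recovered) | 1791 | 2.5%
Other values (247) | 40088 | 55.3%

Common
Value | Count | Frequency (%)
(space) | 12268 | 21.9%
- | 6200 | 11.1%
) | 5431 | 9.7%
( | 5430 | 9.7%
1 | 5264 | 9.4%
2 | 3316 | 5.9%
7 | 3252 | 5.8%
4 | 2760 | 4.9%
6 | 2492 | 4.5%
8 | 2228 | 4.0%
Other values (5) | 7347 | 13.1%

Latin
Value | Count | Frequency (%)
T | 3759 | 91.3%
S | 165 | 4.0%
G | 148 | 3.6%
K | 14 | 0.3%
A | 7 | 0.2%
e | 7 | 0.2%
L | 5 | 0.1%
P | 4 | 0.1%
R | 3 | 0.1%
H | 3 | 0.1%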

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 72507 | 54.7%
ASCII | 60103 | 45.3%

Most frequent character per block

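ASCII
Value | Count | Frequency (%)
(space) | 12268 | 20.4%
- | 6200 | 10.3%
) | 5431 | 9.0%
( | 5430 | 9.0%
1 | 5264 | 8.8%
T | 3759 | 6.3%
2 | 3316 | 5.5%
7 | 3252 | 5.4%
4 | 2760 | 4.6%
6 | 2492 | 4.1%
Other values (15) | 9931 | 16.5%

Hangul
Value | Count | Frequency (%)
(not recovered) | 7415 | 10.2%
(not recovered) | 5019 | 6.9%
(not recovered) | 3355 | 4.6%
(not recovered) | 3207 | 4.4%
(not recovered) | 3044 | 4.2%
(not recovered) | 2781 | 3.8%
(not recovered) | 1991 | 2.7%
(not recovered) | 1932 | 2.7%
(not recovered) | 1884 | 2.6%
(not recovered) | 1791 | 2.5%
Other values (247) | 40088 | 55.3%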

단속구분
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
고정형CCTV (fixed CCTV): 4587
차량형CCTV (vehicle-mounted CCTV): 3066
민원신고 (citizen report): 2341
인력단속 (enforcement by officers): 6

Length

Max length: 7
Median length: 7
Mean length: 6.2959
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 차량형CCTV
2nd row: 차량형CCTV
3rd row: 민원신고
4th row: 고정형CCTV
5th row: 고정형CCTV

Common Values

Value | Count | Frequency (%)
고정형CCTV | 4587 | 45.9%
차량형CCTV | 3066 | 30.7%
민원신고 | 2341 | 23.4%
인력단속 | 6 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of the common values]

Value | Count | Frequency (%)
고정형cctv | 4587 | 45.9%
차량형cctv | 3066 | 30.7%
민원신고 | 2341 | 23.4%
인력단속 | 6 | 0.1%

위도
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 2758
Distinct (%): 90.0%
Missing: 6934
Missing (%): 69.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.73949
Minimum: 37.693592
Maximum: 37.762848
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 37.693592
5-th percentile: 37.71257
Q1: 37.734257
Median: 37.741487
Q3: 37.746008
95-th percentile: 37.756458
Maximum: 37.762848
Range: 0.069256
Interquartile range (IQR): 0.011751

Descriptive statistics

Standard deviation: 0.011976794
Coefficient of variation (CV): 0.00031735442
Kurtosis: 1.5868187
Mean: 37.73949
Median Absolute Deviation (MAD): 0.0062715
Skewness: -1.0624714
Sum: 115709.28
Variance: 0.0001434436
Monotonicity: Not monotonic
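
Several of the 위도 statistics are tied together by standard definitions, so they can be cross-checked against each other. The sketch below uses only values reported for this variable and assumes the textbook definitions (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared). The report does not say which exact estimator the tool uses, so treat this as a consistency check rather than a reimplementation.

```python
import math

# Values reported for 위도 in this section.
minimum, maximum = 37.693592, 37.762848
q1, q3 = 37.734257, 37.746008
mean, std = 37.73949, 0.011976794
total = 115709.28

value_range = maximum - minimum   # reported: 0.069256
iqr = q3 - q1                     # reported: 0.011751
cv = std / mean                   # reported: 0.00031735442
variance = std ** 2               # reported: 0.0001434436

# Sum divided by mean recovers the number of non-missing values,
# which should match 10000 - 6934 = 3066.
n_present = total / mean

print(f"range={value_range:.6f} iqr={iqr:.6f}")
print(f"cv={cv:.11f} variance={variance:.10f}")
print(f"non-missing values ~ {n_present:.1f}")
```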
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
37.736458 | 4 | < 0.1%
37.75582 | 4 | < 0.1%
37.73754 | 4 | < 0.1%
37.748455 | 3 | < 0.1%
37.743385 | 3 | < 0.1%
37.737413 | 3 | < 0.1%
37.74217 | 3 | < 0.1%
37.737583 | 3 | < 0.1%
37.741402 | 3 | < 0.1%
37.734438 | 3 | < 0.1%
Other values (2748) | 3033 | 30.3%
(Missing) | 6934 | 69.3%

10 smallest values
Value | Count | Frequency (%)
37.693592 | 1 | < 0.1%
37.693903 | 1 | < 0.1%
37.693983 | 1 | < 0.1%
37.695273 | 1 | < 0.1%
37.695298 | 1 | < 0.1%
37.695725 | 1 | < 0.1%
37.695905 | 1 | < 0.1%
37.69606 | 1 | < 0.1%
37.69975 | 1 | < 0.1%
37.699762 | 2 | < 0.1%

10 largest values
Value | Count | Frequency (%)
37.762848 | 1 | < 0.1%
37.76268 | 1 | < 0.1%
37.761032 | 1 | < 0.1%
37.760627 | 1 | < 0.1%
37.760488 | 1 | < 0.1%
37.760372 | 1 | < 0.1%
37.760273 | 1 | < 0.1%
37.760013 | 1 | < 0.1%
37.759968 | 1 | < 0.1%
37.759782 | 1 | < 0.1%

경도
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 2799
Distinct (%): 91.3%
Missing: 6934
Missing (%): 69.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 127.06253
Minimum: 127.01804
Maximum: 127.12068
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 127.01804
5-th percentile: 127.03668
Q1: 127.04524
Median: 127.0502
Q3: 127.08344
95-th percentile: 127.11256
Maximum: 127.12068
Range: 0.102634
Interquartile range (IQR): 0.0381965

Descriptive statistics

Standard deviation: 0.023930597
Coefficient of variation (CV): 0.00018833717
Kurtosis: -0.51393885
Mean: 127.06253
Median Absolute Deviation (MAD): 0.010085
Skewness: 0.85227357
Sum: 389573.72
Variance: 0.00057267349
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
127.048837 | 4 | < 0.1%
127.039385 | 4 | < 0.1%
127.048885 | 4 | < 0.1%
127.0462 | 3 | < 0.1%
127.04404 | 3 | < 0.1%
127.047425 | 3 | < 0.1%
127.082023 | 3 | < 0.1%
127.047537 | 3 | < 0.1%
127.048437 | 3 | < 0.1%
127.04834 | 3 | < 0.1%
Other values (2789) | 3033 | 30.3%
(Missing) | 6934 | 69.3%

10 smallest values
Value | Count | Frequency (%)
127.018043 | 1 | < 0.1%
127.025255 | 1 | < 0.1%
127.025848 | 1 | < 0.1%
127.025892 | 1 | < 0.1%
127.025913 | 1 | < 0.1%
127.025985 | 1 | < 0.1%
127.029487 | 1 | < 0.1%
127.029507 | 1 | < 0.1%
127.029548 | 1 | < 0.1%
127.029617 | 1 | < 0.1%

10 largest values
Value | Count | Frequency (%)
127.120677 | 1 | < 0.1%
127.120672 | 1 | < 0.1%
127.120657 | 1 | < 0.1%
127.120655 | 1 | < 0.1%
127.11995 | 1 | < 0.1%
127.11896 | 1 | < 0.1%
127.118902 | 1 | < 0.1%
127.11885 | 1 | < 0.1%
127.118847 | 1 | < 0.1%
127.118555 | 1 | < 0.1%

Interactions

[Figures: four interaction plots, not reproduced here]

Correlations

Variable | 단속구분 | 위도 | 경도
단속구분 | 1.000 | NaN | NaN
위도 | NaN | 1.000 | 0.669
경도 | NaN | 0.669 | 1.000

Variable | 위도 | 경도 | 단속구분
위도 | 1.000 | 0.220 | 1.000
경도 | 0.220 | 1.000 | 1.000
단속구분 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
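
As an illustration of nullity correlation, the sketch below computes the Pearson correlation between presence indicators (1 = present, 0 = missing) for 위도 and 경도, taken from the 20 rows listed in the Sample section of this report. The `pearson` helper is written here for illustration and is not taken from the profiling tool. How the tool computes its heatmap is not stated in the report.

```python
import math

# Presence indicators for the 20 sample rows (1 = value present).
lat_present = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0,
               1, 0, 0, 1, 0, 0, 0, 1, 0, 1]
lon_present = list(lat_present)   # 경도 is present exactly when 위도 is
always_present = [1] * 20         # e.g. 단속일시, which has no missing values

def pearson(xs, ys):
    """Pearson correlation; NaN when either input has zero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return math.nan
    return cov / math.sqrt(var_x * var_y)

r_lat_lon = pearson(lat_present, lon_present)
r_lat_complete = pearson(lat_present, always_present)

print(r_lat_lon)       # close to 1: the two columns are missing together
print(r_lat_complete)  # nan: a fully populated column has no nullity variance
```

A fully populated column has no variation in its presence indicator, so its nullity correlation with any other column is undefined.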

Sample

(index) | 단속일시 | 단속장소 | 단속구분 | 위도 | 경도
74275 | 2023-07-13 10:01 | 신곡동 장곡로 부근 | 차량형CCTV | 37.745993 | 127.058382
34906 | 2023-03-28 09:17 | 낙양동 민락로 부근 | 차량형CCTV | 37.756868 | 127.113305
42747 | 2023-04-16 14:23 | 의정부동 669 | 민원신고 | <NA> | <NA>
2129 | 2023-01-06 06:58 | (신곡동)(양주방면 신곡동 산 87-6) | 고정형CCTV | <NA> | <NA>
50394 | 2023-05-06 16:26 | 노바웨딩홀(T-71) | 고정형CCTV | <NA> | <NA>
75569 | 2023-07-16 21:00 | 가능동 59-9 | 민원신고 | <NA> | <NA>
21327 | 2023-02-25 10:13 | 고산GS프라자삼거리(T-197) | 고정형CCTV | <NA> | <NA>
74097 | 2023-07-12 15:42 | 백석천제2공영주차장(T-189) | 고정형CCTV | <NA> | <NA>
7366 | 2023-01-19 15:40 | 의정부동 둔야로33번길 부근 | 차량형CCTV | 37.738763 | 127.038165
29418 | 2023-03-15 09:35 | 홈플러스 주차장입구(T-102) | 고정형CCTV | <NA> | <NA>

(index) | 단속일시 | 단속장소 | 단속구분 | 위도 | 경도
41243 | 2023-04-12 16:03 | 호원동 호암로 부근 | 차량형CCTV | 37.719418 | 127.043977
11930 | 2023-02-02 10:24 | 새마을식당앞(T-217) | 고정형CCTV | <NA> | <NA>
87065 | 2023-08-14 12:20 | 금오동 477-8 | 민원신고 | <NA> | <NA>
49365 | 2023-05-03 16:13 | 민락동 산 66-1 부근 | 차량형CCTV | 37.74676 | 127.120672
58372 | 2023-05-29 19:34 | 금오동 477-7 | 민원신고 | <NA> | <NA>
88377 | 2023-08-17 19:53 | 의정부동 691 | 민원신고 | <NA> | <NA>
79637 | 2023-07-27 06:01 | 가능동 755 | 민원신고 | <NA> | <NA>
34982 | 2023-03-28 10:28 | 의정부동 범골로74번길 부근 | 차량형CCTV | 37.73172 | 127.039747
13178 | 2023-02-05 10:19 | 푸르미 아파트 뒤편(T-36) | 고정형CCTV | <NA> | <NA>
2297 | 2023-01-06 14:22 | 낙양동 용민로 부근 | 차량형CCTV | 37.75654 | 127.112462
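
In the sample rows, coordinates appear only for 차량형CCTV records. This lines up with the counts reported earlier: 3066 차량형CCTV rows and 10000 − 6934 = 3066 non-missing coordinates. The sketch below tallies the 20 sample rows by enforcement type. It is suggestive rather than conclusive, since 20 rows cannot establish the pattern for the whole dataset, and the data structure is an illustrative encoding of the sample.

```python
from collections import defaultdict

# (단속구분, coordinates present?) for the 20 sample rows, in order.
sample = [
    ("차량형CCTV", True), ("차량형CCTV", True), ("민원신고", False),
    ("고정형CCTV", False), ("고정형CCTV", False), ("민원신고", False),
    ("고정형CCTV", False), ("고정형CCTV", False), ("차량형CCTV", True),
    ("고정형CCTV", False),
    ("차량형CCTV", True), ("고정형CCTV", False), ("민원신고", False),
    ("차량형CCTV", True), ("민원신고", False), ("민원신고", False),
    ("민원신고", False), ("차량형CCTV", True), ("고정형CCTV", False),
    ("차량형CCTV", True),
]

# Per enforcement type: [rows with coordinates, rows without].
tally = defaultdict(lambda: [0, 0])
for kind, has_coords in sample:
    tally[kind][0 if has_coords else 1] += 1

for kind, (present, missing) in tally.items():
    print(f"{kind}: {present} with coordinates, {missing} without")

# Dataset-level counts from the report agree with the same pattern.
counts_match = 10_000 - 6_934 == 3_066
```

If the pattern holds for the full dataset, it would also explain why 단속구분 is flagged as highly correlated with both coordinate columns.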

Duplicate rows

Most frequently occurring

(index) | 단속일시 | 단속장소 | 단속구분 | 위도 | 경도 | # duplicates
0 | 2023-01-21 14:40 | 신곡동 813-16 | 민원신고 | <NA> | <NA> | 2
1 | 2023-01-28 11:30 | 의정부동 395 | 민원신고 | <NA> | <NA> | 2
2 | 2023-03-04 10:17 | 양지공원길주차장입구(T-141) | 고정형CCTV | <NA> | <NA> | 2
3 | 2023-03-13 12:16 | 신곡동 765-2 | 민원신고 | <NA> | <NA> | 2
4 | 2023-03-19 14:27 | 역전근린공원(T-224) | 고정형CCTV | <NA> | <NA> | 2
5 | 2023-03-19 17:07 | 신곡동 736 | 민원신고 | <NA> | <NA> | 2
6 | 2023-04-08 11:58 | 신곡동 813-22 | 민원신고 | <NA> | <NA> | 2
7 | 2023-04-09 18:11 | 신곡동 813-21 | 민원신고 | <NA> | <NA> | 2
8 | 2023-04-09 18:28 | 금오동 477-8 | 민원신고 | <NA> | <NA> | 2
9 | 2023-04-26 17:14 | 의정부동 15-7 | 민원신고 | <NA> | <NA> | 2
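
A duplicate row is one whose values match another row in every column. The sketch below shows one simple way to detect and drop such rows with the standard library. The input list is a small illustrative set built from rows shown in this report, with `None` standing in for `<NA>`. It is not the full dataset and not the tool's own implementation.

```python
from collections import Counter

# Illustrative rows: (단속일시, 단속장소, 단속구분, 위도, 경도).
rows = [
    ("2023-01-21 14:40", "신곡동 813-16", "민원신고", None, None),
    ("2023-01-21 14:40", "신곡동 813-16", "민원신고", None, None),
    ("2023-01-28 11:30", "의정부동 395", "민원신고", None, None),
    ("2023-01-28 11:30", "의정부동 395", "민원신고", None, None),
    ("2023-07-13 10:01", "신곡동 장곡로 부근", "차량형CCTV", 37.745993, 127.058382),
]

# Count identical rows; anything seen more than once is a duplicate group.
counts = Counter(rows)
duplicates = {row: n for row, n in counts.items() if n > 1}

# Keep the first occurrence of each row, preserving the original order.
deduplicated = list(dict.fromkeys(rows))

for row, n in duplicates.items():
    print(row[0], row[1], "occurs", n, "times")
print(len(rows), "rows ->", len(deduplicated), "after de-duplication")
```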