Overview

Dataset statistics

Number of variables: 13
Number of observations: 2386
Missing cells: 2034
Missing cells (%): 6.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 244.8 KiB
Average record size in memory: 105.1 B
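
The percentage and per-record figures above can be re-derived from the raw counts. The following is a minimal sketch of that arithmetic; it assumes that "missing cells (%)" is taken over all cells (variables times observations) and that 1 KiB is 1024 bytes, which is consistent with the reported values but is not stated in the report itself.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 13
n_observations = 2386
missing_cells = 2034
total_size_kib = 244.8

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Both derived values round to the figures shown in the report (6.6% and 105.1 B).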

Variable types

Text: 7
Categorical: 3
Numeric: 1
DateTime: 2

Alerts

confirm_date has constant value "" (Constant)
sigungu is highly overall correlated with lng and 2 other fields (High correlation)
road_type is highly overall correlated with gubun and 1 other fields (High correlation)
lng is highly overall correlated with sigungu (High correlation)
gubun is highly overall correlated with road_type and 1 other fields (High correlation)
gubun is highly imbalanced (66.2%) (Imbalance)
road_type is highly imbalanced (76.0%) (Imbalance)
road has 1886 (79.0%) missing values (Missing)
ins_date has 28 (1.2%) missing values (Missing)
lng has 30 (1.3%) missing values (Missing)
confirm_date has 30 (1.3%) missing values (Missing)
last_load_dttm has 30 (1.3%) missing values (Missing)
skey has unique values (Unique)
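
The two imbalance percentages can be reproduced from the category counts listed later in the report. The sketch below assumes the score is one minus the Shannon entropy of the class shares divided by the logarithm of the number of classes; the report does not state the formula, but this definition matches both reported values. The function name `imbalance` is illustrative only.

```python
import math

def imbalance(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)

# Counts taken from the "Common Values" tables; the "Other values" rows
# are expanded into singletons (gubun: 6 values, road_type: 15 values).
gubun_counts = [1439, 801, 68, 42, 15, 6, 3, 2, 2, 2] + [1] * 6
road_type_counts = [1796, 511, 28, 23, 13] + [1] * 15

print(f"gubun: {imbalance(gubun_counts):.1%}")
print(f"road_type: {imbalance(road_type_counts):.1%}")
```

A score of 0 would mean all categories are equally frequent, and a score near 1 means a single category dominates.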

Reproduction

Analysis started: 2024-04-16 13:54:24.087354
Analysis finished: 2024-04-16 13:54:27.223991
Duration: 3.14 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

skey
Text

UNIQUE 

Distinct: 2386
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 18.8 KiB

Length

Max length: 17
Median length: 5
Mean length: 5.0314334
Min length: 5

Characters and Unicode

Total characters: 12005
Distinct characters: 91
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
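
As a rough illustration of how such per-category counts can be obtained, the sketch below uses Python's standard `unicodedata` module. It returns two-letter general-category codes rather than the long names used in this report ("Nd" is Decimal Number, "Lo" is Other Letter, "Ps" and "Pe" are Open and Close Punctuation, "Po" is Other Punctuation). The sample string is an excerpt of an `ins_place` value from this dataset; the helper name `category_counts` is hypothetical, and this is not necessarily how the profiler computes its tables.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count Unicode general-category codes (e.g. 'Nd', 'Lo') in a string."""
    return Counter(unicodedata.category(ch) for ch in text)

sample = "정관신도시(#11)"
counts = category_counts(sample)
for code, n in counts.most_common():
    print(code, n)
```

The standard library does not expose script or block properties, so those two breakdowns would need another data source.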

Unique

Unique: 2386
Unique (%): 100.0%

Sample

1st row: 19522
2nd row: 19523
3rd row: 19524
4th row: 19525
5th row: 19526
ValueCountFrequency (%)
19522 1
 
< 0.1%
18027 1
 
< 0.1%
17929 1
 
< 0.1%
17936 1
 
< 0.1%
17930 1
 
< 0.1%
17931 1
 
< 0.1%
17932 1
 
< 0.1%
17933 1
 
< 0.1%
17934 1
 
< 0.1%
17935 1
 
< 0.1%
Other values (2377) 2377
99.6%

Most occurring characters

ValueCountFrequency (%)
1 3038
25.3%
8 1677
14.0%
9 1378
11.5%
7 1348
11.2%
4 778
 
6.5%
6 777
 
6.5%
5 777
 
6.5%
3 748
 
6.2%
0 669
 
5.6%
2 668
 
5.6%
Other values (81) 147
 
1.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 11858
98.8%
Other Letter 111
 
0.9%
Close Punctuation 15
 
0.1%
Open Punctuation 15
 
0.1%
Uppercase Letter 2
 
< 0.1%
Space Separator 2
 
< 0.1%
Other Punctuation 1
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4
 
3.6%
3
 
2.7%
3
 
2.7%
3
 
2.7%
3
 
2.7%
3
 
2.7%
3
 
2.7%
3
 
2.7%
3
 
2.7%
3
 
2.7%
Other values (65) 80
72.1%
Decimal Number
ValueCountFrequency (%)
1 3038
25.6%
8 1677
14.1%
9 1378
11.6%
7 1348
11.4%
4 778
 
6.6%
6 777
 
6.6%
5 777
 
6.6%
3 748
 
6.3%
0 669
 
5.6%
2 668
 
5.6%
Close Punctuation
ValueCountFrequency (%)
) 15
100.0%
Open Punctuation
ValueCountFrequency (%)
( 15
100.0%
Uppercase Letter
ValueCountFrequency (%)
E 2
100.0%
Space Separator
ValueCountFrequency (%)
2
100.0%
Other Punctuation
ValueCountFrequency (%)
, 1
100.0%
Math Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 11892
99.1%
Hangul 111
 
0.9%
Latin 2
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4
 
3.6%
3
 
2.7%
3
 
2.7%
3
 
2.7%
3
 
2.7%
3
 
2.7%
3
 
2.7%
3
 
2.7%
3
 
2.7%
3
 
2.7%
Other values (65) 80
72.1%
Common
ValueCountFrequency (%)
1 3038
25.5%
8 1677
14.1%
9 1378
11.6%
7 1348
11.3%
4 778
 
6.5%
6 777
 
6.5%
5 777
 
6.5%
3 748
 
6.3%
0 669
 
5.6%
2 668
 
5.6%
Other values (5) 34
 
0.3%
Latin
ValueCountFrequency (%)
E 2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11893
99.1%
Hangul 111
 
0.9%
Arrows 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 3038
25.5%
8 1677
14.1%
9 1378
11.6%
7 1348
11.3%
4 778
 
6.5%
6 777
 
6.5%
5 777
 
6.5%
3 748
 
6.3%
0 669
 
5.6%
2 668
 
5.6%
Other values (5) 35
 
0.3%
Hangul
ValueCountFrequency (%)
4
 
3.6%
3
 
2.7%
3
 
2.7%
3
 
2.7%
3
 
2.7%
3
 
2.7%
3
 
2.7%
3
 
2.7%
3
 
2.7%
3
 
2.7%
Other values (65) 80
72.1%
Arrows
ValueCountFrequency (%)
1
100.0%

mgrnu
Text

Distinct: 2321
Distinct (%): 97.3%
Missing: 0
Missing (%): 0.0%
Memory size: 18.8 KiB

Length

Max length: 7
Median length: 5
Mean length: 4.5616094
Min length: 4

Characters and Unicode

Total characters: 10884
Distinct characters: 19
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2268
Unique (%): 95.1%

Sample

1st row: 16124
2nd row: 16125
3rd row: 16126
4th row: 16127
5th row: 16128
ValueCountFrequency (%)
제어 15
 
0.6%
일반신호 10
 
0.4%
전자신호 5
 
0.2%
13364 3
 
0.1%
13368 2
 
0.1%
13376 2
 
0.1%
9293 2
 
0.1%
12350 2
 
0.1%
12111 2
 
0.1%
13384 2
 
0.1%
Other values (2312) 2356
98.1%

Most occurring characters

ValueCountFrequency (%)
1 2768
25.4%
0 1889
17.4%
2 1245
11.4%
3 903
 
8.3%
6 826
 
7.6%
5 763
 
7.0%
9 643
 
5.9%
4 635
 
5.8%
7 564
 
5.2%
8 543
 
5.0%
Other values (9) 105
 
1.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 10779
99.0%
Other Letter 90
 
0.8%
Space Separator 15
 
0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 2768
25.7%
0 1889
17.5%
2 1245
11.6%
3 903
 
8.4%
6 826
 
7.7%
5 763
 
7.1%
9 643
 
6.0%
4 635
 
5.9%
7 564
 
5.2%
8 543
 
5.0%
Other Letter
ValueCountFrequency (%)
15
16.7%
15
16.7%
15
16.7%
15
16.7%
10
11.1%
10
11.1%
5
 
5.6%
5
 
5.6%
Space Separator
ValueCountFrequency (%)
15
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 10794
99.2%
Hangul 90
 
0.8%

Most frequent character per script

Common
ValueCountFrequency (%)
1 2768
25.6%
0 1889
17.5%
2 1245
11.5%
3 903
 
8.4%
6 826
 
7.7%
5 763
 
7.1%
9 643
 
6.0%
4 635
 
5.9%
7 564
 
5.2%
8 543
 
5.0%
Hangul
ValueCountFrequency (%)
15
16.7%
15
16.7%
15
16.7%
15
16.7%
10
11.1%
10
11.1%
5
 
5.6%
5
 
5.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10794
99.2%
Hangul 90
 
0.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 2768
25.6%
0 1889
17.5%
2 1245
11.5%
3 903
 
8.4%
6 826
 
7.7%
5 763
 
7.1%
9 643
 
6.0%
4 635
 
5.9%
7 564
 
5.2%
8 543
 
5.0%
Hangul
ValueCountFrequency (%)
15
16.7%
15
16.7%
15
16.7%
15
16.7%
10
11.1%
10
11.1%
5
 
5.6%
5
 
5.6%

road
Text

MISSING 

Distinct: 53
Distinct (%): 10.6%
Missing: 1886
Missing (%): 79.0%
Memory size: 18.8 KiB

Length

Max length: 10
Median length: 8
Mean length: 4.002
Min length: 3

Characters and Unicode

Total characters: 2001
Distinct characters: 73
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 28
Unique (%): 5.6%

Sample

1st row: 산단로
2nd row: 정관로
3rd row: 산단로
4th row: 정관로
5th row: 정관로
ValueCountFrequency (%)
중앙대로 56
 
11.2%
해운대로 40
 
8.0%
반송로 39
 
7.8%
낙동대로 36
 
7.2%
백양대로 32
 
6.4%
정관로 26
 
5.2%
기장대로 26
 
5.2%
태종로 25
 
5.0%
공항로 22
 
4.4%
수영로 21
 
4.2%
Other values (43) 177
35.4%

Most occurring characters

ValueCountFrequency (%)
486
24.3%
275
 
13.7%
72
 
3.6%
65
 
3.2%
62
 
3.1%
62
 
3.1%
60
 
3.0%
0 53
 
2.6%
50
 
2.5%
39
 
1.9%
Other values (63) 777
38.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 1844
92.2%
Decimal Number 125
 
6.2%
Dash Punctuation 30
 
1.5%
Other Punctuation 2
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
486
26.4%
275
14.9%
72
 
3.9%
65
 
3.5%
62
 
3.4%
62
 
3.4%
60
 
3.3%
50
 
2.7%
39
 
2.1%
39
 
2.1%
Other values (51) 634
34.4%
Decimal Number
ValueCountFrequency (%)
0 53
42.4%
1 29
23.2%
2 14
 
11.2%
9 8
 
6.4%
7 6
 
4.8%
6 4
 
3.2%
4 3
 
2.4%
5 3
 
2.4%
3 3
 
2.4%
8 2
 
1.6%
Dash Punctuation
ValueCountFrequency (%)
- 30
100.0%
Other Punctuation
ValueCountFrequency (%)
, 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 1844
92.2%
Common 157
 
7.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
486
26.4%
275
14.9%
72
 
3.9%
65
 
3.5%
62
 
3.4%
62
 
3.4%
60
 
3.3%
50
 
2.7%
39
 
2.1%
39
 
2.1%
Other values (51) 634
34.4%
Common
ValueCountFrequency (%)
0 53
33.8%
- 30
19.1%
1 29
18.5%
2 14
 
8.9%
9 8
 
5.1%
7 6
 
3.8%
6 4
 
2.5%
4 3
 
1.9%
5 3
 
1.9%
3 3
 
1.9%
Other values (2) 4
 
2.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 1844
92.2%
ASCII 157
 
7.8%

Most frequent character per block

Hangul
ValueCountFrequency (%)
486
26.4%
275
14.9%
72
 
3.9%
65
 
3.5%
62
 
3.4%
62
 
3.4%
60
 
3.3%
50
 
2.7%
39
 
2.1%
39
 
2.1%
Other values (51) 634
34.4%
ASCII
ValueCountFrequency (%)
0 53
33.8%
- 30
19.1%
1 29
18.5%
2 14
 
8.9%
9 8
 
5.1%
7 6
 
3.8%
6 4
 
2.5%
4 3
 
1.9%
5 3
 
1.9%
3 3
 
1.9%
Other values (2) 4
 
2.5%

ins_place
Text

Distinct: 2318
Distinct (%): 97.2%
Missing: 0
Missing (%): 0.0%
Memory size: 18.8 KiB

Length

Max length: 34
Median length: 30
Mean length: 9.3088852
Min length: 3

Characters and Unicode

Total characters: 22211
Distinct characters: 534
Distinct categories: 12
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2266
Unique (%): 95.0%

Sample

1st row: 정관신도시(#11) 현진에버빌 103동 앞
2nd row: 정관신도시(#12) 현진에버빌 101동 앞
3rd row: 정관신도시(#13)정관동일스위트 201동 앞
4th row: 정관신도시(#14) 정관동일스위트 204동 앞
5th row: 정관신도시(#15) 신정중학교앞 사거리
ValueCountFrequency (%)
81
 
2.5%
명지국제신도시 41
 
1.3%
명지주거단지 23
 
0.7%
화전지구산업단지 22
 
0.7%
교차로 21
 
0.7%
삼거리 14
 
0.4%
입구 14
 
0.4%
주변 13
 
0.4%
조만교-세산교차로 12
 
0.4%
12
 
0.4%
Other values (2558) 2969
92.1%

Most occurring characters

ValueCountFrequency (%)
1473
 
6.6%
681
 
3.1%
( 611
 
2.8%
) 609
 
2.7%
515
 
2.3%
483
 
2.2%
463
 
2.1%
1 418
 
1.9%
351
 
1.6%
317
 
1.4%
Other values (524) 16290
73.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 17352
78.1%
Space Separator 1473
 
6.6%
Decimal Number 1416
 
6.4%
Open Punctuation 611
 
2.8%
Close Punctuation 609
 
2.7%
Uppercase Letter 458
 
2.1%
Dash Punctuation 168
 
0.8%
Other Punctuation 102
 
0.5%
Lowercase Letter 8
 
< 0.1%
Other Symbol 7
 
< 0.1%
Other values (2) 7
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
681
 
3.9%
515
 
3.0%
483
 
2.8%
463
 
2.7%
351
 
2.0%
317
 
1.8%
296
 
1.7%
287
 
1.7%
277
 
1.6%
276
 
1.6%
Other values (470) 13406
77.3%
Uppercase Letter
ValueCountFrequency (%)
L 65
14.2%
T 63
13.8%
P 43
9.4%
C 38
8.3%
S 35
7.6%
A 33
7.2%
B 31
6.8%
I 29
 
6.3%
G 25
 
5.5%
K 19
 
4.1%
Other values (15) 77
16.8%
Decimal Number
ValueCountFrequency (%)
1 418
29.5%
2 301
21.3%
3 159
 
11.2%
0 141
 
10.0%
4 118
 
8.3%
5 66
 
4.7%
7 55
 
3.9%
8 53
 
3.7%
9 53
 
3.7%
6 52
 
3.7%
Other Punctuation
ValueCountFrequency (%)
# 51
50.0%
, 22
21.6%
' 15
 
14.7%
. 7
 
6.9%
: 3
 
2.9%
" 2
 
2.0%
/ 1
 
1.0%
@ 1
 
1.0%
Lowercase Letter
ValueCountFrequency (%)
e 5
62.5%
k 1
 
12.5%
a 1
 
12.5%
r 1
 
12.5%
Space Separator
ValueCountFrequency (%)
1473
100.0%
Open Punctuation
ValueCountFrequency (%)
( 611
100.0%
Close Punctuation
ValueCountFrequency (%)
) 609
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 168
100.0%
Other Symbol
ValueCountFrequency (%)
7
100.0%
Math Symbol
ValueCountFrequency (%)
~ 6
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 17359
78.2%
Common 4386
 
19.7%
Latin 466
 
2.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
681
 
3.9%
515
 
3.0%
483
 
2.8%
463
 
2.7%
351
 
2.0%
317
 
1.8%
296
 
1.7%
287
 
1.7%
277
 
1.6%
276
 
1.6%
Other values (471) 13413
77.3%
Latin
ValueCountFrequency (%)
L 65
13.9%
T 63
13.5%
P 43
9.2%
C 38
8.2%
S 35
7.5%
A 33
 
7.1%
B 31
 
6.7%
I 29
 
6.2%
G 25
 
5.4%
K 19
 
4.1%
Other values (19) 85
18.2%
Common
ValueCountFrequency (%)
1473
33.6%
( 611
13.9%
) 609
13.9%
1 418
 
9.5%
2 301
 
6.9%
- 168
 
3.8%
3 159
 
3.6%
0 141
 
3.2%
4 118
 
2.7%
5 66
 
1.5%
Other values (14) 322
 
7.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 17352
78.1%
ASCII 4852
 
21.8%
None 7
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1473
30.4%
( 611
12.6%
) 609
12.6%
1 418
 
8.6%
2 301
 
6.2%
- 168
 
3.5%
3 159
 
3.3%
0 141
 
2.9%
4 118
 
2.4%
5 66
 
1.4%
Other values (43) 788
16.2%
Hangul
ValueCountFrequency (%)
681
 
3.9%
515
 
3.0%
483
 
2.8%
463
 
2.7%
351
 
2.0%
317
 
1.8%
296
 
1.7%
287
 
1.7%
277
 
1.6%
276
 
1.6%
Other values (470) 13406
77.3%
None
ValueCountFrequency (%)
7
100.0%

gubun
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 16
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 18.8 KiB
전자신호 제어
1439 
일반신호 제어
801 
전자신호제어
 
68
일반신호제어
 
42
<NA>
 
15
Other values (11)
 
21

Length

Max length: 7
Median length: 7
Mean length: 6.906119
Min length: 2

Unique

Unique: 6
Unique (%): 0.3%

Sample

1st row: 전자신호 제어
2nd row: 전자신호 제어
3rd row: 전자신호 제어
4th row: 전자신호 제어
5th row: 전자신호 제어

Common Values

Value | Count | Frequency (%)
전자신호 제어 | 1439 | 60.3%
일반신호 제어 | 801 | 33.6%
전자신호제어 | 68 | 2.8%
일반신호제어 | 42 | 1.8%
<NA> | 15 | 0.6%
가변신호제어 | 6 | 0.3%
강서구 | 3 | 0.1%
금정구 | 2 | 0.1%
북구 | 2 | 0.1%
남구 | 2 | 0.1%
Other values (6) | 6 | 0.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
제어 2240
48.4%
전자신호 1439
31.1%
일반신호 801
 
17.3%
전자신호제어 68
 
1.5%
일반신호제어 42
 
0.9%
na 15
 
0.3%
가변신호제어 6
 
0.1%
강서구 3
 
0.1%
남구 2
 
< 0.1%
금정구 2
 
< 0.1%
Other values (7) 8
 
0.2%
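
The gubun counts show the same labels written with and without an internal space (for example "전자신호 제어" and "전자신호제어"), which inflates the number of distinct categories. One possible cleanup, sketched below, removes all whitespace before counting. Treating the spaced and unspaced forms as the same category is an assumption about the data, and `normalize_label` is a hypothetical helper.

```python
from collections import Counter

# Counts copied from the gubun "Common Values" table (signal-type labels only).
raw_counts = Counter({
    "전자신호 제어": 1439,
    "일반신호 제어": 801,
    "전자신호제어": 68,
    "일반신호제어": 42,
    "가변신호제어": 6,
})

def normalize_label(label):
    """Remove all whitespace so spacing variants collapse to one label."""
    return "".join(label.split())

merged = Counter()
for label, n in raw_counts.items():
    merged[normalize_label(label)] += n

for label, n in merged.most_common():
    print(label, n)
```

After merging, the five labels above reduce to three categories.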

ins_date
Text

MISSING 

Distinct: 395
Distinct (%): 16.8%
Missing: 28
Missing (%): 1.2%
Memory size: 18.8 KiB

Length

Max length: 23
Median length: 10
Mean length: 9.9317218
Min length: 4

Characters and Unicode

Total characters: 23419
Distinct characters: 53
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 132
Unique (%): 5.6%

Sample

1st row: 2008-06-01
2nd row: 2008-06-01
3rd row: 2008-06-01
4th row: 2008-06-01
5th row: 2008-06-01
ValueCountFrequency (%)
1999-01-01 89
 
3.7%
1998-02-01 83
 
3.5%
1997-02-01 81
 
3.4%
2001-02-01 68
 
2.8%
2008-06-01 58
 
2.4%
1999-12-01 55
 
2.3%
2010-10-01 47
 
2.0%
2007-10-01 38
 
1.6%
2005-04-01 38
 
1.6%
2008-11-01 33
 
1.4%
Other values (410) 1814
75.5%
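
ins_date is typed as Text rather than Date, and its length and character statistics (lengths from 4 to 23, some Hangul characters) suggest that not every value is a `YYYY-MM-DD` string. A minimal sketch of tolerant parsing follows; the non-date input in the usage lines is a hypothetical stand-in, since the report does not list the offending values.

```python
from datetime import date, datetime

def parse_iso_date(value):
    """Parse a YYYY-MM-DD string; return None for missing or malformed input."""
    if value is None:
        return None
    try:
        return datetime.strptime(value.strip(), "%Y-%m-%d").date()
    except ValueError:
        return None

# The first two values appear in the report; the third is hypothetical.
for raw in ["2008-06-01", "1999-01-01", "교차로"]:
    print(repr(raw), "->", parse_iso_date(raw))
```

Counting the `None` results separately from the 28 originally missing values would show how many rows carry a non-date string.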

Most occurring characters

ValueCountFrequency (%)
0 7301
31.2%
1 4677
20.0%
- 4596
19.6%
2 2624
 
11.2%
9 1511
 
6.5%
6 502
 
2.1%
8 492
 
2.1%
5 487
 
2.1%
7 395
 
1.7%
4 356
 
1.5%
Other values (43) 478
 
2.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 18611
79.5%
Dash Punctuation 4596
 
19.6%
Other Letter 166
 
0.7%
Space Separator 46
 
0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
18
10.8%
17
 
10.2%
16
 
9.6%
16
 
9.6%
15
 
9.0%
15
 
9.0%
14
 
8.4%
5
 
3.0%
3
 
1.8%
3
 
1.8%
Other values (31) 44
26.5%
Decimal Number
ValueCountFrequency (%)
0 7301
39.2%
1 4677
25.1%
2 2624
 
14.1%
9 1511
 
8.1%
6 502
 
2.7%
8 492
 
2.6%
5 487
 
2.6%
7 395
 
2.1%
4 356
 
1.9%
3 266
 
1.4%
Dash Punctuation
ValueCountFrequency (%)
- 4596
100.0%
Space Separator
ValueCountFrequency (%)
46
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 23253
99.3%
Hangul 166
 
0.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
18
10.8%
17
 
10.2%
16
 
9.6%
16
 
9.6%
15
 
9.0%
15
 
9.0%
14
 
8.4%
5
 
3.0%
3
 
1.8%
3
 
1.8%
Other values (31) 44
26.5%
Common
ValueCountFrequency (%)
0 7301
31.4%
1 4677
20.1%
- 4596
19.8%
2 2624
 
11.3%
9 1511
 
6.5%
6 502
 
2.2%
8 492
 
2.1%
5 487
 
2.1%
7 395
 
1.7%
4 356
 
1.5%
Other values (2) 312
 
1.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23253
99.3%
Hangul 166
 
0.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 7301
31.4%
1 4677
20.1%
- 4596
19.8%
2 2624
 
11.3%
9 1511
 
6.5%
6 502
 
2.2%
8 492
 
2.1%
5 487
 
2.1%
7 395
 
1.7%
4 356
 
1.5%
Other values (2) 312
 
1.3%
Hangul
ValueCountFrequency (%)
18
10.8%
17
 
10.2%
16
 
9.6%
16
 
9.6%
15
 
9.0%
15
 
9.0%
14
 
8.4%
5
 
3.0%
3
 
1.8%
3
 
1.8%
Other values (31) 44
26.5%

road_type
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 20
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 18.8 KiB
교차로
1796 
단일로
511 
가변차로
 
28
<NA>
 
23
모뎀
 
13
Other values (15)
 
15

Length

Max length: 12
Median length: 3
Mean length: 3.0725063
Min length: 2

Unique

Unique: 15
Unique (%): 0.6%

Sample

1st row: 교차로
2nd row: 교차로
3rd row: 교차로
4th row: 교차로
5th row: 교차로

Common Values

Value | Count | Frequency (%)
교차로 | 1796 | 75.3%
단일로 | 511 | 21.4%
가변차로 | 28 | 1.2%
<NA> | 23 | 1.0%
모뎀 | 13 | 0.5%
129.08919964 | 1 | < 0.1%
129.22019165 | 1 | < 0.1%
128.85945379 | 1 | < 0.1%
128.85190889 | 1 | < 0.1%
128.91230111 | 1 | < 0.1%
Other values (10) | 10 | 0.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
교차로 1796
75.3%
단일로 511
 
21.4%
가변차로 28
 
1.2%
na 23
 
1.0%
모뎀 13
 
0.5%
129.06508159 1
 
< 0.1%
129.01616086 1
 
< 0.1%
129.07792837 1
 
< 0.1%
129.05329687 1
 
< 0.1%
129.06983593 1
 
< 0.1%
Other values (10) 10
 
0.4%

sigungu
Categorical

HIGH CORRELATION 

Distinct: 32
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 18.8 KiB
강서구
409 
해운대구
246 
기장군
239 
사하구
183 
부산진구
160 
Other values (27)
1149 

Length

Max length: 12
Median length: 3
Mean length: 3.0540654
Min length: 2

Unique

Unique: 15
Unique (%): 0.6%

Sample

1st row: 기장군
2nd row: 기장군
3rd row: 기장군
4th row: 기장군
5th row: 기장군

Common Values

Value | Count | Frequency (%)
강서구 | 409 | 17.1%
해운대구 | 246 | 10.3%
기장군 | 239 | 10.0%
사하구 | 183 | 7.7%
부산진구 | 160 | 6.7%
동래구 | 151 | 6.3%
사상구 | 147 | 6.2%
남구 | 147 | 6.2%
금정구 | 133 | 5.6%
북구 | 129 | 5.4%
Other values (22) | 442 | 18.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
강서구 409
17.1%
해운대구 246
10.3%
기장군 239
10.0%
사하구 183
7.7%
부산진구 160
 
6.7%
동래구 151
 
6.3%
사상구 147
 
6.2%
남구 147
 
6.2%
금정구 133
 
5.6%
북구 129
 
5.4%
Other values (22) 442
18.5%

address
Text

Distinct: 2261
Distinct (%): 95.4%
Missing: 15
Missing (%): 0.6%
Memory size: 18.8 KiB

Length

Max length: 40
Median length: 33
Mean length: 19.874315
Min length: 6

Characters and Unicode

Total characters: 47122
Distinct characters: 190
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2178
Unique (%): 91.9%

Sample

1st row: 부산광역시 기장군 정관면 모전리 741-1
2nd row: 부산광역시 기장군 정관면 모전리 748
3rd row: 부산광역시 기장군 정관면 모전리 541
4th row: 부산광역시 기장군 정관면 용수리 1031-12
5th row: 부산광역시 기장군 정관면 용수리 1363
ValueCountFrequency (%)
부산광역시 2355
24.3%
강서구 407
 
4.2%
해운대구 242
 
2.5%
기장군 234
 
2.4%
사하구 177
 
1.8%
동래구 152
 
1.6%
부산진구 145
 
1.5%
사상구 145
 
1.5%
남구 144
 
1.5%
금정구 133
 
1.4%
Other values (2559) 5563
57.4%

Most occurring characters

ValueCountFrequency (%)
7336
15.6%
2709
 
5.7%
1 2682
 
5.7%
2590
 
5.5%
2396
 
5.1%
2373
 
5.0%
2357
 
5.0%
2335
 
5.0%
2195
 
4.7%
- 1898
 
4.0%
Other values (180) 18251
38.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 26771
56.8%
Decimal Number 11072
23.5%
Space Separator 7336
 
15.6%
Dash Punctuation 1898
 
4.0%
Close Punctuation 22
 
< 0.1%
Open Punctuation 22
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2709
 
10.1%
2590
 
9.7%
2396
 
8.9%
2373
 
8.9%
2357
 
8.8%
2335
 
8.7%
2195
 
8.2%
530
 
2.0%
444
 
1.7%
439
 
1.6%
Other values (165) 8403
31.4%
Decimal Number
ValueCountFrequency (%)
1 2682
24.2%
2 1590
14.4%
3 1264
11.4%
4 1034
 
9.3%
5 931
 
8.4%
6 763
 
6.9%
7 751
 
6.8%
8 727
 
6.6%
0 687
 
6.2%
9 643
 
5.8%
Space Separator
ValueCountFrequency (%)
7336
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1898
100.0%
Close Punctuation
ValueCountFrequency (%)
) 22
100.0%
Open Punctuation
ValueCountFrequency (%)
( 22
100.0%
Math Symbol
ValueCountFrequency (%)
> 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 26771
56.8%
Common 20351
43.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2709
 
10.1%
2590
 
9.7%
2396
 
8.9%
2373
 
8.9%
2357
 
8.8%
2335
 
8.7%
2195
 
8.2%
530
 
2.0%
444
 
1.7%
439
 
1.6%
Other values (165) 8403
31.4%
Common
ValueCountFrequency (%)
7336
36.0%
1 2682
 
13.2%
- 1898
 
9.3%
2 1590
 
7.8%
3 1264
 
6.2%
4 1034
 
5.1%
5 931
 
4.6%
6 763
 
3.7%
7 751
 
3.7%
8 727
 
3.6%
Other values (5) 1375
 
6.8%

Most occurring blocks

ValueCountFrequency (%)
Hangul 26771
56.8%
ASCII 20351
43.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
7336
36.0%
1 2682
 
13.2%
- 1898
 
9.3%
2 1590
 
7.8%
3 1264
 
6.2%
4 1034
 
5.1%
5 931
 
4.6%
6 763
 
3.7%
7 751
 
3.7%
8 727
 
3.6%
Other values (5) 1375
 
6.8%
Hangul
ValueCountFrequency (%)
2709
 
10.1%
2590
 
9.7%
2396
 
8.9%
2373
 
8.9%
2357
 
8.8%
2335
 
8.7%
2195
 
8.2%
530
 
2.0%
444
 
1.7%
439
 
1.6%
Other values (165) 8403
31.4%

lat
Text

Distinct: 2161
Distinct (%): 91.1%
Missing: 15
Missing (%): 0.6%
Memory size: 18.8 KiB

Length

Max length: 19
Median length: 12
Mean length: 11.894137
Min length: 8

Characters and Unicode

Total characters: 28201
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2056
Unique (%): 86.7%

Sample

1st row: 129.16748965
2nd row: 129.16906172
3rd row: 129.16290522
4th row: 129.16429102
5th row: 129.168895
ValueCountFrequency (%)
129.075091 49
 
2.1%
2020-12-22 15
 
0.6%
129.08021873 14
 
0.6%
128.8292943 11
 
0.5%
14:28:58 9
 
0.4%
128.81926115 6
 
0.3%
14:28:57 6
 
0.3%
129.03946248 5
 
0.2%
128.97773824 5
 
0.2%
129.14237426 3
 
0.1%
Other values (2152) 2263
94.8%
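
The token counts above mix coordinate-like numbers with date and time fragments ("2020-12-22", "14:28:58"), which is probably why lat is typed as Text. The sketch below separates numeric from non-numeric tokens with a tolerant conversion. The tokens are copied from the table above, the helper name `to_float` is hypothetical, and reading the non-numeric tokens as column-shifted rows is an interpretation rather than something the report states.

```python
def to_float(value):
    """Return value as a float, or None when it is not a plain number."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None

# Tokens copied from the lat frequency table.
tokens = ["129.075091", "2020-12-22", "129.08021873", "14:28:58"]
parsed = [to_float(t) for t in tokens]
non_numeric = [t for t, p in zip(tokens, parsed) if p is None]

print(parsed)
print(non_numeric)
```

Rows whose lat value fails this conversion are candidates for manual inspection before the column is cast to a numeric type.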

Most occurring characters

ValueCountFrequency (%)
1 4610
16.3%
2 4226
15.0%
9 3783
13.4%
8 2721
9.6%
0 2369
8.4%
. 2356
8.4%
7 1763
 
6.3%
6 1648
 
5.8%
5 1637
 
5.8%
4 1511
 
5.4%
Other values (4) 1577
 
5.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 25770
91.4%
Other Punctuation 2386
 
8.5%
Dash Punctuation 30
 
0.1%
Space Separator 15
 
0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 4610
17.9%
2 4226
16.4%
9 3783
14.7%
8 2721
10.6%
0 2369
9.2%
7 1763
 
6.8%
6 1648
 
6.4%
5 1637
 
6.4%
4 1511
 
5.9%
3 1502
 
5.8%
Other Punctuation
ValueCountFrequency (%)
. 2356
98.7%
: 30
 
1.3%
Dash Punctuation
ValueCountFrequency (%)
- 30
100.0%
Space Separator
ValueCountFrequency (%)
15
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 28201
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 4610
16.3%
2 4226
15.0%
9 3783
13.4%
8 2721
9.6%
0 2369
8.4%
. 2356
8.4%
7 1763
 
6.3%
6 1648
 
5.8%
5 1637
 
5.8%
4 1511
 
5.4%
Other values (4) 1577
 
5.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 28201
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 4610
16.3%
2 4226
15.0%
9 3783
13.4%
8 2721
9.6%
0 2369
8.4%
. 2356
8.4%
7 1763
 
6.3%
6 1648
 
5.8%
5 1637
 
5.8%
4 1511
 
5.4%
Other values (4) 1577
 
5.6%

lng
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 2159
Distinct (%): 91.6%
Missing: 30
Missing (%): 1.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 35.168668
Minimum: 35.033738
Maximum: 35.384732
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 21.1 KiB

Quantile statistics

Minimum: 35.033738
5-th percentile: 35.08047
Q1: 35.114828
median: 35.164695
Q3: 35.203941
95-th percentile: 35.319323
Maximum: 35.384732
Range: 0.35099383
Interquartile range (IQR): 0.089113645

Descriptive statistics

Standard deviation: 0.066008894
Coefficient of variation (CV): 0.0018769234
Kurtosis: 0.30785863
Mean: 35.168668
Median Absolute Deviation (MAD): 0.043259531
Skewness: 0.69455726
Sum: 82857.381
Variance: 0.004357174
Monotonicity: Not monotonic
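
Several of these statistics are simple functions of the others, which gives a quick consistency check. The sketch below recomputes them from the rounded values printed in this report, so small differences in the last digits are expected. It assumes the Sum is taken over the 2356 non-missing observations (2386 minus 30).

```python
# Values copied from the lng summary, quantile and descriptive statistics.
mean = 35.168668
std = 0.066008894
minimum, maximum = 35.033738, 35.384732
q1, q3 = 35.114828, 35.203941
n_present = 2386 - 30  # observations minus missing values

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range
total = mean * n_present         # sum over non-missing values

print(f"CV: {cv:.10f}")
print(f"Variance: {variance:.9f}")
print(f"Range: {value_range:.6f}")
print(f"IQR: {iqr:.6f}")
print(f"Sum: {total:.3f}")
```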
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
35.179928 49
 
2.1%
35.18408371 14
 
0.6%
35.147457295 11
 
0.5%
35.08377723 6
 
0.3%
35.067772917 5
 
0.2%
35.163348261 5
 
0.2%
35.071052071 3
 
0.1%
35.192431626 3
 
0.1%
35.054896607 3
 
0.1%
35.103738024 3
 
0.1%
Other values (2149) 2254
94.5%
(Missing) 30
 
1.3%
ValueCountFrequency (%)
35.033738228 1
< 0.1%
35.047424153 1
< 0.1%
35.047519182 1
< 0.1%
35.048190632 1
< 0.1%
35.048924882 1
< 0.1%
35.049965216 1
< 0.1%
35.051124081 1
< 0.1%
35.0516318 1
< 0.1%
35.051847155 1
< 0.1%
35.052324246 1
< 0.1%
ValueCountFrequency (%)
35.384732061 1
< 0.1%
35.373143184 1
< 0.1%
35.370780128 1
< 0.1%
35.369356988 1
< 0.1%
35.368465882 1
< 0.1%
35.368042272 1
< 0.1%
35.367082115 1
< 0.1%
35.357584557 1
< 0.1%
35.357407534 1
< 0.1%
35.355069939 1
< 0.1%
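
The lng values all lie between roughly 35.03 and 35.38, while the lat samples earlier in the report are around 129. Since a latitude cannot exceed 90 degrees, the two column names appear to be swapped relative to their contents. The sketch below shows one hedged way to detect and correct this per row; the function names are hypothetical and the check relies only on the valid ranges of latitude and longitude.

```python
def is_valid_lat_lon(lat, lon):
    """True when lat is within [-90, 90] and lon within [-180, 180]."""
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0

def fix_swapped(lat, lon):
    """Return (lat, lon), swapping the pair if only the swapped order is valid."""
    if is_valid_lat_lon(lat, lon):
        return lat, lon
    if is_valid_lat_lon(lon, lat):
        return lon, lat
    raise ValueError(f"not a valid coordinate pair: {lat}, {lon}")

# First sample row: the "lat" column holds 129.16748965, "lng" holds 35.334603.
fixed = fix_swapped(129.16748965, 35.334603)
print(fixed)
```

A pair that is valid in both orders (for example two values below 90) is returned unchanged, so this check cannot catch every swap.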

confirm_date
Date

CONSTANT  MISSING 

Distinct: 1
Distinct (%): < 0.1%
Missing: 30
Missing (%): 1.3%
Memory size: 18.8 KiB
Minimum: 2019-12-31 00:00:00
Maximum: 2019-12-31 00:00:00
Histogram with fixed size bins (bins=1)

last_load_dttm
Date

MISSING 

Distinct: 2
Distinct (%): 0.1%
Missing: 30
Missing (%): 1.3%
Memory size: 18.8 KiB
Minimum: 2020-12-22 14:28:57
Maximum: 2020-12-22 14:28:58
Histogram with fixed size bins (bins=2)

Interactions

[Figure: interactions plot]

Correlations

                 road  gubun  road_type  sigungu    lng  last_load_dttm
road            1.000  0.997      0.995    0.994  0.887           0.735
gubun           0.997  1.000      0.980    0.984  0.218           0.053
road_type       0.995  0.980      1.000    0.994  0.072           0.114
sigungu         0.994  0.984      0.994    1.000  0.846           0.981
lng             0.887  0.218      0.072    0.846  1.000           0.260
last_load_dttm  0.735  0.053      0.114    0.981  0.260           1.000

           sigungu  road_type  gubun
sigungu      1.000      0.912  0.844
road_type    0.912      1.000  0.854
gubun        0.844      0.854  1.000

             lng  gubun  road_type  sigungu
lng        1.000  0.092      0.043    0.540
gubun      0.092  1.000      0.854    0.844
road_type  0.043  0.854      1.000    0.912
sigungu    0.540  0.844      0.912    1.000
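
This extract does not say which association measure each matrix uses. For pairs of categorical variables such as sigungu, road_type and gubun, one common choice is Cramér's V, and the sketch below shows how such a value can be computed from two label sequences. It is the plain, uncorrected version of the statistic and is offered as an illustration, not as the profiler's own implementation. The toy inputs are hypothetical.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length label sequences."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("inputs must be non-empty and of equal length")
    x_totals = Counter(xs)
    y_totals = Counter(ys)
    joint = Counter(zip(xs, ys))
    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, nx in x_totals.items():
        for y, ny in y_totals.items():
            expected = nx * ny / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(x_totals), len(y_totals)) - 1
    if k == 0:
        return 0.0  # a constant variable carries no association
    return math.sqrt(chi2 / (n * k))

# Hypothetical toy data: perfectly associated, then independent.
perfect = cramers_v(list("aabb"), list("uuvv"))
independent = cramers_v(list("aabb"), list("uvuv"))
print(perfect, independent)
```

Values close to 1, like those between sigungu, road_type and gubun above, can also arise when a few rare categories coincide, so high figures on heavily imbalanced columns deserve a second look.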

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
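
A nullity correlation of this kind can be sketched as the Pearson correlation between per-column presence indicators (1 where a value is present, 0 where it is missing). The example below uses small hypothetical columns, not rows from this dataset, and both helper names are illustrative.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # undefined when a column is never or always missing
    return cov / math.sqrt(var_a * var_b)

def nullity_correlation(col_a, col_b):
    """Correlate presence indicators (1 = present, 0 = missing) of two columns."""
    present_a = [0 if v is None else 1 for v in col_a]
    present_b = [0 if v is None else 1 for v in col_b]
    return pearson(present_a, present_b)

# Hypothetical toy columns; None marks a missing value.
lng = [35.1, None, 35.2, None]
confirm_date = ["2019-12-31", None, "2019-12-31", None]
road = [None, "정관로", None, None]

same_pattern = nullity_correlation(lng, confirm_date)
different_pattern = nullity_correlation(lng, road)
print(same_pattern, different_pattern)
```

A value of 1 means the two columns are always missing together, which is what identical missing counts (such as the 30 shared by lng, confirm_date and last_load_dttm) would suggest if they fall on the same rows.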

Sample

index | skey | mgrnu | road | ins_place | gubun | ins_date | road_type | sigungu | address | lat | lng | confirm_date | last_load_dttm
0 | 19522 | 16124 | <NA> | 정관신도시(#11) 현진에버빌 103동 앞 | 전자신호 제어 | 2008-06-01 | 교차로 | 기장군 | 부산광역시 기장군 정관면 모전리 741-1 | 129.16748965 | 35.334603 | 2019-12-31 | 2020-12-22 14:28:57
1 | 19523 | 16125 | <NA> | 정관신도시(#12) 현진에버빌 101동 앞 | 전자신호 제어 | 2008-06-01 | 교차로 | 기장군 | 부산광역시 기장군 정관면 모전리 748 | 129.16906172 | 35.334543 | 2019-12-31 | 2020-12-22 14:28:57
2 | 19524 | 16126 | <NA> | 정관신도시(#13)정관동일스위트 201동 앞 | 전자신호 제어 | 2008-06-01 | 교차로 | 기장군 | 부산광역시 기장군 정관면 모전리 541 | 129.16290522 | 35.330168 | 2019-12-31 | 2020-12-22 14:28:57
3 | 19525 | 16127 | 산단로 | 정관신도시(#14) 정관동일스위트 204동 앞 | 전자신호 제어 | 2008-06-01 | 교차로 | 기장군 | 부산광역시 기장군 정관면 용수리 1031-12 | 129.16429102 | 35.327865 | 2019-12-31 | 2020-12-22 14:28:57
4 | 19526 | 16128 | 정관로 | 정관신도시(#15) 신정중학교앞 사거리 | 전자신호 제어 | 2008-06-01 | 교차로 | 기장군 | 부산광역시 기장군 정관면 용수리 1363 | 129.168895 | 35.329082 | 2019-12-31 | 2020-12-22 14:28:57
5 | 19527 | 16129 | 산단로 | 정관신도시(#16) 정관동일스위트 101동 앞 | 전자신호 제어 | 2008-06-01 | 교차로 | 기장군 | 부산광역시 기장군 정관면 용수리 1378 | 129.16665686 | 35.324886 | 2019-12-31 | 2020-12-22 14:28:57
6 | 19528 | 16130 | 정관로 | 정관신도시(#17) 동부산농협 | 전자신호 제어 | 2008-06-01 | 교차로 | 기장군 | 부산광역시 기장군 정관면 용수리 1328-3 | 129.17065263 | 35.325884 | 2019-12-31 | 2020-12-22 14:28:57
7 | 19529 | 16131 | <NA> | 정관신도시(#18) 정관고등학교 | 전자신호 제어 | 2008-06-01 | 교차로 | 기장군 | 부산광역시 기장군 정관면 용수리 1390 | 129.16885179 | 35.322269 | 2019-12-31 | 2020-12-22 14:28:57
8 | 19530 | 16132 | 정관로 | 정관신도시(#19) 정관초등학교 | 전자신호 제어 | 2008-06-01 | 교차로 | 기장군 | 부산광역시 기장군 정관면 용수리 1393 | 129.1721196 | 35.323129 | 2019-12-31 | 2020-12-22 14:28:57
9 | 19531 | 16133 | 산단로 | 정관신도시(#20) 정관동원로얄듀크2차 101동 앞 | 전자신호 제어 | 2008-06-01 | 교차로 | 기장군 | 부산광역시 기장군 정관면 용수리 1405 | 129.17297961 | 35.320237 | 2019-12-31 | 2020-12-22 14:28:57

index | skey | mgrnu | road | ins_place | gubun | ins_date | road_type | sigungu | address | lat | lng | confirm_date | last_load_dttm
2376 | 19571 | 16173 | <NA> | 철마초등학교 | 일반신호 제어 | 2003-04-01 | 단일로 | 기장군 | 부산광역시 기장군 철마면 와여리 548-6 | 129.14992111 | 35.274687 | 2019-12-31 | 2020-12-22 14:28:58
2377 | 19572 | 16174 | <NA> | 철마초등학교주변(국도17호선) | 일반신호 제어 | 2009-09-01 | 교차로 | 기장군 | 부산광역시 기장군 철마면 와여리 481-2 | 129.15123658 | 35.275215 | 2019-12-31 | 2020-12-22 14:28:58
2378 | 19573 | 16175 | <NA> | (철마IC)철마농업기술센터 | 일반신호 제어 | 2009-01-10 | 교차로 | 기장군 | 부산광역시 기장군 철마면 장전리 282-16 | 129.15102766 | 35.273658 | 2019-12-31 | 2020-12-22 14:28:58
2379 | 19574 | 16176 | <NA> | 신동아아파트 102동 (정관면 28블럭) | 전자신호 제어 | 2009-01-05 | 교차로 | 기장군 | 부산광역시 기장군 정관면 용수리 188-1 | 129.17723173 | 35.326784 | 2019-12-31 | 2020-12-22 14:28:58
2380 | 19575 | 16177 | <NA> | 롯데캐슬101동 | 전자신호 제어 | 2009-01-05 | 교차로 | 기장군 | 부산광역시 기장군 정관면 용수리 1364 | 129.17023069 | 35.328655 | 2019-12-31 | 2020-12-22 14:28:58
2381 | 19576 | 16178 | <NA> | (모뎀)철마백길마을 | 전자신호 제어 | 2009-12-22 | 모뎀 | 기장군 | 부산광역시 기장군 철마면 백길리 253-2(정관산업로) | 129.15482026 | 35.279886 | 2019-12-31 | 2020-12-22 14:28:58
2382 | 19577 | 16179 | <NA> | (철마IC)철마보건지소 | 일반신호 제어 | 2009-01-10 | 교차로 | 기장군 | 부산광역시 기장군 철마면 와여리 561-3 | 129.15188001 | 35.273307 | 2019-12-31 | 2020-12-22 14:28:58
2383 | 19578 | 16180 | <NA> | 임랑삼거리 | 일반신호 제어 | 2015-12-01 | 교차로 | 기장군 | 부산광역시 기장군 장안읍 임랑리 163-4 | 129.2606462 | 35.317028 | 2019-12-31 | 2020-12-22 14:28:58
2384 | 17331 | 17002 | <NA> | LH새천년나무아파트 | 전자신호 제어 | 2017-02-01 | 교차로 | 사하구 | 부산광역시 사하구 신평동 286-18 | 128.97234624 | 35.092876 | 2019-12-31 | 2020-12-22 14:28:58
2385 | 17330 | 17001 | <NA> | 지하철 다대선 6공구(다대해수욕장) | 전자신호 제어 | 2017-02-01 | 교차로 | 사하구 | 부산광역시 사하구 다대동 461-4 | 128.96862889 | 35.047519 | 2019-12-31 | 2020-12-22 14:28:58