Overview

Dataset statistics

Number of variables: 12
Number of observations: 10000
Missing cells: 13187
Missing cells (%): 11.0%
Duplicate rows: 452
Duplicate rows (%): 4.5%
Total size in memory: 1.0 MiB
Average record size in memory: 106.0 B
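
The two percentages in the dataset statistics can be reproduced from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (variables times observations) and the duplicate percentage over all rows; the variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 12
n_observations = 10_000
missing_cells = 13_187
duplicate_rows = 452


def percentage(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Assumption: missing cells are measured against every cell in the table.
missing_pct = percentage(missing_cells, n_variables * n_observations)
# Assumption: duplicate rows are measured against the number of observations.
duplicate_pct = percentage(duplicate_rows, n_observations)

print(missing_pct, duplicate_pct)
```

Both results agree with the reported 11.0% and 4.5%, which supports the assumed denominators.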

Variable types

Text: 6
Categorical: 3
DateTime: 1
Numeric: 2

Dataset

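Description: 부산광역시중구_U-옥외광고물시스템_옥외광고물허가관리_20230927 (Busan Metropolitan City Jung-gu, U-Outdoor Advertising System, outdoor advertisement permit management, 2023-09-27)
Author: 부산광역시 중구 (Jung-gu, Busan Metropolitan City)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15039634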

Alerts

Dataset has 452 (4.5%) duplicate rows [Duplicates]
시도 is highly overall correlated with 건물번호 and 3 other fields [High correlation]
광고물종류명 is highly overall correlated with 시도 and 1 other fields [High correlation]
구군 is highly overall correlated with 건물번호 and 3 other fields [High correlation]
건물번호 is highly overall correlated with 시도 and 1 other fields [High correlation]
건물번호2 is highly overall correlated with 시도 and 1 other fields [High correlation]
광고물종류명 is highly imbalanced (68.2%) [Imbalance]
읍면동 has 1756 (17.6%) missing values [Missing]
도로명 has 3793 (37.9%) missing values [Missing]
건물번호 has 3793 (37.9%) missing values [Missing]
건물번호2 has 3793 (37.9%) missing values [Missing]
건물번호2 has 4965 (49.6%) zeros [Zeros]
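
The imbalance figure in the alerts is not defined in the report itself. One plausible reading, shown here as a sketch, is one minus the Shannon entropy of the class frequencies divided by the logarithm of the number of classes. That formula is an assumption, not something the report states. The ten largest counts come from the Common Values table of 광고물종류명; the split of the remaining 16 rows over 5 categories is hypothetical, because only their total is reported.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / ln(k) for class counts (assumed definition).

    H is the Shannon entropy of the class frequencies and k is the
    number of classes. A single-class column is defined here as 0.0.
    """
    total = sum(counts)
    k = len(counts)
    if k <= 1:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)


# Ten most frequent categories, as reported (includes the <NA> category).
reported = [7624, 1241, 681, 248, 64, 39, 24, 23, 22, 18]
# Hypothetical split of "Other values (5) 16"; only the sum is known.
hypothetical_tail = [4, 4, 4, 3, 1]

score = imbalance_score(reported + hypothetical_tail)
print(f"{score:.1%}")
```

Under these assumptions the score lands close to the reported 68.2%; a perfectly balanced column would score 0.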

Reproduction

Analysis started: 2023-12-10 16:53:35.639438
Analysis finished: 2023-12-10 16:53:39.951480
Duration: 4.31 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
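
As a quick consistency check, the reported duration can be recomputed from the two timestamps. This is a minimal sketch using only the standard library.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2023-12-10 16:53:35.639438")
finished = datetime.fromisoformat("2023-12-10 16:53:39.951480")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```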

Variables

업소명
Text

Distinct: 4966
Distinct (%): 49.8%
Missing: 24
Missing (%): 0.2%
Memory size: 156.2 KiB

Length

Max length: 28
Median length: 24
Mean length: 6.3506415
Min length: 1

Characters and Unicode

Total characters: 63354
Distinct characters: 887
Distinct categories: 11
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3473
Unique (%): 34.8%
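
The character-category counts reported for text variables can be illustrated with Python's standard `unicodedata` module. This is a sketch of the idea, not necessarily how the profiling tool computes its tables; the helper name `category_counts` is hypothetical. The two-letter codes returned by `unicodedata.category` correspond to the long names used in this report, for example `Lo` is "Other Letter" and `Zs` is "Space Separator".

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count the Unicode general category of every character in text."""
    return Counter(unicodedata.category(ch) for ch in text)


# First sample value of this variable.
sample = "국제스피치 평생교육원"
counts = category_counts(sample)
print(counts)
```

Hangul syllables fall under `Lo`, which is why "Other Letter" dominates the category tables for the Korean text columns.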

Sample

1st row: 국제스피치 평생교육원
2nd row: 동명한문펜글씨
3rd row: 드림코리아
4th row: 속초코다리냉면
5th row: 함경남포면옥

Value | Count | Frequency (%)
신성애드 | 202 | 1.8%
우리현수막 | 109 | 1.0%
부산극장 | 88 | 0.8%
61 | 75 | 0.7%
대영시네마 | 64 | 0.6%
그린자동차직업전문학교 | 63 | 0.6%
국제기업사 | 57 | 0.5%
대륙 | 51 | 0.5%
하나 | 47 | 0.4%
메디포맨 | 47 | 0.4%
Other values (5224) | 10465 | 92.9%

Most occurring characters

ValueCountFrequency (%)
1362
 
2.1%
1355
 
2.1%
1293
 
2.0%
1174
 
1.9%
1165
 
1.8%
) 984
 
1.6%
( 979
 
1.5%
937
 
1.5%
924
 
1.5%
888
 
1.4%
Other values (877) 52293
82.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 57071
90.1%
Uppercase Letter 1571
 
2.5%
Space Separator 1293
 
2.0%
Close Punctuation 984
 
1.6%
Open Punctuation 979
 
1.5%
Lowercase Letter 668
 
1.1%
Decimal Number 638
 
1.0%
Other Punctuation 105
 
0.2%
Dash Punctuation 40
 
0.1%
Math Symbol 4
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1362
 
2.4%
1355
 
2.4%
1174
 
2.1%
1165
 
2.0%
937
 
1.6%
924
 
1.6%
888
 
1.6%
862
 
1.5%
833
 
1.5%
824
 
1.4%
Other values (802) 46747
81.9%
Uppercase Letter
ValueCountFrequency (%)
S 209
13.3%
C 157
 
10.0%
K 141
 
9.0%
B 133
 
8.5%
T 96
 
6.1%
G 84
 
5.3%
A 76
 
4.8%
O 75
 
4.8%
P 69
 
4.4%
E 64
 
4.1%
Other values (16) 467
29.7%
Lowercase Letter
ValueCountFrequency (%)
e 94
14.1%
k 59
 
8.8%
b 51
 
7.6%
a 50
 
7.5%
c 48
 
7.2%
t 46
 
6.9%
l 41
 
6.1%
s 41
 
6.1%
o 40
 
6.0%
r 29
 
4.3%
Other values (14) 169
25.3%
Decimal Number
ValueCountFrequency (%)
2 152
23.8%
1 132
20.7%
6 103
16.1%
5 84
13.2%
4 48
 
7.5%
0 37
 
5.8%
3 33
 
5.2%
9 21
 
3.3%
8 19
 
3.0%
7 9
 
1.4%
Other Punctuation
ValueCountFrequency (%)
& 47
44.8%
. 40
38.1%
' 6
 
5.7%
/ 5
 
4.8%
: 2
 
1.9%
2
 
1.9%
# 1
 
1.0%
* 1
 
1.0%
, 1
 
1.0%
Space Separator
ValueCountFrequency (%)
1293
100.0%
Close Punctuation
ValueCountFrequency (%)
) 984
100.0%
Open Punctuation
ValueCountFrequency (%)
( 979
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 40
100.0%
Math Symbol
ValueCountFrequency (%)
+ 4
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 57067
90.1%
Common 4044
 
6.4%
Latin 2239
 
3.5%
Han 4
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1362
 
2.4%
1355
 
2.4%
1174
 
2.1%
1165
 
2.0%
937
 
1.6%
924
 
1.6%
888
 
1.6%
862
 
1.5%
833
 
1.5%
824
 
1.4%
Other values (798) 46743
81.9%
Latin
ValueCountFrequency (%)
S 209
 
9.3%
C 157
 
7.0%
K 141
 
6.3%
B 133
 
5.9%
T 96
 
4.3%
e 94
 
4.2%
G 84
 
3.8%
A 76
 
3.4%
O 75
 
3.3%
P 69
 
3.1%
Other values (40) 1105
49.4%
Common
ValueCountFrequency (%)
1293
32.0%
) 984
24.3%
( 979
24.2%
2 152
 
3.8%
1 132
 
3.3%
6 103
 
2.5%
5 84
 
2.1%
4 48
 
1.2%
& 47
 
1.2%
. 40
 
1.0%
Other values (15) 182
 
4.5%
Han
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 57065
90.1%
ASCII 6281
 
9.9%
CJK 4
 
< 0.1%
None 2
 
< 0.1%
Compat Jamo 2
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1362
 
2.4%
1355
 
2.4%
1174
 
2.1%
1165
 
2.0%
937
 
1.6%
924
 
1.6%
888
 
1.6%
862
 
1.5%
833
 
1.5%
824
 
1.4%
Other values (797) 46741
81.9%
ASCII
ValueCountFrequency (%)
1293
20.6%
) 984
15.7%
( 979
15.6%
S 209
 
3.3%
C 157
 
2.5%
2 152
 
2.4%
K 141
 
2.2%
B 133
 
2.1%
1 132
 
2.1%
6 103
 
1.6%
Other values (64) 1998
31.8%
None
ValueCountFrequency (%)
2
100.0%
Compat Jamo
ValueCountFrequency (%)
2
100.0%
CJK
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

광고물종류명
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 15
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 11
Median length: 4
Mean length: 4.1351
Min length: 2

Unique

Unique: 1
Unique (%): < 0.1%
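
The report lists both "Distinct" and "Unique" for each variable. The sketch below assumes that Distinct is the number of different values and Unique is the number of values that occur exactly once, which is consistent with Unique (1) being much smaller than Distinct (15) here. The toy column is illustrative and is not taken from the dataset.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique


# Hypothetical toy column built from category names that appear in this report.
toy_column = ["돌출간판", "돌출간판", "가로형간판", "공연간판", "공연간판", "옥상간판"]
distinct, unique = distinct_and_unique(toy_column)
print(distinct, unique)
```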

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 7624 | 76.2%
돌출간판 | 1241 | 12.4%
가로형간판 | 681 | 6.8%
공연간판 | 248 | 2.5%
지주이용 간판 | 64 | 0.6%
공공시설물이용 광고물 | 39 | 0.4%
옥상간판 | 24 | 0.2%
현수막게시틀 | 23 | 0.2%
가로형간판_입체형 | 22 | 0.2%
세로형간판 | 18 | 0.2%
Other values (5) | 16 | 0.2%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
na | 7624 | 75.4%
돌출간판 | 1241 | 12.3%
가로형간판 | 681 | 6.7%
공연간판 | 248 | 2.5%
지주이용 | 64 | 0.6%
간판 | 64 | 0.6%
광고물 | 42 | 0.4%
공공시설물이용 | 39 | 0.4%
옥상간판 | 24 | 0.2%
현수막게시틀 | 23 | 0.2%
Other values (7) | 56 | 0.6%

표시위치
Text

Distinct: 3073
Distinct (%): 30.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 37
Median length: 36
Mean length: 21.9392
Min length: 12

Characters and Unicode

Total characters: 219392
Distinct characters: 218
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1715
Unique (%): 17.2%

Sample

1st row: 부산광역시 중구 대청동1가 1번지 4호
2nd row: 부산광역시 해운대구 반송동 250번지 195호
3rd row: 부산광역시 해운대구 우동 1474번지
4th row: 부산광역시 사하구 당리동 237번지 33호
5th row: 부산광역시 중구 대청동1가 1번지 4호

Value | Count | Frequency (%)
부산광역시 | 9882 | 21.0%
중구 | 7931 | 16.9%
4호 | 2499 | 5.3%
1번지 | 2392 | 5.1%
대청동1가 | 2297 | 4.9%
1호 | 1488 | 3.2%
2호 | 766 | 1.6%
중앙동4가 | 631 | 1.3%
남포동5가 | 564 | 1.2%
3호 | 505 | 1.1%
Other values (955) | 18058 | 38.4%
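
The table above appears to count whitespace-separated tokens rather than whole address strings (for example, 부산광역시 occurs 9882 times across 10000 rows). Assuming that reading, the sketch below reproduces the idea on the five sample rows only; the tool's actual tokenisation may differ.

```python
from collections import Counter

# The five sample rows shown for this variable.
sample_addresses = [
    "부산광역시 중구 대청동1가 1번지 4호",
    "부산광역시 해운대구 반송동 250번지 195호",
    "부산광역시 해운대구 우동 1474번지",
    "부산광역시 사하구 당리동 237번지 33호",
    "부산광역시 중구 대청동1가 1번지 4호",
]

# Assumption: tokens are obtained by splitting each address on whitespace.
token_counts = Counter(
    token for address in sample_addresses for token in address.split()
)

for token, count in token_counts.most_common(3):
    print(token, count)
```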

Most occurring characters

ValueCountFrequency (%)
48149
21.9%
11121
 
5.1%
10869
 
5.0%
1 10826
 
4.9%
10529
 
4.8%
10419
 
4.7%
10081
 
4.6%
10037
 
4.6%
9924
 
4.5%
9156
 
4.2%
Other values (208) 78281
35.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 137022
62.5%
Space Separator 48149
 
21.9%
Decimal Number 34158
 
15.6%
Dash Punctuation 45
 
< 0.1%
Uppercase Letter 8
 
< 0.1%
Close Punctuation 4
 
< 0.1%
Open Punctuation 4
 
< 0.1%
Math Symbol 2
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
11121
 
8.1%
10869
 
7.9%
10529
 
7.7%
10419
 
7.6%
10081
 
7.4%
10037
 
7.3%
9924
 
7.2%
9156
 
6.7%
9147
 
6.7%
9127
 
6.7%
Other values (187) 36612
26.7%
Decimal Number
ValueCountFrequency (%)
1 10826
31.7%
4 5184
15.2%
2 4978
14.6%
3 3187
 
9.3%
5 2448
 
7.2%
8 1830
 
5.4%
6 1773
 
5.2%
7 1540
 
4.5%
0 1416
 
4.1%
9 976
 
2.9%
Uppercase Letter
ValueCountFrequency (%)
A 3
37.5%
R 1
 
12.5%
B 1
 
12.5%
X 1
 
12.5%
T 1
 
12.5%
S 1
 
12.5%
Space Separator
ValueCountFrequency (%)
48149
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 45
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4
100.0%
Open Punctuation
ValueCountFrequency (%)
( 4
100.0%
Math Symbol
ValueCountFrequency (%)
~ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 137022
62.5%
Common 82362
37.5%
Latin 8
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
11121
 
8.1%
10869
 
7.9%
10529
 
7.7%
10419
 
7.6%
10081
 
7.4%
10037
 
7.3%
9924
 
7.2%
9156
 
6.7%
9147
 
6.7%
9127
 
6.7%
Other values (187) 36612
26.7%
Common
ValueCountFrequency (%)
48149
58.5%
1 10826
 
13.1%
4 5184
 
6.3%
2 4978
 
6.0%
3 3187
 
3.9%
5 2448
 
3.0%
8 1830
 
2.2%
6 1773
 
2.2%
7 1540
 
1.9%
0 1416
 
1.7%
Other values (5) 1031
 
1.3%
Latin
ValueCountFrequency (%)
A 3
37.5%
R 1
 
12.5%
B 1
 
12.5%
X 1
 
12.5%
T 1
 
12.5%
S 1
 
12.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 137022
62.5%
ASCII 82370
37.5%

Most frequent character per block

ASCII
ValueCountFrequency (%)
48149
58.5%
1 10826
 
13.1%
4 5184
 
6.3%
2 4978
 
6.0%
3 3187
 
3.9%
5 2448
 
3.0%
8 1830
 
2.2%
6 1773
 
2.2%
7 1540
 
1.9%
0 1416
 
1.7%
Other values (11) 1039
 
1.3%
Hangul
ValueCountFrequency (%)
11121
 
8.1%
10869
 
7.9%
10529
 
7.7%
10419
 
7.6%
10081
 
7.4%
10037
 
7.3%
9924
 
7.2%
9156
 
6.7%
9147
 
6.7%
9127
 
6.7%
Other values (187) 36612
26.7%

표시시작일
DateTime

Distinct: 3042
Distinct (%): 30.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2000-12-27 00:00:00
Maximum: 2023-10-16 00:00:00
Histogram with fixed size bins (bins=50)

표시종료일
Text

Distinct: 3048
Distinct (%): 30.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 100000
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1528
Unique (%): 15.3%

Sample

1st row: 2019-10-31
2nd row: 2014-09-30
3rd row: 2013-06-30
4th row: 2014-08-08
5th row: 2018-03-22

Value | Count | Frequency (%)
2010-07-12 | 53 | 0.5%
2010-09-02 | 44 | 0.4%
2023-11-29 | 39 | 0.4%
2014-09-30 | 36 | 0.4%
2014-06-30 | 34 | 0.3%
2018-02-28 | 33 | 0.3%
2019-04-30 | 32 | 0.3%
2014-05-15 | 31 | 0.3%
2010-09-12 | 31 | 0.3%
2010-05-18 | 31 | 0.3%
Other values (3038) | 9636 | 96.4%
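
This variable is typed as Text, yet every value is 10 characters long and consists only of digits and hyphens, so the values look like ISO dates. A minimal sketch, assuming the `YYYY-MM-DD` format holds for every row, parses start and end dates and derives the display period in days. The three date pairs are taken from rows in the Sample section of this report; the helper name `display_days` is hypothetical.

```python
from datetime import date


def display_days(start: str, end: str) -> int:
    """Return the number of days between two ISO-formatted dates."""
    return (date.fromisoformat(end) - date.fromisoformat(start)).days


# (표시시작일, 표시종료일) pairs from three sample rows of this report.
sample_periods = [
    ("2019-10-01", "2019-10-31"),
    ("2014-09-16", "2014-09-30"),
    ("2010-04-17", "2013-04-16"),
]

durations = [display_days(start, end) for start, end in sample_periods]
print(durations)
```

Parsing the column this way would let it be profiled as DateTime instead of Text.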

Most occurring characters

ValueCountFrequency (%)
0 24077
24.1%
- 20000
20.0%
2 19090
19.1%
1 16535
16.5%
3 5185
 
5.2%
5 3284
 
3.3%
6 2530
 
2.5%
9 2401
 
2.4%
8 2324
 
2.3%
4 2323
 
2.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 80000
80.0%
Dash Punctuation 20000
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 24077
30.1%
2 19090
23.9%
1 16535
20.7%
3 5185
 
6.5%
5 3284
 
4.1%
6 2530
 
3.2%
9 2401
 
3.0%
8 2324
 
2.9%
4 2323
 
2.9%
7 2251
 
2.8%
Dash Punctuation
ValueCountFrequency (%)
- 20000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 100000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 24077
24.1%
- 20000
20.0%
2 19090
19.1%
1 16535
16.5%
3 5185
 
5.2%
5 3284
 
3.3%
6 2530
 
2.5%
9 2401
 
2.4%
8 2324
 
2.3%
4 2323
 
2.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 100000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 24077
24.1%
- 20000
20.0%
2 19090
19.1%
1 16535
16.5%
3 5185
 
5.2%
5 3284
 
3.3%
6 2530
 
2.5%
9 2401
 
2.4%
8 2324
 
2.3%
4 2323
 
2.3%

광고내용
Text

Distinct: 5402
Distinct (%): 54.2%
Missing: 28
Missing (%): 0.3%
Memory size: 156.2 KiB

Length

Max length: 50
Median length: 44
Mean length: 6.6888287
Min length: 1

Characters and Unicode

Total characters: 66701
Distinct characters: 967
Distinct categories: 14
Distinct scripts: 6
Distinct blocks: 9
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3953
Unique (%): 39.6%

Sample

1st row: 국제스피치 평생교육원
2nd row: 동명한문펜글씨
3rd row: 드림코리아
4th row: 속초코다리냉면
5th row: 함경남포면옥

Value | Count | Frequency (%)
신성애드 | 103 | 0.8%
메디포맨 | 71 | 0.6%
우리현수막 | 60 | 0.5%
그린자동차직업전문학교 | 59 | 0.5%
더쎈남성의원 | 52 | 0.4%
홍보 | 51 | 0.4%
61 | 47 | 0.4%
부산극장 | 46 | 0.4%
음식나라조리학원 | 41 | 0.3%
대륙 | 41 | 0.3%
Other values (6005) | 11550 | 95.3%

Most occurring characters

ValueCountFrequency (%)
2138
 
3.2%
1550
 
2.3%
1132
 
1.7%
1123
 
1.7%
957
 
1.4%
939
 
1.4%
831
 
1.2%
787
 
1.2%
688
 
1.0%
654
 
1.0%
Other values (957) 55902
83.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 55296
82.9%
Uppercase Letter 4195
 
6.3%
Lowercase Letter 2334
 
3.5%
Space Separator 2138
 
3.2%
Decimal Number 1085
 
1.6%
Close Punctuation 640
 
1.0%
Open Punctuation 640
 
1.0%
Other Punctuation 245
 
0.4%
Dash Punctuation 87
 
0.1%
Control 23
 
< 0.1%
Other values (4) 18
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1550
 
2.8%
1132
 
2.0%
1123
 
2.0%
957
 
1.7%
939
 
1.7%
831
 
1.5%
787
 
1.4%
688
 
1.2%
654
 
1.2%
619
 
1.1%
Other values (869) 46016
83.2%
Lowercase Letter
ValueCountFrequency (%)
e 327
14.0%
o 205
 
8.8%
a 191
 
8.2%
i 159
 
6.8%
l 151
 
6.5%
s 137
 
5.9%
n 133
 
5.7%
r 126
 
5.4%
t 125
 
5.4%
c 98
 
4.2%
Other values (17) 682
29.2%
Uppercase Letter
ValueCountFrequency (%)
S 377
 
9.0%
E 310
 
7.4%
C 307
 
7.3%
T 303
 
7.2%
O 276
 
6.6%
A 269
 
6.4%
K 236
 
5.6%
B 222
 
5.3%
N 210
 
5.0%
I 195
 
4.6%
Other values (16) 1490
35.5%
Other Punctuation
ValueCountFrequency (%)
. 76
31.0%
& 65
26.5%
, 56
22.9%
' 16
 
6.5%
/ 13
 
5.3%
: 7
 
2.9%
4
 
1.6%
" 2
 
0.8%
% 2
 
0.8%
* 2
 
0.8%
Other values (2) 2
 
0.8%
Decimal Number
ValueCountFrequency (%)
2 280
25.8%
1 171
15.8%
5 125
11.5%
3 103
 
9.5%
0 99
 
9.1%
4 96
 
8.8%
6 91
 
8.4%
8 41
 
3.8%
9 40
 
3.7%
7 39
 
3.6%
Math Symbol
ValueCountFrequency (%)
+ 8
80.0%
> 1
 
10.0%
~ 1
 
10.0%
Other Symbol
ValueCountFrequency (%)
2
40.0%
2
40.0%
1
20.0%
Space Separator
ValueCountFrequency (%)
2138
100.0%
Close Punctuation
ValueCountFrequency (%)
) 640
100.0%
Open Punctuation
ValueCountFrequency (%)
( 640
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 87
100.0%
Control
ValueCountFrequency (%)
23
100.0%
Final Punctuation
ValueCountFrequency (%)
2
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 55250
82.8%
Latin 6528
 
9.8%
Common 4875
 
7.3%
Han 36
 
0.1%
Katakana 11
 
< 0.1%
Greek 1
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1550
 
2.8%
1132
 
2.0%
1123
 
2.0%
957
 
1.7%
939
 
1.7%
831
 
1.5%
787
 
1.4%
688
 
1.2%
654
 
1.2%
619
 
1.1%
Other values (831) 45970
83.2%
Latin
ValueCountFrequency (%)
S 377
 
5.8%
e 327
 
5.0%
E 310
 
4.7%
C 307
 
4.7%
T 303
 
4.6%
O 276
 
4.2%
A 269
 
4.1%
K 236
 
3.6%
B 222
 
3.4%
N 210
 
3.2%
Other values (42) 3691
56.5%
Common
ValueCountFrequency (%)
2138
43.9%
) 640
 
13.1%
( 640
 
13.1%
2 280
 
5.7%
1 171
 
3.5%
5 125
 
2.6%
3 103
 
2.1%
0 99
 
2.0%
4 96
 
2.0%
6 91
 
1.9%
Other values (24) 492
 
10.1%
Han
ValueCountFrequency (%)
2
 
5.6%
2
 
5.6%
2
 
5.6%
2
 
5.6%
2
 
5.6%
2
 
5.6%
1
 
2.8%
1
 
2.8%
1
 
2.8%
西 1
 
2.8%
Other values (20) 20
55.6%
Katakana
ValueCountFrequency (%)
2
18.2%
2
18.2%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
Greek
ValueCountFrequency (%)
α 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 55243
82.8%
ASCII 11391
 
17.1%
CJK 31
 
< 0.1%
Katakana 11
 
< 0.1%
None 7
 
< 0.1%
Compat Jamo 6
 
< 0.1%
CJK Compat Ideographs 5
 
< 0.1%
Misc Symbols 4
 
< 0.1%
Punctuation 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2138
 
18.8%
) 640
 
5.6%
( 640
 
5.6%
S 377
 
3.3%
e 327
 
2.9%
E 310
 
2.7%
C 307
 
2.7%
T 303
 
2.7%
2 280
 
2.5%
O 276
 
2.4%
Other values (70) 5793
50.9%
Hangul
ValueCountFrequency (%)
1550
 
2.8%
1132
 
2.0%
1123
 
2.0%
957
 
1.7%
939
 
1.7%
831
 
1.5%
787
 
1.4%
688
 
1.2%
654
 
1.2%
619
 
1.1%
Other values (826) 45963
83.2%
None
ValueCountFrequency (%)
4
57.1%
1
 
14.3%
α 1
 
14.3%
1
 
14.3%
CJK
ValueCountFrequency (%)
2
 
6.5%
2
 
6.5%
2
 
6.5%
2
 
6.5%
2
 
6.5%
1
 
3.2%
1
 
3.2%
1
 
3.2%
西 1
 
3.2%
1
 
3.2%
Other values (16) 16
51.6%
Compat Jamo
ValueCountFrequency (%)
2
33.3%
2
33.3%
1
16.7%
1
16.7%
Katakana
ValueCountFrequency (%)
2
18.2%
2
18.2%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
Misc Symbols
ValueCountFrequency (%)
2
50.0%
2
50.0%
Punctuation
ValueCountFrequency (%)
2
66.7%
1
33.3%
CJK Compat Ideographs
ValueCountFrequency (%)
2
40.0%
1
20.0%
1
20.0%
1
20.0%

시도
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 5
Median length: 5
Mean length: 4.8244
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 부산광역시
2nd row: 부산광역시
3rd row: 부산광역시
4th row: 부산광역시
5th row: 부산광역시

Common Values

Value | Count | Frequency (%)
부산광역시 | 8244 | 82.4%
<NA> | 1756 | 17.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
부산광역시 | 8244 | 82.4%
na | 1756 | 17.6%

구군
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 2
Mean length: 2.3512
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 중구
2nd row: 중구
3rd row: 중구
4th row: 중구
5th row: 중구

Common Values

Value | Count | Frequency (%)
중구 | 8244 | 82.4%
<NA> | 1756 | 17.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
중구 | 8244 | 82.4%
na | 1756 | 17.6%

읍면동
Text

MISSING 

Distinct: 149
Distinct (%): 1.8%
Missing: 1756
Missing (%): 17.6%
Memory size: 156.2 KiB

Length

Max length: 6
Median length: 5
Mean length: 4.6960213
Min length: 2

Characters and Unicode

Total characters: 38714
Distinct characters: 120
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 29
Unique (%): 0.4%

Sample

1st row: 대청동1가
2nd row: 반송동
3rd row: 우동
4th row: 당리동
5th row: 대청동1가

Value | Count | Frequency (%)
대청동1가 | 2251 | 27.3%
중앙동4가 | 586 | 7.1%
남포동5가 | 498 | 6.0%
광복동2가 | 279 | 3.4%
광복동1가 | 241 | 2.9%
창선동1가 | 231 | 2.8%
남포동2가 | 220 | 2.7%
부평동2가 | 218 | 2.6%
부평동1가 | 203 | 2.5%
남포동6가 | 189 | 2.3%
Other values (139) | 3328 | 40.4%

Most occurring characters

ValueCountFrequency (%)
8476
21.9%
7025
18.1%
1 3411
 
8.8%
2705
 
7.0%
2486
 
6.4%
2 1311
 
3.4%
1306
 
3.4%
1274
 
3.3%
999
 
2.6%
937
 
2.4%
Other values (110) 8784
22.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 31687
81.8%
Decimal Number 7027
 
18.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
8476
26.7%
7025
22.2%
2705
 
8.5%
2486
 
7.8%
1306
 
4.1%
1274
 
4.0%
999
 
3.2%
937
 
3.0%
787
 
2.5%
687
 
2.2%
Other values (103) 5005
15.8%
Decimal Number
ValueCountFrequency (%)
1 3411
48.5%
2 1311
 
18.7%
4 871
 
12.4%
5 594
 
8.5%
3 542
 
7.7%
6 267
 
3.8%
7 31
 
0.4%

Most occurring scripts

ValueCountFrequency (%)
Hangul 31687
81.8%
Common 7027
 
18.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
8476
26.7%
7025
22.2%
2705
 
8.5%
2486
 
7.8%
1306
 
4.1%
1274
 
4.0%
999
 
3.2%
937
 
3.0%
787
 
2.5%
687
 
2.2%
Other values (103) 5005
15.8%
Common
ValueCountFrequency (%)
1 3411
48.5%
2 1311
 
18.7%
4 871
 
12.4%
5 594
 
8.5%
3 542
 
7.7%
6 267
 
3.8%
7 31
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 31687
81.8%
ASCII 7027
 
18.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
8476
26.7%
7025
22.2%
2705
 
8.5%
2486
 
7.8%
1306
 
4.1%
1274
 
4.0%
999
 
3.2%
937
 
3.0%
787
 
2.5%
687
 
2.2%
Other values (103) 5005
15.8%
ASCII
ValueCountFrequency (%)
1 3411
48.5%
2 1311
 
18.7%
4 871
 
12.4%
5 594
 
8.5%
3 542
 
7.7%
6 267
 
3.8%
7 31
 
0.4%

도로명
Text

MISSING 

Distinct: 159
Distinct (%): 2.6%
Missing: 3793
Missing (%): 37.9%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 3
Mean length: 3.9581118
Min length: 2

Characters and Unicode

Total characters: 24568
Distinct characters: 70
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 33
Unique (%): 0.5%

Sample

1st row: 중구로
2nd row: 중구로
3rd row: 중구로
4th row: 남포길
5th row: 중구로

Value | Count | Frequency (%)
중구로 | 2316 | 37.3%
광복로 | 685 | 11.0%
중앙대로 | 415 | 6.7%
비프광장로 | 346 | 5.6%
구덕로 | 259 | 4.2%
대청로 | 244 | 3.9%
해관로 | 146 | 2.4%
광복중앙로 | 144 | 2.3%
남포길 | 138 | 2.2%
흑교로 | 138 | 2.2%
Other values (149) | 1376 | 22.2%

Most occurring characters

ValueCountFrequency (%)
5911
24.1%
3085
12.6%
2770
11.3%
1414
 
5.8%
1263
 
5.1%
1181
 
4.8%
1033
 
4.2%
967
 
3.9%
643
 
2.6%
523
 
2.1%
Other values (60) 5778
23.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 22578
91.9%
Decimal Number 1990
 
8.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5911
26.2%
3085
13.7%
2770
12.3%
1414
 
6.3%
1263
 
5.6%
1181
 
5.2%
1033
 
4.6%
967
 
4.3%
643
 
2.8%
523
 
2.3%
Other values (50) 3788
16.8%
Decimal Number
ValueCountFrequency (%)
3 310
15.6%
1 297
14.9%
4 276
13.9%
2 255
12.8%
9 255
12.8%
5 227
11.4%
7 149
7.5%
6 97
 
4.9%
8 91
 
4.6%
0 33
 
1.7%

Most occurring scripts

ValueCountFrequency (%)
Hangul 22578
91.9%
Common 1990
 
8.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5911
26.2%
3085
13.7%
2770
12.3%
1414
 
6.3%
1263
 
5.6%
1181
 
5.2%
1033
 
4.6%
967
 
4.3%
643
 
2.8%
523
 
2.3%
Other values (50) 3788
16.8%
Common
ValueCountFrequency (%)
3 310
15.6%
1 297
14.9%
4 276
13.9%
2 255
12.8%
9 255
12.8%
5 227
11.4%
7 149
7.5%
6 97
 
4.9%
8 91
 
4.6%
0 33
 
1.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 22578
91.9%
ASCII 1990
 
8.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
5911
26.2%
3085
13.7%
2770
12.3%
1414
 
6.3%
1263
 
5.6%
1181
 
5.2%
1033
 
4.6%
967
 
4.3%
643
 
2.8%
523
 
2.3%
Other values (50) 3788
16.8%
ASCII
ValueCountFrequency (%)
3 310
15.6%
1 297
14.9%
4 276
13.9%
2 255
12.8%
9 255
12.8%
5 227
11.4%
7 149
7.5%
6 97
 
4.9%
8 91
 
4.6%
0 33
 
1.7%

건물번호
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 176
Distinct (%): 2.8%
Missing: 3793
Missing (%): 37.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 73.605123
Minimum: 1
Maximum: 457
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 28
Median: 73
Q3: 120
95-th percentile: 120
Maximum: 457
Range: 456
Interquartile range (IQR): 92

Descriptive statistics

Standard deviation: 50.211161
Coefficient of variation (CV): 0.68216937
Kurtosis: 1.9708302
Mean: 73.605123
Median Absolute Deviation (MAD): 47
Skewness: 0.56202404
Sum: 456867
Variance: 2521.1607
Monotonicity: Not monotonic
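
Several of the statistics above are derived from others, which makes them easy to cross-check. The sketch below assumes the usual definitions (mean is the sum over non-missing values, CV is standard deviation over mean, variance is the squared standard deviation, IQR is Q3 minus Q1) and uses only numbers printed in this report.

```python
# Values copied from the 건물번호 statistics in this report.
n_observations = 10_000
n_missing = 3_793
total = 456_867
std_dev = 50.211161
q1, q3 = 28, 120
minimum, maximum = 1, 457

# Assumption: missing values are excluded before averaging.
n_present = n_observations - n_missing
mean = total / n_present
cv = std_dev / mean           # coefficient of variation
variance = std_dev ** 2
iqr = q3 - q1                 # interquartile range
value_range = maximum - minimum

print(n_present, round(mean, 6), round(cv, 8), round(variance, 4), iqr, value_range)
```

Each derived value agrees with the corresponding figure in the report to the printed precision.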
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
120 | 2216 | 22.2%
28 | 127 | 1.3%
4 | 121 | 1.2%
3 | 117 | 1.2%
37 | 103 | 1.0%
6 | 100 | 1.0%
31 | 83 | 0.8%
75 | 77 | 0.8%
7 | 77 | 0.8%
5 | 74 | 0.7%
Other values (166) | 3112 | 31.1%
(Missing) | 3793 | 37.9%

Minimum 10 values

Value | Count | Frequency (%)
1 | 33 | 0.3%
2 | 44 | 0.4%
3 | 117 | 1.2%
4 | 121 | 1.2%
5 | 74 | 0.7%
6 | 100 | 1.0%
7 | 77 | 0.8%
8 | 69 | 0.7%
9 | 67 | 0.7%
10 | 50 | 0.5%

Maximum 10 values

Value | Count | Frequency (%)
457 | 1 | < 0.1%
405 | 1 | < 0.1%
396 | 8 | 0.1%
343 | 1 | < 0.1%
258 | 3 | < 0.1%
247 | 2 | < 0.1%
245 | 1 | < 0.1%
242 | 1 | < 0.1%
241 | 1 | < 0.1%
240 | 1 | < 0.1%

건물번호2
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 22
Distinct (%): 0.4%
Missing: 3793
Missing (%): 37.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.36764943
Minimum: 0
Maximum: 34
Zeros: 4965
Zeros (%): 49.6%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 2
Maximum: 34
Range: 34
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.5182329
Coefficient of variation (CV): 4.129567
Kurtosis: 181.57509
Mean: 0.36764943
Median Absolute Deviation (MAD): 0
Skewness: 11.690647
Sum: 2282
Variance: 2.3050313
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=22)
Common values

Value | Count | Frequency (%)
0 | 4965 | 49.6%
1 | 898 | 9.0%
2 | 242 | 2.4%
3 | 42 | 0.4%
14 | 10 | 0.1%
13 | 7 | 0.1%
7 | 7 | 0.1%
12 | 5 | 0.1%
5 | 4 | < 0.1%
18 | 4 | < 0.1%
Other values (12) | 23 | 0.2%
(Missing) | 3793 | 37.9%

Minimum 10 values

Value | Count | Frequency (%)
0 | 4965 | 49.6%
1 | 898 | 9.0%
2 | 242 | 2.4%
3 | 42 | 0.4%
4 | 4 | < 0.1%
5 | 4 | < 0.1%
6 | 1 | < 0.1%
7 | 7 | 0.1%
8 | 3 | < 0.1%
9 | 3 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
34 | 2 | < 0.1%
31 | 1 | < 0.1%
27 | 1 | < 0.1%
23 | 1 | < 0.1%
22 | 1 | < 0.1%
21 | 3 | < 0.1%
18 | 4 | < 0.1%
15 | 1 | < 0.1%
14 | 10 | 0.1%
13 | 7 | 0.1%

Interactions


Correlations

 | 광고물종류명 | 건물번호 | 건물번호2
광고물종류명 | 1.000 | 0.246 | 0.000
건물번호 | 0.246 | 1.000 | 0.043
건물번호2 | 0.000 | 0.043 | 1.000

 | 시도 | 광고물종류명 | 구군
시도 | 1.000 | 1.000 | 1.000
광고물종류명 | 1.000 | 1.000 | 1.000
구군 | 1.000 | 1.000 | 1.000

 | 건물번호 | 건물번호2 | 광고물종류명 | 시도 | 구군
건물번호 | 1.000 | -0.308 | 0.124 | 1.000 | 1.000
건물번호2 | -0.308 | 1.000 | 0.000 | 1.000 | 1.000
광고물종류명 | 0.124 | 0.000 | 1.000 | 1.000 | 1.000
시도 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
구군 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
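
Nullity correlation can be sketched as the Pearson correlation between the presence indicators of two columns. This is an assumed definition for illustration, and the helper name `nullity_correlation` is hypothetical. The toy columns reuse the 도로명, 건물번호 and 읍면동 values of the first six sample rows, with `None` standing in for `<NA>`.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between presence indicators (1 = present, 0 = missing)."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is never (or always) missing has no defined correlation.
        return float("nan")
    return cov / math.sqrt(var_a * var_b)


road_name = ["중구로", None, None, None, "중구로", None]
building_no = [120, None, None, None, 120, None]
town = ["대청동1가", "반송동", "우동", "당리동", "대청동1가", "남포동5가"]

print(nullity_correlation(road_name, building_no))
print(nullity_correlation(road_name, town))
```

Columns that are always missing together, as 도로명 and 건물번호 are in these rows, score 1; a column with no missing values in the sample gives an undefined result.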

Sample

(index) | 업소명 | 광고물종류명 | 표시위치 | 표시시작일 | 표시종료일 | 광고내용 | 시도 | 구군 | 읍면동 | 도로명 | 건물번호 | 건물번호2
12000 | 국제스피치 평생교육원 | <NA> | 부산광역시 중구 대청동1가 1번지 4호 | 2019-10-01 | 2019-10-31 | 국제스피치 평생교육원 | 부산광역시 | 중구 | 대청동1가 | 중구로 | 120 | 0
7468 | 동명한문펜글씨 | <NA> | 부산광역시 해운대구 반송동 250번지 195호 | 2014-09-16 | 2014-09-30 | 동명한문펜글씨 | 부산광역시 | 중구 | 반송동 | <NA> | <NA> | <NA>
5982 | 드림코리아 | <NA> | 부산광역시 해운대구 우동 1474번지 | 2013-06-01 | 2013-06-30 | 드림코리아 | 부산광역시 | 중구 | 우동 | <NA> | <NA> | <NA>
7197 | 속초코다리냉면 | <NA> | 부산광역시 사하구 당리동 237번지 33호 | 2014-07-25 | 2014-08-08 | 속초코다리냉면 | 부산광역시 | 중구 | 당리동 | <NA> | <NA> | <NA>
10659 | 함경남포면옥 | <NA> | 부산광역시 중구 대청동1가 1번지 4호 | 2018-03-08 | 2018-03-22 | 함경남포면옥 | 부산광역시 | 중구 | 대청동1가 | 중구로 | 120 | 0
2538 | 부산극장 | 돌출간판 | 부산광역시 중구 남포동5가 18번지 0호 | 2010-04-17 | 2013-04-16 | CINUS 부산극장 | 부산광역시 | 중구 | 남포동5가 | <NA> | <NA> | <NA>
8370 | 보람파출부 | <NA> | 부산광역시 중구 대청동1가 1번지 4호 | 2015-07-01 | 2015-07-15 | 보람파출부 | 부산광역시 | 중구 | 대청동1가 | 중구로 | 120 | 0
5745 | 베리스포츠마사지 | <NA> | 부산광역시 중구 남포동2가 5번지 | 2013-03-20 | 2016-03-19 | 베리스포츠마사지 | 부산광역시 | 중구 | 남포동2가 | 남포길 | 39 | 1
10982 | 부산직업능력교육원 | <NA> | 부산광역시 중구 대청동1가 1번지 4호 | 2018-08-01 | 2018-08-31 | 부산직업능력교육원 | 부산광역시 | 중구 | 대청동1가 | 중구로 | 120 | 0
11139 | 제임스짐 | <NA> | 부산광역시 중구 대청동1가 1번지 4호 | 2018-10-01 | 2018-10-15 | 제임스짐 | 부산광역시 | 중구 | 대청동1가 | 중구로 | 120 | 0

(index) | 업소명 | 광고물종류명 | 표시위치 | 표시시작일 | 표시종료일 | 광고내용 | 시도 | 구군 | 읍면동 | 도로명 | 건물번호 | 건물번호2
10497 | 경두기획 | <NA> | 부산광역시 연제구 연산동 104번지 32호 | 2018-01-01 | 2018-01-10 | 사무실임대업 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
10535 | 2pm커피 | <NA> | 부산광역시 강서구 강동동 | 2018-01-21 | 2018-01-31 | 2pm커피 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
16154 | 백세건강 | <NA> | 부산광역시 중구 대청동1가 1번지 4호 | 2023-08-01 | 2023-08-31 | 백세건강 | 부산광역시 | 중구 | 대청동1가 | 중구로 | 120 | 0
1756 | 현대자동차(주) | 가로형간판 | 부산광역시 중구 부평동4가 52번지 3호 | 2007-10-10 | 2010-10-09 | 현대자동차 | 부산광역시 | 중구 | 부평동4가 | 보수대로 | 82 | 0
12813 | 부산안마수련원 | <NA> | 부산광역시 중구 대청동1가 1번지 4호 | 2020-08-21 | 2020-08-31 | 부산안마수련원 입학홍보 | 부산광역시 | 중구 | 대청동1가 | 중구로 | 120 | 0
14284 | 그린자동차 직업전문학교 | <NA> | 부산광역시 중구 대청동1가 1번지 4호 | 2021-10-01 | 2021-10-31 | 그린자동차 직업전문학교 | 부산광역시 | 중구 | 대청동1가 | 중구로 | 120 | 0
8412 | 국제스피치평생교육원 | <NA> | 부산광역시 중구 대청동1가 1번지 4호 | 2015-07-16 | 2015-07-30 | 국제스피치평생교육원 | 부산광역시 | 중구 | 대청동1가 | 중구로 | 120 | 0
1224 | 자이언트PC방 | 돌출간판 | 부산광역시 중구 남포동6가 94번지 | 2006-12-11 | 2009-12-10 | 자이언트PC방 | 부산광역시 | 중구 | 남포동6가 | 비프광장로 | 16 | 0
3679 | 오천콜 | <NA> | 부산광역시 부산진구 범천동 849번지 | 2011-09-21 | 2011-10-05 | 오천콜 | 부산광역시 | 중구 | 범천동 | <NA> | <NA> | <NA>
2274 | - | 공연간판 | 부산광역시 중구 남포동1가 (주)부산극장 | 2009-06-24 | 2011-06-23 | 트랜스포머2 | 부산광역시 | 중구 | 남포동1가 | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

(index) | 업소명 | 광고물종류명 | 표시위치 | 표시시작일 | 표시종료일 | 광고내용 | 시도 | 구군 | 읍면동 | 도로명 | 건물번호 | 건물번호2 | # duplicates
24 | 61 | <NA> | 부산광역시 사하구 하단동 887번지 15호 | 2016-12-29 | 2017-01-12 | 61 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 4
227 | 사하e편한 | <NA> | 부산광역시 해운대구 반여동 890번지 3호 | 2014-09-01 | 2014-09-15 | 사하e편한 | 부산광역시 | 중구 | 반여동 | <NA> | <NA> | <NA> | 4
426 | 하나 | <NA> | 부산광역시 해운대구 재송동 1185번지 | 2016-06-29 | 2016-07-13 | 하나 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 4
17 | (주)타겟신화하니엘 | <NA> | 부산광역시 중구 대청동1가 1번지 4호 | 2015-12-01 | 2015-12-10 | (주)타겟신화하니엘 | 부산광역시 | 중구 | 대청동1가 | 중구로 | 120 | 0 | 3
51 | bk기획 | <NA> | 부산광역시 서구 부민동1가 30번지 3호 | 2017-01-05 | 2017-01-19 | bk기획 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 3
52 | bk기획 | <NA> | 부산광역시 서구 암남동 450번지 1호 | 2016-01-29 | 2016-02-12 | bk기획 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 3
68 | 경주러브캐슬 | <NA> | 부산광역시 북구 구포동 1075번지 24호 | 2015-01-09 | 2015-01-23 | 경주러브캐슬 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 3
71 | 고심정사불교대학 | <NA> | 부산광역시 중구 중앙동4가 37번지 12호 | 2013-08-01 | 2013-08-30 | 고심정사불교대학 | 부산광역시 | 중구 | 중앙동4가 | 대청로135번길 | 20 | 0 | 3
142 | 돈불왕소금구이 | <NA> | 부산광역시 중구 부평동1가 48번지 2호 | 2016-02-07 | 2019-02-06 | 돈불왕소금구이 | 부산광역시 | 중구 | 부평동1가 | 비프광장로 | 13 | 1 | 3
144 | 동구여성인력개발센터 | <NA> | 부산광역시 사상구 덕포동 150번지 10호 | 2015-03-17 | 2015-03-31 | 동구여성인력개발센터 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 3
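
A duplicate-row table like the one above can be approximated by counting identical rows. The sketch below assumes that "# duplicates" is the number of times a fully identical row occurs; the rows are abridged, hypothetical tuples (업소명, 표시시작일, 표시종료일) rather than complete records, and `duplicate_groups` is an illustrative helper name.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return {row: occurrences} for every row that appears more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Hypothetical abridged rows; real records have 12 columns.
rows = [
    ("bk기획", "2017-01-05", "2017-01-19"),
    ("bk기획", "2017-01-05", "2017-01-19"),
    ("bk기획", "2017-01-05", "2017-01-19"),
    ("bk기획", "2016-01-29", "2016-02-12"),
    ("하나", "2016-06-29", "2016-07-13"),
    ("하나", "2016-06-29", "2016-07-13"),
]

groups = duplicate_groups(rows)
for row, n in sorted(groups.items(), key=lambda item: -item[1]):
    print(row, n)
```

Rows must match in every column to count as duplicates, which is why the two bk기획 entries with different dates are listed separately in the report.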