Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 6439
Missing cells (%): 9.2%
Duplicate rows: 679
Duplicate rows (%): 6.8%
Total size in memory: 644.5 KiB
Average record size in memory: 66.0 B
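
The percentages and the average record size above follow directly from the raw counts. A minimal Python sketch of that arithmetic, using only the figures reported in this block (the variable names are illustrative and not part of the report):

```python
# Figures copied from the "Dataset statistics" block above.
n_variables = 7
n_observations = 10_000
missing_cells = 6_439
duplicate_rows = 679
total_size_kib = 644.5

total_cells = n_variables * n_observations             # 70,000 cells in total
missing_pct = 100 * missing_cells / total_cells        # share of empty cells
duplicate_pct = 100 * duplicate_rows / n_observations  # share of duplicate rows
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells:   {missing_pct:.1f}%")
print(f"Duplicate rows:  {duplicate_pct:.1f}%")
print(f"Avg record size: {avg_record_bytes:.1f} B")
```

Note that the missing-cell percentage is taken over all cells (variables times observations), not over rows.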

Variable types

Text: 2
Categorical: 3
Numeric: 2

Dataset

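Description: 부산광역시사상구_U-옥외광고물통합관리시스템_전수조사광고물및옥외광고물정보_20221028 (Sasang-gu, Busan Metropolitan City: U-Outdoor Advertising Integrated Management System, full-survey advertisement and outdoor advertisement information, 20221028)
Author: 부산광역시 사상구 (Sasang-gu, Busan Metropolitan City)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15093724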

Alerts

시도 has constant value "부산광역시" (Constant)
구군 has constant value "사상구" (Constant)
Dataset has 679 (6.8%) duplicate rows (Duplicates)
도로명 has 186 (1.9%) missing values (Missing)
건물1 has 181 (1.8%) missing values (Missing)
건물2 has 6072 (60.7%) missing values (Missing)
건물2 has 3093 (30.9%) zeros (Zeros)

Reproduction

Analysis started: 2023-12-10 16:22:55.207505
Analysis finished: 2023-12-10 16:22:56.872209
Duration: 1.66 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업소명
Text

Distinct: 7189
Distinct (%): 71.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 25
Median length: 22
Mean length: 5.7404
Min length: 1

Characters and Unicode

Total characters: 57404
Distinct characters: 977
Distinct categories: 13
Distinct scripts: 4
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
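
As a rough illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, and this is not necessarily how the profiling tool computes its tables. The category codes correspond to the names used in this report: `Lo` is Other Letter, `Ll` is Lowercase Letter, `Nd` is Decimal Number and `Zs` is Space Separator.

```python
import unicodedata
from collections import Counter

def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category code (e.g. 'Lo', 'Zs')."""
    return Counter(unicodedata.category(ch) for ch in text)

# Two strings that appear in this report: a sample row and a frequent token.
sample_row = category_counts("롤로토바코 부산괘법점")
token = category_counts("gs25")

print(dict(sample_row))
print(dict(token))
```

Hangul syllables fall under Other Letter, which is why that category dominates the tables that follow.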

Unique

Unique: 5167
Unique (%): 51.7%

Sample

1st row: 롤로토바코 부산괘법점
2nd row: 대게마당동해
3rd row: 대보실업
4th row: 닭도리
5th row: 오리마당

Value | Count | Frequency (%)
gs25 | 33 | 0.3%
cu | 28 | 0.3%
사상점 | 27 | 0.2%
coffee | 22 | 0.2%
bar | 20 | 0.2%
세븐일레븐 | 19 | 0.2%
hotel | 18 | 0.2%
the | 17 | 0.2%
휴대폰할인마트 | 13 | 0.1%
pc | 13 | 0.1%
Other values (7431) | 10641 | 98.1%

Most occurring characters

ValueCountFrequency (%)
1031
 
1.8%
965
 
1.7%
920
 
1.6%
866
 
1.5%
805
 
1.4%
727
 
1.3%
681
 
1.2%
679
 
1.2%
674
 
1.2%
661
 
1.2%
Other values (967) 49395
86.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 50250 | 87.5%
Uppercase Letter | 3765 | 6.6%
Lowercase Letter | 923 | 1.6%
Space Separator | 866 | 1.5%
Decimal Number | 513 | 0.9%
Open Punctuation | 462 | 0.8%
Close Punctuation | 459 | 0.8%
Other Punctuation | 123 | 0.2%
Dash Punctuation | 22 | < 0.1%
Math Symbol | 15 | < 0.1%
Other values (3) | 6 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1031
 
2.1%
965
 
1.9%
920
 
1.8%
805
 
1.6%
727
 
1.4%
681
 
1.4%
679
 
1.4%
674
 
1.3%
661
 
1.3%
640
 
1.3%
Other values (883) 42467
84.5%
Uppercase Letter
ValueCountFrequency (%)
E 335
 
8.9%
O 301
 
8.0%
C 285
 
7.6%
S 280
 
7.4%
T 237
 
6.3%
A 206
 
5.5%
P 183
 
4.9%
G 173
 
4.6%
N 169
 
4.5%
L 169
 
4.5%
Other values (16) 1427
37.9%
Lowercase Letter
ValueCountFrequency (%)
e 126
13.7%
a 97
10.5%
o 88
 
9.5%
t 73
 
7.9%
n 55
 
6.0%
l 54
 
5.9%
c 54
 
5.9%
r 53
 
5.7%
i 47
 
5.1%
m 41
 
4.4%
Other values (16) 235
25.5%
Other Punctuation
ValueCountFrequency (%)
& 54
43.9%
. 42
34.1%
* 7
 
5.7%
# 6
 
4.9%
, 5
 
4.1%
/ 3
 
2.4%
! 2
 
1.6%
: 1
 
0.8%
· 1
 
0.8%
" 1
 
0.8%
Decimal Number
ValueCountFrequency (%)
2 120
23.4%
1 86
16.8%
5 74
14.4%
0 63
12.3%
3 44
 
8.6%
4 35
 
6.8%
6 25
 
4.9%
8 23
 
4.5%
7 23
 
4.5%
9 20
 
3.9%
Open Punctuation
ValueCountFrequency (%)
( 329
71.2%
[ 133
28.8%
Close Punctuation
ValueCountFrequency (%)
) 327
71.2%
] 132
28.8%
Math Symbol
ValueCountFrequency (%)
+ 8
53.3%
~ 7
46.7%
Space Separator
ValueCountFrequency (%)
866
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 22
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 3
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 50217 | 87.5%
Latin | 4688 | 8.2%
Common | 2466 | 4.3%
Han | 33 | 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1031
 
2.1%
965
 
1.9%
920
 
1.8%
805
 
1.6%
727
 
1.4%
681
 
1.4%
679
 
1.4%
674
 
1.3%
661
 
1.3%
640
 
1.3%
Other values (866) 42434
84.5%
Latin
ValueCountFrequency (%)
E 335
 
7.1%
O 301
 
6.4%
C 285
 
6.1%
S 280
 
6.0%
T 237
 
5.1%
A 206
 
4.4%
P 183
 
3.9%
G 173
 
3.7%
N 169
 
3.6%
L 169
 
3.6%
Other values (42) 2350
50.1%
Common
ValueCountFrequency (%)
866
35.1%
( 329
 
13.3%
) 327
 
13.3%
[ 133
 
5.4%
] 132
 
5.4%
2 120
 
4.9%
1 86
 
3.5%
5 74
 
3.0%
0 63
 
2.6%
& 54
 
2.2%
Other values (22) 282
 
11.4%
Han
ValueCountFrequency (%)
5
15.2%
4
12.1%
3
9.1%
3
9.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
1
 
3.0%
Other values (7) 7
21.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 50216 | 87.5%
ASCII | 7152 | 12.5%
CJK | 29 | 0.1%
CJK Compat Ideographs | 4 | < 0.1%
None | 1 | < 0.1%
Compat Jamo | 1 | < 0.1%
Geometric Shapes | 1 | < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1031
 
2.1%
965
 
1.9%
920
 
1.8%
805
 
1.6%
727
 
1.4%
681
 
1.4%
679
 
1.4%
674
 
1.3%
661
 
1.3%
640
 
1.3%
Other values (865) 42433
84.5%
ASCII
ValueCountFrequency (%)
866
 
12.1%
E 335
 
4.7%
( 329
 
4.6%
) 327
 
4.6%
O 301
 
4.2%
C 285
 
4.0%
S 280
 
3.9%
T 237
 
3.3%
A 206
 
2.9%
P 183
 
2.6%
Other values (72) 3803
53.2%
CJK
ValueCountFrequency (%)
5
17.2%
4
13.8%
3
10.3%
2
 
6.9%
2
 
6.9%
2
 
6.9%
2
 
6.9%
2
 
6.9%
1
 
3.4%
1
 
3.4%
Other values (5) 5
17.2%
CJK Compat Ideographs
ValueCountFrequency (%)
3
75.0%
1
 
25.0%
None
ValueCountFrequency (%)
· 1
100.0%
Compat Jamo
ValueCountFrequency (%)
1
100.0%
Geometric Shapes
ValueCountFrequency (%)
1
100.0%

시도
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
부산광역시: 10000

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 부산광역시
2nd row: 부산광역시
3rd row: 부산광역시
4th row: 부산광역시
5th row: 부산광역시

Common Values

Value | Count | Frequency (%)
부산광역시 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
부산광역시 | 10000 | 100.0%

구군
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
사상구: 10000

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 사상구
2nd row: 사상구
3rd row: 사상구
4th row: 사상구
5th row: 사상구

Common Values

Value | Count | Frequency (%)
사상구 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
사상구 | 10000 | 100.0%

도로명
Text

MISSING 

Distinct: 213
Distinct (%): 2.2%
Missing: 186
Missing (%): 1.9%
Memory size: 156.2 KiB

Length

Max length: 10
Median length: 9
Mean length: 5.3139393
Min length: 3

Characters and Unicode

Total characters: 52151
Distinct characters: 52
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 16
Unique (%): 0.2%

Sample

1st row: 운산로
2nd row: 사상로
3rd row: 괘감로
4th row: 가야대로366번길
5th row: 새벽시장로63번길

Value | Count | Frequency (%)
사상로 | 970 | 9.9%
괘감로 | 966 | 9.8%
백양대로 | 677 | 6.9%
가야대로 | 590 | 6.0%
낙동대로 | 468 | 4.8%
새벽로 | 292 | 3.0%
광장로 | 251 | 2.6%
새벽시장로 | 207 | 2.1%
가야대로366번길 | 172 | 1.8%
학감대로 | 172 | 1.8%
Other values (203) | 5049 | 51.4%

Most occurring characters

ValueCountFrequency (%)
9814
18.8%
3963
 
7.6%
3963
 
7.6%
3163
 
6.1%
2101
 
4.0%
2 2098
 
4.0%
2034
 
3.9%
1 1782
 
3.4%
1444
 
2.8%
3 1350
 
2.6%
Other values (42) 20439
39.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 41537 | 79.6%
Decimal Number | 10614 | 20.4%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
9814
23.6%
3963
 
9.5%
3963
 
9.5%
3163
 
7.6%
2101
 
5.1%
2034
 
4.9%
1444
 
3.5%
1306
 
3.1%
1213
 
2.9%
1196
 
2.9%
Other values (32) 11340
27.3%
Decimal Number
ValueCountFrequency (%)
2 2098
19.8%
1 1782
16.8%
3 1350
12.7%
6 1098
10.3%
0 961
9.1%
4 869
8.2%
7 728
 
6.9%
8 638
 
6.0%
5 621
 
5.9%
9 469
 
4.4%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 41537 | 79.6%
Common | 10614 | 20.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
9814
23.6%
3963
 
9.5%
3963
 
9.5%
3163
 
7.6%
2101
 
5.1%
2034
 
4.9%
1444
 
3.5%
1306
 
3.1%
1213
 
2.9%
1196
 
2.9%
Other values (32) 11340
27.3%
Common
ValueCountFrequency (%)
2 2098
19.8%
1 1782
16.8%
3 1350
12.7%
6 1098
10.3%
0 961
9.1%
4 869
8.2%
7 728
 
6.9%
8 638
 
6.0%
5 621
 
5.9%
9 469
 
4.4%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 41537 | 79.6%
ASCII | 10614 | 20.4%

Most frequent character per block

Hangul
ValueCountFrequency (%)
9814
23.6%
3963
 
9.5%
3963
 
9.5%
3163
 
7.6%
2101
 
5.1%
2034
 
4.9%
1444
 
3.5%
1306
 
3.1%
1213
 
2.9%
1196
 
2.9%
Other values (32) 11340
27.3%
ASCII
ValueCountFrequency (%)
2 2098
19.8%
1 1782
16.8%
3 1350
12.7%
6 1098
10.3%
0 961
9.1%
4 869
8.2%
7 728
 
6.9%
8 638
 
6.0%
5 621
 
5.9%
9 469
 
4.4%

건물1
Real number (ℝ)

MISSING 

Distinct: 734
Distinct (%): 7.5%
Missing: 181
Missing (%): 1.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 180.27661
Minimum: 1
Maximum: 1562
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 7
Q1: 25
Median: 55
Q3: 226
95-th percentile: 906
Maximum: 1562
Range: 1561
Interquartile range (IQR): 201

Descriptive statistics

Standard deviation: 278.78605
Coefficient of variation (CV): 1.546435
Kurtosis: 6.2315513
Mean: 180.27661
Median Absolute Deviation (MAD): 42
Skewness: 2.4826944
Sum: 1770136
Variance: 77721.664
Monotonicity: Not monotonic
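
Several of the 건물1 statistics above are simple functions of the others. The sketch below re-derives them from the reported figures only. It assumes, consistent with the reported mean, that missing values are excluded before averaging, and the variable names are illustrative.

```python
# Figures copied from the 건물1 statistics above.
n_rows, n_missing = 10_000, 181
total, std = 1_770_136, 278.78605
q1, q3 = 25, 226
minimum, maximum = 1, 1562

n_valid = n_rows - n_missing       # 9,819 non-missing values
mean = total / n_valid             # sum divided by the non-missing count
cv = std / mean                    # coefficient of variation
variance = std ** 2                # variance is the squared standard deviation
iqr = q3 - q1                      # interquartile range
value_range = maximum - minimum

print(f"mean={mean:.5f} cv={cv:.6f} variance={variance:.2f}")
print(f"iqr={iqr} range={value_range}")
```

A CV above 1 together with a median (55) far below the mean (about 180) points to a strongly right-skewed distribution, in line with the reported skewness.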
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
37 | 661 | 6.6%
16 | 188 | 1.9%
10 | 145 | 1.5%
22 | 140 | 1.4%
7 | 139 | 1.4%
14 | 135 | 1.4%
13 | 132 | 1.3%
40 | 128 | 1.3%
20 | 128 | 1.3%
11 | 118 | 1.2%
Other values (724) | 7905 | 79.0%
(Missing) | 181 | 1.8%

Value | Count | Frequency (%)
1 | 50 | 0.5%
2 | 65 | 0.7%
3 | 77 | 0.8%
4 | 68 | 0.7%
5 | 98 | 1.0%
6 | 76 | 0.8%
7 | 139 | 1.4%
8 | 87 | 0.9%
9 | 105 | 1.1%
10 | 145 | 1.5%

Value | Count | Frequency (%)
1562 | 1 | < 0.1%
1558 | 2 | < 0.1%
1556 | 3 | < 0.1%
1554 | 2 | < 0.1%
1552 | 2 | < 0.1%
1548 | 1 | < 0.1%
1542 | 3 | < 0.1%
1540 | 1 | < 0.1%
1538 | 2 | < 0.1%
1536 | 2 | < 0.1%

건물2
Real number (ℝ)

MISSING  ZEROS 

Distinct: 40
Distinct (%): 1.0%
Missing: 6072
Missing (%): 60.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0730652
Minimum: 0
Maximum: 91
Zeros: 3093
Zeros (%): 30.9%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 16
Maximum: 91
Range: 91
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 6.7338737
Coefficient of variation (CV): 3.2482692
Kurtosis: 51.837512
Mean: 2.0730652
Median Absolute Deviation (MAD): 0
Skewness: 5.8632764
Sum: 8143
Variance: 45.345055
Monotonicity: Not monotonic
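
The zero-heavy quantiles above are easier to read once zeros are expressed as a share of the non-missing values rather than of all rows. A small sketch using only the figures reported for 건물2; it assumes, consistent with the reported mean, that missing values are dropped before the statistics are computed:

```python
# Figures copied from the 건물2 statistics above.
n_rows, n_missing, n_zeros, total = 10_000, 6_072, 3_093, 8_143

n_valid = n_rows - n_missing                 # 3,928 non-missing values
mean = total / n_valid                       # matches the reported mean
zero_share_all = 100 * n_zeros / n_rows      # zeros as a share of all rows
zero_share_valid = 100 * n_zeros / n_valid   # zeros as a share of non-missing values

print(f"non-missing values: {n_valid}")
print(f"mean: {mean:.7f}")
print(f"zeros among all rows: {zero_share_all:.1f}%")
print(f"zeros among non-missing values: {zero_share_valid:.1f}%")
```

Because more than 75% of the non-missing values are zero, Q1, the median and Q3 all equal 0, which is why the IQR and MAD are 0 as well.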
Histogram with fixed size bins (bins=40)

Value | Count | Frequency (%)
0 | 3093 | 30.9%
1 | 275 | 2.8%
16 | 49 | 0.5%
7 | 40 | 0.4%
8 | 36 | 0.4%
3 | 34 | 0.3%
10 | 32 | 0.3%
19 | 31 | 0.3%
2 | 30 | 0.3%
6 | 29 | 0.3%
Other values (30) | 279 | 2.8%
(Missing) | 6072 | 60.7%

Value | Count | Frequency (%)
0 | 3093 | 30.9%
1 | 275 | 2.8%
2 | 30 | 0.3%
3 | 34 | 0.3%
4 | 27 | 0.3%
5 | 23 | 0.2%
6 | 29 | 0.3%
7 | 40 | 0.4%
8 | 36 | 0.4%
9 | 20 | 0.2%

Value | Count | Frequency (%)
91 | 2 | < 0.1%
89 | 3 | < 0.1%
77 | 1 | < 0.1%
73 | 1 | < 0.1%
51 | 5 | 0.1%
42 | 1 | < 0.1%
41 | 3 | < 0.1%
36 | 9 | 0.1%
33 | 2 | < 0.1%
32 | 5 | 0.1%

광고물종류
Categorical

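Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
가로형간판: 4396
돌출간판: 3029
가로형간판_입체형: 1759
지주이용 간판: 609
세로형간판: 96
Other values (2): 111

Length

Max length: 9
Median length: 7
Mean length: 5.518
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 가로형간판
2nd row: 가로형간판_입체형
3rd row: 가로형간판
4th row: 돌출간판
5th row: 가로형간판_입체형

Common Values

Value | Count | Frequency (%)
가로형간판 (horizontal signboard) | 4396 | 44.0%
돌출간판 (projecting signboard) | 3029 | 30.3%
가로형간판_입체형 (horizontal signboard, three-dimensional type) | 1759 | 17.6%
지주이용 간판 (pole-mounted signboard) | 609 | 6.1%
세로형간판 (vertical signboard) | 96 | 1.0%
옥상간판 (rooftop signboard) | 78 | 0.8%
현수막게시틀 (banner display frame) | 33 | 0.3%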

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
가로형간판 | 4396 | 41.4%
돌출간판 | 3029 | 28.6%
가로형간판_입체형 | 1759 | 16.6%
지주이용 | 609 | 5.7%
간판 | 609 | 5.7%
세로형간판 | 96 | 0.9%
옥상간판 | 78 | 0.7%
현수막게시틀 | 33 | 0.3%

Interactions


Correlations

Variable | 건물1 | 건물2 | 광고물종류
건물1 | 1.000 | 0.226 | 0.135
건물2 | 0.226 | 1.000 | 0.000
광고물종류 | 0.135 | 0.000 | 1.000

Variable | 건물1 | 건물2 | 광고물종류
건물1 | 1.000 | 0.175 | 0.068
건물2 | 0.175 | 1.000 | 0.000
광고물종류 | 0.068 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
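
One common way to compute such a nullity correlation is to turn each column into a present/missing indicator and take the Pearson correlation between the indicators. The sketch below does this on hypothetical toy rows that are not taken from the dataset. The helper names `pearson` and `nullity` are illustrative, and the approach is an assumption about the heatmap rather than a confirmed description of the tool's implementation.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

# Hypothetical toy rows (None marks a missing value); not real records.
rows = [
    {"도로명": "사상로", "건물1": 123, "건물2": 0},
    {"도로명": None, "건물1": None, "건물2": None},
    {"도로명": "괘감로", "건물1": 37, "건물2": None},
    {"도로명": None, "건물1": None, "건물2": None},
    {"도로명": "광장로", "건물1": 81, "건물2": None},
    {"도로명": "운산로", "건물1": 88, "건물2": 2},
]

def nullity(column):
    """Indicator per row: 1 where the value is present, 0 where it is missing."""
    return [0 if row[column] is None else 1 for row in rows]

r_road_b1 = pearson(nullity("도로명"), nullity("건물1"))
r_road_b2 = pearson(nullity("도로명"), nullity("건물2"))
print(f"도로명 vs 건물1: {r_road_b1:.2f}")
print(f"도로명 vs 건물2: {r_road_b2:.2f}")
```

In the toy data, 도로명 and 건물1 are always missing together, so their nullity correlation is 1, while 건물2 is also missing in rows where 도로명 is present, which lowers the correlation.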

Sample

업소명 | 시도 | 구군 | 도로명 | 건물1 | 건물2 | 광고물종류
3649롤로토바코 부산괘법점부산광역시사상구운산로88<NA>가로형간판
26444대게마당동해부산광역시사상구사상로1230가로형간판_입체형
3085대보실업부산광역시사상구괘감로37<NA>가로형간판
23560닭도리부산광역시사상구가야대로366번길140돌출간판
16971오리마당부산광역시사상구새벽시장로63번길7<NA>가로형간판_입체형
11551대박부동산부산광역시사상구백양대로342번길17<NA>가로형간판
24276세일전기부산광역시사상구낙동대로9100가로형간판
22918파크랜드부산광역시사상구가야대로2850가로형간판_입체형
27998이륙기계부산광역시사상구학감대로222번길200가로형간판
18574황금어장부산광역시사상구주례로10번길117<NA>가로형간판
업소명 | 시도 | 구군 | 도로명 | 건물1 | 건물2 | 광고물종류
12616마나보다만화카페부산광역시사상구사상로200<NA>옥상간판
4085대원엔지니어링부산광역시사상구백양대로662<NA>가로형간판
27200대한후렉시블부산광역시사상구광장로20번길400가로형간판
9889샤트렌부산광역시사상구백양대로927<NA>가로형간판_입체형
24448칼라모텔부산광역시사상구낙동대로10500가로형간판_입체형
1779예현테크부산광역시사상구괘감로37<NA>가로형간판
12211천냥시대부산광역시사상구사상로250번길11<NA>돌출간판
21193뉴그린해어필부산광역시사상구백양대로9580돌출간판
25127한우리종합물류부산광역시사상구새벽시장로29번길280돌출간판
8511한국노동조합총연맹부산광역시사상구광장로34<NA>가로형간판

Duplicate rows

Most frequently occurring

업소명 | 시도 | 구군 | 도로명 | 건물1 | 건물2 | 광고물종류 | # duplicates
263명천대중사우나부산광역시사상구새벽로137번길400돌출간판5
531장한상사부산광역시사상구사상로53<NA>가로형간판5
614팔레스당구장부산광역시사상구광장로81<NA>가로형간판5
45FINE부산광역시사상구광장로10<NA>가로형간판_입체형4
84TOP DVD부산광역시사상구광장로81<NA>가로형간판4
88WG미용실부산광역시사상구사상로2481가로형간판4
118건축창호자재부산광역시사상구가야대로1210지주이용 간판4
141국제전자부산광역시사상구광장로20번길5817가로형간판4
267모모스테이크부산광역시사상구사상로212<NA>가로형간판4
297반찬1번지부산광역시사상구괘감로1021돌출간판4
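
A "# duplicates" column like the one above can be derived by counting identical rows. The sketch below shows one way to do that with the standard library, on hypothetical toy rows that are not taken from the dataset. It is an illustration of the idea, not the profiling tool's own implementation.

```python
from collections import Counter

# Hypothetical toy rows (업소명, 도로명, 광고물종류); not real records.
rows = [
    ("가게A", "사상로", "돌출간판"),
    ("가게A", "사상로", "돌출간판"),
    ("가게B", "광장로", "가로형간판"),
    ("가게A", "사상로", "돌출간판"),
    ("가게C", "괘감로", "가로형간판"),
    ("가게B", "광장로", "가로형간판"),
]

counts = Counter(rows)
# Keep only rows that occur more than once, most frequent first.
duplicates = [(row, n) for row, n in counts.most_common() if n > 1]

for row, n in duplicates:
    print(*row, n)
```

Rows must match on every column to count as duplicates, so the same business name on a different road would be a separate entry.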