Overview

Dataset statistics

Number of variables: 4
Number of observations: 235
Missing cells: 96
Missing cells (%): 10.2%
Duplicate rows: 17
Duplicate rows (%): 7.2%
Total size in memory: 7.7 KiB
Average record size in memory: 33.6 B
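
The percentages in these statistics follow from the counts. A minimal Python sketch, assuming the missing-cell percentage is taken over all cells (observations × variables) and the duplicate percentage over observations; the helper name `pct` is illustrative and not part of ydata-profiling:

```python
def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


n_vars, n_obs = 4, 235
missing_cells, duplicate_rows = 96, 17

missing_pct = pct(missing_cells, n_vars * n_obs)  # 96 of 940 cells
duplicate_pct = pct(duplicate_rows, n_obs)        # 17 of 235 rows

print(missing_pct, duplicate_pct)
```

The same helper reproduces the per-column figure in the Alerts section: `pct(96, 235)` gives the 40.9% missing rate reported for 비고.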

Variable types

Text: 2
Numeric: 1
Categorical: 1

Dataset

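Description: Installation status of CCTV cameras for monitoring illegal dumping in Yangsan-si, Gyeongsangnam-do. The fields are installation address (설치주소), installation year (설치년도), installation type (설치형태), and remarks/location (비고(위치)).
Author: Yangsan-si, Gyeongsangnam-do (경상남도 양산시)
URL: https://www.data.go.kr/data/15104448/fileData.do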

Alerts

Duplicates: Dataset has 17 (7.2%) duplicate rows
High correlation: 설치년도 is highly overall correlated with 설치형태
High correlation: 설치형태 is highly overall correlated with 설치년도
Missing: 비고 has 96 (40.9%) missing values

Reproduction

Analysis started: 2023-12-12 13:42:53.691959
Analysis finished: 2023-12-12 13:42:54.149669
Duration: 0.46 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

주소 (address)
Text

Distinct: 190
Distinct (%): 80.9%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB

Length

Max length: 24
Median length: 22
Mean length: 19.914894
Min length: 15

Characters and Unicode

Total characters: 4680
Distinct characters: 78
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
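
As an illustration of these properties, the sketch below tallies Unicode general categories for the first sample address of this variable using Python's standard `unicodedata` module. It covers categories only (scripts and blocks are not exposed by `unicodedata`), and it is not necessarily how ydata-profiling computes its own counts.

```python
import unicodedata
from collections import Counter

address = "경상남도 양산시 상북면 석계리 244-8"  # 1st sample row of 주소

# Map each character to its two-letter general category, e.g. "Lo" = Other Letter.
categories = Counter(unicodedata.category(ch) for ch in address)

for code, count in categories.most_common():
    print(code, count)
```

The codes `Lo`, `Nd`, `Zs` and `Pd` correspond to the Other Letter, Decimal Number, Space Separator and Dash Punctuation categories reported for this variable.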

Unique

Unique: 160
Unique (%): 68.1%

Sample

1st row: 경상남도 양산시 상북면 석계리 244-8
2nd row: 경상남도 양산시 상북면 석계리 74-1
3rd row: 경상남도 양산시 덕계동 116-2
4th row: 경상남도 양산시 덕계동 1185
5th row: 경상남도 양산시 덕계동 1218
Value | Count | Frequency (%)
경상남도 | 235 | 22.0%
양산시 | 235 | 22.0%
원동면 | 33 | 3.1%
상북면 | 30 | 2.8%
하북면 | 28 | 2.6%
물금읍 | 24 | 2.3%
삼호동 | 23 | 2.2%
범어리 | 16 | 1.5%
중부동 | 14 | 1.3%
동면 | 11 | 1.0%
Other values (233) | 417 | 39.1%

Most occurring characters

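Value | Count | Frequency (%)
(space) | 833 | 17.8%
(not shown) | 269 | 5.7%
(not shown) | 260 | 5.6%
(not shown) | 243 | 5.2%
(not shown) | 235 | 5.0%
(not shown) | 235 | 5.0%
(not shown) | 235 | 5.0%
(not shown) | 235 | 5.0%
- | 184 | 3.9%
1 | 166 | 3.5%
Other values (68) | 1785 | 38.1%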

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2707 | 57.8%
Decimal Number | 956 | 20.4%
Space Separator | 833 | 17.8%
Dash Punctuation | 184 | 3.9%

Most frequent character per category

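Other Letter
Value | Count | Frequency (%)
(not shown) | 269 | 9.9%
(not shown) | 260 | 9.6%
(not shown) | 243 | 9.0%
(not shown) | 235 | 8.7%
(not shown) | 235 | 8.7%
(not shown) | 235 | 8.7%
(not shown) | 235 | 8.7%
(not shown) | 153 | 5.7%
(not shown) | 123 | 4.5%
(not shown) | 102 | 3.8%
Other values (56) | 617 | 22.8%

Decimal Number
Value | Count | Frequency (%)
1 | 166 | 17.4%
2 | 128 | 13.4%
5 | 113 | 11.8%
4 | 109 | 11.4%
6 | 93 | 9.7%
3 | 88 | 9.2%
7 | 83 | 8.7%
8 | 64 | 6.7%
9 | 62 | 6.5%
0 | 50 | 5.2%

Space Separator
Value | Count | Frequency (%)
(space) | 833 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 184 | 100.0%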

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2707 | 57.8%
Common | 1973 | 42.2%

Most frequent character per script

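Hangul
Value | Count | Frequency (%)
(not shown) | 269 | 9.9%
(not shown) | 260 | 9.6%
(not shown) | 243 | 9.0%
(not shown) | 235 | 8.7%
(not shown) | 235 | 8.7%
(not shown) | 235 | 8.7%
(not shown) | 235 | 8.7%
(not shown) | 153 | 5.7%
(not shown) | 123 | 4.5%
(not shown) | 102 | 3.8%
Other values (56) | 617 | 22.8%

Common
Value | Count | Frequency (%)
(space) | 833 | 42.2%
- | 184 | 9.3%
1 | 166 | 8.4%
2 | 128 | 6.5%
5 | 113 | 5.7%
4 | 109 | 5.5%
6 | 93 | 4.7%
3 | 88 | 4.5%
7 | 83 | 4.2%
8 | 64 | 3.2%
Other values (2) | 112 | 5.7%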

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2707 | 57.8%
ASCII | 1973 | 42.2%

Most frequent character per block

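ASCII
Value | Count | Frequency (%)
(space) | 833 | 42.2%
- | 184 | 9.3%
1 | 166 | 8.4%
2 | 128 | 6.5%
5 | 113 | 5.7%
4 | 109 | 5.5%
6 | 93 | 4.7%
3 | 88 | 4.5%
7 | 83 | 4.2%
8 | 64 | 3.2%
Other values (2) | 112 | 5.7%

Hangul
Value | Count | Frequency (%)
(not shown) | 269 | 9.9%
(not shown) | 260 | 9.6%
(not shown) | 243 | 9.0%
(not shown) | 235 | 8.7%
(not shown) | 235 | 8.7%
(not shown) | 235 | 8.7%
(not shown) | 235 | 8.7%
(not shown) | 153 | 5.7%
(not shown) | 123 | 4.5%
(not shown) | 102 | 3.8%
Other values (56) | 617 | 22.8%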

설치년도 (installation year)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 16
Distinct (%): 6.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2016.3021
Minimum: 2003
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.2 KiB

Quantile statistics

Minimum: 2003
5th percentile: 2004
Q1: 2015
Median: 2018
Q3: 2020.5
95th percentile: 2022
Maximum: 2022
Range: 19
Interquartile range (IQR): 5.5

Descriptive statistics

Standard deviation: 5.9684172
Coefficient of variation (CV): 0.0029600808
Kurtosis: 0.075094955
Mean: 2016.3021
Median Absolute Deviation (MAD): 3
Skewness: -1.2285997
Sum: 473831
Variance: 35.622004
Monotonicity: Increasing
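
Several of these figures can be cross-checked against one another. A small sketch, assuming the usual definitions (mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean, IQR = Q3 − Q1, range = maximum − minimum); the variable names are illustrative:

```python
import math

# Values copied from the report for 설치년도.
n = 235
total = 473831
std = 5.9684172
q1, q3 = 2015, 2020.5
minimum, maximum = 2003, 2022

mean = total / n            # sum divided by number of observations
variance = std ** 2         # square of the standard deviation
cv = std / mean             # coefficient of variation
iqr = q3 - q1               # interquartile range
value_range = maximum - minimum

print(f"mean={mean:.4f} variance={variance:.6f} cv={cv:.10f}")
print(f"iqr={iqr} range={value_range}")
```

The very small CV reflects that the values are calendar years: the spread of about six years is tiny relative to a mean near 2016.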
Histogram with fixed size bins (bins=16)
Value | Count | Frequency (%)
2021 | 35 | 14.9%
2020 | 34 | 14.5%
2018 | 33 | 14.0%
2004 | 31 | 13.2%
2022 | 24 | 10.2%
2019 | 23 | 9.8%
2016 | 14 | 6.0%
2017 | 10 | 4.3%
2015 | 7 | 3.0%
2007 | 7 | 3.0%
Other values (6) | 17 | 7.2%
Value | Count | Frequency (%)
2003 | 2 | 0.9%
2004 | 31 | 13.2%
2005 | 1 | 0.4%
2007 | 7 | 3.0%
2009 | 3 | 1.3%
2012 | 1 | 0.4%
2013 | 5 | 2.1%
2014 | 5 | 2.1%
2015 | 7 | 3.0%
2016 | 14 | 6.0%
Value | Count | Frequency (%)
2022 | 24 | 10.2%
2021 | 35 | 14.9%
2020 | 34 | 14.5%
2019 | 23 | 9.8%
2018 | 33 | 14.0%
2017 | 10 | 4.3%
2016 | 14 | 6.0%
2015 | 7 | 3.0%
2014 | 5 | 2.1%
2013 | 5 | 2.1%

설치형태 (installation type)
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB

Value | Count
<NA> | 93
실물 | 80
모형 | 35
이동식 | 22
음성 | 5

Length

Max length: 4
Median length: 2
Mean length: 2.8851064
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 모형
2nd row: 모형
3rd row: 실물
4th row: 모형
5th row: 모형

Common Values

Value | Count | Frequency (%)
<NA> | 93 | 39.6%
실물 | 80 | 34.0%
모형 | 35 | 14.9%
이동식 | 22 | 9.4%
음성 | 5 | 2.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
<NA> | 93 | 39.6%
실물 | 80 | 34.0%
모형 | 35 | 14.9%
이동식 | 22 | 9.4%
음성 | 5 | 2.1%

비고 (remarks)
Text

MISSING 

Distinct: 122
Distinct (%): 87.8%
Missing: 96
Missing (%): 40.9%
Memory size: 2.0 KiB

Length

Max length: 15
Median length: 11
Mean length: 7.2374101
Min length: 2

Characters and Unicode

Total characters: 1006
Distinct characters: 197
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 118
Unique (%): 84.9%

Sample

1st row: 그린빌라 옆
2nd row: 한솔마트 앞
3rd row: 외산마을회관앞
4th row: 새웅상교회옆
5th row: 해피카정비앞
Value | Count | Frequency (%)
재활용동네마당 | 22 | 9.7%
(not shown) | 20 | 8.8%
일원 | 15 | 6.6%
외부 | 11 | 4.9%
내부 | 11 | 4.9%
(not shown) | 4 | 1.8%
(not shown) | 3 | 1.3%
(not shown) | 3 | 1.3%
외석마을 | 2 | 0.9%
양지마을 | 2 | 0.9%
Other values (120) | 133 | 58.8%

Most occurring characters

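Value | Count | Frequency (%)
(space) | 89 | 8.8%
(not shown) | 65 | 6.5%
(not shown) | 59 | 5.9%
(not shown) | 36 | 3.6%
(not shown) | 33 | 3.3%
(not shown) | 31 | 3.1%
(not shown) | 25 | 2.5%
(not shown) | 24 | 2.4%
(not shown) | 23 | 2.3%
(not shown) | 22 | 2.2%
Other values (187) | 599 | 59.5%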

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 914 | 90.9%
Space Separator | 89 | 8.8%
Decimal Number | 2 | 0.2%
Uppercase Letter | 1 | 0.1%

Most frequent character per category

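Other Letter
Value | Count | Frequency (%)
(not shown) | 65 | 7.1%
(not shown) | 59 | 6.5%
(not shown) | 36 | 3.9%
(not shown) | 33 | 3.6%
(not shown) | 31 | 3.4%
(not shown) | 25 | 2.7%
(not shown) | 24 | 2.6%
(not shown) | 23 | 2.5%
(not shown) | 22 | 2.4%
(not shown) | 22 | 2.4%
Other values (183) | 574 | 62.8%

Decimal Number
Value | Count | Frequency (%)
2 | 1 | 50.0%
5 | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 89 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
B | 1 | 100.0%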

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 914 | 90.9%
Common | 91 | 9.0%
Latin | 1 | 0.1%

Most frequent character per script

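Hangul
Value | Count | Frequency (%)
(not shown) | 65 | 7.1%
(not shown) | 59 | 6.5%
(not shown) | 36 | 3.9%
(not shown) | 33 | 3.6%
(not shown) | 31 | 3.4%
(not shown) | 25 | 2.7%
(not shown) | 24 | 2.6%
(not shown) | 23 | 2.5%
(not shown) | 22 | 2.4%
(not shown) | 22 | 2.4%
Other values (183) | 574 | 62.8%

Common
Value | Count | Frequency (%)
(space) | 89 | 97.8%
2 | 1 | 1.1%
5 | 1 | 1.1%

Latin
Value | Count | Frequency (%)
B | 1 | 100.0%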

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 914 | 90.9%
ASCII | 92 | 9.1%

Most frequent character per block

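ASCII
Value | Count | Frequency (%)
(space) | 89 | 96.7%
2 | 1 | 1.1%
5 | 1 | 1.1%
B | 1 | 1.1%

Hangul
Value | Count | Frequency (%)
(not shown) | 65 | 7.1%
(not shown) | 59 | 6.5%
(not shown) | 36 | 3.9%
(not shown) | 33 | 3.6%
(not shown) | 31 | 3.4%
(not shown) | 25 | 2.7%
(not shown) | 24 | 2.6%
(not shown) | 23 | 2.5%
(not shown) | 22 | 2.4%
(not shown) | 22 | 2.4%
Other values (183) | 574 | 62.8%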

Interactions


Correlations

 | 설치년도 | 설치형태
설치년도 | 1.000 | 0.907
설치형태 | 0.907 | 1.000

 | 설치년도 | 설치형태
설치년도 | 1.000 | 0.823
설치형태 | 0.823 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 주소 | 설치년도 | 설치형태 | 비고
0 | 경상남도 양산시 상북면 석계리 244-8 | 2003 | 모형 | 그린빌라 옆
1 | 경상남도 양산시 상북면 석계리 74-1 | 2003 | 모형 | 한솔마트 앞
2 | 경상남도 양산시 덕계동 116-2 | 2004 | 실물 | 외산마을회관앞
3 | 경상남도 양산시 덕계동 1185 | 2004 | 모형 | 새웅상교회옆
4 | 경상남도 양산시 덕계동 1218 | 2004 | 모형 | 해피카정비앞
5 | 경상남도 양산시 덕계동 1230 | 2004 | 모형 | 덕계중앙교회앞
6 | 경상남도 양산시 덕계동 336 | 2004 | 모형 | 덕계프로스펙스뒤
7 | 경상남도 양산시 덕계동 669-21 | 2004 | 모형 | 덕계우체국뒤
8 | 경상남도 양산시 덕계동 718-81 | 2004 | 모형 | 덕계마을회관앞
9 | 경상남도 양산시 덕계동 736-4 | 2004 | 모형 | 메가마트앞

Index | 주소 | 설치년도 | 설치형태 | 비고
225 | 경상남도 양산시 교동 970-35 | 2022 | <NA> | 일원
226 | 경상남도 양산시 상북면 상삼리 426-2 | 2022 | <NA> | 일원
227 | 경상남도 양산시 상북면 상삼리 426-2 | 2022 | <NA> | 일원
228 | 경상남도 양산시 상북면 상삼리 426-2 | 2022 | <NA> | 일원
229 | 경상남도 양산시 주남동 1076-128 | 2022 | <NA> | 일원
230 | 경상남도 양산시 주남동 1076-128 | 2022 | <NA> | 일원
231 | 경상남도 양산시 주남동 1076-128 | 2022 | <NA> | 일원
232 | 경상남도 양산시 하북면 지산리 392-5 | 2022 | <NA> | 일원
233 | 경상남도 양산시 하북면 지산리 392-5 | 2022 | <NA> | 일원
234 | 경상남도 양산시 하북면 지산리 392-5 | 2022 | <NA> | 일원

Duplicate rows

Most frequently occurring

Index | 주소 | 설치년도 | 설치형태 | 비고 | # duplicates
0 | 경상남도 양산시 교동 970-35 | 2022 | <NA> | 일원 | 3
1 | 경상남도 양산시 동면 여락리 624-2 | 2020 | <NA> | <NA> | 3
4 | 경상남도 양산시 상북면 상삼리 426-2 | 2022 | <NA> | 일원 | 3
5 | 경상남도 양산시 상북면 석계리 1089-1 | 2020 | <NA> | <NA> | 3
6 | 경상남도 양산시 상북면 소토리 406 | 2020 | <NA> | <NA> | 3
7 | 경상남도 양산시 원동면 내포리 125-1 | 2020 | <NA> | <NA> | 3
8 | 경상남도 양산시 원동면 내포리 894 | 2020 | <NA> | <NA> | 3
9 | 경상남도 양산시 원동면 선리 515 | 2020 | <NA> | <NA> | 3
10 | 경상남도 양산시 원동면 용당리 1251-1 | 2020 | <NA> | <NA> | 3
11 | 경상남도 양산시 원동면 화제리 1764 | 2020 | <NA> | <NA> | 3
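
The "# duplicates" column appears to give the number of times an identical row occurs. A minimal sketch of that grouping with Python's standard library, using only the last ten rows shown in the Sample section (indices 225 to 234); it is an illustration under that assumption, not ydata-profiling's own implementation, and because it sees only an excerpt, the 교동 row occurs once here rather than three times:

```python
from collections import Counter

# Rows 225-234 from the Sample section as (주소, 설치년도, 설치형태, 비고) tuples.
rows = (
    [("경상남도 양산시 교동 970-35", 2022, "<NA>", "일원")]
    + [("경상남도 양산시 상북면 상삼리 426-2", 2022, "<NA>", "일원")] * 3
    + [("경상남도 양산시 주남동 1076-128", 2022, "<NA>", "일원")] * 3
    + [("경상남도 양산시 하북면 지산리 392-5", 2022, "<NA>", "일원")] * 3
)

# Count occurrences of each distinct row and keep those seen more than once.
counts = Counter(rows)
duplicate_groups = {row: n for row, n in counts.items() if n > 1}

for row, n in duplicate_groups.items():
    print(row[0], n)
```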