Overview

Dataset statistics

Number of variables: 5
Number of observations: 291
Missing cells: 10
Missing cells (%): 0.7%
Duplicate rows: 1
Duplicate rows (%): 0.3%
Total size in memory: 12.1 KiB
Average record size in memory: 42.5 B
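
The percentages in this block follow from the raw counts. The short sketch below recomputes them, assuming that "Missing cells (%)" is taken over all cells (variables × observations) and "Duplicate rows (%)" over all observations; the variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 5
n_observations = 291
missing_cells = 10
duplicate_rows = 1

# Assumption: the missing-cell share is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"duplicate rows: {duplicate_pct:.1f}%")
```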

Variable types

Numeric: 1
Categorical: 3
Text: 1

Dataset

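Description: Registers the status of street litter bins installed by the five autonomous districts of Gwangju Metropolitan City in the public data catalog. Registered by the Gwangju Metropolitan City Resource Circulation Division (installation place, installation location, etc.).
URL: https://www.data.go.kr/data/15056448/fileData.do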

Alerts

Dataset has 1 (0.3%) duplicate rows [Duplicates]
데이터기준일자 is highly overall correlated with 연번 and 2 other fields [High correlation]
설치개수 is highly overall correlated with 데이터기준일자 [High correlation]
자치구명 is highly overall correlated with 연번 and 1 other fields [High correlation]
연번 is highly overall correlated with 자치구명 and 1 other fields [High correlation]
설치개수 is highly imbalanced (62.9%) [Imbalance]
데이터기준일자 is highly imbalanced (87.5%) [Imbalance]
연번 has 5 (1.7%) missing values [Missing]
설치장소 has 5 (1.7%) missing values [Missing]
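
The two imbalance percentages are consistent with a normalized-entropy score, 1 − H / log2(k), where H is the Shannon entropy of the category frequencies and k is the number of categories. That formula is an assumption about how the profiler scores imbalance, not a statement of its implementation, and `imbalance_score` is a hypothetical helper. The sketch applies it to the counts listed in the Common Values tables of this report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform split, near 1 when one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Counts from the Common Values tables; the "<NA>" label is counted as a category.
installed_counts = [239, 41, 5, 5, 1]   # 설치개수
reference_date_counts = [286, 5]        # 데이터기준일자

print(f"{imbalance_score(installed_counts):.1%}")
print(f"{imbalance_score(reference_date_counts):.1%}")
```

Under this assumption both scores land on the reported values (about 62.9% and 87.5%), which suggests the alerts are driven by the dominant categories "1" and "2023-07-29".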

Reproduction

Analysis started: 2023-12-12 12:30:46.583367
Analysis finished: 2023-12-12 12:30:47.843524
Duration: 1.26 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 286
Distinct (%): 100.0%
Missing: 5
Missing (%): 1.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 241.07692
Minimum: 1
Maximum: 440
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 15.25
Q1: 87.25
Median: 289.5
Q3: 368.75
95-th percentile: 425.75
Maximum: 440
Range: 439
Interquartile range (IQR): 281.5

Descriptive statistics

Standard deviation: 145.20674
Coefficient of variation (CV): 0.60232534
Kurtosis: -1.4408933
Mean: 241.07692
Median Absolute Deviation (MAD): 110
Skewness: -0.33863051
Sum: 68948
Variance: 21084.998
Monotonicity: Strictly increasing
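
Several of the figures reported for 연번 are derived from one another. As a quick consistency check, the sketch below recomputes the mean, coefficient of variation, variance, IQR and range from the other reported values; the variable names are illustrative, and the non-missing count is taken as observations minus missing values.

```python
# Figures copied from the statistics reported for 연번.
total = 68948             # Sum
non_missing = 291 - 5     # observations minus missing values
std_dev = 145.20674       # Standard deviation
q1, q3 = 87.25, 368.75    # Q1 and Q3
minimum, maximum = 1, 440

mean = total / non_missing       # Mean = Sum / non-missing count
cv = std_dev / mean              # Coefficient of variation = std / mean
variance = std_dev ** 2          # Variance = std squared
iqr = q3 - q1                    # Interquartile range
value_range = maximum - minimum  # Range

print(f"mean={mean:.5f} cv={cv:.5f} variance={variance:.3f}")
print(f"iqr={iqr} range={value_range}")
```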
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
336 | 1 | 0.3%
342 | 1 | 0.3%
341 | 1 | 0.3%
340 | 1 | 0.3%
339 | 1 | 0.3%
338 | 1 | 0.3%
337 | 1 | 0.3%
335 | 1 | 0.3%
344 | 1 | 0.3%
334 | 1 | 0.3%
Other values (276) | 276 | 94.8%
(Missing) | 5 | 1.7%

Value | Count | Frequency (%)
1 | 1 | 0.3%
2 | 1 | 0.3%
3 | 1 | 0.3%
4 | 1 | 0.3%
5 | 1 | 0.3%
6 | 1 | 0.3%
7 | 1 | 0.3%
8 | 1 | 0.3%
9 | 1 | 0.3%
10 | 1 | 0.3%

Value | Count | Frequency (%)
440 | 1 | 0.3%
439 | 1 | 0.3%
438 | 1 | 0.3%
437 | 1 | 0.3%
436 | 1 | 0.3%
435 | 1 | 0.3%
434 | 1 | 0.3%
433 | 1 | 0.3%
432 | 1 | 0.3%
431 | 1 | 0.3%

자치구명
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

광주광역시 북구: 92
광주광역시 광산구: 79
광주광역시 서구: 59
광주광역시 동구: 43
광주광역시 남구: 13

Length

Max length: 9
Median length: 8
Mean length: 8.2027491
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 광주광역시 동구
2nd row: 광주광역시 동구
3rd row: 광주광역시 동구
4th row: 광주광역시 동구
5th row: 광주광역시 동구

Common Values

Value | Count | Frequency (%)
광주광역시 북구 | 92 | 31.6%
광주광역시 광산구 | 79 | 27.1%
광주광역시 서구 | 59 | 20.3%
광주광역시 동구 | 43 | 14.8%
광주광역시 남구 | 13 | 4.5%
<NA> | 5 | 1.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
광주광역시 | 286 | 49.6%
북구 | 92 | 15.9%
광산구 | 79 | 13.7%
서구 | 59 | 10.2%
동구 | 43 | 7.5%
남구 | 13 | 2.3%
na | 5 | 0.9%
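
The word-level table above can be reproduced from the category counts by splitting each value into words. The sketch assumes a simple tokenization (lowercase, then runs of word characters), which would explain why the `<NA>` label shows up as `na`; the profiler's actual tokenizer may differ.

```python
import re
from collections import Counter

# Category counts copied from the Common Values table of 자치구명.
value_counts = {
    "광주광역시 북구": 92,
    "광주광역시 광산구": 79,
    "광주광역시 서구": 59,
    "광주광역시 동구": 43,
    "광주광역시 남구": 13,
    "<NA>": 5,
}

word_counts = Counter()
for value, count in value_counts.items():
    # Assumed tokenization: lowercase, then keep runs of word characters.
    for word in re.findall(r"\w+", value.lower()):
        word_counts[word] += count

total_words = sum(word_counts.values())
for word, count in word_counts.most_common():
    print(f"{word} {count} {count / total_words:.1%}")
```

Because every real value carries the prefix 광주광역시, that word alone accounts for about half of all tokens.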

설치장소
Text

MISSING 

Distinct: 286
Distinct (%): 100.0%
Missing: 5
Missing (%): 1.7%
Memory size: 2.4 KiB

Length

Max length: 28
Median length: 20
Mean length: 11.265734
Min length: 3

Characters and Unicode

Total characters: 3222
Distinct characters: 306
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
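
As an illustration of those character properties, the sketch below uses Python's standard `unicodedata` module to count general categories in two of the sample values of 설치장소. It covers categories only; scripts and blocks are not exposed by the standard library. This is a minimal example, not the profiler's own code.

```python
import unicodedata
from collections import Counter

# Two sample values taken from the Sample rows of 설치장소.
samples = ["금남공원(지하도 입구)", "YMCA 앞"]

# unicodedata.category returns a two-letter general category code, e.g.
# "Lo" (Other Letter), "Lu" (Uppercase Letter), "Zs" (Space Separator),
# "Ps" / "Pe" (Open / Close Punctuation).
category_counts = Counter(
    unicodedata.category(ch)
    for text in samples
    # Normalize to NFC so each Hangul syllable is a single code point.
    for ch in unicodedata.normalize("NFC", text)
)

for code, count in category_counts.most_common():
    print(code, count)
```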

Unique

Unique: 286
Unique (%): 100.0%

Sample

1st row: 금남공원(지하도 입구)
2nd row: 삼성생명 앞
3rd row: YMCA 앞
4th row: 농협중앙회 앞
5th row: 금남로4가역 남선빌딩 앞 하나증권
Value | Count | Frequency (%)
[not recovered] | 79 | 12.7%
맞은편 | 19 | 3.1%
버스승강장 | 19 | 3.1%
사거리 | 11 | 1.8%
횡단보도 | 10 | 1.6%
건너편 | 9 | 1.4%
입구 | 9 | 1.4%
방향 | 8 | 1.3%
금호동 | 8 | 1.3%
[not recovered] | 6 | 1.0%
Other values (363) | 443 | 71.3%

Most occurring characters

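Value | Count | Frequency (%)
(space) | 346 | 10.7%
) | 90 | 2.8%
( | 89 | 2.8%
[not recovered] | 88 | 2.7%
[not recovered] | 86 | 2.7%
[not recovered] | 73 | 2.3%
[not recovered] | 51 | 1.6%
[not recovered] | 49 | 1.5%
[not recovered] | 47 | 1.5%
[not recovered] | 44 | 1.4%
Other values (296) | 2259 | 70.1%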

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2608 | 80.9%
Space Separator | 346 | 10.7%
Close Punctuation | 90 | 2.8%
Open Punctuation | 89 | 2.8%
Decimal Number | 53 | 1.6%
Uppercase Letter | 27 | 0.8%
Other Punctuation | 8 | 0.2%
Lowercase Letter | 1 | < 0.1%

Most frequent character per category

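Other Letter
Value | Count | Frequency (%)
[not recovered] | 88 | 3.4%
[not recovered] | 86 | 3.3%
[not recovered] | 73 | 2.8%
[not recovered] | 51 | 2.0%
[not recovered] | 49 | 1.9%
[not recovered] | 47 | 1.8%
[not recovered] | 44 | 1.7%
[not recovered] | 42 | 1.6%
[not recovered] | 42 | 1.6%
[not recovered] | 41 | 1.6%
Other values (265) | 2045 | 78.4%

Uppercase Letter
Value | Count | Frequency (%)
K | 5 | 18.5%
S | 4 | 14.8%
C | 3 | 11.1%
G | 3 | 11.1%
T | 2 | 7.4%
N | 2 | 7.4%
B | 2 | 7.4%
M | 2 | 7.4%
D | 1 | 3.7%
Y | 1 | 3.7%
Other values (2) | 2 | 7.4%

Decimal Number
Value | Count | Frequency (%)
1 | 14 | 26.4%
3 | 10 | 18.9%
2 | 6 | 11.3%
4 | 6 | 11.3%
5 | 5 | 9.4%
0 | 3 | 5.7%
6 | 3 | 5.7%
9 | 3 | 5.7%
8 | 2 | 3.8%
7 | 1 | 1.9%

Other Punctuation
Value | Count | Frequency (%)
· | 2 | 25.0%
& | 2 | 25.0%
. | 2 | 25.0%
/ | 1 | 12.5%
, | 1 | 12.5%

Space Separator
Value | Count | Frequency (%)
(space) | 346 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 90 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 89 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
e | 1 | 100.0%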

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2608 | 80.9%
Common | 586 | 18.2%
Latin | 28 | 0.9%

Most frequent character per script

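Hangul
Value | Count | Frequency (%)
[not recovered] | 88 | 3.4%
[not recovered] | 86 | 3.3%
[not recovered] | 73 | 2.8%
[not recovered] | 51 | 2.0%
[not recovered] | 49 | 1.9%
[not recovered] | 47 | 1.8%
[not recovered] | 44 | 1.7%
[not recovered] | 42 | 1.6%
[not recovered] | 42 | 1.6%
[not recovered] | 41 | 1.6%
Other values (265) | 2045 | 78.4%

Common
Value | Count | Frequency (%)
(space) | 346 | 59.0%
) | 90 | 15.4%
( | 89 | 15.2%
1 | 14 | 2.4%
3 | 10 | 1.7%
2 | 6 | 1.0%
4 | 6 | 1.0%
5 | 5 | 0.9%
0 | 3 | 0.5%
6 | 3 | 0.5%
Other values (8) | 14 | 2.4%

Latin
Value | Count | Frequency (%)
K | 5 | 17.9%
S | 4 | 14.3%
C | 3 | 10.7%
G | 3 | 10.7%
T | 2 | 7.1%
N | 2 | 7.1%
B | 2 | 7.1%
M | 2 | 7.1%
D | 1 | 3.6%
Y | 1 | 3.6%
Other values (3) | 3 | 10.7%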

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2608 | 80.9%
ASCII | 612 | 19.0%
None | 2 | 0.1%

Most frequent character per block

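ASCII
Value | Count | Frequency (%)
(space) | 346 | 56.5%
) | 90 | 14.7%
( | 89 | 14.5%
1 | 14 | 2.3%
3 | 10 | 1.6%
2 | 6 | 1.0%
4 | 6 | 1.0%
K | 5 | 0.8%
5 | 5 | 0.8%
S | 4 | 0.7%
Other values (20) | 37 | 6.0%

Hangul
Value | Count | Frequency (%)
[not recovered] | 88 | 3.4%
[not recovered] | 86 | 3.3%
[not recovered] | 73 | 2.8%
[not recovered] | 51 | 2.0%
[not recovered] | 49 | 1.9%
[not recovered] | 47 | 1.8%
[not recovered] | 44 | 1.7%
[not recovered] | 42 | 1.6%
[not recovered] | 42 | 1.6%
[not recovered] | 41 | 1.6%
Other values (265) | 2045 | 78.4%

None
Value | Count | Frequency (%)
· | 2 | 100.0%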

설치개수
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

1: 239
2: 41
3: 5
<NA>: 5
4: 1

Length

Max length: 4
Median length: 1
Mean length: 1.0515464
Min length: 1

Unique

Unique: 1
Unique (%): 0.3%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 239 | 82.1%
2 | 41 | 14.1%
3 | 5 | 1.7%
<NA> | 5 | 1.7%
4 | 1 | 0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
1 | 239 | 82.1%
2 | 41 | 14.1%
3 | 5 | 1.7%
na | 5 | 1.7%
4 | 1 | 0.3%

데이터기준일자
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

2023-07-29: 286
<NA>: 5

Length

Max length: 10
Median length: 10
Mean length: 9.8969072
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-07-29
2nd row: 2023-07-29
3rd row: 2023-07-29
4th row: 2023-07-29
5th row: 2023-07-29

Common Values

Value | Count | Frequency (%)
2023-07-29 | 286 | 98.3%
<NA> | 5 | 1.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2023-07-29 | 286 | 98.3%
na | 5 | 1.7%

Interactions


Correlations

(variable) | 연번 | 자치구명 | 설치개수
연번 | 1.000 | 0.983 | 0.339
자치구명 | 0.983 | 1.000 | 0.282
설치개수 | 0.339 | 0.282 | 1.000

(variable) | 데이터기준일자 | 설치개수 | 자치구명
데이터기준일자 | 1.000 | 1.000 | 1.000
설치개수 | 1.000 | 1.000 | 0.233
자치구명 | 1.000 | 0.233 | 1.000

(variable) | 연번 | 자치구명 | 설치개수 | 데이터기준일자
연번 | 1.000 | 0.970 | 0.221 | 1.000
자치구명 | 0.970 | 1.000 | 0.233 | 1.000
설치개수 | 0.221 | 0.233 | 1.000 | 1.000
데이터기준일자 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

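(index) | 연번 | 자치구명 | 설치장소 | 설치개수 | 데이터기준일자
0 | 1 | 광주광역시 동구 | 금남공원(지하도 입구) | 1 | 2023-07-29
1 | 2 | 광주광역시 동구 | 삼성생명 앞 | 1 | 2023-07-29
2 | 3 | 광주광역시 동구 | YMCA 앞 | 1 | 2023-07-29
3 | 4 | 광주광역시 동구 | 농협중앙회 앞 | 1 | 2023-07-29
4 | 5 | 광주광역시 동구 | 금남로4가역 남선빌딩 앞 하나증권 | 1 | 2023-07-29
5 | 6 | 광주광역시 동구 | 광주축협 횡단보도 양쪽(NC웨이브)1 | 1 | 2023-07-29
6 | 7 | 광주광역시 동구 | 광주축협 횡단보도 양쪽(NC웨이브)2 | 1 | 2023-07-29
7 | 8 | 광주광역시 동구 | 대신증권 앞 | 1 | 2023-07-29
8 | 9 | 광주광역시 동구 | 대인광장쪽 신정B/D | 1 | 2023-07-29
9 | 10 | 광주광역시 동구 | 전남여고 | 1 | 2023-07-29

(index) | 연번 | 자치구명 | 설치장소 | 설치개수 | 데이터기준일자
281 | 436 | 광주광역시 광산구 | 하남중 앞 | 1 | 2023-07-29
282 | 437 | 광주광역시 광산구 | 호가정정류장(동곡로) | 2 | 2023-07-29
283 | 438 | 광주광역시 광산구 | 휴먼시아3단지 앞 삼거리 | 1 | 2023-07-29
284 | 439 | 광주광역시 광산구 | 흑석사거리 월곡동방향 | 1 | 2023-07-29
285 | 440 | 광주광역시 광산구 | 흑석사거리 홈플러스방향 | 1 | 2023-07-29
286 | <NA> | <NA> | <NA> | <NA> | <NA>
287 | <NA> | <NA> | <NA> | <NA> | <NA>
288 | <NA> | <NA> | <NA> | <NA> | <NA>
289 | <NA> | <NA> | <NA> | <NA> | <NA>
290 | <NA> | <NA> | <NA> | <NA> | <NA>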

Duplicate rows

Most frequently occurring

(index) | 연번 | 자치구명 | 설치장소 | 설치개수 | 데이터기준일자 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | 5
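
The only duplicated record is the fully empty row, which appears five times at the end of the table (indices 286 to 290). One way to handle this before analysis is to drop rows in which every field is empty. The sketch below shows the idea on a hypothetical four-row miniature of the table; `is_blank` is an illustrative helper, and representing empty cells as `None` or `""` is an assumption about how the file is loaded.

```python
COLUMNS = ["연번", "자치구명", "설치장소", "설치개수", "데이터기준일자"]

# Two real rows from the Sample section followed by two fully empty rows,
# standing in for the five empty rows at the end of the file.
rows = [
    [439, "광주광역시 광산구", "흑석사거리 월곡동방향", 1, "2023-07-29"],
    [440, "광주광역시 광산구", "흑석사거리 홈플러스방향", 1, "2023-07-29"],
    [None] * len(COLUMNS),
    [None] * len(COLUMNS),
]


def is_blank(row):
    """True when every cell in the row is empty (None or an empty string)."""
    return all(cell is None or cell == "" for cell in row)


cleaned = [row for row in rows if not is_blank(row)]
dropped = len(rows) - len(cleaned)
print(f"kept {len(cleaned)} rows, dropped {dropped} blank rows")
```

Applied to the full file, the same filter would leave 286 rows. It would likely also remove the `<NA>` category that currently inflates the distinct counts of 자치구명, 설치개수 and 데이터기준일자.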