Overview

Dataset statistics

Number of variables: 5
Number of observations: 472
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 19.0 KiB
Average record size in memory: 41.3 B

Variable types

Numeric: 1
Categorical: 2
Text: 2

Dataset

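Description: 부산광역시사하구_종량제봉투판매소현황_20221026 (Saha-gu, Busan Metropolitan City: status of volume-rate waste bag retailers, 2022-10-26)
Author: 부산광역시 사하구 (Saha-gu, Busan Metropolitan City)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15092857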

Alerts

연번 is highly overall correlated with 행정동 [High correlation]
행정동 is highly overall correlated with 연번 [High correlation]
업종 is highly imbalanced (51.0%) [Imbalance]
연번 has unique values [Unique]

Reproduction

Analysis started: 2023-12-10 17:01:19.296182
Analysis finished: 2023-12-10 17:01:20.224828
Duration: 0.93 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
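
A report of this kind can be regenerated roughly as sketched below. This is a minimal sketch, not the configuration actually used for this run: the function name `build_profile`, the file paths, the report title, and the `cp949` encoding are all assumptions for illustration. Only `pandas.read_csv`, `ProfileReport`, and `ProfileReport.to_file` are library calls.

```python
from pathlib import Path


def build_profile(csv_path: str, html_path: str = "report.html",
                  encoding: str = "cp949") -> Path:
    """Profile a CSV file and write an HTML report; returns the output path."""
    # Third-party imports live inside the function so that defining it does
    # not require pandas or ydata-profiling to be installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the source CSV is CP949-encoded, which is common for Korean
    # public-data files. Pass encoding="utf-8" if that is not the case.
    df = pd.read_csv(csv_path, encoding=encoding)

    # Assumption: the title simply reuses the dataset description shown above.
    report = ProfileReport(df, title="부산광역시사하구_종량제봉투판매소현황_20221026")
    report.to_file(html_path)
    return Path(html_path)
```

Calling `build_profile("data.csv")` with a hypothetical local copy of the dataset would write `report.html` next to the script.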

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 472
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 236.5
Minimum: 1
Maximum: 472
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.3 KiB

Quantile statistics

Minimum: 1
5th percentile: 24.55
Q1: 118.75
Median: 236.5
Q3: 354.25
95th percentile: 448.45
Maximum: 472
Range: 471
Interquartile range (IQR): 235.5

Descriptive statistics

Standard deviation: 136.39892
Coefficient of variation (CV): 0.57673964
Kurtosis: -1.2
Mean: 236.5
Median Absolute Deviation (MAD): 118
Skewness: 0
Sum: 111628
Variance: 18604.667
Monotonicity: Strictly increasing
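
The figures above are what one would expect if 연번 is simply the sequence 1 to 472, which the unique, strictly increasing values with minimum 1 and maximum 472 suggest. The sketch below checks this with the Python standard library only. The assumption that the column has no gaps is mine, and the `inclusive` quantile method is chosen because it reproduces the reported percentiles, not because the report states which method it uses.

```python
import statistics

# Assumption: 연번 is the gap-free sequence 1..472.
values = list(range(1, 473))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
std = statistics.stdev(values)           # square root of the sample variance
cv = std / mean                          # coefficient of variation

# Linear-interpolation quantiles over the closed range of the data.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
twentiles = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = twentiles[0], twentiles[-1]
iqr = q3 - q1

# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, round(variance, 3), round(std, 5), round(cv, 8))
print(p5, q1, median, q3, p95, iqr, mad)
```

For a sequence 1..n the sample variance reduces to n(n+1)/12, which for n = 472 gives 18604.667 and matches the reported value.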
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
1 | 1 | 0.2%
312 | 1 | 0.2%
324 | 1 | 0.2%
323 | 1 | 0.2%
322 | 1 | 0.2%
321 | 1 | 0.2%
320 | 1 | 0.2%
319 | 1 | 0.2%
318 | 1 | 0.2%
317 | 1 | 0.2%
Other values (462) | 462 | 97.9%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 0.2%
2 | 1 | 0.2%
3 | 1 | 0.2%
4 | 1 | 0.2%
5 | 1 | 0.2%
6 | 1 | 0.2%
7 | 1 | 0.2%
8 | 1 | 0.2%
9 | 1 | 0.2%
10 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
472 | 1 | 0.2%
471 | 1 | 0.2%
470 | 1 | 0.2%
469 | 1 | 0.2%
468 | 1 | 0.2%
467 | 1 | 0.2%
466 | 1 | 0.2%
465 | 1 | 0.2%
464 | 1 | 0.2%
463 | 1 | 0.2%

행정동 (administrative dong)
Categorical

HIGH CORRELATION 

Distinct: 16
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.8 KiB

Top categories

하단2동: 49
다대1동: 44
신평1동: 38
장림2동: 38
당리동: 36
Other values (11): 267

Length

Max length: 4
Median length: 4
Mean length: 3.875
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 괴정1동
2nd row: 괴정1동
3rd row: 괴정1동
4th row: 괴정1동
5th row: 괴정1동

Common Values

Value | Count | Frequency (%)
하단2동 | 49 | 10.4%
다대1동 | 44 | 9.3%
신평1동 | 38 | 8.1%
장림2동 | 38 | 8.1%
당리동 | 36 | 7.6%
감천1동 | 35 | 7.4%
하단1동 | 34 | 7.2%
괴정1동 | 30 | 6.4%
괴정3동 | 26 | 5.5%
다대2동 | 25 | 5.3%
Other values (6) | 117 | 24.8%

Length

Histogram of lengths of the category

업소명 (business name)
Text

Distinct: 461
Distinct (%): 97.7%
Missing: 0
Missing (%): 0.0%
Memory size: 3.8 KiB

Length

Max length: 19
Median length: 14
Mean length: 8.3834746
Min length: 3

Characters and Unicode

Total characters: 3957
Distinct characters: 334
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
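
As a rough illustration of how such character properties can be tallied, the sketch below counts general categories and a simplified block label for two business names taken from the sample rows of this report. It uses only the standard library. The helper names are hypothetical, the block test only separates ASCII from the Hangul Syllables range, and nothing here is claimed to be how ydata-profiling computes its own tables.

```python
import unicodedata
from collections import Counter

# Display names for the general-category codes that occur in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter", "Lu": "Uppercase Letter", "Ll": "Lowercase Letter",
    "Nd": "Decimal Number", "Zs": "Space Separator",
    "Ps": "Open Punctuation", "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation", "Po": "Other Punctuation",
}


def category_counts(texts):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for text in texts:
        for ch in text:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


def block_counts(texts):
    """Count characters per block (simplified: ASCII, Hangul Syllables, Other)."""
    counts = Counter()
    for text in texts:
        for ch in text:
            cp = ord(ch)
            if cp < 0x80:
                counts["ASCII"] += 1
            elif 0xAC00 <= cp <= 0xD7AF:
                counts["Hangul"] += 1
            else:
                counts["Other"] += 1
    return counts


sample = ["GS25 괴정큰샘점", "씨유 괴정점"]  # two values from the sample rows
print(category_counts(sample))
print(block_counts(sample))
```

On these two names the Hangul syllables fall under Other Letter, the letters G and S under Uppercase Letter, the digits under Decimal Number, and the spaces under Space Separator, which mirrors the categories listed for this variable.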

Unique

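Unique: 451
Unique (%): 95.6%

Sample

1st row: 뉴민씨상회
2nd row: 동아마트
3rd row: 한신슈퍼
4th row: 세왕마트
5th row: 태화마트

Words

Value | Count | Frequency (%)
gs25 | 81 | 10.4%
씨유 | 68 | 8.7%
세븐일레븐 | 32 | 4.1%
이마트24 | 31 | 4.0%
괴정점 | 12 | 1.5%
장림점 | 9 | 1.2%
다대점 | 6 | 0.8%
탑마트 | 6 | 0.8%
하단점 | 5 | 0.6%
감천점 | 5 | 0.6%
Other values (462) | 525 | 67.3%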
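
The word table for this variable lists `gs25` in lower case although the data contains `GS25`, which suggests that values are lower-cased and split on whitespace before counting. The sketch below applies that assumed tokenisation to the ten business names shown in the first sample rows of this report. The function name `word_counts` is hypothetical, and the exact tokeniser used by the profiler is not confirmed by the report.

```python
from collections import Counter


def word_counts(values):
    """Count lower-cased, whitespace-separated tokens across all values."""
    counts = Counter()
    for value in values:
        counts.update(value.lower().split())
    return counts


# Business names (업소명) from the first ten sample rows of the report.
names = [
    "뉴민씨상회", "동아마트", "한신슈퍼", "세왕마트", "태화마트",
    "씨유 괴정중앙점", "탑에버 괴정점", "GS25 괴정큰샘점",
    "씨유 괴정점", "GS25 괴정뉴코아점",
]

counts = word_counts(names)
for word, count in counts.most_common(3):
    print(word, count)
```

Because chain names and branch names are separate tokens, a single store such as "GS25 괴정큰샘점" contributes to two rows of the word table, which is why the table's counts sum to more than the 472 observations.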

Most occurring characters

ValueCountFrequency (%)
314
 
7.9%
290
 
7.3%
178
 
4.5%
164
 
4.1%
2 120
 
3.0%
89
 
2.2%
5 87
 
2.2%
S 87
 
2.2%
G 84
 
2.1%
77
 
1.9%
Other values (324) 2467
62.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3161 | 79.9%
Space Separator | 314 | 7.9%
Decimal Number | 240 | 6.1%
Uppercase Letter | 197 | 5.0%
Open Punctuation | 17 | 0.4%
Close Punctuation | 17 | 0.4%
Lowercase Letter | 9 | 0.2%
Dash Punctuation | 1 | < 0.1%
Other Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
290
 
9.2%
178
 
5.6%
164
 
5.2%
89
 
2.8%
77
 
2.4%
76
 
2.4%
73
 
2.3%
67
 
2.1%
64
 
2.0%
62
 
2.0%
Other values (291) 2021
63.9%
Uppercase Letter
Value | Count | Frequency (%)
S | 87 | 44.2%
G | 84 | 42.6%
C | 4 | 2.0%
K | 4 | 2.0%
D | 3 | 1.5%
N | 2 | 1.0%
J | 2 | 1.0%
F | 1 | 0.5%
W | 1 | 0.5%
P | 1 | 0.5%
Other values (8) | 8 | 4.1%
Lowercase Letter
Value | Count | Frequency (%)
e | 2 | 22.2%
s | 2 | 22.2%
r | 1 | 11.1%
h | 1 | 11.1%
c | 1 | 11.1%
a | 1 | 11.1%
p | 1 | 11.1%
Decimal Number
Value | Count | Frequency (%)
2 | 120 | 50.0%
5 | 87 | 36.2%
4 | 33 | 13.8%
Space Separator
Value | Count | Frequency (%)
(space) | 314 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 17 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 17 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
, | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3161 | 79.9%
Common | 590 | 14.9%
Latin | 206 | 5.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
290
 
9.2%
178
 
5.6%
164
 
5.2%
89
 
2.8%
77
 
2.4%
76
 
2.4%
73
 
2.3%
67
 
2.1%
64
 
2.0%
62
 
2.0%
Other values (291) 2021
63.9%
Latin
Value | Count | Frequency (%)
S | 87 | 42.2%
G | 84 | 40.8%
C | 4 | 1.9%
K | 4 | 1.9%
D | 3 | 1.5%
N | 2 | 1.0%
e | 2 | 1.0%
J | 2 | 1.0%
s | 2 | 1.0%
F | 1 | 0.5%
Other values (15) | 15 | 7.3%
Common
Value | Count | Frequency (%)
(space) | 314 | 53.2%
2 | 120 | 20.3%
5 | 87 | 14.7%
4 | 33 | 5.6%
( | 17 | 2.9%
) | 17 | 2.9%
- | 1 | 0.2%
, | 1 | 0.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3161 | 79.9%
ASCII | 796 | 20.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 314 | 39.4%
2 | 120 | 15.1%
5 | 87 | 10.9%
S | 87 | 10.9%
G | 84 | 10.6%
4 | 33 | 4.1%
( | 17 | 2.1%
) | 17 | 2.1%
C | 4 | 0.5%
K | 4 | 0.5%
Other values (23) | 29 | 3.6%
Hangul
ValueCountFrequency (%)
290
 
9.2%
178
 
5.6%
164
 
5.2%
89
 
2.8%
77
 
2.4%
76
 
2.4%
73
 
2.3%
67
 
2.1%
64
 
2.0%
62
 
2.0%
Other values (291) 2021
63.9%

위치 (location)
Text

Distinct: 465
Distinct (%): 98.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.8 KiB

Length

Max length: 41
Median length: 35
Mean length: 20.330508
Min length: 15

Characters and Unicode

Total characters: 9596
Distinct characters: 149
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 458
Unique (%): 97.0%

Sample

1st row: 부산광역시 사하구 승학로281번길 78
2nd row: 부산광역시 사하구 괴정로244번길 23
3rd row: 부산광역시 사하구 승학로 204
4th row: 부산광역시 사하구 낙동대로 236
5th row: 부산광역시 사하구 괴정로244번길 57

Words

Value | Count | Frequency (%)
부산광역시 | 472 | 24.3%
사하구 | 472 | 24.3%
낙동대로 | 34 | 1.8%
다대로 | 28 | 1.4%
상가 | 20 | 1.0%
하신번영로 | 16 | 0.8%
하신중앙로 | 14 | 0.7%
옥천로 | 12 | 0.6%
장평로 | 11 | 0.6%
17 | 10 | 0.5%
Other values (477) | 852 | 43.9%

Most occurring characters

ValueCountFrequency (%)
1492
 
15.5%
549
 
5.7%
503
 
5.2%
485
 
5.1%
483
 
5.0%
477
 
5.0%
475
 
4.9%
474
 
4.9%
473
 
4.9%
450
 
4.7%
Other values (139) 3735
38.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 6266 | 65.3%
Decimal Number | 1695 | 17.7%
Space Separator | 1492 | 15.5%
Close Punctuation | 54 | 0.6%
Open Punctuation | 54 | 0.6%
Dash Punctuation | 25 | 0.3%
Other Punctuation | 6 | 0.1%
Uppercase Letter | 4 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
549
 
8.8%
503
 
8.0%
485
 
7.7%
483
 
7.7%
477
 
7.6%
475
 
7.6%
474
 
7.6%
473
 
7.5%
450
 
7.2%
224
 
3.6%
Other values (121) 1673
26.7%
Decimal Number
Value | Count | Frequency (%)
1 | 345 | 20.4%
3 | 219 | 12.9%
2 | 215 | 12.7%
4 | 172 | 10.1%
5 | 161 | 9.5%
7 | 149 | 8.8%
0 | 127 | 7.5%
6 | 110 | 6.5%
9 | 108 | 6.4%
8 | 89 | 5.3%
Uppercase Letter
Value | Count | Frequency (%)
W | 2 | 50.0%
H | 1 | 25.0%
L | 1 | 25.0%
Space Separator
Value | Count | Frequency (%)
(space) | 1492 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 54 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 54 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 25 | 100.0%
Other Punctuation
Value | Count | Frequency (%)
, | 6 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 6266 | 65.3%
Common | 3326 | 34.7%
Latin | 4 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
549
 
8.8%
503
 
8.0%
485
 
7.7%
483
 
7.7%
477
 
7.6%
475
 
7.6%
474
 
7.6%
473
 
7.5%
450
 
7.2%
224
 
3.6%
Other values (121) 1673
26.7%
Common
Value | Count | Frequency (%)
(space) | 1492 | 44.9%
1 | 345 | 10.4%
3 | 219 | 6.6%
2 | 215 | 6.5%
4 | 172 | 5.2%
5 | 161 | 4.8%
7 | 149 | 4.5%
0 | 127 | 3.8%
6 | 110 | 3.3%
9 | 108 | 3.2%
Other values (5) | 228 | 6.9%
Latin
Value | Count | Frequency (%)
W | 2 | 50.0%
H | 1 | 25.0%
L | 1 | 25.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 6266 | 65.3%
ASCII | 3330 | 34.7%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 1492 | 44.8%
1 | 345 | 10.4%
3 | 219 | 6.6%
2 | 215 | 6.5%
4 | 172 | 5.2%
5 | 161 | 4.8%
7 | 149 | 4.5%
0 | 127 | 3.8%
6 | 110 | 3.3%
9 | 108 | 3.2%
Other values (8) | 232 | 7.0%
Hangul
ValueCountFrequency (%)
549
 
8.8%
503
 
8.0%
485
 
7.7%
483
 
7.7%
477
 
7.6%
475
 
7.6%
474
 
7.6%
473
 
7.5%
450
 
7.2%
224
 
3.6%
Other values (121) 1673
26.7%

업종 (business type)
Categorical

IMBALANCE 

Distinct: 12
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.8 KiB

Top categories

편의점: 226
슈퍼: 186
기타: 24
철물: 7
기업형슈퍼: 7
Other values (7): 22

Length

Max length: 5
Median length: 4
Mean length: 2.5508475
Min length: 2

Unique

Unique: 3
Unique (%): 0.6%

Sample

1st row: 슈퍼
2nd row: 슈퍼
3rd row: 슈퍼
4th row: 슈퍼
5th row: 슈퍼

Common Values

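Value | Count | Frequency (%)
편의점 (convenience store) | 226 | 47.9%
슈퍼 (supermarket) | 186 | 39.4%
기타 (other) | 24 | 5.1%
철물 (hardware) | 7 | 1.5%
기업형슈퍼 (corporate-chain supermarket) | 7 | 1.5%
대형 (large store) | 7 | 1.5%
대형슈퍼 (large supermarket) | 6 | 1.3%
문구 (stationery) | 4 | 0.8%
부식 (groceries) | 2 | 0.4%
철물점 (hardware store) | 1 | 0.2%
Other values (2) | 2 | 0.4%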
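
The IMBALANCE flag on this variable reports 51.0%. That figure is consistent with a score of one minus the Shannon entropy of the category frequencies divided by the maximum possible entropy for 12 categories. The sketch below recomputes it from the counts in the table above. It is an illustration under assumptions: the function name `imbalance_score` is hypothetical, and the two categories hidden under "Other values (2)" are taken to have one observation each, which follows from the three unique values reported for this variable.

```python
import math

# Category counts from the Common Values table. The last two entries are the
# "Other values (2)" row, assumed to be two categories with one row each.
counts = [226, 186, 24, 7, 7, 7, 6, 4, 2, 1, 1, 1]


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the counts."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes <= 1:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    # 0 means perfectly balanced classes, 1 means all mass in one class.
    return 1.0 - entropy / math.log(n_classes)


score = imbalance_score(counts)
print(f"{score:.1%}")
```

The two largest categories, 편의점 and 슈퍼, account for 412 of the 472 rows, which is what pushes the score to roughly one half despite there being 12 distinct values.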

Length

Histogram of lengths of the category

Interactions


Correlations

Variable | 연번 | 행정동 | 업종
연번 | 1.000 | 0.973 | 0.232
행정동 | 0.973 | 1.000 | 0.215
업종 | 0.232 | 0.215 | 1.000

Variable | 업종 | 행정동
업종 | 1.000 | 0.077
행정동 | 0.077 | 1.000

Variable | 연번 | 행정동 | 업종
연번 | 1.000 | 0.866 | 0.099
행정동 | 0.866 | 1.000 | 0.077
업종 | 0.099 | 0.077 | 1.000

Missing values

Count: a simple visualization of nullity by column.
Matrix: the nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

(index) | 연번 | 행정동 | 업소명 | 위치 | 업종
0 | 1 | 괴정1동 | 뉴민씨상회 | 부산광역시 사하구 승학로281번길 78 | 슈퍼
1 | 2 | 괴정1동 | 동아마트 | 부산광역시 사하구 괴정로244번길 23 | 슈퍼
2 | 3 | 괴정1동 | 한신슈퍼 | 부산광역시 사하구 승학로 204 | 슈퍼
3 | 4 | 괴정1동 | 세왕마트 | 부산광역시 사하구 낙동대로 236 | 슈퍼
4 | 5 | 괴정1동 | 태화마트 | 부산광역시 사하구 괴정로244번길 57 | 슈퍼
5 | 6 | 괴정1동 | 씨유 괴정중앙점 | 부산광역시 사하구 낙동대로 247 | 편의점
6 | 7 | 괴정1동 | 탑에버 괴정점 | 부산광역시 사하구 사하로141번길 37 | 슈퍼
7 | 8 | 괴정1동 | GS25 괴정큰샘점 | 부산광역시 사하구 낙동대로233번길 19 | 편의점
8 | 9 | 괴정1동 | 씨유 괴정점 | 부산광역시 사하구 괴정로 272 | 편의점
9 | 10 | 괴정1동 | GS25 괴정뉴코아점 | 부산광역시 사하구 사하로 199-1 | 편의점

Last rows

(index) | 연번 | 행정동 | 업소명 | 위치 | 업종
462 | 463 | 감천2동 | 미광슈퍼 | 부산광역시 사하구 옥천로 104 | 슈퍼
463 | 464 | 감천2동 | 한일마트 | 부산광역시 사하구 옥천로 62-2 | 슈퍼
464 | 465 | 감천2동 | 천복마트 감천점 | 부산광역시 사하구 옥천로 125 | 슈퍼
465 | 466 | 감천2동 | 문화마을할인마트 | 부산광역시 사하구 감내2로 202 | 슈퍼
466 | 467 | 감천2동 | 동남마트 감천시장점 | 부산광역시 사하구 옥천로75번길 17 | 슈퍼
467 | 468 | 감천2동 | 신마트 감천점 | 부산광역시 사하구 옥천로 33 | 슈퍼
468 | 469 | 감천2동 | 감천문화마을 주민협의회 | 부산광역시 사하구 감내2로 177-1 | 기타
469 | 470 | 감천2동 | 킹마트 | 부산광역시 사하구 옥천로 36-1 | 슈퍼
470 | 471 | 감천2동 | 세븐일레븐 부산감천자유점 | 부산광역시 사하구 감천로105번길 55 | 편의점
471 | 472 | 감천2동 | 씨유 감천문화마을점 | 부산광역시 사하구 옥천로 43 | 편의점