Overview

Dataset statistics

Number of variables: 4
Number of observations: 471
Missing cells: 8
Missing cells (%): 0.4%
Duplicate rows: 1
Duplicate rows (%): 0.2%
Total size in memory: 15.3 KiB
Average record size in memory: 33.3 B
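
The percentage and per-record figures above follow from the raw counts. A minimal sketch of that arithmetic, assuming missing cells are measured against all cells in the table (variables times observations) and that 1 KiB is 1024 bytes:

```python
# Raw counts copied from the dataset statistics above.
n_variables = 4
n_observations = 471
missing_cells = 8
duplicate_rows = 1
total_size_kib = 15.3

# Assumption: missing cells are measured against every cell in the table.
missing_pct = 100 * missing_cells / (n_variables * n_observations)
duplicate_pct = 100 * duplicate_rows / n_observations
# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```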

Variable types

Categorical: 2
Text: 1
Numeric: 1

Dataset

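Description: 자치구명 (district name), 법정동명 (legal dong name), 업태명 (business type name), 업소수 (number of establishments)
Author: 강동구 (Gangdong-gu)
URL: https://data.seoul.go.kr/dataList/OA-10683/S/1/datasetView.do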

Alerts

자치구명 has constant value "강동구" (Constant)
Dataset has 1 (0.2%) duplicate rows (Duplicates)
업태명 has 8 (1.7%) missing values (Missing)
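
The three alerts above correspond to simple checks on the table. A minimal sketch of such checks in plain Python, not the profiler's own implementation; the function name `find_alerts` is hypothetical, the sample rows are made up for illustration, and a duplicate is assumed to be counted each time a row repeats an earlier row:

```python
from collections import Counter

def find_alerts(rows, columns):
    """Return simple data-quality alerts for a list of row tuples."""
    alerts = []
    n = len(rows)
    for i, name in enumerate(columns):
        values = [row[i] for row in rows]
        missing = sum(v is None for v in values)
        present = {v for v in values if v is not None}
        # Constant: exactly one distinct value and nothing missing.
        if len(present) == 1 and missing == 0:
            value = next(iter(present))
            alerts.append(f'{name} has constant value "{value}"')
        if missing:
            alerts.append(f"{name} has {missing} ({100 * missing / n:.1f}%) missing values")
    # Assumption: a row counts as a duplicate each time it repeats an earlier row.
    duplicates = sum(count - 1 for count in Counter(rows).values())
    if duplicates:
        alerts.append(f"Dataset has {duplicates} ({100 * duplicates / n:.1f}%) duplicate rows")
    return alerts

# Tiny made-up sample in the shape of the dataset (not the real data).
columns = ["자치구명", "법정동명", "업태명", "업소수"]
rows = [
    ("강동구", "천호동", "키즈카페", 2),
    ("강동구", "천호동", "키즈카페", 2),
    ("강동구", "명일동", None, 1),
    ("강동구", "길동", "한식", 9),
]
alerts = find_alerts(rows, columns)
for alert in alerts:
    print(alert)
```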

Reproduction

Analysis started: 2024-05-04 04:22:41.953389
Analysis finished: 2024-05-04 04:22:43.181788
Duration: 1.23 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
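
A minimal sketch of how a comparable report could be regenerated with ydata-profiling, assuming the dataset has been downloaded from the URL above as a CSV file. The function name `build_report`, the file names, and the report title are hypothetical, and the default settings used here may differ from the report's own config.json. The file's encoding is not stated in the report, so `pd.read_csv` may need an explicit `encoding` argument.

```python
def build_report(csv_path, html_path="report.html"):
    """Profile a CSV file and write the HTML report to html_path."""
    # Imported lazily so the function can be defined without the
    # third-party packages (pandas, ydata-profiling) being installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    # Hypothetical title; the original report's title is not known.
    report = ProfileReport(df, title="Dataset profile")
    report.to_file(html_path)
    return html_path
```

Calling `build_report("data.csv")` with a local copy of the file (the name `data.csv` is a placeholder) would write `report.html` next to the script.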

Variables

자치구명 (district name)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.8 KiB
강동구: 471

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강동구
2nd row: 강동구
3rd row: 강동구
4th row: 강동구
5th row: 강동구

Common Values

Value    Count    Frequency (%)
강동구   471      100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; 강동구 accounts for all 471 rows (100.0%)]

법정동명 (legal dong name)
Categorical

Distinct: 9
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 3.8 KiB
천호동: 67
성내동: 61
길동: 58
명일동: 55
암사동: 51
Other values (4): 179

Length

Max length: 3
Median length: 3
Mean length: 2.8768577
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 명일동
2nd row: 명일동
3rd row: 명일동
4th row: 명일동
5th row: 명일동

Common Values

Value    Count    Frequency (%)
천호동   67       14.2%
성내동   61       13.0%
길동     58       12.3%
명일동   55       11.7%
암사동   51       10.8%
둔촌동   50       10.6%
고덕동   48       10.2%
상일동   47       10.0%
강일동   34       7.2%
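
The frequencies in this table and the length statistics reported for 법정동명 can be recomputed from the value counts alone. A minimal sketch, assuming each name is stored as precomposed Hangul syllables so that `len` counts one character per syllable:

```python
# Value counts copied from the Common Values table for 법정동명.
counts = {
    "천호동": 67, "성내동": 61, "길동": 58, "명일동": 55, "암사동": 51,
    "둔촌동": 50, "고덕동": 48, "상일동": 47, "강일동": 34,
}
total = sum(counts.values())  # should equal the number of observations

# Frequency (%) of each value, rounded to one decimal place.
frequency = {name: round(100 * n / total, 1) for name, n in counts.items()}

# Mean string length, weighting each name's length by its count.
mean_length = sum(len(name) * n for name, n in counts.items()) / total

print(total)
print(frequency["천호동"])
print(round(mean_length, 7))
```

Only 길동 has two characters, which is why the mean length falls slightly below 3 while the median stays at 3.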

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; counts and frequencies are the same as in the Common Values table above]

업태명 (business type name)
Text

MISSING 

Distinct: 72
Distinct (%): 15.6%
Missing: 8
Missing (%): 1.7%
Memory size: 3.8 KiB

Length

Max length: 15
Median length: 12
Mean length: 5.6760259
Min length: 2

Characters and Unicode

Total characters: 2628
Distinct characters: 151
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
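
As an illustration of those character properties, Python's standard `unicodedata` module reports the general category of each code point. A minimal sketch using three 업태명 values that appear in this report's sample rows; because the standard library exposes no script property, the character name is used here as a rough stand-in for detecting Hangul, which is an assumption of this sketch rather than how the report was computed:

```python
import unicodedata
from collections import Counter

# Three 업태명 values taken from the sample rows of this report.
values = ["정종/대포집/소주방", "전자상거래(통신판매업)", "기타 건강기능식품일반판매업"]

# General category of every character: Lo = Other Letter, Po = Other Punctuation,
# Ps / Pe = Open / Close Punctuation, Zs = Space Separator.
categories = Counter(unicodedata.category(ch) for value in values for ch in value)

# Rough stand-in for the script property: Hangul syllables are named
# "HANGUL SYLLABLE ..." in the Unicode character database.
hangul = sum(
    unicodedata.name(ch, "").startswith("HANGUL")
    for value in values
    for ch in value
)

print(dict(categories))
print(hangul)
```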

Unique

Unique: 6
Unique (%): 1.3%

Sample

1st row: 한식
2nd row: 중국식
3rd row: 경양식
4th row: 일식
5th row: 분식

Value                 Count    Frequency (%)
기타                  34       6.7%
패스트푸드            16       3.1%
식품제조가공업        15       3.0%
집단급식소            11       2.2%
한식                  9        1.8%
유통전문판매업        9        1.8%
일반조리판매          9        1.8%
식품자동판매기영업    9        1.8%
식품소분업            9        1.8%
중국식                9        1.8%
Other values (63)     378      74.4%

Most occurring characters

Value                 Count    Frequency (%)
[?]                   184      7.0%
[?]                   152      5.8%
[?]                   119      4.5%
[?]                   115      4.4%
[?]                   83       3.2%
[?]                   79       3.0%
)                     55       2.1%
(                     55       2.1%
[?]                   49       1.9%
[?]                   49       1.9%
Other values (141)    1688     64.2%

Most occurring categories

Value                 Count    Frequency (%)
Other Letter          2440     92.8%
Close Punctuation     55       2.1%
Open Punctuation      55       2.1%
Space Separator       45       1.7%
Other Punctuation     33       1.3%

Most frequent character per category

Other Letter
Value                 Count    Frequency (%)
[?]                   184      7.5%
[?]                   152      6.2%
[?]                   119      4.9%
[?]                   115      4.7%
[?]                   83       3.4%
[?]                   79       3.2%
[?]                   49       2.0%
[?]                   49       2.0%
[?]                   48       2.0%
[?]                   41       1.7%
Other values (136)    1521     62.3%

Other Punctuation
Value                 Count    Frequency (%)
/                     25       75.8%
,                     8        24.2%

Close Punctuation
Value                 Count    Frequency (%)
)                     55       100.0%

Open Punctuation
Value                 Count    Frequency (%)
(                     55       100.0%

Space Separator
Value                 Count    Frequency (%)
(space)               45       100.0%

Most occurring scripts

Value     Count    Frequency (%)
Hangul    2440     92.8%
Common    188      7.2%

Most frequent character per script

Hangul
Value                 Count    Frequency (%)
[?]                   184      7.5%
[?]                   152      6.2%
[?]                   119      4.9%
[?]                   115      4.7%
[?]                   83       3.4%
[?]                   79       3.2%
[?]                   49       2.0%
[?]                   49       2.0%
[?]                   48       2.0%
[?]                   41       1.7%
Other values (136)    1521     62.3%

Common
Value                 Count    Frequency (%)
)                     55       29.3%
(                     55       29.3%
(space)               45       23.9%
/                     25       13.3%
,                     8        4.3%

Most occurring blocks

Value     Count    Frequency (%)
Hangul    2440     92.8%
ASCII     188      7.2%

Most frequent character per block

Hangul
Value                 Count    Frequency (%)
[?]                   184      7.5%
[?]                   152      6.2%
[?]                   119      4.9%
[?]                   115      4.7%
[?]                   83       3.4%
[?]                   79       3.2%
[?]                   49       2.0%
[?]                   49       2.0%
[?]                   48       2.0%
[?]                   41       1.7%
Other values (136)    1521     62.3%

ASCII
Value                 Count    Frequency (%)
)                     55       29.3%
(                     55       29.3%
(space)               45       23.9%
/                     25       13.3%
,                     8        4.3%

업소수 (number of establishments)
Real number (ℝ)

Distinct: 82
Distinct (%): 17.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.694268
Minimum: 1
Maximum: 464
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.3 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 2
Median: 6
Q3: 18.5
95th percentile: 78
Maximum: 464
Range: 463
Interquartile range (IQR): 16.5

Descriptive statistics

Standard deviation: 41.322447
Coefficient of variation (CV): 2.2104341
Kurtosis: 56.191086
Mean: 18.694268
Median Absolute Deviation (MAD): 5
Skewness: 6.4026885
Sum: 8805
Variance: 1707.5446
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
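
Several of the statistics reported for 업소수 are derived from one another. A minimal sketch of those relationships, using only figures printed in this report and assuming the usual definitions (range is maximum minus minimum, IQR is Q3 minus Q1, variance is the squared standard deviation, and CV is the standard deviation divided by the mean):

```python
# Figures copied from the quantile and descriptive statistics for 업소수.
n = 471            # number of observations (no missing values in this column)
total = 8805       # Sum
minimum, q1, q3, maximum = 1, 2, 18.5, 464
std = 41.322447    # Standard deviation

mean = total / n
value_range = maximum - minimum
iqr = q3 - q1
variance = std ** 2
cv = std / mean    # coefficient of variation

print(round(mean, 6), value_range, iqr, round(variance, 4), round(cv, 4))
```

A mean of about 18.7 against a median of 6 and a maximum of 464 is consistent with the strong right skew (6.4) shown in the table.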

Common values

Value                Count    Frequency (%)
1                    98       20.8%
2                    53       11.3%
3                    44       9.3%
4                    24       5.1%
6                    22       4.7%
7                    18       3.8%
10                   12       2.5%
8                    12       2.5%
13                   12       2.5%
9                    12       2.5%
Other values (72)    164      34.8%

Minimum 10 values

Value    Count    Frequency (%)
1        98       20.8%
2        53       11.3%
3        44       9.3%
4        24       5.1%
5        11       2.3%
6        22       4.7%
7        18       3.8%
8        12       2.5%
9        12       2.5%
10       12       2.5%

Maximum 10 values

Value    Count    Frequency (%)
464      1        0.2%
439      1        0.2%
301      1        0.2%
183      1        0.2%
171      1        0.2%
147      1        0.2%
144      1        0.2%
140      1        0.2%
136      1        0.2%
129      1        0.2%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation plot]

            법정동명    업태명    업소수
법정동명    1.000       0.000     0.095
업태명      0.000       1.000     0.457
업소수      0.095       0.457     1.000

[Figure: correlation plot]

            업소수    법정동명
업소수      1.000     0.047
법정동명    0.047     1.000
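
The report does not state which coefficient fills these matrices. One common measure of association between two categorical variables, which also ranges from 0 to 1, is Cramér's V. The following is a minimal sketch of the uncorrected form, offered only as an illustration of that kind of measure; it is an assumption that anything similar was used here, the function name `cramers_v` is hypothetical, and the labels are made up:

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long sequences of labels."""
    n = len(xs)
    x_counts, y_counts = Counter(xs), Counter(ys)
    pair_counts = Counter(zip(xs, ys))
    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, nx in x_counts.items():
        for y, ny in y_counts.items():
            expected = nx * ny / n
            observed = pair_counts.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(x_counts), len(y_counts)) - 1
    # A variable with a single level carries no association information.
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Made-up labels, only to exercise the function.
perfect = cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"])
independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(perfect, independent)
```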

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix: a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

       자치구명    법정동명    업태명                          업소수
0      강동구      명일동      한식                            171
1      강동구      명일동      중국식                          14
2      강동구      명일동      경양식                          26
3      강동구      명일동      일식                            19
4      강동구      명일동      분식                            47
5      강동구      명일동      뷔페식                          1
6      강동구      명일동      정종/대포집/소주방              4
7      강동구      명일동      출장조리                        1
8      강동구      명일동      패스트푸드                      3
9      강동구      명일동      호프/통닭                       33

Last rows

       자치구명    법정동명    업태명                          업소수
461    강동구      강일동      유통전문판매업                  2
462    강동구      강일동      기타식품판매업                  2
463    강동구      강일동      위탁급식영업                    1
464    강동구      강일동      제과점영업                      7
465    강동구      강일동      영업장판매                      8
466    강동구      강일동      방문판매                        1
467    강동구      강일동      전자상거래(통신판매업)          42
468    강동구      강일동      다단계판매                      1
469    강동구      강일동      기타 건강기능식품일반판매업     1
470    강동구      강일동      건강기능식품유통전문판매업      1

Duplicate rows

Most frequently occurring

     자치구명    법정동명    업태명      업소수    # duplicates
0    강동구      천호동      키즈카페    2         2