Overview

Dataset statistics

Number of variables: 4
Number of observations: 216
Missing cells: 3
Missing cells (%): 0.3%
Duplicate rows: 2
Duplicate rows (%): 0.9%
Total size in memory: 7.1 KiB
Average record size in memory: 33.6 B
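
The percentages in this block can be checked with simple arithmetic. The sketch below assumes that "Missing cells (%)" is taken over all cells (variables times observations) and that "Duplicate rows (%)" is taken over observations; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 4
n_observations = 216
missing_cells = 3
duplicate_rows = 2
total_size_kib = 7.1

# Assumption: missing-cell percentage is relative to all cells in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: duplicate percentage is relative to the number of rows.
duplicate_pct = 100 * duplicate_rows / n_observations

# The total size is already rounded to one decimal place, so this is
# only an approximation of the reported average record size.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(total_cells)
print(f"{missing_pct:.1f}%")
print(f"{duplicate_pct:.1f}%")
print(f"{avg_record_bytes:.2f} B")
```

The recomputed average record size lands within a tenth of a byte of the reported 33.6 B; the small gap is consistent with the total size having been rounded before display.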

Variable types

Categorical: 2
Text: 1
Numeric: 1

Dataset

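Description: 자치구명, 법정동명, 업태명, 업소수 (district name, legal dong name, business type name, number of establishments)
Author: 강북구 (Gangbuk-gu)
URL: https://data.seoul.go.kr/dataList/OA-10914/S/1/datasetView.do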

Alerts

자치구명 has constant value "강북구" [Constant]
Dataset has 2 (0.9%) duplicate rows [Duplicates]
업태명 has 3 (1.4%) missing values [Missing]

Reproduction

Analysis started: 2024-05-03 19:53:44.640567
Analysis finished: 2024-05-03 19:53:45.848332
Duration: 1.21 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
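
A report like this one could be regenerated roughly as follows. This is a minimal sketch, not the configuration actually used: the function name `build_report`, the CSV path, the output path, the report title, and the default encoding are all assumptions, and the `dataset` keys simply mirror the Description, Author and URL fields shown above. pandas and ydata-profiling must be installed for the function to run.

```python
def build_report(csv_path, out_path="report.html", encoding="utf-8"):
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so the sketch can be loaded without the libraries.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the source file is a plain CSV; adjust `encoding` if the
    # download uses a different character set.
    df = pd.read_csv(csv_path, encoding=encoding)

    report = ProfileReport(
        df,
        title="Profiling report",  # hypothetical title
        dataset={
            "description": "자치구명,법정동명,업태명,업소수",
            "author": "강북구",
            "url": "https://data.seoul.go.kr/dataList/OA-10914/S/1/datasetView.do",
        },
    )
    report.to_file(out_path)
    return out_path
```

Calling `build_report("data.csv")` with a local copy of the dataset (the file name is a placeholder) would write `report.html` next to the script.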

Variables

자치구명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB
강북구: 216

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강북구
2nd row: 강북구
3rd row: 강북구
4th row: 강북구
5th row: 강북구

Common Values

| Value | Count | Frequency (%) |
| 강북구 | 216 | 100.0% |

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

법정동명
Categorical

Distinct: 4
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.8 KiB
수유동: 66
미아동: 63
번동: 53
우이동: 34

Length

Max length: 3
Median length: 3
Mean length: 2.7546296
Min length: 2
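
The length statistics follow directly from the four category counts, since only 번동 has two characters. A minimal sketch, assuming lengths are counted in Unicode code points (which is what Python's `len` does for precomposed Hangul syllables):

```python
from statistics import median

# Category counts copied from the 법정동명 summary.
counts = {"수유동": 66, "미아동": 63, "번동": 53, "우이동": 34}

# One length per row, weighted by how often each name occurs.
lengths = [len(name) for name, n in counts.items() for _ in range(n)]

mean_length = sum(lengths) / len(lengths)
median_length = median(lengths)

print(len(lengths), min(lengths), max(lengths))
print(round(mean_length, 7), median_length)
```

The mean works out to 595 / 216, which matches the reported 2.7546296.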

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 미아동
2nd row: 미아동
3rd row: 미아동
4th row: 미아동
5th row: 미아동

Common Values

| Value | Count | Frequency (%) |
| 수유동 | 66 | 30.6% |
| 미아동 | 63 | 29.2% |
| 번동 | 53 | 24.5% |
| 우이동 | 34 | 15.7% |
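
Each frequency is the count divided by the 216 observations. The following sketch recomputes the column from the counts; the dictionary is typed in from the table, not read from the dataset.

```python
# Counts copied from the Common Values table for 법정동명.
counts = {"수유동": 66, "미아동": 63, "번동": 53, "우이동": 34}
total = sum(counts.values())

# Percentage of rows per category, rounded to one decimal like the report.
percentages = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, pct in percentages.items():
    print(f"{name}\t{counts[name]}\t{pct}%")
```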

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

업태명
Text

MISSING 

Distinct: 74
Distinct (%): 34.7%
Missing: 3
Missing (%): 1.4%
Memory size: 1.8 KiB

Length

Max length: 15
Median length: 12
Mean length: 5.4319249
Min length: 2

Characters and Unicode

Total characters: 1157
Distinct characters: 156
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
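
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below classifies three 업태명 values taken from the sample rows of this report; the mapping to long category names is written out by hand and covers only the five categories listed in this section.

```python
import unicodedata
from collections import Counter

# Three values that appear in the sample rows of this report.
samples = ["정종/대포집/소주방", "전자상거래(통신판매업)", "기타(복합 등)"]

# Long names for the two-letter general categories seen in this column.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Po": "Other Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Zs": "Space Separator",
}

# Count the general category of every character in every sample string.
category_counts = Counter(
    unicodedata.category(ch) for text in samples for ch in text
)

for code, n in category_counts.most_common():
    print(CATEGORY_NAMES.get(code, code), n)
```

Hangul syllables fall under Other Letter, which is why that category dominates the tables that follow.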

Unique

Unique: 18
Unique (%): 8.5%
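
The percentages for this variable use two different denominators. The sketch below assumes, based on the reported figures, that the missing percentage is relative to all 216 rows while the distinct and unique percentages and the mean length are relative to the 213 non-missing values; these denominators are inferred from the numbers, not stated by the report.

```python
# Figures copied from the 업태명 summary.
n_rows = 216
n_missing = 3
n_distinct = 74
n_unique = 18
total_characters = 1157

n_present = n_rows - n_missing  # rows that actually hold a value

missing_pct = 100 * n_missing / n_rows        # relative to all rows
distinct_pct = 100 * n_distinct / n_present   # relative to non-missing rows
unique_pct = 100 * n_unique / n_present       # relative to non-missing rows
mean_length = total_characters / n_present    # characters per non-missing value

print(n_present)
print(round(missing_pct, 1), round(distinct_pct, 1), round(unique_pct, 1))
print(round(mean_length, 7))
```

Using 216 as the denominator for the distinct share would give about 34.3% rather than the reported 34.7%, which is what suggests the non-missing count is the one in use.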

Sample

1st row: 한식
2nd row: 중국식
3rd row: 경양식
4th row: 일식
5th row: 분식

| Value | Count | Frequency (%) |
| 기타 | 17 | 7.3% |
| 패스트푸드 | 8 | 3.4% |
| 식품제조가공업 | 7 | 3.0% |
| 집단급식소 | 5 | 2.2% |
| 일반조리판매 | 4 | 1.7% |
| 어린이집 | 4 | 1.7% |
| 커피숍 | 4 | 1.7% |
| 편의점 | 4 | 1.7% |
| 제과점영업 | 4 | 1.7% |
| 수입판매업 | 4 | 1.7% |
| Other values (65) | 171 | 73.7% |

Most occurring characters

| Value | Count | Frequency (%) |
|  | 74 | 6.4% |
|  | 62 | 5.4% |
|  | 49 | 4.2% |
|  | 48 | 4.1% |
|  | 35 | 3.0% |
|  | 32 | 2.8% |
|  | 25 | 2.2% |
|  | 23 | 2.0% |
|  | 21 | 1.8% |
|  | 21 | 1.8% |
| Other values (146) | 767 | 66.3% |

Most occurring categories

| Value | Count | Frequency (%) |
| Other Letter | 1084 | 93.7% |
| Open Punctuation | 20 | 1.7% |
| Close Punctuation | 20 | 1.7% |
| Space Separator | 19 | 1.6% |
| Other Punctuation | 14 | 1.2% |

Most frequent character per category

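Other Letter

| Value | Count | Frequency (%) |
|  | 74 | 6.8% |
|  | 62 | 5.7% |
|  | 49 | 4.5% |
|  | 48 | 4.4% |
|  | 35 | 3.2% |
|  | 32 | 3.0% |
|  | 25 | 2.3% |
|  | 23 | 2.1% |
|  | 21 | 1.9% |
|  | 21 | 1.9% |
| Other values (141) | 694 | 64.0% |

Other Punctuation

| Value | Count | Frequency (%) |
| / | 12 | 85.7% |
| , | 2 | 14.3% |

Open Punctuation

| Value | Count | Frequency (%) |
| ( | 20 | 100.0% |

Close Punctuation

| Value | Count | Frequency (%) |
| ) | 20 | 100.0% |

Space Separator

| Value | Count | Frequency (%) |
| (space) | 19 | 100.0% |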

Most occurring scripts

| Value | Count | Frequency (%) |
| Hangul | 1084 | 93.7% |
| Common | 73 | 6.3% |

Most frequent character per script

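Hangul

| Value | Count | Frequency (%) |
|  | 74 | 6.8% |
|  | 62 | 5.7% |
|  | 49 | 4.5% |
|  | 48 | 4.4% |
|  | 35 | 3.2% |
|  | 32 | 3.0% |
|  | 25 | 2.3% |
|  | 23 | 2.1% |
|  | 21 | 1.9% |
|  | 21 | 1.9% |
| Other values (141) | 694 | 64.0% |

Common

| Value | Count | Frequency (%) |
| ( | 20 | 27.4% |
| ) | 20 | 27.4% |
| (space) | 19 | 26.0% |
| / | 12 | 16.4% |
| , | 2 | 2.7% |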

Most occurring blocks

| Value | Count | Frequency (%) |
| Hangul | 1084 | 93.7% |
| ASCII | 73 | 6.3% |

Most frequent character per block

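Hangul

| Value | Count | Frequency (%) |
|  | 74 | 6.8% |
|  | 62 | 5.7% |
|  | 49 | 4.5% |
|  | 48 | 4.4% |
|  | 35 | 3.2% |
|  | 32 | 3.0% |
|  | 25 | 2.3% |
|  | 23 | 2.1% |
|  | 21 | 1.9% |
|  | 21 | 1.9% |
| Other values (141) | 694 | 64.0% |

ASCII

| Value | Count | Frequency (%) |
| ( | 20 | 27.4% |
| ) | 20 | 27.4% |
| (space) | 19 | 26.0% |
| / | 12 | 16.4% |
| , | 2 | 2.7% |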

업소수
Real number (ℝ)

Distinct: 61
Distinct (%): 28.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.800926
Minimum: 1
Maximum: 649
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 6
Q3: 21.75
95-th percentile: 130.75
Maximum: 649
Range: 648
Interquartile range (IQR): 19.75
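
The last two entries are derived from the others: the range is the maximum minus the minimum, and the IQR is Q3 minus Q1. A short check using the reported quantiles (the variable names are illustrative):

```python
# Quantiles copied from the block above.
minimum = 1
q1 = 2
q3 = 21.75
maximum = 649

value_range = maximum - minimum  # spread of the full data
iqr = q3 - q1                    # spread of the middle half of the data

print(value_range, iqr)
```

The range is more than thirty times the IQR, which is in line with the strong right skew reported for this variable.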

Descriptive statistics

Standard deviation: 74.12505
Coefficient of variation (CV): 2.5737037
Kurtosis: 40.803905
Mean: 28.800926
Median Absolute Deviation (MAD): 5
Skewness: 5.8058582
Sum: 6221
Variance: 5494.523
Monotonicity: Not monotonic
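
Several of these figures are tied together by standard identities: the mean is the sum divided by the number of observations, the variance is the square of the standard deviation, and the coefficient of variation is the standard deviation divided by the mean. The sketch below checks them using the rounded values as printed, so small floating-point differences are expected.

```python
# Figures copied from the report (already rounded for display).
n_observations = 216
total = 6221
std_dev = 74.12505

mean = total / n_observations   # arithmetic mean
variance = std_dev ** 2         # variance is the squared standard deviation
cv = std_dev / mean             # coefficient of variation

print(round(mean, 6))
print(round(variance, 3))
print(round(cv, 7))
```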
[Figure: histogram with fixed size bins (bins=50)]
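
| Value | Count | Frequency (%) |
| 1 | 35 | 16.2% |
| 2 | 23 | 10.6% |
| 3 | 22 | 10.2% |
| 4 | 15 | 6.9% |
| 5 | 12 | 5.6% |
| 7 | 8 | 3.7% |
| 10 | 8 | 3.7% |
| 8 | 7 | 3.2% |
| 6 | 6 | 2.8% |
| 25 | 4 | 1.9% |
| Other values (51) | 76 | 35.2% |

| Value | Count | Frequency (%) |
| 1 | 35 | 16.2% |
| 2 | 23 | 10.6% |
| 3 | 22 | 10.2% |
| 4 | 15 | 6.9% |
| 5 | 12 | 5.6% |
| 6 | 6 | 2.8% |
| 7 | 8 | 3.7% |
| 8 | 7 | 3.2% |
| 9 | 4 | 1.9% |
| 10 | 8 | 3.7% |

| Value | Count | Frequency (%) |
| 649 | 1 | 0.5% |
| 604 | 1 | 0.5% |
| 352 | 1 | 0.5% |
| 291 | 1 | 0.5% |
| 221 | 1 | 0.5% |
| 185 | 1 | 0.5% |
| 184 | 1 | 0.5% |
| 161 | 1 | 0.5% |
| 146 | 2 | 0.9% |
| 133 | 1 | 0.5% |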

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

|  | 법정동명 | 업태명 | 업소수 |
| 법정동명 | 1.000 | 0.000 | 0.000 |
| 업태명 | 0.000 | 1.000 | 0.000 |
| 업소수 | 0.000 | 0.000 | 1.000 |

[Figure: correlation heatmap]

|  | 업소수 | 법정동명 |
| 업소수 | 1.000 | 0.000 |
| 법정동명 | 0.000 | 1.000 |

Missing values

[Figure: a simple visualization of nullity by column.]
[Figure: nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

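|  | 자치구명 | 법정동명 | 업태명 | 업소수 |
| 0 | 강북구 | 미아동 | 한식 | 604 |
| 1 | 강북구 | 미아동 | 중국식 | 39 |
| 2 | 강북구 | 미아동 | 경양식 | 21 |
| 3 | 강북구 | 미아동 | 일식 | 42 |
| 4 | 강북구 | 미아동 | 분식 | 93 |
| 5 | 강북구 | 미아동 | 뷔페식 | 3 |
| 6 | 강북구 | 미아동 | 정종/대포집/소주방 | 37 |
| 7 | 강북구 | 미아동 | 전통찻집 | 2 |
| 8 | 강북구 | 미아동 | 출장조리 | 1 |
| 9 | 강북구 | 미아동 | 패스트푸드 | 4 |

|  | 자치구명 | 법정동명 | 업태명 | 업소수 |
| 206 | 강북구 | 우이동 | 즉석판매제조가공업 | 14 |
| 207 | 강북구 | 우이동 | 식품등 수입판매업 | 2 |
| 208 | 강북구 | 우이동 | 식품자동판매기영업 | 8 |
| 209 | 강북구 | 우이동 | 유통전문판매업 | 5 |
| 210 | 강북구 | 우이동 | 위탁급식영업 | 3 |
| 211 | 강북구 | 우이동 | 제과점영업 | 3 |
| 212 | 강북구 | 우이동 | 영업장판매 | 5 |
| 213 | 강북구 | 우이동 | 전자상거래(통신판매업) | 12 |
| 214 | 강북구 | 우이동 | 기타(복합 등) | 1 |
| 215 | 강북구 | 우이동 | 건강기능식품유통전문판매업 | 3 |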

Duplicate rows

Most frequently occurring

|  | 자치구명 | 법정동명 | 업태명 | 업소수 | # duplicates |
| 0 | 강북구 | 수유동 | 전통찻집 | 1 | 2 |
| 1 | 강북구 | 우이동 | 패스트푸드 | 1 | 2 |
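
Duplicate rows of this kind can be found by counting identical tuples. The sketch below is a toy illustration with standard-library tools only: the list holds the two duplicated rows from the table above, each entered twice, plus one ordinary row from the sample, and it assumes that a row is a duplicate when all four columns match.

```python
from collections import Counter

# Toy subset of the data: (자치구명, 법정동명, 업태명, 업소수).
rows = [
    ("강북구", "수유동", "전통찻집", 1),
    ("강북구", "수유동", "전통찻집", 1),
    ("강북구", "우이동", "패스트푸드", 1),
    ("강북구", "우이동", "패스트푸드", 1),
    ("강북구", "미아동", "한식", 604),
]

# Count how many times each complete row occurs.
row_counts = Counter(rows)

# Keep only rows that occur more than once, with their occurrence count.
duplicates = {row: n for row, n in row_counts.items() if n > 1}

for row, n in duplicates.items():
    print(row, n)
```

On this subset the result has two entries with a count of 2 each, matching the two lines of the table.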