Overview

Dataset statistics

Number of variables: 4
Number of observations: 504
Missing cells: 10
Missing cells (%): 0.5%
Duplicate rows: 1
Duplicate rows (%): 0.2%
Total size in memory: 16.4 KiB
Average record size in memory: 33.3 B
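
The percentages above follow from the raw counts. The sketch below redoes the arithmetic, assuming (as the figures suggest) that missing cells are measured against all cells and duplicate rows against all rows. The variable names are illustrative and not part of the report.

```python
# Counts copied from the Dataset statistics above.
n_variables = 4
n_observations = 504
missing_cells = 10
duplicate_rows = 1
total_size_kib = 16.4

total_cells = n_variables * n_observations  # 4 * 504 = 2016 cells
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```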

Variable types

Categorical: 2
Text: 1
Numeric: 1

Dataset

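Description: 자치구명 (autonomous district name), 법정동명 (legal dong name), 업태명 (business type name), 업소수 (number of establishments)
Author: 은평구 (Eunpyeong-gu)
URL: https://data.seoul.go.kr/dataList/OA-10529/S/1/datasetView.do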

Alerts

자치구명 has constant value "은평구" (Constant)
Dataset has 1 (0.2%) duplicate rows (Duplicates)
업태명 has 10 (2.0%) missing values (Missing)
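
Each alert corresponds to a simple check on the table. The sketch below runs those checks in plain Python on a handful of rows copied from the Sample and Duplicate rows sections of this report. The function names are hypothetical, and this is not how ydata-profiling itself implements its alerts.

```python
from collections import Counter

COLUMNS = ("자치구명", "법정동명", "업태명", "업소수")

# Rows taken from this report. None stands in for the <NA> at index 501,
# and the 증산동 row is listed twice because the report counts it twice.
rows = [
    ("은평구", "필운동", "건강기능식품유통전문판매업", 1),
    ("은평구", "수색동", "한식", 40),
    ("은평구", "수색동", "중국식", 2),
    ("은평구", "진관동", None, 1),
    ("은평구", "증산동", "패스트푸드", 1),
    ("은평구", "증산동", "패스트푸드", 1),
]


def constant_columns(rows, columns):
    """Return the columns whose non-missing values are all identical."""
    constant = []
    for i, name in enumerate(columns):
        values = {row[i] for row in rows if row[i] is not None}
        if len(values) == 1:
            constant.append(name)
    return constant


def missing_counts(rows, columns):
    """Count missing (None) cells per column."""
    return {name: sum(row[i] is None for row in rows)
            for i, name in enumerate(columns)}


def duplicate_rows(rows):
    """Map each row that occurs more than once to its occurrence count."""
    return {row: n for row, n in Counter(rows).items() if n > 1}


print(constant_columns(rows, COLUMNS))
print(missing_counts(rows, COLUMNS))
print(duplicate_rows(rows))
```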

Reproduction

Analysis started: 2024-05-18 08:41:15.289945
Analysis finished: 2024-05-18 08:41:16.429717
Duration: 1.14 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
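
A report like this one can usually be regenerated with a few lines of ydata-profiling. The sketch below is an assumption about how that might look, not the configuration actually used here: the CSV path, the output path, and the report title are hypothetical, and the source file may need an explicit `encoding` argument to `pd.read_csv`.

```python
def build_report(csv_path, output_path="report.html"):
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so the function can be defined without the heavy
    # dependencies installed; both packages are required to call it.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(
        df,
        title="Profiling report",  # hypothetical title
        # Dataset metadata as it appears in the Dataset section above.
        dataset={
            "description": "자치구명,법정동명,업태명,업소수",
            "author": "은평구",
            "url": "https://data.seoul.go.kr/dataList/OA-10529/S/1/datasetView.do",
        },
    )
    profile.to_file(output_path)
    return output_path
```

Calling `build_report("data.csv")` with a local copy of the dataset (the file name is a placeholder) would write `report.html` next to the script.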

Variables

자치구명 (autonomous district name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Value | Count
은평구 | 504

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 은평구
2nd row: 은평구
3rd row: 은평구
4th row: 은평구
5th row: 은평구

Common Values

Value | Count | Frequency (%)
은평구 | 504 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; 은평구 accounts for all 504 rows (100.0%)]

법정동명 (legal dong name)
Categorical

Distinct: 14
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Value | Count
갈현동 | 54
응암동 | 54
불광동 | 53
녹번동 | 52
대조동 | 49
Other values (9) | 242

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 3
Unique (%): 0.6%

Sample

1st row: 필운동
2nd row: 중화동
3rd row: 수색동
4th row: 수색동
5th row: 수색동

Common Values

Value | Count | Frequency (%)
갈현동 | 54 | 10.7%
응암동 | 54 | 10.7%
불광동 | 53 | 10.5%
녹번동 | 52 | 10.3%
대조동 | 49 | 9.7%
진관동 | 47 | 9.3%
역촌동 | 44 | 8.7%
신사동 | 40 | 7.9%
구산동 | 39 | 7.7%
수색동 | 36 | 7.1%
Other values (4) | 36 | 7.1%

Length

[Figure: histogram of lengths of the category]

업태명 (business type name)
Text

MISSING

Distinct: 69
Distinct (%): 14.0%
Missing: 10
Missing (%): 2.0%
Memory size: 4.1 KiB

Length

Max length: 15
Median length: 12
Mean length: 5.7834008
Min length: 2

Characters and Unicode

Total characters: 2857
Distinct characters: 150
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
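
As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The three strings are 업태명 values copied from the Sample section of this report. The `CATEGORY_NAMES` mapping and the function name are illustrative choices, and the mapping covers only the five categories reported for this variable.

```python
import unicodedata
from collections import Counter

# Long names for the five general categories reported for 업태명.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the two-letter code for unmapped categories.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


samples = ["전자상거래(통신판매업)", "정종/대포집/소주방", "집단급식소 식품판매업"]
counts = category_counts(samples)
for name, n in counts.most_common():
    print(name, n)
```

Hangul syllables fall under Other Letter, which is why that category dominates the counts for this column.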

Unique

Unique: 10
Unique (%): 2.0%

Sample

1st row: 건강기능식품유통전문판매업
2nd row: 유통전문판매업
3rd row: 한식
4th row: 중국식
5th row: 경양식

Value | Count | Frequency (%)
기타 | 48 | 8.7%
패스트푸드 | 19 | 3.4%
식품제조가공업 | 18 | 3.3%
집단급식소 | 15 | 2.7%
한식 | 11 | 2.0%
전자상거래(통신판매업 | 11 | 2.0%
학교 | 11 | 2.0%
어린이집 | 11 | 2.0%
영업장판매 | 11 | 2.0%
제과점영업 | 11 | 2.0%
Other values (60) | 385 | 69.9%

Most occurring characters

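Value | Count | Frequency (%)
(not preserved) | 212 | 7.4%
(not preserved) | 168 | 5.9%
(not preserved) | 130 | 4.6%
(not preserved) | 130 | 4.6%
(not preserved) | 104 | 3.6%
(not preserved) | 98 | 3.4%
(not preserved) | 60 | 2.1%
(space) | 57 | 2.0%
(not preserved) | 56 | 2.0%
(not preserved) | 52 | 1.8%
Other values (140) | 1790 | 62.7%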

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2665 | 93.3%
Space Separator | 57 | 2.0%
Open Punctuation | 49 | 1.7%
Close Punctuation | 49 | 1.7%
Other Punctuation | 37 | 1.3%

Most frequent character per category

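Other Letter
Value | Count | Frequency (%)
(not preserved) | 212 | 8.0%
(not preserved) | 168 | 6.3%
(not preserved) | 130 | 4.9%
(not preserved) | 130 | 4.9%
(not preserved) | 104 | 3.9%
(not preserved) | 98 | 3.7%
(not preserved) | 60 | 2.3%
(not preserved) | 56 | 2.1%
(not preserved) | 52 | 2.0%
(not preserved) | 48 | 1.8%
Other values (135) | 1607 | 60.3%

Other Punctuation
Value | Count | Frequency (%)
/ | 29 | 78.4%
, | 8 | 21.6%

Space Separator
Value | Count | Frequency (%)
(space) | 57 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 49 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 49 | 100.0%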

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2665 | 93.3%
Common | 192 | 6.7%

Most frequent character per script

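Hangul
Value | Count | Frequency (%)
(not preserved) | 212 | 8.0%
(not preserved) | 168 | 6.3%
(not preserved) | 130 | 4.9%
(not preserved) | 130 | 4.9%
(not preserved) | 104 | 3.9%
(not preserved) | 98 | 3.7%
(not preserved) | 60 | 2.3%
(not preserved) | 56 | 2.1%
(not preserved) | 52 | 2.0%
(not preserved) | 48 | 1.8%
Other values (135) | 1607 | 60.3%

Common
Value | Count | Frequency (%)
(space) | 57 | 29.7%
( | 49 | 25.5%
) | 49 | 25.5%
/ | 29 | 15.1%
, | 8 | 4.2%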

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2665 | 93.3%
ASCII | 192 | 6.7%

Most frequent character per block

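Hangul
Value | Count | Frequency (%)
(not preserved) | 212 | 8.0%
(not preserved) | 168 | 6.3%
(not preserved) | 130 | 4.9%
(not preserved) | 130 | 4.9%
(not preserved) | 104 | 3.9%
(not preserved) | 98 | 3.7%
(not preserved) | 60 | 2.3%
(not preserved) | 56 | 2.1%
(not preserved) | 52 | 2.0%
(not preserved) | 48 | 1.8%
Other values (135) | 1607 | 60.3%

ASCII
Value | Count | Frequency (%)
(space) | 57 | 29.7%
( | 49 | 25.5%
) | 49 | 25.5%
/ | 29 | 15.1%
, | 8 | 4.2%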

업소수 (number of establishments)
Real number (ℝ)

Distinct: 75
Distinct (%): 14.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.253968
Minimum: 1
Maximum: 420
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 4
Q3: 14
95-th percentile: 68.4
Maximum: 420
Range: 419
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 36.766727
Coefficient of variation (CV): 2.2620154
Kurtosis: 50.403553
Mean: 16.253968
Median Absolute Deviation (MAD): 3
Skewness: 6.1656554
Sum: 8192
Variance: 1351.7922
Monotonicity: Not monotonic
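
Several of these figures are derived from one another, which gives a quick consistency check. The sketch below recomputes them from the values reported for 업소수, using the standard definitions (mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1, range = maximum - minimum). The variable names are illustrative.

```python
# Values copied from the statistics reported for 업소수.
n = 504
total = 8192
std = 36.766727
q1, q3 = 2, 14
minimum, maximum = 1, 420

mean = total / n                   # arithmetic mean
cv = std / mean                    # coefficient of variation
variance = std ** 2                # variance from the standard deviation
iqr = q3 - q1                      # interquartile range
value_range = maximum - minimum    # range

print(f"mean={mean:.6f} cv={cv:.7f} variance={variance:.4f}")
print(f"iqr={iqr} range={value_range}")
```

A CV above 2 together with a median of 4 and a maximum of 420 points to a strongly right-skewed distribution, in line with the reported skewness of 6.17.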
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
1 | 122 | 24.2%
2 | 69 | 13.7%
4 | 34 | 6.7%
3 | 31 | 6.2%
5 | 20 | 4.0%
8 | 15 | 3.0%
9 | 15 | 3.0%
7 | 15 | 3.0%
11 | 13 | 2.6%
10 | 12 | 2.4%
Other values (65) | 158 | 31.3%

Minimum 10 values

Value | Count | Frequency (%)
1 | 122 | 24.2%
2 | 69 | 13.7%
3 | 31 | 6.2%
4 | 34 | 6.7%
5 | 20 | 4.0%
6 | 10 | 2.0%
7 | 15 | 3.0%
8 | 15 | 3.0%
9 | 15 | 3.0%
10 | 12 | 2.4%

Maximum 10 values

Value | Count | Frequency (%)
420 | 1 | 0.2%
346 | 1 | 0.2%
266 | 1 | 0.2%
243 | 1 | 0.2%
217 | 1 | 0.2%
180 | 1 | 0.2%
150 | 1 | 0.2%
140 | 1 | 0.2%
135 | 1 | 0.2%
124 | 1 | 0.2%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

Variable | 법정동명 | 업태명 | 업소수
법정동명 | 1.000 | 0.000 | 0.000
업태명 | 0.000 | 1.000 | 0.000
업소수 | 0.000 | 0.000 | 1.000

[Figure: correlation heatmap]

Variable | 업소수 | 법정동명
업소수 | 1.000 | 0.000
법정동명 | 0.000 | 1.000

Missing values

[Figure: nullity bar chart]
A simple visualization of nullity by column.

[Figure: nullity matrix]
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

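First rows

Index | 자치구명 | 법정동명 | 업태명 | 업소수
0 | 은평구 | 필운동 | 건강기능식품유통전문판매업 | 1
1 | 은평구 | 중화동 | 유통전문판매업 | 1
2 | 은평구 | 수색동 | 한식 | 40
3 | 은평구 | 수색동 | 중국식 | 2
4 | 은평구 | 수색동 | 경양식 | 2
5 | 은평구 | 수색동 | 일식 | 1
6 | 은평구 | 수색동 | 분식 | 11
7 | 은평구 | 수색동 | 뷔페식 | 1
8 | 은평구 | 수색동 | 정종/대포집/소주방 | 2
9 | 은평구 | 수색동 | 패스트푸드 | 1

Last rows

Index | 자치구명 | 법정동명 | 업태명 | 업소수
494 | 은평구 | 진관동 | 위탁급식영업 | 8
495 | 은평구 | 진관동 | 제과점영업 | 26
496 | 은평구 | 진관동 | 집단급식소 식품판매업 | 3
497 | 은평구 | 진관동 | 건강기능식품수입업 | 1
498 | 은평구 | 진관동 | 영업장판매 | 42
499 | 은평구 | 진관동 | 방문판매 | 3
500 | 은평구 | 진관동 | 전자상거래(통신판매업) | 63
501 | 은평구 | 진관동 | <NA> | 1
502 | 은평구 | 진관동 | 건강기능식품유통전문판매업 | 2
503 | 은평구 | 반포동 | 푸드트럭 | 1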

Duplicate rows

Most frequently occurring

Index | 자치구명 | 법정동명 | 업태명 | 업소수 | # duplicates
0 | 은평구 | 증산동 | 패스트푸드 | 1 | 2