Overview

Dataset statistics

Number of variables: 5
Number of observations: 4407
Missing cells: 1664
Missing cells (%): 7.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 180.9 KiB
Average record size in memory: 42.0 B
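
The missing-cell percentage can be cross-checked from the counts reported here. The following is a minimal sketch, assuming the percentage is taken over all cells (variables times observations); the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 5
n_observations = 4407
missing_cells = 1664

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Per-variable missing counts as listed in the Alerts section.
missing_by_column = {"전용면적": 219, "공용면적": 1445}

print(f"total cells: {total_cells}")
print(f"missing cells (%): {missing_pct:.1f}%")
print(f"sum of per-column missing: {sum(missing_by_column.values())}")
```

Under that assumption, the two per-column counts add up to the reported total of missing cells.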

Variable types

Categorical: 1
Text: 2
Numeric: 2

Dataset

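Description: This file lists the floor area, size, and location of the stores in the Garak, Gangseo, and Yanggok wholesale markets and the Eco-friendly Distribution Center. The store information can be used to determine, for each market, how many stores can be occupied and how much store floor area is provided.
Author: 서울특별시농수산식품공사 (Seoul Agro-Fisheries & Food Corporation)
URL: https://www.data.go.kr/data/15123533/fileData.do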

Alerts

전용면적 is highly overall correlated with 공용면적 (High correlation)
공용면적 is highly overall correlated with 전용면적 (High correlation)
시장구분 is highly imbalanced (58.6%) (Imbalance)
전용면적 has 219 (5.0%) missing values (Missing)
공용면적 has 1445 (32.8%) missing values (Missing)
공용면적 is highly skewed (γ1 = 27.10600015) (Skewed)
시설번호 has unique values (Unique)
공용면적 has 413 (9.4%) zeros (Zeros)
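
The 58.6% imbalance figure for 시장구분 appears consistent with one minus the normalized Shannon entropy of the category counts. The sketch below assumes that definition, which is not stated in the report itself; the function name `imbalance_score` is hypothetical.

```python
import math

# Category counts for 시장구분, copied from this report.
counts = {"가락": 3648, "강서": 546, "양곡": 197, "친환경": 16}

def imbalance_score(class_counts):
    """Return 1 - H / log(k): 0 for a uniform split, 1 when one class holds every row."""
    total = sum(class_counts)
    k = len(class_counts)
    if k < 2:
        return 0.0  # a single class has no distribution to compare against
    # Shannon entropy of the class proportions (empty classes contribute nothing).
    entropy = -sum((c / total) * math.log(c / total) for c in class_counts if c > 0)
    return 1 - entropy / math.log(k)

score = imbalance_score(list(counts.values()))
print(f"imbalance: {score:.1%}")
```

With the four counts above, this should come out at roughly 58.6%, in line with the alert.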

Reproduction

Analysis started: 2023-12-12 12:43:04.652863
Analysis finished: 2023-12-12 12:43:05.996260
Duration: 1.34 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시장구분
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 34.6 KiB

Value | Count
가락 | 3648
강서 | 546
양곡 | 197
친환경 | 16

Length

Max length: 3
Median length: 2
Mean length: 2.0036306
Min length: 2
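
The length statistics follow directly from the category counts. This is a minimal sketch, assuming length is measured in Unicode code points (as Python's `len` does for precomposed Hangul syllables); the variable names are illustrative.

```python
# Category counts for 시장구분, copied from this report.
counts = {"가락": 3648, "강서": 546, "양곡": 197, "친환경": 16}

total_rows = sum(counts.values())
# Weight each label's length by how many rows carry that label.
total_chars = sum(len(label) * n for label, n in counts.items())
mean_length = total_chars / total_rows

print(f"rows: {total_rows}")
print(f"max length: {max(len(label) for label in counts)}")
print(f"min length: {min(len(label) for label in counts)}")
print(f"mean length: {mean_length:.7f}")
```

Only the 16 친환경 rows have three characters, which is why the mean sits just above 2.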

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 가락
2nd row: 가락
3rd row: 가락
4th row: 가락
5th row: 가락

Common Values

Value | Count | Frequency (%)
가락 | 3648 | 82.8%
강서 | 546 | 12.4%
양곡 | 197 | 4.5%
친환경 | 16 | 0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of the common values of 시장구분]

시설번호
Text

UNIQUE 

Distinct: 4407
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 34.6 KiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 30849
Distinct characters: 42
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4407
Unique (%): 100.0%

Sample

1st row: 1113181
2nd row: 1113191
3rd row: 1113201
4th row: 1113211
5th row: 1113221

Value | Count | Frequency (%)
1113181 | 1 | < 0.1%
g1ah694 | 1 | < 0.1%
g1ah671 | 1 | < 0.1%
g1ah681 | 1 | < 0.1%
g1ah691 | 1 | < 0.1%
g1ah692 | 1 | < 0.1%
g1ah693 | 1 | < 0.1%
g1ahb21 | 1 | < 0.1%
g1ah751 | 1 | < 0.1%
g1ah051 | 1 | < 0.1%
Other values (4397) | 4397 | 99.8%

Most occurring characters

Value | Count | Frequency (%)
1 | 12101 | 39.2%
0 | 3362 | 10.9%
2 | 3252 | 10.5%
3 | 1528 | 5.0%
G | 1446 | 4.7%
8 | 1167 | 3.8%
5 | 1116 | 3.6%
4 | 1101 | 3.6%
6 | 879 | 2.8%
7 | 831 | 2.7%
Other values (32) | 4066 | 13.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 26142 | 84.7%
Uppercase Letter | 3354 | 10.9%
Lowercase Letter | 1353 | 4.4%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 685 | 50.6%
b | 390 | 28.8%
j | 141 | 10.4%
c | 43 | 3.2%
d | 29 | 2.1%
e | 17 | 1.3%
f | 13 | 1.0%
g | 11 | 0.8%
h | 7 | 0.5%
i | 4 | 0.3%
Other values (11) | 13 | 1.0%

Uppercase Letter
Value | Count | Frequency (%)
G | 1446 | 43.1%
K | 531 | 15.8%
B | 314 | 9.4%
F | 274 | 8.2%
A | 226 | 6.7%
D | 220 | 6.6%
C | 169 | 5.0%
H | 81 | 2.4%
E | 44 | 1.3%
S | 28 | 0.8%

Decimal Number
Value | Count | Frequency (%)
1 | 12101 | 46.3%
0 | 3362 | 12.9%
2 | 3252 | 12.4%
3 | 1528 | 5.8%
8 | 1167 | 4.5%
5 | 1116 | 4.3%
4 | 1101 | 4.2%
6 | 879 | 3.4%
7 | 831 | 3.2%
9 | 805 | 3.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 26142 | 84.7%
Latin | 4707 | 15.3%

Most frequent character per script

Latin
Value | Count | Frequency (%)
G | 1446 | 30.7%
a | 685 | 14.6%
K | 531 | 11.3%
b | 390 | 8.3%
B | 314 | 6.7%
F | 274 | 5.8%
A | 226 | 4.8%
D | 220 | 4.7%
C | 169 | 3.6%
j | 141 | 3.0%
Other values (22) | 311 | 6.6%

Common
Value | Count | Frequency (%)
1 | 12101 | 46.3%
0 | 3362 | 12.9%
2 | 3252 | 12.4%
3 | 1528 | 5.8%
8 | 1167 | 4.5%
5 | 1116 | 4.3%
4 | 1101 | 4.2%
6 | 879 | 3.4%
7 | 831 | 3.2%
9 | 805 | 3.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 30849 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 12101 | 39.2%
0 | 3362 | 10.9%
2 | 3252 | 10.5%
3 | 1528 | 5.0%
G | 1446 | 4.7%
8 | 1167 | 3.8%
5 | 1116 | 3.6%
4 | 1101 | 3.6%
6 | 879 | 2.8%
7 | 831 | 2.7%
Other values (32) | 4066 | 13.2%

시설명
Text

Distinct: 4405
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 34.6 KiB

Length

Max length: 31
Median length: 29
Mean length: 18.540277
Min length: 5

Characters and Unicode

Total characters: 81707
Distinct characters: 192
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4403
Unique (%): 99.9%

Sample

1st row: 청과물시장동 1층 318-1호
2nd row: 청과물시장동 1층 319-1호
3rd row: 청과물시장동 1층 320-1호
4th row: 청과물시장동 1층 321-1호
5th row: 청과물시장동 1층 322-1호

Value | Count | Frequency (%)
1층 | 2559 | 14.8%
가락몰 | 1664 | 9.6%
판매동 | 1374 | 7.9%
지하1층 | 742 | 4.3%
청과물시장동 | 687 | 4.0%
청과부류 | 683 | 3.9%
채소시장 | 502 | 2.9%
수산물시장동 | 463 | 2.7%
1동 | 371 | 2.1%
지하2층 | 354 | 2.0%
Other values (2104) | 7894 | 45.6%

Most occurring characters

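Value | Count | Frequency (%)
(space) | 12889 | 15.8%
1 | 8887 | 10.9%
(unreadable) | 4357 | 5.3%
(unreadable) | 4306 | 5.3%
(unreadable) | 4203 | 5.1%
0 | 3732 | 4.6%
- | 3534 | 4.3%
2 | 2720 | 3.3%
(unreadable) | 2347 | 2.9%
(unreadable) | 2154 | 2.6%
Other values (182) | 32578 | 39.9%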

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 43004 | 52.6%
Decimal Number | 21295 | 26.1%
Space Separator | 12889 | 15.8%
Dash Punctuation | 3534 | 4.3%
Uppercase Letter | 828 | 1.0%
Open Punctuation | 77 | 0.1%
Close Punctuation | 77 | 0.1%
Other Punctuation | 2 | < 0.1%
Other Symbol | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(unreadable) | 4357 | 10.1%
(unreadable) | 4306 | 10.0%
(unreadable) | 4203 | 9.8%
(unreadable) | 2347 | 5.5%
(unreadable) | 2154 | 5.0%
(unreadable) | 1802 | 4.2%
(unreadable) | 1706 | 4.0%
(unreadable) | 1693 | 3.9%
(unreadable) | 1692 | 3.9%
(unreadable) | 1679 | 3.9%
Other values (153) | 17065 | 39.7%

Uppercase Letter
Value | Count | Frequency (%)
A | 181 | 21.9%
C | 138 | 16.7%
D | 115 | 13.9%
B | 105 | 12.7%
F | 84 | 10.1%
H | 81 | 9.8%
G | 63 | 7.6%
E | 35 | 4.2%
S | 12 | 1.4%
K | 12 | 1.4%

Decimal Number
Value | Count | Frequency (%)
1 | 8887 | 41.7%
0 | 3732 | 17.5%
2 | 2720 | 12.8%
3 | 1219 | 5.7%
4 | 973 | 4.6%
5 | 926 | 4.3%
6 | 752 | 3.5%
7 | 720 | 3.4%
8 | 689 | 3.2%
9 | 677 | 3.2%

Open Punctuation
Value | Count | Frequency (%)
( | 76 | 98.7%
[ | 1 | 1.3%

Close Punctuation
Value | Count | Frequency (%)
) | 76 | 98.7%
] | 1 | 1.3%

Space Separator
Value | Count | Frequency (%)
(space) | 12889 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 3534 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 2 | 100.0%

Other Symbol
Value | Count | Frequency (%)
(unreadable) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 43005 | 52.6%
Common | 37874 | 46.4%
Latin | 828 | 1.0%

Most frequent character per script

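Hangul
Value | Count | Frequency (%)
(unreadable) | 4357 | 10.1%
(unreadable) | 4306 | 10.0%
(unreadable) | 4203 | 9.8%
(unreadable) | 2347 | 5.5%
(unreadable) | 2154 | 5.0%
(unreadable) | 1802 | 4.2%
(unreadable) | 1706 | 4.0%
(unreadable) | 1693 | 3.9%
(unreadable) | 1692 | 3.9%
(unreadable) | 1679 | 3.9%
Other values (154) | 17066 | 39.7%

Common
Value | Count | Frequency (%)
(space) | 12889 | 34.0%
1 | 8887 | 23.5%
0 | 3732 | 9.9%
- | 3534 | 9.3%
2 | 2720 | 7.2%
3 | 1219 | 3.2%
4 | 973 | 2.6%
5 | 926 | 2.4%
6 | 752 | 2.0%
7 | 720 | 1.9%
Other values (7) | 1522 | 4.0%

Latin
Value | Count | Frequency (%)
A | 181 | 21.9%
C | 138 | 16.7%
D | 115 | 13.9%
B | 105 | 12.7%
F | 84 | 10.1%
H | 81 | 9.8%
G | 63 | 7.6%
E | 35 | 4.2%
S | 12 | 1.4%
K | 12 | 1.4%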

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 43004 | 52.6%
ASCII | 38702 | 47.4%
None | 1 | < 0.1%

Most frequent character per block

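ASCII
Value | Count | Frequency (%)
(space) | 12889 | 33.3%
1 | 8887 | 23.0%
0 | 3732 | 9.6%
- | 3534 | 9.1%
2 | 2720 | 7.0%
3 | 1219 | 3.1%
4 | 973 | 2.5%
5 | 926 | 2.4%
6 | 752 | 1.9%
7 | 720 | 1.9%
Other values (18) | 2350 | 6.1%

Hangul
Value | Count | Frequency (%)
(unreadable) | 4357 | 10.1%
(unreadable) | 4306 | 10.0%
(unreadable) | 4203 | 9.8%
(unreadable) | 2347 | 5.5%
(unreadable) | 2154 | 5.0%
(unreadable) | 1802 | 4.2%
(unreadable) | 1706 | 4.0%
(unreadable) | 1693 | 3.9%
(unreadable) | 1692 | 3.9%
(unreadable) | 1679 | 3.9%
Other values (153) | 17065 | 39.7%

None
Value | Count | Frequency (%)
(unreadable) | 1 | 100.0%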

전용면적
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 688
Distinct (%): 16.4%
Missing: 219
Missing (%): 5.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 39.713926
Minimum: 0
Maximum: 3372
Zeros: 25
Zeros (%): 0.6%
Negative: 0
Negative (%): 0.0%
Memory size: 38.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 6.5
Q1: 12.63
median: 22.5
Q3: 40.4
95-th percentile: 107.9895
Maximum: 3372
Range: 3372
Interquartile range (IQR): 27.77

Descriptive statistics

Standard deviation: 101.38543
Coefficient of variation (CV): 2.5528936
Kurtosis: 377.26846
Mean: 39.713926
Median Absolute Deviation (MAD): 13.74
Skewness: 15.837035
Sum: 166321.92
Variance: 10279.005
Monotonicity: Not monotonic
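
Several of the 전용면적 figures are tied together by simple identities, which gives a quick consistency check. The sketch below assumes that missing rows are excluded from the mean and uses only numbers printed in this report; the variable names are illustrative.

```python
# Figures copied from the 전용면적 section of this report.
n_observations = 4407
n_missing = 219
total = 166321.92      # Sum
std = 101.38543        # Standard deviation
q1, q3 = 12.63, 40.4   # Q1 and Q3

# Assumption: statistics are computed over non-missing rows only.
n_valid = n_observations - n_missing
mean = total / n_valid
cv = std / mean        # coefficient of variation = std / mean
variance = std ** 2    # variance is the square of the standard deviation
iqr = q3 - q1          # interquartile range

print(f"non-missing rows: {n_valid}")
print(f"mean: {mean:.6f}")
print(f"CV: {cv:.7f}")
print(f"variance: {variance:.3f}")
print(f"IQR: {iqr:.2f}")
```

The recomputed values should agree with the reported mean, CV, variance, and IQR up to rounding in the printed inputs.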
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
40.4 | 332 | 7.5%
12.63 | 228 | 5.2%
6.7 | 170 | 3.9%
22.5 | 169 | 3.8%
42.14 | 142 | 3.2%
6.5 | 135 | 3.1%
24.0 | 133 | 3.0%
18.9 | 128 | 2.9%
13.4 | 112 | 2.5%
21.07 | 92 | 2.1%
Other values (678) | 2547 | 57.8%
(Missing) | 219 | 5.0%

Value | Count | Frequency (%)
0.0 | 25 | 0.6%
1.25 | 1 | < 0.1%
1.46 | 1 | < 0.1%
1.69 | 1 | < 0.1%
2.12 | 1 | < 0.1%
2.23 | 1 | < 0.1%
2.3 | 2 | < 0.1%
3.0 | 1 | < 0.1%
3.3 | 2 | < 0.1%
3.5 | 1 | < 0.1%

Value | Count | Frequency (%)
3372.0 | 1 | < 0.1%
1972.84 | 1 | < 0.1%
1790.0 | 1 | < 0.1%
1620.0 | 1 | < 0.1%
1450.0 | 1 | < 0.1%
1350.0 | 1 | < 0.1%
1227.8 | 1 | < 0.1%
1132.09 | 1 | < 0.1%
912.0 | 1 | < 0.1%
874.0 | 1 | < 0.1%

공용면적
Real number (ℝ)

HIGH CORRELATION  MISSING  SKEWED  ZEROS 

Distinct: 326
Distinct (%): 11.0%
Missing: 1445
Missing (%): 32.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.387355
Minimum: 0
Maximum: 1333.45
Zeros: 413
Zeros (%): 9.4%
Negative: 0
Negative (%): 0.0%
Memory size: 38.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 5.44
median: 10.67
Q3: 22.73
95-th percentile: 38.10295
Maximum: 1333.45
Range: 1333.45
Interquartile range (IQR): 17.29

Descriptive statistics

Standard deviation: 31.330845
Coefficient of variation (CV): 2.0361423
Kurtosis: 1072.1726
Mean: 15.387355
Median Absolute Deviation (MAD): 8.65
Skewness: 27.106
Sum: 45577.346
Variance: 981.62187
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0.0 | 413 | 9.4%
22.73 | 332 | 7.5%
5.44 | 228 | 5.2%
9.7 | 167 | 3.8%
18.43 | 141 | 3.2%
1.66 | 128 | 2.9%
10.67 | 126 | 2.9%
9.215 | 85 | 1.9%
8.84 | 65 | 1.5%
12.36 | 54 | 1.2%
Other values (316) | 1223 | 27.8%
(Missing) | 1445 | 32.8%

Value | Count | Frequency (%)
0.0 | 413 | 9.4%
0.87 | 6 | 0.1%
1.66 | 128 | 2.9%
2.3 | 24 | 0.5%
2.33 | 1 | < 0.1%
2.34 | 15 | 0.3%
2.45 | 16 | 0.4%
2.5 | 14 | 0.3%
2.67 | 9 | 0.2%
2.79 | 2 | < 0.1%

Value | Count | Frequency (%)
1333.45 | 1 | < 0.1%
285.97 | 1 | < 0.1%
282.32 | 5 | 0.1%
277.24 | 2 | < 0.1%
179.77 | 1 | < 0.1%
168.84 | 1 | < 0.1%
152.861 | 1 | < 0.1%
151.23 | 1 | < 0.1%
144.0 | 1 | < 0.1%
130.0 | 1 | < 0.1%

Interactions

[Scatter plots of the pairwise interactions between 전용면적 and 공용면적]

Correlations

Variable | 시장구분 | 전용면적 | 공용면적
시장구분 | 1.000 | 0.041 | 0.076
전용면적 | 0.041 | 1.000 | 0.843
공용면적 | 0.076 | 0.843 | 1.000

Variable | 전용면적 | 공용면적 | 시장구분
전용면적 | 1.000 | 0.863 | 0.028
공용면적 | 0.863 | 1.000 | 0.030
시장구분 | 0.028 | 0.030 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
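
Nullity correlation is commonly computed as the Pearson correlation between the "is missing" indicators of two columns. The sketch below assumes that definition, which the report does not spell out; the function name and the toy columns are hypothetical and are not taken from the dataset.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 'is missing' indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # undefined when a column is never or always missing
    return cov / math.sqrt(var_a * var_b)

# Hypothetical toy columns; None marks a missing cell.
exclusive_area = [40.4, None, 22.5, None, 12.63, 6.7]
common_area = [22.73, None, None, None, 5.44, 1.66]

r = nullity_correlation(exclusive_area, common_area)
print(f"nullity correlation: {r:.3f}")
```

A value near 1 means the two columns tend to be missing in the same rows, a value near -1 means one tends to be present when the other is missing, and a value near 0 means the missingness patterns are unrelated.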

Sample

Index | 시장구분 | 시설번호 | 시설명 | 전용면적 | 공용면적
0 | 가락 | 1113181 | 청과물시장동 1층 318-1호 | 40.91 | 23.01
1 | 가락 | 1113191 | 청과물시장동 1층 319-1호 | 40.4 | 22.73
2 | 가락 | 1113201 | 청과물시장동 1층 320-1호 | 40.4 | 22.73
3 | 가락 | 1113211 | 청과물시장동 1층 321-1호 | 40.4 | 22.73
4 | 가락 | 1113221 | 청과물시장동 1층 322-1호 | 40.4 | 22.73
5 | 가락 | 1113231 | 청과물시장동 1층 323-1호 | 40.4 | 22.73
6 | 가락 | 1113241 | 청과물시장동 1층 324-1호 | 40.4 | 22.73
7 | 가락 | 1113251 | 청과물시장동 1층 325-1호 | 40.4 | 22.73
8 | 가락 | 1113261 | 청과물시장동 1층 326-1호 | 41.41 | 23.29
9 | 가락 | 1113271 | 청과물시장동 1층 327-1호 | 41.41 | 23.29

Index | 시장구분 | 시설번호 | 시설명 | 전용면적 | 공용면적
4397 | 강서 | K116131 | 청과동 1층 613-1호 | 21.07 | 9.215
4398 | 강서 | K116132 | 청과동 1층 613-2호 | 21.07 | 9.215
4399 | 강서 | K116141 | 청과동 1층 614-1호 | 42.14 | 18.43
4400 | 강서 | K117011 | 청과동 1층 701-1호 | 21.07 | 9.215
4401 | 강서 | K117012 | 청과동 1층 701-2호 | 21.07 | 9.215
4402 | 강서 | K117021 | 청과동 1층 702-1호 | 42.14 | 18.43
4403 | 강서 | K117031 | 청과동 1층 703-1호 | 42.14 | 18.43
4404 | 강서 | K117041 | 청과동 1층 704-1호 | 42.14 | 18.43
4405 | 강서 | K117051 | 청과동 1층 705-1호 | 42.14 | 18.43
4406 | 강서 | K117061 | 청과동 1층 706-1호 | 21.07 | 9.215