Overview

Dataset statistics

Number of variables: 5
Number of observations: 164
Missing cells: 244
Missing cells (%): 29.8%
Duplicate rows: 1
Duplicate rows (%): 0.6%
Total size in memory: 6.7 KiB
Average record size in memory: 41.8 B
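
The derived figures above follow from the raw counts. A minimal sketch of the arithmetic, assuming the missing-cell share is taken over all cells (observations × variables) and the average record size is total memory divided by observations; the variable names are illustrative only:

```python
# Counts copied from the dataset statistics above.
n_variables = 5
n_observations = 164
missing_cells = 244
duplicate_rows = 1
total_memory_kib = 6.7

total_cells = n_variables * n_observations                    # 820 cells
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"{missing_pct:.1f}%")          # 29.8%
print(f"{duplicate_pct:.1f}%")        # 0.6%
print(f"{avg_record_bytes:.1f} B")    # 41.8 B
```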

Variable types

Numeric: 1
Categorical: 1
Text: 2
DateTime: 1

Dataset

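Description: A data file listing milk retail businesses located in Seo-gu, Incheon Metropolitan City, with the business category name (판매업구분명), business name (사업장명칭), and road-name address (소재지) of each.
Author: Seo-gu, Incheon Metropolitan City (인천광역시 서구)
URL: https://www.data.go.kr/data/15088817/fileData.do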

Alerts

데이터기준일자 has constant value "2022-09-05" (Constant)
Dataset has 1 (0.6%) duplicate rows (Duplicates)
연번 is highly overall correlated with 판매업구분명 (High correlation)
판매업구분명 is highly overall correlated with 연번 (High correlation)
연번 has 61 (37.2%) missing values (Missing)
사업장명칭 has 61 (37.2%) missing values (Missing)
소재지 has 61 (37.2%) missing values (Missing)
데이터기준일자 has 61 (37.2%) missing values (Missing)
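
Four columns share exactly 61 missing values, and 61 × 4 = 244 matches the missing-cell count in the overview, which suggests 61 completely empty rows rather than scattered gaps. A minimal sketch of dropping such rows before analysis, assuming rows are held as dictionaries and that both `None` and the literal string `"<NA>"` mark a missing value; `is_empty_row` and `drop_empty_rows` are hypothetical helpers, not part of ydata-profiling:

```python
# Assumption: these two markers are how a missing value is encoded.
MISSING_MARKERS = {None, "<NA>"}

def is_empty_row(row):
    """Return True when every field in the row is a missing marker."""
    return all(value in MISSING_MARKERS for value in row.values())

def drop_empty_rows(rows):
    """Keep only rows that carry at least one real value."""
    return [row for row in rows if not is_empty_row(row)]

rows = [
    {"연번": 1, "판매업구분명": "우유류판매업", "사업장명칭": "건국우유가정보급소"},
    {"연번": None, "판매업구분명": "<NA>", "사업장명칭": None},
]
cleaned = drop_empty_rows(rows)
print(len(cleaned))  # 1
```

Dropping the empty rows first would also remove the duplicate-row and missing-value alerts, since both appear to stem from the same 61 rows.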

Reproduction

Analysis started: 2023-12-12 21:51:36.760500
Analysis finished: 2023-12-12 21:51:38.066579
Duration: 1.31 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 103
Distinct (%): 100.0%
Missing: 61
Missing (%): 37.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 52
Minimum: 1
Maximum: 103
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.6 KiB

Quantile statistics

Minimum: 1
5th percentile: 6.1
Q1: 26.5
Median: 52
Q3: 77.5
95th percentile: 97.9
Maximum: 103
Range: 102
Interquartile range (IQR): 51
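
These quantiles are what a plain run of serial numbers would give. A minimal sketch using Python's standard `statistics` module, assuming 연번 is exactly 1, 2, ..., 103 and that quantiles use linear interpolation between closest ranks (which `method="inclusive"` implements); it is a check on the reported numbers, not the report's own code:

```python
import statistics

values = list(range(1, 104))  # assumption: 연번 is exactly 1, 2, ..., 103

# Quartiles, then 20-quantiles for the 5th and 95th percentiles.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
ventiles = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = ventiles[0], ventiles[-1]

print(q1, median, q3)              # 26.5 52.0 77.5
print(round(p5, 1), round(p95, 1)) # 6.1 97.9
print(q3 - q1)                     # 51.0
```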

Descriptive statistics

Standard deviation: 29.877528
Coefficient of variation (CV): 0.57456784
Kurtosis: -1.2
Mean: 52
Median Absolute Deviation (MAD): 26
Skewness: 0
Sum: 5356
Variance: 892.66667
Monotonicity: Strictly increasing
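
Most of these statistics can be cross-checked against a plain 1 to 103 sequence. A minimal sketch under that assumption, using the sample (n − 1) variance, which is the convention that reproduces the reported 892.66667; skewness and kurtosis are left out, and the names are illustrative:

```python
import statistics

values = list(range(1, 104))  # assumption: 연번 is exactly 1, 2, ..., 103

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
std = statistics.stdev(values)
cv = std / mean                          # coefficient of variation
median = statistics.median(values)
# Median absolute deviation: median of the distances from the median.
mad = statistics.median(abs(v - median) for v in values)

print(total)                 # 5356
print(round(variance, 5))
print(round(std, 6))
print(round(cv, 8))
print(mad)                   # 26
```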
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
2                   1      0.6%
77                  1      0.6%
76                  1      0.6%
75                  1      0.6%
74                  1      0.6%
73                  1      0.6%
72                  1      0.6%
71                  1      0.6%
70                  1      0.6%
69                  1      0.6%
Other values (93)   93     56.7%
(Missing)           61     37.2%

Value  Count  Frequency (%)
1      1      0.6%
2      1      0.6%
3      1      0.6%
4      1      0.6%
5      1      0.6%
6      1      0.6%
7      1      0.6%
8      1      0.6%
9      1      0.6%
10     1      0.6%

Value  Count  Frequency (%)
103    1      0.6%
102    1      0.6%
101    1      0.6%
100    1      0.6%
99     1      0.6%
98     1      0.6%
97     1      0.6%
96     1      0.6%
95     1      0.6%
94     1      0.6%

판매업구분명
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 KiB

Value         Count
우유류판매업  103
<NA>          61

Length

Max length: 6
Median length: 6
Mean length: 5.2560976
Min length: 4
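
The mean length of 5.2560976 is worth a second look: it only works out if the 61 `<NA>` entries are counted as 4-character strings alongside the 103 six-character `우유류판매업` entries, which would also explain why this column reports 0 missing values. A minimal check of that arithmetic (an inference from the reported counts, not something the report states):

```python
# Counts taken from the category frequencies of 판매업구분명.
lengths = [len("우유류판매업")] * 103 + [len("<NA>")] * 61

mean_length = sum(lengths) / len(lengths)

print(min(lengths), max(lengths))   # 4 6
print(round(mean_length, 7))        # 5.2560976
```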

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 우유류판매업
2nd row: 우유류판매업
3rd row: 우유류판매업
4th row: 우유류판매업
5th row: 우유류판매업

Common Values

Value         Count  Frequency (%)
우유류판매업  103    62.8%
<NA>          61     37.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value         Count  Frequency (%)
우유류판매업  103    62.8%
na            61     37.2%

사업장명칭
Text

MISSING 

Distinct: 100
Distinct (%): 97.1%
Missing: 61
Missing (%): 37.2%
Memory size: 1.4 KiB

Length

Max length: 18
Median length: 13
Mean length: 8.8737864
Min length: 2

Characters and Unicode

Total characters: 914
Distinct characters: 161
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
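
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. A minimal sketch applied to the first sample row of 사업장명칭 (the two-letter codes are the standard Unicode abbreviations: `Lo` Other Letter, `Nd` Decimal Number, `Zs` Space Separator, `Pe` Close Punctuation); it does not claim to mirror how the report builds its tables:

```python
import unicodedata
from collections import Counter

text = "주)서울우유 석남1동2동 고객센터"  # first sample row of 사업장명칭

# Map every character to its Unicode general category and tally the results.
categories = Counter(unicodedata.category(ch) for ch in text)

print(len(text))   # 18
for code, count in categories.most_common():
    print(code, count)
```

Summing such tallies over every value in the column is one way to arrive at per-category totals like those listed under "Most occurring categories".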

Unique

Unique: 97
Unique (%): 94.2%

Sample

1st row: 주)서울우유 석남1동2동 고객센터
2nd row: 삼양대관령우유서인천특약점
3rd row: 건국우유가정보급소
4th row: 서울우유가정동고객센터
5th row: 에치와이 가좌점

Value               Count  Frequency (%)
서울우유            9      6.2%
남양유업            4      2.8%
주식회사            4      2.8%
에치와이            3      2.1%
연세우유            3      2.1%
청라대리점          2      1.4%
주)에치와이         2      1.4%
고객센터            2      1.4%
원당고객센터        2      1.4%
검단점              2      1.4%
Other values (108)  112    77.2%

Most occurring characters

ValueCountFrequency (%)
62
 
6.8%
44
 
4.8%
43
 
4.7%
42
 
4.6%
38
 
4.2%
36
 
3.9%
24
 
2.6%
24
 
2.6%
) 22
 
2.4%
21
 
2.3%
Other values (151) 558
61.1%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       823    90.0%
Space Separator    42     4.6%
Close Punctuation  22     2.4%
Open Punctuation   19     2.1%
Uppercase Letter   4      0.4%
Other Punctuation  2      0.2%
Decimal Number     2      0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
62
 
7.5%
44
 
5.3%
43
 
5.2%
38
 
4.6%
36
 
4.4%
24
 
2.9%
24
 
2.9%
21
 
2.6%
20
 
2.4%
19
 
2.3%
Other values (140) 492
59.8%
Uppercase Letter
ValueCountFrequency (%)
B 1
25.0%
F 1
25.0%
D 1
25.0%
S 1
25.0%
Other Punctuation
ValueCountFrequency (%)
& 1
50.0%
. 1
50.0%
Decimal Number
ValueCountFrequency (%)
1 1
50.0%
2 1
50.0%
Space Separator
ValueCountFrequency (%)
42
100.0%
Close Punctuation
ValueCountFrequency (%)
) 22
100.0%
Open Punctuation
ValueCountFrequency (%)
( 19
100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  823    90.0%
Common  87     9.5%
Latin   4      0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
62
 
7.5%
44
 
5.3%
43
 
5.2%
38
 
4.6%
36
 
4.4%
24
 
2.9%
24
 
2.9%
21
 
2.6%
20
 
2.4%
19
 
2.3%
Other values (140) 492
59.8%
Common
ValueCountFrequency (%)
42
48.3%
) 22
25.3%
( 19
21.8%
& 1
 
1.1%
1 1
 
1.1%
2 1
 
1.1%
. 1
 
1.1%
Latin
ValueCountFrequency (%)
B 1
25.0%
F 1
25.0%
D 1
25.0%
S 1
25.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  823    90.0%
ASCII   91     10.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
62
 
7.5%
44
 
5.3%
43
 
5.2%
38
 
4.6%
36
 
4.4%
24
 
2.9%
24
 
2.9%
21
 
2.6%
20
 
2.4%
19
 
2.3%
Other values (140) 492
59.8%
ASCII
ValueCountFrequency (%)
42
46.2%
) 22
24.2%
( 19
20.9%
B 1
 
1.1%
& 1
 
1.1%
F 1
 
1.1%
1 1
 
1.1%
2 1
 
1.1%
D 1
 
1.1%
S 1
 
1.1%

소재지
Text

MISSING 

Distinct: 94
Distinct (%): 91.3%
Missing: 61
Missing (%): 37.2%
Memory size: 1.4 KiB

Length

Max length: 46
Median length: 40
Mean length: 27.495146
Min length: 16

Characters and Unicode

Total characters: 2832
Distinct characters: 143
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 86
Unique (%): 83.5%

Sample

1st row: 인천광역시 서구 거북로119번길 20 (석남동)
2nd row: 인천광역시 서구 가정로138번길 18 (가좌동)
3rd row: 인천광역시 서구 가정동 172-8
4th row: 인천광역시 서구 원창로229번길 30 (가정동)
5th row: 인천광역시 서구 건지로284번길 15 (가좌동)

Value               Count  Frequency (%)
인천광역시          103    19.3%
서구                103    19.3%
가좌동              21     3.9%
석남동              10     1.9%
왕길동              7      1.3%
가정동              7      1.3%
1층                 7      1.3%
심곡동              7      1.3%
연희동              6      1.1%
금곡동              6      1.1%
Other values (188)  257    48.1%

Most occurring characters

ValueCountFrequency (%)
471
 
16.6%
1 128
 
4.5%
108
 
3.8%
107
 
3.8%
105
 
3.7%
104
 
3.7%
103
 
3.6%
103
 
3.6%
103
 
3.6%
103
 
3.6%
Other values (133) 1397
49.3%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       1581   55.8%
Decimal Number     523    18.5%
Space Separator    471    16.6%
Open Punctuation   87     3.1%
Close Punctuation  87     3.1%
Dash Punctuation   48     1.7%
Other Punctuation  30     1.1%
Uppercase Letter   5      0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
108
 
6.8%
107
 
6.8%
105
 
6.6%
104
 
6.6%
103
 
6.5%
103
 
6.5%
103
 
6.5%
103
 
6.5%
86
 
5.4%
79
 
5.0%
Other values (113) 580
36.7%
Decimal Number
ValueCountFrequency (%)
1 128
24.5%
2 73
14.0%
3 73
14.0%
0 48
 
9.2%
5 40
 
7.6%
4 39
 
7.5%
8 38
 
7.3%
6 32
 
6.1%
7 27
 
5.2%
9 25
 
4.8%
Uppercase Letter
ValueCountFrequency (%)
G 1
20.0%
J 1
20.0%
S 1
20.0%
C 1
20.0%
A 1
20.0%
Space Separator
ValueCountFrequency (%)
471
100.0%
Open Punctuation
ValueCountFrequency (%)
( 87
100.0%
Close Punctuation
ValueCountFrequency (%)
) 87
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 48
100.0%
Other Punctuation
ValueCountFrequency (%)
, 30
100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  1581   55.8%
Common  1246   44.0%
Latin   5      0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
108
 
6.8%
107
 
6.8%
105
 
6.6%
104
 
6.6%
103
 
6.5%
103
 
6.5%
103
 
6.5%
103
 
6.5%
86
 
5.4%
79
 
5.0%
Other values (113) 580
36.7%
Common
ValueCountFrequency (%)
471
37.8%
1 128
 
10.3%
( 87
 
7.0%
) 87
 
7.0%
2 73
 
5.9%
3 73
 
5.9%
0 48
 
3.9%
- 48
 
3.9%
5 40
 
3.2%
4 39
 
3.1%
Other values (5) 152
 
12.2%
Latin
ValueCountFrequency (%)
G 1
20.0%
J 1
20.0%
S 1
20.0%
C 1
20.0%
A 1
20.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  1581   55.8%
ASCII   1251   44.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
471
37.6%
1 128
 
10.2%
( 87
 
7.0%
) 87
 
7.0%
2 73
 
5.8%
3 73
 
5.8%
0 48
 
3.8%
- 48
 
3.8%
5 40
 
3.2%
4 39
 
3.1%
Other values (10) 157
 
12.5%
Hangul
ValueCountFrequency (%)
108
 
6.8%
107
 
6.8%
105
 
6.6%
104
 
6.6%
103
 
6.5%
103
 
6.5%
103
 
6.5%
103
 
6.5%
86
 
5.4%
79
 
5.0%
Other values (113) 580
36.7%

데이터기준일자
Date

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 1.0%
Missing: 61
Missing (%): 37.2%
Memory size: 1.4 KiB
Minimum: 2022-09-05 00:00:00
Maximum: 2022-09-05 00:00:00
Histogram with fixed size bins (bins=1)

Interactions


Correlations

            연번   사업장명칭  소재지
연번        1.000  0.835       0.886
사업장명칭  0.835  1.000       0.993
소재지      0.886  0.993       1.000
              연번   판매업구분명
연번          1.000  1.000
판매업구분명  1.000  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
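
Nullity correlation can be sketched as a Pearson correlation between per-column presence indicators. The example below is a simplified illustration, assuming the 61 empty rows are missing in both columns at once (as the alerts and the tail of the sample suggest); `pearson` is a hypothetical helper written here so the block is self-contained, not a ydata-profiling function:

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# 1 = value present, 0 = value missing: 103 filled rows, then 61 empty rows.
present_name = [1] * 103 + [0] * 61      # stands in for 사업장명칭
present_address = [1] * 103 + [0] * 61   # stands in for 소재지

nullity = pearson(present_name, present_address)
print(round(nullity, 3))  # 1.0
```

A value of 1 means the two columns are always present or absent together. A column with no missing values has a constant indicator, so its correlation is undefined and it would normally be left out of such a heatmap.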

Sample

Index  연번  판매업구분명  사업장명칭  소재지  데이터기준일자
0  1  우유류판매업  주)서울우유 석남1동2동 고객센터  인천광역시 서구 거북로119번길 20 (석남동)  2022-09-05
1  2  우유류판매업  삼양대관령우유서인천특약점  인천광역시 서구 가정로138번길 18 (가좌동)  2022-09-05
2  3  우유류판매업  건국우유가정보급소  인천광역시 서구 가정동 172-8  2022-09-05
3  4  우유류판매업  서울우유가정동고객센터  인천광역시 서구 원창로229번길 30 (가정동)  2022-09-05
4  5  우유류판매업  에치와이 가좌점  인천광역시 서구 건지로284번길 15 (가좌동)  2022-09-05
5  6  우유류판매업  연세우유 인천계양대리점  인천광역시 서구 청라에메랄드로163번길 3-12, 101호 (연희동)  2022-09-05
6  7  우유류판매업  연세우유 인천연희대리점  인천광역시 서구 심곡로132번길 16(심곡동)  2022-09-05
7  8  우유류판매업  서울우유 인천연희고객센터  인천광역시 서구 대평로56번길 17 (연희동)  2022-09-05
8  9  우유류판매업  서울우유 원당고객센터  인천광역시 서구 봉수대로 1328-2, C동 101호 (왕길동)  2022-09-05
9  10  우유류판매업  매일우유연희대리점  인천광역시 서구 가정동 169-9  2022-09-05

Index  연번  판매업구분명  사업장명칭  소재지  데이터기준일자
154  <NA>  <NA>  <NA>  <NA>  <NA>
155  <NA>  <NA>  <NA>  <NA>  <NA>
156  <NA>  <NA>  <NA>  <NA>  <NA>
157  <NA>  <NA>  <NA>  <NA>  <NA>
158  <NA>  <NA>  <NA>  <NA>  <NA>
159  <NA>  <NA>  <NA>  <NA>  <NA>
160  <NA>  <NA>  <NA>  <NA>  <NA>
161  <NA>  <NA>  <NA>  <NA>  <NA>
162  <NA>  <NA>  <NA>  <NA>  <NA>
163  <NA>  <NA>  <NA>  <NA>  <NA>

Duplicate rows

Most frequently occurring

Index  연번  판매업구분명  사업장명칭  소재지  데이터기준일자  # duplicates
0  <NA>  <NA>  <NA>  <NA>  <NA>  61
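
This table reconciles an apparent mismatch with the overview: the report counts one distinct duplicated row (0.6% of 164 observations), while that row itself occurs 61 times. A minimal sketch of the counting with `collections.Counter`, assuming each row is a tuple; the filled rows are stand-ins keyed by serial number (`name-1`, `address-1`, ...), not real records:

```python
from collections import Counter

EMPTY = (None, None, None, None, None)

# Assumption: 103 distinct filled rows (placeholders) followed by 61 fully
# empty rows, mirroring the counts reported above.
rows = [
    (i, "우유류판매업", f"name-{i}", f"address-{i}", "2022-09-05")
    for i in range(1, 104)
]
rows += [EMPTY] * 61

counts = Counter(rows)
duplicated = {row: n for row, n in counts.items() if n > 1}

print(len(duplicated))                               # 1
print(duplicated[EMPTY])                             # 61
print(f"{100 * len(duplicated) / len(rows):.1f}%")   # 0.6%
```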