Overview

Dataset statistics

Number of variables: 13
Number of observations: 294
Missing cells: 122
Missing cells (%): 3.2%
Duplicate rows: 6
Duplicate rows (%): 2.0%
Total size in memory: 30.3 KiB
Average record size in memory: 105.4 B
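
The percentage figures above can be recomputed from the raw counts. The following is a minimal Python sketch, assuming the missing-cell share is taken over all cells (rows times columns) and the duplicate share over all rows; the variable names are illustrative and are not part of the report.

```python
# Counts copied from the dataset statistics above.
n_variables = 13
n_observations = 294
missing_cells = 122
duplicate_rows = 6

# Assumption: the missing-cell share is relative to all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = round(missing_cells / total_cells * 100, 1)

# Assumption: the duplicate share is relative to the number of rows.
duplicate_pct = round(duplicate_rows / n_observations * 100, 1)

print(f"Missing cells (%): {missing_pct}%")
print(f"Duplicate rows (%): {duplicate_pct}%")
```

Under these assumptions the results agree with the 3.2% and 2.0% reported above.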

Variable types

Categorical: 6
Text: 6
Numeric: 1

Dataset

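Description: Status data on traditional liquor manufacturers, a type of traditional-culture business, by region within Chungcheongbuk-do. It provides product name, alcohol content, volume, sale price, main ingredients, shelf life, main phone number, and homepage.
Author: Chungcheongbuk-do (충청북도)
URL: https://www.data.go.kr/data/15011930/fileData.do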

Alerts

Dataset has 6 (2.0%) duplicate rows [Duplicates]
홈페이지 is highly overall correlated with 시군 and 2 other fields [High correlation]
특이사항 is highly overall correlated with 시군 and 3 other fields [High correlation]
알코올도수(퍼센트) is highly overall correlated with 주종 and 1 other field [High correlation]
시군 is highly overall correlated with 홈페이지 and 1 other field [High correlation]
주종 is highly overall correlated with 알코올도수(퍼센트) and 2 other fields [High correlation]
유효기간 is highly overall correlated with 알코올도수(퍼센트) and 1 other field [High correlation]
특이사항 is highly imbalanced (74.2%) [Imbalance]
제품명 has 6 (2.0%) missing values [Missing]
판매가(원) has 76 (25.9%) missing values [Missing]
주원료 has 5 (1.7%) missing values [Missing]
대표전화 has 33 (11.2%) missing values [Missing]

Reproduction

Analysis started: 2023-12-12 00:53:36.485424
Analysis finished: 2023-12-12 00:53:38.094098
Duration: 1.61 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시군
Categorical

HIGH CORRELATION 

Distinct: 12
Distinct (%): 4.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Value | Count
청주시 | 105
영동군 | 46
진천군 | 33
충주시 | 28
음성군 | 18
Other values (7) | 64

Length

Max length: 4
Median length: 3
Mean length: 3.0170068
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 청주시
2nd row: 청주시
3rd row: 청주시
4th row: 청주시
5th row: 청주시

Common Values

Value | Count | Frequency (%)
청주시 | 105 | 35.7%
영동군 | 46 | 15.6%
진천군 | 33 | 11.2%
충주시 | 28 | 9.5%
음성군 | 18 | 6.1%
제천시 | 17 | 5.8%
괴산군 | 13 | 4.4%
옥천군 | 11 | 3.7%
보은군 | 6 | 2.0%
증평군 | 6 | 2.0%
Other values (2) | 11 | 3.7%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
청주시 | 105 | 35.7%
영동군 | 46 | 15.6%
진천군 | 33 | 11.2%
충주시 | 33 | 11.2%
음성군 | 18 | 6.1%
제천시 | 17 | 5.8%
괴산군 | 13 | 4.4%
옥천군 | 11 | 3.7%
보은군 | 6 | 2.0%
증평군 | 6 | 2.0%

업체명
Text

Distinct: 110
Distinct (%): 37.4%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 18
Median length: 14.5
Mean length: 7.7857143
Min length: 3

Characters and Unicode

Total characters: 2289
Distinct characters: 199
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
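
As an illustration of these character properties, the sketch below counts the Unicode general category of each character in one value taken from this variable's sample rows. It uses Python's standard `unicodedata` module; the helper name `category_counts` is hypothetical, and the report's own tooling may compute its tables differently. In the output, `Lo` stands for Other Letter, `Zs` for Space Separator, and `So` for Other Symbol.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category (e.g. 'Lo', 'Zs')."""
    return Counter(unicodedata.category(ch) for ch in text)


# One value taken from the sample rows of this variable.
sample = "농업회사법인 조은술세종㈜"
counts = category_counts(sample)

for category, count in counts.most_common():
    print(category, count)
```

Hangul syllables fall under Other Letter, which is why that category dominates the tables for the Korean text variables in this report.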

Unique

Unique: 64
Unique (%): 21.8%

Sample

1st row: 농업회사법인 조은술세종㈜
2nd row: 농업회사법인 조은술세종㈜
3rd row: 농업회사법인 조은술세종㈜
4th row: 농업회사법인 조은술세종㈜
5th row: 농업회사법인 조은술세종㈜

Value | Count | Frequency (%)
농업회사법인 | 55 | 14.3%
조은술세종㈜ | 31 | 8.1%
고려주조 | 28 | 7.3%
청주주조 | 14 | 3.6%
서울장수주식회사 | 13 | 3.4%
주식회사 | 12 | 3.1%
잣나무골술도가 | 10 | 2.6%
제천한약영농조합법인 | 10 | 2.6%
농업회사법인(유)화양 | 5 | 1.3%
서가원전통술 | 5 | 1.3%
Other values (114) | 201 | 52.3%

Most occurring characters

ValueCountFrequency (%)
178
 
7.8%
127
 
5.5%
118
 
5.2%
112
 
4.9%
103
 
4.5%
103
 
4.5%
94
 
4.1%
90
 
3.9%
69
 
3.0%
63
 
2.8%
Other values (189) 1232
53.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 2117
92.5%
Space Separator 90
 
3.9%
Other Symbol 54
 
2.4%
Close Punctuation 9
 
0.4%
Open Punctuation 9
 
0.4%
Uppercase Letter 8
 
0.3%
Lowercase Letter 2
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
178
 
8.4%
127
 
6.0%
118
 
5.6%
112
 
5.3%
103
 
4.9%
103
 
4.9%
94
 
4.4%
69
 
3.3%
63
 
3.0%
58
 
2.7%
Other values (181) 1092
51.6%
Uppercase Letter
ValueCountFrequency (%)
L 4
50.0%
B 4
50.0%
Lowercase Letter
ValueCountFrequency (%)
o 1
50.0%
c 1
50.0%
Space Separator
ValueCountFrequency (%)
90
100.0%
Other Symbol
ValueCountFrequency (%)
54
100.0%
Close Punctuation
ValueCountFrequency (%)
) 9
100.0%
Open Punctuation
ValueCountFrequency (%)
( 9
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 2171
94.8%
Common 108
 
4.7%
Latin 10
 
0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
178
 
8.2%
127
 
5.8%
118
 
5.4%
112
 
5.2%
103
 
4.7%
103
 
4.7%
94
 
4.3%
69
 
3.2%
63
 
2.9%
58
 
2.7%
Other values (182) 1146
52.8%
Latin
ValueCountFrequency (%)
L 4
40.0%
B 4
40.0%
o 1
 
10.0%
c 1
 
10.0%
Common
ValueCountFrequency (%)
90
83.3%
) 9
 
8.3%
( 9
 
8.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 2117
92.5%
ASCII 118
 
5.2%
None 54
 
2.4%

Most frequent character per block

Hangul
ValueCountFrequency (%)
178
 
8.4%
127
 
6.0%
118
 
5.6%
112
 
5.3%
103
 
4.9%
103
 
4.9%
94
 
4.4%
69
 
3.3%
63
 
3.0%
58
 
2.7%
Other values (181) 1092
51.6%
ASCII
ValueCountFrequency (%)
90
76.3%
) 9
 
7.6%
( 9
 
7.6%
L 4
 
3.4%
B 4
 
3.4%
o 1
 
0.8%
c 1
 
0.8%
None
ValueCountFrequency (%)
54
100.0%

주소
Text

Distinct: 106
Distinct (%): 36.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 36
Median length: 26
Mean length: 22.081633
Min length: 13

Characters and Unicode

Total characters: 6492
Distinct characters: 166
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 61
Unique (%): 20.7%

Sample

1st row: 충청북도 청주시 청원구 사천로 18번길 5-2
2nd row: 충청북도 청주시 청원구 사천로 18번길 5-2
3rd row: 충청북도 청주시 청원구 사천로 18번길 5-2
4th row: 충청북도 청주시 청원구 사천로 18번길 5-2
5th row: 충청북도 청주시 청원구 사천로 18번길 5-2

Value | Count | Frequency (%)
충청북도 | 235 | 15.6%
청주시 | 105 | 6.9%
상당구 | 62 | 4.1%
영동군 | 46 | 3.0%
청원구 | 40 | 2.6%
가덕면 | 36 | 2.4%
진천군 | 33 | 2.2%
충주시 | 33 | 2.2%
18번길 | 31 | 2.1%
5-2 | 31 | 2.1%
Other values (259) | 859 | 56.8%

Most occurring characters

ValueCountFrequency (%)
1222
18.8%
1 400
 
6.2%
394
 
6.1%
271
 
4.2%
252
 
3.9%
246
 
3.8%
195
 
3.0%
182
 
2.8%
158
 
2.4%
- 155
 
2.4%
Other values (156) 3017
46.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 4024
62.0%
Space Separator 1222
 
18.8%
Decimal Number 1075
 
16.6%
Dash Punctuation 155
 
2.4%
Uppercase Letter 6
 
0.1%
Other Punctuation 4
 
0.1%
Close Punctuation 3
 
< 0.1%
Open Punctuation 3
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
394
 
9.8%
271
 
6.7%
252
 
6.3%
246
 
6.1%
195
 
4.8%
182
 
4.5%
158
 
3.9%
153
 
3.8%
151
 
3.8%
140
 
3.5%
Other values (138) 1882
46.8%
Decimal Number
ValueCountFrequency (%)
1 400
37.2%
5 106
 
9.9%
2 106
 
9.9%
4 91
 
8.5%
3 87
 
8.1%
8 80
 
7.4%
6 62
 
5.8%
0 59
 
5.5%
7 56
 
5.2%
9 28
 
2.6%
Uppercase Letter
ValueCountFrequency (%)
D 3
50.0%
R 3
50.0%
Other Punctuation
ValueCountFrequency (%)
& 3
75.0%
, 1
 
25.0%
Space Separator
ValueCountFrequency (%)
1222
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 155
100.0%
Close Punctuation
ValueCountFrequency (%)
) 3
100.0%
Open Punctuation
ValueCountFrequency (%)
( 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 4024
62.0%
Common 2462
37.9%
Latin 6
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
394
 
9.8%
271
 
6.7%
252
 
6.3%
246
 
6.1%
195
 
4.8%
182
 
4.5%
158
 
3.9%
153
 
3.8%
151
 
3.8%
140
 
3.5%
Other values (138) 1882
46.8%
Common
ValueCountFrequency (%)
1222
49.6%
1 400
 
16.2%
- 155
 
6.3%
5 106
 
4.3%
2 106
 
4.3%
4 91
 
3.7%
3 87
 
3.5%
8 80
 
3.2%
6 62
 
2.5%
0 59
 
2.4%
Other values (6) 94
 
3.8%
Latin
ValueCountFrequency (%)
D 3
50.0%
R 3
50.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 4024
62.0%
ASCII 2468
38.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1222
49.5%
1 400
 
16.2%
- 155
 
6.3%
5 106
 
4.3%
2 106
 
4.3%
4 91
 
3.7%
3 87
 
3.5%
8 80
 
3.2%
6 62
 
2.5%
0 59
 
2.4%
Other values (8) 100
 
4.1%
Hangul
ValueCountFrequency (%)
394
 
9.8%
271
 
6.7%
252
 
6.3%
246
 
6.1%
195
 
4.8%
182
 
4.5%
158
 
3.9%
153
 
3.8%
151
 
3.8%
140
 
3.5%
Other values (138) 1882
46.8%

주종
Categorical

HIGH CORRELATION 

Distinct: 24
Distinct (%): 8.2%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Value | Count
생막걸리 | 64
탁주 | 37
과실주(포도) | 37
약주 | 35
기타주류 | 24
Other values (19) | 97

Length

Max length: 9
Median length: 8
Mean length: 3.952381
Min length: 2

Unique

Unique: 5
Unique (%): 1.7%

Sample

1st row: 탁주
2nd row: 탁주
3rd row: 탁주
4th row: 탁주
5th row: 탁주

Common Values

Value | Count | Frequency (%)
생막걸리 | 64 | 21.8%
탁주 | 37 | 12.6%
과실주(포도) | 37 | 12.6%
약주 | 35 | 11.9%
기타주류 | 24 | 8.2%
증류식소주 | 21 | 7.1%
과실주 | 13 | 4.4%
일반증류주 | 11 | 3.7%
리큐르주 | 9 | 3.1%
소주 | 7 | 2.4%
Other values (14) | 36 | 12.2%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
생막걸리 | 64 | 21.8%
탁주 | 37 | 12.6%
과실주(포도 | 37 | 12.6%
약주 | 35 | 11.9%
기타주류 | 24 | 8.2%
증류식소주 | 21 | 7.1%
과실주 | 13 | 4.4%
일반증류주 | 11 | 3.7%
리큐르주 | 9 | 3.1%
소주 | 7 | 2.4%
Other values (14) | 36 | 12.2%

제품명
Text

MISSING 

Distinct: 229
Distinct (%): 79.5%
Missing: 6
Missing (%): 2.0%
Memory size: 2.4 KiB

Length

Max length: 17
Median length: 12
Mean length: 6.0243056
Min length: 1

Characters and Unicode

Total characters: 1735
Distinct characters: 265
Distinct categories: 9
Distinct scripts: 4
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 183
Unique (%): 63.5%
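
The distinct and unique percentages for 제품명 appear to be computed over the non-missing values rather than over all 294 rows. The short Python check below shows this; it is a sketch under that assumption, with illustrative variable names, and uses the total of 294 observations from the overview.

```python
# Figures reported for the 제품명 variable.
n_rows = 294       # observations in the dataset
n_missing = 6
n_distinct = 229
n_unique = 183

# Rows that actually carry a value.
n_present = n_rows - n_missing

# Missing share uses all rows; distinct and unique shares are assumed
# to use only the non-missing rows.
missing_pct = round(n_missing / n_rows * 100, 1)
distinct_pct = round(n_distinct / n_present * 100, 1)
unique_pct = round(n_unique / n_present * 100, 1)

print(missing_pct, distinct_pct, unique_pct)
```

Dividing by all 294 rows instead would give about 77.9% distinct, which does not match the reported 79.5%.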

Sample

1st row: 세종청주생막걸리
2nd row: 세종청주생막걸리
3rd row: 세종생막걸리1700
4th row: 세종알밤막걸리
5th row: 세종알밤막걸리

Value | Count | Frequency (%)
약주 | 6 | 1.8%
다올찬 | 5 | 1.5%
장수막걸리 | 5 | 1.5%
생막걸리 | 4 | 1.2%
이류생막걸리 | 3 | 0.9%
청주신선주 | 3 | 0.9%
서가원 | 3 | 0.9%
금왕 | 3 | 0.9%
청주가덕쌀막걸리 | 3 | 0.9%
이도32 | 3 | 0.9%
Other values (239) | 294 | 88.6%

Most occurring characters

ValueCountFrequency (%)
126
 
7.3%
113
 
6.5%
110
 
6.3%
85
 
4.9%
67
 
3.9%
45
 
2.6%
40
 
2.3%
38
 
2.2%
37
 
2.1%
28
 
1.6%
Other values (255) 1046
60.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 1617
93.2%
Space Separator 45
 
2.6%
Decimal Number 26
 
1.5%
Close Punctuation 13
 
0.7%
Open Punctuation 13
 
0.7%
Uppercase Letter 13
 
0.7%
Lowercase Letter 4
 
0.2%
Connector Punctuation 3
 
0.2%
Dash Punctuation 1
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
126
 
7.8%
113
 
7.0%
110
 
6.8%
85
 
5.3%
67
 
4.1%
40
 
2.5%
38
 
2.4%
37
 
2.3%
28
 
1.7%
23
 
1.4%
Other values (233) 950
58.8%
Decimal Number
ValueCountFrequency (%)
2 10
38.5%
3 4
 
15.4%
5 3
 
11.5%
4 3
 
11.5%
0 3
 
11.5%
1 2
 
7.7%
7 1
 
3.8%
Uppercase Letter
ValueCountFrequency (%)
P 3
23.1%
E 3
23.1%
T 3
23.1%
N 2
15.4%
A 1
 
7.7%
C 1
 
7.7%
Lowercase Letter
ValueCountFrequency (%)
y 1
25.0%
s 1
25.0%
d 1
25.0%
a 1
25.0%
Space Separator
ValueCountFrequency (%)
45
100.0%
Close Punctuation
ValueCountFrequency (%)
) 13
100.0%
Open Punctuation
ValueCountFrequency (%)
( 13
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 3
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 1613
93.0%
Common 101
 
5.8%
Latin 17
 
1.0%
Han 4
 
0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
126
 
7.8%
113
 
7.0%
110
 
6.8%
85
 
5.3%
67
 
4.2%
40
 
2.5%
38
 
2.4%
37
 
2.3%
28
 
1.7%
23
 
1.4%
Other values (232) 946
58.6%
Common
ValueCountFrequency (%)
45
44.6%
) 13
 
12.9%
( 13
 
12.9%
2 10
 
9.9%
3 4
 
4.0%
5 3
 
3.0%
4 3
 
3.0%
0 3
 
3.0%
_ 3
 
3.0%
1 2
 
2.0%
Other values (2) 2
 
2.0%
Latin
ValueCountFrequency (%)
P 3
17.6%
E 3
17.6%
T 3
17.6%
N 2
11.8%
y 1
 
5.9%
s 1
 
5.9%
A 1
 
5.9%
C 1
 
5.9%
d 1
 
5.9%
a 1
 
5.9%
Han
ValueCountFrequency (%)
4
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 1613
93.0%
ASCII 118
 
6.8%
CJK 4
 
0.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
126
 
7.8%
113
 
7.0%
110
 
6.8%
85
 
5.3%
67
 
4.2%
40
 
2.5%
38
 
2.4%
37
 
2.3%
28
 
1.7%
23
 
1.4%
Other values (232) 946
58.6%
ASCII
ValueCountFrequency (%)
45
38.1%
) 13
 
11.0%
( 13
 
11.0%
2 10
 
8.5%
3 4
 
3.4%
P 3
 
2.5%
E 3
 
2.5%
5 3
 
2.5%
4 3
 
2.5%
0 3
 
2.5%
Other values (12) 18
 
15.3%
CJK
ValueCountFrequency (%)
4
100.0%

알코올도수(퍼센트)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 31
Distinct (%): 10.6%
Missing: 2
Missing (%): 0.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.205479
Minimum: 3
Maximum: 55
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 KiB

Quantile statistics

Minimum: 3
5-th percentile: 6
Q1: 6
Median: 11
Q3: 14
95-th percentile: 40
Maximum: 55
Range: 52
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 10.399192
Coefficient of variation (CV): 0.7874907
Kurtosis: 2.5978206
Mean: 13.205479
Median Absolute Deviation (MAD): 5
Skewness: 1.8053368
Sum: 3856
Variance: 108.1432
Monotonicity: Not monotonic
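
Several of the statistics above are derived from others, and the relationships can be checked directly. The Python sketch below uses only the reported figures (plus the 294 observations from the overview); the variable names are illustrative.

```python
# Figures reported for 알코올도수(퍼센트).
minimum, maximum = 3, 55
q1, q3 = 6, 14
mean = 13.205479
std = 10.399192
total = 3856
n_rows = 294      # observations in the dataset
n_missing = 2

value_range = maximum - minimum               # Range
iqr = q3 - q1                                 # Interquartile range
cv = std / mean                               # Coefficient of variation
variance = std ** 2                           # Variance
mean_from_sum = total / (n_rows - n_missing)  # Mean over non-missing rows

print(value_range, iqr, round(cv, 7), round(variance, 4), round(mean_from_sum, 6))
```

The mean only matches the reported value when the sum is divided by the 292 non-missing rows, not by all 294.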
Histogram with fixed size bins (bins=31)

Value | Count | Frequency (%)
6 | 114 | 38.8%
12 | 48 | 16.3%
11 | 14 | 4.8%
7 | 11 | 3.7%
13 | 11 | 3.7%
25 | 11 | 3.7%
15 | 10 | 3.4%
35 | 9 | 3.1%
14 | 8 | 2.7%
40 | 8 | 2.7%
Other values (21) | 48 | 16.3%

Value | Count | Frequency (%)
3 | 1 | 0.3%
4 | 3 | 1.0%
5 | 3 | 1.0%
6 | 114 | 38.8%
7 | 11 | 3.7%
8 | 5 | 1.7%
9 | 2 | 0.7%
10 | 1 | 0.3%
11 | 14 | 4.8%
12 | 48 | 16.3%

Value | Count | Frequency (%)
55 | 1 | 0.3%
53 | 1 | 0.3%
45 | 2 | 0.7%
43 | 1 | 0.3%
42 | 3 | 1.0%
41 | 1 | 0.3%
40 | 8 | 2.7%
36 | 1 | 0.3%
35 | 9 | 3.1%
32 | 4 | 1.4%

용량(미리리터)
Categorical

Distinct: 23
Distinct (%): 7.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Value | Count
750 | 117
375 | 37
1700 | 28
1200 | 27
500 | 19
Other values (18) | 66

Length

Max length: 15
Median length: 4
Mean length: 4.3163265
Min length: 4

Unique

Unique: 7
Unique (%): 2.4%

Sample

1st row: 750
2nd row: 1200
3rd row: 1700
4th row: 750
5th row: 1000

Common Values

Value | Count | Frequency (%)
750 | 117 | 39.8%
375 | 37 | 12.6%
1700 | 28 | 9.5%
1200 | 27 | 9.2%
500 | 19 | 6.5%
1000 | 11 | 3.7%
900 | 10 | 3.4%
700 | 8 | 2.7%
360 | 6 | 2.0%
<NA> | 5 | 1.7%
Other values (13) | 26 | 8.8%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
750 | 117 | 39.8%
375 | 37 | 12.6%
1700 | 28 | 9.5%
1200 | 27 | 9.2%
500 | 19 | 6.5%
1000 | 11 | 3.7%
900 | 10 | 3.4%
700 | 8 | 2.7%
360 | 6 | 2.0%
na | 5 | 1.7%
Other values (13) | 26 | 8.8%

판매가(원)
Text

MISSING 

Distinct: 70
Distinct (%): 32.1%
Missing: 76
Missing (%): 25.9%
Memory size: 2.4 KiB

Length

Max length: 15
Median length: 9
Mean length: 5.3348624
Min length: 2

Characters and Unicode

Total characters: 1163
Distinct characters: 20
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 38
Unique (%): 17.4%

Sample

1st row: 30000
2nd row: 35000
3rd row: 15000
4th row: 35000
5th row: 22000

Value | Count | Frequency (%)
15000 | 38 | 17.4%
2000 | 19 | 8.7%
1500 | 17 | 7.8%
1200 | 10 | 4.6%
1000 | 8 | 3.7%
20000 | 7 | 3.2%
4000 | 6 | 2.8%
867 | 5 | 2.3%
700 | 5 | 2.3%
650 | 5 | 2.3%
Other values (60) | 98 | 45.0%

Most occurring characters

ValueCountFrequency (%)
0 537
46.2%
212
 
18.2%
1 123
 
10.6%
5 91
 
7.8%
2 66
 
5.7%
3 23
 
2.0%
6 22
 
1.9%
7 21
 
1.8%
8 19
 
1.6%
9 15
 
1.3%
Other values (10) 34
 
2.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 932
80.1%
Space Separator 212
 
18.2%
Other Letter 14
 
1.2%
Other Punctuation 4
 
0.3%
Math Symbol 1
 
0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 537
57.6%
1 123
 
13.2%
5 91
 
9.8%
2 66
 
7.1%
3 23
 
2.5%
6 22
 
2.4%
7 21
 
2.3%
8 19
 
2.0%
9 15
 
1.6%
4 15
 
1.6%
Other Letter
ValueCountFrequency (%)
3
21.4%
3
21.4%
3
21.4%
3
21.4%
1
 
7.1%
1
 
7.1%
Other Punctuation
ValueCountFrequency (%)
/ 2
50.0%
, 2
50.0%
Space Separator
ValueCountFrequency (%)
212
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1149
98.8%
Hangul 14
 
1.2%

Most frequent character per script

Common
ValueCountFrequency (%)
0 537
46.7%
212
 
18.5%
1 123
 
10.7%
5 91
 
7.9%
2 66
 
5.7%
3 23
 
2.0%
6 22
 
1.9%
7 21
 
1.8%
8 19
 
1.7%
9 15
 
1.3%
Other values (4) 20
 
1.7%
Hangul
ValueCountFrequency (%)
3
21.4%
3
21.4%
3
21.4%
3
21.4%
1
 
7.1%
1
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1149
98.8%
Hangul 14
 
1.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 537
46.7%
212
 
18.5%
1 123
 
10.7%
5 91
 
7.9%
2 66
 
5.7%
3 23
 
2.0%
6 22
 
1.9%
7 21
 
1.8%
8 19
 
1.7%
9 15
 
1.3%
Other values (4) 20
 
1.7%
Hangul
ValueCountFrequency (%)
3
21.4%
3
21.4%
3
21.4%
3
21.4%
1
 
7.1%
1
 
7.1%

주원료
Text

MISSING 

Distinct: 55
Distinct (%): 19.0%
Missing: 5
Missing (%): 1.7%
Memory size: 2.4 KiB

Length

Max length: 14
Median length: 13
Mean length: 2.6020761
Min length: 1

Characters and Unicode

Total characters: 752
Distinct characters: 81
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 30
Unique (%): 10.4%

Sample

1st row
2nd row
3rd row
4th row
5th row
ValueCountFrequency (%)
116
40.1%
포도 37
 
12.8%
유기농쌀 11
 
3.8%
찹쌀 11
 
3.8%
쌀,밀가루 8
 
2.8%
사과 7
 
2.4%
쌀,밀 6
 
2.1%
6
 
2.1%
밀가루 5
 
1.7%
쌀,인삼 5
 
1.7%
Other values (45) 77
26.6%

Most occurring characters

ValueCountFrequency (%)
205
27.3%
, 69
 
9.2%
37
 
4.9%
37
 
4.9%
22
 
2.9%
21
 
2.8%
20
 
2.7%
15
 
2.0%
15
 
2.0%
15
 
2.0%
Other values (71) 296
39.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 665
88.4%
Other Punctuation 70
 
9.3%
Open Punctuation 7
 
0.9%
Close Punctuation 7
 
0.9%
Decimal Number 3
 
0.4%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
205
30.8%
37
 
5.6%
37
 
5.6%
22
 
3.3%
21
 
3.2%
20
 
3.0%
15
 
2.3%
15
 
2.3%
15
 
2.3%
14
 
2.1%
Other values (65) 264
39.7%
Other Punctuation
ValueCountFrequency (%)
, 69
98.6%
% 1
 
1.4%
Decimal Number
ValueCountFrequency (%)
0 2
66.7%
1 1
33.3%
Open Punctuation
ValueCountFrequency (%)
( 7
100.0%
Close Punctuation
ValueCountFrequency (%)
) 7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 665
88.4%
Common 87
 
11.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
205
30.8%
37
 
5.6%
37
 
5.6%
22
 
3.3%
21
 
3.2%
20
 
3.0%
15
 
2.3%
15
 
2.3%
15
 
2.3%
14
 
2.1%
Other values (65) 264
39.7%
Common
ValueCountFrequency (%)
, 69
79.3%
( 7
 
8.0%
) 7
 
8.0%
0 2
 
2.3%
% 1
 
1.1%
1 1
 
1.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 665
88.4%
ASCII 87
 
11.6%

Most frequent character per block

Hangul
ValueCountFrequency (%)
205
30.8%
37
 
5.6%
37
 
5.6%
22
 
3.3%
21
 
3.2%
20
 
3.0%
15
 
2.3%
15
 
2.3%
15
 
2.3%
14
 
2.1%
Other values (65) 264
39.7%
ASCII
ValueCountFrequency (%)
, 69
79.3%
( 7
 
8.0%
) 7
 
8.0%
0 2
 
2.3%
% 1
 
1.1%
1 1
 
1.1%

유효기간
Categorical

HIGH CORRELATION 

Distinct: 19
Distinct (%): 6.5%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Value | Count
30일 | 89
5년 | 43
<NA> | 38
1년 | 35
없음 | 17
Other values (14) | 72

Length

Max length: 10
Median length: 7
Mean length: 2.9251701
Min length: 2

Unique

Unique: 3
Unique (%): 1.0%

Sample

1st row: 30일
2nd row: 30일
3rd row: 30일
4th row: 30일
5th row: 30일

Common Values

Value | Count | Frequency (%)
30일 | 89 | 30.3%
5년 | 43 | 14.6%
<NA> | 38 | 12.9%
1년 | 35 | 11.9%
없음 | 17 | 5.8%
3년 | 13 | 4.4%
40일 | 9 | 3.1%
2년 | 9 | 3.1%
유통기한 없음 | 9 | 3.1%
20일 | 6 | 2.0%
Other values (9) | 26 | 8.8%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
30일 | 89 | 29.1%
5년 | 43 | 14.1%
na | 38 | 12.4%
1년 | 35 | 11.4%
없음 | 26 | 8.5%
3년 | 13 | 4.2%
40일 | 9 | 2.9%
2년 | 9 | 2.9%
유통기한 | 9 | 2.9%
3개월 | 6 | 2.0%
Other values (9) | 29 | 9.5%

대표전화
Text

MISSING 

Distinct: 89
Distinct (%): 34.1%
Missing: 33
Missing (%): 11.2%
Memory size: 2.4 KiB

Length

Max length: 13
Median length: 12
Mean length: 12.015326
Min length: 12

Characters and Unicode

Total characters: 3136
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 50
Unique (%): 19.2%

Sample

1st row: 043-218-7689
2nd row: 043-218-7689
3rd row: 043-218-7689
4th row: 043-218-7689
5th row: 043-218-7689

Value | Count | Frequency (%)
043-218-7689 | 31 | 11.9%
043-288-7400 | 28 | 10.7%
043-225-3737 | 14 | 5.4%
043-537-7611 | 13 | 5.0%
043-642-9000 | 10 | 3.8%
043-534-2336 | 10 | 3.8%
080-361-0100 | 5 | 1.9%
043-877-0808 | 5 | 1.9%
043-535-3567 | 5 | 1.9%
043-855-3333 | 5 | 1.9%
Other values (79) | 135 | 51.7%

Most occurring characters

ValueCountFrequency (%)
- 522
16.6%
3 468
14.9%
0 444
14.2%
4 444
14.2%
7 258
8.2%
8 245
7.8%
2 234
7.5%
1 151
 
4.8%
5 148
 
4.7%
6 129
 
4.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2614
83.4%
Dash Punctuation 522
 
16.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 468
17.9%
0 444
17.0%
4 444
17.0%
7 258
9.9%
8 245
9.4%
2 234
9.0%
1 151
 
5.8%
5 148
 
5.7%
6 129
 
4.9%
9 93
 
3.6%
Dash Punctuation
ValueCountFrequency (%)
- 522
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3136
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 522
16.6%
3 468
14.9%
0 444
14.2%
4 444
14.2%
7 258
8.2%
8 245
7.8%
2 234
7.5%
1 151
 
4.8%
5 148
 
4.7%
6 129
 
4.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3136
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 522
16.6%
3 468
14.9%
0 444
14.2%
4 444
14.2%
7 258
8.2%
8 245
7.8%
2 234
7.5%
1 151
 
4.8%
5 148
 
4.7%
6 129
 
4.1%

홈페이지
Categorical

HIGH CORRELATION 

Distinct: 32
Distinct (%): 10.9%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Value | Count
<NA> | 167
https://www.joeunsulsj.co.kr:5009/ | 31
http://grjujo.com/home/main.php | 28
http://www.koreawine.co.kr/2011/index.php | 13
https://mssool.co.kr/ | 5
Other values (27) | 50

Length

Max length: 44
Median length: 4
Mean length: 15.710884
Min length: 4

Unique

Unique: 18
Unique (%): 6.1%

Sample

1st row: https://www.joeunsulsj.co.kr:5009/
2nd row: https://www.joeunsulsj.co.kr:5009/
3rd row: https://www.joeunsulsj.co.kr:5009/
4th row: https://www.joeunsulsj.co.kr:5009/
5th row: https://www.joeunsulsj.co.kr:5009/

Common Values

Value | Count | Frequency (%)
<NA> | 167 | 56.8%
https://www.joeunsulsj.co.kr:5009/ | 31 | 10.5%
http://grjujo.com/home/main.php | 28 | 9.5%
http://www.koreawine.co.kr/2011/index.php | 13 | 4.4%
https://mssool.co.kr/ | 5 | 1.7%
https://hwayang.co/ | 5 | 1.7%
https://duksanwine.modoo.at/ | 5 | 1.7%
http://seogawon.com/ | 5 | 1.7%
http://blog.naver.com/mokdoju | 4 | 1.4%
https://smartstore.naver.com/sinseon | 3 | 1.0%
Other values (22) | 28 | 9.5%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
na | 167 | 56.8%
https://www.joeunsulsj.co.kr:5009 | 31 | 10.5%
http://grjujo.com/home/main.php | 28 | 9.5%
http://www.koreawine.co.kr/2011/index.php | 13 | 4.4%
https://mssool.co.kr | 5 | 1.7%
https://hwayang.co | 5 | 1.7%
https://duksanwine.modoo.at | 5 | 1.7%
http://seogawon.com | 5 | 1.7%
http://blog.naver.com/mokdoju | 4 | 1.4%
https://www.jujuberry.co.kr | 3 | 1.0%
Other values (22) | 28 | 9.5%

특이사항
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 12
Distinct (%): 4.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Value | Count
<NA> | 255
술품질인증, HACCP인증 | 13
유기가공인증 | 11
충북무형문화재 | 4
우리술인증 | 2
Other values (7) | 9

Length

Max length: 20
Median length: 4
Mean length: 4.7414966
Min length: 3

Unique

Unique: 5
Unique (%): 1.7%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 255 | 86.7%
술품질인증, HACCP인증 | 13 | 4.4%
유기가공인증 | 11 | 3.7%
충북무형문화재 | 4 | 1.4%
우리술인증 | 2 | 0.7%
우리술품질인증 | 2 | 0.7%
문백면 통산마을 주민 제조 | 2 | 0.7%
HACCP | 1 | 0.3%
ISO9001 | 1 | 0.3%
GAP | 1 | 0.3%
Other values (2) | 2 | 0.7%
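
The 74.2% imbalance reported for 특이사항 can be reproduced from the counts above. One common way to score class imbalance is one minus the Shannon entropy of the category distribution divided by its maximum possible entropy; whether the report uses exactly this formula is an assumption, and the function name `imbalance_score` is hypothetical. The last two counts of 1 come from the "Other values (2)" row, which totals 2.

```python
import math


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for category counts."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # a single class has no entropy to normalise
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)


# Counts of the 12 distinct 특이사항 values, taken from the Common Values table.
counts = [255, 13, 11, 4, 2, 2, 2, 1, 1, 1, 1, 1]
score = imbalance_score(counts)

print(f"{score:.1%}")
```

A score near 1 means a single category dominates (here the literal `<NA>` value with 255 of 294 rows), while a score of 0 means all categories are equally frequent.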

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
na | 255 | 80.2%
haccp인증 | 13 | 4.1%
술품질인증 | 13 | 4.1%
유기가공인증 | 11 | 3.5%
충북무형문화재 | 4 | 1.3%
통산마을 | 2 | 0.6%
주민 | 2 | 0.6%
제조 | 2 | 0.6%
문백면 | 2 | 0.6%
우리술품질인증 | 2 | 0.6%
Other values (11) | 12 | 3.8%

Interactions


Correlations

Variable | 시군 | 주종 | 알코올도수(퍼센트) | 용량(미리리터) | 판매가(원) | 주원료 | 유효기간 | 대표전화 | 홈페이지 | 특이사항
시군 | 1.000 | 0.852 | 0.564 | 0.737 | 0.886 | 0.940 | 0.750 | 1.000 | 1.000 | 0.959
주종 | 0.852 | 1.000 | 0.853 | 0.773 | 0.940 | 0.977 | 0.885 | 0.963 | 0.929 | 0.807
알코올도수(퍼센트) | 0.564 | 0.853 | 1.000 | 0.595 | 0.963 | 0.816 | 0.906 | 0.521 | 0.691 | 0.781
용량(미리리터) | 0.737 | 0.773 | 0.595 | 1.000 | 0.934 | 0.857 | 0.704 | 0.689 | 0.000 | 0.656
판매가(원) | 0.886 | 0.940 | 0.963 | 0.934 | 1.000 | 0.958 | 0.923 | 0.000 | 0.000 | 0.883
주원료 | 0.940 | 0.977 | 0.816 | 0.857 | 0.958 | 1.000 | 0.937 | 0.952 | 0.858 | 0.943
유효기간 | 0.750 | 0.885 | 0.906 | 0.704 | 0.923 | 0.937 | 1.000 | 0.947 | 0.883 | 0.940
대표전화 | 1.000 | 0.963 | 0.521 | 0.689 | 0.000 | 0.952 | 0.947 | 1.000 | 1.000 | 0.976
홈페이지 | 1.000 | 0.929 | 0.691 | 0.000 | 0.000 | 0.858 | 0.883 | 1.000 | 1.000 | 1.000
특이사항 | 0.959 | 0.807 | 0.781 | 0.656 | 0.883 | 0.943 | 0.940 | 0.976 | 1.000 | 1.000

Variable | 시군 | 홈페이지 | 용량(미리리터) | 특이사항 | 유효기간 | 주종
시군 | 1.000 | 0.898 | 0.350 | 0.816 | 0.388 | 0.482
홈페이지 | 0.898 | 1.000 | 0.000 | 0.981 | 0.477 | 0.528
용량(미리리터) | 0.350 | 0.000 | 1.000 | 0.364 | 0.293 | 0.316
특이사항 | 0.816 | 0.981 | 0.364 | 1.000 | 0.772 | 0.532
유효기간 | 0.388 | 0.477 | 0.293 | 0.772 | 1.000 | 0.494
주종 | 0.482 | 0.528 | 0.316 | 0.532 | 0.494 | 1.000

Variable | 알코올도수(퍼센트) | 시군 | 주종 | 용량(미리리터) | 유효기간 | 홈페이지 | 특이사항
알코올도수(퍼센트) | 1.000 | 0.276 | 0.516 | 0.260 | 0.546 | 0.303 | 0.488
시군 | 0.276 | 1.000 | 0.482 | 0.350 | 0.388 | 0.898 | 0.816
주종 | 0.516 | 0.482 | 1.000 | 0.316 | 0.494 | 0.528 | 0.532
용량(미리리터) | 0.260 | 0.350 | 0.316 | 1.000 | 0.293 | 0.000 | 0.364
유효기간 | 0.546 | 0.388 | 0.494 | 0.293 | 1.000 | 0.477 | 0.772
홈페이지 | 0.303 | 0.898 | 0.528 | 0.000 | 0.477 | 1.000 | 0.981
특이사항 | 0.488 | 0.816 | 0.532 | 0.364 | 0.772 | 0.981 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
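
To make nullity correlation concrete, the sketch below computes the Pearson correlation between the missingness indicators of two columns, using only the Python standard library. The function name `nullity_correlation` and the toy columns are hypothetical and are not taken from the dataset; the report's own heatmap may be computed with a different implementation.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Undefined when a column is never missing or always missing.
        return float("nan")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical toy columns; None marks a missing value.
price = [1500, None, 2000, None, 1200, None]
phone = ["x", None, "x", None, "x", "x"]
homepage = [None, "x", None, "x", None, "x"]

print(nullity_correlation(price, phone))     # positive: often missing together
print(nullity_correlation(price, homepage))  # negative: one present when the other is missing
```

A value near 1 means two columns tend to be missing in the same rows, a value near -1 means one tends to be present exactly when the other is missing, and a value near 0 means the missingness patterns are unrelated.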

Sample

Index | 시군 | 업체명 | 주소 | 주종 | 제품명 | 알코올도수(퍼센트) | 용량(미리리터) | 판매가(원) | 주원료 | 유효기간 | 대표전화 | 홈페이지 | 특이사항
0 | 청주시 | 농업회사법인 조은술세종㈜ | 충청북도 청주시 청원구 사천로 18번길 5-2 | 탁주 | 세종청주생막걸리 | 6 | 750 | <NA> |  | 30일 | 043-218-7689 | https://www.joeunsulsj.co.kr:5009/ | <NA>
1 | 청주시 | 농업회사법인 조은술세종㈜ | 충청북도 청주시 청원구 사천로 18번길 5-2 | 탁주 | 세종청주생막걸리 | 6 | 1200 | <NA> |  | 30일 | 043-218-7689 | https://www.joeunsulsj.co.kr:5009/ | <NA>
2 | 청주시 | 농업회사법인 조은술세종㈜ | 충청북도 청주시 청원구 사천로 18번길 5-2 | 탁주 | 세종생막걸리1700 | 6 | 1700 | <NA> |  | 30일 | 043-218-7689 | https://www.joeunsulsj.co.kr:5009/ | <NA>
3 | 청주시 | 농업회사법인 조은술세종㈜ | 충청북도 청주시 청원구 사천로 18번길 5-2 | 탁주 | 세종알밤막걸리 | 6 | 750 | <NA> |  | 30일 | 043-218-7689 | https://www.joeunsulsj.co.kr:5009/ | <NA>
4 | 청주시 | 농업회사법인 조은술세종㈜ | 충청북도 청주시 청원구 사천로 18번길 5-2 | 탁주 | 세종알밤막걸리 | 6 | 1000 | <NA> |  | 30일 | 043-218-7689 | https://www.joeunsulsj.co.kr:5009/ | <NA>
5 | 청주시 | 농업회사법인 조은술세종㈜ | 충청북도 청주시 청원구 사천로 18번길 5-2 | 탁주 | 세종생막걸리 | 6 | 750 | <NA> |  | 30일 | 043-218-7689 | https://www.joeunsulsj.co.kr:5009/ | <NA>
6 | 청주시 | 농업회사법인 조은술세종㈜ | 충청북도 청주시 청원구 사천로 18번길 5-2 | 탁주 | 이동생막걸리 | 6 | 750 | <NA> |  | 30일 | 043-218-7689 | https://www.joeunsulsj.co.kr:5009/ | <NA>
7 | 청주시 | 농업회사법인 조은술세종㈜ | 충청북도 청주시 청원구 사천로 18번길 5-2 | 살균탁주 | 주포천막걸리 | 6 | 1200 | <NA> |  | 1년 | 043-218-7689 | https://www.joeunsulsj.co.kr:5009/ | <NA>
8 | 청주시 | 농업회사법인 조은술세종㈜ | 충청북도 청주시 청원구 사천로 18번길 5-2 | 살균탁주 | 조껍데기막걸리 | 6 | 750 | <NA> | 쌀,좁쌀분 | 1년 | 043-218-7689 | https://www.joeunsulsj.co.kr:5009/ | <NA>
9 | 청주시 | 농업회사법인 조은술세종㈜ | 충청북도 청주시 청원구 사천로 18번길 5-2 | 살균탁주 | 조껍데기막걸리 | 6 | 1200 | <NA> | 쌀,좁쌀분 | 1년 | 043-218-7689 | https://www.joeunsulsj.co.kr:5009/ | <NA>

Index | 시군 | 업체명 | 주소 | 주종 | 제품명 | 알코올도수(퍼센트) | 용량(미리리터) | 판매가(원) | 주원료 | 유효기간 | 대표전화 | 홈페이지 | 특이사항
284 | 음성군 | 보천양조장 | 충청북도 음성군 원남면 보천로 47 | 생막걸리 | 보천생막걸리 | 6 | 1700 | 2000 | 밀가루 | 30일 | 043-872-7016 | <NA> | <NA>
285 | 음성군 | 하나도가 | 충청북도 음성군 금왕읍 대금로1851번길 9 | 생막걸리 | 엄마막걸리 | 6 | 900 | 2000 |  | 10일 | 043-882-4583 | <NA> | <NA>
286 | 음성군 | 하나도가 | 충청북도 음성군 금왕읍 대금로1851번길 9 | 기타주류 | 태좌주 | 23 | 500 | 18000 |  | 없음 | 043-882-4583 | <NA> | <NA>
287 | 음성군 | 하나도가 | 충청북도 음성군 금왕읍 대금로1851번길 9 | 증류식소주 | 농태기 | 25 | 300 | 5000 |  | 없음 | 043-882-4583 | <NA> | <NA>
288 | 단양군 | 단양양조장 | 충청북도 단양군 도전9나길 3 | 탁주 | 단고을소백산생막걸리 | 6 | 1700 | 2000 | 쌀,밀 | 30일 | 043-422-2153 | http://danyangbrew.co.kr | <NA>
289 | 단양군 | 대강양조장 | 충청북도 단양군 대강면 대강로 60 | 탁주 | 소백산생막걸리 | 6 | 1700 | 2000 | 쌀,밀 | 30일 | 043-422-0077 | http://www.krwine.com | <NA>
290 | 단양군 | 소백산술도가 | 충청북도 단양군 대강면 대강로 60 | 약주 | 소백산 신선주 | 16 | 375 | 3500 | 대추 | 1년 | 043-422-0900 | <NA> | <NA>
291 | 단양군 | 향산산약초 작목반 영농조합 | 충청북도 단양군 가곡면 남한강로 1003 | 리큐르주 | 소백산아 | 22 | 750 | 70000 | 산양삼 | <NA> | 043-422-9597 | <NA> | <NA>
292 | 단양군 | 에델농원 | 충청북도 단양군 영춘면 별방창원로 896 | 과실주(와인) | 에델와인 | 13 | 375 | 27000 | 매실 | <NA> | 043-421-8285 | http://www.wineedel.com | <NA>
293 | 단양군 | 농업회사법인 주식회사 도깨비양조장 | 충청북도 단양군 가곡면 사평3길 5 | 탁주 | 도깨비술 | 11 | 750 | 12000 |  | 30일 | 070-4133-2033 | http://dokkaebisul.com | <NA>

Duplicate rows

Most frequently occurring

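Index | 시군 | 업체명 | 주소 | 주종 | 제품명 | 알코올도수(퍼센트) | 용량(미리리터) | 판매가(원) | 주원료 | 유효기간 | 대표전화 | 홈페이지 | 특이사항 | # duplicates
0 | 음성군 | 대소양조장 | 충청북도 음성군 대소면 오산로 13-1 | 생막걸리 | 대소생막걸리 | 6 | 1200 | 1200 |  | 30일 | 043-881-7015 | <NA> | <NA> | 2
1 | 진천군 | 잣나무골술도가 | 충청북도 진천군 백곡면 장터길 16-1 | 기타주류 | 맛있는옥수수생막걸리 | 6 | 1000 | 2000 |  | 30일 | 043-534-2336 | <NA> | <NA> | 2
2 | 진천군 | 잣나무골술도가 | 충청북도 진천군 백곡면 장터길 16-1 | 기타주류 | 진천누룽지생막걸리 | 6 | 1700 | 2000 |  | 30일 | 043-534-2336 | <NA> | <NA> | 2
3 | 진천군 | 잣나무골술도가 | 충청북도 진천군 백곡면 장터길 16-1 | 기타주류 | 진천백곡알밤생막걸리 | 6 | 1000 | 2000 |  | 30일 | 043-534-2336 | <NA> | <NA> | 2
4 | 진천군 | 잣나무골술도가 | 충청북도 진천군 백곡면 장터길 16-1 | 생막걸리 | 진천백곡생동동주 | 6 | 1700 | 1500 |  | 30일 | 043-534-2336 | <NA> | <NA> | 2
5 | 진천군 | 잣나무골술도가 | 충청북도 진천군 백곡면 장터길 16-1 | 생막걸리 | 진천백곡생막걸리 | 6 | 1700 | 1500 |  | 30일 | 043-534-2336 | <NA> | <NA> | 2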
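
A table like the one above can be produced by counting how often each complete row occurs. The following is a minimal standard-library Python sketch; the function name `duplicate_summary` and the toy rows are hypothetical, and the exact definition the report uses for its duplicate count is not assumed here.

```python
from collections import Counter


def duplicate_summary(rows):
    """Return rows that occur more than once, with counts, and the surplus row total."""
    counts = Counter(tuple(row) for row in rows)
    repeated = [(row, n) for row, n in counts.items() if n > 1]
    # Surplus rows are the copies beyond the first occurrence of each row.
    surplus = sum(n - 1 for _, n in repeated)
    return repeated, surplus


# Hypothetical toy rows; each tuple stands for one full record.
rows = [
    ("A", "x", 6),
    ("A", "x", 6),
    ("B", "y", 12),
    ("C", "z", 6),
    ("C", "z", 6),
]
repeated, surplus = duplicate_summary(rows)

for row, n in repeated:
    print(row, n)
print("surplus rows:", surplus)
```

In the table above every listed row occurs exactly twice, so the number of repeated rows and the number of surplus copies are both 6.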