Overview

Dataset statistics

Number of variables: 5
Number of observations: 369
Missing cells: 17
Missing cells (%): 0.9%
Duplicate rows: 2
Duplicate rows (%): 0.5%
Total size in memory: 14.5 KiB
Average record size in memory: 40.4 B
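
The two percentages follow from the raw counts: missing cells are measured against every cell (observations × variables), and duplicate rows against the number of observations. A minimal sketch of that arithmetic, using only the figures reported above (the variable names are illustrative):

```python
# Figures copied from the dataset statistics above.
n_variables = 5
n_observations = 369
missing_cells = 17
duplicate_rows = 2

# Missing cells are counted against every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Duplicate rows are counted against the number of observations.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
```

Both values round to the percentages shown in the report.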

Variable types

Categorical: 2
Text: 3

Alerts

Dataset has 2 (0.5%) duplicate rows (Duplicates)
전화번호 has 17 (4.6%) missing values (Missing)

Reproduction

Analysis started: 2023-12-10 21:36:02.843218
Analysis finished: 2023-12-10 21:36:03.415552
Duration: 0.57 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시군명 (city/county name)
Categorical

Distinct: 24
Distinct (%): 6.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

광주시: 33
가평군: 32
포천시: 31
이천시: 31
구리시: 30
Other values (19): 212

Length

Max length: 4
Median length: 3
Mean length: 3.0298103
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%
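
Distinct is 24 here while Unique is 0. Assuming the report's usual definitions (distinct = number of different values, unique = values that occur exactly once), this means every city or county appears at least twice. A small sketch of the two counts on a hypothetical toy column:

```python
from collections import Counter

# Hypothetical toy column, not taken from the dataset.
values = ["가평군", "가평군", "광주시", "광주시", "광주시", "포천시"]

counts = Counter(values)
n_distinct = len(counts)                              # different values
n_unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

print(n_distinct, n_unique)
```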

Sample

1st row: 가평군
2nd row: 가평군
3rd row: 가평군
4th row: 가평군
5th row: 가평군

Common Values

Value | Count | Frequency (%)
광주시 | 33 | 8.9%
가평군 | 32 | 8.7%
포천시 | 31 | 8.4%
이천시 | 31 | 8.4%
구리시 | 30 | 8.1%
연천군 | 29 | 7.9%
용인시 | 28 | 7.6%
김포시 | 27 | 7.3%
고양시 | 18 | 4.9%
파주시 | 17 | 4.6%
Other values (14) | 93 | 25.2%

Length

[Figure: Histogram of lengths of the category]

생산자명 (producer name)
Text

Distinct: 335
Distinct (%): 90.8%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

Length

Max length: 16
Median length: 13
Mean length: 7.2195122
Min length: 3
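
The mean length can be cross-checked against the total character count reported for this variable in the Characters and Unicode subsection (2664) and the 369 observations. A short sketch of that check, with illustrative variable names:

```python
total_characters = 2664   # "Total characters" reported for this variable
n_values = 369            # observations; this variable has no missing values

mean_length = total_characters / n_values
print(round(mean_length, 7))
```

The result agrees with the reported mean length of 7.2195122.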

Characters and Unicode

Total characters: 2664
Distinct characters: 290
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
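
As an illustration of those character properties, the general category of each code point can be read with Python's standard `unicodedata` module. This is only a sketch and not necessarily how the report builds its tables; the module exposes general categories but not scripts or blocks, so only categories are counted. The sample string combines a producer name and a phone number that both appear in the report.

```python
import unicodedata
from collections import Counter

# Producer name and phone number taken from the duplicate-rows table.
text = "이동주조㈜ 031-535-2800"

# unicodedata.category returns a two-letter general category code,
# e.g. "Lo" (Other Letter), "Nd" (Decimal Number), "Zs" (Space Separator).
category_counts = Counter(unicodedata.category(ch) for ch in text)

for code, count in category_counts.most_common():
    print(code, count)
```

In this string the Hangul syllables fall under "Lo", the digits under "Nd", the hyphens under "Pd" (Dash Punctuation), the space under "Zs" and the ㈜ sign under "So" (Other Symbol).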

Unique

Unique: 316
Unique (%): 85.6%

Sample

1st row: 가평 농산
2nd row: 가평 잣집
3rd row: 가평 풍물 잣 공장
4th row: 가평군 농협 북면 지점
5th row: 가평군 농협 하면 지점

Value | Count | Frequency (%)
작목반 | 18 | 3.7%
포도 | 10 | 2.0%
화성시 | 7 | 1.4%
유통사업단 | 7 | 1.4%
사과 | 7 | 1.4%
농산물 | 7 | 1.4%
포천시연합사업단 | 6 | 1.2%
경기도 | 4 | 0.8%
사이버농장 | 4 | 0.8%
대성농산 | 4 | 0.8%
Other values (367) | 416 | 84.9%

Most occurring characters

Value | Count | Frequency (%)
(space) | 146 | 5.5%
 | 140 | 5.3%
 | 70 | 2.6%
 | 66 | 2.5%
 | 58 | 2.2%
 | 57 | 2.1%
 | 56 | 2.1%
 | 55 | 2.1%
 | 52 | 2.0%
 | 49 | 1.8%
Other values (280) | 1915 | 71.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2440 | 91.6%
Space Separator | 146 | 5.5%
Other Symbol | 41 | 1.5%
Lowercase Letter | 19 | 0.7%
Decimal Number | 11 | 0.4%
Other Punctuation | 7 | 0.3%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 140 | 5.7%
 | 70 | 2.9%
 | 66 | 2.7%
 | 58 | 2.4%
 | 57 | 2.3%
 | 56 | 2.3%
 | 55 | 2.3%
 | 52 | 2.1%
 | 49 | 2.0%
 | 48 | 2.0%
Other values (259) | 1789 | 73.3%

Lowercase Letter
Value | Count | Frequency (%)
p | 4 | 21.1%
m | 4 | 21.1%
c | 2 | 10.5%
f | 2 | 10.5%
a | 2 | 10.5%
g | 1 | 5.3%
z | 1 | 5.3%
d | 1 | 5.3%
r | 1 | 5.3%
s | 1 | 5.3%

Other Punctuation
Value | Count | Frequency (%)
; | 2 | 28.6%
& | 2 | 28.6%
, | 1 | 14.3%
? | 1 | 14.3%
: | 1 | 14.3%

Decimal Number
Value | Count | Frequency (%)
0 | 5 | 45.5%
1 | 3 | 27.3%
2 | 2 | 18.2%
6 | 1 | 9.1%

Space Separator
Value | Count | Frequency (%)
(space) | 146 | 100.0%

Other Symbol
Value | Count | Frequency (%)
 | 41 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2481 | 93.1%
Common | 164 | 6.2%
Latin | 19 | 0.7%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 140 | 5.6%
 | 70 | 2.8%
 | 66 | 2.7%
 | 58 | 2.3%
 | 57 | 2.3%
 | 56 | 2.3%
 | 55 | 2.2%
 | 52 | 2.1%
 | 49 | 2.0%
 | 48 | 1.9%
Other values (260) | 1830 | 73.8%

Common
Value | Count | Frequency (%)
(space) | 146 | 89.0%
0 | 5 | 3.0%
1 | 3 | 1.8%
2 | 2 | 1.2%
; | 2 | 1.2%
& | 2 | 1.2%
, | 1 | 0.6%
? | 1 | 0.6%
: | 1 | 0.6%
6 | 1 | 0.6%

Latin
Value | Count | Frequency (%)
p | 4 | 21.1%
m | 4 | 21.1%
c | 2 | 10.5%
f | 2 | 10.5%
a | 2 | 10.5%
g | 1 | 5.3%
z | 1 | 5.3%
d | 1 | 5.3%
r | 1 | 5.3%
s | 1 | 5.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2440 | 91.6%
ASCII | 183 | 6.9%
None | 41 | 1.5%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 146 | 79.8%
0 | 5 | 2.7%
p | 4 | 2.2%
m | 4 | 2.2%
1 | 3 | 1.6%
c | 2 | 1.1%
2 | 2 | 1.1%
f | 2 | 1.1%
; | 2 | 1.1%
a | 2 | 1.1%
Other values (10) | 11 | 6.0%

Hangul
Value | Count | Frequency (%)
 | 140 | 5.7%
 | 70 | 2.9%
 | 66 | 2.7%
 | 58 | 2.4%
 | 57 | 2.3%
 | 56 | 2.3%
 | 55 | 2.3%
 | 52 | 2.1%
 | 49 | 2.0%
 | 48 | 2.0%
Other values (259) | 1789 | 73.3%

None
Value | Count | Frequency (%)
 | 41 | 100.0%

가공식품분류명 (processed food category name)
Categorical

Distinct: 5
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

농산물: 239
가공식품: 82
축산물: 20
수산물: 16
임산물: 12

Length

Max length: 4
Median length: 3
Mean length: 3.2222222
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 농산물
2nd row: 농산물
3rd row: 농산물
4th row: 농산물
5th row: 농산물

Common Values

Value | Count | Frequency (%)
농산물 | 239 | 64.8%
가공식품 | 82 | 22.2%
축산물 | 20 | 5.4%
수산물 | 16 | 4.3%
임산물 | 12 | 3.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

가공식품품목명 (processed food item name)
Text

Distinct: 124
Distinct (%): 33.6%
Missing: 0
Missing (%): 0.0%
Memory size: 3.0 KiB

Length

Max length: 31
Median length: 10
Mean length: 3.8238482
Min length: 1

Characters and Unicode

Total characters: 1411
Distinct characters: 185
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 76
Unique (%): 20.6%

Sample

1st row:
2nd row:
3rd row:
4th row: 사과
5th row: 포도

Value | Count | Frequency (%)
포도 | 31 | 7.8%
 | 30 | 7.5%
토마토 | 19 | 4.8%
버섯 | 15 | 3.8%
 | 12 | 3.0%
 | 12 | 3.0%
막걸리 | 12 | 3.0%
사과 | 9 | 2.2%
과일전품목 | 8 | 2.0%
양평 | 8 | 2.0%
Other values (117) | 244 | 61.0%

Most occurring characters

Value | Count | Frequency (%)
(space) | 124 | 8.8%
 | 44 | 3.1%
 | 44 | 3.1%
 | 43 | 3.0%
 | 42 | 3.0%
 | 41 | 2.9%
 | 34 | 2.4%
 | 31 | 2.2%
 | 31 | 2.2%
 | 25 | 1.8%
Other values (175) | 952 | 67.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1212 | 85.9%
Space Separator | 124 | 8.8%
Other Punctuation | 26 | 1.8%
Close Punctuation | 23 | 1.6%
Open Punctuation | 23 | 1.6%
Lowercase Letter | 3 | 0.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 44 | 3.6%
 | 44 | 3.6%
 | 43 | 3.5%
 | 42 | 3.5%
 | 41 | 3.4%
 | 34 | 2.8%
 | 31 | 2.6%
 | 31 | 2.6%
 | 25 | 2.1%
 | 25 | 2.1%
Other values (167) | 852 | 70.3%

Lowercase Letter
Value | Count | Frequency (%)
d | 1 | 33.3%
m | 1 | 33.3%
z | 1 | 33.3%

Other Punctuation
Value | Count | Frequency (%)
, | 22 | 84.6%
/ | 4 | 15.4%

Space Separator
Value | Count | Frequency (%)
(space) | 124 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 23 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 23 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1212 | 85.9%
Common | 196 | 13.9%
Latin | 3 | 0.2%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 44 | 3.6%
 | 44 | 3.6%
 | 43 | 3.5%
 | 42 | 3.5%
 | 41 | 3.4%
 | 34 | 2.8%
 | 31 | 2.6%
 | 31 | 2.6%
 | 25 | 2.1%
 | 25 | 2.1%
Other values (167) | 852 | 70.3%

Common
Value | Count | Frequency (%)
(space) | 124 | 63.3%
) | 23 | 11.7%
( | 23 | 11.7%
, | 22 | 11.2%
/ | 4 | 2.0%

Latin
Value | Count | Frequency (%)
d | 1 | 33.3%
m | 1 | 33.3%
z | 1 | 33.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1212 | 85.9%
ASCII | 199 | 14.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 124 | 62.3%
) | 23 | 11.6%
( | 23 | 11.6%
, | 22 | 11.1%
/ | 4 | 2.0%
d | 1 | 0.5%
m | 1 | 0.5%
z | 1 | 0.5%

Hangul
Value | Count | Frequency (%)
 | 44 | 3.6%
 | 44 | 3.6%
 | 43 | 3.5%
 | 42 | 3.5%
 | 41 | 3.4%
 | 34 | 2.8%
 | 31 | 2.6%
 | 31 | 2.6%
 | 25 | 2.1%
 | 25 | 2.1%
Other values (167) | 852 | 70.3%

전화번호 (phone number)
Text

MISSING 

Distinct: 241
Distinct (%): 68.5%
Missing: 17
Missing (%): 4.6%
Memory size: 3.0 KiB

Length

Max length: 27
Median length: 12
Mean length: 12.534091
Min length: 8

Characters and Unicode

Total characters: 4412
Distinct characters: 15
Distinct categories: 5
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 222
Unique (%): 63.1%

Sample

1st row: 031-584-0712
2nd row: 031-582-2294
3rd row: 031-582-0481
4th row: 031-582-2590
5th row: 031-585-0050~2

Value | Count | Frequency (%)
 | 89 | 24.7%
031-278-3600 | 7 | 1.9%
031-539-8602 | 6 | 1.7%
031-833-2606 | 4 | 1.1%
031-983-7751 | 3 | 0.8%
031-919-8663 | 3 | 0.8%
031-761-5334 | 2 | 0.6%
031-768-9345 | 2 | 0.6%
031-535-2800 | 2 | 0.6%
031-847-3208 | 2 | 0.6%
Other values (231) | 240 | 66.7%

Most occurring characters

ValueCountFrequency (%)
* | 966 | 21.9%
- | 716 | 16.2%
0 | 468 | 10.6%
3 | 463 | 10.5%
1 | 344 | 7.8%
5 | 278 | 6.3%
2 | 227 | 5.1%
8 | 212 | 4.8%
6 | 201 | 4.6%
9 | 168 | 3.8%
Other values (5) | 369 | 8.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 2678 | 60.7%
Other Punctuation | 975 | 22.1%
Dash Punctuation | 716 | 16.2%
Space Separator | 40 | 0.9%
Math Symbol | 3 | 0.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 468 | 17.5%
3 | 463 | 17.3%
1 | 344 | 12.8%
5 | 278 | 10.4%
2 | 227 | 8.5%
8 | 212 | 7.9%
6 | 201 | 7.5%
9 | 168 | 6.3%
7 | 164 | 6.1%
4 | 153 | 5.7%

Other Punctuation
Value | Count | Frequency (%)
* | 966 | 99.1%
, | 9 | 0.9%

Dash Punctuation
Value | Count | Frequency (%)
- | 716 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 40 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 3 | 100.0%

Most occurring scripts

ValueCountFrequency (%)
Common | 4412 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
* | 966 | 21.9%
- | 716 | 16.2%
0 | 468 | 10.6%
3 | 463 | 10.5%
1 | 344 | 7.8%
5 | 278 | 6.3%
2 | 227 | 5.1%
8 | 212 | 4.8%
6 | 201 | 4.6%
9 | 168 | 3.8%
Other values (5) | 369 | 8.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII | 4412 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
* | 966 | 21.9%
- | 716 | 16.2%
0 | 468 | 10.6%
3 | 463 | 10.5%
1 | 344 | 7.8%
5 | 278 | 6.3%
2 | 227 | 5.1%
8 | 212 | 4.8%
6 | 201 | 4.6%
9 | 168 | 3.8%
Other values (5) | 369 | 8.4%
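
The asterisk is the most frequent character in this column (966 of 4412), and the duplicate-rows table at the end of the report shows a value of `***-****-****`. This suggests that some phone numbers are masked rather than empty, so they are not part of the 17 missing values. A minimal sketch for flagging such entries follows; the pattern and the helper name `is_masked` are assumptions for illustration, not something the report defines.

```python
import re

# Assumption: a masked number consists only of asterisks and hyphens.
MASKED = re.compile(r"^\*+(?:-\*+)*$")

def is_masked(phone):
    """Return True if the value looks like a masked phone number."""
    return phone is not None and bool(MASKED.match(phone.strip()))

samples = ["031-584-0712", "***-****-****", "031-585-0050~2", None]
flags = [is_masked(s) for s in samples]
print(flags)
```

Whether masked values should be treated as missing depends on the analysis; the sketch only separates them from real numbers.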

Correlations

[Figure: correlation plot]
 | 시군명 | 가공식품분류명
시군명 | 1.000 | 0.654
가공식품분류명 | 0.654 | 1.000

[Figure: correlation plot]
 | 시군명 | 가공식품분류명
시군명 | 1.000 | 0.375
가공식품분류명 | 0.375 | 1.000

[Figure: correlation plot]
 | 시군명 | 가공식품분류명
시군명 | 1.000 | 0.375
가공식품분류명 | 0.375 | 1.000
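
The extract does not say which coefficient each matrix reports. One common measure of association between two categorical columns is Cramér's V, which is derived from the chi-squared statistic of their contingency table. The sketch below is an uncorrected implementation on hypothetical toy data, shown only to illustrate how such a value can be obtained; it is not claimed to reproduce the figures above.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Cramér's V (no bias correction) for two equal-length categorical sequences."""
    n = len(xs)
    pair_counts = Counter(zip(xs, ys))
    x_counts = Counter(xs)
    y_counts = Counter(ys)
    chi2 = 0.0
    for x, nx in x_counts.items():
        for y, ny in y_counts.items():
            expected = nx * ny / n
            observed = pair_counts.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(x_counts), len(y_counts)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Hypothetical toy data, not from the dataset.
city = ["A", "A", "B", "B"]
category = ["x", "x", "y", "y"]
print(cramers_v(city, category))
```

A value of 1 means one column fully determines the other, and 0 means no association.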

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

 | 시군명 | 생산자명 | 가공식품분류명 | 가공식품품목명 | 전화번호
0 | 가평군 | 가평 농산 | 농산물 |  | 031-584-0712
1 | 가평군 | 가평 잣집 | 농산물 |  | 031-582-2294
2 | 가평군 | 가평 풍물 잣 공장 | 농산물 |  | 031-582-0481
3 | 가평군 | 가평군 농협 북면 지점 | 농산물 | 사과 | 031-582-2590
4 | 가평군 | 가평군 농협 하면 지점 | 농산물 | 포도 | 031-585-0050~2
5 | 가평군 | 가평군 산림조합 | 농산물 |  | 031-582-2207
6 | 가평군 | 가평요나농산 | 임산물 | 버섯 | 02-584-7500
7 | 가평군 | 가평읍 사과 작목반 | 농산물 | 사과 | 031-582-2751
8 | 가평군 | 가평잣 정보화마을 | 농산물 |  | 031-582-6089
9 | 가평군 | 고품질 사과 작목반 | 농산물 | 사과 | 031-582-0677

 | 시군명 | 생산자명 | 가공식품분류명 | 가공식품품목명 | 전화번호
359 | 하남시 | 하남시청 농업지원과 | 농산물 | 부추 | 031-790-6274
360 | 하남시 | 하남채소작목반연합회 | 농산물 | 시설채소 | 031-791-7749
361 | 화성시 | 화성시 농산물 유통사업단 | 가공식품 | 김치 | 031-278-3600
362 | 화성시 | 화성시 농산물 유통사업단 | 농산물 |  | 031-278-3600
363 | 화성시 | 화성시 농산물 유통사업단 | 농산물 | 신선채소 | 031-278-3600
364 | 화성시 | 화성시 농산물 유통사업단 | 농산물 | 느타리버섯 | 031-278-3600
365 | 화성시 | 화성시 농산물 유통사업단 | 농산물 |  | 031-278-3600
366 | 화성시 | 화성시 농산물 유통사업단 | 농산물 | 포도 | 031-278-3600
367 | 화성시 | 화성시 농산물 유통사업단 | 농산물 | 표고버섯 | 031-278-3600
368 | 화성시 | 화성웰빙떡클러스터사업단 | 가공식품 |  | 031-352-7521

Duplicate rows

Most frequently occurring

 | 시군명 | 생산자명 | 가공식품분류명 | 가공식품품목명 | 전화번호 | # duplicates
0 | 안양시 | 안양포도농가 | 농산물 | 포도 | ***-****-**** | 2
1 | 포천시 | 이동주조㈜ | 가공식품 | 막걸리 | 031-535-2800 | 2
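
A row counts as a duplicate when all five columns match another row exactly. The following sketch shows one way to find such rows with the standard library; it is an illustration, not the report's own implementation, and the three-row input is assembled by hand from rows shown in this report.

```python
from collections import Counter

# One row from the duplicate-rows table, listed twice,
# plus one row from the sample table.
rows = [
    ("포천시", "이동주조㈜", "가공식품", "막걸리", "031-535-2800"),
    ("가평군", "가평요나농산", "임산물", "버섯", "02-584-7500"),
    ("포천시", "이동주조㈜", "가공식품", "막걸리", "031-535-2800"),
]

# Count identical rows and keep those seen more than once.
row_counts = Counter(rows)
duplicates = {row: n for row, n in row_counts.items() if n > 1}

for row, n in duplicates.items():
    print(row, n)
```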