Overview

Dataset statistics

Number of variables: 2
Number of observations: 1873
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 29.4 KiB
Average record size in memory: 16.1 B
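
The derived figures in this block follow from the raw counts by simple arithmetic. The sketch below recomputes them from the numbers reported above; the variable names are illustrative, and the check assumes the average record size is the total memory size divided by the number of observations.

```python
# Figures copied from the dataset statistics in this report.
n_variables = 2
n_observations = 1873
missing_cells = 0
total_size_kib = 29.4

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations

# Share of cells that are missing, as a percentage.
missing_pct = 100 * missing_cells / total_cells

# Assumed definition: total memory (in bytes) spread evenly over observations.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(total_cells)
print(f"{missing_pct:.1f}%")
print(f"{avg_record_bytes:.1f} B")
```

Under that assumption the result rounds to the 16.1 B shown in the report.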

Variable types

Text: 2

Dataset

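Description: Daegu Metropolitan City_List of Origins by Traded Seafood Variety_20210630 (대구광역시_수산물 거래 품종별 원산지 목록_20210630)
Author: Daegu Metropolitan City (대구광역시)
URL: http://data.daegu.go.kr/open/data/dataView.do?dataSetId=15086039&dataSetDetailId=150860391c4917c273346&provdMethod=FILE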

Reproduction

Analysis started: 2023-12-10 19:48:16.095922
Analysis finished: 2023-12-10 19:48:16.642226
Duration: 0.55 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
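
A report of this kind can be regenerated with ydata-profiling's `ProfileReport` class. The following is a minimal sketch, not the configuration actually used here (that lives in config.json, which is not reproduced in this document). The function name `build_report`, the file paths, the report title, and the `cp949` encoding are all assumptions made for illustration.

```python
def build_report(csv_path: str, out_path: str = "report.html",
                 encoding: str = "cp949") -> None:
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so the function can be defined without the
    # third-party packages being installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; adjust it to match the downloaded file.
    df = pd.read_csv(csv_path, encoding=encoding)

    # Default settings; the original run may have used different options.
    profile = ProfileReport(df, title="Seafood origin list (hypothetical title)")
    profile.to_file(out_path)
```

A call such as `build_report("seafood_origins.csv")` (a hypothetical file name) would write `report.html` to the current directory.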

Variables

품종 (variety)
Text

Distinct: 705
Distinct (%): 37.6%
Missing: 0
Missing (%): 0.0%
Memory size: 14.8 KiB

Length

Max length: 12
Median length: 11
Mean length: 5.2712226
Min length: 2

Characters and Unicode

Total characters: 9873
Distinct characters: 316
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
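
As an illustration of how such properties can be read per character, the sketch below uses Python's standard `unicodedata` module to count general categories. The helper names `category_counts` and `block_of` are hypothetical, and `block_of` covers only three hand-picked code point ranges; it is an approximation, not the lookup the profiling tool itself performs.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in the values."""
    counts = Counter()
    for value in values:
        for ch in value:
            # Two-letter codes: Lo = Other Letter, Zs = Space Separator,
            # Ps/Pe = Open/Close Punctuation, Po = Other Punctuation,
            # Nd = Decimal Number.
            counts[unicodedata.category(ch)] += 1
    return counts


def block_of(ch):
    """Rough block lookup for three ranges only (an assumption, not a full table)."""
    cp = ord(ch)
    if cp < 0x80:
        return "ASCII"
    if 0xAC00 <= cp <= 0xD7AF:   # Hangul Syllables block
        return "Hangul"
    if 0x4E00 <= cp <= 0x9FFF:   # CJK Unified Ideographs block
        return "CJK"
    return "Other"


sample = ["냉동 명태", "활 전어", "냉동 갈치(기타)"]
counts = category_counts(sample)
print(counts)
print(block_of("냉"), block_of("("))
```

The standard library exposes categories but not scripts or blocks, which is why the block lookup here is written by hand.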

Unique

Unique: 330
Unique (%): 17.6%
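
"Distinct" and "Unique" measure different things: the first counts how many different values occur at all, the second how many values occur exactly once. A minimal sketch of that distinction follows, assuming these are the definitions in use; the function name and the five-row sample are illustrative only.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (distinct, distinct %, unique, unique %) for a list of values."""
    counts = Counter(values)
    n = len(values)
    distinct = len(counts)                                # different values
    unique = sum(1 for c in counts.values() if c == 1)    # values seen once
    return distinct, 100 * distinct / n, unique, 100 * unique / n


values = ["냉동 명태", "활 전어", "냉동 갈치", "냉동 명태", "신선 굴"]
result = distinct_and_unique(values)
print(result)

# Percentages recomputed from the counts reported for this variable.
print(round(100 * 705 / 1873, 1), round(100 * 330 / 1873, 1))
```

The last line reproduces the 37.6% and 17.6% shown above from the raw counts and the 1873 observations.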

Sample

1st row: 냉동 명태 (frozen pollock)
2nd row: 활 전어 (live gizzard shad)
3rd row: 냉동 갈치 (frozen hairtail)
4th row: 신선 굴 (fresh oyster)
5th row: 건 과메기 (dried gwamegi)
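Value                Count  Frequency (%)
냉동 (frozen)          672   20.1%
신선 (fresh)           363   10.8%
[not recovered]        281    8.4%
[not recovered]        155    4.6%
기타 (other)            56    1.7%
오징어 (squid)          41    1.2%
고등어 (mackerel)       29    0.9%
새우 (shrimp)           29    0.9%
갈치 (hairtail)         27    0.8%
가자미 (flounder)       25    0.7%
Other values (488)    1668   49.9%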
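
The counts in this table sum to 3346, which equals the 1473 space characters plus the 1873 observations reported for this column. That is consistent with each value being split on whitespace and the resulting words being counted. The sketch below shows that approach under this assumption; `word_frequencies` is a hypothetical helper, not the profiling tool's own routine.

```python
from collections import Counter


def word_frequencies(values):
    """Split each value on whitespace and return (word, count, percent) rows."""
    counts = Counter(word for value in values for word in value.split())
    total = sum(counts.values())
    return [(word, count, 100 * count / total)
            for word, count in counts.most_common()]


sample = ["냉동 명태", "활 전어", "냉동 갈치", "신선 굴", "건 과메기"]
rows = word_frequencies(sample)
print(rows[0])

# Cross-check against the figures in the table above.
print(1473 + 1873, round(100 * 672 / 3346, 1))
```

With 3346 words in total, the 672 occurrences of 냉동 come out at the 20.1% shown in the table.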

Most occurring characters

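Value                Count  Frequency (%)
(space)               1473   14.9%
[not recovered]        688    7.0%
[not recovered]        672    6.8%
[not recovered]        367    3.7%
[not recovered]        363    3.7%
[not recovered]        348    3.5%
[not recovered]        307    3.1%
[not recovered]        287    2.9%
)                      192    1.9%
(                      192    1.9%
Other values (306)    4984   50.5%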

Most occurring categories

Value                Count  Frequency (%)
Other Letter          7996   81.0%
Space Separator       1473   14.9%
Close Punctuation      192    1.9%
Open Punctuation       192    1.9%
Other Punctuation       17    0.2%
Decimal Number           3   < 0.1%

Most frequent character per category

Other Letter
Value                Count  Frequency (%)
[not recovered]        688    8.6%
[not recovered]        672    8.4%
[not recovered]        367    4.6%
[not recovered]        363    4.5%
[not recovered]        348    4.4%
[not recovered]        307    3.8%
[not recovered]        287    3.6%
[not recovered]        186    2.3%
[not recovered]        181    2.3%
[not recovered]        164    2.1%
Other values (300)    4433   55.4%

Decimal Number
Value                Count  Frequency (%)
1                        2   66.7%
4                        1   33.3%

Space Separator
Value                Count  Frequency (%)
(space)               1473  100.0%

Close Punctuation
Value                Count  Frequency (%)
)                      192  100.0%

Open Punctuation
Value                Count  Frequency (%)
(                      192  100.0%

Other Punctuation
Value                Count  Frequency (%)
/                       17  100.0%

Most occurring scripts

Value                Count  Frequency (%)
Hangul                7995   81.0%
Common                1877   19.0%
Han                      1   < 0.1%

Most frequent character per script

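Hangul
Value                Count  Frequency (%)
[not recovered]        688    8.6%
[not recovered]        672    8.4%
[not recovered]        367    4.6%
[not recovered]        363    4.5%
[not recovered]        348    4.4%
[not recovered]        307    3.8%
[not recovered]        287    3.6%
[not recovered]        186    2.3%
[not recovered]        181    2.3%
[not recovered]        164    2.1%
Other values (299)    4432   55.4%

Common
Value                Count  Frequency (%)
(space)               1473   78.5%
)                      192   10.2%
(                      192   10.2%
/                       17    0.9%
1                        2    0.1%
4                        1    0.1%

Han
Value                Count  Frequency (%)
[not recovered]          1  100.0%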

Most occurring blocks

Value                Count  Frequency (%)
Hangul                7995   81.0%
ASCII                 1877   19.0%
CJK                      1   < 0.1%

Most frequent character per block

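ASCII
Value                Count  Frequency (%)
(space)               1473   78.5%
)                      192   10.2%
(                      192   10.2%
/                       17    0.9%
1                        2    0.1%
4                        1    0.1%

Hangul
Value                Count  Frequency (%)
[not recovered]        688    8.6%
[not recovered]        672    8.4%
[not recovered]        367    4.6%
[not recovered]        363    4.5%
[not recovered]        348    4.4%
[not recovered]        307    3.8%
[not recovered]        287    3.6%
[not recovered]        186    2.3%
[not recovered]        181    2.3%
[not recovered]        164    2.1%
Other values (299)    4432   55.4%

CJK
Value                Count  Frequency (%)
[not recovered]          1  100.0%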

원산지 (origin)
Text

Distinct: 141
Distinct (%): 7.5%
Missing: 0
Missing (%): 0.0%
Memory size: 14.8 KiB

Length

Max length: 9
Median length: 8
Mean length: 3.6572344
Min length: 2

Characters and Unicode

Total characters: 6850
Distinct characters: 164
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
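
The mean length and the total character count are linked: dividing the total characters by the 1873 observations gives the mean length. The sketch below checks this against the figures reported for both variables and shows the same computation on a small sample; the sample list and the variable names are illustrative.

```python
from statistics import mean

n_observations = 1873

# Total characters reported for each variable in this document.
total_chars = {"품종": 9873, "원산지": 6850}

# Mean length recomputed as total characters / observations.
mean_lengths = {name: total / n_observations
                for name, total in total_chars.items()}
print(mean_lengths)

# The same relationship on a small illustrative sample.
sample = ["러시아 연방", "한국", "한국", "한국", "원양산"]
lengths = [len(value) for value in sample]
print(min(lengths), max(lengths), mean(lengths), sum(lengths) / len(sample))
```

Both recomputed means agree with the reported 5.2712226 and 3.6572344 to the precision shown.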

Unique

Unique: 39
Unique (%): 2.1%

Sample

1st row: 러시아 연방 (Russian Federation)
2nd row: 한국 (Korea)
3rd row: 한국 (Korea)
4th row: 한국 (Korea)
5th row: 원양산 (distant-water catch)
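Value                    Count  Frequency (%)
한국 (Korea)               418   17.3%
경북 (Gyeongbuk)           209    8.7%
중국 (China)               156    6.5%
동해산 (East Sea catch)     79    3.3%
러시아 (Russia)             73    3.0%
연방 (Federation)           73    3.0%
전남 (Jeonnam)              63    2.6%
베트남 (Vietnam)            61    2.5%
수입산 (imported)           59    2.4%
의성군 (Uiseong-gun)        56    2.3%
Other values (142)        1163   48.3%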

Most occurring characters

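Value                Count  Frequency (%)
(space)                712   10.4%
[not recovered]        632    9.2%
[not recovered]        418    6.1%
[not recovered]        345    5.0%
[not recovered]        273    4.0%
[not recovered]        236    3.4%
[not recovered]        234    3.4%
[not recovered]        233    3.4%
[not recovered]        180    2.6%
[not recovered]        173    2.5%
Other values (154)    3414   49.8%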

Most occurring categories

Value                Count  Frequency (%)
Other Letter          6138   89.6%
Space Separator        712   10.4%

Most frequent character per category

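Other Letter
Value                Count  Frequency (%)
[not recovered]        632   10.3%
[not recovered]        418    6.8%
[not recovered]        345    5.6%
[not recovered]        273    4.4%
[not recovered]        236    3.8%
[not recovered]        234    3.8%
[not recovered]        233    3.8%
[not recovered]        180    2.9%
[not recovered]        173    2.8%
[not recovered]        163    2.7%
Other values (153)    3251   53.0%

Space Separator
Value                Count  Frequency (%)
(space)                712  100.0%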

Most occurring scripts

Value                Count  Frequency (%)
Hangul                6138   89.6%
Common                 712   10.4%

Most frequent character per script

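Hangul
Value                Count  Frequency (%)
[not recovered]        632   10.3%
[not recovered]        418    6.8%
[not recovered]        345    5.6%
[not recovered]        273    4.4%
[not recovered]        236    3.8%
[not recovered]        234    3.8%
[not recovered]        233    3.8%
[not recovered]        180    2.9%
[not recovered]        173    2.8%
[not recovered]        163    2.7%
Other values (153)    3251   53.0%

Common
Value                Count  Frequency (%)
(space)                712  100.0%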

Most occurring blocks

Value                Count  Frequency (%)
Hangul                6138   89.6%
ASCII                  712   10.4%

Most frequent character per block

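ASCII
Value                Count  Frequency (%)
(space)                712  100.0%

Hangul
Value                Count  Frequency (%)
[not recovered]        632   10.3%
[not recovered]        418    6.8%
[not recovered]        345    5.6%
[not recovered]        273    4.4%
[not recovered]        236    3.8%
[not recovered]        234    3.8%
[not recovered]        233    3.8%
[not recovered]        180    2.9%
[not recovered]        173    2.8%
[not recovered]        163    2.7%
Other values (153)    3251   53.0%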

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
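
The counts behind these nullity views can be tallied directly. The sketch below counts missing cells per column for rows held as dictionaries; treating `None` and the empty string as missing is an assumption made for this example, and `missing_cells` is a hypothetical helper rather than part of the profiling tool.

```python
def missing_cells(rows, columns):
    """Return (per-column missing counts, total missing cells)."""
    per_column = {
        # Assumption: None and "" both count as missing.
        col: sum(1 for row in rows if row.get(col) in (None, ""))
        for col in columns
    }
    return per_column, sum(per_column.values())


rows = [
    {"품종": "냉동 명태", "원산지": "러시아 연방"},
    {"품종": "활 전어", "원산지": "한국"},
    {"품종": "냉동 갈치", "원산지": "한국"},
]
per_column, total = missing_cells(rows, ["품종", "원산지"])
print(per_column, total)
```

For a dataset with no missing cells, as reported here, every per-column count is zero.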

Sample

First rows

      품종              원산지
0     냉동 명태         러시아 연방
1     활 전어           한국
2     냉동 갈치         한국
3     신선 굴           한국
4     건 과메기         원양산
5     냉동 명태         수입산
6     냉동 고등어       한국
7     활 숭어           한국
8     건 멸치           한국
9     신선 홍합         한국

Last rows

      품종              원산지
1863  냉동 참치방어     일본
1864  냉동 고둥         아르헨티나
1865  냉동 조기         아이슬란드
1866  활 개불           수입산
1867  신선 전갱이       부산 서구
1868  냉동 갈치(기타)   남해안
1869  활 홍살치         한국
1870  굼벵이(제조)      대구 달성군
1871  원삼              경북 의성군
1872  냉동 서대         대서양