Overview

Dataset statistics

Number of variables: 3
Number of observations: 1753
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 12
Duplicate rows (%): 0.7%
Total size in memory: 41.2 KiB
Average record size in memory: 24.1 B

Variable types

Text: 2
Categorical: 1

Dataset

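Description: Information on licensed livestock housing facilities in Jangheung-gun, Jeollanam-do, including the business name, the main livestock type raised (Hanwoo cattle, duck, chicken), and the business site address.
URL: https://www.data.go.kr/data/15117113/fileData.do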

Alerts

Dataset has 12 (0.7%) duplicate rows [Duplicates]
주사육업종 is highly imbalanced (85.8%) [Imbalance]
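
The 85.8% figure can be reproduced from the category counts listed for 주사육업종 under Common Values in this report. The following is a minimal sketch, assuming the score is one minus the Shannon entropy of the class distribution divided by its maximum, log2(k) for k classes. The function name `imbalance_score` is illustrative and not taken from any library.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform split, close to 1 when one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    # Shannon entropy in bits; empty classes contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Counts for 주사육업종 as listed under "Common Values" in this report.
counts = [1648, 46, 22, 12, 8, 6, 4, 3, 3, 1]
score = imbalance_score(counts)
print(f"{score:.1%}")
```

Under this assumption the dominant class (한우, 94.0% of rows) drives the score: a perfectly balanced column would score 0.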

Reproduction

Analysis started: 2023-12-12 05:28:00.837197
Analysis finished: 2023-12-12 05:28:01.259187
Duration: 0.42 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

축산업 사업장 명칭 (livestock business name)
Text

Distinct: 784
Distinct (%): 44.7%
Missing: 0
Missing (%): 0.0%
Memory size: 13.8 KiB

Length

Max length: 15
Median length: 13
Mean length: 3.2150599
Min length: 2
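
The length statistics can be recomputed from the raw column as a sanity check. This is a minimal sketch using only the five sample values shown in this report; the helper name `length_stats` is hypothetical. The cross-check matters here because a median length of 13 is hard to reconcile with a mean of about 3.2 and a minimum of 2.

```python
from statistics import mean, median

def length_stats(values):
    """Summarise string lengths, counted in Unicode code points."""
    lengths = [len(v) for v in values]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }

# The five sample values of 축산업 사업장 명칭 shown in this report.
sample = ["봉림농장", "천금농장", "삼산양계장", "민주농장", "야베스 축산"]
stats = length_stats(sample)
print(stats)

# Cross-check on the full column: mean length = total characters / observations.
mean_from_totals = 5636 / 1753
print(round(mean_from_totals, 7))
```

The last line divides the reported total character count by the number of observations, which agrees with the reported mean length.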

Characters and Unicode

Total characters: 5636
Distinct characters: 318
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
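
As a sketch of how such properties can be read, Python's standard `unicodedata` module exposes the general category of each code point. The two-letter codes map to the category names used in this report (`Lo` is Other Letter, `Zs` is Space Separator, `Po` is Other Punctuation). The standard library has no script or block lookup, so this example covers categories only; `category_counts` is an illustrative name.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over every character of every value."""
    return Counter(unicodedata.category(ch) for v in values for ch in v)

# Two values taken from this report: one with a space, one with masking asterisks.
sample = ["야베스 축산", "김**농장"]
counts = category_counts(sample)
for code, n in counts.most_common():
    print(code, n)
```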

Unique

Unique: 681
Unique (%): 38.8%
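
The report distinguishes distinct values (784 different values) from unique values (681 values that occur exactly once), both expressed as a share of the 1753 observations. A minimal sketch of that distinction, with a hypothetical helper name and a made-up six-row input:

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values occurring exactly once)."""
    freq = Counter(values)
    distinct = len(freq)
    unique = sum(1 for n in freq.values() if n == 1)
    return distinct, unique

# Illustrative input only: two repeated names and two one-off names.
values = ["없음", "없음", "우수농장", "우수농장", "봉림농장", "천금농장"]
distinct, unique = distinct_and_unique(values)
print(distinct, unique)

# The reported percentages follow from dividing by the number of observations.
print(f"{784 / 1753:.1%}", f"{681 / 1753:.1%}")
```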

Sample

1st row: 봉림농장
2nd row: 천금농장
3rd row: 삼산양계장
4th row: 민주농장
5th row: 야베스 축산
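Value | Count | Frequency (%)
없음 ("none") | 808 | 44.8%
농장 | 22 | 1.2%
(not captured) | 16 | 0.9%
(not captured) | 11 | 0.6%
김**농장 | 8 | 0.4%
(not captured) | 6 | 0.3%
영지한우 | 6 | 0.3%
(not captured) | 5 | 0.3%
축산 | 5 | 0.3%
박**축산 | 4 | 0.2%
Other values (780) | 912 | 50.6%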
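
The table above appears to count whitespace-separated words rather than whole values. If each of the 50 spaces in the column separates two words, the 1753 values yield 1803 words, and 808 / 1803 is about 44.8%, which matches the first row. A minimal sketch of that counting, where the tokenization rule is an assumption and `word_frequencies` is a hypothetical helper:

```python
from collections import Counter

def word_frequencies(values):
    """Split each value on whitespace and return (word, count, share) rows, most common first."""
    tokens = [tok for v in values for tok in v.split()]
    counts = Counter(tokens)
    total = len(tokens)
    return [(word, n, n / total) for word, n in counts.most_common()]

# Values taken from this report; the mix is illustrative, not the real distribution.
sample = ["없음", "없음", "야베스 축산", "남가네 농장"]
rows = word_frequencies(sample)
for word, n, share in rows:
    print(word, n, f"{share:.1%}")
```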

Most occurring characters

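Value | Count | Frequency (%)
(not captured) | 810 | 14.4%
(not captured) | 809 | 14.4%
(not captured) | 640 | 11.4%
(not captured) | 607 | 10.8%
(not captured) | 223 | 4.0%
* | 210 | 3.7%
(not captured) | 208 | 3.7%
(not captured) | 75 | 1.3%
(space) | 50 | 0.9%
(not captured) | 48 | 0.9%
Other values (308) | 1956 | 34.7%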

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 5330 | 94.6%
Other Punctuation | 210 | 3.7%
Space Separator | 50 | 0.9%
Decimal Number | 23 | 0.4%
Lowercase Letter | 8 | 0.1%
Open Punctuation | 5 | 0.1%
Close Punctuation | 5 | 0.1%
Uppercase Letter | 3 | 0.1%
Other Symbol | 1 | < 0.1%
Letter Number | 1 | < 0.1%

Most frequent character per category

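Other Letter
Value | Count | Frequency (%)
(not captured) | 810 | 15.2%
(not captured) | 809 | 15.2%
(not captured) | 640 | 12.0%
(not captured) | 607 | 11.4%
(not captured) | 223 | 4.2%
(not captured) | 208 | 3.9%
(not captured) | 75 | 1.4%
(not captured) | 48 | 0.9%
(not captured) | 46 | 0.9%
(not captured) | 43 | 0.8%
Other values (291) | 1821 | 34.2%

Lowercase Letter
Value | Count | Frequency (%)
r | 2 | 25.0%
e | 2 | 25.0%
n | 1 | 12.5%
m | 1 | 12.5%
a | 1 | 12.5%
f | 1 | 12.5%

Uppercase Letter
Value | Count | Frequency (%)
G | 1 | 33.3%
F | 1 | 33.3%
P | 1 | 33.3%

Decimal Number
Value | Count | Frequency (%)
2 | 19 | 82.6%
1 | 4 | 17.4%

Other Punctuation
Value | Count | Frequency (%)
* | 210 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 50 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 5 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 5 | 100.0%

Other Symbol
Value | Count | Frequency (%)
(not captured) | 1 | 100.0%

Letter Number
Value | Count | Frequency (%)
(not captured) | 1 | 100.0%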

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 5330 | 94.6%
Common | 294 | 5.2%
Latin | 12 | 0.2%

Most frequent character per script

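Hangul
Value | Count | Frequency (%)
(not captured) | 810 | 15.2%
(not captured) | 809 | 15.2%
(not captured) | 640 | 12.0%
(not captured) | 607 | 11.4%
(not captured) | 223 | 4.2%
(not captured) | 208 | 3.9%
(not captured) | 75 | 1.4%
(not captured) | 48 | 0.9%
(not captured) | 46 | 0.9%
(not captured) | 43 | 0.8%
Other values (291) | 1821 | 34.2%

Latin
Value | Count | Frequency (%)
r | 2 | 16.7%
e | 2 | 16.7%
(not captured) | 1 | 8.3%
n | 1 | 8.3%
G | 1 | 8.3%
m | 1 | 8.3%
a | 1 | 8.3%
f | 1 | 8.3%
F | 1 | 8.3%
P | 1 | 8.3%

Common
Value | Count | Frequency (%)
* | 210 | 71.4%
(space) | 50 | 17.0%
2 | 19 | 6.5%
( | 5 | 1.7%
) | 5 | 1.7%
1 | 4 | 1.4%
(not captured) | 1 | 0.3%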

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 5330 | 94.6%
ASCII | 304 | 5.4%
Misc Symbols | 1 | < 0.1%
Number Forms | 1 | < 0.1%

Most frequent character per block

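Hangul
Value | Count | Frequency (%)
(not captured) | 810 | 15.2%
(not captured) | 809 | 15.2%
(not captured) | 640 | 12.0%
(not captured) | 607 | 11.4%
(not captured) | 223 | 4.2%
(not captured) | 208 | 3.9%
(not captured) | 75 | 1.4%
(not captured) | 48 | 0.9%
(not captured) | 46 | 0.9%
(not captured) | 43 | 0.8%
Other values (291) | 1821 | 34.2%

ASCII
Value | Count | Frequency (%)
* | 210 | 69.1%
(space) | 50 | 16.4%
2 | 19 | 6.2%
( | 5 | 1.6%
) | 5 | 1.6%
1 | 4 | 1.3%
r | 2 | 0.7%
e | 2 | 0.7%
n | 1 | 0.3%
G | 1 | 0.3%
Other values (5) | 5 | 1.6%

Misc Symbols
Value | Count | Frequency (%)
(not captured) | 1 | 100.0%

Number Forms
Value | Count | Frequency (%)
(not captured) | 1 | 100.0%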

주사육업종 (main livestock type)
Categorical

IMBALANCE 

Distinct: 10
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 13.8 KiB

한우: 1648
오리: 46
육계: 22
돼지: 12
염소: 8
Other values (5): 17

Length

Max length: 6
Median length: 2
Mean length: 2.0068454
Min length: 2

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 육계
2nd row: 돼지
3rd row: 육계
4th row: 육계
5th row: 돼지

Common Values

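Value | Count | Frequency (%)
한우 (Hanwoo cattle) | 1648 | 94.0%
오리 (duck) | 46 | 2.6%
육계 (broiler chicken) | 22 | 1.3%
돼지 (pig) | 12 | 0.7%
염소 (goat) | 8 | 0.5%
산양 (sanyang goat) | 6 | 0.3%
젖소 (dairy cattle) | 4 | 0.2%
종계/산란계 (breeder/layer chicken) | 3 | 0.2%
육우 (beef cattle) | 3 | 0.2%
사슴 (deer) | 1 | 0.1%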

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: chart of the most common values of 주사육업종]

사업장소재지 (business site address)
Text

Distinct: 1719
Distinct (%): 98.1%
Missing: 0
Missing (%): 0.0%
Memory size: 13.8 KiB

Length

Max length: 85
Median length: 84
Mean length: 27.715916
Min length: 19

Characters and Unicode

Total characters: 48586
Distinct characters: 146
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1689
Unique (%): 96.3%

Sample

1st row: 전라남도 장흥군 장평면 봉림리 16-1번지
2nd row: 전라남도 장흥군 용산면 풍길리 447번지 10호
3rd row: 전라남도 장흥군 장흥읍 삼산리 산 92번지 5호
4th row: 전라남도 장흥군 용산면 계산리 415번지 1호
5th row: 전라남도 장흥군 관산읍 신동리 620번지
Value | Count | Frequency (%)
전라남도 | 1753 | 16.7%
장흥군 | 1753 | 16.7%
관산읍 | 417 | 4.0%
대덕읍 | 380 | 3.6%
1호 | 333 | 3.2%
용산면 | 286 | 2.7%
안양면 | 133 | 1.3%
2호 | 130 | 1.2%
장흥읍 | 130 | 1.2%
장평면 | 127 | 1.2%
Other values (1547) | 5041 | 48.1%

Most occurring characters

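Value | Count | Frequency (%)
(space) | 11846 | 24.4%
(not captured) | 2098 | 4.3%
(not captured) | 1976 | 4.1%
(not captured) | 1890 | 3.9%
(not captured) | 1816 | 3.7%
(not captured) | 1793 | 3.7%
(not captured) | 1780 | 3.7%
(not captured) | 1770 | 3.6%
(not captured) | 1756 | 3.6%
(not captured) | 1753 | 3.6%
Other values (136) | 20108 | 41.4%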

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 27722 | 57.1%
Space Separator | 11846 | 24.4%
Decimal Number | 8088 | 16.6%
Dash Punctuation | 448 | 0.9%
Other Punctuation | 431 | 0.9%
Close Punctuation | 22 | < 0.1%
Open Punctuation | 22 | < 0.1%
Math Symbol | 4 | < 0.1%
Uppercase Letter | 3 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not captured) | 2098 | 7.6%
(not captured) | 1976 | 7.1%
(not captured) | 1890 | 6.8%
(not captured) | 1816 | 6.6%
(not captured) | 1793 | 6.5%
(not captured) | 1780 | 6.4%
(not captured) | 1770 | 6.4%
(not captured) | 1756 | 6.3%
(not captured) | 1753 | 6.3%
(not captured) | 1744 | 6.3%
Other values (116) | 9346 | 33.7%

Decimal Number
Value | Count | Frequency (%)
1 | 1639 | 20.3%
2 | 1025 | 12.7%
3 | 930 | 11.5%
4 | 820 | 10.1%
5 | 793 | 9.8%
6 | 730 | 9.0%
7 | 648 | 8.0%
8 | 530 | 6.6%
0 | 508 | 6.3%
9 | 465 | 5.7%

Uppercase Letter
Value | Count | Frequency (%)
E | 1 | 33.3%
A | 1 | 33.3%
D | 1 | 33.3%

Other Punctuation
Value | Count | Frequency (%)
, | 424 | 98.4%
/ | 7 | 1.6%

Space Separator
Value | Count | Frequency (%)
(space) | 11846 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 448 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 22 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 22 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 4 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 27722 | 57.1%
Common | 20861 | 42.9%
Latin | 3 | < 0.1%

Most frequent character per script

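Hangul
Value | Count | Frequency (%)
(not captured) | 2098 | 7.6%
(not captured) | 1976 | 7.1%
(not captured) | 1890 | 6.8%
(not captured) | 1816 | 6.6%
(not captured) | 1793 | 6.5%
(not captured) | 1780 | 6.4%
(not captured) | 1770 | 6.4%
(not captured) | 1756 | 6.3%
(not captured) | 1753 | 6.3%
(not captured) | 1744 | 6.3%
Other values (116) | 9346 | 33.7%

Common
Value | Count | Frequency (%)
(space) | 11846 | 56.8%
1 | 1639 | 7.9%
2 | 1025 | 4.9%
3 | 930 | 4.5%
4 | 820 | 3.9%
5 | 793 | 3.8%
6 | 730 | 3.5%
7 | 648 | 3.1%
8 | 530 | 2.5%
0 | 508 | 2.4%
Other values (7) | 1392 | 6.7%

Latin
Value | Count | Frequency (%)
E | 1 | 33.3%
A | 1 | 33.3%
D | 1 | 33.3%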

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 27722 | 57.1%
ASCII | 20864 | 42.9%

Most frequent character per block

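ASCII
Value | Count | Frequency (%)
(space) | 11846 | 56.8%
1 | 1639 | 7.9%
2 | 1025 | 4.9%
3 | 930 | 4.5%
4 | 820 | 3.9%
5 | 793 | 3.8%
6 | 730 | 3.5%
7 | 648 | 3.1%
8 | 530 | 2.5%
0 | 508 | 2.4%
Other values (10) | 1395 | 6.7%

Hangul
Value | Count | Frequency (%)
(not captured) | 2098 | 7.6%
(not captured) | 1976 | 7.1%
(not captured) | 1890 | 6.8%
(not captured) | 1816 | 6.6%
(not captured) | 1793 | 6.5%
(not captured) | 1780 | 6.4%
(not captured) | 1770 | 6.4%
(not captured) | 1756 | 6.3%
(not captured) | 1753 | 6.3%
(not captured) | 1744 | 6.3%
Other values (116) | 9346 | 33.7%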

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 축산업 사업장 명칭 | 주사육업종 | 사업장소재지
0 | 봉림농장 | 육계 | 전라남도 장흥군 장평면 봉림리 16-1번지
1 | 천금농장 | 돼지 | 전라남도 장흥군 용산면 풍길리 447번지 10호
2 | 삼산양계장 | 육계 | 전라남도 장흥군 장흥읍 삼산리 산 92번지 5호
3 | 민주농장 | 육계 | 전라남도 장흥군 용산면 계산리 415번지 1호
4 | 야베스 축산 | 돼지 | 전라남도 장흥군 관산읍 신동리 620번지
5 | 없음 | 한우 | 전라남도 장흥군 용산면 운주리 579번지 6호
6 | 대상농장 | 돼지 | 전라남도 장흥군 관산읍 죽청리 110번지 1호
7 | 판호농장 | 돼지 | 전라남도 장흥군 관산읍 죽청리 133번지 , 134번지
8 | 우수농장 | 한우 | 전라남도 장흥군 관산읍 죽청리 26-5번지(가동)
9 | 우수농장 | 한우 | 전라남도 장흥군 관산읍 죽청리 26-5번지(나동)

Last rows

Index | 축산업 사업장 명칭 | 주사육업종 | 사업장소재지
1743 | 김** | 한우 | 전라남도 장흥군 관산읍 삼산리 241번지 23호
1744 | 없음 | 염소 | 전라남도 장흥군 장흥읍 평장리 79번지
1745 | 없음 | 한우 | 전라남도 장흥군 안양면 당암리 10-35번지(다동)
1746 | 없음 | 염소 | 전라남도 장흥군 안양면 당암리 1번지 3호
1747 | 없음 | 한우 | 전라남도 장흥군 관산읍 삼산리 894-53번지
1748 | 없음 | 육계 | 전라남도 장흥군 장흥읍 성불리 344번지 1호
1749 | 후니농장 | 한우 | 전라남도 장흥군 안양면 학송리 37번지
1750 | 이화농장 | 한우 | 전라남도 장흥군 대덕읍 연정리 1341번지 5호
1751 | 동진농장 | 한우 | 전라남도 장흥군 용산면 금곡리 3번지 18호
1752 | 강하누축산 | 한우 | 전라남도 장흥군 용산면 금곡리 3번지 16호 , 3-17

Duplicate rows

Most frequently occurring

Index | 축산업 사업장 명칭 | 주사육업종 | 사업장소재지 | # duplicates
6 | 없음 | 한우 | 전라남도 장흥군 대덕읍 잠두리 72번지 | 3
0 | 남가네 농장 | 한우 | 전라남도 장흥군 관산읍 삼산리 276번지 1호 | 2
1 | 없음 | 한우 | 전라남도 장흥군 관산읍 삼산리 1번지 5호 | 2
2 | 없음 | 한우 | 전라남도 장흥군 관산읍 신동리 180번지 3호 | 2
3 | 없음 | 한우 | 전라남도 장흥군 관산읍 외동리 395번지 | 2
4 | 없음 | 한우 | 전라남도 장흥군 대덕읍 신월리 883번지 | 2
5 | 없음 | 한우 | 전라남도 장흥군 대덕읍 잠두리 41번지 | 2
7 | 없음 | 한우 | 전라남도 장흥군 안양면 해창리 14번지 | 2
8 | 없음 | 한우 | 전라남도 장흥군 장동면 봉동리 170번지 1호 | 2
9 | 없음 | 한우 | 전라남도 장흥군 장동면 북교리 598번지 2호 | 2
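
A duplicate listing like the one above can be rebuilt by grouping identical rows and keeping the groups that occur more than once. This is a minimal sketch with a hypothetical helper, `duplicate_groups`, and a handful of rows taken from this report. The reported total of 12 (0.7%) is consistent with counting distinct duplicated rows and dividing by the 1753 observations, but that reading is an assumption; the listing above shows only ten of those groups.

```python
from collections import Counter

def duplicate_groups(rows):
    """Return (row, count) pairs for rows that appear more than once, most frequent first."""
    counts = Counter(rows)
    repeated = [(row, n) for row, n in counts.items() if n > 1]
    return sorted(repeated, key=lambda item: -item[1])

# Rows are (name, livestock type, address) tuples; repetition counts mirror the listing.
rows = (
    [("없음", "한우", "전라남도 장흥군 대덕읍 잠두리 72번지")] * 3
    + [("없음", "한우", "전라남도 장흥군 대덕읍 잠두리 41번지")] * 2
    + [("후니농장", "한우", "전라남도 장흥군 안양면 학송리 37번지")]
)
groups = duplicate_groups(rows)
for row, n in groups:
    print(n, row)

# Share of observations under the stated assumption.
print(f"{12 / 1753:.1%}")
```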