Overview

Dataset statistics

Number of variables: 3
Number of observations: 599
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 7
Duplicate rows (%): 1.2%
Total size in memory: 14.2 KiB
Average record size in memory: 24.2 B

Variable types

Text: 2
Categorical: 1

Dataset

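Description: Lists the livestock farms currently operating in Goryeong-gun, Gyeongsangbuk-do. It gives each farm's name (농장명), livestock type (축종), and location (소재지); to protect personal information, the location is given only down to the ri (village) level.
URL: https://www.data.go.kr/data/15034404/fileData.do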

Alerts

Duplicates: Dataset has 7 (1.2%) duplicate rows
Imbalance: 축종 is highly imbalanced (73.2%)
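
The 73.2% imbalance figure can be reproduced from the 축종 class counts listed later in this report if the score is taken to be one minus the normalised Shannon entropy. That definition is an assumption here (the report itself does not state the formula), and `imbalance_score` is a hypothetical helper name. A minimal sketch:

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k): 0 for a uniform column, near 1 when one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k <= 1:
        return 0.0
    # Shannon entropy in nats; the log base cancels in the ratio below.
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)


# Class counts for 축종 as listed in this report (10 distinct values, 599 rows).
livestock_counts = [512, 52, 11, 7, 6, 5, 3, 1, 1, 1]
score = imbalance_score(livestock_counts)
print(f"{score:.1%}")  # prints 73.2%
```

A perfectly balanced column scores 0 under this definition, so a value above 70% reflects how strongly the single class 한우 (512 of 599 rows) dominates.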

Reproduction

Analysis started: 2023-12-12 15:24:28.617528
Analysis finished: 2023-12-12 15:24:29.017777
Duration: 0.4 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
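
A report of this kind can be regenerated with ydata-profiling roughly as follows. This is a sketch under assumptions: the CSV path is whatever file was downloaded from the URL above, the `cp949` default encoding is a guess (common for Korean public data files, not confirmed by this report), and `build_profile` is a hypothetical helper name.

```python
def build_profile(csv_path, out_path="report.html", encoding="cp949"):
    """Profile a CSV file with ydata-profiling and write an HTML report."""
    # Imported lazily so the function can be defined even where the
    # third-party libraries are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)  # encoding is an assumption
    profile = ProfileReport(df, title="Goryeong-gun livestock farms")
    profile.to_file(out_path)
    return profile
```

Timings and memory figures will differ between runs and library versions, so only the data-derived statistics should be expected to match.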

Variables

농장명
Text

Distinct: 587
Distinct (%): 98.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.8 KiB

Length

Max length: 16
Median length: 5
Mean length: 5.0417362
Min length: 2
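
The length statistics are plain summaries of per-value character counts. The sketch below computes them for the five sample farm names shown in this report, then cross-checks the reported mean against the report's own totals (3020 characters over 599 observations). It assumes length means the number of Unicode code points, which is what Python's `len` returns.

```python
from statistics import mean, median

# The five sample farm names shown in this report (\u2162 is the Roman numeral III).
names = ["박준배농장", "꿀꿀이농장", "거농영농조합법인\u2162", "김성열농장", "청돈농장"]
lengths = [len(name) for name in names]  # len() counts Unicode code points

summary = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": mean(lengths),
    "min": min(lengths),
}
print(summary)

# Cross-check of the reported mean: total characters / number of observations.
reported_mean = 3020 / 599
print(round(reported_mean, 7))  # prints 5.0417362
```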

Characters and Unicode

Total characters: 3020
Distinct characters: 247
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
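
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for two of the sample farm names; `category_counts` is a hypothetical helper, and the standard library does not expose script or block, so only categories are shown.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters by Unicode general category code (e.g. 'Lo', 'Zs', 'Nl')."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


# Two sample values from this report; \u2162 is ROMAN NUMERAL THREE.
counts = category_counts(["거농영농조합법인\u2162", "꿀꿀이농장"])
print(counts)

# Two-letter codes for the seven category names used in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Nd": "Decimal Number",
    "Lu": "Uppercase Letter",
    "Nl": "Letter Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}
```

Hangul syllables fall under Other Letter, which is why that category accounts for most characters in a column of Korean names.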

Unique

Unique: 575
Unique (%): 96.0%
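
The figures in this report are consistent with "distinct" meaning the number of different values and "unique" meaning the number of values that occur exactly once. The sketch below illustrates that reading on made-up values; the farm names in `toy` are hypothetical and `distinct_and_unique` is an illustrative helper, not the library's code.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique


toy = ["A농장", "B농장", "B농장", "C축산"]  # hypothetical values
print(distinct_and_unique(toy))  # prints (3, 2)

# Applied to the reported figures: 587 distinct and 575 unique values
# leave 12 values that occur more than once.
repeated_values = 587 - 575
print(repeated_values)  # prints 12
```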

Sample

1st row: 박준배농장
2nd row: 꿀꿀이농장
3rd row: 거농영농조합법인Ⅲ
4th row: 김성열농장
5th row: 청돈농장

Value | Count | Frequency (%)
농장 | 59 | 8.8%
ii | 5 | 0.7%
제종호농장 | 2 | 0.3%
이상철농장 | 2 | 0.3%
진성축산 | 2 | 0.3%
김재우농장 | 2 | 0.3%
이진태농장 | 2 | 0.3%
서교희 | 2 | 0.3%
김기석농장 | 2 | 0.3%
형제축산 | 2 | 0.3%
Other values (583) | 592 | 88.1%

Most occurring characters

Value | Count | Frequency (%)
 | 558 | 18.5%
 | 547 | 18.1%
 | 102 | 3.4%
 | 87 | 2.9%
 | 73 | 2.4%
 | 55 | 1.8%
 | 45 | 1.5%
 | 40 | 1.3%
 | 36 | 1.2%
 | 33 | 1.1%
Other values (237) | 1444 | 47.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2915 | 96.5%
Space Separator | 73 | 2.4%
Decimal Number | 12 | 0.4%
Uppercase Letter | 10 | 0.3%
Letter Number | 8 | 0.3%
Open Punctuation | 1 | < 0.1%
Close Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 558 | 19.1%
 | 547 | 18.8%
 | 102 | 3.5%
 | 87 | 3.0%
 | 55 | 1.9%
 | 45 | 1.5%
 | 40 | 1.4%
 | 36 | 1.2%
 | 33 | 1.1%
 | 33 | 1.1%
Other values (225) | 1379 | 47.3%

Decimal Number
Value | Count | Frequency (%)
2 | 8 | 66.7%
3 | 1 | 8.3%
6 | 1 | 8.3%
1 | 1 | 8.3%
7 | 1 | 8.3%

Letter Number
Value | Count | Frequency (%)
 | 5 | 62.5%
 | 2 | 25.0%
 | 1 | 12.5%

Space Separator
Value | Count | Frequency (%)
 | 73 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
I | 10 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2915 | 96.5%
Common | 87 | 2.9%
Latin | 18 | 0.6%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 558 | 19.1%
 | 547 | 18.8%
 | 102 | 3.5%
 | 87 | 3.0%
 | 55 | 1.9%
 | 45 | 1.5%
 | 40 | 1.4%
 | 36 | 1.2%
 | 33 | 1.1%
 | 33 | 1.1%
Other values (225) | 1379 | 47.3%

Common
Value | Count | Frequency (%)
 | 73 | 83.9%
2 | 8 | 9.2%
3 | 1 | 1.1%
6 | 1 | 1.1%
1 | 1 | 1.1%
( | 1 | 1.1%
7 | 1 | 1.1%
) | 1 | 1.1%

Latin
Value | Count | Frequency (%)
I | 10 | 55.6%
 | 5 | 27.8%
 | 2 | 11.1%
 | 1 | 5.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2915 | 96.5%
ASCII | 97 | 3.2%
Number Forms | 8 | 0.3%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 558 | 19.1%
 | 547 | 18.8%
 | 102 | 3.5%
 | 87 | 3.0%
 | 55 | 1.9%
 | 45 | 1.5%
 | 40 | 1.4%
 | 36 | 1.2%
 | 33 | 1.1%
 | 33 | 1.1%
Other values (225) | 1379 | 47.3%

ASCII
Value | Count | Frequency (%)
 | 73 | 75.3%
I | 10 | 10.3%
2 | 8 | 8.2%
3 | 1 | 1.0%
6 | 1 | 1.0%
1 | 1 | 1.0%
( | 1 | 1.0%
7 | 1 | 1.0%
) | 1 | 1.0%

Number Forms
Value | Count | Frequency (%)
 | 5 | 62.5%
 | 2 | 25.0%
 | 1 | 12.5%

축종
Categorical

IMBALANCE 

Distinct: 10
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Memory size: 4.8 KiB

한우: 512
돼지: 52
육계: 11
젖소: 7
산양: 6
Other values (5): 11

Length

Max length: 3
Median length: 2
Mean length: 2
Min length: 1

Unique

Unique: 3
Unique (%): 0.5%

Sample

1st row: 한우
2nd row: 돼지
3rd row: 돼지
4th row: 돼지
5th row: 돼지

Common Values

Value | Count | Frequency (%)
한우 | 512 | 85.5%
돼지 | 52 | 8.7%
육계 | 11 | 1.8%
젖소 | 7 | 1.2%
산양 | 6 | 1.0%
염소 | 5 | 0.8%
육우 | 3 | 0.5%
 | 1 | 0.2%
사슴 | 1 | 0.2%
메추리 | 1 | 0.2%

Length

Histogram of lengths of the category

소재지
Text

Distinct: 116
Distinct (%): 19.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.8 KiB

Length

Max length: 18
Median length: 16
Mean length: 16.008347
Min length: 12

Characters and Unicode

Total characters: 9589
Distinct characters: 113
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 26
Unique (%): 4.3%

Sample

1st row: 경상북도 고령군 개진면 반운리
2nd row: 경상북도 고령군 우곡면 대곡리
3rd row: 경상북도 고령군 우곡면 대곡리
4th row: 경상북도 고령군 우곡면 대곡리
5th row: 경상북도 고령군 우곡면 대곡리

Value | Count | Frequency (%)
경상북도 | 599 | 25.1%
고령군 | 599 | 25.1%
운수면 | 156 | 6.5%
덕곡면 | 77 | 3.2%
성산면 | 76 | 3.2%
쌍림면 | 61 | 2.6%
개진면 | 58 | 2.4%
고령읍 | 56 | 2.3%
다산면 | 44 | 1.8%
우곡면 | 42 | 1.8%
Other values (101) | 622 | 26.0%

Most occurring characters

Value | Count | Frequency (%)
 | 1872 | 19.5%
 | 665 | 6.9%
 | 655 | 6.8%
 | 614 | 6.4%
 | 601 | 6.3%
 | 599 | 6.2%
 | 599 | 6.2%
 | 599 | 6.2%
 | 588 | 6.1%
 | 514 | 5.4%
Other values (103) | 2283 | 23.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 7717 | 80.5%
Space Separator | 1872 | 19.5%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 665 | 8.6%
 | 655 | 8.5%
 | 614 | 8.0%
 | 601 | 7.8%
 | 599 | 7.8%
 | 599 | 7.8%
 | 599 | 7.8%
 | 588 | 7.6%
 | 514 | 6.7%
 | 189 | 2.4%
Other values (102) | 2094 | 27.1%

Space Separator
Value | Count | Frequency (%)
 | 1872 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 7717 | 80.5%
Common | 1872 | 19.5%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 665 | 8.6%
 | 655 | 8.5%
 | 614 | 8.0%
 | 601 | 7.8%
 | 599 | 7.8%
 | 599 | 7.8%
 | 599 | 7.8%
 | 588 | 7.6%
 | 514 | 6.7%
 | 189 | 2.4%
Other values (102) | 2094 | 27.1%

Common
Value | Count | Frequency (%)
 | 1872 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 7717 | 80.5%
ASCII | 1872 | 19.5%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
 | 1872 | 100.0%

Hangul
Value | Count | Frequency (%)
 | 665 | 8.6%
 | 655 | 8.5%
 | 614 | 8.0%
 | 601 | 7.8%
 | 599 | 7.8%
 | 599 | 7.8%
 | 599 | 7.8%
 | 588 | 7.6%
 | 514 | 6.7%
 | 189 | 2.4%
Other values (102) | 2094 | 27.1%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 농장명 | 축종 | 소재지
0 | 박준배농장 | 한우 | 경상북도 고령군 개진면 반운리
1 | 꿀꿀이농장 | 돼지 | 경상북도 고령군 우곡면 대곡리
2 | 거농영농조합법인Ⅲ | 돼지 | 경상북도 고령군 우곡면 대곡리
3 | 김성열농장 | 돼지 | 경상북도 고령군 우곡면 대곡리
4 | 청돈농장 | 돼지 | 경상북도 고령군 우곡면 대곡리
5 | 이장원농장 | 돼지 | 경상북도 고령군 우곡면 대곡리
6 | 연호축산 | 돼지 | 경상북도 고령군 운수면 월산리
7 | 대동축산 | 돼지 | 경상북도 고령군 운수면 신간리
8 | 대왕농장 | 돼지 | 경상북도 고령군 고령읍 본관리
9 | 진용일농장 | 돼지 | 경상북도 고령군 우곡면 대곡리

Last rows

Index | 농장명 | 축종 | 소재지
589 | 노영농장 | 한우 | 경상북도 고령군 다산면 노곡리
590 | 키움팜 | 육우 | 경상북도 고령군 우곡면 포리
591 | 고석만농장 | 한우 | 경상북도 고령군 운수면 운산리
592 | 소소농장 | 한우 | 경상북도 고령군 개진면 오사리
593 | 백상연농장 | 한우 | 경상북도 고령군 성산면 사부리
594 | 배상훈농장 | 한우 | 경상북도 고령군 운수면 봉평리
595 | 조영만농장 | 한우 | 경상북도 고령군 개진면 양전리
596 | 이춘복농장 | 한우 | 경상북도 고령군 쌍림면 용리
597 | 연화네 | 메추리 | 경상북도 고령군 성산면 오곡리
598 | 영식농장 | 한우 | 경상북도 고령군 개진면 부리

Duplicate rows

Most frequently occurring

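Index | 농장명 | 축종 | 소재지 | # duplicates
0 | 가륜농장 | 한우 | 경상북도 고령군 덕곡면 가륜리 | 2
1 | 곽재원농장 | 한우 | 경상북도 고령군 개진면 구곡리 | 2
2 | 나정목장 | 한우 | 경상북도 고령군 다산면 나정리 | 2
3 | 시진기농장 | 한우 | 경상북도 고령군 운수면 봉평리 | 2
4 | 제종호농장 | 한우 | 경상북도 고령군 운수면 운산리 | 2
5 | 청하축산 | 돼지 | 경상북도 고령군 대가야읍 장기리 | 2
6 | 형제축산 | 한우 | 경상북도 고령군 쌍림면 합가리 | 2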
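
A table like the one above can be derived by counting identical rows across all three columns. The sketch below is one way to do that with the standard library, not the profiling tool's own implementation; `repeated_rows` is a hypothetical helper, and the three input rows are taken from this report with one of them deliberately repeated for illustration.

```python
from collections import Counter


def repeated_rows(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


rows = [
    ("가륜농장", "한우", "경상북도 고령군 덕곡면 가륜리"),
    ("박준배농장", "한우", "경상북도 고령군 개진면 반운리"),
    ("가륜농장", "한우", "경상북도 고령군 덕곡면 가륜리"),  # repeated on purpose
]
duplicates = repeated_rows(rows)
print(duplicates)

# The reported share of duplicate rows: 7 out of 599 observations.
duplicate_share = round(7 / 599 * 100, 1)
print(duplicate_share)  # prints 1.2
```

Rows are compared as whole tuples, so two farms with the same name but different locations would not count as duplicates.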