Overview

Dataset statistics

Number of variables: 6
Number of observations: 84
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.1 KiB
Average record size in memory: 50.6 B

Variable types

Text: 1
Categorical: 4
Numeric: 1

Dataset

Description: Status of livestock farms in Cheongju-si (address, owner name, livestock species, etc.)
Author: Cheongju-si, Chungcheongbuk-do
URL: https://www.data.go.kr/data/15021577/fileData.do

Alerts

축주명 is highly overall correlated with 비 고 [High correlation]
축종 is highly overall correlated with 사육규모 and 2 other fields [High correlation]
품종 is highly overall correlated with 축종 and 1 other field [High correlation]
비 고 is highly overall correlated with 사육규모 and 3 other fields [High correlation]
사육규모 is highly overall correlated with 축종 and 1 other field [High correlation]
비 고 is highly imbalanced (58.6%) [Imbalance]
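
The 58.6% imbalance figure can be reproduced from the class counts reported for 비 고 (77 and 7). The sketch below assumes the score is one minus the Shannon entropy normalised by log2 of the number of classes; the function name `imbalance_score` is hypothetical and is not taken from the report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the base-2 Shannon entropy
    of the class distribution and k is the number of classes.
    0.0 means perfectly balanced; values near 1.0 mean one class dominates."""
    k = len(counts)
    if k <= 1:
        # Assumption: a single-class column is scored as 0.
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Class counts for 비 고 as listed in this report: "<NA>" 77, "AI" 7.
score = imbalance_score([77, 7])
print(f"{score:.1%}")
```

With the counts above, the result rounds to the same 58.6% shown in the alert.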

Reproduction

Analysis started: 2023-12-13 00:48:34.394348
Analysis finished: 2023-12-13 00:48:34.876015
Duration: 0.48 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

주소 (address)
Text

Distinct: 64
Distinct (%): 76.2%
Missing: 0
Missing (%): 0.0%
Memory size: 804.0 B

Length

Max length: 20
Median length: 16
Mean length: 16.214286
Min length: 15

Characters and Unicode

Total characters: 1362
Distinct characters: 92
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
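
As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using one address from the Sample section. The helper name `category_counts` is hypothetical. The category codes `Lo`, `Zs` and `Nd` correspond to the "Other Letter", "Space Separator" and "Decimal Number" labels used in this report. Scripts and blocks are not exposed by `unicodedata`, so they are not covered here.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count the Unicode general category of every character in `values`."""
    counts = Counter()
    for text in values:
        for ch in text:
            counts[unicodedata.category(ch)] += 1
    return counts


# One address taken from the Sample section of this report.
sample = ["충청북도 청주시 가덕면 삼항2리"]
counts = category_counts(sample)
for code, n in counts.most_common():
    print(code, n)
```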

Unique

Unique: 51
Unique (%): 60.7%
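
The difference between "Distinct" and "Unique" is easy to miss. The sketch below assumes that Distinct counts the different values in a column and Unique counts the values that occur exactly once; the helper `distinct_and_unique` and the toy values are hypothetical. The last two lines check the percentages reported for 주소 against the 84 observations.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (distinct, unique): the number of different values,
    and the number of values that appear exactly once."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Toy data, not taken from the dataset.
toy = ["a", "a", "b", "c", "c", "d"]
distinct, unique = distinct_and_unique(toy)
print(distinct, unique)

# Percentages reported for 주소: 64 distinct and 51 unique out of 84 rows.
distinct_pct = f"{64 / 84:.1%}"
unique_pct = f"{51 / 84:.1%}"
print(distinct_pct, unique_pct)
```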

Sample

1st row: 충청북도 청주시 가덕면 계산리
2nd row: 충청북도 청주시 가덕면 내암리
3rd row: 충청북도 청주시 가덕면 삼항2리
4th row: 충청북도 청주시 가덕면 삼항리
5th row: 충청북도 청주시 가덕면 인차리
Value | Count | Frequency (%)
충청북도 | 84 | 24.9%
청주시 | 84 | 24.9%
북이면 | 20 | 5.9%
미원면 | 12 | 3.6%
가덕면 | 12 | 3.6%
오창읍 | 8 | 2.4%
옥산면 | 8 | 2.4%
오송읍 | 6 | 1.8%
화상리 | 6 | 1.8%
강내면 | 5 | 1.5%
Other values (67) | 93 | 27.5%

Most occurring characters

Value | Count | Frequency (%)
(space) | 263 | 19.3%
 | 169 | 12.4%
 | 106 | 7.8%
 | 87 | 6.4%
 | 84 | 6.2%
 | 84 | 6.2%
 | 84 | 6.2%
 | 83 | 6.1%
 | 65 | 4.8%
 | 25 | 1.8%
Other values (82) | 312 | 22.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1097 | 80.5%
Space Separator | 263 | 19.3%
Decimal Number | 2 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 169 | 15.4%
 | 106 | 9.7%
 | 87 | 7.9%
 | 84 | 7.7%
 | 84 | 7.7%
 | 84 | 7.7%
 | 83 | 7.6%
 | 65 | 5.9%
 | 25 | 2.3%
 | 18 | 1.6%
Other values (79) | 292 | 26.6%

Decimal Number
Value | Count | Frequency (%)
1 | 1 | 50.0%
2 | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 263 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1097 | 80.5%
Common | 265 | 19.5%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 169 | 15.4%
 | 106 | 9.7%
 | 87 | 7.9%
 | 84 | 7.7%
 | 84 | 7.7%
 | 84 | 7.7%
 | 83 | 7.6%
 | 65 | 5.9%
 | 25 | 2.3%
 | 18 | 1.6%
Other values (79) | 292 | 26.6%

Common
Value | Count | Frequency (%)
(space) | 263 | 99.2%
1 | 1 | 0.4%
2 | 1 | 0.4%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1097 | 80.5%
ASCII | 265 | 19.5%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 263 | 99.2%
1 | 1 | 0.4%
2 | 1 | 0.4%

Hangul
Value | Count | Frequency (%)
 | 169 | 15.4%
 | 106 | 9.7%
 | 87 | 7.9%
 | 84 | 7.7%
 | 84 | 7.7%
 | 84 | 7.7%
 | 83 | 7.6%
 | 65 | 5.9%
 | 25 | 2.3%
 | 18 | 1.6%
Other values (79) | 292 | 26.6%

축주명 (owner name)
Categorical

HIGH CORRELATION 

Distinct: 23
Distinct (%): 27.4%
Missing: 0
Missing (%): 0.0%
Memory size: 804.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 6
Unique (%): 7.1%

Sample

1st row: 이○○
2nd row: 이○○
3rd row: 전○○
4th row: 오○○
5th row: 강○○

Common Values

Value | Count | Frequency (%)
이○○ | 17 | 20.2%
김○○ | 17 | 20.2%
박○○ | 9 | 10.7%
최○○ | 5 | 6.0%
강○○ | 4 | 4.8%
유○○ | 3 | 3.6%
장○○ | 3 | 3.6%
홍○○ | 2 | 2.4%
한○○ | 2 | 2.4%
정○○ | 2 | 2.4%
Other values (13) | 20 | 23.8%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
이○○ | 17 | 20.2%
김○○ | 17 | 20.2%
박○○ | 9 | 10.7%
최○○ | 5 | 6.0%
강○○ | 4 | 4.8%
유○○ | 3 | 3.6%
장○○ | 3 | 3.6%
전○○ | 2 | 2.4%
변○○ | 2 | 2.4%
지○○ | 2 | 2.4%
Other values (13) | 20 | 23.8%

축종 (livestock species)
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 4.8%
Missing: 0
Missing (%): 0.0%
Memory size: 804.0 B

Length

Max length: 4
Median length: 1
Mean length: 1.3690476
Min length: 1

Unique

Unique: 1
Unique (%): 1.2%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

Value | Count | Frequency (%)
 | 58 | 69.0%
오리 | 22 | 26.2%
메추리 | 3 | 3.6%
<NA> | 1 | 1.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
 | 58 | 69.0%
오리 | 22 | 26.2%
메추리 | 3 | 3.6%
na | 1 | 1.2%

품종 (breed)
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 8.3%
Missing: 0
Missing (%): 0.0%
Memory size: 804.0 B

Length

Max length: 5
Median length: 4
Mean length: 2.8809524
Min length: 2

Unique

Unique: 1
Unique (%): 1.2%

Sample

1st row: 육계
2nd row: 육계
3rd row: 육계
4th row: 육계
5th row: 육계

Common Values

Value | Count | Frequency (%)
육계 | 37 | 44.0%
육용오리 | 21 | 25.0%
토종닭 | 15 | 17.9%
산란계 | 4 | 4.8%
육종계 | 3 | 3.6%
산란메추리 | 3 | 3.6%
종오리 | 1 | 1.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
육계 | 37 | 44.0%
육용오리 | 21 | 25.0%
토종닭 | 15 | 17.9%
산란계 | 4 | 4.8%
육종계 | 3 | 3.6%
산란메추리 | 3 | 3.6%
종오리 | 1 | 1.2%

사육규모 (rearing scale, number of animals)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 46
Distinct (%): 54.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 31745.238
Minimum: 300
Maximum: 170000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 888.0 B

Quantile statistics

Minimum: 300
5-th percentile: 4690
Q1: 13750
Median: 24750
Q3: 46700
95-th percentile: 69700
Maximum: 170000
Range: 169700
Interquartile range (IQR): 32950

Descriptive statistics

Standard deviation: 26248.127
Coefficient of variation (CV): 0.82683669
Kurtosis: 8.2313934
Mean: 31745.238
Median Absolute Deviation (MAD): 14750
Skewness: 2.1444723
Sum: 2666600
Variance: 6.8896419 × 10^8
Monotonicity: Not monotonic
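
Several of these statistics are derived from one another, which gives a quick consistency check on the report. The sketch below recomputes the mean, coefficient of variation, variance, IQR and range from the other reported values; it uses only the numbers printed in this report, not the raw data, and the variable names are hypothetical.

```python
# Values copied from the statistics reported for 사육규모.
n = 84                      # number of observations
total = 2_666_600           # Sum
std = 26248.127             # Standard deviation
q1, q3 = 13750, 46700       # Q1 and Q3
minimum, maximum = 300, 170000

mean = total / n            # Mean = Sum / n
cv = std / mean             # Coefficient of variation = std / mean
variance = std ** 2         # Variance = std squared
iqr = q3 - q1               # Interquartile range = Q3 - Q1
value_range = maximum - minimum

print(f"mean={mean:.3f} cv={cv:.6f} variance={variance:.4e}")
print(f"iqr={iqr} range={value_range}")
```

Small differences in the last digits are expected because the inputs are already rounded in the report.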
Histogram with fixed size bins (bins=46)
Common values
Value | Count | Frequency (%)
30000 | 6 | 7.1%
50000 | 6 | 7.1%
10000 | 5 | 6.0%
15000 | 5 | 6.0%
40000 | 4 | 4.8%
60000 | 4 | 4.8%
12000 | 3 | 3.6%
20000 | 3 | 3.6%
19000 | 3 | 3.6%
35000 | 3 | 3.6%
Other values (36) | 42 | 50.0%

Minimum 10 values
Value | Count | Frequency (%)
300 | 1 | 1.2%
400 | 1 | 1.2%
1000 | 2 | 2.4%
4600 | 1 | 1.2%
5200 | 1 | 1.2%
6400 | 1 | 1.2%
7500 | 1 | 1.2%
9000 | 2 | 2.4%
9500 | 1 | 1.2%
10000 | 5 | 6.0%

Maximum 10 values
Value | Count | Frequency (%)
170000 | 1 | 1.2%
95000 | 1 | 1.2%
87000 | 1 | 1.2%
82000 | 1 | 1.2%
70000 | 1 | 1.2%
68000 | 1 | 1.2%
64000 | 1 | 1.2%
60000 | 4 | 4.8%
59000 | 1 | 1.2%
55000 | 1 | 1.2%

비 고 (remarks)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Memory size: 804.0 B

Length

Max length: 4
Median length: 4
Mean length: 3.8333333
Min length: 2
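
These length figures suggest that `<NA>` is being measured here as a literal four-character string, which would also explain why Missing is reported as 0 for this column. That reading is an inference from the numbers, not something the report states. A minimal check, using the counts from the Common Values table (77 and 7):

```python
# Category counts reported for 비 고; "<NA>" is treated as a plain string here.
counts = {"<NA>": 77, "AI": 7}

total_rows = sum(counts.values())
# Weighted mean of string lengths: (4 * 77 + 2 * 7) / 84
mean_length = sum(len(value) * c for value, c in counts.items()) / total_rows
max_length = max(len(value) for value in counts)
min_length = min(len(value) for value in counts)

print(f"{mean_length:.7f}", max_length, min_length)
```

The result matches the reported mean length of 3.8333333, with a maximum of 4 and a minimum of 2.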

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 77 | 91.7%
AI | 7 | 8.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 77 | 91.7%
ai | 7 | 8.3%

Interactions


Correlations

 | 주소 | 축주명 | 축종 | 품종 | 사육규모
주소 | 1.000 | 0.803 | 0.908 | 0.975 | 0.738
축주명 | 0.803 | 1.000 | 0.627 | 0.509 | 0.460
축종 | 0.908 | 0.627 | 1.000 | 1.000 | 0.647
품종 | 0.975 | 0.509 | 1.000 | 1.000 | 0.742
사육규모 | 0.738 | 0.460 | 0.647 | 0.742 | 1.000

 | 축주명 | 축종 | 품종 | 비 고
축주명 | 1.000 | 0.354 | 0.213 | 1.000
축종 | 0.354 | 1.000 | 0.975 | 1.000
품종 | 0.213 | 0.975 | 1.000 | 1.000
비 고 | 1.000 | 1.000 | 1.000 | 1.000

 | 사육규모 | 축주명 | 축종 | 품종 | 비 고
사육규모 | 1.000 | 0.192 | 0.521 | 0.335 | 1.000
축주명 | 0.192 | 1.000 | 0.354 | 0.213 | 1.000
축종 | 0.521 | 0.354 | 1.000 | 0.975 | 1.000
품종 | 0.335 | 0.213 | 0.975 | 1.000 | 1.000
비 고 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

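First rows
Index | 주소 | 축주명 | 축종 | 품종 | 사육규모 | 비 고
0 | 충청북도 청주시 가덕면 계산리 | 이○○ |  | 육계 | 40000 | <NA>
1 | 충청북도 청주시 가덕면 내암리 | 이○○ |  | 육계 | 19000 | <NA>
2 | 충청북도 청주시 가덕면 삼항2리 | 전○○ |  | 육계 | 12000 | <NA>
3 | 충청북도 청주시 가덕면 삼항리 | 오○○ |  | 육계 | 12000 | <NA>
4 | 충청북도 청주시 가덕면 인차리 | 강○○ |  | 육계 | 18000 | <NA>
5 | 충청북도 청주시 가덕면 행정리 | 김○○ |  | 육계 | 60000 | <NA>
6 | 충청북도 청주시 강내면 연정리 | 이○○ |  | 육계 | 40000 | <NA>
7 | 충청북도 청주시 강내면 저산리 | 이○○ |  | 육계 | 40000 | <NA>
8 | 충청북도 청주시 남이면 비룡리 | 이○○ |  | 육계 | 30000 | <NA>
9 | 충청북도 청주시 남이면 상발리 | 박○○ |  | 육계 | 33000 | <NA>

Last rows
Index | 주소 | 축주명 | 축종 | 품종 | 사육규모 | 비 고
74 | 충청북도 청주시 내수읍 도원리 | 김○○ | 오리 | 육용오리 | 5200 | <NA>
75 | 충청북도 청주시 옥산면 장남리 | 공○○ | 오리 | 육용오리 | 7500 | <NA>
76 | 충청북도 청주시 옥산면 신촌리 | 윤○○ | 오리 | 육용오리 | 25000 | <NA>
77 | 충청북도 청주시 옥산면 신촌리 | 박○○ | 오리 | 육용오리 | 9500 | <NA>
78 | 충청북도 청주시 오창읍 도암리 | 연○○ | 오리 | 육용오리 | 12000 | <NA>
79 | 충청북도 청주시 오창읍 성재리 | 유○○ | 오리 | 육용오리 | 10600 | AI
80 | 충청북도 청주시 오창읍 괴정리 | 김○○ | 오리 | 육용오리 | 14000 | <NA>
81 | 충청북도 청주시 오송읍 쌍청리 | 변○○ | 오리 | 육용오리 | 15000 | AI
82 | 충청북도 청주시 미원면 운교리 | 이○○ | 오리 | 육용오리 | 16000 | <NA>
83 | 충청북도 청주시 미원면 월용리 산 | 조○○ | 오리 | 종오리 | 6400 | <NA>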