Overview

Dataset statistics

Number of variables: 8
Number of observations: 293
Missing cells: 586
Missing cells (%): 25.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 19.0 KiB
Average record size in memory: 66.5 B
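
As a quick consistency check, the overview figures can be recomputed from the per-column missing counts listed in the Alerts section. The sketch below is not part of the generated report: the name `missing_per_column` is only illustrative, and columns not flagged in the alerts are assumed to have no missing values.

```python
# Per-column missing counts as listed in the report's alerts.
# Assumption: columns not flagged there have no missing values.
missing_per_column = {"전화번호": 293, "지번주소": 226, "도로명 주소": 67}

n_variables = 8
n_observations = 293

missing_cells = sum(missing_per_column.values())
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(missing_cells, total_cells, f"{missing_pct:.1f}%")
```

Under these assumptions the three flagged columns account for all 586 missing cells, which is one quarter of the 8 x 293 cells in the table.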

Variable types

Text: 4
Unsupported: 1
Categorical: 2
Numeric: 1

Dataset

Description: Data on cattle farms within the jurisdiction, covering business name, representative, business address, number of head raised, and type of cattle raised.
Author: 충청남도 (Chungcheongnam-do)
URL: https://alldam.chungnam.go.kr/index.chungnam?menuCd=DOM_000000201001001001&st=&cds=&orgCd=&apiType=&isOpen=Y&pageIndex=397&beforeMenuCd=DOM_000000201001001000&publicdatapk=15042690

Alerts

데이터기준일 has constant value "2020-01-27" [Constant]
牛 종류 구분 is highly imbalanced (70.5%) [Imbalance]
전화번호 has 293 (100.0%) missing values [Missing]
지번주소 has 226 (77.1%) missing values [Missing]
도로명 주소 has 67 (22.9%) missing values [Missing]
전화번호 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]

Reproduction

Analysis started: 2024-01-09 21:43:34.865192
Analysis finished: 2024-01-09 21:43:35.532924
Duration: 0.67 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

사업장명칭 (business name)
Text

Distinct: 291
Distinct (%): 99.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 15
Median length: 3
Mean length: 4.4880546
Min length: 2

Characters and Unicode

Total characters: 1315
Distinct characters: 167
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
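
To illustrate the kind of character properties meant here, the following sketch uses Python's standard `unicodedata` module to classify the characters of one value taken from this dataset's sample rows. The helper names are hypothetical, and the Hangul check based on the character name is only a rough stand-in for a proper script lookup, which the standard library does not provide. The short codes correspond to the category names used in this report: `Lo` is Other Letter, `Ps` is Open Punctuation, `Pe` is Close Punctuation.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category (e.g. 'Lo', 'Ps', 'Pe')."""
    return Counter(unicodedata.category(ch) for ch in text)


def is_hangul_syllable(ch: str) -> bool:
    """Rough script check based on the character's Unicode name."""
    return unicodedata.name(ch, "").startswith("HANGUL SYLLABLE")


sample = "김원옥(이상진)"  # a value that appears in this dataset's sample rows
counts = category_counts(sample)
hangul = sum(is_hangul_syllable(ch) for ch in sample)

print(dict(counts), hangul)
```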

Unique

Unique: 289
Unique (%): 98.6%
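
The report lists "Distinct" and "Unique" separately. A minimal sketch of the difference, assuming "Distinct" means the number of different values and "Unique" means the number of values that occur exactly once (the function name and toy data are hypothetical):

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    n_distinct = len(counts)
    n_unique = sum(1 for c in counts.values() if c == 1)
    return n_distinct, n_unique


toy = ["a", "b", "b", "c", "d", "d"]  # illustrative data, not from the dataset
print(distinct_and_unique(toy))
```

Under that reading, 291 distinct and 289 unique values over 293 rows would mean that two values are repeated and together cover the remaining four rows.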

Sample

1st row: 김경훈
2nd row: 이복임
3rd row: 오상환
4th row: 이강헌
5th row: 길현종
Value | Count | Frequency (%)
이정숙 | 2 | 0.7%
이명임 | 2 | 0.7%
김영태 | 2 | 0.7%
김정근 | 1 | 0.3%
최병현 | 1 | 0.3%
강남규 | 1 | 0.3%
정건희 | 1 | 0.3%
신막선 | 1 | 0.3%
이경희(명창환 | 1 | 0.3%
박남일 | 1 | 0.3%
Other values (281) | 281 | 95.6%

Most occurring characters

ValueCountFrequency (%)
82
 
6.2%
( 75
 
5.7%
) 75
 
5.7%
56
 
4.3%
42
 
3.2%
34
 
2.6%
30
 
2.3%
23
 
1.7%
23
 
1.7%
23
 
1.7%
Other values (157) 852
64.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1152 | 87.6%
Open Punctuation | 75 | 5.7%
Close Punctuation | 75 | 5.7%
Decimal Number | 12 | 0.9%
Space Separator | 1 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
82
 
7.1%
56
 
4.9%
42
 
3.6%
34
 
3.0%
30
 
2.6%
23
 
2.0%
23
 
2.0%
23
 
2.0%
22
 
1.9%
22
 
1.9%
Other values (147) 795
69.0%
Decimal Number
Value | Count | Frequency (%)
2 | 4 | 33.3%
3 | 3 | 25.0%
5 | 1 | 8.3%
4 | 1 | 8.3%
7 | 1 | 8.3%
8 | 1 | 8.3%
6 | 1 | 8.3%
Open Punctuation
Value | Count | Frequency (%)
( | 75 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 75 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1152 | 87.6%
Common | 163 | 12.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
82
 
7.1%
56
 
4.9%
42
 
3.6%
34
 
3.0%
30
 
2.6%
23
 
2.0%
23
 
2.0%
23
 
2.0%
22
 
1.9%
22
 
1.9%
Other values (147) 795
69.0%
Common
ValueCountFrequency (%)
( 75
46.0%
) 75
46.0%
2 4
 
2.5%
3 3
 
1.8%
5 1
 
0.6%
4 1
 
0.6%
7 1
 
0.6%
8 1
 
0.6%
6 1
 
0.6%
1
 
0.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1152 | 87.6%
ASCII | 163 | 12.4%

Most frequent character per block

Hangul
ValueCountFrequency (%)
82
 
7.1%
56
 
4.9%
42
 
3.6%
34
 
3.0%
30
 
2.6%
23
 
2.0%
23
 
2.0%
23
 
2.0%
22
 
1.9%
22
 
1.9%
Other values (147) 795
69.0%
ASCII
ValueCountFrequency (%)
( 75
46.0%
) 75
46.0%
2 4
 
2.5%
3 3
 
1.8%
5 1
 
0.6%
4 1
 
0.6%
7 1
 
0.6%
8 1
 
0.6%
6 1
 
0.6%
1
 
0.6%

대표자명 (representative name)
Text

Distinct: 281
Distinct (%): 95.9%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 10
Median length: 3
Mean length: 3.331058
Min length: 2

Characters and Unicode

Total characters: 976
Distinct characters: 152
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 275
Unique (%): 93.9%

Sample

1st row: 김경훈
2nd row: 이복임
3rd row: 오상환
4th row: 이강헌
5th row: 길현종
Value | Count | Frequency (%)
금산축협 | 7 | 2.4%
김은주 | 3 | 1.0%
이정숙 | 2 | 0.7%
이명임 | 2 | 0.7%
김영태 | 2 | 0.7%
박병운 | 2 | 0.7%
이경희 | 1 | 0.3%
박상준 | 1 | 0.3%
명노관 | 1 | 0.3%
최영진 | 1 | 0.3%
Other values (271) | 271 | 92.5%

Most occurring characters

ValueCountFrequency (%)
66
 
6.8%
49
 
5.0%
36
 
3.7%
29
 
3.0%
27
 
2.8%
20
 
2.0%
20
 
2.0%
18
 
1.8%
17
 
1.7%
) 16
 
1.6%
Other values (142) 678
69.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 944 | 96.7%
Close Punctuation | 16 | 1.6%
Open Punctuation | 16 | 1.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
66
 
7.0%
49
 
5.2%
36
 
3.8%
29
 
3.1%
27
 
2.9%
20
 
2.1%
20
 
2.1%
18
 
1.9%
17
 
1.8%
14
 
1.5%
Other values (140) 648
68.6%
Close Punctuation
ValueCountFrequency (%)
) 16
100.0%
Open Punctuation
ValueCountFrequency (%)
( 16
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 944 | 96.7%
Common | 32 | 3.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
66
 
7.0%
49
 
5.2%
36
 
3.8%
29
 
3.1%
27
 
2.9%
20
 
2.1%
20
 
2.1%
18
 
1.9%
17
 
1.8%
14
 
1.5%
Other values (140) 648
68.6%
Common
ValueCountFrequency (%)
) 16
50.0%
( 16
50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 944 | 96.7%
ASCII | 32 | 3.3%

Most frequent character per block

Hangul
ValueCountFrequency (%)
66
 
7.0%
49
 
5.2%
36
 
3.8%
29
 
3.1%
27
 
2.9%
20
 
2.1%
20
 
2.1%
18
 
1.9%
17
 
1.8%
14
 
1.5%
Other values (140) 648
68.6%
ASCII
ValueCountFrequency (%)
) 16
50.0%
( 16
50.0%

전화번호 (phone number)
Unsupported

MISSING  REJECTED  UNSUPPORTED

Missing: 293
Missing (%): 100.0%
Memory size: 2.7 KiB

지번주소 (lot-number address)
Text

MISSING

Distinct: 56
Distinct (%): 83.6%
Missing: 226
Missing (%): 77.1%
Memory size: 2.4 KiB

Length

Max length: 23
Median length: 22
Mean length: 20.985075
Min length: 18

Characters and Unicode

Total characters: 1406
Distinct characters: 77
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 47
Unique (%): 70.1%

Sample

1st row: 충청남도 금산군 군북면 동편리 46-1
2nd row: 충청남도 금산군 군북면 보광리 325
3rd row: 충청남도 금산군 군북면 산안리 257-1
4th row: 충청남도 금산군 군북면 산안리 497
5th row: 충청남도 금산군 군북면 상곡리 838-1
Value | Count | Frequency (%)
충청남도 | 67 | 20.0%
금산군 | 67 | 20.0%
군북면 | 12 | 3.6%
금성면 | 12 | 3.6%
복수면 | 12 | 3.6%
추부면 | 7 | 2.1%
도곡리 | 6 | 1.8%
부리면 | 6 | 1.8%
금산읍 | 5 | 1.5%
진산면 | 5 | 1.5%
Other values (93) | 136 | 40.6%

Most occurring characters

ValueCountFrequency (%)
268
19.1%
84
 
6.0%
79
 
5.6%
79
 
5.6%
76
 
5.4%
73
 
5.2%
73
 
5.2%
67
 
4.8%
67
 
4.8%
62
 
4.4%
Other values (67) 478
34.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 871 | 61.9%
Space Separator | 268 | 19.1%
Decimal Number | 231 | 16.4%
Dash Punctuation | 36 | 2.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
84
9.6%
79
9.1%
79
9.1%
76
 
8.7%
73
 
8.4%
73
 
8.4%
67
 
7.7%
67
 
7.7%
62
 
7.1%
14
 
1.6%
Other values (55) 197
22.6%
Decimal Number
Value | Count | Frequency (%)
2 | 39 | 16.9%
1 | 38 | 16.5%
5 | 30 | 13.0%
3 | 26 | 11.3%
8 | 23 | 10.0%
6 | 19 | 8.2%
4 | 18 | 7.8%
9 | 16 | 6.9%
0 | 12 | 5.2%
7 | 10 | 4.3%
Space Separator
Value | Count | Frequency (%)
(space) | 268 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 36 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 871 | 61.9%
Common | 535 | 38.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
84
9.6%
79
9.1%
79
9.1%
76
 
8.7%
73
 
8.4%
73
 
8.4%
67
 
7.7%
67
 
7.7%
62
 
7.1%
14
 
1.6%
Other values (55) 197
22.6%
Common
ValueCountFrequency (%)
268
50.1%
2 39
 
7.3%
1 38
 
7.1%
- 36
 
6.7%
5 30
 
5.6%
3 26
 
4.9%
8 23
 
4.3%
6 19
 
3.6%
4 18
 
3.4%
9 16
 
3.0%
Other values (2) 22
 
4.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 871 | 61.9%
ASCII | 535 | 38.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
268
50.1%
2 39
 
7.3%
1 38
 
7.1%
- 36
 
6.7%
5 30
 
5.6%
3 26
 
4.9%
8 23
 
4.3%
6 19
 
3.6%
4 18
 
3.4%
9 16
 
3.0%
Other values (2) 22
 
4.1%
Hangul
ValueCountFrequency (%)
84
9.6%
79
9.1%
79
9.1%
76
 
8.7%
73
 
8.4%
73
 
8.4%
67
 
7.7%
67
 
7.7%
62
 
7.1%
14
 
1.6%
Other values (55) 197
22.6%

도로명 주소 (road-name address)
Text

MISSING

Distinct: 158
Distinct (%): 69.9%
Missing: 67
Missing (%): 22.9%
Memory size: 2.4 KiB

Length

Max length: 24
Median length: 22
Mean length: 21.141593
Min length: 17

Characters and Unicode

Total characters: 4778
Distinct characters: 156
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
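
The "Mean length" reported for each text variable appears to be the total character count divided by the number of non-missing values. The sketch below checks that relationship against the figures in this report. It assumes the first two text variables are 사업장명칭 and 대표자명 (following the column order in the Sample section); the dictionary and function names are illustrative only.

```python
N_OBSERVATIONS = 293

# variable: (total characters, missing values, reported mean length)
text_stats = {
    "사업장명칭": (1315, 0, 4.4880546),    # assumed name of the first text variable
    "대표자명": (976, 0, 3.331058),        # assumed name of the second text variable
    "지번주소": (1406, 226, 20.985075),
    "도로명 주소": (4778, 67, 21.141593),
}


def mean_length(total_chars: int, missing: int) -> float:
    """Mean string length over the non-missing values only."""
    return total_chars / (N_OBSERVATIONS - missing)


recomputed = {name: mean_length(chars, miss) for name, (chars, miss, _) in text_stats.items()}
for name, value in recomputed.items():
    print(name, round(value, 6))
```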

Unique

Unique: 110
Unique (%): 48.7%

Sample

1st row: 충청남도 금산군 군북면 안보광1길 73
2nd row: 충청남도 금산군 군북면 보광로 242-5
3rd row: 충청남도 금산군 군북면 보광로 241
4th row: 충청남도 금산군 군북면 자진뱅이길 172-7
5th row: 충청남도 금산군 군북면 자진뱅이길 172-7
Value | Count | Frequency (%)
충청남도 | 226 | 20.1%
금산군 | 226 | 20.1%
금성면 | 39 | 3.5%
부리면 | 29 | 2.6%
군북면 | 26 | 2.3%
남일면 | 25 | 2.2%
복수면 | 24 | 2.1%
제원면 | 22 | 2.0%
진산면 | 21 | 1.9%
추부면 | 17 | 1.5%
Other values (253) | 472 | 41.9%

Most occurring characters

ValueCountFrequency (%)
905
18.9%
297
 
6.2%
273
 
5.7%
262
 
5.5%
253
 
5.3%
226
 
4.7%
226
 
4.7%
226
 
4.7%
214
 
4.5%
162
 
3.4%
Other values (146) 1734
36.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3057 | 64.0%
Space Separator | 905 | 18.9%
Decimal Number | 719 | 15.0%
Dash Punctuation | 97 | 2.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
297
 
9.7%
273
 
8.9%
262
 
8.6%
253
 
8.3%
226
 
7.4%
226
 
7.4%
226
 
7.4%
214
 
7.0%
162
 
5.3%
66
 
2.2%
Other values (134) 852
27.9%
Decimal Number
Value | Count | Frequency (%)
1 | 141 | 19.6%
2 | 106 | 14.7%
3 | 80 | 11.1%
4 | 75 | 10.4%
7 | 72 | 10.0%
5 | 70 | 9.7%
6 | 56 | 7.8%
9 | 48 | 6.7%
0 | 43 | 6.0%
8 | 28 | 3.9%
Space Separator
Value | Count | Frequency (%)
(space) | 905 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 97 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3057 | 64.0%
Common | 1721 | 36.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
297
 
9.7%
273
 
8.9%
262
 
8.6%
253
 
8.3%
226
 
7.4%
226
 
7.4%
226
 
7.4%
214
 
7.0%
162
 
5.3%
66
 
2.2%
Other values (134) 852
27.9%
Common
ValueCountFrequency (%)
905
52.6%
1 141
 
8.2%
2 106
 
6.2%
- 97
 
5.6%
3 80
 
4.6%
4 75
 
4.4%
7 72
 
4.2%
5 70
 
4.1%
6 56
 
3.3%
9 48
 
2.8%
Other values (2) 71
 
4.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3057 | 64.0%
ASCII | 1721 | 36.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
905
52.6%
1 141
 
8.2%
2 106
 
6.2%
- 97
 
5.6%
3 80
 
4.6%
4 75
 
4.4%
7 72
 
4.2%
5 70
 
4.1%
6 56
 
3.3%
9 48
 
2.8%
Other values (2) 71
 
4.1%
Hangul
ValueCountFrequency (%)
297
 
9.7%
273
 
8.9%
262
 
8.6%
253
 
8.3%
226
 
7.4%
226
 
7.4%
226
 
7.4%
214
 
7.0%
162
 
5.3%
66
 
2.2%
Other values (134) 852
27.9%

牛 종류 구분 (cattle type)
Categorical

IMBALANCE

Distinct: 3
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB
한우 (Hanwoo, Korean native cattle): 267
젖소 (dairy cattle): 24
육우 (beef cattle): 2

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 한우
2nd row: 한우
3rd row: 한우
4th row: 한우
5th row: 한우

Common Values

Value | Count | Frequency (%)
한우 | 267 | 91.1%
젖소 | 24 | 8.2%
육우 | 2 | 0.7%
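
The alert for this variable reports an imbalance of 70.5%. One formula that reproduces that figure from the counts above is one minus the normalized Shannon entropy of the class distribution. The sketch below assumes this is how the score is defined; it is not taken from the report itself, and the function name is hypothetical.

```python
import math


def imbalance_score(counts):
    """1 - H / log2(k): 0 for perfectly balanced classes, 1 for a single dominant class."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no balance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


score = imbalance_score([267, 24, 2])  # counts for 한우, 젖소, 육우
print(f"{score:.1%}")
```

A perfectly balanced three-class variable would score 0 under this definition, so a value near 0.7 reflects how strongly 한우 dominates.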

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
한우 | 267 | 91.1%
젖소 | 24 | 8.2%
육우 | 2 | 0.7%

사육두수 (number of head raised)
Real number (ℝ)

Distinct: 91
Distinct (%): 31.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 33.52901
Minimum: 1
Maximum: 314
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.7 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 4
Median: 9
Q3: 35
95th percentile: 157.2
Maximum: 314
Range: 313
Interquartile range (IQR): 31

Descriptive statistics

Standard deviation: 56.757626
Coefficient of variation (CV): 1.6927916
Kurtosis: 8.3188193
Mean: 33.52901
Median Absolute Deviation (MAD): 7
Skewness: 2.7851618
Sum: 9824
Variance: 3221.4281
Monotonicity: Not monotonic
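
Several of these statistics are simple functions of one another, which makes them easy to sanity-check. The sketch below recomputes the derived values from the reported mean, standard deviation, sum, quartiles, and extremes; the variable names are illustrative and the input numbers are copied from this report.

```python
# Values copied from the report for 사육두수.
n = 293
total = 9824
mean = 33.52901
std = 56.757626
q1, q3 = 4, 35
minimum, maximum = 1, 314

cv = std / mean                # coefficient of variation
variance = std ** 2            # variance is the squared standard deviation
iqr = q3 - q1                  # interquartile range
value_range = maximum - minimum
mean_from_sum = total / n      # mean recomputed from sum and row count

print(round(cv, 7), round(variance, 4), iqr, value_range, round(mean_from_sum, 5))
```

The mean (about 33.5) sits well above the median (9), which is consistent with the positive skewness reported above: a few large herds pull the average up.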
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
2 | 34 | 11.6%
5 | 22 | 7.5%
1 | 20 | 6.8%
3 | 19 | 6.5%
4 | 18 | 6.1%
7 | 12 | 4.1%
9 | 10 | 3.4%
6 | 10 | 3.4%
11 | 7 | 2.4%
10 | 6 | 2.0%
Other values (81) | 135 | 46.1%

Value | Count | Frequency (%)
1 | 20 | 6.8%
2 | 34 | 11.6%
3 | 19 | 6.5%
4 | 18 | 6.1%
5 | 22 | 7.5%
6 | 10 | 3.4%
7 | 12 | 4.1%
8 | 5 | 1.7%
9 | 10 | 3.4%
10 | 6 | 2.0%

Value | Count | Frequency (%)
314 | 1 | 0.3%
311 | 1 | 0.3%
290 | 1 | 0.3%
283 | 1 | 0.3%
278 | 1 | 0.3%
259 | 1 | 0.3%
230 | 1 | 0.3%
202 | 1 | 0.3%
201 | 1 | 0.3%
198 | 1 | 0.3%

데이터기준일 (data reference date)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB
2020-01-27: 293

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020-01-27
2nd row: 2020-01-27
3rd row: 2020-01-27
4th row: 2020-01-27
5th row: 2020-01-27

Common Values

Value | Count | Frequency (%)
2020-01-27 | 293 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2020-01-27 | 293 | 100.0%

Interactions


Correlations

Variable | 지번주소 | 牛 종류 구분 | 사육두수
지번주소 | 1.000 | 1.000 | 0.000
牛 종류 구분 | 1.000 | 1.000 | 0.465
사육두수 | 0.000 | 0.465 | 1.000

Variable | 사육두수 | 牛 종류 구분
사육두수 | 1.000 | 0.312
牛 종류 구분 | 0.312 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
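
As a small illustration of nullity correlation, the sketch below computes the Pearson correlation between the "is missing" indicators of 지번주소 and 도로명 주소 for rows 0 to 4 of the Sample section, then merges the two columns into a single address. It uses only those five rows, so it shows the idea rather than the report's actual heatmap values; the helper name `pearson` is hypothetical. The column totals (226 and 67 missing out of 293) are consistent with every row carrying exactly one of the two addresses, but do not prove it.

```python
import math

# (지번주소, 도로명 주소) for rows 0-4 of the Sample section; None stands for <NA>.
rows = [
    ("충청남도 금산군 군북면 동편리 46-1", None),
    (None, "충청남도 금산군 군북면 안보광1길 73"),
    ("충청남도 금산군 군북면 보광리 325", None),
    (None, "충청남도 금산군 군북면 보광로 242-5"),
    (None, "충청남도 금산군 군북면 보광로 241"),
]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


lot_missing = [1.0 if lot is None else 0.0 for lot, _ in rows]
road_missing = [1.0 if road is None else 0.0 for _, road in rows]
r = pearson(lot_missing, road_missing)

# Coalesce: prefer the lot-number address, fall back to the road-name address.
address = [lot if lot is not None else road for lot, road in rows]

print(round(r, 3), address[1])
```

A correlation close to -1 here means that, in these rows, one address column is filled exactly when the other is empty, which is why coalescing them yields a complete address for every row in the sample.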

Sample

(index) | 사업장명칭 | 대표자명 | 전화번호 | 지번주소 | 도로명 주소 | 牛 종류 구분 | 사육두수 | 데이터기준일
0 | 김경훈 | 김경훈 | <NA> | 충청남도 금산군 군북면 동편리 46-1 | <NA> | 한우 | 126 | 2020-01-27
1 | 이복임 | 이복임 | <NA> | <NA> | 충청남도 금산군 군북면 안보광1길 73 | 한우 | 7 | 2020-01-27
2 | 오상환 | 오상환 | <NA> | 충청남도 금산군 군북면 보광리 325 | <NA> | 한우 | 1 | 2020-01-27
3 | 이강헌 | 이강헌 | <NA> | <NA> | 충청남도 금산군 군북면 보광로 242-5 | 한우 | 3 | 2020-01-27
4 | 길현종 | 길현종 | <NA> | <NA> | 충청남도 금산군 군북면 보광로 241 | 한우 | 3 | 2020-01-27
5 | 강영창 | 강영창 | <NA> | <NA> | 충청남도 금산군 군북면 자진뱅이길 172-7 | 한우 | 9 | 2020-01-27
6 | 김원옥(이상진) | 김원옥(이상진) | <NA> | <NA> | 충청남도 금산군 군북면 자진뱅이길 172-7 | 한우 | 3 | 2020-01-27
7 | 김현경 | 김현경 | <NA> | <NA> | 충청남도 금산군 군북면 자진뱅이길 172-7 | 한우 | 8 | 2020-01-27
8 | 이상진 | 이상진 | <NA> | <NA> | 충청남도 금산군 군북면 자진뱅이길 172-7 | 한우 | 5 | 2020-01-27
9 | 한광년(강영창) | 한광년(강영창) | <NA> | <NA> | 충청남도 금산군 군북면 자진뱅이길 172-7 | 한우 | 5 | 2020-01-27

(index) | 사업장명칭 | 대표자명 | 전화번호 | 지번주소 | 도로명 주소 | 牛 종류 구분 | 사육두수 | 데이터기준일
283 | 이민우 | 이민우 | <NA> | 충청남도 금산군 추부면 용지리 125 | <NA> | 한우 | 25 | 2020-01-27
284 | 정광조 | 정광조 | <NA> | <NA> | 충청남도 금산군 추부면 아래못골길 7 | 한우 | 2 | 2020-01-27
285 | 김진수 | 김진수 | <NA> | <NA> | 충청남도 금산군 추부면 추풍로 234-22 | 한우 | 111 | 2020-01-27
286 | 이소영(김진수) | 이소영 | <NA> | <NA> | 충청남도 금산군 추부면 추풍로 234-22 | 한우 | 2 | 2020-01-27
287 | 강덕원 | 강덕원 | <NA> | <NA> | 충청남도 금산군 추부면 추풍천2길 66 | 한우 | 25 | 2020-01-27
288 | 박정완 | 박정완 | <NA> | <NA> | 충청남도 금산군 추부면 삽재길 77 | 한우 | 2 | 2020-01-27
289 | 김둘자(정봉구) | 김둘자 | <NA> | <NA> | 충청남도 금산군 추부면 추풍로 98 | 한우 | 1 | 2020-01-27
290 | 정봉구 | 정봉구 | <NA> | <NA> | 충청남도 금산군 추부면 추풍로 98 | 한우 | 17 | 2020-01-27
291 | 김정근 | 김정근 | <NA> | <NA> | 충청남도 금산군 추부면 숭암로 52 | 한우 | 7 | 2020-01-27
292 | 김정원(김정근) | 김정원(김정근) | <NA> | <NA> | 충청남도 금산군 추부면 숭암로 52 | 한우 | 3 | 2020-01-27