Dataset statistics
Number of variables | 7 |
---|---|
Number of observations | 1004 |
Missing cells | 679 |
Missing cells (%) | 9.7% |
Duplicate rows | 2 |
Duplicate rows (%) | 0.2% |
Total size in memory | 58.0 KiB |
Average record size in memory | 59.1 B |
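The percentages in the table above follow from the counts. A minimal sketch of that arithmetic, using only figures quoted in this report (the helper name `pct` is illustrative, not part of any library):

```python
# Figures copied from the "Dataset statistics" table above.
n_variables = 7
n_observations = 1004
missing_cells = 679
duplicate_rows = 2

def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

total_cells = n_variables * n_observations            # every row has one cell per variable
missing_pct = pct(missing_cells, total_cells)         # share of cells that are empty
duplicate_pct = pct(duplicate_rows, n_observations)   # share of rows that are duplicates

print(f"Total cells: {total_cells}")
print(f"Missing cells: {missing_pct}%")
print(f"Duplicate rows: {duplicate_pct}%")
```

Note that the missing-cell share is taken over all cells (rows times variables), while the duplicate share is taken over rows only.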
Variable types
Numeric | 2 |
---|---|
Categorical | 3 |
Text | 1 |
Boolean | 1 |
Dataset
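Description | An internal management system for surveying and analyzing the business conditions of the producer-farm panel (farm scale, cultivation techniques, inputs, outputs, costs, etc.). It provides the fields 질문번호 (question number), 행/열 (row/column), 순번 (sequence number), 텍스트 (text), 사용자입력여부 (user-input flag), 단위 (unit), and 답변유형 (answer type, same as ques_t_cd). |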
---|---|
Author | 충청북도 (Chungcheongbuk-do) |
URL | https://www.data.go.kr/data/15050272/fileData.do |
Dataset has 2 (0.2%) duplicate rows | Duplicates |
행/열 is highly overall correlated with 사용자입력여부 and 2 other fields | High correlation |
답변유형 is highly overall correlated with 질문번호 and 3 other fields | High correlation |
단위 is highly overall correlated with 행/열 | High correlation |
사용자입력여부 is highly overall correlated with 행/열 and 1 other field | High correlation |
질문번호 is highly overall correlated with 답변유형 | High correlation |
순번 is highly overall correlated with 답변유형 | High correlation |
사용자입력여부 is highly imbalanced (92.5%) | Imbalance |
단위 is highly imbalanced (71.3%) | Imbalance |
답변유형 is highly imbalanced (94.0%) | Imbalance |
사용자입력여부 has 676 (67.3%) missing values | Missing |
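The imbalance percentages in the alerts above are consistent with a score of one minus the normalized Shannon entropy of the class counts. The sketch below is an assumption about how the figure is derived rather than a statement of the tool's implementation; the function name `imbalance_score` and the single-class convention are illustrative choices. It reproduces the reported values for 답변유형 and 사용자입력여부 from the counts quoted later in this report.

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for the given class counts.

    0.0 means the classes are perfectly balanced; values near 1.0 mean
    a single class dominates.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # convention chosen for this sketch: one class, nothing to compare
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# Class counts quoted in the per-variable sections of this report.
answer_type = [997, 7]   # 답변유형: "<NA>" vs "5"
user_input = [325, 3]    # 사용자입력여부: False vs True, missing values excluded

print(f"답변유형: {imbalance_score(answer_type):.1%}")
print(f"사용자입력여부: {imbalance_score(user_input):.1%}")
```

The score for 단위 cannot be checked the same way from this report alone, because only the most common of its 28 distinct values are listed with counts.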
Reproduction
Analysis started | 2023-12-11 23:30:23.548065 |
---|---|
Analysis finished | 2023-12-11 23:30:24.841617 |
Duration | 1.29 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
질문번호
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 98 |
---|---|
Distinct (%) | 9.8% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 47.241036 |
Minimum | 2 |
---|---|
Maximum | 105 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 9.0 KiB |
Quantile statistics
Minimum | 2 |
---|---|
5-th percentile | 5 |
Q1 | 21 |
median | 39 |
Q3 | 78 |
95-th percentile | 98 |
Maximum | 105 |
Range | 103 |
Interquartile range (IQR) | 57 |
Descriptive statistics
Standard deviation | 30.592124 |
---|---|
Coefficient of variation (CV) | 0.64757521 |
Kurtosis | -1.2798449 |
Mean | 47.241036 |
Median Absolute Deviation (MAD) | 25.5 |
Skewness | 0.27269615 |
Sum | 47430 |
Variance | 935.87804 |
Monotonicity | Not monotonic |
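Several of the statistics above are derived from others in the same tables. A short sketch, using only values quoted for 질문번호, recomputes the range, IQR, mean, coefficient of variation, and variance so the relationships are explicit (variable names are illustrative):

```python
import math

# Values copied from the 질문번호 tables above.
mean, std = 47.241036, 30.592124
minimum, maximum = 2, 105
q1, q3 = 21, 78
total, n = 47430, 1004

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # Interquartile range = Q3 - Q1
mean_from_sum = total / n         # Mean = Sum / Number of observations
cv = std / mean                   # Coefficient of variation = std / mean
variance = std ** 2               # Variance = squared standard deviation

print(f"Range: {value_range}, IQR: {iqr}")
print(f"Mean from sum: {mean_from_sum:.6f}")
print(f"CV: {cv:.8f}, Variance: {variance:.5f}")
```

Small differences in the last printed digit are expected, since the inputs here are already rounded to the precision shown in the report.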
Common values
Value | Count | Frequency (%) |
26 | 24 | 2.4% |
25 | 24 | 2.4% |
82 | 24 | 2.4% |
28 | 23 | 2.3% |
24 | 22 | 2.2% |
40 | 22 | 2.2% |
77 | 21 | 2.1% |
27 | 21 | 2.1% |
29 | 21 | 2.1% |
85 | 20 | 2.0% |
Other values (88) | 782 | 77.9% |
Minimum 10 values
Value | Count | Frequency (%) |
2 | 10 | 1.0% |
3 | 14 | 1.4% |
4 | 14 | 1.4% |
5 | 19 | 1.9% |
6 | 11 | 1.1% |
7 | 13 | 1.3% |
8 | 13 | 1.3% |
9 | 11 | 1.1% |
10 | 15 | 1.5% |
11 | 11 | 1.1% |
Maximum 10 values
Value | Count | Frequency (%) |
105 | 8 | 0.8% |
104 | 12 | 1.2% |
103 | 12 | 1.2% |
102 | 6 | 0.6% |
101 | 6 | 0.6% |
100 | 2 | 0.2% |
99 | 2 | 0.2% |
98 | 6 | 0.6% |
97 | 3 | 0.3% |
96 | 3 | 0.3% |
행/열
Categorical
HIGH CORRELATION
 
Distinct | 2 |
---|---|
Distinct (%) | 0.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 8.0 KiB |
Length
Max length | 1 |
---|---|
Median length | 1 |
Mean length | 1 |
Min length | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | R |
---|---|
2nd row | R |
3rd row | R |
4th row | R |
5th row | R |
Common Values
Value | Count | Frequency (%) |
C | 603 | 60.1% |
R | 401 | 39.9% |
순번
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 19 |
---|---|
Distinct (%) | 1.9% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 3.3286853 |
Minimum | 1 |
---|---|
Maximum | 19 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 9.0 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 1 |
Q1 | 2 |
median | 3 |
Q3 | 4 |
95-th percentile | 8 |
Maximum | 19 |
Range | 18 |
Interquartile range (IQR) | 2 |
Descriptive statistics
Standard deviation | 2.3741575 |
---|---|
Coefficient of variation (CV) | 0.71324182 |
Kurtosis | 6.1565359 |
Mean | 3.3286853 |
Median Absolute Deviation (MAD) | 1 |
Skewness | 1.8807347 |
Sum | 3342 |
Variance | 5.636624 |
Monotonicity | Not monotonic |
Common values
Value | Count | Frequency (%) |
1 | 233 | 23.2% |
2 | 223 | 22.2% |
3 | 169 | 16.8% |
4 | 129 | 12.8% |
5 | 115 | 11.5% |
6 | 44 | 4.4% |
7 | 31 | 3.1% |
8 | 24 | 2.4% |
9 | 19 | 1.9% |
10 | 4 | 0.4% |
Other values (9) | 13 | 1.3% |
Minimum 10 values
Value | Count | Frequency (%) |
1 | 233 | 23.2% |
2 | 223 | 22.2% |
3 | 169 | 16.8% |
4 | 129 | 12.8% |
5 | 115 | 11.5% |
6 | 44 | 4.4% |
7 | 31 | 3.1% |
8 | 24 | 2.4% |
9 | 19 | 1.9% |
10 | 4 | 0.4% |
Maximum 10 values
Value | Count | Frequency (%) |
19 | 1 | 0.1% |
18 | 1 | 0.1% |
17 | 1 | 0.1% |
16 | 1 | 0.1% |
15 | 1 | 0.1% |
14 | 1 | 0.1% |
13 | 1 | 0.1% |
12 | 3 | 0.3% |
11 | 3 | 0.3% |
10 | 4 | 0.4% |
텍스트
Text
Distinct | 449 |
---|---|
Distinct (%) | 44.9% |
Missing | 3 |
Missing (%) | 0.3% |
Memory size | 8.0 KiB |
Value | Count | Frequency (%) |
매우 | 43 | 2.4% |
보통이다 | 36 | 2.0% |
않다 | 32 | 1.8% |
그렇다 | 30 | 1.7% |
그렇지 | 30 | 1.7% |
있다 | 28 | 1.6% |
금액 | 27 | 1.5% |
(not rendered) | 27 | 1.5% |
비목 | 23 | 1.3% |
등 | 22 | 1.2% |
Other values (518) | 1476 | 83.2% |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 774 | 11.5% |
다 | 317 | 4.7% |
이 | 157 | 2.3% |
비 | 123 | 1.8% |
기 | 110 | 1.6% |
. | 109 | 1.6% |
적 | 99 | 1.5% |
농 | 93 | 1.4% |
(control character) | 93 | 1.4% |
수 | 87 | 1.3% |
Other values (324) | 4764 | 70.8% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 5377 | 79.9% |
Space Separator | 774 | 11.5% |
Other Punctuation | 174 | 2.6% |
Control | 93 | 1.4% |
Decimal Number | 91 | 1.4% |
Open Punctuation | 54 | 0.8% |
Close Punctuation | 54 | 0.8% |
Other Number | 36 | 0.5% |
Lowercase Letter | 33 | 0.5% |
Dash Punctuation | 26 | 0.4% |
Other values (2) | 14 | 0.2% |
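The category names in the table above correspond to Unicode general categories (for example `Lo` is "Other Letter" and `Zs` is "Space Separator"). The sketch below shows one way such counts could be produced with Python's standard `unicodedata` module; it is not taken from the profiling tool, and the function name `category_counts` and the name mapping are illustrative. The sample strings are copied from the 텍스트 column shown elsewhere in this report.

```python
import unicodedata
from collections import Counter

# Long names for the two-letter general-category codes seen in the table above.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Cc": "Control",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "No": "Other Number",
    "Ll": "Lowercase Letter",
    "Pd": "Dash Punctuation",
    "Lu": "Uppercase Letter",
    "Sm": "Math Symbol",
}

def category_counts(texts):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for text in texts:
        # Normalize to NFC so each Hangul syllable is one code point.
        for ch in unicodedata.normalize("NFC", text):
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

sample = ["사과 재배경력", "10년 후 전망", "매우 희망적이다"]
counts = category_counts(sample)
for name, count in counts.most_common():
    print(f"{name} | {count}")
```

Hangul syllables fall under "Other Letter" because the script has no upper or lower case, which is why that category dominates the table.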
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
다 | 317 | 5.9% |
이 | 157 | 2.9% |
비 | 123 | 2.3% |
기 | 110 | 2.0% |
적 | 99 | 1.8% |
농 | 93 | 1.7% |
수 | 87 | 1.6% |
매 | 83 | 1.5% |
가 | 82 | 1.5% |
보 | 81 | 1.5% |
Other values (294) | 4145 | 77.1% |
Decimal Number
Value | Count | Frequency (%) |
1 | 21 | 23.1% |
2 | 18 | 19.8% |
0 | 15 | 16.5% |
5 | 13 | 14.3% |
7 | 6 | 6.6% |
6 | 5 | 5.5% |
4 | 5 | 5.5% |
3 | 4 | 4.4% |
9 | 2 | 2.2% |
8 | 2 | 2.2% |
Other Punctuation
Value | Count | Frequency (%) |
. | 109 | 62.6% |
, | 41 | 23.6% |
· | 16 | 9.2% |
% | 5 | 2.9% |
/ | 3 | 1.7% |
Lowercase Letter
Value | Count | Frequency (%) |
g | 15 | 45.5% |
k | 15 | 45.5% |
m | 3 | 9.1% |
Other Number
Value | Count | Frequency (%) |
③ | 12 | 33.3% |
① | 12 | 33.3% |
② | 12 | 33.3% |
Uppercase Letter
Value | Count | Frequency (%) |
X | 5 | 38.5% |
O | 4 | 30.8% |
P | 4 | 30.8% |
Space Separator
Value | Count | Frequency (%) |
(space) | 774 | 100.0% |
Control
Value | Count | Frequency (%) |
(control character) | 93 | 100.0% |
Open Punctuation
Value | Count | Frequency (%) |
( | 54 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 54 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 26 | 100.0% |
Math Symbol
Value | Count | Frequency (%) |
~ | 1 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 5377 | 79.9% |
Common | 1303 | 19.4% |
Latin | 46 | 0.7% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
다 | 317 | 5.9% |
이 | 157 | 2.9% |
비 | 123 | 2.3% |
기 | 110 | 2.0% |
적 | 99 | 1.8% |
농 | 93 | 1.7% |
수 | 87 | 1.6% |
매 | 83 | 1.5% |
가 | 82 | 1.5% |
보 | 81 | 1.5% |
Other values (294) | 4145 | 77.1% |
Common
Value | Count | Frequency (%) |
(space) | 774 | 59.4% |
. | 109 | 8.4% |
(control character) | 93 | 7.1% |
( | 54 | 4.1% |
) | 54 | 4.1% |
, | 41 | 3.1% |
- | 26 | 2.0% |
1 | 21 | 1.6% |
2 | 18 | 1.4% |
· | 16 | 1.2% |
Other values (14) | 97 | 7.4% |
Latin
Value | Count | Frequency (%) |
g | 15 | 32.6% |
k | 15 | 32.6% |
X | 5 | 10.9% |
O | 4 | 8.7% |
P | 4 | 8.7% |
m | 3 | 6.5% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 5377 | 79.9% |
ASCII | 1297 | 19.3% |
Enclosed Alphanum | 36 | 0.5% |
None | 16 | 0.2% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 774 | 59.7% |
. | 109 | 8.4% |
(control character) | 93 | 7.2% |
( | 54 | 4.2% |
) | 54 | 4.2% |
, | 41 | 3.2% |
- | 26 | 2.0% |
1 | 21 | 1.6% |
2 | 18 | 1.4% |
g | 15 | 1.2% |
Other values (16) | 92 | 7.1% |
Hangul
Value | Count | Frequency (%) |
다 | 317 | 5.9% |
이 | 157 | 2.9% |
비 | 123 | 2.3% |
기 | 110 | 2.0% |
적 | 99 | 1.8% |
농 | 93 | 1.7% |
수 | 87 | 1.6% |
매 | 83 | 1.5% |
가 | 82 | 1.5% |
보 | 81 | 1.5% |
Other values (294) | 4145 | 77.1% |
None
Value | Count | Frequency (%) |
· | 16 | 100.0% |
Enclosed Alphanum
Value | Count | Frequency (%) |
③ | 12 | 33.3% |
① | 12 | 33.3% |
② | 12 | 33.3% |
사용자입력여부
Boolean
HIGH CORRELATION
  IMBALANCE
  MISSING
 
Distinct | 2 |
---|---|
Distinct (%) | 0.6% |
Missing | 676 |
Missing (%) | 67.3% |
Memory size | 2.1 KiB |
Value | Count | Frequency (%) |
False | 325 | 32.4% |
True | 3 | 0.3% |
(Missing) | 676 | 67.3% |
단위
Categorical
HIGH CORRELATION
  IMBALANCE
 
Distinct | 28 |
---|---|
Distinct (%) | 2.8% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 8.0 KiB |
Length
Max length | 9 |
---|---|
Median length | 4 |
Mean length | 3.5388446 |
Min length | 1 |
Unique
Unique | 13 |
---|---|
Unique (%) | 1.3% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | <NA> |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 811 | 80.8% |
% | 49 | 4.9% |
원 | 45 | 4.5% |
kg | 16 | 1.6% |
회 | 12 | 1.2% |
% | 11 | 1.1% |
(원) | 10 | 1.0% |
평 | 10 | 1.0% |
년 | 7 | 0.7% |
(%) | 5 | 0.5% |
Other values (18) | 28 | 2.8% |
Length
Value | Count | Frequency (%) |
na | 811 | 80.5% |
원 | 55 | 5.5% |
(not rendered) | 54 | 5.4% |
kg | 18 | 1.8% |
회 | 12 | 1.2% |
% | 11 | 1.1% |
평 | 10 | 1.0% |
년 | 7 | 0.7% |
원/kg | 5 | 0.5% |
일 | 5 | 0.5% |
Other values (9) | 20 | 2.0% |
답변유형
Categorical
HIGH CORRELATION
  IMBALANCE
 
Distinct | 2 |
---|---|
Distinct (%) | 0.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 8.0 KiB |
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 3.9790837 |
Min length | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | <NA> |
---|---|
2nd row | <NA> |
3rd row | <NA> |
4th row | <NA> |
5th row | <NA> |
Common Values
Value | Count | Frequency (%) |
<NA> | 997 | 99.3% |
5 | 7 | 0.7% |
 | 질문번호 | 행/열 | 순번 | 사용자입력여부 | 단위 |
---|---|---|---|---|---|
질문번호 | 1.000 | 0.316 | 0.342 | 0.000 | 0.717 |
행/열 | 0.316 | 1.000 | 0.418 | NaN | NaN |
순번 | 0.342 | 0.418 | 1.000 | 0.028 | 0.643 |
사용자입력여부 | 0.000 | NaN | 0.028 | 1.000 | 0.000 |
단위 | 0.717 | NaN | 0.643 | 0.000 | 1.000 |
 | 행/열 | 답변유형 | 단위 | 사용자입력여부 |
---|---|---|---|---|
행/열 | 1.000 | 1.000 | 1.000 | 1.000 |
답변유형 | 1.000 | 1.000 | NaN | 1.000 |
단위 | 1.000 | NaN | 1.000 | 0.000 |
사용자입력여부 | 1.000 | 1.000 | 0.000 | 1.000 |
 | 질문번호 | 순번 | 행/열 | 사용자입력여부 | 단위 | 답변유형 |
---|---|---|---|---|---|---|
질문번호 | 1.000 | -0.069 | 0.242 | 0.000 | 0.336 | 1.000 |
순번 | -0.069 | 1.000 | 0.320 | 0.034 | 0.370 | 1.000 |
행/열 | 0.242 | 0.320 | 1.000 | 1.000 | 1.000 | 1.000 |
사용자입력여부 | 0.000 | 0.034 | 1.000 | 1.000 | 0.000 | 1.000 |
단위 | 0.336 | 0.370 | 1.000 | 0.000 | 1.000 | NaN |
답변유형 | 1.000 | 1.000 | 1.000 | 1.000 | NaN | 1.000 |
First rows
 | 질문번호 | 행/열 | 순번 | 텍스트 | 사용자입력여부 | 단위 | 답변유형 |
---|---|---|---|---|---|---|---|
0 | 3 | R | 1 | 성명 | <NA> | <NA> | <NA> |
1 | 3 | R | 2 | 성별 | <NA> | <NA> | <NA> |
2 | 3 | R | 3 | 주소 | <NA> | <NA> | <NA> |
3 | 3 | R | 4 | 이메일주소 | <NA> | <NA> | <NA> |
4 | 3 | R | 5 | 출생연도 | <NA> | <NA> | <NA> |
5 | 3 | R | 6 | 전화번호 | <NA> | <NA> | <NA> |
6 | 3 | R | 7 | 휴대전화번호 | <NA> | <NA> | <NA> |
7 | 3 | R | 8 | 영농경력 | <NA> | <NA> | <NA> |
8 | 3 | R | 9 | 사과 재배경력 | <NA> | <NA> | <NA> |
9 | 4 | C | 1 | 품종명 | N | <NA> | <NA> |
Last rows
 | 질문번호 | 행/열 | 순번 | 텍스트 | 사용자입력여부 | 단위 | 답변유형 |
---|---|---|---|---|---|---|---|
994 | 104 | C | 4 | 희망적이다 | <NA> | <NA> | <NA> |
995 | 104 | C | 5 | 매우 희망적이다 | <NA> | <NA> | <NA> |
996 | 105 | R | 1 | 올해 전망 | <NA> | <NA> | <NA> |
997 | 105 | R | 2 | 5년 후 전망 | <NA> | <NA> | <NA> |
998 | 105 | R | 3 | 10년 후 전망 | <NA> | <NA> | <NA> |
999 | 105 | C | 1 | 매우 부정적이다 | <NA> | <NA> | <NA> |
1000 | 105 | C | 2 | 부정적이다 | <NA> | <NA> | <NA> |
1001 | 105 | C | 3 | 보통이다 | <NA> | <NA> | <NA> |
1002 | 105 | C | 4 | 희망적이다 | <NA> | <NA> | <NA> |
1003 | 105 | C | 5 | 매우 희망적이다 | <NA> | <NA> | <NA> |
Most frequently occurring
 | 질문번호 | 행/열 | 순번 | 텍스트 | 사용자입력여부 | 단위 | 답변유형 | # duplicates |
---|---|---|---|---|---|---|---|---|
0 | 25 | C | 3 | 보통이다 | <NA> | <NA> | <NA> | 2 |
1 | 26 | C | 3 | 보통이다 | <NA> | <NA> | <NA> | 2 |
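Duplicate rows like the two listed above can be found by counting identical records. The sketch below uses a small hand-built excerpt rather than the full dataset: the two duplicated rows are copied from the table above, one non-duplicated row is taken from the last-rows sample, and the columns that show only `<NA>` are omitted for brevity.

```python
from collections import Counter

# Each row is (질문번호, 행/열, 순번, 텍스트); hand-built excerpt, not the full data.
rows = [
    (25, "C", 3, "보통이다"),
    (25, "C", 3, "보통이다"),
    (26, "C", 3, "보통이다"),
    (26, "C", 3, "보통이다"),
    (105, "C", 3, "보통이다"),
]

counts = Counter(rows)

# Keep only the records that occur more than once.
duplicates = {row: n for row, n in counts.items() if n > 1}
duplicate_groups = len(duplicates)                      # distinct duplicated records
extra_copies = sum(n - 1 for n in duplicates.values())  # rows beyond the first copy

for row, n in duplicates.items():
    print(row, "occurs", n, "times")
print("Duplicated records:", duplicate_groups, "| extra copies:", extra_copies)
```

With each duplicated record occurring exactly twice, both ways of counting give 2, which matches the "Duplicate rows" figure in the dataset statistics.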