Dataset statistics
Number of variables | 6 |
---|---|
Number of observations | 75 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 3.9 KiB |
Average record size in memory | 53.7 B |
Variable types
Categorical | 3 |
---|---|
Numeric | 3 |
Dataset
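Description | Statistics on vessel AGM inspections of ships departing for North American countries (the United States, Canada, etc.), Chile, New Zealand, and other destinations, as required by NAPPO (North American Plant Protection Organization) |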
---|---|
Author | 국제식물검역인증원 (International Plant-quarantine Accreditation Board) |
URL | https://data.mafra.go.kr/opendata/data/indexOpenDataDetail.do?data_id=20220714000000002161 |
Alerts
건수(기본) (count, basic) is highly correlated with 건수(할증) (count, surcharge) | High correlation |
---|---|
건수(할증) is highly correlated with 건수(기본) | High correlation |
건수(기본) has 7 (9.3%) zeros | Zeros |
건수(할증) has 6 (8.0%) zeros | Zeros |
Reproduction
Analysis started | 2022-08-12 14:48:36.566364 |
---|---|
Analysis finished | 2022-08-12 14:48:38.244419 |
Duration | 1.68 seconds |
Software version | pandas-profiling v3.2.0 |
검사년도 (inspection year)
Categorical
Distinct | 5 |
---|---|
Distinct (%) | 6.7% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 728.0 B |
Categories (bar chart): 2017, 2016, 2015, 2014, 2013
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 4 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 2017 |
---|---|
2nd row | 2017 |
3rd row | 2017 |
4th row | 2017 |
5th row | 2017 |
Common Values
Value | Count | Frequency (%) |
2017 | 15 | 20.0% |
2016 | 15 | 20.0% |
2015 | 15 | 20.0% |
2014 | 15 | 20.0% |
2013 | 15 | 20.0% |
선박 종류 (vessel type)
Categorical
Distinct | 4 |
---|---|
Distinct (%) | 5.3% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 728.0 B |
Categories (bar chart): 컨테이너 (container ship), 기타선박류 (other vessels), 벌크선 (bulk carrier), 차량운반선 (vehicle carrier)
Length
Max length | 5 |
---|---|
Median length | 4 |
Mean length | 4.266666667 |
Min length | 3 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 벌크선 |
---|---|
2nd row | 차량운반선 |
3rd row | 벌크선 |
4th row | 컨테이너 |
5th row | 컨테이너 |
Common Values
Value | Count | Frequency (%) |
컨테이너 | 25 | 33.3% |
기타선박류 | 20 | 26.7% |
벌크선 | 15 | 20.0% |
차량운반선 | 15 | 20.0% |
선박 중량 (vessel weight class)
Categorical
Distinct | 9 |
---|---|
Distinct (%) | 12.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 728.0 B |
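Categories (bar chart): 2.5만톤 미만 (under 25,000 tons), 2.5만~4만톤 미만 (25,000 to under 40,000 tons), 4만톤 이상 (40,000 tons or more), 7만톤 이상 (70,000 tons or more), 2만톤 미만 (under 20,000 tons), other values (4)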
Length
Max length | 11 |
---|---|
Median length | 9 |
Mean length | 8.2 |
Min length | 6 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 2.5만톤 미만 |
---|---|
2nd row | 4만톤 이상 |
3rd row | 4만톤 이상 |
4th row | 2만톤 미만 |
5th row | 2만~3만톤 미만 |
Common Values
Value | Count | Frequency (%) |
2.5만톤 미만 | 15 | 20.0% |
2.5만~4만톤 미만 | 15 | 20.0% |
4만톤 이상 | 10 | 13.3% |
7만톤 이상 | 10 | 13.3% |
2만톤 미만 | 5 | 6.7% |
2만~3만톤 미만 | 5 | 6.7% |
3만~5만톤 미만 | 5 | 6.7% |
5만~7만톤 미만 | 5 | 6.7% |
4만~7만톤 미만 | 5 | 6.7% |
Words
Value | Count | Frequency (%) |
미만 | 55 | 36.7% |
이상 | 20 | 13.3% |
2.5만톤 | 15 | 10.0% |
2.5만~4만톤 | 15 | 10.0% |
4만톤 | 10 | 6.7% |
7만톤 | 10 | 6.7% |
2만톤 | 5 | 3.3% |
2만~3만톤 | 5 | 3.3% |
3만~5만톤 | 5 | 3.3% |
5만~7만톤 | 5 | 3.3% |
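The word counts in the table above can be reproduced from the category counts listed under Common Values. The sketch below assumes the labels are split on whitespace, which is an assumption about how the report tokenizes text; the variable names are illustrative. In these labels, 미만 means "under" and 이상 means "or more".

```python
from collections import Counter

# Category counts copied from the Common Values table for 선박 중량.
category_counts = {
    "2.5만톤 미만": 15,
    "2.5만~4만톤 미만": 15,
    "4만톤 이상": 10,
    "7만톤 이상": 10,
    "2만톤 미만": 5,
    "2만~3만톤 미만": 5,
    "3만~5만톤 미만": 5,
    "5만~7만톤 미만": 5,
    "4만~7만톤 미만": 5,
}

# Split each label on whitespace (assumed tokenization) and weight
# every word by the number of rows that carry the label.
word_counts = Counter()
for label, count in category_counts.items():
    for word in label.split():
        word_counts[word] += count

total_words = sum(word_counts.values())

# Print each word with its count and its share of all words.
for word, count in word_counts.most_common():
    print(word, count, f"{100 * count / total_words:.1f}%")
```

Each label contributes two words, so the percentages are taken over 150 words rather than over the 75 observations.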
검사수수료(천원) (inspection fee, thousand KRW)
Real number (ℝ≥0)
Distinct | 10 |
---|---|
Distinct (%) | 13.3% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 996.1333333 |
Minimum | 80 |
Maximum | 2250 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 803.0 B |
Quantile statistics
Minimum | 80 |
---|---|
5-th percentile | 120 |
Q1 | 200 |
median | 1125 |
Q3 | 1500 |
95-th percentile | 2250 |
Maximum | 2250 |
Range | 2170 |
Interquartile range (IQR) | 1300 |
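The quantile statistics above are consistent with linear interpolation between order statistics. The sketch below assumes that this is the method the report uses (it is the pandas default) and rebuilds the fee column from the value counts listed for this variable; the helper name `quantile` is illustrative.

```python
import math

# Value counts for 검사수수료(천원), copied from the report's value table.
fee_counts = {80: 2, 120: 8, 160: 8, 200: 8, 240: 4,
              750: 3, 1125: 12, 1500: 12, 1875: 12, 2250: 6}

# Expand the counts into one sorted value per observation (75 in total).
fees = sorted(v for v, c in fee_counts.items() for _ in range(c))

def quantile(sorted_values, q):
    """Quantile by linear interpolation between the two nearest order statistics."""
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])

p05 = quantile(fees, 0.05)
q1 = quantile(fees, 0.25)
median = quantile(fees, 0.50)
q3 = quantile(fees, 0.75)
p95 = quantile(fees, 0.95)
iqr = q3 - q1

print(p05, q1, median, q3, p95, iqr)
```

Under this assumption the results agree with the table: 120, 200, 1125, 1500, 2250, and an interquartile range of 1300.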
Descriptive statistics
Standard deviation | 761.3961098 |
---|---|
Coefficient of variation (CV) | 0.7643516027 |
Kurtosis | -1.497680411 |
Mean | 996.1333333 |
Median Absolute Deviation (MAD) | 750 |
Skewness | 0.1298834873 |
Sum | 74710 |
Variance | 579724.036 |
Monotonicity | Not monotonic |
Common values
Value | Count | Frequency (%) |
1125 | 12 | 16.0% |
1875 | 12 | 16.0% |
1500 | 12 | 16.0% |
120 | 8 | 10.7% |
200 | 8 | 10.7% |
160 | 8 | 10.7% |
2250 | 6 | 8.0% |
240 | 4 | 5.3% |
750 | 3 | 4.0% |
80 | 2 | 2.7% |
Minimum 10 values
Value | Count | Frequency (%) |
80 | 2 | 2.7% |
120 | 8 | 10.7% |
160 | 8 | 10.7% |
200 | 8 | 10.7% |
240 | 4 | 5.3% |
750 | 3 | 4.0% |
1125 | 12 | 16.0% |
1500 | 12 | 16.0% |
1875 | 12 | 16.0% |
2250 | 6 | 8.0% |
Maximum 10 values
Value | Count | Frequency (%) |
2250 | 6 | 8.0% |
1875 | 12 | 16.0% |
1500 | 12 | 16.0% |
1125 | 12 | 16.0% |
750 | 3 | 4.0% |
240 | 4 | 5.3% |
200 | 8 | 10.7% |
160 | 8 | 10.7% |
120 | 8 | 10.7% |
80 | 2 | 2.7% |
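Because 검사수수료(천원) has only 10 distinct values, its descriptive statistics can be checked directly from the value counts. The sketch below uses only the Python standard library and assumes the report computes the sample variance (n - 1 denominator); the variable names are illustrative.

```python
import statistics

# Value counts for 검사수수료(천원), copied from the tables above.
fee_counts = {1125: 12, 1875: 12, 1500: 12, 120: 8, 200: 8,
              160: 8, 2250: 6, 240: 4, 750: 3, 80: 2}

# Expand the counts back into one fee per observation.
fees = [value for value, count in fee_counts.items() for _ in range(count)]

n = len(fees)
total = sum(fees)
mean = statistics.mean(fees)
median = statistics.median(fees)
variance = statistics.variance(fees)  # sample variance, n - 1 denominator
std = statistics.stdev(fees)          # square root of the sample variance
cv = std / mean                       # coefficient of variation
mad = statistics.median([abs(x - median) for x in fees])  # median absolute deviation

print(n, total, round(mean, 4), median, round(variance, 3), round(cv, 4), mad)
```

The results agree with the reported figures: 75 observations, a sum of 74710, a mean of about 996.1333, a median of 1125, a variance of about 579724.036, and a MAD of 750.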
건수(기본) (count, basic)
Real number (ℝ≥0)
Distinct | 56 |
---|---|
Distinct (%) | 74.7% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 91.82666667 |
Minimum | 0 |
Maximum | 272 |
Zeros | 7 |
Zeros (%) | 9.3% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 803.0 B |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 7 |
median | 86 |
Q3 | 144 |
95-th percentile | 253.1 |
Maximum | 272 |
Range | 272 |
Interquartile range (IQR) | 137 |
Descriptive statistics
Standard deviation | 80.56999195 |
---|---|
Coefficient of variation (CV) | 0.8774138807 |
Kurtosis | -0.6468130949 |
Mean | 91.82666667 |
Median Absolute Deviation (MAD) | 76 |
Skewness | 0.5491431952 |
Sum | 6887 |
Variance | 6491.523604 |
Monotonicity | Not monotonic |
Common values
Value | Count | Frequency (%) |
0 | 7 | 9.3% |
7 | 3 | 4.0% |
3 | 3 | 4.0% |
2 | 3 | 4.0% |
98 | 2 | 2.7% |
64 | 2 | 2.7% |
6 | 2 | 2.7% |
135 | 2 | 2.7% |
165 | 2 | 2.7% |
122 | 2 | 2.7% |
Other values (46) | 47 | 62.7% |
Minimum 10 values
Value | Count | Frequency (%) |
0 | 7 | 9.3% |
1 | 2 | 2.7% |
2 | 3 | 4.0% |
3 | 3 | 4.0% |
4 | 1 | 1.3% |
6 | 2 | 2.7% |
7 | 3 | 4.0% |
10 | 1 | 1.3% |
16 | 1 | 1.3% |
22 | 1 | 1.3% |
Maximum 10 values
Value | Count | Frequency (%) |
272 | 1 | 1.3% |
261 | 1 | 1.3% |
260 | 1 | 1.3% |
258 | 1 | 1.3% |
251 | 1 | 1.3% |
249 | 1 | 1.3% |
232 | 1 | 1.3% |
206 | 1 | 1.3% |
189 | 1 | 1.3% |
175 | 1 | 1.3% |
건수(할증) (count, surcharge)
Real number (ℝ≥0)
Distinct | 52 |
---|---|
Distinct (%) | 69.3% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 78.62666667 |
Minimum | 0 |
Maximum | 295 |
Zeros | 6 |
Zeros (%) | 8.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 803.0 B |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 5.5 |
median | 52 |
Q3 | 128.5 |
95-th percentile | 228.2 |
Maximum | 295 |
Range | 295 |
Interquartile range (IQR) | 123 |
Descriptive statistics
Standard deviation | 78.00706367 |
---|---|
Coefficient of variation (CV) | 0.9921196837 |
Kurtosis | 0.06105737906 |
Mean | 78.62666667 |
Median Absolute Deviation (MAD) | 50 |
Skewness | 0.9603237499 |
Sum | 5897 |
Variance | 6085.101982 |
Monotonicity | Not monotonic |
Common values
Value | Count | Frequency (%) |
2 | 7 | 9.3% |
0 | 6 | 8.0% |
33 | 4 | 5.3% |
3 | 3 | 4.0% |
44 | 2 | 2.7% |
92 | 2 | 2.7% |
203 | 2 | 2.7% |
1 | 2 | 2.7% |
40 | 2 | 2.7% |
31 | 2 | 2.7% |
Other values (42) | 43 | 57.3% |
Minimum 10 values
Value | Count | Frequency (%) |
0 | 6 | 8.0% |
1 | 2 | 2.7% |
2 | 7 | 9.3% |
3 | 3 | 4.0% |
4 | 1 | 1.3% |
7 | 1 | 1.3% |
9 | 1 | 1.3% |
12 | 1 | 1.3% |
27 | 1 | 1.3% |
28 | 1 | 1.3% |
Maximum 10 values
Value | Count | Frequency (%) |
295 | 1 | 1.3% |
271 | 1 | 1.3% |
255 | 1 | 1.3% |
252 | 1 | 1.3% |
218 | 1 | 1.3% |
214 | 1 | 1.3% |
203 | 2 | 2.7% |
202 | 1 | 1.3% |
180 | 1 | 1.3% |
175 | 1 | 1.3% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
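The calculation described above can be sketched in plain Python. The data are the ten 건수(기본) and 건수(할증) values shown in the First rows table, so the result describes only that sample, not the full 75-row dataset; the function name `pearson_r` is illustrative.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (std_x * std_y)

# First ten rows of 건수(기본) and 건수(할증) from the report's sample table.
basic = [189, 136, 163, 2, 7, 40, 90, 272, 64, 146]
surcharge = [125, 51, 218, 1, 0, 33, 92, 214, 53, 90]

r = pearson_r(basic, surcharge)

# Shifting and rescaling one variable leaves r unchanged (location/scale invariance).
r_rescaled = pearson_r([3 * x + 10 for x in basic], surcharge)

print(round(r, 3), round(r_rescaled, 3))
```

The two printed values are equal, which illustrates the invariance under changes in location and scale.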
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
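As a minimal sketch of this definition, ρ can be computed by ranking both variables and then applying Pearson's formula to the ranks. Assigning tied values the average of their positions is an assumption of this sketch (it is the usual convention); the function names are illustrative.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(xs, ys):
    """Pearson's r applied to the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# A monotonic but nonlinear relationship: rho is 1 while Pearson's r is below 1.
xs = [189, 136, 163, 2, 7, 40, 90, 272, 64, 146]
cubes = [x ** 3 for x in xs]
rho = spearman_rho(xs, cubes)
r = pearson_r(xs, cubes)
print(round(rho, 3), round(r, 3))
```

The cubic example shows why ρ captures monotonic relationships that are not linear.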
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
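The pair-counting definition above corresponds to the variant usually called tau-a, sketched below. Statistical libraries such as SciPy default to tau-b, which adjusts the denominator for ties, so results may differ when ties are present. The data are the ten 건수(기본) and 건수(할증) values from the First rows table; the function name is illustrative.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables order the pair the same way
        elif sign < 0:
            discordant += 1   # the variables order the pair in opposite ways
        # Pairs tied in either variable count as neither.
    total_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / total_pairs

# First ten rows of 건수(기본) and 건수(할증) from the report's sample table.
basic = [189, 136, 163, 2, 7, 40, 90, 272, 64, 146]
surcharge = [125, 51, 218, 1, 0, 33, 92, 214, 53, 90]

tau = kendall_tau_a(basic, surcharge)
print(round(tau, 3))
```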
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
| 검사년도 | 선박 종류 | 선박 중량 | 검사수수료(천원) | 건수(기본) | 건수(할증) |
---|---|---|---|---|---|---|
0 | 2017 | 벌크선 | 2.5만톤 미만 | 1125 | 189 | 125 |
1 | 2017 | 차량운반선 | 4만톤 이상 | 1875 | 136 | 51 |
2 | 2017 | 벌크선 | 4만톤 이상 | 1875 | 163 | 218 |
3 | 2017 | 컨테이너 | 2만톤 미만 | 750 | 2 | 1 |
4 | 2017 | 컨테이너 | 2만~3만톤 미만 | 1125 | 7 | 0 |
5 | 2017 | 컨테이너 | 3만~5만톤 미만 | 1500 | 40 | 33 |
6 | 2017 | 컨테이너 | 5만~7만톤 미만 | 1875 | 90 | 92 |
7 | 2017 | 컨테이너 | 7만톤 이상 | 2250 | 272 | 214 |
8 | 2017 | 기타선박류 | 2.5만톤 미만 | 1125 | 64 | 53 |
9 | 2017 | 기타선박류 | 2.5만~4만톤 미만 | 1500 | 146 | 90 |
Last rows
| 검사년도 | 선박 종류 | 선박 중량 | 검사수수료(천원) | 건수(기본) | 건수(할증) |
---|---|---|---|---|---|---|
65 | 2013 | 기타선박류 | 2.5만~4만톤 미만 | 160 | 55 | 33 |
66 | 2013 | 기타선박류 | 2.5만톤 미만 | 120 | 46 | 57 |
67 | 2013 | 컨테이너 | 7만톤 이상 | 240 | 135 | 99 |
68 | 2013 | 컨테이너 | 5만~7만톤 미만 | 200 | 134 | 85 |
69 | 2013 | 컨테이너 | 3만~5만톤 미만 | 160 | 80 | 60 |
70 | 2013 | 컨테이너 | 2만~3만톤 미만 | 120 | 7 | 2 |
71 | 2013 | 컨테이너 | 2만톤 미만 | 80 | 0 | 3 |
72 | 2013 | 벌크선 | 4만톤 이상 | 200 | 122 | 150 |
73 | 2013 | 벌크선 | 2.5만톤 미만 | 120 | 174 | 160 |
74 | 2013 | 벌크선 | 2.5만~4만톤 미만 | 160 | 261 | 203 |
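The bias-corrected Cramér's V mentioned in the correlation notes can be illustrated on the two categorical columns of the last rows shown above. This is a sketch of the commonly used form of Bergsma's correction, not the report's own code, and the function name is hypothetical. With only ten rows, the result says little about the full dataset.

```python
import math
from collections import Counter

def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equally long lists of category labels."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    joint = Counter(zip(xs, ys))

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    r, k = len(row_totals), len(col_totals)

    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0  # degenerate table, for example a single category
    return math.sqrt(phi2_corr / denom)

# 선박 종류 and 선박 중량 for rows 65 to 74 of the table above.
vessel_type = ["기타선박류", "기타선박류", "컨테이너", "컨테이너", "컨테이너",
               "컨테이너", "컨테이너", "벌크선", "벌크선", "벌크선"]
weight_class = ["2.5만~4만톤 미만", "2.5만톤 미만", "7만톤 이상", "5만~7만톤 미만",
                "3만~5만톤 미만", "2만~3만톤 미만", "2만톤 미만", "4만톤 이상",
                "2.5만톤 미만", "2.5만~4만톤 미만"]

v = cramers_v_corrected(vessel_type, weight_class)
print(round(v, 3))
```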