Dataset statistics
Number of variables | 4 |
---|---|
Number of observations | 1424 |
Missing cells | 1127 |
Missing cells (%) | 19.8% |
Duplicate rows | 249 |
Duplicate rows (%) | 17.5% |
Total size in memory | 47.4 KiB |
Average record size in memory | 34.1 B |
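The percentage and per-record figures in this table follow from the raw counts. A minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (observations × variables) and that 1 KiB is 1024 bytes; the variable names are illustrative only and are not part of the report:

```python
# Counts copied from the "Dataset statistics" table.
n_variables = 4
n_observations = 1424
missing_cells = 1127
duplicate_rows = 249
total_size_kib = 47.4

# Assumption: missing cells are measured against every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Duplicate rows are measured against the number of observations.
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the three printed values agree with the 19.8%, 17.5% and 34.1 B reported above.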
Variable types
Numeric | 2 |
---|---|
Text | 1 |
Boolean | 1 |
Dataset
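Description | Equipment installation status records (company number, model name, in-use flag, etc.) registered in the Livestock Manure Electronic Transfer Management System (가축분뇨 전자인계관리시스템). |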
---|---|
Author | Korea Environment Corporation (한국환경공단) |
URL | https://www.data.go.kr/data/15041948/fileData.do |
Dataset has 249 (17.5%) duplicate rows | Duplicates |
사용여부 is highly imbalanced (99.2%) | Imbalance |
모델 명 has 1127 (79.1%) missing values | Missing |
Reproduction
Analysis started | 2023-12-12 21:08:17.379792 |
---|---|
Analysis finished | 2023-12-12 21:08:18.159929 |
Duration | 0.78 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
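A report of this kind can be regenerated with ydata-profiling. The sketch below is one plausible way to do it, not the configuration actually used here: the helper name `build_report`, the CSV path, the output path, the report title and the file encoding are all assumptions, and the original `config.json` is not reproduced.

```python
def build_report(csv_path, output_path="report.html", encoding="utf-8"):
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so this sketch can be loaded without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the dataset is a CSV download; adjust `encoding` to the file.
    df = pd.read_csv(csv_path, encoding=encoding)

    # Default settings; the original run may have used different options.
    report = ProfileReport(df, title="Dataset profile")
    report.to_file(output_path)
    return output_path
```

Calling `build_report("equipment.csv")` with a local copy of the file (the file name is a placeholder) would write `report.html` next to the script.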
업체번호
Real number (ℝ)
Distinct | 723 |
---|---|
Distinct (%) | 50.8% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 2.0159192 × 10⁹ |
Minimum | 2.0130002 × 10⁹ |
---|---|
Maximum | 2.0220003 × 10⁹ |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 12.6 KiB |
Quantile statistics
Minimum | 2.0130002 × 10⁹ |
---|---|
5-th percentile | 2.0130004 × 10⁹ |
Q1 | 2.0150003 × 10⁹ |
median | 2.016001 × 10⁹ |
Q3 | 2.0160035 × 10⁹ |
95-th percentile | 2.0200005 × 10⁹ |
Maximum | 2.0220003 × 10⁹ |
Range | 9000113 |
Interquartile range (IQR) | 1003154.8 |
Descriptive statistics
Standard deviation | 1734943.3 |
---|---|
Coefficient of variation (CV) | 0.00086062146 |
Kurtosis | 2.6195974 |
Mean | 2.0159192 × 10⁹ |
Median Absolute Deviation (MAD) | 1000418 |
Skewness | 1.2007762 |
Sum | 2.8706689 × 10¹² |
Variance | 3.0100283 × 10¹² |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
2016000829 | 13 | 0.9% |
2013000410 | 13 | 0.9% |
2017000168 | 10 | 0.7% |
2022000001 | 9 | 0.6% |
2013000588 | 9 | 0.6% |
2015000203 | 8 | 0.6% |
2013000406 | 8 | 0.6% |
2013000572 | 7 | 0.5% |
2015000394 | 7 | 0.5% |
2015000376 | 7 | 0.5% |
Other values (713) | 1333 | 93.6% |
Value | Count | Frequency (%) |
2013000160 | 4 | 0.3% |
2013000196 | 5 | 0.4% |
2013000217 | 1 | 0.1% |
2013000240 | 1 | 0.1% |
2013000259 | 1 | 0.1% |
2013000277 | 2 | 0.1% |
2013000315 | 1 | 0.1% |
2013000319 | 1 | 0.1% |
2013000350 | 1 | 0.1% |
2013000355 | 1 | 0.1% |
Value | Count | Frequency (%) |
2022000273 | 2 | 0.1% |
2022000237 | 2 | 0.1% |
2022000190 | 1 | 0.1% |
2022000173 | 1 | 0.1% |
2022000156 | 1 | 0.1% |
2022000142 | 1 | 0.1% |
2022000112 | 1 | 0.1% |
2022000074 | 1 | 0.1% |
2022000054 | 1 | 0.1% |
2022000042 | 1 | 0.1% |
모델 명
Text
MISSING
Distinct | 241 |
---|---|
Distinct (%) | 81.1% |
Missing | 1127 |
Missing (%) | 79.1% |
Memory size | 11.3 KiB |
Value | Count | Frequency (%) |
xv-ca100 | 27 | 9.1% |
혁신제품 | 18 | 6.1% |
혁신장비 | 5 | 1.7% |
vpn장비 | 4 | 1.3% |
ca17-075 | 3 | 1.0% |
ca18-032 | 2 | 0.7% |
ca17-098 | 2 | 0.7% |
ca18-064 | 2 | 0.7% |
ca17-097 | 2 | 0.7% |
ca18-050 | 1 | 0.3% |
Other values (231) | 231 | 77.8% |
Most occurring characters
Value | Count | Frequency (%) |
1 | 337 | 14.7% |
0 | 334 | 14.6% |
- | 264 | 11.5% |
C | 257 | 11.2% |
A | 235 | 10.2% |
8 | 162 | 7.1% |
7 | 139 | 6.1% |
2 | 103 | 4.5% |
9 | 50 | 2.2% |
3 | 47 | 2.0% |
Other values (24) | 366 | 16.0% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 1299 | 56.6% |
Uppercase Letter | 621 | 27.1% |
Dash Punctuation | 264 | 11.5% |
Other Letter | 106 | 4.6% |
Lowercase Letter | 4 | 0.2% |
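The category names in this table correspond to Unicode general categories (Decimal Number = `Nd`, Uppercase Letter = `Lu`, Lowercase Letter = `Ll`, Dash Punctuation = `Pd`, Other Letter = `Lo`). As an illustrative sketch, Python's standard `unicodedata` module can classify characters in the same way; the helper name `category_counts` is hypothetical, and the two strings are model names that appear in this report:

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count Unicode general categories (e.g. 'Lu', 'Nd') in a string."""
    return Counter(unicodedata.category(ch) for ch in text)


# Two model names that appear in this report.
counts = category_counts("CA17-075") + category_counts("혁신제품")
print(dict(counts))
```

Hangul syllables such as 혁 fall under `Lo`, which is why Korean model names are counted as "Other Letter" rather than as upper- or lowercase letters.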
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
1 | 337 | 25.9% |
0 | 334 | 25.7% |
8 | 162 | 12.5% |
7 | 139 | 10.7% |
2 | 103 | 7.9% |
9 | 50 | 3.8% |
3 | 47 | 3.6% |
5 | 44 | 3.4% |
4 | 43 | 3.3% |
6 | 40 | 3.1% |
Uppercase Letter
Value | Count | Frequency (%) |
C | 257 | 41.4% |
A | 235 | 37.8% |
V | 31 | 5.0% |
X | 27 | 4.3% |
S | 23 | 3.7% |
T | 20 | 3.2% |
R | 17 | 2.7% |
N | 4 | 0.6% |
P | 4 | 0.6% |
M | 3 | 0.5% |
Other Letter
Value | Count | Frequency (%) |
신 | 23 | 21.7% |
혁 | 23 | 21.7% |
품 | 18 | 17.0% |
제 | 18 | 17.0% |
비 | 10 | 9.4% |
장 | 10 | 9.4% |
주 | 1 | 0.9% |
너 | 1 | 0.9% |
수 | 1 | 0.9% |
탁 | 1 | 0.9% |
Lowercase Letter
Value | Count | Frequency (%) |
t | 2 | 50.0% |
e | 1 | 25.0% |
s | 1 | 25.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 264 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 1563 | 68.1% |
Latin | 625 | 27.2% |
Hangul | 106 | 4.6% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
C | 257 | 41.1% |
A | 235 | 37.6% |
V | 31 | 5.0% |
X | 27 | 4.3% |
S | 23 | 3.7% |
T | 20 | 3.2% |
R | 17 | 2.7% |
N | 4 | 0.6% |
P | 4 | 0.6% |
M | 3 | 0.5% |
Other values (3) | 4 | 0.6% |
Common
Value | Count | Frequency (%) |
1 | 337 | 21.6% |
0 | 334 | 21.4% |
- | 264 | 16.9% |
8 | 162 | 10.4% |
7 | 139 | 8.9% |
2 | 103 | 6.6% |
9 | 50 | 3.2% |
3 | 47 | 3.0% |
5 | 44 | 2.8% |
4 | 43 | 2.8% |
Hangul
Value | Count | Frequency (%) |
신 | 23 | 21.7% |
혁 | 23 | 21.7% |
품 | 18 | 17.0% |
제 | 18 | 17.0% |
비 | 10 | 9.4% |
장 | 10 | 9.4% |
주 | 1 | 0.9% |
너 | 1 | 0.9% |
수 | 1 | 0.9% |
탁 | 1 | 0.9% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 2188 | 95.4% |
Hangul | 106 | 4.6% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
1 | 337 | 15.4% |
0 | 334 | 15.3% |
- | 264 | 12.1% |
C | 257 | 11.7% |
A | 235 | 10.7% |
8 | 162 | 7.4% |
7 | 139 | 6.4% |
2 | 103 | 4.7% |
9 | 50 | 2.3% |
3 | 47 | 2.1% |
Other values (14) | 260 | 11.9% |
Hangul
Value | Count | Frequency (%) |
신 | 23 | 21.7% |
혁 | 23 | 21.7% |
품 | 18 | 17.0% |
제 | 18 | 17.0% |
비 | 10 | 9.4% |
장 | 10 | 9.4% |
주 | 1 | 0.9% |
너 | 1 | 0.9% |
수 | 1 | 0.9% |
탁 | 1 | 0.9% |
사용여부
Boolean
IMBALANCE
Distinct | 2 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.5 KiB |
True | 1423 |
---|---|
False | 1 |
Value | Count | Frequency (%) |
True | 1423 | 99.9% |
False | 1 | 0.1% |
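The report does not state how the 99.2% imbalance figure is defined. One common measure is one minus the normalized Shannon entropy of the class distribution, and with the counts above it gives about 99.2%. The sketch below assumes that definition; the function name `imbalance_score` and the choice to return 0 for a single class are illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - normalized Shannon entropy of a list of class counts."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # assumption: a single class is reported as "no imbalance"
    # Shannon entropy in bits; empty classes contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    # Dividing by log2(n_classes) scales the entropy to the range [0, 1].
    return 1 - entropy / math.log2(n_classes)


score = imbalance_score([1423, 1])  # True / False counts from the table
print(f"{score:.1%}")
```

A perfectly balanced column, for example `imbalance_score([712, 712])`, scores 0, while a column dominated by one value approaches 1.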
관할지사
Real number (ℝ)
Distinct | 11 |
---|---|
Distinct (%) | 0.8% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 906.11657 |
Minimum | 900 |
---|---|
Maximum | 910 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 12.6 KiB |
Quantile statistics
Minimum | 900 |
---|---|
5-th percentile | 901 |
Q1 | 905 |
median | 906 |
Q3 | 908 |
95-th percentile | 910 |
Maximum | 910 |
Range | 10 |
Interquartile range (IQR) | 3 |
Descriptive statistics
Standard deviation | 2.4653108 |
---|---|
Coefficient of variation (CV) | 0.0027207436 |
Kurtosis | -0.59500847 |
Mean | 906.11657 |
Median Absolute Deviation (MAD) | 2 |
Skewness | -0.32937089 |
Sum | 1290310 |
Variance | 6.0777575 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
906 | 267 | 18.8% |
905 | 231 | 16.2% |
908 | 175 | 12.3% |
907 | 165 | 11.6% |
909 | 157 | 11.0% |
910 | 123 | 8.6% |
902 | 80 | 5.6% |
903 | 80 | 5.6% |
901 | 75 | 5.3% |
904 | 70 | 4.9% |
Value | Count | Frequency (%) |
900 | 1 | 0.1% |
901 | 75 | 5.3% |
902 | 80 | 5.6% |
903 | 80 | 5.6% |
904 | 70 | 4.9% |
905 | 231 | 16.2% |
906 | 267 | 18.8% |
907 | 165 | 11.6% |
908 | 175 | 12.3% |
909 | 157 | 11.0% |
Value | Count | Frequency (%) |
910 | 123 | 8.6% |
909 | 157 | 11.0% |
908 | 175 | 12.3% |
907 | 165 | 11.6% |
906 | 267 | 18.8% |
905 | 231 | 16.2% |
904 | 70 | 4.9% |
903 | 80 | 5.6% |
902 | 80 | 5.6% |
901 | 75 | 5.3% |
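Because 관할지사 takes only 11 distinct values, its summary statistics can be rebuilt from the frequency tables alone. The following is a minimal sketch of that check, assuming the reported variance is the sample variance (denominator n − 1); the variable names are illustrative.

```python
import math

# Value -> count, copied from the 관할지사 frequency tables.
freq = {900: 1, 901: 75, 902: 80, 903: 80, 904: 70, 905: 231,
        906: 267, 907: 165, 908: 175, 909: 157, 910: 123}

n = sum(freq.values())
total = sum(value * count for value, count in freq.items())
mean = total / n

# Assumption: sample variance with n - 1 in the denominator.
variance = sum(count * (value - mean) ** 2 for value, count in freq.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

print(n, total)
print(round(mean, 5), round(variance, 5), round(std, 5), round(cv, 7))
```

With the n − 1 denominator, the results agree with the Sum, Mean, Variance, Standard deviation and CV rows reported for this variable to the displayed precision.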
 | 업체번호 | 사용여부 | 관할지사 |
---|---|---|---|
업체번호 | 1.000 | 0.000 | 0.849 |
사용여부 | 0.000 | 1.000 | 0.097 |
관할지사 | 0.849 | 0.097 | 1.000 |
 | 업체번호 | 관할지사 | 사용여부 |
---|---|---|---|
업체번호 | 1.000 | -0.296 | 0.000 |
관할지사 | -0.296 | 1.000 | 0.074 |
사용여부 | 0.000 | 0.074 | 1.000 |
 | 업체번호 | 모델 명 | 사용여부 | 관할지사 |
---|---|---|---|---|
0 | 2013000315 | CA17-099 | Y | 910 |
1 | 2013000319 | CA17-090 | Y | 910 |
2 | 2013000350 | <NA> | Y | 910 |
3 | 2013000196 | CA17-075 | Y | 910 |
4 | 2013000196 | CA17-093 | Y | 910 |
5 | 2013000196 | CA17-023 | Y | 910 |
6 | 2013000196 | CA17-036 | Y | 910 |
7 | 2013000196 | CA17-094 | Y | 910 |
8 | 2013000217 | CA17-083 | Y | 910 |
9 | 2013000259 | CA17-091 | Y | 910 |
 | 업체번호 | 모델 명 | 사용여부 | 관할지사 |
---|---|---|---|---|
1414 | 2018010669 | <NA> | Y | 909 |
1415 | 2018010669 | <NA> | Y | 909 |
1416 | 2020000750 | <NA> | Y | 902 |
1417 | 2018010423 | <NA> | Y | 902 |
1418 | 2019000257 | <NA> | Y | 906 |
1419 | 2019000257 | <NA> | Y | 906 |
1420 | 2019000281 | <NA> | Y | 905 |
1421 | 2019000206 | <NA> | Y | 908 |
1422 | 2019000206 | <NA> | Y | 908 |
1423 | 2020000183 | <NA> | Y | 908 |
Most frequently occurring
 | 업체번호 | 모델 명 | 사용여부 | 관할지사 | # duplicates |
---|---|---|---|---|---|
95 | 2016000829 | <NA> | Y | 905 | 11 |
205 | 2017000168 | <NA> | Y | 901 | 10 |
245 | 2022000001 | <NA> | Y | 906 | 8 |
8 | 2015000203 | <NA> | Y | 905 | 7 |
45 | 2015000376 | <NA> | Y | 909 | 7 |
47 | 2015000380 | <NA> | Y | 909 | 7 |
53 | 2015000394 | <NA> | Y | 907 | 7 |
116 | 2016000978 | <NA> | Y | 907 | 7 |
123 | 2016001370 | <NA> | Y | 907 | 7 |
4 | 2013000572 | <NA> | Y | 910 | 6 |
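A "# duplicates" column like the one above can be derived by grouping identical rows and counting each group. The sketch below illustrates the idea on the last ten rows listed in this report (index 1414 to 1423), with `None` standing in for `<NA>`; the helper name `duplicate_groups` is hypothetical, and treating two missing model names as equal is an assumption of this example.

```python
from collections import Counter

# Rows 1414-1423 as (업체번호, 모델 명, 사용여부, 관할지사); None stands in for <NA>.
rows = [
    (2018010669, None, "Y", 909),
    (2018010669, None, "Y", 909),
    (2020000750, None, "Y", 902),
    (2018010423, None, "Y", 902),
    (2019000257, None, "Y", 906),
    (2019000257, None, "Y", 906),
    (2019000281, None, "Y", 905),
    (2019000206, None, "Y", 908),
    (2019000206, None, "Y", 908),
    (2020000183, None, "Y", 908),
]


def duplicate_groups(rows):
    """Return {row: count} for rows that occur more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


groups = duplicate_groups(rows)
for row, n in groups.items():
    print(row, n)
```

In this ten-row sample three rows occur twice each. Since most records have no model name, rows that share a company number and branch code easily become exact duplicates.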