Overview

Dataset statistics

Number of variables: 7
Number of observations: 3870
Missing cells: 1248
Missing cells (%): 4.6%
Duplicate rows: 516
Duplicate rows (%): 13.3%
Total size in memory: 223.1 KiB
Average record size in memory: 59.0 B
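
The percentage and per-record figures in this listing follow from the raw counts. A minimal sketch of that arithmetic, assuming "missing cells (%)" is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes; the variable names are illustrative only:

```python
# Figures copied from the dataset statistics listing.
n_variables = 7
n_observations = 3870
missing_cells = 1248
duplicate_rows = 516
total_memory_kib = 223.1

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the three results agree with the reported 4.6%, 13.3%, and 59.0 B.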

Variable types

Numeric: 3
Boolean: 1
DateTime: 1
Categorical: 1
Text: 1

Dataset

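Description: Information managed in the livestock manure electronic handover management system (가축분뇨 전자인계관리시스템): the member records (user type and so on) registered for the discharge, transport, and treatment of livestock manure and liquid fertilizer.
Author: 한국환경공단 (Korea Environment Corporation)
URL: https://www.data.go.kr/data/15041844/fileData.do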

Alerts

사용여부 has constant value "" [Constant]
Dataset has 516 (13.3%) duplicate rows [Duplicates]
사용자유형 is highly overall correlated with 사용자구분 [High correlation]
사용자구분 is highly overall correlated with 사용자유형 [High correlation]
업체사용구분 has 416 (10.7%) missing values [Missing]
관할지사 has 416 (10.7%) missing values [Missing]
관할관청 has 416 (10.7%) missing values [Missing]
사용자유형 is highly skewed (γ1 = 31.30024297) [Skewed]

Reproduction

Analysis started: 2023-12-12 01:31:47.901427
Analysis finished: 2023-12-12 01:31:50.173356
Duration: 2.27 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
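
A report of this kind can be regenerated with the ydata-profiling package named above. The following is only a sketch: the function name `build_profile`, the file names in the usage comment, and the default encoding are assumptions, since the report does not state how the source CSV was loaded or which configuration options were set.

```python
def build_profile(csv_path: str, html_path: str, encoding: str = "utf-8") -> None:
    """Profile a CSV file and write the result as an HTML report."""
    # Imported here so the sketch can be loaded even when the third-party
    # packages (pandas, ydata-profiling) are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding of the published file is an assumption; adjust as needed.
    df = pd.read_csv(csv_path, encoding=encoding)

    profile = ProfileReport(df, title="Profiling report")
    profile.to_file(html_path)


# Hypothetical usage (file names are placeholders):
# build_profile("members.csv", "report.html")
```

Timings and memory figures will differ between runs and machines, so only the statistics themselves should be expected to match.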

Variables

사용자유형
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 9
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.3268734
Minimum: 1
Maximum: 99
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 34.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 1
95-th percentile: 3
Maximum: 99
Range: 98
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 2.8514304
Coefficient of variation (CV): 2.1489845
Kurtosis: 1066.4046
Mean: 1.3268734
Median Absolute Deviation (MAD): 0
Skewness: 31.300243
Sum: 5135
Variance: 8.1306553
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)

Common values (by frequency)
Value  Count  Frequency (%)
1      3453   89.2%
3      318    8.2%
2      46     1.2%
6      30     0.8%
9      9      0.2%
7      7      0.2%
8      3      0.1%
99     3      0.1%
5      1      < 0.1%

Smallest values
Value  Count  Frequency (%)
1      3453   89.2%
2      46     1.2%
3      318    8.2%
5      1      < 0.1%
6      30     0.8%
7      7      0.2%
8      3      0.1%
9      9      0.2%
99     3      0.1%

Largest values
Value  Count  Frequency (%)
99     3      0.1%
9      9      0.2%
8      3      0.1%
7      7      0.2%
6      30     0.8%
5      1      < 0.1%
3      318    8.2%
2      46     1.2%
1      3453   89.2%
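
The sum, mean, variance, standard deviation, and coefficient of variation reported for 사용자유형 can be recomputed from its frequency table alone. A minimal sketch, assuming the sample variance (division by n - 1), which is the convention that reproduces the reported 8.1306553:

```python
import math

# Value -> count pairs taken from the frequency table for 사용자유형.
value_counts = {1: 3453, 2: 46, 3: 318, 5: 1, 6: 30, 7: 7, 8: 3, 9: 9, 99: 3}

n = sum(value_counts.values())                        # number of observations
total = sum(v * c for v, c in value_counts.items())   # "Sum" in the report
mean = total / n

# Sample variance: squared deviations weighted by count, divided by n - 1.
variance = sum(c * (v - mean) ** 2 for v, c in value_counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

print(n, total)
print(f"mean={mean:.7f} variance={variance:.7f} std={std:.7f} cv={cv:.7f}")
```

The three rows with value 99 account for most of the variance, which is consistent with the large skewness and kurtosis flagged in the alerts.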

사용여부
Boolean

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 KiB

Value  Count  Frequency (%)
True   3870   100.0%

가입일자
DateTime

Distinct: 104
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Memory size: 30.4 KiB
Minimum: 2013-11-01 00:00:00
Maximum: 2022-12-01 00:00:00
Histogram with fixed size bins (bins=50)

사용자구분
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 30.4 KiB

Length

Max length: 4
Median length: 2
Mean length: 2.2149871
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: A2
2nd row: A2
3rd row: A2
4th row: A2
5th row: A2

Common Values

Value  Count  Frequency (%)
A2     2646   68.4%
A1     808    20.9%
<NA>   416    10.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
a2     2646   68.4%
a1     808    20.9%
na     416    10.7%

업체사용구분
Text

MISSING 

Distinct: 68
Distinct (%): 2.0%
Missing: 416
Missing (%): 10.7%
Memory size: 30.4 KiB

Length

Max length: 11
Median length: 2
Mean length: 2.4820498
Min length: 2

Characters and Unicode

Total characters: 8573
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
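
As an illustration of those character properties, the general category of each character can be looked up with Python's standard `unicodedata` module. The sketch below uses only the five sample values listed for 업체사용구분, not the full column, so its counts are much smaller than the totals in this report. `Nd` is the Unicode code for "Decimal Number" and `Po` for "Other Punctuation".

```python
import unicodedata
from collections import Counter

# The five sample values shown for 업체사용구분.
samples = ["03", "03", "02,03,05,06", "01,04", "05,04,06"]

# Count the Unicode general category of every character in every value.
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)

for category, count in category_counts.most_common():
    print(category, count)
```

Only digits and commas occur in these values, so exactly two categories appear, in line with the "Distinct categories" figure for the whole column.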

Unique

Unique: 29
Unique (%): 0.8%

Sample

1st row: 03
2nd row: 03
3rd row: 02,03,05,06
4th row: 01,04
5th row: 05,04,06

Value              Count  Frequency (%)
01                 2011   58.2%
02                 415    12.0%
01,04              258    7.5%
03                 222    6.4%
04                 187    5.4%
05                 133    3.9%
02,05              23     0.7%
06                 22     0.6%
02,03              18     0.5%
01,02              10     0.3%
Other values (58)  155    4.5%

Most occurring characters

Value  Count  Frequency (%)
0      3999   46.6%
1      2363   27.6%
,      555    6.5%
2      533    6.2%
4      500    5.8%
3      303    3.5%
5      231    2.7%
6      76     0.9%
7      7      0.1%
9      5      0.1%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     8018   93.5%
Other Punctuation  555    6.5%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      3999   49.9%
1      2363   29.5%
2      533    6.6%
4      500    6.2%
3      303    3.8%
5      231    2.9%
6      76     0.9%
7      7      0.1%
9      5      0.1%
8      1      < 0.1%

Other Punctuation
Value  Count  Frequency (%)
,      555    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  8573   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      3999   46.6%
1      2363   27.6%
,      555    6.5%
2      533    6.2%
4      500    5.8%
3      303    3.5%
5      231    2.7%
6      76     0.9%
7      7      0.1%
9      5      0.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  8573   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      3999   46.6%
1      2363   27.6%
,      555    6.5%
2      533    6.2%
4      500    5.8%
3      303    3.5%
5      231    2.7%
6      76     0.9%
7      7      0.1%
9      5      0.1%

관할지사
Real number (ℝ)

MISSING 

Distinct: 10
Distinct (%): 0.3%
Missing: 416
Missing (%): 10.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 906.0388
Minimum: 901
Maximum: 910
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 34.1 KiB

Quantile statistics

Minimum: 901
5-th percentile: 901
Q1: 904
Median: 906
Q3: 908
95-th percentile: 909
Maximum: 910
Range: 9
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.5106089
Coefficient of variation (CV): 0.0027709729
Kurtosis: -0.81491397
Mean: 906.0388
Median Absolute Deviation (MAD): 2
Skewness: -0.40221258
Sum: 3129458
Variance: 6.3031571
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)

Common values (by frequency)
Value      Count  Frequency (%)
906        581    15.0%
908        525    13.6%
909        512    13.2%
907        406    10.5%
904        346    8.9%
905        303    7.8%
902        237    6.1%
901        198    5.1%
903        187    4.8%
910        159    4.1%
(Missing)  416    10.7%

Smallest values
Value  Count  Frequency (%)
901    198    5.1%
902    237    6.1%
903    187    4.8%
904    346    8.9%
905    303    7.8%
906    581    15.0%
907    406    10.5%
908    525    13.6%
909    512    13.2%
910    159    4.1%

Largest values
Value  Count  Frequency (%)
910    159    4.1%
909    512    13.2%
908    525    13.6%
907    406    10.5%
906    581    15.0%
905    303    7.8%
904    346    8.9%
903    187    4.8%
902    237    6.1%
901    198    5.1%

관할관청
Real number (ℝ)

MISSING 

Distinct: 149
Distinct (%): 4.3%
Missing: 416
Missing (%): 10.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 1167.5052
Minimum: 123
Maximum: 1701
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 34.1 KiB

Quantile statistics

Minimum: 123
5-th percentile: 821
Q1: 1009
Median: 1203
Q3: 1320
95-th percentile: 1515
Maximum: 1701
Range: 1578
Interquartile range (IQR): 311

Descriptive statistics

Standard deviation: 240.99416
Coefficient of variation (CV): 0.20641806
Kurtosis: 0.20048352
Mean: 1167.5052
Median Absolute Deviation (MAD): 193
Skewness: -0.28720983
Sum: 4032563
Variance: 58078.187
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values (by frequency)
Value               Count  Frequency (%)
829                 125    3.2%
1213                118    3.0%
1602                110    2.8%
1203                102    2.6%
1104                91     2.4%
1015                82     2.1%
912                 76     2.0%
1401                67     1.7%
1409                65     1.7%
1209                63     1.6%
Other values (139)  2555   66.0%
(Missing)           416    10.7%

Smallest values
Value  Count  Frequency (%)
123    1      < 0.1%
203    3      0.1%
209    1      < 0.1%
210    1      < 0.1%
313    1      < 0.1%
411    31     0.8%
413    1      < 0.1%
417    4      0.1%
511    4      0.1%
603    1      < 0.1%

Largest values
Value  Count  Frequency (%)
1701   5      0.1%
1602   110    2.8%
1601   49     1.3%
1515   56     1.4%
1514   4      0.1%
1513   12     0.3%
1512   56     1.4%
1511   48     1.2%
1510   21     0.5%
1509   5      0.1%

Interactions

[Nine interaction plots; images not reproduced.]

Correlations

              사용자유형  사용자구분  업체사용구분  관할지사  관할관청
사용자유형    1.000       NaN         NaN           NaN       NaN
사용자구분    NaN         1.000       0.263         0.217     0.173
업체사용구분  NaN         0.263       1.000         0.463     0.360
관할지사      NaN         0.217       0.463         1.000     0.932
관할관청      NaN         0.173       0.360         0.932     1.000

            사용자유형  관할지사  관할관청  사용자구분
사용자유형  1.000       -0.010    0.026     1.000
관할지사    -0.010      1.000     0.162     0.210
관할관청    0.026       0.162     1.000     0.132
사용자구분  1.000       0.210     0.132     1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows

      사용자유형  사용여부  가입일자  사용자구분  업체사용구분  관할지사  관할관청
0     1           Y         2013-12   A2          03            910       1601
1     1           Y         2013-12   A2          03            910       1602
2     1           Y         2013-12   A2          02,03,05,06   910       1601
3     1           Y         2013-12   A2          01,04         910       1602
4     1           Y         2013-12   A2          05,04,06      910       1602
5     1           Y         2013-12   A2          05,06         910       1602
6     1           Y         2013-12   A2          02,03,06      910       1602
7     1           Y         2013-12   A2          05,06         910       1602
8     1           Y         2013-12   A2          06            910       1601
9     1           Y         2013-12   A2          05            910       1602

Last rows

      사용자유형  사용여부  가입일자  사용자구분  업체사용구분  관할지사  관할관청
3860  1           Y         2022-10   A1          03            906       1213
3861  1           Y         2022-08   A2          01            908       1019
3862  1           Y         2022-09   A2          01            908       1011
3863  1           Y         2022-09   A2          01            905       1509
3864  1           Y         2022-10   A2          03            906       1213
3865  3           Y         2022-11   <NA>        <NA>          <NA>      <NA>
3866  1           Y         2022-12   A2          03            907       1321
3867  1           Y         2022-12   A1          01            908       1019
3868  6           Y         2022-10   <NA>        <NA>          <NA>      <NA>
3869  3           Y         2022-11   <NA>        <NA>          <NA>      <NA>

Duplicate rows

Most frequently occurring

     사용자유형  사용여부  가입일자  사용자구분  업체사용구분  관할지사  관할관청  # duplicates
247  1           Y         2017-01   A2          01            909       1104      37
273  1           Y         2017-02   A2          01            904       1401      35
464  3           Y         2016-12   <NA>        <NA>          <NA>      <NA>      33
138  1           Y         2016-12   A2          01            902       829       31
391  1           Y         2018-12   A1          01            906       1209      28
147  1           Y         2016-12   A2          01            905       1512      27
455  3           Y         2015-12   <NA>        <NA>          <NA>      <NA>      27
135  1           Y         2016-12   A2          01            901       829       25
142  1           Y         2016-12   A2          01            904       1409      25
123  1           Y         2016-12   A1          01            908       1015      24
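
Repeated rows can be found by treating each row as a tuple and counting identical tuples. The sketch below is illustrative only: it uses the first ten rows from the Sample section rather than the full dataset, and it is not necessarily how the profiling tool itself counts duplicates.

```python
from collections import Counter

# First ten sample rows as (사용자유형, 사용여부, 가입일자, 사용자구분,
# 업체사용구분, 관할지사, 관할관청) tuples.
rows = [
    (1, "Y", "2013-12", "A2", "03", 910, 1601),
    (1, "Y", "2013-12", "A2", "03", 910, 1602),
    (1, "Y", "2013-12", "A2", "02,03,05,06", 910, 1601),
    (1, "Y", "2013-12", "A2", "01,04", 910, 1602),
    (1, "Y", "2013-12", "A2", "05,04,06", 910, 1602),
    (1, "Y", "2013-12", "A2", "05,06", 910, 1602),
    (1, "Y", "2013-12", "A2", "02,03,06", 910, 1602),
    (1, "Y", "2013-12", "A2", "05,06", 910, 1602),
    (1, "Y", "2013-12", "A2", "06", 910, 1601),
    (1, "Y", "2013-12", "A2", "05", 910, 1602),
]

# Keep only the rows that occur more than once, with their occurrence counts.
duplicates = {row: n for row, n in Counter(rows).items() if n > 1}

for row, n in duplicates.items():
    print(n, row)
```

In this small sample only the rows at index 5 and 7 coincide. With 사용여부 constant and few distinct values in the other columns, repeated rows across the full 3870 observations are unsurprising.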