Overview

Dataset statistics

Number of variables: 5
Number of observations: 199
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 8.3 KiB
Average record size in memory: 42.7 B
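The average record size can be cross-checked from the total memory and the row count. A minimal sketch, assuming the reported 8.3 KiB is a rounded figure and that 1 KiB = 1024 B (the helper name is illustrative, not part of the report):

```python
# Rough cross-check of "Average record size in memory".
# Assumption: 8.3 KiB is rounded, so the result is only approximate.
TOTAL_KIB = 8.3
N_OBSERVATIONS = 199


def average_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Return the mean number of bytes per row."""
    return total_kib * 1024 / n_rows


avg = average_record_size_bytes(TOTAL_KIB, N_OBSERVATIONS)
print(f"{avg:.1f} B")
```

The result agrees with the reported 42.7 B to within rounding.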

Variable types

Text: 1
Numeric: 1
Categorical: 1
Boolean: 1
DateTime: 1

Dataset

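Description: Survey information on the mixing ratio of waste electrical and electronic equipment within the recycling system for electrical and electronic products and automobiles. Columns include 조사대상업체명 (surveyed company name), 조사회차 (survey round), 의무이행년도 (obligation fulfillment year), 확정여부 (confirmation status), and 조사일자 (survey date), among others.
Author: Ministry of Environment (환경부)
URL: https://www.data.go.kr/data/15092290/fileData.do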

Alerts

의무이행년도 has constant value "2023" (Constant)
확정여부 is highly imbalanced (52.9%) (Imbalance)
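The report does not say how the 52.9% imbalance figure is derived. One reading that reproduces it from the 확정여부 counts given later in this report (179 True, 20 False) is one minus the normalized Shannon entropy. The sketch below works under that assumption, and the function name is illustrative:

```python
import math


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts.

    Assumption: this is one plausible definition of the imbalance score,
    not a formula stated in the report itself.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # a single class has no entropy to normalize
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)


score = imbalance_score([179, 20])  # True / False counts for 확정여부
print(f"{score:.1%}")
```

A perfectly balanced column scores 0 under this definition, and the score approaches 1 as one class dominates.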

Reproduction

Analysis started: 2024-04-06 08:48:43.487585
Analysis finished: 2024-04-06 08:48:45.495454
Duration: 2.01 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
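The reported duration is the difference between the two timestamps. A short check using only the standard library (variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2024-04-06 08:48:43.487585", FMT)
finished = datetime.strptime("2024-04-06 08:48:45.495454", FMT)

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```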

Variables

조사대상업체명
Text

Distinct: 92
Distinct (%): 46.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Length

Max length: 19
Median length: 14
Mean length: 8.758794
Min length: 2

Characters and Unicode

Total characters: 1743
Distinct characters: 169
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
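As an illustration of these character properties, the sketch below counts Unicode general categories for the first sample value of this variable using Python's `unicodedata` module. The report's labels correspond to the two-letter codes `Lo` (Other Letter), `Lu` (Uppercase Letter), `Ps` (Open Punctuation), `Pe` (Close Punctuation), `Zs` (Space Separator), and `Nd` (Decimal Number). The Hangul check by character name is a rough stand-in for a script lookup, which `unicodedata` does not provide; it is an assumption of this sketch, not the report's method.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters by Unicode general category code (e.g. 'Lo', 'Lu')."""
    return Counter(unicodedata.category(ch) for ch in text)


def hangul_count(text):
    """Rough script check: count characters whose Unicode name starts with HANGUL."""
    return sum(1 for ch in text if unicodedata.name(ch, "").startswith("HANGUL"))


sample = "(사)한국IT복지진흥원"  # first sample row of 조사대상업체명
counts = category_counts(sample)
n_hangul = hangul_count(sample)
print(dict(counts), n_hangul)
```

Summing such counts over all 199 values would give per-category totals of the kind tabulated in this section.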

Unique

Unique: 22
Unique (%): 11.1%

Sample

1st row: (사)한국IT복지진흥원
2nd row: (사)한국아이티복지진흥원자원순환센터
3rd row: (사)한국아이티복지진흥원자원순환센터
4th row: (사)한국아이티복지진흥원자원순환센터
5th row: (사)한국아이티복지진흥원자원순환센터

Value | Count | Frequency (%)
부산광역시 | 32 | 11.5%
주식회사 | 14 | 5.0%
울산광역시 | 10 | 3.6%
주)미래리싸이클링 | 10 | 3.6%
주)창우rs | 9 | 3.2%
충청남도 | 8 | 2.9%
주)모던이앤알 | 8 | 2.9%
동구 | 6 | 2.2%
주)에스쓰리알 | 5 | 1.8%
주)수도권서부자원순환센터 | 5 | 1.8%
Other values (87) | 172 | 61.6%

Most occurring characters

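Value | Count | Frequency (%)
(glyph not preserved) | 115 | 6.6%
( | 100 | 5.7%
) | 100 | 5.7%
(space) | 80 | 4.6%
(glyph not preserved) | 59 | 3.4%
(glyph not preserved) | 57 | 3.3%
(glyph not preserved) | 46 | 2.6%
(glyph not preserved) | 45 | 2.6%
(glyph not preserved) | 43 | 2.5%
(glyph not preserved) | 40 | 2.3%
Other values (159) | 1058 | 60.7%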

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1436 | 82.4%
Open Punctuation | 100 | 5.7%
Close Punctuation | 100 | 5.7%
Space Separator | 80 | 4.6%
Uppercase Letter | 22 | 1.3%
Decimal Number | 5 | 0.3%

Most frequent character per category

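Other Letter
Value | Count | Frequency (%)
(glyph not preserved) | 115 | 8.0%
(glyph not preserved) | 59 | 4.1%
(glyph not preserved) | 57 | 4.0%
(glyph not preserved) | 46 | 3.2%
(glyph not preserved) | 45 | 3.1%
(glyph not preserved) | 43 | 3.0%
(glyph not preserved) | 40 | 2.8%
(glyph not preserved) | 40 | 2.8%
(glyph not preserved) | 39 | 2.7%
(glyph not preserved) | 37 | 2.6%
Other values (149) | 915 | 63.7%

Uppercase Letter
Value | Count | Frequency (%)
S | 9 | 40.9%
R | 9 | 40.9%
I | 1 | 4.5%
E | 1 | 4.5%
P | 1 | 4.5%
T | 1 | 4.5%

Open Punctuation
Value | Count | Frequency (%)
( | 100 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 100 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 80 | 100.0%

Decimal Number
Value | Count | Frequency (%)
2 | 5 | 100.0%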

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1436 | 82.4%
Common | 285 | 16.4%
Latin | 22 | 1.3%

Most frequent character per script

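Hangul
Value | Count | Frequency (%)
(glyph not preserved) | 115 | 8.0%
(glyph not preserved) | 59 | 4.1%
(glyph not preserved) | 57 | 4.0%
(glyph not preserved) | 46 | 3.2%
(glyph not preserved) | 45 | 3.1%
(glyph not preserved) | 43 | 3.0%
(glyph not preserved) | 40 | 2.8%
(glyph not preserved) | 40 | 2.8%
(glyph not preserved) | 39 | 2.7%
(glyph not preserved) | 37 | 2.6%
Other values (149) | 915 | 63.7%

Latin
Value | Count | Frequency (%)
S | 9 | 40.9%
R | 9 | 40.9%
I | 1 | 4.5%
E | 1 | 4.5%
P | 1 | 4.5%
T | 1 | 4.5%

Common
Value | Count | Frequency (%)
( | 100 | 35.1%
) | 100 | 35.1%
(space) | 80 | 28.1%
2 | 5 | 1.8%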

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1436 | 82.4%
ASCII | 307 | 17.6%

Most frequent character per block

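Hangul
Value | Count | Frequency (%)
(glyph not preserved) | 115 | 8.0%
(glyph not preserved) | 59 | 4.1%
(glyph not preserved) | 57 | 4.0%
(glyph not preserved) | 46 | 3.2%
(glyph not preserved) | 45 | 3.1%
(glyph not preserved) | 43 | 3.0%
(glyph not preserved) | 40 | 2.8%
(glyph not preserved) | 40 | 2.8%
(glyph not preserved) | 39 | 2.7%
(glyph not preserved) | 37 | 2.6%
Other values (149) | 915 | 63.7%

ASCII
Value | Count | Frequency (%)
( | 100 | 32.6%
) | 100 | 32.6%
(space) | 80 | 26.1%
S | 9 | 2.9%
R | 9 | 2.9%
2 | 5 | 1.6%
I | 1 | 0.3%
E | 1 | 0.3%
P | 1 | 0.3%
T | 1 | 0.3%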

조사회차
Real number (ℝ)

Distinct: 33
Distinct (%): 16.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16931.653
Minimum: 10946
Maximum: 20400
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 10946
5-th percentile: 11030
Q1: 11100
Median: 20110
Q3: 20200
95-th percentile: 20300
Maximum: 20400
Range: 9454
Interquartile range (IQR): 9100

Descriptive statistics

Standard deviation: 4373.6763
Coefficient of variation (CV): 0.2583136
Kurtosis: -1.6533416
Mean: 16931.653
Median Absolute Deviation (MAD): 120
Skewness: -0.60181018
Sum: 3369399
Variance: 19129044
Monotonicity: Not monotonic
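Several of the 조사회차 figures are derived from others in the same summary, so they can be cross-checked with simple arithmetic. The sketch below recomputes the range, IQR, mean, coefficient of variation, and variance from the reported values. The inputs are rounded as printed, so small discrepancies are expected, and the variable names are illustrative.

```python
# Figures copied from the 조사회차 summary.
minimum, maximum = 10946, 20400
q1, q3 = 11100, 20200
total, n = 3369399, 199
std = 4373.6763  # rounded as printed in the report

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
mean = total / n                 # Mean = Sum / Number of observations
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance = standard deviation squared

print(value_range, iqr, round(mean, 3), round(cv, 7), round(variance))
```

The recomputed values agree with the reported Range (9454), IQR (9100), Mean (16931.653), CV (0.2583136), and Variance (19129044) to within rounding.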
Histogram with fixed size bins (bins=33)

Common values

Value | Count | Frequency (%)
11100 | 35 | 17.6%
20200 | 34 | 17.1%
20100 | 18 | 9.0%
20300 | 15 | 7.5%
11030 | 14 | 7.0%
20230 | 12 | 6.0%
20000 | 10 | 5.0%
20130 | 10 | 5.0%
11130 | 6 | 3.0%
20210 | 4 | 2.0%
Other values (23) | 41 | 20.6%

Minimum 10 values

Value | Count | Frequency (%)
10946 | 1 | 0.5%
11000 | 4 | 2.0%
11020 | 1 | 0.5%
11027 | 1 | 0.5%
11030 | 14 | 7.0%
11036 | 1 | 0.5%
11050 | 4 | 2.0%
11054 | 1 | 0.5%
11100 | 35 | 17.6%
11103 | 1 | 0.5%

Maximum 10 values

Value | Count | Frequency (%)
20400 | 2 | 1.0%
20330 | 2 | 1.0%
20300 | 15 | 7.5%
20240 | 4 | 2.0%
20230 | 12 | 6.0%
20220 | 1 | 0.5%
20215 | 1 | 0.5%
20210 | 4 | 2.0%
20205 | 2 | 1.0%
20200 | 34 | 17.1%

의무이행년도
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB
Value counts: 2023 (199)

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023
2nd row: 2023
3rd row: 2023
4th row: 2023
5th row: 2023

Common Values

Value | Count | Frequency (%)
2023 | 199 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2023 | 199 | 100.0%

확정여부
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 331.0 B

Value | Count | Frequency (%)
True | 179 | 89.9%
False | 20 | 10.1%

조사일자
Date

Distinct: 73
Distinct (%): 36.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB
Minimum: 2023-04-19 00:00:00
Maximum: 2023-12-26 00:00:00
Histogram with fixed size bins (bins=50)

Interactions


Correlations

Variable | 조사대상업체명 | 조사회차 | 확정여부 | 조사일자
조사대상업체명 | 1.000 | 0.446 | 0.639 | 0.805
조사회차 | 0.446 | 1.000 | 0.152 | 0.481
확정여부 | 0.639 | 0.152 | 1.000 | 0.672
조사일자 | 0.805 | 0.481 | 0.672 | 1.000

Variable | 조사회차 | 확정여부
조사회차 | 1.000 | 0.094
확정여부 | 0.094 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

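First rows

Index | 조사대상업체명 | 조사회차 | 의무이행년도 | 확정여부 | 조사일자
0 | (사)한국IT복지진흥원 | 20215 | 2023 | Y | 2023-05-23
1 | (사)한국아이티복지진흥원자원순환센터 | 11130 | 2023 | N | 2023-04-19
2 | (사)한국아이티복지진흥원자원순환센터 | 20000 | 2023 | Y | 2023-10-30
3 | (사)한국아이티복지진흥원자원순환센터 | 20100 | 2023 | N | 2023-04-19
4 | (사)한국아이티복지진흥원자원순환센터 | 20300 | 2023 | N | 2023-10-24
5 | (사)한국아이티복지진흥원자원순환센터 | 20400 | 2023 | N | 2023-10-24
6 | (유)행복나누미 | 11000 | 2023 | Y | 2023-06-28
7 | (유)행복나누미 | 20200 | 2023 | Y | 2023-12-05
8 | (주)경남알씨 | 20200 | 2023 | Y | 2023-05-03
9 | (주)경남알씨 | 20200 | 2023 | Y | 2023-09-20

Last rows

Index | 조사대상업체명 | 조사회차 | 의무이행년도 | 확정여부 | 조사일자
189 | 충청남도 서산시 | 20145 | 2023 | Y | 2023-11-02
190 | 충청남도 홍성군 | 20145 | 2023 | Y | 2023-05-10
191 | 충청남도 홍성군 | 20150 | 2023 | Y | 2023-10-20
192 | 콘에어코리아트레이딩 | 20230 | 2023 | Y | 2023-06-05
193 | 태양환경자원 | 20130 | 2023 | Y | 2023-10-27
194 | 태양환경자원 | 20140 | 2023 | Y | 2023-11-08
195 | 태양환경자원 | 20200 | 2023 | Y | 2023-10-27
196 | 태양환경자원 | 20230 | 2023 | Y | 2023-05-09
197 | 현대실업 | 11100 | 2023 | Y | 2023-05-03
198 | 현대실업 | 11100 | 2023 | Y | 2023-09-20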
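The sample shows 확정여부 as Y/N, while the variable summary reports it as a Boolean with True/False counts. A minimal sketch of how such rows might be parsed, assuming the source file is comma-separated with these column headers and that Y maps to True (both are assumptions, and `parse_rows` is a hypothetical helper). Only four rows from the sample are transcribed here.

```python
import csv
import io
from datetime import date

# Four rows transcribed from the sample; the CSV layout itself is assumed.
SAMPLE_CSV = """조사대상업체명,조사회차,의무이행년도,확정여부,조사일자
(사)한국IT복지진흥원,20215,2023,Y,2023-05-23
(사)한국아이티복지진흥원자원순환센터,11130,2023,N,2023-04-19
(유)행복나누미,11000,2023,Y,2023-06-28
현대실업,11100,2023,Y,2023-09-20
"""


def parse_rows(text):
    """Parse CSV text into typed records (hypothetical helper)."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        rows.append({
            "조사대상업체명": raw["조사대상업체명"],
            "조사회차": int(raw["조사회차"]),
            "의무이행년도": raw["의무이행년도"],          # kept as text (Categorical)
            "확정여부": raw["확정여부"] == "Y",            # assumed Y -> True
            "조사일자": date.fromisoformat(raw["조사일자"]),
        })
    return rows


rows = parse_rows(SAMPLE_CSV)
years = {row["의무이행년도"] for row in rows}
confirmed = sum(row["확정여부"] for row in rows)
print(years, confirmed, len(rows))
```

A single distinct value in `years` corresponds to the Constant alert on 의무이행년도.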