Overview

Dataset statistics

Number of variables: 4
Number of observations: 1020
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 12
Duplicate rows (%): 1.2%
Total size in memory: 32.0 KiB
Average record size in memory: 32.1 B

Variable types

Text: 3
Boolean: 1

Dataset

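Description: A list of the nationally approved statistics and administrative basic statistics available on the Environmental Statistics Portal as of August 2023 (environmental field name, statistic name, statistical table name, whether the statistic is nationally approved, and so on).
URL: https://www.data.go.kr/data/15105566/fileData.do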

Alerts

Dataset has 12 (1.2%) duplicate rows [Duplicates]
승인통계 여부 is highly imbalanced (58.6%) [Imbalance]
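
The 58.6% imbalance figure can be reproduced from the class counts reported for 승인통계 여부 (935 True, 85 False). The sketch below assumes the score is one minus the Shannon entropy of the class distribution, normalized by the maximum entropy for that number of classes. The report itself does not state the formula, and the function name `imbalance_score` is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/H_max for a list of class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean that
    one class dominates. Assumed formula, not taken from the report.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0  # nothing to compare against
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(k)

# Class counts taken from the Boolean variable in this report.
score = imbalance_score([935, 85])
print(f"{score:.1%}")
```

With the counts above the sketch prints 58.6%, which agrees with the alert.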

Reproduction

Analysis started: 2023-12-12 07:38:09.810078
Analysis finished: 2023-12-12 07:38:10.628776
Duration: 0.82 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

분야명
Text

Distinct: 72
Distinct (%): 7.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.1 KiB

Length

Max length: 30
Median length: 22
Mean length: 8.6745098
Min length: 1

Characters and Unicode

Total characters: 8848
Distinct characters: 162
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
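
As a rough illustration of the character properties mentioned above, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module, using a 통계표명 value taken from the Sample section of this report. The helper name `category_counts` is illustrative, and the two-letter codes it prints (`Lo`, `Zs`, `Nd`, and so on) are the standard abbreviations for the category names used in the report (Other Letter, Space Separator, Decimal Number, and so on).

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count the characters of text by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

# One 통계표명 value from the Sample section of this report.
sample = "환경보호활동별 매출액(조사업체)(2004~2010)"

for code, count in category_counts(sample).most_common():
    print(code, count)
```

Hangul syllables fall under `Lo`, which is why Other Letter dominates every text column in this dataset.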

Unique

Unique: 10
Unique (%): 1.0%

Sample

1st row: 환경산업통계조사
2nd row: 환경산업통계조사
3rd row: 환경산업통계조사
4th row: 환경산업통계조사
5th row: 환경산업통계조사

Value | Count | Frequency (%)
생활폐기물 | 166 | 10.4%
부문 | 83 | 5.2%
국민환경보건 | 74 | 4.6%
기초조사 | 74 | 4.6%
환경기초시설 | 64 | 4.0%
발생현황 | 63 | 4.0%
환경산업통계조사 | 63 | 4.0%
현황 | 53 | 3.3%
생활,사업장,일반,건설 | 51 | 3.2%
사업장폐기물 | 40 | 2.5%
Other values (102) | 863 | 54.1%
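
The value column above appears to hold whitespace-separated words rather than whole cell values (for example, 생활폐기물 and 발생현황 are listed separately). A minimal sketch of that kind of tally follows, assuming simple whitespace tokenization and using a few 분야명 values that appear elsewhere in this report as toy input. `word_frequencies` is an illustrative name, not part of any library.

```python
from collections import Counter

def word_frequencies(values):
    """Split each value on whitespace and count the resulting words."""
    counts = Counter()
    for value in values:
        counts.update(value.split())
    total = sum(counts.values())
    # (word, count, share of all words), most frequent first.
    return [(word, n, n / total) for word, n in counts.most_common()]

# Toy input, not the full column.
toy_values = [
    "생활폐기물 발생현황",
    "생활폐기물 발생현황",
    "생활폐기물",
    "기타폐기물",
]

for word, count, share in word_frequencies(toy_values):
    print(f"{word} | {count} | {share:.1%}")
```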

Most occurring characters

ValueCountFrequency (%)
574
 
6.5%
502
 
5.7%
396
 
4.5%
378
 
4.3%
343
 
3.9%
274
 
3.1%
268
 
3.0%
265
 
3.0%
264
 
3.0%
236
 
2.7%
Other values (152) 5348
60.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 8063 | 91.1%
Space Separator | 574 | 6.5%
Other Punctuation | 166 | 1.9%
Decimal Number | 24 | 0.3%
Uppercase Letter | 9 | 0.1%
Close Punctuation | 6 | 0.1%
Open Punctuation | 6 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
502
 
6.2%
396
 
4.9%
378
 
4.7%
343
 
4.3%
274
 
3.4%
268
 
3.3%
265
 
3.3%
264
 
3.3%
236
 
2.9%
234
 
2.9%
Other values (141) 4903
60.8%
Decimal Number
ValueCountFrequency (%)
0 12
50.0%
8 6
25.0%
2 6
25.0%
Uppercase Letter
ValueCountFrequency (%)
S 3
33.3%
M 3
33.3%
T 3
33.3%
Other Punctuation
ValueCountFrequency (%)
, 160
96.4%
/ 6
 
3.6%
Space Separator
ValueCountFrequency (%)
574
100.0%
Close Punctuation
ValueCountFrequency (%)
) 6
100.0%
Open Punctuation
ValueCountFrequency (%)
( 6
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 8063 | 91.1%
Common | 776 | 8.8%
Latin | 9 | 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
502
 
6.2%
396
 
4.9%
378
 
4.7%
343
 
4.3%
274
 
3.4%
268
 
3.3%
265
 
3.3%
264
 
3.3%
236
 
2.9%
234
 
2.9%
Other values (141) 4903
60.8%
Common
ValueCountFrequency (%)
574
74.0%
, 160
 
20.6%
0 12
 
1.5%
) 6
 
0.8%
8 6
 
0.8%
/ 6
 
0.8%
( 6
 
0.8%
2 6
 
0.8%
Latin
ValueCountFrequency (%)
S 3
33.3%
M 3
33.3%
T 3
33.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 8063 | 91.1%
ASCII | 785 | 8.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
574
73.1%
, 160
 
20.4%
0 12
 
1.5%
) 6
 
0.8%
8 6
 
0.8%
/ 6
 
0.8%
( 6
 
0.8%
2 6
 
0.8%
S 3
 
0.4%
M 3
 
0.4%
Hangul
ValueCountFrequency (%)
502
 
6.2%
396
 
4.9%
378
 
4.7%
343
 
4.3%
274
 
3.4%
268
 
3.3%
265
 
3.3%
264
 
3.3%
236
 
2.9%
234
 
2.9%
Other values (141) 4903
60.8%

통계명
Text

Distinct: 52
Distinct (%): 5.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.1 KiB

Length

Max length: 31
Median length: 23
Mean length: 10.331373
Min length: 5

Characters and Unicode

Total characters: 10538
Distinct characters: 143
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7
Unique (%): 0.7%

Sample

1st row: 환경산업통계조사
2nd row: 환경산업통계조사
3rd row: 환경산업통계조사
4th row: 환경산업통계조사
5th row: 환경산업통계조사

Value | Count | Frequency (%)
전국 | 310 | 13.4%
통계조사 | 310 | 13.4%
폐기물 | 310 | 13.4%
(not recovered) | 94 | 4.0%
발생 | 89 | 3.8%
처리현황 | 81 | 3.5%
국민환경보건 | 80 | 3.4%
기초조사 | 80 | 3.4%
전국폐기물 | 71 | 3.1%
환경산업통계조사 | 63 | 2.7%
Other values (77) | 833 | 35.9%

Most occurring characters

ValueCountFrequency (%)
1301
 
12.3%
608
 
5.8%
596
 
5.7%
593
 
5.6%
551
 
5.2%
546
 
5.2%
540
 
5.1%
530
 
5.0%
414
 
3.9%
408
 
3.9%
Other values (133) 4451
42.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 9214 | 87.4%
Space Separator | 1301 | 12.3%
Other Punctuation | 14 | 0.1%
Uppercase Letter | 9 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
608
 
6.6%
596
 
6.5%
593
 
6.4%
551
 
6.0%
546
 
5.9%
540
 
5.9%
530
 
5.8%
414
 
4.5%
408
 
4.4%
250
 
2.7%
Other values (126) 4178
45.3%
Other Punctuation
ValueCountFrequency (%)
· 7
50.0%
/ 6
42.9%
. 1
 
7.1%
Uppercase Letter
ValueCountFrequency (%)
M 3
33.3%
S 3
33.3%
T 3
33.3%
Space Separator
ValueCountFrequency (%)
1301
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 9214 | 87.4%
Common | 1315 | 12.5%
Latin | 9 | 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
608
 
6.6%
596
 
6.5%
593
 
6.4%
551
 
6.0%
546
 
5.9%
540
 
5.9%
530
 
5.8%
414
 
4.5%
408
 
4.4%
250
 
2.7%
Other values (126) 4178
45.3%
Common
ValueCountFrequency (%)
1301
98.9%
· 7
 
0.5%
/ 6
 
0.5%
. 1
 
0.1%
Latin
ValueCountFrequency (%)
M 3
33.3%
S 3
33.3%
T 3
33.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 9214 | 87.4%
ASCII | 1317 | 12.5%
None | 7 | 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1301
98.8%
/ 6
 
0.5%
M 3
 
0.2%
S 3
 
0.2%
T 3
 
0.2%
. 1
 
0.1%
Hangul
ValueCountFrequency (%)
608
 
6.6%
596
 
6.5%
593
 
6.4%
551
 
6.0%
546
 
5.9%
540
 
5.9%
530
 
5.8%
414
 
4.5%
408
 
4.4%
250
 
2.7%
Other values (126) 4178
45.3%
None
ValueCountFrequency (%)
· 7
100.0%

통계표명
Text

Distinct: 991
Distinct (%): 97.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.1 KiB

Length

Max length: 66
Median length: 42
Mean length: 21.409804
Min length: 2
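
For reference, length statistics of this kind can be computed directly from the string values. The sketch below uses Python's `statistics` module on the five sample 통계표명 values listed in this report, so its numbers describe only those five rows, not the full column. `length_stats` is an illustrative name, and counting length in Unicode code points is an assumption about how the report measures it.

```python
from statistics import mean, median

def length_stats(values):
    """Return min, max, mean and median of the string lengths in values."""
    lengths = [len(v) for v in values]
    return {
        "min": min(lengths),
        "max": max(lengths),
        "mean": mean(lengths),
        "median": median(lengths),
    }

# The five sample rows shown for this variable.
samples = [
    "환경보호활동별 매출액(조사업체)(2004~2010)",
    "환경산업분류별(보호활동) 매출액",
    "환경산업분류별(매체별) 매출액",
    "환경산업분류별(업종별) 매출액",
    "산업분류별/환경산업분류별 환경부문 매출액(서비스)",
]

print(length_stats(samples))
```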

Characters and Unicode

Total characters: 21838
Distinct characters: 410
Distinct categories: 14
Distinct scripts: 3
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 965
Unique (%): 94.6%

Sample

1st row: 환경보호활동별 매출액(조사업체)(2004~2010)
2nd row: 환경산업분류별(보호활동) 매출액
3rd row: 환경산업분류별(매체별) 매출액
4th row: 환경산업분류별(업종별) 매출액
5th row: 산업분류별/환경산업분류별 환경부문 매출액(서비스)

Value | Count | Frequency (%)
(not recovered) | 153 | 3.9%
폐기물 | 127 | 3.2%
(not recovered) | 79 | 2.0%
발생량 | 67 | 1.7%
(not recovered) | 66 | 1.7%
원단위 | 58 | 1.5%
현황 | 55 | 1.4%
처리현황 | 50 | 1.3%
발생원별 | 50 | 1.3%
발생 | 47 | 1.2%
Other values (1167) | 3185 | 80.9%

Most occurring characters

ValueCountFrequency (%)
2920
 
13.4%
697
 
3.2%
565
 
2.6%
) 540
 
2.5%
( 540
 
2.5%
0 521
 
2.4%
473
 
2.2%
418
 
1.9%
409
 
1.9%
2 381
 
1.7%
Other values (400) 14374
65.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 15366 | 70.4%
Space Separator | 2920 | 13.4%
Decimal Number | 1691 | 7.7%
Close Punctuation | 540 | 2.5%
Open Punctuation | 540 | 2.5%
Other Punctuation | 232 | 1.1%
Uppercase Letter | 200 | 0.9%
Math Symbol | 159 | 0.7%
Connector Punctuation | 86 | 0.4%
Dash Punctuation | 80 | 0.4%
Other values (4) | 24 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
697
 
4.5%
565
 
3.7%
473
 
3.1%
418
 
2.7%
409
 
2.7%
370
 
2.4%
368
 
2.4%
367
 
2.4%
338
 
2.2%
303
 
2.0%
Other values (347) 11058
72.0%
Uppercase Letter
ValueCountFrequency (%)
P 52
26.0%
M 37
18.5%
E 19
 
9.5%
H 13
 
6.5%
C 12
 
6.0%
O 11
 
5.5%
R 11
 
5.5%
A 8
 
4.0%
B 8
 
4.0%
S 7
 
3.5%
Other values (5) 22
11.0%
Decimal Number
ValueCountFrequency (%)
0 521
30.8%
2 381
22.5%
1 285
16.9%
9 201
 
11.9%
6 71
 
4.2%
8 70
 
4.1%
4 62
 
3.7%
5 40
 
2.4%
7 33
 
2.0%
3 27
 
1.6%
Other Punctuation
ValueCountFrequency (%)
, 120
51.7%
· 50
21.6%
/ 29
 
12.5%
: 9
 
3.9%
. 9
 
3.9%
& 6
 
2.6%
# 4
 
1.7%
; 4
 
1.7%
% 1
 
0.4%
Lowercase Letter
ValueCountFrequency (%)
p 4
19.0%
t 4
19.0%
x 3
14.3%
n 2
9.5%
c 2
9.5%
z 2
9.5%
o 2
9.5%
e 1
 
4.8%
s 1
 
4.8%
Math Symbol
ValueCountFrequency (%)
~ 158
99.4%
1
 
0.6%
Space Separator
ValueCountFrequency (%)
2920
100.0%
Close Punctuation
ValueCountFrequency (%)
) 540
100.0%
Open Punctuation
ValueCountFrequency (%)
( 540
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 86
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 80
100.0%
Initial Punctuation
ValueCountFrequency (%)
1
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%
Other Number
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 15366 | 70.4%
Common | 6251 | 28.6%
Latin | 221 | 1.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
697
 
4.5%
565
 
3.7%
473
 
3.1%
418
 
2.7%
409
 
2.7%
370
 
2.4%
368
 
2.4%
367
 
2.4%
338
 
2.2%
303
 
2.0%
Other values (347) 11058
72.0%
Common
ValueCountFrequency (%)
2920
46.7%
) 540
 
8.6%
( 540
 
8.6%
0 521
 
8.3%
2 381
 
6.1%
1 285
 
4.6%
9 201
 
3.2%
~ 158
 
2.5%
, 120
 
1.9%
_ 86
 
1.4%
Other values (19) 499
 
8.0%
Latin
ValueCountFrequency (%)
P 52
23.5%
M 37
16.7%
E 19
 
8.6%
H 13
 
5.9%
C 12
 
5.4%
O 11
 
5.0%
R 11
 
5.0%
A 8
 
3.6%
B 8
 
3.6%
S 7
 
3.2%
Other values (14) 43
19.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 15366 | 70.4%
ASCII | 6418 | 29.4%
None | 51 | 0.2%
Punctuation | 2 | < 0.1%
Math Operators | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2920
45.5%
) 540
 
8.4%
( 540
 
8.4%
0 521
 
8.1%
2 381
 
5.9%
1 285
 
4.4%
9 201
 
3.1%
~ 158
 
2.5%
, 120
 
1.9%
_ 86
 
1.3%
Other values (38) 666
 
10.4%
Hangul
ValueCountFrequency (%)
697
 
4.5%
565
 
3.7%
473
 
3.1%
418
 
2.7%
409
 
2.7%
370
 
2.4%
368
 
2.4%
367
 
2.4%
338
 
2.2%
303
 
2.0%
Other values (347) 11058
72.0%
None
ValueCountFrequency (%)
· 50
98.0%
1
 
2.0%
Punctuation
ValueCountFrequency (%)
1
50.0%
1
50.0%
Math Operators
ValueCountFrequency (%)
1
100.0%

승인통계 여부
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 1.1 KiB

Value | Count | Frequency (%)
True | 935 | 91.7%
False | 85 | 8.3%

Correlations

Variable | 분야명 | 통계명 | 승인통계 여부
분야명 | 1.000 | 1.000 | 0.999
통계명 | 1.000 | 1.000 | 0.999
승인통계 여부 | 0.999 | 0.999 | 1.000
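
The report does not say which association measure fills this matrix. For pairs of categorical columns, one common choice is Cramér's V, derived from the chi-squared statistic of the contingency table. The sketch below implements the plain (not bias-corrected) form on toy data, purely to illustrate how values near 1 arise when one column almost determines the other. `cramers_v` and the toy lists are illustrative and are not taken from the dataset.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Plain Cramér's V between two equally long sequences of categories."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_totals = Counter(xs)
    y_totals = Counter(ys)
    chi2 = 0.0
    for x, nx in x_totals.items():
        for y, ny in y_totals.items():
            expected = nx * ny / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(x_totals), len(y_totals)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association
    return math.sqrt(chi2 / (n * k))

# Toy data: the flag is fully determined by the field in the first case
# and unrelated to it in the second.
field = ["waste", "waste", "health", "health"]
determined = [True, True, False, False]
unrelated = [True, False, True, False]

print(cramers_v(field, determined))
print(cramers_v(field, unrelated))
```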

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 분야명 | 통계명 | 통계표명 | 승인통계 여부
0 | 환경산업통계조사 | 환경산업통계조사 | 환경보호활동별 매출액(조사업체)(2004~2010) | Y
1 | 환경산업통계조사 | 환경산업통계조사 | 환경산업분류별(보호활동) 매출액 | Y
2 | 환경산업통계조사 | 환경산업통계조사 | 환경산업분류별(매체별) 매출액 | Y
3 | 환경산업통계조사 | 환경산업통계조사 | 환경산업분류별(업종별) 매출액 | Y
4 | 환경산업통계조사 | 환경산업통계조사 | 산업분류별/환경산업분류별 환경부문 매출액(서비스) | Y
5 | 환경산업통계조사 | 환경산업통계조사 | 산업분류별/환경산업분류별 환경부문 매출액(환경제품생산) | Y
6 | 환경산업통계조사 | 환경산업통계조사 | 산업분류별 조사업체수(조사업체)(2009~2010) | Y
7 | 환경산업통계조사 | 환경산업통계조사 | 환경산업분류별(업종별) 투자액(2009~2012) | Y
8 | 환경산업통계조사 | 환경산업통계조사 | 환경산업분류별(업종별) 유형고정자산(2009~2012) | Y
9 | 환경산업통계조사 | 환경산업통계조사 | 산업분류별 매출액(조사업체)(2009~2010) | Y

Last rows

Index | 분야명 | 통계명 | 통계표명 | 승인통계 여부
1010 | 국민환경보건 기초조사 | 국민환경보건 기초조사 | 요 중 모노(2-에틸-5-옥소헥실)프탈레이트(MEOHP) 농도(크레아티닌 보정) | Y
1011 | 국민환경보건 기초조사 | 국민환경보건 기초조사 | 요 중 모노(2-에틸-5-카르복시펜틸)프탈레이트(MECPP) 농도 | Y
1012 | 국민환경보건 기초조사 | 국민환경보건 기초조사 | 요 중 모노(2-에틸-5-카르복시펜틸)프탈레이트(MECPP) 농도(크레아티닌 보정) | Y
1013 | 국민환경보건 기초조사 | 국민환경보건 기초조사 | 요 중 모노벤질프탈레이트(MBzP) 농도 | Y
1014 | 국민환경보건 기초조사 | 국민환경보건 기초조사 | 요 중 모노벤질프탈레이트(MBzP) 농도(크레아티닌 보정) | Y
1015 | 국민환경보건 기초조사 | 국민환경보건 기초조사 | 요 중 모노카르복시옥틸 프탈레이트(MCOP) 농도 | Y
1016 | 국민환경보건 기초조사 | 국민환경보건 기초조사 | 요 중 모노카르복시옥틸 프탈레이트(MCOP) 농도(크레아티닌 보정) | Y
1017 | 국민환경보건 기초조사 | 국민환경보건 기초조사 | 요 중 모노(3-카르복시프로필)프탈레이트(MCPP) 농도 | Y
1018 | 국민환경보건 기초조사 | 국민환경보건 기초조사 | 요 중 모노(3-카르복시프로필)프탈레이트(MCPP) 농도(크레아티닌 보정) | Y
1019 | 국민환경보건 기초조사 | 국민환경보건 기초조사 | 요 중 비스페놀 A 농도(크레아티닌 보정) | Y

Duplicate rows

Most frequently occurring

Index | 분야명 | 통계명 | 통계표명 | 승인통계 여부 | # duplicates
8 | 생활폐기물 발생현황 | 전국 폐기물 통계조사 | 생활폐기물 삼성분 분석 연평균 | Y | 3
0 | 기타폐기물 | 전국 폐기물 통계조사 | 다중이용시설 폐기물 발생현황 | Y | 2
1 | 기타폐기물 | 전국 폐기물 통계조사 | 수해폐기물 발생현황 | Y | 2
2 | 기타폐기물 | 전국 폐기물 통계조사 | 스포츠·레저시설 폐기물 발생현황 | Y | 2
3 | 사업장폐기물 | 전국 폐기물 통계조사 | 사업장폐기물 발생 및 처리현황(총괄) | Y | 2
4 | 사업장폐기물발생현황 | 전국 폐기물 통계조사 | 사업장폐기물 발생 및 처리현황(총괄) | Y | 2
5 | 생활폐기물 | 전국 폐기물 통계조사 | 도시규모별 재활용가능자원 분리배출현황 | Y | 2
6 | 생활폐기물 | 전국 폐기물 통계조사 | 도시규모별 재활용가능자원 종류에 따른 원단위 발생량 | Y | 2
7 | 생활폐기물 발생현황 | 전국 폐기물 통계조사 | 생활폐기물 발열량분석 연평균 | Y | 2
9 | 생활폐기물 발생현황 | 전국 폐기물 통계조사 | 생활폐기물 원소분석 연평균 | Y | 2
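
A table like this groups identical rows and counts the copies in each group. A minimal sketch of that grouping with `collections.Counter` follows, on hypothetical toy rows shaped like (분야명, 통계표명). `duplicate_groups` is an illustrative name, and this is not the profiler's own code.

```python
from collections import Counter

def duplicate_groups(rows):
    """Return {row: copies} for every row that occurs more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}

# Toy rows (hashable tuples), loosely modelled on the table above.
toy_rows = [
    ("생활폐기물 발생현황", "생활폐기물 삼성분 분석 연평균"),
    ("생활폐기물 발생현황", "생활폐기물 삼성분 분석 연평균"),
    ("생활폐기물 발생현황", "생활폐기물 삼성분 분석 연평균"),
    ("기타폐기물", "수해폐기물 발생현황"),
    ("기타폐기물", "수해폐기물 발생현황"),
    ("기타폐기물", "다중이용시설 폐기물 발생현황"),
]

groups = duplicate_groups(toy_rows)
# Most frequent groups first, mirroring the ordering of the table.
for row, copies in sorted(groups.items(), key=lambda item: -item[1]):
    print(row, copies)

extra_copies = sum(n - 1 for n in groups.values())
print("groups:", len(groups), "extra copies:", extra_copies)
```

Counting groups and counting extra copies give different totals, so it is worth checking which of the two a reported duplicate count refers to before deduplicating.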