Overview

Dataset statistics

Number of variables: 4
Number of observations: 1398
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 45.2 KiB
Average record size in memory: 33.1 B
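
The average record size can be cross-checked from the two totals above. This is a minimal sketch that assumes the report uses 1 KiB = 1024 bytes and divides total memory by the number of observations; the variable names are illustrative and not taken from the profiler.

```python
KIB = 1024  # bytes per kibibyte (assumed unit convention)

total_kib = 45.2        # "Total size in memory" from the report
n_observations = 1398   # "Number of observations" from the report

# Average bytes per row = total bytes / number of rows
avg_record_bytes = total_kib * KIB / n_observations
print(f"{avg_record_bytes:.1f} B")
```

Under these assumptions the result rounds to the 33.1 B shown in the report.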

Variable types

Numeric: 1
Text: 3

Dataset

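Description: Code information used at the Sudokwon Landfill (수도권매립지), covering waste, household incineration ash grades, sample analysis, and related codes. Published fields: group code (그룹코드), group code name (그룹코드명), code (코드), and code name (코드명).
Author: Sudokwon Landfill Site Management Corporation (수도권매립지관리공사)
URL: https://www.data.go.kr/data/15064380/fileData.do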

Reproduction

Analysis started: 2024-04-13 13:21:37.835153
Analysis finished: 2024-04-13 13:21:39.797055
Duration: 1.96 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

그룹코드
Real number (ℝ)

Distinct: 128
Distinct (%): 9.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.7897
Minimum: 1
Maximum: 130
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 12.4 KiB

Quantile statistics

Minimum: 1
5th percentile: 7
Q1: 26
Median: 32
Q3: 76
95th percentile: 118
Maximum: 130
Range: 129
Interquartile range (IQR): 50

Descriptive statistics

Standard deviation: 34.517537
Coefficient of variation (CV): 0.67961689
Kurtosis: -0.73616993
Mean: 50.7897
Median Absolute Deviation (MAD): 20
Skewness: 0.66472822
Sum: 71004
Variance: 1191.4604
Monotonicity: Increasing
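
Several of the figures above are derived from one another, so they can be checked against each other. The sketch below recomputes the mean, variance, coefficient of variation, range, and IQR from the reported sum, count, standard deviation, and quantiles. It assumes the usual definitions (CV = standard deviation / mean, variance = standard deviation squared); the variable names are illustrative.

```python
# Values copied from the report for the numeric variable
n = 1398                 # number of observations
total = 71004            # Sum
std = 34.517537          # Standard deviation
minimum, q1, q3, maximum = 1, 26, 76, 130

mean = total / n               # Mean = Sum / count
variance = std ** 2            # Variance = std squared
cv = std / mean                # Coefficient of variation
value_range = maximum - minimum
iqr = q3 - q1

print(f"mean={mean:.4f} variance={variance:.4f} cv={cv:.8f}")
print(f"range={value_range} iqr={iqr}")
```

Each recomputed value agrees with the report to the precision shown there.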
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
28 | 170 | 12.2%
31 | 68 | 4.9%
26 | 57 | 4.1%
29 | 56 | 4.0%
24 | 38 | 2.7%
21 | 36 | 2.6%
85 | 32 | 2.3%
76 | 31 | 2.2%
19 | 31 | 2.2%
116 | 30 | 2.1%
Other values (118) | 849 | 60.7%
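
The frequency column is each count divided by the 1398 observations. As a rough check, the sketch below recomputes the percentages for the ten most frequent group codes and confirms that the listed counts plus the "Other values" row add up to the full dataset. The dictionary layout is an illustrative choice, not the profiler's internal structure.

```python
n = 1398  # number of observations

# Count per group code value, copied from the table of most frequent values
top_counts = {28: 170, 31: 68, 26: 57, 29: 56, 24: 38,
              21: 36, 85: 32, 76: 31, 19: 31, 116: 30}
other_count = 849  # "Other values (118)" row

def percent(count, total):
    """Share of `count` in `total`, as a percentage string with one decimal."""
    return f"{100 * count / total:.1f}%"

# All rows together should cover every observation exactly once
covered = sum(top_counts.values()) + other_count

for value, count in top_counts.items():
    print(value, count, percent(count, n))
print("Other values", other_count, percent(other_count, n))
```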

Minimum 10 values

Value | Count | Frequency (%)
1 | 7 | 0.5%
2 | 6 | 0.4%
3 | 5 | 0.4%
4 | 21 | 1.5%
5 | 7 | 0.5%
6 | 10 | 0.7%
7 | 18 | 1.3%
8 | 9 | 0.6%
9 | 2 | 0.1%
10 | 8 | 0.6%

Maximum 10 values

Value | Count | Frequency (%)
130 | 3 | 0.2%
129 | 5 | 0.4%
128 | 3 | 0.2%
127 | 7 | 0.5%
126 | 1 | 0.1%
125 | 11 | 0.8%
124 | 1 | 0.1%
123 | 1 | 0.1%
122 | 2 | 0.1%
121 | 2 | 0.1%

그룹코드명
Text

Distinct: 181
Distinct (%): 12.9%
Missing: 0
Missing (%): 0.0%
Memory size: 11.0 KiB

Length

Max length: 16
Median length: 14
Mean length: 7.3655222
Min length: 3

Characters and Unicode

Total characters: 10297
Distinct characters: 202
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
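
To illustrate what a per-category breakdown looks like, the sketch below uses Python's standard `unicodedata` module to count general categories over the five sample values listed for this variable. The mapping from two-letter category codes to the long names used in the report is written out by hand here, and the helper name `category_counts` is hypothetical; how the profiler itself derives these counts is not shown in the report.

```python
import unicodedata
from collections import Counter

# Two-letter Unicode general categories mapped to the long names in the report
CATEGORY_NAMES = {
    "Lo": "Other Letter", "Lu": "Uppercase Letter", "Ll": "Lowercase Letter",
    "Nd": "Decimal Number", "Zs": "Space Separator",
    "Pc": "Connector Punctuation", "Ps": "Open Punctuation",
    "Pe": "Close Punctuation", "Po": "Other Punctuation",
    "Pd": "Dash Punctuation", "Sm": "Math Symbol",
}

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for text in values:
        for ch in text:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

# The five sample rows shown for this variable
sample = ["계량_검사구분코드"] * 3 + ["업무구분코드"] * 2
counts = category_counts(sample)
print(counts)
```

Hangul syllables fall under Other Letter and the underscore under Connector Punctuation, which matches the two largest non-space categories reported for this variable.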

Unique

Unique: 7
Unique (%): 0.5%

Sample

1st row: 계량_검사구분코드
2nd row: 계량_검사구분코드
3rd row: 계량_검사구분코드
4th row: 업무구분코드
5th row: 업무구분코드

Value | Count | Frequency (%)
부서코드 | 170 | 8.7%
농협 | 82 | 4.2%
펌뱅킹 | 82 | 4.2%
폐기물 | 73 | 3.7%
폐기물성상 | 68 | 3.5%
분류 | 68 | 3.5%
위반코드(=벌점코드 | 50 | 2.6%
구분 | 48 | 2.5%
신청사유코드 | 46 | 2.4%
톤수/형식코드 | 38 | 1.9%
Other values (204) | 1226 | 62.8%

Most occurring characters

ValueCountFrequency (%)
814
 
7.9%
803
 
7.8%
553
 
5.4%
377
 
3.7%
276
 
2.7%
273
 
2.7%
261
 
2.5%
241
 
2.3%
_ 220
 
2.1%
194
 
1.9%
Other values (192) 6285
61.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 9291 | 90.2%
Space Separator | 553 | 5.4%
Connector Punctuation | 220 | 2.1%
Close Punctuation | 50 | 0.5%
Open Punctuation | 50 | 0.5%
Math Symbol | 50 | 0.5%
Other Punctuation | 47 | 0.5%
Uppercase Letter | 19 | 0.2%
Decimal Number | 13 | 0.1%
Dash Punctuation | 4 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
814
 
8.8%
803
 
8.6%
377
 
4.1%
276
 
3.0%
273
 
2.9%
261
 
2.8%
241
 
2.6%
194
 
2.1%
191
 
2.1%
182
 
2.0%
Other values (177) 5679
61.1%
Uppercase Letter
ValueCountFrequency (%)
I 5
26.3%
P 5
26.3%
A 5
26.3%
F 2
 
10.5%
R 2
 
10.5%
Decimal Number
ValueCountFrequency (%)
2 7
53.8%
1 4
30.8%
4 2
 
15.4%
Space Separator
ValueCountFrequency (%)
553
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 220
100.0%
Close Punctuation
ValueCountFrequency (%)
) 50
100.0%
Open Punctuation
ValueCountFrequency (%)
( 50
100.0%
Math Symbol
ValueCountFrequency (%)
= 50
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 47
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 9291 | 90.2%
Common | 987 | 9.6%
Latin | 19 | 0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
814
 
8.8%
803
 
8.6%
377
 
4.1%
276
 
3.0%
273
 
2.9%
261
 
2.8%
241
 
2.6%
194
 
2.1%
191
 
2.1%
182
 
2.0%
Other values (177) 5679
61.1%
Common
ValueCountFrequency (%)
553
56.0%
_ 220
 
22.3%
) 50
 
5.1%
( 50
 
5.1%
= 50
 
5.1%
/ 47
 
4.8%
2 7
 
0.7%
1 4
 
0.4%
- 4
 
0.4%
4 2
 
0.2%
Latin
ValueCountFrequency (%)
I 5
26.3%
P 5
26.3%
A 5
26.3%
F 2
 
10.5%
R 2
 
10.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 9291 | 90.2%
ASCII | 1006 | 9.8%

Most frequent character per block

Hangul
ValueCountFrequency (%)
814
 
8.8%
803
 
8.6%
377
 
4.1%
276
 
3.0%
273
 
2.9%
261
 
2.8%
241
 
2.6%
194
 
2.1%
191
 
2.1%
182
 
2.0%
Other values (177) 5679
61.1%
ASCII
ValueCountFrequency (%)
553
55.0%
_ 220
 
21.9%
) 50
 
5.0%
( 50
 
5.0%
= 50
 
5.0%
/ 47
 
4.7%
2 7
 
0.7%
I 5
 
0.5%
P 5
 
0.5%
A 5
 
0.5%
Other values (5) 14
 
1.4%

코드
Text

Distinct: 594
Distinct (%): 42.5%
Missing: 0
Missing (%): 0.0%
Memory size: 11.0 KiB

Length

Max length: 7
Median length: 2
Mean length: 3.0171674
Min length: 1

Characters and Unicode

Total characters: 4218
Distinct characters: 29
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 473
Unique (%): 33.8%
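
"Distinct" and "Unique" measure different things: distinct counts how many different values occur, while unique counts values that occur exactly once. The sketch below shows the difference on the 코드 values of the first ten rows, as read from the Sample section of this report. It assumes these standard definitions; the variable names are illustrative.

```python
from collections import Counter

# 코드 values of the first ten rows, as read from the report's Sample section
codes = ["01", "02", "03", "CM", "CP", "CS", "RC", "01", "02", "01"]

occurrences = Counter(codes)

# Distinct: number of different values, however often each occurs
distinct = len(occurrences)

# Unique: values that occur exactly once
unique = sum(1 for count in occurrences.values() if count == 1)

print(f"distinct={distinct} unique={unique}")
```

Here "01" and "02" repeat, so they count toward distinct but not toward unique.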

Sample

1st row: 01
2nd row: 02
3rd row: 03
4th row: CM
5th row: CP

Value | Count | Frequency (%)
01 | 135 | 9.7%
02 | 125 | 8.9%
03 | 88 | 6.3%
04 | 67 | 4.8%
05 | 53 | 3.8%
06 | 31 | 2.2%
99 | 20 | 1.4%
07 | 20 | 1.4%
10 | 18 | 1.3%
11 | 15 | 1.1%
Other values (584) | 826 | 59.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 1687 | 40.0%
1 | 690 | 16.4%
2 | 532 | 12.6%
3 | 393 | 9.3%
4 | 249 | 5.9%
5 | 212 | 5.0%
6 | 129 | 3.1%
9 | 107 | 2.5%
7 | 65 | 1.5%
8 | 55 | 1.3%
Other values (19) | 99 | 2.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 4119 | 97.7%
Uppercase Letter | 99 | 2.3%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 28
28.3%
M 22
22.2%
F 12
12.1%
A 6
 
6.1%
D 5
 
5.1%
B 4
 
4.0%
E 3
 
3.0%
I 2
 
2.0%
N 2
 
2.0%
Y 2
 
2.0%
Other values (9) 13
13.1%
Decimal Number
ValueCountFrequency (%)
0 1687
41.0%
1 690
16.8%
2 532
 
12.9%
3 393
 
9.5%
4 249
 
6.0%
5 212
 
5.1%
6 129
 
3.1%
9 107
 
2.6%
7 65
 
1.6%
8 55
 
1.3%

Most occurring scripts

Value | Count | Frequency (%)
Common | 4119 | 97.7%
Latin | 99 | 2.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 28
28.3%
M 22
22.2%
F 12
12.1%
A 6
 
6.1%
D 5
 
5.1%
B 4
 
4.0%
E 3
 
3.0%
I 2
 
2.0%
N 2
 
2.0%
Y 2
 
2.0%
Other values (9) 13
13.1%
Common
ValueCountFrequency (%)
0 1687
41.0%
1 690
16.8%
2 532
 
12.9%
3 393
 
9.5%
4 249
 
6.0%
5 212
 
5.1%
6 129
 
3.1%
9 107
 
2.6%
7 65
 
1.6%
8 55
 
1.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 4218 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1687
40.0%
1 690
16.4%
2 532
 
12.6%
3 393
 
9.3%
4 249
 
5.9%
5 212
 
5.0%
6 129
 
3.1%
9 107
 
2.5%
7 65
 
1.5%
8 55
 
1.3%
Other values (19) 99
 
2.3%

코드명
Text

Distinct: 1117
Distinct (%): 79.9%
Missing: 0
Missing (%): 0.0%
Memory size: 11.0 KiB

Length

Max length: 38
Median length: 29
Mean length: 6.1387697
Min length: 1

Characters and Unicode

Total characters: 8582
Distinct characters: 407
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
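
For each text variable, the reported mean length should equal its total character count divided by the 1398 observations. The sketch below checks this for the three text variables in order of appearance. The column names used as dictionary keys are inferred from the header row in the Sample section and are an assumption here.

```python
n = 1398  # number of observations

# (total characters, reported mean length) per text variable;
# column names are inferred from the Sample section header
reported = {
    "그룹코드명": (10297, 7.3655222),
    "코드": (4218, 3.0171674),
    "코드명": (8582, 6.1387697),
}

# Mean length = total characters / number of rows
recomputed = {name: total / n for name, (total, _) in reported.items()}

for name, (total, mean_reported) in reported.items():
    print(name, f"{recomputed[name]:.7f}", mean_reported)
```

The recomputed means agree with the reported ones to seven decimal places.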

Unique

Unique: 940
Unique (%): 67.2%

Sample

1st row: 정밀
2nd row: 시료
3rd row: 일반
4th row: 반입관리
5th row: 웹포탈

Value | Count | Frequency (%)
기타 | 34 | 1.8%
혼합반입 | 26 | 1.4%
폐기물 | 26 | 1.4%
조회 | 18 | 0.9%
차량대수 | 16 | 0.8%
요청 | 15 | 0.8%
응답 | 14 | 0.7%
변경 | 11 | 0.6%
초과 | 10 | 0.5%
계좌 | 10 | 0.5%
Other values (1182) | 1729 | 90.6%

Most occurring characters

ValueCountFrequency (%)
511
 
6.0%
236
 
2.7%
189
 
2.2%
176
 
2.1%
( 139
 
1.6%
139
 
1.6%
) 138
 
1.6%
135
 
1.6%
125
 
1.5%
123
 
1.4%
Other values (397) 6671
77.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 7115 | 82.9%
Space Separator | 511 | 6.0%
Decimal Number | 342 | 4.0%
Other Punctuation | 140 | 1.6%
Open Punctuation | 139 | 1.6%
Close Punctuation | 138 | 1.6%
Uppercase Letter | 134 | 1.6%
Lowercase Letter | 44 | 0.5%
Connector Punctuation | 12 | 0.1%
Dash Punctuation | 5 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
236
 
3.3%
189
 
2.7%
176
 
2.5%
139
 
2.0%
135
 
1.9%
125
 
1.8%
123
 
1.7%
121
 
1.7%
117
 
1.6%
116
 
1.6%
Other values (336) 5638
79.2%
Uppercase Letter
ValueCountFrequency (%)
C 20
14.9%
F 17
12.7%
S 16
11.9%
R 11
 
8.2%
B 9
 
6.7%
E 7
 
5.2%
A 7
 
5.2%
P 7
 
5.2%
M 6
 
4.5%
T 6
 
4.5%
Other values (10) 28
20.9%
Lowercase Letter
ValueCountFrequency (%)
o 5
11.4%
a 5
11.4%
i 4
 
9.1%
u 4
 
9.1%
r 3
 
6.8%
g 3
 
6.8%
c 3
 
6.8%
t 2
 
4.5%
s 2
 
4.5%
d 2
 
4.5%
Other values (8) 11
25.0%
Decimal Number
ValueCountFrequency (%)
0 101
29.5%
1 78
22.8%
2 54
15.8%
5 25
 
7.3%
3 23
 
6.7%
4 20
 
5.8%
8 16
 
4.7%
7 9
 
2.6%
6 8
 
2.3%
9 8
 
2.3%
Other Punctuation
ValueCountFrequency (%)
/ 58
41.4%
, 46
32.9%
% 18
 
12.9%
. 8
 
5.7%
? 5
 
3.6%
· 4
 
2.9%
& 1
 
0.7%
Space Separator
ValueCountFrequency (%)
511
100.0%
Open Punctuation
ValueCountFrequency (%)
( 139
100.0%
Close Punctuation
ValueCountFrequency (%)
) 138
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 12
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5
100.0%
Math Symbol
ValueCountFrequency (%)
+ 2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 7115 | 82.9%
Common | 1289 | 15.0%
Latin | 178 | 2.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
236
 
3.3%
189
 
2.7%
176
 
2.5%
139
 
2.0%
135
 
1.9%
125
 
1.8%
123
 
1.7%
121
 
1.7%
117
 
1.6%
116
 
1.6%
Other values (336) 5638
79.2%
Latin
ValueCountFrequency (%)
C 20
 
11.2%
F 17
 
9.6%
S 16
 
9.0%
R 11
 
6.2%
B 9
 
5.1%
E 7
 
3.9%
A 7
 
3.9%
P 7
 
3.9%
M 6
 
3.4%
T 6
 
3.4%
Other values (28) 72
40.4%
Common
ValueCountFrequency (%)
511
39.6%
( 139
 
10.8%
) 138
 
10.7%
0 101
 
7.8%
1 78
 
6.1%
/ 58
 
4.5%
2 54
 
4.2%
, 46
 
3.6%
5 25
 
1.9%
3 23
 
1.8%
Other values (13) 116
 
9.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 7115 | 82.9%
ASCII | 1463 | 17.0%
None | 4 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
511
34.9%
( 139
 
9.5%
) 138
 
9.4%
0 101
 
6.9%
1 78
 
5.3%
/ 58
 
4.0%
2 54
 
3.7%
, 46
 
3.1%
5 25
 
1.7%
3 23
 
1.6%
Other values (50) 290
19.8%
Hangul
ValueCountFrequency (%)
236
 
3.3%
189
 
2.7%
176
 
2.5%
139
 
2.0%
135
 
1.9%
125
 
1.8%
123
 
1.7%
121
 
1.7%
117
 
1.6%
116
 
1.6%
Other values (336) 5638
79.2%
None
ValueCountFrequency (%)
· 4
100.0%
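
The block breakdown for this variable can be approximated by classifying each character by its code point. The sketch below is a simplified stand-in, not the profiler's own lookup: it assumes "ASCII" means code points up to U+007F and that "Hangul" corresponds to the Hangul Syllables range U+AC00 to U+D7A3. It also assumes the four characters listed under None are the middle dot U+00B7, which lies outside both ranges. The function name `block_of` is hypothetical.

```python
from collections import Counter

def block_of(ch):
    """Simplified block lookup by code point (assumed ranges, see lead-in)."""
    cp = ord(ch)
    if cp <= 0x7F:
        return "ASCII"
    if 0xAC00 <= cp <= 0xD7A3:
        return "Hangul"   # Hangul Syllables range only
    return "other"        # anything this sketch does not classify

# One value from the report's Sample section
blocks = Counter(block_of(ch) for ch in "가산금 1차")
print(blocks)

# Assumed identity of the character listed under "None": middle dot U+00B7
middle_dot_block = block_of("\u00b7")

# Reported block counts should add up to the total character count
reported_total = 7115 + 1463 + 4
print(middle_dot_block, reported_total)
```

The three reported block counts sum to the 8582 total characters given for this variable.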

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

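First rows

(index) | 그룹코드 | 그룹코드명 | 코드 | 코드명
0 | 1 | 계량_검사구분코드 | 01 | 정밀
1 | 1 | 계량_검사구분코드 | 02 | 시료
2 | 1 | 계량_검사구분코드 | 03 | 일반
3 | 1 | 업무구분코드 | CM | 반입관리
4 | 1 | 업무구분코드 | CP | 웹포탈
5 | 1 | 업무구분코드 | CS | 업무공통
6 | 1 | 업무구분코드 | RC | 매립정보
7 | 2 | 전표발행구분코드 | 01 | 신규
8 | 2 | 전표발행구분코드 | 02 | 재발행
9 | 2 | 프로그램종류코드 | 01 | XML

Last rows

(index) | 그룹코드 | 그룹코드명 | 코드 | 코드명
1388 | 128 | 지자체수납 가산금 비율 | 2 | 4
1389 | 128 | 지자체수납 가산금 비율 | 3 | 5
1390 | 129 | 위반 내용 구분 | 01 | 규정 봉투 미사용
1391 | 129 | 위반 내용 구분 | 02 | 대형폐기물 혼합반입
1392 | 129 | 위반 내용 구분 | 03 | 사업장폐기물 혼합반입
1393 | 129 | 위반 내용 구분 | 90 | 기타 폐기물 혼합반입
1394 | 129 | 위반 내용 구분 | 98 | 수기입력
1395 | 130 | 가산금 구분 | 01 | 가산금 1차
1396 | 130 | 가산금 구분 | 02 | 가산금 2차
1397 | 130 | 가산금 구분 | 03 | 가산금 3차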
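
The report types 그룹코드 as numeric and the other three columns as text, which matters when loading the data: codes such as "01" lose their leading zero if parsed as numbers. The sketch below reads the first ten sample rows with Python's `csv` module, which keeps every field as a string, and also checks that the group codes are non-decreasing, consistent with the "Increasing" monotonicity reported for that variable. The comma-separated layout with a header row is an assumption; the actual file's delimiter and encoding are not stated in the report.

```python
import csv
import io

# First ten rows from the Sample section, in an assumed CSV layout
SAMPLE_CSV = """그룹코드,그룹코드명,코드,코드명
1,계량_검사구분코드,01,정밀
1,계량_검사구분코드,02,시료
1,계량_검사구분코드,03,일반
1,업무구분코드,CM,반입관리
1,업무구분코드,CP,웹포탈
1,업무구분코드,CS,업무공통
1,업무구분코드,RC,매립정보
2,전표발행구분코드,01,신규
2,전표발행구분코드,02,재발행
2,프로그램종류코드,01,XML
"""

# csv.DictReader yields every field as str, so "01" keeps its leading zero
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
first_code = rows[0]["코드"]

# Only the group code is converted to a number
group_codes = [int(row["그룹코드"]) for row in rows]

# Non-decreasing check over consecutive pairs
is_non_decreasing = all(a <= b for a, b in zip(group_codes, group_codes[1:]))

print(first_code, group_codes, is_non_decreasing)
```

Converting 코드 to an integer would collapse "01" to 1 and fail outright on values such as "CM", so keeping it as text is the safer choice for lookups.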