Overview

Dataset statistics

Number of variables: 4
Number of observations: 2933
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 91.8 KiB
Average record size in memory: 32.0 B
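
The overview counts above can be approximated with the Python standard library. This is a minimal sketch, not the profiler's own implementation: the function name `dataset_statistics` is hypothetical, the three rows are copied from the report's sample table, and it assumes that an empty string counts as a missing cell and that a row counts as a duplicate when an identical row appeared earlier.

```python
import csv
import io

# Three rows copied from the report's sample table; the full file is not reproduced here.
SAMPLE_CSV = """코드종류,코드종류명,세부코드,세부코드명
RV10016,사업전후_구분_코드,1,사업전
RV10016,사업전후_구분_코드,2,공사중
RV10016,사업전후_구분_코드,3,사업후
"""


def dataset_statistics(rows, columns):
    """Count variables, observations, missing cells and duplicate rows."""
    n_rows = len(rows)
    n_cells = n_rows * len(columns)
    # Assumption: an empty string or None counts as a missing cell.
    missing = sum(1 for row in rows for col in columns if row.get(col) in (None, ""))
    # Assumption: a row is a duplicate if an identical row appeared earlier.
    seen = set()
    duplicates = 0
    for row in rows:
        key = tuple(row.get(col) for col in columns)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return {
        "variables": len(columns),
        "observations": n_rows,
        "missing_cells": missing,
        "missing_cells_pct": 100 * missing / n_cells if n_cells else 0.0,
        "duplicate_rows": duplicates,
        "duplicate_rows_pct": 100 * duplicates / n_rows if n_rows else 0.0,
    }


reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
rows = list(reader)
stats = dataset_statistics(rows, reader.fieldnames)
print(stats)
```

Run against the full file, the same function would be expected to report 4 variables and 2933 observations if the assumptions above match the profiler's definitions.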

Variable types

Text: 4

Dataset

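Description: Base data for managing the records of the Nonpoint Source Pollution Management Information System. It provides detailed code information on project types, weather conditions, industry types, and river systems.
Author: Korea Environment Corporation (한국환경공단)
URL: https://www.data.go.kr/data/15070122/fileData.do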

Reproduction

Analysis started: 2024-03-16 04:25:22.673549
Analysis finished: 2024-03-16 04:25:23.893195
Duration: 1.22 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

코드종류 (code type)

Distinct: 96
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 23.0 KiB

Length

Max length: 7
Median length: 7
Mean length: 6.9989772
Min length: 4
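
The length summary can be reproduced for any list of strings. A minimal sketch, assuming length is measured in Unicode code points (what Python's `len` returns for `str`); the helper name `length_summary` is hypothetical, and the five values are the sample rows of this variable.

```python
from statistics import mean, median


def length_summary(values):
    """Return max, median, mean and min string length for a list of strings."""
    lengths = [len(v) for v in values]  # len() counts code points, not bytes
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }


codes = ["MS10003", "RV10013", "RV10016", "RV10016", "RV10016"]
summary = length_summary(codes)
print(summary)
```

All five sample codes are seven characters long, so every statistic is 7 here. The full column also contains shorter codes, which is why the report shows a minimum of 4.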

Characters and Unicode

Total characters: 20528
Distinct characters: 20
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
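
As an illustration of those character properties, the general category of each code point is available through Python's standard `unicodedata` module. The sketch below counts categories over a few sample codes; the helper `category_counts` and the label table `CATEGORY_NAMES` are hypothetical, and the table only covers the categories named in this report.

```python
import unicodedata
from collections import Counter

# Long names for the two-letter general category codes seen in this report.
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Pc": "Connector Punctuation",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "So": "Other Symbol",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


codes = ["MS10003", "RV10013", "RV10016"]
counts = category_counts(codes)
for code, n in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), n)
```

Each sample code has two uppercase letters and five digits, so three codes give 15 decimal numbers and 6 uppercase letters.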

Unique

Unique: 2
Unique (%): 0.1%
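
The report lists both a "Distinct" and a "Unique" count. The sketch below assumes the usual reading: distinct is the number of different values, and unique is the number of values that occur exactly once. The helper name `distinct_and_unique` is hypothetical.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    freq = Counter(values)
    distinct = len(freq)
    unique = sum(1 for count in freq.values() if count == 1)
    return distinct, unique


codes = ["MS10003", "RV10013", "RV10016", "RV10016", "RV10016"]
distinct, unique = distinct_and_unique(codes)
print(distinct, unique)
```

For the five sample rows this gives 3 distinct values, of which 2 occur only once. Under the same reading, the full column has 96 distinct codes and only 2 that appear a single time.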

Sample

1st row: MS10003
2nd row: RV10013
3rd row: RV10016
4th row: RV10016
5th row: RV10016
Value | Count | Frequency (%)
fc10005 | 2021 | 68.9%
ms10003 | 179 | 6.1%
fc10029 | 59 | 2.0%
ms10004 | 42 | 1.4%
fc10008 | 40 | 1.4%
fc10048 | 36 | 1.2%
fc10003 | 32 | 1.1%
cd10007 | 26 | 0.9%
ms10006 | 25 | 0.9%
fc10039 | 23 | 0.8%
Other values (86) | 450 | 15.3%
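
A frequency table of this shape, with the most common values followed by an "Other values (k)" row, can be built with `collections.Counter`. This is a sketch under assumptions: the function name `frequency_table` and its `top` parameter are hypothetical, and the profiler may tokenize or normalize values before counting, which this sketch does not do.

```python
from collections import Counter


def frequency_table(values, top=10):
    """Return (value, count, percent) rows for the top values plus a remainder row."""
    freq = Counter(values)
    total = len(values)
    common = freq.most_common(top)
    rows = [(value, count, 100 * count / total) for value, count in common]
    remaining_values = len(freq) - len(common)
    if remaining_values > 0:
        remaining_count = total - sum(count for _, count in common)
        rows.append(
            (f"Other values ({remaining_values})", remaining_count,
             100 * remaining_count / total)
        )
    return rows


codes = ["RV10016", "RV10016", "RV10016", "MS10003", "RV10013"]
table = frequency_table(codes, top=1)
for value, count, pct in table:
    print(f"{value} | {count} | {pct:.1f}%")
```

With `top=1` the five sample rows collapse to one named value and one remainder row covering the two other codes.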

Most occurring characters

Value | Count | Frequency (%)
0 | 8381 | 40.8%
1 | 3151 | 15.3%
C | 2535 | 12.3%
F | 2414 | 11.8%
5 | 2061 | 10.0%
3 | 358 | 1.7%
M | 348 | 1.7%
S | 348 | 1.7%
2 | 188 | 0.9%
4 | 167 | 0.8%
Other values (10) | 577 | 2.8%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 14660 | 71.4%
Uppercase Letter | 5868 | 28.6%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 8381 | 57.2%
1 | 3151 | 21.5%
5 | 2061 | 14.1%
3 | 358 | 2.4%
2 | 188 | 1.3%
4 | 167 | 1.1%
9 | 123 | 0.8%
8 | 108 | 0.7%
6 | 68 | 0.5%
7 | 55 | 0.4%

Uppercase Letter
Value | Count | Frequency (%)
C | 2535 | 43.2%
F | 2414 | 41.1%
M | 348 | 5.9%
S | 348 | 5.9%
D | 104 | 1.8%
R | 50 | 0.9%
V | 49 | 0.8%
N | 17 | 0.3%
O | 2 | < 0.1%
T | 1 | < 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 14660 | 71.4%
Latin | 5868 | 28.6%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 8381 | 57.2%
1 | 3151 | 21.5%
5 | 2061 | 14.1%
3 | 358 | 2.4%
2 | 188 | 1.3%
4 | 167 | 1.1%
9 | 123 | 0.8%
8 | 108 | 0.7%
6 | 68 | 0.5%
7 | 55 | 0.4%

Latin
Value | Count | Frequency (%)
C | 2535 | 43.2%
F | 2414 | 41.1%
M | 348 | 5.9%
S | 348 | 5.9%
D | 104 | 1.8%
R | 50 | 0.9%
V | 49 | 0.8%
N | 17 | 0.3%
O | 2 | < 0.1%
T | 1 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 20528 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 8381 | 40.8%
1 | 3151 | 15.3%
C | 2535 | 12.3%
F | 2414 | 11.8%
5 | 2061 | 10.0%
3 | 358 | 1.7%
M | 348 | 1.7%
S | 348 | 1.7%
2 | 188 | 0.9%
4 | 167 | 0.8%
Other values (10) | 577 | 2.8%

코드종류명 (code type name)

Distinct: 94
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Memory size: 23.0 KiB

Length

Max length: 27
Median length: 20
Mean length: 17.890215
Min length: 5

Characters and Unicode

Total characters: 52472
Distinct characters: 158
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 0.1%

Sample

1st row: 측정소_운영점검일지_점검_항목_코드
2nd row: 비용_구분_코드
3rd row: 사업전후_구분_코드
4th row: 사업전후_구분_코드
5th row: 사업전후_구분_코드
Value | Count | Frequency (%)
사업_종류_코드(표준산업_분류_코드 | 2021 | 68.3%
측정소_운영점검일지_점검_항목_코드 | 179 | 6.0%
장비_종류_코드 | 67 | 2.3%
주요관리_대상_물질_코드 | 59 | 2.0%
저감_시설_종류_코드 | 40 | 1.4%
저감_시설_종류_코드(국고보조 | 36 | 1.2%
개발사업_종류_코드 | 32 | 1.1%
첨부파일_업무_구분_코드 | 26 | 0.9%
국고보조_유지관리_재해_민원_폐기물처리_항목_코드 | 23 | 0.8%
변경신고_이력_변경대상코드 | 22 | 0.7%
Other values (88) | 456 | 15.4%

Most occurring characters

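Value | Count | Frequency (%)
_ | 10555 | 20.1%
(not recovered) | 4940 | 9.4%
(not recovered) | 4940 | 9.4%
(not recovered) | 4265 | 8.1%
(not recovered) | 4174 | 8.0%
(not recovered) | 2241 | 4.3%
(not recovered) | 2171 | 4.1%
(not recovered) | 2129 | 4.1%
) | 2083 | 4.0%
( | 2083 | 4.0%
Other values (148) | 12891 | 24.6%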

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 37617 | 71.7%
Connector Punctuation | 10555 | 20.1%
Close Punctuation | 2083 | 4.0%
Open Punctuation | 2083 | 4.0%
Uppercase Letter | 84 | 0.2%
Space Separator | 50 | 0.1%

Most frequent character per category

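Other Letter
Value | Count | Frequency (%)
(not recovered) | 4940 | 13.1%
(not recovered) | 4940 | 13.1%
(not recovered) | 4265 | 11.3%
(not recovered) | 4174 | 11.1%
(not recovered) | 2241 | 6.0%
(not recovered) | 2171 | 5.8%
(not recovered) | 2129 | 5.7%
(not recovered) | 2028 | 5.4%
(not recovered) | 2024 | 5.4%
(not recovered) | 2021 | 5.4%
Other values (139) | 6684 | 17.8%

Uppercase Letter
Value | Count | Frequency (%)
S | 52 | 61.9%
M | 26 | 31.0%
U | 2 | 2.4%
R | 2 | 2.4%
L | 2 | 2.4%

Connector Punctuation
Value | Count | Frequency (%)
_ | 10555 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 2083 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 2083 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 50 | 100.0%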

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 37617 | 71.7%
Common | 14771 | 28.2%
Latin | 84 | 0.2%

Most frequent character per script

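Hangul
Value | Count | Frequency (%)
(not recovered) | 4940 | 13.1%
(not recovered) | 4940 | 13.1%
(not recovered) | 4265 | 11.3%
(not recovered) | 4174 | 11.1%
(not recovered) | 2241 | 6.0%
(not recovered) | 2171 | 5.8%
(not recovered) | 2129 | 5.7%
(not recovered) | 2028 | 5.4%
(not recovered) | 2024 | 5.4%
(not recovered) | 2021 | 5.4%
Other values (139) | 6684 | 17.8%

Latin
Value | Count | Frequency (%)
S | 52 | 61.9%
M | 26 | 31.0%
U | 2 | 2.4%
R | 2 | 2.4%
L | 2 | 2.4%

Common
Value | Count | Frequency (%)
_ | 10555 | 71.5%
) | 2083 | 14.1%
( | 2083 | 14.1%
(space) | 50 | 0.3%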

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 37617 | 71.7%
ASCII | 14855 | 28.3%

Most frequent character per block

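ASCII
Value | Count | Frequency (%)
_ | 10555 | 71.1%
) | 2083 | 14.0%
( | 2083 | 14.0%
S | 52 | 0.4%
(space) | 50 | 0.3%
M | 26 | 0.2%
U | 2 | < 0.1%
R | 2 | < 0.1%
L | 2 | < 0.1%

Hangul
Value | Count | Frequency (%)
(not recovered) | 4940 | 13.1%
(not recovered) | 4940 | 13.1%
(not recovered) | 4265 | 11.3%
(not recovered) | 4174 | 11.1%
(not recovered) | 2241 | 6.0%
(not recovered) | 2171 | 5.8%
(not recovered) | 2129 | 5.7%
(not recovered) | 2028 | 5.4%
(not recovered) | 2024 | 5.4%
(not recovered) | 2021 | 5.4%
Other values (139) | 6684 | 17.8%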

세부코드 (detail code)

Distinct: 2502
Distinct (%): 85.3%
Missing: 0
Missing (%): 0.0%
Memory size: 23.0 KiB

Length

Max length: 9
Median length: 8
Mean length: 4.0054552
Min length: 1

Characters and Unicode

Total characters: 11748
Distinct characters: 45
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2303
Unique (%): 78.5%

Sample

1st row: B6
2nd row: 99
3rd row: 1
4th row: 2
5th row: 3
Value | Count | Frequency (%)
1 | 29 | 1.0%
2 | 25 | 0.9%
3 | 15 | 0.5%
11 | 13 | 0.4%
13 | 11 | 0.4%
4 | 11 | 0.4%
5 | 11 | 0.4%
12 | 11 | 0.4%
0 | 11 | 0.4%
6 | 10 | 0.3%
Other values (2491) | 2786 | 95.0%

Most occurring characters

Value | Count | Frequency (%)
1 | 2206 | 18.8%
2 | 1985 | 16.9%
0 | 1332 | 11.3%
9 | 971 | 8.3%
3 | 965 | 8.2%
4 | 929 | 7.9%
6 | 618 | 5.3%
5 | 598 | 5.1%
7 | 502 | 4.3%
8 | 349 | 3.0%
Other values (35) | 1293 | 11.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 10455 | 89.0%
Uppercase Letter | 1272 | 10.8%
Lowercase Letter | 21 | 0.2%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
D | 175 | 13.8%
R | 139 | 10.9%
E | 98 | 7.7%
C | 89 | 7.0%
T | 78 | 6.1%
I | 76 | 6.0%
A | 75 | 5.9%
B | 73 | 5.7%
M | 56 | 4.4%
F | 52 | 4.1%
Other values (16) | 361 | 28.4%

Decimal Number
Value | Count | Frequency (%)
1 | 2206 | 21.1%
2 | 1985 | 19.0%
0 | 1332 | 12.7%
9 | 971 | 9.3%
3 | 965 | 9.2%
4 | 929 | 8.9%
6 | 618 | 5.9%
5 | 598 | 5.7%
7 | 502 | 4.8%
8 | 349 | 3.3%

Lowercase Letter
Value | Count | Frequency (%)
t | 5 | 23.8%
m | 5 | 23.8%
h | 4 | 19.0%
p | 2 | 9.5%
s | 1 | 4.8%
e | 1 | 4.8%
c | 1 | 4.8%
r | 1 | 4.8%
a | 1 | 4.8%

Most occurring scripts

Value | Count | Frequency (%)
Common | 10455 | 89.0%
Latin | 1293 | 11.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
D | 175 | 13.5%
R | 139 | 10.8%
E | 98 | 7.6%
C | 89 | 6.9%
T | 78 | 6.0%
I | 76 | 5.9%
A | 75 | 5.8%
B | 73 | 5.6%
M | 56 | 4.3%
F | 52 | 4.0%
Other values (25) | 382 | 29.5%

Common
Value | Count | Frequency (%)
1 | 2206 | 21.1%
2 | 1985 | 19.0%
0 | 1332 | 12.7%
9 | 971 | 9.3%
3 | 965 | 9.2%
4 | 929 | 8.9%
6 | 618 | 5.9%
5 | 598 | 5.7%
7 | 502 | 4.8%
8 | 349 | 3.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 11748 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 2206 | 18.8%
2 | 1985 | 16.9%
0 | 1332 | 11.3%
9 | 971 | 8.3%
3 | 965 | 8.2%
4 | 929 | 7.9%
6 | 618 | 5.3%
5 | 598 | 5.1%
7 | 502 | 4.3%
8 | 349 | 3.0%
Other values (35) | 1293 | 11.0%

세부코드명 (detail code name)

Distinct: 2460
Distinct (%): 83.9%
Missing: 0
Missing (%): 0.0%
Memory size: 23.0 KiB

Length

Max length: 60
Median length: 40
Mean length: 11.591544
Min length: 1

Characters and Unicode

Total characters: 33998
Distinct characters: 594
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2100
Unique (%): 71.6%

Sample

1st row: 채수장비(펌프, 맨홀, 켄틸러버) 정상작동 확인
2nd row: 기타(부대비용 등)
3rd row: 사업전
4th row: 공사중
5th row: 사업후
Value | Count | Frequency (%)
(not recovered) | 978 | 9.9%
제조업 | 679 | 6.9%
기타 | 389 | 4.0%
서비스업 | 155 | 1.6%
확인 | 127 | 1.3%
도매업 | 121 | 1.2%
소매업 | 107 | 1.1%
(not recovered) | 99 | 1.0%
(not recovered) | 89 | 0.9%
운영업 | 76 | 0.8%
Other values (2532) | 7028 | 71.4%

Most occurring characters

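Value | Count | Frequency (%)
(space) | 6915 | 20.3%
(not recovered) | 2016 | 5.9%
(not recovered) | 1037 | 3.1%
(not recovered) | 978 | 2.9%
(not recovered) | 932 | 2.7%
(not recovered) | 878 | 2.6%
(not recovered) | 424 | 1.2%
(not recovered) | 413 | 1.2%
(not recovered) | 407 | 1.2%
, | 396 | 1.2%
Other values (584) | 19602 | 57.7%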

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 25802 | 75.9%
Space Separator | 6915 | 20.3%
Other Punctuation | 444 | 1.3%
Lowercase Letter | 299 | 0.9%
Uppercase Letter | 234 | 0.7%
Decimal Number | 97 | 0.3%
Close Punctuation | 70 | 0.2%
Open Punctuation | 70 | 0.2%
Connector Punctuation | 49 | 0.1%
Dash Punctuation | 17 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not recovered) | 2016 | 7.8%
(not recovered) | 1037 | 4.0%
(not recovered) | 978 | 3.8%
(not recovered) | 932 | 3.6%
(not recovered) | 878 | 3.4%
(not recovered) | 424 | 1.6%
(not recovered) | 413 | 1.6%
(not recovered) | 407 | 1.6%
(not recovered) | 375 | 1.5%
(not recovered) | 361 | 1.4%
Other values (518) | 17981 | 69.7%

Uppercase Letter
Value | Count | Frequency (%)
P | 27 | 11.5%
O | 25 | 10.7%
T | 22 | 9.4%
S | 20 | 8.5%
N | 19 | 8.1%
D | 17 | 7.3%
C | 12 | 5.1%
L | 11 | 4.7%
B | 10 | 4.3%
H | 10 | 4.3%
Other values (12) | 61 | 26.1%

Lowercase Letter
Value | Count | Frequency (%)
a | 43 | 14.4%
r | 41 | 13.7%
e | 29 | 9.7%
s | 22 | 7.4%
i | 18 | 6.0%
w | 15 | 5.0%
m | 14 | 4.7%
t | 14 | 4.7%
l | 13 | 4.3%
o | 13 | 4.3%
Other values (11) | 77 | 25.8%

Decimal Number
Value | Count | Frequency (%)
1 | 32 | 33.0%
0 | 19 | 19.6%
2 | 12 | 12.4%
3 | 10 | 10.3%
5 | 6 | 6.2%
6 | 5 | 5.2%
4 | 5 | 5.2%
7 | 4 | 4.1%
9 | 2 | 2.1%
8 | 2 | 2.1%

Other Punctuation
Value | Count | Frequency (%)
, | 396 | 89.2%
/ | 11 | 2.5%
· | 10 | 2.3%
; | 10 | 2.3%
: | 8 | 1.8%
. | 7 | 1.6%
& | 2 | 0.5%

Space Separator
Value | Count | Frequency (%)
(space) | 6915 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 70 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 70 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 49 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 17 | 100.0%

Other Symbol
Value | Count | Frequency (%)
(not recovered) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 25802 | 75.9%
Common | 7663 | 22.5%
Latin | 533 | 1.6%

Most frequent character per script

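Hangul
Value | Count | Frequency (%)
(not recovered) | 2016 | 7.8%
(not recovered) | 1037 | 4.0%
(not recovered) | 978 | 3.8%
(not recovered) | 932 | 3.6%
(not recovered) | 878 | 3.4%
(not recovered) | 424 | 1.6%
(not recovered) | 413 | 1.6%
(not recovered) | 407 | 1.6%
(not recovered) | 375 | 1.5%
(not recovered) | 361 | 1.4%
Other values (518) | 17981 | 69.7%

Latin
Value | Count | Frequency (%)
a | 43 | 8.1%
r | 41 | 7.7%
e | 29 | 5.4%
P | 27 | 5.1%
O | 25 | 4.7%
T | 22 | 4.1%
s | 22 | 4.1%
S | 20 | 3.8%
N | 19 | 3.6%
i | 18 | 3.4%
Other values (33) | 267 | 50.1%

Common
Value | Count | Frequency (%)
(space) | 6915 | 90.2%
, | 396 | 5.2%
) | 70 | 0.9%
( | 70 | 0.9%
_ | 49 | 0.6%
1 | 32 | 0.4%
0 | 19 | 0.2%
- | 17 | 0.2%
2 | 12 | 0.2%
/ | 11 | 0.1%
Other values (13) | 72 | 0.9%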

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 25742 | 75.7%
ASCII | 8185 | 24.1%
Compat Jamo | 60 | 0.2%
None | 10 | < 0.1%
CJK Compat | 1 | < 0.1%

Most frequent character per block

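ASCII
Value | Count | Frequency (%)
(space) | 6915 | 84.5%
, | 396 | 4.8%
) | 70 | 0.9%
( | 70 | 0.9%
_ | 49 | 0.6%
a | 43 | 0.5%
r | 41 | 0.5%
1 | 32 | 0.4%
e | 29 | 0.4%
P | 27 | 0.3%
Other values (54) | 513 | 6.3%

Hangul
Value | Count | Frequency (%)
(not recovered) | 2016 | 7.8%
(not recovered) | 1037 | 4.0%
(not recovered) | 978 | 3.8%
(not recovered) | 932 | 3.6%
(not recovered) | 878 | 3.4%
(not recovered) | 424 | 1.6%
(not recovered) | 413 | 1.6%
(not recovered) | 407 | 1.6%
(not recovered) | 375 | 1.5%
(not recovered) | 361 | 1.4%
Other values (517) | 17921 | 69.6%

Compat Jamo
Value | Count | Frequency (%)
(not recovered) | 60 | 100.0%

None
Value | Count | Frequency (%)
· | 10 | 100.0%

CJK Compat
Value | Count | Frequency (%)
(not recovered) | 1 | 100.0%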
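
Python's standard library does not expose Unicode block names, so block counts like the ones above need a lookup table. The sketch below is an approximation under assumptions: the `BLOCKS` table lists only the blocks named in this report, its labels copy the report's short names, the fallback label "Other" is hypothetical, and the helper names are invented for illustration. The code point ranges are the standard ones for Basic Latin, Hangul Compatibility Jamo, CJK Compatibility and Hangul Syllables.

```python
from collections import Counter

# (first code point, last code point, label) for the blocks seen in this report.
BLOCKS = [
    (0x0000, 0x007F, "ASCII"),        # Basic Latin
    (0x3130, 0x318F, "Compat Jamo"),  # Hangul Compatibility Jamo
    (0x3300, 0x33FF, "CJK Compat"),   # CJK Compatibility
    (0xAC00, 0xD7AF, "Hangul"),       # Hangul Syllables
]


def block_of(ch):
    """Return the block label for one character, or "Other" if it is not listed."""
    cp = ord(ch)
    for start, end, name in BLOCKS:
        if start <= cp <= end:
            return name
    return "Other"  # hypothetical fallback label


def block_counts(values):
    """Count characters per block across all strings."""
    return Counter(block_of(ch) for value in values for ch in value)


names = ["조명 및 전열설비", "VPN"]  # two values from the report's sample rows
counts = block_counts(names)
print(counts)
```

The first sample value contributes seven Hangul syllables and two spaces, and the second adds three ASCII letters. The middle dot (U+00B7) sits outside all four ranges, which is consistent with the report filing it under a separate label.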

Correlations

# | 코드종류 | 코드종류명
코드종류 | 1.000 | 1.000
코드종류명 | 1.000 | 1.000
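
The report does not state which coefficient fills this matrix. One common association measure for a pair of categorical columns is Cramér's V, and the sketch below assumes that measure purely for illustration; the function name `cramers_v` is hypothetical and no bias correction is applied. The eight value pairs are taken from the report's sample rows.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long lists of category labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association information
    return sqrt(chi2 / (n * k))


code_types = ["RV10016"] * 3 + ["CD10010"] * 3 + ["MS10001"] * 2
code_names = (
    ["사업전후_구분_코드"] * 3
    + ["주소별_좌표_주소구분코드"] * 3
    + ["측정망 SMS발송코드"] * 2
)
v = cramers_v(code_types, code_names)
print(round(v, 3))
```

Because each code type in the sample maps to exactly one name, the association is perfect and the value is 1 up to floating-point rounding, which matches the 1.000 entries in the matrix.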

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

# | 코드종류 | 코드종류명 | 세부코드 | 세부코드명
0 | MS10003 | 측정소_운영점검일지_점검_항목_코드 | B6 | 채수장비(펌프, 맨홀, 켄틸러버) 정상작동 확인
1 | RV10013 | 비용_구분_코드 | 99 | 기타(부대비용 등)
2 | RV10016 | 사업전후_구분_코드 | 1 | 사업전
3 | RV10016 | 사업전후_구분_코드 | 2 | 공사중
4 | RV10016 | 사업전후_구분_코드 | 3 | 사업후
5 | CD10010 | 주소별_좌표_주소구분코드 | 1 | 주소별좌표_시도
6 | CD10010 | 주소별_좌표_주소구분코드 | 2 | 주소별좌표_시군구
7 | CD10010 | 주소별_좌표_주소구분코드 | 3 | 주소별좌표_읍면동
8 | MS10001 | 측정망 SMS발송코드 | 1 | 강우개시
9 | MS10001 | 측정망 SMS발송코드 | 2 | 자동채수시작

# | 코드종류 | 코드종류명 | 세부코드 | 세부코드명
2923 | NC10010 | 유관기관_협력기관_회원구분 | GB04 | 유관기관
2924 | MS10004 | 장비_종류_코드 | EQ06 | 시료여과필터
2925 | MS10004 | 장비_종류_코드 | EQ08 | 강우설량계
2926 | MS10004 | 장비_종류_코드 | EQ09 | 데이터로거
2927 | MS10004 | 장비_종류_코드 | EQ22 | 저류수조
2928 | MS10004 | 장비_종류_코드 | EQ23 | 의자
2929 | MS10004 | 장비_종류_코드 | EQ14 | 채수펌프B
2930 | MS10004 | 장비_종류_코드 | EQ12 | VPN
2931 | MS10004 | 장비_종류_코드 | EQ13 | 채수펌프A
2932 | MS10004 | 장비_종류_코드 | EQ15 | 조명 및 전열설비