Overview

Dataset statistics

Number of variables: 5
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.0 KiB
Average record size in memory: 41.3 B

Variable types

DateTime: 1
Categorical: 3
Text: 1

Alerts

측정일 has constant value "2021-01-04" (Constant)
측정지점 has constant value "정수지" (Constant)
수치 is highly imbalanced (50.4%) (Imbalance)

Reproduction

Analysis started: 2023-12-10 10:18:49.753908
Analysis finished: 2023-12-10 10:18:50.324591
Duration: 0.57 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

측정일 (measurement date)
Date

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2021-01-04 00:00:00
Maximum: 2021-01-04 00:00:00
Histogram with fixed size bins (bins=1)

시설명 (facility name)
Categorical

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
고산정수장 (Gosan water treatment plant): 59
공주정수장 (Gongju water treatment plant): 41

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 공주정수장
2nd row: 공주정수장
3rd row: 고산정수장
4th row: 공주정수장
5th row: 공주정수장

Common Values

Value         Count   Frequency (%)
고산정수장    59      59.0%
공주정수장    41      41.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot: common values of 시설명; the counts repeat the Common Values table]

측정지점 (measurement point)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
정수지 (clear well): 100

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 정수지
2nd row: 정수지
3rd row: 정수지
4th row: 정수지
5th row: 정수지

Common Values

Value     Count   Frequency (%)
정수지    100     100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot: common values of 측정지점; the counts repeat the Common Values table]

검사항목 (test item)
Text

Distinct: 59
Distinct (%): 59.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 17
Median length: 12
Mean length: 5.05
Min length: 1

Characters and Unicode

Total characters: 505
Distinct characters: 117
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
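
As a rough illustration of how such per-category counts can be obtained, the sketch below uses Python's standard `unicodedata` module on two 검사항목 values taken from the sample rows of this report. The helper name `category_counts` and the `CATEGORY_NAMES` mapping are made up for this example, and the snippet is not necessarily how the profiling tool computes its figures.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for text in values:
        for ch in text:
            code = unicodedata.category(ch)
            # Fall back to the two-letter code for categories not listed above.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Two 검사항목 values copied from the sample rows (rows 8 and 92).
sample = ["1.1-디클로로에틸렌", "세제(음이온계면활성제)"]
counts = category_counts(sample)
for name, n in counts.most_common():
    print(name, n)
```

Hangul syllables fall under "Other Letter", which is why that category dominates the tables that follow.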

Unique

Unique: 18
Unique (%): 18.0%

Sample

1st row: 대장균
2nd row: 질산성질소
3rd row: 파라티온
4th row: 다이아지논
5th row: 사염화탄소

Value                 Count   Frequency (%)
대장균                2       2.0%
과망간산칼륨소비량    2       2.0%
구리                  2       2.0%
수은                  2       2.0%
암모니아성질소        2       2.0%
질산성질소            2       2.0%
디클로로메탄          2       2.0%
총대장균군            2       2.0%
(not rendered)        2       2.0%
ph                    2       2.0%
Other values (49)     80      80.0%

Most occurring characters

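Value                 Count   Frequency (%)
(not rendered)        49      9.7%
(not rendered)        19      3.8%
(not rendered)        16      3.2%
1                     14      2.8%
(not rendered)        14      2.8%
(not rendered)        13      2.6%
-                     12      2.4%
(not rendered)        12      2.4%
(not rendered)        11      2.2%
(not rendered)        11      2.2%
Other values (107)    334     66.1%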

Most occurring categories

Value                Count   Frequency (%)
Other Letter         455     90.1%
Decimal Number       20      4.0%
Dash Punctuation     12      2.4%
Other Punctuation    10      2.0%
Lowercase Letter     2       0.4%
Uppercase Letter     2       0.4%
Open Punctuation     2       0.4%
Close Punctuation    2       0.4%

Most frequent character per category

Other Letter
Value                Count   Frequency (%)
(not rendered)       49      10.8%
(not rendered)       19      4.2%
(not rendered)       16      3.5%
(not rendered)       14      3.1%
(not rendered)       13      2.9%
(not rendered)       12      2.6%
(not rendered)       11      2.4%
(not rendered)       11      2.4%
(not rendered)       10      2.2%
(not rendered)       10      2.2%
Other values (97)    290     63.7%

Decimal Number
Value   Count   Frequency (%)
1       14      70.0%
4       2       10.0%
2       2       10.0%
3       2       10.0%

Dash Punctuation
Value   Count   Frequency (%)
-       12      100.0%

Other Punctuation
Value   Count   Frequency (%)
.       10      100.0%

Lowercase Letter
Value   Count   Frequency (%)
p       2       100.0%

Uppercase Letter
Value   Count   Frequency (%)
H       2       100.0%

Open Punctuation
Value   Count   Frequency (%)
(       2       100.0%

Close Punctuation
Value   Count   Frequency (%)
)       2       100.0%

Most occurring scripts

Value     Count   Frequency (%)
Hangul    455     90.1%
Common    46      9.1%
Latin     4       0.8%

Most frequent character per script

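Hangul
Value                Count   Frequency (%)
(not rendered)       49      10.8%
(not rendered)       19      4.2%
(not rendered)       16      3.5%
(not rendered)       14      3.1%
(not rendered)       13      2.9%
(not rendered)       12      2.6%
(not rendered)       11      2.4%
(not rendered)       11      2.4%
(not rendered)       10      2.2%
(not rendered)       10      2.2%
Other values (97)    290     63.7%

Common
Value   Count   Frequency (%)
1       14      30.4%
-       12      26.1%
.       10      21.7%
4       2       4.3%
(       2       4.3%
)       2       4.3%
2       2       4.3%
3       2       4.3%

Latin
Value   Count   Frequency (%)
p       2       50.0%
H       2       50.0%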

Most occurring blocks

Value     Count   Frequency (%)
Hangul    455     90.1%
ASCII     50      9.9%

Most frequent character per block

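Hangul
Value                Count   Frequency (%)
(not rendered)       49      10.8%
(not rendered)       19      4.2%
(not rendered)       16      3.5%
(not rendered)       14      3.1%
(not rendered)       13      2.9%
(not rendered)       12      2.6%
(not rendered)       11      2.4%
(not rendered)       11      2.4%
(not rendered)       10      2.2%
(not rendered)       10      2.2%
Other values (97)    290     63.7%

ASCII
Value   Count   Frequency (%)
1       14      28.0%
-       12      24.0%
.       10      20.0%
4       2       4.0%
p       2       4.0%
H       2       4.0%
(       2       4.0%
)       2       4.0%
2       2       4.0%
3       2       4.0%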

수치 (measured value)
Categorical

IMBALANCE

Distinct: 28
Distinct (%): 28.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
불검출 (not detected): 68
없음 (none): 4
0: 2
0.004: 2
1.2: 1
Other values (23): 23

Length

Max length: 6
Median length: 3
Mean length: 3.14
Min length: 1

Unique

Unique: 24
Unique (%): 24.0%
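
The two counters are easy to confuse: "Distinct" counts the different values in the column, while "Unique" counts the values that occur exactly once. The sketch below rebuilds the 수치 value counts from the figures in this report (four repeated values plus 24 values seen once) and recomputes both numbers. The `singleton_*` labels are hypothetical placeholders, since the report does not list every single-occurrence value.

```python
from collections import Counter

# Value counts for 수치 as reported: four repeated values ...
value_counts = Counter({"불검출": 68, "없음": 4, "0": 2, "0.004": 2})
# ... plus 24 values that occur once (placeholder labels, not real data).
value_counts.update({f"singleton_{i}": 1 for i in range(24)})

n_rows = sum(value_counts.values())                          # observations
n_distinct = len(value_counts)                               # different values
n_unique = sum(1 for c in value_counts.values() if c == 1)   # seen exactly once

print(n_rows, n_distinct, n_unique)  # 100 28 24
```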

Sample

1st row: 불검출
2nd row: 1.2
3rd row: 불검출
4th row: 불검출
5th row: 불검출

Common Values

Value                Count   Frequency (%)
불검출               68      68.0%
없음                 4       4.0%
0                    2       2.0%
0.004                2       2.0%
1.2                  1       1.0%
82                   1       1.0%
0.003                1       1.0%
0.64                 1       1.0%
0.019                1       1.0%
0.06                 1       1.0%
Other values (18)    18      18.0%
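
The 50.4% imbalance figure reported for 수치 can be reproduced from these counts. The sketch below assumes the score is one minus the normalized Shannon entropy of the value counts; that definition is an assumption made for this example, although it does yield the reported figure. The function name `imbalance_score` is made up here.

```python
import math


def imbalance_score(counts):
    """Return 1 - H/log2(k): 0 for a uniform column, near 1 when one value dominates."""
    n_classes = len(counts)
    if n_classes <= 1:
        return 0.0  # a single class has no imbalance to measure
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# 68, 4, 2, 2 from the table above, plus 24 values that occur once
# (the six listed singletons and the 18 folded into "Other values").
counts = [68, 4, 2, 2] + [1] * 24
score = imbalance_score(counts)
print(f"{score:.1%}")  # 50.4%
```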

Length

Histogram of lengths of the category

Common Values (Plot)

Value                Count   Frequency (%)
불검출               68      68.0%
없음                 4       4.0%
0                    2       2.0%
0.004                2       2.0%
6                    1       1.0%
7.6                  1       1.0%
9.1                  1       1.0%
0.0012               1       1.0%
3.4                  1       1.0%
7.1                  1       1.0%
Other values (18)    18      18.0%

Correlations

            시설명    검사항목    수치
시설명      1.000     0.000       0.000
검사항목    0.000     1.000       0.927
수치        0.000     0.927       1.000

          수치     시설명
수치      1.000    0.000
시설명    0.000    1.000

          시설명    수치
시설명    1.000     0.000
수치      0.000     1.000
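
The matrices above do not say which association measure was used. Purely as an illustration, Cramér's V is one common coefficient for a pair of categorical columns. The sketch below computes the plain (uncorrected) version with the standard library; the function name `cramers_v` and the toy inputs are hypothetical, and the report's own figures may come from a different or bias-corrected measure.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long sequences of labels."""
    n = len(xs)
    row = Counter(xs)             # marginal counts of the first column
    col = Counter(ys)             # marginal counts of the second column
    joint = Counter(zip(xs, ys))  # observed counts per (x, y) cell
    k = min(len(row), len(col)) - 1
    if n == 0 or k == 0:
        return 0.0  # undefined when a column is empty or constant; report 0 here
    chi2 = 0.0
    for x, rx in row.items():
        for y, cy in col.items():
            expected = rx * cy / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))


# Toy data, not taken from the report.
perfect = cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"])
independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(perfect, independent)
```

A value near 1 means one column almost determines the other, and a value near 0 means the two columns look independent.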

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

      측정일        시설명        측정지점    검사항목                  수치
0     2021-01-04    공주정수장    정수지      대장균                    불검출
1     2021-01-04    공주정수장    정수지      질산성질소                1.2
2     2021-01-04    고산정수장    정수지      파라티온                  불검출
3     2021-01-04    공주정수장    정수지      다이아지논                불검출
4     2021-01-04    공주정수장    정수지      사염화탄소                불검출
5     2021-01-04    공주정수장    정수지      증발잔류물                82
6     2021-01-04    고산정수장    정수지      크실렌                    불검출
7     2021-01-04    고산정수장    정수지      트리클로로에틸렌          불검출
8     2021-01-04    공주정수장    정수지      1.1-디클로로에틸렌        불검출
9     2021-01-04    공주정수장    정수지      냄새                      없음

      측정일        시설명        측정지점    검사항목                  수치
90    2021-01-04    고산정수장    정수지      사염화탄소                불검출
91    2021-01-04    고산정수장    정수지      셀레늄                    불검출
92    2021-01-04    고산정수장    정수지      세제(음이온계면활성제)    불검출
93    2021-01-04    고산정수장    정수지      색도                      불검출
94    2021-01-04    고산정수장    정수지      암모니아성질소            불검출
95    2021-01-04    고산정수장    정수지      염소이온                  9.1
96    2021-01-04    고산정수장    정수지      알루미늄                  불검출
97    2021-01-04    고산정수장    정수지      질산성질소                1.0
98    2021-01-04    공주정수장    정수지      총대장균군                불검출
99    2021-01-04    공주정수장    정수지      (not rendered)            불검출