Overview

Dataset statistics

Number of variables: 5
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.0 KiB
Average record size in memory: 41.3 B
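
The counts above can be recomputed from the raw rows. The following is a minimal sketch using only the standard library, not the profiler's own code: the function name `overview` is hypothetical, the three rows are taken from the Sample section of this report, and counting every repeat after the first occurrence as a duplicate is an assumption.

```python
from collections import Counter

def overview(rows, columns):
    """Return basic dataset statistics for a list of dict rows (hypothetical helper)."""
    n_obs = len(rows)
    n_cells = n_obs * len(columns)
    # A cell counts as missing when its value is None.
    missing = sum(1 for row in rows for col in columns if row.get(col) is None)
    # Assumption: a row is a duplicate when an identical row appeared earlier.
    counts = Counter(tuple(row.get(col) for col in columns) for row in rows)
    duplicates = sum(c - 1 for c in counts.values())
    return {
        "variables": len(columns),
        "observations": n_obs,
        "missing_cells": missing,
        "missing_cells_pct": 100.0 * missing / n_cells if n_cells else 0.0,
        "duplicate_rows": duplicates,
        "duplicate_rows_pct": 100.0 * duplicates / n_obs if n_obs else 0.0,
    }

columns = ["측정일", "시설명", "측정지점", "검사항목", "수치"]
rows = [
    dict(zip(columns, ["2021-02-01", "고양정수장", "정수지", "1.4-다이옥산", "불검출"])),
    dict(zip(columns, ["2021-02-01", "고양정수장", "정수지", "질산성질소", "2.3"])),
    dict(zip(columns, ["2021-02-01", "고산정수장", "정수지", "아연", "불검출"])),
]
stats = overview(rows, columns)
print(stats)
```

Run on the full 100-row dataset, the same function would be expected to report 5 variables and 100 observations, matching the table.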

Variable types

DateTime: 1
Categorical: 3
Text: 1

Alerts

측정일 (measurement date) has constant value "" (Constant)
측정지점 (measurement point) has constant value "" (Constant)
수치 (measured value) is highly imbalanced (51.6%) (Imbalance)
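
Both alert types can be reproduced by hand. The sketch below assumes the imbalance score is one minus the Shannon entropy of the value counts divided by its maximum, log2 of the number of classes; the report does not state the formula, but this definition gives 51.6% for the 수치 counts listed later in the report (69, 4, 2, and 25 values that occur once). The function names are hypothetical.

```python
import math

def imbalance_score(counts):
    """Assumed definition: 1 - H(counts) / log2(number of classes)."""
    n = sum(counts)
    k = len(counts)
    if k <= 1:
        return 0.0  # assumption: a single class is reported as not imbalanced
    entropy = -sum((c / n) * math.log2(c / n) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

def is_constant(values):
    """A column is constant when it holds exactly one distinct value."""
    return len(set(values)) == 1

# Value counts of 수치: 불검출 69, 없음 4, "0" 2, plus 25 values seen once.
value_counts = [69, 4, 2] + [1] * 25
score = imbalance_score(value_counts)
print(f"{score:.1%}")

print(is_constant(["정수지"] * 100))
```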

Reproduction

Analysis started: 2023-12-10 10:18:45.768603
Analysis finished: 2023-12-10 10:18:46.420501
Duration: 0.65 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
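
The reported duration follows directly from the two timestamps. A small check, using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction table.
started = datetime.fromisoformat("2023-12-10 10:18:45.768603")
finished = datetime.fromisoformat("2023-12-10 10:18:46.420501")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```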

Variables

측정일 (measurement date)
Date

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2021-02-01 00:00:00
Maximum: 2021-02-01 00:00:00
Histogram with fixed size bins (bins=1)

시설명 (facility name)
Categorical

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

고산정수장 (Gosan water treatment plant): 59
고양정수장 (Goyang water treatment plant): 41

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%
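
The figures in this report are consistent with "Distinct" counting the different values in a column and "Unique" counting the values that occur exactly once. A minimal sketch under that reading; the helper name is hypothetical, and the 25 single-occurrence 수치 values are replaced by placeholder strings because the report does not list all of them:

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of distinct values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# 시설명: two plants, 59 and 41 rows.
facility = ["고산정수장"] * 59 + ["고양정수장"] * 41
print(distinct_and_unique(facility))

# 수치: 불검출 69, 없음 4, "0" 2, plus 25 placeholder values seen once each.
measured = ["불검출"] * 69 + ["없음"] * 4 + ["0"] * 2 + [f"value_{i}" for i in range(25)]
print(distinct_and_unique(measured))
```

Under this reading, 시설명 has 2 distinct and 0 unique values, and 수치 has 28 distinct and 25 unique values, matching the tables.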

Sample

1st row: 고양정수장
2nd row: 고양정수장
3rd row: 고산정수장
4th row: 고양정수장
5th row: 고양정수장

Common Values

Value        Count  Frequency (%)
고산정수장   59     59.0%
고양정수장   41     41.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of the value counts listed under Common Values above]

측정지점 (measurement point)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

정수지 (clear well): 100

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 정수지
2nd row: 정수지
3rd row: 정수지
4th row: 정수지
5th row: 정수지

Common Values

Value    Count  Frequency (%)
정수지   100    100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of the value counts listed under Common Values above]

검사항목 (test item)
Text

Distinct: 59
Distinct (%): 59.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 17
Median length: 12
Mean length: 5.05
Min length: 1

Characters and Unicode

Total characters: 505
Distinct characters: 117
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
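
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for a few 검사항목 values taken from this report and maps the two-letter codes to the long names used in the tables. The standard library has no script property, so the Hangul check is a simple range test on the Hangul Syllables block (U+AC00 to U+D7A3); the helper names are hypothetical.

```python
import unicodedata
from collections import Counter

# Long names for the eight categories that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter", "Nd": "Decimal Number",
    "Pd": "Dash Punctuation", "Po": "Other Punctuation",
    "Ps": "Open Punctuation", "Pe": "Close Punctuation",
    "Lu": "Uppercase Letter", "Ll": "Lowercase Letter",
}

def category_counts(values):
    """Count Unicode general categories over all characters of all values."""
    return Counter(unicodedata.category(ch) for text in values for ch in text)

def is_hangul_syllable(ch):
    """True when ch lies in the Hangul Syllables block."""
    return "\uac00" <= ch <= "\ud7a3"

samples = ["1.4-다이옥산", "질산성질소", "아연", "pH"]
counts = category_counts(samples)
for code, count in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
```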

Unique

Unique: 18
Unique (%): 18.0%

Sample

1st row: 1.4-다이옥산 (1,4-dioxane)
2nd row: 질산성질소 (nitrate nitrogen)
3rd row: 아연 (zinc)
4th row: 1.2-디브로모-3-클로로프로판 (1,2-dibromo-3-chloropropane)
5th row: 브롬산염 (bromate)
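
Value                                         Count  Frequency (%)
1.4-다이옥산 (1,4-dioxane)                    2      2.0%
디클로로아세토니트릴 (dichloroacetonitrile)   2      2.0%
(blank)                                       2      2.0%
(blank)                                       2      2.0%
디브로모아세토니트릴 (dibromoacetonitrile)    2      2.0%
사염화탄소 (carbon tetrachloride)             2      2.0%
셀레늄 (selenium)                             2      2.0%
질산성질소 (nitrate nitrogen)                 2      2.0%
색도 (color)                                  2      2.0%
대장균 (E. coli)                              2      2.0%
Other values (49)                             80     80.0%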

Most occurring characters

Value               Count  Frequency (%)
(blank)             49     9.7%
(blank)             19     3.8%
(blank)             16     3.2%
1                   14     2.8%
(blank)             14     2.8%
(blank)             13     2.6%
(blank)             12     2.4%
-                   12     2.4%
(blank)             11     2.2%
(blank)             11     2.2%
Other values (107)  334    66.1%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       455    90.1%
Decimal Number     20     4.0%
Dash Punctuation   12     2.4%
Other Punctuation  10     2.0%
Open Punctuation   2      0.4%
Uppercase Letter   2      0.4%
Close Punctuation  2      0.4%
Lowercase Letter   2      0.4%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
(blank)            49     10.8%
(blank)            19     4.2%
(blank)            16     3.5%
(blank)            14     3.1%
(blank)            13     2.9%
(blank)            12     2.6%
(blank)            11     2.4%
(blank)            11     2.4%
(blank)            10     2.2%
(blank)            10     2.2%
Other values (97)  290    63.7%

Decimal Number
Value  Count  Frequency (%)
1      14     70.0%
2      2      10.0%
3      2      10.0%
4      2      10.0%

Dash Punctuation
Value  Count  Frequency (%)
-      12     100.0%

Other Punctuation
Value  Count  Frequency (%)
.      10     100.0%

Open Punctuation
Value  Count  Frequency (%)
(      2      100.0%

Uppercase Letter
Value  Count  Frequency (%)
H      2      100.0%

Close Punctuation
Value  Count  Frequency (%)
)      2      100.0%

Lowercase Letter
Value  Count  Frequency (%)
p      2      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  455    90.1%
Common  46     9.1%
Latin   4      0.8%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
(blank)            49     10.8%
(blank)            19     4.2%
(blank)            16     3.5%
(blank)            14     3.1%
(blank)            13     2.9%
(blank)            12     2.6%
(blank)            11     2.4%
(blank)            11     2.4%
(blank)            10     2.2%
(blank)            10     2.2%
Other values (97)  290    63.7%

Common
Value  Count  Frequency (%)
1      14     30.4%
-      12     26.1%
.      10     21.7%
(      2      4.3%
)      2      4.3%
2      2      4.3%
3      2      4.3%
4      2      4.3%

Latin
Value  Count  Frequency (%)
H      2      50.0%
p      2      50.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  455    90.1%
ASCII   50     9.9%

Most frequent character per block

Hangul
Value              Count  Frequency (%)
(blank)            49     10.8%
(blank)            19     4.2%
(blank)            16     3.5%
(blank)            14     3.1%
(blank)            13     2.9%
(blank)            12     2.6%
(blank)            11     2.4%
(blank)            11     2.4%
(blank)            10     2.2%
(blank)            10     2.2%
Other values (97)  290    63.7%

ASCII
Value  Count  Frequency (%)
1      14     28.0%
-      12     24.0%
.      10     20.0%
(      2      4.0%
H      2      4.0%
)      2      4.0%
p      2      4.0%
2      2      4.0%
3      2      4.0%
4      2      4.0%

수치 (measured value)
Categorical

IMBALANCE

Distinct: 28
Distinct (%): 28.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

불검출 (not detected): 69
없음 (none): 4
0: 2
1.9: 1
128: 1
Other values (23): 23

Length

Max length: 6
Median length: 3
Mean length: 3.12
Min length: 1

Unique

Unique: 25
Unique (%): 25.0%

Sample

1st row: 불검출 (not detected)
2nd row: 2.3
3rd row: 불검출
4th row: 불검출
5th row: 불검출

Common Values

Value                  Count  Frequency (%)
불검출 (not detected)  69     69.0%
없음 (none)            4      4.0%
0                      2      2.0%
1.9                    1      1.0%
128                    1      1.0%
0.004                  1      1.0%
0.70                   1      1.0%
3.1                    1      1.0%
0.003                  1      1.0%
45                     1      1.0%
Other values (18)      18     18.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value                  Count  Frequency (%)
불검출 (not detected)  69     69.0%
없음 (none)            4      4.0%
0                      2      2.0%
0.01                   1      1.0%
0.0007                 1      1.0%
7.6                    1      1.0%
0.017                  1      1.0%
0.013                  1      1.0%
0.0014                 1      1.0%
0.016                  1      1.0%
Other values (18)      18     18.0%

Correlations

           시설명  검사항목  수치
시설명     1.000   0.000     0.000
검사항목   0.000   1.000     0.903
수치       0.000   0.903     1.000

         수치    시설명
수치     1.000   0.000
시설명   0.000   1.000

         시설명  수치
시설명   1.000   0.000
수치     0.000   1.000
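
The report does not say which measure produced these coefficients. One common choice for pairs of categorical columns is Cramér's V, which is derived from the chi-squared statistic of the contingency table. The sketch below is the plain, uncorrected form, written only to illustrate the kind of computation involved; it is an assumption that it relates to the values above, and a bias-corrected variant would give different numbers.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    chi2 = 0.0
    for x, rx in row_totals.items():
        for y, cy in col_totals.items():
            expected = rx * cy / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association
    return math.sqrt(chi2 / (n * k))

# Toy inputs (hypothetical, not from the dataset).
perfect = cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"])
independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(perfect, independent)
```

A value near 1 indicates that one column almost determines the other, which is one plausible reading of the 0.903 between 검사항목 and 수치: most test items map to a single reported value such as 불검출.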

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
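
The data behind both plots is simple to derive. A minimal sketch, with hypothetical helper names: the per-column count feeds the bar view, and the boolean matrix (True where a value is present) feeds the nullity matrix. The second row is invented with a missing value purely to show the effect; the profiled dataset itself has no missing cells.

```python
def null_counts(rows, columns):
    """Number of missing (None) values per column."""
    return {col: sum(1 for row in rows if row.get(col) is None) for col in columns}

def nullity_matrix(rows, columns):
    """One list per row; True where the cell holds a value."""
    return [[row.get(col) is not None for col in columns] for row in rows]

columns = ["측정일", "시설명", "측정지점", "검사항목", "수치"]
rows = [
    {"측정일": "2021-02-01", "시설명": "고양정수장", "측정지점": "정수지",
     "검사항목": "질산성질소", "수치": "2.3"},
    # Hypothetical row with a missing 수치, not present in the real data.
    {"측정일": "2021-02-01", "시설명": "고산정수장", "측정지점": "정수지",
     "검사항목": "아연", "수치": None},
]
counts = null_counts(rows, columns)
matrix = nullity_matrix(rows, columns)
print(counts)
print(matrix)
```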

Sample

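First rows

    측정일      시설명      측정지점  검사항목                                                수치
0   2021-02-01  고양정수장  정수지    1.4-다이옥산 (1,4-dioxane)                              불검출 (not detected)
1   2021-02-01  고양정수장  정수지    질산성질소 (nitrate nitrogen)                           2.3
2   2021-02-01  고산정수장  정수지    아연 (zinc)                                             불검출
3   2021-02-01  고양정수장  정수지    1.2-디브로모-3-클로로프로판 (1,2-dibromo-3-chloropropane)  불검출
4   2021-02-01  고양정수장  정수지    브롬산염 (bromate)                                      불검출
5   2021-02-01  고양정수장  정수지    증발잔류물 (evaporation residue)                        128
6   2021-02-01  고산정수장  정수지    디클로로메탄 (dichloromethane)                          불검출
7   2021-02-01  고산정수장  정수지    시안 (cyanide)                                          불검출
8   2021-02-01  고산정수장  정수지    크실렌 (xylene)                                         불검출
9   2021-02-01  고양정수장  정수지    1.1-디클로로에틸렌 (1,1-dichloroethylene)               불검출

Last rows

    측정일      시설명      측정지점  검사항목                                수치
90  2021-02-01  고산정수장  정수지    파라티온 (parathion)                    불검출
91  2021-02-01  고산정수장  정수지    페놀 (phenol)                           불검출
92  2021-02-01  고산정수장  정수지    포름알데히드 (formaldehyde)             불검출
93  2021-02-01  고산정수장  정수지    할로아세틱에시드 (haloacetic acid)      0.017
94  2021-02-01  고양정수장  정수지    pH                                      7.6
95  2021-02-01  고양정수장  정수지    경도 (hardness)                         67
96  2021-02-01  고양정수장  정수지    구리 (copper)                           불검출
97  2021-02-01  고양정수장  정수지    (blank)                                 불검출
98  2021-02-01  고양정수장  정수지    총대장균군 (total coliforms)            불검출
99  2021-02-01  고양정수장  정수지    (blank)                                 불검출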
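
The 수치 column mixes numeric strings with the labels 불검출 (not detected) and 없음 (none), which is why it is profiled as Categorical. One possible way to separate the two, sketched here as an assumption rather than as part of the report's pipeline, is to return a numeric value and a label side by side; the function name and the English label strings are hypothetical.

```python
# Non-numeric labels seen in this report, with English glosses.
NON_NUMERIC = {"불검출": "not detected", "없음": "none"}

def parse_measurement(raw):
    """Return (value, label): a float for numeric strings, a label otherwise."""
    text = raw.strip()
    if text in NON_NUMERIC:
        return None, NON_NUMERIC[text]
    try:
        return float(text), None
    except ValueError:
        # Anything else is kept as an unrecognized label rather than guessed at.
        return None, text

for raw in ["불검출", "2.3", "128", "없음", "0.0007"]:
    print(raw, parse_measurement(raw))
```

Keeping the label alongside the number avoids silently turning "not detected" into zero, which would be a different claim about the measurement.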