Overview

Dataset statistics

Number of variables: 5
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.0 KiB
Average record size in memory: 41.3 B

Variable types

DateTime: 1
Categorical: 3
Text: 1

Alerts

측정일 has constant value "2020-12-01" (Constant)
측정지점 has constant value "정수지" (Constant)
수치 is highly imbalanced (50.3%) (Imbalance)
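
The 50.3% imbalance figure for 수치 can be reproduced from the value counts listed for that variable later in this report. The sketch below assumes the score is one minus the Shannon entropy of the class distribution divided by the logarithm of the number of classes, which is how ydata-profiling appears to compute it. The function name `imbalance_score` is illustrative and not part of the library.

```python
import math

# Value counts for 수치 as listed in this report: five repeated values
# (68, 4, 2, 2, 2) plus 22 values that occur once each (27 classes, 100 rows).
counts = [68, 4, 2, 2, 2] + [1] * 22


def imbalance_score(counts):
    """Return 1 - H / log(k): 0 for a uniform distribution, rising toward 1
    as a single class dominates. Assumed definition, see the note above."""
    total = sum(counts)
    k = len(counts)
    if k <= 1:
        return 0.0  # assumption: a single class is not scored as imbalanced
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)


score = imbalance_score(counts)
print(f"{score:.1%}")  # prints 50.3%
```

The logarithm base cancels in the ratio, so natural logarithms give the same score as base 2.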

Reproduction

Analysis started: 2023-12-10 10:18:53.519401
Analysis finished: 2023-12-10 10:18:54.034733
Duration: 0.52 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
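
As a quick consistency check, the reported duration follows from the two timestamps above. A minimal sketch using only the Python standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block of this report.
started = datetime.fromisoformat("2023-12-10 10:18:53.519401")
finished = datetime.fromisoformat("2023-12-10 10:18:54.034733")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # prints 0.52 seconds
```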

Variables

측정일 (measurement date)
Date

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Minimum: 2020-12-01 00:00:00
Maximum: 2020-12-01 00:00:00
Histogram with fixed size bins (bins=1)

시설명 (facility name)
Categorical

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
고산정수장: 59
곤명정수장: 41

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 곤명정수장
2nd row: 곤명정수장
3rd row: 고산정수장
4th row: 곤명정수장
5th row: 곤명정수장

Common Values

Value | Count | Frequency (%)
고산정수장 | 59 | 59.0%
곤명정수장 | 41 | 41.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
고산정수장 | 59 | 59.0%
곤명정수장 | 41 | 41.0%

측정지점 (measurement point)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
정수지: 100

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 정수지
2nd row: 정수지
3rd row: 정수지
4th row: 정수지
5th row: 정수지

Common Values

Value | Count | Frequency (%)
정수지 | 100 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
정수지 | 100 | 100.0%

검사항목 (test item)
Text

Distinct: 59
Distinct (%): 59.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 17
Median length: 12
Mean length: 5.05
Min length: 1

Characters and Unicode

Total characters: 505
Distinct characters: 117
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
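
As a rough illustration of those character properties, Python's standard `unicodedata` module reports the general category of each code point. The sketch below applies it to one sample value from this column. The mapping from two-letter codes to the long names used in this report is written out by hand and covers only the categories that occur in the sample.

```python
import unicodedata
from collections import Counter

# One sample value from this column (see the Sample block in this section).
sample = "1.4-다이옥산"

# Hand-written mapping from general-category codes to the long names
# used in this report; only the codes present in the sample are listed.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
}

category_counts = Counter(unicodedata.category(ch) for ch in sample)

for code, count in category_counts.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
```

Hangul syllables fall under Other Letter, which is why that category dominates the tables that follow.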

Unique

Unique: 18
Unique (%): 18.0%

Sample

1st row: 1.4-다이옥산
2nd row: 질산성질소
3rd row: 아연
4th row: 1.2-디브로모-3-클로로프로판
5th row: 브롬산염

Value | Count | Frequency (%)
1.4-다이옥산 | 2 | 2.0%
디클로로아세토니트릴 | 2 | 2.0%
 | 2 | 2.0%
 | 2 | 2.0%
디브로모아세토니트릴 | 2 | 2.0%
사염화탄소 | 2 | 2.0%
셀레늄 | 2 | 2.0%
질산성질소 | 2 | 2.0%
색도 | 2 | 2.0%
대장균 | 2 | 2.0%
Other values (49) | 80 | 80.0%

Most occurring characters

Value | Count | Frequency (%)
 | 49 | 9.7%
 | 19 | 3.8%
 | 16 | 3.2%
1 | 14 | 2.8%
 | 14 | 2.8%
 | 13 | 2.6%
 | 12 | 2.4%
- | 12 | 2.4%
 | 11 | 2.2%
 | 11 | 2.2%
Other values (107) | 334 | 66.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 455 | 90.1%
Decimal Number | 20 | 4.0%
Dash Punctuation | 12 | 2.4%
Other Punctuation | 10 | 2.0%
Open Punctuation | 2 | 0.4%
Uppercase Letter | 2 | 0.4%
Close Punctuation | 2 | 0.4%
Lowercase Letter | 2 | 0.4%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 49 | 10.8%
 | 19 | 4.2%
 | 16 | 3.5%
 | 14 | 3.1%
 | 13 | 2.9%
 | 12 | 2.6%
 | 11 | 2.4%
 | 11 | 2.4%
 | 10 | 2.2%
 | 10 | 2.2%
Other values (97) | 290 | 63.7%

Decimal Number
Value | Count | Frequency (%)
1 | 14 | 70.0%
2 | 2 | 10.0%
3 | 2 | 10.0%
4 | 2 | 10.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 12 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 10 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 2 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
H | 2 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 2 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
p | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 455 | 90.1%
Common | 46 | 9.1%
Latin | 4 | 0.8%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 49 | 10.8%
 | 19 | 4.2%
 | 16 | 3.5%
 | 14 | 3.1%
 | 13 | 2.9%
 | 12 | 2.6%
 | 11 | 2.4%
 | 11 | 2.4%
 | 10 | 2.2%
 | 10 | 2.2%
Other values (97) | 290 | 63.7%

Common
Value | Count | Frequency (%)
1 | 14 | 30.4%
- | 12 | 26.1%
. | 10 | 21.7%
( | 2 | 4.3%
) | 2 | 4.3%
2 | 2 | 4.3%
3 | 2 | 4.3%
4 | 2 | 4.3%

Latin
Value | Count | Frequency (%)
H | 2 | 50.0%
p | 2 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 455 | 90.1%
ASCII | 50 | 9.9%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 49 | 10.8%
 | 19 | 4.2%
 | 16 | 3.5%
 | 14 | 3.1%
 | 13 | 2.9%
 | 12 | 2.6%
 | 11 | 2.4%
 | 11 | 2.4%
 | 10 | 2.2%
 | 10 | 2.2%
Other values (97) | 290 | 63.7%

ASCII
Value | Count | Frequency (%)
1 | 14 | 28.0%
- | 12 | 24.0%
. | 10 | 20.0%
( | 2 | 4.0%
H | 2 | 4.0%
) | 2 | 4.0%
p | 2 | 4.0%
2 | 2 | 4.0%
3 | 2 | 4.0%
4 | 2 | 4.0%

수치 (measured value)
Categorical

IMBALANCE 

Distinct: 27
Distinct (%): 27.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
불검출: 68
없음: 4
0.0014: 2
0.004: 2
0: 2
Other values (22): 22

Length

Max length: 6
Median length: 3
Mean length: 3.12
Min length: 1

Unique

Unique: 22
Unique (%): 22.0%
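
The difference between Distinct and Unique can be checked against the value counts reported for this variable: Distinct counts every different value, while Unique counts only the values that occur exactly once. A minimal sketch, assuming the counts shown in this section (five repeated values and 22 singletons):

```python
# Value counts for 수치 taken from this report: 68, 4, 2, 2, 2,
# plus 22 values that each occur once.
counts = [68, 4, 2, 2, 2] + [1] * 22

n_rows = sum(counts)                        # total observations
distinct = len(counts)                      # number of different values
unique = sum(1 for c in counts if c == 1)   # values seen exactly once

print(f"Distinct: {distinct} ({distinct / n_rows:.1%})")  # Distinct: 27 (27.0%)
print(f"Unique: {unique} ({unique / n_rows:.1%})")        # Unique: 22 (22.0%)
```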

Sample

1st row: 불검출
2nd row: 1.6
3rd row: 0.007
4th row: 불검출
5th row: 불검출

Common Values

Value | Count | Frequency (%)
불검출 | 68 | 68.0%
없음 | 4 | 4.0%
0.0014 | 2 | 2.0%
0.004 | 2 | 2.0%
0 | 2 | 2.0%
1.1 | 1 | 1.0%
0.007 | 1 | 1.0%
91 | 1 | 1.0%
0.72 | 1 | 1.0%
4.0 | 1 | 1.0%
Other values (17) | 17 | 17.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
불검출 | 68 | 68.0%
없음 | 4 | 4.0%
0.0014 | 2 | 2.0%
0.004 | 2 | 2.0%
0 | 2 | 2.0%
5 | 1 | 1.0%
7.5 | 1 | 1.0%
0.018 | 1 | 1.0%
0.019 | 1 | 1.0%
0.0029 | 1 | 1.0%
Other values (17) | 17 | 17.0%

Correlations

 | 시설명 | 검사항목 | 수치
시설명 | 1.000 | 0.000 | 0.000
검사항목 | 0.000 | 1.000 | 0.947
수치 | 0.000 | 0.947 | 1.000

 | 수치 | 시설명
수치 | 1.000 | 0.000
시설명 | 0.000 | 1.000

 | 시설명 | 수치
시설명 | 1.000 | 0.000
수치 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

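First rows

 | 측정일 | 시설명 | 측정지점 | 검사항목 | 수치
0 | 2020-12-01 | 곤명정수장 | 정수지 | 1.4-다이옥산 | 불검출
1 | 2020-12-01 | 곤명정수장 | 정수지 | 질산성질소 | 1.6
2 | 2020-12-01 | 고산정수장 | 정수지 | 아연 | 0.007
3 | 2020-12-01 | 곤명정수장 | 정수지 | 1.2-디브로모-3-클로로프로판 | 불검출
4 | 2020-12-01 | 곤명정수장 | 정수지 | 브롬산염 | 불검출
5 | 2020-12-01 | 곤명정수장 | 정수지 | 증발잔류물 | 91
6 | 2020-12-01 | 고산정수장 | 정수지 | 디클로로메탄 | 불검출
7 | 2020-12-01 | 고산정수장 | 정수지 | 시안 | 불검출
8 | 2020-12-01 | 고산정수장 | 정수지 | 크실렌 | 불검출
9 | 2020-12-01 | 곤명정수장 | 정수지 | 1.1-디클로로에틸렌 | 불검출

Last rows

 | 측정일 | 시설명 | 측정지점 | 검사항목 | 수치
90 | 2020-12-01 | 고산정수장 | 정수지 | 파라티온 | 불검출
91 | 2020-12-01 | 고산정수장 | 정수지 | 페놀 | 불검출
92 | 2020-12-01 | 고산정수장 | 정수지 | 포름알데히드 | 불검출
93 | 2020-12-01 | 고산정수장 | 정수지 | 할로아세틱에시드 | 0.018
94 | 2020-12-01 | 곤명정수장 | 정수지 | pH | 7.5
95 | 2020-12-01 | 곤명정수장 | 정수지 | 경도 | 30
96 | 2020-12-01 | 곤명정수장 | 정수지 | 구리 | 불검출
97 | 2020-12-01 | 곤명정수장 | 정수지 |  | 불검출
98 | 2020-12-01 | 곤명정수장 | 정수지 | 총대장균군 | 불검출
99 | 2020-12-01 | 곤명정수장 | 정수지 |  | 불검출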