Overview

Dataset statistics

Number of variables: 10
Number of observations: 66
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.4 KiB
Average record size in memory: 83.9 B

Variable types

Categorical: 8
DateTime: 1
Numeric: 1

Dataset

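Description: Submission of the 2023 results. Test items: 일반세균 (general bacteria count), 총대장균군 (total coliforms), 분원성대장균군 (fecal coliforms), 암모니아성질소 (ammonia nitrogen), 질산성질소 (nitrate nitrogen), 과망간산칼륨소비량 (potassium permanganate consumption)
Author: 광주광역시 (Gwangju Metropolitan City)
URL: https://www.data.go.kr/data/3075549/fileData.do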

Alerts

암모니아성질소 has constant value "불검출" (Constant)
질산성질소 is highly overall correlated with 채수장소(id) (High correlation)
채수장소(id) is highly overall correlated with 질산성질소 and 2 other fields (High correlation)
일반세균 is highly overall correlated with 분원성대장균군 and 1 other fields (High correlation)
총대장균군 is highly overall correlated with 분원성대장균군 and 1 other fields (High correlation)
분원성대장균군 is highly overall correlated with 채수장소(id) and 3 other fields (High correlation)
판정 is highly overall correlated with 채수장소(id) and 3 other fields (High correlation)
분원성대장균군 is highly imbalanced (56.1%) (Imbalance)
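
The 56.1% imbalance figure for 분원성대장균군 is consistent with an entropy-based score: one minus the Shannon entropy of the class frequencies, divided by the maximum possible entropy (log2 of the number of classes). The sketch below assumes that definition; the function name `imbalance_score` is hypothetical and is not taken from the report. It is applied to the counts reported for this column (60 불검출, 6 검출).

```python
import math

def imbalance_score(counts):
    """Return 1 - normalized Shannon entropy for a list of class counts."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if len(probs) < 2:
        # Assumption: a single-class column is treated as constant, not imbalanced.
        return 0.0
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(len(probs))

# Counts reported for 분원성대장균군: 60 not detected, 6 detected.
score = imbalance_score([60, 6])
print(f"{score:.1%}")
```

Under the same assumed definition, a perfectly balanced column scores 0 and a column dominated by one class approaches 1.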

Reproduction

Analysis started: 2024-03-15 00:38:45.866380
Analysis finished: 2024-03-15 00:38:47.758932
Duration: 1.89 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
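
As a quick check, the reported duration can be recomputed from the two timestamps. This is a minimal sketch using only the Python standard library; the timestamp format string is an assumption based on how the values are printed in the report.

```python
from datetime import datetime

# Format inferred from the timestamps as printed in the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-03-15 00:38:45.866380", FMT)
finished = datetime.strptime("2024-03-15 00:38:47.758932", FMT)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```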

Variables

채수장소(id) (sampling site)
Categorical

HIGH CORRELATION

Distinct: 6
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Memory size: 656.0 B

구)증심사: 11
대각사: 11
산장광장: 11
청풍쉼터: 11
산정: 11
용진산: 11

Length

Max length: 5
Median length: 4
Mean length: 3.5
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 구)증심사
2nd row: 구)증심사
3rd row: 구)증심사
4th row: 구)증심사
5th row: 구)증심사

Common Values

Value | Count | Frequency (%)
구)증심사 | 11 | 16.7%
대각사 | 11 | 16.7%
산장광장 | 11 | 16.7%
청풍쉼터 | 11 | 16.7%
산정 | 11 | 16.7%
용진산 | 11 | 16.7%

Length

Histogram of lengths of the category (figure).

Common Values (Plot)

Plot of the value counts listed under Common Values (figure).

분기 (quarter)
Categorical

Distinct: 4
Distinct (%): 6.1%
Missing: 0
Missing (%): 0.0%
Memory size: 656.0 B

3: 36
2: 18
1: 6
4: 6

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 2
3rd row: 2
4th row: 2
5th row: 3

Common Values

Value | Count | Frequency (%)
3 | 36 | 54.5%
2 | 18 | 27.3%
1 | 6 | 9.1%
4 | 6 | 9.1%
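
The frequency column is each count divided by the number of observations (66). The sketch below recomputes it for 분기 from the reported counts; the variable names are illustrative.

```python
# Reported counts per quarter (분기).
quarter_counts = {"3": 36, "2": 18, "1": 6, "4": 6}

total = sum(quarter_counts.values())

# Share of observations per quarter, as a percentage rounded to one decimal.
frequency_pct = {q: round(100 * n / total, 1) for q, n in quarter_counts.items()}

for quarter, pct in frequency_pct.items():
    print(f"{quarter}: {pct}%")
```

More than half of the samples fall in the third quarter, so per-quarter comparisons rest on uneven sample sizes.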

Length

Histogram of lengths of the category (figure).

Common Values (Plot)

Plot of the value counts listed under Common Values (figure).

채수일자 (sampling date)
DateTime

Distinct: 37
Distinct (%): 56.1%
Missing: 0
Missing (%): 0.0%
Memory size: 656.0 B
Minimum: 2023-02-16 00:00:00
Maximum: 2023-12-08 00:00:00

Histogram with fixed size bins (bins=37) (figure).

일반세균 (general bacteria count; 불검출 = not detected)
Categorical

HIGH CORRELATION

Distinct: 18
Distinct (%): 27.3%
Missing: 0
Missing (%): 0.0%
Memory size: 656.0 B

불검출: 36
2: 8
6: 3
1: 3
9: 3
Other values (13): 13

Length

Max length: 3
Median length: 3
Mean length: 2.2727273
Min length: 1

Unique

Unique: 13
Unique (%): 19.7%

Sample

1st row: 불검출
2nd row: 불검출
3rd row: 불검출
4th row: 불검출
5th row: 불검출

Common Values

Value | Count | Frequency (%)
불검출 | 36 | 54.5%
2 | 8 | 12.1%
6 | 3 | 4.5%
1 | 3 | 4.5%
9 | 3 | 4.5%
26 | 1 | 1.5%
3 | 1 | 1.5%
12 | 1 | 1.5%
13 | 1 | 1.5%
57 | 1 | 1.5%
Other values (8) | 8 | 12.1%
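
일반세균 is typed as Categorical because it mixes the label 불검출 (not detected) with numeric counts stored as text. The sketch below shows one way to separate the two before numeric analysis. Mapping 불검출 to `None` rather than 0 is an assumption, and the helper name `parse_bacteria_count` is hypothetical; the input values are a few taken from the table above.

```python
from typing import Optional

NOT_DETECTED = "불검출"  # label used in the data for "not detected"

def parse_bacteria_count(raw: str) -> Optional[int]:
    """Return the count as an int, or None for the not-detected label."""
    text = raw.strip()
    if text == NOT_DETECTED:
        return None  # assumption: keep "not detected" distinct from a measured 0
    return int(text)

values = ["불검출", "2", "6", "26", "57"]
parsed = [parse_bacteria_count(v) for v in values]

# Only rows with a measured count take part in numeric summaries.
detected = [v for v in parsed if v is not None]
print(parsed, max(detected))
```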

Length

Histogram of lengths of the category (figure).

Value | Count | Frequency (%)
불검출 | 36 | 54.5%
2 | 8 | 12.1%
6 | 3 | 4.5%
1 | 3 | 4.5%
9 | 3 | 4.5%
41 | 1 | 1.5%
225 | 1 | 1.5%
11 | 1 | 1.5%
7 | 1 | 1.5%
18 | 1 | 1.5%
Other values (8) | 8 | 12.1%

총대장균군 (total coliforms; 불검출 = not detected, 검출 = detected)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 656.0 B

불검출: 56
검출: 10

Length

Max length: 3
Median length: 3
Mean length: 2.8484848
Min length: 2
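
The length statistics are computed over the string form of each value, so for a two-label column they follow directly from the label counts. The sketch below reproduces the mean length reported for 총대장균군 from the counts of 불검출 (3 characters, 56 rows) and 검출 (2 characters, 10 rows); the variable names are illustrative.

```python
# Reported label counts for 총대장균군.
value_counts = {"불검출": 56, "검출": 10}

total = sum(value_counts.values())

# Count-weighted mean of the label lengths in characters.
mean_length = sum(len(label) * n for label, n in value_counts.items()) / total

max_length = max(len(label) for label in value_counts)
min_length = min(len(label) for label in value_counts)
print(round(mean_length, 7), max_length, min_length)
```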

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 불검출
2nd row: 불검출
3rd row: 검출
4th row: 불검출
5th row: 불검출

Common Values

Value | Count | Frequency (%)
불검출 | 56 | 84.8%
검출 | 10 | 15.2%

Length

Histogram of lengths of the category (figure).

Common Values (Plot)

Plot of the value counts listed under Common Values (figure).

분원성대장균군 (fecal coliforms; 불검출 = not detected, 검출 = detected)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 656.0 B

불검출: 60
검출: 6

Length

Max length: 3
Median length: 3
Mean length: 2.9090909
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 불검출
2nd row: 불검출
3rd row: 불검출
4th row: 불검출
5th row: 불검출

Common Values

Value | Count | Frequency (%)
불검출 | 60 | 90.9%
검출 | 6 | 9.1%

Length

Histogram of lengths of the category (figure).

Common Values (Plot)

Plot of the value counts listed under Common Values (figure).

암모니아성질소 (ammonia nitrogen; 불검출 = not detected)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 656.0 B

불검출: 66

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 불검출
2nd row: 불검출
3rd row: 불검출
4th row: 불검출
5th row: 불검출

Common Values

Value | Count | Frequency (%)
불검출 | 66 | 100.0%

Length

Histogram of lengths of the category (figure).

Common Values (Plot)

Plot of the value counts listed under Common Values (figure).

질산성질소 (nitrate nitrogen)
Real number (ℝ)

HIGH CORRELATION

Distinct: 36
Distinct (%): 54.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.4272727
Minimum: 0.2
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 722.0 B

Quantile statistics

Minimum: 0.2
5-th percentile: 0.3
Q1: 0.525
median: 1.1
Q3: 4.25
95-th percentile: 7.75
Maximum: 9
Range: 8.8
Interquartile range (IQR): 3.725

Descriptive statistics

Standard deviation: 2.5904882
Coefficient of variation (CV): 1.0672423
Kurtosis: -0.20551291
Mean: 2.4272727
Median Absolute Deviation (MAD): 0.7
Skewness: 1.1328948
Sum: 160.2
Variance: 6.7106294
Monotonicity: Not monotonic
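
Several of the 질산성질소 statistics are derived from others, which gives a simple consistency check: range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared, and mean = sum / number of observations. The sketch below recomputes them from the reported values; the variable names are illustrative.

```python
import math

# Values as reported for 질산성질소.
q1, q3 = 0.525, 4.25
minimum, maximum = 0.2, 9.0
mean, std = 2.4272727, 2.5904882
total, n = 160.2, 66

value_range = maximum - minimum   # reported: 8.8
iqr = q3 - q1                     # reported: 3.725
cv = std / mean                   # reported: 1.0672423
variance = std ** 2               # reported: 6.7106294
mean_from_sum = total / n         # reported mean: 2.4272727

print(round(value_range, 3), round(iqr, 3), round(cv, 5), round(variance, 4))
```

A coefficient of variation above 1 together with a positive skewness fits a distribution where most samples sit near the low end (median 1.1) and a minority reach much higher values.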

Histogram with fixed size bins (bins=36) (figure).

Common values

Value | Count | Frequency (%)
0.4 | 8 | 12.1%
0.5 | 4 | 6.1%
0.6 | 4 | 6.1%
1.1 | 4 | 6.1%
1.0 | 4 | 6.1%
0.3 | 3 | 4.5%
0.8 | 3 | 4.5%
1.2 | 3 | 4.5%
0.9 | 3 | 4.5%
6.3 | 2 | 3.0%
Other values (26) | 28 | 42.4%

Minimum 10 values

Value | Count | Frequency (%)
0.2 | 2 | 3.0%
0.3 | 3 | 4.5%
0.4 | 8 | 12.1%
0.5 | 4 | 6.1%
0.6 | 4 | 6.1%
0.7 | 1 | 1.5%
0.8 | 3 | 4.5%
0.9 | 3 | 4.5%
1.0 | 4 | 6.1%
1.1 | 4 | 6.1%

Maximum 10 values

Value | Count | Frequency (%)
9.0 | 1 | 1.5%
8.2 | 1 | 1.5%
8.0 | 1 | 1.5%
7.9 | 1 | 1.5%
7.3 | 1 | 1.5%
6.8 | 1 | 1.5%
6.4 | 1 | 1.5%
6.3 | 2 | 3.0%
6.2 | 2 | 3.0%
6.0 | 1 | 1.5%

과망간산칼륨소비량 (potassium permanganate consumption)
Categorical

Distinct: 26
Distinct (%): 39.4%
Missing: 0
Missing (%): 0.0%
Memory size: 656.0 B

1.5: 6
1.2: 5
1.3: 5
2: 4
2.4: 4
Other values (21): 42

Length

Max length: 3
Median length: 3
Mean length: 2.8181818
Min length: 1

Unique

Unique: 9
Unique (%): 13.6%

Sample

1st row: 0.6
2nd row: 0.5
3rd row: 0.5
4th row: 2.4
5th row: 1.1

Common Values

Value | Count | Frequency (%)
1.5 | 6 | 9.1%
1.2 | 5 | 7.6%
1.3 | 5 | 7.6%
2 | 4 | 6.1%
2.4 | 4 | 6.1%
3.1 | 4 | 6.1%
2.1 | 4 | 6.1%
0.8 | 3 | 4.5%
0.9 | 3 | 4.5%
2.2 | 3 | 4.5%
Other values (16) | 25 | 37.9%

Length

Histogram of lengths of the category (figure).

판정 (verdict; 적합 = compliant, 부적합 = non-compliant)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 656.0 B

적합: 57
부적합: 9

Length

Max length: 3
Median length: 2
Mean length: 2.1363636
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 적합
2nd row: 적합
3rd row: 적합
4th row: 적합
5th row: 적합

Common Values

Value | Count | Frequency (%)
적합 | 57 | 86.4%
부적합 | 9 | 13.6%

Length

Histogram of lengths of the category (figure).

Common Values (Plot)

Plot of the value counts listed under Common Values (figure).

Interactions


Correlations

Variable | 채수장소(id) | 분기 | 채수일자 | 일반세균 | 총대장균군 | 분원성대장균군 | 질산성질소 | 과망간산칼륨소비량 | 판정
채수장소(id) | 1.000 | 0.000 | 0.000 | 0.428 | 0.631 | 0.715 | 0.782 | 0.345 | 0.841
분기 | 0.000 | 1.000 | 1.000 | 0.360 | 0.167 | 0.000 | 0.313 | 0.425 | 0.277
채수일자 | 0.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.085 | 0.000 | 0.000 | 0.474
일반세균 | 0.428 | 0.360 | 0.000 | 1.000 | 0.702 | 0.759 | 0.000 | 0.738 | 0.821
총대장균군 | 0.631 | 0.167 | 0.000 | 0.702 | 1.000 | 0.868 | 0.000 | 0.146 | 0.925
분원성대장균군 | 0.715 | 0.000 | 0.085 | 0.759 | 0.868 | 1.000 | 0.000 | 0.485 | 0.901
질산성질소 | 0.782 | 0.313 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000
과망간산칼륨소비량 | 0.345 | 0.425 | 0.000 | 0.738 | 0.146 | 0.485 | 0.000 | 1.000 | 0.000
판정 | 0.841 | 0.277 | 0.474 | 0.821 | 0.925 | 0.901 | 0.000 | 0.000 | 1.000

Variable | 총대장균군 | 판정 | 일반세균 | 분기 | 분원성대장균군 | 과망간산칼륨소비량 | 채수장소(id)
총대장균군 | 1.000 | 0.751 | 0.486 | 0.106 | 0.669 | 0.051 | 0.444
판정 | 0.751 | 1.000 | 0.583 | 0.180 | 0.714 | 0.000 | 0.625
일반세균 | 0.486 | 0.583 | 1.000 | 0.168 | 0.531 | 0.261 | 0.152
분기 | 0.106 | 0.180 | 0.168 | 1.000 | 0.000 | 0.173 | 0.000
분원성대장균군 | 0.669 | 0.714 | 0.531 | 0.000 | 1.000 | 0.297 | 0.510
과망간산칼륨소비량 | 0.051 | 0.000 | 0.261 | 0.173 | 0.297 | 1.000 | 0.107
채수장소(id) | 0.444 | 0.625 | 0.152 | 0.000 | 0.510 | 0.107 | 1.000

Variable | 질산성질소 | 채수장소(id) | 분기 | 일반세균 | 총대장균군 | 분원성대장균군 | 과망간산칼륨소비량 | 판정
질산성질소 | 1.000 | 0.514 | 0.192 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
채수장소(id) | 0.514 | 1.000 | 0.000 | 0.152 | 0.444 | 0.510 | 0.107 | 0.625
분기 | 0.192 | 0.000 | 1.000 | 0.168 | 0.106 | 0.000 | 0.173 | 0.180
일반세균 | 0.000 | 0.152 | 0.168 | 1.000 | 0.486 | 0.531 | 0.261 | 0.583
총대장균군 | 0.000 | 0.444 | 0.106 | 0.486 | 1.000 | 0.669 | 0.051 | 0.751
분원성대장균군 | 0.000 | 0.510 | 0.000 | 0.531 | 0.669 | 1.000 | 0.297 | 0.714
과망간산칼륨소비량 | 0.000 | 0.107 | 0.173 | 0.261 | 0.051 | 0.297 | 1.000 | 0.000
판정 | 0.000 | 0.625 | 0.180 | 0.583 | 0.751 | 0.714 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column (figure).
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion (figure).

Sample

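First rows

Row | 채수장소(id) | 분기 | 채수일자 | 일반세균 | 총대장균군 | 분원성대장균군 | 암모니아성질소 | 질산성질소 | 과망간산칼륨소비량 | 판정
0 | 구)증심사 | 1 | 2023-02-16 | 불검출 | 불검출 | 불검출 | 불검출 | 0.4 | 0.6 | 적합
1 | 구)증심사 | 2 | 2023-04-25 | 불검출 | 불검출 | 불검출 | 불검출 | 0.4 | 0.5 | 적합
2 | 구)증심사 | 2 | 2023-05-15 | 불검출 | 검출 | 불검출 | 불검출 | 0.4 | 0.5 | 적합
3 | 구)증심사 | 2 | 2023-06-22 | 불검출 | 불검출 | 불검출 | 불검출 | 0.9 | 2.4 | 적합
4 | 구)증심사 | 3 | 2023-07-12 | 불검출 | 불검출 | 불검출 | 불검출 | 0.4 | 1.1 | 적합
5 | 구)증심사 | 3 | 2023-07-26 | 불검출 | 불검출 | 불검출 | 불검출 | 0.4 | 2.8 | 적합
6 | 구)증심사 | 3 | 2023-08-14 | 불검출 | 불검출 | 불검출 | 불검출 | 0.5 | 0.3 | 적합
7 | 구)증심사 | 3 | 2023-08-22 | 불검출 | 불검출 | 불검출 | 불검출 | 0.4 | 2.4 | 적합
8 | 구)증심사 | 3 | 2023-09-13 | 불검출 | 불검출 | 불검출 | 불검출 | 0.5 | 2.8 | 적합
9 | 구)증심사 | 3 | 2023-09-27 | 불검출 | 불검출 | 불검출 | 불검출 | 0.4 | 1.2 | 적합

Last rows

Row | 채수장소(id) | 분기 | 채수일자 | 일반세균 | 총대장균군 | 분원성대장균군 | 암모니아성질소 | 질산성질소 | 과망간산칼륨소비량 | 판정
56 | 용진산 | 2 | 2023-04-19 | 불검출 | 불검출 | 불검출 | 불검출 | 1.7 | 1.1 | 적합
57 | 용진산 | 2 | 2023-05-15 | 불검출 | 불검출 | 불검출 | 불검출 | 2.2 | 1.4 | 적합
58 | 용진산 | 2 | 2023-06-19 | 불검출 | 불검출 | 불검출 | 불검출 | 1.5 | 2.1 | 적합
59 | 용진산 | 3 | 2023-07-03 | 불검출 | 불검출 | 불검출 | 불검출 | 0.3 | 0.9 | 적합
60 | 용진산 | 3 | 2023-07-19 | 1 | 불검출 | 불검출 | 불검출 | 0.2 | 1.2 | 적합
61 | 용진산 | 3 | 2023-08-03 | 11 | 불검출 | 불검출 | 불검출 | 0.3 | 2.1 | 적합
62 | 용진산 | 3 | 2023-08-21 | 2 | 불검출 | 불검출 | 불검출 | 0.8 | 0.9 | 적합
63 | 용진산 | 3 | 2023-09-04 | 불검출 | 불검출 | 불검출 | 불검출 | 0.6 | 2.2 | 적합
64 | 용진산 | 3 | 2023-09-18 | 14 | 불검출 | 불검출 | 불검출 | 0.4 | 4 | 적합
65 | 용진산 | 4 | 2023-11-20 | 불검출 | 불검출 | 불검출 | 불검출 | 1.0 | 4.1 | 적합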