Overview

Dataset statistics

Number of variables: 11
Number of observations: 46
Missing cells: 46
Missing cells (%): 9.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.2 KiB
Average record size in memory: 93.9 B
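
The missing-cell percentage can be checked by hand from the counts in this table. The sketch below assumes the percentage is missing cells divided by variables times observations; the variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 11
n_observations = 46
missing_cells = 46

# Total number of cells in the table (rows x columns).
total_cells = n_variables * n_observations

# Share of cells that are missing, as a percentage.
missing_pct = 100 * missing_cells / total_cells

print(f"{missing_cells} of {total_cells} cells missing = {missing_pct:.1f}%")
```

All 46 missing cells belong to the 비고 column, which is empty for every observation.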

Variable types

Numeric: 2
Categorical: 7
Text: 1
Unsupported: 1

Dataset

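Description: Water quality test data for emergency water supply facilities located in Geumjeong-gu, covering tests carried out in Q1 through Q4 of 2021
Author: Geumjeong-gu, Busan Metropolitan City (부산광역시 금정구)
URL: https://www.data.go.kr/data/15099077/fileData.do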

Alerts

용도 has constant value "식수" (Constant)
연번 is highly overall correlated with (unnamed column) and 1 other field (High correlation)
(unnamed column) is highly overall correlated with 연번 and 1 other field (High correlation)
소유주체별 is highly overall correlated with (unnamed column) (High correlation)
2021_2분기 is highly overall correlated with 연번 (High correlation)
2021_2분기 is highly imbalanced (50.4%) (Imbalance)
비고 has 46 (100.0%) missing values (Missing)
연번 has unique values (Unique)
비고 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)

Reproduction

Analysis started: 2023-12-12 13:07:53.971455
Analysis finished: 2023-12-12 13:07:55.119765
Duration: 1.15 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
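
The reported duration follows from the two timestamps. A minimal sketch using only the standard library; the variable names are illustrative and not part of the report.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" section.
started = datetime.fromisoformat("2023-12-12 13:07:53.971455")
finished = datetime.fromisoformat("2023-12-12 13:07:55.119765")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()

print(f"Duration: {duration:.2f} seconds")
```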

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION  UNIQUE

Distinct: 46
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.5
Minimum: 1
Maximum: 46
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 546.0 B

Quantile statistics

Minimum: 1
5-th percentile: 3.25
Q1: 12.25
Median: 23.5
Q3: 34.75
95-th percentile: 43.75
Maximum: 46
Range: 45
Interquartile range (IQR): 22.5

Descriptive statistics

Standard deviation: 13.422618
Coefficient of variation (CV): 0.57117522
Kurtosis: -1.2
Mean: 23.5
Median Absolute Deviation (MAD): 11.5
Skewness: 0
Sum: 1081
Variance: 180.16667
Monotonicity: Strictly increasing
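
Because 연번 is simply the sequence 1 to 46, these figures can be reproduced with the standard library. The sketch below assumes sample (n - 1) variance and linearly interpolated percentiles, which matches the reported numbers; the helper `percentile` is hypothetical and not part of any profiling library.

```python
import statistics

# 연번 is strictly increasing with 46 distinct values from 1 to 46.
values = list(range(1, 47))


def percentile(sorted_values, q):
    """Percentile with linear interpolation between neighbouring ranks."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)
cv = std / mean                         # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)
q1 = percentile(values, 0.25)
q3 = percentile(values, 0.75)

print(mean, variance, std, cv, mad, q1, q3, q3 - q1)
```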
Histogram with fixed size bins (bins=46)

Common values

Value              Count  Frequency (%)
1                  1      2.2%
36                 1      2.2%
27                 1      2.2%
28                 1      2.2%
29                 1      2.2%
30                 1      2.2%
31                 1      2.2%
32                 1      2.2%
33                 1      2.2%
34                 1      2.2%
Other values (36)  36     78.3%

Minimum 10 values

Value  Count  Frequency (%)
1      1      2.2%
2      1      2.2%
3      1      2.2%
4      1      2.2%
5      1      2.2%
6      1      2.2%
7      1      2.2%
8      1      2.2%
9      1      2.2%
10     1      2.2%

Maximum 10 values

Value  Count  Frequency (%)
46     1      2.2%
45     1      2.2%
44     1      2.2%
43     1      2.2%
42     1      2.2%
41     1      2.2%
40     1      2.2%
39     1      2.2%
38     1      2.2%
37     1      2.2%


(unnamed column)
Categorical

HIGH CORRELATION 

Distinct: 14
Distinct (%): 30.4%
Missing: 0
Missing (%): 0.0%
Memory size: 500.0 B

Length

Max length: 4
Median length: 4
Mean length: 3.6086957
Min length: 3

Unique

Unique: 3
Unique (%): 6.5%

Sample

1st row: 서1동
2nd row: 서2동
3rd row: 서2동
4th row: 서3동
5th row: 서3동

Common Values

Value             Count  Frequency (%)
구서2동           7      15.2%
금성동            7      15.2%
부곡2동           5      10.9%
서3동             4      8.7%
선두구동          4      8.7%
구서1동           4      8.7%
부곡1동           3      6.5%
청룡노포          3      6.5%
서2동             2      4.3%
금사동            2      4.3%
Other values (4)  5      10.9%

Length

Histogram of lengths of the category

시설명 (facility name)
Text

Distinct: 45
Distinct (%): 97.8%
Missing: 0
Missing (%): 0.0%
Memory size: 500.0 B

Length

Max length: 13
Median length: 10.5
Mean length: 7.2826087
Min length: 4

Characters and Unicode

Total characters: 335
Distinct characters: 106
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 44
Unique (%): 95.7%

Sample

1st row: 서1동지하수
2nd row: 삼한아파트
3rd row: 서동현대아파트
4th row: 금정전자공고1
5th row: 금정전자공고2

Value                 Count  Frequency (%)
삼한아파트            2      3.8%
102동                 2      3.8%
선경3차apt            2      3.8%
서1동지하수           1      1.9%
자두농원옆(녹동마을   1      1.9%
남산초등학교          1      1.9%
남산하이츠빌라        1      1.9%
금단마을              1      1.9%
태평양아파트          1      1.9%
경보아파트            1      1.9%
Other values (40)     40     75.5%

Most occurring characters

ValueCountFrequency (%)
54
 
16.1%
16
 
4.8%
1 13
 
3.9%
2 10
 
3.0%
A 9
 
2.7%
9
 
2.7%
T 9
 
2.7%
P 9
 
2.7%
8
 
2.4%
8
 
2.4%
Other values (96) 190
56.7%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       219    65.4%
Space Separator    54     16.1%
Decimal Number     32     9.6%
Uppercase Letter   27     8.1%
Close Punctuation  1      0.3%
Other Punctuation  1      0.3%
Open Punctuation   1      0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
16
 
7.3%
9
 
4.1%
8
 
3.7%
8
 
3.7%
7
 
3.2%
6
 
2.7%
5
 
2.3%
5
 
2.3%
5
 
2.3%
4
 
1.8%
Other values (83) 146
66.7%

Decimal Number
Value  Count  Frequency (%)
1      13     40.6%
2      10     31.2%
3      4      12.5%
0      3      9.4%
4      1      3.1%
8      1      3.1%

Uppercase Letter
Value  Count  Frequency (%)
A      9      33.3%
T      9      33.3%
P      9      33.3%

Space Separator
Value    Count  Frequency (%)
(space)  54     100.0%

Close Punctuation
Value  Count  Frequency (%)
)      1      100.0%

Other Punctuation
Value  Count  Frequency (%)
,      1      100.0%

Open Punctuation
Value  Count  Frequency (%)
(      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  219    65.4%
Common  89     26.6%
Latin   27     8.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
16
 
7.3%
9
 
4.1%
8
 
3.7%
8
 
3.7%
7
 
3.2%
6
 
2.7%
5
 
2.3%
5
 
2.3%
5
 
2.3%
4
 
1.8%
Other values (83) 146
66.7%

Common
Value    Count  Frequency (%)
(space)  54     60.7%
1        13     14.6%
2        10     11.2%
3        4      4.5%
0        3      3.4%
)        1      1.1%
4        1      1.1%
,        1      1.1%
8        1      1.1%
(        1      1.1%

Latin
Value  Count  Frequency (%)
A      9      33.3%
T      9      33.3%
P      9      33.3%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  219    65.4%
ASCII   116    34.6%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
(space)           54     46.6%
1                 13     11.2%
2                 10     8.6%
A                 9      7.8%
T                 9      7.8%
P                 9      7.8%
3                 4      3.4%
0                 3      2.6%
)                 1      0.9%
4                 1      0.9%
Other values (3)  3      2.6%

Hangul
ValueCountFrequency (%)
16
 
7.3%
9
 
4.1%
8
 
3.7%
8
 
3.7%
7
 
3.2%
6
 
2.7%
5
 
2.3%
5
 
2.3%
5
 
2.3%
4
 
1.8%
Other values (83) 146
66.7%

규모_t (size, in t)
Real number (ℝ)

Distinct: 20
Distinct (%): 43.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 77.456522
Minimum: 30
Maximum: 200
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 546.0 B

Quantile statistics

Minimum: 30
5-th percentile: 30
Q1: 50
Median: 70
Q3: 92.25
95-th percentile: 150
Maximum: 200
Range: 170
Interquartile range (IQR): 42.25

Descriptive statistics

Standard deviation: 37.616608
Coefficient of variation (CV): 0.48564804
Kurtosis: 1.5784437
Mean: 77.456522
Median Absolute Deviation (MAD): 20
Skewness: 1.2672325
Sum: 3563
Variance: 1415.0092
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=20)

Common values

Value              Count  Frequency (%)
50                 7      15.2%
60                 6      13.0%
80                 5      10.9%
30                 4      8.7%
100                4      8.7%
150                4      8.7%
70                 3      6.5%
130                1      2.2%
65                 1      2.2%
40                 1      2.2%
Other values (10)  10     21.7%

Minimum 10 values

Value  Count  Frequency (%)
30     4      8.7%
40     1      2.2%
42     1      2.2%
49     1      2.2%
50     7      15.2%
55     1      2.2%
60     6      13.0%
65     1      2.2%
70     3      6.5%
72     1      2.2%

Maximum 10 values

Value  Count  Frequency (%)
200    1      2.2%
150    4      8.7%
130    1      2.2%
120    1      2.2%
100    4      8.7%
93     1      2.2%
90     1      2.2%
89     1      2.2%
80     5      10.9%
78     1      2.2%

소유주체별 (by owner type)
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 8.7%
Missing: 0
Missing (%): 0.0%
Memory size: 500.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 2.2%

Sample

1st row: 정부
2nd row: 민간
3rd row: 민간
4th row: 공공
5th row: 공공

Common Values

Value  Count  Frequency (%)
민간   20     43.5%
자체   20     43.5%
공공   5      10.9%
정부   1      2.2%

Length

Histogram of lengths of the category


용도 (use)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 500.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 식수
2nd row: 식수
3rd row: 식수
4th row: 식수
5th row: 식수

Common Values

Value  Count  Frequency (%)
식수   46     100.0%

Length

Histogram of lengths of the category


2021_1분기 (2021 Q1)
Categorical

Distinct: 2
Distinct (%): 4.3%
Missing: 0
Missing (%): 0.0%
Memory size: 500.0 B

Length

Max length: 3
Median length: 2
Mean length: 2.1956522
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 적합
2nd row: 적합
3rd row: 부적합
4th row: 적합
5th row: 적합

Common Values

Value   Count  Frequency (%)
적합    37     80.4%
부적합  9      19.6%
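
The length statistics reported for this variable follow directly from the category counts. The sketch below assumes that length is measured in Unicode code points after NFC normalisation; the variable names are illustrative.

```python
import statistics
import unicodedata

# Category counts copied from the Common Values table for 2021_1분기.
counts = {"적합": 37, "부적합": 9}

# One string length per observation, measured in NFC code points.
lengths = [
    len(unicodedata.normalize("NFC", value))
    for value, n in counts.items()
    for _ in range(n)
]

mean_length = statistics.mean(lengths)
median_length = statistics.median(lengths)

# Relative frequency of each category, as a percentage.
frequencies = {value: 100 * n / len(lengths) for value, n in counts.items()}

print(mean_length, median_length, frequencies)
```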

Length

Histogram of lengths of the category


2021_2분기 (2021 Q2)
Categorical

HIGH CORRELATION  IMBALANCE

Distinct: 2
Distinct (%): 4.3%
Missing: 0
Missing (%): 0.0%
Memory size: 500.0 B

Length

Max length: 3
Median length: 2
Mean length: 2.1086957
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 적합
2nd row: 적합
3rd row: 적합
4th row: 적합
5th row: 적합

Common Values

Value   Count  Frequency (%)
적합    41     89.1%
부적합  5      10.9%
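
The 50.4% imbalance figure in the alert for this variable is consistent with one minus the normalised Shannon entropy of the class distribution. The report does not state the formula, so the sketch below is an assumption that happens to reproduce the number, not a confirmed description of the profiler's internals.

```python
import math

# Category counts copied from the Common Values table for 2021_2분기.
counts = {"적합": 41, "부적합": 5}

total = sum(counts.values())
probabilities = [n / total for n in counts.values()]

# Shannon entropy in bits.
entropy = -sum(p * math.log2(p) for p in probabilities)

# Assumed score: 0 for a perfectly balanced variable, 1 for a constant one.
imbalance = 1 - entropy / math.log2(len(counts))

print(f"Imbalance: {100 * imbalance:.1f}%")
```

Under the same assumption, 2021_3분기 (23 versus 23) would score 0%, which fits the absence of an imbalance alert for it.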

Length

Histogram of lengths of the category


2021_3분기 (2021 Q3)
Categorical

Distinct: 2
Distinct (%): 4.3%
Missing: 0
Missing (%): 0.0%
Memory size: 500.0 B

Length

Max length: 3
Median length: 2.5
Mean length: 2.5
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 부적합
2nd row: 부적합
3rd row: 부적합
4th row: 적합
5th row: 적합

Common Values

Value   Count  Frequency (%)
부적합  23     50.0%
적합    23     50.0%

Length

Histogram of lengths of the category


2021_4분기 (2021 Q4)
Categorical

Distinct: 2
Distinct (%): 4.3%
Missing: 0
Missing (%): 0.0%
Memory size: 500.0 B

Length

Max length: 3
Median length: 2
Mean length: 2.3913043
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 적합
2nd row: 적합
3rd row: 적합
4th row: 적합
5th row: 적합

Common Values

Value   Count  Frequency (%)
적합    28     60.9%
부적합  18     39.1%

Length

Histogram of lengths of the category


비고 (remarks)
Unsupported

MISSING  REJECTED  UNSUPPORTED

Missing: 46
Missing (%): 100.0%
Memory size: 546.0 B


Correlations

Variable  연번  (unnamed)  시설명  규모_t  소유주체별  2021_1분기  2021_2분기  2021_3분기  2021_4분기
연번  1.000  0.946  0.931  0.282  0.483  0.000  0.733  0.000  0.383
(unnamed)  0.946  1.000  0.769  0.361  0.909  0.196  0.167  0.121  0.427
시설명  0.931  0.769  1.000  0.862  1.000  1.000  1.000  1.000  0.000
규모_t  0.282  0.361  0.862  1.000  0.768  0.461  0.564  0.000  0.389
소유주체별  0.483  0.909  1.000  0.768  1.000  0.076  0.000  0.419  0.312
2021_1분기  0.000  0.196  1.000  0.461  0.076  1.000  0.000  0.255  0.000
2021_2분기  0.733  0.167  1.000  0.564  0.000  0.000  1.000  0.000  0.258
2021_3분기  0.000  0.121  1.000  0.000  0.419  0.255  0.000  1.000  0.423
2021_4분기  0.383  0.427  0.000  0.389  0.312  0.000  0.258  0.423  1.000

Variable  2021_4분기  (unnamed)  소유주체별  2021_3분기  2021_2분기  2021_1분기
2021_4분기  1.000  0.275  0.199  0.277  0.165  0.000
(unnamed)  0.275  1.000  0.675  0.012  0.078  0.104
소유주체별  0.199  0.675  1.000  0.272  0.000  0.030
2021_3분기  0.277  0.012  0.272  1.000  0.000  0.163
2021_2분기  0.165  0.078  0.000  0.000  1.000  0.000
2021_1분기  0.000  0.104  0.030  0.163  0.000  1.000

Variable  연번  규모_t  (unnamed)  소유주체별  2021_1분기  2021_2분기  2021_3분기  2021_4분기
연번  1.000  -0.105  0.734  0.246  0.000  0.517  0.000  0.276
규모_t  -0.105  1.000  0.128  0.410  0.317  0.392  0.000  0.266
(unnamed)  0.734  0.128  1.000  0.675  0.104  0.078  0.012  0.275
소유주체별  0.246  0.410  0.675  1.000  0.030  0.000  0.272  0.199
2021_1분기  0.000  0.317  0.104  0.030  1.000  0.000  0.163  0.000
2021_2분기  0.517  0.392  0.078  0.000  0.000  1.000  0.000  0.165
2021_3분기  0.000  0.000  0.012  0.272  0.163  0.000  1.000  0.277
2021_4분기  0.276  0.266  0.275  0.199  0.000  0.165  0.277  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

(index)  연번  (unnamed)  시설명  규모_t  소유주체별  용도  2021_1분기  2021_2분기  2021_3분기  2021_4분기  비고
0  1  서1동  서1동지하수  130  정부  식수  적합  적합  부적합  적합  <NA>
1  2  서2동  삼한아파트  80  민간  식수  적합  적합  부적합  적합  <NA>
2  3  서2동  서동현대아파트  30  민간  식수  부적합  적합  부적합  적합  <NA>
3  4  서3동  금정전자공고1  80  공공  식수  적합  적합  적합  적합  <NA>
4  5  서3동  금정전자공고2  80  공공  식수  적합  적합  적합  적합  <NA>
5  6  서3동  서동상가시장  49  자체  식수  적합  적합  적합  적합  <NA>
6  7  서3동  서곡초등학교  100  공공  식수  적합  적합  적합  적합  <NA>
7  8  금사동  예원정보고  42  자체  식수  부적합  적합  적합  적합  <NA>
8  9  금사동  삼한아파트  90  민간  식수  적합  적합  부적합  부적합  <NA>
9  10  부곡1동  시영APT120동  50  자체  식수  적합  적합  부적합  적합  <NA>

Last rows

(index)  연번  (unnamed)  시설명  규모_t  소유주체별  용도  2021_1분기  2021_2분기  2021_3분기  2021_4분기  비고
36  37  구서2동  선경3차APT 311동  70  민간  식수  적합  부적합  부적합  부적합  <NA>
37  38  구서2동  선경3차APT 312동  70  민간  식수  적합  적합  적합  부적합  <NA>
38  39  구서2동  일신APT 2동  30  민간  식수  적합  적합  적합  적합  <NA>
39  40  금성동  금성어린이집  60  자체  식수  적합  적합  부적합  적합  <NA>
40  41  금성동  금성토산주  70  자체  식수  적합  적합  적합  부적합  <NA>
41  42  금성동  국청사뒤  150  자체  식수  부적합  부적합  부적합  부적합  <NA>
42  43  금성동  1통지내  200  자체  식수  부적합  부적합  부적합  부적합  <NA>
43  44  금성동  2통지내  55  자체  식수  적합  부적합  부적합  부적합  <NA>
44  45  금성동  남문입구  40  자체  식수  부적합  적합  부적합  적합  <NA>
45  46  금성동  세심정  50  자체  식수  적합  적합  적합  적합  <NA>