Overview

Dataset statistics

Number of variables: 6
Number of observations: 1395
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 68.2 KiB
Average record size in memory: 50.1 B
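The two memory figures are consistent with each other: dividing the total size by the number of observations gives the average record size. A quick check, assuming 1 KiB = 1024 B (the variable names are illustrative and not part of the report):

```python
# Figures copied from the dataset statistics.
total_size_kib = 68.2
n_observations = 1395

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")  # 50.1 B
```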

Variable types

DateTime: 1
Categorical: 3
Numeric: 2

Dataset

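Description: 처리장개방일자 (facility opening date), 물재생센터명칭 (water reclamation center name), 개방장소구분 (type of opened facility), 개방횟수 (number of openings), 이용인원 (number of users), 비고 (remarks)
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-20253/S/1/datasetView.do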

Alerts

개방장소구분 is highly overall correlated with 비고 (High correlation)
물재생센터명칭 is highly overall correlated with 비고 (High correlation)
비고 is highly overall correlated with 개방횟수 and 3 other fields (High correlation)
개방횟수 is highly overall correlated with 이용인원 and 1 other field (High correlation)
이용인원 is highly overall correlated with 개방횟수 and 1 other field (High correlation)
비고 is highly imbalanced (80.1%) (Imbalance)
개방횟수 has 449 (32.2%) zeros (Zeros)
이용인원 has 449 (32.2%) zeros (Zeros)
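The 80.1% imbalance figure for 비고 can be reproduced from its category counts if the score is taken to be one minus the normalized Shannon entropy. That definition is an assumption here, as the report itself does not spell it out. A minimal sketch using the counts listed under the variable's Common Values (the function name is hypothetical):

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k), with H the Shannon entropy of the k category counts."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # With a single category the ratio is undefined; 0.0 is an arbitrary convention.
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)

# Category counts for 비고, including the <NA> category.
remarks_counts = [1288, 51, 32, 22, 1, 1]
score = imbalance_score(remarks_counts)
print(f"{score:.1%}")  # 80.1%
```

A score near 1 means almost all rows fall into one category (here `<NA>`); a score of 0 means the categories are evenly populated.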

Reproduction

Analysis started: 2024-05-03 20:48:28.496801
Analysis finished: 2024-05-03 20:48:31.489805
Duration: 2.99 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
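A report of this kind could be regenerated roughly as follows. This is a sketch under assumptions, not the configuration actually used: the CSV path, file encoding, report title and output name are all hypothetical, and the `config.json` referenced in the report is not reproduced.

```python
# Dataset metadata as shown in the report's Dataset section.
DATASET_METADATA = {
    "description": "처리장개방일자,물재생센터명칭,개방장소구분,개방횟수,이용인원,비고",
    "author": "서울특별시",
    "url": "https://data.seoul.go.kr/dataList/OA-20253/S/1/datasetView.do",
}

def build_report(csv_path, output_html="report.html", encoding="utf-8"):
    """Profile a CSV export of the dataset and write an HTML report.

    csv_path, output_html and encoding are hypothetical; adjust them to the
    file actually downloaded from the portal.
    """
    # Imported lazily so this module loads without the third-party packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(
        df,
        title="Profiling report (hypothetical title)",
        dataset=DATASET_METADATA,
    )
    report.to_file(output_html)
    return report
```

Calling `build_report("open_facilities.csv")` (a made-up file name) would write `report.html` next to the script, provided `pandas` and `ydata-profiling` are installed.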

Variables

처리장개방일자
DateTime

Distinct: 85
Distinct (%): 6.1%
Missing: 0
Missing (%): 0.0%
Memory size: 11.0 KiB
Minimum: 2016-01-01 00:00:00
Maximum: 2023-01-01 00:00:00

[Figure: Histogram with fixed size bins (bins=50)]
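The 85 distinct values equal the number of calendar months from the minimum to the maximum, inclusive. That is consistent with one value per month and no gaps, though it does not prove it. A small check, assuming monthly granularity as the YYYY/MM values in the sample rows suggest (the helper name is hypothetical):

```python
from datetime import date

def months_inclusive(start, end):
    """Count calendar months from start to end, including both endpoints."""
    return (end.year - start.year) * 12 + (end.month - start.month) + 1

first, last = date(2016, 1, 1), date(2023, 1, 1)
expected_months = months_inclusive(first, last)
print(expected_months)  # 85
```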

물재생센터명칭
Categorical

HIGH CORRELATION 

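Distinct: 4
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 11.0 KiB

Value | Count
서남물재생센터 (Seonam Water Reclamation Center) | 496
중랑물재생센터 (Jungnang Water Reclamation Center) | 476
탄천물재생센터 (Tancheon Water Reclamation Center) | 254
난지물재생센터 (Nanji Water Reclamation Center) | 169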

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 중랑물재생센터
2nd row: 중랑물재생센터
3rd row: 중랑물재생센터
4th row: 중랑물재생센터
5th row: 중랑물재생센터

Common Values

Value | Count | Frequency (%)
서남물재생센터 (Seonam Water Reclamation Center) | 496 | 35.6%
중랑물재생센터 (Jungnang Water Reclamation Center) | 476 | 34.1%
탄천물재생센터 (Tancheon Water Reclamation Center) | 254 | 18.2%
난지물재생센터 (Nanji Water Reclamation Center) | 169 | 12.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

개방장소구분
Categorical

HIGH CORRELATION 

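Distinct: 10
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 11.0 KiB

Value | Count
테니스장 (tennis court) | 323
축구장 (soccer field) | 323
기타 (other) | 267
강당 (auditorium) | 133
족구장 (jokgu court) | 86
Other values (5) | 263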

Length

Max length: 5
Median length: 4
Mean length: 3.0666667
Min length: 2

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 테니스장
2nd row: 축구장
3rd row: 족구장
4th row: 배드민턴장
5th row: 기타

Common Values

Value | Count | Frequency (%)
테니스장 (tennis court) | 323 | 23.2%
축구장 (soccer field) | 323 | 23.2%
기타 (other) | 267 | 19.1%
강당 (auditorium) | 133 | 9.5%
족구장 (jokgu court) | 86 | 6.2%
배드민턴장 (badminton court) | 85 | 6.1%
농구장 (basketball court) | 82 | 5.9%
골프장 (golf course) | 82 | 5.9%
탁구장 (table tennis room) | 13 | 0.9%
배구장 (volleyball court) | 1 | 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

개방횟수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 373
Distinct (%): 26.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 167.44229
Minimum: 0
Maximum: 3075
Zeros: 449
Zeros (%): 32.2%
Negative: 0
Negative (%): 0.0%
Memory size: 12.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 36
Q3: 93.5
95-th percentile: 890.9
Maximum: 3075
Range: 3075
Interquartile range (IQR): 93.5

Descriptive statistics

Standard deviation: 367.54738
Coefficient of variation (CV): 2.195069
Kurtosis: 14.138983
Mean: 167.44229
Median Absolute Deviation (MAD): 36
Skewness: 3.4846101
Sum: 233582
Variance: 135091.08
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
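Several of the 개방횟수 statistics are derived from one another, so they can be cross-checked without the raw data. The sketch below recomputes the mean, CV, variance and IQR from the other reported figures. The relationships used (mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1) are the standard definitions and are assumed to match what the report computes.

```python
# Reported figures for 개방횟수.
n = 1395
total = 233582
mean = 167.44229
std = 367.54738
q1, q3 = 0, 93.5

# Quantities recomputed from the figures above.
derived = {
    "mean": total / n,       # reported: 167.44229
    "cv": std / mean,        # reported: 2.195069
    "variance": std ** 2,    # reported: 135091.08
    "iqr": q3 - q1,          # reported: 93.5
}
for name, value in derived.items():
    print(f"{name}: {value:.8g}")
```

The recomputed values agree with the reported ones to the precision shown in the report.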
Common values

Value | Count | Frequency (%)
0 | 449 | 32.2%
1 | 42 | 3.0%
40 | 22 | 1.6%
34 | 15 | 1.1%
16 | 14 | 1.0%
46 | 14 | 1.0%
51 | 11 | 0.8%
45 | 11 | 0.8%
56 | 11 | 0.8%
49 | 11 | 0.8%
Other values (363) | 795 | 57.0%

Minimum 10 values

Value | Count | Frequency (%)
0 | 449 | 32.2%
1 | 42 | 3.0%
2 | 6 | 0.4%
3 | 6 | 0.4%
4 | 9 | 0.6%
5 | 7 | 0.5%
6 | 9 | 0.6%
7 | 6 | 0.4%
8 | 5 | 0.4%
10 | 5 | 0.4%

Maximum 10 values

Value | Count | Frequency (%)
3075 | 1 | 0.1%
2542 | 1 | 0.1%
2530 | 1 | 0.1%
2335 | 2 | 0.1%
2322 | 1 | 0.1%
2301 | 1 | 0.1%
2168 | 1 | 0.1%
2083 | 1 | 0.1%
2078 | 1 | 0.1%
2075 | 1 | 0.1%

이용인원
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 772
Distinct (%): 55.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1025.8717
Minimum: 0
Maximum: 11604
Zeros: 449
Zeros (%): 32.2%
Negative: 0
Negative (%): 0.0%
Memory size: 12.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 345
Q3: 1443.5
95-th percentile: 4769.9
Maximum: 11604
Range: 11604
Interquartile range (IQR): 1443.5

Descriptive statistics

Standard deviation: 1592.0466
Coefficient of variation (CV): 1.5518964
Kurtosis: 7.3704611
Mean: 1025.8717
Median Absolute Deviation (MAD): 345
Skewness: 2.5294776
Sum: 1431091
Variance: 2534612.3
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
Common values

Value | Count | Frequency (%)
0 | 449 | 32.2%
1550 | 32 | 2.3%
65 | 6 | 0.4%
73 | 5 | 0.4%
280 | 4 | 0.3%
56 | 4 | 0.3%
380 | 4 | 0.3%
277 | 3 | 0.2%
1881 | 3 | 0.2%
400 | 3 | 0.2%
Other values (762) | 882 | 63.2%

Minimum 10 values

Value | Count | Frequency (%)
0 | 449 | 32.2%
7 | 1 | 0.1%
12 | 1 | 0.1%
15 | 2 | 0.1%
20 | 1 | 0.1%
21 | 1 | 0.1%
22 | 1 | 0.1%
30 | 2 | 0.1%
32 | 2 | 0.1%
37 | 1 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
11604 | 1 | 0.1%
9340 | 1 | 0.1%
9256 | 2 | 0.1%
9211 | 1 | 0.1%
8672 | 1 | 0.1%
8609 | 1 | 0.1%
8583 | 1 | 0.1%
8089 | 1 | 0.1%
7736 | 1 | 0.1%
7480 | 1 | 0.1%

비고
Categorical

HIGH CORRELATION  IMBALANCE 

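Distinct: 6
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 11.0 KiB

Value | Count
<NA> | 1288
풋살장 (futsal court) | 51
조깅 외 (jogging and others) | 32
다목적구장 (multipurpose field) | 22
테니스장 보수정비 (tennis court maintenance) | 1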

Length

Max length: 9
Median length: 4
Mean length: 3.9820789
Min length: 3

Unique

Unique: 2
Unique (%): 0.1%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: 풋살장

Common Values

Value | Count | Frequency (%)
<NA> | 1288 | 92.3%
풋살장 (futsal court) | 51 | 3.7%
조깅 외 (jogging and others) | 32 | 2.3%
다목적구장 (multipurpose field) | 22 | 1.6%
테니스장 보수정비 (tennis court maintenance) | 1 | 0.1%
운동장 (sports ground) | 1 | 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Word frequencies (values split into whitespace-separated tokens; percentages are relative to the 1428 tokens in total)

Value | Count | Frequency (%)
na | 1288 | 90.2%
풋살장 | 51 | 3.6%
조깅 | 32 | 2.2%
(token not preserved in source) | 32 | 2.2%
다목적구장 | 22 | 1.5%
테니스장 | 1 | 0.1%
보수정비 | 1 | 0.1%
운동장 | 1 | 0.1%

Interactions

[Figures: four interaction plots]

Correlations

[Figure: correlation heatmap]

Correlation table 1 (all six variables)

Variable | 처리장개방일자 | 물재생센터명칭 | 개방장소구분 | 개방횟수 | 이용인원 | 비고
처리장개방일자 | 1.000 | 0.000 | 0.000 | 0.193 | 0.187 | 0.000
물재생센터명칭 | 0.000 | 1.000 | 0.646 | 0.448 | 0.461 | 1.000
개방장소구분 | 0.000 | 0.646 | 1.000 | 0.565 | 0.465 | 1.000
개방횟수 | 0.193 | 0.448 | 0.565 | 1.000 | 0.853 | NaN
이용인원 | 0.187 | 0.461 | 0.465 | 0.853 | 1.000 | 1.000
비고 | 0.000 | 1.000 | 1.000 | NaN | 1.000 | 1.000

[Figure: correlation heatmap]

Correlation table 2 (categorical variables)

Variable | 개방장소구분 | 물재생센터명칭 | 비고
개방장소구분 | 1.000 | 0.446 | 0.990
물재생센터명칭 | 0.446 | 1.000 | 0.990
비고 | 0.990 | 0.990 | 1.000

[Figure: correlation heatmap]

Correlation table 3 (numeric and categorical variables)

Variable | 개방횟수 | 이용인원 | 물재생센터명칭 | 개방장소구분 | 비고
개방횟수 | 1.000 | 0.904 | 0.283 | 0.204 | 1.000
이용인원 | 0.904 | 1.000 | 0.292 | 0.159 | 0.986
물재생센터명칭 | 0.283 | 0.292 | 1.000 | 0.446 | 0.990
개방장소구분 | 0.204 | 0.159 | 0.446 | 1.000 | 0.990
비고 | 1.000 | 0.986 | 0.990 | 0.990 | 1.000
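The report text does not say which association measure underlies each correlation table. One common choice for pairs of categorical variables is Cramér's V, computed from a contingency table of co-occurrence counts. The sketch below implements the uncorrected form of that statistic as an illustration only. The function name and the toy tables are hypothetical and are not taken from this dataset.

```python
import math

def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table (list of rows of counts)."""
    n = sum(sum(row) for row in table)
    if n == 0:
        return 0.0
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Pearson chi-squared statistic: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Toy tables (made up): perfectly associated vs. independent categories.
perfect = cramers_v([[10, 0], [0, 10]])
independent = cramers_v([[5, 5], [5, 5]])
print(perfect, independent)  # 1.0 0.0
```

Values near 1, such as those involving 비고 in the tables above, indicate that one variable's category almost determines the other's. When one category dominates, as `<NA>` does in 비고, such values deserve a closer look before being treated as a meaningful relationship.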

Missing values

[Figure: nullity count] A simple visualization of nullity by column.

[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly spot patterns in data completeness.

Sample

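First rows

Index | 처리장개방일자 | 물재생센터명칭 | 개방장소구분 | 개방횟수 and 이용인원 (digits fused in source) | 비고
0 | 2023/01 | 중랑물재생센터 | 테니스장 | 52164 | <NA>
1 | 2023/01 | 중랑물재생센터 | 축구장 | 71812 | <NA>
2 | 2023/01 | 중랑물재생센터 | 족구장 | 35210 | <NA>
3 | 2023/01 | 중랑물재생센터 | 배드민턴장 | 92415 | <NA>
4 | 2023/01 | 중랑물재생센터 | 기타 | 85352 | 풋살장
5 | 2022/12 | 탄천물재생센터 | 테니스장 | 6061112 | <NA>
6 | 2022/12 | 탄천물재생센터 | 축구장 | 591346 | <NA>
7 | 2022/12 | 탄천물재생센터 | 기타 | 5777232 | <NA>
8 | 2022/12 | 중랑물재생센터 | 테니스장 | 44156 | <NA>
9 | 2022/12 | 중랑물재생센터 | 축구장 | 68776 | <NA>

Last rows

Index | 처리장개방일자 | 물재생센터명칭 | 개방장소구분 | 개방횟수 and 이용인원 (digits fused in source) | 비고
1385 | 2016/01 | 중랑물재생센터 | 기타 | 11550 | 조깅 외
1386 | 2016/01 | 서남물재생센터 | 테니스장 | 10524249 | <NA>
1387 | 2016/01 | 서남물재생센터 | 축구장 | 24834 | <NA>
1388 | 2016/01 | 서남물재생센터 | 농구장 | 00 | <NA>
1389 | 2016/01 | 서남물재생센터 | 기타 | 00 | <NA>
1390 | 2016/01 | 서남물재생센터 | 골프장 | 00 | <NA>
1391 | 2016/01 | 서남물재생센터 | 강당 | 00 | <NA>
1392 | 2016/01 | 난지물재생센터 | 테니스장 | 71378 | <NA>
1393 | 2016/01 | 난지물재생센터 | 축구장 | 781667 | <NA>
1394 | 2016/01 | 난지물재생센터 | 기타 | 115 | <NA>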