Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 693.4 KiB
Average record size in memory: 71.0 B

Variable types

Numeric: 5
Categorical: 2

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15526/S/1/datasetView.do

Alerts

측정항목 is highly overall correlated with 평균값 and 1 other field [High correlation]
평균값 is highly overall correlated with 측정항목 and 1 other field [High correlation]
측정기 상태 is highly overall correlated with 측정항목 and 1 other field [High correlation]
국가 기준초과 구분 is highly imbalanced (83.9%) [Imbalance]
지자체 기준초과 구분 is highly imbalanced (96.6%) [Imbalance]
평균값 has 642 (6.4%) zeros [Zeros]
측정기 상태 has 5964 (59.6%) zeros [Zeros]
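
The imbalance percentages in the alerts above are consistent with one minus the normalized Shannon entropy of the class frequencies. The sketch below assumes that formula rather than quoting the profiler's implementation. It recomputes both figures from the category counts reported for 국가 기준초과 구분 (9764 vs. 236) and 지자체 기준초과 구분 (9965 vs. 35). The function name `imbalance_score` is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits) of the
    class frequencies and k is the number of classes (assumed formula)."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Category counts copied from the report
national = imbalance_score([9764, 236])
local = imbalance_score([9965, 35])
print(f"{national:.1%}  {local:.1%}")
```

A score of 0 corresponds to perfectly balanced classes and a score near 1 to a column dominated by a single class, which is the case for both flags here.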

Reproduction

Analysis started: 2024-05-04 04:04:34.259420
Analysis finished: 2024-05-04 04:04:45.873535
Duration: 11.61 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

측정일시 (measurement date and time)
Real number (ℝ)

Distinct: 2075
Distinct (%): 20.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.9900211 × 10^9
Minimum: 1.9900101 × 10^9
Maximum: 1.9900328 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1.9900101 × 10^9
5-th percentile: 1.9900104 × 10^9
Q1: 1.9900122 × 10^9
median: 1.9900213 × 10^9
Q3: 1.9900307 × 10^9
95-th percentile: 1.9900324 × 10^9
Maximum: 1.9900328 × 10^9
Range: 22719
Interquartile range (IQR): 18498.25

Descriptive statistics

Standard deviation: 8211.8413
Coefficient of variation (CV): 4.1265097 × 10^-6
Kurtosis: -1.4881098
Mean: 1.9900211 × 10^9
Median Absolute Deviation (MAD): 9215.5
Skewness: 0.07781517
Sum: 1.9900211 × 10^13
Variance: 67434337
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
1990012707 | 13 | 0.1%
1990032611 | 13 | 0.1%
1990011816 | 13 | 0.1%
1990011412 | 12 | 0.1%
1990022621 | 12 | 0.1%
1990022812 | 12 | 0.1%
1990020917 | 12 | 0.1%
1990010711 | 11 | 0.1%
1990010701 | 11 | 0.1%
1990010510 | 11 | 0.1%
Other values (2065) | 9880 | 98.8%

Smallest values (ascending)
Value | Count | Frequency (%)
1990010100 | 5 | 0.1%
1990010101 | 7 | 0.1%
1990010102 | 7 | 0.1%
1990010103 | 9 | 0.1%
1990010104 | 5 | 0.1%
1990010105 | 4 | < 0.1%
1990010106 | 4 | < 0.1%
1990010107 | 3 | < 0.1%
1990010108 | 5 | 0.1%
1990010109 | 3 | < 0.1%

Largest values (descending)
Value | Count | Frequency (%)
1990032819 | 3 | < 0.1%
1990032818 | 9 | 0.1%
1990032817 | 5 | 0.1%
1990032816 | 2 | < 0.1%
1990032814 | 2 | < 0.1%
1990032813 | 2 | < 0.1%
1990032812 | 4 | < 0.1%
1990032811 | 4 | < 0.1%
1990032810 | 7 | 0.1%
1990032809 | 5 | 0.1%
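
The values of 측정일시 look like timestamps packed into integers as YYYYMMDDHH (for example, 1990010100 for 1 January 1990 at 00:00). That would explain why statistics such as the mean and variance above are hard to interpret directly. The sketch below decodes such integers with the standard library under that assumed encoding. `parse_measurement_time` is a hypothetical helper name.

```python
from datetime import datetime

def parse_measurement_time(value: int) -> datetime:
    """Decode an integer assumed to be packed as YYYYMMDDHH."""
    return datetime.strptime(f"{value:010d}", "%Y%m%d%H")

first = parse_measurement_time(1990010100)  # minimum reported for 측정일시
last = parse_measurement_time(1990032819)   # maximum reported for 측정일시
span = last - first                         # 86 days, 19:00:00
print(first, last, span)
```

Under this reading the sample covers just under 87 days of hourly readings, which is compatible with the 2075 distinct values reported.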

측정소 코드 (station code)
Real number (ℝ)

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 112.4052
Minimum: 103
Maximum: 124
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 103
5-th percentile: 103
Q1: 105
median: 113
Q3: 122
95-th percentile: 124
Maximum: 124
Range: 21
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 7.4239426
Coefficient of variation (CV): 0.066046256
Kurtosis: -1.3896716
Mean: 112.4052
Median Absolute Deviation (MAD): 8
Skewness: 0.32927212
Sum: 1124052
Variance: 55.114924
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=8)]

Common values
Value | Count | Frequency (%)
105 | 1280 | 12.8%
122 | 1265 | 12.7%
117 | 1262 | 12.6%
107 | 1258 | 12.6%
124 | 1252 | 12.5%
113 | 1239 | 12.4%
103 | 1229 | 12.3%
108 | 1215 | 12.2%

Smallest values (ascending)
Value | Count | Frequency (%)
103 | 1229 | 12.3%
105 | 1280 | 12.8%
107 | 1258 | 12.6%
108 | 1215 | 12.2%
113 | 1239 | 12.4%
117 | 1262 | 12.6%
122 | 1265 | 12.7%
124 | 1252 | 12.5%

Largest values (descending)
Value | Count | Frequency (%)
124 | 1252 | 12.5%
122 | 1265 | 12.7%
117 | 1262 | 12.6%
113 | 1239 | 12.4%
108 | 1215 | 12.2%
107 | 1258 | 12.6%
105 | 1280 | 12.8%
103 | 1229 | 12.3%
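
The descriptive statistics are tied together by simple identities: the variance is the square of the standard deviation, and the coefficient of variation is the standard deviation divided by the mean. As a quick consistency check, a minimal sketch using the figures reported for 측정소 코드:

```python
import math

# Figures copied from the 측정소 코드 descriptive statistics
std = 7.4239426
mean = 112.4052
reported_variance = 55.114924
reported_cv = 0.066046256

variance = std ** 2  # variance = standard deviation squared
cv = std / mean      # coefficient of variation = standard deviation / mean

# Both identities hold to within the rounding of the reported figures
assert math.isclose(variance, reported_variance, rel_tol=1e-6)
assert math.isclose(cv, reported_cv, rel_tol=1e-6)
```

With only 8 distinct values, the column may well act as an identifier rather than a quantity, in which case these statistics describe the code numbers and say nothing about the stations themselves.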

측정항목 (measurement item)
Real number (ℝ)

Alerts: HIGH CORRELATION

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.3349
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
median: 6
Q3: 8
95-th percentile: 9
Maximum: 9
Range: 8
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.7621559
Coefficient of variation (CV): 0.51775213
Kurtosis: -1.2251333
Mean: 5.3349
Median Absolute Deviation (MAD): 3
Skewness: -0.19929497
Sum: 53349
Variance: 7.6295049
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=6)]

Common values
Value | Count | Frequency (%)
9 | 1708 | 17.1%
3 | 1699 | 17.0%
1 | 1672 | 16.7%
8 | 1655 | 16.6%
6 | 1638 | 16.4%
5 | 1628 | 16.3%

Smallest values (ascending)
Value | Count | Frequency (%)
1 | 1672 | 16.7%
3 | 1699 | 17.0%
5 | 1628 | 16.3%
6 | 1638 | 16.4%
8 | 1655 | 16.6%
9 | 1708 | 17.1%

Largest values (descending)
Value | Count | Frequency (%)
9 | 1708 | 17.1%
8 | 1655 | 16.6%
6 | 1638 | 16.4%
5 | 1628 | 16.3%
3 | 1699 | 17.0%
1 | 1672 | 16.7%

평균값 (average value)
Real number (ℝ)

Alerts: HIGH CORRELATION, ZEROS

Distinct: 530
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -3185.5209
Minimum: -9999
Maximum: 272
Zeros: 642
Zeros (%): 6.4%
Negative: 3903
Negative (%): 39.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -9999
5-th percentile: -9999
Q1: -9999
median: 0.007
Q3: 0.058
95-th percentile: 5.2
Maximum: 272
Range: 10271
Interquartile range (IQR): 9999.058

Descriptive statistics

Standard deviation: 4637.104
Coefficient of variation (CV): -1.4556816
Kurtosis: -1.376976
Mean: -3185.5209
Median Absolute Deviation (MAD): 1.893
Skewness: -0.78683898
Sum: -31855209
Variance: 21502734
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
-9999.0 | 3163 | 31.6%
0.0 | 642 | 6.4%
-9.999 | 494 | 4.9%
-999.9 | 243 | 2.4%
0.001 | 123 | 1.2%
0.002 | 89 | 0.9%
0.018 | 77 | 0.8%
0.031 | 73 | 0.7%
0.006 | 72 | 0.7%
0.022 | 70 | 0.7%
Other values (520) | 4954 | 49.5%

Smallest values (ascending)
Value | Count | Frequency (%)
-9999.0 | 3163 | 31.6%
-999.9 | 243 | 2.4%
-10.03 | 1 | < 0.1%
-9.999 | 494 | 4.9%
-0.048 | 1 | < 0.1%
-0.039 | 1 | < 0.1%
0.0 | 642 | 6.4%
0.001 | 123 | 1.2%
0.002 | 89 | 0.9%
0.003 | 53 | 0.5%

Largest values (descending)
Value | Count | Frequency (%)
272.0 | 1 | < 0.1%
231.0 | 1 | < 0.1%
206.0 | 2 | < 0.1%
192.0 | 1 | < 0.1%
189.0 | 1 | < 0.1%
187.0 | 1 | < 0.1%
184.0 | 1 | < 0.1%
175.0 | 1 | < 0.1%
173.0 | 1 | < 0.1%
169.0 | 1 | < 0.1%
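
The most frequent values of 평균값 include -9999.0, -999.9, and -9.999, which look like placeholder codes for invalid or missing readings rather than real measurements. That reading is an assumption; the report does not state it. Under it, the arithmetic below backs those rows out of the reported sum to estimate the mean of the remaining observations. Because the reported sum is rounded, the result is approximate.

```python
# Figures copied from the report for 평균값
n_total = 10000
reported_sum = -31855209.0  # rounded in the report

# Assumed placeholder codes and their reported counts
sentinels = {-9999.0: 3163, -999.9: 243, -9.999: 494}

sentinel_rows = sum(sentinels.values())
sentinel_sum = sum(value * count for value, count in sentinels.items())

# Mean of the rows that are left once the assumed placeholders are excluded
valid_rows = n_total - sentinel_rows
valid_mean = (reported_sum - sentinel_sum) / valid_rows

print(valid_rows, round(valid_mean, 2))
```

The remaining 6100 rows average roughly 3.2, compared with the reported overall mean of -3185.5209. The column also appears to pool readings for several 측정항목 codes, so a per-item breakdown would likely be more informative than any single mean.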

측정기 상태 (instrument status)
Real number (ℝ)

Alerts: HIGH CORRELATION, ZEROS

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.6812
Minimum: 0
Maximum: 9
Zeros: 5964
Zeros (%): 59.6%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 4
95-th percentile: 4
Maximum: 9
Range: 9
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.2192023
Coefficient of variation (CV): 1.3200109
Kurtosis: 0.26354074
Mean: 1.6812
Median Absolute Deviation (MAD): 0
Skewness: 0.99967827
Sum: 16812
Variance: 4.924859
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=6)]

Common values
Value | Count | Frequency (%)
0 | 5964 | 59.6%
4 | 3377 | 33.8%
2 | 310 | 3.1%
8 | 209 | 2.1%
9 | 109 | 1.1%
1 | 31 | 0.3%

Smallest values (ascending)
Value | Count | Frequency (%)
0 | 5964 | 59.6%
1 | 31 | 0.3%
2 | 310 | 3.1%
4 | 3377 | 33.8%
8 | 209 | 2.1%
9 | 109 | 1.1%

Largest values (descending)
Value | Count | Frequency (%)
9 | 109 | 1.1%
8 | 209 | 2.1%
4 | 3377 | 33.8%
2 | 310 | 3.1%
1 | 31 | 0.3%
0 | 5964 | 59.6%

국가 기준초과 구분 (national standard exceedance flag)
Categorical

Alerts: IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9764 | 97.6%
1 | 236 | 2.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

지자체 기준초과 구분 (local government standard exceedance flag)
Categorical

Alerts: IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9965 | 99.7%
1 | 35 | 0.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Interactions

[Figures: 25 interaction plots]

Correlations

Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | 0.000 | 0.000 | 0.123 | 0.117 | 0.078 | 0.058
측정소 코드 | 0.000 | 1.000 | 0.000 | 0.286 | 0.371 | 0.109 | 0.122
측정항목 | 0.000 | 0.000 | 1.000 | 0.281 | 0.821 | 0.440 | 0.163
평균값 | 0.123 | 0.286 | 0.281 | 1.000 | 0.818 | 0.050 | 0.000
측정기 상태 | 0.117 | 0.371 | 0.821 | 0.818 | 1.000 | 0.175 | 0.060
국가 기준초과 구분 | 0.078 | 0.109 | 0.440 | 0.050 | 0.175 | 1.000 | 0.271
지자체 기준초과 구분 | 0.058 | 0.122 | 0.163 | 0.000 | 0.060 | 0.271 | 1.000

Variable | 지자체 기준초과 구분 | 국가 기준초과 구분
지자체 기준초과 구분 | 1.000 | 0.175
국가 기준초과 구분 | 0.175 | 1.000

Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | 0.001 | -0.012 | 0.016 | -0.020 | 0.056 | 0.041
측정소 코드 | 0.001 | 1.000 | -0.004 | 0.131 | -0.147 | 0.078 | 0.088
측정항목 | -0.012 | -0.004 | 1.000 | -0.687 | 0.610 | 0.317 | 0.118
평균값 | 0.016 | 0.131 | -0.687 | 1.000 | -0.845 | 0.111 | 0.040
측정기 상태 | -0.020 | -0.147 | 0.610 | -0.845 | 1.000 | 0.126 | 0.043
국가 기준초과 구분 | 0.056 | 0.078 | 0.317 | 0.111 | 0.126 | 1.000 | 0.175
지자체 기준초과 구분 | 0.041 | 0.088 | 0.118 | 0.040 | 0.043 | 0.175 | 1.000

Missing values

[Figure: count] A simple visualization of nullity by column.
[Figure: matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

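(index) | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
34013 | 1990013012 | 113 | 9 | -9999.0 | 4 | 0 | 0
40915 | 1990020512 | 108 | 3 | 0.044 | 0 | 0 | 0
24797 | 1990012212 | 113 | 9 | -9999.0 | 4 | 0 | 0
32711 | 1990012909 | 108 | 9 | -9999.0 | 4 | 0 | 0
61426 | 1990022307 | 117 | 8 | -9999.0 | 4 | 0 | 0
3697 | 1990010405 | 103 | 3 | 0.026 | 0 | 0 | 0
3895 | 1990010409 | 105 | 3 | 0.027 | 0 | 0 | 0
4837 | 1990010504 | 122 | 3 | 0.034 | 0 | 0 | 0
11302 | 1990011019 | 108 | 8 | -9999.0 | 4 | 0 | 0
64817 | 1990022606 | 107 | 9 | -9999.0 | 4 | 0 | 0

(index) | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
63915 | 1990022511 | 113 | 6 | 0.009 | 0 | 0 | 0
185 | 1990010103 | 122 | 9 | -9999.0 | 4 | 0 | 0
73364 | 1990030516 | 108 | 5 | 1.4 | 0 | 0 | 0
46195 | 1990021002 | 108 | 3 | 0.015 | 0 | 0 | 0
9433 | 1990010904 | 113 | 3 | 0.025 | 0 | 0 | 0
19461 | 1990011721 | 108 | 6 | 0.0 | 0 | 0 | 0
6663 | 1990010618 | 122 | 6 | 0.004 | 0 | 0 | 0
50546 | 1990021321 | 103 | 5 | 5.2 | 0 | 0 | 0
50787 | 1990021402 | 103 | 6 | 0.0 | 0 | 0 | 0
41671 | 1990020604 | 105 | 3 | 0.031 | 0 | 0 | 0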