Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 693.4 KiB
Average record size in memory: 71.0 B
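
The average record size follows from the two figures above it. A minimal sketch of the arithmetic, assuming the profiler divides the total in-memory size by the number of observations (the function name is illustrative, not part of any library):

```python
# Figures copied from the "Dataset statistics" block.
TOTAL_SIZE_KIB = 693.4
N_OBSERVATIONS = 10_000


def average_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Convert KiB to bytes (1 KiB = 1024 B) and divide by the row count."""
    return total_kib * 1024 / n_rows


record_size = average_record_size_bytes(TOTAL_SIZE_KIB, N_OBSERVATIONS)
print(f"{record_size:.1f} B")
```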

Variable types

Numeric: 5
Categorical: 2

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15526/S/1/datasetView.do

Alerts

측정항목 (measurement item) is highly correlated overall with 평균값 and 1 other field [High correlation]
평균값 (mean value) is highly correlated overall with 측정항목 and 1 other field [High correlation]
측정기 상태 (instrument status) is highly correlated overall with 측정항목 and 1 other field [High correlation]
국가 기준초과 구분 (national standard exceedance flag) is highly imbalanced (93.7%) [Imbalance]
지자체 기준초과 구분 (local government standard exceedance flag) is highly imbalanced (99.2%) [Imbalance]
평균값 (mean value) has 414 (4.1%) zeros [Zeros]
측정기 상태 (instrument status) has 4954 (49.5%) zeros [Zeros]
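
The two imbalance percentages are consistent with a normalized-entropy score, 1 - H / log2(k), where H is the Shannon entropy of the class frequencies in bits and k is the number of classes. The sketch below assumes that formula (it reproduces the reported figures from the class counts listed later in the report, but it is not taken from the profiler's source), and `imbalance_score` is an illustrative name:

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class
    dominates. A single-class column is scored 0.0 by convention here.
    """
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Class counts taken from the "Common Values" tables of the two flag columns.
national = imbalance_score([9926, 74])
local = imbalance_score([9993, 7])
print(f"{national:.1%}, {local:.1%}")
```

With only 74 and 7 positive rows respectively, any model or summary built on these flags would be dominated by the 0 class.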

Reproduction

Analysis started: 2024-05-04 04:03:52.203109
Analysis finished: 2024-05-04 04:04:03.832342
Duration: 11.63 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
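
The reported duration can be checked against the two timestamps. A short sketch using only the standard library; the format string is an assumption that matches the timestamps as printed:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed layout of the timestamps above

started = datetime.strptime("2024-05-04 04:03:52.203109", FMT)
finished = datetime.strptime("2024-05-04 04:04:03.832342", FMT)

# Elapsed wall-clock time between the start and end of the analysis.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```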

Variables

측정일시 (measurement date and time)
Real number (ℝ)

Distinct: 1685
Distinct (%): 16.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.9920179 × 10^9
Minimum: 1.9920101 × 10^9
Maximum: 1.9920311 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1.9920101 × 10^9
5-th percentile: 1.9920104 × 10^9
Q1: 1.9920116 × 10^9
Median: 1.9920201 × 10^9
Q3: 1.9920221 × 10^9
95-th percentile: 1.9920307 × 10^9
Maximum: 1.9920311 × 10^9
Range: 21006
Interquartile range (IQR): 10487.25

Descriptive statistics

Standard deviation: 6881.9403
Coefficient of variation (CV): 3.4547583 × 10^-6
Kurtosis: -0.96675957
Mean: 1.9920179 × 10^9
Median Absolute Deviation (MAD): 7798
Skewness: 0.53552467
Sum: 1.9920179 × 10^13
Variance: 47361102
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1992011017 | 15 | 0.1%
1992010704 | 13 | 0.1%
1992011620 | 13 | 0.1%
1992010117 | 13 | 0.1%
1992012302 | 13 | 0.1%
1992021512 | 13 | 0.1%
1992012007 | 13 | 0.1%
1992011923 | 12 | 0.1%
1992013009 | 12 | 0.1%
1992020218 | 12 | 0.1%
Other values (1675) | 9871 | 98.7%

Value | Count | Frequency (%)
1992010100 | 5 | 0.1%
1992010101 | 7 | 0.1%
1992010102 | 6 | 0.1%
1992010103 | 5 | 0.1%
1992010104 | 10 | 0.1%
1992010105 | 4 | < 0.1%
1992010106 | 6 | 0.1%
1992010107 | 5 | 0.1%
1992010108 | 4 | < 0.1%
1992010109 | 12 | 0.1%

Value | Count | Frequency (%)
1992031106 | 4 | < 0.1%
1992031105 | 7 | 0.1%
1992031104 | 5 | 0.1%
1992031103 | 11 | 0.1%
1992031102 | 4 | < 0.1%
1992031101 | 2 | < 0.1%
1992031100 | 2 | < 0.1%
1992031023 | 9 | 0.1%
1992031022 | 5 | 0.1%
1992031021 | 5 | 0.1%
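
The values of 측정일시 look like timestamps packed into integers as YYYYMMDDHH, which would explain why the profiler typed the column as a real number and why its mean and variance carry little meaning. The sketch below assumes that encoding (it is inferred from the values, not stated in the report), and `parse_measurement_time` is a hypothetical helper:

```python
from datetime import datetime


def parse_measurement_time(value: int) -> datetime:
    """Decode an integer such as 1992010100 as year, month, day, hour."""
    return datetime.strptime(str(value), "%Y%m%d%H")


# Smallest and largest values from the frequency tables.
first = parse_measurement_time(1992010100)
last = parse_measurement_time(1992031106)

print(first, last, last - first)
```

Note that the reported Range of 21006 is the plain integer difference 1992031106 - 1992010100, not an elapsed time; after decoding, the span is measured in days and hours.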

측정소 코드 (measuring station code)
Real number (ℝ)

Distinct: 11
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 112.4342
Minimum: 103
Maximum: 124
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 103
5-th percentile: 103
Q1: 107
Median: 113
Q3: 117
95-th percentile: 124
Maximum: 124
Range: 21
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 6.8692932
Coefficient of variation (CV): 0.061096118
Kurtosis: -1.2173878
Mean: 112.4342
Median Absolute Deviation (MAD): 6
Skewness: 0.29942858
Sum: 1124342
Variance: 47.187189
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
Value | Count | Frequency (%)
113 | 1036 | 10.4%
108 | 1034 | 10.3%
116 | 1030 | 10.3%
117 | 1024 | 10.2%
122 | 1021 | 10.2%
107 | 1000 | 10.0%
124 | 990 | 9.9%
105 | 990 | 9.9%
103 | 976 | 9.8%
106 | 455 | 4.5%

Value | Count | Frequency (%)
103 | 976 | 9.8%
105 | 990 | 9.9%
106 | 455 | 4.5%
107 | 1000 | 10.0%
108 | 1034 | 10.3%
111 | 444 | 4.4%
113 | 1036 | 10.4%
116 | 1030 | 10.3%
117 | 1024 | 10.2%
122 | 1021 | 10.2%

Value | Count | Frequency (%)
124 | 990 | 9.9%
122 | 1021 | 10.2%
117 | 1024 | 10.2%
116 | 1030 | 10.3%
113 | 1036 | 10.4%
111 | 444 | 4.4%
108 | 1034 | 10.3%
107 | 1000 | 10.0%
106 | 455 | 4.5%
105 | 990 | 9.9%
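
Because all 11 station codes and their counts appear across the frequency tables, the descriptive statistics can be recomputed from them. The sketch below does so, assuming the reported variance is the sample variance (divisor n - 1); `weighted_stats` is an illustrative helper, not a profiler function:

```python
import math

# Value -> count, merged from the three frequency tables for 측정소 코드.
STATION_COUNTS = {
    103: 976, 105: 990, 106: 455, 107: 1000, 108: 1034, 111: 444,
    113: 1036, 116: 1030, 117: 1024, 122: 1021, 124: 990,
}


def weighted_stats(freq):
    """Compute sum, mean, sample variance, std and CV from a value->count map."""
    n = sum(freq.values())
    total = sum(value * count for value, count in freq.items())
    mean = total / n
    # Sample variance: squared deviations weighted by count, divided by n - 1.
    variance = sum(c * (v - mean) ** 2 for v, c in freq.items()) / (n - 1)
    std = math.sqrt(variance)
    return {"n": n, "sum": total, "mean": mean,
            "variance": variance, "std": std, "cv": std / mean}


stats = weighted_stats(STATION_COUNTS)
print(stats)
```

The recomputed sum and mean match the report exactly, which also confirms that the counts add up to the 10000 observations. Since the codes are station identifiers rather than quantities, these moments describe the code distribution only and have no physical interpretation.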

측정항목 (measurement item)
Real number (ℝ)

Alerts: High correlation

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.3326
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 6
Q3: 8
95-th percentile: 9
Maximum: 9
Range: 8
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.7724981
Coefficient of variation (CV): 0.51991489
Kurtosis: -1.2341175
Mean: 5.3326
Median Absolute Deviation (MAD): 3
Skewness: -0.2037194
Sum: 53326
Variance: 7.6867459
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value | Count | Frequency (%)
9 | 1710 | 17.1%
1 | 1701 | 17.0%
3 | 1687 | 16.9%
6 | 1675 | 16.8%
8 | 1663 | 16.6%
5 | 1564 | 15.6%

Value | Count | Frequency (%)
1 | 1701 | 17.0%
3 | 1687 | 16.9%
5 | 1564 | 15.6%
6 | 1675 | 16.8%
8 | 1663 | 16.6%
9 | 1710 | 17.1%

Value | Count | Frequency (%)
9 | 1710 | 17.1%
8 | 1663 | 16.6%
6 | 1675 | 16.8%
5 | 1564 | 15.6%
3 | 1687 | 16.9%
1 | 1701 | 17.0%

평균값 (mean value)
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 318
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -3206.8111
Minimum: -9999
Maximum: 208
Zeros: 414
Zeros (%): 4.1%
Negative: 4615
Negative (%): 46.2%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -9999
5-th percentile: -9999
Q1: -9999
Median: 0
Q3: 0.039
95-th percentile: 2.4
Maximum: 208
Range: 10207
Interquartile range (IQR): 9999.039

Descriptive statistics

Standard deviation: 4630.3133
Coefficient of variation (CV): -1.4438996
Kurtosis: -1.382164
Mean: -3206.8111
Median Absolute Deviation (MAD): 2.9
Skewness: -0.78217227
Sum: -32068111
Variance: 21439801
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
-9999.0 | 3169 | 31.7%
-9.999 | 1068 | 10.7%
0.0 | 414 | 4.1%
-999.9 | 376 | 3.8%
0.001 | 142 | 1.4%
0.002 | 103 | 1.0%
0.009 | 81 | 0.8%
0.003 | 80 | 0.8%
0.011 | 77 | 0.8%
0.019 | 76 | 0.8%
Other values (308) | 4414 | 44.1%

Value | Count | Frequency (%)
-9999.0 | 3169 | 31.7%
-999.9 | 376 | 3.8%
-10.122 | 1 | < 0.1%
-10.011 | 1 | < 0.1%
-9.999 | 1068 | 10.7%
0.0 | 414 | 4.1%
0.001 | 142 | 1.4%
0.002 | 103 | 1.0%
0.003 | 80 | 0.8%
0.004 | 59 | 0.6%

Value | Count | Frequency (%)
208.0 | 1 | < 0.1%
184.0 | 1 | < 0.1%
170.0 | 1 | < 0.1%
168.0 | 1 | < 0.1%
159.0 | 1 | < 0.1%
135.0 | 2 | < 0.1%
113.0 | 1 | < 0.1%
109.0 | 1 | < 0.1%
107.0 | 1 | < 0.1%
84.0 | 1 | < 0.1%
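
Three values, -9999.0, -999.9 and -9.999, account for 4613 of the 10000 rows and drive the mean of -3206.8111 and the IQR of 9999.039. They look like placeholder codes for unavailable readings, but that is an assumption: the report does not say so. Under that assumption, the sketch below masks them and estimates the mean of the remaining rows from the reported (rounded) sum; the names are illustrative:

```python
import math

# Assumed placeholder codes and their counts, from the frequency tables.
SENTINEL_COUNTS = {-9999.0: 3169, -999.9: 376, -9.999: 1068}
REPORTED_SUM = -32068111  # as displayed in the report, so rounded
N_ROWS = 10_000


def mask_sentinels(values, sentinels=tuple(SENTINEL_COUNTS)):
    """Replace assumed placeholder codes with None, keep other readings."""
    return [None if any(math.isclose(v, s) for s in sentinels) else v
            for v in values]


sentinel_rows = sum(SENTINEL_COUNTS.values())
sentinel_sum = sum(value * count for value, count in SENTINEL_COUNTS.items())

# Mean over the rows that are not placeholder codes (approximate, because
# the reported sum is rounded).
remaining_mean = (REPORTED_SUM - sentinel_sum) / (N_ROWS - sentinel_rows)

print(sentinel_rows, round(remaining_mean, 2))
print(mask_sentinels([0.051, -9999.0, -9.999, 2.5]))
```

The Negative count of 4615 exceeds the 4613 placeholder rows by two, which corresponds to the single occurrences of -10.122 and -10.011; those would need separate review. The strong correlations between 평균값, 측정항목 and 측정기 상태 flagged in the alerts may also largely reflect these codes rather than the measurements themselves.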

측정기 상태 (instrument status)
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.9466
Minimum: 0
Maximum: 9
Zeros: 4954
Zeros (%): 49.5%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 2
Q3: 4
95-th percentile: 4
Maximum: 9
Range: 9
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 1.9914178
Coefficient of variation (CV): 1.0230237
Kurtosis: -1.562241
Mean: 1.9466
Median Absolute Deviation (MAD): 2
Skewness: 0.17327247
Sum: 19466
Variance: 3.965745
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value | Count | Frequency (%)
0 | 4954 | 49.5%
4 | 4591 | 45.9%
2 | 402 | 4.0%
9 | 28 | 0.3%
1 | 22 | 0.2%
8 | 3 | < 0.1%

Value | Count | Frequency (%)
0 | 4954 | 49.5%
1 | 22 | 0.2%
2 | 402 | 4.0%
4 | 4591 | 45.9%
8 | 3 | < 0.1%
9 | 28 | 0.3%

Value | Count | Frequency (%)
9 | 28 | 0.3%
8 | 3 | < 0.1%
4 | 4591 | 45.9%
2 | 402 | 4.0%
1 | 22 | 0.2%
0 | 4954 | 49.5%

국가 기준초과 구분 (national standard exceedance flag)
Categorical

Alerts: Imbalance

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Category counts: 0 (9926), 1 (74)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9926 | 99.3%
1 | 74 | 0.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 9926 | 99.3%
1 | 74 | 0.7%

지자체 기준초과 구분 (local government standard exceedance flag)
Categorical

Alerts: Imbalance

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Category counts: 0 (9993), 1 (7)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9993 | 99.9%
1 | 7 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 9993 | 99.9%
1 | 7 | 0.1%

Interactions

Figure: pairwise interaction plots of the numeric variables (25 panels; images not preserved).

Correlations

Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | 0.186 | 0.000 | 0.173 | 0.278 | 0.058 | 0.086
측정소 코드 | 0.186 | 1.000 | 0.000 | 0.228 | 0.273 | 0.035 | 0.044
측정항목 | 0.000 | 0.000 | 1.000 | 0.363 | 0.730 | 0.242 | 0.076
평균값 | 0.173 | 0.228 | 0.363 | 1.000 | 0.561 | 0.029 | 0.000
측정기 상태 | 0.278 | 0.273 | 0.730 | 0.561 | 1.000 | 0.117 | 0.020
국가 기준초과 구분 | 0.058 | 0.035 | 0.242 | 0.029 | 0.117 | 1.000 | 0.303
지자체 기준초과 구분 | 0.086 | 0.044 | 0.076 | 0.000 | 0.020 | 0.303 | 1.000

Variable | 지자체 기준초과 구분 | 국가 기준초과 구분
지자체 기준초과 구분 | 1.000 | 0.196
국가 기준초과 구분 | 0.196 | 1.000

Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | 0.046 | 0.004 | 0.025 | -0.028 | 0.041 | 0.062
측정소 코드 | 0.046 | 1.000 | 0.001 | 0.080 | -0.153 | 0.038 | 0.047
측정항목 | 0.004 | 0.001 | 1.000 | -0.727 | 0.614 | 0.174 | 0.055
평균값 | 0.025 | 0.080 | -0.727 | 1.000 | -0.873 | 0.062 | 0.014
측정기 상태 | -0.028 | -0.153 | 0.614 | -0.873 | 1.000 | 0.084 | 0.015
국가 기준초과 구분 | 0.041 | 0.038 | 0.174 | 0.062 | 0.084 | 1.000 | 0.196
지자체 기준초과 구분 | 0.062 | 0.047 | 0.055 | 0.014 | 0.015 | 0.196 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

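(index) | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
56137 | 1992020610 | 107 | 3 | 0.051 | 0 | 0 | 0
50385 | 1992020123 | 117 | 6 | 0.009 | 0 | 0 | 0
46901 | 1992013014 | 113 | 9 | -9999.0 | 4 | 0 | 0
47452 | 1992013022 | 124 | 8 | -9999.0 | 4 | 0 | 0
40368 | 1992012611 | 116 | 1 | 0.121 | 0 | 0 | 0
11438 | 1992010805 | 107 | 5 | 2.5 | 0 | 0 | 0
92547 | 1992030512 | 113 | 6 | 0.008 | 0 | 0 | 0
72781 | 1992021906 | 113 | 3 | 0.036 | 0 | 0 | 0
52777 | 1992020320 | 103 | 3 | 0.024 | 0 | 0 | 0
25311 | 1992011623 | 111 | 6 | -9.999 | 4 | 0 | 0

(index) | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
35557 | 1992012310 | 117 | 3 | 0.024 | 0 | 0 | 0
4710 | 1992010323 | 108 | 1 | 0.043 | 0 | 0 | 0
39345 | 1992012520 | 105 | 6 | -9.999 | 4 | 0 | 0
42198 | 1992012715 | 108 | 1 | 0.035 | 0 | 0 | 0
29268 | 1992011911 | 111 | 1 | -9.999 | 4 | 0 | 0
25867 | 1992011707 | 124 | 3 | 0.019 | 0 | 0 | 0
95064 | 1992030711 | 105 | 1 | 0.052 | 0 | 0 | 0
7515 | 1992010517 | 122 | 6 | 0.005 | 0 | 0 | 0
48262 | 1992013111 | 106 | 8 | -9999.0 | 4 | 0 | 0
92445 | 1992030510 | 116 | 6 | 0.012 | 2 | 0 | 0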