Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 3
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 693.4 KiB
Average record size in memory: 71.0 B

Variable types

Numeric: 5
Categorical: 2

Dataset

Description: File download
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15526/S/1/datasetView.do

Alerts

국가 기준초과 구분 has a high overall correlation with 평균값 and 1 other field (High correlation)
지자체 기준초과 구분 has a high overall correlation with 평균값 and 1 other field (High correlation)
측정항목 has a high overall correlation with 평균값 (High correlation)
평균값 has a high overall correlation with 측정항목 and 2 other fields (High correlation)
국가 기준초과 구분 is highly imbalanced (82.8%) (Imbalance)
지자체 기준초과 구분 is highly imbalanced (82.8%) (Imbalance)
측정기 상태 has 9852 (98.5%) zeros (Zeros)
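
The 82.8% imbalance figure can be reproduced from the class counts reported for the two exceedance flags (9744 zeros, 256 ones). The sketch below assumes the score is one minus the Shannon entropy normalized by the log of the class count. That definition is an assumption, chosen because it matches the reported number; the report text itself does not state the formula. The function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - normalized Shannon entropy (assumed definition of the score)."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    # Normalize by the maximum entropy for this number of classes.
    return 1 - entropy / math.log2(len(counts))

# Class counts taken from the Common Values tables of both flag columns.
score = imbalance_score([9744, 256])
print(f"{score:.1%}")  # 82.8%
```

A perfectly balanced column would score 0% under this definition, and a column with a single value would approach 100%.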

Reproduction

Analysis started: 2024-05-11 07:01:05.078406
Analysis finished: 2024-05-11 07:01:12.182184
Duration: 7.1 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

측정일시 (measurement date and hour)
Real number (ℝ)

Distinct: 468
Distinct (%): 4.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.016011 × 10^9
Minimum: 2.0160101 × 10^9
Maximum: 2.016012 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2.0160101 × 10^9
5-th percentile: 2.0160102 × 10^9
Q1: 2.0160105 × 10^9
Median: 2.016011 × 10^9
Q3: 2.0160115 × 10^9
95-th percentile: 2.0160119 × 10^9
Maximum: 2.016012 × 10^9
Range: 1911
Interquartile range (IQR): 990

Descriptive statistics

Standard deviation: 558.88052
Coefficient of variation (CV): 2.7722096 × 10^-7
Kurtosis: -1.186872
Mean: 2.016011 × 10^9
Median Absolute Deviation (MAD): 495
Skewness: 0.0099920943
Sum: 2.016011 × 10^13
Variance: 312347.44
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values
Value  Count  Frequency (%)
2016011223  33  0.3%
2016011804  32  0.3%
2016010120  31  0.3%
2016011613  31  0.3%
2016010513  31  0.3%
2016011101  31  0.3%
2016010518  30  0.3%
2016011320  30  0.3%
2016011007  30  0.3%
2016011206  30  0.3%
Other values (458)  9691  96.9%

Minimum 10 values
Value  Count  Frequency (%)
2016010100  23  0.2%
2016010101  16  0.2%
2016010102  24  0.2%
2016010103  12  0.1%
2016010104  19  0.2%
2016010105  20  0.2%
2016010106  23  0.2%
2016010107  22  0.2%
2016010108  24  0.2%
2016010109  18  0.2%

Maximum 10 values
Value  Count  Frequency (%)
2016012011  23  0.2%
2016012010  26  0.3%
2016012009  20  0.2%
2016012008  22  0.2%
2016012007  19  0.2%
2016012006  17  0.2%
2016012005  23  0.2%
2016012004  15  0.1%
2016012003  26  0.3%
2016012002  21  0.2%

측정소 코드 (monitoring station code)
Real number (ℝ)

Distinct: 25
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 112.9305
Minimum: 101
Maximum: 125
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 101
5-th percentile: 102
Q1: 107
Median: 113
Q3: 119
95-th percentile: 124
Maximum: 125
Range: 24
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 7.2074451
Coefficient of variation (CV): 0.063821953
Kurtosis: -1.2025787
Mean: 112.9305
Median Absolute Deviation (MAD): 6
Skewness: 0.0050087726
Sum: 1129305
Variance: 51.947264
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=25)

Common values
Value  Count  Frequency (%)
101  434  4.3%
105  432  4.3%
109  420  4.2%
121  420  4.2%
106  420  4.2%
119  414  4.1%
111  413  4.1%
113  403  4.0%
114  403  4.0%
124  402  4.0%
Other values (15)  5839  58.4%

Minimum 10 values
Value  Count  Frequency (%)
101  434  4.3%
102  399  4.0%
103  392  3.9%
104  360  3.6%
105  432  4.3%
106  420  4.2%
107  390  3.9%
108  391  3.9%
109  420  4.2%
110  398  4.0%

Maximum 10 values
Value  Count  Frequency (%)
125  373  3.7%
124  402  4.0%
123  392  3.9%
122  394  3.9%
121  420  4.2%
120  395  4.0%
119  414  4.1%
118  375  3.8%
117  401  4.0%
116  395  4.0%

측정항목 (measured item code)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.3075
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 5
Q3: 8
95-th percentile: 9
Maximum: 9
Range: 8
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.7396157
Coefficient of variation (CV): 0.51617819
Kurtosis: -1.2010793
Mean: 5.3075
Median Absolute Deviation (MAD): 2
Skewness: -0.19155
Sum: 53075
Variance: 7.5054943
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=6)

Common values
Value  Count  Frequency (%)
5  1721  17.2%
3  1676  16.8%
1  1671  16.7%
6  1660  16.6%
8  1637  16.4%
9  1635  16.4%

Minimum values
Value  Count  Frequency (%)
1  1671  16.7%
3  1676  16.8%
5  1721  17.2%
6  1660  16.6%
8  1637  16.4%
9  1635  16.4%

Maximum values
Value  Count  Frequency (%)
9  1635  16.4%
8  1637  16.4%
6  1660  16.6%
5  1721  17.2%
3  1676  16.8%
1  1671  16.7%

평균값 (average value)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 420
Distinct (%): 4.2%
Missing: 3
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.810341
Minimum: -248
Maximum: 217
Zeros: 41
Zeros (%): 0.4%
Negative: 4
Negative (%): < 0.1%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -248
5-th percentile: 0.004
Q1: 0.011
Median: 0.067
Q3: 21
95-th percentile: 61
Maximum: 217
Range: 465
Interquartile range (IQR): 20.989

Descriptive statistics

Standard deviation: 22.902571
Coefficient of variation (CV): 1.7878189
Kurtosis: 7.7327693
Mean: 12.810341
Median Absolute Deviation (MAD): 0.066
Skewness: 2.0339361
Sum: 128064.98
Variance: 524.52774
Monotonicity: Not monotonic
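
Several of the derived figures for 평균값 can be cross-checked from the others. The sketch below recomputes the mean, coefficient of variation, variance, range, and IQR from the numbers printed in this report. It assumes the mean is taken over the 9997 non-missing rows (10000 observations minus 3 missing), which the result supports but the report does not state.

```python
# Figures copied from the report for the 평균값 column.
total_sum = 128064.98
n_rows, n_missing = 10000, 3
std = 22.902571
q1, q3 = 0.011, 21
minimum, maximum = -248, 217

mean = total_sum / (n_rows - n_missing)  # mean over non-missing rows (assumption)
cv = std / mean                          # coefficient of variation
variance = std ** 2
value_range = maximum - minimum
iqr = q3 - q1

print(round(mean, 4), round(cv, 4), round(variance, 2), value_range, round(iqr, 3))
# 12.8103 1.7878 524.53 465 20.989
```

The median (0.067) sits far below the mean (12.81) because the column mixes measured items on very different scales, which is consistent with the high correlation between 평균값 and 측정항목 flagged in the alerts.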
Histogram with fixed size bins (bins=50)

Common values
Value  Count  Frequency (%)
0.006  524  5.2%
0.005  519  5.2%
0.007  390  3.9%
0.004  260  2.6%
0.008  195  1.9%
0.002  177  1.8%
0.003  153  1.5%
0.018  100  1.0%
0.02  97  1.0%
0.009  93  0.9%
Other values (410)  7489  74.9%

Minimum 10 values
Value  Count  Frequency (%)
-248.0  1  < 0.1%
-4.0  1  < 0.1%
-1.0  2  < 0.1%
0.0  41  0.4%
0.001  50  0.5%
0.002  177  1.8%
0.003  153  1.5%
0.004  260  2.6%
0.005  519  5.2%
0.006  524  5.2%

Maximum 10 values
Value  Count  Frequency (%)
217.0  1  < 0.1%
196.0  1  < 0.1%
193.0  1  < 0.1%
184.0  1  < 0.1%
177.0  1  < 0.1%
173.0  1  < 0.1%
163.0  1  < 0.1%
153.0  2  < 0.1%
148.0  1  < 0.1%
147.0  1  < 0.1%

측정기 상태 (instrument status)
Real number (ℝ)

ZEROS 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.0547
Minimum: 0
Maximum: 9
Zeros: 9852
Zeros (%): 98.5%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 9
Range: 9
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.61395896
Coefficient of variation (CV): 11.224113
Kurtosis: 191.21127
Mean: 0.0547
Median Absolute Deviation (MAD): 0
Skewness: 13.583548
Sum: 547
Variance: 0.3769456
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=6)

Common values
Value  Count  Frequency (%)
0  9852  98.5%
1  69  0.7%
9  42  0.4%
2  26  0.3%
4  10  0.1%
8  1  < 0.1%

Minimum values
Value  Count  Frequency (%)
0  9852  98.5%
1  69  0.7%
2  26  0.3%
4  10  0.1%
8  1  < 0.1%
9  42  0.4%

Maximum values
Value  Count  Frequency (%)
9  42  0.4%
8  1  < 0.1%
4  10  0.1%
2  26  0.3%
1  69  0.7%
0  9852  98.5%

국가 기준초과 구분 (national standard exceedance flag)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0  9744  97.4%
1  256  2.6%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; image not preserved]

지자체 기준초과 구분 (local government standard exceedance flag)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0  9744  97.4%
1  256  2.6%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; image not preserved]

Interactions

[Figures: 25 interaction plots; images not preserved]

Correlations

Variable  측정일시  측정소 코드  측정항목  평균값  측정기 상태  국가 기준초과 구분  지자체 기준초과 구분
측정일시  1.000  0.000  0.000  0.336  0.101  0.334  0.334
측정소 코드  0.000  1.000  0.000  0.051  0.102  0.048  0.048
측정항목  0.000  0.000  1.000  0.506  0.127  0.385  0.385
평균값  0.336  0.051  0.506  1.000  0.161  0.429  0.429
측정기 상태  0.101  0.102  0.127  0.161  1.000  0.229  0.229
국가 기준초과 구분  0.334  0.048  0.385  0.429  0.229  1.000  1.000
지자체 기준초과 구분  0.334  0.048  0.385  0.429  0.229  1.000  1.000

Variable  국가 기준초과 구분  지자체 기준초과 구분
국가 기준초과 구분  1.000  0.998
지자체 기준초과 구분  0.998  1.000
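
The matrix above covers only the two categorical flags, and the report text does not name the measure. One common choice for categorical pairs is Cramér's V. The sketch below is a plain-Python, bias-corrected Cramér's V for a 2x2 table with an optional Yates continuity correction. The contingency table is hypothetical: it assumes the two flags agree in every row, which the sample rows and the identical 9744/256 counts are consistent with but do not prove. The function name `cramers_v_corrected` is also hypothetical.

```python
import math

def cramers_v_corrected(table, yates=True):
    """Bias-corrected Cramér's V for a 2x2 contingency table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    diff = abs(a * d - b * c)
    if yates:
        # Yates continuity correction for a 2x2 table.
        diff = max(0.0, diff - n / 2)
    chi2 = n * diff ** 2 / (row1 * row2 * col1 * col2)
    phi2 = chi2 / n
    rows = cols = 2
    # Bias correction of phi^2 and of the table dimensions.
    phi2_corr = max(0.0, phi2 - (rows - 1) * (cols - 1) / (n - 1))
    rows_corr = rows - (rows - 1) ** 2 / (n - 1)
    cols_corr = cols - (cols - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(rows_corr - 1, cols_corr - 1))

# Hypothetical cross-tabulation: both flags identical in all 10000 rows.
identical = [[9744, 0], [0, 256]]
print(round(cramers_v_corrected(identical), 3))               # 0.998
print(round(cramers_v_corrected(identical, yates=False), 3))  # 1.0
```

Under these assumptions, a value of 0.998 rather than 1.000 would come from the continuity correction alone, not from any disagreement between the two flags. Either way, the two columns carry nearly the same information, so one of them is likely redundant for modeling.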

Variable  측정일시  측정소 코드  측정항목  평균값  측정기 상태  국가 기준초과 구분  지자체 기준초과 구분
측정일시  1.000  -0.011  0.018  0.002  -0.046  0.254  0.254
측정소 코드  -0.011  1.000  0.004  0.009  0.011  0.037  0.037
측정항목  0.018  0.004  1.000  0.722  0.061  0.277  0.277
평균값  0.002  0.009  0.722  1.000  -0.023  0.522  0.522
측정기 상태  -0.046  0.011  0.061  -0.023  1.000  0.164  0.164
국가 기준초과 구분  0.254  0.037  0.277  0.522  0.164  1.000  0.998
지자체 기준초과 구분  0.254  0.037  0.277  0.522  0.164  0.998  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index  측정일시  측정소 코드  측정항목  평균값  측정기 상태  국가 기준초과 구분  지자체 기준초과 구분
51108  2016011504  119  1  0.006  0  0  0
44758  2016011310  110  8  43.0  0  0  0
39357  2016011122  110  6  0.019  0  0  0
64793  2016011823  124  9  8.0  0  0  0
30306  2016010910  102  1  0.008  0  0  0
15803  2016010509  109  9  9.0  0  0  0
49625  2016011418  121  9  31.0  0  0  0
3595  2016010123  125  3  0.044  0  0  0
26463  2016010808  111  6  0.003  0  0  0
58947  2016011708  125  6  0.005  0  0  0

Index  측정일시  측정소 코드  측정항목  평균값  측정기 상태  국가 기준초과 구분  지자체 기준초과 구분
21939  2016010702  107  6  0.023  0  0  0
58242  2016011704  108  1  0.003  0  0  0
65744  2016011906  108  5  0.37  0  0  0
55373  2016011609  104  9  23.0  0  0  0
14604  2016010501  110  1  0.005  0  0  0
36879  2016011105  122  6  0.018  0  0  0
34812  2016011016  103  1  0.005  0  0  0
26773  2016010810  113  3  0.031  0  0  0
9136  2016010312  123  8  113.0  0  1  1
30682  2016010912  114  8  32.0  0  0  0