Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 693.4 KiB
Average record size in memory: 71.0 B

Variable types

Numeric: 5
Categorical: 2

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15526/S/1/datasetView.do

Alerts

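평균값 (average value) is highly overall correlated with 측정기 상태 (instrument status) [High correlation]
측정기 상태 (instrument status) is highly overall correlated with 평균값 (average value) [High correlation]
국가 기준초과 구분 (national standard exceedance flag) is highly overall correlated with 지자체 기준초과 구분 (local government standard exceedance flag) [High correlation]
지자체 기준초과 구분 (local government standard exceedance flag) is highly overall correlated with 국가 기준초과 구분 (national standard exceedance flag) [High correlation]
국가 기준초과 구분 (national standard exceedance flag) is highly imbalanced (91.0%) [Imbalance]
지자체 기준초과 구분 (local government standard exceedance flag) is highly imbalanced (85.2%) [Imbalance]
평균값 (average value) has 177 (1.8%) zeros [Zeros]
측정기 상태 (instrument status) has 6443 (64.4%) zeros [Zeros]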
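
The imbalance percentages in the alerts can be reproduced from the class counts reported later in this report (9886 vs. 114 and 9788 vs. 212). The sketch below assumes the score is one minus the Shannon entropy of the class distribution, normalized by the maximum entropy for that number of classes. That formula is an assumption about how the tool computes the alert, although it matches the 91.0% and 85.2% figures shown here. The function name `imbalance_score` is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for the class counts."""
    n_classes = len(counts)
    if n_classes < 2:
        # Assumption: a single-class column is given a score of 0.
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(n_classes)


# Class counts taken from the two categorical variables in this report.
national = imbalance_score([9886, 114])
local = imbalance_score([9788, 212])

print(f"national flag: {national:.1%}")
print(f"local flag:    {local:.1%}")
```

A score near 100% means almost all rows fall into one class; a perfectly balanced column scores 0%.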

Reproduction

Analysis started: 2024-05-11 07:01:17.867832
Analysis finished: 2024-05-11 07:01:24.237939
Duration: 6.37 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

측정일시 (measurement date and time)
Real number (ℝ)

Distinct: 614
Distinct (%): 6.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0020113 × 10^9
Minimum: 2.0020101 × 10^9
Maximum: 2.0020126 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2.0020101 × 10^9
5-th percentile: 2.0020102 × 10^9
Q1: 2.0020107 × 10^9
Median: 2.0020113 × 10^9
Q3: 2.002012 × 10^9
95-th percentile: 2.0020125 × 10^9
Maximum: 2.0020126 × 10^9
Range: 2513
Interquartile range (IQR): 1295

Descriptive statistics

Standard deviation: 735.79186
Coefficient of variation (CV): 3.6752632 × 10^-7
Kurtosis: -1.2036077
Mean: 2.0020113 × 10^9
Median Absolute Deviation (MAD): 615
Skewness: 0.0090137858
Sum: 2.0020113 × 10^13
Variance: 541389.67
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value               Count  Frequency (%)
2002011607             30  0.3%
2002011318             27  0.3%
2002011614             27  0.3%
2002011616             27  0.3%
2002010503             27  0.3%
2002010905             26  0.3%
2002011019             26  0.3%
2002012500             26  0.3%
2002010304             26  0.3%
2002010414             25  0.2%
Other values (604)   9733  97.3%

Smallest values
Value               Count  Frequency (%)
2002010100             21  0.2%
2002010101             16  0.2%
2002010102             23  0.2%
2002010103             11  0.1%
2002010104             14  0.1%
2002010105             12  0.1%
2002010106             10  0.1%
2002010107             15  0.1%
2002010108             15  0.1%
2002010109             16  0.2%

Largest values
Value               Count  Frequency (%)
2002012613              6  0.1%
2002012612             22  0.2%
2002012611             10  0.1%
2002012610             20  0.2%
2002012609             12  0.1%
2002012608             11  0.1%
2002012607             15  0.1%
2002012606             14  0.1%
2002012605             18  0.2%
2002012604             18  0.2%
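
The profiler treats 측정일시 as a real number, which is why its mean and variance are hard to interpret. The values look like timestamps packed as YYYYMMDDHH integers; that reading is an assumption, not something the report states. Under it, the minimum and maximum decode to 2002-01-01 00:00 and 2002-01-26 13:00, and the number of hourly slots between them matches the 614 distinct values reported. A minimal sketch, with `parse_hour_stamp` as an illustrative helper name:

```python
from datetime import datetime


def parse_hour_stamp(value):
    """Decode an integer such as 2002011607, assumed to be YYYYMMDDHH."""
    return datetime.strptime(str(value), "%Y%m%d%H")


first = parse_hour_stamp(2002010100)  # reported minimum
last = parse_hour_stamp(2002012613)   # reported maximum

# Hourly slots from the first to the last timestamp, both ends included.
hours_covered = int((last - first).total_seconds() // 3600) + 1

print(first, last, hours_covered)
```

Converting the column to a datetime type before profiling would likely give more meaningful summary statistics than the numeric ones shown here.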

측정소 코드 (station code)
Real number (ℝ)

Distinct: 24
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 113.5271
Minimum: 102
Maximum: 125
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 102
5-th percentile: 103
Q1: 107
Median: 114
Q3: 120
95-th percentile: 124
Maximum: 125
Range: 23
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 6.9486757
Coefficient of variation (CV): 0.061207198
Kurtosis: -1.2137307
Mean: 113.5271
Median Absolute Deviation (MAD): 6
Skewness: -0.0070813338
Sum: 1135271
Variance: 48.284094
Monotonicity: Not monotonic
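
Several of the figures above are derived from the others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the square of the standard deviation. The sketch below recomputes them from the values reported for 측정소 코드 as a consistency check; `derived_stats` is an illustrative helper, not part of the profiling tool.

```python
def derived_stats(minimum, q1, q3, maximum, mean, std):
    """Recompute the derived figures from the primary summary statistics."""
    return {
        "range": maximum - minimum,
        "iqr": q3 - q1,
        "cv": std / mean,       # coefficient of variation
        "variance": std ** 2,
    }


# Figures reported for 측정소 코드 (station code) in this section.
stats = derived_stats(minimum=102, q1=107, q3=120, maximum=125,
                      mean=113.5271, std=6.9486757)

for name, value in stats.items():
    print(f"{name}: {value}")
```

The results agree with the reported range (23), IQR (13), CV (0.061207198) and variance (48.284094) up to the rounding of the printed inputs.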
Histogram with fixed size bins (bins=24)
Most frequent values
Value              Count  Frequency (%)
106                  455  4.5%
120                  454  4.5%
119                  436  4.4%
125                  434  4.3%
102                  433  4.3%
121                  427  4.3%
108                  425  4.2%
115                  422  4.2%
114                  421  4.2%
113                  421  4.2%
Other values (14)   5672  56.7%

Smallest values
Value              Count  Frequency (%)
102                  433  4.3%
103                  402  4.0%
104                  411  4.1%
105                  411  4.1%
106                  455  4.5%
107                  407  4.1%
108                  425  4.2%
109                  389  3.9%
110                  402  4.0%
111                  414  4.1%

Largest values
Value              Count  Frequency (%)
125                  434  4.3%
124                  419  4.2%
123                  404  4.0%
122                  404  4.0%
121                  427  4.3%
120                  454  4.5%
119                  436  4.4%
118                  409  4.1%
117                  410  4.1%
116                  380  3.8%

측정항목 (measurement item)
Real number (ℝ)

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.2985
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 5
Q3: 8
95-th percentile: 9
Maximum: 9
Range: 8
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.7613693
Coefficient of variation (CV): 0.52116057
Kurtosis: -1.2222224
Mean: 5.2985
Median Absolute Deviation (MAD): 3
Skewness: -0.19674254
Sum: 52985
Variance: 7.6251603
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Most frequent values
Value  Count  Frequency (%)
1       1732  17.3%
6       1702  17.0%
8       1657  16.6%
3       1654  16.5%
9       1637  16.4%
5       1618  16.2%

Smallest values
Value  Count  Frequency (%)
1       1732  17.3%
3       1654  16.5%
5       1618  16.2%
6       1702  17.0%
8       1657  16.6%
9       1637  16.4%

Largest values
Value  Count  Frequency (%)
9       1637  16.4%
8       1657  16.6%
6       1702  17.0%
5       1618  16.2%
3       1654  16.5%
1       1732  17.3%

평균값 (average value)
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 363
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -1638.5803
Minimum: -9999
Maximum: 906
Zeros: 177
Zeros (%): 1.8%
Negative: 3154
Negative (%): 31.5%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -9999
5-th percentile: -9999
Q1: -9.999
Median: 0.008
Q3: 0.5
95-th percentile: 77
Maximum: 906
Range: 10905
Interquartile range (IQR): 10.499

Descriptive statistics

Standard deviation: 3672.7488
Coefficient of variation (CV): -2.2414213
Kurtosis: 1.36874
Mean: -1638.5803
Median Absolute Deviation (MAD): 1.492
Skewness: -1.8310024
Sum: -16385803
Variance: 13489083
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value               Count  Frequency (%)
-9999.0              1614  16.1%
-9.999               1190  11.9%
-999.9                347  3.5%
0.002                 341  3.4%
0.001                 306  3.1%
0.003                 280  2.8%
0.004                 231  2.3%
0.005                 200  2.0%
0.0                   177  1.8%
0.006                 166  1.7%
Other values (353)   5148  51.5%

Smallest values
Value               Count  Frequency (%)
-9999.0              1614  16.1%
-999.9                347  3.5%
-9.999               1190  11.9%
-2.0                    1  < 0.1%
-0.6                    1  < 0.1%
-0.2                    1  < 0.1%
0.0                   177  1.8%
0.001                 306  3.1%
0.002                 341  3.4%
0.003                 280  2.8%

Largest values
Value               Count  Frequency (%)
906.0                   1  < 0.1%
411.0                   1  < 0.1%
334.0                   1  < 0.1%
280.0                   1  < 0.1%
273.0                   1  < 0.1%
272.0                   1  < 0.1%
271.0                   1  < 0.1%
270.0                   1  < 0.1%
261.0                   1  < 0.1%
258.0                   1  < 0.1%
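
The three most frequent negative values of 평균값 (-9999.0, -9.999 and -999.9) together cover 3151 of the 10000 rows. They pull the mean down to -1638.5803 while the median stays at 0.008. They look like placeholder codes for invalid or unavailable readings, but the report does not say so; the sketch below treats that as an assumption. It shows how much of the column these codes occupy and one way to mask them before computing statistics. The names `ASSUMED_SENTINELS`, `sentinel_share` and `mask_sentinels` are illustrative.

```python
# Assumption: these three codes are placeholders, not real readings.
ASSUMED_SENTINELS = {-9999.0, -999.9, -9.999}

# Counts copied from the frequency table above.
reported_counts = {-9999.0: 1614, -9.999: 1190, -999.9: 347}
TOTAL_ROWS = 10000


def sentinel_share(counts, total):
    """Return (rows flagged, fraction of all rows) for the assumed codes."""
    flagged = sum(n for value, n in counts.items() if value in ASSUMED_SENTINELS)
    return flagged, flagged / total


def mask_sentinels(values):
    """Replace assumed sentinel codes with None so they can be skipped."""
    return [None if v in ASSUMED_SENTINELS else v for v in values]


flagged, share = sentinel_share(reported_counts, TOTAL_ROWS)
print(flagged, f"{share:.1%}")
print(mask_sentinels([51.0, -9999.0, 0.5, -9.999]))
```

If the assumption holds, the mean, standard deviation and correlations reported for 평균값 describe the placeholder codes as much as the measurements, so they should be read with caution.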

측정기 상태 (instrument status)
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.1282
Minimum: 0
Maximum: 9
Zeros: 6443
Zeros (%): 64.4%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 4
95-th percentile: 8
Maximum: 9
Range: 9
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.1360402
Coefficient of variation (CV): 1.4735646
Kurtosis: -0.56245808
Mean: 2.1282
Median Absolute Deviation (MAD): 0
Skewness: 1.0458962
Sum: 21282
Variance: 9.8347482
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Most frequent values
Value  Count  Frequency (%)
0       6443  64.4%
8       1810  18.1%
4       1589  15.9%
2         96  1.0%
1         38  0.4%
9         24  0.2%

Smallest values
Value  Count  Frequency (%)
0       6443  64.4%
1         38  0.4%
2         96  1.0%
4       1589  15.9%
8       1810  18.1%
9         24  0.2%

Largest values
Value  Count  Frequency (%)
9         24  0.2%
8       1810  18.1%
4       1589  15.9%
2         96  1.0%
1         38  0.4%
0       6443  64.4%

국가 기준초과 구분 (national standard exceedance flag)
Categorical

Alerts: High correlation, Imbalance

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Category counts: 0 (9886), 1 (114)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0       9886  98.9%
1        114  1.1%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of the value counts: 0 (9886, 98.9%) and 1 (114, 1.1%); image not preserved in this text version]

지자체 기준초과 구분 (local government standard exceedance flag)
Categorical

Alerts: High correlation, Imbalance

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Category counts: 0 (9788), 1 (212)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0       9788  97.9%
1        212  2.1%

Length

Histogram of lengths of the category

Common Values (Plot)

[Plot of the value counts: 0 (9788, 97.9%) and 1 (212, 2.1%); image not preserved in this text version]

Interactions

[25 pairwise interaction plots, one for each pairing of the five numeric variables; images not preserved in this text version]

Correlations

Column key: (1) 측정일시, (2) 측정소 코드, (3) 측정항목, (4) 평균값, (5) 측정기 상태, (6) 국가 기준초과 구분, (7) 지자체 기준초과 구분

Matrix 1 of 3 (all variables)
                            (1)     (2)     (3)     (4)     (5)     (6)     (7)
(1) 측정일시               1.000   0.000   0.000   0.026   0.132   0.241   0.254
(2) 측정소 코드            0.000   1.000   0.022   0.261   0.490   0.097   0.106
(3) 측정항목               0.000   0.022   1.000   0.585   0.567   0.323   0.449
(4) 평균값                 0.026   0.261   0.585   1.000   0.536   0.030   0.046
(5) 측정기 상태            0.132   0.490   0.567   0.536   1.000   0.118   0.133
(6) 국가 기준초과 구분     0.241   0.097   0.323   0.030   0.118   1.000   0.909
(7) 지자체 기준초과 구분   0.254   0.106   0.449   0.046   0.133   0.909   1.000

Matrix 2 of 3 (categorical variables only)
                            (6)     (7)
(6) 국가 기준초과 구분     1.000   0.726
(7) 지자체 기준초과 구분   0.726   1.000

Matrix 3 of 3 (all variables, signed coefficients)
                            (1)     (2)     (3)     (4)     (5)     (6)     (7)
(1) 측정일시               1.000  -0.001   0.004   0.017  -0.030   0.185   0.195
(2) 측정소 코드           -0.001   1.000   0.029  -0.071   0.076   0.074   0.081
(3) 측정항목               0.004   0.029   1.000  -0.041   0.203   0.232   0.324
(4) 평균값                 0.017  -0.071  -0.041   1.000  -0.692   0.051   0.071
(5) 측정기 상태           -0.030   0.076   0.203  -0.692   1.000   0.085   0.096
(6) 국가 기준초과 구분     0.185   0.074   0.232   0.051   0.085   1.000   0.726
(7) 지자체 기준초과 구분   0.195   0.081   0.324   0.071   0.096   0.726   1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
Row index  측정일시     측정소 코드  측정항목  평균값    측정기 상태  국가 기준초과 구분  지자체 기준초과 구분
48077      2002011421   122          9         51.0      0            0                   0
21056      2002010702   107          5         0.5       0            0                   0
31059      2002010923   118          6         0.008     0            0                   0
70100      2002012106   121          5         -999.9    4            0                   0
11874      2002010410   113          1         0.01      0            0                   0
75946      2002012223   111          8         45.0      0            0                   0
22921      2002010715   106          3         -9.999    8            0                   0
39566      2002011210   120          5         2.4       0            0                   0
21743      2002010706   125          9         -9999.0   4            0                   0
34234      2002011021   119          8         -9999.0   4            0                   0

Last rows
Row index  측정일시     측정소 코드  측정항목  평균값    측정기 상태  국가 기준초과 구분  지자체 기준초과 구분
55103      2002011622   117          9         -9999.0   4            0                   0
84887      2002012513   113          9         31.0      0            0                   0
52938      2002011607   117          1         0.001     0            0                   0
48282      2002011423   109          1         -9.999    8            0                   0
3573       2002010200   121          6         -9.999    8            0                   0
12606      2002010415   115          1         0.013     0            0                   0
20675      2002010623   115          9         -9999.0   4            0                   0
56135      2002011705   121          9         -9999.0   4            0                   0
11239      2002010406   103          3         0.049     0            0                   0
68689      2002012021   102          3         0.015     0            0                   0