Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 693.4 KiB
Average record size in memory: 71.0 B
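The average record size follows from the total size in memory and the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes (the function name is illustrative and not part of the report):

```python
def average_record_size(total_kib: float, n_rows: int) -> float:
    """Bytes per row, assuming 1 KiB = 1024 bytes."""
    return total_kib * 1024 / n_rows

# Figures taken from the dataset statistics above.
print(f"{average_record_size(693.4, 10_000):.1f} B")
```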

Variable types

Numeric: 5
Categorical: 2

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15526/S/1/datasetView.do

Alerts

평균값 is highly overall correlated with 측정기 상태 [High correlation]
측정기 상태 is highly overall correlated with 평균값 [High correlation]
국가 기준초과 구분 is highly overall correlated with 지자체 기준초과 구분 [High correlation]
지자체 기준초과 구분 is highly overall correlated with 국가 기준초과 구분 [High correlation]
국가 기준초과 구분 is highly imbalanced (97.6%) [Imbalance]
지자체 기준초과 구분 is highly imbalanced (95.2%) [Imbalance]
평균값 has 142 (1.4%) zeros [Zeros]
측정기 상태 has 8282 (82.8%) zeros [Zeros]
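
The imbalance percentages are not the share of the majority class (0 makes up 99.8% and 99.5% of the two categorical columns). They are consistent with a score of 1 minus the Shannon entropy of the class shares, normalized by the logarithm of the number of classes. The sketch below assumes that definition, which the report itself does not spell out; the function name is illustrative.

```python
import math

def imbalance_score(counts):
    """1 - H / log2(k), where H is the Shannon entropy of the class shares.

    Assumed definition; returns 0.0 when there are fewer than two classes.
    """
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    shares = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in shares)
    return 1 - entropy / math.log2(k)

# Class counts as listed for the two categorical variables in this report.
for name, counts in {
    "국가 기준초과 구분": [9976, 24],
    "지자체 기준초과 구분": [9947, 53],
}.items():
    print(name, f"{imbalance_score(counts):.1%}")
```

Under this assumption the two scores round to the 97.6% and 95.2% shown in the alerts.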

Reproduction

Analysis started: 2024-04-27 12:06:21.336806
Analysis finished: 2024-04-27 12:06:30.351659
Duration: 9.01 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

측정일시
Real number (ℝ)

Distinct: 581
Distinct (%): 5.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0050113 × 10^9
Minimum: 2.0050101 × 10^9
Maximum: 2.0050125 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2.0050101 × 10^9
5-th percentile: 2.0050102 × 10^9
Q1: 2.0050107 × 10^9
Median: 2.0050112 × 10^9
Q3: 2.0050119 × 10^9
95-th percentile: 2.0050123 × 10^9
Maximum: 2.0050125 × 10^9
Range: 2404
Interquartile range (IQR): 1201

Descriptive statistics

Standard deviation: 694.46557
Coefficient of variation (CV): 3.4636492 × 10^-7
Kurtosis: -1.1922604
Mean: 2.0050113 × 10^9
Median Absolute Deviation (MAD): 600
Skewness: 0.010009571
Sum: 2.0050113 × 10^13
Variance: 482282.43
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common Values
Value | Count | Frequency (%)
2005012310 | 29 | 0.3%
2005011023 | 29 | 0.3%
2005012116 | 28 | 0.3%
2005010403 | 28 | 0.3%
2005011915 | 28 | 0.3%
2005011914 | 27 | 0.3%
2005012121 | 26 | 0.3%
2005011812 | 26 | 0.3%
2005010815 | 26 | 0.3%
2005012321 | 26 | 0.3%
Other values (571) | 9727 | 97.3%

Smallest values
Value | Count | Frequency (%)
2005010100 | 20 | 0.2%
2005010101 | 17 | 0.2%
2005010102 | 20 | 0.2%
2005010103 | 11 | 0.1%
2005010104 | 16 | 0.2%
2005010105 | 15 | 0.1%
2005010106 | 12 | 0.1%
2005010107 | 14 | 0.1%
2005010108 | 23 | 0.2%
2005010109 | 17 | 0.2%

Largest values
Value | Count | Frequency (%)
2005012504 | 4 | < 0.1%
2005012503 | 15 | 0.1%
2005012502 | 18 | 0.2%
2005012501 | 19 | 0.2%
2005012500 | 13 | 0.1%
2005012423 | 19 | 0.2%
2005012422 | 15 | 0.1%
2005012421 | 16 | 0.2%
2005012420 | 15 | 0.1%
2005012419 | 19 | 0.2%
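
The profiler treats 측정일시 as a real number, but its values look like hourly timestamps packed into integers of the form YYYYMMDDHH. If so, numeric statistics such as the range (2404) or the standard deviation have no direct meaning in time units. A minimal sketch of decoding the values, assuming that YYYYMMDDHH layout (the report does not state the format, and the helper name is illustrative):

```python
from datetime import datetime

def parse_measurement_time(value: int) -> datetime:
    """Decode an integer such as 2005010100, assuming a YYYYMMDDHH layout."""
    return datetime.strptime(str(value), "%Y%m%d%H")

first = parse_measurement_time(2005010100)  # smallest value in the report
last = parse_measurement_time(2005012504)   # largest value in the report

# Number of whole hours between the first and last timestamps.
hours_spanned = int((last - first).total_seconds() // 3600)
print(first, last, hours_spanned + 1)
```

Under that assumption the data covers 581 hourly slots from 1 January 2005 00:00 to 25 January 2005 04:00, which agrees with the reported distinct count of 581.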

측정소 코드
Real number (ℝ)

Distinct: 25
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 113.0493
Minimum: 101
Maximum: 125
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 101
5-th percentile: 102
Q1: 107
Median: 113
Q3: 119
95-th percentile: 124
Maximum: 125
Range: 24
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 7.226527
Coefficient of variation (CV): 0.063923677
Kurtosis: -1.2120867
Mean: 113.0493
Median Absolute Deviation (MAD): 6
Skewness: -0.01901842
Sum: 1130493
Variance: 52.222692
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=25)
Common Values
Value | Count | Frequency (%)
122 | 442 | 4.4%
116 | 436 | 4.4%
118 | 428 | 4.3%
117 | 418 | 4.2%
102 | 414 | 4.1%
111 | 413 | 4.1%
103 | 412 | 4.1%
123 | 408 | 4.1%
110 | 408 | 4.1%
105 | 407 | 4.1%
Other values (15) | 5814 | 58.1%

Smallest values
Value | Count | Frequency (%)
101 | 399 | 4.0%
102 | 414 | 4.1%
103 | 412 | 4.1%
104 | 381 | 3.8%
105 | 407 | 4.1%
106 | 366 | 3.7%
107 | 403 | 4.0%
108 | 400 | 4.0%
109 | 384 | 3.8%
110 | 408 | 4.1%

Largest values
Value | Count | Frequency (%)
125 | 389 | 3.9%
124 | 398 | 4.0%
123 | 408 | 4.1%
122 | 442 | 4.4%
121 | 388 | 3.9%
120 | 396 | 4.0%
119 | 388 | 3.9%
118 | 428 | 4.3%
117 | 418 | 4.2%
116 | 436 | 4.4%
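
The descriptive statistics for 측정소 코드 are related to one another: the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. A short check using the figures reported above (the function name is illustrative):

```python
def coefficient_of_variation(std: float, mean: float) -> float:
    """Standard deviation expressed as a fraction of the mean."""
    return std / mean

# Figures reported for 측정소 코드.
std, mean = 7.226527, 113.0493

print(round(coefficient_of_variation(std, mean), 6))  # compare with the reported CV
print(round(std ** 2, 4))                             # compare with the reported variance
```

The same relation explains the negative CV reported for 평균값, whose mean is negative.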

측정항목
Real number (ℝ)

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.2745
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 5
Q3: 8
95-th percentile: 9
Maximum: 9
Range: 8
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.7655586
Coefficient of variation (CV): 0.52432622
Kurtosis: -1.2413594
Mean: 5.2745
Median Absolute Deviation (MAD): 3
Skewness: -0.18146356
Sum: 52745
Variance: 7.6483146
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Common Values
Value | Count | Frequency (%)
1 | 1740 | 17.4%
8 | 1720 | 17.2%
3 | 1712 | 17.1%
5 | 1632 | 16.3%
6 | 1605 | 16.1%
9 | 1591 | 15.9%

Smallest values
Value | Count | Frequency (%)
1 | 1740 | 17.4%
3 | 1712 | 17.1%
5 | 1632 | 16.3%
6 | 1605 | 16.1%
8 | 1720 | 17.2%
9 | 1591 | 15.9%

Largest values
Value | Count | Frequency (%)
9 | 1591 | 15.9%
8 | 1720 | 17.2%
6 | 1605 | 16.1%
5 | 1632 | 16.3%
3 | 1712 | 17.1%
1 | 1740 | 17.4%

평균값
Real number (ℝ)

Alerts: HIGH CORRELATION, ZEROS

Distinct: 291
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -938.25209
Minimum: -9999
Maximum: 182
Zeros: 142
Zeros (%): 1.4%
Negative: 1544
Negative (%): 15.4%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -9999
5-th percentile: -9999
Q1: 0.004
Median: 0.025
Q3: 1.1
95-th percentile: 63
Maximum: 182
Range: 10181
Interquartile range (IQR): 1.096

Descriptive statistics

Standard deviation: 2909.4193
Coefficient of variation (CV): -3.1008929
Kurtosis: 5.7917308
Mean: -938.25209
Median Absolute Deviation (MAD): 0.375
Skewness: -2.7878782
Sum: -9382520.9
Variance: 8464720.5
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common Values
Value | Count | Frequency (%)
-9999.0 | 933 | 9.3%
-9.999 | 457 | 4.6%
0.003 | 308 | 3.1%
0.004 | 284 | 2.8%
0.002 | 264 | 2.6%
0.005 | 249 | 2.5%
0.001 | 236 | 2.4%
0.006 | 231 | 2.3%
0.5 | 188 | 1.9%
0.007 | 185 | 1.8%
Other values (281) | 6665 | 66.6%

Smallest values
Value | Count | Frequency (%)
-9999.0 | 933 | 9.3%
-999.9 | 154 | 1.5%
-9.999 | 457 | 4.6%
0.0 | 142 | 1.4%
0.001 | 236 | 2.4%
0.002 | 264 | 2.6%
0.003 | 308 | 3.1%
0.004 | 284 | 2.8%
0.005 | 249 | 2.5%
0.006 | 231 | 2.3%

Largest values
Value | Count | Frequency (%)
182.0 | 1 | < 0.1%
180.0 | 1 | < 0.1%
175.0 | 1 | < 0.1%
170.0 | 2 | < 0.1%
168.0 | 1 | < 0.1%
164.0 | 1 | < 0.1%
163.0 | 1 | < 0.1%
162.0 | 1 | < 0.1%
161.0 | 1 | < 0.1%
160.0 | 3 | < 0.1%
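
The values -9999.0, -999.9 and -9.999 together account for 933 + 154 + 457 = 1544 rows, exactly the reported count of negative values. That suggests they may be placeholder codes rather than measurements, which would also explain the strongly negative mean. The report does not say so, and treating them as missing is an assumption here. Under that assumption, a sketch of masking them before computing a mean (all names are illustrative):

```python
# Assumed placeholder codes; the source does not confirm their meaning.
SENTINELS = {-9999.0, -999.9, -9.999}

def mask_sentinels(values):
    """Replace the assumed placeholder codes with None."""
    return [None if v in SENTINELS else v for v in values]

def mean_ignoring_missing(values):
    """Mean of the entries that are not None, or None if there are none."""
    valid = [v for v in values if v is not None]
    return sum(valid) / len(valid) if valid else None

# A few 평균값 entries taken from the report's sample rows.
sample = [-9999.0, 7.0, -9999.0, 57.0, -999.9, 0.022]
masked = mask_sentinels(sample)
print(masked)
print(mean_ignoring_missing(masked))
```

Whether a single mean over this column is meaningful at all also depends on 측정항목, since the sample rows show very different magnitudes for different item codes.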

측정기 상태
Real number (ℝ)

Alerts: HIGH CORRELATION, ZEROS

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.6874
Minimum: 0
Maximum: 9
Zeros: 8282
Zeros (%): 82.8%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 4
Maximum: 9
Range: 9
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.5899478
Coefficient of variation (CV): 2.3129878
Kurtosis: 4.5775396
Mean: 0.6874
Median Absolute Deviation (MAD): 0
Skewness: 2.2429648
Sum: 6874
Variance: 2.527934
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Common Values
Value | Count | Frequency (%)
0 | 8282 | 82.8%
4 | 1457 | 14.6%
2 | 146 | 1.5%
8 | 57 | 0.6%
9 | 30 | 0.3%
1 | 28 | 0.3%

Smallest values
Value | Count | Frequency (%)
0 | 8282 | 82.8%
1 | 28 | 0.3%
2 | 146 | 1.5%
4 | 1457 | 14.6%
8 | 57 | 0.6%
9 | 30 | 0.3%

Largest values
Value | Count | Frequency (%)
9 | 30 | 0.3%
8 | 57 | 0.6%
4 | 1457 | 14.6%
2 | 146 | 1.5%
1 | 28 | 0.3%
0 | 8282 | 82.8%

국가 기준초과 구분
Categorical

Alerts: HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
0: 9976
1: 24

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9976 | 99.8%
1 | 24 | 0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 9976 | 99.8%
1 | 24 | 0.2%

지자체 기준초과 구분
Categorical

Alerts: HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
0: 9947
1: 53

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9947 | 99.5%
1 | 53 | 0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 9947 | 99.5%
1 | 53 | 0.5%

Interactions

[Figures: 25 pairwise interaction plots]

Correlations

Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | 0.000 | 0.000 | 0.209 | 0.248 | 0.077 | 0.126
측정소 코드 | 0.000 | 1.000 | 0.023 | 0.151 | 0.303 | 0.028 | 0.044
측정항목 | 0.000 | 0.023 | 1.000 | 0.388 | 0.526 | 0.118 | 0.187
평균값 | 0.209 | 0.151 | 0.388 | 1.000 | 0.630 | 0.000 | 0.000
측정기 상태 | 0.248 | 0.303 | 0.526 | 0.630 | 1.000 | 0.156 | 0.148
국가 기준초과 구분 | 0.077 | 0.028 | 0.118 | 0.000 | 0.156 | 1.000 | 0.859
지자체 기준초과 구분 | 0.126 | 0.044 | 0.187 | 0.000 | 0.148 | 0.859 | 1.000

Variable | 지자체 기준초과 구분 | 국가 기준초과 구분
지자체 기준초과 구분 | 1.000 | 0.658
국가 기준초과 구분 | 0.658 | 1.000

Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | 0.002 | 0.002 | -0.103 | 0.194 | 0.059 | 0.097
측정소 코드 | 0.002 | 1.000 | -0.010 | -0.004 | 0.003 | 0.021 | 0.033
측정항목 | 0.002 | -0.010 | 1.000 | 0.243 | 0.266 | 0.085 | 0.134
평균값 | -0.103 | -0.004 | 0.243 | 1.000 | -0.618 | 0.010 | 0.021
측정기 상태 | 0.194 | 0.003 | 0.266 | -0.618 | 1.000 | 0.112 | 0.107
국가 기준초과 구분 | 0.059 | 0.021 | 0.085 | 0.010 | 0.112 | 1.000 | 0.658
지자체 기준초과 구분 | 0.097 | 0.033 | 0.134 | 0.021 | 0.107 | 0.658 | 1.000
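
To read a matrix like the last one above, it can help to list only the off-diagonal pairs whose absolute coefficient reaches a cutoff. The sketch below does that for the signed matrix; the 0.5 cutoff and the function name are illustrative assumptions, not settings taken from the report.

```python
from itertools import combinations

names = ["측정일시", "측정소 코드", "측정항목", "평균값", "측정기 상태",
         "국가 기준초과 구분", "지자체 기준초과 구분"]

# Signed correlation matrix copied from the report (rows follow `names`).
matrix = [
    [1.000, 0.002, 0.002, -0.103, 0.194, 0.059, 0.097],
    [0.002, 1.000, -0.010, -0.004, 0.003, 0.021, 0.033],
    [0.002, -0.010, 1.000, 0.243, 0.266, 0.085, 0.134],
    [-0.103, -0.004, 0.243, 1.000, -0.618, 0.010, 0.021],
    [0.194, 0.003, 0.266, -0.618, 1.000, 0.112, 0.107],
    [0.059, 0.021, 0.085, 0.010, 0.112, 1.000, 0.658],
    [0.097, 0.033, 0.134, 0.021, 0.107, 0.658, 1.000],
]

def strong_pairs(names, matrix, cutoff=0.5):
    """Return (name_a, name_b, r) for off-diagonal pairs with |r| >= cutoff."""
    return [
        (names[i], names[j], matrix[i][j])
        for i, j in combinations(range(len(names)), 2)
        if abs(matrix[i][j]) >= cutoff
    ]

for a, b, r in strong_pairs(names, matrix):
    print(f"{a} / {b}: {r:+.3f}")
```

With these numbers the two pairs returned are the same two pairs named in the Alerts section: 평균값 with 측정기 상태, and 국가 기준초과 구분 with 지자체 기준초과 구분.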

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

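Index | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
2189 | 2005010114 | 115 | 9 | -9999.0 | 4 | 0 | 0
893 | 2005010105 | 124 | 9 | 7.0 | 0 | 0 | 0
80729 | 2005012310 | 105 | 9 | -9999.0 | 4 | 0 | 0
48322 | 2005011410 | 104 | 8 | 57.0 | 0 | 0 | 0
48599 | 2005011411 | 125 | 9 | -9999.0 | 4 | 0 | 0
15382 | 2005010506 | 114 | 8 | 41.0 | 0 | 0 | 0
39083 | 2005011120 | 114 | 9 | -9999.0 | 4 | 0 | 0
49922 | 2005011420 | 121 | 5 | 0.7 | 0 | 0 | 0
12553 | 2005010411 | 118 | 3 | 0.022 | 0 | 0 | 0
82391 | 2005012321 | 107 | 9 | 34.0 | 0 | 0 | 0

Index | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
64778 | 2005011823 | 122 | 5 | 0.8 | 0 | 0 | 0
66116 | 2005011908 | 120 | 5 | -999.9 | 4 | 0 | 0
8017 | 2005010305 | 112 | 3 | 0.039 | 0 | 0 | 0
65777 | 2005011906 | 113 | 9 | 22.0 | 0 | 0 | 0
20356 | 2005010615 | 118 | 8 | 25.0 | 0 | 0 | 0
44249 | 2005011306 | 125 | 9 | -9999.0 | 4 | 0 | 0
69740 | 2005012008 | 124 | 5 | 0.6 | 0 | 0 | 0
63090 | 2005011812 | 116 | 1 | 0.007 | 0 | 0 | 0
1628 | 2005010110 | 122 | 5 | 0.4 | 0 | 0 | 0
43339 | 2005011300 | 124 | 3 | 0.05 | 0 | 0 | 0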