Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 693.4 KiB
Average record size in memory: 71.0 B

Variable types

Numeric: 5
Categorical: 2

Dataset

Description: File download (파일 다운로드)
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15526/S/1/datasetView.do

Alerts

평균값 is highly overall correlated with 측정기 상태 (High correlation)
측정기 상태 is highly overall correlated with 평균값 (High correlation)
국가 기준초과 구분 is highly overall correlated with 지자체 기준초과 구분 (High correlation)
지자체 기준초과 구분 is highly overall correlated with 국가 기준초과 구분 (High correlation)
국가 기준초과 구분 is highly imbalanced (92.6%) (Imbalance)
지자체 기준초과 구분 is highly imbalanced (86.6%) (Imbalance)
측정기 상태 has 8350 (83.5%) zeros (Zeros)
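
The two imbalance percentages can be reproduced from the class counts reported for each flag variable (9910 / 90 and 9813 / 187), assuming the score is one minus the Shannon entropy divided by its maximum for the number of classes. That definition is an assumption that happens to match the reported figures; the function name `imbalance_score` is illustrative and is not a ydata-profiling API. A minimal Python sketch:

```python
import math


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts.

    Assumed definition: 0.0 means perfectly balanced classes,
    values near 1.0 mean one class dominates.
    """
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    max_entropy = math.log2(len(counts))  # entropy of a perfectly balanced column
    if max_entropy == 0:
        return 0.0
    return 1.0 - entropy / max_entropy


# Class counts taken from the Common Values tables of the two flag variables.
national = imbalance_score([9910, 90])
local = imbalance_score([9813, 187])
print(f"{national:.1%}, {local:.1%}")
```

Under this assumption the script prints values that round to 92.6% and 86.6%, matching the two alerts.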

Reproduction

Analysis started: 2024-05-04 04:00:23.513485
Analysis finished: 2024-05-04 04:00:31.714154
Duration: 8.2 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

측정일시 (measurement date and time)
Real number (ℝ)

Distinct: 573
Distinct (%): 5.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0040113 × 10^9
Minimum: 2.0040101 × 10^9
Maximum: 2.0040124 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2.0040101 × 10^9
5-th percentile: 2.0040102 × 10^9
Q1: 2.0040107 × 10^9
Median: 2.0040113 × 10^9
Q3: 2.0040118 × 10^9
95-th percentile: 2.0040123 × 10^9
Maximum: 2.0040124 × 10^9
Range: 2320
Interquartile range (IQR): 1121

Descriptive statistics

Standard deviation: 690.59834
Coefficient of variation (CV): 3.4460802 × 10^-7
Kurtosis: -1.2009555
Mean: 2.0040113 × 10^9
Median Absolute Deviation (MAD): 596
Skewness: -0.0064701802
Sum: 2.0040113 × 10^13
Variance: 476926.07
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
2004010509 | 29 | 0.3%
2004011500 | 29 | 0.3%
2004010709 | 28 | 0.3%
2004010219 | 27 | 0.3%
2004012021 | 27 | 0.3%
2004011716 | 26 | 0.3%
2004011411 | 26 | 0.3%
2004010912 | 26 | 0.3%
2004011606 | 25 | 0.2%
2004011722 | 25 | 0.2%
Other values (563) | 9732 | 97.3%

Smallest values

Value | Count | Frequency (%)
2004010100 | 15 | 0.1%
2004010101 | 18 | 0.2%
2004010102 | 19 | 0.2%
2004010103 | 23 | 0.2%
2004010104 | 21 | 0.2%
2004010105 | 21 | 0.2%
2004010106 | 12 | 0.1%
2004010107 | 24 | 0.2%
2004010108 | 23 | 0.2%
2004010109 | 18 | 0.2%

Largest values

Value | Count | Frequency (%)
2004012420 | 2 | < 0.1%
2004012419 | 19 | 0.2%
2004012418 | 17 | 0.2%
2004012417 | 14 | 0.1%
2004012416 | 17 | 0.2%
2004012415 | 12 | 0.1%
2004012414 | 23 | 0.2%
2004012413 | 22 | 0.2%
2004012412 | 15 | 0.1%
2004012411 | 24 | 0.2%
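
The values of 측정일시 look like hour stamps packed into integers as YYYYMMDDHH, which would explain why the mean, variance, and similar statistics for this column are hard to interpret. The report itself does not state the encoding, so this is an assumption. Under it, the sketch below parses the reported minimum and maximum and counts the hours between them; the helper name `parse_hour_stamp` is hypothetical.

```python
from datetime import datetime


def parse_hour_stamp(value):
    """Parse an integer such as 2004010509, assumed to mean YYYYMMDDHH."""
    return datetime.strptime(str(value), "%Y%m%d%H")


# Minimum and maximum values reported for the column.
first = parse_hour_stamp(2004010100)
last = parse_hour_stamp(2004012420)

# Number of hourly stamps from first to last, both ends included.
hours_spanned = int((last - first).total_seconds() // 3600) + 1
print(first, last, hours_spanned)
```

The hour count comes out to 573, which equals the reported distinct count. That is consistent with every hour in the range appearing at least once.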

측정소 코드 (station code)
Real number (ℝ)

Distinct: 25
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 112.9588
Minimum: 101
Maximum: 125
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 101
5-th percentile: 102
Q1: 107
Median: 113
Q3: 119
95-th percentile: 124
Maximum: 125
Range: 24
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 7.2330446
Coefficient of variation (CV): 0.064032591
Kurtosis: -1.2073095
Mean: 112.9588
Median Absolute Deviation (MAD): 6
Skewness: -0.0056738554
Sum: 1129588
Variance: 52.316934
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=25)]

Common values

Value | Count | Frequency (%)
102 | 442 | 4.4%
115 | 423 | 4.2%
103 | 420 | 4.2%
117 | 420 | 4.2%
105 | 420 | 4.2%
111 | 416 | 4.2%
122 | 413 | 4.1%
119 | 412 | 4.1%
113 | 411 | 4.1%
112 | 410 | 4.1%
Other values (15) | 5813 | 58.1%

Smallest values

Value | Count | Frequency (%)
101 | 403 | 4.0%
102 | 442 | 4.4%
103 | 420 | 4.2%
104 | 389 | 3.9%
105 | 420 | 4.2%
106 | 383 | 3.8%
107 | 348 | 3.5%
108 | 386 | 3.9%
109 | 389 | 3.9%
110 | 408 | 4.1%

Largest values

Value | Count | Frequency (%)
125 | 402 | 4.0%
124 | 372 | 3.7%
123 | 400 | 4.0%
122 | 413 | 4.1%
121 | 398 | 4.0%
120 | 402 | 4.0%
119 | 412 | 4.1%
118 | 399 | 4.0%
117 | 420 | 4.2%
116 | 355 | 3.5%

측정항목 (measurement item)
Real number (ℝ)

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.338
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 6
Q3: 8
95-th percentile: 9
Maximum: 9
Range: 8
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.7490565
Coefficient of variation (CV): 0.51499747
Kurtosis: -1.2117151
Mean: 5.338
Median Absolute Deviation (MAD): 3
Skewness: -0.20342397
Sum: 53380
Variance: 7.5573117
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=6)]

Common values

Value | Count | Frequency (%)
3 | 1698 | 17.0%
9 | 1679 | 16.8%
6 | 1678 | 16.8%
8 | 1662 | 16.6%
1 | 1651 | 16.5%
5 | 1632 | 16.3%

Smallest values

Value | Count | Frequency (%)
1 | 1651 | 16.5%
3 | 1698 | 17.0%
5 | 1632 | 16.3%
6 | 1678 | 16.8%
8 | 1662 | 16.6%
9 | 1679 | 16.8%

Largest values

Value | Count | Frequency (%)
9 | 1679 | 16.8%
8 | 1662 | 16.6%
6 | 1678 | 16.8%
5 | 1632 | 16.3%
3 | 1698 | 17.0%
1 | 1651 | 16.5%

평균값 (average value)
Real number (ℝ)

Alerts: High correlation

Distinct: 340
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -1046.6485
Minimum: -9999
Maximum: 10505
Zeros: 73
Zeros (%): 0.7%
Negative: 1417
Negative (%): 14.2%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -9999
5-th percentile: -9999
Q1: 0.004
Median: 0.03
Q3: 1.4
95-th percentile: 84
Maximum: 10505
Range: 20504
Interquartile range (IQR): 1.396

Descriptive statistics

Standard deviation: 3073.389
Coefficient of variation (CV): -2.9364098
Kurtosis: 4.6106383
Mean: -1046.6485
Median Absolute Deviation (MAD): 0.27
Skewness: -2.5587277
Sum: -10466485
Variance: 9445719.9
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
-9999.0 | 1052 | 10.5%
0.004 | 305 | 3.0%
0.002 | 299 | 3.0%
0.001 | 292 | 2.9%
0.005 | 282 | 2.8%
0.003 | 271 | 2.7%
-9.999 | 269 | 2.7%
0.006 | 195 | 1.9%
0.007 | 190 | 1.9%
0.008 | 185 | 1.8%
Other values (330) | 6660 | 66.6%

Smallest values

Value | Count | Frequency (%)
-9999.0 | 1052 | 10.5%
-999.9 | 96 | 1.0%
-9.999 | 269 | 2.7%
0.0 | 73 | 0.7%
0.001 | 292 | 2.9%
0.002 | 299 | 3.0%
0.003 | 271 | 2.7%
0.004 | 305 | 3.0%
0.005 | 282 | 2.8%
0.006 | 195 | 1.9%

Largest values

Value | Count | Frequency (%)
10505.0 | 1 | < 0.1%
1819.0 | 1 | < 0.1%
916.0 | 2 | < 0.1%
605.0 | 1 | < 0.1%
527.0 | 1 | < 0.1%
407.0 | 1 | < 0.1%
288.0 | 1 | < 0.1%
276.0 | 1 | < 0.1%
242.0 | 1 | < 0.1%
240.0 | 1 | < 0.1%
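
The three most negative values of 평균값 (-9999.0, -999.9, -9.999) account for 1052 + 96 + 269 = 1417 rows, exactly the reported count of negative values. This suggests they act as placeholder codes rather than measurements, although the report does not say so. If that assumption holds, the mean, standard deviation, and skewness reported for this column are dominated by the placeholders. A minimal sketch of masking them before computing statistics, with `SENTINELS` and `mask_sentinels` as hypothetical names:

```python
# Assumed placeholder codes, inferred from the value tables in this report.
SENTINELS = {-9999.0, -999.9, -9.999}


def mask_sentinels(values):
    """Replace assumed placeholder codes with None and keep real readings."""
    return [None if v in SENTINELS else v for v in values]


# Counts of each assumed placeholder, copied from the value tables.
sentinel_counts = {-9999.0: 1052, -999.9: 96, -9.999: 269}
total_sentinels = sum(sentinel_counts.values())
print(total_sentinels)

# A few values as they appear in the Sample section.
print(mask_sentinels([0.004, -9999.0, 65.0, -9.999]))
```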

측정기 상태 (instrument status)
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.6737
Minimum: 0
Maximum: 9
Zeros: 8350
Zeros (%): 83.5%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 4
Maximum: 9
Range: 9
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.6224955
Coefficient of variation (CV): 2.4083353
Kurtosis: 5.6179906
Mean: 0.6737
Median Absolute Deviation (MAD): 0
Skewness: 2.4360923
Sum: 6737
Variance: 2.6324916
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=6)]

Common values

Value | Count | Frequency (%)
0 | 8350 | 83.5%
4 | 1303 | 13.0%
2 | 176 | 1.8%
8 | 126 | 1.3%
1 | 30 | 0.3%
9 | 15 | 0.1%

Smallest values

Value | Count | Frequency (%)
0 | 8350 | 83.5%
1 | 30 | 0.3%
2 | 176 | 1.8%
4 | 1303 | 13.0%
8 | 126 | 1.3%
9 | 15 | 0.1%

Largest values

Value | Count | Frequency (%)
9 | 15 | 0.1%
8 | 126 | 1.3%
4 | 1303 | 13.0%
2 | 176 | 1.8%
1 | 30 | 0.3%
0 | 8350 | 83.5%
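
Because 측정기 상태 takes only six distinct values, its summary statistics can be recomputed from the frequency table alone. The sketch below does so in plain Python and assumes the reported variance uses the sample (n - 1) denominator; that assumption reproduces the reported figures.

```python
import math

# Value -> count, copied from the frequency table for 측정기 상태.
freq = {0: 8350, 1: 30, 2: 176, 4: 1303, 8: 126, 9: 15}

n = sum(freq.values())                       # number of observations
total = sum(v * c for v, c in freq.items())  # the "Sum" statistic
mean = total / n

# Sample variance: squared deviations weighted by count, divided by n - 1.
squared_dev = sum(c * (v - mean) ** 2 for v, c in freq.items())
variance = squared_dev / (n - 1)
std = math.sqrt(variance)
cv = std / mean                              # coefficient of variation

print(n, total, mean)
print(round(variance, 7), round(std, 7), round(cv, 7))
```

The recomputed sum (6737), mean (0.6737), variance, standard deviation, and coefficient of variation agree with the values listed under Descriptive statistics.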

국가 기준초과 구분 (national standard exceedance flag)
Categorical

Alerts: High correlation, Imbalance

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9910 | 99.1%
1 | 90 | 0.9%

Length

[Figure: Histogram of lengths of the category]

지자체 기준초과 구분 (local government standard exceedance flag)
Categorical

Alerts: High correlation, Imbalance

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9813 | 98.1%
1 | 187 | 1.9%

Length

[Figure: Histogram of lengths of the category]

Interactions

[Figures: 25 interaction plots]

Correlations

[Figure: correlation heatmap]

Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | 0.000 | 0.000 | 0.057 | 0.158 | 0.170 | 0.266
측정소 코드 | 0.000 | 1.000 | 0.000 | 0.064 | 0.263 | 0.031 | 0.047
측정항목 | 0.000 | 0.000 | 1.000 | 0.152 | 0.616 | 0.260 | 0.417
평균값 | 0.057 | 0.064 | 0.152 | 1.000 | 0.349 | 0.019 | 0.010
측정기 상태 | 0.158 | 0.263 | 0.616 | 0.349 | 1.000 | 0.115 | 0.075
국가 기준초과 구분 | 0.170 | 0.031 | 0.260 | 0.019 | 0.115 | 1.000 | 0.850
지자체 기준초과 구분 | 0.266 | 0.047 | 0.417 | 0.010 | 0.075 | 0.850 | 1.000

[Figure: correlation heatmap]

Variable | 지자체 기준초과 구분 | 국가 기준초과 구분
지자체 기준초과 구분 | 1.000 | 0.647
국가 기준초과 구분 | 0.647 | 1.000

[Figure: correlation heatmap]

Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | -0.002 | -0.004 | -0.055 | 0.035 | 0.129 | 0.203
측정소 코드 | -0.002 | 1.000 | 0.005 | 0.088 | -0.088 | 0.024 | 0.036
측정항목 | -0.004 | 0.005 | 1.000 | 0.198 | 0.334 | 0.187 | 0.301
평균값 | -0.055 | 0.088 | 0.198 | 1.000 | -0.560 | 0.044 | 0.049
측정기 상태 | 0.035 | -0.088 | 0.334 | -0.560 | 1.000 | 0.083 | 0.054
국가 기준초과 구분 | 0.129 | 0.024 | 0.187 | 0.044 | 0.083 | 1.000 | 0.647
지자체 기준초과 구분 | 0.203 | 0.036 | 0.301 | 0.049 | 0.054 | 0.647 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

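First rows

(index) | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
21596 | 2004010623 | 125 | 5 | 1.9 | 0 | 0 | 0
78318 | 2004012218 | 104 | 1 | 0.003 | 0 | 0 | 0
38935 | 2004011119 | 115 | 3 | 0.05 | 0 | 0 | 0
73963 | 2004012113 | 103 | 3 | 0.01 | 0 | 0 | 0
50500 | 2004011500 | 117 | 8 | 65.0 | 0 | 0 | 0
71923 | 2004012023 | 113 | 3 | 0.014 | 0 | 0 | 0
15395 | 2004010506 | 116 | 9 | -9999.0 | 4 | 0 | 0
83657 | 2004012405 | 118 | 9 | -9999.0 | 4 | 0 | 0
7190 | 2004010223 | 124 | 5 | 0.4 | 0 | 0 | 0
34304 | 2004011012 | 118 | 5 | 0.3 | 0 | 0 | 0

Last rows

(index) | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
71579 | 2004012021 | 105 | 9 | -9999.0 | 4 | 0 | 0
80414 | 2004012308 | 103 | 5 | 0.7 | 0 | 0 | 0
14093 | 2004010421 | 124 | 9 | 5.0 | 2 | 0 | 0
38767 | 2004011118 | 112 | 3 | 0.044 | 0 | 0 | 0
44951 | 2004011311 | 117 | 9 | -9999.0 | 4 | 0 | 0
32417 | 2004011000 | 103 | 9 | -9999.0 | 4 | 0 | 0
61415 | 2004011801 | 111 | 9 | -9999.0 | 4 | 0 | 0
74862 | 2004012119 | 103 | 1 | 0.004 | 0 | 0 | 0
41281 | 2004011211 | 106 | 3 | -9.999 | 4 | 0 | 0
82938 | 2004012400 | 124 | 1 | 0.021 | 2 | 0 | 0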