Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 693.4 KiB
Average record size in memory: 71.0 B

Variable types

Numeric: 5
Categorical: 2

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15526/S/1/datasetView.do

Alerts

평균값 (average value) is highly overall correlated with 측정기 상태 (instrument status) [High correlation]
측정기 상태 is highly overall correlated with 평균값 [High correlation]
국가 기준초과 구분 (national standard exceedance flag) is highly overall correlated with 지자체 기준초과 구분 (local government standard exceedance flag) [High correlation]
지자체 기준초과 구분 is highly overall correlated with 국가 기준초과 구분 [High correlation]
국가 기준초과 구분 is highly imbalanced (87.1%) [Imbalance]
지자체 기준초과 구분 is highly imbalanced (80.9%) [Imbalance]
평균값 has 181 (1.8%) zeros [Zeros]
측정기 상태 has 8190 (81.9%) zeros [Zeros]
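
The two imbalance percentages are consistent with one minus the normalized Shannon entropy of the category shares. The Python sketch below assumes that definition (it is inferred from the figures, not stated in this report) and uses the category counts reported for the two flag variables (9822/178 and 9707/293). The function name `imbalance_score` is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the category shares and k is the number of categories.
    0.0 means perfectly balanced; values near 1.0 mean one category dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    shares = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in shares)
    return 1.0 - entropy / math.log2(k)


# Category counts (value 0, value 1) copied from the two flag variables.
national = imbalance_score([9822, 178])
local = imbalance_score([9707, 293])
print(f"{national:.1%} {local:.1%}")
```

Under this assumption the scores come out at about 87.1% and 80.9%, matching the alerts.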

Reproduction

Analysis started: 2024-05-11 06:58:15.809498
Analysis finished: 2024-05-11 06:58:20.107853
Duration: 4.3 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

측정일시 (measurement date and time)
Real number (ℝ)

Distinct: 572
Distinct (%): 5.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0030112 × 10^9
Minimum: 2.0030101 × 10^9
Maximum: 2.0030124 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2.0030101 × 10^9
5-th percentile: 2.0030102 × 10^9
Q1: 2.0030106 × 10^9
Median: 2.0030112 × 10^9
Q3: 2.0030118 × 10^9
95-th percentile: 2.0030123 × 10^9
Maximum: 2.0030124 × 10^9
Range: 2319
Interquartile range (IQR): 1195.25

Descriptive statistics

Standard deviation: 688.16185
Coefficient of variation (CV): 3.4356365 × 10^-7
Kurtosis: -1.1981722
Mean: 2.0030112 × 10^9
Median Absolute Deviation (MAD): 598
Skewness: 0.0141841
Sum: 2.0030112 × 10^13
Variance: 473566.74
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values
Value                 Count   Frequency (%)
2003010911            32      0.3%
2003011913            30      0.3%
2003011108            29      0.3%
2003010809            28      0.3%
2003011502            27      0.3%
2003011612            27      0.3%
2003012410            27      0.3%
2003010907            26      0.3%
2003010121            26      0.3%
2003010101            26      0.3%
Other values (562)    9722    97.2%

Extreme values (minimum)
Value                 Count   Frequency (%)
2003010100            21      0.2%
2003010101            26      0.3%
2003010102            15      0.1%
2003010103            23      0.2%
2003010104            16      0.2%
2003010105            13      0.1%
2003010106            14      0.1%
2003010107            18      0.2%
2003010108            20      0.2%
2003010109            23      0.2%

Extreme values (maximum)
Value                 Count   Frequency (%)
2003012419            5       0.1%
2003012418            11      0.1%
2003012417            21      0.2%
2003012416            16      0.2%
2003012415            15      0.1%
2003012414            15      0.1%
2003012413            17      0.2%
2003012412            22      0.2%
2003012411            24      0.2%
2003012410            27      0.3%
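
The values of 측정일시 read naturally as timestamps packed into integers. Assuming the layout is YYYYMMDDHH (an interpretation of the values shown above, not something the report states), a minimal Python sketch can decode them. The function name `parse_measurement_hour` is hypothetical.

```python
from datetime import datetime


def parse_measurement_hour(code: int) -> datetime:
    """Decode an integer such as 2003010911, assuming a YYYYMMDDHH layout."""
    return datetime.strptime(str(code), "%Y%m%d%H")


first = parse_measurement_hour(2003010100)  # smallest value in the tables above
last = parse_measurement_hour(2003012419)   # largest value in the tables above

# Number of hourly slots between the two timestamps, both ends included.
hours_spanned = int((last - first).total_seconds() // 3600) + 1
print(first, last, hours_spanned)
```

Under that assumption the smallest and largest values are 571 hours apart, giving 572 hourly slots, which agrees with the Distinct count of 572. It also suggests that statistics such as the mean or variance of the raw integers carry little meaning for this column.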

측정소 코드 (monitoring station code)
Real number (ℝ)

Distinct: 25
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 112.9614
Minimum: 101
Maximum: 125
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 101
5-th percentile: 102
Q1: 107
Median: 113
Q3: 119
95-th percentile: 124
Maximum: 125
Range: 24
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 7.1693689
Coefficient of variation (CV): 0.063467422
Kurtosis: -1.1828797
Mean: 112.9614
Median Absolute Deviation (MAD): 6
Skewness: 0.0061172693
Sum: 1129614
Variance: 51.39985
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=25)]

Common values
Value                Count   Frequency (%)
116                  450     4.5%
106                  440     4.4%
114                  425     4.2%
101                  418     4.2%
107                  418     4.2%
125                  410     4.1%
112                  410     4.1%
122                  407     4.1%
111                  407     4.1%
117                  405     4.0%
Other values (15)    5810    58.1%

Extreme values (minimum)
Value                Count   Frequency (%)
101                  418     4.2%
102                  374     3.7%
103                  393     3.9%
104                  380     3.8%
105                  398     4.0%
106                  440     4.4%
107                  418     4.2%
108                  383     3.8%
109                  384     3.8%
110                  403     4.0%

Extreme values (maximum)
Value                Count   Frequency (%)
125                  410     4.1%
124                  372     3.7%
123                  381     3.8%
122                  407     4.1%
121                  391     3.9%
120                  374     3.7%
119                  385     3.9%
118                  404     4.0%
117                  405     4.0%
116                  450     4.5%

측정항목 (measured item)
Real number (ℝ)

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.334
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 6
Q3: 8
95-th percentile: 9
Maximum: 9
Range: 8
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.7238183
Coefficient of variation (CV): 0.51065209
Kurtosis: -1.1752196
Mean: 5.334
Median Absolute Deviation (MAD): 2
Skewness: -0.21372359
Sum: 53340
Variance: 7.4191859
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=6)]

Common values
Value   Count   Frequency (%)
6       1738    17.4%
5       1721    17.2%
8       1658    16.6%
1       1645    16.4%
3       1624    16.2%
9       1614    16.1%

Extreme values (minimum)
Value   Count   Frequency (%)
1       1645    16.4%
3       1624    16.2%
5       1721    17.2%
6       1738    17.4%
8       1658    16.6%
9       1614    16.1%

Extreme values (maximum)
Value   Count   Frequency (%)
9       1614    16.1%
8       1658    16.6%
6       1738    17.4%
5       1721    17.2%
3       1624    16.2%
1       1645    16.4%

평균값 (average value)
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 379
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -1026.3922
Minimum: -9999
Maximum: 333
Zeros: 181
Zeros (%): 1.8%
Negative: 1688
Negative (%): 16.9%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -9999
5-th percentile: -9999
Q1: 0.003
Median: 0.028
Q3: 1.6
95-th percentile: 100
Maximum: 333
Range: 10332
Interquartile range (IQR): 1.597

Descriptive statistics

Standard deviation: 3037.1718
Coefficient of variation (CV): -2.9590752
Kurtosis: 4.8323413
Mean: -1026.3922
Median Absolute Deviation (MAD): 0.672
Skewness: -2.6101415
Sum: -10263922
Variance: 9224412.8
Monotonicity: Not monotonic
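
Several of the figures for this variable follow directly from others: the range is the maximum minus the minimum, the IQR is Q3 minus Q1, and the coefficient of variation is the standard deviation divided by the mean. A short Python check using only numbers copied from this section (the variable names are illustrative):

```python
# Values copied from the statistics reported for this variable.
minimum, maximum = -9999.0, 333.0
q1, q3 = 0.003, 1.6
mean, std = -1026.3922, 3037.1718

value_range = maximum - minimum  # spread between the two extremes
iqr = q3 - q1                    # spread of the middle half of the data
cv = std / mean                  # negative only because the mean is negative

print(value_range, round(iqr, 3), round(cv, 4))
```

The CV is negative here purely because the mean is negative, so its sign says nothing about the dispersion itself.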
[Figure: histogram with fixed size bins (bins=50)]

Common values
Value                 Count   Frequency (%)
-9999.0               1026    10.3%
-9.999                488     4.9%
0.002                 338     3.4%
0.003                 255     2.5%
0.004                 241     2.4%
0.001                 223     2.2%
0.007                 197     2.0%
0.005                 195     1.9%
0.006                 187     1.9%
0.0                   181     1.8%
Other values (369)    6669    66.7%

Extreme values (minimum)
Value                 Count   Frequency (%)
-9999.0               1026    10.3%
-999.9                172     1.7%
-10.0                 1       < 0.1%
-9.999                488     4.9%
-0.002                1       < 0.1%
0.0                   181     1.8%
0.001                 223     2.2%
0.002                 338     3.4%
0.003                 255     2.5%
0.004                 241     2.4%

Extreme values (maximum)
Value                 Count   Frequency (%)
333.0                 1       < 0.1%
304.0                 1       < 0.1%
290.0                 1       < 0.1%
288.0                 1       < 0.1%
286.0                 1       < 0.1%
285.0                 1       < 0.1%
284.0                 1       < 0.1%
279.0                 2       < 0.1%
277.0                 1       < 0.1%
271.0                 1       < 0.1%
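
Three repeated negative values (-9999.0, -999.9 and -9.999) account for 1,686 of the 1,688 negative records in this variable. Assuming they are placeholder codes for invalid readings (an interpretation, not something the report states), a minimal Python sketch can mask them before computing summary statistics. The set `SENTINELS`, the function `mask_sentinels`, and the ten sample values (taken from the Sample section of this report) are used here for illustration only.

```python
from statistics import mean

# Assumed placeholder codes; the report does not document their meaning.
SENTINELS = {-9999.0, -999.9, -9.999}


def mask_sentinels(values, sentinels=SENTINELS):
    """Replace assumed placeholder codes with None, keep other values."""
    return [None if v in sentinels else v for v in values]


# 평균값 of the first ten sample rows shown in the Sample section.
sample = [0.005, 0.038, -9.999, 0.056, 0.5, 0.007, 33.0, 0.024, 0.004, 0.026]
cleaned = mask_sentinels(sample)
valid = [v for v in cleaned if v is not None]

# Counts of the three codes, copied from the extreme-value table above.
sentinel_records = 1026 + 172 + 488
print(len(valid), mean(valid), sentinel_records / 10000)
```

If the assumption holds, roughly 17% of the records carry a placeholder rather than a measurement, which would explain the strongly negative mean and skewness reported above.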

측정기 상태 (instrument status)
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.6901
Minimum: 0
Maximum: 9
Zeros: 8190
Zeros (%): 81.9%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 4
Maximum: 9
Range: 9
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.5249572
Coefficient of variation (CV): 2.2097627
Kurtosis: 2.766508
Mean: 0.6901
Median Absolute Deviation (MAD): 0
Skewness: 1.9632255
Sum: 6901
Variance: 2.3254945
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=6)]

Common values
Value   Count   Frequency (%)
0       8190    81.9%
4       1577    15.8%
2       143     1.4%
1       62      0.6%
9       21      0.2%
8       7       0.1%

Extreme values (minimum)
Value   Count   Frequency (%)
0       8190    81.9%
1       62      0.6%
2       143     1.4%
4       1577    15.8%
8       7       0.1%
9       21      0.2%

Extreme values (maximum)
Value   Count   Frequency (%)
9       21      0.2%
8       7       0.1%
4       1577    15.8%
2       143     1.4%
1       62      0.6%
0       8190    81.9%
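
Because this variable takes only six distinct values, its sum, mean, variance and standard deviation can be recomputed from the frequency table alone. The Python sketch below does so, assuming the reported variance is the sample variance (divisor n - 1), which is what the figures are consistent with. The function name `moments_from_counts` is illustrative.

```python
import math

# Value -> count, copied from the frequency table above.
status_counts = {0: 8190, 1: 62, 2: 143, 4: 1577, 8: 7, 9: 21}


def moments_from_counts(counts):
    """Return (n, total, mean, sample variance, sample std) from a value->count map."""
    n = sum(counts.values())
    total = sum(value * count for value, count in counts.items())
    mean = total / n
    squared_dev = sum(count * (value - mean) ** 2 for value, count in counts.items())
    variance = squared_dev / (n - 1)  # sample variance (assumed convention)
    return n, total, mean, variance, math.sqrt(variance)


n, total, mean, variance, std = moments_from_counts(status_counts)
print(n, total, mean, round(variance, 7), round(std, 7))
```

The recomputed sum (6901) and mean (0.6901) match the report exactly, and the variance and standard deviation agree with the reported 2.3254945 and 1.5249572 to the digits shown.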

국가 기준초과 구분 (national standard exceedance flag)
Categorical

Alerts: High correlation, Imbalance

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Category counts: 0 (9822), 1 (178)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value   Count   Frequency (%)
0       9822    98.2%
1       178     1.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count   Frequency (%)
0       9822    98.2%
1       178     1.8%

지자체 기준초과 구분 (local government standard exceedance flag)
Categorical

Alerts: High correlation, Imbalance

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Category counts: 0 (9707), 1 (293)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value   Count   Frequency (%)
0       9707    97.1%
1       293     2.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count   Frequency (%)
0       9707    97.1%
1       293     2.9%

Interactions

[Figure: pairwise interaction plots (25 panels)]

Correlations

Variables: (1) 측정일시, (2) 측정소 코드, (3) 측정항목, (4) 평균값, (5) 측정기 상태, (6) 국가 기준초과 구분, (7) 지자체 기준초과 구분

          (1)      (2)      (3)      (4)      (5)      (6)      (7)
(1)     1.000    0.021    0.000    0.025    0.097    0.245    0.262
(2)     0.021    1.000    0.000    0.281    0.354    0.043    0.065
(3)     0.000    0.000    1.000    0.397    0.516    0.415    0.536
(4)     0.025    0.281    0.397    1.000    0.655    0.021    0.033
(5)     0.097    0.354    0.516    0.655    1.000    0.115    0.123
(6)     0.245    0.043    0.415    0.021    0.115    1.000    0.937
(7)     0.262    0.065    0.536    0.033    0.123    0.937    1.000

          (6)      (7)
(6)     1.000    0.773
(7)     0.773    1.000

          (1)      (2)      (3)      (4)      (5)      (6)      (7)
(1)     1.000   -0.007    0.008    0.000   -0.012    0.187    0.201
(2)    -0.007    1.000   -0.016    0.098   -0.193    0.033    0.049
(3)     0.008   -0.016    1.000    0.197    0.286    0.299    0.387
(4)     0.000    0.098    0.197    1.000   -0.618    0.048    0.063
(5)    -0.012   -0.193    0.286   -0.618    1.000    0.083    0.089
(6)     0.187    0.033    0.299    0.048    0.083    1.000    0.773
(7)     0.201    0.049    0.387    0.063    0.089    0.773    1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index)  측정일시     측정소 코드  측정항목  평균값    측정기 상태  국가 기준초과 구분  지자체 기준초과 구분
61788    2003011803   124          1         0.005     0            0                   0
41899    2003011215   109          3         0.038     0            0                   0
70803    2003012016   101          6         -9.999    4            0                   0
56959    2003011619   119          3         0.056     0            0                   0
57530    2003011623   114          5         0.5       0            0                   0
67974    2003011921   105          1         0.007     0            0                   0
48850    2003011413   117          8         33.0      0            0                   0
16449    2003010513   117          6         0.024     0            0                   0
43965    2003011305   103          6         0.004     0            0                   0
2523     2003010116   121          6         0.026     0            0                   0

(index)  측정일시     측정소 코드  측정항목  평균값    측정기 상태  국가 기준초과 구분  지자체 기준초과 구분
67581    2003011918   114          6         0.015     0            0                   0
27329    2003010814   105          9         -9999.0   4            0                   0
69838    2003012009   115          8         72.0      0            0                   0
27430    2003010814   122          8         83.0      0            0                   0
40242    2003011204   108          1         0.008     0            0                   0
30317    2003010910   103          9         0.0       1            0                   0
62490    2003011808   116          1         0.005     1            0                   0
26326    2003010807   113          8         194.0     0            1                   1
76426    2003012205   113          8         97.0      0            0                   0
16354    2003010513   101          8         -9999.0   4            0                   0