Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 693.4 KiB
Average record size in memory: 71.0 B

Variable types

Numeric: 4
Categorical: 3

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15526/S/1/datasetView.do

Alerts

측정항목 is highly overall correlated with 평균값 (High correlation)
평균값 is highly overall correlated with 측정항목 (High correlation)
국가 기준초과 구분 is highly overall correlated with 지자체 기준초과 구분 (High correlation)
지자체 기준초과 구분 is highly overall correlated with 국가 기준초과 구분 (High correlation)
측정기 상태 is highly imbalanced (95.1%) (Imbalance)
국가 기준초과 구분 is highly imbalanced (99.4%) (Imbalance)
지자체 기준초과 구분 is highly imbalanced (99.4%) (Imbalance)
평균값 is highly skewed (γ1 = -54.8479159) (Skewed)
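
The imbalance percentages in these alerts are consistent with one minus the normalized Shannon entropy of the category counts. The sketch below rests on that assumption and is not a statement of how the profiler is implemented; the function name `imbalance_score` is hypothetical, and the counts are copied from the Common Values tables of this report.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (in bits) of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no meaningful entropy ratio
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Counts copied from this report (assumed to be the inputs of the alerts).
status_counts = [9878, 58, 45, 16, 3]   # 측정기 상태
exceed_counts = [9995, 5]               # 국가 / 지자체 기준초과 구분

print(f"{imbalance_score(status_counts):.1%}")  # 95.1%
print(f"{imbalance_score(exceed_counts):.1%}")  # 99.4%
```

A score near 100% means that almost all rows fall into one category, which is the case for all three categorical variables here.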

Reproduction

Analysis started: 2024-04-27 12:03:44.625415
Analysis finished: 2024-04-27 12:03:50.964634
Duration: 6.34 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

측정일시 (measurement date and time)
Real number (ℝ)

Distinct: 461
Distinct (%): 4.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.015011 × 10^9
Minimum: 2.0150101 × 10^9
Maximum: 2.015012 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2.0150101 × 10^9
5-th percentile: 2.0150101 × 10^9
Q1: 2.0150105 × 10^9
Median: 2.015011 × 10^9
Q3: 2.0150115 × 10^9
95-th percentile: 2.0150119 × 10^9
Maximum: 2.015012 × 10^9
Range: 1904
Interquartile range (IQR): 991

Descriptive statistics

Standard deviation: 555.17579
Coefficient of variation (CV): 2.7551998 × 10^-7
Kurtosis: -1.1957802
Mean: 2.015011 × 10^9
Median Absolute Deviation (MAD): 495.5
Skewness: 0.040515679
Sum: 2.015011 × 10^13
Variance: 308220.16
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
2015010812 | 35 | 0.4%
2015010921 | 33 | 0.3%
2015010913 | 32 | 0.3%
2015010423 | 32 | 0.3%
2015010409 | 31 | 0.3%
2015010103 | 31 | 0.3%
2015011200 | 31 | 0.3%
2015011215 | 31 | 0.3%
2015011908 | 30 | 0.3%
2015010403 | 30 | 0.3%
Other values (451) | 9684 | 96.8%

Value | Count | Frequency (%)
2015010100 | 10 | 0.1%
2015010101 | 26 | 0.3%
2015010102 | 25 | 0.2%
2015010103 | 31 | 0.3%
2015010104 | 22 | 0.2%
2015010105 | 23 | 0.2%
2015010106 | 25 | 0.2%
2015010107 | 27 | 0.3%
2015010108 | 24 | 0.2%
2015010109 | 24 | 0.2%

Value | Count | Frequency (%)
2015012004 | 10 | 0.1%
2015012003 | 24 | 0.2%
2015012002 | 19 | 0.2%
2015012001 | 20 | 0.2%
2015012000 | 29 | 0.3%
2015011923 | 19 | 0.2%
2015011922 | 14 | 0.1%
2015011921 | 28 | 0.3%
2015011920 | 21 | 0.2%
2015011919 | 20 | 0.2%
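
The profiler typed 측정일시 as a real number, but values such as 2015010100 and 2015012004 look like hourly timestamps encoded as YYYYMMDDHH. That encoding is an assumption inferred from the values, not something the report states. Under that assumption, a minimal sketch of the conversion with the standard library looks like this (the helper name `parse_hour_stamp` is hypothetical):

```python
from datetime import datetime

def parse_hour_stamp(value):
    """Convert an integer such as 2015010812 to a datetime, assuming YYYYMMDDHH."""
    return datetime.strptime(str(value), "%Y%m%d%H")

# Smallest and largest values from the tables in this section.
first = parse_hour_stamp(2015010100)
last = parse_hour_stamp(2015012004)

# Number of hourly slots between the two stamps, both ends included.
span_hours = int((last - first).total_seconds() // 3600) + 1
print(first, last, span_hours)
```

The inclusive span is 461 hours, which equals the Distinct count reported for this variable and so supports reading the column as one timestamp per hour.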

측정소 코드 (station code)
Real number (ℝ)

Distinct: 25
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 112.8962
Minimum: 101
Maximum: 125
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 101
5-th percentile: 102
Q1: 107
Median: 113
Q3: 119
95-th percentile: 124
Maximum: 125
Range: 24
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 7.2011396
Coefficient of variation (CV): 0.063785491
Kurtosis: -1.2067831
Mean: 112.8962
Median Absolute Deviation (MAD): 6
Skewness: 0.017479664
Sum: 1128962
Variance: 51.856411
Monotonicity: Not monotonic
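
Several of these statistics are functions of the others, which gives a quick consistency check on the report. The sketch below uses the textbook definitions (range = max - min, IQR = Q3 - Q1, variance = std², CV = std / mean, mean = sum / n) with numbers copied from the 측정소 코드 tables; it is an illustration, not the profiler's own code.

```python
import math

# Values copied from the 측정소 코드 statistics in this report.
n = 10000
minimum, maximum = 101, 125
q1, q3 = 107, 119
mean, std = 112.8962, 7.2011396
total, variance, cv = 1128962, 51.856411, 0.063785491

checks = {
    "range": maximum - minimum == 24,
    "iqr": q3 - q1 == 12,
    "variance": math.isclose(std ** 2, variance, rel_tol=1e-6),
    "cv": math.isclose(std / mean, cv, rel_tol=1e-6),
    "mean": math.isclose(total / n, mean, rel_tol=1e-9),
}
for name, ok in checks.items():
    print(f"{name}: {'consistent' if ok else 'MISMATCH'}")
```

All five checks pass for this variable, so the reported figures agree with one another up to rounding.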
Histogram with fixed size bins (bins=25)
Value | Count | Frequency (%)
115 | 441 | 4.4%
104 | 438 | 4.4%
102 | 437 | 4.4%
113 | 432 | 4.3%
108 | 426 | 4.3%
106 | 411 | 4.1%
107 | 410 | 4.1%
111 | 410 | 4.1%
123 | 408 | 4.1%
122 | 407 | 4.1%
Other values (15) | 5780 | 57.8%

Value | Count | Frequency (%)
101 | 386 | 3.9%
102 | 437 | 4.4%
103 | 380 | 3.8%
104 | 438 | 4.4%
105 | 391 | 3.9%
106 | 411 | 4.1%
107 | 410 | 4.1%
108 | 426 | 4.3%
109 | 389 | 3.9%
110 | 388 | 3.9%

Value | Count | Frequency (%)
125 | 373 | 3.7%
124 | 393 | 3.9%
123 | 408 | 4.1%
122 | 407 | 4.1%
121 | 377 | 3.8%
120 | 386 | 3.9%
119 | 401 | 4.0%
118 | 400 | 4.0%
117 | 390 | 3.9%
116 | 389 | 3.9%

측정항목 (measured item)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.3087
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 5
Q3: 8
95-th percentile: 9
Maximum: 9
Range: 8
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.7334871
Coefficient of variation (CV): 0.51490705
Kurtosis: -1.20046
Mean: 5.3087
Median Absolute Deviation (MAD): 2
Skewness: -0.18504918
Sum: 53087
Variance: 7.4719515
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value | Count | Frequency (%)
3 | 1733 | 17.3%
6 | 1701 | 17.0%
5 | 1673 | 16.7%
9 | 1646 | 16.5%
1 | 1639 | 16.4%
8 | 1608 | 16.1%

Value | Count | Frequency (%)
1 | 1639 | 16.4%
3 | 1733 | 17.3%
5 | 1673 | 16.7%
6 | 1701 | 17.0%
8 | 1608 | 16.1%
9 | 1646 | 16.5%

Value | Count | Frequency (%)
9 | 1646 | 16.5%
8 | 1608 | 16.1%
6 | 1701 | 17.0%
5 | 1673 | 16.7%
3 | 1733 | 17.3%
1 | 1639 | 16.4%

평균값 (average value)
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 254
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.5309946
Minimum: -9999
Maximum: 158
Zeros: 7
Zeros (%): 0.1%
Negative: 18
Negative (%): 0.2%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -9999
5-th percentile: 0.003
Q1: 0.011
Median: 0.07
Q3: 19
95-th percentile: 58
Maximum: 158
Range: 10157
Interquartile range (IQR): 18.989

Descriptive statistics

Standard deviation: 176.45768
Coefficient of variation (CV): 20.684303
Kurtosis: 3103.4228
Mean: 8.5309946
Median Absolute Deviation (MAD): 0.068
Skewness: -54.847916
Sum: 85309.946
Variance: 31137.311
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0.006 | 409 | 4.1%
0.007 | 402 | 4.0%
0.005 | 323 | 3.2%
0.008 | 277 | 2.8%
0.003 | 235 | 2.4%
0.004 | 230 | 2.3%
0.5 | 221 | 2.2%
0.009 | 213 | 2.1%
0.4 | 206 | 2.1%
0.002 | 194 | 1.9%
Other values (244) | 7290 | 72.9%

Value | Count | Frequency (%)
-9999.0 | 3 | < 0.1%
-999.9 | 6 | 0.1%
-9.999 | 9 | 0.1%
0.0 | 7 | 0.1%
0.001 | 47 | 0.5%
0.002 | 194 | 1.9%
0.003 | 235 | 2.4%
0.004 | 230 | 2.3%
0.005 | 323 | 3.2%
0.006 | 409 | 4.1%

Value | Count | Frequency (%)
158.0 | 1 | < 0.1%
153.0 | 2 | < 0.1%
146.0 | 1 | < 0.1%
145.0 | 1 | < 0.1%
142.0 | 1 | < 0.1%
136.0 | 1 | < 0.1%
135.0 | 1 | < 0.1%
133.0 | 3 | < 0.1%
127.0 | 1 | < 0.1%
126.0 | 2 | < 0.1%
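
The three smallest values (-9999.0, -999.9 and -9.999) account for all 18 negative rows and look like sentinel codes for invalid readings rather than real measurements. The report does not document them, so treating them as sentinels is an assumption. The sketch below removes their contribution from the reported Sum to estimate what the mean would be without them.

```python
# Suspected sentinel values and their counts, copied from the table of
# smallest values in this report. Treating them as sentinels is an assumption.
sentinels = {-9999.0: 3, -999.9: 6, -9.999: 9}

n = 10000                 # number of observations
reported_sum = 85309.946  # Sum reported for 평균값

sentinel_rows = sum(sentinels.values())
sentinel_sum = sum(value * count for value, count in sentinels.items())

# Mean of the remaining rows once the suspected sentinels are excluded.
clean_mean = (reported_sum - sentinel_sum) / (n - sentinel_rows)
print(sentinel_rows, round(sentinel_sum, 3), round(clean_mean, 2))
```

Excluding those 18 rows moves the mean from about 8.53 to about 12.16, which suggests that the extreme skewness and kurtosis reported for this variable are driven largely by these few values.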

측정기 상태 (instrument status)
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9878 | 98.8%
1 | 58 | 0.6%
2 | 45 | 0.4%
9 | 16 | 0.2%
4 | 3 | < 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
Value | Count | Frequency (%)
0 | 9878 | 98.8%
1 | 58 | 0.6%
2 | 45 | 0.4%
9 | 16 | 0.2%
4 | 3 | < 0.1%

국가 기준초과 구분 (national standard exceedance flag)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9995 | > 99.9%
1 | 5 | 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
Value | Count | Frequency (%)
0 | 9995 | > 99.9%
1 | 5 | < 0.1%

지자체 기준초과 구분 (local government standard exceedance flag)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9995 | > 99.9%
1 | 5 | 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]
Value | Count | Frequency (%)
0 | 9995 | > 99.9%
1 | 5 | < 0.1%

Interactions

[Figures: 16 interaction plots]

Correlations

[Figure: correlation heatmap]
Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | 0.011 | 0.000 | 0.036 | 0.128 | 0.029 | 0.029
측정소 코드 | 0.011 | 1.000 | 0.000 | 0.041 | 0.120 | 0.043 | 0.043
측정항목 | 0.000 | 0.000 | 1.000 | 0.069 | 0.051 | 0.031 | 0.031
평균값 | 0.036 | 0.041 | 0.069 | 1.000 | 0.293 | 0.000 | 0.000
측정기 상태 | 0.128 | 0.120 | 0.051 | 0.293 | 1.000 | 0.107 | 0.107
국가 기준초과 구분 | 0.029 | 0.043 | 0.031 | 0.000 | 0.107 | 1.000 | 0.988
지자체 기준초과 구분 | 0.029 | 0.043 | 0.031 | 0.000 | 0.107 | 0.988 | 1.000

[Figure: correlation heatmap]
Variable | 지자체 기준초과 구분 | 측정기 상태 | 국가 기준초과 구분
지자체 기준초과 구분 | 1.000 | 0.131 | 0.900
측정기 상태 | 0.131 | 1.000 | 0.131
국가 기준초과 구분 | 0.900 | 0.131 | 1.000

[Figure: correlation heatmap]
Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | -0.008 | -0.005 | 0.043 | 0.053 | 0.022 | 0.022
측정소 코드 | -0.008 | 1.000 | 0.003 | -0.001 | 0.050 | 0.033 | 0.033
측정항목 | -0.005 | 0.003 | 1.000 | 0.702 | 0.034 | 0.023 | 0.023
평균값 | 0.043 | -0.001 | 0.702 | 1.000 | 0.341 | 0.000 | 0.000
측정기 상태 | 0.053 | 0.050 | 0.034 | 0.341 | 1.000 | 0.131 | 0.131
국가 기준초과 구분 | 0.022 | 0.033 | 0.023 | 0.000 | 0.131 | 1.000 | 0.900
지자체 기준초과 구분 | 0.022 | 0.033 | 0.023 | 0.000 | 0.131 | 0.900 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

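Index | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
47493 | 2015011404 | 116 | 6 | 0.002 | 0 | 0 | 0
64831 | 2015011900 | 106 | 3 | 0.024 | 0 | 0 | 0
24631 | 2015010720 | 106 | 3 | 0.04 | 0 | 0 | 0
1043 | 2015010106 | 124 | 9 | 27.0 | 0 | 0 | 0
31786 | 2015010919 | 123 | 8 | 64.0 | 0 | 0 | 0
37394 | 2015011109 | 108 | 5 | 0.8 | 0 | 0 | 0
36795 | 2015011105 | 108 | 6 | 0.002 | 0 | 0 | 0
56441 | 2015011616 | 107 | 9 | 46.0 | 0 | 0 | 0
32623 | 2015011001 | 113 | 3 | 0.015 | 0 | 0 | 0
17003 | 2015010517 | 109 | 9 | 40.0 | 0 | 0 | 0

Index | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
33930 | 2015011010 | 106 | 1 | 0.009 | 0 | 0 | 0
63120 | 2015011812 | 121 | 1 | 0.009 | 0 | 0 | 0
15307 | 2015010506 | 102 | 3 | 0.037 | 0 | 0 | 0
11745 | 2015010406 | 108 | 6 | 0.005 | 0 | 0 | 0
21657 | 2015010700 | 110 | 6 | 0.027 | 0 | 0 | 0
29089 | 2015010901 | 124 | 3 | 0.048 | 0 | 0 | 0
62126 | 2015011806 | 105 | 5 | 0.8 | 0 | 0 | 0
13139 | 2015010415 | 115 | 9 | 22.0 | 0 | 0 | 0
53813 | 2015011522 | 119 | 9 | 54.0 | 0 | 0 | 0
8562 | 2015010309 | 103 | 1 | 0.007 | 0 | 0 | 0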