Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 693.4 KiB
Average record size in memory: 71.0 B

Variable types

Numeric: 4
Categorical: 3

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15526/S/1/datasetView.do

Alerts

지자체 기준초과 구분 has constant value "" [Constant]
측정항목 is highly overall correlated with 평균값 [High correlation]
평균값 is highly overall correlated with 측정항목 [High correlation]
측정기 상태 is highly imbalanced (95.5%) [Imbalance]
국가 기준초과 구분 is highly imbalanced (99.6%) [Imbalance]
평균값 is highly skewed (γ1 = -55.78194489) [Skewed]
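
The two imbalance percentages in the alerts can be reproduced from the category counts listed later in this report. The sketch below assumes the score is one minus the Shannon entropy of the class shares, normalized by the log of the number of classes. That formula is an assumption that happens to match the reported figures, not something the report states. The function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (in bits) of the class shares.

    Assumed formula: 0 means perfectly balanced classes, values near 1 mean
    one class dominates. A single-class column returns 0.0 in this sketch.
    """
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Category counts copied from this report.
status_counts = [9905, 57, 29, 9]   # 측정기 상태
national_flag_counts = [9997, 3]    # 국가 기준초과 구분

print(f"{imbalance_score(status_counts):.1%}")         # 95.5%
print(f"{imbalance_score(national_flag_counts):.1%}")  # 99.6%
```

Both results agree with the alert text, which suggests the alerts reflect how concentrated each column is in its dominant category rather than a simple majority share.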

Reproduction

Analysis started: 2024-05-04 03:58:26.160405
Analysis finished: 2024-05-04 03:58:35.045871
Duration: 8.89 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

측정일시 (measurement date and time)
Real number (ℝ)

Distinct: 440
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.010011 × 10^9
Minimum: 2.0100101 × 10^9
Maximum: 2.0100119 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2.0100101 × 10^9
5-th percentile: 2.0100101 × 10^9
Q1: 2.0100105 × 10^9
Median: 2.010011 × 10^9
Q3: 2.0100114 × 10^9
95-th percentile: 2.0100118 × 10^9
Maximum: 2.0100119 × 10^9
Range: 1807
Interquartile range (IQR): 904

Descriptive statistics

Standard deviation: 530.60642
Coefficient of variation (CV): 2.6398185 × 10^-7
Kurtosis: -1.2045808
Mean: 2.010011 × 10^9
Median Absolute Deviation (MAD): 484
Skewness: 0.0038751439
Sum: 2.010011 × 10^13
Variance: 281543.18
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
2010011011 | 36 | 0.4%
2010010311 | 36 | 0.4%
2010010921 | 36 | 0.4%
2010011007 | 34 | 0.3%
2010010913 | 33 | 0.3%
2010011818 | 33 | 0.3%
2010011900 | 33 | 0.3%
2010010914 | 32 | 0.3%
2010010323 | 32 | 0.3%
2010010201 | 31 | 0.3%
Other values (430) | 9664 | 96.6%

Minimum values
Value | Count | Frequency (%)
2010010100 | 19 | 0.2%
2010010101 | 19 | 0.2%
2010010102 | 19 | 0.2%
2010010103 | 21 | 0.2%
2010010104 | 21 | 0.2%
2010010105 | 27 | 0.3%
2010010106 | 22 | 0.2%
2010010107 | 25 | 0.2%
2010010108 | 20 | 0.2%
2010010109 | 18 | 0.2%

Maximum values
Value | Count | Frequency (%)
2010011907 | 19 | 0.2%
2010011906 | 20 | 0.2%
2010011905 | 19 | 0.2%
2010011904 | 20 | 0.2%
2010011903 | 27 | 0.3%
2010011902 | 31 | 0.3%
2010011901 | 28 | 0.3%
2010011900 | 33 | 0.3%
2010011823 | 15 | 0.1%
2010011822 | 18 | 0.2%
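
The report profiles 측정일시 as a real number, but its values look like hour stamps packed into integers. The sketch below assumes the layout is YYYYMMDDHH. That is inferred from values such as 2010010100 and 2010011907 and is not stated in the report. The helper name `parse_measurement_time` is hypothetical.

```python
from datetime import datetime

def parse_measurement_time(value: int) -> datetime:
    """Parse an integer such as 2010010100, assumed to be in YYYYMMDDHH form."""
    return datetime.strptime(str(value), "%Y%m%d%H")

# Minimum and maximum values taken from the tables above.
first = parse_measurement_time(2010010100)
last = parse_measurement_time(2010011907)

# Number of hourly slots from first to last, inclusive.
hours_covered = int((last - first).total_seconds() // 3600) + 1

print(first, last)
print(hours_covered)  # 440
```

The 440 hourly slots match the Distinct count of 440 reported for this variable, which is consistent with the column holding one stamp per hour over the covered period.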

측정소 코드 (monitoring station code)
Real number (ℝ)

Distinct: 25
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 112.965
Minimum: 101
Maximum: 125
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 101
5-th percentile: 102
Q1: 107
Median: 113
Q3: 119
95-th percentile: 124
Maximum: 125
Range: 24
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 7.2202346
Coefficient of variation (CV): 0.063915679
Kurtosis: -1.221894
Mean: 112.965
Median Absolute Deviation (MAD): 6
Skewness: -0.0016855759
Sum: 1129650
Variance: 52.131788
Monotonicity: Not monotonic
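
Several of the descriptive figures for 측정소 코드 follow from the others by simple arithmetic. As a quick consistency check, the sketch below recomputes them from the values printed in this report. The variable names are illustrative only.

```python
# Figures copied from the statistics reported for 측정소 코드.
std_dev = 7.2202346
mean = 112.965
q1, q3 = 107, 119
minimum, maximum = 101, 125

cv = std_dev / mean              # coefficient of variation = std / mean
variance = std_dev ** 2          # variance = std squared
iqr = q3 - q1                    # interquartile range = Q3 - Q1
value_range = maximum - minimum  # range = max - min

print(f"CV={cv:.6f} variance={variance:.4f} IQR={iqr} range={value_range}")
```

The recomputed values agree with the reported CV, variance, IQR, and range up to rounding of the printed inputs.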
Histogram with fixed size bins (bins=25)
Common values
Value | Count | Frequency (%)
104 | 431 | 4.3%
107 | 430 | 4.3%
121 | 430 | 4.3%
120 | 424 | 4.2%
112 | 423 | 4.2%
123 | 422 | 4.2%
119 | 414 | 4.1%
103 | 413 | 4.1%
101 | 411 | 4.1%
111 | 409 | 4.1%
Other values (15) | 5793 | 57.9%

Minimum values
Value | Count | Frequency (%)
101 | 411 | 4.1%
102 | 383 | 3.8%
103 | 413 | 4.1%
104 | 431 | 4.3%
105 | 398 | 4.0%
106 | 381 | 3.8%
107 | 430 | 4.3%
108 | 390 | 3.9%
109 | 374 | 3.7%
110 | 405 | 4.0%

Maximum values
Value | Count | Frequency (%)
125 | 356 | 3.6%
124 | 399 | 4.0%
123 | 422 | 4.2%
122 | 397 | 4.0%
121 | 430 | 4.3%
120 | 424 | 4.2%
119 | 414 | 4.1%
118 | 403 | 4.0%
117 | 384 | 3.8%
116 | 356 | 3.6%

측정항목 (measurement item)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.3319
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 6
Q3: 8
95-th percentile: 9
Maximum: 9
Range: 8
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.7279455
Coefficient of variation (CV): 0.51162728
Kurtosis: -1.1914248
Mean: 5.3319
Median Absolute Deviation (MAD): 2
Skewness: -0.20148428
Sum: 53319
Variance: 7.4416866
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Common values
Value | Count | Frequency (%)
6 | 1746 | 17.5%
3 | 1713 | 17.1%
5 | 1641 | 16.4%
8 | 1641 | 16.4%
9 | 1639 | 16.4%
1 | 1620 | 16.2%

Minimum values
Value | Count | Frequency (%)
1 | 1620 | 16.2%
3 | 1713 | 17.1%
5 | 1641 | 16.4%
6 | 1746 | 17.5%
8 | 1641 | 16.4%
9 | 1639 | 16.4%

Maximum values
Value | Count | Frequency (%)
9 | 1639 | 16.4%
8 | 1641 | 16.4%
6 | 1746 | 17.5%
5 | 1641 | 16.4%
3 | 1713 | 17.1%
1 | 1620 | 16.2%

평균값 (average value)
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 250
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.677921
Minimum: -9999
Maximum: 224
Zeros: 22
Zeros (%): 0.2%
Negative: 6
Negative (%): 0.1%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -9999
5-th percentile: 0.003
Q1: 0.011
Median: 0.077
Q3: 27
95-th percentile: 68
Maximum: 224
Range: 10223
Interquartile range (IQR): 26.989

Descriptive statistics

Standard deviation: 175.42232
Coefficient of variation (CV): 15.021708
Kurtosis: 3180.8685
Mean: 11.677921
Median Absolute Deviation (MAD): 0.076
Skewness: -55.781945
Sum: 116779.21
Variance: 30772.991
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
0.006 | 316 | 3.2%
0.005 | 312 | 3.1%
0.007 | 300 | 3.0%
0.008 | 285 | 2.9%
0.003 | 240 | 2.4%
0.009 | 234 | 2.3%
0.5 | 211 | 2.1%
0.6 | 206 | 2.1%
0.004 | 205 | 2.1%
0.002 | 196 | 2.0%
Other values (240) | 7495 | 75.0%

Minimum values
Value | Count | Frequency (%)
-9999.0 | 3 | < 0.1%
-999.9 | 1 | < 0.1%
-9.999 | 2 | < 0.1%
0.0 | 22 | 0.2%
0.001 | 123 | 1.2%
0.002 | 196 | 2.0%
0.003 | 240 | 2.4%
0.004 | 205 | 2.1%
0.005 | 312 | 3.1%
0.006 | 316 | 3.2%

Maximum values
Value | Count | Frequency (%)
224.0 | 1 | < 0.1%
165.0 | 1 | < 0.1%
155.0 | 1 | < 0.1%
128.0 | 1 | < 0.1%
126.0 | 2 | < 0.1%
125.0 | 1 | < 0.1%
122.0 | 1 | < 0.1%
121.0 | 1 | < 0.1%
120.0 | 1 | < 0.1%
119.0 | 1 | < 0.1%
< 0.1%

측정기 상태 (instrument status)
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Counts by category: 0 (9905), 1 (57), 9 (29), 2 (9)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9905 | 99.1%
1 | 57 | 0.6%
9 | 29 | 0.3%
2 | 9 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)


국가 기준초과 구분 (national standard exceedance flag)
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Counts by category: 0 (9997), 1 (3)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9997 | > 99.9%
1 | 3 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)


지자체 기준초과 구분 (local government standard exceedance flag)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Counts by category: 0 (10000)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)


Interactions


Correlations

Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분
측정일시 | 1.000 | 0.000 | 0.007 | 0.025 | 0.081 | 0.029
측정소 코드 | 0.000 | 1.000 | 0.000 | 0.019 | 0.112 | 0.000
측정항목 | 0.007 | 0.000 | 1.000 | 0.004 | 0.088 | 0.043
평균값 | 0.025 | 0.019 | 0.004 | 1.000 | 0.203 | 0.000
측정기 상태 | 0.081 | 0.112 | 0.088 | 0.203 | 1.000 | 0.000
국가 기준초과 구분 | 0.029 | 0.000 | 0.043 | 0.000 | 0.000 | 1.000

Variable | 국가 기준초과 구분 | 측정기 상태
국가 기준초과 구분 | 1.000 | 0.000
측정기 상태 | 0.000 | 1.000

Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분
측정일시 | 1.000 | -0.002 | 0.002 | 0.039 | 0.048 | 0.022
측정소 코드 | -0.002 | 1.000 | 0.007 | 0.001 | 0.067 | 0.000
측정항목 | 0.002 | 0.007 | 1.000 | 0.677 | 0.057 | 0.031
평균값 | 0.039 | 0.001 | 0.677 | 1.000 | 0.186 | 0.000
측정기 상태 | 0.048 | 0.067 | 0.057 | 0.186 | 1.000 | 0.000
국가 기준초과 구분 | 0.022 | 0.000 | 0.031 | 0.000 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

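Index | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
37259 | 2010011108 | 110 | 9 | 49.0 | 0 | 0 | 0
13131 | 2010010415 | 114 | 6 | 0.025 | 0 | 0 | 0
37848 | 2010011112 | 109 | 1 | 0.013 | 0 | 0 | 0
45363 | 2010011314 | 111 | 6 | 0.027 | 0 | 0 | 0
2679 | 2010010117 | 122 | 6 | 0.002 | 0 | 0 | 0
19350 | 2010010609 | 101 | 1 | 0.008 | 0 | 0 | 0
40598 | 2010011206 | 117 | 5 | 0.7 | 0 | 0 | 0
29086 | 2010010901 | 123 | 8 | 87.0 | 0 | 0 | 0
33980 | 2010011010 | 114 | 5 | 0.8 | 0 | 0 | 0
12606 | 2010010412 | 102 | 1 | 0.005 | 0 | 0 | 0

Index | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
49276 | 2010011416 | 113 | 8 | 42.0 | 0 | 0 | 0
30451 | 2010010911 | 101 | 3 | 0.065 | 0 | 0 | 0
5651 | 2010010213 | 117 | 9 | 31.0 | 0 | 0 | 0
14268 | 2010010423 | 104 | 1 | 0.006 | 0 | 0 | 0
575 | 2010010103 | 121 | 9 | 19.0 | 0 | 0 | 0
47761 | 2010011406 | 111 | 3 | 0.056 | 0 | 0 | 0
40843 | 2010011208 | 108 | 3 | 0.032 | 0 | 0 | 0
32173 | 2010010922 | 113 | 3 | 0.069 | 0 | 0 | 0
19343 | 2010010608 | 124 | 9 | 23.0 | 0 | 0 | 0
1254 | 2010010108 | 110 | 1 | 0.007 | 0 | 0 | 0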