Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 4
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 693.4 KiB
Average record size in memory: 71.0 B
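
The percentage and per-record figures follow from the raw counts. The following is a minimal Python sketch of that arithmetic. It assumes the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB is 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics above.
n_variables = 7
n_observations = 10_000
missing_cells = 4
total_size_kib = 693.4

# Assumption: the missing-cell percentage is computed over every cell.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"missing cells: {missing_pct:.4f}% of {total_cells} cells")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the missing share is well below the 0.1% display threshold, and the per-record size agrees with the reported 71.0 B.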

Variable types

Numeric: 5
Categorical: 2

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15526/S/1/datasetView.do

Alerts

지자체 기준초과 구분 (local-government standard exceedance flag) is highly correlated overall with 평균값 (average value) and 1 other field. [High correlation]
국가 기준초과 구분 (national standard exceedance flag) is highly correlated overall with 평균값 and 1 other field. [High correlation]
측정항목 (measurement item) is highly correlated overall with 평균값. [High correlation]
평균값 is highly correlated overall with 측정항목 and 2 other fields. [High correlation]
국가 기준초과 구분 is highly imbalanced (82.0%). [Imbalance]
지자체 기준초과 구분 is highly imbalanced (82.0%). [Imbalance]
측정기 상태 (instrument status) has 9845 (98.5%) zeros. [Zeros]
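
The 82.0% imbalance figure in the alerts above is consistent with a normalized-entropy score, that is, one minus the Shannon entropy of the class shares divided by its maximum. This is an assumption about how the figure is derived; the report itself does not state the formula. A minimal sketch, using the class counts 9728 and 272 that the report lists for both exceedance flags (the function name `imbalance_score` is hypothetical):

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits
    of the class shares and k is the number of classes.

    Assumed definition; 0 means perfectly balanced, 1 means one class only.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # convention chosen for this sketch
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts reported for 국가 기준초과 구분 and 지자체 기준초과 구분.
score = imbalance_score([9728, 272])
print(f"{score:.1%}")
```

With these counts the score rounds to the 82.0% shown in the alert, which supports the assumed definition without confirming it.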

Reproduction

Analysis started: 2024-07-27 00:34:42.084712
Analysis finished: 2024-07-27 00:34:55.564879
Duration: 13.48 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
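
The reported duration can be checked against the two timestamps. A small sketch using only the Python standard library; the format string is an assumption based on how the timestamps are printed above.

```python
from datetime import datetime

# Assumed layout of the timestamps as printed in the report.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-07-27 00:34:42.084712", FMT)
finished = datetime.strptime("2024-07-27 00:34:55.564879", FMT)

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```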

Variables

측정일시 (measurement date and time)
Real number (ℝ)

Distinct: 468
Distinct (%): 4.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.016011 × 10^9
Minimum: 2.0160101 × 10^9
Maximum: 2.016012 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2.0160101 × 10^9
5-th percentile: 2.0160101 × 10^9
Q1: 2.0160105 × 10^9
Median: 2.016011 × 10^9
Q3: 2.0160115 × 10^9
95-th percentile: 2.0160119 × 10^9
Maximum: 2.016012 × 10^9
Range: 1911
Interquartile range (IQR): 998

Descriptive statistics

Standard deviation: 566.52846
Coefficient of variation (CV): 2.8101456 × 10^-7
Kurtosis: -1.2108739
Mean: 2.016011 × 10^9
Median Absolute Deviation (MAD): 498.5
Skewness: 0.0049462973
Sum: 2.016011 × 10^13
Variance: 320954.5
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
2016010221 | 33 | 0.3%
2016010521 | 31 | 0.3%
2016011413 | 31 | 0.3%
2016011417 | 31 | 0.3%
2016011806 | 31 | 0.3%
2016010315 | 31 | 0.3%
2016011309 | 31 | 0.3%
2016010507 | 31 | 0.3%
2016011905 | 31 | 0.3%
2016010816 | 30 | 0.3%
Other values (458) | 9689 | 96.9%

Value | Count | Frequency (%)
2016010100 | 30 | 0.3%
2016010101 | 28 | 0.3%
2016010102 | 20 | 0.2%
2016010103 | 11 | 0.1%
2016010104 | 24 | 0.2%
2016010105 | 18 | 0.2%
2016010106 | 15 | 0.1%
2016010107 | 27 | 0.3%
2016010108 | 23 | 0.2%
2016010109 | 15 | 0.1%

Value | Count | Frequency (%)
2016012011 | 21 | 0.2%
2016012010 | 26 | 0.3%
2016012009 | 22 | 0.2%
2016012008 | 20 | 0.2%
2016012007 | 22 | 0.2%
2016012006 | 26 | 0.3%
2016012005 | 20 | 0.2%
2016012004 | 20 | 0.2%
2016012003 | 19 | 0.2%
2016012002 | 19 | 0.2%
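
The 측정일시 values look like integers in YYYYMMDDHH form (for example 2016010221), which is why the profiler treats the column as a real number and reports a mean and variance that carry little meaning on a time axis. The sketch below parses such values into timestamps with the Python standard library. The YYYYMMDDHH encoding with hours 00 to 23 is an assumption inferred from the values shown, and `parse_measurement_time` is a hypothetical helper name.

```python
from datetime import datetime

def parse_measurement_time(value: int) -> datetime:
    """Parse an integer such as 2016010221 as 2016-01-02 21:00.

    Assumes the YYYYMMDDHH layout with hours in the range 00-23.
    """
    return datetime.strptime(str(value), "%Y%m%d%H")

# Smallest and largest values listed in the frequency tables above.
first = parse_measurement_time(2016010100)
last = parse_measurement_time(2016012011)

span_hours = (last - first).total_seconds() / 3600
print(first, last, span_hours)
```

Under this reading the data span 467 hours, that is 468 hourly slots, which agrees with the 468 distinct values reported for the column.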

측정소 코드 (measuring station code)
Real number (ℝ)

Distinct: 25
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 113.0696
Minimum: 101
Maximum: 125
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 101
5-th percentile: 102
Q1: 107
Median: 113
Q3: 119
95-th percentile: 124
Maximum: 125
Range: 24
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 7.1947851
Coefficient of variation (CV): 0.063631472
Kurtosis: -1.1980857
Mean: 113.0696
Median Absolute Deviation (MAD): 6
Skewness: -0.028542672
Sum: 1130696
Variance: 51.764932
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=25)]
Value | Count | Frequency (%)
121 | 444 | 4.4%
105 | 441 | 4.4%
112 | 428 | 4.3%
116 | 427 | 4.3%
113 | 425 | 4.2%
119 | 409 | 4.1%
115 | 407 | 4.1%
102 | 406 | 4.1%
110 | 402 | 4.0%
118 | 402 | 4.0%
Other values (15) | 5809 | 58.1%

Value | Count | Frequency (%)
101 | 402 | 4.0%
102 | 406 | 4.1%
103 | 377 | 3.8%
104 | 401 | 4.0%
105 | 441 | 4.4%
106 | 375 | 3.8%
107 | 369 | 3.7%
108 | 342 | 3.4%
109 | 383 | 3.8%
110 | 402 | 4.0%

Value | Count | Frequency (%)
125 | 382 | 3.8%
124 | 395 | 4.0%
123 | 401 | 4.0%
122 | 400 | 4.0%
121 | 444 | 4.4%
120 | 401 | 4.0%
119 | 409 | 4.1%
118 | 402 | 4.0%
117 | 390 | 3.9%
116 | 427 | 4.3%

측정항목 (measurement item)
Real number (ℝ)

HIGH CORRELATION

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.3679
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 6
Q3: 8
95-th percentile: 9
Maximum: 9
Range: 8
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.746107
Coefficient of variation (CV): 0.51157939
Kurtosis: -1.2024288
Mean: 5.3679
Median Absolute Deviation (MAD): 3
Skewness: -0.2145918
Sum: 53679
Variance: 7.5411037
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=6)]
Value | Count | Frequency (%)
9 | 1717 | 17.2%
3 | 1674 | 16.7%
5 | 1670 | 16.7%
8 | 1665 | 16.7%
6 | 1652 | 16.5%
1 | 1622 | 16.2%

Value | Count | Frequency (%)
1 | 1622 | 16.2%
3 | 1674 | 16.7%
5 | 1670 | 16.7%
6 | 1652 | 16.5%
8 | 1665 | 16.7%
9 | 1717 | 17.2%

Value | Count | Frequency (%)
9 | 1717 | 17.2%
8 | 1665 | 16.7%
6 | 1652 | 16.5%
5 | 1670 | 16.7%
3 | 1674 | 16.7%
1 | 1622 | 16.2%

평균값 (average value)
Real number (ℝ)

HIGH CORRELATION

Distinct: 435
Distinct (%): 4.4%
Missing: 4
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 13.114523
Minimum: -221
Maximum: 217
Zeros: 44
Zeros (%): 0.4%
Negative: 6
Negative (%): 0.1%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -221
5-th percentile: 0.004
Q1: 0.011
Median: 0.18
Q3: 22
95-th percentile: 62
Maximum: 217
Range: 438
Interquartile range (IQR): 21.989

Descriptive statistics

Standard deviation: 23.415935
Coefficient of variation (CV): 1.7854965
Kurtosis: 8.2980509
Mean: 13.114523
Median Absolute Deviation (MAD): 0.178
Skewness: 2.0809158
Sum: 131092.77
Variance: 548.30601
Monotonicity: Not monotonic
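
Several of the 평균값 figures are derived from others in the same tables, so they can be cross-checked. The sketch below recomputes the range, IQR, variance, coefficient of variation, and sum from the reported values. It assumes the sum is taken over the 9996 non-missing observations (10000 rows minus 4 missing); the variable names are illustrative.

```python
# Values copied from the 평균값 statistics above.
mean, std = 13.114523, 23.415935
minimum, maximum = -221, 217
q1, q3 = 0.011, 22
n_observations, n_missing = 10_000, 4

value_range = maximum - minimum   # reported: 438
iqr = q3 - q1                     # reported: 21.989
variance = std ** 2               # reported: 548.30601
cv = std / mean                   # reported: 1.7854965

# Assumption: missing values are excluded from the sum.
total = mean * (n_observations - n_missing)   # reported: 131092.77

for name, value in [("range", value_range), ("IQR", iqr),
                    ("variance", variance), ("CV", cv), ("sum", total)]:
    print(f"{name}: {value:.5f}")
```

The recomputed sum only matches the reported one when the 4 missing values are excluded, which suggests the mean is also computed over non-missing values.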
[Figure: Histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
0.006 | 499 | 5.0%
0.005 | 499 | 5.0%
0.007 | 368 | 3.7%
0.004 | 236 | 2.4%
0.008 | 207 | 2.1%
0.003 | 169 | 1.7%
0.002 | 169 | 1.7%
0.009 | 102 | 1.0%
0.022 | 96 | 1.0%
0.02 | 93 | 0.9%
Other values (425) | 7558 | 75.6%

Value | Count | Frequency (%)
-221.0 | 1 | < 0.1%
-190.0 | 1 | < 0.1%
-67.0 | 1 | < 0.1%
-30.0 | 1 | < 0.1%
-16.0 | 1 | < 0.1%
-12.0 | 1 | < 0.1%
0.0 | 44 | 0.4%
0.001 | 62 | 0.6%
0.002 | 169 | 1.7%
0.003 | 169 | 1.7%

Value | Count | Frequency (%)
217.0 | 1 | < 0.1%
198.0 | 1 | < 0.1%
193.0 | 1 | < 0.1%
184.0 | 1 | < 0.1%
179.0 | 1 | < 0.1%
177.0 | 1 | < 0.1%
172.0 | 1 | < 0.1%
171.0 | 2 | < 0.1%
164.0 | 1 | < 0.1%
158.0 | 1 | < 0.1%

측정기 상태 (instrument status)
Real number (ℝ)

ZEROS

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.0528
Minimum: 0
Maximum: 9
Zeros: 9845
Zeros (%): 98.5%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 9
Range: 9
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.58944627
Coefficient of variation (CV): 11.163755
Kurtosis: 202.95223
Mean: 0.0528
Median Absolute Deviation (MAD): 0
Skewness: 13.923146
Sum: 528
Variance: 0.3474469
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=6)]
Value | Count | Frequency (%)
0 | 9845 | 98.5%
1 | 77 | 0.8%
9 | 37 | 0.4%
2 | 27 | 0.3%
4 | 12 | 0.1%
8 | 2 | < 0.1%

Value | Count | Frequency (%)
0 | 9845 | 98.5%
1 | 77 | 0.8%
2 | 27 | 0.3%
4 | 12 | 0.1%
8 | 2 | < 0.1%
9 | 37 | 0.4%

Value | Count | Frequency (%)
9 | 37 | 0.4%
8 | 2 | < 0.1%
4 | 12 | 0.1%
2 | 27 | 0.3%
1 | 77 | 0.8%
0 | 9845 | 98.5%
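
Because 측정기 상태 takes only six distinct values, its summary statistics can be recomputed directly from the frequency table. The sketch below does so in plain Python. It assumes the reported variance is the sample variance with an n - 1 denominator, which is the choice that reproduces the figure above; the variable names are illustrative.

```python
# Value -> count, copied from the frequency table above.
counts = {0: 9845, 1: 77, 2: 27, 4: 12, 8: 2, 9: 37}

n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n

# Assumption: sample variance (n - 1 denominator).
squared_dev = sum(count * (value - mean) ** 2 for value, count in counts.items())
variance = squared_dev / (n - 1)

zeros_pct = 100 * counts[0] / n

print(f"n={n}, sum={total}, mean={mean:.4f}")
print(f"variance={variance:.7f}, zeros={zeros_pct:.2f}%")
```

The counts add up to the 10000 observations, and the sum, mean, and variance agree with the reported 528, 0.0528, and 0.3474469.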

국가 기준초과 구분 (national standard exceedance flag)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9728 | 97.3%
1 | 272 | 2.7%

Length

[Figure: Histogram of lengths of the category]


지자체 기준초과 구분 (local-government standard exceedance flag)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9728 | 97.3%
1 | 272 | 2.7%

Length

[Figure: Histogram of lengths of the category]


Interactions

[Figures: 25 pairwise interaction plots]

Correlations

Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | 0.000 | 0.000 | 0.246 | 0.121 | 0.357 | 0.357
측정소 코드 | 0.000 | 1.000 | 0.000 | 0.046 | 0.112 | 0.058 | 0.058
측정항목 | 0.000 | 0.000 | 1.000 | 0.440 | 0.129 | 0.378 | 0.378
평균값 | 0.246 | 0.046 | 0.440 | 1.000 | 0.231 | 0.759 | 0.759
측정기 상태 | 0.121 | 0.112 | 0.129 | 0.231 | 1.000 | 0.195 | 0.195
국가 기준초과 구분 | 0.357 | 0.058 | 0.378 | 0.759 | 0.195 | 1.000 | 1.000
지자체 기준초과 구분 | 0.357 | 0.058 | 0.378 | 0.759 | 0.195 | 1.000 | 1.000

Variable | 지자체 기준초과 구분 | 국가 기준초과 구분
지자체 기준초과 구분 | 1.000 | 0.998
국가 기준초과 구분 | 0.998 | 1.000

Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | -0.002 | -0.004 | -0.022 | -0.037 | 0.272 | 0.272
측정소 코드 | -0.002 | 1.000 | -0.008 | -0.008 | 0.015 | 0.044 | 0.044
측정항목 | -0.004 | -0.008 | 1.000 | 0.723 | 0.043 | 0.272 | 0.272
평균값 | -0.022 | -0.008 | 0.723 | 1.000 | -0.043 | 0.584 | 0.584
측정기 상태 | -0.037 | 0.015 | 0.043 | -0.043 | 1.000 | 0.140 | 0.140
국가 기준초과 구분 | 0.272 | 0.044 | 0.272 | 0.584 | 0.140 | 1.000 | 0.998
지자체 기준초과 구분 | 0.272 | 0.044 | 0.272 | 0.584 | 0.140 | 0.998 | 1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

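Index | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
14478 | 2016010500 | 114 | 1 | 0.006 | 0 | 0 | 0
5622 | 2016010213 | 113 | 1 | 0.006 | 0 | 0 | 0
27969 | 2016010818 | 112 | 6 | 0.023 | 0 | 0 | 0
25039 | 2016010722 | 124 | 3 | 0.023 | 0 | 0 | 0
15521 | 2016010507 | 112 | 9 | 21.0 | 0 | 0 | 0
51674 | 2016011508 | 113 | 5 | 1.11 | 0 | 0 | 0
61431 | 2016011801 | 114 | 6 | 0.001 | 0 | 0 | 0
23056 | 2016010709 | 118 | 8 | 38.0 | 0 | 0 | 0
44474 | 2016011308 | 113 | 5 | 1.0 | 0 | 0 | 0
29440 | 2016010904 | 107 | 8 | 40.0 | 0 | 0 | 0

Index | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
12155 | 2016010409 | 101 | 9 | 77.0 | 0 | 1 | 1
31213 | 2016010916 | 103 | 3 | 0.025 | 0 | 0 | 0
1394 | 2016010109 | 108 | 5 | 0.97 | 0 | 0 | 0
14697 | 2016010501 | 125 | 6 | 0.004 | 0 | 0 | 0
62856 | 2016011811 | 102 | 1 | 0.007 | 0 | 0 | 0
66662 | 2016011912 | 111 | 5 | 0.36 | 0 | 0 | 0
920 | 2016010106 | 104 | 5 | 1.03 | 0 | 0 | 0
67707 | 2016011919 | 110 | 6 | 0.022 | 0 | 0 | 0
53322 | 2016011519 | 113 | 1 | 0.006 | 0 | 0 | 0
45639 | 2016011316 | 107 | 6 | 0.006 | 0 | 0 | 0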