Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 693.4 KiB
Average record size in memory: 71.0 B

Variable types

Numeric: 4
Categorical: 3

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15526/S/1/datasetView.do

Alerts

평균값 is highly overall correlated with 측정기 상태 (High correlation)
측정기 상태 is highly overall correlated with 평균값 (High correlation)
국가 기준초과 구분 is highly overall correlated with 지자체 기준초과 구분 (High correlation)
지자체 기준초과 구분 is highly overall correlated with 국가 기준초과 구분 (High correlation)
측정기 상태 is highly imbalanced (56.0%) (Imbalance)
국가 기준초과 구분 is highly imbalanced (97.1%) (Imbalance)
지자체 기준초과 구분 is highly imbalanced (95.5%) (Imbalance)
평균값 has 354 (3.5%) zeros (Zeros)
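
The imbalance percentages in these alerts are consistent with a score of one minus the normalized Shannon entropy of the class counts. The sketch below reproduces the three figures from the value counts listed for each categorical variable. The function name `imbalance_score` is illustrative, and the formula is inferred from the numbers in this report rather than quoted from the ydata-profiling source.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class shares and k is the number of classes."""
    total = sum(counts)
    shares = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in shares)
    k = len(counts)
    return 1.0 - entropy / math.log2(k) if k > 1 else 0.0

# Class counts copied from the report
reported = {
    "측정기 상태": [6140, 3786, 58, 14, 2],
    "국가 기준초과 구분": [9971, 29],
    "지자체 기준초과 구분": [9951, 49],
}

for name, counts in reported.items():
    print(f"{name}: {imbalance_score(counts):.1%}")
```

Run as written, this prints 56.0%, 97.1% and 95.5%, matching the alerts. A score near 100% means almost every row falls in a single class, which is why the two exceedance flags (29 and 49 positive rows out of 10000) rank as far more imbalanced than 측정기 상태.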

Reproduction

Analysis started: 2024-05-04 04:02:36.654940
Analysis finished: 2024-05-04 04:02:45.998351
Duration: 9.34 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

측정일시 (measurement date and time)
Real number (ℝ)

Distinct: 797
Distinct (%): 8.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.9980123 × 10^9
Minimum: 1.9980101 × 10^9
Maximum: 1.9980203 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1.9980101 × 10^9
5-th percentile: 1.9980102 × 10^9
Q1: 1.9980109 × 10^9
Median: 1.9980117 × 10^9
Q3: 1.9980126 × 10^9
95-th percentile: 1.9980201 × 10^9
Maximum: 1.9980203 × 10^9
Range: 10204
Interquartile range (IQR): 1695

Descriptive statistics

Standard deviation: 2509.5142
Coefficient of variation (CV): 1.2560053 × 10^-6
Kurtosis: 5.0361587
Mean: 1.9980123 × 10^9
Median Absolute Deviation (MAD): 815
Skewness: 2.4216159
Sum: 1.9980123 × 10^13
Variance: 6297661.3
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
Most frequent values:
Value | Count | Frequency (%)
1998020107 | 25 | 0.2%
1998011511 | 24 | 0.2%
1998012207 | 23 | 0.2%
1998020208 | 22 | 0.2%
1998012315 | 22 | 0.2%
1998020102 | 22 | 0.2%
1998020213 | 21 | 0.2%
1998012605 | 21 | 0.2%
1998011013 | 21 | 0.2%
1998010103 | 21 | 0.2%
Other values (787) | 9778 | 97.8%

Smallest values:
Value | Count | Frequency (%)
1998010100 | 9 | 0.1%
1998010101 | 11 | 0.1%
1998010102 | 6 | 0.1%
1998010103 | 21 | 0.2%
1998010104 | 11 | 0.1%
1998010105 | 9 | 0.1%
1998010106 | 17 | 0.2%
1998010107 | 10 | 0.1%
1998010108 | 9 | 0.1%
1998010109 | 11 | 0.1%

Largest values:
Value | Count | Frequency (%)
1998020304 | 6 | 0.1%
1998020303 | 12 | 0.1%
1998020302 | 17 | 0.2%
1998020301 | 14 | 0.1%
1998020300 | 19 | 0.2%
1998020223 | 20 | 0.2%
1998020222 | 8 | 0.1%
1998020221 | 13 | 0.1%
1998020220 | 13 | 0.1%
1998020219 | 17 | 0.2%
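
The values of 측정일시 look like timestamps packed into integers as YYYYMMDDHH (for example 1998020107). That would explain why the profiler typed the column as a real number, and why figures such as the mean or the range of 10204 carry little meaning. A minimal sketch of decoding them with the standard library follows. It assumes the layout holds for every row and that the hour field is always 00 to 23; `parse_measurement_time` is an illustrative name.

```python
from datetime import datetime

def parse_measurement_time(value: int) -> datetime:
    """Decode an integer such as 1998020107 into a datetime (assumed YYYYMMDDHH)."""
    return datetime.strptime(str(value), "%Y%m%d%H")

first = parse_measurement_time(1998010100)  # smallest value in the report
last = parse_measurement_time(1998020304)   # largest value in the report

print(first)         # 1998-01-01 00:00:00
print(last)          # 1998-02-03 04:00:00
print(last - first)  # 33 days, 4:00:00
```

Decoded this way, the sample spans about 33 days of hourly readings, which is easier to interpret than an integer range of 10204.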

측정소 코드 (measuring station code)
Real number (ℝ)

Distinct: 25
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 111.7732
Minimum: 101
Maximum: 125
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 101
5-th percentile: 101
Q1: 105
Median: 110
Q3: 119
95-th percentile: 124
Maximum: 125
Range: 24
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 7.4370083
Coefficient of variation (CV): 0.066536597
Kurtosis: -1.2997571
Mean: 111.7732
Median Absolute Deviation (MAD): 6
Skewness: 0.2490324
Sum: 1117732
Variance: 55.309093
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=25)]
Most frequent values:
Value | Count | Frequency (%)
102 | 549 | 5.5%
113 | 547 | 5.5%
117 | 543 | 5.4%
123 | 542 | 5.4%
122 | 533 | 5.3%
103 | 533 | 5.3%
108 | 530 | 5.3%
111 | 528 | 5.3%
124 | 528 | 5.3%
105 | 516 | 5.2%
Other values (15) | 4651 | 46.5%

Smallest values:
Value | Count | Frequency (%)
101 | 501 | 5.0%
102 | 549 | 5.5%
103 | 533 | 5.3%
104 | 512 | 5.1%
105 | 516 | 5.2%
106 | 489 | 4.9%
107 | 494 | 4.9%
108 | 530 | 5.3%
109 | 504 | 5.0%
110 | 494 | 4.9%

Largest values:
Value | Count | Frequency (%)
125 | 39 | 0.4%
124 | 528 | 5.3%
123 | 542 | 5.4%
122 | 533 | 5.3%
121 | 514 | 5.1%
120 | 33 | 0.3%
119 | 466 | 4.7%
118 | 31 | 0.3%
117 | 543 | 5.4%
116 | 477 | 4.8%

측정항목 (measurement item)
Real number (ℝ)

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.3447
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 6
Q3: 8
95-th percentile: 9
Maximum: 9
Range: 8
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.7715068
Coefficient of variation (CV): 0.51855236
Kurtosis: -1.228066
Mean: 5.3447
Median Absolute Deviation (MAD): 3
Skewness: -0.2111357
Sum: 53447
Variance: 7.68125
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=6)]
Most frequent values:
Value | Count | Frequency (%)
9 | 1714 | 17.1%
1 | 1699 | 17.0%
8 | 1685 | 16.9%
3 | 1648 | 16.5%
6 | 1628 | 16.3%
5 | 1626 | 16.3%

Smallest values:
Value | Count | Frequency (%)
1 | 1699 | 17.0%
3 | 1648 | 16.5%
5 | 1626 | 16.3%
6 | 1628 | 16.3%
8 | 1685 | 16.9%
9 | 1714 | 17.1%

Largest values:
Value | Count | Frequency (%)
9 | 1714 | 17.1%
8 | 1685 | 16.9%
6 | 1628 | 16.3%
5 | 1626 | 16.3%
3 | 1648 | 16.5%
1 | 1699 | 17.0%
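
Because 측정항목 takes only six distinct values, its frequency table is complete, and the descriptive statistics can be recomputed from the counts alone. The sketch below is a consistency check using only the standard library. The variable names are illustrative, and dividing by n - 1 is an assumption that turns out to match the reported variance.

```python
import math

# Value -> count, copied from the 측정항목 frequency table
counts = {1: 1699, 3: 1648, 5: 1626, 6: 1628, 8: 1685, 9: 1714}

n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n

# Sample variance: sum of squared deviations divided by n - 1
squared_dev = sum(count * (value - mean) ** 2 for value, count in counts.items())
variance = squared_dev / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

print(n, total, mean)
print(round(variance, 5), round(std, 7), round(cv, 8))
```

The recomputed sum (53447), mean (5.3447), variance (7.68125), standard deviation and coefficient of variation agree with the report to the displayed precision. The codes identify measurement items rather than quantities, so these statistics describe the encoding, not the measurements.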

평균값 (average value)
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 273
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -2377.847
Minimum: -9999
Maximum: 512
Zeros: 354
Zeros (%): 3.5%
Negative: 3517
Negative (%): 35.2%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -9999
5-th percentile: -9999
Q1: -999.9
Median: 0.008
Q3: 0.041
95-th percentile: 37
Maximum: 512
Range: 10511
Interquartile range (IQR): 999.941

Descriptive statistics

Standard deviation: 4231.0741
Coefficient of variation (CV): -1.7793718
Kurtosis: -0.44786086
Mean: -2377.847
Median Absolute Deviation (MAD): 1.092
Skewness: -1.2429895
Sum: -23778470
Variance: 17901988
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]
Most frequent values:
Value | Count | Frequency (%)
-9999.0 | 2353 | 23.5%
-9.999 | 877 | 8.8%
0.0 | 354 | 3.5%
-999.9 | 287 | 2.9%
0.005 | 232 | 2.3%
0.007 | 223 | 2.2%
0.006 | 190 | 1.9%
0.009 | 173 | 1.7%
0.004 | 171 | 1.7%
0.008 | 166 | 1.7%
Other values (263) | 4974 | 49.7%

Smallest values:
Value | Count | Frequency (%)
-9999.0 | 2353 | 23.5%
-999.9 | 287 | 2.9%
-9.999 | 877 | 8.8%
0.0 | 354 | 3.5%
0.001 | 62 | 0.6%
0.002 | 69 | 0.7%
0.003 | 136 | 1.4%
0.004 | 171 | 1.7%
0.005 | 232 | 2.3%
0.006 | 190 | 1.9%

Largest values:
Value | Count | Frequency (%)
512.0 | 1 | < 0.1%
240.0 | 1 | < 0.1%
228.0 | 1 | < 0.1%
198.0 | 1 | < 0.1%
195.0 | 1 | < 0.1%
184.0 | 2 | < 0.1%
183.0 | 1 | < 0.1%
182.0 | 1 | < 0.1%
178.0 | 1 | < 0.1%
175.0 | 1 | < 0.1%
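
The three negative values of 평균값 (-9999.0, -999.9 and -9.999) account for 2353 + 287 + 877 = 3517 rows, exactly the reported count of negative values, so they look like placeholder codes rather than measurements. That reading is an assumption; the report itself does not say so. Under that assumption, the sketch below masks them before computing a summary statistic. `SENTINELS` and `mask_sentinels` are illustrative names, and the inputs are 평균값 entries from the first table in the report's Sample section.

```python
from statistics import mean

# Assumed placeholder codes, inferred from the frequency table (not documented in the report)
SENTINELS = {-9999.0, -999.9, -9.999}

def mask_sentinels(values):
    """Replace assumed placeholder codes with None so they can be treated as missing."""
    return [None if v in SENTINELS else v for v in values]

# 평균값 entries from the first sample table of the report
sample = [-9999.0, 0.04, 0.3, -9999.0, -9999.0, 0.022, -9999.0, 1.9, 0.015, 0.011]

masked = mask_sentinels(sample)
valid = [v for v in masked if v is not None]

print(f"raw mean:    {mean(sample):.4f}")
print(f"masked mean: {mean(valid):.4f} over {len(valid)} of {len(sample)} rows")
```

The rows mix several measurement items, so the masked mean is only illustrative. The point is that the reported mean of -2377.847 and standard deviation of 4231.0741 are dominated by the placeholder codes, and that any correlation involving 평균값 may be as well.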

측정기 상태 (instrument status)
Categorical

Alerts: High correlation, Imbalance

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4
2nd row: 0
3rd row: 0
4th row: 4
5th row: 4

Common Values

Value | Count | Frequency (%)
0 | 6140 | 61.4%
4 | 3786 | 37.9%
2 | 58 | 0.6%
1 | 14 | 0.1%
9 | 2 | < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the value counts for 측정기 상태]

국가 기준초과 구분 (national standard exceedance flag)
Categorical

Alerts: High correlation, Imbalance

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9971 | 99.7%
1 | 29 | 0.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the value counts for 국가 기준초과 구분]

지자체 기준초과 구분 (local government standard exceedance flag)
Categorical

Alerts: High correlation, Imbalance

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9951 | 99.5%
1 | 49 | 0.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the value counts for 지자체 기준초과 구분]

Interactions

[Figures: 16 interaction plots, one for each ordered pair of the four numeric variables]

Correlations

[Figure: correlation heatmap, all seven variables]
Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | 0.260 | 0.000 | 0.082 | 0.061 | 0.033 | 0.009
측정소 코드 | 0.260 | 1.000 | 0.007 | 0.265 | 0.369 | 0.046 | 0.059
측정항목 | 0.000 | 0.007 | 1.000 | 0.311 | 0.447 | 0.107 | 0.174
평균값 | 0.082 | 0.265 | 0.311 | 1.000 | 0.337 | 0.000 | 0.003
측정기 상태 | 0.061 | 0.369 | 0.447 | 0.337 | 1.000 | 0.086 | 0.071
국가 기준초과 구분 | 0.033 | 0.046 | 0.107 | 0.000 | 0.086 | 1.000 | 0.910
지자체 기준초과 구분 | 0.009 | 0.059 | 0.174 | 0.003 | 0.071 | 0.910 | 1.000

[Figure: correlation heatmap, categorical variables only]
Variable | 지자체 기준초과 구분 | 국가 기준초과 구분 | 측정기 상태
지자체 기준초과 구분 | 1.000 | 0.729 | 0.086
국가 기준초과 구분 | 0.729 | 1.000 | 0.105
측정기 상태 | 0.086 | 0.105 | 1.000

[Figure: correlation heatmap, all seven variables]
Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | 0.028 | 0.005 | -0.031 | 0.049 | 0.022 | 0.006
측정소 코드 | 0.028 | 1.000 | -0.001 | 0.086 | 0.162 | 0.035 | 0.045
측정항목 | 0.005 | -0.001 | 1.000 | -0.394 | 0.323 | 0.077 | 0.125
평균값 | -0.031 | 0.086 | -0.394 | 1.000 | 0.542 | 0.029 | 0.040
측정기 상태 | 0.049 | 0.162 | 0.323 | 0.542 | 1.000 | 0.105 | 0.086
국가 기준초과 구분 | 0.022 | 0.035 | 0.077 | 0.029 | 0.105 | 1.000 | 0.729
지자체 기준초과 구분 | 0.006 | 0.045 | 0.125 | 0.040 | 0.086 | 0.729 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

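Row index | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
67601 | 1998012516 | 124 | 9 | -9999.0 | 4 | 0 | 0
78331 | 1998012915 | 103 | 3 | 0.04 | 0 | 0 | 0
41540 | 1998011604 | 108 | 5 | 0.3 | 0 | 0 | 0
10427 | 1998010419 | 109 | 9 | -9999.0 | 4 | 0 | 0
51779 | 1998011922 | 104 | 9 | -9999.0 | 4 | 0 | 0
7399 | 1998010316 | 123 | 3 | 0.022 | 0 | 0 | 0
90779 | 1998020215 | 119 | 9 | -9999.0 | 4 | 0 | 0
32384 | 1998011220 | 102 | 5 | 1.9 | 0 | 0 | 0
35544 | 1998011323 | 121 | 1 | 0.015 | 0 | 0 | 0
16549 | 1998010701 | 104 | 3 | 0.011 | 0 | 0 | 0

Row index | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
24251 | 1998010920 | 117 | 9 | -9999.0 | 4 | 0 | 0
35689 | 1998011401 | 102 | 3 | 0.04 | 0 | 0 | 0
73374 | 1998012719 | 116 | 1 | 0.01 | 0 | 0 | 0
65940 | 1998012502 | 109 | 1 | 0.007 | 0 | 0 | 0
7716 | 1998010319 | 117 | 1 | 0.007 | 0 | 0 | 0
63702 | 1998012406 | 121 | 1 | 0.007 | 0 | 0 | 0
10348 | 1998010418 | 119 | 8 | -9999.0 | 4 | 0 | 0
20948 | 1998010815 | 119 | 5 | 1.0 | 0 | 0 | 0
65094 | 1998012419 | 101 | 1 | -9.999 | 4 | 0 | 0
20162 | 1998010808 | 122 | 5 | 1.1 | 0 | 0 | 0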