Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 693.4 KiB
Average record size in memory: 71.0 B

Variable types

Numeric: 4
Categorical: 3

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15526/S/1/datasetView.do

Alerts

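측정항목 (measurement item) is highly overall correlated with 평균값 (average value) and 1 other field [High correlation]
평균값 (average value) is highly overall correlated with 측정항목 (measurement item) and 1 other field [High correlation]
측정기 상태 (instrument status) is highly overall correlated with 측정항목 (measurement item) and 1 other field [High correlation]
측정기 상태 (instrument status) is highly imbalanced (55.2%) [Imbalance]
국가 기준초과 구분 (national standard exceedance flag) is highly imbalanced (99.5%) [Imbalance]
지자체 기준초과 구분 (local government standard exceedance flag) is highly imbalanced (99.5%) [Imbalance]
평균값 (average value) has 1429 (14.3%) zeros [Zeros]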
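
The imbalance percentages in the alerts above are consistent with one minus the normalized Shannon entropy of the category counts. That formula is an assumption about how the score is derived, not something the report states. The sketch below applies it to the category counts listed later in the report; the function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the
    class proportions and k is the number of classes.

    Assumed formula: 0.0 means perfectly balanced classes, values
    near 1.0 mean one class dominates.
    """
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)

# Category counts taken from this report.
status_counts = [6531, 3312, 110, 41, 6]   # 측정기 상태
exceed_counts = [9996, 4]                  # 국가 / 지자체 기준초과 구분

print(f"{imbalance_score(status_counts):.1%}")
print(f"{imbalance_score(exceed_counts):.1%}")
```

Under this assumption the two scores round to the 55.2% and 99.5% shown in the alerts.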

Reproduction

Analysis started: 2024-05-11 07:00:55.001612
Analysis finished: 2024-05-11 07:00:59.675465
Duration: 4.67 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

측정일시 (measurement date/time)
Real number (ℝ)

Distinct: 1847
Distinct (%): 18.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.9940196 × 10^9
Minimum: 1.9940101 × 10^9
Maximum: 1.9940319 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1.9940101 × 10^9
5-th percentile: 1.9940104 × 10^9
Q1: 1.9940119 × 10^9
Median: 1.9940207 × 10^9
Q3: 1.9940227 × 10^9
95-th percentile: 1.9940315 × 10^9
Maximum: 1.9940319 × 10^9
Range: 21803
Interquartile range (IQR): 10798

Descriptive statistics

Standard deviation: 7608.6564
Coefficient of variation (CV): 3.815738 × 10^-6
Kurtosis: -1.2980413
Mean: 1.9940196 × 10^9
Median Absolute Deviation (MAD): 8618
Skewness: 0.28148177
Sum: 1.9940196 × 10^13
Variance: 57891653
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common Values

Value                  Count   Frequency (%)
1994020120             13      0.1%
1994020302             13      0.1%
1994021020             12      0.1%
1994011202             12      0.1%
1994021208             12      0.1%
1994012920             12      0.1%
1994011204             12      0.1%
1994011321             12      0.1%
1994010812             12      0.1%
1994012202             12      0.1%
Other values (1837)    9878    98.8%

Smallest values (ascending)

Value                  Count   Frequency (%)
1994010100             7       0.1%
1994010101             3       < 0.1%
1994010102             6       0.1%
1994010103             8       0.1%
1994010104             7       0.1%
1994010105             4       < 0.1%
1994010106             3       < 0.1%
1994010107             7       0.1%
1994010108             6       0.1%
1994010109             7       0.1%

Largest values (descending)

Value                  Count   Frequency (%)
1994031903             5       0.1%
1994031902             4       < 0.1%
1994031901             5       0.1%
1994031900             4       < 0.1%
1994031823             7       0.1%
1994031822             7       0.1%
1994031821             5       0.1%
1994031820             5       0.1%
1994031819             10      0.1%
1994031817             5       0.1%
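
The 측정일시 values are profiled as plain real numbers, but they look like timestamps packed as YYYYMMDDHH (for example 1994010100 through 1994031903). Assuming that encoding holds for every row, a minimal sketch for converting them to `datetime` objects could look like this; `parse_measurement_time` is a hypothetical helper name.

```python
from datetime import datetime

def parse_measurement_time(value: int) -> datetime:
    """Parse an integer such as 1994010100 into a datetime.

    Assumes the layout YYYYMMDDHH with hours in the range 00-23;
    strptime raises ValueError for anything that does not fit.
    """
    return datetime.strptime(str(value), "%Y%m%d%H")

first = parse_measurement_time(1994010100)
last = parse_measurement_time(1994031903)
print(first.isoformat(), last.isoformat())
```

Treating the column as a datetime rather than a number makes statistics such as the range and standard deviation meaningful as durations instead of differences between packed integers.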

측정소 코드 (station code)
Real number (ℝ)

Distinct: 9
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 112.8259
Minimum: 103
Maximum: 124
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 103
5-th percentile: 103
Q1: 107
Median: 113
Q3: 117
95-th percentile: 124
Maximum: 124
Range: 21
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 7.1078859
Coefficient of variation (CV): 0.062998707
Kurtosis: -1.3319684
Mean: 112.8259
Median Absolute Deviation (MAD): 6
Skewness: 0.18077134
Sum: 1128259
Variance: 50.522041
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=9)]

Common Values

Value   Count   Frequency (%)
116     1153    11.5%
124     1138    11.4%
122     1133    11.3%
105     1130    11.3%
113     1124    11.2%
108     1117    11.2%
103     1105    11.1%
107     1064    10.6%
117     1036    10.4%

Smallest values (ascending)

Value   Count   Frequency (%)
103     1105    11.1%
105     1130    11.3%
107     1064    10.6%
108     1117    11.2%
113     1124    11.2%
116     1153    11.5%
117     1036    10.4%
122     1133    11.3%
124     1138    11.4%

Largest values (descending)

Value   Count   Frequency (%)
124     1138    11.4%
122     1133    11.3%
117     1036    10.4%
116     1153    11.5%
113     1124    11.2%
108     1117    11.2%
107     1064    10.6%
105     1130    11.3%
103     1105    11.1%
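
Because 측정소 코드 takes only nine distinct values, its descriptive statistics can be recomputed from the frequency table alone. The sketch below does this with weighted sums. It assumes the reported variance is the sample variance (n - 1 in the denominator), and `summarize_counts` is a hypothetical helper name.

```python
import math

# Value -> count, copied from the frequency table for 측정소 코드.
station_counts = {
    103: 1105, 105: 1130, 107: 1064, 108: 1117, 113: 1124,
    116: 1153, 117: 1036, 122: 1133, 124: 1138,
}

def summarize_counts(counts):
    """Compute n, sum, mean, sample variance, std and CV from value counts."""
    n = sum(counts.values())
    total = sum(value * count for value, count in counts.items())
    mean = total / n
    # Assumption: sample variance with n - 1 in the denominator.
    variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
    std = math.sqrt(variance)
    return {"n": n, "sum": total, "mean": mean,
            "variance": variance, "std": std, "cv": std / mean}

stats = summarize_counts(station_counts)
for name, value in stats.items():
    print(name, value)
```

The results can be compared against the Sum, Mean, Variance, Standard deviation and CV figures reported for this variable. Keep in mind that the station code is an identifier, so these moments describe how rows are spread across codes rather than any physical quantity.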

측정항목 (measurement item)
Real number (ℝ)

HIGH CORRELATION

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.2781
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 5
Q3: 8
95-th percentile: 9
Maximum: 9
Range: 8
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.760312
Coefficient of variation (CV): 0.52297455
Kurtosis: -1.2280978
Mean: 5.2781
Median Absolute Deviation (MAD): 3
Skewness: -0.1793247
Sum: 52781
Variance: 7.6193223
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=6)]

Common Values

Value   Count   Frequency (%)
1       1731    17.3%
5       1695    17.0%
3       1686    16.9%
8       1657    16.6%
9       1625    16.2%
6       1606    16.1%

Smallest values (ascending)

Value   Count   Frequency (%)
1       1731    17.3%
3       1686    16.9%
5       1695    17.0%
6       1606    16.1%
8       1657    16.6%
9       1625    16.2%

Largest values (descending)

Value   Count   Frequency (%)
9       1625    16.2%
8       1657    16.6%
6       1606    16.1%
5       1695    17.0%
3       1686    16.9%
1       1731    17.3%

평균값 (average value)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 198
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -1914.8286
Minimum: -9999
Maximum: 11.8
Zeros: 1429
Zeros (%): 14.3%
Negative: 2014
Negative (%): 20.1%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -9999
5-th percentile: -9999
Q1: 0
Median: 0.013
Q3: 0.043
95-th percentile: 1.7
Maximum: 11.8
Range: 10010.8
Interquartile range (IQR): 0.043

Descriptive statistics

Standard deviation: 3921.8488
Coefficient of variation (CV): -2.0481462
Kurtosis: 0.48277167
Mean: -1914.8286
Median Absolute Deviation (MAD): 0.022
Skewness: -1.5740346
Sum: -19148286
Variance: 15380898
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common Values

Value                 Count   Frequency (%)
-9999.0               1903    19.0%
0.0                   1429    14.3%
0.002                 205     2.1%
0.001                 203     2.0%
0.003                 136     1.4%
0.011                 126     1.3%
0.019                 124     1.2%
0.01                  123     1.2%
0.012                 121     1.2%
0.7                   117     1.2%
Other values (188)    5513    55.1%

Smallest values (ascending)

Value                 Count   Frequency (%)
-9999.0               1903    19.0%
-3276.8               10      0.1%
-999.9                4       < 0.1%
-999.8                66      0.7%
-999.7                13      0.1%
-999.6                7       0.1%
-9.999                11      0.1%
0.0                   1429    14.3%
0.001                 203     2.0%
0.002                 205     2.1%

Largest values (descending)

Value                 Count   Frequency (%)
11.8                  1       < 0.1%
9.6                   1       < 0.1%
8.1                   1       < 0.1%
7.9                   1       < 0.1%
7.7                   1       < 0.1%
7.4                   1       < 0.1%
7.0                   1       < 0.1%
6.9                   1       < 0.1%
6.8                   1       < 0.1%
6.6                   1       < 0.1%
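
The seven negative values listed above (-9999.0, -3276.8, -999.9 through -999.6, and -9.999) add up to 2014 rows, which is the full Negative count for 평균값. They look like placeholder codes rather than measurements, which would explain why the mean (-1914.8286) sits far from the median (0.013). The report does not say what these codes mean, so treating every negative value as missing is an assumption. Under that assumption, a minimal sketch for recomputing summary statistics follows; the helper names and the sample list are illustrative, not taken from the dataset.

```python
from statistics import mean, median

def mask_negative(values):
    """Replace negative readings with None (assumed to be missing-value codes)."""
    return [v if v >= 0 else None for v in values]

def summarize_valid(values):
    """Summarize only the readings that survive the negative-value mask."""
    valid = [v for v in mask_negative(values) if v is not None]
    return {"n_valid": len(valid), "mean": mean(valid), "median": median(valid)}

# Illustrative values only, shaped like entries in the 평균값 column.
sample = [-9999.0, 0.0, 0.002, 0.9, -999.8, 2.5]
summary = summarize_valid(sample)
print(summary)
```

Whether zeros should also be treated as missing is a separate question: the correlation between 평균값 and 측정기 상태 suggests checking the status code before deciding.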

측정기 상태 (instrument status)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 4
4th row: 0
5th row: 0

Common Values

Value   Count   Frequency (%)
0       6531    65.3%
4       3312    33.1%
2       110     1.1%
1       41      0.4%
9       6       0.1%

Length

Histogram of lengths of the category


국가 기준초과 구분 (national standard exceedance flag)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value   Count   Frequency (%)
0       9996    > 99.9%
1       4       < 0.1%

Length

Histogram of lengths of the category


지자체 기준초과 구분 (local government standard exceedance flag)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value   Count   Frequency (%)
0       9996    > 99.9%
1       4       < 0.1%

Length

Histogram of lengths of the category


Interactions

[16 interaction plots (Matplotlib SVG images) are not reproduced here.]

Correlations

Correlation matrix 1

                        측정일시   측정소 코드   측정항목   평균값   측정기 상태   국가 기준초과 구분   지자체 기준초과 구분
측정일시                  1.000   0.000   0.021   0.073   0.082   0.000   0.000
측정소 코드               0.000   1.000   0.018   0.285   0.101   0.000   0.021
측정항목                  0.021   0.018   1.000   0.199   0.644   0.000   0.034
평균값                    0.073   0.285   0.199   1.000   0.616   0.000   0.000
측정기 상태               0.082   0.101   0.644   0.616   1.000   0.190   0.126
국가 기준초과 구분        0.000   0.000   0.000   0.000   0.190   1.000   0.555
지자체 기준초과 구분      0.000   0.021   0.034   0.000   0.126   0.555   1.000

Correlation matrix 2

                        측정기 상태   국가 기준초과 구분   지자체 기준초과 구분
측정기 상태               1.000   0.233   0.154
국가 기준초과 구분        0.233   1.000   0.375
지자체 기준초과 구분      0.154   0.375   1.000

Correlation matrix 3

                        측정일시   측정소 코드   측정항목   평균값   측정기 상태   국가 기준초과 구분   지자체 기준초과 구분
측정일시                  1.000   0.010   0.024   -0.015  0.031   0.000   0.000
측정소 코드               0.010   1.000   -0.009  0.031   0.068   0.000   0.015
측정항목                  0.024   -0.009  1.000   -0.724  0.504   0.000   0.024
평균값                    -0.015  0.031   -0.724  1.000   0.614   0.000   0.000
측정기 상태               0.031   0.068   0.504   0.614   1.000   0.233   0.154
국가 기준초과 구분        0.000   0.000   0.000   0.000   0.233   1.000   0.375
지자체 기준초과 구분      0.000   0.015   0.024   0.000   0.154   0.375   1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index)   측정일시     측정소 코드   측정항목   평균값    측정기 상태   국가 기준초과 구분   지자체 기준초과 구분
15218     1994011217   122   5   0.9       0   0   0
43215     1994020308   107   6   0.002     0   0   0
91660     1994031217   108   8   0.0       4   0   0
89385     1994031023   107   6   0.021     0   0   0
11971     1994011005   117   3   0.039     0   0   0
17215     1994011406   122   3   0.046     0   0   0
86863     1994030900   116   3   0.017     0   0   0
63818     1994021905   122   5   2.5       0   0   0
23060     1994011819   103   5   1.0       0   0   0
29075     1994012310   108   9   -9999.0   4   0   0

(index)   측정일시     측정소 코드   측정항목   평균값    측정기 상태   국가 기준초과 구분   지자체 기준초과 구분
91548     1994031215   108   1   0.007     0   0   0
21097     1994011706   117   3   0.032     0   0   0
11235     1994010916   103   6   0.011     0   0   0
10958     1994010910   124   5   2.0       0   0   0
84987     1994030713   122   6   0.035     0   0   0
3154      1994010310   108   8   0.0       4   0   0
57248     1994021404   105   5   1.8       0   0   0
86065     1994030809   122   3   0.039     0   0   0
14739     1994011208   124   6   0.001     0   0   0
67300     1994022122   107   8   0.0       4   0   0