Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 693.4 KiB
Average record size in memory: 71.0 B

Variable types

Numeric: 4
Categorical: 3

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15526/S/1/datasetView.do

Alerts

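지자체 기준초과 구분 (local government standard exceedance flag) has constant value "0" [Constant]
측정항목 (measurement item) is highly overall correlated with 평균값 (average value) and 1 other field [High correlation]
평균값 (average value) is highly overall correlated with 측정항목 (measurement item) and 1 other field [High correlation]
측정기 상태 (instrument status) is highly overall correlated with 측정항목 (measurement item) and 1 other field [High correlation]
국가 기준초과 구분 (national standard exceedance flag) is highly imbalanced (97.8%) [Imbalance]
평균값 (average value) has 960 (9.6%) zeros [Zeros]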

Reproduction

Analysis started: 2024-05-04 04:03:33.496757
Analysis finished: 2024-05-04 04:03:41.325137
Duration: 7.83 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

측정일시 (measurement date and time)
Real number (ℝ)

Distinct: 1845
Distinct (%): 18.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.9930198 × 10^9
Minimum: 1.9930101 × 10^9
Maximum: 1.9930319 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1.9930101 × 10^9
5-th percentile: 1.9930104 × 10^9
Q1: 1.993012 × 10^9
Median: 1.9930208 × 10^9
Q3: 1.9930227 × 10^9
95-th percentile: 1.9930315 × 10^9
Maximum: 1.9930319 × 10^9
Range: 21803
Interquartile range (IQR): 10706

Descriptive statistics

Standard deviation: 7572.3336
Coefficient of variation (CV): 3.7994272 × 10^-6
Kurtosis: -1.2966211
Mean: 1.9930198 × 10^9
Median Absolute Deviation (MAD): 8695
Skewness: 0.2518719
Sum: 1.9930198 × 10^13
Variance: 57340236
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value | Count | Frequency (%)
1993020223 | 13 | 0.1%
1993021215 | 12 | 0.1%
1993021312 | 12 | 0.1%
1993031406 | 12 | 0.1%
1993030203 | 12 | 0.1%
1993021800 | 12 | 0.1%
1993011203 | 12 | 0.1%
1993022513 | 12 | 0.1%
1993020107 | 11 | 0.1%
1993012223 | 11 | 0.1%
Other values (1835) | 9881 | 98.8%

Smallest values
Value | Count | Frequency (%)
1993010100 | 7 | 0.1%
1993010101 | 6 | 0.1%
1993010102 | 4 | < 0.1%
1993010103 | 5 | 0.1%
1993010104 | 5 | 0.1%
1993010105 | 4 | < 0.1%
1993010106 | 11 | 0.1%
1993010107 | 3 | < 0.1%
1993010108 | 8 | 0.1%
1993010109 | 3 | < 0.1%

Largest values
Value | Count | Frequency (%)
1993031903 | 6 | 0.1%
1993031902 | 7 | 0.1%
1993031901 | 3 | < 0.1%
1993031900 | 5 | 0.1%
1993031823 | 7 | 0.1%
1993031822 | 2 | < 0.1%
1993031821 | 3 | < 0.1%
1993031820 | 7 | 0.1%
1993031819 | 3 | < 0.1%
1993031818 | 2 | < 0.1%
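
The 측정일시 values look like timestamps packed into integers as YYYYMMDDHH (for example 1993020223), which would explain why the profiler typed the column as a real number and why statistics such as the mean and variance carry little meaning here. The sketch below assumes that layout; the helper name `parse_packed_hour` is hypothetical and not part of the report or of ydata-profiling.

```python
from datetime import datetime


def parse_packed_hour(value: int) -> datetime:
    """Parse an integer assumed to be packed as YYYYMMDDHH (an assumption, not confirmed by the source)."""
    return datetime.strptime(str(value), "%Y%m%d%H")


# Smallest and largest values listed in the report.
earliest = parse_packed_hour(1993010100)
latest = parse_packed_hour(1993031903)

# Time covered by the sample, and the number of hourly slots in that window.
span = latest - earliest
hourly_slots = int(span.total_seconds() // 3600) + 1

print(earliest, latest, span, hourly_slots)
```

Under this reading the sample covers 77 days and 3 hours, or 1852 hourly slots, which is in line with the 1845 distinct values reported for the column.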

측정소 코드 (station code)
Real number (ℝ)

Distinct: 9
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 112.7578
Minimum: 103
Maximum: 124
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 103
5-th percentile: 103
Q1: 107
Median: 113
Q3: 117
95-th percentile: 124
Maximum: 124
Range: 21
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 7.1142814
Coefficient of variation (CV): 0.063093475
Kurtosis: -1.3281624
Mean: 112.7578
Median Absolute Deviation (MAD): 6
Skewness: 0.18968437
Sum: 1127578
Variance: 50.613
Monotonicity: Not monotonic
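
Several of these figures are derived from the others, so they can be cross-checked by hand. The following sketch recomputes the coefficient of variation, variance, range, and IQR from the values printed for 측정소 코드; it uses the standard textbook definitions, which are assumed (not confirmed by the report) to be what the profiler applies.

```python
# Figures copied from the report for 측정소 코드.
mean = 112.7578
std = 7.1142814
q1, q3 = 107, 117
minimum, maximum = 103, 124

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV={cv:.6f} variance={variance:.3f} IQR={iqr} range={value_range}")
```

The recomputed values agree with the reported CV of 0.063093475, variance of 50.613, IQR of 10, and range of 21 up to rounding.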
Histogram with fixed size bins (bins=9)
Most frequent values
Value | Count | Frequency (%)
103 | 1145 | 11.5%
116 | 1140 | 11.4%
124 | 1135 | 11.3%
105 | 1129 | 11.3%
108 | 1103 | 11.0%
113 | 1103 | 11.0%
122 | 1096 | 11.0%
107 | 1079 | 10.8%
117 | 1070 | 10.7%

Smallest values
Value | Count | Frequency (%)
103 | 1145 | 11.5%
105 | 1129 | 11.3%
107 | 1079 | 10.8%
108 | 1103 | 11.0%
113 | 1103 | 11.0%
116 | 1140 | 11.4%
117 | 1070 | 10.7%
122 | 1096 | 11.0%
124 | 1135 | 11.3%

Largest values
Value | Count | Frequency (%)
124 | 1135 | 11.3%
122 | 1096 | 11.0%
117 | 1070 | 10.7%
116 | 1140 | 11.4%
113 | 1103 | 11.0%
108 | 1103 | 11.0%
107 | 1079 | 10.8%
105 | 1129 | 11.3%
103 | 1145 | 11.5%

측정항목 (measurement item)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.3158
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 5
Q3: 8
95-th percentile: 9
Maximum: 9
Range: 8
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.7484224
Coefficient of variation (CV): 0.51702893
Kurtosis: -1.2097316
Mean: 5.3158
Median Absolute Deviation (MAD): 2
Skewness: -0.19933983
Sum: 53158
Variance: 7.5538257
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Most frequent values
Value | Count | Frequency (%)
1 | 1684 | 16.8%
6 | 1676 | 16.8%
3 | 1669 | 16.7%
5 | 1666 | 16.7%
8 | 1664 | 16.6%
9 | 1641 | 16.4%

Smallest values
Value | Count | Frequency (%)
1 | 1684 | 16.8%
3 | 1669 | 16.7%
5 | 1666 | 16.7%
6 | 1676 | 16.8%
8 | 1664 | 16.6%
9 | 1641 | 16.4%

Largest values
Value | Count | Frequency (%)
9 | 1641 | 16.4%
8 | 1664 | 16.6%
6 | 1676 | 16.8%
5 | 1666 | 16.7%
3 | 1669 | 16.7%
1 | 1684 | 16.8%

평균값 (average value)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 263
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -2498.5532
Minimum: -9999
Maximum: 21.1
Zeros: 960
Zeros (%): 9.6%
Negative: 2535
Negative (%): 25.4%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -9999
5-th percentile: -9999
Q1: -999.9
Median: 0.0135
Q3: 0.046
95-th percentile: 2.1
Maximum: 21.1
Range: 10020.1
Interquartile range (IQR): 999.946

Descriptive statistics

Standard deviation: 4328.4188
Coefficient of variation (CV): -1.7323701
Kurtosis: -0.66373635
Mean: -2498.5532
Median Absolute Deviation (MAD): 0.0325
Skewness: -1.1559164
Sum: -24985532
Variance: 18735209
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value | Count | Frequency (%)
-9999.0 | 2498 | 25.0%
0.0 | 960 | 9.6%
0.001 | 194 | 1.9%
0.002 | 130 | 1.3%
0.1 | 126 | 1.3%
0.012 | 121 | 1.2%
0.01 | 121 | 1.2%
0.015 | 119 | 1.2%
0.011 | 118 | 1.2%
0.023 | 113 | 1.1%
Other values (253) | 5500 | 55.0%

Smallest values
Value | Count | Frequency (%)
-9999.0 | 2498 | 25.0%
-999.9 | 3 | < 0.1%
-999.4 | 1 | < 0.1%
-999.3 | 1 | < 0.1%
-999.2 | 2 | < 0.1%
-999.1 | 2 | < 0.1%
-998.7 | 1 | < 0.1%
-998.6 | 1 | < 0.1%
-9.999 | 2 | < 0.1%
-9.995 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
21.1 | 1 | < 0.1%
12.0 | 1 | < 0.1%
11.8 | 1 | < 0.1%
9.5 | 1 | < 0.1%
9.3 | 1 | < 0.1%
9.0 | 2 | < 0.1%
8.6 | 1 | < 0.1%
8.0 | 2 | < 0.1%
7.9 | 1 | < 0.1%
7.7 | 1 | < 0.1%
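
The value -9999.0 accounts for 2498 of the 10000 observations and pulls the mean down to -2498.5532 while the median stays at 0.0135, which suggests it is a placeholder rather than a real reading. The sketch below treats -9999.0 as a sentinel and 측정기 상태 = 0 as a normal instrument status; both are assumptions that the report does not confirm. The rows are the (평균값, 측정기 상태) pairs from the first ten sample rows of the report.

```python
from statistics import mean

SENTINEL = -9999.0  # assumption: placeholder for "no valid reading"
NORMAL_STATUS = 0   # assumption: status code for a normally operating instrument

# (평균값, 측정기 상태) pairs copied from the report's sample rows.
rows = [
    (0.023, 0), (0.007, 0), (0.011, 0), (1.4, 0), (0.011, 0),
    (-9999.0, 4), (0.084, 0), (-9999.0, 4), (0.056, 0), (0.4, 0),
]

# Keep only readings that are neither the sentinel nor flagged by the status code.
valid = [value for value, status in rows
         if value != SENTINEL and status == NORMAL_STATUS]

raw_mean = mean(value for value, _ in rows)
clean_mean = mean(valid)

print(f"raw mean={raw_mean:.4f} clean mean={clean_mean:.3f} kept={len(valid)}/{len(rows)}")
```

On these ten rows the two sentinel entries alone move the mean from roughly 0.249 to below -1999, the same distortion visible in the full-column statistics. The other negative values in the column (such as -999.9 and -9.999) may need similar treatment, but their meaning is not stated in the report.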

측정기 상태 (instrument status)
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 6556 | 65.6%
4 | 3318 | 33.2%
2 | 104 | 1.0%
1 | 22 | 0.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the value counts 0: 6556, 4: 3318, 2: 104, 1: 22]

국가 기준초과 구분 (national standard exceedance flag)
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9979 | 99.8%
1 | 21 | 0.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the value counts 0: 9979, 1: 21]
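
The 97.8% imbalance figure reported for this variable is consistent with a score defined as one minus the Shannon entropy of the class shares, normalised by the maximum entropy for that number of classes. The sketch below assumes that definition; the function name `imbalance_score` is hypothetical and the code is not taken from ydata-profiling.

```python
import math


def imbalance_score(counts):
    """Return 1 - H/H_max for a list of class counts (0 = balanced, 1 = one class only)."""
    if len(counts) < 2:
        return 0.0  # assumption: a single class is scored as 0 in this sketch
    total = sum(counts)
    shares = [count / total for count in counts if count > 0]
    entropy = -sum(p * math.log2(p) for p in shares)
    return 1.0 - entropy / math.log2(len(counts))


# Class counts from the Common Values table: 9979 zeros and 21 ones.
score = imbalance_score([9979, 21])
print(f"{score:.1%}")  # prints 97.8%
```

A perfectly balanced two-class column would score 0 under this definition, so a value close to 1 indicates that almost all rows fall into a single class.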

지자체 기준초과 구분 (local government standard exceedance flag)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 10000 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the value counts 0: 10000]

Interactions

[16 interaction plots; images not included]

Correlations

Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분
측정일시 | 1.000 | 0.000 | 0.000 | 0.026 | 0.081 | 0.024
측정소 코드 | 0.000 | 1.000 | 0.000 | 0.028 | 0.099 | 0.066
측정항목 | 0.000 | 0.000 | 1.000 | 0.056 | 0.740 | 0.138
평균값 | 0.026 | 0.028 | 0.056 | 1.000 | 0.352 | 0.000
측정기 상태 | 0.081 | 0.099 | 0.740 | 0.352 | 1.000 | 0.043
국가 기준초과 구분 | 0.024 | 0.066 | 0.138 | 0.000 | 0.043 | 1.000

Variable | 국가 기준초과 구분 | 측정기 상태
국가 기준초과 구분 | 1.000 | 0.028
측정기 상태 | 0.028 | 1.000

Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분
측정일시 | 1.000 | 0.017 | 0.006 | -0.023 | 0.066 | 0.029
측정소 코드 | 0.017 | 1.000 | -0.007 | -0.015 | 0.064 | 0.047
측정항목 | 0.006 | -0.007 | 1.000 | -0.752 | 0.576 | 0.099
평균값 | -0.023 | -0.015 | -0.752 | 1.000 | 0.602 | 0.022
측정기 상태 | 0.066 | 0.064 | 0.576 | 0.602 | 1.000 | 0.028
국가 기준초과 구분 | 0.029 | 0.047 | 0.099 | 0.022 | 0.028 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

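(index) | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
78762 | 1993030218 | 116 | 1 | 0.023 | 0 | 0 | 0
40779 | 1993020111 | 105 | 6 | 0.007 | 0 | 0 | 0
19173 | 1993011519 | 103 | 6 | 0.011 | 0 | 0 | 0
91700 | 1993031218 | 105 | 5 | 1.4 | 0 | 0 | 0
43779 | 1993020318 | 117 | 6 | 0.011 | 0 | 0 | 0
54886 | 1993021208 | 108 | 8 | -9999.0 | 4 | 0 | 0
32316 | 1993012522 | 113 | 1 | 0.084 | 0 | 0 | 0
51227 | 1993020912 | 116 | 9 | -9999.0 | 4 | 0 | 0
81715 | 1993030501 | 107 | 3 | 0.056 | 0 | 0 | 0
65954 | 1993022021 | 108 | 5 | 0.4 | 0 | 0 | 0

(index) | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
28494 | 1993012223 | 117 | 1 | 0.067 | 0 | 0 | 0
19169 | 1993011518 | 124 | 9 | -9999.0 | 4 | 0 | 0
38665 | 1993013020 | 103 | 3 | 0.046 | 0 | 0 | 0
44907 | 1993020415 | 116 | 6 | 0.01 | 0 | 0 | 0
96996 | 1993031620 | 107 | 1 | 0.016 | 0 | 0 | 0
23597 | 1993011904 | 124 | 9 | -9999.0 | 4 | 0 | 0
92811 | 1993031314 | 117 | 6 | 0.02 | 0 | 0 | 0
94643 | 1993031500 | 116 | 9 | -9999.0 | 4 | 0 | 0
46942 | 1993020605 | 107 | 8 | -9999.0 | 4 | 0 | 0
56980 | 1993021323 | 105 | 8 | 0.0 | 4 | 0 | 0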