Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 693.4 KiB
Average record size in memory: 71.0 B
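
The average record size follows from the total memory footprint divided by the number of observations. A minimal sketch of that arithmetic, using only the figures reported above (the variable names are illustrative):

```python
# Figures copied from the dataset statistics above.
total_size_kib = 693.4      # "Total size in memory"
n_observations = 10_000     # "Number of observations"

# 1 KiB = 1024 bytes, so convert before dividing.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B")  # prints "71.0 B"
```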

Variable types

Numeric: 5
Categorical: 2

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15526/S/1/datasetView.do

Alerts

측정항목 (measurement item) is highly overall correlated with 평균값 (average value) [High correlation]
평균값 is highly overall correlated with 측정항목 and 1 other field [High correlation]
측정기 상태 (instrument status) is highly overall correlated with 평균값 [High correlation]
국가 기준초과 구분 (national standard exceedance flag) is highly imbalanced (88.6%) [Imbalance]
지자체 기준초과 구분 (local government standard exceedance flag) is highly imbalanced (97.3%) [Imbalance]
평균값 has 303 (3.0%) zeros [Zeros]
측정기 상태 has 4023 (40.2%) zeros [Zeros]
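
The two imbalance percentages above are consistent with a normalized-entropy score, 1 - H / log2(k), where H is the Shannon entropy of the class frequencies and k is the number of classes. The sketch below assumes that definition; the function name `imbalance_score` is illustrative and not taken from the report. It uses the class counts listed later in this report for the two flag columns.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of class counts.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    k = len(counts)
    if k < 2:
        raise ValueError("need at least two classes")
    total = sum(counts)
    # Shannon entropy in bits; empty classes contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

national = imbalance_score([9848, 152])  # 국가 기준초과 구분
local = imbalance_score([9973, 27])      # 지자체 기준초과 구분

print(f"{national:.1%}")  # 88.6%
print(f"{local:.1%}")     # 97.3%
```

Under this definition a 50/50 split scores 0, so a flag column where about 1 row in 66 is positive already scores close to 0.89.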

Reproduction

Analysis started: 2024-05-04 04:04:12.400280
Analysis finished: 2024-05-04 04:04:22.747800
Duration: 10.35 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

측정일시 (measurement date and time)
Real number (ℝ)

Distinct: 2064
Distinct (%): 20.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.9910215 × 10^9
Minimum: 1.9910101 × 10^9
Maximum: 1.9910328 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1.9910101 × 10^9
5-th percentile: 1.9910105 × 10^9
Q1: 1.9910122 × 10^9
Median: 1.9910215 × 10^9
Q3: 1.9910309 × 10^9
95-th percentile: 1.9910324 × 10^9
Maximum: 1.9910328 × 10^9
Range: 22719
Interquartile range (IQR): 18688

Descriptive statistics

Standard deviation: 8442.2561
Coefficient of variation (CV): 4.2401632 × 10^-6
Kurtosis: -1.5757346
Mean: 1.9910215 × 10^9
Median Absolute Deviation (MAD): 9311
Skewness: 0.0060656819
Sum: 1.9910215 × 10^13
Variance: 71271687
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
1991020516 | 12 | 0.1%
1991012706 | 12 | 0.1%
1991030300 | 12 | 0.1%
1991020200 | 12 | 0.1%
1991032312 | 12 | 0.1%
1991011310 | 11 | 0.1%
1991030105 | 11 | 0.1%
1991011401 | 11 | 0.1%
1991030822 | 11 | 0.1%
1991012014 | 11 | 0.1%
Other values (2054) | 9885 | 98.9%

Smallest values
Value | Count | Frequency (%)
1991010100 | 2 | < 0.1%
1991010101 | 7 | 0.1%
1991010102 | 4 | < 0.1%
1991010104 | 2 | < 0.1%
1991010105 | 2 | < 0.1%
1991010106 | 6 | 0.1%
1991010107 | 2 | < 0.1%
1991010108 | 1 | < 0.1%
1991010109 | 5 | 0.1%
1991010111 | 5 | 0.1%

Largest values
Value | Count | Frequency (%)
1991032819 | 3 | < 0.1%
1991032818 | 7 | 0.1%
1991032817 | 5 | 0.1%
1991032816 | 9 | 0.1%
1991032815 | 4 | < 0.1%
1991032814 | 8 | 0.1%
1991032813 | 3 | < 0.1%
1991032812 | 4 | < 0.1%
1991032811 | 5 | 0.1%
1991032810 | 6 | 0.1%
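
The values of 측정일시 look like timestamps packed into integers in `YYYYMMDDHH` form, which would explain why numeric summaries such as the mean or variance are hard to interpret for this column. A minimal sketch of decoding them, assuming that layout holds for every value (the helper name `parse_measurement_time` is hypothetical):

```python
from datetime import datetime

def parse_measurement_time(value: int) -> datetime:
    """Decode an integer such as 1991020516 as year, month, day, hour."""
    return datetime.strptime(str(value), "%Y%m%d%H")

first = parse_measurement_time(1991010100)  # smallest value listed above
last = parse_measurement_time(1991032819)   # largest value listed above

print(first.isoformat())  # 1991-01-01T00:00:00
print(last.isoformat())   # 1991-03-28T19:00:00
print(last - first)       # 86 days, 19:00:00
```

Read this way, the reported range of 22719 is a difference between packed integers and not a duration; the actual span is a little under 87 days.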

측정소 코드 (station code)
Real number (ℝ)

Distinct: 9
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 112.8054
Minimum: 103
Maximum: 124
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 103
5-th percentile: 103
Q1: 107
Median: 113
Q3: 117
95-th percentile: 124
Maximum: 124
Range: 21
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 7.3041814
Coefficient of variation (CV): 0.064750281
Kurtosis: -1.379135
Mean: 112.8054
Median Absolute Deviation (MAD): 6
Skewness: 0.21494886
Sum: 1128054
Variance: 53.351066
Monotonicity: Not monotonic
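
Several of the derived figures for this variable follow directly from the others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the standard deviation squared. A short check of those relations with the values reported for 측정소 코드 (variable names are illustrative):

```python
# Values copied from the statistics reported for 측정소 코드.
minimum, q1, q3, maximum = 103, 107, 117, 124
mean = 112.8054
std = 7.3041814

value_range = maximum - minimum   # reported Range
iqr = q3 - q1                     # reported IQR
cv = std / mean                   # reported CV
variance = std ** 2               # reported Variance

print(value_range)         # 21
print(iqr)                 # 10
print(f"{cv:.6f}")         # 0.064750
print(f"{variance:.3f}")   # 53.351
```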
Histogram with fixed size bins (bins=9)
Common values
Value | Count | Frequency (%)
103 | 1290 | 12.9%
113 | 1266 | 12.7%
124 | 1257 | 12.6%
117 | 1251 | 12.5%
122 | 1238 | 12.4%
108 | 1229 | 12.3%
107 | 1217 | 12.2%
105 | 848 | 8.5%
116 | 404 | 4.0%

Smallest values
Value | Count | Frequency (%)
103 | 1290 | 12.9%
105 | 848 | 8.5%
107 | 1217 | 12.2%
108 | 1229 | 12.3%
113 | 1266 | 12.7%
116 | 404 | 4.0%
117 | 1251 | 12.5%
122 | 1238 | 12.4%
124 | 1257 | 12.6%

Largest values
Value | Count | Frequency (%)
124 | 1257 | 12.6%
122 | 1238 | 12.4%
117 | 1251 | 12.5%
116 | 404 | 4.0%
113 | 1266 | 12.7%
108 | 1229 | 12.3%
107 | 1217 | 12.2%
105 | 848 | 8.5%
103 | 1290 | 12.9%

측정항목 (measurement item)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.3351
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 6
Q3: 8
95-th percentile: 9
Maximum: 9
Range: 8
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.77171
Coefficient of variation (CV): 0.51952353
Kurtosis: -1.2233326
Mean: 5.3351
Median Absolute Deviation (MAD): 3
Skewness: -0.20668313
Sum: 53351
Variance: 7.6823762
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Common values
Value | Count | Frequency (%)
9 | 1730 | 17.3%
1 | 1714 | 17.1%
6 | 1663 | 16.6%
5 | 1635 | 16.4%
3 | 1630 | 16.3%
8 | 1628 | 16.3%

Smallest values
Value | Count | Frequency (%)
1 | 1714 | 17.1%
3 | 1630 | 16.3%
5 | 1635 | 16.4%
6 | 1663 | 16.6%
8 | 1628 | 16.3%
9 | 1730 | 17.3%

Largest values
Value | Count | Frequency (%)
9 | 1730 | 17.3%
8 | 1628 | 16.3%
6 | 1663 | 16.6%
5 | 1635 | 16.4%
3 | 1630 | 16.3%
1 | 1714 | 17.1%

평균값 (average value)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 443
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -3262.5533
Minimum: -9999
Maximum: 276
Zeros: 303
Zeros (%): 3.0%
Negative: 5940
Negative (%): 59.4%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -9999
5-th percentile: -9999
Q1: -9999
Median: -9.999
Q3: 0.03
95-th percentile: 2.9
Maximum: 276
Range: 10275
Interquartile range (IQR): 9999.03

Descriptive statistics

Standard deviation: 4622.6302
Coefficient of variation (CV): -1.416875
Kurtosis: -1.4030349
Mean: -3262.5533
Median Absolute Deviation (MAD): 10.285
Skewness: -0.76572176
Sum: -32625533
Variance: 21368710
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
-9999.0 | 3195 | 31.9%
-9.999 | 2074 | 20.7%
-999.9 | 671 | 6.7%
0.0 | 303 | 3.0%
0.001 | 79 | 0.8%
0.019 | 58 | 0.6%
0.012 | 54 | 0.5%
0.003 | 52 | 0.5%
0.015 | 52 | 0.5%
0.02 | 51 | 0.5%
Other values (433) | 3411 | 34.1%

Smallest values
Value | Count | Frequency (%)
-9999.0 | 3195 | 31.9%
-999.9 | 671 | 6.7%
-9.999 | 2074 | 20.7%
0.0 | 303 | 3.0%
0.001 | 79 | 0.8%
0.002 | 41 | 0.4%
0.003 | 52 | 0.5%
0.004 | 39 | 0.4%
0.005 | 34 | 0.3%
0.006 | 38 | 0.4%

Largest values
Value | Count | Frequency (%)
276.0 | 1 | < 0.1%
252.0 | 1 | < 0.1%
196.0 | 1 | < 0.1%
180.0 | 1 | < 0.1%
179.0 | 1 | < 0.1%
178.0 | 1 | < 0.1%
160.0 | 1 | < 0.1%
156.0 | 2 | < 0.1%
147.0 | 2 | < 0.1%
146.0 | 1 | < 0.1%
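
Every negative entry in the frequency tables above is one of -9999.0, -999.9 or -9.999, and their counts (3195 + 671 + 2074 = 5940) equal the reported number of negative values. That pattern suggests placeholder codes instead of real measurements, although the report itself does not say so. A sketch of excluding them before summarizing, under that assumption; the input values are the 평균값 entries of the first ten rows in the Sample section, and the names `SENTINELS` and `valid_values` are hypothetical:

```python
from statistics import median

# Assumed placeholder codes; the report does not document their meaning.
SENTINELS = {-9999.0, -999.9, -9.999}

def valid_values(values):
    """Drop entries that match one of the assumed placeholder codes."""
    return [v for v in values if v not in SENTINELS]

# 평균값 from the first ten sample rows of this report.
sample = [2.7, 0.029, -9999.0, 0.018, -9999.0,
          -9999.0, 0.082, 1.6, -9999.0, -9999.0]

kept = valid_values(sample)
print(len(kept))       # 5
print(median(kept))    # 0.082
print(median(sample))  # raw median, pulled far below zero by the -9999.0 codes
```

If the assumption is right, the mean, standard deviation and correlations reported for 평균값 mostly describe how often each code occurs, not the measured concentrations.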

측정기 상태 (instrument status)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.3334
Minimum: 0
Maximum: 9
Zeros: 4023
Zeros (%): 40.2%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 2
Q3: 4
95-th percentile: 4
Maximum: 9
Range: 9
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.1824575
Coefficient of variation (CV): 0.93531221
Kurtosis: 0.1261697
Mean: 2.3334
Median Absolute Deviation (MAD): 2
Skewness: 0.51699821
Sum: 23334
Variance: 4.7631208
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Common values
Value | Count | Frequency (%)
4 | 4715 | 47.1%
0 | 4023 | 40.2%
2 | 960 | 9.6%
9 | 278 | 2.8%
1 | 20 | 0.2%
8 | 4 | < 0.1%

Smallest values
Value | Count | Frequency (%)
0 | 4023 | 40.2%
1 | 20 | 0.2%
2 | 960 | 9.6%
4 | 4715 | 47.1%
8 | 4 | < 0.1%
9 | 278 | 2.8%

Largest values
Value | Count | Frequency (%)
9 | 278 | 2.8%
8 | 4 | < 0.1%
4 | 4715 | 47.1%
2 | 960 | 9.6%
1 | 20 | 0.2%
0 | 4023 | 40.2%

국가 기준초과 구분 (national standard exceedance flag)
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Category counts: 0 (9848), 1 (152)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9848 | 98.5%
1 | 152 | 1.5%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure not reproduced: plot of the common values 0 (9848, 98.5%) and 1 (152, 1.5%).]

지자체 기준초과 구분 (local government standard exceedance flag)
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Category counts: 0 (9973), 1 (27)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9973 | 99.7%
1 | 27 | 0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure not reproduced: plot of the common values 0 (9973, 99.7%) and 1 (27, 0.3%).]

Interactions

[Figures not reproduced: 25 interaction plots.]

Correlations

Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | 0.210 | 0.000 | 0.092 | 0.296 | 0.108 | 0.077
측정소 코드 | 0.210 | 1.000 | 0.013 | 0.321 | 0.542 | 0.144 | 0.122
측정항목 | 0.000 | 0.013 | 1.000 | 0.484 | 0.698 | 0.352 | 0.161
평균값 | 0.092 | 0.321 | 0.484 | 1.000 | 0.574 | 0.073 | 0.019
측정기 상태 | 0.296 | 0.542 | 0.698 | 0.574 | 1.000 | 0.208 | 0.083
국가 기준초과 구분 | 0.108 | 0.144 | 0.352 | 0.073 | 0.208 | 1.000 | 0.198
지자체 기준초과 구분 | 0.077 | 0.122 | 0.161 | 0.019 | 0.083 | 0.198 | 1.000

Variable | 지자체 기준초과 구분 | 국가 기준초과 구분
지자체 기준초과 구분 | 1.000 | 0.127
국가 기준초과 구분 | 0.127 | 1.000

Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | 0.012 | 0.001 | -0.023 | 0.016 | 0.077 | 0.056
측정소 코드 | 0.012 | 1.000 | 0.004 | 0.146 | -0.150 | 0.104 | 0.087
측정항목 | 0.001 | 0.004 | 1.000 | -0.684 | 0.490 | 0.253 | 0.116
평균값 | -0.023 | 0.146 | -0.684 | 1.000 | -0.856 | 0.098 | 0.039
측정기 상태 | 0.016 | -0.150 | 0.490 | -0.856 | 1.000 | 0.150 | 0.059
국가 기준초과 구분 | 0.077 | 0.104 | 0.253 | 0.098 | 0.150 | 1.000 | 0.127
지자체 기준초과 구분 | 0.056 | 0.087 | 0.116 | 0.039 | 0.059 | 0.127 | 1.000
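
The extracted text does not say which coefficient each matrix uses. As one illustration of how a strongly negative entry such as the -0.856 between 평균값 and 측정기 상태 can arise, the sketch below computes a Spearman rank correlation (the Pearson correlation of average ranks) on the first ten rows listed in the Sample section. Both the choice of Spearman and the helper names are assumptions, and ten rows are not expected to reproduce the reported value.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = tied_rank
        start = end + 1
    return ranks

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation applied to average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# 평균값 and 측정기 상태 from the first ten sample rows of this report.
avg_value = [2.7, 0.029, -9999.0, 0.018, -9999.0,
             -9999.0, 0.082, 1.6, -9999.0, -9999.0]
status = [0, 0, 4, 0, 4, 4, 0, 0, 4, 4]

rho = spearman(avg_value, status)
print(round(rho, 3))
```

In these rows every status 4 coincides with a -9999.0 value, so the strong negative association mainly reflects that pairing.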

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

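First rows
Index | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
23450 | 1991012108 | 113 | 5 | 2.7 | 0 | 0 | 0
47472 | 1991021216 | 103 | 1 | 0.029 | 0 | 0 | 0
48773 | 1991021322 | 124 | 9 | -9999.0 | 4 | 0 | 0
59334 | 1991022410 | 113 | 1 | 0.018 | 0 | 0 | 0
86284 | 1991031805 | 122 | 8 | -9999.0 | 4 | 0 | 0
36731 | 1991020200 | 107 | 9 | -9999.0 | 4 | 0 | 0
32808 | 1991012911 | 113 | 1 | 0.082 | 0 | 0 | 0
3338 | 1991010321 | 113 | 5 | 1.6 | 0 | 0 | 0
18448 | 1991011700 | 107 | 8 | -9999.0 | 4 | 0 | 0
95501 | 1991032508 | 113 | 9 | -9999.0 | 4 | 0 | 0

Last rows
Index | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
37102 | 1991020209 | 103 | 8 | -9999.0 | 4 | 0 | 0
79920 | 1991031308 | 103 | 1 | 0.09 | 0 | 0 | 0
67645 | 1991030320 | 117 | 3 | 0.015 | 0 | 0 | 0
21637 | 1991011918 | 122 | 3 | -9.999 | 2 | 0 | 0
79642 | 1991031302 | 122 | 8 | -9999.0 | 4 | 0 | 0
47449 | 1991021215 | 113 | 3 | 0.012 | 0 | 0 | 0
30277 | 1991012706 | 122 | 3 | -9.999 | 2 | 0 | 0
36218 | 1991020112 | 103 | 5 | 1.6 | 0 | 0 | 0
28224 | 1991012512 | 103 | 1 | -9.999 | 4 | 0 | 0
98900 | 1991032723 | 113 | 5 | 2.4 | 0 | 0 | 0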