Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 693.4 KiB
Average record size in memory: 71.0 B
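
The average record size follows from the total memory size and the number of observations. A minimal check, assuming 1 KiB = 1024 bytes (the variable names are illustrative, not part of the report):

```python
# Figures copied from the dataset statistics above.
total_kib = 693.4   # total size in memory, in KiB
n_rows = 10_000     # number of observations

# Convert KiB to bytes and divide by the row count.
avg_record_bytes = total_kib * 1024 / n_rows
print(f"{avg_record_bytes:.1f} B")  # 71.0 B
```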

Variable types

Numeric: 4
Categorical: 3

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15526/S/1/datasetView.do

Alerts

평균값 (mean value) is highly overall correlated with 측정기 상태 (instrument status) [High correlation]
측정기 상태 is highly overall correlated with 평균값 [High correlation]
국가 기준초과 구분 (national standard exceedance flag) is highly overall correlated with 지자체 기준초과 구분 (local government standard exceedance flag) [High correlation]
지자체 기준초과 구분 is highly overall correlated with 국가 기준초과 구분 [High correlation]
측정기 상태 is highly imbalanced (57.8%) [Imbalance]
국가 기준초과 구분 is highly imbalanced (93.6%) [Imbalance]
지자체 기준초과 구분 is highly imbalanced (88.1%) [Imbalance]
평균값 has 301 (3.0%) zeros [Zeros]
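
The imbalance percentages can be reproduced from the value counts listed later in the report. The sketch below assumes the score is one minus the normalized Shannon entropy of the class proportions. That formula is an assumption, not something the report states, but it reproduces all three reported figures. The function name `imbalance_score` is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): H is the Shannon entropy (in bits) of the
    class proportions and k is the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Value counts copied from the "Common Values" tables of this report.
columns = {
    "측정기 상태": [6767, 3138, 67, 27, 1],
    "국가 기준초과 구분": [9925, 75],
    "지자체 기준초과 구분": [9839, 161],
}
scores = {name: imbalance_score(counts) for name, counts in columns.items()}
for name, score in scores.items():
    print(name, f"{score:.1%}")
```

A score of 0 means perfectly balanced classes and a score near 1 means a single class dominates, which is why the two exceedance flags (99.2% and 98.4% zeros) score highest.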

Reproduction

Analysis started: 2024-05-04 04:02:56.243244
Analysis finished: 2024-05-04 04:03:04.289751
Duration: 8.05 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
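
The reported duration is the difference between the two timestamps. A short check using only the standard library (variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the reproduction block above.
started = datetime.fromisoformat("2024-05-04 04:02:56.243244")
finished = datetime.fromisoformat("2024-05-04 04:03:04.289751")

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")  # 8.05 seconds
```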

Variables

측정일시 (measurement date and time)
Real number (ℝ)

Distinct: 926
Distinct (%): 9.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.9970133 × 10^9
Minimum: 1.9970101 × 10^9
Maximum: 1.9970208 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1.9970101 × 10^9
5-th percentile: 1.9970102 × 10^9
Q1: 1.997011 × 10^9
Median: 1.997012 × 10^9
Q3: 1.9970129 × 10^9
95-th percentile: 1.9970206 × 10^9
Maximum: 1.9970208 × 10^9
Range: 10713
Interquartile range (IQR): 1908

Descriptive statistics

Standard deviation: 3593.9293
Coefficient of variation (CV): 1.7996521 × 10^-6
Kurtosis: 0.13223009
Mean: 1.9970133 × 10^9
Median Absolute Deviation (MAD): 990
Skewness: 1.3575044
Sum: 1.9970133 × 10^13
Variance: 12916328
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
1997011117 | 22 | 0.2%
1997020421 | 22 | 0.2%
1997010911 | 21 | 0.2%
1997011506 | 21 | 0.2%
1997010114 | 20 | 0.2%
1997010908 | 20 | 0.2%
1997012315 | 19 | 0.2%
1997011820 | 19 | 0.2%
1997011719 | 19 | 0.2%
1997020418 | 19 | 0.2%
Other values (916) | 9798 | 98.0%

Value | Count | Frequency (%)
1997010100 | 15 | 0.1%
1997010101 | 11 | 0.1%
1997010102 | 5 | 0.1%
1997010103 | 12 | 0.1%
1997010104 | 18 | 0.2%
1997010105 | 13 | 0.1%
1997010106 | 11 | 0.1%
1997010107 | 9 | 0.1%
1997010108 | 14 | 0.1%
1997010109 | 14 | 0.1%

Value | Count | Frequency (%)
1997020813 | 12 | 0.1%
1997020812 | 10 | 0.1%
1997020811 | 6 | 0.1%
1997020810 | 10 | 0.1%
1997020809 | 8 | 0.1%
1997020808 | 15 | 0.1%
1997020807 | 15 | 0.1%
1997020806 | 5 | 0.1%
1997020805 | 17 | 0.2%
1997020804 | 14 | 0.1%
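
The 측정일시 values look like hourly timestamps packed into integers as YYYYMMDDHH (an assumption based on the minimum 1997010100 and maximum 1997020813). Because the profiler treats the column as a real number, figures such as the range of 10713 are differences between packed integers, not elapsed time. A minimal sketch of parsing the values, with the illustrative helper `parse_hour_stamp`:

```python
from datetime import datetime

def parse_hour_stamp(value: int) -> datetime:
    """Parse an integer such as 1997010100, assuming a YYYYMMDDHH layout."""
    return datetime.strptime(str(value), "%Y%m%d%H")

first = parse_hour_stamp(1997010100)  # reported minimum
last = parse_hour_stamp(1997020813)   # reported maximum

# Number of hourly slots from first to last, inclusive.
hours_in_span = int((last - first).total_seconds() // 3600) + 1
print(first, last, hours_in_span)
```

Under this reading the span covers 926 hourly slots, which equals the reported distinct count of 926 and is consistent with every hour in the period being present.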

측정소 코드 (station code)
Real number (ℝ)

Distinct: 18
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 112.2407
Minimum: 102
Maximum: 124
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 102
5-th percentile: 102
Q1: 106
Median: 111
Q3: 119
95-th percentile: 124
Maximum: 124
Range: 22
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 7.1580645
Coefficient of variation (CV): 0.063774232
Kurtosis: -1.335816
Mean: 112.2407
Median Absolute Deviation (MAD): 6
Skewness: 0.24484816
Sum: 1122407
Variance: 51.237887
Monotonicity: Not monotonic
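
Two of the descriptive statistics are derived from the others: the coefficient of variation is the standard deviation divided by the mean, and the variance is the square of the standard deviation. A quick consistency check on the 측정소 코드 figures (variable names are illustrative):

```python
import math

# Figures copied from the descriptive statistics above.
std = 7.1580645
mean = 112.2407
variance = 51.237887

cv = std / mean
print(f"CV = {cv:.6f}")

# The reported variance should equal std squared, up to rounding.
variance_matches = math.isclose(std ** 2, variance, rel_tol=1e-6)
print(variance_matches)
```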
[Figure: histogram with fixed size bins (bins=18)]

Value | Count | Frequency (%)
119 | 597 | 6.0%
106 | 586 | 5.9%
117 | 581 | 5.8%
103 | 566 | 5.7%
123 | 566 | 5.7%
110 | 563 | 5.6%
102 | 558 | 5.6%
116 | 554 | 5.5%
124 | 551 | 5.5%
108 | 547 | 5.5%
Other values (8) | 4331 | 43.3%

Value | Count | Frequency (%)
102 | 558 | 5.6%
103 | 566 | 5.7%
104 | 546 | 5.5%
105 | 544 | 5.4%
106 | 586 | 5.9%
107 | 537 | 5.4%
108 | 547 | 5.5%
109 | 538 | 5.4%
110 | 563 | 5.6%
111 | 547 | 5.5%

Value | Count | Frequency (%)
124 | 551 | 5.5%
123 | 566 | 5.7%
122 | 544 | 5.4%
121 | 535 | 5.3%
119 | 597 | 6.0%
117 | 581 | 5.8%
116 | 554 | 5.5%
113 | 540 | 5.4%
111 | 547 | 5.5%
110 | 563 | 5.6%

측정항목 (measurement item)
Real number (ℝ)

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.3107
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 5
Q3: 8
95-th percentile: 9
Maximum: 9
Range: 8
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.7395102
Coefficient of variation (CV): 0.51584728
Kurtosis: -1.2051042
Mean: 5.3107
Median Absolute Deviation (MAD): 2
Skewness: -0.18895575
Sum: 53107
Variance: 7.504916
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=6)]

Value | Count | Frequency (%)
5 | 1707 | 17.1%
3 | 1705 | 17.1%
1 | 1656 | 16.6%
6 | 1648 | 16.5%
8 | 1643 | 16.4%
9 | 1641 | 16.4%

Value | Count | Frequency (%)
1 | 1656 | 16.6%
3 | 1705 | 17.1%
5 | 1707 | 17.1%
6 | 1648 | 16.5%
8 | 1643 | 16.4%
9 | 1641 | 16.4%

Value | Count | Frequency (%)
9 | 1641 | 16.4%
8 | 1643 | 16.4%
6 | 1648 | 16.5%
5 | 1707 | 17.1%
3 | 1705 | 17.1%
1 | 1656 | 16.6%

평균값 (mean value)
Real number (ℝ)

Alerts: High correlation, Zeros

Distinct: 366
Distinct (%): 3.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -2228.6211
Minimum: -9999
Maximum: 459
Zeros: 301
Zeros (%): 3.0%
Negative: 2904
Negative (%): 29.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -9999
5-th percentile: -9999
Q1: -9.999
Median: 0.013
Q3: 0.07
95-th percentile: 59
Maximum: 459
Range: 10458
Interquartile range (IQR): 10.069

Descriptive statistics

Standard deviation: 4151.7918
Coefficient of variation (CV): -1.862942
Kurtosis: -0.21195082
Mean: -2228.6211
Median Absolute Deviation (MAD): 0.687
Skewness: -1.3354651
Sum: -22286211
Variance: 17237375
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
-9999.0 | 2219 | 22.2%
-9.999 | 524 | 5.2%
0.0 | 301 | 3.0%
0.005 | 182 | 1.8%
0.006 | 177 | 1.8%
-999.9 | 161 | 1.6%
0.008 | 157 | 1.6%
0.003 | 155 | 1.6%
0.007 | 148 | 1.5%
0.004 | 144 | 1.4%
Other values (356) | 5832 | 58.3%

Value | Count | Frequency (%)
-9999.0 | 2219 | 22.2%
-999.9 | 161 | 1.6%
-9.999 | 524 | 5.2%
0.0 | 301 | 3.0%
0.001 | 144 | 1.4%
0.002 | 116 | 1.2%
0.003 | 155 | 1.6%
0.004 | 144 | 1.4%
0.005 | 182 | 1.8%
0.006 | 177 | 1.8%

Value | Count | Frequency (%)
459.0 | 1 | < 0.1%
358.0 | 1 | < 0.1%
292.0 | 1 | < 0.1%
253.0 | 1 | < 0.1%
239.0 | 1 | < 0.1%
236.0 | 1 | < 0.1%
227.0 | 2 | < 0.1%
225.0 | 1 | < 0.1%
224.0 | 1 | < 0.1%
220.0 | 1 | < 0.1%
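
The values -9999.0, -999.9 and -9.999 account for all 2904 negative observations (2219 + 161 + 524) and pull the mean down to -2228.6 while the median stays at 0.013. The report does not say what these values mean; the sketch below assumes they are placeholder codes rather than measurements and recomputes the mean without them from the reported aggregates. The result mixes all measurement items, so it is illustrative only.

```python
# Placeholder-like codes and their counts, copied from the frequency tables.
sentinel_counts = {-9999.0: 2219, -999.9: 161, -9.999: 524}

n_rows = 10_000
reported_sum = -22286211   # "Sum" in the descriptive statistics
reported_negative = 2904   # "Negative" count

n_sentinel = sum(sentinel_counts.values())
sentinel_sum = sum(value * count for value, count in sentinel_counts.items())

# Every negative observation is one of the three codes.
assert n_sentinel == reported_negative

# Remove the codes' contribution from the total, then average what is left.
n_valid = n_rows - n_sentinel
valid_mean = (reported_sum - sentinel_sum) / n_valid
print(n_valid, round(valid_mean, 2))  # 7096 9.55
```

If the codes are indeed placeholders, converting them to missing values before profiling would likely give more meaningful summary statistics and correlations for this column.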

측정기 상태 (instrument status)
Categorical

Alerts: High correlation, Imbalance

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 4
2nd row: 0
3rd row: 0
4th row: 4
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 6767 | 67.7%
4 | 3138 | 31.4%
2 | 67 | 0.7%
1 | 27 | 0.3%
9 | 1 | < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; image not recovered]

국가 기준초과 구분 (national standard exceedance flag)
Categorical

Alerts: High correlation, Imbalance

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9925 | 99.2%
1 | 75 | 0.8%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; image not recovered]

지자체 기준초과 구분 (local government standard exceedance flag)
Categorical

Alerts: High correlation, Imbalance

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9839 | 98.4%
1 | 161 | 1.6%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; image not recovered]

Interactions

[Figures: 16 interaction plots; images not recovered]

Correlations

Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | 0.000 | 0.000 | 0.119 | 0.069 | 0.000 | 0.034
측정소 코드 | 0.000 | 1.000 | 0.000 | 0.117 | 0.118 | 0.075 | 0.090
측정항목 | 0.000 | 0.000 | 1.000 | 0.224 | 0.495 | 0.241 | 0.382
평균값 | 0.119 | 0.117 | 0.224 | 1.000 | 0.314 | 0.000 | 0.022
측정기 상태 | 0.069 | 0.118 | 0.495 | 0.314 | 1.000 | 0.048 | 0.070
국가 기준초과 구분 | 0.000 | 0.075 | 0.241 | 0.000 | 0.048 | 1.000 | 0.858
지자체 기준초과 구분 | 0.034 | 0.090 | 0.382 | 0.022 | 0.070 | 0.858 | 1.000

Variable | 지자체 기준초과 구분 | 국가 기준초과 구분 | 측정기 상태
지자체 기준초과 구분 | 1.000 | 0.656 | 0.086
국가 기준초과 구분 | 0.656 | 1.000 | 0.059
측정기 상태 | 0.086 | 0.059 | 1.000

Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | 0.001 | 0.003 | 0.107 | 0.057 | 0.000 | 0.023
측정소 코드 | 0.001 | 1.000 | -0.029 | -0.019 | 0.068 | 0.075 | 0.090
측정항목 | 0.003 | -0.029 | 1.000 | -0.415 | 0.363 | 0.173 | 0.275
평균값 | 0.107 | -0.019 | -0.415 | 1.000 | 0.578 | 0.046 | 0.070
측정기 상태 | 0.057 | 0.068 | 0.363 | 0.578 | 1.000 | 0.059 | 0.086
국가 기준초과 구분 | 0.000 | 0.075 | 0.173 | 0.046 | 0.059 | 1.000 | 0.656
지자체 기준초과 구분 | 0.023 | 0.090 | 0.275 | 0.070 | 0.086 | 0.656 | 1.000
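
The two "High correlation" pairs in the Alerts section can be recovered from the last 7x7 matrix above by keeping off-diagonal entries whose absolute value exceeds a cut-off. The cut-off of 0.5 used here is an assumption; the report itself does not state the threshold.

```python
# Values copied from the last 7x7 correlation matrix in this report.
names = ["측정일시", "측정소 코드", "측정항목", "평균값",
         "측정기 상태", "국가 기준초과 구분", "지자체 기준초과 구분"]
matrix = [
    [1.000, 0.001, 0.003, 0.107, 0.057, 0.000, 0.023],
    [0.001, 1.000, -0.029, -0.019, 0.068, 0.075, 0.090],
    [0.003, -0.029, 1.000, -0.415, 0.363, 0.173, 0.275],
    [0.107, -0.019, -0.415, 1.000, 0.578, 0.046, 0.070],
    [0.057, 0.068, 0.363, 0.578, 1.000, 0.059, 0.086],
    [0.000, 0.075, 0.173, 0.046, 0.059, 1.000, 0.656],
    [0.023, 0.090, 0.275, 0.070, 0.086, 0.656, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, not stated in the report

# Scan the upper triangle so each pair is listed once.
high_pairs = [
    (names[i], names[j], matrix[i][j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(matrix[i][j]) > THRESHOLD
]
for left, right, value in high_pairs:
    print(f"{left} / {right}: {value:.3f}")
```

With this cut-off the scan returns exactly the pairs named in the alerts: 평균값 with 측정기 상태 (0.578) and 국가 기준초과 구분 with 지자체 기준초과 구분 (0.656).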

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

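Index | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
91805 | 1997020510 | 102 | 9 | -9999.0 | 4 | 0 | 0
94718 | 1997020613 | 102 | 5 | 0.9 | 0 | 0 | 0
60204 | 1997012405 | 110 | 1 | 0.012 | 0 | 0 | 0
35547 | 1997011417 | 104 | 6 | -9.999 | 4 | 0 | 0
51906 | 1997012100 | 116 | 1 | 0.006 | 0 | 0 | 0
61980 | 1997012421 | 123 | 1 | 0.011 | 0 | 0 | 0
4534 | 1997010217 | 124 | 8 | 9.0 | 0 | 0 | 0
9480 | 1997010415 | 121 | 1 | 0.015 | 0 | 0 | 0
99015 | 1997020804 | 121 | 6 | 0.003 | 0 | 0 | 0
85924 | 1997020303 | 113 | 8 | 80.0 | 0 | 0 | 0

Index | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
31457 | 1997011303 | 106 | 9 | -9999.0 | 4 | 0 | 0
57972 | 1997012308 | 121 | 1 | 0.017 | 0 | 0 | 0
22975 | 1997010920 | 119 | 3 | 0.053 | 0 | 0 | 0
61201 | 1997012414 | 117 | 3 | -9.999 | 4 | 0 | 0
72372 | 1997012822 | 104 | 1 | 0.009 | 0 | 0 | 0
92800 | 1997020519 | 106 | 8 | 71.0 | 0 | 0 | 0
6124 | 1997010308 | 117 | 8 | -9999.0 | 2 | 0 | 0
52358 | 1997012104 | 121 | 5 | 1.3 | 0 | 0 | 0
31638 | 1997011304 | 124 | 1 | 0.012 | 0 | 0 | 0
28747 | 1997011202 | 105 | 3 | 0.036 | 0 | 0 | 0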