Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 693.4 KiB
Average record size in memory: 71.0 B

Variable types

Numeric: 5
Categorical: 2

Dataset

Description: File download
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15526/S/1/datasetView.do

Alerts

지자체 기준초과 구분 is highly overall correlated with 국가 기준초과 구분 (High correlation)
국가 기준초과 구분 is highly overall correlated with 지자체 기준초과 구분 (High correlation)
측정항목 is highly overall correlated with 평균값 (High correlation)
평균값 is highly overall correlated with 측정항목 (High correlation)
국가 기준초과 구분 is highly imbalanced (70.6%) (Imbalance)
지자체 기준초과 구분 is highly imbalanced (70.6%) (Imbalance)
평균값 is highly skewed (γ1 = -22.21268622) (Skewed)
평균값 has 222 (2.2%) zeros (Zeros)
측정기 상태 has 9626 (96.3%) zeros (Zeros)
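
The 70.6% imbalance figure for the two exceedance flags can be reproduced from their value counts in the Common Values tables (9481 zeros and 519 ones). The sketch below assumes the score is one minus the Shannon entropy of the class frequencies, normalised by the maximum entropy for that number of classes. That assumption reproduces the reported value, but the report itself does not state the formula, and the function name `imbalance_score` is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max for a list of class counts.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    Assumed formula, not confirmed by the report.
    """
    n_classes = len(counts)
    total = sum(counts)
    if n_classes < 2 or total <= 0:
        raise ValueError("need at least two classes and a positive total")
    # Shannon entropy in bits; empty classes contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


score = imbalance_score([9481, 519])
print(f"{score:.1%}")
```

A 50/50 split would score 0, so 70.6% reflects how rarely either flag is set rather than any defect in the data.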

Reproduction

Analysis started: 2024-07-27 00:24:27.097043
Analysis finished: 2024-07-27 00:24:39.703367
Duration: 12.61 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

측정일시 (measurement date and time)
Real number (ℝ)

Distinct: 470
Distinct (%): 4.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.018011 × 10^9
Minimum: 2.0180101 × 10^9
Maximum: 2.018012 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2.0180101 × 10^9
5-th percentile: 2.0180102 × 10^9
Q1: 2.0180105 × 10^9
median: 2.018011 × 10^9
Q3: 2.0180115 × 10^9
95-th percentile: 2.0180119 × 10^9
Maximum: 2.018012 × 10^9
Range: 1913
Interquartile range (IQR): 994.25

Descriptive statistics

Standard deviation: 562.16649
Coefficient of variation (CV): 2.7857453 × 10^-7
Kurtosis: -1.1991596
Mean: 2.018011 × 10^9
Median Absolute Deviation (MAD): 497
Skewness: -0.0018689884
Sum: 2.018011 × 10^13
Variance: 316031.16
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
2018011301 | 38 | 0.4%
2018010410 | 34 | 0.3%
2018010704 | 33 | 0.3%
2018010708 | 33 | 0.3%
2018011715 | 33 | 0.3%
2018011806 | 32 | 0.3%
2018011303 | 31 | 0.3%
2018010403 | 31 | 0.3%
2018011815 | 31 | 0.3%
2018010311 | 30 | 0.3%
Other values (460) | 9674 | 96.7%

Minimum 10 values

Value | Count | Frequency (%)
2018010100 | 19 | 0.2%
2018010101 | 19 | 0.2%
2018010102 | 27 | 0.3%
2018010103 | 17 | 0.2%
2018010104 | 22 | 0.2%
2018010105 | 20 | 0.2%
2018010106 | 12 | 0.1%
2018010107 | 24 | 0.2%
2018010108 | 22 | 0.2%
2018010109 | 18 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
2018012013 | 13 | 0.1%
2018012012 | 23 | 0.2%
2018012011 | 22 | 0.2%
2018012010 | 23 | 0.2%
2018012009 | 19 | 0.2%
2018012008 | 23 | 0.2%
2018012007 | 14 | 0.1%
2018012006 | 20 | 0.2%
2018012005 | 18 | 0.2%
2018012004 | 17 | 0.2%
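
측정일시 is profiled as a real number, but its values (2018010100 through 2018012013) look like hour-resolution timestamps packed as YYYYMMDDHH integers. That layout is an assumption the report does not confirm, although 19 full days plus 14 hours gives exactly the 470 distinct values reported. Under that assumption a value can be decoded as follows; the helper name `parse_hour_stamp` is illustrative.

```python
from datetime import datetime


def parse_hour_stamp(value):
    """Decode an integer such as 2018011301 as YYYYMMDDHH (assumed layout)."""
    return datetime.strptime(str(int(value)), "%Y%m%d%H")


first = parse_hour_stamp(2018010100)  # reported minimum
last = parse_hour_stamp(2018012013)   # reported maximum
# Number of hourly slots from first to last, inclusive.
hours_spanned = int((last - first).total_seconds() // 3600) + 1
print(first, last, hours_spanned)
```

If the assumption holds, statistics such as the mean, range, and IQR of the raw integers are not meaningful on a time scale and are better recomputed on the decoded timestamps.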

측정소 코드 (measuring station code)
Real number (ℝ)

Distinct: 25
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 112.9199
Minimum: 101
Maximum: 125
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 101
5-th percentile: 102
Q1: 107
median: 113
Q3: 119
95-th percentile: 124
Maximum: 125
Range: 24
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 7.2346332
Coefficient of variation (CV): 0.064068718
Kurtosis: -1.2279988
Mean: 112.9199
Median Absolute Deviation (MAD): 6
Skewness: -6.2770396 × 10^-5
Sum: 1129199
Variance: 52.339918
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=25)
Common values

Value | Count | Frequency (%)
102 | 470 | 4.7%
120 | 453 | 4.5%
106 | 442 | 4.4%
104 | 432 | 4.3%
117 | 426 | 4.3%
114 | 422 | 4.2%
124 | 415 | 4.2%
119 | 410 | 4.1%
107 | 406 | 4.1%
109 | 402 | 4.0%
Other values (15) | 5722 | 57.2%

Minimum 10 values

Value | Count | Frequency (%)
101 | 382 | 3.8%
102 | 470 | 4.7%
103 | 387 | 3.9%
104 | 432 | 4.3%
105 | 371 | 3.7%
106 | 442 | 4.4%
107 | 406 | 4.1%
108 | 390 | 3.9%
109 | 402 | 4.0%
110 | 381 | 3.8%

Maximum 10 values

Value | Count | Frequency (%)
125 | 360 | 3.6%
124 | 415 | 4.2%
123 | 375 | 3.8%
122 | 398 | 4.0%
121 | 401 | 4.0%
120 | 453 | 4.5%
119 | 410 | 4.1%
118 | 397 | 4.0%
117 | 426 | 4.3%
116 | 359 | 3.6%

측정항목 (measured item code)
Real number (ℝ)

HIGH CORRELATION

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.3203
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
median: 5
Q3: 8
95-th percentile: 9
Maximum: 9
Range: 8
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.7548987
Coefficient of variation (CV): 0.5178089
Kurtosis: -1.2163683
Mean: 5.3203
Median Absolute Deviation (MAD): 3
Skewness: -0.19620904
Sum: 53203
Variance: 7.5894669
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Common values

Value | Count | Frequency (%)
5 | 1685 | 16.9%
1 | 1683 | 16.8%
3 | 1675 | 16.8%
9 | 1672 | 16.7%
8 | 1656 | 16.6%
6 | 1629 | 16.3%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1683 | 16.8%
3 | 1675 | 16.8%
5 | 1685 | 16.9%
6 | 1629 | 16.3%
8 | 1656 | 16.6%
9 | 1672 | 16.7%

Maximum 10 values

Value | Count | Frequency (%)
9 | 1672 | 16.7%
8 | 1656 | 16.6%
6 | 1629 | 16.3%
5 | 1685 | 16.9%
3 | 1675 | 16.8%
1 | 1683 | 16.8%

평균값 (average value)
Real number (ℝ)

HIGH CORRELATION, SKEWED, ZEROS

Distinct: 265
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -2.9925596
Minimum: -9999
Maximum: 3514
Zeros: 222
Zeros (%): 2.2%
Negative: 24
Negative (%): 0.2%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -9999
5-th percentile: 0.002
Q1: 0.008
median: 0.067
Q3: 25
95-th percentile: 76
Maximum: 3514
Range: 13513
Interquartile range (IQR): 24.992

Descriptive statistics

Standard deviation: 439.80235
Coefficient of variation (CV): -146.96528
Kurtosis: 505.18592
Mean: -2.9925596
Median Absolute Deviation (MAD): 0.067
Skewness: -22.212686
Sum: -29925.596
Variance: 193426.11
Monotonicity: Not monotonic
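
The coefficient of variation for 평균값 is negative, which can look like an error. It follows directly from dividing the standard deviation by the mean when the mean is itself negative. A short check against the reported figures (the function name `coefficient_of_variation` is illustrative):

```python
def coefficient_of_variation(std, mean):
    """CV as std / mean; the sign follows the sign of the mean."""
    if mean == 0:
        raise ZeroDivisionError("CV is undefined for a zero mean")
    return std / mean


# Standard deviation and mean as reported for 평균값.
cv = coefficient_of_variation(439.80235, -2.9925596)
print(round(cv, 3))
```

A magnitude this large mostly says that the mean is close to zero relative to the spread, so the CV is not a useful summary for this column.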
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0.005 | 474 | 4.7%
0.006 | 441 | 4.4%
0.004 | 412 | 4.1%
0.007 | 297 | 3.0%
0.5 | 279 | 2.8%
0.002 | 275 | 2.8%
0.003 | 247 | 2.5%
0.6 | 232 | 2.3%
0.4 | 226 | 2.3%
0.0 | 222 | 2.2%
Other values (255) | 6895 | 69.0%

Minimum 10 values

Value | Count | Frequency (%)
-9999.0 | 19 | 0.2%
-433.0 | 1 | < 0.1%
-349.0 | 1 | < 0.1%
-50.0 | 1 | < 0.1%
-1.0 | 1 | < 0.1%
-0.2 | 1 | < 0.1%
0.0 | 222 | 2.2%
0.001 | 49 | 0.5%
0.002 | 275 | 2.8%
0.003 | 247 | 2.5%

Maximum 10 values

Value | Count | Frequency (%)
3514.0 | 1 | < 0.1%
3487.0 | 1 | < 0.1%
161.0 | 1 | < 0.1%
160.0 | 1 | < 0.1%
157.0 | 1 | < 0.1%
155.0 | 1 | < 0.1%
154.0 | 1 | < 0.1%
149.0 | 2 | < 0.1%
147.0 | 1 | < 0.1%
146.0 | 2 | < 0.1%
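
The smallest values of 평균값 include 19 rows equal to -9999, which looks like a missing-data sentinel rather than a real measurement. That reading is an assumption, since the report does not document the value. Under that assumption, the mean with those rows excluded can be backed out from the reported Sum and row counts alone, without the raw data. The function name below is illustrative.

```python
def mean_excluding_sentinel(total_sum, n_rows, sentinel, sentinel_count):
    """Mean of the rows that remain after dropping `sentinel_count` rows equal to `sentinel`."""
    remaining = n_rows - sentinel_count
    if remaining <= 0:
        raise ValueError("no rows left after excluding the sentinel")
    # Remove the sentinel rows' contribution from the total, then re-average.
    return (total_sum - sentinel * sentinel_count) / remaining


# Figures from the report: Sum = -29925.596, 10000 rows, 19 rows of -9999.
adjusted = mean_excluding_sentinel(-29925.596, 10_000, -9999.0, 19)
print(round(adjusted, 3))
```

The adjusted mean comes out near 16.04 against the reported -2.99, so those 19 rows alone pull the mean below zero and very likely drive much of the large negative skewness.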

측정기 상태 (instrument status)
Real number (ℝ)

ZEROS

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.2425
Minimum: 0
Maximum: 9
Zeros: 9626
Zeros (%): 96.3%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 9
Range: 9
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.3556096
Coefficient of variation (CV): 5.5901429
Kurtosis: 30.393485
Mean: 0.2425
Median Absolute Deviation (MAD): 0
Skewness: 5.6514277
Sum: 2425
Variance: 1.8376775
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
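Common values

Value | Count | Frequency (%)
0 | 9626 | 96.3%
8 | 214 | 2.1%
1 | 68 | 0.7%
9 | 59 | 0.6%
4 | 24 | 0.2%
2 | 9 | 0.1%

Minimum 10 values

Value | Count | Frequency (%)
0 | 9626 | 96.3%
1 | 68 | 0.7%
2 | 9 | 0.1%
4 | 24 | 0.2%
8 | 214 | 2.1%
9 | 59 | 0.6%

Maximum 10 values

Value | Count | Frequency (%)
9 | 59 | 0.6%
8 | 214 | 2.1%
4 | 24 | 0.2%
2 | 9 | 0.1%
1 | 68 | 0.7%
0 | 9626 | 96.3%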

국가 기준초과 구분 (national standard exceedance flag)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Category counts: 0 (9481), 1 (519)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9481 | 94.8%
1 | 519 | 5.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

지자체 기준초과 구분 (local government standard exceedance flag)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Category counts: 0 (9481), 1 (519)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9481 | 94.8%
1 | 519 | 5.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Interactions

[Figures: 25 interaction plots]

Correlations

Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | 0.000 | 0.000 | 0.040 | 0.142 | 0.456 | 0.456
측정소 코드 | 0.000 | 1.000 | 0.000 | 0.050 | 0.431 | 0.003 | 0.003
측정항목 | 0.000 | 0.000 | 1.000 | 0.031 | 0.108 | 0.514 | 0.514
평균값 | 0.040 | 0.050 | 0.031 | 1.000 | 0.253 | 0.068 | 0.068
측정기 상태 | 0.142 | 0.431 | 0.108 | 0.253 | 1.000 | 0.044 | 0.044
국가 기준초과 구분 | 0.456 | 0.003 | 0.514 | 0.068 | 0.044 | 1.000 | 1.000
지자체 기준초과 구분 | 0.456 | 0.003 | 0.514 | 0.068 | 0.044 | 1.000 | 1.000

Variable | 지자체 기준초과 구분 | 국가 기준초과 구분
지자체 기준초과 구분 | 1.000 | 0.999
국가 기준초과 구분 | 0.999 | 1.000

Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | 0.004 | -0.002 | 0.090 | -0.084 | 0.351 | 0.351
측정소 코드 | 0.004 | 1.000 | -0.008 | 0.034 | -0.152 | 0.002 | 0.002
측정항목 | -0.002 | -0.008 | 1.000 | 0.685 | 0.009 | 0.372 | 0.372
평균값 | 0.090 | 0.034 | 0.685 | 1.000 | -0.212 | 0.060 | 0.060
측정기 상태 | -0.084 | -0.152 | 0.009 | -0.212 | 1.000 | 0.031 | 0.031
국가 기준초과 구분 | 0.351 | 0.002 | 0.372 | 0.060 | 0.031 | 1.000 | 0.999
지자체 기준초과 구분 | 0.351 | 0.002 | 0.372 | 0.060 | 0.031 | 0.999 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

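First rows

Row index | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
8680 | 2018010309 | 122 | 8 | 51.0 | 0 | 0 | 0
12441 | 2018010410 | 124 | 6 | 0.014 | 0 | 0 | 0
57713 | 2018011700 | 119 | 9 | 102.0 | 0 | 1 | 1
33711 | 2018011008 | 119 | 6 | 0.009 | 0 | 0 | 0
17181 | 2018010518 | 114 | 6 | 0.026 | 0 | 0 | 0
15552 | 2018010507 | 118 | 1 | 0.006 | 0 | 0 | 0
53966 | 2018011523 | 120 | 5 | 1.2 | 0 | 0 | 0
43104 | 2018011223 | 110 | 1 | 0.006 | 0 | 0 | 0
1704 | 2018010111 | 110 | 1 | 0.007 | 0 | 0 | 0
30599 | 2018010911 | 125 | 9 | 22.0 | 0 | 0 | 0

Last rows

Row index | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
31274 | 2018010916 | 113 | 5 | 0.6 | 0 | 0 | 0
59450 | 2018011712 | 109 | 5 | 0.7 | 0 | 0 | 0
3095 | 2018010120 | 116 | 9 | 17.0 | 0 | 0 | 0
56097 | 2018011613 | 125 | 6 | 0.014 | 0 | 0 | 0
27102 | 2018010812 | 118 | 1 | 0.007 | 0 | 0 | 0
35389 | 2018011019 | 124 | 3 | 0.014 | 0 | 0 | 0
37369 | 2018011109 | 104 | 3 | 0.025 | 0 | 0 | 0
4215 | 2018010204 | 103 | 6 | 0.002 | 0 | 0 | 0
39774 | 2018011201 | 105 | 1 | 0.001 | 8 | 0 | 0
62858 | 2018011811 | 102 | 5 | 0.9 | 0 | 0 | 0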