Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 693.4 KiB
Average record size in memory: 71.0 B

Variable types

Numeric: 4
Categorical: 3

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15526/S/1/datasetView.do

Alerts

지자체 기준초과 구분 is highly overall correlated with 측정항목 and 2 other fields (High correlation)
국가 기준초과 구분 is highly overall correlated with 측정항목 and 2 other fields (High correlation)
측정항목 is highly overall correlated with 평균값 and 2 other fields (High correlation)
평균값 is highly overall correlated with 측정항목 and 2 other fields (High correlation)
측정기 상태 is highly imbalanced (96.4%) (Imbalance)
국가 기준초과 구분 is highly imbalanced (52.3%) (Imbalance)
지자체 기준초과 구분 is highly imbalanced (52.3%) (Imbalance)
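
The report does not state how the imbalance percentages are computed. A minimal sketch, assuming the score is one minus the Shannon entropy of the class frequencies normalized by log2 of the number of classes, gives values consistent with the 96.4% and 52.3% figures above. The function name `imbalance_score` is hypothetical, and the counts are the category counts listed for 측정기 상태 and the two 기준초과 구분 columns in the Variables section.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of class counts.

    H is the Shannon entropy (in bits) of the class frequencies and k is
    the number of classes. This formula is an assumption, not taken from
    the report itself.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no balance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Category counts copied from the Common Values tables of this report.
status_score = imbalance_score([9923, 52, 24, 1])  # 측정기 상태
flag_score = imbalance_score([8975, 1025])         # 국가 / 지자체 기준초과 구분

print(f"{status_score:.1%}, {flag_score:.1%}")
```

A perfectly balanced column scores 0 under this definition, and a column dominated by one category approaches 1, which matches how the alerts rank 측정기 상태 above the two exceedance flags.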

Reproduction

Analysis started: 2024-04-27 12:02:59.879053
Analysis finished: 2024-04-27 12:03:05.239676
Duration: 5.36 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

측정일시 (measurement date and hour, encoded as YYYYMMDDHH)
Real number (ℝ)

Distinct: 435
Distinct (%): 4.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.019011 × 10^9
Minimum: 2.0190101 × 10^9
Maximum: 2.0190119 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2.0190101 × 10^9
5-th percentile: 2.0190101 × 10^9
Q1: 2.0190105 × 10^9
Median: 2.019011 × 10^9
Q3: 2.0190114 × 10^9
95-th percentile: 2.0190118 × 10^9
Maximum: 2.0190119 × 10^9
Range: 1802
Interquartile range (IQR): 899

Descriptive statistics

Standard deviation: 519.78839
Coefficient of variation (CV): 2.5744704 × 10^-7
Kurtosis: -1.1908663
Mean: 2.019011 × 10^9
Median Absolute Deviation (MAD): 423
Skewness: 0.015573752
Sum: 2.019011 × 10^13
Variance: 270179.97
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
2019010320 | 38 | 0.4%
2019011019 | 34 | 0.3%
2019011018 | 33 | 0.3%
2019010418 | 33 | 0.3%
2019010501 | 33 | 0.3%
2019011217 | 33 | 0.3%
2019010518 | 33 | 0.3%
2019010814 | 32 | 0.3%
2019010410 | 32 | 0.3%
2019010721 | 32 | 0.3%
Other values (425) | 9667 | 96.7%

Smallest values

Value | Count | Frequency (%)
2019010100 | 22 | 0.2%
2019010101 | 30 | 0.3%
2019010102 | 17 | 0.2%
2019010103 | 23 | 0.2%
2019010104 | 27 | 0.3%
2019010105 | 23 | 0.2%
2019010106 | 23 | 0.2%
2019010107 | 24 | 0.2%
2019010108 | 26 | 0.3%
2019010109 | 27 | 0.3%

Largest values

Value | Count | Frequency (%)
2019011902 | 8 | 0.1%
2019011901 | 18 | 0.2%
2019011900 | 24 | 0.2%
2019011823 | 21 | 0.2%
2019011822 | 26 | 0.3%
2019011821 | 17 | 0.2%
2019011820 | 26 | 0.3%
2019011819 | 32 | 0.3%
2019011818 | 23 | 0.2%
2019011817 | 20 | 0.2%
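
Because 측정일시 was profiled as a real number, its range and interquartile range are differences between YYYYMMDDHH integers, not elapsed time. The sketch below assumes that encoding (it is inferred from the values listed above, not stated by the report) and converts the smallest and largest values to datetimes with the standard library. The helper name `parse_measurement_hour` is hypothetical.

```python
from datetime import datetime


def parse_measurement_hour(value):
    """Parse an integer such as 2019010320 (assumed YYYYMMDDHH) into a datetime."""
    return datetime.strptime(str(value), "%Y%m%d%H")


first = parse_measurement_hour(2019010100)  # smallest value listed above
last = parse_measurement_hour(2019011902)   # largest value listed above

# Difference between the raw integers, as the profiler computes the range.
numeric_range = 2019011902 - 2019010100

# Actual elapsed time between the two measurements, in whole hours.
elapsed_hours = int((last - first).total_seconds() // 3600)

print(numeric_range, elapsed_hours, elapsed_hours + 1)
```

Under this reading, the integer difference equals the reported range of 1802, while the real span is 434 hours. An hourly series over that span has 435 time points, which agrees with the 435 distinct values reported for this column.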

측정소 코드 (monitoring station code)
Real number (ℝ)

Distinct: 25
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 112.9987
Minimum: 101
Maximum: 125
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 101
5-th percentile: 102
Q1: 107
Median: 113
Q3: 119
95-th percentile: 124
Maximum: 125
Range: 24
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 7.2165578
Coefficient of variation (CV): 0.063864078
Kurtosis: -1.2144324
Mean: 112.9987
Median Absolute Deviation (MAD): 6
Skewness: -0.0086693438
Sum: 1129987
Variance: 52.078706
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=25)
Common values

Value | Count | Frequency (%)
119 | 433 | 4.3%
121 | 428 | 4.3%
124 | 418 | 4.2%
101 | 416 | 4.2%
113 | 416 | 4.2%
104 | 415 | 4.2%
111 | 415 | 4.2%
105 | 414 | 4.1%
123 | 413 | 4.1%
118 | 407 | 4.1%
Other values (15) | 5825 | 58.2%

Smallest values

Value | Count | Frequency (%)
101 | 416 | 4.2%
102 | 400 | 4.0%
103 | 375 | 3.8%
104 | 415 | 4.2%
105 | 414 | 4.1%
106 | 387 | 3.9%
107 | 403 | 4.0%
108 | 395 | 4.0%
109 | 394 | 3.9%
110 | 393 | 3.9%

Largest values

Value | Count | Frequency (%)
125 | 356 | 3.6%
124 | 418 | 4.2%
123 | 413 | 4.1%
122 | 407 | 4.1%
121 | 428 | 4.3%
120 | 385 | 3.9%
119 | 433 | 4.3%
118 | 407 | 4.1%
117 | 376 | 3.8%
116 | 385 | 3.9%

측정항목 (measured item code)
Real number (ℝ)

HIGH CORRELATION

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.3243
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 5
Q3: 8
95-th percentile: 9
Maximum: 9
Range: 8
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.7490881
Coefficient of variation (CV): 0.51632855
Kurtosis: -1.2030488
Mean: 5.3243
Median Absolute Deviation (MAD): 2
Skewness: -0.2006142
Sum: 53243
Variance: 7.5574853
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Common values

Value | Count | Frequency (%)
6 | 1690 | 16.9%
5 | 1686 | 16.9%
9 | 1683 | 16.8%
1 | 1681 | 16.8%
3 | 1647 | 16.5%
8 | 1613 | 16.1%

Smallest values

Value | Count | Frequency (%)
1 | 1681 | 16.8%
3 | 1647 | 16.5%
5 | 1686 | 16.9%
6 | 1690 | 16.9%
8 | 1613 | 16.1%
9 | 1683 | 16.8%

Largest values

Value | Count | Frequency (%)
9 | 1683 | 16.8%
8 | 1613 | 16.1%
6 | 1690 | 16.9%
5 | 1686 | 16.9%
3 | 1647 | 16.5%
1 | 1681 | 16.8%

평균값 (average value)
Real number (ℝ)

HIGH CORRELATION

Distinct: 318
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 19.770869
Minimum: 0
Maximum: 985
Zeros: 16
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.003
Q1: 0.008
Median: 0.078
Q3: 27
95-th percentile: 97
Maximum: 985
Range: 985
Interquartile range (IQR): 26.992

Descriptive statistics

Standard deviation: 47.553212
Coefficient of variation (CV): 2.4052161
Kurtosis: 186.95866
Mean: 19.770869
Median Absolute Deviation (MAD): 0.077
Skewness: 10.177105
Sum: 197708.69
Variance: 2261.308
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0.005 | 529 | 5.3%
0.004 | 495 | 5.0%
0.002 | 362 | 3.6%
0.006 | 356 | 3.6%
0.003 | 314 | 3.1%
0.007 | 235 | 2.4%
0.6 | 210 | 2.1%
0.7 | 201 | 2.0%
0.8 | 197 | 2.0%
0.5 | 195 | 1.9%
Other values (308) | 6906 | 69.1%

Smallest values

Value | Count | Frequency (%)
0.0 | 16 | 0.2%
0.001 | 78 | 0.8%
0.002 | 362 | 3.6%
0.003 | 314 | 3.1%
0.004 | 495 | 5.0%
0.005 | 529 | 5.3%
0.006 | 356 | 3.6%
0.007 | 235 | 2.4%
0.008 | 169 | 1.7%
0.009 | 102 | 1.0%

Largest values

Value | Count | Frequency (%)
985.0 | 11 | 0.1%
409.0 | 1 | < 0.1%
247.0 | 1 | < 0.1%
235.0 | 1 | < 0.1%
224.0 | 1 | < 0.1%
223.0 | 1 | < 0.1%
218.0 | 1 | < 0.1%
216.0 | 2 | < 0.1%
215.0 | 1 | < 0.1%
212.0 | 2 | < 0.1%
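
Several of the 평균값 statistics are tied together by simple identities, which makes a quick consistency check possible. The sketch below recomputes the coefficient of variation, variance, interquartile range, and mean from the other figures in this section. The variable names are illustrative, and the inputs are the rounded values printed in the report, so agreement is only expected up to rounding.

```python
# Figures copied from the 평균값 section of this report (already rounded).
mean = 19.770869
std = 47.553212
q1, q3 = 0.008, 27.0
total_sum = 197708.69
n = 10000  # number of observations from the Overview

cv = std / mean                # coefficient of variation
variance = std ** 2            # variance is the squared standard deviation
iqr = q3 - q1                  # interquartile range
mean_from_sum = total_sum / n  # mean recovered from the sum

print(f"CV={cv:.7f} variance={variance:.3f} IQR={iqr:.3f} mean={mean_from_sum:.6f}")
```

The recomputed values can be compared against the reported CV of 2.4052161, variance of 2261.308, and IQR of 26.992. A standard deviation more than twice the mean, together with the reported skewness above 10, points to a heavily right-skewed column.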

측정기 상태 (instrument status)
Categorical

IMBALANCE

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Category counts: 0 (9923), 1 (52), 9 (24), 2 (1)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9923 | 99.2%
1 | 52 | 0.5%
9 | 24 | 0.2%
2 | 1 | < 0.1%

Length

Histogram of lengths of the category (figure not reproduced)

Common Values (Plot)

Plot of the category counts listed under Common Values (figure not reproduced)

국가 기준초과 구분 (national standard exceedance flag)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Category counts: 0 (8975), 1 (1025)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 8975 | 89.8%
1 | 1025 | 10.2%

Length

Histogram of lengths of the category (figure not reproduced)

Common Values (Plot)

Plot of the category counts listed under Common Values (figure not reproduced)

지자체 기준초과 구분 (local government standard exceedance flag)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Category counts: 0 (8975), 1 (1025)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 8975 | 89.8%
1 | 1025 | 10.2%

Length

Histogram of lengths of the category (figure not reproduced)

Common Values (Plot)

Plot of the category counts listed under Common Values (figure not reproduced)

Interactions

16 interaction plots (figures not reproduced)

Correlations

Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | 0.000 | 0.000 | 0.436 | 0.066 | 0.403 | 0.403
측정소 코드 | 0.000 | 1.000 | 0.000 | 0.056 | 0.011 | 0.047 | 0.047
측정항목 | 0.000 | 0.000 | 1.000 | 0.256 | 0.043 | 0.721 | 0.721
평균값 | 0.436 | 0.056 | 0.256 | 1.000 | 0.473 | 0.524 | 0.524
측정기 상태 | 0.066 | 0.011 | 0.043 | 0.473 | 1.000 | 0.136 | 0.136
국가 기준초과 구분 | 0.403 | 0.047 | 0.721 | 0.524 | 0.136 | 1.000 | 1.000
지자체 기준초과 구분 | 0.403 | 0.047 | 0.721 | 0.524 | 0.136 | 1.000 | 1.000

Variable | 지자체 기준초과 구분 | 측정기 상태 | 국가 기준초과 구분
지자체 기준초과 구분 | 1.000 | 0.090 | 0.999
측정기 상태 | 0.090 | 1.000 | 0.090
국가 기준초과 구분 | 0.999 | 0.090 | 1.000

Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | 0.008 | 0.004 | 0.054 | 0.040 | 0.310 | 0.310
측정소 코드 | 0.008 | 1.000 | 0.009 | 0.001 | 0.007 | 0.036 | 0.036
측정항목 | 0.004 | 0.009 | 1.000 | 0.704 | 0.028 | 0.533 | 0.533
평균값 | 0.054 | 0.001 | 0.704 | 1.000 | 0.403 | 0.635 | 0.635
측정기 상태 | 0.040 | 0.007 | 0.028 | 0.403 | 1.000 | 0.090 | 0.090
국가 기준초과 구분 | 0.310 | 0.036 | 0.533 | 0.635 | 0.090 | 1.000 | 0.999
지자체 기준초과 구분 | 0.310 | 0.036 | 0.533 | 0.635 | 0.090 | 0.999 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

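Index | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
31300 | 2019010916 | 117 | 8 | 65.0 | 0 | 0 | 0
50509 | 2019011500 | 119 | 3 | 0.051 | 0 | 0 | 0
45452 | 2019011315 | 101 | 5 | 0.8 | 0 | 0 | 0
46880 | 2019011400 | 114 | 5 | 1.3 | 0 | 0 | 0
11146 | 2019010402 | 108 | 8 | 56.0 | 0 | 0 | 0
14097 | 2019010421 | 125 | 6 | 0.002 | 0 | 0 | 0
31867 | 2019010920 | 112 | 3 | 0.049 | 0 | 0 | 0
19354 | 2019010609 | 101 | 8 | 51.0 | 0 | 0 | 0
43939 | 2019011304 | 124 | 3 | 0.062 | 0 | 0 | 0
24231 | 2019010717 | 114 | 6 | 0.019 | 0 | 0 | 0

Index | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
56676 | 2019011617 | 122 | 1 | 0.004 | 0 | 0 | 0
17169 | 2019010518 | 112 | 6 | 0.032 | 0 | 0 | 0
39250 | 2019011121 | 117 | 8 | 111.0 | 0 | 1 | 1
37457 | 2019011109 | 118 | 9 | 61.0 | 0 | 1 | 1
53038 | 2019011517 | 115 | 8 | 86.0 | 0 | 0 | 0
451 | 2019010103 | 101 | 3 | 0.053 | 0 | 0 | 0
1757 | 2019010111 | 118 | 9 | 27.0 | 0 | 0 | 0
21754 | 2019010701 | 101 | 8 | 32.0 | 0 | 0 | 0
45468 | 2019011315 | 104 | 1 | 0.009 | 0 | 0 | 0
17414 | 2019010520 | 103 | 5 | 0.4 | 0 | 0 | 0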