Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 693.4 KiB
Average record size in memory: 71.0 B

Variable types

Numeric: 4
Categorical: 3

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15526/S/1/datasetView.do

Alerts

국가 기준초과 구분 (national standard exceedance flag) is highly overall correlated with 지자체 기준초과 구분 (local government standard exceedance flag) [High correlation]
지자체 기준초과 구분 is highly overall correlated with 국가 기준초과 구분 [High correlation]
측정항목 (measurement item) is highly overall correlated with 평균값 (average value) [High correlation]
평균값 is highly overall correlated with 측정항목 and 1 other field [High correlation]
측정기 상태 (instrument status) is highly overall correlated with 평균값 [High correlation]
측정기 상태 is highly imbalanced (93.8%) [Imbalance]
국가 기준초과 구분 is highly imbalanced (68.9%) [Imbalance]
지자체 기준초과 구분 is highly imbalanced (68.9%) [Imbalance]
평균값 is highly skewed (γ1 = -22.16042092) [Skewed]

Reproduction

Analysis started: 2024-05-11 06:56:00.772207
Analysis finished: 2024-05-11 06:56:06.021054
Duration: 5.25 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

측정일시 (measurement datetime)
Real number (ℝ)

Distinct: 473
Distinct (%): 4.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.017011 × 10^9
Minimum: 2.0170101 × 10^9
Maximum: 2.017012 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2.0170101 × 10^9
5-th percentile: 2.0170101 × 10^9
Q1: 2.0170105 × 10^9
Median: 2.017011 × 10^9
Q3: 2.0170115 × 10^9
95-th percentile: 2.0170119 × 10^9
Maximum: 2.017012 × 10^9
Range: 1916
Interquartile range (IQR): 1000

Descriptive statistics

Standard deviation: 571.27757
Coefficient of variation (CV): 2.8322977 × 10^-7
Kurtosis: -1.2059239
Mean: 2.017011 × 10^9
Median Absolute Deviation (MAD): 500
Skewness: 0.010999287
Sum: 2.017011 × 10^13
Variance: 326358.06
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
2017011518 | 35 | 0.4%
2017012000 | 33 | 0.3%
2017011615 | 33 | 0.3%
2017011701 | 32 | 0.3%
2017010109 | 32 | 0.3%
2017011521 | 31 | 0.3%
2017011209 | 31 | 0.3%
2017010219 | 31 | 0.3%
2017010105 | 30 | 0.3%
2017011123 | 30 | 0.3%
Other values (463) | 9682 | 96.8%

Extreme values (minimum)

Value | Count | Frequency (%)
2017010100 | 16 | 0.2%
2017010101 | 23 | 0.2%
2017010102 | 20 | 0.2%
2017010103 | 24 | 0.2%
2017010104 | 25 | 0.2%
2017010105 | 30 | 0.3%
2017010106 | 23 | 0.2%
2017010107 | 28 | 0.3%
2017010108 | 18 | 0.2%
2017010109 | 32 | 0.3%

Extreme values (maximum)

Value | Count | Frequency (%)
2017012016 | 7 | 0.1%
2017012015 | 20 | 0.2%
2017012014 | 16 | 0.2%
2017012013 | 22 | 0.2%
2017012012 | 19 | 0.2%
2017012011 | 26 | 0.3%
2017012010 | 26 | 0.3%
2017012009 | 23 | 0.2%
2017012008 | 20 | 0.2%
2017012007 | 27 | 0.3%
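
The values of 측정일시 look like hourly timestamps packed into integers as YYYYMMDDHH. This reading is an inference from the value tables, not something the report states. Under that assumption, the reported Range of 1916 is a difference of packed integers rather than an elapsed time. The minimal sketch below, using only the reported minimum and maximum, parses both and counts the hourly slots between them. The function name `parse_hour_stamp` is illustrative.

```python
from datetime import datetime


def parse_hour_stamp(value: int) -> datetime:
    """Parse an integer assumed to be packed as YYYYMMDDHH."""
    return datetime.strptime(str(value), "%Y%m%d%H")


first = parse_hour_stamp(2017010100)  # reported minimum
last = parse_hour_stamp(2017012016)   # reported maximum

# Difference of the packed integers, as the report computes "Range".
numeric_range = 2017012016 - 2017010100

# Elapsed whole hours between the two timestamps, and the number of
# hourly slots when both endpoints are included.
elapsed_hours = int((last - first).total_seconds() // 3600)
hourly_slots = elapsed_hours + 1

print(numeric_range, elapsed_hours, hourly_slots)
```

The slot count equals the reported number of distinct values (473), which is consistent with one timestamp per hour over the covered period.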

측정소 코드 (station code)
Real number (ℝ)

Distinct: 25
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 113.041
Minimum: 101
Maximum: 125
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 101
5-th percentile: 102
Q1: 107
Median: 113
Q3: 119
95-th percentile: 124
Maximum: 125
Range: 24
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 7.2482531
Coefficient of variation (CV): 0.064120568
Kurtosis: -1.2079017
Mean: 113.041
Median Absolute Deviation (MAD): 6
Skewness: -0.019049945
Sum: 1130410
Variance: 52.537173
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=25)
Common values

Value | Count | Frequency (%)
124 | 430 | 4.3%
101 | 430 | 4.3%
122 | 428 | 4.3%
112 | 420 | 4.2%
120 | 417 | 4.2%
111 | 416 | 4.2%
116 | 416 | 4.2%
102 | 413 | 4.1%
121 | 408 | 4.1%
104 | 399 | 4.0%
Other values (15) | 5823 | 58.2%

Extreme values (minimum)

Value | Count | Frequency (%)
101 | 430 | 4.3%
102 | 413 | 4.1%
103 | 396 | 4.0%
104 | 399 | 4.0%
105 | 380 | 3.8%
106 | 380 | 3.8%
107 | 377 | 3.8%
108 | 385 | 3.9%
109 | 392 | 3.9%
110 | 394 | 3.9%

Extreme values (maximum)

Value | Count | Frequency (%)
125 | 387 | 3.9%
124 | 430 | 4.3%
123 | 388 | 3.9%
122 | 428 | 4.3%
121 | 408 | 4.1%
120 | 417 | 4.2%
119 | 389 | 3.9%
118 | 391 | 3.9%
117 | 391 | 3.9%
116 | 416 | 4.2%
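
Several of the descriptive statistics for 측정소 코드 are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. The short check below uses only figures reported in this section. The variable names are illustrative.

```python
# Figures as reported for 측정소 코드.
n_observations = 10000
mean = 113.041
std = 7.2482531

cv = std / mean                # coefficient of variation
variance = std ** 2            # variance from the standard deviation
total = mean * n_observations  # sum from the mean and the row count

print(f"CV={cv:.9f} variance={variance:.6f} sum={total:.0f}")
```

The results agree with the reported CV (0.064120568), variance (52.537173), and sum (1130410) up to rounding of the displayed inputs.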

측정항목 (measurement item)
Real number (ℝ)

HIGH CORRELATION

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.3583
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 6
Q3: 8
95-th percentile: 9
Maximum: 9
Range: 8
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.7596889
Coefficient of variation (CV): 0.51503068
Kurtosis: -1.2144538
Mean: 5.3583
Median Absolute Deviation (MAD): 3
Skewness: -0.22088221
Sum: 53583
Variance: 7.6158827
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=6)
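Common values

Value | Count | Frequency (%)
8 | 1709 | 17.1%
9 | 1692 | 16.9%
1 | 1673 | 16.7%
6 | 1658 | 16.6%
3 | 1639 | 16.4%
5 | 1629 | 16.3%

Extreme values (minimum)

Value | Count | Frequency (%)
1 | 1673 | 16.7%
3 | 1639 | 16.4%
5 | 1629 | 16.3%
6 | 1658 | 16.6%
8 | 1709 | 17.1%
9 | 1692 | 16.9%

Extreme values (maximum)

Value | Count | Frequency (%)
9 | 1692 | 16.9%
8 | 1709 | 17.1%
6 | 1658 | 16.6%
5 | 1629 | 16.3%
3 | 1639 | 16.4%
1 | 1673 | 16.7%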

평균값 (average value)
Real number (ℝ)

HIGH CORRELATION, SKEWED

Distinct: 283
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -4.534255
Minimum: -9999
Maximum: 297
Zeros: 33
Zeros (%): 0.3%
Negative: 22
Negative (%): 0.2%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -9999
5-th percentile: 0.003
Q1: 0.008
Median: 0.0875
Q3: 22
95-th percentile: 79
Maximum: 297
Range: 10296
Interquartile range (IQR): 21.992

Descriptive statistics

Standard deviation: 448.34494
Coefficient of variation (CV): -98.879516
Kurtosis: 491.22279
Mean: -4.534255
Median Absolute Deviation (MAD): 0.1125
Skewness: -22.160421
Sum: -45342.55
Variance: 201013.18
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0.005 | 578 | 5.8%
0.004 | 570 | 5.7%
0.006 | 348 | 3.5%
0.003 | 330 | 3.3%
0.002 | 287 | 2.9%
0.4 | 234 | 2.3%
0.007 | 210 | 2.1%
0.5 | 203 | 2.0%
0.6 | 184 | 1.8%
0.3 | 157 | 1.6%
Other values (273) | 6899 | 69.0%

Extreme values (minimum)

Value | Count | Frequency (%)
-9999.0 | 20 | 0.2%
-18.0 | 1 | < 0.1%
-12.0 | 1 | < 0.1%
0.0 | 33 | 0.3%
0.001 | 26 | 0.3%
0.002 | 287 | 2.9%
0.003 | 330 | 3.3%
0.004 | 570 | 5.7%
0.005 | 578 | 5.8%
0.006 | 348 | 3.5%

Extreme values (maximum)

Value | Count | Frequency (%)
297.0 | 1 | < 0.1%
244.0 | 1 | < 0.1%
216.0 | 1 | < 0.1%
197.0 | 1 | < 0.1%
194.0 | 1 | < 0.1%
187.0 | 1 | < 0.1%
182.0 | 1 | < 0.1%
178.0 | 2 | < 0.1%
177.0 | 1 | < 0.1%
176.0 | 1 | < 0.1%
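
The 20 rows holding -9999.0 pull the reported mean of 평균값 below zero even though the median is 0.0875. The sketch below assumes that -9999 is a missing-value sentinel, which the report does not state, and recomputes the mean without those rows using only the reported sum and counts. The two other negative values (-18.0 and -12.0) are left in, since nothing here indicates how they should be treated. All variable names are illustrative.

```python
# Figures as reported for 평균값.
n_observations = 10000
reported_sum = -45342.55

# Assumption: -9999 marks a missing measurement (not stated in the report).
sentinel_value = -9999.0
sentinel_count = 20

# Mean over all rows, as the report computes it.
mean_all = reported_sum / n_observations

# Remove the sentinel rows from both the sum and the row count.
valid_sum = reported_sum - sentinel_value * sentinel_count
valid_n = n_observations - sentinel_count
valid_mean = valid_sum / valid_n

print(f"mean with sentinel rows:    {mean_all:.6f}")
print(f"mean without sentinel rows: {valid_mean:.4f}")
```

Under that assumption the mean moves from about -4.53 to about 15.49, which suggests the sentinel rows should be handled before the skewness and correlation figures for this column are interpreted.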

측정기 상태 (instrument status)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
0 | 9843
1 | 71
9 | 42
8 | 25
4 | 19

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9843 | 98.4%
1 | 71 | 0.7%
9 | 42 | 0.4%
8 | 25 | 0.2%
4 | 19 | 0.2%

Length

Histogram of lengths of the category
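
The imbalance percentages in the alerts can be reproduced from the category counts as one minus the Shannon entropy of the class distribution, normalized by the logarithm of the number of classes. That this is the calculation the tool performs internally is an assumption. The sketch only shows that the formula matches the reported figures. The function name `imbalance_score` is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the class
    frequencies and k is the number of classes. 0 means perfectly balanced;
    values near 1 mean one class dominates."""
    total = sum(counts)
    entropy = -sum(
        (c / total) * math.log(c / total) for c in counts if c > 0
    )
    return 1 - entropy / math.log(len(counts))


# Counts reported for 측정기 상태 (values 0, 1, 9, 8, 4).
status_score = imbalance_score([9843, 71, 42, 25, 19])

# Counts reported for both 기준초과 구분 columns (values 0, 1).
flag_score = imbalance_score([9442, 558])

print(f"{status_score:.4f} {flag_score:.4f}")
```

The two scores come out at roughly 0.9375 and 0.6895, in line with the 93.8% and 68.9% shown in the alerts.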

국가 기준초과 구분 (national standard exceedance flag)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
0 | 9442
1 | 558

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9442 | 94.4%
1 | 558 | 5.6%

Length

Histogram of lengths of the category

지자체 기준초과 구분 (local government standard exceedance flag)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
0 | 9442
1 | 558

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9442 | 94.4%
1 | 558 | 5.6%

Length

Histogram of lengths of the category

Interactions


Correlations

(variable) | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | 0.000 | 0.028 | NaN | 0.105 | 0.368 | 0.368
측정소 코드 | 0.000 | 1.000 | 0.000 | NaN | 0.177 | 0.041 | 0.041
측정항목 | 0.028 | 0.000 | 1.000 | NaN | 0.065 | 0.505 | 0.505
평균값 | NaN | NaN | NaN | 1.000 | NaN | NaN | NaN
측정기 상태 | 0.105 | 0.177 | 0.065 | NaN | 1.000 | 0.038 | 0.038
국가 기준초과 구분 | 0.368 | 0.041 | 0.505 | NaN | 0.038 | 1.000 | 1.000
지자체 기준초과 구분 | 0.368 | 0.041 | 0.505 | NaN | 0.038 | 1.000 | 1.000

(variable) | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정기 상태 | 1.000 | 0.047 | 0.047
국가 기준초과 구분 | 0.047 | 1.000 | 0.999
지자체 기준초과 구분 | 0.047 | 0.999 | 1.000

(variable) | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | 0.015 | 0.012 | 0.027 | 0.044 | 0.281 | 0.281
측정소 코드 | 0.015 | 1.000 | -0.022 | -0.010 | 0.074 | 0.031 | 0.031
측정항목 | 0.012 | -0.022 | 1.000 | 0.727 | 0.044 | 0.365 | 0.365
평균값 | 0.027 | -0.010 | 0.727 | 1.000 | 0.882 | 0.000 | 0.000
측정기 상태 | 0.044 | 0.074 | 0.044 | 0.882 | 1.000 | 0.047 | 0.047
국가 기준초과 구분 | 0.281 | 0.031 | 0.365 | 0.000 | 0.047 | 1.000 | 0.999
지자체 기준초과 구분 | 0.281 | 0.031 | 0.365 | 0.000 | 0.047 | 0.999 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

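First rows

(index) | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
4988 | 2017010209 | 107 | 5 | 0.7 | 0 | 0 | 0
30917 | 2017010914 | 103 | 9 | 0.0 | 1 | 0 | 0
26646 | 2017010809 | 117 | 1 | 0.005 | 0 | 0 | 0
4905 | 2017010208 | 118 | 6 | 0.006 | 0 | 0 | 0
50517 | 2017011500 | 120 | 6 | 0.023 | 0 | 0 | 0
30941 | 2017010914 | 107 | 9 | 50.0 | 0 | 0 | 0
30558 | 2017010911 | 119 | 1 | 0.005 | 0 | 0 | 0
5966 | 2017010215 | 120 | 5 | 4.1 | 1 | 0 | 0
5651 | 2017010213 | 117 | 9 | 85.0 | 0 | 1 | 1
53067 | 2017011517 | 120 | 6 | 0.021 | 0 | 0 | 0

Last rows

(index) | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
37986 | 2017011113 | 107 | 1 | 0.005 | 0 | 0 | 0
39179 | 2017011121 | 105 | 9 | 22.0 | 0 | 0 | 0
2064 | 2017010113 | 120 | 1 | 0.006 | 0 | 0 | 0
47420 | 2017011404 | 104 | 5 | 1.0 | 0 | 0 | 0
68507 | 2017012000 | 118 | 9 | 59.0 | 0 | 1 | 1
61154 | 2017011723 | 118 | 5 | 1.2 | 0 | 0 | 0
3220 | 2017010121 | 112 | 8 | 84.0 | 0 | 0 | 0
9124 | 2017010312 | 121 | 8 | 59.0 | 0 | 0 | 0
31749 | 2017010919 | 117 | 6 | 0.02 | 0 | 0 | 0
30042 | 2017010908 | 108 | 1 | 0.005 | 0 | 0 | 0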