Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 693.4 KiB
Average record size in memory: 71.0 B

Variable types

Numeric: 4
Categorical: 3

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15526/S/1/datasetView.do

Alerts

High correlation: 지자체 기준초과 구분 is highly overall correlated with 국가 기준초과 구분
High correlation: 국가 기준초과 구분 is highly overall correlated with 지자체 기준초과 구분
High correlation: 측정항목 is highly overall correlated with 평균값
High correlation: 평균값 is highly overall correlated with 측정항목
Imbalance: 측정기 상태 is highly imbalanced (89.0%)
Imbalance: 국가 기준초과 구분 is highly imbalanced (74.0%)
Imbalance: 지자체 기준초과 구분 is highly imbalanced (74.0%)
Skewed: 평균값 is highly skewed (γ1 = -43.28980571)
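
The imbalance percentages above can be reproduced by hand. The sketch below assumes the score is one minus the Shannon entropy of the class frequencies divided by the maximum possible entropy for that number of classes. The report itself does not state the formula, and `imbalance_score` is an illustrative name. The counts are taken from the Common Values tables for 측정기 상태 and 국가 기준초과 구분 in this report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max for a list of class counts (assumed definition)."""
    if len(counts) < 2:
        return 0.0
    total = sum(counts)
    # Skip empty classes so that log2(0) is never evaluated.
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(counts))


# Counts copied from this report.
status_counts = [9690, 135, 84, 67, 24]   # 측정기 상태
flag_counts = [9560, 440]                 # 국가 기준초과 구분

print(f"{imbalance_score(status_counts):.1%}")  # 89.0%
print(f"{imbalance_score(flag_counts):.1%}")    # 74.0%
```

Under this assumption, a score near 100% means almost all rows fall into a single category, which is why a 96.9% share of status 0 is flagged.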

Reproduction

Analysis started: 2024-04-27 12:05:33.218146
Analysis finished: 2024-04-27 12:05:39.720766
Duration: 6.5 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

측정일시 (measurement date and time)
Real number (ℝ)

Distinct: 498
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0080111 × 10^9
Minimum: 2.0080101 × 10^9
Maximum: 2.0080121 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2.0080101 × 10^9
5-th percentile: 2.0080102 × 10^9
Q1: 2.0080106 × 10^9
Median: 2.0080111 × 10^9
Q3: 2.0080116 × 10^9
95-th percentile: 2.008012 × 10^9
Maximum: 2.0080121 × 10^9
Range: 2017
Interquartile range (IQR): 1008.25

Descriptive statistics

Standard deviation: 597.69576
Coefficient of variation (CV): 2.9765561 × 10^-7
Kurtosis: -1.2022866
Mean: 2.0080111 × 10^9
Median Absolute Deviation (MAD): 505
Skewness: 0.016015682
Sum: 2.0080111 × 10^13
Variance: 357240.23
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
2008011306 | 34 | 0.3%
2008011804 | 34 | 0.3%
2008011220 | 32 | 0.3%
2008010302 | 30 | 0.3%
2008011811 | 30 | 0.3%
2008010902 | 30 | 0.3%
2008011809 | 29 | 0.3%
2008010322 | 29 | 0.3%
2008010611 | 29 | 0.3%
2008010402 | 29 | 0.3%
Other values (488) | 9694 | 96.9%

Minimum 10 values

Value | Count | Frequency (%)
2008010100 | 19 | 0.2%
2008010101 | 22 | 0.2%
2008010102 | 26 | 0.3%
2008010103 | 11 | 0.1%
2008010104 | 28 | 0.3%
2008010105 | 22 | 0.2%
2008010106 | 17 | 0.2%
2008010107 | 23 | 0.2%
2008010108 | 18 | 0.2%
2008010109 | 19 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
2008012117 | 7 | 0.1%
2008012116 | 17 | 0.2%
2008012115 | 19 | 0.2%
2008012114 | 22 | 0.2%
2008012113 | 19 | 0.2%
2008012112 | 19 | 0.2%
2008012111 | 16 | 0.2%
2008012110 | 21 | 0.2%
2008012109 | 22 | 0.2%
2008012108 | 18 | 0.2%
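
The values of 측정일시 look like timestamps packed into integers, so numeric statistics such as the range of 2017 or the mean are not meaningful as time spans. The sketch below assumes the layout is YYYYMMDDHH (an inference from values such as 2008011306, not something the report states) and uses a hypothetical helper, `parse_stamp`, to convert the reported minimum and maximum into real timestamps.

```python
from datetime import datetime


def parse_stamp(value: int) -> datetime:
    """Parse an integer such as 2008011306, assuming a YYYYMMDDHH layout."""
    return datetime.strptime(str(value), "%Y%m%d%H")


first = parse_stamp(2008010100)  # reported minimum
last = parse_stamp(2008012117)   # reported maximum

# Number of hourly slots between the two stamps, inclusive of both ends.
elapsed_hours = int((last - first).total_seconds() // 3600)
hourly_slots = elapsed_hours + 1

print(first)         # 2008-01-01 00:00:00
print(last)          # 2008-01-21 17:00:00
print(hourly_slots)  # 498
```

Under that assumption the sample spans 498 hourly slots, which equals the reported number of distinct values, so every hour in the period would appear at least once.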

측정소 코드 (measuring station code)
Real number (ℝ)

Distinct: 24
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 112.5929
Minimum: 101
Maximum: 125
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 101
5-th percentile: 102
Q1: 107
Median: 112
Q3: 119
95-th percentile: 124
Maximum: 125
Range: 24
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 7.0629284
Coefficient of variation (CV): 0.062729785
Kurtosis: -1.1420043
Mean: 112.5929
Median Absolute Deviation (MAD): 6
Skewness: 0.04893582
Sum: 1125929
Variance: 49.884958
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=24)

Common values

Value | Count | Frequency (%)
118 | 451 | 4.5%
119 | 439 | 4.4%
114 | 435 | 4.3%
101 | 433 | 4.3%
107 | 432 | 4.3%
121 | 431 | 4.3%
111 | 431 | 4.3%
103 | 430 | 4.3%
115 | 429 | 4.3%
112 | 428 | 4.3%
Other values (14) | 5661 | 56.6%

Minimum 10 values

Value | Count | Frequency (%)
101 | 433 | 4.3%
102 | 404 | 4.0%
103 | 430 | 4.3%
104 | 417 | 4.2%
105 | 405 | 4.0%
106 | 395 | 4.0%
107 | 432 | 4.3%
108 | 388 | 3.9%
109 | 426 | 4.3%
110 | 421 | 4.2%

Maximum 10 values

Value | Count | Frequency (%)
125 | 395 | 4.0%
124 | 427 | 4.3%
122 | 428 | 4.3%
121 | 431 | 4.3%
120 | 393 | 3.9%
119 | 439 | 4.4%
118 | 451 | 4.5%
117 | 409 | 4.1%
116 | 370 | 3.7%
115 | 429 | 4.3%

측정항목 (measurement item)
Real number (ℝ)

Alerts: High correlation

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.3429
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 6
Q3: 8
95-th percentile: 9
Maximum: 9
Range: 8
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.7501411
Coefficient of variation (CV): 0.51472816
Kurtosis: -1.2036877
Mean: 5.3429
Median Absolute Deviation (MAD): 3
Skewness: -0.20849362
Sum: 53429
Variance: 7.5632759
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)

Common values

Value | Count | Frequency (%)
9 | 1699 | 17.0%
6 | 1687 | 16.9%
5 | 1666 | 16.7%
1 | 1664 | 16.6%
3 | 1650 | 16.5%
8 | 1634 | 16.3%

Minimum values

Value | Count | Frequency (%)
1 | 1664 | 16.6%
3 | 1650 | 16.5%
5 | 1666 | 16.7%
6 | 1687 | 16.9%
8 | 1634 | 16.3%
9 | 1699 | 17.0%

Maximum values

Value | Count | Frequency (%)
9 | 1699 | 17.0%
8 | 1634 | 16.3%
6 | 1687 | 16.9%
5 | 1666 | 16.7%
3 | 1650 | 16.5%
1 | 1664 | 16.6%

평균값 (average value)
Real number (ℝ)

Alerts: High correlation, Skewed

Distinct: 357
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.445015
Minimum: -9999
Maximum: 288
Zeros: 38
Zeros (%): 0.4%
Negative: 11
Negative (%): 0.1%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -9999
5-th percentile: 0.002
Q1: 0.012
Median: 0.099
Q3: 22
95-th percentile: 81
Maximum: 288
Range: 10287
Interquartile range (IQR): 21.988

Descriptive statistics

Standard deviation: 226.28532
Coefficient of variation (CV): 19.771517
Kurtosis: 1913.3001
Mean: 11.445015
Median Absolute Deviation (MAD): 0.099
Skewness: -43.289806
Sum: 114450.15
Variance: 51205.046
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
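
The extreme negative skewness and the minimum of -9999 suggest that a handful of placeholder values dominate these statistics. The sketch below is a back-of-the-envelope check that uses only figures printed in this report (the sum, the row count, and the counts of -9999.0 and -9.999 from the extreme-value listing for this variable). Treating those two values as missing-data sentinels is an assumption, not something the report or the dataset description confirms, and `mean_without_sentinels` is a hypothetical helper.

```python
# Figures copied from this report's statistics for 평균값.
n_rows = 10_000
reported_sum = 114450.15

# Assumption: these values mark missing measurements rather than real readings.
# The counts come from the extreme-value listing (5 and 6 rows respectively).
sentinel_counts = {-9999.0: 5, -9.999: 6}


def mean_without_sentinels(total, n, sentinels):
    """Recompute the mean after removing the given value -> count entries."""
    removed_sum = sum(value * count for value, count in sentinels.items())
    removed_n = sum(sentinels.values())
    return (total - removed_sum) / (n - removed_n)


adjusted_mean = mean_without_sentinels(reported_sum, n_rows, sentinel_counts)
print(f"reported mean: {reported_sum / n_rows:.3f}")  # 11.445
print(f"adjusted mean: {adjusted_mean:.3f}")          # 16.469
```

If the assumption holds, 11 rows out of 10000 pull the mean down by roughly five units, so it may be worth masking them before profiling.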

Common values

Value | Count | Frequency (%)
0.002 | 303 | 3.0%
0.007 | 268 | 2.7%
0.006 | 249 | 2.5%
0.001 | 224 | 2.2%
0.008 | 215 | 2.1%
0.005 | 212 | 2.1%
0.6 | 203 | 2.0%
0.5 | 199 | 2.0%
0.01 | 186 | 1.9%
0.009 | 185 | 1.8%
Other values (347) | 7756 | 77.6%

Minimum 10 values

Value | Count | Frequency (%)
-9999.0 | 5 | 0.1%
-9.999 | 6 | 0.1%
0.0 | 38 | 0.4%
0.001 | 224 | 2.2%
0.002 | 303 | 3.0%
0.003 | 185 | 1.8%
0.004 | 180 | 1.8%
0.005 | 212 | 2.1%
0.006 | 249 | 2.5%
0.007 | 268 | 2.7%

Maximum 10 values

Value | Count | Frequency (%)
288.0 | 1 | < 0.1%
279.0 | 1 | < 0.1%
269.0 | 1 | < 0.1%
255.0 | 2 | < 0.1%
250.0 | 1 | < 0.1%
247.0 | 3 | < 0.1%
245.0 | 1 | < 0.1%
244.0 | 1 | < 0.1%
242.0 | 1 | < 0.1%
237.0 | 1 | < 0.1%

측정기 상태 (instrument status)
Categorical

Alerts: Imbalance

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9690 | 96.9%
8 | 135 | 1.4%
9 | 84 | 0.8%
1 | 67 | 0.7%
2 | 24 | 0.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; the counts match the Common Values table]

국가 기준초과 구분 (national standard exceedance flag)
Categorical

Alerts: High correlation, Imbalance

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9560 | 95.6%
1 | 440 | 4.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; the counts match the Common Values table]

지자체 기준초과 구분 (local government standard exceedance flag)
Categorical

Alerts: High correlation, Imbalance

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9560 | 95.6%
1 | 440 | 4.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; the counts match the Common Values table]

Interactions

[Figures: 16 pairwise interaction plots; images not available in this text version]

Correlations

Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | 0.000 | 0.000 | NaN | 0.083 | 0.475 | 0.475
측정소 코드 | 0.000 | 1.000 | 0.000 | NaN | 0.360 | 0.000 | 0.000
측정항목 | 0.000 | 0.000 | 1.000 | NaN | 0.144 | 0.415 | 0.415
평균값 | NaN | NaN | NaN | 1.000 | NaN | NaN | NaN
측정기 상태 | 0.083 | 0.360 | 0.144 | NaN | 1.000 | 0.095 | 0.095
국가 기준초과 구분 | 0.475 | 0.000 | 0.415 | NaN | 0.095 | 1.000 | 1.000
지자체 기준초과 구분 | 0.475 | 0.000 | 0.415 | NaN | 0.095 | 1.000 | 1.000

Variable | 지자체 기준초과 구분 | 측정기 상태 | 국가 기준초과 구분
지자체 기준초과 구분 | 1.000 | 0.116 | 0.999
측정기 상태 | 0.116 | 1.000 | 0.116
국가 기준초과 구분 | 0.999 | 0.116 | 1.000

Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | -0.013 | -0.001 | -0.033 | 0.035 | 0.365 | 0.365
측정소 코드 | -0.013 | 1.000 | -0.005 | 0.003 | 0.157 | 0.000 | 0.000
측정항목 | -0.001 | -0.005 | 1.000 | 0.661 | 0.097 | 0.299 | 0.299
평균값 | -0.033 | 0.003 | 0.661 | 1.000 | 0.156 | 0.000 | 0.000
측정기 상태 | 0.035 | 0.157 | 0.097 | 0.156 | 1.000 | 0.116 | 0.116
국가 기준초과 구분 | 0.365 | 0.000 | 0.299 | 0.000 | 0.116 | 1.000 | 0.999
지자체 기준초과 구분 | 0.365 | 0.000 | 0.299 | 0.000 | 0.116 | 0.999 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

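First rows

Index | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
32555 | 2008011010 | 102 | 9 | 28.0 | 0 | 0 | 0
41103 | 2008011221 | 111 | 6 | 0.027 | 0 | 0 | 0
5953 | 2008010217 | 109 | 3 | 0.042 | 0 | 0 | 0
27779 | 2008010900 | 122 | 9 | 40.0 | 0 | 0 | 0
47669 | 2008011419 | 101 | 9 | 15.0 | 0 | 0 | 0
54961 | 2008011621 | 117 | 3 | 0.026 | 0 | 0 | 0
47043 | 2008011414 | 117 | 6 | 0.02 | 0 | 0 | 0
48963 | 2008011504 | 101 | 6 | 0.003 | 0 | 0 | 0
46707 | 2008011412 | 109 | 6 | 0.01 | 0 | 0 | 0
62111 | 2008011823 | 108 | 9 | 96.0 | 8 | 0 | 0

Last rows

Index | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
42665 | 2008011308 | 107 | 9 | 7.0 | 0 | 0 | 0
50552 | 2008011515 | 102 | 5 | 0.8 | 0 | 0 | 0
5414 | 2008010213 | 115 | 5 | 0.5 | 0 | 0 | 0
37263 | 2008011118 | 119 | 6 | 0.001 | 0 | 0 | 0
36884 | 2008011116 | 104 | 5 | 0.6 | 0 | 0 | 0
19874 | 2008010618 | 101 | 5 | 2.1 | 0 | 0 | 0
36500 | 2008011113 | 112 | 5 | 0.6 | 0 | 0 | 0
45331 | 2008011402 | 120 | 3 | 0.051 | 0 | 0 | 0
53618 | 2008011612 | 109 | 5 | 0.7 | 0 | 0 | 0
37775 | 2008011122 | 108 | 9 | 99.0 | 8 | 0 | 0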