Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 693.4 KiB
Average record size in memory: 71.0 B

Variable types

Numeric: 4
Categorical: 3

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15526/S/1/datasetView.do

Alerts

지자체 기준초과 구분 (local government standard exceedance flag) is highly overall correlated with 국가 기준초과 구분 (national standard exceedance flag) [High correlation]
국가 기준초과 구분 is highly overall correlated with 지자체 기준초과 구분 [High correlation]
측정항목 (measurement item) is highly overall correlated with 평균값 (average value) [High correlation]
평균값 is highly overall correlated with 측정항목 [High correlation]
측정기 상태 (instrument status) is highly imbalanced (91.6%) [Imbalance]
국가 기준초과 구분 is highly imbalanced (78.0%) [Imbalance]
지자체 기준초과 구분 is highly imbalanced (78.0%) [Imbalance]
평균값 is highly skewed (γ1 = -59.15126768) [Skewed]
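
The imbalance percentages in the alerts are consistent with a score of one minus the normalized Shannon entropy of the class distribution. The sketch below checks this against the class counts listed in the Common Values tables of this report. The function name `imbalance_score` is hypothetical, and the formula is a reconstruction that matches the reported numbers, not necessarily the exact implementation in ydata-profiling.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy (in bits) of the
    class distribution and H_max = log2(number of classes)."""
    if len(counts) < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(len(counts))

# Class counts copied from the Common Values tables in this report.
status_counts = [9839, 117, 44]   # 측정기 상태
exceed_counts = [9647, 353]       # 국가 / 지자체 기준초과 구분

print(f"{imbalance_score(status_counts):.1%}")  # 91.6%
print(f"{imbalance_score(exceed_counts):.1%}")  # 78.0%
```

A score of 0% would mean perfectly balanced classes, and a score near 100% means almost all rows fall into a single class.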

Reproduction

Analysis started: 2024-04-27 12:02:29.464980
Analysis finished: 2024-04-27 12:02:38.427120
Duration: 8.96 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

측정일시 (measurement date and time)
Real number (ℝ)

Distinct: 449
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.021011 × 10^9
Minimum: 2.0210101 × 10^9
Maximum: 2.0210119 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2.0210101 × 10^9
5-th percentile: 2.0210101 × 10^9
Q1: 2.0210105 × 10^9
Median: 2.021011 × 10^9
Q3: 2.0210115 × 10^9
95-th percentile: 2.0210118 × 10^9
Maximum: 2.0210119 × 10^9
Range: 1816
Interquartile range (IQR): 983

Descriptive statistics

Standard deviation: 540.2197
Coefficient of variation (CV): 2.6730171 × 10^-7
Kurtosis: -1.1948509
Mean: 2.021011 × 10^9
Median Absolute Deviation (MAD): 491.5
Skewness: -0.0073428005
Sum: 2.021011 × 10^13
Variance: 291837.32
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
2021010422 | 42 | 0.4%
2021011111 | 34 | 0.3%
2021010113 | 34 | 0.3%
2021010306 | 33 | 0.3%
2021010318 | 32 | 0.3%
2021010214 | 32 | 0.3%
2021011020 | 32 | 0.3%
2021011319 | 32 | 0.3%
2021011504 | 31 | 0.3%
2021010114 | 31 | 0.3%
Other values (439) | 9667 | 96.7%

Minimum 10 values

Value | Count | Frequency (%)
2021010100 | 15 | 0.1%
2021010101 | 19 | 0.2%
2021010102 | 25 | 0.2%
2021010103 | 20 | 0.2%
2021010104 | 28 | 0.3%
2021010105 | 18 | 0.2%
2021010106 | 14 | 0.1%
2021010107 | 25 | 0.2%
2021010108 | 19 | 0.2%
2021010109 | 25 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
2021011916 | 24 | 0.2%
2021011915 | 24 | 0.2%
2021011914 | 29 | 0.3%
2021011913 | 16 | 0.2%
2021011912 | 20 | 0.2%
2021011911 | 29 | 0.3%
2021011910 | 19 | 0.2%
2021011909 | 21 | 0.2%
2021011908 | 23 | 0.2%
2021011907 | 16 | 0.2%
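
The values of 측정일시 look like timestamps packed into integers in the form YYYYMMDDHH, which the profiler has treated as a real number. That reading is an assumption based on the values listed above; the report itself does not state the encoding. Under that assumption, numeric statistics such as the mean, range, and variance of this column carry little meaning, and the column is better parsed before analysis. A minimal sketch using only the standard library (the helper name `parse_measurement_time` is hypothetical):

```python
from datetime import datetime

def parse_measurement_time(value):
    """Parse a YYYYMMDDHH integer (for example 2021010422) into a datetime.
    Assumes the hour field runs from 00 to 23."""
    return datetime.strptime(str(int(value)), "%Y%m%d%H")

first = parse_measurement_time(2021010100)  # minimum reported above
last = parse_measurement_time(2021011916)   # maximum reported above

elapsed = last - first
hours = int(elapsed.total_seconds() // 3600)

print(first, last)
print(elapsed)                   # 18 days, 16:00:00
print(hours + 1)                 # 449 hourly slots, matching the Distinct count
print(2021011916 - 2021010100)   # 1816, the numeric "Range" shown in the report
```

The span covers 449 hourly slots, which agrees with the 449 distinct values reported for this column, while the numeric range of 1816 is only an artifact of the integer encoding.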

측정소 코드 (station code)
Real number (ℝ)

Distinct: 25
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 113.1735
Minimum: 101
Maximum: 125
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 101
5-th percentile: 102
Q1: 107
Median: 113
Q3: 119
95-th percentile: 124
Maximum: 125
Range: 24
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 7.232263
Coefficient of variation (CV): 0.063904209
Kurtosis: -1.2107478
Mean: 113.1735
Median Absolute Deviation (MAD): 6
Skewness: -0.033927399
Sum: 1131735
Variance: 52.305628
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=25)
Common values

Value | Count | Frequency (%)
124 | 428 | 4.3%
121 | 427 | 4.3%
120 | 423 | 4.2%
106 | 418 | 4.2%
123 | 417 | 4.2%
116 | 417 | 4.2%
102 | 416 | 4.2%
125 | 412 | 4.1%
118 | 408 | 4.1%
119 | 406 | 4.1%
Other values (15) | 5828 | 58.3%

Minimum 10 values

Value | Count | Frequency (%)
101 | 374 | 3.7%
102 | 416 | 4.2%
103 | 402 | 4.0%
104 | 373 | 3.7%
105 | 383 | 3.8%
106 | 418 | 4.2%
107 | 372 | 3.7%
108 | 375 | 3.8%
109 | 388 | 3.9%
110 | 394 | 3.9%

Maximum 10 values

Value | Count | Frequency (%)
125 | 412 | 4.1%
124 | 428 | 4.3%
123 | 417 | 4.2%
122 | 387 | 3.9%
121 | 427 | 4.3%
120 | 423 | 4.2%
119 | 406 | 4.1%
118 | 408 | 4.1%
117 | 400 | 4.0%
116 | 417 | 4.2%
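
Several of the derived statistics for 측정소 코드 follow directly from the basic ones, so they can be cross-checked by hand. The short sketch below recomputes the coefficient of variation, interquartile range, and range from the figures reported above; all inputs are copied from this report and nothing is read from the underlying data.

```python
import math

# Values copied from the 측정소 코드 statistics in this report.
mean = 113.1735
std_dev = 7.232263
variance = 52.305628
q1, q3 = 107, 119
minimum, maximum = 101, 125

cv = std_dev / mean              # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(round(cv, 6), iqr, value_range)
# The variance should equal the squared standard deviation.
print(math.isclose(std_dev ** 2, variance, rel_tol=1e-6))
```

Because the column holds station identifiers, these numeric summaries describe how the codes are distributed rather than any physical quantity.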

측정항목 (measurement item)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.3416
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 6
Q3: 8
95-th percentile: 9
Maximum: 9
Range: 8
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.7356275
Coefficient of variation (CV): 0.51213634
Kurtosis: -1.190938
Mean: 5.3416
Median Absolute Deviation (MAD): 2
Skewness: -0.20544676
Sum: 53416
Variance: 7.4836578
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Common values

Value | Count | Frequency (%)
6 | 1712 | 17.1%
5 | 1684 | 16.8%
9 | 1680 | 16.8%
3 | 1670 | 16.7%
1 | 1634 | 16.3%
8 | 1620 | 16.2%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1634 | 16.3%
3 | 1670 | 16.7%
5 | 1684 | 16.8%
6 | 1712 | 17.1%
8 | 1620 | 16.2%
9 | 1680 | 16.8%

Maximum 10 values

Value | Count | Frequency (%)
9 | 1680 | 16.8%
8 | 1620 | 16.2%
6 | 1712 | 17.1%
5 | 1684 | 16.8%
3 | 1670 | 16.7%
1 | 1634 | 16.3%

평균값 (average value)
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 238
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.4306731
Minimum: -9999
Maximum: 1985
Zeros: 2
Zeros (%): < 0.1%
Negative: 10
Negative (%): 0.1%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -9999
5-th percentile: 0.003
Q1: 0.012
Median: 0.065
Q3: 15
95-th percentile: 52
Maximum: 1985
Range: 11984
Interquartile range (IQR): 14.988

Descriptive statistics

Standard deviation: 149.56717
Coefficient of variation (CV): 15.85965
Kurtosis: 4020.3883
Mean: 9.4306731
Median Absolute Deviation (MAD): 0.064
Skewness: -59.151268
Sum: 94306.731
Variance: 22370.34
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value | Count | Frequency (%)
0.003 | 777 | 7.8%
0.004 | 604 | 6.0%
0.4 | 460 | 4.6%
0.5 | 339 | 3.4%
0.002 | 318 | 3.2%
0.6 | 238 | 2.4%
0.005 | 228 | 2.3%
0.7 | 209 | 2.1%
0.3 | 143 | 1.4%
13.0 | 131 | 1.3%
Other values (228) | 6553 | 65.5%

Minimum 10 values

Value | Count | Frequency (%)
-9999.0 | 2 | < 0.1%
-1000.0 | 2 | < 0.1%
-100.0 | 2 | < 0.1%
-9.999 | 3 | < 0.1%
-0.016 | 1 | < 0.1%
0.0 | 2 | < 0.1%
0.001 | 31 | 0.3%
0.002 | 318 | 3.2%
0.003 | 777 | 7.8%
0.004 | 604 | 6.0%

Maximum 10 values

Value | Count | Frequency (%)
1985.0 | 3 | < 0.1%
985.0 | 5 | 0.1%
672.0 | 1 | < 0.1%
666.0 | 1 | < 0.1%
287.0 | 1 | < 0.1%
210.0 | 1 | < 0.1%
209.0 | 1 | < 0.1%
188.0 | 1 | < 0.1%
182.0 | 1 | < 0.1%
172.0 | 1 | < 0.1%
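
The lowest values of 평균값 (-9999, -1000, -100, -9.999) look like placeholder or error codes rather than measurements, and a handful of such values is enough to produce the extreme negative skewness and kurtosis reported above. The sketch below illustrates the effect on a small made-up sample. Treating those four values as sentinels is an assumption, the toy data is hypothetical and not drawn from the dataset, and the skewness estimator shown (the population moment ratio) may differ slightly from the one the profiler uses.

```python
# Assumed placeholder codes, taken from the lowest values listed in this report.
# The report does not confirm that these are sentinels.
SENTINELS = {-9999.0, -1000.0, -100.0, -9.999}

def drop_sentinels(values, sentinels=SENTINELS):
    """Return the values with assumed placeholder codes removed."""
    return [v for v in values if v not in sentinels]

def skewness(values):
    """Population skewness g1 = m3 / m2**1.5 (moment-ratio estimator)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical toy sample: mostly small readings plus one placeholder code.
raw = [0.003] * 50 + [0.4] * 30 + [15.0] * 15 + [52.0] * 5 + [-9999.0]
clean = drop_sentinels(raw)

print(len(raw), len(clean))
print(round(skewness(raw), 2))    # strongly negative because of the single -9999
print(round(skewness(clean), 2))  # positive once the placeholder is removed
```

After removing the placeholder, the toy sample is right-skewed, which is the shape suggested by the quantiles in this report (median 0.065, 95-th percentile 52).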

측정기 상태 (instrument status)
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Category counts: 0 (9839), 9 (117), 1 (44)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9839 | 98.4%
9 | 117 | 1.2%
1 | 44 | 0.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot; same counts as the Common Values table]

국가 기준초과 구분 (national standard exceedance flag)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Category counts: 0 (9647), 1 (353)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 1
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9647 | 96.5%
1 | 353 | 3.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot; same counts as the Common Values table]

지자체 기준초과 구분 (local government standard exceedance flag)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Category counts: 0 (9647), 1 (353)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 1
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9647 | 96.5%
1 | 353 | 3.5%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot; same counts as the Common Values table]

Interactions

[Figures: 16 pairwise interaction plots]

Correlations

 | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | 0.000 | 0.000 | 0.080 | 0.097 | 0.352 | 0.352
측정소 코드 | 0.000 | 1.000 | 0.033 | 0.089 | 0.153 | 0.026 | 0.026
측정항목 | 0.000 | 0.033 | 1.000 | 0.069 | 0.166 | 0.490 | 0.490
평균값 | 0.080 | 0.089 | 0.069 | 1.000 | 0.535 | 0.089 | 0.089
측정기 상태 | 0.097 | 0.153 | 0.166 | 0.535 | 1.000 | 0.047 | 0.047
국가 기준초과 구분 | 0.352 | 0.026 | 0.490 | 0.089 | 0.047 | 1.000 | 1.000
지자체 기준초과 구분 | 0.352 | 0.026 | 0.490 | 0.089 | 0.047 | 1.000 | 1.000

 | 지자체 기준초과 구분 | 측정기 상태 | 국가 기준초과 구분
지자체 기준초과 구분 | 1.000 | 0.079 | 0.999
측정기 상태 | 0.079 | 1.000 | 0.079
국가 기준초과 구분 | 0.999 | 0.079 | 1.000

 | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | -0.004 | -0.002 | 0.031 | 0.057 | 0.270 | 0.270
측정소 코드 | -0.004 | 1.000 | 0.008 | 0.009 | 0.092 | 0.020 | 0.020
측정항목 | -0.002 | 0.008 | 1.000 | 0.772 | 0.069 | 0.353 | 0.353
평균값 | 0.031 | 0.009 | 0.772 | 1.000 | 0.269 | 0.147 | 0.147
측정기 상태 | 0.057 | 0.092 | 0.069 | 0.269 | 1.000 | 0.079 | 0.079
국가 기준초과 구분 | 0.270 | 0.020 | 0.353 | 0.147 | 0.079 | 1.000 | 0.999
지자체 기준초과 구분 | 0.270 | 0.020 | 0.353 | 0.147 | 0.079 | 0.999 | 1.000
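
The two variable pairs flagged as "High correlation" in the Alerts section can be recovered from the last correlation matrix in this section by scanning for off-diagonal entries above a cutoff. The sketch below does this with a cutoff of 0.5; that threshold and the helper name `high_correlation_pairs` are assumptions for illustration and are not taken from the report's configuration.

```python
from itertools import combinations

columns = ["측정일시", "측정소 코드", "측정항목", "평균값",
           "측정기 상태", "국가 기준초과 구분", "지자체 기준초과 구분"]

# Last correlation matrix of this section, copied row by row.
matrix = [
    [1.000, -0.004, -0.002, 0.031, 0.057, 0.270, 0.270],
    [-0.004, 1.000, 0.008, 0.009, 0.092, 0.020, 0.020],
    [-0.002, 0.008, 1.000, 0.772, 0.069, 0.353, 0.353],
    [0.031, 0.009, 0.772, 1.000, 0.269, 0.147, 0.147],
    [0.057, 0.092, 0.069, 0.269, 1.000, 0.079, 0.079],
    [0.270, 0.020, 0.353, 0.147, 0.079, 1.000, 0.999],
    [0.270, 0.020, 0.353, 0.147, 0.079, 0.999, 1.000],
]

def high_correlation_pairs(names, corr, threshold=0.5):
    """Return (name_a, name_b, value) for each pair with |value| >= threshold.
    The 0.5 default is an assumed cutoff, not read from config.json."""
    pairs = []
    for i, j in combinations(range(len(names)), 2):
        if abs(corr[i][j]) >= threshold:
            pairs.append((names[i], names[j], corr[i][j]))
    return pairs

for a, b, value in high_correlation_pairs(columns, matrix):
    print(f"{a} <-> {b}: {value:.3f}")
```

With this cutoff the scan returns 측정항목 with 평균값 (0.772) and 국가 기준초과 구분 with 지자체 기준초과 구분 (0.999), the same pairs listed in the alerts. The near-perfect agreement of the two exceedance flags suggests one of them is redundant for modelling.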

Missing values

[Figure: count plot] A simple visualization of nullity by column.
[Figure: nullity matrix] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

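First rows

(index) | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
51207 | 2021011505 | 110 | 6 | 0.002 | 0 | 0 | 0
23549 | 2021010712 | 125 | 9 | 14.0 | 0 | 0 | 0
4479 | 2021010205 | 122 | 6 | 0.022 | 0 | 0 | 0
52835 | 2021011516 | 106 | 9 | 59.0 | 0 | 1 | 1
34250 | 2021011012 | 109 | 5 | 0.6 | 0 | 0 | 0
5429 | 2021010212 | 105 | 9 | 6.0 | 0 | 0 | 0
12070 | 2021010408 | 112 | 8 | 37.0 | 0 | 0 | 0
65828 | 2021011906 | 122 | 5 | 0.3 | 0 | 0 | 0
56800 | 2021011618 | 117 | 8 | 30.0 | 0 | 0 | 0
64886 | 2021011900 | 115 | 5 | 0.4 | 0 | 0 | 0

Last rows

(index) | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
35487 | 2021011020 | 115 | 6 | 0.008 | 0 | 0 | 0
7227 | 2021010300 | 105 | 6 | 0.028 | 0 | 0 | 0
9288 | 2021010313 | 124 | 1 | 0.005 | 0 | 0 | 0
8216 | 2021010306 | 120 | 5 | 0.4 | 0 | 0 | 0
49968 | 2021011421 | 104 | 1 | 0.003 | 0 | 0 | 0
65336 | 2021011903 | 115 | 5 | 0.4 | 0 | 0 | 0
59611 | 2021011713 | 111 | 3 | 0.017 | 0 | 0 | 0
30775 | 2021010913 | 105 | 3 | 0.01 | 0 | 0 | 0
57854 | 2021011701 | 118 | 5 | 0.4 | 0 | 0 | 0
11543 | 2021010404 | 124 | 9 | 22.0 | 0 | 0 | 0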