Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 693.4 KiB
Average record size in memory: 71.0 B

Variable types

Numeric: 4
Categorical: 3

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15526/S/1/datasetView.do

Alerts

측정항목 is highly overall correlated with 평균값 (High correlation)
평균값 is highly overall correlated with 측정항목 and 1 other field (High correlation)
측정기 상태 is highly overall correlated with 평균값 (High correlation)
국가 기준초과 구분 is highly overall correlated with 지자체 기준초과 구분 (High correlation)
지자체 기준초과 구분 is highly overall correlated with 국가 기준초과 구분 (High correlation)
국가 기준초과 구분 is highly imbalanced (97.3%) (Imbalance)
지자체 기준초과 구분 is highly imbalanced (94.4%) (Imbalance)
평균값 has 493 (4.9%) zeros (Zeros)
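
The two imbalance percentages are consistent with a score of one minus the normalized Shannon entropy of the class counts. The sketch below assumes that definition rather than quoting the tool's source; the function name `imbalance_score` is illustrative. It is applied to the counts reported for the two flag columns (9973 vs. 27 and 9936 vs. 64).

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (bits) of k classes.

    Assumed definition: 0 means perfectly balanced classes and values near 1
    mean one class dominates. With fewer than two classes the score is set
    to 0.0, a convention chosen for this sketch.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


national = imbalance_score([9973, 27])  # counts for 국가 기준초과 구분
local = imbalance_score([9936, 64])     # counts for 지자체 기준초과 구분
print(f"{national:.1%}  {local:.1%}")
```

Under this assumption the scores round to the 97.3% and 94.4% shown in the alerts, which suggests the alert measures how far the class distribution is from uniform rather than the share of the majority class (99.7% and 99.4%).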

Reproduction

Analysis started: 2024-05-04 04:05:14.805090
Analysis finished: 2024-05-04 04:05:21.073809
Duration: 6.27 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
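
A report of this kind can be regenerated along the following lines. This is a minimal sketch, assuming pandas and ydata-profiling are installed; the function name `build_report`, the file paths, and the title are placeholders, and the actual settings used for this report are in config.json, which is not reproduced here.

```python
def build_report(csv_path, html_path, title="Profiling report"):
    """Profile a CSV file and write the result as an HTML report.

    The paths and the title are hypothetical placeholders, not values
    taken from this report.
    """
    # Imported lazily so this module loads even without the dependencies.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)               # default CSV parsing; adjust encoding if needed
    report = ProfileReport(df, title=title)  # default profiling settings
    report.to_file(html_path)                # writes the HTML report
    return html_path
```

A call would look like `build_report("data.csv", "report.html")`, where both file names are placeholders.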

Variables

측정일시 (measurement date and time)
Real number (ℝ)

Distinct: 926
Distinct (%): 9.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.9950134 × 10^9
Minimum: 1.9950101 × 10^9
Maximum: 1.9950208 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1.9950101 × 10^9
5-th percentile: 1.9950102 × 10^9
Q1: 1.995011 × 10^9
median: 1.995012 × 10^9
Q3: 1.995013 × 10^9
95-th percentile: 1.9950206 × 10^9
Maximum: 1.9950208 × 10^9
Range: 10713
Interquartile range (IQR): 1991

Descriptive statistics

Standard deviation: 3662.4289
Coefficient of variation (CV): 1.8357916 × 10^-6
Kurtosis: -0.089794021
Mean: 1.9950134 × 10^9
Median Absolute Deviation (MAD): 995
Skewness: 1.2822722
Sum: 1.9950134 × 10^13
Variance: 13413386
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
1995011013 | 24 | 0.2%
1995020807 | 22 | 0.2%
1995010604 | 21 | 0.2%
1995010911 | 21 | 0.2%
1995020208 | 21 | 0.2%
1995020705 | 21 | 0.2%
1995011720 | 20 | 0.2%
1995010306 | 20 | 0.2%
1995020212 | 19 | 0.2%
1995010612 | 19 | 0.2%
Other values (916) | 9792 | 97.9%

Value | Count | Frequency (%)
1995010100 | 8 | 0.1%
1995010101 | 11 | 0.1%
1995010102 | 10 | 0.1%
1995010103 | 9 | 0.1%
1995010104 | 8 | 0.1%
1995010105 | 11 | 0.1%
1995010106 | 15 | 0.1%
1995010107 | 12 | 0.1%
1995010108 | 18 | 0.2%
1995010109 | 10 | 0.1%

Value | Count | Frequency (%)
1995020813 | 5 | 0.1%
1995020812 | 11 | 0.1%
1995020811 | 7 | 0.1%
1995020810 | 6 | 0.1%
1995020809 | 13 | 0.1%
1995020808 | 9 | 0.1%
1995020807 | 22 | 0.2%
1995020806 | 8 | 0.1%
1995020805 | 13 | 0.1%
1995020804 | 9 | 0.1%
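
The values of 측정일시 look like hourly timestamps packed into integers as YYYYMMDDHH, which would explain why the profiler treats the column as a real number and why its mean and variance are hard to interpret. The sketch below parses the reported minimum and maximum under that assumption; the function name `parse_measurement_time` is illustrative.

```python
from datetime import datetime


def parse_measurement_time(value):
    """Parse an integer such as 1995011013 as year, month, day, hour.

    Assumes the YYYYMMDDHH layout with hours in the range 00 to 23.
    """
    return datetime.strptime(str(int(value)), "%Y%m%d%H")


first = parse_measurement_time(1995010100)  # reported minimum
last = parse_measurement_time(1995020813)   # reported maximum
hours_spanned = (last - first).total_seconds() / 3600
print(first.isoformat(), last.isoformat(), hours_spanned)
```

Under this reading the data run from 1995-01-01 00:00 to 1995-02-08 13:00, a span of 925 hours, so 926 hourly slots, which agrees with the 926 distinct values reported for the column.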

측정소 코드 (monitoring station code)
Real number (ℝ)

Distinct: 18
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 112.2664
Minimum: 102
Maximum: 124
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 102
5-th percentile: 102
Q1: 106
median: 111
Q3: 119
95-th percentile: 124
Maximum: 124
Range: 22
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 7.1365064
Coefficient of variation (CV): 0.063567607
Kurtosis: -1.3328116
Mean: 112.2664
Median Absolute Deviation (MAD): 6
Skewness: 0.24877064
Sum: 1122664
Variance: 50.929724
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=18)

Value | Count | Frequency (%)
109 | 608 | 6.1%
122 | 603 | 6.0%
117 | 589 | 5.9%
106 | 583 | 5.8%
123 | 562 | 5.6%
111 | 559 | 5.6%
104 | 559 | 5.6%
105 | 554 | 5.5%
108 | 554 | 5.5%
121 | 553 | 5.5%
Other values (8) | 4276 | 42.8%

Value | Count | Frequency (%)
102 | 535 | 5.3%
103 | 537 | 5.4%
104 | 559 | 5.6%
105 | 554 | 5.5%
106 | 583 | 5.8%
107 | 508 | 5.1%
108 | 554 | 5.5%
109 | 608 | 6.1%
110 | 544 | 5.4%
111 | 559 | 5.6%

Value | Count | Frequency (%)
124 | 530 | 5.3%
123 | 562 | 5.6%
122 | 603 | 6.0%
121 | 553 | 5.5%
119 | 527 | 5.3%
117 | 589 | 5.9%
116 | 548 | 5.5%
113 | 547 | 5.5%
111 | 559 | 5.6%
110 | 544 | 5.4%
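
Several of the statistics listed for 측정소 코드 follow directly from others, which gives a quick consistency check on the summary. The sketch below recomputes range, IQR, coefficient of variation, and variance from the mean, standard deviation, and quantiles reported above, using the usual textbook definitions (an assumption about how the tool derives them).

```python
# Values copied from the 측정소 코드 summary in this report.
mean, std = 112.2664, 7.1365064
minimum, q1, q3, maximum = 102, 106, 119, 124

derived = {
    "range": maximum - minimum,  # max minus min
    "iqr": q3 - q1,              # third quartile minus first quartile
    "cv": std / mean,            # standard deviation relative to the mean
    "variance": std ** 2,        # square of the standard deviation
}

for name, value in derived.items():
    print(f"{name}: {value:.8g}")
```

The results agree with the reported Range (22), IQR (13), CV (0.063567607), and Variance (50.929724) to the precision shown. Because the column holds station codes, these figures describe the code numbers, not a measured quantity.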

측정항목 (measurement item)
Real number (ℝ)

HIGH CORRELATION

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.3689
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
median: 6
Q3: 8
95-th percentile: 9
Maximum: 9
Range: 8
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.750158
Coefficient of variation (CV): 0.51223864
Kurtosis: -1.2067447
Mean: 5.3689
Median Absolute Deviation (MAD): 2
Skewness: -0.219871
Sum: 53689
Variance: 7.5633691
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=6)

Value | Count | Frequency (%)
8 | 1709 | 17.1%
9 | 1698 | 17.0%
5 | 1669 | 16.7%
3 | 1658 | 16.6%
1 | 1636 | 16.4%
6 | 1630 | 16.3%

Value | Count | Frequency (%)
1 | 1636 | 16.4%
3 | 1658 | 16.6%
5 | 1669 | 16.7%
6 | 1630 | 16.3%
8 | 1709 | 17.1%
9 | 1698 | 17.0%

Value | Count | Frequency (%)
9 | 1698 | 17.0%
8 | 1709 | 17.1%
6 | 1630 | 16.3%
5 | 1669 | 16.7%
3 | 1658 | 16.6%
1 | 1636 | 16.4%

평균값 (average value)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 328
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -2636.4637
Minimum: -9999
Maximum: 289
Zeros: 493
Zeros (%): 4.9%
Negative: 3110
Negative (%): 31.1%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -9999
5-th percentile: -9999
Q1: -9999
median: 0.012
Q3: 0.046
95-th percentile: 3.8
Maximum: 289
Range: 10288
Interquartile range (IQR): 9999.046

Descriptive statistics

Standard deviation: 4394.4123
Coefficient of variation (CV): -1.6667828
Kurtosis: -0.83641736
Mean: -2636.4637
Median Absolute Deviation (MAD): 0.188
Skewness: -1.0772543
Sum: -26364637
Variance: 19310859
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
-9999.0 | 2625 | 26.2%
0.0 | 493 | 4.9%
-9.999 | 335 | 3.4%
0.011 | 146 | 1.5%
-999.9 | 145 | 1.5%
0.008 | 143 | 1.4%
0.012 | 137 | 1.4%
0.009 | 136 | 1.4%
0.015 | 135 | 1.4%
0.01 | 132 | 1.3%
Other values (318) | 5573 | 55.7%

Value | Count | Frequency (%)
-9999.0 | 2625 | 26.2%
-999.9 | 145 | 1.5%
-999.8 | 1 | < 0.1%
-999.7 | 1 | < 0.1%
-999.5 | 1 | < 0.1%
-9.999 | 335 | 3.4%
-9.998 | 1 | < 0.1%
-3.5 | 1 | < 0.1%
0.0 | 493 | 4.9%
0.001 | 112 | 1.1%

Value | Count | Frequency (%)
289.0 | 1 | < 0.1%
243.0 | 1 | < 0.1%
220.0 | 1 | < 0.1%
216.0 | 1 | < 0.1%
202.0 | 1 | < 0.1%
200.0 | 1 | < 0.1%
199.0 | 1 | < 0.1%
198.0 | 1 | < 0.1%
197.0 | 1 | < 0.1%
192.0 | 1 | < 0.1%
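
The values -9999.0, -999.9, and -9.999 account for 3105 of the 3110 negative entries of 평균값 and look like placeholder codes for missing or invalid readings. The report does not say so, so treat this as an assumption. If it holds, the mean of -2636.4637 and the other moments mostly reflect the placeholders rather than real measurements. The sketch below drops the assumed codes before summarizing, using the 평균값 entries of the first ten rows of the Sample section; `SENTINELS` and `drop_sentinels` are illustrative names.

```python
from statistics import mean, median

# Assumed placeholder codes; the report does not document their meaning.
SENTINELS = {-9999.0, -999.9, -9.999}


def drop_sentinels(values, sentinels=SENTINELS):
    """Return the values that are not one of the assumed placeholder codes."""
    return [v for v in values if v not in sentinels]


# 평균값 entries of the first ten rows in the Sample section of this report.
sample = [-9999.0, -9999.0, 0.022, -999.9, -9999.0, 2.2, 0.038, 39.0, 0.005, 1.0]
valid = drop_sentinels(sample)

print(f"raw mean:     {mean(sample):.3f}")
print(f"valid mean:   {mean(valid):.3f} ({len(valid)} of {len(sample)} rows kept)")
print(f"valid median: {median(valid):.3f}")
```

Exact matching leaves near-placeholder values such as -999.8 or -9.998 in place. Whether those are also invalid readings cannot be decided from the report alone.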

측정기 상태 (instrument status)
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
0 | 6384
4 | 3486
2 | 127
1 | 3

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4
2nd row: 4
3rd row: 0
4th row: 4
5th row: 4

Common Values

Value | Count | Frequency (%)
0 | 6384 | 63.8%
4 | 3486 | 34.9%
2 | 127 | 1.3%
1 | 3 | < 0.1%

Length

Histogram of lengths of the category


국가 기준초과 구분 (national standard exceedance flag)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
0 | 9973
1 | 27

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9973 | 99.7%
1 | 27 | 0.3%

Length

Histogram of lengths of the category


지자체 기준초과 구분 (local government standard exceedance flag)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
0 | 9936
1 | 64

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9936 | 99.4%
1 | 64 | 0.6%

Length

Histogram of lengths of the category


Interactions

[16 interaction plots; images not included]

Correlations

Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | 0.019 | 0.010 | 0.079 | 0.069 | 0.008 | 0.000
측정소 코드 | 0.019 | 1.000 | 0.013 | 0.171 | 0.162 | 0.101 | 0.145
측정항목 | 0.010 | 0.013 | 1.000 | 0.216 | 0.656 | 0.156 | 0.244
평균값 | 0.079 | 0.171 | 0.216 | 1.000 | 0.573 | 0.000 | 0.000
측정기 상태 | 0.069 | 0.162 | 0.656 | 0.573 | 1.000 | 0.053 | 0.087
국가 기준초과 구분 | 0.008 | 0.101 | 0.156 | 0.000 | 0.053 | 1.000 | 0.841
지자체 기준초과 구분 | 0.000 | 0.145 | 0.244 | 0.000 | 0.087 | 0.841 | 1.000

Variable | 지자체 기준초과 구분 | 국가 기준초과 구분 | 측정기 상태
지자체 기준초과 구분 | 1.000 | 0.636 | 0.058
국가 기준초과 구분 | 0.636 | 1.000 | 0.035
측정기 상태 | 0.058 | 0.035 | 1.000

Variable | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
측정일시 | 1.000 | -0.016 | -0.004 | 0.032 | 0.028 | 0.005 | 0.000
측정소 코드 | -0.016 | 1.000 | -0.004 | -0.085 | 0.104 | 0.101 | 0.145
측정항목 | -0.004 | -0.004 | 1.000 | -0.587 | 0.485 | 0.112 | 0.175
평균값 | 0.032 | -0.085 | -0.587 | 1.000 | 0.599 | 0.029 | 0.048
측정기 상태 | 0.028 | 0.104 | 0.485 | 0.599 | 1.000 | 0.035 | 0.058
국가 기준초과 구분 | 0.005 | 0.101 | 0.112 | 0.029 | 0.035 | 1.000 | 0.636
지자체 기준초과 구분 | 0.000 | 0.145 | 0.175 | 0.048 | 0.058 | 0.636 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

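Index | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
85704 | 1995020301 | 111 | 9 | -9999.0 | 4 | 0 | 0
17705 | 1995010719 | 123 | 8 | -9999.0 | 4 | 0 | 0
88106 | 1995020323 | 121 | 3 | 0.022 | 0 | 0 | 0
74199 | 1995012915 | 102 | 5 | -999.9 | 4 | 0 | 0
4956 | 1995010221 | 122 | 9 | -9999.0 | 4 | 0 | 0
23055 | 1995010921 | 110 | 5 | 2.2 | 0 | 0 | 0
29060 | 1995011205 | 103 | 3 | 0.038 | 0 | 0 | 0
13337 | 1995010603 | 110 | 8 | 39.0 | 0 | 0 | 0
75638 | 1995013004 | 108 | 3 | 0.005 | 0 | 0 | 0
31365 | 1995011302 | 109 | 5 | 1.0 | 0 | 0 | 0

Index | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분
97338 | 1995020713 | 106 | 9 | -9999.0 | 4 | 0 | 0
90121 | 1995020418 | 110 | 1 | 0.015 | 0 | 0 | 0
48005 | 1995011912 | 110 | 8 | 192.0 | 0 | 1 | 1
98049 | 1995020719 | 122 | 5 | 0.8 | 0 | 0 | 0
33025 | 1995011317 | 121 | 1 | 0.006 | 0 | 0 | 0
58471 | 1995012313 | 109 | 1 | 0.019 | 0 | 0 | 0
84757 | 1995020216 | 121 | 1 | 0.029 | 0 | 0 | 0
36184 | 1995011423 | 102 | 6 | 0.011 | 0 | 0 | 0
91568 | 1995020507 | 122 | 3 | 0.045 | 0 | 0 | 0
44455 | 1995011803 | 116 | 1 | 0.028 | 0 | 0 | 0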