Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 693.4 KiB
Average record size in memory: 71.0 B

Variable types

Numeric: 4
Categorical: 3

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15526/S/1/datasetView.do

Alerts

지자체 기준초과 구분 is highly overall correlated with 측정항목 and 1 other field (High correlation)
국가 기준초과 구분 is highly overall correlated with 측정항목 and 1 other field (High correlation)
측정항목 is highly overall correlated with 평균값 and 2 other fields (High correlation)
평균값 is highly overall correlated with 측정항목 (High correlation)
측정기 상태 is highly imbalanced (93.7%) (Imbalance)
국가 기준초과 구분 is highly imbalanced (62.6%) (Imbalance)
지자체 기준초과 구분 is highly imbalanced (62.6%) (Imbalance)
평균값 is highly skewed (γ1 = -46.7294729) (Skewed)
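
The imbalance percentages in the alerts above appear consistent with one minus the normalized Shannon entropy of the category counts. The sketch below is an assumption about how the score is derived, not a quotation of the library's code; the function name `imbalance_score` is hypothetical, and the counts are the ones reported in this report for 측정기 상태 and 국가 기준초과 구분.

```python
import math


def imbalance_score(counts):
    """Return 1 - H/log2(k), where H is the Shannon entropy (base 2)
    of the class frequencies and k is the number of classes.
    0.0 means perfectly balanced, values near 1.0 mean one class dominates."""
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Counts taken from the "Common Values" tables in this report.
status_counts = [9888, 59, 53]   # 측정기 상태
national_counts = [9278, 722]    # 국가 기준초과 구분

print(f"{imbalance_score(status_counts):.1%}")
print(f"{imbalance_score(national_counts):.1%}")
```

Under this assumed formula, a two-class column needs roughly a 93/7 split to reach the 62.6% reported here, which is why the score reads lower than the share of the majority class.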

Reproduction

Analysis started: 2024-04-27 12:02:46.093435
Analysis finished: 2024-04-27 12:02:52.539130
Duration: 6.45 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
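
A report like this one is typically produced by passing a DataFrame to `ProfileReport`. The following is a minimal sketch, not the exact command used for this report: the function name `build_profile`, the file paths and the title are hypothetical, and the CSV is assumed to load with default `read_csv` settings (the source file may need an explicit `encoding`).

```python
def build_profile(csv_path: str, html_path: str) -> None:
    """Profile a CSV file and write the HTML report to html_path.

    Imports are done inside the function so that merely defining it
    does not require pandas or ydata-profiling to be installed.
    """
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)  # assumption: default parsing options suffice
    report = ProfileReport(df, title="Air quality measurements")  # hypothetical title
    report.to_file(html_path)


# Example call (paths are placeholders):
# build_profile("measurements.csv", "report.html")
```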

Variables

측정일시 (measurement date and time)
Real number (ℝ)

Distinct: 448
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.020011 × 10^9
Minimum: 2.0200101 × 10^9
Maximum: 2.0200119 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2.0200101 × 10^9
5-th percentile: 2.0200101 × 10^9
Q1: 2.0200105 × 10^9
Median: 2.020011 × 10^9
Q3: 2.0200114 × 10^9
95-th percentile: 2.0200118 × 10^9
Maximum: 2.0200119 × 10^9
Range: 1815
Interquartile range (IQR): 906

Descriptive statistics

Standard deviation: 536.69788
Coefficient of variation (CV): 2.6569057 × 10^-7
Kurtosis: -1.1861743
Mean: 2.020011 × 10^9
Median Absolute Deviation (MAD): 487
Skewness: 0.025820608
Sum: 2.020011 × 10^13
Variance: 288044.62
Monotonicity: Not monotonic

Figure: Histogram with fixed size bins (bins=50)
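Common values

| Value | Count | Frequency (%) |
|---|---|---|
| 2020010609 | 36 | 0.4% |
| 2020010306 | 36 | 0.4% |
| 2020011709 | 35 | 0.4% |
| 2020011816 | 35 | 0.4% |
| 2020011116 | 34 | 0.3% |
| 2020010900 | 33 | 0.3% |
| 2020011316 | 33 | 0.3% |
| 2020011100 | 32 | 0.3% |
| 2020010507 | 32 | 0.3% |
| 2020011407 | 32 | 0.3% |
| Other values (438) | 9662 | 96.6% |

Minimum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 2020010100 | 16 | 0.2% |
| 2020010101 | 21 | 0.2% |
| 2020010102 | 22 | 0.2% |
| 2020010103 | 22 | 0.2% |
| 2020010104 | 26 | 0.3% |
| 2020010105 | 27 | 0.3% |
| 2020010106 | 22 | 0.2% |
| 2020010107 | 30 | 0.3% |
| 2020010108 | 20 | 0.2% |
| 2020010109 | 23 | 0.2% |

Maximum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 2020011915 | 9 | 0.1% |
| 2020011914 | 21 | 0.2% |
| 2020011913 | 22 | 0.2% |
| 2020011912 | 22 | 0.2% |
| 2020011911 | 24 | 0.2% |
| 2020011910 | 22 | 0.2% |
| 2020011909 | 18 | 0.2% |
| 2020011908 | 24 | 0.2% |
| 2020011907 | 22 | 0.2% |
| 2020011906 | 22 | 0.2% |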
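
The values of 측정일시 look like hourly timestamps packed into integers as YYYYMMDDHH, which would explain why the profiler typed the column as a real number and why statistics such as Range (1815) have no direct meaning as a time span. The sketch below parses that encoding with the standard library; the format string is an assumption inferred from values such as 2020010609, and `parse_measurement_hour` is a hypothetical helper name.

```python
from datetime import datetime


def parse_measurement_hour(value: int) -> datetime:
    """Parse an integer such as 2020010609 (assumed YYYYMMDDHH) into a datetime."""
    return datetime.strptime(str(value), "%Y%m%d%H")


# Minimum and maximum values reported for 측정일시.
first = parse_measurement_hour(2020010100)
last = parse_measurement_hour(2020011915)

# Number of hourly slots between the first and last timestamp, inclusive.
hourly_slots = int((last - first).total_seconds() // 3600) + 1
print(first, last, hourly_slots)
```

Under this reading, the inclusive hour count between the reported minimum and maximum equals the 448 distinct values listed for the column, which suggests every hour in the covered period is present.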

측정소 코드 (station code)
Real number (ℝ)

Distinct: 25
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 113.055
Minimum: 101
Maximum: 125
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 101
5-th percentile: 102
Q1: 107
Median: 113
Q3: 119
95-th percentile: 124
Maximum: 125
Range: 24
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 7.2534568
Coefficient of variation (CV): 0.064158656
Kurtosis: -1.2174689
Mean: 113.055
Median Absolute Deviation (MAD): 6
Skewness: -0.011458139
Sum: 1130550
Variance: 52.612636
Monotonicity: Not monotonic

Figure: Histogram with fixed size bins (bins=25)
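Common values

| Value | Count | Frequency (%) |
|---|---|---|
| 122 | 431 | 4.3% |
| 119 | 426 | 4.3% |
| 101 | 419 | 4.2% |
| 113 | 418 | 4.2% |
| 107 | 415 | 4.2% |
| 124 | 412 | 4.1% |
| 125 | 410 | 4.1% |
| 106 | 404 | 4.0% |
| 111 | 404 | 4.0% |
| 118 | 404 | 4.0% |
| Other values (15) | 5857 | 58.6% |

Minimum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 101 | 419 | 4.2% |
| 102 | 384 | 3.8% |
| 103 | 401 | 4.0% |
| 104 | 403 | 4.0% |
| 105 | 395 | 4.0% |
| 106 | 404 | 4.0% |
| 107 | 415 | 4.2% |
| 108 | 398 | 4.0% |
| 109 | 349 | 3.5% |
| 110 | 393 | 3.9% |

Maximum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 125 | 410 | 4.1% |
| 124 | 412 | 4.1% |
| 123 | 399 | 4.0% |
| 122 | 431 | 4.3% |
| 121 | 392 | 3.9% |
| 120 | 403 | 4.0% |
| 119 | 426 | 4.3% |
| 118 | 404 | 4.0% |
| 117 | 387 | 3.9% |
| 116 | 370 | 3.7% |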
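
Several of the descriptive statistics reported for 측정소 코드 are simple functions of one another, which gives a quick sanity check on the extracted numbers. The sketch below recomputes the coefficient of variation, variance, range and IQR from the reported mean, standard deviation and quantiles; all inputs are copied from this report, and the variable names are illustrative.

```python
# Values reported for 측정소 코드 in this report.
mean = 113.055
std = 7.2534568
minimum, q1, q3, maximum = 101, 107, 119, 125

cv = std / mean            # coefficient of variation = std / mean
variance = std ** 2        # variance = std squared
value_range = maximum - minimum
iqr = q3 - q1              # interquartile range = Q3 - Q1

print(f"CV={cv:.9f} variance={variance:.6f} range={value_range} IQR={iqr}")
```

Because the station code is an identifier rather than a measurement, these statistics mainly confirm that the codes 101 to 125 are spread roughly evenly, not that the column carries numeric meaning.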

측정항목 (measurement item)
Real number (ℝ)

Alerts: High correlation

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.3459
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 6
Q3: 8
95-th percentile: 9
Maximum: 9
Range: 8
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.7437576
Coefficient of variation (CV): 0.51324522
Kurtosis: -1.1976167
Mean: 5.3459
Median Absolute Deviation (MAD): 2
Skewness: -0.21159593
Sum: 53459
Variance: 7.528206
Monotonicity: Not monotonic

Figure: Histogram with fixed size bins (bins=6)
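Common values

| Value | Count | Frequency (%) |
|---|---|---|
| 6 | 1712 | 17.1% |
| 9 | 1683 | 16.8% |
| 5 | 1657 | 16.6% |
| 3 | 1653 | 16.5% |
| 1 | 1652 | 16.5% |
| 8 | 1643 | 16.4% |

Minimum values

| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 1652 | 16.5% |
| 3 | 1653 | 16.5% |
| 5 | 1657 | 16.6% |
| 6 | 1712 | 17.1% |
| 8 | 1643 | 16.4% |
| 9 | 1683 | 16.8% |

Maximum values

| Value | Count | Frequency (%) |
|---|---|---|
| 9 | 1683 | 16.8% |
| 8 | 1643 | 16.4% |
| 6 | 1712 | 17.1% |
| 5 | 1657 | 16.6% |
| 3 | 1653 | 16.5% |
| 1 | 1652 | 16.5% |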

평균값 (mean value)
Real number (ℝ)

Alerts: High correlation, Skewed

Distinct: 219
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.210505
Minimum: -9999
Maximum: 1985
Zeros: 26
Zeros (%): 0.3%
Negative: 4
Negative (%): < 0.1%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -9999
5-th percentile: 0.003
Q1: 0.009
Median: 0.066
Q3: 26
95-th percentile: 58
Maximum: 1985
Range: 11984
Interquartile range (IQR): 25.991

Descriptive statistics

Standard deviation: 204.49678
Coefficient of variation (CV): 20.028077
Kurtosis: 2295.603
Mean: 10.210505
Median Absolute Deviation (MAD): 0.066
Skewness: -46.729473
Sum: 102105.05
Variance: 41818.933
Monotonicity: Not monotonic

Figure: Histogram with fixed size bins (bins=50)
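Common values

| Value | Count | Frequency (%) |
|---|---|---|
| 0.003 | 715 | 7.1% |
| 0.004 | 661 | 6.6% |
| 0.002 | 390 | 3.9% |
| 0.005 | 298 | 3.0% |
| 0.6 | 282 | 2.8% |
| 0.7 | 256 | 2.6% |
| 0.8 | 241 | 2.4% |
| 0.5 | 198 | 2.0% |
| 0.9 | 181 | 1.8% |
| 0.006 | 156 | 1.6% |
| Other values (209) | 6622 | 66.2% |

Minimum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| -9999.0 | 4 | < 0.1% |
| 0.0 | 26 | 0.3% |
| 0.001 | 49 | 0.5% |
| 0.002 | 390 | 3.9% |
| 0.003 | 715 | 7.1% |
| 0.004 | 661 | 6.6% |
| 0.005 | 298 | 3.0% |
| 0.006 | 156 | 1.6% |
| 0.007 | 104 | 1.0% |
| 0.008 | 80 | 0.8% |

Maximum 10 values

| Value | Count | Frequency (%) |
|---|---|---|
| 1985.0 | 1 | < 0.1% |
| 985.0 | 7 | 0.1% |
| 906.0 | 1 | < 0.1% |
| 699.0 | 1 | < 0.1% |
| 683.0 | 1 | < 0.1% |
| 677.0 | 1 | < 0.1% |
| 220.0 | 1 | < 0.1% |
| 140.0 | 1 | < 0.1% |
| 133.0 | 1 | < 0.1% |
| 129.0 | 1 | < 0.1% |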
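
The four -9999 entries in 평균값 are the only negative values and sit far from the rest of the distribution, so they may be a missing-value sentinel rather than real measurements. That interpretation is an assumption; the report itself does not say so. Under it, the sketch below shows how much those four rows pull the reported mean down, using only the Sum and counts from this report, plus a hypothetical helper `mask_sentinel` for cleaning raw values before profiling.

```python
SENTINEL = -9999.0  # assumption: -9999 marks a missing measurement


def mask_sentinel(values, sentinel=SENTINEL):
    """Replace sentinel entries with None so they can be treated as missing."""
    return [None if v == sentinel else v for v in values]


def mean_without_sentinel(total, n_rows, n_sentinel, sentinel=SENTINEL):
    """Recompute the mean after removing n_sentinel sentinel rows from the sum."""
    return (total - n_sentinel * sentinel) / (n_rows - n_sentinel)


# Figures reported for 평균값: Sum 102105.05 over 10000 rows, 4 rows equal to -9999.
cleaned_mean = mean_without_sentinel(102105.05, 10000, 4)
print(f"reported mean: 10.210505, mean without sentinel rows: {cleaned_mean:.4f}")

print(mask_sentinel([0.003, -9999.0, 26.0]))
```

If the assumption holds, the strongly negative skewness (γ1 = -46.7) would also largely be an artifact of these four rows, and re-profiling after masking them should give a more representative picture.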

측정기 상태 (instrument status)
Categorical

Alerts: Imbalance

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Category counts: 0 (9888), 9 (59), 1 (53)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 9
3rd row: 0
4th row: 0
5th row: 0

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 9888 | 98.9% |
| 9 | 59 | 0.6% |
| 1 | 53 | 0.5% |

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: plot of the common values listed above

국가 기준초과 구분 (national standard exceedance flag)
Categorical

Alerts: High correlation, Imbalance

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Category counts: 0 (9278), 1 (722)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 9278 | 92.8% |
| 1 | 722 | 7.2% |

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: plot of the common values listed above

지자체 기준초과 구분 (local government standard exceedance flag)
Categorical

Alerts: High correlation, Imbalance

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Category counts: 0 (9277), 1 (723)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 9277 | 92.8% |
| 1 | 723 | 7.2% |

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: plot of the common values listed above

Interactions

Figure: interaction plots (16 panels)

Correlations

| | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분 |
|---|---|---|---|---|---|---|---|
| 측정일시 | 1.000 | 0.000 | 0.000 | 0.021 | 0.096 | 0.212 | 0.211 |
| 측정소 코드 | 0.000 | 1.000 | 0.000 | 0.020 | 0.100 | 0.108 | 0.108 |
| 측정항목 | 0.000 | 0.000 | 1.000 | 0.051 | 0.121 | 0.788 | 0.787 |
| 평균값 | 0.021 | 0.020 | 0.051 | 1.000 | 0.174 | 0.157 | 0.157 |
| 측정기 상태 | 0.096 | 0.100 | 0.121 | 0.174 | 1.000 | 0.029 | 0.033 |
| 국가 기준초과 구분 | 0.212 | 0.108 | 0.788 | 0.157 | 0.029 | 1.000 | 1.000 |
| 지자체 기준초과 구분 | 0.211 | 0.108 | 0.787 | 0.157 | 0.033 | 1.000 | 1.000 |

| | 지자체 기준초과 구분 | 측정기 상태 | 국가 기준초과 구분 |
|---|---|---|---|
| 지자체 기준초과 구분 | 1.000 | 0.054 | 0.999 |
| 측정기 상태 | 0.054 | 1.000 | 0.049 |
| 국가 기준초과 구분 | 0.999 | 0.049 | 1.000 |

| | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분 |
|---|---|---|---|---|---|---|---|
| 측정일시 | 1.000 | -0.010 | -0.000 | 0.009 | 0.057 | 0.162 | 0.162 |
| 측정소 코드 | -0.010 | 1.000 | -0.001 | 0.010 | 0.059 | 0.083 | 0.083 |
| 측정항목 | -0.000 | -0.001 | 1.000 | 0.737 | 0.050 | 0.592 | 0.592 |
| 평균값 | 0.009 | 0.010 | 0.737 | 1.000 | 0.278 | 0.107 | 0.107 |
| 측정기 상태 | 0.057 | 0.059 | 0.050 | 0.278 | 1.000 | 0.049 | 0.054 |
| 국가 기준초과 구분 | 0.162 | 0.083 | 0.592 | 0.107 | 0.049 | 1.000 | 0.999 |
| 지자체 기준초과 구분 | 0.162 | 0.083 | 0.592 | 0.107 | 0.054 | 0.999 | 1.000 |

Missing values

Figure: A simple visualization of nullity by column.
Figure: Nullity matrix, a data-dense display that lets you quickly spot patterns in data completion.

Sample

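| | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분 |
|---|---|---|---|---|---|---|---|
| 4485 | 2020010205 | 123 | 6 | 0.002 | 0 | 0 | 0 |
| 39919 | 2020011202 | 104 | 3 | 0.0 | 9 | 0 | 0 |
| 36033 | 2020011100 | 106 | 6 | 0.015 | 0 | 0 | 0 |
| 22858 | 2020010708 | 110 | 8 | 11.0 | 0 | 0 | 0 |
| 37298 | 2020011108 | 117 | 5 | 1.1 | 0 | 0 | 0 |
| 41462 | 2020011212 | 111 | 5 | 0.8 | 0 | 0 | 0 |
| 7084 | 2020010223 | 106 | 8 | 64.0 | 0 | 0 | 0 |
| 40251 | 2020011204 | 109 | 6 | 0.023 | 0 | 0 | 0 |
| 31023 | 2020010914 | 121 | 6 | 0.027 | 0 | 0 | 0 |
| 45547 | 2020011315 | 117 | 3 | 0.012 | 0 | 0 | 0 |

| | 측정일시 | 측정소 코드 | 측정항목 | 평균값 | 측정기 상태 | 국가 기준초과 구분 | 지자체 기준초과 구분 |
|---|---|---|---|---|---|---|---|
| 24469 | 2020010719 | 104 | 3 | 0.033 | 0 | 0 | 0 |
| 57075 | 2020011620 | 113 | 6 | 0.032 | 0 | 0 | 0 |
| 35203 | 2020011018 | 118 | 3 | 0.059 | 0 | 0 | 0 |
| 8934 | 2020010311 | 115 | 1 | 0.007 | 0 | 0 | 0 |
| 2670 | 2020010117 | 121 | 1 | 0.003 | 0 | 0 | 0 |
| 63622 | 2020011816 | 104 | 8 | 53.0 | 0 | 0 | 0 |
| 21662 | 2020010700 | 111 | 5 | 0.6 | 0 | 0 | 0 |
| 21409 | 2020010622 | 119 | 3 | 0.036 | 0 | 0 | 0 |
| 55755 | 2020011611 | 118 | 6 | 0.017 | 0 | 0 | 0 |
| 52979 | 2020011517 | 105 | 9 | 18.0 | 0 | 0 | 0 |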