Overview

Dataset statistics

Number of variables: 9
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.6 KiB
Average record size in memory: 78.3 B

Variable types

Categorical: 8
Numeric: 1

Dataset

Description: Sample data (샘플 데이터)
Author: 지디에스컨설팅그룹 (GDS Consulting Group)
URL: https://www.bigdata-environment.kr/user/data_market/detail.do?id=e8039240-2dff-11ea-9713-eb3e5186fb38

Alerts

오염원 지역 코드 has constant value "27000" [Constant]
오염원 경도 is highly overall correlated with 오염원 고유번호 and 3 other fields [High correlation]
오염원 고유번호 is highly overall correlated with 오염원 종류 명 and 3 other fields [High correlation]
오염원 종류 명 is highly overall correlated with 오염원 고유번호 and 3 other fields [High correlation]
오염원 상세 종류 명 is highly overall correlated with 오염원 고유번호 and 3 other fields [High correlation]
오염원 위도 is highly overall correlated with 오염원 고유번호 and 3 other fields [High correlation]
인구수 is highly overall correlated with 연령대 [High correlation]
연령대 is highly overall correlated with 인구수 [High correlation]
인구수 has 6 (6.0%) zeros [Zeros]
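
The Constant and Zeros alerts above can be illustrated with a minimal sketch. The function name `column_alerts` and its rules are assumptions made for illustration, not ydata-profiling's actual implementation, and the non-zero population values are placeholders because the report lists only part of the column.

```python
from collections import Counter


def column_alerts(values, zero=0):
    """Return simple alert labels for one column (hypothetical helper)."""
    counts = Counter(values)
    alerts = []
    # A column with a single distinct value carries no information.
    if len(counts) == 1:
        alerts.append("Constant")
    # Report how many entries equal zero, as a count and a percentage.
    zeros = counts.get(zero, 0)
    if zeros:
        share = 100.0 * zeros / len(values)
        alerts.append(f"Zeros: {zeros} ({share:.1f}%)")
    return alerts


# 오염원 지역 코드: every one of the 100 rows holds "27000".
region_code = ["27000"] * 100
# 인구수: 6 zeros out of 100 rows; the 94 non-zero entries are placeholders.
population = [0] * 6 + [55] * 94

print(column_alerts(region_code))
print(column_alerts(population))
```

Under these assumptions the first column is flagged as constant and the second reports 6 zeros (6.0%), matching the alert list.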

Reproduction

Analysis started: 2023-12-10 12:34:04.088538
Analysis finished: 2023-12-10 12:34:05.772270
Duration: 1.68 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
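
As a quick consistency check, the reported duration can be recomputed from the two timestamps. This is a small sketch using only the Python standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2023-12-10 12:34:04.088538")
finished = datetime.fromisoformat("2023-12-10 12:34:05.772270")

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```

The difference is 1.683732 seconds, which rounds to the 1.68 seconds shown in the report.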

Variables

오염원 고유번호 (pollution source ID)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
1 | 42
2 | 42
3 | 16

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 42 | 42.0%
2 | 42 | 42.0%
3 | 16 | 16.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

오염원 지역 코드 (pollution source region code)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
27000 | 100

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 27000
2nd row: 27000
3rd row: 27000
4th row: 27000
5th row: 27000

Common Values

Value | Count | Frequency (%)
27000 | 100 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

오염원 종류 명 (pollution source type name)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
주유소 (gas station) | 58
세차장 (car wash) | 42

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 주유소
2nd row: 주유소
3rd row: 주유소
4th row: 주유소
5th row: 주유소

Common Values

Value | Count | Frequency (%)
주유소 | 58 | 58.0%
세차장 | 42 | 42.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

오염원 상세 종류 명 (pollution source detailed type name)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
주유소 (gas station) | 58
세차장 (car wash) | 42

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 주유소
2nd row: 주유소
3rd row: 주유소
4th row: 주유소
5th row: 주유소

Common Values

Value | Count | Frequency (%)
주유소 | 58 | 58.0%
세차장 | 42 | 42.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

오염원 경도 (pollution source longitude)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
1095595.01174 | 42
1096153.37895 | 42
1096684.3279 | 16

Length

Max length: 13
Median length: 13
Mean length: 12.84
Min length: 12
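
Because this column was profiled as categorical, the length statistics describe the string form of each coordinate. The sketch below recomputes them from the three distinct values and their counts as listed in this section; the variable names are illustrative.

```python
from statistics import mean, median

# Distinct values of 오염원 경도 and their counts, as listed in the report.
value_counts = {
    "1095595.01174": 42,
    "1096153.37895": 42,
    "1096684.3279": 16,
}

# Expand to one string length per row (100 rows in total).
lengths = [len(value) for value, count in value_counts.items() for _ in range(count)]

max_length = max(lengths)
median_length = median(lengths)
mean_length = mean(lengths)
min_length = min(lengths)
print(max_length, median_length, mean_length, min_length)
```

The 16 rows holding the shorter value 1096684.3279 (12 characters) pull the mean down from 13 to 12.84.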

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1095595.01174
2nd row: 1095595.01174
3rd row: 1095595.01174
4th row: 1095595.01174
5th row: 1095595.01174

Common Values

Value | Count | Frequency (%)
1095595.01174 | 42 | 42.0%
1096153.37895 | 42 | 42.0%
1096684.3279 | 16 | 16.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

오염원 위도 (pollution source latitude)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
1760933.90633 | 42
1761643.77909 | 42
1761283.12639 | 16

Length

Max length: 13
Median length: 13
Mean length: 13
Min length: 13

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1760933.90633
2nd row: 1760933.90633
3rd row: 1760933.90633
4th row: 1760933.90633
5th row: 1760933.90633

Common Values

Value | Count | Frequency (%)
1760933.90633 | 42 | 42.0%
1761643.77909 | 42 | 42.0%
1761283.12639 | 16 | 16.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

연령대 (age group; the suffix 세 means "years of age")
Categorical

HIGH CORRELATION

Distinct: 21
Distinct (%): 21.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
10~14세 | 6
25~29세 | 6
0~4세 | 6
15~19세 | 6
20~24세 | 6
Other values (16) | 70

Length

Max length: 6
Median length: 6
Mean length: 5.76
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 90~94세
2nd row: 5~9세
3rd row: 55~59세
4th row: 90~94세
5th row: 25~29세

Common Values

Value | Count | Frequency (%)
10~14세 | 6 | 6.0%
25~29세 | 6 | 6.0%
0~4세 | 6 | 6.0%
15~19세 | 6 | 6.0%
20~24세 | 6 | 6.0%
30~34세 | 6 | 6.0%
5~9세 | 6 | 6.0%
35~39세 | 6 | 6.0%
85~89세 | 4 | 4.0%
55~59세 | 4 | 4.0%
Other values (11) | 44 | 44.0%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
10~14세 | 6 | 6.0%
0~4세 | 6 | 6.0%
15~19세 | 6 | 6.0%
20~24세 | 6 | 6.0%
30~34세 | 6 | 6.0%
5~9세 | 6 | 6.0%
35~39세 | 6 | 6.0%
25~29세 | 6 | 6.0%
45~49세 | 4 | 4.0%
90~94세 | 4 | 4.0%
Other values (11) | 44 | 44.0%

성별 (sex)
Categorical

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
F | 50
M | 50

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: F
2nd row: F
3rd row: M
4th row: M
5th row: M

Common Values

Value | Count | Frequency (%)
F | 50 | 50.0%
M | 50 | 50.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

인구수 (population count)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 56
Distinct (%): 56.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 591.12
Minimum: 0
Maximum: 1170
Zeros: 6
Zeros (%): 6.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 263
Median: 597
Q3: 944
95th percentile: 1125
Maximum: 1170
Range: 1170
Interquartile range (IQR): 681

Descriptive statistics

Standard deviation: 361.18674
Coefficient of variation (CV): 0.61102102
Kurtosis: -1.127969
Mean: 591.12
Median Absolute Deviation (MAD): 340
Skewness: -0.1819017
Sum: 59112
Variance: 130455.86
Monotonicity: Not monotonic
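
Several of the figures above are related by simple identities: sum = mean × n, variance = standard deviation squared, CV = standard deviation / mean, range = maximum − minimum, and IQR = Q3 − Q1. The sketch below recomputes them from the reported values as a consistency check. It uses rounded inputs copied from the report, so small rounding differences are expected.

```python
import math

# Values copied from the report for 인구수 (100 observations).
n = 100
mean = 591.12
std = 361.18674
q1, q3 = 263, 944
minimum, maximum = 0, 1170

derived = {
    "sum": mean * n,              # reported: 59112
    "variance": std ** 2,         # reported: 130455.86
    "cv": std / mean,             # reported: 0.61102102
    "range": maximum - minimum,   # reported: 1170
    "iqr": q3 - q1,               # reported: 681
}

reported = {
    "sum": 59112,
    "variance": 130455.86,
    "cv": 0.61102102,
    "range": 1170,
    "iqr": 681,
}

for name, value in derived.items():
    # Allow a small relative tolerance because the inputs are rounded.
    consistent = math.isclose(value, reported[name], rel_tol=1e-6)
    print(f"{name}: derived={value:.8g} reported={reported[name]} consistent={consistent}")
```

With the tolerance chosen here, all five derived values agree with the reported ones.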
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
0 | 6 | 6.0%
55 | 2 | 2.0%
947 | 2 | 2.0%
884 | 2 | 2.0%
725 | 2 | 2.0%
802 | 2 | 2.0%
246 | 2 | 2.0%
517 | 2 | 2.0%
538 | 2 | 2.0%
991 | 2 | 2.0%
Other values (46) | 76 | 76.0%

Minimum 10 values

Value | Count | Frequency (%)
0 | 6 | 6.0%
8 | 2 | 2.0%
11 | 2 | 2.0%
43 | 2 | 2.0%
55 | 2 | 2.0%
126 | 2 | 2.0%
130 | 2 | 2.0%
191 | 1 | 1.0%
221 | 1 | 1.0%
246 | 2 | 2.0%

Maximum 10 values

Value | Count | Frequency (%)
1170 | 2 | 2.0%
1158 | 2 | 2.0%
1125 | 2 | 2.0%
1109 | 2 | 2.0%
1050 | 2 | 2.0%
1046 | 2 | 2.0%
1032 | 2 | 2.0%
991 | 2 | 2.0%
961 | 2 | 2.0%
950 | 2 | 2.0%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

Variable | 오염원 고유번호 | 오염원 종류 명 | 오염원 상세 종류 명 | 오염원 경도 | 오염원 위도 | 연령대 | 성별 | 인구수
오염원 고유번호 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.446
오염원 종류 명 | 1.000 | 1.000 | 0.999 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000
오염원 상세 종류 명 | 1.000 | 0.999 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000
오염원 경도 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.446
오염원 위도 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.446
연령대 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.905
성별 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000
인구수 | 0.446 | 0.000 | 0.000 | 0.446 | 0.446 | 0.905 | 0.000 | 1.000

[Figure: correlation heatmap]

Variable | 오염원 경도 | 오염원 고유번호 | 오염원 종류 명 | 오염원 상세 종류 명 | 연령대 | 오염원 위도 | 성별
오염원 경도 | 1.000 | 1.000 | 0.995 | 0.995 | 0.000 | 1.000 | 0.000
오염원 고유번호 | 1.000 | 1.000 | 0.995 | 0.995 | 0.000 | 1.000 | 0.000
오염원 종류 명 | 0.995 | 0.995 | 1.000 | 0.979 | 0.000 | 0.995 | 0.000
오염원 상세 종류 명 | 0.995 | 0.995 | 0.979 | 1.000 | 0.000 | 0.995 | 0.000
연령대 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000
오염원 위도 | 1.000 | 1.000 | 0.995 | 0.995 | 0.000 | 1.000 | 0.000
성별 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000

[Figure: correlation heatmap]

Variable | 인구수 | 오염원 고유번호 | 오염원 종류 명 | 오염원 상세 종류 명 | 오염원 경도 | 오염원 위도 | 연령대 | 성별
인구수 | 1.000 | 0.288 | 0.000 | 0.000 | 0.288 | 0.288 | 0.590 | 0.000
오염원 고유번호 | 0.288 | 1.000 | 0.995 | 0.995 | 1.000 | 1.000 | 0.000 | 0.000
오염원 종류 명 | 0.000 | 0.995 | 1.000 | 0.979 | 0.995 | 0.995 | 0.000 | 0.000
오염원 상세 종류 명 | 0.000 | 0.995 | 0.979 | 1.000 | 0.995 | 0.995 | 0.000 | 0.000
오염원 경도 | 0.288 | 1.000 | 0.995 | 0.995 | 1.000 | 1.000 | 0.000 | 0.000
오염원 위도 | 0.288 | 1.000 | 0.995 | 0.995 | 1.000 | 1.000 | 0.000 | 0.000
연령대 | 0.590 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000
성별 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000
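
The report does not name the method behind each matrix. One measure that would produce the 0.995 shown between 오염원 고유번호 and 오염원 종류 명 is the bias-corrected Cramér's V. The sketch below rests on two assumptions: that this measure was used, and that the contingency table is as inferred from the value counts and sample rows (IDs 1 and 3 are 주유소, ID 2 is 세차장). The report does not state that table directly. The function name `cramers_v_corrected` is illustrative.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic (no continuity correction).
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Bias correction applied to phi-squared and to the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Assumed contingency table: rows are 오염원 고유번호 1, 2, 3;
# columns are 오염원 종류 명 주유소, 세차장.
table = [[42, 0], [0, 42], [16, 0]]
v = cramers_v_corrected(table)
print(f"{v:.3f}")
```

Under these assumptions the result rounds to 0.995, so even a perfect association between two categorical columns can score slightly below 1.000 once the bias correction is applied to 100 rows.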

Missing values

[Figure: nullity bar chart] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

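First rows

Index | 오염원 고유번호 | 오염원 지역 코드 | 오염원 종류 명 | 오염원 상세 종류 명 | 오염원 경도 | 오염원 위도 | 연령대 | 성별 | 인구수
0 | 1 | 27000 | 주유소 | 주유소 | 1095595.01174 | 1760933.90633 | 90~94세 | F | 55
1 | 1 | 27000 | 주유소 | 주유소 | 1095595.01174 | 1760933.90633 | 5~9세 | F | 517
2 | 1 | 27000 | 주유소 | 주유소 | 1095595.01174 | 1760933.90633 | 55~59세 | M | 1050
3 | 1 | 27000 | 주유소 | 주유소 | 1095595.01174 | 1760933.90633 | 90~94세 | M | 8
4 | 1 | 27000 | 주유소 | 주유소 | 1095595.01174 | 1760933.90633 | 25~29세 | M | 766
5 | 1 | 27000 | 주유소 | 주유소 | 1095595.01174 | 1760933.90633 | 75~79세 | M | 263
6 | 1 | 27000 | 주유소 | 주유소 | 1095595.01174 | 1760933.90633 | 40~44세 | F | 961
7 | 1 | 27000 | 주유소 | 주유소 | 1095595.01174 | 1760933.90633 | 65~69세 | M | 632
8 | 1 | 27000 | 주유소 | 주유소 | 1095595.01174 | 1760933.90633 | 70~74세 | F | 500
9 | 1 | 27000 | 주유소 | 주유소 | 1095595.01174 | 1760933.90633 | 85~89세 | F | 130

Last rows

Index | 오염원 고유번호 | 오염원 지역 코드 | 오염원 종류 명 | 오염원 상세 종류 명 | 오염원 경도 | 오염원 위도 | 연령대 | 성별 | 인구수
90 | 3 | 27000 | 주유소 | 주유소 | 1096684.3279 | 1761283.12639 | 30~34세 | M | 681
91 | 3 | 27000 | 주유소 | 주유소 | 1096684.3279 | 1761283.12639 | 10~14세 | M | 251
92 | 3 | 27000 | 주유소 | 주유소 | 1096684.3279 | 1761283.12639 | 10~14세 | F | 221
93 | 3 | 27000 | 주유소 | 주유소 | 1096684.3279 | 1761283.12639 | 25~29세 | M | 581
94 | 3 | 27000 | 주유소 | 주유소 | 1096684.3279 | 1761283.12639 | 15~19세 | M | 410
95 | 3 | 27000 | 주유소 | 주유소 | 1096684.3279 | 1761283.12639 | 15~19세 | F | 391
96 | 3 | 27000 | 주유소 | 주유소 | 1096684.3279 | 1761283.12639 | 25~29세 | F | 569
97 | 3 | 27000 | 주유소 | 주유소 | 1096684.3279 | 1761283.12639 | 30~34세 | F | 659
98 | 3 | 27000 | 주유소 | 주유소 | 1096684.3279 | 1761283.12639 | 35~39세 | F | 613
99 | 3 | 27000 | 주유소 | 주유소 | 1096684.3279 | 1761283.12639 | 35~39세 | M | 633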