Overview

Dataset statistics

Number of variables: 9
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.4 KiB
Average record size in memory: 75.3 B

Variable types

Categorical: 8
Numeric: 1

Dataset

Description: Sample data
Author: 지디에스컨설팅그룹
URL: https://www.bigdata-environment.kr/user/data_market/detail.do?id=00eb74d0-2e00-11ea-9713-eb3e5186fb38

Alerts

상위 유역 명 has constant value "남한강상류" (Constant)
급수 등급 명 has constant value "지방2급" (Constant)
유역명 is highly overall correlated with 유역코드 and 2 other fields (High correlation)
상세유역명 is highly overall correlated with 유역코드 and 2 other fields (High correlation)
유역코드 is highly overall correlated with 유역명 and 2 other fields (High correlation)
위치 명 is highly overall correlated with 유역코드 and 2 other fields (High correlation)

Reproduction

Analysis started: 2023-12-10 13:09:56.215525
Analysis finished: 2023-12-10 13:09:57.350393
Duration: 1.13 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

유역코드 (basin code)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
100101 | 40
100102 | 40
100103 | 20

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%
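
The figures above show 3 distinct values but 0 unique ones. The sketch below illustrates how the two counts can differ, assuming that "distinct" counts different values and "unique" counts values that occur exactly once. That reading is inferred from the numbers in this report, not taken from the tool's documentation. The column is rebuilt from the value counts reported for 유역코드.

```python
from collections import Counter

# Rebuild the column from the value counts reported for 유역코드.
values = ["100101"] * 40 + ["100102"] * 40 + ["100103"] * 20

counts = Counter(values)
n = len(values)

# Distinct: number of different values in the column.
distinct = len(counts)
# Unique (assumed meaning): values that appear exactly once.
unique = sum(1 for c in counts.values() if c == 1)

print(f"Distinct: {distinct} ({distinct / n:.1%})")
print(f"Unique: {unique} ({unique / n:.1%})")
```

Under that assumption, every value here repeats at least 20 times, so the unique count is zero even though three different codes exist.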

Sample

1st row: 100101
2nd row: 100101
3rd row: 100101
4th row: 100101
5th row: 100101

Common Values

Value | Count | Frequency (%)
100101 | 40 | 40.0%
100102 | 40 | 40.0%
100103 | 20 | 20.0%

Length

Histogram of lengths of the category

유역명 (basin name)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
광동댐 | 40
광동댐하류 | 40
임계천 | 20

Length

Max length: 5
Median length: 3
Mean length: 3.8
Min length: 3
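
The length statistics can be checked against the value counts reported for 유역명. The following is a minimal sketch, assuming that length means the number of characters per value (Unicode code points after NFC normalization); the report does not say how the tool measures it.

```python
import unicodedata
from statistics import mean, median

# Value counts reported for 유역명.
value_counts = {"광동댐": 40, "광동댐하류": 40, "임계천": 20}

# One length per observation; NFC keeps each Hangul syllable as one code point.
lengths = [
    len(unicodedata.normalize("NFC", value))
    for value, count in value_counts.items()
    for _ in range(count)
]

length_stats = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": mean(lengths),
    "min": min(lengths),
}
print(length_stats)
```

Sixty observations have length 3 and forty have length 5, which gives a median of 3 and a mean of 380 / 100 = 3.8, matching the figures above.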

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 광동댐
2nd row: 광동댐
3rd row: 광동댐
4th row: 광동댐
5th row: 광동댐

Common Values

Value | Count | Frequency (%)
광동댐 | 40 | 40.0%
광동댐하류 | 40 | 40.0%
임계천 | 20 | 20.0%

Length

Histogram of lengths of the category

상위 유역 명 (upper-level basin name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
남한강상류 | 100

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 남한강상류
2nd row: 남한강상류
3rd row: 남한강상류
4th row: 남한강상류
5th row: 남한강상류

Common Values

Value | Count | Frequency (%)
남한강상류 | 100 | 100.0%

Length

Histogram of lengths of the category

위치 명 (location name)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
골지천-광동댐 | 40
광동댐-임계천하구 | 40
임계천-임계천하구 | 20

Length

Max length: 9
Median length: 9
Mean length: 8.2
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 골지천-광동댐
2nd row: 골지천-광동댐
3rd row: 골지천-광동댐
4th row: 골지천-광동댐
5th row: 골지천-광동댐

Common Values

Value | Count | Frequency (%)
골지천-광동댐 | 40 | 40.0%
광동댐-임계천하구 | 40 | 40.0%
임계천-임계천하구 | 20 | 20.0%

Length

Histogram of lengths of the category

상세유역명 (detailed basin name)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
골지천 | 80
임계천 | 20

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 골지천
2nd row: 골지천
3rd row: 골지천
4th row: 골지천
5th row: 골지천

Common Values

Value | Count | Frequency (%)
골지천 | 80 | 80.0%
임계천 | 20 | 20.0%

Length

Histogram of lengths of the category

급수 등급 명 (class grade name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
지방2급 | 100

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 지방2급
2nd row: 지방2급
3rd row: 지방2급
4th row: 지방2급
5th row: 지방2급

Common Values

Value | Count | Frequency (%)
지방2급 | 100 | 100.0%

Length

Histogram of lengths of the category

연령대 명 (age group name)
Categorical

Distinct: 20
Distinct (%): 20.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
0 - 4세 | 6
20 - 24세 | 6
25 - 29세 | 6
30 - 34세 | 6
35 - 39세 | 6
Other values (15) | 70

Length

Max length: 8
Median length: 8
Mean length: 7.8
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0 - 4세
2nd row: 0 - 4세
3rd row: 10 - 14세
4th row: 10 - 14세
5th row: 15 - 19세

Common Values

Value | Count | Frequency (%)
0 - 4세 | 6 | 6.0%
20 - 24세 | 6 | 6.0%
25 - 29세 | 6 | 6.0%
30 - 34세 | 6 | 6.0%
35 - 39세 | 6 | 6.0%
40 - 44세 | 6 | 6.0%
45 - 49세 | 6 | 6.0%
50 - 54세 | 6 | 6.0%
10 - 14세 | 6 | 6.0%
15 - 19세 | 6 | 6.0%
Other values (10) | 40 | 40.0%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
(blank) | 100 | 33.3%
0 | 6 | 2.0%
40 | 6 | 2.0%
15 | 6 | 2.0%
14세 | 6 | 2.0%
10 | 6 | 2.0%
54세 | 6 | 2.0%
50 | 6 | 2.0%
49세 | 6 | 2.0%
45 | 6 | 2.0%
Other values (31) | 146 | 48.7%

성별 (sex)
Categorical

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
F | 50
M | 50

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: F
2nd row: M
3rd row: F
4th row: M
5th row: F

Common Values

Value | Count | Frequency (%)
F | 50 | 50.0%
M | 50 | 50.0%

Length

Histogram of lengths of the category

인구수 (population count)
Real number (ℝ)

Distinct: 78
Distinct (%): 78.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 85.56
Minimum: 1
Maximum: 298
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5.85
Q1: 21
Median: 59.5
Q3: 128
95-th percentile: 256.75
Maximum: 298
Range: 297
Interquartile range (IQR): 107

Descriptive statistics

Standard deviation: 80.911639
Coefficient of variation (CV): 0.94567133
Kurtosis: 0.3251245
Mean: 85.56
Median Absolute Deviation (MAD): 43
Skewness: 1.1303941
Sum: 8556
Variance: 6546.6933
Monotonicity: Not monotonic
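
Several of the 인구수 figures are derived from one another, which makes a quick consistency check possible. The sketch below uses only numbers copied from this report and the standard definitions (mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean, range = maximum - minimum, IQR = Q3 - Q1). It does not recompute anything from the raw data.

```python
# Figures copied from the 인구수 statistics in this report.
n = 100
total = 8556
std_dev = 80.911639
minimum, maximum = 1, 298
q1, q3 = 21, 128

mean = total / n                 # reported mean: 85.56
variance = std_dev ** 2          # reported variance: 6546.6933
cv = std_dev / mean              # reported CV: 0.94567133
value_range = maximum - minimum  # reported range: 297
iqr = q3 - q1                    # reported IQR: 107

print(mean, round(variance, 4), round(cv, 8), value_range, iqr)
```

All five derived values agree with the reported ones to the precision shown.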
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
8 | 5 | 5.0%
3 | 3 | 3.0%
49 | 3 | 3.0%
21 | 3 | 3.0%
27 | 2 | 2.0%
92 | 2 | 2.0%
11 | 2 | 2.0%
17 | 2 | 2.0%
18 | 2 | 2.0%
94 | 2 | 2.0%
Other values (68) | 74 | 74.0%

Value | Count | Frequency (%)
1 | 1 | 1.0%
2 | 1 | 1.0%
3 | 3 | 3.0%
6 | 1 | 1.0%
7 | 1 | 1.0%
8 | 5 | 5.0%
9 | 1 | 1.0%
11 | 2 | 2.0%
13 | 2 | 2.0%
15 | 1 | 1.0%

Value | Count | Frequency (%)
298 | 1 | 1.0%
295 | 1 | 1.0%
293 | 1 | 1.0%
283 | 1 | 1.0%
271 | 1 | 1.0%
256 | 1 | 1.0%
248 | 1 | 1.0%
239 | 1 | 1.0%
238 | 1 | 1.0%
229 | 1 | 1.0%

Interactions


Correlations

Variable | 유역코드 | 유역명 | 위치 명 | 상세유역명 | 연령대 명 | 성별 | 인구수
유역코드 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.662
유역명 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.662
위치 명 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.662
상세유역명 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.450
연령대 명 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.631
성별 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000
인구수 | 0.662 | 0.662 | 0.662 | 0.450 | 0.631 | 0.000 | 1.000

Variable | 유역명 | 성별 | 연령대 명 | 상세유역명 | 유역코드 | 위치 명
유역명 | 1.000 | 0.000 | 0.000 | 0.995 | 1.000 | 1.000
성별 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000
연령대 명 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000
상세유역명 | 0.995 | 0.000 | 0.000 | 1.000 | 0.995 | 0.995
유역코드 | 1.000 | 0.000 | 0.000 | 0.995 | 1.000 | 1.000
위치 명 | 1.000 | 0.000 | 0.000 | 0.995 | 1.000 | 1.000
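
The 0.995 entries in the categorical-only matrix above are consistent with a bias-corrected Cramér's V (Bergsma's correction). The report does not name the measure, so treating this matrix as Cramér's V is an assumption, and `cramers_v_corrected` is a hypothetical helper written for this illustration, not a function of the profiling tool. The pairing of 100102 with 골지천 and 광동댐하류 is inferred from the value counts, since the sample rows only show 100101 and 100103.

```python
import math
from collections import Counter


def cramers_v_corrected(pairs):
    """Bias-corrected Cramér's V for a list of (x, y) category pairs."""
    n = len(pairs)
    row_totals = Counter(x for x, _ in pairs)
    col_totals = Counter(y for _, y in pairs)
    observed = Counter(pairs)

    # Pearson chi-squared statistic over every cell, including empty ones.
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            chi2 += (observed[(x, y)] - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Bias correction of phi-squared and of the table dimensions.
    phi2_corr = max(0.0, phi2 - (r - 1) * (k - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0


# 유역코드 vs 상세유역명 (pairing for 100102 is inferred, see above).
pairs_detail = (
    [("100101", "골지천")] * 40
    + [("100102", "골지천")] * 40
    + [("100103", "임계천")] * 20
)
# 유역코드 vs 유역명 (one-to-one mapping, also partly inferred).
pairs_name = (
    [("100101", "광동댐")] * 40
    + [("100102", "광동댐하류")] * 40
    + [("100103", "임계천")] * 20
)

v_detail = round(cramers_v_corrected(pairs_detail), 3)
v_name = round(cramers_v_corrected(pairs_name), 3)
print(v_detail, v_name)
```

Without the correction both pairs would score exactly 1, because each 유역코드 value determines the other column. With the correction, the 3-by-2 table falls slightly below 1 while the 3-by-3 table stays at 1, which matches the pattern in the matrix.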

Variable | 인구수 | 유역코드 | 유역명 | 위치 명 | 상세유역명 | 연령대 명 | 성별
인구수 | 1.000 | 0.493 | 0.493 | 0.493 | 0.330 | 0.232 | 0.000
유역코드 | 0.493 | 1.000 | 1.000 | 1.000 | 0.995 | 0.000 | 0.000
유역명 | 0.493 | 1.000 | 1.000 | 1.000 | 0.995 | 0.000 | 0.000
위치 명 | 0.493 | 1.000 | 1.000 | 1.000 | 0.995 | 0.000 | 0.000
상세유역명 | 0.330 | 0.995 | 0.995 | 0.995 | 1.000 | 0.000 | 0.000
연령대 명 | 0.232 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000
성별 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

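Index | 유역코드 | 유역명 | 상위 유역 명 | 위치 명 | 상세유역명 | 급수 등급 명 | 연령대 명 | 성별 | 인구수
0 | 100101 | 광동댐 | 남한강상류 | 골지천-광동댐 | 골지천 | 지방2급 | 0 - 4세 | F | 74
1 | 100101 | 광동댐 | 남한강상류 | 골지천-광동댐 | 골지천 | 지방2급 | 0 - 4세 | M | 92
2 | 100101 | 광동댐 | 남한강상류 | 골지천-광동댐 | 골지천 | 지방2급 | 10 - 14세 | F | 141
3 | 100101 | 광동댐 | 남한강상류 | 골지천-광동댐 | 골지천 | 지방2급 | 10 - 14세 | M | 134
4 | 100101 | 광동댐 | 남한강상류 | 골지천-광동댐 | 골지천 | 지방2급 | 15 - 19세 | F | 115
5 | 100101 | 광동댐 | 남한강상류 | 골지천-광동댐 | 골지천 | 지방2급 | 15 - 19세 | M | 122
6 | 100101 | 광동댐 | 남한강상류 | 골지천-광동댐 | 골지천 | 지방2급 | 20 - 24세 | F | 86
7 | 100101 | 광동댐 | 남한강상류 | 골지천-광동댐 | 골지천 | 지방2급 | 20 - 24세 | M | 144
8 | 100101 | 광동댐 | 남한강상류 | 골지천-광동댐 | 골지천 | 지방2급 | 25 - 29세 | F | 79
9 | 100101 | 광동댐 | 남한강상류 | 골지천-광동댐 | 골지천 | 지방2급 | 25 - 29세 | M | 99

Index | 유역코드 | 유역명 | 상위 유역 명 | 위치 명 | 상세유역명 | 급수 등급 명 | 연령대 명 | 성별 | 인구수
90 | 100103 | 임계천 | 남한강상류 | 임계천-임계천하구 | 임계천 | 지방2급 | 30 - 34세 | F | 36
91 | 100103 | 임계천 | 남한강상류 | 임계천-임계천하구 | 임계천 | 지방2급 | 30 - 34세 | M | 46
92 | 100103 | 임계천 | 남한강상류 | 임계천-임계천하구 | 임계천 | 지방2급 | 35 - 39세 | F | 31
93 | 100103 | 임계천 | 남한강상류 | 임계천-임계천하구 | 임계천 | 지방2급 | 35 - 39세 | M | 60
94 | 100103 | 임계천 | 남한강상류 | 임계천-임계천하구 | 임계천 | 지방2급 | 40 - 44세 | F | 49
95 | 100103 | 임계천 | 남한강상류 | 임계천-임계천하구 | 임계천 | 지방2급 | 40 - 44세 | M | 61
96 | 100103 | 임계천 | 남한강상류 | 임계천-임계천하구 | 임계천 | 지방2급 | 45 - 49세 | F | 63
97 | 100103 | 임계천 | 남한강상류 | 임계천-임계천하구 | 임계천 | 지방2급 | 45 - 49세 | M | 146
98 | 100103 | 임계천 | 남한강상류 | 임계천-임계천하구 | 임계천 | 지방2급 | 50 - 54세 | F | 128
99 | 100103 | 임계천 | 남한강상류 | 임계천-임계천하구 | 임계천 | 지방2급 | 50 - 54세 | M | 190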