Overview

Dataset statistics

Number of variables: 7
Number of observations: 2147
Missing cells: 2226
Missing cells (%): 14.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 123.8 KiB
Average record size in memory: 59.1 B
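
The missing-cell figures can be cross-checked by hand. The sketch below assumes that "Missing cells (%)" is the number of missing cells divided by variables × observations, and that the two columns flagged as Missing in the Alerts section are the only ones with missing values. The variable names are illustrative and do not come from the report.

```python
# Figures copied from this report; the names are illustrative only.
n_variables = 7
n_observations = 2147

# Missing counts per column, as listed in the Alerts section.
missing_per_column = {"파일 설명": 409, "파일 갯수": 1817}

total_cells = n_variables * n_observations        # every cell in the table
missing_cells = sum(missing_per_column.values())  # cells with no value
missing_pct = 100 * missing_cells / total_cells

print(missing_cells, f"{missing_pct:.1f}%")  # prints: 2226 14.8%
```

Both results agree with the reported values, which suggests the other five columns are complete.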

Variable types

Text: 3
Numeric: 3
Categorical: 1

Dataset

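Description: Data from the Gyeongsangnam-do river management system, released under the mid- to long-term public data opening plan. It contains the river management system's cadastral map information for unleveed river sections (무제부지적도).
Author: Gyeongsangnam-do (경상남도)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15093555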

Alerts

해당도면_구분코드 has constant value "I06" (Constant)
일련번호 is highly overall correlated with 해당도면_일련번호 and 1 other field (High correlation)
해당도면_일련번호 is highly overall correlated with 일련번호 and 1 other field (High correlation)
파일 갯수 is highly overall correlated with 일련번호 and 1 other field (High correlation)
파일 설명 has 409 (19.0%) missing values (Missing)
파일 갯수 has 1817 (84.6%) missing values (Missing)
파일명 has unique values (Unique)

Reproduction

Analysis started: 2023-12-10 23:46:19.116218
Analysis finished: 2023-12-10 23:46:20.482739
Duration: 1.37 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

하천관리코드 (river management code)
Text

Distinct: 212
Distinct (%): 9.9%
Missing: 0
Missing (%): 0.0%
Memory size: 16.9 KiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 40793
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
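
Because every value of this variable has the same length, the character totals follow directly from the row count. The sketch below reproduces them under the assumption, taken from the sample value 20140402015F02Q0101, that each code consists of 17 digits and 2 uppercase letters. The variable names are illustrative.

```python
# Figures copied from this report; the names are illustrative only.
n_rows = 2147
fixed_length = 19  # min, median, mean and max length are all 19

sample = "20140402015F02Q0101"
digits_per_value = sum(ch.isdigit() for ch in sample)   # 17
letters_per_value = sum(ch.isalpha() for ch in sample)  # 2

total_characters = n_rows * fixed_length
total_digits = n_rows * digits_per_value
total_letters = n_rows * letters_per_value

print(total_characters, total_digits, total_letters)  # prints: 40793 36499 4294
```

These match the reported total of 40793 characters and the Decimal Number and Uppercase Letter counts listed for this variable.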

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20140402015F02Q0101
2nd row: 20140402015F02Q0101
3rd row: 20140402015F02Q0101
4th row: 20140402015F02Q0101
5th row: 20140402015F02Q0101

Value                Count  Frequency (%)
20268002012f02q0101     81   3.8%
20272002012f02q0101     54   2.5%
20272002012f02q0102     54   2.5%
20227802010f01q0101     41   1.9%
20243402014f02q0101     38   1.8%
27209902014f02q0101     38   1.8%
20249602010f02q0101     33   1.5%
40226502008f01q0101     33   1.5%
20140402015f02q0101     32   1.5%
20142102012f02q0101     27   1.3%
Other values (202)    1716  79.9%

Most occurring characters

Value             Count  Frequency (%)
0                 14132  34.6%
2                  8677  21.3%
1                  7734  19.0%
F                  2147   5.3%
Q                  2147   5.3%
7                  1269   3.1%
4                  1141   2.8%
6                   981   2.4%
5                   783   1.9%
8                   691   1.7%
Other values (2)   1091   2.7%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    36499  89.5%
Uppercase Letter   4294  10.5%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      14132  38.7%
2       8677  23.8%
1       7734  21.2%
7       1269   3.5%
4       1141   3.1%
6        981   2.7%
5        783   2.1%
8        691   1.9%
3        653   1.8%
9        438   1.2%

Uppercase Letter
Value  Count  Frequency (%)
F       2147  50.0%
Q       2147  50.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  36499  89.5%
Latin    4294  10.5%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      14132  38.7%
2       8677  23.8%
1       7734  21.2%
7       1269   3.5%
4       1141   3.1%
6        981   2.7%
5        783   2.1%
8        691   1.9%
3        653   1.8%
9        438   1.2%

Latin
Value  Count  Frequency (%)
F       2147  50.0%
Q       2147  50.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  40793  100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
0                 14132  34.6%
2                  8677  21.3%
1                  7734  19.0%
F                  2147   5.3%
Q                  2147   5.3%
7                  1269   3.1%
4                  1141   2.8%
6                   981   2.4%
5                   783   1.9%
8                   691   1.7%
Other values (2)   1091   2.7%

일련번호 (serial number)
Real number (ℝ)

HIGH CORRELATION

Distinct: 81
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.030275
Minimum: 1
Maximum: 81
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 6
Q3: 12
95-th percentile: 33.7
Maximum: 81
Range: 80
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 11.589215
Coefficient of variation (CV): 1.1554234
Kurtosis: 9.0368662
Mean: 10.030275
Median Absolute Deviation (MAD): 4
Skewness: 2.6924226
Sum: 21535
Variance: 134.30989
Monotonicity: Not monotonic
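
Several of these statistics are derived from others, so they can be checked against each other. The sketch below uses the standard definitions (mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean, IQR = Q3 - Q1, range = maximum - minimum) and assumes the report follows them. The variable names are illustrative.

```python
import math

# Figures copied from this report for 일련번호; names are illustrative only.
n = 2147
total = 21535
std = 11.589215
q1, q3 = 3, 12
minimum, maximum = 1, 81

mean = total / n               # arithmetic mean
variance = std ** 2            # variance is the squared standard deviation
cv = std / mean                # coefficient of variation
iqr = q3 - q1                  # interquartile range
value_range = maximum - minimum

# Compare with the reported values, allowing for rounding in the report.
print(math.isclose(mean, 10.030275, rel_tol=1e-6))
print(math.isclose(variance, 134.30989, rel_tol=1e-6))
print(math.isclose(cv, 1.1554234, rel_tol=1e-6))
print(iqr, value_range)
```

A CV above 1 together with a skewness of about 2.7 is consistent with a long right tail: most serial numbers are small, and only a few river codes have drawings numbered up to 81.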
Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
2                    212   9.9%
1                    211   9.8%
3                    209   9.7%
4                    194   9.0%
5                    171   8.0%
6                    141   6.6%
7                    116   5.4%
8                    102   4.8%
9                     77   3.6%
10                    71   3.3%
Other values (71)    643  29.9%

Minimum 10 values

Value  Count  Frequency (%)
1        211  9.8%
2        212  9.9%
3        209  9.7%
4        194  9.0%
5        171  8.0%
6        141  6.6%
7        116  5.4%
8        102  4.8%
9         77  3.6%
10        71  3.3%

Maximum 10 values

Value  Count  Frequency (%)
81         1  < 0.1%
80         1  < 0.1%
79         1  < 0.1%
78         1  < 0.1%
77         1  < 0.1%
76         1  < 0.1%
75         1  < 0.1%
74         1  < 0.1%
73         1  < 0.1%
72         1  < 0.1%

파일명 (file name)
Text

UNIQUE

Distinct: 2147
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 16.9 KiB

Length

Max length: 26
Median length: 26
Mean length: 26
Min length: 26

Characters and Unicode

Total characters: 55822
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2147
Unique (%): 100.0%

Sample

1st row: 20140402015F02Q0101I060001
2nd row: 20140402015F02Q0101I060002
3rd row: 20140402015F02Q0101I060003
4th row: 20140402015F02Q0101I060004
5th row: 20140402015F02Q0101I060005

Value                       Count  Frequency (%)
20140402015f02q0101i060001      1  < 0.1%
20272002012f02q0101i060032      1  < 0.1%
20272002012f02q0101i060046      1  < 0.1%
20272002012f02q0101i060045      1  < 0.1%
20272002012f02q0101i060044      1  < 0.1%
20272002012f02q0101i060043      1  < 0.1%
20272002012f02q0101i060042      1  < 0.1%
20272002012f02q0101i060041      1  < 0.1%
20272002012f02q0101i060040      1  < 0.1%
20272002012f02q0101i060039      1  < 0.1%
Other values (2137)          2137  99.5%

Most occurring characters

Value             Count  Frequency (%)
0                 22116  39.6%
2                  9114  16.3%
1                  8481  15.2%
6                  3338   6.0%
F                  2147   3.8%
Q                  2147   3.8%
I                  2147   3.8%
7                  1447   2.6%
4                  1437   2.6%
5                  1039   1.9%
Other values (3)   2409   4.3%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    49381  88.5%
Uppercase Letter   6441  11.5%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      22116  44.8%
2       9114  18.5%
1       8481  17.2%
6       3338   6.8%
7       1447   2.9%
4       1437   2.9%
5       1039   2.1%
3       1013   2.1%
8        840   1.7%
9        556   1.1%

Uppercase Letter
Value  Count  Frequency (%)
F       2147  33.3%
Q       2147  33.3%
I       2147  33.3%

Most occurring scripts

Value   Count  Frequency (%)
Common  49381  88.5%
Latin    6441  11.5%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      22116  44.8%
2       9114  18.5%
1       8481  17.2%
6       3338   6.8%
7       1447   2.9%
4       1437   2.9%
5       1039   2.1%
3       1013   2.1%
8        840   1.7%
9        556   1.1%

Latin
Value  Count  Frequency (%)
F       2147  33.3%
Q       2147  33.3%
I       2147  33.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  55822  100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
0                 22116  39.6%
2                  9114  16.3%
1                  8481  15.2%
6                  3338   6.0%
F                  2147   3.8%
Q                  2147   3.8%
I                  2147   3.8%
7                  1447   2.6%
4                  1437   2.6%
5                  1039   1.9%
Other values (3)   2409   4.3%

해당도면_구분코드 (drawing type code)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 16.9 KiB
I06: 2147

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: I06
2nd row: I06
3rd row: I06
4th row: I06
5th row: I06

Common Values

Value  Count  Frequency (%)
I06     2147  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
i06     2147  100.0%

해당도면_일련번호 (drawing serial number)
Real number (ℝ)

HIGH CORRELATION

Distinct: 81
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.030275
Minimum: 1
Maximum: 81
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 6
Q3: 12
95-th percentile: 33.7
Maximum: 81
Range: 80
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 11.589215
Coefficient of variation (CV): 1.1554234
Kurtosis: 9.0368662
Mean: 10.030275
Median Absolute Deviation (MAD): 4
Skewness: 2.6924226
Sum: 21535
Variance: 134.30989
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
2                    212   9.9%
1                    211   9.8%
3                    209   9.7%
4                    194   9.0%
5                    171   8.0%
6                    141   6.6%
7                    116   5.4%
8                    102   4.8%
9                     77   3.6%
10                    71   3.3%
Other values (71)    643  29.9%

Minimum 10 values

Value  Count  Frequency (%)
1        211  9.8%
2        212  9.9%
3        209  9.7%
4        194  9.0%
5        171  8.0%
6        141  6.6%
7        116  5.4%
8        102  4.8%
9         77  3.6%
10        71  3.3%

Maximum 10 values

Value  Count  Frequency (%)
81         1  < 0.1%
80         1  < 0.1%
79         1  < 0.1%
78         1  < 0.1%
77         1  < 0.1%
76         1  < 0.1%
75         1  < 0.1%
74         1  < 0.1%
73         1  < 0.1%
72         1  < 0.1%

파일 설명 (file description)
Text

MISSING

Distinct: 257
Distinct (%): 14.8%
Missing: 409
Missing (%): 19.0%
Memory size: 16.9 KiB

Length

Max length: 5
Median length: 4
Mean length: 3.264672
Min length: 1

Characters and Unicode

Total characters: 5674
Distinct characters: 34
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 152
Unique (%): 8.7%

Sample

1st row: 0001
2nd row: 0002
3rd row: 0003
4th row: 0004
5th row: 0005

Value               Count  Frequency (%)
0001                  103   5.9%
01                    103   5.9%
0002                   91   5.2%
0003                   89   5.1%
0004                   86   4.9%
0005                   76   4.4%
0006                   71   4.1%
0007                   59   3.4%
0008                   54   3.1%
0009                   38   2.2%
Other values (247)    968  55.7%

Most occurring characters

Value              Count  Frequency (%)
0                   3203  56.5%
1                    659  11.6%
2                    323   5.7%
3                    268   4.7%
4                    228   4.0%
5                    188   3.3%
6                    153   2.7%
7                    128   2.3%
8                    109   1.9%
R                     99   1.7%
Other values (24)    316   5.6%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number     5343  94.2%
Uppercase Letter    212   3.7%
Other Letter        119   2.1%

Most frequent character per category

Other Letter (character values not preserved in the source)
Count  Frequency (%)
15     12.6%
11      9.2%
11      9.2%
11      9.2%
9       7.6%
9       7.6%
9       7.6%
9       7.6%
9       7.6%
7       5.9%
Other values (4): 19 (16.0%)

Decimal Number
Value  Count  Frequency (%)
0       3203  59.9%
1        659  12.3%
2        323   6.0%
3        268   5.0%
4        228   4.3%
5        188   3.5%
6        153   2.9%
7        128   2.4%
8        109   2.0%
9         84   1.6%

Uppercase Letter
Value  Count  Frequency (%)
R         99  46.7%
H         28  13.2%
D         20   9.4%
C         17   8.0%
J         10   4.7%
B         10   4.7%
S          9   4.2%
M          7   3.3%
G          7   3.3%
Y          5   2.4%

Most occurring scripts

Value   Count  Frequency (%)
Common   5343  94.2%
Latin     212   3.7%
Hangul    119   2.1%

Most frequent character per script

Hangul (character values not preserved in the source)
Count  Frequency (%)
15     12.6%
11      9.2%
11      9.2%
11      9.2%
9       7.6%
9       7.6%
9       7.6%
9       7.6%
9       7.6%
7       5.9%
Other values (4): 19 (16.0%)

Common
Value  Count  Frequency (%)
0       3203  59.9%
1        659  12.3%
2        323   6.0%
3        268   5.0%
4        228   4.3%
5        188   3.5%
6        153   2.9%
7        128   2.4%
8        109   2.0%
9         84   1.6%

Latin
Value  Count  Frequency (%)
R         99  46.7%
H         28  13.2%
D         20   9.4%
C         17   8.0%
J         10   4.7%
B         10   4.7%
S          9   4.2%
M          7   3.3%
G          7   3.3%
Y          5   2.4%

Most occurring blocks

Value   Count  Frequency (%)
ASCII    5555  97.9%
Hangul    119   2.1%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
0                   3203  57.7%
1                    659  11.9%
2                    323   5.8%
3                    268   4.8%
4                    228   4.1%
5                    188   3.4%
6                    153   2.8%
7                    128   2.3%
8                    109   2.0%
R                     99   1.8%
Other values (10)    197   3.5%

Hangul (character values not preserved in the source)
Count  Frequency (%)
15     12.6%
11      9.2%
11      9.2%
11      9.2%
9       7.6%
9       7.6%
9       7.6%
9       7.6%
9       7.6%
7       5.9%
Other values (4): 19 (16.0%)

파일 갯수 (file count)
Real number (ℝ)

HIGH CORRELATION, MISSING

Distinct: 14
Distinct (%): 4.2%
Missing: 1817
Missing (%): 84.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.115152
Minimum: 1
Maximum: 20
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 19.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 5
Median: 8
Q3: 17
95-th percentile: 20
Maximum: 20
Range: 19
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 5.7112975
Coefficient of variation (CV): 0.56462797
Kurtosis: -1.3338807
Mean: 10.115152
Median Absolute Deviation (MAD): 4
Skewness: 0.30023305
Sum: 3338
Variance: 32.618919
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=14)

Common values

Value             Count  Frequency (%)
17                   51   2.4%
5                    40   1.9%
7                    35   1.6%
6                    30   1.4%
13                   26   1.2%
8                    24   1.1%
4                    20   0.9%
20                   20   0.9%
18                   18   0.8%
3                    18   0.8%
Other values (4)     48   2.2%
(Missing)          1817  84.6%

Minimum 10 values

Value  Count  Frequency (%)
1         12  0.6%
3         18  0.8%
4         20  0.9%
5         40  1.9%
6         30  1.4%
7         35  1.6%
8         24  1.1%
10        10  0.5%
11        11  0.5%
13        26  1.2%

Maximum 10 values

Value  Count  Frequency (%)
20        20  0.9%
18        18  0.8%
17        51  2.4%
15        15  0.7%
13        26  1.2%
11        11  0.5%
10        10  0.5%
8         24  1.1%
7         35  1.6%
6         30  1.4%

Interactions

[Figure: interaction plots, 9 panels]

Correlations

                   일련번호  해당도면_일련번호  파일 갯수
일련번호             1.000    1.000              0.588
해당도면_일련번호     1.000    1.000              0.588
파일 갯수            0.588    0.588              1.000

                   일련번호  해당도면_일련번호  파일 갯수
일련번호             1.000    1.000              0.577
해당도면_일련번호     1.000    1.000              0.577
파일 갯수            0.577    0.577              1.000
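
The 1.000 entries between 일련번호 and 해당도면_일련번호 are what one would expect if the two columns carry the same values, as they do in every row shown in the Sample section. The sketch below illustrates this with a hand-written Pearson coefficient applied to the first ten sample rows. Pearson is used here only as an example and is not necessarily the coefficient behind either matrix above; the function and variable names are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Values of the two columns in the first ten rows of the Sample section.
serial = list(range(1, 11))           # 일련번호
drawing_serial = list(range(1, 11))   # 해당도면_일련번호

r = pearson(serial, drawing_serial)
print(math.isclose(r, 1.0))  # prints: True
```

If the columns are identical across all 2147 rows, which the matching summary statistics suggest but do not prove, one of them is redundant for modelling purposes.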

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows

Index  하천관리코드          일련번호  파일명                       해당도면_구분코드  해당도면_일련번호  파일 설명  파일 갯수
0      20140402015F02Q0101  1         20140402015F02Q0101I060001  I06                1                  <NA>       <NA>
1      20140402015F02Q0101  2         20140402015F02Q0101I060002  I06                2                  <NA>       <NA>
2      20140402015F02Q0101  3         20140402015F02Q0101I060003  I06                3                  <NA>       <NA>
3      20140402015F02Q0101  4         20140402015F02Q0101I060004  I06                4                  <NA>       <NA>
4      20140402015F02Q0101  5         20140402015F02Q0101I060005  I06                5                  <NA>       <NA>
5      20140402015F02Q0101  6         20140402015F02Q0101I060006  I06                6                  <NA>       <NA>
6      20140402015F02Q0101  7         20140402015F02Q0101I060007  I06                7                  <NA>       <NA>
7      20140402015F02Q0101  8         20140402015F02Q0101I060008  I06                8                  <NA>       <NA>
8      20140402015F02Q0101  9         20140402015F02Q0101I060009  I06                9                  <NA>       <NA>
9      20140402015F02Q0101  10        20140402015F02Q0101I060010  I06                10                 <NA>       <NA>

Last rows

Index  하천관리코드          일련번호  파일명                       해당도면_구분코드  해당도면_일련번호  파일 설명  파일 갯수
2137   40227902007F01Q0101  4         40227902007F01Q0101I060004  I06                4                  0004       <NA>
2138   40227902007F01Q0101  5         40227902007F01Q0101I060005  I06                5                  0005       <NA>
2139   40227902007F01Q0101  6         40227902007F01Q0101I060006  I06                6                  0006       <NA>
2140   40227902007F01Q0101  7         40227902007F01Q0101I060007  I06                7                  0007       <NA>
2141   40227902007F01Q0101  8         40227902007F01Q0101I060008  I06                8                  0008       <NA>
2142   40227902007F01Q0101  9         40227902007F01Q0101I060009  I06                9                  0009       <NA>
2143   40227902007F01Q0101  10        40227902007F01Q0101I060010  I06                10                 0010       <NA>
2144   40227902007F01Q0101  11        40227902007F01Q0101I060011  I06                11                 0011       <NA>
2145   40227902007F01Q0101  12        40227902007F01Q0101I060012  I06                12                 0012       <NA>
2146   40227902007F01Q0101  13        40227902007F01Q0101I060013  I06                13                 0013       <NA>
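
In the rows shown, 파일명 looks like a concatenation of 하천관리코드 (19 characters), 해당도면_구분코드 (3 characters) and 해당도면_일련번호 zero-padded to 4 digits. The sketch below encodes that pattern. It is inferred from the sample rows only and is not documented by the data publisher; the function names are hypothetical.

```python
def build_file_name(river_code: str, drawing_type: str, drawing_serial: int) -> str:
    """Compose a 파일명 value from its apparent parts (pattern inferred from samples)."""
    return f"{river_code}{drawing_type}{drawing_serial:04d}"

def split_file_name(file_name: str) -> tuple[str, str, int]:
    """Split a 26-character 파일명 into (river code, drawing type, drawing serial)."""
    # Assumed fixed widths: 19 + 3 + 4 characters, as seen in the sample rows.
    return file_name[:19], file_name[19:22], int(file_name[22:])

# First and last rows of the sample.
first = build_file_name("20140402015F02Q0101", "I06", 1)
last = build_file_name("40227902007F01Q0101", "I06", 13)
print(first)  # prints: 20140402015F02Q0101I060001
print(last)   # prints: 40227902007F01Q0101I060013
print(split_file_name(first))  # prints: ('20140402015F02Q0101', 'I06', 1)
```

If the pattern holds for every row, 파일명 adds no information beyond the other three columns, which would also explain why it is unique while 하천관리코드 is not.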