Overview

Dataset statistics

Number of variables: 5
Number of observations: 6664
Missing cells: 2
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 273.5 KiB
Average record size in memory: 42.0 B
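
The derived figures above follow from the raw counts. The sketch below recomputes them as a sanity check; the variable names are illustrative, and it assumes that "cells" means variables times observations and that 1 KiB is 1024 bytes.

```python
# Figures copied from the dataset statistics above.
n_variables = 5
n_observations = 6664
missing_cells = 2
total_size_kib = 273.5

# Assumption: a "cell" is one value of one variable in one observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.4f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Two missing cells out of 33320 is well under the 0.1% display threshold, and the average record size rounds to the reported 42.0 B.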

Variable types

Numeric: 2
Text: 2
Categorical: 1

Dataset

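Description: Data from the Gyeongsangnam-do river management system, released under the mid- to long-term public data release plan. It contains the river zone classification information held in the river management system.
Author: Gyeongsangnam-do (경상남도)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15093498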

Alerts

구분코드 is highly imbalanced (89.7%) [Imbalance]
공간아이디 has unique values [Unique]
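
The 89.7% figure in the imbalance alert can be reproduced from the class counts reported for 구분코드 (H01: 6574, H02: 90). The sketch below assumes the score is one minus the Shannon entropy of the class distribution, normalised by the maximum entropy log2(k) for k classes; the function name is hypothetical and this is not taken from the profiler's source.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform split, near 1 for a dominant class.

    Assumed definition; H is the Shannon entropy (base 2) of the class shares.
    """
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no meaningful imbalance under this definition
    total = sum(counts)
    shares = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in shares)
    return 1 - entropy / math.log2(k)


score = imbalance_score([6574, 90])
print(f"{score:.1%}")
```

Under this assumption the score for the 6574 / 90 split rounds to 89.7%, matching the alert.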

Reproduction

Analysis started: 2023-12-11 00:38:19.224403
Analysis finished: 2023-12-11 00:38:20.173003
Duration: 0.95 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

공간아이디 (spatial ID)
Real number (ℝ)

UNIQUE 

Distinct: 6664
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3332.5
Minimum: 1
Maximum: 6664
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 58.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 334.15
Q1: 1666.75
Median: 3332.5
Q3: 4998.25
95-th percentile: 6330.85
Maximum: 6664
Range: 6663
Interquartile range (IQR): 3331.5

Descriptive statistics

Standard deviation: 1923.8754
Coefficient of variation (CV): 0.57730696
Kurtosis: -1.2
Mean: 3332.5
Median Absolute Deviation (MAD): 1666
Skewness: 0
Sum: 22207780
Variance: 3701296.7
Monotonicity: Strictly increasing
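
These statistics are what a plain running index would produce. The sketch below assumes 공간아이디 is exactly the integers 1 to 6664 (suggested by the minimum, maximum, distinct count and strict monotonicity, but not stated in the report) and recomputes the main figures. The `quantile` helper is illustrative and uses linear interpolation between neighbouring values.

```python
import statistics

n = 6664
ids = list(range(1, n + 1))  # assumption: the column is the run 1..6664


def quantile(sorted_values, q):
    """Linearly interpolated quantile of an already sorted list."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


total = sum(ids)
mean = statistics.mean(ids)
variance = statistics.variance(ids)  # sample variance (n - 1 in the denominator)
std = statistics.stdev(ids)
cv = std / mean
iqr = quantile(ids, 0.75) - quantile(ids, 0.25)
strictly_increasing = all(a < b for a, b in zip(ids, ids[1:]))

print(total, mean, round(variance, 1), round(std, 4), round(cv, 8), iqr)
```

Under that assumption the sum, mean, variance, standard deviation, CV and quartiles all agree with the reported values, which supports reading the column as a row identifier rather than a measurement.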
Histogram with fixed size bins (bins=50)

Common values

Value                  Count   Frequency (%)
1                      1       < 0.1%
4583                   1       < 0.1%
4451                   1       < 0.1%
4450                   1       < 0.1%
4449                   1       < 0.1%
4448                   1       < 0.1%
4447                   1       < 0.1%
4446                   1       < 0.1%
4445                   1       < 0.1%
4444                   1       < 0.1%
Other values (6654)    6654    99.8%

Minimum 10 values

Value   Count   Frequency (%)
1       1       < 0.1%
2       1       < 0.1%
3       1       < 0.1%
4       1       < 0.1%
5       1       < 0.1%
6       1       < 0.1%
7       1       < 0.1%
8       1       < 0.1%
9       1       < 0.1%
10      1       < 0.1%

Maximum 10 values

Value   Count   Frequency (%)
6664    1       < 0.1%
6663    1       < 0.1%
6662    1       < 0.1%
6661    1       < 0.1%
6660    1       < 0.1%
6659    1       < 0.1%
6658    1       < 0.1%
6657    1       < 0.1%
6656    1       < 0.1%
6655    1       < 0.1%

하천관리코드 (river management code)
Text

Distinct: 662
Distinct (%): 9.9%
Missing: 0
Missing (%): 0.0%
Memory size: 52.2 KiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 126616
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
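
As an illustration of the category property, the sketch below counts Unicode general categories with Python's standard `unicodedata` module for two sample values that appear in this report. The helper name is hypothetical; the two-letter codes map to the labels used here as Nd = Decimal Number, Lu = Uppercase Letter, Lo = Other Letter.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count Unicode general categories (e.g. 'Nd', 'Lu', 'Lo') in a string."""
    return Counter(unicodedata.category(ch) for ch in text)


# A 하천관리코드 sample value: digits plus the letters F and Q.
code_counts = category_counts("20234402019F02Q0101")

# A 하천구분구역명 sample value: one Hangul syllable plus one digit.
zone_counts = category_counts("우7")

print(dict(code_counts))
print(dict(zone_counts))
```

Script and block properties are not exposed by `unicodedata`, so this sketch covers only the category breakdown.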

Unique

Unique: 36
Unique (%): 0.5%

Sample

1st row: 20234402019F02Q0101
2nd row: 20234402019F02Q0101
3rd row: 20234402019F02Q0101
4th row: 20234402019F02Q0101
5th row: 20234402019F02Q0101

Words

Value                  Count   Frequency (%)
27212002005f02q0101    75      1.1%
20265801995f02q0101    66      1.0%
20268002012f02q0101    62      0.9%
20231502019f02q0101    61      0.9%
20246902020f02q0101    56      0.8%
20225602005f02q0101    55      0.8%
20231002004f01q0101    46      0.7%
20231702019f02q0101    43      0.6%
20228802004f01q0101    42      0.6%
27210002021f01q0101    38      0.6%
Other values (652)     6120    91.8%

Most occurring characters

Value               Count    Frequency (%)
0                   42870    33.9%
1                   24616    19.4%
2                   24562    19.4%
F                   6664     5.3%
Q                   6664     5.3%
9                   3872     3.1%
7                   3524     2.8%
4                   3203     2.5%
5                   2997     2.4%
3                   2890     2.3%
Other values (2)    4754     3.8%

Most occurring categories

Value               Count     Frequency (%)
Decimal Number      113288    89.5%
Uppercase Letter    13328     10.5%

Most frequent character per category

Decimal Number
Value   Count    Frequency (%)
0       42870    37.8%
1       24616    21.7%
2       24562    21.7%
9       3872     3.4%
7       3524     3.1%
4       3203     2.8%
5       2997     2.6%
3       2890     2.6%
6       2674     2.4%
8       2080     1.8%

Uppercase Letter
Value   Count   Frequency (%)
F       6664    50.0%
Q       6664    50.0%

Most occurring scripts

Value     Count     Frequency (%)
Common    113288    89.5%
Latin     13328     10.5%

Most frequent character per script

Common
Value   Count    Frequency (%)
0       42870    37.8%
1       24616    21.7%
2       24562    21.7%
9       3872     3.4%
7       3524     3.1%
4       3203     2.8%
5       2997     2.6%
3       2890     2.6%
6       2674     2.4%
8       2080     1.8%

Latin
Value   Count   Frequency (%)
F       6664    50.0%
Q       6664    50.0%

Most occurring blocks

Value   Count     Frequency (%)
ASCII   126616    100.0%

Most frequent character per block

ASCII
Value               Count    Frequency (%)
0                   42870    33.9%
1                   24616    19.4%
2                   24562    19.4%
F                   6664     5.3%
Q                   6664     5.3%
9                   3872     3.1%
7                   3524     2.8%
4                   3203     2.5%
5                   2997     2.4%
3                   2890     2.3%
Other values (2)    4754     3.8%

구분코드 (classification code)
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.2 KiB

Value   Count
H01     6574
H02     90

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: H01
2nd row: H01
3rd row: H01
4th row: H01
5th row: H01

Common Values

Value   Count   Frequency (%)
H01     6574    98.6%
H02     90      1.4%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Words

Value   Count   Frequency (%)
h01     6574    98.6%
h02     90      1.4%

일련번호 (serial number)
Real number (ℝ)

Distinct: 391
Distinct (%): 5.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20.187425
Minimum: 1
Maximum: 391
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 58.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 7
Q3: 13
95-th percentile: 70.85
Maximum: 391
Range: 390
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 52.704912
Coefficient of variation (CV): 2.6107793
Kurtosis: 24.517869
Mean: 20.187425
Median Absolute Deviation (MAD): 4
Skewness: 4.8722933
Sum: 134529
Variance: 2777.8077
Monotonicity: Not monotonic
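
Several of these figures are tied together by simple identities, which makes a quick consistency check possible. The sketch below uses only numbers reported above for 일련번호; the variable names are illustrative, and the tolerances allow for the rounding in the report.

```python
import math

# Figures copied from the report for 일련번호.
n_observations = 6664
reported_sum = 134529
reported_variance = 2777.8077
q1, median, q3 = 3, 7, 13

mean = reported_sum / n_observations   # mean = sum / n
std = math.sqrt(reported_variance)     # standard deviation = sqrt(variance)
cv = std / mean                        # coefficient of variation = std / mean
iqr = q3 - q1                          # interquartile range = Q3 - Q1

print(f"mean={mean:.6f} std={std:.6f} cv={cv:.7f} iqr={iqr}")
print(f"mean / median = {mean / median:.2f}")
```

The mean is close to three times the median, which is consistent with the strong positive skewness reported above: most serial numbers are small, with a long tail reaching 391.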
Histogram with fixed size bins (bins=50)

Common values

Value                 Count   Frequency (%)
1                     630     9.5%
2                     594     8.9%
3                     542     8.1%
4                     529     7.9%
5                     483     7.2%
6                     451     6.8%
7                     391     5.9%
8                     357     5.4%
9                     296     4.4%
10                    251     3.8%
Other values (381)    2140    32.1%

Minimum 10 values

Value   Count   Frequency (%)
1       630     9.5%
2       594     8.9%
3       542     8.1%
4       529     7.9%
5       483     7.2%
6       451     6.8%
7       391     5.9%
8       357     5.4%
9       296     4.4%
10      251     3.8%

Maximum 10 values

Value   Count   Frequency (%)
391     1       < 0.1%
390     1       < 0.1%
389     1       < 0.1%
388     1       < 0.1%
387     1       < 0.1%
386     1       < 0.1%
385     1       < 0.1%
384     1       < 0.1%
383     1       < 0.1%
382     1       < 0.1%

하천구분구역명 (river zone name)
Text

Distinct: 514
Distinct (%): 7.7%
Missing: 2
Missing (%): < 0.1%
Memory size: 52.2 KiB

Length

Max length: 10
Median length: 2
Mean length: 2.4148904
Min length: 1

Characters and Unicode

Total characters: 16088
Distinct characters: 126
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 366
Unique (%): 5.5%

Sample

1st row: 우7
2nd row: 우6
3rd row: 좌6
4th row: 우5
5th row: 좌5

Words

Value                 Count   Frequency (%)
좌1                   513     7.7%
우1                   508     7.6%
우2                   463     6.9%
좌2                   463     6.9%
우3                   418     6.3%
좌3                   412     6.2%
좌4                   331     5.0%
우4                   319     4.8%
우5                   243     3.6%
좌5                   239     3.6%
Other values (504)    2753    41.3%

Most occurring characters

Value                 Count   Frequency (%)
[Hangul, missing]     3245    20.2%
[Hangul, missing]     3180    19.8%
1                     1958    12.2%
2                     1301    8.1%
3                     1057    6.6%
4                     825     5.1%
5                     626     3.9%
6                     463     2.9%
0                     462     2.9%
[Hangul, missing]     360     2.2%
Other values (116)    2611    16.2%

Most occurring categories

Value                    Count   Frequency (%)
Other Letter             8549    53.1%
Decimal Number           7505    46.6%
Open Punctuation         11      0.1%
Close Punctuation        11      0.1%
Dash Punctuation         9       0.1%
Other Punctuation        2       < 0.1%
Connector Punctuation    1       < 0.1%

Most frequent character per category

Other Letter
Value                 Count   Frequency (%)
[Hangul, missing]     3245    38.0%
[Hangul, missing]     3180    37.2%
[Hangul, missing]     360     4.2%
[Hangul, missing]     260     3.0%
[Hangul, missing]     137     1.6%
[Hangul, missing]     126     1.5%
[Hangul, missing]     119     1.4%
[Hangul, missing]     111     1.3%
[Hangul, missing]     64      0.7%
[Hangul, missing]     63      0.7%
Other values (101)    884     10.3%

Decimal Number
Value   Count   Frequency (%)
1       1958    26.1%
2       1301    17.3%
3       1057    14.1%
4       825     11.0%
5       626     8.3%
6       463     6.2%
0       462     6.2%
7       348     4.6%
8       263     3.5%
9       202     2.7%

Open Punctuation
Value   Count   Frequency (%)
(       11      100.0%

Close Punctuation
Value   Count   Frequency (%)
)       11      100.0%

Dash Punctuation
Value   Count   Frequency (%)
-       9       100.0%

Other Punctuation
Value   Count   Frequency (%)
,       2       100.0%

Connector Punctuation
Value   Count   Frequency (%)
_       1       100.0%

Most occurring scripts

Value     Count   Frequency (%)
Hangul    8549    53.1%
Common    7539    46.9%

Most frequent character per script

Hangul
Value                 Count   Frequency (%)
[Hangul, missing]     3245    38.0%
[Hangul, missing]     3180    37.2%
[Hangul, missing]     360     4.2%
[Hangul, missing]     260     3.0%
[Hangul, missing]     137     1.6%
[Hangul, missing]     126     1.5%
[Hangul, missing]     119     1.4%
[Hangul, missing]     111     1.3%
[Hangul, missing]     64      0.7%
[Hangul, missing]     63      0.7%
Other values (101)    884     10.3%

Common
Value               Count   Frequency (%)
1                   1958    26.0%
2                   1301    17.3%
3                   1057    14.0%
4                   825     10.9%
5                   626     8.3%
6                   463     6.1%
0                   462     6.1%
7                   348     4.6%
8                   263     3.5%
9                   202     2.7%
Other values (5)    34      0.5%

Most occurring blocks

Value     Count   Frequency (%)
Hangul    8549    53.1%
ASCII     7539    46.9%

Most frequent character per block

Hangul
Value                 Count   Frequency (%)
[Hangul, missing]     3245    38.0%
[Hangul, missing]     3180    37.2%
[Hangul, missing]     360     4.2%
[Hangul, missing]     260     3.0%
[Hangul, missing]     137     1.6%
[Hangul, missing]     126     1.5%
[Hangul, missing]     119     1.4%
[Hangul, missing]     111     1.3%
[Hangul, missing]     64      0.7%
[Hangul, missing]     63      0.7%
Other values (101)    884     10.3%

ASCII
Value               Count   Frequency (%)
1                   1958    26.0%
2                   1301    17.3%
3                   1057    14.0%
4                   825     10.9%
5                   626     8.3%
6                   463     6.1%
0                   462     6.1%
7                   348     4.6%
8                   263     3.5%
9                   202     2.7%
Other values (5)    34      0.5%

Interactions

[Figures: four interaction plots; images not preserved]

Correlations

              공간아이디    구분코드    일련번호
공간아이디    1.000         0.146       0.611
구분코드      0.146         1.000       0.195
일련번호      0.611         0.195       1.000

              공간아이디    일련번호    구분코드
공간아이디    1.000         0.158       0.112
일련번호      0.158         1.000       0.150
구분코드      0.112         0.150       1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index   공간아이디   하천관리코드           구분코드   일련번호   하천구분구역명
0       1            20234402019F02Q0101    H01        1          우7
1       2            20234402019F02Q0101    H01        2          우6
2       3            20234402019F02Q0101    H01        3          좌6
3       4            20234402019F02Q0101    H01        4          우5
4       5            20234402019F02Q0101    H01        5          좌5
5       6            20234402019F02Q0101    H01        6          좌4
6       7            20234402019F02Q0101    H01        7          우4
7       8            20234402019F02Q0101    H01        8          우3
8       9            20234402019F02Q0101    H01        9          좌3
9       10           20234402019F02Q0101    H01        10         좌2

Last rows

Index   공간아이디   하천관리코드           구분코드   일련번호   하천구분구역명
6654    6655         20254902021F02Q0101    H01        27         좌04
6655    6656         20254902021F02Q0101    H01        28         좌03
6656    6657         20254902021F02Q0101    H01        29         우02
6657    6658         20254902021F02Q0101    H01        30         좌02
6658    6659         20254902021F02Q0101    H01        31         우01
6659    6660         20254902021F02Q0101    H01        32         좌01
6660    6661         20254902021F02Q0101    H01        33         좌07
6661    6662         20254902021F02Q0101    H01        34         좌15
6662    6663         20254902021F02Q0101    H01        35         <NA>
6663    6664         20254902021F02Q0101    H01        36         <NA>