Overview

Dataset statistics

Number of variables: 6
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 566.4 KiB
Average record size in memory: 58.0 B
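
The average record size follows from the total size and the row count. The short Python sketch below checks that arithmetic; the variable names are illustrative, and the 1 KiB = 1024 bytes conversion is the only assumption.

```python
# Figures copied from the dataset statistics above.
total_size_kib = 566.4
n_observations = 10_000

# Convert KiB to bytes (1 KiB = 1024 B) before dividing by the row count.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{avg_record_bytes:.1f} B per record")
```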

Variable types

Numeric: 2
Text: 3
Categorical: 1

Dataset

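Description: Data from the Gyeongsangnam-do river management system, published under the mid- to long-term public data release plan. It contains information on the appurtenant structures recorded in the river management system.
Author: Gyeongsangnam-do
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15093560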

Alerts

구분코드 is highly imbalanced (70.6%) [Imbalance]
공간아이디 has unique values [Unique]

Reproduction

Analysis started: 2024-04-21 02:04:48.049302
Analysis finished: 2024-04-21 02:04:50.649707
Duration: 2.6 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

공간아이디
Real number (ℝ)

UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11061.285
Minimum: 2
Maximum: 22085
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2
5-th percentile: 1181.85
Q1: 5594.5
Median: 11019
Q3: 16577.25
95-th percentile: 20974.2
Maximum: 22085
Range: 22083
Interquartile range (IQR): 10982.75

Descriptive statistics

Standard deviation: 6353.9472
Coefficient of variation (CV): 0.57443117
Kurtosis: -1.1966007
Mean: 11061.285
Median Absolute Deviation (MAD): 5503
Skewness: 0.0069052034
Sum: 1.1061286 × 10^8
Variance: 40372644
Monotonicity: Not monotonic
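
Several of these figures are derived from others in the same block. As a rough consistency check, the Python sketch below recomputes the coefficient of variation, IQR, range, and variance from the reported mean, standard deviation, quartiles, and extremes. It uses only the rounded values printed in the report, so small differences in the last digits are expected.

```python
# Values copied from the 공간아이디 statistics above.
mean = 11061.285
std = 6353.9472
q1, q3 = 5594.5, 16577.25
minimum, maximum = 2, 22085

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
variance = std ** 2              # variance is the squared standard deviation

print(f"CV={cv:.8f}  IQR={iqr}  range={value_range}  variance={variance:.0f}")
```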
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
7169 | 1 | < 0.1%
16577 | 1 | < 0.1%
10701 | 1 | < 0.1%
3803 | 1 | < 0.1%
4586 | 1 | < 0.1%
6364 | 1 | < 0.1%
15539 | 1 | < 0.1%
20055 | 1 | < 0.1%
18767 | 1 | < 0.1%
12593 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Value | Count | Frequency (%)
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
9 | 1 | < 0.1%
14 | 1 | < 0.1%
16 | 1 | < 0.1%
20 | 1 | < 0.1%
22 | 1 | < 0.1%
23 | 1 | < 0.1%

Value | Count | Frequency (%)
22085 | 1 | < 0.1%
22084 | 1 | < 0.1%
22081 | 1 | < 0.1%
22080 | 1 | < 0.1%
22079 | 1 | < 0.1%
22078 | 1 | < 0.1%
22077 | 1 | < 0.1%
22076 | 1 | < 0.1%
22073 | 1 | < 0.1%
22071 | 1 | < 0.1%

하천관리코드
Text

Distinct: 699
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 190000
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 27
Unique (%): 0.3%

Sample

1st row: 20255602014F02Q0101
2nd row: 20227502012F01Q0101
3rd row: 27209901994F02Q0101
4th row: 27202401995F01Q0101
5th row: 20241502016F01Q0101
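
All five sample values share one fixed 19-character layout: 11 digits, the letter F, 2 digits, the letter Q, then 4 digits. The Python sketch below checks a string against that layout. The pattern is inferred only from the samples and the length and character statistics, the meaning of each segment is not documented in this report, and `is_valid_code` is a hypothetical helper name.

```python
import re

# Assumed layout, inferred from the sample rows:
# 11 digits, "F", 2 digits, "Q", 4 digits (19 characters in total).
CODE_PATTERN = re.compile(r"[0-9]{11}F[0-9]{2}Q[0-9]{4}")


def is_valid_code(code: str) -> bool:
    """Return True if the whole string matches the assumed layout."""
    return CODE_PATTERN.fullmatch(code) is not None


samples = [
    "20255602014F02Q0101",
    "20227502012F01Q0101",
    "27209901994F02Q0101",
    "27202401995F01Q0101",
    "20241502016F01Q0101",
]
print(all(is_valid_code(s) for s in samples))
```

A check like this is mainly useful for flagging malformed codes before joining this table to other river management data.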
Value | Count | Frequency (%)
20246902020f02q0101 | 170 | 1.7%
20228802004f01q0101 | 96 | 1.0%
20231502019f02q0101 | 86 | 0.9%
20249602010f02q0101 | 68 | 0.7%
20248701986f01q0101 | 65 | 0.7%
20250301997f01q0101 | 62 | 0.6%
20234302019f02q0101 | 61 | 0.6%
27209902014f02q0101 | 59 | 0.6%
20265801995f02q0101 | 59 | 0.6%
20233402019f02q0101 | 57 | 0.6%
Other values (689) | 9217 | 92.2%

Most occurring characters

Value | Count | Frequency (%)
0 | 63851 | 33.6%
1 | 35464 | 18.7%
2 | 35425 | 18.6%
F | 10000 | 5.3%
Q | 10000 | 5.3%
9 | 8283 | 4.4%
7 | 5550 | 2.9%
4 | 5324 | 2.8%
5 | 4383 | 2.3%
3 | 4358 | 2.3%
Other values (2) | 7362 | 3.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 170000 | 89.5%
Uppercase Letter | 20000 | 10.5%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 63851 | 37.6%
1 | 35464 | 20.9%
2 | 35425 | 20.8%
9 | 8283 | 4.9%
7 | 5550 | 3.3%
4 | 5324 | 3.1%
5 | 4383 | 2.6%
3 | 4358 | 2.6%
6 | 4333 | 2.5%
8 | 3029 | 1.8%

Uppercase Letter
Value | Count | Frequency (%)
F | 10000 | 50.0%
Q | 10000 | 50.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 170000 | 89.5%
Latin | 20000 | 10.5%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 63851 | 37.6%
1 | 35464 | 20.9%
2 | 35425 | 20.8%
9 | 8283 | 4.9%
7 | 5550 | 3.3%
4 | 5324 | 3.1%
5 | 4383 | 2.6%
3 | 4358 | 2.6%
6 | 4333 | 2.5%
8 | 3029 | 1.8%

Latin
Value | Count | Frequency (%)
F | 10000 | 50.0%
Q | 10000 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 190000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 63851 | 33.6%
1 | 35464 | 18.7%
2 | 35425 | 18.6%
F | 10000 | 5.3%
Q | 10000 | 5.3%
9 | 8283 | 4.4%
7 | 5550 | 2.9%
4 | 5324 | 2.8%
5 | 4383 | 2.3%
3 | 4358 | 2.3%
Other values (2) | 7362 | 3.9%

구분코드
Categorical

IMBALANCE 

Distinct: 9
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

S08: 8379
S09: 1044
S10: 145
K03: 144
S11: 121
Other values (4): 167

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: S08
2nd row: S09
3rd row: S08
4th row: S08
5th row: S08

Common Values

Value | Count | Frequency (%)
S08 | 8379 | 83.8%
S09 | 1044 | 10.4%
S10 | 145 | 1.5%
K03 | 144 | 1.4%
S11 | 121 | 1.2%
S13 | 107 | 1.1%
K02 | 25 | 0.2%
S12 | 22 | 0.2%
S99 | 13 | 0.1%
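
The Imbalance alert for 구분코드 reports 70.6%. One definition that reproduces this figure from the counts above is one minus the Shannon entropy of the class distribution, normalized by log2 of the number of classes. The Python sketch below assumes that definition; the report itself does not state the formula, and `imbalance_score` is a hypothetical helper name.

```python
import math

# Class counts copied from the Common Values table above.
counts = {
    "S08": 8379, "S09": 1044, "S10": 145, "K03": 144, "S11": 121,
    "S13": 107, "K02": 25, "S12": 22, "S99": 13,
}


def imbalance_score(class_counts):
    """Assumed definition: 1 - H(p) / log2(k), where k is the class count.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    total = sum(class_counts)
    probs = [c / total for c in class_counts if c > 0]
    if len(probs) <= 1:
        return 0.0  # convention chosen for this sketch: a single class scores 0
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(probs))


score = imbalance_score(counts.values())
print(f"{score:.1%}")
```

With S08 alone covering 83.8% of rows, any model or aggregate built on 구분코드 will be driven largely by that single class.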

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
s08 | 8379 | 83.8%
s09 | 1044 | 10.4%
s10 | 145 | 1.5%
k03 | 144 | 1.4%
s11 | 121 | 1.2%
s13 | 107 | 1.1%
k02 | 25 | 0.2%
s12 | 22 | 0.2%
s99 | 13 | 0.1%

일련번호
Real number (ℝ)

Distinct: 684
Distinct (%): 6.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 96.596
Minimum: 1
Maximum: 1703
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 11
Median: 24
Q3: 51
95-th percentile: 354.15
Maximum: 1703
Range: 1702
Interquartile range (IQR): 40

Descriptive statistics

Standard deviation: 269.56999
Coefficient of variation (CV): 2.7906952
Kurtosis: 17.340926
Mean: 96.596
Median Absolute Deviation (MAD): 17
Skewness: 4.2417926
Sum: 965960
Variance: 72667.979
Monotonicity: Not monotonic
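
The mean (96.596) sits far above the median (24) and the skewness is 4.24, so a small number of large 일련번호 values dominate the distribution. One common rule of thumb for marking such values is Tukey's 1.5 × IQR fences. The Python sketch below applies that rule to the reported quartiles; the rule is a convention brought in for illustration, not something the report computes.

```python
# Quartiles and 95th percentile copied from the 일련번호 quantile statistics above.
q1, q3 = 11, 51
p95 = 354.15

iqr = q3 - q1                  # interquartile range
lower_fence = q1 - 1.5 * iqr   # values below this would be flagged
upper_fence = q3 + 1.5 * iqr   # values above this would be flagged

print(f"fences: [{lower_fence}, {upper_fence}]")
# The 95th percentile already exceeds the upper fence, so at least 5% of rows
# would be flagged as high outliers under this rule.
print(p95 > upper_fence)
```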
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
3 | 269 | 2.7%
1 | 263 | 2.6%
4 | 261 | 2.6%
5 | 260 | 2.6%
6 | 258 | 2.6%
2 | 246 | 2.5%
7 | 244 | 2.4%
8 | 237 | 2.4%
9 | 227 | 2.3%
13 | 227 | 2.3%
Other values (674) | 7508 | 75.1%

Value | Count | Frequency (%)
1 | 263 | 2.6%
2 | 246 | 2.5%
3 | 269 | 2.7%
4 | 261 | 2.6%
5 | 260 | 2.6%
6 | 258 | 2.6%
7 | 244 | 2.4%
8 | 237 | 2.4%
9 | 227 | 2.3%
10 | 224 | 2.2%

Value | Count | Frequency (%)
1703 | 1 | < 0.1%
1702 | 1 | < 0.1%
1701 | 1 | < 0.1%
1700 | 1 | < 0.1%
1699 | 1 | < 0.1%
1698 | 1 | < 0.1%
1697 | 1 | < 0.1%
1696 | 1 | < 0.1%
1695 | 1 | < 0.1%
1693 | 1 | < 0.1%

부속물명
Text

Distinct: 7472
Distinct (%): 74.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 14
Median length: 13
Mean length: 7.0234
Min length: 1

Characters and Unicode

Total characters: 70234
Distinct characters: 290
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6848
Unique (%): 68.5%

Sample

1st row: 영오제10배수통관
2nd row: 월광제13배수통관
3rd row: 우천2취수문
4th row: 배수통관
5th row: 좌11배수통관
Value | Count | Frequency (%)
배수통관 | 653 | 6.3%
배수암거 | 89 | 0.9%
취수문 | 87 | 0.8%
죽천 | 59 | 0.6%
(blank) | 55 | 0.5%
배수통문 | 35 | 0.3%
계획배수통관 | 35 | 0.3%
제3배수통관 | 33 | 0.3%
제1배수통관 | 32 | 0.3%
제4배수통관 | 31 | 0.3%
Other values (7360) | 9258 | 89.3%

Most occurring characters

Value | Count | Frequency (%)
(not shown) | 9659 | 13.8%
(not shown) | 9030 | 12.9%
(not shown) | 6991 | 10.0%
(not shown) | 6718 | 9.6%
(not shown) | 3486 | 5.0%
1 | 3346 | 4.8%
2 | 2028 | 2.9%
(not shown) | 1848 | 2.6%
(not shown) | 1679 | 2.4%
3 | 1443 | 2.1%
Other values (280) | 24006 | 34.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 57312 | 81.6%
Decimal Number | 12002 | 17.1%
Space Separator | 370 | 0.5%
Uppercase Letter | 183 | 0.3%
Close Punctuation | 164 | 0.2%
Open Punctuation | 164 | 0.2%
Dash Punctuation | 25 | < 0.1%
Other Punctuation | 13 | < 0.1%
Connector Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not shown) | 9659 | 16.9%
(not shown) | 9030 | 15.8%
(not shown) | 6991 | 12.2%
(not shown) | 6718 | 11.7%
(not shown) | 3486 | 6.1%
(not shown) | 1848 | 3.2%
(not shown) | 1679 | 2.9%
(not shown) | 1083 | 1.9%
(not shown) | 752 | 1.3%
(not shown) | 605 | 1.1%
Other values (252) | 15461 | 27.0%

Decimal Number
Value | Count | Frequency (%)
1 | 3346 | 27.9%
2 | 2028 | 16.9%
3 | 1443 | 12.0%
4 | 1117 | 9.3%
5 | 924 | 7.7%
6 | 816 | 6.8%
7 | 683 | 5.7%
8 | 591 | 4.9%
9 | 532 | 4.4%
0 | 522 | 4.3%

Uppercase Letter
Value | Count | Frequency (%)
U | 47 | 25.7%
O | 43 | 23.5%
X | 43 | 23.5%
B | 43 | 23.5%
I | 2 | 1.1%
C | 2 | 1.1%
D | 1 | 0.5%
V | 1 | 0.5%
Y | 1 | 0.5%

Other Punctuation
Value | Count | Frequency (%)
. | 6 | 46.2%
@ | 3 | 23.1%
* | 3 | 23.1%
# | 1 | 7.7%

Space Separator
Value | Count | Frequency (%)
(space) | 370 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 164 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 164 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 25 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 57312 | 81.6%
Common | 12739 | 18.1%
Latin | 183 | 0.3%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not shown) | 9659 | 16.9%
(not shown) | 9030 | 15.8%
(not shown) | 6991 | 12.2%
(not shown) | 6718 | 11.7%
(not shown) | 3486 | 6.1%
(not shown) | 1848 | 3.2%
(not shown) | 1679 | 2.9%
(not shown) | 1083 | 1.9%
(not shown) | 752 | 1.3%
(not shown) | 605 | 1.1%
Other values (252) | 15461 | 27.0%

Common
Value | Count | Frequency (%)
1 | 3346 | 26.3%
2 | 2028 | 15.9%
3 | 1443 | 11.3%
4 | 1117 | 8.8%
5 | 924 | 7.3%
6 | 816 | 6.4%
7 | 683 | 5.4%
8 | 591 | 4.6%
9 | 532 | 4.2%
0 | 522 | 4.1%
Other values (9) | 737 | 5.8%

Latin
Value | Count | Frequency (%)
U | 47 | 25.7%
O | 43 | 23.5%
X | 43 | 23.5%
B | 43 | 23.5%
I | 2 | 1.1%
C | 2 | 1.1%
D | 1 | 0.5%
V | 1 | 0.5%
Y | 1 | 0.5%

Most occurring blocks

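Value | Count | Frequency (%)
Hangul | 57312 | 81.6%
ASCII | 12922 | 18.4%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(not shown) | 9659 | 16.9%
(not shown) | 9030 | 15.8%
(not shown) | 6991 | 12.2%
(not shown) | 6718 | 11.7%
(not shown) | 3486 | 6.1%
(not shown) | 1848 | 3.2%
(not shown) | 1679 | 2.9%
(not shown) | 1083 | 1.9%
(not shown) | 752 | 1.3%
(not shown) | 605 | 1.1%
Other values (252) | 15461 | 27.0%

ASCII
Value | Count | Frequency (%)
1 | 3346 | 25.9%
2 | 2028 | 15.7%
3 | 1443 | 11.2%
4 | 1117 | 8.6%
5 | 924 | 7.2%
6 | 816 | 6.3%
7 | 683 | 5.3%
8 | 591 | 4.6%
9 | 532 | 4.1%
0 | 522 | 4.0%
Other values (18) | 920 | 7.1%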

공간정보
Text

Distinct: 9700
Distinct (%): 97.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 45
Median length: 45
Mean length: 44.5105
Min length: 40

Characters and Unicode

Total characters: 445105
Distinct characters: 19
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9400
Unique (%): 94.0%

Sample

1st row: POINT (1068114.0508975058 1678896.8450204893)
2nd row: POINT (1060449.0227276098 1749069.2373822585)
3rd row: POINT (1056820.565164969 1670808.3378589272)
4th row: POINT (1098977.458962936 1693429.5799286035)
5th row: POINT (1025278.9763804282 1734412.4653892987)
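
The sample values are WKT-style `POINT (x y)` strings stored as text, which is why the profiler treats this column as Text rather than as two numeric coordinates. The Python sketch below splits such a string into a pair of floats using only the standard library. `parse_point` is a hypothetical helper, and the coordinate reference system of the values is not stated in this report, so the numbers are treated as plain x/y pairs.

```python
import re

# Matches strings of the form "POINT (x y)" as seen in the sample rows.
POINT_RE = re.compile(r"POINT \((-?[0-9]+(?:\.[0-9]+)?) (-?[0-9]+(?:\.[0-9]+)?)\)")


def parse_point(wkt: str) -> tuple[float, float]:
    """Parse a 'POINT (x y)' string into an (x, y) tuple of floats."""
    match = POINT_RE.fullmatch(wkt.strip())
    if match is None:
        raise ValueError(f"not a POINT string: {wkt!r}")
    return float(match.group(1)), float(match.group(2))


x, y = parse_point("POINT (1068114.0508975058 1678896.8450204893)")
print(x, y)
```

For anything beyond simple points (lines, polygons, or reprojection), a dedicated geometry library would be a better fit than a regular expression.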
Value | Count | Frequency (%)
point | 10000 | 33.3%
1140418.8755580366 | 2 | < 0.1%
1080934.427717068 | 2 | < 0.1%
1734583.964913643 | 2 | < 0.1%
1104549.7904665205 | 2 | < 0.1%
1728178.8659452284 | 2 | < 0.1%
1062277.4897165694 | 2 | < 0.1%
1725296.5015104176 | 2 | < 0.1%
1021494.0922068657 | 2 | < 0.1%
1717981.1739145685 | 2 | < 0.1%
Other values (19391) | 19982 | 66.6%

Most occurring characters

Value | Count | Frequency (%)
1 | 50397 | 11.3%
7 | 35791 | 8.0%
0 | 35642 | 8.0%
6 | 33919 | 7.6%
2 | 30456 | 6.8%
5 | 30400 | 6.8%
3 | 30334 | 6.8%
4 | 29701 | 6.7%
8 | 29366 | 6.6%
9 | 29099 | 6.5%
Other values (9) | 110000 | 24.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 335105 | 75.3%
Uppercase Letter | 50000 | 11.2%
Other Punctuation | 20000 | 4.5%
Space Separator | 20000 | 4.5%
Open Punctuation | 10000 | 2.2%
Close Punctuation | 10000 | 2.2%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 50397 | 15.0%
7 | 35791 | 10.7%
0 | 35642 | 10.6%
6 | 33919 | 10.1%
2 | 30456 | 9.1%
5 | 30400 | 9.1%
3 | 30334 | 9.1%
4 | 29701 | 8.9%
8 | 29366 | 8.8%
9 | 29099 | 8.7%

Uppercase Letter
Value | Count | Frequency (%)
P | 10000 | 20.0%
O | 10000 | 20.0%
T | 10000 | 20.0%
N | 10000 | 20.0%
I | 10000 | 20.0%

Other Punctuation
Value | Count | Frequency (%)
. | 20000 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 20000 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 10000 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 10000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 395105 | 88.8%
Latin | 50000 | 11.2%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 50397 | 12.8%
7 | 35791 | 9.1%
0 | 35642 | 9.0%
6 | 33919 | 8.6%
2 | 30456 | 7.7%
5 | 30400 | 7.7%
3 | 30334 | 7.7%
4 | 29701 | 7.5%
8 | 29366 | 7.4%
9 | 29099 | 7.4%
Other values (4) | 60000 | 15.2%

Latin
Value | Count | Frequency (%)
P | 10000 | 20.0%
O | 10000 | 20.0%
T | 10000 | 20.0%
N | 10000 | 20.0%
I | 10000 | 20.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 445105 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 50397 | 11.3%
7 | 35791 | 8.0%
0 | 35642 | 8.0%
6 | 33919 | 7.6%
2 | 30456 | 6.8%
5 | 30400 | 6.8%
3 | 30334 | 6.8%
4 | 29701 | 6.7%
8 | 29366 | 6.6%
9 | 29099 | 6.5%
Other values (9) | 110000 | 24.7%

Interactions


Correlations

Variable | 공간아이디 | 구분코드 | 일련번호
공간아이디 | 1.000 | 0.297 | 0.429
구분코드 | 0.297 | 1.000 | 0.261
일련번호 | 0.429 | 0.261 | 1.000

Variable | 공간아이디 | 일련번호 | 구분코드
공간아이디 | 1.000 | 0.083 | 0.139
일련번호 | 0.083 | 1.000 | 0.131
구분코드 | 0.139 | 0.131 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index) | 공간아이디 | 하천관리코드 | 구분코드 | 일련번호 | 부속물명 | 공간정보
7175 | 7169 | 20255602014F02Q0101 | S08 | 42 | 영오제10배수통관 | POINT (1068114.0508975058 1678896.8450204893)
641 | 642 | 20227502012F01Q0101 | S09 | 13 | 월광제13배수통관 | POINT (1060449.0227276098 1749069.2373822585)
16845 | 16838 | 27209901994F02Q0101 | S08 | 83 | 우천2취수문 | POINT (1056820.565164969 1670808.3378589272)
14199 | 14192 | 27202401995F01Q0101 | S08 | 18 | 배수통관 | POINT (1098977.458962936 1693429.5799286035)
3869 | 3868 | 20241502016F01Q0101 | S08 | 20 | 좌11배수통관 | POINT (1025278.9763804282 1734412.4653892987)
311 | 311 | 20225602005F02Q0101 | S09 | 26 | (blank) | POINT (1095450.8159278487 1733662.4881836309)
16878 | 16871 | 27209901994F02Q0101 | S08 | 130 | 종천7배수통관 | POINT (1058364.563886197 1667780.880478418)
15306 | 15299 | 27205202014F01Q0101 | S13 | 32 | 여수토 | POINT (1083266.9717359336 1668948.1986530311)
20122 | 20123 | 20233802019F02Q0101 | S08 | 35 | 황계제27배수통관 | POINT (1053561.459955034 1723850.994294676)
4021 | 4020 | 20242902016F02Q0101 | S08 | 14 | 죽곡좌8배수통관 | POINT (1018609.189352776 1725115.2229685425)

(index) | 공간아이디 | 하천관리코드 | 구분코드 | 일련번호 | 부속물명 | 공간정보
10301 | 10302 | 20264901995F01Q0101 | S08 | 47 | 거문6배수통관 | POINT (1099067.820325124 1716103.5296267036)
12878 | 12872 | 20276102018F02Q0101 | S08 | 4 | 제20배수통관 | POINT (1140700.125847342 1706977.634365025)
16543 | 16536 | 27209401996F01Q0101 | S08 | 35 | 동림7배수암거 | POINT (1052796.9133151432 1660974.7634337025)
11860 | 11855 | 20272202008F01Q0101 | S08 | 1 | 시전 제1배수통관 | POINT (1128586.1744087113 1726487.8868032238)
12162 | 12157 | 20274201995F01Q0101 | S08 | 8 | 내포2배수암거 | POINT (1129683.837154323 1712743.898516166)
16980 | 16973 | 27210002018F02Q0101 | S08 | 6 | 제4배수통문 | POINT (1058696.9743806804 1671987.3846174257)
14476 | 14468 | 27202701999F01Q0101 | S08 | 22 | 교방1배수암거 | POINT (1097185.4108382112 1691315.403944658)
7401 | 7395 | 20256002010F01Q0101 | S08 | 9 | 검암제3배수암거 | POINT (1063503.5778962586 1680856.526224242)
1642 | 1643 | 20231102004F01Q0101 | S13 | 80 | 관정 | POINT (1038306.698614512 1739290.791807352)
3960 | 3959 | 20242002008F01Q0101 | S08 | 5 | 공배5배수통관 | POINT (1024970.5325376411 1727317.4616374543)