Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 30770
Missing cells (%): 38.5%
Duplicate rows: 25
Duplicate rows (%): 0.2%
Total size in memory: 722.7 KiB
Average record size in memory: 74.0 B
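
The percentages above follow from the raw counts. The short Python check below is a sketch that uses only figures reported in this document; the per-variable missing counts are copied from the variable sections in the order they appear, and the variable names in the code are illustrative.

```python
# Figures copied from this report.
n_vars = 8
n_obs = 10_000
duplicate_rows = 25
total_size_kib = 722.7

# Missing counts of the eight variables, in the order they appear in the report.
missing_per_variable = [18, 0, 0, 5425, 48, 8425, 9566, 7288]

missing_cells = sum(missing_per_variable)
total_cells = n_vars * n_obs
missing_pct = 100 * missing_cells / total_cells

duplicate_pct = 100 * duplicate_rows / n_obs

# 1 KiB = 1024 bytes
avg_record_bytes = total_size_kib * 1024 / n_obs

print(missing_cells, round(missing_pct, 1))
print(duplicate_pct)
print(round(avg_record_bytes, 1))
```

The exact duplicate share is 0.25%, which the report displays as 0.2%.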

Variable types

Text: 6
Numeric: 2

Dataset

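Description: Structure status data from the Gyeongsangnam-do river management system, providing the river name, appurtenance name, appurtenance address, appurtenance station number, appurtenance structure and size, and remarks.
Author: Gyeongsangnam-do (경상남도)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15093546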

Alerts

Dataset has 25 (0.2%) duplicate rows [Duplicates]
부속물_기타주소 has 5425 (54.2%) missing values [Missing]
부속물_구조 has 8425 (84.2%) missing values [Missing]
부속물_규모 has 9566 (95.7%) missing values [Missing]
비고 has 7288 (72.9%) missing values [Missing]

Reproduction

Analysis started: 2023-12-11 00:41:12.987087
Analysis finished: 2023-12-11 00:41:14.868141
Duration: 1.88 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

하천명
Text

Distinct: 459
Distinct (%): 4.6%
Missing: 18
Missing (%): 0.2%
Memory size: 156.2 KiB

Length

Max length: 6
Median length: 3
Mean length: 2.9879784
Min length: 2

Characters and Unicode

Total characters: 29826
Distinct characters: 206
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
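
As an illustration of the idea, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not necessarily how the report itself was computed; the helper name `category_counts` is hypothetical, and the two input strings are sample values that appear elsewhere in this report. The two-letter codes correspond to the category labels used in the tables (`Lo` = Other Letter, `Nd` = Decimal Number, `Lu` = Uppercase Letter, `Po` = Other Punctuation, `Ps`/`Pe` = Open/Close Punctuation).

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Two sample values taken from this report.
sample = ["우명7낙차공(철거)", "R.C"]
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```

The standard library exposes the general category but not the script or block of a character, so those two breakdowns would need an additional data source.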

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: 소사천
2nd row: 검단천
3rd row: 우명천
4th row: 성만천
5th row: 장자천
Value | Count | Frequency (%)
대산천 | 139 | 1.4%
가천천 | 138 | 1.4%
대곡천 | 120 | 1.2%
동천 | 115 | 1.2%
금양천 | 107 | 1.1%
영천강 | 106 | 1.1%
회야강 | 101 | 1.0%
양천 | 101 | 1.0%
석교천 | 91 | 0.9%
금성천 | 91 | 0.9%
Other values (449) | 8873 | 88.9%

Most occurring characters

ValueCountFrequency (%)
10401
34.9%
1141
 
3.8%
823
 
2.8%
593
 
2.0%
477
 
1.6%
468
 
1.6%
419
 
1.4%
404
 
1.4%
384
 
1.3%
333
 
1.1%
Other values (196) 14383
48.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 29718 | 99.6%
Decimal Number | 52 | 0.2%
Close Punctuation | 28 | 0.1%
Open Punctuation | 28 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
10401
35.0%
1141
 
3.8%
823
 
2.8%
593
 
2.0%
477
 
1.6%
468
 
1.6%
419
 
1.4%
404
 
1.4%
384
 
1.3%
333
 
1.1%
Other values (192) 14275
48.0%
Decimal Number
ValueCountFrequency (%)
1 38
73.1%
2 14
 
26.9%
Close Punctuation
ValueCountFrequency (%)
) 28
100.0%
Open Punctuation
ValueCountFrequency (%)
( 28
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 29718 | 99.6%
Common | 108 | 0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
10401
35.0%
1141
 
3.8%
823
 
2.8%
593
 
2.0%
477
 
1.6%
468
 
1.6%
419
 
1.4%
404
 
1.4%
384
 
1.3%
333
 
1.1%
Other values (192) 14275
48.0%
Common
ValueCountFrequency (%)
1 38
35.2%
) 28
25.9%
( 28
25.9%
2 14
 
13.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 29718 | 99.6%
ASCII | 108 | 0.4%

Most frequent character per block

Hangul
ValueCountFrequency (%)
10401
35.0%
1141
 
3.8%
823
 
2.8%
593
 
2.0%
477
 
1.6%
468
 
1.6%
419
 
1.4%
404
 
1.4%
384
 
1.3%
333
 
1.1%
Other values (192) 14275
48.0%
ASCII
ValueCountFrequency (%)
1 38
35.2%
) 28
25.9%
( 28
25.9%
2 14
 
13.0%

일련번호
Real number (ℝ)

Distinct: 636
Distinct (%): 6.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 79.3483
Minimum: 1
Maximum: 2002
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 3
Q1: 12
Median: 26
Q3: 51
95th percentile: 267.05
Maximum: 2002
Range: 2001
Interquartile range (IQR): 39

Descriptive statistics

Standard deviation: 218.65936
Coefficient of variation (CV): 2.7556905
Kurtosis: 27.152975
Mean: 79.3483
Median Absolute Deviation (MAD): 17
Skewness: 5.0992906
Sum: 793483
Variance: 47811.914
Monotonicity: Not monotonic
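
Several of these statistics are simple functions of the others. The Python sketch below recomputes them from the figures reported for 일련번호; it only checks the arithmetic relationships and is not a description of how the profiling tool computes them.

```python
# Figures reported for 일련번호 in this document.
mean = 79.3483
std = 218.65936
q1, q3 = 12, 51
minimum, maximum = 1, 2002
total, n = 793483, 10_000

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
mean_from_sum = total / n        # mean recomputed from the sum

print(round(cv, 4), round(variance, 1), iqr, value_range, mean_from_sum)
```

A standard deviation almost three times the mean, together with a median of 26 against a maximum of 2002, is consistent with the strong right skew reported above.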
Histogram with fixed size bins (bins=50)
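
A minimal sketch of fixed-size binning, assuming 50 equal-width bins that span the observed minimum and maximum, with the maximum counted in the last bin. The function name `bin_index` is hypothetical, and the exact edge handling of the plotting library may differ.

```python
def bin_index(value, minimum, maximum, bins=50):
    """Return the 0-based index of the equal-width bin containing `value`."""
    width = (maximum - minimum) / bins
    # The maximum is placed in the last bin rather than in a bin of its own.
    return min(int((value - minimum) / width), bins - 1)


# Minimum and maximum reported for 일련번호.
minimum, maximum = 1, 2002
width = (maximum - minimum) / 50

print(width)
print(bin_index(26, minimum, maximum))    # the reported median
print(bin_index(2002, minimum, maximum))  # the reported maximum
```

With a bin width of about 40, the first bin already contains the median (26), which is why most of the mass of such a histogram sits in its leftmost bars.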
Value | Count | Frequency (%)
1 | 255 | 2.5%
4 | 248 | 2.5%
7 | 246 | 2.5%
3 | 245 | 2.5%
2 | 242 | 2.4%
5 | 230 | 2.3%
8 | 223 | 2.2%
6 | 218 | 2.2%
15 | 213 | 2.1%
12 | 209 | 2.1%
Other values (626) | 7671 | 76.7%

Value | Count | Frequency (%)
1 | 255 | 2.5%
2 | 242 | 2.4%
3 | 245 | 2.5%
4 | 248 | 2.5%
5 | 230 | 2.3%
6 | 218 | 2.2%
7 | 246 | 2.5%
8 | 223 | 2.2%
9 | 188 | 1.9%
10 | 202 | 2.0%

Value | Count | Frequency (%)
2002 | 2 | < 0.1%
1704 | 1 | < 0.1%
1703 | 1 | < 0.1%
1702 | 1 | < 0.1%
1701 | 1 | < 0.1%
1698 | 1 | < 0.1%
1697 | 1 | < 0.1%
1691 | 1 | < 0.1%
1688 | 1 | < 0.1%
1680 | 1 | < 0.1%

부속물명
Text

Distinct: 7839
Distinct (%): 78.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 15
Median length: 12
Mean length: 6.2081
Min length: 1

Characters and Unicode

Total characters: 62081
Distinct characters: 396
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7250
Unique (%): 72.5%

Sample

1st row: 소사7취입보
2nd row: 검단1교
3rd row: 우명7낙차공(철거)
4th row: 성만9배수통관
5th row: 장자제2낙차공
Value | Count | Frequency (%)
배수통관 | 312 | 3.0%
호수,저수지 | 130 | 1.3%
호수저수지 | 107 | 1.0%
배수암거 | 46 | 0.4%
취수문 | 36 | 0.4%
저수지 | 31 | 0.3%
제1낙차공 | 26 | 0.3%
낙차보 | 25 | 0.2%
제2낙차공 | 25 | 0.2%
계획교량 | 24 | 0.2%
Other values (7824) | 9495 | 92.6%

Most occurring characters

ValueCountFrequency (%)
6068
 
9.8%
4716
 
7.6%
3741
 
6.0%
3583
 
5.8%
3205
 
5.2%
1 3003
 
4.8%
2035
 
3.3%
2 1888
 
3.0%
1399
 
2.3%
1392
 
2.2%
Other values (386) 31051
50.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 50449 | 81.3%
Decimal Number | 10528 | 17.0%
Space Separator | 305 | 0.5%
Uppercase Letter | 219 | 0.4%
Open Punctuation | 200 | 0.3%
Close Punctuation | 197 | 0.3%
Other Punctuation | 142 | 0.2%
Dash Punctuation | 28 | < 0.1%
Connector Punctuation | 11 | < 0.1%
Math Symbol | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
6068
 
12.0%
4716
 
9.3%
3741
 
7.4%
3583
 
7.1%
3205
 
6.4%
2035
 
4.0%
1399
 
2.8%
1392
 
2.8%
1315
 
2.6%
1188
 
2.4%
Other values (352) 21807
43.2%
Uppercase Letter
ValueCountFrequency (%)
B 64
29.2%
X 63
28.8%
O 61
27.9%
U 17
 
7.8%
C 4
 
1.8%
I 3
 
1.4%
D 1
 
0.5%
Y 1
 
0.5%
T 1
 
0.5%
J 1
 
0.5%
Other values (3) 3
 
1.4%
Decimal Number
ValueCountFrequency (%)
1 3003
28.5%
2 1888
17.9%
3 1279
12.1%
4 968
 
9.2%
5 812
 
7.7%
6 704
 
6.7%
7 556
 
5.3%
8 497
 
4.7%
0 421
 
4.0%
9 400
 
3.8%
Other Punctuation
ValueCountFrequency (%)
, 131
92.3%
. 6
 
4.2%
@ 2
 
1.4%
: 2
 
1.4%
# 1
 
0.7%
Space Separator
ValueCountFrequency (%)
305
100.0%
Open Punctuation
ValueCountFrequency (%)
( 200
100.0%
Close Punctuation
ValueCountFrequency (%)
) 197
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 28
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 11
100.0%
Math Symbol
ValueCountFrequency (%)
+ 2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 50449 | 81.3%
Common | 11413 | 18.4%
Latin | 219 | 0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
6068
 
12.0%
4716
 
9.3%
3741
 
7.4%
3583
 
7.1%
3205
 
6.4%
2035
 
4.0%
1399
 
2.8%
1392
 
2.8%
1315
 
2.6%
1188
 
2.4%
Other values (352) 21807
43.2%
Common
ValueCountFrequency (%)
1 3003
26.3%
2 1888
16.5%
3 1279
11.2%
4 968
 
8.5%
5 812
 
7.1%
6 704
 
6.2%
7 556
 
4.9%
8 497
 
4.4%
0 421
 
3.7%
9 400
 
3.5%
Other values (11) 885
 
7.8%
Latin
ValueCountFrequency (%)
B 64
29.2%
X 63
28.8%
O 61
27.9%
U 17
 
7.8%
C 4
 
1.8%
I 3
 
1.4%
D 1
 
0.5%
Y 1
 
0.5%
T 1
 
0.5%
J 1
 
0.5%
Other values (3) 3
 
1.4%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 50449 | 81.3%
ASCII | 11632 | 18.7%

Most frequent character per block

Hangul
ValueCountFrequency (%)
6068
 
12.0%
4716
 
9.3%
3741
 
7.4%
3583
 
7.1%
3205
 
6.4%
2035
 
4.0%
1399
 
2.8%
1392
 
2.8%
1315
 
2.6%
1188
 
2.4%
Other values (352) 21807
43.2%
ASCII
ValueCountFrequency (%)
1 3003
25.8%
2 1888
16.2%
3 1279
11.0%
4 968
 
8.3%
5 812
 
7.0%
6 704
 
6.1%
7 556
 
4.8%
8 497
 
4.3%
0 421
 
3.6%
9 400
 
3.4%
Other values (24) 1104
 
9.5%

부속물_기타주소
Text

MISSING

Distinct: 724
Distinct (%): 15.8%
Missing: 5425
Missing (%): 54.2%
Memory size: 156.2 KiB

Length

Max length: 20
Median length: 16
Mean length: 15.094645
Min length: 10

Characters and Unicode

Total characters: 69058
Distinct characters: 243
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 123
Unique (%): 2.7%

Sample

1st row: 경상남도 진해시 소사동
2nd row: 함안군 대산면 옥렬리
3rd row: 경상남도 거창군 거창읍 김천리
4th row: 창원시 마산회원구 합성동
5th row: 경상남도 진해시 소사동
Value | Count | Frequency (%)
경상남도 | 4175 | 24.0%
밀양시 | 352 | 2.0%
마산시 | 348 | 2.0%
고성군 | 338 | 1.9%
진주시 | 317 | 1.8%
창원시 | 307 | 1.8%
함안군 | 306 | 1.8%
창녕군 | 301 | 1.7%
사천시 | 266 | 1.5%
하동군 | 264 | 1.5%
Other values (771) | 10437 | 59.9%

Most occurring characters

ValueCountFrequency (%)
12836
18.6%
4528
 
6.6%
4388
 
6.4%
4372
 
6.3%
4220
 
6.1%
3988
 
5.8%
3356
 
4.9%
2380
 
3.4%
2243
 
3.2%
1503
 
2.2%
Other values (233) 25244
36.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 56191 | 81.4%
Space Separator | 12836 | 18.6%
Decimal Number | 31 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4528
 
8.1%
4388
 
7.8%
4372
 
7.8%
4220
 
7.5%
3988
 
7.1%
3356
 
6.0%
2380
 
4.2%
2243
 
4.0%
1503
 
2.7%
1385
 
2.5%
Other values (228) 23828
42.4%
Decimal Number
ValueCountFrequency (%)
1 14
45.2%
3 11
35.5%
4 5
 
16.1%
2 1
 
3.2%
Space Separator
ValueCountFrequency (%)
12836
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 56191 | 81.4%
Common | 12867 | 18.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4528
 
8.1%
4388
 
7.8%
4372
 
7.8%
4220
 
7.5%
3988
 
7.1%
3356
 
6.0%
2380
 
4.2%
2243
 
4.0%
1503
 
2.7%
1385
 
2.5%
Other values (228) 23828
42.4%
Common
ValueCountFrequency (%)
12836
99.8%
1 14
 
0.1%
3 11
 
0.1%
4 5
 
< 0.1%
2 1
 
< 0.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 56191 | 81.4%
ASCII | 12867 | 18.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
12836
99.8%
1 14
 
0.1%
3 11
 
0.1%
4 5
 
< 0.1%
2 1
 
< 0.1%
Hangul
ValueCountFrequency (%)
4528
 
8.1%
4388
 
7.8%
4372
 
7.8%
4220
 
7.5%
3988
 
7.1%
3356
 
6.0%
2380
 
4.2%
2243
 
4.0%
1503
 
2.7%
1385
 
2.5%
Other values (228) 23828
42.4%

부속물_측점번호
Text

Distinct: 6580
Distinct (%): 66.1%
Missing: 48
Missing (%): 0.5%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 9
Mean length: 9.1359526
Min length: 2

Characters and Unicode

Total characters: 90921
Distinct characters: 14
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4850
Unique (%): 48.7%

Sample

1st row: 0008+0037
2nd row: 0012+0035
3rd row: 0024+0003
4th row: 0000+0794
5th row: 0000+0632
Value | Count | Frequency (%)
0000+0000 | 692 | 7.0%
0008+0000 | 10 | 0.1%
0007+0000 | 9 | 0.1%
0000+0030 | 9 | 0.1%
0021+0000 | 8 | 0.1%
0005+0031 | 8 | 0.1%
0016+0000 | 8 | 0.1%
0000+0050 | 8 | 0.1%
0002+0000 | 8 | 0.1%
0004+0028 | 8 | 0.1%
Other values (6570) | 9184 | 92.3%

Most occurring characters

ValueCountFrequency (%)
0 47774
52.5%
+ 9948
 
10.9%
1 6215
 
6.8%
2 4522
 
5.0%
3 3899
 
4.3%
5 3549
 
3.9%
4 3448
 
3.8%
6 2948
 
3.2%
7 2846
 
3.1%
8 2761
 
3.0%
Other values (4) 3011
 
3.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80512 | 88.6%
Math Symbol | 9948 | 10.9%
Other Punctuation | 458 | 0.5%
Dash Punctuation | 3 | < 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 47774
59.3%
1 6215
 
7.7%
2 4522
 
5.6%
3 3899
 
4.8%
5 3549
 
4.4%
4 3448
 
4.3%
6 2948
 
3.7%
7 2846
 
3.5%
8 2761
 
3.4%
9 2550
 
3.2%
Other Punctuation
ValueCountFrequency (%)
. 456
99.6%
, 2
 
0.4%
Math Symbol
ValueCountFrequency (%)
+ 9948
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 90921 | 100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 47774
52.5%
+ 9948
 
10.9%
1 6215
 
6.8%
2 4522
 
5.0%
3 3899
 
4.3%
5 3549
 
3.9%
4 3448
 
3.8%
6 2948
 
3.2%
7 2846
 
3.1%
8 2761
 
3.0%
Other values (4) 3011
 
3.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90921 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 47774
52.5%
+ 9948
 
10.9%
1 6215
 
6.8%
2 4522
 
5.0%
3 3899
 
4.3%
5 3549
 
3.9%
4 3448
 
3.8%
6 2948
 
3.2%
7 2846
 
3.1%
8 2761
 
3.0%
Other values (4) 3011
 
3.3%

부속물_구조
Text

MISSING 

Distinct: 207
Distinct (%): 13.1%
Missing: 8425
Missing (%): 84.2%
Memory size: 156.2 KiB

Length

Max length: 56
Median length: 54
Mean length: 5.9803175
Min length: 1

Characters and Unicode

Total characters: 9419
Distinct characters: 152
Distinct categories: 12
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 144
Unique (%): 9.1%

Sample

1st row: R.C
2nd row: R.C
3rd row: R.C
4th row: R.C
5th row: 흄관
Value | Count | Frequency (%)
r.c | 335 | 15.7%
흄관 | 225 | 10.6%
con'c | 183 | 8.6%
 | 146 | 6.9%
thp | 124 | 5.8%
hp | 111 | 5.2%
csp | 68 | 3.2%
콘크리트 | 50 | 2.3%
접수면적 | 48 | 2.3%
중력식 | 43 | 2.0%
Other values (286) | 798 | 37.4%

Most occurring characters

ValueCountFrequency (%)
C 962
 
10.2%
. 675
 
7.2%
592
 
6.3%
: 400
 
4.2%
P 352
 
3.7%
R 345
 
3.7%
0 312
 
3.3%
O 300
 
3.2%
269
 
2.9%
' 261
 
2.8%
Other values (142) 4951
52.6%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 3053 | 32.4%
Other Letter | 2817 | 29.9%
Other Punctuation | 1538 | 16.3%
Decimal Number | 925 | 9.8%
Space Separator | 592 | 6.3%
Lowercase Letter | 377 | 4.0%
Other Symbol | 103 | 1.1%
Modifier Symbol | 4 | < 0.1%
Dash Punctuation | 4 | < 0.1%
Close Punctuation | 2 | < 0.1%
Other values (2) | 4 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
269
 
9.5%
225
 
8.0%
137
 
4.9%
113
 
4.0%
105
 
3.7%
103
 
3.7%
88
 
3.1%
80
 
2.8%
69
 
2.4%
69
 
2.4%
Other values (82) 1559
55.3%
Uppercase Letter
ValueCountFrequency (%)
C 962
31.5%
P 352
 
11.5%
R 345
 
11.3%
O 300
 
9.8%
N 259
 
8.5%
H 252
 
8.3%
T 135
 
4.4%
S 109
 
3.6%
B 85
 
2.8%
L 56
 
1.8%
Other values (10) 198
 
6.5%
Lowercase Letter
ValueCountFrequency (%)
m 172
45.6%
r 40
 
10.6%
h 39
 
10.3%
f 39
 
10.3%
o 17
 
4.5%
n 11
 
2.9%
c 11
 
2.9%
b 10
 
2.7%
x 9
 
2.4%
l 8
 
2.1%
Other values (7) 21
 
5.6%
Decimal Number
ValueCountFrequency (%)
0 312
33.7%
2 113
 
12.2%
7 86
 
9.3%
1 75
 
8.1%
5 75
 
8.1%
3 65
 
7.0%
8 64
 
6.9%
4 59
 
6.4%
6 54
 
5.8%
9 22
 
2.4%
Other Punctuation
ValueCountFrequency (%)
. 675
43.9%
: 400
26.0%
' 261
 
17.0%
/ 160
 
10.4%
, 40
 
2.6%
" 2
 
0.1%
Space Separator
ValueCountFrequency (%)
592
100.0%
Other Symbol
ValueCountFrequency (%)
103
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 4
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Math Symbol
ValueCountFrequency (%)
= 2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 3430 | 36.4%
Common | 3172 | 33.7%
Hangul | 2817 | 29.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
269
 
9.5%
225
 
8.0%
137
 
4.9%
113
 
4.0%
105
 
3.7%
103
 
3.7%
88
 
3.1%
80
 
2.8%
69
 
2.4%
69
 
2.4%
Other values (82) 1559
55.3%
Latin
ValueCountFrequency (%)
C 962
28.0%
P 352
 
10.3%
R 345
 
10.1%
O 300
 
8.7%
N 259
 
7.6%
H 252
 
7.3%
m 172
 
5.0%
T 135
 
3.9%
S 109
 
3.2%
B 85
 
2.5%
Other values (27) 459
13.4%
Common
ValueCountFrequency (%)
. 675
21.3%
592
18.7%
: 400
12.6%
0 312
9.8%
' 261
 
8.2%
/ 160
 
5.0%
2 113
 
3.6%
103
 
3.2%
7 86
 
2.7%
1 75
 
2.4%
Other values (13) 395
12.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 6498 | 69.0%
Hangul | 2817 | 29.9%
CJK Compat | 103 | 1.1%
None | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 962
14.8%
. 675
 
10.4%
592
 
9.1%
: 400
 
6.2%
P 352
 
5.4%
R 345
 
5.3%
0 312
 
4.8%
O 300
 
4.6%
' 261
 
4.0%
N 259
 
4.0%
Other values (48) 2040
31.4%
Hangul
ValueCountFrequency (%)
269
 
9.5%
225
 
8.0%
137
 
4.9%
113
 
4.0%
105
 
3.7%
103
 
3.7%
88
 
3.1%
80
 
2.8%
69
 
2.4%
69
 
2.4%
Other values (82) 1559
55.3%
CJK Compat
ValueCountFrequency (%)
103
100.0%
None
ValueCountFrequency (%)
Ø 1
100.0%

부속물_규모
Real number (ℝ)

MISSING 

Distinct: 182
Distinct (%): 41.9%
Missing: 9566
Missing (%): 95.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 29.409447
Minimum: 0
Maximum: 1000
Zeros: 3
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5th percentile: 3.1
Q1: 7
Median: 11
Q3: 20
95th percentile: 51.335
Maximum: 1000
Range: 1000
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 104.53639
Coefficient of variation (CV): 3.5545173
Kurtosis: 53.113622
Mean: 29.409447
Median Absolute Deviation (MAD): 5
Skewness: 7.255506
Sum: 12763.7
Variance: 10927.856
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
8.0 | 22 | 0.2%
12.0 | 17 | 0.2%
7.0 | 15 | 0.1%
10.0 | 12 | 0.1%
6.0 | 10 | 0.1%
3.1 | 9 | 0.1%
3.0 | 9 | 0.1%
18.0 | 8 | 0.1%
4.0 | 8 | 0.1%
5.0 | 8 | 0.1%
Other values (172) | 316 | 3.2%
(Missing) | 9566 | 95.7%

Value | Count | Frequency (%)
0.0 | 3 | < 0.1%
1.0 | 1 | < 0.1%
1.1 | 1 | < 0.1%
1.2 | 1 | < 0.1%
2.0 | 1 | < 0.1%
2.7 | 1 | < 0.1%
3.0 | 9 | 0.1%
3.1 | 9 | 0.1%
3.2 | 1 | < 0.1%
3.4 | 1 | < 0.1%

Value | Count | Frequency (%)
1000.0 | 1 | < 0.1%
800.0 | 5 | 0.1%
600.0 | 2 | < 0.1%
110.0 | 1 | < 0.1%
98.3 | 1 | < 0.1%
93.1 | 1 | < 0.1%
75.0 | 1 | < 0.1%
69.3 | 1 | < 0.1%
64.3 | 1 | < 0.1%
63.0 | 1 | < 0.1%

비고
Text

MISSING 

Distinct: 167
Distinct (%): 6.2%
Missing: 7288
Missing (%): 72.9%
Memory size: 156.2 KiB

Length

Max length: 50
Median length: 2
Mean length: 3.1515487
Min length: 1

Characters and Unicode

Total characters: 8547
Distinct characters: 176
Distinct categories: 12
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 102
Unique (%): 3.8%

Sample

1st row: 신설
2nd row: 재가설
3rd row: 재가설
4th row: 신설
5th row: 신설
Value | Count | Frequency (%)
신설 | 1309 | 44.9%
재가설 | 373 | 12.8%
충분 | 226 | 7.8%
존치 | 148 | 5.1%
부족 | 90 | 3.1%
계획지구 | 87 | 3.0%
증설 | 60 | 2.1%
재설치 | 41 | 1.4%
기존확장 | 37 | 1.3%
철거 | 33 | 1.1%
Other values (185) | 510 | 17.5%

Most occurring characters

ValueCountFrequency (%)
1899
22.2%
1322
15.5%
452
 
5.3%
413
 
4.8%
229
 
2.7%
229
 
2.7%
223
 
2.6%
220
 
2.6%
196
 
2.3%
172
 
2.0%
Other values (166) 3192
37.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 7408 | 86.7%
Decimal Number | 330 | 3.9%
Space Separator | 225 | 2.6%
Other Punctuation | 225 | 2.6%
Close Punctuation | 85 | 1.0%
Open Punctuation | 85 | 1.0%
Uppercase Letter | 60 | 0.7%
Lowercase Letter | 49 | 0.6%
Math Symbol | 45 | 0.5%
Dash Punctuation | 26 | 0.3%
Other values (2) | 9 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1899
25.6%
1322
17.8%
452
 
6.1%
413
 
5.6%
229
 
3.1%
229
 
3.1%
220
 
3.0%
196
 
2.6%
172
 
2.3%
140
 
1.9%
Other values (118) 2136
28.8%
Lowercase Letter
ValueCountFrequency (%)
a 9
18.4%
h 7
14.3%
e 6
12.2%
s 6
12.2%
o 5
10.2%
c 5
10.2%
m 4
8.2%
φ 3
 
6.1%
l 1
 
2.0%
p 1
 
2.0%
Other values (2) 2
 
4.1%
Uppercase Letter
ValueCountFrequency (%)
D 22
36.7%
H 5
 
8.3%
E 4
 
6.7%
U 4
 
6.7%
T 4
 
6.7%
B 4
 
6.7%
A 4
 
6.7%
L 4
 
6.7%
S 3
 
5.0%
N 3
 
5.0%
Decimal Number
ValueCountFrequency (%)
0 100
30.3%
1 81
24.5%
2 39
 
11.8%
5 22
 
6.7%
8 19
 
5.8%
7 17
 
5.2%
3 17
 
5.2%
4 14
 
4.2%
6 11
 
3.3%
9 10
 
3.0%
Other Punctuation
ValueCountFrequency (%)
, 153
68.0%
. 34
 
15.1%
: 30
 
13.3%
/ 7
 
3.1%
* 1
 
0.4%
Space Separator
ValueCountFrequency (%)
223
99.1%
  2
 
0.9%
Math Symbol
ValueCountFrequency (%)
× 24
53.3%
+ 21
46.7%
Other Symbol
ValueCountFrequency (%)
5
83.3%
1
 
16.7%
Close Punctuation
ValueCountFrequency (%)
) 85
100.0%
Open Punctuation
ValueCountFrequency (%)
( 85
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 26
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 3
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 7409 | 86.7%
Common | 1029 | 12.0%
Latin | 103 | 1.2%
Greek | 6 | 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1899
25.6%
1322
17.8%
452
 
6.1%
413
 
5.6%
229
 
3.1%
229
 
3.1%
220
 
3.0%
196
 
2.6%
172
 
2.3%
140
 
1.9%
Other values (119) 2137
28.8%
Common
ValueCountFrequency (%)
223
21.7%
, 153
14.9%
0 100
9.7%
) 85
 
8.3%
( 85
 
8.3%
1 81
 
7.9%
2 39
 
3.8%
. 34
 
3.3%
: 30
 
2.9%
- 26
 
2.5%
Other values (14) 173
16.8%
Latin
ValueCountFrequency (%)
D 22
21.4%
a 9
 
8.7%
h 7
 
6.8%
e 6
 
5.8%
s 6
 
5.8%
o 5
 
4.9%
c 5
 
4.9%
H 5
 
4.9%
E 4
 
3.9%
U 4
 
3.9%
Other values (11) 30
29.1%
Greek
ValueCountFrequency (%)
φ 3
50.0%
Φ 3
50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 7408 | 86.7%
ASCII | 1101 | 12.9%
None | 33 | 0.4%
CJK Compat | 5 | 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
1899
25.6%
1322
17.8%
452
 
6.1%
413
 
5.6%
229
 
3.1%
229
 
3.1%
220
 
3.0%
196
 
2.6%
172
 
2.3%
140
 
1.9%
Other values (118) 2136
28.8%
ASCII
ValueCountFrequency (%)
223
20.3%
, 153
13.9%
0 100
9.1%
) 85
 
7.7%
( 85
 
7.7%
1 81
 
7.4%
2 39
 
3.5%
. 34
 
3.1%
: 30
 
2.7%
- 26
 
2.4%
Other values (32) 245
22.3%
None
ValueCountFrequency (%)
× 24
72.7%
φ 3
 
9.1%
Φ 3
 
9.1%
  2
 
6.1%
1
 
3.0%
CJK Compat
ValueCountFrequency (%)
5
100.0%

Interactions


Correlations

Variable | 일련번호 | 부속물_규모
일련번호 | 1.000 | NaN
부속물_규모 | NaN | 1.000

Variable | 일련번호 | 부속물_규모
일련번호 | 1.000 | -0.003
부속물_규모 | -0.003 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
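
One way to make nullity correlation concrete is to correlate the presence indicators of two columns. The sketch below assumes a Pearson correlation over 0/1 "is present" flags; the function name `nullity_correlation` and the toy columns are hypothetical and are not taken from the dataset.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 'is present' indicators of two columns."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A fully present or fully missing column has no variation to correlate.
        return float("nan")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical toy columns; None marks a missing value.
structure = ["R.C", None, None, "HP", None, "R.C"]
scale = [None, 19.3, None, None, 8.0, None]
note = ["신설", None, None, "신설", None, "재가설"]

print(nullity_correlation(structure, note))   # present and missing together
print(nullity_correlation(structure, scale))  # tend to be present in different rows
```

A value near 1 means two columns tend to be filled in or missing together, a value near -1 means one tends to be present when the other is missing, and a value near 0 means their missingness is unrelated.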

Sample

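Index | 하천명 | 일련번호 | 부속물명 | 부속물_기타주소 | 부속물_측점번호 | 부속물_구조 | 부속물_규모 | 비고
19692 | 소사천 | 8 | 소사7취입보 | 경상남도 진해시 소사동 | 0008+0037 | R.C | <NA> | 신설
40592 | 검단천 | 4 | 검단1교 | <NA> | 0012+0035 | <NA> | <NA> | <NA>
6575 | 우명천 | 33 | 우명7낙차공(철거) | <NA> | 0024+0003 | <NA> | <NA> | <NA>
7048 | 성만천 | 18 | 성만9배수통관 | <NA> | 0000+0794 | <NA> | <NA> | <NA>
1319 | 장자천 | 27 | 장자제2낙차공 | <NA> | 0000+0632 | <NA> | 19.3 | 재가설
41716 | 옥열천 | 82 | 옥열10낙차공 | 함안군 대산면 옥렬리 | 0039+0086 | <NA> | <NA> | <NA>
8703 | 웅곡천 | 5 | 김천2교 | 경상남도 거창군 거창읍 김천리 | 0000+0165 | R.C | <NA> | <NA>
22819 | 산호천 | 18 | 산호제1취입보 | 창원시 마산회원구 합성동 | 0013+0029 | <NA> | <NA> | <NA>
3247 | 소사천 | 6 | 소사4배수암거 | 경상남도 진해시 소사동 | 0005+0093 | R.C | <NA> | <NA>
33797 | 가좌천 | 38 | 가좌3교 | <NA> | 0053+0036 | R.C | <NA> | 재가설
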
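Index | 하천명 | 일련번호 | 부속물명 | 부속물_기타주소 | 부속물_측점번호 | 부속물_구조 | 부속물_규모 | 비고
45081 | 덕곡천 | 16 | 덕곡제10배수통관 | <NA> | 0002+0228 | <NA> | <NA> | 재가설(통수단면적 부족)
5275 | 가천천 | 1201 | 가천21배수암거 | <NA> | 0013+0330 | <NA> | <NA> | <NA>
31246 | 영오천 | 123 | 좌연7배수통관 | 경상남도 고성군 개천면 좌연리 | 0070+0010 | HP | <NA> | 신설
42748 | 유곡천 | 55 | 판곡양수장 | 경상남도 의령군 유곡면 칠곡리 | 0046+0100 | R.C | <NA> | <NA>
32393 | 내곡천 | 8 | 내곡제3배수통관 | <NA> | 0000+0767 | <NA> | <NA> | 기존확장
37845 | 영오천 | 120 | 좌연3취수보 | <NA> | 0000+0000 | <NA> | <NA> | <NA>
3425 | 부목천 | 32 | 문12 | <NA> | 0015+0079 | <NA> | <NA> | 증설
13525 | 의령천 | 3 | 의령1배수관 | <NA> | 0003+0036 | <NA> | <NA> | <NA>
15238 | 여좌천 | 4 | 여좌28배수통관 | <NA> | 0026+0024 | <NA> | <NA> | <NA>
11205 | 양산천 | 84 | 배수암거 | <NA> | 0066+0072 | <NA> | <NA> | <NA>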

Duplicate rows

Most frequently occurring

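Index | 하천명 | 일련번호 | 부속물명 | 부속물_기타주소 | 부속물_측점번호 | 부속물_구조 | 부속물_규모 | 비고 | # duplicates
0 | 단장천 | 27 | 단장4배수통관 | <NA> | 0008+0595 | <NA> | <NA> | <NA> | 2
1 | 단장천 | 37 | 사연2배수통관 | <NA> | 0011+0146 | <NA> | <NA> | <NA> | 2
2 | 단장천 | 44 | 범도5배수통관 | <NA> | 0015+0014 | <NA> | <NA> | <NA> | 2
3 | 대산천 | 4 | 호수저수지 | <NA> | 0000+0000 | <NA> | <NA> | <NA> | 2
4 | 방곡천 | 36 | 방곡제5보 | <NA> | 0008+0018 | 중력식 | <NA> | <NA> | 2
5 | 방곡천 | 62 | 신촌저수지 | <NA> | 0000+0000 | <NA> | <NA> | <NA> | 2
6 | 연초천 | 59 | 연초42배수통관 | <NA> | 0060+0095.50 | <NA> | <NA> | <NA> | 2
7 | 연초천 | 71 | 연초53배수통관 | <NA> | 0071+0097.70 | <NA> | <NA> | <NA> | 2
8 | 연초천 | 77 | 제1취입보 | <NA> | 0021+0096.60 | <NA> | <NA> | 재설치 | 2
9 | 영천강 | 17 | 봉발17교(재가설) | <NA> | <NA> | <NA> | <NA> | <NA> | 2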
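
A "# duplicates" column of this kind can be produced by counting identical rows. The sketch below is one hedged way to do it with the standard library; the helper name `duplicate_groups` is hypothetical, and the input is a small illustrative list (the first row is repeated on purpose), not the full dataset.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return {row: count} for every row that occurs more than once."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Illustrative rows as tuples; None stands in for <NA>.
rows = [
    ("단장천", 27, "단장4배수통관", None, "0008+0595", None, None, None),
    ("단장천", 27, "단장4배수통관", None, "0008+0595", None, None, None),
    ("단장천", 37, "사연2배수통관", None, "0011+0146", None, None, None),
]

groups = duplicate_groups(rows)
extra_rows = sum(n - 1 for n in groups.values())  # rows beyond the first copy

for row, n in groups.items():
    print(row[0], row[1], row[2], n)
print(extra_rows)
```

Rows are compared as whole tuples, so two records count as duplicates only when every field, including the missing ones, matches.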