Overview

Dataset statistics

Number of variables: 5
Number of observations: 6550
Missing cells: 3
Missing cells (%): < 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 256.0 KiB
Average record size in memory: 40.0 B
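
The percentage and per-record figures follow from the raw counts. The short check below recomputes them, assuming that the missing-cell percentage is taken over all cells (variables times observations) and that 1 KiB is 1024 bytes; the variable names are illustrative only.

```python
# Figures copied from the dataset statistics above.
n_variables = 5
n_observations = 6550
missing_cells = 3
total_bytes = 256.0 * 1024  # 256.0 KiB, assuming 1 KiB = 1024 bytes

# Assumption: the missing percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average memory footprint of one observation.
avg_record_bytes = total_bytes / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.4f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the missing share is well below the 0.1% threshold shown in the report, and the average record size rounds to the reported 40.0 B.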

Variable types

Categorical: 1
Text: 4

Alerts

소방용수종류 is highly imbalanced (61.5%). Alert type: Imbalance
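
As far as I understand, ydata-profiling derives this kind of imbalance score as one minus the normalised Shannon entropy of the class counts, so 0% means perfectly balanced classes and 100% means a single class. The sketch below applies that formula to the category counts listed for 소방용수종류 in this report. The function name `imbalance_score` is illustrative, and the split of the last 11 observations into two categories (6 and 5) is an assumption, because the report only gives their combined count.

```python
from math import log2


def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (bits) of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * log2(c / total) for c in counts if c > 0)
    return 1 - entropy / log2(k)


# Counts from the Common Values table; the final "6, 5" split is assumed.
counts = [3327, 2880, 136, 50, 44, 29, 24, 22, 18, 9, 6, 5]
score = imbalance_score(counts)
print(f"imbalance: {score:.1%}")
```

With these inputs the score lands close to the reported 61.5%; other ways of splitting the last 11 observations change it only marginally.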

Reproduction

Analysis started: 2024-03-14 03:27:40.103856
Analysis finished: 2024-03-14 03:27:41.244783
Duration: 1.14 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

소방용수종류 (fire-water source type)
Categorical

IMBALANCE

Distinct: 12
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 51.3 KiB
지상식 | 3327
지하식 | 2880
급수탑 | 136
비상소화장치 | 50
자연 | 44
Other values (7) | 113

Length

Max length: 6
Median length: 3
Mean length: 3.0242748
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 지상식
2nd row: 지상식
3rd row: 지상식
4th row: 지상식
5th row: 지상식

Common Values

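Value | Count | Frequency (%)
지상식 (above-ground hydrant) | 3327 | 50.8%
지하식 (underground hydrant) | 2880 | 44.0%
급수탑 (water supply tower) | 136 | 2.1%
비상소화장치 (emergency fire-extinguishing device) | 50 | 0.8%
자연 (natural source) | 44 | 0.7%
기타 (other) | 29 | 0.4%
저수조 (water storage tank) | 24 | 0.4%
비상화장치 | 22 | 0.3%
저수지 (reservoir) | 18 | 0.3%
소방용수시설 (fire-water facility) | 9 | 0.1%
Other values (2) | 11 | 0.2%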
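
Two of the labels above, 비상소화장치 and 비상화장치, differ by a single character and may be spelling variants of the same category. That is a guess from the labels alone, not something the report states. A minimal sketch of how such near-duplicates could be flagged with the standard library's `difflib`; the helper name `similar_label_pairs` and the 0.8 threshold are arbitrary choices for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar_label_pairs(labels, threshold=0.8):
    """Return (label_a, label_b, ratio) for every pair at or above the threshold."""
    pairs = []
    for a, b in combinations(labels, 2):
        ratio = SequenceMatcher(None, a, b).ratio()
        if ratio >= threshold:
            pairs.append((a, b, round(ratio, 3)))
    return pairs


# The ten category labels that the Common Values table lists by name.
labels = ["지상식", "지하식", "급수탑", "비상소화장치", "자연",
          "기타", "저수조", "비상화장치", "저수지", "소방용수시설"]

pairs = similar_label_pairs(labels)
for a, b, ratio in pairs:
    print(a, b, ratio)
```

A flagged pair is only a candidate for review. Whether the two labels should be merged depends on the source data and its owner.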

Length

Figure: Histogram of lengths of the category

소화전 번호 (fire hydrant number)
Text

Distinct: 6547
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 51.3 KiB

Length

Max length: 20
Median length: 12
Mean length: 11.966107
Min length: 9
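
These length figures summarise the number of characters per value. A minimal sketch of how such a summary could be computed with the standard library follows. It uses only the five sample values shown for this variable, so the result describes those five rows rather than the full column, and the helper name `length_summary` is illustrative.

```python
from statistics import mean, median


def length_summary(values):
    """Summarise string lengths (counted in Unicode code points)."""
    lengths = [len(v) for v in values]
    return {
        "max": max(lengths),
        "median": median(lengths),
        "mean": mean(lengths),
        "min": min(lengths),
    }


# The five sample rows listed for this variable.
sample = [
    "덕진-아중-지상-058",
    "익산-다송-지상17",
    "익산-다송-지상16",
    "익산-다송-지상15",
    "익산-다송-지상14",
]

summary = length_summary(sample)
print(summary)
```

Note that `len` counts code points, so each precomposed Hangul syllable counts as one character.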

Characters and Unicode

Total characters: 78378
Distinct characters: 108
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
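
To illustrate the idea, the sketch below counts Unicode general categories for one sample value using Python's `unicodedata` module. It covers categories only, since the standard library does not expose script or block properties, and the helper name `category_counts` is illustrative.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters by Unicode general category code (e.g. Lo, Nd, Pd)."""
    return Counter(unicodedata.category(ch) for ch in text)


# Lo = Other Letter, Pd = Dash Punctuation, Nd = Decimal Number
counts = category_counts("덕진-아중-지상-058")
print(counts)
```

For this value the six Hangul syllables fall under Other Letter, the three hyphens under Dash Punctuation, and the three digits under Decimal Number, the same category names used in the report's tables.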

Unique

Unique: 6544
Unique (%): 99.9%
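
"Distinct" and "Unique" measure different things: distinct counts the different values present, while unique counts the values that occur exactly once. With 6550 rows, 6547 distinct values, and 6544 unique values, the numbers are consistent with three values each occurring twice (6544 + 3 = 6547 distinct; 6544 + 3 × 2 = 6550 rows). The sketch below shows the distinction on made-up identifiers; the helper name and the sample values are hypothetical.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    freq = Counter(values)
    distinct = len(freq)
    unique = sum(1 for n in freq.values() if n == 1)
    return distinct, unique


# Hypothetical identifiers: two of the four values are repeated.
values = ["A-001", "A-002", "A-002", "A-003", "A-004", "A-004"]
distinct, unique = distinct_and_unique(values)
print(distinct, unique)
```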

Sample

1st row: 덕진-아중-지상-058
2nd row: 익산-다송-지상17
3rd row: 익산-다송-지상16
4th row: 익산-다송-지상15
5th row: 익산-다송-지상14

Value | Count | Frequency (%)
무진장-마령 | 5 | 0.1%
익산-모현-지하-043 | 2 | < 0.1%
무진장-진안-비상-19 | 2 | < 0.1%
남원-식정-지하-013 | 2 | < 0.1%
군산-비응-급수-001 | 1 | < 0.1%
군산-대야-지하-092 | 1 | < 0.1%
군산-대야-지하-094 | 1 | < 0.1%
군산-대야-지하-095 | 1 | < 0.1%
군산-대야-지하-096 | 1 | < 0.1%
군산-대야-지하-097 | 1 | < 0.1%
Other values (6539) | 6539 | 99.7%

Most occurring characters

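Value | Count | Frequency (%)
- | 19647 | 25.1%
(unreadable) | 6493 | 8.3%
0 | 5750 | 7.3%
(unreadable) | 3444 | 4.4%
(unreadable) | 3341 | 4.3%
(unreadable) | 2863 | 3.7%
1 | 2859 | 3.6%
2 | 1969 | 2.5%
3 | 1526 | 1.9%
(unreadable) | 1457 | 1.9%
Other values (98) | 29029 | 37.0%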

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 40086 | 51.1%
Dash Punctuation | 19647 | 25.1%
Decimal Number | 18632 | 23.8%
Space Separator | 13 | < 0.1%

Most frequent character per category

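Other Letter
Value | Count | Frequency (%)
(unreadable) | 6493 | 16.2%
(unreadable) | 3444 | 8.6%
(unreadable) | 3341 | 8.3%
(unreadable) | 2863 | 7.1%
(unreadable) | 1457 | 3.6%
(unreadable) | 1098 | 2.7%
(unreadable) | 1077 | 2.7%
(unreadable) | 1002 | 2.5%
(unreadable) | 981 | 2.4%
(unreadable) | 962 | 2.4%
Other values (86) | 17368 | 43.3%

Decimal Number
Value | Count | Frequency (%)
0 | 5750 | 30.9%
1 | 2859 | 15.3%
2 | 1969 | 10.6%
3 | 1526 | 8.2%
4 | 1314 | 7.1%
5 | 1169 | 6.3%
6 | 1088 | 5.8%
7 | 1059 | 5.7%
8 | 980 | 5.3%
9 | 918 | 4.9%

Dash Punctuation
Value | Count | Frequency (%)
- | 19647 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 13 | 100.0%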

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 40086 | 51.1%
Common | 38292 | 48.9%

Most frequent character per script

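Hangul
Value | Count | Frequency (%)
(unreadable) | 6493 | 16.2%
(unreadable) | 3444 | 8.6%
(unreadable) | 3341 | 8.3%
(unreadable) | 2863 | 7.1%
(unreadable) | 1457 | 3.6%
(unreadable) | 1098 | 2.7%
(unreadable) | 1077 | 2.7%
(unreadable) | 1002 | 2.5%
(unreadable) | 981 | 2.4%
(unreadable) | 962 | 2.4%
Other values (86) | 17368 | 43.3%

Common
Value | Count | Frequency (%)
- | 19647 | 51.3%
0 | 5750 | 15.0%
1 | 2859 | 7.5%
2 | 1969 | 5.1%
3 | 1526 | 4.0%
4 | 1314 | 3.4%
5 | 1169 | 3.1%
6 | 1088 | 2.8%
7 | 1059 | 2.8%
8 | 980 | 2.6%
Other values (2) | 931 | 2.4%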

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 40086 | 51.1%
ASCII | 38292 | 48.9%

Most frequent character per block

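ASCII
Value | Count | Frequency (%)
- | 19647 | 51.3%
0 | 5750 | 15.0%
1 | 2859 | 7.5%
2 | 1969 | 5.1%
3 | 1526 | 4.0%
4 | 1314 | 3.4%
5 | 1169 | 3.1%
6 | 1088 | 2.8%
7 | 1059 | 2.8%
8 | 980 | 2.6%
Other values (2) | 931 | 2.4%

Hangul
Value | Count | Frequency (%)
(unreadable) | 6493 | 16.2%
(unreadable) | 3444 | 8.6%
(unreadable) | 3341 | 8.3%
(unreadable) | 2863 | 7.1%
(unreadable) | 1457 | 3.6%
(unreadable) | 1098 | 2.7%
(unreadable) | 1077 | 2.7%
(unreadable) | 1002 | 2.5%
(unreadable) | 981 | 2.4%
(unreadable) | 962 | 2.4%
Other values (86) | 17368 | 43.3%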

위치 (location)
Text

Distinct: 6456
Distinct (%): 98.6%
Missing: 3
Missing (%): < 0.1%
Memory size: 51.3 KiB

Length

Max length: 64
Median length: 49
Mean length: 21.35543
Min length: 4

Characters and Unicode

Total characters: 139814
Distinct characters: 852
Distinct categories: 13
Distinct scripts: 4
Distinct blocks: 7

Unique

Unique: 6424
Unique (%): 98.1%

Sample

1st row: 전주시 덕진구 정언신로 1
2nd row: 익산시 함열읍 다송리 844-18
3rd row: 익산시 함열읍 다송리 723-152
4th row: 익산시 함열읍 다송리 723-179
5th row: 익산시 함열읍 다송리 724-19

Value | Count | Frequency (%)
(unreadable) | 2389 | 8.1%
(unreadable) | 642 | 2.2%
익산시 | 420 | 1.4%
전주시 | 310 | 1.0%
입구 | 264 | 0.9%
김제시 | 223 | 0.8%
맞은편 | 207 | 0.7%
인도 | 189 | 0.6%
정읍시 | 168 | 0.6%
사거리 | 165 | 0.6%
Other values (12807) | 24563 | 83.2%

Most occurring characters

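Value | Count | Frequency (%)
(space) | 23564 | 16.9%
1 | 4196 | 3.0%
(unreadable) | 3829 | 2.7%
(unreadable) | 3295 | 2.4%
( | 2860 | 2.0%
) | 2853 | 2.0%
2 | 2814 | 2.0%
(unreadable) | 2799 | 2.0%
(unreadable) | 2350 | 1.7%
- | 2255 | 1.6%
Other values (842) | 88999 | 63.7%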

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 88471 | 63.3%
Space Separator | 23564 | 16.9%
Decimal Number | 18576 | 13.3%
Open Punctuation | 2904 | 2.1%
Close Punctuation | 2895 | 2.1%
Dash Punctuation | 2255 | 1.6%
Uppercase Letter | 442 | 0.3%
Lowercase Letter | 300 | 0.2%
Other Punctuation | 295 | 0.2%
Math Symbol | 53 | < 0.1%
Other values (3) | 59 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(unreadable) | 3829 | 4.3%
(unreadable) | 3295 | 3.7%
(unreadable) | 2799 | 3.2%
(unreadable) | 2350 | 2.7%
(unreadable) | 2197 | 2.5%
(unreadable) | 2154 | 2.4%
(unreadable) | 1941 | 2.2%
(unreadable) | 1729 | 2.0%
(unreadable) | 1549 | 1.8%
(unreadable) | 1384 | 1.6%
Other values (761) | 65244 | 73.7%

Uppercase Letter
Value | Count | Frequency (%)
M | 43 | 9.7%
S | 43 | 9.7%
G | 42 | 9.5%
T | 39 | 8.8%
A | 37 | 8.4%
K | 33 | 7.5%
C | 31 | 7.0%
L | 28 | 6.3%
P | 26 | 5.9%
B | 23 | 5.2%
Other values (14) | 97 | 21.9%

Lowercase Letter
Value | Count | Frequency (%)
m | 196 | 65.3%
k | 13 | 4.3%
s | 10 | 3.3%
e | 9 | 3.0%
i | 8 | 2.7%
a | 8 | 2.7%
o | 7 | 2.3%
c | 7 | 2.3%
t | 7 | 2.3%
g | 6 | 2.0%
Other values (12) | 29 | 9.7%

Decimal Number
Value | Count | Frequency (%)
1 | 4196 | 22.6%
2 | 2814 | 15.1%
3 | 2132 | 11.5%
4 | 1654 | 8.9%
0 | 1642 | 8.8%
5 | 1607 | 8.7%
7 | 1214 | 6.5%
6 | 1200 | 6.5%
8 | 1122 | 6.0%
9 | 995 | 5.4%

Other Punctuation
Value | Count | Frequency (%)
, | 142 | 48.1%
@ | 83 | 28.1%
. | 43 | 14.6%
/ | 11 | 3.7%
? | 9 | 3.1%
& | 4 | 1.4%
(unreadable) | 1 | 0.3%
: | 1 | 0.3%
! | 1 | 0.3%

Open Punctuation
Value | Count | Frequency (%)
( | 2860 | 98.5%
[ | 36 | 1.2%
(unreadable) | 7 | 0.2%
{ | 1 | < 0.1%

Close Punctuation
Value | Count | Frequency (%)
) | 2853 | 98.5%
] | 36 | 1.2%
(unreadable) | 5 | 0.2%
} | 1 | < 0.1%

Math Symbol
Value | Count | Frequency (%)
~ | 48 | 90.6%
(unreadable) | 5 | 9.4%

Other Symbol
Value | Count | Frequency (%)
(unreadable) | 21 | 95.5%
(unreadable) | 1 | 4.5%

Space Separator
Value | Count | Frequency (%)
(space) | 23564 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 2255 | 100.0%

Control
Value | Count | Frequency (%)
(unreadable) | 32 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 5 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 88491 | 63.3%
Common | 50580 | 36.2%
Latin | 742 | 0.5%
Han | 1 | < 0.1%

Most frequent character per script

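Hangul
Value | Count | Frequency (%)
(unreadable) | 3829 | 4.3%
(unreadable) | 3295 | 3.7%
(unreadable) | 2799 | 3.2%
(unreadable) | 2350 | 2.7%
(unreadable) | 2197 | 2.5%
(unreadable) | 2154 | 2.4%
(unreadable) | 1941 | 2.2%
(unreadable) | 1729 | 2.0%
(unreadable) | 1549 | 1.8%
(unreadable) | 1384 | 1.6%
Other values (761) | 65264 | 73.8%

Latin
Value | Count | Frequency (%)
m | 196 | 26.4%
M | 43 | 5.8%
S | 43 | 5.8%
G | 42 | 5.7%
T | 39 | 5.3%
A | 37 | 5.0%
K | 33 | 4.4%
C | 31 | 4.2%
L | 28 | 3.8%
P | 26 | 3.5%
Other values (36) | 224 | 30.2%

Common
Value | Count | Frequency (%)
(space) | 23564 | 46.6%
1 | 4196 | 8.3%
( | 2860 | 5.7%
) | 2853 | 5.6%
2 | 2814 | 5.6%
- | 2255 | 4.5%
3 | 2132 | 4.2%
4 | 1654 | 3.3%
0 | 1642 | 3.2%
5 | 1607 | 3.2%
Other values (24) | 5003 | 9.9%

Han
Value | Count | Frequency (%)
(unreadable) | 1 | 100.0%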

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 88450 | 63.3%
ASCII | 51303 | 36.7%
None | 34 | < 0.1%
Compat Jamo | 20 | < 0.1%
Arrows | 5 | < 0.1%
CJK | 1 | < 0.1%
Enclosed Alphanum | 1 | < 0.1%

Most frequent character per block

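ASCII
Value | Count | Frequency (%)
(space) | 23564 | 45.9%
1 | 4196 | 8.2%
( | 2860 | 5.6%
) | 2853 | 5.6%
2 | 2814 | 5.5%
- | 2255 | 4.4%
3 | 2132 | 4.2%
4 | 1654 | 3.2%
0 | 1642 | 3.2%
5 | 1607 | 3.1%
Other values (65) | 5726 | 11.2%

Hangul
Value | Count | Frequency (%)
(unreadable) | 3829 | 4.3%
(unreadable) | 3295 | 3.7%
(unreadable) | 2799 | 3.2%
(unreadable) | 2350 | 2.7%
(unreadable) | 2197 | 2.5%
(unreadable) | 2154 | 2.4%
(unreadable) | 1941 | 2.2%
(unreadable) | 1729 | 2.0%
(unreadable) | 1549 | 1.8%
(unreadable) | 1384 | 1.6%
Other values (756) | 65223 | 73.7%

None
Value | Count | Frequency (%)
(unreadable) | 21 | 61.8%
(unreadable) | 7 | 20.6%
(unreadable) | 5 | 14.7%
(unreadable) | 1 | 2.9%

Compat Jamo
Value | Count | Frequency (%)
(unreadable) | 14 | 70.0%
(unreadable) | 4 | 20.0%
(unreadable) | 1 | 5.0%
(unreadable) | 1 | 5.0%

Arrows
Value | Count | Frequency (%)
(unreadable) | 5 | 100.0%

CJK
Value | Count | Frequency (%)
(unreadable) | 1 | 100.0%

Enclosed Alphanum
Value | Count | Frequency (%)
(unreadable) | 1 | 100.0%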

X좌표 (X coordinate)
Text

Distinct: 6410
Distinct (%): 97.9%
Missing: 0
Missing (%): 0.0%
Memory size: 51.3 KiB

Length

Max length: 20
Median length: 19
Mean length: 11.288397
Min length: 5

Characters and Unicode

Total characters: 73939
Distinct characters: 16
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 6300
Unique (%): 96.2%

Sample

1st row: 35.829587
2nd row: 36.03332534
3rd row: 36.03417928
4th row: 36.03492465
5th row: 36.03444839

Value | Count | Frequency (%)
36.0587188 | 8 | 0.1%
36.0808156 | 7 | 0.1%
35.4405251 | 6 | 0.1%
35.42030842 | 5 | 0.1%
35.9393065 | 4 | 0.1%
35.9432107 | 4 | 0.1%
35.9435472 | 4 | 0.1%
35.4163346 | 3 | < 0.1%
36.1149916 | 3 | < 0.1%
35.963600 | 3 | < 0.1%
Other values (6399) | 6505 | 99.3%

Most occurring characters

Value | Count | Frequency (%)
3 | 11871 | 16.1%
5 | 11543 | 15.6%
9 | 6870 | 9.3%
. | 6551 | 8.9%
8 | 6163 | 8.3%
6 | 5798 | 7.8%
4 | 5531 | 7.5%
7 | 5247 | 7.1%
1 | 4930 | 6.7%
2 | 4796 | 6.5%
Other values (6) | 4639 | 6.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 67318 | 91.0%
Other Punctuation | 6557 | 8.9%
Space Separator | 62 | 0.1%
Other Letter | 2 | < 0.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
3 | 11871 | 17.6%
5 | 11543 | 17.1%
9 | 6870 | 10.2%
8 | 6163 | 9.2%
6 | 5798 | 8.6%
4 | 5531 | 8.2%
7 | 5247 | 7.8%
1 | 4930 | 7.3%
2 | 4796 | 7.1%
0 | 4569 | 6.8%

Other Punctuation
Value | Count | Frequency (%)
. | 6551 | 99.9%
? | 5 | 0.1%
, | 1 | < 0.1%

Other Letter
Value | Count | Frequency (%)
(unreadable) | 1 | 50.0%
(unreadable) | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 62 | 100.0%
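
The category counts show that a few values in this column contain spaces, question marks, a comma, or letters, which is presumably why a coordinate column was profiled as Text rather than as a number. A minimal sketch of a tolerant parser that converts clean values and rejects the rest follows. The function name, the bounds, and the malformed inputs are all made up for illustration, and the sample values in this report suggest that this column holds latitude-like numbers (around 35 to 36).

```python
def parse_coordinate(raw, lower, upper):
    """Return raw as a float if it parses and lies within [lower, upper], else None."""
    try:
        value = float(raw.strip())
    except (ValueError, AttributeError):
        # Not a plain decimal number (stray '?', ',', letters) or not a string.
        return None
    return value if lower <= value <= upper else None


# First two values resemble the report's samples; the rest are invented bad inputs.
raw_values = ["35.829587", " 36.03332534 ", "36.0334?", "36,0334", "", None]

# Hypothetical plausibility bounds for latitude in this region.
parsed = [parse_coordinate(v, 33.0, 39.0) for v in raw_values]
print(parsed)
```

Values that come back as `None` are best reviewed manually rather than silently dropped, since some may be recoverable (for example a decimal comma).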

Most occurring scripts

Value | Count | Frequency (%)
Common | 73937 | > 99.9%
Hangul | 2 | < 0.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
3 | 11871 | 16.1%
5 | 11543 | 15.6%
9 | 6870 | 9.3%
. | 6551 | 8.9%
8 | 6163 | 8.3%
6 | 5798 | 7.8%
4 | 5531 | 7.5%
7 | 5247 | 7.1%
1 | 4930 | 6.7%
2 | 4796 | 6.5%
Other values (4) | 4637 | 6.3%

Hangul
Value | Count | Frequency (%)
(unreadable) | 1 | 50.0%
(unreadable) | 1 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 73937 | > 99.9%
Hangul | 2 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
3 | 11871 | 16.1%
5 | 11543 | 15.6%
9 | 6870 | 9.3%
. | 6551 | 8.9%
8 | 6163 | 8.3%
6 | 5798 | 7.8%
4 | 5531 | 7.5%
7 | 5247 | 7.1%
1 | 4930 | 6.7%
2 | 4796 | 6.5%
Other values (4) | 4637 | 6.3%

Hangul
Value | Count | Frequency (%)
(unreadable) | 1 | 50.0%
(unreadable) | 1 | 50.0%

Y좌표 (Y coordinate)
Text

Distinct: 6419
Distinct (%): 98.0%
Missing: 0
Missing (%): 0.0%
Memory size: 51.3 KiB

Length

Max length: 20
Median length: 19
Mean length: 12.172672
Min length: 6

Characters and Unicode

Total characters: 79731
Distinct characters: 15
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 6321
Unique (%): 96.5%

Sample

1st row: 127.154083
2nd row: 126.96262989
3rd row: 126.96100709
4th row: 126.95952861
5th row: 126.95703522

Value | Count | Frequency (%)
126.9423397 | 8 | 0.1%
126.9589255 | 7 | 0.1%
127.2395143 | 6 | 0.1%
127.40548631 | 5 | 0.1%
126.9535357 | 4 | 0.1%
126.732498 | 4 | 0.1%
126.9542581 | 4 | 0.1%
126.951507 | 4 | 0.1%
126.9504841 | 3 | < 0.1%
126.961744 | 3 | < 0.1%
Other values (6410) | 6503 | 99.3%

Most occurring characters

Value | Count | Frequency (%)
1 | 12806 | 16.1%
2 | 11335 | 14.2%
6 | 8837 | 11.1%
7 | 8684 | 10.9%
. | 6551 | 8.2%
5 | 5487 | 6.9%
8 | 5479 | 6.9%
9 | 5477 | 6.9%
3 | 5188 | 6.5%
4 | 5106 | 6.4%
Other values (5) | 4781 | 6.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 73139 | 91.7%
Other Punctuation | 6555 | 8.2%
Space Separator | 35 | < 0.1%
Other Letter | 2 | < 0.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 12806 | 17.5%
2 | 11335 | 15.5%
6 | 8837 | 12.1%
7 | 8684 | 11.9%
5 | 5487 | 7.5%
8 | 5479 | 7.5%
9 | 5477 | 7.5%
3 | 5188 | 7.1%
4 | 5106 | 7.0%
0 | 4740 | 6.5%

Other Punctuation
Value | Count | Frequency (%)
. | 6551 | 99.9%
? | 4 | 0.1%

Other Letter
Value | Count | Frequency (%)
(unreadable) | 1 | 50.0%
(unreadable) | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 35 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 79729 | > 99.9%
Hangul | 2 | < 0.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 12806 | 16.1%
2 | 11335 | 14.2%
6 | 8837 | 11.1%
7 | 8684 | 10.9%
. | 6551 | 8.2%
5 | 5487 | 6.9%
8 | 5479 | 6.9%
9 | 5477 | 6.9%
3 | 5188 | 6.5%
4 | 5106 | 6.4%
Other values (3) | 4779 | 6.0%

Hangul
Value | Count | Frequency (%)
(unreadable) | 1 | 50.0%
(unreadable) | 1 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 79729 | > 99.9%
Hangul | 2 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 12806 | 16.1%
2 | 11335 | 14.2%
6 | 8837 | 11.1%
7 | 8684 | 10.9%
. | 6551 | 8.2%
5 | 5487 | 6.9%
8 | 5479 | 6.9%
9 | 5477 | 6.9%
3 | 5188 | 6.5%
4 | 5106 | 6.4%
Other values (3) | 4779 | 6.0%

Hangul
Value | Count | Frequency (%)
(unreadable) | 1 | 50.0%
(unreadable) | 1 | 50.0%

Missing values

Figure: A simple visualization of nullity by column.
Figure: The nullity matrix is a data-dense display that lets you quickly spot patterns in data completeness.
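
The per-column nullity that these plots visualise can be approximated with a simple count. The sketch below uses the report's column names, but the second row (with a missing 위치) is invented for illustration, the helper name is hypothetical, and treating empty strings as missing is my assumption rather than the profiler's documented behaviour.

```python
def missing_per_column(rows, columns):
    """Count values that are None or blank for each column."""
    counts = {c: 0 for c in columns}
    for row in rows:
        for c in columns:
            value = row.get(c)
            if value is None or str(value).strip() == "":
                counts[c] += 1
    return counts


columns = ["소방용수종류", "소화전 번호", "위치", "X좌표", "Y좌표"]
rows = [
    {"소방용수종류": "지상식", "소화전 번호": "덕진-아중-지상-058",
     "위치": "전주시 덕진구 정언신로 1", "X좌표": "35.829587", "Y좌표": "127.154083"},
    # Hypothetical record with no location, to show a non-zero count.
    {"소방용수종류": "지상식", "소화전 번호": "익산-다송-지상17",
     "위치": None, "X좌표": "36.03332534", "Y좌표": "126.96262989"},
]

missing = missing_per_column(rows, columns)
print(missing)
```

In the profiled dataset the only column with missing values is 위치, with 3 missing cells.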

Sample

First rows

Index | 소방용수종류 | 소화전 번호 | 위치 | X좌표 | Y좌표
0 | 지상식 | 덕진-아중-지상-058 | 전주시 덕진구 정언신로 1 | 35.829587 | 127.154083
1 | 지상식 | 익산-다송-지상17 | 익산시 함열읍 다송리 844-18 | 36.03332534 | 126.96262989
2 | 지상식 | 익산-다송-지상16 | 익산시 함열읍 다송리 723-152 | 36.03417928 | 126.96100709
3 | 지상식 | 익산-다송-지상15 | 익산시 함열읍 다송리 723-179 | 36.03492465 | 126.95952861
4 | 지상식 | 익산-다송-지상14 | 익산시 함열읍 다송리 724-19 | 36.03444839 | 126.95703522
5 | 지상식 | 익산-다송-지상13 | 익산시 함열읍 다송리 226 | 36.03384084 | 126.95823523
6 | 지상식 | 익산-다송-지상12 | 익산시 함열읍 다송리 723-113 | 36.03292576 | 126.96007507
7 | 지상식 | 익산-다송-지상11 | 익산시 함열읍 다송리 723-162 | 36.03160913 | 126.96187301
8 | 지상식 | 익산-다송-지상10 | 익산시 함열읍 다송리 843-8 | 36.030541 | 126.96084111
9 | 지상식 | 익산-다송-지상9 | 익산시 함열읍 다송리 723-133 | 36.03158577 | 126.95895171

Last rows

Index | 소방용수종류 | 소화전 번호 | 위치 | X좌표 | Y좌표
6540 | 지상식 | 무진장-진안-지상-60 | 반월리 진안홍삼한방농공단지내 가로등(S32-22) 옆 | 35.77060338 | 127.43501406
6541 | 지상식 | 무진장-진안-지상-63 | 상전면 문화길 16-32 (박연생씨댁에서 30m 위쪽) | 35.822759619913995 | 127.49012229032815
6542 | 지상식 | 무진장-진안-지상-64 | 반월리 진안홍삼한방농공단지내진안홍삼연구소 뒤편 도로상 | 35.77212208 | 127.43784217
6543 | 지상식 | 무진장-진안-지상-65 | 수동리 내송마을 회관 앞 | 35.835056374780834 | 127.5135616119951
6544 | 지상식 | 무진장-진안-지상-66 | 군상리 406-33(진안철물점 뒤) | 35.79268679 | 127.43459554
6545 | 지상식 | 무진장-진안-지상-67 | 군상리 406-11 은성방앗간 하천 건너편 | 35.793297 | 127.428349
6546 | 지하식 | 무진장-진안-지하-01 | 군상리 진안소방파출소 앞(폐쇄) | 35.79626603267129 | 127.43486946726094
6547 | 지하식 | 무진장-진안-지하-02 | 신괴리 485번지 괴정마을회관 앞 | 35.88751407 | 127.54020645
6548 | 지하식 | 무진장-진안-지하-03 | 신괴리 456 소희섭씨댁앞 | 35.8859117 | 127.53929257
6549 | 지하식 | 무진장-진안-지하-04 | 신괴리 748번지 김광호씨댁 앞 | 35.86957258 | 127.54002097