Overview

Dataset statistics

Number of variables: 3
Number of observations: 6942
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 10
Duplicate rows (%): 0.1%
Total size in memory: 162.8 KiB
Average record size in memory: 24.0 B

Variable types

Categorical: 1
Text: 2

Alerts

Dataset has 10 (0.1%) duplicate rows [Duplicates]
소방용수종류 (fire-fighting water source type) is highly imbalanced (60.6%) [Imbalance]
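
The 60.6% imbalance figure can be approximated from the category counts listed under Common Values. The sketch below assumes the score is one minus the normalized Shannon entropy of the class distribution; that formula is an assumption, not something the report states, although it lands close to the reported value. The count of 1 for the eleventh, unlisted category is also inferred (6942 rows minus the 6941 rows accounted for by the ten listed values).

```python
import math

# Category counts from the Common Values table. The eleventh category is not
# listed in the report; its count of 1 is inferred from the 6942-row total.
counts = [3662, 2906, 130, 119, 47, 26, 25, 20, 4, 2, 1]


def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform column, higher as one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no distribution to compare against
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


score = imbalance_score(counts)
print(f"{score:.1%}")  # expected to be close to the 60.6% shown in the alert
```

Two categories (지상식 and 지하식) cover about 95% of rows, which is why the score is high even though eleven distinct values exist.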

Reproduction

Analysis started: 2024-03-14 00:18:52.595831
Analysis finished: 2024-03-14 00:18:53.207436
Duration: 0.61 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

소방용수종류 (fire-fighting water source type)
Categorical

IMBALANCE

Distinct: 11
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 54.4 KiB

Value | Count
지상식 | 3662
지하식 | 2906
급수탑 | 130
기타 | 119
저수지 | 47
Other values (6) | 78

Length

Max length: 6
Median length: 3
Mean length: 2.9884759
Min length: 2

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 지상식
2nd row: 급수탑
3rd row: 급수탑
4th row: 지상식
5th row: 지상식

Common Values

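Value | Count | Frequency (%)
지상식 (above-ground hydrant) | 3662 | 52.8%
지하식 (underground hydrant) | 2906 | 41.9%
급수탑 (water supply tower) | 130 | 1.9%
기타 (other) | 119 | 1.7%
저수지 (reservoir) | 47 | 0.7%
저수조 (water storage tank) | 26 | 0.4%
자연 (natural) | 25 | 0.4%
비상소화장치 (emergency fire-extinguishing device) | 20 | 0.3%
자연수리 (natural water source) | 4 | 0.1%
관정 (tube well) | 2 | < 0.1%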

Length

[Figure: Histogram of lengths of the category]

소화전 번호 (hydrant number)
Text

Distinct: 6859
Distinct (%): 98.8%
Missing: 0
Missing (%): 0.0%
Memory size: 54.4 KiB

Length

Max length: 20
Median length: 12
Mean length: 11.976664
Min length: 6

Characters and Unicode

Total characters: 83142
Distinct characters: 118
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
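
As an illustration of how such per-character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own implementation. The two-letter codes map to the names used in this report: `Lo` is Other Letter, `Pd` is Dash Punctuation, and `Nd` is Decimal Number. The helper name `category_counts` is hypothetical, and the input strings are the first two sample values of this column.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories (e.g. 'Lo', 'Nd') over all characters."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


# First two sample values of the column, as shown in the Sample section.
sample = ["덕진-고산-지상-021", "덕진-고산-급수-002"]
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```

Script and block names (Hangul, Common, ASCII) are not exposed by `unicodedata`, so they would need a separate lookup table or a third-party package.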

Unique

Unique: 6777
Unique (%): 97.6%

Sample

1st row: 덕진-고산-지상-021
2nd row: 덕진-고산-급수-002
3rd row: 덕진-고산-급수-002
4th row: 덕진-고산-지상-022
5th row: 덕진-고산-지상-001

Value | Count | Frequency (%)
무진장-마령 | 5 | 0.1%
덕진-삼례-지상-024 | 3 | < 0.1%
완산-효자-급수-008 | 2 | < 0.1%
군산-대야-지상-013 | 2 | < 0.1%
남원-금지-지상-046 | 2 | < 0.1%
남원-식정-지상-100 | 2 | < 0.1%
덕진-팔복-지상-028 | 2 | < 0.1%
남원-식정-지하-050 | 2 | < 0.1%
완산-효자-지하-048 | 2 | < 0.1%
sdafsdfsd12312 | 2 | < 0.1%
Other values (6854) | 6927 | 99.7%
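
Most values in this column appear to follow a four-part layout such as 덕진-고산-지상-021, while tokens like sdafsdfsd12312 and 무진장-마령 do not. A simple pattern check is one way to flag such entries for review. The regular expression below is an assumed pattern inferred only from the samples in this report, not a documented numbering rule, and `malformed_ids` is a hypothetical helper name.

```python
import re

# Assumed layout: three Hangul segments and a numeric suffix, joined by "-".
HYDRANT_ID = re.compile(r"[가-힣]+-[가-힣]+-[가-힣]+-\d+")


def malformed_ids(values):
    """Return the values that do not fully match the assumed ID layout."""
    return [v for v in values if HYDRANT_ID.fullmatch(v) is None]


ids = ["덕진-고산-지상-021", "무진장-진안-지상-78", "무진장-마령", "sdafsdfsd12312"]
bad = malformed_ids(ids)
print(bad)
```

A flagged value is not necessarily wrong; the check only narrows down which rows deserve a manual look.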

Most occurring characters

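Value | Count | Frequency (%)
- | 20806 | 25.0%
(character missing) | 6682 | 8.0%
0 | 6535 | 7.9%
(character missing) | 3790 | 4.6%
(character missing) | 3461 | 4.2%
(character missing) | 3059 | 3.7%
1 | 3006 | 3.6%
2 | 1889 | 2.3%
(character missing) | 1705 | 2.1%
3 | 1554 | 1.9%
Other values (108) | 30655 | 36.9%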

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 42586 | 51.2%
Dash Punctuation | 20806 | 25.0%
Decimal Number | 19709 | 23.7%
Lowercase Letter | 25 | < 0.1%
Space Separator | 16 | < 0.1%

Most frequent character per category

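Other Letter
Value | Count | Frequency (%)
(character missing) | 6682 | 15.7%
(character missing) | 3790 | 8.9%
(character missing) | 3461 | 8.1%
(character missing) | 3059 | 7.2%
(character missing) | 1705 | 4.0%
(character missing) | 1150 | 2.7%
(character missing) | 1132 | 2.7%
(character missing) | 1053 | 2.5%
(character missing) | 1034 | 2.4%
(character missing) | 1002 | 2.4%
Other values (92) | 18518 | 43.5%

Decimal Number
Value | Count | Frequency (%)
0 | 6535 | 33.2%
1 | 3006 | 15.3%
2 | 1889 | 9.6%
3 | 1554 | 7.9%
4 | 1362 | 6.9%
5 | 1243 | 6.3%
6 | 1124 | 5.7%
7 | 1064 | 5.4%
8 | 993 | 5.0%
9 | 939 | 4.8%

Lowercase Letter
Value | Count | Frequency (%)
d | 8 | 32.0%
s | 8 | 32.0%
f | 6 | 24.0%
a | 3 | 12.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 20806 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 16 | 100.0%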

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 42586 | 51.2%
Common | 40531 | 48.7%
Latin | 25 | < 0.1%

Most frequent character per script

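Hangul
Value | Count | Frequency (%)
(character missing) | 6682 | 15.7%
(character missing) | 3790 | 8.9%
(character missing) | 3461 | 8.1%
(character missing) | 3059 | 7.2%
(character missing) | 1705 | 4.0%
(character missing) | 1150 | 2.7%
(character missing) | 1132 | 2.7%
(character missing) | 1053 | 2.5%
(character missing) | 1034 | 2.4%
(character missing) | 1002 | 2.4%
Other values (92) | 18518 | 43.5%

Common
Value | Count | Frequency (%)
- | 20806 | 51.3%
0 | 6535 | 16.1%
1 | 3006 | 7.4%
2 | 1889 | 4.7%
3 | 1554 | 3.8%
4 | 1362 | 3.4%
5 | 1243 | 3.1%
6 | 1124 | 2.8%
7 | 1064 | 2.6%
8 | 993 | 2.4%
Other values (2) | 955 | 2.4%

Latin
Value | Count | Frequency (%)
d | 8 | 32.0%
s | 8 | 32.0%
f | 6 | 24.0%
a | 3 | 12.0%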

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 42586 | 51.2%
ASCII | 40556 | 48.8%

Most frequent character per block

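ASCII
Value | Count | Frequency (%)
- | 20806 | 51.3%
0 | 6535 | 16.1%
1 | 3006 | 7.4%
2 | 1889 | 4.7%
3 | 1554 | 3.8%
4 | 1362 | 3.4%
5 | 1243 | 3.1%
6 | 1124 | 2.8%
7 | 1064 | 2.6%
8 | 993 | 2.4%
Other values (6) | 980 | 2.4%

Hangul
Value | Count | Frequency (%)
(character missing) | 6682 | 15.7%
(character missing) | 3790 | 8.9%
(character missing) | 3461 | 8.1%
(character missing) | 3059 | 7.2%
(character missing) | 1705 | 4.0%
(character missing) | 1150 | 2.7%
(character missing) | 1132 | 2.7%
(character missing) | 1053 | 2.5%
(character missing) | 1034 | 2.4%
(character missing) | 1002 | 2.4%
Other values (92) | 18518 | 43.5%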

위치 (location)
Text

Distinct: 6838
Distinct (%): 98.5%
Missing: 0
Missing (%): 0.0%
Memory size: 54.4 KiB

Length

Max length: 97
Median length: 56
Mean length: 22.957073
Min length: 2

Characters and Unicode

Total characters: 159368
Distinct characters: 867
Distinct categories: 13
Distinct scripts: 4
Distinct blocks: 8
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6796
Unique (%): 97.9%

Sample

1st row: 화산면 화산로 830-14 (화산면사무소 앞)
2nd row: 장선리 577
3rd row: 동상면 동상로 1450번지 앞
4th row: 소농리 비봉면사무소 앞
5th row: 읍내리 880 부부한의원앞

Value | Count | Frequency (%)
(word missing) | 2573 | 7.8%
(word missing) | 716 | 2.2%
익산시 | 423 | 1.3%
전주시 | 357 | 1.1%
입구 | 288 | 0.9%
고창군 | 266 | 0.8%
맞은편 | 235 | 0.7%
김제시 | 233 | 0.7%
덕진구 | 221 | 0.7%
인도 | 207 | 0.6%
Other values (14114) | 27582 | 83.3%

Most occurring characters

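Value | Count | Frequency (%)
(space) | 26790 | 16.8%
1 | 4861 | 3.1%
( | 3861 | 2.4%
) | 3848 | 2.4%
(character missing) | 3835 | 2.4%
(character missing) | 3580 | 2.2%
2 | 3140 | 2.0%
(character missing) | 2816 | 1.8%
(character missing) | 2780 | 1.7%
(character missing) | 2746 | 1.7%
Other values (857) | 101111 | 63.4%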

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 99660 | 62.5%
Space Separator | 26805 | 16.8%
Decimal Number | 21263 | 13.3%
Open Punctuation | 3918 | 2.5%
Close Punctuation | 3903 | 2.4%
Dash Punctuation | 2406 | 1.5%
Uppercase Letter | 529 | 0.3%
Other Punctuation | 410 | 0.3%
Lowercase Letter | 336 | 0.2%
Control | 57 | < 0.1%
Other values (3) | 81 | 0.1%

Most frequent character per category

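Other Letter
Value | Count | Frequency (%)
(character missing) | 3835 | 3.8%
(character missing) | 3580 | 3.6%
(character missing) | 2816 | 2.8%
(character missing) | 2780 | 2.8%
(character missing) | 2746 | 2.8%
(character missing) | 2638 | 2.6%
(character missing) | 2255 | 2.3%
(character missing) | 1900 | 1.9%
(character missing) | 1753 | 1.8%
(character missing) | 1508 | 1.5%
Other values (777) | 73849 | 74.1%

Uppercase Letter
Value | Count | Frequency (%)
S | 58 | 11.0%
G | 49 | 9.3%
M | 46 | 8.7%
C | 45 | 8.5%
T | 41 | 7.8%
K | 41 | 7.8%
A | 40 | 7.6%
L | 33 | 6.2%
B | 33 | 6.2%
P | 29 | 5.5%
Other values (14) | 114 | 21.6%

Lowercase Letter
Value | Count | Frequency (%)
m | 217 | 64.6%
s | 15 | 4.5%
k | 14 | 4.2%
t | 11 | 3.3%
e | 11 | 3.3%
a | 10 | 3.0%
i | 9 | 2.7%
c | 8 | 2.4%
o | 7 | 2.1%
n | 5 | 1.5%
Other values (11) | 29 | 8.6%

Decimal Number
Value | Count | Frequency (%)
1 | 4861 | 22.9%
2 | 3140 | 14.8%
3 | 2526 | 11.9%
4 | 1947 | 9.2%
5 | 1825 | 8.6%
0 | 1817 | 8.5%
7 | 1369 | 6.4%
6 | 1344 | 6.3%
8 | 1292 | 6.1%
9 | 1142 | 5.4%

Other Punctuation
Value | Count | Frequency (%)
, | 233 | 56.8%
@ | 84 | 20.5%
. | 75 | 18.3%
/ | 9 | 2.2%
& | 6 | 1.5%
(character missing) | 1 | 0.2%
: | 1 | 0.2%
! | 1 | 0.2%

Open Punctuation
Value | Count | Frequency (%)
( | 3861 | 98.5%
[ | 49 | 1.3%
(character missing) | 7 | 0.2%
{ | 1 | < 0.1%

Close Punctuation
Value | Count | Frequency (%)
) | 3848 | 98.6%
] | 49 | 1.3%
(character missing) | 5 | 0.1%
} | 1 | < 0.1%

Space Separator
Value | Count | Frequency (%)
(space) | 26790 | 99.9%
(non-ASCII space) | 15 | 0.1%

Math Symbol
Value | Count | Frequency (%)
~ | 48 | 90.6%
(character missing) | 5 | 9.4%

Other Symbol
Value | Count | Frequency (%)
(character missing) | 19 | 95.0%
(character missing) | 1 | 5.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 2406 | 100.0%

Control
Value | Count | Frequency (%)
(control character) | 57 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 8 | 100.0%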

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 99678 | 62.5%
Common | 58824 | 36.9%
Latin | 865 | 0.5%
Han | 1 | < 0.1%

Most frequent character per script

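Hangul
Value | Count | Frequency (%)
(character missing) | 3835 | 3.8%
(character missing) | 3580 | 3.6%
(character missing) | 2816 | 2.8%
(character missing) | 2780 | 2.8%
(character missing) | 2746 | 2.8%
(character missing) | 2638 | 2.6%
(character missing) | 2255 | 2.3%
(character missing) | 1900 | 1.9%
(character missing) | 1753 | 1.8%
(character missing) | 1508 | 1.5%
Other values (777) | 73867 | 74.1%

Latin
Value | Count | Frequency (%)
m | 217 | 25.1%
S | 58 | 6.7%
G | 49 | 5.7%
M | 46 | 5.3%
C | 45 | 5.2%
T | 41 | 4.7%
K | 41 | 4.7%
A | 40 | 4.6%
L | 33 | 3.8%
B | 33 | 3.8%
Other values (35) | 262 | 30.3%

Common
Value | Count | Frequency (%)
(space) | 26790 | 45.5%
1 | 4861 | 8.3%
( | 3861 | 6.6%
) | 3848 | 6.5%
2 | 3140 | 5.3%
3 | 2526 | 4.3%
- | 2406 | 4.1%
4 | 1947 | 3.3%
5 | 1825 | 3.1%
0 | 1817 | 3.1%
Other values (24) | 5803 | 9.9%

Han
Value | Count | Frequency (%)
(character missing) | 1 | 100.0%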

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 99625 | 62.5%
ASCII | 59655 | 37.4%
None | 47 | < 0.1%
Compat Jamo | 31 | < 0.1%
Arrows | 5 | < 0.1%
Jamo | 3 | < 0.1%
CJK | 1 | < 0.1%
Enclosed Alphanum | 1 | < 0.1%

Most frequent character per block

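ASCII
Value | Count | Frequency (%)
(space) | 26790 | 44.9%
1 | 4861 | 8.1%
( | 3861 | 6.5%
) | 3848 | 6.5%
2 | 3140 | 5.3%
3 | 2526 | 4.2%
- | 2406 | 4.0%
4 | 1947 | 3.3%
5 | 1825 | 3.1%
0 | 1817 | 3.0%
Other values (63) | 6634 | 11.1%

Hangul
Value | Count | Frequency (%)
(character missing) | 3835 | 3.8%
(character missing) | 3580 | 3.6%
(character missing) | 2816 | 2.8%
(character missing) | 2780 | 2.8%
(character missing) | 2746 | 2.8%
(character missing) | 2638 | 2.6%
(character missing) | 2255 | 2.3%
(character missing) | 1900 | 1.9%
(character missing) | 1753 | 1.8%
(character missing) | 1508 | 1.5%
Other values (767) | 73814 | 74.1%

None
Value | Count | Frequency (%)
(character missing) | 19 | 40.4%
(non-ASCII space) | 15 | 31.9%
(character missing) | 7 | 14.9%
(character missing) | 5 | 10.6%
(character missing) | 1 | 2.1%

Compat Jamo
Value | Count | Frequency (%)
(character missing) | 15 | 48.4%
(character missing) | 5 | 16.1%
(character missing) | 4 | 12.9%
(character missing) | 3 | 9.7%
(character missing) | 1 | 3.2%
(character missing) | 1 | 3.2%
(character missing) | 1 | 3.2%
(character missing) | 1 | 3.2%

Arrows
Value | Count | Frequency (%)
(character missing) | 5 | 100.0%

Jamo
Value | Count | Frequency (%)
(character missing) | 3 | 100.0%

CJK
Value | Count | Frequency (%)
(character missing) | 1 | 100.0%

Enclosed Alphanum
Value | Count | Frequency (%)
(character missing) | 1 | 100.0%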

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

First rows

Index | 소방용수종류 | 소화전 번호 | 위치
0 | 지상식 | 덕진-고산-지상-021 | 화산면 화산로 830-14 (화산면사무소 앞)
1 | 급수탑 | 덕진-고산-급수-002 | 장선리 577
2 | 급수탑 | 덕진-고산-급수-002 | 동상면 동상로 1450번지 앞
3 | 지상식 | 덕진-고산-지상-022 | 소농리 비봉면사무소 앞
4 | 지상식 | 덕진-고산-지상-001 | 읍내리 880 부부한의원앞
5 | 지상식 | 덕진-고산-지상-002 | 고산면 읍내리 고산로90 (십자약국 앞)
6 | 지상식 | 덕진-고산-지상-003 | 고산면 읍내3길 9
7 | 지상식 | 덕진-고산-지상-004 | 읍내리 860-2(고산119안전센터 앞)
8 | 지상식 | 덕진-고산-지상-005 | 읍내리 미소드림빌라 앞
9 | 지상식 | 덕진-고산-지상-006 | 읍내 6길11호 백입순앞

Last rows

Index | 소방용수종류 | 소화전 번호 | 위치
6932 | 지상식 | 무진장-진안-지상-140 | 진안군 주천면 중리길 44-1 (김영복씨댁 옆)
6933 | 지상식 | 무진장-진안-지상-116 | 진안군 진안읍 연장리 거북바위로 3길 18 (시케이푸드 옆 도로상)
6934 | 지상식 | 무진장-진안-지상-116 | 연장리 거북바위로 3길 18 (시케이푸드 옆 도로상)
6935 | 지상식 | 무진장-진안-지상-117 | 진안군 진안읍 연장리 거북바위로 3길 15-37 (한국고려홍삼조합 삼거리)
6936 | 지상식 | 무진장-진안-지상-118 | 진안군 진안읍 연장리 하평마을 하평길13 김일남 앞(하평마을)
6937 | 지상식 | 무진장-진안-지상-78 | 진안군 진안읍 연장리 대연장길 88 대연장마을회관(대연장마을)
6938 | 지상식 | 무진장-진안-지상-119 | 진안군 진안읍 연장리 연거로 50(원연장마을)
6939 | 지상식 | 무진장-진안-지상-120 | 진안군 용담면 방화길 8 방화마을회관
6940 | 지상식 | 무진장-진안-지상-121 | 진안군 용담면 왕두길 5 월계마을과 경계 도로 옆 밭
6941 | 지상식 | 무진장-진안-지상-122 | 진안군 안천면 율현길 70, 마을회관 앞

Duplicate rows

Most frequently occurring

Index | 소방용수종류 | 소화전 번호 | 위치 | # duplicates
0 | 기타 | 완산-교동-비상-026 | 전주시 완산구 자만동 1길 6 | 2
1 | 저수지 | sdafsdfsd12312 | test | 2
2 | 지상식 | 고창-고창-지상-132 | 고창군 고창읍 월곡뉴타운1길 60 | 2
3 | 지상식 | 남원-금지-지상-046 | 남원시 수지면 산정유암길 113(등동마을회관 앞) | 2
4 | 지상식 | 남원-인월-지상-032 | 남원시 산내면 백일길 7(백일마을회관 앞) | 2
5 | 지상식 | 남원-인월-지상-033 | 남원시 아영면 신지길 39(신지마을회관 앞) | 2
6 | 지상식 | 덕진-팔복-지상-028 | 덕진구 동산동 771 (효성C1공장 정문에서 좌측 100m) | 2
7 | 지상식 | 무진장-무주-지상-117 | 무주군 무풍면 철목길 124 | 2
8 | 지상식 | 무진장-진안-지상-139 | 진안군 주천면 산제길 36 (마을회관 앞) | 2
9 | 지상식 | 완산-교동-지상-029 | 상관면 수월길 11 (수월경로당 앞) | 2
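
A duplicate row here means that all three columns match. As a minimal sketch of that idea (not the profiler's own code), the snippet below groups identical rows with `collections.Counter`. The helper name `duplicate_groups` is hypothetical, and the four input rows are taken from the tables in this report.

```python
from collections import Counter

# Rows as (소방용수종류, 소화전 번호, 위치) tuples, copied from this report.
rows = [
    ("저수지", "sdafsdfsd12312", "test"),
    ("저수지", "sdafsdfsd12312", "test"),
    ("급수탑", "덕진-고산-급수-002", "장선리 577"),
    ("급수탑", "덕진-고산-급수-002", "동상면 동상로 1450번지 앞"),
]


def duplicate_groups(rows):
    """Map each fully repeated row to the number of times it occurs."""
    return {row: n for row, n in Counter(rows).items() if n > 1}


groups = duplicate_groups(rows)
for row, n in groups.items():
    print(row, n)
```

The two 급수탑 rows share a hydrant number but differ in location, so they are not counted as duplicates. A repeated hydrant number on its own would need a separate check on that single column.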