Overview

Dataset statistics

Number of variables: 4
Number of observations: 290
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 9.2 KiB
Average record size in memory: 32.5 B
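
The two memory figures are related by simple arithmetic: the average record size is the total size divided by the number of observations. A minimal sketch of that check follows; the helper name `average_record_size` is illustrative and not part of the profiler, and the inputs are the rounded values reported above.

```python
KIB = 1024  # bytes per kibibyte

def average_record_size(total_bytes: float, n_rows: int) -> float:
    """Average in-memory size of one row, in bytes."""
    return total_bytes / n_rows

n_rows = 290
avg_bytes = 32.5                      # rounded value from the report
total_kib = avg_bytes * n_rows / KIB  # back out the total from the average

print(f"{total_kib:.1f} KiB")                               # 9.2 KiB
print(f"{average_record_size(9.2 * KIB, n_rows):.1f} B")    # 32.5 B
```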

Variable types

Text: 3
Categorical: 1

Dataset

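Description: Data on fire protection system contractors in Chungcheongbuk-do, classified into fire protection system management, design, supervision, and construction businesses. Columns: 업체명 (company name), 주소 (address), 업종 (business type), 전화번호 (telephone number). The lists that were previously kept separately for management, design, supervision, and construction have been merged into one.
URL: https://www.data.go.kr/data/15053071/fileData.do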

Alerts

업종 is highly imbalanced (51.6%) [Imbalance]
전화번호 has unique values [Unique]
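
The 51.6% imbalance figure can be reproduced from the category counts. The sketch below assumes the score is one minus the normalized Shannon entropy, 1 - H / log2(k), where k is the number of classes; this definition is an assumption about how the profiler derives the number, although it does match the reported value for the ten 업종 counts listed under "Common Values" in this report.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class frequencies and k is the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # with a single class the ratio is undefined; report 0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Counts of the ten 업종 values, taken from the "Common Values" table.
counts = [206, 29, 18, 11, 10, 9, 3, 2, 1, 1]
print(f"{imbalance_score(counts):.1%}")  # 51.6%
```

A score of 0 corresponds to perfectly balanced classes and a score near 1 to a single dominant class.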

Reproduction

Analysis started: 2023-12-12 13:52:02.221704
Analysis finished: 2023-12-12 13:52:02.586187
Duration: 0.36 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업체명
Text

Distinct: 289
Distinct (%): 99.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 25
Median length: 17
Mean length: 8.7931034
Min length: 4

Characters and Unicode

Total characters: 2550
Distinct characters: 196
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
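
As an illustration of these character properties, the sketch below tallies the Unicode general category of each character with Python's standard `unicodedata` module. The helper name `category_counts` is illustrative; the profiler's own implementation may differ. In the output, `Lo` is Other Letter, `Nd` is Decimal Number, `Ps` is Open Punctuation, and `Pe` is Close Punctuation.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count characters in `text` by Unicode general category (e.g. 'Lo', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)

sample = "(주)21세기소방"  # first sample row of the 업체명 column
print(category_counts(sample))
# Counter({'Lo': 5, 'Nd': 2, 'Ps': 1, 'Pe': 1})
```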

Unique

Unique: 288
Unique (%): 99.3%
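
"Distinct" and "Unique" measure different things: distinct counts how many different values occur, while unique counts values that occur exactly once. With 290 rows, 289 distinct values, and 288 unique values, exactly one value must appear twice. A minimal sketch of the distinction, using hypothetical stand-in names rather than the real column:

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique): distinct counts different values,
    unique counts values that occur exactly once."""
    freq = Counter(values)
    distinct = len(freq)
    unique = sum(1 for n in freq.values() if n == 1)
    return distinct, unique

# Hypothetical stand-in data: 288 one-off names plus one name that appears twice.
names = [f"company_{i}" for i in range(288)] + ["repeated", "repeated"]
print(len(names), distinct_and_unique(names))  # 290 (289, 288)
```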

Sample

1st row: (주)21세기소방
2nd row: (주)가람방재
3rd row: (주)가람엔지니어링
4th row: (주)거성개발
5th row: (주)거성에너지

Value | Count | Frequency (%)
주식회사 | 118 | 28.6%
 | 2 | 0.5%
신화엔지니어링 | 1 | 0.2%
씨케이엔지니어링 | 1 | 0.2%
명성전력 | 1 | 0.2%
맥이엔에스 | 1 | 0.2%
다산전력 | 1 | 0.2%
드림이앤지 | 1 | 0.2%
동성소방 | 1 | 0.2%
동성 | 1 | 0.2%
Other values (284) | 284 | 68.9%

Most occurring characters

ValueCountFrequency (%)
267
 
10.5%
146
 
5.7%
( 146
 
5.7%
) 146
 
5.7%
122
 
4.8%
121
 
4.7%
119
 
4.7%
80
 
3.1%
74
 
2.9%
65
 
2.5%
Other values (186) 1264
49.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 2093 | 82.1%
Open Punctuation | 146 | 5.7%
Close Punctuation | 146 | 5.7%
Space Separator | 122 | 4.8%
Uppercase Letter | 25 | 1.0%
Other Punctuation | 10 | 0.4%
Other Symbol | 6 | 0.2%
Decimal Number | 2 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
267
 
12.8%
146
 
7.0%
121
 
5.8%
119
 
5.7%
80
 
3.8%
74
 
3.5%
65
 
3.1%
65
 
3.1%
62
 
3.0%
56
 
2.7%
Other values (164) 1038
49.6%
Uppercase Letter
ValueCountFrequency (%)
N 4
16.0%
E 3
12.0%
G 3
12.0%
A 2
8.0%
T 2
8.0%
H 2
8.0%
S 2
8.0%
D 2
8.0%
W 1
 
4.0%
I 1
 
4.0%
Other values (3) 3
12.0%
Other Punctuation
ValueCountFrequency (%)
. 8
80.0%
& 1
 
10.0%
, 1
 
10.0%
Decimal Number
ValueCountFrequency (%)
2 1
50.0%
1 1
50.0%
Open Punctuation
ValueCountFrequency (%)
( 146
100.0%
Close Punctuation
ValueCountFrequency (%)
) 146
100.0%
Space Separator
ValueCountFrequency (%)
122
100.0%
Other Symbol
ValueCountFrequency (%)
6
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2099 | 82.3%
Common | 426 | 16.7%
Latin | 25 | 1.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
267
 
12.7%
146
 
7.0%
121
 
5.8%
119
 
5.7%
80
 
3.8%
74
 
3.5%
65
 
3.1%
65
 
3.1%
62
 
3.0%
56
 
2.7%
Other values (165) 1044
49.7%
Latin
ValueCountFrequency (%)
N 4
16.0%
E 3
12.0%
G 3
12.0%
A 2
8.0%
T 2
8.0%
H 2
8.0%
S 2
8.0%
D 2
8.0%
W 1
 
4.0%
I 1
 
4.0%
Other values (3) 3
12.0%
Common
ValueCountFrequency (%)
( 146
34.3%
) 146
34.3%
122
28.6%
. 8
 
1.9%
& 1
 
0.2%
, 1
 
0.2%
2 1
 
0.2%
1 1
 
0.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2093 | 82.1%
ASCII | 451 | 17.7%
None | 6 | 0.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
267
 
12.8%
146
 
7.0%
121
 
5.8%
119
 
5.7%
80
 
3.8%
74
 
3.5%
65
 
3.1%
65
 
3.1%
62
 
3.0%
56
 
2.7%
Other values (164) 1038
49.6%
ASCII
ValueCountFrequency (%)
( 146
32.4%
) 146
32.4%
122
27.1%
. 8
 
1.8%
N 4
 
0.9%
E 3
 
0.7%
G 3
 
0.7%
A 2
 
0.4%
T 2
 
0.4%
H 2
 
0.4%
Other values (11) 13
 
2.9%
None
ValueCountFrequency (%)
6
100.0%

주소
Text

Distinct: 289
Distinct (%): 99.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 52
Median length: 42
Mean length: 28.313793
Min length: 18

Characters and Unicode

Total characters: 8211
Distinct characters: 235
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 288
Unique (%): 99.3%

Sample

1st row: 충청북도 음성군 대소면 멍심이길 102
2nd row: 충청북도 진천군 진천읍 장관1길 16-1
3rd row: 충청북도 충주시 낙수당1길 32 (칠금동)
4th row: 충청북도 청주시 청원구 상리로 21 (율량동)
5th row: 충청북도 청주시 서원구 남이면 남이가좌1길 41-11

Value | Count | Frequency (%)
충청북도 | 290 | 16.5%
청주시 | 188 | 10.7%
흥덕구 | 63 | 3.6%
청원구 | 54 | 3.1%
 | 49 | 2.8%
서원구 | 40 | 2.3%
충주시 | 31 | 1.8%
상당구 | 31 | 1.8%
남이면 | 19 | 1.1%
제천시 | 16 | 0.9%
Other values (645) | 981 | 55.7%

Most occurring characters

ValueCountFrequency (%)
1501
 
18.3%
548
 
6.7%
329
 
4.0%
301
 
3.7%
1 300
 
3.7%
293
 
3.6%
235
 
2.9%
232
 
2.8%
224
 
2.7%
213
 
2.6%
Other values (225) 4035
49.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4885 | 59.5%
Space Separator | 1501 | 18.3%
Decimal Number | 1266 | 15.4%
Close Punctuation | 186 | 2.3%
Open Punctuation | 186 | 2.3%
Dash Punctuation | 98 | 1.2%
Other Punctuation | 89 | 1.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
548
 
11.2%
329
 
6.7%
301
 
6.2%
293
 
6.0%
235
 
4.8%
232
 
4.7%
224
 
4.6%
213
 
4.4%
192
 
3.9%
159
 
3.3%
Other values (210) 2159
44.2%
Decimal Number
ValueCountFrequency (%)
1 300
23.7%
2 186
14.7%
3 140
11.1%
4 118
 
9.3%
0 105
 
8.3%
5 100
 
7.9%
6 95
 
7.5%
8 87
 
6.9%
9 68
 
5.4%
7 67
 
5.3%
Space Separator
ValueCountFrequency (%)
1501
100.0%
Close Punctuation
ValueCountFrequency (%)
) 186
100.0%
Open Punctuation
ValueCountFrequency (%)
( 186
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 98
100.0%
Other Punctuation
ValueCountFrequency (%)
, 89
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4885 | 59.5%
Common | 3326 | 40.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
548
 
11.2%
329
 
6.7%
301
 
6.2%
293
 
6.0%
235
 
4.8%
232
 
4.7%
224
 
4.6%
213
 
4.4%
192
 
3.9%
159
 
3.3%
Other values (210) 2159
44.2%
Common
ValueCountFrequency (%)
1501
45.1%
1 300
 
9.0%
) 186
 
5.6%
( 186
 
5.6%
2 186
 
5.6%
3 140
 
4.2%
4 118
 
3.5%
0 105
 
3.2%
5 100
 
3.0%
- 98
 
2.9%
Other values (5) 406
 
12.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 4885 | 59.5%
ASCII | 3326 | 40.5%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1501
45.1%
1 300
 
9.0%
) 186
 
5.6%
( 186
 
5.6%
2 186
 
5.6%
3 140
 
4.2%
4 118
 
3.5%
0 105
 
3.2%
5 100
 
3.0%
- 98
 
2.9%
Other values (5) 406
 
12.2%
Hangul
ValueCountFrequency (%)
548
 
11.2%
329
 
6.7%
301
 
6.2%
293
 
6.0%
235
 
4.8%
232
 
4.7%
224
 
4.6%
213
 
4.4%
192
 
3.9%
159
 
3.3%
Other values (210) 2159
44.2%

업종
Categorical

IMBALANCE 

Distinct: 10
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

공사업: 206
소방시설관리업,공사업: 29
설계업: 18
설계업,감리업: 11
감리업: 10
Other values (5): 16

Length

Max length: 15
Median length: 3
Mean length: 4.2413793
Min length: 3

Unique

Unique: 2
Unique (%): 0.7%

Sample

1st row: 소방시설관리업,공사업
2nd row: 소방시설관리업,공사업
3rd row: 공사업
4th row: 공사업
5th row: 공사업

Common Values

Value | Count | Frequency (%)
공사업 | 206 | 71.0%
소방시설관리업,공사업 | 29 | 10.0%
설계업 | 18 | 6.2%
설계업,감리업 | 11 | 3.8%
감리업 | 10 | 3.4%
소방시설관리업 | 9 | 3.1%
설계업,감리업,공사업 | 3 | 1.0%
설계업,공사업 | 2 | 0.7%
감리업,공사업 | 1 | 0.3%
설계업,소방시설관리업,공사업 | 1 | 0.3%
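
Each 업종 value is a comma-separated combination of one to three business types, so the ten categories overlap. Splitting on the comma gives per-type totals, which may be a more useful view than the combined labels. This is a sketch built only from the counts in the table above; the helper name `per_label_counts` is illustrative.

```python
from collections import Counter

# Combined 업종 values and their counts, copied from the Common Values table.
combined = {
    "공사업": 206,
    "소방시설관리업,공사업": 29,
    "설계업": 18,
    "설계업,감리업": 11,
    "감리업": 10,
    "소방시설관리업": 9,
    "설계업,감리업,공사업": 3,
    "설계업,공사업": 2,
    "감리업,공사업": 1,
    "설계업,소방시설관리업,공사업": 1,
}

def per_label_counts(value_counts, sep=","):
    """Count how many rows mention each individual label."""
    totals = Counter()
    for value, count in value_counts.items():
        for label in value.split(sep):
            totals[label.strip()] += count
    return totals

for label, n in per_label_counts(combined).most_common():
    print(label, n)
# 공사업 242
# 소방시설관리업 39
# 설계업 35
# 감리업 25
```

The per-type totals sum to more than 290 because a company holding several registrations is counted once per type.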

Length

Figure: Histogram of lengths of the category


전화번호
Text

UNIQUE 

Distinct: 290
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 2.4 KiB

Length

Max length: 13
Median length: 12
Mean length: 12.02069
Min length: 11

Characters and Unicode

Total characters: 3486
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 290
Unique (%): 100.0%

Sample

1st row: 043-877-2119
2nd row: 043-232-9119
3rd row: 043-854-0669
4th row: 043-215-9868
5th row: 043-543-3721

Value | Count | Frequency (%)
043-877-2119 | 1 | 0.3%
043-232-4736 | 1 | 0.3%
043-232-1190 | 1 | 0.3%
043-235-4411 | 1 | 0.3%
043-288-0491 | 1 | 0.3%
043-272-0150 | 1 | 0.3%
043-743-3166 | 1 | 0.3%
070-7772-4306 | 1 | 0.3%
043-212-1190 | 1 | 0.3%
043-533-2119 | 1 | 0.3%
Other values (280) | 280 | 96.6%

Most occurring characters

ValueCountFrequency (%)
- 580
16.6%
3 492
14.1%
0 476
13.7%
4 446
12.8%
2 340
9.8%
1 310
8.9%
8 194
 
5.6%
7 172
 
4.9%
9 165
 
4.7%
5 158
 
4.5%
Other values (2) 153
 
4.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 2905 | 83.3%
Dash Punctuation | 580 | 16.6%
Space Separator | 1 | < 0.1%
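
The single Space Separator character suggests that one phone number contains a stray space. The sketch below shows one way such a value could be located and normalized. The pattern `PHONE_PATTERN` is an assumed format inferred from the sample rows, not a rule stated by the dataset, and the third test value is made up purely for illustration.

```python
import re

# Assumed format: area code, exchange, subscriber number, separated by hyphens.
PHONE_PATTERN = re.compile(r"\d{2,3}-\d{3,4}-\d{4}")

def find_irregular(numbers):
    """Return the values that do not match PHONE_PATTERN exactly."""
    return [n for n in numbers if not PHONE_PATTERN.fullmatch(n)]

def normalize(number):
    """Remove all whitespace characters from a phone number."""
    return re.sub(r"\s+", "", number)

# The first two values appear in this report; the last one is a made-up
# value with a trailing space, used only to demonstrate the check.
phones = ["043-877-2119", "070-7772-4306", "043-000-0000 "]
print(find_irregular(phones))                          # ['043-000-0000 ']
print(find_irregular([normalize(p) for p in phones]))  # []
```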

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 492
16.9%
0 476
16.4%
4 446
15.4%
2 340
11.7%
1 310
10.7%
8 194
 
6.7%
7 172
 
5.9%
9 165
 
5.7%
5 158
 
5.4%
6 152
 
5.2%
Dash Punctuation
ValueCountFrequency (%)
- 580
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 3486 | 100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 580
16.6%
3 492
14.1%
0 476
13.7%
4 446
12.8%
2 340
9.8%
1 310
8.9%
8 194
 
5.6%
7 172
 
4.9%
9 165
 
4.7%
5 158
 
4.5%
Other values (2) 153
 
4.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 3486 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 580
16.6%
3 492
14.1%
0 476
13.7%
4 446
12.8%
2 340
9.8%
1 310
8.9%
8 194
 
5.6%
7 172
 
4.9%
9 165
 
4.7%
5 158
 
4.5%
Other values (2) 153
 
4.4%

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 업체명 | 주소 | 업종 | 전화번호
0 | (주)21세기소방 | 충청북도 음성군 대소면 멍심이길 102 | 소방시설관리업,공사업 | 043-877-2119
1 | (주)가람방재 | 충청북도 진천군 진천읍 장관1길 16-1 | 소방시설관리업,공사업 | 043-232-9119
2 | (주)가람엔지니어링 | 충청북도 충주시 낙수당1길 32 (칠금동) | 공사업 | 043-854-0669
3 | (주)거성개발 | 충청북도 청주시 청원구 상리로 21 (율량동) | 공사업 | 043-215-9868
4 | (주)거성에너지 | 충청북도 청주시 서원구 남이면 남이가좌1길 41-11 | 공사업 | 043-543-3721
5 | (주)건사엔지니어링 | 충청북도 청주시 상당구 남일면 효촌송암1길 51-3 | 설계업,감리업 | 043-284-1681
6 | (주)건양기술공사건축사사무소 | 충청북도 청주시 청원구 교서로 111 (우암동) | 설계업 | 043-252-2441
7 | (주)건우전력 | 충청북도 충주시 목행산단3로 54 (목행동) | 공사업 | 043-857-2290
8 | (주)건주 | 충청북도 제천시 용두대로 209 (하소동) | 공사업 | 043-646-5533
9 | (주)경보전설 | 충청북도 옥천군 옥천읍 중앙로9길 4 | 공사업 | 043-731-8380

Index | 업체명 | 주소 | 업종 | 전화번호
280 | 태웅이엔에스(주) | 충청북도 청주시 흥덕구 평동로126번길 6 (평동) | 공사업 | 043-231-9000
281 | 태일소방감리사무소 | 충청북도 청주시 상당구 꽃산동로 15-2 (금천동), 덕성아파트상가 2-103 | 감리업 | 043-250-9713
282 | 태정에너지산업(주) | 충청북도 청주시 흥덕구 신성로108번길 9-6 (신성동) | 공사업 | 043-274-0030
283 | 토광방재안전관리(주) | 충청북도 청주시 청원구 내수읍 묵방2길 59-36 | 소방시설관리업,공사업 | 043-294-4244
284 | 한국소방설비 합자회사 | 충청북도 충주시 능바우길 47 (칠금동) | 공사업 | 043-848-2828
285 | 한국전설(주) | 충청북도 제천시 명륜로4길 3 (청전동) | 공사업 | 043-644-6405
286 | 한서전기(주) | 충청북도 진천군 진천읍 장관2길 99 | 공사업 | 043-534-9494
287 | 한세전력 주식회사 | 충청북도 음성군 금왕읍 무극로265번길 1 | 공사업 | 043-877-3404
288 | 한을이엔지 | 충청북도 청주시 흥덕구 짐대로72번길 5 , 502호(복대동) | 설계업 | 043-232-2763
289 | 호암방재 | 충청북도 청주시 흥덕구 장구봉로137번길 7-8 | 소방시설관리업 | 043-264-8896