Overview

Dataset statistics

Number of variables: 7
Number of observations: 514
Missing cells: 6
Missing cells (%): 0.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 28.2 KiB
Average record size in memory: 56.2 B
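
As a quick sanity check, the derived figures above can be reproduced from the raw counts. This is a minimal sketch, assuming the missing-cell percentage is taken over all cells (variables × observations) and the average record size is total memory divided by the number of rows. Both are assumptions about how the report derives them, and the variable names are illustrative.

```python
# Raw figures copied from the dataset statistics above.
n_variables = 7
n_observations = 514
missing_cells = 6
total_size_kib = 28.2

# Assumption: the missing-cell percentage is relative to all cells.
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Assumption: average record size is total bytes divided by the row count.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions both values round to the figures the report shows.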

Variable types

Unsupported: 1
Categorical: 4
Text: 2

Dataset

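Description: Data on the registration status of fire protection facility businesses and fire protection facility management businesses registered and managed by Daegu Metropolitan City. The business categories covered are fire protection facility construction, fire protection facility design, fire protection construction supervision, and fire protection facility management.
Author: Daegu Metropolitan City (대구광역시)
URL: https://www.data.go.kr/data/3072212/fileData.do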

Alerts

[High correlation] Unnamed: 5 is highly overall correlated with Unnamed: 1 and 2 other fields
[High correlation] Unnamed: 1 is highly overall correlated with Unnamed: 2 and 2 other fields
[High correlation] Unnamed: 6 is highly overall correlated with Unnamed: 1 and 1 other fields
[High correlation] Unnamed: 2 is highly overall correlated with Unnamed: 1 and 1 other fields
[Imbalance] Unnamed: 1 is highly imbalanced (97.4%)
[Unsupported] 소방시설업 현황(대구) is an unsupported type, check if it needs cleaning or further analysis
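
The `Unnamed: N` column names and the unsupported first column suggest that the file's title row was read as the header; the Sample section shows the real header text (순번, 지역, 세부지역, 상호, 전화번호, 업종, 분야) sitting in a data row. The sketch below shows one way to re-read such a file with pandas. The raw layout (a title line, an empty line of separators, then the header) is an assumption reconstructed from the sample rows, the in-memory text stands in for the real file, and the English column names are hypothetical.

```python
import io

import pandas as pd

# Stand-in for the downloaded file; the layout is an assumption based on
# the sample rows in this report (title row, empty row, header row, data).
raw = (
    "소방시설업 현황(대구),,,,,,\n"
    ",,,,,,\n"
    "순번,지역,세부지역,상호,전화번호,업종,분야\n"
    "1,대구,달성군,(유)대보이엔씨,053-752-2299,공사업,전문\n"
    "2,대구,서구,(자)근화기업,053-557-6300,공사업,전문\n"
    "6,대구,달서구,(주)가람기술단,053-638-2711,설계업,일반(기계)\n"
)

# Skip the title row and the empty row so the third line becomes the header.
# dtype=str keeps phone numbers and sequence numbers as text.
df = pd.read_csv(io.StringIO(raw), skiprows=2, dtype=str)

# Hypothetical English names for the Korean headers.
df = df.rename(columns={
    "순번": "seq",
    "지역": "region",
    "세부지역": "district",
    "상호": "company",
    "전화번호": "phone",
    "업종": "business_type",
    "분야": "field",
})

print(df.columns.tolist())
```

Profiling the re-read frame would remove the stray header and empty rows that appear as rare values in every column of this report.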

Reproduction

Analysis started: 2024-03-14 19:42:34.841285
Analysis finished: 2024-03-14 19:42:36.354489
Duration: 1.51 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

소방시설업 현황(대구)
Unsupported

REJECTED, UNSUPPORTED

Missing: 1
Missing (%): 0.2%
Memory size: 4.1 KiB

Unnamed: 1
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Length

Max length: 4
Median length: 2
Mean length: 2.0038911
Min length: 2

Unique

Unique: 2
Unique (%): 0.4%

Sample

1st row: <NA>
2nd row: 지역 (region)
3rd row: 대구 (Daegu)
4th row: 대구
5th row: 대구

Common Values

Value | Count | Frequency (%)
대구 (Daegu) | 512 | 99.6%
<NA> | 1 | 0.2%
지역 (region) | 1 | 0.2%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Unnamed: 2
Categorical

HIGH CORRELATION 

Distinct: 11
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Length

Max length: 4
Median length: 2
Mean length: 2.4766537
Min length: 2

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: <NA>
2nd row: 세부지역 (sub-region)
3rd row: 달성군 (Dalseong-gun)
4th row: 서구 (Seo-gu)
5th row: 동구 (Dong-gu)

Common Values

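Value | Count | Frequency (%)
수성구 (Suseong-gu) | 105 | 20.4%
북구 (Buk-gu) | 93 | 18.1%
달서구 (Dalseo-gu) | 93 | 18.1%
동구 (Dong-gu) | 68 | 13.2%
서구 (Seo-gu) | 46 | 8.9%
달성군 (Dalseong-gun) | 36 | 7.0%
중구 (Jung-gu) | 33 | 6.4%
남구 (Nam-gu) | 32 | 6.2%
군위군 (Gunwi-gun) | 5 | 1.0%
<NA> | 2 | 0.4%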

Length

[Figure: histogram of lengths of the category]

Unnamed: 3
Text

Distinct: 413
Distinct (%): 80.5%
Missing: 1
Missing (%): 0.2%
Memory size: 4.1 KiB

Length

Max length: 20
Median length: 13
Mean length: 8.2183236
Min length: 2

Characters and Unicode

Total characters: 4216
Distinct characters: 228
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 354
Unique (%): 69.0%

Sample

1st row: 상호 (company name)
2nd row: (유)대보이엔씨
3rd row: (자)근화기업
4th row: (자)대한소방
5th row: (자)세기전력

Value | Count | Frequency (%)
주식회사 | 139 | 21.1%
다온이엔지 | 5 | 0.8%
세이프소방방재 | 5 | 0.8%
주)나우전기 | 4 | 0.6%
주)새솔enc | 4 | 0.6%
대영소방기술(주 | 4 | 0.6%
주)나무전기 | 4 | 0.6%
주)제일소방방재 | 4 | 0.6%
지이에스 | 4 | 0.6%
주)가람기술단 | 4 | 0.6%
Other values (408) | 481 | 73.1%

Most occurring characters

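Value | Count | Frequency (%)
[character lost] | 428 | 10.2%
) | 291 | 6.9%
( | 291 | 6.9%
[character lost] | 167 | 4.0%
[character lost] | 145 | 3.4%
[character lost] | 145 | 3.4%
[character lost] | 140 | 3.3%
[character lost] | 129 | 3.1%
[character lost] | 128 | 3.0%
[character lost] | 110 | 2.6%
Other values (218) | 2242 | 53.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3432 | 81.4%
Close Punctuation | 291 | 6.9%
Open Punctuation | 291 | 6.9%
Space Separator | 145 | 3.4%
Uppercase Letter | 43 | 1.0%
Decimal Number | 12 | 0.3%
Other Punctuation | 1 | < 0.1%
Lowercase Letter | 1 | < 0.1%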

Most frequent character per category

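Other Letter
Value | Count | Frequency (%)
[character lost] | 428 | 12.5%
[character lost] | 167 | 4.9%
[character lost] | 145 | 4.2%
[character lost] | 140 | 4.1%
[character lost] | 129 | 3.8%
[character lost] | 128 | 3.7%
[character lost] | 110 | 3.2%
[character lost] | 98 | 2.9%
[character lost] | 98 | 2.9%
[character lost] | 91 | 2.7%
Other values (205) | 1898 | 55.3%

Uppercase Letter
Value | Count | Frequency (%)
E | 14 | 32.6%
N | 12 | 27.9%
G | 9 | 20.9%
C | 6 | 14.0%
I | 1 | 2.3%
A | 1 | 2.3%

Decimal Number
Value | Count | Frequency (%)
1 | 8 | 66.7%
9 | 4 | 33.3%

Close Punctuation
Value | Count | Frequency (%)
) | 291 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 291 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 145 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 1 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
n | 1 | 100.0%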

Most occurring scripts

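Value | Count | Frequency (%)
Hangul | 3432 | 81.4%
Common | 740 | 17.6%
Latin | 44 | 1.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[character lost] | 428 | 12.5%
[character lost] | 167 | 4.9%
[character lost] | 145 | 4.2%
[character lost] | 140 | 4.1%
[character lost] | 129 | 3.8%
[character lost] | 128 | 3.7%
[character lost] | 110 | 3.2%
[character lost] | 98 | 2.9%
[character lost] | 98 | 2.9%
[character lost] | 91 | 2.7%
Other values (205) | 1898 | 55.3%

Latin
Value | Count | Frequency (%)
E | 14 | 31.8%
N | 12 | 27.3%
G | 9 | 20.5%
C | 6 | 13.6%
I | 1 | 2.3%
A | 1 | 2.3%
n | 1 | 2.3%

Common
Value | Count | Frequency (%)
) | 291 | 39.3%
( | 291 | 39.3%
(space) | 145 | 19.6%
1 | 8 | 1.1%
9 | 4 | 0.5%
. | 1 | 0.1%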

Most occurring blocks

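Value | Count | Frequency (%)
Hangul | 3432 | 81.4%
ASCII | 784 | 18.6%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[character lost] | 428 | 12.5%
[character lost] | 167 | 4.9%
[character lost] | 145 | 4.2%
[character lost] | 140 | 4.1%
[character lost] | 129 | 3.8%
[character lost] | 128 | 3.7%
[character lost] | 110 | 3.2%
[character lost] | 98 | 2.9%
[character lost] | 98 | 2.9%
[character lost] | 91 | 2.7%
Other values (205) | 1898 | 55.3%

ASCII
Value | Count | Frequency (%)
) | 291 | 37.1%
( | 291 | 37.1%
(space) | 145 | 18.5%
E | 14 | 1.8%
N | 12 | 1.5%
G | 9 | 1.1%
1 | 8 | 1.0%
C | 6 | 0.8%
9 | 4 | 0.5%
. | 1 | 0.1%
Other values (3) | 3 | 0.4%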

Unnamed: 4
Text

Distinct: 408
Distinct (%): 80.0%
Missing: 4
Missing (%): 0.8%
Memory size: 4.1 KiB

Length

Max length: 13
Median length: 12
Mean length: 11.986275
Min length: 4

Characters and Unicode

Total characters: 6113
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 348
Unique (%): 68.2%

Sample

1st row: 전화번호 (phone number)
2nd row: 053-752-2299
3rd row: 053-557-6300
4th row: 053-424-5031
5th row: 053-357-9797

Value | Count | Frequency (%)
053-741-8849 | 5 | 1.0%
053-383-7335 | 5 | 1.0%
054-954-0942 | 4 | 0.8%
053-719-0890 | 4 | 0.8%
053-756-5577 | 4 | 0.8%
053-941-4916 | 4 | 0.8%
053-253-4412 | 4 | 0.8%
053-624-4868 | 4 | 0.8%
053-744-3119 | 4 | 0.8%
053-951-2597 | 4 | 0.8%
Other values (398) | 468 | 91.8%

Most occurring characters

Value | Count | Frequency (%)
- | 1018 | 16.7%
5 | 953 | 15.6%
3 | 822 | 13.4%
0 | 806 | 13.2%
1 | 489 | 8.0%
4 | 392 | 6.4%
7 | 381 | 6.2%
2 | 366 | 6.0%
9 | 327 | 5.3%
6 | 294 | 4.8%
Other values (5) | 265 | 4.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 5091 | 83.3%
Dash Punctuation | 1018 | 16.7%
Other Letter | 4 | 0.1%

Most frequent character per category

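Decimal Number
Value | Count | Frequency (%)
5 | 953 | 18.7%
3 | 822 | 16.1%
0 | 806 | 15.8%
1 | 489 | 9.6%
4 | 392 | 7.7%
7 | 381 | 7.5%
2 | 366 | 7.2%
9 | 327 | 6.4%
6 | 294 | 5.8%
8 | 261 | 5.1%

Other Letter
Value | Count | Frequency (%)
[character lost] | 1 | 25.0%
[character lost] | 1 | 25.0%
[character lost] | 1 | 25.0%
[character lost] | 1 | 25.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1018 | 100.0%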

Most occurring scripts

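Value | Count | Frequency (%)
Common | 6109 | 99.9%
Hangul | 4 | 0.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
- | 1018 | 16.7%
5 | 953 | 15.6%
3 | 822 | 13.5%
0 | 806 | 13.2%
1 | 489 | 8.0%
4 | 392 | 6.4%
7 | 381 | 6.2%
2 | 366 | 6.0%
9 | 327 | 5.4%
6 | 294 | 4.8%

Hangul
Value | Count | Frequency (%)
[character lost] | 1 | 25.0%
[character lost] | 1 | 25.0%
[character lost] | 1 | 25.0%
[character lost] | 1 | 25.0%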

Most occurring blocks

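Value | Count | Frequency (%)
ASCII | 6109 | 99.9%
Hangul | 4 | 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 1018 | 16.7%
5 | 953 | 15.6%
3 | 822 | 13.5%
0 | 806 | 13.2%
1 | 489 | 8.0%
4 | 392 | 6.4%
7 | 381 | 6.2%
2 | 366 | 6.0%
9 | 327 | 5.4%
6 | 294 | 4.8%

Hangul
Value | Count | Frequency (%)
[character lost] | 1 | 25.0%
[character lost] | 1 | 25.0%
[character lost] | 1 | 25.0%
[character lost] | 1 | 25.0%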
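
The four Hangul characters counted against 6109 ASCII digits and dashes are consistent with a single non-numeric cell in this column, such as the header text 전화번호 shown as the first sample row (four characters, matching the minimum length of 4). A minimal sketch of how such cells could be flagged follows. The pattern (2 to 3 digit area code, 3 to 4 digit exchange, 4 digit line number) is an assumption inferred from the sample values, not a rule stated in the report, and `is_valid_phone` is a hypothetical helper.

```python
import re

# Assumed shape of a phone number in this column: 053-752-2299, 054-954-0942.
PHONE_RE = re.compile(r"\d{2,3}-\d{3,4}-\d{4}")


def is_valid_phone(value: str) -> bool:
    """Return True when the whole string matches the assumed phone pattern."""
    return PHONE_RE.fullmatch(value) is not None


# Values taken from the sample and frequency tables above.
samples = ["전화번호", "053-752-2299", "054-954-0942"]
invalid = [s for s in samples if not is_valid_phone(s)]
print(invalid)
```

Only the header text fails the check here; on the full column the same filter would list every value that needs manual review.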

Unnamed: 5
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Length

Max length: 4
Median length: 3
Mean length: 3
Min length: 2

Unique

Unique: 2
Unique (%): 0.4%

Sample

1st row: <NA>
2nd row: 업종 (business type)
3rd row: 공사업 (construction)
4th row: 공사업
5th row: 공사업

Common Values

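Value | Count | Frequency (%)
공사업 (construction) | 309 | 60.1%
설계업 (design) | 94 | 18.3%
감리업 (supervision) | 59 | 11.5%
방염업 (flame-retardant treatment) | 50 | 9.7%
<NA> | 1 | 0.2%
업종 (business type) | 1 | 0.2%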

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Unnamed: 6
Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 4.1 KiB

Length

Max length: 6
Median length: 2
Mean length: 3.381323
Min length: 2

Unique

Unique: 2
Unique (%): 0.4%

Sample

1st row: <NA>
2nd row: 분야 (field)
3rd row: 전문 (specialized)
4th row: 전문
5th row: 전문

Common Values

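Value | Count | Frequency (%)
전문 (specialized) | 315 | 61.3%
일반(기계) (general, mechanical) | 76 | 14.8%
일반(전기) (general, electrical) | 71 | 13.8%
합판목재류 (plywood and wood) | 26 | 5.1%
섬유류 (textiles) | 15 | 2.9%
합성수지류 (synthetic resins) | 9 | 1.8%
<NA> | 1 | 0.2%
분야 (field) | 1 | 0.2%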

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Correlations

[Figure: correlation plot]
Variable | Unnamed: 1 | Unnamed: 2 | Unnamed: 5 | Unnamed: 6
Unnamed: 1 | 1.000 | 1.000 | 1.000 | 1.000
Unnamed: 2 | 1.000 | 1.000 | 0.852 | 0.673
Unnamed: 5 | 1.000 | 0.852 | 1.000 | 0.883
Unnamed: 6 | 1.000 | 0.673 | 0.883 | 1.000

[Figure: correlation plot]
Variable | Unnamed: 5 | Unnamed: 1 | Unnamed: 6 | Unnamed: 2
Unnamed: 5 | 1.000 | 0.997 | 0.814 | 0.515
Unnamed: 1 | 0.997 | 1.000 | 0.995 | 0.992
Unnamed: 6 | 0.814 | 0.995 | 1.000 | 0.422
Unnamed: 2 | 0.515 | 0.992 | 0.422 | 1.000

[Figure: correlation plot]
Variable | Unnamed: 1 | Unnamed: 2 | Unnamed: 5 | Unnamed: 6
Unnamed: 1 | 1.000 | 0.992 | 0.997 | 0.995
Unnamed: 2 | 0.992 | 1.000 | 0.515 | 0.422
Unnamed: 5 | 0.997 | 0.515 | 1.000 | 0.814
Unnamed: 6 | 0.995 | 0.422 | 0.814 | 1.000
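
The report does not say which association measure produced these matrices. Cramér's V is one common choice for pairs of categorical variables, so the sketch below uses it purely as an assumption, to illustrate how a single stray header row can push a nearly constant column such as Unnamed: 1 to an association of about 1 with other columns. The toy data and the `cramers_v` helper are hypothetical.

```python
import numpy as np
import pandas as pd


def cramers_v(x: pd.Series, y: pd.Series) -> float:
    """Uncorrected Cramér's V between two categorical series."""
    table = pd.crosstab(x, y).to_numpy(dtype=float)
    n = table.sum()
    # Expected counts under independence, from the row and column totals.
    expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n
    chi2 = ((table - expected) ** 2 / expected).sum()
    k = min(table.shape) - 1
    # With a single category on either side the measure is undefined.
    return float(np.sqrt(chi2 / (n * k))) if k > 0 else float("nan")


# Toy data: one leaked header row followed by six ordinary rows.
region = pd.Series(["지역"] + ["대구"] * 6)
business = pd.Series(["업종"] + ["공사업"] * 3 + ["설계업"] * 3)

with_header = cramers_v(region, business)
without_header = cramers_v(region[1:], business[1:])
print(with_header, without_header)
```

In this toy case the header row alone drives the association; once it is dropped the region column is constant and the measure is undefined, which is one reason to remove the header and empty rows before reading much into the values above.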

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]
[Figure: The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.]

Sample

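(index) | 소방시설업 현황(대구) | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6
0 | NaN | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
1 | 순번 | 지역 | 세부지역 | 상호 | 전화번호 | 업종 | 분야
2 | 1 | 대구 | 달성군 | (유)대보이엔씨 | 053-752-2299 | 공사업 | 전문
3 | 2 | 대구 | 서구 | (자)근화기업 | 053-557-6300 | 공사업 | 전문
4 | 3 | 대구 | 동구 | (자)대한소방 | 053-424-5031 | 공사업 | 전문
5 | 4 | 대구 | 북구 | (자)세기전력 | 053-357-9797 | 공사업 | 전문
6 | 5 | 대구 | 중구 | (자)신진기업 | 053-255-3055 | 공사업 | 전문
7 | 6 | 대구 | 달서구 | (주)가람기술단 | 053-638-2711 | 설계업 | 일반(기계)
8 | 7 | 대구 | 달서구 | (주)가람기술단 | 053-638-2711 | 설계업 | 일반(전기)
9 | 8 | 대구 | 달서구 | (주)가람기술단 | 053-638-2711 | 감리업 | 일반(기계)

(index) | 소방시설업 현황(대구) | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6
504 | 503 | 대구 | 서구 | 한솔블라인드 | 053-427-7845 | 방염업 | 합판목재류
505 | 504 | 대구 | 달서구 | 한영전기 주식회사 | 053-557-6451 | 공사업 | 전문
506 | 505 | 대구 | 중구 | 한울엔지니어링 | 053-745-9230 | 감리업 | 일반(기계)
507 | 506 | 대구 | 중구 | 한울엔지니어링 | 053-745-9230 | 감리업 | 일반(전기)
508 | 507 | 대구 | 동구 | 한일전기설계감리사무소 | 053-744-6915 | 설계업 | 일반(기계)
509 | 508 | 대구 | 동구 | 한일전기설계감리사무소 | 053-744-6915 | 설계업 | 일반(전기)
510 | 509 | 대구 | 달성군 | 합자회사 천내전력 | 053-632-2068 | 공사업 | 전문
511 | 510 | 대구 | 수성구 | 혜성전력(주) | 053-765-3070 | 공사업 | 전문
512 | 511 | 대구 | 수성구 | 화성산업(주) | 053-760-3731 | 공사업 | 전문
513 | 512 | 대구 | 북구 | 흥산소방산업 주식회사 | 053-431-8898 | 공사업 | 전문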