Overview

Dataset statistics

Number of variables: 6
Number of observations: 255
Missing cells: 19
Missing cells (%): 1.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 12.6 KiB
Average record size in memory: 50.5 B
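
The missing-cell percentage can be cross-checked against the per-variable missing counts reported further down in this report (2 for 시공능력평가액, 17 for 2022년 실적(기성액)(천원)). The sketch below only redoes that arithmetic; the variable names are illustrative and not part of the report.

```python
# Figures copied from this report; the names are illustrative only.
n_variables = 6
n_observations = 255
missing_per_variable = {
    "시공능력평가액": 2,
    "2022년 실적(기성액)(천원)": 17,
}

total_cells = n_variables * n_observations          # every cell in the table
missing_cells = sum(missing_per_variable.values())  # cells without a value
missing_pct = 100 * missing_cells / total_cells

print(total_cells, missing_cells, f"{missing_pct:.1f}%")  # 1530 19 1.2%
```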

Variable types

Categorical: 1
Text: 3
Numeric: 2

Dataset

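Description: Data on the member companies of the 대한기계설비건설협회 (Korea Mechanical Construction Contractors Association) in Daejeon Metropolitan City, providing fields such as business type, company name, address, and telephone number.
URL: https://www.data.go.kr/data/15061024/fileData.do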

Alerts

High correlation: 시공능력평가액 is highly overall correlated with 2022년 실적(기성액)(천원)
High correlation: 2022년 실적(기성액)(천원) is highly overall correlated with 시공능력평가액
Missing: 2022년 실적(기성액)(천원) has 17 (6.7%) missing values

Reproduction

Analysis started: 2023-12-12 22:09:21.554990
Analysis finished: 2023-12-12 22:09:22.701614
Duration: 1.15 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

업종 (business type)
Categorical

Distinct: 2
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 KiB

Value | Count
기계설비 | 224
가스1종 | 31

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 기계설비
2nd row: 기계설비
3rd row: 기계설비
4th row: 기계설비
5th row: 기계설비

Common Values

Value | Count | Frequency (%)
기계설비 | 224 | 87.8%
가스1종 | 31 | 12.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
기계설비 | 224 | 87.8%
가스1종 | 31 | 12.2%

상호 (company name)
Text

Distinct: 237
Distinct (%): 92.9%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 KiB

Length

Max length: 29
Median length: 12
Mean length: 8.2
Min length: 5

Characters and Unicode

Total characters: 2091
Distinct characters: 205
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
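
As a rough illustration of the per-character properties mentioned above, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` is an assumption made for this example, not part of the report or of ydata-profiling. The script and block properties are not exposed by the standard library, so only categories are shown.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category code (e.g. 'Lo', 'Ps')."""
    return Counter(unicodedata.category(ch) for ch in text)


# First sample value of the 상호 column, and a 연락처 value from this report.
name_counts = category_counts("(주)한국가스기술공사")
phone_counts = category_counts("042-531-8845~6")

# 'Lo' = Other Letter, 'Ps' = Open Punctuation, 'Pe' = Close Punctuation,
# 'Nd' = Decimal Number, 'Pd' = Dash Punctuation, 'Sm' = Math Symbol.
print(dict(name_counts))
print(dict(phone_counts))
```

Summing such counters over every value in a column would give totals of the kind listed under "Most occurring categories".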

Unique

Unique: 219
Unique (%): 85.9%
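
The figures here are consistent with Distinct counting the different values in the column and Unique counting the values that occur exactly once. The sketch below shows that distinction on a hypothetical toy column (not taken from the dataset) and then checks the counts reported for 상호.

```python
from collections import Counter

# Hypothetical toy column (NOT from the dataset).
values = ["A", "B", "B", "C", "D", "D"]

counts = Counter(values)
n_distinct = len(counts)                              # different values
n_unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once
print(n_distinct, n_unique)  # 4 2

# Consistency check on the figures reported for 상호.
repeated_values = 237 - 219  # distinct values that occur more than once
repeated_rows = 255 - 219    # rows holding those repeated values
print(repeated_rows / repeated_values)  # 2.0, i.e. each repeated name appears twice
```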

Sample

1st row: (주)한국가스기술공사
2nd row: (주)금성백조주택
3rd row: 계룡건설산업(주)
4th row: (주)금영이엔지(KUMYOUNGENGCO., LTD)
5th row: (주)대청엔지니어링
Value | Count | Frequency (%)
주)한국가스기술공사 | 2 | 0.8%
주)국영건설 | 2 | 0.8%
주)코리아산업엔지니어링 | 2 | 0.8%
주)성창엔지니어링 | 2 | 0.8%
주)대경건설 | 2 | 0.8%
주)태현기공 | 2 | 0.8%
주)미래에스코 | 2 | 0.8%
주)에프씨디 | 2 | 0.8%
주)신진엔지니어링 | 2 | 0.8%
주)거창엔지니어링 | 2 | 0.8%
Other values (228) | 236 | 92.2%

Most occurring characters

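Value | Count | Frequency (%)
( | 251 | 12.0%
) | 251 | 12.0%
(unrecovered) | 247 | 11.8%
(unrecovered) | 93 | 4.4%
(unrecovered) | 92 | 4.4%
(unrecovered) | 87 | 4.2%
(unrecovered) | 47 | 2.2%
(unrecovered) | 46 | 2.2%
(unrecovered) | 43 | 2.1%
(unrecovered) | 41 | 2.0%
Other values (195) | 893 | 42.7%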

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1567 | 74.9%
Open Punctuation | 251 | 12.0%
Close Punctuation | 251 | 12.0%
Uppercase Letter | 19 | 0.9%
Other Punctuation | 2 | 0.1%
Space Separator | 1 | < 0.1%

Most frequent character per category

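Other Letter
Value | Count | Frequency (%)
(unrecovered) | 247 | 15.8%
(unrecovered) | 93 | 5.9%
(unrecovered) | 92 | 5.9%
(unrecovered) | 87 | 5.6%
(unrecovered) | 47 | 3.0%
(unrecovered) | 46 | 2.9%
(unrecovered) | 43 | 2.7%
(unrecovered) | 41 | 2.6%
(unrecovered) | 41 | 2.6%
(unrecovered) | 41 | 2.6%
Other values (178) | 789 | 50.4%

Uppercase Letter
Value | Count | Frequency (%)
G | 3 | 15.8%
N | 3 | 15.8%
U | 2 | 10.5%
E | 2 | 10.5%
O | 2 | 10.5%
T | 1 | 5.3%
D | 1 | 5.3%
L | 1 | 5.3%
C | 1 | 5.3%
Y | 1 | 5.3%
Other values (2) | 2 | 10.5%

Other Punctuation
Value | Count | Frequency (%)
, | 1 | 50.0%
. | 1 | 50.0%

Open Punctuation
Value | Count | Frequency (%)
( | 251 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 251 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 1 | 100.0%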

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1567 | 74.9%
Common | 505 | 24.2%
Latin | 19 | 0.9%

Most frequent character per script

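Hangul
Value | Count | Frequency (%)
(unrecovered) | 247 | 15.8%
(unrecovered) | 93 | 5.9%
(unrecovered) | 92 | 5.9%
(unrecovered) | 87 | 5.6%
(unrecovered) | 47 | 3.0%
(unrecovered) | 46 | 2.9%
(unrecovered) | 43 | 2.7%
(unrecovered) | 41 | 2.6%
(unrecovered) | 41 | 2.6%
(unrecovered) | 41 | 2.6%
Other values (178) | 789 | 50.4%

Latin
Value | Count | Frequency (%)
G | 3 | 15.8%
N | 3 | 15.8%
U | 2 | 10.5%
E | 2 | 10.5%
O | 2 | 10.5%
T | 1 | 5.3%
D | 1 | 5.3%
L | 1 | 5.3%
C | 1 | 5.3%
Y | 1 | 5.3%
Other values (2) | 2 | 10.5%

Common
Value | Count | Frequency (%)
( | 251 | 49.7%
) | 251 | 49.7%
(space) | 1 | 0.2%
, | 1 | 0.2%
. | 1 | 0.2%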

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1567 | 74.9%
ASCII | 524 | 25.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
( | 251 | 47.9%
) | 251 | 47.9%
G | 3 | 0.6%
N | 3 | 0.6%
U | 2 | 0.4%
E | 2 | 0.4%
O | 2 | 0.4%
T | 1 | 0.2%
D | 1 | 0.2%
L | 1 | 0.2%
Other values (7) | 7 | 1.3%
Hangul
Value | Count | Frequency (%)
(unrecovered) | 247 | 15.8%
(unrecovered) | 93 | 5.9%
(unrecovered) | 92 | 5.9%
(unrecovered) | 87 | 5.6%
(unrecovered) | 47 | 3.0%
(unrecovered) | 46 | 2.9%
(unrecovered) | 43 | 2.7%
(unrecovered) | 41 | 2.6%
(unrecovered) | 41 | 2.6%
(unrecovered) | 41 | 2.6%
Other values (178) | 789 | 50.4%

소재지 (address)
Text

Distinct: 222
Distinct (%): 87.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 KiB

Length

Max length: 39
Median length: 30
Mean length: 24.509804
Min length: 14

Characters and Unicode

Total characters: 6250
Distinct characters: 165
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 195
Unique (%): 76.5%

Sample

1st row: 대전광역시 유성구 대덕대로 1227(봉산동)
2nd row: 대전광역시 서구 계룡로583번길 9 (탄방동)
3rd row: 대전광역시 서구 문정로48번길 48 (탄방동)
4th row: 대전광역시 유성구 엑스포로 385 (문지동)
5th row: 대전광역시 유성구 배울1로 283 (탑립동)
Value | Count | Frequency (%)
대전광역시 | 253 | 20.4%
유성구 | 73 | 5.9%
대덕구 | 72 | 5.8%
서구 | 52 | 4.2%
중구 | 36 | 2.9%
동구 | 22 | 1.8%
오정동 | 17 | 1.4%
대화동 | 12 | 1.0%
대화로 | 11 | 0.9%
160 | 8 | 0.6%
Other values (401) | 683 | 55.1%

Most occurring characters

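Value | Count | Frequency (%)
(space) | 984 | 15.7%
(unrecovered) | 417 | 6.7%
(unrecovered) | 273 | 4.4%
(unrecovered) | 259 | 4.1%
(unrecovered) | 253 | 4.0%
(unrecovered) | 253 | 4.0%
(unrecovered) | 253 | 4.0%
(unrecovered) | 249 | 4.0%
(unrecovered) | 245 | 3.9%
1 | 236 | 3.8%
Other values (155) | 2828 | 45.2%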

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3739 | 59.8%
Decimal Number | 1044 | 16.7%
Space Separator | 984 | 15.7%
Close Punctuation | 208 | 3.3%
Open Punctuation | 208 | 3.3%
Dash Punctuation | 54 | 0.9%
Other Punctuation | 13 | 0.2%

Most frequent character per category

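Other Letter
Value | Count | Frequency (%)
(unrecovered) | 417 | 11.2%
(unrecovered) | 273 | 7.3%
(unrecovered) | 259 | 6.9%
(unrecovered) | 253 | 6.8%
(unrecovered) | 253 | 6.8%
(unrecovered) | 253 | 6.8%
(unrecovered) | 249 | 6.7%
(unrecovered) | 245 | 6.6%
(unrecovered) | 134 | 3.6%
(unrecovered) | 124 | 3.3%
Other values (139) | 1279 | 34.2%

Decimal Number
Value | Count | Frequency (%)
1 | 236 | 22.6%
2 | 131 | 12.5%
3 | 111 | 10.6%
5 | 108 | 10.3%
4 | 88 | 8.4%
7 | 87 | 8.3%
6 | 78 | 7.5%
8 | 75 | 7.2%
0 | 70 | 6.7%
9 | 60 | 5.7%

Other Punctuation
Value | Count | Frequency (%)
, | 11 | 84.6%
. | 2 | 15.4%

Space Separator
Value | Count | Frequency (%)
(space) | 984 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 208 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 208 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 54 | 100.0%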

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3739 | 59.8%
Common | 2511 | 40.2%

Most frequent character per script

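Hangul
Value | Count | Frequency (%)
(unrecovered) | 417 | 11.2%
(unrecovered) | 273 | 7.3%
(unrecovered) | 259 | 6.9%
(unrecovered) | 253 | 6.8%
(unrecovered) | 253 | 6.8%
(unrecovered) | 253 | 6.8%
(unrecovered) | 249 | 6.7%
(unrecovered) | 245 | 6.6%
(unrecovered) | 134 | 3.6%
(unrecovered) | 124 | 3.3%
Other values (139) | 1279 | 34.2%

Common
Value | Count | Frequency (%)
(space) | 984 | 39.2%
1 | 236 | 9.4%
) | 208 | 8.3%
( | 208 | 8.3%
2 | 131 | 5.2%
3 | 111 | 4.4%
5 | 108 | 4.3%
4 | 88 | 3.5%
7 | 87 | 3.5%
6 | 78 | 3.1%
Other values (6) | 272 | 10.8%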

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3739 | 59.8%
ASCII | 2511 | 40.2%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 984 | 39.2%
1 | 236 | 9.4%
) | 208 | 8.3%
( | 208 | 8.3%
2 | 131 | 5.2%
3 | 111 | 4.4%
5 | 108 | 4.3%
4 | 88 | 3.5%
7 | 87 | 3.5%
6 | 78 | 3.1%
Other values (6) | 272 | 10.8%
Hangul
Value | Count | Frequency (%)
(unrecovered) | 417 | 11.2%
(unrecovered) | 273 | 7.3%
(unrecovered) | 259 | 6.9%
(unrecovered) | 253 | 6.8%
(unrecovered) | 253 | 6.8%
(unrecovered) | 253 | 6.8%
(unrecovered) | 249 | 6.7%
(unrecovered) | 245 | 6.6%
(unrecovered) | 134 | 3.6%
(unrecovered) | 124 | 3.3%
Other values (139) | 1279 | 34.2%

연락처 (contact number)
Text

Distinct: 237
Distinct (%): 92.9%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 KiB

Length

Max length: 14
Median length: 12
Mean length: 12.152941
Min length: 10

Characters and Unicode

Total characters: 3099
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 219
Unique (%): 85.9%

Sample

1st row: 042-600-8225
2nd row: 042-630-9595
3rd row: 070-4470-7367
4th row: 042-824-5538
5th row: 042-933-7100
Value | Count | Frequency (%)
042-600-8225 | 2 | 0.8%
042-534-9262 | 2 | 0.8%
042-537-1957 | 2 | 0.8%
042-221-2700 | 2 | 0.8%
042-536-5631 | 2 | 0.8%
042-483-4221 | 2 | 0.8%
042-488-2553 | 2 | 0.8%
042-282-6547 | 2 | 0.8%
042-253-5222 | 2 | 0.8%
042-544-7017 | 2 | 0.8%
Other values (227) | 235 | 92.2%

Most occurring characters

Value | Count | Frequency (%)
- | 510 | 16.5%
2 | 488 | 15.7%
0 | 443 | 14.3%
4 | 423 | 13.6%
5 | 221 | 7.1%
3 | 211 | 6.8%
6 | 208 | 6.7%
7 | 177 | 5.7%
8 | 159 | 5.1%
1 | 145 | 4.7%
Other values (2) | 114 | 3.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 2572 | 83.0%
Dash Punctuation | 510 | 16.5%
Math Symbol | 17 | 0.5%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 488 | 19.0%
0 | 443 | 17.2%
4 | 423 | 16.4%
5 | 221 | 8.6%
3 | 211 | 8.2%
6 | 208 | 8.1%
7 | 177 | 6.9%
8 | 159 | 6.2%
1 | 145 | 5.6%
9 | 97 | 3.8%

Dash Punctuation
Value | Count | Frequency (%)
- | 510 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 17 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 3099 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
- | 510 | 16.5%
2 | 488 | 15.7%
0 | 443 | 14.3%
4 | 423 | 13.6%
5 | 221 | 7.1%
3 | 211 | 6.8%
6 | 208 | 6.7%
7 | 177 | 5.7%
8 | 159 | 5.1%
1 | 145 | 4.7%
Other values (2) | 114 | 3.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 3099 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 510 | 16.5%
2 | 488 | 15.7%
0 | 443 | 14.3%
4 | 423 | 13.6%
5 | 221 | 7.1%
3 | 211 | 6.8%
6 | 208 | 6.7%
7 | 177 | 5.7%
8 | 159 | 5.1%
1 | 145 | 4.7%
Other values (2) | 114 | 3.7%

시공능력평가액 (construction capability evaluation amount)
Real number (ℝ)

Alerts: High correlation

Distinct: 252
Distinct (%): 99.6%
Missing: 2
Missing (%): 0.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 5751799.1
Minimum: 186612
Maximum: 1.1716766 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.4 KiB

Quantile statistics

Minimum: 186612
5-th percentile: 582038.4
Q1: 1310039
Median: 2048819
Q3: 4490260
95-th percentile: 20002272
Maximum: 1.1716766 × 10^8
Range: 1.1698105 × 10^8
Interquartile range (IQR): 3180221

Descriptive statistics

Standard deviation: 13778278
Coefficient of variation (CV): 2.3954727
Kurtosis: 35.942669
Mean: 5751799.1
Median Absolute Deviation (MAD): 1015969
Skewness: 5.6044687
Sum: 1.4552052 × 10^9
Variance: 1.8984094 × 10^14
Monotonicity: Not monotonic
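
Several of these statistics are simple functions of the others, which makes a quick consistency check possible. The sketch below recomputes them from the figures reported for 시공능력평가액; the variable names are illustrative, and because the reported inputs are rounded, the results agree only to the displayed precision.

```python
# Values copied from the 시공능력평가액 summary in this report.
n_non_missing = 255 - 2          # observations minus missing values
mean = 5751799.1
std = 13778278
minimum, maximum = 186612, 117167660
q1, q3 = 1310039, 4490260

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
variance = std ** 2              # variance is the squared standard deviation
total = mean * n_non_missing     # sum over the non-missing values

print(f"CV={cv:.4f} IQR={iqr} range={value_range}")
print(f"variance={variance:.4e} sum={total:.4e}")
```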
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
375312 | 2 | 0.8%
1194271 | 1 | 0.4%
1329569 | 1 | 0.4%
1329033 | 1 | 0.4%
1310718 | 1 | 0.4%
1310039 | 1 | 0.4%
1285859 | 1 | 0.4%
1268484 | 1 | 0.4%
1259318 | 1 | 0.4%
1257978 | 1 | 0.4%
Other values (242) | 242 | 94.9%
(Missing) | 2 | 0.8%
Minimum 10 values
Value | Count | Frequency (%)
186612 | 1 | 0.4%
327348 | 1 | 0.4%
375312 | 2 | 0.8%
379308 | 1 | 0.4%
415892 | 1 | 0.4%
482005 | 1 | 0.4%
529493 | 1 | 0.4%
530158 | 1 | 0.4%
538936 | 1 | 0.4%
570932 | 1 | 0.4%
Maximum 10 values
Value | Count | Frequency (%)
117167660 | 1 | 0.4%
109532360 | 1 | 0.4%
89453483 | 1 | 0.4%
62913358 | 1 | 0.4%
51727466 | 1 | 0.4%
47595413 | 1 | 0.4%
46988235 | 1 | 0.4%
45511460 | 1 | 0.4%
40549339 | 1 | 0.4%
34426253 | 1 | 0.4%

2022년 실적(기성액)(천원) (2022 performance: value of completed work, in thousand KRW)
Real number (ℝ)

Alerts: High correlation, Missing

Distinct: 237
Distinct (%): 99.6%
Missing: 17
Missing (%): 6.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 2879347.1
Minimum: 0
Maximum: 77641039
Zeros: 1
Zeros (%): 0.4%
Negative: 0
Negative (%): 0.0%
Memory size: 2.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 88369.95
Q1: 377535.25
Median: 930844
Q3: 2557116.2
95-th percentile: 9512134.1
Maximum: 77641039
Range: 77641039
Interquartile range (IQR): 2179581

Descriptive statistics

Standard deviation: 7440661
Coefficient of variation (CV): 2.5841487
Kurtosis: 60.095501
Mean: 2879347.1
Median Absolute Deviation (MAD): 711875
Skewness: 7.031403
Sum: 6.8528461 × 10^8
Variance: 5.5363436 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
171600 | 2 | 0.8%
703657 | 1 | 0.4%
888616 | 1 | 0.4%
752782 | 1 | 0.4%
959848 | 1 | 0.4%
292215 | 1 | 0.4%
319260 | 1 | 0.4%
100639 | 1 | 0.4%
718710 | 1 | 0.4%
457556 | 1 | 0.4%
Other values (227) | 227 | 89.0%
(Missing) | 17 | 6.7%
Minimum 10 values
Value | Count | Frequency (%)
0 | 1 | 0.4%
8288 | 1 | 0.4%
14476 | 1 | 0.4%
41281 | 1 | 0.4%
43763 | 1 | 0.4%
48840 | 1 | 0.4%
52182 | 1 | 0.4%
68917 | 1 | 0.4%
72634 | 1 | 0.4%
76765 | 1 | 0.4%
Maximum 10 values
Value | Count | Frequency (%)
77641039 | 1 | 0.4%
62131296 | 1 | 0.4%
25848014 | 1 | 0.4%
25699373 | 1 | 0.4%
25569313 | 1 | 0.4%
22436319 | 1 | 0.4%
18967621 | 1 | 0.4%
18702651 | 1 | 0.4%
16360877 | 1 | 0.4%
12032757 | 1 | 0.4%

Interactions


Correlations

Variable | 업종 | 시공능력평가액 | 2022년 실적(기성액)(천원)
업종 | 1.000 | 0.000 | 0.000
시공능력평가액 | 0.000 | 1.000 | 0.790
2022년 실적(기성액)(천원) | 0.000 | 0.790 | 1.000

Variable | 시공능력평가액 | 2022년 실적(기성액)(천원) | 업종
시공능력평가액 | 1.000 | 0.778 | 0.000
2022년 실적(기성액)(천원) | 0.778 | 1.000 | 0.000
업종 | 0.000 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
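
One common way to obtain such a nullity correlation is to take the Pearson correlation between per-column presence indicators (1 where a value is present, 0 where it is missing). The sketch below assumes that approach and uses hypothetical toy rows, not the dataset; the `pearson` helper and the column names `a` and `b` are illustrative.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical toy rows (NOT taken from the dataset); None marks a missing cell.
rows = [
    {"a": 1.0, "b": 2.0},
    {"a": None, "b": None},
    {"a": 3.0, "b": 1.0},
    {"a": None, "b": 4.0},
    {"a": 5.0, "b": 6.0},
    {"a": 2.0, "b": None},
]

# 1 where a value is present, 0 where it is missing.
present_a = [int(r["a"] is not None) for r in rows]
present_b = [int(r["b"] is not None) for r in rows]

nullity_corr = pearson(present_a, present_b)
print(round(nullity_corr, 3))  # 0.25
```

A value near 1 means the two columns tend to be missing together, a value near -1 means one tends to be present when the other is missing, and a value near 0 means the missingness patterns are unrelated.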

Sample

First rows
Index | 업종 | 상호 | 소재지 | 연락처 | 시공능력평가액 | 2022년 실적(기성액)(천원)
0 | 기계설비 | (주)한국가스기술공사 | 대전광역시 유성구 대덕대로 1227(봉산동) | 042-600-8225 | 117167660 | 7232674
1 | 기계설비 | (주)금성백조주택 | 대전광역시 서구 계룡로583번길 9 (탄방동) | 042-630-9595 | 89453483 | 18967621
2 | 기계설비 | 계룡건설산업(주) | 대전광역시 서구 문정로48번길 48 (탄방동) | 070-4470-7367 | 62913358 | <NA>
3 | 기계설비 | (주)금영이엔지(KUMYOUNGENGCO., LTD) | 대전광역시 유성구 엑스포로 385 (문지동) | 042-824-5538 | 51727466 | 62131296
4 | 기계설비 | (주)대청엔지니어링 | 대전광역시 유성구 배울1로 283 (탑립동) | 042-933-7100 | 47595413 | 25848014
5 | 기계설비 | 이케이네이션(주) | 대전광역시 서구 둔산서로 17 (둔산동) | 042-477-3972 | 45511460 | 25569313
6 | 기계설비 | 대광이엔시(주) | 대전광역시 대덕구 한밭대로 1027 (오정동) | 042-485-5513 | 40549339 | 77641039
7 | 기계설비 | 대창설비(주) | 대전광역시 동구 흥룡로 34 (가양동, 금정회관) | 042-622-2776 | 34426253 | 25699373
8 | 기계설비 | (주)원엔지니어링 | 대전광역시 유성구 복용북로 71-10 (복용동) | 042-542-0251 | 28828570 | 22436319
9 | 기계설비 | (주)신일이엔씨 | 대전광역시 서구 둔지로 50 (둔산동) 둔산탑클레스 301호 | 042-487-3170 | 28276350 | 16360877

Last rows
Index | 업종 | 상호 | 소재지 | 연락처 | 시공능력평가액 | 2022년 실적(기성액)(천원)
245 | 가스1종 | (주)거창엔지니어링 | 대전광역시 동구 충정로 77 | 042-544-7017 | 1403351 | 545704
246 | 가스1종 | (주)이안플랜트 | 대전광역시 서구 괴정로181번길 35 (용문동, 신축(10.7.21)) | 042-536-7830 | 1367559 | 557073
247 | 가스1종 | (주)성창엔지니어링 | 대전광역시 중구 어덕마을로150번길 9-15 (중촌동) | 042-221-2700 | 1366719 | 698782
248 | 가스1종 | (주)미래에스코 | 대전광역시 서구 둔산대로117번길 44 (만년동) | 042-488-2553 | 1323495 | 120698
249 | 가스1종 | 동원엔지니어링(주) | 대전광역시 서구 갈마로 231 (괴정동) | 042-531-8845~6 | 1281942 | 163961
250 | 가스1종 | (주)하나가스엔지니어링 | 대전광역시 대덕구 대전로1158번길 8 | 042-522-3651 | 1169812 | 172629
251 | 가스1종 | (주)다올건설 | 대전광역시 동구 계족로469번길 20-4 (용전동) | 042-621-7002 | 1056904 | 557756
252 | 가스1종 | (주)신진엔지니어링 | 대전광역시 중구 목중로26번길 45 (중촌동) | 042-253-5222 | 1032850 | 412307
253 | 가스1종 | 월드에너시스 | 대전광역시 유성구 원계산로 192 (계산동) | 042-823-6073 | 821573 | 309064
254 | 가스1종 | 경원엔지니어링(주) | 대전광역시 서구 도솔로 36 (도마동) | 042-471-6741 | 529493 | 232181