Overview

Dataset statistics

Number of variables: 7
Number of observations: 78
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.5 KiB
Average record size in memory: 58.7 B

Variable types

Numeric: 1
Text: 3
Categorical: 3

Dataset

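Description: Registration list of construction engineering firms in Incheon Metropolitan City (인천광역시). The source describes the data as covering company name (업체명), registration number (등록번호), representative name (대표자명), registration date (등록일자), specialty field (전문분야), sub-field (세부분야), and scope of work (업무범위).
Author: Incheon Metropolitan City (인천광역시)
URL: https://data.incheon.go.kr/findData/publicDataDetail?dataId=15103256&srcSe=7661IVAWM27C61E190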

Alerts

세부분야 is highly overall correlated with 전문분야 (High correlation)
전문분야 is highly overall correlated with 연번 and 1 other field (High correlation)
연번 is highly overall correlated with 전문분야 (High correlation)
전문분야 is highly imbalanced (65.6%) (Imbalance)
연번 has unique values (Unique)
등록번호 has unique values (Unique)
업체명 has unique values (Unique)
대표자 has unique values (Unique)
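
The Imbalance alert for 전문분야 can be checked by hand. The sketch below assumes the score is one minus the Shannon entropy of the class frequencies, normalized by the log of the number of classes. That definition is an assumption about the profiler, not something this report states. The counts (73 and 5) are taken from the 전문분야 Common Values table, and `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - normalized Shannon entropy (assumed definition of the alert)."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    # Normalize by the maximum entropy for this number of classes.
    return 1 - entropy / math.log2(len(counts))


# Class counts for 전문분야: 설계사업관리 = 73, 품질검사 = 5.
score = imbalance_score([73, 5])
print(f"{score:.1%}")
```

Under that assumption, the 73/5 split gives about 65.6%, which agrees with the figure in the alert.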

Reproduction

Analysis started: 2024-03-13 06:04:56.917087
Analysis finished: 2024-03-13 06:04:57.857978
Duration: 0.94 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 78
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 39.5
Minimum: 1
Maximum: 78
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 834.0 B

Quantile statistics

Minimum: 1
5-th percentile: 4.85
Q1: 20.25
Median: 39.5
Q3: 58.75
95-th percentile: 74.15
Maximum: 78
Range: 77
Interquartile range (IQR): 38.5
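
Because 연번 is simply the integers 1 to 78, the quantile figures can be reproduced without the original file. The sketch below assumes linear interpolation between order statistics (the default in NumPy and pandas); `quantile` is a hypothetical helper, not part of the profiler.

```python
def quantile(sorted_values, q):
    """Quantile by linear interpolation between neighbouring order statistics."""
    pos = (len(sorted_values) - 1) * q
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


values = list(range(1, 79))  # 연번 runs from 1 to 78
q05, q1, q3, q95 = (quantile(values, q) for q in (0.05, 0.25, 0.75, 0.95))
iqr = q3 - q1
print(round(q05, 2), q1, q3, round(q95, 2), iqr)
```

The results should agree with the table above (4.85, 20.25, 58.75, 74.15, and an IQR of 38.5).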

Descriptive statistics

Standard deviation: 22.660538
Coefficient of variation (CV): 0.57368452
Kurtosis: -1.2
Mean: 39.5
Median Absolute Deviation (MAD): 19.5
Skewness: 0
Sum: 3081
Variance: 513.5
Monotonicity: Strictly increasing
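
Most of these descriptive statistics follow directly from the fact that 연번 is the sequence 1 to 78. A minimal check with the standard library is sketched below, assuming the reported variance and standard deviation are the sample (n - 1) versions, which is what reproduces 513.5.

```python
import statistics

values = list(range(1, 79))  # 연번 runs from 1 to 78

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)          # sample standard deviation
cv = std / mean                         # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)  # median absolute deviation
total = sum(values)

print(mean, variance, round(std, 6), round(cv, 8), mad, total)
```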
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | 1.3%
51 | 1 | 1.3%
58 | 1 | 1.3%
57 | 1 | 1.3%
56 | 1 | 1.3%
55 | 1 | 1.3%
54 | 1 | 1.3%
53 | 1 | 1.3%
52 | 1 | 1.3%
50 | 1 | 1.3%
Other values (68) | 68 | 87.2%

Value | Count | Frequency (%)
1 | 1 | 1.3%
2 | 1 | 1.3%
3 | 1 | 1.3%
4 | 1 | 1.3%
5 | 1 | 1.3%
6 | 1 | 1.3%
7 | 1 | 1.3%
8 | 1 | 1.3%
9 | 1 | 1.3%
10 | 1 | 1.3%

Value | Count | Frequency (%)
78 | 1 | 1.3%
77 | 1 | 1.3%
76 | 1 | 1.3%
75 | 1 | 1.3%
74 | 1 | 1.3%
73 | 1 | 1.3%
72 | 1 | 1.3%
71 | 1 | 1.3%
70 | 1 | 1.3%
69 | 1 | 1.3%

등록번호 (registration number)
Text

UNIQUE 

Distinct: 78
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 756.0 B

Length

Max length: 8
Median length: 7
Mean length: 7.025641
Min length: 6

Characters and Unicode

Total characters: 548
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
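
As an illustration of how such per-category counts can be obtained, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module over the five sample values of 등록번호 shown in this section. It is not the profiler's own code, and `category_counts` is a hypothetical helper. The two-letter codes map to the names used in this report: `Lo` is Other Letter, `Pd` is Dash Punctuation, and `Nd` is Decimal Number.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character of every value."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


# The five sample registration numbers listed in the report.
sample = ["인천-2-1", "인천-2-2", "인천-2-3", "인천-2-6", "인천-2-8"]
counts = category_counts(sample)
print(counts)
```

Each sample value contributes two Hangul letters, two hyphens, and two digits, so the three categories come out equal on this small sample. Over all 78 values the digit count is higher because later registration numbers have two- and three-digit suffixes.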

Unique

Unique: 78
Unique (%): 100.0%

Sample

1st row: 인천-2-1
2nd row: 인천-2-2
3rd row: 인천-2-3
4th row: 인천-2-6
5th row: 인천-2-8
Value | Count | Frequency (%)
인천-2-1 | 1 | 1.3%
인천-2-84 | 1 | 1.3%
인천-2-93 | 1 | 1.3%
인천-2-92 | 1 | 1.3%
인천-2-91 | 1 | 1.3%
인천-2-90 | 1 | 1.3%
인천-2-89 | 1 | 1.3%
인천-2-88 | 1 | 1.3%
인천-2-100 | 1 | 1.3%
인천-2-83 | 1 | 1.3%
Other values (68) | 68 | 87.2%

Most occurring characters

Value | Count | Frequency (%)
- | 156 | 28.5%
2 | 88 | 16.1%
인 | 78 | 14.2%
천 | 78 | 14.2%
1 | 30 | 5.5%
3 | 18 | 3.3%
9 | 18 | 3.3%
6 | 17 | 3.1%
0 | 15 | 2.7%
4 | 14 | 2.6%
Other values (3) | 36 | 6.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 236 | 43.1%
Dash Punctuation | 156 | 28.5%
Other Letter | 156 | 28.5%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 88 | 37.3%
1 | 30 | 12.7%
3 | 18 | 7.6%
9 | 18 | 7.6%
6 | 17 | 7.2%
0 | 15 | 6.4%
4 | 14 | 5.9%
5 | 13 | 5.5%
8 | 12 | 5.1%
7 | 11 | 4.7%
Other Letter
Value | Count | Frequency (%)
인 | 78 | 50.0%
천 | 78 | 50.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 156 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 392 | 71.5%
Hangul | 156 | 28.5%

Most frequent character per script

Common
Value | Count | Frequency (%)
- | 156 | 39.8%
2 | 88 | 22.4%
1 | 30 | 7.7%
3 | 18 | 4.6%
9 | 18 | 4.6%
6 | 17 | 4.3%
0 | 15 | 3.8%
4 | 14 | 3.6%
5 | 13 | 3.3%
8 | 12 | 3.1%
Hangul
Value | Count | Frequency (%)
인 | 78 | 50.0%
천 | 78 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 392 | 71.5%
Hangul | 156 | 28.5%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 156 | 39.8%
2 | 88 | 22.4%
1 | 30 | 7.7%
3 | 18 | 4.6%
9 | 18 | 4.6%
6 | 17 | 4.3%
0 | 15 | 3.8%
4 | 14 | 3.6%
5 | 13 | 3.3%
8 | 12 | 3.1%
Hangul
Value | Count | Frequency (%)
인 | 78 | 50.0%
천 | 78 | 50.0%

업체명 (company name)
Text

UNIQUE 

Distinct: 78
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 756.0 B

Length

Max length: 19
Median length: 14
Mean length: 9.7179487
Min length: 5

Characters and Unicode

Total characters: 758
Distinct characters: 149
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 78
Unique (%): 100.0%

Sample

1st row: (주)승우엔지니어링
2nd row: 일진인터내셔날㈜
3rd row: (주)단에이앤씨종합건축사사무소
4th row: (주)고산엔지니어링
5th row: (주)태원종합기술단건축사사무소
Value | Count | Frequency (%)
주식회사 | 16 | 16.3%
주)유에스티21 | 1 | 1.0%
성학건축사무소 | 1 | 1.0%
대성종합엔지니어링 | 1 | 1.0%
더지오 | 1 | 1.0%
케이씨티엔지니어링 | 1 | 1.0%
㈜주연엔지니어링 | 1 | 1.0%
주)은혜엔지니어링 | 1 | 1.0%
㈜상지건축사사무소 | 1 | 1.0%
㈜위드씨앤에이건축사사무소 | 1 | 1.0%
Other values (73) | 73 | 74.5%

Most occurring characters

ValueCountFrequency (%)
55
 
7.3%
43
 
5.7%
( 38
 
5.0%
) 38
 
5.0%
33
 
4.4%
28
 
3.7%
28
 
3.7%
22
 
2.9%
21
 
2.8%
21
 
2.8%
Other values (139) 431
56.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 639 | 84.3%
Open Punctuation | 38 | 5.0%
Close Punctuation | 38 | 5.0%
Other Symbol | 21 | 2.8%
Space Separator | 20 | 2.6%
Decimal Number | 2 | 0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
55
 
8.6%
43
 
6.7%
33
 
5.2%
28
 
4.4%
28
 
4.4%
22
 
3.4%
21
 
3.3%
21
 
3.3%
20
 
3.1%
17
 
2.7%
Other values (133) 351
54.9%
Decimal Number
Value | Count | Frequency (%)
2 | 1 | 50.0%
1 | 1 | 50.0%
Open Punctuation
Value | Count | Frequency (%)
( | 38 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 38 | 100.0%
Other Symbol
Value | Count | Frequency (%)
㈜ | 21 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 20 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 660 | 87.1%
Common | 98 | 12.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
55
 
8.3%
43
 
6.5%
33
 
5.0%
28
 
4.2%
28
 
4.2%
22
 
3.3%
21
 
3.2%
21
 
3.2%
21
 
3.2%
20
 
3.0%
Other values (134) 368
55.8%
Common
Value | Count | Frequency (%)
( | 38 | 38.8%
) | 38 | 38.8%
(space) | 20 | 20.4%
2 | 1 | 1.0%
1 | 1 | 1.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 639 | 84.3%
ASCII | 98 | 12.9%
None | 21 | 2.8%

Most frequent character per block

Hangul
ValueCountFrequency (%)
55
 
8.6%
43
 
6.7%
33
 
5.2%
28
 
4.4%
28
 
4.4%
22
 
3.4%
21
 
3.3%
21
 
3.3%
20
 
3.1%
17
 
2.7%
Other values (133) 351
54.9%
ASCII
Value | Count | Frequency (%)
( | 38 | 38.8%
) | 38 | 38.8%
(space) | 20 | 20.4%
2 | 1 | 1.0%
1 | 1 | 1.0%
None
Value | Count | Frequency (%)
㈜ | 21 | 100.0%

전문분야 (specialty field)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 756.0 B
설계사업관리: 73
품질검사: 5

Length

Max length: 6
Median length: 6
Mean length: 5.8717949
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 설계사업관리
2nd row: 설계사업관리
3rd row: 설계사업관리
4th row: 설계사업관리
5th row: 설계사업관리

Common Values

Value | Count | Frequency (%)
설계사업관리 (design and project management) | 73 | 93.6%
품질검사 (quality inspection) | 5 | 6.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
설계사업관리 | 73 | 93.6%
품질검사 | 5 | 6.4%

세부분야 (sub-field)
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 7.7%
Missing: 0
Missing (%): 0.0%
Memory size: 756.0 B
일반: 29
설계등용역: 24
건설사업관리: 13
측량: 7
특수: 4

Length

Max length: 6
Median length: 2
Mean length: 3.6410256
Min length: 2

Unique

Unique: 1
Unique (%): 1.3%

Sample

1st row: 일반
2nd row: 일반
3rd row: 일반
4th row: 일반
5th row: 일반

Common Values

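Value | Count | Frequency (%)
일반 (general) | 29 | 37.2%
설계등용역 (design and other services) | 24 | 30.8%
건설사업관리 (construction project management) | 13 | 16.7%
측량 (surveying) | 7 | 9.0%
특수 (special) | 4 | 5.1%
일반, 특수 (general, special) | 1 | 1.3%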
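
The length statistics reported for 세부분야 can be rebuilt from the Common Values counts alone. The sketch below assumes that length means the number of Unicode code points per value, counting the comma and space in "일반, 특수".

```python
import statistics

# Value counts copied from the 세부분야 Common Values table.
common_values = {
    "일반": 29,
    "설계등용역": 24,
    "건설사업관리": 13,
    "측량": 7,
    "특수": 4,
    "일반, 특수": 1,
}

# One length entry per observation (78 in total).
lengths = [len(value) for value, count in common_values.items() for _ in range(count)]

mean_length = statistics.mean(lengths)
median_length = statistics.median(lengths)
print(len(lengths), min(lengths), median_length, round(mean_length, 7), max(lengths))
```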

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
일반 | 30 | 38.0%
설계등용역 | 24 | 30.4%
건설사업관리 | 13 | 16.5%
측량 | 7 | 8.9%
특수 | 5 | 6.3%

대표자 (representative)
Text

UNIQUE 

Distinct: 78
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 756.0 B

Length

Max length: 8
Median length: 3
Mean length: 3.3205128
Min length: 3

Characters and Unicode

Total characters: 259
Distinct characters: 101
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 78
Unique (%): 100.0%

Sample

1st row: 배수진
2nd row: 류황범
3rd row: 박규석
4th row: 박범섭
5th row: 양철웅
Value | Count | Frequency (%)
김민수 | 2 | 2.4%
이규택 | 2 | 2.4%
배수진 | 1 | 1.2%
정은선 | 1 | 1.2%
최태욱 | 1 | 1.2%
서효원 | 1 | 1.2%
원호연 | 1 | 1.2%
한병익 | 1 | 1.2%
홍승혁 | 1 | 1.2%
김종욱 | 1 | 1.2%
Other values (71) | 71 | 85.5%

Most occurring characters

ValueCountFrequency (%)
19
 
7.3%
13
 
5.0%
8
 
3.1%
7
 
2.7%
7
 
2.7%
6
 
2.3%
6
 
2.3%
5
 
1.9%
, 5
 
1.9%
5
 
1.9%
Other values (91) 178
68.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 249 | 96.1%
Other Punctuation | 5 | 1.9%
Space Separator | 5 | 1.9%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
19
 
7.6%
13
 
5.2%
8
 
3.2%
7
 
2.8%
7
 
2.8%
6
 
2.4%
6
 
2.4%
5
 
2.0%
5
 
2.0%
5
 
2.0%
Other values (89) 168
67.5%
Other Punctuation
Value | Count | Frequency (%)
, | 5 | 100.0%
Space Separator
Value | Count | Frequency (%)
(space) | 5 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 249 | 96.1%
Common | 10 | 3.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
19
 
7.6%
13
 
5.2%
8
 
3.2%
7
 
2.8%
7
 
2.8%
6
 
2.4%
6
 
2.4%
5
 
2.0%
5
 
2.0%
5
 
2.0%
Other values (89) 168
67.5%
Common
Value | Count | Frequency (%)
, | 5 | 50.0%
(space) | 5 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 249 | 96.1%
ASCII | 10 | 3.9%

Most frequent character per block

Hangul
ValueCountFrequency (%)
19
 
7.6%
13
 
5.2%
8
 
3.2%
7
 
2.8%
7
 
2.8%
6
 
2.4%
6
 
2.4%
5
 
2.0%
5
 
2.0%
5
 
2.0%
Other values (89) 168
67.5%
ASCII
Value | Count | Frequency (%)
, | 5 | 50.0%
(space) | 5 | 50.0%

소재지 (location)
Categorical

Distinct: 9
Distinct (%): 11.5%
Missing: 0
Missing (%): 0.0%
Memory size: 756.0 B
인천광역시 남동구: 32
인천광역시 연수구: 13
인천광역시 서구: 12
인천광역시 부평구: 9
인천광역시 미추홀구: 7
Other values (4): 5

Length

Max length: 11
Median length: 10
Mean length: 9.7564103
Min length: 9

Unique

Unique: 3
Unique (%): 3.8%

Sample

1st row: 인천광역시 부평구
2nd row: 인천광역시 남동구
3rd row: 인천광역시 미추홀구
4th row: 인천광역시 연수구
5th row: 인천광역시 서구

Common Values

Value | Count | Frequency (%)
인천광역시 남동구 | 32 | 41.0%
인천광역시 연수구 | 13 | 16.7%
인천광역시 서구 | 12 | 15.4%
인천광역시 부평구 | 9 | 11.5%
인천광역시 미추홀구 | 7 | 9.0%
인천광역시 계양구 | 2 | 2.6%
인천광역시 중구 | 1 | 1.3%
인천광역시 강화군 | 1 | 1.3%
인천광역시 연수구 | 1 | 1.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
인천광역시 | 78 | 50.0%
남동구 | 32 | 20.5%
연수구 | 14 | 9.0%
서구 | 12 | 7.7%
부평구 | 9 | 5.8%
미추홀구 | 7 | 4.5%
계양구 | 2 | 1.3%
중구 | 1 | 0.6%
강화군 | 1 | 0.6%

Interactions


Correlations

Variable | 연번 | 등록번호 | 업체명 | 전문분야 | 세부분야 | 대표자 | 소재지
연번 | 1.000 | 1.000 | 1.000 | 0.902 | 0.692 | 1.000 | 0.428
등록번호 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
업체명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
전문분야 | 0.902 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000
세부분야 | 0.692 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000
대표자 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
소재지 | 0.428 | 1.000 | 1.000 | 0.000 | 0.000 | 1.000 | 1.000

Variable | 세부분야 | 전문분야 | 소재지
세부분야 | 1.000 | 0.973 | 0.000
전문분야 | 0.973 | 1.000 | 0.000
소재지 | 0.000 | 0.000 | 1.000

Variable | 연번 | 전문분야 | 세부분야 | 소재지
연번 | 1.000 | 0.699 | 0.442 | 0.205
전문분야 | 0.699 | 1.000 | 0.973 | 0.000
세부분야 | 0.442 | 0.973 | 1.000 | 0.000
소재지 | 0.205 | 0.000 | 0.000 | 1.000
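
The 0.973 reported between 전문분야 and 세부분야 is consistent with a bias-corrected Cramér's V, although the report does not name the measure, so treat that as an assumption. The sketch below rebuilds the cross-tabulation from the Common Values counts and the last sample rows (all five 품질검사 rows carry 특수 or "일반, 특수", so every 세부분야 value belongs to exactly one 전문분야). `cramers_v_corrected` is a hypothetical helper written for this check.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic (no continuity correction).
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Small-sample bias correction applied to phi^2 and to the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


# Rows: 설계사업관리, 품질검사.
# Columns: 일반, 설계등용역, 건설사업관리, 측량, 특수, "일반, 특수".
crosstab = [
    [29, 24, 13, 7, 0, 0],
    [0, 0, 0, 0, 4, 1],
]
v = cramers_v_corrected(crosstab)
print(round(v, 3))
```

Without the correction, this perfectly nested table would give exactly 1.0. The correction shrinks the value because the sample has only 78 rows.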

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

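Index | 연번 | 등록번호 | 업체명 | 전문분야 | 세부분야 | 대표자 | 소재지
0 | 1 | 인천-2-1 | (주)승우엔지니어링 | 설계사업관리 | 일반 | 배수진 | 인천광역시 부평구
1 | 2 | 인천-2-2 | 일진인터내셔날㈜ | 설계사업관리 | 일반 | 류황범 | 인천광역시 남동구
2 | 3 | 인천-2-3 | (주)단에이앤씨종합건축사사무소 | 설계사업관리 | 일반 | 박규석 | 인천광역시 미추홀구
3 | 4 | 인천-2-6 | (주)고산엔지니어링 | 설계사업관리 | 일반 | 박범섭 | 인천광역시 연수구
4 | 5 | 인천-2-8 | (주)태원종합기술단건축사사무소 | 설계사업관리 | 일반 | 양철웅 | 인천광역시 서구
5 | 6 | 인천-2-11 | (주)장원 | 설계사업관리 | 일반 | 장양훈 | 인천광역시 남동구
6 | 7 | 인천-2-12 | (주)대한 | 설계사업관리 | 일반 | 이진홍, 설영만 | 인천광역시 남동구
7 | 8 | 인천-2-14 | (주)한승엔지니어링 | 설계사업관리 | 일반 | 은종철 | 인천광역시 남동구
8 | 9 | 인천-2-15 | (주)한서 | 설계사업관리 | 일반 | 윤호중, 이규택 | 인천광역시 남동구
9 | 10 | 인천-2-16 | (주)백림종합건축사사무소 | 설계사업관리 | 일반 | 이명진 | 인천광역시 미추홀구

Index | 연번 | 등록번호 | 업체명 | 전문분야 | 세부분야 | 대표자 | 소재지
68 | 69 | 인천-2-107 | 주식회사 한양 | 설계사업관리 | 건설사업관리 | 김형일 | 인천광역시 남동구
69 | 70 | 인천-2-108 | (주)소유엔지니어링 | 설계사업관리 | 일반 | 이종민 | 인천광역시 남동구
70 | 71 | 인천-2-109 | 주식회사 셈즈 | 설계사업관리 | 설계등용역 | 이긍재 | 인천광역시 서구
71 | 72 | 인천-2-110 | 주식회사종합건설기술단지그집엔지니어링 | 설계사업관리 | 설계등용역 | 강경범 | 인천광역시 서구
72 | 73 | 인천-2-111 | 진흥기업㈜ | 설계사업관리 | 건설사업관리 | 박상신 | 인천광역시 연수구
73 | 74 | 인천-3-1 | 비티이엔씨㈜ | 품질검사 | 특수 | 김대권 | 인천광역시 남동구
74 | 75 | 인천-3-2 | (재)한국화학융합시험연구원 인천 | 품질검사 | 일반, 특수 | 김현철 | 인천광역시 서구
75 | 76 | 인천-3-3 | ㈜에이피엔 인천출장소 | 품질검사 | 특수 | 정태화 | 인천광역시 남동구
76 | 77 | 인천-3-4 | 에너지테크㈜ 인천지사 | 품질검사 | 특수 | 김흥규 | 인천광역시 남동구
77 | 78 | 인천-3-5 | 에이원기술검사 | 품질검사 | 특수 | 박덕영 | 인천광역시 서구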