Overview

Dataset statistics

Number of variables: 5
Number of observations: 327
Missing cells: 215
Missing cells (%): 13.1%
Duplicate rows: 3
Duplicate rows (%): 0.9%
Total size in memory: 12.9 KiB
Average record size in memory: 40.4 B
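
The percentage figures above can be checked against the raw counts. The following is a minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over rows × columns and the duplicate percentage over the row count; the variable names are illustrative and do not come from the report.

```python
# Counts copied from the dataset statistics in this report.
n_variables = 5
n_observations = 327
missing_cells = 215
duplicate_rows = 3

# Per-variable "Missing" counts reported in the Variables section (126, 3, 86).
per_variable_missing = [126, 3, 86]

# Assumption: percentages are computed over all cells and all rows respectively.
total_cells = n_variables * n_observations
missing_pct = round(100 * missing_cells / total_cells, 1)
duplicate_pct = round(100 * duplicate_rows / n_observations, 1)

print(total_cells, missing_pct, duplicate_pct, sum(per_variable_missing))
```

Under these assumptions the per-variable missing counts add up to the reported number of missing cells, and both percentages round to the values shown in the table.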

Variable types

Text: 4
Categorical: 1

Dataset

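Description: Provides data on building construction commencement reports for Seo-gu, Incheon Metropolitan City, including site location (대지위치), design office name (설계사무소명), contractor office name (시공자사무소명), main use (주용도), and ancillary use (부속용도).
Author: Seo-gu, Incheon Metropolitan City (인천광역시 서구)
URL: https://data.incheon.go.kr/findData/publicDataDetail?dataId=15029297&srcSe=7661IVAWM27C61E190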

Alerts

Dataset has 3 (0.9%) duplicate rows (Duplicates)
부속용도 has 126 (38.5%) missing values (Missing)
시공자사무소명 has 86 (26.3%) missing values (Missing)

Reproduction

Analysis started: 2024-03-18 05:26:22.304936
Analysis finished: 2024-03-18 05:26:22.811499
Duration: 0.51 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

대지위치
Text

Distinct: 319
Distinct (%): 97.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

Length

Max length: 39
Median length: 34
Mean length: 20.449541
Min length: 14
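
The mean length reported for each text variable appears to be its total character count divided by its number of non-missing values. The sketch below checks this using only figures printed in this report; the name 설계사무소명 for the fourth variable is inferred from the column order of the Sample table, and the dictionary layout is purely illustrative.

```python
# (total characters, non-missing values) per text variable, copied from this report.
# Non-missing values = 327 observations minus the reported "Missing" count.
text_variables = {
    "대지위치": (6687, 327 - 0),
    "부속용도": (1739, 327 - 126),
    "설계사무소명": (3222, 327 - 3),   # variable name inferred from the Sample table
    "시공자사무소명": (2195, 327 - 86),
}

# Assumption: mean length = total characters / non-missing values.
mean_lengths = {
    name: total / count for name, (total, count) in text_variables.items()
}

for name, mean in mean_lengths.items():
    print(name, round(mean, 6))
```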

Characters and Unicode

Total characters: 6687
Distinct characters: 93
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
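
As an illustration of how such per-category counts can be obtained, the sketch below classifies the characters of one sample value with Python's standard `unicodedata` module. It is not the profiler's own implementation. The module reports two-letter general categories (`Lo` = Other Letter, `Nd` = Decimal Number, `Zs` = Space Separator, `Pd` = Dash Punctuation); it does not expose scripts or blocks, so those breakdowns are not reproduced here.

```python
import unicodedata
from collections import Counter

# Sample value taken from the "1st row" entry of this variable.
value = "인천광역시 서구 석남동 223-671"

# unicodedata.category returns the two-letter general category of a code point.
category_counts = Counter(unicodedata.category(ch) for ch in value)

for category, count in category_counts.most_common():
    print(category, count)
```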

Unique

Unique: 313
Unique (%): 95.7%

Sample

1st row: 인천광역시 서구 석남동 223-671
2nd row: 인천광역시 서구 심곡동 255-7
3rd row: 인천광역시 서구 원창동 394-15 외5필지
4th row: 인천광역시 서구 오류동 1616-1
5th row: 인천광역시 서구 원당동 1088
Value | Count | Frequency (%)
인천광역시 | 327 | 22.7%
서구 | 327 | 22.7%
석남동 | 44 | 3.1%
오류동 | 39 | 2.7%
외1필지 | 36 | 2.5%
가좌동 | 29 | 2.0%
금곡동 | 24 | 1.7%
원창동 | 22 | 1.5%
당하동 | 22 | 1.5%
마전동 | 21 | 1.5%
Other values (367) | 550 | 38.2%

Most occurring characters

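Value | Count | Frequency (%)
(space) | 1115 | 16.7%
(not preserved) | 350 | 5.2%
(not preserved) | 346 | 5.2%
(not preserved) | 345 | 5.2%
(not preserved) | 335 | 5.0%
(not preserved) | 333 | 5.0%
(not preserved) | 333 | 5.0%
(not preserved) | 327 | 4.9%
(not preserved) | 327 | 4.9%
1 | 307 | 4.6%
Other values (83) | 2569 | 38.4%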

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3766 | 56.3%
Decimal Number | 1506 | 22.5%
Space Separator | 1115 | 16.7%
Dash Punctuation | 288 | 4.3%
Uppercase Letter | 11 | 0.2%
Other Punctuation | 1 | < 0.1%

Most frequent character per category

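Other Letter
Value | Count | Frequency (%)
(not preserved) | 350 | 9.3%
(not preserved) | 346 | 9.2%
(not preserved) | 345 | 9.2%
(not preserved) | 335 | 8.9%
(not preserved) | 333 | 8.8%
(not preserved) | 333 | 8.8%
(not preserved) | 327 | 8.7%
(not preserved) | 327 | 8.7%
(not preserved) | 93 | 2.5%
(not preserved) | 68 | 1.8%
Other values (66) | 909 | 24.1%

Decimal Number
Value | Count | Frequency (%)
1 | 307 | 20.4%
2 | 223 | 14.8%
3 | 188 | 12.5%
6 | 151 | 10.0%
4 | 146 | 9.7%
5 | 144 | 9.6%
7 | 107 | 7.1%
8 | 87 | 5.8%
9 | 83 | 5.5%
0 | 70 | 4.6%

Uppercase Letter
Value | Count | Frequency (%)
C | 4 | 36.4%
F | 3 | 27.3%
L | 2 | 18.2%
B | 2 | 18.2%

Space Separator
Value | Count | Frequency (%)
(space) | 1115 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 288 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 1 | 100.0%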

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3766 | 56.3%
Common | 2910 | 43.5%
Latin | 11 | 0.2%

Most frequent character per script

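Hangul
Value | Count | Frequency (%)
(not preserved) | 350 | 9.3%
(not preserved) | 346 | 9.2%
(not preserved) | 345 | 9.2%
(not preserved) | 335 | 8.9%
(not preserved) | 333 | 8.8%
(not preserved) | 333 | 8.8%
(not preserved) | 327 | 8.7%
(not preserved) | 327 | 8.7%
(not preserved) | 93 | 2.5%
(not preserved) | 68 | 1.8%
Other values (66) | 909 | 24.1%

Common
Value | Count | Frequency (%)
(space) | 1115 | 38.3%
1 | 307 | 10.5%
- | 288 | 9.9%
2 | 223 | 7.7%
3 | 188 | 6.5%
6 | 151 | 5.2%
4 | 146 | 5.0%
5 | 144 | 4.9%
7 | 107 | 3.7%
8 | 87 | 3.0%
Other values (3) | 154 | 5.3%

Latin
Value | Count | Frequency (%)
C | 4 | 36.4%
F | 3 | 27.3%
L | 2 | 18.2%
B | 2 | 18.2%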

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3766 | 56.3%
ASCII | 2921 | 43.7%

Most frequent character per block

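ASCII
Value | Count | Frequency (%)
(space) | 1115 | 38.2%
1 | 307 | 10.5%
- | 288 | 9.9%
2 | 223 | 7.6%
3 | 188 | 6.4%
6 | 151 | 5.2%
4 | 146 | 5.0%
5 | 144 | 4.9%
7 | 107 | 3.7%
8 | 87 | 3.0%
Other values (7) | 165 | 5.6%

Hangul
Value | Count | Frequency (%)
(not preserved) | 350 | 9.3%
(not preserved) | 346 | 9.2%
(not preserved) | 345 | 9.2%
(not preserved) | 335 | 8.9%
(not preserved) | 333 | 8.8%
(not preserved) | 333 | 8.8%
(not preserved) | 327 | 8.7%
(not preserved) | 327 | 8.7%
(not preserved) | 93 | 2.5%
(not preserved) | 68 | 1.8%
Other values (66) | 909 | 24.1%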

주용도
Categorical

Distinct: 21
Distinct (%): 6.4%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

Value | Count
공장 | 88
제2종근린생활시설 | 67
제1종근린생활시설 | 48
공동주택 | 38
단독주택 | 26
Other values (16) | 60

Length

Max length: 10
Median length: 9
Mean length: 5.617737
Min length: 2

Unique

Unique: 6
Unique (%): 1.8%

Sample

1st row: 공장
2nd row: 노유자시설
3rd row: 창고시설
4th row: 공장
5th row: 종교시설

Common Values

Value | Count | Frequency (%)
공장 | 88 | 26.9%
제2종근린생활시설 | 67 | 20.5%
제1종근린생활시설 | 48 | 14.7%
공동주택 | 38 | 11.6%
단독주택 | 26 | 8.0%
자원순환관련시설 | 14 | 4.3%
자동차관련시설 | 13 | 4.0%
창고시설 | 6 | 1.8%
노유자시설 | 6 | 1.8%
업무시설 | 4 | 1.2%
Other values (11) | 17 | 5.2%

Length

Histogram of lengths of the category

부속용도
Text

MISSING 

Distinct: 109
Distinct (%): 54.2%
Missing: 126
Missing (%): 38.5%
Memory size: 2.7 KiB

Length

Max length: 31
Median length: 25
Mean length: 8.6517413
Min length: 2

Characters and Unicode

Total characters: 1739
Distinct characters: 131
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 83
Unique (%): 41.3%

Sample

1st row: 노인복지시설
2nd row: 교회
3rd row: 다가구주택
4th row: 정비공장
5th row: 제2종근린생활시설
Value | Count | Frequency (%)
제조업소 | 25 | 9.8%
제2종근린생활시설 | 13 | 5.1%
소매점 | 10 | 3.9%
제1종근린생활시설 | 8 | 3.1%
(not preserved) | 7 | 2.7%
다세대주택 | 7 | 2.7%
폐기물재활용시설 | 6 | 2.4%
사무소 | 6 | 2.4%
도시형생활주택(단지형다세대주택 | 5 | 2.0%
제1,2종근린생활시설 | 5 | 2.0%
Other values (108) | 163 | 63.9%

Most occurring characters

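Value | Count | Frequency (%)
(not preserved) | 102 | 5.9%
(not preserved) | 82 | 4.7%
(not preserved) | 82 | 4.7%
(not preserved) | 74 | 4.3%
(not preserved) | 71 | 4.1%
(not preserved) | 69 | 4.0%
(not preserved) | 65 | 3.7%
(not preserved) | 58 | 3.3%
(space) | 54 | 3.1%
(not preserved) | 44 | 2.5%
Other values (121) | 1038 | 59.7%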

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1500 | 86.3%
Space Separator | 54 | 3.1%
Decimal Number | 50 | 2.9%
Other Punctuation | 45 | 2.6%
Close Punctuation | 38 | 2.2%
Open Punctuation | 38 | 2.2%
Dash Punctuation | 10 | 0.6%
Math Symbol | 4 | 0.2%

Most frequent character per category

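Other Letter
Value | Count | Frequency (%)
(not preserved) | 102 | 6.8%
(not preserved) | 82 | 5.5%
(not preserved) | 82 | 5.5%
(not preserved) | 74 | 4.9%
(not preserved) | 71 | 4.7%
(not preserved) | 69 | 4.6%
(not preserved) | 65 | 4.3%
(not preserved) | 58 | 3.9%
(not preserved) | 44 | 2.9%
(not preserved) | 43 | 2.9%
Other values (109) | 810 | 54.0%

Decimal Number
Value | Count | Frequency (%)
2 | 28 | 56.0%
1 | 20 | 40.0%
8 | 1 | 2.0%
0 | 1 | 2.0%

Other Punctuation
Value | Count | Frequency (%)
, | 38 | 84.4%
/ | 6 | 13.3%
. | 1 | 2.2%

Space Separator
Value | Count | Frequency (%)
(space) | 54 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 38 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 38 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 10 | 100.0%

Math Symbol
Value | Count | Frequency (%)
+ | 4 | 100.0%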

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1500 | 86.3%
Common | 239 | 13.7%

Most frequent character per script

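Hangul
Value | Count | Frequency (%)
(not preserved) | 102 | 6.8%
(not preserved) | 82 | 5.5%
(not preserved) | 82 | 5.5%
(not preserved) | 74 | 4.9%
(not preserved) | 71 | 4.7%
(not preserved) | 69 | 4.6%
(not preserved) | 65 | 4.3%
(not preserved) | 58 | 3.9%
(not preserved) | 44 | 2.9%
(not preserved) | 43 | 2.9%
Other values (109) | 810 | 54.0%

Common
Value | Count | Frequency (%)
(space) | 54 | 22.6%
) | 38 | 15.9%
( | 38 | 15.9%
, | 38 | 15.9%
2 | 28 | 11.7%
1 | 20 | 8.4%
- | 10 | 4.2%
/ | 6 | 2.5%
+ | 4 | 1.7%
8 | 1 | 0.4%
Other values (2) | 2 | 0.8%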

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1500 | 86.3%
ASCII | 239 | 13.7%

Most frequent character per block

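Hangul
Value | Count | Frequency (%)
(not preserved) | 102 | 6.8%
(not preserved) | 82 | 5.5%
(not preserved) | 82 | 5.5%
(not preserved) | 74 | 4.9%
(not preserved) | 71 | 4.7%
(not preserved) | 69 | 4.6%
(not preserved) | 65 | 4.3%
(not preserved) | 58 | 3.9%
(not preserved) | 44 | 2.9%
(not preserved) | 43 | 2.9%
Other values (109) | 810 | 54.0%

ASCII
Value | Count | Frequency (%)
(space) | 54 | 22.6%
) | 38 | 15.9%
( | 38 | 15.9%
, | 38 | 15.9%
2 | 28 | 11.7%
1 | 20 | 8.4%
- | 10 | 4.2%
/ | 6 | 2.5%
+ | 4 | 1.7%
8 | 1 | 0.4%
Other values (2) | 2 | 0.8%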

설계사무소명
Text

Distinct: 169
Distinct (%): 52.2%
Missing: 3
Missing (%): 0.9%
Memory size: 2.7 KiB

Length

Max length: 20
Median length: 19
Mean length: 9.9444444
Min length: 7

Characters and Unicode

Total characters: 3222
Distinct characters: 162
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 115
Unique (%): 35.5%

Sample

1st row: 건축사사무소 마을
2nd row: 무이건축 건축사사무소
3rd row: (주)모다건축사사무소
4th row: CA건축사사무소
5th row: J&J건축사사무소
Value | Count | Frequency (%)
건축사사무소 | 54 | 13.1%
주식회사 | 16 | 3.9%
마을 | 16 | 3.9%
ca건축사사무소 | 13 | 3.1%
가온건축사사무소 | 11 | 2.7%
한우리건축사사무소 | 10 | 2.4%
한서종합건축사사무소 | 10 | 2.4%
에이스건축사사무소 | 9 | 2.2%
주)도림건축사사무소 | 8 | 1.9%
주영건축사사무소 | 7 | 1.7%
Other values (168) | 259 | 62.7%

Most occurring characters

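Value | Count | Frequency (%)
(not preserved) | 672 | 20.9%
(not preserved) | 335 | 10.4%
(not preserved) | 328 | 10.2%
(not preserved) | 324 | 10.1%
(not preserved) | 324 | 10.1%
(not preserved) | 95 | 2.9%
(space) | 89 | 2.8%
) | 60 | 1.9%
( | 60 | 1.9%
(not preserved) | 58 | 1.8%
Other values (152) | 877 | 27.2%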

Most occurring categories

ValueCountFrequency (%)
Value | Count | Frequency (%)
Other Letter | 2960 | 91.9%
Space Separator | 89 | 2.8%
Close Punctuation | 60 | 1.9%
Open Punctuation | 60 | 1.9%
Uppercase Letter | 47 | 1.5%
Decimal Number | 4 | 0.1%
Other Punctuation | 1 | < 0.1%
Lowercase Letter | 1 | < 0.1%

Most frequent character per category

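Other Letter
Value | Count | Frequency (%)
(not preserved) | 672 | 22.7%
(not preserved) | 335 | 11.3%
(not preserved) | 328 | 11.1%
(not preserved) | 324 | 10.9%
(not preserved) | 324 | 10.9%
(not preserved) | 95 | 3.2%
(not preserved) | 58 | 2.0%
(not preserved) | 57 | 1.9%
(not preserved) | 48 | 1.6%
(not preserved) | 33 | 1.1%
Other values (134) | 686 | 23.2%

Uppercase Letter
Value | Count | Frequency (%)
A | 17 | 36.2%
C | 14 | 29.8%
N | 3 | 6.4%
J | 3 | 6.4%
I | 2 | 4.3%
S | 2 | 4.3%
W | 2 | 4.3%
K | 1 | 2.1%
Y | 1 | 2.1%
H | 1 | 2.1%

Decimal Number
Value | Count | Frequency (%)
0 | 2 | 50.0%
2 | 2 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 89 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 60 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 60 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
& | 1 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
i | 1 | 100.0%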

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 2960 | 91.9%
Common | 214 | 6.6%
Latin | 48 | 1.5%

Most frequent character per script

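Hangul
Value | Count | Frequency (%)
(not preserved) | 672 | 22.7%
(not preserved) | 335 | 11.3%
(not preserved) | 328 | 11.1%
(not preserved) | 324 | 10.9%
(not preserved) | 324 | 10.9%
(not preserved) | 95 | 3.2%
(not preserved) | 58 | 2.0%
(not preserved) | 57 | 1.9%
(not preserved) | 48 | 1.6%
(not preserved) | 33 | 1.1%
Other values (134) | 686 | 23.2%

Latin
Value | Count | Frequency (%)
A | 17 | 35.4%
C | 14 | 29.2%
N | 3 | 6.2%
J | 3 | 6.2%
I | 2 | 4.2%
S | 2 | 4.2%
W | 2 | 4.2%
K | 1 | 2.1%
Y | 1 | 2.1%
H | 1 | 2.1%
Other values (2) | 2 | 4.2%

Common
Value | Count | Frequency (%)
(space) | 89 | 41.6%
) | 60 | 28.0%
( | 60 | 28.0%
0 | 2 | 0.9%
2 | 2 | 0.9%
& | 1 | 0.5%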

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 2960 | 91.9%
ASCII | 262 | 8.1%

Most frequent character per block

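Hangul
Value | Count | Frequency (%)
(not preserved) | 672 | 22.7%
(not preserved) | 335 | 11.3%
(not preserved) | 328 | 11.1%
(not preserved) | 324 | 10.9%
(not preserved) | 324 | 10.9%
(not preserved) | 95 | 3.2%
(not preserved) | 58 | 2.0%
(not preserved) | 57 | 1.9%
(not preserved) | 48 | 1.6%
(not preserved) | 33 | 1.1%
Other values (134) | 686 | 23.2%

ASCII
Value | Count | Frequency (%)
(space) | 89 | 34.0%
) | 60 | 22.9%
( | 60 | 22.9%
A | 17 | 6.5%
C | 14 | 5.3%
N | 3 | 1.1%
J | 3 | 1.1%
I | 2 | 0.8%
S | 2 | 0.8%
0 | 2 | 0.8%
Other values (8) | 10 | 3.8%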

시공자사무소명
Text

MISSING 

Distinct: 186
Distinct (%): 77.2%
Missing: 86
Missing (%): 26.3%
Memory size: 2.7 KiB

Length

Max length: 15
Median length: 13
Mean length: 9.1078838
Min length: 6

Characters and Unicode

Total characters: 2195
Distinct characters: 188
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 154
Unique (%): 63.9%

Sample

1st row: 삼성건축(주)
2nd row: 주식회사 지유건설
3rd row: (주)사닥다리종합건설
4th row: 주식회사이석종합건설
5th row: 덕양종합건설주식회사
Value | Count | Frequency (%)
주식회사 | 28 | 10.4%
주)태일씨앤디종합건설 | 7 | 2.6%
삼성건축(주 | 7 | 2.6%
일동종합건설(주 | 5 | 1.9%
주)도담이앤씨 | 5 | 1.9%
원효종합건설 | 4 | 1.5%
시온건설종합(주 | 4 | 1.5%
우진종합건설(주 | 3 | 1.1%
주)청호건설 | 3 | 1.1%
주)덕우종합건설 | 3 | 1.1%
Other values (178) | 201 | 74.4%

Most occurring characters

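Value | Count | Frequency (%)
(not preserved) | 239 | 10.9%
(not preserved) | 196 | 8.9%
(not preserved) | 185 | 8.4%
( | 175 | 8.0%
) | 175 | 8.0%
(not preserved) | 140 | 6.4%
(not preserved) | 140 | 6.4%
(not preserved) | 68 | 3.1%
(not preserved) | 63 | 2.9%
(not preserved) | 63 | 2.9%
Other values (178) | 751 | 34.2%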

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1811 | 82.5%
Open Punctuation | 175 | 8.0%
Close Punctuation | 175 | 8.0%
Space Separator | 30 | 1.4%
Decimal Number | 4 | 0.2%

Most frequent character per category

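Other Letter
Value | Count | Frequency (%)
(not preserved) | 239 | 13.2%
(not preserved) | 196 | 10.8%
(not preserved) | 185 | 10.2%
(not preserved) | 140 | 7.7%
(not preserved) | 140 | 7.7%
(not preserved) | 68 | 3.8%
(not preserved) | 63 | 3.5%
(not preserved) | 63 | 3.5%
(not preserved) | 34 | 1.9%
(not preserved) | 26 | 1.4%
Other values (173) | 657 | 36.3%

Decimal Number
Value | Count | Frequency (%)
1 | 3 | 75.0%
0 | 1 | 25.0%

Open Punctuation
Value | Count | Frequency (%)
( | 175 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 175 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 30 | 100.0%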

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1811 | 82.5%
Common | 384 | 17.5%

Most frequent character per script

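Hangul
Value | Count | Frequency (%)
(not preserved) | 239 | 13.2%
(not preserved) | 196 | 10.8%
(not preserved) | 185 | 10.2%
(not preserved) | 140 | 7.7%
(not preserved) | 140 | 7.7%
(not preserved) | 68 | 3.8%
(not preserved) | 63 | 3.5%
(not preserved) | 63 | 3.5%
(not preserved) | 34 | 1.9%
(not preserved) | 26 | 1.4%
Other values (173) | 657 | 36.3%

Common
Value | Count | Frequency (%)
( | 175 | 45.6%
) | 175 | 45.6%
(space) | 30 | 7.8%
1 | 3 | 0.8%
0 | 1 | 0.3%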

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1811 | 82.5%
ASCII | 384 | 17.5%

Most frequent character per block

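Hangul
Value | Count | Frequency (%)
(not preserved) | 239 | 13.2%
(not preserved) | 196 | 10.8%
(not preserved) | 185 | 10.2%
(not preserved) | 140 | 7.7%
(not preserved) | 140 | 7.7%
(not preserved) | 68 | 3.8%
(not preserved) | 63 | 3.5%
(not preserved) | 63 | 3.5%
(not preserved) | 34 | 1.9%
(not preserved) | 26 | 1.4%
Other values (173) | 657 | 36.3%

ASCII
Value | Count | Frequency (%)
( | 175 | 45.6%
) | 175 | 45.6%
(space) | 30 | 7.8%
1 | 3 | 0.8%
0 | 1 | 0.3%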

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
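
To make the idea of nullity correlation concrete, the sketch below computes a Pearson correlation between 0/1 presence indicators for two columns. It assumes the heatmap is based on this kind of correlation, which is an assumption rather than something the report states. The indicator lists are transcribed from the first ten rows of the Sample section only, so the result illustrates the calculation and is not the value plotted for the full dataset; `nullity_correlation` is a hypothetical helper name.

```python
from math import sqrt


def nullity_correlation(present_a, present_b):
    """Pearson correlation between two equal-length 0/1 presence indicators."""
    n = len(present_a)
    mean_a = sum(present_a) / n
    mean_b = sum(present_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(present_a, present_b))
    var_a = sum((a - mean_a) ** 2 for a in present_a)
    var_b = sum((b - mean_b) ** 2 for b in present_b)
    if var_a == 0 or var_b == 0:
        # Undefined when a column is always present or always missing.
        return float("nan")
    return cov / sqrt(var_a * var_b)


# 1 = value present, 0 = <NA>, for sample rows 0 to 9.
busok_present = [0, 1, 0, 0, 1, 1, 0, 1, 0, 1]   # 부속용도
sigong_present = [1, 0, 1, 0, 1, 1, 1, 1, 1, 1]  # 시공자사무소명

r = nullity_correlation(busok_present, sigong_present)
print(r)
```

A value near 1 would mean the two columns tend to be present or missing together, a value near -1 that one tends to be present when the other is missing, and a value near 0 that there is no linear relationship between their nullity.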

Sample

Row | 대지위치 | 주용도 | 부속용도 | 설계사무소명 | 시공자사무소명
0 | 인천광역시 서구 석남동 223-671 | 공장 | <NA> | 건축사사무소 마을 | 삼성건축(주)
1 | 인천광역시 서구 심곡동 255-7 | 노유자시설 | 노인복지시설 | 무이건축 건축사사무소 | <NA>
2 | 인천광역시 서구 원창동 394-15 외5필지 | 창고시설 | <NA> | (주)모다건축사사무소 | 주식회사 지유건설
3 | 인천광역시 서구 오류동 1616-1 | 공장 | <NA> | CA건축사사무소 | <NA>
4 | 인천광역시 서구 원당동 1088 | 종교시설 | 교회 | J&J건축사사무소 | (주)사닥다리종합건설
5 | 인천광역시 서구 원당동 800-11 | 단독주택 | 다가구주택 | 예공건축사사무소 | 주식회사이석종합건설
6 | 인천광역시 서구 석남동 587 | 제2종근린생활시설 | <NA> | 도원건축사사무소 | 덕양종합건설주식회사
7 | 인천광역시 서구 가좌동 173-353 | 자동차관련시설 | 정비공장 | 이호건축사사무소 | 1011모터스(주)
8 | 인천광역시 서구 석남동 650-192 외1필지 | 공장 | <NA> | 건축사사무소 마을 | 시온건설종합(주)
9 | 인천광역시 서구 백석동 한들지구 12 14 | 제1종근린생활시설 | 제2종근린생활시설 | 건축사사무소 푸리오슬로종합건설(주)

Row | 대지위치 | 주용도 | 부속용도 | 설계사무소명 | 시공자사무소명
317 | 인천광역시 서구 오류동 114 | 동.식물관련시설 | 콩나물재배사 | 삼익건축사사무소 주식회사 | <NA>
318 | 인천광역시 서구 오류동 115-2 | 동.식물관련시설 | 콩나물재배사 | 삼익건축사사무소 주식회사 | <NA>
319 | 인천광역시 서구 대곡동 186-10 | 제1종근린생활시설 | 소매점 | 한서종합건축사사무소 | <NA>
320 | 인천광역시 서구 당하동 962-3 외4필지 | 제2종근린생활시설 | <NA> | (주)원종합건축사사무소 | <NA>
321 | 인천광역시 서구 대곡동 36-24 | 단독주택 | <NA> | 바론 건축사사무소 | <NA>
322 | 인천광역시 서구 대곡동 36-25 | 단독주택 | <NA> | 바론 건축사사무소 | <NA>
323 | 인천광역시 서구 불로동 802-6 | 제1종근린생활시설 | 공중화장실 | 시공종합건축사사무소 | (주)대광건영
324 | 인천광역시 서구 오류동 1581-4 | 공장 | 물품가공공장 | (주)지에이플랜건축사사무소 | (주)노아엑츄에이션
325 | 인천광역시 서구 경서동 경서3구역 4블럭 1로트 | 위험물저장및처리시설 | 액화석유가스충전소 | 건축사사무소 예토건축 | (주)드림세움건설
326 | 인천광역시 서구 금곡동 643 | 단독주택 | <NA> | 진성건축사사무소 | <NA>

Duplicate rows

Most frequently occurring

Row | 대지위치 | 주용도 | 부속용도 | 설계사무소명 | 시공자사무소명 | # duplicates
0 | 인천광역시 서구 가좌동 556-27 | 공장 | <NA> | 모아건축사사무소 | (주)보브종합건설 | 2
1 | 인천광역시 서구 석남동 257-4 | 공장 | <NA> | 온누리건축사사무소 | (주)도담이앤씨 | 2
2 | 인천광역시 서구 석남동 257-7 | 공장 | <NA> | (주)건축사사무소 명승 | (주)도담이앤씨 | 2
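
One way to arrive at such a duplicate count is to tally identical rows. The sketch below does this with `collections.Counter` on a small illustrative subset: the three rows from the table above, each repeated twice to match the "# duplicates" column, plus one non-duplicated row from the Sample section. It is not the full dataset and not necessarily how the profiler computes the figure; `None` stands in for `<NA>`.

```python
from collections import Counter

# Illustrative subset: (대지위치, 주용도, 부속용도, 설계사무소명, 시공자사무소명).
rows = [
    ("인천광역시 서구 가좌동 556-27", "공장", None, "모아건축사사무소", "(주)보브종합건설"),
    ("인천광역시 서구 가좌동 556-27", "공장", None, "모아건축사사무소", "(주)보브종합건설"),
    ("인천광역시 서구 석남동 257-4", "공장", None, "온누리건축사사무소", "(주)도담이앤씨"),
    ("인천광역시 서구 석남동 257-4", "공장", None, "온누리건축사사무소", "(주)도담이앤씨"),
    ("인천광역시 서구 석남동 257-7", "공장", None, "(주)건축사사무소 명승", "(주)도담이앤씨"),
    ("인천광역시 서구 석남동 257-7", "공장", None, "(주)건축사사무소 명승", "(주)도담이앤씨"),
    ("인천광역시 서구 석남동 223-671", "공장", None, "건축사사무소 마을", "삼성건축(주)"),
]

# Count how many times each full row occurs, then keep rows seen more than once.
counts = Counter(rows)
duplicated = {row: n for row, n in counts.items() if n > 1}

# Rows that would be dropped if only the first occurrence of each were kept.
extra_rows = sum(n - 1 for n in duplicated.values())

for row, n in duplicated.items():
    print(row[0], n)
print("extra rows:", extra_rows)
```

With three rows each occurring twice, three rows are redundant, which is consistent with the 3 duplicate rows (0.9% of 327) reported in the overview.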