Overview

Dataset statistics

Number of variables: 6
Number of observations: 1138
Missing cells: 268
Missing cells (%): 3.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 54.6 KiB
Average record size in memory: 49.1 B
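
The percentage and per-record figures above follow from the raw counts. The sketch below reproduces the arithmetic, assuming the missing-cell percentage is taken over every cell (variables × observations) and that 1 KiB = 1024 B. The variable names are illustrative and not part of the report.

```python
# Values copied from the dataset statistics above.
n_variables = 6
n_observations = 1138
missing_cells = 268
total_size_kib = 54.6

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both results round to the values shown in the report (3.9% and 49.1 B).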

Variable types

Numeric: 1
Text: 4
DateTime: 1

Dataset

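Description: Busan Metropolitan City, status of architectural offices, 2023-07-23 (부산광역시_건축사사무소현황_20230723)
Author: Busan Metropolitan City (부산광역시)
URL: http://data.busan.go.kr/dataSet/detail.nm?contentId=10&publicdatapk=15034666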

Alerts

데이터기준일자 (data reference date) has constant value "2023-07-23" (Constant)
전화번호 (phone number) has 268 (23.6%) missing values (Missing)
순번 (serial number) has unique values (Unique)
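
The three alerts can be derived from each column's distinct and missing counts. The following is a minimal sketch of that logic, not the profiler's own implementation: `alerts_for` is a hypothetical helper, and the profiler's actual rules and thresholds are not shown in this report.

```python
def alerts_for(n_rows, n_distinct, n_missing):
    """Return alert labels for one column (hypothetical helper)."""
    alerts = []
    if n_distinct == 1:
        alerts.append("Constant")  # every non-missing value is identical
    if n_missing > 0:
        alerts.append("Missing")   # at least one empty cell
    if n_missing == 0 and n_distinct == n_rows:
        alerts.append("Unique")    # no value repeats
    return alerts


N_ROWS = 1138
# (distinct, missing) pairs copied from the per-variable sections of the report.
columns = {
    "순번": (1138, 0),
    "전화번호": (746, 268),
    "데이터기준일자": (1, 0),
}

report = {name: alerts_for(N_ROWS, d, m) for name, (d, m) in columns.items()}
for name, alerts in report.items():
    print(name, alerts)
```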

Reproduction

Analysis started: 2024-03-13 13:19:48.058888
Analysis finished: 2024-03-13 13:19:48.927474
Duration: 0.87 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

순번 (serial number)
Real number (ℝ)

UNIQUE

Distinct: 1138
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 569.5
Minimum: 1
Maximum: 1138
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 57.85
Q1: 285.25
Median: 569.5
Q3: 853.75
95-th percentile: 1081.15
Maximum: 1138
Range: 1137
Interquartile range (IQR): 568.5
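
Because 순번 is simply the sequence 1 to 1138, the quantiles above can be reproduced by hand. The sketch below assumes linear interpolation between neighbouring order statistics, which matches the reported figures. The `quantile` helper is illustrative and not taken from the profiler.

```python
import math


def quantile(sorted_values, p):
    """Quantile with linear interpolation between neighbouring order statistics."""
    pos = (len(sorted_values) - 1) * p
    lower = math.floor(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = pos - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


values = list(range(1, 1139))  # 순번 runs from 1 to 1138
q05, q1, median, q3, q95 = (quantile(values, p) for p in (0.05, 0.25, 0.5, 0.75, 0.95))
iqr = q3 - q1
value_range = values[-1] - values[0]

print(q05, q1, median, q3, q95, iqr, value_range)
```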

Descriptive statistics

Standard deviation: 328.65661
Coefficient of variation (CV): 0.57709677
Kurtosis: -1.2
Mean: 569.5
Median Absolute Deviation (MAD): 284.5
Skewness: 0
Sum: 648091
Variance: 108015.17
Monotonicity: Strictly increasing
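
The descriptive statistics can be checked the same way. The sketch below assumes the sample variance (divisor n - 1), which is the convention that reproduces the reported 108015.17, and defines the CV as the standard deviation divided by the mean.

```python
import math
import statistics

values = list(range(1, 1139))  # 순번 runs from 1 to 1138
n = len(values)

total = sum(values)
mean = total / n
# Sample variance: squared deviations divided by n - 1.
variance = sum((x - mean) ** 2 for x in values) / (n - 1)
std = math.sqrt(variance)
cv = std / mean

# Median absolute deviation: median distance from the median.
center = statistics.median(values)
mad = statistics.median(abs(x - center) for x in values)

print(total, mean, round(variance, 2), round(std, 5), round(cv, 8), mad)
```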
Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
1                    1      0.1%
758                  1      0.1%
764                  1      0.1%
763                  1      0.1%
762                  1      0.1%
761                  1      0.1%
760                  1      0.1%
759                  1      0.1%
757                  1      0.1%
766                  1      0.1%
Other values (1128)  1128   99.1%

Minimum 10 values

Value  Count  Frequency (%)
1      1      0.1%
2      1      0.1%
3      1      0.1%
4      1      0.1%
5      1      0.1%
6      1      0.1%
7      1      0.1%
8      1      0.1%
9      1      0.1%
10     1      0.1%

Maximum 10 values

Value  Count  Frequency (%)
1138   1      0.1%
1137   1      0.1%
1136   1      0.1%
1135   1      0.1%
1134   1      0.1%
1133   1      0.1%
1132   1      0.1%
1131   1      0.1%
1130   1      0.1%
1129   1      0.1%

사무소명 (office name)
Text

Distinct: 1052
Distinct (%): 92.4%
Missing: 0
Missing (%): 0.0%
Memory size: 9.0 KiB

Length

Max length: 27
Median length: 20
Mean length: 11.025483
Min length: 7

Characters and Unicode

Total characters: 12547
Distinct characters: 361
Distinct categories: 11
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
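
As an illustration of these character properties, the sketch below uses Python's standard `unicodedata` module to count general categories in one office name from the sample. The module returns two-letter codes (`Lo` for Other Letter, `Ps` for Open Punctuation, `Pe` for Close Punctuation) and does not expose script or block, so only the category breakdown is shown. How the profiler itself derives these properties is not stated in the report.

```python
import unicodedata
from collections import Counter

# One value from the 사무소명 sample rows.
sample = "(주)상지엔지니어링건축사사무소"

# Count the Unicode general category of every character.
category_counts = Counter(unicodedata.category(ch) for ch in sample)

for ch in "(주)":
    print(repr(ch), unicodedata.category(ch), unicodedata.name(ch))
print(dict(category_counts))
```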

Unique

Unique: 992
Unique (%): 87.2%

Sample

1st row: 건축사사무소 한국
2nd row: 신기종합건축사사무소
3rd row: 삼아 건축사사무소
4th row: 혁진 건축사사무소
5th row: (주)상지엔지니어링건축사사무소

Value                          Count  Frequency (%)
건축사사무소                   532    27.9%
종합건축사사무소               85     4.5%
주식회사                       82     4.3%
주)종합건축사사무소            16     0.8%
주)상지엔지니어링건축사사무소  11     0.6%
주)건축사사무소                9      0.5%
사무소                         9      0.5%
건축사                         9      0.5%
(not preserved)                5      0.3%
주)일신설계종합건축사사무소    5      0.3%
Other values (1053)            1142   59.9%

Most occurring characters

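Value               Count  Frequency (%)
(not preserved)     2377   18.9%
(not preserved)     1213   9.7%
(not preserved)     1183   9.4%
(not preserved)     1147   9.1%
(not preserved)     1147   9.1%
(space)             789    6.3%
(not preserved)     338    2.7%
(                   243    1.9%
)                   243    1.9%
(not preserved)     220    1.8%
Other values (351)  3647   29.1%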

Most occurring categories

Value              Count  Frequency (%)
Other Letter       11032  87.9%
Space Separator    789    6.3%
Open Punctuation   243    1.9%
Close Punctuation  243    1.9%
Uppercase Letter   148    1.2%
Other Punctuation  41     0.3%
Lowercase Letter   23     0.2%
Decimal Number     22     0.2%
Dash Punctuation   4      < 0.1%
Final Punctuation  1      < 0.1%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
(not preserved)     2377   21.5%
(not preserved)     1213   11.0%
(not preserved)     1183   10.7%
(not preserved)     1147   10.4%
(not preserved)     1147   10.4%
(not preserved)     338    3.1%
(not preserved)     220    2.0%
(not preserved)     215    1.9%
(not preserved)     189    1.7%
(not preserved)     131    1.2%
Other values (299)  2872   26.0%

Uppercase Letter
Value              Count  Frequency (%)
A                  30     20.3%
M                  12     8.1%
T                  11     7.4%
N                  11     7.4%
C                  11     7.4%
S                  9      6.1%
P                  9      6.1%
E                  9      6.1%
D                  8      5.4%
J                  7      4.7%
Other values (12)  31     20.9%

Lowercase Letter
Value             Count  Frequency (%)
c                 3      13.0%
m                 2      8.7%
h                 2      8.7%
t                 2      8.7%
e                 2      8.7%
s                 2      8.7%
p                 2      8.7%
l                 2      8.7%
a                 2      8.7%
n                 2      8.7%
Other values (2)  2      8.7%

Other Punctuation
Value  Count  Frequency (%)
.      20     48.8%
&      15     36.6%
'      2      4.9%
·      2      4.9%
#      1      2.4%
,      1      2.4%

Decimal Number
Value  Count  Frequency (%)
1      9      40.9%
2      8      36.4%
5      2      9.1%
8      1      4.5%
4      1      4.5%
0      1      4.5%

Space Separator
Value    Count  Frequency (%)
(space)  789    100.0%

Open Punctuation
Value  Count  Frequency (%)
(      243    100.0%

Close Punctuation
Value  Count  Frequency (%)
)      243    100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      4      100.0%

Final Punctuation
Value            Count  Frequency (%)
(not preserved)  1      100.0%

Other Symbol
Value            Count  Frequency (%)
(not preserved)  1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  11031  87.9%
Common  1343   10.7%
Latin   171    1.4%
Han     2      < 0.1%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
(not preserved)     2377   21.5%
(not preserved)     1213   11.0%
(not preserved)     1183   10.7%
(not preserved)     1147   10.4%
(not preserved)     1147   10.4%
(not preserved)     338    3.1%
(not preserved)     220    2.0%
(not preserved)     215    1.9%
(not preserved)     189    1.7%
(not preserved)     131    1.2%
Other values (298)  2871   26.0%

Latin
Value              Count  Frequency (%)
A                  30     17.5%
M                  12     7.0%
T                  11     6.4%
N                  11     6.4%
C                  11     6.4%
S                  9      5.3%
P                  9      5.3%
E                  9      5.3%
D                  8      4.7%
J                  7      4.1%
Other values (24)  54     31.6%

Common
Value             Count  Frequency (%)
(space)           789    58.7%
(                 243    18.1%
)                 243    18.1%
.                 20     1.5%
&                 15     1.1%
1                 9      0.7%
2                 8      0.6%
-                 4      0.3%
'                 2      0.1%
5                 2      0.1%
Other values (7)  8      0.6%

Han
Value            Count  Frequency (%)
(not preserved)  1      50.0%
(not preserved)  1      50.0%

Most occurring blocks

Value        Count  Frequency (%)
Hangul       11030  87.9%
ASCII        1511   12.0%
None         3      < 0.1%
CJK          2      < 0.1%
Punctuation  1      < 0.1%

Most frequent character per block

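Hangul
Value               Count  Frequency (%)
(not preserved)     2377   21.6%
(not preserved)     1213   11.0%
(not preserved)     1183   10.7%
(not preserved)     1147   10.4%
(not preserved)     1147   10.4%
(not preserved)     338    3.1%
(not preserved)     220    2.0%
(not preserved)     215    1.9%
(not preserved)     189    1.7%
(not preserved)     131    1.2%
Other values (297)  2870   26.0%

ASCII
Value              Count  Frequency (%)
(space)            789    52.2%
(                  243    16.1%
)                  243    16.1%
A                  30     2.0%
.                  20     1.3%
&                  15     1.0%
M                  12     0.8%
T                  11     0.7%
N                  11     0.7%
C                  11     0.7%
Other values (39)  126    8.3%

None
Value            Count  Frequency (%)
·                2      66.7%
(not preserved)  1      33.3%

Punctuation
Value            Count  Frequency (%)
(not preserved)  1      100.0%

CJK
Value            Count  Frequency (%)
(not preserved)  1      50.0%
(not preserved)  1      50.0%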

도로명주소 (road-name address)
Text

Distinct: 942
Distinct (%): 82.8%
Missing: 0
Missing (%): 0.0%
Memory size: 9.0 KiB

Length

Max length: 49
Median length: 39
Mean length: 28.16696
Min length: 1

Characters and Unicode

Total characters: 32054
Distinct characters: 371
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 819
Unique (%): 72.0%

Sample

1st row: 부산광역시 부산진구 부전로152번길 31-7
2nd row: 부산광역시 사상구 학감대로 238-10 (감전동 , 협성빌딩202호)
3rd row: 부산광역시 부산진구 새싹로 31
4th row: 부산광역시 동구 중앙대로320번길 7-8, 대양빌딩 302
5th row: 부산광역시 중구 자갈치로 42

Value                Count  Frequency (%)
부산광역시           1100   18.2%
해운대구             178    2.9%
부산진구             119    2.0%
연제구               118    2.0%
수영구               98     1.6%
동래구               96     1.6%
2층                  93     1.5%
3층                  93     1.5%
금정구               91     1.5%
동구                 88     1.5%
Other values (1547)  3965   65.7%

Most occurring characters

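Value               Count  Frequency (%)
(space)             4972   15.5%
(not preserved)     1293   4.0%
1                   1261   3.9%
(not preserved)     1257   3.9%
(not preserved)     1178   3.7%
(not preserved)     1156   3.6%
(not preserved)     1101   3.4%
(not preserved)     1098   3.4%
(not preserved)     1095   3.4%
,                   1080   3.4%
Other values (361)  16563  51.7%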

Most occurring categories

Value              Count  Frequency (%)
Other Letter       18788  58.6%
Decimal Number     6276   19.6%
Space Separator    4972   15.5%
Other Punctuation  1091   3.4%
Open Punctuation   306    1.0%
Close Punctuation  305    1.0%
Dash Punctuation   185    0.6%
Uppercase Letter   106    0.3%
Lowercase Letter   21     0.1%
Control            4      < 0.1%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
(not preserved)     1293   6.9%
(not preserved)     1257   6.7%
(not preserved)     1178   6.3%
(not preserved)     1156   6.2%
(not preserved)     1101   5.9%
(not preserved)     1098   5.8%
(not preserved)     1095   5.8%
(not preserved)     697    3.7%
(not preserved)     594    3.2%
(not preserved)     522    2.8%
Other values (316)  8797   46.8%

Uppercase Letter
Value             Count  Frequency (%)
A                 17     16.0%
B                 14     13.2%
C                 13     12.3%
T                 7      6.6%
K                 7      6.6%
H                 6      5.7%
O                 6      5.7%
E                 6      5.7%
P                 4      3.8%
S                 4      3.8%
Other values (9)  22     20.8%

Decimal Number
Value  Count  Frequency (%)
1      1261   20.1%
2      988    15.7%
0      787    12.5%
3      731    11.6%
4      475    7.6%
7      449    7.2%
6      444    7.1%
5      431    6.9%
9      403    6.4%
8      307    4.9%

Lowercase Letter
Value  Count  Frequency (%)
e      14     66.7%
n      2      9.5%
k      2      9.5%
h      1      4.8%
y      1      4.8%
s      1      4.8%

Other Punctuation
Value  Count  Frequency (%)
,      1080   99.0%
/      6      0.5%
.      3      0.3%
·      1      0.1%
@      1      0.1%

Space Separator
Value    Count  Frequency (%)
(space)  4972   100.0%

Open Punctuation
Value  Count  Frequency (%)
(      306    100.0%

Close Punctuation
Value  Count  Frequency (%)
)      305    100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      185    100.0%

Control
Value            Count  Frequency (%)
(not preserved)  4      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  18788  58.6%
Common  13139  41.0%
Latin   127    0.4%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
(not preserved)     1293   6.9%
(not preserved)     1257   6.7%
(not preserved)     1178   6.3%
(not preserved)     1156   6.2%
(not preserved)     1101   5.9%
(not preserved)     1098   5.8%
(not preserved)     1095   5.8%
(not preserved)     697    3.7%
(not preserved)     594    3.2%
(not preserved)     522    2.8%
Other values (316)  8797   46.8%

Latin
Value              Count  Frequency (%)
A                  17     13.4%
B                  14     11.0%
e                  14     11.0%
C                  13     10.2%
T                  7      5.5%
K                  7      5.5%
H                  6      4.7%
O                  6      4.7%
E                  6      4.7%
P                  4      3.1%
Other values (15)  33     26.0%

Common
Value              Count  Frequency (%)
(space)            4972   37.8%
1                  1261   9.6%
,                  1080   8.2%
2                  988    7.5%
0                  787    6.0%
3                  731    5.6%
4                  475    3.6%
7                  449    3.4%
6                  444    3.4%
5                  431    3.3%
Other values (10)  1521   11.6%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  18788  58.6%
ASCII   13265  41.4%
None    1      < 0.1%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
(space)            4972   37.5%
1                  1261   9.5%
,                  1080   8.1%
2                  988    7.4%
0                  787    5.9%
3                  731    5.5%
4                  475    3.6%
7                  449    3.4%
6                  444    3.3%
5                  431    3.2%
Other values (34)  1647   12.4%

Hangul
Value               Count  Frequency (%)
(not preserved)     1293   6.9%
(not preserved)     1257   6.7%
(not preserved)     1178   6.3%
(not preserved)     1156   6.2%
(not preserved)     1101   5.9%
(not preserved)     1098   5.8%
(not preserved)     1095   5.8%
(not preserved)     697    3.7%
(not preserved)     594    3.2%
(not preserved)     522    2.8%
Other values (316)  8797   46.8%

None
Value  Count  Frequency (%)
·      1      100.0%

전화번호 (phone number)
Text

MISSING

Distinct: 746
Distinct (%): 85.7%
Missing: 268
Missing (%): 23.6%
Memory size: 9.0 KiB

Length

Max length: 13
Median length: 12
Mean length: 12.042529
Min length: 12

Characters and Unicode

Total characters: 10477
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
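
The length and character totals for 전화번호 are mutually consistent, which a short calculation shows. The sketch below assumes that lengths are computed over the non-missing values only and that every number contains exactly two dashes. Both assumptions are inferred from the figures in this report rather than stated in it.

```python
# Figures copied from the report.
n_observations = 1138
n_missing = 268
total_characters = 10477
min_length, max_length = 12, 13

# Assumption: length statistics cover non-missing values only.
n_present = n_observations - n_missing
mean_length = total_characters / n_present

# Every value is 12 or 13 characters long, so the characters beyond
# 12 per value give the number of 13-character values.
n_long = total_characters - min_length * n_present
n_short = n_present - n_long

# Assumption: two dashes per value; the remaining characters are digits.
dashes = 2 * n_present
digits = total_characters - dashes

print(n_present, round(mean_length, 6), n_short, n_long, dashes, digits)
```

The derived dash and digit counts (1740 and 8737) match the Dash Punctuation and Decimal Number totals reported for this variable.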

Unique

Unique: 657
Unique (%): 75.5%

Sample

1st row: 051-817-2119
2nd row: 051-328-4474
3rd row: 051-816-8193
4th row: 051-469-1285
5th row: 051-247-0208

Value               Count  Frequency (%)
051-247-0208        13     1.5%
051-632-8634        5      0.6%
051-462-4712        5      0.6%
051-462-0463        4      0.5%
051-626-7341        3      0.3%
051-463-3355        3      0.3%
070-4044-7174       3      0.3%
051-861-1858        3      0.3%
051-751-5654        3      0.3%
051-867-0244        3      0.3%
Other values (736)  825    94.8%
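
The values shown follow two shapes: `051-817-2119` (12 characters) and `070-4044-7174` (13 characters). The sketch below is a format check built on that observation. The pattern is an assumption inferred from the displayed values, not a rule taken from the dataset's documentation, and `is_valid_phone` is a hypothetical helper.

```python
import re

# Assumed shape: area or service prefix, 3-4 digit exchange, 4 digit line number.
PHONE_PATTERN = re.compile(r"0\d{1,2}-\d{3,4}-\d{4}")


def is_valid_phone(value):
    """Return True when value is present and matches the assumed pattern."""
    return value is not None and PHONE_PATTERN.fullmatch(value) is not None


# Two values from the report, a missing value, and an undashed variant.
samples = ["051-817-2119", "070-4044-7174", None, "0518172119"]
results = [is_valid_phone(v) for v in samples]
print(results)
```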

Most occurring characters

Value  Count  Frequency (%)
-      1740   16.6%
5      1586   15.1%
0      1574   15.0%
1      1495   14.3%
7      680    6.5%
2      675    6.4%
4      653    6.2%
6      626    6.0%
8      574    5.5%
3      521    5.0%
5.0%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    8737   83.4%
Dash Punctuation  1740   16.6%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
5      1586   18.2%
0      1574   18.0%
1      1495   17.1%
7      680    7.8%
2      675    7.7%
4      653    7.5%
6      626    7.2%
8      574    6.6%
3      521    6.0%
9      353    4.0%

Dash Punctuation
Value  Count  Frequency (%)
-      1740   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  10477  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
-      1740   16.6%
5      1586   15.1%
0      1574   15.0%
1      1495   14.3%
7      680    6.5%
2      675    6.4%
4      653    6.2%
6      626    6.0%
8      574    5.5%
3      521    5.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  10477  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
-      1740   16.6%
5      1586   15.1%
0      1574   15.0%
1      1495   14.3%
7      680    6.5%
2      675    6.4%
4      653    6.2%
6      626    6.0%
8      574    5.5%
3      521    5.0%

신고건축사 (registered architect)
Text

Distinct: 1109
Distinct (%): 97.5%
Missing: 0
Missing (%): 0.0%
Memory size: 9.0 KiB

Length

Max length: 3
Median length: 3
Mean length: 2.9824253
Min length: 2

Characters and Unicode

Total characters: 3394
Distinct characters: 205
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1080
Unique (%): 94.9%

Sample

1st row: 김창신
2nd row: 김건수
3rd row: 김성곤
4th row: 임창수
5th row: 김동균

Value                Count  Frequency (%)
이상호               2      0.2%
김민수               2      0.2%
이창훈               2      0.2%
김동준               2      0.2%
이현정               2      0.2%
이상일               2      0.2%
김영환               2      0.2%
박상규               2      0.2%
서상원               2      0.2%
탁문현               2      0.2%
Other values (1099)  1118   98.2%

Most occurring characters

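Value               Count  Frequency (%)
(not preserved)     215    6.3%
(not preserved)     181    5.3%
(not preserved)     129    3.8%
(not preserved)     117    3.4%
(not preserved)     96     2.8%
(not preserved)     73     2.2%
(not preserved)     70     2.1%
(not preserved)     66     1.9%
(not preserved)     64     1.9%
(not preserved)     63     1.9%
Other values (195)  2320   68.4%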

Most occurring categories

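Value         Count  Frequency (%)
Other Letter  3394   100.0%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
(not preserved)     215    6.3%
(not preserved)     181    5.3%
(not preserved)     129    3.8%
(not preserved)     117    3.4%
(not preserved)     96     2.8%
(not preserved)     73     2.2%
(not preserved)     70     2.1%
(not preserved)     66     1.9%
(not preserved)     64     1.9%
(not preserved)     63     1.9%
Other values (195)  2320   68.4%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  3394   100.0%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
(not preserved)     215    6.3%
(not preserved)     181    5.3%
(not preserved)     129    3.8%
(not preserved)     117    3.4%
(not preserved)     96     2.8%
(not preserved)     73     2.2%
(not preserved)     70     2.1%
(not preserved)     66     1.9%
(not preserved)     64     1.9%
(not preserved)     63     1.9%
Other values (195)  2320   68.4%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  3394   100.0%

Most frequent character per block

Hangul
Value               Count  Frequency (%)
(not preserved)     215    6.3%
(not preserved)     181    5.3%
(not preserved)     129    3.8%
(not preserved)     117    3.4%
(not preserved)     96     2.8%
(not preserved)     73     2.2%
(not preserved)     70     2.1%
(not preserved)     66     1.9%
(not preserved)     64     1.9%
(not preserved)     63     1.9%
Other values (195)  2320   68.4%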

데이터기준일자 (data reference date)
Date

CONSTANT

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 9.0 KiB
Minimum: 2023-07-23 00:00:00
Maximum: 2023-07-23 00:00:00
Histogram with fixed size bins (bins=1)

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
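
To make the idea concrete, the sketch below draws a one-column text nullity matrix for 전화번호 from the last ten rows of the report, with `#` for a present value and `.` for `<NA>`. It is a plain-Python illustration of the concept, not the plotting code behind the report's figures.

```python
# 전화번호 values for 순번 1129-1138, copied from the report's last rows.
# None stands for <NA>.
phones = [
    None, None, None, "051-202-7655", None,
    None, "051-513-2021", "051-853-2902", None, "051-714-2425",
]
first_serial = 1129

# One mark per row: '#' when the value is present, '.' when it is missing.
nullity = ["." if phone is None else "#" for phone in phones]
for offset, mark in enumerate(nullity):
    print(first_serial + offset, mark)

missing = nullity.count(".")
missing_pct = 100 * missing / len(phones)
print(f"{missing} of {len(phones)} missing ({missing_pct:.1f}%)")
```

In this slice six of the ten phone numbers are missing, well above the 23.6% rate for the whole column.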

Sample

First rows

(index) | 순번 | 사무소명 | 도로명주소 | 전화번호 | 신고건축사 | 데이터기준일자
0 | 1 | 건축사사무소 한국 | 부산광역시 부산진구 부전로152번길 31-7 | 051-817-2119 | 김창신 | 2023-07-23
1 | 2 | 신기종합건축사사무소 | 부산광역시 사상구 학감대로 238-10 (감전동 , 협성빌딩202호) | 051-328-4474 | 김건수 | 2023-07-23
2 | 3 | 삼아 건축사사무소 | 부산광역시 부산진구 새싹로 31 | 051-816-8193 | 김성곤 | 2023-07-23
3 | 4 | 혁진 건축사사무소 | 부산광역시 동구 중앙대로320번길 7-8, 대양빌딩 302 | 051-469-1285 | 임창수 | 2023-07-23
4 | 5 | (주)상지엔지니어링건축사사무소 | 부산광역시 중구 자갈치로 42 | 051-247-0208 | 김동균 | 2023-07-23
5 | 6 | (주)상지엔지니어링건축사사무소 | 부산광역시 중구 자갈치로 42 | 051-247-0208 | 허동윤 | 2023-07-23
6 | 7 | (주)상지엔지니어링건축사사무소 | 부산광역시 중구 자갈치로 42 | 051-247-0208 | 김태선 | 2023-07-23
7 | 8 | (주)상지엔지니어링건축사사무소 | 부산광역시 중구 자갈치로 42 | 051-247-0208 | 박영목 | 2023-07-23
8 | 9 | (주)상지엔지니어링건축사사무소 | 부산광역시 중구 자갈치로 42 | 051-247-0208 | 김창수 | 2023-07-23
9 | 10 | (주)상지엔지니어링건축사사무소 | 부산광역시 중구 자갈치로 42 | 051-247-0208 | 이광택 | 2023-07-23

Last rows

(index) | 순번 | 사무소명 | 도로명주소 | 전화번호 | 신고건축사 | 데이터기준일자
1128 | 1129 | 리앤유건축사사무소 | 부산광역시 부산진구 중앙대로 800, 1505호 | <NA> | 이여진 | 2023-07-23
1129 | 1130 | 건축사사무소하다 | 부산광역시 연제구 고분로 102-1, 4층 | <NA> | 김지영 | 2023-07-23
1130 | 1131 | 건축사사무소 온닷 | 부산광역시 수영구 수영로 468, 7층 | <NA> | 장성준 | 2023-07-23
1131 | 1132 | 더케이건축사사무소 | 부산광역시 사하구 낙동대로398번길 3-3, 202호 | 051-202-7655 | 김한곤 | 2023-07-23
1132 | 1133 | 건축사사무소 일상 | 부산광역시 중구 충장대로9번길 20-1, 엘스퀘어 1동 1703호 | <NA> | 고경지 | 2023-07-23
1133 | 1134 | 브이아이에이 건축사사무소 | 부산광역시 해운대구 센텀서로 30, 22층, 2203호 | <NA> | 이최선 | 2023-07-23
1134 | 1135 | 시고건축사사무소 | 부산광역시 남구 수영로 209, 102동 302호 시고건축사사무소 | 051-513-2021 | 손석환 | 2023-07-23
1135 | 1136 | (주)에스앤티건축사사무소 | 부산광역시 해운대구 마린시티3로 1, 728호 | 051-853-2902 | 조병택 | 2023-07-23
1136 | 1137 | 디오건축사사무소 | 부산광역시 강서구 명지오션시티9로 50, 상가동 202호 | <NA> | 송명진 | 2023-07-23
1137 | 1138 | 건화엔지니어링건축사사무소 | 부산광역시 연제구 거제대로108번길 32, 2층 | 051-714-2425 | 박정구 | 2023-07-23