Overview

Dataset statistics

Number of variables: 4
Number of observations: 1044
Missing cells: 130
Missing cells (%): 3.1%
Duplicate rows: 6
Duplicate rows (%): 0.6%
Total size in memory: 32.8 KiB
Average record size in memory: 32.1 B
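
The two percentages can be checked from the counts. The sketch below assumes that the missing-cell percentage is taken over all cells (observations times variables) and the duplicate percentage over all rows; the report itself does not state the denominators.

```python
# Counts taken from the dataset statistics in this report.
n_variables = 4
n_observations = 1044
missing_cells = 130
duplicate_rows = 6

# Assumption: missing-cell percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: duplicate-row percentage is relative to the number of rows.
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```

Under these assumptions both values round to the figures shown in the report.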

Variable types

Text: 3
DateTime: 1

Dataset

Description: File download
Author: Gangnam-gu
URL: https://data.seoul.go.kr/dataList/OA-14997/S/1/datasetView.do

Alerts

데이터기준일 has constant value "" (Constant)
Dataset has 6 (0.6%) duplicate rows (Duplicates)
전화번호 has 130 (12.5%) missing values (Missing)

Reproduction

Analysis started: 2023-12-11 05:37:21.046099
Analysis finished: 2023-12-11 05:37:21.746387
Duration: 0.7 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

상호
Text

Distinct: 1035
Distinct (%): 99.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.3 KiB

Length

Max length: 27
Median length: 22
Mean length: 12.295019
Min length: 7
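
As an illustration of how such length statistics are obtained, the sketch below computes them for the five sample values of 상호 listed in this report (not the full column). It also cross-checks the reported mean against the total character count given under "Characters and Unicode" (12836) divided by the 1044 observations; that this is how the report derives the mean is an assumption.

```python
import statistics

# The five sample values of 상호 shown in this report (not the full column).
names = [
    "(주)승창엔지니어링건축사사무소",
    "(주)천일건축엔지니어링종합건축사사무소",
    "(주)예림종합건축사사무소",
    "(주)우양종합건축사사무소",
    "(주)한국환경종합건축사사무소",
]

lengths = [len(name) for name in names]
summary = {
    "max": max(lengths),
    "median": statistics.median(lengths),
    "mean": statistics.mean(lengths),
    "min": min(lengths),
}
print(summary)

# Cross-check of the reported mean length: total characters / observations.
reported_mean = 12836 / 1044
print(round(reported_mean, 6))
```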

Characters and Unicode

Total characters: 12836
Distinct characters: 385
Distinct categories: 8
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
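
A minimal sketch of the category analysis, using Python's standard `unicodedata` module. The helper name `category_counts` and the mapping table are illustrative, not part of the profiling tool; the mapping covers only the general categories that appear in this report.

```python
import unicodedata
from collections import Counter

# Map two-letter general-category codes to the names used in this report.
# Only the codes that occur in this report are listed.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Sm": "Math Symbol",
}

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw code for categories not in the mapping.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

# Two values taken from the sample rows of this report.
sample = ["(주)승창엔지니어링건축사사무소", "02-543-8875"]
print(category_counts(sample))
```

Script and block membership are not exposed by `unicodedata`, so reproducing those tables would need an additional data source.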

Unique

Unique: 1027
Unique (%): 98.4%
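
The difference between "Distinct" and "Unique" can be sketched as follows, assuming the usual convention that distinct counts the different values present while unique counts the values that occur exactly once. The function name and the toy data are illustrative.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique) counts, ignoring missing values (None)."""
    counts = Counter(v for v in values if v is not None)
    distinct = len(counts)                                # different values present
    unique = sum(1 for n in counts.values() if n == 1)    # values seen exactly once
    return distinct, unique

# Toy data, not taken from the dataset.
toy = ["a", "b", "b", "c", None]
print(distinct_and_unique(toy))
```

Under that convention, 1035 distinct and 1027 unique values would mean that 8 values of 상호 occur more than once.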

Sample

1st row: (주)승창엔지니어링건축사사무소
2nd row: (주)천일건축엔지니어링종합건축사사무소
3rd row: (주)예림종합건축사사무소
4th row: (주)우양종합건축사사무소
5th row: (주)한국환경종합건축사사무소

Value | Count | Frequency (%)
건축사사무소 | 117 | 9.1%
주식회사 | 58 | 4.5%
종합건축사사무소 | 18 | 1.4%
주)종합건축사사무소 | 15 | 1.2%
주)건축사사무소 | 13 | 1.0%
(blank) | 3 | 0.2%
아인건축사사무소 | 3 | 0.2%
엠비 | 2 | 0.2%
주)상지모양건축사사무소 | 2 | 0.2%
이담 | 2 | 0.2%
Other values (1046) | 1053 | 81.9%

Most occurring characters

ValueCountFrequency (%)
2160
16.8%
1136
 
8.9%
1118
 
8.7%
1059
 
8.3%
1053
 
8.2%
628
 
4.9%
) 564
 
4.4%
( 563
 
4.4%
297
 
2.3%
287
 
2.2%
Other values (375) 3971
30.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 11263 | 87.7%
Close Punctuation | 567 | 4.4%
Open Punctuation | 566 | 4.4%
Space Separator | 259 | 2.0%
Uppercase Letter | 107 | 0.8%
Lowercase Letter | 28 | 0.2%
Decimal Number | 26 | 0.2%
Other Punctuation | 20 | 0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2160
19.2%
1136
 
10.1%
1118
 
9.9%
1059
 
9.4%
1053
 
9.3%
628
 
5.6%
297
 
2.6%
287
 
2.5%
285
 
2.5%
193
 
1.7%
Other values (321) 3047
27.1%
Uppercase Letter
ValueCountFrequency (%)
S 17
15.9%
A 13
12.1%
I 9
 
8.4%
D 7
 
6.5%
E 7
 
6.5%
O 7
 
6.5%
L 6
 
5.6%
C 5
 
4.7%
H 5
 
4.7%
N 5
 
4.7%
Other values (13) 26
24.3%
Lowercase Letter
ValueCountFrequency (%)
c 4
14.3%
t 4
14.3%
i 3
10.7%
e 3
10.7%
r 3
10.7%
h 2
7.1%
s 2
7.1%
a 1
 
3.6%
n 1
 
3.6%
u 1
 
3.6%
Other values (4) 4
14.3%
Decimal Number
ValueCountFrequency (%)
1 7
26.9%
2 7
26.9%
3 4
15.4%
5 3
11.5%
4 2
 
7.7%
0 1
 
3.8%
9 1
 
3.8%
7 1
 
3.8%
Other Punctuation
ValueCountFrequency (%)
. 14
70.0%
· 4
 
20.0%
, 1
 
5.0%
1
 
5.0%
Close Punctuation
ValueCountFrequency (%)
) 564
99.5%
] 3
 
0.5%
Open Punctuation
ValueCountFrequency (%)
( 563
99.5%
[ 3
 
0.5%
Space Separator
ValueCountFrequency (%)
259
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 11262 | 87.7%
Common | 1438 | 11.2%
Latin | 135 | 1.1%
Han | 1 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2160
19.2%
1136
 
10.1%
1118
 
9.9%
1059
 
9.4%
1053
 
9.4%
628
 
5.6%
297
 
2.6%
287
 
2.5%
285
 
2.5%
193
 
1.7%
Other values (320) 3046
27.0%
Latin
ValueCountFrequency (%)
S 17
 
12.6%
A 13
 
9.6%
I 9
 
6.7%
D 7
 
5.2%
E 7
 
5.2%
O 7
 
5.2%
L 6
 
4.4%
C 5
 
3.7%
H 5
 
3.7%
N 5
 
3.7%
Other values (27) 54
40.0%
Common
ValueCountFrequency (%)
) 564
39.2%
( 563
39.2%
259
18.0%
. 14
 
1.0%
1 7
 
0.5%
2 7
 
0.5%
· 4
 
0.3%
3 4
 
0.3%
5 3
 
0.2%
[ 3
 
0.2%
Other values (7) 10
 
0.7%
Han
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 11262 | 87.7%
ASCII | 1568 | 12.2%
None | 5 | < 0.1%
CJK | 1 | < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2160
19.2%
1136
 
10.1%
1118
 
9.9%
1059
 
9.4%
1053
 
9.4%
628
 
5.6%
297
 
2.6%
287
 
2.5%
285
 
2.5%
193
 
1.7%
Other values (320) 3046
27.0%
ASCII
ValueCountFrequency (%)
) 564
36.0%
( 563
35.9%
259
16.5%
S 17
 
1.1%
. 14
 
0.9%
A 13
 
0.8%
I 9
 
0.6%
1 7
 
0.4%
D 7
 
0.4%
E 7
 
0.4%
Other values (42) 108
 
6.9%
None
ValueCountFrequency (%)
· 4
80.0%
1
 
20.0%
CJK
ValueCountFrequency (%)
1
100.0%

소재지
Text

Distinct: 1002
Distinct (%): 96.0%
Missing: 0
Missing (%): 0.0%
Memory size: 8.3 KiB

Length

Max length: 47
Median length: 40
Mean length: 32.33908
Min length: 19

Characters and Unicode

Total characters: 33762
Distinct characters: 315
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 967
Unique (%): 92.6%

Sample

1st row: 서울특별시 강남구 학동로101길 7-0
2nd row: 서울특별시 강남구 개포로32길 7-0 (개포동)
3rd row: 서울특별시 강남구 테헤란로107길 10-0, 3층(삼성동,성림빌딩)
4th row: 서울특별시 강남구 강남대로54길 18-5, 지하1층(도곡동)
5th row: 서울특별시 강남구 봉은사로30길 74-0, 4층 (역삼동,태남빌딩)

Value | Count | Frequency (%)
서울특별시 | 1044 | 17.8%
강남구 | 1044 | 17.8%
3층 | 70 | 1.2%
4층 | 60 | 1.0%
2층 | 57 | 1.0%
논현로 | 48 | 0.8%
선릉로 | 41 | 0.7%
5층 | 37 | 0.6%
봉은사로 | 35 | 0.6%
역삼동 | 35 | 0.6%
Other values (1633) | 3402 | 57.9%

Most occurring characters

ValueCountFrequency (%)
4862
 
14.4%
0 1570
 
4.7%
, 1241
 
3.7%
1 1236
 
3.7%
1140
 
3.4%
1111
 
3.3%
1072
 
3.2%
1059
 
3.1%
1050
 
3.1%
1049
 
3.1%
Other values (305) 18372
54.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 18635 | 55.2%
Decimal Number | 6875 | 20.4%
Space Separator | 4862 | 14.4%
Other Punctuation | 1244 | 3.7%
Dash Punctuation | 1024 | 3.0%
Open Punctuation | 542 | 1.6%
Close Punctuation | 541 | 1.6%
Uppercase Letter | 38 | 0.1%
Math Symbol | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1140
 
6.1%
1111
 
6.0%
1072
 
5.8%
1059
 
5.7%
1050
 
5.6%
1049
 
5.6%
1047
 
5.6%
1046
 
5.6%
1044
 
5.6%
689
 
3.7%
Other values (275) 8328
44.7%

Uppercase Letter
Value | Count | Frequency (%)
S | 9 | 23.7%
B | 9 | 23.7%
A | 4 | 10.5%
J | 3 | 7.9%
H | 3 | 7.9%
L | 2 | 5.3%
K | 2 | 5.3%
Q | 1 | 2.6%
D | 1 | 2.6%
G | 1 | 2.6%
Other values (3) | 3 | 7.9%

Decimal Number
Value | Count | Frequency (%)
0 | 1570 | 22.8%
1 | 1236 | 18.0%
2 | 961 | 14.0%
3 | 732 | 10.6%
4 | 585 | 8.5%
5 | 471 | 6.9%
6 | 431 | 6.3%
7 | 350 | 5.1%
8 | 332 | 4.8%
9 | 207 | 3.0%

Other Punctuation
Value | Count | Frequency (%)
, | 1241 | 99.8%
. | 3 | 0.2%

Space Separator
Value | Count | Frequency (%)
(space) | 4862 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1024 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 542 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 541 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 18635 | 55.2%
Common | 15089 | 44.7%
Latin | 38 | 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1140
 
6.1%
1111
 
6.0%
1072
 
5.8%
1059
 
5.7%
1050
 
5.6%
1049
 
5.6%
1047
 
5.6%
1046
 
5.6%
1044
 
5.6%
689
 
3.7%
Other values (275) 8328
44.7%
Common
ValueCountFrequency (%)
4862
32.2%
0 1570
 
10.4%
, 1241
 
8.2%
1 1236
 
8.2%
- 1024
 
6.8%
2 961
 
6.4%
3 732
 
4.9%
4 585
 
3.9%
( 542
 
3.6%
) 541
 
3.6%
Other values (7) 1795
 
11.9%
Latin
ValueCountFrequency (%)
S 9
23.7%
B 9
23.7%
A 4
10.5%
J 3
 
7.9%
H 3
 
7.9%
L 2
 
5.3%
K 2
 
5.3%
Q 1
 
2.6%
D 1
 
2.6%
G 1
 
2.6%
Other values (3) 3
 
7.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 18635 | 55.2%
ASCII | 15127 | 44.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4862
32.1%
0 1570
 
10.4%
, 1241
 
8.2%
1 1236
 
8.2%
- 1024
 
6.8%
2 961
 
6.4%
3 732
 
4.8%
4 585
 
3.9%
( 542
 
3.6%
) 541
 
3.6%
Other values (20) 1833
 
12.1%
Hangul
ValueCountFrequency (%)
1140
 
6.1%
1111
 
6.0%
1072
 
5.8%
1059
 
5.7%
1050
 
5.6%
1049
 
5.6%
1047
 
5.6%
1046
 
5.6%
1044
 
5.6%
689
 
3.7%
Other values (275) 8328
44.7%

전화번호
Text

MISSING 

Distinct: 874
Distinct (%): 95.6%
Missing: 130
Missing (%): 12.5%
Memory size: 8.3 KiB

Length

Max length: 13
Median length: 11
Mean length: 11.326039
Min length: 11

Characters and Unicode

Total characters: 10352
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 836
Unique (%): 91.5%

Sample

1st row: 02-543-8875
2nd row: 02-578-2921
3rd row: 02-511-9964
4th row: 02-553-8849
5th row: 02-567-6101

Value | Count | Frequency (%)
02-568-9920 | 3 | 0.3%
02-542-8937 | 3 | 0.3%
02-556-6196 | 2 | 0.2%
02-543-8875 | 2 | 0.2%
02-542-5720 | 2 | 0.2%
02-545-6711 | 2 | 0.2%
02-529-7207 | 2 | 0.2%
02-2051-9330 | 2 | 0.2%
02-516-5181 | 2 | 0.2%
02-544-5608 | 2 | 0.2%
Other values (866) | 894 | 97.6%

Most occurring characters

Value | Count | Frequency (%)
- | 1826 | 17.6%
0 | 1553 | 15.0%
2 | 1490 | 14.4%
5 | 1236 | 11.9%
4 | 825 | 8.0%
1 | 688 | 6.6%
3 | 643 | 6.2%
7 | 608 | 5.9%
6 | 579 | 5.6%
8 | 463 | 4.5%
Other values (3) | 441 | 4.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 8523 | 82.3%
Dash Punctuation | 1826 | 17.6%
Space Separator | 2 | < 0.1%
Math Symbol | 1 | < 0.1%
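
The two space characters and the single math symbol stand out in a column that otherwise holds only digits and hyphens, which suggests a few irregularly formatted entries. A hedged sketch for flagging them follows. The pattern is an assumption inferred from the values shown in this report, and the last entry in the list is a hypothetical irregular value, not one taken from the dataset.

```python
import re

# Assumed pattern: 2-3 digit area/service code starting with 0,
# a 3-4 digit exchange, and a 4 digit line number.
PHONE_RE = re.compile(r"0\d{1,2}-\d{3,4}-\d{4}")

def irregular_phones(values):
    """Return non-missing values that do not fully match PHONE_RE."""
    return [v for v in values if v is not None and not PHONE_RE.fullmatch(v)]

phones = [
    "02-543-8875",     # from the sample rows
    "02-2051-9330",    # from the value counts
    "070-4012-3222",   # from the last rows of the dataset
    None,              # missing value
    "02-543-8875~6",   # hypothetical irregular entry, not from the dataset
]
print(irregular_phones(phones))
```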

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 1553 | 18.2%
2 | 1490 | 17.5%
5 | 1236 | 14.5%
4 | 825 | 9.7%
1 | 688 | 8.1%
3 | 643 | 7.5%
7 | 608 | 7.1%
6 | 579 | 6.8%
8 | 463 | 5.4%
9 | 438 | 5.1%

Dash Punctuation
Value | Count | Frequency (%)
- | 1826 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 2 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 10352 | 100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
- 1826
17.6%
0 1553
15.0%
2 1490
14.4%
5 1236
11.9%
4 825
8.0%
1 688
 
6.6%
3 643
 
6.2%
7 608
 
5.9%
6 579
 
5.6%
8 463
 
4.5%
Other values (3) 441
 
4.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 10352 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 1826
17.6%
0 1553
15.0%
2 1490
14.4%
5 1236
11.9%
4 825
8.0%
1 688
 
6.6%
3 643
 
6.2%
7 608
 
5.9%
6 579
 
5.6%
8 463
 
4.5%
Other values (3) 441
 
4.3%

데이터기준일
Date

CONSTANT 

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 8.3 KiB
Minimum: 2018-03-31 00:00:00
Maximum: 2018-03-31 00:00:00
Histogram with fixed size bins (bins=1)

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
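
The data behind both views can be sketched in a few lines: a per-column count of missing values and a row-by-column matrix marking each cell as present or missing. The function name is illustrative, and the three rows are taken from the last rows of this dataset, with `None` standing in for `<NA>`.

```python
def nullity_summary(rows, columns):
    """Return per-column missing counts and a nullity matrix (1 = present, 0 = missing)."""
    matrix = [[0 if row.get(col) is None else 1 for col in columns] for row in rows]
    missing = {
        col: sum(1 for line in matrix if line[i] == 0)
        for i, col in enumerate(columns)
    }
    return missing, matrix

columns = ["상호", "소재지", "전화번호", "데이터기준일"]
rows = [
    {"상호": "건축사사무소101",
     "소재지": "서울특별시 강남구 개포로22길 5-0, 201호(개포동)",
     "전화번호": "02-529-2605", "데이터기준일": "2018-03-31"},
    {"상호": "주식회사 에이치에스플랜건축사사무소",
     "소재지": "서울특별시 강남구 도산대로49길 6-10, 3층(신사동)",
     "전화번호": None, "데이터기준일": "2018-03-31"},
    {"상호": "(주)케이에스케이도시건축사사무소",
     "소재지": "서울특별시 강남구 도곡로 203-0, 4층(역삼동)",
     "전화번호": None, "데이터기준일": "2018-03-31"},
]

missing, matrix = nullity_summary(rows, columns)
print(missing)
for line in matrix:
    print(line)
```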

Sample

First rows

Index | 상호 | 소재지 | 전화번호 | 데이터기준일
0 | (주)승창엔지니어링건축사사무소 | 서울특별시 강남구 학동로101길 7-0 | 02-543-8875 | 2018-03-31
1 | (주)천일건축엔지니어링종합건축사사무소 | 서울특별시 강남구 개포로32길 7-0 (개포동) | 02-578-2921 | 2018-03-31
2 | (주)예림종합건축사사무소 | 서울특별시 강남구 테헤란로107길 10-0, 3층(삼성동,성림빌딩) | 02-511-9964 | 2018-03-31
3 | (주)우양종합건축사사무소 | 서울특별시 강남구 강남대로54길 18-5, 지하1층(도곡동) | 02-553-8849 | 2018-03-31
4 | (주)한국환경종합건축사사무소 | 서울특별시 강남구 봉은사로30길 74-0, 4층 (역삼동,태남빌딩) | 02-567-6101 | 2018-03-31
5 | (주)한빛종합건축사사무소 | 서울특별시 강남구 언주로138길 8-0 (논현동,한빛빌딩) | 02-546-3720 | 2018-03-31
6 | 삼예종합건축사사무소 | 서울특별시 강남구 학동로 520-0 | 02-543-1031 | 2018-03-31
7 | (주)삼진앤영건축사사무소 | 서울특별시 강남구 선릉로 664-0, 206호 (삼성동, 건설빌딩) | 02-549-3327 | 2018-03-31
8 | 맥가종합건축사사무소 | 서울특별시 강남구 봉은사로68길 6-0, 은경빌딩 3층 | 02-540-1693 | 2018-03-31
9 | 종합건축사사무소정일 | 서울특별시 강남구 선릉로132길 19-5 | 02-546-7543 | 2018-03-31

Last rows

Index | 상호 | 소재지 | 전화번호 | 데이터기준일
1034 | 건축사사무소101 | 서울특별시 강남구 개포로22길 5-0, 201호(개포동) | 02-529-2605 | 2018-03-31
1035 | 주식회사 에이치에스플랜건축사사무소 | 서울특별시 강남구 도산대로49길 6-10, 3층(신사동) | <NA> | 2018-03-31
1036 | (주)케이에스케이도시건축사사무소 | 서울특별시 강남구 도곡로 203-0, 4층(역삼동) | <NA> | 2018-03-31
1037 | 주식회사 단감건축사사무소 | 서울특별시 강남구 논현로30길 25-0, 2층 (도곡동, 남산빌딩) | 02-6217-8753 | 2018-03-31
1038 | (주)사간건축사사무소 | 서울특별시 강남구 개포로17길 13-0, 302호(개포동) | 02-3461-2272 | 2018-03-31
1039 | 에네스건축사사무소 | 서울특별시 강남구 도산대로51길 15-0, B03호(신사동) | <NA> | 2018-03-31
1040 | 송현그룹건축사사무소(주) | 서울특별시 강남구 언주로 323-0, 3층(역삼동) | 02-421-0152 | 2018-03-31
1041 | 마인드브릭건축사사무소 | 서울특별시 강남구 학동로7길 29-0, 2층(논현동) | <NA> | 2018-03-31
1042 | (주)디자인그룹오즈건축사사무소 | 서울특별시 강남구 도곡로 219-0, 4층(역삼동, 우노빌딩) | 070-4012-3222 | 2018-03-31
1043 | 주식회사 터건축사사무소 | 서울특별시 강남구 논현로38길 9-0, 3층(도곡동,자은빌딩) | 02-516-4537 | 2018-03-31

Duplicate rows

Most frequently occurring

Index | 상호 | 소재지 | 전화번호 | 데이터기준일 | # duplicates
0 | (주)건축사사무소 엠비 | 서울특별시 강남구 봉은사로29길 25-0, (논현동,토미빌 비01호) | 02-547-0574 | 2018-03-31 | 2
1 | (주)무한종합건축사사무소 | 서울특별시 강남구 선릉로131길 9-0, 하나빌딩4층 (논현동) | 02-516-5181 | 2018-03-31 | 2
2 | (주)상지모양건축사사무소 | 서울특별시 강남구 논현로 42-0, 2층(개포동) | 02-529-6522 | 2018-03-31 | 2
3 | (주)종합건축사사무소 에프엘 | 서울특별시 강남구 봉은사로68길 33-1, 포르테라인 2층(삼성동) | 02-572-5415 | 2018-03-31 | 2
4 | 건축사사무소 이담 | 서울특별시 강남구 논현로63길 19-0, 대흥빌딩 301호(역삼동) | 02-568-9920 | 2018-03-31 | 2
5 | 주식회사 청건축사사무소 | 서울특별시 강남구 개포로22길 6-0, 4층(개포동,광혜빌딩) | 02-518-7301 | 2018-03-31 | 2
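
A minimal sketch of how fully duplicated rows such as these can be found: each row is treated as a tuple of all four columns and counted. The function name is illustrative. The first record is taken from the table above and entered twice; the second is from the first rows of the dataset.

```python
from collections import Counter

def duplicate_rows(rows):
    """Return {row: count} for rows that occur more than once across all columns."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}

rows = [
    ("건축사사무소 이담", "서울특별시 강남구 논현로63길 19-0, 대흥빌딩 301호(역삼동)",
     "02-568-9920", "2018-03-31"),
    ("삼예종합건축사사무소", "서울특별시 강남구 학동로 520-0",
     "02-543-1031", "2018-03-31"),
    ("건축사사무소 이담", "서울특별시 강남구 논현로63길 19-0, 대흥빌딩 301호(역삼동)",
     "02-568-9920", "2018-03-31"),
]

for row, n in duplicate_rows(rows).items():
    print(n, row)
```

Because the comparison uses every column, two records with the same 상호 but different 소재지 or 전화번호 would not be reported as duplicates.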