Overview

Dataset statistics

Number of variables: 4
Number of observations: 76
Missing cells: 32
Missing cells (%): 10.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.6 KiB
Average record size in memory: 34.7 B
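
The missing-cell percentage is taken over all cells (variables × observations), not over rows. The short check below reproduces it from the figures in this report; the variable names are illustrative only.

```python
# Figures copied from the dataset statistics in this report.
n_variables = 4
n_observations = 76
missing_cells = 32

total_cells = n_variables * n_observations        # 4 * 76 = 304 cells
missing_pct = 100 * missing_cells / total_cells   # share of all cells

# All 32 missing cells are in one column (전화번호), so that column's
# own missing rate is computed over the 76 observations instead.
column_missing_pct = 100 * missing_cells / n_observations

print(f"{missing_pct:.1f}% of all cells, {column_missing_pct:.1f}% of the column")
```

This is why the overview reports 10.5% while the 전화번호 alert reports 42.1% for the same 32 cells.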

Variable types

Numeric: 1
Text: 3

Dataset

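Description: Data on architecture offices located in Dongdaemun-gu, Seoul, providing the office name, road-name address, phone number, and related information.
Author: 서울특별시 동대문구 (Dongdaemun-gu, Seoul)
URL: https://www.data.go.kr/data/15126204/fileData.do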

Alerts

전화번호 has 32 (42.1%) missing values [Missing]
연번 has unique values [Unique]

Reproduction

Analysis started: 2024-01-14 13:36:16.859884
Analysis finished: 2024-01-14 13:36:17.495810
Duration: 0.64 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
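
For readers who want to generate a comparable report, the following is a minimal sketch using `ProfileReport` from ydata-profiling. It is not the original run: the CSV file name, the `cp949` encoding, the report title, and the output path are all assumptions that do not appear in this report.

```python
def build_report(csv_path: str, out_path: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report (sketch, not the original run)."""
    # Imported inside the function so the definition loads even when the
    # packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: Korean public-data CSV files are often cp949-encoded.
    df = pd.read_csv(csv_path, encoding="cp949")
    report = ProfileReport(df, title="Dongdaemun-gu architecture offices")
    report.to_file(out_path)

# Hypothetical usage, with a placeholder file name:
# build_report("architecture_offices.csv")
```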

Variables

연번
Real number (ℝ)

UNIQUE

Distinct: 76
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 38.5
Minimum: 1
Maximum: 76
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 816.0 B

Quantile statistics

Minimum: 1
5-th percentile: 4.75
Q1: 19.75
Median: 38.5
Q3: 57.25
95-th percentile: 72.25
Maximum: 76
Range: 75
Interquartile range (IQR): 37.5
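
The fractional quantiles are consistent with linear interpolation between neighbouring values. The sketch below assumes 연번 holds exactly the integers 1 to 76, which fits the reported minimum, maximum, distinct count, and sum; the helper `quantile` is written here for illustration and is not part of any library.

```python
import math

def quantile(sorted_values, p):
    """Quantile by linear interpolation between the two nearest ranks."""
    pos = (len(sorted_values) - 1) * p
    lower = math.floor(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac

values = list(range(1, 77))  # assumed contents of 연번: 1, 2, ..., 76

p05 = quantile(values, 0.05)
q1 = quantile(values, 0.25)
median = quantile(values, 0.50)
q3 = quantile(values, 0.75)
p95 = quantile(values, 0.95)
iqr = q3 - q1

print(p05, q1, median, q3, p95, iqr)
```

For example, Q1 sits at position 0.25 × 75 = 18.75, three quarters of the way from 19 to 20, which gives 19.75.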

Descriptive statistics

Standard deviation: 22.083176
Coefficient of variation (CV): 0.57358899
Kurtosis: -1.2
Mean: 38.5
Median Absolute Deviation (MAD): 19
Skewness: 0
Sum: 2926
Variance: 487.66667
Monotonicity: Strictly increasing
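
Several of these figures can be cross-checked with the standard library, again assuming 연번 is the sequence 1 to 76. The variance here is the sample variance (n − 1 denominator), which matches the reported value.

```python
import statistics

values = list(range(1, 77))  # assumed contents of 연번: 1, 2, ..., 76

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, n - 1 denominator
std_dev = statistics.stdev(values)       # square root of the sample variance
cv = std_dev / mean                      # coefficient of variation
median = statistics.median(values)
# Median absolute deviation: median of the distances from the median.
mad = statistics.median([abs(v - median) for v in values])

print(total, mean, variance, std_dev, cv, mad)
```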
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
1 | 1 | 1.3%
50 | 1 | 1.3%
57 | 1 | 1.3%
56 | 1 | 1.3%
55 | 1 | 1.3%
54 | 1 | 1.3%
53 | 1 | 1.3%
52 | 1 | 1.3%
51 | 1 | 1.3%
49 | 1 | 1.3%
Other values (66) | 66 | 86.8%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 1.3%
2 | 1 | 1.3%
3 | 1 | 1.3%
4 | 1 | 1.3%
5 | 1 | 1.3%
6 | 1 | 1.3%
7 | 1 | 1.3%
8 | 1 | 1.3%
9 | 1 | 1.3%
10 | 1 | 1.3%

Maximum 10 values
Value | Count | Frequency (%)
76 | 1 | 1.3%
75 | 1 | 1.3%
74 | 1 | 1.3%
73 | 1 | 1.3%
72 | 1 | 1.3%
71 | 1 | 1.3%
70 | 1 | 1.3%
69 | 1 | 1.3%
68 | 1 | 1.3%
67 | 1 | 1.3%

사무소명
Text

Distinct: 74
Distinct (%): 97.4%
Missing: 0
Missing (%): 0.0%
Memory size: 740.0 B

Length

Max length: 18
Median length: 15.5
Mean length: 11.171053
Min length: 8

Characters and Unicode

Total characters: 849
Distinct characters: 130
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
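
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for one value taken from the sample rows of this report; it covers categories only, since the standard library does not expose script or block names.

```python
import unicodedata
from collections import Counter

# One 사무소명 value from the report's sample rows.
text = "(주) 인선 건축사사무소"

# Two-letter general categories: Lo = Other Letter, Zs = Space Separator,
# Ps = Open Punctuation, Pe = Close Punctuation.
categories = Counter(unicodedata.category(ch) for ch in text)

for code, count in categories.most_common():
    print(code, count)
```

Hangul syllables fall under Other Letter, which is why that category dominates the tables for the Korean text columns.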

Unique

Unique: 72
Unique (%): 94.7%

Sample

1st row: 건축사사무소 신흥
2nd row: 서광 종합건축사사무소
3rd row: 건축사사무소 송정
4th row: 건축사사무소 대원창건사
5th row: 익성 종합건축사사무소
Value | Count | Frequency (%)
건축사사무소 | 22 | 18.8%
주식회사 | 12 | 10.3%
종합건축사사무소 | 3 | 2.6%
와이지피 | 2 | 1.7%
이마건축사사무소 | 2 | 1.7%
(value not rendered) | 2 | 1.7%
건축사사무소가온 | 1 | 0.9%
정현건축사사무소 | 1 | 0.9%
해봄건축사사무소 | 1 | 0.9%
주)엘피건축사사무소 | 1 | 0.9%
Other values (70) | 70 | 59.8%
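
The counts in this table are per whitespace-separated token rather than per row, and the percentages appear to be relative to the total number of tokens (the listed counts sum to 117, and 22 / 117 ≈ 18.8%). A minimal sketch of that kind of count, using only the five sample values shown for this variable:

```python
from collections import Counter

# The five sample values listed for 사무소명 in this report.
names = [
    "건축사사무소 신흥",
    "서광 종합건축사사무소",
    "건축사사무소 송정",
    "건축사사무소 대원창건사",
    "익성 종합건축사사무소",
]

# Split each name on whitespace and count the resulting tokens.
tokens = Counter(token for name in names for token in name.split())
total = sum(tokens.values())

for token, count in tokens.most_common(2):
    print(token, count, f"{100 * count / total:.1f}%")
```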

Most occurring characters

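Value | Count | Frequency (%)
(not rendered) | 167 | 19.7%
(not rendered) | 79 | 9.3%
(not rendered) | 77 | 9.1%
(not rendered) | 77 | 9.1%
(not rendered) | 76 | 9.0%
(space) | 41 | 4.8%
(not rendered) | 28 | 3.3%
(not rendered) | 20 | 2.4%
( | 15 | 1.8%
) | 15 | 1.8%
Other values (120) | 254 | 29.9%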

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 772 | 90.9%
Space Separator | 41 | 4.8%
Open Punctuation | 15 | 1.8%
Close Punctuation | 15 | 1.8%
Uppercase Letter | 4 | 0.5%
Decimal Number | 2 | 0.2%

Most frequent character per category

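Other Letter
Value | Count | Frequency (%)
(not rendered) | 167 | 21.6%
(not rendered) | 79 | 10.2%
(not rendered) | 77 | 10.0%
(not rendered) | 77 | 10.0%
(not rendered) | 76 | 9.8%
(not rendered) | 28 | 3.6%
(not rendered) | 20 | 2.6%
(not rendered) | 13 | 1.7%
(not rendered) | 13 | 1.7%
(not rendered) | 12 | 1.6%
Other values (112) | 210 | 27.2%

Uppercase Letter
Value | Count | Frequency (%)
D | 2 | 50.0%
K | 1 | 25.0%
A | 1 | 25.0%

Decimal Number
Value | Count | Frequency (%)
1 | 1 | 50.0%
2 | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 41 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 15 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 15 | 100.0%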

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 772 | 90.9%
Common | 73 | 8.6%
Latin | 4 | 0.5%

Most frequent character per script

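Hangul
Value | Count | Frequency (%)
(not rendered) | 167 | 21.6%
(not rendered) | 79 | 10.2%
(not rendered) | 77 | 10.0%
(not rendered) | 77 | 10.0%
(not rendered) | 76 | 9.8%
(not rendered) | 28 | 3.6%
(not rendered) | 20 | 2.6%
(not rendered) | 13 | 1.7%
(not rendered) | 13 | 1.7%
(not rendered) | 12 | 1.6%
Other values (112) | 210 | 27.2%

Common
Value | Count | Frequency (%)
(space) | 41 | 56.2%
( | 15 | 20.5%
) | 15 | 20.5%
1 | 1 | 1.4%
2 | 1 | 1.4%

Latin
Value | Count | Frequency (%)
D | 2 | 50.0%
K | 1 | 25.0%
A | 1 | 25.0%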

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 772 | 90.9%
ASCII | 77 | 9.1%

Most frequent character per block

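Hangul
Value | Count | Frequency (%)
(not rendered) | 167 | 21.6%
(not rendered) | 79 | 10.2%
(not rendered) | 77 | 10.0%
(not rendered) | 77 | 10.0%
(not rendered) | 76 | 9.8%
(not rendered) | 28 | 3.6%
(not rendered) | 20 | 2.6%
(not rendered) | 13 | 1.7%
(not rendered) | 13 | 1.7%
(not rendered) | 12 | 1.6%
Other values (112) | 210 | 27.2%

ASCII
Value | Count | Frequency (%)
(space) | 41 | 53.2%
( | 15 | 19.5%
) | 15 | 19.5%
D | 2 | 2.6%
1 | 1 | 1.3%
2 | 1 | 1.3%
K | 1 | 1.3%
A | 1 | 1.3%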

도로명주소
Text

Distinct: 69
Distinct (%): 90.8%
Missing: 0
Missing (%): 0.0%
Memory size: 740.0 B

Length

Max length: 58
Median length: 35
Mean length: 26.263158
Min length: 1

Characters and Unicode

Total characters: 1996
Distinct characters: 122
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 65
Unique (%): 85.5%

Sample

1st row: 서울특별시 동대문구 왕산로 53, 후암빌딩 4층
2nd row: 서울특별시 동대문구 왕산로 9
3rd row: 서울특별시 동대문구 왕산로 143, 삼화빌딩 611호
4th row: 서울특별시 동대문구 천호대로37길 11, 2층
5th row: 서울특별시 동대문구 왕산로 81, 두산베어스타워 8층 808호
Value | Count | Frequency (%)
서울특별시 | 71 | 18.1%
동대문구 | 71 | 18.1%
왕산로 | 11 | 2.8%
천호대로 | 8 | 2.0%
2층 | 7 | 1.8%
3층 | 6 | 1.5%
81 | 5 | 1.3%
4층 | 5 | 1.3%
고산자로28길 | 5 | 1.3%
전농로 | 3 | 0.8%
Other values (164) | 200 | 51.0%

Most occurring characters

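Value | Count | Frequency (%)
(space) | 335 | 16.8%
(not rendered) | 91 | 4.6%
(not rendered) | 89 | 4.5%
1 | 78 | 3.9%
, | 76 | 3.8%
(not rendered) | 75 | 3.8%
(not rendered) | 73 | 3.7%
(not rendered) | 71 | 3.6%
(not rendered) | 71 | 3.6%
(not rendered) | 71 | 3.6%
Other values (112) | 966 | 48.4%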

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1169 | 58.6%
Decimal Number | 386 | 19.3%
Space Separator | 335 | 16.8%
Other Punctuation | 76 | 3.8%
Dash Punctuation | 8 | 0.4%
Uppercase Letter | 8 | 0.4%
Open Punctuation | 7 | 0.4%
Close Punctuation | 7 | 0.4%

Most frequent character per category

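Other Letter
Value | Count | Frequency (%)
(not rendered) | 91 | 7.8%
(not rendered) | 89 | 7.6%
(not rendered) | 75 | 6.4%
(not rendered) | 73 | 6.2%
(not rendered) | 71 | 6.1%
(not rendered) | 71 | 6.1%
(not rendered) | 71 | 6.1%
(not rendered) | 71 | 6.1%
(not rendered) | 71 | 6.1%
(not rendered) | 71 | 6.1%
Other values (91) | 415 | 35.5%

Decimal Number
Value | Count | Frequency (%)
1 | 78 | 20.2%
2 | 60 | 15.5%
3 | 49 | 12.7%
0 | 45 | 11.7%
4 | 42 | 10.9%
8 | 38 | 9.8%
5 | 30 | 7.8%
9 | 16 | 4.1%
7 | 15 | 3.9%
6 | 13 | 3.4%

Uppercase Letter
Value | Count | Frequency (%)
B | 3 | 37.5%
H | 1 | 12.5%
S | 1 | 12.5%
K | 1 | 12.5%
Y | 1 | 12.5%
L | 1 | 12.5%

Space Separator
Value | Count | Frequency (%)
(space) | 335 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 76 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 8 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 7 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 7 | 100.0%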

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1169 | 58.6%
Common | 819 | 41.0%
Latin | 8 | 0.4%

Most frequent character per script

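Hangul
Value | Count | Frequency (%)
(not rendered) | 91 | 7.8%
(not rendered) | 89 | 7.6%
(not rendered) | 75 | 6.4%
(not rendered) | 73 | 6.2%
(not rendered) | 71 | 6.1%
(not rendered) | 71 | 6.1%
(not rendered) | 71 | 6.1%
(not rendered) | 71 | 6.1%
(not rendered) | 71 | 6.1%
(not rendered) | 71 | 6.1%
Other values (91) | 415 | 35.5%

Common
Value | Count | Frequency (%)
(space) | 335 | 40.9%
1 | 78 | 9.5%
, | 76 | 9.3%
2 | 60 | 7.3%
3 | 49 | 6.0%
0 | 45 | 5.5%
4 | 42 | 5.1%
8 | 38 | 4.6%
5 | 30 | 3.7%
9 | 16 | 2.0%
Other values (5) | 50 | 6.1%

Latin
Value | Count | Frequency (%)
B | 3 | 37.5%
H | 1 | 12.5%
S | 1 | 12.5%
K | 1 | 12.5%
Y | 1 | 12.5%
L | 1 | 12.5%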

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1169 | 58.6%
ASCII | 827 | 41.4%

Most frequent character per block

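ASCII
Value | Count | Frequency (%)
(space) | 335 | 40.5%
1 | 78 | 9.4%
, | 76 | 9.2%
2 | 60 | 7.3%
3 | 49 | 5.9%
0 | 45 | 5.4%
4 | 42 | 5.1%
8 | 38 | 4.6%
5 | 30 | 3.6%
9 | 16 | 1.9%
Other values (11) | 58 | 7.0%

Hangul
Value | Count | Frequency (%)
(not rendered) | 91 | 7.8%
(not rendered) | 89 | 7.6%
(not rendered) | 75 | 6.4%
(not rendered) | 73 | 6.2%
(not rendered) | 71 | 6.1%
(not rendered) | 71 | 6.1%
(not rendered) | 71 | 6.1%
(not rendered) | 71 | 6.1%
(not rendered) | 71 | 6.1%
(not rendered) | 71 | 6.1%
Other values (91) | 415 | 35.5%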

전화번호
Text

MISSING

Distinct: 42
Distinct (%): 95.5%
Missing: 32
Missing (%): 42.1%
Memory size: 740.0 B

Length

Max length: 13
Median length: 11
Mean length: 11.5
Min length: 11

Characters and Unicode

Total characters: 506
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 40
Unique (%): 90.9%

Sample

1st row: 02-924-5668
2nd row: 02-923-9292
3rd row: 02-3295-4600
4th row: 02-962-0542,3
5th row: 02-922-8811
Value | Count | Frequency (%)
02-969-2831 | 2 | 4.5%
02-2213-3253 | 2 | 4.5%
02-541-7846 | 1 | 2.3%
070-4140-1400 | 1 | 2.3%
02-924-5668 | 1 | 2.3%
02-6402-4968 | 1 | 2.3%
02-514-1204 | 1 | 2.3%
02-3461-2588 | 1 | 2.3%
02-587-8060 | 1 | 2.3%
02-964-9777 | 1 | 2.3%
Other values (32) | 32 | 72.7%

Most occurring characters

Value | Count | Frequency (%)
2 | 101 | 20.0%
- | 88 | 17.4%
0 | 68 | 13.4%
4 | 39 | 7.7%
9 | 38 | 7.5%
5 | 32 | 6.3%
1 | 31 | 6.1%
3 | 30 | 5.9%
6 | 28 | 5.5%
8 | 26 | 5.1%
Other values (3) | 25 | 4.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 416 | 82.2%
Dash Punctuation | 88 | 17.4%
Other Punctuation | 1 | 0.2%
Math Symbol | 1 | 0.2%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 101 | 24.3%
0 | 68 | 16.3%
4 | 39 | 9.4%
9 | 38 | 9.1%
5 | 32 | 7.7%
1 | 31 | 7.5%
3 | 30 | 7.2%
6 | 28 | 6.7%
8 | 26 | 6.2%
7 | 23 | 5.5%

Dash Punctuation
Value | Count | Frequency (%)
- | 88 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 1 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 1 | 100.0%
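
The single comma and single tilde correspond to the two irregular values visible in this report's sample rows, 02-962-0542,3 and 02-928-4484~6, which seem to abbreviate several line numbers in one field. A minimal sketch for flagging such values follows; the regular expression describes an assumed "regular" shape and is not taken from the dataset's documentation.

```python
import re

# Assumed regular shape: 0 plus one or two digits, then two hyphenated groups.
PHONE_RE = re.compile(r"0\d{1,2}-\d{3,4}-\d{4}")

# Values copied from the sample rows and frequency table of this report.
samples = ["02-924-5668", "070-4140-1400", "02-962-0542,3", "02-928-4484~6"]

irregular = [value for value in samples if not PHONE_RE.fullmatch(value)]

for value in samples:
    print(value, "irregular" if value in irregular else "ok")
```

Flagged values would need manual splitting before the column can be treated as one phone number per row.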

Most occurring scripts

Value | Count | Frequency (%)
Common | 506 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
2 | 101 | 20.0%
- | 88 | 17.4%
0 | 68 | 13.4%
4 | 39 | 7.7%
9 | 38 | 7.5%
5 | 32 | 6.3%
1 | 31 | 6.1%
3 | 30 | 5.9%
6 | 28 | 5.5%
8 | 26 | 5.1%
Other values (3) | 25 | 4.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 506 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
2 | 101 | 20.0%
- | 88 | 17.4%
0 | 68 | 13.4%
4 | 39 | 7.7%
9 | 38 | 7.5%
5 | 32 | 6.3%
1 | 31 | 6.1%
3 | 30 | 5.9%
6 | 28 | 5.5%
8 | 26 | 5.1%
Other values (3) | 25 | 4.9%

Interactions


Correlations

Variable | 연번 | 사무소명 | 도로명주소 | 전화번호
연번 | 1.000 | 1.000 | 0.945 | 0.000
사무소명 | 1.000 | 1.000 | 1.000 | 1.000
도로명주소 | 0.945 | 1.000 | 1.000 | 0.959
전화번호 | 0.000 | 1.000 | 0.959 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
Index | 연번 | 사무소명 | 도로명주소 | 전화번호
0 | 1 | 건축사사무소 신흥 | 서울특별시 동대문구 왕산로 53, 후암빌딩 4층 | 02-924-5668
1 | 2 | 서광 종합건축사사무소 | 서울특별시 동대문구 왕산로 9 | 02-923-9292
2 | 3 | 건축사사무소 송정 | 서울특별시 동대문구 왕산로 143, 삼화빌딩 611호 | 02-3295-4600
3 | 4 | 건축사사무소 대원창건사 | 서울특별시 동대문구 천호대로37길 11, 2층 | 02-962-0542,3
4 | 5 | 익성 종합건축사사무소 | 서울특별시 동대문구 왕산로 81, 두산베어스타워 8층 808호 | 02-922-8811
5 | 6 | 구상 건축사사무소 | 서울특별시 동대문구 고산자로 430, 경동플라자 418호 | 02-924-6003
6 | 7 | 상명 건축사사무소 | 서울특별시 동대문구 왕산로32길 99 | 02-960-2288
7 | 8 | (주) 인선 건축사사무소 | 서울특별시 동대문구 고산자로 406, 501호 (용두동, 화성빌딩) | 02-967-5177
8 | 9 | (주) 기하건축사사무소 | 서울특별시 동대문구 왕산로 81, 808(용두동, 두산베어스타워) | 02-953-2225
9 | 10 | 건축사사무소 하늘 | 서울특별시 동대문구 무학로 120, 5층 | 02-928-4484~6

Last rows
Index | 연번 | 사무소명 | 도로명주소 | 전화번호
66 | 67 | 건축사사무소 아텍 | 서울특별시 동대문구 고산자로28길 39, 경진빌딩 2층 | 02-969-2831
67 | 68 | 건축사사무소 토끼발 | 서울특별시 동대문구 고산자로 469, BH빌딩 5층 | <NA>
68 | 69 | 이와건축사사무소 | 서울특별시 동대문구 안암로 86-1, 503호 | <NA>
69 | 70 | 주식회사 올니즈건축사사무소 | 서울특별시 동대문구 천호대로83길 93, 동우빌딩 503호 | <NA>
70 | 71 | 주식회사 이마건축사사무소 | 서울특별시 동대문구 신이문로 32, 3층 | <NA>
71 | 72 | 주식회사 이마건축사사무소 | 서울특별시 동대문구 신이문로 32, 3층 | <NA>
72 | 73 | 주식회사 건축사사무소명림 | 서울특별시 동대문구 고산자로34길 70, 3층 307호 | <NA>
73 | 74 | 만들다건축사사무소 | 서울특별시 동대문구 답십리로56길 84, 101호 | <NA>
74 | 75 | 주식회사지음과세움종합건축사사무소 | 서울특별시 동대문구 천호대로85길 10, 4층 408호(장안동, 성동빌딩) | 02-2213-3253
75 | 76 | 담빛건축사사무소 | 서울특별시 동대문구 왕산로 200, 14층 1426호 청량리역 롯데캐슬SKY-L65 섹션오피스, 전농동 | 02-2135-7321
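
The report counts 0 duplicate rows, yet 연번 71 and 72 in the last rows carry the same 사무소명 and 도로명주소. Because 연번 is unique, no two full rows can match. The sketch below shows the difference on three rows copied from the sample; the tuple layout is an illustrative choice.

```python
from collections import Counter

# (연번, 사무소명, 도로명주소) for three of the last rows shown in the sample.
rows = [
    (70, "주식회사 올니즈건축사사무소", "서울특별시 동대문구 천호대로83길 93, 동우빌딩 503호"),
    (71, "주식회사 이마건축사사무소", "서울특별시 동대문구 신이문로 32, 3층"),
    (72, "주식회사 이마건축사사무소", "서울특별시 동대문구 신이문로 32, 3층"),
]

# Exact duplicates over all fields: none, since 연번 differs on every row.
full_duplicates = len(rows) - len(set(rows))

# Ignoring the serial number, the same office appears twice.
keys = Counter((name, address) for _, name, address in rows)
repeated = [key for key, count in keys.items() if count > 1]

print(full_duplicates, repeated)
```

Checking for duplicates on the descriptive columns only is therefore a more useful test for this dataset than the whole-row count.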