Overview

Dataset statistics

Number of variables: 7
Number of observations: 67
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.0 KiB
Average record size in memory: 61.0 B

Variable types

Text: 3
Categorical: 2
Numeric: 2

Dataset

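Description: 관리번호 (management number), 자치구 (district), 등록년도 (registration year), 공사명 (construction project name), 공사위치 (construction site location), X좌표 (X coordinate), Y좌표 (Y coordinate)
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-21107/S/1/datasetView.do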

Alerts

등록년도 has constant value "" (Constant)
X좌표 is highly overall correlated with 자치구 (High correlation)
Y좌표 is highly overall correlated with 자치구 (High correlation)
자치구 is highly overall correlated with X좌표 and 1 other field, Y좌표 (High correlation)
관리번호 has unique values (Unique)
공사명 has unique values (Unique)
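
The Constant and Unique alerts come down to distinct-value counts: a column is constant when it holds a single distinct value, and unique when every value is distinct. The sketch below reproduces that check in plain Python on the first five rows listed in this report's samples. The helper name `classify_column` is hypothetical, and the profiler's own implementation may differ in detail.

```python
def classify_column(values):
    """Return 'Constant', 'Unique', or None based on distinct-value counts."""
    distinct = len(set(values))
    if distinct == 1:
        return "Constant"
    if distinct == len(values):
        return "Unique"
    return None


# First five rows of three columns, copied from the report's samples.
sample = {
    "관리번호": ["2017_0002", "2017_0003", "2017_0004", "2017_0005", "2017_0006"],
    "등록년도": ["2017"] * 5,
    "자치구": ["강남구", "동대문구", "동대문구", "강남구", "송파구"],
}

alerts = {name: classify_column(column) for name, column in sample.items()}
for name, alert in alerts.items():
    print(name, alert)
```

On this small sample, 자치구 triggers neither alert because it repeats values without being constant.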

Reproduction

Analysis started: 2023-12-11 06:25:46.889717
Analysis finished: 2023-12-11 06:25:47.884986
Duration: 1 second
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

관리번호
Text

UNIQUE 

Distinct: 67
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 668.0 B

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 603
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
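
As an illustration of those character properties, the following sketch uses Python's standard `unicodedata` module to count general categories over the first five 관리번호 values. The mapping from category codes to the names used in this report (`Nd` to "Decimal Number", `Pc` to "Connector Punctuation") follows the Unicode Standard; the variable names are illustrative only.

```python
import unicodedata
from collections import Counter

# First five identifiers from the Sample section of this variable.
ids = ["2017_0002", "2017_0003", "2017_0004", "2017_0005", "2017_0006"]

# Unicode general category codes mapped to the names shown in the report.
CATEGORY_NAMES = {"Nd": "Decimal Number", "Pc": "Connector Punctuation"}

category_counts = Counter(
    unicodedata.category(ch) for value in ids for ch in value
)
distinct_chars = set("".join(ids))

for code, count in category_counts.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
print("distinct characters:", len(distinct_chars))
```

Each identifier contributes eight digits and one underscore, which is why only two categories appear for this variable.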

Unique

Unique: 67
Unique (%): 100.0%

Sample

1st row: 2017_0002
2nd row: 2017_0003
3rd row: 2017_0004
4th row: 2017_0005
5th row: 2017_0006

Value | Count | Frequency (%)
2017_0002 | 1 | 1.5%
2017_0036 | 1 | 1.5%
2017_0038 | 1 | 1.5%
2017_0039 | 1 | 1.5%
2017_0040 | 1 | 1.5%
2017_0041 | 1 | 1.5%
2017_0042 | 1 | 1.5%
2017_0043 | 1 | 1.5%
2017_0044 | 1 | 1.5%
2017_0045 | 1 | 1.5%
Other values (57) | 57 | 85.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 215 | 35.7%
2 | 84 | 13.9%
1 | 83 | 13.8%
7 | 74 | 12.3%
_ | 67 | 11.1%
4 | 17 | 2.8%
5 | 17 | 2.8%
3 | 17 | 2.8%
6 | 16 | 2.7%
8 | 7 | 1.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 536 | 88.9%
Connector Punctuation | 67 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 215 | 40.1%
2 | 84 | 15.7%
1 | 83 | 15.5%
7 | 74 | 13.8%
4 | 17 | 3.2%
5 | 17 | 3.2%
3 | 17 | 3.2%
6 | 16 | 3.0%
8 | 7 | 1.3%
9 | 6 | 1.1%

Connector Punctuation
Value | Count | Frequency (%)
_ | 67 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 603 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 215 | 35.7%
2 | 84 | 13.9%
1 | 83 | 13.8%
7 | 74 | 12.3%
_ | 67 | 11.1%
4 | 17 | 2.8%
5 | 17 | 2.8%
3 | 17 | 2.8%
6 | 16 | 2.7%
8 | 7 | 1.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 603 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 215 | 35.7%
2 | 84 | 13.9%
1 | 83 | 13.8%
7 | 74 | 12.3%
_ | 67 | 11.1%
4 | 17 | 2.8%
5 | 17 | 2.8%
3 | 17 | 2.8%
6 | 16 | 2.7%
8 | 7 | 1.2%

자치구
Categorical

HIGH CORRELATION 

Distinct: 19
Distinct (%): 28.4%
Missing: 0
Missing (%): 0.0%
Memory size: 668.0 B

Length

Max length: 4
Median length: 3
Mean length: 2.9104478
Min length: 2

Unique

Unique: 5
Unique (%): 7.5%
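
Distinct and Unique measure different things: Distinct counts the different values present, while Unique counts only the values that occur exactly once. The sketch below shows the difference on the first five 자치구 values from this report and recomputes the two percentages from the reported counts (19 and 5 out of 67 observations). Variable names are illustrative, and this is a plain-Python approximation rather than the profiler's own code.

```python
from collections import Counter

# First five district values from the Sample section of this variable.
districts = ["강남구", "동대문구", "동대문구", "강남구", "송파구"]

counts = Counter(districts)
n_distinct = len(counts)                                   # different values present
n_unique = sum(1 for c in counts.values() if c == 1)       # values seen exactly once

# Percentages recomputed from the counts reported for the full column.
n_observations = 67
distinct_pct = round(100 * 19 / n_observations, 1)
unique_pct = round(100 * 5 / n_observations, 1)

print(n_distinct, n_unique, distinct_pct, unique_pct)
```

In the five-row sample, only 송파구 occurs once, so the sample has three distinct values but one unique value.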

Sample

1st row: 강남구
2nd row: 동대문구
3rd row: 동대문구
4th row: 강남구
5th row: 송파구

Common Values

Value | Count | Frequency (%)
중구 | 13 | 19.4%
강서구 | 9 | 13.4%
은평구 | 7 | 10.4%
강남구 | 5 | 7.5%
송파구 | 5 | 7.5%
구로구 | 4 | 6.0%
성동구 | 3 | 4.5%
용산구 | 3 | 4.5%
동대문구 | 3 | 4.5%
종로구 | 2 | 3.0%
Other values (9) | 13 | 19.4%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
중구 | 13 | 19.4%
강서구 | 9 | 13.4%
은평구 | 7 | 10.4%
강남구 | 5 | 7.5%
송파구 | 5 | 7.5%
구로구 | 4 | 6.0%
성동구 | 3 | 4.5%
용산구 | 3 | 4.5%
동대문구 | 3 | 4.5%
영등포구 | 2 | 3.0%
Other values (9) | 13 | 19.4%

등록년도
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 668.0 B

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2017
2nd row: 2017
3rd row: 2017
4th row: 2017
5th row: 2017

Common Values

Value | Count | Frequency (%)
2017 | 67 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2017 | 67 | 100.0%

공사명
Text

UNIQUE 

Distinct: 67
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 668.0 B

Length

Max length: 32
Median length: 25
Mean length: 17.402985
Min length: 7

Characters and Unicode

Total characters: 1166
Distinct characters: 224
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 67
Unique (%): 100.0%

Sample

1st row: 청담동 C빌딩 신축공사현장
2nd row: 서울시립대학교 시민문화 교육관 건립공사
3rd row: 답십리 대농 신안 주택재건축 아파트 신축공사 ()
4th row: 삼성동 113-5 관광호텔 신축공사
5th row: 서울시 송파구 방이동 50 공사현장

Value | Count | Frequency (%)
신축공사 | 42 | 17.5%
아파트 | 6 | 2.5%
정비사업 | 4 | 1.7%
주택재건축 | 3 | 1.2%
지식산업센터 | 3 | 1.2%
도시환경정비사업 | 3 | 1.2%
건설공사 | 3 | 1.2%
서울시 | 3 | 1.2%
항동3단지 | 2 | 0.8%
은평구 | 2 | 0.8%
Other values (157) | 169 | 70.4%
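
The word counts in this table sum to 240, which equals the 173 space characters plus the 67 rows reported for this variable, so they are consistent with splitting each name on whitespace. The sketch below applies that assumed tokenization to the five sample names; the profiler's actual tokenizer is not shown in this report and may treat punctuation differently.

```python
from collections import Counter

# The five sample construction names listed for this variable.
names = [
    "청담동 C빌딩 신축공사현장",
    "서울시립대학교 시민문화 교육관 건립공사",
    "답십리 대농 신안 주택재건축 아파트 신축공사 ()",
    "삼성동 113-5 관광호텔 신축공사",
    "서울시 송파구 방이동 50 공사현장",
]

# Assumed tokenization: split on whitespace, no further normalization.
word_counts = Counter(word for name in names for word in name.split())
total_words = sum(word_counts.values())

for word, count in word_counts.most_common(3):
    print(word, count, f"{100 * count / total_words:.1f}%")
```

Under this tokenization 신축공사현장 is a separate token from 신축공사, so only the third and fourth names contribute to the 신축공사 count.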

Most occurring characters

Value | Count | Frequency (%)
(space) | 173 | 14.8%
 | 66 | 5.7%
 | 58 | 5.0%
 | 50 | 4.3%
 | 49 | 4.2%
1 | 26 | 2.2%
 | 20 | 1.7%
 | 20 | 1.7%
 | 17 | 1.5%
 | 16 | 1.4%
Other values (214) | 671 | 57.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 877 | 75.2%
Space Separator | 173 | 14.8%
Decimal Number | 65 | 5.6%
Uppercase Letter | 27 | 2.3%
Dash Punctuation | 9 | 0.8%
Other Punctuation | 5 | 0.4%
Lowercase Letter | 4 | 0.3%
Open Punctuation | 3 | 0.3%
Close Punctuation | 3 | 0.3%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 66 | 7.5%
 | 58 | 6.6%
 | 50 | 5.7%
 | 49 | 5.6%
 | 20 | 2.3%
 | 20 | 2.3%
 | 17 | 1.9%
 | 16 | 1.8%
 | 16 | 1.8%
 | 16 | 1.8%
Other values (180) | 549 | 62.6%

Uppercase Letter
Value | Count | Frequency (%)
T | 4 | 14.8%
L | 3 | 11.1%
B | 3 | 11.1%
K | 3 | 11.1%
C | 2 | 7.4%
S | 2 | 7.4%
Y | 1 | 3.7%
O | 1 | 3.7%
W | 1 | 3.7%
E | 1 | 3.7%
Other values (6) | 6 | 22.2%

Decimal Number
Value | Count | Frequency (%)
1 | 26 | 40.0%
2 | 11 | 16.9%
3 | 8 | 12.3%
9 | 4 | 6.2%
4 | 4 | 6.2%
0 | 4 | 6.2%
6 | 3 | 4.6%
5 | 3 | 4.6%
8 | 2 | 3.1%

Lowercase Letter
Value | Count | Frequency (%)
c | 2 | 50.0%
r | 1 | 25.0%
a | 1 | 25.0%

Other Punctuation
Value | Count | Frequency (%)
, | 3 | 60.0%
. | 2 | 40.0%

Space Separator
Value | Count | Frequency (%)
(space) | 173 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 9 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 3 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 877 | 75.2%
Common | 258 | 22.1%
Latin | 31 | 2.7%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 66 | 7.5%
 | 58 | 6.6%
 | 50 | 5.7%
 | 49 | 5.6%
 | 20 | 2.3%
 | 20 | 2.3%
 | 17 | 1.9%
 | 16 | 1.8%
 | 16 | 1.8%
 | 16 | 1.8%
Other values (180) | 549 | 62.6%

Latin
Value | Count | Frequency (%)
T | 4 | 12.9%
L | 3 | 9.7%
B | 3 | 9.7%
K | 3 | 9.7%
C | 2 | 6.5%
S | 2 | 6.5%
c | 2 | 6.5%
Y | 1 | 3.2%
O | 1 | 3.2%
W | 1 | 3.2%
Other values (9) | 9 | 29.0%

Common
Value | Count | Frequency (%)
(space) | 173 | 67.1%
1 | 26 | 10.1%
2 | 11 | 4.3%
- | 9 | 3.5%
3 | 8 | 3.1%
9 | 4 | 1.6%
4 | 4 | 1.6%
0 | 4 | 1.6%
6 | 3 | 1.2%
, | 3 | 1.2%
Other values (5) | 13 | 5.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 877 | 75.2%
ASCII | 289 | 24.8%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 173 | 59.9%
1 | 26 | 9.0%
2 | 11 | 3.8%
- | 9 | 3.1%
3 | 8 | 2.8%
9 | 4 | 1.4%
T | 4 | 1.4%
4 | 4 | 1.4%
0 | 4 | 1.4%
6 | 3 | 1.0%
Other values (24) | 43 | 14.9%

Hangul
Value | Count | Frequency (%)
 | 66 | 7.5%
 | 58 | 6.6%
 | 50 | 5.7%
 | 49 | 5.6%
 | 20 | 2.3%
 | 20 | 2.3%
 | 17 | 1.9%
 | 16 | 1.8%
 | 16 | 1.8%
 | 16 | 1.8%
Other values (180) | 549 | 62.6%

공사위치
Text

Distinct: 66
Distinct (%): 98.5%
Missing: 0
Missing (%): 0.0%
Memory size: 668.0 B

Length

Max length: 24
Median length: 21
Mean length: 17.059701
Min length: 11

Characters and Unicode

Total characters: 1143
Distinct characters: 105
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 65
Unique (%): 97.0%

Sample

1st row: 서울시 강남구 청담동
2nd row: 서울시 동대문구 서울시립대로 163
3rd row: 서울시 동대문구 답십리동 465-69
4th row: 서울시 강남구 삼성동 113-5
5th row: 서울시 송파구 방이동 50

Value | Count | Frequency (%)
서울시 | 67 | 25.1%
중구 | 13 | 4.9%
강서구 | 9 | 3.4%
은평구 | 7 | 2.6%
강남구 | 5 | 1.9%
송파구 | 5 | 1.9%
마곡동 | 4 | 1.5%
구로구 | 4 | 1.5%
한강로3가 | 3 | 1.1%
성동구 | 3 | 1.1%
Other values (126) | 147 | 55.1%

Most occurring characters

Value | Count | Frequency (%)
(space) | 200 | 17.5%
 | 80 | 7.0%
 | 71 | 6.2%
 | 69 | 6.0%
 | 68 | 5.9%
 | 63 | 5.5%
1 | 57 | 5.0%
- | 47 | 4.1%
2 | 38 | 3.3%
6 | 27 | 2.4%
Other values (95) | 423 | 37.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 636 | 55.6%
Decimal Number | 260 | 22.7%
Space Separator | 200 | 17.5%
Dash Punctuation | 47 | 4.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 80 | 12.6%
 | 71 | 11.2%
 | 69 | 10.8%
 | 68 | 10.7%
 | 63 | 9.9%
 | 19 | 3.0%
 | 19 | 3.0%
 | 15 | 2.4%
 | 14 | 2.2%
 | 10 | 1.6%
Other values (83) | 208 | 32.7%

Decimal Number
Value | Count | Frequency (%)
1 | 57 | 21.9%
2 | 38 | 14.6%
6 | 27 | 10.4%
3 | 27 | 10.4%
5 | 25 | 9.6%
8 | 20 | 7.7%
0 | 18 | 6.9%
4 | 17 | 6.5%
7 | 16 | 6.2%
9 | 15 | 5.8%

Space Separator
Value | Count | Frequency (%)
(space) | 200 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 47 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 636 | 55.6%
Common | 507 | 44.4%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 80 | 12.6%
 | 71 | 11.2%
 | 69 | 10.8%
 | 68 | 10.7%
 | 63 | 9.9%
 | 19 | 3.0%
 | 19 | 3.0%
 | 15 | 2.4%
 | 14 | 2.2%
 | 10 | 1.6%
Other values (83) | 208 | 32.7%

Common
Value | Count | Frequency (%)
(space) | 200 | 39.4%
1 | 57 | 11.2%
- | 47 | 9.3%
2 | 38 | 7.5%
6 | 27 | 5.3%
3 | 27 | 5.3%
5 | 25 | 4.9%
8 | 20 | 3.9%
0 | 18 | 3.6%
4 | 17 | 3.4%
Other values (2) | 31 | 6.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 636 | 55.6%
ASCII | 507 | 44.4%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 200 | 39.4%
1 | 57 | 11.2%
- | 47 | 9.3%
2 | 38 | 7.5%
6 | 27 | 5.3%
3 | 27 | 5.3%
5 | 25 | 4.9%
8 | 20 | 3.9%
0 | 18 | 3.6%
4 | 17 | 3.4%
Other values (2) | 31 | 6.1%

Hangul
Value | Count | Frequency (%)
 | 80 | 12.6%
 | 71 | 11.2%
 | 69 | 10.8%
 | 68 | 10.7%
 | 63 | 9.9%
 | 19 | 3.0%
 | 19 | 3.0%
 | 15 | 2.4%
 | 14 | 2.2%
 | 10 | 1.6%
Other values (83) | 208 | 32.7%

X좌표
Real number (ℝ)

HIGH CORRELATION 

Distinct: 66
Distinct (%): 98.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 197444.32
Minimum: 184156
Maximum: 215623.6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 735.0 B

Quantile statistics

Minimum: 184156
5-th percentile: 184595.88
Q1: 191883
Median: 198176.8
Q3: 204074.4
95-th percentile: 209518.52
Maximum: 215623.6
Range: 31467.6
Interquartile range (IQR): 12191.4

Descriptive statistics

Standard deviation: 8166.3115
Coefficient of variation (CV): 0.041360074
Kurtosis: -0.91962594
Mean: 197444.32
Median Absolute Deviation (MAD): 6024.0001
Skewness: -0.11459619
Sum: 13228769
Variance: 66688643
Monotonicity: Not monotonic
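
Several of the figures reported for X좌표 are derived from others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. The sketch below recomputes them from the rounded values printed in this report, so small rounding differences are expected; the variable names are illustrative.

```python
import math

# Rounded values as printed in this report for X좌표.
n = 67
minimum, maximum = 184156.0, 215623.6
q1, q3 = 191883.0, 204074.4
mean, std = 197444.32, 8166.3115

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance
total = mean * n                  # Sum

print(f"range={value_range:.1f} iqr={iqr:.1f} cv={cv:.9f}")
print(f"variance={variance:.0f} sum={total:.0f}")
print(math.isclose(cv, 0.041360074, rel_tol=1e-6))
```

The same relationships can be checked against the Y좌표 figures by substituting that variable's reported values.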
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
184264.000041 | 2 | 3.0%
203850.4 | 1 | 1.5%
191973.999999 | 1 | 1.5%
194256.800036 | 1 | 1.5%
194271.600038 | 1 | 1.5%
193103.199975 | 1 | 1.5%
194345.200033 | 1 | 1.5%
197141.199993 | 1 | 1.5%
186897.199988 | 1 | 1.5%
185158.800042 | 1 | 1.5%
Other values (56) | 56 | 83.6%

Minimum 10 values

Value | Count | Frequency (%)
184155.999958 | 1 | 1.5%
184264.000041 | 2 | 3.0%
184586.400034 | 1 | 1.5%
184617.999985 | 1 | 1.5%
184750.800001 | 1 | 1.5%
185052.400025 | 1 | 1.5%
185140.000011 | 1 | 1.5%
185158.800042 | 1 | 1.5%
185315.599977 | 1 | 1.5%
186779.600016 | 1 | 1.5%

Maximum 10 values

Value | Count | Frequency (%)
215623.600033 | 1 | 1.5%
210131.600022 | 1 | 1.5%
210130.80001 | 1 | 1.5%
209538.46 | 1 | 1.5%
209472.000039 | 1 | 1.5%
209252.000024 | 1 | 1.5%
206880.000033 | 1 | 1.5%
206833.199966 | 1 | 1.5%
206265.999975 | 1 | 1.5%
205857.200016 | 1 | 1.5%

Y좌표
Real number (ℝ)

HIGH CORRELATION 

Distinct: 66
Distinct (%): 98.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 450116.82
Minimum: 439392.8
Maximum: 461328
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 735.0 B

Quantile statistics

Minimum: 439392.8
5-th percentile: 442421.52
Q1: 446679.2
Median: 451052.4
Q3: 452447.4
95-th percentile: 457450.64
Maximum: 461328
Range: 21935.2
Interquartile range (IQR): 5768.2

Descriptive statistics

Standard deviation: 4652.1251
Coefficient of variation (CV): 0.010335373
Kurtosis: -0.1855047
Mean: 450116.82
Median Absolute Deviation (MAD): 3567.2
Skewness: -0.023257347
Sum: 30157827
Variance: 21642268
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
442128.000004 | 2 | 3.0%
447213.0 | 1 | 1.5%
455397.999995 | 1 | 1.5%
456648.799994 | 1 | 1.5%
456309.199997 | 1 | 1.5%
454619.599995 | 1 | 1.5%
455802.0 | 1 | 1.5%
447378.800002 | 1 | 1.5%
450511.199996 | 1 | 1.5%
452932.399996 | 1 | 1.5%
Other values (56) | 56 | 83.6%

Minimum 10 values

Value | Count | Frequency (%)
439392.800004 | 1 | 1.5%
442128.000004 | 2 | 3.0%
442406.400002 | 1 | 1.5%
442456.799998 | 1 | 1.5%
442490.800004 | 1 | 1.5%
442945.999998 | 1 | 1.5%
443536.400003 | 1 | 1.5%
445028.8 | 1 | 1.5%
445106.75 | 1 | 1.5%
445280.399999 | 1 | 1.5%

Maximum 10 values

Value | Count | Frequency (%)
461328.000001 | 1 | 1.5%
459479.200005 | 1 | 1.5%
459295.199995 | 1 | 1.5%
457749.200005 | 1 | 1.5%
456754.000001 | 1 | 1.5%
456648.799994 | 1 | 1.5%
456309.199997 | 1 | 1.5%
455802.0 | 1 | 1.5%
455397.999995 | 1 | 1.5%
454886.399995 | 1 | 1.5%

Interactions

[Figures: four interaction plots]

Correlations

Variable | 관리번호 | 자치구 | 공사명 | 공사위치 | X좌표 | Y좌표
관리번호 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
자치구 | 1.000 | 1.000 | 1.000 | 1.000 | 0.960 | 0.942
공사명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
공사위치 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
X좌표 | 1.000 | 0.960 | 1.000 | 1.000 | 1.000 | 0.648
Y좌표 | 1.000 | 0.942 | 1.000 | 1.000 | 0.648 | 1.000

Variable | X좌표 | Y좌표 | 자치구
X좌표 | 1.000 | 0.003 | 0.735
Y좌표 | 0.003 | 1.000 | 0.664
자치구 | 0.735 | 0.664 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 관리번호 | 자치구 | 등록년도 | 공사명 | 공사위치 | X좌표 | Y좌표
0 | 2017_0002 | 강남구 | 2017 | 청담동 C빌딩 신축공사현장 | 서울시 강남구 청담동 | 203850.4 | 447213.0
1 | 2017_0003 | 동대문구 | 2017 | 서울시립대학교 시민문화 교육관 건립공사 | 서울시 동대문구 서울시립대로 163 | 205298.8 | 453670.8
2 | 2017_0004 | 동대문구 | 2017 | 답십리 대농 신안 주택재건축 아파트 신축공사 () | 서울시 동대문구 답십리동 465-69 | 204200.80004 | 452412.400005
3 | 2017_0005 | 강남구 | 2017 | 삼성동 113-5 관광호텔 신축공사 | 서울시 강남구 삼성동 113-5 | 204147.199982 | 445705.599997
4 | 2017_0006 | 송파구 | 2017 | 서울시 송파구 방이동 50 공사현장 | 서울시 송파구 방이동 50 | 210131.600022 | 446288.799999
5 | 2017_0007 | 강남구 | 2017 | 오닉스타워 신축공사 | 서울시 강남구 역삼동 653-2 | 203365.999972 | 445423.200005
6 | 2017_0008 | 성동구 | 2017 | 성수별관 신축공사 | 서울시 성동구 성수동 2가 273-61 | 205017.53 | 448891.02
7 | 2017_0009 | 송파구 | 2017 | 문정헤리움 써밋타워 신축공사 | 서울시 송파구 석촌동 286-3 | 209472.000039 | 445028.8
8 | 2017_0010 | 송파구 | 2017 | MK타워 신축공사 | 서울시 송파구 문정동 490 | 210130.80001 | 442945.999998
9 | 2017_0011 | 성동구 | 2017 | 성수1가 지식산업센터 신축공사 | 서울시 성동구 성수동 성수1가 656-439 | 204001.600038 | 449692.399995

Last rows

Index | 관리번호 | 자치구 | 등록년도 | 공사명 | 공사위치 | X좌표 | Y좌표
57 | 2017_0059 | 금천구 | 2017 | 시흥동 883-11호 외 3필지 주상복합 신축공사 | 서울시 금천구 시흥동 883-11 | 191273.19999 | 439392.800004
58 | 2017_0060 | 구로구 | 2017 | 항동3단지 도생구간 | 서울시 구로구 항동 183-1 | 184264.000041 | 442128.000004
59 | 2017_0061 | 구로구 | 2017 | 항동3단지 아파트구간 | 서울시 구로구 항동 183-1 | 184264.000041 | 442128.000004
60 | 2017_0062 | 광진구 | 2017 | 건대 동도 센트라움 캠퍼스파크 신축공사 | 서울시 광진구 화양동 116-2 | 206265.999975 | 449791.599995
61 | 2017_0063 | 서대문구 | 2017 | 연희 제1주택 재건축 정비사업 | 서울시 서대문구 연희동 711번지 | 193663.599981 | 452918.800005
62 | 2017_0064 | 노원구 | 2017 | 중계동 SKY TOWER 신축공사 | 서울시 노원구 중계동 366-9 | 206833.199966 | 461328.000001
63 | 2017_0065 | 송파구 | 2017 | 석촌호수 주변 | 서울시 송파구 신천동 30 | 209252.000024 | 445893.200003
64 | 2017_0066 | 동대문구 | 2017 | 휘경제2구역 주택재개발정비사업 아파트 신축공사 | 서울시 동대문구 휘경동 128-12 | 205857.200016 | 454886.399995
65 | 2017_0067 | 금천구 | 2017 | 가산동 오피스텔 신축공사 | 서울시 금천구 가산동 535-53외 2필지 | 188997.999962 | 442490.800004
66 | 2017_0068 | 구로구 | 2017 | 서울 항동 공공주택지구 1BL 중흥S-클래스신축공사 | 서울시 구로구 항동100-5 | 184155.999958 | 442406.400002