Overview

Dataset statistics

Number of variables: 7
Number of observations: 37
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.3 KiB
Average record size in memory: 62.6 B

Variable types

Text: 3
Categorical: 2
Numeric: 2

Dataset

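Description: 관리번호 (management number), 자치구 (district), 등록년도 (registration year), 공사명 (construction project name), 공사위치 (construction site location), X좌표 (X coordinate), Y좌표 (Y coordinate)
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-21106/S/1/datasetView.do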

Alerts

등록년도 has constant value "2016" (Constant)
X좌표 is highly overall correlated with 자치구 (High correlation)
Y좌표 is highly overall correlated with 자치구 (High correlation)
자치구 is highly overall correlated with X좌표 and Y좌표 (High correlation)
관리번호 has unique values (Unique)
공사명 has unique values (Unique)

Reproduction

Analysis started: 2023-12-11 07:32:40.525845
Analysis finished: 2023-12-11 07:32:41.564073
Duration: 1.04 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
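
A report of this kind can be regenerated with a few lines of Python. The sketch below is a minimal outline, not the configuration actually used for this report: the helper name `build_profile_report`, the CSV input, and both file paths are assumptions, and the settings stored in config.json are not reproduced here.

```python
def build_profile_report(csv_path, html_path, title="Dataset profile"):
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so this module can be loaded even when the
    # third-party packages (pandas, ydata-profiling) are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the dataset was exported from the portal as a CSV file.
    df = pd.read_csv(csv_path)

    # Build the report with default settings and save it as HTML.
    report = ProfileReport(df, title=title)
    report.to_file(html_path)
    return report
```

Calling `build_profile_report("construction_sites.csv", "report.html")` (both file names are placeholders) would write the report next to the script. Depending on how the CSV was exported, an explicit `encoding` argument to `pd.read_csv` may be needed.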

Variables

관리번호
Text

UNIQUE 

Distinct: 37
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 428.0 B

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 333
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
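
As an illustration of how such character properties can be read, the following sketch counts Unicode general categories with Python's standard `unicodedata` module. It uses only the five sample values listed for this variable, and the helper name `category_counts` is an assumption made for this example rather than part of the profiling tool.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count the Unicode general category of every character in the strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            # unicodedata.category returns a two-letter code,
            # e.g. "Nd" (Decimal Number) or "Pc" (Connector Punctuation).
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample rows shown for 관리번호 in this report.
sample = ["2016_0002", "2016_0005", "2016_0011", "2016_0012", "2016_0014"]
counts = category_counts(sample)
print(counts)
```

Each value contributes eight digits and one underscore, which matches the two categories reported for this variable (Decimal Number and Connector Punctuation).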

Unique

Unique: 37
Unique (%): 100.0%

Sample

1st row: 2016_0002
2nd row: 2016_0005
3rd row: 2016_0011
4th row: 2016_0012
5th row: 2016_0014

Value | Count | Frequency (%)
2016_0002 | 1 | 2.7%
2016_0056 | 1 | 2.7%
2016_0058 | 1 | 2.7%
2016_0059 | 1 | 2.7%
2016_0068 | 1 | 2.7%
2016_0072 | 1 | 2.7%
2016_0076 | 1 | 2.7%
2016_0083 | 1 | 2.7%
2016_0097 | 1 | 2.7%
2016_0098 | 1 | 2.7%
Other values (27) | 27 | 73.0%

Most occurring characters

Value | Count | Frequency (%)
0 | 110 | 33.0%
1 | 52 | 15.6%
2 | 47 | 14.1%
6 | 42 | 12.6%
_ | 37 | 11.1%
5 | 10 | 3.0%
4 | 9 | 2.7%
3 | 8 | 2.4%
8 | 8 | 2.4%
9 | 5 | 1.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 296 | 88.9%
Connector Punctuation | 37 | 11.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 110 | 37.2%
1 | 52 | 17.6%
2 | 47 | 15.9%
6 | 42 | 14.2%
5 | 10 | 3.4%
4 | 9 | 3.0%
3 | 8 | 2.7%
8 | 8 | 2.7%
9 | 5 | 1.7%
7 | 5 | 1.7%

Connector Punctuation
Value | Count | Frequency (%)
_ | 37 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 333 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 110 | 33.0%
1 | 52 | 15.6%
2 | 47 | 14.1%
6 | 42 | 12.6%
_ | 37 | 11.1%
5 | 10 | 3.0%
4 | 9 | 2.7%
3 | 8 | 2.4%
8 | 8 | 2.4%
9 | 5 | 1.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 333 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 110 | 33.0%
1 | 52 | 15.6%
2 | 47 | 14.1%
6 | 42 | 12.6%
_ | 37 | 11.1%
5 | 10 | 3.0%
4 | 9 | 2.7%
3 | 8 | 2.4%
8 | 8 | 2.4%
9 | 5 | 1.5%

자치구
Categorical

HIGH CORRELATION 

Distinct: 14
Distinct (%): 37.8%
Missing: 0
Missing (%): 0.0%
Memory size: 428.0 B

Length

Max length: 4
Median length: 3
Mean length: 3.1351351
Min length: 2

Unique

Unique: 4
Unique (%): 10.8%

Sample

1st row: 중구
2nd row: 중구
3rd row: 용산구
4th row: 용산구
5th row: 용산구

Common Values

Value | Count | Frequency (%)
영등포구 | 6 | 16.2%
송파구 | 6 | 16.2%
용산구 | 3 | 8.1%
마포구 | 3 | 8.1%
강서구 | 3 | 8.1%
금천구 | 3 | 8.1%
강남구 | 3 | 8.1%
중구 | 2 | 5.4%
은평구 | 2 | 5.4%
서초구 | 2 | 5.4%
Other values (4) | 4 | 10.8%
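
The frequency column is each count divided by the number of observations. The sketch below recomputes it from the counts listed above; the dictionary literal and the helper name `frequency_pct` are illustrative, and rounding to one decimal place is an assumption about how the report formats percentages.

```python
# Counts copied from the Common Values table for 자치구.
counts = {
    "영등포구": 6, "송파구": 6, "용산구": 3, "마포구": 3, "강서구": 3,
    "금천구": 3, "강남구": 3, "중구": 2, "은평구": 2, "서초구": 2,
    "Other values (4)": 4,
}

# The counts should add up to the number of observations in the dataset.
total = sum(counts.values())


def frequency_pct(count, n):
    """Share of n observations, as a percentage rounded to one decimal."""
    return round(100 * count / n, 1)


for name, count in counts.items():
    print(f"{name} | {count} | {frequency_pct(count, total)}%")
```

The same calculation explains Distinct (%): 14 distinct districts out of 37 rows gives 37.8%.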

Length

Histogram of lengths of the category

등록년도
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Memory size: 428.0 B

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2016
2nd row: 2016
3rd row: 2016
4th row: 2016
5th row: 2016

Common Values

Value | Count | Frequency (%)
2016 | 37 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2016 | 37 | 100.0%

공사명
Text

UNIQUE 

Distinct: 37
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 428.0 B

Length

Max length: 29
Median length: 23
Mean length: 17.324324
Min length: 9

Characters and Unicode

Total characters: 641
Distinct characters: 169
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 37
Unique (%): 100.0%

Sample

1st row: 하나은행현장 본점 신축공사
2nd row: 동교동청기와호텔 개발사업현장
3rd row: 아모레퍼시픽 사옥 신축공사
4th row: 용산역전면 제2구역 도시환경 정비사업
5th row: 용산역전면 제3구역 도시환경정비사업(용산3복합개발)

Value | Count | Frequency (%)
신축공사 | 25 | 20.2%
지식산업센터 | 3 | 2.4%
업무시설 | 3 | 2.4%
사옥 | 3 | 2.4%
오피스텔 | 3 | 2.4%
서초업무시설 | 2 | 1.6%
주상복합 | 2 | 1.6%
문정7구역 | 2 | 1.6%
가산 | 2 | 1.6%
1-2bl | 2 | 1.6%
Other values (70) | 77 | 62.1%

Most occurring characters

ValueCountFrequency (%)
87
 
13.6%
39
 
6.1%
34
 
5.3%
31
 
4.8%
29
 
4.5%
15
 
2.3%
2 10
 
1.6%
1 10
 
1.6%
10
 
1.6%
10
 
1.6%
Other values (159) 366
57.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 473 | 73.8%
Space Separator | 87 | 13.6%
Decimal Number | 32 | 5.0%
Uppercase Letter | 32 | 5.0%
Dash Punctuation | 8 | 1.2%
Other Punctuation | 5 | 0.8%
Connector Punctuation | 2 | 0.3%
Open Punctuation | 1 | 0.2%
Close Punctuation | 1 | 0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
39
 
8.2%
34
 
7.2%
31
 
6.6%
29
 
6.1%
15
 
3.2%
10
 
2.1%
10
 
2.1%
10
 
2.1%
8
 
1.7%
8
 
1.7%
Other values (132) 279
59.0%
Uppercase Letter
ValueCountFrequency (%)
B 5
15.6%
C 5
15.6%
E 5
15.6%
L 5
15.6%
N 3
9.4%
T 2
 
6.2%
R 2
 
6.2%
I 1
 
3.1%
S 1
 
3.1%
V 1
 
3.1%
Other values (2) 2
 
6.2%
Decimal Number
ValueCountFrequency (%)
2 10
31.2%
1 10
31.2%
7 4
 
12.5%
5 3
 
9.4%
3 3
 
9.4%
4 1
 
3.1%
0 1
 
3.1%
Other Punctuation
ValueCountFrequency (%)
. 3
60.0%
, 1
 
20.0%
& 1
 
20.0%
Space Separator
ValueCountFrequency (%)
87
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 8
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 473 | 73.8%
Common | 136 | 21.2%
Latin | 32 | 5.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
39
 
8.2%
34
 
7.2%
31
 
6.6%
29
 
6.1%
15
 
3.2%
10
 
2.1%
10
 
2.1%
10
 
2.1%
8
 
1.7%
8
 
1.7%
Other values (132) 279
59.0%
Common
ValueCountFrequency (%)
87
64.0%
2 10
 
7.4%
1 10
 
7.4%
- 8
 
5.9%
7 4
 
2.9%
. 3
 
2.2%
5 3
 
2.2%
3 3
 
2.2%
_ 2
 
1.5%
4 1
 
0.7%
Other values (5) 5
 
3.7%
Latin
ValueCountFrequency (%)
B 5
15.6%
C 5
15.6%
E 5
15.6%
L 5
15.6%
N 3
9.4%
T 2
 
6.2%
R 2
 
6.2%
I 1
 
3.1%
S 1
 
3.1%
V 1
 
3.1%
Other values (2) 2
 
6.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 473 | 73.8%
ASCII | 168 | 26.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
87
51.8%
2 10
 
6.0%
1 10
 
6.0%
- 8
 
4.8%
B 5
 
3.0%
C 5
 
3.0%
E 5
 
3.0%
L 5
 
3.0%
7 4
 
2.4%
N 3
 
1.8%
Other values (17) 26
 
15.5%
Hangul
ValueCountFrequency (%)
39
 
8.2%
34
 
7.2%
31
 
6.6%
29
 
6.1%
15
 
3.2%
10
 
2.1%
10
 
2.1%
10
 
2.1%
8
 
1.7%
8
 
1.7%
Other values (132) 279
59.0%

공사위치
Text

Distinct: 36
Distinct (%): 97.3%
Missing: 0
Missing (%): 0.0%
Memory size: 428.0 B

Length

Max length: 25
Median length: 22
Mean length: 17.972973
Min length: 11

Characters and Unicode

Total characters: 665
Distinct characters: 80
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 35
Unique (%): 94.6%

Sample

1st row: 서울시 중구 을지로1가 101-1
2nd row: 서울시 중구 소공동 1-25
3rd row: 서울시 용산구 한강로2가 159-5번지 일대
4th row: 서울시 용산구 한강로2가 391번지 일대
5th row: 서울시 용산구 한강로2가 342번지

Value | Count | Frequency (%)
서울시 | 28 | 19.0%
영등포구 | 6 | 4.1%
송파구 | 6 | 4.1%
문정동 | 4 | 2.7%
서울 | 4 | 2.7%
한강로2가 | 3 | 2.0%
마곡동 | 3 | 2.0%
강서구 | 3 | 2.0%
가산동 | 3 | 2.0%
금천구 | 3 | 2.0%
Other values (70) | 84 | 57.1%

Most occurring characters

ValueCountFrequency (%)
110
 
16.5%
41
 
6.2%
38
 
5.7%
35
 
5.3%
33
 
5.0%
31
 
4.7%
1 28
 
4.2%
- 26
 
3.9%
2 22
 
3.3%
3 19
 
2.9%
Other values (70) 282
42.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 372 | 55.9%
Decimal Number | 153 | 23.0%
Space Separator | 110 | 16.5%
Dash Punctuation | 26 | 3.9%
Other Punctuation | 4 | 0.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
41
 
11.0%
38
 
10.2%
35
 
9.4%
33
 
8.9%
31
 
8.3%
10
 
2.7%
10
 
2.7%
9
 
2.4%
9
 
2.4%
9
 
2.4%
Other values (57) 147
39.5%
Decimal Number
ValueCountFrequency (%)
1 28
18.3%
2 22
14.4%
3 19
12.4%
4 15
9.8%
5 15
9.8%
7 13
8.5%
0 12
7.8%
6 12
7.8%
9 9
 
5.9%
8 8
 
5.2%
Space Separator
ValueCountFrequency (%)
110
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 26
100.0%
Other Punctuation
ValueCountFrequency (%)
, 4
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 372 | 55.9%
Common | 293 | 44.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
41
 
11.0%
38
 
10.2%
35
 
9.4%
33
 
8.9%
31
 
8.3%
10
 
2.7%
10
 
2.7%
9
 
2.4%
9
 
2.4%
9
 
2.4%
Other values (57) 147
39.5%
Common
ValueCountFrequency (%)
110
37.5%
1 28
 
9.6%
- 26
 
8.9%
2 22
 
7.5%
3 19
 
6.5%
4 15
 
5.1%
5 15
 
5.1%
7 13
 
4.4%
0 12
 
4.1%
6 12
 
4.1%
Other values (3) 21
 
7.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 372 | 55.9%
ASCII | 293 | 44.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
110
37.5%
1 28
 
9.6%
- 26
 
8.9%
2 22
 
7.5%
3 19
 
6.5%
4 15
 
5.1%
5 15
 
5.1%
7 13
 
4.4%
0 12
 
4.1%
6 12
 
4.1%
Other values (3) 21
 
7.2%
Hangul
ValueCountFrequency (%)
41
 
11.0%
38
 
10.2%
35
 
9.4%
33
 
8.9%
31
 
8.3%
10
 
2.7%
10
 
2.7%
9
 
2.4%
9
 
2.4%
9
 
2.4%
Other values (57) 147
39.5%

X좌표
Real number (ℝ)

HIGH CORRELATION 

Distinct: 36
Distinct (%): 97.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 197566.61
Minimum: 184456
Maximum: 212640
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 465.0 B

Quantile statistics

Minimum: 184456
5-th percentile: 184750.08
Q1: 191792
Median: 196925.6
Q3: 203933.6
95-th percentile: 210409.92
Maximum: 212640
Range: 28184
Interquartile range (IQR): 12141.6

Descriptive statistics

Standard deviation: 8186.5417
Coefficient of variation (CV): 0.04143687
Kurtosis: -1.0025022
Mean: 197566.61
Median Absolute Deviation (MAD): 5456.8
Skewness: 0.31393758
Sum: 7309964.4
Variance: 67019464
Monotonicity: Not monotonic
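
Several of these figures are derived from others in the same listing, which gives a quick consistency check. The sketch below recomputes them from the rounded X좌표 values printed in this report, so it compares with a small relative tolerance instead of expecting exact equality. The variable names are illustrative.

```python
import math

# Values copied from the X좌표 statistics in this report (already rounded).
minimum, maximum = 184456.0, 212640.0
q1, q3 = 191792.0, 203933.6
mean, std = 197566.61, 8186.5417
total, n = 7309964.4, 37

value_range = maximum - minimum   # report: Range 28184
iqr = q3 - q1                     # report: IQR 12141.6
cv = std / mean                   # report: CV 0.04143687
variance = std ** 2               # report: Variance 67019464
mean_from_sum = total / n         # report: Mean 197566.61

# Compare each derived value with the figure printed in the report.
checks = {
    "range": math.isclose(value_range, 28184, rel_tol=1e-6),
    "iqr": math.isclose(iqr, 12141.6, rel_tol=1e-6),
    "cv": math.isclose(cv, 0.04143687, rel_tol=1e-6),
    "variance": math.isclose(variance, 67019464, rel_tol=1e-6),
    "mean": math.isclose(mean_from_sum, mean, rel_tol=1e-6),
}
print(checks)
```

The same relations (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared) apply to the Y좌표 statistics.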
Histogram with fixed size bins (bins=36)

Common values
Value | Count | Frequency (%)
210345.2 | 2 | 5.4%
198407.6 | 1 | 2.7%
191977.6 | 1 | 2.7%
192766.0 | 1 | 2.7%
192994.4 | 1 | 2.7%
201946.8 | 1 | 2.7%
203930.8 | 1 | 2.7%
203933.6 | 1 | 2.7%
201774.8 | 1 | 2.7%
209986.8 | 1 | 2.7%
Other values (26) | 26 | 70.3%

Minimum 10 values
Value | Count | Frequency (%)
184456.0 | 1 | 2.7%
184598.4 | 1 | 2.7%
184788.0 | 1 | 2.7%
189202.8 | 1 | 2.7%
189420.4 | 1 | 2.7%
189448.0 | 1 | 2.7%
189861.2 | 1 | 2.7%
191468.8 | 1 | 2.7%
191494.4 | 1 | 2.7%
191792.0 | 1 | 2.7%

Maximum 10 values
Value | Count | Frequency (%)
212640.0 | 1 | 2.7%
210668.8 | 1 | 2.7%
210345.2 | 2 | 5.4%
210273.6 | 1 | 2.7%
209986.8 | 1 | 2.7%
206602.8 | 1 | 2.7%
205088.0 | 1 | 2.7%
204003.2 | 1 | 2.7%
203933.6 | 1 | 2.7%
203930.8 | 1 | 2.7%

Y좌표
Real number (ℝ)

HIGH CORRELATION 

Distinct: 36
Distinct (%): 97.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 447209.08
Minimum: 440032
Maximum: 460026
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 465.0 B

Quantile statistics

Minimum: 440032
5-th percentile: 441733.52
Q1: 442959.2
Median: 446704.8
Q3: 450548.4
95-th percentile: 454139.28
Maximum: 460026
Range: 19994
Interquartile range (IQR): 7589.2

Descriptive statistics

Standard deviation: 4806.9757
Coefficient of variation (CV): 0.010748833
Kurtosis: 0.86040917
Mean: 447209.08
Median Absolute Deviation (MAD): 3765.6
Skewness: 0.87317511
Sum: 16546736
Variance: 23107016
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=36)

Common values
Value | Count | Frequency (%)
442813.19999 | 2 | 5.4%
451878.399997 | 1 | 2.7%
446391.99999 | 1 | 2.7%
447457.2 | 1 | 2.7%
447649.99999 | 1 | 2.7%
440041.99999 | 1 | 2.7%
444763.19999 | 1 | 2.7%
443898.8 | 1 | 2.7%
446365.6 | 1 | 2.7%
446303.59999 | 1 | 2.7%
Other values (26) | 26 | 70.3%

Minimum 10 values
Value | Count | Frequency (%)
440031.99999 | 1 | 2.7%
440041.99999 | 1 | 2.7%
442156.39999 | 1 | 2.7%
442371.99999 | 1 | 2.7%
442427.99999 | 1 | 2.7%
442448.4 | 1 | 2.7%
442813.19999 | 2 | 5.4%
442939.19999 | 1 | 2.7%
442959.2 | 1 | 2.7%
443160.0 | 1 | 2.7%

Maximum 10 values
Value | Count | Frequency (%)
460025.99999 | 1 | 2.7%
460017.99999 | 1 | 2.7%
452669.6 | 1 | 2.7%
452191.99999 | 1 | 2.7%
452030.4 | 1 | 2.7%
451878.399997 | 1 | 2.7%
451643.6 | 1 | 2.7%
451618.8 | 1 | 2.7%
451096.79999 | 1 | 2.7%
450548.4 | 1 | 2.7%

Interactions

[Four scatter plots of pairwise interactions between the numeric variables X좌표 and Y좌표]

Correlations

 | 관리번호 | 자치구 | 공사명 | 공사위치 | X좌표 | Y좌표
관리번호 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
자치구 | 1.000 | 1.000 | 1.000 | 1.000 | 0.969 | 0.942
공사명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
공사위치 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
X좌표 | 1.000 | 0.969 | 1.000 | 1.000 | 1.000 | 0.839
Y좌표 | 1.000 | 0.942 | 1.000 | 1.000 | 0.839 | 1.000

 | X좌표 | Y좌표 | 자치구
X좌표 | 1.000 | -0.246 | 0.787
Y좌표 | -0.246 | 1.000 | 0.712
자치구 | 0.787 | 0.712 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 관리번호 | 자치구 | 등록년도 | 공사명 | 공사위치 | X좌표 | Y좌표
0 | 2016_0002 | 중구 | 2016 | 하나은행현장 본점 신축공사 | 서울시 중구 을지로1가 101-1 | 198407.6 | 451878.399997
1 | 2016_0005 | 중구 | 2016 | 동교동청기와호텔 개발사업현장 | 서울시 중구 소공동 1-25 | 197425.6 | 451618.8
2 | 2016_0011 | 용산구 | 2016 | 아모레퍼시픽 사옥 신축공사 | 서울시 용산구 한강로2가 159-5번지 일대 | 197261.6 | 447733.2
3 | 2016_0012 | 용산구 | 2016 | 용산역전면 제2구역 도시환경 정비사업 | 서울시 용산구 한강로2가 391번지 일대 | 196925.6 | 447554.0
4 | 2016_0014 | 용산구 | 2016 | 용산역전면 제3구역 도시환경정비사업(용산3복합개발) | 서울시 용산구 한강로2가 342번지 | 197031.2 | 447698.0
5 | 2016_0021 | 광진구 | 2016 | 건국대학교_신공학관_건립공사 | 서울시 광진구 능동로 120 | 206602.8 | 448855.6
6 | 2016_0023 | 동대문구 | 2016 | 청계와이즈노벨리아 아파트 신축공사 | 동대문구 답십리동 463-2,19,21 | 204003.2 | 452669.6
7 | 2016_0032 | 은평구 | 2016 | 은평엘크루 주상복합 신축공사 | 서울시 은평구 진관동 60-18번지 상업3블록 | 192808.0 | 460025.99999
8 | 2016_0033 | 은평구 | 2016 | 은평뉴타운 상업4블럭 주상복합 신축공사 | 서울특별시 은평구 진관동 진관3로 33 | 192934.4 | 460017.99999
9 | 2016_0034 | 마포구 | 2016 | 서울복합 1,2호기 토건공사 | 서울특별시 마포구 토정로 56 | 192613.6 | 449450.0

Last rows

Index | 관리번호 | 자치구 | 등록년도 | 공사명 | 공사위치 | X좌표 | Y좌표
27 | 2016_0097 | 송파구 | 2016 | 문정지구 문정7구역 지식산업센터 7-1BL | 서울시 송파구 문정동 | 210345.2 | 442813.19999
28 | 2016_0098 | 송파구 | 2016 | 문정지구 문정7구역 지식산업센터 7-2BL | 서울시 송파구 문정동 | 210345.2 | 442813.19999
29 | 2016_0101 | 송파구 | 2016 | 방이동 잠실헤리츠 오피스텔 신축공사 | 서울시 송파구 방이동 47-2 | 209986.8 | 446303.59999
30 | 2016_0108 | 송파구 | 2016 | 위례 아이온스퀘어 신축공사 | 서울시 송파구 장지동 881 | 212640.0 | 442427.99999
31 | 2016_0109 | 송파구 | 2016 | 문정 1-2BL 지식산업센터 신축공사 | 서울특별시 송파구 문정동 642-3 | 210668.8 | 443160.0
32 | 2016_0113 | 송파구 | 2016 | 신흥정보통신 사옥 신축공사 | 서울시 송파구 문정동 640-10 | 210273.6 | 442939.19999
33 | 2016_0125 | 서초구 | 2016 | 서초업무시설 1-2BL 신축공사 | 서울시 서초구 우면동786 | 202072.0 | 440031.99999
34 | 2016_0126 | 성동구 | 2016 | 성수별관 신축공사 | 서울시 성동구 성수동 2가 273-61 | 205088.0 | 449277.59999
35 | 2016_0127 | 강서구 | 2016 | 마곡지구 업무 C2-2.5 블록 업무시설 신축공사 | 서울 강서구 마곡동 365-20 | 184598.4 | 452030.4
36 | 2016_0128 | 강서구 | 2016 | 마곡지구 업무 C3-1.2.5 블록 업무시설 신축공사 | 서울 강서구 마곡동 288-4 | 184788.0 | 452191.99999