Overview

Dataset statistics

Number of variables: 7
Number of observations: 406
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1
Duplicate rows (%): 0.2%
Total size in memory: 22.7 KiB
Average record size in memory: 57.3 B
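
The percentage and per-record figures follow from the counts. A minimal sketch using only the numbers reported in this section; the variable names are illustrative and are not part of ydata-profiling:

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 7
n_observations = 406
missing_cells = 0
duplicate_rows = 1
total_size_kib = 22.7  # already rounded to one decimal in the report

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
# Approximate only, because the total size above is a rounded value.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: about {avg_record_bytes:.1f} B")
```

Because the inputs are rounded, the recomputed record size can differ from the reported one in the last digit.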

Variable types

Categorical: 3
Text: 1
Numeric: 1
DateTime: 2

Dataset

Description: Status of private construction sites in Gangnam-gu, Seoul (서울특별시 강남구). For details, contact the Gangnam-gu Architecture Division (건축과) at 02-3423-6142.
Author: 서울특별시 강남구 (Gangnam-gu, Seoul)
URL: https://www.data.go.kr/data/15108269/fileData.do

Alerts

시도 has constant value "서울특별시" (Constant)
시군구 has constant value "강남구" (Constant)
Dataset has 1 (0.2%) duplicate rows (Duplicates)
연면적(제곱미터) is highly overall correlated with 주용도 (High correlation)
주용도 is highly overall correlated with 연면적(제곱미터) (High correlation)

Reproduction

Analysis started: 2023-12-12 03:33:10.957404
Analysis finished: 2023-12-12 03:33:11.546910
Duration: 0.59 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시도
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

서울특별시  406

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 서울특별시
3rd row: 서울특별시
4th row: 서울특별시
5th row: 서울특별시

Common Values

Value         Count   Frequency (%)
서울특별시    406     100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

시군구
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

강남구  406

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강남구
2nd row: 강남구
3rd row: 강남구
4th row: 강남구
5th row: 강남구

Common Values

Value     Count   Frequency (%)
강남구    406     100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

대지위치
Text

Distinct: 401
Distinct (%): 98.8%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Length

Max length: 25
Median length: 24
Mean length: 20.017241
Min length: 15

Characters and Unicode

Total characters: 8127
Distinct characters: 46
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
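
As an illustration of those character properties, the sketch below classifies the characters of the first sample row by Unicode general category using Python's standard `unicodedata` module. It covers only the category property, not scripts or blocks, and is not necessarily how ydata-profiling computes its figures.

```python
import unicodedata
from collections import Counter

# First sample row of the address variable, copied from the report.
address = "서울특별시 강남구 역삼동 798-16"

# General category codes: Lo = Other Letter, Nd = Decimal Number,
# Zs = Space Separator, Pd = Dash Punctuation.
categories = Counter(unicodedata.category(ch) for ch in address)

for code, count in categories.most_common():
    print(code, count)
```

The four codes found in this row correspond to the four categories the report lists for the whole column.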

Unique

Unique: 396
Unique (%): 97.5%

Sample

1st row: 서울특별시 강남구 역삼동 798-16
2nd row: 서울특별시 강남구 논현동 62-13 외1필지
3rd row: 서울특별시 강남구 삼성동 46-33
4th row: 서울특별시 강남구 대치동 901-30
5th row: 서울특별시 강남구 신사동 512-6 외1필지

Value                 Count   Frequency (%)
서울특별시            406     24.0%
강남구                406     24.0%
논현동                108     6.4%
역삼동                92      5.4%
삼성동                48      2.8%
청담동                48      2.8%
외1필지               44      2.6%
신사동                34      2.0%
대치동                26      1.5%
외2필지               15      0.9%
Other values (409)    468     27.6%
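
A word-frequency table of this kind can be approximated by splitting each address on whitespace and counting the tokens. The sketch below does this for the five sample rows only; whitespace tokenisation is an assumption here, not a confirmed detail of ydata-profiling.

```python
from collections import Counter

# The five sample rows of the address variable shown in the report.
sample_rows = [
    "서울특별시 강남구 역삼동 798-16",
    "서울특별시 강남구 논현동 62-13 외1필지",
    "서울특별시 강남구 삼성동 46-33",
    "서울특별시 강남구 대치동 901-30",
    "서울특별시 강남구 신사동 512-6 외1필지",
]

# Split every row on whitespace and count each token across all rows.
word_counts = Counter(word for row in sample_rows for word in row.split())
total_words = sum(word_counts.values())

for word, count in word_counts.most_common(3):
    print(f"{word}\t{count}\t{100 * count / total_words:.1f}%")
```

Run on the full column instead of five rows, the same approach would yield counts comparable to the table.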

Most occurring characters

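Value                Count   Frequency (%)
(space)              1289    15.9%
(not shown)          407     5.0%
(not shown)          406     5.0%
(not shown)          406     5.0%
(not shown)          406     5.0%
(not shown)          406     5.0%
(not shown)          406     5.0%
(not shown)          406     5.0%
(not shown)          406     5.0%
(not shown)          406     5.0%
Other values (36)    3183    39.2%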

Most occurring categories

Value               Count   Frequency (%)
Other Letter        4677    57.5%
Decimal Number      1781    21.9%
Space Separator     1289    15.9%
Dash Punctuation    380     4.7%

Most frequent character per category

Other Letter
Value                Count   Frequency (%)
(not shown)          407     8.7%
(not shown)          406     8.7%
(not shown)          406     8.7%
(not shown)          406     8.7%
(not shown)          406     8.7%
(not shown)          406     8.7%
(not shown)          406     8.7%
(not shown)          406     8.7%
(not shown)          406     8.7%
(not shown)          140     3.0%
Other values (24)    882     18.9%

Decimal Number
Value    Count   Frequency (%)
1        394     22.1%
2        266     14.9%
6        160     9.0%
3        159     8.9%
5        156     8.8%
7        144     8.1%
4        139     7.8%
8        138     7.7%
9        135     7.6%
0        90      5.1%

Space Separator
Value      Count   Frequency (%)
(space)    1289    100.0%

Dash Punctuation
Value    Count   Frequency (%)
-        380     100.0%

Most occurring scripts

Value     Count   Frequency (%)
Hangul    4677    57.5%
Common    3450    42.5%

Most frequent character per script

Hangul
Value                Count   Frequency (%)
(not shown)          407     8.7%
(not shown)          406     8.7%
(not shown)          406     8.7%
(not shown)          406     8.7%
(not shown)          406     8.7%
(not shown)          406     8.7%
(not shown)          406     8.7%
(not shown)          406     8.7%
(not shown)          406     8.7%
(not shown)          140     3.0%
Other values (24)    882     18.9%

Common
Value               Count   Frequency (%)
(space)             1289    37.4%
1                   394     11.4%
-                   380     11.0%
2                   266     7.7%
6                   160     4.6%
3                   159     4.6%
5                   156     4.5%
7                   144     4.2%
4                   139     4.0%
8                   138     4.0%
Other values (2)    225     6.5%

Most occurring blocks

Value     Count   Frequency (%)
Hangul    4677    57.5%
ASCII     3450    42.5%

Most frequent character per block

ASCII
Value               Count   Frequency (%)
(space)             1289    37.4%
1                   394     11.4%
-                   380     11.0%
2                   266     7.7%
6                   160     4.6%
3                   159     4.6%
5                   156     4.5%
7                   144     4.2%
4                   139     4.0%
8                   138     4.0%
Other values (2)    225     6.5%

Hangul
Value                Count   Frequency (%)
(not shown)          407     8.7%
(not shown)          406     8.7%
(not shown)          406     8.7%
(not shown)          406     8.7%
(not shown)          406     8.7%
(not shown)          406     8.7%
(not shown)          406     8.7%
(not shown)          406     8.7%
(not shown)          406     8.7%
(not shown)          140     3.0%
Other values (24)    882     18.9%

연면적(제곱미터)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 346
Distinct (%): 85.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2896.4778
Minimum: 191
Maximum: 154711
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.7 KiB

Quantile statistics

Minimum: 191
5-th percentile: 336.25
Q1: 597
Median: 869.5
Q3: 1997.5
95-th percentile: 9612.75
Maximum: 154711
Range: 154520
Interquartile range (IQR): 1400.5

Descriptive statistics

Standard deviation: 9308.2501
Coefficient of variation (CV): 3.2136445
Kurtosis: 179.51401
Mean: 2896.4778
Median Absolute Deviation (MAD): 407
Skewness: 11.922419
Sum: 1175970
Variance: 86643520
Monotonicity: Not monotonic
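
Several of these statistics are derived from one another. The sketch below re-derives the mean, coefficient of variation, variance, interquartile range and range from the other reported values for 연면적(제곱미터); it is a consistency check on the published figures, not a recomputation from the raw data.

```python
# Values copied from the report for 연면적(제곱미터).
n = 406
total = 1175970
std = 9308.2501
q1, q3 = 597, 1997.5
minimum, maximum = 191, 154711

mean = total / n                 # mean = sum / number of observations
cv = std / mean                  # coefficient of variation = std / mean
variance = std ** 2              # variance = squared standard deviation
iqr = q3 - q1                    # interquartile range = Q3 - Q1
value_range = maximum - minimum  # range = maximum - minimum

print(f"mean={mean:.4f} cv={cv:.7f} variance={variance:.0f}")
print(f"iqr={iqr} range={value_range}")
```

Small differences in the last digits are expected, since the inputs are themselves rounded.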
[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value                 Count   Frequency (%)
2000                  6       1.5%
602                   5       1.2%
599                   4       1.0%
729                   4       1.0%
870                   3       0.7%
280                   3       0.7%
679                   3       0.7%
1998                  3       0.7%
735                   3       0.7%
597                   3       0.7%
Other values (336)    369     90.9%

Minimum 10 values

Value    Count   Frequency (%)
191      1       0.2%
198      1       0.2%
200      1       0.2%
273      1       0.2%
274      1       0.2%
280      3       0.7%
285      1       0.2%
289      1       0.2%
290      1       0.2%
293      2       0.5%

Maximum 10 values

Value     Count   Frequency (%)
154711    1       0.2%
49414     1       0.2%
47731     1       0.2%
46449     1       0.2%
43115     1       0.2%
27025     1       0.2%
20437     1       0.2%
18986     1       0.2%
16745     1       0.2%
16272     1       0.2%

주용도
Categorical

HIGH CORRELATION 

Distinct: 12
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

제2종근린생활시설  204
업무시설  66
공동주택  47
제1종근린생활시설  46
단독주택  27
Other values (7)  16

Length

Max length: 9
Median length: 9
Mean length: 7.1428571
Min length: 2

Unique

Unique: 5
Unique (%): 1.2%

Sample

1st row: 업무시설
2nd row: 제1종근린생활시설
3rd row: 제2종근린생활시설
4th row: 제2종근린생활시설
5th row: 제2종근린생활시설

Common Values

Value                 Count   Frequency (%)
제2종근린생활시설     204     50.2%
업무시설              66      16.3%
공동주택              47      11.6%
제1종근린생활시설     46      11.3%
단독주택              27      6.7%
교육연구시설          8       2.0%
자동차관련시설        3       0.7%
문화및집회시설        1       0.2%
종교시설              1       0.2%
의료시설              1       0.2%
Other values (2)      2       0.5%
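
The difference between "Distinct" (number of different values) and "Unique" (values that occur exactly once) can be checked from these counts. In the sketch below, the two categories hidden under "Other values (2)" are assumed to be 공장 and 숙박시설, which appear in the sample rows of this report, each occurring once.

```python
# Counts copied from the Common Values table for 주용도. The last two
# entries are an assumption: they are not listed in the table and are
# taken from the sample rows, with one occurrence each.
counts = {
    "제2종근린생활시설": 204,
    "업무시설": 66,
    "공동주택": 47,
    "제1종근린생활시설": 46,
    "단독주택": 27,
    "교육연구시설": 8,
    "자동차관련시설": 3,
    "문화및집회시설": 1,
    "종교시설": 1,
    "의료시설": 1,
    "공장": 1,
    "숙박시설": 1,
}

n_rows = sum(counts.values())
distinct = len(counts)                                # different values
unique = sum(1 for c in counts.values() if c == 1)    # values seen exactly once
# Mean label length in characters, weighted by how often each label occurs.
mean_length = sum(len(v) * c for v, c in counts.items()) / n_rows

print(f"rows={n_rows} distinct={distinct} unique={unique}")
print(f"top share={100 * counts['제2종근린생활시설'] / n_rows:.1f}%")
print(f"mean length={mean_length:.7f}")
```

Under that assumption the counts reproduce the reported row total, Distinct, Unique and Mean length values.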

Length

[Figure: Histogram of lengths of the category]

착공처리일
DateTime

Distinct: 256
Distinct (%): 63.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB
Minimum: 2018-02-22 00:00:00
Maximum: 2022-11-14 00:00:00
[Figure: Histogram with fixed size bins (bins=50)]

준공예정일(사용승인예정일)
DateTime

Distinct: 200
Distinct (%): 49.3%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB
Minimum: 2018-12-15 00:00:00
Maximum: 2223-05-31 00:00:00
[Figure: Histogram with fixed size bins (bins=50)]
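
The maximum of 2223-05-31 reported just above lies about two centuries after the analysis date, which may indicate a data-entry error in the source file; the report itself does not flag it. The sketch below shows one way such values could be caught. The 20-year horizon and the function name `is_suspicious` are hypothetical choices made for illustration.

```python
from datetime import date

# Range reported for the DateTime variable directly above.
reported_min = date(2018, 12, 15)
reported_max = date(2223, 5, 31)

# Hypothetical cut-off: dates more than 20 years after the analysis
# date (2023-12-12) are treated as suspicious.
analysis_date = date(2023, 12, 12)
max_plausible = date(analysis_date.year + 20, analysis_date.month, analysis_date.day)


def is_suspicious(d: date) -> bool:
    """Return True when a date lies beyond the assumed plausible horizon."""
    return d > max_plausible


print(is_suspicious(reported_min), is_suspicious(reported_max))
```

A flagged value still needs checking against the source record before it is corrected or dropped.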

Interactions


Correlations

                    연면적(제곱미터)    주용도
연면적(제곱미터)    1.000               0.830
주용도              0.830               1.000

                    연면적(제곱미터)    주용도
연면적(제곱미터)    1.000               0.646
주용도              0.646               1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

시도시군구대지위치연면적(제곱미터)주용도착공처리일준공예정일(사용승인예정일)
0서울특별시강남구서울특별시 강남구 역삼동 798-161887업무시설2022-11-142024-01-31
1서울특별시강남구서울특별시 강남구 논현동 62-13 외1필지10560제1종근린생활시설2022-11-092025-02-28
2서울특별시강남구서울특별시 강남구 삼성동 46-33735제2종근린생활시설2022-11-082023-12-01
3서울특별시강남구서울특별시 강남구 대치동 901-30603제2종근린생활시설2022-11-082023-10-31
4서울특별시강남구서울특별시 강남구 신사동 512-6 외1필지582제2종근린생활시설2022-11-082023-09-03
5서울특별시강남구서울특별시 강남구 논현동 182-291828제2종근린생활시설2022-11-072023-11-30
6서울특별시강남구서울특별시 강남구 삼성동 90-29198제2종근린생활시설2022-11-032023-02-28
7서울특별시강남구서울특별시 강남구 논현동 77-29775제2종근린생활시설2022-11-012023-10-06
8서울특별시강남구서울특별시 강남구 논현동 257-11758제2종근린생활시설2022-11-012023-06-17
9서울특별시강남구서울특별시 강남구 논현동 257-121192제2종근린생활시설2022-11-012023-06-17
시도시군구대지위치연면적(제곱미터)주용도착공처리일준공예정일(사용승인예정일)
396서울특별시강남구서울특별시 강남구 삼성동 35 외1필지605제1종근린생활시설2020-05-192020-11-01
397서울특별시강남구서울특별시 강남구 청담동 106-7 외6필지20437공동주택2020-04-072023-05-20
398서울특별시강남구서울특별시 강남구 일원동 173-4 외61필지154711의료시설2020-04-072025-03-31
399서울특별시강남구서울특별시 강남구 신사동 546-6 외2필지2996제2종근린생활시설2020-03-112021-06-30
400서울특별시강남구서울특별시 강남구 자곡동 65043115공장2020-09-072022-10-31
401서울특별시강남구서울특별시 강남구 논현동 278-4 외2필지18986공동주택2019-11-072022-06-30
402서울특별시강남구서울특별시 강남구 세곡동 421-2754제1종근린생활시설2019-07-162020-01-10
403서울특별시강남구서울특별시 강남구 논현동 74-7 외5필지16745숙박시설2018-11-062021-02-28
404서울특별시강남구서울특별시 강남구 논현동 212-8615제1종근린생활시설2018-05-152018-12-15
405서울특별시강남구서울특별시 강남구 도곡동 산 29-513255제2종근린생활시설2018-02-222020-01-04

Duplicate rows

Most frequently occurring

시도시군구대지위치연면적(제곱미터)주용도착공처리일준공예정일(사용승인예정일)# duplicates
0서울특별시강남구서울특별시 강남구 역삼동 823-51990업무시설2022-04-212023-11-302
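
Duplicate rows like the one above can be found by counting identical records. The sketch below uses sample rows 0 and 2 from this report, with row 0 repeated purely for illustration; that repetition is not in the source data.

```python
from collections import Counter

# Records as tuples in the report's column order. Row 0 of the sample is
# listed twice here on purpose (illustrative only); row 2 appears once.
rows = [
    ("서울특별시", "강남구", "서울특별시 강남구 역삼동 798-16", 1887, "업무시설", "2022-11-14", "2024-01-31"),
    ("서울특별시", "강남구", "서울특별시 강남구 역삼동 798-16", 1887, "업무시설", "2022-11-14", "2024-01-31"),
    ("서울특별시", "강남구", "서울특별시 강남구 삼성동 46-33", 735, "제2종근린생활시설", "2022-11-08", "2023-12-01"),
]

row_counts = Counter(rows)
# Keep only records that occur more than once.
duplicates = {row: n for row, n in row_counts.items() if n > 1}
# Redundant rows are all occurrences beyond the first of each record.
redundant_rows = sum(n - 1 for n in duplicates.values())

for row, n in duplicates.items():
    print(n, row[2])
print("redundant rows:", redundant_rows)
```

A record that occurs twice contributes one redundant row, which matches the report showing "# duplicates" of 2 alongside a duplicate-row count of 1.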