Overview

Dataset statistics

Number of variables: 14
Number of observations: 68
Missing cells: 73
Missing cells (%): 7.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.9 KiB
Average record size in memory: 118.9 B
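
The percentage and memory figures above follow from simple arithmetic on the reported counts. A minimal sketch, assuming the missing-cell percentage is taken over all 14 × 68 cells and that 1 KiB = 1024 B (the variable names are illustrative, not part of the report):

```python
# Figures copied from the dataset statistics above.
n_variables = 14
n_observations = 68
missing_cells = 73
avg_record_size_bytes = 118.9

# Share of missing cells over the full 14 x 68 grid.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Total memory implied by the average record size, in KiB (assumed 1024 B).
total_kib = avg_record_size_bytes * n_observations / 1024

print(f"{total_cells} cells, {missing_pct:.1f}% missing, about {total_kib:.1f} KiB")
```

The 73 missing cells also match the sum of the per-column missing counts listed in the alerts (12 + 12 + 49).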

Variable types

Numeric: 2
Categorical: 8
Text: 1
DateTime: 3

Dataset

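Description: As of September 11, 2023, information on new buildings in Sancheong-gun, Gyeongsangnam-do that have not yet been completed. The fields are 구분 (category), 대지위치 (site location), 연면적 (total floor area, ㎡), 허가일 (permit date), 착공예정일 (scheduled construction start date), 준공예정일 (scheduled completion date), 최대지상층수 (maximum floors above ground), 최대지하층수 (maximum floors below ground), 주용도 (main use), 부속용도 (ancillary use), 총주차대수 (total parking spaces), and 가구수 (number of households).
Author: 경상남도 산청군 (Sancheong-gun, Gyeongsangnam-do)
URL: https://www.data.go.kr/data/15122549/fileData.do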

Alerts

건축구분 has constant value "신축" (Constant)
대지위치 has constant value "산청군" (Constant)
데이터기준일자 has constant value "2023-09-11" (Constant)
가구수 is highly overall correlated with 순번 and 4 other fields (High correlation)
최대지하층수 is highly overall correlated with 총주차대수 and 1 other field (High correlation)
최대지상층수 is highly overall correlated with 총주차대수 and 3 other fields (High correlation)
부속용도 is highly overall correlated with 최대지상층수 and 2 other fields (High correlation)
주용도 is highly overall correlated with 최대지상층수 and 2 other fields (High correlation)
순번 is highly overall correlated with 가구수 (High correlation)
총주차대수 is highly overall correlated with 최대지상층수 and 1 other field (High correlation)
최대지상층수 is highly imbalanced (54.1%) (Imbalance)
최대지하층수 is highly imbalanced (76.6%) (Imbalance)
주용도 is highly imbalanced (63.4%) (Imbalance)
가구수 is highly imbalanced (73.9%) (Imbalance)
착공예정일 has 12 (17.6%) missing values (Missing)
준공예정일 has 12 (17.6%) missing values (Missing)
총주차대수 has 49 (72.1%) missing values (Missing)
순번 has unique values (Unique)
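
The imbalance percentages in the alerts are consistent with one minus the normalized Shannon entropy of each column's value counts. That formula is inferred from the numbers rather than stated in the report, so treat it as an assumption. The counts below are taken from the per-variable Common Values tables later in the report, and `imbalance_score` is an illustrative name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits) of the
    class frequencies and k is the number of classes (assumed formula)."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no meaningful imbalance; report 0 here
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Value counts as listed in the per-variable Common Values tables.
value_counts = {
    "최대지상층수": [54, 8, 3, 2, 1],
    "최대지하층수": [64, 3, 1],
    "주용도": [61, 4, 3],
    "가구수": [65, 3],
}

for name, counts in value_counts.items():
    print(f"{name}: {100 * imbalance_score(counts):.1f}%")
```

A score near 100% means almost all rows share one value; a score of 0% means the classes are evenly spread.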

Reproduction

Analysis started: 2023-12-12 12:47:21.678661
Analysis finished: 2023-12-12 12:47:23.250753
Duration: 1.57 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
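
A report of this kind can be regenerated with ydata-profiling. The following is a minimal sketch only: the CSV path, the `cp949` encoding, the report title, and the function name `build_profile` are all assumptions, since the report does not say how the file was loaded or which settings `config.json` contains.

```python
def build_profile(csv_path, html_path="report.html", encoding="cp949"):
    """Profile a CSV export of the dataset and write an HTML report.

    csv_path, html_path and the cp949 encoding are assumptions; the original
    report does not record how the source file was read.
    """
    # Imported lazily so this module loads even without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="산청군 미준공 신축 건축물")  # hypothetical title
    report.to_file(html_path)
    return report
```

Output from a fresh run may differ from the figures here if the library version or configuration differs from the v4.5.1 run recorded above.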

Variables

순번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 68
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 34.5
Minimum: 1
Maximum: 68
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 744.0 B

Quantile statistics

Minimum: 1
5th percentile: 4.35
Q1: 17.75
Median: 34.5
Q3: 51.25
95th percentile: 64.65
Maximum: 68
Range: 67
Interquartile range (IQR): 33.5

Descriptive statistics

Standard deviation: 19.77372
Coefficient of variation (CV): 0.5731513
Kurtosis: -1.2
Mean: 34.5
Median Absolute Deviation (MAD): 17
Skewness: 0
Sum: 2346
Variance: 391
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
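
Because 순번 is simply the integers 1 to 68, the statistics above can be recomputed without the source file. A small sketch, assuming the sample (n − 1) variance and linear-interpolation percentiles, which is what the reported values match; the helper `percentile` is illustrative:

```python
import statistics

values = list(range(1, 69))  # 순번 runs from 1 to 68 with no gaps


def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 1]) on pre-sorted data."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)  # sample variance, divides by n - 1
std = statistics.stdev(values)
cv = std / mean                          # coefficient of variation
mad = statistics.median(abs(v - median) for v in values)
q1, q3 = percentile(values, 0.25), percentile(values, 0.75)

print(mean, median, variance, round(std, 5), round(cv, 7), mad, q1, q3)
```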

Common Values

Value | Count | Frequency (%)
1 | 1 | 1.5%
45 | 1 | 1.5%
51 | 1 | 1.5%
50 | 1 | 1.5%
49 | 1 | 1.5%
48 | 1 | 1.5%
47 | 1 | 1.5%
46 | 1 | 1.5%
44 | 1 | 1.5%
36 | 1 | 1.5%
Other values (58) | 58 | 85.3%

Minimum values

Value | Count | Frequency (%)
1 | 1 | 1.5%
2 | 1 | 1.5%
3 | 1 | 1.5%
4 | 1 | 1.5%
5 | 1 | 1.5%
6 | 1 | 1.5%
7 | 1 | 1.5%
8 | 1 | 1.5%
9 | 1 | 1.5%
10 | 1 | 1.5%

Maximum values

Value | Count | Frequency (%)
68 | 1 | 1.5%
67 | 1 | 1.5%
66 | 1 | 1.5%
65 | 1 | 1.5%
64 | 1 | 1.5%
63 | 1 | 1.5%
62 | 1 | 1.5%
61 | 1 | 1.5%
60 | 1 | 1.5%
59 | 1 | 1.5%

건축구분
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 676.0 B

Value | Count
신축 | 68

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 신축
2nd row: 신축
3rd row: 신축
4th row: 신축
5th row: 신축

Common Values

Value | Count | Frequency (%)
신축 | 68 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
신축 | 68 | 100.0%

대지위치
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 676.0 B

Value | Count
산청군 | 68

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 산청군
2nd row: 산청군
3rd row: 산청군
4th row: 산청군
5th row: 산청군

Common Values

Value | Count | Frequency (%)
산청군 | 68 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
산청군 | 68 | 100.0%

연면적
Text

Distinct: 67
Distinct (%): 98.5%
Missing: 0
Missing (%): 0.0%
Memory size: 676.0 B

Length

Max length: 8
Median length: 6
Mean length: 4.9411765
Min length: 2

Characters and Unicode

Total characters: 336
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 66
Unique (%): 97.1%

Sample

1st row: 383.76
2nd row: 145.26
3rd row: 156.65
4th row: 197.64
5th row: 430.31

Value | Count | Frequency (%)
96 | 2 | 2.9%
70.95 | 1 | 1.5%
99.12 | 1 | 1.5%
78 | 1 | 1.5%
462 | 1 | 1.5%
98.82 | 1 | 1.5%
166.87 | 1 | 1.5%
85.19 | 1 | 1.5%
30.48 | 1 | 1.5%
186 | 1 | 1.5%
Other values (57) | 57 | 83.8%

Most occurring characters

Value | Count | Frequency (%)
. | 52 | 15.5%
4 | 38 | 11.3%
1 | 36 | 10.7%
9 | 31 | 9.2%
3 | 30 | 8.9%
8 | 30 | 8.9%
6 | 29 | 8.6%
5 | 29 | 8.6%
2 | 24 | 7.1%
0 | 17 | 5.1%
Other values (2) | 20 | 6.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 280 | 83.3%
Other Punctuation | 56 | 16.7%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
4 | 38 | 13.6%
1 | 36 | 12.9%
9 | 31 | 11.1%
3 | 30 | 10.7%
8 | 30 | 10.7%
6 | 29 | 10.4%
5 | 29 | 10.4%
2 | 24 | 8.6%
0 | 17 | 6.1%
7 | 16 | 5.7%

Other Punctuation
Value | Count | Frequency (%)
. | 52 | 92.9%
, | 4 | 7.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 336 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
. | 52 | 15.5%
4 | 38 | 11.3%
1 | 36 | 10.7%
9 | 31 | 9.2%
3 | 30 | 8.9%
8 | 30 | 8.9%
6 | 29 | 8.6%
5 | 29 | 8.6%
2 | 24 | 7.1%
0 | 17 | 5.1%
Other values (2) | 20 | 6.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 336 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
. | 52 | 15.5%
4 | 38 | 11.3%
1 | 36 | 10.7%
9 | 31 | 9.2%
3 | 30 | 8.9%
8 | 30 | 8.9%
6 | 29 | 8.6%
5 | 29 | 8.6%
2 | 24 | 7.1%
0 | 17 | 5.1%
Other values (2) | 20 | 6.0%
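
연면적 is profiled as Text rather than as a number. The character tables show four commas among otherwise numeric characters, which suggests (but does not prove) that a few values carry thousands separators. A hedged sketch of converting such strings to floats; `parse_area` is an illustrative helper and the comma-containing value is hypothetical, not taken from the data:

```python
def parse_area(text):
    """Convert an area string such as '383.76' or '1,234.5' to a float (㎡).

    Assumes commas are thousands separators, which the report does not confirm.
    """
    cleaned = text.strip().replace(",", "")
    return float(cleaned)


samples = ["383.76", "145.26", "96", "462"]  # values shown in the report
hypothetical = "1,234.5"                      # illustrative only, not from the data

areas = [parse_area(s) for s in samples]
print(areas, parse_area(hypothetical))
```

After a conversion like this the column could be profiled as numeric, which would yield quantile and descriptive statistics instead of character counts.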

허가일
Date

Distinct: 64
Distinct (%): 94.1%
Missing: 0
Missing (%): 0.0%
Memory size: 676.0 B
Minimum: 2018-01-09 00:00:00
Maximum: 2023-08-17 00:00:00
Histogram with fixed size bins (bins=50)

착공예정일
Date

MISSING 

Distinct: 51
Distinct (%): 91.1%
Missing: 12
Missing (%): 17.6%
Memory size: 676.0 B
Minimum: 2018-11-16 00:00:00
Maximum: 2023-08-19 00:00:00
Histogram with fixed size bins (bins=50)

준공예정일
Date

MISSING 

Distinct: 44
Distinct (%): 78.6%
Missing: 12
Missing (%): 17.6%
Memory size: 676.0 B
Minimum: 2019-01-30 00:00:00
Maximum: 2025-04-22 00:00:00
Histogram with fixed size bins (bins=44)

최대지상층수
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 7.4%
Missing: 0
Missing (%): 0.0%
Memory size: 676.0 B

Value | Count
1 | 54
2 | 8
3 | 3
5 | 2
15 | 1

Length

Max length: 2
Median length: 1
Mean length: 1.0147059
Min length: 1

Unique

Unique: 1
Unique (%): 1.5%

Sample

1st row: 1
2nd row: 2
3rd row: 1
4th row: 1
5th row: 3

Common Values

Value | Count | Frequency (%)
1 | 54 | 79.4%
2 | 8 | 11.8%
3 | 3 | 4.4%
5 | 2 | 2.9%
15 | 1 | 1.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
1 | 54 | 79.4%
2 | 8 | 11.8%
3 | 3 | 4.4%
5 | 2 | 2.9%
15 | 1 | 1.5%

최대지하층수
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Memory size: 676.0 B

Value | Count
0 | 64
1 | 3
2 | 1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 1.5%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 64 | 94.1%
1 | 3 | 4.4%
2 | 1 | 1.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 64 | 94.1%
1 | 3 | 4.4%
2 | 1 | 1.5%

주용도
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 4.4%
Missing: 0
Missing (%): 0.0%
Memory size: 676.0 B

Value | Count
제2종근린생활시설 | 61
공동주택 | 4
숙박시설 | 3

Length

Max length: 9
Median length: 9
Mean length: 8.4852941
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 제2종근린생활시설
2nd row: 제2종근린생활시설
3rd row: 제2종근린생활시설
4th row: 제2종근린생활시설
5th row: 제2종근린생활시설

Common Values

Value | Count | Frequency (%)
제2종근린생활시설 | 61 | 89.7%
공동주택 | 4 | 5.9%
숙박시설 | 3 | 4.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
제2종근린생활시설 | 61 | 89.7%
공동주택 | 4 | 5.9%
숙박시설 | 3 | 4.4%

부속용도
Categorical

HIGH CORRELATION 

Distinct: 24
Distinct (%): 35.3%
Missing: 0
Missing (%): 0.0%
Memory size: 676.0 B

Value | Count
사무소 | 22
<NA> | 17
제조업소 | 3
일반음식점 | 3
사무실 | 3
Other values (19) | 20

Length

Max length: 19
Median length: 11
Mean length: 4.3235294
Min length: 2

Unique

Unique: 18
Unique (%): 26.5%

Sample

1st row: <NA>
2nd row: 단독주택
3rd row: 사무소
4th row: 제조업소
5th row: 사무소

Common Values

Value | Count | Frequency (%)
사무소 | 22 | 32.4%
<NA> | 17 | 25.0%
제조업소 | 3 | 4.4%
일반음식점 | 3 | 4.4%
사무실 | 3 | 4.4%
종교집회장 | 2 | 2.9%
휴게음식점 | 1 | 1.5%
공연장 | 1 | 1.5%
금융업소 | 1 | 1.5%
제2종근린생활시설 | 1 | 1.5%
Other values (14) | 14 | 20.6%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
사무소 | 24 | 33.3%
na | 17 | 23.6%
일반음식점 | 5 | 6.9%
제조업소 | 4 | 5.6%
사무실 | 3 | 4.2%
종교집회장 | 2 | 2.8%
생활형숙박시설 | 1 | 1.4%
업무시설+오피스텔/제1종근린생활시설 | 1 | 1.4%
사찰 | 1 | 1.4%
종교집회장-교회 | 1 | 1.4%
Other values (13) | 13 | 18.1%

총주차대수
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 9
Distinct (%): 47.4%
Missing: 49
Missing (%): 72.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.421053
Minimum: 1
Maximum: 64
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 744.0 B

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 2
Median: 3
Q3: 12
95th percentile: 37.9
Maximum: 64
Range: 63
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 15.653597
Coefficient of variation (CV): 1.5021128
Kurtosis: 7.634021
Mean: 10.421053
Median Absolute Deviation (MAD): 2
Skewness: 2.6399709
Sum: 198
Variance: 245.03509
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
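
Common Values

Value | Count | Frequency (%)
2 | 5 | 7.4%
8 | 3 | 4.4%
1 | 3 | 4.4%
18 | 2 | 2.9%
3 | 2 | 2.9%
35 | 1 | 1.5%
64 | 1 | 1.5%
16 | 1 | 1.5%
4 | 1 | 1.5%
(Missing) | 49 | 72.1%

Minimum values

Value | Count | Frequency (%)
1 | 3 | 4.4%
2 | 5 | 7.4%
3 | 2 | 2.9%
4 | 1 | 1.5%
8 | 3 | 4.4%
16 | 1 | 1.5%
18 | 2 | 2.9%
35 | 1 | 1.5%
64 | 1 | 1.5%

Maximum values

Value | Count | Frequency (%)
64 | 1 | 1.5%
35 | 1 | 1.5%
18 | 2 | 2.9%
16 | 1 | 1.5%
8 | 3 | 4.4%
4 | 1 | 1.5%
3 | 2 | 2.9%
2 | 5 | 7.4%
1 | 3 | 4.4%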
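
The summary statistics for 총주차대수 can be rebuilt from its frequency table alone. The sketch below also shows that Distinct (%) is computed over the 19 non-missing rows, while Missing (%) and the frequency column use all 68 rows. That reading is inferred from the reported numbers, not stated by the report.

```python
import statistics

# value -> count, copied from the frequency table above (49 rows are missing).
frequency = {1: 3, 2: 5, 3: 2, 4: 1, 8: 3, 16: 1, 18: 2, 35: 1, 64: 1}
n_rows = 68
n_missing = 49

# Expand the table back into the individual observed values.
observed = sorted(v for v, count in frequency.items() for _ in range(count))

n_observed = len(observed)
total = sum(observed)
mean = statistics.mean(observed)
median = statistics.median(observed)
variance = statistics.variance(observed)          # sample variance (n - 1)
distinct_pct = 100 * len(frequency) / n_observed  # share among non-missing rows
missing_pct = 100 * n_missing / n_rows            # share among all rows

print(n_observed, total, round(mean, 6), median, round(variance, 5))
print(f"distinct {distinct_pct:.1f}%, missing {missing_pct:.1f}%")
```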

가구수
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Memory size: 676.0 B

Value | Count
<NA> | 65
1 | 3

Length

Max length: 4
Median length: 4
Mean length: 3.8676471
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: 1
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 65 | 95.6%
1 | 3 | 4.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 65 | 95.6%
1 | 3 | 4.4%

데이터기준일자
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 676.0 B

Value | Count
2023-09-11 | 68

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-09-11
2nd row: 2023-09-11
3rd row: 2023-09-11
4th row: 2023-09-11
5th row: 2023-09-11

Common Values

Value | Count | Frequency (%)
2023-09-11 | 68 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2023-09-11 | 68 | 100.0%

Interactions

[Four interaction plots; images not included]

Correlations

Variable | 순번 | 연면적 | 허가일 | 착공예정일 | 준공예정일 | 최대지상층수 | 최대지하층수 | 주용도 | 부속용도 | 총주차대수
순번 | 1.000 | 0.940 | 0.981 | 0.890 | 0.836 | 0.423 | 0.000 | 0.239 | 0.445 | 0.000
연면적 | 0.940 | 1.000 | 0.993 | 0.989 | 0.996 | 1.000 | 1.000 | 1.000 | 0.994 | 1.000
허가일 | 0.981 | 0.993 | 1.000 | 0.998 | 0.998 | 0.000 | 0.000 | 0.534 | 0.993 | 1.000
착공예정일 | 0.890 | 0.989 | 0.998 | 1.000 | 0.994 | 0.965 | 0.683 | 1.000 | 1.000 | 1.000
준공예정일 | 0.836 | 0.996 | 0.998 | 0.994 | 1.000 | 0.726 | 0.000 | 1.000 | 0.922 | 1.000
최대지상층수 | 0.423 | 1.000 | 0.000 | 0.965 | 0.726 | 1.000 | 0.425 | 0.688 | 0.938 | 0.878
최대지하층수 | 0.000 | 1.000 | 0.000 | 0.683 | 0.000 | 0.425 | 1.000 | 0.186 | 0.835 | 1.000
주용도 | 0.239 | 1.000 | 0.534 | 1.000 | 1.000 | 0.688 | 0.186 | 1.000 | 1.000 | 0.416
부속용도 | 0.445 | 0.994 | 0.993 | 1.000 | 0.922 | 0.938 | 0.835 | 1.000 | 1.000 | 0.904
총주차대수 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.878 | 1.000 | 0.416 | 0.904 | 1.000

Variable | 가구수 | 최대지하층수 | 최대지상층수 | 부속용도 | 주용도
가구수 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
최대지하층수 | 1.000 | 1.000 | 0.349 | 0.498 | 0.054
최대지상층수 | 1.000 | 0.349 | 1.000 | 0.621 | 0.656
부속용도 | 1.000 | 0.498 | 0.621 | 1.000 | 0.764
주용도 | 1.000 | 0.054 | 0.656 | 0.764 | 1.000

Variable | 순번 | 총주차대수 | 최대지상층수 | 최대지하층수 | 주용도 | 부속용도 | 가구수
순번 | 1.000 | -0.018 | 0.176 | 0.000 | 0.131 | 0.110 | 1.000
총주차대수 | -0.018 | 1.000 | 0.520 | 0.907 | 0.306 | 0.437 | 0.000
최대지상층수 | 0.176 | 0.520 | 1.000 | 0.349 | 0.656 | 0.621 | 1.000
최대지하층수 | 0.000 | 0.907 | 0.349 | 1.000 | 0.054 | 0.498 | 1.000
주용도 | 0.131 | 0.306 | 0.656 | 0.054 | 1.000 | 0.764 | 1.000
부속용도 | 0.110 | 0.437 | 0.621 | 0.498 | 0.764 | 1.000 | 1.000
가구수 | 1.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
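
Nullity correlation can be illustrated as the Pearson correlation between per-column "is missing" flags. The sketch below uses only the first ten rows shown in the Sample section, so its values are illustrative and will not match a heatmap computed on all 68 rows; the function `pearson` and the list names are hypothetical helpers.

```python
import math

# Missing flags (1 = <NA>) for the first ten sample rows of the report.
start_missing = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]    # 착공예정일
finish_missing = [1, 1, 0, 0, 0, 1, 1, 0, 1, 0]   # 준공예정일
parking_missing = [0, 1, 1, 1, 0, 0, 0, 1, 1, 1]  # 총주차대수


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


dates_vs_dates = pearson(start_missing, finish_missing)
dates_vs_parking = pearson(start_missing, parking_missing)
print(round(dates_vs_dates, 3), round(dates_vs_parking, 3))
```

In these ten rows the two date columns are always missing together, which fits their identical missing counts (12 each) in the alerts.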

Sample

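First rows

Index | 순번 | 건축구분 | 대지위치 | 연면적 | 허가일 | 착공예정일 | 준공예정일 | 최대지상층수 | 최대지하층수 | 주용도 | 부속용도 | 총주차대수 | 가구수 | 데이터기준일자
0 | 1 | 신축 | 산청군 | 383.76 | 2023-08-17 | <NA> | <NA> | 1 | 0 | 제2종근린생활시설 | <NA> | 8 | <NA> | 2023-09-11
1 | 2 | 신축 | 산청군 | 145.26 | 2023-07-26 | <NA> | <NA> | 2 | 0 | 제2종근린생활시설 | 단독주택 | <NA> | 1 | 2023-09-11
2 | 3 | 신축 | 산청군 | 156.65 | 2023-07-24 | 2023-08-07 | 2023-11-30 | 1 | 0 | 제2종근린생활시설 | 사무소 | <NA> | <NA> | 2023-09-11
3 | 4 | 신축 | 산청군 | 197.64 | 2023-07-21 | 2023-08-08 | 2024-08-07 | 1 | 0 | 제2종근린생활시설 | 제조업소 | <NA> | <NA> | 2023-09-11
4 | 5 | 신축 | 산청군 | 430.31 | 2023-07-19 | 2023-08-19 | 2023-12-31 | 3 | 0 | 제2종근린생활시설 | 사무소 | 2 | <NA> | 2023-09-11
5 | 6 | 신축 | 산청군 | 300.48 | 2023-07-14 | <NA> | <NA> | 1 | 0 | 제2종근린생활시설 | 일반음식점 | 2 | <NA> | 2023-09-11
6 | 7 | 신축 | 산청군 | 448.13 | 2023-07-11 | <NA> | <NA> | 2 | 0 | 제2종근린생활시설 | 공연장 | 2 | <NA> | 2023-09-11
7 | 8 | 신축 | 산청군 | 98.4 | 2023-07-07 | 2023-08-07 | 2024-08-01 | 1 | 0 | 제2종근린생활시설 | <NA> | <NA> | <NA> | 2023-09-11
8 | 9 | 신축 | 산청군 | 54.28 | 2023-06-29 | <NA> | <NA> | 1 | 0 | 제2종근린생활시설 | 사무소 | <NA> | <NA> | 2023-09-11
9 | 10 | 신축 | 산청군 | 65.92 | 2023-06-19 | 2023-07-05 | 2023-12-31 | 1 | 0 | 제2종근린생활시설 | <NA> | <NA> | <NA> | 2023-09-11

Last rows

Index | 순번 | 건축구분 | 대지위치 | 연면적 | 허가일 | 착공예정일 | 준공예정일 | 최대지상층수 | 최대지하층수 | 주용도 | 부속용도 | 총주차대수 | 가구수 | 데이터기준일자
58 | 59 | 신축 | 산청군 | 384 | 2019-05-02 | 2019-05-13 | 2021-05-11 | 1 | 0 | 제2종근린생활시설 | 사무소 | 8 | <NA> | 2023-09-11
59 | 60 | 신축 | 산청군 | 68.31 | 2019-04-25 | 2020-03-15 | 2021-12-31 | 1 | 0 | 제2종근린생활시설 | <NA> | <NA> | <NA> | 2023-09-11
60 | 61 | 신축 | 산청군 | 74.74 | 2019-04-22 | 2019-04-26 | 2020-03-30 | 1 | 0 | 제2종근린생활시설 | 사무실 | <NA> | <NA> | 2023-09-11
61 | 62 | 신축 | 산청군 | 100 | 2019-04-22 | 2019-04-26 | 2020-03-30 | 1 | 0 | 제2종근린생활시설 | 사무실 | <NA> | <NA> | 2023-09-11
62 | 63 | 신축 | 산청군 | 133.14 | 2019-04-22 | 2019-04-26 | 2020-03-30 | 1 | 0 | 제2종근린생활시설 | 사무실 | <NA> | <NA> | 2023-09-11
63 | 64 | 신축 | 산청군 | 350 | 2019-03-25 | 2021-03-24 | 2022-03-23 | 1 | 0 | 숙박시설 | <NA> | 4 | <NA> | 2023-09-11
64 | 65 | 신축 | 산청군 | 197.93 | 2019-02-01 | 2019-02-12 | 2020-08-31 | 1 | 0 | 제2종근린생활시설 | 일반음식점 외 | <NA> | <NA> | 2023-09-11
65 | 66 | 신축 | 산청군 | 431.72 | 2018-11-22 | 2018-12-05 | 2019-11-30 | 1 | 0 | 제2종근린생활시설 | <NA> | <NA> | <NA> | 2023-09-11
66 | 67 | 신축 | 산청군 | 41.58 | 2018-08-29 | 2018-11-16 | 2019-01-30 | 1 | 0 | 제2종근린생활시설 | 사무소 | <NA> | <NA> | 2023-09-11
67 | 68 | 신축 | 산청군 | 132.35 | 2018-01-09 | 2019-01-09 | 2019-06-20 | 1 | 0 | 제2종근린생활시설 | <NA> | <NA> | <NA> | 2023-09-11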
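
The date columns in the sample rows lend themselves to a derived field such as the planned construction duration (준공예정일 minus 착공예정일). A minimal sketch using a few rows copied from the sample above; `planned_days` and the dictionary layout are illustrative, and rows with `<NA>` dates are represented as `None`:

```python
from datetime import date

# 순번 -> (착공예정일, 준공예정일), copied from the sample rows; None marks <NA>.
schedule = {
    1: (None, None),
    3: ("2023-08-07", "2023-11-30"),
    4: ("2023-08-08", "2024-08-07"),
    68: ("2019-01-09", "2019-06-20"),
}


def planned_days(start, finish):
    """Days between scheduled start and scheduled completion, or None if unknown."""
    if start is None or finish is None:
        return None
    return (date.fromisoformat(finish) - date.fromisoformat(start)).days


durations = {key: planned_days(*dates) for key, dates in schedule.items()}
print(durations)
```

Keeping missing dates as `None` rather than dropping the rows preserves the 17.6% missing rate reported for both date columns.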