Overview

Dataset statistics

Number of variables: 17
Number of observations: 78
Missing cells: 233
Missing cells (%): 17.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 10.9 KiB
Average record size in memory: 143.7 B
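The percentage and size figures in this block follow from the raw counts. A minimal sketch in Python, using only numbers reported in this overview (the variable names are illustrative and not part of the report), recomputes them:

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 17
n_observations = 78
missing_cells = 233
avg_record_bytes = 143.7

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

# Share of cells that are empty, as a percentage.
missing_pct = 100 * missing_cells / total_cells

# Average record size times row count, converted from bytes to KiB.
total_kib = avg_record_bytes * n_observations / 1024

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}")
print(f"Total size in memory: {total_kib:.1f} KiB")
```

Rounded to one decimal place, both results agree with the values listed in the overview.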

Variable types

Categorical: 6
DateTime: 4
Text: 3
Numeric: 4

Dataset

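Description: This dataset provides information on new buildings in Sacheon City that have not yet been completed: construction type, permit date, construction start date (or scheduled date), completion date (or scheduled date), site location, total floor area, maximum number of above-ground floors, maximum number of basement floors, main use, ancillary use, total parking spaces, number of housing units (세대수), number of households (가구수), and contractor name.
Author: Sacheon-si, Gyeongsangnam-do (경상남도 사천시)
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15121500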

Alerts

시행사 has constant value "미선정" (Constant)
데이터기준일자 has constant value "2023-08-30" (Constant)
가구수 is highly overall correlated with 연면적(제곱미터) and 2 other fields (High correlation)
지하층수 is highly overall correlated with 연면적(제곱미터) and 4 other fields (High correlation)
연면적(제곱미터) is highly overall correlated with 지상층수 and 5 other fields (High correlation)
지상층수 is highly overall correlated with 연면적(제곱미터) and 4 other fields (High correlation)
총주차대수 is highly overall correlated with 연면적(제곱미터) and 5 other fields (High correlation)
세대수 is highly overall correlated with 연면적(제곱미터) and 3 other fields (High correlation)
시공사 is highly overall correlated with 연면적(제곱미터) and 2 other fields (High correlation)
가구수 is highly imbalanced (63.8%) (Imbalance)
시공사 is highly imbalanced (68.4%) (Imbalance)
착공일(예정일) has 53 (67.9%) missing values (Missing)
준공일(예정일) has 69 (88.5%) missing values (Missing)
부속용도 has 22 (28.2%) missing values (Missing)
총주차대수 has 24 (30.8%) missing values (Missing)
세대수 has 64 (82.1%) missing values (Missing)
공사명 has 1 (1.3%) missing values (Missing)
총주차대수 has 3 (3.8%) zeros (Zeros)
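
The missing-value alerts can be cross-checked against the overview. The sketch below, in Python, recomputes each percentage from the count and the 78 observations, then sums the counts. The dictionary is transcribed from the alerts in this report; the variable names are illustrative. That the sum equals the 233 missing cells in the overview suggests, though the report does not say so, that the literal "<NA>" categories shown for 지하층수 and 가구수 are not counted as missing.

```python
# Missing-value counts transcribed from the "Missing" alerts in this report.
n_observations = 78
missing_by_column = {
    "착공일(예정일)": 53,
    "준공일(예정일)": 69,
    "부속용도": 22,
    "총주차대수": 24,
    "세대수": 64,
    "공사명": 1,
}

# Percentage of rows with a missing value, per column.
missing_pct = {
    column: round(100 * count / n_observations, 1)
    for column, count in missing_by_column.items()
}

# Total number of missing cells across the flagged columns.
total_missing = sum(missing_by_column.values())

for column, pct in missing_pct.items():
    print(f"{column}: {missing_by_column[column]} ({pct}%)")
print(f"Total missing cells: {total_missing}")
```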

Reproduction

Analysis started: 2023-12-10 23:44:42.512804
Analysis finished: 2023-12-10 23:44:45.943041
Duration: 3.43 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
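
The reported duration is the difference between the two analysis timestamps. A small sketch using Python's standard `datetime` module (the format string is an assumption about how the timestamps are written) reproduces it:

```python
from datetime import datetime

# Timestamp layout assumed from the values shown in this report.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-10 23:44:42.512804", TIMESTAMP_FORMAT)
finished = datetime.strptime("2023-12-10 23:44:45.943041", TIMESTAMP_FORMAT)

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()

print(f"Duration: {duration_s:.2f} seconds")
```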

Variables

건축구분
Categorical

Distinct4
Distinct (%)5.1%
Missing0
Missing (%)0.0%
Memory size756.0 B
신축 43
용도변경 30
증축 4
대수선 1

Length

Max length4
Median length2
Mean length2.7820513
Min length2

Unique

Unique1 ?
Unique (%)1.3%

Sample

1st row신축
2nd row신축
3rd row신축
4th row신축
5th row신축

Common Values

Value | Count | Frequency (%)
신축 | 43 | 55.1%
용도변경 | 30 | 38.5%
증축 | 4 | 5.1%
대수선 | 1 | 1.3%

Length

Histogram of lengths of the category


허가일
Date

Distinct76
Distinct (%)97.4%
Missing0
Missing (%)0.0%
Memory size756.0 B
Minimum2018-01-05 00:00:00
Maximum2023-07-27 00:00:00
Histogram with fixed size bins (bins=50)

착공일(예정일)
Date

MISSING 

Distinct25
Distinct (%)100.0%
Missing53
Missing (%)67.9%
Memory size756.0 B
Minimum2018-05-03 00:00:00
Maximum2024-02-04 00:00:00
Histogram with fixed size bins (bins=25)
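
A Distinct (%) of 100.0% next to 67.9% missing is consistent with the percentage being taken over non-missing rows only. This is inferred from the numbers in this report, not from the library's documentation. The sketch below checks that reading against several variables; `distinct_pct` is a hypothetical helper, not a ydata-profiling function.

```python
def distinct_pct(distinct: int, n_rows: int, missing: int) -> float:
    """Distinct values as a percentage of the non-missing rows."""
    present = n_rows - missing
    return round(100 * distinct / present, 1)


N_ROWS = 78

# (distinct, missing) pairs transcribed from the variable blocks of this report.
reported = {
    "착공일(예정일)": (25, 53),
    "준공일(예정일)": (9, 69),
    "부속용도": (35, 22),
    "총주차대수": (30, 24),
    "공사명": (74, 1),
}

for name, (distinct, missing) in reported.items():
    print(name, distinct_pct(distinct, N_ROWS, missing))
```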

준공일(예정일)
Date

MISSING 

Distinct9
Distinct (%)100.0%
Missing69
Missing (%)88.5%
Memory size756.0 B
Minimum2022-11-30 00:00:00
Maximum2026-02-28 00:00:00
Histogram with fixed size bins (bins=9)

대지위치
Text

Distinct76
Distinct (%)97.4%
Missing0
Missing (%)0.0%
Memory size756.0 B

Length

Max length28
Median length25
Mean length21.538462
Min length16
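
The mean length of a text variable appears to equal its total character count divided by the number of non-missing rows. The sketch below checks this against the "Total characters" and "Missing" figures that this report gives for its three text variables; the relationship is inferred from those numbers, and `mean_length` is a hypothetical helper.

```python
def mean_length(total_characters: int, n_rows: int, missing: int) -> float:
    """Average string length over the rows that actually hold a value."""
    return total_characters / (n_rows - missing)


N_ROWS = 78

# (total characters, missing rows) transcribed from this report.
text_variables = {
    "대지위치": (1680, 0),
    "부속용도": (421, 22),
    "공사명": (509, 1),
}

for name, (total_characters, missing) in text_variables.items():
    print(name, round(mean_length(total_characters, N_ROWS, missing), 7))
```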

Characters and Unicode

Total characters1680
Distinct characters77
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
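
As an illustration of these character properties, the sketch below classifies the characters of the first sample address with Python's standard `unicodedata` module. This is a stand-in for whatever the report generator uses internally. The short codes `Lo`, `Zs`, `Nd` and `Pd` correspond to the Other Letter, Space Separator, Decimal Number and Dash Punctuation categories listed in this section.

```python
import unicodedata
from collections import Counter

# First sample row of the 대지위치 variable.
sample = "경상남도 사천시 동금동 84-1"

# Count characters by Unicode general category.
category_counts = Counter(unicodedata.category(ch) for ch in sample)

for category, count in category_counts.most_common():
    print(category, count)
```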

Unique

Unique74 ?
Unique (%)94.9%

Sample

1st row경상남도 사천시 동금동 84-1
2nd row경상남도 사천시 정동면 예수리 468
3rd row경상남도 사천시 정동면 예수리 817
4th row경상남도 사천시 용현면 송지리 산 25
5th row경상남도 사천시 사남면 화전리 산 166-6
ValueCountFrequency (%)
경상남도 78
20.2%
사천시 78
20.2%
사천읍 14
 
3.6%
외1필지 13
 
3.4%
용현면 10
 
2.6%
수석리 8
 
2.1%
외2필지 8
 
2.1%
실안동 7
 
1.8%
7
 
1.8%
사남면 5
 
1.3%
Other values (118) 159
41.1%

Most occurring characters

ValueCountFrequency (%)
309
18.4%
99
 
5.9%
95
 
5.7%
83
 
4.9%
79
 
4.7%
78
 
4.6%
78
 
4.6%
78
 
4.6%
1 71
 
4.2%
- 60
 
3.6%
Other values (67) 650
38.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 988
58.8%
Decimal Number 323
 
19.2%
Space Separator 309
 
18.4%
Dash Punctuation 60
 
3.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
99
 
10.0%
95
 
9.6%
83
 
8.4%
79
 
8.0%
78
 
7.9%
78
 
7.9%
78
 
7.9%
51
 
5.2%
46
 
4.7%
29
 
2.9%
Other values (55) 272
27.5%
Decimal Number
ValueCountFrequency (%)
1 71
22.0%
2 46
14.2%
3 35
10.8%
4 31
9.6%
8 30
9.3%
5 29
9.0%
6 27
 
8.4%
7 21
 
6.5%
9 18
 
5.6%
0 15
 
4.6%
Space Separator
ValueCountFrequency (%)
309
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 60
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 988
58.8%
Common 692
41.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
99
 
10.0%
95
 
9.6%
83
 
8.4%
79
 
8.0%
78
 
7.9%
78
 
7.9%
78
 
7.9%
51
 
5.2%
46
 
4.7%
29
 
2.9%
Other values (55) 272
27.5%
Common
ValueCountFrequency (%)
309
44.7%
1 71
 
10.3%
- 60
 
8.7%
2 46
 
6.6%
3 35
 
5.1%
4 31
 
4.5%
8 30
 
4.3%
5 29
 
4.2%
6 27
 
3.9%
7 21
 
3.0%
Other values (2) 33
 
4.8%

Most occurring blocks

ValueCountFrequency (%)
Hangul 988
58.8%
ASCII 692
41.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
309
44.7%
1 71
 
10.3%
- 60
 
8.7%
2 46
 
6.6%
3 35
 
5.1%
4 31
 
4.5%
8 30
 
4.3%
5 29
 
4.2%
6 27
 
3.9%
7 21
 
3.0%
Other values (2) 33
 
4.8%
Hangul
ValueCountFrequency (%)
99
 
10.0%
95
 
9.6%
83
 
8.4%
79
 
8.0%
78
 
7.9%
78
 
7.9%
78
 
7.9%
51
 
5.2%
46
 
4.7%
29
 
2.9%
Other values (55) 272
27.5%

연면적(제곱미터)
Real number (ℝ)

HIGH CORRELATION 

Distinct75
Distinct (%)96.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean11587.234
Minimum32.37
Maximum173946.39
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size834.0 B

Quantile statistics

Minimum32.37
5-th percentile83.1615
Q1197.23
median473.03
Q31834.9325
95-th percentile75701.156
Maximum173946.39
Range173914.02
Interquartile range (IQR)1637.7025

Descriptive statistics

Standard deviation32270.357
Coefficient of variation (CV)2.7849922
Kurtosis12.205599
Mean11587.234
Median Absolute Deviation (MAD)345.63
Skewness3.4421022
Sum903804.26
Variance1.0413759 × 10^9
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
197.23 2
 
2.6%
139.31 2
 
2.6%
3887.9 2
 
2.6%
54814.71 1
 
1.3%
98.98 1
 
1.3%
101.88 1
 
1.3%
29208.22 1
 
1.3%
197.41 1
 
1.3%
133.08 1
 
1.3%
2707.94 1
 
1.3%
Other values (65) 65
83.3%
ValueCountFrequency (%)
32.37 1
1.3%
70.8 1
1.3%
74.38 1
1.3%
77.05 1
1.3%
84.24 1
1.3%
98.98 1
1.3%
101.88 1
1.3%
122.31 1
1.3%
132.49 1
1.3%
132.72 1
1.3%
ValueCountFrequency (%)
173946.39 1
1.3%
146888.65 1
1.3%
99262.43 1
1.3%
97094.22 1
1.3%
71925.91 1
1.3%
71802.84 1
1.3%
68307.61 1
1.3%
54814.71 1
1.3%
29208.22 1
1.3%
21115.95 1
1.3%
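
Several of the statistics reported for 연면적(제곱미터) are derived from others. The sketch below recomputes the range, interquartile range, coefficient of variation and variance from the reported minimum, maximum, quartiles, mean and standard deviation. It uses the standard definitions of these quantities, and the variable names are illustrative.

```python
# Values transcribed from the 연면적(제곱미터) block of this report.
minimum = 32.37
maximum = 173946.39
q1 = 197.23
q3 = 1834.9325
mean = 11587.234
std_dev = 32270.357

value_range = maximum - minimum      # spread between the extremes
iqr = q3 - q1                        # spread of the middle 50%
cv = std_dev / mean                  # dispersion relative to the mean
variance = std_dev ** 2              # square of the standard deviation

print(f"Range: {value_range:.2f}")
print(f"IQR: {iqr:.4f}")
print(f"CV: {cv:.5f}")
print(f"Variance: {variance:.4e}")
```

A coefficient of variation well above 1, together with a mean far larger than the median of 473.03, reflects how a handful of very large apartment projects dominate the distribution.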

지상층수
Real number (ℝ)

HIGH CORRELATION 

Distinct17
Distinct (%)21.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean5.6282051
Minimum1
Maximum49
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size834.0 B

Quantile statistics

Minimum1
5-th percentile1
Q11
median2
Q34
95-th percentile25.6
Maximum49
Range48
Interquartile range (IQR)3

Descriptive statistics

Standard deviation8.9168001
Coefficient of variation (CV)1.5843062
Kurtosis8.9661527
Mean5.6282051
Median Absolute Deviation (MAD)1
Skewness2.9132018
Sum439
Variance79.509324
MonotonicityNot monotonic
Histogram with fixed size bins (bins=17)
ValueCountFrequency (%)
1 24
30.8%
2 18
23.1%
3 12
15.4%
4 5
 
6.4%
6 4
 
5.1%
5 2
 
2.6%
18 2
 
2.6%
20 2
 
2.6%
10 1
 
1.3%
33 1
 
1.3%
Other values (7) 7
 
9.0%
ValueCountFrequency (%)
1 24
30.8%
2 18
23.1%
3 12
15.4%
4 5
 
6.4%
5 2
 
2.6%
6 4
 
5.1%
7 1
 
1.3%
9 1
 
1.3%
10 1
 
1.3%
16 1
 
1.3%
ValueCountFrequency (%)
49 1
1.3%
35 1
1.3%
33 1
1.3%
29 1
1.3%
25 1
1.3%
20 2
2.6%
18 2
2.6%
16 1
1.3%
10 1
1.3%
9 1
1.3%

지하층수
Categorical

HIGH CORRELATION 

Distinct6
Distinct (%)7.7%
Missing0
Missing (%)0.0%
Memory size756.0 B
0 29
<NA> 28
1 13
2 6
3 1

Length

Max length4
Median length1
Mean length2.0769231
Min length1

Unique

Unique2 ?
Unique (%)2.6%

Sample

1st row3
2nd row2
3rd row2
4th row0
5th row1

Common Values

Value | Count | Frequency (%)
0 | 29 | 37.2%
<NA> | 28 | 35.9%
1 | 13 | 16.7%
2 | 6 | 7.7%
3 | 1 | 1.3%
4 | 1 | 1.3%

Length

Histogram of lengths of the category


주용도
Categorical

Distinct4
Distinct (%)5.1%
Missing0
Missing (%)0.0%
Memory size756.0 B
제2종근린생활시설 52
공동주택 13
숙박시설 11
운동시설 2

Length

Max length9
Median length9
Mean length7.3333333
Min length4

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row공동주택
2nd row공동주택
3rd row공동주택
4th row공동주택
5th row공동주택

Common Values

Value | Count | Frequency (%)
제2종근린생활시설 | 52 | 66.7%
공동주택 | 13 | 16.7%
숙박시설 | 11 | 14.1%
운동시설 | 2 | 2.6%

Length

Histogram of lengths of the category


부속용도
Text

MISSING 

Distinct35
Distinct (%)62.5%
Missing22
Missing (%)28.2%
Memory size756.0 B

Length

Max length42
Median length22
Mean length7.5178571
Min length1

Characters and Unicode

Total characters421
Distinct characters92
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique30 ?
Unique (%)53.6%

Sample

1st row아파트
2nd row아파트
3rd row아파트
4th row아파트
5th row아파트
ValueCountFrequency (%)
일반음식점 11
13.9%
10
 
12.7%
사무소 9
 
11.4%
아파트 9
 
11.4%
단독주택 6
 
7.6%
외2건 2
 
2.5%
근린생활시설 2
 
2.5%
외3 1
 
1.3%
1
 
1.3%
3 1
 
1.3%
Other values (27) 27
34.2%

Most occurring characters

ValueCountFrequency (%)
23
 
5.5%
20
 
4.8%
17
 
4.0%
17
 
4.0%
16
 
3.8%
16
 
3.8%
15
 
3.6%
15
 
3.6%
15
 
3.6%
12
 
2.9%
Other values (82) 255
60.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 356
84.6%
Space Separator 23
 
5.5%
Other Punctuation 14
 
3.3%
Decimal Number 12
 
2.9%
Close Punctuation 7
 
1.7%
Open Punctuation 7
 
1.7%
Dash Punctuation 1
 
0.2%
Math Symbol 1
 
0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
20
 
5.6%
17
 
4.8%
17
 
4.8%
16
 
4.5%
16
 
4.5%
15
 
4.2%
15
 
4.2%
15
 
4.2%
12
 
3.4%
12
 
3.4%
Other values (72) 201
56.5%
Decimal Number
ValueCountFrequency (%)
2 6
50.0%
1 4
33.3%
3 2
 
16.7%
Other Punctuation
ValueCountFrequency (%)
, 12
85.7%
. 2
 
14.3%
Space Separator
ValueCountFrequency (%)
23
100.0%
Close Punctuation
ValueCountFrequency (%)
) 7
100.0%
Open Punctuation
ValueCountFrequency (%)
( 7
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 356
84.6%
Common 65
 
15.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
20
 
5.6%
17
 
4.8%
17
 
4.8%
16
 
4.5%
16
 
4.5%
15
 
4.2%
15
 
4.2%
15
 
4.2%
12
 
3.4%
12
 
3.4%
Other values (72) 201
56.5%
Common
ValueCountFrequency (%)
23
35.4%
, 12
18.5%
) 7
 
10.8%
( 7
 
10.8%
2 6
 
9.2%
1 4
 
6.2%
3 2
 
3.1%
. 2
 
3.1%
- 1
 
1.5%
~ 1
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 356
84.6%
ASCII 65
 
15.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
23
35.4%
, 12
18.5%
) 7
 
10.8%
( 7
 
10.8%
2 6
 
9.2%
1 4
 
6.2%
3 2
 
3.1%
. 2
 
3.1%
- 1
 
1.5%
~ 1
 
1.5%
Hangul
ValueCountFrequency (%)
20
 
5.6%
17
 
4.8%
17
 
4.8%
16
 
4.5%
16
 
4.5%
15
 
4.2%
15
 
4.2%
15
 
4.2%
12
 
3.4%
12
 
3.4%
Other values (72) 201
56.5%

총주차대수
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct30
Distinct (%)55.6%
Missing24
Missing (%)30.8%
Infinite0
Infinite (%)0.0%
Mean143.42593
Minimum0
Maximum1581
Zeros3
Zeros (%)3.8%
Negative0
Negative (%)0.0%
Memory size834.0 B

Quantile statistics

Minimum0
5-th percentile0.65
Q12
median6
Q344.5
95-th percentile909.5
Maximum1581
Range1581
Interquartile range (IQR)42.5

Descriptive statistics

Standard deviation322.06724
Coefficient of variation (CV)2.2455301
Kurtosis7.9558774
Mean143.42593
Median Absolute Deviation (MAD)5
Skewness2.7689625
Sum7745
Variance103727.31
MonotonicityNot monotonic
Histogram with fixed size bins (bins=30)
ValueCountFrequency (%)
2 10
12.8%
1 6
 
7.7%
3 4
 
5.1%
0 3
 
3.8%
9 2
 
2.6%
5 2
 
2.6%
8 2
 
2.6%
6 2
 
2.6%
7 2
 
2.6%
4 1
 
1.3%
Other values (20) 20
25.6%
(Missing) 24
30.8%
ValueCountFrequency (%)
0 3
 
3.8%
1 6
7.7%
2 10
12.8%
3 4
 
5.1%
4 1
 
1.3%
5 2
 
2.6%
6 2
 
2.6%
7 2
 
2.6%
8 2
 
2.6%
9 2
 
2.6%
ValueCountFrequency (%)
1581 1
1.3%
997 1
1.3%
942 1
1.3%
892 1
1.3%
631 1
1.3%
582 1
1.3%
554 1
1.3%
450 1
1.3%
332 1
1.3%
202 1
1.3%

세대수
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct14
Distinct (%)100.0%
Missing64
Missing (%)82.1%
Infinite0
Infinite (%)0.0%
Mean381.07143
Minimum3
Maximum1047
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size834.0 B

Quantile statistics

Minimum3
5-th percentile4.95
Q133.75
median397
Q3641.5
95-th percentile853.95
Maximum1047
Range1044
Interquartile range (IQR)607.75

Descriptive statistics

Standard deviation344.57744
Coefficient of variation (CV)0.9042332
Kurtosis-0.9825553
Mean381.07143
Median Absolute Deviation (MAD)341
Skewness0.39545911
Sum5335
Variance118733.61
MonotonicityNot monotonic
Histogram with fixed size bins (bins=14)
ValueCountFrequency (%)
335 1
 
1.3%
745 1
 
1.3%
511 1
 
1.3%
750 1
 
1.3%
63 1
 
1.3%
535 1
 
1.3%
459 1
 
1.3%
1047 1
 
1.3%
677 1
 
1.3%
6 1
 
1.3%
Other values (4) 4
 
5.1%
(Missing) 64
82.1%
ValueCountFrequency (%)
3 1
1.3%
6 1
1.3%
8 1
1.3%
24 1
1.3%
63 1
1.3%
172 1
1.3%
335 1
1.3%
459 1
1.3%
511 1
1.3%
535 1
1.3%
ValueCountFrequency (%)
1047 1
1.3%
750 1
1.3%
745 1
1.3%
677 1
1.3%
535 1
1.3%
511 1
1.3%
459 1
1.3%
335 1
1.3%
172 1
1.3%
63 1
1.3%

가구수
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct3
Distinct (%)3.8%
Missing0
Missing (%)0.0%
Memory size756.0 B
<NA> 69
1 8
2 1

Length

Max length4
Median length4
Mean length3.6538462
Min length1

Unique

Unique1 ?
Unique (%)1.3%

Sample

1st row<NA>
2nd row<NA>
3rd row<NA>
4th row<NA>
5th row<NA>

Common Values

Value | Count | Frequency (%)
<NA> | 69 | 88.5%
1 | 8 | 10.3%
2 | 1 | 1.3%
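
The imbalance percentages in the alerts (63.8% for 가구수, 68.4% for 시공사) can be reproduced from the category counts. The sketch below assumes the score is one minus the Shannon entropy of the category frequencies divided by the logarithm of the number of categories. That formula matches both reported values, but this report does not itself state how the score is computed. `imbalance_score` is a hypothetical helper.

```python
import math


def imbalance_score(counts: list[int]) -> float:
    """Return 1 - H / log(k): 0 for a uniform split, close to 1 when one class dominates."""
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    # Shannon entropy of the observed class frequencies (natural log).
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)


# 가구수: "<NA>" 69, "1" 8, "2" 1 (from the Common Values table).
household_counts = [69, 8, 1]
# 시공사: "미선정" 66 plus 12 contractors that each appear once.
contractor_counts = [66] + [1] * 12

print(f"가구수 imbalance: {100 * imbalance_score(household_counts):.1f}%")
print(f"시공사 imbalance: {100 * imbalance_score(contractor_counts):.1f}%")
```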

Length

Histogram of lengths of the category


공사명
Text

MISSING 

Distinct74
Distinct (%)96.1%
Missing1
Missing (%)1.3%
Memory size756.0 B

Length

Max length28
Median length3
Mean length6.6103896
Min length3

Characters and Unicode

Total characters509
Distinct characters153
Distinct categories8 ?
Distinct scripts3 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique71 ?
Unique (%)92.2%

Sample

1st row사천정동2 지역주택조합아파트
2nd row사천시 정동면 예수리 아파트
3rd row사천 송지지역주택조합 아파트
4th row사천시 사남면 화전리 프리미엄 다세대주택
5th row경남 사천대곡지구 A-1블럭 아파트
ValueCountFrequency (%)
1 11
 
9.1%
11
 
9.1%
아파트 4
 
3.3%
차익순 3
 
2.5%
사천시 3
 
2.5%
사천 3
 
2.5%
경남 2
 
1.7%
강석경 2
 
1.7%
옥경희 2
 
1.7%
재단법인기독교대한성결교회유지재단 2
 
1.7%
Other values (78) 78
64.5%

Most occurring characters

ValueCountFrequency (%)
44
 
8.6%
24
 
4.7%
19
 
3.7%
14
 
2.8%
13
 
2.6%
1 13
 
2.6%
12
 
2.4%
12
 
2.4%
11
 
2.2%
10
 
2.0%
Other values (143) 337
66.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 440
86.4%
Space Separator 44
 
8.6%
Decimal Number 14
 
2.8%
Close Punctuation 4
 
0.8%
Open Punctuation 4
 
0.8%
Uppercase Letter 1
 
0.2%
Dash Punctuation 1
 
0.2%
Lowercase Letter 1
 
0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
24
 
5.5%
19
 
4.3%
14
 
3.2%
13
 
3.0%
12
 
2.7%
12
 
2.7%
11
 
2.5%
10
 
2.3%
9
 
2.0%
9
 
2.0%
Other values (135) 307
69.8%
Decimal Number
ValueCountFrequency (%)
1 13
92.9%
2 1
 
7.1%
Space Separator
ValueCountFrequency (%)
44
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4
100.0%
Open Punctuation
ValueCountFrequency (%)
( 4
100.0%
Uppercase Letter
ValueCountFrequency (%)
A 1
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%
Lowercase Letter
ValueCountFrequency (%)
e 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 440
86.4%
Common 67
 
13.2%
Latin 2
 
0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
24
 
5.5%
19
 
4.3%
14
 
3.2%
13
 
3.0%
12
 
2.7%
12
 
2.7%
11
 
2.5%
10
 
2.3%
9
 
2.0%
9
 
2.0%
Other values (135) 307
69.8%
Common
ValueCountFrequency (%)
44
65.7%
1 13
 
19.4%
) 4
 
6.0%
( 4
 
6.0%
- 1
 
1.5%
2 1
 
1.5%
Latin
ValueCountFrequency (%)
A 1
50.0%
e 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 440
86.4%
ASCII 69
 
13.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
44
63.8%
1 13
 
18.8%
) 4
 
5.8%
( 4
 
5.8%
A 1
 
1.4%
- 1
 
1.4%
e 1
 
1.4%
2 1
 
1.4%
Hangul
ValueCountFrequency (%)
24
 
5.5%
19
 
4.3%
14
 
3.2%
13
 
3.0%
12
 
2.7%
12
 
2.7%
11
 
2.5%
10
 
2.3%
9
 
2.0%
9
 
2.0%
Other values (135) 307
69.8%

시행사
Categorical

CONSTANT 

Distinct1
Distinct (%)1.3%
Missing0
Missing (%)0.0%
Memory size756.0 B
미선정
78 

Length

Max length3
Median length3
Mean length3
Min length3

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row미선정
2nd row미선정
3rd row미선정
4th row미선정
5th row미선정

Common Values

ValueCountFrequency (%)
미선정 78
100.0%

Length

Histogram of lengths of the category


시공사
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct13
Distinct (%)16.7%
Missing0
Missing (%)0.0%
Memory size756.0 B
미선정
66 
(주)한양건설 안순걸
 
1
파인건설(주) 남윤광
 
1
동문건설 이상주
 
1
디엘건설 주식회사 곽수윤
 
1
Other values (8)

Length

Max length13
Median length3
Mean length3.9615385
Min length3

Unique

Unique12 ?
Unique (%)15.4%

Sample

1st row미선정
2nd row(주)한양건설 안순걸
3rd row파인건설(주) 남윤광
4th row미선정
5th row미선정

Common Values

Value | Count | Frequency (%)
미선정 | 66 | 84.6%
(주)한양건설 안순걸 | 1 | 1.3%
파인건설(주) 남윤광 | 1 | 1.3%
동문건설 이상주 | 1 | 1.3%
디엘건설 주식회사 곽수윤 | 1 | 1.3%
디엘이앤씨(주) 마창민 | 1 | 1.3%
토마건설주식회사 | 1 | 1.3%
청아건설 주식회사 | 1 | 1.3%
광득건설(주) | 1 | 1.3%
프라임건설(주) | 1 | 1.3%
Other values (3) | 3 | 3.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
미선정 66
77.6%
주식회사 2
 
2.4%
디엘이앤씨(주 1
 
1.2%
주)강민종합건설 1
 
1.2%
한웅건설(주 1
 
1.2%
프라임건설(주 1
 
1.2%
광득건설(주 1
 
1.2%
청아건설 1
 
1.2%
토마건설주식회사 1
 
1.2%
마창민 1
 
1.2%
Other values (9) 9
 
10.6%

데이터기준일자
Date

CONSTANT 

Distinct1
Distinct (%)1.3%
Missing0
Missing (%)0.0%
Memory size756.0 B
Minimum2023-08-30 00:00:00
Maximum2023-08-30 00:00:00
Histogram with fixed size bins (bins=1)

Correlations

 | 건축구분 | 허가일 | 착공일(예정일) | 준공일(예정일) | 대지위치 | 연면적(제곱미터) | 지상층수 | 지하층수 | 주용도 | 부속용도 | 총주차대수 | 세대수 | 가구수 | 공사명 | 시공사
건축구분 | 1.000 | 0.984 | 1.000 | NaN | 1.000 | 0.000 | 0.164 | 0.473 | 0.511 | 0.845 | 0.000 | 0.000 | 0.114 | 1.000 | 0.386
허가일 | 0.984 | 1.000 | 1.000 | 1.000 | 0.999 | 1.000 | 1.000 | 1.000 | 0.933 | 1.000 | 1.000 | 1.000 | 1.000 | 0.998 | 0.000
착공일(예정일) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | NaN | 1.000 | 1.000
준공일(예정일) | NaN | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | NaN | 1.000 | 1.000 | 1.000 | NaN | 1.000 | 1.000
대지위치 | 1.000 | 0.999 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.984 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
연면적(제곱미터) | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.967 | 0.785 | 0.522 | 0.000 | 0.995 | 0.965 | NaN | 1.000 | 0.902
지상층수 | 0.164 | 1.000 | 1.000 | 1.000 | 1.000 | 0.967 | 1.000 | 0.855 | 0.631 | 0.594 | 0.981 | 0.842 | 0.000 | 1.000 | 0.805
지하층수 | 0.473 | 1.000 | 1.000 | 1.000 | 1.000 | 0.785 | 0.855 | 1.000 | 0.354 | 0.000 | 0.816 | 0.881 | NaN | 1.000 | 0.725
주용도 | 0.511 | 0.933 | 1.000 | NaN | 1.000 | 0.522 | 0.631 | 0.354 | 1.000 | 0.990 | 0.469 | 0.000 | 0.000 | 1.000 | 0.368
부속용도 | 0.845 | 1.000 | 1.000 | 1.000 | 0.984 | 0.000 | 0.594 | 0.000 | 0.990 | 1.000 | 0.000 | 0.000 | 1.000 | 0.984 | 0.000
총주차대수 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.995 | 0.981 | 0.816 | 0.469 | 0.000 | 1.000 | 1.000 | NaN | 1.000 | 0.857
세대수 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.965 | 0.842 | 0.881 | 0.000 | 0.000 | 1.000 | 1.000 | NaN | 1.000 | 0.822
가구수 | 0.114 | 1.000 | NaN | NaN | 1.000 | NaN | 0.000 | NaN | 0.000 | 1.000 | NaN | NaN | 1.000 | 1.000 | 0.000
공사명 | 1.000 | 0.998 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.984 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
시공사 | 0.386 | 0.000 | 1.000 | 1.000 | 1.000 | 0.902 | 0.805 | 0.725 | 0.368 | 0.000 | 0.857 | 0.822 | 0.000 | 1.000 | 1.000

 | 가구수 | 지하층수 | 주용도 | 시공사 | 건축구분
가구수 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000
지하층수 | 1.000 | 1.000 | 0.290 | 0.469 | 0.395
주용도 | 0.000 | 0.290 | 1.000 | 0.204 | 0.218
시공사 | 0.000 | 0.469 | 0.204 | 1.000 | 0.215
건축구분 | 0.000 | 0.395 | 0.218 | 0.215 | 1.000

 | 연면적(제곱미터) | 지상층수 | 총주차대수 | 세대수 | 건축구분 | 지하층수 | 주용도 | 가구수 | 시공사
연면적(제곱미터) | 1.000 | 0.791 | 0.902 | 0.987 | 0.000 | 0.643 | 0.378 | 1.000 | 0.681
지상층수 | 0.791 | 1.000 | 0.701 | 0.668 | 0.000 | 0.753 | 0.466 | 0.000 | 0.538
총주차대수 | 0.902 | 0.701 | 1.000 | 0.973 | 0.000 | 0.684 | 0.327 | 1.000 | 0.582
세대수 | 0.987 | 0.668 | 0.973 | 1.000 | 0.000 | 0.606 | 0.000 | 0.000 | 0.313
건축구분 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.395 | 0.218 | 0.000 | 0.215
지하층수 | 0.643 | 0.753 | 0.684 | 0.606 | 0.395 | 1.000 | 0.290 | 1.000 | 0.469
주용도 | 0.378 | 0.466 | 0.327 | 0.000 | 0.218 | 0.290 | 1.000 | 0.000 | 0.204
가구수 | 1.000 | 0.000 | 1.000 | 0.000 | 0.000 | 1.000 | 0.000 | 1.000 | 0.000
시공사 | 0.681 | 0.538 | 0.582 | 0.313 | 0.215 | 0.469 | 0.204 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

Row | 건축구분 | 허가일 | 착공일(예정일) | 준공일(예정일) | 대지위치 | 연면적(제곱미터) | 지상층수 | 지하층수 | 주용도 | 부속용도 | 총주차대수 | 세대수 | 가구수 | 공사명 | 시행사 | 시공사 | 데이터기준일자
0 | 신축 | 2018-02-05 | 2024-02-04 | 2024-09-30 | 경상남도 사천시 동금동 84-1 | 54814.71 | 35 | 3 | 공동주택 | 아파트 | 450 | 335 | <NA> | <NA> | 미선정 | 미선정 | 2023-08-30
1 | 신축 | 2018-03-22 | 2020-05-25 | 2022-11-30 | 경상남도 사천시 정동면 예수리 468 | 97094.22 | 20 | 2 | 공동주택 | 아파트 | 942 | 745 | <NA> | 사천정동2 지역주택조합아파트 | 미선정 | (주)한양건설 안순걸 | 2023-08-30
2 | 신축 | 2019-12-30 | 2021-11-09 | 2024-03-30 | 경상남도 사천시 정동면 예수리 817 | 71802.84 | 16 | 2 | 공동주택 | 아파트 | 554 | 511 | <NA> | 사천시 정동면 예수리 아파트 | 미선정 | 파인건설(주) 남윤광 | 2023-08-30
3 | 신축 | 2021-02-02 | 2021-03-02 | 2023-11-30 | 경상남도 사천시 용현면 송지리 산 25 | 99262.43 | 25 | 0 | 공동주택 | 아파트 | 892 | 750 | <NA> | 사천 송지지역주택조합 아파트 | 미선정 | 미선정 | 2023-08-30
4 | 신축 | 2021-06-17 | 2021-08-30 | 2023-01-31 | 경상남도 사천시 사남면 화전리 산 166-6 | 8670.97 | 4 | 1 | 공동주택 | 아파트 | 65 | 63 | <NA> | 사천시 사남면 화전리 프리미엄 다세대주택 | 미선정 | 미선정 | 2023-08-30
5 | 신축 | 2021-12-22 | 2022-02-25 | 2025-08-31 | 경상남도 사천시 정동면 대곡리 192 | 71925.91 | 20 | 1 | 공동주택 | 아파트 | 631 | 535 | <NA> | 경남 사천대곡지구 A-1블럭 아파트 | 미선정 | 동문건설 이상주 | 2023-08-30
6 | 신축 | 2021-12-24 | 2022-02-01 | 2024-04-30 | 경상남도 사천시 정동면 예수리 495-1 | 68307.61 | 18 | 2 | 공동주택 | 아파트 | 582 | 459 | <NA> | 사천 동계지구 지역주택조합 아파트 신축공사(1단지) | 미선정 | 미선정 | 2023-08-30
7 | 신축 | 2022-03-24 | 2022-08-16 | 2025-04-16 | 경상남도 사천시 용현면 선진리 1116 | 173946.39 | 29 | 2 | 공동주택 | 아파트 | 1581 | 1047 | <NA> | e편한세상 사천 스카이마리나 | 미선정 | 디엘건설 주식회사 곽수윤 | 2023-08-30
8 | 신축 | 2022-03-30 | 2022-07-01 | 2026-02-28 | 경상남도 사천시 동금동 151-5 | 146888.65 | 49 | 4 | 공동주택 | 판매시설 | 997 | 677 | <NA> | 경남 사천동금 삼천포 주상복합 | 미선정 | 디엘이앤씨(주) 마창민 | 2023-08-30
9 | 신축 | 2023-07-27 | <NA> | <NA> | 경상남도 사천시 송포동 15-10 | 181.93 | 1 | 0 | 제2종근린생활시설 | 일반음식점 | 1 | <NA> | <NA> | 윤정혜 | 미선정 | 미선정 | 2023-08-30
건축구분허가일착공일(예정일)준공일(예정일)대지위치연면적(제곱미터)지상층수지하층수주용도부속용도총주차대수세대수가구수공사명시행사시공사데이터기준일자
68용도변경2018-08-14<NA><NA>경상남도 사천시 서금동 144-22187.962<NA>제2종근린생활시설<NA><NA><NA><NA>김재연미선정미선정2023-08-30
69신축2018-07-262021-07-27<NA>경상남도 사천시 축동면 구호리 373-8 외1필지483.020제2종근린생활시설<NA>4<NA><NA>강용구미선정미선정2023-08-30
70용도변경2018-06-12<NA><NA>경상남도 사천시 사천읍 수석리 256-59122.311<NA>제2종근린생활시설사무소<NA><NA><NA>성용진미선정미선정2023-08-30
71용도변경2018-04-16<NA><NA>경상남도 사천시 벌리동 481-7521.833<NA>제2종근린생활시설<NA>2<NA><NA>강성열 외 1미선정미선정2023-08-30
72신축2018-04-122018-09-21<NA>경상남도 사천시 서포면 비토리 산 15-1 외1필지6505.1142숙박시설숙박시설,제1~2종근.생<NA><NA><NA>주식회사정근개발미선정미선정2023-08-30
73신축2018-04-052018-05-03<NA>경상남도 사천시 용강동 662-7 외2필지3877.03180공동주택연립주택 및 업무시설(오피스텔)378<NA>주식회사명품산업개발미선정(주)강민종합건설2023-08-30
74신축2018-03-30<NA><NA>경상남도 사천시 용현면 선진리 1001-5 외1필지276.4430제2종근린생활시설<NA>2<NA>1이민주미선정미선정2023-08-30
75증축2018-02-062022-12-30<NA>경상남도 사천시 용현면 선진리 762-1 외1필지1224.123<NA>제2종근린생활시설일반음식점8<NA>1옥경희 외 1미선정주식회사우상건설2023-08-30
76용도변경2018-01-08<NA><NA>경상남도 사천시 서금동 101-6 외2필지919.695<NA>숙박시설<NA>6<NA><NA>(주)미래로미선정미선정2023-08-30
77용도변경2018-01-05<NA><NA>경상남도 사천시 사천읍 사주리 8-532.371<NA>제2종근린생활시설부동산중개사무소<NA><NA><NA>정순혜 외 1미선정미선정2023-08-30