Overview

Dataset statistics

Number of variables: 17
Number of observations: 106
Missing cells: 311
Missing cells (%): 17.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 14.8 KiB
Average record size in memory: 143.2 B
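
The missing-cell percentage can be re-derived from the counts above. The short Python check below is only an illustration and is not part of the generated report. The per-column counts are copied from the alerts and variable summaries further down; treating the Text summary that reports one missing value as the 공사명 column is an assumption based on the column order in the Sample table.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 17
n_observations = 106
missing_cells = 311

total_cells = n_variables * n_observations       # 17 * 106 cells
missing_pct = 100 * missing_cells / total_cells  # share of empty cells

# Per-column missing counts as listed in the report.
# Assumption: the Text summary that reports "Missing 1" is the 공사명 column.
missing_by_column = {
    "착공일(예정일)": 74,
    "준공일(예정일)": 90,
    "부속용도": 30,
    "총주차대수": 24,
    "세대수": 92,
    "공사명": 1,
}

print(f"{missing_cells} / {total_cells} = {missing_pct:.1f}%")
print("per-column sum:", sum(missing_by_column.values()))
```

The per-column counts add up to the reported total, so the 17.3% figure counts only cells that are truly empty. Columns that store the literal string <NA> (for example 지하층수 and 가구수) are reported with 0 missing values.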

Variable types

Categorical: 7
DateTime: 3
Text: 3
Numeric: 4

Dataset

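Description: This dataset provides information on newly constructed buildings in Sacheon-si that have not yet been completed: construction type, permit date, construction start date (scheduled), completion date (scheduled), site location, total floor area, maximum number of above-ground floors, maximum number of basement floors, main use, accessory use, total number of parking spaces, number of dwelling units, number of households, and contractor name.
Author: Sacheon-si, Gyeongsangnam-do
URL: https://bigdata.gyeongnam.go.kr/index.gn?menuCd=DOM_000000114002001000&publicdatapk=15121500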

Alerts

시행사 has constant value "미선정" (Constant)
연면적(제곱미터) is highly overall correlated with 지상층수 and 5 other fields (High correlation)
지상층수 is highly overall correlated with 연면적(제곱미터) and 3 other fields (High correlation)
총주차대수 is highly overall correlated with 연면적(제곱미터) and 5 other fields (High correlation)
세대수 is highly overall correlated with 연면적(제곱미터) and 4 other fields (High correlation)
지하층수 is highly overall correlated with 연면적(제곱미터) and 3 other fields (High correlation)
주용도 is highly overall correlated with 가구수 and 2 other fields (High correlation)
가구수 is highly overall correlated with 연면적(제곱미터) and 2 other fields (High correlation)
시공사 is highly overall correlated with 연면적(제곱미터) and 3 other fields (High correlation)
데이터기준일자 is highly overall correlated with 세대수 and 2 other fields (High correlation)
가구수 is highly imbalanced (70.8%) (Imbalance)
시공사 is highly imbalanced (52.3%) (Imbalance)
착공일(예정일) has 74 (69.8%) missing values (Missing)
준공일(예정일) has 90 (84.9%) missing values (Missing)
부속용도 has 30 (28.3%) missing values (Missing)
총주차대수 has 24 (22.6%) missing values (Missing)
세대수 has 92 (86.8%) missing values (Missing)
총주차대수 has 14 (13.2%) zeros (Zeros)
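
The two imbalance percentages above are consistent with a score of one minus the normalized Shannon entropy of the class frequencies. That definition is inferred from the numbers in this report, not stated by it, so the sketch below should be read as an assumption. The function name `imbalance_score` is hypothetical, and the counts are copied from the Common Values tables of 가구수 and 시공사.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k) for class counts (assumed definition).

    H is the Shannon entropy of the class frequencies and k is the number
    of classes. 0 means perfectly balanced; values near 1 mean one class
    dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no balance to measure
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)


# 가구수: <NA> 93, "1" 10, and three values that occur once each.
gagusu_counts = [93, 10, 1, 1, 1]
# 시공사: 미선정 66, <NA> 21, and 19 values that occur once each.
sigongsa_counts = [66, 21] + [1] * 19

print(f"가구수: {imbalance_score(gagusu_counts):.1%}")
print(f"시공사: {imbalance_score(sigongsa_counts):.1%}")
```

Under this assumed definition the scores round to the 70.8% and 52.3% shown in the alerts. In both columns the dominant class is a placeholder (<NA> or 미선정), not a real category.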

Reproduction

Analysis started: 2023-12-10 23:44:32.346715
Analysis finished: 2023-12-10 23:44:35.315543
Duration: 2.97 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

건축구분
Categorical

Distinct: 4
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Length

Max length: 4
Median length: 2
Mean length: 2.6415094
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 신축
2nd row: 증축
3rd row: 신축
4th row: 증축
5th row: 신축

Common Values

Value | Count | Frequency (%)
신축 | 58 | 54.7%
용도변경 | 33 | 31.1%
증축 | 13 | 12.3%
대수선 | 2 | 1.9%

Length

Histogram of lengths of the category

Common Values (Plot)

허가일
Date

Distinct: 98
Distinct (%): 92.5%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B
Minimum: 2018-01-05 00:00:00
Maximum: 2023-10-31 00:00:00
Histogram with fixed size bins (bins=50)

착공일(예정일)
Date

MISSING 

Distinct: 31
Distinct (%): 96.9%
Missing: 74
Missing (%): 69.8%
Memory size: 980.0 B
Minimum: 2018-05-03 00:00:00
Maximum: 2024-02-04 00:00:00
Histogram with fixed size bins (bins=31)

준공일(예정일)
Date

MISSING 

Distinct: 14
Distinct (%): 87.5%
Missing: 90
Missing (%): 84.9%
Memory size: 980.0 B
Minimum: 2022-11-30 00:00:00
Maximum: 2026-02-28 00:00:00
Histogram with fixed size bins (bins=14)

대지위치
Text

Distinct: 104
Distinct (%): 98.1%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Length

Max length: 34
Median length: 27
Mean length: 21.839623
Min length: 16

Characters and Unicode

Total characters: 2315
Distinct characters: 95
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 102
Unique (%): 96.2%

Sample

1st row: 경상남도 사천시 벌리동 54-11
2nd row: 경상남도 사천시 백천동 108-1 외2필지
3rd row: 경상남도 사천시 동금동 330-9 외1필지
4th row: 경상남도 사천시 곤양면 환덕리 1360-1 외2필지
5th row: 경상남도 사천시 사천읍 정의리 1-11
ValueCountFrequency (%)
경상남도 106
19.9%
사천시 106
19.9%
사천읍 19
 
3.6%
외1필지 18
 
3.4%
외2필지 11
 
2.1%
용현면 11
 
2.1%
수석리 9
 
1.7%
사남면 9
 
1.7%
축동면 8
 
1.5%
8
 
1.5%
Other values (158) 227
42.7%

Most occurring characters

ValueCountFrequency (%)
426
18.4%
139
 
6.0%
132
 
5.7%
115
 
5.0%
107
 
4.6%
106
 
4.6%
106
 
4.6%
106
 
4.6%
1 100
 
4.3%
- 81
 
3.5%
Other values (85) 897
38.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 1372
59.3%
Decimal Number 434
 
18.7%
Space Separator 426
 
18.4%
Dash Punctuation 81
 
3.5%
Uppercase Letter 2
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
139
 
10.1%
132
 
9.6%
115
 
8.4%
107
 
7.8%
106
 
7.7%
106
 
7.7%
106
 
7.7%
67
 
4.9%
66
 
4.8%
42
 
3.1%
Other values (71) 386
28.1%
Decimal Number
ValueCountFrequency (%)
1 100
23.0%
2 55
12.7%
3 46
10.6%
5 43
9.9%
4 41
9.4%
8 39
 
9.0%
6 34
 
7.8%
7 28
 
6.5%
0 24
 
5.5%
9 24
 
5.5%
Uppercase Letter
ValueCountFrequency (%)
L 1
50.0%
B 1
50.0%
Space Separator
ValueCountFrequency (%)
426
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 81
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 1372
59.3%
Common 941
40.6%
Latin 2
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
139
 
10.1%
132
 
9.6%
115
 
8.4%
107
 
7.8%
106
 
7.7%
106
 
7.7%
106
 
7.7%
67
 
4.9%
66
 
4.8%
42
 
3.1%
Other values (71) 386
28.1%
Common
ValueCountFrequency (%)
426
45.3%
1 100
 
10.6%
- 81
 
8.6%
2 55
 
5.8%
3 46
 
4.9%
5 43
 
4.6%
4 41
 
4.4%
8 39
 
4.1%
6 34
 
3.6%
7 28
 
3.0%
Other values (2) 48
 
5.1%
Latin
ValueCountFrequency (%)
L 1
50.0%
B 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 1372
59.3%
ASCII 943
40.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
426
45.2%
1 100
 
10.6%
- 81
 
8.6%
2 55
 
5.8%
3 46
 
4.9%
5 43
 
4.6%
4 41
 
4.3%
8 39
 
4.1%
6 34
 
3.6%
7 28
 
3.0%
Other values (4) 50
 
5.3%
Hangul
ValueCountFrequency (%)
139
 
10.1%
132
 
9.6%
115
 
8.4%
107
 
7.8%
106
 
7.7%
106
 
7.7%
106
 
7.7%
67
 
4.9%
66
 
4.8%
42
 
3.1%
Other values (71) 386
28.1%

연면적(제곱미터)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 103
Distinct (%): 97.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10278.943
Minimum: 32.37
Maximum: 173946.39
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 32.37
5-th percentile: 78.8475
Q1: 197.23
Median: 502.77
Q3: 2464.845
95-th percentile: 71895.143
Maximum: 173946.39
Range: 173914.02
Interquartile range (IQR): 2267.615

Descriptive statistics

Standard deviation: 29468.978
Coefficient of variation (CV): 2.866927
Kurtosis: 14.010985
Mean: 10278.943
Median Absolute Deviation (MAD): 384.535
Skewness: 3.6451854
Sum: 1089567.9
Variance: 8.6842067 × 10^8
Monotonicity: Not monotonic
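
Several of the figures above are derived from the others. As a quick consistency check (an illustration only, using the rounded values printed in this report, so the last digits may differ slightly), the relationships can be sketched as:

```python
# Values copied from the 연면적(제곱미터) summary in this report.
n_observations = 106
total = 1089567.9                    # Sum
minimum, maximum = 32.37, 173946.39
q1, q3 = 197.23, 2464.845
mean, std = 10278.943, 29468.978

value_range = maximum - minimum      # Range = Maximum - Minimum
iqr = q3 - q1                        # IQR = Q3 - Q1
cv = std / mean                      # CV = Standard deviation / Mean
variance = std ** 2                  # Variance = Standard deviation squared
mean_from_sum = total / n_observations

print(f"range={value_range:.2f}  iqr={iqr:.3f}  cv={cv:.4f}")
print(f"variance={variance:.4e}  mean={mean_from_sum:.3f}")
```

A coefficient of variation near 2.87, with a mean (10278.943) about twenty times the median (502.77), points to a strongly right-skewed distribution dominated by a few very large buildings.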
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
197.23 | 2 | 1.9%
139.31 | 2 | 1.9%
3887.9 | 2 | 1.9%
919.69 | 1 | 0.9%
463.06 | 1 | 0.9%
70.8 | 1 | 0.9%
369.64 | 1 | 0.9%
218.3 | 1 | 0.9%
132.49 | 1 | 0.9%
168.0 | 1 | 0.9%
Other values (93) | 93 | 87.7%

Minimum 10 values

Value | Count | Frequency (%)
32.37 | 1 | 0.9%
51.94 | 1 | 0.9%
56.2 | 1 | 0.9%
70.8 | 1 | 0.9%
74.38 | 1 | 0.9%
77.05 | 1 | 0.9%
84.24 | 1 | 0.9%
98.98 | 1 | 0.9%
101.88 | 1 | 0.9%
114.16 | 1 | 0.9%

Maximum 10 values

Value | Count | Frequency (%)
173946.39 | 1 | 0.9%
146888.65 | 1 | 0.9%
99262.43 | 1 | 0.9%
97094.22 | 1 | 0.9%
91790.02 | 1 | 0.9%
71925.91 | 1 | 0.9%
71802.84 | 1 | 0.9%
68307.61 | 1 | 0.9%
55968.8 | 1 | 0.9%
54814.71 | 1 | 0.9%

지상층수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 17
Distinct (%): 16.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.6509434
Minimum: 1
Maximum: 49
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 2
Q3: 4
95-th percentile: 20
Maximum: 49
Range: 48
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 7.8340435
Coefficient of variation (CV): 1.6843988
Kurtosis: 13.206316
Mean: 4.6509434
Median Absolute Deviation (MAD): 1
Skewness: 3.4754499
Sum: 493
Variance: 61.372237
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=17)

Common Values

Value | Count | Frequency (%)
1 | 38 | 35.8%
2 | 26 | 24.5%
3 | 13 | 12.3%
4 | 9 | 8.5%
6 | 4 | 3.8%
5 | 3 | 2.8%
18 | 2 | 1.9%
20 | 2 | 1.9%
16 | 1 | 0.9%
25 | 1 | 0.9%
Other values (7) | 7 | 6.6%

Minimum 10 values

Value | Count | Frequency (%)
1 | 38 | 35.8%
2 | 26 | 24.5%
3 | 13 | 12.3%
4 | 9 | 8.5%
5 | 3 | 2.8%
6 | 4 | 3.8%
7 | 1 | 0.9%
9 | 1 | 0.9%
10 | 1 | 0.9%
16 | 1 | 0.9%

Maximum 10 values

Value | Count | Frequency (%)
49 | 1 | 0.9%
35 | 1 | 0.9%
33 | 1 | 0.9%
29 | 1 | 0.9%
25 | 1 | 0.9%
20 | 2 | 1.9%
18 | 2 | 1.9%
16 | 1 | 0.9%
10 | 1 | 0.9%
9 | 1 | 0.9%

지하층수
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 5.7%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Length

Max length: 4
Median length: 1
Mean length: 2.1320755
Min length: 1

Unique

Unique: 2
Unique (%): 1.9%

Sample

1st row: 0
2nd row: <NA>
3rd row: 0
4th row: <NA>
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 44 | 41.5%
<NA> | 40 | 37.7%
1 | 14 | 13.2%
2 | 6 | 5.7%
3 | 1 | 0.9%
4 | 1 | 0.9%

Length

Histogram of lengths of the category

Common Values (Plot)


주용도
Categorical

HIGH CORRELATION 

Distinct: 11
Distinct (%): 10.4%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Length

Max length: 9
Median length: 9
Mean length: 7.0188679
Min length: 2

Unique

Unique: 1
Unique (%): 0.9%

Sample

1st row: 제2종근린생활시설
2nd row: 창고시설
3rd row: 제2종근린생활시설
4th row: 동물및식물관련시설
5th row: 노유자시설

Common Values

Value | Count | Frequency (%)
제2종근린생활시설 | 58 | 54.7%
공동주택 | 13 | 12.3%
숙박시설 | 12 | 11.3%
공장 | 6 | 5.7%
제1종근린생활시설 | 5 | 4.7%
동물및식물관련시설 | 3 | 2.8%
창고시설 | 2 | 1.9%
노유자시설 | 2 | 1.9%
단독주택 | 2 | 1.9%
운동시설 | 2 | 1.9%

Length

Histogram of lengths of the category

부속용도
Text

MISSING 

Distinct: 50
Distinct (%): 65.8%
Missing: 30
Missing (%): 28.3%
Memory size: 980.0 B

Length

Max length: 42
Median length: 21.5
Mean length: 7.3157895
Min length: 1

Characters and Unicode

Total characters: 556
Distinct characters: 111
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 43
Unique (%): 56.6%

Sample

1st row: 단독주택
2nd row: 일반음식점
3rd row: 축사
4th row: 경로당
5th row: 마을회관
ValueCountFrequency (%)
일반음식점 16
 
13.7%
13
 
11.1%
사무소 10
 
8.5%
아파트 9
 
7.7%
단독주택 7
 
6.0%
소매점 3
 
2.6%
제조업소 2
 
1.7%
휴게음식점 2
 
1.7%
근린생활시설 2
 
1.7%
제1종근린생활시설 2
 
1.7%
Other values (48) 51
43.6%

Most occurring characters

ValueCountFrequency (%)
46
 
8.3%
24
 
4.3%
20
 
3.6%
20
 
3.6%
20
 
3.6%
19
 
3.4%
19
 
3.4%
19
 
3.4%
19
 
3.4%
16
 
2.9%
Other values (101) 334
60.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 470
84.5%
Space Separator 46
 
8.3%
Decimal Number 17
 
3.1%
Close Punctuation 9
 
1.6%
Open Punctuation 9
 
1.6%
Other Punctuation 2
 
0.4%
Dash Punctuation 2
 
0.4%
Math Symbol 1
 
0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
24
 
5.1%
20
 
4.3%
20
 
4.3%
20
 
4.3%
19
 
4.0%
19
 
4.0%
19
 
4.0%
19
 
4.0%
16
 
3.4%
14
 
3.0%
Other values (92) 280
59.6%
Decimal Number
ValueCountFrequency (%)
2 8
47.1%
1 7
41.2%
3 2
 
11.8%
Space Separator
ValueCountFrequency (%)
46
100.0%
Close Punctuation
ValueCountFrequency (%)
) 9
100.0%
Open Punctuation
ValueCountFrequency (%)
( 9
100.0%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 470
84.5%
Common 86
 
15.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
24
 
5.1%
20
 
4.3%
20
 
4.3%
20
 
4.3%
19
 
4.0%
19
 
4.0%
19
 
4.0%
19
 
4.0%
16
 
3.4%
14
 
3.0%
Other values (92) 280
59.6%
Common
ValueCountFrequency (%)
46
53.5%
) 9
 
10.5%
( 9
 
10.5%
2 8
 
9.3%
1 7
 
8.1%
3 2
 
2.3%
. 2
 
2.3%
- 2
 
2.3%
~ 1
 
1.2%

Most occurring blocks

ValueCountFrequency (%)
Hangul 470
84.5%
ASCII 86
 
15.5%

Most frequent character per block

ASCII
ValueCountFrequency (%)
46
53.5%
) 9
 
10.5%
( 9
 
10.5%
2 8
 
9.3%
1 7
 
8.1%
3 2
 
2.3%
. 2
 
2.3%
- 2
 
2.3%
~ 1
 
1.2%
Hangul
ValueCountFrequency (%)
24
 
5.1%
20
 
4.3%
20
 
4.3%
20
 
4.3%
19
 
4.0%
19
 
4.0%
19
 
4.0%
19
 
4.0%
16
 
3.4%
14
 
3.0%
Other values (92) 280
59.6%

총주차대수
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 36
Distinct (%): 43.9%
Missing: 24
Missing (%): 22.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 102.80488
Minimum: 0
Maximum: 1581
Zeros: 14
Zeros (%): 13.2%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1.25
Median: 4
Q3: 17.5
95-th percentile: 628.55
Maximum: 1581
Range: 1581
Interquartile range (IQR): 16.25

Descriptive statistics

Standard deviation: 269.95704
Coefficient of variation (CV): 2.6259166
Kurtosis: 13.146189
Mean: 102.80488
Median Absolute Deviation (MAD): 4
Skewness: 3.4617874
Sum: 8430
Variance: 72876.801
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=36)

Common Values

Value | Count | Frequency (%)
0 | 14 | 13.2%
2 | 12 | 11.3%
1 | 7 | 6.6%
3 | 7 | 6.6%
4 | 4 | 3.8%
7 | 3 | 2.8%
5 | 3 | 2.8%
6 | 2 | 1.9%
9 | 2 | 1.9%
8 | 2 | 1.9%
Other values (26) | 26 | 24.5%
(Missing) | 24 | 22.6%

Minimum 10 values

Value | Count | Frequency (%)
0 | 14 | 13.2%
1 | 7 | 6.6%
2 | 12 | 11.3%
3 | 7 | 6.6%
4 | 4 | 3.8%
5 | 3 | 2.8%
6 | 2 | 1.9%
7 | 3 | 2.8%
8 | 2 | 1.9%
9 | 2 | 1.9%

Maximum 10 values

Value | Count | Frequency (%)
1581 | 1 | 0.9%
997 | 1 | 0.9%
942 | 1 | 0.9%
892 | 1 | 0.9%
631 | 1 | 0.9%
582 | 1 | 0.9%
554 | 1 | 0.9%
450 | 1 | 0.9%
332 | 1 | 0.9%
317 | 1 | 0.9%

세대수
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 14
Distinct (%): 100.0%
Missing: 92
Missing (%): 86.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 381.07143
Minimum: 3
Maximum: 1047
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 3
5-th percentile: 4.95
Q1: 33.75
Median: 397
Q3: 641.5
95-th percentile: 853.95
Maximum: 1047
Range: 1044
Interquartile range (IQR): 607.75

Descriptive statistics

Standard deviation: 344.57744
Coefficient of variation (CV): 0.9042332
Kurtosis: -0.9825553
Mean: 381.07143
Median Absolute Deviation (MAD): 341
Skewness: 0.39545911
Sum: 5335
Variance: 118733.61
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)

Common Values

Value | Count | Frequency (%)
335 | 1 | 0.9%
745 | 1 | 0.9%
511 | 1 | 0.9%
750 | 1 | 0.9%
63 | 1 | 0.9%
535 | 1 | 0.9%
459 | 1 | 0.9%
1047 | 1 | 0.9%
677 | 1 | 0.9%
6 | 1 | 0.9%
Other values (4) | 4 | 3.8%
(Missing) | 92 | 86.8%

Minimum 10 values

Value | Count | Frequency (%)
3 | 1 | 0.9%
6 | 1 | 0.9%
8 | 1 | 0.9%
24 | 1 | 0.9%
63 | 1 | 0.9%
172 | 1 | 0.9%
335 | 1 | 0.9%
459 | 1 | 0.9%
511 | 1 | 0.9%
535 | 1 | 0.9%

Maximum 10 values

Value | Count | Frequency (%)
1047 | 1 | 0.9%
750 | 1 | 0.9%
745 | 1 | 0.9%
677 | 1 | 0.9%
535 | 1 | 0.9%
511 | 1 | 0.9%
459 | 1 | 0.9%
335 | 1 | 0.9%
172 | 1 | 0.9%
63 | 1 | 0.9%

가구수
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 4.7%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Length

Max length: 4
Median length: 4
Mean length: 3.6320755
Min length: 1

Unique

Unique: 3
Unique (%): 2.8%

Sample

1st row: 1
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 93 | 87.7%
1 | 10 | 9.4%
4 | 1 | 0.9%
7 | 1 | 0.9%
2 | 1 | 0.9%

Length

Histogram of lengths of the category

Common Values (Plot)

공사명
Text

Distinct: 98
Distinct (%): 93.3%
Missing: 1
Missing (%): 0.9%
Memory size: 980.0 B

Length

Max length: 28
Median length: 3
Mean length: 6.552381
Min length: 3

Characters and Unicode

Total characters: 688
Distinct characters: 187
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 93
Unique (%): 88.6%

Sample

1st row: 장병철
2nd row: 대한불교관음회조계종백천사
3rd row: 주식회사와이인베스트먼트
4th row: 김병기
5th row: 사천시장
ValueCountFrequency (%)
1 11
 
7.4%
11
 
7.4%
사천시 6
 
4.0%
아파트 4
 
2.7%
사천 3
 
2.0%
차익순 3
 
2.0%
옥경희 2
 
1.3%
재단법인기독교대한성결교회유지재단 2
 
1.3%
강석경 2
 
1.3%
사천시장 2
 
1.3%
Other values (102) 103
69.1%

Most occurring characters

ValueCountFrequency (%)
44
 
6.4%
37
 
5.4%
30
 
4.4%
19
 
2.8%
19
 
2.8%
18
 
2.6%
15
 
2.2%
15
 
2.2%
13
 
1.9%
1 13
 
1.9%
Other values (177) 465
67.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 611
88.8%
Space Separator 44
 
6.4%
Decimal Number 14
 
2.0%
Open Punctuation 8
 
1.2%
Close Punctuation 8
 
1.2%
Lowercase Letter 1
 
0.1%
Dash Punctuation 1
 
0.1%
Uppercase Letter 1
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
37
 
6.1%
30
 
4.9%
19
 
3.1%
19
 
3.1%
18
 
2.9%
15
 
2.5%
15
 
2.5%
13
 
2.1%
11
 
1.8%
10
 
1.6%
Other values (169) 424
69.4%
Decimal Number
ValueCountFrequency (%)
1 13
92.9%
2 1
 
7.1%
Space Separator
ValueCountFrequency (%)
44
100.0%
Open Punctuation
ValueCountFrequency (%)
( 8
100.0%
Close Punctuation
ValueCountFrequency (%)
) 8
100.0%
Lowercase Letter
ValueCountFrequency (%)
e 1
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%
Uppercase Letter
ValueCountFrequency (%)
A 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 611
88.8%
Common 75
 
10.9%
Latin 2
 
0.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
37
 
6.1%
30
 
4.9%
19
 
3.1%
19
 
3.1%
18
 
2.9%
15
 
2.5%
15
 
2.5%
13
 
2.1%
11
 
1.8%
10
 
1.6%
Other values (169) 424
69.4%
Common
ValueCountFrequency (%)
44
58.7%
1 13
 
17.3%
( 8
 
10.7%
) 8
 
10.7%
2 1
 
1.3%
- 1
 
1.3%
Latin
ValueCountFrequency (%)
e 1
50.0%
A 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 611
88.8%
ASCII 77
 
11.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
44
57.1%
1 13
 
16.9%
( 8
 
10.4%
) 8
 
10.4%
2 1
 
1.3%
e 1
 
1.3%
- 1
 
1.3%
A 1
 
1.3%
Hangul
ValueCountFrequency (%)
37
 
6.1%
30
 
4.9%
19
 
3.1%
19
 
3.1%
18
 
2.9%
15
 
2.5%
15
 
2.5%
13
 
2.1%
11
 
1.8%
10
 
1.6%
Other values (169) 424
69.4%

시행사
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 미선정
2nd row: 미선정
3rd row: 미선정
4th row: 미선정
5th row: 미선정

Common Values

Value | Count | Frequency (%)
미선정 | 106 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)


시공사
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 21
Distinct (%): 19.8%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Length

Max length: 13
Median length: 3
Mean length: 4
Min length: 3

Unique

Unique: 19
Unique (%): 17.9%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
미선정 | 66 | 62.3%
<NA> | 21 | 19.8%
파인건설(주) 남윤광 | 1 | 0.9%
박경란 | 1 | 0.9%
손명섭 임상후 | 1 | 0.9%
김순석 | 1 | 0.9%
정재욱 | 1 | 0.9%
안재명 | 1 | 0.9%
정배영 | 1 | 0.9%
(주)한양건설 안순걸 | 1 | 0.9%
Other values (11) | 11 | 10.4%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
미선정 66
57.9%
na 21
 
18.4%
주식회사 2
 
1.8%
이상주 1
 
0.9%
동문건설 1
 
0.9%
주)강민종합건설 1
 
0.9%
한웅건설(주 1
 
0.9%
프라임건설(주 1
 
0.9%
광득건설(주 1
 
0.9%
청아건설 1
 
0.9%
Other values (18) 18
 
15.8%

데이터기준일자
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-10-31
2nd row: 2023-10-31
3rd row: 2023-10-31
4th row: 2023-10-31
5th row: 2023-10-31

Common Values

Value | Count | Frequency (%)
2023-08-30 | 78 | 73.6%
2023-10-31 | 28 | 26.4%

Length

Histogram of lengths of the category

Common Values (Plot)


Correlations

Variable | 건축구분 | 허가일 | 착공일(예정일) | 준공일(예정일) | 연면적(제곱미터) | 지상층수 | 지하층수 | 주용도 | 부속용도 | 총주차대수 | 세대수 | 가구수 | 공사명 | 시공사 | 데이터기준일자
건축구분 | 1.000 | 0.809 | 1.000 | 0.000 | 0.000 | 0.029 | 0.366 | 0.587 | 0.891 | 0.000 | 0.000 | 0.214 | 0.939 | 0.808 | 0.556
허가일 | 0.809 | 1.000 | 0.996 | 0.949 | 0.957 | 1.000 | 1.000 | 0.000 | 0.973 | 0.959 | 1.000 | 1.000 | 0.993 | 0.000 | 1.000
착공일(예정일) | 1.000 | 0.996 | 1.000 | 0.911 | 1.000 | 1.000 | 1.000 | 1.000 | 0.885 | 1.000 | 1.000 | 1.000 | 1.000 | 0.850 | 1.000
준공일(예정일) | 0.000 | 0.949 | 0.911 | 1.000 | 0.936 | 0.799 | 0.914 | 0.800 | 0.610 | 0.722 | 1.000 | NaN | 1.000 | 0.922 | 0.000
연면적(제곱미터) | 0.000 | 0.957 | 1.000 | 0.936 | 1.000 | 0.958 | 0.761 | 0.365 | 0.000 | 0.991 | 0.965 | NaN | 1.000 | 0.925 | 0.000
지상층수 | 0.029 | 1.000 | 1.000 | 0.799 | 0.958 | 1.000 | 0.863 | 0.466 | 0.000 | 0.965 | 0.842 | 0.000 | 1.000 | 0.784 | 0.105
지하층수 | 0.366 | 1.000 | 1.000 | 0.914 | 0.761 | 0.863 | 1.000 | 0.263 | 0.000 | 0.789 | 0.881 | 0.000 | 1.000 | 0.698 | 0.183
주용도 | 0.587 | 0.000 | 1.000 | 0.800 | 0.365 | 0.466 | 0.263 | 1.000 | 1.000 | 0.326 | 0.000 | 0.897 | 0.894 | 0.857 | 0.818
부속용도 | 0.891 | 0.973 | 0.885 | 0.610 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000 | 0.000 | 0.000 | 1.000 | 0.955 | 0.707 | 0.748
총주차대수 | 0.000 | 0.959 | 1.000 | 0.722 | 0.991 | 0.965 | 0.789 | 0.326 | 0.000 | 1.000 | 1.000 | NaN | 1.000 | 0.876 | 0.000
세대수 | 0.000 | 1.000 | 1.000 | 1.000 | 0.965 | 0.842 | 0.881 | 0.000 | 0.000 | 1.000 | 1.000 | NaN | 1.000 | 0.822 | NaN
가구수 | 0.214 | 1.000 | 1.000 | NaN | NaN | 0.000 | 0.000 | 0.897 | 1.000 | NaN | NaN | 1.000 | 1.000 | 0.552 | 0.702
공사명 | 0.939 | 0.993 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.894 | 0.955 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.557
시공사 | 0.808 | 0.000 | 0.850 | 0.922 | 0.925 | 0.784 | 0.698 | 0.857 | 0.707 | 0.876 | 0.822 | 0.552 | 1.000 | 1.000 | 1.000
데이터기준일자 | 0.556 | 1.000 | 1.000 | 0.000 | 0.000 | 0.105 | 0.183 | 0.818 | 0.748 | 0.000 | NaN | 0.702 | 0.557 | 1.000 | 1.000

Variable | 가구수 | 지하층수 | 주용도 | 시공사 | 데이터기준일자 | 건축구분
가구수 | 1.000 | 0.000 | 0.567 | 0.487 | 0.433 | 0.122
지하층수 | 0.000 | 1.000 | 0.143 | 0.390 | 0.217 | 0.302
주용도 | 0.567 | 0.143 | 1.000 | 0.558 | 0.778 | 0.382
시공사 | 0.487 | 0.390 | 0.558 | 1.000 | 0.885 | 0.459
데이터기준일자 | 0.433 | 0.217 | 0.778 | 0.885 | 1.000 | 0.377
건축구분 | 0.122 | 0.302 | 0.382 | 0.459 | 0.377 | 1.000

Variable | 연면적(제곱미터) | 지상층수 | 총주차대수 | 세대수 | 건축구분 | 지하층수 | 주용도 | 가구수 | 시공사 | 데이터기준일자
연면적(제곱미터) | 1.000 | 0.658 | 0.784 | 0.987 | 0.000 | 0.615 | 0.183 | 1.000 | 0.666 | 0.000
지상층수 | 0.658 | 1.000 | 0.670 | 0.668 | 0.000 | 0.773 | 0.193 | 0.000 | 0.461 | 0.044
총주차대수 | 0.784 | 0.670 | 1.000 | 0.973 | 0.000 | 0.651 | 0.158 | 1.000 | 0.542 | 0.000
세대수 | 0.987 | 0.668 | 0.973 | 1.000 | 0.000 | 0.606 | 0.000 | 0.000 | 0.313 | 1.000
건축구분 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.302 | 0.382 | 0.122 | 0.459 | 0.377
지하층수 | 0.615 | 0.773 | 0.651 | 0.606 | 0.302 | 1.000 | 0.143 | 0.000 | 0.390 | 0.217
주용도 | 0.183 | 0.193 | 0.158 | 0.000 | 0.382 | 0.143 | 1.000 | 0.567 | 0.558 | 0.778
가구수 | 1.000 | 0.000 | 1.000 | 0.000 | 0.122 | 0.000 | 0.567 | 1.000 | 0.487 | 0.433
시공사 | 0.666 | 0.461 | 0.542 | 0.313 | 0.459 | 0.390 | 0.558 | 0.487 | 1.000 | 0.885
데이터기준일자 | 0.000 | 0.044 | 0.000 | 1.000 | 0.377 | 0.217 | 0.778 | 0.433 | 0.885 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
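
To make the idea of nullity correlation concrete, here is a minimal standard-library sketch: each column is turned into a 0/1 "is present" indicator and the Pearson correlation of two indicators is computed. It only illustrates the concept and is not the code that produced the heatmap. The helper names and the toy rows are hypothetical and are not taken from the dataset.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # a column that is always (or never) present
    return cov / math.sqrt(var_x * var_y)


def nullity_correlation(rows, col_a, col_b):
    """Correlate the presence (1) or absence (0) of two columns across rows."""
    present_a = [0 if row[col_a] is None else 1 for row in rows]
    present_b = [0 if row[col_b] is None else 1 for row in rows]
    return pearson(present_a, present_b)


# Hypothetical toy rows; None marks a missing value.
rows = [
    {"start": "2023-10-20", "end": "2024-03-17", "units": None},
    {"start": None, "end": None, "units": 8},
    {"start": "2018-05-03", "end": None, "units": None},
    {"start": None, "end": None, "units": None},
]

print(round(nullity_correlation(rows, "start", "end"), 3))
print(round(nullity_correlation(rows, "start", "units"), 3))
```

A value near 1 means two columns tend to be filled in (or left empty) together, a value near -1 means one tends to be present when the other is absent, and a column with no variation in presence yields NaN.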

Sample

건축구분허가일착공일(예정일)준공일(예정일)대지위치연면적(제곱미터)지상층수지하층수주용도부속용도총주차대수세대수가구수공사명시행사시공사데이터기준일자
0신축2023-10-31<NA><NA>경상남도 사천시 벌리동 54-11198.3520제2종근린생활시설단독주택2<NA>1장병철미선정<NA>2023-10-31
1증축2023-10-30<NA><NA>경상남도 사천시 백천동 108-1 외2필지2583.152<NA>창고시설<NA>3<NA><NA>대한불교관음회조계종백천사미선정<NA>2023-10-31
2신축2023-10-30<NA><NA>경상남도 사천시 동금동 330-9 외1필지423.010제2종근린생활시설일반음식점4<NA><NA>주식회사와이인베스트먼트미선정<NA>2023-10-31
3증축2023-10-27<NA><NA>경상남도 사천시 곤양면 환덕리 1360-1 외2필지3937.01<NA>동물및식물관련시설축사0<NA><NA>김병기미선정<NA>2023-10-31
4신축2023-10-25<NA><NA>경상남도 사천시 사천읍 정의리 1-11114.1610노유자시설경로당0<NA><NA>사천시장미선정<NA>2023-10-31
5신축2023-10-25<NA><NA>경상남도 사천시 서포면 비토리 14-13479.3520제1종근린생활시설마을회관2<NA><NA>사천시미선정<NA>2023-10-31
6증축2023-10-12<NA><NA>경상남도 사천시 사천읍 장전리 산 73-6 외1필지2109.931<NA>동물및식물관련시설<NA>0<NA><NA>주식회사중앙개발미선정<NA>2023-10-31
7신축2023-10-11<NA><NA>경상남도 사천시 사천읍 정의리 209-1 외2필지820.1740제1종근린생활시설통신용시설4<NA><NA>에스케이텔레콤주식회사미선정<NA>2023-10-31
8증축2023-09-262023-10-202024-03-17경상남도 사천시 용현면 신촌리 588-19909.632<NA>공장<NA>50<NA><NA>한국표면처리(주)미선정한국표면처리(주)2023-10-31
9용도변경2023-09-21<NA><NA>경상남도 사천시 축동면 가산리 405-151.941<NA>단독주택<NA>0<NA>1최창경미선정<NA>2023-10-31
건축구분허가일착공일(예정일)준공일(예정일)대지위치연면적(제곱미터)지상층수지하층수주용도부속용도총주차대수세대수가구수공사명시행사시공사데이터기준일자
96용도변경2018-08-14<NA><NA>경상남도 사천시 서금동 144-22187.962<NA>제2종근린생활시설<NA><NA><NA><NA>김재연미선정미선정2023-08-30
97신축2018-07-262021-07-27<NA>경상남도 사천시 축동면 구호리 373-8 외1필지483.020제2종근린생활시설<NA>4<NA><NA>강용구미선정미선정2023-08-30
98용도변경2018-06-12<NA><NA>경상남도 사천시 사천읍 수석리 256-59122.311<NA>제2종근린생활시설사무소<NA><NA><NA>성용진미선정미선정2023-08-30
99용도변경2018-04-16<NA><NA>경상남도 사천시 벌리동 481-7521.833<NA>제2종근린생활시설<NA>2<NA><NA>강성열 외 1미선정미선정2023-08-30
100신축2018-04-122018-09-21<NA>경상남도 사천시 서포면 비토리 산 15-1 외1필지6505.1142숙박시설숙박시설 제1~2종근.생<NA><NA><NA>주식회사정근개발미선정미선정2023-08-30
101신축2018-04-052018-05-03<NA>경상남도 사천시 용강동 662-7 외2필지3877.03180공동주택연립주택 및 업무시설(오피스텔)378<NA>주식회사명품산업개발미선정(주)강민종합건설2023-08-30
102신축2018-03-30<NA><NA>경상남도 사천시 용현면 선진리 1001-5 외1필지276.4430제2종근린생활시설<NA>2<NA>1이민주미선정미선정2023-08-30
103증축2018-02-062022-12-30<NA>경상남도 사천시 용현면 선진리 762-1 외1필지1224.123<NA>제2종근린생활시설일반음식점8<NA>1옥경희 외 1미선정주식회사우상건설2023-08-30
104용도변경2018-01-08<NA><NA>경상남도 사천시 서금동 101-6 외2필지919.695<NA>숙박시설<NA>6<NA><NA>(주)미래로미선정미선정2023-08-30
105용도변경2018-01-05<NA><NA>경상남도 사천시 사천읍 사주리 8-532.371<NA>제2종근린생활시설부동산중개사무소<NA><NA><NA>정순혜 외 1미선정미선정2023-08-30