Overview

Dataset statistics

Number of variables: 17
Number of observations: 57
Missing cells: 219
Missing cells (%): 22.6%
Duplicate rows: 1
Duplicate rows (%): 1.8%
Total size in memory: 8.1 KiB
Average record size in memory: 145.3 B
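
The two percentages in the dataset statistics can be reproduced from the raw counts. The sketch below assumes that missing cells are measured against all cells (variables × observations) and duplicate rows against all observations; the variable names are illustrative only.

```python
# Counts copied from the dataset statistics.
n_variables = 17
n_observations = 57
missing_cells = 219
duplicate_rows = 1

# Assumption: percentages are taken over all cells and all rows respectively.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"{total_cells} cells, {missing_pct:.1f}% missing, {duplicate_pct:.1f}% duplicate rows")
```

Under these assumptions the output agrees with the reported 22.6% and 1.8%.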

Variable types

Categorical: 4
DateTime: 4
Unsupported: 1
Text: 4
Numeric: 4

Dataset

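Description: Status of newly constructed buildings in Jinju City (accommodation facilities, multi-unit housing, sports facilities, and Class 2 neighborhood living facilities) that had not received use approval, covering January 1, 2018 through July 31, 2023
URL: https://www.data.go.kr/data/15121335/fileData.do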

Alerts

건축구분 has constant value "신축" (Constant)
Dataset has 1 (1.8%) duplicate rows (Duplicates)
가구수 is highly overall correlated with 연면적(제곱미터) and 3 other fields (High correlation)
최대지하층수 is highly overall correlated with 연면적(제곱미터) and 5 other fields (High correlation)
연면적(제곱미터) is highly overall correlated with 최대지상층수 and 5 other fields (High correlation)
최대지상층수 is highly overall correlated with 연면적(제곱미터) and 4 other fields (High correlation)
총주차대수 is highly overall correlated with 연면적(제곱미터) and 5 other fields (High correlation)
세대수 is highly overall correlated with 연면적(제곱미터) and 2 other fields (High correlation)
주용도 is highly overall correlated with 연면적(제곱미터) and 3 other fields (High correlation)
최대지하층수 is highly imbalanced (64.7%) (Imbalance)
가구수 is highly imbalanced (55.3%) (Imbalance)
착공처리일 has 24 (42.1%) missing values (Missing)
착공예정일 has 23 (40.4%) missing values (Missing)
사용승인일 has 57 (100.0%) missing values (Missing)
준공예정일(사용승인예정일) has 23 (40.4%) missing values (Missing)
부속용도 has 8 (14.0%) missing values (Missing)
총주차대수 has 2 (3.5%) missing values (Missing)
세대수 has 49 (86.0%) missing values (Missing)
시공자사무소명 has 33 (57.9%) missing values (Missing)
사용승인일 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
총주차대수 has 6 (10.5%) zeros (Zeros)
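
The two imbalance percentages in the alerts are consistent with one minus the normalized Shannon entropy of the value counts. That the profiling tool uses exactly this formula is an assumption, based only on the fact that it reproduces both reported values; `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform split, approaching 1 for one dominant class."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

basement_counts = [50, 4, 2, 1]   # value counts of 최대지하층수 (values 0, 1, 4, 5)
household_counts = [48, 8, 1]     # value counts of 가구수 (<NA>, 1, 2)

print(f"{100 * imbalance_score(basement_counts):.1f}%")
print(f"{100 * imbalance_score(household_counts):.1f}%")
```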

Reproduction

Analysis started: 2023-12-12 21:19:02.693495
Analysis finished: 2023-12-12 21:19:05.747070
Duration: 3.05 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
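
As a quick check, the reported duration follows from the two timestamps. This is a minimal sketch using only the standard library; the format string is an assumption about how the timestamps are laid out.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed layout of the timestamps above
started = datetime.strptime("2023-12-12 21:19:02.693495", FMT)
finished = datetime.strptime("2023-12-12 21:19:05.747070", FMT)

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```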

Variables

건축구분
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 588.0 B
Value counts: 신축 (57)

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 신축
2nd row: 신축
3rd row: 신축
4th row: 신축
5th row: 신축

Common Values

Value | Count | Frequency (%)
신축 | 57 | 100.0%

허가일
Date

Distinct: 55
Distinct (%): 96.5%
Missing: 0
Missing (%): 0.0%
Memory size: 588.0 B
Minimum: 2018-03-07 00:00:00
Maximum: 2023-07-26 00:00:00
Histogram with fixed size bins (bins=50)

착공처리일
Date

MISSING 

Distinct: 31
Distinct (%): 93.9%
Missing: 24
Missing (%): 42.1%
Memory size: 588.0 B
Minimum: 2018-03-13 00:00:00
Maximum: 2023-08-18 00:00:00
Histogram with fixed size bins (bins=31)
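
The Distinct (%) figure for 착공처리일 only matches if distinct values are counted against non-missing rows rather than all 57 rows. The sketch below states that assumption explicitly; `distinct_pct` is a hypothetical helper.

```python
def distinct_pct(n_distinct, n_rows, n_missing):
    """Share of distinct values among non-missing rows, in percent (assumed definition)."""
    return 100 * n_distinct / (n_rows - n_missing)

# 착공처리일: 31 distinct values, 57 rows, 24 missing
print(round(distinct_pct(31, 57, 24), 1))
```

The same definition gives 100.0% for 착공예정일 (34 distinct values among 34 non-missing rows).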

착공예정일
Date

MISSING 

Distinct: 34
Distinct (%): 100.0%
Missing: 23
Missing (%): 40.4%
Memory size: 588.0 B
Minimum: 2018-03-14 00:00:00
Maximum: 2023-08-11 00:00:00
Histogram with fixed size bins (bins=34)

사용승인일
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 57
Missing (%): 100.0%
Memory size: 645.0 B

준공예정일(사용승인예정일)
Date

Distinct: 28
Distinct (%): 82.4%
Missing: 23
Missing (%): 40.4%
Memory size: 588.0 B
Minimum: 2018-04-30 00:00:00
Maximum: 2026-03-31 00:00:00
Histogram with fixed size bins (bins=28)

대지위치
Text

Distinct: 55
Distinct (%): 96.5%
Missing: 0
Missing (%): 0.0%
Memory size: 588.0 B

Length

Max length: 28
Median length: 25
Mean length: 21.403509
Min length: 16

Characters and Unicode

Total characters: 1220
Distinct characters: 73
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
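
The mean length reported for this text variable is the total character count divided by the number of non-missing values. A minimal sketch, assuming all 57 rows are populated for this variable as its statistics indicate:

```python
total_characters = 1220
n_values = 57  # this variable has no missing values

mean_length = total_characters / n_values
print(round(mean_length, 6))
```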

Unique

Unique: 54
Unique (%): 94.7%

Sample

1st row: 경상남도 진주시 하대동 71-21
2nd row: 경상남도 진주시 정촌면 예하리 1297-22
3rd row: 경상남도 진주시 인사동 219-39
4th row: 경상남도 진주시 금산면 장사리 902 외2필지
5th row: 경상남도 진주시 상봉동 834-5 외1필지

Words

Value | Count | Frequency (%)
경상남도 | 57 | 20.6%
진주시 | 57 | 20.6%
외1필지 | 8 | 2.9%
외4필지 | 6 | 2.2%
정촌면 | 6 | 2.2%
예하리 | 6 | 2.2%
하대동 | 6 | 2.2%
평거동 | 5 | 1.8%
금산면 | 5 | 1.8%
문산읍 | 4 | 1.4%
Other values (93) | 117 | 42.2%

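Most occurring characters

Value | Count | Frequency (%)
(space) | 220 | 18.0%
(Hangul character) | 62 | 5.1%
(Hangul character) | 59 | 4.8%
(Hangul character) | 58 | 4.8%
(Hangul character) | 58 | 4.8%
(Hangul character) | 57 | 4.7%
(Hangul character) | 57 | 4.7%
(Hangul character) | 57 | 4.7%
1 | 49 | 4.0%
- | 39 | 3.2%
Other values (63) | 504 | 41.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 718 | 58.9%
Decimal Number | 243 | 19.9%
Space Separator | 220 | 18.0%
Dash Punctuation | 39 | 3.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(Hangul character) | 62 | 8.6%
(Hangul character) | 59 | 8.2%
(Hangul character) | 58 | 8.1%
(Hangul character) | 58 | 8.1%
(Hangul character) | 57 | 7.9%
(Hangul character) | 57 | 7.9%
(Hangul character) | 57 | 7.9%
(Hangul character) | 33 | 4.6%
(Hangul character) | 26 | 3.6%
(Hangul character) | 23 | 3.2%
Other values (51) | 228 | 31.8%

Decimal Number
Value | Count | Frequency (%)
1 | 49 | 20.2%
3 | 35 | 14.4%
2 | 28 | 11.5%
4 | 26 | 10.7%
9 | 21 | 8.6%
0 | 21 | 8.6%
8 | 18 | 7.4%
7 | 16 | 6.6%
6 | 15 | 6.2%
5 | 14 | 5.8%

Space Separator
Value | Count | Frequency (%)
(space) | 220 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 39 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 718 | 58.9%
Common | 502 | 41.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(Hangul character) | 62 | 8.6%
(Hangul character) | 59 | 8.2%
(Hangul character) | 58 | 8.1%
(Hangul character) | 58 | 8.1%
(Hangul character) | 57 | 7.9%
(Hangul character) | 57 | 7.9%
(Hangul character) | 57 | 7.9%
(Hangul character) | 33 | 4.6%
(Hangul character) | 26 | 3.6%
(Hangul character) | 23 | 3.2%
Other values (51) | 228 | 31.8%

Common
Value | Count | Frequency (%)
(space) | 220 | 43.8%
1 | 49 | 9.8%
- | 39 | 7.8%
3 | 35 | 7.0%
2 | 28 | 5.6%
4 | 26 | 5.2%
9 | 21 | 4.2%
0 | 21 | 4.2%
8 | 18 | 3.6%
7 | 16 | 3.2%
Other values (2) | 29 | 5.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 718 | 58.9%
ASCII | 502 | 41.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 220 | 43.8%
1 | 49 | 9.8%
- | 39 | 7.8%
3 | 35 | 7.0%
2 | 28 | 5.6%
4 | 26 | 5.2%
9 | 21 | 4.2%
0 | 21 | 4.2%
8 | 18 | 3.6%
7 | 16 | 3.2%
Other values (2) | 29 | 5.8%

Hangul
Value | Count | Frequency (%)
(Hangul character) | 62 | 8.6%
(Hangul character) | 59 | 8.2%
(Hangul character) | 58 | 8.1%
(Hangul character) | 58 | 8.1%
(Hangul character) | 57 | 7.9%
(Hangul character) | 57 | 7.9%
(Hangul character) | 57 | 7.9%
(Hangul character) | 33 | 4.6%
(Hangul character) | 26 | 3.6%
(Hangul character) | 23 | 3.2%
Other values (51) | 228 | 31.8%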

연면적(제곱미터)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 55
Distinct (%): 96.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5769.8011
Minimum: 104.61
Maximum: 191739.55
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 645.0 B

Quantile statistics

Minimum: 104.61
5-th percentile: 124.8
Q1: 216.3
median: 386.93
Q3: 923.95
95-th percentile: 11269.597
Maximum: 191739.55
Range: 191634.94
Interquartile range (IQR): 707.65
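
The range and IQR above follow directly from the listed quantiles. The sketch below reproduces them and, purely as an illustration, derives an upper outlier fence with Tukey's 1.5 × IQR rule. That rule is a common convention and is not something the report itself applies.

```python
q1, q3 = 216.3, 923.95
minimum, maximum = 104.61, 191739.55

iqr = q3 - q1                  # interquartile range
value_range = maximum - minimum

# Illustrative only: Tukey's rule flags values above Q3 + 1.5 * IQR.
upper_fence = q3 + 1.5 * iqr

print(round(iqr, 2), round(value_range, 2), round(upper_fence, 3))
```

With this fence, floor areas above roughly 1985 m² would be flagged, which shows how far the largest values sit from the bulk of the distribution.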

Descriptive statistics

Standard deviation: 26707.361
Coefficient of variation (CV): 4.6288183
Kurtosis: 43.950234
Mean: 5769.8011
Median Absolute Deviation (MAD): 192.05
Skewness: 6.4282592
Sum: 328878.66
Variance: 7.1328313 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
124.8 | 3 | 5.3%
176.4 | 1 | 1.8%
425.73 | 1 | 1.8%
1977.52 | 1 | 1.8%
104.61 | 1 | 1.8%
386.93 | 1 | 1.8%
1398.48 | 1 | 1.8%
228.97 | 1 | 1.8%
411.31 | 1 | 1.8%
1038.0 | 1 | 1.8%
Other values (45) | 45 | 78.9%

Minimum 10 values

Value | Count | Frequency (%)
104.61 | 1 | 1.8%
124.8 | 3 | 5.3%
126.37 | 1 | 1.8%
135.6 | 1 | 1.8%
142.72 | 1 | 1.8%
160.46 | 1 | 1.8%
176.4 | 1 | 1.8%
194.88 | 1 | 1.8%
194.93 | 1 | 1.8%
196.0 | 1 | 1.8%

Maximum 10 values

Value | Count | Frequency (%)
191739.55 | 1 | 1.8%
60520.8878 | 1 | 1.8%
36401.3453 | 1 | 1.8%
4986.66 | 1 | 1.8%
4845.098 | 1 | 1.8%
4027.4 | 1 | 1.8%
1977.52 | 1 | 1.8%
1958.4 | 1 | 1.8%
1586.78 | 1 | 1.8%
1499.16 | 1 | 1.8%

최대지상층수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 7
Distinct (%): 12.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.5614035
Minimum: 1
Maximum: 39
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 645.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 2
Q3: 4
95-th percentile: 5
Maximum: 39
Range: 38
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 6.0916147
Coefficient of variation (CV): 1.7104534
Kurtosis: 25.659322
Mean: 3.5614035
Median Absolute Deviation (MAD): 1
Skewness: 4.9620061
Sum: 203
Variance: 37.107769
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
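
Common values

Value | Count | Frequency (%)
1 | 19 | 33.3%
2 | 15 | 26.3%
4 | 13 | 22.8%
5 | 5 | 8.8%
3 | 3 | 5.3%
29 | 1 | 1.8%
39 | 1 | 1.8%

Minimum 10 values

Value | Count | Frequency (%)
1 | 19 | 33.3%
2 | 15 | 26.3%
3 | 3 | 5.3%
4 | 13 | 22.8%
5 | 5 | 8.8%
29 | 1 | 1.8%
39 | 1 | 1.8%

Maximum 10 values

Value | Count | Frequency (%)
39 | 1 | 1.8%
29 | 1 | 1.8%
5 | 5 | 8.8%
4 | 13 | 22.8%
3 | 3 | 5.3%
2 | 15 | 26.3%
1 | 19 | 33.3%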
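
Because 최대지상층수 takes only seven distinct values, its complete frequency table is available above and the descriptive statistics can be recomputed from it. The sketch below uses the standard library and assumes the reported variance is the sample variance (divisor n - 1), which is the definition that matches the reported figure.

```python
import statistics

# Value -> count, copied from the frequency table for 최대지상층수.
floor_counts = {1: 19, 2: 15, 3: 3, 4: 13, 5: 5, 29: 1, 39: 1}

# Expand the table back into the 57 individual observations.
values = [value for value, count in floor_counts.items() for _ in range(count)]

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, divisor n - 1
std = statistics.stdev(values)
cv = std / mean                         # coefficient of variation
median = statistics.median(values)

print(len(values), sum(values), round(mean, 7), round(variance, 6), round(cv, 7), median)
```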

최대지하층수
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Memory size: 588.0 B
Value counts: 0 (50), 1 (4), 4 (2), 5 (1)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 1.8%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 1
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 50 | 87.7%
1 | 4 | 7.0%
4 | 2 | 3.5%
5 | 1 | 1.8%


주용도
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Memory size: 588.0 B
Value counts: 제2종근린생활시설 (45), 운동시설 (6), 공동주택 (5), 숙박시설 (1)

Length

Max length: 9
Median length: 9
Mean length: 7.9473684
Min length: 4

Unique

Unique: 1
Unique (%): 1.8%

Sample

1st row: 제2종근린생활시설
2nd row: 제2종근린생활시설
3rd row: 제2종근린생활시설
4th row: 제2종근린생활시설
5th row: 제2종근린생활시설

Common Values

Value | Count | Frequency (%)
제2종근린생활시설 | 45 | 78.9%
운동시설 | 6 | 10.5%
공동주택 | 5 | 8.8%
숙박시설 | 1 | 1.8%


부속용도
Text

MISSING 

Distinct: 33
Distinct (%): 67.3%
Missing: 8
Missing (%): 14.0%
Memory size: 588.0 B

Length

Max length: 35
Median length: 19
Mean length: 7.8571429
Min length: 3

Characters and Unicode

Total characters: 385
Distinct characters: 75
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 26
Unique (%): 53.1%

Sample

1st row: 사무소
2nd row: 사무소
3rd row: (일반음식점),다가구주택
4th row: 휴게음식점,일반음식점
5th row: 다중생활시설

Words

Value | Count | Frequency (%)
사무소 | 12 | 19.7%
일반음식점 | 9 | 14.8%
(blank) | 5 | 8.2%
제조업소 | 4 | 6.6%
수리점 | 3 | 4.9%
근린생활시설 | 2 | 3.3%
다중생활시설 | 2 | 3.3%
사무실 | 2 | 3.3%
단독주택 | 2 | 3.3%
스크린골프연습장 | 1 | 1.6%
Other values (19) | 19 | 31.1%

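Most occurring characters

Value | Count | Frequency (%)
(Hangul character) | 22 | 5.7%
(Hangul character) | 20 | 5.2%
(Hangul character) | 17 | 4.4%
(Hangul character) | 16 | 4.2%
(Hangul character) | 16 | 4.2%
(Hangul character) | 16 | 4.2%
( | 13 | 3.4%
(space) | 12 | 3.1%
) | 12 | 3.1%
(Hangul character) | 11 | 2.9%
Other values (65) | 230 | 59.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 331 | 86.0%
Open Punctuation | 13 | 3.4%
Space Separator | 12 | 3.1%
Close Punctuation | 12 | 3.1%
Other Punctuation | 12 | 3.1%
Decimal Number | 5 | 1.3%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(Hangul character) | 22 | 6.6%
(Hangul character) | 20 | 6.0%
(Hangul character) | 17 | 5.1%
(Hangul character) | 16 | 4.8%
(Hangul character) | 16 | 4.8%
(Hangul character) | 16 | 4.8%
(Hangul character) | 11 | 3.3%
(Hangul character) | 11 | 3.3%
(Hangul character) | 11 | 3.3%
(Hangul character) | 11 | 3.3%
Other values (57) | 180 | 54.4%

Other Punctuation
Value | Count | Frequency (%)
, | 7 | 58.3%
/ | 4 | 33.3%
. | 1 | 8.3%

Decimal Number
Value | Count | Frequency (%)
1 | 4 | 80.0%
2 | 1 | 20.0%

Open Punctuation
Value | Count | Frequency (%)
( | 13 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 12 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 12 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 331 | 86.0%
Common | 54 | 14.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(Hangul character) | 22 | 6.6%
(Hangul character) | 20 | 6.0%
(Hangul character) | 17 | 5.1%
(Hangul character) | 16 | 4.8%
(Hangul character) | 16 | 4.8%
(Hangul character) | 16 | 4.8%
(Hangul character) | 11 | 3.3%
(Hangul character) | 11 | 3.3%
(Hangul character) | 11 | 3.3%
(Hangul character) | 11 | 3.3%
Other values (57) | 180 | 54.4%

Common
Value | Count | Frequency (%)
( | 13 | 24.1%
(space) | 12 | 22.2%
) | 12 | 22.2%
, | 7 | 13.0%
1 | 4 | 7.4%
/ | 4 | 7.4%
2 | 1 | 1.9%
. | 1 | 1.9%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 331 | 86.0%
ASCII | 54 | 14.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(Hangul character) | 22 | 6.6%
(Hangul character) | 20 | 6.0%
(Hangul character) | 17 | 5.1%
(Hangul character) | 16 | 4.8%
(Hangul character) | 16 | 4.8%
(Hangul character) | 16 | 4.8%
(Hangul character) | 11 | 3.3%
(Hangul character) | 11 | 3.3%
(Hangul character) | 11 | 3.3%
(Hangul character) | 11 | 3.3%
Other values (57) | 180 | 54.4%

ASCII
Value | Count | Frequency (%)
( | 13 | 24.1%
(space) | 12 | 22.2%
) | 12 | 22.2%
, | 7 | 13.0%
1 | 4 | 7.4%
/ | 4 | 7.4%
2 | 1 | 1.9%
. | 1 | 1.9%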

총주차대수
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 21
Distinct (%): 38.2%
Missing: 2
Missing (%): 3.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 47.363636
Minimum: 0
Maximum: 1428
Zeros: 6
Zeros (%): 10.5%
Negative: 0
Negative (%): 0.0%
Memory size: 645.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
median: 3
Q3: 12
95-th percentile: 140.4
Maximum: 1428
Range: 1428
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 200.49054
Coefficient of variation (CV): 4.2330057
Kurtosis: 43.617983
Mean: 47.363636
Median Absolute Deviation (MAD): 2
Skewness: 6.4016219
Sum: 2605
Variance: 40196.458
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=21)

Common values

Value | Count | Frequency (%)
2 | 11 | 19.3%
3 | 8 | 14.0%
1 | 6 | 10.5%
0 | 6 | 10.5%
4 | 5 | 8.8%
12 | 4 | 7.0%
7 | 1 | 1.8%
10 | 1 | 1.8%
95 | 1 | 1.8%
34 | 1 | 1.8%
Other values (11) | 11 | 19.3%
(Missing) | 2 | 3.5%

Minimum 10 values

Value | Count | Frequency (%)
0 | 6 | 10.5%
1 | 6 | 10.5%
2 | 11 | 19.3%
3 | 8 | 14.0%
4 | 5 | 8.8%
5 | 1 | 1.8%
7 | 1 | 1.8%
10 | 1 | 1.8%
11 | 1 | 1.8%
12 | 4 | 7.0%

Maximum 10 values

Value | Count | Frequency (%)
1428 | 1 | 1.8%
415 | 1 | 1.8%
237 | 1 | 1.8%
99 | 1 | 1.8%
95 | 1 | 1.8%
58 | 1 | 1.8%
34 | 1 | 1.8%
28 | 1 | 1.8%
27 | 1 | 1.8%
16 | 1 | 1.8%

세대수
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 6
Distinct (%): 75.0%
Missing: 49
Missing (%): 86.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 61.5
Minimum: 1
Maximum: 261
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 645.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 18
Q3: 61
95-th percentile: 227.75
Maximum: 261
Range: 260
Interquartile range (IQR): 60

Descriptive statistics

Standard deviation: 97.700709
Coefficient of variation (CV): 1.5886294
Kurtosis: 1.6966353
Mean: 61.5
Median Absolute Deviation (MAD): 17
Skewness: 1.6744305
Sum: 492
Variance: 9545.4286
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
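
Common values

Value | Count | Frequency (%)
1 | 3 | 5.3%
24 | 1 | 1.8%
261 | 1 | 1.8%
166 | 1 | 1.8%
26 | 1 | 1.8%
12 | 1 | 1.8%
(Missing) | 49 | 86.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 3 | 5.3%
12 | 1 | 1.8%
24 | 1 | 1.8%
26 | 1 | 1.8%
166 | 1 | 1.8%
261 | 1 | 1.8%

Maximum 10 values

Value | Count | Frequency (%)
261 | 1 | 1.8%
166 | 1 | 1.8%
26 | 1 | 1.8%
24 | 1 | 1.8%
12 | 1 | 1.8%
1 | 3 | 5.3%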
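
Only eight 세대수 values are non-missing, so the quantile statistics can be recomputed by hand from the frequency table. The sketch below assumes quantiles use linear interpolation between order statistics, which reproduces the reported Q3 and 95th percentile; `quantile` is a hypothetical helper, not a library function.

```python
import statistics

# The eight non-missing 세대수 values, expanded from the frequency table.
households = sorted([1, 1, 1, 12, 24, 26, 166, 261])

def quantile(sorted_values, q):
    """Quantile with linear interpolation between neighbouring order statistics."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])

median = statistics.median(households)
# Median absolute deviation: median distance from the median.
mad = statistics.median(abs(x - median) for x in households)

print(median, mad, quantile(households, 0.25), quantile(households, 0.75))
```

With so few observations, each quantile is driven by one or two rows, which is worth keeping in mind when reading the skewness and kurtosis figures.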

가구수
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Memory size: 588.0 B
Value counts: <NA> (48), 1 (8), 2 (1)

Length

Max length: 4
Median length: 4
Mean length: 3.5263158
Min length: 1

Unique

Unique: 1
Unique (%): 1.8%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: 2
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 48 | 84.2%
1 | 8 | 14.0%
2 | 1 | 1.8%

설계사무소명
Text

Distinct: 43
Distinct (%): 75.4%
Missing: 0
Missing (%): 0.0%
Memory size: 588.0 B

Length

Max length: 21
Median length: 16
Mean length: 9.7894737
Min length: 7

Characters and Unicode

Total characters: 558
Distinct characters: 93
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 36
Unique (%): 63.2%

Sample

1st row: 모두정현재건축사사무소
2nd row: 예진종합건축사(사)
3rd row: 삼우건축사사무소
4th row: 주식회사아키랜드정영필건축사무소
5th row: 누보건축사사무소

Words

Value | Count | Frequency (%)
도원ds건축사사무소 | 5 | 8.1%
건축사사무소 | 5 | 8.1%
누보건축사사무소 | 4 | 6.5%
모두장재순건축사사무소 | 3 | 4.8%
건축사사무소소윤 | 3 | 4.8%
삼우건축사사무소 | 2 | 3.2%
주)아키랜드건축사사무소 | 2 | 3.2%
태영종합건축사사무소 | 2 | 3.2%
조은허기윤건축사사무소 | 1 | 1.6%
모두정현재건축사사무소 | 1 | 1.6%
Other values (34) | 34 | 54.8%

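Most occurring characters

Value | Count | Frequency (%)
(Hangul character) | 114 | 20.4%
(Hangul character) | 59 | 10.6%
(Hangul character) | 57 | 10.2%
(Hangul character) | 57 | 10.2%
(Hangul character) | 56 | 10.0%
(Hangul character) | 9 | 1.6%
) | 8 | 1.4%
( | 8 | 1.4%
(Hangul character) | 8 | 1.4%
(Hangul character) | 7 | 1.3%
Other values (83) | 175 | 31.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 520 | 93.2%
Uppercase Letter | 17 | 3.0%
Close Punctuation | 8 | 1.4%
Open Punctuation | 8 | 1.4%
Space Separator | 5 | 0.9%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(Hangul character) | 114 | 21.9%
(Hangul character) | 59 | 11.3%
(Hangul character) | 57 | 11.0%
(Hangul character) | 57 | 11.0%
(Hangul character) | 56 | 10.8%
(Hangul character) | 9 | 1.7%
(Hangul character) | 8 | 1.5%
(Hangul character) | 7 | 1.3%
(Hangul character) | 7 | 1.3%
(Hangul character) | 6 | 1.2%
Other values (72) | 140 | 26.9%

Uppercase Letter
Value | Count | Frequency (%)
S | 6 | 35.3%
D | 5 | 29.4%
O | 1 | 5.9%
A | 1 | 5.9%
K | 1 | 5.9%
M | 1 | 5.9%
G | 1 | 5.9%
T | 1 | 5.9%

Close Punctuation
Value | Count | Frequency (%)
) | 8 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 8 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 5 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 520 | 93.2%
Common | 21 | 3.8%
Latin | 17 | 3.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(Hangul character) | 114 | 21.9%
(Hangul character) | 59 | 11.3%
(Hangul character) | 57 | 11.0%
(Hangul character) | 57 | 11.0%
(Hangul character) | 56 | 10.8%
(Hangul character) | 9 | 1.7%
(Hangul character) | 8 | 1.5%
(Hangul character) | 7 | 1.3%
(Hangul character) | 7 | 1.3%
(Hangul character) | 6 | 1.2%
Other values (72) | 140 | 26.9%

Latin
Value | Count | Frequency (%)
S | 6 | 35.3%
D | 5 | 29.4%
O | 1 | 5.9%
A | 1 | 5.9%
K | 1 | 5.9%
M | 1 | 5.9%
G | 1 | 5.9%
T | 1 | 5.9%

Common
Value | Count | Frequency (%)
) | 8 | 38.1%
( | 8 | 38.1%
(space) | 5 | 23.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 520 | 93.2%
ASCII | 38 | 6.8%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(Hangul character) | 114 | 21.9%
(Hangul character) | 59 | 11.3%
(Hangul character) | 57 | 11.0%
(Hangul character) | 57 | 11.0%
(Hangul character) | 56 | 10.8%
(Hangul character) | 9 | 1.7%
(Hangul character) | 8 | 1.5%
(Hangul character) | 7 | 1.3%
(Hangul character) | 7 | 1.3%
(Hangul character) | 6 | 1.2%
Other values (72) | 140 | 26.9%

ASCII
Value | Count | Frequency (%)
) | 8 | 21.1%
( | 8 | 21.1%
S | 6 | 15.8%
(space) | 5 | 13.2%
D | 5 | 13.2%
O | 1 | 2.6%
A | 1 | 2.6%
K | 1 | 2.6%
M | 1 | 2.6%
G | 1 | 2.6%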

시공자사무소명
Text

MISSING 

Distinct: 24
Distinct (%): 100.0%
Missing: 33
Missing (%): 57.9%
Memory size: 588.0 B

Length

Max length: 11
Median length: 10
Mean length: 8.75
Min length: 6

Characters and Unicode

Total characters: 210
Distinct characters: 50
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 24
Unique (%): 100.0%

Sample

1st row: 정운건업(주)
2nd row: 극동건설(주)
3rd row: 주식회사 누보건설
4th row: (주)하람종합건설
5th row: (주)대우건설

Words

Value | Count | Frequency (%)
주식회사 | 5 | 16.7%
정운건업(주 | 1 | 3.3%
태산종합건설(주 | 1 | 3.3%
주)명린종합건설 | 1 | 3.3%
거능종합건설(주 | 1 | 3.3%
우주 | 1 | 3.3%
(blank) | 1 | 3.3%
주)주안종합건설 | 1 | 3.3%
비오엠건설(주 | 1 | 3.3%
주)정안건설 | 1 | 3.3%
Other values (16) | 16 | 53.3%

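Most occurring characters

Value | Count | Frequency (%)
(Hangul character) | 26 | 12.4%
(Hangul character) | 23 | 11.0%
(Hangul character) | 21 | 10.0%
( | 17 | 8.1%
) | 17 | 8.1%
(Hangul character) | 12 | 5.7%
(Hangul character) | 12 | 5.7%
(Hangul character) | 7 | 3.3%
(Hangul character) | 7 | 3.3%
(Hangul character) | 7 | 3.3%
Other values (40) | 61 | 29.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 170 | 81.0%
Open Punctuation | 17 | 8.1%
Close Punctuation | 17 | 8.1%
Space Separator | 6 | 2.9%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(Hangul character) | 26 | 15.3%
(Hangul character) | 23 | 13.5%
(Hangul character) | 21 | 12.4%
(Hangul character) | 12 | 7.1%
(Hangul character) | 12 | 7.1%
(Hangul character) | 7 | 4.1%
(Hangul character) | 7 | 4.1%
(Hangul character) | 7 | 4.1%
(Hangul character) | 3 | 1.8%
(Hangul character) | 3 | 1.8%
Other values (37) | 49 | 28.8%

Open Punctuation
Value | Count | Frequency (%)
( | 17 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 17 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 6 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 170 | 81.0%
Common | 40 | 19.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(Hangul character) | 26 | 15.3%
(Hangul character) | 23 | 13.5%
(Hangul character) | 21 | 12.4%
(Hangul character) | 12 | 7.1%
(Hangul character) | 12 | 7.1%
(Hangul character) | 7 | 4.1%
(Hangul character) | 7 | 4.1%
(Hangul character) | 7 | 4.1%
(Hangul character) | 3 | 1.8%
(Hangul character) | 3 | 1.8%
Other values (37) | 49 | 28.8%

Common
Value | Count | Frequency (%)
( | 17 | 42.5%
) | 17 | 42.5%
(space) | 6 | 15.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 170 | 81.0%
ASCII | 40 | 19.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(Hangul character) | 26 | 15.3%
(Hangul character) | 23 | 13.5%
(Hangul character) | 21 | 12.4%
(Hangul character) | 12 | 7.1%
(Hangul character) | 12 | 7.1%
(Hangul character) | 7 | 4.1%
(Hangul character) | 7 | 4.1%
(Hangul character) | 7 | 4.1%
(Hangul character) | 3 | 1.8%
(Hangul character) | 3 | 1.8%
Other values (37) | 49 | 28.8%

ASCII
Value | Count | Frequency (%)
( | 17 | 42.5%
) | 17 | 42.5%
(space) | 6 | 15.0%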

Interactions


Correlations

Variable | 허가일 | 착공처리일 | 착공예정일 | 준공예정일(사용승인예정일) | 대지위치 | 연면적(제곱미터) | 최대지상층수 | 최대지하층수 | 주용도 | 부속용도 | 총주차대수 | 세대수 | 가구수 | 설계사무소명 | 시공자사무소명
허가일 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
착공처리일 | 1.000 | 1.000 | 1.000 | 0.968 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.955 | 1.000 | 1.000 | 1.000 | 0.956 | 1.000
착공예정일 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
준공예정일(사용승인예정일) | 1.000 | 0.968 | 1.000 | 1.000 | 1.000 | 1.000 | 0.751 | 1.000 | 0.907 | 0.875 | 1.000 | 1.000 | 1.000 | 0.947 | 1.000
대지위치 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
연면적(제곱미터) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.980 | 0.979 | 0.933 | 1.000 | 1.000 | 1.000 | NaN | 1.000 | 1.000
최대지상층수 | 1.000 | 1.000 | 1.000 | 0.751 | 1.000 | 0.980 | 1.000 | 0.945 | 0.912 | 0.980 | 0.979 | 0.900 | NaN | 0.939 | 1.000
최대지하층수 | 1.000 | 0.000 | 1.000 | 1.000 | 1.000 | 0.979 | 0.945 | 1.000 | 0.854 | 1.000 | 0.979 | 1.000 | NaN | 0.975 | 1.000
주용도 | 1.000 | 0.000 | 1.000 | 0.907 | 1.000 | 0.933 | 0.912 | 0.854 | 1.000 | 1.000 | 0.932 | 0.000 | 0.000 | 0.906 | 1.000
부속용도 | 1.000 | 0.955 | 1.000 | 0.875 | 1.000 | 1.000 | 0.980 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.592 | 1.000
총주차대수 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.979 | 0.979 | 0.932 | 1.000 | 1.000 | 1.000 | NaN | 1.000 | 1.000
세대수 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.900 | 1.000 | 0.000 | 1.000 | 1.000 | 1.000 | NaN | 1.000 | 1.000
가구수 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | NaN | NaN | NaN | 0.000 | 1.000 | NaN | NaN | 1.000 | 1.000 | NaN
설계사무소명 | 1.000 | 0.956 | 1.000 | 0.947 | 1.000 | 1.000 | 0.939 | 0.975 | 0.906 | 0.592 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
시공자사무소명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | NaN | 1.000 | 1.000

Variable | 가구수 | 최대지하층수 | 주용도
가구수 | 1.000 | 1.000 | 0.000
최대지하층수 | 1.000 | 1.000 | 0.510
주용도 | 0.000 | 0.510 | 1.000

Variable | 연면적(제곱미터) | 최대지상층수 | 총주차대수 | 세대수 | 최대지하층수 | 주용도 | 가구수
연면적(제곱미터) | 1.000 | 0.707 | 0.814 | 0.976 | 0.805 | 0.655 | 1.000
최대지상층수 | 0.707 | 1.000 | 0.673 | 0.478 | 0.687 | 0.610 | 1.000
총주차대수 | 0.814 | 0.673 | 1.000 | 0.982 | 0.804 | 0.654 | 1.000
세대수 | 0.976 | 0.478 | 0.982 | 1.000 | 1.000 | 0.000 | 0.000
최대지하층수 | 0.805 | 0.687 | 0.804 | 1.000 | 1.000 | 0.510 | 1.000
주용도 | 0.655 | 0.610 | 0.654 | 0.000 | 0.510 | 1.000 | 0.000
가구수 | 1.000 | 1.000 | 1.000 | 0.000 | 1.000 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
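
One way to read "nullity correlation" is as the Pearson correlation between the present/missing indicators of two columns. The sketch below takes that reading as an assumption (the report does not spell out the formula) and applies it to three columns from the first ten sample rows; `nullity_correlation` is a hypothetical helper.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between presence indicators (1 = present, 0 = missing)."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # undefined when a column is always or never missing
    return cov / math.sqrt(var_a * var_b)

# First ten sample rows; None stands for <NA>.
start_processed = ["2018-03-13", "2021-11-01", "2021-07-15", None, None,
                   None, None, "2023-06-12", None, None]      # 착공처리일
start_planned = ["2018-03-14", "2021-11-01", "2021-07-16", None, None,
                 None, None, "2023-06-13", None, None]        # 착공예정일
builder = [None] * 7 + ["정운건업(주)"] + [None] * 2          # 시공자사무소명

print(nullity_correlation(start_processed, start_planned))
print(round(nullity_correlation(start_processed, builder), 3))
```

In these ten rows the two construction-start dates are always missing together, so their nullity correlation is 1, while the builder name is present far less often and correlates only moderately.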

Sample

First rows

index | 건축구분 | 허가일 | 착공처리일 | 착공예정일 | 사용승인일 | 준공예정일(사용승인예정일) | 대지위치 | 연면적(제곱미터) | 최대지상층수 | 최대지하층수 | 주용도 | 부속용도 | 총주차대수 | 세대수 | 가구수 | 설계사무소명 | 시공자사무소명
0 | 신축 | 2018-03-07 | 2018-03-13 | 2018-03-14 | <NA> | 2018-04-30 | 경상남도 진주시 하대동 71-21 | 176.4 | 1 | 0 | 제2종근린생활시설 | 사무소 | 1 | <NA> | <NA> | 모두정현재건축사사무소 | <NA>
1 | 신축 | 2018-06-19 | 2021-11-01 | 2021-11-01 | <NA> | 2022-10-30 | 경상남도 진주시 정촌면 예하리 1297-22 | 216.3 | 1 | 0 | 제2종근린생활시설 | 사무소 | 1 | <NA> | <NA> | 예진종합건축사(사) | <NA>
2 | 신축 | 2018-07-12 | 2021-07-15 | 2021-07-16 | <NA> | 2022-07-15 | 경상남도 진주시 인사동 219-39 | 477.66 | 2 | 0 | 제2종근린생활시설 | (일반음식점),다가구주택 | 4 | <NA> | 2 | 삼우건축사사무소 | <NA>
3 | 신축 | 2020-08-05 | <NA> | <NA> | <NA> | <NA> | 경상남도 진주시 금산면 장사리 902 외2필지 | 768.04 | 4 | 1 | 제2종근린생활시설 | 휴게음식점,일반음식점 | 5 | <NA> | <NA> | 주식회사아키랜드정영필건축사무소 | <NA>
4 | 신축 | 2020-09-25 | <NA> | <NA> | <NA> | <NA> | 경상남도 진주시 상봉동 834-5 외1필지 | 447.4 | 5 | 0 | 제2종근린생활시설 | 다중생활시설 | 4 | <NA> | <NA> | 누보건축사사무소 | <NA>
5 | 신축 | 2021-03-16 | <NA> | <NA> | <NA> | <NA> | 경상남도 진주시 가좌동 1034-1 | 1958.4 | 4 | 0 | 공동주택 | 다세대 | 27 | 24 | <NA> | 조은정일현건축사사무소 | <NA>
6 | 신축 | 2021-04-22 | <NA> | <NA> | <NA> | <NA> | 경상남도 진주시 상대동 306-30 | 194.88 | 2 | 0 | 제2종근린생활시설 | 동물병원 | 1 | <NA> | <NA> | 도원DS건축사사무소 | <NA>
7 | 신축 | 2021-04-23 | 2023-06-12 | 2023-06-13 | <NA> | 2023-08-31 | 경상남도 진주시 진성면 동산리 139-7 외1필지 | 295.0 | 1 | 0 | 제2종근린생활시설 | 농기계수리점 | <NA> | <NA> | <NA> | (주)아키랜드건축사사무소 | 정운건업(주)
8 | 신축 | 2021-06-30 | <NA> | <NA> | <NA> | <NA> | 경상남도 진주시 하대동 306-9 | 661.63 | 4 | 0 | 제2종근린생활시설 | <NA> | 4 | <NA> | 1 | 건축사사무소모아SM | <NA>
9 | 신축 | 2021-07-01 | <NA> | <NA> | <NA> | <NA> | 경상남도 진주시 옥봉동 687 | 299.72 | 3 | 0 | 제2종근린생활시설 | (사무소) | 2 | <NA> | <NA> | 삼우건축사사무소 | <NA>

Last rows

index | 건축구분 | 허가일 | 착공처리일 | 착공예정일 | 사용승인일 | 준공예정일(사용승인예정일) | 대지위치 | 연면적(제곱미터) | 최대지상층수 | 최대지하층수 | 주용도 | 부속용도 | 총주차대수 | 세대수 | 가구수 | 설계사무소명 | 시공자사무소명
47 | 신축 | 2023-05-16 | 2023-08-18 | 2023-06-27 | <NA> | 2024-02-26 | 경상남도 진주시 하대동 680 외4필지 | 4986.66 | 4 | 1 | 운동시설 | 골프연습장/제1종근린생활시설(휴게음식점) | 95 | <NA> | <NA> | 건축사사무소소윤 | 거능종합건설(주)
48 | 신축 | 2023-05-17 | <NA> | <NA> | <NA> | <NA> | 경상남도 진주시 상대동 834 외3필지 | 307.59 | 2 | 0 | 운동시설 | 스포츠클라이밍장 | 0 | <NA> | <NA> | 마루건축사사무소 | <NA>
49 | 신축 | 2023-05-18 | <NA> | <NA> | <NA> | <NA> | 경상남도 진주시 가좌동 1809 외4필지 | 199.66 | 1 | 0 | 제2종근린생활시설 | 제조업소 | 1 | <NA> | <NA> | 건축사사무소 와이 | <NA>
50 | 신축 | 2023-05-25 | 2023-06-23 | 2023-06-19 | <NA> | 2023-12-29 | 경상남도 진주시 금산면 중천리 267-1 | 142.72 | 1 | 0 | 제2종근린생활시설 | 사무실 | 1 | <NA> | <NA> | 건축사사무소대영 | <NA>
51 | 신축 | 2023-05-26 | 2023-08-18 | 2023-08-11 | <NA> | 2023-12-31 | 경상남도 진주시 초전동 853-5 외3필지 | 194.93 | 2 | 0 | 제2종근린생활시설 | 일반음식점 | 2 | <NA> | 1 | 위인건축사사무소 | <NA>
52 | 신축 | 2023-06-13 | 2023-07-06 | 2023-07-07 | <NA> | 2023-12-31 | 경상남도 진주시 집현면 사촌리 86-2 | 160.46 | 1 | 0 | 제2종근린생활시설 | 사무소 | 1 | <NA> | <NA> | 건축사사무소강남 | <NA>
53 | 신축 | 2023-07-04 | 2023-07-21 | 2023-07-21 | <NA> | 2024-07-19 | 경상남도 진주시 이현동 23-33 | 305.94 | 2 | 0 | 제2종근린생활시설 | 수리점, 단독주택 | 2 | <NA> | 1 | 가륜건축사사무소 | (주)명린종합건설
54 | 신축 | 2023-07-14 | 2023-08-02 | 2023-08-03 | <NA> | 2023-10-31 | 경상남도 진주시 금산면 장사리 1017-2 | 198.42 | 1 | 0 | 제2종근린생활시설 | 일반음식점 | 2 | <NA> | <NA> | OK건축사사무소 | <NA>
55 | 신축 | 2023-07-21 | 2023-08-11 | 2023-08-08 | <NA> | 2024-08-07 | 경상남도 진주시 하대동 674-1 외6필지 | 1050.0 | 1 | 0 | 제2종근린생활시설 | 제2종근린생활시설(휴게음식점,사무소)/제1종근린생활시설(소매점) | 10 | <NA> | <NA> | 건축사사무소소윤 | 태건건설(주)
56 | 신축 | 2023-07-26 | <NA> | <NA> | <NA> | <NA> | 경상남도 진주시 초전동 694 외11필지 | 923.95 | 1 | 1 | 제2종근린생활시설 | 일반음식점, 휴게음식점 | 0 | <NA> | <NA> | 조은서지영건축사사무소 | <NA>

Duplicate rows

Most frequently occurring

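index | 건축구분 | 허가일 | 착공처리일 | 착공예정일 | 준공예정일(사용승인예정일) | 대지위치 | 연면적(제곱미터) | 최대지상층수 | 최대지하층수 | 주용도 | 부속용도 | 총주차대수 | 세대수 | 가구수 | 설계사무소명 | 시공자사무소명 | # duplicates
0 | 신축 | 2021-08-02 | <NA> | <NA> | <NA> | 경상남도 진주시 평거동 340 | 124.8 | 1 | 0 | 제2종근린생활시설 | (일반음식점) | 2 | <NA> | <NA> | 도원DS건축사사무소 | <NA> | 3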
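
A duplicate-row table of this kind can be produced by counting identical rows. The sketch below uses only the standard library and a hypothetical four-column subset of the data (건축구분, 허가일, 대지위치, 연면적); it illustrates the counting step and is not the profiling tool's own implementation.

```python
from collections import Counter

# Hypothetical subset of columns: (건축구분, 허가일, 대지위치, 연면적(제곱미터)).
rows = [
    ("신축", "2021-08-02", "경상남도 진주시 평거동 340", 124.8),
    ("신축", "2021-08-02", "경상남도 진주시 평거동 340", 124.8),
    ("신축", "2021-08-02", "경상남도 진주시 평거동 340", 124.8),
    ("신축", "2018-03-07", "경상남도 진주시 하대동 71-21", 176.4),
]

# Count each distinct row and keep only those that occur more than once.
counts = Counter(rows)
duplicates = {row: n for row, n in counts.items() if n > 1}

for row, n in duplicates.items():
    print(row, n)
```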