Overview

Dataset statistics

Number of variables: 8
Number of observations: 39
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1
Duplicate rows (%): 2.6%
Total size in memory: 2.7 KiB
Average record size in memory: 70.4 B

Variable types

Categorical: 3
Text: 1
Numeric: 3
DateTime: 1

Dataset

Description: Public data on building permits issued in 2022, covering site location (대지위치), land category (지목), site area (대지면적), building area (건축면적), total floor area (연면적), permit date (허가일), and main use (주용도).
Author: 충청남도
URL: https://alldam.chungnam.go.kr/index.chungnam?menuCd=DOM_000000201001001001&st=&cds=&orgCd=&apiType=&isOpen=Y&pageIndex=310&beforeMenuCd=DOM_000000201001001000&publicdatapk=15093863

Alerts

건축구분 has constant value "신축" (Constant)
Dataset has 1 (2.6%) duplicate row (Duplicates)
대지면적(제곱미터) is highly overall correlated with 건축면적(제곱미터) and 3 other fields (High correlation)
건축면적(제곱미터) is highly overall correlated with 대지면적(제곱미터) and 3 other fields (High correlation)
연면적(제곱미터) is highly overall correlated with 대지면적(제곱미터) and 3 other fields (High correlation)
지목 is highly overall correlated with 대지면적(제곱미터) and 3 other fields (High correlation)
주용도 is highly overall correlated with 대지면적(제곱미터) and 3 other fields (High correlation)

Reproduction

Analysis started: 2024-01-09 20:16:55.892806
Analysis finished: 2024-01-09 20:16:56.994314
Duration: 1.1 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
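
A report of this kind can be regenerated with ydata-profiling roughly as follows. This is a minimal sketch, not the configuration actually used: the CSV path, the UTF-8 encoding, the report title, and the output file name are all assumptions, and the real settings live in the config.json referenced above.

```python
from pathlib import Path


def build_profile(csv_path: str, out_path: str = "report.html") -> Path:
    """Profile a CSV export of the permit data and write an HTML report."""
    # Imported lazily so this module loads even without the heavy dependencies.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the file is UTF-8 encoded; adjust `encoding` if it is not.
    df = pd.read_csv(csv_path, encoding="utf-8", parse_dates=["허가일"])

    report = ProfileReport(
        df,
        title="2022 building permits",  # hypothetical title
        dataset={
            # Metadata shown in the "Dataset" block of the overview.
            "description": "2022 building permit data",
            "author": "충청남도",
        },
    )
    report.to_file(out_path)
    return Path(out_path)
```

Calling `build_profile("permits.csv")` (a hypothetical file name) would write `report.html` next to the script.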

Variables

건축구분
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 444.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 신축
2nd row: 신축
3rd row: 신축
4th row: 신축
5th row: 신축

Common Values

Value | Count | Frequency (%)
신축 | 39 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of common values; 신축 accounts for all 39 rows (100.0%)]

대지위치
Text

Distinct: 38
Distinct (%): 97.4%
Missing: 0
Missing (%): 0.0%
Memory size: 444.0 B

Length

Max length: 27
Median length: 25
Mean length: 21.230769
Min length: 17

Characters and Unicode

Total characters: 828
Distinct characters: 40
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
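
The category counts in this section can be approximated with Python's standard `unicodedata` module. The sketch below is an illustration rather than the profiler's own implementation; it tallies the general category of every character in the first sample address. The two-letter codes map to the labels used in the report: `Lo` is Other Letter, `Zs` is Space Separator, `Nd` is Decimal Number, and `Pd` is Dash Punctuation.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in `values`."""
    counts = Counter()
    for value in values:
        counts.update(unicodedata.category(ch) for ch in value)
    return counts


# First sample row of the 대지위치 column.
sample = ["충청남도 계룡시 엄사면 향한리 407-33"]
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```

Passing the full column instead of one row would give totals comparable to the "Most occurring categories" table.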

Unique

Unique: 37
Unique (%): 94.9%

Sample

1st row: 충청남도 계룡시 엄사면 향한리 407-33
2nd row: 충청남도 계룡시 두마면 입암리 650
3rd row: 충청남도 계룡시 두마면 입암리 650
4th row: 충청남도 계룡시 엄사면 도곡리 155-2
5th row: 충청남도 계룡시 금암동 156-7

Value | Count | Frequency (%)
충청남도 | 39 | 20.0%
계룡시 | 39 | 20.0%
엄사면 | 24 | 12.3%
엄사리 | 11 | 5.6%
두마면 | 10 | 5.1%
향한리 | 9 | 4.6%
입암리 | 5 | 2.6%
금암동 | 5 | 2.6%
도곡리 | 3 | 1.5%
농소리 | 3 | 1.5%
Other values (45) | 47 | 24.1%

Most occurring characters

Value | Count | Frequency (%)
(space) | 156 | 18.8%
 | 42 | 5.1%
 | 40 | 4.8%
 | 39 | 4.7%
 | 39 | 4.7%
 | 39 | 4.7%
 | 39 | 4.7%
 | 39 | 4.7%
 | 35 | 4.2%
 | 35 | 4.2%
Other values (30) | 325 | 39.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 507 | 61.2%
Space Separator | 156 | 18.8%
Decimal Number | 143 | 17.3%
Dash Punctuation | 22 | 2.7%

Most frequent character per category

Other Letter

Value | Count | Frequency (%)
 | 42 | 8.3%
 | 40 | 7.9%
 | 39 | 7.7%
 | 39 | 7.7%
 | 39 | 7.7%
 | 39 | 7.7%
 | 39 | 7.7%
 | 35 | 6.9%
 | 35 | 6.9%
 | 34 | 6.7%
Other values (18) | 126 | 24.9%

Decimal Number

Value | Count | Frequency (%)
1 | 23 | 16.1%
6 | 22 | 15.4%
3 | 17 | 11.9%
5 | 16 | 11.2%
9 | 15 | 10.5%
2 | 13 | 9.1%
7 | 13 | 9.1%
8 | 12 | 8.4%
4 | 8 | 5.6%
0 | 4 | 2.8%

Space Separator

Value | Count | Frequency (%)
(space) | 156 | 100.0%

Dash Punctuation

Value | Count | Frequency (%)
- | 22 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 507 | 61.2%
Common | 321 | 38.8%

Most frequent character per script

Hangul

Value | Count | Frequency (%)
 | 42 | 8.3%
 | 40 | 7.9%
 | 39 | 7.7%
 | 39 | 7.7%
 | 39 | 7.7%
 | 39 | 7.7%
 | 39 | 7.7%
 | 35 | 6.9%
 | 35 | 6.9%
 | 34 | 6.7%
Other values (18) | 126 | 24.9%

Common

Value | Count | Frequency (%)
(space) | 156 | 48.6%
1 | 23 | 7.2%
- | 22 | 6.9%
6 | 22 | 6.9%
3 | 17 | 5.3%
5 | 16 | 5.0%
9 | 15 | 4.7%
2 | 13 | 4.0%
7 | 13 | 4.0%
8 | 12 | 3.7%
Other values (2) | 12 | 3.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 507 | 61.2%
ASCII | 321 | 38.8%

Most frequent character per block

ASCII

Value | Count | Frequency (%)
(space) | 156 | 48.6%
1 | 23 | 7.2%
- | 22 | 6.9%
6 | 22 | 6.9%
3 | 17 | 5.3%
5 | 16 | 5.0%
9 | 15 | 4.7%
2 | 13 | 4.0%
7 | 13 | 4.0%
8 | 12 | 3.7%
Other values (2) | 12 | 3.7%

Hangul

Value | Count | Frequency (%)
 | 42 | 8.3%
 | 40 | 7.9%
 | 39 | 7.7%
 | 39 | 7.7%
 | 39 | 7.7%
 | 39 | 7.7%
 | 39 | 7.7%
 | 35 | 6.9%
 | 35 | 6.9%
 | 34 | 6.7%
Other values (18) | 126 | 24.9%

지목
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 12.8%
Missing: 0
Missing (%): 0.0%
Memory size: 444.0 B

Length

Max length: 4
Median length: 1
Mean length: 1.4102564
Min length: 1

Unique

Unique: 1
Unique (%): 2.6%

Sample

1st row:
2nd row: 공장용지
3rd row: 공장용지
4th row:
5th row:

Common Values

Value | Count | Frequency (%)
 | 30 | 76.9%
공장용지 | 4 | 10.3%
 | 2 | 5.1%
임야 | 2 | 5.1%
주차장 | 1 | 2.6%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values listed above]

대지면적(제곱미터)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 38
Distinct (%): 97.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1213.6538
Minimum: 225.9
Maximum: 5173
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 483.0 B

Quantile statistics

Minimum: 225.9
5th percentile: 236.84
Q1: 261.05
Median: 505
Q3: 1641.05
95th percentile: 4128.12
Maximum: 5173
Range: 4947.1
Interquartile range (IQR): 1380

Descriptive statistics

Standard deviation: 1406.2717
Coefficient of variation (CV): 1.1587091
Kurtosis: 1.3453593
Mean: 1213.6538
Median Absolute Deviation (MAD): 254.3
Skewness: 1.5757463
Sum: 47332.5
Variance: 1977600.1
Monotonicity: Not monotonic
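
Several of the figures for 대지면적(제곱미터) follow directly from others in the same tables. The sketch below recomputes them from the reported values only (no raw data is used), which is a quick way to check that the report is internally consistent. The variable names are illustrative.

```python
# Values copied from the report for 대지면적(제곱미터).
n = 39
minimum, maximum = 225.9, 5173.0
q1, q3 = 261.05, 1641.05
total = 47332.5
std = 1406.2717

value_range = maximum - minimum   # reported Range: 4947.1
iqr = q3 - q1                     # reported IQR: 1380
mean = total / n                  # reported Mean: 1213.6538
cv = std / mean                   # reported CV: 1.1587091
variance = std ** 2               # reported Variance: 1977600.1

print(round(value_range, 1), round(iqr, 2), round(mean, 4), round(cv, 5))
```

The same relations (range, IQR, mean, CV, variance) hold for the other two numeric variables.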
[Figure: Histogram with fixed size bins (bins=38)]

Common values

Value | Count | Frequency (%)
3359.2 | 2 | 5.1%
2668.0 | 1 | 2.6%
261.2 | 1 | 2.6%
441.6 | 1 | 2.6%
564.7 | 1 | 2.6%
3641.0 | 1 | 2.6%
1216.1 | 1 | 2.6%
519.0 | 1 | 2.6%
250.7 | 1 | 2.6%
237.3 | 1 | 2.6%
Other values (28) | 28 | 71.8%

Minimum 10 values

Value | Count | Frequency (%)
225.9 | 1 | 2.6%
232.7 | 1 | 2.6%
237.3 | 1 | 2.6%
243.8 | 1 | 2.6%
249.2 | 1 | 2.6%
249.9 | 1 | 2.6%
250.7 | 1 | 2.6%
253.5 | 1 | 2.6%
258.2 | 1 | 2.6%
260.9 | 1 | 2.6%

Maximum 10 values

Value | Count | Frequency (%)
5173.0 | 1 | 2.6%
4847.4 | 1 | 2.6%
4048.2 | 1 | 2.6%
3641.0 | 1 | 2.6%
3359.2 | 2 | 5.1%
2668.0 | 1 | 2.6%
2447.0 | 1 | 2.6%
2166.5 | 1 | 2.6%
2066.0 | 1 | 2.6%
1216.1 | 1 | 2.6%

건축면적(제곱미터)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 38
Distinct (%): 97.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 502.81134
Minimum: 72.17
Maximum: 3570.28
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 483.0 B

Quantile statistics

Minimum: 72.17
5th percentile: 86.727
Q1: 100.26
Median: 151.46
Q3: 515.73
95th percentile: 1797.15
Maximum: 3570.28
Range: 3498.11
Interquartile range (IQR): 415.47

Descriptive statistics

Standard deviation: 733.04658
Coefficient of variation (CV): 1.4578959
Kurtosis: 7.2622158
Mean: 502.81134
Median Absolute Deviation (MAD): 55.28
Skewness: 2.4957598
Sum: 19609.642
Variance: 537357.29
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=38)]

Common values

Value | Count | Frequency (%)
1797.15 | 2 | 5.1%
889.3921 | 1 | 2.6%
156.12 | 1 | 2.6%
87.81 | 1 | 2.6%
96.18 | 1 | 2.6%
1239.15 | 1 | 2.6%
151.87 | 1 | 2.6%
103.75 | 1 | 2.6%
99.0 | 1 | 2.6%
133.58 | 1 | 2.6%
Other values (28) | 28 | 71.8%

Minimum 10 values

Value | Count | Frequency (%)
72.17 | 1 | 2.6%
76.98 | 1 | 2.6%
87.81 | 1 | 2.6%
91.91 | 1 | 2.6%
94.72 | 1 | 2.6%
95.82 | 1 | 2.6%
96.18 | 1 | 2.6%
99.0 | 1 | 2.6%
99.02 | 1 | 2.6%
99.64 | 1 | 2.6%

Maximum 10 values

Value | Count | Frequency (%)
3570.28 | 1 | 2.6%
1797.15 | 2 | 5.1%
1697.46 | 1 | 2.6%
1548.17 | 1 | 2.6%
1239.15 | 1 | 2.6%
1189.32 | 1 | 2.6%
913.55 | 1 | 2.6%
889.3921 | 1 | 2.6%
524.2 | 1 | 2.6%
507.26 | 1 | 2.6%

연면적(제곱미터)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 37
Distinct (%): 94.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 796.06629
Minimum: 104.61
Maximum: 4880.63
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 483.0 B

Quantile statistics

Minimum: 104.61
5th percentile: 125.527
Q1: 151.8
Median: 198
Q3: 1073.76
95th percentile: 2668.957
Maximum: 4880.63
Range: 4776.02
Interquartile range (IQR): 921.96

Descriptive statistics

Standard deviation: 1061.3359
Coefficient of variation (CV): 1.3332255
Kurtosis: 4.783896
Mean: 796.06629
Median Absolute Deviation (MAD): 73.49
Skewness: 2.0845483
Sum: 31046.586
Variance: 1126433.8
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=37)]

Common values

Value | Count | Frequency (%)
1944.16 | 2 | 5.1%
198.0 | 2 | 5.1%
2629.4555 | 1 | 2.6%
382.2 | 1 | 2.6%
148.17 | 1 | 2.6%
125.64 | 1 | 2.6%
1678.76 | 1 | 2.6%
195.61 | 1 | 2.6%
148.55 | 1 | 2.6%
196.98 | 1 | 2.6%
Other values (27) | 27 | 69.2%

Minimum 10 values

Value | Count | Frequency (%)
104.61 | 1 | 2.6%
124.51 | 1 | 2.6%
125.64 | 1 | 2.6%
139.65 | 1 | 2.6%
142.31 | 1 | 2.6%
143.4 | 1 | 2.6%
148.17 | 1 | 2.6%
148.55 | 1 | 2.6%
149.82 | 1 | 2.6%
149.9 | 1 | 2.6%

Maximum 10 values

Value | Count | Frequency (%)
4880.63 | 1 | 2.6%
3024.47 | 1 | 2.6%
2629.4555 | 1 | 2.6%
2354.81 | 1 | 2.6%
1990.34 | 1 | 2.6%
1944.16 | 2 | 5.1%
1780.5 | 1 | 2.6%
1678.76 | 1 | 2.6%
1150.6 | 1 | 2.6%
996.92 | 1 | 2.6%

허가일
DateTime

Distinct: 33
Distinct (%): 84.6%
Missing: 0
Missing (%): 0.0%
Memory size: 444.0 B
Minimum: 2022-01-28 00:00:00
Maximum: 2022-12-09 00:00:00

[Figure: Histogram with fixed size bins (bins=33)]

주용도
Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): 20.5%
Missing: 0
Missing (%): 0.0%
Memory size: 444.0 B

Length

Max length: 9
Median length: 4
Mean length: 4.7179487
Min length: 2

Unique

Unique: 4
Unique (%): 10.3%

Sample

1st row: 단독주택
2nd row: 공장
3rd row: 공장
4th row: 제2종근린생활시설
5th row: 노유자시설

Common Values

Value | Count | Frequency (%)
단독주택 | 23 | 59.0%
제1종근린생활시설 | 6 | 15.4%
공장 | 4 | 10.3%
창고시설 | 2 | 5.1%
제2종근린생활시설 | 1 | 2.6%
노유자시설 | 1 | 2.6%
공동주택 | 1 | 2.6%
운동시설 | 1 | 2.6%
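
The frequency column in the Common Values table is each count divided by the 39 observations. The sketch below rebuilds the percentages from the counts reported for 주용도; it is an illustration using `collections.Counter`, not the profiler's own code.

```python
from collections import Counter

# Counts copied from the Common Values table for 주용도.
counts = Counter({
    "단독주택": 23,
    "제1종근린생활시설": 6,
    "공장": 4,
    "창고시설": 2,
    "제2종근린생활시설": 1,
    "노유자시설": 1,
    "공동주택": 1,
    "운동시설": 1,
})

n = sum(counts.values())  # total number of observations
percentages = {value: round(100 * c / n, 1) for value, c in counts.items()}

# Values that occur exactly once, which the report counts as "Unique".
unique_values = [value for value, c in counts.items() if c == 1]

for value, pct in percentages.items():
    print(f"{value}: {pct}%")
```

Four values occur exactly once, matching the "Unique: 4" figure for this variable.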

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of common values]

Value | Count | Frequency (%)
단독주택 | 23 | 59.0%
제1종근린생활시설 | 6 | 15.4%
공장 | 4 | 10.3%
창고시설 | 2 | 5.1%
제2종근린생활시설 | 1 | 2.6%
노유자시설 | 1 | 2.6%
공동주택 | 1 | 2.6%
운동시설 | 1 | 2.6%

Interactions

[Figures: nine pairwise interaction plots]

Correlations

[Figure: correlation heatmap]

 | 대지위치 | 지목 | 대지면적(제곱미터) | 건축면적(제곱미터) | 연면적(제곱미터) | 허가일 | 주용도
대지위치 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
지목 | 1.000 | 1.000 | 0.850 | 0.680 | 0.803 | 0.850 | 0.879
대지면적(제곱미터) | 1.000 | 0.850 | 1.000 | 0.801 | 0.958 | 0.000 | 0.925
건축면적(제곱미터) | 1.000 | 0.680 | 0.801 | 1.000 | 0.964 | 0.000 | 0.800
연면적(제곱미터) | 1.000 | 0.803 | 0.958 | 0.964 | 1.000 | 0.000 | 0.961
허가일 | 1.000 | 0.850 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000
주용도 | 1.000 | 0.879 | 0.925 | 0.800 | 0.961 | 0.000 | 1.000

[Figure: correlation heatmap]

 | 주용도 | 지목
주용도 | 1.000 | 0.752
지목 | 0.752 | 1.000

[Figure: correlation heatmap]

 | 대지면적(제곱미터) | 건축면적(제곱미터) | 연면적(제곱미터) | 지목 | 주용도
대지면적(제곱미터) | 1.000 | 0.704 | 0.540 | 0.705 | 0.578
건축면적(제곱미터) | 0.704 | 1.000 | 0.905 | 0.532 | 0.589
연면적(제곱미터) | 0.540 | 0.905 | 1.000 | 0.631 | 0.682
지목 | 0.705 | 0.532 | 0.631 | 1.000 | 0.752
주용도 | 0.578 | 0.589 | 0.682 | 0.752 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

 | 건축구분 | 대지위치 | 지목 | 대지면적(제곱미터) | 건축면적(제곱미터) | 연면적(제곱미터) | 허가일 | 주용도
0 | 신축 | 충청남도 계룡시 엄사면 향한리 407-33 |  | 503.0 | 99.02 | 153.7 | 2022-12-09 | 단독주택
1 | 신축 | 충청남도 계룡시 두마면 입암리 650 | 공장용지 | 3359.2 | 1797.15 | 1944.16 | 2022-12-09 | 공장
2 | 신축 | 충청남도 계룡시 두마면 입암리 650 | 공장용지 | 3359.2 | 1797.15 | 1944.16 | 2022-12-09 | 공장
3 | 신축 | 충청남도 계룡시 엄사면 도곡리 155-2 |  | 890.0 | 176.9 | 176.9 | 2022-11-23 | 제2종근린생활시설
4 | 신축 | 충청남도 계룡시 금암동 156-7 |  | 560.1 | 374.2 | 1990.34 | 2022-10-27 | 노유자시설
5 | 신축 | 충청남도 계룡시 금암동 68-3 |  | 464.7 | 151.5 | 454.24 | 2022-10-25 | 제1종근린생활시설
6 | 신축 | 충청남도 계룡시 엄사면 엄사리 355-1 |  | 249.2 | 149.5 | 394.04 | 2022-09-19 | 단독주택
7 | 신축 | 충청남도 계룡시 엄사면 향한리 712 |  | 462.5 | 91.91 | 124.51 | 2022-09-15 | 단독주택
8 | 신축 | 충청남도 계룡시 엄사면 향한리 690 |  | 633.7 | 126.66 | 182.5 | 2022-09-13 | 단독주택
9 | 신축 | 충청남도 계룡시 엄사면 도곡리 33-7 |  | 505.0 | 100.88 | 167.18 | 2022-09-06 | 단독주택

Last rows

 | 건축구분 | 대지위치 | 지목 | 대지면적(제곱미터) | 건축면적(제곱미터) | 연면적(제곱미터) | 허가일 | 주용도
29 | 신축 | 충청남도 계룡시 엄사면 엄사리 245-7 |  | 237.3 | 133.58 | 196.98 | 2022-03-14 | 단독주택
30 | 신축 | 충청남도 계룡시 두마면 농소리 982 외2필지 |  | 2066.0 | 1189.32 | 2354.81 | 2022-03-07 | 제1종근린생활시설
31 | 신축 | 충청남도 계룡시 엄사면 향한리 696 |  | 503.8 | 99.64 | 149.9 | 2022-03-07 | 단독주택
32 | 신축 | 충청남도 계룡시 엄사면 엄사리 271-7 |  | 258.2 | 152.67 | 469.53 | 2022-02-25 | 단독주택
33 | 신축 | 충청남도 계룡시 엄사면 향한리 694 |  | 386.2 | 76.98 | 139.65 | 2022-02-16 | 단독주택
34 | 신축 | 충청남도 계룡시 금암동 161-3 |  | 2166.5 | 1697.46 | 3024.47 | 2022-02-14 | 운동시설
35 | 신축 | 충청남도 계룡시 두마면 농소리 999 |  | 1015.4 | 507.26 | 996.92 | 2022-02-14 | 제1종근린생활시설
36 | 신축 | 충청남도 계룡시 두마면 농소리 985 |  | 592.0 | 353.05 | 353.05 | 2022-02-14 | 제1종근린생활시설
37 | 신축 | 충청남도 계룡시 엄사면 엄사리 355-6 |  | 253.5 | 139.79 | 199.92 | 2022-02-04 | 단독주택
38 | 신축 | 충청남도 계룡시 엄사면 엄사리 269-7 |  | 225.9 | 94.72 | 149.82 | 2022-01-28 | 단독주택

Duplicate rows

Most frequently occurring

 | 건축구분 | 대지위치 | 지목 | 대지면적(제곱미터) | 건축면적(제곱미터) | 연면적(제곱미터) | 허가일 | 주용도 | # duplicates
0 | 신축 | 충청남도 계룡시 두마면 입암리 650 | 공장용지 | 3359.2 | 1797.15 | 1944.16 | 2022-12-09 | 공장 | 2
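
A duplicate check like the one above can be sketched with the standard library by counting identical row tuples. The example uses the first three sample rows; the 지목 column is left out because its value for the first row is not legible in this report, and the total of 39 observations is taken from the overview. The names are illustrative, not the profiler's internals.

```python
from collections import Counter

# First three sample rows: (건축구분, 대지위치, 대지면적, 건축면적, 연면적, 허가일, 주용도).
# Assumption: 지목 is omitted because it is unreadable for the first row.
rows = [
    ("신축", "충청남도 계룡시 엄사면 향한리 407-33", 503.0, 99.02, 153.7, "2022-12-09", "단독주택"),
    ("신축", "충청남도 계룡시 두마면 입암리 650", 3359.2, 1797.15, 1944.16, "2022-12-09", "공장"),
    ("신축", "충청남도 계룡시 두마면 입암리 650", 3359.2, 1797.15, 1944.16, "2022-12-09", "공장"),
]

row_counts = Counter(rows)

# Rows that occur more than once, with their occurrence counts.
duplicates = {row: c for row, c in row_counts.items() if c > 1}

# Number of redundant rows (every occurrence beyond the first).
extra_rows = sum(c - 1 for c in duplicates.values())

total_observations = 39  # from the dataset overview
duplicate_pct = round(100 * extra_rows / total_observations, 1)

print(len(duplicates), extra_rows, duplicate_pct)
```

One redundant row out of 39 observations gives the 2.6% shown in the overview.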