Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 173
Duplicate rows (%): 1.7%
Total size in memory: 507.8 KiB
Average record size in memory: 52.0 B
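
The two derived figures above (duplicate share and average record size) follow from the raw counts. A minimal sketch of that arithmetic, with illustrative variable names that are not part of the report:

```python
# Raw figures copied from the dataset statistics above.
n_rows = 10_000
n_duplicates = 173
total_kib = 507.8

# Duplicate share in percent, rounded to one decimal as in the report.
duplicate_pct = round(100 * n_duplicates / n_rows, 1)

# Average record size in bytes: total memory (KiB -> bytes) divided by row count.
avg_record_bytes = round(total_kib * 1024 / n_rows, 1)

print(duplicate_pct, avg_record_bytes)
```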

Variable types

Text: 1
Numeric: 4

Dataset

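Description: Data on the title section of the building register from the urban planning information system of Jinan-gun, Jeonbuk Special Self-Governing Province (전북특별자치도 진안군). It provides roof type (지붕종류), site area (대지면적), number of accessory buildings (부속 건축물 수), accessory building area (부속 건축물 면적), and total building floor area (총 동 연면적).
Author: Jinan-gun, Jeonbuk Special Self-Governing Province (전북특별자치도 진안군)
URL: https://www.data.go.kr/data/15119155/fileData.do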

Alerts

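Dataset has 173 (1.7%) duplicate rows [Duplicates]
부속 건축물 수 (number of accessory buildings) is highly overall correlated with 부속 건축물 면적 (accessory building area) [High correlation]
부속 건축물 면적 (accessory building area) is highly overall correlated with 부속 건축물 수 (number of accessory buildings) [High correlation]
대지면적 (site area) is highly skewed (γ1 = 47.28945327) [Skewed]
부속 건축물 면적 (accessory building area) is highly skewed (γ1 = 20.93583335) [Skewed]
총 동 연면적 (total building floor area) is highly skewed (γ1 = 29.90410422) [Skewed]
대지면적 (site area) has 4590 (45.9%) zeros [Zeros]
부속 건축물 수 (number of accessory buildings) has 3665 (36.6%) zeros [Zeros]
부속 건축물 면적 (accessory building area) has 3667 (36.7%) zeros [Zeros]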
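
The skewness (γ1) and zero-share figures in these alerts can be illustrated with a short sketch. It uses the population (biased) moment formula g1 = m3 / m2^1.5; the exact estimator behind the report's values is not stated here and may include a bias correction, so results on real data can differ slightly. The function names and the toy data are assumptions for illustration only.

```python
def skewness(values):
    """Population (biased) skewness: third central moment over m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def zero_share(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


# Toy data (not taken from the dataset): many zeros plus one very large value,
# the same shape that produces the SKEWED and ZEROS alerts.
toy = [0, 0, 0, 0, 10, 12, 15, 20, 1000]
print(skewness(toy), zero_share(toy))
```

A single extreme value dominates the third moment, which is why columns such as 대지면적 (maximum 198227 against a median of 162) show very large skewness.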

Reproduction

Analysis started: 2024-03-14 21:26:04.569761
Analysis finished: 2024-03-14 21:26:09.814155
Duration: 5.24 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

지붕종류 (roof type)
Text

Distinct: 478
Distinct (%): 4.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 22
Median length: 3
Mean length: 4.0469
Min length: 1

Characters and Unicode

Total characters: 40469
Distinct characters: 169
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
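
As a minimal sketch of how such character properties can be read in Python, the standard `unicodedata` module returns a two-letter general category per code point (`Lo` = Other Letter, `Sm` = Math Symbol, `Zs` = Space Separator, and so on). The helper name `category_counts` is hypothetical; the three input strings are roof-type values that appear in this report.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in the strings."""
    counts = Counter()
    for text in values:
        for ch in text:
            counts[unicodedata.category(ch)] += 1
    return counts


# Three roof-type values taken from the report.
sample = ["스레트", "함석+스레트", "세멘기와+스레트"]
counts = category_counts(sample)
print(counts)
```

Hangul syllables fall under `Lo` and the `+` separator under `Sm`, which matches the dominance of "Other Letter" and "Math Symbol" in the category counts for this column.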

Unique

Unique: 297
Unique (%): 3.0%

Sample

1st row: 스레이트
2nd row: 스라브
3rd row: 스레트
4th row: 함석
5th row: 스레이트
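Value | Count | Frequency (%)
스레트 (slate) | 4023 | 39.7%
스라브 (slab) | 947 | 9.3%
스레이트 (slate) | 467 | 4.6%
함석 (galvanized sheet) | 383 | 3.8%
슬라브 (slab) | 361 | 3.6%
세멘기와+스레트 (cement tile + slate) | 284 | 2.8%
함석+스레트 (galvanized sheet + slate) | 278 | 2.7%
기와 (tile) | 269 | 2.7%
세멘기와 (cement tile) | 153 | 1.5%
판넬 (panel) | 152 | 1.5%
Other values (465) | 2824 | 27.8%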
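
Several of the most frequent values look like spelling variants of the same material (스레트 / 스레이트 for slate, 스라브 / 슬라브 for slab), and combined roofs are joined with `+`. The sketch below merges such variants before counting. The `CANONICAL` mapping is a hypothetical assumption made for illustration, not something the report or the data publisher defines; the counts are copied from the table above.

```python
from collections import Counter

# Counts copied from the most frequent values listed above.
raw_counts = {
    "스레트": 4023, "스라브": 947, "스레이트": 467, "함석": 383,
    "슬라브": 361, "세멘기와+스레트": 284, "함석+스레트": 278,
    "기와": 269, "세멘기와": 153, "판넬": 152,
}

# Hypothetical spelling map (an assumption): variant -> canonical label.
CANONICAL = {"스레트": "슬레이트", "스레이트": "슬레이트", "스라브": "슬라브"}


def normalize(value):
    """Normalize each '+'-separated component of a roof-type string."""
    parts = [p.strip() for p in value.split("+")]
    return "+".join(CANONICAL.get(p, p) for p in parts)


merged = Counter()
for value, count in raw_counts.items():
    merged[normalize(value)] += count

print(merged.most_common(3))
```

Under this assumed mapping the two slate spellings collapse into one label, which would reduce the 478 distinct values reported for this column; how far depends on the remaining 465 values not shown here.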

Most occurring characters

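Value | Count | Frequency (%)
(character not preserved) | 7443 | 18.4%
(character not preserved) | 6768 | 16.7%
(character not preserved) | 6193 | 15.3%
(character not preserved) | 1878 | 4.6%
+ | 1868 | 4.6%
(character not preserved) | 1831 | 4.5%
(character not preserved) | 1575 | 3.9%
(character not preserved) | 1570 | 3.9%
(character not preserved) | 1103 | 2.7%
(character not preserved) | 988 | 2.4%
Other values (159) | 9252 | 22.9%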

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 38396 | 94.9%
Math Symbol | 1868 | 4.6%
Space Separator | 141 | 0.3%
Open Punctuation | 24 | 0.1%
Close Punctuation | 24 | 0.1%
Decimal Number | 6 | < 0.1%
Other Punctuation | 4 | < 0.1%
Uppercase Letter | 4 | < 0.1%
Lowercase Letter | 2 | < 0.1%

Most frequent character per category

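Other Letter
Value | Count | Frequency (%)
(character not preserved) | 7443 | 19.4%
(character not preserved) | 6768 | 17.6%
(character not preserved) | 6193 | 16.1%
(character not preserved) | 1878 | 4.9%
(character not preserved) | 1831 | 4.8%
(character not preserved) | 1575 | 4.1%
(character not preserved) | 1570 | 4.1%
(character not preserved) | 1103 | 2.9%
(character not preserved) | 988 | 2.6%
(character not preserved) | 986 | 2.6%
Other values (143) | 8061 | 21.0%

Decimal Number
Value | Count | Frequency (%)
7 | 2 | 33.3%
1 | 1 | 16.7%
5 | 1 | 16.7%
0 | 1 | 16.7%
2 | 1 | 16.7%

Uppercase Letter
Value | Count | Frequency (%)
C | 1 | 25.0%
E | 1 | 25.0%
P | 1 | 25.0%
T | 1 | 25.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 3 | 75.0%
. | 1 | 25.0%

Math Symbol
Value | Count | Frequency (%)
+ | 1868 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 141 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 24 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 24 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
m | 2 | 100.0%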

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 38396 | 94.9%
Common | 2067 | 5.1%
Latin | 6 | < 0.1%

Most frequent character per script

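Hangul
Value | Count | Frequency (%)
(character not preserved) | 7443 | 19.4%
(character not preserved) | 6768 | 17.6%
(character not preserved) | 6193 | 16.1%
(character not preserved) | 1878 | 4.9%
(character not preserved) | 1831 | 4.8%
(character not preserved) | 1575 | 4.1%
(character not preserved) | 1570 | 4.1%
(character not preserved) | 1103 | 2.9%
(character not preserved) | 988 | 2.6%
(character not preserved) | 986 | 2.6%
Other values (143) | 8061 | 21.0%

Common
Value | Count | Frequency (%)
+ | 1868 | 90.4%
(space) | 141 | 6.8%
( | 24 | 1.2%
) | 24 | 1.2%
/ | 3 | 0.1%
7 | 2 | 0.1%
1 | 1 | < 0.1%
. | 1 | < 0.1%
5 | 1 | < 0.1%
0 | 1 | < 0.1%

Latin
Value | Count | Frequency (%)
m | 2 | 33.3%
C | 1 | 16.7%
E | 1 | 16.7%
P | 1 | 16.7%
T | 1 | 16.7%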

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 38396 | 94.9%
ASCII | 2073 | 5.1%

Most frequent character per block

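Hangul
Value | Count | Frequency (%)
(character not preserved) | 7443 | 19.4%
(character not preserved) | 6768 | 17.6%
(character not preserved) | 6193 | 16.1%
(character not preserved) | 1878 | 4.9%
(character not preserved) | 1831 | 4.8%
(character not preserved) | 1575 | 4.1%
(character not preserved) | 1570 | 4.1%
(character not preserved) | 1103 | 2.9%
(character not preserved) | 988 | 2.6%
(character not preserved) | 986 | 2.6%
Other values (143) | 8061 | 21.0%

ASCII
Value | Count | Frequency (%)
+ | 1868 | 90.1%
(space) | 141 | 6.8%
( | 24 | 1.2%
) | 24 | 1.2%
/ | 3 | 0.1%
m | 2 | 0.1%
7 | 2 | 0.1%
C | 1 | < 0.1%
E | 1 | < 0.1%
P | 1 | < 0.1%
Other values (6) | 6 | 0.3%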

대지면적 (site area)
Real number (ℝ)

SKEWED, ZEROS

Distinct: 1446
Distinct (%): 14.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 429.83388
Minimum: 0
Maximum: 198227
Zeros: 4590
Zeros (%): 45.9%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 162
Q3: 450
95-th percentile: 1184.1
Maximum: 198227
Range: 198227
Interquartile range (IQR): 450

Descriptive statistics

Standard deviation: 2950.7439
Coefficient of variation (CV): 6.8648472
Kurtosis: 2760.3176
Mean: 429.83388
Median Absolute Deviation (MAD): 162
Skewness: 47.289453
Sum: 4298338.8
Variance: 8706889.5
Monotonicity: Not monotonic
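
Several of the figures reported for 대지면적 are tied together by simple identities (CV = standard deviation / mean, variance = standard deviation squared, sum = mean × number of rows, IQR = Q3 − Q1). The sketch below recomputes them from the rounded values printed in this report, so small rounding differences are expected; the variable names are illustrative.

```python
# Values copied from the 대지면적 statistics in this report.
n_rows = 10_000
mean = 429.83388
std = 2950.7439
q1, q3 = 0.0, 450.0

cv = std / mean          # coefficient of variation
variance = std ** 2      # variance from the standard deviation
total = mean * n_rows    # column sum
iqr = q3 - q1            # interquartile range

print(round(cv, 7), round(variance, 1), round(total, 1), iqr)
```

A CV close to 7 means the standard deviation is about seven times the mean, which is consistent with the SKEWED alert for this column.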
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
0.0 | 4590 | 45.9%
660.0 | 82 | 0.8%
327.0 | 35 | 0.4%
397.0 | 31 | 0.3%
357.0 | 28 | 0.3%
347.0 | 28 | 0.3%
281.0 | 27 | 0.3%
377.0 | 27 | 0.3%
387.0 | 27 | 0.3%
311.0 | 27 | 0.3%
Other values (1436) | 5098 | 51.0%

Minimum 10 values
Value | Count | Frequency (%)
0.0 | 4590 | 45.9%
3.0 | 1 | < 0.1%
10.0 | 2 | < 0.1%
13.0 | 4 | < 0.1%
16.0 | 2 | < 0.1%
17.0 | 2 | < 0.1%
20.0 | 2 | < 0.1%
25.0 | 1 | < 0.1%
26.0 | 4 | < 0.1%
30.563 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
198227.0 | 1 | < 0.1%
150000.0 | 1 | < 0.1%
73241.0 | 1 | < 0.1%
69124.0 | 1 | < 0.1%
48535.0 | 1 | < 0.1%
38091.0 | 1 | < 0.1%
36496.0 | 1 | < 0.1%
31252.0 | 1 | < 0.1%
25551.0 | 1 | < 0.1%
23158.0 | 1 | < 0.1%

부속 건축물 수 (number of accessory buildings)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 12
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9429
Minimum: 0
Maximum: 12
Zeros: 3665
Zeros (%): 36.6%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 1
95-th percentile: 3
Maximum: 12
Range: 12
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.94006806
Coefficient of variation (CV): 0.99699657
Kurtosis: 6.5981375
Mean: 0.9429
Median Absolute Deviation (MAD): 1
Skewness: 1.4495263
Sum: 9429
Variance: 0.88372796
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Common values
Value | Count | Frequency (%)
1 | 3931 | 39.3%
0 | 3665 | 36.6%
2 | 1891 | 18.9%
3 | 415 | 4.2%
4 | 65 | 0.7%
5 | 15 | 0.1%
6 | 6 | 0.1%
7 | 5 | 0.1%
8 | 4 | < 0.1%
12 | 1 | < 0.1%
Other values (2) | 2 | < 0.1%

Minimum 10 values
Value | Count | Frequency (%)
0 | 3665 | 36.6%
1 | 3931 | 39.3%
2 | 1891 | 18.9%
3 | 415 | 4.2%
4 | 65 | 0.7%
5 | 15 | 0.1%
6 | 6 | 0.1%
7 | 5 | 0.1%
8 | 4 | < 0.1%
10 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
12 | 1 | < 0.1%
11 | 1 | < 0.1%
10 | 1 | < 0.1%
8 | 4 | < 0.1%
7 | 5 | 0.1%
6 | 6 | 0.1%
5 | 15 | 0.1%
4 | 65 | 0.7%
3 | 415 | 4.2%
2 | 1891 | 18.9%

부속 건축물 면적 (accessory building area)
Real number (ℝ)

HIGH CORRELATION, SKEWED, ZEROS

Distinct: 3598
Distinct (%): 36.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29.107376
Minimum: 0
Maximum: 3116.58
Zeros: 3667
Zeros (%): 36.7%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 18.91
Q3: 38.25
95-th percentile: 87.49525
Maximum: 3116.58
Range: 3116.58
Interquartile range (IQR): 38.25

Descriptive statistics

Standard deviation: 70.548264
Coefficient of variation (CV): 2.4237247
Kurtosis: 680.42154
Mean: 29.107376
Median Absolute Deviation (MAD): 18.91
Skewness: 20.935833
Sum: 291073.76
Variance: 4977.0575
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
0.0 | 3667 | 36.7%
18.0 | 57 | 0.6%
27.0 | 29 | 0.3%
24.0 | 26 | 0.3%
28.0 | 22 | 0.2%
12.0 | 19 | 0.2%
16.8 | 15 | 0.1%
36.0 | 15 | 0.1%
19.2 | 15 | 0.1%
25.2 | 14 | 0.1%
Other values (3588) | 6121 | 61.2%

Minimum 10 values
Value | Count | Frequency (%)
0.0 | 3667 | 36.7%
0.64 | 1 | < 0.1%
0.81 | 1 | < 0.1%
0.88 | 1 | < 0.1%
1.0 | 1 | < 0.1%
1.1 | 1 | < 0.1%
1.2 | 1 | < 0.1%
1.21 | 1 | < 0.1%
1.42 | 1 | < 0.1%
1.44 | 3 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
3116.58 | 1 | < 0.1%
2418.99 | 1 | < 0.1%
2050.07 | 1 | < 0.1%
1669.17 | 1 | < 0.1%
1668.33 | 1 | < 0.1%
1621.13 | 1 | < 0.1%
1379.65 | 1 | < 0.1%
1166.84 | 1 | < 0.1%
1048.01 | 1 | < 0.1%
888.0 | 1 | < 0.1%

총 동 연면적 (total building floor area)
Real number (ℝ)

SKEWED

Distinct: 6833
Distinct (%): 68.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 112.83626
Minimum: 0
Maximum: 15118.16
Zeros: 27
Zeros (%): 0.3%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 18.74
Q1: 48.8
Median: 76.705
Q3: 105.985
95-th percentile: 281.008
Maximum: 15118.16
Range: 15118.16
Interquartile range (IQR): 57.185

Descriptive statistics

Standard deviation: 280.36481
Coefficient of variation (CV): 2.4847049
Kurtosis: 1369.4447
Mean: 112.83626
Median Absolute Deviation (MAD): 28.345
Skewness: 29.904104
Sum: 1128362.6
Variance: 78604.425
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
0.0 | 27 | 0.3%
66.0 | 15 | 0.1%
96.0 | 13 | 0.1%
84.0 | 13 | 0.1%
36.0 | 13 | 0.1%
171.0 | 13 | 0.1%
27.0 | 11 | 0.1%
165.0 | 11 | 0.1%
99.0 | 11 | 0.1%
32.0 | 11 | 0.1%
Other values (6823) | 9862 | 98.6%

Minimum 10 values
Value | Count | Frequency (%)
0.0 | 27 | 0.3%
2.6 | 1 | < 0.1%
2.88 | 4 | < 0.1%
3.0 | 2 | < 0.1%
3.3 | 1 | < 0.1%
3.52 | 1 | < 0.1%
3.64 | 1 | < 0.1%
4.16 | 1 | < 0.1%
4.29 | 1 | < 0.1%
4.36 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
15118.16 | 1 | < 0.1%
13329.75 | 1 | < 0.1%
5985.83 | 1 | < 0.1%
5665.34 | 1 | < 0.1%
4505.0 | 1 | < 0.1%
4388.99 | 1 | < 0.1%
3269.73 | 1 | < 0.1%
3174.37 | 1 | < 0.1%
2718.94 | 1 | < 0.1%
2568.508 | 1 | < 0.1%

Interactions

[Figure: 16 pairwise interaction plots of the four numeric variables; images not included]

Correlations

Variable | 대지면적 | 부속 건축물 수 | 부속 건축물 면적 | 총 동 연면적
대지면적 | 1.000 | 0.507 | 0.817 | 0.624
부속 건축물 수 | 0.507 | 1.000 | 0.854 | 0.395
부속 건축물 면적 | 0.817 | 0.854 | 1.000 | 0.606
총 동 연면적 | 0.624 | 0.395 | 0.606 | 1.000

Variable | 대지면적 | 부속 건축물 수 | 부속 건축물 면적 | 총 동 연면적
대지면적 | 1.000 | 0.142 | 0.184 | 0.438
부속 건축물 수 | 0.142 | 1.000 | 0.901 | 0.309
부속 건축물 면적 | 0.184 | 0.901 | 1.000 | 0.411
총 동 연면적 | 0.438 | 0.309 | 0.411 | 1.000
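
To see which variable pairs stand out, the second set of coefficients above can be scanned for off-diagonal values at or above a cut-off. The sketch below assumes a cut-off of 0.5, chosen here for illustration; this extract does not state the threshold or the correlation method behind either matrix. The helper name `high_pairs` is hypothetical.

```python
from itertools import combinations

columns = ["대지면적", "부속 건축물 수", "부속 건축물 면적", "총 동 연면적"]

# Second correlation matrix from the report, row by row.
matrix = [
    [1.000, 0.142, 0.184, 0.438],
    [0.142, 1.000, 0.901, 0.309],
    [0.184, 0.901, 1.000, 0.411],
    [0.438, 0.309, 0.411, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, not taken from the report


def high_pairs(cols, m, threshold):
    """Return (col_a, col_b, coefficient) for each pair at or above threshold."""
    return [
        (cols[i], cols[j], m[i][j])
        for i, j in combinations(range(len(cols)), 2)
        if abs(m[i][j]) >= threshold
    ]


pairs = high_pairs(columns, matrix, THRESHOLD)
print(pairs)
```

With this cut-off only 부속 건축물 수 and 부속 건축물 면적 qualify (0.901), the same pair named in the high-correlation alerts.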

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

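Columns: (row index), 지붕종류 (roof type), 대지면적 (site area), 부속 건축물 수 (number of accessory buildings), 부속 건축물 면적 (accessory building area), 총 동 연면적 (total building floor area)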
7325스레이트149.000.027.94
3329스라브333.000.068.4
9559스레트165.0115.0126.87
4236함석165.000.021.87
411스레이트248.0122.3671.36
11815스레트612.0378.02152.02
8281스라브357.000.071.02
7773스레트572.000.0247.0
4969함석+스레트0.0241.4872.28
6656함석0.000.016.95
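Columns: (row index), 지붕종류 (roof type), 대지면적 (site area), 부속 건축물 수 (number of accessory buildings), 부속 건축물 면적 (accessory building area), 총 동 연면적 (total building floor area)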
11754스레트1260.0124.657.6
8680스레트0.0132.0130.4
12124강판기와842.0155.8698.1
4377세멘기와410.000.066.28
345스레트0.0113.7235.12
11730스레트0.0119.254.54
1192스레트0.000.043.24
10011세멘기와228.0256.76105.71
4425시멘트기와221.0122.766.37
219아연+스레트0.019.6840.9

Duplicate rows

Most frequently occurring

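Columns: (row index), 지붕종류 (roof type), 대지면적 (site area), 부속 건축물 수 (number of accessory buildings), 부속 건축물 면적 (accessory building area), 총 동 연면적 (total building floor area), # duplicates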
18스라브0.000.036.06
122스레트0.000.032.05
6경사지붕0.02221.661148.364
62스레트0.000.017.254
69스레트0.000.017.984
83스레트0.000.021.04
98스레트0.000.025.194
106스레트0.000.028.54
130스레트0.000.034.024
3갈바륨강판0.000.0752.03
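
A minimal sketch of how rows that occur more than once can be tallied, using only the standard library. The helper name `duplicate_summary` is hypothetical and the input is a small toy list, not the dataset: it repeats two rows with the multiplicities shown above (6 and 5) and adds one non-repeated row.

```python
from collections import Counter


def duplicate_summary(rows):
    """Return (row, occurrences) for rows seen more than once, most frequent first."""
    counts = Counter(rows)
    repeated = [(row, c) for row, c in counts.items() if c > 1]
    return sorted(repeated, key=lambda item: item[1], reverse=True)


# Toy rows: (roof type, site area, accessory count, accessory area, total floor area).
rows = (
    [("스레트", 0.0, 0, 0.0, 32.0)] * 5
    + [("스라브", 0.0, 0, 0.0, 36.0)] * 6
    + [("함석", 165.0, 0, 0.0, 21.87)]
)

summary = duplicate_summary(rows)
for row, occurrences in summary:
    print(occurrences, row)
```

Rows must be hashable (tuples here) for `Counter` to group them; exact float equality is used, so values that differ only by rounding are treated as distinct rows.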