Overview

Dataset statistics

Number of variables: 7
Number of observations: 653
Missing cells: 647
Missing cells (%): 14.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 37.1 KiB
Average record size in memory: 58.2 B
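
The two derived figures above can be checked from the raw counts. The following is a minimal sketch, assuming (the report does not state this) that the missing-cell percentage is the number of missing cells divided by rows times columns, and that the average record size is the total memory divided by the number of observations:

```python
# Counts copied from the Dataset statistics above.
n_variables = 7
n_observations = 653
missing_cells = 647
total_size_kib = 37.1

# Assumed definition: share of missing cells among all cells.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumed definition: bytes per row, computed from the rounded KiB total.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{missing_pct:.1f}%")         # 14.2%
print(f"{avg_record_bytes:.1f} B")   # 58.2 B
```

Both results agree with the reported values after rounding to one decimal place.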

Variable types

Numeric: 2
Categorical: 2
Text: 2
DateTime: 1

Dataset

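Description: Status of one-room (studio) housing and officetels in Wanju-gun, Jeollabuk-do, providing information such as housing type, address, building name, and construction year.
Author: Wanju-gun, Jeonbuk Special Self-Governing Province (전북특별자치도 완주군)
URL: https://www.data.go.kr/data/15077342/fileData.do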

Alerts

데이터기준일자 has constant value "2024-02-02" (Constant)
세대수(객실수) is highly overall correlated with 주택유형구분 (High correlation)
주택유형구분 is highly overall correlated with 세대수(객실수) (High correlation)
주택유형구분 is highly imbalanced (90.5%) (Imbalance)
건물명 has 644 (98.6%) missing values (Missing)
번호 has unique values (Unique)
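
The report does not say how the 90.5% imbalance figure is computed. One definition that reproduces it from the class counts (645 and 8) is one minus the normalized Shannon entropy. The sketch below assumes that definition; `imbalance_score` is a hypothetical helper name, not a function from the profiling library:

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the base-2 Shannon entropy of
    the class frequencies and k is the number of classes (assumed formula)."""
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(k)

# Class counts of 주택유형구분: 다가구주택 = 645, 오피스텔 = 8.
score = imbalance_score([645, 8])
print(f"{score:.1%}")  # 90.5%
```

A perfectly balanced variable scores 0 under this definition, and a variable dominated by a single class approaches 1.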

Reproduction

Analysis started: 2024-04-06 08:16:36.720866
Analysis finished: 2024-04-06 08:16:38.858643
Duration: 2.14 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

번호 (serial number)
Real number (ℝ)

UNIQUE

Distinct: 653
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 327
Minimum: 1
Maximum: 653
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 33.6
Q1: 164
Median: 327
Q3: 490
95-th percentile: 620.4
Maximum: 653
Range: 652
Interquartile range (IQR): 326
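
These quantiles are what linear interpolation between order statistics gives for the integers 1 to 653. A minimal sketch, assuming that 번호 holds each integer from 1 to 653 exactly once (consistent with the distinct count, minimum, maximum, and sum reported here) and that the profiler interpolates linearly:

```python
import statistics

# Assumption: 번호 is the sequence 1..653 with no gaps or repeats.
values = list(range(1, 654))

# method="inclusive" interpolates linearly between order statistics,
# treating the data as the full population range.
twentieths = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]

q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1
value_range = max(values) - min(values)

print(p5, q1, median, q3, p95, iqr, value_range)
```

The 5-th percentile, for example, falls at position 0.05 × 652 = 32.6 (zero-based), which lies between the values 33 and 34 and interpolates to 33.6.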

Descriptive statistics

Standard deviation: 188.64915
Coefficient of variation (CV): 0.5769087
Kurtosis: -1.2
Mean: 327
Median Absolute Deviation (MAD): 163
Skewness: 0
Sum: 213531
Variance: 35588.5
Monotonicity: Not monotonic
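
The descriptive statistics can be reproduced with the standard library. This is a sketch under the assumption that 번호 is the sequence 1 to 653 and that the reported variance is the sample variance (divisor n - 1); neither is stated in the report, but both match the printed figures:

```python
import statistics

# Assumption: 번호 is the sequence 1..653 with no gaps or repeats.
values = list(range(1, 654))

mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, divisor n - 1
std = statistics.stdev(values)           # square root of the sample variance
cv = std / mean                          # coefficient of variation
median = statistics.median(values)
# Median absolute deviation: median of |x - median(x)|.
mad = statistics.median(abs(v - median) for v in values)
total = sum(values)

print(mean, variance, round(std, 5), round(cv, 7), mad, total)
```

For a run of consecutive integers 1..n the sample variance has the closed form n(n + 1)/12, which gives 653 × 654 / 12 = 35588.5.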
Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
645  1  0.2%
196  1  0.2%
214  1  0.2%
213  1  0.2%
212  1  0.2%
211  1  0.2%
210  1  0.2%
209  1  0.2%
208  1  0.2%
207  1  0.2%
Other values (643)  643  98.5%

Minimum 10 values

Value  Count  Frequency (%)
1  1  0.2%
2  1  0.2%
3  1  0.2%
4  1  0.2%
5  1  0.2%
6  1  0.2%
7  1  0.2%
8  1  0.2%
9  1  0.2%
10  1  0.2%

Maximum 10 values

Value  Count  Frequency (%)
653  1  0.2%
652  1  0.2%
651  1  0.2%
650  1  0.2%
649  1  0.2%
648  1  0.2%
647  1  0.2%
646  1  0.2%
645  1  0.2%
644  1  0.2%

주택유형구분 (housing type)
Categorical

HIGH CORRELATION  IMBALANCE

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 KiB

다가구주택 (multi-household house): 645
오피스텔 (officetel): 8

Length

Max length: 5
Median length: 5
Mean length: 4.9877489
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 다가구주택
2nd row: 다가구주택
3rd row: 다가구주택
4th row: 다가구주택
5th row: 다가구주택

Common Values

Value  Count  Frequency (%)
다가구주택  645  98.8%
오피스텔  8  1.2%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values: 다가구주택 645 (98.8%), 오피스텔 8 (1.2%)]

주소 (address)
Text

Distinct: 644
Distinct (%): 98.6%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 KiB

Length

Max length: 15
Median length: 13
Mean length: 13.352221
Min length: 11

Characters and Unicode

Total characters: 8719
Distinct characters: 86
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
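
As an illustration of such character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for the first sample address. `category_counts` is a hypothetical helper, and the report may compute its figures differently:

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

# First sample row of the 주소 column.
sample = "용진읍 운곡리 1157-1"
counts = category_counts(sample)

# Lo = Other Letter, Nd = Decimal Number,
# Zs = Space Separator, Pd = Dash Punctuation
for code, n in counts.most_common():
    print(code, n)
```

The four category codes that appear here correspond to the four categories this report lists for the column.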

Unique

Unique: 636
Unique (%): 97.4%

Sample

1st row: 용진읍 운곡리 1157-1
2nd row: 용진읍 운곡리 1136-2
3rd row: 삼례읍 수계리 1234-1
4th row: 상관면 신리 323-2
5th row: 봉동읍 낙평리 166-7

Words

Value  Count  Frequency (%)
봉동읍  219  11.2%
이서면  180  9.2%
삼례읍  131  6.7%
둔산리  131  6.7%
삼례리  98  5.0%
용서리  95  4.8%
갈산리  73  3.7%
낙평리  44  2.2%
후정리  29  1.5%
상관면  26  1.3%
Other values (690)  933  47.6%

Most occurring characters

Value  Count  Frequency (%)
(space)  1306  15.0%
[?]  653  7.5%
-  652  7.5%
1  513  5.9%
[?]  362  4.2%
8  326  3.7%
6  307  3.5%
7  302  3.5%
[?]  292  3.3%
[?]  275  3.2%
Other values (76)  3731  42.8%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  3903  44.8%
Decimal Number  2858  32.8%
Space Separator  1306  15.0%
Dash Punctuation  652  7.5%

Most frequent character per category

Other Letter

Value  Count  Frequency (%)
[?]  653  16.7%
[?]  362  9.3%
[?]  292  7.5%
[?]  275  7.0%
[?]  241  6.2%
[?]  230  5.9%
[?]  229  5.9%
[?]  224  5.7%
[?]  222  5.7%
[?]  195  5.0%
Other values (64)  980  25.1%

Decimal Number

Value  Count  Frequency (%)
1  513  17.9%
8  326  11.4%
6  307  10.7%
7  302  10.6%
2  268  9.4%
4  260  9.1%
3  255  8.9%
5  241  8.4%
9  194  6.8%
0  192  6.7%

Space Separator

Value  Count  Frequency (%)
(space)  1306  100.0%

Dash Punctuation

Value  Count  Frequency (%)
-  652  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Common  4816  55.2%
Hangul  3903  44.8%

Most frequent character per script

Hangul

Value  Count  Frequency (%)
[?]  653  16.7%
[?]  362  9.3%
[?]  292  7.5%
[?]  275  7.0%
[?]  241  6.2%
[?]  230  5.9%
[?]  229  5.9%
[?]  224  5.7%
[?]  222  5.7%
[?]  195  5.0%
Other values (64)  980  25.1%

Common

Value  Count  Frequency (%)
(space)  1306  27.1%
-  652  13.5%
1  513  10.7%
8  326  6.8%
6  307  6.4%
7  302  6.3%
2  268  5.6%
4  260  5.4%
3  255  5.3%
5  241  5.0%
Other values (2)  386  8.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  4816  55.2%
Hangul  3903  44.8%

Most frequent character per block

ASCII

Value  Count  Frequency (%)
(space)  1306  27.1%
-  652  13.5%
1  513  10.7%
8  326  6.8%
6  307  6.4%
7  302  6.3%
2  268  5.6%
4  260  5.4%
3  255  5.3%
5  241  5.0%
Other values (2)  386  8.0%

Hangul

Value  Count  Frequency (%)
[?]  653  16.7%
[?]  362  9.3%
[?]  292  7.5%
[?]  275  7.0%
[?]  241  6.2%
[?]  230  5.9%
[?]  229  5.9%
[?]  224  5.7%
[?]  222  5.7%
[?]  195  5.0%
Other values (64)  980  25.1%

건물명 (building name)
Text

MISSING

Distinct: 9
Distinct (%): 100.0%
Missing: 644
Missing (%): 98.6%
Memory size: 5.2 KiB

Length

Max length: 12
Median length: 9
Mean length: 6.6666667
Min length: 2

Characters and Unicode

Total characters: 60
Distinct characters: 39
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9
Unique (%): 100.0%

Sample

1st row: 굿모닝빌딩
2nd row: 오월
3rd row: 사월
4th row: 에르미따쥬
5th row: 삼례오피스텔

Words

Value  Count  Frequency (%)
굿모닝빌딩  1  7.7%
오월  1  7.7%
사월  1  7.7%
에르미따쥬  1  7.7%
삼례오피스텔  1  7.7%
엠카운티오피스텔  1  7.7%
삼례이지움  1  7.7%
더  1  7.7%
퍼스트  1  7.7%
케렌시아  1  7.7%
Other values (3)  3  23.1%

Most occurring characters

Value  Count  Frequency (%)
[?]  4  6.7%
[?]  4  6.7%
[?]  4  6.7%
[?]  3  5.0%
[?]  3  5.0%
s  2  3.3%
[?]  2  3.3%
[?]  2  3.3%
[?]  2  3.3%
[?]  2  3.3%
Other values (29)  32  53.3%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  49  81.7%
Space Separator  4  6.7%
Lowercase Letter  4  6.7%
Uppercase Letter  2  3.3%
Dash Punctuation  1  1.7%

Most frequent character per category

Other Letter

Value  Count  Frequency (%)
[?]  4  8.2%
[?]  4  8.2%
[?]  3  6.1%
[?]  3  6.1%
[?]  2  4.1%
[?]  2  4.1%
[?]  2  4.1%
[?]  2  4.1%
[?]  2  4.1%
[?]  2  4.1%
Other values (22)  23  46.9%

Lowercase Letter

Value  Count  Frequency (%)
s  2  50.0%
l  1  25.0%
a  1  25.0%

Uppercase Letter

Value  Count  Frequency (%)
T  1  50.0%
C  1  50.0%

Space Separator

Value  Count  Frequency (%)
(space)  4  100.0%

Dash Punctuation

Value  Count  Frequency (%)
-  1  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  49  81.7%
Latin  6  10.0%
Common  5  8.3%

Most frequent character per script

Hangul

Value  Count  Frequency (%)
[?]  4  8.2%
[?]  4  8.2%
[?]  3  6.1%
[?]  3  6.1%
[?]  2  4.1%
[?]  2  4.1%
[?]  2  4.1%
[?]  2  4.1%
[?]  2  4.1%
[?]  2  4.1%
Other values (22)  23  46.9%

Latin

Value  Count  Frequency (%)
s  2  33.3%
T  1  16.7%
C  1  16.7%
l  1  16.7%
a  1  16.7%

Common

Value  Count  Frequency (%)
(space)  4  80.0%
-  1  20.0%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  49  81.7%
ASCII  11  18.3%

Most frequent character per block

Hangul

Value  Count  Frequency (%)
[?]  4  8.2%
[?]  4  8.2%
[?]  3  6.1%
[?]  3  6.1%
[?]  2  4.1%
[?]  2  4.1%
[?]  2  4.1%
[?]  2  4.1%
[?]  2  4.1%
[?]  2  4.1%
Other values (22)  23  46.9%

ASCII

Value  Count  Frequency (%)
(space)  4  36.4%
s  2  18.2%
T  1  9.1%
-  1  9.1%
C  1  9.1%
l  1  9.1%
a  1  9.1%

세대수(객실수) (number of households/rooms)
Real number (ℝ)

HIGH CORRELATION

Distinct: 26
Distinct (%): 4.0%
Missing: 3
Missing (%): 0.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.2123077
Minimum: 2
Maximum: 219
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.9 KiB

Quantile statistics

Minimum: 2
5-th percentile: 2
Q1: 3
Median: 5
Q3: 11
95-th percentile: 18
Maximum: 219
Range: 217
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 12.066885
Coefficient of variation (CV): 1.4693659
Kurtosis: 171.89963
Mean: 8.2123077
Median Absolute Deviation (MAD): 3
Skewness: 11.34533
Sum: 5338
Variance: 145.60971
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=26)

Common values

Value  Count  Frequency (%)
3  183  28.0%
2  84  12.9%
11  48  7.4%
10  45  6.9%
4  36  5.5%
5  31  4.7%
8  29  4.4%
9  29  4.4%
15  28  4.3%
18  24  3.7%
Other values (16)  113  17.3%

Minimum 10 values

Value  Count  Frequency (%)
2  84  12.9%
3  183  28.0%
4  36  5.5%
5  31  4.7%
6  22  3.4%
7  9  1.4%
8  29  4.4%
9  29  4.4%
10  45  6.9%
11  48  7.4%

Maximum 10 values

Value  Count  Frequency (%)
219  1  0.2%
130  1  0.2%
125  1  0.2%
47  1  0.2%
39  1  0.2%
30  1  0.2%
27  1  0.2%
20  1  0.2%
19  24  3.7%
18  24  3.7%

건축연도 (construction year)
Date

Distinct: 541
Distinct (%): 82.8%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 KiB
Minimum: 1986-09-15 00:00:00
Maximum: 2023-09-25 00:00:00
Histogram with fixed size bins (bins=50)

데이터기준일자 (data reference date)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 5.2 KiB

2024-02-02: 653

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2024-02-02
2nd row: 2024-02-02
3rd row: 2024-02-02
4th row: 2024-02-02
5th row: 2024-02-02

Common Values

Value  Count  Frequency (%)
2024-02-02  653  100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values: 2024-02-02 653 (100.0%)]

Interactions

[Figures: four interaction plots]

Correlations

Variable  번호  주택유형구분  건물명  세대수(객실수)
번호  1.000  0.407  NaN  0.187
주택유형구분  0.407  1.000  1.000  0.680
건물명  NaN  1.000  1.000  1.000
세대수(객실수)  0.187  0.680  1.000  1.000

Variable  번호  세대수(객실수)  주택유형구분
번호  1.000  -0.238  0.311
세대수(객실수)  -0.238  1.000  0.811
주택유형구분  0.311  0.811  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
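
To make the idea of nullity correlation concrete, the sketch below computes a Pearson correlation between the missingness indicators of two columns. It assumes that nullity correlation means the correlation of "is missing" flags (the report does not define it), and it uses only the first eight rows of the Sample section, so the result does not describe the full dataset:

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

# (건물명, 세대수(객실수)) from the first eight sample rows; None marks <NA>.
rows = [
    ("굿모닝빌딩", None), (None, None), (None, None), (None, 3),
    ("오월", 8), ("사월", 8), (None, 7), (None, 3),
]

# 1 where the value is missing, 0 where it is present.
name_missing = [int(name is None) for name, _ in rows]
units_missing = [int(units is None) for _, units in rows]

r = pearson(name_missing, units_missing)
print(round(r, 3))
```

A value near 1 would mean the two columns tend to be missing together, a value near -1 that one is present when the other is missing, and a value near 0 that their missingness is unrelated.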

Sample

First rows

Index  번호  주택유형구분  주소  건물명  세대수(객실수)  건축연도  데이터기준일자
0  645  다가구주택  용진읍 운곡리 1157-1  굿모닝빌딩  <NA>  2023-09-25  2024-02-02
1  644  다가구주택  용진읍 운곡리 1136-2  <NA>  <NA>  2023-09-19  2024-02-02
2  643  다가구주택  삼례읍 수계리 1234-1  <NA>  <NA>  2023-03-30  2024-02-02
3  642  다가구주택  상관면 신리 323-2  <NA>  3  2023-02-21  2024-02-02
4  641  다가구주택  봉동읍 낙평리 166-7  오월  8  2022-12-19  2024-02-02
5  640  다가구주택  봉동읍 낙평리 166-8  사월  8  2022-12-19  2024-02-02
6  639  다가구주택  구이면 두현리 548-4  <NA>  7  2022-01-26  2024-02-02
7  638  다가구주택  삼례읍 수계리 1196-6  <NA>  3  2021-12-24  2024-02-02
8  637  다가구주택  용진읍 간중리 747-1  <NA>  2  2021-11-30  2024-02-02
9  636  다가구주택  봉동읍 낙평리 152-1  <NA>  6  2021-09-28  2024-02-02

Last rows

Index  번호  주택유형구분  주소  건물명  세대수(객실수)  건축연도  데이터기준일자
643  2  다가구주택  삼례읍 삼례리 1631-1  <NA>  8  1987-01-07  2024-02-02
644  1  다가구주택  삼례읍 삼례리 1631-1  <NA>  4  1986-09-15  2024-02-02
645  646  오피스텔  삼례읍 삼례리 1315-3  에르미따쥬  20  1998-03-17  2024-02-02
646  647  오피스텔  삼례읍 삼례리 936-8  삼례오피스텔  27  2002-03-29  2024-02-02
647  648  오피스텔  삼례읍 삼례리 1319-3  <NA>  18  2016-05-13  2024-02-02
648  649  오피스텔  이서면 갈산리 663-3  엠카운티오피스텔  219  2016-10-12  2024-02-02
649  650  오피스텔  삼례읍 삼례리 1762  삼례이지움 더 퍼스트  47  2019-10-04  2024-02-02
650  651  오피스텔  이서면 갈산리 693-2  케렌시아 T-Class  125  2020-08-18  2024-02-02
651  652  오피스텔  이서면 갈산리 663-2  아르티엠 오피스텔  130  2022-01-14  2024-02-02
652  653  오피스텔  봉동읍 둔산리 904-6  <NA>  39  2023-02-15  2024-02-02