Overview

Dataset statistics

Number of variables: 6
Number of observations: 518
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 2
Duplicate rows (%): 0.4%
Total size in memory: 25.4 KiB
Average record size in memory: 50.3 B

Variable types

Categorical: 3
Text: 1
Numeric: 2

Dataset

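Description: Data on studio apartments (원룸) and officetels (오피스텔) within Naju-si, Jeollanam-do. It provides the lot-number address, main use, number of households, year of use approval, and related fields.
Author: Naju-si, Jeollanam-do (전라남도 나주시)
URL: https://www.data.go.kr/data/15077537/fileData.do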

Alerts

데이터기준일자 (data reference date) has constant value "2023-01-30" [Constant]
Dataset has 2 (0.4%) duplicate rows [Duplicates]
구분 (category) is highly overall correlated with 가구수 (number of households) and 1 other field [High correlation]
주용도 (main use) is highly overall correlated with 가구수 and 1 other field [High correlation]
가구수 is highly overall correlated with 구분 and 1 other field [High correlation]
구분 is highly imbalanced (85.2%) [Imbalance]
주용도 is highly imbalanced (85.2%) [Imbalance]
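
The 85.2% imbalance figure is consistent with a score defined as one minus the normalized Shannon entropy of the class frequencies. The sketch below is an assumption about the metric, not code taken from the profiling tool; the function name `imbalance_score` is hypothetical, and the counts 507 and 11 come from the 구분 variable in this report.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the base-2 Shannon entropy of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no balance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Class counts reported for 구분 (and identically for 주용도).
score = imbalance_score([507, 11])
print(f"{score:.1%}")  # 85.2%
```

A perfectly balanced two-class column would score 0, and the score approaches 1 as one class dominates.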

Reproduction

Analysis started: 2023-12-12 14:37:42.923263
Analysis finished: 2023-12-12 14:37:43.707529
Duration: 0.78 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

구분 (category)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.2 KiB

Value | Count
원룸 (studio apartment) | 507
오피스텔 (officetel) | 11

Length

Max length: 4
Median length: 2
Mean length: 2.042471
Min length: 2
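
The mean length follows directly from the value counts: 507 values of length 2 and 11 values of length 4. A minimal sketch of that arithmetic, using only the counts reported for this variable:

```python
# Value counts reported for 구분: 원룸 has 2 characters, 오피스텔 has 4.
value_counts = {"원룸": 507, "오피스텔": 11}

total_rows = sum(value_counts.values())
total_chars = sum(len(value) * count for value, count in value_counts.items())
mean_length = total_chars / total_rows

print(total_rows)             # 518
print(round(mean_length, 6))  # 2.042471
```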

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 원룸
2nd row: 원룸
3rd row: 원룸
4th row: 원룸
5th row: 원룸

Common Values

Value | Count | Frequency (%)
원룸 | 507 | 97.9%
오피스텔 | 11 | 2.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; 원룸 507 (97.9%), 오피스텔 11 (2.1%)]

소재지지번주소 (lot-number address)
Text

Distinct: 510
Distinct (%): 98.5%
Missing: 0
Missing (%): 0.0%
Memory size: 4.2 KiB

Length

Max length: 24
Median length: 23
Mean length: 18.996139
Min length: 15

Characters and Unicode

Total characters: 9840
Distinct characters: 85
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
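
As an illustration of these character properties, the sketch below counts Unicode general categories for one address from this variable's sample rows, using Python's standard `unicodedata` module. It is not the profiling tool's own code; it only shows where category names such as Other Letter and Dash Punctuation come from.

```python
import unicodedata
from collections import Counter

# One address taken from the sample rows of this variable (normalized to NFC).
address = unicodedata.normalize("NFC", "전라남도 나주시 경현동 439-3")

# unicodedata.category returns two-letter general-category codes:
# Lo = Other Letter, Nd = Decimal Number, Zs = Space Separator, Pd = Dash Punctuation.
category_counts = Counter(unicodedata.category(ch) for ch in address)

for code, count in category_counts.most_common():
    print(code, count)
```

For this address the Hangul syllables fall under Lo, the digits under Nd, the spaces under Zs, and the hyphen under Pd, which are the four categories the report lists.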

Unique

Unique: 506
Unique (%): 97.7%
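
Distinct and Unique measure different things: Distinct counts how many different values occur, while Unique counts values that occur exactly once. With 510 distinct and 506 unique addresses, four addresses appear more than once. A minimal sketch of the distinction, on hypothetical values rather than the dataset itself:

```python
from collections import Counter

# Hypothetical values, not taken from the dataset.
values = ["a", "b", "b", "c", "c", "c", "d"]

counts = Counter(values)
n_distinct = len(counts)                              # different values present
n_unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

print(n_distinct, n_unique)  # 4 2
```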

Sample

1st row: 전라남도 나주시 경현동 276
2nd row: 전라남도 나주시 경현동 439-3
3rd row: 전라남도 나주시 경현동 561-1
4th row: 전라남도 나주시 공산면 동촌리 310-6
5th row: 전라남도 나주시 과원동 10-1

Value | Count | Frequency (%)
전라남도 | 518 | 23.9%
나주시 | 518 | 23.9%
빛가람동 | 171 | 7.9%
송월동 | 92 | 4.2%
대호동 | 75 | 3.5%
이창동 | 57 | 2.6%
남평읍 | 28 | 1.3%
산포면 | 27 | 1.2%
매성리 | 25 | 1.2%
동사리 | 24 | 1.1%
Other values (551) | 636 | 29.3%

Most occurring characters

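Value | Count | Frequency (%)
(space) | 1653 | 16.8%
(not preserved) | 552 | 5.6%
(not preserved) | 529 | 5.4%
(not preserved) | 520 | 5.3%
(not preserved) | 519 | 5.3%
(not preserved) | 518 | 5.3%
(not preserved) | 518 | 5.3%
(not preserved) | 518 | 5.3%
1 | 498 | 5.1%
(not preserved) | 447 | 4.5%
Other values (75) | 3568 | 36.3%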

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 5646 | 57.4%
Decimal Number | 2140 | 21.7%
Space Separator | 1653 | 16.8%
Dash Punctuation | 401 | 4.1%

Most frequent character per category

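Other Letter
Value | Count | Frequency (%)
(not preserved) | 552 | 9.8%
(not preserved) | 529 | 9.4%
(not preserved) | 520 | 9.2%
(not preserved) | 519 | 9.2%
(not preserved) | 518 | 9.2%
(not preserved) | 518 | 9.2%
(not preserved) | 518 | 9.2%
(not preserved) | 447 | 7.9%
(not preserved) | 171 | 3.0%
(not preserved) | 171 | 3.0%
Other values (63) | 1183 | 21.0%

Decimal Number
Value | Count | Frequency (%)
1 | 498 | 23.3%
2 | 288 | 13.5%
4 | 250 | 11.7%
3 | 242 | 11.3%
0 | 186 | 8.7%
7 | 185 | 8.6%
5 | 168 | 7.9%
6 | 124 | 5.8%
8 | 111 | 5.2%
9 | 88 | 4.1%

Space Separator
Value | Count | Frequency (%)
(space) | 1653 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 401 | 100.0%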

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 5646 | 57.4%
Common | 4194 | 42.6%

Most frequent character per script

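Hangul
Value | Count | Frequency (%)
(not preserved) | 552 | 9.8%
(not preserved) | 529 | 9.4%
(not preserved) | 520 | 9.2%
(not preserved) | 519 | 9.2%
(not preserved) | 518 | 9.2%
(not preserved) | 518 | 9.2%
(not preserved) | 518 | 9.2%
(not preserved) | 447 | 7.9%
(not preserved) | 171 | 3.0%
(not preserved) | 171 | 3.0%
Other values (63) | 1183 | 21.0%

Common
Value | Count | Frequency (%)
(space) | 1653 | 39.4%
1 | 498 | 11.9%
- | 401 | 9.6%
2 | 288 | 6.9%
4 | 250 | 6.0%
3 | 242 | 5.8%
0 | 186 | 4.4%
7 | 185 | 4.4%
5 | 168 | 4.0%
6 | 124 | 3.0%
Other values (2) | 199 | 4.7%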

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 5646 | 57.4%
ASCII | 4194 | 42.6%

Most frequent character per block

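ASCII
Value | Count | Frequency (%)
(space) | 1653 | 39.4%
1 | 498 | 11.9%
- | 401 | 9.6%
2 | 288 | 6.9%
4 | 250 | 6.0%
3 | 242 | 5.8%
0 | 186 | 4.4%
7 | 185 | 4.4%
5 | 168 | 4.0%
6 | 124 | 3.0%
Other values (2) | 199 | 4.7%

Hangul
Value | Count | Frequency (%)
(not preserved) | 552 | 9.8%
(not preserved) | 529 | 9.4%
(not preserved) | 520 | 9.2%
(not preserved) | 519 | 9.2%
(not preserved) | 518 | 9.2%
(not preserved) | 518 | 9.2%
(not preserved) | 518 | 9.2%
(not preserved) | 447 | 7.9%
(not preserved) | 171 | 3.0%
(not preserved) | 171 | 3.0%
Other values (63) | 1183 | 21.0%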

주용도 (main use)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.2 KiB

Value | Count
단독주택 (detached house) | 507
업무시설 (business facility) | 11

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 단독주택
2nd row: 단독주택
3rd row: 단독주택
4th row: 단독주택
5th row: 단독주택

Common Values

Value | Count | Frequency (%)
단독주택 | 507 | 97.9%
업무시설 | 11 | 2.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; 단독주택 507 (97.9%), 업무시설 11 (2.1%)]

가구수 (number of households)
Real number (ℝ)

HIGH CORRELATION

Distinct: 28
Distinct (%): 5.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.025097
Minimum: 2
Maximum: 1315
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.7 KiB

Quantile statistics

Minimum: 2
5-th percentile: 2
Q1: 3
median: 4
Q3: 15
95-th percentile: 19
Maximum: 1315
Range: 1313
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 87.533234
Coefficient of variation (CV): 5.1414237
Kurtosis: 182.26949
Mean: 17.025097
Median Absolute Deviation (MAD): 2
Skewness: 12.98019
Sum: 8819
Variance: 7662.0671
Monotonicity: Not monotonic
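
Several of these figures are derived from others: the range from the minimum and maximum, the IQR from the quartiles, the coefficient of variation from the standard deviation and mean, and the variance from the standard deviation. The sketch below re-derives them from the reported values as a consistency check; the variable names are illustrative only.

```python
# Values copied from the statistics reported for 가구수.
mean = 17.025097
std = 87.533234
minimum, maximum = 2, 1315
q1, q3 = 3, 15

value_range = maximum - minimum  # 1313
iqr = q3 - q1                    # 12
cv = std / mean                  # coefficient of variation, about 5.14
variance = std ** 2              # about 7662.07

print(value_range, iqr, round(cv, 4), round(variance, 2))
```

A CV above 5 alongside a median of 4 reflects the handful of very large buildings (up to 1315 households) that dominate the mean and standard deviation.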
[Figure: Histogram with fixed size bins (bins=28)]

Common values
Value | Count | Frequency (%)
3 | 166 | 32.0%
2 | 66 | 12.7%
19 | 49 | 9.5%
15 | 38 | 7.3%
18 | 35 | 6.8%
4 | 31 | 6.0%
12 | 20 | 3.9%
16 | 19 | 3.7%
14 | 13 | 2.5%
13 | 11 | 2.1%
Other values (18) | 70 | 13.5%

Minimum 10 values
Value | Count | Frequency (%)
2 | 66 | 12.7%
3 | 166 | 32.0%
4 | 31 | 6.0%
5 | 7 | 1.4%
6 | 10 | 1.9%
7 | 1 | 0.2%
8 | 10 | 1.9%
9 | 8 | 1.5%
10 | 10 | 1.9%
11 | 4 | 0.8%

Maximum 10 values
Value | Count | Frequency (%)
1315 | 1 | 0.2%
1288 | 1 | 0.2%
559 | 1 | 0.2%
367 | 1 | 0.2%
264 | 1 | 0.2%
260 | 1 | 0.2%
234 | 1 | 0.2%
156 | 1 | 0.2%
30 | 1 | 0.2%
24 | 1 | 0.2%

사용승인년도 (year of use approval)
Real number (ℝ)

Distinct: 39
Distinct (%): 7.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2013.0734
Minimum: 1910
Maximum: 2023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.7 KiB

Quantile statistics

Minimum: 1910
5-th percentile: 1996.85
Q1: 2013
median: 2015
Q3: 2017
95-th percentile: 2021
Maximum: 2023
Range: 113
Interquartile range (IQR): 4
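
The fractional 5th percentile (1996.85) for an integer-valued column is consistent with linear interpolation between neighbouring order statistics. That the report used this method is an assumption, not something the report states. The helper `percentile_linear` below is a hypothetical sketch of the technique, and the neighbouring values 1996 and 1997 are illustrative, not read from the data.

```python
def percentile_linear(sorted_values, q):
    """Return the q-quantile (0 <= q <= 1) using linear interpolation."""
    position = q * (len(sorted_values) - 1)  # fractional rank, 0-based
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

# With 518 observations, the 5th percentile sits at rank 0.05 * 517 = 25.85,
# i.e. 85% of the way between two neighbouring sorted values.
print(round(0.05 * (518 - 1), 2))  # 25.85

# Hypothetical neighbours 1996 and 1997 at those ranks would give 1996.85.
print(round(1996 + 0.85 * (1997 - 1996), 2))  # 1996.85

print(percentile_linear([10, 20, 30, 40], 0.5))  # 25.0
```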

Descriptive statistics

Standard deviation: 11.607309
Coefficient of variation (CV): 0.0057659641
Kurtosis: 40.328165
Mean: 2013.0734
Median Absolute Deviation (MAD): 2
Skewness: -5.7085086
Sum: 1042772
Variance: 134.72962
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=39)]

Common values
Value | Count | Frequency (%)
2014 | 103 | 19.9%
2015 | 80 | 15.4%
2013 | 46 | 8.9%
2012 | 38 | 7.3%
2018 | 30 | 5.8%
2021 | 30 | 5.8%
2016 | 29 | 5.6%
2022 | 24 | 4.6%
2019 | 23 | 4.4%
2017 | 22 | 4.2%
Other values (29) | 93 | 18.0%

Minimum 10 values
Value | Count | Frequency (%)
1910 | 1 | 0.2%
1914 | 1 | 0.2%
1915 | 1 | 0.2%
1935 | 1 | 0.2%
1948 | 2 | 0.4%
1950 | 1 | 0.2%
1972 | 1 | 0.2%
1976 | 1 | 0.2%
1977 | 1 | 0.2%
1979 | 1 | 0.2%

Maximum 10 values
Value | Count | Frequency (%)
2023 | 1 | 0.2%
2022 | 24 | 4.6%
2021 | 30 | 5.8%
2020 | 21 | 4.1%
2019 | 23 | 4.4%
2018 | 30 | 5.8%
2017 | 22 | 4.2%
2016 | 29 | 5.6%
2015 | 80 | 15.4%
2014 | 103 | 19.9%

데이터기준일자 (data reference date)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.2 KiB

Value | Count
2023-01-30 | 518

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-01-30
2nd row: 2023-01-30
3rd row: 2023-01-30
4th row: 2023-01-30
5th row: 2023-01-30

Common Values

Value | Count | Frequency (%)
2023-01-30 | 518 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; 2023-01-30 518 (100.0%)]

Interactions

[Figures: four interaction plots; images not preserved]

Correlations

Variable | 구분 | 주용도 | 가구수 | 사용승인년도
구분 | 1.000 | 0.997 | 0.714 | 0.000
주용도 | 0.997 | 1.000 | 0.714 | 0.000
가구수 | 0.714 | 0.714 | 1.000 | 0.000
사용승인년도 | 0.000 | 0.000 | 0.000 | 1.000

Variable | 구분 | 주용도
구분 | 1.000 | 0.953
주용도 | 0.953 | 1.000

Variable | 가구수 | 사용승인년도 | 구분 | 주용도
가구수 | 1.000 | 0.075 | 0.847 | 0.847
사용승인년도 | 0.075 | 1.000 | 0.000 | 0.000
구분 | 0.847 | 0.000 | 1.000 | 0.953
주용도 | 0.847 | 0.000 | 0.953 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Row | 구분 | 소재지지번주소 | 주용도 | 가구수 | 사용승인년도 | 데이터기준일자
0 | 원룸 | 전라남도 나주시 경현동 276 | 단독주택 | 14 | 2013 | 2023-01-30
1 | 원룸 | 전라남도 나주시 경현동 439-3 | 단독주택 | 2 | 2012 | 2023-01-30
2 | 원룸 | 전라남도 나주시 경현동 561-1 | 단독주택 | 14 | 2012 | 2023-01-30
3 | 원룸 | 전라남도 나주시 공산면 동촌리 310-6 | 단독주택 | 2 | 2010 | 2023-01-30
4 | 원룸 | 전라남도 나주시 과원동 10-1 | 단독주택 | 18 | 2012 | 2023-01-30
5 | 원룸 | 전라남도 나주시 교동 153 | 단독주택 | 13 | 2012 | 2023-01-30
6 | 원룸 | 전라남도 나주시 교동 153-2 | 단독주택 | 5 | 2014 | 2023-01-30
7 | 원룸 | 전라남도 나주시 금계동 86 | 단독주택 | 4 | 1995 | 2023-01-30
8 | 원룸 | 전라남도 나주시 금계동 104 | 단독주택 | 12 | 2012 | 2023-01-30
9 | 오피스텔 | 전라남도 나주시 금계동 104-1 | 업무시설 | 30 | 2004 | 2023-01-30

Row | 구분 | 소재지지번주소 | 주용도 | 가구수 | 사용승인년도 | 데이터기준일자
508 | 원룸 | 전라남도 나주시 송월동 1177 | 단독주택 | 17 | 2022 | 2023-01-30
509 | 원룸 | 전라남도 나주시 빛가람동 83-4 | 단독주택 | 3 | 2022 | 2023-01-30
510 | 원룸 | 전라남도 나주시 송월동 1163 | 단독주택 | 15 | 2022 | 2023-01-30
511 | 원룸 | 전라남도 나주시 빛가람동 89-4 | 단독주택 | 3 | 2022 | 2023-01-30
512 | 원룸 | 전라남도 나주시 송월동 1366 | 단독주택 | 11 | 2022 | 2023-01-30
513 | 원룸 | 전라남도 나주시 송월동 1367 | 단독주택 | 13 | 2022 | 2023-01-30
514 | 원룸 | 전라남도 나주시 이창동 715-6 | 단독주택 | 10 | 2022 | 2023-01-30
515 | 원룸 | 전라남도 나주시 이창동 715-5 | 단독주택 | 19 | 2022 | 2023-01-30
516 | 원룸 | 전라남도 나주시 송월동 1185 | 단독주택 | 19 | 2023 | 2023-01-30
517 | 오피스텔 | 전라남도 나주시 빛가람동 334 | 업무시설 | 559 | 2022 | 2023-01-30

Duplicate rows

Most frequently occurring

Row | 구분 | 소재지지번주소 | 주용도 | 가구수 | 사용승인년도 | 데이터기준일자 | # duplicates
0 | 원룸 | 전라남도 나주시 다도면 판촌리 1415 | 단독주택 | 2 | 2017 | 2023-01-30 | 6
1 | 원룸 | 전라남도 나주시 빛가람동 84-13 | 단독주택 | 3 | 2021 | 2023-01-30 | 2
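
The table lists each distinct duplicated row once, together with how many times it occurs, so the two entries here appear to correspond to the "Duplicate rows: 2" figure in the overview. A minimal sketch of that grouping with the standard library, on hypothetical records rather than the dataset's rows:

```python
from collections import Counter

# Hypothetical records (구분, address, 가구수); not rows from the dataset.
rows = [
    ("원룸", "주소 A", 2),
    ("원룸", "주소 A", 2),
    ("원룸", "주소 A", 2),
    ("원룸", "주소 B", 3),
    ("원룸", "주소 B", 3),
    ("오피스텔", "주소 C", 30),
]

counts = Counter(rows)
# Keep only rows that occur more than once, most frequent first.
duplicates = [(row, n) for row, n in counts.most_common() if n > 1]

for row, n in duplicates:
    print(row, n)

print(len(duplicates))  # 2 distinct duplicated rows
```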