Overview

Dataset statistics

Number of variables: 4
Number of observations: 234
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.7 KiB
Average record size in memory: 33.5 B

Variable types

Numeric: 1
Text: 1
Categorical: 1
DateTime: 1

Dataset

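Description: Data on studio apartments (원룸) and officetels (오피스텔) in Gochang-gun, Jeonbuk State (전북특별자치도 고창군). It provides the lot-number address (지번주소), residential use (용도), and use-approval date (사용승인일자).
Author: Gochang-gun, Jeonbuk State (전북특별자치도 고창군)
URL: https://www.data.go.kr/data/15126597/fileData.do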

Alerts

Imbalance: 용도 is highly imbalanced (92.9%)
Unique: 연번 has unique values
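
The 92.9% imbalance figure for 용도 is consistent with a score of one minus the normalized Shannon entropy of the class counts (232 and 2 in this report). The sketch below assumes that definition; `imbalance_score` is a hypothetical helper name, not a documented ydata-profiling function.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class shares and k is the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # Assumption: a single class is given a score of 0 here.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Class counts reported for 용도: 원룸 = 232, 오피스텔 = 2
score = imbalance_score([232, 2])
print(f"{score:.1%}")
```

Under this definition a perfectly balanced column scores 0 and a column dominated by one class approaches 1, which is why a 232-to-2 split triggers the alert.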

Reproduction

Analysis started: 2024-03-14 23:37:21.392564
Analysis finished: 2024-03-14 23:37:22.059630
Duration: 0.67 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

UNIQUE 

Distinct: 234
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 118.41026
Minimum: 1
Maximum: 235
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 12.65
Q1: 60.25
Median: 118.5
Q3: 176.75
95-th percentile: 223.35
Maximum: 235
Range: 234
Interquartile range (IQR): 116.5
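
These quantiles match linear interpolation between neighbouring order statistics. The sketch below is a minimal check under one assumption that the report does not state directly: that 연번 holds the integers 1 to 235 with 22 absent (the reported sum of 27708 falls 22 short of 235 × 236 / 2 = 27730). `linear_quantile` is a hypothetical helper written for this example.

```python
import math


def linear_quantile(sorted_values, q):
    """Quantile by linear interpolation between the two nearest order statistics."""
    pos = (len(sorted_values) - 1) * q          # fractional 0-based position
    lower = math.floor(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


# Assumption: serial numbers 1..235 with 22 missing, 234 values in total.
values = [v for v in range(1, 236) if v != 22]

q05 = linear_quantile(values, 0.05)
q1 = linear_quantile(values, 0.25)
median = linear_quantile(values, 0.50)
q3 = linear_quantile(values, 0.75)
q95 = linear_quantile(values, 0.95)
iqr = q3 - q1
print(q05, q1, median, q3, q95, iqr)
```

The interquartile range is simply Q3 minus Q1, so it can be read off the same two calls.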

Descriptive statistics

Standard deviation: 67.836417
Coefficient of variation (CV): 0.57289308
Kurtosis: -1.1938146
Mean: 118.41026
Median Absolute Deviation (MAD): 58.5
Skewness: -0.0060311694
Sum: 27708
Variance: 4601.7795
Monotonicity: Strictly increasing
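
Several of these figures can be cross-checked with the standard library alone. The sketch below assumes that 연번 holds integers; the gap between 235 × 236 / 2 = 27730 and the reported sum of 27708 then points to a single absent serial number. Sample variance (dividing by n - 1) is also an assumption, chosen because it reproduces the reported value.

```python
import statistics

n_max, reported_sum = 235, 27708

# Sum of 1..235 minus the reported sum gives the one serial number that is absent.
missing = n_max * (n_max + 1) // 2 - reported_sum

values = [v for v in range(1, n_max + 1) if v != missing]

mean = statistics.mean(values)
variance = statistics.variance(values)    # sample variance (divides by n - 1)
std = statistics.stdev(values)
cv = std / mean                            # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)  # median absolute deviation

print(missing, len(values), round(mean, 5), round(variance, 4), round(cv, 5), mad)
```

A gap in an otherwise consecutive serial column also explains why the maximum (235) exceeds the number of observations (234).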
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
1                   1      0.4%
163                 1      0.4%
151                 1      0.4%
152                 1      0.4%
153                 1      0.4%
154                 1      0.4%
155                 1      0.4%
156                 1      0.4%
157                 1      0.4%
158                 1      0.4%
Other values (224)  224    95.7%
Value  Count  Frequency (%)
1      1      0.4%
2      1      0.4%
3      1      0.4%
4      1      0.4%
5      1      0.4%
6      1      0.4%
7      1      0.4%
8      1      0.4%
9      1      0.4%
10     1      0.4%
Value  Count  Frequency (%)
235    1      0.4%
234    1      0.4%
233    1      0.4%
232    1      0.4%
231    1      0.4%
230    1      0.4%
229    1      0.4%
228    1      0.4%
227    1      0.4%
226    1      0.4%

지번주소
Text

Distinct: 208
Distinct (%): 88.9%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB

Length

Max length: 25
Median length: 24
Mean length: 21.111111
Min length: 18

Characters and Unicode

Total characters: 4940
Distinct characters: 77
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
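
As an illustration of those character properties, the sketch below tallies the Unicode general category of each character in three sample addresses copied from this report, using Python's standard `unicodedata` module. The category codes `Lo`, `Zs`, `Nd`, and `Pd` correspond to Other Letter, Space Separator, Decimal Number, and Dash Punctuation. It is an illustrative check only, not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

# Sample addresses copied from the first rows of the report.
addresses = [
    "전라북도 고창군 해리면 왕촌리 256-9",
    "전라북도 고창군 고창읍 읍내리 22-4",
    "전라북도 고창군 고창읍 교촌리 364",
]

# Tally the general category of every character across the sample.
categories = Counter(
    unicodedata.category(ch) for text in addresses for ch in text
)

for code, count in categories.most_common():
    print(code, count)
```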

Unique

Unique: 190
Unique (%): 81.2%

Sample

1st row: 전라북도 고창군 해리면 왕촌리 256-9
2nd row: 전라북도 고창군 고창읍 읍내리 22-4
3rd row: 전라북도 고창군 고창읍 교촌리 364
4th row: 전라북도 고창군 부안면 선운리 272-2
5th row: 전라북도 고창군 상하면 자룡리 527-8
Value               Count  Frequency (%)
전라북도            234    20.0%
고창군              234    20.0%
고창읍              146    12.5%
월곡리              57     4.9%
교촌리              45     3.8%
읍내리              36     3.1%
상하면              25     2.1%
해리면              19     1.6%
자룡리              17     1.5%
부안면              15     1.3%
Other values (242)  342    29.2%
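
The word table appears to count whitespace-separated tokens: 전라북도 occurs 234 times at 20.0%, which implies 1170 tokens in total, or five per address. The sketch below applies that assumed tokenization to the five sample rows listed above; it is not the profiler's own code.

```python
from collections import Counter

# The five sample addresses shown in the report.
addresses = [
    "전라북도 고창군 해리면 왕촌리 256-9",
    "전라북도 고창군 고창읍 읍내리 22-4",
    "전라북도 고창군 고창읍 교촌리 364",
    "전라북도 고창군 부안면 선운리 272-2",
    "전라북도 고창군 상하면 자룡리 527-8",
]

# Assumption: words are split on whitespace only.
tokens = [token for address in addresses for token in address.split()]
counts = Counter(tokens)
total = sum(counts.values())

for word, count in counts.most_common(3):
    print(word, count, f"{count / total:.1%}")
```

On this small sample the province and county names each account for one token in five, the same 20.0% share the full table reports.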

Most occurring characters

Value              Count  Frequency (%)
(space)            936    18.9%
                   382    7.7%
                   380    7.7%
                   253    5.1%
                   235    4.8%
                   234    4.7%
                   234    4.7%
                   234    4.7%
                   234    4.7%
                   182    3.7%
Other values (67)  1636   33.1%

Most occurring categories

Value             Count  Frequency (%)
Other Letter      3042   61.6%
Space Separator   936    18.9%
Decimal Number    844    17.1%
Dash Punctuation  118    2.4%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
                   382    12.6%
                   380    12.5%
                   253    8.3%
                   235    7.7%
                   234    7.7%
                   234    7.7%
                   234    7.7%
                   234    7.7%
                   182    6.0%
                   88     2.9%
Other values (55)  586    19.3%
Decimal Number
Value  Count  Frequency (%)
1      128    15.2%
2      111    13.2%
5      101    12.0%
6      87     10.3%
3      85     10.1%
4      85     10.1%
8      73     8.6%
7      72     8.5%
9      64     7.6%
0      38     4.5%
Space Separator
Value    Count  Frequency (%)
(space)  936    100.0%
Dash Punctuation
Value  Count  Frequency (%)
-      118    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  3042   61.6%
Common  1898   38.4%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
                   382    12.6%
                   380    12.5%
                   253    8.3%
                   235    7.7%
                   234    7.7%
                   234    7.7%
                   234    7.7%
                   234    7.7%
                   182    6.0%
                   88     2.9%
Other values (55)  586    19.3%
Common
Value             Count  Frequency (%)
(space)           936    49.3%
1                 128    6.7%
-                 118    6.2%
2                 111    5.8%
5                 101    5.3%
6                 87     4.6%
3                 85     4.5%
4                 85     4.5%
8                 73     3.8%
7                 72     3.8%
Other values (2)  102    5.4%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  3042   61.6%
ASCII   1898   38.4%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
(space)           936    49.3%
1                 128    6.7%
-                 118    6.2%
2                 111    5.8%
5                 101    5.3%
6                 87     4.6%
3                 85     4.5%
4                 85     4.5%
8                 73     3.8%
7                 72     3.8%
Other values (2)  102    5.4%
Hangul
Value              Count  Frequency (%)
                   382    12.6%
                   380    12.5%
                   253    8.3%
                   235    7.7%
                   234    7.7%
                   234    7.7%
                   234    7.7%
                   234    7.7%
                   182    6.0%
                   88     2.9%
Other values (55)  586    19.3%

용도
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB
원룸: 232
오피스텔: 2

Length

Max length: 4
Median length: 2
Mean length: 2.017094
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 원룸
2nd row: 원룸
3rd row: 원룸
4th row: 원룸
5th row: 원룸

Common Values

Value     Count  Frequency (%)
원룸      232    99.1%
오피스텔  2      0.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value     Count  Frequency (%)
원룸      232    99.1%
오피스텔  2      0.9%

사용승인일자
DateTime

Distinct: 202
Distinct (%): 86.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 KiB
Minimum: 1978-03-29 00:00:00
Maximum: 2023-10-19 00:00:00
Histogram with fixed size bins (bins=50)

Interactions


Correlations

      연번   용도
연번  1.000  0.000
용도  0.000  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

     연번  지번주소                               용도  사용승인일자
0    1     전라북도 고창군 해리면 왕촌리 256-9    원룸  2009-11-12
1    2     전라북도 고창군 고창읍 읍내리 22-4     원룸  2011-10-21
2    3     전라북도 고창군 고창읍 교촌리 364      원룸  2020-12-08
3    4     전라북도 고창군 부안면 선운리 272-2    원룸  2011-07-19
4    5     전라북도 고창군 상하면 자룡리 527-8    원룸  2014-07-04
5    6     전라북도 고창군 상하면 자룡리 529-34   원룸  2015-06-29
6    7     전라북도 고창군 아산면 삼인리 624-28   원룸  2006-11-10
7    8     전라북도 고창군 고창읍 읍내리 163-3    원룸  2009-12-24
8    9     전라북도 고창군 고창읍 읍내리 404-2    원룸  2010-08-23
9    10    전라북도 고창군 고창읍 월곡리 773      원룸  2012-09-11
     연번  지번주소                               용도  사용승인일자
224  226   전라북도 고창군 고창읍 교촌리 412      원룸  2016-11-02
225  227   전라북도 고창군 고창읍 덕정리 159      원룸  2020-09-18
226  228   전라북도 고창군 고창읍 읍내리 1065     원룸  2013-02-01
227  229   전라북도 고창군 고창읍 읍내리 681-148  원룸  1978-03-29
228  230   전라북도 고창군 고수면 황산리 308-34   원룸  2017-10-17
229  231   전라북도 고창군 아산면 삼인리 113-4    원룸  2007-09-21
230  232   전라북도 고창군 부안면 검산리 559-7    원룸  2009-05-04
231  233   전라북도 고창군 고창읍 율계리 327-28   원룸  2013-12-18
232  234   전라북도 고창군 부안면 용산리 491-17   원룸  2016-11-22
233  235   전라북도 고창군 해리면 광승리 948-2    원룸  2018-07-06