Overview

Dataset statistics

Number of variables: 3
Number of observations: 35
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 972.0 B
Average record size in memory: 27.8 B
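
The average record size appears to be the total size in memory divided by the number of observations. A quick check of that arithmetic, assuming this is how the figure is derived (the variable names are illustrative):

```python
# Figures copied from the dataset statistics in this report.
total_bytes = 972.0
n_observations = 35

# Assumption: average record size = total size in memory / number of rows.
avg_record_bytes = total_bytes / n_observations
print(f"{avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the result agrees with the 27.8 B shown in the report.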

Variable types

Text: 2
Categorical: 1

Dataset

Description: Sample data
Author: Statistics Korea (통계청)
URL: https://bigdata.seoul.go.kr/data/selectSampleData.do?sample_data_seq=35

Alerts

가구종류_구분코드(STA_CD) (household type classification code) has unique values (Unique)
분류(CLSS2) (classification) has unique values (Unique)

Reproduction

Analysis started: 2023-12-10 14:50:19.115089
Analysis finished: 2023-12-10 14:50:19.981207
Duration: 0.87 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
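
The reported duration can be checked against the two timestamps. A minimal sketch using only the standard library, with the timestamps copied from this section:

```python
from datetime import datetime

# Timestamps as printed in the Reproduction section.
started = datetime.fromisoformat("2023-12-10 14:50:19.115089")
finished = datetime.fromisoformat("2023-12-10 14:50:19.981207")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```

The difference is 0.866118 seconds, which rounds to the 0.87 seconds shown.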

Variables

가구종류_구분코드(STA_CD)
Text

UNIQUE

Distinct: 35
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 412.0 B

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 315
Distinct characters: 21
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
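
As an illustration of these character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It uses only the first five sample values of this variable, so the totals cover 5 values rather than all 35; the variable names are illustrative.

```python
import unicodedata
from collections import Counter

# First five sample values of the variable, as listed in this report.
sample = ["GA_CO_001", "GA_CO_002", "GA_CO_003", "GA_CO_004", "GA_CO_005"]

# Count the general category of every character:
# "Lu" = Uppercase Letter, "Nd" = Decimal Number, "Pc" = Connector Punctuation.
category_counts = Counter(
    unicodedata.category(ch) for value in sample for ch in value
)
print(category_counts)
```

Each of these values contributes 4 uppercase letters, 3 digits, and 2 underscores. Scaled to 35 values, that is consistent with the 140, 105, and 70 reported under "Most occurring categories".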

Unique

Unique: 35
Unique (%): 100.0%

Sample

1st row: GA_CO_001
2nd row: GA_CO_002
3rd row: GA_CO_003
4th row: GA_CO_004
5th row: GA_CO_005

Value | Count | Frequency (%)
ga_co_001 | 1 | 2.9%
ga_he_008 | 1 | 2.9%
ga_he_010 | 1 | 2.9%
ga_po_001 | 1 | 2.9%
ga_po_002 | 1 | 2.9%
ga_po_003 | 1 | 2.9%
ga_po_004 | 1 | 2.9%
ga_po_005 | 1 | 2.9%
ga_he_009 | 1 | 2.9%
ga_po_006 | 1 | 2.9%
Other values (25) | 25 | 71.4%

Most occurring characters

Value | Count | Frequency (%)
_ | 70 | 22.2%
0 | 69 | 21.9%
G | 35 | 11.1%
A | 35 | 11.1%
O | 19 | 6.0%
C | 11 | 3.5%
H | 10 | 3.2%
E | 10 | 3.2%
1 | 9 | 2.9%
D | 6 | 1.9%
Other values (11) | 41 | 13.0%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 140 | 44.4%
Decimal Number | 105 | 33.3%
Connector Punctuation | 70 | 22.2%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 69 | 65.7%
1 | 9 | 8.6%
2 | 5 | 4.8%
6 | 4 | 3.8%
5 | 4 | 3.8%
4 | 4 | 3.8%
3 | 4 | 3.8%
8 | 2 | 1.9%
7 | 2 | 1.9%
9 | 2 | 1.9%

Uppercase Letter
Value | Count | Frequency (%)
G | 35 | 25.0%
A | 35 | 25.0%
O | 19 | 13.6%
C | 11 | 7.9%
H | 10 | 7.1%
E | 10 | 7.1%
D | 6 | 4.3%
S | 6 | 4.3%
P | 6 | 4.3%
T | 2 | 1.4%

Connector Punctuation
Value | Count | Frequency (%)
_ | 70 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 175 | 55.6%
Latin | 140 | 44.4%

Most frequent character per script

Common
Value | Count | Frequency (%)
_ | 70 | 40.0%
0 | 69 | 39.4%
1 | 9 | 5.1%
2 | 5 | 2.9%
6 | 4 | 2.3%
5 | 4 | 2.3%
4 | 4 | 2.3%
3 | 4 | 2.3%
8 | 2 | 1.1%
7 | 2 | 1.1%

Latin
Value | Count | Frequency (%)
G | 35 | 25.0%
A | 35 | 25.0%
O | 19 | 13.6%
C | 11 | 7.9%
H | 10 | 7.1%
E | 10 | 7.1%
D | 6 | 4.3%
S | 6 | 4.3%
P | 6 | 4.3%
T | 2 | 1.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 315 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
_ | 70 | 22.2%
0 | 69 | 21.9%
G | 35 | 11.1%
A | 35 | 11.1%
O | 19 | 6.0%
C | 11 | 3.5%
H | 10 | 3.2%
E | 10 | 3.2%
1 | 9 | 2.9%
D | 6 | 1.9%
Other values (11) | 41 | 13.0%

통계항목(CLSS1)
Categorical

Distinct: 5
Distinct (%): 14.3%
Missing: 0
Missing (%): 0.0%
Memory size: 412.0 B

Length

Max length: 11
Median length: 7
Mean length: 8.0857143
Min length: 4
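
These length statistics can be reproduced from the value counts reported in the Common Values table for this variable. The sketch below assumes length is measured in Unicode code points, which is what Python's `len` returns for these strings; the variable names are illustrative.

```python
from statistics import mean, median

# Value counts copied from the Common Values table of this variable.
counts = {
    "방,거실,식당수별가구": 11,
    "난방시설별가구": 10,
    "점유형태별가구": 6,
    "세대구성별가구": 6,
    "가구총괄": 2,
}

# One length per observation (35 in total).
lengths = [len(value) for value, n in counts.items() for _ in range(n)]

print(max(lengths), median(lengths), round(mean(lengths), 7), min(lengths))
```

The mean works out to 283 / 35, which rounds to the 8.0857143 shown.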

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 방,거실,식당수별가구
2nd row: 방,거실,식당수별가구
3rd row: 방,거실,식당수별가구
4th row: 방,거실,식당수별가구
5th row: 방,거실,식당수별가구

Common Values

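Value | Count | Frequency (%)
방,거실,식당수별가구 (households by number of rooms, living rooms, and dining rooms) | 11 | 31.4%
난방시설별가구 (households by heating facility) | 10 | 28.6%
점유형태별가구 (households by tenure type) | 6 | 17.1%
세대구성별가구 (households by generational composition) | 6 | 17.1%
가구총괄 (household totals) | 2 | 5.7%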
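
The frequency column is each count divided by the 35 observations. A short sketch of that calculation, with the counts copied from the table and illustrative variable names:

```python
# Counts of the five categories, in the order listed in the table.
counts = [11, 10, 6, 6, 2]
total = sum(counts)  # 35 observations

# Frequency of each category, formatted to one decimal place.
percentages = [f"{100 * c / total:.1f}%" for c in counts]

# Share of distinct values among all observations.
distinct_pct = f"{100 * len(counts) / total:.1f}%"

print(percentages, distinct_pct)
```

The same division gives the 14.3% reported as "Distinct (%)" for this variable (5 distinct values out of 35).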

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of the common values]

분류(CLSS2)
Text

UNIQUE 

Distinct: 35
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 412.0 B

Length

Max length: 8
Median length: 7
Mean length: 4.5142857
Min length: 2

Characters and Unicode

Total characters: 158
Distinct characters: 60
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
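
To show how characters of this variable fall into blocks, the sketch below classifies code points by range. The ranges come from the Unicode Standard (U+AC00 to U+D7AF is the Hangul Syllables block, U+0000 to U+007F is Basic Latin); the labels "Hangul" and "ASCII" follow this report's naming, and the helper `block_of` is hypothetical. Only the first five sample values are used, so the counts cover 5 values rather than all 35.

```python
from collections import Counter

def block_of(ch: str) -> str:
    """Return a coarse block label for one character (illustrative helper)."""
    code_point = ord(ch)
    if 0xAC00 <= code_point <= 0xD7AF:   # Hangul Syllables block
        return "Hangul"
    if code_point <= 0x7F:               # Basic Latin block
        return "ASCII"
    return "Other"

# First five sample values of the variable, as listed in this report.
sample = ["방1개", "방2개", "방3개", "방4개", "방5개이상"]

block_counts = Counter(block_of(ch) for value in sample for ch in value)
print(block_counts)
```

In these five values the digits are the only ASCII characters, which matches the report's split of this variable into a Hangul block and a small ASCII block.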

Unique

Unique: 35
Unique (%): 100.0%

Sample

1st row: 방1개
2nd row: 방2개
3rd row: 방3개
4th row: 방4개
5th row: 방5개이상
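
Value | Count | Frequency (%)
방1개 (1 room) | 1 | 2.9%
연탄아궁이 (coal briquette fireplace) | 1 | 2.9%
기타 (other) | 1 | 2.9%
자가 (owner-occupied) | 1 | 2.9%
전세 (jeonse, lump-sum deposit lease) | 1 | 2.9%
보증금있는월세 (monthly rent with deposit) | 1 | 2.9%
사글세 (sageulse, prepaid rent) | 1 | 2.9%
무상 (rent-free) | 1 | 2.9%
재래식아궁이 (traditional fireplace) | 1 | 2.9%
보증금없는월세 (monthly rent without deposit) | 1 | 2.9%
Other values (25) | 25 | 71.4%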

Most occurring characters

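Value | Count | Frequency (%)
[character missing] | 11 | 7.0%
[character missing] | 9 | 5.7%
[character missing] | 8 | 5.1%
[character missing] | 8 | 5.1%
[character missing] | 7 | 4.4%
[character missing] | 7 | 4.4%
[character missing] | 5 | 3.2%
1 | 5 | 3.2%
[character missing] | 5 | 3.2%
[character missing] | 5 | 3.2%
Other values (50) | 88 | 55.7%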

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 144 | 91.1%
Decimal Number | 14 | 8.9%

Most frequent character per category

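Other Letter
Value | Count | Frequency (%)
[character missing] | 11 | 7.6%
[character missing] | 9 | 6.2%
[character missing] | 8 | 5.6%
[character missing] | 8 | 5.6%
[character missing] | 7 | 4.9%
[character missing] | 7 | 4.9%
[character missing] | 5 | 3.5%
[character missing] | 5 | 3.5%
[character missing] | 5 | 3.5%
[character missing] | 4 | 2.8%
Other values (45) | 75 | 52.1%

Decimal Number
Value | Count | Frequency (%)
1 | 5 | 35.7%
2 | 4 | 28.6%
4 | 2 | 14.3%
3 | 2 | 14.3%
5 | 1 | 7.1%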

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 144 | 91.1%
Common | 14 | 8.9%

Most frequent character per script

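Hangul
Value | Count | Frequency (%)
[character missing] | 11 | 7.6%
[character missing] | 9 | 6.2%
[character missing] | 8 | 5.6%
[character missing] | 8 | 5.6%
[character missing] | 7 | 4.9%
[character missing] | 7 | 4.9%
[character missing] | 5 | 3.5%
[character missing] | 5 | 3.5%
[character missing] | 5 | 3.5%
[character missing] | 4 | 2.8%
Other values (45) | 75 | 52.1%

Common
Value | Count | Frequency (%)
1 | 5 | 35.7%
2 | 4 | 28.6%
4 | 2 | 14.3%
3 | 2 | 14.3%
5 | 1 | 7.1%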

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 144 | 91.1%
ASCII | 14 | 8.9%

Most frequent character per block

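Hangul
Value | Count | Frequency (%)
[character missing] | 11 | 7.6%
[character missing] | 9 | 6.2%
[character missing] | 8 | 5.6%
[character missing] | 8 | 5.6%
[character missing] | 7 | 4.9%
[character missing] | 7 | 4.9%
[character missing] | 5 | 3.5%
[character missing] | 5 | 3.5%
[character missing] | 5 | 3.5%
[character missing] | 4 | 2.8%
Other values (45) | 75 | 52.1%

ASCII
Value | Count | Frequency (%)
1 | 5 | 35.7%
2 | 4 | 28.6%
4 | 2 | 14.3%
3 | 2 | 14.3%
5 | 1 | 7.1%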

Correlations

 | 가구종류_구분코드(STA_CD) | 통계항목(CLSS1) | 분류(CLSS2)
가구종류_구분코드(STA_CD) | 1.000 | 1.000 | 1.000
통계항목(CLSS1) | 1.000 | 1.000 | 1.000
분류(CLSS2) | 1.000 | 1.000 | 1.000
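
The report does not state which measure fills this matrix. One possible reading, offered as an assumption rather than a statement about ydata-profiling's internals, is an association measure for categorical data such as Cramér's V, which reaches 1 whenever one variable fully determines the other. With two of the three columns unique per row, every pair is trivially in that situation. The sketch below applies the uncorrected textbook formula to stand-in data: 35 unique identifiers, and group labels sized 11, 10, 6, 6, and 2 as in the Common Values table. The function name `cramers_v` and the data are illustrative.

```python
from collections import Counter
from math import sqrt

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length label sequences."""
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    # phi^2 = chi^2 / n, using the closed form sum(O^2 / (R * C)) - 1.
    phi2 = sum(
        o * o / (row_totals[x] * col_totals[y]) for (x, y), o in joint.items()
    ) - 1.0
    k = min(len(row_totals), len(col_totals)) - 1
    if k <= 0:
        raise ValueError("each variable needs at least two distinct values")
    return sqrt(max(phi2, 0.0) / k)

# Stand-in data: a unique identifier per row and five groups of the
# sizes reported for the categorical variable (hypothetical labels).
ids = [f"row{i}" for i in range(35)]
groups = [g for g, size in zip("ABCDE", [11, 10, 6, 6, 2]) for _ in range(size)]

v_ids_groups = cramers_v(ids, groups)
v_ids_ids = cramers_v(ids, ids)
print(f"{v_ids_groups:.3f} {v_ids_ids:.3f}")
```

Under this reading, a matrix of all ones says little about the data beyond the fact that unique columns determine everything else.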

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

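First rows

Row | 가구종류_구분코드(STA_CD) | 통계항목(CLSS1) | 분류(CLSS2)
0 | GA_CO_001 | 방,거실,식당수별가구 | 방1개 (1 room)
1 | GA_CO_002 | 방,거실,식당수별가구 | 방2개 (2 rooms)
2 | GA_CO_003 | 방,거실,식당수별가구 | 방3개 (3 rooms)
3 | GA_CO_004 | 방,거실,식당수별가구 | 방4개 (4 rooms)
4 | GA_CO_005 | 방,거실,식당수별가구 | 방5개이상 (5 or more rooms)
5 | GA_CO_006 | 방,거실,식당수별가구 | 거실없음 (no living room)
6 | GA_CO_007 | 방,거실,식당수별가구 | 거실1개 (1 living room)
7 | GA_CO_008 | 방,거실,식당수별가구 | 거실2개이상 (2 or more living rooms)
8 | GA_CO_009 | 방,거실,식당수별가구 | 식당없음 (no dining room)
9 | GA_CO_010 | 방,거실,식당수별가구 | 식당1개 (1 dining room)

Last rows

Row | 가구종류_구분코드(STA_CD) | 통계항목(CLSS1) | 분류(CLSS2)
25 | GA_PO_005 | 점유형태별가구 | 무상 (rent-free)
26 | GA_PO_006 | 점유형태별가구 | 보증금없는월세 (monthly rent without deposit)
27 | GA_SD_001 | 세대구성별가구 | 1세대가구 (one-generation household)
28 | GA_SD_002 | 세대구성별가구 | 2세대가구 (two-generation household)
29 | GA_SD_003 | 세대구성별가구 | 3세대가구 (three-generation household)
30 | GA_SD_004 | 세대구성별가구 | 4세대가구 (four-generation household)
31 | GA_SD_005 | 세대구성별가구 | 1인가구 (single-person household)
32 | GA_SD_006 | 세대구성별가구 | 비혈연가구 (non-relative household)
33 | TO_GA_001 | 가구총괄 | 총가구수 (total number of households)
34 | TO_GA_002 | 가구총괄 | 평균가구원수 (average household size)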
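
Every code in the sample rows has the same shape: two uppercase letters, an underscore, two more uppercase letters, an underscore, and a three-digit number, which agrees with the fixed length of 9 reported for that variable. A small parser, assuming this shape holds for all 35 codes (the pattern and the function name `parse_code` are illustrative, not part of the dataset's documentation):

```python
import re

# Assumed shape of a code: AA_BB_NNN (for example GA_CO_001).
CODE_PATTERN = re.compile(r"^([A-Z]{2})_([A-Z]{2})_([0-9]{3})$")

def parse_code(code: str) -> tuple[str, str, int]:
    """Split a code into its two letter groups and its numeric suffix."""
    match = CODE_PATTERN.match(code)
    if match is None:
        raise ValueError(f"unexpected code format: {code!r}")
    first, second, number = match.groups()
    return first, second, int(number)

# Two codes taken from the sample rows.
for code in ["GA_CO_001", "TO_GA_002"]:
    print(code, parse_code(code))
```

Raising on a mismatch, rather than returning a partial result, makes it obvious if a code outside the shown sample breaks the assumed pattern.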