Overview

Dataset statistics

Number of variables: 13
Number of observations: 24
Missing cells: 259
Missing cells (%): 83.0%
Duplicate rows: 1
Duplicate rows (%): 4.2%
Total size in memory: 2.8 KiB
Average record size in memory: 118.5 B
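
The percentages above follow from the raw counts. The short check below re-derives them; it assumes (this is not stated in the report) that the missing-cell percentage is taken over all cells (variables × observations) and that the total size is the average record size times the number of observations.

```python
# Figures copied from the dataset statistics above.
n_variables = 13
n_observations = 24
missing_cells = 259
duplicate_rows = 1
avg_record_bytes = 118.5

# Assumption: percentages are relative to all cells / all rows.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: total size = average record size * number of observations.
total_kib = avg_record_bytes * n_observations / 1024

print(f"Missing cells: {missing_pct:.1f}%")     # 83.0%
print(f"Duplicate rows: {duplicate_pct:.1f}%")  # 4.2%
print(f"Total size: {total_kib:.1f} KiB")       # 2.8 KiB
```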

Variable types

Text: 4
Unsupported: 9

Dataset

Description: File download
Author: SH Corporation (SH공사)
URL: https://data.seoul.go.kr/dataList/OA-12917/F/1/datasetView.do

Alerts

Dataset has 1 (4.2%) duplicate rows [Duplicates]
Unnamed: 0 has 10 (41.7%) missing values [Missing]
Unnamed: 1 has 11 (45.8%) missing values [Missing]
Unnamed: 2 has 24 (100.0%) missing values [Missing]
Unnamed: 3 has 11 (45.8%) missing values [Missing]
Unnamed: 4 has 11 (45.8%) missing values [Missing]
Unnamed: 5 has 24 (100.0%) missing values [Missing]
Unnamed: 6 has 24 (100.0%) missing values [Missing]
Unnamed: 7 has 24 (100.0%) missing values [Missing]
Unnamed: 8 has 24 (100.0%) missing values [Missing]
Unnamed: 9 has 24 (100.0%) missing values [Missing]
Unnamed: 10 has 24 (100.0%) missing values [Missing]
Unnamed: 11 has 24 (100.0%) missing values [Missing]
Unnamed: 12 has 24 (100.0%) missing values [Missing]
Unnamed: 2 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 5 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 6 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 7 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 8 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 9 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 10 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 11 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
Unnamed: 12 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
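
The alerts above flag nine columns that are entirely missing and several rows with no values at all. One possible cleaning step, sketched here with pandas on a hypothetical miniature of the sheet (the frame `raw` and its contents are illustrative, not the real file), is to drop columns and rows that contain only missing values.

```python
import pandas as pd

# Hypothetical miniature of the profiled sheet: two populated columns
# and two columns that are missing in every row.
raw = pd.DataFrame(
    {
        "Unnamed: 0": ["영업수익", None, "영업외수익", None],
        "Unnamed: 1": ["12,545", None, "472", None],
        "Unnamed: 2": [None, None, None, None],
        "Unnamed: 5": [None, None, None, None],
    }
)

cleaned = (
    raw.dropna(axis="columns", how="all")  # remove columns with no values
       .dropna(axis="index", how="all")    # remove rows with no values
       .reset_index(drop=True)
)

print(cleaned.shape)
print(list(cleaned.columns))
```

Whether dropping is appropriate depends on the source file; the empty columns may simply be layout padding from a spreadsheet export.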

Reproduction

Analysis started: 2023-12-11 03:50:51.573006
Analysis finished: 2023-12-11 03:50:52.321066
Duration: 0.75 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

Unnamed: 0
Text

MISSING 

Distinct: 14
Distinct (%): 100.0%
Missing: 10
Missing (%): 41.7%
Memory size: 324.0 B

Length

Max length: 14
Median length: 8
Mean length: 5.7857143
Min length: 1

Characters and Unicode

Total characters: 81
Distinct characters: 42
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
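
As an illustration of those character properties, the sketch below classifies the characters of one sample value from this column with Python's standard `unicodedata` module. It is not the profiler's own code; the two-letter codes are Unicode general categories (`Lo` = Other Letter, `Nd` = Decimal Number, `Zs` = Space Separator, `Ps` = Open Punctuation, `Pe` = Close Punctuation).

```python
import unicodedata
from collections import Counter

# First sample value of the column "Unnamed: 0".
value = "예산재무상태 (2023년)"

# Count characters per Unicode general category.
categories = Counter(unicodedata.category(ch) for ch in value)

print(len(value))        # string length, 14 (the column's max length)
print(dict(categories))  # counts keyed by category code
```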

Unique

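Unique: 14
Unique (%): 100.0%

Sample

1st row: 예산재무상태 (2023년) [Budget and financial status (2023)]
2nd row: 수 입 [Revenue]
3rd row: 영업수익 [Operating revenue]
4th row: - 판매사업수익 [- Sales business revenue]
5th row: - 임대사업수익 [- Rental business revenue]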
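Value | Count | Frequency (%)
(blank) | 4 | 20.0%
예산재무상태 [Budget and financial status] | 1 | 5.0%
관리사업수익 [Management business revenue] | 1 | 5.0%
유보자금 [Reserved funds] | 1 | 5.0%
자본잉여금 [Capital surplus] | 1 | 5.0%
자본금출연 [Capital contribution] | 1 | 5.0%
부채수입 [Debt proceeds] | 1 | 5.0%
자산처분 [Asset disposal] | 1 | 5.0%
영업외수익 [Non-operating revenue] | 1 | 5.0%
대행사업수익 [Agency business revenue] | 1 | 5.0%
Other values (7) | 7 | 35.0%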

Most occurring characters

ValueCountFrequency (%)
8
 
9.9%
6
 
7.4%
6
 
7.4%
6
 
7.4%
4
 
4.9%
- 4
 
4.9%
4
 
4.9%
3
 
3.7%
2
 
2.5%
2
 
2.5%
Other values (32) 36
44.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 65 | 80.2%
Space Separator | 6 | 7.4%
Dash Punctuation | 4 | 4.9%
Decimal Number | 4 | 4.9%
Close Punctuation | 1 | 1.2%
Open Punctuation | 1 | 1.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
8
 
12.3%
6
 
9.2%
6
 
9.2%
4
 
6.2%
4
 
6.2%
3
 
4.6%
2
 
3.1%
2
 
3.1%
2
 
3.1%
2
 
3.1%
Other values (25) 26
40.0%
Decimal Number
ValueCountFrequency (%)
2 2
50.0%
3 1
25.0%
0 1
25.0%
Space Separator
ValueCountFrequency (%)
6
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 65 | 80.2%
Common | 16 | 19.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
8
 
12.3%
6
 
9.2%
6
 
9.2%
4
 
6.2%
4
 
6.2%
3
 
4.6%
2
 
3.1%
2
 
3.1%
2
 
3.1%
2
 
3.1%
Other values (25) 26
40.0%
Common
ValueCountFrequency (%)
6
37.5%
- 4
25.0%
2 2
 
12.5%
) 1
 
6.2%
3 1
 
6.2%
0 1
 
6.2%
( 1
 
6.2%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 65 | 80.2%
ASCII | 16 | 19.8%

Most frequent character per block

Hangul
ValueCountFrequency (%)
8
 
12.3%
6
 
9.2%
6
 
9.2%
4
 
6.2%
4
 
6.2%
3
 
4.6%
2
 
3.1%
2
 
3.1%
2
 
3.1%
2
 
3.1%
Other values (25) 26
40.0%
ASCII
ValueCountFrequency (%)
6
37.5%
- 4
25.0%
2 2
 
12.5%
) 1
 
6.2%
3 1
 
6.2%
0 1
 
6.2%
( 1
 
6.2%

Unnamed: 1
Text

MISSING 

Distinct: 13
Distinct (%): 100.0%
Missing: 11
Missing (%): 45.8%
Memory size: 324.0 B

Length

Max length: 7
Median length: 6
Mean length: 4.8461538
Min length: 2

Characters and Unicode

Total characters: 63
Distinct characters: 17
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 13
Unique (%): 100.0%

Sample

1st row: (단위:억원) [(Unit: 100 million KRW)]
2nd row: 12,545
3rd row: 6,254
4th row: 1,595
5th row: 4,677
Value | Count | Frequency (%)
단위:억원 [Unit: 100 million KRW] | 1 | 7.7%
12,545 | 1 | 7.7%
6,254 | 1 | 7.7%
1,595 | 1 | 7.7%
4,677 | 1 | 7.7%
19 | 1 | 7.7%
472 | 1 | 7.7%
213 | 1 | 7.7%
16,029 | 1 | 7.7%
3,231 | 1 | 7.7%
Other values (3) | 3 | 23.1%
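
The amounts in this column are stored as text with thousands separators (for example 12,545), alongside a unit label, which is why the profiler treats the column as Text. A minimal sketch of converting such strings to integers follows; `parse_amount` is a hypothetical helper, not part of the report or of ydata-profiling.

```python
# Sample values copied from the column "Unnamed: 1".
samples = ["(단위:억원)", "12,545", "6,254", "1,595", "4,677"]


def parse_amount(text):
    """Return an int for strings such as '12,545', or None for non-numeric labels."""
    digits = text.replace(",", "").strip()
    # isdecimal() is False for the unit label and for empty strings.
    return int(digits) if digits.isdecimal() else None


parsed = [parse_amount(s) for s in samples]
print(parsed)  # [None, 12545, 6254, 1595, 4677]
```

The unit label stays as `None` here; the amounts are expressed in units of 100 million KRW (억원), so no scaling is applied.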

Most occurring characters

ValueCountFrequency (%)
, 9
14.3%
1 8
12.7%
2 7
11.1%
3 7
11.1%
4 7
11.1%
5 6
9.5%
9 4
6.3%
6 4
6.3%
7 3
 
4.8%
( 1
 
1.6%
Other values (7) 7
11.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 47 | 74.6%
Other Punctuation | 10 | 15.9%
Other Letter | 4 | 6.3%
Open Punctuation | 1 | 1.6%
Close Punctuation | 1 | 1.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 8
17.0%
2 7
14.9%
3 7
14.9%
4 7
14.9%
5 6
12.8%
9 4
8.5%
6 4
8.5%
7 3
 
6.4%
0 1
 
2.1%
Other Letter
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
Other Punctuation
ValueCountFrequency (%)
, 9
90.0%
: 1
 
10.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 59 | 93.7%
Hangul | 4 | 6.3%

Most frequent character per script

Common
ValueCountFrequency (%)
, 9
15.3%
1 8
13.6%
2 7
11.9%
3 7
11.9%
4 7
11.9%
5 6
10.2%
9 4
6.8%
6 4
6.8%
7 3
 
5.1%
( 1
 
1.7%
Other values (3) 3
 
5.1%
Hangul
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 59 | 93.7%
Hangul | 4 | 6.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
, 9
15.3%
1 8
13.6%
2 7
11.9%
3 7
11.9%
4 7
11.9%
5 6
10.2%
9 4
6.8%
6 4
6.8%
7 3
 
5.1%
( 1
 
1.7%
Other values (3) 3
 
5.1%
Hangul
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Unnamed: 2
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 24
Missing (%): 100.0%
Memory size: 348.0 B

Unnamed: 3
Text

MISSING 

Distinct: 13
Distinct (%): 100.0%
Missing: 11
Missing (%): 45.8%
Memory size: 324.0 B

Length

Max length: 9
Median length: 5
Mean length: 5.0769231
Min length: 1

Characters and Unicode

Total characters: 66
Distinct characters: 34
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

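Unique: 13
Unique (%): 100.0%

Sample

1st row: 지 출 [Expenditure]
2nd row: 영업비용 [Operating expenses]
3rd row: - 택지개발사업비 [- Land development project costs]
4th row: - 주택건설사업비 [- Housing construction project costs]
5th row: - 임대사업비 [- Rental business costs]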
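Value | Count | Frequency (%)
(blank) | 5 | 25.0%
(blank) | 1 | 5.0%
(blank) | 1 | 5.0%
영업비용 [Operating expenses] | 1 | 5.0%
택지개발사업비 [Land development project costs] | 1 | 5.0%
주택건설사업비 [Housing construction project costs] | 1 | 5.0%
임대사업비 [Rental business costs] | 1 | 5.0%
대행사업비 [Agency business costs] | 1 | 5.0%
(blank) | 1 | 5.0%
경상비 [Recurring expenses] | 1 | 5.0%
Other values (6) | 6 | 30.0%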

Most occurring characters

ValueCountFrequency (%)
9
 
13.6%
7
 
10.6%
6
 
9.1%
- 5
 
7.6%
4
 
6.1%
2
 
3.0%
2
 
3.0%
2
 
3.0%
2
 
3.0%
2
 
3.0%
Other values (24) 25
37.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 54 | 81.8%
Space Separator | 7 | 10.6%
Dash Punctuation | 5 | 7.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
9
16.7%
6
 
11.1%
4
 
7.4%
2
 
3.7%
2
 
3.7%
2
 
3.7%
2
 
3.7%
2
 
3.7%
2
 
3.7%
1
 
1.9%
Other values (22) 22
40.7%
Space Separator
ValueCountFrequency (%)
7
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 54 | 81.8%
Common | 12 | 18.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
9
16.7%
6
 
11.1%
4
 
7.4%
2
 
3.7%
2
 
3.7%
2
 
3.7%
2
 
3.7%
2
 
3.7%
2
 
3.7%
1
 
1.9%
Other values (22) 22
40.7%
Common
ValueCountFrequency (%)
7
58.3%
- 5
41.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 54 | 81.8%
ASCII | 12 | 18.2%

Most frequent character per block

Hangul
ValueCountFrequency (%)
9
16.7%
6
 
11.1%
4
 
7.4%
2
 
3.7%
2
 
3.7%
2
 
3.7%
2
 
3.7%
2
 
3.7%
2
 
3.7%
1
 
1.9%
Other values (22) 22
40.7%
ASCII
ValueCountFrequency (%)
7
58.3%
- 5
41.7%

Unnamed: 4
Text

MISSING 

Distinct: 13
Distinct (%): 100.0%
Missing: 11
Missing (%): 45.8%
Memory size: 324.0 B

Length

Max length: 7
Median length: 6
Mean length: 5.2307692
Min length: 3

Characters and Unicode

Total characters: 68
Distinct characters: 19
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 13
Unique (%): 100.0%

Sample

1st row: (단위:억원) [(Unit: 100 million KRW)]
2nd row: 18,850
3rd row: 5,554
4th row: 2,995
5th row: 1,633
Value | Count | Frequency (%)
단위:억원 [Unit: 100 million KRW] | 1 | 7.7%
18,850 | 1 | 7.7%
5,554 | 1 | 7.7%
2,995 | 1 | 7.7%
1,633 | 1 | 7.7%
5,273 | 1 | 7.7%
3,395 | 1 | 7.7%
1,303 | 1 | 7.7%
489 | 1 | 7.7%
216 | 1 | 7.7%
Other values (3) | 3 | 23.1%

Most occurring characters

ValueCountFrequency (%)
, 10
14.7%
3 10
14.7%
5 7
10.3%
1 7
10.3%
9 6
8.8%
6 4
 
5.9%
2 4
 
5.9%
4 4
 
5.9%
8 3
 
4.4%
0 3
 
4.4%
Other values (9) 10
14.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 49 | 72.1%
Other Punctuation | 11 | 16.2%
Other Letter | 4 | 5.9%
Space Separator | 2 | 2.9%
Close Punctuation | 1 | 1.5%
Open Punctuation | 1 | 1.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 10
20.4%
5 7
14.3%
1 7
14.3%
9 6
12.2%
6 4
 
8.2%
2 4
 
8.2%
4 4
 
8.2%
8 3
 
6.1%
0 3
 
6.1%
7 1
 
2.0%
Other Letter
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
Other Punctuation
ValueCountFrequency (%)
, 10
90.9%
: 1
 
9.1%
Space Separator
ValueCountFrequency (%)
2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 64 | 94.1%
Hangul | 4 | 5.9%

Most frequent character per script

Common
ValueCountFrequency (%)
, 10
15.6%
3 10
15.6%
5 7
10.9%
1 7
10.9%
9 6
9.4%
6 4
 
6.2%
2 4
 
6.2%
4 4
 
6.2%
8 3
 
4.7%
0 3
 
4.7%
Other values (5) 6
9.4%
Hangul
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 64 | 94.1%
Hangul | 4 | 5.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
, 10
15.6%
3 10
15.6%
5 7
10.9%
1 7
10.9%
9 6
9.4%
6 4
 
6.2%
2 4
 
6.2%
4 4
 
6.2%
8 3
 
4.7%
0 3
 
4.7%
Other values (5) 6
9.4%
Hangul
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Unnamed: 5
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 24
Missing (%): 100.0%
Memory size: 348.0 B

Unnamed: 6
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 24
Missing (%): 100.0%
Memory size: 348.0 B

Unnamed: 7
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 24
Missing (%): 100.0%
Memory size: 348.0 B

Unnamed: 8
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 24
Missing (%): 100.0%
Memory size: 348.0 B

Unnamed: 9
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 24
Missing (%): 100.0%
Memory size: 348.0 B

Unnamed: 10
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 24
Missing (%): 100.0%
Memory size: 348.0 B

Unnamed: 11
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 24
Missing (%): 100.0%
Memory size: 348.0 B

Unnamed: 12
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 24
Missing (%): 100.0%
Memory size: 348.0 B

Correlations

Variable | Unnamed: 0 | Unnamed: 1 | Unnamed: 3 | Unnamed: 4
Unnamed: 0 | 1.000 | 1.000 | 1.000 | 1.000
Unnamed: 1 | 1.000 | 1.000 | 1.000 | 1.000
Unnamed: 3 | 1.000 | 1.000 | 1.000 | 1.000
Unnamed: 4 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
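
One common way to compute a nullity correlation is to correlate the per-column missingness indicators. The sketch below does this with pandas on hypothetical data; it assumes a Pearson correlation over `isna()` flags, which may differ in detail from what the profiler uses for its heatmap.

```python
import pandas as pd

# Hypothetical data: the first two columns are missing in exactly the same rows,
# while the third has a different missingness pattern.
df = pd.DataFrame(
    {
        "Unnamed: 0": ["a", None, "b", None, "c", None],
        "Unnamed: 1": ["1", None, "2", None, "3", None],
        "Unnamed: 3": ["x", "y", None, None, "z", None],
    }
)

# 1 where a value is missing, 0 where it is present.
nullity = df.isna().astype(int)

# Pearson correlation between the missingness indicators of each column pair.
nullity_corr = nullity.corr()
print(nullity_corr.round(3))
```

Columns that are always missing (or never missing) have a constant indicator, so their correlation is undefined and is usually left out of such a heatmap.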

Sample

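Row | Unnamed: 0 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | Unnamed: 11 | Unnamed: 12
0 | 예산재무상태 (2023년) | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
1 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
2 | 수 입 | <NA> | <NA> | 지 출 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
3 | <NA> | (단위:억원) | <NA> | <NA> | (단위:억원) | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
4 | 영업수익 | 12,545 | <NA> | 영업비용 | 18,850 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
5 | - 판매사업수익 | 6,254 | <NA> | - 택지개발사업비 | 5,554 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
6 | - 임대사업수익 | 1,595 | <NA> | - 주택건설사업비 | 2,995 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
7 | - 대행사업수익 | 4,677 | <NA> | - 임대사업비 | 1,633 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
8 | - 관리사업수익 | 19 | <NA> | - 대행사업비 등 | 5,273 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
9 | 영업외수익 | 472 | <NA> | - 경상비 | 3,395 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>

Row | Unnamed: 0 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | Unnamed: 11 | Unnamed: 12
14 | 유보자금 | 3,349 | <NA> | 부채상환 | 9,342 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
15 | (blank) | 41,163 | <NA> | (blank) | 41,163 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
16 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
17 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
18 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
19 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
20 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
21 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
22 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
23 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>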

Duplicate rows

Most frequently occurring

Row | Unnamed: 0 | Unnamed: 1 | Unnamed: 3 | Unnamed: 4 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | 9
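
The table reports a single distinct row, entirely missing across the four text columns, that occurs 9 times. A minimal sketch of counting repeated rows follows, using only the Python standard library and a hypothetical five-row excerpt; it illustrates the idea and is not the profiler's implementation.

```python
from collections import Counter

# Hypothetical excerpt: each tuple holds the four text columns of one row,
# with None standing in for <NA>.
rows = [
    ("수 입", None, "지 출", None),
    (None, None, None, None),
    ("영업수익", "12,545", "영업비용", "18,850"),
    (None, None, None, None),
    (None, None, None, None),
]

# Count identical rows and keep only those that occur more than once.
counts = Counter(rows)
duplicates = {row: n for row, n in counts.items() if n > 1}

for row, n in duplicates.items():
    print(row, n)
```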