Overview

Dataset statistics

Number of variables: 4
Number of observations: 48
Missing cells: 31
Missing cells (%): 16.1%
Duplicate rows: 3
Duplicate rows (%): 6.2%
Total size in memory: 1.6 KiB
Average record size in memory: 34.8 B
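The two percentages can be reproduced from the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (variables × observations) and the duplicate percentage over all observations; the variable names are illustrative and not part of the report.

```python
# Counts copied from the dataset statistics above.
n_variables = 4
n_observations = 48
missing_cells = 31
duplicate_rows = 3

# Assumption: missing % is relative to all cells, duplicate % to all rows.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.2f}%")
```

Under these assumptions the missing share comes out at about 16.1% and the duplicate share at 6.25%, which the report displays as 6.2%.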

Variable types

Text: 4

Dataset

Description: File download
Author: SH공사 (Seoul Housing and Communities Corporation)
URL: https://data.seoul.go.kr/dataList/OA-12920/F/1/datasetView.do

Alerts

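Dataset has 3 (6.2%) duplicate rows [Duplicates]
"위치 : 공사홈페이지 > 정부3.0 > 경영공시 > 예산재무상태 > 2022년" (Location: Corporation website > Government 3.0 > Management disclosure > Budget and financial status > 2022) has 8 (16.7%) missing values [Missing]
"Unnamed: 1" has 7 (14.6%) missing values [Missing]
"Unnamed: 2" has 9 (18.8%) missing values [Missing]
"Unnamed: 3" has 7 (14.6%) missing values [Missing]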

Reproduction

Analysis started: 2024-04-17 09:54:30.806192
Analysis finished: 2024-04-17 09:54:31.369803
Duration: 0.56 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

위치 : 공사홈페이지 > 정부3.0 > 경영공시 > 예산재무상태 > 2022년
Text

MISSING

Distinct: 36
Distinct (%): 90.0%
Missing: 8
Missing (%): 16.7%
Memory size: 516.0 B

Length

Max length: 17
Median length: 12
Mean length: 7.35
Min length: 2

Characters and Unicode

Total characters: 294
Distinct characters: 83
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
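As an illustration of these character properties, the sketch below uses Python's standard `unicodedata` module to count the general category of each character in the first sample row of this variable. It is only a sketch of the idea, not the profiler's own code; the standard module exposes categories (for example `Lo` for Other Letter) but not scripts or blocks.

```python
import unicodedata
from collections import Counter

# First sample row of this variable, copied from the report.
text = "▣ 재무제표 (2022년 결산)"

# General category of each character, e.g. "Lo" = Other Letter,
# "Nd" = Decimal Number, "Zs" = Space Separator, "So" = Other Symbol.
categories = Counter(unicodedata.category(ch) for ch in text)

for code, count in categories.most_common():
    print(code, count)
```

The string is 17 characters long, which matches the maximum length reported for this variable.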

Unique

Unique: 32
Unique (%): 80.0%
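The sketch below shows one way to tell the two counts apart. It assumes that "Distinct" is the number of different non-missing values and "Unique" is the number of values that occur exactly once, a reading that fits the figures above (36 distinct and 32 unique out of 40 non-missing values). The `values` list is illustrative and is not the real column.

```python
from collections import Counter

# Illustrative values only; None stands for a missing cell.
values = ["과목", "과목", "1.유동자산", "- 당좌자산", "- 재고자산", None]

non_missing = [v for v in values if v is not None]
counts = Counter(non_missing)

# Distinct: how many different values appear at all.
n_distinct = len(counts)
# Unique: how many values appear exactly once.
n_unique = sum(1 for c in counts.values() if c == 1)

print(n_distinct, n_unique)
print(f"{100 * n_distinct / len(non_missing):.1f}%")
```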

Sample

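1st row: ▣ 재무제표 (2022년 결산) (financial statements, 2022 closing)
2nd row: 과목 (account)
3rd row: 1.유동자산 (1. current assets)
4th row: - 당좌자산 (- quick assets)
5th row: - 재고자산 (- inventories)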
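| Value | Count | Frequency (%) |
|  | 18 | 23.4% |
| 2 | 4 | 5.2% |
| 임대사업 (rental business) | 2 | 2.6% |
| 기타사업 (other business) | 2 | 2.6% |
| 과목 (account) | 2 | 2.6% |
| 주택건설사업 (housing construction business) | 2 | 2.6% |
| 영업외 (non-operating) | 2 | 2.6% |
| 3 | 2 | 2.6% |
| 4 | 2 | 2.6% |
| 1 | 2 | 2.6% |
| Other values (39) | 39 | 50.6% |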

Most occurring characters

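| Value | Count | Frequency (%) |
| (space) | 55 | 18.7% |
| - | 18 | 6.1% |
| . | 15 | 5.1% |
|  | 14 | 4.8% |
|  | 10 | 3.4% |
|  | 9 | 3.1% |
|  | 8 | 2.7% |
|  | 7 | 2.4% |
| 2 | 7 | 2.4% |
|  | 6 | 2.0% |
| Other values (73) | 145 | 49.3% |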

Most occurring categories

| Value | Count | Frequency (%) |
| Other Letter | 184 | 62.6% |
| Space Separator | 55 | 18.7% |
| Decimal Number | 19 | 6.5% |
| Dash Punctuation | 18 | 6.1% |
| Other Punctuation | 15 | 5.1% |
| Other Symbol | 1 | 0.3% |
| Open Punctuation | 1 | 0.3% |
| Close Punctuation | 1 | 0.3% |

Most frequent character per category

Other Letter
| Value | Count | Frequency (%) |
|  | 14 | 7.6% |
|  | 10 | 5.4% |
|  | 9 | 4.9% |
|  | 8 | 4.3% |
|  | 7 | 3.8% |
|  | 6 | 3.3% |
|  | 6 | 3.3% |
|  | 6 | 3.3% |
|  | 5 | 2.7% |
|  | 5 | 2.7% |
| Other values (59) | 108 | 58.7% |

Decimal Number
| Value | Count | Frequency (%) |
| 2 | 7 | 36.8% |
| 1 | 4 | 21.1% |
| 4 | 2 | 10.5% |
| 3 | 2 | 10.5% |
| 5 | 1 | 5.3% |
| 6 | 1 | 5.3% |
| 0 | 1 | 5.3% |
| 7 | 1 | 5.3% |

Space Separator
| Value | Count | Frequency (%) |
| (space) | 55 | 100.0% |

Dash Punctuation
| Value | Count | Frequency (%) |
| - | 18 | 100.0% |

Other Punctuation
| Value | Count | Frequency (%) |
| . | 15 | 100.0% |

Other Symbol
| Value | Count | Frequency (%) |
| ▣ | 1 | 100.0% |

Open Punctuation
| Value | Count | Frequency (%) |
| ( | 1 | 100.0% |

Close Punctuation
| Value | Count | Frequency (%) |
| ) | 1 | 100.0% |

Most occurring scripts

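| Value | Count | Frequency (%) |
| Hangul | 184 | 62.6% |
| Common | 110 | 37.4% |

Most frequent character per script

Hangul
| Value | Count | Frequency (%) |
|  | 14 | 7.6% |
|  | 10 | 5.4% |
|  | 9 | 4.9% |
|  | 8 | 4.3% |
|  | 7 | 3.8% |
|  | 6 | 3.3% |
|  | 6 | 3.3% |
|  | 6 | 3.3% |
|  | 5 | 2.7% |
|  | 5 | 2.7% |
| Other values (59) | 108 | 58.7% |

Common
| Value | Count | Frequency (%) |
| (space) | 55 | 50.0% |
| - | 18 | 16.4% |
| . | 15 | 13.6% |
| 2 | 7 | 6.4% |
| 1 | 4 | 3.6% |
| 4 | 2 | 1.8% |
| 3 | 2 | 1.8% |
| 5 | 1 | 0.9% |
| 6 | 1 | 0.9% |
| ▣ | 1 | 0.9% |
| Other values (4) | 4 | 3.6% |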

Most occurring blocks

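| Value | Count | Frequency (%) |
| Hangul | 184 | 62.6% |
| ASCII | 109 | 37.1% |
| Geometric Shapes | 1 | 0.3% |

Most frequent character per block

ASCII
| Value | Count | Frequency (%) |
| (space) | 55 | 50.5% |
| - | 18 | 16.5% |
| . | 15 | 13.8% |
| 2 | 7 | 6.4% |
| 1 | 4 | 3.7% |
| 4 | 2 | 1.8% |
| 3 | 2 | 1.8% |
| 5 | 1 | 0.9% |
| 6 | 1 | 0.9% |
| ( | 1 | 0.9% |
| Other values (3) | 3 | 2.8% |

Hangul
| Value | Count | Frequency (%) |
|  | 14 | 7.6% |
|  | 10 | 5.4% |
|  | 9 | 4.9% |
|  | 8 | 4.3% |
|  | 7 | 3.8% |
|  | 6 | 3.3% |
|  | 6 | 3.3% |
|  | 6 | 3.3% |
|  | 5 | 2.7% |
|  | 5 | 2.7% |
| Other values (59) | 108 | 58.7% |

Geometric Shapes
| Value | Count | Frequency (%) |
| ▣ | 1 | 100.0% |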

Unnamed: 1
Text

MISSING 

Distinct: 38
Distinct (%): 92.7%
Missing: 7
Missing (%): 14.6%
Memory size: 516.0 B

Length

Max length: 10
Median length: 7
Mean length: 5.3414634
Min length: 2

Characters and Unicode

Total characters: 219
Distinct characters: 28
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 35
Unique (%): 85.4%

Sample

1st row: 재무상태표 [요약] (statement of financial position [summary])
2nd row: 2022년 (2022)
3rd row: 62,715
4th row: 35,218
5th row: 27,497
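| Value | Count | Frequency (%) |
| 2022년 (2022) | 2 | 4.7% |
| 279,625 | 2 | 4.7% |
| 2,556 | 2 | 4.7% |
| 요약 (summary) | 2 | 4.7% |
| 9,532 | 1 | 2.3% |
| 5,772 | 1 | 2.3% |
| 17,004 | 1 | 2.3% |
| 1,754 | 1 | 2.3% |
| 128 | 1 | 2.3% |
| 15,900 | 1 | 2.3% |
| Other values (29) | 29 | 67.4% |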

Most occurring characters

| Value | Count | Frequency (%) |
| 2 | 33 | 15.1% |
| , | 28 | 12.8% |
| 1 | 25 | 11.4% |
| 5 | 20 | 9.1% |
| 9 | 17 | 7.8% |
| 7 | 16 | 7.3% |
| 8 | 13 | 5.9% |
| 3 | 12 | 5.5% |
| 6 | 12 | 5.5% |
| 0 | 10 | 4.6% |
| Other values (18) | 33 | 15.1% |

Most occurring categories

| Value | Count | Frequency (%) |
| Decimal Number | 167 | 76.3% |
| Other Punctuation | 28 | 12.8% |
| Other Letter | 16 | 7.3% |
| Other Symbol | 2 | 0.9% |
| Space Separator | 2 | 0.9% |
| Open Punctuation | 2 | 0.9% |
| Close Punctuation | 2 | 0.9% |

Most frequent character per category

Other Letter
| Value | Count | Frequency (%) |
|  | 2 | 12.5% |
|  | 2 | 12.5% |
|  | 2 | 12.5% |
|  | 1 | 6.2% |
|  | 1 | 6.2% |
|  | 1 | 6.2% |
|  | 1 | 6.2% |
|  | 1 | 6.2% |
|  | 1 | 6.2% |
|  | 1 | 6.2% |
| Other values (3) | 3 | 18.8% |

Decimal Number
| Value | Count | Frequency (%) |
| 2 | 33 | 19.8% |
| 1 | 25 | 15.0% |
| 5 | 20 | 12.0% |
| 9 | 17 | 10.2% |
| 7 | 16 | 9.6% |
| 8 | 13 | 7.8% |
| 3 | 12 | 7.2% |
| 6 | 12 | 7.2% |
| 0 | 10 | 6.0% |
| 4 | 9 | 5.4% |

Other Punctuation
| Value | Count | Frequency (%) |
| , | 28 | 100.0% |

Other Symbol
| Value | Count | Frequency (%) |
|  | 2 | 100.0% |

Space Separator
| Value | Count | Frequency (%) |
| (space) | 2 | 100.0% |

Open Punctuation
| Value | Count | Frequency (%) |
| [ | 2 | 100.0% |

Close Punctuation
| Value | Count | Frequency (%) |
| ] | 2 | 100.0% |

Most occurring scripts

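| Value | Count | Frequency (%) |
| Common | 203 | 92.7% |
| Hangul | 16 | 7.3% |

Most frequent character per script

Common
| Value | Count | Frequency (%) |
| 2 | 33 | 16.3% |
| , | 28 | 13.8% |
| 1 | 25 | 12.3% |
| 5 | 20 | 9.9% |
| 9 | 17 | 8.4% |
| 7 | 16 | 7.9% |
| 8 | 13 | 6.4% |
| 3 | 12 | 5.9% |
| 6 | 12 | 5.9% |
| 0 | 10 | 4.9% |
| Other values (5) | 17 | 8.4% |

Hangul
| Value | Count | Frequency (%) |
|  | 2 | 12.5% |
|  | 2 | 12.5% |
|  | 2 | 12.5% |
|  | 1 | 6.2% |
|  | 1 | 6.2% |
|  | 1 | 6.2% |
|  | 1 | 6.2% |
|  | 1 | 6.2% |
|  | 1 | 6.2% |
|  | 1 | 6.2% |
| Other values (3) | 3 | 18.8% |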

Most occurring blocks

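| Value | Count | Frequency (%) |
| ASCII | 201 | 91.8% |
| Hangul | 16 | 7.3% |
| Geometric Shapes | 2 | 0.9% |

Most frequent character per block

ASCII
| Value | Count | Frequency (%) |
| 2 | 33 | 16.4% |
| , | 28 | 13.9% |
| 1 | 25 | 12.4% |
| 5 | 20 | 10.0% |
| 9 | 17 | 8.5% |
| 7 | 16 | 8.0% |
| 8 | 13 | 6.5% |
| 3 | 12 | 6.0% |
| 6 | 12 | 6.0% |
| 0 | 10 | 5.0% |
| Other values (4) | 15 | 7.5% |

Geometric Shapes
| Value | Count | Frequency (%) |
|  | 2 | 100.0% |

Hangul
| Value | Count | Frequency (%) |
|  | 2 | 12.5% |
|  | 2 | 12.5% |
|  | 2 | 12.5% |
|  | 1 | 6.2% |
|  | 1 | 6.2% |
|  | 1 | 6.2% |
|  | 1 | 6.2% |
|  | 1 | 6.2% |
|  | 1 | 6.2% |
|  | 1 | 6.2% |
| Other values (3) | 3 | 18.8% |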

Unnamed: 2
Text

MISSING 

Distinct: 36
Distinct (%): 92.3%
Missing: 9
Missing (%): 18.8%
Memory size: 516.0 B

Length

Max length: 7
Median length: 6
Mean length: 5.1538462
Min length: 2

Characters and Unicode

Total characters: 201
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 33
Unique (%): 84.6%

Sample

1st row: 2021년 (2021)
2nd row: 68,014
3rd row: 34,313
4th row: 33,701
5th row: 203,467
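| Value | Count | Frequency (%) |
| 271,481 | 2 | 5.1% |
| 1,879 | 2 | 5.1% |
| 2021년 (2021) | 2 | 5.1% |
| 283 | 1 | 2.6% |
| 571 | 1 | 2.6% |
| 1,591 | 1 | 2.6% |
| 1,207 | 1 | 2.6% |
| 2,798 | 1 | 2.6% |
| 10,600 | 1 | 2.6% |
| 58 | 1 | 2.6% |
| Other values (26) | 26 | 66.7% |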

Most occurring characters

| Value | Count | Frequency (%) |
| 1 | 34 | 16.9% |
| , | 29 | 14.4% |
| 2 | 19 | 9.5% |
| 8 | 19 | 9.5% |
| 7 | 17 | 8.5% |
| 4 | 17 | 8.5% |
| 0 | 17 | 8.5% |
| 3 | 14 | 7.0% |
| 9 | 11 | 5.5% |
| 6 | 10 | 5.0% |
| Other values (3) | 14 | 7.0% |

Most occurring categories

| Value | Count | Frequency (%) |
| Decimal Number | 168 | 83.6% |
| Other Punctuation | 29 | 14.4% |
| Other Letter | 2 | 1.0% |
| Other Symbol | 2 | 1.0% |

Most frequent character per category

Decimal Number
| Value | Count | Frequency (%) |
| 1 | 34 | 20.2% |
| 2 | 19 | 11.3% |
| 8 | 19 | 11.3% |
| 7 | 17 | 10.1% |
| 4 | 17 | 10.1% |
| 0 | 17 | 10.1% |
| 3 | 14 | 8.3% |
| 9 | 11 | 6.5% |
| 6 | 10 | 6.0% |
| 5 | 10 | 6.0% |

Other Punctuation
| Value | Count | Frequency (%) |
| , | 29 | 100.0% |

Other Letter
| Value | Count | Frequency (%) |
| 년 | 2 | 100.0% |

Other Symbol
| Value | Count | Frequency (%) |
|  | 2 | 100.0% |

Most occurring scripts

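| Value | Count | Frequency (%) |
| Common | 199 | 99.0% |
| Hangul | 2 | 1.0% |

Most frequent character per script

Common
| Value | Count | Frequency (%) |
| 1 | 34 | 17.1% |
| , | 29 | 14.6% |
| 2 | 19 | 9.5% |
| 8 | 19 | 9.5% |
| 7 | 17 | 8.5% |
| 4 | 17 | 8.5% |
| 0 | 17 | 8.5% |
| 3 | 14 | 7.0% |
| 9 | 11 | 5.5% |
| 6 | 10 | 5.0% |
| Other values (2) | 12 | 6.0% |

Hangul
| Value | Count | Frequency (%) |
| 년 | 2 | 100.0% |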

Most occurring blocks

| Value | Count | Frequency (%) |
| ASCII | 197 | 98.0% |
| Hangul | 2 | 1.0% |
| Geometric Shapes | 2 | 1.0% |

Most frequent character per block

ASCII
| Value | Count | Frequency (%) |
| 1 | 34 | 17.3% |
| , | 29 | 14.7% |
| 2 | 19 | 9.6% |
| 8 | 19 | 9.6% |
| 7 | 17 | 8.6% |
| 4 | 17 | 8.6% |
| 0 | 17 | 8.6% |
| 3 | 14 | 7.1% |
| 9 | 11 | 5.6% |
| 6 | 10 | 5.1% |

Hangul
| Value | Count | Frequency (%) |
| 년 | 2 | 100.0% |

Geometric Shapes
| Value | Count | Frequency (%) |
|  | 2 | 100.0% |

Unnamed: 3
Text

MISSING 

Distinct: 37
Distinct (%): 90.2%
Missing: 7
Missing (%): 14.6%
Memory size: 516.0 B

Length

Max length: 9
Median length: 7
Mean length: 4.2439024
Min length: 1

Characters and Unicode

Total characters: 174
Distinct characters: 22
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 33
Unique (%): 80.5%

Sample

1st row: (단위 : 억원) (unit: KRW 100 million)
2nd row: 증감 (change)
3rd row: △5,299
4th row: 905
5th row: △6,204
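| Value | Count | Frequency (%) |
| 8,144 | 2 | 4.4% |
|  | 2 | 4.4% |
| 억원 (KRW 100 million) | 2 | 4.4% |
| 증감 (change) | 2 | 4.4% |
| 677 | 2 | 4.4% |
| 단위 (unit) | 2 | 4.4% |
| △424 | 1 | 2.2% |
| 4,295 | 1 | 2.2% |
| 202 | 1 | 2.2% |
| 61 | 1 | 2.2% |
| Other values (29) | 29 | 64.4% |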

Most occurring characters

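| Value | Count | Frequency (%) |
| 4 | 21 | 12.1% |
| , | 17 | 9.8% |
| 2 | 17 | 9.8% |
| 1 | 15 | 8.6% |
| 7 | 13 | 7.5% |
| 3 | 11 | 6.3% |
| 9 | 11 | 6.3% |
| △ | 10 | 5.7% |
| 0 | 10 | 5.7% |
| 5 | 10 | 5.7% |
| Other values (12) | 39 | 22.4% |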

Most occurring categories

| Value | Count | Frequency (%) |
| Decimal Number | 125 | 71.8% |
| Other Punctuation | 19 | 10.9% |
| Other Letter | 12 | 6.9% |
| Other Symbol | 10 | 5.7% |
| Space Separator | 4 | 2.3% |
| Open Punctuation | 2 | 1.1% |
| Close Punctuation | 2 | 1.1% |

Most frequent character per category

Decimal Number
| Value | Count | Frequency (%) |
| 4 | 21 | 16.8% |
| 2 | 17 | 13.6% |
| 1 | 15 | 12.0% |
| 7 | 13 | 10.4% |
| 3 | 11 | 8.8% |
| 9 | 11 | 8.8% |
| 0 | 10 | 8.0% |
| 5 | 10 | 8.0% |
| 6 | 10 | 8.0% |
| 8 | 7 | 5.6% |

Other Letter
| Value | Count | Frequency (%) |
|  | 2 | 16.7% |
|  | 2 | 16.7% |
|  | 2 | 16.7% |
|  | 2 | 16.7% |
|  | 2 | 16.7% |
|  | 2 | 16.7% |

Other Punctuation
| Value | Count | Frequency (%) |
| , | 17 | 89.5% |
| : | 2 | 10.5% |

Other Symbol
| Value | Count | Frequency (%) |
| △ | 10 | 100.0% |

Space Separator
| Value | Count | Frequency (%) |
| (space) | 4 | 100.0% |

Open Punctuation
| Value | Count | Frequency (%) |
| ( | 2 | 100.0% |

Close Punctuation
| Value | Count | Frequency (%) |
| ) | 2 | 100.0% |

Most occurring scripts

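| Value | Count | Frequency (%) |
| Common | 162 | 93.1% |
| Hangul | 12 | 6.9% |

Most frequent character per script

Common
| Value | Count | Frequency (%) |
| 4 | 21 | 13.0% |
| , | 17 | 10.5% |
| 2 | 17 | 10.5% |
| 1 | 15 | 9.3% |
| 7 | 13 | 8.0% |
| 3 | 11 | 6.8% |
| 9 | 11 | 6.8% |
| △ | 10 | 6.2% |
| 0 | 10 | 6.2% |
| 5 | 10 | 6.2% |
| Other values (6) | 27 | 16.7% |

Hangul
| Value | Count | Frequency (%) |
|  | 2 | 16.7% |
|  | 2 | 16.7% |
|  | 2 | 16.7% |
|  | 2 | 16.7% |
|  | 2 | 16.7% |
|  | 2 | 16.7% |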

Most occurring blocks

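| Value | Count | Frequency (%) |
| ASCII | 152 | 87.4% |
| Hangul | 12 | 6.9% |
| Geometric Shapes | 10 | 5.7% |

Most frequent character per block

ASCII
| Value | Count | Frequency (%) |
| 4 | 21 | 13.8% |
| , | 17 | 11.2% |
| 2 | 17 | 11.2% |
| 1 | 15 | 9.9% |
| 7 | 13 | 8.6% |
| 3 | 11 | 7.2% |
| 9 | 11 | 7.2% |
| 0 | 10 | 6.6% |
| 5 | 10 | 6.6% |
| 6 | 10 | 6.6% |
| Other values (5) | 17 | 11.2% |

Geometric Shapes
| Value | Count | Frequency (%) |
| △ | 10 | 100.0% |

Hangul
| Value | Count | Frequency (%) |
|  | 2 | 16.7% |
|  | 2 | 16.7% |
|  | 2 | 16.7% |
|  | 2 | 16.7% |
|  | 2 | 16.7% |
|  | 2 | 16.7% |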

Correlations

|  | 위치 : 공사홈페이지 > 정부3.0 > 경영공시 > 예산재무상태 > 2022년 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 |
| 위치 : 공사홈페이지 > 정부3.0 > 경영공시 > 예산재무상태 > 2022년 | 1.000 | 0.977 | 0.977 | 0.977 |
| Unnamed: 1 | 0.977 | 1.000 | 1.000 | 1.000 |
| Unnamed: 2 | 0.977 | 1.000 | 1.000 | 1.000 |
| Unnamed: 3 | 0.977 | 1.000 | 1.000 | 1.000 |

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

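| Index | 위치 : 공사홈페이지 > 정부3.0 > 경영공시 > 예산재무상태 > 2022년 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 |
| 0 | <NA> | <NA> | <NA> | <NA> |
| 1 | ▣ 재무제표 (2022년 결산) (financial statements, 2022 closing) | <NA> | <NA> | <NA> |
| 2 | <NA> | <NA> | <NA> | <NA> |
| 3 | <NA> | 재무상태표 [요약] (statement of financial position [summary]) | <NA> | <NA> |
| 4 | <NA> | <NA> | <NA> | (단위 : 억원) (unit: KRW 100 million) |
| 5 | 과목 (account) | 2022년 (2022) | 2021년 (2021) | 증감 (change) |
| 6 | 1.유동자산 (1. current assets) | 62,715 | 68,014 | △5,299 |
| 7 | - 당좌자산 (- quick assets) | 35,218 | 34,313 | 905 |
| 8 | - 재고자산 (- inventories) | 27,497 | 33,701 | △6,204 |
| 9 | 2. 비유동자산 (2. non-current assets) | 216,910 | 203,467 | 13,443 |

| Index | 위치 : 공사홈페이지 > 정부3.0 > 경영공시 > 예산재무상태 > 2022년 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 |
| 38 | - 기타사업 (- other business) | 82 | 58 | 24 |
| 39 | 3. 매출총이익 (3. gross profit) | 3,235 | 2,798 | 437 |
| 40 | - 판매비와 관리비 (- selling and administrative expenses) | 1,524 | 1,207 | 317 |
| 41 | 4. 영업이익 (4. operating income) | 1,711 | 1,591 | 120 |
| 42 | - 영업외 수익 (- non-operating income) | 1,427 | 571 | 856 |
| 43 | - 영업외 비용 (- non-operating expenses) | 582 | 283 | 299 |
| 44 | 5. 경상이익 (5. ordinary income) | 2,556 | 1,879 | 677 |
| 45 | 6. 세전순이익 (6. net income before tax) | 2,556 | 1,879 | 677 |
| 46 | - 법인세비용 (- income tax expense) | 937 | 481 | 456 |
| 47 | 7. 당기순이익 (7. net income) | 1,619 | 1,398 | 221 |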
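In the rows above, the 증감 (change) column matches the 2022 value minus the 2021 value, with △ marking a decrease (for example 62,715 − 68,014 = −5,299). Because every column is profiled as text, the amounts would need parsing before any numeric analysis. The sketch below shows one possible way to do that; `parse_amount` is a hypothetical helper, and treating a leading △ as a minus sign is an assumption inferred from these rows.

```python
def parse_amount(text: str) -> int:
    """Convert a report-style amount such as "△5,299" to an int."""
    cleaned = text.strip().replace(",", "")
    # Assumption: a leading △ marks a negative value (a decrease).
    if cleaned.startswith("△"):
        return -int(cleaned[1:])
    return int(cleaned)


# Row 6 of the sample: 2022 value, 2021 value, reported change.
y2022 = parse_amount("62,715")
y2021 = parse_amount("68,014")
change = parse_amount("△5,299")

print(y2022 - y2021, change)
```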

Duplicate rows

Most frequently occurring

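| Index | 위치 : 공사홈페이지 > 정부3.0 > 경영공시 > 예산재무상태 > 2022년 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | # duplicates |
| 2 | <NA> | <NA> | <NA> | <NA> | 4 |
| 0 | 과목 (account) | 2022년 (2022) | 2021년 (2021) | 증감 (change) | 2 |
| 1 | <NA> | <NA> | <NA> | (단위 : 억원) (unit: KRW 100 million) | 2 |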
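A table like this can be reproduced by counting identical rows. The sketch below uses only the standard library and assumes that the "3 duplicate rows" in the overview counts distinct row patterns that occur more than once, which fits the three entries listed here. The `rows` list is illustrative, modelled on the table, and is not the full dataset.

```python
from collections import Counter

NA = None  # stands in for <NA>

# Illustrative rows modelled on the table above (not the full dataset).
rows = [
    (NA, NA, NA, NA),
    (NA, NA, NA, NA),
    (NA, NA, NA, NA),
    (NA, NA, NA, NA),
    ("과목", "2022년", "2021년", "증감"),
    ("과목", "2022년", "2021년", "증감"),
    (NA, NA, NA, "(단위 : 억원)"),
    (NA, NA, NA, "(단위 : 억원)"),
    ("1.유동자산", "62,715", "68,014", "△5,299"),
]

# Count identical rows and keep only the patterns seen more than once.
counts = Counter(rows)
duplicated = {row: n for row, n in counts.items() if n > 1}

for row, n in sorted(duplicated.items(), key=lambda item: -item[1]):
    print(n, row)
```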