Overview

Dataset statistics

Number of variables: 4
Number of observations: 48
Missing cells: 31
Missing cells (%): 16.1%
Duplicate rows: 5
Duplicate rows (%): 10.4%
Total size in memory: 1.6 KiB
Average record size in memory: 34.8 B
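
The two percentages follow from the counts in this block. As a minimal sketch (the helper name `percent` is hypothetical and not part of ydata-profiling), missing cells are taken relative to all cells (variables x observations) and duplicate rows relative to the number of observations:

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Counts copied from the dataset statistics of this report.
n_vars, n_obs = 4, 48
missing_cells, duplicate_rows = 31, 5

total_cells = n_vars * n_obs                     # 4 * 48 = 192 cells
missing_pct = percent(missing_cells, total_cells)
duplicate_pct = percent(duplicate_rows, n_obs)

print(missing_pct, duplicate_pct)
```

Both values agree with the figures reported for this dataset (16.1% and 10.4%).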

Variable types

Text: 1
Unsupported: 3

Dataset

Description: File download
Author: SH공사 (SH Corporation)
URL: https://data.seoul.go.kr/dataList/OA-12920/F/1/datasetView.do

Alerts

[Duplicates] Dataset has 5 (10.4%) duplicate rows
[Missing] 위치 : 공사홈페이지 > 정부3.0 > 경영공시 > 예산재무상태 > 2015년 has 8 (16.7%) missing values
[Missing] Unnamed: 1 has 7 (14.6%) missing values
[Missing] Unnamed: 2 has 9 (18.8%) missing values
[Missing] Unnamed: 3 has 7 (14.6%) missing values
[Unsupported] Unnamed: 1 is an unsupported type, check if it needs cleaning or further analysis
[Unsupported] Unnamed: 2 is an unsupported type, check if it needs cleaning or further analysis
[Unsupported] Unnamed: 3 is an unsupported type, check if it needs cleaning or further analysis

Reproduction

Analysis started: 2023-12-11 04:51:09.532147
Analysis finished: 2023-12-11 04:51:09.926848
Duration: 0.39 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
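
The reported duration can be checked against the two timestamps. A minimal sketch using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the Reproduction block of this report.
started = datetime.strptime("2023-12-11 04:51:09.532147", FMT)
finished = datetime.strptime("2023-12-11 04:51:09.926848", FMT)

duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```

The difference is 0.394701 s, which rounds to the 0.39 seconds shown in the report.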

Variables

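위치 : 공사홈페이지 > 정부3.0 > 경영공시 > 예산재무상태 > 2015년
(Location: Corporation homepage > Government 3.0 > Management disclosure > Budget and financial status > 2015)
Text

Distinct: 36
Distinct (%): 90.0%
Missing: 8
Missing (%): 16.7%
Memory size: 516.0 B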

Length

Max length: 17
Median length: 11.5
Mean length: 7.175
Min length: 2
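
As a rough illustration of how such length statistics arise, the sketch below measures the five sample rows that this report lists for the column. It is not the profiler's own code. The reported mean is consistent with total characters divided by non-missing values (287 / 40, both figures from this report).

```python
from statistics import mean, median

# The five sample rows listed for this variable in the report.
sample_rows = [
    "▣ 재무제표 (2014년 결산)",
    "과목",
    "1.유동자산",
    "- 당좌자산",
    "- 재고자산",
]

lengths = [len(s) for s in sample_rows]   # length in Unicode code points
stats = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": mean(lengths),
    "min": min(lengths),
}
print(lengths, stats)

# Mean over the full column, from the report's own totals.
reported_mean = 287 / 40
```

The longest and shortest sample rows (17 and 2 characters) match the reported maximum and minimum. The median and mean of these five rows differ from the report's values because the report covers all 40 non-missing rows.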

Characters and Unicode

Total characters: 287
Distinct characters: 83
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
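
A minimal sketch of the category part of this analysis, using Python's standard `unicodedata` module on one sample value from this column. The mapping from two-letter codes to the long names is written out by hand here and covers only the eight categories named in this report. The standard library does not expose script or block properties, so those are not reproduced.

```python
import unicodedata
from collections import Counter

# Two-letter general categories mapped to the long names used in the report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Nd": "Decimal Number",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "So": "Other Symbol",
}

def category_counts(text: str) -> Counter:
    """Count characters in `text` by Unicode general category."""
    return Counter(unicodedata.category(ch) for ch in text)

counts = category_counts("▣ 재무제표 (2014년 결산)")
for code, n in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), n)
```

For this single value the Hangul syllables fall under Other Letter, the digits under Decimal Number, and the leading ▣ under Other Symbol.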

Unique

Unique: 32
Unique (%): 80.0%

Sample

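1st row: ▣ 재무제표 (2014년 결산) [Financial statements (FY2014 closing)]
2nd row: 과목 [account item]
3rd row: 1.유동자산 [1. Current assets]
4th row: - 당좌자산 [- Quick assets]
5th row: - 재고자산 [- Inventories]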
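
Value | Count | Frequency (%)
(blank) | 18 | 23.4%
2 | 4 | 5.2%
임대사업 [rental business] | 2 | 2.6%
기타사업 [other business] | 2 | 2.6%
과목 [account item] | 2 | 2.6%
주택건설사업 [housing construction business] | 2 | 2.6%
영업외 [non-operating] | 2 | 2.6%
3 | 2 | 2.6%
4 | 2 | 2.6%
1 | 2 | 2.6%
Other values (39) | 39 | 50.6%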
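
A minimal sketch of how a word table like this could be produced. It assumes whitespace tokenization followed by stripping leading and trailing ASCII punctuation. That tokenizer is an assumption, not something this report confirms, although it would explain both the entry with no visible value (a lone "-" reduced to an empty string) and bare numbers such as "2" (from "2."). The function name `word_counts` is hypothetical.

```python
import string
from collections import Counter

def word_counts(values):
    """Count whitespace-separated tokens after stripping edge punctuation."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            # Strip ASCII punctuation from both ends; a lone "-" becomes "".
            counts[token.strip(string.punctuation)] += 1
    return counts

# Sample values taken from this column of the report.
sample_rows = [
    "▣ 재무제표 (2014년 결산)",
    "과목",
    "1.유동자산",
    "- 당좌자산",
    "- 재고자산",
]

counts = word_counts(sample_rows)
print(counts.most_common(3))
```

On these five rows the empty token occurs twice, once for each leading "-".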

Most occurring characters

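Value | Count | Frequency (%)
(space) | 48 | 16.7%
- | 18 | 6.3%
. | 15 | 5.2%
(not rendered) | 13 | 4.5%
(not rendered) | 10 | 3.5%
(not rendered) | 9 | 3.1%
(not rendered) | 8 | 2.8%
(not rendered) | 7 | 2.4%
(not rendered) | 6 | 2.1%
(not rendered) | 6 | 2.1%
Other values (73) | 147 | 51.2%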

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 184 | 64.1%
Space Separator | 48 | 16.7%
Decimal Number | 19 | 6.6%
Dash Punctuation | 18 | 6.3%
Other Punctuation | 15 | 5.2%
Open Punctuation | 1 | 0.3%
Other Symbol | 1 | 0.3%
Close Punctuation | 1 | 0.3%

Most frequent character per category

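Other Letter
Value | Count | Frequency (%)
(not rendered) | 13 | 7.1%
(not rendered) | 10 | 5.4%
(not rendered) | 9 | 4.9%
(not rendered) | 8 | 4.3%
(not rendered) | 7 | 3.8%
(not rendered) | 6 | 3.3%
(not rendered) | 6 | 3.3%
(not rendered) | 6 | 3.3%
(not rendered) | 5 | 2.7%
(not rendered) | 5 | 2.7%
Other values (59) | 109 | 59.2%

Decimal Number
Value | Count | Frequency (%)
2 | 5 | 26.3%
1 | 5 | 26.3%
4 | 3 | 15.8%
3 | 2 | 10.5%
5 | 1 | 5.3%
6 | 1 | 5.3%
0 | 1 | 5.3%
7 | 1 | 5.3%

Space Separator
Value | Count | Frequency (%)
(space) | 48 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 18 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
. | 15 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1 | 100.0%

Other Symbol
Value | Count | Frequency (%)
▣ | 1 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1 | 100.0%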

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 184 | 64.1%
Common | 103 | 35.9%

Most frequent character per script

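Hangul
Value | Count | Frequency (%)
(not rendered) | 13 | 7.1%
(not rendered) | 10 | 5.4%
(not rendered) | 9 | 4.9%
(not rendered) | 8 | 4.3%
(not rendered) | 7 | 3.8%
(not rendered) | 6 | 3.3%
(not rendered) | 6 | 3.3%
(not rendered) | 6 | 3.3%
(not rendered) | 5 | 2.7%
(not rendered) | 5 | 2.7%
Other values (59) | 109 | 59.2%

Common
Value | Count | Frequency (%)
(space) | 48 | 46.6%
- | 18 | 17.5%
. | 15 | 14.6%
2 | 5 | 4.9%
1 | 5 | 4.9%
4 | 3 | 2.9%
3 | 2 | 1.9%
5 | 1 | 1.0%
6 | 1 | 1.0%
( | 1 | 1.0%
Other values (4) | 4 | 3.9%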

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 184 | 64.1%
ASCII | 102 | 35.5%
Geometric Shapes | 1 | 0.3%

Most frequent character per block

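ASCII
Value | Count | Frequency (%)
(space) | 48 | 47.1%
- | 18 | 17.6%
. | 15 | 14.7%
2 | 5 | 4.9%
1 | 5 | 4.9%
4 | 3 | 2.9%
3 | 2 | 2.0%
5 | 1 | 1.0%
6 | 1 | 1.0%
( | 1 | 1.0%
Other values (3) | 3 | 2.9%

Hangul
Value | Count | Frequency (%)
(not rendered) | 13 | 7.1%
(not rendered) | 10 | 5.4%
(not rendered) | 9 | 4.9%
(not rendered) | 8 | 4.3%
(not rendered) | 7 | 3.8%
(not rendered) | 6 | 3.3%
(not rendered) | 6 | 3.3%
(not rendered) | 6 | 3.3%
(not rendered) | 5 | 2.7%
(not rendered) | 5 | 2.7%
Other values (59) | 109 | 59.2%

Geometric Shapes
Value | Count | Frequency (%)
▣ | 1 | 100.0%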

Unnamed: 1
Unsupported

Alerts: Missing, Rejected, Unsupported

Missing: 7
Missing (%): 14.6%
Memory size: 516.0 B

Unnamed: 2
Unsupported

Alerts: Missing, Rejected, Unsupported

Missing: 9
Missing (%): 18.8%
Memory size: 516.0 B

Unnamed: 3
Unsupported

Alerts: Missing, Rejected, Unsupported

Missing: 7
Missing (%): 14.6%
Memory size: 516.0 B

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
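
As a rough sketch of what nullity correlation means, the code below computes the Pearson correlation between the presence indicators of two columns. The function name and the toy columns are hypothetical and are not taken from this dataset or from the profiler's implementation.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 'is present' indicators of two columns."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # undefined if a column is always or never present
    return cov / sqrt(var_a * var_b)

# Toy columns: values are present in exactly opposite positions.
left = [1, None, 3, None]
right = [None, 2, None, 4]

same = nullity_correlation(left, left)       # identical nullity pattern
opposite = nullity_correlation(left, right)  # complementary nullity pattern
print(same, opposite)
```

A value near 1 means two columns tend to be missing together, and a value near -1 means one tends to be present when the other is missing.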

Sample

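First rows

Index | 위치 : 공사홈페이지 > 정부3.0 > 경영공시 > 예산재무상태 > 2015년 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3
0 | <NA> | NaN | NaN | NaN
1 | ▣ 재무제표 (2014년 결산) | NaN | NaN | NaN
2 | <NA> | NaN | NaN | NaN
3 | <NA> | 재무상태표 [요약] | NaN | NaN
4 | <NA> | NaN | NaN | (단위 : 억원)
5 | 과목 | 2014년 | 2013년 | 증감
6 | 1.유동자산 | 90043 | 108390 | △18347
7 | - 당좌자산 | 17058 | 13499 | 3559
8 | - 재고자산 | 72985 | 94891 | △21906
9 | 2. 비유동자산 | 144276 | 134332 | 9944

Last rows

Index | 위치 : 공사홈페이지 > 정부3.0 > 경영공시 > 예산재무상태 > 2015년 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3
38 | - 기타사업 | 1 | 0 | 1
39 | 3. 매출총이익 | 4005 | 4115 | △110
40 | - 판매비와 관리비 | 1236 | 1539 | △303
41 | 4. 영업이익 | 2769 | 2576 | 193
42 | - 영업외 수익 | 660 | 1381 | △721
43 | - 영업외 비용 | 1976 | 1747 | 229
44 | 5. 경상이익 | 1453 | 2210 | △757
45 | 6. 세전순이익 | 1453 | 2210 | △757
46 | - 범인세비용 | 409 | 1013 | △604
47 | 7. 당기순이익 | 1044 | 1197 | △153

English glosses: 재무제표 (2014년 결산) = financial statements (FY2014 closing); 재무상태표 [요약] = statement of financial position [summary]; (단위 : 억원) = (unit: 100 million KRW); 과목 = account item; 2014년 / 2013년 = 2014 / 2013; 증감 = change; 유동자산 = current assets; 당좌자산 = quick assets; 재고자산 = inventories; 비유동자산 = non-current assets; 기타사업 = other business; 매출총이익 = gross profit; 판매비와 관리비 = selling and administrative expenses; 영업이익 = operating income; 영업외 수익 = non-operating income; 영업외 비용 = non-operating expenses; 경상이익 = ordinary income; 세전순이익 = net income before tax; 범인세비용 = corporate income tax expense (apparently a typo for 법인세비용 in the source data); 당기순이익 = net income.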
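
The numeric columns hold strings such as "△18347", where the leading △ appears to mark a decrease. A minimal sketch of converting such values to signed integers and checking that the change column equals 2014 minus 2013. The helper `parse_amount` is hypothetical, and the column boundaries used for the rows are an assumption, since the extracted sample rows run the values together.

```python
def parse_amount(text: str) -> int:
    """Parse an amount string, treating a leading '△' as a minus sign."""
    cleaned = text.strip().replace(",", "")
    if cleaned.startswith("△"):
        return -int(cleaned[1:])
    return int(cleaned)

# (label, 2014, 2013, change) as read from the sample rows of this report.
rows = [
    ("1.유동자산", "90043", "108390", "△18347"),
    ("- 당좌자산", "17058", "13499", "3559"),
    ("- 재고자산", "72985", "94891", "△21906"),
    ("7. 당기순이익", "1044", "1197", "△153"),
]

checks = []
for label, y2014, y2013, change in rows:
    consistent = parse_amount(y2014) - parse_amount(y2013) == parse_amount(change)
    checks.append(consistent)
    print(label, parse_amount(change), consistent)
```

Mixed content of this kind (header text, unit notes, and △-prefixed numbers in one column) may be why the profiler reports these columns as unsupported. Skipping the leading header rows and parsing the amounts before profiling would be one way to address that.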

Duplicate rows

Most frequently occurring

Index | 위치 : 공사홈페이지 > 정부3.0 > 경영공시 > 예산재무상태 > 2015년 | # duplicates
4 | <NA> | 8
0 | - 기타사업 | 2
1 | - 임대사업 | 2
2 | - 주택건설사업 | 2
3 | 과목 | 2
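
A minimal sketch of how duplicated rows can be grouped and counted with the standard library. The function name and the toy rows are hypothetical. The reported figure of 5 duplicate rows (10.4% of 48) matches the number of distinct row values listed in this table.

```python
from collections import Counter

def duplicate_groups(rows):
    """Map each row value that occurs more than once to its occurrence count."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}

# Toy single-column rows; None stands in for a missing value.
toy_rows = [
    ("과목",),
    ("- 임대사업",),
    ("과목",),
    (None,),
    (None,),
    ("1.유동자산",),
]

groups = duplicate_groups(toy_rows)
print(len(groups), groups)
```

Here two distinct row values are duplicated, each occurring twice.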