Overview

Dataset statistics

Number of variables: 3
Number of observations: 229
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 23
Duplicate rows (%): 10.0%
Total size in memory: 5.9 KiB
Average record size in memory: 26.6 B

Variable types

Numeric: 1
Categorical: 1
Text: 1

Dataset

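Description: Data on the standard draft-document attachment file codes used by the Government Employees Pension Service (공무원연금공단) for comprehensive accident compensation, including the fields 기안문첨부서류번호 (draft attachment document number), 기안문첨부구분코드 (draft attachment category code), and 기안문첨부파일설명 (draft attachment file description).
Author: 공무원연금공단 (Government Employees Pension Service)
URL: https://www.data.go.kr/data/15123826/fileData.do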

Alerts

Dataset has 23 (10.0%) duplicate rows (alert type: Duplicates)

Reproduction

Analysis started: 2023-12-12 17:57:15.316412
Analysis finished: 2023-12-12 17:57:15.742536
Duration: 0.43 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

기안문첨부서류번호 (Numeric)

Distinct: 41
Distinct (%): 17.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.3755459
Minimum: 1
Maximum: 41
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 3
Q3: 6
95-th percentile: 29.6
Maximum: 41
Range: 40
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 8.9397847
Coefficient of variation (CV): 1.4021991
Kurtosis: 4.5825424
Mean: 6.3755459
Median Absolute Deviation (MAD): 2
Skewness: 2.2952387
Sum: 1460
Variance: 79.91975
Monotonicity: Not monotonic
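
Several of the descriptive statistics above are tied together by simple identities, so they can be cross-checked against one another. The following is a minimal sketch using only figures reported in this document; the tolerance `TOL` and the helper name `consistent` are assumptions introduced for illustration, not part of the profiling tool.

```python
import math

# Figures copied from the report.
n = 229                 # Number of observations
total = 1460            # Sum
mean = 6.3755459        # Mean
std = 8.9397847         # Standard deviation
variance = 79.91975     # Variance
cv = 1.4021991          # Coefficient of variation (CV)

# Assumed tolerance: loose enough to absorb the rounding in the report.
TOL = 1e-5


def consistent(a: float, b: float, tol: float = TOL) -> bool:
    """Return True when a and b agree within a relative tolerance."""
    return math.isclose(a, b, rel_tol=tol)


checks = {
    "mean == sum / n": consistent(mean, total / n),
    "variance == std ** 2": consistent(variance, std ** 2),
    "cv == std / mean": consistent(cv, std / mean),
}

for name, ok in checks.items():
    print(f"{name}: {'ok' if ok else 'MISMATCH'}")
```

A mismatch in any of these checks would suggest a transcription error in the copied figures rather than a problem in the data itself.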
Histogram with fixed size bins (bins=41)

Common values

Value              Count  Frequency (%)
1                  64     27.9%
2                  46     20.1%
3                  29     12.7%
4                  16     7.0%
5                  11     4.8%
6                  8      3.5%
7                  7      3.1%
8                  4      1.7%
9                  3      1.3%
16                 2      0.9%
Other values (31)  39     17.0%

Minimum 10 values

Value  Count  Frequency (%)
1      64     27.9%
2      46     20.1%
3      29     12.7%
4      16     7.0%
5      11     4.8%
6      8      3.5%
7      7      3.1%
8      4      1.7%
9      3      1.3%
10     2      0.9%

Maximum 10 values

Value  Count  Frequency (%)
41     1      0.4%
40     1      0.4%
39     1      0.4%
38     1      0.4%
37     1      0.4%
36     1      0.4%
35     1      0.4%
34     1      0.4%
33     1      0.4%
32     1      0.4%
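
Each frequency percentage in the tables above is the count divided by the number of observations (229). As a hedged sketch, the snippet below recomputes the percentages for the first table and confirms that its counts add up to the full dataset; the helper name `frequency_pct` is hypothetical.

```python
# Counts copied from the first value/count table of the numeric variable.
n = 229  # Number of observations, from the Overview
common_counts = {
    "1": 64, "2": 46, "3": 29, "4": 16, "5": 11,
    "6": 8, "7": 7, "8": 4, "9": 3, "16": 2,
    "Other values (31)": 39,
}


def frequency_pct(count: int, total: int) -> str:
    """Format count / total as a percentage with one decimal place."""
    return f"{100 * count / total:.1f}%"


# The listed counts plus the "Other values" bucket should cover every row.
counts_cover_all_rows = sum(common_counts.values()) == n

for value, count in common_counts.items():
    print(f"{value}: {count} ({frequency_pct(count, n)})")
```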

기안문첨부구분코드 (Categorical)

Distinct: 2
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB
Value counts: 1 (143), 2 (86)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      143    62.4%
2      86     37.6%

Length

Histogram of lengths of the category

기안문첨부파일설명 (Text)

Distinct: 153
Distinct (%): 66.8%
Missing: 0
Missing (%): 0.0%
Memory size: 1.9 KiB

Length

Max length: 60
Median length: 32
Mean length: 16.375546
Min length: 4

Characters and Unicode

Total characters: 3750
Distinct characters: 211
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 114
Unique (%): 49.8%
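
The report lists both a Distinct count and a Unique count for each variable, and the two differ. A minimal sketch of the distinction, assuming that "distinct" means the number of different values and "unique" means the number of values that occur exactly once; the function name and the toy column are hypothetical and not taken from the real dataset.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (distinct, unique) for an iterable of hashable values.

    distinct: how many different values appear at all
    unique:   how many values appear exactly once
    """
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Hypothetical toy column, for illustration only.
toy_column = ["부지급 사유", "부지급 사유", "요청근거", "관계서류 1건"]
distinct, unique = distinct_and_unique(toy_column)
print(f"distinct={distinct}, unique={unique}")
```

Under this reading, a column whose values all repeat (such as a two-level code) has a Unique count of 0 even though its Distinct count is 2.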

Sample

1st row: 부지급 사유 (reason for non-payment)
2nd row: 부지급 사유
3rd row: 부지급 사유
4th row: 부지급 사유
5th row: 부지급 사유

Value                 Count  Frequency (%)
1부                   165    21.9%
사본                  43     5.7%
(not preserved)       30     4.0%
부지급                20     2.7%
(not preserved)       18     2.4%
사유                  18     2.4%
관계서류              12     1.6%
결정문                8      1.1%
공무상요양승인결정서  8      1.1%
일건서류              7      0.9%
Other values (253)    423    56.2%

Most occurring characters

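Value               Count  Frequency (%)
(space)             528    14.1%
(not preserved)     227    6.1%
1                   208    5.5%
(not preserved)     150    4.0%
(not preserved)     100    2.7%
(not preserved)     96     2.6%
(not preserved)     75     2.0%
(not preserved)     72     1.9%
(not preserved)     65     1.7%
(not preserved)     63     1.7%
Other values (201)  2166   57.8%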

Most occurring categories

Value                  Count  Frequency (%)
Other Letter           2763   73.7%
Space Separator        528    14.1%
Decimal Number         309    8.2%
Close Punctuation      54     1.4%
Open Punctuation       54     1.4%
Connector Punctuation  24     0.6%
Uppercase Letter       10     0.3%
Dash Punctuation       8      0.2%
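
The category names in this table are Unicode general categories. As an illustrative sketch (not the profiling tool's own implementation), Python's standard `unicodedata` module can tally the same categories for a list of strings. The mapping `CATEGORY_NAMES` and the function `category_counts` are names introduced here for illustration; the two sample strings are taken from the sample rows shown in this report.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Nd": "Decimal Number",
    "Pe": "Close Punctuation",
    "Ps": "Open Punctuation",
    "Pc": "Connector Punctuation",
    "Lu": "Uppercase Letter",
    "Pd": "Dash Punctuation",
}


def category_counts(texts):
    """Count characters per Unicode general category across all texts."""
    counts = Counter()
    for text in texts:
        for ch in text:
            code = unicodedata.category(ch)
            # Fall back to the two-letter code for categories not listed above.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


sample = ["부지급 사유", "건강보험 요양급여내역(20070101-20091231) 1부"]
for name, count in category_counts(sample).most_common():
    print(f"{name}: {count}")
```

Hangul syllables fall under "Other Letter" because Hangul has no upper or lower case, which is why that category dominates the table.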

Most frequent character per category

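Other Letter
Value               Count  Frequency (%)
(not preserved)     227    8.2%
(not preserved)     150    5.4%
(not preserved)     100    3.6%
(not preserved)     96     3.5%
(not preserved)     75     2.7%
(not preserved)     72     2.6%
(not preserved)     65     2.4%
(not preserved)     63     2.3%
(not preserved)     59     2.1%
(not preserved)     56     2.0%
Other values (182)  1800   65.1%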
Decimal Number
Value  Count  Frequency (%)
1      208    67.3%
0      49     15.9%
2      21     6.8%
3      8      2.6%
9      8      2.6%
7      7      2.3%
8      4      1.3%
6      2      0.6%
4      2      0.6%
Uppercase Letter
Value  Count  Frequency (%)
D      2      20.0%
C      2      20.0%
I      2      20.0%
R      2      20.0%
M      2      20.0%
Space Separator
Value    Count  Frequency (%)
(space)  528    100.0%
Close Punctuation
Value  Count  Frequency (%)
)      54     100.0%
Open Punctuation
Value  Count  Frequency (%)
(      54     100.0%
Connector Punctuation
Value  Count  Frequency (%)
_      24     100.0%
Dash Punctuation
Value  Count  Frequency (%)
-      8      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  2763   73.7%
Common  977    26.1%
Latin   10     0.3%

Most frequent character per script

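Hangul
Value               Count  Frequency (%)
(not preserved)     227    8.2%
(not preserved)     150    5.4%
(not preserved)     100    3.6%
(not preserved)     96     3.5%
(not preserved)     75     2.7%
(not preserved)     72     2.6%
(not preserved)     65     2.4%
(not preserved)     63     2.3%
(not preserved)     59     2.1%
(not preserved)     56     2.0%
Other values (182)  1800   65.1%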
Common
Value             Count  Frequency (%)
(space)           528    54.0%
1                 208    21.3%
)                 54     5.5%
(                 54     5.5%
0                 49     5.0%
_                 24     2.5%
2                 21     2.1%
-                 8      0.8%
3                 8      0.8%
9                 8      0.8%
Other values (4)  15     1.5%
Latin
Value  Count  Frequency (%)
D      2      20.0%
C      2      20.0%
I      2      20.0%
R      2      20.0%
M      2      20.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  2763   73.7%
ASCII   987    26.3%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
(space)           528    53.5%
1                 208    21.1%
)                 54     5.5%
(                 54     5.5%
0                 49     5.0%
_                 24     2.4%
2                 21     2.1%
-                 8      0.8%
3                 8      0.8%
9                 8      0.8%
Other values (9)  25     2.5%
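Hangul
Value               Count  Frequency (%)
(not preserved)     227    8.2%
(not preserved)     150    5.4%
(not preserved)     100    3.6%
(not preserved)     96     3.5%
(not preserved)     75     2.7%
(not preserved)     72     2.6%
(not preserved)     65     2.4%
(not preserved)     63     2.3%
(not preserved)     59     2.1%
(not preserved)     56     2.0%
Other values (182)  1800   65.1%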

Interactions

[Figure: interactions plot (image not preserved)]

Correlations

                    기안문첨부서류번호  기안문첨부구분코드
기안문첨부서류번호  1.000               0.489
기안문첨부구분코드  0.489               1.000

                    기안문첨부서류번호  기안문첨부구분코드
기안문첨부서류번호  1.000               0.355
기안문첨부구분코드  0.355               1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index  기안문첨부서류번호  기안문첨부구분코드  기안문첨부파일설명
0      8                   1                   부지급 사유
1      9                   1                   부지급 사유
2      10                  1                   부지급 사유
3      11                  1                   부지급 사유
4      12                  1                   부지급 사유
5      13                  1                   부지급 사유
6      14                  1                   부지급 사유
7      15                  1                   부지급 사유
8      16                  1                   부지급 사유
9      17                  1                   부지급 사유

Last rows

Index  기안문첨부서류번호  기안문첨부구분코드  기안문첨부파일설명
219    2                   1                   요청근거
220    2                   1                   건강보험 요양급여내역(20070101-20091231) 1부
221    3                   2                   공무상요양기간연장 승인내역(20130000) 00부
222    1                   2                   공무상요양비 청구서류 일체 1부
223    2                   1                   부지급 사유
224    3                   1                   부지급 사유
225    4                   1                   부지급 사유
226    5                   1                   부지급 사유
227    6                   1                   부지급 사유
228    7                   1                   부지급 사유

Duplicate rows

Most frequently occurring

Index  기안문첨부서류번호  기안문첨부구분코드  기안문첨부파일설명  # duplicates
5      1                   2                   공무원재해보상연금위원회 결정문 사본 1부  5
4      1                   2                   공무원연금장해진단서 1부  4
2      1                   1                   공무상요양승인결정서 1부  3
6      1                   2                   관계서류 1건  3
8      1                   2                   답변서 및 관계 일건서류 1부  3
12     2                   1                   장해급여결정통보서 1부  3
0      1                   1                   건강검진결과표 및 문진표 사본(2003-2007년) 각 1부  2
1      1                   1                   공무상요양비청구서 1부  2
3      1                   2                   공무상요양비청구서 1부(공무원연금공단 홈페이지_민원상담_서식자료실_재해보상서식)  2
7      1                   2                   국토해양부 지적전산망 조회결과 1부  2
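
A listing like the one above can be approximated by grouping identical rows and keeping the groups that occur more than once. The sketch below uses only the Python standard library and hypothetical toy rows, not the real dataset; `duplicate_groups` is a name introduced for illustration, and the profiling tool's own counting rules may differ.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return (row, count) pairs for rows seen more than once, most frequent first."""
    counts = Counter(rows)
    return [(row, c) for row, c in counts.most_common() if c > 1]


# Hypothetical toy rows: (document number, category code, description).
rows = [
    (1, 2, "관계서류 1건"),
    (1, 2, "관계서류 1건"),
    (1, 2, "관계서류 1건"),
    (1, 1, "공무상요양비청구서 1부"),
    (1, 1, "공무상요양비청구서 1부"),
    (2, 1, "요청근거"),
]

groups = duplicate_groups(rows)
for row, count in groups:
    print(row, count)
```

Rows must match on every column to land in the same group, which is why the same description can appear in two separate lines of the table when its category code differs.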