Overview

Dataset statistics

Number of variables: 3
Number of observations: 10000
Missing cells: 28
Missing cells (%): 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 332.0 KiB
Average record size in memory: 34.0 B
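
The two derived figures in this table can be checked from the raw counts. The sketch below is only a sanity check of the arithmetic, assuming that the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB = 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics table.
n_variables = 3
n_observations = 10_000
missing_cells = 28
total_size_kib = 332.0

# Share of missing cells over every cell in the table (assumed definition).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average bytes per record, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")           # prints "Missing cells: 0.1%"
print(f"Average record: {avg_record_bytes:.1f} B")    # prints "Average record: 34.0 B"
```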

Variable types

Text: 1
Numeric: 2

Dataset

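Description: Standard disease classification codes for comprehensive accident compensation from the Government Employees Pension Service (공무원연금공단), including the disease classification code (질환분류코드), the disease classification sequence number (질환분류순번), the disease classification criterion code (질환분류기준코드), and related fields.
Author: Government Employees Pension Service (공무원연금공단)
URL: https://www.data.go.kr/data/15123837/fileData.do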

Reproduction

Analysis started: 2023-12-12 22:07:35.002233
Analysis finished: 2023-12-12 22:07:35.573228
Duration: 0.57 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
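
The reported duration follows directly from the two timestamps. As a small check, assuming both timestamps are in the same time zone, the difference can be computed with the standard library:

```python
from datetime import datetime

# Timestamps copied from the reproduction details.
started = datetime.fromisoformat("2023-12-12 22:07:35.002233")
finished = datetime.fromisoformat("2023-12-12 22:07:35.573228")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()

print(f"Duration: {duration_s:.2f} seconds")   # prints "Duration: 0.57 seconds"
```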

Variables

질환분류코드
Text

Distinct: 7247
Distinct (%): 72.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 7
Median length: 6
Mean length: 4.3923
Min length: 3

Characters and Unicode

Total characters: 43923
Distinct characters: 38
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
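
As a rough illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The three codes are taken from this report, but the list is far too short to reproduce the totals shown here, and the report's own implementation may differ. The abbreviations map to the category names used in the tables: `Lu` is Uppercase Letter, `Nd` is Decimal Number, `Pd` is Dash Punctuation, and `Zs` is Space Separator.

```python
import unicodedata
from collections import Counter

# A few codes that appear in this report; illustrative only.
codes = ["E274", "S8556", "R10-R19"]

# Count the Unicode general category of every character across all codes.
category_counts = Counter(
    unicodedata.category(ch) for code in codes for ch in code
)

print(category_counts["Lu"], category_counts["Nd"], category_counts["Pd"])  # prints "4 11 1"
```

Script and block lookups are not exposed by `unicodedata`, so they are left out of this sketch.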

Unique

Unique: 5482
Unique (%): 54.8%

Sample

1st row: E274
2nd row: S8556
3rd row: K861
4th row: N814
5th row: O3431
Value                  Count   Frequency (%)
e10-e14                   25     0.2%
e1040                     10     0.1%
e1042                     10     0.1%
e1142                     10     0.1%
m1001                      9     0.1%
e1140                      9     0.1%
r52                        9     0.1%
m7794                      9     0.1%
m7798                      9     0.1%
k529                       9     0.1%
Other values (7236)     9891    98.9%

Most occurring characters

Value                Count   Frequency (%)
0                     4752    10.8%
1                     4596    10.5%
8                     3831     8.7%
2                     3608     8.2%
M                     3245     7.4%
4                     2954     6.7%
3                     2854     6.5%
6                     2853     6.5%
9                     2742     6.2%
5                     2738     6.2%
Other values (28)     9750    22.2%

Most occurring categories

Value                Count   Frequency (%)
Decimal Number       33546    76.4%
Uppercase Letter     10186    23.2%
Dash Punctuation       186     0.4%
Space Separator          5   < 0.1%

Most frequent character per category

Uppercase Letter
Value                Count   Frequency (%)
M                     3245    31.9%
E                      568     5.6%
S                      508     5.0%
K                      436     4.3%
X                      401     3.9%
T                      400     3.9%
F                      318     3.1%
Q                      312     3.1%
Y                      280     2.7%
O                      275     2.7%
Other values (16)     3443    33.8%

Decimal Number
Value                Count   Frequency (%)
0                     4752    14.2%
1                     4596    13.7%
8                     3831    11.4%
2                     3608    10.8%
4                     2954     8.8%
3                     2854     8.5%
6                     2853     8.5%
9                     2742     8.2%
5                     2738     8.2%
7                     2618     7.8%

Dash Punctuation
Value                Count   Frequency (%)
-                      186   100.0%

Space Separator
Value                Count   Frequency (%)
(space)                  5   100.0%

Most occurring scripts

Value                Count   Frequency (%)
Common               33737    76.8%
Latin                10186    23.2%

Most frequent character per script

Latin
Value                Count   Frequency (%)
M                     3245    31.9%
E                      568     5.6%
S                      508     5.0%
K                      436     4.3%
X                      401     3.9%
T                      400     3.9%
F                      318     3.1%
Q                      312     3.1%
Y                      280     2.7%
O                      275     2.7%
Other values (16)     3443    33.8%

Common
Value                Count   Frequency (%)
0                     4752    14.1%
1                     4596    13.6%
8                     3831    11.4%
2                     3608    10.7%
4                     2954     8.8%
3                     2854     8.5%
6                     2853     8.5%
9                     2742     8.1%
5                     2738     8.1%
7                     2618     7.8%
Other values (2)       191     0.6%

Most occurring blocks

Value                Count   Frequency (%)
ASCII                43923   100.0%

Most frequent character per block

ASCII
Value                Count   Frequency (%)
0                     4752    10.8%
1                     4596    10.5%
8                     3831     8.7%
2                     3608     8.2%
M                     3245     7.4%
4                     2954     6.7%
3                     2854     6.5%
6                     2853     6.5%
9                     2742     6.2%
5                     2738     6.2%
Other values (28)     9750    22.2%

질환분류순번
Real number (ℝ)

Distinct: 60
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.5805
Minimum: 1
Maximum: 110
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 2
Q3: 4
95-th percentile: 11
Maximum: 110
Range: 109
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 5.1834551
Coefficient of variation (CV): 1.4476903
Kurtosis: 97.45821
Mean: 3.5805
Median Absolute Deviation (MAD): 1
Skewness: 7.4526681
Sum: 35805
Variance: 26.868207
Monotonicity: Not monotonic
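
Several of the figures for 질환분류순번 are simple functions of the others. The sketch below recomputes them from the reported values using the standard definitions (CV as standard deviation over mean, variance as the squared standard deviation, IQR as Q3 minus Q1). It is a consistency check under those assumed definitions, not the profiler's own code, and small differences in the last digits are expected because the inputs are already rounded.

```python
import math

# Values copied from the quantile and descriptive statistics of 질환분류순번.
mean = 3.5805
std = 5.1834551
q1, q3 = 1, 4
minimum, maximum = 1, 110

cv = std / mean                   # coefficient of variation
variance = std ** 2               # variance is the squared standard deviation
iqr = q3 - q1                     # interquartile range
value_range = maximum - minimum   # range

# Compare against the reported figures, allowing for rounding of the inputs.
print(math.isclose(cv, 1.4476903, rel_tol=1e-6))        # prints "True"
print(math.isclose(variance, 26.868207, rel_tol=1e-6))  # prints "True"
print(iqr, value_range)                                 # prints "3 109"
```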
Histogram with fixed size bins (bins=50)
Common values
Value                Count   Frequency (%)
1                     3983    39.8%
2                     1739    17.4%
3                     1156    11.6%
4                      882     8.8%
5                      595     5.9%
6                      463     4.6%
7                      247     2.5%
8                      161     1.6%
9                      122     1.2%
10                     114     1.1%
Other values (50)      538     5.4%

Minimum 10 values
Value                Count   Frequency (%)
1                     3983    39.8%
2                     1739    17.4%
3                     1156    11.6%
4                      882     8.8%
5                      595     5.9%
6                      463     4.6%
7                      247     2.5%
8                      161     1.6%
9                      122     1.2%
10                     114     1.1%

Maximum 10 values
Value                Count   Frequency (%)
110                      1   < 0.1%
109                      1   < 0.1%
100                      1   < 0.1%
89                       1   < 0.1%
87                       1   < 0.1%
83                       1   < 0.1%
80                       1   < 0.1%
74                       1   < 0.1%
73                       1   < 0.1%
70                       1   < 0.1%

질환분류기준코드
Real number (ℝ)

Distinct: 6
Distinct (%): 0.1%
Missing: 28
Missing (%): 0.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.3036502
Minimum: 1
Maximum: 6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 4
Median: 4
Q3: 5
95-th percentile: 5
Maximum: 6
Range: 5
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.78232321
Coefficient of variation (CV): 0.18178132
Kurtosis: 0.22617858
Mean: 4.3036502
Median Absolute Deviation (MAD): 1
Skewness: -0.48443563
Sum: 42916
Variance: 0.6120296
Monotonicity: Not monotonic
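
Because 질환분류기준코드 has 28 missing values, its mean does not equal the sum divided by the total number of observations. The reported figures are consistent with the mean being taken over the non-missing values only, which the sketch below checks; that reading is an inference from the numbers rather than something the report states.

```python
# Figures copied from the 질환분류기준코드 statistics.
n_observations = 10_000
n_missing = 28
total = 42_916                 # the "Sum" row

# Number of values that actually contribute to the sum and the mean.
n_valid = n_observations - n_missing

mean = total / n_valid
missing_pct = 100 * n_missing / n_observations

print(n_valid)                 # prints "9972"
print(f"{mean:.7f}")           # prints "4.3036502"
print(f"{missing_pct:.1f}%")   # prints "0.3%"
```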
Histogram with fixed size bins (bins=6)
Common values
Value                Count   Frequency (%)
4                     4289    42.9%
5                     4036    40.4%
3                     1214    12.1%
6                      270     2.7%
2                      155     1.6%
1                        8     0.1%
(Missing)               28     0.3%

Minimum values
Value                Count   Frequency (%)
1                        8     0.1%
2                      155     1.6%
3                     1214    12.1%
4                     4289    42.9%
5                     4036    40.4%
6                      270     2.7%

Maximum values
Value                Count   Frequency (%)
6                      270     2.7%
5                     4036    40.4%
4                     4289    42.9%
3                     1214    12.1%
2                      155     1.6%
1                        8     0.1%

Interactions

[Figures: pairwise interaction plots for 질환분류순번 and 질환분류기준코드]

Correlations

                      질환분류순번    질환분류기준코드
질환분류순번             1.000           0.266
질환분류기준코드         0.266           1.000

                      질환분류순번    질환분류기준코드
질환분류순번             1.000           0.134
질환분류기준코드         0.134           1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
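
To make the two views concrete, the sketch below builds both from a tiny in-memory table: a per-column count of non-missing values (the bar view) and a row-by-column grid of present/missing flags (the matrix view). The column names follow this report, but the rows, and in particular the `None` in the last row, are invented for illustration and are not taken from the dataset.

```python
# Hypothetical rows; None marks a missing cell (invented for this example).
rows = [
    {"질환분류코드": "E274", "질환분류순번": 4, "질환분류기준코드": 4},
    {"질환분류코드": "S8556", "질환분류순번": 1, "질환분류기준코드": 5},
    {"질환분류코드": "K861", "질환분류순번": 2, "질환분류기준코드": None},
]
columns = list(rows[0])

# Bar view: how many non-missing values each column holds.
non_null = {col: sum(row[col] is not None for row in rows) for col in columns}

# Matrix view: one True/False flag per cell, True where a value is present.
matrix = [[row[col] is not None for col in columns] for row in rows]

for col in columns:
    print(col, non_null[col])
for flags in matrix:
    print("".join("#" if present else "." for present in flags))
```

In the printed grid, `#` marks a present value and `.` a missing one, so gaps that line up across rows or columns stand out.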

Sample

First rows
         질환분류코드    질환분류순번    질환분류기준코드
31730    E274            4               4
2624     S8556           1               5
27642    K861            2               4
45712    N814            2               4
3117     O3431           1               5
36600    H185            7               4
43668    M0748           1               5
7968     O3462           3               5
12410    Y365            2               4
41412    M198            1               4

Last rows
         질환분류코드    질환분류순번    질환분류기준코드
22687    E1003           4               5
42510    M0685           3               5
54242    M8657           1               5
12587    Y1734           2               5
8531     T677            1               4
33944    M0327           10              5
5895     R10-R19         3               2
17630    B085            2               4
6350     P94             1               3
6545     O912            3               4