Overview

Dataset statistics

Number of variables: 5
Number of observations: 338
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 14.0 KiB
Average record size in memory: 42.4 B
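The average record size follows from the two figures above it. The sketch below assumes that it is simply the total in-memory size divided by the number of observations; the report itself does not state the formula, and the 14.0 KiB input is already rounded, so the result is approximate.

```python
# Assumption: average record size = total in-memory size / number of observations.
KIB = 1024  # bytes per KiB

total_size_bytes = 14.0 * KIB   # "Total size in memory" as reported (rounded)
n_observations = 338            # "Number of observations" as reported

average_record_bytes = total_size_bytes / n_observations
print(f"{average_record_bytes:.1f} B")
```

Rounded to one decimal place, this agrees with the reported 42.4 B.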

Variable types

Categorical: 1
Numeric: 2
Text: 2

Dataset

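Description: Top 20 principal diagnoses of discharged patients by department at Gwangju Veterans Hospital (광주보훈병원), Korea Veterans Health Service (한국보훈복지의료공단). The dataset provides 부서명 (department name), 순위 (rank), 질병코드 (disease code), 상병명 (diagnosis name), and 실인원 (actual number of patients).
URL: https://www.data.go.kr/data/15067067/fileData.do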

Reproduction

Analysis started: 2023-12-12 11:44:15.895281
Analysis finished: 2023-12-12 11:44:17.506871
Duration: 1.61 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
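As a quick consistency check, the reported duration can be recomputed from the start and finish timestamps. This is only a sketch using the Python standard library; the variable names are illustrative and not part of the report.

```python
from datetime import datetime

# Timestamps copied from the "Analysis started" / "Analysis finished" lines.
started = datetime.fromisoformat("2023-12-12 11:44:15.895281")
finished = datetime.fromisoformat("2023-12-12 11:44:17.506871")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```

The difference rounds to the 1.61 seconds shown above.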

Variables

부서명
Categorical

Distinct: 18
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 KiB
비뇨의학과: 33
가정의학과: 26
호흡기내과: 24
감염내과: 23
소화기내과: 22
Other values (13): 210

Length

Max length: 6
Median length: 5
Mean length: 4.5443787
Min length: 2

Unique

Unique: 1
Unique (%): 0.3%

Sample

1st row: 가정의학과
2nd row: 가정의학과
3rd row: 가정의학과
4th row: 가정의학과
5th row: 가정의학과

Common Values

Value | Count | Frequency (%)
비뇨의학과 | 33 | 9.8%
가정의학과 | 26 | 7.7%
호흡기내과 | 24 | 7.1%
감염내과 | 23 | 6.8%
소화기내과 | 22 | 6.5%
정형외과 | 22 | 6.5%
재활의학과 | 21 | 6.2%
신장내과 | 21 | 6.2%
외과 | 21 | 6.2%
순환기내과 | 21 | 6.2%
Other values (8) | 104 | 30.8%

Length

[Figure: histogram of lengths of the category]

순위
Real number (ℝ)

Distinct: 19
Distinct (%): 5.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.6183432
Minimum: 1
Maximum: 20
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.1 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 5
Median: 9
Q3: 14
95th percentile: 20
Maximum: 20
Range: 19
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 5.6046204
Coefficient of variation (CV): 0.58270122
Kurtosis: -1.0469148
Mean: 9.6183432
Median Absolute Deviation (MAD): 5
Skewness: 0.22083062
Sum: 3251
Variance: 31.411769
Monotonicity: Not monotonic
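Several of the figures for 순위 are related by simple arithmetic. The sketch below recomputes the mean, coefficient of variation, variance, IQR, and range from the other reported values, using the standard textbook definitions. That the profiler used exactly these definitions is an assumption, although the numbers agree to the precision shown.

```python
import math

# Values copied from the report for the variable 순위.
n_observations = 338
total = 3251                 # Sum
std_dev = 5.6046204          # Standard deviation
minimum, maximum = 1, 20
q1, q3 = 5, 14

mean = total / n_observations   # arithmetic mean
cv = std_dev / mean             # coefficient of variation
variance = std_dev ** 2         # variance is the squared standard deviation
iqr = q3 - q1                   # interquartile range
value_range = maximum - minimum

print(f"mean={mean:.7f} cv={cv:.8f} variance={variance:.6f} "
      f"iqr={iqr} range={value_range}")
```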
[Figure: histogram with fixed size bins (bins=19)]

Common values

Value | Count | Frequency (%)
14 | 43 | 12.7%
8 | 34 | 10.1%
5 | 32 | 9.5%
3 | 23 | 6.8%
12 | 19 | 5.6%
2 | 19 | 5.6%
20 | 19 | 5.6%
1 | 18 | 5.3%
10 | 17 | 5.0%
11 | 15 | 4.4%
Other values (9) | 99 | 29.3%

Minimum 10 values

Value | Count | Frequency (%)
1 | 18 | 5.3%
2 | 19 | 5.6%
3 | 23 | 6.8%
4 | 14 | 4.1%
5 | 32 | 9.5%
6 | 13 | 3.8%
7 | 10 | 3.0%
8 | 34 | 10.1%
9 | 11 | 3.3%
10 | 17 | 5.0%

Maximum 10 values

Value | Count | Frequency (%)
20 | 19 | 5.6%
19 | 8 | 2.4%
18 | 12 | 3.6%
17 | 9 | 2.7%
16 | 13 | 3.8%
14 | 43 | 12.7%
13 | 9 | 2.7%
12 | 19 | 5.6%
11 | 15 | 4.4%
10 | 17 | 5.0%

질병코드
Text

Distinct: 227
Distinct (%): 67.2%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 1014
Distinct characters: 29
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 165
Unique (%): 48.8%

Sample

1st row: C34
2nd row: C22
3rd row: C25
4th row: C16
5th row: C18

Value | Count | Frequency (%)
u07 | 7 | 2.1%
j18 | 6 | 1.8%
c22 | 5 | 1.5%
c25 | 5 | 1.5%
c16 | 5 | 1.5%
c61 | 4 | 1.2%
s06 | 4 | 1.2%
i10 | 4 | 1.2%
a41 | 4 | 1.2%
m48 | 4 | 1.2%
Other values (217) | 290 | 85.8%

Most occurring characters

Value | Count | Frequency (%)
1 | 106 | 10.5%
0 | 93 | 9.2%
2 | 84 | 8.3%
5 | 72 | 7.1%
4 | 69 | 6.8%
3 | 67 | 6.6%
C | 61 | 6.0%
6 | 60 | 5.9%
7 | 44 | 4.3%
8 | 42 | 4.1%
Other values (19) | 316 | 31.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 676 | 66.7%
Uppercase Letter | 338 | 33.3%
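The category counts above fit codes made of one uppercase letter followed by two digits: 338 codes of length 3 give 1014 characters, of which 338 are letters and 676 are digits. The sketch below classifies the five sample codes with Python's standard `unicodedata` module, where `Lu` is Uppercase Letter and `Nd` is Decimal Number. It illustrates the idea only and is not necessarily how the profiler computes these counts internally.

```python
import unicodedata
from collections import Counter

# The five sample values shown for 질병코드.
sample_codes = ["C34", "C22", "C25", "C16", "C18"]

# Count the Unicode general category of every character in the sample.
category_counts = Counter(
    unicodedata.category(character)
    for code in sample_codes
    for character in code
)
print(category_counts)
```

On this sample there are twice as many `Nd` characters as `Lu` characters, the same 2:1 ratio as in the full column.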

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 61
18.0%
I 34
10.1%
N 33
9.8%
M 28
8.3%
J 26
7.7%
K 26
7.7%
R 24
 
7.1%
S 21
 
6.2%
D 15
 
4.4%
A 13
 
3.8%
Other values (9) 57
16.9%
Decimal Number
ValueCountFrequency (%)
1 106
15.7%
0 93
13.8%
2 84
12.4%
5 72
10.7%
4 69
10.2%
3 67
9.9%
6 60
8.9%
7 44
6.5%
8 42
 
6.2%
9 39
 
5.8%

Most occurring scripts

Value | Count | Frequency (%)
Common | 676 | 66.7%
Latin | 338 | 33.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 61
18.0%
I 34
10.1%
N 33
9.8%
M 28
8.3%
J 26
7.7%
K 26
7.7%
R 24
 
7.1%
S 21
 
6.2%
D 15
 
4.4%
A 13
 
3.8%
Other values (9) 57
16.9%
Common
ValueCountFrequency (%)
1 106
15.7%
0 93
13.8%
2 84
12.4%
5 72
10.7%
4 69
10.2%
3 67
9.9%
6 60
8.9%
7 44
6.5%
8 42
 
6.2%
9 39
 
5.8%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1014 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 106
 
10.5%
0 93
 
9.2%
2 84
 
8.3%
5 72
 
7.1%
4 69
 
6.8%
3 67
 
6.6%
C 61
 
6.0%
6 60
 
5.9%
7 44
 
4.3%
8 42
 
4.1%
Other values (19) 316
31.2%

상병명
Text

Distinct: 227
Distinct (%): 67.2%
Missing: 0
Missing (%): 0.0%
Memory size: 2.8 KiB

Length

Max length: 37
Median length: 28.5
Mean length: 11.174556
Min length: 2

Characters and Unicode

Total characters: 3777
Distinct characters: 293
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 165
Unique (%): 48.8%

Sample

1st row: 기관지 및 폐의 악성 신생물
2nd row: 간 및 간내 담관의 악성 신생물
3rd row: 췌장의 악성 신생물
4th row: 위의 악성 신생물
5th row: 결장의 악성 신생물
ValueCountFrequency (%)
89
 
8.6%
기타 75
 
7.2%
신생물 67
 
6.4%
악성 58
 
5.6%
않은 18
 
1.7%
상세불명 17
 
1.6%
달리 16
 
1.5%
장애 15
 
1.4%
분류되지 14
 
1.3%
골절 13
 
1.2%
Other values (366) 658
63.3%

Most occurring characters

ValueCountFrequency (%)
703
 
18.6%
171
 
4.5%
127
 
3.4%
103
 
2.7%
100
 
2.6%
89
 
2.4%
87
 
2.3%
76
 
2.0%
70
 
1.9%
70
 
1.9%
Other values (283) 2181
57.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3002 | 79.5%
Space Separator | 703 | 18.6%
Decimal Number | 27 | 0.7%
Dash Punctuation | 12 | 0.3%
Other Punctuation | 12 | 0.3%
Uppercase Letter | 8 | 0.2%
Open Punctuation | 6 | 0.2%
Close Punctuation | 6 | 0.2%
Math Symbol | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
171
 
5.7%
127
 
4.2%
103
 
3.4%
100
 
3.3%
89
 
3.0%
87
 
2.9%
76
 
2.5%
70
 
2.3%
70
 
2.3%
59
 
2.0%
Other values (266) 2050
68.3%
Decimal Number
ValueCountFrequency (%)
0 8
29.6%
7 7
25.9%
1 5
18.5%
2 3
 
11.1%
9 3
 
11.1%
3 1
 
3.7%
Other Punctuation
ValueCountFrequency (%)
, 11
91.7%
. 1
 
8.3%
Uppercase Letter
ValueCountFrequency (%)
U 7
87.5%
G 1
 
12.5%
Open Punctuation
ValueCountFrequency (%)
( 5
83.3%
[ 1
 
16.7%
Close Punctuation
ValueCountFrequency (%)
) 5
83.3%
] 1
 
16.7%
Space Separator
ValueCountFrequency (%)
703
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 12
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3002 | 79.5%
Common | 767 | 20.3%
Latin | 8 | 0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
171
 
5.7%
127
 
4.2%
103
 
3.4%
100
 
3.3%
89
 
3.0%
87
 
2.9%
76
 
2.5%
70
 
2.3%
70
 
2.3%
59
 
2.0%
Other values (266) 2050
68.3%
Common
ValueCountFrequency (%)
703
91.7%
- 12
 
1.6%
, 11
 
1.4%
0 8
 
1.0%
7 7
 
0.9%
( 5
 
0.7%
) 5
 
0.7%
1 5
 
0.7%
2 3
 
0.4%
9 3
 
0.4%
Other values (5) 5
 
0.7%
Latin
ValueCountFrequency (%)
U 7
87.5%
G 1
 
12.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3002 | 79.5%
ASCII | 775 | 20.5%

Most frequent character per block

ASCII
ValueCountFrequency (%)
703
90.7%
- 12
 
1.5%
, 11
 
1.4%
0 8
 
1.0%
U 7
 
0.9%
7 7
 
0.9%
( 5
 
0.6%
) 5
 
0.6%
1 5
 
0.6%
2 3
 
0.4%
Other values (7) 9
 
1.2%
Hangul
ValueCountFrequency (%)
171
 
5.7%
127
 
4.2%
103
 
3.4%
100
 
3.3%
89
 
3.0%
87
 
2.9%
76
 
2.5%
70
 
2.3%
70
 
2.3%
59
 
2.0%
Other values (266) 2050
68.3%

실인원
Real number (ℝ)

Distinct: 58
Distinct (%): 17.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.671598
Minimum: 1
Maximum: 595
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.1 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 2
Median: 5
Q3: 13
95th percentile: 52.15
Maximum: 595
Range: 594
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 43.115006
Coefficient of variation (CV): 2.7511557
Kurtosis: 105.12862
Mean: 15.671598
Median Absolute Deviation (MAD): 4
Skewness: 9.0096117
Sum: 5297
Variance: 1858.9037
Monotonicity: Not monotonic
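The mean of 실인원 (15.67) is far above the median (5), and skewness and kurtosis are both large, which points to a long right tail. One common way to quantify this, not used by the report itself, is Tukey's 1.5 × IQR rule. The sketch below applies it to the reported quartiles and to the ten largest values listed for this variable; the choice of the 1.5 multiplier is the conventional one and is an assumption here.

```python
# Quartiles copied from the report for 실인원.
q1, q3 = 2, 13
iqr = q3 - q1

# Tukey's rule (an added illustration, not part of the report):
# values above Q3 + 1.5 * IQR are treated as high outliers.
upper_fence = q3 + 1.5 * iqr

# The ten largest values reported for 실인원.
largest_values = [595, 301, 251, 164, 160, 126, 120, 102, 98, 96]
flagged = [value for value in largest_values if value > upper_fence]

print(f"upper fence = {upper_fence}, flagged {len(flagged)} of {len(largest_values)}")
```

Under this rule all ten of the largest values exceed the fence, so summary statistics such as the mean and standard deviation should be read with the heavy tail in mind.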
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
1 | 83 | 24.6%
3 | 30 | 8.9%
2 | 26 | 7.7%
5 | 19 | 5.6%
9 | 19 | 5.6%
4 | 18 | 5.3%
7 | 18 | 5.3%
10 | 11 | 3.3%
8 | 8 | 2.4%
6 | 8 | 2.4%
Other values (48) | 98 | 29.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 83 | 24.6%
2 | 26 | 7.7%
3 | 30 | 8.9%
4 | 18 | 5.3%
5 | 19 | 5.6%
6 | 8 | 2.4%
7 | 18 | 5.3%
8 | 8 | 2.4%
9 | 19 | 5.6%
10 | 11 | 3.3%

Maximum 10 values

Value | Count | Frequency (%)
595 | 1 | 0.3%
301 | 1 | 0.3%
251 | 1 | 0.3%
164 | 1 | 0.3%
160 | 1 | 0.3%
126 | 1 | 0.3%
120 | 1 | 0.3%
102 | 1 | 0.3%
98 | 1 | 0.3%
96 | 1 | 0.3%

Interactions

[Figures: four interaction plots]

Correlations

[Figure: correlation heatmap]

 | 부서명 | 순위 | 실인원
부서명 | 1.000 | 0.685 | 0.000
순위 | 0.685 | 1.000 | 0.317
실인원 | 0.000 | 0.317 | 1.000

[Figure: correlation heatmap]

 | 순위 | 실인원 | 부서명
순위 | 1.000 | -0.476 | 0.339
실인원 | -0.476 | 1.000 | 0.000
부서명 | 0.339 | 0.000 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

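Row | 부서명 | 순위 | 질병코드 | 상병명 | 실인원
0 | 가정의학과 | 1 | C34 | 기관지 및 폐의 악성 신생물 | 44
1 | 가정의학과 | 2 | C22 | 간 및 간내 담관의 악성 신생물 | 30
2 | 가정의학과 | 3 | C25 | 췌장의 악성 신생물 | 20
3 | 가정의학과 | 4 | C16 | 위의 악성 신생물 | 19
4 | 가정의학과 | 5 | C18 | 결장의 악성 신생물 | 10
5 | 가정의학과 | 6 | C61 | 전립선의 악성 신생물 | 9
6 | 가정의학과 | 6 | C20 | 직장의 악성 신생물 | 9
7 | 가정의학과 | 8 | C19 | 직장구불결장접합부의 악성 신생물 | 7
8 | 가정의학과 | 8 | C67 | 방광의 악성 신생물 | 7
9 | 가정의학과 | 10 | C24 | 담도의 기타 및 상세불명 부분의 악성 신생물 | 6

Row | 부서명 | 순위 | 질병코드 | 상병명 | 실인원
328 | 호흡기내과 | 14 | N18 | 만성 신장병 | 1
329 | 호흡기내과 | 14 | J47 | 기관지확장증 | 1
330 | 호흡기내과 | 14 | J21 | 급성 세기관지염 | 1
331 | 호흡기내과 | 14 | J09 | 확인된 동물매개 또는 범유행 인플루엔자바이러스에 의한 인플루엔자 | 1
332 | 호흡기내과 | 14 | J20 | 급성 기관지염 | 1
333 | 호흡기내과 | 14 | D64 | 기타 빈혈 | 1
334 | 호흡기내과 | 14 | J96 | 달리 분류되지 않은 호흡부전 | 1
335 | 호흡기내과 | 14 | J90 | 달리 분류되지 않은 흉막삼출액 | 1
336 | 호흡기내과 | 14 | J86 | 농흉 | 1
337 | 호흡기내과 | 14 | U09 | 코로나-19 이후 병태 | 1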
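In the first sample rows, tied 실인원 values share a rank and the next rank is skipped (6, 6, 8, 8, 10). This pattern matches standard competition ranking applied within a department. The sketch below reproduces the 순위 column for the ten 가정의학과 rows from their 실인원 values. The function name `competition_rank` is hypothetical, and whether the data publisher actually computed ranks this way is an assumption.

```python
def competition_rank(counts):
    """Rank counts in descending order; tied values share the lowest rank
    and the following ranks are skipped ("1224" ranking)."""
    ordered = sorted(counts, reverse=True)
    first_position = {}
    for position, value in enumerate(ordered, start=1):
        # Keep only the first (best) position at which each value appears.
        first_position.setdefault(value, position)
    return [first_position[value] for value in counts]


# 실인원 for the first ten sample rows (부서명 = 가정의학과).
sample_counts = [44, 30, 20, 19, 10, 9, 9, 7, 7, 6]
sample_ranks = competition_rank(sample_counts)
print(sample_ranks)
```

The same scheme would also account for the last ten sample rows, where every diagnosis with a single patient shares rank 14.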