Overview

Dataset statistics

Number of variables: 5
Number of observations: 516
Missing cells: 4
Missing cells (%): 0.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 21.3 KiB
Average record size in memory: 42.3 B
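
The derived figures in this block follow from the raw counts. The short sketch below recomputes them as a sanity check; it assumes that 1 KiB is 1024 bytes and that the missing-cell percentage is taken over all cells (variables times observations).

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 5
n_observations = 516
missing_cells = 4
total_size_kib = 21.3

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

# Share of cells that are empty, as a percentage.
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

The rounded results agree with the 0.2% and 42.3 B shown in the report.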

Variable types

Categorical: 1
Numeric: 2
Text: 2

Dataset

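Description: Top-20 principal diagnoses for each outpatient department, published as open data by Gwangju Veterans Hospital (광주보훈병원) of the Korea Veterans Health Service (한국보훈복지의료공단). Each record gives the department, rank, diagnosis code, diagnosis name, and case count, in that order.
URL: https://www.data.go.kr/data/15067484/fileData.do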

Alerts

순위 (rank) is highly overall correlated with 건수 (case count) [High correlation]
건수 (case count) is highly overall correlated with 순위 (rank) [High correlation]

Reproduction

Analysis started: 2023-12-12 23:58:57.705105
Analysis finished: 2023-12-12 23:58:58.413153
Duration: 0.71 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
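
A report of this kind can be regenerated from the source CSV. The sketch below is a minimal outline and not the exact configuration behind this report: the function name `build_profile`, the file paths, the report title, and the default encoding are all assumptions (Korean public-data CSV files may need `cp949` instead of `utf-8`).

```python
def build_profile(csv_path, html_path="report.html", encoding="utf-8"):
    """Profile a CSV file with ydata-profiling and write an HTML report."""
    # Imported lazily so the function can be defined even when the
    # third-party packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the encoding must match the downloaded file.
    df = pd.read_csv(csv_path, encoding=encoding)

    # Build the report with default settings and save it as HTML.
    profile = ProfileReport(df, title="Profiling Report")
    profile.to_file(html_path)
    return profile
```

Calling `build_profile("data.csv")` with a hypothetical local copy of the dataset would write `report.html` next to the script.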

Variables

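진료과 (department)
Categorical

Distinct: 29
Distinct (%): 5.6%
Missing: 0
Missing (%): 0.0%
Memory size: 4.2 KiB

Value  Count
가정의학과 (Family Medicine)  20
외과 (Surgery)  20
내분비내과 (Endocrinology)  20
비뇨의학과 (Urology)  20
산부인과 (Obstetrics and Gynecology)  20
Other values (24)  416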

Length

Max length: 7
Median length: 6
Mean length: 4.253876
Min length: 2

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: 가정의학과
2nd row: 가정의학과
3rd row: 가정의학과
4th row: 가정의학과
5th row: 가정의학과

Common Values

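Value  Count  Frequency (%)
가정의학과 (Family Medicine)  20  3.9%
외과 (Surgery)  20  3.9%
내분비내과 (Endocrinology)  20  3.9%
비뇨의학과 (Urology)  20  3.9%
산부인과 (Obstetrics and Gynecology)  20  3.9%
소아청소년과 (Pediatrics)  20  3.9%
소화기내과 (Gastroenterology)  20  3.9%
순환기내과 (Cardiology)  20  3.9%
신경과 (Neurology)  20  3.9%
신경외과 (Neurosurgery)  20  3.9%
Other values (19)  316  61.2%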

Length

[Figure: Histogram of lengths of the category]
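Value  Count  Frequency (%)
가정의학과 (Family Medicine)  20  3.9%
호흡기내과 (Pulmonology)  20  3.9%
혈액종양내과 (Hematology-Oncology)  20  3.9%
한의과 (Korean Medicine)  20  3.9%
피부과 (Dermatology)  20  3.9%
통증클리닉 (Pain Clinic)  20  3.9%
정형외과 (Orthopedics)  20  3.9%
정신건강의학과 (Psychiatry)  20  3.9%
재활의학과 (Rehabilitation Medicine)  20  3.9%
일반내과 (General Internal Medicine)  20  3.9%
Other values (19)  316  61.2%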

순위 (rank)
Real number (ℝ)

HIGH CORRELATION

Distinct: 20
Distinct (%): 3.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.22093
Minimum: 1
Maximum: 20
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 5
median: 10
Q3: 15
95-th percentile: 19
Maximum: 20
Range: 19
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 5.8090679
Coefficient of variation (CV): 0.56835022
Kurtosis: -1.2225797
Mean: 10.22093
Median Absolute Deviation (MAD): 5
Skewness: 0.053716548
Sum: 5274
Variance: 33.74527
Monotonicity: Not monotonic
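
Several of these statistics are derived from the others. The sketch below recomputes them from the reported sum, standard deviation, and quartiles, using the usual definitions (mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean, IQR = Q3 - Q1); the variable names are illustrative only.

```python
# Figures copied from the 순위 (rank) statistics.
n = 516
total = 5274
minimum, q1, q3, maximum = 1, 5, 15, 20
std = 5.8090679

mean = total / n                    # arithmetic mean
variance = std ** 2                 # variance is the squared standard deviation
cv = std / mean                     # coefficient of variation
iqr = q3 - q1                       # interquartile range
value_range = maximum - minimum     # spread between the extremes

print(f"mean={mean:.5f} variance={variance:.5f} cv={cv:.8f}")
print(f"iqr={iqr} range={value_range}")
```

Each recomputed value matches the report to the precision shown.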
[Figure: Histogram with fixed size bins (bins=20)]
Common values

Value  Count  Frequency (%)
1  29  5.6%
3  28  5.4%
2  28  5.4%
4  27  5.2%
5  27  5.2%
6  27  5.2%
7  27  5.2%
14  25  4.8%
18  25  4.8%
17  25  4.8%
Other values (10)  248  48.1%

Minimum 10 values

Value  Count  Frequency (%)
1  29  5.6%
2  28  5.4%
3  28  5.4%
4  27  5.2%
5  27  5.2%
6  27  5.2%
7  27  5.2%
8  25  4.8%
9  25  4.8%
10  25  4.8%

Maximum 10 values

Value  Count  Frequency (%)
20  24  4.7%
19  24  4.7%
18  25  4.8%
17  25  4.8%
16  25  4.8%
15  25  4.8%
14  25  4.8%
13  25  4.8%
12  25  4.8%
11  25  4.8%

상명코드 (diagnosis code)
Text

Distinct: 323
Distinct (%): 62.6%
Missing: 0
Missing (%): 0.0%
Memory size: 4.2 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 1548
Distinct characters: 31
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
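
As an illustration of those character properties, the sketch below tallies the Unicode general category of each character in the five sample codes listed for this variable, using Python's standard `unicodedata` module. It approximates the kind of analysis behind the category tables and is not the profiler's own code.

```python
import unicodedata
from collections import Counter

# The five sample codes shown for this variable.
codes = ["R63", "J30", "J00", "Z26", "B35"]

# Tally the Unicode general category of every character.
categories = Counter(
    unicodedata.category(ch) for code in codes for ch in code
)

# "Lu" is Uppercase Letter and "Nd" is Decimal Number.
for category, count in sorted(categories.items()):
    print(category, count)
```

Each sample code is one uppercase letter followed by two digits, which is consistent with the report's totals of 516 uppercase letters and 1032 decimal digits across 516 three-character codes.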

Unique

Unique: 219
Unique (%): 42.4%

Sample

1st row: R63
2nd row: J30
3rd row: J00
4th row: Z26
5th row: B35

Value  Count  Frequency (%)
z00  10  1.9%
i10  7  1.4%
e78  7  1.4%
k29  6  1.2%
z11  6  1.2%
u07  6  1.2%
m48  5  1.0%
m54  5  1.0%
z26  5  1.0%
j30  5  1.0%
Other values (313)  454  88.0%

Most occurring characters

Value  Count  Frequency (%)
0  188  12.1%
1  146  9.4%
2  116  7.5%
3  109  7.0%
4  105  6.8%
5  97  6.3%
6  78  5.0%
7  72  4.7%
8  61  3.9%
9  60  3.9%
Other values (21)  516  33.3%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  1032  66.7%
Uppercase Letter  516  33.3%

Most frequent character per category

Uppercase Letter

Value  Count  Frequency (%)
K  54  10.5%
M  52  10.1%
J  44  8.5%
R  43  8.3%
I  33  6.4%
E  33  6.4%
N  29  5.6%
H  26  5.0%
G  26  5.0%
Z  26  5.0%
Other values (11)  150  29.1%

Decimal Number

Value  Count  Frequency (%)
0  188  18.2%
1  146  14.1%
2  116  11.2%
3  109  10.6%
4  105  10.2%
5  97  9.4%
6  78  7.6%
7  72  7.0%
8  61  5.9%
9  60  5.8%

Most occurring scripts

Value  Count  Frequency (%)
Common  1032  66.7%
Latin  516  33.3%

Most frequent character per script

Latin

Value  Count  Frequency (%)
K  54  10.5%
M  52  10.1%
J  44  8.5%
R  43  8.3%
I  33  6.4%
E  33  6.4%
N  29  5.6%
H  26  5.0%
G  26  5.0%
Z  26  5.0%
Other values (11)  150  29.1%

Common

Value  Count  Frequency (%)
0  188  18.2%
1  146  14.1%
2  116  11.2%
3  109  10.6%
4  105  10.2%
5  97  9.4%
6  78  7.6%
7  72  7.0%
8  61  5.9%
9  60  5.8%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  1548  100.0%

Most frequent character per block

ASCII

Value  Count  Frequency (%)
0  188  12.1%
1  146  9.4%
2  116  7.5%
3  109  7.0%
4  105  6.8%
5  97  6.3%
6  78  5.0%
7  72  4.7%
8  61  3.9%
9  60  3.9%
Other values (21)  516  33.3%

상병명 (diagnosis name)
Text

Distinct: 319
Distinct (%): 62.3%
Missing: 4
Missing (%): 0.8%
Memory size: 4.2 KiB

Length

Max length: 37
Median length: 28
Mean length: 12.183594
Min length: 2

Characters and Unicode

Total characters: 6238
Distinct characters: 353
Distinct categories: 9
Distinct scripts: 4
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 215
Unique (%): 42.0%

Sample

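1st row: 음식 및 수액섭취에 관계된 증상 및 징후 (Symptoms and signs concerning food and fluid intake)
2nd row: 혈관운동성 및 앨러지성 비염 (Vasomotor and allergic rhinitis)
3rd row: 급성 코인두염 [감기] (Acute nasopharyngitis [common cold])
4th row: 기타 단일 감염성 질환에 대한 예방접종의 필요 (Need for immunization against other single infectious diseases)
5th row: 피부사상균증 (Dermatophytosis)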
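
Value  Count  Frequency (%)
[not recovered]  180  10.8%
기타 (other)  130  7.8%
장애 (disorder)  49  2.9%
신생물 (neoplasm)  30  1.8%
않은 (not)  27  1.6%
달리 (otherwise)  26  1.6%
악성 (malignant)  23  1.4%
분류되지 (classified)  21  1.3%
질환 (disease)  20  1.2%
상세불명의 (unspecified)  17  1.0%
Other values (545)  1146  68.7%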
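
The table above counts individual words in the diagnosis names. The sketch below shows one simple way to produce such counts, applied to the five sample names listed for this variable. Lower-casing and splitting on whitespace is an assumption; the profiler's actual tokenisation, in particular its handling of punctuation, may differ.

```python
from collections import Counter

# The five sample diagnosis names shown for this variable.
names = [
    "음식 및 수액섭취에 관계된 증상 및 징후",
    "혈관운동성 및 앨러지성 비염",
    "급성 코인두염 [감기]",
    "기타 단일 감염성 질환에 대한 예방접종의 필요",
    "피부사상균증",
]

# Lower-case each name and split on whitespace; brackets stay attached
# to their word because punctuation is not stripped in this sketch.
word_counts = Counter(
    word for name in names for word in name.lower().split()
)

for word, count in word_counts.most_common(3):
    print(word, count)
```

Run over the full column instead of five rows, the same approach gives a frequency table of the kind shown here.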

Most occurring characters

ValueCountFrequency (%)
1158
 
18.6%
221
 
3.5%
219
 
3.5%
186
 
3.0%
164
 
2.6%
147
 
2.4%
134
 
2.1%
122
 
2.0%
108
 
1.7%
95
 
1.5%
Other values (343) 3684
59.1%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  4922  78.9%
Space Separator  1158  18.6%
Open Punctuation  40  0.6%
Close Punctuation  40  0.6%
Decimal Number  23  0.4%
Other Punctuation  22  0.4%
Dash Punctuation  18  0.3%
Uppercase Letter  12  0.2%
Math Symbol  3  < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
221
 
4.5%
219
 
4.4%
186
 
3.8%
164
 
3.3%
147
 
3.0%
134
 
2.7%
122
 
2.5%
108
 
2.2%
95
 
1.9%
89
 
1.8%
Other values (323) 3437
69.8%

Decimal Number

Value  Count  Frequency (%)
0  9  39.1%
7  6  26.1%
3  3  13.0%
9  2  8.7%
1  2  8.7%
2  1  4.3%

Uppercase Letter

Value  Count  Frequency (%)
U  6  50.0%
G  3  25.0%
N  1  8.3%
O  1  8.3%
S  1  8.3%

Open Punctuation

Value  Count  Frequency (%)
(  27  67.5%
[  13  32.5%

Close Punctuation

Value  Count  Frequency (%)
)  27  67.5%
]  13  32.5%

Other Punctuation

Value  Count  Frequency (%)
.  16  72.7%
,  6  27.3%

Space Separator

Value  Count  Frequency (%)
(space)  1158  100.0%

Dash Punctuation

Value  Count  Frequency (%)
-  18  100.0%

Math Symbol

Value  Count  Frequency (%)
+  3  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  4916  78.8%
Common  1304  20.9%
Latin  12  0.2%
Han  6  0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
221
 
4.5%
219
 
4.5%
186
 
3.8%
164
 
3.3%
147
 
3.0%
134
 
2.7%
122
 
2.5%
108
 
2.2%
95
 
1.9%
89
 
1.8%
Other values (318) 3431
69.8%
Common
ValueCountFrequency (%)
1158
88.8%
( 27
 
2.1%
) 27
 
2.1%
- 18
 
1.4%
. 16
 
1.2%
] 13
 
1.0%
[ 13
 
1.0%
0 9
 
0.7%
7 6
 
0.5%
, 6
 
0.5%
Other values (5) 11
 
0.8%
Latin
ValueCountFrequency (%)
U 6
50.0%
G 3
25.0%
N 1
 
8.3%
O 1
 
8.3%
S 1
 
8.3%
Han
ValueCountFrequency (%)
2
33.3%
1
16.7%
1
16.7%
1
16.7%
1
16.7%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  4916  78.8%
ASCII  1316  21.1%
CJK  6  0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1158
88.0%
( 27
 
2.1%
) 27
 
2.1%
- 18
 
1.4%
. 16
 
1.2%
] 13
 
1.0%
[ 13
 
1.0%
0 9
 
0.7%
7 6
 
0.5%
, 6
 
0.5%
Other values (10) 23
 
1.7%
Hangul
ValueCountFrequency (%)
221
 
4.5%
219
 
4.5%
186
 
3.8%
164
 
3.3%
147
 
3.0%
134
 
2.7%
122
 
2.5%
108
 
2.2%
95
 
1.9%
89
 
1.8%
Other values (318) 3431
69.8%
CJK
ValueCountFrequency (%)
2
33.3%
1
16.7%
1
16.7%
1
16.7%
1
16.7%

건수 (case count)
Real number (ℝ)

HIGH CORRELATION

Distinct: 372
Distinct (%): 72.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 777.68798
Minimum: 1
Maximum: 24668
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5
Q1: 55.75
median: 206.5
Q3: 546.5
95-th percentile: 3869.5
Maximum: 24668
Range: 24667
Interquartile range (IQR): 490.75

Descriptive statistics

Standard deviation: 2042.7436
Coefficient of variation (CV): 2.6266878
Kurtosis: 53.478653
Mean: 777.68798
Median Absolute Deviation (MAD): 176
Skewness: 6.4052649
Sum: 401287
Variance: 4172801.3
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Common values

Value  Count  Frequency (%)
1  10  1.9%
8  10  1.9%
2  9  1.7%
6  6  1.2%
16  5  1.0%
36  5  1.0%
134  4  0.8%
11  4  0.8%
60  4  0.8%
38  4  0.8%
Other values (362)  455  88.2%

Minimum 10 values

Value  Count  Frequency (%)
1  10  1.9%
2  9  1.7%
3  4  0.8%
4  2  0.4%
5  2  0.4%
6  6  1.2%
7  2  0.4%
8  10  1.9%
10  4  0.8%
11  4  0.8%

Maximum 10 values

Value  Count  Frequency (%)
24668  1  0.2%
15958  1  0.2%
14662  1  0.2%
13864  1  0.2%
12675  1  0.2%
11678  1  0.2%
11301  1  0.2%
7350  1  0.2%
6306  1  0.2%
6283  1  0.2%
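
The high skewness (6.4) and the gap between the mean (777.69) and the median (206.5) indicate that a few rows carry much of the total. As a rough illustration, the sketch below uses only figures printed in this report to compute how much of the 건수 sum the ten largest values account for; the variable names are illustrative.

```python
# The ten largest 건수 values listed in this report, and the reported sum.
top_values = [24668, 15958, 14662, 13864, 12675,
              11678, 11301, 7350, 6306, 6283]
column_sum = 401287
n_rows = 516

top_total = sum(top_values)

# Fraction of all cases contributed by the ten largest rows.
share_of_cases = top_total / column_sum

# Fraction of all rows that those ten rows represent.
share_of_rows = len(top_values) / n_rows

print(f"top-10 total: {top_total}")
print(f"share of cases: {share_of_cases:.1%}")
print(f"share of rows: {share_of_rows:.1%}")
```

Under these figures, about 2% of the rows account for roughly 31% of all cases.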

Interactions

[Figures: interaction plots (4 panels)]

Correlations

[Figure: correlation heatmap]

        진료과   순위    건수
진료과  1.000   0.000   0.127
순위    0.000   1.000   0.432
건수    0.127   0.432   1.000

[Figure: correlation heatmap]

        순위     건수    진료과
순위    1.000   -0.570   0.000
건수   -0.570    1.000   0.051
진료과  0.000    0.051   1.000
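
To show why 순위 (rank) and 건수 (case count) move in opposite directions, the sketch below computes a rank correlation on the first ten rows of the Sample section (the 가정의학과 department). The report text does not say which coefficient each matrix above uses, so Spearman's rho is an assumption chosen for illustration, and the helper functions are hypothetical, written with the standard library only.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(xs), average_ranks(ys))


# 순위 and 건수 for the first ten sample rows (가정의학과).
rank = list(range(1, 11))
count = [4444, 2275, 1971, 1582, 726, 692, 588, 588, 564, 539]

rho = spearman(rank, count)
print(f"Spearman rho = {rho:.3f}")
```

Within this one department the relationship is almost perfectly monotone and decreasing. The coefficients in the matrices above are computed across all departments, whose case volumes differ widely, so a weaker pooled value is unsurprising.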

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

Columns: 진료과 (department), 순위 (rank), 상명코드 (diagnosis code), 상병명 (diagnosis name), 건수 (case count)

First rows

Index  진료과  순위  상명코드  상병명  건수
0  가정의학과  1  R63  음식 및 수액섭취에 관계된 증상 및 징후  4444
1  가정의학과  2  J30  혈관운동성 및 앨러지성 비염  2275
2  가정의학과  3  J00  급성 코인두염 [감기]  1971
3  가정의학과  4  Z26  기타 단일 감염성 질환에 대한 예방접종의 필요  1582
4  가정의학과  5  B35  피부사상균증  726
5  가정의학과  6  I10  본태성(원발성) 고혈압  692
6  가정의학과  7  R05  기침  588
7  가정의학과  8  R53  병감 및 피로  588
8  가정의학과  9  E78  지질단백질대사장애 및 기타 지질증  564
9  가정의학과  10  Z11  감염성 및 기생충성 질환에 대한 특수선별검사  539

Last rows

Index  진료과  순위  상명코드  상병명  건수
506  호흡기내과  11  J47  기관지확장증  181
507  호흡기내과  12  J86  농흉  156
508  호흡기내과  13  J34  코 및 코옆굴의 기타 장애  149
509  호흡기내과  14  J30  혈관운동성 및 앨러지성 비염  149
510  호흡기내과  15  J43  폐기종  135
511  호흡기내과  16  J98  기타 호흡장애  129
512  호흡기내과  17  J42  상세불명의 만성 기관지염  122
513  호흡기내과  18  U07  U07의 응급사용  110
514  호흡기내과  19  A31  기타 형태의 미코박테리아에 의한 감염  105
515  호흡기내과  20  A16  세균학적으로나 조직학적으로 확인되지 않은 호흡기결핵  95