Overview

Dataset statistics

Number of variables: 10
Number of observations: 10000
Missing cells: 27282
Missing cells (%): 27.3%
Duplicate rows: 1
Duplicate rows (%): < 0.1%
Total size in memory: 878.9 KiB
Average record size in memory: 90.0 B
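
The derived figures in this block follow from the raw counts by simple arithmetic. The sketch below recomputes two of them; the variable names are illustrative and the inputs are copied from the statistics above.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 10
n_observations = 10_000
missing_cells = 27_282
total_size_kib = 878.9

# Every cell is one (row, column) pair, so the table holds 100,000 cells.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size: total bytes divided by the number of rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```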

Variable types

Text: 3
Boolean: 2
Categorical: 4
Numeric: 1

Dataset

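Description: A disease master file based on the Korean Standard Classification of Diseases (KCD). It carries the disease codes needed to claim medical care benefit costs, together with supplementary information about each disease: primary-diagnosis usage flag, complete-code flag, sex flag, legally designated infectious disease flag, upper and lower age limits, and so on.
Author: Health Insurance Review & Assessment Service (건강보험심사평가원)
URL: https://www.data.go.kr/data/15067467/fileData.do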

Alerts

완전코드구분 has constant value "" [Constant]
주상병사용구분 has constant value "" [Constant]
Dataset has 1 (< 0.1%) duplicate rows [Duplicates]
성별구분 is highly overall correlated with 하한연령 and 3 other fields [High correlation]
상한연령 is highly overall correlated with 하한연령 and 2 other fields [High correlation]
양한방구분 is highly overall correlated with 하한연령 and 3 other fields [High correlation]
법정감염병구분 is highly overall correlated with 성별구분 and 1 other fields [High correlation]
하한연령 is highly overall correlated with 성별구분 and 2 other fields [High correlation]
법정감염병구분 is highly imbalanced (95.1%) [Imbalance]
성별구분 is highly imbalanced (79.1%) [Imbalance]
상한연령 is highly imbalanced (92.1%) [Imbalance]
양한방구분 is highly imbalanced (94.6%) [Imbalance]
완전코드구분 has 8902 (89.0%) missing values [Missing]
주상병사용구분 has 8762 (87.6%) missing values [Missing]
하한연령 has 9618 (96.2%) missing values [Missing]
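
The imbalance percentages in these alerts are consistent with one minus the normalized Shannon entropy of each column's value counts. The sketch below works under that assumption and is not taken from the profiler's source; `imbalance_score` is a hypothetical helper name, and the counts are copied from the 성별구분 and 양한방구분 sections of this report.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class counts and k is the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class is reported as 0 here
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

sex_flag_counts = [9461, 474, 65]   # 성별구분: <NA>, X, Y
medicine_type_counts = [9939, 61]   # 양한방구분: 양한방 공통, 한방

print(f"성별구분: {imbalance_score(sex_flag_counts):.1%}")
print(f"양한방구분: {imbalance_score(medicine_type_counts):.1%}")
```

For these two columns the sketch gives 79.1% and 94.6%, which matches the values shown in the alerts.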

Reproduction

Analysis started: 2023-12-12 23:33:58.266687
Analysis finished: 2023-12-12 23:33:59.710821
Duration: 1.44 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

상병기호
Text

Distinct: 7333
Distinct (%): 73.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 6
Median length: 5
Mean length: 4.4367
Min length: 3

Characters and Unicode

Total characters: 44367
Distinct characters: 36
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
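
As a rough illustration of the sentence above, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories for the five sample rows of this variable. It is not the profiler's own implementation, and `category_counts` is a hypothetical helper name.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories (e.g. 'Lu', 'Nd') over all characters."""
    counts = Counter()
    for value in values:
        counts.update(unicodedata.category(ch) for ch in value)
    return counts

# The five sample rows shown for this variable.
sample_codes = ["T269", "M1229", "W110", "M8904", "J986"]
counts = category_counts(sample_codes)

# 'Lu' is Uppercase Letter and 'Nd' is Decimal Number.
print(counts)
```

Each code starts with one uppercase letter followed by digits, which is why the report finds only two distinct categories for this column.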

Unique

Unique: 5675
Unique (%): 56.8%
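
The report separates distinct values (how many different values occur) from unique values (how many values occur exactly once). The sketch below shows the difference on a small made-up list of codes; the list is illustrative only and is not drawn from the real column.

```python
from collections import Counter

# Illustrative values only: two codes repeat, two occur once.
codes = ["E1140", "E1140", "M9025", "T269", "J986", "J986"]

frequency = Counter(codes)
n_distinct = len(frequency)                              # different values
n_unique = sum(1 for n in frequency.values() if n == 1)  # values seen once

print(f"Distinct: {n_distinct}, Unique: {n_unique}")
```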

Sample

1st row: T269
2nd row: M1229
3rd row: W110
4th row: M8904
5th row: J986
Value | Count | Frequency (%)
e1140 | 12 | 0.1%
m9025 | 11 | 0.1%
m7797 | 11 | 0.1%
m0534 | 11 | 0.1%
m0537 | 11 | 0.1%
e1142 | 11 | 0.1%
e1132 | 10 | 0.1%
e1133 | 10 | 0.1%
m9021 | 9 | 0.1%
e1121 | 9 | 0.1%
Other values (7323) | 9895 | 99.0%

Most occurring characters

Value | Count | Frequency (%)
1 | 4807 | 10.8%
0 | 4718 | 10.6%
8 | 4016 | 9.1%
2 | 3783 | 8.5%
M | 3592 | 8.1%
4 | 3003 | 6.8%
6 | 2981 | 6.7%
3 | 2897 | 6.5%
5 | 2771 | 6.2%
9 | 2713 | 6.1%
Other values (26) | 9086 | 20.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 34367 | 77.5%
Uppercase Letter | 10000 | 22.5%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
M | 3592 | 35.9%
E | 549 | 5.5%
S | 433 | 4.3%
X | 428 | 4.3%
K | 390 | 3.9%
T | 331 | 3.3%
Q | 311 | 3.1%
Y | 305 | 3.0%
H | 285 | 2.9%
F | 255 | 2.5%
Other values (16) | 3121 | 31.2%

Decimal Number
Value | Count | Frequency (%)
1 | 4807 | 14.0%
0 | 4718 | 13.7%
8 | 4016 | 11.7%
2 | 3783 | 11.0%
4 | 3003 | 8.7%
6 | 2981 | 8.7%
3 | 2897 | 8.4%
5 | 2771 | 8.1%
9 | 2713 | 7.9%
7 | 2678 | 7.8%

Most occurring scripts

Value | Count | Frequency (%)
Common | 34367 | 77.5%
Latin | 10000 | 22.5%

Most frequent character per script

Latin
Value | Count | Frequency (%)
M | 3592 | 35.9%
E | 549 | 5.5%
S | 433 | 4.3%
X | 428 | 4.3%
K | 390 | 3.9%
T | 331 | 3.3%
Q | 311 | 3.1%
Y | 305 | 3.0%
H | 285 | 2.9%
F | 255 | 2.5%
Other values (16) | 3121 | 31.2%

Common
Value | Count | Frequency (%)
1 | 4807 | 14.0%
0 | 4718 | 13.7%
8 | 4016 | 11.7%
2 | 3783 | 11.0%
4 | 3003 | 8.7%
6 | 2981 | 8.7%
3 | 2897 | 8.4%
5 | 2771 | 8.1%
9 | 2713 | 7.9%
7 | 2678 | 7.8%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 44367 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 4807 | 10.8%
0 | 4718 | 10.6%
8 | 4016 | 9.1%
2 | 3783 | 8.5%
M | 3592 | 8.1%
4 | 3003 | 6.8%
6 | 2981 | 6.7%
3 | 2897 | 6.5%
5 | 2771 | 6.2%
9 | 2713 | 6.1%
Other values (26) | 9086 | 20.5%

한글명
Text

Distinct: 9976
Distinct (%): 99.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 156
Median length: 70
Mean length: 18.8539
Min length: 2

Characters and Unicode

Total characters: 188539
Distinct characters: 920
Distinct categories: 12
Distinct scripts: 4
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9952
Unique (%): 99.5%

Sample

1st row: 눈 및 눈부속기의 상세불명 부분의 부식
2nd row: (착색) 융모결절성 윤활막염 상세불명 부분
3rd row: 사다리에서의 낙상 주택
4th row: 어깨-손증후군 손
5th row: 횡격막의 마비
ValueCountFrequency (%)
2219
 
5.0%
기타 2046
 
4.6%
상세불명의 966
 
2.2%
의한 749
 
1.7%
동반한 738
 
1.7%
nos 552
 
1.2%
또는 502
 
1.1%
상세불명 447
 
1.0%
명시된 415
 
0.9%
달리 410
 
0.9%
Other values (6883) 35387
79.6%

Most occurring characters

ValueCountFrequency (%)
34449
 
18.3%
6663
 
3.5%
4064
 
2.2%
3112
 
1.7%
2945
 
1.6%
2830
 
1.5%
2691
 
1.4%
2573
 
1.4%
2551
 
1.4%
2221
 
1.2%
Other values (910) 124440
66.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 140915 | 74.7%
Space Separator | 34449 | 18.3%
Decimal Number | 3609 | 1.9%
Uppercase Letter | 3075 | 1.6%
Close Punctuation | 2008 | 1.1%
Open Punctuation | 2008 | 1.1%
Other Punctuation | 1763 | 0.9%
Dash Punctuation | 577 | 0.3%
Math Symbol | 105 | 0.1%
Lowercase Letter | 16 | < 0.1%
Other values (2) | 14 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
6663
 
4.7%
4064
 
2.9%
3112
 
2.2%
2945
 
2.1%
2830
 
2.0%
2691
 
1.9%
2573
 
1.8%
2551
 
1.8%
2221
 
1.6%
2218
 
1.6%
Other values (850) 109047
77.4%
Uppercase Letter
Value | Count | Frequency (%)
N | 688 | 22.4%
S | 661 | 21.5%
O | 624 | 20.3%
A | 251 | 8.2%
B | 126 | 4.1%
G | 112 | 3.6%
I | 102 | 3.3%
H | 90 | 2.9%
C | 81 | 2.6%
T | 67 | 2.2%
Other values (13) | 273 | 8.9%

Decimal Number
Value | Count | Frequency (%)
0 | 676 | 18.7%
3 | 489 | 13.5%
2 | 407 | 11.3%
1 | 361 | 10.0%
4 | 346 | 9.6%
5 | 339 | 9.4%
9 | 286 | 7.9%
8 | 273 | 7.6%
6 | 242 | 6.7%
7 | 190 | 5.3%
Other Punctuation
ValueCountFrequency (%)
. 911
51.7%
464
26.3%
* 370
21.0%
/ 10
 
0.6%
: 3
 
0.2%
3
 
0.2%
? 2
 
0.1%
Lowercase Letter
Value | Count | Frequency (%)
m | 6 | 37.5%
g | 4 | 25.0%
l | 3 | 18.8%
h | 1 | 6.2%
b | 1 | 6.2%
q | 1 | 6.2%
Letter Number
ValueCountFrequency (%)
5
38.5%
3
23.1%
3
23.1%
1
 
7.7%
1
 
7.7%
Close Punctuation
Value | Count | Frequency (%)
) | 1859 | 92.6%
] | 149 | 7.4%

Open Punctuation
Value | Count | Frequency (%)
( | 1859 | 92.6%
[ | 149 | 7.4%

Math Symbol
Value | Count | Frequency (%)
~ | 59 | 56.2%
+ | 46 | 43.8%

Space Separator
Value | Count | Frequency (%)
(space) | 34449 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 577 | 100.0%
Other Number
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 140540 | 74.5%
Common | 44520 | 23.6%
Latin | 3104 | 1.6%
Han | 375 | 0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
6663
 
4.7%
4064
 
2.9%
3112
 
2.2%
2945
 
2.1%
2830
 
2.0%
2691
 
1.9%
2573
 
1.8%
2551
 
1.8%
2221
 
1.6%
2218
 
1.6%
Other values (713) 108672
77.3%
Han
ValueCountFrequency (%)
60
 
16.0%
16
 
4.3%
13
 
3.5%
11
 
2.9%
11
 
2.9%
9
 
2.4%
9
 
2.4%
8
 
2.1%
8
 
2.1%
7
 
1.9%
Other values (127) 223
59.5%
Latin
Value | Count | Frequency (%)
N | 688 | 22.2%
S | 661 | 21.3%
O | 624 | 20.1%
A | 251 | 8.1%
B | 126 | 4.1%
G | 112 | 3.6%
I | 102 | 3.3%
H | 90 | 2.9%
C | 81 | 2.6%
T | 67 | 2.2%
Other values (24) | 302 | 9.7%
Common
ValueCountFrequency (%)
34449
77.4%
) 1859
 
4.2%
( 1859
 
4.2%
. 911
 
2.0%
0 676
 
1.5%
- 577
 
1.3%
3 489
 
1.1%
464
 
1.0%
2 407
 
0.9%
* 370
 
0.8%
Other values (16) 2459
 
5.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 140540 | 74.5%
ASCII | 47143 | 25.0%
Punctuation | 464 | 0.2%
CJK | 370 | 0.2%
Number Forms | 13 | < 0.1%
CJK Compat Ideographs | 5 | < 0.1%
None | 4 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
34449
73.1%
) 1859
 
3.9%
( 1859
 
3.9%
. 911
 
1.9%
N 688
 
1.5%
0 676
 
1.4%
S 661
 
1.4%
O 624
 
1.3%
- 577
 
1.2%
3 489
 
1.0%
Other values (42) 4350
 
9.2%
Hangul
ValueCountFrequency (%)
6663
 
4.7%
4064
 
2.9%
3112
 
2.2%
2945
 
2.1%
2830
 
2.0%
2691
 
1.9%
2573
 
1.8%
2551
 
1.8%
2221
 
1.6%
2218
 
1.6%
Other values (713) 108672
77.3%
Punctuation
ValueCountFrequency (%)
464
100.0%
CJK
ValueCountFrequency (%)
60
 
16.2%
16
 
4.3%
13
 
3.5%
11
 
3.0%
11
 
3.0%
9
 
2.4%
9
 
2.4%
8
 
2.2%
8
 
2.2%
7
 
1.9%
Other values (122) 218
58.9%
Number Forms
ValueCountFrequency (%)
5
38.5%
3
23.1%
3
23.1%
1
 
7.7%
1
 
7.7%
None
ValueCountFrequency (%)
3
75.0%
1
 
25.0%
CJK Compat Ideographs
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%

영문명
Text

Distinct: 9993
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 218
Median length: 146
Mean length: 47.2809
Min length: 3

Characters and Unicode

Total characters: 472809
Distinct characters: 90
Distinct categories: 13
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9986
Unique (%): 99.9%

Sample

1st row: Corrosion of eye and adnexa part unspecified
2nd row: Villonodular synovitis (pigmented) site unspecified
3rd row: Fall on and from ladder home
4th row: Shoulder-hand syndrome hand
5th row: Paralysis of diaphragm
Value | Count | Frequency (%)
of | 3385 | 5.6%
and | 2252 | 3.7%
other | 2063 | 3.4%
unspecified | 1386 | 2.3%
in | 1243 | 2.1%
with | 1225 | 2.0%
to | 816 | 1.3%
or | 714 | 1.2%
nos | 607 | 1.0%
by | 595 | 1.0%
Other values (5365) | 46275 | 76.4%

Most occurring characters

Value | Count | Frequency (%)
(space) | 50591 | 10.7%
e | 41964 | 8.9%
i | 39356 | 8.3%
o | 32904 | 7.0%
t | 31590 | 6.7%
a | 30253 | 6.4%
s | 29336 | 6.2%
n | 28714 | 6.1%
r | 27994 | 5.9%
l | 18067 | 3.8%
Other values (80) | 142040 | 30.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 396842 | 83.9%
Space Separator | 50591 | 10.7%
Uppercase Letter | 14363 | 3.0%
Decimal Number | 3511 | 0.7%
Close Punctuation | 2061 | 0.4%
Open Punctuation | 2061 | 0.4%
Other Punctuation | 1747 | 0.4%
Dash Punctuation | 1304 | 0.3%
Final Punctuation | 226 | < 0.1%
Math Symbol | 85 | < 0.1%
Other values (3) | 18 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 41964 | 10.6%
i | 39356 | 9.9%
o | 32904 | 8.3%
t | 31590 | 8.0%
a | 30253 | 7.6%
s | 29336 | 7.4%
n | 28714 | 7.2%
r | 27994 | 7.1%
l | 18067 | 4.6%
c | 17755 | 4.5%
Other values (18) | 98909 | 24.9%

Uppercase Letter
Value | Count | Frequency (%)
O | 2051 | 14.3%
S | 1468 | 10.2%
A | 1322 | 9.2%
C | 1183 | 8.2%
N | 1173 | 8.2%
P | 1029 | 7.2%
M | 799 | 5.6%
I | 678 | 4.7%
E | 554 | 3.9%
F | 528 | 3.7%
Other values (16) | 3578 | 24.9%

Decimal Number
Value | Count | Frequency (%)
0 | 675 | 19.2%
3 | 463 | 13.2%
2 | 394 | 11.2%
5 | 336 | 9.6%
4 | 333 | 9.5%
1 | 331 | 9.4%
9 | 286 | 8.1%
8 | 261 | 7.4%
6 | 242 | 6.9%
7 | 190 | 5.4%
Other Punctuation
ValueCountFrequency (%)
. 904
51.7%
463
26.5%
* 361
 
20.7%
/ 10
 
0.6%
' 4
 
0.2%
3
 
0.2%
: 2
 
0.1%
Letter Number
ValueCountFrequency (%)
6
37.5%
3
18.8%
3
18.8%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
Math Symbol
Value | Count | Frequency (%)
+ | 46 | 54.1%
~ | 38 | 44.7%
> | 1 | 1.2%

Close Punctuation
Value | Count | Frequency (%)
) | 1848 | 89.7%
] | 213 | 10.3%

Open Punctuation
Value | Count | Frequency (%)
( | 1848 | 89.7%
[ | 213 | 10.3%

Space Separator
Value | Count | Frequency (%)
(space) | 50591 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1304 | 100.0%
Final Punctuation
ValueCountFrequency (%)
226
100.0%
Modifier Symbol
ValueCountFrequency (%)
´ 1
100.0%
Other Number
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 411213 | 87.0%
Common | 61588 | 13.0%
Greek | 8 | < 0.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 41964 | 10.2%
i | 39356 | 9.6%
o | 32904 | 8.0%
t | 31590 | 7.7%
a | 30253 | 7.4%
s | 29336 | 7.1%
n | 28714 | 7.0%
r | 27994 | 6.8%
l | 18067 | 4.4%
c | 17755 | 4.3%
Other values (49) | 113280 | 27.5%
Common
ValueCountFrequency (%)
50591
82.1%
) 1848
 
3.0%
( 1848
 
3.0%
- 1304
 
2.1%
. 904
 
1.5%
0 675
 
1.1%
463
 
0.8%
3 463
 
0.8%
2 394
 
0.6%
* 361
 
0.6%
Other values (19) 2737
 
4.4%
Greek
Value | Count | Frequency (%)
β | 6 | 75.0%
α | 2 | 25.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 472091 | 99.8%
Punctuation | 689 | 0.1%
Number Forms | 16 | < 0.1%
None | 13 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
50591
 
10.7%
e 41964
 
8.9%
i 39356
 
8.3%
o 32904
 
7.0%
t 31590
 
6.7%
a 30253
 
6.4%
s 29336
 
6.2%
n 28714
 
6.1%
r 27994
 
5.9%
l 18067
 
3.8%
Other values (66) 141322
29.9%
Punctuation
ValueCountFrequency (%)
463
67.2%
226
32.8%
Number Forms
ValueCountFrequency (%)
6
37.5%
3
18.8%
3
18.8%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
None
ValueCountFrequency (%)
β 6
46.2%
3
23.1%
α 2
 
15.4%
´ 1
 
7.7%
1
 
7.7%

완전코드구분
Boolean

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 0.1%
Missing: 8902
Missing (%): 89.0%
Memory size: 97.7 KiB

Value | Count | Frequency (%)
False | 1098 | 11.0%
(Missing) | 8902 | 89.0%

주상병사용구분
Boolean

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 0.1%
Missing: 8762
Missing (%): 87.6%
Memory size: 97.7 KiB

Value | Count | Frequency (%)
False | 1238 | 12.4%
(Missing) | 8762 | 87.6%

법정감염병구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 4
Mean length: 3.9882
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 9882 | 98.8%
제2급 | 45 | 0.4%
제4급 | 35 | 0.4%
제3급 | 31 | 0.3%
제1급 | 7 | 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
na | 9882 | 98.8%
제2급 | 45 | 0.4%
제4급 | 35 | 0.4%
제3급 | 31 | 0.3%
제1급 | 7 | 0.1%

성별구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 4
Mean length: 3.8383
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 9461 | 94.6%
X | 474 | 4.7%
Y | 65 | 0.7%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
na | 9461 | 94.6%
x | 474 | 4.7%
y | 65 | 0.7%

상한연령
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 4
Mean length: 3.947
Min length: 2

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 9735 | 97.4%
60 | 257 | 2.6%
24 | 5 | 0.1%
20 | 2 | < 0.1%
15 | 1 | < 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
na | 9735 | 97.4%
60 | 257 | 2.6%
24 | 5 | < 0.1%
20 | 2 | < 0.1%
15 | 1 | < 0.1%

하한연령
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 6
Distinct (%): 1.6%
Missing: 9618
Missing (%): 96.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.188482
Minimum: 8
Maximum: 65
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 8
5-th percentile: 8
Q1: 8
Median: 8
Q3: 15
95-th percentile: 40
Maximum: 65
Range: 57
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 8.5601821
Coefficient of variation (CV): 0.70231734
Kurtosis: 8.053814
Mean: 12.188482
Median Absolute Deviation (MAD): 0
Skewness: 2.7629564
Sum: 4656
Variance: 73.276717
Monotonicity: Not monotonic
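
Several of these statistics can be rebuilt from the value counts reported for this variable (8 × 262, 15 × 71, 40 × 24, 20 × 23, 10 × 1, 65 × 1). The sketch below assumes the reported variance is the sample variance with an n − 1 denominator, which is what the figures are consistent with; it uses only Python's `statistics` module and is not the profiler's own code.

```python
import statistics

# Non-missing value counts for 하한연령, copied from this report.
value_counts = {8: 262, 15: 71, 40: 24, 20: 23, 10: 1, 65: 1}

# Expand the counts back into the 382 individual observations.
values = [value for value, n in value_counts.items() for _ in range(n)]

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)
cv = std_dev / mean                     # coefficient of variation

print(f"n={len(values)} sum={total} mean={mean:.6f}")
print(f"variance={variance:.6f} std={std_dev:.7f} cv={cv:.8f}")
```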
[Figure: Histogram with fixed size bins (bins=6)]

Value | Count | Frequency (%)
8 | 262 | 2.6%
15 | 71 | 0.7%
40 | 24 | 0.2%
20 | 23 | 0.2%
10 | 1 | < 0.1%
65 | 1 | < 0.1%
(Missing) | 9618 | 96.2%

Value | Count | Frequency (%)
8 | 262 | 2.6%
10 | 1 | < 0.1%
15 | 71 | 0.7%
20 | 23 | 0.2%
40 | 24 | 0.2%
65 | 1 | < 0.1%

Value | Count | Frequency (%)
65 | 1 | < 0.1%
40 | 24 | 0.2%
20 | 23 | 0.2%
15 | 71 | 0.7%
10 | 1 | < 0.1%
8 | 262 | 2.6%

양한방구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 6
Median length: 6
Mean length: 5.9756
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 양한방 공통
2nd row: 양한방 공통
3rd row: 양한방 공통
4th row: 양한방 공통
5th row: 양한방 공통

Common Values

Value | Count | Frequency (%)
양한방 공통 | 9939 | 99.4%
한방 | 61 | 0.6%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

Value | Count | Frequency (%)
양한방 | 9939 | 49.8%
공통 | 9939 | 49.8%
한방 | 61 | 0.3%
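
The plot table for this variable appears to count whitespace-separated tokens rather than whole values, which would explain why 양한방 and 공통 each appear 9939 times and why the percentages differ from the Common Values table. The sketch below works under that assumption and uses the two value counts reported above.

```python
from collections import Counter

# Whole-value counts from the "Common Values" table for 양한방구분.
value_counts = {"양한방 공통": 9939, "한방": 61}

# Split each value on whitespace and credit every token with the value's count.
token_counts = Counter()
for value, n in value_counts.items():
    for token in value.split():
        token_counts[token] += n

total_tokens = sum(token_counts.values())
shares = {token: 100 * n / total_tokens for token, n in token_counts.items()}

for token, n in token_counts.most_common():
    print(f"{token} | {n} | {shares[token]:.1f}%")
```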

Interactions


Correlations

Variable | 법정감염병구분 | 성별구분 | 상한연령 | 하한연령 | 양한방구분
법정감염병구분 | 1.000 | 0.000 | NaN | NaN | NaN
성별구분 | 0.000 | 1.000 | NaN | NaN | NaN
상한연령 | NaN | NaN | 1.000 | NaN | NaN
하한연령 | NaN | NaN | NaN | 1.000 | NaN
양한방구분 | NaN | NaN | NaN | NaN | 1.000

Variable | 성별구분 | 상한연령 | 양한방구분 | 법정감염병구분
성별구분 | 1.000 | 1.000 | 1.000 | 1.000
상한연령 | 1.000 | 1.000 | 1.000 | NaN
양한방구분 | 1.000 | 1.000 | 1.000 | 1.000
법정감염병구분 | 1.000 | NaN | 1.000 | 1.000

Variable | 하한연령 | 법정감염병구분 | 성별구분 | 상한연령 | 양한방구분
하한연령 | 1.000 | 0.000 | 1.000 | 1.000 | 1.000
법정감염병구분 | 0.000 | 1.000 | 1.000 | 0.000 | 1.000
성별구분 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
상한연령 | 1.000 | 0.000 | 1.000 | 1.000 | 1.000
양한방구분 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
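
Nullity correlation can be sketched as the Pearson correlation between per-column missingness indicators (1 where a value is missing, 0 where it is present). The rows and the English field names below are invented for illustration, and `pearson` and `missing_flags` are hypothetical helpers; the profiler's own computation may differ in detail.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when a column is never or always missing
    return cov / math.sqrt(var_x * var_y)

# Invented rows: None marks a missing value.
rows = [
    {"upper_age": None, "lower_age": None},
    {"upper_age": 60, "lower_age": 8},
    {"upper_age": None, "lower_age": None},
    {"upper_age": 24, "lower_age": None},
]

def missing_flags(column):
    """Return 1 for each row where the column is missing, else 0."""
    return [1 if row[column] is None else 0 for row in rows]

nullity_corr = pearson(missing_flags("upper_age"), missing_flags("lower_age"))
print(f"nullity correlation: {nullity_corr:.3f}")
```

A value near 1 means the two columns tend to be missing together; a value near -1 means one tends to be present when the other is missing.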

Sample

Index | 상병기호 | 한글명 | 영문명 | 완전코드구분 | 주상병사용구분 | 법정감염병구분 | 성별구분 | 상한연령 | 하한연령 | 양한방구분
39507 | T269 | 눈 및 눈부속기의 상세불명 부분의 부식 | Corrosion of eye and adnexa part unspecified | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 양한방 공통
19062 | M1229 | (착색) 융모결절성 윤활막염 상세불명 부분 | Villonodular synovitis (pigmented) site unspecified | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 양한방 공통
42012 | W110 | 사다리에서의 낙상 주택 | Fall on and from ladder home | <NA> | N | <NA> | <NA> | <NA> | <NA> | 양한방 공통
29744 | M8904 | 어깨-손증후군 손 | Shoulder-hand syndrome hand | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 양한방 공통
11958 | J986 | 횡격막의 마비 | Paralysis of diaphragm | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 양한방 공통
31390 | M9912 | (척추)부분탈구복합 흉요추 | Subluxation complex (vertebral) thoracolumbar | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 양한방 공통
23406 | M6247 | 근육의 구축 발가락 | Contracture of muscle toes | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 양한방 공통
12139 | K042 | 치수변성 | Pulp degeneration | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 양한방 공통
42234 | W24 | 철사와의 접촉 | Contact with wire | N | N | <NA> | <NA> | <NA> | <NA> | 양한방 공통
32431 | N854 | 자궁의 후굴 | Retroflexion of uterus | <NA> | <NA> | <NA> | X | <NA> | <NA> | 양한방 공통

Index | 상병기호 | 한글명 | 영문명 | 완전코드구분 | 주상병사용구분 | 법정감염병구분 | 성별구분 | 상한연령 | 하한연령 | 양한방구분
5417 | E1161 | 수포(당뇨병성 수포증)를 동반한 성숙기발병당뇨병(진성 비비만성 비만성)(L14*) | Maturity-onset diabetes (mellitus nonobese obese) with bullae(bullosis diabeticorum)(L14*) | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 양한방 공통
9707 | H447 | 전방(~에서의) (비자기성)(오래된) 안구내이물 | Retained (nonmagnetic)(old) foreign body (in) anterior chamber | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 양한방 공통
21175 | M2591 | 상세불명의 관절장애 견쇄관절 | Joint disorder unspecified acromioclavicular joints | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 양한방 공통
30731 | M9035 | 잠함병에서의 골괴사(T70.3†) 고관절 | Osteonecrosis in caisson disease(T70.3†) hip (joint) | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 양한방 공통
17264 | M06879 | 상세불명 기타 명시된 류마티스관절염 족근골 | Unspecified other specified rheumatoid arthritis tarsus | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 양한방 공통
12334 | K103 | 치조골염 | Alveolar osteitis | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 양한방 공통
38233 | S62170 | 갈고리뼈의 골절 폐쇄성 | Fracture of hamate bone closed | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 양한방 공통
34044 | P11 | 중추신경계통에 대한 기타 출산손상 | Other birth injuries to central nervous system | N | <NA> | <NA> | <NA> | <NA> | <NA> | 양한방 공통
30835 | M9058 | 달리 분류된 기타 질환에서의 골괴사 두개골 | Osteonecrosis in other diseases classified elsewhere skull | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 양한방 공통
43236 | X092 | 상세불명의 연기 불 및 불꽃에의 노출 학교 기타 시설 및 공공행정 구역 | Exposure to unspecified smoke fire and flames school other institution and public administrative area | <NA> | N | <NA> | <NA> | <NA> | <NA> | 양한방 공통

Duplicate rows

Most frequently occurring

Index | 상병기호 | 한글명 | 영문명 | 완전코드구분 | 주상병사용구분 | 법정감염병구분 | 성별구분 | 상한연령 | 하한연령 | 양한방구분 | # duplicates
0 | G03 | 기타 및 상세불명의 원인에 의한 수막염 | Meningitis due to other and unspecified causes | N | <NA> | <NA> | <NA> | <NA> | <NA> | 양한방 공통 | 2
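
Duplicate rows such as the G03 record can be found by counting identical records. The sketch below uses a few hand-typed records with only three of the ten fields (code, English name, 완전코드구분); the field subset and variable names are illustrative and this is not how the profiler itself is implemented.

```python
from collections import Counter

# Each record is a tuple so it can be hashed and counted.
records = [
    ("G03", "Meningitis due to other and unspecified causes", "N"),
    ("K042", "Pulp degeneration", None),
    ("G03", "Meningitis due to other and unspecified causes", "N"),
]

row_counts = Counter(records)

# Keep only the rows that occur more than once, with their occurrence count.
duplicates = {row: n for row, n in row_counts.items() if n > 1}

for row, n in duplicates.items():
    print(row[0], "| # duplicates:", n)
```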