Overview

Dataset statistics

Number of variables: 16
Number of observations: 10000
Missing cells: 16217
Missing cells (%): 10.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.3 MiB
Average record size in memory: 138.0 B
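
As a quick consistency check, the missing-cell percentage can be recomputed from the counts reported in this overview. The sketch below is illustrative only: the helper name `missing_cell_pct` is hypothetical, and the two per-variable missing counts are the ones the report lists for 특이사항 and 병동특이사항.

```python
def missing_cell_pct(missing_cells: int, n_variables: int, n_observations: int) -> float:
    """Share of missing cells, in percent, over every cell in the table."""
    total_cells = n_variables * n_observations
    return 100.0 * missing_cells / total_cells

# Per-variable missing counts reported for 특이사항 and 병동특이사항.
missing_by_variable = [8109, 8108]
missing_total = sum(missing_by_variable)

# Figures copied from the dataset statistics: 16 variables, 10000 observations.
pct = missing_cell_pct(missing_total, n_variables=16, n_observations=10000)
print(missing_total, f"{pct:.1f}%")
```

The two text columns account for all 16217 missing cells, so no other variable contributes to the total.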

Variable types

Text: 4
DateTime: 3
Categorical: 8
Numeric: 1

Dataset

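Description: Provided by Incheon Veterans Hospital (인천보훈병원). For each patient, the data records which type of meal is served at breakfast, lunch, and dinner, and consists of food service, meal, and diet information.
URL: https://www.data.go.kr/data/15117976/fileData.do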

Alerts

환자구분 has constant value "입원" (Constant)
식사범주코드 is highly overall correlated with 용량1 (High correlation)
대상구분 is highly overall correlated with 용량1 (High correlation)
환자급종 is highly overall correlated with 용량1 (High correlation)
용량1 is highly overall correlated with 열량 and 5 other fields (High correlation)
열량 is highly overall correlated with 용량1 (High correlation)
진료과코드 is highly overall correlated with 용량1 (High correlation)
개시끼니 is highly overall correlated with 용량1 (High correlation)
대상구분 is highly imbalanced (77.3%) (Imbalance)
환자급종 is highly imbalanced (72.0%) (Imbalance)
용량1 is highly imbalanced (99.0%) (Imbalance)
특이사항 has 8109 (81.1%) missing values (Missing)
병동특이사항 has 8108 (81.1%) missing values (Missing)
열량 has 7819 (78.2%) zeros (Zeros)
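
The imbalance percentages in these alerts are consistent with one minus the normalized Shannon entropy of the category counts. That reading is inferred from the reported numbers rather than stated in the report, so the sketch below is a check under that assumption. The function name `imbalance_score` is hypothetical, and the counts are copied from the Common Values tables of each variable.

```python
import math

def imbalance_score(counts):
    """Return 1 minus the Shannon entropy of the counts, normalized by log(k)."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single-class variable is not scored
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)

# Category counts as listed in the report for the three flagged variables.
counts_by_variable = {
    "대상구분": [9632, 368],
    "환자급종": [8623, 1078, 203, 48, 32, 16],
    "용량1": [9987, 11, 2],
}

scores = {name: imbalance_score(c) for name, c in counts_by_variable.items()}
for name, score in scores.items():
    print(f"{name}: {100 * score:.1f}%")
```

A score near 100% means almost all rows fall in one category, which is why 용량1 (9987 of 10000 rows share one value) scores highest.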

Reproduction

Analysis started: 2023-12-12 07:37:24.773223
Analysis finished: 2023-12-12 07:37:26.600741
Duration: 1.83 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

환자번호
Text

Distinct: 908
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 80000
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 145
Unique (%): 1.5%

Sample

1st row: 001**173
2nd row: 001**652
3rd row: 001**843
4th row: 212**424
5th row: 001**965

Value | Count | Frequency (%)
212**581 | 132 | 1.3%
001**658 | 123 | 1.2%
210**190 | 120 | 1.2%
212**263 | 103 | 1.0%
130**935 | 100 | 1.0%
212**954 | 96 | 1.0%
212**691 | 84 | 0.8%
001**686 | 83 | 0.8%
212**827 | 81 | 0.8%
001**986 | 79 | 0.8%
Other values (898) | 8999 | 90.0%
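
The fixed length of 8 and the share of `*` characters suggest that every identifier is masked the same way: three digits, two asterisks, three digits. The pattern below is inferred from the five sample rows only and is an assumption, not something the report states. The name `MASKED_ID` is hypothetical.

```python
import re

# Hypothetical pattern inferred from the sample rows: 3 digits, "**", 3 digits.
MASKED_ID = re.compile(r"\d{3}\*\*\d{3}")

# The five sample rows shown for this variable.
samples = ["001**173", "001**652", "001**843", "212**424", "001**965"]

all_match = all(MASKED_ID.fullmatch(s) is not None for s in samples)
lengths = {len(s) for s in samples}
stars_per_value = {s.count("*") for s in samples}
print(all_match, lengths, stars_per_value)
```

Two asterisks per value across 10000 rows would give the 20000 `*` characters reported in the character counts, which fits the same reading.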

Most occurring characters

ValueCountFrequency (%)
* 20000
25.0%
0 16601
20.8%
1 13761
17.2%
2 7371
 
9.2%
9 4008
 
5.0%
3 3382
 
4.2%
8 3203
 
4.0%
6 3173
 
4.0%
5 3145
 
3.9%
7 2772
 
3.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 60000
75.0%
Other Punctuation 20000
 
25.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 16601
27.7%
1 13761
22.9%
2 7371
12.3%
9 4008
 
6.7%
3 3382
 
5.6%
8 3203
 
5.3%
6 3173
 
5.3%
5 3145
 
5.2%
7 2772
 
4.6%
4 2584
 
4.3%
Other Punctuation
ValueCountFrequency (%)
* 20000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 80000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
* 20000
25.0%
0 16601
20.8%
1 13761
17.2%
2 7371
 
9.2%
9 4008
 
5.0%
3 3382
 
4.2%
8 3203
 
4.0%
6 3173
 
4.0%
5 3145
 
3.9%
7 2772
 
3.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 80000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
* 20000
25.0%
0 16601
20.8%
1 13761
17.2%
2 7371
 
9.2%
9 4008
 
5.0%
3 3382
 
4.2%
8 3203
 
4.0%
6 3173
 
4.0%
5 3145
 
3.9%
7 2772
 
3.5%

급식일자
DateTime

Distinct: 524
Distinct (%): 5.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2018-09-13 00:00:00
Maximum: 2020-02-22 00:00:00
Histogram with fixed size bins (bins=50)

식사범주코드
Categorical

HIGH CORRELATION 

Distinct: 10
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 6
Median length: 2
Mean length: 2.234
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 금식
2nd row: 상식
3rd row: 금식
4th row: 상식
5th row: 연식

Common Values

Value | Count | Frequency (%)
상식 | 5779 | 57.8%
당뇨식 | 1649 | 16.5%
연식 | 980 | 9.8%
금식 | 949 | 9.5%
저염식 | 319 | 3.2%
저단백식 | 126 | 1.3%
관급식 | 100 | 1.0%
미음 | 90 | 0.9%
고단백식 | 6 | 0.1%
당뇨외치료식 | 2 | < 0.1%

Length

Histogram of lengths of the category

급식끼니
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 점심
2nd row: 아침
3rd row: 점심
4th row: 아침
5th row: 저녁

Common Values

Value | Count | Frequency (%)
아침 | 3365 | 33.7%
점심 | 3321 | 33.2%
저녁 | 3314 | 33.1%

Length

Histogram of lengths of the category

환자구분
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 입원
2nd row: 입원
3rd row: 입원
4th row: 입원
5th row: 입원

Common Values

Value | Count | Frequency (%)
입원 | 10000 | 100.0%

Length

Histogram of lengths of the category

식사코드
Text

Distinct: 59
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 4
Mean length: 4.7286
Min length: 2

Characters and Unicode

Total characters: 47286
Distinct characters: 62
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8
Unique (%): 0.1%

Sample

1st row: 금식
2nd row: 일반상식
3rd row: 금식
4th row: 일반상식
5th row: 일반연식

Value | Count | Frequency (%)
일반상식 | 5686 | 56.3%
당뇨상식1800 | 1028 | 10.2%
일반연식 | 973 | 9.6%
금식 | 674 | 6.7%
저염상식 | 289 | 2.9%
보호자금식 | 275 | 2.7%
당뇨저염상식1800 | 152 | 1.5%
당뇨상식1700 | 102 | 1.0%
당뇨연식1800 | 98 | 1.0%
보호자일반상식 | 93 | 0.9%
Other values (51) | 729 | 7.2%

Most occurring characters

ValueCountFrequency (%)
9822
20.8%
7687
16.3%
6932
14.7%
6932
14.7%
0 3510
 
7.4%
1717
 
3.6%
1717
 
3.6%
1 1670
 
3.5%
8 1362
 
2.9%
1183
 
2.5%
Other values (52) 4754
10.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 40163
84.9%
Decimal Number 6991
 
14.8%
Space Separator 99
 
0.2%
Uppercase Letter 33
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
9822
24.5%
7687
19.1%
6932
17.3%
6932
17.3%
1717
 
4.3%
1717
 
4.3%
1183
 
2.9%
949
 
2.4%
535
 
1.3%
512
 
1.3%
Other values (38) 2177
 
5.4%
Decimal Number
ValueCountFrequency (%)
0 3510
50.2%
1 1670
23.9%
8 1362
 
19.5%
2 131
 
1.9%
7 116
 
1.7%
6 77
 
1.1%
9 51
 
0.7%
3 47
 
0.7%
4 20
 
0.3%
5 7
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
R 11
33.3%
T 11
33.3%
H 11
33.3%
Space Separator
ValueCountFrequency (%)
99
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 40163
84.9%
Common 7090
 
15.0%
Latin 33
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
9822
24.5%
7687
19.1%
6932
17.3%
6932
17.3%
1717
 
4.3%
1717
 
4.3%
1183
 
2.9%
949
 
2.4%
535
 
1.3%
512
 
1.3%
Other values (38) 2177
 
5.4%
Common
ValueCountFrequency (%)
0 3510
49.5%
1 1670
23.6%
8 1362
 
19.2%
2 131
 
1.8%
7 116
 
1.6%
99
 
1.4%
6 77
 
1.1%
9 51
 
0.7%
3 47
 
0.7%
4 20
 
0.3%
Latin
ValueCountFrequency (%)
R 11
33.3%
T 11
33.3%
H 11
33.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 40163
84.9%
ASCII 7123
 
15.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
9822
24.5%
7687
19.1%
6932
17.3%
6932
17.3%
1717
 
4.3%
1717
 
4.3%
1183
 
2.9%
949
 
2.4%
535
 
1.3%
512
 
1.3%
Other values (38) 2177
 
5.4%
ASCII
ValueCountFrequency (%)
0 3510
49.3%
1 1670
23.4%
8 1362
 
19.1%
2 131
 
1.8%
7 116
 
1.6%
99
 
1.4%
6 77
 
1.1%
9 51
 
0.7%
3 47
 
0.7%
4 20
 
0.3%
Other values (4) 40
 
0.6%

진료과코드
Categorical

HIGH CORRELATION 

Distinct: 13
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 5
Median length: 5
Mean length: 4.4174
Min length: 2

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 소화기내과
2nd row: 신경과
3rd row: 소화기내과
4th row: 이비인후과
5th row: 비뇨의학과

Common Values

Value | Count | Frequency (%)
정형외과 | 3135 | 31.4%
소화기내과 | 2308 | 23.1%
비뇨의학과 | 1699 | 17.0%
가정의학과 | 1183 | 11.8%
신경과 | 716 | 7.2%
이비인후과 | 334 | 3.3%
신경외과 | 272 | 2.7%
안과 | 96 | 1.0%
내과 | 91 | 0.9%
외과 | 83 | 0.8%
Other values (3) | 83 | 0.8%

Length

Histogram of lengths of the category

대상구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 3
Median length: 2
Mean length: 2.0368
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 본인
2nd row: 본인
3rd row: 본인
4th row: 본인
5th row: 본인

Common Values

Value | Count | Frequency (%)
본인 | 9632 | 96.3%
보호자 | 368 | 3.7%

Length

Histogram of lengths of the category

열량
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 16
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.1632
Minimum: 0
Maximum: 22
Zeros: 7819
Zeros (%): 78.2%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 18
Maximum: 22
Range: 22
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 6.7623471
Coefficient of variation (CV): 2.1378184
Kurtosis: 1.0549356
Mean: 3.1632
Median Absolute Deviation (MAD): 0
Skewness: 1.7355499
Sum: 31632
Variance: 45.729339
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=16)

Value | Count | Frequency (%)
0 | 7819 | 78.2%
18 | 1363 | 13.6%
1 | 368 | 3.7%
17 | 116 | 1.2%
16 | 77 | 0.8%
20 | 56 | 0.6%
19 | 51 | 0.5%
4 | 42 | 0.4%
6 | 41 | 0.4%
21 | 32 | 0.3%
Other values (6) | 35 | 0.4%

Value | Count | Frequency (%)
0 | 7819 | 78.2%
1 | 368 | 3.7%
2 | 12 | 0.1%
4 | 42 | 0.4%
6 | 41 | 0.4%
8 | 3 | < 0.1%
12 | 4 | < 0.1%
14 | 6 | 0.1%
15 | 7 | 0.1%
16 | 77 | 0.8%

Value | Count | Frequency (%)
22 | 3 | < 0.1%
21 | 32 | 0.3%
20 | 56 | 0.6%
19 | 51 | 0.5%
18 | 1363 | 13.6%
17 | 116 | 1.2%
16 | 77 | 0.8%
15 | 7 | 0.1%
14 | 6 | 0.1%
12 | 4 | < 0.1%
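
The descriptive statistics for 열량 can be recomputed from the value counts alone, which is a useful sanity check on a column dominated by zeros. The sketch below merges the three frequency tables above into one dictionary of all 16 distinct values. It assumes the reported variance is the sample variance (divisor n - 1), which is the choice that reproduces the reported figure.

```python
import math

# All 16 distinct values of 열량 with their counts, merged from the tables above.
value_counts = {
    0: 7819, 1: 368, 2: 12, 4: 42, 6: 41, 8: 3, 12: 4, 14: 6,
    15: 7, 16: 77, 17: 116, 18: 1363, 19: 51, 20: 56, 21: 32, 22: 3,
}

n = sum(value_counts.values())
total = sum(value * count for value, count in value_counts.items())
mean = total / n

# Sample variance: squared deviations weighted by count, divided by n - 1.
variance = sum(count * (value - mean) ** 2 for value, count in value_counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

print(n, total, mean, round(variance, 6), round(std, 7), round(cv, 7))
```

The median, quartiles, and MAD are all 0 because more than three quarters of the rows are zero, so the mean and standard deviation are driven almost entirely by the 13.6% of rows with value 18.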

특이사항
Text

MISSING 

Distinct: 417
Distinct (%): 22.1%
Missing: 8109
Missing (%): 81.1%
Memory size: 156.2 KiB

Length

Max length: 30
Median length: 25
Mean length: 9.2379693
Min length: 2

Characters and Unicode

Total characters: 17469
Distinct characters: 296
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 156
Unique (%): 8.2%

Sample

1st row: 잡곡밥
2nd row: 물김치많이
3rd row: 식판수거 부탁드립니다!
4th row: 맵지않게 해주세요
5th row: 간장 제외(죽)

Value | Count | Frequency (%)
주세요 | 457 | 10.9%
잡곡밥 | 209 | 5.0%
많이 | 205 | 4.9%
제외 | 202 | 4.8%
(value missing in source) | 162 | 3.9%
반찬 | 105 | 2.5%
잡곡밥으로 | 99 | 2.4%
포크주세요 | 86 | 2.0%
포크 | 69 | 1.6%
밥많이 | 61 | 1.5%
Other values (400) | 2550 | 60.6%

Most occurring characters

ValueCountFrequency (%)
2730
 
15.6%
865
 
5.0%
840
 
4.8%
831
 
4.8%
640
 
3.7%
480
 
2.7%
396
 
2.3%
388
 
2.2%
, 370
 
2.1%
337
 
1.9%
Other values (286) 9592
54.9%

Most occurring categories

ValueCountFrequency (%)
Other Letter 13551
77.6%
Space Separator 2730
 
15.6%
Other Punctuation 642
 
3.7%
Close Punctuation 149
 
0.9%
Open Punctuation 149
 
0.9%
Decimal Number 70
 
0.4%
Other Symbol 59
 
0.3%
Uppercase Letter 50
 
0.3%
Math Symbol 43
 
0.2%
Lowercase Letter 20
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
865
 
6.4%
840
 
6.2%
831
 
6.1%
640
 
4.7%
480
 
3.5%
396
 
2.9%
388
 
2.9%
337
 
2.5%
337
 
2.5%
335
 
2.5%
Other values (246) 8102
59.8%
Uppercase Letter
ValueCountFrequency (%)
R 10
20.0%
A 9
18.0%
X 6
12.0%
S 6
12.0%
M 6
12.0%
E 4
 
8.0%
V 4
 
8.0%
O 2
 
4.0%
K 1
 
2.0%
T 1
 
2.0%
Other Punctuation
ValueCountFrequency (%)
, 370
57.6%
. 172
26.8%
! 50
 
7.8%
/ 30
 
4.7%
& 15
 
2.3%
* 4
 
0.6%
: 1
 
0.2%
Decimal Number
ValueCountFrequency (%)
0 21
30.0%
1 16
22.9%
2 14
20.0%
7 14
20.0%
5 3
 
4.3%
4 1
 
1.4%
3 1
 
1.4%
Lowercase Letter
ValueCountFrequency (%)
c 5
25.0%
a 5
25.0%
l 5
25.0%
k 4
20.0%
g 1
 
5.0%
Math Symbol
ValueCountFrequency (%)
~ 27
62.8%
+ 15
34.9%
> 1
 
2.3%
Close Punctuation
ValueCountFrequency (%)
) 147
98.7%
] 2
 
1.3%
Open Punctuation
ValueCountFrequency (%)
( 147
98.7%
[ 2
 
1.3%
Space Separator
ValueCountFrequency (%)
2730
100.0%
Other Symbol
ValueCountFrequency (%)
59
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 13551
77.6%
Common 3848
 
22.0%
Latin 70
 
0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
865
 
6.4%
840
 
6.2%
831
 
6.1%
640
 
4.7%
480
 
3.5%
396
 
2.9%
388
 
2.9%
337
 
2.5%
337
 
2.5%
335
 
2.5%
Other values (246) 8102
59.8%
Common
ValueCountFrequency (%)
2730
70.9%
, 370
 
9.6%
. 172
 
4.5%
) 147
 
3.8%
( 147
 
3.8%
59
 
1.5%
! 50
 
1.3%
/ 30
 
0.8%
~ 27
 
0.7%
0 21
 
0.5%
Other values (14) 95
 
2.5%
Latin
ValueCountFrequency (%)
R 10
14.3%
A 9
12.9%
X 6
8.6%
S 6
8.6%
M 6
8.6%
c 5
7.1%
a 5
7.1%
l 5
7.1%
k 4
 
5.7%
E 4
 
5.7%
Other values (6) 10
14.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 13551
77.6%
ASCII 3859
 
22.1%
Misc Symbols 59
 
0.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2730
70.7%
, 370
 
9.6%
. 172
 
4.5%
) 147
 
3.8%
( 147
 
3.8%
! 50
 
1.3%
/ 30
 
0.8%
~ 27
 
0.7%
0 21
 
0.5%
1 16
 
0.4%
Other values (29) 149
 
3.9%
Hangul
ValueCountFrequency (%)
865
 
6.4%
840
 
6.2%
831
 
6.1%
640
 
4.7%
480
 
3.5%
396
 
2.9%
388
 
2.9%
337
 
2.5%
337
 
2.5%
335
 
2.5%
Other values (246) 8102
59.8%
Misc Symbols
ValueCountFrequency (%)
59
100.0%

개시끼니
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 점심
2nd row: 점심
3rd row: 저녁
4th row: 점심
5th row: 점심

Common Values

Value | Count | Frequency (%)
저녁 | 4391 | 43.9%
점심 | 3027 | 30.3%
아침 | 2582 | 25.8%

Length

Histogram of lengths of the category

개시일자
DateTime

Distinct: 498
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2018-09-13 00:00:00
Maximum: 2020-02-21 00:00:00
Histogram with fixed size bins (bins=50)

원무접수일자
DateTime

Distinct: 329
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2018-09-13 00:00:00
Maximum: 2020-02-19 00:00:00
Histogram with fixed size bins (bins=50)

환자급종
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 11
Median length: 4
Mean length: 4.1974
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 건강보험
2nd row: 건강보험
3rd row: 건강보험
4th row: 건강보험
5th row: 건강보험

Common Values

Value | Count | Frequency (%)
건강보험 | 8623 | 86.2%
의료급여1종 | 1078 | 10.8%
일반 | 203 | 2.0%
자동차보험 | 48 | 0.5%
의료급여2종 | 32 | 0.3%
의료급여2종(장애인) | 16 | 0.2%

Length

Histogram of lengths of the category

용량1
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 4
Mean length: 3.9987
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 9987 | 99.9%
400 | 11 | 0.1%
100 | 2 | < 0.1%

Length

Histogram of lengths of the category

병동특이사항
Text

MISSING 

Distinct: 418
Distinct (%): 22.1%
Missing: 8108
Missing (%): 81.1%
Memory size: 156.2 KiB

Length

Max length: 30
Median length: 25
Mean length: 9.2378436
Min length: 2

Characters and Unicode

Total characters: 17478
Distinct characters: 296
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 157
Unique (%): 8.3%

Sample

1st row: 잡곡밥
2nd row: 물김치많이
3rd row: 식판수거 부탁드립니다!
4th row: 맵지않게 해주세요
5th row: 간장 제외(죽)

Value | Count | Frequency (%)
주세요 | 457 | 10.9%
잡곡밥 | 209 | 5.0%
많이 | 206 | 4.9%
제외 | 202 | 4.8%
(value missing in source) | 162 | 3.8%
반찬 | 105 | 2.5%
잡곡밥으로 | 99 | 2.4%
포크주세요 | 86 | 2.0%
포크 | 69 | 1.6%
물김치 | 61 | 1.4%
Other values (400) | 2552 | 60.6%

Most occurring characters

ValueCountFrequency (%)
2732
 
15.6%
865
 
4.9%
840
 
4.8%
831
 
4.8%
640
 
3.7%
480
 
2.7%
396
 
2.3%
389
 
2.2%
, 370
 
2.1%
337
 
1.9%
Other values (286) 9598
54.9%

Most occurring categories

ValueCountFrequency (%)
Other Letter 13558
77.6%
Space Separator 2732
 
15.6%
Other Punctuation 642
 
3.7%
Close Punctuation 149
 
0.9%
Open Punctuation 149
 
0.9%
Decimal Number 70
 
0.4%
Other Symbol 59
 
0.3%
Uppercase Letter 50
 
0.3%
Math Symbol 43
 
0.2%
Lowercase Letter 20
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
865
 
6.4%
840
 
6.2%
831
 
6.1%
640
 
4.7%
480
 
3.5%
396
 
2.9%
389
 
2.9%
337
 
2.5%
337
 
2.5%
336
 
2.5%
Other values (246) 8107
59.8%
Uppercase Letter
ValueCountFrequency (%)
R 10
20.0%
A 9
18.0%
S 6
12.0%
M 6
12.0%
X 6
12.0%
V 4
 
8.0%
E 4
 
8.0%
O 2
 
4.0%
F 1
 
2.0%
T 1
 
2.0%
Other Punctuation
ValueCountFrequency (%)
, 370
57.6%
. 172
26.8%
! 50
 
7.8%
/ 30
 
4.7%
& 15
 
2.3%
* 4
 
0.6%
: 1
 
0.2%
Decimal Number
ValueCountFrequency (%)
0 21
30.0%
1 16
22.9%
2 14
20.0%
7 14
20.0%
5 3
 
4.3%
4 1
 
1.4%
3 1
 
1.4%
Lowercase Letter
ValueCountFrequency (%)
l 5
25.0%
c 5
25.0%
a 5
25.0%
k 4
20.0%
g 1
 
5.0%
Math Symbol
ValueCountFrequency (%)
~ 27
62.8%
+ 15
34.9%
> 1
 
2.3%
Close Punctuation
ValueCountFrequency (%)
) 147
98.7%
] 2
 
1.3%
Open Punctuation
ValueCountFrequency (%)
( 147
98.7%
[ 2
 
1.3%
Space Separator
ValueCountFrequency (%)
2732
100.0%
Other Symbol
ValueCountFrequency (%)
59
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 13558
77.6%
Common 3850
 
22.0%
Latin 70
 
0.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
865
 
6.4%
840
 
6.2%
831
 
6.1%
640
 
4.7%
480
 
3.5%
396
 
2.9%
389
 
2.9%
337
 
2.5%
337
 
2.5%
336
 
2.5%
Other values (246) 8107
59.8%
Common
ValueCountFrequency (%)
2732
71.0%
, 370
 
9.6%
. 172
 
4.5%
) 147
 
3.8%
( 147
 
3.8%
59
 
1.5%
! 50
 
1.3%
/ 30
 
0.8%
~ 27
 
0.7%
0 21
 
0.5%
Other values (14) 95
 
2.5%
Latin
ValueCountFrequency (%)
R 10
14.3%
A 9
12.9%
S 6
8.6%
M 6
8.6%
X 6
8.6%
l 5
7.1%
c 5
7.1%
a 5
7.1%
V 4
 
5.7%
k 4
 
5.7%
Other values (6) 10
14.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 13558
77.6%
ASCII 3861
 
22.1%
Misc Symbols 59
 
0.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2732
70.8%
, 370
 
9.6%
. 172
 
4.5%
) 147
 
3.8%
( 147
 
3.8%
! 50
 
1.3%
/ 30
 
0.8%
~ 27
 
0.7%
0 21
 
0.5%
1 16
 
0.4%
Other values (29) 149
 
3.9%
Hangul
ValueCountFrequency (%)
865
 
6.4%
840
 
6.2%
831
 
6.1%
640
 
4.7%
480
 
3.5%
396
 
2.9%
389
 
2.9%
337
 
2.5%
337
 
2.5%
336
 
2.5%
Other values (246) 8107
59.8%
Misc Symbols
ValueCountFrequency (%)
59
100.0%

Correlations

Variable | 식사범주코드 | 급식끼니 | 식사코드 | 진료과코드 | 대상구분 | 열량 | 개시끼니 | 환자급종 | 용량1
식사범주코드 | 1.000 | 0.032 | 1.000 | 0.278 | 0.565 | 0.761 | 0.203 | 0.152 | NaN
급식끼니 | 0.032 | 1.000 | 0.078 | 0.000 | 0.000 | 0.079 | 0.267 | 0.054 | 0.000
식사코드 | 1.000 | 0.078 | 1.000 | 0.546 | 1.000 | 1.000 | 0.345 | 0.340 | 0.872
진료과코드 | 0.278 | 0.000 | 0.546 | 1.000 | 0.107 | 0.232 | 0.211 | 0.194 | 0.872
대상구분 | 0.565 | 0.000 | 1.000 | 0.107 | 1.000 | 0.087 | 0.041 | 0.053 | NaN
열량 | 0.761 | 0.079 | 1.000 | 0.232 | 0.087 | 1.000 | 0.246 | 0.196 | 0.872
개시끼니 | 0.203 | 0.267 | 0.345 | 0.211 | 0.041 | 0.246 | 1.000 | 0.161 | 0.872
환자급종 | 0.152 | 0.054 | 0.340 | 0.194 | 0.053 | 0.196 | 0.161 | 1.000 | NaN
용량1 | NaN | 0.000 | 0.872 | 0.872 | NaN | 0.872 | 0.872 | NaN | 1.000

Variable | 급식끼니 | 진료과코드 | 개시끼니 | 식사범주코드 | 대상구분 | 환자급종 | 용량1
급식끼니 | 1.000 | 0.000 | 0.087 | 0.019 | 0.000 | 0.022 | 0.000
진료과코드 | 0.000 | 1.000 | 0.121 | 0.118 | 0.099 | 0.097 | 0.671
개시끼니 | 0.087 | 0.121 | 1.000 | 0.123 | 0.069 | 0.067 | 0.671
식사범주코드 | 0.019 | 0.118 | 0.123 | 1.000 | 0.436 | 0.080 | 1.000
대상구분 | 0.000 | 0.099 | 0.069 | 0.436 | 1.000 | 0.038 | 1.000
환자급종 | 0.022 | 0.097 | 0.067 | 0.080 | 0.038 | 1.000 | 1.000
용량1 | 0.000 | 0.671 | 0.671 | 1.000 | 1.000 | 1.000 | 1.000

Variable | 열량 | 식사범주코드 | 급식끼니 | 진료과코드 | 대상구분 | 개시끼니 | 환자급종 | 용량1
열량 | 1.000 | 0.479 | 0.034 | 0.101 | 0.087 | 0.111 | 0.099 | 0.671
식사범주코드 | 0.479 | 1.000 | 0.019 | 0.118 | 0.436 | 0.123 | 0.080 | 1.000
급식끼니 | 0.034 | 0.019 | 1.000 | 0.000 | 0.000 | 0.087 | 0.022 | 0.000
진료과코드 | 0.101 | 0.118 | 0.000 | 1.000 | 0.099 | 0.121 | 0.097 | 0.671
대상구분 | 0.087 | 0.436 | 0.000 | 0.099 | 1.000 | 0.069 | 0.038 | 1.000
개시끼니 | 0.111 | 0.123 | 0.087 | 0.121 | 0.069 | 1.000 | 0.067 | 0.671
환자급종 | 0.099 | 0.080 | 0.022 | 0.097 | 0.038 | 0.067 | 1.000 | 1.000
용량1 | 0.671 | 1.000 | 0.000 | 0.671 | 1.000 | 0.671 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
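
Nullity correlation can be illustrated as the Pearson correlation between two presence indicators (1 where a value exists, 0 where it is missing). The sketch below is a minimal stand-in, not the report's own computation: the helper `pearson` is hypothetical, and it uses only the first ten rows of the Sample section for 특이사항 and 병동특이사항, with `None` marking `<NA>`.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# 특이사항 and 병동특이사항 in the first ten sample rows; None marks <NA>.
special_notes = [None, "잡곡밥", None, None, None, None, None, None, None, None]
ward_notes = [None, "잡곡밥", None, None, None, None, None, None, None, None]

# Presence indicators: 1 if the cell holds a value, 0 if it is missing.
present_special = [0 if v is None else 1 for v in special_notes]
present_ward = [0 if v is None else 1 for v in ward_notes]

nullity_corr = pearson(present_special, present_ward)
print(nullity_corr)
```

On these ten rows the two columns are missing together every time, so the indicators are perfectly correlated. Across the full dataset the missing counts differ by one (8109 versus 8108), so the correlation there would be very close to, but not exactly, 1.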

Sample

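(row index) | 환자번호 | 급식일자 | 식사범주코드 | 급식끼니 | 환자구분 | 식사코드 | 진료과코드 | 대상구분 | 열량 | 특이사항 | 개시끼니 | 개시일자 | 원무접수일자 | 환자급종 | 용량1 | 병동특이사항
50599 | 001**173 | 2019-08-15 | 금식 | 점심 | 입원 | 금식 | 소화기내과 | 본인 | 0 | <NA> | 점심 | 2019-08-15 | 2019-07-22 | 건강보험 | <NA> | <NA>
65410 | 001**652 | 2019-11-09 | 상식 | 아침 | 입원 | 일반상식 | 신경과 | 본인 | 0 | 잡곡밥 | 점심 | 2019-11-05 | 2019-11-04 | 건강보험 | <NA> | 잡곡밥
81494 | 001**843 | 2020-01-02 | 금식 | 점심 | 입원 | 금식 | 소화기내과 | 본인 | 0 | <NA> | 저녁 | 2019-09-29 | 2019-07-29 | 건강보험 | <NA> | <NA>
64435 | 212**424 | 2019-12-04 | 상식 | 아침 | 입원 | 일반상식 | 이비인후과 | 본인 | 0 | <NA> | 점심 | 2019-12-03 | 2019-12-03 | 건강보험 | <NA> | <NA>
42844 | 001**965 | 2019-08-05 | 연식 | 저녁 | 입원 | 일반연식 | 비뇨의학과 | 본인 | 0 | <NA> | 점심 | 2019-08-05 | 2019-08-05 | 건강보험 | <NA> | <NA>
44034 | 001**148 | 2019-09-07 | 상식 | 아침 | 입원 | 일반상식 | 정형외과 | 본인 | 0 | <NA> | 저녁 | 2019-08-27 | 2019-08-27 | 자동차보험 | <NA> | <NA>
45276 | 001**811 | 2019-08-27 | 상식 | 아침 | 입원 | 일반상식 | 정형외과 | 본인 | 0 | <NA> | 점심 | 2019-08-23 | 2019-08-08 | 건강보험 | <NA> | <NA>
31886 | 218**116 | 2019-07-07 | 연식 | 저녁 | 입원 | 일반연식 | 소화기내과 | 본인 | 0 | <NA> | 저녁 | 2019-07-05 | 2019-06-10 | 건강보험 | <NA> | <NA>
62665 | 212**050 | 2019-11-17 | 당뇨식 | 점심 | 입원 | 당뇨상식1800 | 정형외과 | 본인 | 18 | <NA> | 저녁 | 2019-11-15 | 2019-11-08 | 일반 | <NA> | <NA>
38445 | 001**471 | 2019-06-21 | 상식 | 저녁 | 입원 | 일반상식 | 정형외과 | 본인 | 0 | <NA> | 저녁 | 2019-06-13 | 2019-06-11 | 건강보험 | <NA> | <NA>

(row index) | 환자번호 | 급식일자 | 식사범주코드 | 급식끼니 | 환자구분 | 식사코드 | 진료과코드 | 대상구분 | 열량 | 특이사항 | 개시끼니 | 개시일자 | 원무접수일자 | 환자급종 | 용량1 | 병동특이사항
6990 | 232**215 | 2019-01-23 | 저단백식 | 아침 | 입원 | 당뇨신부전상식1800 | 가정의학과 | 본인 | 18 | <NA> | 아침 | 2019-01-23 | 2019-01-22 | 건강보험 | <NA> | <NA>
10623 | 001**355 | 2019-03-30 | 금식 | 점심 | 입원 | 금식 | 소화기내과 | 본인 | 0 | <NA> | 아침 | 2019-03-22 | 2019-03-18 | 건강보험 | <NA> | <NA>
7466 | 001**309 | 2019-02-20 | 상식 | 저녁 | 입원 | 일반상식 | 신경과 | 본인 | 0 | <NA> | 점심 | 2019-02-19 | 2019-02-19 | 건강보험 | <NA> | <NA>
76202 | 001**816 | 2019-12-06 | 상식 | 아침 | 입원 | 일반상식 | 비뇨의학과 | 본인 | 0 | <NA> | 저녁 | 2019-12-05 | 2019-12-05 | 건강보험 | <NA> | <NA>
15787 | 130**935 | 2019-04-22 | 상식 | 점심 | 입원 | 일반상식 | 소화기내과 | 본인 | 0 | <NA> | 점심 | 2019-04-01 | 2019-04-01 | 건강보험 | <NA> | <NA>
27881 | 212**581 | 2019-05-18 | 당뇨식 | 저녁 | 입원 | 당뇨상식2100 | 소화기내과 | 본인 | 21 | <NA> | 저녁 | 2019-05-17 | 2019-05-17 | 의료급여1종 | <NA> | <NA>
84877 | 212**069 | 2020-01-24 | 금식 | 저녁 | 입원 | 금식 | 신경과 | 본인 | 0 | <NA> | 저녁 | 2020-01-24 | 2020-01-07 | 건강보험 | <NA> | <NA>
4243 | 001**719 | 2018-12-22 | 저단백식 | 아침 | 입원 | 신부전상식 | 내과 | 본인 | 0 | <NA> | 저녁 | 2018-12-20 | 2018-12-18 | 건강보험 | <NA> | <NA>
83091 | 001**136 | 2020-01-04 | 상식 | 점심 | 입원 | 일반상식 | 정형외과 | 본인 | 0 | <NA> | 저녁 | 2019-12-27 | 2019-12-24 | 건강보험 | <NA> | <NA>
70568 | 211**955 | 2019-11-19 | 상식 | 아침 | 입원 | 일반상식 | 정형외과 | 본인 | 0 | <NA> | 저녁 | 2019-11-07 | 2019-10-29 | 건강보험 | <NA> | <NA>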