Overview

Dataset statistics

Number of variables: 4
Number of observations: 31
Missing cells: 1
Missing cells (%): 0.8%
Duplicate rows: 1
Duplicate rows (%): 3.2%
Total size in memory: 1.1 KiB
Average record size in memory: 36.3 B
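
The percentages above follow from the counts. A minimal sketch of the arithmetic, assuming that missing cells are measured against all cells (rows × columns) and duplicate rows against all rows; the variable names are illustrative and the memory figure is the already-rounded value from the report:

```python
# Counts as reported in the dataset statistics.
n_variables = 4
n_observations = 31
missing_cells = 1
duplicate_rows = 1
total_bytes = 1.1 * 1024  # "1.1 KiB" as reported (rounded, so only approximate)

total_cells = n_variables * n_observations            # 124 cells in total
missing_pct = 100 * missing_cells / total_cells       # share of missing cells
duplicate_pct = 100 * duplicate_rows / n_observations # share of duplicate rows
avg_record_bytes = total_bytes / n_observations       # approximate bytes per row

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```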

Variable types

Categorical: 2
Text: 2

Dataset

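Description: Pathology test data for Buk-gu, Ulsan Metropolitan City (울산광역시 북구), providing fields such as the pathology test list, test item, reference value (unit), and processing deadline.
Author: Buk-gu, Ulsan Metropolitan City (울산광역시 북구)
URL: https://www.data.go.kr/data/3076002/fileData.do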

Alerts

Dataset has 1 (3.2%) duplicate rows [Duplicates]
구 분 is highly overall correlated with 처리기한 [High correlation]
처리기한 is highly overall correlated with 구 분 [High correlation]
참고치 (단위) has 1 (3.2%) missing values [Missing]

Reproduction

Analysis started: 2023-12-12 14:55:48.268583
Analysis finished: 2023-12-12 14:55:48.668622
Duration: 0.4 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
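
A report of this kind can be regenerated with ydata-profiling. The sketch below is a minimal example, not the exact invocation used for this report: the function name `build_report`, the file paths, the report title, and the `cp949` encoding are all assumptions (the source file's real encoding is not stated here), and the settings stored in `config.json` are not reproduced.

```python
def build_report(csv_path, encoding="cp949", out_path="report.html"):
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so that defining this function needs no third-party packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is an assumption; adjust it to match the downloaded file.
    df = pd.read_csv(csv_path, encoding=encoding)

    # Build the profile with default settings and save it as HTML.
    report = ProfileReport(df, title="Ulsan Buk-gu pathology tests")
    report.to_file(out_path)
    return report
```

Calling `build_report("pathology_tests.csv")` with a hypothetical local copy of the dataset would write `report.html` next to the script.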

Variables

구 분 [category]
Categorical

HIGH CORRELATION 

Distinct: 13
Distinct (%): 41.9%
Missing: 0
Missing (%): 0.0%
Memory size: 380.0 B

[Bar chart of the most frequent categories: 간기능검사, 고지혈증검사, 소변검사, 신장기능검사, 빈혈검사; Other values (8): 11]

Length

Max length: 13
Median length: 6
Mean length: 6.0322581
Min length: 4

Unique

Unique: 5
Unique (%): 16.1%

Sample

1st row: 당뇨검사
2nd row: 당뇨정밀검사
3rd row: 빈혈검사
4th row: 빈혈검사
5th row: 소변검사

Common Values

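Value | Count | Frequency (%)
간기능검사 [liver function test] | 7 | 22.6%
고지혈증검사 [hyperlipidemia test] | 5 | 16.1%
소변검사 [urinalysis] | 3 | 9.7%
신장기능검사 [kidney function test] | 3 | 9.7%
빈혈검사 [anemia test] | 2 | 6.5%
통풍검사 [gout test] | 2 | 6.5%
B형 간염검사(정밀검사) [hepatitis B test (detailed)] | 2 | 6.5%
C형 간염검사(정밀검사) [hepatitis C test (detailed)] | 2 | 6.5%
당뇨검사 [diabetes test] | 1 | 3.2%
당뇨정밀검사 [detailed diabetes test] | 1 | 3.2%
Other values (3) | 3 | 9.7%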
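
Each frequency in this table is the value's count divided by the 31 observations. A small sketch that recomputes the percentages from the reported counts; the three values grouped as "Other values (3)" are not listed individually in this table, so they are kept as a single bucket:

```python
from collections import Counter

# Counts copied from the Common Values table for 구 분.
counts = Counter({
    "간기능검사": 7,
    "고지혈증검사": 5,
    "소변검사": 3,
    "신장기능검사": 3,
    "빈혈검사": 2,
    "통풍검사": 2,
    "B형 간염검사(정밀검사)": 2,
    "C형 간염검사(정밀검사)": 2,
    "당뇨검사": 1,
    "당뇨정밀검사": 1,
    "Other values (3)": 3,  # bucket, not a real category
})

total = sum(counts.values())  # number of observations

# Frequency (%) = count / total, rounded to one decimal as in the report.
frequencies = {value: round(100 * n / total, 1) for value, n in counts.items()}

for value, n in counts.most_common():
    print(f"{value} | {n} | {frequencies[value]}%")
```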

Length

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
간기능검사 | 7 | 20.0%
고지혈증검사 | 5 | 14.3%
간염검사(정밀검사 | 4 | 11.4%
소변검사 | 3 | 8.6%
신장기능검사 | 3 | 8.6%
빈혈검사 | 2 | 5.7%
통풍검사 | 2 | 5.7%
b형 | 2 | 5.7%
c형 | 2 | 5.7%
당뇨검사 | 1 | 2.9%
Other values (4) | 4 | 11.4%

검사항목 [test item]
Text

Distinct: 23
Distinct (%): 74.2%
Missing: 0
Missing (%): 0.0%
Memory size: 380.0 B

Length

Max length: 15
Median length: 11
Mean length: 8.1935484
Min length: 2

Characters and Unicode

Total characters: 254
Distinct characters: 56
Distinct categories: 9
Distinct scripts: 4
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
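
As an illustration of such character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It runs only on four sample values of 검사항목 shown in this report, so its totals differ from the full-column figures; the function name `category_counts` is illustrative. Script and block lookups are not part of `unicodedata` and are left out.

```python
import unicodedata
from collections import Counter

# Four sample values of 검사항목 taken from this report (not the full column).
samples = ["Glucose (혈당)", "HbA1C", "Hb (헤모글로빈)", "3종"]


def category_counts(values):
    """Count Unicode general categories (e.g. 'Lu', 'Lo') over all characters."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


counts = category_counts(samples)

# 'Ll' = Lowercase Letter, 'Lu' = Uppercase Letter, 'Lo' = Other Letter (Hangul here),
# 'Nd' = Decimal Number, 'Ps'/'Pe' = Open/Close Punctuation, 'Zs' = Space Separator.
for category, n in counts.most_common():
    print(category, n)
```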

Unique

Unique: 15
Unique (%): 48.4%

Sample

1st row: Glucose (혈당)
2nd row: HbA1C
3rd row: Hb (헤모글로빈)
4th row: Hb (헤모글로빈)
5th row: 3종

Words

Value | Count | Frequency (%)
creatinine | 2 | 5.1%
acid | 2 | 5.1%
hcv(eia | 2 | 5.1%
hb | 2 | 5.1%
hdl-cholesterol | 2 | 5.1%
헤모글로빈 | 2 | 5.1%
uric | 2 | 5.1%
γ-gtp | 2 | 5.1%
alt | 2 | 5.1%
ast | 2 | 5.1%
Other values (18) | 19 | 48.7%

Most occurring characters

Value | Count | Frequency (%)
e | 15 | 5.9%
A | 15 | 5.9%
i | 13 | 5.1%
C | 11 | 4.3%
r | 11 | 4.3%
l | 11 | 4.3%
H | 10 | 3.9%
( | 9 | 3.5%
) | 9 | 3.5%
o | 9 | 3.5%
Other values (46) | 141 | 55.5%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 105 | 41.3%
Uppercase Letter | 88 | 34.6%
Other Letter | 22 | 8.7%
Open Punctuation | 9 | 3.5%
Close Punctuation | 9 | 3.5%
Dash Punctuation | 8 | 3.1%
Space Separator | 8 | 3.1%
Decimal Number | 4 | 1.6%
Other Punctuation | 1 | 0.4%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 15 | 14.3%
i | 13 | 12.4%
r | 11 | 10.5%
l | 11 | 10.5%
o | 9 | 8.6%
t | 7 | 6.7%
s | 7 | 6.7%
c | 6 | 5.7%
n | 5 | 4.8%
b | 5 | 4.8%
Other values (7) | 16 | 15.2%

Uppercase Letter
Value | Count | Frequency (%)
A | 15 | 17.0%
C | 11 | 12.5%
H | 10 | 11.4%
T | 8 | 9.1%
I | 6 | 6.8%
L | 6 | 6.8%
B | 5 | 5.7%
P | 5 | 5.7%
E | 5 | 5.7%
U | 3 | 3.4%
Other values (6) | 14 | 15.9%

Other Letter
Value | Count | Frequency (%)
(not preserved) | 2 | 9.1%
(not preserved) | 2 | 9.1%
(not preserved) | 2 | 9.1%
(not preserved) | 2 | 9.1%
(not preserved) | 2 | 9.1%
(not preserved) | 2 | 9.1%
(not preserved) | 2 | 9.1%
(not preserved) | 1 | 4.5%
(not preserved) | 1 | 4.5%
(not preserved) | 1 | 4.5%
Other values (5) | 5 | 22.7%

Decimal Number
Value | Count | Frequency (%)
1 | 2 | 50.0%
3 | 1 | 25.0%
0 | 1 | 25.0%

Open Punctuation
Value | Count | Frequency (%)
( | 9 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 9 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 8 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 8 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 191 | 75.2%
Common | 39 | 15.4%
Hangul | 22 | 8.7%
Greek | 2 | 0.8%

Most frequent character per script

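Latin
Value | Count | Frequency (%)
e | 15 | 7.9%
A | 15 | 7.9%
i | 13 | 6.8%
C | 11 | 5.8%
r | 11 | 5.8%
l | 11 | 5.8%
H | 10 | 5.2%
o | 9 | 4.7%
T | 8 | 4.2%
t | 7 | 3.7%
Other values (22) | 81 | 42.4%

Hangul
Value | Count | Frequency (%)
(not preserved) | 2 | 9.1%
(not preserved) | 2 | 9.1%
(not preserved) | 2 | 9.1%
(not preserved) | 2 | 9.1%
(not preserved) | 2 | 9.1%
(not preserved) | 2 | 9.1%
(not preserved) | 2 | 9.1%
(not preserved) | 1 | 4.5%
(not preserved) | 1 | 4.5%
(not preserved) | 1 | 4.5%
Other values (5) | 5 | 22.7%

Common
Value | Count | Frequency (%)
( | 9 | 23.1%
) | 9 | 23.1%
- | 8 | 20.5%
(space) | 8 | 20.5%
1 | 2 | 5.1%
3 | 1 | 2.6%
, | 1 | 2.6%
0 | 1 | 2.6%

Greek
Value | Count | Frequency (%)
γ | 2 | 100.0%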

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 230 | 90.6%
Hangul | 22 | 8.7%
None | 2 | 0.8%

Most frequent character per block

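ASCII
Value | Count | Frequency (%)
e | 15 | 6.5%
A | 15 | 6.5%
i | 13 | 5.7%
C | 11 | 4.8%
r | 11 | 4.8%
l | 11 | 4.8%
H | 10 | 4.3%
( | 9 | 3.9%
) | 9 | 3.9%
o | 9 | 3.9%
Other values (30) | 117 | 50.9%

Hangul
Value | Count | Frequency (%)
(not preserved) | 2 | 9.1%
(not preserved) | 2 | 9.1%
(not preserved) | 2 | 9.1%
(not preserved) | 2 | 9.1%
(not preserved) | 2 | 9.1%
(not preserved) | 2 | 9.1%
(not preserved) | 2 | 9.1%
(not preserved) | 1 | 4.5%
(not preserved) | 1 | 4.5%
(not preserved) | 1 | 4.5%
Other values (5) | 5 | 22.7%

None
Value | Count | Frequency (%)
γ | 2 | 100.0%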

참고치 (단위) [reference value (unit)]
Text

MISSING

Distinct: 24
Distinct (%): 80.0%
Missing: 1
Missing (%): 3.2%
Memory size: 380.0 B

Length

Max length: 20
Median length: 16.5
Mean length: 13.333333
Min length: 2

Characters and Unicode

Total characters: 400
Distinct characters: 33
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 20
Unique (%): 66.7%

Sample

1st row: 70 ~ 100 (mg/dL) 공복
2nd row: 3.5 ~ 5.9이하 (mg/dL)
3rd row: 남 : 13이상(mg/dL)
4th row: 여 : 12이상(mg/dL)
5th row: 불검출

Words

Value | Count | Frequency (%)
(not preserved) | 16 | 20.0%
mg/dl | 14 | 17.5%
여성 | 6 | 7.5%
남성 | 6 | 7.5%
음성(1.0이하 | 3 | 3.8%
불검출 | 3 | 3.8%
200이하 | 2 | 2.5%
31이하(mg/dl | 2 | 2.5%
음성 | 2 | 2.5%
5.7이하 | 1 | 1.2%
Other values (25) | 25 | 31.2%

Most occurring characters

Value | Count | Frequency (%)
(space) | 52 | 13.0%
) | 26 | 6.5%
( | 26 | 6.5%
d | 21 | 5.2%
g | 21 | 5.2%
m | 21 | 5.2%
L | 21 | 5.2%
/ | 21 | 5.2%
(not preserved) | 20 | 5.0%
0 | 19 | 4.8%
Other values (23) | 152 | 38.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 89 | 22.2%
Decimal Number | 68 | 17.0%
Lowercase Letter | 63 | 15.8%
Space Separator | 52 | 13.0%
Other Punctuation | 48 | 12.0%
Close Punctuation | 26 | 6.5%
Open Punctuation | 26 | 6.5%
Uppercase Letter | 21 | 5.2%
Math Symbol | 7 | 1.8%

Most frequent character per category

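Other Letter
Value | Count | Frequency (%)
(not preserved) | 20 | 22.5%
(not preserved) | 18 | 20.2%
(not preserved) | 15 | 16.9%
(not preserved) | 7 | 7.9%
(not preserved) | 7 | 7.9%
(not preserved) | 5 | 5.6%
(not preserved) | 5 | 5.6%
(not preserved) | 3 | 3.4%
(not preserved) | 3 | 3.4%
(not preserved) | 3 | 3.4%
Other values (3) | 3 | 3.4%

Decimal Number
Value | Count | Frequency (%)
0 | 19 | 27.9%
1 | 16 | 23.5%
3 | 7 | 10.3%
5 | 7 | 10.3%
2 | 6 | 8.8%
7 | 6 | 8.8%
9 | 3 | 4.4%
4 | 2 | 2.9%
6 | 2 | 2.9%

Lowercase Letter
Value | Count | Frequency (%)
d | 21 | 33.3%
g | 21 | 33.3%
m | 21 | 33.3%

Other Punctuation
Value | Count | Frequency (%)
/ | 21 | 43.8%
: | 14 | 29.2%
. | 13 | 27.1%

Space Separator
Value | Count | Frequency (%)
(space) | 52 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 26 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 26 | 100.0%

Uppercase Letter
Value | Count | Frequency (%)
L | 21 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 7 | 100.0%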

Most occurring scripts

Value | Count | Frequency (%)
Common | 227 | 56.8%
Hangul | 89 | 22.2%
Latin | 84 | 21.0%

Most frequent character per script

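Common
Value | Count | Frequency (%)
(space) | 52 | 22.9%
) | 26 | 11.5%
( | 26 | 11.5%
/ | 21 | 9.3%
0 | 19 | 8.4%
1 | 16 | 7.0%
: | 14 | 6.2%
. | 13 | 5.7%
3 | 7 | 3.1%
~ | 7 | 3.1%
Other values (6) | 26 | 11.5%

Hangul
Value | Count | Frequency (%)
(not preserved) | 20 | 22.5%
(not preserved) | 18 | 20.2%
(not preserved) | 15 | 16.9%
(not preserved) | 7 | 7.9%
(not preserved) | 7 | 7.9%
(not preserved) | 5 | 5.6%
(not preserved) | 5 | 5.6%
(not preserved) | 3 | 3.4%
(not preserved) | 3 | 3.4%
(not preserved) | 3 | 3.4%
Other values (3) | 3 | 3.4%

Latin
Value | Count | Frequency (%)
d | 21 | 25.0%
g | 21 | 25.0%
m | 21 | 25.0%
L | 21 | 25.0%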

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 311 | 77.8%
Hangul | 89 | 22.2%

Most frequent character per block

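ASCII
Value | Count | Frequency (%)
(space) | 52 | 16.7%
) | 26 | 8.4%
( | 26 | 8.4%
d | 21 | 6.8%
g | 21 | 6.8%
m | 21 | 6.8%
L | 21 | 6.8%
/ | 21 | 6.8%
0 | 19 | 6.1%
1 | 16 | 5.1%
Other values (10) | 67 | 21.5%

Hangul
Value | Count | Frequency (%)
(not preserved) | 20 | 22.5%
(not preserved) | 18 | 20.2%
(not preserved) | 15 | 16.9%
(not preserved) | 7 | 7.9%
(not preserved) | 7 | 7.9%
(not preserved) | 5 | 5.6%
(not preserved) | 5 | 5.6%
(not preserved) | 3 | 3.4%
(not preserved) | 3 | 3.4%
(not preserved) | 3 | 3.4%
Other values (3) | 3 | 3.4%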

처리기한 [processing deadline]
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 9.7%
Missing: 0
Missing (%): 0.0%
Memory size: 380.0 B

[Bar chart of category frequencies: 3일, 즉시, 2일]

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 3.2%

Sample

1st row: 즉시
2nd row: 3일
3rd row: 즉시
4th row: 즉시
5th row: 즉시

Common Values

Value | Count | Frequency (%)
3일 [3 days] | 24 | 77.4%
즉시 [immediately] | 6 | 19.4%
2일 [2 days] | 1 | 3.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
3일 [3 days] | 24 | 77.4%
즉시 [immediately] | 6 | 19.4%
2일 [2 days] | 1 | 3.2%

Correlations

Variable | 구 분 | 검사항목 | 참고치 (단위) | 처리기한
구 분 | 1.000 | 1.000 | 0.985 | 1.000
검사항목 | 1.000 | 1.000 | 0.738 | 1.000
참고치 (단위) | 0.985 | 0.738 | 1.000 | 1.000
처리기한 | 1.000 | 1.000 | 1.000 | 1.000

Variable | 처리기한 | 구 분
처리기한 | 1.000 | 0.802
구 분 | 0.802 | 1.000

Variable | 구 분 | 처리기한
구 분 | 1.000 | 0.802
처리기한 | 0.802 | 1.000
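
The matrices above do not name the association measure behind them. One common choice for two categorical columns is Cramér's V; the sketch below implements the plain, uncorrected statistic as an assumption, so its output need not match the 0.802 shown for 구 분 and 처리기한, which was computed on all 31 rows and possibly with a bias correction. The function name `cramers_v` is illustrative, and the example uses only the five sample rows listed for each column.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # observed count of each (x, y) pair
    row = Counter(xs)             # marginal counts for the first variable
    col = Counter(ys)             # marginal counts for the second variable

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x in row:
        for y in col:
            expected = row[x] * col[y] / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row), len(col)) - 1
    if k == 0:
        return 0.0  # convention chosen here: a constant column shows no association
    return math.sqrt(chi2 / (n * k))


# First five sample rows of 구 분 and 처리기한 from this report.
category = ["당뇨검사", "당뇨정밀검사", "빈혈검사", "빈혈검사", "소변검사"]
deadline = ["즉시", "3일", "즉시", "즉시", "즉시"]
print(f"{cramers_v(category, deadline):.3f}")
```

A value near 1 means one column almost determines the other, and a value near 0 means the two are close to independent.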

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 구 분 | 검사항목 | 참고치 (단위) | 처리기한
0 | 당뇨검사 | Glucose (혈당) | 70 ~ 100 (mg/dL) 공복 | 즉시
1 | 당뇨정밀검사 | HbA1C | 3.5 ~ 5.9이하 (mg/dL) | 3일
2 | 빈혈검사 | Hb (헤모글로빈) | 남 : 13이상(mg/dL) | 즉시
3 | 빈혈검사 | Hb (헤모글로빈) | 여 : 12이상(mg/dL) | 즉시
4 | 소변검사 | 3종 | 불검출 | 즉시
5 | 소변검사 | 10종 | 불검출 | 즉시
6 | 소변검사 | 요침사(현미경검경) | 불검출 | 즉시
7 | 간기능검사 | AST | 남성 : 37이하(mg/dL) | 3일
8 | 간기능검사 | AST | 여성 : 31이하(mg/dL) | 3일
9 | 간기능검사 | ALT | 남성 : 41이하(mg/dL) | 3일

Last rows

Index | 구 분 | 검사항목 | 참고치 (단위) | 처리기한
21 | 고지혈증검사 | LDL-Cholesterol | 130이하(mg/dL) | 3일
22 | 통풍검사 | Uric Acid | 남성 : 7.0이하 (mg/dL) | 3일
23 | 통풍검사 | Uric Acid | 여성 : 5.7이하 (mg/dL) | 3일
24 | B형 간염검사(정밀검사) | HBs-Ag (EIA) | 음성 (1.0이하) | 3일
25 | B형 간염검사(정밀검사) | HBs-Ab (EIA) | 양성 (10이상) | 3일
26 | C형 간염검사(정밀검사) | HCV(EIA) | 음성(1.0이하) | 3일
27 | C형 간염검사(정밀검사) | HCV(EIA) | 음성(1.0이하) | 3일
28 | 매독검사 | RPR, TPPA | 음성 | 3일
29 | 에이즈검사 | HIV(EIA) | 음성(1.0이하) | 3일
30 | 혈액학검사 | CBC | <NA> | 2일

Duplicate rows

Most frequently occurring

Index | 구 분 | 검사항목 | 참고치 (단위) | 처리기한 | # duplicates
0 | C형 간염검사(정밀검사) | HCV(EIA) | 음성(1.0이하) | 3일 | 2
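
The "# duplicates" value of 2 counts how many times the row occurs, while the overview reports 1 duplicate row, which fits counting only the repeats beyond the first occurrence. A minimal sketch of that reading, using rows 26 to 29 from the Sample section; this interpretation and the variable names are assumptions:

```python
from collections import Counter

# Rows 26-29 as (구 분, 검사항목, 참고치 (단위), 처리기한) tuples.
rows = [
    ("C형 간염검사(정밀검사)", "HCV(EIA)", "음성(1.0이하)", "3일"),
    ("C형 간염검사(정밀검사)", "HCV(EIA)", "음성(1.0이하)", "3일"),
    ("매독검사", "RPR, TPPA", "음성", "3일"),
    ("에이즈검사", "HIV(EIA)", "음성(1.0이하)", "3일"),
]

row_counts = Counter(rows)

# Rows that occur more than once, with their total number of occurrences.
duplicates = {row: n for row, n in row_counts.items() if n > 1}

# Repeats beyond the first occurrence of each duplicated row.
extra_rows = sum(n - 1 for n in duplicates.values())

for row, n in duplicates.items():
    print(" | ".join(row), "|", n)
print("Duplicate rows:", extra_rows)
```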