Overview

Dataset statistics

Number of variables: 9
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 810.5 KiB
Average record size in memory: 83.0 B
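
The average record size follows from the total memory size divided by the number of observations. A minimal Python sketch of that arithmetic, assuming the report's KiB means 1024 bytes (the variable names are illustrative and not part of the report):

```python
# Values copied from the dataset statistics in this report.
total_size_kib = 810.5
n_observations = 10_000

# Assumption: 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B")
```

Under that assumption the result rounds to the 83.0 B shown in the statistics.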

Variable types

Categorical: 5
Numeric: 3
Boolean: 1

Dataset

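Description: The Ministry of Health and Welfare provides information on organ transplant recipients from 2017 to 2022: sex (성별), age (연령), blood type (혈액형), province or city (시도), transplant year (이식연도), donation type (기증형태), donated organ (기증장기), re-transplant status (재이식여부), and number of cases (건수).
Author: Ministry of Health and Welfare (보건복지부)
URL: https://www.data.go.kr/data/15123384/fileData.do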

Alerts

재이식여부 is highly imbalanced (63.9%) [Imbalance]
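
The 63.9% figure is consistent with an entropy-based imbalance score: one minus the Shannon entropy of the class distribution divided by its maximum possible value. The sketch below is an assumption about how the number can be reproduced, not a quotation of the ydata-profiling source. `imbalance_score` is an illustrative helper, and the counts are the False/True counts this report lists for 재이식여부.

```python
import math


def imbalance_score(counts):
    """Return 1 - H(p) / log2(k) for a list of class counts.

    0 means perfectly balanced; values near 1 mean one class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # Assumed convention: a single class gets a score of 0.
        return 0.0
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(k)


# False / True counts reported for 재이식여부.
score = imbalance_score([9314, 686])
print(f"{score:.1%}")
```

With these counts the score comes out at about 0.639, matching the alert.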

Reproduction

Analysis started: 2023-12-23 07:51:40.050186
Analysis finished: 2023-12-23 07:51:51.285293
Duration: 11.24 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

성별
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

남자: 6091
여자: 3909

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 남자
2nd row: 남자
3rd row: 남자
4th row: 남자
5th row: 여자

Common Values

Value  Count  Frequency (%)
남자   6091   60.9%
여자   3909   39.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

연령
Real number (ℝ)

Distinct: 81
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 49.6637
Minimum: 0
Maximum: 80
Zeros: 37
Zeros (%): 0.4%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5th percentile: 21
Q1: 42
Median: 52
Q3: 60
95th percentile: 69
Maximum: 80
Range: 80
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 14.418654
Coefficient of variation (CV): 0.29032582
Kurtosis: 1.0099635
Mean: 49.6637
Median Absolute Deviation (MAD): 9
Skewness: -0.96410479
Sum: 496637
Variance: 207.89759
Monotonicity: Not monotonic
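
Several of the figures for 연령 are linked by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, the range is maximum minus minimum, and the IQR is Q3 minus Q1. A small Python check using the reported values (the names are illustrative; small differences come from rounding in the report):

```python
# Values copied from the 연령 statistics in this report.
mean, std = 49.6637, 14.418654
minimum, maximum = 0, 80
q1, q3 = 42, 60

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range

print(round(cv, 6), round(variance, 3), value_range, iqr)
```

The same identities hold for the other numeric variables in the report, so they make a quick consistency check when copying figures out of it.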
[Figure: histogram with fixed size bins (bins=50)]

Common Values

Value              Count  Frequency (%)
57                 338    3.4%
55                 336    3.4%
56                 321    3.2%
59                 320    3.2%
58                 314    3.1%
54                 313    3.1%
53                 300    3.0%
60                 299    3.0%
50                 298    3.0%
61                 288    2.9%
Other values (71)  6873   68.7%

Smallest values

Value  Count  Frequency (%)
0      37     0.4%
1      34     0.3%
2      19     0.2%
3      16     0.2%
4      14     0.1%
5      20     0.2%
6      9      0.1%
7      14     0.1%
8      15     0.1%
9      13     0.1%

Largest values

Value  Count  Frequency (%)
80     1      < 0.1%
79     2      < 0.1%
78     3      < 0.1%
77     7      0.1%
76     14     0.1%
75     28     0.3%
74     29     0.3%
73     58     0.6%
72     71     0.7%
71     68     0.7%

혈액형
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

A: 3151
B: 2758
O: 2455
AB: 1636

Length

Max length: 2
Median length: 1
Mean length: 1.1636
Min length: 1
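
The mean length is the count-weighted average of the category string lengths. A short Python sketch using the 혈액형 counts from this report (the dictionary is only an illustrative container):

```python
# Category -> count for 혈액형, copied from this report.
counts = {"A": 3151, "B": 2758, "O": 2455, "AB": 1636}

total = sum(counts.values())
# Weight each label's character length by how often it occurs.
mean_length = sum(len(label) * n for label, n in counts.items()) / total

print(total, mean_length)
```

Only AB has two characters, so the mean is 1 + 1636 / 10000, which is the 1.1636 reported for this variable.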

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: AB
2nd row: O
3rd row: O
4th row: A
5th row: A

Common Values

Value  Count  Frequency (%)
A      3151   31.5%
B      2758   27.6%
O      2455   24.6%
AB     1636   16.4%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

시도
Categorical

Distinct: 16
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

서울: 4954
경기: 1242
대구: 944
부산: 840
경남: 549
Other values (11): 1471

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강원
2nd row: 인천
3rd row: 부산
4th row: 서울
5th row: 서울

Common Values

Value             Count  Frequency (%)
서울              4954   49.5%
경기              1242   12.4%
대구              944    9.4%
부산              840    8.4%
경남              549    5.5%
인천              338    3.4%
광주              275    2.8%
울산              208    2.1%
대전              195    1.9%
전북              169    1.7%
Other values (6)  286    2.9%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

이식연
Real number (ℝ)

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2019.4629
Minimum: 2017
Maximum: 2022
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2017
5th percentile: 2017
Q1: 2018
Median: 2019
Q3: 2021
95th percentile: 2022
Maximum: 2022
Range: 5
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.6851432
Coefficient of variation (CV): 0.00083445117
Kurtosis: -1.2324403
Mean: 2019.4629
Median Absolute Deviation (MAD): 1
Skewness: 0.022994713
Sum: 20194629
Variance: 2.8397076
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=6)]

Common Values

Value  Count  Frequency (%)
2019   1786   17.9%
2020   1692   16.9%
2017   1674   16.7%
2021   1655   16.6%
2018   1651   16.5%
2022   1542   15.4%

Smallest values

Value  Count  Frequency (%)
2017   1674   16.7%
2018   1651   16.5%
2019   1786   17.9%
2020   1692   16.9%
2021   1655   16.6%
2022   1542   15.4%

Largest values

Value  Count  Frequency (%)
2022   1542   15.4%
2021   1655   16.6%
2020   1692   16.9%
2019   1786   17.9%
2018   1651   16.5%
2017   1674   16.7%

기증유형
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

생존: 5057
뇌사: 4940
NHBD: 3

Length

Max length: 4
Median length: 2
Mean length: 2.0006
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 생존
2nd row: 생존
3rd row: 생존
4th row: 생존
5th row: 뇌사

Common Values

Value  Count  Frequency (%)
생존   5057   50.6%
뇌사   4940   49.4%
NHBD   3      < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

장기
Categorical

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

신장: 5590
간장: 3198
심장: 590
폐장: 441
췌장: 175
Other values (3): 6

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 간장
2nd row: 간장
3rd row: 신장
4th row: 신장
5th row: 신장

Common Values

Value  Count  Frequency (%)
신장   5590   55.9%
간장   3198   32.0%
심장   590    5.9%
폐장   441    4.4%
췌장   175    1.8%
소장   3      < 0.1%
팔(    2      < 0.1%
췌도   1      < 0.1%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

재이식여부
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 87.9 KiB

Value  Count  Frequency (%)
False  9314   93.1%
True   686    6.9%

건수
Real number (ℝ)

Distinct: 16
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.5745
Minimum: 1
Maximum: 16
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 1
Q3: 2
95th percentile: 4
Maximum: 16
Range: 15
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.4002306
Coefficient of variation (CV): 0.88931764
Kurtosis: 22.334267
Mean: 1.5745
Median Absolute Deviation (MAD): 0
Skewness: 4.0960163
Sum: 15745
Variance: 1.9606458
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=16)]

Common Values

Value             Count  Frequency (%)
1                 7360   73.6%
2                 1467   14.7%
3                 492    4.9%
4                 246    2.5%
5                 145    1.5%
6                 112    1.1%
7                 62     0.6%
8                 35     0.4%
9                 27     0.3%
10                18     0.2%
Other values (6)  36     0.4%

Smallest values

Value  Count  Frequency (%)
1      7360   73.6%
2      1467   14.7%
3      492    4.9%
4      246    2.5%
5      145    1.5%
6      112    1.1%
7      62     0.6%
8      35     0.4%
9      27     0.3%
10     18     0.2%

Largest values

Value  Count  Frequency (%)
16     2      < 0.1%
15     2      < 0.1%
14     8      0.1%
13     5      0.1%
12     9      0.1%
11     10     0.1%
10     18     0.2%
9      27     0.3%
8      35     0.4%
7      62     0.6%
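
Because the frequency tables for 건수 together cover all 16 distinct values, its summary statistics can be recomputed from the counts alone. The sketch below does this with plain Python. The n - 1 denominator for the variance is an assumption, chosen because it reproduces the reported figure; the variable names are illustrative.

```python
# Value -> count for 건수, combined from the frequency tables in this report.
freq = {1: 7360, 2: 1467, 3: 492, 4: 246, 5: 145, 6: 112, 7: 62, 8: 35,
        9: 27, 10: 18, 11: 10, 12: 9, 13: 5, 14: 8, 15: 2, 16: 2}

n = sum(freq.values())                            # number of observations
total = sum(value * count for value, count in freq.items())
mean = total / n

# Sample variance with the n - 1 denominator (assumed convention).
squared_dev = sum(count * (value - mean) ** 2 for value, count in freq.items())
variance = squared_dev / (n - 1)
std = variance ** 0.5

print(n, total, mean, round(variance, 7), round(std, 7))
```

The recomputed count, sum, mean, variance, and standard deviation agree with the reported values to the precision shown in the report.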

Interactions

[Figure: nine interaction plots; no captions or axis labels were recoverable]

Correlations

[Figure: correlation plot]

            성별   연령   혈액형  시도   이식연  기증유형  장기   재이식여부  건수
성별        1.000  0.079  0.035  0.056  0.000  0.031  0.059  0.000  0.123
연령        0.079  1.000  0.029  0.183  0.061  0.100  0.166  0.060  0.157
혈액형      0.035  0.029  1.000  0.066  0.019  0.034  0.000  0.019  0.088
시도        0.056  0.183  0.066  1.000  0.084  0.209  0.342  0.122  0.257
이식연      0.000  0.061  0.019  0.084  1.000  0.026  0.021  0.000  0.049
기증유형    0.031  0.100  0.034  0.209  0.026  1.000  0.388  0.037  0.281
장기        0.059  0.166  0.000  0.342  0.021  0.388  1.000  0.057  0.101
재이식여부  0.000  0.060  0.019  0.122  0.000  0.037  0.057  1.000  0.121
건수        0.123  0.157  0.088  0.257  0.049  0.281  0.101  0.121  1.000

[Figure: correlation plot]

            성별   장기   혈액형  재이식여부  시도   기증유형
성별        1.000  0.044  0.023  0.000  0.044  0.052
장기        0.044  1.000  0.000  0.043  0.126  0.268
혈액형      0.023  0.000  1.000  0.013  0.031  0.032
재이식여부  0.000  0.043  0.013  1.000  0.096  0.062
시도        0.044  0.126  0.031  0.096  1.000  0.115
기증유형    0.052  0.268  0.032  0.062  0.115  1.000

[Figure: correlation plot]

            연령   이식연  건수   성별   혈액형  시도   기증유형  장기   재이식여부
연령        1.000  0.058  0.101  0.061  0.017  0.073  0.062  0.079  0.044
이식연      0.058  1.000  0.024  0.000  0.011  0.043  0.023  0.017  0.000
건수        0.101  0.024  1.000  0.094  0.048  0.103  0.172  0.048  0.093
성별        0.061  0.000  0.094  1.000  0.023  0.044  0.052  0.044  0.000
혈액형      0.017  0.011  0.048  0.023  1.000  0.031  0.032  0.000  0.013
시도        0.073  0.043  0.103  0.044  0.031  1.000  0.115  0.126  0.096
기증유형    0.062  0.023  0.172  0.052  0.032  0.115  1.000  0.268  0.062
장기        0.079  0.017  0.048  0.044  0.000  0.126  0.268  1.000  0.043
재이식여부  0.044  0.000  0.093  0.000  0.013  0.096  0.062  0.043  1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

Index  성별  연령  혈액형  시도  이식연  기증유형  장기  재이식여부  건수
722    남자  43    AB      강원  2017    생존      간장  N           1
15203  남자  66    O       인천  2022    생존      간장  N           1
9310   남자  53    O       부산  2020    생존      신장  N           2
5865   남자  41    A       서울  2019    생존      신장  N           3
15094  여자  65    A       서울  2022    뇌사      신장  N           3
1260   남자  50    O       인천  2017    뇌사      신장  N           1
2995   여자  36    B       울산  2018    뇌사      신장  N           1
6978   남자  56    O       전북  2019    생존      신장  N           1
14725  남자  60    A       부산  2022    생존      신장  Y           1
2384   여자  65    B       울산  2017    뇌사      신장  N           1

Index  성별  연령  혈액형  시도  이식연  기증유형  장기  재이식여부  건수
14695  남자  59    B       서울  2022    뇌사      간장  N           2
11912  남자  54    AB      부산  2021    뇌사      간장  N           1
13324  여자  30    B       서울  2022    생존      신장  N           4
7690   여자  66    A       서울  2019    뇌사      신장  N           4
8329   남자  36    A       부산  2020    뇌사      신장  N           1
14850  여자  61    B       서울  2022    생존      신장  N           3
2991   남자  36    B       서울  2018    생존      간장  N           1
4556   남자  60    A       경기  2018    뇌사      신장  N           1
8207   남자  31    A       서울  2020    생존      신장  N           3
12469  남자  61    A       인천  2021    뇌사      신장  N           1