Overview

Dataset statistics

Number of variables: 5
Number of observations: 682
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 28.8 KiB
Average record size in memory: 43.2 B
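
The average record size appears to be the total in-memory size divided by the number of observations. A minimal check of that arithmetic, assuming 1 KiB = 1024 bytes and starting from the already rounded total reported above (the variable names are illustrative only):

```python
# Figures copied from the dataset statistics above.
total_size_kib = 28.8   # "Total size in memory" (rounded in the report)
n_observations = 682    # "Number of observations"

# Assumption: 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B")  # prints 43.2 B
```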

Variable types

Numeric: 3
Text: 1
Boolean: 1

Dataset

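Description: Survey answer-option table, part of the survey data that the National Cancer Center (국립암센터) made public through the Cancer Patient Medical Expense Support Information System (암환자의료비지원정보시스템) up to September 2019. It contains information about the surveys and related items.
Author: National Cancer Center (국립암센터)
URL: https://www.data.go.kr/data/15049636/fileData.do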

Alerts

여부 (yes/no flag) is highly imbalanced (67.2%). Alert type: Imbalance
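
The 67.2% figure is consistent with an entropy-based imbalance score, 1 - H / log2(k), where H is the Shannon entropy of the class frequencies and k is the number of classes. The sketch below assumes that formula (the report does not state how the alert is derived) and uses the counts reported for 여부 (False 641, True 41); `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for perfectly balanced classes, near 1 for extreme imbalance."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    # Shannon entropy in bits; empty classes contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

score = imbalance_score([641, 41])  # False / True counts for 여부
print(f"{score:.1%}")               # prints 67.2%
```

A perfectly balanced column gives a score of 0 under this definition, so the higher the percentage, the more one class dominates.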

Reproduction

Analysis started: 2023-12-12 14:58:24.976878
Analysis finished: 2023-12-12 14:58:26.704122
Duration: 1.73 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

설문번호 (survey number)
Real number (ℝ)

Distinct: 8
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8.3973607
Minimum: 6
Maximum: 13
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 6
5th percentile: 7
Q1: 7
Median: 8
Q3: 8
95th percentile: 13
Maximum: 13
Range: 7
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.8174045
Coefficient of variation (CV): 0.21642568
Kurtosis: 1.0216346
Mean: 8.3973607
Median Absolute Deviation (MAD): 1
Skewness: 1.5193699
Sum: 5727
Variance: 3.3029593
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
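
Fixed-size binning splits the range from minimum to maximum into equal-width intervals. The sketch below illustrates this with 8 bins, assuming the common convention (used by NumPy, for example) that the maximum value falls into the last bin. The counts are the value counts reported for 설문번호; `fixed_width_histogram` is a hypothetical helper, not part of the profiling library.

```python
def fixed_width_histogram(value_counts, bins):
    """Aggregate {value: count} into `bins` equal-width bins over [min, max]."""
    lo, hi = min(value_counts), max(value_counts)
    width = (hi - lo) / bins
    hist = [0] * bins
    for value, count in value_counts.items():
        if width == 0:
            index = 0  # all values identical: everything lands in the first bin
        else:
            # The maximum value belongs to the last bin instead of opening a new one.
            index = min(int((value - lo) / width), bins - 1)
        hist[index] += count
    return hist

counts = {6: 1, 7: 246, 8: 287, 9: 20, 10: 20, 11: 18, 12: 45, 13: 45}
hist = fixed_width_histogram(counts, bins=8)
print(hist)  # prints [1, 246, 287, 20, 20, 18, 45, 45]
```

With 8 distinct integer values spread over a range of 7, each bin is 0.875 wide, so every value ends up in its own bin.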
Sorted by count
Value  Count  Frequency (%)
8      287    42.1%
7      246    36.1%
13     45     6.6%
12     45     6.6%
10     20     2.9%
9      20     2.9%
11     18     2.6%
6      1      0.1%

Sorted by value (ascending)
Value  Count  Frequency (%)
6      1      0.1%
7      246    36.1%
8      287    42.1%
9      20     2.9%
10     20     2.9%
11     18     2.6%
12     45     6.6%
13     45     6.6%

Sorted by value (descending)
Value  Count  Frequency (%)
13     45     6.6%
12     45     6.6%
11     18     2.6%
10     20     2.9%
9      20     2.9%
8      287    42.1%
7      246    36.1%
6      1      0.1%
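
The descriptive statistics for 설문번호 can be recomputed from its value counts. The sketch below does so, assuming the variance is the sample variance (dividing by n - 1); that assumption is what reproduces the reported 3.3029593. Variable names are illustrative.

```python
import math

# Value counts reported for 설문번호.
counts = {6: 1, 7: 246, 8: 287, 9: 20, 10: 20, 11: 18, 12: 45, 13: 45}

n = sum(counts.values())                           # number of observations
total = sum(v * c for v, c in counts.items())      # "Sum"
mean = total / n
# Sample variance: squared deviations divided by n - 1.
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean                                    # coefficient of variation
distinct_pct = 100 * len(counts) / n               # "Distinct (%)"

print(n, total, round(mean, 7), round(variance, 7), round(cv, 8), round(distinct_pct, 1))
```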

예시번호 (answer option number)
Real number (ℝ)

Distinct: 9
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.8768328
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 2
Median: 3
Q3: 4
95th percentile: 6
Maximum: 9
Range: 8
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.698083
Coefficient of variation (CV): 0.59026128
Kurtosis: 1.9704371
Mean: 2.8768328
Median Absolute Deviation (MAD): 1
Skewness: 1.2112025
Sum: 1962
Variance: 2.883486
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Sorted by count
Value  Count  Frequency (%)
1      166    24.3%
2      151    22.1%
3      146    21.4%
4      139    20.4%
5      40     5.9%
7      10     1.5%
9      10     1.5%
8      10     1.5%
6      10     1.5%

Sorted by value (ascending)
Value  Count  Frequency (%)
1      166    24.3%
2      151    22.1%
3      146    21.4%
4      139    20.4%
5      40     5.9%
6      10     1.5%
7      10     1.5%
8      10     1.5%
9      10     1.5%

Sorted by value (descending)
Value  Count  Frequency (%)
9      10     1.5%
8      10     1.5%
7      10     1.5%
6      10     1.5%
5      40     5.9%
4      139    20.4%
3      146    21.4%
2      151    22.1%
1      166    24.3%
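
The quantile statistics for 예시번호 can likewise be recomputed from its value counts. The sketch below assumes the linear-interpolation percentile convention that NumPy and pandas use by default; `percentile` is a hypothetical helper written for this example.

```python
def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of pre-sorted data."""
    position = (len(sorted_values) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction

# Value counts reported for 예시번호, expanded into one sorted list of observations.
counts = {1: 166, 2: 151, 3: 146, 4: 139, 5: 40, 6: 10, 7: 10, 8: 10, 9: 10}
values = sorted(v for v, c in counts.items() for _ in range(c))

p5, q1, median, q3, p95 = (percentile(values, q) for q in (5, 25, 50, 75, 95))
iqr = q3 - q1
# Median absolute deviation: the median of |x - median|.
mad = percentile(sorted(abs(v - median) for v in values), 50)

print(p5, q1, median, q3, p95, iqr, mad)  # prints 1.0 2.0 3.0 4.0 6.0 2.0 1.0
```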

질문번호 (question number)
Real number (ℝ)

Distinct: 78
Distinct (%): 11.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.97654
Minimum: 1
Maximum: 78
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 1
5th percentile: 2
Q1: 6
Median: 29
Q3: 50
95th percentile: 69
Maximum: 78
Range: 77
Interquartile range (IQR): 44

Descriptive statistics

Standard deviation: 23.04333
Coefficient of variation (CV): 0.74389619
Kurtosis: -1.2168972
Mean: 30.97654
Median Absolute Deviation (MAD): 22
Skewness: 0.27605092
Sum: 21126
Variance: 530.99504
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Most frequent values
Value              Count  Frequency (%)
4                  40     5.9%
5                  40     5.9%
3                  39     5.7%
2                  37     5.4%
55                 10     1.5%
56                 10     1.5%
57                 10     1.5%
53                 10     1.5%
1                  9      1.3%
64                 9      1.3%
Other values (68)  468    68.6%

Smallest values
Value  Count  Frequency (%)
1      9      1.3%
2      37     5.4%
3      39     5.7%
4      40     5.9%
5      40     5.9%
6      9      1.3%
7      9      1.3%
8      3      0.4%
9      5      0.7%
10     5      0.7%

Largest values
Value  Count  Frequency (%)
78     1      0.1%
77     5      0.7%
76     4      0.6%
75     3      0.4%
74     4      0.6%
73     4      0.6%
72     4      0.6%
71     4      0.6%
70     4      0.6%
69     4      0.6%

답변 (answer)
Text

Distinct: 127
Distinct (%): 18.6%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 KiB

Length

Max length: 144
Median length: 134 (as reported; inconsistent with the mean length of 6.3856305)
Mean length: 6.3856305
Min length: 1

Characters and Unicode

Total characters: 4355
Distinct characters: 223
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
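
As a small illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module, using the five sample values listed for this column. The two-letter codes map to the category names used in this report: `Lo` is Other Letter and `Nd` is Decimal Number. Script and block properties are not exposed by `unicodedata`, so they are left out of this example.

```python
import unicodedata
from collections import Counter

# The five sample values shown for the 답변 column.
answers = ["간호직", "전문대졸", "주관식", "7급", "그렇다"]

# Look up the general category of every character and tally the results.
category_counts = Counter(
    unicodedata.category(ch) for text in answers for ch in text
)

print(category_counts)  # prints Counter({'Lo': 14, 'Nd': 1})
```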

Unique

Unique: 62
Unique (%): 9.1%

Sample

1st row: 간호직 (nursing staff)
2nd row: 전문대졸 (junior college graduate)
3rd row: 주관식 (free-text response)
4th row: 7급 (grade 7)
5th row: 그렇다 (agree)
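Most frequent words
Value                         Count  Frequency (%)
매우 (very)                   195    15.0%
그렇다 (agree)                181    13.9%
아니다 (disagree)             153    11.7%
않다 (not)                    42     3.2%
그렇지 (so, as in "not so")   29     2.2%
필요 (necessary)              24     1.8%
불필요 (unnecessary)          24     1.8%
주관식 (free-text response)   23     1.8%
어렵다 (difficult)            21     1.6%
전혀 (not at all)             21     1.6%
Other values (241)            591    45.3%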
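
The word counts above add up to 1304, more than the 682 rows, which is what one would expect if each answer is split into words before counting. The sketch below assumes plain whitespace tokenization (the report does not document its tokenizer) and uses a handful of answers from the sample rows.

```python
from collections import Counter

# A few answers taken from the sample rows of the 답변 column.
answers = ["그렇다", "아니다", "그렇다", "매우 아니다", "아니다", "그렇다"]

# Assumption: words are separated by whitespace only.
word_counts = Counter(word for text in answers for word in text.split())
total_words = sum(word_counts.values())

for word, count in word_counts.most_common():
    print(word, count, f"{count / total_words:.1%}")
```

Here the two-word answer "매우 아니다" contributes one count to each of its words, so six answers yield seven words.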

Most occurring characters

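Value                      Count  Frequency (%)
(space)                    622    14.3%
(character not preserved)  441    10.1%
(character not preserved)  221    5.1%
(character not preserved)  221    5.1%
(character not preserved)  200    4.6%
(character not preserved)  200    4.6%
(character not preserved)  155    3.6%
(character not preserved)  154    3.5%
(character not preserved)  67     1.5%
0                          64     1.5%
Other values (213)         2010   46.2%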

Most occurring categories

Value                  Count  Frequency (%)
Other Letter           3127   71.8%
Space Separator        622    14.3%
Decimal Number         279    6.4%
Close Punctuation      105    2.4%
Open Punctuation       104    2.4%
Other Punctuation      61     1.4%
Connector Punctuation  22     0.5%
Math Symbol            13     0.3%
Other Number           10     0.2%
Uppercase Letter       9      0.2%

Most frequent character per category

Other Letter
Value                      Count  Frequency (%)
(character not preserved)  441    14.1%
(character not preserved)  221    7.1%
(character not preserved)  221    7.1%
(character not preserved)  200    6.4%
(character not preserved)  200    6.4%
(character not preserved)  155    5.0%
(character not preserved)  154    4.9%
(character not preserved)  67     2.1%
(character not preserved)  52     1.7%
(character not preserved)  49     1.6%
Other values (179)         1367   43.7%

Decimal Number
Value  Count  Frequency (%)
0      64     22.9%
1      40     14.3%
5      36     12.9%
2      25     9.0%
7      22     7.9%
3      20     7.2%
9      20     7.2%
4      18     6.5%
8      17     6.1%
6      17     6.1%

Other Number
Ten distinct characters (not preserved), each with count 1 and frequency 10.0%.

Other Punctuation
Value  Count  Frequency (%)
,      26     42.6%
.      17     27.9%
%      16     26.2%
/      2      3.3%

Close Punctuation
Value  Count  Frequency (%)
)      64     61.0%
]      41     39.0%

Open Punctuation
Value  Count  Frequency (%)
(      63     60.6%
[      41     39.4%

Math Symbol
Value  Count  Frequency (%)
~      12     92.3%
>      1      7.7%

Space Separator
Value    Count  Frequency (%)
(space)  622    100.0%

Connector Punctuation
Value  Count  Frequency (%)
_      22     100.0%

Uppercase Letter
Value  Count  Frequency (%)
N      9      100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      3      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  3127   71.8%
Common  1219   28.0%
Latin   9      0.2%

Most frequent character per script

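Hangul
Value                      Count  Frequency (%)
(character not preserved)  441    14.1%
(character not preserved)  221    7.1%
(character not preserved)  221    7.1%
(character not preserved)  200    6.4%
(character not preserved)  200    6.4%
(character not preserved)  155    5.0%
(character not preserved)  154    4.9%
(character not preserved)  67     2.1%
(character not preserved)  52     1.7%
(character not preserved)  49     1.6%
Other values (179)         1367   43.7%

Common
Value              Count  Frequency (%)
(space)            622    51.0%
0                  64     5.3%
)                  64     5.3%
(                  63     5.2%
[                  41     3.4%
]                  41     3.4%
1                  40     3.3%
5                  36     3.0%
,                  26     2.1%
2                  25     2.1%
Other values (23)  197    16.2%

Latin
Value  Count  Frequency (%)
N      9      100.0%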

Most occurring blocks

Value              Count  Frequency (%)
Hangul             3127   71.8%
ASCII              1218   28.0%
Enclosed Alphanum  10     0.2%

Most frequent character per block

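ASCII
Value              Count  Frequency (%)
(space)            622    51.1%
0                  64     5.3%
)                  64     5.3%
(                  63     5.2%
[                  41     3.4%
]                  41     3.4%
1                  40     3.3%
5                  36     3.0%
,                  26     2.1%
2                  25     2.1%
Other values (14)  196    16.1%

Hangul
Value                      Count  Frequency (%)
(character not preserved)  441    14.1%
(character not preserved)  221    7.1%
(character not preserved)  221    7.1%
(character not preserved)  200    6.4%
(character not preserved)  200    6.4%
(character not preserved)  155    5.0%
(character not preserved)  154    4.9%
(character not preserved)  67     2.1%
(character not preserved)  52     1.7%
(character not preserved)  49     1.6%
Other values (179)         1367   43.7%

Enclosed Alphanum
Ten distinct characters (not preserved), each with count 1 and frequency 10.0%.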

여부 (yes/no flag)
Boolean

Alert: Imbalance

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 814.0 B

Value  Count  Frequency (%)
False  641    94.0%
True   41     6.0%

Interactions

[Figures: nine interaction plots; images not preserved]

Correlations

[Figure: first correlation heatmap; image not preserved]
          설문번호  예시번호  질문번호  여부
설문번호  1.000     0.440     0.578     0.335
예시번호  0.440     1.000     0.271     0.324
질문번호  0.578     0.271     1.000     0.220
여부      0.335     0.324     0.220     1.000

[Figure: second correlation heatmap; image not preserved]
          설문번호  예시번호  질문번호  여부
설문번호  1.000     0.216     -0.427    0.250
예시번호  0.216     1.000     -0.156    0.322
질문번호  -0.427    -0.156    1.000     0.168
여부      0.250     0.322     0.168     1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
Index  설문번호  예시번호  질문번호  답변  여부
0      7  2  4   간호직  N
1      7  2  3   전문대졸  N
2      6  1  2   주관식  Y
3      7  2  5   7급  N
4      7  2  13  그렇다  N
5      7  3  10  아니다  N
6      7  2  9   그렇다  N
7      7  4  12  매우 아니다  N
8      7  3  14  아니다  N
9      7  2  15  그렇다  N

Last rows
Index  설문번호  예시번호  질문번호  답변  여부
672    12  4  4  7  N
673    12  5  4  6 (보통이다)  N
674    12  8  4  3  N
675    12  1  5  10 (매우 그렇다)  N
676    12  7  5  4 (그렇지 않은 편이다)  N
677    12  9  6  2 (전혀 그렇지 않다)  N
678    13  1  1  10(매우그렇다)  N
679    12  2  6  9  N
680    12  3  6  8 (그렇다)  N
681    13  1  2  10(매우 그렇다)  N