Dataset statistics
Number of variables | 7 |
---|---|
Number of observations | 10000 |
Missing cells | 9568 |
Missing cells (%) | 13.7% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 673.8 KiB |
Average record size in memory | 69.0 B |
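The percentage and average-size rows above can be re-derived from the other rows of the same table. The following is a minimal sketch, assuming the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB equals 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Values copied from the "Dataset statistics" table.
n_variables = 7
n_observations = 10_000
missing_cells = 9_568
total_size_kib = 673.8

# Assumption: the percentage is computed over every cell in the dataset.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both results round to the figures reported in the table (13.7% and 69.0 B).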
Variable types
Numeric | 5 |
---|---|
Text | 1 |
DateTime | 1 |
Dataset
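Description | Survey results table, part of the survey data released by the National Cancer Center through the Cancer Patient Medical Expense Support Information System (data through September 2019) |
---|---|
Author | National Cancer Center (국립암센터) |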
URL | https://www.data.go.kr/data/15049638/fileData.do |
RES_SEQ is highly overall correlated with POLL_SEQ and 2 other fields | High correlation |
POLL_SEQ is highly overall correlated with RES_SEQ and 2 other fields | High correlation |
QUST_SEQ is highly overall correlated with RES_SEQ and 2 other fields | High correlation |
ANS_USER_ID is highly overall correlated with RES_SEQ and 2 other fields | High correlation |
EXAM_ETC has 8831 (88.3%) missing values | Missing |
ANS_USER_ID has 737 (7.4%) missing values | Missing |
RES_SEQ has unique values | Unique |
Reproduction
Analysis started | 2023-12-12 10:32:05.636532 |
---|---|
Analysis finished | 2023-12-12 10:32:10.704537 |
Duration | 5.07 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
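The reported duration follows directly from the two timestamps. A small sketch using only the Python standard library, with the timestamps copied from the table above:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
started = datetime.fromisoformat("2023-12-12 10:32:05.636532")
finished = datetime.fromisoformat("2023-12-12 10:32:10.704537")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")
```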
RES_SEQ
Real number (ℝ)
HIGH CORRELATION, UNIQUE
Distinct | 10000 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 8551.8334 |
Minimum | 1 |
---|---|
Maximum | 16207 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 815.95 |
Q1 | 4578.75 |
median | 8778 |
Q3 | 12639.5 |
95-th percentile | 15518.05 |
Maximum | 16207 |
Range | 16206 |
Interquartile range (IQR) | 8060.75 |
Descriptive statistics
Standard deviation | 4690.5531 |
---|---|
Coefficient of variation (CV) | 0.54848509 |
Kurtosis | -1.171233 |
Mean | 8551.8334 |
Median Absolute Deviation (MAD) | 4021.5 |
Skewness | -0.13969559 |
Sum | 85518334 |
Variance | 22001288 |
Monotonicity | Not monotonic |
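Several of the RES_SEQ statistics above are simple functions of the others. The sketch below recomputes them from the rounded values printed in the report, so small differences in the last digits are expected; it assumes the usual definitions (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared).

```python
import math

# Values copied from the RES_SEQ tables (already rounded by the report).
minimum, maximum = 1, 16207
q1, q3 = 4578.75, 12639.5
mean, std = 8551.8334, 4690.5531

value_range = maximum - minimum   # range
iqr = q3 - q1                     # interquartile range
cv = std / mean                   # coefficient of variation
variance = std ** 2               # variance from the standard deviation

print(f"Range: {value_range}")
print(f"IQR: {iqr}")
print(f"CV: {cv:.6f}")
print(f"Variance: {variance:.0f}")

# Compare against the reported figures, allowing for rounding of the inputs.
assert math.isclose(cv, 0.54848509, rel_tol=1e-6)
assert math.isclose(variance, 22001288, rel_tol=1e-6)
```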
Value | Count | Frequency (%) |
4188 | 1 | < 0.1% |
10387 | 1 | < 0.1% |
7086 | 1 | < 0.1% |
7010 | 1 | < 0.1% |
10591 | 1 | < 0.1% |
7295 | 1 | < 0.1% |
7565 | 1 | < 0.1% |
13893 | 1 | < 0.1% |
14831 | 1 | < 0.1% |
14625 | 1 | < 0.1% |
Other values (9990) | 9990 | 99.9% |
Value | Count | Frequency (%) |
1 | 1 | < 0.1% |
3 | 1 | < 0.1% |
4 | 1 | < 0.1% |
6 | 1 | < 0.1% |
7 | 1 | < 0.1% |
8 | 1 | < 0.1% |
10 | 1 | < 0.1% |
11 | 1 | < 0.1% |
13 | 1 | < 0.1% |
15 | 1 | < 0.1% |
Value | Count | Frequency (%) |
16207 | 1 | < 0.1% |
16206 | 1 | < 0.1% |
16205 | 1 | < 0.1% |
16203 | 1 | < 0.1% |
16202 | 1 | < 0.1% |
16201 | 1 | < 0.1% |
16200 | 1 | < 0.1% |
16196 | 1 | < 0.1% |
16194 | 1 | < 0.1% |
16193 | 1 | < 0.1% |
POLL_SEQ
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 11 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 9.145 |
Minimum | 1 |
---|---|
Maximum | 13 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 8 |
Q1 | 8 |
median | 8 |
Q3 | 11 |
95-th percentile | 13 |
Maximum | 13 |
Range | 12 |
Interquartile range (IQR) | 3 |
Descriptive statistics
Standard deviation | 1.9007199 |
---|---|
Coefficient of variation (CV) | 0.20784253 |
Kurtosis | 2.0072523 |
Mean | 9.145 |
Median Absolute Deviation (MAD) | 0 |
Skewness | 0.1589526 |
Sum | 91450 |
Variance | 3.6127363 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
8 | 5797 | 58.0% |
11 | 1132 | 11.3% |
9 | 925 | 9.2% |
13 | 854 | 8.5% |
12 | 608 | 6.1% |
10 | 573 | 5.7% |
1 | 68 | 0.7% |
2 | 38 | 0.4% |
5 | 2 | < 0.1% |
6 | 2 | < 0.1% |
Value | Count | Frequency (%) |
1 | 68 | 0.7% |
2 | 38 | 0.4% |
3 | 1 | < 0.1% |
5 | 2 | < 0.1% |
6 | 2 | < 0.1% |
8 | 5797 | 58.0% |
9 | 925 | 9.2% |
10 | 573 | 5.7% |
11 | 1132 | 11.3% |
12 | 608 | 6.1% |
Value | Count | Frequency (%) |
13 | 854 | 8.5% |
12 | 608 | 6.1% |
11 | 1132 | 11.3% |
10 | 573 | 5.7% |
9 | 925 | 9.2% |
8 | 5797 | 58.0% |
6 | 2 | < 0.1% |
5 | 2 | < 0.1% |
3 | 1 | < 0.1% |
2 | 38 | 0.4% |
QUST_SEQ
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 78 |
---|---|
Distinct (%) | 0.8% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 22.6165 |
Minimum | 1 |
---|---|
Maximum | 78 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 1 |
Q1 | 3 |
median | 9 |
Q3 | 41 |
95-th percentile | 70 |
Maximum | 78 |
Range | 77 |
Interquartile range (IQR) | 38 |
Descriptive statistics
Standard deviation | 23.670831 |
---|---|
Coefficient of variation (CV) | 1.0466178 |
Kurtosis | -0.60249028 |
Mean | 22.6165 |
Median Absolute Deviation (MAD) | 8 |
Skewness | 0.88998385 |
Sum | 226165 |
Variance | 560.30826 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
1 | 882 | 8.8% |
2 | 831 | 8.3% |
3 | 804 | 8.0% |
4 | 796 | 8.0% |
5 | 787 | 7.9% |
7 | 402 | 4.0% |
8 | 258 | 2.6% |
6 | 210 | 2.1% |
30 | 91 | 0.9% |
16 | 89 | 0.9% |
Other values (68) | 4850 | 48.5% |
Value | Count | Frequency (%) |
1 | 882 | 8.8% |
2 | 831 | 8.3% |
3 | 804 | 8.0% |
4 | 796 | 8.0% |
5 | 787 | 7.9% |
6 | 210 | 2.1% |
7 | 402 | 4.0% |
8 | 258 | 2.6% |
9 | 83 | 0.8% |
10 | 86 | 0.9% |
Value | Count | Frequency (%) |
78 | 73 | 0.7% |
77 | 56 | 0.6% |
76 | 62 | 0.6% |
75 | 64 | 0.6% |
74 | 64 | 0.6% |
73 | 62 | 0.6% |
72 | 59 | 0.6% |
71 | 58 | 0.6% |
70 | 63 | 0.6% |
69 | 58 | 0.6% |
EXAM_SEQ
Real number (ℝ)
Distinct | 9 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 2.6224 |
Minimum | 1 |
---|---|
Maximum | 9 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 1 |
Q1 | 2 |
median | 3 |
Q3 | 3 |
95-th percentile | 5 |
Maximum | 9 |
Range | 8 |
Interquartile range (IQR) | 1 |
Descriptive statistics
Standard deviation | 1.2018164 |
---|---|
Coefficient of variation (CV) | 0.45828875 |
Kurtosis | 2.594336 |
Mean | 2.6224 |
Median Absolute Deviation (MAD) | 1 |
Skewness | 0.95258057 |
Sum | 26224 |
Variance | 1.4443627 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
3 | 3876 | 38.8% |
2 | 2579 | 25.8% |
1 | 1905 | 19.1% |
4 | 960 | 9.6% |
5 | 551 | 5.5% |
7 | 58 | 0.6% |
6 | 33 | 0.3% |
9 | 30 | 0.3% |
8 | 8 | 0.1% |
Value | Count | Frequency (%) |
1 | 1905 | 19.1% |
2 | 2579 | 25.8% |
3 | 3876 | 38.8% |
4 | 960 | 9.6% |
5 | 551 | 5.5% |
6 | 33 | 0.3% |
7 | 58 | 0.6% |
8 | 8 | 0.1% |
9 | 30 | 0.3% |
Value | Count | Frequency (%) |
9 | 30 | 0.3% |
8 | 8 | 0.1% |
7 | 58 | 0.6% |
6 | 33 | 0.3% |
5 | 551 | 5.5% |
4 | 960 | 9.6% |
3 | 3876 | 38.8% |
2 | 2579 | 25.8% |
1 | 1905 | 19.1% |
EXAM_ETC
Text
MISSING
 
Distinct | 559 |
---|---|
Distinct (%) | 47.8% |
Missing | 8831 |
Missing (%) | 88.3% |
Memory size | 156.2 KiB |
Length
Max length | 276 |
---|---|
Median length | 238 |
Mean length | 16.092387 |
Min length | 1 |
Characters and Unicode
Total characters | 18812 |
---|---|
Distinct characters | 561 |
Distinct categories | 13 |
Distinct scripts | 3 |
Distinct blocks | 3 |
Unique
Unique | 428 |
---|---|
Unique (%) | 36.6% |
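The EXAM_ETC percentages and the mean length are consistent with being computed over the non-missing rows only, not over all 10000 observations. The sketch below checks this; the choice of denominator is an assumption inferred from the numbers, not something the report states.

```python
# Values copied from the EXAM_ETC tables.
n_observations = 10_000
missing = 8_831
distinct = 559
unique = 428
total_characters = 18_812

# Assumption: text statistics use only rows where EXAM_ETC is present.
non_missing = n_observations - missing

distinct_pct = 100 * distinct / non_missing
unique_pct = 100 * unique / non_missing
mean_length = total_characters / non_missing

print(f"Non-missing rows: {non_missing}")
print(f"Distinct (%): {distinct_pct:.1f}%")
print(f"Unique (%): {unique_pct:.1f}%")
print(f"Mean length: {mean_length:.6f}")
```

With 1169 non-missing rows, the three results round to the reported 47.8%, 36.6%, and 16.092387.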
Sample
1st row | 없음 |
---|---|
2nd row | -정보시스템 개선(의료비 영수증이 아닌 전산에서 가져 올수 있도록) ,-사업 담당 보건소와 정책작성 부서와의 피드백 |
3rd row | 11시간 |
4th row | 6급 |
5th row |
Value | Count | Frequency (%) |
2015년 | 58 | 1.3% |
(blank) | 50 | 1.1% |
없음 | 36 | 0.8% |
수 | 32 | 0.7% |
2 | 28 | 0.6% |
1 | 27 | 0.6% |
10 | 24 | 0.5% |
좋겠습니다 | 23 | 0.5% |
지원 | 23 | 0.5% |
및 | 23 | 0.5% |
Other values (2367) | 4053 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 3342 | 17.8% |
1 | 484 | 2.6% |
. | 389 | 2.1% |
0 | 383 | 2.0% |
지 | 330 | 1.8% |
2 | 311 | 1.7% |
이 | 306 | 1.6% |
시 | 282 | 1.5% |
다 | 278 | 1.5% |
5 | 222 | 1.2% |
Other values (551) | 12485 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 12579 | |
Space Separator | 3342 | 17.8% |
Decimal Number | 1929 | 10.3% |
Other Punctuation | 640 | 3.4% |
Control | 137 | 0.7% |
Open Punctuation | 47 | 0.2% |
Close Punctuation | 44 | 0.2% |
Math Symbol | 32 | 0.2% |
Dash Punctuation | 31 | 0.2% |
Lowercase Letter | 21 | 0.1% |
Other values (3) | 10 | 0.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
지 | 330 | 2.6% |
이 | 306 | 2.4% |
시 | 282 | 2.2% |
다 | 278 | 2.2% |
원 | 218 | 1.7% |
하 | 209 | 1.7% |
의 | 189 | 1.5% |
도 | 188 | 1.5% |
년 | 187 | 1.5% |
는 | 182 | 1.4% |
Other values (500) | 10210 |
Lowercase Letter
Value | Count | Frequency (%) |
e | 3 | |
w | 2 | |
r | 2 | |
i | 2 | |
a | 2 | |
y | 2 | |
u | 2 | |
h | 1 | 4.8% |
s | 1 | 4.8% |
k | 1 | 4.8% |
Other values (3) | 3 |
Other Punctuation
Value | Count | Frequency (%) |
. | 389 | |
, | 205 | |
: | 10 | 1.6% |
! | 9 | 1.4% |
? | 9 | 1.4% |
* | 4 | 0.6% |
/ | 4 | 0.6% |
; | 3 | 0.5% |
& | 2 | 0.3% |
# | 2 | 0.3% |
Other values (2) | 3 | 0.5% |
Decimal Number
Value | Count | Frequency (%) |
1 | 484 | |
0 | 383 | |
2 | 311 | |
5 | 222 | |
4 | 111 | 5.8% |
3 | 110 | 5.7% |
9 | 93 | 4.8% |
8 | 87 | 4.5% |
6 | 71 | 3.7% |
7 | 57 | 3.0% |
Uppercase Letter
Value | Count | Frequency (%) |
A | 2 | |
Q | 1 | |
N | 1 | |
M | 1 | |
E | 1 |
Open Punctuation
Value | Count | Frequency (%) |
( | 45 | |
[ | 2 | 4.3% |
Close Punctuation
Value | Count | Frequency (%) |
) | 42 | |
] | 2 | 4.5% |
Math Symbol
Value | Count | Frequency (%) |
~ | 30 | |
> | 2 | 6.2% |
Space Separator
Value | Count | Frequency (%) |
(space) | 3342 |
Control
Value | Count | Frequency (%) |
(control character) | 137 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 31 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 2 |
Modifier Symbol
Value | Count | Frequency (%) |
^ | 2 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 12579 | |
Common | 6206 | |
Latin | 27 | 0.1% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
지 | 330 | 2.6% |
이 | 306 | 2.4% |
시 | 282 | 2.2% |
다 | 278 | 2.2% |
원 | 218 | 1.7% |
하 | 209 | 1.7% |
의 | 189 | 1.5% |
도 | 188 | 1.5% |
년 | 187 | 1.5% |
는 | 182 | 1.4% |
Other values (500) | 10210 |
Common
Value | Count | Frequency (%) |
(space) | 3342 | |
1 | 484 | 7.8% |
. | 389 | 6.3% |
0 | 383 | 6.2% |
2 | 311 | 5.0% |
5 | 222 | 3.6% |
, | 205 | 3.3% |
(control character) | 137 | 2.2% |
4 | 111 | 1.8% |
3 | 110 | 1.8% |
Other values (23) | 512 | 8.3% |
Latin
Value | Count | Frequency (%) |
e | 3 | 11.1% |
w | 2 | 7.4% |
r | 2 | 7.4% |
A | 2 | 7.4% |
i | 2 | 7.4% |
a | 2 | 7.4% |
y | 2 | 7.4% |
u | 2 | 7.4% |
Q | 1 | 3.7% |
N | 1 | 3.7% |
Other values (8) | 8 |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 12571 | |
ASCII | 6233 | |
Compat Jamo | 8 | < 0.1% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 3342 | |
1 | 484 | 7.8% |
. | 389 | 6.2% |
0 | 383 | 6.1% |
2 | 311 | 5.0% |
5 | 222 | 3.6% |
, | 205 | 3.3% |
(control character) | 137 | 2.2% |
4 | 111 | 1.8% |
3 | 110 | 1.8% |
Other values (41) | 539 | 8.6% |
Hangul
Value | Count | Frequency (%) |
지 | 330 | 2.6% |
이 | 306 | 2.4% |
시 | 282 | 2.2% |
다 | 278 | 2.2% |
원 | 218 | 1.7% |
하 | 209 | 1.7% |
의 | 189 | 1.5% |
도 | 188 | 1.5% |
년 | 187 | 1.5% |
는 | 182 | 1.4% |
Other values (495) | 10202 |
Compat Jamo
Value | Count | Frequency (%) |
ㅎ | 2 | |
ㅠ | 2 | |
ㅋ | 2 | |
ㅗ | 1 | |
ㅛ | 1 |
ANS_DT
Date
Distinct | 69 |
---|---|
Distinct (%) | 0.7% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 156.2 KiB |
Minimum | 2007-05-10 00:00:00 |
---|---|
Maximum | 2018-12-28 00:00:00 |
ANS_USER_ID
Real number (ℝ)
HIGH CORRELATION, MISSING
Distinct | 686 |
---|---|
Distinct (%) | 7.4% |
Missing | 737 |
Missing (%) | 7.4% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 3536.9338 |
Minimum | 0 |
---|---|
Maximum | 5357 |
Zeros | 3 |
Zeros (%) | < 0.1% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 166.0 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 2266 |
Q1 | 3056 |
median | 3386 |
Q3 | 4306 |
95-th percentile | 5042 |
Maximum | 5357 |
Range | 5357 |
Interquartile range (IQR) | 1250 |
Descriptive statistics
Standard deviation | 956.6961 |
---|---|
Coefficient of variation (CV) | 0.27048742 |
Kurtosis | 1.514478 |
Mean | 3536.9338 |
Median Absolute Deviation (MAD) | 522 |
Skewness | -0.52474606 |
Sum | 32762618 |
Variance | 915267.43 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
2401 | 79 | 0.8% |
3100 | 77 | 0.8% |
3399 | 75 | 0.8% |
3077 | 74 | 0.7% |
2675 | 70 | 0.7% |
3217 | 70 | 0.7% |
3453 | 70 | 0.7% |
2931 | 69 | 0.7% |
3056 | 69 | 0.7% |
3449 | 68 | 0.7% |
Other values (676) | 8542 | 85.4% |
(Missing) | 737 | 7.4% |
Value | Count | Frequency (%) |
0 | 3 | < 0.1% |
9 | 2 | < 0.1% |
23 | 18 | 0.2% |
40 | 3 | < 0.1% |
43 | 2 | < 0.1% |
52 | 60 | 0.6% |
63 | 3 | < 0.1% |
65 | 1 | < 0.1% |
66 | 3 | < 0.1% |
75 | 1 | < 0.1% |
Value | Count | Frequency (%) |
5357 | 4 | |
5354 | 4 | |
5353 | 3 | |
5352 | 2 | < 0.1% |
5351 | 4 | |
5349 | 4 | |
5347 | 3 | |
5346 | 2 | < 0.1% |
5345 | 5 | |
5344 | 4 |
Variable | RES_SEQ | POLL_SEQ | QUST_SEQ | EXAM_SEQ | ANS_DT | ANS_USER_ID |
---|---|---|---|---|---|---|
RES_SEQ | 1.000 | 0.827 | 0.678 | 0.332 | 0.988 | 0.775 |
POLL_SEQ | 0.827 | 1.000 | 0.569 | 0.520 | 0.997 | 0.714 |
QUST_SEQ | 0.678 | 0.569 | 1.000 | 0.303 | 0.653 | 0.625 |
EXAM_SEQ | 0.332 | 0.520 | 0.303 | 1.000 | 0.543 | 0.259 |
ANS_DT | 0.988 | 0.997 | 0.653 | 0.543 | 1.000 | 0.862 |
ANS_USER_ID | 0.775 | 0.714 | 0.625 | 0.259 | 0.862 | 1.000 |
Variable | RES_SEQ | POLL_SEQ | QUST_SEQ | EXAM_SEQ | ANS_USER_ID |
---|---|---|---|---|---|
RES_SEQ | 1.000 | 0.895 | -0.611 | 0.194 | 0.687 |
POLL_SEQ | 0.895 | 1.000 | -0.679 | 0.213 | 0.747 |
QUST_SEQ | -0.611 | -0.679 | 1.000 | -0.041 | -0.504 |
EXAM_SEQ | 0.194 | 0.213 | -0.041 | 1.000 | 0.151 |
ANS_USER_ID | 0.687 | 0.747 | -0.504 | 0.151 | 1.000 |
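The "High correlation" alerts near the top of the report can be cross-checked against the second correlation matrix above. The sketch below flags every pair whose absolute coefficient is at least 0.5. That cutoff is an assumption (the report does not state the threshold it used), and the helper name `highly_correlated` is hypothetical.

```python
# Second correlation matrix, copied verbatim from the report.
fields = ["RES_SEQ", "POLL_SEQ", "QUST_SEQ", "EXAM_SEQ", "ANS_USER_ID"]
matrix = [
    [1.000, 0.895, -0.611, 0.194, 0.687],
    [0.895, 1.000, -0.679, 0.213, 0.747],
    [-0.611, -0.679, 1.000, -0.041, -0.504],
    [0.194, 0.213, -0.041, 1.000, 0.151],
    [0.687, 0.747, -0.504, 0.151, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, not stated in the report


def highly_correlated(fields, matrix, threshold):
    """Map each field to the other fields with |coefficient| >= threshold."""
    result = {}
    for i, name in enumerate(fields):
        partners = [
            other
            for j, other in enumerate(fields)
            if i != j and abs(matrix[i][j]) >= threshold
        ]
        if partners:
            result[name] = partners
    return result


flagged = highly_correlated(fields, matrix, THRESHOLD)
for name, partners in flagged.items():
    print(f"{name}: {', '.join(partners)}")
```

Under this assumption, RES_SEQ, POLL_SEQ, QUST_SEQ, and ANS_USER_ID are each flagged with three partners, which matches the "and 2 other fields" wording of the alerts, while EXAM_SEQ is not flagged.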
Row index | RES_SEQ | POLL_SEQ | QUST_SEQ | EXAM_SEQ | EXAM_ETC | ANS_DT | ANS_USER_ID |
---|---|---|---|---|---|---|---|
3591 | 4188 | 8 | 45 | 2 | <NA> | 2015-12-28 | 52 |
7165 | 8940 | 8 | 63 | 1 | 없음 | 2016-01-07 | 2450 |
5343 | 6692 | 8 | 21 | 2 | <NA> | 2016-01-07 | 3310 |
12830 | 15453 | 13 | 1 | 3 | <NA> | 2018-12-03 | 5332 |
1374 | 1552 | 8 | 24 | 3 | <NA> | 2015-12-22 | 2989 |
1815 | 2336 | 8 | 58 | 1 | -정보시스템 개선(의료비 영수증이 아닌 전산에서 가져 올수 있도록) ,-사업 담당 보건소와 정책작성 부서와의 피드백 | 2015-12-23 | 3370 |
584 | 565 | 8 | 47 | 3 | <NA> | 2015-12-22 | 2683 |
12560 | 14537 | 12 | 4 | 3 | <NA> | 2018-06-20 | 4833 |
2466 | 2494 | 8 | 72 | 3 | <NA> | 2015-12-23 | 2489 |
2333 | 2432 | 8 | 10 | 1 | 11시간 | 2015-12-23 | 2489 |
Row index | RES_SEQ | POLL_SEQ | QUST_SEQ | EXAM_SEQ | EXAM_ETC | ANS_DT | ANS_USER_ID |
---|---|---|---|---|---|---|---|
194 | 364 | 8 | 30 | 2 | <NA> | 2015-12-22 | 3189 |
5922 | 7809 | 8 | 57 | 1 | 전국적으로 암의료비 지원 예산이 부족한 것으로 알고 있습니다. ,타 신규 의료비 지원 사업을 늘리지 말고 기존의 암의료비 지원 예산을 ,확보해 주었으면 좋겠습니다. ,보건소에 신청하여 본인이 필요한 시기에 돈을 받아야 환자도 필요한 곳에 적절하게 사용할 수 있고 그래야 사업 만족도가 높을 것입니다. ,그리고 성인 암환자 의료비 지원 한도를 높여주십시오 | 2016-01-07 | 2266 |
6221 | 8080 | 8 | 25 | 3 | <NA> | 2016-01-07 | 2557 |
4706 | 6259 | 8 | 6 | 2 | <NA> | 2015-12-31 | <NA> |
933 | 1413 | 8 | 28 | 3 | <NA> | 2015-12-22 | 3077 |
1996 | 2879 | 8 | 67 | 5 | 전화 연결이 안됨. 급할 때는 무지 답답함. | 2015-12-24 | 3203 |
7195 | 9295 | 8 | 55 | 4 | <NA> | 2016-01-07 | 3217 |
4105 | 5278 | 8 | 75 | 2 | <NA> | 2015-12-30 | 2182 |
4043 | 4840 | 8 | 31 | 3 | <NA> | 2015-12-30 | 2991 |
536 | 280 | 8 | 24 | 2 | <NA> | 2015-12-22 | 2993 |