Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 9568
Missing cells (%): 13.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 673.8 KiB
Average record size in memory: 69.0 B
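
The percentage and per-record figures above can be re-derived from the raw counts. The sketch below assumes the report divides missing cells by the total cell count (variables × observations) and treats 1 KiB as 1024 bytes; both assumptions reproduce the reported values, and the variable names are illustrative only.

```python
# Re-derive the summary figures from the counts reported above.
n_variables = 7
n_observations = 10_000
missing_cells = 9_568
total_size_kib = 673.8

total_cells = n_variables * n_observations                  # 70,000 cells
missing_pct = 100 * missing_cells / total_cells             # share of empty cells
avg_record_bytes = total_size_kib * 1024 / n_observations   # bytes per row

print(f"Missing cells: {missing_pct:.1f}%")               # 13.7%
print(f"Average record size: {avg_record_bytes:.1f} B")   # 69.0 B
```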

Variable types

Numeric: 5
Text: 1
DateTime: 1

Dataset

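Description: Survey results table from the survey information that the National Cancer Center (국립암센터) released through the Cancer Patient Medical Expense Support Information System, covering data up to September 2019
Author: National Cancer Center (국립암센터)
URL: https://www.data.go.kr/data/15049638/fileData.do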

Alerts

RES_SEQ is highly overall correlated with POLL_SEQ and 2 other fields (High correlation)
POLL_SEQ is highly overall correlated with RES_SEQ and 2 other fields (High correlation)
QUST_SEQ is highly overall correlated with RES_SEQ and 2 other fields (High correlation)
ANS_USER_ID is highly overall correlated with RES_SEQ and 2 other fields (High correlation)
EXAM_ETC has 8831 (88.3%) missing values (Missing)
ANS_USER_ID has 737 (7.4%) missing values (Missing)
RES_SEQ has unique values (Unique)
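
The Missing and Unique alerts above can be reproduced from per-column counts. This is a minimal sketch, not the profiling library's own logic: `alerts_for` is a hypothetical helper, and the rule "flag any column with at least one missing value" is an assumption that happens to match the alerts listed here.

```python
# Column summaries copied from this report.
N_ROWS = 10_000
columns = {
    "RES_SEQ": {"missing": 0, "distinct": 10_000},
    "EXAM_ETC": {"missing": 8_831, "distinct": 559},
    "ANS_USER_ID": {"missing": 737, "distinct": 686},
}

def alerts_for(name, stats, n_rows=N_ROWS):
    """Return alert strings for one column (hypothetical helper)."""
    found = []
    if stats["missing"] > 0:
        pct = 100 * stats["missing"] / n_rows
        found.append(f"{name} has {stats['missing']} ({pct:.1f}%) missing values")
    # A column is unique when every row carries a different, non-missing value.
    if stats["missing"] == 0 and stats["distinct"] == n_rows:
        found.append(f"{name} has unique values")
    return found

for col_name, col_stats in columns.items():
    for alert in alerts_for(col_name, col_stats):
        print(alert)
```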

Reproduction

Analysis started: 2023-12-12 10:32:05.636532
Analysis finished: 2023-12-12 10:32:10.704537
Duration: 5.07 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

RES_SEQ
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8551.8334
Minimum: 1
Maximum: 16207
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 815.95
Q1: 4578.75
median: 8778
Q3: 12639.5
95-th percentile: 15518.05
Maximum: 16207
Range: 16206
Interquartile range (IQR): 8060.75
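
Fractional quantiles such as 815.95 suggest that values are interpolated between neighbouring order statistics. The report does not state the method, so the sketch below assumes plain linear interpolation; `percentile` is an illustrative helper, and the toy input is the ten smallest RES_SEQ values listed in this report rather than the full column.

```python
def percentile(sorted_values, p):
    """Linearly interpolated p-th percentile (0 <= p <= 100) of pre-sorted data."""
    if not sorted_values:
        raise ValueError("need at least one value")
    pos = (len(sorted_values) - 1) * p / 100   # fractional rank
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * frac

toy = [1, 3, 4, 6, 7, 8, 10, 11, 13, 15]   # ten smallest RES_SEQ values
q1 = percentile(toy, 25)
q3 = percentile(toy, 75)
iqr = q3 - q1                               # interquartile range = Q3 - Q1
print(q1, percentile(toy, 50), q3, iqr)
```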

Descriptive statistics

Standard deviation: 4690.5531
Coefficient of variation (CV): 0.54848509
Kurtosis: -1.171233
Mean: 8551.8334
Median Absolute Deviation (MAD): 4021.5
Skewness: -0.13969559
Sum: 85518334
Variance: 22001288
Monotonicity: Not monotonic
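
Several of the RES_SEQ figures are simple functions of one another. The following sketch checks those relationships using the rounded values printed in this report, so small differences in the last digits are expected; the variable names are illustrative.

```python
# Values copied from the RES_SEQ statistics in this report.
mean, std = 8551.8334, 4690.5531
minimum, maximum = 1, 16207
q1, q3 = 4578.75, 12639.5

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance is the squared standard deviation

print(value_range, iqr, round(cv, 6), round(variance))
```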
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
4188 | 1 | < 0.1%
10387 | 1 | < 0.1%
7086 | 1 | < 0.1%
7010 | 1 | < 0.1%
10591 | 1 | < 0.1%
7295 | 1 | < 0.1%
7565 | 1 | < 0.1%
13893 | 1 | < 0.1%
14831 | 1 | < 0.1%
14625 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Value | Count | Frequency (%)
1 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
10 | 1 | < 0.1%
11 | 1 | < 0.1%
13 | 1 | < 0.1%
15 | 1 | < 0.1%

Value | Count | Frequency (%)
16207 | 1 | < 0.1%
16206 | 1 | < 0.1%
16205 | 1 | < 0.1%
16203 | 1 | < 0.1%
16202 | 1 | < 0.1%
16201 | 1 | < 0.1%
16200 | 1 | < 0.1%
16196 | 1 | < 0.1%
16194 | 1 | < 0.1%
16193 | 1 | < 0.1%

POLL_SEQ
Real number (ℝ)

HIGH CORRELATION 

Distinct: 11
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.145
Minimum: 1
Maximum: 13
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 8
Q1: 8
median: 8
Q3: 11
95-th percentile: 13
Maximum: 13
Range: 12
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.9007199
Coefficient of variation (CV): 0.20784253
Kurtosis: 2.0072523
Mean: 9.145
Median Absolute Deviation (MAD): 0
Skewness: 0.1589526
Sum: 91450
Variance: 3.6127363
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
Value | Count | Frequency (%)
8 | 5797 | 58.0%
11 | 1132 | 11.3%
9 | 925 | 9.2%
13 | 854 | 8.5%
12 | 608 | 6.1%
10 | 573 | 5.7%
1 | 68 | 0.7%
2 | 38 | 0.4%
5 | 2 | < 0.1%
6 | 2 | < 0.1%

Value | Count | Frequency (%)
1 | 68 | 0.7%
2 | 38 | 0.4%
3 | 1 | < 0.1%
5 | 2 | < 0.1%
6 | 2 | < 0.1%
8 | 5797 | 58.0%
9 | 925 | 9.2%
10 | 573 | 5.7%
11 | 1132 | 11.3%
12 | 608 | 6.1%

Value | Count | Frequency (%)
13 | 854 | 8.5%
12 | 608 | 6.1%
11 | 1132 | 11.3%
10 | 573 | 5.7%
9 | 925 | 9.2%
8 | 5797 | 58.0%
6 | 2 | < 0.1%
5 | 2 | < 0.1%
3 | 1 | < 0.1%
2 | 38 | 0.4%
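
Because POLL_SEQ takes only 11 distinct values, its summary statistics can be rebuilt from the value counts alone. The sketch below expands the counts reported for POLL_SEQ into a list and recomputes the mean, median, and MAD; it also shows why the MAD is 0, since more than half of the rows share the median value 8. Variable names are illustrative.

```python
from statistics import median

# Value -> count, copied from the POLL_SEQ frequency tables in this report.
counts = {1: 68, 2: 38, 3: 1, 5: 2, 6: 2, 8: 5797,
          9: 925, 10: 573, 11: 1132, 12: 608, 13: 854}

# Expand the counts back into one value per row.
values = [v for v, c in sorted(counts.items()) for _ in range(c)]
n = len(values)
mean = sum(values) / n
med = median(values)
# Median absolute deviation: median distance of each value from the median.
mad = median(abs(v - med) for v in values)

print(n, sum(values), mean, med, mad)
```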

QUST_SEQ
Real number (ℝ)

HIGH CORRELATION 

Distinct: 78
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22.6165
Minimum: 1
Maximum: 78
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
median: 9
Q3: 41
95-th percentile: 70
Maximum: 78
Range: 77
Interquartile range (IQR): 38

Descriptive statistics

Standard deviation: 23.670831
Coefficient of variation (CV): 1.0466178
Kurtosis: -0.60249028
Mean: 22.6165
Median Absolute Deviation (MAD): 8
Skewness: 0.88998385
Sum: 226165
Variance: 560.30826
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 882 | 8.8%
2 | 831 | 8.3%
3 | 804 | 8.0%
4 | 796 | 8.0%
5 | 787 | 7.9%
7 | 402 | 4.0%
8 | 258 | 2.6%
6 | 210 | 2.1%
30 | 91 | 0.9%
16 | 89 | 0.9%
Other values (68) | 4850 | 48.5%

Value | Count | Frequency (%)
1 | 882 | 8.8%
2 | 831 | 8.3%
3 | 804 | 8.0%
4 | 796 | 8.0%
5 | 787 | 7.9%
6 | 210 | 2.1%
7 | 402 | 4.0%
8 | 258 | 2.6%
9 | 83 | 0.8%
10 | 86 | 0.9%

Value | Count | Frequency (%)
78 | 73 | 0.7%
77 | 56 | 0.6%
76 | 62 | 0.6%
75 | 64 | 0.6%
74 | 64 | 0.6%
73 | 62 | 0.6%
72 | 59 | 0.6%
71 | 58 | 0.6%
70 | 63 | 0.6%
69 | 58 | 0.6%

EXAM_SEQ
Real number (ℝ)

Distinct: 9
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.6224
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 3
Q3: 3
95-th percentile: 5
Maximum: 9
Range: 8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.2018164
Coefficient of variation (CV): 0.45828875
Kurtosis: 2.594336
Mean: 2.6224
Median Absolute Deviation (MAD): 1
Skewness: 0.95258057
Sum: 26224
Variance: 1.4443627
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)
Value | Count | Frequency (%)
3 | 3876 | 38.8%
2 | 2579 | 25.8%
1 | 1905 | 19.1%
4 | 960 | 9.6%
5 | 551 | 5.5%
7 | 58 | 0.6%
6 | 33 | 0.3%
9 | 30 | 0.3%
8 | 8 | 0.1%

Value | Count | Frequency (%)
1 | 1905 | 19.1%
2 | 2579 | 25.8%
3 | 3876 | 38.8%
4 | 960 | 9.6%
5 | 551 | 5.5%
6 | 33 | 0.3%
7 | 58 | 0.6%
8 | 8 | 0.1%
9 | 30 | 0.3%

Value | Count | Frequency (%)
9 | 30 | 0.3%
8 | 8 | 0.1%
7 | 58 | 0.6%
6 | 33 | 0.3%
5 | 551 | 5.5%
4 | 960 | 9.6%
3 | 3876 | 38.8%
2 | 2579 | 25.8%
1 | 1905 | 19.1%

EXAM_ETC
Text

MISSING 

Distinct: 559
Distinct (%): 47.8%
Missing: 8831
Missing (%): 88.3%
Memory size: 156.2 KiB

Length

Max length: 276
Median length: 238
Mean length: 16.092387
Min length: 1

Characters and Unicode

Total characters: 18812
Distinct characters: 561
Distinct categories: 13
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 428
Unique (%): 36.6%

Sample

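1st row: 없음 ("none")
2nd row: -정보시스템 개선(의료비 영수증이 아닌 전산에서 가져 올수 있도록) ,-사업 담당 보건소와 정책작성 부서와의 피드백 ("- Improve the information system (so that data can be pulled electronically instead of from medical expense receipts), - feedback between the public health centers in charge of the program and the policy-drafting department")
3rd row: 11시간 ("11 hours")
4th row: 6급 ("grade 6")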
5th row
Value | Count | Frequency (%)
2015년 (year 2015) | 58 | 1.3%
(not shown) | 50 | 1.1%
없음 (none) | 36 | 0.8%
(not shown) | 32 | 0.7%
2 | 28 | 0.6%
1 | 27 | 0.6%
10 | 24 | 0.5%
좋겠습니다 (would be good) | 23 | 0.5%
지원 (support) | 23 | 0.5%
(not shown) | 23 | 0.5%
Other values (2367) | 4053 | 92.6%

Most occurring characters

ValueCountFrequency (%)
3342
 
17.8%
1 484
 
2.6%
. 389
 
2.1%
0 383
 
2.0%
330
 
1.8%
2 311
 
1.7%
306
 
1.6%
282
 
1.5%
278
 
1.5%
5 222
 
1.2%
Other values (551) 12485
66.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 12579 | 66.9%
Space Separator | 3342 | 17.8%
Decimal Number | 1929 | 10.3%
Other Punctuation | 640 | 3.4%
Control | 137 | 0.7%
Open Punctuation | 47 | 0.2%
Close Punctuation | 44 | 0.2%
Math Symbol | 32 | 0.2%
Dash Punctuation | 31 | 0.2%
Lowercase Letter | 21 | 0.1%
Other values (3) | 10 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
330
 
2.6%
306
 
2.4%
282
 
2.2%
278
 
2.2%
218
 
1.7%
209
 
1.7%
189
 
1.5%
188
 
1.5%
187
 
1.5%
182
 
1.4%
Other values (500) 10210
81.2%
Lowercase Letter
ValueCountFrequency (%)
e 3
14.3%
w 2
9.5%
r 2
9.5%
i 2
9.5%
a 2
9.5%
y 2
9.5%
u 2
9.5%
h 1
 
4.8%
s 1
 
4.8%
k 1
 
4.8%
Other values (3) 3
14.3%
Other Punctuation
ValueCountFrequency (%)
. 389
60.8%
, 205
32.0%
: 10
 
1.6%
! 9
 
1.4%
? 9
 
1.4%
* 4
 
0.6%
/ 4
 
0.6%
; 3
 
0.5%
& 2
 
0.3%
# 2
 
0.3%
Other values (2) 3
 
0.5%
Decimal Number
ValueCountFrequency (%)
1 484
25.1%
0 383
19.9%
2 311
16.1%
5 222
11.5%
4 111
 
5.8%
3 110
 
5.7%
9 93
 
4.8%
8 87
 
4.5%
6 71
 
3.7%
7 57
 
3.0%
Uppercase Letter
ValueCountFrequency (%)
A 2
33.3%
Q 1
16.7%
N 1
16.7%
M 1
16.7%
E 1
16.7%
Open Punctuation
ValueCountFrequency (%)
( 45
95.7%
[ 2
 
4.3%
Close Punctuation
ValueCountFrequency (%)
) 42
95.5%
] 2
 
4.5%
Math Symbol
ValueCountFrequency (%)
~ 30
93.8%
> 2
 
6.2%
Space Separator
ValueCountFrequency (%)
3342
100.0%
Control
ValueCountFrequency (%)
137
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 31
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 2
100.0%
Modifier Symbol
ValueCountFrequency (%)
^ 2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 12579 | 66.9%
Common | 6206 | 33.0%
Latin | 27 | 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
330
 
2.6%
306
 
2.4%
282
 
2.2%
278
 
2.2%
218
 
1.7%
209
 
1.7%
189
 
1.5%
188
 
1.5%
187
 
1.5%
182
 
1.4%
Other values (500) 10210
81.2%
Common
ValueCountFrequency (%)
3342
53.9%
1 484
 
7.8%
. 389
 
6.3%
0 383
 
6.2%
2 311
 
5.0%
5 222
 
3.6%
, 205
 
3.3%
137
 
2.2%
4 111
 
1.8%
3 110
 
1.8%
Other values (23) 512
 
8.3%
Latin
ValueCountFrequency (%)
e 3
 
11.1%
w 2
 
7.4%
r 2
 
7.4%
A 2
 
7.4%
i 2
 
7.4%
a 2
 
7.4%
y 2
 
7.4%
u 2
 
7.4%
Q 1
 
3.7%
N 1
 
3.7%
Other values (8) 8
29.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 12571 | 66.8%
ASCII | 6233 | 33.1%
Compat Jamo | 8 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3342
53.6%
1 484
 
7.8%
. 389
 
6.2%
0 383
 
6.1%
2 311
 
5.0%
5 222
 
3.6%
, 205
 
3.3%
137
 
2.2%
4 111
 
1.8%
3 110
 
1.8%
Other values (41) 539
 
8.6%
Hangul
ValueCountFrequency (%)
330
 
2.6%
306
 
2.4%
282
 
2.2%
278
 
2.2%
218
 
1.7%
209
 
1.7%
189
 
1.5%
188
 
1.5%
187
 
1.5%
182
 
1.4%
Other values (495) 10202
81.2%
Compat Jamo
ValueCountFrequency (%)
2
25.0%
2
25.0%
2
25.0%
1
12.5%
1
12.5%

ANS_DT
Date

Distinct: 69
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2007-05-10 00:00:00
Maximum: 2018-12-28 00:00:00
Histogram with fixed size bins (bins=50)

ANS_USER_ID
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 686
Distinct (%): 7.4%
Missing: 737
Missing (%): 7.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 3536.9338
Minimum: 0
Maximum: 5357
Zeros: 3
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2266
Q1: 3056
median: 3386
Q3: 4306
95-th percentile: 5042
Maximum: 5357
Range: 5357
Interquartile range (IQR): 1250

Descriptive statistics

Standard deviation: 956.6961
Coefficient of variation (CV): 0.27048742
Kurtosis: 1.514478
Mean: 3536.9338
Median Absolute Deviation (MAD): 522
Skewness: -0.52474606
Sum: 32762618
Variance: 915267.43
Monotonicity: Not monotonic
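
ANS_USER_ID has missing values, so it matters which denominator each statistic uses. The reported mean and distinct percentage are consistent with dividing by the 9263 non-missing rows rather than by all 10000 rows, as the sketch below illustrates with the figures from this report (variable names are illustrative).

```python
# Figures copied from the ANS_USER_ID section of this report.
n_rows = 10_000
n_missing = 737
total = 32_762_618      # "Sum"
n_distinct = 686

n_present = n_rows - n_missing                # rows that carry a value
mean_present = total / n_present              # matches the reported mean
mean_all_rows = total / n_rows                # what dividing by every row would give
distinct_pct = 100 * n_distinct / n_present   # matches the reported Distinct (%)

print(n_present, round(mean_present, 4), round(mean_all_rows, 4), round(distinct_pct, 1))
```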
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
2401 | 79 | 0.8%
3100 | 77 | 0.8%
3399 | 75 | 0.8%
3077 | 74 | 0.7%
2675 | 70 | 0.7%
3217 | 70 | 0.7%
3453 | 70 | 0.7%
2931 | 69 | 0.7%
3056 | 69 | 0.7%
3449 | 68 | 0.7%
Other values (676) | 8542 | 85.4%
(Missing) | 737 | 7.4%

Value | Count | Frequency (%)
0 | 3 | < 0.1%
9 | 2 | < 0.1%
23 | 18 | 0.2%
40 | 3 | < 0.1%
43 | 2 | < 0.1%
52 | 60 | 0.6%
63 | 3 | < 0.1%
65 | 1 | < 0.1%
66 | 3 | < 0.1%
75 | 1 | < 0.1%

Value | Count | Frequency (%)
5357 | 4 | < 0.1%
5354 | 4 | < 0.1%
5353 | 3 | < 0.1%
5352 | 2 | < 0.1%
5351 | 4 | < 0.1%
5349 | 4 | < 0.1%
5347 | 3 | < 0.1%
5346 | 2 | < 0.1%
5345 | 5 | 0.1%
5344 | 4 | < 0.1%

Interactions

[Figures: 25 interaction scatter plots; images not preserved in this text version.]

Correlations

Variable | RES_SEQ | POLL_SEQ | QUST_SEQ | EXAM_SEQ | ANS_DT | ANS_USER_ID
RES_SEQ | 1.000 | 0.827 | 0.678 | 0.332 | 0.988 | 0.775
POLL_SEQ | 0.827 | 1.000 | 0.569 | 0.520 | 0.997 | 0.714
QUST_SEQ | 0.678 | 0.569 | 1.000 | 0.303 | 0.653 | 0.625
EXAM_SEQ | 0.332 | 0.520 | 0.303 | 1.000 | 0.543 | 0.259
ANS_DT | 0.988 | 0.997 | 0.653 | 0.543 | 1.000 | 0.862
ANS_USER_ID | 0.775 | 0.714 | 0.625 | 0.259 | 0.862 | 1.000

Variable | RES_SEQ | POLL_SEQ | QUST_SEQ | EXAM_SEQ | ANS_USER_ID
RES_SEQ | 1.000 | 0.895 | -0.611 | 0.194 | 0.687
POLL_SEQ | 0.895 | 1.000 | -0.679 | 0.213 | 0.747
QUST_SEQ | -0.611 | -0.679 | 1.000 | -0.041 | -0.504
EXAM_SEQ | 0.194 | 0.213 | -0.041 | 1.000 | 0.151
ANS_USER_ID | 0.687 | 0.747 | -0.504 | 0.151 | 1.000
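
The High correlation alerts near the top of this report can be related to the second matrix (the one without ANS_DT). The sketch below lists, for each variable, the other variables whose absolute coefficient reaches a cut-off. The cut-off of 0.5 is an assumption, chosen because it reproduces the alerts exactly (three partners each for RES_SEQ, POLL_SEQ, QUST_SEQ, and ANS_USER_ID, none for EXAM_SEQ); `highly_correlated` is a hypothetical helper, not the profiling library's API.

```python
# Second correlation matrix from this report (the one without ANS_DT).
names = ["RES_SEQ", "POLL_SEQ", "QUST_SEQ", "EXAM_SEQ", "ANS_USER_ID"]
matrix = [
    [1.000, 0.895, -0.611, 0.194, 0.687],
    [0.895, 1.000, -0.679, 0.213, 0.747],
    [-0.611, -0.679, 1.000, -0.041, -0.504],
    [0.194, 0.213, -0.041, 1.000, 0.151],
    [0.687, 0.747, -0.504, 0.151, 1.000],
]
THRESHOLD = 0.5   # assumed cut-off; not stated in the report

def highly_correlated(names, matrix, threshold=THRESHOLD):
    """Map each variable to the other variables with |r| >= threshold."""
    result = {}
    for i, name in enumerate(names):
        partners = [names[j] for j, r in enumerate(matrix[i])
                    if j != i and abs(r) >= threshold]
        if partners:
            result[name] = partners
    return result

flagged = highly_correlated(names, matrix)
for name, partners in flagged.items():
    print(f"{name}: {', '.join(partners)}")
```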

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

RES_SEQPOLL_SEQQUST_SEQEXAM_SEQEXAM_ETCANS_DTANS_USER_ID
359141888452<NA>2015-12-2852
716589408631없음2016-01-072450
534366928212<NA>2016-01-073310
12830154531313<NA>2018-12-035332
137415528243<NA>2015-12-222989
181523368581-정보시스템 개선(의료비 영수증이 아닌 전산에서 가져 올수 있도록) ,-사업 담당 보건소와 정책작성 부서와의 피드백2015-12-233370
5845658473<NA>2015-12-222683
12560145371243<NA>2018-06-204833
246624948723<NA>2015-12-232489
23332432810111시간2015-12-232489
RES_SEQPOLL_SEQQUST_SEQEXAM_SEQEXAM_ETCANS_DTANS_USER_ID
1943648302<NA>2015-12-223189
592278098571전국적으로 암의료비 지원 예산이 부족한 것으로 알고 있습니다. ,타 신규 의료비 지원 사업을 늘리지 말고 기존의 암의료비 지원 예산을 ,확보해 주었으면 좋겠습니다. ,보건소에 신청하여 본인이 필요한 시기에 돈을 받아야 환자도 필요한 곳에 적절하게 사용할 수 있고 그래야 사업 만족도가 높을 것입니다. ,그리고 성인 암환자 의료비 지원 한도를 높여주십시오2016-01-072266
622180808253<NA>2016-01-072557
47066259862<NA>2015-12-31<NA>
93314138283<NA>2015-12-223077
199628798675전화 연결이 안됨. 급할 때는 무지 답답함.2015-12-243203
719592958554<NA>2016-01-073217
410552788752<NA>2015-12-302182
404348408313<NA>2015-12-302991
5362808242<NA>2015-12-222993