Dataset statistics
Number of variables | 9 |
---|---|
Number of observations | 21 |
Missing cells | 97 |
Missing cells (%) | 51.3% |
Duplicate rows | 1 |
Duplicate rows (%) | 4.8% |
Total size in memory | 1.6 KiB |
Average record size in memory | 78.3 B |
Variable types
Unsupported | 2 |
---|---|
Text | 5 |
Categorical | 1 |
Boolean | 1 |
Dataset
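Description | Distribution of major diseases among health insurance subscribers (건강보험 가입자의 주요질환 분포 현황) |
---|---|
Author | National Health Insurance Service (국민건강보험공단) |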
URL | https://www.vworld.kr/dtmk/dtmk_ntads_s002.do?dsId=30026 |
Alerts
Unnamed: 5 has constant value "" | Constant |
Unnamed: 8 has constant value "" | Constant |
Dataset has 1 (4.8%) duplicate rows | Duplicates |
테이블정의서 has 1 (4.8%) missing values | Missing |
Unnamed: 1 has 11 (52.4%) missing values | Missing |
Unnamed: 2 has 7 (33.3%) missing values | Missing |
Unnamed: 4 has 10 (47.6%) missing values | Missing |
Unnamed: 5 has 12 (57.1%) missing values | Missing |
Unnamed: 6 has 18 (85.7%) missing values | Missing |
Unnamed: 7 has 18 (85.7%) missing values | Missing |
Unnamed: 8 has 20 (95.2%) missing values | Missing |
테이블정의서 is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
Unnamed: 4 is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
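The per-column missing counts in the alerts above can be cross-checked against the dataset-level figures (97 missing cells, 51.3%). The following is a minimal sketch in plain Python. The counts are transcribed from this report, with `Unnamed: 3` entered as 0 because its own section reports no missing values. The variable names are illustrative and not part of ydata-profiling.

```python
# Missing-value counts per column, transcribed from the alerts in this report.
# "Unnamed: 3" has no alert because its section reports 0 missing values.
N_ROWS = 21
missing = {
    "테이블정의서": 1,
    "Unnamed: 1": 11,
    "Unnamed: 2": 7,
    "Unnamed: 3": 0,
    "Unnamed: 4": 10,
    "Unnamed: 5": 12,
    "Unnamed: 6": 18,
    "Unnamed: 7": 18,
    "Unnamed: 8": 20,
}

# Per-column share of missing values, formatted like the report (one decimal).
missing_pct = {col: f"{count / N_ROWS:.1%}" for col, count in missing.items()}

# Dataset-level totals: missing cells over all cells (rows x columns).
total_missing = sum(missing.values())
total_cells = N_ROWS * len(missing)
overall_pct = f"{total_missing / total_cells:.1%}"

for col, pct in missing_pct.items():
    print(f"{col}: {missing[col]} ({pct})")
print(f"Missing cells: {total_missing} of {total_cells} ({overall_pct})")
```

The column counts sum to 97 out of 189 cells, which agrees with the "Missing cells" rows in the dataset statistics.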
Reproduction
Analysis started | 2024-04-22 00:14:15.998427 |
---|---|
Analysis finished | 2024-04-22 00:14:16.688117 |
Duration | 0.69 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
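A report of this kind can be regenerated with the `ProfileReport` class from ydata-profiling. The sketch below is an assumption about how such a run might look, not the configuration actually used here (that is recorded in `config.json`). The `build_report` helper, the report title and the output path are hypothetical, and the metadata values are copied from the Dataset section.

```python
# Dataset metadata as shown in the "Dataset" section of this report.
DATASET_INFO = {
    "description": "건강보험 가입자의 주요질환 분포 현황",
    "author": "국민건강보험공단",
    "url": "https://www.vworld.kr/dtmk/dtmk_ntads_s002.do?dsId=30026",
}


def build_report(df, output_path="report.html"):
    """Profile a pandas DataFrame and write the HTML report (hypothetical helper)."""
    # Imported here so the optional dependency is only needed when profiling.
    from ydata_profiling import ProfileReport

    report = ProfileReport(
        df,
        title="Table definition profile",  # assumed title, not from the source
        dataset=DATASET_INFO,
    )
    report.to_file(output_path)
    return report
```

Calling `build_report(df)` with the loaded table would write `report.html`. How the source file was loaded into `df` is not recorded in this report.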
테이블정의서
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 1 |
---|---|
Missing (%) | 4.8% |
Memory size | 300.0 B |
Unnamed: 1
Text
MISSING
 
Distinct | 10 |
---|---|
Distinct (%) | 100.0% |
Missing | 11 |
Missing (%) | 52.4% |
Memory size | 300.0 B |
Value | Count | Frequency (%) |
컬럼id | 1 | |
coord_x_re | 1 | |
coord_y_re | 1 | |
coord_xy | 1 | |
sum_hp | 1 | |
sum_dib | 1 | |
sum_hyper | 1 | |
sum_cancer | 1 | |
sum_heart | 1 | |
sum_stroke | 1 |
Most occurring characters
Value | Count | Frequency (%) |
_ | 11 | |
R | 9 | |
O | 7 | 8.4% |
S | 7 | 8.4% |
U | 6 | 7.2% |
E | 6 | 7.2% |
M | 6 | 7.2% |
D | 5 | 6.0% |
C | 5 | 6.0% |
Y | 3 | 3.6% |
Other values (11) | 18 |
Most occurring categories
Value | Count | Frequency (%) |
Uppercase Letter | 70 | |
Connector Punctuation | 11 | 13.3% |
Other Letter | 2 | 2.4% |
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
R | 9 | |
O | 7 | |
S | 7 | |
U | 6 | |
E | 6 | |
M | 6 | |
D | 5 | 7.1% |
C | 5 | 7.1% |
Y | 3 | 4.3% |
H | 3 | 4.3% |
Other values (8) | 13 |
Other Letter
Value | Count | Frequency (%) |
컬 | 1 | |
럼 | 1 |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 11 |
Most occurring scripts
Value | Count | Frequency (%) |
Latin | 70 | |
Common | 11 | 13.3% |
Hangul | 2 | 2.4% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
R | 9 | |
O | 7 | |
S | 7 | |
U | 6 | |
E | 6 | |
M | 6 | |
D | 5 | 7.1% |
C | 5 | 7.1% |
Y | 3 | 4.3% |
H | 3 | 4.3% |
Other values (8) | 13 |
Hangul
Value | Count | Frequency (%) |
컬 | 1 | |
럼 | 1 |
Common
Value | Count | Frequency (%) |
_ | 11 |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 81 | |
Hangul | 2 | 2.4% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
_ | 11 | |
R | 9 | |
O | 7 | |
S | 7 | |
U | 6 | 7.4% |
E | 6 | 7.4% |
M | 6 | 7.4% |
D | 5 | 6.2% |
C | 5 | 6.2% |
Y | 3 | 3.7% |
Other values (9) | 16 |
Hangul
Value | Count | Frequency (%) |
컬 | 1 | |
럼 | 1 |
Unnamed: 2
Text
MISSING
 
Distinct | 14 |
---|---|
Distinct (%) | 100.0% |
Missing | 7 |
Missing (%) | 33.3% |
Memory size | 300.0 B |
Length
Max length | 31 |
---|---|
Median length | 21.5 |
Mean length | 16 |
Min length | 3 |
Characters and Unicode
Total characters | 224 |
---|---|
Distinct characters | 89 |
Distinct categories | 9 |
Distinct scripts | 3 |
Distinct blocks | 2 |
Unique
Unique | 14 |
---|---|
Unique (%) | 100.0% |
Sample
1st row | 김재안 |
---|---|
2nd row | 건강보험통계 |
3rd row | 건강보험 가입자의 주요질환 분포 현황 |
4th row | 컬럼명 |
5th row | X좌표의 200m단위 grid |
Value | Count | Frequency (%) |
환자수(상병기호 | 3 | 7.9% |
200m단위 | 3 | 7.9% |
grid | 3 | 7.9% |
e10~e14 | 1 | 2.6% |
고지혈증(이상지질혈증 | 1 | 2.6% |
e78 | 1 | 2.6% |
악성신생물 | 1 | 2.6% |
환자수(암환자 | 1 | 2.6% |
김재안 | 1 | 2.6% |
당뇨병 | 1 | 2.6% |
Other values (22) | 22 |
Most occurring characters
Value | Count | Frequency (%) |
(space) | 24 | 10.7% |
자 | 9 | 4.0% |
0 | 9 | 4.0% |
) | 8 | 3.6% |
환 | 8 | 3.6% |
( | 8 | 3.6% |
수 | 7 | 3.1% |
2 | 6 | 2.7% |
1 | 6 | 2.7% |
I | 6 | 2.7% |
Other values (79) | 133 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 120 | |
Decimal Number | 30 | 13.4% |
Space Separator | 24 | 10.7% |
Lowercase Letter | 15 | 6.7% |
Uppercase Letter | 14 | 6.2% |
Close Punctuation | 8 | 3.6% |
Open Punctuation | 8 | 3.6% |
Math Symbol | 4 | 1.8% |
Dash Punctuation | 1 | 0.4% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
자 | 9 | 7.5% |
환 | 8 | 6.7% |
수 | 7 | 5.8% |
상 | 5 | 4.2% |
병 | 5 | 4.2% |
의 | 4 | 3.3% |
기 | 4 | 3.3% |
호 | 4 | 3.3% |
좌 | 3 | 2.5% |
혈 | 3 | 2.5% |
Other values (54) | 68 |
Decimal Number
Value | Count | Frequency (%) |
0 | 9 | |
2 | 6 | |
1 | 6 | |
4 | 2 | 6.7% |
6 | 2 | 6.7% |
9 | 1 | 3.3% |
3 | 1 | 3.3% |
8 | 1 | 3.3% |
5 | 1 | 3.3% |
7 | 1 | 3.3% |
Uppercase Letter
Value | Count | Frequency (%) |
I | 6 | |
E | 3 | |
X | 2 | 14.3% |
Y | 2 | 14.3% |
V | 1 | 7.1% |
Lowercase Letter
Value | Count | Frequency (%) |
d | 3 | |
i | 3 | |
r | 3 | |
g | 3 | |
m | 3 |
Space Separator
Value | Count | Frequency (%) |
(space) | 24 |
Close Punctuation
Value | Count | Frequency (%) |
) | 8 |
Open Punctuation
Value | Count | Frequency (%) |
( | 8 |
Math Symbol
Value | Count | Frequency (%) |
~ | 4 |
Dash Punctuation
Value | Count | Frequency (%) |
- | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 120 | |
Common | 75 | |
Latin | 29 | 12.9% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
자 | 9 | 7.5% |
환 | 8 | 6.7% |
수 | 7 | 5.8% |
상 | 5 | 4.2% |
병 | 5 | 4.2% |
의 | 4 | 3.3% |
기 | 4 | 3.3% |
호 | 4 | 3.3% |
좌 | 3 | 2.5% |
혈 | 3 | 2.5% |
Other values (54) | 68 |
Common
Value | Count | Frequency (%) |
(space) | 24 | |
0 | 9 | 12.0% |
) | 8 | 10.7% |
( | 8 | 10.7% |
2 | 6 | 8.0% |
1 | 6 | 8.0% |
~ | 4 | 5.3% |
4 | 2 | 2.7% |
6 | 2 | 2.7% |
9 | 1 | 1.3% |
Other values (5) | 5 | 6.7% |
Latin
Value | Count | Frequency (%) |
I | 6 | |
d | 3 | |
i | 3 | |
r | 3 | |
g | 3 | |
m | 3 | |
E | 3 | |
X | 2 | 6.9% |
Y | 2 | 6.9% |
V | 1 | 3.4% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 120 | |
ASCII | 104 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
(space) | 24 | |
0 | 9 | 8.7% |
) | 8 | 7.7% |
( | 8 | 7.7% |
2 | 6 | 5.8% |
1 | 6 | 5.8% |
I | 6 | 5.8% |
~ | 4 | 3.8% |
d | 3 | 2.9% |
i | 3 | 2.9% |
Other values (15) | 27 |
Hangul
Value | Count | Frequency (%) |
자 | 9 | 7.5% |
환 | 8 | 6.7% |
수 | 7 | 5.8% |
상 | 5 | 4.2% |
병 | 5 | 4.2% |
의 | 4 | 3.3% |
기 | 4 | 3.3% |
호 | 4 | 3.3% |
좌 | 3 | 2.5% |
혈 | 3 | 2.5% |
Other values (54) | 68 |
Unnamed: 3
Categorical
Distinct | 5 |
---|---|
Distinct (%) | 23.8% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 300.0 B |
Length
Max length | 9 |
---|---|
Median length | 7 |
Mean length | 5.3333333 |
Min length | 2 |
Unique
Unique | 3 |
---|---|
Unique (%) | 14.3% |
Sample
1st row | <NA> |
---|---|
2nd row | 테이블ID |
3rd row | <NA> |
4th row | 타입 |
5th row | Numeric |
Common Values
Value | Count | Frequency (%) |
<NA> | 10 | |
Numeric | 8 | |
테이블ID | 1 | 4.8% |
타입 | 1 | 4.8% |
Character | 1 | 4.8% |
Unnamed: 4
Unsupported
MISSING
  REJECTED
  UNSUPPORTED
 
Missing | 10 |
---|---|
Missing (%) | 47.6% |
Memory size | 300.0 B |
Unnamed: 5
Boolean
CONSTANT
  MISSING
 
Distinct | 1 |
---|---|
Distinct (%) | 11.1% |
Missing | 12 |
Missing (%) | 57.1% |
Memory size | 174.0 B |
Value | Count | Frequency (%) |
True | 9 | |
(Missing) | 12 |
Unnamed: 6
Text
MISSING
 
Distinct | 3 |
---|---|
Distinct (%) | 100.0% |
Missing | 18 |
Missing (%) | 85.7% |
Memory size | 300.0 B |
Value | Count | Frequency (%) |
작성일 | 1 | |
테이블명 | 1 | |
pk/fk | 1 |
Most occurring characters
Value | Count | Frequency (%) |
K | 2 | |
작 | 1 | |
성 | 1 | |
일 | 1 | |
테 | 1 | |
이 | 1 | |
블 | 1 | |
명 | 1 | |
P | 1 | |
/ | 1 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 7 | |
Uppercase Letter | 4 | |
Other Punctuation | 1 | 8.3% |
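The category counts above follow Unicode general categories, which Python's standard `unicodedata` module also exposes. The sketch below reproduces the table for this column. It assumes the three cell values as they appear in the sample rows (`작성일`, `테이블명`, `PK/FK`). The two-letter codes `Lo`, `Lu` and `Po` correspond to Other Letter, Uppercase Letter and Other Punctuation.

```python
import unicodedata
from collections import Counter

# The three non-missing values of "Unnamed: 6", as shown in the sample rows.
values = ["작성일", "테이블명", "PK/FK"]

# Count every character by its Unicode general category.
category_counts = Counter(
    unicodedata.category(ch) for value in values for ch in value
)

total_chars = sum(category_counts.values())
for code, count in category_counts.most_common():
    print(f"{code}: {count} ({count / total_chars:.1%})")
```

This yields 7 `Lo`, 4 `Lu` and 1 `Po` character out of 12, so the single slash accounts for the 8.3% shown for Other Punctuation.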
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
작 | 1 | |
성 | 1 | |
일 | 1 | |
테 | 1 | |
이 | 1 | |
블 | 1 | |
명 | 1 |
Uppercase Letter
Value | Count | Frequency (%) |
K | 2 | |
P | 1 | |
F | 1 |
Other Punctuation
Value | Count | Frequency (%) |
/ | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 7 | |
Latin | 4 | |
Common | 1 | 8.3% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
작 | 1 | |
성 | 1 | |
일 | 1 | |
테 | 1 | |
이 | 1 | |
블 | 1 | |
명 | 1 |
Latin
Value | Count | Frequency (%) |
K | 2 | |
P | 1 | |
F | 1 |
Common
Value | Count | Frequency (%) |
/ | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 7 | |
ASCII | 5 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
K | 2 | |
P | 1 | |
/ | 1 | |
F | 1 |
Hangul
Value | Count | Frequency (%) |
작 | 1 | |
성 | 1 | |
일 | 1 | |
테 | 1 | |
이 | 1 | |
블 | 1 | |
명 | 1 |
Unnamed: 7
Text
MISSING
 
Distinct | 3 |
---|---|
Distinct (%) | 100.0% |
Missing | 18 |
Missing (%) | 85.7% |
Memory size | 300.0 B |
Value | Count | Frequency (%) |
2020.02.05 | 1 | |
전국민 | 1 | |
주요질환 | 1 | |
통계 | 1 | |
default | 1 |
Most occurring characters
Value | Count | Frequency (%) |
0 | 4 | 14.3% |
2 | 3 | 10.7% |
. | 2 | 7.1% |
(space) | 2 | 7.1% |
통 | 1 | 3.6% |
l | 1 | 3.6% |
u | 1 | 3.6% |
a | 1 | 3.6% |
f | 1 | 3.6% |
e | 1 | 3.6% |
Other values (11) | 11 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 9 | |
Decimal Number | 8 | |
Lowercase Letter | 6 | |
Other Punctuation | 2 | 7.1% |
Space Separator | 2 | 7.1% |
Uppercase Letter | 1 | 3.6% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
통 | 1 | |
계 | 1 | |
질 | 1 | |
환 | 1 | |
요 | 1 | |
주 | 1 | |
민 | 1 | |
국 | 1 | |
전 | 1 |
Lowercase Letter
Value | Count | Frequency (%) |
l | 1 | |
u | 1 | |
a | 1 | |
f | 1 | |
e | 1 | |
t | 1 |
Decimal Number
Value | Count | Frequency (%) |
0 | 4 | |
2 | 3 | |
5 | 1 | 12.5% |
Other Punctuation
Value | Count | Frequency (%) |
. | 2 |
Space Separator
Value | Count | Frequency (%) |
(space) | 2 |
Uppercase Letter
Value | Count | Frequency (%) |
D | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 12 | |
Hangul | 9 | |
Latin | 7 |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
통 | 1 | |
계 | 1 | |
질 | 1 | |
환 | 1 | |
요 | 1 | |
주 | 1 | |
민 | 1 | |
국 | 1 | |
전 | 1 |
Latin
Value | Count | Frequency (%) |
l | 1 | |
u | 1 | |
a | 1 | |
f | 1 | |
e | 1 | |
D | 1 | |
t | 1 |
Common
Value | Count | Frequency (%) |
0 | 4 | |
2 | 3 | |
. | 2 | |
(space) | 2 | |
5 | 1 | 8.3% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 19 | |
Hangul | 9 |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 4 | |
2 | 3 | |
. | 2 | |
(space) | 2 | |
l | 1 | 5.3% |
u | 1 | 5.3% |
a | 1 | 5.3% |
f | 1 | 5.3% |
e | 1 | 5.3% |
D | 1 | 5.3% |
Other values (2) | 2 |
Hangul
Value | Count | Frequency (%) |
통 | 1 | |
계 | 1 | |
질 | 1 | |
환 | 1 | |
요 | 1 | |
주 | 1 | |
민 | 1 | |
국 | 1 | |
전 | 1 |
Unnamed: 8
Text
CONSTANT
  MISSING
 
Distinct | 1 |
---|---|
Distinct (%) | 100.0% |
Missing | 20 |
Missing (%) | 95.2% |
Memory size | 300.0 B |
Value | Count | Frequency (%) |
참조테이블명/비고 | 1 |
Most occurring characters
Value | Count | Frequency (%) |
참 | 1 | |
조 | 1 | |
테 | 1 | |
이 | 1 | |
블 | 1 | |
명 | 1 | |
/ | 1 | |
비 | 1 | |
고 | 1 |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 8 | |
Other Punctuation | 1 | 11.1% |
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
참 | 1 | |
조 | 1 | |
테 | 1 | |
이 | 1 | |
블 | 1 | |
명 | 1 | |
비 | 1 | |
고 | 1 |
Other Punctuation
Value | Count | Frequency (%) |
/ | 1 |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 8 | |
Common | 1 | 11.1% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
참 | 1 | |
조 | 1 | |
테 | 1 | |
이 | 1 | |
블 | 1 | |
명 | 1 | |
비 | 1 | |
고 | 1 |
Common
Value | Count | Frequency (%) |
/ | 1 |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 8 | |
ASCII | 1 | 11.1% |
Most frequent character per block
Hangul
Value | Count | Frequency (%) |
참 | 1 | |
조 | 1 | |
테 | 1 | |
이 | 1 | |
블 | 1 | |
명 | 1 | |
비 | 1 | |
고 | 1 |
ASCII
Value | Count | Frequency (%) |
/ | 1 |
Correlations
Variable | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 6 | Unnamed: 7 |
---|---|---|---|---|---|
Unnamed: 1 | 1.000 | 1.000 | 1.000 | NaN | NaN |
Unnamed: 2 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
Unnamed: 3 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 |
Unnamed: 6 | NaN | 1.000 | 0.000 | 1.000 | 1.000 |
Unnamed: 7 | NaN | 1.000 | 0.000 | 1.000 | 1.000 |
First rows
Row | 테이블정의서 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 |
---|---|---|---|---|---|---|---|---|---|
0 | 작성자 | <NA> | 김재안 | <NA> | NaN | <NA> | 작성일 | 2020.02.05 | <NA> |
1 | 주제영역명 | <NA> | 건강보험통계 | 테이블ID | NHIS_2018_SICK | <NA> | 테이블명 | 전국민 주요질환 통계 | <NA> |
2 | 테이블설명 | <NA> | 건강보험 가입자의 주요질환 분포 현황 | <NA> | NaN | <NA> | <NA> | <NA> | <NA> |
3 | No | 컬럼ID | 컬럼명 | 타입 | 길이(Byte) | <NA> | PK/FK | Default | 참조테이블명/비고 |
4 | 1 | COORD_X_RE | X좌표의 200m단위 grid | Numeric | 8 | Y | <NA> | <NA> | <NA> |
5 | 2 | COORD_Y_RE | Y좌표의 200m단위 grid | Numeric | 8 | Y | <NA> | <NA> | <NA> |
6 | 3 | COORD_XY | X-Y좌표의 200m단위 grid | Character | 22 | Y | <NA> | <NA> | <NA> |
7 | 4 | SUM_HP | 고혈압 환자 수(상병기호 I10~I15) | Numeric | 8 | Y | <NA> | <NA> | <NA> |
8 | 5 | SUM_DIB | 당뇨병 환자수(상병기호 E10~E14) | Numeric | 8 | Y | <NA> | <NA> | <NA> |
9 | 6 | SUM_HYPER | 고지혈증(이상지질혈증) 환자수(상병기호 E78) | Numeric | 8 | Y | <NA> | <NA> | <NA> |
Last rows
Row | 테이블정의서 | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 |
---|---|---|---|---|---|---|---|---|---|
11 | 8 | SUM_HEART | 심근경색 환자수(상병기호 I21~I22) | Numeric | 8 | Y | <NA> | <NA> | <NA> |
12 | 9 | SUM_STROKE | 뇌졸중 환자수(I60~I64) | Numeric | 8 | Y | <NA> | <NA> | <NA> |
13 | 10 | <NA> | <NA> | <NA> | NaN | <NA> | <NA> | <NA> | <NA> |
14 | 11 | <NA> | <NA> | <NA> | NaN | <NA> | <NA> | <NA> | <NA> |
15 | 12 | <NA> | <NA> | <NA> | NaN | <NA> | <NA> | <NA> | <NA> |
16 | 13 | <NA> | <NA> | <NA> | NaN | <NA> | <NA> | <NA> | <NA> |
17 | 14 | <NA> | <NA> | <NA> | NaN | <NA> | <NA> | <NA> | <NA> |
18 | 인덱스명 | <NA> | 인덱스키 | <NA> | NaN | <NA> | <NA> | <NA> | <NA> |
19 | NaN | <NA> | <NA> | <NA> | NaN | <NA> | <NA> | <NA> | <NA> |
20 | 업무규칙 | <NA> | <NA> | <NA> | NaN | <NA> | <NA> | <NA> | <NA> |
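The sample rows suggest why most columns are typed as Text with many missing values: the sheet is a table-definition form whose real header sits in row 3 (`No`, `컬럼ID` for column ID, `컬럼명` for column name, `타입` for type, `길이(Byte)` for length), followed by one row per column of `NHIS_2018_SICK`. The following is a minimal sketch of pulling those definitions out, under that reading of the layout. The `extract_column_specs` helper is hypothetical, and the input is a handful of rows copied from the sample with `<NA>`/`NaN` written as `None`.

```python
from typing import Optional

Row = list[Optional[str]]


def extract_column_specs(rows: list[Row], id_label: str = "컬럼ID") -> list[dict[str, str]]:
    """Collect the column definitions that follow the header row containing id_label."""
    header_idx = next(i for i, row in enumerate(rows) if id_label in row)
    header = rows[header_idx]
    id_pos = header.index(id_label)
    specs = []
    for row in rows[header_idx + 1:]:
        if row[id_pos] is None:  # numbered but empty slot: definitions end here
            break
        # Keep only cells that have both a header label and a value, so the
        # unlabeled "Y" flag column (Unnamed: 5) is skipped.
        specs.append({name: value for name, value in zip(header, row)
                      if name is not None and value is not None})
    return specs


# Rows 0, 3, 4, 6 and 13 from the sample tables in this report.
SAMPLE_ROWS: list[Row] = [
    ["작성자", None, "김재안", None, None, None, "작성일", "2020.02.05", None],
    ["No", "컬럼ID", "컬럼명", "타입", "길이(Byte)", None, "PK/FK", "Default", "참조테이블명/비고"],
    ["1", "COORD_X_RE", "X좌표의 200m단위 grid", "Numeric", "8", "Y", None, None, None],
    ["3", "COORD_XY", "X-Y좌표의 200m단위 grid", "Character", "22", "Y", None, None, None],
    ["10", None, None, None, None, None, None, None, None],
]

specs = extract_column_specs(SAMPLE_ROWS)
for spec in specs:
    print(spec["컬럼ID"], spec["타입"], spec["길이(Byte)"])
```

Profiling the extracted definitions, rather than the raw sheet, would avoid the form's title rows and blank slots inflating the missing-value and duplicate counts.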
Most frequently occurring
Row | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | # duplicates |
---|---|---|---|---|---|---|---|---|
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 7 |