Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 400.4 KiB
Average record size in memory: 41.0 B
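
The average record size follows from the two figures above it: total memory divided by the number of observations. A minimal sketch of that arithmetic, assuming the report's KiB means 1024 bytes (the variable names are illustrative, not taken from the report):

```python
# Figures copied from the dataset statistics above.
total_size_kib = 400.4
n_observations = 10_000

# Assumption: 1 KiB = 1024 bytes.
total_bytes = total_size_kib * 1024
avg_record_bytes = total_bytes / n_observations

print(f"{avg_record_bytes:.1f} B per record")
```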

Variable types

Numeric: 1
Text: 3

Dataset

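Description: A table that manages disease information for the National Merit Recipient Eligibility Verification Service (국가유공자자격확인서비스); it stores the disease information codes of patients.
URL: https://www.data.go.kr/data/15116493/fileData.do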

Alerts

순번 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 11:46:02.899494
Analysis finished: 2023-12-12 11:46:04.208752
Duration: 1.31 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
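
A report of this kind can be regenerated from the source file with ydata-profiling. The following is a minimal sketch, not the exact configuration behind this report: the function name `build_profile`, the file names, the title, and the default encoding are all assumptions, while `ProfileReport` and `to_file` are the library's own entry points.

```python
def build_profile(csv_path: str, html_path: str, encoding: str = "utf-8") -> None:
    """Profile a CSV file and write the HTML report to html_path."""
    # Imports are deferred so this module still loads where the
    # third-party packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the file is a plain CSV; pass another encoding
    # if the file is not UTF-8.
    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Overview")
    report.to_file(html_path)
```

A call such as `build_profile("disease_codes.csv", "report.html")` (hypothetical file names) would write the report next to the script.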

Variables

순번
Real number (ℝ)

UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14691.62
Minimum: 1
Maximum: 29279
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1448.95
Q1: 7355
Median: 14747
Q3: 21982.75
95-th percentile: 27835.05
Maximum: 29279
Range: 29278
Interquartile range (IQR): 14627.75
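
The range and IQR above are simple differences (29279 − 1 = 29278 and 21982.75 − 7355 = 14627.75), and the fractional quartiles suggest linear interpolation between neighbouring values. A small sketch of such a summary using only the standard library; the interpolation method is an assumption about how the report was computed, and `quantile_summary` is an illustrative name:

```python
import statistics

def quantile_summary(values):
    """Return min, Q1, median, Q3, max, range and IQR for a list of numbers."""
    ordered = sorted(values)
    # method="inclusive" interpolates linearly between neighbouring
    # order statistics, treating the data as the full population.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": ordered[-1],
        "range": ordered[-1] - ordered[0],
        "iqr": q3 - q1,
    }

summary = quantile_summary([1, 2, 3, 4])
print(summary)
```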

Descriptive statistics

Standard deviation: 8453.2642
Coefficient of variation (CV): 0.57538001
Kurtosis: -1.1949849
Mean: 14691.62
Median Absolute Deviation (MAD): 7325.5
Skewness: -0.014831689
Sum: 1.469162 × 10^8
Variance: 71457676
Monotonicity: Not monotonic
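
Several of these figures are tied together by definition: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. A short consistency check on the reported values (the variable names are illustrative; small differences come from rounding in the report):

```python
import math

# Figures copied from the 순번 statistics above.
n = 10_000
mean = 14691.62
std = 8453.2642

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the square of the standard deviation
total = mean * n       # sum of all values

print(f"CV={cv:.6f} variance={variance:.0f} sum={total:.6e}")
```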
Histogram with fixed size bins (bins=50)
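
A histogram with fixed-size bins divides the span from minimum to maximum into equal-width intervals and counts the values in each. A minimal sketch of that binning, assuming the last bin includes the maximum value; `fixed_width_histogram` is an illustrative helper, not the plotting code used for the report:

```python
def fixed_width_histogram(values, bins=50):
    """Count values into `bins` equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        if width > 0:
            # The maximum would fall at index `bins`; clamp it into the last bin.
            idx = min(int((v - lo) / width), bins - 1)
        else:
            idx = 0  # all values identical
        counts[idx] += 1
    return counts

counts = fixed_width_histogram([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], bins=3)
print(counts)
```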
Common values

Value | Count | Frequency (%)
24463 | 1 | < 0.1%
19394 | 1 | < 0.1%
15103 | 1 | < 0.1%
3904 | 1 | < 0.1%
28951 | 1 | < 0.1%
14913 | 1 | < 0.1%
3712 | 1 | < 0.1%
23235 | 1 | < 0.1%
5734 | 1 | < 0.1%
13749 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | < 0.1%
11 | 1 | < 0.1%
13 | 1 | < 0.1%
15 | 1 | < 0.1%
19 | 1 | < 0.1%
20 | 1 | < 0.1%
22 | 1 | < 0.1%
24 | 1 | < 0.1%
27 | 1 | < 0.1%
31 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
29279 | 1 | < 0.1%
29278 | 1 | < 0.1%
29266 | 1 | < 0.1%
29264 | 1 | < 0.1%
29261 | 1 | < 0.1%
29256 | 1 | < 0.1%
29253 | 1 | < 0.1%
29250 | 1 | < 0.1%
29249 | 1 | < 0.1%
29248 | 1 | < 0.1%

질병코드
Text

Distinct: 7897
Distinct (%): 79.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 7
Median length: 5
Mean length: 5.019
Min length: 3
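
These length statistics are computed over the character count of each value. A small sketch using the five sample codes the report lists for the disease-code column, assuming length means the number of Unicode code points as returned by `len` (the variable names are illustrative):

```python
import statistics

# The five sample values the report shows for the disease-code column.
codes = ["W36.0", "M73.00*", "Z65.2", "S74.1", "F14.6"]
lengths = [len(code) for code in codes]

length_stats = {
    "max": max(lengths),
    "median": statistics.median(lengths),
    "mean": statistics.mean(lengths),
    "min": min(lengths),
    "total_characters": sum(lengths),
}
print(length_stats)
```

The same relationship holds for the full column: a mean length of 5.019 over 10000 values gives the 50190 total characters reported for it.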

Characters and Unicode

Total characters: 50190
Distinct characters: 41
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
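
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories over the five sample codes from the report; `category_counts` is an illustrative helper and not necessarily how the report computes its figures. The two-letter codes correspond to the names used in the tables: `Lu` is Uppercase Letter, `Nd` is Decimal Number, `Po` is Other Punctuation.

```python
import unicodedata
from collections import Counter

def category_counts(strings):
    """Tally Unicode general categories (e.g. 'Lu', 'Nd') over all characters."""
    return Counter(unicodedata.category(ch) for s in strings for ch in s)

sample = ["W36.0", "M73.00*", "Z65.2", "S74.1", "F14.6"]
counts = category_counts(sample)
print(counts)
```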

Unique

Unique: 6518
Unique (%): 65.2%
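
The report distinguishes distinct values (how many different values occur) from unique values (how many occur exactly once). A minimal sketch of both counts on a made-up list of codes; the list and the helper name `distinct_and_unique` are hypothetical:

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Hypothetical toy data, not taken from the dataset.
codes = ["E11.4", "E11.4", "E11.0", "T21", "T21", "A09.0"]
distinct, unique = distinct_and_unique(codes)
print(distinct, unique)
```

Dividing each count by the number of observations gives the percentages shown in the report, for example 6518 / 10000 = 65.2%.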

Sample

1st row: W36.0
2nd row: M73.00*
3rd row: Z65.2
4th row: S74.1
5th row: F14.6
Value | Count | Frequency (%)
e11.4 | 18 | 0.2%
e11.0 | 12 | 0.1%
e10.2 | 11 | 0.1%
a09.0 | 11 | 0.1%
e11.1 | 10 | 0.1%
e11.2 | 10 | 0.1%
e11.6 | 9 | 0.1%
e10.0 | 9 | 0.1%
e12.0 | 8 | 0.1%
t21 | 8 | 0.1%
Other values (7887) | 9894 | 98.9%

Most occurring characters

ValueCountFrequency (%)
. 8654
17.2%
0 4293
 
8.6%
1 3974
 
7.9%
2 3335
 
6.6%
8 3199
 
6.4%
4 3037
 
6.1%
3 2933
 
5.8%
9 2693
 
5.4%
5 2630
 
5.2%
6 2399
 
4.8%
Other values (31) 13043
26.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 30657 | 61.1%
Uppercase Letter | 10127 | 20.2%
Other Punctuation | 9122 | 18.2%
Math Symbol | 156 | 0.3%
Dash Punctuation | 127 | 0.3%
Space Separator | 1 | < 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 1856
18.3%
S 533
 
5.3%
Q 473
 
4.7%
X 466
 
4.6%
T 439
 
4.3%
E 436
 
4.3%
W 410
 
4.0%
K 392
 
3.9%
F 380
 
3.8%
V 367
 
3.6%
Other values (16) 4375
43.2%
Decimal Number
ValueCountFrequency (%)
0 4293
14.0%
1 3974
13.0%
2 3335
10.9%
8 3199
10.4%
4 3037
9.9%
3 2933
9.6%
9 2693
8.8%
5 2630
8.6%
6 2399
7.8%
7 2164
7.1%
Other Punctuation
ValueCountFrequency (%)
. 8654
94.9%
* 468
 
5.1%
Math Symbol
ValueCountFrequency (%)
+ 156
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 127
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 40063 | 79.8%
Latin | 10127 | 20.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 1856
18.3%
S 533
 
5.3%
Q 473
 
4.7%
X 466
 
4.6%
T 439
 
4.3%
E 436
 
4.3%
W 410
 
4.0%
K 392
 
3.9%
F 380
 
3.8%
V 367
 
3.6%
Other values (16) 4375
43.2%
Common
ValueCountFrequency (%)
. 8654
21.6%
0 4293
10.7%
1 3974
9.9%
2 3335
 
8.3%
8 3199
 
8.0%
4 3037
 
7.6%
3 2933
 
7.3%
9 2693
 
6.7%
5 2630
 
6.6%
6 2399
 
6.0%
Other values (5) 2916
 
7.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 50190 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 8654
17.2%
0 4293
 
8.6%
1 3974
 
7.9%
2 3335
 
6.6%
8 3199
 
6.4%
4 3037
 
6.1%
3 2933
 
5.8%
9 2693
 
5.4%
5 2630
 
5.2%
6 2399
 
4.8%
Other values (31) 13043
26.0%

한글명
Text

Distinct: 9958
Distinct (%): 99.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 99
Median length: 52
Mean length: 18.0425
Min length: 2

Characters and Unicode

Total characters: 180425
Distinct characters: 842
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9918
Unique (%): 99.2%

Sample

1st row: 가스통 폭발 및 파열, 주거지
2nd row: 임균성 윤활낭염,다발 부위
3rd row: 감옥에서 석방과 관련된 문제
4th row: 엉덩관절 및 넓적다리 부위에서의 넓적다리신경의 손상
5th row: 코카인 유발 기억상실성 장애
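Value | Count | Frequency (%)
(empty) | 2316 | 5.1%
기타 (other) | 1763 | 3.9%
상세불명의 (unspecified) | 1505 | 3.3%
부위 (site) | 872 | 1.9%
또는 (or) | 727 | 1.6%
의한 (due to) | 719 | 1.6%
장애 (disorder) | 430 | 0.9%
동반한 (accompanied by) | 406 | 0.9%
명시된 (specified) | 351 | 0.8%
노출 (exposure) | 319 | 0.7%
Other values (7912) | 35994 | 79.3%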
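
A word-frequency table like this one can be approximated by splitting each value on whitespace and counting the resulting tokens. The sketch below does that for the five sample rows of this column; it assumes simple lowercased whitespace tokenization, which may differ from what the profiling tool does (punctuation stays attached to tokens here), and `word_frequencies` is an illustrative name:

```python
from collections import Counter

def word_frequencies(texts):
    """Return (word, count, percent) tuples, most frequent first."""
    words = [word for text in texts for word in text.lower().split()]
    counts = Counter(words)
    total = sum(counts.values())
    return [(word, n, 100 * n / total) for word, n in counts.most_common()]

# The five sample rows the report shows for this column.
rows = [
    "가스통 폭발 및 파열, 주거지",
    "임균성 윤활낭염,다발 부위",
    "감옥에서 석방과 관련된 문제",
    "엉덩관절 및 넓적다리 부위에서의 넓적다리신경의 손상",
    "코카인 유발 기억상실성 장애",
]
table = word_frequencies(rows)
print(table[0])
```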

Most occurring characters

ValueCountFrequency (%)
35893
 
19.9%
6926
 
3.8%
4180
 
2.3%
3287
 
1.8%
, 3211
 
1.8%
3183
 
1.8%
2409
 
1.3%
2322
 
1.3%
2253
 
1.2%
2250
 
1.2%
Other values (832) 114511
63.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 137290 | 76.1%
Space Separator | 35893 | 19.9%
Other Punctuation | 3364 | 1.9%
Open Punctuation | 1081 | 0.6%
Close Punctuation | 1079 | 0.6%
Decimal Number | 834 | 0.5%
Uppercase Letter | 454 | 0.3%
Dash Punctuation | 383 | 0.2%
Lowercase Letter | 27 | < 0.1%
Letter Number | 16 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
6926
 
5.0%
4180
 
3.0%
3287
 
2.4%
3183
 
2.3%
2409
 
1.8%
2322
 
1.7%
2253
 
1.6%
2250
 
1.6%
2156
 
1.6%
1987
 
1.4%
Other values (771) 106337
77.5%
Uppercase Letter
ValueCountFrequency (%)
S 67
14.8%
T 63
13.9%
I 59
13.0%
O 48
10.6%
C 38
8.4%
B 24
 
5.3%
H 23
 
5.1%
N 21
 
4.6%
A 19
 
4.2%
X 16
 
3.5%
Other values (13) 76
16.7%
Decimal Number
ValueCountFrequency (%)
0 172
20.6%
1 108
12.9%
2 105
12.6%
3 90
10.8%
9 80
9.6%
8 72
8.6%
4 68
 
8.2%
5 54
 
6.5%
6 43
 
5.2%
7 42
 
5.0%
Lowercase Letter
ValueCountFrequency (%)
m 10
37.0%
g 5
18.5%
l 5
18.5%
o 2
 
7.4%
a 2
 
7.4%
i 1
 
3.7%
b 1
 
3.7%
s 1
 
3.7%
Other Punctuation
ValueCountFrequency (%)
, 3211
95.5%
. 135
 
4.0%
/ 8
 
0.2%
% 7
 
0.2%
? 2
 
0.1%
1
 
< 0.1%
Letter Number
ValueCountFrequency (%)
12
75.0%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
Open Punctuation
ValueCountFrequency (%)
( 953
88.2%
[ 127
 
11.7%
1
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 951
88.1%
] 127
 
11.8%
1
 
0.1%
Space Separator
ValueCountFrequency (%)
35893
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 383
100.0%
Math Symbol
ValueCountFrequency (%)
+ 4
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 137290 | 76.1%
Common | 42638 | 23.6%
Latin | 497 | 0.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
6926
 
5.0%
4180
 
3.0%
3287
 
2.4%
3183
 
2.3%
2409
 
1.8%
2322
 
1.7%
2253
 
1.6%
2250
 
1.6%
2156
 
1.6%
1987
 
1.4%
Other values (771) 106337
77.5%
Latin
ValueCountFrequency (%)
S 67
13.5%
T 63
12.7%
I 59
11.9%
O 48
9.7%
C 38
 
7.6%
B 24
 
4.8%
H 23
 
4.6%
N 21
 
4.2%
A 19
 
3.8%
X 16
 
3.2%
Other values (26) 119
23.9%
Common
ValueCountFrequency (%)
35893
84.2%
, 3211
 
7.5%
( 953
 
2.2%
) 951
 
2.2%
- 383
 
0.9%
0 172
 
0.4%
. 135
 
0.3%
] 127
 
0.3%
[ 127
 
0.3%
1 108
 
0.3%
Other values (15) 578
 
1.4%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 137288 | 76.1%
ASCII | 43116 | 23.9%
Number Forms | 16 | < 0.1%
None | 3 | < 0.1%
Compat Jamo | 2 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
35893
83.2%
, 3211
 
7.4%
( 953
 
2.2%
) 951
 
2.2%
- 383
 
0.9%
0 172
 
0.4%
. 135
 
0.3%
] 127
 
0.3%
[ 127
 
0.3%
1 108
 
0.3%
Other values (43) 1056
 
2.4%
Hangul
ValueCountFrequency (%)
6926
 
5.0%
4180
 
3.0%
3287
 
2.4%
3183
 
2.3%
2409
 
1.8%
2322
 
1.7%
2253
 
1.6%
2250
 
1.6%
2156
 
1.6%
1987
 
1.4%
Other values (770) 106335
77.5%
Number Forms
ValueCountFrequency (%)
12
75.0%
1
 
6.2%
1
 
6.2%
1
 
6.2%
1
 
6.2%
Compat Jamo
ValueCountFrequency (%)
2
100.0%
None
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

영문명
Text

Distinct: 9992
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 187
Median length: 138
Mean length: 49.5964
Min length: 4

Characters and Unicode

Total characters: 495964
Distinct characters: 91
Distinct categories: 14
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9984
Unique (%): 99.8%

Sample

1st row: Explosion and rupture of gas cylinder, home
2nd row: Gonococcal bursitis, multiple sites (A54.4+)
3rd row: Problems related to release from prison
4th row: Injury of femoral nerves at hip and thigh level
5th row: Amnestic disorder, cocaine-induced
Value | Count | Frequency (%)
of | 3778 | 6.0%
and | 2576 | 4.1%
other | 1893 | 3.0%
in | 1375 | 2.2%
unspecified | 1220 | 1.9%
with | 912 | 1.4%
to | 889 | 1.4%
or | 833 | 1.3%
joints | 802 | 1.3%
by | 575 | 0.9%
Other values (6287) | 48393 | 76.5%

Most occurring characters

ValueCountFrequency (%)
53326
 
10.8%
e 42354
 
8.5%
i 40078
 
8.1%
o 33775
 
6.8%
n 32171
 
6.5%
a 32060
 
6.5%
t 30920
 
6.2%
r 29065
 
5.9%
s 28744
 
5.8%
l 19704
 
4.0%
Other values (81) 153767
31.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 410978 | 82.9%
Space Separator | 53326 | 10.8%
Uppercase Letter | 14160 | 2.9%
Other Punctuation | 8134 | 1.6%
Decimal Number | 2740 | 0.6%
Open Punctuation | 2339 | 0.5%
Close Punctuation | 2338 | 0.5%
Dash Punctuation | 1548 | 0.3%
Math Symbol | 368 | 0.1%
Letter Number | 21 | < 0.1%
Other values (4) | 12 | < 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 42354
10.3%
i 40078
 
9.8%
o 33775
 
8.2%
n 32171
 
7.8%
a 32060
 
7.8%
t 30920
 
7.5%
r 29065
 
7.1%
s 28744
 
7.0%
l 19704
 
4.8%
c 19411
 
4.7%
Other values (18) 102696
25.0%
Uppercase Letter
ValueCountFrequency (%)
O 1930
13.6%
S 1398
9.9%
C 1377
9.7%
A 1157
 
8.2%
N 1110
 
7.8%
P 1067
 
7.5%
I 800
 
5.6%
M 722
 
5.1%
E 647
 
4.6%
D 579
 
4.1%
Other values (16) 3373
23.8%
Decimal Number
ValueCountFrequency (%)
0 584
21.3%
9 279
10.2%
3 273
10.0%
5 270
9.9%
2 267
9.7%
1 261
9.5%
8 248
9.1%
4 236
8.6%
6 174
 
6.4%
7 148
 
5.4%
Other Punctuation
ValueCountFrequency (%)
, 7134
87.7%
. 582
 
7.2%
' 206
 
2.5%
* 188
 
2.3%
% 7
 
0.1%
/ 7
 
0.1%
; 5
 
0.1%
& 5
 
0.1%
Letter Number
ValueCountFrequency (%)
6
28.6%
6
28.6%
3
14.3%
2
 
9.5%
2
 
9.5%
1
 
4.8%
1
 
4.8%
Open Punctuation
ValueCountFrequency (%)
( 2140
91.5%
[ 199
 
8.5%
Close Punctuation
ValueCountFrequency (%)
) 2139
91.5%
] 199
 
8.5%
Other Letter
ValueCountFrequency (%)
2
50.0%
2
50.0%
Space Separator
ValueCountFrequency (%)
53326
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1548
100.0%
Math Symbol
ValueCountFrequency (%)
+ 368
100.0%
Control
ValueCountFrequency (%)
4
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 3
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 425152 | 85.7%
Common | 70801 | 14.3%
Greek | 7 | < 0.1%
Hangul | 4 | < 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 42354
 
10.0%
i 40078
 
9.4%
o 33775
 
7.9%
n 32171
 
7.6%
a 32060
 
7.5%
t 30920
 
7.3%
r 29065
 
6.8%
s 28744
 
6.8%
l 19704
 
4.6%
c 19411
 
4.6%
Other values (49) 116870
27.5%
Common
ValueCountFrequency (%)
53326
75.3%
, 7134
 
10.1%
( 2140
 
3.0%
) 2139
 
3.0%
- 1548
 
2.2%
0 584
 
0.8%
. 582
 
0.8%
+ 368
 
0.5%
9 279
 
0.4%
3 273
 
0.4%
Other values (18) 2428
 
3.4%
Greek
ValueCountFrequency (%)
β 4
57.1%
α 3
42.9%
Hangul
ValueCountFrequency (%)
2
50.0%
2
50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 495931 | > 99.9%
Number Forms | 21 | < 0.1%
None | 7 | < 0.1%
Hangul | 4 | < 0.1%
Punctuation | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
53326
 
10.8%
e 42354
 
8.5%
i 40078
 
8.1%
o 33775
 
6.8%
n 32171
 
6.5%
a 32060
 
6.5%
t 30920
 
6.2%
r 29065
 
5.9%
s 28744
 
5.8%
l 19704
 
4.0%
Other values (69) 153734
31.0%
Number Forms
ValueCountFrequency (%)
6
28.6%
6
28.6%
3
14.3%
2
 
9.5%
2
 
9.5%
1
 
4.8%
1
 
4.8%
None
ValueCountFrequency (%)
β 4
57.1%
α 3
42.9%
Hangul
ValueCountFrequency (%)
2
50.0%
2
50.0%
Punctuation
ValueCountFrequency (%)
1
100.0%

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
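
Both nullity views are built from a per-column count of missing cells. A minimal sketch of that count over rows held as dictionaries, assuming that `None` and the empty string both count as missing (the report itself may use a different rule); `missing_by_column` is an illustrative helper, and the two rows are taken from the report's sample:

```python
def missing_by_column(rows, columns):
    """Count missing cells (None or empty string) per column."""
    missing = {col: 0 for col in columns}
    for row in rows:
        for col in columns:
            value = row.get(col)
            if value is None or value == "":
                missing[col] += 1
    return missing

columns = ["순번", "질병코드", "한글명", "영문명"]
rows = [
    {"순번": 24463, "질병코드": "W36.0", "한글명": "가스통 폭발 및 파열, 주거지",
     "영문명": "Explosion and rupture of gas cylinder, home"},
    {"순번": 5101, "질병코드": "F14.6", "한글명": "코카인 유발 기억상실성 장애",
     "영문명": "Amnestic disorder, cocaine-induced"},
]
missing = missing_by_column(rows, columns)
print(missing)
```

For this dataset every count is zero, which matches the "Missing cells: 0" figure in the overview.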

Sample

Column names: 순번 (sequence number), 질병코드 (disease code), 한글명 (Korean name), 영문명 (English name).

(index) | 순번 | 질병코드 | 한글명 | 영문명
24462 | 24463 | W36.0 | 가스통 폭발 및 파열, 주거지 | Explosion and rupture of gas cylinder, home
13635 | 13636 | M73.00* | 임균성 윤활낭염,다발 부위 | Gonococcal bursitis, multiple sites (A54.4+)
26531 | 26532 | Z65.2 | 감옥에서 석방과 관련된 문제 | Problems related to release from prison
21026 | 21027 | S74.1 | 엉덩관절 및 넓적다리 부위에서의 넓적다리신경의 손상 | Injury of femoral nerves at hip and thigh level
5100 | 5101 | F14.6 | 코카인 유발 기억상실성 장애 | Amnestic disorder, cocaine-induced
13963 | 13964 | N21.9 | 상세불명의 하부 요로 결석 | Calculus of lower urinary tract, unspecified
9394 | 9395 | K04.6 | 굴이 있는 치아치조 고름집(농양) | Dentoalveolar abscess with sinus
8261 | 8262 | I51.8 | 전층심장염(급성, 만성) | Pancarditis (acute, chronic)
9060 | 9061 | J38.4 | 성문의 부종 | Edema of glottis
23016 | 23017 | T52.0 | 석유 휘발유의 중독효과 | Toxic effect of Petroleum spirits

(index) | 순번 | 질병코드 | 한글명 | 영문명
3816 | 3817 | E11 | 청년의 인슐린-비의존 당뇨병 | Non-insulin-dependent diabetes of the young
10872 | 10873 | L27.1 | 약물 및 약제에 의한 국소 피부발진 | Localized skin eruption due to drugs and medicaments
18374 | 18375 | S62.00 | 손의 발배뼈의 폐쇄성 골절 | Closed fracture of navicular[scaphoid] bone of hand
1736 | 1737 | B25.8 | 기타 거대세포바이러스 질환 | Other cytomegaloviral diseases
8129 | 8130 | I42.9 | 상세불명의 심장근육병증 | Cardiomyopathy, unspecified
11598 | 11599 | M47.89 | 척수병증 또는 신경뿌리병증이 없는 흉추강직, 상세불명의 부위 | Thoracic spondylosis without myelopathy or radiculopathy, site unspecified
9135 | 9136 | J45.0 | 천식을 동반한 고초 열 | Hay fever with asthma
19514 | 19515 | O62.2 | 수축 불량 | Poor contractions
5361 | 5362 | F31 | 조울성 반응 | Manic-depressive reaction
23114 | 23115 | V04 | 대형화물차 또는 버스와 충돌로 다친 보행자 | Pedestrian injured in collision with heavy transport vehicle or bus