Overview

Dataset statistics

Number of variables: 14
Number of observations: 536
Missing cells: 942
Missing cells (%): 12.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 59.3 KiB
Average record size in memory: 113.2 B

Variable types

Numeric: 1
Categorical: 6
Text: 6
Boolean: 1

Dataset

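Description: Data on railway-related institutions from the Railway Qualification Management System of the Korea Transportation Safety Authority (한국교통안전공단), including physical examination data, aptitude test data, and education and training institution data.
Author: 한국교통안전공단 (Korea Transportation Safety Authority)
URL: https://www.data.go.kr/data/15064519/fileData.do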

Alerts

대분류 has constant value "유관기관 정보" (Constant)
영문 테이블명 has constant value "TB_SM1013, TB_SM1014" (Constant)
한글 테이블명 has constant value "유관기관" (Constant)
지역코드 is highly overall correlated with 소재지 (High correlation)
소재지 is highly overall correlated with 지역코드 (High correlation)
유관기관 종류코드 is highly imbalanced (54.3%) (Imbalance)
주소 has 219 (40.9%) missing values (Missing)
상세주소 has 278 (51.9%) missing values (Missing)
우편번호 has 242 (45.1%) missing values (Missing)
대표전화번호 has 203 (37.9%) missing values (Missing)
순번 has unique values (Unique)
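
The four Missing alerts above account for the whole "Missing cells" figure in the overview. The short sketch below recomputes the per-column percentages and the overall percentage from the counts in this report. The variable and function names are illustrative only and are not part of the profiling tool.

```python
# Per-column missing counts as listed in the Alerts section of this report.
missing_by_column = {
    "주소": 219,
    "상세주소": 278,
    "우편번호": 242,
    "대표전화번호": 203,
}
N_ROWS = 536     # "Number of observations"
N_COLUMNS = 14   # "Number of variables"


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Share of rows missing in each column.
per_column_pct = {
    name: percent(count, N_ROWS) for name, count in missing_by_column.items()
}

# Share of all cells (rows x columns) that are missing.
total_missing = sum(missing_by_column.values())
overall_pct = percent(total_missing, N_ROWS * N_COLUMNS)

print(per_column_pct)
print(total_missing, overall_pct)
```

The total comes to 942 cells, matching the overview, so no other column contributes missing cells under the report's definition of missing.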

Reproduction

Analysis started: 2023-12-12 13:14:19.584920
Analysis finished: 2023-12-12 13:14:21.452734
Duration: 1.87 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
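
The reported duration follows directly from the two timestamps. A minimal check using only the standard library (the format string is an assumption about how the timestamps are written, based on the two values shown):

```python
from datetime import datetime

# Timestamp layout assumed from the "Analysis started/finished" values.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-12 13:14:19.584920", FMT)
finished = datetime.strptime("2023-12-12 13:14:21.452734", FMT)

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```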

Variables

순번
Real number (ℝ)

UNIQUE 

Distinct: 536
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 268.5
Minimum: 1
Maximum: 536
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.8 KiB

Quantile statistics

Minimum: 1
5th percentile: 27.75
Q1: 134.75
Median: 268.5
Q3: 402.25
95th percentile: 509.25
Maximum: 536
Range: 535
Interquartile range (IQR): 267.5

Descriptive statistics

Standard deviation: 154.87414
Coefficient of variation (CV): 0.57681245
Kurtosis: -1.2
Mean: 268.5
Median Absolute Deviation (MAD): 134
Skewness: 0
Sum: 143916
Variance: 23986
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
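
Because 순번 is simply the sequence 1 to 536, the quantile and descriptive statistics above can be reproduced by hand. The sketch below assumes linear interpolation between neighbouring values for quantiles and the sample (n - 1) definition for variance; both are assumptions that happen to match the reported numbers. The `quantile` helper is illustrative and not part of the profiling tool.

```python
import statistics

# 순번 takes each integer from 1 to 536 exactly once.
values = list(range(1, 537))


def quantile(sorted_values, q):
    """Quantile by linear interpolation between the two nearest ranks."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


p5 = quantile(values, 0.05)
q1, median, q3 = (quantile(values, q) for q in (0.25, 0.5, 0.75))
p95 = quantile(values, 0.95)
iqr = q3 - q1

variance = statistics.variance(values)   # sample variance, n - 1 denominator
std_dev = statistics.stdev(values)
cv = std_dev / statistics.mean(values)   # coefficient of variation

# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)

print(p5, q1, median, q3, p95, iqr)
print(variance, round(std_dev, 5), round(cv, 8), mad)
```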
Common values

Value | Count | Frequency (%)
1 | 1 | 0.2%
354 | 1 | 0.2%
368 | 1 | 0.2%
367 | 1 | 0.2%
366 | 1 | 0.2%
365 | 1 | 0.2%
364 | 1 | 0.2%
363 | 1 | 0.2%
362 | 1 | 0.2%
361 | 1 | 0.2%
Other values (526) | 526 | 98.1%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 0.2%
2 | 1 | 0.2%
3 | 1 | 0.2%
4 | 1 | 0.2%
5 | 1 | 0.2%
6 | 1 | 0.2%
7 | 1 | 0.2%
8 | 1 | 0.2%
9 | 1 | 0.2%
10 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
536 | 1 | 0.2%
535 | 1 | 0.2%
534 | 1 | 0.2%
533 | 1 | 0.2%
532 | 1 | 0.2%
531 | 1 | 0.2%
530 | 1 | 0.2%
529 | 1 | 0.2%
528 | 1 | 0.2%
527 | 1 | 0.2%

대분류
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.3 KiB
유관기관 정보
536 

Length

Max length7
Median length7
Mean length7
Min length7

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row유관기관 정보
2nd row유관기관 정보
3rd row유관기관 정보
4th row유관기관 정보
5th row유관기관 정보

Common Values

ValueCountFrequency (%)
유관기관 정보 536
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
유관기관 536
50.0%
정보 536
50.0%

영문 테이블명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.3 KiB
TB_SM1013, TB_SM1014
536 

Length

Max length20
Median length20
Mean length20
Min length20

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowTB_SM1013, TB_SM1014
2nd rowTB_SM1013, TB_SM1014
3rd rowTB_SM1013, TB_SM1014
4th rowTB_SM1013, TB_SM1014
5th rowTB_SM1013, TB_SM1014

Common Values

ValueCountFrequency (%)
TB_SM1013, TB_SM1014 536
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
tb_sm1013 536
50.0%
tb_sm1014 536
50.0%

한글 테이블명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.3 KiB
유관기관
536 

Length

Max length4
Median length4
Mean length4
Min length4

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row유관기관
2nd row유관기관
3rd row유관기관
4th row유관기관
5th row유관기관

Common Values

ValueCountFrequency (%)
유관기관 536
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
유관기관 536
100.0%

유관기관 종류코드
Categorical

IMBALANCE 

Distinct: 14
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 4.3 KiB
SM012102
348 
SM012303
92 
SM012301
36 
SM012002
 
22
SM012103
 
18
Other values (9)
 
20

Length

Max length8
Median length8
Mean length7.9925373
Min length4

Unique

Unique4 ?
Unique (%)0.7%

Sample

1st rowSM012102
2nd rowSM012102
3rd rowSM012301
4th rowSM012102
5th rowSM012102

Common Values

Value | Count | Frequency (%)
SM012102 | 348 | 64.9%
SM012303 | 92 | 17.2%
SM012301 | 36 | 6.7%
SM012002 | 22 | 4.1%
SM012103 | 18 | 3.4%
SM012900 | 5 | 0.9%
SM012104 | 4 | 0.7%
SM012302 | 3 | 0.6%
SM012003 | 2 | 0.4%
SM012201 | 2 | 0.4%
Other values (4) | 4 | 0.7%
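
The Imbalance alert for this variable (54.3%) can be reproduced from the counts above. The sketch assumes the score is one minus the Shannon entropy normalised by log2 of the number of classes. The report does not state this definition, so treat it as an assumption that happens to match the reported figure. The four trailing counts of 1 are inferred from "Other values (4)" together with the "Unique 4" entry.

```python
from math import log2

# Counts from the Common Values table for 유관기관 종류코드. The last four
# entries are the singleton codes grouped under "Other values (4)" (inferred).
counts = [348, 92, 36, 22, 18, 5, 4, 3, 2, 2, 1, 1, 1, 1]


def imbalance_score(class_counts):
    """Return 1 - H / log2(k), with H the Shannon entropy in bits.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    total = sum(class_counts)
    k = len(class_counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * log2(c / total) for c in class_counts if c > 0)
    return 1 - entropy / log2(k)


score = imbalance_score(counts)
print(f"{score:.1%}")
```

A score of roughly 0.54 reflects that a single code, SM012102, covers about two thirds of the rows while nine of the fourteen codes appear five times or fewer.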

Length

Histogram of lengths of the category
ValueCountFrequency (%)
sm012102 348
64.9%
sm012303 92
 
17.2%
sm012301 36
 
6.7%
sm012002 22
 
4.1%
sm012103 18
 
3.4%
sm012900 5
 
0.9%
sm012104 4
 
0.7%
sm012302 3
 
0.6%
sm012003 2
 
0.4%
sm012201 2
 
0.4%
Other values (4) 4
 
0.7%

기관코드
Text

Distinct: 534
Distinct (%): 99.6%
Missing: 0
Missing (%): 0.0%
Memory size: 4.3 KiB

Length

Max length8
Median length8
Mean length6.9626866
Min length1

Characters and Unicode

Total characters3732
Distinct characters15
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique532 ?
Unique (%)99.3%

Sample

1st row2014-H26
2nd row2016-H04
3rd row1056
4th row2015-H01
5th row2016-H05
ValueCountFrequency (%)
1052 2
 
0.4%
6001 2
 
0.4%
1013 1
 
0.2%
000a1045 1
 
0.2%
2009-h08 1
 
0.2%
2009-h24 1
 
0.2%
2009-h01 1
 
0.2%
2009-h02 1
 
0.2%
2009-h03 1
 
0.2%
2009-h07 1
 
0.2%
Other values (524) 524
97.8%

Most occurring characters

ValueCountFrequency (%)
0 1074
28.8%
2 585
15.7%
1 560
15.0%
- 359
 
9.6%
H 357
 
9.6%
7 133
 
3.6%
6 120
 
3.2%
4 118
 
3.2%
3 111
 
3.0%
9 97
 
2.6%
Other values (5) 218
 
5.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2970
79.6%
Uppercase Letter 403
 
10.8%
Dash Punctuation 359
 
9.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1074
36.2%
2 585
19.7%
1 560
18.9%
7 133
 
4.5%
6 120
 
4.0%
4 118
 
4.0%
3 111
 
3.7%
9 97
 
3.3%
5 92
 
3.1%
8 80
 
2.7%
Uppercase Letter
ValueCountFrequency (%)
H 357
88.6%
A 44
 
10.9%
Z 1
 
0.2%
E 1
 
0.2%
Dash Punctuation
ValueCountFrequency (%)
- 359
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3329
89.2%
Latin 403
 
10.8%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1074
32.3%
2 585
17.6%
1 560
16.8%
- 359
 
10.8%
7 133
 
4.0%
6 120
 
3.6%
4 118
 
3.5%
3 111
 
3.3%
9 97
 
2.9%
5 92
 
2.8%
Latin
ValueCountFrequency (%)
H 357
88.6%
A 44
 
10.9%
Z 1
 
0.2%
E 1
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3732
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1074
28.8%
2 585
15.7%
1 560
15.0%
- 359
 
9.6%
H 357
 
9.6%
7 133
 
3.6%
6 120
 
3.2%
4 118
 
3.2%
3 111
 
3.0%
9 97
 
2.6%
Other values (5) 218
 
5.8%

기관명
Text

Distinct: 525
Distinct (%): 97.9%
Missing: 0
Missing (%): 0.0%
Memory size: 4.3 KiB

Length

Max length28
Median length23
Mean length9.8843284
Min length2

Characters and Unicode

Total characters5298
Distinct characters345
Distinct categories10 ?
Distinct scripts4 ?
Distinct blocks4 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique514 ?
Unique (%)95.9%

Sample

1st row건국대학교병원
2nd row건양대학교병원
3rd row수서고속철도(주)
4th row전자랜드의원
5th row온종합병원
ValueCountFrequency (%)
의료법인 33
 
4.0%
육군 27
 
3.3%
한국교통안전공단 24
 
2.9%
한국건강관리협회 15
 
1.8%
대한산업보건협회 11
 
1.3%
한국의학연구소 9
 
1.1%
공군 8
 
1.0%
근로복지공단 8
 
1.0%
한국철도공사 4
 
0.5%
서울특별시 4
 
0.5%
Other values (634) 685
82.7%

Most occurring characters

ValueCountFrequency (%)
347
 
6.5%
294
 
5.5%
246
 
4.6%
207
 
3.9%
122
 
2.3%
119
 
2.2%
117
 
2.2%
115
 
2.2%
108
 
2.0%
91
 
1.7%
Other values (335) 3532
66.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 4668
88.1%
Space Separator 294
 
5.5%
Decimal Number 133
 
2.5%
Close Punctuation 68
 
1.3%
Open Punctuation 68
 
1.3%
Uppercase Letter 47
 
0.9%
Other Symbol 13
 
0.2%
Dash Punctuation 3
 
0.1%
Other Punctuation 2
 
< 0.1%
Lowercase Letter 2
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
347
 
7.4%
246
 
5.3%
207
 
4.4%
122
 
2.6%
119
 
2.5%
117
 
2.5%
115
 
2.5%
108
 
2.3%
91
 
1.9%
78
 
1.7%
Other values (300) 3118
66.8%
Uppercase Letter
ValueCountFrequency (%)
S 10
21.3%
K 6
12.8%
C 6
12.8%
G 4
 
8.5%
I 3
 
6.4%
N 3
 
6.4%
M 2
 
4.3%
T 2
 
4.3%
E 2
 
4.3%
R 2
 
4.3%
Other values (6) 7
14.9%
Decimal Number
ValueCountFrequency (%)
1 25
18.8%
3 19
14.3%
6 15
11.3%
2 15
11.3%
9 13
9.8%
8 13
9.8%
5 12
9.0%
7 11
8.3%
0 10
 
7.5%
Close Punctuation
ValueCountFrequency (%)
) 67
98.5%
1
 
1.5%
Open Punctuation
ValueCountFrequency (%)
( 67
98.5%
1
 
1.5%
Lowercase Letter
ValueCountFrequency (%)
l 1
50.0%
i 1
50.0%
Space Separator
ValueCountFrequency (%)
294
100.0%
Other Symbol
ValueCountFrequency (%)
13
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3
100.0%
Other Punctuation
ValueCountFrequency (%)
. 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 4680
88.3%
Common 568
 
10.7%
Latin 49
 
0.9%
Han 1
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
347
 
7.4%
246
 
5.3%
207
 
4.4%
122
 
2.6%
119
 
2.5%
117
 
2.5%
115
 
2.5%
108
 
2.3%
91
 
1.9%
78
 
1.7%
Other values (300) 3130
66.9%
Latin
ValueCountFrequency (%)
S 10
20.4%
K 6
12.2%
C 6
12.2%
G 4
 
8.2%
I 3
 
6.1%
N 3
 
6.1%
M 2
 
4.1%
T 2
 
4.1%
E 2
 
4.1%
R 2
 
4.1%
Other values (8) 9
18.4%
Common
ValueCountFrequency (%)
294
51.8%
) 67
 
11.8%
( 67
 
11.8%
1 25
 
4.4%
3 19
 
3.3%
6 15
 
2.6%
2 15
 
2.6%
9 13
 
2.3%
8 13
 
2.3%
5 12
 
2.1%
Other values (6) 28
 
4.9%
Han
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 4667
88.1%
ASCII 615
 
11.6%
None 15
 
0.3%
CJK 1
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
347
 
7.4%
246
 
5.3%
207
 
4.4%
122
 
2.6%
119
 
2.5%
117
 
2.5%
115
 
2.5%
108
 
2.3%
91
 
1.9%
78
 
1.7%
Other values (299) 3117
66.8%
ASCII
ValueCountFrequency (%)
294
47.8%
) 67
 
10.9%
( 67
 
10.9%
1 25
 
4.1%
3 19
 
3.1%
6 15
 
2.4%
2 15
 
2.4%
9 13
 
2.1%
8 13
 
2.1%
5 12
 
2.0%
Other values (22) 75
 
12.2%
None
ValueCountFrequency (%)
13
86.7%
1
 
6.7%
1
 
6.7%
CJK
ValueCountFrequency (%)
1
100.0%

지역코드
Categorical

HIGH CORRELATION 

Distinct: 17
Distinct (%): 3.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.3 KiB
SM003101
175 
<NA>
63 
SM003310
54 
SM003201
39 
SM003205
34 
Other values (12)
171 

Length

Max length8
Median length8
Mean length7.5298507
Min length4

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowSM003101
2nd rowSM003205
3rd rowSM003205
4th rowSM003101
5th rowSM003101

Common Values

ValueCountFrequency (%)
SM003101 175
32.6%
<NA> 63
 
11.8%
SM003310 54
 
10.1%
SM003201 39
 
7.3%
SM003205 34
 
6.3%
SM003202 25
 
4.7%
SM003351 25
 
4.7%
SM003203 22
 
4.1%
SM003352 18
 
3.4%
SM003204 17
 
3.2%
Other values (7) 64
 
11.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
sm003101 175
32.6%
na 63
 
11.8%
sm003310 54
 
10.1%
sm003201 39
 
7.3%
sm003205 34
 
6.3%
sm003202 25
 
4.7%
sm003351 25
 
4.7%
sm003203 22
 
4.1%
sm003352 18
 
3.4%
sm003204 17
 
3.2%
Other values (7) 64
 
11.9%

주소
Text

MISSING 

Distinct: 307
Distinct (%): 96.8%
Missing: 219
Missing (%): 40.9%
Memory size: 4.3 KiB

Length

Max length47
Median length38
Mean length19.037855
Min length3

Characters and Unicode

Total characters6035
Distinct characters305
Distinct categories8 ?
Distinct scripts3 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique298 ?
Unique (%)94.0%

Sample

1st row서울특별시 광진구 능동로 120
2nd row대전광역시 서구 관저동로 158 (관저동 1643) 건양대학교병원
3rd row서울특별시 강남구 광평로 281 (수서동 715)
4th row서울특별시 용산구 청파로 74
5th row부산광역시 부산진구 가야대로 767 (부전동) 서면메디칼센터
ValueCountFrequency (%)
서울특별시 40
 
2.9%
서울 30
 
2.2%
경기도 27
 
2.0%
부산 20
 
1.5%
서구 20
 
1.5%
중구 16
 
1.2%
대전 15
 
1.1%
경상북도 14
 
1.0%
대구 14
 
1.0%
부산광역시 13
 
1.0%
Other values (744) 1159
84.7%

Most occurring characters

ValueCountFrequency (%)
1174
 
19.5%
286
 
4.7%
255
 
4.2%
222
 
3.7%
168
 
2.8%
1 142
 
2.4%
2 137
 
2.3%
128
 
2.1%
127
 
2.1%
105
 
1.7%
Other values (295) 3291
54.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 3864
64.0%
Space Separator 1174
 
19.5%
Decimal Number 734
 
12.2%
Close Punctuation 85
 
1.4%
Open Punctuation 85
 
1.4%
Dash Punctuation 49
 
0.8%
Other Punctuation 38
 
0.6%
Uppercase Letter 6
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
286
 
7.4%
255
 
6.6%
222
 
5.7%
168
 
4.3%
128
 
3.3%
127
 
3.3%
105
 
2.7%
91
 
2.4%
88
 
2.3%
84
 
2.2%
Other values (275) 2310
59.8%
Decimal Number
ValueCountFrequency (%)
1 142
19.3%
2 137
18.7%
3 81
11.0%
7 60
8.2%
5 58
7.9%
9 57
7.8%
6 56
 
7.6%
0 55
 
7.5%
4 54
 
7.4%
8 34
 
4.6%
Uppercase Letter
ValueCountFrequency (%)
I 2
33.3%
K 2
33.3%
S 1
16.7%
N 1
16.7%
Other Punctuation
ValueCountFrequency (%)
, 37
97.4%
. 1
 
2.6%
Space Separator
ValueCountFrequency (%)
1174
100.0%
Close Punctuation
ValueCountFrequency (%)
) 85
100.0%
Open Punctuation
ValueCountFrequency (%)
( 85
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 49
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 3864
64.0%
Common 2165
35.9%
Latin 6
 
0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
286
 
7.4%
255
 
6.6%
222
 
5.7%
168
 
4.3%
128
 
3.3%
127
 
3.3%
105
 
2.7%
91
 
2.4%
88
 
2.3%
84
 
2.2%
Other values (275) 2310
59.8%
Common
ValueCountFrequency (%)
1174
54.2%
1 142
 
6.6%
2 137
 
6.3%
) 85
 
3.9%
( 85
 
3.9%
3 81
 
3.7%
7 60
 
2.8%
5 58
 
2.7%
9 57
 
2.6%
6 56
 
2.6%
Other values (6) 230
 
10.6%
Latin
ValueCountFrequency (%)
I 2
33.3%
K 2
33.3%
S 1
16.7%
N 1
16.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 3864
64.0%
ASCII 2171
36.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1174
54.1%
1 142
 
6.5%
2 137
 
6.3%
) 85
 
3.9%
( 85
 
3.9%
3 81
 
3.7%
7 60
 
2.8%
5 58
 
2.7%
9 57
 
2.6%
6 56
 
2.6%
Other values (10) 236
 
10.9%
Hangul
ValueCountFrequency (%)
286
 
7.4%
255
 
6.6%
222
 
5.7%
168
 
4.3%
128
 
3.3%
127
 
3.3%
105
 
2.7%
91
 
2.4%
88
 
2.3%
84
 
2.2%
Other values (275) 2310
59.8%

상세주소
Text

MISSING 

Distinct: 253
Distinct (%): 98.1%
Missing: 278
Missing (%): 51.9%
Memory size: 4.3 KiB

Length

Max length39
Median length30
Mean length10.236434
Min length1

Characters and Unicode

Total characters2641
Distinct characters247
Distinct categories10 ?
Distinct scripts3 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique248 ?
Unique (%)96.1%

Sample

1st row8층, 9층(수서동, 효성수서빌딩)
2nd row온종합병원 건강검진센터 5층
3rd row칠곡 경북대학교병원
4th row충렬대로348번길 1
5th row9층
ValueCountFrequency (%)
부산광역시 9
 
1.6%
한국교통안전공단 8
 
1.4%
2층 8
 
1.4%
경기도 7
 
1.3%
남구 7
 
1.3%
서울특별시 7
 
1.3%
4층 7
 
1.3%
3층 6
 
1.1%
5층 5
 
0.9%
대구광역시 5
 
0.9%
Other values (426) 488
87.6%

Most occurring characters

ValueCountFrequency (%)
300
 
11.4%
1 187
 
7.1%
- 140
 
5.3%
2 109
 
4.1%
3 103
 
3.9%
5 89
 
3.4%
4 87
 
3.3%
83
 
3.1%
0 66
 
2.5%
6 63
 
2.4%
Other values (237) 1414
53.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 1240
47.0%
Decimal Number 882
33.4%
Space Separator 300
 
11.4%
Dash Punctuation 140
 
5.3%
Uppercase Letter 21
 
0.8%
Lowercase Letter 16
 
0.6%
Open Punctuation 15
 
0.6%
Close Punctuation 15
 
0.6%
Other Punctuation 10
 
0.4%
Math Symbol 2
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
83
 
6.7%
60
 
4.8%
58
 
4.7%
55
 
4.4%
33
 
2.7%
30
 
2.4%
26
 
2.1%
26
 
2.1%
25
 
2.0%
23
 
1.9%
Other values (199) 821
66.2%
Uppercase Letter
ValueCountFrequency (%)
L 4
19.0%
B 3
14.3%
J 2
9.5%
A 2
9.5%
S 2
9.5%
M 2
9.5%
F 2
9.5%
N 1
 
4.8%
C 1
 
4.8%
E 1
 
4.8%
Decimal Number
ValueCountFrequency (%)
1 187
21.2%
2 109
12.4%
3 103
11.7%
5 89
10.1%
4 87
9.9%
0 66
 
7.5%
6 63
 
7.1%
9 62
 
7.0%
7 60
 
6.8%
8 56
 
6.3%
Lowercase Letter
ValueCountFrequency (%)
a 3
18.8%
r 2
12.5%
n 2
12.5%
b 2
12.5%
e 2
12.5%
v 1
 
6.2%
o 1
 
6.2%
g 1
 
6.2%
u 1
 
6.2%
p 1
 
6.2%
Other Punctuation
ValueCountFrequency (%)
, 9
90.0%
? 1
 
10.0%
Space Separator
ValueCountFrequency (%)
300
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 140
100.0%
Open Punctuation
ValueCountFrequency (%)
( 15
100.0%
Close Punctuation
ValueCountFrequency (%)
) 15
100.0%
Math Symbol
ValueCountFrequency (%)
~ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1364
51.6%
Hangul 1240
47.0%
Latin 37
 
1.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
83
 
6.7%
60
 
4.8%
58
 
4.7%
55
 
4.4%
33
 
2.7%
30
 
2.4%
26
 
2.1%
26
 
2.1%
25
 
2.0%
23
 
1.9%
Other values (199) 821
66.2%
Latin
ValueCountFrequency (%)
L 4
 
10.8%
B 3
 
8.1%
a 3
 
8.1%
r 2
 
5.4%
J 2
 
5.4%
A 2
 
5.4%
S 2
 
5.4%
n 2
 
5.4%
M 2
 
5.4%
b 2
 
5.4%
Other values (11) 13
35.1%
Common
ValueCountFrequency (%)
300
22.0%
1 187
13.7%
- 140
10.3%
2 109
 
8.0%
3 103
 
7.6%
5 89
 
6.5%
4 87
 
6.4%
0 66
 
4.8%
6 63
 
4.6%
9 62
 
4.5%
Other values (7) 158
11.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1401
53.0%
Hangul 1240
47.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
300
21.4%
1 187
13.3%
- 140
10.0%
2 109
 
7.8%
3 103
 
7.4%
5 89
 
6.4%
4 87
 
6.2%
0 66
 
4.7%
6 63
 
4.5%
9 62
 
4.4%
Other values (28) 195
13.9%
Hangul
ValueCountFrequency (%)
83
 
6.7%
60
 
4.8%
58
 
4.7%
55
 
4.4%
33
 
2.7%
30
 
2.4%
26
 
2.1%
26
 
2.1%
25
 
2.0%
23
 
1.9%
Other values (199) 821
66.2%

우편번호
Text

MISSING 

Distinct: 284
Distinct (%): 96.6%
Missing: 242
Missing (%): 45.1%
Memory size: 4.3 KiB

Length

Max length7
Median length7
Mean length6.3877551
Min length5

Characters and Unicode

Total characters1878
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique275 ?
Unique (%)93.5%

Sample

1st row143-701
2nd row35365
3rd row06349
4th row140-878
5th row41404
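
The sample rows appear to mix two layouts: a hyphenated six-digit form (such as 143-701) and a plain five-digit form (such as 35365). The sketch below classifies values under the assumption that these are the only two valid layouts, then checks that assumption against the totals reported for this variable. The function name and labels are illustrative.

```python
import re

OLD_FORMAT = re.compile(r"\d{3}-\d{3}")  # six digits with a dash, e.g. 143-701
NEW_FORMAT = re.compile(r"\d{5}")        # five digits, e.g. 35365


def classify_postal_code(value):
    """Label a postal code string as 'old', 'new', 'other', or 'missing'."""
    if value is None:
        return "missing"
    text = value.strip()
    if OLD_FORMAT.fullmatch(text):
        return "old"
    if NEW_FORMAT.fullmatch(text):
        return "new"
    return "other"


samples = ["143-701", "35365", "06349", "140-878", "41404", None]
labels = [classify_postal_code(s) for s in samples]
print(labels)

# Consistency check against the report, assuming one dash per old-style code:
non_missing = 536 - 242          # observations minus missing values
old_count = 204                  # number of "-" characters reported
new_count = non_missing - old_count
total_characters = old_count * 7 + new_count * 5
print(total_characters, round(total_characters / non_missing, 7))
```

Under that assumption the character total and mean length agree with the "Total characters" and "Mean length" figures reported for this variable, which suggests roughly 204 hyphenated and 90 five-digit codes.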
ValueCountFrequency (%)
135-880 3
 
1.0%
01902 2
 
0.7%
137-070 2
 
0.7%
16824 2
 
0.7%
133-847 2
 
0.7%
402-200 2
 
0.7%
06349 2
 
0.7%
302-122 2
 
0.7%
706-170 2
 
0.7%
441-821 1
 
0.3%
Other values (274) 274
93.2%

Most occurring characters

ValueCountFrequency (%)
0 367
19.5%
1 240
12.8%
- 204
10.9%
3 179
9.5%
2 175
9.3%
4 135
 
7.2%
7 134
 
7.1%
6 126
 
6.7%
8 124
 
6.6%
5 116
 
6.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1674
89.1%
Dash Punctuation 204
 
10.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 367
21.9%
1 240
14.3%
3 179
10.7%
2 175
10.5%
4 135
 
8.1%
7 134
 
8.0%
6 126
 
7.5%
8 124
 
7.4%
5 116
 
6.9%
9 78
 
4.7%
Dash Punctuation
ValueCountFrequency (%)
- 204
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1878
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 367
19.5%
1 240
12.8%
- 204
10.9%
3 179
9.5%
2 175
9.3%
4 135
 
7.2%
7 134
 
7.1%
6 126
 
6.7%
8 124
 
6.6%
5 116
 
6.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1878
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 367
19.5%
1 240
12.8%
- 204
10.9%
3 179
9.5%
2 175
9.3%
4 135
 
7.2%
7 134
 
7.1%
6 126
 
6.7%
8 124
 
6.6%
5 116
 
6.2%

대표전화번호
Text

MISSING 

Distinct: 322
Distinct (%): 96.7%
Missing: 203
Missing (%): 37.9%
Memory size: 4.3 KiB

Length

Max length13
Median length12
Mean length11.888889
Min length9

Characters and Unicode

Total characters3959
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique311 ?
Unique (%)93.4%

Sample

1st row02-2030-5706
2nd row02-704-9494
3rd row051-607-0789
4th row053-200-3100
5th row043-230-6222
ValueCountFrequency (%)
02-2247-6633 2
 
0.6%
02-326-1101 2
 
0.6%
032-899-9777 2
 
0.6%
031-270-5912 2
 
0.6%
02-540-0001 2
 
0.6%
031-297-5000 2
 
0.6%
02-919-7075 2
 
0.6%
031-362-3614 2
 
0.6%
02-375-1273 2
 
0.6%
02-2140-6000 2
 
0.6%
Other values (312) 313
94.0%

Most occurring characters

ValueCountFrequency (%)
0 762
19.2%
- 665
16.8%
2 418
10.6%
1 356
9.0%
3 349
8.8%
5 337
8.5%
4 259
 
6.5%
6 259
 
6.5%
7 231
 
5.8%
9 174
 
4.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3294
83.2%
Dash Punctuation 665
 
16.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 762
23.1%
2 418
12.7%
1 356
10.8%
3 349
10.6%
5 337
10.2%
4 259
 
7.9%
6 259
 
7.9%
7 231
 
7.0%
9 174
 
5.3%
8 149
 
4.5%
Dash Punctuation
ValueCountFrequency (%)
- 665
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3959
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 762
19.2%
- 665
16.8%
2 418
10.6%
1 356
9.0%
3 349
8.8%
5 337
8.5%
4 259
 
6.5%
6 259
 
6.5%
7 231
 
5.8%
9 174
 
4.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3959
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 762
19.2%
- 665
16.8%
2 418
10.6%
1 356
9.0%
3 349
8.8%
5 337
8.5%
4 259
 
6.5%
6 259
 
6.5%
7 231
 
5.8%
9 174
 
4.4%

소재지
Categorical

HIGH CORRELATION 

Distinct: 49
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.3 KiB
<NA>
203 
SM003101
65 
SM003201
39 
SM003205
27 
SM003310
27 
Other values (44)
175 

Length

Max length12
Median length8
Mean length6.2276119
Min length2

Unique

Unique26 ?
Unique (%)4.9%

Sample

1st rowSM003101
2nd rowSM003205
3rd rowSM003101
4th rowSM003101
5th row<NA>

Common Values

ValueCountFrequency (%)
<NA> 203
37.9%
SM003101 65
 
12.1%
SM003201 39
 
7.3%
SM003205 27
 
5.0%
SM003310 27
 
5.0%
SM003202 22
 
4.1%
SM003203 20
 
3.7%
SM003351 17
 
3.2%
SM003331 13
 
2.4%
SM003352 12
 
2.2%
Other values (39) 91
17.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
na 203
36.2%
sm003101 65
 
11.6%
sm003201 39
 
7.0%
sm003205 27
 
4.8%
sm003310 27
 
4.8%
sm003202 22
 
3.9%
sm003203 20
 
3.6%
sm003351 17
 
3.0%
sm003331 13
 
2.3%
sm003352 12
 
2.1%
Other values (49) 115
20.5%

사용여부
Boolean

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 668.0 B
True
315 
False
221 
ValueCountFrequency (%)
True 315
58.8%
False 221
41.2%

Interactions

[Figure: interactions plot]

Correlations

Variable | 순번 | 유관기관 종류코드 | 지역코드 | 소재지 | 사용여부
순번 | 1.000 | 0.552 | 0.382 | 0.577 | 0.463
유관기관 종류코드 | 0.552 | 1.000 | 0.347 | 0.821 | 0.420
지역코드 | 0.382 | 0.347 | 1.000 | 1.000 | 0.342
소재지 | 0.577 | 0.821 | 1.000 | 1.000 | 0.000
사용여부 | 0.463 | 0.420 | 0.342 | 0.000 | 1.000

Variable | 지역코드 | 사용여부 | 소재지 | 유관기관 종류코드
지역코드 | 1.000 | 0.264 | 0.924 | 0.130
사용여부 | 0.264 | 1.000 | 0.000 | 0.387
소재지 | 0.924 | 0.000 | 1.000 | 0.461
유관기관 종류코드 | 0.130 | 0.387 | 0.461 | 1.000

Variable | 순번 | 유관기관 종류코드 | 지역코드 | 소재지 | 사용여부
순번 | 1.000 | 0.265 | 0.159 | 0.226 | 0.353
유관기관 종류코드 | 0.265 | 1.000 | 0.130 | 0.461 | 0.387
지역코드 | 0.159 | 0.130 | 1.000 | 0.924 | 0.264
소재지 | 0.226 | 0.461 | 0.924 | 1.000 | 0.000
사용여부 | 0.353 | 0.387 | 0.264 | 0.000 | 1.000
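
The report does not say which association measure each matrix uses. As one illustration of how a pair of categorical columns such as 지역코드 and 소재지 can be scored on a 0 to 1 scale, here is a sketch of the plain (uncorrected) Cramér's V. The contingency tables are made-up toy data, not values from this dataset, and the function is not claimed to be what the profiling tool runs.

```python
from math import sqrt


def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows.

    Assumes every row and every column has at least one observation and that
    the table has at least two rows and two columns.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k))


# Hypothetical tables: one column fully determines the other, then no relation.
perfect = [[10, 0], [0, 10]]
independent = [[5, 5], [5, 5]]
print(cramers_v(perfect), cramers_v(independent))
```

Values close to 1, like the 0.924 and 1.000 entries for 지역코드 and 소재지 above, indicate that one column nearly determines the other, which is consistent with the High correlation alert for that pair.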

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows

Index | 순번 | 대분류 | 영문 테이블명 | 한글 테이블명 | 유관기관 종류코드 | 기관코드 | 기관명 | 지역코드 | 주소 | 상세주소 | 우편번호 | 대표전화번호 | 소재지 | 사용여부
0 | 1 | 유관기관 정보 | TB_SM1013, TB_SM1014 | 유관기관 | SM012102 | 2014-H26 | 건국대학교병원 | SM003101 | 서울특별시 광진구 능동로 120 | <NA> | 143-701 | 02-2030-5706 | SM003101 | N
1 | 2 | 유관기관 정보 | TB_SM1013, TB_SM1014 | 유관기관 | SM012102 | 2016-H04 | 건양대학교병원 | SM003205 | 대전광역시 서구 관저동로 158 (관저동 1643) 건양대학교병원 | <NA> | 35365 | <NA> | SM003205 | N
2 | 3 | 유관기관 정보 | TB_SM1013, TB_SM1014 | 유관기관 | SM012301 | 1056 | 수서고속철도(주) | SM003205 | 서울특별시 강남구 광평로 281 (수서동 715) | 8층, 9층(수서동, 효성수서빌딩) | 06349 | <NA> | SM003101 | N
3 | 4 | 유관기관 정보 | TB_SM1013, TB_SM1014 | 유관기관 | SM012102 | 2015-H01 | 전자랜드의원 | SM003101 | 서울특별시 용산구 청파로 74 | <NA> | 140-878 | 02-704-9494 | SM003101 | Y
4 | 5 | 유관기관 정보 | TB_SM1013, TB_SM1014 | 유관기관 | SM012102 | 2016-H05 | 온종합병원 | SM003101 | 부산광역시 부산진구 가야대로 767 (부전동) 서면메디칼센터 | 온종합병원 건강검진센터 5층 | <NA> | 051-607-0789 | <NA> | Y
5 | 6 | 유관기관 정보 | TB_SM1013, TB_SM1014 | 유관기관 | SM012102 | 2016-H09 | 칠곡 경북대학교병원 | SM003202 | 대구광역시 북구 호국로 807 (학정동 474) | 칠곡 경북대학교병원 | 41404 | 053-200-3100 | SM003202 | Y
6 | 7 | 유관기관 정보 | TB_SM1013, TB_SM1014 | 유관기관 | SM012102 | 2013-H02 | 한마음의료재단 하나병원 | SM003331 | 충청북도 청주시 흥덕구 2순환로 1262 | <NA> | 361-803 | 043-230-6222 | SM003331 | N
7 | 8 | 유관기관 정보 | TB_SM1013, TB_SM1014 | 유관기관 | SM012102 | 2014-H30 | 부평세림병원 | SM003203 | 인천광역시 부평구 부평대로 175 | <NA> | 403-717 | <NA> | SM003203 | Y
8 | 9 | 유관기관 정보 | TB_SM1013, TB_SM1014 | 유관기관 | SM012102 | 2015-H13 | 가톨릭관동대학교 국제성모병원 | SM003203 | 인천광역시 중구 우현로50번길 2 (답동 3-1) 답동성당 | <NA> | 22321 | 032-290-3341 | SM003203 | Y
9 | 10 | 유관기관 정보 | TB_SM1013, TB_SM1014 | 유관기관 | SM012102 | 2014-H13 | 의료법인 강릉고려병원 | SM003320 | 강원도 강릉시 옥가로 30 | <NA> | 210-933 | 033-649-0326 | SM003320 | Y

Last rows

Index | 순번 | 대분류 | 영문 테이블명 | 한글 테이블명 | 유관기관 종류코드 | 기관코드 | 기관명 | 지역코드 | 주소 | 상세주소 | 우편번호 | 대표전화번호 | 소재지 | 사용여부
526 | 527 | 유관기관 정보 | TB_SM1013, TB_SM1014 | 유관기관 | SM012102 | 2015-H19 | 대구박병원 | SM003202 | 대구광역시 북구 대학로 139 (산격동 1291-4) | <NA> | 41535 | <NA> | SM003202 | Y
527 | 528 | 유관기관 정보 | TB_SM1013, TB_SM1014 | 유관기관 | SM012102 | 2015-H15 | 영동병원 | SM003101 | 서울특별시 동대문구 한천로 4 (장안동 413-1) | <NA> | 02633 | <NA> | SM003101 | Y
528 | 529 | 유관기관 정보 | TB_SM1013, TB_SM1014 | 유관기관 | SM012102 | 2016-H17 | 권태경내과의원 | SM003201 | 부산광역시 기장군 기장읍 동부리 280-1 | <NA> | <NA> | <NA> | SM003201 | N
529 | 530 | 유관기관 정보 | TB_SM1013, TB_SM1014 | 유관기관 | SM012102 | 2011-H11 | 의료법인 정산의료재단 효성병원 | SM003331 | 충북 청주시 상당구 금천동 | 162-90번지 | 360-802 | 043-221-0012 | SM003331 | Y
530 | 531 | 유관기관 정보 | TB_SM1013, TB_SM1014 | 유관기관 | SM012102 | 2011-H12 | 의료법인명지의료재단 명지병원 | SM003310 | 경기 고양시 덕양구 화정동 | 697-1 | 412-270 | 031-810-6114 | SM003310 | N
531 | 532 | 유관기관 정보 | TB_SM1013, TB_SM1014 | 유관기관 | SM012301 | 2011 | 신분당선(주) | SM003310 | 경기 성남시 분당구 삼평동 | <NA> | 463-400 | 031-8018-7555 | 경기 | Y
532 | 533 | 유관기관 정보 | TB_SM1013, TB_SM1014 | 유관기관 | SM012102 | 2012-H07 | 푸른미래 내과의원 | SM003202 | 대구광역시 중구 달구벌대로 2095 | 2095 | 700-742 | 053-422-8575 | SM003202 | N
533 | 534 | 유관기관 정보 | TB_SM1013, TB_SM1014 | 유관기관 | SM012102 | 2012-H13 | 강남고려병원 | SM003101 | 서울특별시 관악구 관악로 242 | <NA> | 151-810 | 02-877-5533 | SM003101 | Y
534 | 535 | 유관기관 정보 | TB_SM1013, TB_SM1014 | 유관기관 | SM012102 | 2011-H16 | 수진단검사의학과의원 | SM003101 | 서울 성북구 장위동 | 33-34 | 136-140 | 02-919-7076 | SM003101 | N
535 | 536 | 유관기관 정보 | TB_SM1013, TB_SM1014 | 유관기관 | SM012102 | 2011-H17 | 메트로병원 | SM003201 | 부산 수영구 남천1동 | 05월 24일 | 613-814 | 051-626-0250 | SM003201 | N