Overview

Dataset statistics

Number of variables: 13
Number of observations: 2661
Missing cells: 5409
Missing cells (%): 15.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 278.2 KiB
Average record size in memory: 107.0 B
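
The percentage and per-record figures follow from the raw counts. Below is a minimal Python sketch of that arithmetic. It assumes the missing-cell percentage is taken over every cell (variables × observations) and that 1 KiB is 1024 bytes. The variable names are illustrative and do not come from the profiling tool.

```python
# Figures copied from the dataset statistics in this report.
n_variables = 13
n_observations = 2661
missing_cells = 5409
total_size_kib = 278.2

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes. The reported total is rounded,
# so the per-record size is only reproduced approximately.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"average record size: about {avg_record_bytes:.0f} B")
```

Because the total size is already rounded to one decimal, the per-record value recomputed this way can differ from the reported one in the first decimal place.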

Variable types

Numeric: 1
Categorical: 6
Text: 5
Boolean: 1

Dataset

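Description: Data from the Railway Qualification Management System (철도자격관리시스템) of the Korea Transportation Safety Authority, including railway-related organization information, exam eligibility codes, and the codes that make up the system.
Author: Korea Transportation Safety Authority (한국교통안전공단)
URL: https://www.data.go.kr/data/15064510/fileData.do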

Alerts

대분류 has constant value "시스템 관리용 코드" [Constant]
영문 테이블명 is highly overall correlated with 순번 and 5 other fields [High correlation]
소지면허코드 is highly overall correlated with 순번 and 4 other fields [High correlation]
면허경력구분 is highly overall correlated with 순번 and 4 other fields [High correlation]
경력년수 is highly overall correlated with 순번 and 4 other fields [High correlation]
사용여부 is highly overall correlated with 순번 and 4 other fields [High correlation]
한글 테이블명 is highly overall correlated with 순번 and 5 other fields [High correlation]
순번 is highly overall correlated with 영문 테이블명 and 5 other fields [High correlation]
면허경력구분 is highly imbalanced (96.0%) [Imbalance]
소지면허코드 is highly imbalanced (98.4%) [Imbalance]
경력년수 is highly imbalanced (98.7%) [Imbalance]
상위(분류)코드 has 644 (24.2%) missing values [Missing]
기관코드 has 2114 (79.4%) missing values [Missing]
경력유형코드 has 2651 (99.6%) missing values [Missing]
순번 has unique values [Unique]
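
The three Missing alerts account for every missing cell reported in the dataset statistics. A short Python check of the per-column percentages and their total is sketched here. The counts are copied from the alerts; the dictionary and variable names are illustrative.

```python
# Missing-value counts copied from the Missing alerts in this report.
n_observations = 2661
missing_by_column = {
    "상위(분류)코드": 644,
    "기관코드": 2114,
    "경력유형코드": 2651,
}

# Per-column share of missing values, as a percentage of all rows.
missing_pct = {
    name: 100 * count / n_observations
    for name, count in missing_by_column.items()
}
for name, pct in missing_pct.items():
    print(f"{name}: {pct:.1f}% missing")

# The per-column counts should add up to the dataset-level figure.
total_missing = sum(missing_by_column.values())
print(f"total missing cells: {total_missing}")
```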

Reproduction

Analysis started: 2023-12-12 10:39:11.390711
Analysis finished: 2023-12-12 10:39:12.844612
Duration: 1.45 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
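
The reported duration is the difference between the two timestamps. A minimal Python sketch of that subtraction follows; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section of this report.
started = datetime.fromisoformat("2023-12-12 10:39:11.390711")
finished = datetime.fromisoformat("2023-12-12 10:39:12.844612")

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"duration: {duration_seconds:.2f} seconds")
```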

Variables

순번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 2661
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1331
Minimum: 1
Maximum: 2661
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 134
Q1: 666
Median: 1331
Q3: 1996
95-th percentile: 2528
Maximum: 2661
Range: 2660
Interquartile range (IQR): 1330

Descriptive statistics

Standard deviation: 768.30886
Coefficient of variation (CV): 0.57724182
Kurtosis: -1.2
Mean: 1331
Median Absolute Deviation (MAD): 665
Skewness: 0
Sum: 3541791
Variance: 590298.5
Monotonicity: Strictly increasing
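
These statistics are what one would expect if 순번 is simply the row counter 1, 2, ..., 2661. The Python sketch below recomputes several of them under that assumption, using only the standard library. It assumes the reported variance is the sample variance (n - 1 denominator) and that the quartiles use linear interpolation over the full range.

```python
import math
import statistics

# Assumption: 순번 is exactly the sequence 1..2661 (unique, strictly increasing).
values = list(range(1, 2662))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)      # sample variance, n - 1 denominator
std_dev = math.sqrt(variance)
cv = std_dev / mean                         # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)  # median absolute deviation
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")

print("sum:", total)
print("mean:", mean)
print("variance:", variance)
print("standard deviation:", round(std_dev, 5))
print("CV:", round(cv, 8))
print("MAD:", mad)
print("Q1, median, Q3:", q1, q2, q3)
```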
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1 | 1 | < 0.1%
1769 | 1 | < 0.1%
1771 | 1 | < 0.1%
1772 | 1 | < 0.1%
1773 | 1 | < 0.1%
1774 | 1 | < 0.1%
1775 | 1 | < 0.1%
1776 | 1 | < 0.1%
1777 | 1 | < 0.1%
1778 | 1 | < 0.1%
Other values (2651) | 2651 | 99.6%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
2661 | 1 | < 0.1%
2660 | 1 | < 0.1%
2659 | 1 | < 0.1%
2658 | 1 | < 0.1%
2657 | 1 | < 0.1%
2656 | 1 | < 0.1%
2655 | 1 | < 0.1%
2654 | 1 | < 0.1%
2653 | 1 | < 0.1%
2652 | 1 | < 0.1%

대분류
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 20.9 KiB

Value | Count
시스템 관리용 코드 | 2661

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 시스템 관리용 코드
2nd row: 시스템 관리용 코드
3rd row: 시스템 관리용 코드
4th row: 시스템 관리용 코드
5th row: 시스템 관리용 코드

Common Values

Value | Count | Frequency (%)
시스템 관리용 코드 | 2661 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
시스템 | 2661 | 33.3%
관리용 | 2661 | 33.3%
코드 | 2661 | 33.3%

영문 테이블명
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 20.9 KiB

Value | Count
TB_QM1001 | 1371
TB_SM1005 | 646
TB_SM1018 | 547
TB_SM1004 | 75
TB_LM1048 | 22

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: TB_QM1001
2nd row: TB_QM1001
3rd row: TB_QM1001
4th row: TB_QM1001
5th row: TB_QM1001

Common Values

Value | Count | Frequency (%)
TB_QM1001 | 1371 | 51.5%
TB_SM1005 | 646 | 24.3%
TB_SM1018 | 547 | 20.6%
TB_SM1004 | 75 | 2.8%
TB_LM1048 | 22 | 0.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
tb_qm1001 | 1371 | 51.5%
tb_sm1005 | 646 | 24.3%
tb_sm1018 | 547 | 20.6%
tb_sm1004 | 75 | 2.8%
tb_lm1048 | 22 | 0.8%

한글 테이블명
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 20.9 KiB

Value | Count
지사 시업소 | 1371
공통상세코드 | 646
철도운영기관별담당사무소코드 | 547
공통분류코드 | 75
응시자격코드 | 22

Length

Max length: 14
Median length: 6
Mean length: 7.6444946
Min length: 6
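
The length statistics can be recomputed from the five category values and their counts. The Python sketch below assumes that length means the number of Unicode code points in the NFC-normalized string, with the space in "지사 시업소" counted as a character. The dictionary name is illustrative.

```python
import unicodedata

# Category values and counts copied from this variable's summary.
category_counts = {
    "지사 시업소": 1371,
    "공통상세코드": 646,
    "철도운영기관별담당사무소코드": 547,
    "공통분류코드": 75,
    "응시자격코드": 22,
}

# Assumption: length = number of code points after NFC normalization.
lengths = {
    value: len(unicodedata.normalize("NFC", value))
    for value in category_counts
}

total_rows = sum(category_counts.values())
mean_length = sum(lengths[v] * c for v, c in category_counts.items()) / total_rows

print("rows:", total_rows)
print("min length:", min(lengths.values()))
print("max length:", max(lengths.values()))
print("mean length:", round(mean_length, 7))
```

Only one category is longer than six characters, so the mean sits well above the median while the median stays at 6.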

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 지사 시업소
2nd row: 지사 시업소
3rd row: 지사 시업소
4th row: 지사 시업소
5th row: 지사 시업소

Common Values

Value | Count | Frequency (%)
지사 시업소 | 1371 | 51.5%
공통상세코드 | 646 | 24.3%
철도운영기관별담당사무소코드 | 547 | 20.6%
공통분류코드 | 75 | 2.8%
응시자격코드 | 22 | 0.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
지사 | 1371 | 34.0%
시업소 | 1371 | 34.0%
공통상세코드 | 646 | 16.0%
철도운영기관별담당사무소코드 | 547 | 13.6%
공통분류코드 | 75 | 1.9%
응시자격코드 | 22 | 0.5%
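
The last table for this variable lists different percentages from the Common Values table because it appears to count whitespace-separated words rather than whole category values: "지사 시업소" contributes two words per row. The Python sketch below reproduces those shares under that assumption. The tokenization rule (lowercase, split on whitespace) is a guess at the tool's behaviour and is not confirmed by the report.

```python
from collections import Counter

# Category values and counts copied from the Common Values table.
category_counts = {
    "지사 시업소": 1371,
    "공통상세코드": 646,
    "철도운영기관별담당사무소코드": 547,
    "공통분류코드": 75,
    "응시자격코드": 22,
}

# Assumption: words are obtained by lowercasing and splitting on whitespace.
word_counts = Counter()
for value, count in category_counts.items():
    for word in value.lower().split():
        word_counts[word] += count

total_words = sum(word_counts.values())
word_pct = {w: 100 * c / total_words for w, c in word_counts.items()}

for word, count in word_counts.most_common():
    print(f"{word}: {count} ({word_pct[word]:.1f}%)")
```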

상위(분류)코드
Text

MISSING 

Distinct: 141
Distinct (%): 7.0%
Missing: 644
Missing (%): 24.2%
Memory size: 20.9 KiB

Length

Max length: 8
Median length: 8
Mean length: 6.8730788
Min length: 4

Characters and Unicode

Total characters: 13863
Distinct characters: 18
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8
Unique (%): 0.4%

Sample

1st row: 20301000
2nd row: 20301000
3rd row: 20301000
4th row: 20301000
5th row: 20301000

Value | Count | Frequency (%)
sm008 | 174 | 8.6%
10106000 | 136 | 6.7%
10103000 | 100 | 5.0%
10117000 | 100 | 5.0%
10108000 | 95 | 4.7%
20202000 | 72 | 3.6%
10114000 | 70 | 3.5%
10102000 | 66 | 3.3%
10115000 | 61 | 3.0%
10109000 | 57 | 2.8%
Other values (131) | 1086 | 53.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 7098 | 51.2%
1 | 2838 | 20.5%
2 | 915 | 6.6%
M | 646 | 4.7%
8 | 347 | 2.5%
3 | 320 | 2.3%
S | 302 | 2.2%
6 | 264 | 1.9%
L | 221 | 1.6%
7 | 184 | 1.3%
Other values (8) | 728 | 5.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 12377 | 89.3%
Uppercase Letter | 1363 | 9.8%
Space Separator | 122 | 0.9%
Dash Punctuation | 1 | < 0.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 7098 | 57.3%
1 | 2838 | 22.9%
2 | 915 | 7.4%
8 | 347 | 2.8%
3 | 320 | 2.6%
6 | 264 | 2.1%
7 | 184 | 1.5%
4 | 166 | 1.3%
5 | 136 | 1.1%
9 | 109 | 0.9%
Uppercase Letter
Value | Count | Frequency (%)
M | 646 | 47.4%
S | 302 | 22.2%
L | 221 | 16.2%
Q | 123 | 9.0%
A | 70 | 5.1%
H | 1 | 0.1%
Space Separator
Value | Count | Frequency (%)
(space) | 122 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 12500 | 90.2%
Latin | 1363 | 9.8%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 7098 | 56.8%
1 | 2838 | 22.7%
2 | 915 | 7.3%
8 | 347 | 2.8%
3 | 320 | 2.6%
6 | 264 | 2.1%
7 | 184 | 1.5%
4 | 166 | 1.3%
5 | 136 | 1.1%
(space) | 122 | 1.0%
Other values (2) | 110 | 0.9%
Latin
Value | Count | Frequency (%)
M | 646 | 47.4%
S | 302 | 22.2%
L | 221 | 16.2%
Q | 123 | 9.0%
A | 70 | 5.1%
H | 1 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 13863 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 7098 | 51.2%
1 | 2838 | 20.5%
2 | 915 | 6.6%
M | 646 | 4.7%
8 | 347 | 2.5%
3 | 320 | 2.3%
S | 302 | 2.2%
6 | 264 | 1.9%
L | 221 | 1.6%
7 | 184 | 1.3%
Other values (8) | 728 | 5.3%

(상세)코드번호
Text

Distinct: 2641
Distinct (%): 99.2%
Missing: 0
Missing (%): 0.0%
Memory size: 20.9 KiB

Length

Max length: 8
Median length: 8
Mean length: 6.7647501
Min length: 1

Characters and Unicode

Total characters: 18001
Distinct characters: 18
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2621
Unique (%): 98.5%

Sample

1st row: 20301004
2nd row: 20301005
3rd row: 20301006
4th row: 20301007
5th row: 20301008

Value | Count | Frequency (%)
10 | 2 | 0.1%
12 | 2 | 0.1%
sm008 | 2 | 0.1%
13 | 2 | 0.1%
8 | 2 | 0.1%
7 | 2 | 0.1%
2 | 2 | 0.1%
5 | 2 | 0.1%
4 | 2 | 0.1%
11 | 2 | 0.1%
Other values (2630) | 2641 | 99.2%

Most occurring characters

Value | Count | Frequency (%)
0 | 6129 | 34.0%
1 | 3836 | 21.3%
2 | 1781 | 9.9%
3 | 1036 | 5.8%
M | 721 | 4.0%
4 | 720 | 4.0%
8 | 690 | 3.8%
6 | 650 | 3.6%
5 | 646 | 3.6%
7 | 540 | 3.0%
Other values (8) | 1252 | 7.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 16544 | 91.9%
Uppercase Letter | 1447 | 8.0%
Space Separator | 8 | < 0.1%
Dash Punctuation | 2 | < 0.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 6129 | 37.0%
1 | 3836 | 23.2%
2 | 1781 | 10.8%
3 | 1036 | 6.3%
4 | 720 | 4.4%
8 | 690 | 4.2%
6 | 650 | 3.9%
5 | 646 | 3.9%
7 | 540 | 3.3%
9 | 516 | 3.1%
Uppercase Letter
Value | Count | Frequency (%)
M | 721 | 49.8%
S | 318 | 22.0%
L | 255 | 17.6%
Q | 148 | 10.2%
A | 3 | 0.2%
H | 2 | 0.1%
Space Separator
Value | Count | Frequency (%)
(space) | 8 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 16554 | 92.0%
Latin | 1447 | 8.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 6129 | 37.0%
1 | 3836 | 23.2%
2 | 1781 | 10.8%
3 | 1036 | 6.3%
4 | 720 | 4.3%
8 | 690 | 4.2%
6 | 650 | 3.9%
5 | 646 | 3.9%
7 | 540 | 3.3%
9 | 516 | 3.1%
Other values (2) | 10 | 0.1%
Latin
Value | Count | Frequency (%)
M | 721 | 49.8%
S | 318 | 22.0%
L | 255 | 17.6%
Q | 148 | 10.2%
A | 3 | 0.2%
H | 2 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 18001 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 6129 | 34.0%
1 | 3836 | 21.3%
2 | 1781 | 9.9%
3 | 1036 | 5.8%
M | 721 | 4.0%
4 | 720 | 4.0%
8 | 690 | 3.8%
6 | 650 | 3.6%
5 | 646 | 3.6%
7 | 540 | 3.0%
Other values (8) | 1252 | 7.0%

코드명
Text

Distinct: 2256
Distinct (%): 84.8%
Missing: 0
Missing (%): 0.0%
Memory size: 20.9 KiB

Length

Max length: 27
Median length: 24
Mean length: 6.164224
Min length: 2

Characters and Unicode

Total characters: 16403
Distinct characters: 423
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1981
Unique (%): 74.4%

Sample

1st row: 안전관리실
2nd row: 경영지원처
3rd row: 열차운영처
4th row: 영업처
5th row: 차량처

Value | Count | Frequency (%)
기타부서 | 30 | 1.0%
코드 | 25 | 0.8%
본사 | 22 | 0.7%
경력자 | 16 | 0.5%
엑셀저장 | 15 | 0.5%
지원 | 13 | 0.4%
궤도신호사업소 | 10 | 0.3%
삭제 | 10 | 0.3%
소지 | 10 | 0.3%
경력 | 9 | 0.3%
Other values (2247) | 2868 | 94.7%

Most occurring characters

ValueCountFrequency (%)
868
 
5.3%
827
 
5.0%
771
 
4.7%
657
 
4.0%
484
 
3.0%
453
 
2.8%
372
 
2.3%
341
 
2.1%
308
 
1.9%
284
 
1.7%
Other values (413) 11038
67.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 15272 | 93.1%
Space Separator | 372 | 2.3%
Decimal Number | 213 | 1.3%
Close Punctuation | 177 | 1.1%
Open Punctuation | 177 | 1.1%
Uppercase Letter | 98 | 0.6%
Other Punctuation | 41 | 0.2%
Dash Punctuation | 29 | 0.2%
Lowercase Letter | 22 | 0.1%
Other Symbol | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
868
 
5.7%
827
 
5.4%
771
 
5.0%
657
 
4.3%
484
 
3.2%
453
 
3.0%
341
 
2.2%
308
 
2.0%
284
 
1.9%
253
 
1.7%
Other values (377) 10026
65.6%
Uppercase Letter
Value | Count | Frequency (%)
X | 55 | 56.1%
C | 12 | 12.2%
S | 11 | 11.2%
G | 5 | 5.1%
K | 4 | 4.1%
L | 3 | 3.1%
O | 3 | 3.1%
N | 2 | 2.0%
D | 1 | 1.0%
B | 1 | 1.0%
Decimal Number
Value | Count | Frequency (%)
2 | 56 | 26.3%
1 | 52 | 24.4%
0 | 24 | 11.3%
3 | 22 | 10.3%
5 | 15 | 7.0%
7 | 13 | 6.1%
8 | 10 | 4.7%
6 | 10 | 4.7%
4 | 7 | 3.3%
9 | 4 | 1.9%
Lowercase Letter
Value | Count | Frequency (%)
x | 14 | 63.6%
t | 2 | 9.1%
l | 2 | 9.1%
i | 2 | 9.1%
s | 1 | 4.5%
e | 1 | 4.5%
Close Punctuation
Value | Count | Frequency (%)
) | 176 | 99.4%
] | 1 | 0.6%
Open Punctuation
Value | Count | Frequency (%)
( | 176 | 99.4%
[ | 1 | 0.6%
Other Punctuation
Value | Count | Frequency (%)
, | 22 | 53.7%
/ | 19 | 46.3%
Space Separator
Value | Count | Frequency (%)
(space) | 372 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 29 | 100.0%
Other Symbol
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 15274 | 93.1%
Common | 1009 | 6.2%
Latin | 120 | 0.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
868
 
5.7%
827
 
5.4%
771
 
5.0%
657
 
4.3%
484
 
3.2%
453
 
3.0%
341
 
2.2%
308
 
2.0%
284
 
1.9%
253
 
1.7%
Other values (378) 10028
65.7%
Common
Value | Count | Frequency (%)
(space) | 372 | 36.9%
) | 176 | 17.4%
( | 176 | 17.4%
2 | 56 | 5.6%
1 | 52 | 5.2%
- | 29 | 2.9%
0 | 24 | 2.4%
, | 22 | 2.2%
3 | 22 | 2.2%
/ | 19 | 1.9%
Other values (8) | 61 | 6.0%
Latin
Value | Count | Frequency (%)
X | 55 | 45.8%
x | 14 | 11.7%
C | 12 | 10.0%
S | 11 | 9.2%
G | 5 | 4.2%
K | 4 | 3.3%
L | 3 | 2.5%
O | 3 | 2.5%
t | 2 | 1.7%
N | 2 | 1.7%
Other values (7) | 9 | 7.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 15272 | 93.1%
ASCII | 1129 | 6.9%
None | 2 | < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
868
 
5.7%
827
 
5.4%
771
 
5.0%
657
 
4.3%
484
 
3.2%
453
 
3.0%
341
 
2.2%
308
 
2.0%
284
 
1.9%
253
 
1.7%
Other values (377) 10026
65.6%
ASCII
Value | Count | Frequency (%)
(space) | 372 | 32.9%
) | 176 | 15.6%
( | 176 | 15.6%
2 | 56 | 5.0%
X | 55 | 4.9%
1 | 52 | 4.6%
- | 29 | 2.6%
0 | 24 | 2.1%
, | 22 | 1.9%
3 | 22 | 1.9%
Other values (25) | 145 | 12.8%
None
ValueCountFrequency (%)
2
100.0%

기관코드
Text

MISSING 

Distinct: 56
Distinct (%): 10.2%
Missing: 2114
Missing (%): 79.4%
Memory size: 20.9 KiB

Length

Max length: 8
Median length: 4
Mean length: 4.0530165
Min length: 1

Characters and Unicode

Total characters: 2217
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 16
Unique (%): 2.9%

Sample

1st row: 1029
2nd row: 1001
3rd row: 2015-H09
4th row: 1001
5th row: 1001

Value | Count | Frequency (%)
1001 | 314 | 57.4%
2001 | 54 | 9.9%
2018 | 19 | 3.5%
2003 | 18 | 3.3%
2007 | 14 | 2.6%
1003 | 12 | 2.2%
2011 | 9 | 1.6%
2006 | 8 | 1.5%
2005 | 6 | 1.1%
2004 | 6 | 1.1%
Other values (46) | 87 | 15.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 1027 | 46.3%
1 | 837 | 37.8%
2 | 192 | 8.7%
3 | 42 | 1.9%
8 | 34 | 1.5%
7 | 19 | 0.9%
5 | 17 | 0.8%
6 | 16 | 0.7%
4 | 15 | 0.7%
9 | 12 | 0.5%
Other values (2) | 6 | 0.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 2211 | 99.7%
Dash Punctuation | 3 | 0.1%
Uppercase Letter | 3 | 0.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 1027 | 46.4%
1 | 837 | 37.9%
2 | 192 | 8.7%
3 | 42 | 1.9%
8 | 34 | 1.5%
7 | 19 | 0.9%
5 | 17 | 0.8%
6 | 16 | 0.7%
4 | 15 | 0.7%
9 | 12 | 0.5%
Dash Punctuation
Value | Count | Frequency (%)
- | 3 | 100.0%
Uppercase Letter
Value | Count | Frequency (%)
H | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2214 | 99.9%
Latin | 3 | 0.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 1027 | 46.4%
1 | 837 | 37.8%
2 | 192 | 8.7%
3 | 42 | 1.9%
8 | 34 | 1.5%
7 | 19 | 0.9%
5 | 17 | 0.8%
6 | 16 | 0.7%
4 | 15 | 0.7%
9 | 12 | 0.5%
Latin
Value | Count | Frequency (%)
H | 3 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2217 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 1027 | 46.3%
1 | 837 | 37.8%
2 | 192 | 8.7%
3 | 42 | 1.9%
8 | 34 | 1.5%
7 | 19 | 0.9%
5 | 17 | 0.8%
6 | 16 | 0.7%
4 | 15 | 0.7%
9 | 12 | 0.5%
Other values (2) | 6 | 0.3%

면허경력구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 20.9 KiB

Value | Count
<NA> | 2639
LM028002 | 11
LM028003 | 10
LM028001 | 1

Length

Max length: 8
Median length: 4
Mean length: 4.0330703
Min length: 4

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 2639 | 99.2%
LM028002 | 11 | 0.4%
LM028003 | 10 | 0.4%
LM028001 | 1 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 2639 | 99.2%
lm028002 | 11 | 0.4%
lm028003 | 10 | 0.4%
lm028001 | 1 | < 0.1%

소지면허코드
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 6
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 20.9 KiB

Value | Count
<NA> | 2652
4 | 2
3 | 2
2 | 2
5 | 2

Length

Max length: 4
Median length: 4
Mean length: 3.9898534
Min length: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 2652 | 99.7%
4 | 2 | 0.1%
3 | 2 | 0.1%
2 | 2 | 0.1%
5 | 2 | 0.1%
1 | 1 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 2652 | 99.7%
4 | 2 | 0.1%
3 | 2 | 0.1%
2 | 2 | 0.1%
5 | 2 | 0.1%
1 | 1 | < 0.1%

경력유형코드
Text

MISSING 

Distinct: 6
Distinct (%): 60.0%
Missing: 2651
Missing (%): 99.6%
Memory size: 20.9 KiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 80
Distinct characters: 9
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5
Unique (%): 50.0%

Sample

1st row: LM034001
2nd row: LM034003
3rd row: LM034002
4th row: LM034005
5th row: LM034006

Value | Count | Frequency (%)
lm034006 | 5 | 50.0%
lm034001 | 1 | 10.0%
lm034003 | 1 | 10.0%
lm034002 | 1 | 10.0%
lm034005 | 1 | 10.0%
lm034004 | 1 | 10.0%

Most occurring characters

Value | Count | Frequency (%)
0 | 30 | 37.5%
3 | 11 | 13.8%
4 | 11 | 13.8%
L | 10 | 12.5%
M | 10 | 12.5%
6 | 5 | 6.2%
1 | 1 | 1.2%
2 | 1 | 1.2%
5 | 1 | 1.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 60 | 75.0%
Uppercase Letter | 20 | 25.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 30 | 50.0%
3 | 11 | 18.3%
4 | 11 | 18.3%
6 | 5 | 8.3%
1 | 1 | 1.7%
2 | 1 | 1.7%
5 | 1 | 1.7%
Uppercase Letter
Value | Count | Frequency (%)
L | 10 | 50.0%
M | 10 | 50.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 60 | 75.0%
Latin | 20 | 25.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 30 | 50.0%
3 | 11 | 18.3%
4 | 11 | 18.3%
6 | 5 | 8.3%
1 | 1 | 1.7%
2 | 1 | 1.7%
5 | 1 | 1.7%
Latin
Value | Count | Frequency (%)
L | 10 | 50.0%
M | 10 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 80 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 30 | 37.5%
3 | 11 | 13.8%
4 | 11 | 13.8%
L | 10 | 12.5%
M | 10 | 12.5%
6 | 5 | 6.2%
1 | 1 | 1.2%
2 | 1 | 1.2%
5 | 1 | 1.2%

경력년수
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 20.9 KiB

Value | Count
<NA> | 2655
1 | 2
3 | 2
2 | 2

Length

Max length: 4
Median length: 4
Mean length: 3.9932356
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 2655 | 99.8%
1 | 2 | 0.1%
3 | 2 | 0.1%
2 | 2 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 2655 | 99.8%
1 | 2 | 0.1%
3 | 2 | 0.1%
2 | 2 | 0.1%
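
The Imbalance percentages reported for 면허경력구분, 소지면허코드 and 경력년수 are consistent with an entropy-based score: one minus the Shannon entropy of the class frequencies divided by the maximum entropy for that number of classes. The Python sketch below assumes that definition. The function name `imbalance_score` and the handling of single-class input are assumptions made for illustration and are not taken from the profiling tool's source. Note that the literal "<NA>" value is counted as a class here, as it is in the Common Values tables.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max for a list of class counts.

    Assumption: H is the Shannon entropy (base 2) of the class
    frequencies and H_max = log2(number of classes).
    0 means perfectly balanced, values near 1 mean one class dominates.
    """
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # assumption: a single class is reported as 0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# Class counts copied from the Common Values tables in this report.
columns = {
    "면허경력구분": [2639, 11, 10, 1],
    "소지면허코드": [2652, 2, 2, 2, 2, 1],
    "경력년수": [2655, 2, 2, 2],
}

scores = {name: 100 * imbalance_score(c) for name, c in columns.items()}
for name, score in scores.items():
    print(f"{name}: {score:.1f}% imbalanced")
```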

사용여부
Boolean

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

Value | Count | Frequency (%)
True | 1831 | 68.8%
False | 830 | 31.2%

Interactions


Correlations

Variable | 순번 | 영문 테이블명 | 한글 테이블명 | 기관코드 | 면허경력구분 | 소지면허코드 | 경력유형코드 | 경력년수 | 사용여부
순번 | 1.000 | 0.959 | 0.959 | 0.306 | NaN | NaN | NaN | NaN | 0.811
영문 테이블명 | 0.959 | 1.000 | 1.000 | NaN | NaN | NaN | NaN | NaN | 0.434
한글 테이블명 | 0.959 | 1.000 | 1.000 | NaN | NaN | NaN | NaN | NaN | 0.434
기관코드 | 0.306 | NaN | NaN | 1.000 | NaN | NaN | NaN | NaN | 0.116
면허경력구분 | NaN | NaN | NaN | NaN | 1.000 | NaN | 0.000 | NaN | 0.000
소지면허코드 | NaN | NaN | NaN | NaN | NaN | 1.000 | NaN | NaN | 1.000
경력유형코드 | NaN | NaN | NaN | NaN | 0.000 | NaN | 1.000 | 1.000 | NaN
경력년수 | NaN | NaN | NaN | NaN | NaN | NaN | 1.000 | 1.000 | NaN
사용여부 | 0.811 | 0.434 | 0.434 | 0.116 | 0.000 | 1.000 | NaN | NaN | 1.000

Variable | 영문 테이블명 | 소지면허코드 | 면허경력구분 | 경력년수 | 사용여부 | 한글 테이블명
영문 테이블명 | 1.000 | 1.000 | 1.000 | 1.000 | 0.527 | 1.000
소지면허코드 | 1.000 | 1.000 | 1.000 | NaN | 0.756 | 1.000
면허경력구분 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 1.000
경력년수 | 1.000 | NaN | 1.000 | 1.000 | 1.000 | 1.000
사용여부 | 0.527 | 0.756 | 0.000 | 1.000 | 1.000 | 0.527
한글 테이블명 | 1.000 | 1.000 | 1.000 | 1.000 | 0.527 | 1.000

Variable | 순번 | 영문 테이블명 | 한글 테이블명 | 면허경력구분 | 소지면허코드 | 경력년수 | 사용여부
순번 | 1.000 | 0.721 | 0.721 | 1.000 | 1.000 | 1.000 | 0.644
영문 테이블명 | 0.721 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.527
한글 테이블명 | 0.721 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.527
면허경력구분 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000
소지면허코드 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.756
경력년수 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 1.000 | 1.000
사용여부 | 0.644 | 0.527 | 0.527 | 0.000 | 0.756 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
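
One common way to compute a nullity correlation is to turn each column into a 0/1 "is missing" indicator and take the Pearson correlation between indicators. The Python sketch below illustrates that idea with the standard library only. The helper names and the toy columns are hypothetical and are not rows from this dataset. Whether the heatmap in this report was computed exactly this way is an assumption.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # a column that is never or always missing
    return cov / math.sqrt(var_x * var_y)

def nullity_correlation(col_a, col_b):
    """Correlate the missingness patterns of two columns (None = missing)."""
    missing_a = [1 if v is None else 0 for v in col_a]
    missing_b = [1 if v is None else 0 for v in col_b]
    return pearson(missing_a, missing_b)

# Hypothetical toy columns, for illustration only.
col_x = [None, "a", None, "b"]
col_y = [None, "p", None, "q"]   # missing in exactly the same rows as col_x
col_z = ["p", None, "q", None]   # missing exactly where col_x is present

same = nullity_correlation(col_x, col_y)
opposite = nullity_correlation(col_x, col_z)
print("same pattern:", same)
print("opposite pattern:", opposite)
```

A value near 1 means two columns tend to be missing together, a value near -1 means one is filled where the other is empty, and a column with no variation in missingness has no defined correlation.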

Sample

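First rows

Index | 순번 | 대분류 | 영문 테이블명 | 한글 테이블명 | 상위(분류)코드 | (상세)코드번호 | 코드명 | 기관코드 | 면허경력구분 | 소지면허코드 | 경력유형코드 | 경력년수 | 사용여부
0 | 1 | 시스템 관리용 코드 | TB_QM1001 | 지사 시업소 | 20301000 | 20301004 | 안전관리실 | <NA> | <NA> | <NA> | <NA> | <NA> | Y
1 | 2 | 시스템 관리용 코드 | TB_QM1001 | 지사 시업소 | 20301000 | 20301005 | 경영지원처 | <NA> | <NA> | <NA> | <NA> | <NA> | Y
2 | 3 | 시스템 관리용 코드 | TB_QM1001 | 지사 시업소 | 20301000 | 20301006 | 열차운영처 | <NA> | <NA> | <NA> | <NA> | <NA> | Y
3 | 4 | 시스템 관리용 코드 | TB_QM1001 | 지사 시업소 | 20301000 | 20301007 | 영업처 | <NA> | <NA> | <NA> | <NA> | <NA> | Y
4 | 5 | 시스템 관리용 코드 | TB_QM1001 | 지사 시업소 | 20301000 | 20301008 | 차량처 | <NA> | <NA> | <NA> | <NA> | <NA> | Y
5 | 6 | 시스템 관리용 코드 | TB_QM1001 | 지사 시업소 | 20301000 | 20301009 | 전기기계설비처 | <NA> | <NA> | <NA> | <NA> | <NA> | Y
6 | 7 | 시스템 관리용 코드 | TB_QM1001 | 지사 시업소 | 20301000 | 20301010 | 신호통신처 | <NA> | <NA> | <NA> | <NA> | <NA> | Y
7 | 8 | 시스템 관리용 코드 | TB_QM1001 | 지사 시업소 | 20301000 | 20301011 | 시설처 | <NA> | <NA> | <NA> | <NA> | <NA> | Y
8 | 9 | 시스템 관리용 코드 | TB_QM1001 | 지사 시업소 | 2003 | 20302000 | 현업 | <NA> | <NA> | <NA> | <NA> | <NA> | N
9 | 10 | 시스템 관리용 코드 | TB_QM1001 | 지사 시업소 | 20302000 | 20302001 | 제1영업소 | <NA> | <NA> | <NA> | <NA> | <NA> | Y

Last rows

Index | 순번 | 대분류 | 영문 테이블명 | 한글 테이블명 | 상위(분류)코드 | (상세)코드번호 | 코드명 | 기관코드 | 면허경력구분 | 소지면허코드 | 경력유형코드 | 경력년수 | 사용여부
2651 | 2652 | 시스템 관리용 코드 | TB_LM1048 | 응시자격코드 | <NA> | 11 | 철도관련업무3년 경력 | <NA> | LM028003 | <NA> | LM034006 | 3 | Y
2652 | 2653 | 시스템 관리용 코드 | TB_LM1048 | 응시자격코드 | <NA> | 12 | 전동차차장2년 경력 | <NA> | LM028003 | <NA> | LM034004 | 2 | Y
2653 | 2654 | 시스템 관리용 코드 | TB_LM1048 | 응시자격코드 | <NA> | 14 | 제1종전기차량운전면허 소지 (운전경력 2년 이상) | <NA> | LM028002 | 2 | <NA> | <NA> | Y
2654 | 2655 | 시스템 관리용 코드 | TB_LM1048 | 응시자격코드 | <NA> | 15 | 철도장비운전면허 소지 (일반응시-5과목) | <NA> | LM028002 | 5 | <NA> | <NA> | Y
2655 | 2656 | 시스템 관리용 코드 | TB_LM1048 | 응시자격코드 | <NA> | 17 | 운전업무 경력 | <NA> | LM028003 | <NA> | LM034006 | <NA> | Y
2656 | 2657 | 시스템 관리용 코드 | TB_LM1048 | 응시자격코드 | <NA> | 18 | 신호업무 경력 | <NA> | LM028003 | <NA> | LM034006 | <NA> | Y
2657 | 2658 | 시스템 관리용 코드 | TB_LM1048 | 응시자격코드 | <NA> | 19 | 철도차량운전면허 소지 | <NA> | LM028002 | <NA> | <NA> | <NA> | Y
2658 | 2659 | 시스템 관리용 코드 | TB_LM1048 | 응시자격코드 | <NA> | 20 | 미경과조치 | <NA> | LM028003 | <NA> | <NA> | <NA> | Y
2659 | 2660 | 시스템 관리용 코드 | TB_LM1048 | 응시자격코드 | <NA> | 21 | 보수교육 | <NA> | LM028003 | <NA> | LM034006 | <NA> | Y
2660 | 2661 | 시스템 관리용 코드 | TB_LM1048 | 응시자격코드 | <NA> | 22 | 갱신교육 | <NA> | LM028002 | <NA> | LM034006 | <NA> | Y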