Overview

Dataset statistics

Number of variables: 11
Number of observations: 6754
Missing cells: 31666
Missing cells (%): 42.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 593.7 KiB
Average record size in memory: 90.0 B
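
The missing-cell figures can be cross-checked against the per-column counts listed in the Alerts section. This is a minimal sketch, assuming the percentage is missing cells divided by (variables × observations); the variable names are illustrative only.

```python
# Per-column missing counts, copied from the Alerts section of this report.
missing_by_column = {
    "사번": 6589,
    "소속기관": 171,
    "소속부서": 6754,
    "직위": 6754,
    "연락처(휴대전화)": 5813,
    "이메일": 5585,
}

n_variables = 11
n_observations = 6754

# Assumption: the percentage is taken over every cell in the table.
total_cells = n_variables * n_observations
missing_cells = sum(missing_by_column.values())
missing_pct = 100 * missing_cells / total_cells

print(missing_cells, f"{missing_pct:.1f}%")
```

Under that assumption the column counts add up to the reported total and percentage.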

Variable types

Text: 6
Categorical: 3
Unsupported: 2

Dataset

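Description: Provides data on the gender, age group, institution, and other attributes of learners who have taken part in the training courses run to date by the Korea Asset Management Corporation (한국자산관리공사).
Author: Korea Asset Management Corporation (한국자산관리공사)
URL: https://www.data.go.kr/data/15111485/fileData.do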

Alerts

연령대 is highly overall correlated with 기관 분류 (High correlation)
기관 분류 is highly overall correlated with 성별 and 1 other field (High correlation)
성별 is highly overall correlated with 기관 분류 (High correlation)
성별 is highly imbalanced (61.0%) (Imbalance)
연령대 is highly imbalanced (85.7%) (Imbalance)
기관 분류 is highly imbalanced (54.7%) (Imbalance)
사번 has 6589 (97.6%) missing values (Missing)
소속기관 has 171 (2.5%) missing values (Missing)
소속부서 has 6754 (100.0%) missing values (Missing)
직위 has 6754 (100.0%) missing values (Missing)
연락처(휴대전화) has 5813 (86.1%) missing values (Missing)
이메일 has 5585 (82.7%) missing values (Missing)
학습자 번호 has unique values (Unique)
소속부서 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
직위 is an unsupported type, check if it needs cleaning or further analysis (Unsupported)
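
The report does not say how the imbalance percentages are computed. One formula that appears to reproduce the figures for 성별 and 연령대 is one minus the normalised Shannon entropy of the category counts. The sketch below rests on that assumption; `imbalance_score` is a hypothetical helper, not a ydata-profiling function.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k): H is the Shannon entropy (in bits) of the class
    distribution and k the number of classes. 0 means balanced, 1 means one class."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no distribution to balance
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Category counts copied from the Common Values tables of this report.
gender_counts = [5370, 772, 503, 87, 21, 1]    # 성별
age_counts = [6397, 139, 98, 85, 26, 6, 3]     # 연령대

gender_score = imbalance_score(gender_counts)
age_score = imbalance_score(age_counts)
print(f"{gender_score:.1%}", f"{age_score:.1%}")
```

Both scores are driven almost entirely by the `<NA>` category, which is counted as an ordinary value in these columns rather than as missing.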

Reproduction

Analysis started: 2023-12-13 00:56:23.634491
Analysis finished: 2023-12-13 00:56:24.556156
Duration: 0.92 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
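
The reported duration follows from the two timestamps. A small sketch using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
start = datetime.fromisoformat("2023-12-13 00:56:23.634491")
finish = datetime.fromisoformat("2023-12-13 00:56:24.556156")

duration = (finish - start).total_seconds()
print(f"{duration:.2f} seconds")
```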

Variables

학습자 번호
Text

UNIQUE 

Distinct: 6754
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 52.9 KiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 33770
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
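
As an illustration of those character properties, Python's standard `unicodedata` module reports the general category of each code point. The sketch below counts categories over the five sample rows listed for this variable. `unicodedata` returns abbreviations, so `Lu` corresponds to "Uppercase Letter" and `Nd` to "Decimal Number". `category_counts` is a hypothetical helper.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories (e.g. 'Lu', 'Nd') over all characters."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


# The five sample rows shown for 학습자 번호.
sample = ["H0001", "H0002", "H0003", "H0004", "H0005"]
counts = category_counts(sample)
print(counts)
```

Each identifier contributes one uppercase letter and four digits, which is consistent with the 20% / 80% category split reported for the full column.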

Unique

Unique: 6754
Unique (%): 100.0%

Sample

1st row: H0001
2nd row: H0002
3rd row: H0003
4th row: H0004
5th row: H0005
Value | Count | Frequency (%)
h0001 | 1 | < 0.1%
h7738 | 1 | < 0.1%
h7762 | 1 | < 0.1%
h7761 | 1 | < 0.1%
h7760 | 1 | < 0.1%
h7759 | 1 | < 0.1%
h7758 | 1 | < 0.1%
h7757 | 1 | < 0.1%
h7756 | 1 | < 0.1%
h7755 | 1 | < 0.1%
Other values (6744) | 6744 | 99.9%

Most occurring characters

ValueCountFrequency (%)
H 6754
20.0%
0 3174
9.4%
1 3049
9.0%
9 3021
8.9%
7 3004
8.9%
8 2992
8.9%
6 2894
8.6%
3 2486
 
7.4%
5 2227
 
6.6%
2 2120
 
6.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 27016
80.0%
Uppercase Letter 6754
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 3174
11.7%
1 3049
11.3%
9 3021
11.2%
7 3004
11.1%
8 2992
11.1%
6 2894
10.7%
3 2486
9.2%
5 2227
8.2%
2 2120
7.8%
4 2049
7.6%
Uppercase Letter
ValueCountFrequency (%)
H 6754
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 27016
80.0%
Latin 6754
 
20.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 3174
11.7%
1 3049
11.3%
9 3021
11.2%
7 3004
11.1%
8 2992
11.1%
6 2894
10.7%
3 2486
9.2%
5 2227
8.2%
2 2120
7.8%
4 2049
7.6%
Latin
ValueCountFrequency (%)
H 6754
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 33770
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
H 6754
20.0%
0 3174
9.4%
1 3049
9.0%
9 3021
8.9%
7 3004
8.9%
8 2992
8.9%
6 2894
8.6%
3 2486
 
7.4%
5 2227
 
6.6%
2 2120
 
6.3%

사번
Text

MISSING 

Distinct: 87
Distinct (%): 52.7%
Missing: 6589
Missing (%): 97.6%
Memory size: 52.9 KiB

Length

Max length: 6
Median length: 6
Mean length: 5.6909091
Min length: 5

Characters and Unicode

Total characters: 939
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 40
Unique (%): 24.2%

Sample

1st row: 1***69
2nd row: 1***00
3rd row: 1***94
4th row: 1***12
5th row: 1***84
Value | Count | Frequency (%)
1***03 | 5 | 3.0%
9***5 | 5 | 3.0%
1***49 | 4 | 2.4%
1***67 | 4 | 2.4%
8***1 | 4 | 2.4%
9***4 | 4 | 2.4%
1***12 | 4 | 2.4%
1***40 | 4 | 2.4%
8***8 | 3 | 1.8%
9***9 | 3 | 1.8%
Other values (77) | 125 | 75.8%

Most occurring characters

ValueCountFrequency (%)
* 495
52.7%
1 140
 
14.9%
9 53
 
5.6%
8 40
 
4.3%
5 37
 
3.9%
4 36
 
3.8%
6 35
 
3.7%
0 33
 
3.5%
7 26
 
2.8%
3 24
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Other Punctuation 495
52.7%
Decimal Number 444
47.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 140
31.5%
9 53
 
11.9%
8 40
 
9.0%
5 37
 
8.3%
4 36
 
8.1%
6 35
 
7.9%
0 33
 
7.4%
7 26
 
5.9%
3 24
 
5.4%
2 20
 
4.5%
Other Punctuation
ValueCountFrequency (%)
* 495
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 939
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
* 495
52.7%
1 140
 
14.9%
9 53
 
5.6%
8 40
 
4.3%
5 37
 
3.9%
4 36
 
3.8%
6 35
 
3.7%
0 33
 
3.5%
7 26
 
2.8%
3 24
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 939
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
* 495
52.7%
1 140
 
14.9%
9 53
 
5.6%
8 40
 
4.3%
5 37
 
3.9%
4 36
 
3.8%
6 35
 
3.7%
0 33
 
3.5%
7 26
 
2.8%
3 24
 
2.6%

이름
Text

Distinct: 109
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 52.9 KiB

Length

Max length: 4
Median length: 3
Mean length: 3.0026651
Min length: 3

Characters and Unicode

Total characters: 20280
Distinct characters: 100
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 25
Unique (%): 0.4%

Sample

1st row: 김**
2nd row: 전**
3rd row: 문**
4th row: 주**
5th row: 김**
ValueCountFrequency (%)
1377
20.4%
999
14.8%
566
 
8.4%
351
 
5.2%
336
 
5.0%
195
 
2.9%
150
 
2.2%
149
 
2.2%
144
 
2.1%
136
 
2.0%
Other values (88) 2351
34.8%

Most occurring characters

ValueCountFrequency (%)
* 13508
66.6%
1377
 
6.8%
999
 
4.9%
566
 
2.8%
351
 
1.7%
336
 
1.7%
195
 
1.0%
150
 
0.7%
149
 
0.7%
144
 
0.7%
Other values (90) 2505
 
12.4%

Most occurring categories

ValueCountFrequency (%)
Other Punctuation 13508
66.6%
Other Letter 6754
33.3%
Space Separator 18
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1377
20.4%
999
14.8%
566
 
8.4%
351
 
5.2%
336
 
5.0%
195
 
2.9%
150
 
2.2%
149
 
2.2%
144
 
2.1%
136
 
2.0%
Other values (88) 2351
34.8%
Other Punctuation
ValueCountFrequency (%)
* 13508
100.0%
Space Separator
ValueCountFrequency (%)
18
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 13526
66.7%
Hangul 6754
33.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1377
20.4%
999
14.8%
566
 
8.4%
351
 
5.2%
336
 
5.0%
195
 
2.9%
150
 
2.2%
149
 
2.2%
144
 
2.1%
136
 
2.0%
Other values (88) 2351
34.8%
Common
ValueCountFrequency (%)
* 13508
99.9%
18
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13526
66.7%
Hangul 6754
33.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
* 13508
99.9%
18
 
0.1%
Hangul
ValueCountFrequency (%)
1377
20.4%
999
14.8%
566
 
8.4%
351
 
5.2%
336
 
5.0%
195
 
2.9%
150
 
2.2%
149
 
2.2%
144
 
2.1%
136
 
2.0%
Other values (88) 2351
34.8%

성별
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.9 KiB
<NA>: 5370
남성: 772
여성: 503
(blank): 87
여성 (with extra whitespace): 21

Length

Max length: 4
Median length: 4
Mean length: 3.5836541
Min length: 1

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 5370 | 79.5%
남성 | 772 | 11.4%
여성 | 503 | 7.4%
(blank) | 87 | 1.3%
여성 (with extra whitespace) | 21 | 0.3%
남성 (with extra whitespace) | 1 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 5370 | 80.5%
남성 | 773 | 11.6%
여성 | 524 | 7.9%
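
The duplicate 여성 and 남성 entries collapse once surrounding whitespace is stripped, which matches the word-level counts (남성 773, 여성 524). The sketch below shows that normalisation. The exact padded forms are an assumption: they were chosen so that the lengths agree with the reported minimum length of 1 and mean length of 3.5836541, but the report does not show where the padding sits.

```python
from collections import Counter

# Raw category counts from the Common Values table.
# The padded spellings are assumed, not read from the data.
raw_counts = {
    "<NA>": 5370,
    "남성": 772,
    "여성": 503,
    " ": 87,        # blank entry, assumed to be a single space
    "여성  ": 21,    # assumed two padding characters
    "남성 ": 1,      # assumed one padding character
}

# Merge categories that differ only by surrounding whitespace.
normalised = Counter()
for value, count in raw_counts.items():
    normalised[value.strip()] += count

# Cross-check the assumed spellings against the reported mean length.
total_length = sum(len(value) * count for value, count in raw_counts.items())
mean_length = total_length / sum(raw_counts.values())

print(dict(normalised))
print(round(mean_length, 7))
```

After stripping, the blank entry becomes an empty string and is a candidate for treatment as missing, alongside the literal `<NA>` strings.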

연령대
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.9 KiB
<NA>: 6397
30대: 139
20대: 98
40대: 85
50대 이상: 26
Other values (2): 9

Length

Max length: 6
Median length: 4
Mean length: 3.9586911
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 6397 | 94.7%
30대 | 139 | 2.1%
20대 | 98 | 1.5%
40대 | 85 | 1.3%
50대 이상 | 26 | 0.4%
50대 | 6 | 0.1%
10대 | 3 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 6397 | 94.4%
30대 | 139 | 2.1%
20대 | 98 | 1.4%
40대 | 85 | 1.3%
50대 | 32 | 0.5%
이상 | 26 | 0.4%
10대 | 3 | < 0.1%

기관 분류
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 52.9 KiB
<NA>: 5016
지방자치단체: 963
공공기관: 465
국가기관: 131
중앙부처: 96
Other values (2): 83

Length

Max length: 7
Median length: 4
Mean length: 4.269618
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 5016 | 74.3%
지방자치단체 | 963 | 14.3%
공공기관 | 465 | 6.9%
국가기관 | 131 | 1.9%
중앙부처 | 96 | 1.4%
(blank) | 59 | 0.9%
대한민국 공군 | 24 | 0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 5016 | 74.7%
지방자치단체 | 963 | 14.3%
공공기관 | 465 | 6.9%
국가기관 | 131 | 1.9%
중앙부처 | 96 | 1.4%
대한민국 | 24 | 0.4%
공군 | 24 | 0.4%

소속기관
Text

MISSING 

Distinct: 2236
Distinct (%): 34.0%
Missing: 171
Missing (%): 2.5%
Memory size: 52.9 KiB

Length

Max length: 21
Median length: 18
Mean length: 6.506456
Min length: 2

Characters and Unicode

Total characters: 42832
Distinct characters: 341
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1182
Unique (%): 18.0%

Sample

1st row: 충청남도예산교육지원청
2nd row: 충청남도예산교육지원청
3rd row: 경기도여주교육지원청
4th row: 경기도여주교육지원청
5th row: 경기도용인교육지원청
Value | Count | Frequency (%)
한국자산관리공사 | 132 | 1.9%
국방시설본부 | 104 | 1.5%
한국철도시설공단 | 75 | 1.1%
해양수산부 | 49 | 0.7%
국가철도공단 | 49 | 0.7%
경찰청 | 49 | 0.7%
아산시청 | 47 | 0.7%
경기도 | 40 | 0.6%
부산광역시 | 39 | 0.6%
조달청 | 38 | 0.5%
Other values (2152) | 6371 | 91.1%

Most occurring characters

ValueCountFrequency (%)
3493
 
8.2%
2043
 
4.8%
1615
 
3.8%
1586
 
3.7%
1514
 
3.5%
1289
 
3.0%
1191
 
2.8%
1087
 
2.5%
967
 
2.3%
907
 
2.1%
Other values (331) 27140
63.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 42159
98.4%
Space Separator 479
 
1.1%
Decimal Number 144
 
0.3%
Close Punctuation 24
 
0.1%
Open Punctuation 22
 
0.1%
Uppercase Letter 3
 
< 0.1%
Other Punctuation 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3493
 
8.3%
2043
 
4.8%
1615
 
3.8%
1586
 
3.8%
1514
 
3.6%
1289
 
3.1%
1191
 
2.8%
1087
 
2.6%
967
 
2.3%
907
 
2.2%
Other values (316) 26467
62.8%
Decimal Number
ValueCountFrequency (%)
1 40
27.8%
2 20
13.9%
3 17
11.8%
7 14
 
9.7%
5 12
 
8.3%
6 12
 
8.3%
0 11
 
7.6%
9 11
 
7.6%
8 7
 
4.9%
Uppercase Letter
ValueCountFrequency (%)
D 2
66.7%
P 1
33.3%
Space Separator
ValueCountFrequency (%)
479
100.0%
Close Punctuation
ValueCountFrequency (%)
) 24
100.0%
Open Punctuation
ValueCountFrequency (%)
( 22
100.0%
Other Punctuation
ValueCountFrequency (%)
· 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 42159
98.4%
Common 670
 
1.6%
Latin 3
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3493
 
8.3%
2043
 
4.8%
1615
 
3.8%
1586
 
3.8%
1514
 
3.6%
1289
 
3.1%
1191
 
2.8%
1087
 
2.6%
967
 
2.3%
907
 
2.2%
Other values (316) 26467
62.8%
Common
ValueCountFrequency (%)
479
71.5%
1 40
 
6.0%
) 24
 
3.6%
( 22
 
3.3%
2 20
 
3.0%
3 17
 
2.5%
7 14
 
2.1%
5 12
 
1.8%
6 12
 
1.8%
0 11
 
1.6%
Other values (3) 19
 
2.8%
Latin
ValueCountFrequency (%)
D 2
66.7%
P 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 42159
98.4%
ASCII 672
 
1.6%
None 1
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
3493
 
8.3%
2043
 
4.8%
1615
 
3.8%
1586
 
3.8%
1514
 
3.6%
1289
 
3.1%
1191
 
2.8%
1087
 
2.6%
967
 
2.3%
907
 
2.2%
Other values (316) 26467
62.8%
ASCII
ValueCountFrequency (%)
479
71.3%
1 40
 
6.0%
) 24
 
3.6%
( 22
 
3.3%
2 20
 
3.0%
3 17
 
2.5%
7 14
 
2.1%
5 12
 
1.8%
6 12
 
1.8%
0 11
 
1.6%
Other values (4) 21
 
3.1%
None
ValueCountFrequency (%)
· 1
100.0%

소속부서
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 6754
Missing (%): 100.0%
Memory size: 59.5 KiB

직위
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 6754
Missing (%): 100.0%
Memory size: 59.5 KiB

연락처(휴대전화)
Text

MISSING 

Distinct: 106
Distinct (%): 11.3%
Missing: 5813
Missing (%): 86.1%
Memory size: 52.9 KiB

Length

Max length: 14
Median length: 13
Mean length: 13.002125
Min length: 12

Characters and Unicode

Total characters: 12235
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6
Unique (%): 0.6%

Sample

1st row: 010-****-6**9
2nd row: 010-****-0**2
3rd row: 010-****-5**6
4th row: 010-****-1**4
5th row: 010-****-8**2
Value | Count | Frequency (%)
010-****-0**0 | 17 | 1.8%
010-****-7**3 | 15 | 1.6%
010-****-6**5 | 15 | 1.6%
010-****-2**0 | 14 | 1.5%
010-****-0**3 | 14 | 1.5%
010-****-3**1 | 14 | 1.5%
010-****-0**7 | 14 | 1.5%
010-****-7**0 | 13 | 1.4%
010-****-1**7 | 13 | 1.4%
010-****-3**3 | 13 | 1.4%
Other values (92) | 799 | 84.9%

Most occurring characters

ValueCountFrequency (%)
* 5645
46.1%
0 2092
 
17.1%
- 1881
 
15.4%
1 1119
 
9.1%
3 209
 
1.7%
7 198
 
1.6%
5 197
 
1.6%
6 187
 
1.5%
9 187
 
1.5%
2 186
 
1.5%
Other values (3) 334
 
2.7%

Most occurring categories

ValueCountFrequency (%)
Other Punctuation 5645
46.1%
Decimal Number 4705
38.5%
Dash Punctuation 1881
 
15.4%
Space Separator 4
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 2092
44.5%
1 1119
23.8%
3 209
 
4.4%
7 198
 
4.2%
5 197
 
4.2%
6 187
 
4.0%
9 187
 
4.0%
2 186
 
4.0%
8 175
 
3.7%
4 155
 
3.3%
Other Punctuation
ValueCountFrequency (%)
* 5645
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1881
100.0%
Space Separator
ValueCountFrequency (%)
4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 12235
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
* 5645
46.1%
0 2092
 
17.1%
- 1881
 
15.4%
1 1119
 
9.1%
3 209
 
1.7%
7 198
 
1.6%
5 197
 
1.6%
6 187
 
1.5%
9 187
 
1.5%
2 186
 
1.5%
Other values (3) 334
 
2.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12235
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
* 5645
46.1%
0 2092
 
17.1%
- 1881
 
15.4%
1 1119
 
9.1%
3 209
 
1.7%
7 198
 
1.6%
5 197
 
1.6%
6 187
 
1.5%
9 187
 
1.5%
2 186
 
1.5%
Other values (3) 334
 
2.7%

이메일
Text

MISSING 

Distinct: 1093
Distinct (%): 93.5%
Missing: 5585
Missing (%): 82.7%
Memory size: 52.9 KiB

Length

Max length: 26
Median length: 24
Mean length: 17.788708
Min length: 13

Characters and Unicode

Total characters: 20795
Distinct characters: 55
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1034
Unique (%): 88.5%

Sample

1st row: s****72@naver.com
2nd row: w****nis@naver.com
3rd row: q****s601@naver.com
4th row: h****40@cne.go.kr
5th row: a****r21@naver.com
Value | Count | Frequency (%)
k****@police.go.kr | 6 | 0.5%
k****@korea.kr | 5 | 0.4%
e****@korea.kr | 4 | 0.3%
s****@korea.kr | 4 | 0.3%
k****19@korea.kr | 4 | 0.3%
j****2@korea.kr | 3 | 0.3%
n****@korea.kr | 3 | 0.3%
y****@kamco.or.kr | 3 | 0.3%
h****@korea.kr | 3 | 0.3%
h****yea@ice.go.kr | 3 | 0.3%
Other values (1082) | 1131 | 96.7%

Most occurring characters

ValueCountFrequency (%)
* 4669
22.5%
r 1734
 
8.3%
k 1575
 
7.6%
. 1547
 
7.4%
o 1417
 
6.8%
@ 1169
 
5.6%
a 1117
 
5.4%
e 1104
 
5.3%
n 630
 
3.0%
c 592
 
2.8%
Other values (45) 5241
25.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11584
55.7%
Other Punctuation 7388
35.5%
Decimal Number 1767
 
8.5%
Space Separator 22
 
0.1%
Uppercase Letter 21
 
0.1%
Dash Punctuation 7
 
< 0.1%
Connector Punctuation 5
 
< 0.1%
Other Letter 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 1734
15.0%
k 1575
13.6%
o 1417
12.2%
a 1117
9.6%
e 1104
9.5%
n 630
 
5.4%
c 592
 
5.1%
m 554
 
4.8%
g 456
 
3.9%
l 334
 
2.9%
Other values (16) 2071
17.9%
Uppercase Letter
ValueCountFrequency (%)
K 5
23.8%
R 3
14.3%
A 3
14.3%
O 2
 
9.5%
E 2
 
9.5%
P 1
 
4.8%
N 1
 
4.8%
V 1
 
4.8%
C 1
 
4.8%
M 1
 
4.8%
Decimal Number
ValueCountFrequency (%)
1 311
17.6%
0 295
16.7%
2 242
13.7%
7 186
10.5%
3 148
8.4%
9 140
7.9%
4 125
7.1%
8 113
 
6.4%
6 104
 
5.9%
5 103
 
5.8%
Other Punctuation
ValueCountFrequency (%)
* 4669
63.2%
. 1547
 
20.9%
@ 1169
 
15.8%
, 3
 
< 0.1%
Space Separator
ValueCountFrequency (%)
22
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 7
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 5
100.0%
Other Letter
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11605
55.8%
Common 9189
44.2%
Hangul 1
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 1734
14.9%
k 1575
13.6%
o 1417
12.2%
a 1117
9.6%
e 1104
9.5%
n 630
 
5.4%
c 592
 
5.1%
m 554
 
4.8%
g 456
 
3.9%
l 334
 
2.9%
Other values (27) 2092
18.0%
Common
ValueCountFrequency (%)
* 4669
50.8%
. 1547
 
16.8%
@ 1169
 
12.7%
1 311
 
3.4%
0 295
 
3.2%
2 242
 
2.6%
7 186
 
2.0%
3 148
 
1.6%
9 140
 
1.5%
4 125
 
1.4%
Other values (7) 357
 
3.9%
Hangul
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20794
> 99.9%
Hangul 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
* 4669
22.5%
r 1734
 
8.3%
k 1575
 
7.6%
. 1547
 
7.4%
o 1417
 
6.8%
@ 1169
 
5.6%
a 1117
 
5.4%
e 1104
 
5.3%
n 630
 
3.0%
c 592
 
2.8%
Other values (44) 5240
25.2%
Hangul
ValueCountFrequency (%)
1
100.0%

Correlations

 | 사번 | 성별 | 연령대 | 기관 분류
사번 | 1.000 | 0.533 | 0.373 | NaN
성별 | 0.533 | 1.000 | 0.250 | 0.680
연령대 | 0.373 | 0.250 | 1.000 | NaN
기관 분류 | NaN | 0.680 | NaN | 1.000

 | 연령대 | 기관 분류 | 성별
연령대 | 1.000 | 1.000 | 0.106
기관 분류 | 1.000 | 1.000 | 0.711
성별 | 0.106 | 0.711 | 1.000

 | 성별 | 연령대 | 기관 분류
성별 | 1.000 | 0.106 | 0.711
연령대 | 0.106 | 1.000 | 1.000
기관 분류 | 0.711 | 1.000 | 1.000
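
The matrices above do not state which association measure produced them. For pairs of categorical variables, Cramér's V is a common choice. The following is a generic sketch of that statistic (without bias correction) and not necessarily the computation performed for this report. `cramers_v` is a hypothetical helper and the inputs are toy data, not values from the dataset.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Cramér's V (no bias correction) for two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    if k <= 0:
        return float("nan")  # undefined when either variable has a single level
    return math.sqrt(chi2 / (n * k))


# Toy data: one perfectly associated pair and one independent pair.
perfect = cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"])
independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
print(perfect, independent)
```

A value near 1, as reported for 연령대 and 기관 분류, would mean that one variable almost determines the other. With `<NA>` dominating both columns, that may reflect shared missingness more than a substantive relationship.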

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
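
Nullity correlation can be sketched as the Pearson correlation between the "is missing" indicators of two columns. This assumes the heatmap follows that usual definition. `nullity_correlation` is a hypothetical helper, and the columns below are toy data in which `None` stands for a missing value.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the is-missing indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is always (or never) missing has no variance.
        return float("nan")
    return cov / math.sqrt(var_a * var_b)


# Toy columns, not taken from the dataset.
phone = ["010", None, None, "010"]
email = ["a@b", None, None, "c@d"]
name = ["k", "l", None, None]
empty = [None, None, None, None]

same_pattern = nullity_correlation(phone, email)
unrelated = nullity_correlation(phone, name)
undefined = nullity_correlation(phone, empty)
print(same_pattern, unrelated, undefined)
```

Columns that are entirely missing, such as 소속부서 and 직위 here, have constant indicators, so their nullity correlation with any other column is undefined.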

Sample

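 | 학습자 번호 | 사번 | 이름 | 성별 | 연령대 | 기관 분류 | 소속기관 | 소속부서 | 직위 | 연락처(휴대전화) | 이메일
0 | H0001 | <NA> | 김** | <NA> | <NA> | <NA> | 충청남도예산교육지원청 | <NA> | <NA> | <NA> | <NA>
1 | H0002 | <NA> | 전** | <NA> | <NA> | <NA> | 충청남도예산교육지원청 | <NA> | <NA> | <NA> | <NA>
2 | H0003 | <NA> | 문** | <NA> | <NA> | <NA> | 경기도여주교육지원청 | <NA> | <NA> | <NA> | <NA>
3 | H0004 | <NA> | 주** | <NA> | <NA> | <NA> | 경기도여주교육지원청 | <NA> | <NA> | <NA> | <NA>
4 | H0005 | <NA> | 김** | <NA> | <NA> | <NA> | 경기도용인교육지원청 | <NA> | <NA> | <NA> | <NA>
5 | H0006 | <NA> | 김** | <NA> | <NA> | <NA> | 경기도용인교육지원청 | <NA> | <NA> | <NA> | <NA>
6 | H0007 | <NA> | 진** | <NA> | <NA> | <NA> | 경상남도김해교육지원청 | <NA> | <NA> | <NA> | <NA>
7 | H0008 | <NA> | 남** | <NA> | <NA> | <NA> | 경상북도포항교육지원청 | <NA> | <NA> | <NA> | <NA>
8 | H0009 | <NA> | 박** | <NA> | <NA> | <NA> | 대전광역시동부교육지원청 | <NA> | <NA> | <NA> | <NA>
9 | H0010 | <NA> | 김** | <NA> | <NA> | <NA> | 대전광역시서부교육지원청 | <NA> | <NA> | <NA> | <NA>
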
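 | 학습자 번호 | 사번 | 이름 | 성별 | 연령대 | 기관 분류 | 소속기관 | 소속부서 | 직위 | 연락처(휴대전화) | 이메일
6744 | H4043 | <NA> | 이** | <NA> | <NA> | <NA> | 서초구청 | <NA> | <NA> | <NA> | <NA>
6745 | H4044 | <NA> | 김** | <NA> | <NA> | <NA> | 당진시청 | <NA> | <NA> | <NA> | <NA>
6746 | H4045 | <NA> | 박** | <NA> | <NA> | <NA> | 신창초등학교 | <NA> | <NA> | <NA> | <NA>
6747 | H4046 | <NA> | 정** | <NA> | <NA> | <NA> | 목포시청 | <NA> | <NA> | <NA> | <NA>
6748 | H4047 | <NA> | 이** | <NA> | <NA> | <NA> | 완주군청 | <NA> | <NA> | <NA> | <NA>
6749 | H4048 | <NA> | 김** | <NA> | <NA> | <NA> | 금정구청 | <NA> | <NA> | <NA> | <NA>
6750 | H4049 | <NA> | 류** | <NA> | <NA> | <NA> | 금정구청 | <NA> | <NA> | <NA> | <NA>
6751 | H4050 | <NA> | 윤** | <NA> | <NA> | <NA> | 부산지방항공청 | <NA> | <NA> | <NA> | <NA>
6752 | H4051 | <NA> | 박** | <NA> | <NA> | <NA> | 한국체육대학교 | <NA> | <NA> | <NA> | <NA>
6753 | H4052 | <NA> | 성** | <NA> | <NA> | <NA> | 춘천시청 | <NA> | <NA> | <NA> | <NA>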