Overview

Dataset statistics

Number of variables: 9
Number of observations: 250
Missing cells: 245
Missing cells (%): 10.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 17.7 KiB
Average record size in memory: 72.5 B
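
The derived figures in this panel can be checked from the raw counts. The sketch below assumes that the missing-cell share is missing cells divided by variables times observations, and that 1 KiB is 1024 bytes; the variable names are illustrative and not taken from the profiling tool.

```python
# Raw figures copied from the Dataset statistics panel.
n_variables = 9
n_observations = 250
missing_cells = 245
total_size_kib = 17.7

# Share of missing cells: missing cells over all cells (variables x observations).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size: total memory in bytes spread over the observations
# (assumes 1 KiB = 1024 B).
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both values round to the reported 10.9% and 72.5 B.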

Variable types

Text: 3
Categorical: 3
Boolean: 1
DateTime: 2

Dataset

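Description: This dataset is the domain management menu (code group ID, code name, code description, usage flag, etc.) of the website administration group of the Korea Advancing Schools Foundation (한국사학진흥재단).
URL: https://www.data.go.kr/data/15042704/fileData.do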

Alerts

사용여부 has constant value "" [Constant]
수정자 is highly overall correlated with 코드그룹아이디 and 1 other field [High correlation]
코드그룹아이디 is highly overall correlated with 등록자 and 1 other field [High correlation]
등록자 is highly overall correlated with 코드그룹아이디 and 1 other field [High correlation]
수정자 is highly imbalanced (67.3%) [Imbalance]
코드설명 has 9 (3.6%) missing values [Missing]
수정일 has 235 (94.0%) missing values [Missing]
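
The 67.3% imbalance figure for 수정자 can be reproduced from its two value counts (235 and 15). The sketch below assumes the score is one minus the Shannon entropy normalized by the log of the number of classes. That formula matches the reported number, but it is an assumption here rather than something the report states, and `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class shares and k is the number of classes.
    0 means perfectly balanced, 1 means a single class holds everything."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# 수정자 value counts from the variable section: <NA> = 235, admin = 15.
score = imbalance_score([235, 15])
print(f"{score:.1%}")
```

Under the same assumption, an evenly split column such as 등록자 (125 and 125) scores 0, which is consistent with it not raising an imbalance alert.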

Reproduction

Analysis started: 2023-12-12 05:42:40.200460
Analysis finished: 2023-12-12 05:42:41.157626
Duration: 0.96 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
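
The reported duration is the difference between the two timestamps. A minimal check using only the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the Reproduction panel.
started = datetime.strptime("2023-12-12 05:42:40.200460", FMT)
finished = datetime.strptime("2023-12-12 05:42:41.157626", FMT)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```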

Variables

코드
Text

Distinct: 134
Distinct (%): 53.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 KiB

Length

Max length: 3
Median length: 2
Mean length: 1.656
Min length: 1

Characters and Unicode

Total characters: 414
Distinct characters: 15
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
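
As an illustration of those character properties, Python's standard `unicodedata` module returns the general category of each code point. The sketch below runs it over a handful of 코드 values that appear in this report (not the full column), so the counts are for the sample only.

```python
import unicodedata
from collections import Counter

# A few 코드 values seen in this report's sample rows; not the full column.
sample_codes = [",", "0", "3", "99", "B", "N", "R", "|"]

# Count the Unicode general category of every character in every value.
category_counts = Counter(
    unicodedata.category(ch) for value in sample_codes for ch in value
)

for category, count in category_counts.most_common():
    print(category, count)
```

The two-letter codes map onto the category names used in the tables: `Nd` is Decimal Number, `Lu` is Uppercase Letter, `Po` is Other Punctuation (the comma), and `Sm` is Math Symbol (the vertical bar).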

Unique

Unique: 117
Unique (%): 46.8%
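
The report lists both a "Distinct" and a "Unique" count. The sketch below assumes that "distinct" means the number of different values and "unique" means the number of values that occur exactly once; the toy column and the helper name `distinct_and_unique` are hypothetical.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique): how many different values exist,
    and how many of those occur exactly once."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Hypothetical toy column, not the real 코드 data.
codes = ["1", "1", "2", "3", "3", "99", "B"]
distinct, unique = distinct_and_unique(codes)
print(distinct, unique)
```

Under this reading, a column with 134 distinct values of which 117 are unique has 17 values that repeat.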

Sample

1st row: ,
2nd row: 0
3rd row: 3
4th row: 4
5th row: 5
Value               Count  Frequency (%)
1                   25     10.0%
2                   24     9.6%
3                   19     7.6%
4                   11     4.4%
5                   11     4.4%
6                   8      3.2%
7                   6      2.4%
8                   4      1.6%
15                  3      1.2%
16                  3      1.2%
Other values (123)  136    54.4%

Most occurring characters

Value             Count  Frequency (%)
1                 103    24.9%
2                 57     13.8%
3                 43     10.4%
4                 35     8.5%
5                 35     8.5%
6                 32     7.7%
7                 29     7.0%
8                 26     6.3%
0                 25     6.0%
9                 24     5.8%
Other values (5)  5      1.2%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     409    98.8%
Uppercase Letter   3      0.7%
Other Punctuation  1      0.2%
Math Symbol        1      0.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 103
25.2%
2 57
13.9%
3 43
10.5%
4 35
 
8.6%
5 35
 
8.6%
6 32
 
7.8%
7 29
 
7.1%
8 26
 
6.4%
0 25
 
6.1%
9 24
 
5.9%
Uppercase Letter
ValueCountFrequency (%)
B 1
33.3%
N 1
33.3%
R 1
33.3%
Other Punctuation
ValueCountFrequency (%)
, 1
100.0%
Math Symbol
ValueCountFrequency (%)
| 1
100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  411    99.3%
Latin   3      0.7%

Most frequent character per script

Common
ValueCountFrequency (%)
1 103
25.1%
2 57
13.9%
3 43
10.5%
4 35
 
8.5%
5 35
 
8.5%
6 32
 
7.8%
7 29
 
7.1%
8 26
 
6.3%
0 25
 
6.1%
9 24
 
5.8%
Other values (2) 2
 
0.5%
Latin
ValueCountFrequency (%)
B 1
33.3%
N 1
33.3%
R 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 414
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 103
24.9%
2 57
13.8%
3 43
10.4%
4 35
 
8.5%
5 35
 
8.5%
6 32
 
7.7%
7 29
 
7.0%
8 26
 
6.3%
0 25
 
6.0%
9 24
 
5.8%
Other values (5) 5
 
1.2%

코드그룹아이디
Categorical

HIGH CORRELATION 

Distinct: 27
Distinct (%): 10.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 KiB
C001               125
C010               16
C013               13
C022               9
C032               8
Other values (22)  79

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: C014
2nd row: M002
3rd row: C001
4th row: C001
5th row: C001

Common Values

Value              Count  Frequency (%)
C001               125    50.0%
C010               16     6.4%
C013               13     5.2%
C022               9      3.6%
C032               8      3.2%
C016               7      2.8%
C005               7      2.8%
C012               5      2.0%
C031               5      2.0%
M002               5      2.0%
Other values (17)  50     20.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value              Count  Frequency (%)
c001               125    50.0%
c010               16     6.4%
c013               13     5.2%
c022               9      3.6%
c032               8      3.2%
c016               7      2.8%
c005               7      2.8%
c012               5      2.0%
c031               5      2.0%
m002               5      2.0%
Other values (17)  50     20.0%

코드명
Text

Distinct: 233
Distinct (%): 93.6%
Missing: 1
Missing (%): 0.4%
Memory size: 2.1 KiB

Length

Max length: 33
Median length: 27
Mean length: 8.8875502
Min length: 1

Characters and Unicode

Total characters: 2213
Distinct characters: 199
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 223
Unique (%): 89.6%

Sample

1st row: ,
2nd row: FailOver 처리중
3rd row: ISO-2022-JP
4th row: ISO-8859-1
5th row: BIG5
ValueCountFrequency (%)
발송실패 8
 
3.0%
에러 4
 
1.5%
sql 4
 
1.5%
iso-2022-jp 3
 
1.1%
페이지 3
 
1.1%
설문지 3
 
1.1%
3
 
1.1%
도메인 2
 
0.7%
big5 2
 
0.7%
발송대기 2
 
0.7%
Other values (224) 237
87.5%

Most occurring characters

ValueCountFrequency (%)
- 198
 
8.9%
i 135
 
6.1%
e 105
 
4.7%
a 85
 
3.8%
c 73
 
3.3%
n 72
 
3.3%
s 67
 
3.0%
x 64
 
2.9%
C 60
 
2.7%
o 58
 
2.6%
Other values (189) 1296
58.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1021
46.1%
Other Letter 424
19.2%
Uppercase Letter 309
 
14.0%
Decimal Number 210
 
9.5%
Dash Punctuation 198
 
8.9%
Space Separator 22
 
1.0%
Math Symbol 10
 
0.5%
Close Punctuation 7
 
0.3%
Open Punctuation 7
 
0.3%
Connector Punctuation 4
 
0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
21
 
5.0%
19
 
4.5%
16
 
3.8%
15
 
3.5%
13
 
3.1%
12
 
2.8%
10
 
2.4%
10
 
2.4%
9
 
2.1%
9
 
2.1%
Other values (122) 290
68.4%
Lowercase Letter
ValueCountFrequency (%)
i 135
13.2%
e 105
10.3%
a 85
 
8.3%
c 73
 
7.1%
n 72
 
7.1%
s 67
 
6.6%
x 64
 
6.3%
o 58
 
5.7%
d 55
 
5.4%
r 54
 
5.3%
Other values (16) 253
24.8%
Uppercase Letter
ValueCountFrequency (%)
C 60
19.4%
I 36
11.7%
E 32
10.4%
B 31
10.0%
D 29
9.4%
S 26
8.4%
A 10
 
3.2%
O 8
 
2.6%
K 8
 
2.6%
F 8
 
2.6%
Other values (13) 61
19.7%
Decimal Number
ValueCountFrequency (%)
8 44
21.0%
2 38
18.1%
5 38
18.1%
1 22
10.5%
9 16
 
7.6%
0 15
 
7.1%
7 14
 
6.7%
6 11
 
5.2%
3 7
 
3.3%
4 5
 
2.4%
Math Symbol
ValueCountFrequency (%)
+ 9
90.0%
| 1
 
10.0%
Dash Punctuation
ValueCountFrequency (%)
- 198
100.0%
Space Separator
ValueCountFrequency (%)
22
100.0%
Close Punctuation
ValueCountFrequency (%)
) 7
100.0%
Open Punctuation
ValueCountFrequency (%)
( 7
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 4
100.0%
Other Punctuation
ValueCountFrequency (%)
, 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1330
60.1%
Common 459
 
20.7%
Hangul 424
 
19.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
21
 
5.0%
19
 
4.5%
16
 
3.8%
15
 
3.5%
13
 
3.1%
12
 
2.8%
10
 
2.4%
10
 
2.4%
9
 
2.1%
9
 
2.1%
Other values (122) 290
68.4%
Latin
ValueCountFrequency (%)
i 135
 
10.2%
e 105
 
7.9%
a 85
 
6.4%
c 73
 
5.5%
n 72
 
5.4%
s 67
 
5.0%
x 64
 
4.8%
C 60
 
4.5%
o 58
 
4.4%
d 55
 
4.1%
Other values (39) 556
41.8%
Common
ValueCountFrequency (%)
- 198
43.1%
8 44
 
9.6%
2 38
 
8.3%
5 38
 
8.3%
22
 
4.8%
1 22
 
4.8%
9 16
 
3.5%
0 15
 
3.3%
7 14
 
3.1%
6 11
 
2.4%
Other values (8) 41
 
8.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1789
80.8%
Hangul 424
 
19.2%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 198
 
11.1%
i 135
 
7.5%
e 105
 
5.9%
a 85
 
4.8%
c 73
 
4.1%
n 72
 
4.0%
s 67
 
3.7%
x 64
 
3.6%
C 60
 
3.4%
o 58
 
3.2%
Other values (57) 872
48.7%
Hangul
ValueCountFrequency (%)
21
 
5.0%
19
 
4.5%
16
 
3.8%
15
 
3.5%
13
 
3.1%
12
 
2.8%
10
 
2.4%
10
 
2.4%
9
 
2.1%
9
 
2.1%
Other values (122) 290
68.4%

코드설명
Text

MISSING 

Distinct: 233
Distinct (%): 96.7%
Missing: 9
Missing (%): 3.6%
Memory size: 2.1 KiB

Length

Max length: 63
Median length: 33
Mean length: 14.879668
Min length: 1

Characters and Unicode

Total characters: 3586
Distinct characters: 270
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 230
Unique (%): 95.4%

Sample

1st row: 구분자코드 COMMA
2nd row: 파일로 적재되었으나 SMS 발송테이블에 insert 되지 못한 상태
3rd row: 일본어(ISO-2022-JP)
4th row: 한국어(ISO-8859-1)
5th row: 중국어 번체(BIG5)
ValueCountFrequency (%)
ebcdic 37
 
6.3%
ibm 37
 
6.3%
mac 12
 
2.0%
japanese 11
 
1.9%
dos 10
 
1.7%
iso 10
 
1.7%
windows 10
 
1.7%
chinese 10
 
1.7%
iscii 10
 
1.7%
european 9
 
1.5%
Other values (297) 435
73.6%

Most occurring characters

ValueCountFrequency (%)
350
 
9.8%
e 175
 
4.9%
a 168
 
4.7%
I 141
 
3.9%
n 136
 
3.8%
i 128
 
3.6%
C 120
 
3.3%
( 111
 
3.1%
) 111
 
3.1%
r 107
 
3.0%
Other values (260) 2039
56.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1285
35.8%
Uppercase Letter 870
24.3%
Other Letter 746
20.8%
Space Separator 350
 
9.8%
Open Punctuation 111
 
3.1%
Close Punctuation 111
 
3.1%
Dash Punctuation 44
 
1.2%
Decimal Number 44
 
1.2%
Other Punctuation 22
 
0.6%
Control 3
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
24
 
3.2%
24
 
3.2%
23
 
3.1%
23
 
3.1%
21
 
2.8%
19
 
2.5%
18
 
2.4%
17
 
2.3%
14
 
1.9%
14
 
1.9%
Other values (192) 549
73.6%
Lowercase Letter
ValueCountFrequency (%)
e 175
13.6%
a 168
13.1%
n 136
10.6%
i 128
10.0%
r 107
 
8.3%
o 67
 
5.2%
l 66
 
5.1%
d 58
 
4.5%
s 56
 
4.4%
t 52
 
4.0%
Other values (15) 272
21.2%
Uppercase Letter
ValueCountFrequency (%)
I 141
16.2%
C 120
13.8%
B 93
10.7%
S 89
10.2%
E 73
8.4%
M 70
8.0%
D 58
 
6.7%
O 33
 
3.8%
U 21
 
2.4%
J 20
 
2.3%
Other values (14) 152
17.5%
Decimal Number
ValueCountFrequency (%)
2 10
22.7%
5 9
20.5%
8 7
15.9%
1 5
11.4%
0 4
 
9.1%
3 3
 
6.8%
7 2
 
4.5%
9 2
 
4.5%
4 1
 
2.3%
6 1
 
2.3%
Other Punctuation
ValueCountFrequency (%)
. 13
59.1%
: 5
 
22.7%
, 3
 
13.6%
/ 1
 
4.5%
Space Separator
ValueCountFrequency (%)
350
100.0%
Open Punctuation
ValueCountFrequency (%)
( 111
100.0%
Close Punctuation
ValueCountFrequency (%)
) 111
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 44
100.0%
Control
ValueCountFrequency (%)
3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2155
60.1%
Hangul 746
 
20.8%
Common 685
 
19.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
24
 
3.2%
24
 
3.2%
23
 
3.1%
23
 
3.1%
21
 
2.8%
19
 
2.5%
18
 
2.4%
17
 
2.3%
14
 
1.9%
14
 
1.9%
Other values (192) 549
73.6%
Latin
ValueCountFrequency (%)
e 175
 
8.1%
a 168
 
7.8%
I 141
 
6.5%
n 136
 
6.3%
i 128
 
5.9%
C 120
 
5.6%
r 107
 
5.0%
B 93
 
4.3%
S 89
 
4.1%
E 73
 
3.4%
Other values (39) 925
42.9%
Common
ValueCountFrequency (%)
350
51.1%
( 111
 
16.2%
) 111
 
16.2%
- 44
 
6.4%
. 13
 
1.9%
2 10
 
1.5%
5 9
 
1.3%
8 7
 
1.0%
1 5
 
0.7%
: 5
 
0.7%
Other values (9) 20
 
2.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2840
79.2%
Hangul 746
 
20.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
350
 
12.3%
e 175
 
6.2%
a 168
 
5.9%
I 141
 
5.0%
n 136
 
4.8%
i 128
 
4.5%
C 120
 
4.2%
( 111
 
3.9%
) 111
 
3.9%
r 107
 
3.8%
Other values (58) 1293
45.5%
Hangul
ValueCountFrequency (%)
24
 
3.2%
24
 
3.2%
23
 
3.1%
23
 
3.1%
21
 
2.8%
19
 
2.5%
18
 
2.4%
17
 
2.3%
14
 
1.9%
14
 
1.9%
Other values (192) 549
73.6%

사용여부
Boolean

CONSTANT 

Distinct: 1
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 382.0 B
True
250 
Value  Count  Frequency (%)
True   250    100.0%

등록자
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 KiB
admin
125 
<NA>
125 

Length

Max length: 5
Median length: 4.5
Mean length: 4.5
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: admin
2nd row: admin
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value  Count  Frequency (%)
admin  125    50.0%
<NA>   125    50.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
admin  125    50.0%
na     125    50.0%

등록일
Date

Distinct: 51
Distinct (%): 20.4%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 KiB
Minimum: 2006-11-16 15:03:00
Maximum: 2008-08-05 00:00:00
Histogram with fixed size bins (bins=50)

수정자
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 2.1 KiB
<NA>   235
admin  15

Length

Max length: 5
Median length: 4
Mean length: 4.06
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value  Count  Frequency (%)
<NA>   235    94.0%
admin  15     6.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
na     235    94.0%
admin  15     6.0%

수정일
Date

MISSING 

Distinct: 13
Distinct (%): 86.7%
Missing: 235
Missing (%): 94.0%
Memory size: 2.1 KiB
Minimum: 2006-11-16 15:51:00
Maximum: 2008-08-01 10:01:00
Histogram with fixed size bins (bins=13)
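
A histogram with fixed-size bins splits the range between the minimum and maximum timestamp into equal-width intervals and counts the values in each. The sketch below shows one way to do that with the standard library. `fixed_size_bins` is a hypothetical helper, and apart from the reported minimum and maximum the timestamps are made up for illustration.

```python
from datetime import datetime

def fixed_size_bins(timestamps, bins):
    """Count timestamps in `bins` equal-width intervals between min and max."""
    lo, hi = min(timestamps), max(timestamps)
    counts = [0] * bins
    if hi == lo:
        # Degenerate range: every value falls into the first bin.
        counts[0] = len(timestamps)
        return counts
    width = (hi - lo) / bins  # a timedelta
    for ts in timestamps:
        # The maximum lands exactly on the upper edge, so clamp it into the last bin.
        idx = min(int((ts - lo) / width), bins - 1)
        counts[idx] += 1
    return counts

# First and last values are the reported minimum and maximum of 수정일;
# the ones in between are hypothetical.
values = [
    datetime(2006, 11, 16, 15, 51),
    datetime(2006, 12, 26, 11, 28),
    datetime(2007, 1, 16, 11, 20),
    datetime(2008, 5, 28, 14, 21),
    datetime(2008, 8, 1, 10, 1),
]
counts = fixed_size_bins(values, bins=13)
print(counts)
```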

Correlations

	코드그룹아이디	등록일	수정일
코드그룹아이디	1.000	0.994	1.000
등록일	0.994	1.000	0.986
수정일	1.000	0.986	1.000

	수정자	코드그룹아이디	등록자
수정자	1.000	1.000	1.000
코드그룹아이디	1.000	1.000	1.000
등록자	1.000	1.000	1.000

	코드그룹아이디	등록자	수정자
코드그룹아이디	1.000	1.000	1.000
등록자	1.000	1.000	1.000
수정자	1.000	1.000	1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
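
One way to read nullity correlation is as the Pearson correlation between per-column "value is present" indicators. The sketch below implements that reading with the standard library only. It is an assumption about the metric rather than the tool's own code, and the helper name and toy columns are hypothetical.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 'is present' indicators of two columns.
    Missing values are represented as None."""
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is always or never missing has no defined correlation.
        return float("nan")
    return cov / math.sqrt(var_a * var_b)

# Hypothetical toy columns: the first two are missing on exactly the same rows.
modifier = ["admin", None, None, "admin", None]
modified_at = ["2006-11-16", None, None, "2008-08-01", None]
description = ["a", "b", None, "c", "d"]

same_pattern = nullity_correlation(modifier, modified_at)
partial_overlap = nullity_correlation(modifier, description)
print(same_pattern, partial_overlap)
```

Two columns that are missing on exactly the same rows score close to 1, while columns whose gaps only partly overlap score somewhere in between.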

Sample

First rows

(index)	코드	코드그룹아이디	코드명	코드설명	사용여부	등록자	등록일	수정자	수정일
0	,	C014	,	구분자코드 COMMA	Y	admin	2007-01-16 11:20	<NA>	<NA>
1	0	M002	FailOver 처리중	파일로 적재되었으나 SMS 발송테이블에 insert 되지 못한 상태	Y	admin	2008-05-28 14:21	<NA>	<NA>
2	3	C001	ISO-2022-JP	일본어(ISO-2022-JP)	Y	<NA>	2008-05-28 14:13	<NA>	<NA>
3	4	C001	ISO-8859-1	한국어(ISO-8859-1)	Y	<NA>	2008-05-28 14:13	<NA>	<NA>
4	5	C001	BIG5	중국어 번체(BIG5)	Y	<NA>	2008-05-28 14:13	<NA>	<NA>
5	6	C001	GB2312	중국어 간체(GB2312)	Y	<NA>	2008-05-28 14:13	<NA>	<NA>
6	1	C001	EUC-KR	한국어(EUC-KR)	Y	<NA>	2008-05-28 14:13	<NA>	<NA>
7	1	C002	<NA>	인코딩을 하지 않음	Y	admin	2006-11-16 15:03	admin	2006-11-16 15:51
8	1	C003	비공유	비공유	Y	admin	2006-11-19 17:01	<NA>	<NA>
9	1	C004	SQL Builder	SQL Builder	Y	admin	2006-11-21 10:11	<NA>	<NA>

Last rows

(index)	코드	코드그룹아이디	코드명	코드설명	사용여부	등록자	등록일	수정자	수정일
240	5	C016	존재하지않는 도메인	발송에러코드그룹	Y	admin	2007-01-16 11:20	<NA>	<NA>
241	6	C013	문법에러	문법에러	Y	admin	2006-12-26 11:28	<NA>	<NA>
242	6	C016	문법 에러	발송에러코드그룹	Y	admin	2007-01-16 11:20	<NA>	<NA>
243	7	C013	DNS오류	DNS오류	Y	admin	2006-12-26 11:28	<NA>	<NA>
244	7	C016	DNS 에러	발송에러코드그룹	Y	admin	2007-01-16 11:20	<NA>	<NA>
245	99	C004	자동메일그룹	자동메일그룹	Y	admin	2006-11-21 10:15	<NA>	<NA>
246	B	M002	발송전	등록된 SMS가 승인되어 발송전인 상태	Y	admin	2008-05-28 14:21	<NA>	<NA>
247	N	M002	미승인	등록만하고 승인이전인 상태	Y	admin	2008-05-28 14:21	<NA>	<NA>
248	R	M002	정기예약	SMS 정기예약 발송중	Y	admin	2008-05-28 14:21	<NA>	<NA>
249	|	C014	|	구분자코드 PIPELINE	Y	admin	2007-01-16 11:20	<NA>	<NA>