Overview

Dataset statistics

Number of variables: 7
Number of observations: 54
Missing cells: 15
Missing cells (%): 4.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.1 KiB
Average record size in memory: 58.4 B
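
The derived figures above can be cross-checked from the raw counts. This is a minimal sketch, assuming the percentage of missing cells is taken over all cells (variables × observations) and that total memory is the average record size times the number of observations; the variable names are illustrative.

```python
# Raw counts copied from the dataset statistics above.
n_variables = 7
n_observations = 54
missing_cells = 15
avg_record_bytes = 58.4

# Share of missing cells among all cells (variables x observations).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Total memory implied by the average record size, converted to KiB.
total_kib = avg_record_bytes * n_observations / 1024

print(f"{total_cells} cells, {missing_pct:.1f}% missing, {total_kib:.1f} KiB")
```

Both rounded results agree with the reported 4.0% and 3.1 KiB.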

Variable types

Categorical: 3
Text: 1
Boolean: 1
DateTime: 2

Dataset

Description: Education and Training Center website and registered customer information data (code, code name, registration date, etc.)
Author: 한국사학진흥재단
URL: https://www.data.go.kr/data/15042702/fileData.do

Alerts

High correlation: 헤더코드 is highly overall correlated with 수정자아이디
High correlation: 수정자아이디 is highly overall correlated with 헤더코드
Missing: 코드이름 has 6 (11.1%) missing values
Missing: 등록일자 has 7 (13.0%) missing values
Missing: 수정일자 has 2 (3.7%) missing values
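
The three missing-value alerts account for all 15 missing cells reported in the overview. A small sketch of that check, assuming each percentage is the column's missing count divided by the 54 observations; `missing_alert` is a hypothetical helper, not part of any library.

```python
n_observations = 54

# Missing counts per column, copied from the alerts above.
missing_by_column = {"코드이름": 6, "등록일자": 7, "수정일자": 2}


def missing_alert(column, n_missing, n_rows):
    """Format one alert line from a column's missing count (hypothetical helper)."""
    pct = 100 * n_missing / n_rows
    return f"{column} has {n_missing} ({pct:.1f}%) missing values"


alerts = [missing_alert(c, m, n_observations) for c, m in missing_by_column.items()]
total_missing = sum(missing_by_column.values())

for line in alerts:
    print(line)
print("Total missing cells:", total_missing)
```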

Reproduction

Analysis started: 2023-12-12 01:59:56.265614
Analysis finished: 2023-12-12 01:59:57.516806
Duration: 1.25 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
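
The reported duration is the difference between the two timestamps. A brief sketch using only the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block above.
started = datetime.fromisoformat("2023-12-12 01:59:56.265614")
finished = datetime.fromisoformat("2023-12-12 01:59:57.516806")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")
```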

Variables

헤더코드
Categorical

HIGH CORRELATION 

Distinct: 12
Distinct (%): 22.2%
Missing: 0
Missing (%): 0.0%
Memory size: 564.0 B

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 2
Unique (%): 3.7%

Sample

1st row: RMCD
2nd row: RMCD
3rd row: SXCD
4th row: SXCD
5th row: WKCD

Common Values

Value | Count | Frequency (%)
WKFD | 18 | 33.3%
BKCD | 7 | 13.0%
RMCD | 5 | 9.3%
WKDD | 5 | 9.3%
WKCD | 4 | 7.4%
WKPD | 4 | 7.4%
STDV | 3 | 5.6%
SXCD | 2 | 3.7%
BANK | 2 | 3.7%
ACCT | 2 | 3.7%
Other values (2) | 2 | 3.7%
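
The Distinct and Unique figures for 헤더코드 follow from this frequency table: distinct counts every value once, while unique counts only values that occur exactly once. A minimal sketch with `collections.Counter`; the two values behind "Other values (2)" are not named in the report, so `OTHER_1` and `OTHER_2` are hypothetical placeholders that each occur once.

```python
from collections import Counter

# Counts copied from the Common Values table. OTHER_1 and OTHER_2 are
# placeholders for the two unnamed values in "Other values (2)".
counts = Counter({
    "WKFD": 18, "BKCD": 7, "RMCD": 5, "WKDD": 5, "WKCD": 4, "WKPD": 4,
    "STDV": 3, "SXCD": 2, "BANK": 2, "ACCT": 2, "OTHER_1": 1, "OTHER_2": 1,
})

n_rows = sum(counts.values())                       # all observations
distinct = len(counts)                              # different values
unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

print(f"rows={n_rows}, distinct={distinct} ({100 * distinct / n_rows:.1f}%), "
      f"unique={unique} ({100 * unique / n_rows:.1f}%)")
```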

Length

Histogram of lengths of the category

코드
Categorical

Distinct: 24
Distinct (%): 44.4%
Missing: 0
Missing (%): 0.0%
Memory size: 564.0 B

Length

Max length: 6
Median length: 1
Mean length: 1.4444444
Min length: 1

Unique

Unique: 12
Unique (%): 22.2%

Sample

1st row: 1
2nd row: 2
3rd row: M
4th row: F
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 7 | 13.0%
3 | 6 | 11.1%
2 | 5 | 9.3%
4 | 4 | 7.4%
6 | 4 | 7.4%
5 | 3 | 5.6%
9 | 3 | 5.6%
20 | 2 | 3.7%
12 | 2 | 3.7%
7 | 2 | 3.7%
Other values (14) | 16 | 29.6%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
1 | 7 | 13.0%
3 | 6 | 11.1%
2 | 5 | 9.3%
4 | 4 | 7.4%
6 | 4 | 7.4%
5 | 3 | 5.6%
9 | 3 | 5.6%
11 | 2 | 3.7%
13 | 2 | 3.7%
12 | 2 | 3.7%
Other values (14) | 16 | 29.6%

코드이름
Text

MISSING 

Distinct: 37
Distinct (%): 77.1%
Missing: 6
Missing (%): 11.1%
Memory size: 564.0 B

Length

Max length: 33
Median length: 15
Mean length: 4.3541667
Min length: 2

Characters and Unicode

Total characters: 209
Distinct characters: 86
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
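
As an illustration of these character properties, the sketch below counts Unicode general categories for the longest 코드이름 value, taken from the last row of the Sample section. It uses Python's `unicodedata` module, which returns two-letter codes ("Lo" for Other Letter, "Nd" for Decimal Number, "Zs" for Space Separator, "Pd" for Dash Punctuation, "Ps" and "Pe" for Open and Close Punctuation). This is an independent sketch, not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

# Longest 코드이름 value, copied from the last row of the Sample section.
text = "대구은행 504-10-393783-7 (예금주 사학진흥기금)"

# Map every character to its two-letter Unicode general category.
categories = Counter(unicodedata.category(ch) for ch in text)

for code, count in categories.most_common():
    print(code, count)
```

The string is 33 characters long, which is consistent with the reported Max length.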

Unique

Unique: 33
Unique (%): 68.8%

Sample

1st row: 온돌방
2nd row: 침대방
3rd row: 남자
4th row: 여자
5th row: (blank)
Value | Count | Frequency (%)
기타 | 3 | 6.7%
e-러닝 | 3 | 6.7%
케프 | 2 | 4.4%
전망 | 2 | 4.4%
504-10-393783-7 | 2 | 4.4%
대구은행 | 2 | 4.4%
기숙사 | 1 | 2.2%
내부 | 1 | 2.2%
예금주 | 1 | 2.2%
온돌방 | 1 | 2.2%
Other values (27) | 27 | 60.0%

Most occurring characters

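Value | Count | Frequency (%)
(space) | 39 | 18.7%
- | 9 | 4.3%
(character not preserved) | 7 | 3.3%
3 | 6 | 2.9%
(character not preserved) | 6 | 2.9%
(character not preserved) | 5 | 2.4%
(character not preserved) | 4 | 1.9%
0 | 4 | 1.9%
(character not preserved) | 4 | 1.9%
(character not preserved) | 4 | 1.9%
Other values (76) | 121 | 57.9%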

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 127 | 60.8%
Space Separator | 39 | 18.7%
Decimal Number | 26 | 12.4%
Dash Punctuation | 9 | 4.3%
Lowercase Letter | 3 | 1.4%
Other Punctuation | 3 | 1.4%
Open Punctuation | 1 | 0.5%
Close Punctuation | 1 | 0.5%

Most frequent character per category

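Other Letter
Value | Count | Frequency (%)
(character not preserved) | 7 | 5.5%
(character not preserved) | 6 | 4.7%
(character not preserved) | 5 | 3.9%
(character not preserved) | 4 | 3.1%
(character not preserved) | 4 | 3.1%
(character not preserved) | 4 | 3.1%
(character not preserved) | 3 | 2.4%
(character not preserved) | 3 | 2.4%
(character not preserved) | 3 | 2.4%
(character not preserved) | 3 | 2.4%
Other values (61) | 85 | 66.9%

Decimal Number
Value | Count | Frequency (%)
3 | 6 | 23.1%
0 | 4 | 15.4%
7 | 4 | 15.4%
1 | 3 | 11.5%
5 | 2 | 7.7%
4 | 2 | 7.7%
8 | 2 | 7.7%
9 | 2 | 7.7%
2 | 1 | 3.8%

Space Separator
Value | Count | Frequency (%)
(space) | 39 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 9 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
e | 3 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 3 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1 | 100.0%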

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 127 | 60.8%
Common | 79 | 37.8%
Latin | 3 | 1.4%

Most frequent character per script

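Hangul
Value | Count | Frequency (%)
(character not preserved) | 7 | 5.5%
(character not preserved) | 6 | 4.7%
(character not preserved) | 5 | 3.9%
(character not preserved) | 4 | 3.1%
(character not preserved) | 4 | 3.1%
(character not preserved) | 4 | 3.1%
(character not preserved) | 3 | 2.4%
(character not preserved) | 3 | 2.4%
(character not preserved) | 3 | 2.4%
(character not preserved) | 3 | 2.4%
Other values (61) | 85 | 66.9%

Common
Value | Count | Frequency (%)
(space) | 39 | 49.4%
- | 9 | 11.4%
3 | 6 | 7.6%
0 | 4 | 5.1%
7 | 4 | 5.1%
1 | 3 | 3.8%
/ | 3 | 3.8%
5 | 2 | 2.5%
4 | 2 | 2.5%
8 | 2 | 2.5%
Other values (4) | 5 | 6.3%

Latin
Value | Count | Frequency (%)
e | 3 | 100.0%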

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 127 | 60.8%
ASCII | 82 | 39.2%

Most frequent character per block

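ASCII
Value | Count | Frequency (%)
(space) | 39 | 47.6%
- | 9 | 11.0%
3 | 6 | 7.3%
0 | 4 | 4.9%
7 | 4 | 4.9%
e | 3 | 3.7%
1 | 3 | 3.7%
/ | 3 | 3.7%
5 | 2 | 2.4%
4 | 2 | 2.4%
Other values (5) | 7 | 8.5%

Hangul
Value | Count | Frequency (%)
(character not preserved) | 7 | 5.5%
(character not preserved) | 6 | 4.7%
(character not preserved) | 5 | 3.9%
(character not preserved) | 4 | 3.1%
(character not preserved) | 4 | 3.1%
(character not preserved) | 4 | 3.1%
(character not preserved) | 3 | 2.4%
(character not preserved) | 3 | 2.4%
(character not preserved) | 3 | 2.4%
(character not preserved) | 3 | 2.4%
Other values (61) | 85 | 66.9%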

사용여부
Boolean

Distinct: 2
Distinct (%): 3.7%
Missing: 0
Missing (%): 0.0%
Memory size: 186.0 B

Value | Count | Frequency (%)
True | 37 | 68.5%
False | 17 | 31.5%

등록일자
Date

MISSING 

Distinct: 35
Distinct (%): 74.5%
Missing: 7
Missing (%): 13.0%
Memory size: 564.0 B
Minimum: 2007-02-02 03:36:00
Maximum: 2020-11-10 13:24:00
Histogram with fixed size bins (bins=35)
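
The time span covered by 등록일자 can be derived from the reported minimum and maximum. A short standard-library sketch; the format string is an assumption matching how the timestamps are printed in this report.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"  # assumed to match the printed timestamps

# Minimum and maximum of 등록일자, copied from the summary above.
earliest = datetime.strptime("2007-02-02 03:36:00", FMT)
latest = datetime.strptime("2020-11-10 13:24:00", FMT)

# timedelta keeps whole days and the leftover seconds separately.
span = latest - earliest
print(f"{span.days} days, {span.seconds // 3600} h {span.seconds % 3600 // 60} min")
```

The registration dates therefore cover a little under 14 years.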

수정일자
Date

MISSING 

Distinct: 25
Distinct (%): 48.1%
Missing: 2
Missing (%): 3.7%
Memory size: 564.0 B
Minimum: 2007-02-02 03:38:00
Maximum: 2020-11-23 17:20:00
Histogram with fixed size bins (bins=25)

수정자아이디
Categorical

HIGH CORRELATION 

Distinct: 9
Distinct (%): 16.7%
Missing: 0
Missing (%): 0.0%
Memory size: 564.0 B

Length

Max length: 9
Median length: 5
Mean length: 5.6851852
Min length: 4

Unique

Unique: 2
Unique (%): 3.7%

Sample

1st row: kmkim
2nd row: kmkim
3rd row: <NA>
4th row: <NA>
5th row: kmkim

Common Values

Value | Count | Frequency (%)
kmkim | 25 | 46.3%
young0217 | 10 | 18.5%
<NA> | 6 | 11.1%
sbkim | 5 | 9.3%
ejlee | 2 | 3.7%
thkim | 2 | 3.7%
seojk | 2 | 3.7%
hgkim | 1 | 1.9%
mjpark01 | 1 | 1.9%
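
The length statistics for 수정자아이디 can be reproduced from this table if `<NA>` is counted as a literal four-character string. That reading is an inference from the numbers, and it would also explain why the column reports 0 missing values while `<NA>` appears six times. A minimal check:

```python
from statistics import mean, median

# Counts copied from the Common Values table; "<NA>" is treated as an
# ordinary 4-character string (an assumption that the figures support).
counts = {
    "kmkim": 25, "young0217": 10, "<NA>": 6, "sbkim": 5, "ejlee": 2,
    "thkim": 2, "seojk": 2, "hgkim": 1, "mjpark01": 1,
}

# One length per observation, repeated according to each value's count.
lengths = [len(value) for value, n in counts.items() for _ in range(n)]

mean_len = mean(lengths)
median_len = median(lengths)
print(min(lengths), median_len, round(mean_len, 7), max(lengths))
```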

Length

Histogram of lengths of the category


Correlations

Variable | 헤더코드 | 코드 | 코드이름 | 사용여부 | 등록일자 | 수정일자 | 수정자아이디
헤더코드 | 1.000 | 0.000 | 0.982 | 0.643 | 1.000 | 0.975 | 0.878
코드 | 0.000 | 1.000 | 0.000 | 0.000 | 0.689 | 0.000 | 0.717
코드이름 | 0.982 | 0.000 | 1.000 | 0.948 | 0.782 | 0.975 | 0.995
사용여부 | 0.643 | 0.000 | 0.948 | 1.000 | 0.950 | 0.968 | 0.657
등록일자 | 1.000 | 0.689 | 0.782 | 0.950 | 1.000 | 0.990 | 1.000
수정일자 | 0.975 | 0.000 | 0.975 | 0.968 | 0.990 | 1.000 | 0.993
수정자아이디 | 0.878 | 0.717 | 0.995 | 0.657 | 1.000 | 0.993 | 1.000

Variable | 수정자아이디 | 헤더코드 | 사용여부 | 코드
수정자아이디 | 1.000 | 0.650 | 0.463 | 0.300
헤더코드 | 0.650 | 1.000 | 0.451 | 0.000
사용여부 | 0.463 | 0.451 | 1.000 | 0.000
코드 | 0.300 | 0.000 | 0.000 | 1.000

Variable | 헤더코드 | 코드 | 사용여부 | 수정자아이디
헤더코드 | 1.000 | 0.000 | 0.451 | 0.650
코드 | 0.000 | 1.000 | 0.000 | 0.300
사용여부 | 0.451 | 0.000 | 1.000 | 0.463
수정자아이디 | 0.650 | 0.300 | 0.463 | 1.000
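
The report does not state which coefficient each matrix shows. For pairs of categorical columns such as 헤더코드 and 수정자아이디, one common association measure is Cramér's V, which ranges from 0 (no association) to 1 (one column determines the other). The sketch below is a plain, uncorrected implementation on toy data; the function name and inputs are illustrative, and the result is not expected to reproduce the values in the matrices above.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # observed cell counts
    rows = Counter(xs)            # marginal counts of the first variable
    cols = Counter(ys)            # marginal counts of the second variable

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, rx in rows.items():
        for y, cy in cols.items():
            expected = rx * cy / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(rows), len(cols)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Toy data: the first pair is perfectly associated, the second is independent.
print(cramers_v(list("aabb"), list("xxyy")))
print(cramers_v(list("aabb"), list("xyxy")))
```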

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that makes patterns in data completion easy to spot.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
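
Nullity correlation can be sketched as the Pearson correlation between two columns' missing-value indicators. The example below uses the rows marked `<NA>` in the Sample section for 등록일자 (rows 44 to 50) and 수정일자 (rows 46 and 47). It assumes no other rows are missing in those columns, which is consistent with the reported totals of 7 and 2; the `pearson` helper is illustrative, and the heatmap's exact method is not stated in the report.

```python
from math import sqrt

n_rows = 54

# Row indices shown as <NA> in the Sample section (assumed to be complete).
reg_missing = set(range(44, 51))  # 등록일자: rows 44-50
mod_missing = {46, 47}            # 수정일자: rows 46-47

# 1 where the value is missing, 0 where it is present.
reg = [1 if i in reg_missing else 0 for i in range(n_rows)]
mod = [1 if i in mod_missing else 0 for i in range(n_rows)]


def pearson(xs, ys):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


nullity_corr = pearson(reg, mod)
print(round(nullity_corr, 3))
```

A positive value means the two columns tend to be missing in the same rows, as they are here.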

Sample

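First rows

Row | 헤더코드 | 코드 | 코드이름 | 사용여부 | 등록일자 | 수정일자 | 수정자아이디
0 | RMCD | 1 | 온돌방 | Y | 2007-02-02 3:36 | 2015-05-29 9:26 | kmkim
1 | RMCD | 2 | 침대방 | Y | 2007-02-02 3:36 | 2015-05-29 9:26 | kmkim
2 | SXCD | M | 남자 | Y | 2007-02-02 3:38 | 2007-02-02 3:38 | <NA>
3 | SXCD | F | 여자 | Y | 2007-02-02 3:38 | 2007-02-02 3:38 | <NA>
4 | WKCD | 1 | (blank) | N | 2007-02-02 3:39 | 2015-05-29 10:03 | kmkim
5 | WKCD | 2 | e-러닝 | Y | 2007-02-02 3:39 | 2015-05-29 10:10 | kmkim
6 | BKCD | 1 | 케프 e-러닝 | Y | 2007-02-05 3:19 | 2015-05-29 10:21 | kmkim
7 | BKCD | 2 | <NA> | N | 2007-02-05 3:20 | 2008-04-15 2:40 | sbkim
8 | WKFD | 1 | 회계/예산 | Y | 2007-02-05 10:23 | 2015-05-29 9:51 | kmkim
9 | WKFD | 2 | 세무 | Y | 2007-02-05 10:23 | 2015-05-29 9:19 | kmkim

Last rows

Row | 헤더코드 | 코드 | 코드이름 | 사용여부 | 등록일자 | 수정일자 | 수정자아이디
44 | WKFD | 13 | (blank) | N | <NA> | 2015-05-29 10:03 | kmkim
45 | WKFD | 16 | (blank) | N | <NA> | 2015-05-29 10:04 | kmkim
46 | WKCD | 3 | 내부 | Y | <NA> | <NA> | <NA>
47 | WKCD | 4 | 외부 | Y | <NA> | <NA> | <NA>
48 | WKFD | 14 | (blank) | N | <NA> | 2015-05-29 10:04 | kmkim
49 | WKFD | 15 | (blank) | N | <NA> | 2015-05-29 10:04 | kmkim
50 | WKFD | 17 | (blank) | N | <NA> | 2015-05-29 10:04 | kmkim
51 | ACCT | 31 | 504-10-393783-7 | Y | 2020-11-10 13:22 | 2020-11-23 17:20 | seojk
52 | BANK | 500106 | 대구은행 | Y | 2020-11-10 13:23 | 2020-11-23 17:20 | young0217
53 | BKCD | 7 | 대구은행 504-10-393783-7 (예금주 사학진흥기금) | Y | 2020-11-10 13:24 | 2020-11-10 13:24 | seojk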