Overview

Dataset statistics

Number of variables: 11
Number of observations: 1568
Missing cells: 1568
Missing cells (%): 9.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 137.9 KiB
Average record size in memory: 90.1 B
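
The missing-cell percentage can be checked from the other figures in this block. The short sketch below only redoes that arithmetic; the variable names are illustrative and not part of the report.

```python
# Figures taken from the "Dataset statistics" block above.
n_variables = 11
n_observations = 1568
missing_cells = 1568

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

Because the missing-cell count equals the number of observations, the result is the same as one fully empty column out of eleven, which matches the alert that 권별 is 100.0% missing.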

Variable types

Categorical: 7
Text: 3
Unsupported: 1

Dataset

Description: Textbook list (교과서목록표)
Author: Ministry of Education (교육부)
URL: https://www.data.go.kr/data/3038852/fileData.do

Alerts

개정구분 has constant value "2015개정" [Constant]
대표 is highly overall correlated with 검·인정구분 and 2 other fields [High correlation]
검·인정구분 is highly overall correlated with 대표 and 1 other field [High correlation]
기관 is highly overall correlated with 검·인정구분 and 1 other field [High correlation]
교지명 is highly overall correlated with 학교급명 [High correlation]
학교급명 is highly overall correlated with 교지명 and 1 other field [High correlation]
대표 is highly imbalanced (87.4%) [Imbalance]
권별 has 1568 (100.0%) missing values [Missing]
권별 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
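
The 87.4% imbalance figure for 대표 appears consistent with an entropy-based score, one minus the Shannon entropy of the value counts divided by its maximum for the same number of classes. This is inferred from the numbers and is not stated in the report. The sketch below recomputes it from the counts in the 대표 "Common Values" table. The last four counts are an assumption: the report only gives "Other values (4) 10", and 4, 4, 1, 1 is the split that fits both "Unique 2" and the descending order of the table.

```python
import math

# Value counts for 대표 as listed in its "Common Values" table.
# The final four entries (4, 4, 1, 1) are assumed; the report
# collapses them into "Other values (4) 10".
counts = [1484, 15, 14, 14, 8, 6, 5, 4, 4, 4, 4, 4, 1, 1]


def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy in bits."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


score = imbalance_score(counts)
print(f"{score:.1%}")
```

A score of 0 would mean all 14 values are equally frequent; a score near 1 means a single value dominates, as "<NA>" does here with 1484 of 1568 rows.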

Reproduction

Analysis started: 2023-12-12 17:46:36.292193
Analysis finished: 2023-12-12 17:46:37.249853
Duration: 0.96 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
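
The reported duration is the difference between the two timestamps. A minimal check using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2023-12-12 17:46:36.292193")
finished = datetime.fromisoformat("2023-12-12 17:46:37.249853")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```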

Variables

검·인정구분 (authorization type)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.4 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 검정
2nd row: 검정
3rd row: 검정
4th row: 검정
5th row: 검정

Common Values

Value | Count | Frequency (%)
인정 | 868 | 55.4%
검정 | 700 | 44.6%

교지명 (book type)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.4 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 교과서
2nd row: 지도서
3rd row: 교과서
4th row: 지도서
5th row: 교과서

Common Values

Value | Count | Frequency (%)
교과서 | 1197 | 76.3%
지도서 | 371 | 23.7%

학교급명 (school level)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 12.4 KiB

Length

Max length: 4
Median length: 4
Mean length: 3.6256378
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 초등학교
2nd row: 초등학교
3rd row: 초등학교
4th row: 초등학교
5th row: 초등학교

Common Values

Value | Count | Frequency (%)
고등학교 | 766 | 48.9%
중학교 | 587 | 37.4%
초등학교 | 215 | 13.7%
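
The length statistics reported for 학교급명 follow directly from its category counts, since every observation holds one of three strings. The sketch below rebuilds them with the standard library; it assumes length is measured in characters (code points), which matches the reported values.

```python
from statistics import mean, median

# Category counts for 학교급명 from its "Common Values" table.
value_counts = {"고등학교": 766, "중학교": 587, "초등학교": 215}

# One length per observation, measured in characters.
lengths = [len(value) for value, n in value_counts.items() for _ in range(n)]

print("max:", max(lengths))
print("median:", median(lengths))
print("mean:", round(mean(lengths), 7))
print("min:", min(lengths))
```

The mean works out to 5685 / 1568, which rounds to the 3.6256378 shown in the report.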

출판사 (publisher)
Text

Distinct: 87
Distinct (%): 5.5%
Missing: 0
Missing (%): 0.0%
Memory size: 12.4 KiB

Length

Max length: 28
Median length: 15
Mean length: 8.1549745
Min length: 3

Characters and Unicode

Total characters: 12787
Distinct characters: 141
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
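
As an illustration of the character properties mentioned here, the sketch below counts Unicode general categories for the five sample values of this variable using Python's `unicodedata` module. The `CATEGORY_NAMES` mapping and the `category_counts` helper are hypothetical names introduced for this example; the readable labels mirror the wording used in the report's tables.

```python
import unicodedata
from collections import Counter

# Readable labels for a few general-category codes (illustrative subset).
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "So": "Other Symbol",
    "Nd": "Decimal Number",
    "Sm": "Math Symbol",
}


def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# The five sample rows shown for this variable.
sample = ["(주)와이비엠", "(주)와이비엠", "동아출판(주)", "동아출판(주)", "(주)금성출판사"]
counts = category_counts(sample)
print(counts.most_common())
```

Hangul syllables fall under "Other Letter", while the parentheses around 주 are counted as open and close punctuation, which is why those three categories dominate this variable.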

Unique

Unique: 6
Unique (%): 0.4%

Sample

1st row: (주)와이비엠
2nd row: (주)와이비엠
3rd row: 동아출판(주)
4th row: 동아출판(주)
5th row: (주)금성출판사

Words

Value | Count | Frequency (%)
주)비상교육 | 115 | 7.2%
주)미래엔 | 108 | 6.8%
주)천재교과서 | 107 | 6.7%
동아출판(주 | 106 | 6.6%
주)천재교육 | 103 | 6.5%
주)금성출판사 | 98 | 6.1%
주)지학사 | 92 | 5.8%
주)와이비엠 | 90 | 5.6%
주)교학사 | 64 | 4.0%
사)한국검인정(서울교육청 | 39 | 2.4%
Other values (82) | 673 | 42.2%

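Most occurring characters

Value | Count | Frequency (%)
( | 1686 | 13.2%
) | 1686 | 13.2%
(not preserved) | 1177 | 9.2%
(not preserved) | 694 | 5.4%
(not preserved) | 680 | 5.3%
(not preserved) | 489 | 3.8%
(not preserved) | 292 | 2.3%
(not preserved) | 292 | 2.3%
(not preserved) | 271 | 2.1%
(not preserved) | 270 | 2.1%
Other values (131) | 5250 | 41.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 9370 | 73.3%
Open Punctuation | 1686 | 13.2%
Close Punctuation | 1686 | 13.2%
Space Separator | 27 | 0.2%
Other Punctuation | 9 | 0.1%
Other Symbol | 9 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 1177 | 12.6%
(not preserved) | 694 | 7.4%
(not preserved) | 680 | 7.3%
(not preserved) | 489 | 5.2%
(not preserved) | 292 | 3.1%
(not preserved) | 292 | 3.1%
(not preserved) | 271 | 2.9%
(not preserved) | 270 | 2.9%
(not preserved) | 265 | 2.8%
(not preserved) | 256 | 2.7%
Other values (126) | 4684 | 50.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1686 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1686 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 27 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 9 | 100.0%

Other Symbol
Value | Count | Frequency (%)
(not preserved) | 9 | 100.0%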

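Most occurring scripts

Value | Count | Frequency (%)
Hangul | 9379 | 73.3%
Common | 3408 | 26.7%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 1177 | 12.5%
(not preserved) | 694 | 7.4%
(not preserved) | 680 | 7.3%
(not preserved) | 489 | 5.2%
(not preserved) | 292 | 3.1%
(not preserved) | 292 | 3.1%
(not preserved) | 271 | 2.9%
(not preserved) | 270 | 2.9%
(not preserved) | 265 | 2.8%
(not preserved) | 256 | 2.7%
Other values (127) | 4693 | 50.0%

Common
Value | Count | Frequency (%)
( | 1686 | 49.5%
) | 1686 | 49.5%
(space) | 27 | 0.8%
, | 9 | 0.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 9370 | 73.3%
ASCII | 3408 | 26.7%
None | 9 | 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
( | 1686 | 49.5%
) | 1686 | 49.5%
(space) | 27 | 0.8%
, | 9 | 0.3%

Hangul
Value | Count | Frequency (%)
(not preserved) | 1177 | 12.6%
(not preserved) | 694 | 7.4%
(not preserved) | 680 | 7.3%
(not preserved) | 489 | 5.2%
(not preserved) | 292 | 3.1%
(not preserved) | 292 | 3.1%
(not preserved) | 271 | 2.9%
(not preserved) | 270 | 2.9%
(not preserved) | 265 | 2.8%
(not preserved) | 256 | 2.7%
Other values (126) | 4684 | 50.0%

None
Value | Count | Frequency (%)
(not preserved) | 9 | 100.0%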

도서명 (book title)
Text

Distinct: 441
Distinct (%): 28.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.4 KiB

Length

Max length: 19
Median length: 16
Mean length: 8.6269133
Min length: 2

Characters and Unicode

Total characters: 13527
Distinct characters: 242
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 294
Unique (%): 18.8%

Sample

1st row: 음악(3~4학년군) 3
2nd row: 음악(3~4학년군) 3~4
3rd row: 음악(3~4학년군) 3
4th row: 음악(3~4학년군) 3~4
5th row: 음악(3~4학년군) 3

Words

Value | Count | Frequency (%)
미술 | 75 | 2.7%
음악 | 72 | 2.6%
1 | 65 | 2.3%
영어 | 63 | 2.2%
국어 | 61 | 2.2%
①(15개정 | 58 | 2.1%
②(15개정 | 58 | 2.1%
(not preserved) | 56 | 2.0%
(not preserved) | 56 | 2.0%
6 | 52 | 1.9%
Other values (456) | 2189 | 78.0%

Most occurring characters

Value | Count | Frequency (%)
(space) | 1237 | 9.1%
) | 921 | 6.8%
( | 921 | 6.8%
5 | 886 | 6.5%
(not preserved) | 825 | 6.1%
1 | 803 | 5.9%
(not preserved) | 706 | 5.2%
(not preserved) | 417 | 3.1%
(not preserved) | 360 | 2.7%
~ | 244 | 1.8%
Other values (232) | 6207 | 45.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 7313 | 54.1%
Decimal Number | 2280 | 16.9%
Space Separator | 1237 | 9.1%
Close Punctuation | 921 | 6.8%
Open Punctuation | 921 | 6.8%
Other Number | 276 | 2.0%
Math Symbol | 244 | 1.8%
Letter Number | 196 | 1.4%
Other Punctuation | 103 | 0.8%
Dash Punctuation | 36 | 0.3%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 825 | 11.3%
(not preserved) | 706 | 9.7%
(not preserved) | 417 | 5.7%
(not preserved) | 360 | 4.9%
(not preserved) | 215 | 2.9%
(not preserved) | 215 | 2.9%
(not preserved) | 179 | 2.4%
(not preserved) | 164 | 2.2%
(not preserved) | 157 | 2.1%
(not preserved) | 145 | 2.0%
Other values (215) | 3930 | 53.7%

Decimal Number
Value | Count | Frequency (%)
5 | 886 | 38.9%
1 | 803 | 35.2%
6 | 184 | 8.1%
4 | 153 | 6.7%
3 | 153 | 6.7%
2 | 101 | 4.4%

Other Number
Value | Count | Frequency (%)
(not preserved) | 138 | 50.0%
(not preserved) | 138 | 50.0%

Letter Number
Value | Count | Frequency (%)
(not preserved) | 116 | 59.2%
(not preserved) | 80 | 40.8%

Other Punctuation
Value | Count | Frequency (%)
· | 79 | 76.7%
/ | 24 | 23.3%

Space Separator
Value | Count | Frequency (%)
(space) | 1237 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 921 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 921 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 244 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 36 | 100.0%

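Most occurring scripts

Value | Count | Frequency (%)
Hangul | 7313 | 54.1%
Common | 6018 | 44.5%
Latin | 196 | 1.4%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 825 | 11.3%
(not preserved) | 706 | 9.7%
(not preserved) | 417 | 5.7%
(not preserved) | 360 | 4.9%
(not preserved) | 215 | 2.9%
(not preserved) | 215 | 2.9%
(not preserved) | 179 | 2.4%
(not preserved) | 164 | 2.2%
(not preserved) | 157 | 2.1%
(not preserved) | 145 | 2.0%
Other values (215) | 3930 | 53.7%

Common
Value | Count | Frequency (%)
(space) | 1237 | 20.6%
) | 921 | 15.3%
( | 921 | 15.3%
5 | 886 | 14.7%
1 | 803 | 13.3%
~ | 244 | 4.1%
6 | 184 | 3.1%
4 | 153 | 2.5%
3 | 153 | 2.5%
(not preserved) | 138 | 2.3%
Other values (5) | 378 | 6.3%

Latin
Value | Count | Frequency (%)
(not preserved) | 116 | 59.2%
(not preserved) | 80 | 40.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 7313 | 54.1%
ASCII | 5663 | 41.9%
Enclosed Alphanum | 276 | 2.0%
Number Forms | 196 | 1.4%
None | 79 | 0.6%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 1237 | 21.8%
) | 921 | 16.3%
( | 921 | 16.3%
5 | 886 | 15.6%
1 | 803 | 14.2%
~ | 244 | 4.3%
6 | 184 | 3.2%
4 | 153 | 2.7%
3 | 153 | 2.7%
2 | 101 | 1.8%
Other values (2) | 60 | 1.1%

Hangul
Value | Count | Frequency (%)
(not preserved) | 825 | 11.3%
(not preserved) | 706 | 9.7%
(not preserved) | 417 | 5.7%
(not preserved) | 360 | 4.9%
(not preserved) | 215 | 2.9%
(not preserved) | 215 | 2.9%
(not preserved) | 179 | 2.4%
(not preserved) | 164 | 2.2%
(not preserved) | 157 | 2.1%
(not preserved) | 145 | 2.0%
Other values (215) | 3930 | 53.7%

Enclosed Alphanum
Value | Count | Frequency (%)
(not preserved) | 138 | 50.0%
(not preserved) | 138 | 50.0%

Number Forms
Value | Count | Frequency (%)
(not preserved) | 116 | 59.2%
(not preserved) | 80 | 40.8%

None
Value | Count | Frequency (%)
· | 79 | 100.0%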

대표 (representative)
Categorical

HIGH CORRELATION  IMBALANCE

Distinct: 14
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 12.4 KiB

Length

Max length: 4
Median length: 4
Mean length: 3.8405612
Min length: 1

Unique

Unique: 2
Unique (%): 0.1%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 1484 | 94.6%
(not preserved) | 15 | 1.0%
(not preserved) | 14 | 0.9%
(not preserved) | 14 | 0.9%
(not preserved) | 8 | 0.5%
(not preserved) | 6 | 0.4%
(not preserved) | 5 | 0.3%
(not preserved) | 4 | 0.3%
(not preserved) | 4 | 0.3%
(not preserved) | 4 | 0.3%
Other values (4) | 10 | 0.6%

권별 (volume)
Unsupported

MISSING  REJECTED  UNSUPPORTED

Missing: 1568
Missing (%): 100.0%
Memory size: 13.9 KiB

저자 (author)
Text

Distinct: 627
Distinct (%): 40.0%
Missing: 0
Missing (%): 0.0%
Memory size: 12.4 KiB

Length

Max length: 5
Median length: 3
Mean length: 2.9987245
Min length: 2

Characters and Unicode

Total characters: 4702
Distinct characters: 188
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 318
Unique (%): 20.3%

Sample

1st row: 홍종건
2nd row: 홍종건
3rd row: 석문주
4th row: 석문주
5th row: 김용희

Words

Value | Count | Frequency (%)
양종모 | 13 | 0.8%
주명덕 | 12 | 0.8%
안양옥 | 12 | 0.8%
김원경 | 12 | 0.8%
김진수 | 11 | 0.7%
이준열 | 11 | 0.7%
장기범 | 11 | 0.7%
황선욱 | 10 | 0.6%
류희찬 | 10 | 0.6%
이삼형 | 10 | 0.6%
Other values (617) | 1456 | 92.9%

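Most occurring characters

Value | Count | Frequency (%)
(not preserved) | 290 | 6.2%
(not preserved) | 263 | 5.6%
(not preserved) | 171 | 3.6%
(not preserved) | 164 | 3.5%
(not preserved) | 121 | 2.6%
(not preserved) | 112 | 2.4%
(not preserved) | 101 | 2.1%
(not preserved) | 95 | 2.0%
(not preserved) | 84 | 1.8%
(not preserved) | 74 | 1.6%
Other values (178) | 3227 | 68.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4702 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 290 | 6.2%
(not preserved) | 263 | 5.6%
(not preserved) | 171 | 3.6%
(not preserved) | 164 | 3.5%
(not preserved) | 121 | 2.6%
(not preserved) | 112 | 2.4%
(not preserved) | 101 | 2.1%
(not preserved) | 95 | 2.0%
(not preserved) | 84 | 1.8%
(not preserved) | 74 | 1.6%
Other values (178) | 3227 | 68.6%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4702 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 290 | 6.2%
(not preserved) | 263 | 5.6%
(not preserved) | 171 | 3.6%
(not preserved) | 164 | 3.5%
(not preserved) | 121 | 2.6%
(not preserved) | 112 | 2.4%
(not preserved) | 101 | 2.1%
(not preserved) | 95 | 2.0%
(not preserved) | 84 | 1.8%
(not preserved) | 74 | 1.6%
Other values (178) | 3227 | 68.6%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 4702 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(not preserved) | 290 | 6.2%
(not preserved) | 263 | 5.6%
(not preserved) | 171 | 3.6%
(not preserved) | 164 | 3.5%
(not preserved) | 121 | 2.6%
(not preserved) | 112 | 2.4%
(not preserved) | 101 | 2.1%
(not preserved) | 95 | 2.0%
(not preserved) | 84 | 1.8%
(not preserved) | 74 | 1.6%
Other values (178) | 3227 | 68.6%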

개정구분 (curriculum revision)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.4 KiB

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2015개정
2nd row: 2015개정
3rd row: 2015개정
4th row: 2015개정
5th row: 2015개정

Common Values

Value | Count | Frequency (%)
2015개정 | 1568 | 100.0%

시작년도 (start year)
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 12.4 KiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2018
2nd row: 2018
3rd row: 2018
4th row: 2018
5th row: 2018

Common Values

Value | Count | Frequency (%)
2018 | 1262 | 80.5%
2019 | 306 | 19.5%

기관 (institution)
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 12.4 KiB

Length

Max length: 8
Median length: 4
Mean length: 5.4502551
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 교육과정평가원
2nd row: 교육과정평가원
3rd row: 교육과정평가원
4th row: 교육과정평가원
5th row: 교육과정평가원

Common Values

Value | Count | Frequency (%)
<NA> | 868 | 55.4%
교육과정평가원 | 526 | 33.5%
한국과학창의재단 | 174 | 11.1%

Correlations

 | 검·인정구분 | 교지명 | 학교급명 | 출판사 | 대표 | 시작년도 | 기관
검·인정구분 | 1.000 | 0.160 | 0.278 | 0.798 | NaN | 0.690 | NaN
교지명 | 0.160 | 1.000 | 0.340 | 0.244 | 0.000 | 0.106 | 0.214
학교급명 | 0.278 | 0.340 | 1.000 | 0.690 | 0.803 | 0.218 | 0.266
출판사 | 0.798 | 0.244 | 0.690 | 1.000 | 0.919 | 0.388 | 0.275
대표 | NaN | 0.000 | 0.803 | 0.919 | 1.000 | 0.000 | 0.726
시작년도 | 0.690 | 0.106 | 0.218 | 0.388 | 0.000 | 1.000 | 0.110
기관 | NaN | 0.214 | 0.266 | 0.275 | 0.726 | 0.110 | 1.000

 | 대표 | 학교급명 | 교지명 | 검·인정구분 | 기관 | 시작년도
대표 | 1.000 | 0.618 | 0.000 | 1.000 | 0.643 | 0.000
학교급명 | 0.618 | 1.000 | 0.544 | 0.450 | 0.432 | 0.356
교지명 | 0.000 | 0.544 | 1.000 | 0.102 | 0.137 | 0.068
검·인정구분 | 1.000 | 0.450 | 0.102 | 1.000 | 1.000 | 0.485
기관 | 0.643 | 0.432 | 0.137 | 1.000 | 1.000 | 0.070
시작년도 | 0.000 | 0.356 | 0.068 | 0.485 | 0.070 | 1.000

 | 검·인정구분 | 교지명 | 학교급명 | 대표 | 시작년도 | 기관
검·인정구분 | 1.000 | 0.102 | 0.450 | 1.000 | 0.485 | 1.000
교지명 | 0.102 | 1.000 | 0.544 | 0.000 | 0.068 | 0.137
학교급명 | 0.450 | 0.544 | 1.000 | 0.618 | 0.356 | 0.432
대표 | 1.000 | 0.000 | 0.618 | 1.000 | 0.000 | 0.643
시작년도 | 0.485 | 0.068 | 0.356 | 0.000 | 1.000 | 0.070
기관 | 1.000 | 0.137 | 0.432 | 0.643 | 0.070 | 1.000
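
The "High correlation" alerts in the Overview can be reproduced from the last matrix above by flagging every pair whose coefficient reaches a cutoff. The sketch below assumes a cutoff of 0.5; the report does not state the threshold, so treat that value as an assumption that happens to match the alerts. The helper name `highly_correlated` is hypothetical.

```python
# Correlation values copied from the last matrix above.
columns = ["검·인정구분", "교지명", "학교급명", "대표", "시작년도", "기관"]
matrix = [
    [1.000, 0.102, 0.450, 1.000, 0.485, 1.000],
    [0.102, 1.000, 0.544, 0.000, 0.068, 0.137],
    [0.450, 0.544, 1.000, 0.618, 0.356, 0.432],
    [1.000, 0.000, 0.618, 1.000, 0.000, 0.643],
    [0.485, 0.068, 0.356, 0.000, 1.000, 0.070],
    [1.000, 0.137, 0.432, 0.643, 0.070, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, not stated in the report


def highly_correlated(columns, matrix, threshold):
    """Map each column to the other columns at or above the threshold."""
    result = {}
    for i, name in enumerate(columns):
        partners = [
            columns[j]
            for j, value in enumerate(matrix[i])
            if j != i and abs(value) >= threshold
        ]
        if partners:
            result[name] = partners
    return result


flagged = highly_correlated(columns, matrix, THRESHOLD)
for name, partners in flagged.items():
    print(name, "->", ", ".join(partners))
```

With this cutoff, 시작년도 is the only column left unflagged (its largest off-diagonal value is 0.485), and 대표 is paired with three other fields, in line with the alert text.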

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

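First rows

 | 검·인정구분 | 교지명 | 학교급명 | 출판사 | 도서명 | 대표 | 권별 | 저자 | 개정구분 | 시작년도 | 기관
0 | 검정 | 교과서 | 초등학교 | (주)와이비엠 | 음악(3~4학년군) 3 | <NA> | <NA> | 홍종건 | 2015개정 | 2018 | 교육과정평가원
1 | 검정 | 지도서 | 초등학교 | (주)와이비엠 | 음악(3~4학년군) 3~4 | <NA> | <NA> | 홍종건 | 2015개정 | 2018 | 교육과정평가원
2 | 검정 | 교과서 | 초등학교 | 동아출판(주) | 음악(3~4학년군) 3 | <NA> | <NA> | 석문주 | 2015개정 | 2018 | 교육과정평가원
3 | 검정 | 지도서 | 초등학교 | 동아출판(주) | 음악(3~4학년군) 3~4 | <NA> | <NA> | 석문주 | 2015개정 | 2018 | 교육과정평가원
4 | 검정 | 교과서 | 초등학교 | (주)금성출판사 | 음악(3~4학년군) 3 | <NA> | <NA> | 김용희 | 2015개정 | 2018 | 교육과정평가원
5 | 검정 | 지도서 | 초등학교 | (주)금성출판사 | 음악(3~4학년군) 3~4 | <NA> | <NA> | 김용희 | 2015개정 | 2018 | 교육과정평가원
6 | 검정 | 교과서 | 초등학교 | (주)지학사 | 음악(3~4학년군) 3 | <NA> | <NA> | 허정미 | 2015개정 | 2018 | 교육과정평가원
7 | 검정 | 지도서 | 초등학교 | (주)지학사 | 음악(3~4학년군) 3~4 | <NA> | <NA> | 허정미 | 2015개정 | 2018 | 교육과정평가원
8 | 검정 | 교과서 | 초등학교 | (주)음악과생활 | 음악(3~4학년군) 3 | <NA> | <NA> | 권태욱 | 2015개정 | 2018 | 교육과정평가원
9 | 검정 | 지도서 | 초등학교 | (주)음악과생활 | 음악(3~4학년군) 3~4 | <NA> | <NA> | 권태욱 | 2015개정 | 2018 | 교육과정평가원

Last rows

 | 검·인정구분 | 교지명 | 학교급명 | 출판사 | 도서명 | 대표 | 권별 | 저자 | 개정구분 | 시작년도 | 기관
1558 | 인정 | 교과서 | 고등학교 | (사)한국검인정(부산교육청) | 해사 영어(15개정) | <NA> | <NA> | 채종주 | 2015개정 | 2018 | <NA>
1559 | 인정 | 교과서 | 고등학교 | (사)한국검인정(인천교육청) | 항해사 직무 | <NA> | <NA> | 박용선 | 2015개정 | 2018 | <NA>
1560 | 인정 | 교과서 | 고등학교 | (사)한국검인정(인천교육청) | 해운 일반 | <NA> | <NA> | 홍성화 | 2015개정 | 2018 | <NA>
1561 | 인정 | 교과서 | 고등학교 | (사)한국검인정(인천교육청) | 열기관(15개정) | <NA> | <NA> | 이창용 | 2015개정 | 2018 | <NA>
1562 | 인정 | 교과서 | 고등학교 | (사)한국검인정(인천교육청) | 선박 보조 기계(15개정) | <NA> | <NA> | 박주성 | 2015개정 | 2018 | <NA>
1563 | 인정 | 교과서 | 고등학교 | (사)한국검인정(부산교육청) | 선박 전기·전자(15개정) | <NA> | <NA> | 소명옥 | 2015개정 | 2018 | <NA>
1564 | 인정 | 교과서 | 고등학교 | (사)한국검인정(인천교육청) | 기관 실무 기초 | <NA> | <NA> | 박종운 | 2015개정 | 2018 | <NA>
1565 | 인정 | 교과서 | 고등학교 | (사)한국검인정(인천교육청) | 기관 직무 일반 | <NA> | <NA> | 정태영 | 2015개정 | 2018 | <NA>
1566 | 인정 | 교과서 | 중학교 | (주)금성출판사 | 두런두런 컴퓨팅 | <NA> | <NA> | 김영일 | 2015개정 | 2019 | <NA>
1567 | 인정 | 교과서 | 중학교 | (주)금성출판사 | 진로체험과 포트폴리오 | <NA> | <NA> | 민창기 | 2015개정 | 2019 | <NA>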