Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 10062
Missing cells (%): 14.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 654.3 KiB
Average record size in memory: 67.0 B
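
The two derived figures in this table can be reproduced from the raw counts. The sketch below is plain Python and rests on two assumptions that the report does not state: that "Missing cells (%)" is the number of missing cells divided by variables × observations, and that "Average record size" is the total memory divided by the number of observations. The variable names are illustrative only.

```python
# Values copied from the Dataset statistics table.
n_variables = 7
n_observations = 10_000
missing_cells = 10_062
total_size_kib = 654.3

# Assumed definition: share of all cells (rows x columns) that are missing.
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Assumed definition: total in-memory size spread evenly over the rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Under these assumptions both values agree with the table after rounding to one decimal place.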

Variable types

Numeric: 2
Categorical: 1
Text: 3
Unsupported: 1

Dataset

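Description: Data published by the Korea Housing Finance Corporation (한국주택금융공사). It contains the columns 순서 (sequence number), 자료유형 (material type), 등록번호 (registration number), 서명 (title), 저자 (author), 출판사 (publisher), and 출판년도 (publication year), together with their values.
Author: Korea Housing Finance Corporation (한국주택금융공사)
URL: https://www.data.go.kr/data/3071759/fileData.do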

Alerts

순서 is highly overall correlated with 등록번호 [High correlation]
등록번호 is highly overall correlated with 순서 [High correlation]
자료유형 is highly imbalanced (60.8%) [Imbalance]
Unnamed: 6 has 10000 (100.0%) missing values [Missing]
순서 has unique values [Unique]
등록번호 has unique values [Unique]
Unnamed: 6 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]

Reproduction

Analysis started: 2023-12-12 18:15:43.294959
Analysis finished: 2023-12-12 18:15:46.105634
Duration: 2.81 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
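
The reported duration follows directly from the two timestamps. A minimal check using only the Python standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-12 18:15:43.294959")
finished = datetime.fromisoformat("2023-12-12 18:15:46.105634")

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()

print(f"Duration: {duration_seconds:.2f} seconds")
```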

Variables

순서
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13582.494
Minimum: 2
Maximum: 27353
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2
5th percentile: 1377.95
Q1: 6703.5
Median: 13468.5
Q3: 20226.5
95th percentile: 26080.05
Maximum: 27353
Range: 27351
Interquartile range (IQR): 13523

Descriptive statistics

Standard deviation: 7933.8622
Coefficient of variation (CV): 0.5841241
Kurtosis: -1.1998084
Mean: 13582.494
Median Absolute Deviation (MAD): 6763.5
Skewness: 0.028562979
Sum: 1.3582494 × 10^8
Variance: 62946170
Monotonicity: Not monotonic
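
Several of the 순서 statistics are simple functions of the others, which gives a quick consistency check on the report. The sketch below uses the textbook definitions (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared, sum = mean × n); that the report uses exactly these definitions is an assumption, and the variable names are illustrative.

```python
# Values copied from the 순서 statistics.
n_observations = 10_000
minimum, maximum = 2, 27353
q1, q3 = 6703.5, 20226.5
mean = 13582.494
std = 7933.8622

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation
variance = std ** 2               # Variance from the standard deviation
total = mean * n_observations     # Sum from the mean

print(value_range, iqr, round(cv, 7), round(variance), total)
```

The results agree with the reported range, IQR, CV, variance, and sum up to the rounding of the printed inputs.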
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
6473 | 1 | < 0.1%
18265 | 1 | < 0.1%
19115 | 1 | < 0.1%
17138 | 1 | < 0.1%
27302 | 1 | < 0.1%
6127 | 1 | < 0.1%
13426 | 1 | < 0.1%
10768 | 1 | < 0.1%
24508 | 1 | < 0.1%
25584 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Value | Count | Frequency (%)
2 | 1 | < 0.1%
9 | 1 | < 0.1%
12 | 1 | < 0.1%
13 | 1 | < 0.1%
16 | 1 | < 0.1%
18 | 1 | < 0.1%
21 | 1 | < 0.1%
22 | 1 | < 0.1%
23 | 1 | < 0.1%
26 | 1 | < 0.1%

Value | Count | Frequency (%)
27353 | 1 | < 0.1%
27352 | 1 | < 0.1%
27343 | 1 | < 0.1%
27342 | 1 | < 0.1%
27341 | 1 | < 0.1%
27337 | 1 | < 0.1%
27335 | 1 | < 0.1%
27328 | 1 | < 0.1%
27326 | 1 | < 0.1%
27315 | 1 | < 0.1%

자료유형
Categorical

IMBALANCE 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
국내서: 7852
연구보고서: 1495
국외서: 417
(비)일반: 218
학위논문: 17

Length

Max length: 8
Median length: 3
Mean length: 3.3448
Min length: 3
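
The length statistics can be cross-checked against the category counts listed under Common Values for 자료유형. The sketch below assumes that length is measured in characters (Unicode code points, as Python's `len` does); the variable names are illustrative.

```python
from statistics import median

# Category counts copied from the Common Values table of 자료유형.
counts = {
    "국내서": 7852,
    "연구보고서": 1495,
    "국외서": 417,
    "(비)일반": 218,
    "학위논문": 17,
    "(연)국내연간물": 1,
}

# One length per observation (10000 values in total).
lengths = [len(value) for value, n in counts.items() for _ in range(n)]

max_length = max(lengths)
min_length = min(lengths)
median_length = median(lengths)
mean_length = sum(lengths) / len(lengths)

print(max_length, median_length, round(mean_length, 4), min_length)
```

Under that assumption the four values agree with the Length table for this variable.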

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 국내서
2nd row: 국내서
3rd row: 연구보고서
4th row: 연구보고서
5th row: 국내서

Common Values

Value | Count | Frequency (%)
국내서 | 7852 | 78.5%
연구보고서 | 1495 | 14.9%
국외서 | 417 | 4.2%
(비)일반 | 218 | 2.2%
학위논문 | 17 | 0.2%
(연)국내연간물 | 1 | < 0.1%
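
The 60.8% imbalance figure in the Alerts section can be reproduced from these counts with an entropy-based score: one minus the Shannon entropy of the class distribution divided by its maximum possible value, log2 of the number of classes. Whether ydata-profiling computes the score in exactly this way is an assumption here; the sketch only shows that this definition matches the reported number.

```python
from math import log2

# Category counts copied from the Common Values table of 자료유형.
counts = [7852, 1495, 417, 218, 17, 1]

total = sum(counts)
probabilities = [c / total for c in counts]

# Shannon entropy of the observed class distribution, in bits.
entropy_bits = -sum(p * log2(p) for p in probabilities)

# 0 means perfectly balanced classes, 1 means a single class dominates completely.
imbalance = 1 - entropy_bits / log2(len(counts))

print(f"Imbalance: {imbalance:.1%}")
```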

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
국내서 | 7852 | 78.5%
연구보고서 | 1495 | 14.9%
국외서 | 417 | 4.2%
(비)일반 | 218 | 2.2%
학위논문 | 17 | 0.2%
(연)국내연간물 | 1 | < 0.1%

등록번호
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14291.106
Minimum: 2
Maximum: 28274
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2
5th percentile: 1574.95
Q1: 7342.5
Median: 14281.5
Q3: 21101.5
95th percentile: 26997.05
Maximum: 28274
Range: 28272
Interquartile range (IQR): 13759

Descriptive statistics

Standard deviation: 8144.6942
Coefficient of variation (CV): 0.5699135
Kurtosis: -1.1843254
Mean: 14291.106
Median Absolute Deviation (MAD): 6888
Skewness: -0.0069491107
Sum: 1.4291106 × 10^8
Variance: 66336044
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
7110 | 1 | < 0.1%
19129 | 1 | < 0.1%
19987 | 1 | < 0.1%
17995 | 1 | < 0.1%
28223 | 1 | < 0.1%
6713 | 1 | < 0.1%
14239 | 1 | < 0.1%
11531 | 1 | < 0.1%
25416 | 1 | < 0.1%
26498 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Value | Count | Frequency (%)
2 | 1 | < 0.1%
9 | 1 | < 0.1%
12 | 1 | < 0.1%
13 | 1 | < 0.1%
16 | 1 | < 0.1%
18 | 1 | < 0.1%
21 | 1 | < 0.1%
22 | 1 | < 0.1%
23 | 1 | < 0.1%
26 | 1 | < 0.1%

Value | Count | Frequency (%)
28274 | 1 | < 0.1%
28273 | 1 | < 0.1%
28264 | 1 | < 0.1%
28263 | 1 | < 0.1%
28262 | 1 | < 0.1%
28258 | 1 | < 0.1%
28256 | 1 | < 0.1%
28249 | 1 | < 0.1%
28247 | 1 | < 0.1%
28236 | 1 | < 0.1%

서명
Text

Distinct: 9203
Distinct (%): 92.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 226
Median length: 138
Mean length: 24.4648
Min length: 1

Characters and Unicode

Total characters: 244648
Distinct characters: 1571
Distinct categories: 15
Distinct scripts: 6
Distinct blocks: 13
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
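
As an illustration of such character properties, the standard-library `unicodedata` module exposes the general category of each code point. The sketch below counts categories for one title taken from the Sample rows of 서명; the mapping from two-letter codes to the long names used in this report covers only the codes that occur in this string, and the variable names are illustrative.

```python
import unicodedata
from collections import Counter

# 4th sample row of the 서명 column.
title = "(2007 통신연수)금융경제.1"

# Two-letter Unicode general category codes and the long names used in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
}

# Count how many characters of the title fall into each general category.
category_counts = Counter(unicodedata.category(ch) for ch in title)

for code, count in category_counts.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
```

Script and block properties are not available from `unicodedata`, so reproducing those tables would need a different data source.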

Unique

Unique: 8783
Unique (%): 87.8%
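
The report lists both a Distinct count (9203) and a Unique count (8783) for this variable. A reading consistent with those numbers is that "distinct" counts different values, while "unique" counts values that occur exactly once; this interpretation is an assumption, and the small list below is hypothetical data used only to show the difference.

```python
from collections import Counter

# Hypothetical titles: "B" appears twice and "C" three times.
titles = ["A", "B", "B", "C", "C", "C", "D"]

frequency = Counter(titles)

# Number of different values.
n_distinct = len(frequency)

# Number of values that occur exactly once.
n_unique = sum(1 for count in frequency.values() if count == 1)

print(n_distinct, n_unique)
```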

Sample

1st row: 대제국 고구려.1:광개토대제비의 위용
2nd row: (온쪽이 하예린의)내가 만난 파리
3rd row: 맥쿼리 그룹(Macquarie Group)의 성장과정과 전략적 시사점
4th row: (2007 통신연수)금융경제.1
5th row: (기발한 시골 양반 라 만차의)돈끼호떼.1
ValueCountFrequency (%)
the 405
 
0.9%
of 380
 
0.8%
366
 
0.8%
위한 338
 
0.7%
연구 294
 
0.6%
and 286
 
0.6%
장편소설 176
 
0.4%
167
 
0.4%
관한 160
 
0.3%
for 157
 
0.3%
Other values (21504) 44254
94.2%

Most occurring characters

ValueCountFrequency (%)
36996
 
15.1%
e 5608
 
2.3%
n 4546
 
1.9%
a 4402
 
1.8%
i 4173
 
1.7%
o 4040
 
1.7%
3999
 
1.6%
t 3851
 
1.6%
: 3583
 
1.5%
r 3214
 
1.3%
Other values (1561) 170236
69.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 129176 | 52.8%
Lowercase Letter | 47747 | 19.5%
Space Separator | 36996 | 15.1%
Uppercase Letter | 9652 | 3.9%
Decimal Number | 7522 | 3.1%
Other Punctuation | 6692 | 2.7%
Open Punctuation | 2677 | 1.1%
Close Punctuation | 2675 | 1.1%
Math Symbol | 982 | 0.4%
Dash Punctuation | 417 | 0.2%
Other values (5) | 112 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3999
 
3.1%
2293
 
1.8%
2256
 
1.7%
2245
 
1.7%
1981
 
1.5%
1834
 
1.4%
1781
 
1.4%
1555
 
1.2%
1542
 
1.2%
1461
 
1.1%
Other values (1449) 108229
83.8%
Lowercase Letter
ValueCountFrequency (%)
e 5608
11.7%
n 4546
9.5%
a 4402
 
9.2%
i 4173
 
8.7%
o 4040
 
8.5%
t 3851
 
8.1%
r 3214
 
6.7%
s 2879
 
6.0%
c 2040
 
4.3%
l 1874
 
3.9%
Other values (16) 11120
23.3%
Uppercase Letter
ValueCountFrequency (%)
S 899
 
9.3%
T 786
 
8.1%
E 762
 
7.9%
A 753
 
7.8%
C 745
 
7.7%
P 569
 
5.9%
I 562
 
5.8%
M 533
 
5.5%
D 492
 
5.1%
R 474
 
4.9%
Other values (16) 3077
31.9%
Other Punctuation
ValueCountFrequency (%)
: 3583
53.5%
. 1262
 
18.9%
, 1005
 
15.0%
· 198
 
3.0%
/ 136
 
2.0%
& 104
 
1.6%
' 102
 
1.5%
; 73
 
1.1%
! 72
 
1.1%
54
 
0.8%
Other values (10) 103
 
1.5%
Decimal Number
ValueCountFrequency (%)
0 2177
28.9%
2 1593
21.2%
1 1575
20.9%
3 503
 
6.7%
5 364
 
4.8%
4 291
 
3.9%
9 286
 
3.8%
7 276
 
3.7%
6 243
 
3.2%
8 214
 
2.8%
Math Symbol
ValueCountFrequency (%)
= 905
92.2%
+ 40
 
4.1%
~ 27
 
2.7%
> 4
 
0.4%
< 4
 
0.4%
2
 
0.2%
Letter Number
ValueCountFrequency (%)
40
40.8%
27
27.6%
13
 
13.3%
11
 
11.2%
5
 
5.1%
2
 
2.0%
Open Punctuation
ValueCountFrequency (%)
( 2643
98.7%
[ 30
 
1.1%
2
 
0.1%
1
 
< 0.1%
1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 2641
98.7%
] 30
 
1.1%
2
 
0.1%
1
 
< 0.1%
1
 
< 0.1%
Other Symbol
ValueCountFrequency (%)
6
85.7%
1
 
14.3%
Modifier Symbol
ValueCountFrequency (%)
˚ 2
66.7%
˙ 1
33.3%
Space Separator
ValueCountFrequency (%)
36996
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 417
100.0%
Initial Punctuation
ValueCountFrequency (%)
2
100.0%
Final Punctuation
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 127583 | 52.1%
Common | 57969 | 23.7%
Latin | 57497 | 23.5%
Han | 1577 | 0.6%
Hiragana | 15 | < 0.1%
Katakana | 7 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3999
 
3.1%
2293
 
1.8%
2256
 
1.8%
2245
 
1.8%
1981
 
1.6%
1834
 
1.4%
1781
 
1.4%
1555
 
1.2%
1542
 
1.2%
1461
 
1.1%
Other values (1094) 106636
83.6%
Han
ValueCountFrequency (%)
119
 
7.5%
50
 
3.2%
37
 
2.3%
36
 
2.3%
35
 
2.2%
25
 
1.6%
23
 
1.5%
22
 
1.4%
20
 
1.3%
19
 
1.2%
Other values (329) 1191
75.5%
Latin
ValueCountFrequency (%)
e 5608
 
9.8%
n 4546
 
7.9%
a 4402
 
7.7%
i 4173
 
7.3%
o 4040
 
7.0%
t 3851
 
6.7%
r 3214
 
5.6%
s 2879
 
5.0%
c 2040
 
3.5%
l 1874
 
3.3%
Other values (48) 20870
36.3%
Common
ValueCountFrequency (%)
36996
63.8%
: 3583
 
6.2%
( 2643
 
4.6%
) 2641
 
4.6%
0 2177
 
3.8%
2 1593
 
2.7%
1 1575
 
2.7%
. 1262
 
2.2%
, 1005
 
1.7%
= 905
 
1.6%
Other values (43) 3589
 
6.2%
Hiragana
ValueCountFrequency (%)
3
20.0%
2
13.3%
2
13.3%
2
13.3%
1
 
6.7%
1
 
6.7%
1
 
6.7%
1
 
6.7%
1
 
6.7%
1
 
6.7%
Katakana
ValueCountFrequency (%)
1
14.3%
1
14.3%
1
14.3%
1
14.3%
1
14.3%
1
14.3%
1
14.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 127548
52.1%
ASCII 115046
47.0%
CJK 1515
 
0.6%
None 317
 
0.1%
Number Forms 98
 
< 0.1%
CJK Compat Ideographs 62
 
< 0.1%
Compat Jamo 29
 
< 0.1%
Hiragana 15
 
< 0.1%
Katakana 7
 
< 0.1%
Punctuation 5
 
< 0.1%
Other values (3) 6
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
36996
32.2%
e 5608
 
4.9%
n 4546
 
4.0%
a 4402
 
3.8%
i 4173
 
3.6%
o 4040
 
3.5%
t 3851
 
3.3%
: 3583
 
3.1%
r 3214
 
2.8%
s 2879
 
2.5%
Other values (75) 41754
36.3%
Hangul
ValueCountFrequency (%)
3999
 
3.1%
2293
 
1.8%
2256
 
1.8%
2245
 
1.8%
1981
 
1.6%
1834
 
1.4%
1781
 
1.4%
1555
 
1.2%
1542
 
1.2%
1461
 
1.1%
Other values (1092) 106601
83.6%
None
ValueCountFrequency (%)
· 198
62.5%
54
 
17.0%
34
 
10.7%
11
 
3.5%
6
 
1.9%
4
 
1.3%
2
 
0.6%
2
 
0.6%
1
 
0.3%
1
 
0.3%
Other values (4) 4
 
1.3%
CJK
ValueCountFrequency (%)
119
 
7.9%
50
 
3.3%
37
 
2.4%
36
 
2.4%
35
 
2.3%
25
 
1.7%
23
 
1.5%
22
 
1.5%
20
 
1.3%
19
 
1.3%
Other values (318) 1129
74.5%
Number Forms
ValueCountFrequency (%)
40
40.8%
27
27.6%
13
 
13.3%
11
 
11.2%
5
 
5.1%
2
 
2.0%
Compat Jamo
ValueCountFrequency (%)
29
100.0%
CJK Compat Ideographs
ValueCountFrequency (%)
17
27.4%
13
21.0%
13
21.0%
8
12.9%
4
 
6.5%
2
 
3.2%
1
 
1.6%
1
 
1.6%
1
 
1.6%
1
 
1.6%
Hiragana
ValueCountFrequency (%)
3
20.0%
2
13.3%
2
13.3%
2
13.3%
1
 
6.7%
1
 
6.7%
1
 
6.7%
1
 
6.7%
1
 
6.7%
1
 
6.7%
Math Operators
ValueCountFrequency (%)
2
100.0%
Modifier Letters
ValueCountFrequency (%)
˚ 2
66.7%
˙ 1
33.3%
Punctuation
ValueCountFrequency (%)
2
40.0%
2
40.0%
1
20.0%
Katakana
ValueCountFrequency (%)
1
14.3%
1
14.3%
1
14.3%
1
14.3%
1
14.3%
1
14.3%
1
14.3%
Misc Symbols
ValueCountFrequency (%)
1
100.0%

저자
Text

Distinct: 6318
Distinct (%): 63.4%
Missing: 32
Missing (%): 0.3%
Memory size: 156.2 KiB

Length

Max length: 72
Median length: 68
Mean length: 7.1175762
Min length: 2

Characters and Unicode

Total characters: 70948
Distinct characters: 981
Distinct categories: 11
Distinct scripts: 6
Distinct blocks: 8
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5135
Unique (%): 51.5%

Sample

1st row: 유현종
2nd row: 최하예린
3rd row: 한국금융연구원
4th row: 한국금융연수원
5th row: 세르반떼스, 미겔 데
ValueCountFrequency (%)
대외경제정책연구원 216
 
1.6%
한국금융연수원 162
 
1.2%
한국금융연구원 152
 
1.1%
한국은행 101
 
0.7%
법제처 69
 
0.5%
삼성경제연구소 60
 
0.4%
j 58
 
0.4%
한국경제연구원 55
 
0.4%
edited 54
 
0.4%
지음 46
 
0.3%
Other values (7531) 12890
93.0%

Most occurring characters

ValueCountFrequency (%)
3898
 
5.5%
, 3785
 
5.3%
a 2167
 
3.1%
e 2136
 
3.0%
n 1752
 
2.5%
r 1563
 
2.2%
i 1495
 
2.1%
o 1435
 
2.0%
1291
 
1.8%
1279
 
1.8%
Other values (971) 50147
70.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 40055 | 56.5%
Lowercase Letter | 19113 | 26.9%
Other Punctuation | 4466 | 6.3%
Space Separator | 3898 | 5.5%
Uppercase Letter | 3125 | 4.4%
Close Punctuation | 111 | 0.2%
Open Punctuation | 97 | 0.1%
Dash Punctuation | 46 | 0.1%
Decimal Number | 20 | < 0.1%
Math Symbol | 16 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1291
 
3.2%
1279
 
3.2%
1201
 
3.0%
1097
 
2.7%
1083
 
2.7%
1006
 
2.5%
984
 
2.5%
963
 
2.4%
734
 
1.8%
623
 
1.6%
Other values (893) 29794
74.4%
Lowercase Letter
ValueCountFrequency (%)
a 2167
11.3%
e 2136
11.2%
n 1752
 
9.2%
r 1563
 
8.2%
i 1495
 
7.8%
o 1435
 
7.5%
l 1006
 
5.3%
s 946
 
4.9%
t 871
 
4.6%
h 721
 
3.8%
Other values (17) 5021
26.3%
Uppercase Letter
ValueCountFrequency (%)
S 272
 
8.7%
C 261
 
8.4%
M 218
 
7.0%
B 210
 
6.7%
K 205
 
6.6%
R 191
 
6.1%
J 188
 
6.0%
H 171
 
5.5%
A 162
 
5.2%
D 153
 
4.9%
Other values (17) 1094
35.0%
Other Punctuation
ValueCountFrequency (%)
, 3785
84.8%
. 657
 
14.7%
& 10
 
0.2%
5
 
0.1%
; 3
 
0.1%
: 3
 
0.1%
· 2
 
< 0.1%
1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
0 7
35.0%
6 6
30.0%
1 3
15.0%
2 2
 
10.0%
4 1
 
5.0%
5 1
 
5.0%
Math Symbol
ValueCountFrequency (%)
= 8
50.0%
> 5
31.2%
< 3
 
18.8%
Close Punctuation
ValueCountFrequency (%)
] 97
87.4%
) 14
 
12.6%
Open Punctuation
ValueCountFrequency (%)
[ 83
85.6%
( 14
 
14.4%
Space Separator
ValueCountFrequency (%)
3898
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 46
100.0%
Other Symbol
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 39543 | 55.7%
Latin | 22238 | 31.3%
Common | 8655 | 12.2%
Han | 493 | 0.7%
Katakana | 12 | < 0.1%
Hiragana | 7 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1291
 
3.3%
1279
 
3.2%
1201
 
3.0%
1097
 
2.8%
1083
 
2.7%
1006
 
2.5%
984
 
2.5%
963
 
2.4%
734
 
1.9%
623
 
1.6%
Other values (678) 29282
74.1%
Han
ValueCountFrequency (%)
18
 
3.7%
15
 
3.0%
15
 
3.0%
14
 
2.8%
13
 
2.6%
13
 
2.6%
11
 
2.2%
9
 
1.8%
9
 
1.8%
8
 
1.6%
Other values (187) 368
74.6%
Latin
ValueCountFrequency (%)
a 2167
 
9.7%
e 2136
 
9.6%
n 1752
 
7.9%
r 1563
 
7.0%
i 1495
 
6.7%
o 1435
 
6.5%
l 1006
 
4.5%
s 946
 
4.3%
t 871
 
3.9%
h 721
 
3.2%
Other values (44) 8146
36.6%
Common
ValueCountFrequency (%)
3898
45.0%
, 3785
43.7%
. 657
 
7.6%
] 97
 
1.1%
[ 83
 
1.0%
- 46
 
0.5%
( 14
 
0.2%
) 14
 
0.2%
& 10
 
0.1%
= 8
 
0.1%
Other values (14) 43
 
0.5%
Katakana
ValueCountFrequency (%)
2
16.7%
1
8.3%
1
8.3%
1
8.3%
1
8.3%
1
8.3%
1
8.3%
1
8.3%
1
8.3%
1
8.3%
Hiragana
ValueCountFrequency (%)
1
14.3%
1
14.3%
1
14.3%
1
14.3%
1
14.3%
1
14.3%
1
14.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 39543
55.7%
ASCII 30882
43.5%
CJK 467
 
0.7%
CJK Compat Ideographs 26
 
< 0.1%
Katakana 12
 
< 0.1%
None 10
 
< 0.1%
Hiragana 7
 
< 0.1%
Enclosed Alphanum 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3898
 
12.6%
, 3785
 
12.3%
a 2167
 
7.0%
e 2136
 
6.9%
n 1752
 
5.7%
r 1563
 
5.1%
i 1495
 
4.8%
o 1435
 
4.6%
l 1006
 
3.3%
s 946
 
3.1%
Other values (62) 10699
34.6%
Hangul
ValueCountFrequency (%)
1291
 
3.3%
1279
 
3.2%
1201
 
3.0%
1097
 
2.8%
1083
 
2.7%
1006
 
2.5%
984
 
2.5%
963
 
2.4%
734
 
1.9%
623
 
1.6%
Other values (678) 29282
74.1%
CJK
ValueCountFrequency (%)
18
 
3.9%
15
 
3.2%
15
 
3.2%
14
 
3.0%
13
 
2.8%
13
 
2.8%
11
 
2.4%
9
 
1.9%
8
 
1.7%
8
 
1.7%
Other values (178) 343
73.4%
CJK Compat Ideographs
ValueCountFrequency (%)
9
34.6%
8
30.8%
2
 
7.7%
2
 
7.7%
1
 
3.8%
1
 
3.8%
1
 
3.8%
1
 
3.8%
1
 
3.8%
None
ValueCountFrequency (%)
5
50.0%
· 2
 
20.0%
æ 1
 
10.0%
1
 
10.0%
Ø 1
 
10.0%
Katakana
ValueCountFrequency (%)
2
16.7%
1
8.3%
1
8.3%
1
8.3%
1
8.3%
1
8.3%
1
8.3%
1
8.3%
1
8.3%
1
8.3%
Enclosed Alphanum
ValueCountFrequency (%)
1
100.0%
Hiragana
ValueCountFrequency (%)
1
14.3%
1
14.3%
1
14.3%
1
14.3%
1
14.3%
1
14.3%
1
14.3%

출판사
Text

Distinct: 2247
Distinct (%): 22.5%
Missing: 30
Missing (%): 0.3%
Memory size: 156.2 KiB

Length

Max length: 80
Median length: 50
Mean length: 5.6376128
Min length: 1

Characters and Unicode

Total characters: 56207
Distinct characters: 706
Distinct categories: 10
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1229
Unique (%): 12.3%

Sample

1st row: 굿인포메이션
2nd row: 디자인하우스
3rd row: 한국금융연구원
4th row: 한국금융연수원
5th row: 창작과 비평사
ValueCountFrequency (%)
대외경제정책연구원 264
 
2.4%
한국금융연수원 264
 
2.4%
한국금융연구원 261
 
2.4%
국토연구원 181
 
1.7%
한국개발연구원 172
 
1.6%
한국행정연구원 150
 
1.4%
한국경제연구원 147
 
1.4%
민음사 142
 
1.3%
김영사 131
 
1.2%
한국은행 122
 
1.1%
Other values (2340) 8979
83.0%

Most occurring characters

ValueCountFrequency (%)
2186
 
3.9%
2143
 
3.8%
1974
 
3.5%
1914
 
3.4%
1785
 
3.2%
1773
 
3.2%
1340
 
2.4%
967
 
1.7%
927
 
1.6%
843
 
1.5%
Other values (696) 40355
71.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 46014 | 81.9%
Lowercase Letter | 6369 | 11.3%
Uppercase Letter | 2276 | 4.0%
Space Separator | 843 | 1.5%
Decimal Number | 339 | 0.6%
Other Punctuation | 160 | 0.3%
Open Punctuation | 74 | 0.1%
Close Punctuation | 74 | 0.1%
Dash Punctuation | 57 | 0.1%
Connector Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2186
 
4.8%
2143
 
4.7%
1974
 
4.3%
1914
 
4.2%
1785
 
3.9%
1773
 
3.9%
1340
 
2.9%
967
 
2.1%
927
 
2.0%
760
 
1.7%
Other values (617) 30245
65.7%
Lowercase Letter
ValueCountFrequency (%)
e 704
11.1%
o 616
9.7%
n 592
9.3%
i 541
 
8.5%
a 521
 
8.2%
s 504
 
7.9%
r 477
 
7.5%
l 367
 
5.8%
t 306
 
4.8%
c 273
 
4.3%
Other values (15) 1468
23.0%
Uppercase Letter
ValueCountFrequency (%)
S 217
 
9.5%
B 211
 
9.3%
P 178
 
7.8%
M 166
 
7.3%
O 140
 
6.2%
H 128
 
5.6%
C 125
 
5.5%
I 117
 
5.1%
A 114
 
5.0%
E 107
 
4.7%
Other values (15) 773
34.0%
Other Punctuation
ValueCountFrequency (%)
& 74
46.2%
, 32
20.0%
. 21
 
13.1%
/ 12
 
7.5%
; 4
 
2.5%
: 4
 
2.5%
3
 
1.9%
' 3
 
1.9%
@ 2
 
1.2%
· 2
 
1.2%
Other values (2) 3
 
1.9%
Decimal Number
ValueCountFrequency (%)
1 152
44.8%
2 138
40.7%
0 11
 
3.2%
3 9
 
2.7%
4 9
 
2.7%
8 6
 
1.8%
9 5
 
1.5%
5 4
 
1.2%
6 3
 
0.9%
7 2
 
0.6%
Open Punctuation
ValueCountFrequency (%)
( 64
86.5%
[ 10
 
13.5%
Close Punctuation
ValueCountFrequency (%)
) 64
86.5%
] 10
 
13.5%
Space Separator
ValueCountFrequency (%)
843
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 57
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 45501 | 81.0%
Latin | 8645 | 15.4%
Common | 1548 | 2.8%
Han | 513 | 0.9%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2186
 
4.8%
2143
 
4.7%
1974
 
4.3%
1914
 
4.2%
1785
 
3.9%
1773
 
3.9%
1340
 
2.9%
967
 
2.1%
927
 
2.0%
760
 
1.7%
Other values (538) 29732
65.3%
Han
ValueCountFrequency (%)
79
 
15.4%
35
 
6.8%
33
 
6.4%
32
 
6.2%
29
 
5.7%
20
 
3.9%
20
 
3.9%
12
 
2.3%
12
 
2.3%
11
 
2.1%
Other values (69) 230
44.8%
Latin
ValueCountFrequency (%)
e 704
 
8.1%
o 616
 
7.1%
n 592
 
6.8%
i 541
 
6.3%
a 521
 
6.0%
s 504
 
5.8%
r 477
 
5.5%
l 367
 
4.2%
t 306
 
3.5%
c 273
 
3.2%
Other values (40) 3744
43.3%
Common
ValueCountFrequency (%)
843
54.5%
1 152
 
9.8%
2 138
 
8.9%
& 74
 
4.8%
( 64
 
4.1%
) 64
 
4.1%
- 57
 
3.7%
, 32
 
2.1%
. 21
 
1.4%
/ 12
 
0.8%
Other values (19) 91
 
5.9%

Most occurring blocks

ValueCountFrequency (%)
Hangul 45501
81.0%
ASCII 10186
 
18.1%
CJK 503
 
0.9%
CJK Compat Ideographs 10
 
< 0.1%
None 7
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
2186
 
4.8%
2143
 
4.7%
1974
 
4.3%
1914
 
4.2%
1785
 
3.9%
1773
 
3.9%
1340
 
2.9%
967
 
2.1%
927
 
2.0%
760
 
1.7%
Other values (538) 29732
65.3%
ASCII
ValueCountFrequency (%)
843
 
8.3%
e 704
 
6.9%
o 616
 
6.0%
n 592
 
5.8%
i 541
 
5.3%
a 521
 
5.1%
s 504
 
4.9%
r 477
 
4.7%
l 367
 
3.6%
t 306
 
3.0%
Other values (66) 4715
46.3%
CJK
ValueCountFrequency (%)
79
 
15.7%
35
 
7.0%
33
 
6.6%
32
 
6.4%
29
 
5.8%
20
 
4.0%
20
 
4.0%
12
 
2.4%
12
 
2.4%
11
 
2.2%
Other values (67) 220
43.7%
CJK Compat Ideographs
ValueCountFrequency (%)
8
80.0%
2
 
20.0%
None
ValueCountFrequency (%)
3
42.9%
· 2
28.6%
2
28.6%

Unnamed: 6
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 10000
Missing (%): 100.0%
Memory size: 166.0 KiB

Interactions

[Figure: pairwise interaction plots for the numeric variables 순서 and 등록번호]

Correlations

Variable | 순서 | 자료유형 | 등록번호
순서 | 1.000 | 0.248 | 0.998
자료유형 | 0.248 | 1.000 | 0.252
등록번호 | 0.998 | 0.252 | 1.000

Variable | 순서 | 등록번호 | 자료유형
순서 | 1.000 | 1.000 | 0.133
등록번호 | 1.000 | 1.000 | 0.135
자료유형 | 0.133 | 0.135 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
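
One common way to quantify nullity correlation is the Pearson correlation between the missingness indicators of two columns. The sketch below implements that idea in plain Python on hypothetical toy columns; the function name, the data, and the choice of Pearson correlation are assumptions for illustration and are not taken from the report's implementation.

```python
from math import isnan, sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns."""
    a = [1.0 if value is None else 0.0 for value in col_a]
    b = [1.0 if value is None else 0.0 for value in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is never (or always) missing has no defined correlation.
        return float("nan")
    return cov / sqrt(var_a * var_b)


# Hypothetical toy columns: authors and publishers are missing in the same rows.
authors = ["A", None, "C", None, "E", "F"]
publishers = ["P", None, "Q", None, "R", "S"]
titles = ["t1", "t2", "t3", "t4", "t5", "t6"]

same_rows = nullity_correlation(authors, publishers)
never_missing = nullity_correlation(authors, titles)
print(same_rows, never_missing)
```

A value near 1 means the two columns tend to be missing together, a value near -1 means one tends to be present when the other is missing, and columns with no variation in missingness (such as a fully populated or fully empty column) yield no defined value.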

Sample

Index | 순서 | 자료유형 | 등록번호 | 서명 | 저자 | 출판사 | Unnamed: 6
6414 | 6473 | 국내서 | 7110 | 대제국 고구려.1:광개토대제비의 위용 | 유현종 | 굿인포메이션 | <NA>
26 | 27 | 국내서 | 27 | (온쪽이 하예린의)내가 만난 파리 | 최하예린 | 디자인하우스 | <NA>
16880 | 17383 | 연구보고서 | 18240 | 맥쿼리 그룹(Macquarie Group)의 성장과정과 전략적 시사점 | 한국금융연구원 | 한국금융연구원 | <NA>
8964 | 9043 | 연구보고서 | 9789 | (2007 통신연수)금융경제.1 | 한국금융연수원 | 한국금융연수원 | <NA>
4411 | 4426 | 국내서 | 4836 | (기발한 시골 양반 라 만차의)돈끼호떼.1 | 세르반떼스, 미겔 데 | 창작과 비평사 | <NA>
12395 | 12672 | 국내서 | 13474 | (2008년도)건물신축단가표 | 한국감정원 | 한국감정원 | <NA>
24619 | 25831 | 연구보고서 | 26745 | (2016년)하반기 경제전망 | 한국금융연구원 | 한국금융연구원 | <NA>
20656 | 21277 | 국내서 | 22162 | 아! 아브라함 | 조우철 | 오직말씀 | <NA>
12885 | 13208 | 국내서 | 14014 | 사회공헌활동백서 | 한국도로공사 | 한국도로공사 | <NA>
8203 | 8276 | 국내서 | 8978 | (켈러의)경영경제통계학:엑셀의 실전적 활용 | Keller, gerald | Thomson | <NA>

Index | 순서 | 자료유형 | 등록번호 | 서명 | 저자 | 출판사 | Unnamed: 6
2877 | 2885 | 국내서 | 3157 | TOEIC 900이나 500이나 미국가면 헤매는 20가지 이유 | 구경서 | 스타일 리더 | <NA>
1724 | 1729 | 국내서 | 1927 | 변신 이야기.1 | 오비디우스 | 민음사 | <NA>
4326 | 4341 | 국내서 | 4751 | Business Communication | School, harvard business | Harvard Business School | <NA>
2571 | 2579 | 국내서 | 2839 | 꿈꾸는 책들의 도시 2 | 뫼르스, 발터 | 들녘 | <NA>
515 | 516 | 국내서 | 518 | 내부감사 실무전서:기준. 매뉴얼, 체크리스트 | 한국감사협의회 | 한국감사협의회 | <NA>
19162 | 19748 | 국내서 | 20622 | Romance of the three kingdoms =삼국지 | 나관중 | 다산북스 | <NA>
1694 | 1699 | 국내서 | 1897 | 밑줄 긋는 남자 | 봉그랑, 카롤린 | 열린책들 | <NA>
13111 | 13437 | 국내서 | 14250 | 미국 부동산 금융의 대위기:언제 극복할 것인가=(The) Great crisis of real estate & finance of America : When will America overcome | 김일권 | 부연사 | <NA>
21333 | 22275 | 국내서 | 23167 | (지승호가 묻고 강신주가 답하다)강신주의 맨얼굴의 철학 당당한 인문학 | 강신주 | 시대의창 | <NA>
5192 | 5217 | 국내서 | 5667 | (한 권으로 끝내는)변액보험 | 김종서 | 미래지식 | <NA>