Overview

Dataset statistics

Number of variables: 11
Number of observations: 10000
Missing cells: 33097
Missing cells (%): 30.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 947.3 KiB
Average record size in memory: 97.0 B
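
The two derived figures in this block can be checked from the raw counts. The sketch below assumes the missing-cell percentage is taken over all cells (variables times observations) and that 1 KiB is 1024 bytes; the variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" block of the report.
n_variables = 11
n_observations = 10_000
missing_cells = 33_097
total_size_kib = 947.3

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes, and the average is total size / rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both values round to the figures the report shows (30.1% and 97.0 B).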

Variable types

Numeric: 1
Text: 8
DateTime: 2

Dataset

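Description: Records of performances staged by the National Theater of Korea (국립극장) through the third quarter of 2022 (performance title, start date, end date, venue, running time, etc.)
URL: https://www.data.go.kr/data/3062564/fileData.do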

Alerts

부제목 (subtitle) has 5973 (59.7%) missing values [Missing]
시작일 (start date) has 232 (2.3%) missing values [Missing]
종료일 (end date) has 232 (2.3%) missing values [Missing]
장소 (venue) has 1313 (13.1%) missing values [Missing]
단체 (organization) has 748 (7.5%) missing values [Missing]
캐스팅 (cast) has 7424 (74.2%) missing values [Missing]
스태프 (staff) has 7815 (78.1%) missing values [Missing]
공연시간(분) (running time in minutes) has 9355 (93.5%) missing values [Missing]
번호 (number) has unique values [Unique]
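
As a consistency check, the per-column missing counts can be summed and compared with the overall "Missing cells" figure. This is a minimal sketch: the counts are copied from the alerts, and the 장르 count (5) comes from that variable's own summary further down, since it raised no alert. The dictionary name is illustrative.

```python
# Missing-value counts per column, copied from the report.
missing = {
    "부제목": 5973,
    "시작일": 232,
    "종료일": 232,
    "장소": 1313,
    "단체": 748,
    "캐스팅": 7424,
    "스태프": 7815,
    "공연시간(분)": 9355,
    "장르": 5,  # from the 장르 variable summary, not from an alert
}

total_missing = sum(missing.values())
reported_missing_cells = 33_097  # "Missing cells" in Dataset statistics

print("sum of per-column counts:", total_missing)
print("matches reported total:", total_missing == reported_missing_cells)
```

The remaining two columns report zero missing values, so they do not change the total.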

Reproduction

Analysis started: 2023-12-12 05:59:34.850093
Analysis finished: 2023-12-12 05:59:39.338134
Duration: 4.49 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
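
The reported duration follows directly from the two timestamps. A small sketch using only the standard library, with the timestamps copied from this block:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2023-12-12 05:59:34.850093")
finished = datetime.fromisoformat("2023-12-12 05:59:39.338134")

duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")
```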

Variables

번호
Real number (ℝ)

UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5556.141
Minimum: 1
Maximum: 11074
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 556.95
Q1: 2799.75
Median: 5565.5
Q3: 8323.25
95th percentile: 10523.05
Maximum: 11074
Range: 11073
Interquartile range (IQR): 5523.5

Descriptive statistics

Standard deviation: 3193.8277
Coefficient of variation (CV): 0.5748284
Kurtosis: -1.1973951
Mean: 5556.141
Median Absolute Deviation (MAD): 2762
Skewness: -0.0072588256
Sum: 55561410
Variance: 10200535
Monotonicity: Not monotonic
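
Several of the statistics for 번호 are simple functions of the others. The sketch below recomputes them from the reported values using the standard definitions (range, IQR, CV as standard deviation over mean, variance as the squared standard deviation). Small differences in the last digit are possible because the inputs are already rounded.

```python
# Values copied from the quantile and descriptive statistics of 번호.
n = 10_000
mean, std = 5556.141, 3193.8277
minimum, maximum = 1, 11074
q1, q3 = 2799.75, 8323.25

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance
total = mean * n                 # Sum

print("Range:", value_range)
print("IQR:", iqr)
print("CV:", round(cv, 7))
print("Variance:", round(variance))
print("Sum:", round(total))
```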
Histogram with fixed size bins (bins=50)
Common values

Value                Count  Frequency (%)
7006                 1      < 0.1%
9132                 1      < 0.1%
2218                 1      < 0.1%
7917                 1      < 0.1%
605                  1      < 0.1%
785                  1      < 0.1%
8130                 1      < 0.1%
6128                 1      < 0.1%
10947                1      < 0.1%
10534                1      < 0.1%
Other values (9990)  9990   99.9%

Minimum 10 values

Value  Count  Frequency (%)
1      1      < 0.1%
2      1      < 0.1%
3      1      < 0.1%
4      1      < 0.1%
5      1      < 0.1%
6      1      < 0.1%
7      1      < 0.1%
8      1      < 0.1%
9      1      < 0.1%
10     1      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
11074  1      < 0.1%
11073  1      < 0.1%
11072  1      < 0.1%
11071  1      < 0.1%
11070  1      < 0.1%
11069  1      < 0.1%
11068  1      < 0.1%
11067  1      < 0.1%
11066  1      < 0.1%
11065  1      < 0.1%

Distinct: 8944
Distinct (%): 89.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 130
Median length: 82
Mean length: 17.9762
Min length: 1

Characters and Unicode

Total characters: 179762
Distinct characters: 2013
Distinct categories: 13
Distinct scripts: 7
Distinct blocks: 10
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
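
To make the category counts concrete, the sketch below classifies the characters of one sample value with Python's standard `unicodedata` module. It is not the profiler's own implementation. The standard library has no script property, so the first word of each character name is used as a rough stand-in for the script; that substitution is an assumption of this example.

```python
import unicodedata
from collections import Counter

# Sample value copied from the report (the "2nd row" sample of this variable).
sample = "2020 정오의 음악회 3월"

# unicodedata.category returns the two-letter general category, e.g.
# "Lo" (Other Letter), "Nd" (Decimal Number), "Zs" (Space Separator).
categories = Counter(unicodedata.category(ch) for ch in sample)

# Rough script proxy (assumption): first word of the character name,
# e.g. "HANGUL SYLLABLE ..." -> "HANGUL", "DIGIT TWO" -> "DIGIT".
name_prefixes = Counter(
    unicodedata.name(ch, "UNKNOWN").split()[0] for ch in sample
)

print(categories)
print(name_prefixes)
```

Hangul syllables fall under "Other Letter", which is why that category dominates the tables that follow.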

Unique

Unique: 8377
Unique (%): 83.8%

Sample

1st row: 律呂樂會 定期演奏會 (율여악회 정기연주회)
2nd row: 2020 정오의 음악회 3월
3rd row: 바보각시
4th row: 어느 刑事 이야기 (어느 형사 이야기)
5th row: 2002 로미오와 줄리엣
ValueCountFrequency (%)
363
 
1.0%
국립극장 363
 
1.0%
정기연주회 335
 
0.9%
피아노 284
 
0.8%
253
 
0.7%
독주회 243
 
0.7%
the 239
 
0.7%
독창회 208
 
0.6%
2017 196
 
0.5%
2016 171
 
0.5%
Other values (13692) 33028
92.6%

Most occurring characters

ValueCountFrequency (%)
25746
 
14.3%
( 3369
 
1.9%
) 3365
 
1.9%
e 3306
 
1.8%
1 3073
 
1.7%
3001
 
1.7%
0 2974
 
1.7%
a 2514
 
1.4%
2 2491
 
1.4%
i 2191
 
1.2%
Other values (2003) 127732
71.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 96765
53.8%
Space Separator 25746
 
14.3%
Lowercase Letter 24040
 
13.4%
Decimal Number 14155
 
7.9%
Uppercase Letter 8078
 
4.5%
Open Punctuation 3372
 
1.9%
Close Punctuation 3369
 
1.9%
Other Punctuation 3047
 
1.7%
Dash Punctuation 438
 
0.2%
Math Symbol 354
 
0.2%
Other values (3) 398
 
0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3001
 
3.1%
1973
 
2.0%
1925
 
2.0%
1864
 
1.9%
1681
 
1.7%
1566
 
1.6%
1542
 
1.6%
1264
 
1.3%
1255
 
1.3%
1251
 
1.3%
Other values (1889) 79443
82.1%
Lowercase Letter
ValueCountFrequency (%)
e 3306
13.8%
a 2514
10.5%
i 2191
9.1%
o 1986
 
8.3%
n 1934
 
8.0%
r 1928
 
8.0%
t 1485
 
6.2%
s 1308
 
5.4%
l 1230
 
5.1%
h 854
 
3.6%
Other values (28) 5304
22.1%
Uppercase Letter
ValueCountFrequency (%)
T 660
 
8.2%
S 627
 
7.8%
A 589
 
7.3%
L 490
 
6.1%
C 477
 
5.9%
M 465
 
5.8%
E 438
 
5.4%
O 437
 
5.4%
R 434
 
5.4%
D 392
 
4.9%
Other values (22) 3069
38.0%
Other Punctuation
ValueCountFrequency (%)
: 1599
52.5%
, 579
 
19.0%
. 538
 
17.7%
& 112
 
3.7%
; 53
 
1.7%
' 53
 
1.7%
! 33
 
1.1%
/ 30
 
1.0%
· 26
 
0.9%
" 20
 
0.7%
Other values (3) 4
 
0.1%
Decimal Number
ValueCountFrequency (%)
1 3073
21.7%
0 2974
21.0%
2 2491
17.6%
9 2046
14.5%
8 848
 
6.0%
7 670
 
4.7%
6 589
 
4.2%
3 509
 
3.6%
5 493
 
3.5%
4 462
 
3.3%
Letter Number
ValueCountFrequency (%)
37
31.9%
36
31.0%
21
18.1%
12
 
10.3%
4
 
3.4%
3
 
2.6%
2
 
1.7%
1
 
0.9%
Math Symbol
ValueCountFrequency (%)
> 121
34.2%
< 121
34.2%
~ 107
30.2%
+ 3
 
0.8%
× 2
 
0.6%
Open Punctuation
ValueCountFrequency (%)
( 3369
99.9%
[ 3
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 3365
99.9%
] 4
 
0.1%
Space Separator
ValueCountFrequency (%)
25746
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 438
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 281
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 90290
50.2%
Common 50763
28.2%
Latin 32209
 
17.9%
Han 6466
 
3.6%
Cyrillic 25
 
< 0.1%
Katakana 5
 
< 0.1%
Hiragana 4
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3001
 
3.3%
1973
 
2.2%
1925
 
2.1%
1864
 
2.1%
1681
 
1.9%
1566
 
1.7%
1542
 
1.7%
1264
 
1.4%
1255
 
1.4%
1251
 
1.4%
Other values (997) 72968
80.8%
Han
ValueCountFrequency (%)
485
 
7.5%
212
 
3.3%
207
 
3.2%
171
 
2.6%
143
 
2.2%
125
 
1.9%
115
 
1.8%
105
 
1.6%
105
 
1.6%
103
 
1.6%
Other values (873) 4695
72.6%
Latin
ValueCountFrequency (%)
e 3306
 
10.3%
a 2514
 
7.8%
i 2191
 
6.8%
o 1986
 
6.2%
n 1934
 
6.0%
r 1928
 
6.0%
t 1485
 
4.6%
s 1308
 
4.1%
l 1230
 
3.8%
h 854
 
2.7%
Other values (50) 13473
41.8%
Common
ValueCountFrequency (%)
25746
50.7%
( 3369
 
6.6%
) 3365
 
6.6%
1 3073
 
6.1%
0 2974
 
5.9%
2 2491
 
4.9%
9 2046
 
4.0%
: 1599
 
3.1%
8 848
 
1.7%
7 670
 
1.3%
Other values (26) 4582
 
9.0%
Cyrillic
ValueCountFrequency (%)
и 4
16.0%
А 2
 
8.0%
л 2
 
8.0%
ь 2
 
8.0%
к 2
 
8.0%
С 1
 
4.0%
е 1
 
4.0%
в 1
 
4.0%
с 1
 
4.0%
й 1
 
4.0%
Other values (8) 8
32.0%
Katakana
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%
Hiragana
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 90290
50.2%
ASCII 82825
46.1%
CJK 6214
 
3.5%
CJK Compat Ideographs 252
 
0.1%
Number Forms 116
 
0.1%
None 28
 
< 0.1%
Cyrillic 25
 
< 0.1%
Katakana 5
 
< 0.1%
Hiragana 4
 
< 0.1%
Punctuation 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
25746
31.1%
( 3369
 
4.1%
) 3365
 
4.1%
e 3306
 
4.0%
1 3073
 
3.7%
0 2974
 
3.6%
a 2514
 
3.0%
2 2491
 
3.0%
i 2191
 
2.6%
9 2046
 
2.5%
Other values (74) 31750
38.3%
Hangul
ValueCountFrequency (%)
3001
 
3.3%
1973
 
2.2%
1925
 
2.1%
1864
 
2.1%
1681
 
1.9%
1566
 
1.7%
1542
 
1.7%
1264
 
1.4%
1255
 
1.4%
1251
 
1.4%
Other values (997) 72968
80.8%
CJK
ValueCountFrequency (%)
485
 
7.8%
212
 
3.4%
207
 
3.3%
171
 
2.8%
143
 
2.3%
125
 
2.0%
115
 
1.9%
105
 
1.7%
105
 
1.7%
103
 
1.7%
Other values (833) 4443
71.5%
CJK Compat Ideographs
ValueCountFrequency (%)
88
34.9%
62
24.6%
11
 
4.4%
8
 
3.2%
5
 
2.0%
5
 
2.0%
5
 
2.0%
5
 
2.0%
5
 
2.0%
4
 
1.6%
Other values (30) 54
21.4%
Number Forms
ValueCountFrequency (%)
37
31.9%
36
31.0%
21
18.1%
12
 
10.3%
4
 
3.4%
3
 
2.6%
2
 
1.7%
1
 
0.9%
None
ValueCountFrequency (%)
· 26
92.9%
× 2
 
7.1%
Cyrillic
ValueCountFrequency (%)
и 4
16.0%
А 2
 
8.0%
л 2
 
8.0%
ь 2
 
8.0%
к 2
 
8.0%
С 1
 
4.0%
е 1
 
4.0%
в 1
 
4.0%
с 1
 
4.0%
й 1
 
4.0%
Other values (8) 8
32.0%
Punctuation
ValueCountFrequency (%)
2
66.7%
1
33.3%
Hiragana
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
Katakana
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%

부제목
Text

MISSING 

Distinct: 2884
Distinct (%): 71.6%
Missing: 5973
Missing (%): 59.7%
Memory size: 156.2 KiB

Length

Max length: 69
Median length: 55
Mean length: 15.535634
Min length: 1

Characters and Unicode

Total characters: 62562
Distinct characters: 1004
Distinct categories: 13
Distinct scripts: 6
Distinct blocks: 10
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2613
Unique (%): 64.9%

Sample

1st row: 第5回 (제5회)
2nd row: 2010 세계국립극장페스티벌 특별공연, 2010 서울연극올림픽 공식초청작
3rd row: 메아리 現協 劇會 第2回 公演 (메아리 현협극회 제2회 공연)
4th row: 국립극단 제116회 정기공연 (85 청소년공연예술제)
5th row: 제2회
ValueCountFrequency (%)
정기공연 300
 
2.3%
공연 256
 
2.0%
국립극장 235
 
1.8%
정기연주회 219
 
1.7%
국립극단 204
 
1.6%
극단 184
 
1.4%
국립교향악단 157
 
1.2%
국립무용단 121
 
0.9%
제2회 120
 
0.9%
제1회 117
 
0.9%
Other values (3844) 11078
85.3%

Most occurring characters

ValueCountFrequency (%)
8978
 
14.4%
2509
 
4.0%
2129
 
3.4%
1964
 
3.1%
1 1884
 
3.0%
1767
 
2.8%
2 1446
 
2.3%
0 1421
 
2.3%
1371
 
2.2%
1355
 
2.2%
Other values (994) 37738
60.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 41009
65.5%
Space Separator 8978
 
14.4%
Decimal Number 8208
 
13.1%
Lowercase Letter 1593
 
2.5%
Uppercase Letter 952
 
1.5%
Close Punctuation 571
 
0.9%
Open Punctuation 571
 
0.9%
Other Punctuation 440
 
0.7%
Dash Punctuation 121
 
0.2%
Math Symbol 66
 
0.1%
Other values (3) 53
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2509
 
6.1%
2129
 
5.2%
1964
 
4.8%
1767
 
4.3%
1371
 
3.3%
1355
 
3.3%
1338
 
3.3%
1200
 
2.9%
1197
 
2.9%
718
 
1.8%
Other values (909) 25461
62.1%
Lowercase Letter
ValueCountFrequency (%)
e 232
14.6%
a 169
10.6%
r 140
8.8%
o 128
 
8.0%
t 118
 
7.4%
i 114
 
7.2%
l 103
 
6.5%
n 95
 
6.0%
s 90
 
5.6%
h 66
 
4.1%
Other values (16) 338
21.2%
Uppercase Letter
ValueCountFrequency (%)
A 85
 
8.9%
T 80
 
8.4%
S 72
 
7.6%
E 68
 
7.1%
C 66
 
6.9%
B 56
 
5.9%
O 55
 
5.8%
I 47
 
4.9%
R 45
 
4.7%
P 40
 
4.2%
Other values (15) 338
35.5%
Decimal Number
ValueCountFrequency (%)
1 1884
23.0%
2 1446
17.6%
0 1421
17.3%
3 685
 
8.3%
9 628
 
7.7%
4 534
 
6.5%
5 472
 
5.8%
8 405
 
4.9%
6 393
 
4.8%
7 340
 
4.1%
Other Punctuation
ValueCountFrequency (%)
, 208
47.3%
. 136
30.9%
: 58
 
13.2%
· 12
 
2.7%
& 10
 
2.3%
' 8
 
1.8%
/ 6
 
1.4%
! 2
 
0.5%
Letter Number
ValueCountFrequency (%)
13
36.1%
9
25.0%
7
19.4%
4
 
11.1%
1
 
2.8%
1
 
2.8%
1
 
2.8%
Math Symbol
ValueCountFrequency (%)
> 30
45.5%
< 30
45.5%
~ 6
 
9.1%
Space Separator
ValueCountFrequency (%)
8978
100.0%
Close Punctuation
ValueCountFrequency (%)
) 571
100.0%
Open Punctuation
ValueCountFrequency (%)
( 571
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 121
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 15
100.0%
Final Punctuation
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 38558
61.6%
Common 18972
30.3%
Latin 2581
 
4.1%
Han 2433
 
3.9%
Hiragana 9
 
< 0.1%
Katakana 9
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2509
 
6.5%
2129
 
5.5%
1964
 
5.1%
1767
 
4.6%
1371
 
3.6%
1355
 
3.5%
1338
 
3.5%
1200
 
3.1%
1197
 
3.1%
718
 
1.9%
Other values (601) 23010
59.7%
Han
ValueCountFrequency (%)
243
 
10.0%
231
 
9.5%
176
 
7.2%
175
 
7.2%
142
 
5.8%
117
 
4.8%
64
 
2.6%
51
 
2.1%
48
 
2.0%
38
 
1.6%
Other values (283) 1148
47.2%
Latin
ValueCountFrequency (%)
e 232
 
9.0%
a 169
 
6.5%
r 140
 
5.4%
o 128
 
5.0%
t 118
 
4.6%
i 114
 
4.4%
l 103
 
4.0%
n 95
 
3.7%
s 90
 
3.5%
A 85
 
3.3%
Other values (48) 1307
50.6%
Common
ValueCountFrequency (%)
8978
47.3%
1 1884
 
9.9%
2 1446
 
7.6%
0 1421
 
7.5%
3 685
 
3.6%
9 628
 
3.3%
) 571
 
3.0%
( 571
 
3.0%
4 534
 
2.8%
5 472
 
2.5%
Other values (17) 1782
 
9.4%
Katakana
ValueCountFrequency (%)
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
Hiragana
ValueCountFrequency (%)
3
33.3%
2
22.2%
1
 
11.1%
1
 
11.1%
1
 
11.1%
1
 
11.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 38556
61.6%
ASCII 21503
34.4%
CJK 2383
 
3.8%
CJK Compat Ideographs 50
 
0.1%
Number Forms 36
 
0.1%
None 12
 
< 0.1%
Hiragana 9
 
< 0.1%
Katakana 9
 
< 0.1%
Punctuation 2
 
< 0.1%
Compat Jamo 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8978
41.8%
1 1884
 
8.8%
2 1446
 
6.7%
0 1421
 
6.6%
3 685
 
3.2%
9 628
 
2.9%
) 571
 
2.7%
( 571
 
2.7%
4 534
 
2.5%
5 472
 
2.2%
Other values (66) 4313
20.1%
Hangul
ValueCountFrequency (%)
2509
 
6.5%
2129
 
5.5%
1964
 
5.1%
1767
 
4.6%
1371
 
3.6%
1355
 
3.5%
1338
 
3.5%
1200
 
3.1%
1197
 
3.1%
718
 
1.9%
Other values (600) 23008
59.7%
CJK
ValueCountFrequency (%)
243
 
10.2%
231
 
9.7%
176
 
7.4%
175
 
7.3%
142
 
6.0%
117
 
4.9%
64
 
2.7%
51
 
2.1%
48
 
2.0%
38
 
1.6%
Other values (278) 1098
46.1%
CJK Compat Ideographs
ValueCountFrequency (%)
30
60.0%
12
 
24.0%
4
 
8.0%
3
 
6.0%
1
 
2.0%
Number Forms
ValueCountFrequency (%)
13
36.1%
9
25.0%
7
19.4%
4
 
11.1%
1
 
2.8%
1
 
2.8%
1
 
2.8%
None
ValueCountFrequency (%)
· 12
100.0%
Hiragana
ValueCountFrequency (%)
3
33.3%
2
22.2%
1
 
11.1%
1
 
11.1%
1
 
11.1%
1
 
11.1%
Punctuation
ValueCountFrequency (%)
2
100.0%
Compat Jamo
ValueCountFrequency (%)
2
100.0%
Katakana
ValueCountFrequency (%)
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%

시작일
Date

MISSING 

Distinct: 6996
Distinct (%): 71.6%
Missing: 232
Missing (%): 2.3%
Memory size: 156.2 KiB
Minimum: 1916-01-01 00:00:00
Maximum: 2022-12-31 00:00:00
Histogram with fixed size bins (bins=50)
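
The "Distinct (%)" figures in this report appear to be computed over non-missing rows rather than over all 10000 observations: 6996 distinct dates out of 10000 rows would be 70.0%, not 71.6%. The sketch below reproduces the reported percentages under that assumption; the helper name `distinct_pct` is hypothetical.

```python
n_observations = 10_000


def distinct_pct(distinct: int, missing: int) -> float:
    """Distinct values as a percentage of non-missing rows (assumed definition)."""
    return 100 * distinct / (n_observations - missing)


# Counts copied from the 시작일, 부제목 and 캐스팅 summaries.
print(f"시작일: {distinct_pct(6996, 232):.1f}%")
print(f"부제목: {distinct_pct(2884, 5973):.1f}%")
print(f"캐스팅: {distinct_pct(2439, 7424):.1f}%")
```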

종료일
Date

MISSING 

Distinct: 6910
Distinct (%): 70.7%
Missing: 232
Missing (%): 2.3%
Memory size: 156.2 KiB
Minimum: 1917-12-31 00:00:00
Maximum: 2022-12-31 00:00:00
Histogram with fixed size bins (bins=50)

장소
Text

MISSING 

Distinct: 1482
Distinct (%): 17.1%
Missing: 1313
Missing (%): 13.1%
Memory size: 156.2 KiB

Length

Max length: 115
Median length: 72
Mean length: 10.208588
Min length: 1

Characters and Unicode

Total characters: 88682
Distinct characters: 552
Distinct categories: 11
Distinct scripts: 5
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1138
Unique (%): 13.1%

Sample

1st row: 國立劇場 小劇場 (국립극장 소극장)
2nd row: 하늘극장
3rd row: 국립극장 달오름극장
4th row: 國立劇場 (국립극장)
5th row: 국립극장 해오름극장
Value                Count  Frequency (%)
국립극장             4661   27.4%
소극장               1017   6.0%
대극장               968    5.7%
달오름극장           780    4.6%
해오름극장           683    4.0%
중앙국립극장         490    2.9%
별오름극장           424    2.5%
國立劇場             423    2.5%
미상                 299    1.8%
예술의전당           260    1.5%
Other values (1600)  7019   41.2%

Most occurring characters

ValueCountFrequency (%)
10921
 
12.3%
10635
 
12.0%
8342
 
9.4%
5857
 
6.6%
5554
 
6.3%
e 2227
 
2.5%
2055
 
2.3%
a 1946
 
2.2%
1923
 
2.2%
1575
 
1.8%
Other values (542) 37647
42.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 59732
67.4%
Lowercase Letter 14333
 
16.2%
Space Separator 8343
 
9.4%
Uppercase Letter 3901
 
4.4%
Other Punctuation 1009
 
1.1%
Close Punctuation 643
 
0.7%
Open Punctuation 643
 
0.7%
Decimal Number 57
 
0.1%
Dash Punctuation 19
 
< 0.1%
Letter Number 1
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
10921
18.3%
10635
17.8%
5857
 
9.8%
5554
 
9.3%
2055
 
3.4%
1923
 
3.2%
1575
 
2.6%
1255
 
2.1%
827
 
1.4%
803
 
1.3%
Other values (468) 18327
30.7%
Lowercase Letter
ValueCountFrequency (%)
e 2227
15.5%
a 1946
13.6%
r 1338
9.3%
n 1047
 
7.3%
o 1010
 
7.0%
t 952
 
6.6%
l 912
 
6.4%
i 912
 
6.4%
s 760
 
5.3%
h 501
 
3.5%
Other values (16) 2728
19.0%
Uppercase Letter
ValueCountFrequency (%)
B 608
15.6%
K 514
13.2%
T 409
 
10.5%
S 270
 
6.9%
O 257
 
6.6%
F 169
 
4.3%
M 169
 
4.3%
R 161
 
4.1%
C 150
 
3.8%
G 150
 
3.8%
Other values (15) 1044
26.8%
Decimal Number
ValueCountFrequency (%)
1 31
54.4%
2 8
 
14.0%
3 5
 
8.8%
7 4
 
7.0%
6 2
 
3.5%
9 2
 
3.5%
8 2
 
3.5%
5 1
 
1.8%
0 1
 
1.8%
4 1
 
1.8%
Other Punctuation
ValueCountFrequency (%)
, 880
87.2%
. 116
 
11.5%
' 8
 
0.8%
/ 2
 
0.2%
& 2
 
0.2%
; 1
 
0.1%
Space Separator
ValueCountFrequency (%)
8342
> 99.9%
  1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 643
100.0%
Open Punctuation
ValueCountFrequency (%)
( 643
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 19
100.0%
Letter Number
ValueCountFrequency (%)
1
100.0%
Math Symbol
ValueCountFrequency (%)
> 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 57601
65.0%
Latin 18235
 
20.6%
Common 10715
 
12.1%
Han 2114
 
2.4%
Katakana 17
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
10921
19.0%
10635
18.5%
5857
 
10.2%
5554
 
9.6%
2055
 
3.6%
1923
 
3.3%
1575
 
2.7%
1255
 
2.2%
827
 
1.4%
803
 
1.4%
Other values (420) 16196
28.1%
Latin
ValueCountFrequency (%)
e 2227
 
12.2%
a 1946
 
10.7%
r 1338
 
7.3%
n 1047
 
5.7%
o 1010
 
5.5%
t 952
 
5.2%
l 912
 
5.0%
i 912
 
5.0%
s 760
 
4.2%
B 608
 
3.3%
Other values (42) 6523
35.8%
Han
ValueCountFrequency (%)
527
24.9%
526
24.9%
444
21.0%
443
21.0%
60
 
2.8%
21
 
1.0%
20
 
0.9%
14
 
0.7%
11
 
0.5%
10
 
0.5%
Other values (29) 38
 
1.8%
Common
ValueCountFrequency (%)
8342
77.9%
, 880
 
8.2%
) 643
 
6.0%
( 643
 
6.0%
. 116
 
1.1%
1 31
 
0.3%
- 19
 
0.2%
' 8
 
0.1%
2 8
 
0.1%
3 5
 
< 0.1%
Other values (12) 20
 
0.2%
Katakana
ValueCountFrequency (%)
3
17.6%
3
17.6%
2
11.8%
2
11.8%
2
11.8%
2
11.8%
1
 
5.9%
1
 
5.9%
1
 
5.9%

Most occurring blocks

ValueCountFrequency (%)
Hangul 57597
64.9%
ASCII 28948
32.6%
CJK 2114
 
2.4%
Katakana 17
 
< 0.1%
Compat Jamo 4
 
< 0.1%
None 1
 
< 0.1%
Number Forms 1
 
< 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
10921
19.0%
10635
18.5%
5857
 
10.2%
5554
 
9.6%
2055
 
3.6%
1923
 
3.3%
1575
 
2.7%
1255
 
2.2%
827
 
1.4%
803
 
1.4%
Other values (416) 16192
28.1%
ASCII
ValueCountFrequency (%)
8342
28.8%
e 2227
 
7.7%
a 1946
 
6.7%
r 1338
 
4.6%
n 1047
 
3.6%
o 1010
 
3.5%
t 952
 
3.3%
l 912
 
3.2%
i 912
 
3.2%
, 880
 
3.0%
Other values (62) 9382
32.4%
CJK
ValueCountFrequency (%)
527
24.9%
526
24.9%
444
21.0%
443
21.0%
60
 
2.8%
21
 
1.0%
20
 
0.9%
14
 
0.7%
11
 
0.5%
10
 
0.5%
Other values (29) 38
 
1.8%
Katakana
ValueCountFrequency (%)
3
17.6%
3
17.6%
2
11.8%
2
11.8%
2
11.8%
2
11.8%
1
 
5.9%
1
 
5.9%
1
 
5.9%
None
ValueCountFrequency (%)
  1
100.0%
Compat Jamo
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%
Number Forms
ValueCountFrequency (%)
1
100.0%

장르
Text

Distinct: 123
Distinct (%): 1.2%
Missing: 5
Missing (%): < 0.1%
Memory size: 156.2 KiB

Length

Max length: 19
Median length: 15
Mean length: 6.9424712
Min length: 2

Characters and Unicode

Total characters: 69390
Distinct characters: 117
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 18
Unique (%): 0.2%

Sample

1st row: 음악
2nd row: 국악
3rd row: 연극>일반연극
4th row: 연극
5th row: 무용>발레
Value                     Count  Frequency (%)
음악                      1623   16.2%
연극>일반연극             1309   13.1%
음악>음악극>오페라        875    8.8%
무용>발레                 463    4.6%
음악>서양성악>합창        462    4.6%
무용                      448    4.5%
무용>한국무용>창작무용    345    3.5%
연극                      310    3.1%
음악>한국기악>국악관현악  300    3.0%
공연일반>종합공연         287    2.9%
Other values (113)        3573   35.7%
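
The genre labels look hierarchical, with ">" separating levels (for example 음악>음악극>오페라). The sketch below splits the labels on ">" and totals the counts by top-level genre. It covers only the ten labels listed in the frequency table, since the remaining 113 are not itemized in the report. The helper name `split_genre` is hypothetical.

```python
from collections import Counter

# Top 10 genre labels and counts copied from the frequency table;
# the "Other values" bucket is not itemized and is left out.
top_genres = {
    "음악": 1623,
    "연극>일반연극": 1309,
    "음악>음악극>오페라": 875,
    "무용>발레": 463,
    "음악>서양성악>합창": 462,
    "무용": 448,
    "무용>한국무용>창작무용": 345,
    "연극": 310,
    "음악>한국기악>국악관현악": 300,
    "공연일반>종합공연": 287,
}


def split_genre(label: str) -> list:
    """Split a hierarchical label such as '음악>음악극>오페라' on '>'."""
    return label.split(">")


# Aggregate counts by the first (top-level) component of each label.
top_level = Counter()
for label, count in top_genres.items():
    top_level[split_genre(label)[0]] += count

print(top_level.most_common())
```

Grouping by the first component is one way to reduce the 123 distinct labels to a handful of categories before further analysis.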

Most occurring characters

ValueCountFrequency (%)
> 11133
16.0%
9031
 
13.0%
6446
 
9.3%
4693
 
6.8%
3957
 
5.7%
2898
 
4.2%
2892
 
4.2%
1897
 
2.7%
1863
 
2.7%
1816
 
2.6%
Other values (107) 22764
32.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 58255
84.0%
Math Symbol 11133
 
16.0%
Decimal Number 2
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
9031
15.5%
6446
 
11.1%
4693
 
8.1%
3957
 
6.8%
2898
 
5.0%
2892
 
5.0%
1897
 
3.3%
1863
 
3.2%
1816
 
3.1%
1531
 
2.6%
Other values (105) 21231
36.4%
Math Symbol
ValueCountFrequency (%)
> 11133
100.0%
Decimal Number
ValueCountFrequency (%)
4 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 58255
84.0%
Common 11135
 
16.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
9031
15.5%
6446
 
11.1%
4693
 
8.1%
3957
 
6.8%
2898
 
5.0%
2892
 
5.0%
1897
 
3.3%
1863
 
3.2%
1816
 
3.1%
1531
 
2.6%
Other values (105) 21231
36.4%
Common
ValueCountFrequency (%)
> 11133
> 99.9%
4 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 58255
84.0%
ASCII 11135
 
16.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
> 11133
> 99.9%
4 2
 
< 0.1%
Hangul
ValueCountFrequency (%)
9031
15.5%
6446
 
11.1%
4693
 
8.1%
3957
 
6.8%
2898
 
5.0%
2892
 
5.0%
1897
 
3.3%
1863
 
3.2%
1816
 
3.1%
1531
 
2.6%
Other values (105) 21231
36.4%

단체
Text

MISSING 

Distinct: 4160
Distinct (%): 45.0%
Missing: 748
Missing (%): 7.5%
Memory size: 156.2 KiB

Length

Max length: 50
Median length: 45
Mean length: 9.6561824
Min length: 1

Characters and Unicode

Total characters: 89339
Distinct characters: 1135
Distinct categories: 13
Distinct scripts: 4
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3546
Unique (%): 38.3%

Sample

1st row: 律呂樂會 (율여악회)
2nd row: 국립극장
3rd row: 연희단거리패
4th row: 메아리 現協 劇會 (메아리 현협극회)
5th row: 모스크바 국립 클래시컬 발레단 외
ValueCountFrequency (%)
1607
 
9.0%
국립극장 491
 
2.7%
극단 482
 
2.7%
459
 
2.6%
국립합창단 420
 
2.3%
국립극단 406
 
2.3%
국립국악관현악단 359
 
2.0%
국립창극단 358
 
2.0%
국립무용단 341
 
1.9%
국립오페라단 248
 
1.4%
Other values (6297) 12705
71.1%

Most occurring characters

ValueCountFrequency (%)
8648
 
9.7%
4180
 
4.7%
4059
 
4.5%
3327
 
3.7%
a 2588
 
2.9%
e 2503
 
2.8%
2301
 
2.6%
r 1746
 
2.0%
1737
 
1.9%
, 1732
 
1.9%
Other values (1125) 56518
63.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 51852
58.0%
Lowercase Letter 19986
 
22.4%
Space Separator 8648
 
9.7%
Uppercase Letter 4781
 
5.4%
Other Punctuation 2302
 
2.6%
Close Punctuation 746
 
0.8%
Open Punctuation 743
 
0.8%
Decimal Number 192
 
0.2%
Dash Punctuation 75
 
0.1%
Math Symbol 10
 
< 0.1%
Other values (3) 4
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4180
 
8.1%
4059
 
7.8%
3327
 
6.4%
2301
 
4.4%
1737
 
3.3%
1620
 
3.1%
1026
 
2.0%
906
 
1.7%
830
 
1.6%
797
 
1.5%
Other values (1043) 31069
59.9%
Lowercase Letter
ValueCountFrequency (%)
a 2588
12.9%
e 2503
12.5%
r 1746
8.7%
i 1698
8.5%
n 1635
 
8.2%
o 1546
 
7.7%
l 1472
 
7.4%
t 1063
 
5.3%
s 934
 
4.7%
h 705
 
3.5%
Other values (16) 4096
20.5%
Uppercase Letter
ValueCountFrequency (%)
B 373
 
7.8%
S 356
 
7.4%
A 355
 
7.4%
M 345
 
7.2%
C 284
 
5.9%
T 265
 
5.5%
R 262
 
5.5%
D 241
 
5.0%
P 232
 
4.9%
L 214
 
4.5%
Other values (16) 1854
38.8%
Decimal Number
ValueCountFrequency (%)
1 44
22.9%
2 39
20.3%
0 31
16.1%
8 17
 
8.9%
6 16
 
8.3%
3 12
 
6.2%
9 12
 
6.2%
5 9
 
4.7%
4 7
 
3.6%
7 5
 
2.6%
Other Punctuation
ValueCountFrequency (%)
, 1732
75.2%
. 516
 
22.4%
& 31
 
1.3%
' 13
 
0.6%
/ 5
 
0.2%
: 2
 
0.1%
· 2
 
0.1%
; 1
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
> 4
40.0%
< 4
40.0%
+ 2
20.0%
Close Punctuation
ValueCountFrequency (%)
) 743
99.6%
] 3
 
0.4%
Open Punctuation
ValueCountFrequency (%)
( 740
99.6%
[ 3
 
0.4%
Space Separator
ValueCountFrequency (%)
8648
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 75
100.0%
Other Symbol
ValueCountFrequency (%)
2
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%
Initial Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 49320
55.2%
Latin 24767
27.7%
Common 12718
 
14.2%
Han 2534
 
2.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
4180
 
8.5%
4059
 
8.2%
3327
 
6.7%
2301
 
4.7%
1737
 
3.5%
1620
 
3.3%
1026
 
2.1%
906
 
1.8%
830
 
1.7%
797
 
1.6%
Other values (632) 28537
57.9%
Han
ValueCountFrequency (%)
177
 
7.0%
174
 
6.9%
106
 
4.2%
74
 
2.9%
68
 
2.7%
66
 
2.6%
50
 
2.0%
48
 
1.9%
45
 
1.8%
39
 
1.5%
Other values (402) 1687
66.6%
Latin
ValueCountFrequency (%)
a 2588
 
10.4%
e 2503
 
10.1%
r 1746
 
7.0%
i 1698
 
6.9%
n 1635
 
6.6%
o 1546
 
6.2%
l 1472
 
5.9%
t 1063
 
4.3%
s 934
 
3.8%
h 705
 
2.8%
Other values (42) 8877
35.8%
Common
ValueCountFrequency (%)
8648
68.0%
, 1732
 
13.6%
) 743
 
5.8%
( 740
 
5.8%
. 516
 
4.1%
- 75
 
0.6%
1 44
 
0.3%
2 39
 
0.3%
& 31
 
0.2%
0 31
 
0.2%
Other values (19) 119
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
Hangul 49316
55.2%
ASCII 37481
42.0%
CJK 2427
 
2.7%
CJK Compat Ideographs 107
 
0.1%
None 4
 
< 0.1%
Compat Jamo 2
 
< 0.1%
Punctuation 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8648
23.1%
a 2588
 
6.9%
e 2503
 
6.7%
r 1746
 
4.7%
, 1732
 
4.6%
i 1698
 
4.5%
n 1635
 
4.4%
o 1546
 
4.1%
l 1472
 
3.9%
t 1063
 
2.8%
Other values (68) 12850
34.3%
Hangul
ValueCountFrequency (%)
4180
 
8.5%
4059
 
8.2%
3327
 
6.7%
2301
 
4.7%
1737
 
3.5%
1620
 
3.3%
1026
 
2.1%
906
 
1.8%
830
 
1.7%
797
 
1.6%
Other values (630) 28533
57.9%
CJK
ValueCountFrequency (%)
177
 
7.3%
174
 
7.2%
106
 
4.4%
74
 
3.0%
68
 
2.8%
66
 
2.7%
48
 
2.0%
45
 
1.9%
39
 
1.6%
38
 
1.6%
Other values (379) 1592
65.6%
CJK Compat Ideographs
ValueCountFrequency (%)
50
46.7%
14
 
13.1%
7
 
6.5%
4
 
3.7%
4
 
3.7%
4
 
3.7%
3
 
2.8%
2
 
1.9%
2
 
1.9%
2
 
1.9%
Other values (13) 15
 
14.0%
None
ValueCountFrequency (%)
· 2
50.0%
2
50.0%
Compat Jamo
ValueCountFrequency (%)
2
100.0%
Punctuation
ValueCountFrequency (%)
1
50.0%
1
50.0%

캐스팅
Text

MISSING 

Distinct: 2439
Distinct (%): 94.7%
Missing: 7424
Missing (%): 74.2%
Memory size: 156.2 KiB

Length

Max length: 1024
Median length: 408
Mean length: 91.954193
Min length: 2

Characters and Unicode

Total characters: 236874
Distinct characters: 1099
Distinct categories: 16
Distinct scripts: 6
Distinct blocks: 11
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2402
Unique (%): 93.2%

Sample

1st row올가 파블로바, 올가 알렉산드로브나 파블로바, 비치슬라프 피시만, 이온 쿠로슈, 게오르기 칼리닌, 안드레이 로파예프, 니콜라이 체브 첼로프, 알렉세이 프랴드킨, 블라지미르 스투로프, 나탈리야 카르네예바, 마리나 르자니코바, 올가 클리멘코, 이반 카르네예프, 세르게이 벨로르 브킨, 수사나 아베티소바, 드미트리 에신, 알렉산드르 고로호비크, 다비드 토라파야 / 지휘-V. 릐로프 / 프라임오케스트라
2nd row지휘-염진섭 / 국립합창단
3rd row안숙선, 유수정, 왕기철, 왕기석, 김지숙, 박애리, 윤석안, 남상일, 김금미, 서정금, 윤충일, 김학용, 김경숙, 허종열, 이영태, 김차경, 정미정, 주호종, 우지용
4th row<Still Life At the Penguin Cafe>Deborah Bull, Guy Niblett, Bruce Sansom, Fiona Brockway, Phillip Broomhead, Tracy Brown, Jonathan Cope, Michelle di Lorenzo, Stephen Jefferies, Nicola Roberts / <The Penguin Cafe Orchestra>Simon Jeffes, Helen Liebmann, Neil Rennie, Paul Street, Ian Maidman, Bob Loveday, Geoffrey Richardson, Steve Nye
5th row김명자, 채상묵, 김묘선, 한혜경, 김은희, 진유림, 오은명, 황순임, 최창덕, 김효분, 안숙선, 창작타악그룹 유소
ValueCountFrequency (%)
2962
 
6.7%
국립합창단 330
 
0.7%
국립국악관현악단 158
 
0.4%
코리안심포니오케스트라 157
 
0.4%
147
 
0.3%
국립창극단 120
 
0.3%
합창단 105
 
0.2%
안숙선 103
 
0.2%
국립무용단 103
 
0.2%
김재건 93
 
0.2%
Other values (17793) 40099
90.4%

Most occurring characters

ValueCountFrequency (%)
42075
 
17.8%
, 26701
 
11.3%
5958
 
2.5%
4661
 
2.0%
a 3770
 
1.6%
3710
 
1.6%
e 3429
 
1.4%
2718
 
1.1%
/ 2693
 
1.1%
i 2594
 
1.1%
Other values (1089) 138565
58.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 122037
51.5%
Space Separator 42075
 
17.8%
Other Punctuation 30355
 
12.8%
Lowercase Letter 29444
 
12.4%
Uppercase Letter 7288
 
3.1%
Dash Punctuation 2265
 
1.0%
Math Symbol 1451
 
0.6%
Open Punctuation 802
 
0.3%
Close Punctuation 798
 
0.3%
Decimal Number 296
 
0.1%
Other values (6) 63
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5958
 
4.9%
4661
 
3.8%
3710
 
3.0%
2718
 
2.2%
2222
 
1.8%
1990
 
1.6%
1985
 
1.6%
1981
 
1.6%
1958
 
1.6%
1815
 
1.5%
Other values (995) 93039
76.2%
Lowercase Letter
ValueCountFrequency (%)
a 3770
12.8%
e 3429
11.6%
i 2594
8.8%
n 2577
8.8%
r 2570
8.7%
o 2347
 
8.0%
l 1810
 
6.1%
t 1463
 
5.0%
s 1286
 
4.4%
h 1180
 
4.0%
Other values (17) 6418
21.8%
Uppercase Letter
ValueCountFrequency (%)
S 586
 
8.0%
A 576
 
7.9%
M 535
 
7.3%
C 489
 
6.7%
B 469
 
6.4%
R 371
 
5.1%
T 362
 
5.0%
L 338
 
4.6%
P 337
 
4.6%
D 327
 
4.5%
Other values (16) 2898
39.8%
Other Punctuation
ValueCountFrequency (%)
, 26701
88.0%
/ 2693
 
8.9%
: 500
 
1.6%
. 319
 
1.1%
· 49
 
0.2%
' 43
 
0.1%
& 24
 
0.1%
; 10
 
< 0.1%
* 10
 
< 0.1%
! 6
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
2 82
27.7%
1 59
19.9%
3 33
11.1%
0 23
 
7.8%
4 23
 
7.8%
5 20
 
6.8%
7 18
 
6.1%
6 14
 
4.7%
9 13
 
4.4%
8 11
 
3.7%
Math Symbol
ValueCountFrequency (%)
< 723
49.8%
> 723
49.8%
= 3
 
0.2%
+ 1
 
0.1%
| 1
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 784
97.8%
[ 17
 
2.1%
1
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 781
97.9%
] 17
 
2.1%
Final Punctuation
ValueCountFrequency (%)
5
62.5%
3
37.5%
Initial Punctuation
ValueCountFrequency (%)
5
62.5%
3
37.5%
Letter Number
ValueCountFrequency (%)
2
66.7%
1
33.3%
Space Separator
ValueCountFrequency (%)
42075
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2265
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 29
100.0%
Other Symbol
ValueCountFrequency (%)
11
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 121874
51.5%
Common 78102
33.0%
Latin 36735
 
15.5%
Han 117
 
< 0.1%
Katakana 43
 
< 0.1%
Hiragana 3
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5958
 
4.9%
4661
 
3.8%
3710
 
3.0%
2718
 
2.2%
2222
 
1.8%
1990
 
1.6%
1985
 
1.6%
1981
 
1.6%
1958
 
1.6%
1815
 
1.5%
Other values (872) 92876
76.2%
Han
ValueCountFrequency (%)
8
 
6.8%
4
 
3.4%
3
 
2.6%
3
 
2.6%
3
 
2.6%
2
 
1.7%
2
 
1.7%
2
 
1.7%
2
 
1.7%
2
 
1.7%
Other values (81) 86
73.5%
Latin
ValueCountFrequency (%)
a 3770
 
10.3%
e 3429
 
9.3%
i 2594
 
7.1%
n 2577
 
7.0%
r 2570
 
7.0%
o 2347
 
6.4%
l 1810
 
4.9%
t 1463
 
4.0%
s 1286
 
3.5%
h 1180
 
3.2%
Other values (45) 13709
37.3%
Common
ValueCountFrequency (%)
42075
53.9%
, 26701
34.2%
/ 2693
 
3.4%
- 2265
 
2.9%
( 784
 
1.0%
) 781
 
1.0%
< 723
 
0.9%
> 723
 
0.9%
: 500
 
0.6%
. 319
 
0.4%
Other values (29) 538
 
0.7%
Katakana
ValueCountFrequency (%)
3
 
7.0%
3
 
7.0%
3
 
7.0%
2
 
4.7%
2
 
4.7%
2
 
4.7%
2
 
4.7%
2
 
4.7%
2
 
4.7%
2
 
4.7%
Other values (19) 20
46.5%
Hiragana
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 121871
51.4%
ASCII 114756
48.4%
CJK 116
 
< 0.1%
None 51
 
< 0.1%
Katakana 43
 
< 0.1%
Punctuation 16
 
< 0.1%
Geometric Shapes 11
 
< 0.1%
Compat Jamo 3
 
< 0.1%
Number Forms 3
 
< 0.1%
Hiragana 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
42075
36.7%
, 26701
23.3%
a 3770
 
3.3%
e 3429
 
3.0%
/ 2693
 
2.3%
i 2594
 
2.3%
n 2577
 
2.2%
r 2570
 
2.2%
o 2347
 
2.0%
- 2265
 
2.0%
Other values (74) 23735
20.7%
Hangul
ValueCountFrequency (%)
5958
 
4.9%
4661
 
3.8%
3710
 
3.0%
2718
 
2.2%
2222
 
1.8%
1990
 
1.6%
1985
 
1.6%
1981
 
1.6%
1958
 
1.6%
1815
 
1.5%
Other values (870) 92873
76.2%
None
ValueCountFrequency (%)
· 49
96.1%
ø 1
 
2.0%
1
 
2.0%
Geometric Shapes
ValueCountFrequency (%)
11
100.0%
CJK
ValueCountFrequency (%)
8
 
6.9%
4
 
3.4%
3
 
2.6%
3
 
2.6%
3
 
2.6%
2
 
1.7%
2
 
1.7%
2
 
1.7%
2
 
1.7%
2
 
1.7%
Other values (80) 85
73.3%
Punctuation
ValueCountFrequency (%)
5
31.2%
5
31.2%
3
18.8%
3
18.8%
Katakana
ValueCountFrequency (%)
3
 
7.0%
3
 
7.0%
3
 
7.0%
2
 
4.7%
2
 
4.7%
2
 
4.7%
2
 
4.7%
2
 
4.7%
2
 
4.7%
2
 
4.7%
Other values (19) 20
46.5%
Compat Jamo
ValueCountFrequency (%)
2
66.7%
1
33.3%
Number Forms
ValueCountFrequency (%)
2
66.7%
1
33.3%
Hiragana
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%
CJK Compat Ideographs
ValueCountFrequency (%)
1
100.0%

스태프
Text

MISSING 

Distinct2100
Distinct (%)96.1%
Missing7815
Missing (%)78.1%
Memory size156.2 KiB

Length

Max length885
Median length298
Mean length90.296568
Min length3

Characters and Unicode

Total characters197298
Distinct characters914
Distinct categories16
Distinct scripts5
Distinct blocks12
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
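As a rough illustration of how per-category counts like those reported below can be produced, the following sketch tallies Unicode general categories with Python's standard unicodedata module. The helper name category_counts and the two input strings (fragments adapted from the sample rows of this variable) are illustrative assumptions; this is not the profiler's own implementation. The report's labels correspond to the standard two-letter codes, for example Lo = Other Letter, Zs = Space Separator, Pd = Dash Punctuation, Po = Other Punctuation.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in the given strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            # unicodedata.category returns a two-letter code such as "Lo" or "Zs".
            counts[unicodedata.category(ch)] += 1
    return counts


# Illustrative inputs only (fragments adapted from the sample rows).
sample = ["연출-김홍승 / 안무-정은혜", "Director-Pamela Hossick"]
counts = category_counts(sample)
```

Hangul syllables fall under Lo, which is why "Other Letter" dominates the category table for a mostly Korean column.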

Unique

Unique2055
Unique (%)94.1%

Sample

1st row예술감독-나탈리야 카사트키나, 블라디미르 바실료프 / 연출·총감독-O. 크라스노셀스키흐 / 음향감독-E. 클류시니크 / 무대감독-A. 푸호프 / 조명감독-S. 유르킨 / 무대장치-N. 자만 / 분장디자이너-R. 페텔리나, M. 소콜로바 / 의상-N. 마트베예바 / 소품제작-L. 로미히나 / 극장미술감독-E. 드보르키나 / 미술감독-여운덕 / 기술감독-김인철 / 무대감독-박인원 / 무대장치제작-이원영, 강승구, 신종현, 주기홍, 박병우 / 작화-구재화, 이성현 / 소품-정복모, 채수형 / 의상-김경수 / 장신구-엄인섭 / 조명-고상순, 주영석, 이승재 / 음향-김호성 / 영사-범기창, 김동기, 이승수, 전선택
2nd row예술감독-유영대 / 연출-김홍승 / 직창·도창-안숙선 / 작곡·지휘-이용탁 / 안무-정은혜 / 대본-박성환 / 무대디자인-이학순 / 조명디자인-고희선 / 의상디자인-손희정 / 분장디자인-김종한
3rd row<Still Life At the Penguin Cafe>안무-David Bintley / 작곡-Simon Jeffes / 지휘-Isaiah Jackson / 오케스트라-Penguin Cafe Orchestra, <The Penguin Cafe Orchestra>음악-Simon Jeffes / 감독-Andrew Harries
4th row예술총감독·구성-김명자 / 연출-조주현 / 조명디자인-김철희 / 무대디자인-김종석 / 영상제작-황정남 / 무대감독-김진우 / 의상-故 이매방, 정예희, 부산미미고전한복 / 분장-이재형 / 영상기록-지화충 / 사진기록-이진환 / 홍보·진행-이동민, 정성진
5th rowDirector of Music-Stephen Cleobury / Producer-James Whitbourn / Director-Pamela Hossick / Executive Producer-Tommy Nagra / Organ Scholars-Douglas Tand & Tom Etherigde
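The sample rows above suggest a loose "role-name, name / role-name" convention for this column. A minimal sketch of splitting such a string into (role, names) pairs follows. The function name parse_credits is hypothetical, and the format is inferred from the samples only: rows that prefix a work title in angle brackets (as in the 3rd row) or that contain hyphens inside names would need extra handling.

```python
def parse_credits(text):
    """Split a 'role-name, name / role-name' string into (role, [names]) pairs."""
    credits = []
    for part in text.split(" / "):
        role, sep, names = part.partition("-")
        if not sep:
            # No hyphen found: keep the fragment as a credit without a role.
            credits.append(("", [part.strip()]))
            continue
        credits.append((role.strip(), [n.strip() for n in names.split(",")]))
    return credits


# The 2nd sample row of this variable, copied verbatim.
row = ("예술감독-유영대 / 연출-김홍승 / 직창·도창-안숙선 / 작곡·지휘-이용탁 / "
       "안무-정은혜 / 대본-박성환 / 무대디자인-이학순 / 조명디자인-고희선 / "
       "의상디자인-손희정 / 분장디자인-김종한")
credits = parse_credits(row)
multi = parse_credits("작화-구재화, 이성현")
```

A list of pairs is used instead of a dictionary because a role can appear more than once in one row (the 1st sample row lists 무대감독 and 의상 twice).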
ValueCountFrequency (%)
12126
32.5%
연출 232
 
0.6%
의상-김경수 165
 
0.4%
안무 158
 
0.4%
분장-김종한 150
 
0.4%
작곡 144
 
0.4%
예술감독 135
 
0.4%
조명-김인철 115
 
0.3%
장치제작-송점수 107
 
0.3%
조명 101
 
0.3%
Other values (12766) 23885
64.0%

Most occurring characters

ValueCountFrequency (%)
35291
 
17.9%
- 13457
 
6.8%
/ 11709
 
5.9%
, 3707
 
1.9%
3234
 
1.6%
e 2819
 
1.4%
2778
 
1.4%
2463
 
1.2%
2339
 
1.2%
r 2310
 
1.2%
Other values (904) 117191
59.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 100327 | 50.9%
Space Separator | 35291 | 17.9%
Lowercase Letter | 23185 | 11.8%
Other Punctuation | 17707 | 9.0%
Dash Punctuation | 13457 | 6.8%
Uppercase Letter | 5462 | 2.8%
Math Symbol | 1117 | 0.6%
Close Punctuation | 290 | 0.1%
Open Punctuation | 289 | 0.1%
Decimal Number | 84 | < 0.1%
Other values (6) | 89 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3234
 
3.2%
2778
 
2.8%
2463
 
2.5%
2339
 
2.3%
2058
 
2.1%
2017
 
2.0%
1982
 
2.0%
1947
 
1.9%
1942
 
1.9%
1934
 
1.9%
Other values (816) 77633
77.4%
Lowercase Letter
ValueCountFrequency (%)
e 2819
12.2%
r 2310
10.0%
a 2307
10.0%
i 2229
9.6%
n 1962
 
8.5%
o 1958
 
8.4%
t 1353
 
5.8%
s 1103
 
4.8%
l 1063
 
4.6%
h 881
 
3.8%
Other values (16) 5200
22.4%
Uppercase Letter
ValueCountFrequency (%)
C 501
 
9.2%
S 470
 
8.6%
D 450
 
8.2%
M 404
 
7.4%
P 330
 
6.0%
A 317
 
5.8%
L 295
 
5.4%
B 283
 
5.2%
G 240
 
4.4%
R 234
 
4.3%
Other values (16) 1938
35.5%
Decimal Number
ValueCountFrequency (%)
1 19
22.6%
2 17
20.2%
3 14
16.7%
0 8
9.5%
7 7
 
8.3%
9 7
 
8.3%
5 6
 
7.1%
8 2
 
2.4%
4 2
 
2.4%
6 2
 
2.4%
Other Punctuation
ValueCountFrequency (%)
/ 11709
66.1%
, 3707
 
20.9%
: 1042
 
5.9%
· 841
 
4.7%
. 366
 
2.1%
& 20
 
0.1%
' 17
 
0.1%
! 4
 
< 0.1%
; 1
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
< 548
49.1%
> 548
49.1%
| 13
 
1.2%
8
 
0.7%
Other Symbol
ValueCountFrequency (%)
9
52.9%
5
29.4%
3
 
17.6%
Letter Number
ValueCountFrequency (%)
2
66.7%
1
33.3%
Space Separator
ValueCountFrequency (%)
35291
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 13457
100.0%
Close Punctuation
ValueCountFrequency (%)
) 290
100.0%
Open Punctuation
ValueCountFrequency (%)
( 289
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 62
100.0%
Initial Punctuation
ValueCountFrequency (%)
3
100.0%
Final Punctuation
ValueCountFrequency (%)
3
100.0%
Control
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 100294 | 50.8%
Common | 68321 | 34.6%
Latin | 28650 | 14.5%
Han | 31 | < 0.1%
Hiragana | 2 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3234
 
3.2%
2778
 
2.8%
2463
 
2.5%
2339
 
2.3%
2058
 
2.1%
2017
 
2.0%
1982
 
2.0%
1947
 
1.9%
1942
 
1.9%
1934
 
1.9%
Other values (785) 77600
77.4%
Latin
ValueCountFrequency (%)
e 2819
 
9.8%
r 2310
 
8.1%
a 2307
 
8.1%
i 2229
 
7.8%
n 1962
 
6.8%
o 1958
 
6.8%
t 1353
 
4.7%
s 1103
 
3.8%
l 1063
 
3.7%
h 881
 
3.1%
Other values (44) 10665
37.2%
Common
ValueCountFrequency (%)
35291
51.7%
- 13457
 
19.7%
/ 11709
 
17.1%
, 3707
 
5.4%
: 1042
 
1.5%
· 841
 
1.2%
< 548
 
0.8%
> 548
 
0.8%
. 366
 
0.5%
) 290
 
0.4%
Other values (24) 522
 
0.8%
Han
ValueCountFrequency (%)
2
 
6.5%
2
 
6.5%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
1
 
3.2%
Other values (19) 19
61.3%
Hiragana
ValueCountFrequency (%)
1
50.0%
1
50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 100292 | 50.8%
ASCII | 96096 | 48.7%
None | 841 | 0.4%
CJK | 30 | < 0.1%
Geometric Shapes | 14 | < 0.1%
Math Operators | 8 | < 0.1%
Punctuation | 6 | < 0.1%
Box Drawing | 3 | < 0.1%
Number Forms | 3 | < 0.1%
Compat Jamo | 2 | < 0.1%
Other values (2) | 3 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
35291
36.7%
- 13457
 
14.0%
/ 11709
 
12.2%
, 3707
 
3.9%
e 2819
 
2.9%
r 2310
 
2.4%
a 2307
 
2.4%
i 2229
 
2.3%
n 1962
 
2.0%
o 1958
 
2.0%
Other values (69) 18347
19.1%
Hangul
ValueCountFrequency (%)
3234
 
3.2%
2778
 
2.8%
2463
 
2.5%
2339
 
2.3%
2058
 
2.1%
2017
 
2.0%
1982
 
2.0%
1947
 
1.9%
1942
 
1.9%
1934
 
1.9%
Other values (784) 77598
77.4%
None
ValueCountFrequency (%)
· 841
100.0%
Geometric Shapes
ValueCountFrequency (%)
9
64.3%
5
35.7%
Math Operators
ValueCountFrequency (%)
8
100.0%
Box Drawing
ValueCountFrequency (%)
3
100.0%
Punctuation
ValueCountFrequency (%)
3
50.0%
3
50.0%
Number Forms
ValueCountFrequency (%)
2
66.7%
1
33.3%
CJK
ValueCountFrequency (%)
2
 
6.7%
2
 
6.7%
1
 
3.3%
1
 
3.3%
1
 
3.3%
1
 
3.3%
1
 
3.3%
1
 
3.3%
1
 
3.3%
1
 
3.3%
Other values (18) 18
60.0%
Compat Jamo
ValueCountFrequency (%)
2
100.0%
CJK Compat Ideographs
ValueCountFrequency (%)
1
100.0%
Hiragana
ValueCountFrequency (%)
1
50.0%
1
50.0%

공연시간(분)
Text

MISSING 

Distinct118
Distinct (%)18.3%
Missing9355
Missing (%)93.5%
Memory size156.2 KiB

Length

Max length20
Median length3
Mean length3.0077519
Min length2

Characters and Unicode

Total characters1940
Distinct characters32
Distinct categories5
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique67
Unique (%)10.4%
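The percentages and the mean length reported for this variable are consistent with being computed over the non-missing rows only, not over all 10000 observations. That reading is an inference from the figures, not something the report states. The short check below reproduces the numbers; every variable name is illustrative.

```python
# Figures copied from the report (dataset overview and this variable's summary).
n_rows = 10000      # Number of observations
n_missing = 9355    # Missing
n_distinct = 118    # Distinct
n_unique = 67       # Unique
total_chars = 1940  # Total characters

n_present = n_rows - n_missing                # rows that actually hold a value
distinct_pct = 100 * n_distinct / n_present   # compare with Distinct (%)
unique_pct = 100 * n_unique / n_present       # compare with Unique (%)
mean_length = total_chars / n_present         # compare with Mean length

print(n_present, round(distinct_pct, 1), round(unique_pct, 1), mean_length)
```

With only 645 populated rows, the 18.3% distinct share describes 118 different strings among 645 values, which is worth keeping in mind before treating the column as categorical.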

Sample

1st row70분
2nd row240분 (인터미션 포함)
3rd row110분
4th row70
5th row120
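The sample rows mix bare numbers ("70"), numbers with the unit suffix ("70분"), and annotated values ("240분 (인터미션 포함)"), which is presumably why the column was typed as Text. A minimal sketch of normalising such values to integer minutes is shown below. The function name parse_minutes is hypothetical, and the rule (take the leading run of digits) is an assumption that covers the visible samples only; other formats may exist among the unseen values and should be checked first.

```python
import re


def parse_minutes(value):
    """Return the leading integer of a duration string, or None if there is none."""
    if value is None:
        return None
    match = re.match(r"\s*(\d+)", value)
    return int(match.group(1)) if match else None


# Values copied from the sample rows above, plus a missing value.
raw = ["70분", "240분 (인터미션 포함)", "110분", "70", "120", None]
minutes = [parse_minutes(v) for v in raw]
```

After such a conversion, "70" and "70분" collapse into a single value, so the frequency table would look different from the one computed on raw strings.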
Value | Count | Frequency (%)
90 | 74 | 11.0%
100 | 62 | 9.2%
120 | 53 | 7.9%
70 | 46 | 6.8%
80 | 43 | 6.4%
60 | 37 | 5.5%
70분 | 32 | 4.7%
110 | 19 | 2.8%
180 | 19 | 2.8%
150 | 19 | 2.8%
Other values (107) | 271 | 40.1%

Most occurring characters

ValueCountFrequency (%)
0 647
33.4%
1 355
18.3%
128
 
6.6%
7 107
 
5.5%
9 101
 
5.2%
2 101
 
5.2%
8 92
 
4.7%
5 89
 
4.6%
6 73
 
3.8%
3 36
 
1.9%
Other values (22) 211
 
10.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1631 | 84.1%
Other Letter | 243 | 12.5%
Space Separator | 30 | 1.5%
Open Punctuation | 18 | 0.9%
Close Punctuation | 18 | 0.9%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
128
52.7%
14
 
5.8%
14
 
5.8%
14
 
5.8%
11
 
4.5%
11
 
4.5%
11
 
4.5%
6
 
2.5%
6
 
2.5%
6
 
2.5%
Other values (9) 22
 
9.1%
Decimal Number
ValueCountFrequency (%)
0 647
39.7%
1 355
21.8%
7 107
 
6.6%
9 101
 
6.2%
2 101
 
6.2%
8 92
 
5.6%
5 89
 
5.5%
6 73
 
4.5%
3 36
 
2.2%
4 30
 
1.8%
Space Separator
ValueCountFrequency (%)
30
100.0%
Open Punctuation
ValueCountFrequency (%)
( 18
100.0%
Close Punctuation
ValueCountFrequency (%)
) 18
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1697 | 87.5%
Hangul | 243 | 12.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
128
52.7%
14
 
5.8%
14
 
5.8%
14
 
5.8%
11
 
4.5%
11
 
4.5%
11
 
4.5%
6
 
2.5%
6
 
2.5%
6
 
2.5%
Other values (9) 22
 
9.1%
Common
ValueCountFrequency (%)
0 647
38.1%
1 355
20.9%
7 107
 
6.3%
9 101
 
6.0%
2 101
 
6.0%
8 92
 
5.4%
5 89
 
5.2%
6 73
 
4.3%
3 36
 
2.1%
30
 
1.8%
Other values (3) 66
 
3.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1697 | 87.5%
Hangul | 243 | 12.5%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 647
38.1%
1 355
20.9%
7 107
 
6.3%
9 101
 
6.0%
2 101
 
6.0%
8 92
 
5.4%
5 89
 
5.2%
6 73
 
4.3%
3 36
 
2.1%
30
 
1.8%
Other values (3) 66
 
3.9%
Hangul
ValueCountFrequency (%)
128
52.7%
14
 
5.8%
14
 
5.8%
14
 
5.8%
11
 
4.5%
11
 
4.5%
11
 
4.5%
6
 
2.5%
6
 
2.5%
6
 
2.5%
Other values (9) 22
 
9.1%

Interactions

[Figure: interactions plot]

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
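To make the nullity measures concrete, the sketch below counts missing values per column and computes the Pearson correlation between the presence indicators of two columns, using only the standard library. The function names and the four toy rows are made up for illustration (they are not taken from the dataset), and the profiler's own plots may be computed differently.

```python
from math import sqrt


def missing_counts(rows, columns):
    """Number of rows in which each column is None."""
    return {col: sum(1 for row in rows if row.get(col) is None) for col in columns}


def nullity_correlation(rows, col_a, col_b):
    """Pearson correlation between the presence (1) / absence (0) of two columns."""
    n = len(rows)
    if n == 0:
        return None
    a = [0.0 if row.get(col_a) is None else 1.0 for row in rows]
    b = [0.0 if row.get(col_b) is None else 1.0 for row in rows]
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None  # a column that is always present or always missing has no defined correlation
    return cov / sqrt(var_a * var_b)


# Toy rows (hypothetical values): the two date columns go missing together.
rows = [
    {"시작일": "1987-03-11", "종료일": "1987-03-11", "캐스팅": None},
    {"시작일": None, "종료일": None, "캐스팅": "example cast"},
    {"시작일": "2020-03-11", "종료일": "2020-03-11", "캐스팅": None},
    {"시작일": None, "종료일": None, "캐스팅": None},
]
counts = missing_counts(rows, ["시작일", "종료일", "캐스팅"])
dates_corr = nullity_correlation(rows, "시작일", "종료일")
cast_corr = nullity_correlation(rows, "시작일", "캐스팅")
```

A value near 1 means two columns tend to be present or missing in the same rows. In this report 시작일 and 종료일 both have 232 missing values, which would be consistent with (but does not prove) such a pattern.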

Sample

(index) | 번호 | 공연제목 | 부제목 | 시작일 | 종료일 | 장소 | 장르 | 단체 | 캐스팅 | 스태프 | 공연시간(분)
7005 | 7006 | 律呂樂會 定期演奏會 (율여악회 정기연주회) | 第5回 (제5회) | 1987-03-11 | 1987-03-11 | 國立劇場 小劇場 (국립극장 소극장) | 음악 | 律呂樂會 (율여악회) | <NA> | <NA> | <NA>
140 | 141 | 2020 정오의 음악회 3월 | <NA> | 2020-03-11 | 2020-03-11 | 하늘극장 | 국악 | 국립극장 | <NA> | <NA> | 70분
2477 | 2478 | 바보각시 | 2010 세계국립극장페스티벌 특별공연, 2010 서울연극올림픽 공식초청작 | 2010-09-24 | 2010-09-28 | 국립극장 달오름극장 | 연극>일반연극 | 연희단거리패 | <NA> | <NA> | <NA>
10520 | 10521 | 어느 刑事 이야기 (어느 형사 이야기) | 메아리 現協 劇會 第2回 公演 (메아리 현협극회 제2회 공연) | 1965-08-20 | 1965-08-21 | 國立劇場 (국립극장) | 연극 | 메아리 現協 劇會 (메아리 현협극회) | <NA> | <NA> | <NA>
4635 | 4636 | 2002 로미오와 줄리엣 | <NA> | 2002-05-18 | 2002-05-23 | 국립극장 해오름극장 | 무용>발레 | 모스크바 국립 클래시컬 발레단 외 | 올가 파블로바, 올가 알렉산드로브나 파블로바, 비치슬라프 피시만, 이온 쿠로슈, 게오르기 칼리닌, 안드레이 로파예프, 니콜라이 체브 첼로프, 알렉세이 프랴드킨, 블라지미르 스투로프, 나탈리야 카르네예바, 마리나 르자니코바, 올가 클리멘코, 이반 카르네예프, 세르게이 벨로르 브킨, 수사나 아베티소바, 드미트리 에신, 알렉산드르 고로호비크, 다비드 토라파야 / 지휘-V. 릐로프 / 프라임오케스트라 | 예술감독-나탈리야 카사트키나, 블라디미르 바실료프 / 연출·총감독-O. 크라스노셀스키흐 / 음향감독-E. 클류시니크 / 무대감독-A. 푸호프 / 조명감독-S. 유르킨 / 무대장치-N. 자만 / 분장디자이너-R. 페텔리나, M. 소콜로바 / 의상-N. 마트베예바 / 소품제작-L. 로미히나 / 극장미술감독-E. 드보르키나 / 미술감독-여운덕 / 기술감독-김인철 / 무대감독-박인원 / 무대장치제작-이원영, 강승구, 신종현, 주기홍, 박병우 / 작화-구재화, 이성현 / 소품-정복모, 채수형 / 의상-김경수 / 장신구-엄인섭 / 조명-고상순, 주영석, 이승재 / 음향-김호성 / 영사-범기창, 김동기, 이승수, 전선택 | <NA>
94 | 95 | 2020 국립창극단 <완창판소리> 10월 | <NA> | 2020-10-24 | 2020-10-24 | 하늘극장 | 국악 | 국립극장 | <NA> | <NA> | 240분 (인터미션 포함)
7733 | 7734 | 1984 국립극장 행사 | <NA> | 1984-01-01 | 1984-12-31 | 국립극장 | 행사>내부 | 국립극장 | <NA> | <NA> | <NA>
7488 | 7489 | 대각개교절 70주년기념 원불교예술인공연 | <NA> | 1985-04-28 | 1985-04-28 | 중앙국립극장 | 공연일반>종합공연 | . | <NA> | <NA> | <NA>
7474 | 7475 | 여자가 | 국립극단 제116회 정기공연 (85 청소년공연예술제) | 1985-05-28 | 1985-05-31 | <NA> | 연극>일반연극 | 국립극단 | <NA> | <NA> | <NA>
4572 | 4573 | 2002 국립합창단 초청 가을음악회 | <NA> | 2002-10-21 | 2002-10-21 | 단양군민회관 | 음악>서양성악>합창 | 국립합창단 | 지휘-염진섭 / 국립합창단 | <NA> | <NA>
(index) | 번호 | 공연제목 | 부제목 | 시작일 | 종료일 | 장소 | 장르 | 단체 | 캐스팅 | 스태프 | 공연시간(분)
1683 | 1684 | 2013 국립극장 전통예술아카데미 | <NA> | 2013-03-04 | 2013-11-06 | 국립극장 달오름극장 | 기타 | <NA> | <NA> | <NA> | <NA>
6582 | 6583 | 미오로시 발레 정기공연 | 제2회 | 1988-11-13 | 1988-11-13 | 국립극장 소극장 | 무용>발레 | 한양대학교 무용학과 | <NA> | <NA> | <NA>
8485 | 8486 | 한국 민요 모음 (경음악) - 김희조 선생 지휘 | <NA> | 1979-10-28 | 1979-10-28 | <NA> | 음악 | <NA> | <NA> | <NA> | <NA>
91 | 92 | 2020 정오의 음악회 11월 | <NA> | 2020-11-11 | 2020-11-11 | 하늘극장 | 국악 | 국립극장 | <NA> | <NA> | 70분
4422 | 4423 | Othello(오델로) | 제11회 젊은연극제, 중앙대학교 연극학과 45주년 기념공연 | 2003-07-04 | 2003-07-04 | 국립극장 달오름극장 | 연극>일반연극 | 중앙대학교 연극학과 | <NA> | <NA> | <NA>
5785 | 5786 | 문정온의 춤 | <NA> | 1992-12-14 | 1992-12-15 | 국립중앙극장 소극장 | 무용>현대무용 | 문정온 | <NA> | <NA> | <NA>
7114 | 7115 | 木覓藝術節 (목멱예술절) | 개교 31주년 기념 | 1986-10-14 | 1986-10-15 | 국립극장 소극장 | 공연일반 | 국악고등학교 | <NA> | <NA> | <NA>
6964 | 6965 | 1987 꿈하늘 | 단재 신채호 | 1987-05-01 | 1987-05-15 | 국립극장 소극장 | 연극>일반연극 | 국립극단 | 장민호, 김동원, 손숙, 정상철, 오영수, 박상규, 김재건, 최상설, 서희승, 문영수, 전국환, 김종구, 김명환, 이혜경, 권복순, 주진모, 김희령, 권경희, 최운교, 이경성, 오성열, 이영호 | 극본-차범석 / 연출-김석만 | <NA>
2531 | 2532 | 2010 고양 합창페스티벌 | <NA> | 2010-08-10 | 2010-08-21 | 고양 아람누리 아람음악당 | 음악>서양성악>합창 | 국립합창단 외 | 지휘-나영수, 국립합창단 / 지휘-이상훈, 성남시립합창단 / 지휘-구천, 광주시립합창단 / 지휘-정남규, 원주시립합창단 / 지휘-박신화, 안산시립합창단 / 지휘-민인기, 수원시립합창단 / 지휘-박영호, 대구시립합창단 / 지휘-이상길, 안양시립합창단 / 지휘-빈프리트 톨, 대전시립합창단 / 지휘-이기선, 고양시립합창단 | <NA> | <NA>
4875 | 4876 | 2000 토요문화광장_5~9월 | <NA> | 2000-05-06 | 2000-09-30 | 국립극장 문화광장 | 공연일반>종합공연 | 국립국악관현악단 외 | 류복성, 김용우, 서도창, 이수자, 박정원, 임학성, 서정근, 장사익, 김규형, 김광석, 노름마치, 미8군 군악대, 전미례 재즈발레단, 서울춤사랑, 한국현대춤연구회 푸름, 국립국악관현악단, 국립창극단, 국립무용단, 국립발레단, 서울팝스오케스트라, 서도소리연구회, 극단아리랑, 국수호디딤무용단 | <NA> | <NA>