Overview

Dataset statistics

Number of variables: 5
Number of observations: 1374
Missing cells: 509
Missing cells (%): 7.4%
Duplicate rows: 14
Duplicate rows (%): 1.0%
Total size in memory: 55.1 KiB
Average record size in memory: 41.1 B
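
The two percentage rows above follow directly from the counts. As a minimal sketch (the helper name `pct` is hypothetical, and the inputs are simply the figures reported in this overview), the arithmetic can be checked like this:

```python
def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

n_rows, n_cols = 1374, 5          # observations and variables reported above
total_cells = n_rows * n_cols     # every row contributes one cell per variable

missing_cells_pct = pct(509, total_cells)   # missing cells / all cells
duplicate_rows_pct = pct(14, n_rows)        # duplicate rows / all rows

print(missing_cells_pct, duplicate_rows_pct)
```

The denominator for missing cells is rows times variables, while the denominator for duplicate rows is the row count alone.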

Variable types

Text: 3
Categorical: 2

Dataset

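Description: Data on the partner companies that collaborate on planning performances related to the National Theater of Korea; about 960 records are provided.
Author: Ministry of Culture, Sports and Tourism, National Theater of Korea
URL: https://www.data.go.kr/data/15090292/fileData.do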

Alerts

Dataset has 14 (1.0%) duplicate rows [Duplicates]
상태코드 is highly overall correlated with 조직유형 [High correlation]
조직유형 is highly overall correlated with 상태코드 [High correlation]
상태코드 is highly imbalanced (99.1%) [Imbalance]
우편번호 has 220 (16.0%) missing values [Missing]
주소 1 has 289 (21.0%) missing values [Missing]
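
The 99.1% imbalance figure for 상태코드 is consistent with a normalized-entropy score, 1 - H / log2(k), applied to the value counts 1373 and 1 that the report lists for that column. That this is the formula behind the alert is an assumption rather than something the report states, and the function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """1 - (Shannon entropy / maximum entropy); 0 = balanced, near 1 = one dominant class."""
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Value counts reported for 상태코드: 1373 rows with "1" and one other value.
score = imbalance_score([1373, 1])
print(f"{score:.1%}")
```

A perfectly balanced two-class column would score 0 under this definition, so a value close to 1 signals that a single class dominates.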

Reproduction

Analysis started: 2023-12-12 21:13:33.784338
Analysis finished: 2023-12-12 21:13:34.508020
Duration: 0.72 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

조직명
Text

Distinct: 1342
Distinct (%): 97.7%
Missing: 0
Missing (%): 0.0%
Memory size: 10.9 KiB

Length

Max length: 31
Median length: 22
Mean length: 7.636099
Min length: 2

Characters and Unicode

Total characters: 10492
Distinct characters: 605
Distinct categories: 15
Distinct scripts: 4
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
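
As a small illustration of what a character-category breakdown involves, the sketch below uses Python's standard `unicodedata` module to count characters by general category. The helper name `category_counts` is hypothetical, and the module exposes general categories only, so scripts and blocks are not covered here.

```python
import unicodedata
from collections import Counter

def category_counts(text: str) -> Counter:
    """Count characters by two-letter Unicode general category (e.g. 'Lo', 'Po')."""
    return Counter(unicodedata.category(ch) for ch in text)

sample = "(사) 나누리."   # first sample value shown for this variable
counts = category_counts(sample)
print(counts)
```

In this scheme `Lo` is Other Letter (Hangul syllables fall here), `Ps` and `Pe` are Open and Close Punctuation, `Zs` is Space Separator, and `Po` is Other Punctuation.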

Unique

Unique: 1310
Unique (%): 95.3%

Sample

1st row: (사) 나누리.
2nd row: (사) 전국문예회관연합회.
3rd row: (사) 한국기독교문화사업단
4th row: (사)갈물한글서회.
5th row: (사)고려오페라단.

Value | Count | Frequency (%)
사단법인 | 37 | 2.2%
극단 | 30 | 1.8%
주식회사 | 25 | 1.5%
무용단 | 9 | 0.5%
프로젝트 | 5 | 0.3%
컴퍼니 | 4 | 0.2%
필하모닉 | 4 | 0.2%
art | 4 | 0.2%
오케스트라 | 4 | 0.2%
(value not shown) | 4 | 0.2%
Other values (1489) | 1571 | 92.6%

Most occurring characters

ValueCountFrequency (%)
. 714
 
6.8%
325
 
3.1%
) 270
 
2.6%
( 262
 
2.5%
240
 
2.3%
202
 
1.9%
190
 
1.8%
183
 
1.7%
175
 
1.7%
173
 
1.6%
Other values (595) 7758
73.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 8347 | 79.6%
Other Punctuation | 728 | 6.9%
Space Separator | 325 | 3.1%
Uppercase Letter | 283 | 2.7%
Close Punctuation | 271 | 2.6%
Open Punctuation | 263 | 2.5%
Lowercase Letter | 214 | 2.0%
Decimal Number | 33 | 0.3%
Other Symbol | 16 | 0.2%
Dash Punctuation | 4 | < 0.1%
Other values (5) | 8 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
240
 
2.9%
202
 
2.4%
190
 
2.3%
183
 
2.2%
175
 
2.1%
173
 
2.1%
160
 
1.9%
154
 
1.8%
150
 
1.8%
138
 
1.7%
Other values (521) 6582
78.9%
Uppercase Letter
ValueCountFrequency (%)
M 26
 
9.2%
S 23
 
8.1%
T 20
 
7.1%
A 19
 
6.7%
C 19
 
6.7%
O 18
 
6.4%
E 16
 
5.7%
R 15
 
5.3%
N 15
 
5.3%
I 13
 
4.6%
Other values (14) 99
35.0%
Lowercase Letter
ValueCountFrequency (%)
a 23
10.7%
e 22
10.3%
o 19
 
8.9%
n 17
 
7.9%
s 16
 
7.5%
u 16
 
7.5%
t 16
 
7.5%
c 13
 
6.1%
r 10
 
4.7%
m 9
 
4.2%
Other values (13) 53
24.8%
Decimal Number
ValueCountFrequency (%)
1 14
42.4%
2 4
 
12.1%
0 3
 
9.1%
4 3
 
9.1%
9 3
 
9.1%
5 2
 
6.1%
3 2
 
6.1%
6 1
 
3.0%
7 1
 
3.0%
Other Punctuation
ValueCountFrequency (%)
. 714
98.1%
& 7
 
1.0%
, 4
 
0.5%
' 2
 
0.3%
/ 1
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 270
99.6%
] 1
 
0.4%
Open Punctuation
ValueCountFrequency (%)
( 262
99.6%
[ 1
 
0.4%
Math Symbol
ValueCountFrequency (%)
+ 2
50.0%
= 2
50.0%
Space Separator
ValueCountFrequency (%)
325
100.0%
Other Symbol
ValueCountFrequency (%)
16
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%
Initial Punctuation
ValueCountFrequency (%)
1
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 8358 | 79.7%
Common | 1632 | 15.6%
Latin | 497 | 4.7%
Han | 5 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
240
 
2.9%
202
 
2.4%
190
 
2.3%
183
 
2.2%
175
 
2.1%
173
 
2.1%
160
 
1.9%
154
 
1.8%
150
 
1.8%
138
 
1.7%
Other values (517) 6593
78.9%
Latin
ValueCountFrequency (%)
M 26
 
5.2%
a 23
 
4.6%
S 23
 
4.6%
e 22
 
4.4%
T 20
 
4.0%
o 19
 
3.8%
A 19
 
3.8%
C 19
 
3.8%
O 18
 
3.6%
n 17
 
3.4%
Other values (37) 291
58.6%
Common
ValueCountFrequency (%)
. 714
43.8%
325
19.9%
) 270
 
16.5%
( 262
 
16.1%
1 14
 
0.9%
& 7
 
0.4%
2 4
 
0.2%
, 4
 
0.2%
- 4
 
0.2%
0 3
 
0.2%
Other values (16) 25
 
1.5%
Han
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 8342 | 79.5%
ASCII | 2127 | 20.3%
None | 16 | 0.2%
CJK | 5 | < 0.1%
Punctuation | 2 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 714
33.6%
325
15.3%
) 270
 
12.7%
( 262
 
12.3%
M 26
 
1.2%
a 23
 
1.1%
S 23
 
1.1%
e 22
 
1.0%
T 20
 
0.9%
o 19
 
0.9%
Other values (61) 423
19.9%
Hangul
ValueCountFrequency (%)
240
 
2.9%
202
 
2.4%
190
 
2.3%
183
 
2.2%
175
 
2.1%
173
 
2.1%
160
 
1.9%
154
 
1.8%
150
 
1.8%
138
 
1.7%
Other values (516) 6577
78.8%
None
ValueCountFrequency (%)
16
100.0%
Punctuation
ValueCountFrequency (%)
1
50.0%
1
50.0%
CJK
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%

조직유형
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 10.9 KiB
Value | Count
1480 | 882
10 | 276
9 | 198
3530 | 12
1481 | 5

Length

Max length: 4
Median length: 4
Mean length: 3.1659389
Min length: 1

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 1480
2nd row: 1480
3rd row: 10
4th row: 1480
5th row: 1480

Common Values

Value | Count | Frequency (%)
1480 | 882 | 64.2%
10 | 276 | 20.1%
9 | 198 | 14.4%
3530 | 12 | 0.9%
1481 | 5 | 0.4%
<NA> | 1 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
1480 | 882 | 64.2%
10 | 276 | 20.1%
9 | 198 | 14.4%
3530 | 12 | 0.9%
1481 | 5 | 0.4%
na | 1 | 0.1%

우편번호
Text

MISSING 

Distinct: 775
Distinct (%): 67.2%
Missing: 220
Missing (%): 16.0%
Memory size: 10.9 KiB

Length

Max length: 7
Median length: 7
Mean length: 5.9055459
Min length: 4

Characters and Unicode

Total characters: 6815
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 625
Unique (%): 54.2%

Sample

1st row: 135-833
2nd row: 137-718
3rd row: 6063
4th row: 110-320
5th row: 151-056

Value | Count | Frequency (%)
137-070 | 65 | 5.6%
4621 | 26 | 2.3%
6715 | 15 | 1.3%
137-718 | 11 | 1.0%
135-080 | 11 | 1.0%
137-867 | 11 | 1.0%
137-060 | 9 | 0.8%
135-090 | 9 | 0.8%
111-111 | 8 | 0.7%
135-270 | 7 | 0.6%
Other values (765) | 982 | 85.1%

Most occurring characters

Value | Count | Frequency (%)
1 | 1225 | 18.0%
0 | 999 | 14.7%
3 | 719 | 10.6%
- | 701 | 10.3%
7 | 621 | 9.1%
6 | 478 | 7.0%
8 | 469 | 6.9%
2 | 462 | 6.8%
5 | 454 | 6.7%
4 | 452 | 6.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 6114 | 89.7%
Dash Punctuation | 701 | 10.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1225
20.0%
0 999
16.3%
3 719
11.8%
7 621
10.2%
6 478
 
7.8%
8 469
 
7.7%
2 462
 
7.6%
5 454
 
7.4%
4 452
 
7.4%
9 235
 
3.8%
Dash Punctuation
ValueCountFrequency (%)
- 701
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 6815 | 100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1225
18.0%
0 999
14.7%
3 719
10.6%
- 701
10.3%
7 621
9.1%
6 478
 
7.0%
8 469
 
6.9%
2 462
 
6.8%
5 454
 
6.7%
4 452
 
6.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 6815 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1225
18.0%
0 999
14.7%
3 719
10.6%
- 701
10.3%
7 621
9.1%
6 478
 
7.0%
8 469
 
6.9%
2 462
 
6.8%
5 454
 
6.7%
4 452
 
6.6%

주소 1
Text

MISSING 

Distinct: 743
Distinct (%): 68.5%
Missing: 289
Missing (%): 21.0%
Memory size: 10.9 KiB

Length

Max length: 45
Median length: 41
Mean length: 16.943779
Min length: 1

Characters and Unicode

Total characters: 18384
Distinct characters: 423
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3

Unique

Unique: 637
Unique (%): 58.7%

Sample

1st row: 서울 강남 논현2동
2nd row: 서울 서초 서초3동 예술의전당
3rd row: 서울 강남구 청담동 11-23 (청담동)
4th row: 서울 종로 낙원동
5th row: 서울시 관악구 행운동

Value | Count | Frequency (%)
서울 | 841 | 18.6%
서초 | 162 | 3.6%
경기 | 110 | 2.4%
서초동 | 101 | 2.2%
강남 | 91 | 2.0%
서초구 | 70 | 1.6%
중구 | 69 | 1.5%
강남구 | 61 | 1.4%
종로 | 60 | 1.3%
서초3동 | 52 | 1.2%
Other values (1509) | 2893 | 64.1%

Most occurring characters

ValueCountFrequency (%)
3426
 
18.6%
1381
 
7.5%
1168
 
6.4%
898
 
4.9%
518
 
2.8%
507
 
2.8%
1 415
 
2.3%
410
 
2.2%
( 404
 
2.2%
) 404
 
2.2%
Other values (413) 8853
48.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 11934 | 64.9%
Space Separator | 3426 | 18.6%
Decimal Number | 1810 | 9.8%
Open Punctuation | 404 | 2.2%
Close Punctuation | 404 | 2.2%
Other Punctuation | 259 | 1.4%
Dash Punctuation | 101 | 0.5%
Uppercase Letter | 39 | 0.2%
Lowercase Letter | 4 | < 0.1%
Math Symbol | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
1381
 
11.6%
1168
 
9.8%
898
 
7.5%
518
 
4.3%
507
 
4.2%
410
 
3.4%
245
 
2.1%
207
 
1.7%
198
 
1.7%
181
 
1.5%
Other values (375) 6221
52.1%
Uppercase Letter
ValueCountFrequency (%)
S 5
12.8%
E 5
12.8%
M 4
10.3%
T 4
10.3%
G 3
7.7%
A 3
7.7%
W 2
 
5.1%
X 2
 
5.1%
L 2
 
5.1%
C 2
 
5.1%
Other values (7) 7
17.9%
Decimal Number
ValueCountFrequency (%)
1 415
22.9%
2 282
15.6%
3 261
14.4%
4 171
9.4%
5 145
 
8.0%
0 137
 
7.6%
6 109
 
6.0%
7 106
 
5.9%
8 100
 
5.5%
9 84
 
4.6%
Lowercase Letter
ValueCountFrequency (%)
r 1
25.0%
k 1
25.0%
o 1
25.0%
e 1
25.0%
Space Separator
ValueCountFrequency (%)
3426
100.0%
Open Punctuation
ValueCountFrequency (%)
( 404
100.0%
Close Punctuation
ValueCountFrequency (%)
) 404
100.0%
Other Punctuation
ValueCountFrequency (%)
, 259
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 101
100.0%
Math Symbol
ValueCountFrequency (%)
~ 2
100.0%
Letter Number
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 11934 | 64.9%
Common | 6406 | 34.8%
Latin | 44 | 0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
1381
 
11.6%
1168
 
9.8%
898
 
7.5%
518
 
4.3%
507
 
4.2%
410
 
3.4%
245
 
2.1%
207
 
1.7%
198
 
1.7%
181
 
1.5%
Other values (375) 6221
52.1%
Latin
ValueCountFrequency (%)
S 5
11.4%
E 5
11.4%
M 4
 
9.1%
T 4
 
9.1%
G 3
 
6.8%
A 3
 
6.8%
W 2
 
4.5%
X 2
 
4.5%
L 2
 
4.5%
C 2
 
4.5%
Other values (12) 12
27.3%
Common
ValueCountFrequency (%)
3426
53.5%
1 415
 
6.5%
( 404
 
6.3%
) 404
 
6.3%
2 282
 
4.4%
3 261
 
4.1%
, 259
 
4.0%
4 171
 
2.7%
5 145
 
2.3%
0 137
 
2.1%
Other values (6) 502
 
7.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 11934 | 64.9%
ASCII | 6449 | 35.1%
Number Forms | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3426
53.1%
1 415
 
6.4%
( 404
 
6.3%
) 404
 
6.3%
2 282
 
4.4%
3 261
 
4.0%
, 259
 
4.0%
4 171
 
2.7%
5 145
 
2.2%
0 137
 
2.1%
Other values (27) 545
 
8.5%
Hangul
ValueCountFrequency (%)
1381
 
11.6%
1168
 
9.8%
898
 
7.5%
518
 
4.3%
507
 
4.2%
410
 
3.4%
245
 
2.1%
207
 
1.7%
198
 
1.7%
181
 
1.5%
Other values (375) 6221
52.1%
Number Forms
ValueCountFrequency (%)
1
100.0%

상태코드
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 10.9 KiB
Value | Count
1 | 1373
서울 서초구 방배로34길 8 (방배동, 다원빌딩) | 1

Length

Max length: 27
Median length: 1
Mean length: 1.0189229
Min length: 1

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 1373 | 99.9%
서울 서초구 방배로34길 8 (방배동, 다원빌딩) | 1 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
1 | 1373 | 99.6%
서울 | 1 | 0.1%
서초구 | 1 | 0.1%
방배로34길 | 1 | 0.1%
8 | 1 | 0.1%
방배동 | 1 | 0.1%
다원빌딩 | 1 | 0.1%

Correlations

Variable | 조직유형 | 상태코드
조직유형 | 1.000 | NaN
상태코드 | NaN | 1.000

Variable | 상태코드 | 조직유형
상태코드 | 1.000 | 1.000
조직유형 | 1.000 | 1.000

Variable | 조직유형 | 상태코드
조직유형 | 1.000 | 1.000
상태코드 | 1.000 | 1.000
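
The 1.000 entries in the correlation tables point to a perfect association between 조직유형 and 상태코드. This section does not say which coefficient produced them, so the sketch below is an illustration only: it computes an uncorrected Cramér's V, one common association measure for two categorical columns. The function name `cramers_v` and the toy data are hypothetical.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))            # observed count of each (x, y) pair
    x_tot, y_tot = Counter(xs), Counter(ys)
    chi2 = 0.0
    for x, nx in x_tot.items():
        for y, ny in y_tot.items():
            expected = nx * ny / n          # count expected under independence
            chi2 += (joint[(x, y)] - expected) ** 2 / expected
    k = min(len(x_tot), len(y_tot)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else float("nan")

# Toy data: each x value maps to exactly one y value, so association is perfect.
xs = ["a", "a", "b", "b"]
ys = ["1", "1", "2", "2"]
print(cramers_v(xs, ys))
```

When one column has a single dominant value, as 상태코드 does here, a handful of rows can drive such a coefficient to its maximum, so a value of 1.000 is worth checking against the raw rows.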

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
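
A minimal sketch of nullity correlation, assuming it is computed as the Pearson correlation between the presence indicators of two columns. The function name `nullity_correlation` and the toy columns are hypothetical; `None` marks a missing cell.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 'is present' indicators of two columns."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")   # a column that is never, or always, missing
    return cov / math.sqrt(var_a * var_b)

# Toy columns standing in for 우편번호 and 주소 1.
postcode = ["135-833", None, "6063", None]
address = ["서울 강남 논현2동", None, "서울 강남구 청담동", "서울 종로 낙원동"]
r = nullity_correlation(postcode, address)
print(round(r, 3))
```

A value near 1 means the two columns tend to be missing in the same rows, and a value near -1 means one tends to be present exactly when the other is missing.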

Sample

Index | 조직명 | 조직유형 | 우편번호 | 주소 1 | 상태코드
0 | (사) 나누리. | 1480 | 135-833 | 서울 강남 논현2동 | 1
1 | (사) 전국문예회관연합회. | 1480 | 137-718 | 서울 서초 서초3동 예술의전당 | 1
2 | (사) 한국기독교문화사업단 | 10 | 6063 | 서울 강남구 청담동 11-23 (청담동) | 1
3 | (사)갈물한글서회. | 1480 | 110-320 | 서울 종로 낙원동 | 1
4 | (사)고려오페라단. | 1480 | 151-056 | 서울시 관악구 행운동 | 1
5 | (사)국제문화공연교류회 | 10 | 4561 | 서울 중구 을지로44길 10 (광희동1가) | 1
6 | (사)국제서예가협회. | 1480 | 501-847 | 광주 동구 학동 | 1
7 | (사)글로리아오페라단. | 1480 | 135-893 | 서울 강남 신사동 | 1
8 | (사)김자경오페라단. | 1480 | 137-873 | 서울 서초 서초3동 | 1
9 | (사)꾸러기예술단 | 1480 | 135-120 | 서울 강남구 신사동 | 1

Index | 조직명 | 조직유형 | 우편번호 | 주소 1 | 상태코드
1364 | Sugar & Co. | 1480 | 140-031 | 서울 용산 이촌1동 | 1
1365 | SYJ Dance company. | 1480 | 135-877 | <NA> | 1
1366 | testtest. | 1480 | 300-814 | 대전 동구 삼성1동 | 1
1367 | TIMF앙상블. | 1480 | 650-110 | 경남 통영 도천동 | 1
1368 | UNICO. | 1480 | 156-094 | 서울 동작 사당4동 | 1
1369 | Unico. | 1480 | 137-060 | 서울 서초 방배동 | 1
1370 | We Music. | 1480 | 449-040 | 경기 용인 마평동 | 1
1371 | Y발레단 | 10 | 4728 | 서울 성동구 금호로 117 (금호동2가, 금호자이1차) | 1
1372 | YAJ MUSIC(야즈뮤직) | 1480 | 3315 | 경기 화성시 동탄대로시범길 19 (청계동, 동탄역 시범 더샵 센트럴시티) | 1
1373 | Zamstick. | 1480 | 151-834 | 서울 관악 행운동 | 1

Duplicate rows

Most frequently occurring

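Index | 조직명 | 조직유형 | 우편번호 | 주소 1 | 상태코드 | # duplicates
0 | (재)윤이상평화재단. | 1480 | 110-070 | 서울 종로 내수동 | 1 | 2
1 | 극단목화. | 1480 | 129-149 | 서울특별시 종로구 | 1 | 2
2 | 김민주 | 9 | <NA> | <NA> | 1 | 2
3 | 대전광역시청. | 1480 | 6715 | 대전 서구 둔산로 100 (둔산동, 대전광역시청) | 1 | 2
4 | 무직클람머. | 1480 | 135-864 | 서울 강남 삼성2동 | 1 | 2
5 | 뮤자인. | 1480 | 137-040 | 서울 서초 반포동 | 1 | 2
6 | 보류. | 1480 | 111-111 | 1 | 1 | 2
7 | 서울싱어즈소사이어티. | 1480 | 137-868 | 서울 서초 서초3동 | 1 | 2
8 | 서울오라토리오. | 1480 | 137-867 | 서울 서초 서초3동 | 1 | 2
9 | 아시아투데이. | 1480 | 150-890 | 서울 영등 여의도동 | 1 | 2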
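
The duplicate counts can be reproduced with a simple tally, assuming a duplicate row is one whose values match an earlier row in every column. This is a sketch under that assumption; the function name `duplicate_summary` is hypothetical and the rows are a small illustrative subset, not the full dataset.

```python
from collections import Counter

def duplicate_summary(rows):
    """Return (number of rows repeating an earlier row, {row: occurrences} for repeated rows)."""
    counts = Counter(rows)
    repeated = {row: n for row, n in counts.items() if n > 1}
    extra = sum(n - 1 for n in repeated.values())
    return extra, repeated

# Rows as tuples (조직명, 조직유형, 우편번호, 주소 1, 상태코드); None marks a missing cell.
rows = [
    ("뮤자인.", "1480", "137-040", "서울 서초 반포동", "1"),
    ("뮤자인.", "1480", "137-040", "서울 서초 반포동", "1"),
    ("김민주", "9", None, None, "1"),
    ("김민주", "9", None, None, "1"),
    ("Zamstick.", "1480", "151-834", "서울 관악 행운동", "1"),
]
extra, repeated = duplicate_summary(rows)
print(extra, len(repeated))
```

Here `extra` counts only the surplus copies, while each entry in `repeated` records how many times that row occurs in total, which corresponds to the "# duplicates" column.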