Overview

Dataset statistics

Number of variables: 6
Number of observations: 21
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.1 KiB
Average record size in memory: 55.3 B

Variable types

Text: 4
Categorical: 2

Alerts

영화관등록수 is highly overall correlated with 상영관(스크린) [High correlation]
상영관(스크린) is highly overall correlated with 영화관등록수 [High correlation]
영화관등록수 is highly imbalanced (72.4%) [Imbalance]
상영영화관명 has unique values [Unique]
소재지 has unique values [Unique]

Reproduction

Analysis started: 2024-03-14 01:11:25.246880
Analysis finished: 2024-03-14 01:11:25.637767
Duration: 0.39 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
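
The reported duration can be checked against the two timestamps. The snippet below is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the report.

```python
from datetime import datetime

# Timestamp layout used by the "Analysis started/finished" fields.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-03-14 01:11:25.246880", FMT)
finished = datetime.strptime("2024-03-14 01:11:25.637767", FMT)

# Elapsed wall-clock time in seconds, rounded the way the report shows it.
duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```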

Variables

구분 (division: the city or county)
Text

Distinct: 13
Distinct (%): 61.9%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Length

Max length: 3
Median length: 3
Mean length: 2.9047619
Min length: 1

Characters and Unicode

Total characters: 61
Distinct characters: 23
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 11
Unique (%): 52.4%

Sample

1st row: (blank)
2nd row: 전주시
3rd row: 전주시
4th row: 전주시
5th row: 전주시
Value | Count | Frequency (%)
전주시 | 8 | 38.1%
군산시 | 2 | 9.5%
(blank) | 1 | 4.8%
익산시 | 1 | 4.8%
정읍시 | 1 | 4.8%
남원시 | 1 | 4.8%
김제시 | 1 | 4.8%
완주군 | 1 | 4.8%
무주군 | 1 | 4.8%
장수군 | 1 | 4.8%
Other values (3) | 3 | 14.3%
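
The Distinct and Unique figures for this variable follow directly from the value counts. The sketch below recomputes them with the standard library. It assumes that the three "Other values" are 임실군, 고창군 and 부안군 (taken from the last rows of the Sample section) and uses the hypothetical placeholder `"<blank>"` for the first row, whose value is not visible in the report.

```python
from collections import Counter

# Value counts for 구분; "<blank>" is a placeholder for the invisible first-row value.
counts = Counter({
    "전주시": 8, "군산시": 2, "<blank>": 1, "익산시": 1, "정읍시": 1,
    "남원시": 1, "김제시": 1, "완주군": 1, "무주군": 1, "장수군": 1,
    "임실군": 1, "고창군": 1, "부안군": 1,  # assumed "Other values (3)"
})

n_rows = sum(counts.values())
distinct = len(counts)                                # number of different values
unique = sum(1 for c in counts.values() if c == 1)    # values that occur exactly once

distinct_pct = round(100 * distinct / n_rows, 1)
unique_pct = round(100 * unique / n_rows, 1)
print(distinct, distinct_pct, unique, unique_pct)
```

"Distinct" counts every different value, while "Unique" counts only the values that appear in a single row, which is why 전주시 and 군산시 contribute to the first figure but not the second.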

Most occurring characters

Value | Count | Frequency (%)
(not shown) | 14 | 23.0%
(not shown) | 10 | 16.4%
(not shown) | 8 | 13.1%
(not shown) | 8 | 13.1%
(not shown) | 3 | 4.9%
(not shown) | 1 | 1.6%
(not shown) | 1 | 1.6%
(not shown) | 1 | 1.6%
(not shown) | 1 | 1.6%
(not shown) | 1 | 1.6%
Other values (13) | 13 | 21.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 61 | 100.0%

Most frequent character per category

Other Letter: covers all 61 characters, so the counts match the "Most occurring characters" table.

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 61 | 100.0%

Most frequent character per script

Hangul: covers all 61 characters, so the counts match the "Most occurring characters" table.

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 61 | 100.0%

Most frequent character per block

Hangul: covers all 61 characters, so the counts match the "Most occurring characters" table.

영화관등록수 (number of registered cinemas)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 9.5%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Length

Max length: 2
Median length: 1
Mean length: 1.047619
Min length: 1
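
These length statistics describe the string length of each cell. As a minimal sketch (standard library only), the values below are rebuilt from the Common Values of this variable: one row containing "20" and twenty rows containing "1".

```python
from statistics import mean, median

# One summary row with "20" and twenty rows with "1", as listed under Common Values.
values = ["20"] + ["1"] * 20
lengths = [len(v) for v in values]

summary = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": round(mean(lengths), 6),
    "min": min(lengths),
}
print(summary)
```

The mean is 22 characters spread over 21 rows, which is where the figure 1.047619 comes from.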

Unique

Unique: 1
Unique (%): 4.8%

Sample

1st row: 20
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 20 | 95.2%
20 | 1 | 4.8%
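
The 72.4% in the Imbalance alert is consistent with one minus the normalized Shannon entropy of these two counts. The sketch below makes that assumption explicit; `imbalance_score` is an illustrative name, not a function taken from ydata-profiling.

```python
from math import log2

def imbalance_score(counts):
    """Return 1 - H/log2(k): 0 for a uniform split, 1 when a single class holds every row."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no balance to measure
    entropy = -sum((c / total) * log2(c / total) for c in counts if c > 0)
    return 1 - entropy / log2(k)

# Counts of the values "1" and "20" in 영화관등록수.
score = imbalance_score([20, 1])
print(f"{score:.1%}")
```

With 20 of 21 rows holding the same value, the entropy is about 0.276 bits out of a possible 1 bit, which gives the reported imbalance.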

Length

Histogram of lengths of the category

Common Values (Plot)

Bar chart of the value counts: 1 (20 rows, 95.2%) and 20 (1 row, 4.8%).

상영영화관명 (cinema name)
Text

UNIQUE 

Distinct: 21
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Length

Max length: 14
Median length: 12
Mean length: 9.1428571
Min length: 4

Characters and Unicode

Total characters: 192
Distinct characters: 62
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 21
Unique (%): 100.0%

Sample

1st row: 20개소(일반13,작은7)
2nd row: 롯데시네마 전주
3rd row: 롯데시네마 평화점
4th row: 메가박스 전주
5th row: 씨너스 송천
Value | Count | Frequency (%)
cgv | 4 | 12.9%
전주 | 3 | 9.7%
롯데시네마 | 3 | 9.7%
군산 | 2 | 6.5%
메가박스 | 2 | 6.5%
정읍 | 1 | 3.2%
동리시네마(작은영화관 | 1 | 3.2%
작은별영화관(작은영화관 | 1 | 3.2%
한누리시네마(작은영화관 | 1 | 3.2%
무주산골영화관(작은영화관 | 1 | 3.2%
Other values (12) | 12 | 38.7%

Most occurring characters

Value | Count | Frequency (%)
(not shown) | 12 | 6.2%
(not shown) | 11 | 5.7%
(not shown) | 11 | 5.7%
(space) | 10 | 5.2%
(not shown) | 9 | 4.7%
(not shown) | 9 | 4.7%
(not shown) | 9 | 4.7%
( | 8 | 4.2%
(not shown) | 8 | 4.2%
(not shown) | 8 | 4.2%
Other values (52) | 97 | 50.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 145 | 75.5%
Uppercase Letter | 15 | 7.8%
Space Separator | 10 | 5.2%
Open Punctuation | 8 | 4.2%
Close Punctuation | 8 | 4.2%
Decimal Number | 5 | 2.6%
Other Punctuation | 1 | 0.5%
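
The category names in this table are Unicode general categories. As a rough illustration, the sketch below classifies the characters of the first sample row with the standard `unicodedata` module. The `CATEGORY_NAMES` mapping is written out by hand for the categories that appear in this report and is an assumption of the example, not the report's own code.

```python
import unicodedata
from collections import Counter

# Two-letter general-category codes mapped to the long names used in the report.
CATEGORY_NAMES = {
    "Lo": "Other Letter", "Lu": "Uppercase Letter", "Zs": "Space Separator",
    "Ps": "Open Punctuation", "Pe": "Close Punctuation", "Nd": "Decimal Number",
    "Po": "Other Punctuation", "Pd": "Dash Punctuation", "Cc": "Control",
}

def category_counts(text):
    """Count characters of `text` per Unicode general category."""
    codes = (unicodedata.category(ch) for ch in text)
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)

first_row = category_counts("20개소(일반13,작은7)")
print(first_row)
```

The five Decimal Number characters and the single Other Punctuation character reported for the whole column all come from this first row.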

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not shown) | 12 | 8.3%
(not shown) | 11 | 7.6%
(not shown) | 11 | 7.6%
(not shown) | 9 | 6.2%
(not shown) | 9 | 6.2%
(not shown) | 9 | 6.2%
(not shown) | 8 | 5.5%
(not shown) | 8 | 5.5%
(not shown) | 6 | 4.1%
(not shown) | 5 | 3.4%
Other values (40) | 57 | 39.3%

Decimal Number
Value | Count | Frequency (%)
2 | 1 | 20.0%
7 | 1 | 20.0%
1 | 1 | 20.0%
3 | 1 | 20.0%
0 | 1 | 20.0%

Uppercase Letter
Value | Count | Frequency (%)
G | 5 | 33.3%
C | 5 | 33.3%
V | 5 | 33.3%

Space Separator
Value | Count | Frequency (%)
(space) | 10 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 8 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 8 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 145 | 75.5%
Common | 32 | 16.7%
Latin | 15 | 7.8%

Most frequent character per script

Hangul: the same 145 characters as Other Letter, so the counts match the Other Letter table above.

Common
Value | Count | Frequency (%)
(space) | 10 | 31.2%
( | 8 | 25.0%
) | 8 | 25.0%
2 | 1 | 3.1%
7 | 1 | 3.1%
1 | 1 | 3.1%
3 | 1 | 3.1%
, | 1 | 3.1%
0 | 1 | 3.1%

Latin
Value | Count | Frequency (%)
G | 5 | 33.3%
C | 5 | 33.3%
V | 5 | 33.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 145 | 75.5%
ASCII | 47 | 24.5%

Most frequent character per block

Hangul: the same 145 characters as Other Letter, so the counts match the Other Letter table above.

ASCII
Value | Count | Frequency (%)
(space) | 10 | 21.3%
( | 8 | 17.0%
) | 8 | 17.0%
G | 5 | 10.6%
C | 5 | 10.6%
V | 5 | 10.6%
2 | 1 | 2.1%
7 | 1 | 2.1%
1 | 1 | 2.1%
3 | 1 | 2.1%
Other values (2) | 2 | 4.3%

상영관(스크린) (auditoriums, i.e. screens)
Categorical

HIGH CORRELATION 

Distinct: 10
Distinct (%): 47.6%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Length

Max length: 10
Median length: 1
Mean length: 1.4761905
Min length: 1

Unique

Unique: 5
Unique (%): 23.8%

Sample

1st row: 96관(82/14)
2nd row: 8
3rd row: 6
4th row: 10
5th row: 8

Common Values

Value | Count | Frequency (%)
2 | 7 | 33.3%
7 | 3 | 14.3%
8 | 2 | 9.5%
6 | 2 | 9.5%
4 | 2 | 9.5%
96관(82/14) | 1 | 4.8%
10 | 1 | 4.8%
1 | 1 | 4.8%
5 | 1 | 4.8%
9 | 1 | 4.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Bar chart of the ten common values; its legend repeats the Common Values table (2: 7 rows, 7: 3 rows, 8, 6 and 4: 2 rows each, and 96관(82/14), 10, 1, 5 and 9: 1 row each).

좌석 (seats)
Text

Distinct: 18
Distinct (%): 85.7%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Length

Max length: 21
Median length: 7
Mean length: 6.1428571
Min length: 4

Characters and Unicode

Total characters: 129
Distinct characters: 16
Distinct categories: 6
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 15
Unique (%): 71.4%

Sample

1st row: 14,871 (14,208/663)
2nd row: 1,626
3rd row: 922
4th row: 1,515
5th row: 1,215
Value | Count | Frequency (%)
90 | 2 | 9.1%
99 | 2 | 9.1%
98 | 2 | 9.1%
1,325 | 1 | 4.5%
14,208/663 | 1 | 4.5%
14,871 | 1 | 4.5%
1,100 | 1 | 4.5%
94 | 1 | 4.5%
615 | 1 | 4.5%
565 | 1 | 4.5%
Other values (9) | 9 | 40.9%

Most occurring characters

Value | Count | Frequency (%)
(space) | 42 | 32.6%
1 | 16 | 12.4%
9 | 14 | 10.9%
, | 10 | 7.8%
5 | 10 | 7.8%
2 | 8 | 6.2%
6 | 7 | 5.4%
0 | 5 | 3.9%
8 | 4 | 3.1%
7 | 3 | 2.3%
Other values (6) | 10 | 7.8%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 73 | 56.6%
Space Separator | 42 | 32.6%
Other Punctuation | 11 | 8.5%
Control | 1 | 0.8%
Open Punctuation | 1 | 0.8%
Close Punctuation | 1 | 0.8%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 16 | 21.9%
9 | 14 | 19.2%
5 | 10 | 13.7%
2 | 8 | 11.0%
6 | 7 | 9.6%
0 | 5 | 6.8%
8 | 4 | 5.5%
7 | 3 | 4.1%
3 | 3 | 4.1%
4 | 3 | 4.1%

Other Punctuation
Value | Count | Frequency (%)
, | 10 | 90.9%
/ | 1 | 9.1%

Space Separator
Value | Count | Frequency (%)
(space) | 42 | 100.0%

Control
Value | Count | Frequency (%)
(control character) | 1 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 129 | 100.0%

Most frequent character per script

Common: covers all 129 characters, so the counts match the "Most occurring characters" table.

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 129 | 100.0%

Most frequent character per block

ASCII: covers all 129 characters, so the counts match the "Most occurring characters" table.

소재지 (location)
Text

UNIQUE 

Distinct: 21
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 300.0 B

Length

Max length: 27
Median length: 19
Mean length: 16.190476
Min length: 1

Characters and Unicode

Total characters: 340
Distinct characters: 90
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 21
Unique (%): 100.0%

Sample

1st row: -
2nd row: 전주시 완산구 서신동 971
3rd row: 전주시 완산구 평화동 604-1
4th row: 전주시 고사동 181
5th row: 전주시 덕진구 송천동 2가 661-15
Value | Count | Frequency (%)
전주시 | 7 | 10.6%
완산구 | 4 | 6.1%
고사동 | 3 | 4.5%
나운동 | 2 | 3.0%
군산시 | 2 | 3.0%
(not shown) | 1 | 1.5%
881 | 1 | 1.5%
82-1 | 1 | 1.5%
김제시 | 1 | 1.5%
검산동 | 1 | 1.5%
Other values (43) | 43 | 65.2%

Most occurring characters

Value | Count | Frequency (%)
(space) | 45 | 13.2%
1 | 24 | 7.1%
(not shown) | 14 | 4.1%
(not shown) | 14 | 4.1%
(not shown) | 12 | 3.5%
- | 11 | 3.2%
(not shown) | 11 | 3.2%
(not shown) | 10 | 2.9%
4 | 10 | 2.9%
2 | 9 | 2.6%
Other values (80) | 180 | 52.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 203 | 59.7%
Decimal Number | 73 | 21.5%
Space Separator | 45 | 13.2%
Dash Punctuation | 11 | 3.2%
Open Punctuation | 4 | 1.2%
Close Punctuation | 4 | 1.2%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not shown) | 14 | 6.9%
(not shown) | 14 | 6.9%
(not shown) | 12 | 5.9%
(not shown) | 11 | 5.4%
(not shown) | 10 | 4.9%
(not shown) | 9 | 4.4%
(not shown) | 7 | 3.4%
(not shown) | 6 | 3.0%
(not shown) | 6 | 3.0%
(not shown) | 5 | 2.5%
Other values (66) | 109 | 53.7%

Decimal Number
Value | Count | Frequency (%)
1 | 24 | 32.9%
4 | 10 | 13.7%
2 | 9 | 12.3%
8 | 7 | 9.6%
3 | 6 | 8.2%
6 | 5 | 6.8%
7 | 4 | 5.5%
0 | 4 | 5.5%
9 | 2 | 2.7%
5 | 2 | 2.7%

Space Separator
Value | Count | Frequency (%)
(space) | 45 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 11 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 4 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 4 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 203 | 59.7%
Common | 137 | 40.3%

Most frequent character per script

Hangul: the same 203 characters as Other Letter, so the counts match the Other Letter table above.

Common
Value | Count | Frequency (%)
(space) | 45 | 32.8%
1 | 24 | 17.5%
- | 11 | 8.0%
4 | 10 | 7.3%
2 | 9 | 6.6%
8 | 7 | 5.1%
3 | 6 | 4.4%
6 | 5 | 3.6%
( | 4 | 2.9%
7 | 4 | 2.9%
Other values (4) | 12 | 8.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 203 | 59.7%
ASCII | 137 | 40.3%

Most frequent character per block

ASCII: the same 137 characters as the Common script, so the counts match the Common table above.

Hangul: the same 203 characters as Other Letter, so the counts match the Other Letter table above.

Correlations

Variable | 구분 | 영화관등록수 | 상영영화관명 | 상영관(스크린) | 좌석 | 소재지
구분 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 1.000
영화관등록수 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
상영영화관명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
상영관(스크린) | 0.000 | 1.000 | 1.000 | 1.000 | 0.961 | 1.000
좌석 | 0.000 | 1.000 | 1.000 | 0.961 | 1.000 | 1.000
소재지 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Variable | 영화관등록수 | 상영관(스크린)
영화관등록수 | 1.000 | 0.761
상영관(스크린) | 0.761 | 1.000
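
The 0.761 between the two categorical variables is consistent with a bias-corrected Cramér's V. The sketch below is a standard-library illustration under that assumption, not ydata-profiling's own code. The (영화관등록수, 상영관(스크린)) pairs are rebuilt from the Common Values tables, with the single "20" row paired with "96관(82/14)" as in the Sample section; `cramers_v_corrected` is an illustrative name.

```python
from collections import Counter
from math import sqrt

def cramers_v_corrected(pairs):
    """Bias-corrected Cramér's V for a list of (x, y) category pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    row_totals = Counter(x for x, _ in pairs)
    col_totals = Counter(y for _, y in pairs)
    r, k = len(row_totals), len(col_totals)
    if n < 2 or r < 2 or k < 2:
        return 0.0  # association is undefined when a column is constant

    # Pearson chi-squared over every cell of the contingency table, empty cells included.
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            chi2 += (joint[(x, y)] - expected) ** 2 / expected

    # Small-sample correction of phi squared and of the table dimensions.
    phi2_corr = max(0.0, chi2 / n - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return sqrt(phi2_corr / denom) if denom > 0 else 0.0

# Screen counts for the twenty rows where 영화관등록수 is "1".
screen_counts = {"2": 7, "7": 3, "8": 2, "6": 2, "4": 2, "10": 1, "1": 1, "5": 1, "9": 1}
pairs = [("20", "96관(82/14)")]  # the summary row
for screens, count in screen_counts.items():
    pairs += [("1", screens)] * count

v = cramers_v_corrected(pairs)
print(round(v, 3))
```

The uncorrected statistic would be 1.0 here, because the one summary row fully determines both values; the correction for a 2 x 10 table with only 21 rows is what brings it down to the reported figure.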

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

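First rows

Row | 구분 | 영화관등록수 | 상영영화관명 | 상영관(스크린) | 좌석 | 소재지
0 | (blank) | 20 | 20개소(일반13,작은7) | 96관(82/14) | 14,871 (14,208/663) | -
1 | 전주시 | 1 | 롯데시네마 전주 | 8 | 1,626 | 전주시 완산구 서신동 971
2 | 전주시 | 1 | 롯데시네마 평화점 | 6 | 922 | 전주시 완산구 평화동 604-1
3 | 전주시 | 1 | 메가박스 전주 | 10 | 1,515 | 전주시 고사동 181
4 | 전주시 | 1 | 씨너스 송천 | 8 | 1,215 | 전주시 덕진구 송천동 2가 661-15
5 | 전주시 | 1 | 전주시네마타운 | 7 | 1,265 | 전주시 완산구 고사동 340-3
6 | 전주시 | 1 | CGV 전주 | 6 | 1,192 | 전주시 고사동 288-2
7 | 전주시 | 1 | 전주디지털독립영화관 | 1 | 98 | 전주영화제작소 4층(전주시고사동 객사3길18)
8 | 전주시 | 1 | CGV효자동 | 7 | 1,795 | 전주시 완산구 용머리로45
9 | 군산시 | 1 | 롯데시네마 군산 | 5 | 975 | 군산시 나운동 124-4

Last rows

Row | 구분 | 영화관등록수 | 상영영화관명 | 상영관(스크린) | 좌석 | 소재지
11 | 익산시 | 1 | CGV 익산 | 9 | 1,325 | 익산시 영등동 149-1
12 | 정읍시 | 1 | CGV 정읍 | 4 | 565 | 정읍시 중앙1길 1
13 | 남원시 | 1 | 메가박스 | 4 | 615 | 남원시 쌍교동 82-1
14 | 김제시 | 1 | 지평선 시네마(작은영화관) | 2 | 99 | 김제시 검산동 62-1 청소년수련관
15 | 완주군 | 1 | 휴시네마(작은영화관) | 2 | 90 | 완주군 봉동읍 군산리 881
16 | 무주군 | 1 | 무주산골영화관(작은영화관) | 2 | 98 | 무주군 무주읍 한풍루로326-17
17 | 장수군 | 1 | 한누리시네마(작은영화관) | 2 | 90 | 한누리전당내가람관1층장수군(장수읍 두산리 472)
18 | 임실군 | 1 | 작은별영화관(작은영화관) | 2 | 94 | 국민회관지하1층(임실군임실읍호국로1703)
19 | 고창군 | 1 | 동리시네마(작은영화관) | 2 | 93 | 동리국악당지하1층(고창군고창읍판소리길20)
20 | 부안군 | 1 | 마실영화관(작은영화관) | 2 | 99 | 전북 부안군 부안읍 예술화관길 11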
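
Row 0 is a summary row (20 cinemas, 96 screens, 14,871 seats) with annotations in parentheses, and the seat counts carry thousands separators and surrounding whitespace. That is presumably why 좌석 is profiled as Text rather than as a number. The sketch below shows one possible way to pull the leading integer out of such a cell before re-profiling; `parse_seats` is a hypothetical helper, not part of the dataset or of ydata-profiling.

```python
import re

def parse_seats(raw: str) -> int:
    """Return the leading integer of a 좌석 cell such as ' 1,626'."""
    # Skip leading whitespace, then capture digits with optional thousands separators.
    match = re.match(r"\s*(\d[\d,]*)", raw)
    if match is None:
        raise ValueError(f"no leading number in {raw!r}")
    return int(match.group(1).replace(",", ""))

cells = [" 14,871 (14,208/663)", " 1,626", " 922"]
parsed = [parse_seats(cell) for cell in cells]
print(parsed)
```

Dropping the summary row before profiling would also remove the lone "20" in 영화관등록수 and the "96관(82/14)" value in 상영관(스크린), which are the values behind the Imbalance and High correlation alerts.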