Overview

Dataset statistics

Number of variables: 5
Number of observations: 1087
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1
Duplicate rows (%): 0.1%
Total size in memory: 43.7 KiB
Average record size in memory: 41.1 B

Variable types

Categorical: 2
Text: 2
Numeric: 1

Alerts

Duplicates: dataset has 1 (0.1%) duplicate row

Reproduction

Analysis started: 2023-12-10 22:31:42.790276
Analysis finished: 2023-12-10 22:31:43.577391
Duration: 0.79 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시군명 (city/county name)
Categorical

Distinct: 31
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Memory size: 8.6 KiB

고양시: 124
부천시: 102
파주시: 101
성남시: 89
용인시: 77
Other values (26): 594

Length

Max length: 4
Median length: 3
Mean length: 3.0717571
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 가평군
2nd row: 가평군
3rd row: 가평군
4th row: 가평군
5th row: 가평군

Common Values

Value | Count | Frequency (%)
고양시 | 124 | 11.4%
부천시 | 102 | 9.4%
파주시 | 101 | 9.3%
성남시 | 89 | 8.2%
용인시 | 77 | 7.1%
화성시 | 75 | 6.9%
안산시 | 72 | 6.6%
수원시 | 63 | 5.8%
의정부시 | 47 | 4.3%
광명시 | 33 | 3.0%
Other values (21) | 304 | 28.0%

Length

Histogram of lengths of the category

작품명 (title of the work)
Text

Distinct: 983
Distinct (%): 90.4%
Missing: 0
Missing (%): 0.0%
Memory size: 8.6 KiB

Length

Max length: 34
Median length: 23
Mean length: 6.1361546
Min length: 1

Characters and Unicode

Total characters: 6670
Distinct characters: 726
Distinct categories: 11
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
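
As a rough illustration of that idea, the sketch below uses Python's standard `unicodedata` module to tally Unicode general categories for the five sample values of this variable listed under Sample. It is not the profiler's own implementation; the helper name `category_counts` and the `CATEGORY_NAMES` mapping are assumptions made for this example.

```python
import unicodedata
from collections import Counter

# Readable names for the general-category codes that appear in this report.
# (Hypothetical mapping written for this example.)
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pc": "Connector Punctuation",
    "Sm": "Math Symbol",
}


def category_counts(values):
    """Count characters by Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        # Normalize to NFC so each Hangul syllable is a single code point.
        for ch in unicodedata.normalize("NFC", value):
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# The five sample titles shown for this variable.
titles = ["R2B:리턴투베이스", "드라마월드", "오싹한 연애", "성묘가는 길", "신의 선물"]
counts = category_counts(titles)

for name, n in counts.most_common():
    print(f"{name}: {n}")
```

Even on these five titles the tally is dominated by Other Letter, the category that Hangul syllables fall into, which is the same pattern the category table shows for the full column.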

Unique

Unique: 911
Unique (%): 83.8%

Sample

1st row: R2B:리턴투베이스
2nd row: 드라마월드
3rd row: 오싹한 연애
4th row: 성묘가는 길
5th row: 신의 선물

Value | Count | Frequency (%)
나의 | 12 | 0.7%
(not recovered) | 11 | 0.6%
이야기 | 9 | 0.5%
홍보영상 | 9 | 0.5%
무서운 | 7 | 0.4%
(not recovered) | 7 | 0.4%
마이 | 7 | 0.4%
남자 | 7 | 0.4%
연애 | 6 | 0.3%
리미트 | 6 | 0.3%
Other values (1380) | 1679 | 95.4%

Most occurring characters

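Value | Count | Frequency (%)
(space) | 673 | 10.1%
(not recovered) | 205 | 3.1%
(not recovered) | 114 | 1.7%
(not recovered) | 111 | 1.7%
(not recovered) | 93 | 1.4%
(not recovered) | 83 | 1.2%
(not recovered) | 77 | 1.2%
(not recovered) | 74 | 1.1%
(not recovered) | 56 | 0.8%
(not recovered) | 56 | 0.8%
Other values (716) | 5128 | 76.9%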

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 5199 | 77.9%
Space Separator | 673 | 10.1%
Lowercase Letter | 328 | 4.9%
Uppercase Letter | 256 | 3.8%
Decimal Number | 85 | 1.3%
Other Punctuation | 67 | 1.0%
Dash Punctuation | 33 | 0.5%
Close Punctuation | 10 | 0.1%
Open Punctuation | 10 | 0.1%
Math Symbol | 8 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not recovered) | 205 | 3.9%
(not recovered) | 114 | 2.2%
(not recovered) | 111 | 2.1%
(not recovered) | 93 | 1.8%
(not recovered) | 83 | 1.6%
(not recovered) | 77 | 1.5%
(not recovered) | 74 | 1.4%
(not recovered) | 56 | 1.1%
(not recovered) | 56 | 1.1%
(not recovered) | 56 | 1.1%
Other values (641) | 4274 | 82.2%

Uppercase Letter
Value | Count | Frequency (%)
S | 28 | 10.9%
B | 21 | 8.2%
A | 17 | 6.6%
N | 16 | 6.2%
R | 15 | 5.9%
K | 14 | 5.5%
E | 14 | 5.5%
D | 13 | 5.1%
T | 13 | 5.1%
O | 12 | 4.7%
Other values (15) | 93 | 36.3%

Lowercase Letter
Value | Count | Frequency (%)
e | 52 | 15.9%
a | 50 | 15.2%
r | 29 | 8.8%
o | 28 | 8.5%
l | 19 | 5.8%
n | 16 | 4.9%
s | 16 | 4.9%
h | 15 | 4.6%
i | 15 | 4.6%
t | 12 | 3.7%
Other values (12) | 76 | 23.2%

Other Punctuation
Value | Count | Frequency (%)
, | 18 | 26.9%
. | 17 | 25.4%
: | 14 | 20.9%
' | 8 | 11.9%
! | 2 | 3.0%
& | 2 | 3.0%
" | 2 | 3.0%
(not recovered) | 1 | 1.5%
% | 1 | 1.5%
? | 1 | 1.5%

Decimal Number
Value | Count | Frequency (%)
2 | 21 | 24.7%
1 | 17 | 20.0%
0 | 10 | 11.8%
9 | 7 | 8.2%
8 | 7 | 8.2%
3 | 6 | 7.1%
6 | 6 | 7.1%
7 | 4 | 4.7%
4 | 4 | 4.7%
5 | 3 | 3.5%

Math Symbol
Value | Count | Frequency (%)
< | 4 | 50.0%
> | 4 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 673 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 33 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 10 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 10 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 5197 | 77.9%
Common | 887 | 13.3%
Latin | 584 | 8.8%
Han | 2 | < 0.1%

Most frequent character per script

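Hangul
Value | Count | Frequency (%)
(not recovered) | 205 | 3.9%
(not recovered) | 114 | 2.2%
(not recovered) | 111 | 2.1%
(not recovered) | 93 | 1.8%
(not recovered) | 83 | 1.6%
(not recovered) | 77 | 1.5%
(not recovered) | 74 | 1.4%
(not recovered) | 56 | 1.1%
(not recovered) | 56 | 1.1%
(not recovered) | 56 | 1.1%
Other values (639) | 4272 | 82.2%

Latin
Value | Count | Frequency (%)
e | 52 | 8.9%
a | 50 | 8.6%
r | 29 | 5.0%
o | 28 | 4.8%
S | 28 | 4.8%
B | 21 | 3.6%
l | 19 | 3.3%
A | 17 | 2.9%
n | 16 | 2.7%
N | 16 | 2.7%
Other values (37) | 308 | 52.7%

Common
Value | Count | Frequency (%)
(space) | 673 | 75.9%
- | 33 | 3.7%
2 | 21 | 2.4%
, | 18 | 2.0%
. | 17 | 1.9%
1 | 17 | 1.9%
: | 14 | 1.6%
0 | 10 | 1.1%
) | 10 | 1.1%
( | 10 | 1.1%
Other values (18) | 64 | 7.2%

Han
Value | Count | Frequency (%)
(not recovered) | 1 | 50.0%
(not recovered) | 1 | 50.0%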

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 5197 | 77.9%
ASCII | 1470 | 22.0%
CJK | 2 | < 0.1%
Punctuation | 1 | < 0.1%

Most frequent character per block

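ASCII
Value | Count | Frequency (%)
(space) | 673 | 45.8%
e | 52 | 3.5%
a | 50 | 3.4%
- | 33 | 2.2%
r | 29 | 2.0%
o | 28 | 1.9%
S | 28 | 1.9%
B | 21 | 1.4%
2 | 21 | 1.4%
l | 19 | 1.3%
Other values (64) | 516 | 35.1%

Hangul
Value | Count | Frequency (%)
(not recovered) | 205 | 3.9%
(not recovered) | 114 | 2.2%
(not recovered) | 111 | 2.1%
(not recovered) | 93 | 1.8%
(not recovered) | 83 | 1.6%
(not recovered) | 77 | 1.5%
(not recovered) | 74 | 1.4%
(not recovered) | 56 | 1.1%
(not recovered) | 56 | 1.1%
(not recovered) | 56 | 1.1%
Other values (639) | 4272 | 82.2%

Punctuation
Value | Count | Frequency (%)
(not recovered) | 1 | 100.0%

CJK
Value | Count | Frequency (%)
(not recovered) | 1 | 50.0%
(not recovered) | 1 | 50.0%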

촬영구분명 (filming type name)
Categorical

Distinct: 27
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Memory size: 8.6 KiB

상업장편: 354
독립단편: 154
독립장편: 123
기타: 86
개인단편: 82
Other values (22): 288

Length

Max length: 8
Median length: 4
Mean length: 3.6531739
Min length: 2

Unique

Unique: 6
Unique (%): 0.6%

Sample

1st row: 상업장편
2nd row: 기타
3rd row: 상업장편
4th row: 학생단편
5th row: 독립장편

Common Values

Value | Count | Frequency (%)
상업장편 | 354 | 32.6%
독립단편 | 154 | 14.2%
독립장편 | 123 | 11.3%
기타 | 86 | 7.9%
개인단편 | 82 | 7.5%
TV | 81 | 7.5%
학생단편 | 43 | 4.0%
CF | 39 | 3.6%
TV드라마 | 34 | 3.1%
뮤직비디오 | 25 | 2.3%
Other values (17) | 66 | 6.1%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
상업장편 | 354 | 32.6%
독립단편 | 154 | 14.2%
독립장편 | 123 | 11.3%
기타 | 86 | 7.9%
tv | 86 | 7.9%
개인단편 | 82 | 7.5%
학생단편 | 43 | 4.0%
cf | 41 | 3.8%
tv드라마 | 34 | 3.1%
뮤직비디오 | 25 | 2.3%
Other values (14) | 59 | 5.4%

촬영년도 (filming year)
Real number (ℝ)

Distinct: 12
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2016.1205
Minimum: 2010
Maximum: 2021
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 9.7 KiB

Quantile statistics

Minimum: 2010
5-th percentile: 2010
Q1: 2013
Median: 2016
Q3: 2020
95-th percentile: 2021
Maximum: 2021
Range: 11
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 3.4552137
Coefficient of variation (CV): 0.0017137932
Kurtosis: -1.2590545
Mean: 2016.1205
Median Absolute Deviation (MAD): 3
Skewness: -0.11817956
Sum: 2191523
Variance: 11.938501
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)

Most common values
Value | Count | Frequency (%)
2020 | 152 | 14.0%
2021 | 125 | 11.5%
2012 | 106 | 9.8%
2015 | 93 | 8.6%
2016 | 93 | 8.6%
2014 | 90 | 8.3%
2019 | 90 | 8.3%
2017 | 88 | 8.1%
2013 | 86 | 7.9%
2010 | 56 | 5.2%
Other values (2) | 108 | 9.9%

Ten smallest values
Value | Count | Frequency (%)
2010 | 56 | 5.2%
2011 | 55 | 5.1%
2012 | 106 | 9.8%
2013 | 86 | 7.9%
2014 | 90 | 8.3%
2015 | 93 | 8.6%
2016 | 93 | 8.6%
2017 | 88 | 8.1%
2018 | 53 | 4.9%
2019 | 90 | 8.3%

Ten largest values
Value | Count | Frequency (%)
2021 | 125 | 11.5%
2020 | 152 | 14.0%
2019 | 90 | 8.3%
2018 | 53 | 4.9%
2017 | 88 | 8.1%
2016 | 93 | 8.6%
2015 | 93 | 8.6%
2014 | 90 | 8.3%
2013 | 86 | 7.9%
2012 | 106 | 9.8%
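
The summary figures reported for 촬영년도 can be cross-checked against the per-year counts in the tables above. The sketch below rebuilds the count, sum, mean, variance, standard deviation, and coefficient of variation from those counts. It assumes the reported variance uses the sample (n - 1) denominator, which is the choice that reproduces the reported value; the variable names are illustrative only.

```python
import math

# Year -> number of records, transcribed from the frequency tables above.
year_counts = {
    2010: 56, 2011: 55, 2012: 106, 2013: 86, 2014: 90, 2015: 93,
    2016: 93, 2017: 88, 2018: 53, 2019: 90, 2020: 152, 2021: 125,
}

n = sum(year_counts.values())                      # number of observations
total = sum(y * c for y, c in year_counts.items())  # "Sum" in the report
mean = total / n

# Assumption: sample variance with an (n - 1) denominator.
variance = sum(c * (y - mean) ** 2 for y, c in year_counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

print(f"n={n} sum={total} mean={mean:.4f}")
print(f"variance={variance:.6f} std={std:.7f} cv={cv:.10f}")
```

The counts add up to the 1087 observations and the sum of 2191523 listed for this variable, so the frequency tables and the descriptive statistics are mutually consistent.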

촬영장소명 (filming location name)
Text

Distinct: 879
Distinct (%): 80.9%
Missing: 0
Missing (%): 0.0%
Memory size: 8.6 KiB

Length

Max length: 28
Median length: 22
Mean length: 10.217111
Min length: 2

Characters and Unicode

Total characters: 11106
Distinct characters: 507
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 803
Unique (%): 73.9%

Sample

1st row: 자라섬 입구 배수펌프장
2nd row: 가평군 청심평화월드센터
3rd row: 산유리 75번 국도
4th row: 가평휴게소 상행선
5th row: 신천리 개인별장

Value | Count | Frequency (%)
고양시 | 62 | 2.5%
파주시 | 56 | 2.3%
도로 | 43 | 1.7%
부천시 | 42 | 1.7%
용인시 | 36 | 1.4%
미개통도로 | 33 | 1.3%
성남시 | 31 | 1.2%
안산 | 30 | 1.2%
수원시 | 28 | 1.1%
(not recovered) | 27 | 1.1%
Other values (1202) | 2097 | 84.4%

Most occurring characters

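Value | Count | Frequency (%)
(space) | 1399 | 12.6%
(not recovered) | 565 | 5.1%
(not recovered) | 250 | 2.3%
(not recovered) | 245 | 2.2%
(not recovered) | 204 | 1.8%
(not recovered) | 199 | 1.8%
(not recovered) | 194 | 1.7%
(not recovered) | 177 | 1.6%
(not recovered) | 175 | 1.6%
(not recovered) | 164 | 1.5%
Other values (497) | 7534 | 67.8%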

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 9455 | 85.1%
Space Separator | 1399 | 12.6%
Decimal Number | 112 | 1.0%
Uppercase Letter | 62 | 0.6%
Other Punctuation | 26 | 0.2%
Close Punctuation | 15 | 0.1%
Dash Punctuation | 14 | 0.1%
Open Punctuation | 12 | 0.1%
Lowercase Letter | 8 | 0.1%
Math Symbol | 3 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not recovered) | 565 | 6.0%
(not recovered) | 250 | 2.6%
(not recovered) | 245 | 2.6%
(not recovered) | 204 | 2.2%
(not recovered) | 199 | 2.1%
(not recovered) | 194 | 2.1%
(not recovered) | 177 | 1.9%
(not recovered) | 175 | 1.9%
(not recovered) | 164 | 1.7%
(not recovered) | 156 | 1.6%
Other values (451) | 7126 | 75.4%

Uppercase Letter
Value | Count | Frequency (%)
I | 8 | 12.9%
C | 8 | 12.9%
D | 5 | 8.1%
T | 5 | 8.1%
G | 5 | 8.1%
S | 4 | 6.5%
K | 4 | 6.5%
M | 3 | 4.8%
F | 3 | 4.8%
R | 3 | 4.8%
Other values (9) | 14 | 22.6%

Decimal Number
Value | Count | Frequency (%)
1 | 23 | 20.5%
2 | 21 | 18.8%
7 | 15 | 13.4%
4 | 12 | 10.7%
5 | 10 | 8.9%
3 | 9 | 8.0%
0 | 7 | 6.2%
9 | 6 | 5.4%
6 | 5 | 4.5%
8 | 4 | 3.6%

Lowercase Letter
Value | Count | Frequency (%)
i | 2 | 25.0%
c | 1 | 12.5%
s | 1 | 12.5%
y | 1 | 12.5%
r | 1 | 12.5%
d | 1 | 12.5%
a | 1 | 12.5%

Other Punctuation
Value | Count | Frequency (%)
/ | 11 | 42.3%
, | 7 | 26.9%
. | 6 | 23.1%
? | 1 | 3.8%
& | 1 | 3.8%

Space Separator
Value | Count | Frequency (%)
(space) | 1399 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 15 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 14 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 12 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 9455 | 85.1%
Common | 1581 | 14.2%
Latin | 70 | 0.6%

Most frequent character per script

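Hangul
Value | Count | Frequency (%)
(not recovered) | 565 | 6.0%
(not recovered) | 250 | 2.6%
(not recovered) | 245 | 2.6%
(not recovered) | 204 | 2.2%
(not recovered) | 199 | 2.1%
(not recovered) | 194 | 2.1%
(not recovered) | 177 | 1.9%
(not recovered) | 175 | 1.9%
(not recovered) | 164 | 1.7%
(not recovered) | 156 | 1.6%
Other values (451) | 7126 | 75.4%

Latin
Value | Count | Frequency (%)
I | 8 | 11.4%
C | 8 | 11.4%
D | 5 | 7.1%
T | 5 | 7.1%
G | 5 | 7.1%
S | 4 | 5.7%
K | 4 | 5.7%
M | 3 | 4.3%
F | 3 | 4.3%
R | 3 | 4.3%
Other values (16) | 22 | 31.4%

Common
Value | Count | Frequency (%)
(space) | 1399 | 88.5%
1 | 23 | 1.5%
2 | 21 | 1.3%
7 | 15 | 0.9%
) | 15 | 0.9%
- | 14 | 0.9%
( | 12 | 0.8%
4 | 12 | 0.8%
/ | 11 | 0.7%
5 | 10 | 0.6%
Other values (10) | 49 | 3.1%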

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 9455 | 85.1%
ASCII | 1651 | 14.9%

Most frequent character per block

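ASCII
Value | Count | Frequency (%)
(space) | 1399 | 84.7%
1 | 23 | 1.4%
2 | 21 | 1.3%
7 | 15 | 0.9%
) | 15 | 0.9%
- | 14 | 0.8%
( | 12 | 0.7%
4 | 12 | 0.7%
/ | 11 | 0.7%
5 | 10 | 0.6%
Other values (36) | 119 | 7.2%

Hangul
Value | Count | Frequency (%)
(not recovered) | 565 | 6.0%
(not recovered) | 250 | 2.6%
(not recovered) | 245 | 2.6%
(not recovered) | 204 | 2.2%
(not recovered) | 199 | 2.1%
(not recovered) | 194 | 2.1%
(not recovered) | 177 | 1.9%
(not recovered) | 175 | 1.9%
(not recovered) | 164 | 1.7%
(not recovered) | 156 | 1.6%
Other values (451) | 7126 | 75.4%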

Interactions


Correlations

Variable | 시군명 | 촬영구분명 | 촬영년도
시군명 | 1.000 | 0.395 | 0.374
촬영구분명 | 0.395 | 1.000 | 0.566
촬영년도 | 0.374 | 0.566 | 1.000

Variable | 촬영구분명 | 시군명
촬영구분명 | 1.000 | 0.101
시군명 | 0.101 | 1.000

Variable | 촬영년도 | 시군명 | 촬영구분명
촬영년도 | 1.000 | 0.134 | 0.238
시군명 | 0.134 | 1.000 | 0.101
촬영구분명 | 0.238 | 0.101 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

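First rows
Index | 시군명 | 작품명 | 촬영구분명 | 촬영년도 | 촬영장소명
0 | 가평군 | R2B:리턴투베이스 | 상업장편 | 2010 | 자라섬 입구 배수펌프장
1 | 가평군 | 드라마월드 | 기타 | 2020 | 가평군 청심평화월드센터
2 | 가평군 | 오싹한 연애 | 상업장편 | 2010 | 산유리 75번 국도
3 | 가평군 | 성묘가는 길 | 학생단편 | 2011 | 가평휴게소 상행선
4 | 가평군 | 신의 선물 | 독립장편 | 2012 | 신천리 개인별장
5 | 가평군 | 내가 하는 로맨스 | 개인단편 | 2013 | 청평역 주변 갓길
6 | 가평군 | 참 친절하시네요 | 상업장편 | 2014 | 북한강로 도로
7 | 가평군 | 우리는 형제입니다 | 상업장편 | 2014 | 현대도예문화원
8 | 가평군 | 참 친절하시네요 | 상업장편 | 2014 | 북한강로 도로
9 | 가평군 | 동희와 할매 | 개인단편 | 2014 | 읍내파출소

Last rows
Index | 시군명 | 작품명 | 촬영구분명 | 촬영년도 | 촬영장소명
1077 | 화성시 | 백청강-BALLAD | 뮤직비디오 | 2012 | 어섬비행장
1078 | 화성시 | 장재인-여름밤 | 뮤직비디오 | 2012 | 어섬 주변도로
1079 | 화성시 | 일말의 순정 | TV | 2013 | 동탄신도시센트럴파크
1080 | 화성시 | 커플링 | 개인단편 | 2013 | 우음도 주변 비포장도로
1081 | 화성시 | 아웃도어 화보 | 기타 | 2013 | 어섬 비행장
1082 | 화성시 | 열대야 | 개인단편 | 2013 | 어섬 비행장
1083 | 화성시 | 아직 잘 지내니 | 뮤직비디오 | 2013 | 어섬 비행장
1084 | 화성시 | 탑기어 코리아 | TV | 2013 | 어섬 비행장
1085 | 화성시 | 사냥꾼 | 개인단편 | 2014 | 어섬비행장
1086 | 화성시 | 나의 독재자 | 상업장편 | 2014 | 매화리 마을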

Duplicate rows

Most frequently occurring

Index | 시군명 | 작품명 | 촬영구분명 | 촬영년도 | 촬영장소명 | # duplicates
0 | 가평군 | 참 친절하시네요 | 상업장편 | 2014 | 북한강로 도로 | 2
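
One way to see where this duplicate comes from is to count identical rows directly. The sketch below does that with the standard library on the first ten rows of the Sample section; it is an illustration only, not the profiler's own duplicate detection, and the helper name `duplicate_groups` is made up for this example.

```python
from collections import Counter

# First ten rows of the Sample section:
# (시군명, 작품명, 촬영구분명, 촬영년도, 촬영장소명)
rows = [
    ("가평군", "R2B:리턴투베이스", "상업장편", 2010, "자라섬 입구 배수펌프장"),
    ("가평군", "드라마월드", "기타", 2020, "가평군 청심평화월드센터"),
    ("가평군", "오싹한 연애", "상업장편", 2010, "산유리 75번 국도"),
    ("가평군", "성묘가는 길", "학생단편", 2011, "가평휴게소 상행선"),
    ("가평군", "신의 선물", "독립장편", 2012, "신천리 개인별장"),
    ("가평군", "내가 하는 로맨스", "개인단편", 2013, "청평역 주변 갓길"),
    ("가평군", "참 친절하시네요", "상업장편", 2014, "북한강로 도로"),
    ("가평군", "우리는 형제입니다", "상업장편", 2014, "현대도예문화원"),
    ("가평군", "참 친절하시네요", "상업장편", 2014, "북한강로 도로"),
    ("가평군", "동희와 할매", "개인단편", 2014, "읍내파출소"),
]


def duplicate_groups(records):
    """Return {row: occurrences} for every row that appears more than once."""
    counts = Counter(records)
    return {row: n for row, n in counts.items() if n > 1}


dupes = duplicate_groups(rows)
# Rows beyond the first occurrence in each group count as duplicates.
extra_rows = sum(n - 1 for n in dupes.values())

for row, n in dupes.items():
    print(row, "occurs", n, "times")
print("duplicate rows:", extra_rows)
```

Rows 6 and 8 of the sample are identical, which gives one group with two occurrences and therefore one duplicate row, matching the count in the Overview.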