Overview

Dataset statistics

Number of variables: 4
Number of observations: 86
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.9 KiB
Average record size in memory: 34.5 B

Variable types

Numeric: 1
Categorical: 1
Text: 2

Dataset

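Description: Provides a variety of tourism information for Gyeongsangbuk-do related to experiences, attractions, and leisure sports, including regional festival information, temple stays, and tourist information centers. (This dataset contains information on regional festivals and cultural events in Gyeongsangbuk-do.)
Author: Gyeongsangbuk-do (경상북도)
URL: https://www.data.go.kr/data/15052261/fileData.do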

Alerts

연번 (serial number) is highly overall correlated with 시군 (city/county): High correlation
시군 is highly overall correlated with 연번: High correlation
연번 has unique values: Unique
축 제 명 (festival name) has unique values: Unique

Reproduction

Analysis started: 2023-12-12 18:08:51.907627
Analysis finished: 2023-12-12 18:08:52.355168
Duration: 0.45 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
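
The reported duration follows from the start and finish timestamps. A minimal check using only the Python standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction block of this report.
started = datetime.fromisoformat("2023-12-12 18:08:51.907627")
finished = datetime.fromisoformat("2023-12-12 18:08:52.355168")

# Elapsed wall-clock time in seconds, rounded to two decimals as in the report.
duration_seconds = round((finished - started).total_seconds(), 2)
print(f"Duration: {duration_seconds} seconds")
```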

Variables

연번
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 86
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 43.5
Minimum: 1
Maximum: 86
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 906.0 B

Quantile statistics

Minimum: 1
5-th percentile: 5.25
Q1: 22.25
Median: 43.5
Q3: 64.75
95-th percentile: 81.75
Maximum: 86
Range: 85
Interquartile range (IQR): 42.5
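
These quantiles can be reproduced by hand. The sketch below assumes that 연번 takes the integer values 1 through 86 (consistent with the minimum, maximum, distinct count, and sum in this report) and uses linear interpolation between order statistics; whether the profiler uses exactly this routine is not stated in the report.

```python
import statistics

# Assumption: 연번 is the serial number 1..86.
values = list(range(1, 87))

# method="inclusive" interpolates linearly and includes the min and max.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
twentieths = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]

value_range = max(values) - min(values)
iqr = q3 - q1
print(p5, q1, median, q3, p95, value_range, iqr)
```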

Descriptive statistics

Standard deviation: 24.969982
Coefficient of variation (CV): 0.57402257
Kurtosis: -1.2
Mean: 43.5
Median Absolute Deviation (MAD): 21.5
Skewness: 0
Sum: 3741
Variance: 623.5
Monotonicity: Strictly increasing
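
Most of these figures can be checked with the Python standard library. This is a sketch under the assumption that 연번 is exactly the sequence 1..86; it uses the sample variance (n - 1 denominator), which matches the reported value.

```python
import statistics

# Assumption: 연번 is the serial number 1..86.
values = [float(i) for i in range(1, 87)]

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std = statistics.stdev(values)
cv = std / mean                         # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)
total = sum(values)

print(mean, variance, round(std, 6), round(cv, 8), mad, total)
```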
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | 1.2%
56 | 1 | 1.2%
64 | 1 | 1.2%
63 | 1 | 1.2%
62 | 1 | 1.2%
61 | 1 | 1.2%
60 | 1 | 1.2%
59 | 1 | 1.2%
58 | 1 | 1.2%
57 | 1 | 1.2%
Other values (76) | 76 | 88.4%

Value | Count | Frequency (%)
1 | 1 | 1.2%
2 | 1 | 1.2%
3 | 1 | 1.2%
4 | 1 | 1.2%
5 | 1 | 1.2%
6 | 1 | 1.2%
7 | 1 | 1.2%
8 | 1 | 1.2%
9 | 1 | 1.2%
10 | 1 | 1.2%

Value | Count | Frequency (%)
86 | 1 | 1.2%
85 | 1 | 1.2%
84 | 1 | 1.2%
83 | 1 | 1.2%
82 | 1 | 1.2%
81 | 1 | 1.2%
80 | 1 | 1.2%
79 | 1 | 1.2%
78 | 1 | 1.2%
77 | 1 | 1.2%

시군
Categorical

HIGH CORRELATION 

Distinct: 23
Distinct (%): 26.7%
Missing: 0
Missing (%): 0.0%
Memory size: 820.0 B

Category | Count
포항 | 12
경주 | 6
안동 | 5
구미 | 5
영주 | 5
Other values (18) | 53

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 4
Unique (%): 4.7%
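
The report separates "Distinct" (how many different values occur) from "Unique" (how many values occur exactly once). A small sketch of that distinction, using a hypothetical toy column rather than the real 시군 data:

```python
from collections import Counter

# Hypothetical toy column; NOT the real 시군 values.
column = ["포항", "포항", "경주", "안동", "안동", "울릉"]

counts = Counter(column)
distinct = len(counts)                              # different values
unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

# The percentages in the report follow the same pattern, with n = 86 rows.
reported_distinct_pct = round(100 * 23 / 86, 1)
reported_unique_pct = round(100 * 4 / 86, 1)
print(distinct, unique, reported_distinct_pct, reported_unique_pct)
```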

Sample

1st row: 포항
2nd row: 포항
3rd row: 포항
4th row: 포항
5th row: 포항

Common Values

Value | Count | Frequency (%)
포항 | 12 | 14.0%
경주 | 6 | 7.0%
안동 | 5 | 5.8%
구미 | 5 | 5.8%
영주 | 5 | 5.8%
영천 | 5 | 5.8%
문경 | 5 | 5.8%
예천 | 5 | 5.8%
영덕 | 4 | 4.7%
상주 | 4 | 4.7%
Other values (13) | 30 | 34.9%
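
The frequency column is each count divided by the 86 observations. A short sketch that recomputes it from the counts in the Common Values table (the dictionary name is illustrative):

```python
# Counts copied from the Common Values table for 시군.
top_counts = {
    "포항": 12, "경주": 6, "안동": 5, "구미": 5, "영주": 5,
    "영천": 5, "문경": 5, "예천": 5, "영덕": 4, "상주": 4,
}
n_rows = 86

# Everything not listed individually falls into "Other values".
other_count = n_rows - sum(top_counts.values())

frequency_pct = {name: round(100 * c / n_rows, 1) for name, c in top_counts.items()}
other_pct = round(100 * other_count / n_rows, 1)

for name, pct in frequency_pct.items():
    print(f"{name}: {pct}%")
print(f"Other values: {other_count} ({other_pct}%)")
```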

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
포항 | 12 | 14.0%
경주 | 6 | 7.0%
안동 | 5 | 5.8%
구미 | 5 | 5.8%
영주 | 5 | 5.8%
영천 | 5 | 5.8%
문경 | 5 | 5.8%
예천 | 5 | 5.8%
영덕 | 4 | 4.7%
상주 | 4 | 4.7%
Other values (13) | 30 | 34.9%

축 제 명
Text

UNIQUE 

Distinct: 86
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 820.0 B

Length

Max length: 20
Median length: 16.5
Mean length: 10.593023
Min length: 4

Characters and Unicode

Total characters: 911
Distinct characters: 231
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
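
As an illustration of such character properties, the sketch below counts Unicode general categories with Python's `unicodedata` module, using the five sample festival names shown in this report. It covers only these five strings, so the counts are not the report's totals. The standard library exposes general categories but not scripts or blocks, so only categories are shown.

```python
import unicodedata
from collections import Counter

# The five sample values of 축 제 명 listed in this report.
samples = [
    "포항 구룡포 대게 축제",
    "포항해병대문화축제",
    "포항 호미곶 돌문어 축제",
    "포항국제불빛축제",
    "포항 영일만 검은 돌장어 축제",
]

# Readable names for the two-letter general-category codes used here.
CATEGORY_NAMES = {"Lo": "Other Letter", "Zs": "Space Separator", "Nd": "Decimal Number"}

category_counts = Counter(
    unicodedata.category(ch) for text in samples for ch in text
)

for code, count in category_counts.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
```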

Unique

Unique: 86
Unique (%): 100.0%

Sample

1st row: 포항 구룡포 대게 축제
2nd row: 포항해병대문화축제
3rd row: 포항 호미곶 돌문어 축제
4th row: 포항국제불빛축제
5th row: 포항 영일만 검은 돌장어 축제

Value | Count | Frequency (%)
2021 | 8 | 5.0%
축제 | 8 | 5.0%
포항 | 5 | 3.1%
영덕 | 2 | 1.2%
2021년 | 2 | 1.2%
구룡포 | 2 | 1.2%
영덕황금은어축제 | 1 | 0.6%
문경사과축제 | 1 | 0.6%
축산항 | 1 | 0.6%
festival | 1 | 0.6%
Other values (129) | 129 | 80.6%

Most occurring characters

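Value | Count | Frequency (%)
(not preserved) | 88 | 9.7%
(space) | 75 | 8.2%
(not preserved) | 70 | 7.7%
2 | 33 | 3.6%
(not preserved) | 17 | 1.9%
(not preserved) | 16 | 1.8%
0 | 15 | 1.6%
1 | 15 | 1.6%
(not preserved) | 15 | 1.6%
(not preserved) | 15 | 1.6%
Other values (221) | 552 | 60.6%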

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 740 | 81.2%
Decimal Number | 77 | 8.5%
Space Separator | 75 | 8.2%
Lowercase Letter | 9 | 1.0%
Other Punctuation | 5 | 0.5%
Uppercase Letter | 4 | 0.4%
Dash Punctuation | 1 | 0.1%

Most frequent character per category

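Other Letter
Value | Count | Frequency (%)
(not preserved) | 88 | 11.9%
(not preserved) | 70 | 9.5%
(not preserved) | 17 | 2.3%
(not preserved) | 16 | 2.2%
(not preserved) | 15 | 2.0%
(not preserved) | 15 | 2.0%
(not preserved) | 14 | 1.9%
(not preserved) | 13 | 1.8%
(not preserved) | 13 | 1.8%
(not preserved) | 12 | 1.6%
Other values (193) | 467 | 63.1%

Decimal Number
Value | Count | Frequency (%)
2 | 33 | 42.9%
0 | 15 | 19.5%
1 | 15 | 19.5%
8 | 4 | 5.2%
4 | 4 | 5.2%
9 | 2 | 2.6%
3 | 1 | 1.3%
7 | 1 | 1.3%
5 | 1 | 1.3%
6 | 1 | 1.3%

Lowercase Letter
Value | Count | Frequency (%)
e | 1 | 11.1%
s | 1 | 11.1%
l | 1 | 11.1%
a | 1 | 11.1%
v | 1 | 11.1%
i | 1 | 11.1%
t | 1 | 11.1%
k | 1 | 11.1%
m | 1 | 11.1%

Uppercase Letter
Value | Count | Frequency (%)
T | 1 | 25.0%
H | 1 | 25.0%
F | 1 | 25.0%
O | 1 | 25.0%

Other Punctuation
Value | Count | Frequency (%)
. | 3 | 60.0%
& | 1 | 20.0%
, | 1 | 20.0%

Space Separator
Value | Count | Frequency (%)
(space) | 75 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%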

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 740 | 81.2%
Common | 158 | 17.3%
Latin | 13 | 1.4%

Most frequent character per script

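Hangul
Value | Count | Frequency (%)
(not preserved) | 88 | 11.9%
(not preserved) | 70 | 9.5%
(not preserved) | 17 | 2.3%
(not preserved) | 16 | 2.2%
(not preserved) | 15 | 2.0%
(not preserved) | 15 | 2.0%
(not preserved) | 14 | 1.9%
(not preserved) | 13 | 1.8%
(not preserved) | 13 | 1.8%
(not preserved) | 12 | 1.6%
Other values (193) | 467 | 63.1%

Common
Value | Count | Frequency (%)
(space) | 75 | 47.5%
2 | 33 | 20.9%
0 | 15 | 9.5%
1 | 15 | 9.5%
8 | 4 | 2.5%
4 | 4 | 2.5%
. | 3 | 1.9%
9 | 2 | 1.3%
- | 1 | 0.6%
3 | 1 | 0.6%
Other values (5) | 5 | 3.2%

Latin
Value | Count | Frequency (%)
T | 1 | 7.7%
H | 1 | 7.7%
F | 1 | 7.7%
O | 1 | 7.7%
e | 1 | 7.7%
s | 1 | 7.7%
l | 1 | 7.7%
a | 1 | 7.7%
v | 1 | 7.7%
i | 1 | 7.7%
Other values (3) | 3 | 23.1%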

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 740 | 81.2%
ASCII | 171 | 18.8%

Most frequent character per block

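Hangul
Value | Count | Frequency (%)
(not preserved) | 88 | 11.9%
(not preserved) | 70 | 9.5%
(not preserved) | 17 | 2.3%
(not preserved) | 16 | 2.2%
(not preserved) | 15 | 2.0%
(not preserved) | 15 | 2.0%
(not preserved) | 14 | 1.9%
(not preserved) | 13 | 1.8%
(not preserved) | 13 | 1.8%
(not preserved) | 12 | 1.6%
Other values (193) | 467 | 63.1%

ASCII
Value | Count | Frequency (%)
(space) | 75 | 43.9%
2 | 33 | 19.3%
0 | 15 | 8.8%
1 | 15 | 8.8%
8 | 4 | 2.3%
4 | 4 | 2.3%
. | 3 | 1.8%
9 | 2 | 1.2%
T | 1 | 0.6%
H | 1 | 0.6%
Other values (18) | 18 | 10.5%
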
기 간
Text

Distinct: 73
Distinct (%): 84.9%
Missing: 0
Missing (%): 0.0%
Memory size: 820.0 B

Length

Max length: 18
Median length: 14
Mean length: 10.244186
Min length: 2

Characters and Unicode

Total characters: 881
Distinct characters: 23
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 63
Unique (%): 73.3%

Sample

1st row: 12월중
2nd row: 10월중
3rd row: 4월중
4th row: 11-19~11- 21(3일)
5th row: 10월초

Value | Count | Frequency (%)
(not preserved) | 12 | 9.2%
(not preserved) | 9 | 6.9%
10월 | 5 | 3.8%
중(2일간 | 5 | 3.8%
9월 | 5 | 3.8%
10월중 | 4 | 3.1%
1-1 | 3 | 2.3%
12-31 | 3 | 2.3%
17(10일 | 2 | 1.5%
중(3일간 | 2 | 1.5%
Other values (73) | 81 | 61.8%

Most occurring characters

Value | Count | Frequency (%)
1 | 144 | 16.3%
- | 75 | 8.5%
0 | 71 | 8.1%
) | 55 | 6.2%
( | 55 | 6.2%
(not preserved) | 53 | 6.0%
~ | 53 | 6.0%
(not preserved) | 49 | 5.6%
(space) | 47 | 5.3%
2 | 44 | 5.0%
Other values (13) | 235 | 26.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 411 | 46.7%
Other Letter | 185 | 21.0%
Dash Punctuation | 75 | 8.5%
Close Punctuation | 55 | 6.2%
Open Punctuation | 55 | 6.2%
Math Symbol | 53 | 6.0%
Space Separator | 47 | 5.3%

Most frequent character per category

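Decimal Number
Value | Count | Frequency (%)
1 | 144 | 35.0%
0 | 71 | 17.3%
2 | 44 | 10.7%
3 | 35 | 8.5%
9 | 30 | 7.3%
4 | 22 | 5.4%
5 | 20 | 4.9%
7 | 18 | 4.4%
6 | 14 | 3.4%
8 | 13 | 3.2%

Other Letter
Value | Count | Frequency (%)
(not preserved) | 53 | 28.6%
(not preserved) | 49 | 26.5%
(not preserved) | 39 | 21.1%
(not preserved) | 34 | 18.4%
(not preserved) | 3 | 1.6%
(not preserved) | 3 | 1.6%
(not preserved) | 2 | 1.1%
(not preserved) | 2 | 1.1%

Dash Punctuation
Value | Count | Frequency (%)
- | 75 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 55 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 55 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 53 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 47 | 100.0%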

Most occurring scripts

Value | Count | Frequency (%)
Common | 696 | 79.0%
Hangul | 185 | 21.0%

Most frequent character per script

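Common
Value | Count | Frequency (%)
1 | 144 | 20.7%
- | 75 | 10.8%
0 | 71 | 10.2%
) | 55 | 7.9%
( | 55 | 7.9%
~ | 53 | 7.6%
(space) | 47 | 6.8%
2 | 44 | 6.3%
3 | 35 | 5.0%
9 | 30 | 4.3%
Other values (5) | 87 | 12.5%

Hangul
Value | Count | Frequency (%)
(not preserved) | 53 | 28.6%
(not preserved) | 49 | 26.5%
(not preserved) | 39 | 21.1%
(not preserved) | 34 | 18.4%
(not preserved) | 3 | 1.6%
(not preserved) | 3 | 1.6%
(not preserved) | 2 | 1.1%
(not preserved) | 2 | 1.1%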

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 696 | 79.0%
Hangul | 185 | 21.0%

Most frequent character per block

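ASCII
Value | Count | Frequency (%)
1 | 144 | 20.7%
- | 75 | 10.8%
0 | 71 | 10.2%
) | 55 | 7.9%
( | 55 | 7.9%
~ | 53 | 7.6%
(space) | 47 | 6.8%
2 | 44 | 6.3%
3 | 35 | 5.0%
9 | 30 | 4.3%
Other values (5) | 87 | 12.5%

Hangul
Value | Count | Frequency (%)
(not preserved) | 53 | 28.6%
(not preserved) | 49 | 26.5%
(not preserved) | 39 | 21.1%
(not preserved) | 34 | 18.4%
(not preserved) | 3 | 1.6%
(not preserved) | 3 | 1.6%
(not preserved) | 2 | 1.1%
(not preserved) | 2 | 1.1%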

Interactions


Correlations

 | 연번 | 시군 | 축 제 명 | 기 간
연번 | 1.000 | 0.980 | 1.000 | 0.866
시군 | 0.980 | 1.000 | 1.000 | 0.961
축 제 명 | 1.000 | 1.000 | 1.000 | 1.000
기 간 | 0.866 | 0.961 | 1.000 | 1.000

 | 연번 | 시군
연번 | 1.000 | 0.824
시군 | 0.824 | 1.000
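
A serial number can correlate strongly with a categorical column simply because the rows are grouped by that category, as the sample rows in this report suggest for 시군. The sketch below illustrates the effect with the correlation ratio (eta), a simple numeric-versus-categorical measure. It is not necessarily the measure the profiler used, and the toy data and the function name `correlation_ratio` are hypothetical.

```python
from statistics import mean


def correlation_ratio(categories, values):
    """Return eta: sqrt of between-group sum of squares over total sum of squares."""
    overall = mean(values)
    groups = {}
    for cat, val in zip(categories, values):
        groups.setdefault(cat, []).append(val)
    ss_between = sum(len(g) * (mean(g) - overall) ** 2 for g in groups.values())
    ss_total = sum((v - overall) ** 2 for v in values)
    return (ss_between / ss_total) ** 0.5


# Hypothetical toy data: rows grouped by region and numbered in order.
regions = ["A", "A", "A", "B", "B", "C", "C", "C"]
serial_grouped = [1, 2, 3, 4, 5, 6, 7, 8]
eta_grouped = correlation_ratio(regions, serial_grouped)

# Same regions, but serial numbers spread across the groups.
serial_mixed = [1, 4, 7, 2, 5, 3, 6, 8]
eta_mixed = correlation_ratio(regions, serial_mixed)

print(round(eta_grouped, 3), round(eta_mixed, 3))
```

In the grouped case the serial number is almost fully determined by the region, so eta is close to 1; in the mixed case it drops sharply.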

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index) | 연번 | 시군 | 축 제 명 | 기 간
0 | 1 | 포항 | 포항 구룡포 대게 축제 | 12월중
1 | 2 | 포항 | 포항해병대문화축제 | 10월중
2 | 3 | 포항 | 포항 호미곶 돌문어 축제 | 4월중
3 | 4 | 포항 | 포항국제불빛축제 | 11-19~11- 21(3일)
4 | 5 | 포항 | 포항 영일만 검은 돌장어 축제 | 10월초
5 | 6 | 포항 | 포항운하축제 | 9월 중
6 | 7 | 포항 | 2021 포항스틸아트페스티벌 | 10-15~10-30(15일)
7 | 8 | 포항 | 포항 수산물페스티벌 | 9월 중
8 | 9 | 포항 | 제14회 부조장터문화축제 | 10-16~17
9 | 10 | 포항 | 포항 구룡포 과메기축제 | 11월 중

(index) | 연번 | 시군 | 축 제 명 | 기 간
76 | 77 | 예천 | 2021 예천 세계활축제 | 10-15~10-17(3일간)
77 | 78 | 예천 | 2021 삼강주막 나루터축제 | 9-20~9-22(3일간)
78 | 79 | 봉화 | 제23회 봉화은어축제 | 7-31~8-8(9일간)
79 | 80 | 봉화 | 제25회 봉화송이축제 | 9월 중(4일간)
80 | 81 | 봉화 | 시장애 불금축제 | 5~9월(150일간)
81 | 82 | 봉화 | 2020-2021 분천산타마을운영 | 12~2월(60일간)
82 | 83 | 울진 | 울진대게와붉은대게축제 | 2월중
83 | 84 | 울진 | 울진금강송송이축제 | 10-1~10-4(4일간)
84 | 85 | 울진 | 죽변항 수산물 축제 | 12-11~12-13(3일간)
85 | 86 | 울릉 | 울릉도오징어축제 | 10월중(3일간)