Overview

Dataset statistics

Number of variables: 7
Number of observations: 172
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 2
Duplicate rows (%): 1.2%
Total size in memory: 9.5 KiB
Average record size in memory: 56.8 B
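
The derived figures above follow from the raw counts. A quick arithmetic check, assuming the report rounds to one decimal place and that 1 KiB is 1024 bytes (both assumptions, not stated in the report):

```python
# Raw figures copied from the dataset statistics.
observations = 172
duplicate_rows = 2
avg_record_bytes = 56.8

# Share of duplicate rows, rounded to one decimal place.
duplicate_pct = round(100 * duplicate_rows / observations, 1)

# Total size implied by the average record size, in KiB.
total_kib = round(avg_record_bytes * observations / 1024, 1)

print(duplicate_pct)  # 1.2
print(total_kib)      # 9.5
```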

Variable types

Unsupported: 3
Categorical: 2
Text: 2

Dataset

Description: Status of hosting domestic training-camp teams, 2014 (국내전지훈련선수단유치현황2014)
Author: Jeollabuk-do (전라북도)
URL: https://www.bigdatahub.go.kr/opendata/dataSet/detail.nm?contentId=37&rlik=49451aebf056b486&serviceId=202032

Alerts

Dataset has 2 (1.2%) duplicate rows [Duplicates]
종목 is highly correlated overall with 장소 [High correlation]
장소 is highly correlated overall with 종목 [High correlation]
(unnamed column) is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
인원 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
연인원 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
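
An Unsupported alert usually means a column mixes value types. Judging from the sample and duplicate rows elsewhere in this report, one plausible cause is that the numeric columns hold placeholder dashes and repeated header strings alongside numbers. That is an assumption; the report does not state it. A minimal cleaning sketch using only the standard library, where `to_int_or_none` is a hypothetical helper name:

```python
from typing import Optional


def to_int_or_none(value: object) -> Optional[int]:
    """Return value as an int, or None for placeholders and header text."""
    if isinstance(value, bool):  # bool is a subclass of int; treat it as invalid
        return None
    if isinstance(value, int):
        return value
    if isinstance(value, float):
        return int(value) if value.is_integer() else None
    text = str(value).replace(",", "").strip()
    # Only plain ASCII digit strings count; "-" or "인 원" are rejected.
    return int(text) if text.isascii() and text.isdigit() else None


# Hypothetical raw values mixing numbers, a placeholder dash and a header cell.
raw = [35, "40", "-", "인 원", 60.0, "1,250"]
cleaned = [to_int_or_none(v) for v in raw]
print(cleaned)  # [35, 40, None, None, 60, 1250]
```

After a conversion like this the column holds only integers and missing values, which a profiler can then treat as numeric.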

Reproduction

Analysis started: 2024-03-14 00:40:03.894954
Analysis finished: 2024-03-14 00:40:04.301924
Duration: 0.41 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables


(unnamed column)
Unsupported

REJECTED  UNSUPPORTED

Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

종목 (sport)
Categorical

HIGH CORRELATION

Distinct: 25
Distinct (%): 14.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

야구: 37
펜싱: 22
정구: 14
축구: 13
육상: 12
Other values (20): 74

Length

Max length: 6
Median length: 2
Mean length: 2.3488372
Min length: 1

Unique

Unique: 5
Unique (%): 2.9%

Sample

1st row: 축구
2nd row: 축구
3rd row: 축구
4th row: 축구
5th row: 축구

Common Values

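Value | Count | Frequency (%)
야구 (baseball) | 37 | 21.5%
펜싱 (fencing) | 22 | 12.8%
정구 (soft tennis) | 14 | 8.1%
축구 (football) | 13 | 7.6%
육상 (athletics) | 12 | 7.0%
아이스하키 (ice hockey) | 9 | 5.2%
유도 (judo) | 9 | 5.2%
수영 (swimming) | 7 | 4.1%
핸드볼 (handball) | 6 | 3.5%
배드민턴 (badminton) | 5 | 2.9%
Other values (15) | 38 | 22.1%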

Length

[Figure: Histogram of lengths of the category]

팀명 (team name)
Text

Distinct: 147
Distinct (%): 85.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Length

Max length: 17
Median length: 8
Mean length: 7.4593023
Min length: 1

Characters and Unicode

Total characters: 1283
Distinct characters: 156
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 134
Unique (%): 77.9%

Sample

1st row: 부산 U-12세팀
2nd row: 김포 JD FC
3rd row: 대전 사커클럽
4th row: 구리 주니어팀
5th row: 광명 유소년팀

Value | Count | Frequency (%)
(blank in source) | 34 | 6.9%
경기 | 31 | 6.3%
서울 | 23 | 4.7%
(blank in source) | 16 | 3.2%
(blank in source) | 16 | 3.2%
전남 | 15 | 3.0%
국가대표 | 12 | 2.4%
충남 | 10 | 2.0%
인천 | 9 | 1.8%
광주 | 9 | 1.8%
Other values (178) | 318 | 64.5%

Most occurring characters

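Value | Count | Frequency (%)
(space) | 321 | 25.0%
(blank in source) | 51 | 4.0%
(blank in source) | 51 | 4.0%
(blank in source) | 42 | 3.3%
(blank in source) | 38 | 3.0%
(blank in source) | 33 | 2.6%
(blank in source) | 32 | 2.5%
(blank in source) | 30 | 2.3%
(blank in source) | 29 | 2.3%
(blank in source) | 27 | 2.1%
Other values (146) | 629 | 49.0%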

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 931 | 72.6%
Space Separator | 321 | 25.0%
Uppercase Letter | 11 | 0.9%
Dash Punctuation | 6 | 0.5%
Decimal Number | 6 | 0.5%
Open Punctuation | 4 | 0.3%
Close Punctuation | 4 | 0.3%
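
The category names above are Unicode general categories. As a rough illustration of how such counts can be produced, the sketch below tallies categories with Python's standard `unicodedata` module for two of the sample team names. The mapping from two-letter codes to the long names used in this report is written out by hand and covers only the categories listed above; the profiler's own implementation may differ.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in this column.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Lu": "Uppercase Letter",
    "Pd": "Dash Punctuation",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw code for categories not in the mapping.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


counts = category_counts(["부산 U-12세팀", "김포 JD FC"])
print(counts["Other Letter"])      # 6
print(counts["Uppercase Letter"])  # 5
print(counts["Space Separator"])   # 3
```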

Most frequent character per category

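Other Letter
Value | Count | Frequency (%)
(blank in source) | 51 | 5.5%
(blank in source) | 51 | 5.5%
(blank in source) | 42 | 4.5%
(blank in source) | 38 | 4.1%
(blank in source) | 33 | 3.5%
(blank in source) | 32 | 3.4%
(blank in source) | 30 | 3.2%
(blank in source) | 29 | 3.1%
(blank in source) | 27 | 2.9%
(blank in source) | 26 | 2.8%
Other values (134) | 572 | 61.4%

Uppercase Letter
Value | Count | Frequency (%)
C | 4 | 36.4%
F | 4 | 36.4%
U | 1 | 9.1%
J | 1 | 9.1%
D | 1 | 9.1%

Decimal Number
Value | Count | Frequency (%)
1 | 3 | 50.0%
6 | 2 | 33.3%
2 | 1 | 16.7%

Space Separator
Value | Count | Frequency (%)
(space) | 321 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 6 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 4 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 4 | 100.0%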

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 931 | 72.6%
Common | 341 | 26.6%
Latin | 11 | 0.9%

Most frequent character per script

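Hangul
Value | Count | Frequency (%)
(blank in source) | 51 | 5.5%
(blank in source) | 51 | 5.5%
(blank in source) | 42 | 4.5%
(blank in source) | 38 | 4.1%
(blank in source) | 33 | 3.5%
(blank in source) | 32 | 3.4%
(blank in source) | 30 | 3.2%
(blank in source) | 29 | 3.1%
(blank in source) | 27 | 2.9%
(blank in source) | 26 | 2.8%
Other values (134) | 572 | 61.4%

Common
Value | Count | Frequency (%)
(space) | 321 | 94.1%
- | 6 | 1.8%
( | 4 | 1.2%
) | 4 | 1.2%
1 | 3 | 0.9%
6 | 2 | 0.6%
2 | 1 | 0.3%

Latin
Value | Count | Frequency (%)
C | 4 | 36.4%
F | 4 | 36.4%
U | 1 | 9.1%
J | 1 | 9.1%
D | 1 | 9.1%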

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 931 | 72.6%
ASCII | 352 | 27.4%

Most frequent character per block

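ASCII
Value | Count | Frequency (%)
(space) | 321 | 91.2%
- | 6 | 1.7%
C | 4 | 1.1%
( | 4 | 1.1%
) | 4 | 1.1%
F | 4 | 1.1%
1 | 3 | 0.9%
6 | 2 | 0.6%
U | 1 | 0.3%
2 | 1 | 0.3%
Other values (2) | 2 | 0.6%

Hangul
Value | Count | Frequency (%)
(blank in source) | 51 | 5.5%
(blank in source) | 51 | 5.5%
(blank in source) | 42 | 4.5%
(blank in source) | 38 | 4.1%
(blank in source) | 33 | 3.5%
(blank in source) | 32 | 3.4%
(blank in source) | 30 | 3.2%
(blank in source) | 29 | 3.1%
(blank in source) | 27 | 2.9%
(blank in source) | 26 | 2.8%
Other values (134) | 572 | 61.4%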

기간 (period)
Text

Distinct: 89
Distinct (%): 51.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Length

Max length: 9
Median length: 7
Mean length: 6.9883721
Min length: 1

Characters and Unicode

Total characters: 1202
Distinct characters: 18
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 66
Unique (%): 38.4%

Sample

1st row: 1. 8∼15
2nd row: 1. 8∼15
3rd row: 1. 8∼15
4th row: 1. 8∼15
5th row: 1. 8∼15

Value | Count | Frequency (%)
1 | 19 | 8.6%
2.13∼22 | 13 | 5.9%
8∼15 | 13 | 5.9%
2 | 12 | 5.4%
7.23〜8.31 | 11 | 5.0%
2.15∼21 | 10 | 4.5%
2.22∼28 | 8 | 3.6%
2.27∼3.2 | 6 | 2.7%
8 | 6 | 2.7%
1.13∼17 | 5 | 2.3%
Other values (86) | 119 | 53.6%

Most occurring characters

Value | Count | Frequency (%)
2 | 225 | 18.7%
1 | 209 | 17.4%
. | 195 | 16.2%
∼ | 144 | 12.0%
3 | 87 | 7.2%
8 | 63 | 5.2%
7 | 56 | 4.7%
(space) | 50 | 4.2%
5 | 46 | 3.8%
0 | 30 | 2.5%
Other values (8) | 97 | 8.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 781 | 65.0%
Other Punctuation | 195 | 16.2%
Math Symbol | 146 | 12.1%
Space Separator | 50 | 4.2%
Dash Punctuation | 22 | 1.8%
Other Letter | 8 | 0.7%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 225 | 28.8%
1 | 209 | 26.8%
3 | 87 | 11.1%
8 | 63 | 8.1%
7 | 56 | 7.2%
5 | 46 | 5.9%
0 | 30 | 3.8%
4 | 30 | 3.8%
6 | 21 | 2.7%
9 | 14 | 1.8%

Math Symbol
Value | Count | Frequency (%)
∼ | 144 | 98.6%
~ | 2 | 1.4%

Dash Punctuation
Value | Count | Frequency (%)
〜 | 17 | 77.3%
- | 5 | 22.7%

Other Letter
Value | Count | Frequency (%)
(blank in source) | 4 | 50.0%
(blank in source) | 4 | 50.0%

Other Punctuation
Value | Count | Frequency (%)
. | 195 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 50 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1194 | 99.3%
Hangul | 8 | 0.7%

Most frequent character per script

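Common
Value | Count | Frequency (%)
2 | 225 | 18.8%
1 | 209 | 17.5%
. | 195 | 16.3%
∼ | 144 | 12.1%
3 | 87 | 7.3%
8 | 63 | 5.3%
7 | 56 | 4.7%
(space) | 50 | 4.2%
5 | 46 | 3.9%
0 | 30 | 2.5%
Other values (6) | 89 | 7.5%

Hangul
Value | Count | Frequency (%)
(blank in source) | 4 | 50.0%
(blank in source) | 4 | 50.0%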

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1033 | 85.9%
Math Operators | 144 | 12.0%
None | 17 | 1.4%
Hangul | 8 | 0.7%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
2 | 225 | 21.8%
1 | 209 | 20.2%
. | 195 | 18.9%
3 | 87 | 8.4%
8 | 63 | 6.1%
7 | 56 | 5.4%
(space) | 50 | 4.8%
5 | 46 | 4.5%
0 | 30 | 2.9%
4 | 30 | 2.9%
Other values (4) | 42 | 4.1%

Math Operators
Value | Count | Frequency (%)
∼ | 144 | 100.0%

None
Value | Count | Frequency (%)
〜 | 17 | 100.0%

Hangul
Value | Count | Frequency (%)
(blank in source) | 4 | 50.0%
(blank in source) | 4 | 50.0%
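
The 기간 values above write date ranges as month.day, a separator, and an end day or month.day, with three different separator characters: U+223C (∼), U+301C (〜) and the ASCII tilde. A minimal parsing sketch under those assumptions follows; `parse_period` is a hypothetical helper, and the accepted format is inferred only from the values shown in this report.

```python
import re
from typing import Optional, Tuple

# Separators seen in the column: U+223C, U+301C and ASCII "~".
_PERIOD = re.compile(
    r"^\s*(\d{1,2})\.\s*(\d{1,2})\s*[\u223c\u301c~]\s*(?:(\d{1,2})\.\s*)?(\d{1,2})\s*$"
)


def parse_period(text: str) -> Optional[Tuple[int, int, int, int]]:
    """Parse 'M.D~D' or 'M.D~M.D' into (start_month, start_day, end_month, end_day)."""
    match = _PERIOD.match(text)
    if match is None:
        return None  # placeholder "-" or header text
    start_month, start_day = int(match.group(1)), int(match.group(2))
    # When the end month is omitted, the range ends in the start month.
    end_month = int(match.group(3)) if match.group(3) else start_month
    end_day = int(match.group(4))
    return start_month, start_day, end_month, end_day


print(parse_period("1. 8\u223c15"))    # (1, 8, 1, 15)
print(parse_period("7.23\u301c8.31"))  # (7, 23, 8, 31)
print(parse_period("8.1~20"))          # (8, 1, 8, 20)
print(parse_period("-"))               # None
```

Returning None for unparseable input keeps placeholder and header rows visible to the caller instead of silently guessing a date.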

인원 (headcount)
Unsupported

REJECTED  UNSUPPORTED

Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

연인원 (person-days)
Unsupported

REJECTED  UNSUPPORTED

Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

장소 (venue)
Categorical

HIGH CORRELATION

Distinct: 42
Distinct (%): 24.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

-: 111
장 소: 4
순창실내정구장: 4
익산시 일원: 3
전주화산빙상장: 3
Other values (37): 47

Length

Max length: 10
Median length: 1
Mean length: 3.1686047
Min length: 1

Unique

Unique: 28
Unique (%): 16.3%

Sample

1st row: 남원 일원
2nd row: -
3rd row: -
4th row: -
5th row: -

Common Values

Value | Count | Frequency (%)
- | 111 | 64.5%
장 소 | 4 | 2.3%
순창실내정구장 | 4 | 2.3%
익산시 일원 | 3 | 1.7%
전주화산빙상장 | 3 | 1.7%
군산시 일원 | 3 | 1.7%
호원대 체육관 | 2 | 1.2%
익산시청 펜싱장 | 2 | 1.2%
생명과학고 체육관 | 2 | 1.2%
정일여중체육관 | 2 | 1.2%
Other values (32) | 36 | 20.9%

Length

[Figure: Histogram of lengths of the category]

Value | Count | Frequency (%)
(blank in source) | 111 | 50.7%
체육관 | 12 | 5.5%
일원 | 10 | 4.6%
(blank in source) | 4 | 1.8%
순창실내정구장 | 4 | 1.8%
익산시 | 4 | 1.8%
펜싱장 | 4 | 1.8%
전주 | 4 | 1.8%
체육중고 | 4 | 1.8%
(blank in source) | 4 | 1.8%
Other values (40) | 58 | 26.5%

Correlations

 | 종목 | 기간 | 장소
종목 | 1.000 | 1.000 | 0.971
기간 | 1.000 | 1.000 | 0.992
장소 | 0.971 | 0.992 | 1.000

 | 종목 | 장소
종목 | 1.000 | 0.617
장소 | 0.617 | 1.000

 | 종목 | 장소
종목 | 1.000 | 0.617
장소 | 0.617 | 1.000
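
The report does not say which association measure produced these matrices. For two categorical columns such as 종목 and 장소, one common choice is Cramér's V, which rescales the chi-squared statistic to the range 0 to 1. The sketch below computes the plain, uncorrected version with the standard library only. It is an illustration of the measure, not the profiler's implementation, which may use a different statistic or apply a bias correction.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def cramers_v(xs: Sequence[Hashable], ys: Sequence[Hashable]) -> float:
    """Plain (uncorrected) Cramér's V between two categorical sequences."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed cell counts
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    k = min(len(row_totals), len(col_totals))
    if k < 2:
        return 0.0  # a constant column carries no association
    chi2 = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (k - 1)))


# Toy data, not taken from the dataset.
print(cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"]))  # 1.0
print(cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"]))  # 0.0
```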

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

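(index) | (unnamed) | 종목 | 팀명 | 기간 | 인원 | 연인원 | 장소
0 | 1 | 축구 | 부산 U-12세팀 | 1. 8∼15 | 35 | 280 | 남원 일원
1 | 2 | 축구 | 김포 JD FC | 1. 8∼15 | 35 | 280 | -
2 | 3 | 축구 | 대전 사커클럽 | 1. 8∼15 | 35 | 280 | -
3 | 4 | 축구 | 구리 주니어팀 | 1. 8∼15 | 40 | 320 | -
4 | 5 | 축구 | 광명 유소년팀 | 1. 8∼15 | 60 | 480 | -
5 | 6 | 축구 | 수지 주니어팀 | 1. 8∼15 | 35 | 280 | -
6 | 7 | 축구 | 송탄 주니오팀 | 1. 8∼15 | 40 | 320 | -
7 | 8 | 축구 | 경기 호동이팀 | 1. 8∼15 | 35 | 280 | -
8 | 9 | 축구 | 의왕 정우사커 | 1. 8∼15 | 35 | 280 | -
9 | 10 | 축구 | 파주 조영증FC | 1. 8∼15 | 35 | 280 | -

(index) | (unnamed) | 종목 | 팀명 | 기간 | 인원 | 연인원 | 장소
162 | 155 | 육상 | 경남 국 제 대 | 7.23〜8.31 | 4 | 160 | -
163 | 156 | 육상 | 전남 해남군청 | 7.23〜8.31 | 6 | 240 | -
164 | 157 | 육상 | 광주 광역시청 | 7.23〜8.31 | 6 | 240 | -
165 | 158 | 육상 | 전남 조 선 대 | 7.23〜8.31 | 1 | 40 | -
166 | 159 | 육상 | 상비군 국가대표 | 8. 2〜16 | 33 | 495 | 익산종합운동장
167 | 160 | 핸드볼 | 청소년 국가대표(여) | 8. 9〜13 | 27 | 135 | 정일여중체육관
168 | 161 | 핸드볼 | 청소년 국가대표(남) | 8.11〜30 | 27 | 540 | 익산시 일원
169 | 162 | 핸드볼 | 청소년 국가대표(16세이하 남) | 8.1~20 | 21 | 420 | 정읍시체육센타
170 | 163 | 핸드볼 | 청소년 국가대표(16세이하 여) | 8.1~20 | 21 | 420 | 정일여중체육관
171 | 소계 | - | - | - | 2515785 (인원 and 연인원 run together in the source) | | -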
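
In the sample rows, 연인원 appears to equal 인원 multiplied by the number of days in 기간, counting both the start and end day. For example, reading the first row as 35 people over 8 to 15 January gives 35 × 8 = 280. This is an observation from the head and tail rows only, not something the report states. A small check of that relationship, assuming the year 2014 from the dataset description:

```python
from datetime import date

YEAR = 2014  # assumption: taken from the dataset description


def inclusive_days(start_month: int, start_day: int, end_month: int, end_day: int) -> int:
    """Number of days from start to end, counting both endpoints."""
    start = date(YEAR, start_month, start_day)
    end = date(YEAR, end_month, end_day)
    return (end - start).days + 1


# (start_month, start_day, end_month, end_day, 인원, 연인원), read off the sample rows.
sample = [
    (1, 8, 1, 15, 35, 280),
    (1, 8, 1, 15, 60, 480),
    (7, 23, 8, 31, 4, 160),
    (8, 2, 8, 16, 33, 495),
    (8, 9, 8, 13, 27, 135),
]

matches = [
    people * inclusive_days(sm, sd, em, ed) == person_days
    for sm, sd, em, ed, people, person_days in sample
]
print(matches)  # [True, True, True, True, True]
```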

Duplicate rows

Most frequently occurring

(index) | 종목 | 팀명 | 기간 | 장소 | # duplicates
0 | - | - | - | - | 5
1 | 종목 | 팀 명 | 기 간 | 장 소 | 4
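
Both duplicate groups look like non-data rows: one repeats the column headers with extra spaces, and the other holds only dashes in the four profiled columns, as subtotal rows such as 소계 do in the sample. Treating them that way is an assumption. A minimal filter sketch with plain dictionaries, where `is_data_row` is a hypothetical helper:

```python
# Header labels with internal spaces removed, e.g. "팀 명" becomes "팀명".
HEADER_TOKENS = {"종목", "팀명", "기간", "장소"}


def is_data_row(row: dict) -> bool:
    """Return False for repeated header rows and all-dash placeholder rows."""
    values = ["".join(str(v).split()) for v in row.values()]
    if all(v == "-" for v in values):
        return False
    if all(v in HEADER_TOKENS for v in values):
        return False
    return True


rows = [
    {"종목": "축구", "팀명": "김포 JD FC", "기간": "1. 8∼15", "장소": "-"},
    {"종목": "-", "팀명": "-", "기간": "-", "장소": "-"},
    {"종목": "종목", "팀명": "팀 명", "기간": "기 간", "장소": "장 소"},
]
kept = [row for row in rows if is_data_row(row)]
print(len(kept))        # 1
print(kept[0]["팀명"])  # 김포 JD FC
```

Filtering on the four text columns only means a subtotal row is dropped even though its numeric columns are filled in, which is usually what is wanted before computing per-team statistics.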