Overview

Dataset statistics

Number of variables: 7
Number of observations: 172
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 1
Duplicate rows (%): 0.6%
Total size in memory: 9.5 KiB
Average record size in memory: 56.8 B
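
The percentage figures in these statistics follow from the raw counts. As a quick sanity check, the sketch below recomputes them; the helper name `pct` is illustrative and not part of the report, and the total size is derived here from the reported average record size rather than measured.

```python
def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

n_rows, n_cols = 172, 7              # observations and variables from the report
missing_cells, duplicate_rows = 0, 1

missing_pct = pct(missing_cells, n_rows * n_cols)   # share of all cells
duplicate_pct = pct(duplicate_rows, n_rows)         # share of all rows
# Total size implied by the reported average record size (56.8 B per row).
total_kib = round(56.8 * n_rows / 1024, 1)

print(missing_pct, duplicate_pct, total_kib)
```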

Variable types

Text: 4
Categorical: 3

Dataset

Description: Domestic training camp teams hosted, 2014 (국내전지훈련선수단유치현황2014)
Author: Jeollabuk-do (전라북도)
URL: https://www.bigdatahub.go.kr/opendata/dataSet/detail.nm?contentId=37&rlik=49451aebf056b486&serviceId=202032

Alerts

Dataset has 1 (0.6%) duplicate rows [Duplicates]
종목 is highly overall correlated with 장소 [High correlation]
장소 is highly overall correlated with 종목 [High correlation]

Reproduction

Analysis started: 2024-03-14 00:39:58.727306
Analysis finished: 2024-03-14 00:39:59.522575
Duration: 0.8 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables


Text

Distinct: 165
Distinct (%): 95.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Length

Max length: 3
Median length: 2
Mean length: 2.2965116
Min length: 1

Characters and Unicode

Total characters: 395
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
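
As a rough illustration of how such per-character statistics can be derived, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The function name and the small name table are assumptions made for this example, not the report's own implementation; scripts and blocks are not exposed by the standard library and are left out.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Sm": "Math Symbol",
}

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

# Two values taken from the report: a period string and the subtotal label.
# "\u223c" is the TILDE OPERATOR used as the range separator.
sample = ["1. 8\u223c15", "소계"]
print(category_counts(sample))
```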

Unique

Unique: 163
Unique (%): 94.8%

Sample

1st row: 1
2nd row: 2
3rd row: 3
4th row: 4
5th row: 5

Value | Count | Frequency (%)
소계 (subtotal) | 5 | 2.9%
(not shown) | 4 | 2.3%
159 | 1 | 0.6%
119 | 1 | 0.6%
105 | 1 | 0.6%
121 | 1 | 0.6%
112 | 1 | 0.6%
114 | 1 | 0.6%
106 | 1 | 0.6%
107 | 1 | 0.6%
Other values (155) | 155 | 90.1%

Most occurring characters

Value | Count | Frequency (%)
1 | 101 | 25.6%
2 | 37 | 9.4%
3 | 37 | 9.4%
5 | 36 | 9.1%
4 | 36 | 9.1%
6 | 30 | 7.6%
9 | 26 | 6.6%
8 | 26 | 6.6%
0 | 26 | 6.6%
7 | 26 | 6.6%
Other values (3) | 14 | 3.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 381 | 96.5%
Other Letter | 14 | 3.5%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 101 | 26.5%
2 | 37 | 9.7%
3 | 37 | 9.7%
5 | 36 | 9.4%
4 | 36 | 9.4%
6 | 30 | 7.9%
9 | 26 | 6.8%
8 | 26 | 6.8%
0 | 26 | 6.8%
7 | 26 | 6.8%

Other Letter
Value | Count | Frequency (%)
(not shown) | 5 | 35.7%
(not shown) | 5 | 35.7%
(not shown) | 4 | 28.6%

Most occurring scripts

Value | Count | Frequency (%)
Common | 381 | 96.5%
Hangul | 14 | 3.5%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 101 | 26.5%
2 | 37 | 9.7%
3 | 37 | 9.7%
5 | 36 | 9.4%
4 | 36 | 9.4%
6 | 30 | 7.9%
9 | 26 | 6.8%
8 | 26 | 6.8%
0 | 26 | 6.8%
7 | 26 | 6.8%

Hangul
Value | Count | Frequency (%)
(not shown) | 5 | 35.7%
(not shown) | 5 | 35.7%
(not shown) | 4 | 28.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 381 | 96.5%
Hangul | 14 | 3.5%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 101 | 26.5%
2 | 37 | 9.7%
3 | 37 | 9.7%
5 | 36 | 9.4%
4 | 36 | 9.4%
6 | 30 | 7.9%
9 | 26 | 6.8%
8 | 26 | 6.8%
0 | 26 | 6.8%
7 | 26 | 6.8%

Hangul
Value | Count | Frequency (%)
(not shown) | 5 | 35.7%
(not shown) | 5 | 35.7%
(not shown) | 4 | 28.6%

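종목 (sport)
Categorical

HIGH CORRELATION

Distinct: 25
Distinct (%): 14.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Value | Count
야구 (baseball) | 37
펜싱 (fencing) | 22
정구 (soft tennis) | 14
축구 (soccer) | 13
육상 (athletics) | 12
Other values (20) | 74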

Length

Max length: 6
Median length: 2
Mean length: 2.3488372
Min length: 1

Unique

Unique: 5
Unique (%): 2.9%

Sample

1st row: 축구
2nd row: 축구
3rd row: 축구
4th row: 축구
5th row: 축구

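Common Values

Value | Count | Frequency (%)
야구 (baseball) | 37 | 21.5%
펜싱 (fencing) | 22 | 12.8%
정구 (soft tennis) | 14 | 8.1%
축구 (soccer) | 13 | 7.6%
육상 (athletics) | 12 | 7.0%
아이스하키 (ice hockey) | 9 | 5.2%
유도 (judo) | 9 | 5.2%
수영 (swimming) | 7 | 4.1%
핸드볼 (handball) | 6 | 3.5%
배드민턴 (badminton) | 5 | 2.9%
Other values (15) | 38 | 22.1%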

Length

Histogram of lengths of the category

팀명 (team name)
Text

Distinct: 147
Distinct (%): 85.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Length

Max length: 17
Median length: 8
Mean length: 7.4593023
Min length: 1

Characters and Unicode

Total characters: 1283
Distinct characters: 156
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 134
Unique (%): 77.9%

Sample

1st row: 부산 U-12세팀
2nd row: 김포 JD FC
3rd row: 대전 사커클럽
4th row: 구리 주니어팀
5th row: 광명 유소년팀

Value | Count | Frequency (%)
(not shown) | 34 | 6.9%
경기 | 31 | 6.3%
서울 | 23 | 4.7%
(not shown) | 16 | 3.2%
(not shown) | 16 | 3.2%
전남 | 15 | 3.0%
국가대표 (national team) | 12 | 2.4%
충남 | 10 | 2.0%
인천 | 9 | 1.8%
광주 | 9 | 1.8%
Other values (178) | 318 | 64.5%

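Most occurring characters

Value | Count | Frequency (%)
(space) | 321 | 25.0%
(not shown) | 51 | 4.0%
(not shown) | 51 | 4.0%
(not shown) | 42 | 3.3%
(not shown) | 38 | 3.0%
(not shown) | 33 | 2.6%
(not shown) | 32 | 2.5%
(not shown) | 30 | 2.3%
(not shown) | 29 | 2.3%
(not shown) | 27 | 2.1%
Other values (146) | 629 | 49.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 931 | 72.6%
Space Separator | 321 | 25.0%
Uppercase Letter | 11 | 0.9%
Dash Punctuation | 6 | 0.5%
Decimal Number | 6 | 0.5%
Open Punctuation | 4 | 0.3%
Close Punctuation | 4 | 0.3%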

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not shown) | 51 | 5.5%
(not shown) | 51 | 5.5%
(not shown) | 42 | 4.5%
(not shown) | 38 | 4.1%
(not shown) | 33 | 3.5%
(not shown) | 32 | 3.4%
(not shown) | 30 | 3.2%
(not shown) | 29 | 3.1%
(not shown) | 27 | 2.9%
(not shown) | 26 | 2.8%
Other values (134) | 572 | 61.4%

Uppercase Letter
Value | Count | Frequency (%)
C | 4 | 36.4%
F | 4 | 36.4%
U | 1 | 9.1%
J | 1 | 9.1%
D | 1 | 9.1%

Decimal Number
Value | Count | Frequency (%)
1 | 3 | 50.0%
6 | 2 | 33.3%
2 | 1 | 16.7%

Space Separator
Value | Count | Frequency (%)
(space) | 321 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 6 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 4 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 4 | 100.0%

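Most occurring scripts

Value | Count | Frequency (%)
Hangul | 931 | 72.6%
Common | 341 | 26.6%
Latin | 11 | 0.9%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not shown) | 51 | 5.5%
(not shown) | 51 | 5.5%
(not shown) | 42 | 4.5%
(not shown) | 38 | 4.1%
(not shown) | 33 | 3.5%
(not shown) | 32 | 3.4%
(not shown) | 30 | 3.2%
(not shown) | 29 | 3.1%
(not shown) | 27 | 2.9%
(not shown) | 26 | 2.8%
Other values (134) | 572 | 61.4%

Common
Value | Count | Frequency (%)
(space) | 321 | 94.1%
- | 6 | 1.8%
( | 4 | 1.2%
) | 4 | 1.2%
1 | 3 | 0.9%
6 | 2 | 0.6%
2 | 1 | 0.3%

Latin
Value | Count | Frequency (%)
C | 4 | 36.4%
F | 4 | 36.4%
U | 1 | 9.1%
J | 1 | 9.1%
D | 1 | 9.1%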

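Most occurring blocks

Value | Count | Frequency (%)
Hangul | 931 | 72.6%
ASCII | 352 | 27.4%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 321 | 91.2%
- | 6 | 1.7%
C | 4 | 1.1%
( | 4 | 1.1%
) | 4 | 1.1%
F | 4 | 1.1%
1 | 3 | 0.9%
6 | 2 | 0.6%
U | 1 | 0.3%
2 | 1 | 0.3%
Other values (2) | 2 | 0.6%

Hangul
Value | Count | Frequency (%)
(not shown) | 51 | 5.5%
(not shown) | 51 | 5.5%
(not shown) | 42 | 4.5%
(not shown) | 38 | 4.1%
(not shown) | 33 | 3.5%
(not shown) | 32 | 3.4%
(not shown) | 30 | 3.2%
(not shown) | 29 | 3.1%
(not shown) | 27 | 2.9%
(not shown) | 26 | 2.8%
Other values (134) | 572 | 61.4%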

기간 (period)
Text

Distinct: 89
Distinct (%): 51.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Length

Max length: 9
Median length: 7
Mean length: 6.9883721
Min length: 1

Characters and Unicode

Total characters: 1202
Distinct characters: 18
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 66
Unique (%): 38.4%

Sample

1st row: 1. 8∼15
2nd row: 1. 8∼15
3rd row: 1. 8∼15
4th row: 1. 8∼15
5th row: 1. 8∼15

Value | Count | Frequency (%)
1 | 20 | 9.0%
2.13∼22 | 13 | 5.9%
8∼15 | 13 | 5.9%
2 | 12 | 5.4%
7.23?8.31 | 11 | 5.0%
2.15∼21 | 10 | 4.5%
2.22∼28 | 8 | 3.6%
2.27∼3.2 | 6 | 2.7%
8 | 6 | 2.7%
1.13∼17 | 5 | 2.3%
Other values (85) | 118 | 53.2%

Most occurring characters

Value | Count | Frequency (%)
2 | 225 | 18.7%
1 | 209 | 17.4%
. | 195 | 16.2%
∼ | 144 | 12.0%
3 | 87 | 7.2%
8 | 63 | 5.2%
7 | 56 | 4.7%
(space) | 50 | 4.2%
5 | 46 | 3.8%
0 | 30 | 2.5%
Other values (8) | 97 | 8.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 781 | 65.0%
Other Punctuation | 212 | 17.6%
Math Symbol | 146 | 12.1%
Space Separator | 50 | 4.2%
Other Letter | 8 | 0.7%
Dash Punctuation | 5 | 0.4%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 225 | 28.8%
1 | 209 | 26.8%
3 | 87 | 11.1%
8 | 63 | 8.1%
7 | 56 | 7.2%
5 | 46 | 5.9%
0 | 30 | 3.8%
4 | 30 | 3.8%
6 | 21 | 2.7%
9 | 14 | 1.8%

Other Punctuation
Value | Count | Frequency (%)
. | 195 | 92.0%
? | 17 | 8.0%

Math Symbol
Value | Count | Frequency (%)
∼ | 144 | 98.6%
~ | 2 | 1.4%

Other Letter
Value | Count | Frequency (%)
(not shown) | 4 | 50.0%
(not shown) | 4 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 50 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 5 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1194 | 99.3%
Hangul | 8 | 0.7%

Most frequent character per script

Common
Value | Count | Frequency (%)
2 | 225 | 18.8%
1 | 209 | 17.5%
. | 195 | 16.3%
∼ | 144 | 12.1%
3 | 87 | 7.3%
8 | 63 | 5.3%
7 | 56 | 4.7%
(space) | 50 | 4.2%
5 | 46 | 3.9%
0 | 30 | 2.5%
Other values (6) | 89 | 7.5%

Hangul
Value | Count | Frequency (%)
(not shown) | 4 | 50.0%
(not shown) | 4 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1050 | 87.4%
Math Operators | 144 | 12.0%
Hangul | 8 | 0.7%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
2 | 225 | 21.4%
1 | 209 | 19.9%
. | 195 | 18.6%
3 | 87 | 8.3%
8 | 63 | 6.0%
7 | 56 | 5.3%
(space) | 50 | 4.8%
5 | 46 | 4.4%
0 | 30 | 2.9%
4 | 30 | 2.9%
Other values (5) | 59 | 5.6%

Math Operators
Value | Count | Frequency (%)
∼ | 144 | 100.0%

Hangul
Value | Count | Frequency (%)
(not shown) | 4 | 50.0%
(not shown) | 4 | 50.0%

인원 (headcount)
Categorical

Distinct: 43
Distinct (%): 25.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Value | Count
35 | 15
7 | 14
30 | 12
6 | 12
8 | 11
Other values (38) | 108

Length

Max length: 5
Median length: 2
Mean length: 1.7093023
Min length: 1

Unique

Unique: 16
Unique (%): 9.3%

Sample

1st row: 35
2nd row: 35
3rd row: 35
4th row: 40
5th row: 60

Common Values

Value | Count | Frequency (%)
35 | 15 | 8.7%
7 | 14 | 8.1%
30 | 12 | 7.0%
6 | 12 | 7.0%
8 | 11 | 6.4%
10 | 9 | 5.2%
25 | 7 | 4.1%
9 | 7 | 4.1%
5 | 7 | 4.1%
20 | 7 | 4.1%
Other values (33) | 71 | 41.3%

Length

Histogram of lengths of the category

연인원 (person-days)
Text

Distinct: 97
Distinct (%): 56.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Length

Max length: 5
Median length: 3
Mean length: 2.7209302
Min length: 2

Characters and Unicode

Total characters: 468
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 59
Unique (%): 34.3%

Sample

1st row: 280
2nd row: 280
3rd row: 280
4th row: 320
5th row: 480

Value | Count | Frequency (%)
280 | 13 | 7.6%
210 | 8 | 4.7%
80 | 6 | 3.5%
240 | 5 | 2.9%
480 | 4 | 2.3%
연인원 | 4 | 2.3%
245 | 4 | 2.3%
140 | 3 | 1.7%
40 | 3 | 1.7%
24 | 3 | 1.7%
Other values (87) | 119 | 69.2%

Most occurring characters

Value | Count | Frequency (%)
0 | 106 | 22.6%
2 | 75 | 16.0%
1 | 53 | 11.3%
4 | 46 | 9.8%
8 | 42 | 9.0%
5 | 39 | 8.3%
3 | 33 | 7.1%
6 | 24 | 5.1%
7 | 15 | 3.2%
9 | 14 | 3.0%
Other values (4) | 21 | 4.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 447 | 95.5%
Other Letter | 12 | 2.6%
Other Punctuation | 9 | 1.9%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 106 | 23.7%
2 | 75 | 16.8%
1 | 53 | 11.9%
4 | 46 | 10.3%
8 | 42 | 9.4%
5 | 39 | 8.7%
3 | 33 | 7.4%
6 | 24 | 5.4%
7 | 15 | 3.4%
9 | 14 | 3.1%

Other Letter
Value | Count | Frequency (%)
(not shown) | 4 | 33.3%
(not shown) | 4 | 33.3%
(not shown) | 4 | 33.3%

Other Punctuation
Value | Count | Frequency (%)
, | 9 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 456 | 97.4%
Hangul | 12 | 2.6%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 106 | 23.2%
2 | 75 | 16.4%
1 | 53 | 11.6%
4 | 46 | 10.1%
8 | 42 | 9.2%
5 | 39 | 8.6%
3 | 33 | 7.2%
6 | 24 | 5.3%
7 | 15 | 3.3%
9 | 14 | 3.1%

Hangul
Value | Count | Frequency (%)
(not shown) | 4 | 33.3%
(not shown) | 4 | 33.3%
(not shown) | 4 | 33.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 456 | 97.4%
Hangul | 12 | 2.6%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 106 | 23.2%
2 | 75 | 16.4%
1 | 53 | 11.6%
4 | 46 | 10.1%
8 | 42 | 9.2%
5 | 39 | 8.6%
3 | 33 | 7.2%
6 | 24 | 5.3%
7 | 15 | 3.3%
9 | 14 | 3.1%

Hangul
Value | Count | Frequency (%)
(not shown) | 4 | 33.3%
(not shown) | 4 | 33.3%
(not shown) | 4 | 33.3%

장소 (venue)
Categorical

HIGH CORRELATION

Distinct: 42
Distinct (%): 24.4%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB

Value | Count
- | 111
장 소 | 4
순창실내정구장 | 4
익산시 일원 | 3
전주화산빙상장 | 3
Other values (37) | 47

Length

Max length: 10
Median length: 1
Mean length: 3.1686047
Min length: 1

Unique

Unique: 28
Unique (%): 16.3%

Sample

1st row: 남원 일원
2nd row: -
3rd row: -
4th row: -
5th row: -

Common Values

Value | Count | Frequency (%)
- | 111 | 64.5%
장 소 | 4 | 2.3%
순창실내정구장 | 4 | 2.3%
익산시 일원 | 3 | 1.7%
전주화산빙상장 | 3 | 1.7%
군산시 일원 | 3 | 1.7%
호원대 체육관 | 2 | 1.2%
익산시청 펜싱장 | 2 | 1.2%
생명과학고 체육관 | 2 | 1.2%
정일여중체육관 | 2 | 1.2%
Other values (32) | 36 | 20.9%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
(not shown) | 111 | 50.7%
체육관 | 12 | 5.5%
일원 | 10 | 4.6%
(not shown) | 4 | 1.8%
순창실내정구장 | 4 | 1.8%
익산시 | 4 | 1.8%
펜싱장 | 4 | 1.8%
전주 | 4 | 1.8%
체육중고 | 4 | 1.8%
(not shown) | 4 | 1.8%
Other values (40) | 58 | 26.5%

Correlations

Variable | 종목 | 기간 | 인원 | 연인원 | 장소
종목 | 1.000 | 1.000 | 0.930 | 0.982 | 0.971
기간 | 1.000 | 1.000 | 0.932 | 0.993 | 0.992
인원 | 0.930 | 0.932 | 1.000 | 0.996 | 0.847
연인원 | 0.982 | 0.993 | 0.996 | 1.000 | 0.966
장소 | 0.971 | 0.992 | 0.847 | 0.966 | 1.000

Variable | 종목 | 장소 | 인원
종목 | 1.000 | 0.617 | 0.464
장소 | 0.617 | 1.000 | 0.288
인원 | 0.464 | 0.288 | 1.000

Variable | 종목 | 인원 | 장소
종목 | 1.000 | 0.464 | 0.617
인원 | 0.464 | 1.000 | 0.288
장소 | 0.617 | 0.288 | 1.000
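
The report does not state which association measure produced these matrices. One widely used measure for pairs of categorical variables is Cramér's V. The sketch below is a plain, uncorrected version, offered only as an assumption about the kind of statistic involved, so it will not necessarily reproduce the values shown. The function name and the toy inputs are illustrative.

```python
from collections import Counter
from math import sqrt

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed cell counts
    rows = Counter(xs)             # marginal counts of the first variable
    cols = Counter(ys)             # marginal counts of the second variable
    chi2 = 0.0
    for x, row_total in rows.items():
        for y, col_total in cols.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(rows), len(cols)) - 1
    if k == 0:
        return 0.0                 # a constant variable carries no association
    return sqrt(chi2 / (n * k))

# Toy data: venue fully determined by sport, then unrelated to it.
sport = ["축구", "축구", "펜싱", "펜싱"]
print(cramers_v(sport, ["A", "A", "B", "B"]))
print(cramers_v(sport, ["A", "B", "A", "B"]))
```

With many categories and few rows, as in this dataset (25 sports and 42 venues over 172 rows), the uncorrected statistic tends to be inflated, which is one reason to read the high-correlation alert with caution.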

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
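
No cells are reported as missing, yet the 장소 column holds the placeholder "-" in 111 of 172 rows, and a nullity chart does not count such placeholders. A minimal sketch for counting them per column follows; the function name and the set of placeholder strings are assumptions chosen for this example.

```python
from collections import Counter

def placeholder_counts(rows, placeholders=("", "-")):
    """Count cells per column that are None or contain only a placeholder."""
    counts = Counter()
    for row in rows:
        for column, value in row.items():
            if value is None or value.strip() in placeholders:
                counts[column] += 1
    return counts

# Three rows shaped like the first rows of the sample.
rows = [
    {"종목": "축구", "장소": "남원 일원"},
    {"종목": "축구", "장소": "-"},
    {"종목": "축구", "장소": "-"},
]
print(placeholder_counts(rows))
```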

Sample

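Index | (unnamed) | 종목 | 팀명 | 기간 | 인원 | 연인원 | 장소
0 | 1 | 축구 | 부산 U-12세팀 | 1. 8∼15 | 35 | 280 | 남원 일원
1 | 2 | 축구 | 김포 JD FC | 1. 8∼15 | 35 | 280 | -
2 | 3 | 축구 | 대전 사커클럽 | 1. 8∼15 | 35 | 280 | -
3 | 4 | 축구 | 구리 주니어팀 | 1. 8∼15 | 40 | 320 | -
4 | 5 | 축구 | 광명 유소년팀 | 1. 8∼15 | 60 | 480 | -
5 | 6 | 축구 | 수지 주니어팀 | 1. 8∼15 | 35 | 280 | -
6 | 7 | 축구 | 송탄 주니오팀 | 1. 8∼15 | 40 | 320 | -
7 | 8 | 축구 | 경기 호동이팀 | 1. 8∼15 | 35 | 280 | -
8 | 9 | 축구 | 의왕 정우사커 | 1. 8∼15 | 35 | 280 | -
9 | 10 | 축구 | 파주 조영증FC | 1. 8∼15 | 35 | 280 | -

Index | (unnamed) | 종목 | 팀명 | 기간 | 인원 | 연인원 | 장소
162 | 155 | 육상 | 경남 국 제 대 | 7.23?8.31 | 4 | 160 | -
163 | 156 | 육상 | 전남 해남군청 | 7.23?8.31 | 6 | 240 | -
164 | 157 | 육상 | 광주 광역시청 | 7.23?8.31 | 6 | 240 | -
165 | 158 | 육상 | 전남 조 선 대 | 7.23?8.31 | 1 | 40 | -
166 | 159 | 육상 | 상비군 국가대표 | 8. 2?16 | 33 | 495 | 익산종합운동장
167 | 160 | 핸드볼 | 청소년 국가대표(여) | 8. 9?13 | 27 | 135 | 정일여중체육관
168 | 161 | 핸드볼 | 청소년 국가대표(남) | 8.11?30 | 27 | 540 | 익산시 일원
169 | 162 | 핸드볼 | 청소년 국가대표(16세이하 남) | 8.1~20 | 21 | 420 | 정읍시체육센타
170 | 163 | 핸드볼 | 청소년 국가대표(16세이하 여) | 8.1~20 | 21 | 420 | 정일여중체육관
171 | 소계 | - | - | - | 251 | 5,785 | -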
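
The 기간 values use several range separators ("∼", "?", "~" and "-"), and the sample rows are consistent with 연인원 being 인원 multiplied by the number of days in the period (for example 35 people over 8 days gives 280). The report does not define the column, so that reading is an assumption. The sketch below parses a period under it; the regular expression, the function name, and the fixed year 2014 (taken from the dataset description) are choices made for this example.

```python
import re
from datetime import date

# "M.D <sep> [M.]D", where <sep> is any separator seen in the data.
PERIOD = re.compile(
    r"^\s*(\d+)\.\s*(\d+)\s*[\u223c~?\-]\s*(?:(\d+)\.\s*)?(\d+)\s*$"
)

def period_days(text, year=2014):
    """Return the inclusive day count of a period string, or None if unparsable."""
    match = PERIOD.match(text)
    if match is None:
        return None
    start_month, start_day, end_month, end_day = match.groups()
    start = date(year, int(start_month), int(start_day))
    # When the end month is omitted, the period ends in the start month.
    end = date(year, int(end_month or start_month), int(end_day))
    return (end - start).days + 1

# (period, headcount, person-days) triples taken from the sample rows.
for period, people, person_days in [
    ("1. 8\u223c15", 35, 280),
    ("7.23?8.31", 4, 160),
    ("8.1~20", 21, 420),
]:
    print(period, period_days(period), people * period_days(period) == person_days)
```

Rows whose period is "-" or an embedded header return `None` and need separate handling.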

Duplicate rows

Most frequently occurring

Index | (unnamed) | 종목 | 팀명 | 기간 | 인원 | 연인원 | 장소 | # duplicates
0 | (not shown) | 종목 | 팀 명 | 기 간 | 인원 | 연인원 | 장 소 | 4
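
The duplicated row is a copy of the column headers embedded in the data, and the first column also carries 소계 (subtotal) rows. A minimal sketch for filtering both kinds of row before analysis is shown below. It assumes rows are dictionaries keyed by column name, with the unnamed first column stored under an empty key; all function names are illustrative.

```python
def squash(text):
    """Remove all whitespace so that "팀 명" compares equal to "팀명"."""
    return "".join(text.split())

def is_embedded_header(row):
    """True when every named column simply repeats its own column name."""
    return all(
        squash(value) == squash(column)
        for column, value in row.items()
        if column  # skip the unnamed first column
    )

def is_subtotal(row, label="소계"):
    """True when any cell holds the subtotal label."""
    return any(squash(value) == label for value in row.values())

def clean(rows):
    """Drop embedded header rows and subtotal rows."""
    return [r for r in rows if not is_embedded_header(r) and not is_subtotal(r)]

rows = [
    {"": "1", "종목": "축구", "팀명": "김포 JD FC", "장소": "-"},
    {"": "", "종목": "종목", "팀명": "팀 명", "장소": "장 소"},
    {"": "소계", "종목": "-", "팀명": "-", "장소": "-"},
]
print(clean(rows))
```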