Overview

Dataset statistics

Number of variables: 6
Number of observations: 114
Missing cells: 288
Missing cells (%): 42.1%
Duplicate rows: 1
Duplicate rows (%): 0.9%
Total size in memory: 6.0 KiB
Average record size in memory: 54.2 B
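
The two percentages above can be checked from the raw counts. The following is a minimal sketch of that arithmetic; the variable names are illustrative and not part of the report.

```python
# Counts copied from the dataset statistics above.
n_variables = 6
n_observations = 114
missing_cells = 288
duplicate_rows = 1

total_cells = n_variables * n_observations             # 6 * 114 = 684 cells
missing_pct = 100 * missing_cells / total_cells         # share of empty cells
duplicate_pct = 100 * duplicate_rows / n_observations   # share of duplicate rows

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
```

The 288 missing cells match the three variables flagged as missing in the alerts, at 96 missing values each.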

Variable types

Text: 1
Numeric: 2
Categorical: 3

Dataset

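Description: Yearly statistics on sports venues such as indoor gymnasiums, multi-purpose stadiums, tennis courts, swimming pools, and soccer fields
Author: Gangwon-do (강원도)
URL: https://www.data.go.kr/data/15056037/fileData.do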

Alerts

Dataset has 1 (0.9%) duplicate rows [Duplicates]
종합경기장 is highly imbalanced (57.9%) [Imbalance]
수영장 is highly imbalanced (58.3%) [Imbalance]
축구장 is highly imbalanced (62.4%) [Imbalance]
시군명 has 96 (84.2%) missing values [Missing]
실내체육관 has 96 (84.2%) missing values [Missing]
테니스장 has 96 (84.2%) missing values [Missing]
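
The three imbalance percentages are consistent with a normalized-entropy score: one minus the Shannon entropy of the class shares divided by the maximum entropy for that number of classes. The sketch below assumes that definition; it has only been checked against the three values reported here, and the function name `imbalance_score` is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (in bits) of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Category counts from the "Common Values" tables of each variable.
# Assumption: the "<NA>" string is counted as an ordinary category.
value_counts = {
    "종합경기장": [96, 10, 6, 2],
    "수영장": [96, 11, 5, 2],
    "축구장": [96, 6, 5, 3, 2, 2],
}

scores = {name: imbalance_score(counts) for name, counts in value_counts.items()}
for name, score in scores.items():
    print(f"{name}: {score:.1%}")
```

A score of 0% would mean all categories are equally frequent; here the 96 "<NA>" entries dominate each variable, which pushes the score up.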

Reproduction

Analysis started: 2023-12-12 07:23:00.940138
Analysis finished: 2023-12-12 07:23:01.927269
Duration: 0.99 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시군명 (city/county name)
Text

MISSING 

Distinct: 18
Distinct (%): 100.0%
Missing: 96
Missing (%): 84.2%
Memory size: 1.0 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 54
Distinct characters: 32
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
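
As a rough illustration of these character properties, the sketch below recounts the characters of the 18 시군명 values that appear in this report using Python's standard `unicodedata` module. The module exposes general categories and code point names but not scripts or blocks, so only the category is checked here.

```python
import unicodedata
from collections import Counter

# The 18 시군명 values that appear in this report.
names = [
    "춘천시", "원주시", "강릉시", "동해시", "태백시", "속초시", "삼척시",
    "홍천군", "횡성군", "영월군", "평창군", "정선군", "철원군", "화천군",
    "양구군", "인제군", "고성군", "양양군",
]

# Normalize to NFC so that each Hangul syllable is a single code point.
text = unicodedata.normalize("NFC", "".join(names))

char_counts = Counter(text)
categories = Counter(unicodedata.category(ch) for ch in text)

print(sum(char_counts.values()))   # total characters
print(len(char_counts))            # distinct characters
print(categories)                  # "Lo" is the general category "Other Letter"
print(unicodedata.name(text[0]))   # code point name of the first character
```

Every name ends in 시 (city) or 군 (county), so those two characters account for the two largest counts.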

Unique

Unique: 18
Unique (%): 100.0%

Sample

1st row: 춘천시
2nd row: 원주시
3rd row: 강릉시
4th row: 동해시
5th row: 태백시

Value             Count  Frequency (%)
원주시            1      5.6%
강릉시            1      5.6%
양양군            1      5.6%
고성군            1      5.6%
인제군            1      5.6%
양구군            1      5.6%
화천군            1      5.6%
철원군            1      5.6%
정선군            1      5.6%
평창군            1      5.6%
Other values (8)  8      44.4%

Most occurring characters

Value              Count  Frequency (%)
                   11     20.4%
                   7      13.0%
                   3      5.6%
                   3      5.6%
                   2      3.7%
                   2      3.7%
                   1      1.9%
                   1      1.9%
                   1      1.9%
                   1      1.9%
Other values (22)  22     40.7%

Most occurring categories

Value         Count  Frequency (%)
Other Letter  54     100.0%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
                   11     20.4%
                   7      13.0%
                   3      5.6%
                   3      5.6%
                   2      3.7%
                   2      3.7%
                   1      1.9%
                   1      1.9%
                   1      1.9%
                   1      1.9%
Other values (22)  22     40.7%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  54     100.0%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
                   11     20.4%
                   7      13.0%
                   3      5.6%
                   3      5.6%
                   2      3.7%
                   2      3.7%
                   1      1.9%
                   1      1.9%
                   1      1.9%
                   1      1.9%
Other values (22)  22     40.7%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  54     100.0%

Most frequent character per block

Hangul
Value              Count  Frequency (%)
                   11     20.4%
                   7      13.0%
                   3      5.6%
                   3      5.6%
                   2      3.7%
                   2      3.7%
                   1      1.9%
                   1      1.9%
                   1      1.9%
                   1      1.9%
Other values (22)  22     40.7%

실내체육관 (indoor gymnasium)
Real number (ℝ)

MISSING 

Distinct: 6
Distinct (%): 33.3%
Missing: 96
Missing (%): 84.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.2777778
Minimum: 0
Maximum: 5
Zeros: 1
Zeros (%): 0.9%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.85
Q1: 2
Median: 2
Q3: 3
95-th percentile: 4.15
Maximum: 5
Range: 5
Interquartile range (IQR): 1
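
The fractional percentiles (0.85 and 4.15) are what linear interpolation between neighbouring order statistics produces. The sketch below assumes that interpolation rule and rebuilds the 18 non-missing values from the value counts listed for this variable; the helper name `quantile` is illustrative.

```python
import math

def quantile(sorted_values, q):
    """Quantile by linear interpolation between the two nearest order statistics."""
    position = q * (len(sorted_values) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

# Value -> count for the 18 non-missing 실내체육관 observations.
counts = {0: 1, 1: 3, 2: 8, 3: 3, 4: 2, 5: 1}
values = sorted(v for v, n in counts.items() for _ in range(n))

quantiles = {
    "5-th percentile": quantile(values, 0.05),
    "Q1": quantile(values, 0.25),
    "Median": quantile(values, 0.50),
    "Q3": quantile(values, 0.75),
    "95-th percentile": quantile(values, 0.95),
}
for label, value in quantiles.items():
    print(f"{label}: {value:.2f}")

iqr = quantiles["Q3"] - quantiles["Q1"]
print(f"IQR: {iqr:.0f}")
```

With 18 values, the 5-th percentile sits at position 0.05 * 17 = 0.85 between the smallest value (0) and the next one (1), which gives 0.85.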

Descriptive statistics

Standard deviation: 1.2274103
Coefficient of variation (CV): 0.53886304
Kurtosis: 0.38688977
Mean: 2.2777778
Median Absolute Deviation (MAD): 1
Skewness: 0.47496083
Sum: 41
Variance: 1.5065359
Monotonicity: Not monotonic
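
Most of these figures can be recomputed from the 18 non-missing values with the standard library. The sketch below assumes the sample (n - 1) variance and the adjusted Fisher-Pearson skewness estimator, which reproduce the reported numbers; kurtosis is left out.

```python
import math
import statistics

# Value -> count for the 18 non-missing 실내체육관 observations.
counts = {0: 1, 1: 3, 2: 8, 3: 3, 4: 2, 5: 1}
values = [v for v, n in counts.items() for _ in range(n)]
n = len(values)

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, divides by n - 1
std = math.sqrt(variance)
cv = std / mean                          # coefficient of variation
median = statistics.median(values)
mad = statistics.median([abs(v - median) for v in values])

# Adjusted Fisher-Pearson skewness (assumed estimator).
skewness = n / ((n - 1) * (n - 2)) * sum(((v - mean) / std) ** 3 for v in values)

print(f"Sum: {total}")
print(f"Mean: {mean:.7f}")
print(f"Variance: {variance:.7f}")
print(f"Standard deviation: {std:.7f}")
print(f"CV: {cv:.8f}")
print(f"MAD: {mad}")
print(f"Skewness: {skewness:.8f}")
```

The statistics are computed over the 18 observed values only; the 96 missing rows do not enter the mean or variance.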
Histogram with fixed size bins (bins=6)

Common values

Value      Count  Frequency (%)
2          8      7.0%
3          3      2.6%
1          3      2.6%
4          2      1.8%
0          1      0.9%
5          1      0.9%
(Missing)  96     84.2%

Minimum values

Value      Count  Frequency (%)
0          1      0.9%
1          3      2.6%
2          8      7.0%
3          3      2.6%
4          2      1.8%
5          1      0.9%

Maximum values

Value      Count  Frequency (%)
5          1      0.9%
4          2      1.8%
3          3      2.6%
2          8      7.0%
1          3      2.6%
0          1      0.9%

종합경기장 (multi-purpose stadium)
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

Length

Max length: 4
Median length: 4
Mean length: 3.5263158
Min length: 1
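
These length statistics suggest that the 96 empty entries are stored as the four-character string "<NA>", which would also explain why this variable reports zero missing values. A minimal sketch of the calculation, assuming the category counts from this variable's Common Values table:

```python
import statistics

# Category -> count for 종합경기장; "<NA>" is treated as a literal 4-character string.
value_counts = {"<NA>": 96, "1": 10, "2": 6, "0": 2}

lengths = [len(value) for value, n in value_counts.items() for _ in range(n)]

max_length = max(lengths)
median_length = statistics.median(lengths)
mean_length = sum(lengths) / len(lengths)   # (96 * 4 + 18 * 1) / 114
min_length = min(lengths)

print(f"Max length: {max_length}")
print(f"Median length: {median_length}")
print(f"Mean length: {mean_length:.7f}")
print(f"Min length: {min_length}")
```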

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 2
4th row: 1
5th row: 2

Common Values

Value  Count  Frequency (%)
<NA>   96     84.2%
1      10     8.8%
2      6      5.3%
0      2      1.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
na     96     84.2%
1      10     8.8%
2      6      5.3%
0      2      1.8%

테니스장 (tennis court)
Real number (ℝ)

MISSING 

Distinct: 6
Distinct (%): 33.3%
Missing: 96
Missing (%): 84.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.5
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1.5
Q3: 3
95-th percentile: 7.3
Maximum: 9
Range: 8
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 2.3072775
Coefficient of variation (CV): 0.922911
Kurtosis: 3.1316199
Mean: 2.5
Median Absolute Deviation (MAD): 0.5
Skewness: 1.8910858
Sum: 45
Variance: 5.3235294
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)

Common values

Value      Count  Frequency (%)
1          9      7.9%
3          3      2.6%
2          3      2.6%
5          1      0.9%
9          1      0.9%
7          1      0.9%
(Missing)  96     84.2%

Minimum values

Value      Count  Frequency (%)
1          9      7.9%
2          3      2.6%
3          3      2.6%
5          1      0.9%
7          1      0.9%
9          1      0.9%

Maximum values

Value      Count  Frequency (%)
9          1      0.9%
7          1      0.9%
5          1      0.9%
3          3      2.6%
2          3      2.6%
1          9      7.9%

수영장 (swimming pool)
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

Length

Max length: 4
Median length: 4
Mean length: 3.5263158
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 3
3rd row: 0
4th row: 1
5th row: 0

Common Values

Value  Count  Frequency (%)
<NA>   96     84.2%
0      11     9.6%
1      5      4.4%
3      2      1.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
na     96     84.2%
0      11     9.6%
1      5      4.4%
3      2      1.8%

축구장 (soccer field)
Categorical

IMBALANCE 

Distinct: 6
Distinct (%): 5.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.0 KiB

Length

Max length: 4
Median length: 4
Mean length: 3.5263158
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 1
3rd row: 1
4th row: 1
5th row: 3

Common Values

Value  Count  Frequency (%)
<NA>   96     84.2%
1      6      5.3%
0      5      4.4%
4      3      2.6%
2      2      1.8%
3      2      1.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
na     96     84.2%
1      6      5.3%
0      5      4.4%
4      3      2.6%
2      2      1.8%
3      2      1.8%

Interactions

[Figures: four interaction plots]

Correlations

            시군명  실내체육관  종합경기장  테니스장  수영장  축구장
시군명      1.000   1.000       1.000       1.000     1.000   1.000
실내체육관  1.000   1.000       0.000       0.752     0.000   0.640
종합경기장  1.000   0.000       1.000       0.171     0.734   0.000
테니스장    1.000   0.752       0.171       1.000     0.000   0.000
수영장      1.000   0.000       0.734       0.000     1.000   0.000
축구장      1.000   0.640       0.000       0.000     0.000   1.000

            축구장  종합경기장  수영장
축구장      1.000   0.000       0.000
종합경기장  0.000   1.000       0.382
수영장      0.000   0.382       1.000

            실내체육관  테니스장  종합경기장  수영장  축구장
실내체육관  1.000       0.038     0.000       0.000   0.463
테니스장    0.038       1.000     0.000       0.000   0.000
종합경기장  0.000       0.000     1.000       0.382   0.000
수영장      0.000       0.000     0.382       1.000   0.000
축구장      0.463       0.000     0.000       0.000   1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
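
Nullity correlation can be read as the Pearson correlation between the presence indicators (1 if a value is present, 0 if missing) of two columns. The sketch below assumes that definition and a layout of 18 populated rows followed by 96 empty rows; the column contents are placeholders rather than the real measurements, and `nullity_correlation` is an illustrative name.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns.

    Returns None when either column is fully present or fully missing,
    because its indicator then has zero variance.
    """
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / math.sqrt(var_a * var_b)

# Placeholder columns: 18 populated rows followed by 96 empty rows (assumed layout).
gym = [1] * 18 + [None] * 96            # stands in for 실내체육관
tennis = [1] * 18 + [None] * 96         # stands in for 테니스장
stadium = ["1"] * 18 + ["<NA>"] * 96    # 종합경기장 stores "<NA>" as text, so nothing is missing

same_rows = nullity_correlation(gym, tennis)
no_variation = nullity_correlation(gym, stadium)
print(same_rows)      # close to 1.0: both columns are missing in the same rows
print(no_variation)   # None: the second column has no missing values
```

Columns without any missing values cannot take part in a nullity correlation, which is why only variables with missing entries are meaningful in such a heatmap.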

Sample

First rows

Index  시군명  실내체육관  종합경기장  테니스장  수영장  축구장
0      춘천시  3           0           1         3       2
1      원주시  4           1           3         3       1
2      강릉시  4           2           3         0       1
3      동해시  2           1           1         1       1
4      태백시  3           2           1         0       3
5      속초시  2           1           1         0       2
6      삼척시  2           2           5         0       1
7      홍천군  2           1           1         0       1
8      횡성군  2           1           1         1       1
9      영월군  0           1           1         0       0

Last rows

Index  시군명  실내체육관  종합경기장  테니스장  수영장  축구장
104    <NA>    <NA>        <NA>        <NA>      <NA>    <NA>
105    <NA>    <NA>        <NA>        <NA>      <NA>    <NA>
106    <NA>    <NA>        <NA>        <NA>      <NA>    <NA>
107    <NA>    <NA>        <NA>        <NA>      <NA>    <NA>
108    <NA>    <NA>        <NA>        <NA>      <NA>    <NA>
109    <NA>    <NA>        <NA>        <NA>      <NA>    <NA>
110    <NA>    <NA>        <NA>        <NA>      <NA>    <NA>
111    <NA>    <NA>        <NA>        <NA>      <NA>    <NA>
112    <NA>    <NA>        <NA>        <NA>      <NA>    <NA>
113    <NA>    <NA>        <NA>        <NA>      <NA>    <NA>

Duplicate rows

Most frequently occurring

Index  시군명  실내체육관  종합경기장  테니스장  수영장  축구장  # duplicates
0      <NA>    <NA>        <NA>        <NA>      <NA>    <NA>    96
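
The overview reports 1 duplicate row while this table reports 96 duplicates. One reading that fits both numbers is that the overview counts distinct rows that occur more than once, and "# duplicates" counts how often such a row occurs. The sketch below illustrates that reading on a reduced data set, namely the ten rows shown in the Sample section plus the 96 empty rows; it is not the report's own implementation.

```python
from collections import Counter

NA = "<NA>"

# First ten rows from the Sample section, followed by the 96 all-<NA> rows.
rows = [
    ("춘천시", 3, 0, 1, 3, 2),
    ("원주시", 4, 1, 3, 3, 1),
    ("강릉시", 4, 2, 3, 0, 1),
    ("동해시", 2, 1, 1, 1, 1),
    ("태백시", 3, 2, 1, 0, 3),
    ("속초시", 2, 1, 1, 0, 2),
    ("삼척시", 2, 2, 5, 0, 1),
    ("홍천군", 2, 1, 1, 0, 1),
    ("횡성군", 2, 1, 1, 1, 1),
    ("영월군", 0, 1, 1, 0, 0),
] + [(NA,) * 6] * 96

row_counts = Counter(rows)
duplicated = {row: n for row, n in row_counts.items() if n > 1}

print(len(duplicated))              # distinct rows that occur more than once
for row, n in duplicated.items():
    print(row, n)                   # the repeated row and its number of occurrences
```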