Overview

Dataset statistics

Number of variables: 8
Number of observations: 1616
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 5
Duplicate rows (%): 0.3%
Total size in memory: 104.3 KiB
Average record size in memory: 66.1 B
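
The two derived figures in this block follow from the raw counts. A minimal sketch, assuming the report rounds to one decimal place and that 1 KiB is 1024 bytes; the variable names are illustrative only:

```python
# Values copied from the dataset statistics in this report.
n_observations = 1616
n_duplicate_rows = 5
total_size_kib = 104.3

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = round(n_duplicate_rows / n_observations * 100, 1)

# Average record size in bytes (assumes 1 KiB = 1024 B).
avg_record_bytes = round(total_size_kib * 1024 / n_observations, 1)

print(duplicate_pct, avg_record_bytes)
```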

Variable types

Categorical: 4
Text: 1
DateTime: 2
Numeric: 1

Dataset

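Description: Programs offered at the Gyeongju National Sports Center (경주국민체육센터), which is operated by 경주시시설관리공단 (the Gyeongju City Facilities Management Corporation). The data covers 2019 through 2022.
Author: 경주시시설관리공단
URL: https://www.data.go.kr/data/15095711/fileData.do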

Alerts

Dataset has 5 (0.3%) duplicate rows (Duplicates)
이용료 is highly overall correlated with 강습요일 and 2 other fields (High correlation)
구분 is highly overall correlated with 강습요일 and 2 other fields (High correlation)
강습요일 is highly overall correlated with 이용료 and 2 other fields (High correlation)
강습정원 is highly overall correlated with 이용료 and 2 other fields (High correlation)
대상 is highly overall correlated with 이용료 and 1 other field (High correlation)
구분 is highly imbalanced (88.1%) (Imbalance)
강습요일 is highly imbalanced (82.8%) (Imbalance)
강습정원 is highly imbalanced (78.9%) (Imbalance)
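
The imbalance percentages can be cross-checked from the category counts listed under each variable. The sketch below assumes the score is one minus the Shannon entropy of the class shares divided by the maximum possible entropy for that number of classes. That definition is an assumption, not something the report states, but applied to the counts in this report it appears to reproduce the three percentages shown in the alerts.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (in bits) of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # with a single class the ratio is undefined; treated as 0 here
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Category counts copied from the Common Values tables in this report.
counts_by_column = {
    "구분": [1564, 38, 12, 2],
    "강습요일": [1542, 73, 1],
    "강습정원": [1509, 66, 39, 2],
}

for name, counts in counts_by_column.items():
    print(name, f"{imbalance_score(counts):.1%}")
```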

Reproduction

Analysis started: 2023-12-12 10:12:45.641965
Analysis finished: 2023-12-12 10:12:46.749681
Duration: 1.11 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
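
A report of this kind can be regenerated with ydata-profiling roughly as sketched below. The file names, the `cp949` encoding, and the choice to parse 시작일 and 종료일 as dates are assumptions about the source CSV, and the configuration used for this particular report (config.json) is not reproduced here.

```python
def build_report(csv_path, html_path, encoding="cp949"):
    """Profile a CSV file and write the report as HTML."""
    # Imported here so that defining the function does not require the packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding and the date column names are assumptions about the source file.
    df = pd.read_csv(csv_path, encoding=encoding, parse_dates=["시작일", "종료일"])
    report = ProfileReport(df, title="Profiling Report")
    report.to_file(html_path)
    return report

# Example call (hypothetical file names):
# build_report("programs.csv", "report.html")
```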

Variables

구분
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 12.8 KiB

수영장: 1564
에어로빅: 38
밸리댄스: 12
생활요가&필라테스: 2

Length

Max length: 9
Median length: 3
Mean length: 3.0383663
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 수영장
2nd row: 수영장
3rd row: 수영장
4th row: 수영장
5th row: 수영장

Common Values

Value | Count | Frequency (%)
수영장 | 1564 | 96.8%
에어로빅 | 38 | 2.4%
밸리댄스 | 12 | 0.7%
생활요가&필라테스 | 2 | 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

강습반명
Text

Distinct: 63
Distinct (%): 3.9%
Missing: 0
Missing (%): 0.0%
Memory size: 12.8 KiB

Length

Max length: 17
Median length: 11
Mean length: 11.346535
Min length: 11

Characters and Unicode

Total characters: 18336
Distinct characters: 54
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): 0.2%

Sample

1st row: 강습수영 06시 초급
2nd row: 강습수영 06시 초급
3rd row: 강습수영 06시 중급
4th row: 강습수영 06시 중급
5th row: 강습수영 07시 초급

Value | Count | Frequency (%)
강습수영 | 1498 | 30.2%
초급 | 291 | 5.9%
중급 | 284 | 5.7%
상급 | 258 | 5.2%
고급 | 243 | 4.9%
연수 | 211 | 4.3%
교정 | 211 | 4.3%
10시 | 207 | 4.2%
09시 | 202 | 4.1%
19시 | 196 | 4.0%
Other values (23) | 1360 | 27.4%
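
The word frequencies for 강습반명 look like counts of whitespace-separated tokens. The sketch below applies that tokenization to the five sample rows only. It is an assumption about how the words were split, not a statement of how the profiler tokenizes.

```python
from collections import Counter

# The five sample rows of 강습반명 shown in this report.
sample_names = [
    "강습수영 06시 초급",
    "강습수영 06시 초급",
    "강습수영 06시 중급",
    "강습수영 06시 중급",
    "강습수영 07시 초급",
]

# Split each name on whitespace and count the resulting tokens.
word_counts = Counter(word for name in sample_names for word in name.split())

print(word_counts.most_common())
```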

Most occurring characters

Value | Count | Frequency (%)
(space) | 3345 | 18.2%
 | 1715 | 9.4%
 | 1610 | 8.8%
 | 1504 | 8.2%
 | 1504 | 8.2%
 | 1498 | 8.2%
 | 1082 | 5.9%
0 | 1030 | 5.6%
1 | 1028 | 5.6%
9 | 398 | 2.2%
Other values (44) | 3622 | 19.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 11413 | 62.2%
Decimal Number | 3558 | 19.4%
Space Separator | 3345 | 18.2%
Other Punctuation | 8 | < 0.1%
Open Punctuation | 6 | < 0.1%
Close Punctuation | 6 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 1715 | 15.0%
 | 1610 | 14.1%
 | 1504 | 13.2%
 | 1504 | 13.2%
 | 1498 | 13.1%
 | 1082 | 9.5%
 | 297 | 2.6%
 | 290 | 2.5%
 | 258 | 2.3%
 | 243 | 2.1%
Other values (29) | 1412 | 12.4%

Decimal Number
Value | Count | Frequency (%)
0 | 1030 | 28.9%
1 | 1028 | 28.9%
9 | 398 | 11.2%
6 | 321 | 9.0%
4 | 200 | 5.6%
7 | 193 | 5.4%
2 | 150 | 4.2%
8 | 137 | 3.9%
3 | 70 | 2.0%
5 | 31 | 0.9%

Other Punctuation
Value | Count | Frequency (%)
. | 6 | 75.0%
& | 2 | 25.0%

Space Separator
Value | Count | Frequency (%)
(space) | 3345 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 6 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 6 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 11413 | 62.2%
Common | 6923 | 37.8%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 1715 | 15.0%
 | 1610 | 14.1%
 | 1504 | 13.2%
 | 1504 | 13.2%
 | 1498 | 13.1%
 | 1082 | 9.5%
 | 297 | 2.6%
 | 290 | 2.5%
 | 258 | 2.3%
 | 243 | 2.1%
Other values (29) | 1412 | 12.4%

Common
Value | Count | Frequency (%)
(space) | 3345 | 48.3%
0 | 1030 | 14.9%
1 | 1028 | 14.8%
9 | 398 | 5.7%
6 | 321 | 4.6%
4 | 200 | 2.9%
7 | 193 | 2.8%
2 | 150 | 2.2%
8 | 137 | 2.0%
3 | 70 | 1.0%
Other values (5) | 51 | 0.7%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 11413 | 62.2%
ASCII | 6923 | 37.8%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 3345 | 48.3%
0 | 1030 | 14.9%
1 | 1028 | 14.8%
9 | 398 | 5.7%
6 | 321 | 4.6%
4 | 200 | 2.9%
7 | 193 | 2.8%
2 | 150 | 2.2%
8 | 137 | 2.0%
3 | 70 | 1.0%
Other values (5) | 51 | 0.7%

Hangul
Value | Count | Frequency (%)
 | 1715 | 15.0%
 | 1610 | 14.1%
 | 1504 | 13.2%
 | 1504 | 13.2%
 | 1498 | 13.1%
 | 1082 | 9.5%
 | 297 | 2.6%
 | 290 | 2.5%
 | 258 | 2.3%
 | 243 | 2.1%
Other values (29) | 1412 | 12.4%

강습요일
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 12.8 KiB

월,화,수,목,금: 1542
월,수,금: 73
화,목: 1

Length

Max length: 9
Median length: 9
Mean length: 8.8155941
Min length: 3

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 월,화,수,목,금
2nd row: 월,화,수,목,금
3rd row: 월,화,수,목,금
4th row: 월,화,수,목,금
5th row: 월,화,수,목,금

Common Values

Value | Count | Frequency (%)
월,화,수,목,금 | 1542 | 95.4%
월,수,금 | 73 | 4.5%
화,목 | 1 | 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

강습정원
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 12.8 KiB

25: 1509
10: 66
80: 39
15: 2

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 25
2nd row: 25
3rd row: 25
4th row: 25
5th row: 25

Common Values

Value | Count | Frequency (%)
25 | 1509 | 93.4%
10 | 66 | 4.1%
80 | 39 | 2.4%
15 | 2 | 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

시작일
DateTime

Distinct: 15
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 12.8 KiB
Minimum: 2019-01-01 00:00:00
Maximum: 2022-12-01 00:00:00

[Figure: Histogram with fixed size bins (bins=15)]

종료일
DateTime

Distinct: 21
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 12.8 KiB
Minimum: 2019-01-31 00:00:00
Maximum: 2022-12-31 00:00:00

[Figure: Histogram with fixed size bins (bins=21)]

대상
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 12.8 KiB

성인: 534
청소년,군인: 506
경로,어린이: 491
성인,청소년,경로: 34
어린이: 26
Other values (2): 25

Length

Max length: 9
Median length: 6
Mean length: 4.7060644
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 성인
2nd row: 청소년, 군인
3rd row: 성인
4th row: 청소년, 군인
5th row: 성인

Common Values

Value | Count | Frequency (%)
성인 | 534 | 33.0%
청소년,군인 | 506 | 31.3%
경로,어린이 | 491 | 30.4%
성인,청소년,경로 | 34 | 2.1%
어린이 | 26 | 1.6%
청소년, 군인 | 23 | 1.4%
성인,경로 | 2 | 0.1%
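
Two of the 대상 values differ only by a space after the comma ("청소년,군인" and "청소년, 군인"). If they are meant to be the same category, which is an assumption about the source data, the labels could be normalized before counting. A minimal sketch using the counts from the table; `normalize` is a hypothetical helper:

```python
from collections import Counter

# Counts for 대상, copied from the Common Values table in this report.
raw_counts = {
    "성인": 534, "청소년,군인": 506, "경로,어린이": 491,
    "성인,청소년,경로": 34, "어린이": 26, "청소년, 군인": 23, "성인,경로": 2,
}

def normalize(label):
    """Strip whitespace around each comma-separated part of a label."""
    return ",".join(part.strip() for part in label.split(","))

# Re-aggregate the counts under the normalized labels.
merged = Counter()
for label, count in raw_counts.items():
    merged[normalize(label)] += count

print(merged.most_common())
```

With this normalization the seven distinct values collapse to six.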

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

이용료
Real number (ℝ)

HIGH CORRELATION 

Distinct: 10
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 48001.856
Minimum: 26000
Maximum: 70000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 14.3 KiB

Quantile statistics

Minimum: 26000
5-th percentile: 35000
Q1: 35000
Median: 48000
Q3: 60000
95-th percentile: 60000
Maximum: 70000
Range: 44000
Interquartile range (IQR): 25000

Descriptive statistics

Standard deviation: 10608.863
Coefficient of variation (CV): 0.22100944
Kurtosis: -1.3208172
Mean: 48001.856
Median Absolute Deviation (MAD): 12000
Skewness: 0.057795711
Sum: 77571000
Variance: 1.1254798 × 10^8
Monotonicity: Not monotonic
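
Several of these statistics can be recomputed from the 이용료 frequency table alone. The sketch below expands the table back into individual observations and uses the sample standard deviation (dividing by n - 1), which is an assumption about the estimator used in the report.

```python
import statistics

# Frequency table for 이용료, copied from this report (value -> count).
fee_counts = {
    60000: 492, 48000: 491, 35000: 491, 70000: 38, 56000: 38,
    36000: 36, 41000: 26, 42000: 2, 38000: 1, 26000: 1,
}

# Expand the table back into one value per observation.
fees = [value for value, count in fee_counts.items() for _ in range(count)]

total = sum(fees)
mean = statistics.mean(fees)
std = statistics.stdev(fees)   # sample standard deviation (n - 1)
cv = std / mean                # coefficient of variation
value_range = max(fees) - min(fees)

print(len(fees), total, round(mean, 3), round(std, 3), round(cv, 4), value_range)
```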
[Figure: Histogram with fixed size bins (bins=10)]

Common values

Value | Count | Frequency (%)
60000 | 492 | 30.4%
48000 | 491 | 30.4%
35000 | 491 | 30.4%
70000 | 38 | 2.4%
56000 | 38 | 2.4%
36000 | 36 | 2.2%
41000 | 26 | 1.6%
42000 | 2 | 0.1%
38000 | 1 | 0.1%
26000 | 1 | 0.1%

Minimum 10 values

Value | Count | Frequency (%)
26000 | 1 | 0.1%
35000 | 491 | 30.4%
36000 | 36 | 2.2%
38000 | 1 | 0.1%
41000 | 26 | 1.6%
42000 | 2 | 0.1%
48000 | 491 | 30.4%
56000 | 38 | 2.4%
60000 | 492 | 30.4%
70000 | 38 | 2.4%

Maximum 10 values

Value | Count | Frequency (%)
70000 | 38 | 2.4%
60000 | 492 | 30.4%
56000 | 38 | 2.4%
48000 | 491 | 30.4%
42000 | 2 | 0.1%
41000 | 26 | 1.6%
38000 | 1 | 0.1%
36000 | 36 | 2.2%
35000 | 491 | 30.4%
26000 | 1 | 0.1%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

 | 구분 | 강습반명 | 강습요일 | 강습정원 | 시작일 | 종료일 | 대상 | 이용료
구분 | 1.000 | 1.000 | 0.567 | 0.895 | 0.208 | 0.352 | 0.667 | 0.149
강습반명 | 1.000 | 1.000 | 0.978 | 0.934 | 0.621 | 0.645 | 0.745 | 0.452
강습요일 | 0.567 | 0.978 | 1.000 | 0.684 | 0.273 | 0.331 | 0.191 | 0.000
강습정원 | 0.895 | 0.934 | 0.684 | 1.000 | 0.801 | 0.818 | 0.452 | 0.628
시작일 | 0.208 | 0.621 | 0.273 | 0.801 | 1.000 | 1.000 | 0.612 | 0.730
종료일 | 0.352 | 0.645 | 0.331 | 0.818 | 1.000 | 1.000 | 0.692 | 0.741
대상 | 0.667 | 0.745 | 0.191 | 0.452 | 0.612 | 0.692 | 1.000 | 0.921
이용료 | 0.149 | 0.452 | 0.000 | 0.628 | 0.730 | 0.741 | 0.921 | 1.000

[Figure: correlation heatmap]

 | 강습정원 | 강습요일 | 대상 | 구분
강습정원 | 1.000 | 0.717 | 0.327 | 0.577
강습요일 | 0.717 | 1.000 | 0.130 | 0.576
대상 | 0.327 | 0.130 | 1.000 | 0.528
구분 | 0.577 | 0.576 | 0.528 | 1.000

[Figure: correlation heatmap]

 | 이용료 | 구분 | 강습요일 | 강습정원 | 대상
이용료 | 1.000 | 0.419 | 0.706 | 0.612 | 0.764
구분 | 0.419 | 1.000 | 0.576 | 0.577 | 0.528
강습요일 | 0.706 | 0.576 | 1.000 | 0.717 | 0.130
강습정원 | 0.612 | 0.577 | 0.717 | 1.000 | 0.327
대상 | 0.764 | 0.528 | 0.130 | 0.327 | 1.000
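
The high-correlation alerts near the top of the report can be cross-checked against the last (five-column) matrix in this section. The sketch below flags every pair whose coefficient exceeds 0.5. The cutoff is an assumption (the report does not state it), but with it the flagged pairs line up with the alerts: three partner fields each for 이용료, 구분, 강습요일 and 강습정원, and two for 대상.

```python
# Last correlation matrix from this section, copied row by row.
columns = ["이용료", "구분", "강습요일", "강습정원", "대상"]
matrix = [
    [1.000, 0.419, 0.706, 0.612, 0.764],
    [0.419, 1.000, 0.576, 0.577, 0.528],
    [0.706, 0.576, 1.000, 0.717, 0.130],
    [0.612, 0.577, 0.717, 1.000, 0.327],
    [0.764, 0.528, 0.130, 0.327, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, not stated in the report

def highly_correlated(columns, matrix, threshold):
    """Map each column to the other columns whose coefficient exceeds the threshold."""
    flagged = {}
    for i, name in enumerate(columns):
        partners = [columns[j] for j, value in enumerate(matrix[i])
                    if j != i and value > threshold]
        if partners:
            flagged[name] = partners
    return flagged

flagged = highly_correlated(columns, matrix, THRESHOLD)
for name, partners in flagged.items():
    print(name, "->", ", ".join(partners))
```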

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows

Index | 구분 | 강습반명 | 강습요일 | 강습정원 | 시작일 | 종료일 | 대상 | 이용료
0 | 수영장 | 강습수영 06시 초급 | 월,화,수,목,금 | 25 | 2022-12-01 | 2022-12-31 | 성인 | 70000
1 | 수영장 | 강습수영 06시 초급 | 월,화,수,목,금 | 25 | 2022-12-01 | 2022-12-31 | 청소년, 군인 | 56000
2 | 수영장 | 강습수영 06시 중급 | 월,화,수,목,금 | 25 | 2022-12-01 | 2022-12-31 | 성인 | 70000
3 | 수영장 | 강습수영 06시 중급 | 월,화,수,목,금 | 25 | 2022-12-01 | 2022-12-31 | 청소년, 군인 | 56000
4 | 수영장 | 강습수영 07시 초급 | 월,화,수,목,금 | 25 | 2022-12-01 | 2022-12-31 | 성인 | 70000
5 | 수영장 | 강습수영 07시 초급 | 월,화,수,목,금 | 25 | 2022-12-01 | 2022-12-31 | 청소년, 군인 | 56000
6 | 수영장 | 강습수영 07시 중급 | 월,화,수,목,금 | 25 | 2022-12-01 | 2022-12-31 | 성인 | 70000
7 | 수영장 | 강습수영 07시 중급 | 월,화,수,목,금 | 25 | 2022-12-01 | 2022-12-31 | 청소년, 군인 | 56000
8 | 수영장 | 강습수영 09시 초급 | 월,화,수,목,금 | 25 | 2022-12-01 | 2022-12-31 | 성인 | 70000
9 | 수영장 | 강습수영 09시 초급 | 월,화,수,목,금 | 25 | 2022-12-01 | 2022-12-31 | 청소년, 군인 | 56000

Last rows

Index | 구분 | 강습반명 | 강습요일 | 강습정원 | 시작일 | 종료일 | 대상 | 이용료
1606 | 수영장 | 강습수영 20시 초급 | 월,화,수,목,금 | 25 | 2019-08-01 | 2019-08-31 | 성인 | 60000
1607 | 수영장 | 강습수영 20시 초급 | 월,화,수,목,금 | 25 | 2019-08-01 | 2019-08-31 | 청소년,군인 | 48000
1608 | 수영장 | 강습수영 20시 초급 | 월,화,수,목,금 | 25 | 2019-08-01 | 2019-08-31 | 경로,어린이 | 35000
1609 | 수영장 | 아쿠아로빅 13시 00분 | 월,수,금 | 80 | 2019-08-01 | 2019-08-31 | 성인 | 60000
1610 | 수영장 | 아쿠아로빅 13시 00분 | 월,수,금 | 80 | 2019-08-01 | 2019-08-31 | 청소년,군인 | 48000
1611 | 수영장 | 아쿠아로빅 13시 00분 | 월,수,금 | 80 | 2019-08-01 | 2019-08-31 | 경로,어린이 | 35000
1612 | 수영장 | 아쿠아로빅 15시 10분 | 월,수,금 | 80 | 2019-08-01 | 2019-08-31 | 성인 | 60000
1613 | 수영장 | 아쿠아로빅 15시 10분 | 월,수,금 | 80 | 2019-08-01 | 2019-08-31 | 청소년,군인 | 48000
1614 | 수영장 | 아쿠아로빅 15시 10분 | 월,수,금 | 80 | 2019-08-01 | 2019-08-31 | 경로,어린이 | 35000
1615 | 밸리댄스 | 밸리댄스 11시 40분반 | 월,수,금 | 25 | 2019-08-01 | 2019-08-31 | 성인,청소년,경로 | 36000

Duplicate rows

Most frequently occurring

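Index | 구분 | 강습반명 | 강습요일 | 강습정원 | 시작일 | 종료일 | 대상 | 이용료 | # duplicates
0 | 수영장 | 강습수영 14시 교정 | 월,화,수,목,금 | 25 | 2020-02-01 | 2020-02-29 | 경로,어린이 | 35000 | 2
1 | 수영장 | 강습수영 14시 연수 | 월,화,수,목,금 | 25 | 2020-02-01 | 2020-02-29 | 성인 | 60000 | 2
2 | 수영장 | 강습수영 16시 고급 | 월,화,수,목,금 | 25 | 2020-01-01 | 2020-01-31 | 경로,어린이 | 35000 | 2
3 | 수영장 | 강습수영 16시 고급 | 월,화,수,목,금 | 25 | 2020-01-01 | 2020-01-31 | 성인 | 60000 | 2
4 | 수영장 | 강습수영 16시 고급 | 월,화,수,목,금 | 25 | 2020-01-01 | 2020-01-31 | 청소년,군인 | 48000 | 2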