Overview

Dataset statistics

Number of variables: 5
Number of observations: 676
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 27.2 KiB
Average record size in memory: 41.2 B

Variable types

Numeric: 1
Categorical: 3
Text: 1

Dataset

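Description: Green Card discount benefits at paid public-sector facilities, compiled as of January 2023 ('23.1.28.) from data provided by the Green Consumption Analysis System (greencrm.keiti.re.kr) of the Korea Environmental Industry and Technology Institute (한국환경산업기술원, KEITI).
URL: https://www.data.go.kr/data/15089158/fileData.do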

Alerts

번호 is highly overall correlated with 구분 and 1 other field [High correlation]
구분 is highly overall correlated with 번호 and 1 other field [High correlation]
지역 is highly overall correlated with 번호 and 1 other field [High correlation]
구분 is highly imbalanced (60.0%) [Imbalance]
번호 has unique values [Unique]
공공시설명 has unique values [Unique]
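
The 60.0% imbalance figure for 구분 can be reproduced from the four category counts the report lists for that variable (569, 66, 39, 2). The sketch below assumes the score is one minus the Shannon entropy of the counts, normalized by the maximum entropy for that number of classes. That formula matches the reported value, but the report does not state how the score is computed, and `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy of the counts.

    Assumed formula: 0 means perfectly balanced classes, 1 means a single
    class holds every observation.
    """
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Category counts for 구분 as listed in the report.
counts = {"지자체공공시설": 569, "국립공원": 66, "국립휴양림": 39, "국립기관": 2}
score = imbalance_score(list(counts.values()))
print(f"{score:.1%}")
```

Under this assumption the alert mainly reflects that one class, 지자체공공시설, holds 569 of the 676 rows.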

Reproduction

Analysis started: 2023-12-12 06:13:14.231760
Analysis finished: 2023-12-12 06:13:14.906905
Duration: 0.68 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

번호 (number)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 676
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 338.5
Minimum: 1
Maximum: 676
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 34.75
Q1: 169.75
median: 338.5
Q3: 507.25
95-th percentile: 642.25
Maximum: 676
Range: 675
Interquartile range (IQR): 337.5
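
These quantiles can be checked by hand. The sketch below assumes that 번호 holds the integers 1 through 676, which is consistent with the reported minimum, maximum, distinct count and sum. It also assumes percentiles are linearly interpolated between neighbouring ranks; the report does not say which interpolation rule was used, and `percentile` is a hypothetical helper.

```python
def percentile(sorted_values, q):
    """Linearly interpolated percentile of an ascending list, q in [0, 1]."""
    pos = (len(sorted_values) - 1) * q       # fractional rank, 0-based
    lower = int(pos)                          # index of the value at or below pos
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * frac

values = list(range(1, 677))  # assumed contents of 번호

p05 = percentile(values, 0.05)
q1 = percentile(values, 0.25)
median = percentile(values, 0.50)
q3 = percentile(values, 0.75)
p95 = percentile(values, 0.95)
iqr = q3 - q1

print(p05, q1, median, q3, p95, iqr)
```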

Descriptive statistics

Standard deviation: 195.28868
Coefficient of variation (CV): 0.57692371
Kurtosis: -1.2
Mean: 338.5
Median Absolute Deviation (MAD): 169
Skewness: 0
Sum: 228826
Variance: 38137.667
Monotonicity: Strictly increasing
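
Most of these figures follow directly from 번호 being a running row number. The sketch below is a minimal check with the standard library. It assumes the column holds the integers 1 through 676 and that the variance and standard deviation are the sample (n - 1) versions; those assumptions reproduce the reported numbers, but the report does not state them.

```python
import statistics

values = list(range(1, 677))  # assumed contents of 번호

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (divides by n - 1)
std = statistics.stdev(values)           # square root of the sample variance
cv = std / mean                          # coefficient of variation

# Median absolute deviation: median distance from the median.
median = statistics.median(values)
mad = statistics.median([abs(v - median) for v in values])

print(total, mean, round(variance, 3), round(std, 5), round(cv, 8), mad)
```

For a sequence 1..n the sample variance has the closed form n(n + 1)/12, which for n = 676 gives 38137.667.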
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1 | 1 | 0.1%
456 | 1 | 0.1%
448 | 1 | 0.1%
449 | 1 | 0.1%
450 | 1 | 0.1%
451 | 1 | 0.1%
452 | 1 | 0.1%
453 | 1 | 0.1%
454 | 1 | 0.1%
455 | 1 | 0.1%
Other values (666) | 666 | 98.5%

Smallest 10 values
Value | Count | Frequency (%)
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%
5 | 1 | 0.1%
6 | 1 | 0.1%
7 | 1 | 0.1%
8 | 1 | 0.1%
9 | 1 | 0.1%
10 | 1 | 0.1%

Largest 10 values
Value | Count | Frequency (%)
676 | 1 | 0.1%
675 | 1 | 0.1%
674 | 1 | 0.1%
673 | 1 | 0.1%
672 | 1 | 0.1%
671 | 1 | 0.1%
670 | 1 | 0.1%
669 | 1 | 0.1%
668 | 1 | 0.1%
667 | 1 | 0.1%

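구분 (category)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 4
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 5.4 KiB

Value | Count
지자체공공시설 (local-government public facility) | 569
국립공원 (national park) | 66
국립휴양림 (national recreation forest) | 39
국립기관 (national institution) | 2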

Length

Max length: 7
Median length: 7
Mean length: 6.5828402
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 국립기관
2nd row: 국립기관
3rd row: 국립휴양림
4th row: 국립휴양림
5th row: 국립휴양림

Common Values

Value | Count | Frequency (%)
지자체공공시설 | 569 | 84.2%
국립공원 | 66 | 9.8%
국립휴양림 | 39 | 5.8%
국립기관 | 2 | 0.3%

Length

Histogram of lengths of the category


지역 (region)
Categorical

HIGH CORRELATION

Distinct: 23
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Memory size: 5.4 KiB

Value | Count
경기 | 97
경남 | 64
강원 | 57
전남 | 48
서울 | 46
Other values (18) | 364

Length

Max length: 4
Median length: 2
Mean length: 2.1612426
Min length: 2

Unique

Unique: 2
Unique (%): 0.3%

Sample

1st row: 경상북도
2nd row: 충청남도
3rd row: 강원도
4th row: 강원도
5th row: 강원도

Common Values

Value | Count | Frequency (%)
경기 | 97 | 14.3%
경남 | 64 | 9.5%
강원 | 57 | 8.4%
전남 | 48 | 7.1%
서울 | 46 | 6.8%
충북 | 34 | 5.0%
경상도 | 32 | 4.7%
경북 | 32 | 4.7%
충남 | 31 | 4.6%
제주 | 30 | 4.4%
Other values (13) | 205 | 30.3%

Length

Histogram of lengths of the category

공공시설명 (public facility name)
Text

UNIQUE

Distinct: 676
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 5.4 KiB

Length

Max length: 29
Median length: 22
Mean length: 8.4423077
Min length: 3

Characters and Unicode

Total characters: 5707
Distinct characters: 388
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
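
As an illustration of those character properties, the sketch below counts the Unicode general category of each character in one of the sample facility names, using Python's standard `unicodedata` module. The `CATEGORY_NAMES` mapping and the `category_counts` helper are hypothetical names introduced here; the mapping covers only the eight categories that appear in this report.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Lu": "Uppercase Letter",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
}

def category_counts(text):
    """Count characters by Unicode general category (long name where known)."""
    codes = (unicodedata.category(ch) for ch in text)
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)

sample = "가리왕산 국립휴양림"  # 3rd sample row of 공공시설명
counts = category_counts(sample)
print(counts)
```

Hangul syllables fall under Other Letter, which is why that category dominates the totals for this column.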

Unique

Unique: 676
Unique (%): 100.0%

Sample

1st row: 국립낙동강생물자원관
2nd row: 서천국립생태원
3rd row: 가리왕산 국립휴양림
4th row: 검봉산 국립휴양림
5th row: 대관령 국립휴양림

Value | Count | Frequency (%)
국립휴양림 | 39 | 4.0%
주차장 | 25 | 2.6%
자동차야영장 | 20 | 2.1%
[illegible] | 8 | 0.8%
청소년수련관 | 7 | 0.7%
여성회관 | 6 | 0.6%
점용면적야영장 | 5 | 0.5%
일반야영장 | 5 | 0.5%
북구 | 4 | 0.4%
체육센터 | 4 | 0.4%
Other values (772) | 848 | 87.3%

Most occurring characters

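Value | Count | Frequency (%)
(space) | 295 | 5.2%
[illegible] | 258 | 4.5%
[illegible] | 151 | 2.6%
[illegible] | 117 | 2.1%
[illegible] | 115 | 2.0%
[illegible] | 114 | 2.0%
[illegible] | 105 | 1.8%
[illegible] | 104 | 1.8%
[illegible] | 100 | 1.8%
[illegible] | 98 | 1.7%
Other values (378) | 4250 | 74.5%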

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 5357 | 93.9%
Space Separator | 295 | 5.2%
Decimal Number | 18 | 0.3%
Open Punctuation | 12 | 0.2%
Close Punctuation | 12 | 0.2%
Uppercase Letter | 8 | 0.1%
Other Punctuation | 4 | 0.1%
Dash Punctuation | 1 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[illegible] | 258 | 4.8%
[illegible] | 151 | 2.8%
[illegible] | 117 | 2.2%
[illegible] | 115 | 2.1%
[illegible] | 114 | 2.1%
[illegible] | 105 | 2.0%
[illegible] | 104 | 1.9%
[illegible] | 100 | 1.9%
[illegible] | 98 | 1.8%
[illegible] | 93 | 1.7%
Other values (359) | 4102 | 76.6%

Decimal Number
Value | Count | Frequency (%)
2 | 5 | 27.8%
1 | 4 | 22.2%
3 | 3 | 16.7%
5 | 3 | 16.7%
6 | 1 | 5.6%
4 | 1 | 5.6%
0 | 1 | 5.6%

Uppercase Letter
Value | Count | Frequency (%)
D | 2 | 25.0%
M | 2 | 25.0%
Z | 2 | 25.0%
N | 1 | 12.5%
G | 1 | 12.5%

Other Punctuation
Value | Count | Frequency (%)
, | 2 | 50.0%
/ | 1 | 25.0%
. | 1 | 25.0%

Space Separator
Value | Count | Frequency (%)
(space) | 295 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 12 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 12 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 5357 | 93.9%
Common | 342 | 6.0%
Latin | 8 | 0.1%

Most frequent character per script

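Hangul
Value | Count | Frequency (%)
[illegible] | 258 | 4.8%
[illegible] | 151 | 2.8%
[illegible] | 117 | 2.2%
[illegible] | 115 | 2.1%
[illegible] | 114 | 2.1%
[illegible] | 105 | 2.0%
[illegible] | 104 | 1.9%
[illegible] | 100 | 1.9%
[illegible] | 98 | 1.8%
[illegible] | 93 | 1.7%
Other values (359) | 4102 | 76.6%

Common
Value | Count | Frequency (%)
(space) | 295 | 86.3%
( | 12 | 3.5%
) | 12 | 3.5%
2 | 5 | 1.5%
1 | 4 | 1.2%
3 | 3 | 0.9%
5 | 3 | 0.9%
, | 2 | 0.6%
6 | 1 | 0.3%
/ | 1 | 0.3%
Other values (4) | 4 | 1.2%

Latin
Value | Count | Frequency (%)
D | 2 | 25.0%
M | 2 | 25.0%
Z | 2 | 25.0%
N | 1 | 12.5%
G | 1 | 12.5%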

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 5357 | 93.9%
ASCII | 350 | 6.1%

Most frequent character per block

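ASCII
Value | Count | Frequency (%)
(space) | 295 | 84.3%
( | 12 | 3.4%
) | 12 | 3.4%
2 | 5 | 1.4%
1 | 4 | 1.1%
3 | 3 | 0.9%
5 | 3 | 0.9%
D | 2 | 0.6%
M | 2 | 0.6%
Z | 2 | 0.6%
Other values (9) | 10 | 2.9%

Hangul
Value | Count | Frequency (%)
[illegible] | 258 | 4.8%
[illegible] | 151 | 2.8%
[illegible] | 117 | 2.2%
[illegible] | 115 | 2.1%
[illegible] | 114 | 2.1%
[illegible] | 105 | 2.0%
[illegible] | 104 | 1.9%
[illegible] | 100 | 1.9%
[illegible] | 98 | 1.8%
[illegible] | 93 | 1.7%
Other values (359) | 4102 | 76.6%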
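
할인정보(할인율 또는 할인가) (discount information: discount rate or discounted price)
Categorical

Distinct: 17
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Memory size: 5.4 KiB

Value | Count
10% | 192
100% | 145
50% | 125
5% | 80
20% | 46
Other values (12) | 88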

Length

Max length: 5
Median length: 3
Mean length: 3.158284
Min length: 2

Unique

Unique: 4
Unique (%): 0.6%

Sample

1st row: 30%
2nd row: 30%
3rd row: 100%
4th row: 100%
5th row: 100%

Common Values

Value | Count | Frequency (%)
10% | 192 | 28.4%
100% | 145 | 21.4%
50% | 125 | 18.5%
5% | 80 | 11.8%
20% | 46 | 6.8%
30% | 34 | 5.0%
3000원 | 20 | 3.0%
3% | 13 | 1.9%
2000원 | 6 | 0.9%
33% | 5 | 0.7%
Other values (7) | 10 | 1.5%
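
The values in this column mix two units: percentage rates such as 30% and won amounts such as 3000원. That is why the profiler treats the column as categorical rather than numeric. The sketch below shows one possible way to split the two before further analysis. `parse_discount` and the labels `rate_percent` and `amount_krw` are hypothetical, and the seven values grouped under "Other values" are not shown in the report, so anything that does not match the two visible formats is returned as `unknown` rather than guessed.

```python
import re

# A number followed by either a percent sign or the won character.
DISCOUNT_PATTERN = re.compile(r"^(\d+(?:\.\d+)?)\s*(%|원)$")

def parse_discount(raw):
    """Split a raw discount string into (kind, number).

    kind is "rate_percent" for values like "30%", "amount_krw" for values
    like "3000원", and "unknown" for any other format.
    """
    match = DISCOUNT_PATTERN.match(raw.strip())
    if match is None:
        return ("unknown", None)
    number, unit = match.groups()
    kind = "rate_percent" if unit == "%" else "amount_krw"
    return (kind, float(number))

for raw in ["10%", "100%", "3000원", "2000원"]:
    print(raw, parse_discount(raw))
```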

Length

Histogram of lengths of the category

Interactions


Correlations

Variable | 번호 | 구분 | 지역 | 할인정보(할인율 또는 할인가)
번호 | 1.000 | 0.776 | 0.971 | 0.631
구분 | 0.776 | 1.000 | 0.945 | 0.681
지역 | 0.971 | 0.945 | 1.000 | 0.680
할인정보(할인율 또는 할인가) | 0.631 | 0.681 | 0.680 | 1.000

Variable | 지역 | 할인정보(할인율 또는 할인가) | 구분
지역 | 1.000 | 0.267 | 0.815
할인정보(할인율 또는 할인가) | 0.267 | 1.000 | 0.449
구분 | 0.815 | 0.449 | 1.000

Variable | 번호 | 구분 | 지역 | 할인정보(할인율 또는 할인가)
번호 | 1.000 | 0.589 | 0.836 | 0.303
구분 | 0.589 | 1.000 | 0.815 | 0.449
지역 | 0.836 | 0.815 | 1.000 | 0.267
할인정보(할인율 또는 할인가) | 0.303 | 0.449 | 0.267 | 1.000
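
The report text does not name the coefficient behind each matrix. For pairs of categorical variables such as 구분 and 지역, one common association measure is Cramér's V. The sketch below computes the plain, uncorrected version on the ten rows shown at the top of the report's sample table. It is an illustration of the technique under that assumption, not a reconstruction of the profiler's computation, and `cramers_v` is a hypothetical helper.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long lists of labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))      # observed count per (row, column) cell
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    chi2 = 0.0
    for row, row_count in row_totals.items():
        for col, col_count in col_totals.items():
            expected = row_count * col_count / n
            chi2 += (joint[(row, col)] - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# 구분 and 지역 for the first ten rows of the sample table.
kind = ["국립기관"] * 2 + ["국립휴양림"] * 8
region = ["경상북도", "충청남도"] + ["강원도"] * 8

v = cramers_v(kind, region)
print(round(v, 3))
```

On these ten rows each region maps to exactly one category, so the association is perfect. The 0.815 reported for the same pair covers all 676 rows and is not expected to match a ten-row excerpt.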

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

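First rows
Index | 번호 | 구분 | 지역 | 공공시설명 | 할인정보(할인율 또는 할인가)
0 | 1 | 국립기관 | 경상북도 | 국립낙동강생물자원관 | 30%
1 | 2 | 국립기관 | 충청남도 | 서천국립생태원 | 30%
2 | 3 | 국립휴양림 | 강원도 | 가리왕산 국립휴양림 | 100%
3 | 4 | 국립휴양림 | 강원도 | 검봉산 국립휴양림 | 100%
4 | 5 | 국립휴양림 | 강원도 | 대관령 국립휴양림 | 100%
5 | 6 | 국립휴양림 | 강원도 | 두타산 국립휴양림 | 100%
6 | 7 | 국립휴양림 | 강원도 | 미천골 국립휴양림 | 100%
7 | 8 | 국립휴양림 | 강원도 | 방태산 국립휴양림 | 100%
8 | 9 | 국립휴양림 | 강원도 | 백운산 국립휴양림 | 100%
9 | 10 | 국립휴양림 | 강원도 | 복주산 국립휴양림 | 100%

Last rows
Index | 번호 | 구분 | 지역 | 공공시설명 | 할인정보(할인율 또는 할인가)
666 | 667 | 지자체공공시설 | 충북 | 진천종박물관 | 50%
667 | 668 | 지자체공공시설 | 충북 | 진천화랑관 | 50%
668 | 669 | 지자체공공시설 | 충북 | 문의문화재단지 | 100%
669 | 670 | 지자체공공시설 | 충북 | 청원국민체육센터 | 10%
670 | 671 | 지자체공공시설 | 충북 | 청주랜드관리사업소 | 100%
671 | 672 | 지자체공공시설 | 충북 | 청주실내수영장 | 10%
672 | 673 | 지자체공공시설 | 충북 | 충주고구려천문과학관 | 30%
673 | 674 | 지자체공공시설 | 충북 | 수안보인공암벽장 | 30%
674 | 675 | 지자체공공시설 | 충북 | 충주자연생태체험관 | 50%
675 | 676 | 지자체공공시설 | 충북 | 중앙탑사적공원 | 100%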