Overview

Dataset statistics

Number of variables: 6
Number of observations: 689
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 8
Duplicate rows (%): 1.2%
Total size in memory: 32.4 KiB
Average record size in memory: 48.2 B
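
The two derived figures in this table follow from the raw ones by simple arithmetic. The sketch below re-derives them, assuming 1 KiB = 1024 bytes and one decimal place of rounding; because the reported total size is itself rounded, the result is approximate.

```python
# Figures copied from the "Dataset statistics" table.
n_rows = 689
duplicate_rows = 8
total_size_kib = 32.4

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = round(100 * duplicate_rows / n_rows, 1)

# Average record size in bytes (assumes 1 KiB = 1024 B).
avg_record_bytes = round(total_size_kib * 1024 / n_rows, 1)

print(f"Duplicate rows (%): {duplicate_pct}%")
print(f"Average record size in memory: {avg_record_bytes} B")
```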

Variable types

Categorical: 4
Text: 2

Dataset

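Description: 경기도여성가족재단_통계지표목록 (Gyeonggi-do Women and Family Foundation, list of statistical indicators)
Author: 경기도여성가족재단 (Gyeonggi-do Women and Family Foundation)
URL: https://data.gg.go.kr/portal/data/service/selectServicePage.do?&infId=VWG6ZNRMI4WU4FUTIS9L30307649&infSeq=1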

Alerts

Dataset has 8 (1.2%) duplicate rows [Duplicates]
영역명 is highly overall correlated with 분야명 and 1 other field [High correlation]
분야명 is highly overall correlated with 영역명 [High correlation]
갱신시기 is highly overall correlated with 영역명 [High correlation]
통계유형 is highly imbalanced (97.1%) [Imbalance]

Reproduction

Analysis started: 2024-04-29 13:34:06.638062
Analysis finished: 2024-04-29 13:34:08.526986
Duration: 1.89 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
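
As a rough sketch of how these values relate, the snippet below recomputes the duration from the two timestamps and outlines how a comparable report might be regenerated. The function name `build_report` and both path arguments are hypothetical, and the encoding of the source export is not stated in this report, so `read_csv` may need an explicit `encoding=` argument.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2024-04-29 13:34:06.638062", FMT)
finished = datetime.strptime("2024-04-29 13:34:08.526986", FMT)

# Elapsed wall-clock time between the two timestamps, in seconds.
duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")


def build_report(csv_path, html_path):
    """Profile a CSV export of the dataset (hypothetical helper and paths)."""
    # Imported lazily so the timing check above runs without these packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    ProfileReport(df, title="경기도여성가족재단_통계지표목록").to_file(html_path)
```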

Variables

분야명 (sector name)
Categorical

HIGH CORRELATION 

Distinct: 11
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 KiB

Value | Count
경제활동 | 103
가구 및 가족 | 90
안전및환경 | 86
복지 | 81
건강 | 80
Other values (6) | 249

Length

Max length: 8
Median length: 7
Mean length: 4.0174165
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 경제활동
2nd row: 경제활동
3rd row: 경제활동
4th row: 경제활동
5th row: 경제활동

Common Values

Value | Count | Frequency (%)
경제활동 | 103 | 14.9%
가구 및 가족 | 90 | 13.1%
안전및환경 | 86 | 12.5%
복지 | 81 | 11.8%
건강 | 80 | 11.6%
교육 | 70 | 10.2%
인구 | 51 | 7.4%
아동돌봄 | 48 | 7.0%
정치및사회참여 | 46 | 6.7%
문화및정보미디어 | 25 | 3.6%

Length

[Figure: Histogram of lengths of the category]

Words

Value | Count | Frequency (%)
경제활동 | 103 | 11.9%
가구 | 90 | 10.4%
(token not rendered) | 90 | 10.4%
가족 | 90 | 10.4%
안전및환경 | 86 | 9.9%
복지 | 81 | 9.3%
건강 | 80 | 9.2%
교육 | 70 | 8.1%
인구 | 51 | 5.9%
아동돌봄 | 48 | 5.5%
Other values (3) | 80 | 9.2%
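
The word-level counts differ from the category counts because multi-word values contribute one count per token. A minimal sketch, assuming plain whitespace tokenisation (the report's actual tokeniser is not documented here) and using only the ten values listed under Common Values; the 11th value, covering the remaining 9 rows, is not listed in the report and is left out.

```python
from collections import Counter

# Value counts listed under "Common Values" for 분야명 (top 10 only).
value_counts = {
    "경제활동": 103, "가구 및 가족": 90, "안전및환경": 86, "복지": 81, "건강": 80,
    "교육": 70, "인구": 51, "아동돌봄": 48, "정치및사회참여": 46, "문화및정보미디어": 25,
}

word_counts = Counter()
for value, count in value_counts.items():
    # Each whitespace-separated token inherits the count of its value.
    for token in value.split():
        word_counts[token] += count

for token, count in word_counts.most_common(4):
    print(token, count)
```

With the 9 unlisted rows added as single tokens, the token total would be 869, which is consistent with the 11.9% shown for 경제활동 (103 / 869).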

영역명 (area name)
Categorical

HIGH CORRELATION 

Distinct: 39
Distinct (%): 5.7%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 KiB

Value | Count
사회복지서비스 | 58
경제활동참여 | 52
시도비교 | 39
건강상태 및 행태 | 36
가구구성 | 32
Other values (34) | 472

Length

Max length: 10
Median length: 4
Mean length: 4.9521045
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 경제활동참여
2nd row: 경제활동참여
3rd row: 경제활동참여
4th row: 경제활동참여
5th row: 경제활동참여

Common Values

Value | Count | Frequency (%)
사회복지서비스 | 58 | 8.4%
경제활동참여 | 52 | 7.5%
시도비교 | 39 | 5.7%
건강상태 및 행태 | 36 | 5.2%
가구구성 | 32 | 4.6%
안전정책 | 32 | 4.6%
취업태도 | 27 | 3.9%
가족형성 | 25 | 3.6%
교육기회 | 25 | 3.6%
돌봄시설 | 21 | 3.0%
Other values (29) | 342 | 49.6%

Length

[Figure: Histogram of lengths of the category]

Words

Value | Count | Frequency (%)
(token not rendered) | 64 | 7.6%
사회복지서비스 | 58 | 6.9%
경제활동참여 | 52 | 6.2%
시도비교 | 39 | 4.6%
건강상태 | 36 | 4.3%
행태 | 36 | 4.3%
가구구성 | 32 | 3.8%
안전정책 | 32 | 3.8%
사망 | 27 | 3.2%
취업태도 | 27 | 3.2%
Other values (34) | 438 | 52.1%

통계파일명 (statistics file name)
Text

Distinct: 112
Distinct (%): 16.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 KiB

Length

Max length: 19
Median length: 18
Mean length: 5.2307692
Min length: 2

Characters and Unicode

Total characters: 3604
Distinct characters: 170
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
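
As an illustration of the kind of property lookup described here, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. The `LABELS` mapping and the helper name `category_counts` are illustrative choices, and the three sample strings are values that appear elsewhere in this report; the report's own implementation may differ.

```python
import unicodedata
from collections import Counter

# General-category abbreviation -> label as printed in the report's tables.
LABELS = {
    "Lo": "Other Letter", "Zs": "Space Separator", "Nd": "Decimal Number",
    "Sm": "Math Symbol", "Ps": "Open Punctuation", "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation", "Po": "Other Punctuation", "Lu": "Uppercase Letter",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[LABELS.get(code, code)] += 1
    return counts


sample = ["취업자", "6월+12월", "연령별 취업자의 종사상 지위(2018-2020)"]
for label, count in category_counts(sample).most_common():
    print(label, count)
```

Hangul syllables fall under "Other Letter", which is why that category dominates the tables for the text variables.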

Unique

Unique: 16
Unique (%): 2.3%

Sample

1st row: 취업자
2nd row: 취업자
3rd row: 취업자
4th row: 취업자
5th row: 취업자

Words

Value | Count | Frequency (%)
시도비교 | 39 | 4.5%
가구형태 | 32 | 3.7%
젠더폭력 | 28 | 3.2%
직업선택 | 27 | 3.1%
피해자 | 26 | 3.0%
지원 | 26 | 3.0%
(token not rendered) | 24 | 2.7%
노인복지 | 21 | 2.4%
초중고등학교 | 20 | 2.3%
취업자 | 16 | 1.8%
Other values (128) | 617 | 70.4%

Most occurring characters

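Value | Count | Frequency (%)
(space) | 187 | 5.2%
(character not rendered) | 145 | 4.0%
(character not rendered) | 129 | 3.6%
(character not rendered) | 107 | 3.0%
(character not rendered) | 93 | 2.6%
(character not rendered) | 79 | 2.2%
(character not rendered) | 69 | 1.9%
(character not rendered) | 65 | 1.8%
(character not rendered) | 65 | 1.8%
(character not rendered) | 63 | 1.7%
Other values (160) | 2602 | 72.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3401 | 94.4%
Space Separator | 187 | 5.2%
Decimal Number | 12 | 0.3%
Math Symbol | 4 | 0.1%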

Most frequent character per category

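Other Letter
Value | Count | Frequency (%)
(character not rendered) | 145 | 4.3%
(character not rendered) | 129 | 3.8%
(character not rendered) | 107 | 3.1%
(character not rendered) | 93 | 2.7%
(character not rendered) | 79 | 2.3%
(character not rendered) | 69 | 2.0%
(character not rendered) | 65 | 1.9%
(character not rendered) | 65 | 1.9%
(character not rendered) | 63 | 1.9%
(character not rendered) | 60 | 1.8%
Other values (156) | 2526 | 74.3%

Decimal Number
Value | Count | Frequency (%)
1 | 6 | 50.0%
9 | 6 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 187 | 100.0%

Math Symbol
Value | Count | Frequency (%)
+ | 4 | 100.0%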

Most occurring scripts

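Value | Count | Frequency (%)
Hangul | 3401 | 94.4%
Common | 203 | 5.6%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(character not rendered) | 145 | 4.3%
(character not rendered) | 129 | 3.8%
(character not rendered) | 107 | 3.1%
(character not rendered) | 93 | 2.7%
(character not rendered) | 79 | 2.3%
(character not rendered) | 69 | 2.0%
(character not rendered) | 65 | 1.9%
(character not rendered) | 65 | 1.9%
(character not rendered) | 63 | 1.9%
(character not rendered) | 60 | 1.8%
Other values (156) | 2526 | 74.3%

Common
Value | Count | Frequency (%)
(space) | 187 | 92.1%
1 | 6 | 3.0%
9 | 6 | 3.0%
+ | 4 | 2.0%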

Most occurring blocks

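Value | Count | Frequency (%)
Hangul | 3401 | 94.4%
ASCII | 203 | 5.6%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 187 | 92.1%
1 | 6 | 3.0%
9 | 6 | 3.0%
+ | 4 | 2.0%

Hangul
Value | Count | Frequency (%)
(character not rendered) | 145 | 4.3%
(character not rendered) | 129 | 3.8%
(character not rendered) | 107 | 3.1%
(character not rendered) | 93 | 2.7%
(character not rendered) | 79 | 2.3%
(character not rendered) | 69 | 2.0%
(character not rendered) | 65 | 1.9%
(character not rendered) | 65 | 1.9%
(character not rendered) | 63 | 1.9%
(character not rendered) | 60 | 1.8%
Other values (156) | 2526 | 74.3%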

통계지표명 (statistical indicator name)
Text

Distinct: 680
Distinct (%): 98.7%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 KiB

Length

Max length: 47
Median length: 37
Mean length: 22.232221
Min length: 8

Characters and Unicode

Total characters: 15318
Distinct characters: 348
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 671
Unique (%): 97.4%

Sample

1st row: 연령별 취업자의 종사상 지위(2018-2020)
2nd row: 취업자의 직업분포(2018-2020)
3rd row: 교육정도별 취업자의 직업분포(2018-2020)
4th row: 혼인상태별 취업자의 직업분포(2018-2020)
5th row: 취업자의 산업분포(2018-2020)

Words

Value | Count | Frequency (%)
(token not rendered) | 131 | 5.3%
현황(2021 | 40 | 1.6%
시도별 | 40 | 1.6%
연령별 | 36 | 1.5%
대한 | 36 | 1.5%
현황(2018-2020 | 34 | 1.4%
경기도 | 28 | 1.1%
시도 | 21 | 0.8%
어린이집 | 19 | 0.8%
수(2018-2020 | 18 | 0.7%
Other values (1104) | 2072 | 83.7%

Most occurring characters

Value | Count | Frequency (%)
(space) | 1786 | 11.7%
2 | 1504 | 9.8%
0 | 1350 | 8.8%
1 | 723 | 4.7%
) | 629 | 4.1%
( | 627 | 4.1%
8 | 247 | 1.6%
- | 233 | 1.5%
(character not rendered) | 225 | 1.5%
(character not rendered) | 216 | 1.4%
Other values (338) | 7778 | 50.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 7754 | 50.6%
Decimal Number | 4083 | 26.7%
Space Separator | 1786 | 11.7%
Close Punctuation | 629 | 4.1%
Open Punctuation | 627 | 4.1%
Dash Punctuation | 233 | 1.5%
Other Punctuation | 184 | 1.2%
Math Symbol | 15 | 0.1%
Uppercase Letter | 7 | < 0.1%

Most frequent character per category

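Other Letter
Value | Count | Frequency (%)
(character not rendered) | 225 | 2.9%
(character not rendered) | 216 | 2.8%
(character not rendered) | 196 | 2.5%
(character not rendered) | 168 | 2.2%
(character not rendered) | 152 | 2.0%
(character not rendered) | 151 | 1.9%
(character not rendered) | 150 | 1.9%
(character not rendered) | 148 | 1.9%
(character not rendered) | 147 | 1.9%
(character not rendered) | 136 | 1.8%
Other values (312) | 6065 | 78.2%

Decimal Number
Value | Count | Frequency (%)
2 | 1504 | 36.8%
0 | 1350 | 33.1%
1 | 723 | 17.7%
8 | 247 | 6.0%
9 | 101 | 2.5%
7 | 67 | 1.6%
5 | 43 | 1.1%
6 | 37 | 0.9%
4 | 6 | 0.1%
3 | 5 | 0.1%

Uppercase Letter
Value | Count | Frequency (%)
C | 2 | 28.6%
E | 1 | 14.3%
Q | 1 | 14.3%
T | 1 | 14.3%
V | 1 | 14.3%
D | 1 | 14.3%

Other Punctuation
Value | Count | Frequency (%)
, | 171 | 92.9%
. | 5 | 2.7%
: | 4 | 2.2%
? | 4 | 2.2%

Math Symbol
Value | Count | Frequency (%)
~ | 13 | 86.7%
(character not rendered) | 2 | 13.3%

Space Separator
Value | Count | Frequency (%)
(space) | 1786 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 629 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 627 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 233 | 100.0%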

Most occurring scripts

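Value | Count | Frequency (%)
Hangul | 7754 | 50.6%
Common | 7557 | 49.3%
Latin | 7 | < 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(character not rendered) | 225 | 2.9%
(character not rendered) | 216 | 2.8%
(character not rendered) | 196 | 2.5%
(character not rendered) | 168 | 2.2%
(character not rendered) | 152 | 2.0%
(character not rendered) | 151 | 1.9%
(character not rendered) | 150 | 1.9%
(character not rendered) | 148 | 1.9%
(character not rendered) | 147 | 1.9%
(character not rendered) | 136 | 1.8%
Other values (312) | 6065 | 78.2%

Common
Value | Count | Frequency (%)
(space) | 1786 | 23.6%
2 | 1504 | 19.9%
0 | 1350 | 17.9%
1 | 723 | 9.6%
) | 629 | 8.3%
( | 627 | 8.3%
8 | 247 | 3.3%
- | 233 | 3.1%
, | 171 | 2.3%
9 | 101 | 1.3%
Other values (10) | 186 | 2.5%

Latin
Value | Count | Frequency (%)
C | 2 | 28.6%
E | 1 | 14.3%
Q | 1 | 14.3%
T | 1 | 14.3%
V | 1 | 14.3%
D | 1 | 14.3%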

Most occurring blocks

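Value | Count | Frequency (%)
Hangul | 7754 | 50.6%
ASCII | 7562 | 49.4%
Arrows | 2 | < 0.1%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 1786 | 23.6%
2 | 1504 | 19.9%
0 | 1350 | 17.9%
1 | 723 | 9.6%
) | 629 | 8.3%
( | 627 | 8.3%
8 | 247 | 3.3%
- | 233 | 3.1%
, | 171 | 2.3%
9 | 101 | 1.3%
Other values (15) | 191 | 2.5%

Hangul
Value | Count | Frequency (%)
(character not rendered) | 225 | 2.9%
(character not rendered) | 216 | 2.8%
(character not rendered) | 196 | 2.5%
(character not rendered) | 168 | 2.2%
(character not rendered) | 152 | 2.0%
(character not rendered) | 151 | 1.9%
(character not rendered) | 150 | 1.9%
(character not rendered) | 148 | 1.9%
(character not rendered) | 147 | 1.9%
(character not rendered) | 136 | 1.8%
Other values (312) | 6065 | 78.2%

Arrows
Value | Count | Frequency (%)
(character not rendered) | 2 | 100.0%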

통계유형 (statistics type)
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 KiB

Value | Count
기년통계 | 687
추계 | 2

Length

Max length: 4
Median length: 4
Mean length: 3.9941945
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 기년통계
2nd row: 기년통계
3rd row: 기년통계
4th row: 기년통계
5th row: 기년통계

Common Values

Value | Count | Frequency (%)
기년통계 | 687 | 99.7%
추계 | 2 | 0.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Words

Value | Count | Frequency (%)
기년통계 | 687 | 99.7%
추계 | 2 | 0.3%
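
The 97.1% imbalance reported for 통계유형 is consistent with a normalised-entropy score, 1 - H / log2(k), where H is the Shannon entropy of the value distribution and k is the number of distinct values. The sketch below applies that formula to the two counts above; that the profiler computes its score exactly this way is an assumption, and `imbalance_score` is an illustrative name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform column, near 1 when one value dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    # Shannon entropy in bits, skipping empty classes.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


score = imbalance_score([687, 2])  # counts for 기년통계 and 추계
print(f"{score:.1%}")
```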

갱신시기 (update period)
Categorical

HIGH CORRELATION 

Distinct: 10
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 KiB

Value | Count
12월 | 215
6월+12월 | 105
11월 | 93
8월 | 68
6월 | 54
Other values (5) | 154

Length

Max length: 6
Median length: 3
Mean length: 3.1886792
Min length: 2

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 6월+12월
2nd row: 6월+12월
3rd row: 6월+12월
4th row: 6월+12월
5th row: 6월+12월

Common Values

Value | Count | Frequency (%)
12월 | 215 | 31.2%
6월+12월 | 105 | 15.2%
11월 | 93 | 13.5%
8월 | 68 | 9.9%
6월 | 54 | 7.8%
7월 | 54 | 7.8%
10월 | 51 | 7.4%
9월 | 38 | 5.5%
7월+11월 | 10 | 1.5%
1월 | 1 | 0.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Words

Value | Count | Frequency (%)
12월 | 215 | 31.2%
6월+12월 | 105 | 15.2%
11월 | 93 | 13.5%
8월 | 68 | 9.9%
6월 | 54 | 7.8%
7월 | 54 | 7.8%
10월 | 51 | 7.4%
9월 | 38 | 5.5%
7월+11월 | 10 | 1.5%
1월 | 1 | 0.1%
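
Because all ten values of 갱신시기 are listed, the summary statistics for this variable can be re-derived from the value counts alone. A minimal sketch, assuming string length is counted in Unicode code points (as Python's `len` does for precomposed Hangul):

```python
import statistics

# Value counts from the "Common Values" table for 갱신시기.
value_counts = {
    "12월": 215, "6월+12월": 105, "11월": 93, "8월": 68, "6월": 54,
    "7월": 54, "10월": 51, "9월": 38, "7월+11월": 10, "1월": 1,
}

n_rows = sum(value_counts.values())
# One length per observation, expanded from the counts.
lengths = [len(value) for value, count in value_counts.items() for _ in range(count)]

mean_length = sum(lengths) / n_rows
distinct_pct = round(100 * len(value_counts) / n_rows, 1)
# "Unique" counts values that occur exactly once.
unique_values = [value for value, count in value_counts.items() if count == 1]

print(n_rows, max(lengths), min(lengths), statistics.median(lengths))
print(round(mean_length, 7), distinct_pct, unique_values)
```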

Correlations

Variable | 분야명 | 영역명 | 통계유형 | 갱신시기
분야명 | 1.000 | 0.993 | 0.156 | 0.776
영역명 | 0.993 | 1.000 | 0.346 | 0.957
통계유형 | 0.156 | 0.346 | 1.000 | 0.000
갱신시기 | 0.776 | 0.957 | 0.000 | 1.000

Variable | 갱신시기 | 영역명 | 통계유형 | 분야명
갱신시기 | 1.000 | 0.735 | 0.000 | 0.472
영역명 | 0.735 | 1.000 | 0.283 | 0.912
통계유형 | 0.000 | 0.283 | 1.000 | 0.148
분야명 | 0.472 | 0.912 | 0.148 | 1.000

Variable | 분야명 | 영역명 | 통계유형 | 갱신시기
분야명 | 1.000 | 0.912 | 0.148 | 0.472
영역명 | 0.912 | 1.000 | 0.283 | 0.735
통계유형 | 0.148 | 0.283 | 1.000 | 0.000
갱신시기 | 0.472 | 0.735 | 0.000 | 1.000
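
Associations between categorical columns are commonly measured with Cramér's V, which is derived from the chi-squared statistic of the contingency table. The sketch below shows the uncorrected form on hypothetical toy data; which measure and which corrections produced the matrices above is not stated in this report, so treat this as an illustration of the technique only.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Hypothetical toy data: every area belongs to exactly one sector.
sector = ["A", "A", "B", "B", "C", "C"]
area = ["a1", "a2", "b1", "b1", "c1", "c2"]
print(round(cramers_v(sector, area), 3))
```

When one column is fully determined by the other, as in the toy data, the statistic reaches its maximum. A nested hierarchy of that kind would be one plausible explanation for the high value between 분야명 and 영역명.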

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows

Index | 분야명 | 영역명 | 통계파일명 | 통계지표명 | 통계유형 | 갱신시기
0 | 경제활동 | 경제활동참여 | 취업자 | 연령별 취업자의 종사상 지위(2018-2020) | 기년통계 | 6월+12월
1 | 경제활동 | 경제활동참여 | 취업자 | 취업자의 직업분포(2018-2020) | 기년통계 | 6월+12월
2 | 경제활동 | 경제활동참여 | 취업자 | 교육정도별 취업자의 직업분포(2018-2020) | 기년통계 | 6월+12월
3 | 경제활동 | 경제활동참여 | 취업자 | 혼인상태별 취업자의 직업분포(2018-2020) | 기년통계 | 6월+12월
4 | 경제활동 | 경제활동참여 | 취업자 | 취업자의 산업분포(2018-2020) | 기년통계 | 6월+12월
5 | 경제활동 | 경제활동참여 | 취업자 | 사업체규모별 취업자의 산업분포(2019) | 기년통계 | 6월+12월
6 | 경제활동 | 경제활동참여 | 취업자 | 교육정도별 취업자의 산업분포(2018-2020) | 기년통계 | 6월+12월
7 | 경제활동 | 경제활동참여 | 취업자 | 혼인상태별 취업자의 산업분포(2018-2020) | 기년통계 | 6월+12월
8 | 경제활동 | 경제활동참여 | 취업자 | 연령별고용률(2021) | 기년통계 | 6월+12월
9 | 경제활동 | 경제활동참여 | 취업자 | 혼인상태별 고용률(2021) | 기년통계 | 6월+12월

Last rows

Index | 분야명 | 영역명 | 통계파일명 | 통계지표명 | 통계유형 | 갱신시기
679 | 가구 및 가족 | 가족형성 | 이혼 | 외국인과의 이혼(2021) | 기년통계 | 6월+12월
680 | 가구 및 가족 | 가족생활 | 소득 | 월평균 가구소득(2018,2020) | 기년통계 | 9월
681 | 가구 및 가족 | 가족생활 | 주택점유형태 | 주택 점유형태(2018,2020) | 기년통계 | 9월
682 | 가구 및 가족 | 가족생활 | 가족관계 | 가족관계에 대한 만족도(2018.2020) | 기년통계 | 12월
683 | 가구 및 가족 | 가족생활 | 가족관계 | 청소년 고민상담 대상(2018,2020) | 기년통계 | 12월
684 | 가구 및 가족 | 가족생활 | 가족관계 | 전반적인 가족관계 만족도(2018) | 기년통계 | 12월
685 | 가구 및 가족 | 가족생활 | 가사노동 | 가사노동시간(2019) | 기년통계 | 9월
686 | 가구 및 가족 | 가족생활 | 가사노동 | 가사분담에 대한 태도(2018,2020) | 기년통계 | 9월
687 | 가구 및 가족 | 가족생활 | 가사노동 | 가사 분담 실태(2018,2020) | 기년통계 | 9월
688 | 가구 및 가족 | 가족생활 | 가사노동 | 돌봄노동 시간(2019) | 기년통계 | 9월

Duplicate rows

Most frequently occurring

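Index | 분야명 | 영역명 | 통계파일명 | 통계지표명 | 통계유형 | 갱신시기 | # duplicates
0 | 아동돌봄 | 돌봄지원 | 보육료지원 | 보육료 지원 현황(2018-2020) | 기년통계 | 12월 | 2
1 | 아동돌봄 | 돌봄지원 | 보육료지원 | 보육료 지원 현황(2021) | 기년통계 | 12월 | 2
2 | 아동돌봄 | 돌봄지원 | 보육료지원 | 양육수당 지급 대상 현황(2018-2020) | 기년통계 | 12월 | 2
3 | 아동돌봄 | 돌봄지원 | 보육료지원 | 양육수당 지원 현황(2021) | 기년통계 | 12월 | 2
4 | 아동돌봄 | 돌봄지원 | 보육료지원 | 유아학비 지원 현황(2021) | 기년통계 | 12월 | 2
5 | 아동돌봄 | 돌봄지원 | 보육만족 | 미취학 아동 보육 만족도(2018) | 기년통계 | 12월 | 2
6 | 아동돌봄 | 돌봄지원 | 보육만족 | 미취학 아동 보육 방법(2018) | 기년통계 | 12월 | 2
7 | 아동돌봄 | 돌봄지원 | 보육만족 | 우선보육지원(2018) | 기년통계 | 12월 | 2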
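
The table lists eight rows that each occur twice, which lines up with the 8 duplicate rows in the overview if a duplicate is counted as each surplus copy beyond the first; that counting rule is an assumption. The sketch below shows the grouping on a small sample, using two rows from the table (column boundaries as reconstructed from the flattened text) plus one non-duplicated row. `duplicate_summary` is an illustrative helper name.

```python
from collections import Counter


def duplicate_summary(rows):
    """Return (rows occurring more than once with their counts, surplus copies)."""
    counts = Counter(rows)
    groups = {row: count for row, count in counts.items() if count > 1}
    surplus = sum(count - 1 for count in groups.values())
    return groups, surplus


rows = [
    ("아동돌봄", "돌봄지원", "보육료지원", "보육료 지원 현황(2021)", "기년통계", "12월"),
    ("아동돌봄", "돌봄지원", "보육료지원", "보육료 지원 현황(2021)", "기년통계", "12월"),
    ("아동돌봄", "돌봄지원", "보육만족", "우선보육지원(2018)", "기년통계", "12월"),
    ("아동돌봄", "돌봄지원", "보육만족", "우선보육지원(2018)", "기년통계", "12월"),
    ("경제활동", "경제활동참여", "취업자", "취업자의 산업분포(2018-2020)", "기년통계", "6월+12월"),
]
groups, surplus = duplicate_summary(rows)
print(len(groups), surplus)
```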