Overview

Dataset statistics

Number of variables: 11
Number of observations: 30
Missing cells: 2
Missing cells (%): 0.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.7 KiB
Average record size in memory: 93.4 B
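The percentages and sizes above follow from simple arithmetic on the counts. The sketch below rechecks two of them. It assumes the missing-cell share is taken over all cells (variables × observations) and that the total size is roughly the average record size times the row count; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 11
n_observations = 30
missing_cells = 2
avg_record_size_bytes = 93.4

total_cells = n_variables * n_observations        # 330 cells in the table
missing_pct = 100 * missing_cells / total_cells   # share of empty cells, in percent
total_kib = avg_record_size_bytes * n_observations / 1024

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Total size in memory: {total_kib:.1f} KiB")
```

Both printed values agree with the reported 0.6% and 2.7 KiB after rounding.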

Variable types

Text: 2
Numeric: 1
Categorical: 5
DateTime: 3

Dataset

Description: Sample data
Author: 경기도일자리재단
URL: https://www.bigdata-region.kr/#/dataset/93e07762-c6f5-4fcd-8573-e114073160dc

Alerts

회원지번주소 has constant value "" (Constant)
회원유입구분명 has constant value "" (Constant)
회원생일일자 is highly overall correlated with 회원우편번호 and 2 other fields (High correlation)
회원성별코드 is highly overall correlated with 회원생일일자 (High correlation)
회원취업상태명 is highly overall correlated with 회원생일일자 (High correlation)
회원우편번호 is highly overall correlated with 회원생일일자 (High correlation)
회원생일일자 is highly imbalanced (64.7%) (Imbalance)
회원우편번호 has 2 (6.7%) missing values (Missing)
이용자관심범주번호 has unique values (Unique)
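The 64.7% imbalance figure for 회원생일일자 can be reproduced from its two class counts (28 and 2). The sketch below assumes the score is one minus the Shannon entropy of the class shares, normalized by the maximum entropy for that number of classes. That definition is an assumption which happens to match the reported value; `imbalance_score` is a hypothetical helper, not a function of the profiling library.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (in bits) of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class is reported as "constant" instead
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

score = imbalance_score([28, 2])  # 28 rows "19**", 2 rows "<NA>"
print(f"{score:.1%}")             # prints 64.7%
```

A perfectly balanced variable scores 0 under this definition, and the score approaches 1 as a single class dominates.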

Reproduction

Analysis started: 2023-12-10 13:50:50.008658
Analysis finished: 2023-12-10 13:50:51.258443
Duration: 1.25 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
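The duration is the difference between the two timestamps. A minimal check using only the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2023-12-10 13:50:50.008658", FMT)
finished = datetime.strptime("2023-12-10 13:50:51.258443", FMT)

# Elapsed wall-clock time between the two reported timestamps.
duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")
```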

Variables

이용자관심범주번호
Text

UNIQUE

Distinct: 30
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 26
Median length: 26
Mean length: 26
Min length: 26

Characters and Unicode

Total characters: 780
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
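As an illustration of these character properties, the general category of each code point can be looked up with Python's standard `unicodedata` module. The sketch uses the first sample value of this variable; `Nd`, `Lu`, and `Pc` are the Unicode abbreviations for Decimal Number, Uppercase Letter, and Connector Punctuation. Script and block lookups are not part of the standard library and are left out here.

```python
import unicodedata
from collections import Counter

value = "20170402213531001_CMMN_202"  # first sample row of this variable

# Two-letter Unicode general category per character (Nd, Lu, Pc, ...).
categories = Counter(unicodedata.category(ch) for ch in value)

print(len(value))        # 26
print(dict(categories))  # {'Nd': 20, 'Pc': 2, 'Lu': 4}
```

With 30 rows of this shape, the per-row counts scale to the totals in this section: 600 decimal digits, 120 uppercase letters, and 60 connector punctuation characters.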

Unique

Unique: 30
Unique (%): 100.0%

Sample

1st row: 20170402213531001_CMMN_202
2nd row: 20170402214034001_CMMN_181
3rd row: 20170410084200001_CMMN_183
4th row: 20170410084359001_CMMN_192
5th row: 20170410084506001_CMMN_181

Words

Value | Count | Frequency (%)
20170402213531001_cmmn_202 | 1 | 3.3%
20170402214034001_cmmn_181 | 1 | 3.3%
20170410095147002_cmmn_181 | 1 | 3.3%
20170410095105001_cmmn_193 | 1 | 3.3%
20170410094735001_cmmn_192 | 1 | 3.3%
20170410094735001_cmmn_181 | 1 | 3.3%
20170410094626001_cmmn_200 | 1 | 3.3%
20170410094626001_cmmn_198 | 1 | 3.3%
20170410094626001_cmmn_193 | 1 | 3.3%
20170410093922001_cmmn_193 | 1 | 3.3%
Other values (20) | 20 | 66.7%

Most occurring characters

Value | Count | Frequency (%)
0 | 197 | 25.3%
1 | 139 | 17.8%
_ | 60 | 7.7%
M | 60 | 7.7%
2 | 55 | 7.1%
4 | 54 | 6.9%
7 | 43 | 5.5%
9 | 40 | 5.1%
C | 30 | 3.8%
N | 30 | 3.8%
Other values (4) | 72 | 9.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 600 | 76.9%
Uppercase Letter | 120 | 15.4%
Connector Punctuation | 60 | 7.7%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 197 | 32.8%
1 | 139 | 23.2%
2 | 55 | 9.2%
4 | 54 | 9.0%
7 | 43 | 7.2%
9 | 40 | 6.7%
8 | 29 | 4.8%
3 | 18 | 3.0%
5 | 15 | 2.5%
6 | 10 | 1.7%

Uppercase Letter
Value | Count | Frequency (%)
M | 60 | 50.0%
C | 30 | 25.0%
N | 30 | 25.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 60 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 660 | 84.6%
Latin | 120 | 15.4%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 197 | 29.8%
1 | 139 | 21.1%
_ | 60 | 9.1%
2 | 55 | 8.3%
4 | 54 | 8.2%
7 | 43 | 6.5%
9 | 40 | 6.1%
8 | 29 | 4.4%
3 | 18 | 2.7%
5 | 15 | 2.3%

Latin
Value | Count | Frequency (%)
M | 60 | 50.0%
C | 30 | 25.0%
N | 30 | 25.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 780 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 197 | 25.3%
1 | 139 | 17.8%
_ | 60 | 7.7%
M | 60 | 7.7%
2 | 55 | 7.1%
4 | 54 | 6.9%
7 | 43 | 5.5%
9 | 40 | 5.1%
C | 30 | 3.8%
N | 30 | 3.8%
Other values (4) | 72 | 9.2%

회원우편번호
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 19
Distinct (%): 67.9%
Missing: 2
Missing (%): 6.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 15026.25
Minimum: 11479
Maximum: 18132
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 11479
5th percentile: 12736
Q1: 13421
Median: 15222
Q3: 16606
95th percentile: 17245.35
Maximum: 18132
Range: 6653
Interquartile range (IQR): 3185

Descriptive statistics

Standard deviation: 1708.7259
Coefficient of variation (CV): 0.11371606
Kurtosis: -0.79921135
Mean: 15026.25
Median Absolute Deviation (MAD): 1499
Skewness: -0.17651668
Sum: 420735
Variance: 2919744.3
Monotonicity: Not monotonic
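Several of the statistics for 회원우편번호 are derived from one another. The sketch below rechecks them from the reported figures, assuming the mean is taken over the 28 non-missing observations (30 rows minus 2 missing). It does not recompute anything from raw data.

```python
# Values copied from the quantile and descriptive statistics of 회원우편번호.
minimum, maximum = 11479, 18132
q1, q3 = 13421, 16606
total = 420735
n_present = 30 - 2            # observations minus missing values
std = 1708.7259

value_range = maximum - minimum   # 6653
iqr = q3 - q1                     # 3185
mean = total / n_present          # 15026.25
cv = std / mean                   # coefficient of variation
variance = std ** 2               # close to the reported 2919744.3

print(value_range, iqr, mean)
print(f"CV = {cv:.6f}, variance = {variance:.1f}")
```

The small difference in the last digit of the variance comes from squaring the already rounded standard deviation.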
Histogram with fixed size bins (bins=19)

Common values

Value | Count | Frequency (%)
12736 | 3 | 10.0%
13421 | 3 | 10.0%
15251 | 3 | 10.0%
16606 | 2 | 6.7%
14568 | 2 | 6.7%
15222 | 2 | 6.7%
17051 | 1 | 3.3%
17350 | 1 | 3.3%
14614 | 1 | 3.3%
15041 | 1 | 3.3%
Other values (9) | 9 | 30.0%
(Missing) | 2 | 6.7%

Minimum 10 values

Value | Count | Frequency (%)
11479 | 1 | 3.3%
12736 | 3 | 10.0%
12915 | 1 | 3.3%
13421 | 3 | 10.0%
14285 | 1 | 3.3%
14568 | 2 | 6.7%
14614 | 1 | 3.3%
15041 | 1 | 3.3%
15222 | 2 | 6.7%
15251 | 3 | 10.0%

Maximum 10 values

Value | Count | Frequency (%)
18132 | 1 | 3.3%
17350 | 1 | 3.3%
17051 | 1 | 3.3%
16988 | 1 | 3.3%
16873 | 1 | 3.3%
16836 | 1 | 3.3%
16606 | 2 | 6.7%
16275 | 1 | 3.3%
15880 | 1 | 3.3%
15251 | 3 | 10.0%

회원지번주소
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: (blank)
2nd row: (blank)
3rd row: (blank)
4th row: (blank)
5th row: (blank)

Common Values

Value | Count | Frequency (%)
(blank) | 30 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

No values found.

회원생일일자
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 6.7%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: 19**
4th row: 19**
5th row: 19**

Common Values

Value | Count | Frequency (%)
19** | 28 | 93.3%
<NA> | 2 | 6.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
19 | 28 | 93.3%
na | 2 | 6.7%

회원성별코드
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 4
Median length: 1
Mean length: 1.2
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: F
4th row: F
5th row: F

Common Values

Value | Count | Frequency (%)
F | 20 | 66.7%
M | 8 | 26.7%
<NA> | 2 | 6.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
f | 20 | 66.7%
m | 8 | 26.7%
na | 2 | 6.7%

회원유입구분명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 청년통장
2nd row: 청년통장
3rd row: 청년통장
4th row: 청년통장
5th row: 청년통장

Common Values

Value | Count | Frequency (%)
청년통장 | 30 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
청년통장 | 30 | 100.0%

회원취업상태명
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 13.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 5
Median length: 3
Mean length: 3.2
Min length: 2

Unique

Unique: 1
Unique (%): 3.3%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: 재직중
4th row: 재직중
5th row: 재직중

Common Values

Value | Count | Frequency (%)
재직중 | 24 | 80.0%
<NA> | 3 | 10.0%
취업준비중 | 2 | 6.7%
기타 | 1 | 3.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Words

Value | Count | Frequency (%)
재직중 | 24 | 80.0%
na | 3 | 10.0%
취업준비중 | 2 | 6.7%
기타 | 1 | 3.3%

관심범주명
Text

Distinct: 22
Distinct (%): 73.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 10
Median length: 6
Mean length: 3.8333333
Min length: 2

Characters and Unicode

Total characters: 115
Distinct characters: 61
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 18
Unique (%): 60.0%

Sample

1st row: 기후변화
2nd row: 빅데이터
3rd row: 여행
4th row: 반려동물
5th row: 빅데이터

Words

Value | Count | Frequency (%)
빅데이터 | 6 | 20.0%
행정 | 2 | 6.7%
반려동물 | 2 | 6.7%
세무 | 2 | 6.7%
실내디자인 | 1 | 3.3%
기후변화 | 1 | 3.3%
디자인 | 1 | 3.3%
엑셀 | 1 | 3.3%
핸드메이드 | 1 | 3.3%
조리·제빵·바리스타 | 1 | 3.3%
Other values (12) | 12 | 40.0%

Most occurring characters

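Value | Count | Frequency (%)
[missing] | 7 | 6.1%
[missing] | 6 | 5.2%
[missing] | 6 | 5.2%
[missing] | 6 | 5.2%
[missing] | 5 | 4.3%
[missing] | 4 | 3.5%
[missing] | 4 | 3.5%
[missing] | 4 | 3.5%
[missing] | 3 | 2.6%
[missing] | 3 | 2.6%
Other values (51) | 67 | 58.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 110 | 95.7%
Uppercase Letter | 3 | 2.6%
Other Punctuation | 2 | 1.7%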

Most frequent character per category

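Other Letter
Value | Count | Frequency (%)
[missing] | 7 | 6.4%
[missing] | 6 | 5.5%
[missing] | 6 | 5.5%
[missing] | 6 | 5.5%
[missing] | 5 | 4.5%
[missing] | 4 | 3.6%
[missing] | 4 | 3.6%
[missing] | 4 | 3.6%
[missing] | 3 | 2.7%
[missing] | 3 | 2.7%
Other values (47) | 62 | 56.4%

Uppercase Letter
Value | Count | Frequency (%)
P | 1 | 33.3%
B | 1 | 33.3%
C | 1 | 33.3%

Other Punctuation
Value | Count | Frequency (%)
· | 2 | 100.0%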

Most occurring scripts

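Value | Count | Frequency (%)
Hangul | 110 | 95.7%
Latin | 3 | 2.6%
Common | 2 | 1.7%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[missing] | 7 | 6.4%
[missing] | 6 | 5.5%
[missing] | 6 | 5.5%
[missing] | 6 | 5.5%
[missing] | 5 | 4.5%
[missing] | 4 | 3.6%
[missing] | 4 | 3.6%
[missing] | 4 | 3.6%
[missing] | 3 | 2.7%
[missing] | 3 | 2.7%
Other values (47) | 62 | 56.4%

Latin
Value | Count | Frequency (%)
P | 1 | 33.3%
B | 1 | 33.3%
C | 1 | 33.3%

Common
Value | Count | Frequency (%)
· | 2 | 100.0%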

Most occurring blocks

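Value | Count | Frequency (%)
Hangul | 110 | 95.7%
ASCII | 3 | 2.6%
None | 2 | 1.7%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[missing] | 7 | 6.4%
[missing] | 6 | 5.5%
[missing] | 6 | 5.5%
[missing] | 6 | 5.5%
[missing] | 5 | 4.5%
[missing] | 4 | 3.6%
[missing] | 4 | 3.6%
[missing] | 4 | 3.6%
[missing] | 3 | 2.7%
[missing] | 3 | 2.7%
Other values (47) | 62 | 56.4%

None
Value | Count | Frequency (%)
· | 2 | 100.0%

ASCII
Value | Count | Frequency (%)
P | 1 | 33.3%
B | 1 | 33.3%
C | 1 | 33.3%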

등록일시
DateTime

Distinct: 21
Distinct (%): 70.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B
Minimum: 2018-11-06 12:41:00
Maximum: 2021-07-11 23:24:00
Histogram with fixed size bins (bins=21)

수정일시
DateTime

Distinct: 21
Distinct (%): 70.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B
Minimum: 2018-11-06 12:41:00
Maximum: 2021-07-11 23:24:00
Histogram with fixed size bins (bins=21)

데이터기준일자
DateTime

Distinct: 18
Distinct (%): 60.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B
Minimum: 2018-11-06 00:00:00
Maximum: 2021-07-11 00:00:00
Histogram with fixed size bins (bins=18)

Interactions


Correlations

 | 이용자관심범주번호 | 회원우편번호 | 회원성별코드 | 회원취업상태명 | 관심범주명 | 등록일시 | 수정일시 | 데이터기준일자
이용자관심범주번호 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
회원우편번호 | 1.000 | 1.000 | 0.694 | 0.000 | 0.000 | 1.000 | 1.000 | 0.954
회원성별코드 | 1.000 | 0.694 | 1.000 | 0.000 | 0.512 | 1.000 | 1.000 | 0.807
회원취업상태명 | 1.000 | 0.000 | 0.000 | 1.000 | 0.000 | 1.000 | 1.000 | 1.000
관심범주명 | 1.000 | 0.000 | 0.512 | 0.000 | 1.000 | 0.000 | 0.000 | 0.478
등록일시 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 1.000 | 1.000 | 1.000
수정일시 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 1.000 | 1.000 | 1.000
데이터기준일자 | 1.000 | 0.954 | 0.807 | 1.000 | 0.478 | 1.000 | 1.000 | 1.000

 | 회원생일일자 | 회원성별코드 | 회원취업상태명
회원생일일자 | 1.000 | 1.000 | 1.000
회원성별코드 | 1.000 | 1.000 | 0.000
회원취업상태명 | 1.000 | 0.000 | 1.000

 | 회원우편번호 | 회원생일일자 | 회원성별코드 | 회원취업상태명
회원우편번호 | 1.000 | 1.000 | 0.431 | 0.000
회원생일일자 | 1.000 | 1.000 | 1.000 | 1.000
회원성별코드 | 0.431 | 1.000 | 1.000 | 0.000
회원취업상태명 | 0.000 | 1.000 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Row | 이용자관심범주번호 | 회원우편번호 | 회원지번주소 | 회원생일일자 | 회원성별코드 | 회원유입구분명 | 회원취업상태명 | 관심범주명 | 등록일시 | 수정일시 | 데이터기준일자
0 | 20170402213531001_CMMN_202 | <NA> | (blank) | <NA> | <NA> | 청년통장 | <NA> | 기후변화 | 2019-01-28 17:10 | 2019-01-28 17:10 | 2019-01-28
1 | 20170402214034001_CMMN_181 | <NA> | (blank) | <NA> | <NA> | 청년통장 | <NA> | 빅데이터 | 2019-11-07 15:35 | 2019-11-07 15:35 | 2019-11-07
2 | 20170410084200001_CMMN_183 | 15880 | (blank) | 19** | F | 청년통장 | 재직중 | 여행 | 2019-01-09 11:56 | 2019-01-09 11:56 | 2019-01-09
3 | 20170410084359001_CMMN_192 | 16988 | (blank) | 19** | F | 청년통장 | 재직중 | 반려동물 | 2019-02-14 14:33 | 2019-02-14 14:33 | 2019-02-14
4 | 20170410084506001_CMMN_181 | 18132 | (blank) | 19** | F | 청년통장 | 재직중 | 빅데이터 | 2018-12-10 13:13 | 2018-12-10 13:13 | 2018-12-10
5 | 20170410084610001_CMMN_183 | 12736 | (blank) | 19** | M | 청년통장 | 재직중 | 관광통역 | 2018-12-10 15:07 | 2018-12-10 15:07 | 2018-12-10
6 | 20170410084610001_CMMN_188 | 12736 | (blank) | 19** | M | 청년통장 | 재직중 | 수출입 | 2018-12-10 15:07 | 2018-12-10 15:07 | 2018-12-10
7 | 20170410084610001_CMMN_189 | 12736 | (blank) | 19** | M | 청년통장 | 재직중 | 세무 | 2018-12-10 15:07 | 2018-12-10 15:07 | 2018-12-10
8 | 20170410085107001_CMMN_181 | 14285 | (blank) | 19** | F | 청년통장 | 재직중 | 빅데이터 | 2018-12-07 21:55 | 2018-12-07 21:55 | 2018-12-07
9 | 20170410090219001_CMMN_184 | 13421 | (blank) | 19** | F | 청년통장 | 재직중 | 건축 | 2018-12-06 10:42 | 2018-12-06 10:42 | 2018-12-06

Last rows

Row | 이용자관심범주번호 | 회원우편번호 | 회원지번주소 | 회원생일일자 | 회원성별코드 | 회원유입구분명 | 회원취업상태명 | 관심범주명 | 등록일시 | 수정일시 | 데이터기준일자
20 | 20170410093747001_CMMN_197 | 15041 | (blank) | 19** | M | 청년통장 | 재직중 | PCB | 2019-07-04 10:32 | 2019-07-04 10:32 | 2019-07-04
21 | 20170410093922001_CMMN_193 | 14614 | (blank) | 19** | M | 청년통장 | 재직중 | 행정 | 2018-12-04 10:11 | 2018-12-04 10:11 | 2018-12-04
22 | 20170410094626001_CMMN_193 | 15251 | (blank) | 19** | F | 청년통장 | 재직중 | 세무 | 2018-11-06 12:41 | 2018-11-06 12:41 | 2018-11-06
23 | 20170410094626001_CMMN_198 | 15251 | (blank) | 19** | F | 청년통장 | 재직중 | 조리·제빵·바리스타 | 2018-11-06 12:41 | 2018-11-06 12:41 | 2018-11-06
24 | 20170410094626001_CMMN_200 | 15251 | (blank) | 19** | F | 청년통장 | 재직중 | 핸드메이드 | 2018-11-06 12:41 | 2018-11-06 12:41 | 2018-11-06
25 | 20170410094735001_CMMN_181 | 14568 | (blank) | 19** | F | 청년통장 | 재직중 | 빅데이터 | 2018-12-10 17:28 | 2018-12-10 17:28 | 2018-12-10
26 | 20170410094735001_CMMN_192 | 14568 | (blank) | 19** | F | 청년통장 | 재직중 | 반려동물 | 2018-12-10 17:28 | 2018-12-10 17:28 | 2018-12-10
27 | 20170410095105001_CMMN_193 | 17350 | (blank) | 19** | F | 청년통장 | 재직중 | 엑셀 | 2019-12-12 17:35 | 2019-12-12 17:35 | 2019-12-12
28 | 20170410095147002_CMMN_181 | 16606 | (blank) | 19** | M | 청년통장 | 재직중 | 빅데이터 | 2018-12-14 09:47 | 2018-12-14 09:47 | 2018-12-14
29 | 20170410095147002_CMMN_184 | 16606 | (blank) | 19** | M | 청년통장 | 재직중 | 건설안전기사 | 2018-12-14 09:47 | 2018-12-14 09:47 | 2018-12-14