Overview

Dataset statistics

Number of variables: 7
Number of observations: 669
Missing cells: 1338
Missing cells (%): 28.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 39.3 KiB
Average record size in memory: 60.2 B
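
The derived figures in this block can be rechecked from the raw counts. The snippet below is a minimal sketch that only redoes the arithmetic. The variable names are illustrative, and treating 1 KiB as 1024 bytes is an assumption about how the report computes sizes.

```python
# Figures copied from the dataset statistics above.
n_variables = 7
n_observations = 669
missing_cells = 1338
total_size_kib = 39.3

# Share of empty cells across the whole table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

# 1338 missing cells over 669 rows corresponds to two fully empty columns.
fully_missing_columns = missing_cells / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
print(f"Fully missing columns: {fully_missing_columns:.0f}")
```

The two fully empty columns match the two variables reported as Unsupported.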

Variable types

Numeric: 2
Categorical: 3
Unsupported: 2

Dataset

Description: File download (파일 다운로드)
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15241/S/1/datasetView.do

Alerts

구분 (category) has constant value "회원-내국인" [Constant]
Unnamed: 5 has 669 (100.0%) missing values [Missing]
성별미수집기간 '19년 말부터 시스템 개편 후 성별 데이터 수집 ("period without gender data; gender data collected after the system overhaul, from late 2019") has 669 (100.0%) missing values [Missing]
Unnamed: 5 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
성별미수집기간 '19년 말부터 시스템 개편 후 성별 데이터 수집 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
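
The Constant and Missing alerts can be approximated with a simple per-column check. The sketch below is illustrative and is not ydata-profiling's own logic. `flag_columns` is a hypothetical helper, and it assumes that empty cells are represented as `None` while the `<NA>` entries in 성별 are literal strings.

```python
def flag_columns(rows, columns):
    """Map each flagged column to "Missing" (no values) or "Constant" (one value)."""
    alerts = {}
    for col in columns:
        present = [row[col] for row in rows if row.get(col) is not None]
        if not present:
            alerts[col] = "Missing"
        elif len(set(present)) == 1:
            alerts[col] = "Constant"
    return alerts


# Three rows adapted from the Sample section (rows 0, 8 and 668).
# Assumption: None marks an empty cell; "<NA>" in 성별 is kept as a string.
columns = ["가입월", "구분", "연령", "성별", "신규가입자수", "Unnamed: 5"]
rows = [
    dict(zip(columns, (201807, "회원-내국인", "~10대", "<NA>", 6662, None))),
    dict(zip(columns, (201808, "회원-내국인", "20대", "<NA>", 34541, None))),
    dict(zip(columns, (202206, "회원-내국인", "기타", "M", 90, None))),
]
alerts = flag_columns(rows, columns)
print(alerts)
```

Columns flagged this way are natural candidates to drop before further analysis, since they carry no information about new sign-ups.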

Reproduction

Analysis started: 2024-05-11 08:55:07.712099
Analysis finished: 2024-05-11 08:55:10.765765
Duration: 3.05 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

가입월 (sign-up month)
Real number (ℝ)

Distinct: 48
Distinct (%): 7.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 202035.88
Minimum: 201807
Maximum: 202206
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.0 KiB

Quantile statistics

Minimum: 201807
5-th percentile: 201811
Q1: 202002
Median: 202010
Q3: 202108
95-th percentile: 202204
Maximum: 202206
Range: 399
Interquartile range (IQR): 106
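
The values of 가입월 look like year-month codes (YYYYMM) that the profiler treated as plain numbers, so Range and IQR above are differences between codes, not month counts. The sketch below, which assumes the YYYYMM reading, converts the codes to a month index to obtain calendar spans. The function names are illustrative.

```python
def yyyymm_to_index(yyyymm: int) -> int:
    """Convert a YYYYMM code to a running month index (assumes YYYYMM encoding)."""
    year, month = divmod(yyyymm, 100)
    return year * 12 + (month - 1)


def months_between(start: int, end: int) -> int:
    """Number of calendar months from start to end, both given as YYYYMM codes."""
    return yyyymm_to_index(end) - yyyymm_to_index(start)


first, last = 201807, 202206
q1, q3 = 202002, 202108

numeric_range = last - first                      # difference between the codes
calendar_span = months_between(first, last)       # months elapsed
months_covered = calendar_span + 1                # inclusive count of months
calendar_iqr = months_between(q1, q3)

print(numeric_range, calendar_span, months_covered, calendar_iqr)
```

Under this reading the data covers 48 consecutive months, which agrees with the 48 distinct values reported for the variable.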

Descriptive statistics

Standard deviation: 107.54739
Coefficient of variation (CV): 0.00053231828
Kurtosis: -0.53593381
Mean: 202035.88
Median Absolute Deviation (MAD): 97
Skewness: -0.23161712
Sum: 1.35162 × 10^8
Variance: 11566.441
Monotonicity: Increasing

[Figure: Histogram with fixed size bins (bins=48)]

Common values

Value | Count | Frequency (%)
202006 | 24 | 3.6%
202003 | 24 | 3.6%
202004 | 24 | 3.6%
202002 | 23 | 3.4%
202005 | 21 | 3.1%
202007 | 17 | 2.5%
202106 | 17 | 2.5%
202008 | 17 | 2.5%
202009 | 17 | 2.5%
202010 | 17 | 2.5%
Other values (38) | 468 | 70.0%

Minimum 10 values

Value | Count | Frequency (%)
201807 | 7 | 1.0%
201808 | 7 | 1.0%
201809 | 7 | 1.0%
201810 | 7 | 1.0%
201811 | 7 | 1.0%
201812 | 7 | 1.0%
201901 | 7 | 1.0%
201902 | 7 | 1.0%
201903 | 7 | 1.0%
201904 | 7 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
202206 | 16 | 2.4%
202205 | 16 | 2.4%
202204 | 16 | 2.4%
202203 | 16 | 2.4%
202202 | 16 | 2.4%
202201 | 16 | 2.4%
202112 | 16 | 2.4%
202111 | 16 | 2.4%
202110 | 16 | 2.4%
202109 | 16 | 2.4%

구분 (category)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 5.4 KiB

Value | Count
회원-내국인 (member, domestic national) | 669

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 회원-내국인
2nd row: 회원-내국인
3rd row: 회원-내국인
4th row: 회원-내국인
5th row: 회원-내국인

Common Values

Value | Count | Frequency (%)
회원-내국인 | 669 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Words

Value | Count | Frequency (%)
회원-내국인 | 669 | 100.0%

연령 (age group)
Categorical

Distinct: 8
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 5.4 KiB

Value | Count
~10대 | 85
20대 | 85
30대 | 85
40대 | 85
50대 | 85
Other values (3) | 244

Length

Max length: 5
Median length: 3
Mean length: 3.2615845
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: ~10대
2nd row: 20대
3rd row: 30대
4th row: 40대
5th row: 50대

Common Values

Value | Count | Frequency (%)
~10대 (teens and under) | 85 | 12.7%
20대 (20s) | 85 | 12.7%
30대 (30s) | 85 | 12.7%
40대 (40s) | 85 | 12.7%
50대 (50s) | 85 | 12.7%
60대 (60s) | 85 | 12.7%
70대이상 (70s and over) | 83 | 12.4%
기타 (other) | 76 | 11.4%
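
The frequency column follows directly from the counts. As a quick consistency check, the sketch below recomputes the percentages from the counts in the Common Values table and confirms that they add up to the 669 observations. The dictionary name is illustrative.

```python
# Counts copied from the Common Values table for 연령.
age_counts = {
    "~10대": 85,
    "20대": 85,
    "30대": 85,
    "40대": 85,
    "50대": 85,
    "60대": 85,
    "70대이상": 83,
    "기타": 76,
}

total = sum(age_counts.values())

# Frequency (%) = 100 * count / total, rounded to one decimal as in the report.
frequencies = {value: round(100 * count / total, 1) for value, count in age_counts.items()}

# The "Other values (3)" row of the summary groups the three least frequent labels shown after the top five.
other_values = age_counts["60대"] + age_counts["70대이상"] + age_counts["기타"]

for value, pct in frequencies.items():
    print(f"{value}: {pct}%")
```

The slightly lower counts for 70대이상 and 기타 indicate that those groups are absent from a few month and gender combinations.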

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Words

Value | Count | Frequency (%)
10대 | 85 | 12.7%
20대 | 85 | 12.7%
30대 | 85 | 12.7%
40대 | 85 | 12.7%
50대 | 85 | 12.7%
60대 | 85 | 12.7%
70대이상 | 83 | 12.4%
기타 | 76 | 11.4%

성별 (gender)
Categorical

Distinct: 3
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 5.4 KiB

Value | Count
F | 254
M | 254
<NA> | 161

Length

Max length: 4
Median length: 1
Mean length: 1.7219731
Min length: 1
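
The length statistics suggest that `<NA>` was counted as a four-character string and not as a missing value, which would also explain why Missing is 0 for this variable. The sketch below checks that reading by recomputing the mean length from the value counts. It is an inference from the reported numbers, not something the report states.

```python
# Value counts copied from the summary of 성별.
gender_counts = {"F": 254, "M": 254, "<NA>": 161}

total = sum(gender_counts.values())

# Weighted mean of string lengths, counting "<NA>" as the literal 4-character text.
total_chars = sum(len(value) * count for value, count in gender_counts.items())
mean_length = total_chars / total

lengths = [len(value) for value in gender_counts]

print(f"Mean length: {mean_length:.7f}")
print(f"Min length: {min(lengths)}, max length: {max(lengths)}")
```

If the placeholder were converted to a true missing value before profiling, the variable would report 161 missing cells (24.1%) and two distinct values.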

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
F | 254 | 38.0%
M | 254 | 38.0%
<NA> | 161 | 24.1%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Words

Value | Count | Frequency (%)
f | 254 | 38.0%
m | 254 | 38.0%
na | 161 | 24.1%

신규가입자수 (number of new sign-ups)
Real number (ℝ)

Distinct: 588
Distinct (%): 87.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5010.9028
Minimum: 1
Maximum: 105677
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 13.4
Q1: 208
Median: 1444
Q3: 5012
95-th percentile: 21673.2
Maximum: 105677
Range: 105676
Interquartile range (IQR): 4804

Descriptive statistics

Standard deviation: 10210.503
Coefficient of variation (CV): 2.0376573
Kurtosis: 26.873958
Mean: 5010.9028
Median Absolute Deviation (MAD): 1407
Skewness: 4.4995508
Sum: 3352294
Variance: 1.0425437 × 10^8
Monotonicity: Not monotonic
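
Several of the statistics above are tied together by simple identities: mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared, range = maximum - minimum, and IQR = Q3 - Q1. The sketch below recomputes them from the reported figures. Small differences in the last digits are expected because the inputs are already rounded.

```python
# Figures copied from the statistics for 신규가입자수.
n = 669
total = 3352294
std = 10210.503
minimum, maximum = 1, 105677
q1, median, q3 = 208, 1444, 5012

mean = total / n            # mean = sum / n
cv = std / mean             # coefficient of variation
variance = std ** 2         # variance is the squared standard deviation
value_range = maximum - minimum
iqr = q3 - q1

# A mean far above the median is consistent with the strong right skew reported.
mean_to_median = mean / median

print(f"mean={mean:.4f} cv={cv:.7f} variance={variance:.4e}")
print(f"range={value_range} iqr={iqr} mean/median={mean_to_median:.2f}")
```

The mean is more than three times the median, so the median and IQR describe a typical month and age group combination better than the mean and standard deviation do.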

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
23 | 6 | 0.9%
12 | 6 | 0.9%
30 | 5 | 0.7%
6 | 5 | 0.7%
3 | 4 | 0.6%
51 | 3 | 0.4%
131 | 3 | 0.4%
147 | 3 | 0.4%
10 | 3 | 0.4%
27 | 3 | 0.4%
Other values (578) | 628 | 93.9%

Minimum 10 values

Value | Count | Frequency (%)
1 | 2 | 0.3%
3 | 4 | 0.6%
4 | 3 | 0.4%
5 | 3 | 0.4%
6 | 5 | 0.7%
7 | 2 | 0.3%
8 | 2 | 0.3%
9 | 1 | 0.1%
10 | 3 | 0.4%
12 | 6 | 0.9%

Maximum 10 values

Value | Count | Frequency (%)
105677 | 1 | 0.1%
75728 | 1 | 0.1%
69038 | 1 | 0.1%
66246 | 1 | 0.1%
59964 | 1 | 0.1%
58945 | 1 | 0.1%
58491 | 1 | 0.1%
55120 | 1 | 0.1%
50779 | 1 | 0.1%
42153 | 1 | 0.1%

Unnamed: 5
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 669
Missing (%): 100.0%
Memory size: 6.0 KiB

Missing: 669
Missing (%): 100.0%
Memory size: 6.0 KiB

Interactions

[Figures: four interaction plots]

Correlations

Variable | 가입월 | 연령 | 성별 | 신규가입자수
가입월 | 1.000 | 0.000 | 0.000 | 0.145
연령 | 0.000 | 1.000 | 0.000 | 0.322
성별 | 0.000 | 0.000 | 1.000 | 0.000
신규가입자수 | 0.145 | 0.322 | 0.000 | 1.000

Variable | 성별 | 연령
성별 | 1.000 | 0.000
연령 | 0.000 | 1.000

Variable | 가입월 | 신규가입자수 | 연령 | 성별
가입월 | 1.000 | -0.060 | 0.000 | 0.000
신규가입자수 | -0.060 | 1.000 | 0.164 | 0.000
연령 | 0.000 | 0.164 | 1.000 | 0.000
성별 | 0.000 | 0.000 | 0.000 | 1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: The nullity matrix is a data-dense display that lets you quickly spot patterns in data completeness.]

Sample

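First rows

Row | 가입월 | 구분 | 연령 | 성별 | 신규가입자수 | Unnamed: 5 | 성별미수집기간 '19년 말부터 시스템 개편 후 성별 데이터 수집
0 | 201807 | 회원-내국인 | ~10대 | <NA> | 6662 | <NA> | <NA>
1 | 201807 | 회원-내국인 | 20대 | <NA> | 38729 | <NA> | <NA>
2 | 201807 | 회원-내국인 | 30대 | <NA> | 11245 | <NA> | <NA>
3 | 201807 | 회원-내국인 | 40대 | <NA> | 5877 | <NA> | <NA>
4 | 201807 | 회원-내국인 | 50대 | <NA> | 2332 | <NA> | <NA>
5 | 201807 | 회원-내국인 | 60대 | <NA> | 456 | <NA> | <NA>
6 | 201807 | 회원-내국인 | 70대이상 | <NA> | 111 | <NA> | <NA>
7 | 201808 | 회원-내국인 | ~10대 | <NA> | 6085 | <NA> | <NA>
8 | 201808 | 회원-내국인 | 20대 | <NA> | 34541 | <NA> | <NA>
9 | 201808 | 회원-내국인 | 30대 | <NA> | 9206 | <NA> | <NA>

Last rows

Row | 가입월 | 구분 | 연령 | 성별 | 신규가입자수 | Unnamed: 5 | 성별미수집기간 '19년 말부터 시스템 개편 후 성별 데이터 수집
659 | 202206 | 회원-내국인 | 70대이상 | F | 99 | <NA> | <NA>
660 | 202206 | 회원-내국인 | 기타 | F | 332 | <NA> | <NA>
661 | 202206 | 회원-내국인 | ~10대 | M | 5012 | <NA> | <NA>
662 | 202206 | 회원-내국인 | 20대 | M | 13317 | <NA> | <NA>
663 | 202206 | 회원-내국인 | 30대 | M | 9049 | <NA> | <NA>
664 | 202206 | 회원-내국인 | 40대 | M | 5856 | <NA> | <NA>
665 | 202206 | 회원-내국인 | 50대 | M | 3272 | <NA> | <NA>
666 | 202206 | 회원-내국인 | 60대 | M | 1088 | <NA> | <NA>
667 | 202206 | 회원-내국인 | 70대이상 | M | 183 | <NA> | <NA>
668 | 202206 | 회원-내국인 | 기타 | M | 90 | <NA> | <NA>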