Overview

Dataset statistics

Number of variables: 5
Number of observations: 4856
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 16
Duplicate rows (%): 0.3%
Total size in memory: 194.6 KiB
Average record size in memory: 41.0 B
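
The two derived figures above can be checked from the raw counts. This is a minimal sketch, assuming the duplicate percentage is taken relative to the number of observations and that 1 KiB is 1024 bytes; the variable names are illustrative and do not come from the report.

```python
# Values copied from the dataset statistics above.
n_observations = 4856
duplicate_rows = 16
total_size_kib = 194.6

# Assumption: the percentage is relative to all observations.
duplicate_pct = duplicate_rows / n_observations * 100

# Assumption: 1 KiB = 1024 bytes, averaged over all observations.
avg_record_size_bytes = total_size_kib * 1024 / n_observations

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size in memory: {avg_record_size_bytes:.1f} B")
```

Under these assumptions both values round to the figures reported in the table.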

Variable types

DateTime: 1
Categorical: 3
Numeric: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15243/S/1/datasetView.do

Alerts

사용자코드 has constant value "회원-내국인" (Constant)
Dataset has 16 (0.3%) duplicate rows (Duplicates)

Reproduction

Analysis started: 2023-12-11 06:24:43.858604
Analysis finished: 2023-12-11 06:24:44.403678
Duration: 0.55 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
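
A report of this kind could be regenerated roughly as follows. This is only a sketch: the function name `build_profile`, the CSV path, the encoding, and the report title are hypothetical, and the configuration actually used for this report (config.json) is not reproduced here. It assumes `pandas` and `ydata-profiling` are installed.

```python
def build_profile(csv_path, output_path="report.html", encoding="utf-8"):
    """Profile a CSV export and write an HTML report (illustrative helper)."""
    # Imported lazily so that defining the function does not require
    # the third-party packages to be installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the date column is named 가입일자, as in the Sample section.
    df = pd.read_csv(csv_path, encoding=encoding, parse_dates=["가입일자"])

    # Default settings; the original report may have used other options.
    report = ProfileReport(df, title="Profiling report")
    report.to_file(output_path)
    return report
```

Calling `build_profile("data.csv")` with a local copy of the dataset would write `report.html` to the working directory; the file name `data.csv` is a placeholder.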

Variables

가입일자 (sign-up date)
DateTime

Distinct: 181
Distinct (%): 3.7%
Missing: 0
Missing (%): 0.0%
Memory size: 38.1 KiB
Minimum: 2022-01-01 00:00:00
Maximum: 2022-06-30 00:00:00
[Figure: Histogram with fixed size bins (bins=50)]

사용자코드 (user code)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.1 KiB
회원-내국인: 4856

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 회원-내국인
2nd row: 회원-내국인
3rd row: 회원-내국인
4th row: 회원-내국인
5th row: 회원-내국인

Common Values

Value        Count  Frequency (%)
회원-내국인  4856   100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

연령대코드 (age group code)
Categorical

Distinct: 8
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 38.1 KiB
20대: 723
10대: 716
40대: 714
30대: 712
50대: 705
Other values (3): 1286

Length

Max length: 5
Median length: 3
Mean length: 3.0817545
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 10대
2nd row: 10대
3rd row: 10대
4th row: 20대
5th row: 20대

Common Values

Value     Count  Frequency (%)
20대      723    14.9%
10대      716    14.7%
40대      714    14.7%
30대      712    14.7%
50대      705    14.5%
60대      615    12.7%
70대이상  356    7.3%
기타      315    6.5%
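
The frequency column and the mean label length can be recomputed from the counts alone. A minimal sketch, assuming label length is measured in characters and the percentages are rounded to one decimal place; the variable names are illustrative.

```python
# Counts copied from the Common Values listing for 연령대코드.
counts = {
    "20대": 723, "10대": 716, "40대": 714, "30대": 712,
    "50대": 705, "60대": 615, "70대이상": 356, "기타": 315,
}
total = sum(counts.values())

# Frequency (%) of each category, rounded to one decimal place.
frequency_pct = {value: round(n / total * 100, 1) for value, n in counts.items()}

# Mean label length in characters, weighted by how often each label occurs.
mean_length = sum(len(value) * n for value, n in counts.items()) / total

print(total)
print(frequency_pct)
print(mean_length)
```

The counts add up to the 4856 observations, and the weighted mean length agrees with the reported 3.0817545 because only 70대이상 (5 characters) and 기타 (2 characters) deviate from the 3-character labels.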

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

성별 (sex)
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 38.1 KiB
M: 2497
F: 2359

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: F
2nd row: M
3rd row: M
4th row: F
5th row: M

Common Values

Value  Count  Frequency (%)
M      2497   51.4%
F      2359   48.6%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

가입 수 (number of sign-ups)
Real number (ℝ)

Distinct: 511
Distinct (%): 10.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 69.372117
Minimum: 1
Maximum: 1530
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 42.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 4
Median: 11
Q3: 63
95-th percentile: 357.25
Maximum: 1530
Range: 1529
Interquartile range (IQR): 59

Descriptive statistics

Standard deviation: 143.01637
Coefficient of variation (CV): 2.061583
Kurtosis: 20.007456
Mean: 69.372117
Median Absolute Deviation (MAD): 9
Skewness: 3.8717352
Sum: 336871
Variance: 20453.683
Monotonicity: Not monotonic
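
Several of these statistics are simple functions of the others, which gives a quick consistency check. A minimal sketch using only the reported numbers; the variable names are illustrative, and the variance line assumes the variance and standard deviation use the same degrees of freedom.

```python
# Values copied from the statistics reported for 가입 수.
n = 4856
total_sum = 336871
std_dev = 143.01637
q1, q3 = 4, 63
minimum, maximum = 1, 1530

mean = total_sum / n              # arithmetic mean = sum / count
cv = std_dev / mean               # coefficient of variation = std / mean
iqr = q3 - q1                     # interquartile range = Q3 - Q1
value_range = maximum - minimum   # range = max - min
variance = std_dev ** 2           # variance = std squared (same ddof assumed)

print(mean, cv, iqr, value_range, variance)
```

Each derived value agrees with the report up to rounding of the printed inputs. A CV above 2 together with a median of 11 against a mean of about 69 is consistent with the strong right skew reported above.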
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
1                   418    8.6%
2                   319    6.6%
4                   310    6.4%
3                   308    6.3%
5                   265    5.5%
6                   208    4.3%
7                   175    3.6%
8                   136    2.8%
10                  109    2.2%
11                  103    2.1%
Other values (501)  2505   51.6%

Minimum 10 values

Value  Count  Frequency (%)
1      418    8.6%
2      319    6.6%
3      308    6.3%
4      310    6.4%
5      265    5.5%
6      208    4.3%
7      175    3.6%
8      136    2.8%
9      101    2.1%
10     109    2.2%

Maximum 10 values

Value  Count  Frequency (%)
1530   1      < 0.1%
1507   1      < 0.1%
1382   1      < 0.1%
1374   1      < 0.1%
1282   1      < 0.1%
1216   1      < 0.1%
1159   1      < 0.1%
1043   1      < 0.1%
1013   1      < 0.1%
1003   1      < 0.1%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

            연령대코드  성별   가입 수
연령대코드  1.000       0.062  0.325
성별        0.062       1.000  0.028
가입 수     0.325       0.028  1.000

[Figure: correlation heatmap]

            연령대코드  성별
연령대코드  1.000       0.046
성별        0.046       1.000

[Figure: correlation heatmap]

            가입 수  연령대코드  성별
가입 수     1.000    0.161       0.022
연령대코드  0.161    1.000       0.046
성별        0.022    0.046       1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows

Index  가입일자    사용자코드   연령대코드  성별  가입 수
0      2022-01-01  회원-내국인  10대        F     39
1      2022-01-01  회원-내국인  10대        M     2
2      2022-01-01  회원-내국인  10대        M     70
3      2022-01-01  회원-내국인  20대        F     63
4      2022-01-01  회원-내국인  20대        M     7
5      2022-01-01  회원-내국인  20대        M     77
6      2022-01-01  회원-내국인  30대        F     27
7      2022-01-01  회원-내국인  30대        M     4
8      2022-01-01  회원-내국인  30대        M     47
9      2022-01-01  회원-내국인  40대        F     4

Last rows

Index  가입일자    사용자코드   연령대코드  성별  가입 수
4846   2022-06-30  회원-내국인  50대        F     3
4847   2022-06-30  회원-내국인  50대        F     14
4848   2022-06-30  회원-내국인  50대        M     8
4849   2022-06-30  회원-내국인  50대        M     20
4850   2022-06-30  회원-내국인  60대        F     1
4851   2022-06-30  회원-내국인  60대        F     6
4852   2022-06-30  회원-내국인  60대        M     6
4853   2022-06-30  회원-내국인  60대        M     10
4854   2022-06-30  회원-내국인  기타        F     1
4855   2022-06-30  회원-내국인  기타        M     2

Duplicate rows

Most frequently occurring

Index  가입일자    사용자코드   연령대코드  성별  가입 수  # duplicates
0      2022-01-10  회원-내국인  60대        M     5        2
1      2022-01-13  회원-내국인  기타        M     1        2
2      2022-01-15  회원-내국인  60대        F     1        2
3      2022-01-19  회원-내국인  60대        M     1        2
4      2022-01-21  회원-내국인  60대        F     1        2
5      2022-01-24  회원-내국인  60대        F     1        2
6      2022-01-28  회원-내국인  70대이상    M     1        2
7      2022-02-02  회원-내국인  70대이상    M     1        2
8      2022-02-04  회원-내국인  60대        M     3        2
9      2022-03-07  회원-내국인  70대이상    M     1        2
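
A listing like this can be produced by counting identical rows. The sketch below uses only the standard library and a handful of rows taken from the tables above, so it illustrates the idea rather than the profiler's actual implementation; it assumes "# duplicates" means the number of times a row occurs.

```python
from collections import Counter

# Illustrative subset of rows: (가입일자, 사용자코드, 연령대코드, 성별, 가입 수).
rows = [
    ("2022-01-10", "회원-내국인", "60대", "M", 5),
    ("2022-01-10", "회원-내국인", "60대", "M", 5),
    ("2022-01-13", "회원-내국인", "기타", "M", 1),
    ("2022-01-13", "회원-내국인", "기타", "M", 1),
    ("2022-01-01", "회원-내국인", "10대", "F", 39),
]

# Count how many times each complete row occurs.
occurrences = Counter(rows)

# Keep only rows that occur more than once, most frequent first.
duplicates = [(row, n) for row, n in occurrences.most_common() if n > 1]

for row, n in duplicates:
    print(row, n)
```

Rows are compared as whole tuples, so two records count as duplicates only when every column matches.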