Dataset statistics
Number of variables | 7 |
---|---|
Number of observations | 500 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 29.4 KiB |
Average record size in memory | 60.3 B |
Variable types
Text | 1 |
---|---|
Numeric | 4 |
Categorical | 2 |
Dataset
Description | Sample data |
---|---|
Author | Shinhan Card (신한카드) |
URL | https://bigdata.seoul.go.kr/data/selectSampleData.do?sample_data_seq=318 |
Reproduction
Analysis started | 2023-12-10 14:58:21.092946 |
---|---|
Analysis finished | 2023-12-10 14:58:28.174410 |
Duration | 7.08 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
서울시민업종코드(UPJONG_CD): Seoul citizen industry code
Text
Distinct | 65 |
---|---|
Distinct (%) | 13.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 4.0 KiB |
Value | Count | Frequency (%) |
ss016 | 31 | 6.2% |
ss069 | 25 | 5.0% |
ss001 | 25 | 5.0% |
ss058 | 22 | 4.4% |
ss008 | 21 | 4.2% |
ss013 | 20 | 4.0% |
ss048 | 18 | 3.6% |
ss055 | 17 | 3.4% |
ss004 | 16 | 3.2% |
ss068 | 15 | 3.0% |
Other values (55) | 290 | 58.0% |
Most occurring characters
Value | Count | Frequency (%) |
S | 1000 | 40.0% |
0 | 650 | 26.0% |
1 | 152 | 6.1% |
4 | 130 | 5.2% |
6 | 126 | 5.0% |
5 | 114 | 4.6% |
8 | 92 | 3.7% |
3 | 75 | 3.0% |
2 | 73 | 2.9% |
9 | 52 | 2.1% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 1500 | 60.0% |
Uppercase Letter | 1000 | 40.0% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 650 | 43.3% |
1 | 152 | 10.1% |
4 | 130 | 8.7% |
6 | 126 | 8.4% |
5 | 114 | 7.6% |
8 | 92 | 6.1% |
3 | 75 | 5.0% |
2 | 73 | 4.9% |
9 | 52 | 3.5% |
7 | 36 | 2.4% |
Uppercase Letter
Value | Count | Frequency (%) |
S | 1000 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 1500 | 60.0% |
Latin | 1000 | 40.0% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 650 | 43.3% |
1 | 152 | 10.1% |
4 | 130 | 8.7% |
6 | 126 | 8.4% |
5 | 114 | 7.6% |
8 | 92 | 6.1% |
3 | 75 | 5.0% |
2 | 73 | 4.9% |
9 | 52 | 3.5% |
7 | 36 | 2.4% |
Latin
Value | Count | Frequency (%) |
S | 1000 | 100.0% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 2500 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
S | 1000 | 40.0% |
0 | 650 | 26.0% |
1 | 152 | 6.1% |
4 | 130 | 5.2% |
6 | 126 | 5.0% |
5 | 114 | 4.6% |
8 | 92 | 3.7% |
3 | 75 | 3.0% |
2 | 73 | 2.9% |
9 | 52 | 2.1% |
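The category, script, and block tables above are character-level tallies over all 500 values of the text variable. A minimal sketch of the category tally, assuming Python's standard `unicodedata` module and using only the ten codes visible in the first sample rows of this report (the function name `category_counts` is illustrative and not part of ydata-profiling):

```python
from collections import Counter
import unicodedata


def category_counts(values):
    """Count characters across all strings, grouped by Unicode general category."""
    counts = Counter()
    for value in values:
        for ch in value:
            # 'Lu' = Uppercase Letter, 'Nd' = Decimal Number
            counts[unicodedata.category(ch)] += 1
    return counts


# UPJONG_CD values of rows 0-9 in this report (not the full 500-row column).
sample_codes = ["SS013", "SS048", "SS016", "SS001", "SS044",
                "SS048", "SS058", "SS020", "SS007", "SS005"]
counts = category_counts(sample_codes)
print(counts["Lu"], counts["Nd"])
```

Each sampled code is two uppercase letters followed by three digits, which is consistent with the 1000 to 1500 split between Uppercase Letter and Decimal Number reported for the full column.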
기준년월(YM): reference year-month
Real number (ℝ)
Distinct | 67 |
---|---|
Distinct (%) | 13.4% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 201834.71 |
Minimum | 201601 |
---|---|
Maximum | 202107 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 4.5 KiB |
Quantile statistics
Minimum | 201601 |
---|---|
5-th percentile | 201603 |
Q1 | 201705 |
median | 201810 |
Q3 | 202003 |
95-th percentile | 202104 |
Maximum | 202107 |
Range | 506 |
Interquartile range (IQR) | 298 |
Descriptive statistics
Standard deviation | 163.52945 |
---|---|
Coefficient of variation (CV) | 0.00081021469 |
Kurtosis | -1.1957063 |
Mean | 201834.71 |
Median Absolute Deviation (MAD) | 107.5 |
Skewness | 0.086923505 |
Sum | 1.0091736 × 10^8 |
Variance | 26741.881 |
Monotonicity | Not monotonic |
Common Values
Value | Count | Frequency (%) |
201812 | 13 | 2.6% |
202012 | 13 | 2.6% |
201907 | 12 | 2.4% |
202104 | 12 | 2.4% |
202003 | 11 | 2.2% |
201903 | 11 | 2.2% |
201604 | 11 | 2.2% |
201804 | 11 | 2.2% |
202011 | 11 | 2.2% |
202106 | 10 | 2.0% |
Other values (57) | 385 | 77.0% |
Minimum 10 values
Value | Count | Frequency (%) |
201601 | 10 | 2.0% |
201602 | 9 | 1.8% |
201603 | 7 | 1.4% |
201604 | 11 | 2.2% |
201605 | 9 | 1.8% |
201606 | 10 | 2.0% |
201607 | 8 | 1.6% |
201608 | 10 | 2.0% |
201609 | 2 | 0.4% |
201610 | 7 | 1.4% |
Maximum 10 values
Value | Count | Frequency (%) |
202107 | 6 | 1.2% |
202106 | 10 | 2.0% |
202105 | 7 | 1.4% |
202104 | 12 | 2.4% |
202103 | 6 | 1.2% |
202102 | 6 | 1.2% |
202101 | 6 | 1.2% |
202012 | 13 | 2.6% |
202011 | 11 | 2.2% |
202010 | 5 | 1.0% |
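The profiler treats YM as a plain real number, so the reported Range (506) and IQR (298) are differences between YYYYMM integers rather than month counts. A small sketch of converting to elapsed months, assuming the values encode year × 100 + month (the helper names are hypothetical):

```python
def ym_to_month_index(ym):
    """Map a YYYYMM integer to a running month index (year * 12 + zero-based month)."""
    year, month = divmod(ym, 100)
    if not 1 <= month <= 12:
        raise ValueError(f"not a valid YYYYMM value: {ym}")
    return year * 12 + (month - 1)


def months_between(start_ym, end_ym):
    """Number of months elapsed from start_ym to end_ym."""
    return ym_to_month_index(end_ym) - ym_to_month_index(start_ym)


# Minimum and maximum, then Q1 and Q3, as reported in the quantile statistics.
full_span = months_between(201601, 202107)
iqr_span = months_between(201705, 202003)
print(full_span, iqr_span)
```

A span of 66 months covers 67 calendar months, which agrees with the Distinct count of 67 for this variable.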
고객주소블록코드(BLOCK_CD): customer address block code
Real number (ℝ)
Distinct | 498 |
---|---|
Distinct (%) | 99.6% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 209152.71 |
Minimum | 1700 |
---|---|
Maximum | 502146 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 4.5 KiB |
Quantile statistics
Minimum | 1700 |
---|---|
5-th percentile | 12691.4 |
Q1 | 147162.5 |
median | 218893.5 |
Q3 | 320160.25 |
95-th percentile | 415276.5 |
Maximum | 502146 |
Range | 500446 |
Interquartile range (IQR) | 172997.75 |
Descriptive statistics
Standard deviation | 130034.57 |
---|---|
Coefficient of variation (CV) | 0.6217207 |
Kurtosis | -0.92883365 |
Mean | 209152.71 |
Median Absolute Deviation (MAD) | 78651.5 |
Skewness | -0.079739872 |
Sum | 1.0457635 × 10^8 |
Variance | 1.6908989 × 10^10 |
Monotonicity | Not monotonic |
Common Values
Value | Count | Frequency (%) |
363705 | 2 | 0.4% |
152747 | 2 | 0.4% |
14037 | 1 | 0.2% |
217781 | 1 | 0.2% |
215407 | 1 | 0.2% |
231070 | 1 | 0.2% |
283313 | 1 | 0.2% |
282459 | 1 | 0.2% |
228606 | 1 | 0.2% |
278916 | 1 | 0.2% |
Other values (488) | 488 | 97.6% |
Minimum 10 values
Value | Count | Frequency (%) |
1700 | 1 | 0.2% |
1795 | 1 | 0.2% |
5648 | 1 | 0.2% |
7175 | 1 | 0.2% |
8321 | 1 | 0.2% |
8406 | 1 | 0.2% |
8470 | 1 | 0.2% |
8705 | 1 | 0.2% |
8880 | 1 | 0.2% |
9022 | 1 | 0.2% |
Maximum 10 values
Value | Count | Frequency (%) |
502146 | 1 | 0.2% |
501646 | 1 | 0.2% |
423014 | 1 | 0.2% |
422551 | 1 | 0.2% |
422515 | 1 | 0.2% |
421887 | 1 | 0.2% |
421556 | 1 | 0.2% |
421419 | 1 | 0.2% |
420859 | 1 | 0.2% |
420603 | 1 | 0.2% |
성별(GEDNER): gender
Categorical
Distinct | 2 |
---|---|
Distinct (%) | 0.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 4.0 KiB |
Length
Max length | 1 |
---|---|
Median length | 1 |
Mean length | 1 |
Min length | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | F |
---|---|
2nd row | M |
3rd row | M |
4th row | F |
5th row | M |
Common Values
Value | Count | Frequency (%) |
M | 260 | 52.0% |
F | 240 | 48.0% |
연령대별(AGE): age group
Categorical
Distinct | 7 |
---|---|
Distinct (%) | 1.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 4.0 KiB |
Length
Max length | 5 |
---|---|
Median length | 3 |
Mean length | 3.152 |
Min length | 3 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 50대 |
---|---|
2nd row | 30대 |
3rd row | 50대 |
4th row | 30대 |
5th row | 30대 |
Common Values
Value | Count | Frequency (%) |
40대 | 109 | 21.8% |
30대 | 104 | 20.8% |
20대 | 97 | 19.4% |
50대 | 77 | 15.4% |
60대 | 63 | 12.6% |
70대이상 | 38 | 7.6% |
10대 | 12 | 2.4% |
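Each Frequency (%) entry is the count divided by the 500 observations. A short sketch that recomputes the column from the counts in the Common Values table above (the variable names are illustrative):

```python
# Counts copied from the Common Values table for 연령대별(AGE).
age_counts = {
    "40대": 109,
    "30대": 104,
    "20대": 97,
    "50대": 77,
    "60대": 63,
    "70대이상": 38,
    "10대": 12,
}

total = sum(age_counts.values())

# Percentage of all observations, rounded to one decimal as in the report.
age_freq = {label: round(100 * count / total, 1)
            for label, count in age_counts.items()}

for label, pct in age_freq.items():
    print(f"{label} | {age_counts[label]} | {pct}% |")
```

The counts sum to the 500 observations listed in the dataset statistics, so no age group is hidden behind an "Other values" row here.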
카드이용금액계(AMT_CORR): total card spending amount
Real number (ℝ)
Distinct | 413 |
---|---|
Distinct (%) | 82.6% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 907636.86 |
Minimum | 15 |
---|---|
Maximum | 50385007 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 4.5 KiB |
Quantile statistics
Minimum | 15 |
---|---|
5-th percentile | 20044.55 |
Q1 | 80480 |
median | 238422 |
Q3 | 717554.5 |
95-th percentile | 3602858.2 |
Maximum | 50385007 |
Range | 50384992 |
Interquartile range (IQR) | 637074.5 |
Descriptive statistics
Standard deviation | 2952486.3 |
---|---|
Coefficient of variation (CV) | 3.2529379 |
Kurtosis | 168.53302 |
Mean | 907636.86 |
Median Absolute Deviation (MAD) | 188625 |
Skewness | 11.341779 |
Sum | 4.5381843 × 10^8 |
Variance | 8.7171756 × 10^12 |
Monotonicity | Not monotonic |
Common Values
Value | Count | Frequency (%) |
50300 | 8 | 1.6% |
75450 | 7 | 1.4% |
15090 | 5 | 1.0% |
60360 | 5 | 1.0% |
100600 | 4 | 0.8% |
251500 | 4 | 0.8% |
20120 | 4 | 0.8% |
30180 | 4 | 0.8% |
226350 | 4 | 0.8% |
65390 | 4 | 0.8% |
Other values (403) | 451 | 90.2% |
Minimum 10 values
Value | Count | Frequency (%) |
15 | 1 | 0.2% |
2515 | 1 | 0.2% |
3521 | 1 | 0.2% |
4024 | 1 | 0.2% |
5533 | 2 | 0.4% |
6036 | 1 | 0.2% |
7545 | 2 | 0.4% |
8048 | 3 | 0.6% |
10060 | 1 | 0.2% |
11066 | 1 | 0.2% |
Maximum 10 values
Value | Count | Frequency (%) |
50385007 | 1 | 0.2% |
23551179 | 1 | 0.2% |
19322795 | 1 | 0.2% |
13478242 | 1 | 0.2% |
11102382 | 1 | 0.2% |
9590198 | 1 | 0.2% |
8722976 | 1 | 0.2% |
8620414 | 1 | 0.2% |
7207291 | 1 | 0.2% |
7108134 | 1 | 0.2% |
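Several of the AMT_CORR statistics are derived from others in the same tables: the coefficient of variation is the standard deviation divided by the mean, the IQR is Q3 minus Q1, and the variance is the squared standard deviation. A quick consistency check using only the rounded figures printed in this report (the variable names are illustrative, and small differences from the reported values come from that rounding):

```python
import math

# Values copied from the AMT_CORR statistics in this report.
mean = 907636.86
std = 2952486.3
q1 = 80480
q3 = 717554.5

cv = std / mean        # coefficient of variation
iqr = q3 - q1          # interquartile range
variance = std ** 2    # variance from the standard deviation

print(f"CV       = {cv:.7f}")
print(f"IQR      = {iqr}")
print(f"Variance = {variance:.7e}")

# Compare against the reported figures with a loose relative tolerance.
assert math.isclose(cv, 3.2529379, rel_tol=1e-6)
assert math.isclose(variance, 8.7171756e12, rel_tol=1e-6)
```

A CV above 3, together with a mean (907636.86) nearly four times the median (238422), reflects the long right tail visible in the largest values.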
카드이용건수계(USECT_CORR): total card transaction count
Real number (ℝ)
Distinct | 45 |
---|---|
Distinct (%) | 9.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 36.316 |
Minimum | 5 |
---|---|
Maximum | 795 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 4.5 KiB |
Quantile statistics
Minimum | 5 |
---|---|
5-th percentile | 5 |
Q1 | 5 |
median | 10 |
Q3 | 35 |
95-th percentile | 161 |
Maximum | 795 |
Range | 790 |
Interquartile range (IQR) | 30 |
Descriptive statistics
Standard deviation | 72.176555 |
---|---|
Coefficient of variation (CV) | 1.9874588 |
Kurtosis | 44.012155 |
Mean | 36.316 |
Median Absolute Deviation (MAD) | 5 |
Skewness | 5.6742988 |
Sum | 18158 |
Variance | 5209.4551 |
Monotonicity | Not monotonic |
Common Values
Value | Count | Frequency (%) |
5 | 165 | 33.0% |
10 | 86 | 17.2% |
15 | 55 | 11.0% |
25 | 27 | 5.4% |
20 | 23 | 4.6% |
30 | 18 | 3.6% |
45 | 14 | 2.8% |
40 | 10 | 2.0% |
55 | 9 | 1.8% |
50 | 8 | 1.6% |
Other values (35) | 85 | 17.0% |
Minimum 10 values
Value | Count | Frequency (%) |
5 | 165 | 33.0% |
10 | 86 | 17.2% |
15 | 55 | 11.0% |
20 | 23 | 4.6% |
25 | 27 | 5.4% |
30 | 18 | 3.6% |
35 | 7 | 1.4% |
40 | 10 | 2.0% |
45 | 14 | 2.8% |
50 | 8 | 1.6% |
Maximum 10 values
Value | Count | Frequency (%) |
795 | 1 | 0.2% |
684 | 1 | 0.2% |
488 | 1 | 0.2% |
463 | 1 | 0.2% |
423 | 1 | 0.2% |
297 | 1 | 0.2% |
282 | 1 | 0.2% |
267 | 1 | 0.2% |
246 | 1 | 0.2% |
231 | 3 | 0.6% |
Variable | 서울시민업종코드(UPJONG_CD) | 기준년월(YM) | 고객주소블록코드(BLOCK_CD) | 성별(GEDNER) | 연령대별(AGE) | 카드이용금액계(AMT_CORR) | 카드이용건수계(USECT_CORR) |
---|---|---|---|---|---|---|---|
서울시민업종코드(UPJONG_CD) | 1.000 | 0.000 | 0.339 | 0.000 | 0.206 | 0.000 | 0.632 |
기준년월(YM) | 0.000 | 1.000 | 0.000 | 0.066 | 0.004 | 0.000 | 0.000 |
고객주소블록코드(BLOCK_CD) | 0.339 | 0.000 | 1.000 | 0.132 | 0.000 | 0.072 | 0.000 |
성별(GEDNER) | 0.000 | 0.066 | 0.132 | 1.000 | 0.028 | 0.000 | 0.000 |
연령대별(AGE) | 0.206 | 0.004 | 0.000 | 0.028 | 1.000 | 0.162 | 0.000 |
카드이용금액계(AMT_CORR) | 0.000 | 0.000 | 0.072 | 0.000 | 0.162 | 1.000 | 0.000 |
카드이용건수계(USECT_CORR) | 0.632 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 |
Variable | 성별(GEDNER) | 연령대별(AGE) |
---|---|---|
성별(GEDNER) | 1.000 | 0.030 |
연령대별(AGE) | 0.030 | 1.000 |
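The report does not state which association measure produced the 성별(GEDNER) and 연령대별(AGE) table above. If it is Cramér's V, a common choice for pairs of categorical variables, the uncorrected statistic can be sketched as follows. This is an assumption: the function name is hypothetical, the profiler may apply a bias correction that this sketch omits, and the toy inputs are not taken from the dataset.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed cell counts
    rows = Counter(xs)             # row marginals
    cols = Counter(ys)             # column marginals

    # Pearson chi-squared statistic over every cell of the contingency table.
    chi2 = 0.0
    for r, r_count in rows.items():
        for c, c_count in cols.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(rows), len(cols)) - 1
    if k == 0:
        return 0.0  # a constant variable carries no association
    return math.sqrt(chi2 / (n * k))


# Toy data (not from the report): perfect association, then independence.
perfect = cramers_v(["M", "F"] * 5, ["a", "b"] * 5)
independent = cramers_v(["M", "M", "F", "F"], ["a", "b", "a", "b"])
print(perfect, independent)
```

Values near 0, such as the 0.030 reported above, indicate little association between the two variables under this kind of measure.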
Variable | 기준년월(YM) | 고객주소블록코드(BLOCK_CD) | 카드이용금액계(AMT_CORR) | 카드이용건수계(USECT_CORR) | 성별(GEDNER) | 연령대별(AGE) |
---|---|---|---|---|---|---|
기준년월(YM) | 1.000 | -0.042 | 0.075 | -0.062 | 0.033 | 0.016 |
고객주소블록코드(BLOCK_CD) | -0.042 | 1.000 | -0.004 | 0.103 | 0.100 | 0.000 |
카드이용금액계(AMT_CORR) | 0.075 | -0.004 | 1.000 | 0.010 | 0.000 | 0.097 |
카드이용건수계(USECT_CORR) | -0.062 | 0.103 | 0.010 | 1.000 | 0.000 | 0.000 |
성별(GEDNER) | 0.033 | 0.100 | 0.000 | 0.000 | 1.000 | 0.030 |
연령대별(AGE) | 0.016 | 0.000 | 0.097 | 0.000 | 0.030 | 1.000 |
First rows
Index | 서울시민업종코드(UPJONG_CD) | 기준년월(YM) | 고객주소블록코드(BLOCK_CD) | 성별(GEDNER) | 연령대별(AGE) | 카드이용금액계(AMT_CORR) | 카드이용건수계(USECT_CORR) |
---|---|---|---|---|---|---|---|
0 | SS013 | 201906 | 14037 | F | 50대 | 48288 | 65 |
1 | SS048 | 201608 | 156830 | M | 30대 | 132792 | 10 |
2 | SS016 | 202009 | 32925 | M | 50대 | 100600 | 25 |
3 | SS001 | 201805 | 214245 | F | 30대 | 213876 | 40 |
4 | SS044 | 201705 | 279472 | M | 30대 | 155930 | 5 |
5 | SS048 | 201911 | 220797 | M | 40대 | 20120 | 15 |
6 | SS058 | 202006 | 152728 | F | 40대 | 285201 | 91 |
7 | SS020 | 201711 | 83127 | F | 60대 | 15090 | 15 |
8 | SS007 | 201808 | 366838 | M | 30대 | 197931 | 5 |
9 | SS005 | 201601 | 15146 | M | 50대 | 155930 | 5 |
Last rows
Index | 서울시민업종코드(UPJONG_CD) | 기준년월(YM) | 고객주소블록코드(BLOCK_CD) | 성별(GEDNER) | 연령대별(AGE) | 카드이용금액계(AMT_CORR) | 카드이용건수계(USECT_CORR) |
---|---|---|---|---|---|---|---|
490 | SS015 | 202007 | 17960 | M | 20대 | 729853 | 5 |
491 | SS044 | 201908 | 17038 | F | 30대 | 804347 | 5 |
492 | SS069 | 201907 | 24110 | F | 20대 | 15090 | 5 |
493 | SS007 | 201905 | 227869 | F | 30대 | 342040 | 5 |
494 | SS068 | 201710 | 25185 | F | 60대 | 60360 | 5 |
495 | SS012 | 201709 | 11449 | F | 30대 | 75450 | 45 |
496 | SS044 | 201707 | 353037 | F | 30대 | 118708 | 25 |
497 | SS054 | 201901 | 418149 | F | 20대 | 1056300 | 15 |
498 | SS021 | 202009 | 269015 | F | 70대이상 | 25150 | 5 |
499 | SS003 | 201812 | 28101 | M | 40대 | 569094 | 5 |