Dataset statistics
Number of variables | 9 |
---|---|
Number of observations | 500 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 37.7 KiB |
Average record size in memory | 77.3 B |
Variable types
Categorical | 6 |
---|---|
Numeric | 3 |
Dataset
Description | Sample data |
---|---|
Author | Shinhan Card (신한카드) |
URL | https://bigdata.seoul.go.kr/data/selectSampleData.do?sample_data_seq=4 |
섹터코드(SECTOR_CD) has constant value "1" | Constant |
매출건수합(SUM_COUNT) is highly imbalanced (76.3%) | Imbalance |
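The 76.3% imbalance figure for 매출건수합(SUM_COUNT) is consistent with a normalized-entropy score: one minus the Shannon entropy of the class distribution divided by its maximum possible value. The sketch below shows how such a score can be computed under that assumption; it is not a copy of the ydata-profiling implementation, and the function name `imbalance_score` is hypothetical. The counts are the four SUM_COUNT frequencies listed in this report.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the base-2 Shannon entropy of the counts."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # assumption: a single class is scored as 0 here
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)

# Counts of SUM_COUNT values 1, 2, 3 and 4 in the 500-row sample.
sum_count = [458, 36, 5, 1]
score = imbalance_score(sum_count)
print(f"{score:.1%}")  # prints 76.3%
```

A perfectly balanced variable scores 0 under this definition, and the score approaches 1 as a single class dominates.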
Reproduction
Analysis started | 2023-12-10 14:53:24.278482 |
---|---|
Analysis finished | 2023-12-10 14:53:26.374195 |
Duration | 2.1 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
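The reported duration follows directly from the two timestamps above. A minimal check, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2023-12-10 14:53:24.278482", FMT)
finished = datetime.strptime("2023-12-10 14:53:26.374195", FMT)

# Elapsed wall-clock time between the start and the end of the analysis.
duration = (finished - started).total_seconds()
print(f"{duration:.1f} seconds")  # prints 2.1 seconds
```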
섹터코드(SECTOR_CD)
Categorical
CONSTANT
Distinct | 1 |
---|---|
Distinct (%) | 0.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 4.0 KiB |
1 | 500 |
---|---|
Length
Max length | 1 |
---|---|
Median length | 1 |
Mean length | 1 |
Min length | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 1 |
---|---|
2nd row | 1 |
3rd row | 1 |
4th row | 1 |
5th row | 1 |
Common Values
Value | Count | Frequency (%) |
---|---|---|
1 | 500 | 100.0% |
년월일(DATE)
Real number (ℝ)
Distinct | 33 |
---|---|
Distinct (%) | 6.6% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 20151086 |
Minimum | 20151022 |
---|---|
Maximum | 20151123 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 4.5 KiB |
Quantile statistics
Minimum | 20151022 |
---|---|
5-th percentile | 20151023 |
Q1 | 20151030 |
median | 20151106 |
Q3 | 20151115 |
95-th percentile | 20151121 |
Maximum | 20151123 |
Range | 101 |
Interquartile range (IQR) | 85 |
Descriptive statistics
Standard deviation | 39.582779 |
---|---|
Coefficient of variation (CV) | 1.9643 × 10⁻⁶ |
Kurtosis | -1.2570635 |
Mean | 20151086 |
Median Absolute Deviation (MAD) | 11 |
Skewness | -0.80420889 |
Sum | 1.0075543 × 10¹⁰ |
Variance | 1566.7964 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
---|---|---|
20151102 | 23 | 4.6% |
20151030 | 21 | 4.2% |
20151105 | 21 | 4.2% |
20151117 | 21 | 4.2% |
20151103 | 19 | 3.8% |
20151026 | 19 | 3.8% |
20151120 | 19 | 3.8% |
20151111 | 19 | 3.8% |
20151121 | 19 | 3.8% |
20151113 | 18 | 3.6% |
Other values (23) | 301 | 60.2% |
Value | Count | Frequency (%) |
---|---|---|
20151022 | 18 | 3.6% |
20151023 | 15 | 3.0% |
20151024 | 11 | 2.2% |
20151025 | 6 | 1.2% |
20151026 | 19 | 3.8% |
20151027 | 18 | 3.6% |
20151028 | 15 | 3.0% |
20151029 | 14 | 2.8% |
20151030 | 21 | 4.2% |
20151031 | 14 | 2.8% |
Value | Count | Frequency (%) |
---|---|---|
20151123 | 16 | 3.2% |
20151122 | 9 | 1.8% |
20151121 | 19 | 3.8% |
20151120 | 19 | 3.8% |
20151119 | 16 | 3.2% |
20151118 | 12 | 2.4% |
20151117 | 21 | 4.2% |
20151116 | 12 | 2.4% |
20151115 | 8 | 1.6% |
20151114 | 13 | 2.6% |
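년월일(DATE) holds integers in YYYYMMDD form, which is why it is profiled as a real number. Statistics such as the mean (20151086) and the range (101) therefore describe the integer encoding rather than calendar time. The sketch below parses the reported minimum and maximum as dates using only the standard library; the helper name `parse_yyyymmdd` is hypothetical.

```python
from datetime import datetime

def parse_yyyymmdd(value):
    """Convert an integer such as 20151022 into a datetime.date."""
    return datetime.strptime(str(value), "%Y%m%d").date()

first = parse_yyyymmdd(20151022)  # reported minimum
last = parse_yyyymmdd(20151123)   # reported maximum

span_days = (last - first).days
print(span_days)      # prints 32, versus an integer range of 101
print(span_days + 1)  # prints 33, the same as the 33 distinct values reported
```

Converting the column to a date type before profiling would likely give more meaningful summaries, although that step is not part of this report.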
업종코드(C_CD)
Categorical
Distinct | 32 |
---|---|
Distinct (%) | 6.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 4.0 KiB |
편의점 | 85 |
---|---|
한식음식점 | 78 |
커피음료 | 69 |
패스트푸드점 | 50 |
슈퍼마켓 | 25 |
Other values (27) | 193 |
Length
Max length | 7 |
---|---|
Median length | 6 |
Mean length | 4.074 |
Min length | 2 |
Unique
Unique | 10 |
---|---|
Unique (%) | 2.0% |
Sample
1st row | 한식음식점 |
---|---|
2nd row | 제과점 |
3rd row | 분식집 |
4th row | 중국집 |
5th row | 한식음식점 |
Common Values
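Value | Count | Frequency (%) |
---|---|---|
편의점 (convenience store) | 85 | 17.0% |
한식음식점 (Korean restaurant) | 78 | 15.6% |
커피음료 (coffee and beverages) | 69 | 13.8% |
패스트푸드점 (fast-food restaurant) | 50 | 10.0% |
슈퍼마켓 (supermarket) | 25 | 5.0% |
외식업기타 (other food service) | 25 | 5.0% |
제과점 (bakery) | 22 | 4.4% |
분식집 (snack bar) | 19 | 3.8% |
호프간이주점 (pub / casual bar) | 19 | 3.8% |
주차장 (parking lot) | 16 | 3.2% |
Other values (22) | 92 | 18.4% |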
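The Frequency (%) column is each count divided by the 500 observations in the dataset. A minimal sketch of that arithmetic, using counts copied from the table above; the helper name `frequency_pct` is hypothetical:

```python
N_OBSERVATIONS = 500  # "Number of observations" in the dataset statistics

# Counts copied from the C_CD Common Values table (top five categories).
counts = {"편의점": 85, "한식음식점": 78, "커피음료": 69, "패스트푸드점": 50, "슈퍼마켓": 25}

def frequency_pct(count, total=N_OBSERVATIONS):
    """Return the share of rows as a percentage rounded to one decimal."""
    return round(100 * count / total, 1)

for value, count in counts.items():
    print(f"{value} | {count} | {frequency_pct(count):.1f}% |")
```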
성별코드(GEN_CD)
Categorical
Distinct | 2 |
---|---|
Distinct (%) | 0.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 4.0 KiB |
M | 258 |
---|---|
F | 242 |
Length
Max length | 1 |
---|---|
Median length | 1 |
Mean length | 1 |
Min length | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | M |
---|---|
2nd row | F |
3rd row | F |
4th row | M |
5th row | M |
Common Values
Value | Count | Frequency (%) |
---|---|---|
M | 258 | 51.6% |
F | 242 | 48.4% |
연령대별코드(AGE_CD)
Categorical
Distinct | 9 |
---|---|
Distinct (%) | 1.8% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 4.0 KiB |
20대후반 | 92 |
---|---|
30대초반 | 84 |
20대초반이하 | 77 |
30대후반 | 75 |
40대초반 | 60 |
Other values (4) | 112 |
Length
Max length | 7 |
---|---|
Median length | 5 |
Mean length | 5.308 |
Min length | 5 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 20대후반 |
---|---|
2nd row | 20대초반이하 |
3rd row | 30대후반 |
4th row | 30대초반 |
5th row | 20대초반이하 |
Common Values
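Value | Count | Frequency (%) |
---|---|---|
20대후반 (late 20s) | 92 | 18.4% |
30대초반 (early 30s) | 84 | 16.8% |
20대초반이하 (early 20s or younger) | 77 | 15.4% |
30대후반 (late 30s) | 75 | 15.0% |
40대초반 (early 40s) | 60 | 12.0% |
40대후반 (late 40s) | 48 | 9.6% |
50대초반 (early 50s) | 27 | 5.4% |
60대이상 (60 or older) | 22 | 4.4% |
50대후반 (late 50s) | 15 | 3.0% |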
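The Length statistics for this variable can be reproduced from the value counts, assuming that length is measured in characters (Unicode code points, as Python's `len` does) and that the mean is weighted by row count. This is a sketch of the arithmetic, not the profiler's own code:

```python
# Counts copied from the AGE_CD Common Values table.
age_counts = {
    "20대후반": 92, "30대초반": 84, "20대초반이하": 77, "30대후반": 75, "40대초반": 60,
    "40대후반": 48, "50대초반": 27, "60대이상": 22, "50대후반": 15,
}

total_rows = sum(age_counts.values())
lengths = [len(value) for value in age_counts]  # len() counts Unicode code points

# Row-weighted mean: each value's length counts once per row it appears in.
mean_length = sum(len(v) * c for v, c in age_counts.items()) / total_rows

print(total_rows)                  # prints 500
print(min(lengths), max(lengths))  # prints 5 7
print(mean_length)                 # prints 5.308
```

Only 20대초반이하 has seven characters, so its 77 rows account for the whole gap between the minimum length of 5 and the mean of 5.308.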
유입지코드(INFLOW_CD)
Real number (ℝ)
Distinct | 219 |
---|---|
Distinct (%) | 43.8% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 7355324.7 |
Minimum | 26320 |
---|---|
Maximum | 11740110 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 4.5 KiB |
Quantile statistics
Minimum | 26320 |
---|---|
5-th percentile | 28245 |
Q1 | 41440 |
median | 11260102 |
Q3 | 11500104 |
95-th percentile | 11710102 |
Maximum | 11740110 |
Range | 11713790 |
Interquartile range (IQR) | 11458664 |
Descriptive statistics
Standard deviation | 5471235.3 |
---|---|
Coefficient of variation (CV) | 0.74384688 |
Kurtosis | -1.6530753 |
Mean | 7355324.7 |
Median Absolute Deviation (MAD) | 374999 |
Skewness | -0.59169431 |
Sum | 3.6776623 × 10⁹ |
Variance | 2.9934415 × 10¹³ |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
---|---|---|
41281 | 12 | 2.4% |
41135 | 10 | 2.0% |
41465 | 9 | 1.8% |
11650107 | 9 | 1.8% |
41287 | 8 | 1.6% |
11500103 | 8 | 1.6% |
28260 | 7 | 1.4% |
41463 | 7 | 1.4% |
11410111 | 7 | 1.4% |
28237 | 6 | 1.2% |
Other values (209) | 417 | 83.4% |
Value | Count | Frequency (%) |
---|---|---|
26320 | 2 | 0.4% |
26500 | 1 | 0.2% |
27110 | 2 | 0.4% |
27140 | 1 | 0.2% |
27230 | 1 | 0.2% |
27290 | 1 | 0.2% |
28170 | 2 | 0.4% |
28185 | 2 | 0.4% |
28200 | 2 | 0.4% |
28237 | 6 | 1.2% |
Value | Count | Frequency (%) |
---|---|---|
11740110 | 1 | 0.2% |
11740109 | 4 | 0.8% |
11740108 | 2 | 0.4% |
11740106 | 2 | 0.4% |
11740105 | 1 | 0.2% |
11740103 | 1 | 0.2% |
11710114 | 1 | 0.2% |
11710111 | 2 | 0.4% |
11710108 | 2 | 0.4% |
11710107 | 3 | 0.6% |
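유입지코드(INFLOW_CD) is profiled as a real number, but its values look like identifiers of two different lengths (five and eight digits). If so, the mean and standard deviation above say little about the data. The sketch below treats the codes as text and groups them by digit count and two-digit prefix. Reading the prefix as a higher-level region is an assumption, since the report does not document the coding scheme, and `describe_code` is a hypothetical helper.

```python
from collections import Counter

# INFLOW_CD values copied from the first ten sample rows of the report.
codes = [11215103, 11350105, 11170101, 11290133, 11170119,
         41115, 41290, 41463, 11230105, 41281]

def describe_code(code):
    """Return (digit count, two-digit prefix) for a code treated as text."""
    text = str(code)
    return len(text), text[:2]

by_shape = Counter(describe_code(code) for code in codes)
for (n_digits, prefix), n in sorted(by_shape.items()):
    print(f"{n_digits}-digit codes starting with {prefix}: {n}")
# prints:
# 5-digit codes starting with 41: 4
# 8-digit codes starting with 11: 6
```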
시간대(TIME)
Categorical
Distinct | 6 |
---|---|
Distinct (%) | 1.2% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 4.0 KiB |
02:점심 | 156 |
---|---|
04:저녁 | 133 |
03:오후 | 88 |
01:오전 | 80 |
05:늦은저녁 | 33 |
Length
Max length | 7 |
---|---|
Median length | 5 |
Mean length | 5.132 |
Min length | 5 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 04:저녁 |
---|---|
2nd row | 02:점심 |
3rd row | 02:점심 |
4th row | 03:오후 |
5th row | 01:오전 |
Common Values
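Value | Count | Frequency (%) |
---|---|---|
02:점심 (lunch) | 156 | 31.2% |
04:저녁 (evening) | 133 | 26.6% |
03:오후 (afternoon) | 88 | 17.6% |
01:오전 (morning) | 80 | 16.0% |
05:늦은저녁 (late evening) | 33 | 6.6% |
06:심야 (late night) | 10 | 2.0% |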
매출금액합(SUM_MONEY)
Real number (ℝ)
Distinct | 207 |
---|---|
Distinct (%) | 41.4% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 20308.29 |
Minimum | 670 |
---|---|
Maximum | 635200 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 4.5 KiB |
Quantile statistics
Minimum | 670 |
---|---|
5-th percentile | 1998 |
Q1 | 4500 |
median | 7200 |
Q3 | 18000 |
95-th percentile | 64010 |
Maximum | 635200 |
Range | 634530 |
Interquartile range (IQR) | 13500 |
Descriptive statistics
Standard deviation | 50648.386 |
---|---|
Coefficient of variation (CV) | 2.4939759 |
Kurtosis | 76.434631 |
Mean | 20308.29 |
Median Absolute Deviation (MAD) | 4300 |
Skewness | 7.9702541 |
Sum | 10154145 |
Variance | 2.565259 × 10⁹ |
Monotonicity | Not monotonic |
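Several of the descriptive statistics for 매출금액합(SUM_MONEY) are simple functions of one another, which gives a quick consistency check on the report. The sketch below recomputes the mean, coefficient of variation, variance, and interquartile range from the other reported figures; the variable names are illustrative only.

```python
n = 500                # number of observations
total = 10154145       # Sum
std = 50648.386        # Standard deviation
q1, q3 = 4500, 18000   # Q1 and Q3

mean = total / n       # mean = sum / n
cv = std / mean        # coefficient of variation = std / mean
variance = std ** 2    # variance = std squared
iqr = q3 - q1          # interquartile range = Q3 - Q1

print(mean)               # prints 20308.29
print(round(cv, 4))       # prints 2.494
print(f"{variance:.4e}")  # prints 2.5653e+09
print(iqr)                # prints 13500
```

A coefficient of variation near 2.5, together with the large skewness and kurtosis, points to a strongly right-skewed distribution dominated by a few large transactions.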
Value | Count | Frequency (%) |
---|---|---|
4500 | 18 | 3.6% |
6000 | 18 | 3.6% |
7000 | 15 | 3.0% |
2500 | 15 | 3.0% |
8000 | 12 | 2.4% |
3000 | 11 | 2.2% |
9000 | 11 | 2.2% |
5500 | 11 | 2.2% |
15000 | 11 | 2.2% |
4000 | 10 | 2.0% |
Other values (197) | 368 | 73.6% |
Value | Count | Frequency (%) |
---|---|---|
670 | 1 | 0.2% |
800 | 2 | 0.4% |
900 | 1 | 0.2% |
1000 | 5 | 1.0% |
1100 | 1 | 0.2% |
1200 | 3 | 0.6% |
1350 | 1 | 0.2% |
1400 | 3 | 0.6% |
1500 | 3 | 0.6% |
1700 | 1 | 0.2% |
Value | Count | Frequency (%) |
---|---|---|
635200 | 1 | 0.2% |
500000 | 1 | 0.2% |
435600 | 1 | 0.2% |
412000 | 1 | 0.2% |
298000 | 1 | 0.2% |
196000 | 1 | 0.2% |
191000 | 1 | 0.2% |
150000 | 1 | 0.2% |
119900 | 1 | 0.2% |
116000 | 1 | 0.2% |
매출건수합(SUM_COUNT)
Categorical
IMBALANCE
Distinct | 4 |
---|---|
Distinct (%) | 0.8% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 4.0 KiB |
1 | 458 |
---|---|
2 | 36 |
3 | 5 |
4 | 1 |
Length
Max length | 1 |
---|---|
Median length | 1 |
Mean length | 1 |
Min length | 1 |
Unique
Unique | 1 |
---|---|
Unique (%) | 0.2% |
Sample
1st row | 1 |
---|---|
2nd row | 1 |
3rd row | 1 |
4th row | 1 |
5th row | 1 |
Common Values
Value | Count | Frequency (%) |
---|---|---|
1 | 458 | 91.6% |
2 | 36 | 7.2% |
3 | 5 | 1.0% |
4 | 1 | 0.2% |
Variable | 년월일(DATE) | 업종코드(C_CD) | 성별코드(GEN_CD) | 연령대별코드(AGE_CD) | 유입지코드(INFLOW_CD) | 시간대(TIME) | 매출금액합(SUM_MONEY) | 매출건수합(SUM_COUNT) |
---|---|---|---|---|---|---|---|---|
년월일(DATE) | 1.000 | 0.123 | 0.000 | 0.000 | 0.000 | 0.050 | 0.000 | 0.082 |
업종코드(C_CD) | 0.123 | 1.000 | 0.113 | 0.243 | 0.125 | 0.000 | 0.499 | 0.123 |
성별코드(GEN_CD) | 0.000 | 0.113 | 1.000 | 0.000 | 0.000 | 0.076 | 0.000 | 0.000 |
연령대별코드(AGE_CD) | 0.000 | 0.243 | 0.000 | 1.000 | 0.084 | 0.000 | 0.152 | 0.000 |
유입지코드(INFLOW_CD) | 0.000 | 0.125 | 0.000 | 0.084 | 1.000 | 0.122 | 0.009 | 0.000 |
시간대(TIME) | 0.050 | 0.000 | 0.076 | 0.000 | 0.122 | 1.000 | 0.000 | 0.000 |
매출금액합(SUM_MONEY) | 0.000 | 0.499 | 0.000 | 0.152 | 0.009 | 0.000 | 1.000 | 0.000 |
매출건수합(SUM_COUNT) | 0.082 | 0.123 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 |
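The report does not state which association measure this matrix uses, so the sketch below makes no assumption about it. It only ranks the off-diagonal entries to show which variable pairs stand out. The short names are the English codes from the column headers, and `ranked_pairs` is a hypothetical helper.

```python
names = ["DATE", "C_CD", "GEN_CD", "AGE_CD", "INFLOW_CD", "TIME", "SUM_MONEY", "SUM_COUNT"]

# Values copied row by row from the matrix above.
matrix = [
    [1.000, 0.123, 0.000, 0.000, 0.000, 0.050, 0.000, 0.082],
    [0.123, 1.000, 0.113, 0.243, 0.125, 0.000, 0.499, 0.123],
    [0.000, 0.113, 1.000, 0.000, 0.000, 0.076, 0.000, 0.000],
    [0.000, 0.243, 0.000, 1.000, 0.084, 0.000, 0.152, 0.000],
    [0.000, 0.125, 0.000, 0.084, 1.000, 0.122, 0.009, 0.000],
    [0.050, 0.000, 0.076, 0.000, 0.122, 1.000, 0.000, 0.000],
    [0.000, 0.499, 0.000, 0.152, 0.009, 0.000, 1.000, 0.000],
    [0.082, 0.123, 0.000, 0.000, 0.000, 0.000, 0.000, 1.000],
]

def ranked_pairs(names, matrix):
    """Return (|value|, name_i, name_j) for each pair i < j, strongest first."""
    size = len(names)
    pairs = [
        (abs(matrix[i][j]), names[i], names[j])
        for i in range(size)
        for j in range(i + 1, size)
    ]
    return sorted(pairs, reverse=True)

top3 = ranked_pairs(names, matrix)[:3]
for value, left, right in top3:
    print(f"{left} ~ {right}: {value:.3f}")
# prints:
# C_CD ~ SUM_MONEY: 0.499
# C_CD ~ AGE_CD: 0.243
# AGE_CD ~ SUM_MONEY: 0.152
```

The only pair that approaches a moderate association is business category (C_CD) with sales amount (SUM_MONEY); every other pair is below 0.25.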
Variable | 연령대별코드(AGE_CD) | 매출건수합(SUM_COUNT) | 업종코드(C_CD) | 성별코드(GEN_CD) | 시간대(TIME) |
---|---|---|---|---|---|
연령대별코드(AGE_CD) | 1.000 | 0.000 | 0.089 | 0.000 | 0.000 |
매출건수합(SUM_COUNT) | 0.000 | 1.000 | 0.056 | 0.000 | 0.000 |
업종코드(C_CD) | 0.089 | 0.056 | 1.000 | 0.087 | 0.000 |
성별코드(GEN_CD) | 0.000 | 0.000 | 0.087 | 1.000 | 0.054 |
시간대(TIME) | 0.000 | 0.000 | 0.000 | 0.054 | 1.000 |
Variable | 년월일(DATE) | 유입지코드(INFLOW_CD) | 매출금액합(SUM_MONEY) | 업종코드(C_CD) | 성별코드(GEN_CD) | 연령대별코드(AGE_CD) | 시간대(TIME) | 매출건수합(SUM_COUNT) |
---|---|---|---|---|---|---|---|---|
년월일(DATE) | 1.000 | -0.011 | 0.019 | 0.077 | 0.000 | 0.000 | 0.052 | 0.018 |
유입지코드(INFLOW_CD) | -0.011 | 1.000 | -0.003 | 0.104 | 0.000 | 0.084 | 0.086 | 0.000 |
매출금액합(SUM_MONEY) | 0.019 | -0.003 | 1.000 | 0.209 | 0.000 | 0.074 | 0.000 | 0.000 |
업종코드(C_CD) | 0.077 | 0.104 | 0.209 | 1.000 | 0.087 | 0.089 | 0.000 | 0.056 |
성별코드(GEN_CD) | 0.000 | 0.000 | 0.000 | 0.087 | 1.000 | 0.000 | 0.054 | 0.000 |
연령대별코드(AGE_CD) | 0.000 | 0.084 | 0.074 | 0.089 | 0.000 | 1.000 | 0.000 | 0.000 |
시간대(TIME) | 0.052 | 0.086 | 0.000 | 0.000 | 0.054 | 0.000 | 1.000 | 0.000 |
매출건수합(SUM_COUNT) | 0.018 | 0.000 | 0.000 | 0.056 | 0.000 | 0.000 | 0.000 | 1.000 |
Index | 섹터코드(SECTOR_CD) | 년월일(DATE) | 업종코드(C_CD) | 성별코드(GEN_CD) | 연령대별코드(AGE_CD) | 유입지코드(INFLOW_CD) | 시간대(TIME) | 매출금액합(SUM_MONEY) | 매출건수합(SUM_COUNT) |
---|---|---|---|---|---|---|---|---|---|
0 | 1 | 20151118 | 한식음식점 | M | 20대후반 | 11215103 | 04:저녁 | 9800 | 1 |
1 | 1 | 20151118 | 제과점 | F | 20대초반이하 | 11350105 | 02:점심 | 7000 | 1 |
2 | 1 | 20151110 | 분식집 | F | 30대후반 | 11170101 | 02:점심 | 31000 | 1 |
3 | 1 | 20151121 | 중국집 | M | 30대초반 | 11290133 | 03:오후 | 62000 | 1 |
4 | 1 | 20151116 | 한식음식점 | M | 20대초반이하 | 11170119 | 01:오전 | 4000 | 1 |
5 | 1 | 20151024 | 편의점 | M | 20대초반이하 | 41115 | 03:오후 | 8000 | 1 |
6 | 1 | 20151117 | 편의점 | F | 20대초반이하 | 41290 | 01:오전 | 6500 | 1 |
7 | 1 | 20151119 | 커피음료 | F | 20대후반 | 41463 | 02:점심 | 28900 | 1 |
8 | 1 | 20151121 | 슈퍼마켓 | F | 20대초반이하 | 11230105 | 04:저녁 | 2000 | 1 |
9 | 1 | 20151122 | 편의점 | F | 30대후반 | 41281 | 03:오후 | 10350 | 1 |
Index | 섹터코드(SECTOR_CD) | 년월일(DATE) | 업종코드(C_CD) | 성별코드(GEN_CD) | 연령대별코드(AGE_CD) | 유입지코드(INFLOW_CD) | 시간대(TIME) | 매출금액합(SUM_MONEY) | 매출건수합(SUM_COUNT) |
---|---|---|---|---|---|---|---|---|---|
490 | 1 | 20151115 | 주차장 | M | 20대후반 | 11170101 | 02:점심 | 63000 | 1 |
491 | 1 | 20151102 | 한식음식점 | M | 40대초반 | 11710103 | 03:오후 | 112000 | 1 |
492 | 1 | 20151112 | 한식음식점 | F | 40대후반 | 11500103 | 01:오전 | 7400 | 1 |
493 | 1 | 20151105 | 커피음료 | F | 30대후반 | 11170130 | 04:저녁 | 5900 | 1 |
494 | 1 | 20151022 | 분식집 | F | 60대이상 | 41310 | 03:오후 | 4800 | 2 |
495 | 1 | 20151123 | 편의점 | M | 40대후반 | 28260 | 05:늦은저녁 | 4900 | 1 |
496 | 1 | 20151027 | 한식음식점 | F | 30대후반 | 44200 | 04:저녁 | 20000 | 2 |
497 | 1 | 20151029 | 커피음료 | M | 40대초반 | 46110 | 04:저녁 | 89000 | 1 |
498 | 1 | 20151111 | 편의점 | F | 20대후반 | 28170 | 04:저녁 | 3600 | 1 |
499 | 1 | 20151121 | 슈퍼마켓 | F | 20대초반이하 | 28260 | 03:오후 | 8300 | 1 |