Overview

Dataset statistics

Number of variables: 9
Number of observations: 500
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 37.7 KiB
Average record size in memory: 77.3 B

Variable types

Categorical: 6
Numeric: 3

Dataset

Description: Sample data
Author: Shinhan Card (신한카드)
URL: https://bigdata.seoul.go.kr/data/selectSampleData.do?sample_data_seq=4

Alerts

섹터코드(SECTOR_CD) has constant value "1" (Constant)
매출건수합(SUM_COUNT) is highly imbalanced (73.4%) (Imbalance)
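
The 73.4% in the imbalance alert can be reproduced from the value counts that the report lists for 매출건수합(SUM_COUNT): 443, 46, 8, 2 and 1. The sketch below assumes the score is one minus the Shannon entropy of the class distribution divided by its maximum, log2 of the number of classes. The function name `imbalance_score` is hypothetical, and the formula is an assumption that happens to match the reported figure, not something taken from the ydata-profiling source.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max for a list of class counts.

    0 means perfectly balanced; values near 1 mean one class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class is scored as 0 in this sketch
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Value counts of 매출건수합(SUM_COUNT) as listed in this report.
sum_count_counts = [443, 46, 8, 2, 1]
score = imbalance_score(sum_count_counts)
print(f"{score:.1%}")  # prints 73.4%
```

Under this definition a 50/50 split scores 0, so the alert reflects that 88.6% of the rows carry the value 1.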

Reproduction

Analysis started: 2023-12-10 14:53:30.544661
Analysis finished: 2023-12-10 14:53:32.808612
Duration: 2.26 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

섹터코드(SECTOR_CD) [sector code]
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 500 | 100.0%

[Figure: Histogram of lengths of the category]
[Figure: Common Values (Plot)]

년월일(DATE) [date, YYYYMMDD]
Real number (ℝ)

Distinct: 47
Distinct (%): 9.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20151109
Minimum: 20151022
Maximum: 20151207
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 20151022
5-th percentile: 20151023
Q1: 20151101
Median: 20151112
Q3: 20151125
95-th percentile: 20151205
Maximum: 20151207
Range: 185
Interquartile range (IQR): 24

Descriptive statistics

Standard deviation: 54.719055
Coefficient of variation (CV): 2.7154364 × 10^-6
Kurtosis: -0.40562706
Mean: 20151109
Median Absolute Deviation (MAD): 11.5
Skewness: 0.052677185
Sum: 1.0075554 × 10^10
Variance: 2994.175
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=47)]

Common values

Value | Count | Frequency (%)
20151103 | 20 | 4.0%
20151022 | 17 | 3.4%
20151201 | 17 | 3.4%
20151027 | 17 | 3.4%
20151119 | 15 | 3.0%
20151029 | 15 | 3.0%
20151023 | 15 | 3.0%
20151106 | 15 | 3.0%
20151127 | 15 | 3.0%
20151101 | 14 | 2.8%
Other values (37) | 340 | 68.0%

Minimum 10 values

Value | Count | Frequency (%)
20151022 | 17 | 3.4%
20151023 | 15 | 3.0%
20151024 | 7 | 1.4%
20151025 | 4 | 0.8%
20151026 | 9 | 1.8%
20151027 | 17 | 3.4%
20151028 | 11 | 2.2%
20151029 | 15 | 3.0%
20151030 | 12 | 2.4%
20151031 | 5 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
20151207 | 11 | 2.2%
20151206 | 8 | 1.6%
20151205 | 13 | 2.6%
20151204 | 11 | 2.2%
20151203 | 8 | 1.6%
20151202 | 9 | 1.8%
20151201 | 17 | 3.4%
20151130 | 6 | 1.2%
20151129 | 6 | 1.2%
20151128 | 8 | 1.6%
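
년월일(DATE) is profiled as a real number, so figures such as the range (185) and the mean are computed on the raw integers rather than on calendar dates. The sketch below assumes the values are dates in YYYYMMDD form, which their layout suggests but the report does not state. It parses the reported minimum and maximum and compares the integer range with the calendar span. The helper name `parse_yyyymmdd` is hypothetical.

```python
from datetime import datetime

def parse_yyyymmdd(value):
    """Convert an integer such as 20151022 into a datetime.date."""
    return datetime.strptime(str(value), "%Y%m%d").date()

first = parse_yyyymmdd(20151022)  # reported Minimum
last = parse_yyyymmdd(20151207)   # reported Maximum

numeric_range = 20151207 - 20151022        # what the report calls Range
calendar_span_days = (last - first).days   # days between the two dates
days_covered = calendar_span_days + 1      # calendar days, both ends included

print(numeric_range, calendar_span_days, days_covered)  # prints 185 46 47
```

The 47 calendar days in the span equal the 47 distinct values reported, which suggests that every day between the two dates appears at least once.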

업종코드(C_CD) [business type code]
Categorical

Distinct: 35
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 7
Median length: 6
Mean length: 4.028
Min length: 2

Unique

Unique: 9
Unique (%): 1.8%

Sample

1st row: 한식음식점
2nd row: 커피음료
3rd row: 커피음료
4th row: 편의점
5th row: 커피음료

Common Values

Value | Count | Frequency (%)
한식음식점 (Korean restaurant) | 91 | 18.2%
편의점 (convenience store) | 84 | 16.8%
커피음료 (coffee and beverages) | 63 | 12.6%
외식업기타 (other food service) | 42 | 8.4%
슈퍼마켓 (supermarket) | 29 | 5.8%
호프간이주점 (pub or casual bar) | 22 | 4.4%
제과점 (bakery) | 20 | 4.0%
패스트푸드점 (fast food restaurant) | 18 | 3.6%
약국 (pharmacy) | 18 | 3.6%
일반의원 (general clinic) | 16 | 3.2%
Other values (25) | 97 | 19.4%

[Figure: Histogram of lengths of the category]

성별코드(GEN_CD) [gender code]
Categorical

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: F
2nd row: F
3rd row: M
4th row: M
5th row: F

Common Values

Value | Count | Frequency (%)
M | 285 | 57.0%
F | 215 | 43.0%

[Figure: Histogram of lengths of the category]
[Figure: Common Values (Plot)]

연령대별코드(AGE_CD) [age group code]
Categorical

Distinct: 9
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 7
Median length: 5
Mean length: 5.392
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20대초반이하
2nd row: 50대초반
3rd row: 20대초반이하
4th row: 40대후반
5th row: 20대후반

Common Values

Value | Count | Frequency (%)
20대후반 (late 20s) | 110 | 22.0%
20대초반이하 (early 20s and under) | 98 | 19.6%
30대초반 (early 30s) | 77 | 15.4%
30대후반 (late 30s) | 57 | 11.4%
40대후반 (late 40s) | 50 | 10.0%
40대초반 (early 40s) | 45 | 9.0%
50대초반 (early 50s) | 32 | 6.4%
60대이상 (60 and over) | 17 | 3.4%
50대후반 (late 50s) | 14 | 2.8%

[Figure: Histogram of lengths of the category]
[Figure: Common Values (Plot)]

유입지코드(INFLOW_CD) [inflow area code]
Real number (ℝ)

Distinct: 248
Distinct (%): 49.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7368269.8
Minimum: 26260
Maximum: 11740110
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 26260
5-th percentile: 28237
Q1: 41410
Median: 11230106
Q3: 11500102
95-th percentile: 11710108
Maximum: 11740110
Range: 11713850
Interquartile range (IQR): 11458692

Descriptive statistics

Standard deviation: 5457130.7
Coefficient of variation (CV): 0.7406258
Kurtosis: -1.6420943
Mean: 7368269.8
Median Absolute Deviation (MAD): 389996.5
Skewness: -0.60058829
Sum: 3.6841349 × 10^9
Variance: 2.9780276 × 10^13
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
41135 | 8 | 1.6%
11500103 | 8 | 1.6%
41173 | 8 | 1.6%
41287 | 6 | 1.2%
28200 | 6 | 1.2%
41199 | 6 | 1.2%
41210 | 6 | 1.2%
41285 | 6 | 1.2%
11710111 | 6 | 1.2%
11710101 | 5 | 1.0%
Other values (238) | 435 | 87.0%

Minimum 10 values

Value | Count | Frequency (%)
26260 | 1 | 0.2%
26320 | 2 | 0.4%
26350 | 2 | 0.4%
26380 | 1 | 0.2%
26410 | 1 | 0.2%
27290 | 1 | 0.2%
28140 | 1 | 0.2%
28170 | 3 | 0.6%
28185 | 3 | 0.6%
28200 | 6 | 1.2%

Maximum 10 values

Value | Count | Frequency (%)
11740110 | 2 | 0.4%
11740109 | 1 | 0.2%
11740108 | 2 | 0.4%
11740107 | 1 | 0.2%
11740106 | 2 | 0.4%
11740105 | 2 | 0.4%
11740103 | 1 | 0.2%
11740101 | 3 | 0.6%
11710113 | 1 | 0.2%
11710112 | 4 | 0.8%
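
The 유입지코드(INFLOW_CD) values listed above come in two shapes: 5-digit codes such as 26260 and 41135, and 8-digit codes such as 11500103 and 11740110. Numeric summaries like the mean or the interquartile range therefore mix two scales and say little about the codes themselves. The sketch below groups a few of the listed values by digit count. Reading the two lengths as two kinds of area code is an assumption, and the helper name `group_by_digit_length` is hypothetical.

```python
from collections import defaultdict

# (value, count) pairs copied from the Common values table of INFLOW_CD.
inflow_counts = {
    41135: 8, 11500103: 8, 41173: 8, 41287: 6, 28200: 6,
    41199: 6, 41210: 6, 41285: 6, 11710111: 6, 11710101: 5,
}

def group_by_digit_length(value_counts):
    """Sum the counts of integer codes per number of digits."""
    totals = defaultdict(int)
    for code, count in value_counts.items():
        totals[len(str(code))] += count
    return dict(totals)

by_length = group_by_digit_length(inflow_counts)
print(by_length)  # prints {5: 46, 8: 19}
```

Treating the column as a string or categorical variable before profiling would avoid the misleading numeric statistics.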

시간대(TIME) [time slot]
Categorical

Distinct: 6
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 7
Median length: 5
Mean length: 5.276
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 01:오전
2nd row: 04:저녁
3rd row: 01:오전
4th row: 01:오전
5th row: 02:점심

Common Values

Value | Count | Frequency (%)
04:저녁 (evening) | 135 | 27.0%
02:점심 (lunch) | 126 | 25.2%
01:오전 (morning) | 79 | 15.8%
03:오후 (afternoon) | 75 | 15.0%
05:늦은저녁 (late evening) | 69 | 13.8%
06:심야 (late night) | 16 | 3.2%

[Figure: Histogram of lengths of the category]
[Figure: Common Values (Plot)]
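
The length statistics reported for 시간대(TIME) follow directly from its value counts: five of the six labels are 5 characters long and 05:늦은저녁 is 7. The sketch below recomputes them with the standard library. It assumes a length is a count of Unicode code points after NFC normalization, and the helper name `length_stats` is hypothetical.

```python
import statistics
import unicodedata

# Value counts for 시간대(TIME) as listed under Common Values.
time_counts = {
    "04:저녁": 135, "02:점심": 126, "01:오전": 79,
    "03:오후": 75, "05:늦은저녁": 69, "06:심야": 16,
}

def length_stats(value_counts):
    """Return (max, median, mean, min) of string lengths, weighted by count."""
    lengths = []
    for value, count in value_counts.items():
        # Normalize to NFC so each Hangul syllable counts as one character.
        n = len(unicodedata.normalize("NFC", value))
        lengths.extend([n] * count)
    return (max(lengths), statistics.median(lengths),
            statistics.mean(lengths), min(lengths))

max_len, median_len, mean_len, min_len = length_stats(time_counts)
print(max_len, median_len, mean_len, min_len)
```

The mean works out to (431 × 5 + 69 × 7) / 500 = 5.276, matching the reported value.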

매출금액합(SUM_MONEY) [total sales amount]
Real number (ℝ)

Distinct: 234
Distinct (%): 46.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 21437.884
Minimum: 840
Maximum: 1042000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 840
5-th percentile: 1547.5
Q1: 4400
Median: 7850
Q3: 20000
95-th percentile: 65171
Maximum: 1042000
Range: 1041160
Interquartile range (IQR): 15600

Descriptive statistics

Standard deviation: 60601.339
Coefficient of variation (CV): 2.826834
Kurtosis: 172.73607
Mean: 21437.884
Median Absolute Deviation (MAD): 4850
Skewness: 11.540924
Sum: 10718942
Variance: 3.6725223 × 10^9
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
4500 | 19 | 3.8%
6000 | 14 | 2.8%
8000 | 12 | 2.4%
5000 | 11 | 2.2%
3500 | 9 | 1.8%
2000 | 9 | 1.8%
20000 | 9 | 1.8%
9000 | 9 | 1.8%
6500 | 9 | 1.8%
7000 | 9 | 1.8%
Other values (224) | 390 | 78.0%

Minimum 10 values

Value | Count | Frequency (%)
840 | 1 | 0.2%
900 | 5 | 1.0%
1000 | 2 | 0.4%
1020 | 1 | 0.2%
1100 | 1 | 0.2%
1200 | 2 | 0.4%
1250 | 1 | 0.2%
1280 | 1 | 0.2%
1300 | 3 | 0.6%
1350 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
1042000 | 1 | 0.2%
495000 | 1 | 0.2%
426700 | 1 | 0.2%
208800 | 1 | 0.2%
201580 | 1 | 0.2%
198100 | 1 | 0.2%
197800 | 1 | 0.2%
194500 | 1 | 0.2%
182000 | 1 | 0.2%
140000 | 1 | 0.2%
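
Several of the 매출금액합(SUM_MONEY) statistics are derived from others, so they can be cross-checked by hand: range is maximum minus minimum, IQR is Q3 minus Q1, the coefficient of variation is standard deviation over mean, variance is the squared standard deviation, and sum is mean times the number of observations. The sketch below runs those checks on the figures copied from this report. The tolerance and the variable names are illustrative choices.

```python
import math

# Figures copied from the 매출금액합(SUM_MONEY) statistics in this report.
n = 500
mean, std = 21437.884, 60601.339
minimum, maximum = 840, 1042000
q1, q3 = 4400, 20000

derived = {
    "range": maximum - minimum,
    "iqr": q3 - q1,
    "cv": std / mean,
    "variance": std ** 2,
    "sum": mean * n,
}

reported = {
    "range": 1041160, "iqr": 15600, "cv": 2.826834,
    "variance": 3.6725223e9, "sum": 10718942,
}

for name, value in derived.items():
    # Reported figures are rounded, so compare with a relative tolerance.
    ok = math.isclose(value, reported[name], rel_tol=1e-6)
    print(f"{name}: derived={value:.8g} reported={reported[name]:.8g} match={ok}")
```

All five derived values agree with the reported ones within rounding. The large gap between the mean (21437.884) and the median (7850) is consistent with the reported skewness of 11.54.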

매출건수합(SUM_COUNT) [total sales count]
Categorical

IMBALANCE

Distinct: 5
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: 1
2nd row: 1
3rd row: 2
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 443 | 88.6%
2 | 46 | 9.2%
3 | 8 | 1.6%
5 | 2 | 0.4%
4 | 1 | 0.2%

[Figure: Histogram of lengths of the category]
[Figure: Common Values (Plot)]

Interactions

[Figures: 9 interaction plots, images not reproduced]

Correlations

[Figure: correlation heatmap 1]

Variable | 년월일(DATE) | 업종코드(C_CD) | 성별코드(GEN_CD) | 연령대별코드(AGE_CD) | 유입지코드(INFLOW_CD) | 시간대(TIME) | 매출금액합(SUM_MONEY) | 매출건수합(SUM_COUNT)
년월일(DATE) | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.117 | 0.000 | 0.103
업종코드(C_CD) | 0.000 | 1.000 | 0.063 | 0.000 | 0.157 | 0.000 | 0.115 | 0.277
성별코드(GEN_CD) | 0.000 | 0.063 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.060
연령대별코드(AGE_CD) | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.130 | 0.162
유입지코드(INFLOW_CD) | 0.000 | 0.157 | 0.000 | 0.000 | 1.000 | 0.201 | 0.000 | 0.007
시간대(TIME) | 0.117 | 0.000 | 0.000 | 0.000 | 0.201 | 1.000 | 0.000 | 0.000
매출금액합(SUM_MONEY) | 0.000 | 0.115 | 0.000 | 0.130 | 0.000 | 0.000 | 1.000 | 0.000
매출건수합(SUM_COUNT) | 0.103 | 0.277 | 0.060 | 0.162 | 0.007 | 0.000 | 0.000 | 1.000

[Figure: correlation heatmap 2]

Variable | 연령대별코드(AGE_CD) | 매출건수합(SUM_COUNT) | 업종코드(C_CD) | 성별코드(GEN_CD) | 시간대(TIME)
연령대별코드(AGE_CD) | 1.000 | 0.093 | 0.000 | 0.000 | 0.000
매출건수합(SUM_COUNT) | 0.093 | 1.000 | 0.119 | 0.073 | 0.000
업종코드(C_CD) | 0.000 | 0.119 | 1.000 | 0.050 | 0.000
성별코드(GEN_CD) | 0.000 | 0.073 | 0.050 | 1.000 | 0.000
시간대(TIME) | 0.000 | 0.000 | 0.000 | 0.000 | 1.000

[Figure: correlation heatmap 3]

Variable | 년월일(DATE) | 유입지코드(INFLOW_CD) | 매출금액합(SUM_MONEY) | 업종코드(C_CD) | 성별코드(GEN_CD) | 연령대별코드(AGE_CD) | 시간대(TIME) | 매출건수합(SUM_COUNT)
년월일(DATE) | 1.000 | 0.034 | -0.055 | 0.000 | 0.000 | 0.000 | 0.080 | 0.087
유입지코드(INFLOW_CD) | 0.034 | 1.000 | -0.086 | 0.125 | 0.000 | 0.000 | 0.141 | 0.000
매출금액합(SUM_MONEY) | -0.055 | -0.086 | 1.000 | 0.052 | 0.000 | 0.083 | 0.000 | 0.000
업종코드(C_CD) | 0.000 | 0.125 | 0.052 | 1.000 | 0.050 | 0.000 | 0.000 | 0.119
성별코드(GEN_CD) | 0.000 | 0.000 | 0.000 | 0.050 | 1.000 | 0.000 | 0.000 | 0.073
연령대별코드(AGE_CD) | 0.000 | 0.000 | 0.083 | 0.000 | 0.000 | 1.000 | 0.000 | 0.093
시간대(TIME) | 0.080 | 0.141 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000
매출건수합(SUM_COUNT) | 0.087 | 0.000 | 0.000 | 0.119 | 0.073 | 0.093 | 0.000 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
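
The per-column nullity counts behind these plots can be approximated with a plain pass over the rows. The sketch below is illustrative only: it treats empty strings as missing, which is an assumption about how the source file encodes gaps. The helper name `missing_per_column` is hypothetical, and the three-column input is a toy example with one cell blanked on purpose, since the profiled dataset itself reports zero missing cells.

```python
import csv
import io

# Toy input: two rows in a reduced column layout. The second row has an
# empty SUM_MONEY cell on purpose so that one count is non-zero.
raw = io.StringIO(
    "SECTOR_CD,DATE,SUM_MONEY\n"
    "1,20151118,3400\n"
    "1,20151023,\n"
)

def missing_per_column(handle):
    """Count empty cells per column in a CSV file-like object."""
    reader = csv.DictReader(handle)
    missing = {name: 0 for name in reader.fieldnames}
    for row in reader:
        for name in reader.fieldnames:
            cell = row[name]
            if cell is None or cell.strip() == "":
                missing[name] += 1
    return missing

print(missing_per_column(raw))  # prints {'SECTOR_CD': 0, 'DATE': 0, 'SUM_MONEY': 1}
```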

Sample

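First rows

Row | 섹터코드(SECTOR_CD) | 년월일(DATE) | 업종코드(C_CD) | 성별코드(GEN_CD) | 연령대별코드(AGE_CD) | 유입지코드(INFLOW_CD) | 시간대(TIME) | 매출금액합(SUM_MONEY) | 매출건수합(SUM_COUNT)
0 | 1 | 20151118 | 한식음식점 | F | 20대초반이하 | 11710111 | 01:오전 | 3400 | 1
1 | 1 | 20151023 | 커피음료 | F | 50대초반 | 41131 | 04:저녁 | 11300 | 1
2 | 1 | 20151128 | 커피음료 | M | 20대초반이하 | 41610 | 01:오전 | 10800 | 2
3 | 1 | 20151201 | 편의점 | M | 40대후반 | 11680105 | 01:오전 | 21000 | 1
4 | 1 | 20151027 | 커피음료 | F | 20대후반 | 45140 | 02:점심 | 8000 | 1
5 | 1 | 20151118 | 의류점 | F | 20대초반이하 | 26350 | 05:늦은저녁 | 5000 | 1
6 | 1 | 20151104 | 한식음식점 | F | 40대후반 | 28185 | 03:오후 | 20000 | 1
7 | 1 | 20151104 | 약국 | M | 20대초반이하 | 28237 | 04:저녁 | 9000 | 1
8 | 1 | 20151101 | 편의점 | F | 20대초반이하 | 11350104 | 01:오전 | 20000 | 1
9 | 1 | 20151110 | 문구점 | M | 40대후반 | 29155 | 05:늦은저녁 | 7700 | 1

Last rows

Row | 섹터코드(SECTOR_CD) | 년월일(DATE) | 업종코드(C_CD) | 성별코드(GEN_CD) | 연령대별코드(AGE_CD) | 유입지코드(INFLOW_CD) | 시간대(TIME) | 매출금액합(SUM_MONEY) | 매출건수합(SUM_COUNT)
490 | 1 | 20151106 | 한식음식점 | F | 20대초반이하 | 28170 | 02:점심 | 40500 | 1
491 | 1 | 20151111 | 분식집 | M | 30대후반 | 11380109 | 02:점심 | 16000 | 1
492 | 1 | 20151028 | 슈퍼마켓 | M | 30대초반 | 11140110 | 04:저녁 | 25000 | 1
493 | 1 | 20151120 | 당구장 | F | 30대초반 | 11710101 | 02:점심 | 5500 | 2
494 | 1 | 20151116 | 패션잡화 | M | 50대초반 | 11170130 | 06:심야 | 4500 | 2
495 | 1 | 20151130 | 의류점 | F | 30대초반 | 28140 | 04:저녁 | 22600 | 2
496 | 1 | 20151025 | 호프간이주점 | F | 50대후반 | 11215105 | 01:오전 | 426700 | 2
497 | 1 | 20151112 | 커피음료 | F | 30대후반 | 11560133 | 04:저녁 | 4900 | 1
498 | 1 | 20151116 | 한식음식점 | F | 30대초반 | 11230110 | 04:저녁 | 5500 | 2
499 | 1 | 20151029 | 약국 | F | 40대후반 | 11440127 | 02:점심 | 17000 | 1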