Overview

Dataset statistics

Number of variables: 6
Number of observations: 500
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 26.0 KiB
Average record size in memory: 53.3 B
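
The two memory figures are related by a simple division: the average record size is the total size divided by the number of observations. The sketch below checks that relation using only the rounded figures printed above. The exact byte count is not shown in the report, so agreement is expected only to within display rounding.

```python
# Figures copied from the report; both memory values are rounded for display.
total_kib = 26.0          # "Total size in memory"
n_rows = 500              # "Number of observations"
avg_record_bytes = 53.3   # "Average record size in memory"

# 1 KiB = 1024 bytes.
estimated_avg = total_kib * 1024 / n_rows
print(f"estimated average record size: {estimated_avg:.1f} B")

# The estimate should match the reported value to within rounding error.
difference = abs(estimated_avg - avg_record_bytes)
print(f"difference from reported value: {difference:.2f} B")
```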

Variable types

Text: 1
Categorical: 1
Numeric: 4

Dataset

Description: Sample data
Author: Seoul Metropolitan Government, Shinhan Card, KCB (Korea Credit Bureau)
URL: https://bigdata.seoul.go.kr/data/selectSampleData.do?sample_data_seq=321

Reproduction

Analysis started: 2023-12-10 15:02:13.136449
Analysis finished: 2023-12-10 15:02:16.735168
Duration: 3.6 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
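
The reported duration follows directly from the two timestamps. A minimal sketch with the standard `datetime` module, using the values exactly as printed above:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2023-12-10 15:02:13.136449")
finished = datetime.fromisoformat("2023-12-10 15:02:16.735168")

# Elapsed wall-clock time in seconds, then rounded as the report displays it.
duration = (finished - started).total_seconds()
print(f"{duration:.1f} seconds")
```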

Variables

서울시_블록ID(BLK_CD)
Text

Distinct: 313
Distinct (%): 62.6%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 6
Median length: 6
Mean length: 5.724
Min length: 2

Characters and Unicode

Total characters: 2862
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
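
As a rough illustration of how such character properties can be read, the sketch below uses Python's standard `unicodedata` module on the five sample values listed in this report for the masked block ID. Whether the profiler uses this module internally is not stated in the report; this only shows that the digits and the `*` mask fall into the two categories reported here.

```python
import unicodedata
from collections import Counter

# The five sample values shown in this report for the masked block ID.
samples = ["2*2*1*", "2*1*1*", "1*9*0*", "2*4*5*", "4*9*5*"]

# Look up the Unicode general category of every character.
# "Nd" is Decimal Number, "Po" is Other Punctuation.
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)
print(category_counts)
```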

Unique

Unique: 194
Unique (%): 38.8%
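
"Distinct" and "Unique" measure different things: distinct counts the different values present, while unique counts the values that occur exactly once. A minimal sketch of the distinction; `distinct_and_unique` is a hypothetical helper and the toy column is invented for illustration, not taken from the dataset.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique): different values vs. values seen exactly once."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Hypothetical toy column: one value repeats, two values appear once.
toy = ["1*3*1*", "1*3*1*", "2*9*1*", "4*9*5*"]
print(distinct_and_unique(toy))  # (3, 2)
```

Read this way, 313 distinct values with 194 unique ones means that 119 values appear more than once.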

Sample

1st row: 2*2*1*
2nd row: 2*1*1*
3rd row: 1*9*0*
4th row: 2*4*5*
5th row: 4*9*5*

Value               Count  Frequency (%)
1*3*1               8      1.6%
2*9*1               8      1.6%
2*3*9               6      1.2%
2*4*6               5      1.0%
2*7*7               5      1.0%
3*5*9               5      1.0%
2*2*5               5      1.0%
2*3*7               5      1.0%
2*7*3               5      1.0%
3*3*5               5      1.0%
Other values (253)  443    88.6%

Most occurring characters

Value  Count  Frequency (%)
*      1378   48.1%
2      342    11.9%
1      223    7.8%
3      212    7.4%
4      138    4.8%
9      108    3.8%
5      105    3.7%
6      92     3.2%
0      91     3.2%
8      87     3.0%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     1484   51.9%
Other Punctuation  1378   48.1%

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
2      342    23.0%
1      223    15.0%
3      212    14.3%
4      138    9.3%
9      108    7.3%
5      105    7.1%
6      92     6.2%
0      91     6.1%
8      87     5.9%
7      86     5.8%

Other Punctuation

Value  Count  Frequency (%)
*      1378   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  2862   100.0%

Most frequent character per script

Common

Value  Count  Frequency (%)
*      1378   48.1%
2      342    11.9%
1      223    7.8%
3      212    7.4%
4      138    4.8%
9      108    3.8%
5      105    3.7%
6      92     3.2%
0      91     3.2%
8      87     3.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  2862   100.0%

Most frequent character per block

ASCII

Value  Count  Frequency (%)
*      1378   48.1%
2      342    11.9%
1      223    7.8%
3      212    7.4%
4      138    4.8%
9      108    3.8%
5      105    3.7%
6      92     3.2%
0      91     3.2%
8      87     3.0%

성별(GENDER)
Categorical

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 2
4th row: 2
5th row: 1

Common Values

Value  Count  Frequency (%)
2      261    52.2%
1      239    47.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
2      261    52.2%
1      239    47.8%

연령대(AGE)
Real number (ℝ)

Distinct: 7
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.068
Minimum: 1
Maximum: 7
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 4
Q3: 6
95-th percentile: 7
Maximum: 7
Range: 6
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.7915902
Coefficient of variation (CV): 0.44041058
Kurtosis: -1.0558941
Mean: 4.068
Median Absolute Deviation (MAD): 1
Skewness: -0.010112195
Sum: 2034
Variance: 3.2097956
Monotonicity: Not monotonic
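
Several of these figures can be recomputed from the AGE frequency table printed in this report. The sketch below rebuilds the 500 observations from the value counts and derives the sum, mean, sample variance (divisor n - 1), standard deviation, coefficient of variation, and quartiles with Python's standard `statistics` module. The choice of the `inclusive` quantile method is an assumption made here to mirror linear interpolation between order statistics; the report does not state which method was used.

```python
import statistics

# Frequency table for 연령대(AGE) as printed in this report: value -> count.
age_counts = {1: 41, 2: 73, 3: 89, 4: 86, 5: 80, 6: 81, 7: 50}

# Expand the table back into one value per observation, in sorted order.
ages = [value for value, count in sorted(age_counts.items()) for _ in range(count)]

mean = statistics.fmean(ages)
variance = statistics.variance(ages)   # sample variance, divisor n - 1
std_dev = statistics.stdev(ages)
cv = std_dev / mean                    # coefficient of variation
q1, median, q3 = statistics.quantiles(ages, n=4, method="inclusive")

print(len(ages), sum(ages))            # 500 2034
print(round(mean, 3), round(variance, 7), round(std_dev, 7), round(cv, 8))
print(q1, median, q3, q3 - q1)
```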
Histogram with fixed size bins (bins=7)
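
The caption refers to equal-width binning: the range from the minimum to the maximum is cut into a fixed number of bins of identical width. A minimal sketch in plain Python, assuming the maximum value is counted in the last bin (the convention `numpy.histogram` follows). `fixed_bin_counts` is a hypothetical helper name, and the data are rebuilt from the AGE frequency table in this report.

```python
def fixed_bin_counts(values, bins):
    """Count values in `bins` equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case: every value is identical, so use a single bin.
        return [len(values)] + [0] * (bins - 1)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum sits on the right edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts

# Frequency table for 연령대(AGE) as printed in this report: value -> count.
age_counts = {1: 41, 2: 73, 3: 89, 4: 86, 5: 80, 6: 81, 7: 50}
ages = [value for value, count in age_counts.items() for _ in range(count)]

print(fixed_bin_counts(ages, bins=7))  # [41, 73, 89, 86, 80, 81, 50]
```

With seven bins over the range 1 to 7, each age code lands in its own bin, so the histogram bars match the value counts one to one.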

Common values

Value  Count  Frequency (%)
3      89     17.8%
4      86     17.2%
6      81     16.2%
5      80     16.0%
2      73     14.6%
7      50     10.0%
1      41     8.2%

Extreme values (minimum)

Value  Count  Frequency (%)
1      41     8.2%
2      73     14.6%
3      89     17.8%
4      86     17.2%
5      80     16.0%
6      81     16.2%
7      50     10.0%

Extreme values (maximum)

Value  Count  Frequency (%)
7      50     10.0%
6      81     16.2%
5      80     16.0%
4      86     17.2%
3      89     17.8%
2      73     14.6%
1      41     8.2%

홈쇼핑_결재(지출)건수(INDEX05_CNT)
Real number (ℝ)

Distinct: 101
Distinct (%): 20.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.9164
Minimum: 0
Maximum: 1125.5
Zeros: 286
Zeros (%): 57.2%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 14.7
95-th percentile: 143.93
Maximum: 1125.5
Range: 1125.5
Interquartile range (IQR): 14.7

Descriptive statistics

Standard deviation: 92.172013
Coefficient of variation (CV): 3.1875342
Kurtosis: 53.324804
Mean: 28.9164
Median Absolute Deviation (MAD): 0
Skewness: 6.3033117
Sum: 14458.2
Variance: 8495.68
Monotonicity: Not monotonic
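
The median and the MAD are both 0 here even though the mean is about 28.9, because more than half of the observations (286 of 500) are zero. The sketch below illustrates that argument. The positive entries are placeholders set to 1.0, since only their being greater than zero matters; they are not the real data.

```python
import statistics

n_zeros, n_total = 286, 500   # "Zeros" and "Number of observations"

# Placeholder positives: any values above zero give the same median and MAD.
values = [0.0] * n_zeros + [1.0] * (n_total - n_zeros)

median = statistics.median(values)
# MAD: the median of absolute deviations from the median.
mad = statistics.median(abs(v - median) for v in values)
print(median, mad)  # 0.0 0.0
```

For a zero-inflated, strongly right-skewed variable like this one, the mean and standard deviation are driven by a small number of large values, while the median-based statistics describe the zero majority.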
Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
0.0                286    57.2%
4.9                31     6.2%
5.0                26     5.2%
9.9                11     2.2%
5.1                10     2.0%
14.9               7      1.4%
15.0               5      1.0%
15.1               4      0.8%
24.7               4      0.8%
14.8               4      0.8%
Other values (91)  112    22.4%

Extreme values (minimum)

Value  Count  Frequency (%)
0.0    286    57.2%
4.9    31     6.2%
5.0    26     5.2%
5.1    10     2.0%
9.8    3      0.6%
9.9    11     2.2%
10.0   3      0.6%
10.1   3      0.6%
14.7   4      0.8%
14.8   4      0.8%

Extreme values (maximum)

Value   Count  Frequency (%)
1125.5  1      0.2%
679.8   1      0.2%
578.2   1      0.2%
540.6   1      0.2%
517.4   1      0.2%
507.5   1      0.2%
493.7   1      0.2%
387.4   1      0.2%
376.6   1      0.2%
337.8   1      0.2%

백화점_할인점_결제(지출)건수(INDEX05_CNT2)
Real number (ℝ)

Distinct: 466
Distinct (%): 93.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1378.2662
Minimum: 0
Maximum: 11744.8
Zeros: 7
Zeros (%): 1.4%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 10.1
Q1: 178.125
Median: 651
Q3: 1840.525
95-th percentile: 5020.71
Maximum: 11744.8
Range: 11744.8
Interquartile range (IQR): 1662.4

Descriptive statistics

Standard deviation: 1778.8555
Coefficient of variation (CV): 1.2906472
Kurtosis: 5.5302476
Mean: 1378.2662
Median Absolute Deviation (MAD): 586.2
Skewness: 2.1600106
Sum: 689133.1
Variance: 3164326.8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
5.0                 9      1.8%
0.0                 7      1.4%
14.9                6      1.2%
4.9                 4      0.8%
20.2                3      0.6%
39.9                3      0.6%
10.0                2      0.4%
74.5                2      0.4%
14.7                2      0.4%
74.3                2      0.4%
Other values (456)  460    92.0%

Extreme values (minimum)

Value  Count  Frequency (%)
0.0    7      1.4%
4.9    4      0.8%
5.0    9      1.8%
5.1    1      0.2%
9.9    1      0.2%
10.0   2      0.4%
10.1   2      0.4%
14.7   2      0.4%
14.8   1      0.2%
14.9   6      1.2%

Extreme values (maximum)

Value    Count  Frequency (%)
11744.8  1      0.2%
9090.9   1      0.2%
8647.7   1      0.2%
8553.5   1      0.2%
8427.4   1      0.2%
8386.5   1      0.2%
8344.1   1      0.2%
7684.5   1      0.2%
7471.2   1      0.2%
7380.4   1      0.2%

홈쇼핑_지수(INDEX05)
Real number (ℝ)

Distinct: 10
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.87
Minimum: 0
Maximum: 9
Zeros: 5
Zeros (%): 1.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
Median: 3
Q3: 6
95-th percentile: 9
Maximum: 9
Range: 9
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.5569858
Coefficient of variation (CV): 0.66071984
Kurtosis: -0.93148931
Mean: 3.87
Median Absolute Deviation (MAD): 2
Skewness: 0.57727
Sum: 1935
Variance: 6.5381764
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)

Common values

Value  Count  Frequency (%)
1      99     19.8%
3      92     18.4%
2      90     18.0%
5      46     9.2%
8      45     9.0%
6      36     7.2%
7      32     6.4%
4      29     5.8%
9      26     5.2%
0      5      1.0%

Extreme values (minimum)

Value  Count  Frequency (%)
0      5      1.0%
1      99     19.8%
2      90     18.0%
3      92     18.4%
4      29     5.8%
5      46     9.2%
6      36     7.2%
7      32     6.4%
8      45     9.0%
9      26     5.2%

Extreme values (maximum)

Value  Count  Frequency (%)
9      26     5.2%
8      45     9.0%
7      32     6.4%
6      36     7.2%
5      46     9.2%
4      29     5.8%
3      92     18.4%
2      90     18.0%
1      99     19.8%
0      5      1.0%

Interactions


Correlations

 | 성별(GENDER) | 연령대(AGE) | 홈쇼핑_결재(지출)건수(INDEX05_CNT) | 백화점_할인점_결제(지출)건수(INDEX05_CNT2) | 홈쇼핑_지수(INDEX05)
성별(GENDER) | 1.000 | 0.000 | 0.000 | 0.079 | 0.127
연령대(AGE) | 0.000 | 1.000 | 0.000 | 0.058 | 0.000
홈쇼핑_결재(지출)건수(INDEX05_CNT) | 0.000 | 0.000 | 1.000 | 0.000 | 0.045
백화점_할인점_결제(지출)건수(INDEX05_CNT2) | 0.079 | 0.058 | 0.000 | 1.000 | 0.000
홈쇼핑_지수(INDEX05) | 0.127 | 0.000 | 0.045 | 0.000 | 1.000

 | 연령대(AGE) | 홈쇼핑_결재(지출)건수(INDEX05_CNT) | 백화점_할인점_결제(지출)건수(INDEX05_CNT2) | 홈쇼핑_지수(INDEX05) | 성별(GENDER)
연령대(AGE) | 1.000 | -0.015 | -0.033 | 0.018 | 0.000
홈쇼핑_결재(지출)건수(INDEX05_CNT) | -0.015 | 1.000 | -0.012 | 0.055 | 0.000
백화점_할인점_결제(지출)건수(INDEX05_CNT2) | -0.033 | -0.012 | 1.000 | -0.026 | 0.078
홈쇼핑_지수(INDEX05) | 0.018 | 0.055 | -0.026 | 1.000 | 0.097
성별(GENDER) | 0.000 | 0.000 | 0.078 | 0.097 | 1.000
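
One way to read a correlation matrix like the second one above is to rank the off-diagonal pairs by absolute value. The sketch below does this in plain Python with the coefficients copied from that matrix. The short labels are the codes in parentheses from the column names, used here only to keep the example compact.

```python
# Coefficients copied from the second correlation matrix in this report.
labels = ["AGE", "INDEX05_CNT", "INDEX05_CNT2", "INDEX05", "GENDER"]
matrix = [
    [1.000, -0.015, -0.033, 0.018, 0.000],
    [-0.015, 1.000, -0.012, 0.055, 0.000],
    [-0.033, -0.012, 1.000, -0.026, 0.078],
    [0.018, 0.055, -0.026, 1.000, 0.097],
    [0.000, 0.000, 0.078, 0.097, 1.000],
]
n = len(labels)

# Sanity check: a correlation matrix is symmetric with ones on the diagonal.
is_symmetric = all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(n))

# Rank each pair above the diagonal by the absolute value of its coefficient.
pairs = sorted(
    ((abs(matrix[i][j]), labels[i], labels[j])
     for i in range(n) for j in range(i + 1, n)),
    reverse=True,
)
strongest = pairs[0]
print(is_symmetric, strongest)
```

Even the strongest pair stays below 0.1 in absolute value, so none of the variables in this sample is more than weakly associated with another.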

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

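First rows

Row | 서울시_블록ID(BLK_CD) | 성별(GENDER) | 연령대(AGE) | 홈쇼핑_결재(지출)건수(INDEX05_CNT) | 백화점_할인점_결제(지출)건수(INDEX05_CNT2) | 홈쇼핑_지수(INDEX05)
0 | 2*2*1* | 2 | 5 | 0.0 | 125.0 | 8
1 | 2*1*1* | 2 | 1 | 0.0 | 427.2 | 6
2 | 1*9*0* | 2 | 2 | 0.0 | 363.0 | 6
3 | 2*4*5* | 2 | 4 | 0.0 | 14.9 | 3
4 | 4*9*5* | 1 | 5 | 113.8 | 130.2 | 8
5 | 2*0*8* | 1 | 6 | 578.2 | 89.7 | 1
6 | 2*9*5* | 2 | 5 | 24.7 | 5135.2 | 8
7 | 2*1* | 2 | 3 | 0.0 | 562.1 | 2
8 | 3*3*9* | 2 | 4 | 0.0 | 4191.9 | 2
9 | 2*5*1* | 1 | 6 | 0.0 | 20.1 | 3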
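
Last rows

Row | 서울시_블록ID(BLK_CD) | 성별(GENDER) | 연령대(AGE) | 홈쇼핑_결재(지출)건수(INDEX05_CNT) | 백화점_할인점_결제(지출)건수(INDEX05_CNT2) | 홈쇼핑_지수(INDEX05)
490 | 3*1*5* | 2 | 3 | 4.9 | 5446.9 | 5
491 | 3*0*6* | 2 | 6 | 5.0 | 273.1 | 3
492 | 2*9*1 | 1 | 6 | 0.0 | 854.7 | 3
493 | 1*2*1* | 2 | 4 | 0.0 | 2762.8 | 3
494 | 1*3*1* | 1 | 1 | 0.0 | 90.2 | 3
495 | 4*9*6* | 2 | 5 | 0.0 | 9.9 | 2
496 | 2*8*0* | 1 | 7 | 5.0 | 5.0 | 3
497 | 1*3*1* | 2 | 4 | 15.1 | 3323.9 | 2
498 | 2*3*7 | 1 | 6 | 0.0 | 178.8 | 3
499 | 2*7*7* | 2 | 2 | 5.1 | 9090.9 | 1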