Overview

Dataset statistics

Number of variables: 6
Number of observations: 500
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 26.0 KiB
Average record size in memory: 53.3 B

Variable types

Text: 1
Categorical: 1
Numeric: 4

Dataset

Description: Sample data
Author: Seoul Metropolitan Government, Shinhan Card, KCB (Korea Credit Bureau)
URL: https://bigdata.seoul.go.kr/data/selectSampleData.do?sample_data_seq=321

Alerts

자동차_보유자수(INDEX04_CNT2) has 125 (25.0%) zeros [Zeros]
자가용이용_지수(INDEX04) has 122 (24.4%) zeros [Zeros]
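
The percentages in these alerts follow directly from the zero counts and the 500 observations reported in the overview. A minimal sketch of that arithmetic is shown here; the helper name `zero_share` is hypothetical, and the threshold at which the profiler decides to raise a Zeros alert is not stated in this report, so none is assumed.

```python
# Zero counts as reported in the Alerts section; 500 is the number of observations.
N_OBSERVATIONS = 500
zero_counts = {
    "자동차_보유자수(INDEX04_CNT2)": 125,
    "자가용이용_지수(INDEX04)": 122,
}


def zero_share(zeros: int, total: int) -> float:
    """Return the percentage of zero-valued observations, rounded to one decimal."""
    return round(100 * zeros / total, 1)


# hypothetical helper applied to each flagged column
shares = {name: zero_share(count, N_OBSERVATIONS) for name, count in zero_counts.items()}
for name, pct in shares.items():
    print(f"{name}: {pct}% zeros")
```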

Reproduction

Analysis started: 2023-12-10 15:01:52.368688
Analysis finished: 2023-12-10 15:01:56.061750
Duration: 3.69 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
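
The reported duration can be checked against the two timestamps. The sketch below assumes both timestamps share one clock and time zone, which the report does not state explicitly.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2023-12-10 15:01:52.368688", FMT)
finished = datetime.strptime("2023-12-10 15:01:56.061750", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```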

Variables

서울시_블록ID(BLK_CD) [Seoul block ID]
Text

Distinct: 321
Distinct (%): 64.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 6
Median length: 6
Mean length: 5.724
Min length: 4

Characters and Unicode

Total characters: 2862
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
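
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below uses the first five sample values of this column as listed in the report; `Nd` is the Unicode short name for Decimal Number and `Po` for Other Punctuation. It is not necessarily how the profiler itself computes these statistics.

```python
import unicodedata
from collections import Counter

# First five sample values of the block-ID column, copied from the report.
samples = ["3*9*5*", "2*7*7*", "2*2*8*", "2*3*3*", "1*4*7"]

# Count characters by Unicode general category across all sample values.
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)
print(category_counts)
```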

Unique

Unique: 206
Unique (%): 41.2%

Sample

1st row: 3*9*5*
2nd row: 2*7*7*
3rd row: 2*2*8*
4th row: 2*3*3*
5th row: 1*4*7

Value               Count  Frequency (%)
2*2*7                   7  1.4%
2*2*2                   5  1.0%
3*5*2                   5  1.0%
1*2*9                   5  1.0%
2*2*8                   5  1.0%
2*3*3                   5  1.0%
2*0*9                   5  1.0%
2*9*3                   5  1.0%
2*2*1                   5  1.0%
2*1*9                   4  0.8%
Other values (249)    449  89.8%

Most occurring characters

Value  Count  Frequency (%)
*       1368  47.8%
2        347  12.1%
1        246  8.6%
3        190  6.6%
4        150  5.2%
7         99  3.5%
5         98  3.4%
6         94  3.3%
8         92  3.2%
0         90  3.1%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number      1494  52.2%
Other Punctuation   1368  47.8%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
2        347  23.2%
1        246  16.5%
3        190  12.7%
4        150  10.0%
7         99  6.6%
5         98  6.6%
6         94  6.3%
8         92  6.2%
0         90  6.0%
9         88  5.9%

Other Punctuation
Value  Count  Frequency (%)
*       1368  100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common   2862  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
*       1368  47.8%
2        347  12.1%
1        246  8.6%
3        190  6.6%
4        150  5.2%
7         99  3.5%
5         98  3.4%
6         94  3.3%
8         92  3.2%
0         90  3.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII   2862  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
*       1368  47.8%
2        347  12.1%
1        246  8.6%
3        190  6.6%
4        150  5.2%
7         99  3.5%
5         98  3.4%
6         94  3.3%
8         92  3.2%
0         90  3.1%

성별(GENDER) [gender]
Categorical

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Value  Count
1        287
2        213

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 1
3rd row: 1
4th row: 2
5th row: 1

Common Values

Value  Count  Frequency (%)
1        287  57.4%
2        213  42.6%

Length

[Figure] Histogram of lengths of the category

Common Values (Plot)

[Figure]

연령대(AGE) [age group]
Real number (ℝ)

Distinct: 7
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.206
Minimum: 1
Maximum: 7
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 3
median: 4
Q3: 5
95-th percentile: 7
Maximum: 7
Range: 6
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.3825291
Coefficient of variation (CV): 0.32870402
Kurtosis: -0.74738805
Mean: 4.206
Median Absolute Deviation (MAD): 1
Skewness: 0.15623462
Sum: 2103
Variance: 1.9113868
Monotonicity: Not monotonic

[Figure] Histogram with fixed size bins (bins=7)
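
Because 연령대(AGE) takes only seven distinct values, the sum, mean, variance, standard deviation, and coefficient of variation can be recomputed from the value counts alone. The sketch below assumes the sample variance (divisor n - 1); the report does not name the convention, but this one reproduces the reported figures.

```python
import math

# Value -> count for 연령대(AGE), copied from this variable's frequency table.
age_counts = {1: 1, 2: 56, 3: 114, 4: 117, 5: 120, 6: 64, 7: 28}

n = sum(age_counts.values())                          # number of observations
total = sum(v * c for v, c in age_counts.items())     # reported as Sum
mean = total / n
# Assumption: sample variance with divisor n - 1.
variance = sum(c * (v - mean) ** 2 for v, c in age_counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean                                       # coefficient of variation

print(n, total, round(mean, 3), round(variance, 7), round(std, 7), round(cv, 8))
```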

Common values
Value  Count  Frequency (%)
5        120  24.0%
4        117  23.4%
3        114  22.8%
6         64  12.8%
2         56  11.2%
7         28  5.6%
1          1  0.2%

Smallest values
Value  Count  Frequency (%)
1          1  0.2%
2         56  11.2%
3        114  22.8%
4        117  23.4%
5        120  24.0%
6         64  12.8%
7         28  5.6%

Largest values
Value  Count  Frequency (%)
7         28  5.6%
6         64  12.8%
5        120  24.0%
4        117  23.4%
3        114  22.8%
2         56  11.2%
1          1  0.2%

주유소_결재(지출)건수(INDEX04_CNT) [number of gas station payment (spending) transactions]
Real number (ℝ)

Distinct: 86
Distinct (%): 17.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 72.578
Minimum: 5
Maximum: 981
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 5
5-th percentile: 5
Q1: 15
median: 45
Q3: 96
95-th percentile: 251.55
Maximum: 981
Range: 976
Interquartile range (IQR): 81

Descriptive statistics

Standard deviation: 89.48501
Coefficient of variation (CV): 1.2329495
Kurtosis: 23.989135
Mean: 72.578
Median Absolute Deviation (MAD): 35
Skewness: 3.6053462
Sum: 36289
Variance: 8007.5671
Monotonicity: Not monotonic

[Figure] Histogram with fixed size bins (bins=50)

Common values
Value              Count  Frequency (%)
5                     62  12.4%
10                    35  7.0%
15                    32  6.4%
20                    30  6.0%
25                    25  5.0%
45                    24  4.8%
35                    23  4.6%
30                    22  4.4%
55                    15  3.0%
40                    14  2.8%
Other values (76)    218  43.6%

Smallest values
Value  Count  Frequency (%)
5         62  12.4%
10        35  7.0%
15        32  6.4%
20        30  6.0%
25        25  5.0%
30        22  4.4%
35        23  4.6%
40        14  2.8%
41         1  0.2%
45        24  4.8%

Largest values
Value  Count  Frequency (%)
981        1  0.2%
548        1  0.2%
433        1  0.2%
406        1  0.2%
388        1  0.2%
382        1  0.2%
362        1  0.2%
357        1  0.2%
352        1  0.2%
332        1  0.2%

자동차_보유자수(INDEX04_CNT2) [number of car owners]
Real number (ℝ)

Alert: Zeros

Distinct: 11
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.652
Minimum: 0
Maximum: 11
Zeros: 125
Zeros (%): 25.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.75
median: 1
Q3: 2
95-th percentile: 5
Maximum: 11
Range: 11
Interquartile range (IQR): 1.25
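
A fractional Q1 of 0.75 for an integer-valued column suggests interpolation between neighbouring order statistics. The sketch below assumes linear interpolation at position q * (n - 1) in the sorted data, the default convention in NumPy and pandas; the report does not state the method, but this assumption reproduces the quantiles listed above. The `quantile` helper is hypothetical and written out for illustration.

```python
import math

# Value -> count for 자동차_보유자수(INDEX04_CNT2), copied from this variable's value tables.
cnt2_counts = {0: 125, 1: 179, 2: 88, 3: 38, 4: 32, 5: 17, 6: 14, 7: 1, 8: 3, 9: 2, 11: 1}

# Expand the counts into the full sorted column of 500 observations.
values = sorted(v for v, c in cnt2_counts.items() for _ in range(c))


def quantile(sorted_values, q):
    """Linear-interpolation quantile at position q * (n - 1) (assumed convention)."""
    pos = q * (len(sorted_values) - 1)
    lower = math.floor(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


q1, median, q3 = (quantile(values, q) for q in (0.25, 0.5, 0.75))
iqr = q3 - q1
print(q1, median, q3, iqr)
```

With 500 observations, Q1 sits at position 124.75: index 124 is the last zero and index 125 the first one, so the interpolated value is 0.75.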

Descriptive statistics

Standard deviation: 1.7183718
Coefficient of variation (CV): 1.0401766
Kurtosis: 3.5787407
Mean: 1.652
Median Absolute Deviation (MAD): 1
Skewness: 1.6723553
Sum: 826
Variance: 2.9528016
Monotonicity: Not monotonic

[Figure] Histogram with fixed size bins (bins=11)

Common values
Value  Count  Frequency (%)
1        179  35.8%
0        125  25.0%
2         88  17.6%
3         38  7.6%
4         32  6.4%
5         17  3.4%
6         14  2.8%
8          3  0.6%
9          2  0.4%
7          1  0.2%

Smallest values
Value  Count  Frequency (%)
0        125  25.0%
1        179  35.8%
2         88  17.6%
3         38  7.6%
4         32  6.4%
5         17  3.4%
6         14  2.8%
7          1  0.2%
8          3  0.6%
9          2  0.4%

Largest values
Value  Count  Frequency (%)
11         1  0.2%
9          2  0.4%
8          3  0.6%
7          1  0.2%
6         14  2.8%
5         17  3.4%
4         32  6.4%
3         38  7.6%
2         88  17.6%
1        179  35.8%

자가용이용_지수(INDEX04) [private car usage index]
Real number (ℝ)

Alert: Zeros

Distinct: 10
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.112
Minimum: 0
Maximum: 9
Zeros: 122
Zeros (%): 24.4%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 2
Q3: 5
95-th percentile: 9
Maximum: 9
Range: 9
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.9128036
Coefficient of variation (CV): 0.93599088
Kurtosis: -0.79110439
Mean: 3.112
Median Absolute Deviation (MAD): 2
Skewness: 0.6937214
Sum: 1556
Variance: 8.4844248
Monotonicity: Not monotonic

[Figure] Histogram with fixed size bins (bins=10)

Common values
Value  Count  Frequency (%)
0        122  24.4%
2         78  15.6%
1         65  13.0%
3         61  12.2%
8         33  6.6%
9         32  6.4%
7         31  6.2%
5         31  6.2%
4         27  5.4%
6         20  4.0%

Smallest values
Value  Count  Frequency (%)
0        122  24.4%
1         65  13.0%
2         78  15.6%
3         61  12.2%
4         27  5.4%
5         31  6.2%
6         20  4.0%
7         31  6.2%
8         33  6.6%
9         32  6.4%

Largest values
Value  Count  Frequency (%)
9         32  6.4%
8         33  6.6%
7         31  6.2%
6         20  4.0%
5         31  6.2%
4         27  5.4%
3         61  12.2%
2         78  15.6%
1         65  13.0%
0        122  24.4%

Interactions

[Figures] 16 interaction plots (not reproduced here)

Correlations

[Figure] Correlation heatmap

                                    GENDER     AGE  INDEX04_CNT  INDEX04_CNT2  INDEX04
성별(GENDER)                          1.000   0.000        0.000         0.050    0.000
연령대(AGE)                           0.000   1.000        0.138         0.000    0.070
주유소_결재(지출)건수(INDEX04_CNT)    0.000   0.138        1.000         0.164    0.000
자동차_보유자수(INDEX04_CNT2)         0.050   0.000        0.164         1.000    0.000
자가용이용_지수(INDEX04)              0.000   0.070        0.000         0.000    1.000

[Figure] Correlation heatmap

                                       AGE  INDEX04_CNT  INDEX04_CNT2  INDEX04  GENDER
연령대(AGE)                           1.000        0.026        -0.004   -0.001   0.000
주유소_결재(지출)건수(INDEX04_CNT)    0.026        1.000         0.032   -0.033   0.000
자동차_보유자수(INDEX04_CNT2)        -0.004        0.032         1.000   -0.005   0.037
자가용이용_지수(INDEX04)             -0.001       -0.033        -0.005    1.000   0.000
성별(GENDER)                          0.000        0.000         0.037    0.000   1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
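
The per-column nullity summarized by these plots amounts to counting missing cells in each column. The sketch below uses a hypothetical three-row excerpt with `None` marking a missing cell; the rows are invented for illustration only, since the profiled dataset itself has no missing cells.

```python
# Hypothetical rows (not from the dataset); None marks a missing cell.
rows = [
    {"GENDER": 2, "AGE": 3, "INDEX04": 0},
    {"GENDER": 1, "AGE": None, "INDEX04": 3},
    {"GENDER": 1, "AGE": 6, "INDEX04": None},
]

columns = list(rows[0])
# Number of missing cells per column.
nullity = {col: sum(row[col] is None for row in rows) for col in columns}
# Total missing cells, the quantity reported as "Missing cells" in the overview.
missing_cells = sum(nullity.values())
print(nullity, missing_cells)
```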

Sample

First rows

     BLK_CD  GENDER  AGE  INDEX04_CNT  INDEX04_CNT2  INDEX04
0    3*9*5*       2    3           10             2        0
1    2*7*7*       1    4           65             1        3
2    2*2*8*       1    6           10             1        4
3    2*3*3*       2    5           45             0        1
4    1*4*7        1    4          201             2        1
5    2*3*1*       2    3          327             1        8
6    3*4*7*       2    4          266             1        7
7    3*7*9*       2    6           15             2        0
8    2*4*8        2    6          292             2        7
9    2*5*8*       2    5           30             2        0

Last rows

     BLK_CD  GENDER  AGE  INDEX04_CNT  INDEX04_CNT2  INDEX04
490  3*2*2*       2    5           30             1        1
491  1*0*1        1    3           15             2        0
492  2*1*4*       2    3           50             1        6
493  2*9*3*       1    5           10             1        1
494  3*5*9        2    6           45             0        4
495  1*6*7        1    5          141             0        4
496  1*5*7        1    2           80             0        4
497  1*6*4        2    5          166             1        2
498  1*9*6        2    7           10             1        8
499  3*3*4*       2    5           50             1        0