Overview

Dataset statistics

Number of variables: 6
Number of observations: 500
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 26.0 KiB
Average record size in memory: 53.3 B
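
The overview figures are simple ratios over the table: 6 variables × 500 observations give 3,000 cells, and the percentages divide the missing-cell and duplicate-row counts by the cell and row totals. The sketch below illustrates this on a few made-up rows. The helper name `overview_stats` and the rows are hypothetical, and the duplicate rule used here (every repeat of an earlier row counts once) is an assumption that may differ from the profiler's own definition.

```python
from collections import Counter

def overview_stats(rows):
    """Return missing-cell and duplicate-row counts with their percentages."""
    n_rows = len(rows)
    n_cells = sum(len(row) for row in rows)
    missing = sum(1 for row in rows for cell in row if cell is None)
    # Assumption: a row counts as a duplicate each time it repeats an earlier row.
    duplicates = sum(count - 1 for count in Counter(rows).values())
    return {
        "missing_cells": missing,
        "missing_cells_pct": 100.0 * missing / n_cells if n_cells else 0.0,
        "duplicate_rows": duplicates,
        "duplicate_rows_pct": 100.0 * duplicates / n_rows if n_rows else 0.0,
    }

# Hypothetical rows of (BLK_CD, GENDER, AGE); not taken from the dataset.
rows = [
    ("2*0*7*", 2, 3),
    ("5*2*1*", 2, 2),
    ("5*2*1*", 2, 2),     # repeats the previous row
    ("4*8*6*", None, 3),  # one missing cell
]
stats = overview_stats(rows)
print(stats)
```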

Variable types

Text: 1
Categorical: 1
Numeric: 4

Dataset

Description: Sample data
Author: Seoul Metropolitan Government, Shinhan Card, KCB (Korea Credit Bureau)
URL: https://bigdata.seoul.go.kr/data/selectSampleData.do?sample_data_seq=321

Reproduction

Analysis started: 2023-12-10 15:02:01.721471
Analysis finished: 2023-12-10 15:02:06.749674
Duration: 5.03 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
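
The reported duration follows from the two timestamps. A minimal check, assuming both are expressed in the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2023-12-10 15:02:01.721471", FMT)
finished = datetime.strptime("2023-12-10 15:02:06.749674", FMT)

# Elapsed wall-clock time between the two report timestamps, in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```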

Variables

서울시_블록ID(BLK_CD)
Text

Distinct: 327
Distinct (%): 65.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 6
Median length: 6
Mean length: 5.72
Min length: 4

Characters and Unicode

Total characters: 2860
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
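
As a small illustration of these character properties, the sketch below uses Python's standard `unicodedata` module to count general categories for the five masked IDs shown in this variable's sample. It covers categories only; scripts and blocks are not exposed by the standard library. This is an illustration, not the profiler's own implementation.

```python
import unicodedata
from collections import Counter

# First five masked block IDs from the report's sample.
samples = ["2*0*7*", "5*2*1*", "4*8*6*", "5*2*8*", "1*6*2*"]

# Map every character to its Unicode general category code.
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)

# "Nd" is Decimal Number and "Po" is Other Punctuation.
for code, count in sorted(category_counts.items()):
    print(code, count)
```

The two codes correspond to the two categories the report lists for this variable: digits fall under Decimal Number and the masking asterisk under Other Punctuation.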

Unique

Unique: 213
Unique (%): 42.6%
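
"Distinct" and "Unique" measure different things: distinct counts how many different values occur, while unique counts the values that occur exactly once. A minimal sketch on made-up values (the list and the helper name `distinct_and_unique` are hypothetical):

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Hypothetical column: two IDs repeat, two appear once.
values = ["2*0*7*", "5*2*1*", "5*2*1*", "4*8*6*", "4*8*6*", "1*6*2*"]
distinct, unique = distinct_and_unique(values)
print(distinct, unique)

# The report's percentages divide by the 500 observations.
distinct_pct = 100 * 327 / 500
unique_pct = 100 * 213 / 500
```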

Sample

1st row: 2*0*7*
2nd row: 5*2*1*
3rd row: 4*8*6*
4th row: 5*2*8*
5th row: 1*6*2*

Value               Count  Frequency (%)
2*6*6               6      1.2%
2*3*7               6      1.2%
2*6*2               6      1.2%
2*7*4               5      1.0%
1*3*5               5      1.0%
2*0*4               5      1.0%
2*0*5               4      0.8%
1*3*3               4      0.8%
2*2*2               4      0.8%
2*0*6               4      0.8%
Other values (267)  451    90.2%

Most occurring characters

Value  Count  Frequency (%)
*      1370   47.9%
2      313    10.9%
1      231    8.1%
3      196    6.9%
4      156    5.5%
6      108    3.8%
5      107    3.7%
8      104    3.6%
0      100    3.5%
9      93     3.3%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     1490   52.1%
Other Punctuation  1370   47.9%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
2      313    21.0%
1      231    15.5%
3      196    13.2%
4      156    10.5%
6      108    7.2%
5      107    7.2%
8      104    7.0%
0      100    6.7%
9      93     6.2%
7      82     5.5%

Other Punctuation
Value  Count  Frequency (%)
*      1370   100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  2860   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
*      1370   47.9%
2      313    10.9%
1      231    8.1%
3      196    6.9%
4      156    5.5%
6      108    3.8%
5      107    3.7%
8      104    3.6%
0      100    3.5%
9      93     3.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  2860   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
*      1370   47.9%
2      313    10.9%
1      231    8.1%
3      196    6.9%
4      156    5.5%
6      108    3.8%
5      107    3.7%
8      104    3.6%
0      100    3.5%
9      93     3.3%

성별(GENDER)
Categorical

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 2
3rd row: 2
4th row: 1
5th row: 2

Common Values

Value  Count  Frequency (%)
2      289    57.8%
1      211    42.2%

Length

Histogram of lengths of the category


연령대(AGE)
Real number (ℝ)

Distinct: 7
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.678
Minimum: 1
Maximum: 7
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 3
Median: 4
Q3: 4
95-th percentile: 5
Maximum: 7
Range: 6
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.1405779
Coefficient of variation (CV): 0.31010817
Kurtosis: -0.58492961
Mean: 3.678
Median Absolute Deviation (MAD): 1
Skewness: 0.036313814
Sum: 1839
Variance: 1.3009178
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=7)

Common Values

Value  Count  Frequency (%)
4      175    35.0%
3      106    21.2%
5      102    20.4%
2      96     19.2%
6      16     3.2%
7      3      0.6%
1      2      0.4%

Minimum values

Value  Count  Frequency (%)
1      2      0.4%
2      96     19.2%
3      106    21.2%
4      175    35.0%
5      102    20.4%
6      16     3.2%
7      3      0.6%

Maximum values

Value  Count  Frequency (%)
7      3      0.6%
6      16     3.2%
5      102    20.4%
4      175    35.0%
3      106    21.2%
2      96     19.2%
1      2      0.4%
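
Because 연령대(AGE) takes only seven values, its descriptive statistics can be recomputed from the value counts alone. The sketch below expands the reported counts back into 500 observations and uses the standard `statistics` module. It assumes the reported variance and standard deviation use the sample (n − 1) denominator, which the figures are consistent with.

```python
import statistics

# Value counts for 연령대(AGE) as listed in the report: {value: count}.
age_counts = {1: 2, 2: 96, 3: 106, 4: 175, 5: 102, 6: 16, 7: 3}

# Expand the counts back into one observation per row.
ages = [value for value, count in age_counts.items() for _ in range(count)]

total = sum(ages)
mean = statistics.mean(ages)
median = statistics.median(ages)
variance = statistics.variance(ages)  # sample variance, n - 1 denominator
std = statistics.stdev(ages)          # square root of the sample variance
cv = std / mean                       # coefficient of variation

print(total, mean, median, round(variance, 7), round(std, 7), round(cv, 8))
```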

학원비_인당_평균_지출액(INDEX03_AMT)
Real number (ℝ)

Distinct: 411
Distinct (%): 82.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1076646
Minimum: 0
Maximum: 15109000
Zeros: 1
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 20950
Q1: 113000
Median: 315000
Q3: 1158250
95-th percentile: 4818750
Maximum: 15109000
Range: 15109000
Interquartile range (IQR): 1045250

Descriptive statistics

Standard deviation: 1844792.8
Coefficient of variation (CV): 1.7134628
Kurtosis: 14.30725
Mean: 1076646
Median Absolute Deviation (MAD): 271500
Skewness: 3.3470287
Sum: 5.38323 × 10^8
Variance: 3.4032606 × 10^12
Monotonicity: Not monotonic
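
Several of these figures are derived from others, so they can be cross-checked without the raw data. The sketch below recomputes the range, IQR, coefficient of variation, variance, and sum from the reported minimum, maximum, quartiles, mean, and standard deviation. Small differences are expected because the inputs are rounded as printed.

```python
# Figures reported for 학원비_인당_평균_지출액(INDEX03_AMT).
n = 500
minimum, maximum = 0, 15_109_000
q1, q3 = 113_000, 1_158_250
mean, std = 1_076_646, 1_844_792.8

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance is the squared standard deviation
total = mean * n                 # Sum is the mean times the observation count

print(value_range, iqr, round(cv, 7), f"{variance:.7e}", f"{total:.5e}")
```

A CV above 1 together with a mean (1076646) more than three times the median (315000) matches the strong right skew the report gives for this variable.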
Histogram with fixed size bins (bins=50)

Common Values

Value               Count  Frequency (%)
127000              4      0.8%
4000                3      0.6%
63000               3      0.6%
87000               3      0.6%
38000               3      0.6%
66000               3      0.6%
52000               3      0.6%
121000              3      0.6%
278000              3      0.6%
17000               3      0.6%
Other values (401)  469    93.8%

Minimum values

Value               Count  Frequency (%)
0                   1      0.2%
1000                1      0.2%
2000                1      0.2%
3000                1      0.2%
4000                3      0.6%
6000                1      0.2%
7000                3      0.6%
8000                1      0.2%
9000                1      0.2%
11000               1      0.2%

Maximum values

Value               Count  Frequency (%)
15109000            1      0.2%
11428000            1      0.2%
10748000            1      0.2%
10271000            1      0.2%
9500000             1      0.2%
9267000             1      0.2%
9031000             1      0.2%
8935000             1      0.2%
8509000             1      0.2%
7689000             1      0.2%

인당_평균_소득대비_평균_학원비_지출_비율(INDEX03_RT)
Real number (ℝ)

Distinct: 414
Distinct (%): 82.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.0995474
Minimum: 0.0003
Maximum: 1.2877
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 0.0003
5-th percentile: 0.003195
Q1: 0.01295
Median: 0.0377
Q3: 0.103825
95-th percentile: 0.409735
Maximum: 1.2877
Range: 1.2874
Interquartile range (IQR): 0.090875

Descriptive statistics

Standard deviation: 0.15641927
Coefficient of variation (CV): 1.5713044
Kurtosis: 15.565182
Mean: 0.0995474
Median Absolute Deviation (MAD): 0.03155
Skewness: 3.3705283
Sum: 49.7737
Variance: 0.024466988
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common Values

Value               Count  Frequency (%)
0.0152              4      0.8%
0.0077              4      0.8%
0.0007              4      0.8%
0.0058              3      0.6%
0.0116              3      0.6%
0.0048              3      0.6%
0.0056              3      0.6%
0.0082              3      0.6%
0.0005              3      0.6%
0.013               3      0.6%
Other values (404)  467    93.4%

Minimum values

Value               Count  Frequency (%)
0.0003              2      0.4%
0.0004              1      0.2%
0.0005              3      0.6%
0.0007              4      0.8%
0.0008              1      0.2%
0.001               1      0.2%
0.0011              1      0.2%
0.0012              3      0.6%
0.0013              1      0.2%
0.0015              2      0.4%

Maximum values

Value               Count  Frequency (%)
1.2877              1      0.2%
1.1764              1      0.2%
1.0011              1      0.2%
0.877               1      0.2%
0.7363              1      0.2%
0.6917              1      0.2%
0.6706              1      0.2%
0.6553              1      0.2%
0.6551              1      0.2%
0.6111              1      0.2%

학원비지수(INDEX03)
Real number (ℝ)

Distinct: 9
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.854
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 5
Q3: 7
95-th percentile: 9
Maximum: 9
Range: 8
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.5988829
Coefficient of variation (CV): 0.53541057
Kurtosis: -1.2230164
Mean: 4.854
Median Absolute Deviation (MAD): 2
Skewness: 0.098436179
Sum: 2427
Variance: 6.7541924
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=9)

Common Values

Value  Count  Frequency (%)
2      62     12.4%
4      60     12.0%
1      59     11.8%
3      58     11.6%
6      57     11.4%
9      57     11.4%
7      51     10.2%
5      50     10.0%
8      46     9.2%

Minimum values

Value  Count  Frequency (%)
1      59     11.8%
2      62     12.4%
3      58     11.6%
4      60     12.0%
5      50     10.0%
6      57     11.4%
7      51     10.2%
8      46     9.2%
9      57     11.4%

Maximum values

Value  Count  Frequency (%)
9      57     11.4%
8      46     9.2%
7      51     10.2%
6      57     11.4%
5      50     10.0%
4      60     12.0%
3      58     11.6%
2      62     12.4%
1      59     11.8%
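
The nine values of 학원비지수(INDEX03) occur with nearly equal frequency, which is what the strongly negative kurtosis reflects. The sketch below recomputes excess kurtosis from the value counts and compares it with the theoretical value for a discrete uniform distribution on nine points. It assumes the report uses the bias-corrected Fisher definition (the one pandas `Series.kurt` documents); the function name `sample_excess_kurtosis` is hypothetical.

```python
# Value counts for 학원비지수(INDEX03) as listed in the report: {value: count}.
index_counts = {1: 59, 2: 62, 3: 58, 4: 60, 5: 50, 6: 57, 7: 51, 8: 46, 9: 57}

def sample_excess_kurtosis(counts):
    """Bias-corrected sample excess kurtosis (G2) from a {value: count} table."""
    n = sum(counts.values())
    mean = sum(v * c for v, c in counts.items()) / n
    m2 = sum(c * (v - mean) ** 2 for v, c in counts.items()) / n
    m4 = sum(c * (v - mean) ** 4 for v, c in counts.items()) / n
    g2 = m4 / m2 ** 2 - 3  # population excess kurtosis
    return (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)

kurtosis = sample_excess_kurtosis(index_counts)

# Excess kurtosis of a discrete uniform distribution on k equally spaced values.
k = len(index_counts)
uniform_kurtosis = -6 * (k ** 2 + 1) / (5 * (k ** 2 - 1))

print(round(kurtosis, 4), round(uniform_kurtosis, 4))
```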

Interactions

[Figures: 16 pairwise interaction plots]

Correlations

                                                          성별(GENDER)  연령대(AGE)  학원비_인당_평균_지출액(INDEX03_AMT)  인당_평균_소득대비_평균_학원비_지출_비율(INDEX03_RT)  학원비지수(INDEX03)
성별(GENDER)                                               1.000   0.045   0.000   0.010   0.139
연령대(AGE)                                                0.045   1.000   0.000   0.000   0.075
학원비_인당_평균_지출액(INDEX03_AMT)                        0.000   0.000   1.000   0.275   0.079
인당_평균_소득대비_평균_학원비_지출_비율(INDEX03_RT)         0.010   0.000   0.275   1.000   0.000
학원비지수(INDEX03)                                        0.139   0.075   0.079   0.000   1.000

                                                          연령대(AGE)  학원비_인당_평균_지출액(INDEX03_AMT)  인당_평균_소득대비_평균_학원비_지출_비율(INDEX03_RT)  학원비지수(INDEX03)  성별(GENDER)
연령대(AGE)                                                1.000   0.058   -0.035  0.045   0.048
학원비_인당_평균_지출액(INDEX03_AMT)                        0.058   1.000   0.062   0.015   0.000
인당_평균_소득대비_평균_학원비_지출_비율(INDEX03_RT)         -0.035  0.062   1.000   -0.123  0.008
학원비지수(INDEX03)                                        0.045   0.015   -0.123  1.000   0.138
성별(GENDER)                                               0.048   0.000   0.008   0.138   1.000
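
To read a matrix like these programmatically, one option is to scan the upper triangle for the largest absolute off-diagonal value. The sketch below does this for the first matrix, using the parenthesised column codes as short labels. The helper name `strongest_pair` is hypothetical, and the code makes no assumption about which correlation measure produced the values.

```python
# First correlation matrix from the report, labelled by column code.
labels = ["GENDER", "AGE", "INDEX03_AMT", "INDEX03_RT", "INDEX03"]
matrix = [
    [1.000, 0.045, 0.000, 0.010, 0.139],
    [0.045, 1.000, 0.000, 0.000, 0.075],
    [0.000, 0.000, 1.000, 0.275, 0.079],
    [0.010, 0.000, 0.275, 1.000, 0.000],
    [0.139, 0.075, 0.079, 0.000, 1.000],
]

def strongest_pair(labels, matrix):
    """Return (label_a, label_b, value) for the largest |value| above the diagonal."""
    best = None
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            if best is None or abs(matrix[i][j]) > abs(best[2]):
                best = (labels[i], labels[j], matrix[i][j])
    return best

size = len(labels)
is_symmetric = all(
    matrix[i][j] == matrix[j][i] for i in range(size) for j in range(size)
)
pair = strongest_pair(labels, matrix)
print(is_symmetric, pair)
```

Even the strongest pair, spending amount against spending-to-income ratio, is a weak association at 0.275.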

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

     서울시_블록ID(BLK_CD)  성별(GENDER)  연령대(AGE)  학원비_인당_평균_지출액(INDEX03_AMT)  인당_평균_소득대비_평균_학원비_지출_비율(INDEX03_RT)  학원비지수(INDEX03)
0    2*0*7*  2  3  504000   0.0031  8
1    5*2*1*  2  2  242000   0.0817  5
2    4*8*6*  2  3  4995000  0.0712  6
3    5*2*8*  1  5  149000   0.0054  2
4    1*6*2*  2  5  127000   0.0457  9
5    3*4*8*  2  7  115000   0.0008  8
6    4*7*0*  2  4  2181000  0.0701  2
7    2*0*4*  1  4  1334000  0.0092  4
8    2*9*1*  2  4  101000   0.0777  4
9    2*1*4*  2  3  148000   0.0033  9

Last rows

     서울시_블록ID(BLK_CD)  성별(GENDER)  연령대(AGE)  학원비_인당_평균_지출액(INDEX03_AMT)  인당_평균_소득대비_평균_학원비_지출_비율(INDEX03_RT)  학원비지수(INDEX03)
490  1*3*7   1  3  4044000  0.1061  3
491  2*4*6*  2  2  646000   0.0847  5
492  2*1*9   2  4  8935000  0.0771  6
493  3*5*8*  2  4  328000   0.0495  9
494  2*4*1*  1  4  7000     0.1084  2
495  3*6*9*  1  1  2160000  0.489   1
496  2*2*4*  2  2  17000    0.0918  6
497  3*3*0*  2  5  3050000  0.1346  8
498  3*0*8*  1  5  195000   0.0267  6
499  2*1*2*  1  2  1903000  0.0136  9