Overview

Dataset statistics

Number of variables: 6
Number of observations: 30
Missing cells: 30
Missing cells (%): 16.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.6 KiB
Average record size in memory: 54.4 B
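
The missing-cell percentage follows directly from the counts above. The short sketch below recomputes it; the variable names are illustrative and not part of the report.

```python
# Figures taken from the dataset statistics above.
n_variables = 6
n_observations = 30
missing_cells = 30

# Every cell is one (row, column) pair.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{missing_cells}/{total_cells} cells missing = {missing_pct:.1f}%")
```

All 30 missing cells come from a single column (카드사명), which is empty in every row.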

Variable types

Categorical: 3
Text: 1
Numeric: 1
Unsupported: 1

Dataset

Description: Sample data
Author: 경기콘텐츠진흥원 (Gyeonggi Content Agency)
URL: https://www.bigdata-region.kr/#/dataset/7c9606dc-68c6-4282-ab7d-d71b3b019742

Alerts

Constant: 기준년월 has constant value "2013-01"
Imbalance: 시도명 is highly imbalanced (51.4%)
Missing: 카드사명 has 30 (100.0%) missing values
Unique: 사용금액 has unique values
Unsupported: 카드사명 is an unsupported type; check whether it needs cleaning or further analysis
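
The 51.4% imbalance figure for 시도명 is consistent with one minus the normalized Shannon entropy of the value counts reported for that column (24, 4, 1, 1). The sketch below assumes that definition; the function name `imbalance_score` is hypothetical and the code is not taken from the profiling tool.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the counts
    and k the number of classes. 0 means perfectly balanced; values close
    to 1 mean one class dominates."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes <= 1:
        # Convention chosen for this sketch: a single class is not scored.
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(n_classes)


# Value counts for 시도명 as listed in this report.
sido_counts = {"경기도": 24, "서울특별시": 4, "충청남도": 1, "인천광역시": 1}
score = imbalance_score(list(sido_counts.values()))
print(f"{score:.1%}")
```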

Reproduction

Analysis started: 2023-12-10 14:01:19.077063
Analysis finished: 2023-12-10 14:01:19.727051
Duration: 0.65 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

기준년월 (reference year-month)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Value    Count
2013-01  30

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2013-01
2nd row: 2013-01
3rd row: 2013-01
4th row: 2013-01
5th row: 2013-01

Common Values

Value    Count  Frequency (%)
2013-01  30     100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

시도명 (province or metropolitan city name)
Categorical

IMBALANCE

Distinct: 4
Distinct (%): 13.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Value       Count
경기도       24
서울특별시   4
충청남도     1
인천광역시   1

Length

Max length: 5
Median length: 3
Mean length: 3.3666667
Min length: 3

Unique

Unique: 2
Unique (%): 6.7%
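
The report distinguishes "distinct" (number of different values) from "unique" (values that occur exactly once). The sketch below recomputes both, together with the mean length, from the 시도명 value counts given in this report; the variable names are illustrative.

```python
# Value counts for 시도명 as listed in this report.
counts = {"경기도": 24, "서울특별시": 4, "충청남도": 1, "인천광역시": 1}
n_rows = sum(counts.values())

# Distinct: how many different values appear at all.
distinct = len(counts)
# Unique: how many values appear in exactly one row.
unique = sum(1 for c in counts.values() if c == 1)
# Mean length: character length of each value, weighted by its row count.
mean_length = sum(len(value) * c for value, c in counts.items()) / n_rows

print(f"Distinct: {distinct} ({distinct / n_rows:.1%})")
print(f"Unique: {unique} ({unique / n_rows:.1%})")
print(f"Mean length: {mean_length:.7f}")
```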

Sample

1st row: 경기도
2nd row: 경기도
3rd row: 경기도
4th row: 경기도
5th row: 경기도

Common Values

Value       Count  Frequency (%)
경기도       24     80.0%
서울특별시   4      13.3%
충청남도     1      3.3%
인천광역시   1      3.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

시군구명 (city, county, or district name)
Text

Distinct: 26
Distinct (%): 86.7%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 7
Median length: 3
Mean length: 4.3333333
Min length: 3

Characters and Unicode

Total characters: 130
Distinct characters: 43
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 22
Unique (%): 73.3%

Sample

1st row: 수원시 권선구
2nd row: 광주시
3rd row: 수원시 권선구
4th row: 이천시
5th row: 시흥시

Words

Value              Count  Frequency (%)
수원시              3      7.7%
하남시              2      5.1%
성남시              2      5.1%
권선구              2      5.1%
안양시              2      5.1%
동두천시            2      5.1%
김포시              2      5.1%
용인시              1      2.6%
만안구              1      2.6%
의정부시            1      2.6%
Other values (21)  21     53.8%

Most occurring characters

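Value              Count  Frequency (%)
(not shown)        26     20.0%
(not shown)        15     11.5%
(space)            9      6.9%
(not shown)        5      3.8%
(not shown)        5      3.8%
(not shown)        5      3.8%
(not shown)        5      3.8%
(not shown)        5      3.8%
(not shown)        5      3.8%
(not shown)        4      3.1%
Other values (33)  46     35.4%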

Most occurring categories

Value            Count  Frequency (%)
Other Letter     121    93.1%
Space Separator  9      6.9%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
(not shown)        26     21.5%
(not shown)        15     12.4%
(not shown)        5      4.1%
(not shown)        5      4.1%
(not shown)        5      4.1%
(not shown)        5      4.1%
(not shown)        5      4.1%
(not shown)        5      4.1%
(not shown)        4      3.3%
(not shown)        3      2.5%
Other values (32)  43     35.5%

Space Separator
Value    Count  Frequency (%)
(space)  9      100.0%

Most occurring scripts

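Value   Count  Frequency (%)
Hangul  121    93.1%
Common  9      6.9%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
(not shown)        26     21.5%
(not shown)        15     12.4%
(not shown)        5      4.1%
(not shown)        5      4.1%
(not shown)        5      4.1%
(not shown)        5      4.1%
(not shown)        5      4.1%
(not shown)        5      4.1%
(not shown)        4      3.3%
(not shown)        3      2.5%
Other values (32)  43     35.5%

Common
Value    Count  Frequency (%)
(space)  9      100.0%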

Most occurring blocks

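Value   Count  Frequency (%)
Hangul  121    93.1%
ASCII   9      6.9%

Most frequent character per block

Hangul
Value              Count  Frequency (%)
(not shown)        26     21.5%
(not shown)        15     12.4%
(not shown)        5      4.1%
(not shown)        5      4.1%
(not shown)        5      4.1%
(not shown)        5      4.1%
(not shown)        5      4.1%
(not shown)        5      4.1%
(not shown)        4      3.3%
(not shown)        3      2.5%
Other values (32)  43     35.5%

ASCII
Value    Count  Frequency (%)
(space)  9      100.0%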

연령대코드 (age group code)
Categorical

Distinct: 6
Distinct (%): 20.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Value  Count
40     9
10     7
30     6
20     4
50     3

Length

Max length: 3
Median length: 2
Mean length: 2.0333333
Min length: 2

Unique

Unique: 1
Unique (%): 3.3%

Sample

1st row: 10
2nd row: 30
3rd row: 50
4th row: 10
5th row: 20

Common Values

Value  Count  Frequency (%)
40     9      30.0%
10     7      23.3%
30     6      20.0%
20     4      13.3%
50     3      10.0%
60+    1      3.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

사용금액 (amount spent)
Real number (ℝ)

UNIQUE

Distinct: 30
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.7469864 × 10^10
Minimum: 88148
Maximum: 9.33485 × 10^10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

[Figure: histogram]

Quantile statistics

Minimum: 88148
5-th percentile: 872499.3
Q1: 4.2073026 × 10^8
Median: 8.5551514 × 10^9
Q3: 3.2576118 × 10^10
95-th percentile: 5.5305418 × 10^10
Maximum: 9.33485 × 10^10
Range: 9.3348411 × 10^10
Interquartile range (IQR): 3.2155388 × 10^10
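
The range and IQR above are derived from the other quantile statistics. The sketch below recomputes them; it uses the exact minimum and maximum that appear in this report's value tables and the rounded Q1 and Q3 shown above, so the IQR only matches to the displayed precision.

```python
# Exact extremes as they appear in the report's value tables.
minimum = 88148
maximum = 93348499623

# Quartiles as displayed (rounded to 8 significant digits).
q1 = 4.2073026e8
q3 = 3.2576118e10

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # IQR = Q3 - Q1

print(f"Range: {value_range:.7e}")
print(f"IQR:   {iqr:.7e}")
```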

Descriptive statistics

Standard deviation: 2.2649624 × 10^10
Coefficient of variation (CV): 1.2964968
Kurtosis: 3.0178421
Mean: 1.7469864 × 10^10
Median Absolute Deviation (MAD): 8.4605348 × 10^9
Skewness: 1.7008265
Sum: 5.2409593 × 10^11
Variance: 5.1300547 × 10^20
Monotonicity: Not monotonic
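
Several of the descriptive statistics are tied together by simple identities: mean = sum / n, CV = standard deviation / mean, and variance = standard deviation squared. The sketch below checks these using the rounded values displayed in the report, so agreement is only expected to about seven significant digits.

```python
# Values as displayed in the report (rounded).
n = 30
total = 5.2409593e11
std = 2.2649624e10

mean = total / n      # Mean = Sum / number of observations
cv = std / mean       # Coefficient of variation
variance = std ** 2   # Variance is the squared standard deviation

print(f"Mean:     {mean:.7e}")
print(f"CV:       {cv:.7f}")
print(f"Variance: {variance:.7e}")
```

A CV above 1 means the standard deviation exceeds the mean, which fits the strong right skew (skewness 1.70) reported for 사용금액.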
[Figure: histogram with fixed size bins (bins=30)]

Common values

Value              Count  Frequency (%)
394790493          1      3.3%
7656705706         1      3.3%
13424682609        1      3.3%
93348499623        1      3.3%
4345394530         1      3.3%
39302827531        1      3.3%
43066007484        1      3.3%
338461444          1      3.3%
9088220372         1      3.3%
37752528412        1      3.3%
Other values (20)  20     66.7%

Minimum 10 values

Value      Count  Frequency (%)
88148      1      3.3%
160026     1      3.3%
1743300    1      3.3%
35060874   1      3.3%
64096778   1      3.3%
125136420  1      3.3%
338461444  1      3.3%
394790493  1      3.3%
498549564  1      3.3%
725647324  1      3.3%

Maximum 10 values

Value        Count  Frequency (%)
93348499623  1      3.3%
55511909592  1      3.3%
55053039610  1      3.3%
43066007484  1      3.3%
39302827531  1      3.3%
38093721586  1      3.3%
37752528412  1      3.3%
37325993096  1      3.3%
18326491756  1      3.3%
17362978196  1      3.3%

카드사명 (card company name)
Unsupported

MISSING, REJECTED, UNSUPPORTED

Missing: 30
Missing (%): 100.0%
Memory size: 402.0 B

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

            시도명   시군구명   연령대코드   사용금액
시도명       1.000   1.000     0.000       0.000
시군구명     1.000   1.000     0.000       0.948
연령대코드   0.000   0.000     1.000       0.329
사용금액     0.000   0.948     0.329       1.000

[Figure: correlation heatmap]

            연령대코드   시도명
연령대코드   1.000       0.000
시도명       0.000       1.000

[Figure: correlation heatmap]

            사용금액   시도명   연령대코드
사용금액     1.000     0.000   0.139
시도명       0.000     1.000   0.000
연령대코드   0.139     0.000   1.000

Missing values

[Figure: nullity by column] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

Index  기준년월  시도명      시군구명       연령대코드  사용금액      카드사명
0      2013-01  경기도      수원시 권선구  10         394790493    <NA>
1      2013-01  경기도      광주시        30         37325993096  <NA>
2      2013-01  경기도      수원시 권선구  50         38093721586  <NA>
3      2013-01  경기도      이천시        10         125136420    <NA>
4      2013-01  경기도      시흥시        20         11233522102  <NA>
5      2013-01  경기도      하남시        40         18326491756  <NA>
6      2013-01  경기도      하남시        50         14557586121  <NA>
7      2013-01  서울특별시  노원구        40         498549564    <NA>
8      2013-01  서울특별시  서초구        10         160026       <NA>
9      2013-01  충청남도    아산시        40         35060874     <NA>

Last rows

Index  기준년월  시도명      시군구명       연령대코드  사용금액      카드사명
20     2013-01  서울특별시  양천구        10         88148        <NA>
21     2013-01  인천광역시  계양구        40         1743300      <NA>
22     2013-01  경기도      광명시        30         37752528412  <NA>
23     2013-01  경기도      구리시        20         9088220372   <NA>
24     2013-01  경기도      군포시        10         338461444    <NA>
25     2013-01  경기도      남양주시      30         43066007484  <NA>
26     2013-01  경기도      김포시        40         39302827531  <NA>
27     2013-01  경기도      동두천시      50         4345394530   <NA>
28     2013-01  경기도      성남시 분당구  30         93348499623  <NA>
29     2013-01  경기도      성남시 수정구  30         13424682609  <NA>