Overview

Dataset statistics

Number of variables: 6
Number of observations: 30
Missing cells: 30
Missing cells (%): 16.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.6 KiB
Average record size in memory: 54.4 B
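
The missing-cell percentage follows from the counts above: 30 missing cells out of 6 × 30 cells. A minimal sketch of that arithmetic (the variable names are illustrative and not part of the report):

```python
# Figures taken from the dataset statistics above.
n_variables = 6
n_observations = 30
missing_cells = 30

total_cells = n_variables * n_observations        # 180 cells in total
missing_pct = 100 * missing_cells / total_cells   # share of missing cells

print(f"{missing_pct:.1f}%")  # 16.7%
```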

Variable types

Categorical: 3
Text: 1
Numeric: 1
Unsupported: 1

Dataset

Description: Sample data
Author: 경기콘텐츠진흥원 (Gyeonggi Content Agency)
URL: https://www.bigdata-region.kr/#/dataset/846c457c-f299-4e43-a6f2-14e4bde65671

Alerts

기준년월 has constant value "2013-01" (Constant)
시도명 is highly imbalanced (56.1%) (Imbalance)
카드사명 has 30 (100.0%) missing values (Missing)
사용금액 has unique values (Unique)
카드사명 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)
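
The 56.1% imbalance figure for 시도명 is consistent with a score of one minus the normalized Shannon entropy of the class frequencies. That this is exactly how ydata-profiling computes it is an assumption; the sketch below only shows that the formula reproduces the reported number from the counts 25, 3, 1, 1. The function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class frequencies and k is the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class is scored as 0 here
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Counts of 시도명 taken from its Common Values table: 25, 3, 1, 1.
score = imbalance_score([25, 3, 1, 1])
print(f"{score:.1%}")  # 56.1%
```

A perfectly balanced variable such as 성별코드 (15 and 15) scores 0 under this definition, which fits the absence of an imbalance alert for it.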

Reproduction

Analysis started: 2023-12-10 14:14:32.725919
Analysis finished: 2023-12-10 14:14:33.556469
Duration: 0.83 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
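
The reported duration is the difference between the two timestamps. A small sketch using only the standard library to check it:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-10 14:14:32.725919")
finished = datetime.fromisoformat("2023-12-10 14:14:33.556469")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 0.83 seconds
```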

Variables

기준년월 (reference year-month)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B
Values: 2013-01 (30)

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2013-01
2nd row: 2013-01
3rd row: 2013-01
4th row: 2013-01
5th row: 2013-01

Common Values

Value | Count | Frequency (%)
2013-01 | 30 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

시도명 (province name)
Categorical

IMBALANCE

Distinct: 4
Distinct (%): 13.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B
Values: 경기도 (25), 서울특별시 (3), 인천광역시 (1), 충청북도 (1)

Length

Max length: 5
Median length: 3
Mean length: 3.3
Min length: 3

Unique

Unique: 2
Unique (%): 6.7%

Sample

1st row: 경기도
2nd row: 경기도
3rd row: 경기도
4th row: 경기도
5th row: 경기도

Common Values

Value | Count | Frequency (%)
경기도 | 25 | 83.3%
서울특별시 | 3 | 10.0%
인천광역시 | 1 | 3.3%
충청북도 | 1 | 3.3%
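
The Distinct and Unique figures for 시도명 measure different things. Distinct counts how many different values occur, while Unique counts the values that occur exactly once. A minimal sketch that recovers both from the value counts in the table above:

```python
from collections import Counter

# Value counts for 시도명, copied from its Common Values table.
counts = Counter({"경기도": 25, "서울특별시": 3, "인천광역시": 1, "충청북도": 1})
n_rows = sum(counts.values())  # 30 observations

distinct = len(counts)                              # number of different values
unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

print(distinct, f"{distinct / n_rows:.1%}")  # 4 13.3%
print(unique, f"{unique / n_rows:.1%}")      # 2 6.7%
```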

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

시군구명 (city/county/district name)
Text

Distinct: 29
Distinct (%): 96.7%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 7
Median length: 3
Mean length: 4.3666667
Min length: 2

Characters and Unicode

Total characters: 131
Distinct characters: 51
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
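
As an illustration of those character properties, the standard-library `unicodedata` module reports the general category of each code point. The sketch below applies it to the five sample values listed for this variable. It covers only those five rows, so its counts are smaller than the report's totals. `Lo` is the code for Other Letter and `Zs` for Space Separator.

```python
import unicodedata
from collections import Counter

# The five sample values shown for this variable.
samples = ["성남시 중원구", "부천시", "용인시 수지구", "의정부시", "의왕시"]

# Tally the Unicode general category of every character.
categories = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)

for code, count in categories.most_common():
    print(code, count)
```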

Unique

Unique: 28
Unique (%): 93.3%

Sample

1st row: 성남시 중원구
2nd row: 부천시
3rd row: 용인시 수지구
4th row: 의정부시
5th row: 의왕시

Value | Count | Frequency (%)
성남시 | 3 | 7.5%
고양시 | 2 | 5.0%
덕양구 | 2 | 5.0%
용인시 | 2 | 5.0%
권선구 | 1 | 2.5%
안산시 | 1 | 2.5%
상록구 | 1 | 2.5%
안성시 | 1 | 2.5%
구로구 | 1 | 2.5%
도봉구 | 1 | 2.5%
Other values (25) | 25 | 62.5%

Most occurring characters

Value | Count | Frequency (%)
 | 23 | 17.6%
 | 15 | 11.5%
 | 10 | 7.6%
 | 7 | 5.3%
 | 5 | 3.8%
 | 4 | 3.1%
 | 4 | 3.1%
 | 4 | 3.1%
 | 3 | 2.3%
 | 3 | 2.3%
Other values (41) | 53 | 40.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 121 | 92.4%
Space Separator | 10 | 7.6%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 23 | 19.0%
 | 15 | 12.4%
 | 7 | 5.8%
 | 5 | 4.1%
 | 4 | 3.3%
 | 4 | 3.3%
 | 4 | 3.3%
 | 3 | 2.5%
 | 3 | 2.5%
 | 3 | 2.5%
Other values (40) | 50 | 41.3%

Space Separator
Value | Count | Frequency (%)
 | 10 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 121 | 92.4%
Common | 10 | 7.6%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 23 | 19.0%
 | 15 | 12.4%
 | 7 | 5.8%
 | 5 | 4.1%
 | 4 | 3.3%
 | 4 | 3.3%
 | 4 | 3.3%
 | 3 | 2.5%
 | 3 | 2.5%
 | 3 | 2.5%
Other values (40) | 50 | 41.3%

Common
Value | Count | Frequency (%)
 | 10 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 121 | 92.4%
ASCII | 10 | 7.6%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 23 | 19.0%
 | 15 | 12.4%
 | 7 | 5.8%
 | 5 | 4.1%
 | 4 | 3.3%
 | 4 | 3.3%
 | 4 | 3.3%
 | 3 | 2.5%
 | 3 | 2.5%
 | 3 | 2.5%
Other values (40) | 50 | 41.3%

ASCII
Value | Count | Frequency (%)
 | 10 | 100.0%

성별코드 (gender code)
Categorical

Distinct: 2
Distinct (%): 6.7%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B
Values: M (15), F (15)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: M
2nd row: M
3rd row: F
4th row: M
5th row: M

Common Values

Value | Count | Frequency (%)
M | 15 | 50.0%
F | 15 | 50.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

사용금액 (amount spent)
Real number (ℝ)

UNIQUE

Distinct: 30
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.5913868 × 10^10
Minimum: 196680
Maximum: 1.7358655 × 10^11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 196680
5-th percentile: 63865802
Q1: 1.3768742 × 10^10
median: 3.6690115 × 10^10
Q3: 7.0871698 × 10^10
95-th percentile: 1.1283516 × 10^11
Maximum: 1.7358655 × 10^11
Range: 1.7358635 × 10^11
Interquartile range (IQR): 5.7102956 × 10^10
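
The Range and IQR rows are derived from the other quantiles: range is maximum minus minimum, and IQR is Q3 minus Q1. A short sketch that checks both against the rounded values shown above:

```python
# Rounded quantile values copied from the report.
minimum = 196680
maximum = 1.7358655e11
q1 = 1.3768742e10
q3 = 7.0871698e10

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range

print(f"{value_range:.7e}")  # 1.7358635e+11
print(f"{iqr:.7e}")          # 5.7102956e+10
```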

Descriptive statistics

Standard deviation: 4.197119 × 10^10
Coefficient of variation (CV): 0.9141288
Kurtosis: 1.5755
Mean: 4.5913868 × 10^10
Median Absolute Deviation (MAD): 2.9629535 × 10^10
Skewness: 1.1740206
Sum: 1.3774161 × 10^12
Variance: 1.7615808 × 10^21
Monotonicity: Not monotonic
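
Several of these statistics are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. The sketch below checks them using the rounded values from the report, so agreement holds only up to rounding.

```python
# Rounded values copied from the report.
mean = 4.5913868e10
std = 4.197119e10
n = 30  # number of observations

cv = std / mean      # coefficient of variation
variance = std ** 2  # variance is the squared standard deviation
total = mean * n     # sum of all observations

print(f"{cv:.4f}")        # 0.9141
print(f"{variance:.4e}")  # 1.7616e+21
print(f"{total:.4e}")     # 1.3774e+12
```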

[Figure: Histogram with fixed size bins (bins=30)]

Common values

Value | Count | Frequency (%)
50859194158 | 1 | 3.3%
105674787 | 1 | 3.3%
76836034530 | 1 | 3.3%
19433180914 | 1 | 3.3%
34981854640 | 1 | 3.3%
196680 | 1 | 3.3%
25940946727 | 1 | 3.3%
255616950 | 1 | 3.3%
29658450 | 1 | 3.3%
1122055698 | 1 | 3.3%
Other values (20) | 20 | 66.7%

Minimum 10 values

Value | Count | Frequency (%)
196680 | 1 | 3.3%
29658450 | 1 | 3.3%
105674787 | 1 | 3.3%
255616950 | 1 | 3.3%
1122055698 | 1 | 3.3%
7375622612 | 1 | 3.3%
11698942346 | 1 | 3.3%
13437959063 | 1 | 3.3%
14761092504 | 1 | 3.3%
19433180914 | 1 | 3.3%

Maximum 10 values

Value | Count | Frequency (%)
173586549577 | 1 | 3.3%
129903411988 | 1 | 3.3%
91973973116 | 1 | 3.3%
91676394037 | 1 | 3.3%
87067383389 | 1 | 3.3%
80898521424 | 1 | 3.3%
76836034530 | 1 | 3.3%
71025603544 | 1 | 3.3%
70409981997 | 1 | 3.3%
66634693113 | 1 | 3.3%

카드사명 (card company name)
Unsupported

MISSING, REJECTED, UNSUPPORTED

Missing: 30
Missing (%): 100.0%
Memory size: 402.0 B

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

Variable | 시도명 | 시군구명 | 성별코드 | 사용금액
시도명 | 1.000 | 1.000 | 0.000 | 0.000
시군구명 | 1.000 | 1.000 | 0.000 | 0.714
성별코드 | 0.000 | 0.000 | 1.000 | 0.000
사용금액 | 0.000 | 0.714 | 0.000 | 1.000

[Figure: correlation heatmap]

Variable | 시도명 | 성별코드
시도명 | 1.000 | 0.000
성별코드 | 0.000 | 1.000

[Figure: correlation heatmap]

Variable | 사용금액 | 시도명 | 성별코드
사용금액 | 1.000 | 0.000 | 0.000
시도명 | 0.000 | 1.000 | 0.000
성별코드 | 0.000 | 0.000 | 1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly pick out patterns in data completion.]

Sample

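First rows

Index | 기준년월 | 시도명 | 시군구명 | 성별코드 | 사용금액 | 카드사명
0 | 2013-01 | 경기도 | 성남시 중원구 | M | 50859194158 | <NA>
1 | 2013-01 | 경기도 | 부천시 | M | 129903411988 | <NA>
2 | 2013-01 | 경기도 | 용인시 수지구 | F | 46414566357 | <NA>
3 | 2013-01 | 경기도 | 의정부시 | M | 91973973116 | <NA>
4 | 2013-01 | 경기도 | 의왕시 | M | 38398374681 | <NA>
5 | 2013-01 | 경기도 | 하남시 | F | 21433247677 | <NA>
6 | 2013-01 | 경기도 | 가평군 | F | 7375622612 | <NA>
7 | 2013-01 | 경기도 | 고양시 덕양구 | F | 66634693113 | <NA>
8 | 2013-01 | 경기도 | 고양시 덕양구 | M | 91676394037 | <NA>
9 | 2013-01 | 경기도 | 김포시 | F | 44995266971 | <NA>

Last rows

Index | 기준년월 | 시도명 | 시군구명 | 성별코드 | 사용금액 | 카드사명
20 | 2013-01 | 경기도 | 안산시 상록구 | M | 70409981997 | <NA>
21 | 2013-01 | 경기도 | 안성시 | M | 55525352098 | <NA>
22 | 2013-01 | 서울특별시 | 구로구 | M | 1122055698 | <NA>
23 | 2013-01 | 서울특별시 | 도봉구 | M | 29658450 | <NA>
24 | 2013-01 | 인천광역시 | 서구 | F | 255616950 | <NA>
25 | 2013-01 | 경기도 | 성남시 수정구 | M | 25940946727 | <NA>
26 | 2013-01 | 충청북도 | 음성군 | M | 196680 | <NA>
27 | 2013-01 | 경기도 | 안양시 만안구 | F | 34981854640 | <NA>
28 | 2013-01 | 경기도 | 양주시 | F | 19433180914 | <NA>
29 | 2013-01 | 경기도 | 용인시 기흥구 | F | 76836034530 | <NA>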