Overview

Dataset statistics

Number of variables: 6
Number of observations: 30
Missing cells: 30
Missing cells (%): 16.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.6 KiB
Average record size in memory: 54.4 B
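
The percentage and size figures above follow from the raw counts. A minimal sketch of that arithmetic, using only numbers reported in this section; the variable names are illustrative, and treating total memory as average record size times row count is an assumption about how the report derives 1.6 KiB:

```python
# Counts reported in the dataset statistics above.
n_variables = 6
n_observations = 30
missing_cells = 30
avg_record_bytes = 54.4

total_cells = n_variables * n_observations        # 6 columns x 30 rows
missing_pct = 100 * missing_cells / total_cells   # share of empty cells

# Assumption: total memory is roughly average record size x number of rows.
total_kib = avg_record_bytes * n_observations / 1024

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
print(f"{total_kib:.1f} KiB")
```

All 30 missing cells come from the single empty column 카드사명, which is why exactly one sixth of the cells are missing.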

Variable types

Categorical: 2
Text: 2
Numeric: 1
Unsupported: 1

Dataset

Description: Sample data
Author: 경기콘텐츠진흥원
URL: https://www.bigdata-region.kr/#/dataset/34da2653-aafa-4095-a4eb-d60bdf6c3e05

Alerts

기준년월 has constant value "2013-01" (Constant)
카드사명 has 30 (100.0%) missing values (Missing)
시군구명 has unique values (Unique)
사용금액 has unique values (Unique)
카드사명 is an unsupported type; check whether it needs cleaning or further analysis (Unsupported)

Reproduction

Analysis started: 2023-12-10 14:16:27.134671
Analysis finished: 2023-12-10 14:16:28.026881
Duration: 0.89 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

기준년월
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B
2013-01: 30

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2013-01
2nd row: 2013-01
3rd row: 2013-01
4th row: 2013-01
5th row: 2013-01

Common Values

Value | Count | Frequency (%)
2013-01 | 30 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

시도명
Categorical

Distinct: 5
Distinct (%): 16.7%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B
경기도: 22
서울특별시: 4
인천광역시: 2
충청북도: 1
충청남도: 1

Length

Max length: 5
Median length: 3
Mean length: 3.4666667
Min length: 3

Unique

Unique: 2
Unique (%): 6.7%

Sample

1st row: 경기도
2nd row: 경기도
3rd row: 경기도
4th row: 서울특별시
5th row: 경기도

Common Values

Value | Count | Frequency (%)
경기도 | 22 | 73.3%
서울특별시 | 4 | 13.3%
인천광역시 | 2 | 6.7%
충청북도 | 1 | 3.3%
충청남도 | 1 | 3.3%
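
The frequency, length, and uniqueness figures for 시도명 can be recomputed from the value counts alone. The sketch below is one way to do that, not the report's own implementation; the dictionary simply restates the Common Values table, and the variable names are illustrative:

```python
# Value counts for 시도명, restated from the Common Values table.
counts = {"경기도": 22, "서울특별시": 4, "인천광역시": 2, "충청북도": 1, "충청남도": 1}

total = sum(counts.values())  # number of rows

# Relative frequency of each value, as a percentage rounded to one decimal.
freq_pct = {value: round(100 * n / total, 1) for value, n in counts.items()}

# Mean string length, weighted by how often each value occurs.
mean_len = sum(len(value) * n for value, n in counts.items()) / total

# Values that occur exactly once (what the report labels "Unique").
n_unique = sum(1 for n in counts.values() if n == 1)

print(freq_pct)
print(round(mean_len, 7), n_unique)
```

Note that "Distinct" (5) counts the different values, while "Unique" (2) counts only the values that appear a single time.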

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

시군구명
Text

UNIQUE 

Distinct: 30
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 8
Median length: 3
Mean length: 4.5
Min length: 2

Characters and Unicode

Total characters: 135
Distinct characters: 52
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
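
As a rough illustration of such character properties, the sketch below tallies the general category of each code point in one sample value using Python's standard `unicodedata` module. This is not how the report itself computes its tables. The standard library exposes no script property, so using the first word of each character name as a stand-in for the script is an assumption made here for illustration only:

```python
import unicodedata
from collections import Counter

# First sample value of 시군구명 in this report.
text = "성남시 수정구"

# General category per code point: "Lo" is Other Letter, "Zs" is Space Separator.
categories = Counter(unicodedata.category(ch) for ch in text)

# Assumption: the leading word of the official character name approximates
# the script, because unicodedata has no script lookup.
name_prefixes = Counter(unicodedata.name(ch).split()[0] for ch in text)

print(categories)
print(name_prefixes)
```

The six Hangul syllables fall under Other Letter and the single space under Space Separator, the same two categories the report lists for this variable.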

Unique

Unique: 30
Unique (%): 100.0%

Sample

1st row: 성남시 수정구
2nd row: 의정부시
3rd row: 고양시 덕양구
4th row: 중랑구
5th row: 용인시 기흥구

Words

Value | Count | Frequency (%)
용인시 | 3 | 7.3%
성남시 | 2 | 4.9%
고양시 | 2 | 4.9%
수원시 | 2 | 4.9%
광주시 | 1 | 2.4%
팔달구 | 1 | 2.4%
도봉구 | 1 | 2.4%
파주시 | 1 | 2.4%
일산서구 | 1 | 2.4%
구리시 | 1 | 2.4%
Other values (26) | 26 | 63.4%

Most occurring characters

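Value | Count | Frequency (%)
(not preserved) | 23 | 17.0%
(not preserved) | 19 | 14.1%
(space) | 11 | 8.1%
(not preserved) | 4 | 3.0%
(not preserved) | 4 | 3.0%
(not preserved) | 4 | 3.0%
(not preserved) | 4 | 3.0%
(not preserved) | 4 | 3.0%
(not preserved) | 4 | 3.0%
(not preserved) | 3 | 2.2%
Other values (42) | 55 | 40.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 124 | 91.9%
Space Separator | 11 | 8.1%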

Most frequent character per category

Other Letter
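Value | Count | Frequency (%)
(not preserved) | 23 | 18.5%
(not preserved) | 19 | 15.3%
(not preserved) | 4 | 3.2%
(not preserved) | 4 | 3.2%
(not preserved) | 4 | 3.2%
(not preserved) | 4 | 3.2%
(not preserved) | 4 | 3.2%
(not preserved) | 4 | 3.2%
(not preserved) | 3 | 2.4%
(not preserved) | 3 | 2.4%
Other values (41) | 52 | 41.9%

Space Separator
Value | Count | Frequency (%)
(space) | 11 | 100.0%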

Most occurring scripts

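Value | Count | Frequency (%)
Hangul | 124 | 91.9%
Common | 11 | 8.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 23 | 18.5%
(not preserved) | 19 | 15.3%
(not preserved) | 4 | 3.2%
(not preserved) | 4 | 3.2%
(not preserved) | 4 | 3.2%
(not preserved) | 4 | 3.2%
(not preserved) | 4 | 3.2%
(not preserved) | 4 | 3.2%
(not preserved) | 3 | 2.4%
(not preserved) | 3 | 2.4%
Other values (41) | 52 | 41.9%

Common
Value | Count | Frequency (%)
(space) | 11 | 100.0%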

Most occurring blocks

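Value | Count | Frequency (%)
Hangul | 124 | 91.9%
ASCII | 11 | 8.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(not preserved) | 23 | 18.5%
(not preserved) | 19 | 15.3%
(not preserved) | 4 | 3.2%
(not preserved) | 4 | 3.2%
(not preserved) | 4 | 3.2%
(not preserved) | 4 | 3.2%
(not preserved) | 4 | 3.2%
(not preserved) | 4 | 3.2%
(not preserved) | 3 | 2.4%
(not preserved) | 3 | 2.4%
Other values (41) | 52 | 41.9%

ASCII
Value | Count | Frequency (%)
(space) | 11 | 100.0%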

업종분류명
Text

Distinct: 18
Distinct (%): 60.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 18
Median length: 13
Mean length: 10.633333
Min length: 6

Characters and Unicode

Total characters: 319
Distinct characters: 70
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8
Unique (%): 26.7%

Sample

1st row: 요식/유흥.일식
2nd row: 요식/유흥.노래방
3rd row: 스포츠/문화/레저.스포츠시설
4th row: 가정생활/서비스.인테리어
5th row: 교육/학원.문구용품

Words

Value | Count | Frequency (%)
요식/유흥.일식 | 3 | 10.0%
유통.기타유통 | 3 | 10.0%
가전/가구.가전 | 2 | 6.7%
교육/학원.문구용품 | 2 | 6.7%
스포츠/문화/레저.서점 | 2 | 6.7%
가정생활/서비스.인테리어 | 2 | 6.7%
스포츠/문화/레저.스포츠/레저용품 | 2 | 6.7%
자동차.자동차경정비 | 2 | 6.7%
스포츠/문화/레저.반려동물관련 | 2 | 6.7%
요식/유흥.노래방 | 2 | 6.7%
Other values (8) | 8 | 26.7%

Most occurring characters

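Value | Count | Frequency (%)
/ | 33 | 10.3%
. | 30 | 9.4%
(not preserved) | 14 | 4.4%
(not preserved) | 12 | 3.8%
(not preserved) | 11 | 3.4%
(not preserved) | 11 | 3.4%
(not preserved) | 10 | 3.1%
(not preserved) | 10 | 3.1%
(not preserved) | 10 | 3.1%
(not preserved) | 10 | 3.1%
Other values (60) | 168 | 52.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 253 | 79.3%
Other Punctuation | 63 | 19.7%
Uppercase Letter | 3 | 0.9%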

Most frequent character per category

Other Letter
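Value | Count | Frequency (%)
(not preserved) | 14 | 5.5%
(not preserved) | 12 | 4.7%
(not preserved) | 11 | 4.3%
(not preserved) | 11 | 4.3%
(not preserved) | 10 | 4.0%
(not preserved) | 10 | 4.0%
(not preserved) | 10 | 4.0%
(not preserved) | 10 | 4.0%
(not preserved) | 9 | 3.6%
(not preserved) | 8 | 3.2%
Other values (55) | 148 | 58.5%

Uppercase Letter
Value | Count | Frequency (%)
G | 1 | 33.3%
P | 1 | 33.3%
L | 1 | 33.3%

Other Punctuation
Value | Count | Frequency (%)
/ | 33 | 52.4%
. | 30 | 47.6%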

Most occurring scripts

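Value | Count | Frequency (%)
Hangul | 253 | 79.3%
Common | 63 | 19.7%
Latin | 3 | 0.9%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 14 | 5.5%
(not preserved) | 12 | 4.7%
(not preserved) | 11 | 4.3%
(not preserved) | 11 | 4.3%
(not preserved) | 10 | 4.0%
(not preserved) | 10 | 4.0%
(not preserved) | 10 | 4.0%
(not preserved) | 10 | 4.0%
(not preserved) | 9 | 3.6%
(not preserved) | 8 | 3.2%
Other values (55) | 148 | 58.5%

Latin
Value | Count | Frequency (%)
G | 1 | 33.3%
P | 1 | 33.3%
L | 1 | 33.3%

Common
Value | Count | Frequency (%)
/ | 33 | 52.4%
. | 30 | 47.6%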

Most occurring blocks

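Value | Count | Frequency (%)
Hangul | 253 | 79.3%
ASCII | 66 | 20.7%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
/ | 33 | 50.0%
. | 30 | 45.5%
G | 1 | 1.5%
P | 1 | 1.5%
L | 1 | 1.5%

Hangul
Value | Count | Frequency (%)
(not preserved) | 14 | 5.5%
(not preserved) | 12 | 4.7%
(not preserved) | 11 | 4.3%
(not preserved) | 11 | 4.3%
(not preserved) | 10 | 4.0%
(not preserved) | 10 | 4.0%
(not preserved) | 10 | 4.0%
(not preserved) | 10 | 4.0%
(not preserved) | 9 | 3.6%
(not preserved) | 8 | 3.2%
Other values (55) | 148 | 58.5%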

사용금액
Real number (ℝ)

UNIQUE 

Distinct: 30
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.3484459 × 10^8
Minimum: 295020
Maximum: 6.0670921 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 295020
5-th percentile: 1663667
Q1: 50147366
Median: 5.1591954 × 10^8
Q3: 8.237487 × 10^8
95-th percentile: 1.8700945 × 10^9
Maximum: 6.0670921 × 10^9
Range: 6.066797 × 10^9
Interquartile range (IQR): 7.7360133 × 10^8

Descriptive statistics

Standard deviation: 1.152162 × 10^9
Coefficient of variation (CV): 1.5678989
Kurtosis: 16.296513
Mean: 7.3484459 × 10^8
Median Absolute Deviation (MAD): 4.5032542 × 10^8
Skewness: 3.6616896
Sum: 2.2045338 × 10^10
Variance: 1.3274773 × 10^18
Monotonicity: Not monotonic
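
Several of the 사용금액 figures are derived from others, so they can be cross-checked with basic arithmetic. The sketch below does this from the rounded values printed in the report, so small differences in the last digits are expected. The variable names are illustrative, and the exact maximum is taken from the report's own value listing:

```python
# Figures reported for 사용금액 (rounded as printed in the report).
minimum = 295020
maximum = 6067092057   # exact maximum; the summary shows it rounded
q1 = 50147366
q3 = 8.237487e8
mean = 7.3484459e8
std = 1.152162e9
n = 30

value_range = maximum - minimum   # reported: 6.066797 x 10^9
iqr = q3 - q1                     # reported: 7.7360133 x 10^8
cv = std / mean                   # reported: 1.5678989
variance = std ** 2               # reported: 1.3274773 x 10^18
total = mean * n                  # reported sum: 2.2045338 x 10^10

print(f"range={value_range:.7g} iqr={iqr:.8g} cv={cv:.7f}")
print(f"variance={variance:.8g} sum={total:.8g}")
```

A coefficient of variation above 1 together with the high skewness is consistent with one very large value, the 6067092057 maximum, dominating the distribution.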

[Figure: Histogram with fixed size bins (bins=30)]

Common Values

Value | Count | Frequency (%)
287208076 | 1 | 3.3%
1130656126 | 1 | 3.3%
295020 | 1 | 3.3%
3044070 | 1 | 3.3%
824729751 | 1 | 3.3%
6067092057 | 1 | 3.3%
216911667 | 1 | 3.3%
656551812 | 1 | 3.3%
2104476 | 1 | 3.3%
508816009 | 1 | 3.3%
Other values (20) | 20 | 66.7%

Minimum 10 values

Value | Count | Frequency (%)
295020 | 1 | 3.3%
1303005 | 1 | 3.3%
2104476 | 1 | 3.3%
2598634 | 1 | 3.3%
3044070 | 1 | 3.3%
8448300 | 1 | 3.3%
24983277 | 1 | 3.3%
34700610 | 1 | 3.3%
96487632 | 1 | 3.3%
101794863 | 1 | 3.3%

Maximum 10 values

Value | Count | Frequency (%)
6067092057 | 1 | 3.3%
1987858567 | 1 | 3.3%
1726160670 | 1 | 3.3%
1713285250 | 1 | 3.3%
1145339754 | 1 | 3.3%
1130656126 | 1 | 3.3%
1061510764 | 1 | 3.3%
824729751 | 1 | 3.3%
820805538 | 1 | 3.3%
711673429 | 1 | 3.3%

카드사명
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 30
Missing (%): 100.0%
Memory size: 402.0 B

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

Variable | 시도명 | 시군구명 | 업종분류명 | 사용금액
시도명 | 1.000 | 1.000 | 0.000 | 0.000
시군구명 | 1.000 | 1.000 | 1.000 | 1.000
업종분류명 | 0.000 | 1.000 | 1.000 | 0.818
사용금액 | 0.000 | 1.000 | 0.818 | 1.000

[Figure: correlation heatmap]

Variable | 사용금액 | 시도명
사용금액 | 1.000 | 0.000
시도명 | 0.000 | 1.000

Missing values

[Figure: nullity count] A simple visualization of nullity by column.

[Figure: nullity matrix] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 기준년월 | 시도명 | 시군구명 | 업종분류명 | 사용금액 | 카드사명
0 | 2013-01 | 경기도 | 성남시 수정구 | 요식/유흥.일식 | 287208076 | <NA>
1 | 2013-01 | 경기도 | 의정부시 | 요식/유흥.노래방 | 1061510764 | <NA>
2 | 2013-01 | 경기도 | 고양시 덕양구 | 스포츠/문화/레저.스포츠시설 | 96487632 | <NA>
3 | 2013-01 | 서울특별시 | 중랑구 | 가정생활/서비스.인테리어 | 2598634 | <NA>
4 | 2013-01 | 경기도 | 용인시 기흥구 | 교육/학원.문구용품 | 1726160670 | <NA>
5 | 2013-01 | 인천광역시 | 부평구 | 스포츠/문화/레저.서점 | 8448300 | <NA>
6 | 2013-01 | 경기도 | 수원시 권선구 | 스포츠/문화/레저.스크린골프 | 101794863 | <NA>
7 | 2013-01 | 경기도 | 여주시 | 가전/가구.가전 | 699108983 | <NA>
8 | 2013-01 | 경기도 | 오산시 | 교육/학원.입시보습학원 | 1713285250 | <NA>
9 | 2013-01 | 경기도 | 용인시 수지구 | 유통.기타유통 | 389877199 | <NA>

Last rows

Index | 기준년월 | 시도명 | 시군구명 | 업종분류명 | 사용금액 | 카드사명
20 | 2013-01 | 경기도 | 성남시 분당구 | 스포츠/문화/레저.스포츠/레저용품 | 1987858567 | <NA>
21 | 2013-01 | 경기도 | 안성시 | 교육/학원.문구용품 | 711673429 | <NA>
22 | 2013-01 | 경기도 | 양주시 | 가전/가구.가전 | 508816009 | <NA>
23 | 2013-01 | 충청남도 | 천안시 서북구 | 스포츠/문화/레저.스포츠/레저용품 | 2104476 | <NA>
24 | 2013-01 | 경기도 | 고양시 일산서구 | 요식/유흥.노래방 | 656551812 | <NA>
25 | 2013-01 | 경기도 | 파주시 | 스포츠/문화/레저.반려동물관련 | 216911667 | <NA>
26 | 2013-01 | 경기도 | 수원시 팔달구 | 주유.LPG | 6067092057 | <NA>
27 | 2013-01 | 경기도 | 광주시 | 요식/유흥.일식 | 824729751 | <NA>
28 | 2013-01 | 서울특별시 | 도봉구 | 유통.기타유통 | 3044070 | <NA>
29 | 2013-01 | 서울특별시 | 서초구 | 음/식료품.기타음/식료품 | 295020 | <NA>