Overview

Dataset statistics

Number of variables: 7
Number of observations: 29
Missing cells: 28
Missing cells (%): 13.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.8 KiB
Average record size in memory: 64.6 B
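
The percentage above follows from the raw counts. A minimal sketch, assuming the missing-cell share is the number of missing cells divided by variables times observations (the variable names are illustrative, not taken from the report):

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 7
n_observations = 29
missing_cells = 28

total_cells = n_variables * n_observations        # 7 * 29 cells in the table
missing_pct = 100 * missing_cells / total_cells   # share of empty cells, in percent

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

All 28 missing cells belong to a single column (결제상품명), so the dataset-level share is much lower than that column's own 96.6%.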

Variable types

DateTime: 1
Numeric: 2
Categorical: 3
Text: 1

Dataset

Description: Sample data
Author: 코나아이㈜ (Kona I Co., Ltd.)
URL: https://www.bigdata-region.kr/#/dataset/29aa2a0a-61e1-4e3e-9ee6-b976d1931f04

Alerts

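일반일간결제일자 (payment date) has constant value "2019-04-27" [Constant]
결제상품명 (payment product name) has constant value "수원페이" [Constant]
결제상품ID (payment product ID) is highly imbalanced (78.4%) [Imbalance]
결제금액 (payment amount) is highly imbalanced (78.4%) [Imbalance]
결제상품명 (payment product name) has 28 (96.6%) missing values [Missing]
가맹점번호 (merchant number) has unique values [Unique]
가맹점우편번호 (merchant postal code) has unique values [Unique]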
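
The 78.4% imbalance figure is consistent with one minus the normalized Shannon entropy of a 28-to-1 class split. The sketch below reproduces the number under that assumption; whether ydata-profiling computes the score in exactly this way is not stated in the report, and `imbalance_score` is a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    n_classes = len(counts)
    if n_classes < 2:
        raise ValueError("need at least two classes")
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)

# 결제상품ID and 결제금액 each have one value with 28 rows and one with 1 row.
score = imbalance_score([28, 1])
print(f"{score:.1%}")
```

Both flagged columns share the same 28-to-1 split, which is why they report the same score.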

Reproduction

Analysis started: 2023-12-10 13:58:11.046089
Analysis finished: 2023-12-10 13:58:12.350029
Duration: 1.3 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
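
The reported duration can be checked from the two timestamps. A small sketch using only the standard library, with the timestamps copied from the block above:

```python
from datetime import datetime

# Timestamps as printed in the "Reproduction" block.
started = datetime.fromisoformat("2023-12-10 13:58:11.046089")
finished = datetime.fromisoformat("2023-12-10 13:58:12.350029")

duration = (finished - started).total_seconds()
print(f"{duration:.1f} seconds")
```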

Variables

일반일간결제일자 (payment date)
DateTime

CONSTANT

Distinct: 1
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Memory size: 364.0 B
Minimum: 2019-04-27 00:00:00
Maximum: 2019-04-27 00:00:00
Histogram with fixed size bins (bins=1)

가맹점번호 (merchant number)
Real number (ℝ)

UNIQUE

Distinct: 29
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.0006589 × 10^8
Minimum: 7.0001262 × 10^8
Maximum: 7.001381 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 393.0 B

Quantile statistics

Minimum: 7.0001262 × 10^8
5-th percentile: 7.00013 × 10^8
Q1: 7.0003461 × 10^8
Median: 7.000581 × 10^8
Q3: 7.0009085 × 10^8
95-th percentile: 7.0013202 × 10^8
Maximum: 7.001381 × 10^8
Range: 125483
Interquartile range (IQR): 56244

Descriptive statistics

Standard deviation: 38926.238
Coefficient of variation (CV): 5.5603677 × 10^-5
Kurtosis: -0.87072397
Mean: 7.0006589 × 10^8
Median Absolute Deviation (MAD): 31378
Skewness: 0.35704776
Sum: 2.0301911 × 10^10
Variance: 1.515252 × 10^9
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=29)
Common values

Value               Count   Frequency (%)
700012617           1       3.4%
700012770           1       3.4%
700138100           1       3.4%
700136117           1       3.4%
700125879           1       3.4%
700124410           1       3.4%
700119749           1       3.4%
700102179           1       3.4%
700092363           1       3.4%
700090851           1       3.4%
Other values (19)   19      65.5%

Minimum 10 values

Value               Count   Frequency (%)
700012617           1       3.4%
700012770           1       3.4%
700013346           1       3.4%
700017559           1       3.4%
700017640           1       3.4%
700025306           1       3.4%
700026724           1       3.4%
700034607           1       3.4%
700042099           1       3.4%
700050394           1       3.4%

Maximum 10 values

Value               Count   Frequency (%)
700138100           1       3.4%
700136117           1       3.4%
700125879           1       3.4%
700124410           1       3.4%
700119749           1       3.4%
700102179           1       3.4%
700092363           1       3.4%
700090851           1       3.4%
700085630           1       3.4%
700085622           1       3.4%
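
Several of the 가맹점번호 statistics are derived from others and can be cross-checked by hand. The sketch below assumes the usual definitions (range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared). The exact Q1 and Q3 are taken to be the 8th smallest and 8th largest listed values, which is where the quartiles fall for 29 observations under linear interpolation; that reading is an assumption, since the report prints the quartiles only in rounded scientific notation.

```python
# Exact values read from the extreme-value tables (assumed to be Q1/Q3, see above).
minimum, maximum = 700_012_617, 700_138_100
q1, q3 = 700_034_607, 700_090_851

# Rounded values as printed under "Descriptive statistics".
std, mean = 38926.238, 7.0006589e8

value_range = maximum - minimum   # reported: 125483
iqr = q3 - q1                     # reported: 56244
cv = std / mean                   # reported: 5.5603677e-05
variance = std ** 2               # reported: 1.515252e+09

print(value_range, iqr, f"{cv:.7e}", f"{variance:.6e}")
```

The very small CV reflects that the merchant numbers are identifiers clustered in a narrow band around 7 × 10^8, so these moments carry little analytical meaning.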

결제상품ID (payment product ID)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): 6.9%
Missing: 0
Missing (%): 0.0%
Memory size: 364.0 B

Value              Count
999999999999999    28
140000126000       1

Length

Max length: 15
Median length: 15
Mean length: 14.896552
Min length: 12

Unique

Unique: 1
Unique (%): 3.4%

Sample

1st row: 999999999999999
2nd row: 999999999999999
3rd row: 999999999999999
4th row: 999999999999999
5th row: 999999999999999

Common Values

Value              Count   Frequency (%)
999999999999999    28      96.6%
140000126000       1       3.4%

Length

Histogram of lengths of the category

가맹점업종명 (merchant business category)
Categorical

Distinct: 12
Distinct (%): 41.4%
Missing: 0
Missing (%): 0.0%
Memory size: 364.0 B

Value               Count
일반휴게음식        11
연료판매점          3
가구                2
유통업 영리         2
의류                2
Other values (7)    9

Length

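Max length: 8
Median length: 6
Mean length: 4.7931034
Min length: 2

Unique

Unique: 5
Unique (%): 17.2%

Sample

1st row: 연료판매점
2nd row: 일반휴게음식
3rd row: 가구
4th row: 레저업소
5th row: 유통업 영리

Common Values

Value                                                 Count   Frequency (%)
일반휴게음식 (general and snack dining)               11      37.9%
연료판매점 (fuel retailer)                            3       10.3%
가구 (furniture)                                      2       6.9%
유통업 영리 (for-profit distribution)                 2       6.9%
의류 (clothing)                                       2       6.9%
음료식품 (beverages and food)                         2       6.9%
보건위생 (health and hygiene)                         2       6.9%
레저업소 (leisure establishments)                     1       3.4%
건축자재 (building materials)                         1       3.4%
자동차정비 유지 (automobile repair and maintenance)   1       3.4%
Other values (2)                                      2       6.9%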

Length

Histogram of lengths of the category

Words

Value               Count   Frequency (%)
일반휴게음식        11      34.4%
연료판매점          3       9.4%
가구                2       6.2%
유통업              2       6.2%
영리                2       6.2%
의류                2       6.2%
음료식품            2       6.2%
보건위생            2       6.2%
레저업소            1       3.1%
건축자재            1       3.1%
Other values (4)    4       12.5%
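
The word-level percentages differ from the category-level ones because two categories contain a space and therefore count as two words each, giving 32 words across 29 rows. A minimal sketch of that tally, assuming words are split on whitespace; the two categories hidden under "Other values (2)" are taken from the sample rows (건강식품 and 의원):

```python
from collections import Counter

# Category counts from the "Common Values" table, plus the two categories
# that appear only in the sample rows.
category_counts = {
    "일반휴게음식": 11, "연료판매점": 3, "가구": 2, "유통업 영리": 2,
    "의류": 2, "음료식품": 2, "보건위생": 2, "레저업소": 1,
    "건축자재": 1, "자동차정비 유지": 1, "건강식품": 1, "의원": 1,
}

# Split each category on whitespace and weight every word by the row count.
word_counts = Counter()
for category, count in category_counts.items():
    for word in category.split():
        word_counts[word] += count

total_rows = sum(category_counts.values())
total_words = sum(word_counts.values())
top_share = 100 * word_counts["일반휴게음식"] / total_words
mean_length = sum(len(c) * n for c, n in category_counts.items()) / total_rows

print(total_rows, total_words, f"{top_share:.1f}%", f"{mean_length:.7f}")
```

The same counts also reproduce the reported mean category length of 4.7931034 characters.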

가맹점우편번호 (merchant postal code)
Real number (ℝ)

UNIQUE

Distinct: 29
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14327.276
Minimum: 10049
Maximum: 18593
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 393.0 B

Quantile statistics

Minimum: 10049
5-th percentile: 10099.4
Q1: 11695
Median: 14533
Q3: 16802
95-th percentile: 17664.2
Maximum: 18593
Range: 8544
Interquartile range (IQR): 5107

Descriptive statistics

Standard deviation: 2682.0772
Coefficient of variation (CV): 0.18720078
Kurtosis: -1.3625402
Mean: 14327.276
Median Absolute Deviation (MAD): 2473
Skewness: -0.18930558
Sum: 415491
Variance: 7193538.2
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=29)
Common values

Value               Count   Frequency (%)
17825               1       3.4%
11161               1       3.4%
10124               1       3.4%
15842               1       3.4%
12756               1       3.4%
17145               1       3.4%
14424               1       3.4%
13642               1       3.4%
13535               1       3.4%
17052               1       3.4%
Other values (19)   19      65.5%

Minimum 10 values

Value               Count   Frequency (%)
10049               1       3.4%
10083               1       3.4%
10124               1       3.4%
10819               1       3.4%
11161               1       3.4%
11626               1       3.4%
11676               1       3.4%
11695               1       3.4%
12271               1       3.4%
12711               1       3.4%

Maximum 10 values

Value               Count   Frequency (%)
18593               1       3.4%
17825               1       3.4%
17423               1       3.4%
17419               1       3.4%
17145               1       3.4%
17052               1       3.4%
17006               1       3.4%
16802               1       3.4%
16409               1       3.4%
16226               1       3.4%
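
The fractional percentiles (10099.4 and 17664.2) are what linear interpolation between neighbouring order statistics yields for 29 values. The sketch below reproduces them from the ten smallest and ten largest postal codes alone; `interpolated_percentile` is a hypothetical helper, and the interpolation rule is an assumption that happens to match the reported figures.

```python
import math

n = 29  # number of observations
smallest = [10049, 10083, 10124, 10819, 11161, 11626, 11676, 11695, 12271, 12711]
largest = [18593, 17825, 17423, 17419, 17145, 17052, 17006, 16802, 16409, 16226]

# Map 0-based rank -> value for the ranks that the extreme-value tables reveal.
order_stats = {i: v for i, v in enumerate(smallest)}
order_stats.update({n - 1 - i: v for i, v in enumerate(largest)})

def interpolated_percentile(q):
    """Linear interpolation between the two order statistics around q * (n - 1)."""
    pos = q * (n - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, n - 1)
    frac = pos - lo
    return order_stats[lo] + frac * (order_stats[hi] - order_stats[lo])

p05 = interpolated_percentile(0.05)
q1 = interpolated_percentile(0.25)
q3 = interpolated_percentile(0.75)
p95 = interpolated_percentile(0.95)
print(round(p05, 1), q1, q3, round(p95, 1), q3 - q1)
```

The median cannot be recomputed this way because rank 14 falls among the nineteen values the report does not list.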

결제상품명 (payment product name)
Text

CONSTANT, MISSING

Distinct: 1
Distinct (%): 100.0%
Missing: 28
Missing (%): 96.6%
Memory size: 364.0 B

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 4
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 수원페이

Value                   Count   Frequency (%)
수원페이 (Suwon Pay)    1       100.0%

Most occurring characters

Value   Count   Frequency (%)
수      1       25.0%
원      1       25.0%
페      1       25.0%
이      1       25.0%

Most occurring categories

Value          Count   Frequency (%)
Other Letter   4       100.0%

Most frequent character per category

Other Letter
Value   Count   Frequency (%)
수      1       25.0%
원      1       25.0%
페      1       25.0%
이      1       25.0%

Most occurring scripts

Value    Count   Frequency (%)
Hangul   4       100.0%

Most frequent character per script

Hangul
Value   Count   Frequency (%)
수      1       25.0%
원      1       25.0%
페      1       25.0%
이      1       25.0%

Most occurring blocks

Value    Count   Frequency (%)
Hangul   4       100.0%

Most frequent character per block

Hangul
Value   Count   Frequency (%)
수      1       25.0%
원      1       25.0%
페      1       25.0%
이      1       25.0%

결제금액 (payment amount)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): 6.9%
Missing: 0
Missing (%): 0.0%
Memory size: 364.0 B

Value    Count
0        28
17100    1

Length

Max length: 5
Median length: 1
Mean length: 1.137931
Min length: 1

Unique

Unique: 1
Unique (%): 3.4%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value    Count   Frequency (%)
0        28      96.6%
17100    1       3.4%

Length

Histogram of lengths of the category


Interactions


Correlations

                 가맹점번호   결제상품ID   가맹점업종명   가맹점우편번호   결제금액
가맹점번호       1.000        0.000        0.000          0.000            0.000
결제상품ID       0.000        1.000        0.522          0.000            0.653
가맹점업종명     0.000        0.522        1.000          0.614            0.522
가맹점우편번호   0.000        0.000        0.614          1.000            0.000
결제금액         0.000        0.653        0.522          0.000            1.000

                 가맹점업종명   결제금액   결제상품ID
가맹점업종명     1.000          0.304      0.304
결제금액         0.304          1.000      0.452
결제상품ID       0.304          0.452      1.000

                 가맹점번호   가맹점우편번호   결제상품ID   가맹점업종명   결제금액
가맹점번호       1.000        0.003            0.000        0.000          0.000
가맹점우편번호   0.003        1.000            0.000        0.117          0.000
결제상품ID       0.000        0.000            1.000        0.304          0.452
가맹점업종명     0.000        0.117            0.304        1.000          0.304
결제금액         0.000        0.000            0.452        0.304          1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

      일반일간결제일자   가맹점번호   결제상품ID        가맹점업종명    가맹점우편번호   결제상품명   결제금액
0     2019-04-27         700012617    999999999999999   연료판매점      17825            <NA>         0
1     2019-04-27         700012770    999999999999999   일반휴게음식    11161            <NA>         0
2     2019-04-27         700013346    999999999999999   가구            16802            <NA>         0
3     2019-04-27         700017559    999999999999999   레저업소        17006            <NA>         0
4     2019-04-27         700017640    999999999999999   유통업 영리     12271            <NA>         0
5     2019-04-27         700025306    999999999999999   건축자재        14533            <NA>         0
6     2019-04-27         700026724    999999999999999   일반휴게음식    15002            <NA>         0
7     2019-04-27         700034607    999999999999999   유통업 영리     10049            <NA>         0
8     2019-04-27         700042099    999999999999999   의류            11695            <NA>         0
9     2019-04-27         700050394    999999999999999   연료판매점      10819            <NA>         0

Last rows

      일반일간결제일자   가맹점번호   결제상품ID        가맹점업종명    가맹점우편번호   결제상품명   결제금액
19    2019-04-27         700085622    999999999999999   일반휴게음식    11626            <NA>         0
20    2019-04-27         700085630    140000126000      음료식품        16226            수원페이     17100
21    2019-04-27         700090851    999999999999999   보건위생        17052            <NA>         0
22    2019-04-27         700092363    999999999999999   건강식품        13535            <NA>         0
23    2019-04-27         700102179    999999999999999   일반휴게음식    13642            <NA>         0
24    2019-04-27         700119749    999999999999999   의원            14424            <NA>         0
25    2019-04-27         700124410    999999999999999   일반휴게음식    17145            <NA>         0
26    2019-04-27         700125879    999999999999999   보건위생        12756            <NA>         0
27    2019-04-27         700136117    999999999999999   일반휴게음식    15842            <NA>         0
28    2019-04-27         700138100    999999999999999   일반휴게음식    10124            <NA>         0