Overview

Dataset statistics

Number of variables: 4
Number of observations: 25
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 982.0 B
Average record size in memory: 39.3 B
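
The derived figures above follow from the raw counts by simple arithmetic. A minimal sketch, assuming that the missing-cell percentage is taken over all cells (variables times observations) and that the average record size is total memory divided by the number of observations; the variable names are illustrative only:

```python
# Figures copied from the dataset statistics above.
n_variables = 4
n_observations = 25
missing_cells = 0
total_bytes = 982.0

# Assumed definitions: share of missing cells over all cells,
# and total memory divided by the number of rows.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
avg_record_bytes = total_bytes / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```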

Variable types

Numeric: 2
Categorical: 1
Text: 1

Dataset

Description: Sample data
Author: Shinhan Card (신한카드)
URL: https://bigdata.seoul.go.kr/data/selectSampleData.do?sample_data_seq=50

Alerts

대분류코드(UPJONG_L) is highly overall correlated with 중분류코드(UPJONG_M) and 1 other field (High correlation)
중분류코드(UPJONG_M) is highly overall correlated with 대분류코드(UPJONG_L) and 1 other field (High correlation)
대분류코드명(UPJONG_L_NM) is highly overall correlated with 대분류코드(UPJONG_L) and 1 other field (High correlation)
중분류코드(UPJONG_M) has unique values (Unique)
중분류코드명(UPJONG_M_NM) has unique values (Unique)

Reproduction

Analysis started: 2023-12-10 14:52:57.561446
Analysis finished: 2023-12-10 14:52:58.207654
Duration: 0.65 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
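
The reported duration can be checked against the two timestamps. A small sketch using only the Python standard library; the variable names are illustrative:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block above.
started = datetime.fromisoformat("2023-12-10 14:52:57.561446")
finished = datetime.fromisoformat("2023-12-10 14:52:58.207654")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```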

Variables

대분류코드(UPJONG_L)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 11
Distinct (%): 44.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.8
Minimum: 1
Maximum: 11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 357.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 5
Q3: 7
95-th percentile: 9.8
Maximum: 11
Range: 10
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.1358146
Coefficient of variation (CV): 0.65329471
Kurtosis: -1.0971428
Mean: 4.8
Median Absolute Deviation (MAD): 3
Skewness: 0.30667635
Sum: 120
Variance: 9.8333333
Monotonicity: Increasing
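
Most of these statistics can be recomputed from the value counts listed for this variable. The sketch below is one way to do so with the Python standard library; it assumes a sample variance (n - 1 denominator) and linear-interpolation quantiles, conventions that happen to reproduce the reported figures but are not confirmed by the report itself. The `quantile` helper is hypothetical.

```python
import statistics

# Value -> count, taken from the frequency tables for 대분류코드(UPJONG_L).
counts = {1: 5, 2: 4, 3: 1, 4: 2, 5: 2, 6: 3, 7: 2, 8: 3, 9: 1, 10: 1, 11: 1}
values = sorted(v for v, c in counts.items() for _ in range(c))


def quantile(sorted_values, q):
    """Quantile by linear interpolation between neighbours (assumed convention)."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)
cv = std / mean                         # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)
q1, q3 = quantile(values, 0.25), quantile(values, 0.75)
p95 = quantile(values, 0.95)

print(total, mean, round(variance, 7), round(std, 7), round(cv, 8))
print(median, mad, q1, q3, round(p95, 1), q3 - q1)
```

Skewness and kurtosis are left out because their exact estimator (biased or bias-corrected) is not stated in the report.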
Histogram with fixed size bins (bins=11)

Common values
Value  Count  Frequency (%)
1      5      20.0%
2      4      16.0%
6      3      12.0%
8      3      12.0%
4      2      8.0%
5      2      8.0%
7      2      8.0%
3      1      4.0%
9      1      4.0%
10     1      4.0%

Minimum 10 values
Value  Count  Frequency (%)
1      5      20.0%
2      4      16.0%
3      1      4.0%
4      2      8.0%
5      2      8.0%
6      3      12.0%
7      2      8.0%
8      3      12.0%
9      1      4.0%
10     1      4.0%

Maximum 10 values
Value  Count  Frequency (%)
11     1      4.0%
10     1      4.0%
9      1      4.0%
8      3      12.0%
7      2      8.0%
6      3      12.0%
5      2      8.0%
4      2      8.0%
3      1      4.0%
2      4      16.0%

대분류코드명(UPJONG_L_NM)
Categorical

HIGH CORRELATION 

Distinct: 11
Distinct (%): 44.0%
Missing: 0
Missing (%): 0.0%
Memory size: 332.0 B
요식/유흥
유통
여행/교통
의료
의류/잡화
Other values (6)

Length

Max length: 9
Median length: 5
Mean length: 4.04
Min length: 2

Unique

Unique: 4
Unique (%): 16.0%
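
The difference between "Distinct" (number of different values) and "Unique" (values that occur exactly once) can be illustrated from the category counts in this report. A minimal sketch, assuming string length is measured in Unicode code points; the counts are copied from the value tables, with 주유 taken from the last sample row:

```python
from collections import Counter
import statistics

# Category -> count for 대분류코드명(UPJONG_L_NM), as listed in this report.
counts = Counter({
    "요식/유흥": 5, "유통": 4, "여행/교통": 3, "의료": 3,
    "의류/잡화": 2, "스포츠/문화/레저": 2, "미용": 2,
    "음/식료품": 1, "가전/가구": 1, "자동차": 1, "주유": 1,
})

n = sum(counts.values())
distinct = len(counts)                               # different values
unique = sum(1 for c in counts.values() if c == 1)   # values seen exactly once

# One length per observation, so frequent categories weigh more.
lengths = sorted(len(v) for v, c in counts.items() for _ in range(c))

print(f"Distinct: {distinct} ({100 * distinct / n:.1f}%)")
print(f"Unique: {unique} ({100 * unique / n:.1f}%)")
print(f"Length max/median/mean/min: {max(lengths)}, "
      f"{statistics.median(lengths)}, {statistics.mean(lengths):.2f}, {min(lengths)}")
```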

Sample

1st row: 요식/유흥
2nd row: 요식/유흥
3rd row: 요식/유흥
4th row: 요식/유흥
5th row: 요식/유흥

Common Values

Value  Count  Frequency (%)
요식/유흥  5  20.0%
유통  4  16.0%
여행/교통  3  12.0%
의료  3  12.0%
의류/잡화  2  8.0%
스포츠/문화/레저  2  8.0%
미용  2  8.0%
음/식료품  1  4.0%
가전/가구  1  4.0%
자동차  1  4.0%

Length

Histogram of lengths of the category

중분류코드(UPJONG_M)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 25
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 493
Minimum: 101
Maximum: 1125
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 357.0 B

Quantile statistics

Minimum: 101
5-th percentile: 102.2
Q1: 207
Median: 513
Q3: 719
95-th percentile: 1003.8
Maximum: 1125
Range: 1024
Interquartile range (IQR): 512

Descriptive statistics

Standard deviation: 320.82576
Coefficient of variation (CV): 0.65076219
Kurtosis: -1.1047722
Mean: 493
Median Absolute Deviation (MAD): 306
Skewness: 0.30057109
Sum: 12325
Variance: 102929.17
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=25)

Common values
Value              Count  Frequency (%)
101                1      4.0%
102                1      4.0%
1125               1      4.0%
1024               1      4.0%
923                1      4.0%
822                1      4.0%
821                1      4.0%
820                1      4.0%
719                1      4.0%
718                1      4.0%
Other values (15)  15     60.0%

Minimum 10 values
Value  Count  Frequency (%)
101    1      4.0%
102    1      4.0%
103    1      4.0%
104    1      4.0%
105    1      4.0%
206    1      4.0%
207    1      4.0%
208    1      4.0%
209    1      4.0%
310    1      4.0%

Maximum 10 values
Value  Count  Frequency (%)
1125   1      4.0%
1024   1      4.0%
923    1      4.0%
822    1      4.0%
821    1      4.0%
820    1      4.0%
719    1      4.0%
718    1      4.0%
617    1      4.0%
616    1      4.0%

중분류코드명(UPJONG_M_NM)
Text

UNIQUE

Distinct: 25
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 332.0 B

Length

Max length: 11
Median length: 9
Mean length: 4.72
Min length: 2

Characters and Unicode

Total characters: 118
Distinct characters: 65
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
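
As a rough illustration of such character properties, the sketch below classifies the characters of one value from this column with Python's standard `unicodedata` module. The general category comes straight from the Unicode database; the script and block are only approximated here (first word of the character name, and an ASCII range check), because the standard library does not expose those properties directly. This is not necessarily how the report computed its figures.

```python
import unicodedata
from collections import Counter

text = "일식/중식/양식"  # one value from this column's sample rows

# General category, e.g. "Lo" (Other Letter) or "Po" (Other Punctuation).
categories = Counter(unicodedata.category(ch) for ch in text)

# Approximation of the script: first word of the Unicode character name.
name_prefixes = Counter(unicodedata.name(ch).split()[0] for ch in text)

# Approximation of the block: ASCII versus everything else.
ascii_flags = Counter("ASCII" if ord(ch) < 128 else "non-ASCII" for ch in text)

print(dict(categories))
print(dict(name_prefixes))
print(dict(ascii_flags))
```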

Unique

Unique: 25
Unique (%): 100.0%

Sample

1st row: 한식
2nd row: 일식/중식/양식
3rd row: 제과/커피/패스트푸드
4th row: 기타요식
5th row: 유흥

Value  Count  Frequency (%)
한식  1  4.0%
스포츠/문화/레저용품  1  4.0%
자동차서비스/용품  1  4.0%
가전/가구  1  4.0%
기타의료  1  4.0%
약국  1  4.0%
병원  1  4.0%
화장품  1  4.0%
미용서비스  1  4.0%
교통  1  4.0%
Other values (15)  15  60.0%

Most occurring characters

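Value  Count  Frequency (%)
/  14  11.9%
(Hangul character not preserved)  6  5.1%
(Hangul character not preserved)  5  4.2%
(Hangul character not preserved)  5  4.2%
(Hangul character not preserved)  4  3.4%
(Hangul character not preserved)  4  3.4%
(Hangul character not preserved)  3  2.5%
(Hangul character not preserved)  3  2.5%
(Hangul character not preserved)  3  2.5%
(Hangul character not preserved)  3  2.5%
Other values (55)  68  57.6%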

Most occurring categories

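Value  Count  Frequency (%)
Other Letter  104  88.1%
Other Punctuation  14  11.9%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
(Hangul character not preserved)  6  5.8%
(Hangul character not preserved)  5  4.8%
(Hangul character not preserved)  5  4.8%
(Hangul character not preserved)  4  3.8%
(Hangul character not preserved)  4  3.8%
(Hangul character not preserved)  3  2.9%
(Hangul character not preserved)  3  2.9%
(Hangul character not preserved)  3  2.9%
(Hangul character not preserved)  3  2.9%
(Hangul character not preserved)  3  2.9%
Other values (54)  65  62.5%

Other Punctuation
Value  Count  Frequency (%)
/  14  100.0%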

Most occurring scripts

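Value  Count  Frequency (%)
Hangul  104  88.1%
Common  14  11.9%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
(Hangul character not preserved)  6  5.8%
(Hangul character not preserved)  5  4.8%
(Hangul character not preserved)  5  4.8%
(Hangul character not preserved)  4  3.8%
(Hangul character not preserved)  4  3.8%
(Hangul character not preserved)  3  2.9%
(Hangul character not preserved)  3  2.9%
(Hangul character not preserved)  3  2.9%
(Hangul character not preserved)  3  2.9%
(Hangul character not preserved)  3  2.9%
Other values (54)  65  62.5%

Common
Value  Count  Frequency (%)
/  14  100.0%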

Most occurring blocks

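Value  Count  Frequency (%)
Hangul  104  88.1%
ASCII  14  11.9%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
/  14  100.0%

Hangul
Value  Count  Frequency (%)
(Hangul character not preserved)  6  5.8%
(Hangul character not preserved)  5  4.8%
(Hangul character not preserved)  5  4.8%
(Hangul character not preserved)  4  3.8%
(Hangul character not preserved)  4  3.8%
(Hangul character not preserved)  3  2.9%
(Hangul character not preserved)  3  2.9%
(Hangul character not preserved)  3  2.9%
(Hangul character not preserved)  3  2.9%
(Hangul character not preserved)  3  2.9%
Other values (54)  65  62.5%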

Interactions

[Figure: four pairwise interaction plots]

Correlations

[Figure: correlation heatmap]
Variable  대분류코드(UPJONG_L)  대분류코드명(UPJONG_L_NM)  중분류코드(UPJONG_M)  중분류코드명(UPJONG_M_NM)
대분류코드(UPJONG_L)  1.000  1.000  0.999  1.000
대분류코드명(UPJONG_L_NM)  1.000  1.000  1.000  1.000
중분류코드(UPJONG_M)  0.999  1.000  1.000  1.000
중분류코드명(UPJONG_M_NM)  1.000  1.000  1.000  1.000

[Figure: correlation heatmap]
Variable  대분류코드(UPJONG_L)  중분류코드(UPJONG_M)  대분류코드명(UPJONG_L_NM)
대분류코드(UPJONG_L)  1.000  0.992  0.966
중분류코드(UPJONG_M)  0.992  1.000  0.966
대분류코드명(UPJONG_L_NM)  0.966  0.966  1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
Index  대분류코드(UPJONG_L)  대분류코드명(UPJONG_L_NM)  중분류코드(UPJONG_M)  중분류코드명(UPJONG_M_NM)
0  1  요식/유흥  101  한식
1  1  요식/유흥  102  일식/중식/양식
2  1  요식/유흥  103  제과/커피/패스트푸드
3  1  요식/유흥  104  기타요식
4  1  요식/유흥  105  유흥
5  2  유통  206  백화점
6  2  유통  207  할인점/슈퍼마켓
7  2  유통  208  편의점
8  2  유통  209  기타유통
9  3  음/식료품  310  음/식료품

Last rows
Index  대분류코드(UPJONG_L)  대분류코드명(UPJONG_L_NM)  중분류코드(UPJONG_M)  중분류코드명(UPJONG_M_NM)
15  6  여행/교통  616  여행
16  6  여행/교통  617  교통
17  7  미용  718  미용서비스
18  7  미용  719  화장품
19  8  의료  820  병원
20  8  의료  821  약국
21  8  의료  822  기타의료
22  9  가전/가구  923  가전/가구
23  10  자동차  1024  자동차서비스/용품
24  11  주유  1125  주유
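
The sample rows suggest why the two code columns are so strongly correlated: in every row shown, 중분류코드(UPJONG_M) appears to be 대분류코드(UPJONG_L) times 100 plus a running sequence number equal to the row index plus one. The sketch below checks that pattern on the 20 visible rows only; the five rows not shown (indices 10 to 14) are not covered, so this is an observation about the sample, not a documented rule of the coding scheme.

```python
# (row index, UPJONG_L, UPJONG_M) from the first and last sample rows.
rows = [
    (0, 1, 101), (1, 1, 102), (2, 1, 103), (3, 1, 104), (4, 1, 105),
    (5, 2, 206), (6, 2, 207), (7, 2, 208), (8, 2, 209), (9, 3, 310),
    (15, 6, 616), (16, 6, 617), (17, 7, 718), (18, 7, 719), (19, 8, 820),
    (20, 8, 821), (21, 8, 822), (22, 9, 923), (23, 10, 1024), (24, 11, 1125),
]

# Leading digits of the middle code match the large code.
prefix_matches = all(m // 100 == l for _, l, m in rows)

# Last two digits of the middle code run from 1 upwards with the row index.
suffix_matches = all(m % 100 == idx + 1 for idx, _, m in rows)

print(prefix_matches, suffix_matches)
```

If the pattern holds for the whole dataset, the middle code determines the large code, which would be consistent with the near-perfect correlation and the strictly increasing monotonicity reported for 중분류코드(UPJONG_M).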