Overview

Dataset statistics

Number of variables: 4
Number of observations: 63
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.1 KiB
Average record size in memory: 34.1 B
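The overview counts above can be approximated with a few lines of plain Python. This is only a sketch: `dataset_statistics` is a hypothetical helper rather than part of the profiling tool, it assumes a missing cell is represented by `None`, and it runs on just the first three rows of the sample.

```python
def dataset_statistics(rows):
    """Summarise a table given as a list of equal-length tuples.

    Assumption: a missing cell is represented by None.
    """
    n_obs = len(rows)
    n_vars = len(rows[0]) if rows else 0
    cells = n_obs * n_vars
    missing = sum(cell is None for row in rows for cell in row)
    # Rows that repeat an earlier row count as duplicates.
    duplicates = n_obs - len(set(rows))
    return {
        "variables": n_vars,
        "observations": n_obs,
        "missing_cells": missing,
        "missing_pct": 100 * missing / cells if cells else 0.0,
        "duplicate_rows": duplicates,
        "duplicate_pct": 100 * duplicates / n_obs if n_obs else 0.0,
    }


# First three rows of the sample shown at the end of this report.
rows = [
    ("요식/유흥", "한식", "한식", "sb001"),
    ("요식/유흥", "일식/중식/양식", "일식", "sb002"),
    ("요식/유흥", "일식/중식/양식", "양식", "sb003"),
]
stats = dataset_statistics(rows)
```

Applied to all 63 rows, the same helper should report zero missing cells and zero duplicate rows, matching the figures above.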

Variable types

Categorical: 2
Text: 2

Dataset

Description: Sample data
Author: Shinhan Card
URL: https://bigdata.seoul.go.kr/data/selectSampleData.do?sample_data_seq=51

Alerts

대분류(SB_L_UPJONG_NM) is highly overall correlated with 중분류(SB_M_UPJONG_NM) [High correlation]
중분류(SB_M_UPJONG_NM) is highly overall correlated with 대분류(SB_L_UPJONG_NM) [High correlation]
내국인업종분류(SB_UPJONG_NM) has unique values [Unique]
내국인업종코드(SB_UPJONG_CD) has unique values [Unique]

Reproduction

Analysis started: 2023-12-10 14:54:25.444471
Analysis finished: 2023-12-10 14:54:26.319131
Duration: 0.87 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
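A report of this kind could be regenerated roughly as follows. This is a sketch under assumptions: the CSV path passed to `build_report` is hypothetical, the report title is a placeholder, and the actual settings used for this run are in the downloadable config.json, which is not reproduced here. The metadata values are copied from the Dataset panel above.

```python
from pathlib import Path

# Metadata shown in the report's "Dataset" panel (values copied from this report).
DATASET_INFO = {
    "description": "샘플 데이터",  # "Sample data"
    "author": "신한카드",          # Shinhan Card
    "url": "https://bigdata.seoul.go.kr/data/selectSampleData.do?sample_data_seq=51",
}


def build_report(csv_path: str, out_path: str = "report.html") -> Path:
    """Profile a CSV file and write the HTML report.

    csv_path is a hypothetical location of the sample data file.
    """
    # Imported lazily so the metadata above can be inspected without
    # pandas or ydata-profiling being installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Profiling Report", dataset=DATASET_INFO)
    profile.to_file(out_path)
    return Path(out_path)
```

Timestamps and duration will differ on every run, so only the statistics are expected to be comparable.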

Variables

대분류(SB_L_UPJONG_NM)
Categorical

HIGH CORRELATION 

Distinct: 13
Distinct (%): 20.6%
Missing: 0
Missing (%): 0.0%
Memory size: 636.0 B

요식/유흥: 11
스포츠/문화/레저: 10
유통: 6
의료: 6
의류/잡화: 4
Other values (8): 26

Length

Max length: 9
Median length: 8
Mean length: 4.9206349
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 요식/유흥
2nd row: 요식/유흥
3rd row: 요식/유흥
4th row: 요식/유흥
5th row: 요식/유흥

Common Values

Value | Count | Frequency (%)
요식/유흥 | 11 | 17.5%
스포츠/문화/레저 | 10 | 15.9%
유통 | 6 | 9.5%
의료 | 6 | 9.5%
의류/잡화 | 4 | 6.3%
여행/교통 | 4 | 6.3%
가정생활/서비스 | 4 | 6.3%
교육/학원 | 4 | 6.3%
음/식료품 | 3 | 4.8%
미용 | 3 | 4.8%
Other values (3) | 8 | 12.7%
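The Frequency (%) column is each count divided by the 63 observations. A minimal sketch of that calculation, assuming a hypothetical helper `frequency_table` and a column rebuilt from the two largest counts only (the remaining 42 rows are lumped under a placeholder label):

```python
from collections import Counter


def frequency_table(values):
    """Return (value, count, percent) rows sorted by descending count."""
    counts = Counter(values)
    total = len(values)
    return [(v, c, round(100 * c / total, 1)) for v, c in counts.most_common()]


# Hypothetical column: the two largest categories from the table above,
# with the other 42 rows collapsed into a placeholder label.
column = ["요식/유흥"] * 11 + ["스포츠/문화/레저"] * 10 + ["(other)"] * 42
rows = frequency_table(column)
```

For example, 11 / 63 rounds to 17.5% and 10 / 63 rounds to 15.9%, as listed above.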

Length

[Figure: Histogram of lengths of the category]

중분류(SB_M_UPJONG_NM)
Categorical

HIGH CORRELATION 

Distinct: 30
Distinct (%): 47.6%
Missing: 0
Missing (%): 0.0%
Memory size: 636.0 B

스포츠/문화/레저: 7
병원: 4
일식/중식/양식: 3
제과/커피/패스트푸드: 3
유흥: 3
Other values (25): 43

Length

Max length: 11
Median length: 8
Mean length: 5.3333333
Min length: 2

Unique

Unique: 13
Unique (%): 20.6%

Sample

1st row: 한식
2nd row: 일식/중식/양식
3rd row: 일식/중식/양식
4th row: 일식/중식/양식
5th row: 제과/커피/패스트푸드

Common Values

Value | Count | Frequency (%)
스포츠/문화/레저 | 7 | 11.1%
병원 | 4 | 6.3%
일식/중식/양식 | 3 | 4.8%
제과/커피/패스트푸드 | 3 | 4.8%
유흥 | 3 | 4.8%
가전/가구 | 3 | 4.8%
할인점/슈퍼마켓 | 3 | 4.8%
음/식료품 | 3 | 4.8%
서비스 | 3 | 4.8%
패션/잡화 | 3 | 4.8%
Other values (20) | 28 | 44.4%

Length

[Figure: Histogram of lengths of the category]

내국인업종분류(SB_UPJONG_NM)
Text

UNIQUE

Distinct: 63
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 636.0 B

Length

Max length: 10
Median length: 7
Mean length: 4.1746032
Min length: 2

Characters and Unicode

Total characters: 263
Distinct characters: 125
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
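The category and block breakdowns in this section can be illustrated with Python's standard `unicodedata` module. This is a sketch, not the profiler's own implementation: `unicode_block` is a hypothetical helper that covers only the two ranges seen in this column, and script names are left out because the standard library does not expose them.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in this example.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Po": "Other Punctuation",
}


def unicode_block(ch):
    """Rough block lookup covering only the two ranges needed here."""
    cp = ord(ch)
    if cp < 0x80:
        return "ASCII"
    if 0xAC00 <= cp <= 0xD7A3:
        return "Hangul"  # precomposed Hangul syllables
    return "Other"


def profile_characters(values):
    """Count characters by general category and by block."""
    chars = [ch for value in values for ch in value]
    categories = Counter(
        CATEGORY_NAMES.get(unicodedata.category(ch), unicodedata.category(ch))
        for ch in chars
    )
    blocks = Counter(unicode_block(ch) for ch in chars)
    return categories, blocks


# Three values that appear in this column.
sample = ["한식", "호텔/콘도", "LPG"]
categories, blocks = profile_characters(sample)
```

For these three values the sketch finds six Other Letter characters, three Uppercase Letter characters and one Other Punctuation character, split into six Hangul and four ASCII code points.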

Unique

Unique: 63
Unique (%): 100.0%

Sample

1st row: 한식
2nd row: 일식
3rd row: 양식
4th row: 중식
5th row: 제과점

Value | Count | Frequency (%)
한식 | 1 | 1.6%
문화용품 | 1 | 1.6%
호텔/콘도 | 1 | 1.6%
모텔/여관/기타숙박 | 1 | 1.6%
여행사 | 1 | 1.6%
면세점 | 1 | 1.6%
미용실 | 1 | 1.6%
미용서비스 | 1 | 1.6%
화장품 | 1 | 1.6%
생활서비스 | 1 | 1.6%
Other values (53) | 53 | 84.1%

Most occurring characters

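Value | Count | Frequency (%)
/ | 14 | 5.3%
(not shown) | 9 | 3.4%
(not shown) | 9 | 3.4%
(not shown) | 7 | 2.7%
(not shown) | 7 | 2.7%
(not shown) | 7 | 2.7%
(not shown) | 6 | 2.3%
(not shown) | 6 | 2.3%
(not shown) | 6 | 2.3%
(not shown) | 6 | 2.3%
Other values (115) | 186 | 70.7%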

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 246 | 93.5%
Other Punctuation | 14 | 5.3%
Uppercase Letter | 3 | 1.1%

Most frequent character per category

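Other Letter
Value | Count | Frequency (%)
(not shown) | 9 | 3.7%
(not shown) | 9 | 3.7%
(not shown) | 7 | 2.8%
(not shown) | 7 | 2.8%
(not shown) | 7 | 2.8%
(not shown) | 6 | 2.4%
(not shown) | 6 | 2.4%
(not shown) | 6 | 2.4%
(not shown) | 6 | 2.4%
(not shown) | 6 | 2.4%
Other values (111) | 177 | 72.0%

Uppercase Letter
Value | Count | Frequency (%)
P | 1 | 33.3%
L | 1 | 33.3%
G | 1 | 33.3%

Other Punctuation
Value | Count | Frequency (%)
/ | 14 | 100.0%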

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 246 | 93.5%
Common | 14 | 5.3%
Latin | 3 | 1.1%

Most frequent character per script

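Hangul
Value | Count | Frequency (%)
(not shown) | 9 | 3.7%
(not shown) | 9 | 3.7%
(not shown) | 7 | 2.8%
(not shown) | 7 | 2.8%
(not shown) | 7 | 2.8%
(not shown) | 6 | 2.4%
(not shown) | 6 | 2.4%
(not shown) | 6 | 2.4%
(not shown) | 6 | 2.4%
(not shown) | 6 | 2.4%
Other values (111) | 177 | 72.0%

Latin
Value | Count | Frequency (%)
P | 1 | 33.3%
L | 1 | 33.3%
G | 1 | 33.3%

Common
Value | Count | Frequency (%)
/ | 14 | 100.0%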

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 246 | 93.5%
ASCII | 17 | 6.5%

Most frequent character per block

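ASCII
Value | Count | Frequency (%)
/ | 14 | 82.4%
P | 1 | 5.9%
L | 1 | 5.9%
G | 1 | 5.9%

Hangul
Value | Count | Frequency (%)
(not shown) | 9 | 3.7%
(not shown) | 9 | 3.7%
(not shown) | 7 | 2.8%
(not shown) | 7 | 2.8%
(not shown) | 7 | 2.8%
(not shown) | 6 | 2.4%
(not shown) | 6 | 2.4%
(not shown) | 6 | 2.4%
(not shown) | 6 | 2.4%
(not shown) | 6 | 2.4%
Other values (111) | 177 | 72.0%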

내국인업종코드(SB_UPJONG_CD)
Text

UNIQUE

Distinct: 63
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 636.0 B

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 315
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 63
Unique (%): 100.0%

Sample

1st row: sb001
2nd row: sb002
3rd row: sb003
4th row: sb004
5th row: sb005

Value | Count | Frequency (%)
sb001 | 1 | 1.6%
sb033 | 1 | 1.6%
sb035 | 1 | 1.6%
sb036 | 1 | 1.6%
sb037 | 1 | 1.6%
sb038 | 1 | 1.6%
sb039 | 1 | 1.6%
sb040 | 1 | 1.6%
sb041 | 1 | 1.6%
sb042 | 1 | 1.6%
Other values (53) | 53 | 84.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 78 | 24.8%
s | 63 | 20.0%
b | 63 | 20.0%
1 | 17 | 5.4%
3 | 17 | 5.4%
2 | 17 | 5.4%
4 | 16 | 5.1%
5 | 16 | 5.1%
6 | 10 | 3.2%
7 | 6 | 1.9%
Other values (2) | 12 | 3.8%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 189 | 60.0%
Lowercase Letter | 126 | 40.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 78 | 41.3%
1 | 17 | 9.0%
3 | 17 | 9.0%
2 | 17 | 9.0%
4 | 16 | 8.5%
5 | 16 | 8.5%
6 | 10 | 5.3%
7 | 6 | 3.2%
8 | 6 | 3.2%
9 | 6 | 3.2%

Lowercase Letter
Value | Count | Frequency (%)
s | 63 | 50.0%
b | 63 | 50.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 189 | 60.0%
Latin | 126 | 40.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 78 | 41.3%
1 | 17 | 9.0%
3 | 17 | 9.0%
2 | 17 | 9.0%
4 | 16 | 8.5%
5 | 16 | 8.5%
6 | 10 | 5.3%
7 | 6 | 3.2%
8 | 6 | 3.2%
9 | 6 | 3.2%

Latin
Value | Count | Frequency (%)
s | 63 | 50.0%
b | 63 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 315 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 78 | 24.8%
s | 63 | 20.0%
b | 63 | 20.0%
1 | 17 | 5.4%
3 | 17 | 5.4%
2 | 17 | 5.4%
4 | 16 | 5.1%
5 | 16 | 5.1%
6 | 10 | 3.2%
7 | 6 | 1.9%
Other values (2) | 12 | 3.8%

Correlations

Variable | 대분류(SB_L_UPJONG_NM) | 중분류(SB_M_UPJONG_NM) | 내국인업종분류(SB_UPJONG_NM) | 내국인업종코드(SB_UPJONG_CD)
대분류(SB_L_UPJONG_NM) | 1.000 | 1.000 | 1.000 | 1.000
중분류(SB_M_UPJONG_NM) | 1.000 | 1.000 | 1.000 | 1.000
내국인업종분류(SB_UPJONG_NM) | 1.000 | 1.000 | 1.000 | 1.000
내국인업종코드(SB_UPJONG_CD) | 1.000 | 1.000 | 1.000 | 1.000

Variable | 중분류(SB_M_UPJONG_NM) | 대분류(SB_L_UPJONG_NM)
중분류(SB_M_UPJONG_NM) | 1.000 | 0.812
대분류(SB_L_UPJONG_NM) | 0.812 | 1.000

Variable | 대분류(SB_L_UPJONG_NM) | 중분류(SB_M_UPJONG_NM)
대분류(SB_L_UPJONG_NM) | 1.000 | 0.812
중분류(SB_M_UPJONG_NM) | 0.812 | 1.000
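The report does not state which statistic produced the 0.812 between the two categorical columns. A common choice for categorical pairs is Cramér's V, and the sketch below assumes that measure in its plain, uncorrected form. `cramers_v` is a hypothetical helper, and a profiler may apply a bias correction, so this is not guaranteed to reproduce 0.812 on the full data.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    pair_counts = Counter(zip(xs, ys))
    x_counts, y_counts = Counter(xs), Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, nx in x_counts.items():
        for y, ny in y_counts.items():
            expected = nx * ny / n
            observed = pair_counts.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(x_counts), len(y_counts)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Toy inputs: a perfectly associated pair and an independent pair.
perfect = cramers_v(["a", "a", "b", "b"], ["x", "x", "y", "y"])
independent = cramers_v(["a", "a", "b", "b"], ["x", "y", "x", "y"])
```

The statistic ranges from 0 (no association) to 1 (one variable determines the other), which is consistent with a high value for a sub-category column nested under a top-level category column.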

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

Row | 대분류(SB_L_UPJONG_NM) | 중분류(SB_M_UPJONG_NM) | 내국인업종분류(SB_UPJONG_NM) | 내국인업종코드(SB_UPJONG_CD)
0 | 요식/유흥 | 한식 | 한식 | sb001
1 | 요식/유흥 | 일식/중식/양식 | 일식 | sb002
2 | 요식/유흥 | 일식/중식/양식 | 양식 | sb003
3 | 요식/유흥 | 일식/중식/양식 | 중식 | sb004
4 | 요식/유흥 | 제과/커피/패스트푸드 | 제과점 | sb005
5 | 요식/유흥 | 제과/커피/패스트푸드 | 커피전문점 | sb006
6 | 요식/유흥 | 제과/커피/패스트푸드 | 패스트푸드 | sb007
7 | 요식/유흥 | 기타요식 | 기타요식 | sb008
8 | 요식/유흥 | 유흥 | 노래방 | sb009
9 | 요식/유흥 | 유흥 | 기타유흥업소 | sb010

Row | 대분류(SB_L_UPJONG_NM) | 중분류(SB_M_UPJONG_NM) | 내국인업종분류(SB_UPJONG_NM) | 내국인업종코드(SB_UPJONG_CD)
53 | 의료 | 약국 | 약국 | sb054
54 | 의료 | 기타의료 | 기타의료 | sb055
55 | 가전/가구 | 가전/가구 | 가전 | sb056
56 | 가전/가구 | 가전/가구 | 가구 | sb057
57 | 가전/가구 | 가전/가구 | 기타가전/가구 | sb058
58 | 자동차 | 자동차판매 | 자동차판매 | sb059
59 | 자동차 | 자동차서비스/용품 | 자동차서비스 | sb060
60 | 자동차 | 자동차서비스/용품 | 자동차용품 | sb061
61 | 주유 | 주유 | 주유소 | sb062
62 | 주유 | 주유 | LPG | sb063