Overview

Dataset statistics

Number of variables: 6
Number of observations: 56
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.9 KiB
Average record size in memory: 52.4 B

Variable types

Numeric: 2
Categorical: 2
Text: 2

Dataset

Description: Sample data
Author: Shinhan Card (신한카드)
URL: https://bigdata.seoul.go.kr/data/selectSampleData.do?sample_data_seq=51

Alerts

High correlation: 대분류코드(SF_L_UPJONG_CD) is highly overall correlated with 중분류코드(SF_M_UPJONG_CD) and 2 other fields.
High correlation: 중분류코드(SF_M_UPJONG_CD) is highly overall correlated with 대분류코드(SF_L_UPJONG_CD) and 2 other fields.
High correlation: 대분류코드(SF_L_UPJONG_NM) is highly overall correlated with 대분류코드(SF_L_UPJONG_CD) and 2 other fields.
High correlation: 중분류코드(SF_M_UPJONG_NM) is highly overall correlated with 대분류코드(SF_L_UPJONG_CD) and 2 other fields.
Unique: 외국인관광업종코드(SF_UPJONG_CD) has unique values.
Unique: 외국인관광업종분류(SF_UPJONG_NM) has unique values.

Reproduction

Analysis started: 2023-12-10 14:54:30.914793
Analysis finished: 2023-12-10 14:54:32.012333
Duration: 1.1 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

대분류코드(SF_L_UPJONG_CD)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 11
Distinct (%): 19.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.8571429
Minimum: 1
Maximum: 11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 636.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 5
Q3: 7
95-th percentile: 10
Maximum: 11
Range: 10
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 2.9752224
Coefficient of variation (CV): 0.61254578
Kurtosis: -0.92920044
Mean: 4.8571429
Median Absolute Deviation (MAD): 3
Skewness: 0.27263955
Sum: 272
Variance: 8.8519481
Monotonicity: Increasing
Histogram with fixed size bins (bins=11)

Common values

Value | Count | Frequency (%)
1 | 11 | 19.6%
5 | 10 | 17.9%
2 | 6 | 10.7%
6 | 6 | 10.7%
8 | 6 | 10.7%
4 | 4 | 7.1%
3 | 3 | 5.4%
7 | 3 | 5.4%
9 | 3 | 5.4%
10 | 2 | 3.6%

Minimum 10 values

Value | Count | Frequency (%)
1 | 11 | 19.6%
2 | 6 | 10.7%
3 | 3 | 5.4%
4 | 4 | 7.1%
5 | 10 | 17.9%
6 | 6 | 10.7%
7 | 3 | 5.4%
8 | 6 | 10.7%
9 | 3 | 5.4%
10 | 2 | 3.6%

Maximum 10 values

Value | Count | Frequency (%)
11 | 2 | 3.6%
10 | 2 | 3.6%
9 | 3 | 5.4%
8 | 6 | 10.7%
7 | 3 | 5.4%
6 | 6 | 10.7%
5 | 10 | 17.9%
4 | 4 | 7.1%
3 | 3 | 5.4%
2 | 6 | 10.7%
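
The summary figures reported for 대분류코드(SF_L_UPJONG_CD) can be cross-checked against its value counts. The sketch below is not part of the profiling output: it rebuilds the 56 observations from the counts listed in the frequency tables and recomputes a few statistics with Python's standard library. It assumes the report uses the sample (n - 1) variance and linearly interpolated quartiles, which is consistent with the numbers shown.

```python
from statistics import mean, median, quantiles, stdev, variance

# Value -> count, copied from the frequency tables for SF_L_UPJONG_CD.
counts = {1: 11, 2: 6, 3: 3, 4: 4, 5: 10, 6: 6, 7: 3, 8: 6, 9: 3, 10: 2, 11: 2}

# Expand the counts back into the individual observations.
values = sorted(v for v, c in counts.items() for _ in range(c))
n = len(values)

col_mean = mean(values)
col_var = variance(values)          # sample variance (divides by n - 1)
col_std = stdev(values)             # square root of the sample variance
col_median = median(values)
q1, _, q3 = quantiles(values, n=4, method="inclusive")  # linear interpolation
mad = median(abs(v - col_median) for v in values)       # median absolute deviation
cv = col_std / col_mean             # coefficient of variation
freq_of_1 = 100 * counts[1] / n     # Frequency (%) of the value 1

print(f"n={n} sum={sum(values)} mean={col_mean:.7f}")
print(f"variance={col_var:.7f} std={col_std:.7f} cv={cv:.8f}")
print(f"Q1={q1} median={col_median} Q3={q3} IQR={q3 - q1} MAD={mad}")
print(f"frequency of value 1: {freq_of_1:.1f}%")
```

The recomputed values agree with the report to the precision it prints, which suggests the frequency tables and the summary statistics describe the same 56 rows.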

대분류코드(SF_L_UPJONG_NM)
Categorical

HIGH CORRELATION 

Distinct: 11
Distinct (%): 19.6%
Missing: 0
Missing (%): 0.0%
Memory size: 580.0 B

Category | Count
요식/유흥 | 11
스포츠/문화/레저 | 10
유통 | 6
여행/교통 | 6
의료 | 6
Other values (6) | 17

Length

Max length: 9
Median length: 5
Mean length: 4.7321429
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 요식/유흥
2nd row: 요식/유흥
3rd row: 요식/유흥
4th row: 요식/유흥
5th row: 요식/유흥

Common Values

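Value | Count | Frequency (%)
요식/유흥 (Dining/Entertainment) | 11 | 19.6%
스포츠/문화/레저 (Sports/Culture/Leisure) | 10 | 17.9%
유통 (Retail) | 6 | 10.7%
여행/교통 (Travel/Transportation) | 6 | 10.7%
의료 (Medical) | 6 | 10.7%
의류/잡화 (Clothing/Accessories) | 4 | 7.1%
음/식료품 (Food and beverages) | 3 | 5.4%
미용 (Beauty) | 3 | 5.4%
가전/가구 (Home appliances/Furniture) | 3 | 5.4%
자동차 (Automobile) | 2 | 3.6%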

Length

Histogram of lengths of the category

중분류코드(SF_M_UPJONG_CD)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 25
Distinct (%): 44.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 498.73214
Minimum: 101
Maximum: 1125
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 636.0 B

Quantile statistics

Minimum: 101
5-th percentile: 102
Q1: 207
Median: 513
Q3: 718.25
95-th percentile: 1024
Maximum: 1125
Range: 1024
Interquartile range (IQR): 511.25

Descriptive statistics

Standard deviation: 304.33901
Coefficient of variation (CV): 0.61022539
Kurtosis: -0.93444884
Mean: 498.73214
Median Absolute Deviation (MAD): 305.5
Skewness: 0.2672149
Sum: 27929
Variance: 92622.236
Monotonicity: Increasing
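
Several of the figures for 중분류코드(SF_M_UPJONG_CD) are derived from one another, so they can be checked for internal consistency without the raw data. The following is a minimal sketch of those identities using only the numbers printed in this section; the tolerances are an assumption chosen to absorb the rounding in the displayed values.

```python
from math import isclose

# Figures quoted in the report for SF_M_UPJONG_CD.
n, total = 56, 27929
minimum, maximum, value_range = 101, 1125, 1024
q1, q3, iqr = 207, 718.25, 511.25
mean, std, variance, cv = 498.73214, 304.33901, 92622.236, 0.61022539

# Each entry pairs an identity with whether the reported figures satisfy it.
checks = {
    "mean = sum / n": isclose(total / n, mean, rel_tol=1e-7),
    "range = max - min": maximum - minimum == value_range,
    "IQR = Q3 - Q1": isclose(q3 - q1, iqr),
    "variance = std ** 2": isclose(std ** 2, variance, rel_tol=1e-6),
    "CV = std / mean": isclose(std / mean, cv, rel_tol=1e-6),
}

for identity, ok in checks.items():
    print(f"{identity}: {'ok' if ok else 'MISMATCH'}")
```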
Histogram with fixed size bins (bins=25)

Common values

Value | Count | Frequency (%)
513 | 7 | 12.5%
820 | 4 | 7.1%
514 | 3 | 5.4%
923 | 3 | 5.4%
616 | 3 | 5.4%
102 | 3 | 5.4%
412 | 3 | 5.4%
310 | 3 | 5.4%
207 | 3 | 5.4%
105 | 3 | 5.4%
Other values (15) | 21 | 37.5%

Minimum 10 values

Value | Count | Frequency (%)
101 | 1 | 1.8%
102 | 3 | 5.4%
103 | 3 | 5.4%
104 | 1 | 1.8%
105 | 3 | 5.4%
206 | 1 | 1.8%
207 | 3 | 5.4%
208 | 1 | 1.8%
209 | 1 | 1.8%
310 | 3 | 5.4%

Maximum 10 values

Value | Count | Frequency (%)
1125 | 2 | 3.6%
1024 | 2 | 3.6%
923 | 3 | 5.4%
822 | 1 | 1.8%
821 | 1 | 1.8%
820 | 4 | 7.1%
719 | 1 | 1.8%
718 | 2 | 3.6%
617 | 1 | 1.8%
616 | 3 | 5.4%

중분류코드(SF_M_UPJONG_NM)
Categorical

HIGH CORRELATION 

Distinct: 25
Distinct (%): 44.6%
Missing: 0
Missing (%): 0.0%
Memory size: 580.0 B

Category | Count
스포츠/문화/레저 | 7
병원 | 4
제과/커피/패스트푸드 | 3
유흥 | 3
가전/가구 | 3
Other values (20) | 36

Length

Max length: 11
Median length: 9
Mean length: 5.5357143
Min length: 2

Unique

Unique: 10
Unique (%): 17.9%

Sample

1st row: 한식
2nd row: 일식/중식/양식
3rd row: 일식/중식/양식
4th row: 일식/중식/양식
5th row: 제과/커피/패스트푸드

Common Values

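Value | Count | Frequency (%)
스포츠/문화/레저 (Sports/Culture/Leisure) | 7 | 12.5%
병원 (Hospital) | 4 | 7.1%
제과/커피/패스트푸드 (Bakery/Coffee/Fast food) | 3 | 5.4%
유흥 (Entertainment) | 3 | 5.4%
가전/가구 (Home appliances/Furniture) | 3 | 5.4%
할인점/슈퍼마켓 (Discount store/Supermarket) | 3 | 5.4%
음/식료품 (Food and beverages) | 3 | 5.4%
여행 (Travel) | 3 | 5.4%
패션/잡화 (Fashion/Accessories) | 3 | 5.4%
일식/중식/양식 (Japanese/Chinese/Western food) | 3 | 5.4%
Other values (15) | 21 | 37.5%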

Length

Histogram of lengths of the category

외국인관광업종코드(SF_UPJONG_CD)
Text

UNIQUE

Distinct: 56
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 580.0 B

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 448
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
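
As an illustration of those character properties, the sketch below tallies the Unicode general category of each character in the five sample values listed for this variable. It uses Python's standard `unicodedata` module, not the profiler's own implementation, and covers only the sample rows rather than all 56 codes.

```python
import unicodedata
from collections import Counter

# The five sample values shown for SF_UPJONG_CD.
samples = ["sf010101", "sf010202", "sf010203", "sf010204", "sf010305"]

# Tally the Unicode general category of every character.
# "Nd" is Decimal Number and "Ll" is Lowercase Letter.
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)

total = sum(category_counts.values())
for category, count in category_counts.most_common():
    print(f"{category}: {count} ({100 * count / total:.1f}%)")
# Nd: 30 (75.0%)
# Ll: 10 (25.0%)
```

Every code has two letters followed by six digits, so the 75% / 25% split seen in the samples matches the split the report gives for the full column.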

Unique

Unique: 56
Unique (%): 100.0%

Sample

1st row: sf010101
2nd row: sf010202
3rd row: sf010203
4th row: sf010204
5th row: sf010305

Value | Count | Frequency (%)
sf010101 | 1 | 1.8%
sf010202 | 1 | 1.8%
sf071842 | 1 | 1.8%
sf051331 | 1 | 1.8%
sf051432 | 1 | 1.8%
sf051433 | 1 | 1.8%
sf051434 | 1 | 1.8%
sf061535 | 1 | 1.8%
sf061536 | 1 | 1.8%
sf061637 | 1 | 1.8%
Other values (46) | 46 | 82.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 92 | 20.5%
1 | 62 | 13.8%
s | 56 | 12.5%
f | 56 | 12.5%
2 | 42 | 9.4%
3 | 32 | 7.1%
5 | 30 | 6.7%
4 | 26 | 5.8%
6 | 16 | 3.6%
8 | 14 | 3.1%
Other values (2) | 22 | 4.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 336 | 75.0%
Lowercase Letter | 112 | 25.0%

Most frequent character per category

Decimal Number

Value | Count | Frequency (%)
0 | 92 | 27.4%
1 | 62 | 18.5%
2 | 42 | 12.5%
3 | 32 | 9.5%
5 | 30 | 8.9%
4 | 26 | 7.7%
6 | 16 | 4.8%
8 | 14 | 4.2%
7 | 12 | 3.6%
9 | 10 | 3.0%

Lowercase Letter

Value | Count | Frequency (%)
s | 56 | 50.0%
f | 56 | 50.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 336 | 75.0%
Latin | 112 | 25.0%

Most frequent character per script

Common

Value | Count | Frequency (%)
0 | 92 | 27.4%
1 | 62 | 18.5%
2 | 42 | 12.5%
3 | 32 | 9.5%
5 | 30 | 8.9%
4 | 26 | 7.7%
6 | 16 | 4.8%
8 | 14 | 4.2%
7 | 12 | 3.6%
9 | 10 | 3.0%

Latin

Value | Count | Frequency (%)
s | 56 | 50.0%
f | 56 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 448 | 100.0%

Most frequent character per block

ASCII

Value | Count | Frequency (%)
0 | 92 | 20.5%
1 | 62 | 13.8%
s | 56 | 12.5%
f | 56 | 12.5%
2 | 42 | 9.4%
3 | 32 | 7.1%
5 | 30 | 6.7%
4 | 26 | 5.8%
6 | 16 | 3.6%
8 | 14 | 3.1%
Other values (2) | 22 | 4.9%

외국인관광업종분류(SF_UPJONG_NM)
Text

UNIQUE

Distinct: 56
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 580.0 B

Length

Max length: 10
Median length: 7
Mean length: 4.1428571
Min length: 2

Characters and Unicode

Total characters: 232
Distinct characters: 116
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 56
Unique (%): 100.0%

Sample

1st row: 한식
2nd row: 일식
3rd row: 양식
4th row: 중식
5th row: 제과점
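
Value | Count | Frequency (%)
한식 (Korean food) | 1 | 1.8%
일식 (Japanese food) | 1 | 1.8%
미용서비스 (Beauty services) | 1 | 1.8%
서점 (Bookstore) | 1 | 1.8%
스포츠/레저용품 (Sports/Leisure goods) | 1 | 1.8%
문화용품 (Cultural goods) | 1 | 1.8%
화원 (Florist) | 1 | 1.8%
호텔/콘도 (Hotel/Condo) | 1 | 1.8%
모텔/여관/기타숙박 (Motel/Inn/Other lodging) | 1 | 1.8%
여행사 (Travel agency) | 1 | 1.8%
Other values (46) | 46 | 82.1%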

Most occurring characters

Value | Count | Frequency (%)
/ | 14 | 6.0%
 | 9 | 3.9%
 | 7 | 3.0%
 | 7 | 3.0%
 | 7 | 3.0%
 | 7 | 3.0%
 | 6 | 2.6%
 | 5 | 2.2%
 | 5 | 2.2%
 | 5 | 2.2%
Other values (106) | 160 | 69.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 215 | 92.7%
Other Punctuation | 14 | 6.0%
Uppercase Letter | 3 | 1.3%

Most frequent character per category

Other Letter

Value | Count | Frequency (%)
 | 9 | 4.2%
 | 7 | 3.3%
 | 7 | 3.3%
 | 7 | 3.3%
 | 7 | 3.3%
 | 6 | 2.8%
 | 5 | 2.3%
 | 5 | 2.3%
 | 5 | 2.3%
 | 5 | 2.3%
Other values (102) | 152 | 70.7%

Uppercase Letter

Value | Count | Frequency (%)
P | 1 | 33.3%
L | 1 | 33.3%
G | 1 | 33.3%

Other Punctuation

Value | Count | Frequency (%)
/ | 14 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 215 | 92.7%
Common | 14 | 6.0%
Latin | 3 | 1.3%

Most frequent character per script

Hangul

Value | Count | Frequency (%)
 | 9 | 4.2%
 | 7 | 3.3%
 | 7 | 3.3%
 | 7 | 3.3%
 | 7 | 3.3%
 | 6 | 2.8%
 | 5 | 2.3%
 | 5 | 2.3%
 | 5 | 2.3%
 | 5 | 2.3%
Other values (102) | 152 | 70.7%

Latin

Value | Count | Frequency (%)
P | 1 | 33.3%
L | 1 | 33.3%
G | 1 | 33.3%

Common

Value | Count | Frequency (%)
/ | 14 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 215 | 92.7%
ASCII | 17 | 7.3%
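
The block breakdown can be approximated by testing each character's code point against the block ranges. The sketch below is an illustration, not the profiler's code: `block_of` is a hypothetical helper that knows only the two blocks seen in this column, and it assumes the report's "Hangul" label refers to the Hangul Syllables block (U+AC00 to U+D7AF). The inputs are a few values of 외국인관광업종분류(SF_UPJONG_NM) that appear in this report.

```python
import unicodedata
from collections import Counter

# A few values of SF_UPJONG_NM that appear in this report.
samples = ["한식", "호텔/콘도", "스포츠/레저용품", "LPG"]


def block_of(ch: str) -> str:
    """Name the Unicode block of a character (only the blocks seen here)."""
    code_point = ord(ch)
    if code_point < 0x80:
        return "ASCII"    # Basic Latin, U+0000 to U+007F
    if 0xAC00 <= code_point <= 0xD7AF:
        return "Hangul"   # Hangul Syllables, U+AC00 to U+D7AF
    return "Other"


# Normalize to NFC first so each Hangul syllable is a single code point.
block_counts = Counter(
    block_of(ch)
    for value in samples
    for ch in unicodedata.normalize("NFC", value)
)

for block, count in block_counts.most_common():
    print(f"{block}: {count}")
```

The slash separators and the Latin letters of "LPG" are the only ASCII characters, which is why the ASCII block accounts for just 17 of the 232 characters in the full column.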

Most frequent character per block

ASCII

Value | Count | Frequency (%)
/ | 14 | 82.4%
P | 1 | 5.9%
L | 1 | 5.9%
G | 1 | 5.9%

Hangul

Value | Count | Frequency (%)
 | 9 | 4.2%
 | 7 | 3.3%
 | 7 | 3.3%
 | 7 | 3.3%
 | 7 | 3.3%
 | 6 | 2.8%
 | 5 | 2.3%
 | 5 | 2.3%
 | 5 | 2.3%
 | 5 | 2.3%
Other values (102) | 152 | 70.7%

Correlations

Variable | 대분류코드(SF_L_UPJONG_CD) | 대분류코드(SF_L_UPJONG_NM) | 중분류코드(SF_M_UPJONG_CD) | 중분류코드(SF_M_UPJONG_NM) | 외국인관광업종코드(SF_UPJONG_CD) | 외국인관광업종분류(SF_UPJONG_NM)
대분류코드(SF_L_UPJONG_CD) | 1.000 | 1.000 | 0.999 | 1.000 | 1.000 | 1.000
대분류코드(SF_L_UPJONG_NM) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
중분류코드(SF_M_UPJONG_CD) | 0.999 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
중분류코드(SF_M_UPJONG_NM) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
외국인관광업종코드(SF_UPJONG_CD) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
외국인관광업종분류(SF_UPJONG_NM) | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Variable | 대분류코드(SF_L_UPJONG_NM) | 중분류코드(SF_M_UPJONG_NM)
대분류코드(SF_L_UPJONG_NM) | 1.000 | 0.830
중분류코드(SF_M_UPJONG_NM) | 0.830 | 1.000

Variable | 대분류코드(SF_L_UPJONG_CD) | 중분류코드(SF_M_UPJONG_CD) | 대분류코드(SF_L_UPJONG_NM) | 중분류코드(SF_M_UPJONG_NM)
대분류코드(SF_L_UPJONG_CD) | 1.000 | 0.993 | 0.989 | 0.821
중분류코드(SF_M_UPJONG_CD) | 0.993 | 1.000 | 0.989 | 0.821
대분류코드(SF_L_UPJONG_NM) | 0.989 | 0.989 | 1.000 | 0.830
중분류코드(SF_M_UPJONG_NM) | 0.821 | 0.821 | 0.830 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.

Sample

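First rows

Row | 대분류코드(SF_L_UPJONG_CD) | 대분류코드(SF_L_UPJONG_NM) | 중분류코드(SF_M_UPJONG_CD) | 중분류코드(SF_M_UPJONG_NM) | 외국인관광업종코드(SF_UPJONG_CD) | 외국인관광업종분류(SF_UPJONG_NM)
0 | 1 | 요식/유흥 | 101 | 한식 | sf010101 | 한식
1 | 1 | 요식/유흥 | 102 | 일식/중식/양식 | sf010202 | 일식
2 | 1 | 요식/유흥 | 102 | 일식/중식/양식 | sf010203 | 양식
3 | 1 | 요식/유흥 | 102 | 일식/중식/양식 | sf010204 | 중식
4 | 1 | 요식/유흥 | 103 | 제과/커피/패스트푸드 | sf010305 | 제과점
5 | 1 | 요식/유흥 | 103 | 제과/커피/패스트푸드 | sf010306 | 커피전문점
6 | 1 | 요식/유흥 | 103 | 제과/커피/패스트푸드 | sf010307 | 패스트푸드
7 | 1 | 요식/유흥 | 104 | 기타요식 | sf010408 | 기타요식
8 | 1 | 요식/유흥 | 105 | 유흥 | sf010509 | 노래방
9 | 1 | 요식/유흥 | 105 | 유흥 | sf010510 | 기타유흥업소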
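
Last rows

Row | 대분류코드(SF_L_UPJONG_CD) | 대분류코드(SF_L_UPJONG_NM) | 중분류코드(SF_M_UPJONG_CD) | 중분류코드(SF_M_UPJONG_NM) | 외국인관광업종코드(SF_UPJONG_CD) | 외국인관광업종분류(SF_UPJONG_NM)
46 | 8 | 의료 | 820 | 병원 | sf082047 | 한의원
47 | 8 | 의료 | 821 | 약국 | sf082148 | 약국
48 | 8 | 의료 | 822 | 기타의료 | sf082249 | 기타의료
49 | 9 | 가전/가구 | 923 | 가전/가구 | sf092350 | 가전
50 | 9 | 가전/가구 | 923 | 가전/가구 | sf092351 | 가구
51 | 9 | 가전/가구 | 923 | 가전/가구 | sf092352 | 기타가전/가구
52 | 10 | 자동차 | 1024 | 자동차서비스/용품 | sf102453 | 자동차서비스
53 | 10 | 자동차 | 1024 | 자동차서비스/용품 | sf102454 | 자동차용품
54 | 11 | 주유 | 1125 | 주유 | sf112555 | 주유소
55 | 11 | 주유 | 1125 | 주유 | sf112556 | LPG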
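
The sample rows suggest why the code columns are so strongly correlated: each 외국인관광업종코드(SF_UPJONG_CD) value appears to embed the other two codes. The sketch below tests that reading on a handful of the rows shown above. The `sfLLMMNN` layout (two-digit large code, two-digit medium suffix, two-digit sequence number) is inferred from the 20 sample rows only, not from any documented specification, and `split_code` and `is_consistent` are hypothetical helper names.

```python
# (large code, medium code, tourism code) copied from the sample rows.
rows = [
    (1, 101, "sf010101"),
    (1, 102, "sf010202"),
    (1, 105, "sf010510"),
    (8, 820, "sf082047"),
    (10, 1024, "sf102453"),
    (11, 1125, "sf112556"),
]


def split_code(code: str) -> tuple[int, int, int]:
    """Split 'sfLLMMNN' into (large code, medium suffix, sequence number)."""
    if len(code) != 8 or not code.startswith("sf"):
        raise ValueError(f"unexpected code format: {code!r}")
    return int(code[2:4]), int(code[4:6]), int(code[6:8])


def is_consistent(large_cd: int, medium_cd: int, code: str) -> bool:
    """Check that the tourism code embeds the large and medium codes."""
    large, medium_suffix, _ = split_code(code)
    # Assumed pattern: medium code = large code * 100 + medium suffix.
    return large == large_cd and large * 100 + medium_suffix == medium_cd


results = [is_consistent(*row) for row in rows]
print(all(results))  # True
```

If the pattern holds for the rows not shown, the large and medium codes are fully determined by the tourism code, which would be consistent with the near-1.0 values in the correlation tables.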