Overview

Dataset statistics

Number of variables: 7
Number of observations: 500
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 28.4 KiB
Average record size in memory: 58.3 B

Variable types

Numeric: 2
Categorical: 4
Text: 1

Dataset

Description: Sample data
Author: 다음소프트 (Daumsoft)
URL: https://bigdata.seoul.go.kr/data/selectSampleData.do?sample_data_seq=57

Alerts

수집소스(SOURCE) has constant value "블로그" (Constant)

Reproduction

Analysis started: 2023-12-10 14:54:16.992722
Analysis finished: 2023-12-10 14:54:18.238219
Duration: 1.25 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

DOC_DATE(DATE)
Real number (ℝ)

Distinct: 400
Distinct (%): 80.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20181990
Minimum: 20170101
Maximum: 20191228
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 20170101
5-th percentile: 20170321
Q1: 20171217
Median: 20180870
Q3: 20190504
95-th percentile: 20191106
Maximum: 20191228
Range: 21127
Interquartile range (IQR): 19287

Descriptive statistics

Standard deviation: 7945.5372
Coefficient of variation (CV): 0.00039369444
Kurtosis: -1.3715064
Mean: 20181990
Median Absolute Deviation (MAD): 9636.5
Skewness: -0.24477153
Sum: 1.0090995 × 10^10
Variance: 63131562
Monotonicity: Not monotonic
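
DOC_DATE(DATE) holds dates encoded as YYYYMMDD integers but is profiled as a real number, so arithmetic statistics such as the mean (20181990), the median (20180870, which is not a valid date) and the range (21127) are not calendar quantities. A minimal sketch, assuming every value follows the YYYYMMDD pattern seen in the reported minimum and maximum, of parsing the values into dates before summarising them; the helper name parse_yyyymmdd is hypothetical:

```python
from datetime import date, datetime


def parse_yyyymmdd(value: int) -> date:
    """Convert an integer such as 20170101 into a calendar date.

    Raises ValueError when the digits do not form a real date.
    """
    return datetime.strptime(str(value), "%Y%m%d").date()


first = parse_yyyymmdd(20170101)  # reported minimum
last = parse_yyyymmdd(20191228)   # reported maximum

numeric_range = 20191228 - 20170101   # what the report lists as "Range"
calendar_range = (last - first).days  # elapsed days between the two dates

print("numeric range:", numeric_range)
print("calendar range in days:", calendar_range)
```

Parsing first also makes malformed values visible, because strptime rejects digit strings that are not real dates.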
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value  Count  Frequency (%)
20190415  5  1.0%
20170529  5  1.0%
20180625  3  0.6%
20180511  3  0.6%
20190325  3  0.6%
20180527  3  0.6%
20190801  3  0.6%
20180324  3  0.6%
20180812  3  0.6%
20190506  3  0.6%
Other values (390)  466  93.2%

Minimum 10 values

Value  Count  Frequency (%)
20170101  1  0.2%
20170107  1  0.2%
20170118  1  0.2%
20170119  1  0.2%
20170124  1  0.2%
20170201  1  0.2%
20170202  1  0.2%
20170203  1  0.2%
20170206  1  0.2%
20170210  1  0.2%

Maximum 10 values

Value  Count  Frequency (%)
20191228  1  0.2%
20191222  1  0.2%
20191219  1  0.2%
20191218  2  0.4%
20191216  1  0.2%
20191210  1  0.2%
20191207  1  0.2%
20191203  1  0.2%
20191130  1  0.2%
20191129  1  0.2%

수집소스(SOURCE)
Categorical

Constant

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Value  Count
블로그  500

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 블로그
2nd row: 블로그
3rd row: 블로그
4th row: 블로그
5th row: 블로그

Common Values

Value  Count  Frequency (%)
블로그  500  100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

행정동(DONG_NM)
Text

Distinct: 234
Distinct (%): 46.8%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 9
Median length: 8
Mean length: 3.136
Min length: 2

Characters and Unicode

Total characters: 1568
Distinct characters: 193
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
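
The category counts in this section can be approximated with Python's standard unicodedata module. This is a minimal sketch, not the report's own code: it covers only the general category (the standard library does not expose script or block), the first two strings come from the Sample rows of this variable, and "종로3가" is a hypothetical value added only to show a non-letter category.

```python
import unicodedata
from collections import Counter

# "성수동" and "석촌호수" appear in the report's sample rows;
# "종로3가" is a hypothetical value used for illustration only.
values = ["성수동", "석촌호수", "종로3가"]

# Tally the Unicode general category of every character.
category_counts = Counter(
    unicodedata.category(ch) for text in values for ch in text
)

# "Lo" is Other Letter (Hangul syllables), "Nd" is Decimal Number,
# and "Ll" would be Lowercase Letter.
for code, count in category_counts.most_common():
    print(code, count)
```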

Unique

Unique: 126
Unique (%): 25.2%

Sample

1st row: 성수동
2nd row: 공릉동
3rd row: 석촌호수
4th row: 성수동
5th row: 정릉동

Common Values

Value  Count  Frequency (%)
서울  14  2.8%
홍대  11  2.2%
익선동  11  2.2%
이태원  9  1.8%
성수동  8  1.6%
강남  8  1.6%
한남동  7  1.4%
한강  6  1.2%
중구  6  1.2%
을지로  6  1.2%
Other values (224)  414  82.8%

Most occurring characters

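Value  Count  Frequency (%)
[glyph lost]  133  8.5%
[glyph lost]  81  5.2%
[glyph lost]  57  3.6%
[glyph lost]  48  3.1%
[glyph lost]  39  2.5%
[glyph lost]  37  2.4%
[glyph lost]  33  2.1%
[glyph lost]  33  2.1%
[glyph lost]  32  2.0%
[glyph lost]  26  1.7%
Other values (183)  1049  66.9%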

Most occurring categories

Value  Count  Frequency (%)
Other Letter  1543  98.4%
Lowercase Letter  23  1.5%
Decimal Number  2  0.1%

Most frequent character per category

Other Letter

Value  Count  Frequency (%)
[glyph lost]  133  8.6%
[glyph lost]  81  5.2%
[glyph lost]  57  3.7%
[glyph lost]  48  3.1%
[glyph lost]  39  2.5%
[glyph lost]  37  2.4%
[glyph lost]  33  2.1%
[glyph lost]  33  2.1%
[glyph lost]  32  2.1%
[glyph lost]  26  1.7%
Other values (175)  1024  66.4%

Lowercase Letter

Value  Count  Frequency (%)
c  6  26.1%
v  5  21.7%
g  5  21.7%
n  3  13.0%
d  2  8.7%
i  1  4.3%
f  1  4.3%

Decimal Number

Value  Count  Frequency (%)
3  2  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  1543  98.4%
Latin  23  1.5%
Common  2  0.1%

Most frequent character per script

Hangul

Value  Count  Frequency (%)
[glyph lost]  133  8.6%
[glyph lost]  81  5.2%
[glyph lost]  57  3.7%
[glyph lost]  48  3.1%
[glyph lost]  39  2.5%
[glyph lost]  37  2.4%
[glyph lost]  33  2.1%
[glyph lost]  33  2.1%
[glyph lost]  32  2.1%
[glyph lost]  26  1.7%
Other values (175)  1024  66.4%

Latin

Value  Count  Frequency (%)
c  6  26.1%
v  5  21.7%
g  5  21.7%
n  3  13.0%
d  2  8.7%
i  1  4.3%
f  1  4.3%

Common

Value  Count  Frequency (%)
3  2  100.0%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  1543  98.4%
ASCII  25  1.6%

Most frequent character per block

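Hangul

Value  Count  Frequency (%)
[glyph lost]  133  8.6%
[glyph lost]  81  5.2%
[glyph lost]  57  3.7%
[glyph lost]  48  3.1%
[glyph lost]  39  2.5%
[glyph lost]  37  2.4%
[glyph lost]  33  2.1%
[glyph lost]  33  2.1%
[glyph lost]  32  2.1%
[glyph lost]  26  1.7%
Other values (175)  1024  66.4%

ASCII

Value  Count  Frequency (%)
c  6  24.0%
v  5  20.0%
g  5  20.0%
n  3  12.0%
3  2  8.0%
d  2  8.0%
i  1  4.0%
f  1  4.0%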

행정구(GU_NM)
Categorical

Distinct: 26
Distinct (%): 5.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Value  Count
마포구  81
용산구  64
강남구  62
종로구  61
중구  23
Other values (21)  209

Length

Max length: 4
Median length: 3
Mean length: 2.992
Min length: 2

Unique

Unique: 1
Unique (%): 0.2%
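
The report appears to use "Distinct" for the number of different values in a column and "Unique" for the number of values that occur exactly once, which is why 행정구(GU_NM) can have 26 distinct values but only 1 unique value. A minimal sketch of that distinction under this assumed definition, using the 행정구(GU_NM) cells of the ten first rows listed in the Sample section rather than the full column:

```python
from collections import Counter

# 행정구(GU_NM) values of the first ten sample rows in this report.
gu_values = [
    "종로구", "성동구", "서초구", "종로구", "동대문구",
    "종로구", "양천구", "용산구", "중구", "강남구",
]

counts = Counter(gu_values)
n_distinct = len(counts)                              # different values
n_unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

total = len(gu_values)
print(f"Distinct: {n_distinct} ({n_distinct / total:.1%})")
print(f"Unique: {n_unique} ({n_unique / total:.1%})")
```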

Sample

1st row: 종로구
2nd row: 성동구
3rd row: 서초구
4th row: 종로구
5th row: 동대문구

Common Values

Value  Count  Frequency (%)
마포구  81  16.2%
용산구  64  12.8%
강남구  62  12.4%
종로구  61  12.2%
중구  23  4.6%
성동구  23  4.6%
영등포구  22  4.4%
서울  19  3.8%
송파구  19  3.8%
서초구  18  3.6%
Other values (16)  108  21.6%

Length

[Figure: Histogram of lengths of the category]

세부견인요소(KEYWORD_DETAIL)
Categorical

Distinct: 30
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Value  Count
인스타그램  57
비쥬얼  46
이색메뉴  40
셀카  32
식감  30
Other values (25)  295

Length

Max length: 5
Median length: 4
Mean length: 3.102
Min length: 2

Unique

Unique: 3
Unique (%): 0.6%

Sample

1st row: 비쥬얼
2nd row: 인스타그램
3rd row: 식감
4th row: 비쥬얼
5th row: 인스타그램

Common Values

Value  Count  Frequency (%)
인스타그램  57  11.4%
비쥬얼  46  9.2%
이색메뉴  40  8.0%
셀카  32  6.4%
식감  30  6.0%
존맛  29  5.8%
미식  25  5.0%
핫플레이스  25  5.0%
꿀맛  24  4.8%
셰프  17  3.4%
Other values (20)  175  35.0%

Length

[Figure: Histogram of lengths of the category]

견인요소(KEYWORD)
Categorical

Distinct: 3
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Value  Count
(blank)  211
포토제닉  163
입소문  126

Length

Max length: 4
Median length: 3
Mean length: 2.482
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 포토제닉
2nd row: 입소문
3rd row: 입소문
4th row: 포토제닉
5th row: 포토제닉

Common Values

Value  Count  Frequency (%)
(blank)  211  42.2%
포토제닉  163  32.6%
입소문  126  25.2%
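
The most frequent 견인요소(KEYWORD) value is blank (211 rows, and the minimum length is 1), yet the overview reports 0 missing cells. That suggests whitespace-only strings are counted as ordinary values. A minimal sketch, assuming such blanks should be treated as missing; the helper name blank_to_none is hypothetical, and the list holds the KEYWORD cells of the first ten sample rows with a single space standing in for each blank cell (an assumption about the underlying value):

```python
from typing import Optional


def blank_to_none(value: Optional[str]) -> Optional[str]:
    """Return None for empty or whitespace-only strings, else the stripped text."""
    if value is None or not value.strip():
        return None
    return value.strip()


# KEYWORD cells of the first ten sample rows; " " marks the blank cells.
keyword_cells = [
    "포토제닉", "입소문", "입소문", "포토제닉", "포토제닉",
    "입소문", "입소문", " ", " ", " ",
]

cleaned = [blank_to_none(v) for v in keyword_cells]
n_missing = sum(v is None for v in cleaned)
print(f"Missing after cleaning: {n_missing} of {len(cleaned)}")
```

Applying a step like this before profiling would move the blank rows from the value counts into the missing-value statistics.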

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

FREQ(FREQ)
Real number (ℝ)

Distinct: 11
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.4
Minimum: 1
Maximum: 29
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 1
95-th percentile: 3
Maximum: 29
Range: 28
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.6435945
Coefficient of variation (CV): 1.1739961
Kurtosis: 164.0221
Mean: 1.4
Median Absolute Deviation (MAD): 0
Skewness: 10.877041
Sum: 700
Variance: 2.7014028
Monotonicity: Not monotonic
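
Several of these descriptive statistics can be re-derived from the value counts listed for FREQ(FREQ) in this report (1 occurs 415 times, 2 occurs 51 times, and so on up to a single 29). A minimal sketch using only the Python standard library; it assumes the report uses the sample variance (dividing by n - 1), which is the convention that reproduces the figures shown:

```python
import statistics

# Value -> count, taken from the FREQ(FREQ) frequency tables in this report.
value_counts = {1: 415, 2: 51, 3: 16, 4: 4, 5: 3, 6: 3, 7: 3, 8: 2, 9: 1, 11: 1, 29: 1}

# Expand the table back into one entry per observation.
freq = [value for value, count in value_counts.items() for _ in range(count)]

mean = statistics.mean(freq)
variance = statistics.variance(freq)  # sample variance, divides by n - 1
std = statistics.stdev(freq)
cv = std / mean                       # coefficient of variation
median = statistics.median(freq)
# Median absolute deviation: median distance from the median.
mad = statistics.median(abs(x - median) for x in freq)

print("n =", len(freq), "sum =", sum(freq), "mean =", mean)
print("variance =", round(variance, 7), "std =", round(std, 7), "cv =", round(cv, 7))
print("median =", median, "mad =", mad)
```

A coefficient of variation above 1 together with a median and MAD of 1 and 0 reflects how strongly the column is dominated by the value 1, with a few large counts in the tail.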
[Figure: Histogram with fixed size bins (bins=11)]
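
A histogram with fixed size bins splits the interval from the minimum to the maximum into equally wide bins and counts the observations in each. A minimal sketch for FREQ(FREQ) with 11 bins, built from the value counts listed in this report; the function name is hypothetical, and treating the right edge of the last bin as inclusive is an assumption about how the plot was drawn:

```python
def fixed_width_histogram(counts_by_value, bins):
    """Count observations per equal-width bin spanning [min, max]."""
    lo, hi = min(counts_by_value), max(counts_by_value)
    hist = [0] * bins
    if hi == lo:
        # A constant column: everything lands in the first bin.
        hist[0] = sum(counts_by_value.values())
        return hist
    width = (hi - lo) / bins
    for value, count in counts_by_value.items():
        # The maximum sits on the right edge, so clamp it into the last bin.
        index = min(int((value - lo) / width), bins - 1)
        hist[index] += count
    return hist


# Value -> count, taken from the FREQ(FREQ) frequency tables in this report.
value_counts = {1: 415, 2: 51, 3: 16, 4: 4, 5: 3, 6: 3, 7: 3, 8: 2, 9: 1, 11: 1, 29: 1}
hist = fixed_width_histogram(value_counts, bins=11)
print(hist)
```

With a bin width of 28 / 11, the values 1 to 3 share the first bin and the single 29 sits alone in the last one, which is why the plot is dominated by one bar.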

Common values

Value  Count  Frequency (%)
1  415  83.0%
2  51  10.2%
3  16  3.2%
4  4  0.8%
5  3  0.6%
6  3  0.6%
7  3  0.6%
8  2  0.4%
9  1  0.2%
11  1  0.2%

Minimum 10 values

Value  Count  Frequency (%)
1  415  83.0%
2  51  10.2%
3  16  3.2%
4  4  0.8%
5  3  0.6%
6  3  0.6%
7  3  0.6%
8  2  0.4%
9  1  0.2%
11  1  0.2%

Maximum 10 values

Value  Count  Frequency (%)
29  1  0.2%
11  1  0.2%
9  1  0.2%
8  2  0.4%
7  3  0.6%
6  3  0.6%
5  3  0.6%
4  4  0.8%
3  16  3.2%
2  51  10.2%

Interactions

[Figures: four interaction plots]

Correlations

(variable)  DOC_DATE(DATE)  행정구(GU_NM)  세부견인요소(KEYWORD_DETAIL)  견인요소(KEYWORD)  FREQ(FREQ)
DOC_DATE(DATE)  1.000  0.285  0.000  0.063  0.000
행정구(GU_NM)  0.285  1.000  0.263  0.000  0.000
세부견인요소(KEYWORD_DETAIL)  0.000  0.263  1.000  0.231  0.404
견인요소(KEYWORD)  0.063  0.000  0.231  1.000  0.038
FREQ(FREQ)  0.000  0.000  0.404  0.038  1.000

(variable)  견인요소(KEYWORD)  세부견인요소(KEYWORD_DETAIL)  행정구(GU_NM)
견인요소(KEYWORD)  1.000  0.105  0.000
세부견인요소(KEYWORD_DETAIL)  0.105  1.000  0.065
행정구(GU_NM)  0.000  0.065  1.000

(variable)  DOC_DATE(DATE)  FREQ(FREQ)  행정구(GU_NM)  세부견인요소(KEYWORD_DETAIL)  견인요소(KEYWORD)
DOC_DATE(DATE)  1.000  0.099  0.148  0.000  0.059
FREQ(FREQ)  0.099  1.000  0.000  0.181  0.028
행정구(GU_NM)  0.148  0.000  1.000  0.065  0.000
세부견인요소(KEYWORD_DETAIL)  0.000  0.181  0.065  1.000  0.105
견인요소(KEYWORD)  0.059  0.028  0.000  0.105  1.000
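
The report does not name the coefficient behind each correlation matrix. For a pair of categorical columns, one commonly used association measure is Cramér's V; whether any matrix here was computed with it is an assumption. A minimal sketch of the plain, uncorrected statistic (no bias correction) using only the standard library, with a hypothetical function name and toy labels rather than the report's data:

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equal-length sequences of labels."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("inputs must be non-empty and of equal length")
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    k = min(len(row_totals), len(col_totals)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association
    chi2 = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return sqrt(chi2 / (n * k))


# Toy labels: the first pair is perfectly associated, the second is independent.
print(cramers_v(list("aabb"), list("xxyy")))
print(cramers_v(list("aabb"), list("xyxy")))
```

The value ranges from 0 (no association) to 1 (one column fully determines the other), which matches the scale of the matrices above.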

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows

(index)  DOC_DATE(DATE)  수집소스(SOURCE)  행정동(DONG_NM)  행정구(GU_NM)  세부견인요소(KEYWORD_DETAIL)  견인요소(KEYWORD)  FREQ(FREQ)
0  20190403  블로그  성수동  종로구  비쥬얼  포토제닉  2
1  20190828  블로그  공릉동  성동구  인스타그램  입소문  1
2  20170329  블로그  석촌호수  서초구  식감  입소문  1
3  20191117  블로그  성수동  종로구  비쥬얼  포토제닉  1
4  20191023  블로그  정릉동  동대문구  인스타그램  포토제닉  1
5  20190716  블로그  흑석동  종로구  인스타그램  입소문  2
6  20190612  블로그  장안동  양천구  셀카  입소문  3
7  20190106  블로그  서울  용산구  인스타그램  (blank)  1
8  20170421  블로그  광화문  중구  셰프  (blank)  1
9  20170730  블로그  청담동  강남구  사진촬영  (blank)  1

Last rows

(index)  DOC_DATE(DATE)  수집소스(SOURCE)  행정동(DONG_NM)  행정구(GU_NM)  세부견인요소(KEYWORD_DETAIL)  견인요소(KEYWORD)  FREQ(FREQ)
490  20171217  블로그  연남동  송파구  갬성  (blank)  1
491  20180306  블로그  충무로  용산구  포토존  (blank)  1
492  20180507  블로그  한강  광진구  포토존  (blank)  1
493  20180423  블로그  한남동  종로구  인스타그램  (blank)  1
494  20180126  블로그  송파구  강남구  핫플레이스  입소문  1
495  20181013  블로그  영등포구청역  강남구  시그니처  (blank)  1
496  20180719  블로그  녹사평  종로구  셀카  포토제닉  1
497  20170819  블로그  홍대입구  영등포구  인스타그램  포토제닉  1
498  20180806  블로그  광화문  중구  식감  포토제닉  1
499  20190623  블로그  광진구  용산구  인스타감성  (blank)  1