Overview

Dataset statistics

Number of variables: 7
Number of observations: 500
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 28.4 KiB
Average record size in memory: 58.3 B

Variable types

Numeric: 1
Categorical: 5
Text: 1

Dataset

Description: Sample data
Author: 다음소프트 (Daumsoft)
URL: https://bigdata.seoul.go.kr/data/selectSampleData.do?sample_data_seq=57

Alerts

수집소스(SOURCE) has constant value "블로그" (blog) [Constant]
FREQ(FREQ) is highly imbalanced (78.7%) [Imbalance]
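
The 78.7% figure in the imbalance alert can be reproduced from the FREQ(FREQ) counts listed later in this report. The sketch below assumes the score is one minus the Shannon entropy of the class shares divided by the maximum possible entropy for that number of classes. That formula is an assumption that happens to match the reported value, not a confirmed description of the tool's internals, and the function name `imbalance_score` is hypothetical.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (in bits) of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0  # assumption: treat a single-class or empty column as score 0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Counts for FREQ(FREQ) values 1..5 as listed in this report.
freq_counts = [459, 32, 6, 2, 1]
score = imbalance_score(freq_counts)
print(f"{score:.1%}")
```

A perfectly balanced column gives a score of 0 under this definition, and the score approaches 1 as a single value dominates.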

Reproduction

Analysis started: 2023-12-10 14:53:55.655725
Analysis finished: 2023-12-10 14:53:56.456839
Duration: 0.8 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

DOC_DATE(DATE)
Real number (ℝ)

Distinct: 383
Distinct (%): 76.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20180017
Minimum: 20170102
Maximum: 20191227
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 20170102
5th percentile: 20170203
Q1: 20170816
Median: 20180622
Q3: 20190206
95th percentile: 20191008
Maximum: 20191227
Range: 21125
Interquartile range (IQR): 19391

Descriptive statistics

Standard deviation: 8155.1702
Coefficient of variation (CV): 0.00040412108
Kurtosis: -1.4824968
Mean: 20180017
Median Absolute Deviation (MAD): 9705.5
Skewness: 0.10109956
Sum: 1.0090008 × 10^10
Variance: 66506802
Monotonicity: Not monotonic
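
DOC_DATE(DATE) is profiled as a real number, but its values look like dates packed as YYYYMMDD integers, so arithmetic statistics such as Range or Mean are computed on the packed integers rather than on calendar time. The sketch below assumes that YYYYMMDD reading (the report itself does not state the encoding) and compares the reported integer range with the elapsed calendar days. The helper name `parse_yyyymmdd` is hypothetical.

```python
from datetime import datetime


def parse_yyyymmdd(value):
    """Convert an integer such as 20170102 into a datetime.date."""
    return datetime.strptime(str(value), "%Y%m%d").date()


first = parse_yyyymmdd(20170102)  # reported Minimum
last = parse_yyyymmdd(20191227)   # reported Maximum

numeric_range = 20191227 - 20170102  # difference of the packed integers
calendar_span = (last - first).days  # elapsed days between the two dates

print(first, last, numeric_range, calendar_span)
```

Parsing the column as dates before profiling would let the tool treat it as a date/time variable, if that reading of the values is correct.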
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
20180826 | 4 | 0.8%
20191021 | 4 | 0.8%
20170716 | 3 | 0.6%
20190929 | 3 | 0.6%
20190114 | 3 | 0.6%
20180622 | 3 | 0.6%
20170122 | 3 | 0.6%
20180903 | 3 | 0.6%
20170201 | 3 | 0.6%
20170731 | 3 | 0.6%
Other values (373) | 468 | 93.6%

Minimum 10 values
Value | Count | Frequency (%)
20170102 | 1 | 0.2%
20170103 | 1 | 0.2%
20170107 | 1 | 0.2%
20170108 | 2 | 0.4%
20170110 | 1 | 0.2%
20170113 | 2 | 0.4%
20170114 | 1 | 0.2%
20170116 | 2 | 0.4%
20170120 | 2 | 0.4%
20170122 | 3 | 0.6%

Maximum 10 values
Value | Count | Frequency (%)
20191227 | 1 | 0.2%
20191223 | 1 | 0.2%
20191216 | 1 | 0.2%
20191204 | 1 | 0.2%
20191201 | 1 | 0.2%
20191130 | 2 | 0.4%
20191129 | 1 | 0.2%
20191119 | 1 | 0.2%
20191118 | 1 | 0.2%
20191115 | 1 | 0.2%

수집소스(SOURCE)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

블로그 (blog): 500

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 블로그
2nd row: 블로그
3rd row: 블로그
4th row: 블로그
5th row: 블로그

Common Values

Value | Count | Frequency (%)
블로그 | 500 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)


행정동(DONG_NM)
Text

Distinct: 169
Distinct (%): 33.8%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 11
Median length: 9
Mean length: 3.704
Min length: 2

Characters and Unicode

Total characters: 1852
Distinct characters: 179
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
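
As a rough illustration of how such character properties can be tallied, the sketch below uses Python's standard `unicodedata` module to count general categories, where `Lo`, `Ll`, and `Nd` correspond to Other Letter, Lowercase Letter, and Decimal Number. The sample list is hypothetical: three values taken from this report plus one made-up mixed string. The block check covers only the Hangul Syllables range and is a simplification, not how the report itself determines blocks.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories (e.g. 'Lo', 'Ll', 'Nd') over all characters."""
    return Counter(unicodedata.category(ch) for text in values for ch in text)


def in_hangul_syllables(ch):
    """True if ch is an assigned code point of the Hangul Syllables block (U+AC00..U+D7A3)."""
    return 0xAC00 <= ord(ch) <= 0xD7A3


# Hypothetical sample: three values from this report plus one made-up string.
sample = ["을지로", "서울", "서울시립미술관", "k4"]
counts = category_counts(sample)
hangul = sum(in_hangul_syllables(ch) for text in sample for ch in text)
print(counts, hangul)
```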

Unique

Unique: 87
Unique (%): 17.4%

Sample

1st row: 을지로
2nd row: 서울
3rd row: 서울시립미술관
4th row: 서울
5th row: 서울

Value | Count | Frequency (%)
서울 | 42 | 8.4%
디뮤지엄 | 22 | 4.4%
대림미술관 | 20 | 4.0%
예술의전당 | 18 | 3.6%
인사동 | 13 | 2.6%
한가람미술관 | 11 | 2.2%
경복궁 | 10 | 2.0%
한강 | 9 | 1.8%
국립현대미술관 | 9 | 1.8%
홍대 | 9 | 1.8%
Other values (159) | 337 | 67.4%

Most occurring characters

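Value | Count | Frequency (%)
(not preserved) | 100 | 5.4%
(not preserved) | 97 | 5.2%
(not preserved) | 87 | 4.7%
(not preserved) | 87 | 4.7%
(not preserved) | 76 | 4.1%
(not preserved) | 70 | 3.8%
(not preserved) | 66 | 3.6%
(not preserved) | 44 | 2.4%
(not preserved) | 39 | 2.1%
(not preserved) | 30 | 1.6%
Other values (169) | 1156 | 62.4%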

Most occurring categories

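Value | Count | Frequency (%)
Other Letter | 1846 | 99.7%
Lowercase Letter | 5 | 0.3%
Decimal Number | 1 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 100 | 5.4%
(not preserved) | 97 | 5.3%
(not preserved) | 87 | 4.7%
(not preserved) | 87 | 4.7%
(not preserved) | 76 | 4.1%
(not preserved) | 70 | 3.8%
(not preserved) | 66 | 3.6%
(not preserved) | 44 | 2.4%
(not preserved) | 39 | 2.1%
(not preserved) | 30 | 1.6%
Other values (166) | 1150 | 62.3%

Lowercase Letter
Value | Count | Frequency (%)
k | 3 | 60.0%
n | 2 | 40.0%

Decimal Number
Value | Count | Frequency (%)
4 | 1 | 100.0%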

Most occurring scripts

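Value | Count | Frequency (%)
Hangul | 1846 | 99.7%
Latin | 5 | 0.3%
Common | 1 | 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 100 | 5.4%
(not preserved) | 97 | 5.3%
(not preserved) | 87 | 4.7%
(not preserved) | 87 | 4.7%
(not preserved) | 76 | 4.1%
(not preserved) | 70 | 3.8%
(not preserved) | 66 | 3.6%
(not preserved) | 44 | 2.4%
(not preserved) | 39 | 2.1%
(not preserved) | 30 | 1.6%
Other values (166) | 1150 | 62.3%

Latin
Value | Count | Frequency (%)
k | 3 | 60.0%
n | 2 | 40.0%

Common
Value | Count | Frequency (%)
4 | 1 | 100.0%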

Most occurring blocks

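Value | Count | Frequency (%)
Hangul | 1846 | 99.7%
ASCII | 6 | 0.3%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(not preserved) | 100 | 5.4%
(not preserved) | 97 | 5.3%
(not preserved) | 87 | 4.7%
(not preserved) | 87 | 4.7%
(not preserved) | 76 | 4.1%
(not preserved) | 70 | 3.8%
(not preserved) | 66 | 3.6%
(not preserved) | 44 | 2.4%
(not preserved) | 39 | 2.1%
(not preserved) | 30 | 1.6%
Other values (166) | 1150 | 62.3%

ASCII
Value | Count | Frequency (%)
k | 3 | 50.0%
n | 2 | 33.3%
4 | 1 | 16.7%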

행정구(GU_NM)
Categorical

Distinct: 23
Distinct (%): 4.6%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

종로구: 129
용산구: 73
서울: 63
서초구: 49
강남구: 46
Other values (18): 140

Length

Max length: 4
Median length: 3
Mean length: 2.828
Min length: 2

Unique

Unique: 6
Unique (%): 1.2%

Sample

1st row: 종로구
2nd row: 강남구
3rd row: 중구
4th row: 강남구
5th row: 강남구

Common Values

Value | Count | Frequency (%)
종로구 | 129 | 25.8%
용산구 | 73 | 14.6%
서울 | 63 | 12.6%
서초구 | 49 | 9.8%
강남구 | 46 | 9.2%
중구 | 40 | 8.0%
마포구 | 30 | 6.0%
성동구 | 19 | 3.8%
영등포구 | 11 | 2.2%
광진구 | 9 | 1.8%
Other values (13) | 31 | 6.2%

Length

Histogram of lengths of the category

세부견인요소(KEYWORD_DETAIL)
Categorical

Distinct: 42
Distinct (%): 8.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB
작품: 72
작가: 49
전시장: 38
예술: 32
인스타그램: 20
Other values (37): 289

Length

Max length: 5
Median length: 2
Mean length: 2.64
Min length: 2

Unique

Unique: 4
Unique (%): 0.8%

Sample

1st row: 화가
2nd row: 포스터
3rd row: 도슨트
4th row: 예술
5th row: 촬영

Common Values

Value | Count | Frequency (%)
작품 | 72 | 14.4%
작가 | 49 | 9.8%
전시장 | 38 | 7.6%
예술 | 32 | 6.4%
인스타그램 | 20 | 4.0%
화가 | 19 | 3.8%
포토존 | 19 | 3.8%
인테리어 | 18 | 3.6%
셀카 | 17 | 3.4%
촬영 | 17 | 3.4%
Other values (32) | 199 | 39.8%

Length

Histogram of lengths of the category

견인요소(KEYWORD)
Categorical

Distinct: 4
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB
기타 (other): 204
예술성 (artistry): 113
감성사진 (mood photos): 106
건축 (architecture): 77

Length

Max length: 4
Median length: 2
Mean length: 2.65
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 기타
2nd row: 감성사진
3rd row: 건축
4th row: 기타
5th row: 예술성

Common Values

Value | Count | Frequency (%)
기타 | 204 | 40.8%
예술성 | 113 | 22.6%
감성사진 | 106 | 21.2%
건축 | 77 | 15.4%

Length

Histogram of lengths of the category

Common Values (Plot)


FREQ(FREQ)
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

1: 459
2: 32
3: 6
4: 2
5: 1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 0.2%
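
The Distinct and Unique figures measure different things. The sketch below assumes that Distinct counts the different values present and Unique counts the values that occur exactly once, a reading that matches the numbers reported for FREQ(FREQ). It rebuilds the column from the counts listed in this report and recomputes both.

```python
from collections import Counter

# Rebuild the FREQ(FREQ) column from the counts listed in this report.
freq_counts = {"1": 459, "2": 32, "3": 6, "4": 2, "5": 1}
column = [value for value, n in freq_counts.items() for _ in range(n)]

tally = Counter(column)
distinct = len(tally)                              # number of different values
unique = sum(1 for n in tally.values() if n == 1)  # values seen exactly once

print(distinct, f"{distinct / len(column):.1%}")
print(unique, f"{unique / len(column):.1%}")
```

Under this reading, only the value 5 is unique, because it is the only value that appears a single time.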

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 459 | 91.8%
2 | 32 | 6.4%
3 | 6 | 1.2%
4 | 2 | 0.4%
5 | 1 | 0.2%

Length

Histogram of lengths of the category

Common Values (Plot)


Interactions


Correlations

Variable | DOC_DATE(DATE) | 행정구(GU_NM) | 세부견인요소(KEYWORD_DETAIL) | 견인요소(KEYWORD) | FREQ(FREQ)
DOC_DATE(DATE) | 1.000 | 0.000 | 0.000 | 0.000 | 0.000
행정구(GU_NM) | 0.000 | 1.000 | 0.342 | 0.043 | 0.000
세부견인요소(KEYWORD_DETAIL) | 0.000 | 0.342 | 1.000 | 0.000 | 0.000
견인요소(KEYWORD) | 0.000 | 0.043 | 0.000 | 1.000 | 0.000
FREQ(FREQ) | 0.000 | 0.000 | 0.000 | 0.000 | 1.000

Variable | 견인요소(KEYWORD) | 세부견인요소(KEYWORD_DETAIL) | 행정구(GU_NM) | FREQ(FREQ)
견인요소(KEYWORD) | 1.000 | 0.000 | 0.020 | 0.000
세부견인요소(KEYWORD_DETAIL) | 0.000 | 1.000 | 0.084 | 0.000
행정구(GU_NM) | 0.020 | 0.084 | 1.000 | 0.000
FREQ(FREQ) | 0.000 | 0.000 | 0.000 | 1.000

Variable | DOC_DATE(DATE) | 행정구(GU_NM) | 세부견인요소(KEYWORD_DETAIL) | 견인요소(KEYWORD) | FREQ(FREQ)
DOC_DATE(DATE) | 1.000 | 0.000 | 0.000 | 0.000 | 0.000
행정구(GU_NM) | 0.000 | 1.000 | 0.084 | 0.020 | 0.000
세부견인요소(KEYWORD_DETAIL) | 0.000 | 0.084 | 1.000 | 0.000 | 0.000
견인요소(KEYWORD) | 0.000 | 0.020 | 0.000 | 1.000 | 0.000
FREQ(FREQ) | 0.000 | 0.000 | 0.000 | 0.000 | 1.000
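
This copy of the report does not say which coefficient each matrix shows. For pairs of categorical variables, one commonly used association measure is Cramér's V, and the sketch below computes its plain, uncorrected form from two label sequences. Treat it as an illustration of that general technique under stated assumptions: it is not confirmed to be the calculation behind any matrix above, a given tool may apply a bias correction that this version omits, and the function name `cramers_v` is hypothetical.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equal-length sequences of category labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row = Counter(xs)
    col = Counter(ys)
    k = min(len(row), len(col)) - 1
    if n == 0 or k < 1:
        return 0.0  # assumption: report 0 when either variable is constant
    chi2 = 0.0
    for r, r_count in row.items():
        for c, c_count in col.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))


# Toy inputs: fully dependent labels, then independent labels.
dependent = cramers_v(list("aabb"), list("xxyy"))
independent = cramers_v(list("aabb"), list("xyxy"))
print(dependent, independent)
```

Values near 0 indicate little association and values near 1 indicate that one variable largely determines the other.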

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

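First rows
Index | DOC_DATE(DATE) | 수집소스(SOURCE) | 행정동(DONG_NM) | 행정구(GU_NM) | 세부견인요소(KEYWORD_DETAIL) | 견인요소(KEYWORD) | FREQ(FREQ)
0 | 20170717 | 블로그 | 을지로 | 종로구 | 화가 | 기타 | 1
1 | 20170814 | 블로그 | 서울 | 강남구 | 포스터 | 감성사진 | 1
2 | 20190930 | 블로그 | 서울시립미술관 | 중구 | 도슨트 | 건축 | 1
3 | 20180608 | 블로그 | 서울 | 강남구 | 예술 | 기타 | 1
4 | 20191115 | 블로그 | 서울 | 강남구 | 촬영 | 예술성 | 1
5 | 20181201 | 블로그 | 한강 | 서초구 | 예술가 | 예술성 | 1
6 | 20170816 | 블로그 | 서울 | 용산구 | 작품 | 예술성 | 1
7 | 20191118 | 블로그 | 국립현대미술관 | 종로구 | 인스타그램 | 건축 | 1
8 | 20180629 | 블로그 | 올림픽공원 | 서대문구 | 작품 | 기타 | 1
9 | 20190310 | 블로그 | 올림픽공원 | 강남구 | 전시장 | 예술성 | 1

Last rows
Index | DOC_DATE(DATE) | 수집소스(SOURCE) | 행정동(DONG_NM) | 행정구(GU_NM) | 세부견인요소(KEYWORD_DETAIL) | 견인요소(KEYWORD) | FREQ(FREQ)
490 | 20191109 | 블로그 | 한강 | 서울 | 포토존 | 건축 | 1
491 | 20190113 | 블로그 | 예술의전당 | 강남구 | 조명 | 예술성 | 1
492 | 20180622 | 블로그 | 서울 | 종로구 | 작품 | 건축 | 1
493 | 20181023 | 블로그 | 한남동사운즈 | 서초구 | 외관 | 기타 | 1
494 | 20170310 | 블로그 | 서울 | 서초구 | 예술 | 감성사진 | 1
495 | 20170914 | 블로그 | 버티고개 | 종로구 | 전시장 | 감성사진 | 1
496 | 20171221 | 블로그 | 용산 | 중구 | 예술 | 예술성 | 1
497 | 20190116 | 블로그 | 압구정로데오 | 강남구 | 비쥬얼 | 기타 | 1
498 | 20180726 | 블로그 | 이태원 | 종로구 | 인테리어 | 기타 | 2
499 | 20181025 | 블로그 | 이촌역 | 영등포구 | 사진촬영 | 기타 | 1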