Overview

Dataset statistics

Number of variables: 9
Number of observations: 100
Missing cells: 187
Missing cells (%): 20.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.6 KiB
Average record size in memory: 77.3 B
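
The missing-cell figures can be cross-checked against the per-column missing counts reported elsewhere in this report (sumry_cn: 100, vlm_co: 87, all other columns: 0). The snippet below is only an illustrative sketch; the variable names are hypothetical and are not part of the report.

```python
# Per-column missing counts as reported in the Alerts and Variables sections.
# Columns not listed here report 0 missing values.
missing_per_column = {"sumry_cn": 100, "vlm_co": 87}

n_variables = 9
n_observations = 100

missing_cells = sum(missing_per_column.values())
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)           # 187
print(f"{missing_pct:.1f}%")   # 20.8%
```

Note that the "<NA>" entries of area_nm are not counted as missing: the report lists that column with 0 missing values, so those entries appear to be stored as ordinary strings.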

Variable types

Numeric: 2
Text: 1
Unsupported: 1
Categorical: 5

Alerts

area_nm is highly overall correlated with regist_de [High correlation]
cltur_cnter_nm is highly overall correlated with data_manage_no and 4 other fields [High correlation]
lwprt_cl_nm is highly overall correlated with data_manage_no and 4 other fields [High correlation]
cl_nm is highly overall correlated with data_manage_no and 4 other fields [High correlation]
regist_de is highly overall correlated with data_manage_no and 5 other fields [High correlation]
data_manage_no is highly overall correlated with cl_nm and 3 other fields [High correlation]
vlm_co is highly overall correlated with cl_nm and 3 other fields [High correlation]
cl_nm is highly imbalanced (67.4%) [Imbalance]
lwprt_cl_nm is highly imbalanced (67.4%) [Imbalance]
cltur_cnter_nm is highly imbalanced (83.5%) [Imbalance]
regist_de is highly imbalanced (75.8%) [Imbalance]
sumry_cn has 100 (100.0%) missing values [Missing]
vlm_co has 87 (87.0%) missing values [Missing]
data_manage_no has unique values [Unique]
sumry_cn is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
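
The report does not state how the imbalance percentages are derived. One formula that reproduces all of the reported values is 1 - H / log2(k), where H is the base-2 Shannon entropy of the class frequencies and k is the number of distinct classes. The sketch below applies it to the class counts listed in the Common Values tables of this report; treat the formula as an inference from the numbers, and `imbalance_score` as a hypothetical helper name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of class counts.

    H is the Shannon entropy (base 2) of the class frequencies and
    k is the number of classes. 0 means perfectly balanced, values
    close to 1 mean one class dominates.
    """
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Class counts taken from the Common Values tables of this report.
cl_nm = [85, 10, 2, 1, 1, 1]
cltur_cnter_nm = [95, 2, 1, 1, 1]
regist_de = [96, 4]

for name, counts in [("cl_nm", cl_nm),
                     ("cltur_cnter_nm", cltur_cnter_nm),
                     ("regist_de", regist_de)]:
    print(f"{name}: {100 * imbalance_score(counts):.1f}%")
```

lwprt_cl_nm has the same count distribution as cl_nm (85, 10, 2, 1, 1, 1), which is why both columns carry the same imbalance figure.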

Reproduction

Analysis started: 2023-12-10 10:11:39.288134
Analysis finished: 2023-12-10 10:11:41.754861
Duration: 2.47 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
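
As a small consistency check, the reported duration follows directly from the two timestamps. This is a minimal sketch using only the Python standard library.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-10 10:11:39.288134")
finished = datetime.fromisoformat("2023-12-10 10:11:41.754861")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```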

Variables

data_manage_no
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 231461.37
Minimum: 1
Maximum: 1369040
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 186106.3
Q1: 186153.5
median: 186197.5
Q3: 186263.25
95-th percentile: 186319.1
Maximum: 1369040
Range: 1369039
Interquartile range (IQR): 109.75

Descriptive statistics

Standard deviation: 234115.6
Coefficient of variation (CV): 1.0114673
Kurtosis: 20.846848
Mean: 231461.37
Median Absolute Deviation (MAD): 57
Skewness: 4.7124027
Sum: 23146137
Variance: 5.4810115 × 10^10
Monotonicity: Not monotonic
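
Several of the figures above are derived from one another: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, the IQR is Q3 minus Q1, and the range is maximum minus minimum. The sketch below recomputes them from the rounded values printed in this report, so small differences in the last digits are expected.

```python
import math

# Rounded values copied from the data_manage_no statistics above.
std = 234115.6
mean = 231461.37
q1, q3 = 186153.5, 186263.25
minimum, maximum = 1, 1369040

cv = std / mean              # coefficient of variation
variance = std ** 2          # variance as squared standard deviation
iqr = q3 - q1                # interquartile range
value_range = maximum - minimum

print(f"CV       = {cv:.7f}")
print(f"Variance = {variance:.7e}")
print(f"IQR      = {iqr}")
print(f"Range    = {value_range}")
```

The large gap between the IQR (109.75) and the range (1369039) reflects a handful of outlying identifiers (1 and the 1.36 million values) around a tight cluster near 186,000.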
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
1 | 1 | 1.0%
186194 | 1 | 1.0%
186216 | 1 | 1.0%
186215 | 1 | 1.0%
186214 | 1 | 1.0%
186210 | 1 | 1.0%
186208 | 1 | 1.0%
186205 | 1 | 1.0%
186203 | 1 | 1.0%
186198 | 1 | 1.0%
Other values (90) | 90 | 90.0%
Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 1.0%
173284 | 1 | 1.0%
179747 | 1 | 1.0%
186072 | 1 | 1.0%
186074 | 1 | 1.0%
186108 | 1 | 1.0%
186110 | 1 | 1.0%
186111 | 1 | 1.0%
186112 | 1 | 1.0%
186114 | 1 | 1.0%
Maximum 10 values
Value | Count | Frequency (%)
1369040 | 1 | 1.0%
1369039 | 1 | 1.0%
1369013 | 1 | 1.0%
1368958 | 1 | 1.0%
186321 | 1 | 1.0%
186319 | 1 | 1.0%
186318 | 1 | 1.0%
186317 | 1 | 1.0%
186315 | 1 | 1.0%
186311 | 1 | 1.0%

data_title_nm
Text

Distinct: 73
Distinct (%): 73.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 45
Median length: 37
Mean length: 16.02
Min length: 3

Characters and Unicode

Total characters: 1602
Distinct characters: 309
Distinct categories: 9
Distinct scripts: 4
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 65
Unique (%): 65.0%

Sample

1st row: 문화강남
2nd row: 영남대로 문화콘텐츠 '영남대로, 벼랑길을 걷다'
3rd row: 향토사료집 3 화성의 얼 Ⅲ
4th row: 한양도성과 중구의 각자성석
5th row: 문화대덕 창간호 1995.12
Value | Count | Frequency (%)
한국의 | 12 | 3.4%
설봉문화 | 8 | 2.3%
雪峯文化 | 8 | 2.3%
전통마을 | 8 | 2.3%
울산문화 | 7 | 2.0%
蔚山文化 | 7 | 2.0%
회룡문화 | 6 | 1.7%
回龍文化 | 6 | 1.7%
화성문화 | 6 | 1.7%
소사벌 | 4 | 1.1%
Other values (241) | 277 | 79.4%

Most occurring characters

ValueCountFrequency (%)
249
 
15.5%
68
 
4.2%
66
 
4.1%
36
 
2.2%
( 35
 
2.2%
) 35
 
2.2%
32
 
2.0%
29
 
1.8%
29
 
1.8%
1 21
 
1.3%
Other values (299) 1002
62.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1176 | 73.4%
Space Separator | 249 | 15.5%
Decimal Number | 70 | 4.4%
Open Punctuation | 35 | 2.2%
Close Punctuation | 35 | 2.2%
Other Punctuation | 31 | 1.9%
Dash Punctuation | 3 | 0.2%
Letter Number | 2 | 0.1%
Math Symbol | 1 | 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
68
 
5.8%
66
 
5.6%
36
 
3.1%
32
 
2.7%
29
 
2.5%
29
 
2.5%
20
 
1.7%
19
 
1.6%
18
 
1.5%
16
 
1.4%
Other values (277) 843
71.7%
Decimal Number
ValueCountFrequency (%)
1 21
30.0%
9 7
 
10.0%
5 6
 
8.6%
0 6
 
8.6%
7 6
 
8.6%
4 5
 
7.1%
6 5
 
7.1%
3 5
 
7.1%
2 5
 
7.1%
8 4
 
5.7%
Other Punctuation
ValueCountFrequency (%)
/ 8
25.8%
. 8
25.8%
, 6
19.4%
· 5
16.1%
' 4
12.9%
Letter Number
ValueCountFrequency (%)
1
50.0%
1
50.0%
Space Separator
ValueCountFrequency (%)
249
100.0%
Open Punctuation
ValueCountFrequency (%)
( 35
100.0%
Close Punctuation
ValueCountFrequency (%)
) 35
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 982 | 61.3%
Common | 424 | 26.5%
Han | 194 | 12.1%
Latin | 2 | 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
68
 
6.9%
66
 
6.7%
36
 
3.7%
29
 
3.0%
20
 
2.0%
19
 
1.9%
18
 
1.8%
16
 
1.6%
16
 
1.6%
16
 
1.6%
Other values (213) 678
69.0%
Han
ValueCountFrequency (%)
32
16.5%
29
 
14.9%
9
 
4.6%
8
 
4.1%
8
 
4.1%
8
 
4.1%
6
 
3.1%
6
 
3.1%
6
 
3.1%
6
 
3.1%
Other values (54) 76
39.2%
Common
ValueCountFrequency (%)
249
58.7%
( 35
 
8.3%
) 35
 
8.3%
1 21
 
5.0%
/ 8
 
1.9%
. 8
 
1.9%
9 7
 
1.7%
5 6
 
1.4%
0 6
 
1.4%
7 6
 
1.4%
Other values (10) 43
 
10.1%
Latin
ValueCountFrequency (%)
1
50.0%
1
50.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 982 | 61.3%
ASCII | 419 | 26.2%
CJK | 189 | 11.8%
None | 5 | 0.3%
CJK Compat Ideographs | 5 | 0.3%
Number Forms | 2 | 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
249
59.4%
( 35
 
8.4%
) 35
 
8.4%
1 21
 
5.0%
/ 8
 
1.9%
. 8
 
1.9%
9 7
 
1.7%
5 6
 
1.4%
0 6
 
1.4%
7 6
 
1.4%
Other values (9) 38
 
9.1%
Hangul
ValueCountFrequency (%)
68
 
6.9%
66
 
6.7%
36
 
3.7%
29
 
3.0%
20
 
2.0%
19
 
1.9%
18
 
1.8%
16
 
1.6%
16
 
1.6%
16
 
1.6%
Other values (213) 678
69.0%
CJK
ValueCountFrequency (%)
32
16.9%
29
15.3%
9
 
4.8%
8
 
4.2%
8
 
4.2%
8
 
4.2%
6
 
3.2%
6
 
3.2%
6
 
3.2%
6
 
3.2%
Other values (50) 71
37.6%
None
ValueCountFrequency (%)
· 5
100.0%
CJK Compat Ideographs
ValueCountFrequency (%)
2
40.0%
1
20.0%
1
20.0%
1
20.0%
Number Forms
ValueCountFrequency (%)
1
50.0%
1
50.0%

sumry_cn
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 100
Missing (%): 100.0%
Memory size: 1.0 KiB

cl_nm
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 6
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 8
Median length: 6
Mean length: 6
Min length: 4

Unique

Unique: 3
Unique (%): 3.0%

Sample

1st row: 종교와 문화
2nd row: 자연과 지리
3rd row: 생활과 민속
4th row: 문화유산
5th row: 종교와 문화

Common Values

Value | Count | Frequency (%)
종교와 문화 | 85 | 85.0%
생활과 민속 | 10 | 10.0%
성씨와 인물 | 2 | 2.0%
자연과 지리 | 1 | 1.0%
문화유산 | 1 | 1.0%
구비전승,어문학 | 1 | 1.0%
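
The Distinct and Unique figures for cl_nm can be recovered from the Common Values table. The sketch below assumes that "Distinct" counts the different values and that "Unique" counts values occurring exactly once; that reading matches the reported numbers (6 and 3) but is an interpretation, not something the report states.

```python
from collections import Counter

# Value counts copied from the cl_nm Common Values table.
counts = Counter({
    "종교와 문화": 85,
    "생활과 민속": 10,
    "성씨와 인물": 2,
    "자연과 지리": 1,
    "문화유산": 1,
    "구비전승,어문학": 1,
})

n_rows = sum(counts.values())
distinct = len(counts)                               # number of different values
unique = sum(1 for c in counts.values() if c == 1)   # values seen exactly once

print(f"Distinct: {distinct} ({100 * distinct / n_rows:.1f}%)")
print(f"Unique:   {unique} ({100 * unique / n_rows:.1f}%)")
```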

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
종교와 | 85 | 42.9%
문화 | 85 | 42.9%
생활과 | 10 | 5.1%
민속 | 10 | 5.1%
성씨와 | 2 | 1.0%
인물 | 2 | 1.0%
자연과 | 1 | 0.5%
지리 | 1 | 0.5%
문화유산 | 1 | 0.5%
구비전승,어문학 | 1 | 0.5%

lwprt_cl_nm
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 6
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 4
Mean length: 3.76
Min length: 2

Unique

Unique: 3
Unique (%): 3.0%

Sample

1st row: 문화예술
2nd row: 자연환경
3rd row: 생활
4th row: 건축유적
5th row: 문화예술

Common Values

Value | Count | Frequency (%)
문화예술 | 85 | 85.0%
생활 | 10 | 10.0%
인물 | 2 | 2.0%
자연환경 | 1 | 1.0%
건축유적 | 1 | 1.0%
지명유래 | 1 | 1.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
문화예술 | 85 | 85.0%
생활 | 10 | 10.0%
인물 | 2 | 2.0%
자연환경 | 1 | 1.0%
건축유적 | 1 | 1.0%
지명유래 | 1 | 1.0%

vlm_co
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 12
Distinct (%): 92.3%
Missing: 87
Missing (%): 87.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.230769
Minimum: 1
Maximum: 20
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1.6
Q1: 7
median: 13
Q3: 17
95-th percentile: 18.8
Maximum: 20
Range: 19
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 6.4569581
Coefficient of variation (CV): 0.57493462
Kurtosis: -1.2514824
Mean: 11.230769
Median Absolute Deviation (MAD): 5
Skewness: -0.39346553
Sum: 146
Variance: 41.692308
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Common values
Value | Count | Frequency (%)
17 | 2 | 2.0%
15 | 1 | 1.0%
3 | 1 | 1.0%
1 | 1 | 1.0%
18 | 1 | 1.0%
8 | 1 | 1.0%
14 | 1 | 1.0%
7 | 1 | 1.0%
20 | 1 | 1.0%
11 | 1 | 1.0%
Other values (2) | 2 | 2.0%
(Missing) | 87 | 87.0%
Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 1.0%
2 | 1 | 1.0%
3 | 1 | 1.0%
7 | 1 | 1.0%
8 | 1 | 1.0%
11 | 1 | 1.0%
13 | 1 | 1.0%
14 | 1 | 1.0%
15 | 1 | 1.0%
17 | 2 | 2.0%
Maximum 10 values
Value | Count | Frequency (%)
20 | 1 | 1.0%
18 | 1 | 1.0%
17 | 2 | 2.0%
15 | 1 | 1.0%
14 | 1 | 1.0%
13 | 1 | 1.0%
11 | 1 | 1.0%
8 | 1 | 1.0%
7 | 1 | 1.0%
3 | 1 | 1.0%
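
Because vlm_co has only 13 non-missing values, the full set can be reconstructed from the minimum and maximum value tables above (12 distinct values, with 17 occurring twice), and the reported statistics can be recomputed from it. The sketch below uses only the standard library. The `percentile` helper is hypothetical and uses linear interpolation between order statistics, which reproduces the reported quantiles; whether the report's tool uses exactly this method is an assumption.

```python
import statistics

# Non-missing vlm_co values, reconstructed from the value tables
# (13 values in total; 17 occurs twice).
values = sorted([1, 2, 3, 7, 8, 11, 13, 14, 15, 17, 17, 18, 20])

def percentile(sorted_vals, q):
    """Percentile for q in [0, 1] using linear interpolation on sorted data."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

total = sum(values)
mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)   # sample variance (divides by n - 1)
std = statistics.stdev(values)
cv = std / mean
q1 = percentile(values, 0.25)
q3 = percentile(values, 0.75)
mad = statistics.median([abs(v - median) for v in values])
distinct_pct = 100 * len(set(values)) / len(values)

print(f"Sum={total} Mean={mean:.6f} Median={median}")
print(f"Variance={variance:.6f} Std={std:.7f} CV={cv:.8f}")
print(f"Q1={q1} Q3={q3} IQR={q3 - q1} MAD={mad}")
print(f"Distinct (%)={distinct_pct:.1f}")
```

Two points follow from this: the reported variance matches the sample variance (n - 1 denominator) rather than the population variance, and "Distinct (%)" of 92.3% is 12 distinct values out of the 13 non-missing values, not out of all 100 rows.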

cltur_cnter_nm
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 9
Median length: 7
Mean length: 6.98
Min length: 5

Unique

Unique: 3
Unique (%): 3.0%

Sample

1st row: 서울중구문화원
2nd row: 밀양문화원
3rd row: 대전중구문화원
4th row: 서울중구문화원
5th row: 대전중구문화원

Common Values

Value | Count | Frequency (%)
대전중구문화원 | 95 | 95.0%
서울중구문화원 | 2 | 2.0%
밀양문화원 | 1 | 1.0%
용산문화원 | 1 | 1.0%
제주도문화원연합회 | 1 | 1.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
대전중구문화원 | 95 | 95.0%
서울중구문화원 | 2 | 2.0%
밀양문화원 | 1 | 1.0%
용산문화원 | 1 | 1.0%
제주도문화원연합회 | 1 | 1.0%

area_nm
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 6
Median length: 6
Mean length: 4.57
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울·경인권
2nd row: <NA>
3rd row: 서울·경인권
4th row: <NA>
5th row: 충청권

Common Values

Value | Count | Frequency (%)
서울·경인권 | 58 | 58.0%
전국 | 21 | 21.0%
영남권 | 10 | 10.0%
충청권 | 7 | 7.0%
<NA> | 4 | 4.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
서울·경인권 | 58 | 58.0%
전국 | 21 | 21.0%
영남권 | 10 | 10.0%
충청권 | 7 | 7.0%
na | 4 | 4.0%

regist_de
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20181116
2nd row: 20181129
3rd row: 20181116
4th row: 20181129
5th row: 20181116

Common Values

Value | Count | Frequency (%)
20181116 | 96 | 96.0%
20181129 | 4 | 4.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
20181116 | 96 | 96.0%
20181129 | 4 | 4.0%

Interactions


Correlations

 | data_manage_no | data_title_nm | cl_nm | lwprt_cl_nm | vlm_co | cltur_cnter_nm | area_nm | regist_de
data_manage_no | 1.000 | 1.000 | 0.841 | 0.841 | 1.000 | 0.801 | 0.000 | 1.000
data_title_nm | 1.000 | 1.000 | 0.998 | 0.998 | 1.000 | 1.000 | 1.000 | 1.000
cl_nm | 0.841 | 0.998 | 1.000 | 1.000 | 1.000 | 0.811 | 0.626 | 0.942
lwprt_cl_nm | 0.841 | 0.998 | 1.000 | 1.000 | 1.000 | 0.811 | 0.626 | 0.942
vlm_co | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.440 | NaN
cltur_cnter_nm | 0.801 | 1.000 | 0.811 | 0.811 | 1.000 | 1.000 | 0.000 | 0.800
area_nm | 0.000 | 1.000 | 0.626 | 0.626 | 0.440 | 0.000 | 1.000 | NaN
regist_de | 1.000 | 1.000 | 0.942 | 0.942 | NaN | 0.800 | NaN | 1.000
 | area_nm | cltur_cnter_nm | lwprt_cl_nm | cl_nm | regist_de
area_nm | 1.000 | 0.000 | 0.287 | 0.287 | 1.000
cltur_cnter_nm | 0.000 | 1.000 | 0.700 | 0.700 | 0.915
lwprt_cl_nm | 0.287 | 0.700 | 1.000 | 1.000 | 0.769
cl_nm | 0.287 | 0.700 | 1.000 | 1.000 | 0.769
regist_de | 1.000 | 0.915 | 0.769 | 0.769 | 1.000
 | data_manage_no | vlm_co | cl_nm | lwprt_cl_nm | cltur_cnter_nm | area_nm | regist_de
data_manage_no | 1.000 | -0.371 | 0.523 | 0.523 | 0.813 | 0.000 | 0.995
vlm_co | -0.371 | 1.000 | 0.674 | 0.674 | 0.674 | 0.000 | 1.000
cl_nm | 0.523 | 0.674 | 1.000 | 1.000 | 0.700 | 0.287 | 0.769
lwprt_cl_nm | 0.523 | 0.674 | 1.000 | 1.000 | 0.700 | 0.287 | 0.769
cltur_cnter_nm | 0.813 | 0.674 | 0.700 | 0.700 | 1.000 | 0.000 | 0.915
area_nm | 0.000 | 0.000 | 0.287 | 0.287 | 0.000 | 1.000 | 1.000
regist_de | 0.995 | 1.000 | 0.769 | 0.769 | 0.915 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
 | data_manage_no | data_title_nm | sumry_cn | cl_nm | lwprt_cl_nm | vlm_co | cltur_cnter_nm | area_nm | regist_de
0 | 1 | 문화강남 | <NA> | 종교와 문화 | 문화예술 | 15 | 서울중구문화원 | 서울·경인권 | 20181116
1 | 1368958 | 영남대로 문화콘텐츠 '영남대로, 벼랑길을 걷다' | <NA> | 자연과 지리 | 자연환경 | <NA> | 밀양문화원 | <NA> | 20181129
2 | 186309 | 향토사료집 3 화성의 얼 Ⅲ | <NA> | 생활과 민속 | 생활 | 3 | 대전중구문화원 | 서울·경인권 | 20181116
3 | 1369040 | 한양도성과 중구의 각자성석 | <NA> | 문화유산 | 건축유적 | <NA> | 서울중구문화원 | <NA> | 20181129
4 | 179747 | 문화대덕 창간호 1995.12 | <NA> | 종교와 문화 | 문화예술 | 1 | 대전중구문화원 | 충청권 | 20181116
5 | 186110 | 제5회 대덕백일장 작품집 | <NA> | 종교와 문화 | 문화예술 | <NA> | 대전중구문화원 | 충청권 | 20181116
6 | 186117 | 중구문화 길잡이 | <NA> | 종교와 문화 | 문화예술 | <NA> | 대전중구문화원 | 영남권 | 20181116
7 | 1369013 | 이야기로 듣는 4人4色 용산史 | <NA> | 성씨와 인물 | 인물 | <NA> | 용산문화원 | <NA> | 20181129
8 | 186296 | 화성문화 | <NA> | 종교와 문화 | 문화예술 | <NA> | 대전중구문화원 | 서울·경인권 | 20181116
9 | 186298 | 화성문화 (제10회 경로효친 선양글짓기 공모 입상작품집) 제16호 | <NA> | 종교와 문화 | 문화예술 | <NA> | 대전중구문화원 | 서울·경인권 | 20181116

Last rows
 | data_manage_no | data_title_nm | sumry_cn | cl_nm | lwprt_cl_nm | vlm_co | cltur_cnter_nm | area_nm | regist_de
90 | 186262 | 설봉문화 (雪峯文化) | <NA> | 종교와 문화 | 문화예술 | <NA> | 대전중구문화원 | 서울·경인권 | 20181116
91 | 186263 | 설봉문화 (雪峯文化) | <NA> | 종교와 문화 | 문화예술 | <NA> | 대전중구문화원 | 서울·경인권 | 20181116
92 | 186264 | 설봉문화 (雪峯文化) | <NA> | 종교와 문화 | 문화예술 | <NA> | 대전중구문화원 | 서울·경인권 | 20181116
93 | 186266 | 설봉문화 (雪峯文化) | <NA> | 종교와 문화 | 문화예술 | <NA> | 대전중구문화원 | 서울·경인권 | 20181116
94 | 186268 | 설봉문화 (雪峯文化) | <NA> | 종교와 문화 | 문화예술 | <NA> | 대전중구문화원 | 서울·경인권 | 20181116
95 | 186272 | 利川의 民俗 '거복놀이' | <NA> | 종교와 문화 | 문화예술 | <NA> | 대전중구문화원 | 서울·경인권 | 20181116
96 | 186275 | 화성군의 3·1 독립운동사 | <NA> | 종교와 문화 | 문화예술 | <NA> | 대전중구문화원 | 서울·경인권 | 20181116
97 | 186277 | 화성문화 제2호 | <NA> | 종교와 문화 | 문화예술 | <NA> | 대전중구문화원 | 서울·경인권 | 20181116
98 | 186278 | 화성문화 창간호 | <NA> | 종교와 문화 | 문화예술 | <NA> | 대전중구문화원 | 서울·경인권 | 20181116
99 | 186279 | 난파의 생애와 예술세계 | <NA> | 종교와 문화 | 문화예술 | <NA> | 대전중구문화원 | 서울·경인권 | 20181116