Overview

Dataset statistics

Number of variables: 4
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.4 KiB
Average record size in memory: 34.3 B

Variable types

Categorical: 2
Text: 1
Numeric: 1

Alerts

lclas_nm is highly overall correlated with mlsfc_nm [High correlation]
mlsfc_nm is highly overall correlated with lclas_nm [High correlation]
lclas_nm is highly imbalanced (80.6%) [Imbalance]
mlsfc_nm is highly imbalanced (80.6%) [Imbalance]
signgu_nm has unique values [Unique]
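
The 80.6% imbalance figure in the alerts is consistent with an entropy-based score, 1 - H / log2(k), applied to the 97 / 3 class split that both categorical variables show. The sketch below is an assumption about how such a score can be computed, not a statement of the tool's exact implementation; the function name `imbalance_score` is hypothetical.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits.

    0.0 means perfectly balanced classes; values near 1.0 mean that
    one class dominates. (Assumed definition, for illustration only.)
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumed convention: a single class is not scored
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Class counts reported for lclas_nm and mlsfc_nm: 97 and 3 out of 100 rows.
score = imbalance_score([97, 3])
print(f"{score:.1%}")  # 80.6%
```

A 50 / 50 split would score 0.0 under the same definition, so the score grows as one class crowds out the others.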

Reproduction

Analysis started: 2023-12-10 10:06:56.728880
Analysis finished: 2023-12-10 10:06:57.474352
Duration: 0.75 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
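
The reported duration can be cross-checked against the two timestamps. This is a small sketch using only the Python standard library; the format string is an assumption based on how the timestamps are printed in this report.

```python
from datetime import datetime

# Timestamp layout as printed in the report (assumed format string).
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-10 10:06:56.728880", FMT)
finished = datetime.strptime("2023-12-10 10:06:57.474352", FMT)

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 0.75 seconds
```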

Variables

lclas_nm
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
문화예술 (culture and arts): 97
소비 (consumption): 3

Length

Max length: 4
Median length: 4
Mean length: 3.94
Min length: 2
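
The length statistics follow directly from the value counts: 97 values of four characters and 3 values of two characters. A minimal sketch that rebuilds the column from those counts (row order is not known and does not matter here):

```python
from statistics import mean, median

# Rebuild the column from the reported counts; the order is arbitrary.
values = ["문화예술"] * 97 + ["소비"] * 3
lengths = [len(v) for v in values]

print(max(lengths))             # 4
print(median(lengths))          # 4.0
print(round(mean(lengths), 2))  # 3.94
print(min(lengths))             # 2
```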

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 문화예술
2nd row: 소비
3rd row: 문화예술
4th row: 문화예술
5th row: 문화예술

Common Values

Value | Count | Frequency (%)
문화예술 | 97 | 97.0%
소비 | 3 | 3.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot for lclas_nm (문화예술: 97, 97.0%; 소비: 3, 3.0%)]

mlsfc_nm
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
공연시설 (performance facilities): 97
기타소비 (other consumption): 3

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 공연시설
2nd row: 기타소비
3rd row: 공연시설
4th row: 공연시설
5th row: 공연시설

Common Values

Value | Count | Frequency (%)
공연시설 | 97 | 97.0%
기타소비 | 3 | 3.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot for mlsfc_nm (공연시설: 97, 97.0%; 기타소비: 3, 3.0%)]

signgu_nm
Text

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 15
Median length: 12
Mean length: 8.46
Min length: 7

Characters and Unicode

Total characters: 846
Distinct characters: 87
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
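
As an illustration of those character properties, the sketch below classifies the characters of the first sample value with Python's standard `unicodedata` module. The general categories `Lo` and `Zs` correspond to "Other Letter" and "Space Separator". The standard library does not expose the script or block property directly, so the first word of the character name is used here only as a rough stand-in for the script; that shortcut is an assumption of this example.

```python
import unicodedata
from collections import Counter

sample = "서울특별시 종로구"  # first sample row of signgu_nm

# General category per character: "Lo" = Other Letter, "Zs" = Space Separator.
categories = Counter(unicodedata.category(ch) for ch in sample)
print(dict(categories))  # {'Lo': 8, 'Zs': 1}

# Rough script proxy (assumption): first word of the Unicode character name.
name_prefixes = Counter(unicodedata.name(ch).split()[0] for ch in sample)
print(dict(name_prefixes))  # {'HANGUL': 8, 'SPACE': 1}
```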

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: 서울특별시 종로구
2nd row: 경상남도 합천군
3rd row: 서울특별시 용산구
4th row: 서울특별시 성동구
5th row: 서울특별시 광진구

Value | Count | Frequency (%)
경기도 | 25 | 12.5%
서울특별시 | 23 | 11.5%
부산광역시 | 15 | 7.5%
인천광역시 | 10 | 5.0%
대구광역시 | 8 | 4.0%
동구 | 6 | 3.0%
대전광역시 | 5 | 2.5%
서구 | 5 | 2.5%
울산광역시 | 5 | 2.5%
광주광역시 | 5 | 2.5%
Other values (81) | 93 | 46.5%
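
The counts in this table add up to 200, which suggests that frequencies are taken over whitespace-separated word tokens rather than over rows (for example, 25 of 200 is 12.5%). The sketch below applies that reading to the five sample rows only; whitespace tokenization is an assumption, and the output describes those five rows, not the full column.

```python
from collections import Counter

# The five sample rows shown for signgu_nm.
rows = [
    "서울특별시 종로구",
    "경상남도 합천군",
    "서울특별시 용산구",
    "서울특별시 성동구",
    "서울특별시 광진구",
]

# Split each value on whitespace and count the resulting tokens.
words = Counter(word for row in rows for word in row.split())
total = sum(words.values())

for word, count in words.most_common(2):
    print(word, count, f"{count / total:.1%}")
```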

Most occurring characters

Value | Count | Frequency (%)
 | 101 | 11.9%
 | 100 | 11.8%
 | 76 | 9.0%
 | 57 | 6.7%
 | 48 | 5.7%
 | 34 | 4.0%
 | 30 | 3.5%
 | 29 | 3.4%
 | 27 | 3.2%
 | 27 | 3.2%
Other values (77) | 317 | 37.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 746 | 88.2%
Space Separator | 100 | 11.8%

Most frequent character per category

Other Letter

Value | Count | Frequency (%)
 | 101 | 13.5%
 | 76 | 10.2%
 | 57 | 7.6%
 | 48 | 6.4%
 | 34 | 4.6%
 | 30 | 4.0%
 | 29 | 3.9%
 | 27 | 3.6%
 | 27 | 3.6%
 | 26 | 3.5%
Other values (76) | 291 | 39.0%

Space Separator

Value | Count | Frequency (%)
 | 100 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 746 | 88.2%
Common | 100 | 11.8%

Most frequent character per script

Hangul

Value | Count | Frequency (%)
 | 101 | 13.5%
 | 76 | 10.2%
 | 57 | 7.6%
 | 48 | 6.4%
 | 34 | 4.6%
 | 30 | 4.0%
 | 29 | 3.9%
 | 27 | 3.6%
 | 27 | 3.6%
 | 26 | 3.5%
Other values (76) | 291 | 39.0%

Common

Value | Count | Frequency (%)
 | 100 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 746 | 88.2%
ASCII | 100 | 11.8%

Most frequent character per block

Hangul

Value | Count | Frequency (%)
 | 101 | 13.5%
 | 76 | 10.2%
 | 57 | 7.6%
 | 48 | 6.4%
 | 34 | 4.6%
 | 30 | 4.0%
 | 29 | 3.9%
 | 27 | 3.6%
 | 27 | 3.6%
 | 26 | 3.5%
Other values (76) | 291 | 39.0%

ASCII

Value | Count | Frequency (%)
 | 100 | 100.0%

area_popltn_stdiz_fclty_score
Numeric

Distinct: 99
Distinct (%): 99.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.954939
Minimum: 0
Maximum: 100
Zeros: 1
Zeros (%): 1.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1.273817
Q1: 2.8932775
Median: 4.558775
Q3: 8.7778025
95-th percentile: 43.265498
Maximum: 100
Range: 100
Interquartile range (IQR): 5.884525
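
The IQR and range follow from the quantiles listed here. The sketch below recomputes both and, as an extra step that the report itself does not perform, derives the conventional 1.5 × IQR outlier fences (Tukey's rule) to show how far the largest reported scores sit from the bulk of the data.

```python
# Quantiles as reported for the numeric variable.
minimum, q1, q3, maximum = 0.0, 2.8932775, 8.7778025, 100.0

value_range = maximum - minimum
iqr = q3 - q1

# Tukey's fences: an illustrative convention, not part of the report.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# The largest values listed in the report (100.0 appears twice).
largest = [100.0, 100.0, 94.34368, 69.70102, 46.02103, 43.12047,
           39.59594, 36.38463, 30.16915, 29.23194, 25.88114]
above = [v for v in largest if v > upper_fence]

print(round(iqr, 6))          # 5.884525
print(round(upper_fence, 5))  # 17.60459
print(len(above))             # 11
```

Only the listed top values are checked, so 11 is a lower bound on the number of observations above the upper fence.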

Descriptive statistics

Standard deviation: 18.887433
Coefficient of variation (CV): 1.7241021
Kurtosis: 12.953585
Mean: 10.954939
Median Absolute Deviation (MAD): 1.98685
Skewness: 3.5165141
Sum: 1095.4939
Variance: 356.73514
Monotonicity: Not monotonic
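
Several of these statistics are tied together by simple identities: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. A minimal sketch that checks the reported figures against each other (small differences come from the rounding of the printed inputs):

```python
# Values as printed in the report.
mean = 10.954939
std = 18.887433
n = 100  # number of observations

cv = std / mean       # coefficient of variation
variance = std ** 2   # variance from the standard deviation
total = mean * n      # sum of all observations

print(round(cv, 5))        # 1.7241
print(round(variance, 3))  # 356.735
print(round(total, 4))     # 1095.4939
```

A CV well above 1, together with the reported skewness of about 3.5, indicates a strongly right-skewed distribution in which a few large scores dominate the spread.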
[Figure: histogram with fixed size bins (bins=50)]

Value | Count | Frequency (%)
100.0 | 2 | 2.0%
7.01753 | 1 | 1.0%
6.14798 | 1 | 1.0%
6.03525 | 1 | 1.0%
0.19694 | 1 | 1.0%
1.37063 | 1 | 1.0%
3.08422 | 1 | 1.0%
0.55599 | 1 | 1.0%
2.33549 | 1 | 1.0%
10.86667 | 1 | 1.0%
Other values (89) | 89 | 89.0%

Minimum 10 values

Value | Count | Frequency (%)
0.0 | 1 | 1.0%
0.19694 | 1 | 1.0%
0.55599 | 1 | 1.0%
1.16819 | 1 | 1.0%
1.24792 | 1 | 1.0%
1.27518 | 1 | 1.0%
1.37063 | 1 | 1.0%
1.50006 | 1 | 1.0%
1.54075 | 1 | 1.0%
1.77336 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
100.0 | 2 | 2.0%
94.34368 | 1 | 1.0%
69.70102 | 1 | 1.0%
46.02103 | 1 | 1.0%
43.12047 | 1 | 1.0%
39.59594 | 1 | 1.0%
36.38463 | 1 | 1.0%
30.16915 | 1 | 1.0%
29.23194 | 1 | 1.0%
25.88114 | 1 | 1.0%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]
Variable | lclas_nm | mlsfc_nm | signgu_nm | area_popltn_stdiz_fclty_score
lclas_nm | 1.000 | 0.963 | 1.000 | 0.465
mlsfc_nm | 0.963 | 1.000 | 1.000 | 0.465
signgu_nm | 1.000 | 1.000 | 1.000 | 1.000
area_popltn_stdiz_fclty_score | 0.465 | 0.465 | 1.000 | 1.000

[Figure: correlation heatmap]
Variable | lclas_nm | mlsfc_nm
lclas_nm | 1.000 | 0.826
mlsfc_nm | 0.826 | 1.000

[Figure: correlation heatmap]
Variable | area_popltn_stdiz_fclty_score | lclas_nm | mlsfc_nm
area_popltn_stdiz_fclty_score | 1.000 | 0.485 | 0.485
lclas_nm | 0.485 | 1.000 | 0.826
mlsfc_nm | 0.485 | 0.826 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

Index | lclas_nm | mlsfc_nm | signgu_nm | area_popltn_stdiz_fclty_score
0 | 문화예술 | 공연시설 | 서울특별시 종로구 | 100.0
1 | 소비 | 기타소비 | 경상남도 합천군 | 16.24628
2 | 문화예술 | 공연시설 | 서울특별시 용산구 | 46.02103
3 | 문화예술 | 공연시설 | 서울특별시 성동구 | 39.59594
4 | 문화예술 | 공연시설 | 서울특별시 광진구 | 15.32985
5 | 문화예술 | 공연시설 | 서울특별시 동대문구 | 3.82535
6 | 문화예술 | 공연시설 | 서울특별시 중랑구 | 3.28826
7 | 소비 | 기타소비 | 제주특별자치도 제주시 | 43.12047
8 | 문화예술 | 공연시설 | 서울특별시 강북구 | 4.2156
9 | 문화예술 | 공연시설 | 서울특별시 도봉구 | 3.07233

Last rows

Index | lclas_nm | mlsfc_nm | signgu_nm | area_popltn_stdiz_fclty_score
90 | 문화예술 | 공연시설 | 경기도 군포시 | 2.90197
91 | 문화예술 | 공연시설 | 경기도 의왕시 | 1.27518
92 | 문화예술 | 공연시설 | 경기도 하남시 | 4.79954
93 | 문화예술 | 공연시설 | 경기도 용인시 | 8.26013
94 | 문화예술 | 공연시설 | 경기도 파주시 | 1.50006
95 | 문화예술 | 공연시설 | 경기도 이천시 | 6.61224
96 | 문화예술 | 공연시설 | 경기도 안성시 | 6.43127
97 | 문화예술 | 공연시설 | 경기도 김포시 | 4.92195
98 | 문화예술 | 공연시설 | 경기도 화성시 | 2.56688
99 | 문화예술 | 공연시설 | 경기도 광주시 | 5.31933