Overview

Dataset statistics

Number of variables: 4
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.4 KiB
Average record size in memory: 34.3 B

Variable types

Categorical: 2
Text: 1
Numeric: 1

Alerts

lclas_nm is highly overall correlated with mlsfc_nm (High correlation)
mlsfc_nm is highly overall correlated with lclas_nm (High correlation)
lclas_nm is highly imbalanced (80.6%) (Imbalance)
mlsfc_nm is highly imbalanced (80.6%) (Imbalance)
signgu_nm has unique values (Unique)
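
The 80.6% imbalance figure reported for lclas_nm and mlsfc_nm is consistent with a score of one minus the normalized Shannon entropy of the class counts (97 and 3 in both columns). The sketch below recomputes it under that assumption; the function name imbalance_score is hypothetical and is not taken from the ydata-profiling API.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (base 2)
    of the class distribution and k is the number of classes.

    Assumption: this is the formula behind the report's imbalance figure.
    0.0 means perfectly balanced; values near 1.0 mean one class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k <= 1:
        return 0.0  # a single class has no balance to measure
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Class counts shared by lclas_nm and mlsfc_nm in this report.
score = imbalance_score([97, 3])
print(f"{score:.1%}")  # prints 80.6%
```

A perfectly balanced two-class column, such as counts of 50 and 50, scores 0 under the same formula.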

Reproduction

Analysis started: 2023-12-10 10:03:27.907522
Analysis finished: 2023-12-10 10:03:28.724251
Duration: 0.82 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
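
A report of this kind can be regenerated along the following lines. This is a minimal sketch, assuming the data is available as a CSV file and that pandas and ydata-profiling are installed; the function name, both file paths and the report title are hypothetical, and the settings stored in config.json are not reproduced here (library defaults are used instead).

```python
def build_profile_report(csv_path, html_path="report.html"):
    """Profile a CSV file and write the result as an HTML report.

    Both paths are placeholders supplied by the caller; the original
    data source of this report is not known.
    """
    # Imported inside the function so that defining it does not
    # require the third-party packages to be installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    report = ProfileReport(df, title="Dataset profile")
    report.to_file(html_path)
    return report
```

Calling build_profile_report("my_data.csv") would write report.html next to the script, where my_data.csv stands in for whatever file holds the four columns profiled here.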

Variables

lclas_nm
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

문화예술 (culture and arts): 97
소비 (consumption): 3

Length

Max length: 4
Median length: 4
Mean length: 3.94
Min length: 2
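
These length figures follow directly from the value counts: 97 rows hold a four-character value and 3 rows hold a two-character value. The sketch below rebuilds the column from those counts and recomputes the statistics; the variable names are illustrative only.

```python
from statistics import mean, median

# Rebuild the column from the reported value counts (97 and 3).
values = ["문화예술"] * 97 + ["소비"] * 3
lengths = [len(v) for v in values]

length_stats = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": round(mean(lengths), 2),  # (97 * 4 + 3 * 2) / 100
    "min": min(lengths),
}
print(length_stats)
```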

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 문화예술
2nd row: 소비
3rd row: 문화예술
4th row: 문화예술
5th row: 문화예술

Common Values

Value                       Count  Frequency (%)
문화예술 (culture and arts)  97     97.0%
소비 (consumption)           3      3.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

mlsfc_nm
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

공연시설 (performance facilities): 97
기타소비 (other consumption): 3

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 공연시설
2nd row: 기타소비
3rd row: 공연시설
4th row: 공연시설
5th row: 공연시설

Common Values

Value                             Count  Frequency (%)
공연시설 (performance facilities)  97     97.0%
기타소비 (other consumption)       3      3.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

signgu_nm
Text

UNIQUE

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 15
Median length: 12
Mean length: 8.46
Min length: 7

Characters and Unicode

Total characters: 846
Distinct characters: 87
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
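
As an illustration of these character properties, the sketch below looks up the Unicode general category of each character in the first sample value of this column using Python's standard unicodedata module. "Lo" is the abbreviation for Other Letter and "Zs" for Space Separator, the two categories listed in this report; the helper name category_counts is illustrative.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters in `text` by Unicode general category."""
    return Counter(unicodedata.category(ch) for ch in text)


# First sample row of signgu_nm: eight Hangul syllables and one space.
counts = category_counts("서울특별시 종로구")
print(counts["Lo"], counts["Zs"])  # prints 8 1
```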

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: 서울특별시 종로구
2nd row: 경상남도 합천군
3rd row: 서울특별시 용산구
4th row: 서울특별시 성동구
5th row: 서울특별시 광진구

Value              Count  Frequency (%)
경기도              25     12.5%
서울특별시          23     11.5%
부산광역시          15     7.5%
인천광역시          10     5.0%
대구광역시          8      4.0%
동구                6      3.0%
대전광역시          5      2.5%
서구                5      2.5%
울산광역시          5      2.5%
광주광역시          5      2.5%
Other values (81)  93     46.5%
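
The counts in this table add up to 200, twice the number of rows, and each percentage is the count divided by 200 (for example 25 / 200 = 12.5%). That is what one would expect if each value is split on whitespace into two words. The sketch below applies that assumed tokenization to the five sample rows only, so its counts are much smaller than those in the full table.

```python
from collections import Counter

# The five sample rows of signgu_nm shown in this report.
sample_rows = [
    "서울특별시 종로구",
    "경상남도 합천군",
    "서울특별시 용산구",
    "서울특별시 성동구",
    "서울특별시 광진구",
]

# Assumption: words are obtained by splitting each value on whitespace.
word_counts = Counter(word for row in sample_rows for word in row.split())
total_words = sum(word_counts.values())

for word, count in word_counts.most_common(3):
    print(word, count, f"{count / total_words:.1%}")
```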

Most occurring characters

Value              Count  Frequency (%)
-                  101    11.9%
(space)            100    11.8%
-                  76     9.0%
-                  57     6.7%
-                  48     5.7%
-                  34     4.0%
-                  30     3.5%
-                  29     3.4%
-                  27     3.2%
-                  27     3.2%
Other values (77)  317    37.5%

Most occurring categories

Value            Count  Frequency (%)
Other Letter     746    88.2%
Space Separator  100    11.8%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
-                  101    13.5%
-                  76     10.2%
-                  57     7.6%
-                  48     6.4%
-                  34     4.6%
-                  30     4.0%
-                  29     3.9%
-                  27     3.6%
-                  27     3.6%
-                  26     3.5%
Other values (76)  291    39.0%

Space Separator
Value    Count  Frequency (%)
(space)  100    100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  746    88.2%
Common  100    11.8%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
-                  101    13.5%
-                  76     10.2%
-                  57     7.6%
-                  48     6.4%
-                  34     4.6%
-                  30     4.0%
-                  29     3.9%
-                  27     3.6%
-                  27     3.6%
-                  26     3.5%
Other values (76)  291    39.0%

Common
Value    Count  Frequency (%)
(space)  100    100.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  746    88.2%
ASCII   100    11.8%

Most frequent character per block

Hangul
Value              Count  Frequency (%)
-                  101    13.5%
-                  76     10.2%
-                  57     7.6%
-                  48     6.4%
-                  34     4.6%
-                  30     4.0%
-                  29     3.9%
-                  27     3.6%
-                  27     3.6%
-                  26     3.5%
Other values (76)  291    39.0%

ASCII
Value    Count  Frequency (%)
(space)  100    100.0%

area_arby_stdiz_fclty_score
Numeric

Distinct: 98
Distinct (%): 98.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 12.747388
Minimum: 0.07337
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0.07337
5-th percentile: 0.2209235
Q1: 1.008045
Median: 4.05176
Q3: 10.026933
95-th percentile: 69.89367
Maximum: 100
Range: 99.92663
Interquartile range (IQR): 9.0188875

Descriptive statistics

Standard deviation: 23.365035
Coefficient of variation (CV): 1.8329273
Kurtosis: 7.1707153
Mean: 12.747388
Median Absolute Deviation (MAD): 3.40625
Skewness: 2.7702863
Sum: 1274.7388
Variance: 545.92488
Monotonicity: Not monotonic
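
Several of the figures above are derived from others, which gives a quick consistency check: CV is the standard deviation divided by the mean, variance is the standard deviation squared, the range is maximum minus minimum, the IQR is Q3 minus Q1, and the sum is the mean times the number of observations. The sketch below recomputes them from the rounded values printed in this report, so agreement is only expected up to rounding.

```python
# Values copied from this report (already rounded by the report).
n_obs = 100
mean_value = 12.747388
std_dev = 23.365035
minimum, maximum = 0.07337, 100.0
q1, q3 = 1.008045, 10.026933

# Derived statistics.
cv = std_dev / mean_value        # coefficient of variation
variance = std_dev ** 2
value_range = maximum - minimum
iqr = q3 - q1
total = mean_value * n_obs       # sum of all observations

print(f"CV={cv:.4f} variance={variance:.2f} range={value_range:.5f} "
      f"IQR={iqr:.4f} sum={total:.4f}")
```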
[Figure: histogram with fixed size bins (bins=50)]

Common values
Value              Count  Frequency (%)
100.0              3      3.0%
3.17567            1      1.0%
0.79211            1      1.0%
0.24033            1      1.0%
0.11637            1      1.0%
1.10088            1      1.0%
2.21212            1      1.0%
0.80679            1      1.0%
1.01393            1      1.0%
3.34276            1      1.0%
Other values (88)  88     88.0%

Minimum 10 values
Value    Count  Frequency (%)
0.07337  1      1.0%
0.11637  1      1.0%
0.16596  1      1.0%
0.20791  1      1.0%
0.20997  1      1.0%
0.2215   1      1.0%
0.22606  1      1.0%
0.24033  1      1.0%
0.29647  1      1.0%
0.32303  1      1.0%

Maximum 10 values
Value     Count  Frequency (%)
100.0     3      3.0%
99.2381   1      1.0%
89.64589  1      1.0%
68.85408  1      1.0%
60.16789  1      1.0%
51.55814  1      1.0%
46.16235  1      1.0%
38.85259  1      1.0%
31.83654  1      1.0%
28.94762  1      1.0%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]
                             lclas_nm  mlsfc_nm  signgu_nm  area_arby_stdiz_fclty_score
lclas_nm                     1.000     0.963     1.000      0.000
mlsfc_nm                     0.963     1.000     1.000      0.000
signgu_nm                    1.000     1.000     1.000      1.000
area_arby_stdiz_fclty_score  0.000     0.000     1.000      1.000

[Figure: correlation heatmap]
          mlsfc_nm  lclas_nm
mlsfc_nm  1.000     0.826
lclas_nm  0.826     1.000

[Figure: correlation heatmap]
                             area_arby_stdiz_fclty_score  lclas_nm  mlsfc_nm
area_arby_stdiz_fclty_score  1.000                        0.000     0.000
lclas_nm                     0.000                        1.000     0.826
mlsfc_nm                     0.000                        0.826     1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows
    lclas_nm  mlsfc_nm  signgu_nm              area_arby_stdiz_fclty_score
0   문화예술  공연시설  서울특별시 종로구      100.0
1   소비      기타소비  경상남도 합천군        0.07337
2   문화예술  공연시설  서울특별시 용산구      68.85408
3   문화예술  공연시설  서울특별시 성동구      99.2381
4   문화예술  공연시설  서울특별시 광진구      46.16235
5   문화예술  공연시설  서울특별시 동대문구    14.8833
6   문화예술  공연시설  서울특별시 중랑구      11.44303
7   소비      기타소비  제주특별자치도 제주시  2.24823
8   문화예술  공연시설  서울특별시 강북구      8.54166
9   문화예술  공연시설  서울특별시 도봉구      7.74908

Last rows
    lclas_nm  mlsfc_nm  signgu_nm      area_arby_stdiz_fclty_score
90  문화예술  공연시설  경기도 군포시  0.22606
91  문화예술  공연시설  경기도 의왕시  0.68001
92  문화예술  공연시설  경기도 하남시  0.80316
93  문화예술  공연시설  경기도 용인시  1.56763
94  문화예술  공연시설  경기도 파주시  0.2215
95  문화예술  공연시설  경기도 이천시  0.32303
96  문화예술  공연시설  경기도 안성시  4.48033
97  문화예술  공연시설  경기도 김포시  0.63647
98  문화예술  공연시설  경기도 화성시  7.47722
99  문화예술  공연시설  경기도 광주시  4.0765