Overview

Dataset statistics

Number of variables: 8
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.8 KiB
Average record size in memory: 69.3 B

Variable types

Categorical: 4
Numeric: 3
Text: 1

Alerts

ctprvn_nm is highly overall correlated with signgu_cd and 3 other fields (High correlation)
signgu_nm is highly overall correlated with signgu_cd and 3 other fields (High correlation)
ctprvn_cd is highly overall correlated with signgu_cd and 3 other fields (High correlation)
signgu_cd is highly overall correlated with adstrd_cd and 3 other fields (High correlation)
adstrd_cd is highly overall correlated with signgu_cd and 3 other fields (High correlation)
ctprvn_cd is highly imbalanced (80.6%) (Imbalance)
ctprvn_nm is highly imbalanced (80.6%) (Imbalance)
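
The 80.6% imbalance figure can be reproduced from the 97/3 split of ctprvn_cd. The sketch below assumes the score is one minus the Shannon entropy of the class distribution divided by its maximum, log2(k). That formula matches the reported number, but whether the report computed it exactly this way is an assumption. The function name imbalance_score is hypothetical.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform split, near 1 for a dominant class.

    H is the Shannon entropy (in bits) of the class distribution and k is the
    number of classes. Assumed formula, chosen because it matches the report.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# ctprvn_cd: 97 rows with code 11 and 3 rows with code 39
score = imbalance_score([97, 3])
print(f"{score:.1%}")  # 80.6%
```

A perfectly balanced two-class column scores 0 under this definition, which is why only the two province columns trigger the alert.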

Reproduction

Analysis started: 2023-12-10 10:20:11.835051
Analysis finished: 2023-12-10 10:20:14.221749
Duration: 2.39 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

cl
Categorical

Distinct: 8
Distinct (%): 8.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value  Count
조각  59
회화  18
기타  7
미디어  6
벽화  5
Other values (3)  5

Length

Max length: 3
Median length: 2
Mean length: 2.07
Min length: 2

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 기타
2nd row: 회화
3rd row: 분수대
4th row: 조각
5th row: 조각

Common Values

Value  Count  Frequency (%)
조각 (sculpture)  59  59.0%
회화 (painting)  18  18.0%
기타 (other)  7  7.0%
미디어 (media)  6  6.0%
벽화 (mural)  5  5.0%
공예 (craft)  2  2.0%
사진 (photograph)  2  2.0%
분수대 (fountain)  1  1.0%
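
The Distinct, Unique and Length figures for cl follow from the frequency table. The sketch below recomputes them with the standard library only. It assumes "Unique" means values that occur exactly once and "length" means the number of Unicode code points. The variable names are illustrative.

```python
from collections import Counter
from statistics import mean, median

# Frequency table for the cl column, copied from the Common Values table
value_counts = Counter({
    "조각": 59, "회화": 18, "기타": 7, "미디어": 6,
    "벽화": 5, "공예": 2, "사진": 2, "분수대": 1,
})

n_rows = sum(value_counts.values())
distinct = len(value_counts)                                # different values
unique = sum(1 for c in value_counts.values() if c == 1)    # values seen once

# One length per row: repeat each value's length by its count
lengths = [len(value) for value, count in value_counts.items() for _ in range(count)]

print(n_rows, distinct, unique)                   # 100 8 1
print(max(lengths), median(lengths), min(lengths))
print(round(mean(lengths), 2))                    # 2.07
```

Only 분수대 occurs once, which gives Unique = 1, and the seven rows with three-character values lift the mean length from 2 to 2.07.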

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the values in the Common Values table]

ctprvn_cd
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value  Count
11  97
39  3

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 11
2nd row: 39
3rd row: 11
4th row: 11
5th row: 11

Common Values

Value  Count  Frequency (%)
11  97  97.0%
39  3  3.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

ctprvn_nm
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value  Count
서울특별시  97
제주특별자치도  3

Length

Max length: 7
Median length: 5
Mean length: 5.06
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 제주특별자치도
3rd row: 서울특별시
4th row: 서울특별시
5th row: 서울특별시

Common Values

Value  Count  Frequency (%)
서울특별시  97  97.0%
제주특별자치도  3  3.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

signgu_cd
Real number (ℝ)

HIGH CORRELATION 

Distinct: 7
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11869.7
Minimum: 11010
Maximum: 39020
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 11010
5-th percentile: 11010
Q1: 11020
median: 11030
Q3: 11040
95-th percentile: 11060
Maximum: 39020
Range: 28010
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 4798.8162
Coefficient of variation (CV): 0.40429128
Kurtosis: 29.897071
Mean: 11869.7
Median Absolute Deviation (MAD): 10
Skewness: 5.5945539
Sum: 1186970
Variance: 23028637
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=7)]

Common Values

Value  Count  Frequency (%)
11010  22  22.0%
11020  21  21.0%
11040  19  19.0%
11030  16  16.0%
11050  11  11.0%
11060  8  8.0%
39020  3  3.0%

Minimum Values

Value  Count  Frequency (%)
11010  22  22.0%
11020  21  21.0%
11030  16  16.0%
11040  19  19.0%
11050  11  11.0%
11060  8  8.0%
39020  3  3.0%

Maximum Values

Value  Count  Frequency (%)
39020  3  3.0%
11060  8  8.0%
11050  11  11.0%
11040  19  19.0%
11030  16  16.0%
11020  21  21.0%
11010  22  22.0%
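
Because signgu_cd has only seven distinct values, the whole column can be rebuilt from its frequency table and the summary statistics checked. The sketch below is one way to do that with the standard library. It assumes the sample (n - 1) standard deviation and linearly interpolated quantiles, since those choices agree with the reported figures. The variable names are illustrative.

```python
import statistics

# Frequency table for signgu_cd, copied from the value counts above
counts = {11010: 22, 11020: 21, 11030: 16, 11040: 19, 11050: 11, 11060: 8, 39020: 3}

# Expand the table back into 100 sorted observations
data = sorted(value for value, count in counts.items() for _ in range(count))

mean = statistics.mean(data)
stdev = statistics.stdev(data)          # sample standard deviation (n - 1)
cv = stdev / mean                       # coefficient of variation

# Quartiles with linear interpolation over the full range of the data
q1, med, q3 = statistics.quantiles(data, n=4, method="inclusive")

# Median absolute deviation around the median
mad = statistics.median(abs(x - med) for x in data)

print(mean, round(stdev, 4), round(cv, 4))
print(q1, med, q3, q3 - q1, mad)
```

The three rows coded 39020 sit far from the other 97, which is why the standard deviation (about 4799) dwarfs the interquartile range (20) and the MAD (10). Treating an administrative code as a real number makes these statistics descriptive of the coding scheme rather than of any measured quantity.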

signgu_nm
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value  Count
종로구  22
중구  21
성동구  19
용산구  16
광진구  11
Other values (2)  11

Length

Max length: 4
Median length: 3
Mean length: 2.9
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 종로구
2nd row: 서귀포시
3rd row: 종로구
4th row: 종로구
5th row: 종로구

Common Values

Value  Count  Frequency (%)
종로구  22  22.0%
중구  21  21.0%
성동구  19  19.0%
용산구  16  16.0%
광진구  11  11.0%
동대문구  8  8.0%
서귀포시  3  3.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

adstrd_cd
Real number (ℝ)

HIGH CORRELATION 

Distinct: 62
Distinct (%): 62.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1187033.4
Minimum: 1101053
Maximum: 3902062
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1101053
5-th percentile: 1101057.9
Q1: 1102052
median: 1103069.5
Q3: 1104070.2
95-th percentile: 1106081
Maximum: 3902062
Range: 2801009
Interquartile range (IQR): 2018.25

Descriptive statistics

Standard deviation: 479881.38
Coefficient of variation (CV): 0.40426948
Kurtosis: 29.897068
Mean: 1187033.4
Median Absolute Deviation (MAD): 1015.5
Skewness: 5.5945536
Sum: 1.1870334 × 10^8
Variance: 2.3028614 × 10^11
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=50)]

Common Values

Value  Count  Frequency (%)
1101061  6  6.0%
1102052  5  5.0%
1101053  3  3.0%
1106081  3  3.0%
1101067  3  3.0%
3902062  3  3.0%
1103074  3  3.0%
1103073  3  3.0%
1102055  3  3.0%
1102069  3  3.0%
Other values (52)  65  65.0%

Minimum Values

Value  Count  Frequency (%)
1101053  3  3.0%
1101056  1  1.0%
1101057  1  1.0%
1101058  1  1.0%
1101061  6  6.0%
1101063  2  2.0%
1101064  2  2.0%
1101067  3  3.0%
1101070  2  2.0%
1101072  1  1.0%

Maximum Values

Value  Count  Frequency (%)
3902062  3  3.0%
1106082  1  1.0%
1106081  3  3.0%
1106080  1  1.0%
1106073  1  1.0%
1106072  1  1.0%
1106071  1  1.0%
1105067  1  1.0%
1105066  1  1.0%
1105065  2  2.0%

adstrd_nm
Text

Distinct: 62
Distinct (%): 62.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 11
Median length: 7
Mean length: 4.14
Min length: 2

Characters and Unicode

Total characters: 414
Distinct characters: 86
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
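
As a small illustration of those character properties, the sketch below classifies the characters of the longest name in the column with Python's unicodedata module. It assumes the separator in the name is U+00B7 MIDDLE DOT, which is consistent with the "Other Punctuation" category reported for it. The variable names are illustrative.

```python
import unicodedata
from collections import Counter

# Longest adstrd_nm value; the separator is assumed to be U+00B7 MIDDLE DOT
name = "종로1\u00b72\u00b73\u00b74가동"

# General category per character: Lo = Other Letter, Nd = Decimal Number,
# Po = Other Punctuation
categories = Counter(unicodedata.category(ch) for ch in name)

print(len(name))          # 11
print(dict(categories))   # {'Lo': 4, 'Nd': 4, 'Po': 3}
```

The three category codes correspond to the three categories the report counts for this column, and the 11 code points match the reported maximum length.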

Unique

Unique: 39
Unique (%): 39.0%

Sample

1st row: 사직동
2nd row: 예래동
3rd row: 사직동
4th row: 사직동
5th row: 평창동

Value  Count  Frequency (%)
종로1·2·3·4가동  6  6.0%
소공동  5  5.0%
사직동  3  3.0%
용신동  3  3.0%
창신1동  3  3.0%
예래동  3  3.0%
한남동  3  3.0%
한강로동  3  3.0%
명동  3  3.0%
신당동  3  3.0%
Other values (52)  65  65.0%

Most occurring characters

Value  Count  Frequency (%)
(character lost)  100  24.2%
1  24  5.8%
·  21  5.1%
2  17  4.1%
(character lost)  16  3.9%
(character lost)  16  3.9%
3  10  2.4%
4  9  2.2%
(character lost)  9  2.2%
(character lost)  8  1.9%
Other values (76)  184  44.4%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  329  79.5%
Decimal Number  64  15.5%
Other Punctuation  21  5.1%

Most frequent character per category

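Other Letter

Value  Count  Frequency (%)
(character lost)  100  30.4%
(character lost)  16  4.9%
(character lost)  16  4.9%
(character lost)  9  2.7%
(character lost)  8  2.4%
(character lost)  6  1.8%
(character lost)  6  1.8%
(character lost)  6  1.8%
(character lost)  6  1.8%
(character lost)  5  1.5%
Other values (69)  151  45.9%

Decimal Number

Value  Count  Frequency (%)
1  24  37.5%
2  17  26.6%
3  10  15.6%
4  9  14.1%
5  2  3.1%
6  2  3.1%

Other Punctuation

Value  Count  Frequency (%)
·  21  100.0%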

Most occurring scripts

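Value  Count  Frequency (%)
Hangul  329  79.5%
Common  85  20.5%

Most frequent character per script

Hangul

Value  Count  Frequency (%)
(character lost)  100  30.4%
(character lost)  16  4.9%
(character lost)  16  4.9%
(character lost)  9  2.7%
(character lost)  8  2.4%
(character lost)  6  1.8%
(character lost)  6  1.8%
(character lost)  6  1.8%
(character lost)  6  1.8%
(character lost)  5  1.5%
Other values (69)  151  45.9%

Common

Value  Count  Frequency (%)
1  24  28.2%
·  21  24.7%
2  17  20.0%
3  10  11.8%
4  9  10.6%
5  2  2.4%
6  2  2.4%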

Most occurring blocks

Value  Count  Frequency (%)
Hangul  329  79.5%
ASCII  64  15.5%
None  21  5.1%

Most frequent character per block

Hangul

Value  Count  Frequency (%)
(character lost)  100  30.4%
(character lost)  16  4.9%
(character lost)  16  4.9%
(character lost)  9  2.7%
(character lost)  8  2.4%
(character lost)  6  1.8%
(character lost)  6  1.8%
(character lost)  6  1.8%
(character lost)  6  1.8%
(character lost)  5  1.5%
Other values (69)  151  45.9%

ASCII

Value  Count  Frequency (%)
1  24  37.5%
2  17  26.6%
3  10  15.6%
4  9  14.1%
5  2  3.1%
6  2  3.1%

None

Value  Count  Frequency (%)
·  21  100.0%

co
Real number (ℝ)

Distinct: 17
Distinct (%): 17.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.72
Minimum: 1
Maximum: 27
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 2.5
Q3: 7
95-th percentile: 14
Maximum: 27
Range: 26
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 5.0193162
Coefficient of variation (CV): 1.0634145
Kurtosis: 4.8114404
Mean: 4.72
Median Absolute Deviation (MAD): 1.5
Skewness: 2.0324481
Sum: 472
Variance: 25.193535
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=17)]

Common Values

Value  Count  Frequency (%)
1  30  30.0%
2  20  20.0%
4  10  10.0%
3  8  8.0%
11  5  5.0%
9  4  4.0%
8  4  4.0%
7  3  3.0%
6  3  3.0%
5  3  3.0%
Other values (7)  10  10.0%

Minimum Values

Value  Count  Frequency (%)
1  30  30.0%
2  20  20.0%
3  8  8.0%
4  10  10.0%
5  3  3.0%
6  3  3.0%
7  3  3.0%
8  4  4.0%
9  4  4.0%
10  2  2.0%

Maximum Values

Value  Count  Frequency (%)
27  1  1.0%
23  1  1.0%
19  1  1.0%
17  1  1.0%
14  3  3.0%
13  1  1.0%
11  5  5.0%
10  2  2.0%
9  4  4.0%
8  4  4.0%
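
The common-values table for co hides seven values behind "Other values (7)", but the lowest-values and highest-values listings together cover all 17 distinct values. The sketch below merges the two listings into one frequency table and checks it against the reported sum, median and variance. It is a reconstruction under that reading of the tables, not output from the profiling tool.

```python
import statistics

# Counts merged from the lowest-values and highest-values listings for co
counts = {
    1: 30, 2: 20, 3: 8, 4: 10, 5: 3, 6: 3, 7: 3, 8: 4, 9: 4, 10: 2,
    11: 5, 13: 1, 14: 3, 17: 1, 19: 1, 23: 1, 27: 1,
}

# Expand the table back into sorted observations
data = sorted(value for value, count in counts.items() for _ in range(count))

total = sum(data)
med = statistics.median(data)
var = statistics.variance(data)          # sample variance (n - 1)
cv = statistics.stdev(data) / statistics.mean(data)

print(len(data), len(counts), total)     # 100 17 472
print(med, round(var, 4), round(cv, 4))
```

The merged table accounts for all 100 rows and reproduces Sum = 472 and median = 2.5, which suggests nothing was lost between the two listings. A CV slightly above 1 reflects the long right tail: half the rows have a count of 1 or 2 while a few reach 17 to 27.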

Interactions

[Figures: 9 interaction plots]

Correlations

[Figure: correlation heatmap]

           cl     ctprvn_cd  ctprvn_nm  signgu_cd  signgu_nm  adstrd_cd  adstrd_nm  co
cl         1.000  0.528      0.528      0.553      0.343      0.379      0.000      0.000
ctprvn_cd  0.528  1.000      0.963      0.962      1.000      0.963      1.000      0.000
ctprvn_nm  0.528  0.963      1.000      0.962      1.000      0.963      1.000      0.000
signgu_cd  0.553  0.962      0.962      1.000      1.000      0.962      1.000      0.000
signgu_nm  0.343  1.000      1.000      1.000      1.000      1.000      1.000      0.000
adstrd_cd  0.379  0.963      0.963      0.962      1.000      1.000      1.000      0.000
adstrd_nm  0.000  1.000      1.000      1.000      1.000      1.000      1.000      0.000
co         0.000  0.000      0.000      0.000      0.000      0.000      0.000      1.000

[Figure: correlation heatmap]

           ctprvn_nm  signgu_nm  cl     ctprvn_cd
ctprvn_nm  1.000      0.974      0.385  0.826
signgu_nm  0.974      1.000      0.189  0.974
cl         0.385      0.189      1.000  0.385
ctprvn_cd  0.826      0.974      0.385  1.000
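
In the sample rows ctprvn_cd and ctprvn_nm appear to map one-to-one (11 with 서울특별시, 39 with 제주특별자치도), yet the matrix above reports 0.826 rather than 1.000 for the pair. One formula that reproduces 0.826 is the bias-corrected Cramér's V with a Yates continuity correction on the 2×2 table. The sketch below implements it with the standard library. That the report used exactly this computation is an assumption, and the function name is hypothetical.

```python
import math


def cramers_v_corrected(table, yates=True):
    """Bias-corrected Cramér's V for an r x k contingency table.

    With yates=True, a continuity correction is applied to 2x2 tables by
    shrinking each |observed - expected| by up to 0.5.
    """
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(r)) for j in range(k)]

    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_tot[i] * col_tot[j] / n
            diff = abs(table[i][j] - expected)
            if yates and r == 2 and k == 2:
                diff -= min(0.5, diff)
            chi2 += diff * diff / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Perfect one-to-one association with a 97/3 split
table = [[97, 0], [0, 3]]
print(round(cramers_v_corrected(table), 3))               # 0.826
print(round(cramers_v_corrected(table, yates=False), 3))  # 1.0
```

Under this reading, the gap between 0.826 and 1.000 comes from the continuity correction acting on a minority class of only three rows, not from any mismatch between the code and name columns.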

[Figure: correlation heatmap]

           signgu_cd  adstrd_cd  co     cl     ctprvn_cd  ctprvn_nm  signgu_nm
signgu_cd  1.000      0.984      0.023  0.385  0.826      0.826      0.974
adstrd_cd  0.984      1.000      0.022  0.385  0.826      0.826      0.974
co         0.023      0.022      1.000  0.000  0.000      0.000      0.000
cl         0.385      0.385      0.000  1.000  0.385      0.385      0.189
ctprvn_cd  0.826      0.826      0.000  0.385  1.000      0.826      0.974
ctprvn_nm  0.826      0.826      0.000  0.385  0.826      1.000      0.974
signgu_nm  0.974      0.974      0.000  0.189  0.974      0.974      1.000

Missing values

[Figure: nullity count by column] A simple visualization of nullity by column.
[Figure: nullity matrix] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

    cl  ctprvn_cd  ctprvn_nm  signgu_cd  signgu_nm  adstrd_cd  adstrd_nm  co
0  기타  11  서울특별시  11010  종로구  1101053  사직동  2
1  회화  39  제주특별자치도  39020  서귀포시  3902062  예래동  11
2  분수대  11  서울특별시  11010  종로구  1101053  사직동  1
3  조각  11  서울특별시  11010  종로구  1101053  사직동  7
4  조각  11  서울특별시  11010  종로구  1101056  평창동  4
5  조각  11  서울특별시  11010  종로구  1101057  무악동  1
6  조각  11  서울특별시  11010  종로구  1101058  교남동  6
7  미디어  39  제주특별자치도  39020  서귀포시  3902062  예래동  2
8  공예  11  서울특별시  11010  종로구  1101061  종로1·2·3·4가동  1
9  회화  11  서울특별시  11010  종로구  1101061  종로1·2·3·4가동  9

Last rows

    cl  ctprvn_cd  ctprvn_nm  signgu_cd  signgu_nm  adstrd_cd  adstrd_nm  co
90  조각  11  서울특별시  11050  광진구  1105066  자양3동  1
91  조각  11  서울특별시  11050  광진구  1105067  자양4동  1
92  벽화  11  서울특별시  11060  동대문구  1106071  회기동  2
93  조각  11  서울특별시  11060  동대문구  1106072  휘경1동  7
94  조각  11  서울특별시  11060  동대문구  1106073  휘경2동  3
95  조각  11  서울특별시  11060  동대문구  1106080  청량리동  4
96  조각  11  서울특별시  11060  동대문구  1106081  용신동  11
97  벽화  11  서울특별시  11060  동대문구  1106081  용신동  1
98  회화  11  서울특별시  11060  동대문구  1106081  용신동  1
99  조각  11  서울특별시  11060  동대문구  1106082  제기동  3
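
In every sample row the three code columns nest: the first five digits of adstrd_cd equal signgu_cd, and the first two equal ctprvn_cd. The sketch below derives the two coarser codes from adstrd_cd by integer division. The fixed 2/5/7 digit layout is inferred from the sample rows only, and the function name is hypothetical.

```python
def split_adstrd_cd(adstrd_cd: int) -> tuple[int, int]:
    """Derive (ctprvn_cd, signgu_cd) from a 7-digit adstrd_cd.

    Assumes the layout seen in the sample rows: 2 digits for the province,
    3 more for the district, 2 more for the neighbourhood.
    """
    signgu_cd = adstrd_cd // 100        # drop the last 2 digits
    ctprvn_cd = adstrd_cd // 100_000    # drop the last 5 digits
    return ctprvn_cd, signgu_cd


# (adstrd_cd, ctprvn_cd, signgu_cd) taken from sample rows 0, 1 and 99
rows = [(1101053, 11, 11010), (3902062, 39, 39020), (1106082, 11, 11060)]
for adstrd_cd, ctprvn_cd, signgu_cd in rows:
    assert split_adstrd_cd(adstrd_cd) == (ctprvn_cd, signgu_cd)

print(split_adstrd_cd(1101053))  # (11, 11010)
```

If the nesting holds across the whole dataset, the coarser codes carry no information beyond adstrd_cd, which would account for the High correlation alerts on the five location columns.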