Overview

Dataset statistics

Number of variables: 7
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.2 KiB
Average record size in memory: 63.3 B

Variable types

Categorical: 3
Numeric: 4

Alerts

pc20_cnt is highly correlated overall with pc40_cnt and 3 other fields (High correlation)
pc40_cnt is highly correlated overall with pc20_cnt and 3 other fields (High correlation)
pc60_cnt is highly correlated overall with pc20_cnt and 3 other fields (High correlation)
pc61_cnt is highly correlated overall with pc20_cnt and 3 other fields (High correlation)
gu_dc is highly correlated overall with pc20_cnt and 3 other fields (High correlation)
pc20_cnt has unique values (Unique)
pc40_cnt has unique values (Unique)
pc61_cnt has unique values (Unique)

Reproduction

Analysis started: 2023-12-10 09:57:20.613706
Analysis finished: 2023-12-10 09:57:26.124958
Duration: 5.51 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
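The reported duration can be cross-checked from the two timestamps. The snippet below is a minimal sketch using only the values shown in the Reproduction block; the variable names are illustrative and not part of the report.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block of the report.
started = datetime.fromisoformat("2023-12-10 09:57:20.613706")
finished = datetime.fromisoformat("2023-12-10 09:57:26.124958")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"Duration: {duration_s:.2f} seconds")
```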

Variables

base_year
Categorical

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Value counts: 2014 (61), 2015 (36), 2021 (3)

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2014
2nd row: 2021
3rd row: 2014
4th row: 2014
5th row: 2014

Common Values

Value  Count  Frequency (%)
2014   61     61.0%
2015   36     36.0%
2021   3      3.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

base_month
Categorical

Distinct: 4
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Value counts: 6 (34), 3 (30), 9 (20), 12 (16)

Length

Max length: 2
Median length: 1
Mean length: 1.16
Min length: 1
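
The length statistics follow directly from the value counts, because each category value is treated as a string. As a minimal sketch (the counts are copied from this report; the variable names are illustrative), the figures can be recomputed like this:

```python
import statistics

# Value counts for base_month, copied from the report.
counts = {"6": 34, "3": 30, "9": 20, "12": 16}

# One string length per observation: 84 values of length 1, 16 of length 2.
lengths = [len(value) for value, n in counts.items() for _ in range(n)]

mean_length = statistics.mean(lengths)
median_length = statistics.median(lengths)
min_length, max_length = min(lengths), max(lengths)

print(mean_length, median_length, min_length, max_length)
```

Only the 16 observations with value 12 have length 2, which is why the mean sits just above 1 while the median stays at 1.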

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 6
3rd row: 3
4th row: 3
5th row: 3

Common Values

Value  Count  Frequency (%)
6      34     34.0%
3      30     30.0%
9      20     20.0%
12     16     16.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

gu_dc
Categorical

HIGH CORRELATION 

Distinct: 16
Distinct (%): 16.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Top values: 강서구, 영도구, 기장군, 남구, 중구; other values (11): 65

Length

Max length: 4
Median length: 3
Mean length: 2.81
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강서구
2nd row: 영도구
3rd row: 기장군
4th row: 남구
5th row: 동구

Common Values

Value             Count  Frequency (%)
강서구            7      7.0%
영도구            7      7.0%
기장군            7      7.0%
남구              7      7.0%
중구              7      7.0%
동구              6      6.0%
동래구            6      6.0%
부산진구          6      6.0%
사상구            6      6.0%
사하구            6      6.0%
Other values (6)  35     35.0%

Length

[Figure: histogram of lengths of the category]

pc20_cnt
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 44666.1
Minimum: 7527
Maximum: 87265
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 7527
5th percentile: 10421.4
Q1: 21466.25
Median: 45574
Q3: 61847.75
95th percentile: 85312.05
Maximum: 87265
Range: 79738
Interquartile range (IQR): 40381.5

Descriptive statistics

Standard deviation: 24334.322
Coefficient of variation (CV): 0.54480517
Kurtosis: -1.2759038
Mean: 44666.1
Median Absolute Deviation (MAD): 22543
Skewness: 0.10105551
Sum: 4466610
Variance: 5.9215923 × 10^8
Monotonicity: Not monotonic
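
Several of these figures are derived from others in the same block: the range is maximum minus minimum, the IQR is Q3 minus Q1, the CV is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below recomputes them from the rounded values printed in this report, so small differences in the last digits are expected; the variable names are illustrative.

```python
# Values copied from the pc20_cnt statistics in the report (already rounded).
minimum, maximum = 7527, 87265
q1, q3 = 21466.25, 61847.75
mean, std = 44666.1, 24334.322

value_range = maximum - minimum   # spread between the extremes
iqr = q3 - q1                     # spread of the middle 50% of values
cv = std / mean                   # relative dispersion (unitless)
variance = std ** 2               # squared standard deviation

print(f"Range: {value_range}")
print(f"IQR: {iqr}")
print(f"CV: {cv:.6f}")
print(f"Variance: {variance:.4e}")
```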
[Figure: histogram with fixed-size bins (bins=50)]

Common values

Value              Count  Frequency (%)
11584              1      1.0%
12782              1      1.0%
24739              1      1.0%
76411              1      1.0%
59658              1      1.0%
68462              1      1.0%
85294              1      1.0%
59361              1      1.0%
22200              1      1.0%
62562              1      1.0%
Other values (90)  90     90.0%

Minimum 10 values

Value  Count  Frequency (%)
7527   1      1.0%
8911   1      1.0%
8940   1      1.0%
10316  1      1.0%
10353  1      1.0%
10425  1      1.0%
10440  1      1.0%
11584  1      1.0%
11596  1      1.0%
12521  1      1.0%

Maximum 10 values

Value  Count  Frequency (%)
87265  1      1.0%
87165  1      1.0%
86485  1      1.0%
85821  1      1.0%
85655  1      1.0%
85294  1      1.0%
84404  1      1.0%
84251  1      1.0%
81448  1      1.0%
76527  1      1.0%

pc40_cnt
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 86445.61
Minimum: 14944
Maximum: 191813
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 14944
5th percentile: 17978.45
Q1: 41006.25
Median: 90292
Q3: 117666.5
95th percentile: 165854.35
Maximum: 191813
Range: 176869
Interquartile range (IQR): 76660.25

Descriptive statistics

Standard deviation: 48524.166
Coefficient of variation (CV): 0.56132598
Kurtosis: -1.0620804
Mean: 86445.61
Median Absolute Deviation (MAD): 42495.5
Skewness: 0.26021482
Sum: 8644561
Variance: 2.3545947 × 10^9
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=50)]

Common values

Value              Count  Frequency (%)
24508              1      1.0%
30363              1      1.0%
41384              1      1.0%
151146             1      1.0%
111813             1      1.0%
131295             1      1.0%
161520             1      1.0%
117833             1      1.0%
36375              1      1.0%
110991             1      1.0%
Other values (90)  90     90.0%

Minimum 10 values

Value  Count  Frequency (%)
14944  1      1.0%
15163  1      1.0%
17812  1      1.0%
17835  1      1.0%
17930  1      1.0%
17981  1      1.0%
23160  1      1.0%
24508  1      1.0%
25177  1      1.0%
28215  1      1.0%

Maximum 10 values

Value   Count  Frequency (%)
191813  1      1.0%
178673  1      1.0%
178594  1      1.0%
177745  1      1.0%
176216  1      1.0%
165309  1      1.0%
161550  1      1.0%
161520  1      1.0%
160570  1      1.0%
160544  1      1.0%

pc60_cnt
Real number (ℝ)

HIGH CORRELATION 

Distinct: 99
Distinct (%): 99.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10254.25
Minimum: 1077
Maximum: 56080
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1077
5th percentile: 1221.7
Q1: 4178.5
Median: 9244
Q3: 14568.5
95th percentile: 26512.5
Maximum: 56080
Range: 55003
Interquartile range (IQR): 10390

Descriptive statistics

Standard deviation: 7845.0073
Coefficient of variation (CV): 0.76504935
Kurtosis: 10.966852
Mean: 10254.25
Median Absolute Deviation (MAD): 5204
Skewness: 2.3777746
Sum: 1025425
Variance: 61544139
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=50)]

Common values

Value              Count  Frequency (%)
4237               2      2.0%
3123               1      1.0%
27566              1      1.0%
15113              1      1.0%
9145               1      1.0%
14692              1      1.0%
17028              1      1.0%
14961              1      1.0%
2581               1      1.0%
12171              1      1.0%
Other values (89)  89     89.0%

Minimum 10 values

Value  Count  Frequency (%)
1077   1      1.0%
1096   1      1.0%
1159   1      1.0%
1167   1      1.0%
1216   1      1.0%
1222   1      1.0%
2216   1      1.0%
2231   1      1.0%
2574   1      1.0%
2578   1      1.0%

Maximum 10 values

Value  Count  Frequency (%)
56080  1      1.0%
27921  1      1.0%
27765  1      1.0%
27566  1      1.0%
27187  1      1.0%
26477  1      1.0%
17174  1      1.0%
17094  1      1.0%
17049  1      1.0%
17028  1      1.0%

pc61_cnt
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3908.06
Minimum: 260
Maximum: 17171
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 260
5th percentile: 314.35
Q1: 1327.5
Median: 3845.5
Q3: 5033.5
95th percentile: 10499.95
Maximum: 17171
Range: 16911
Interquartile range (IQR): 3706

Descriptive statistics

Standard deviation: 3481.2118
Coefficient of variation (CV): 0.89077747
Kurtosis: 5.5608594
Mean: 3908.06
Median Absolute Deviation (MAD): 1956
Skewness: 2.1092826
Sum: 390806
Variance: 12118835
Monotonicity: Not monotonic
[Figure: histogram with fixed-size bins (bins=50)]

Common values

Value              Count  Frequency (%)
1244               1      1.0%
1757               1      1.0%
1363               1      1.0%
4452               1      1.0%
2253               1      1.0%
4926               1      1.0%
5969               1      1.0%
6410               1      1.0%
685                1      1.0%
4743               1      1.0%
Other values (90)  90     90.0%

Minimum 10 values

Value  Count  Frequency (%)
260    1      1.0%
267    1      1.0%
291    1      1.0%
300    1      1.0%
302    1      1.0%
315    1      1.0%
400    1      1.0%
563    1      1.0%
583    1      1.0%
664    1      1.0%

Maximum 10 values

Value  Count  Frequency (%)
17171  1      1.0%
16763  1      1.0%
16440  1      1.0%
15557  1      1.0%
14413  1      1.0%
10294  1      1.0%
6593   1      1.0%
6410   1      1.0%
6234   1      1.0%
6161   1      1.0%

Interactions

[Figures: 16 pairwise interaction plots]

Correlations

[Figure: correlation heatmap]

            base_year  base_month  gu_dc  pc20_cnt  pc40_cnt  pc60_cnt  pc61_cnt
base_year   1.000      0.320       0.000  0.000     0.000     0.448     0.476
base_month  0.320      1.000       0.000  0.000     0.000     0.000     0.000
gu_dc       0.000      0.000       1.000  0.964     0.958     0.929     0.907
pc20_cnt    0.000      0.000       0.964  1.000     0.982     0.952     0.763
pc40_cnt    0.000      0.000       0.958  0.982     1.000     0.964     0.815
pc60_cnt    0.448      0.000       0.929  0.952     0.964     1.000     0.920
pc61_cnt    0.476      0.000       0.907  0.763     0.815     0.920     1.000

[Figure: correlation heatmap]

            gu_dc  base_year  base_month
gu_dc       1.000  0.000      0.000
base_year   0.000  1.000      0.306
base_month  0.000  0.306      1.000

[Figure: correlation heatmap]

            pc20_cnt  pc40_cnt  pc60_cnt  pc61_cnt  base_year  base_month  gu_dc
pc20_cnt    1.000     0.971     0.908     0.828     0.000      0.000       0.809
pc40_cnt    0.971     1.000     0.956     0.861     0.000      0.000       0.786
pc60_cnt    0.908     0.956     1.000     0.934     0.375      0.000       0.748
pc61_cnt    0.828     0.861     0.934     1.000     0.356      0.000       0.685
base_year   0.000     0.000     0.375     0.356     1.000      0.306       0.000
base_month  0.000     0.000     0.000     0.000     0.306      1.000       0.000
gu_dc       0.809     0.786     0.748     0.685     0.000      0.000       1.000
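
The high-correlation alerts can be related back to the first correlation matrix in this section. The sketch below lists, for each variable, the other variables whose coefficient meets a cutoff. The cutoff of 0.5 is an assumption chosen because it reproduces the five alerts in this report (each flagged variable paired with four others); it is not read from the report's configuration, and the helper name `highly_correlated` is illustrative.

```python
# Coefficients copied from the first matrix in the Correlations section.
names = ["base_year", "base_month", "gu_dc",
         "pc20_cnt", "pc40_cnt", "pc60_cnt", "pc61_cnt"]
matrix = [
    [1.000, 0.320, 0.000, 0.000, 0.000, 0.448, 0.476],
    [0.320, 1.000, 0.000, 0.000, 0.000, 0.000, 0.000],
    [0.000, 0.000, 1.000, 0.964, 0.958, 0.929, 0.907],
    [0.000, 0.000, 0.964, 1.000, 0.982, 0.952, 0.763],
    [0.000, 0.000, 0.958, 0.982, 1.000, 0.964, 0.815],
    [0.448, 0.000, 0.929, 0.952, 0.964, 1.000, 0.920],
    [0.476, 0.000, 0.907, 0.763, 0.815, 0.920, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, not taken from the report


def highly_correlated(names, matrix, threshold):
    """Map each variable to the other variables at or above the threshold."""
    flagged = {}
    for i, name in enumerate(names):
        partners = [names[j] for j, value in enumerate(matrix[i])
                    if j != i and value >= threshold]
        if partners:
            flagged[name] = partners
    return flagged


flagged = highly_correlated(names, matrix, THRESHOLD)
for name, partners in flagged.items():
    print(f"{name}: {', '.join(partners)}")
```

Under this cutoff, base_year and base_month are not flagged: their largest coefficients (0.476 and 0.320) fall below 0.5.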

Missing values

[Figure: nullity by column]
A simple visualization of nullity by column.
[Figure: nullity matrix]
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

    base_year  base_month  gu_dc     pc20_cnt  pc40_cnt  pc60_cnt  pc61_cnt
0   2014       3           강서구    11584     24508     3123      1244
1   2021       6           영도구    18389     57602     11911     1316
2   2014       3           기장군    17196     36964     4786      1764
3   2014       3           남구      57488     102275    11552     4262
4   2014       3           동구      18074     29519     2231      563
5   2014       3           동래구    51258     102090    13852     5723
6   2014       3           부산진구  76527     144004    16102     5579
7   2021       6           중구      7527      23160     4054      400
8   2014       3           사상구    55228     103337    8775      2172
9   2014       3           사하구    72197     141813    14618     4258

Last rows

    base_year  base_month  gu_dc     pc20_cnt  pc40_cnt  pc60_cnt  pc61_cnt
90  2015       6           서구      24839     41431     3754      1380
91  2015       6           수영구    39498     76511     8432      3963
92  2015       6           연제구    45521     90209     11255     4909
93  2015       6           영도구    30821     52956     4237      1100
94  2015       6           중구      10440     17981     1167      300
95  2015       6           해운대구  87265     178673    27921     17171
96  2015       9           강서구    13564     32804     4685      2000
97  2015       9           금정구    59752     104824    13192     6096
98  2015       9           기장군    21499     48170     6447      2410
99  2015       9           남구      63410     110515    12486     5019