Overview

Dataset statistics

Number of variables: 8
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.7 KiB
Average record size in memory: 68.3 B

Variable types

Numeric: 3
Categorical: 5

Alerts

시도명 has constant value "서울특별시" (Constant)
시군구명 is highly overall correlated with 행정동명 (High correlation)
행정동명 is highly overall correlated with 행정동코드 and 1 other field (High correlation)
행정동코드 is highly overall correlated with 행정동명 (High correlation)
성별 is highly overall correlated with 연령대 (High correlation)
연령대 is highly overall correlated with 성별 (High correlation)
시군구명 is highly imbalanced (91.9%) (Imbalance)
성별 is highly imbalanced (50.8%) (Imbalance)
연령대 is highly imbalanced (59.4%) (Imbalance)
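
The imbalance percentages in these alerts are consistent with a score of one minus the normalized Shannon entropy of the category counts. That formula is an assumption checked against this report's numbers, not something the report states, and the helper name `imbalance_score` is hypothetical. A minimal sketch using the category counts listed in the Variables section:

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of category counts (assumed formula)."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # A single class has no meaningful imbalance; 0.0 is a convention chosen here.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Category counts copied from the report.
counts_by_column = {
    "시군구명": [99, 1],
    "성별": [80, 19, 1],
    "연령대": [80, 5, 3, 3, 3, 2, 2, 1, 1],
}

scores = {
    name: round(100 * imbalance_score(counts), 1)
    for name, counts in counts_by_column.items()
}
for name, pct in scores.items():
    print(f"{name}: {pct}%")
```

Under this assumption the three scores round to the percentages shown in the alerts.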

Reproduction

Analysis started: 2023-12-10 11:23:13.279366
Analysis finished: 2023-12-10 11:23:16.268495
Duration: 2.99 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

행정동코드
Real number (ℝ)

HIGH CORRELATION 

Distinct: 6
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.1110871 × 10^9
Minimum: 1.1110515 × 10^9
Maximum: 1.114054 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1.1110515 × 10^9
5-th percentile: 1.1110515 × 10^9
Q1: 1.111053 × 10^9
Median: 1.1110555 × 10^9
Q3: 1.1110615 × 10^9
95-th percentile: 1.1110615 × 10^9
Maximum: 1.114054 × 10^9
Range: 3002500
Interquartile range (IQR): 8500

Descriptive statistics

Standard deviation: 299721
Coefficient of variation (CV): 0.00026975474
Kurtosis: 99.957919
Mean: 1.1110871 × 10^9
Median Absolute Deviation (MAD): 4000
Skewness: 9.9968749
Sum: 1.1110871 × 10^11
Variance: 8.983268 × 10^10
Monotonicity: Increasing

[Figure: Histogram with fixed size bins (bins=6)]

Common values

Value  Count  Frequency (%)
1111061500  47  47.0%
1111053000  29  29.0%
1111051500  12  12.0%
1111055000  9  9.0%
1111056000  2  2.0%
1114054000  1  1.0%

Minimum values

Value  Count  Frequency (%)
1111051500  12  12.0%
1111053000  29  29.0%
1111055000  9  9.0%
1111056000  2  2.0%
1111061500  47  47.0%
1114054000  1  1.0%

Maximum values

Value  Count  Frequency (%)
1114054000  1  1.0%
1111061500  47  47.0%
1111056000  2  2.0%
1111055000  9  9.0%
1111053000  29  29.0%
1111051500  12  12.0%
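
The counts for 행정동코드 match the counts reported for 행정동명 one for one (47, 29, 12, 9, 2, 1), which suggests the code behaves like an identifier rather than a measurement. The sketch below groups the observed codes by their first five digits. It assumes, and the report does not state this, that those digits identify the 시도 and 시군구; the variable names are illustrative.

```python
from collections import Counter

# Value counts copied from the 행정동코드 frequency table.
code_counts = {
    1111061500: 47,
    1111053000: 29,
    1111051500: 12,
    1111055000: 9,
    1111056000: 2,
    1114054000: 1,
}

# Assumption: the first five digits of the 10-digit code identify 시도 + 시군구.
prefix_counts = Counter()
for code, count in code_counts.items():
    prefix_counts[str(code)[:5]] += count

for prefix, count in sorted(prefix_counts.items()):
    print(prefix, count)
```

The two groups hold 99 and 1 rows, the same split the report shows for 종로구 and 중구 in 시군구명.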

시도명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

서울특별시: 100

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 서울특별시
3rd row: 서울특별시
4th row: 서울특별시
5th row: 서울특별시

Common Values

Value  Count  Frequency (%)
서울특별시  100  100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

시군구명
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

종로구: 99
중구: 1

Length

Max length: 3
Median length: 3
Mean length: 2.99
Min length: 2

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 종로구
2nd row: 종로구
3rd row: 종로구
4th row: 종로구
5th row: 종로구

Common Values

Value  Count  Frequency (%)
종로구  99  99.0%
중구  1  1.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

행정동명
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

종로1.2.3.4가동: 47
사직동: 29
청운효자동: 12
부암동: 9
평창동: 2

Length

Max length: 11
Median length: 5
Mean length: 7
Min length: 3

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 청운효자동
2nd row: 청운효자동
3rd row: 청운효자동
4th row: 청운효자동
5th row: 청운효자동

Common Values

Value  Count  Frequency (%)
종로1.2.3.4가동  47  47.0%
사직동  29  29.0%
청운효자동  12  12.0%
부암동  9  9.0%
평창동  2  2.0%
회현동  1  1.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

기준일자
Real number (ℝ)

Distinct: 54
Distinct (%): 54.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20190920
Minimum: 20190805
Maximum: 20191031
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 20190805
5-th percentile: 20190807
Q1: 20190824
Median: 20190918
Q3: 20191012
95-th percentile: 20191028
Maximum: 20191031
Range: 226
Interquartile range (IQR): 188

Descriptive statistics

Standard deviation: 86.489717
Coefficient of variation (CV): 4.2835947 × 10^-6
Kurtosis: -1.6549778
Mean: 20190920
Median Absolute Deviation (MAD): 94.5
Skewness: -0.041679752
Sum: 2.019092 × 10^9
Variance: 7480.4711
Monotonicity: Not monotonic
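
The values of 기준일자 look like dates written as YYYYMMDD integers, so numeric statistics such as Range 226 and IQR 188 measure differences between integers, not elapsed days. Treating the column as dates is an assumption, and the helper name `parse_yyyymmdd` is hypothetical. A short sketch that converts the reported minimum, quartiles, and maximum to dates:

```python
from datetime import date, datetime


def parse_yyyymmdd(value: int) -> date:
    """Interpret an integer such as 20190805 as the date 2019-08-05."""
    return datetime.strptime(str(value), "%Y%m%d").date()


# Quantile values copied from the report.
minimum = parse_yyyymmdd(20190805)
q1 = parse_yyyymmdd(20190824)
q3 = parse_yyyymmdd(20191012)
maximum = parse_yyyymmdd(20191031)

range_days = (maximum - minimum).days
iqr_days = (q3 - q1).days
print(f"range: {range_days} days, IQR: {iqr_days} days")
```

Read as dates, the observations span 87 days with an interquartile range of 49 days.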

[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value  Count  Frequency (%)
20191011  5  5.0%
20190819  4  4.0%
20190805  4  4.0%
20191014  3  3.0%
20190829  3  3.0%
20190821  3  3.0%
20190909  3  3.0%
20191015  3  3.0%
20190918  3  3.0%
20191017  3  3.0%
Other values (44)  66  66.0%

Minimum values

Value  Count  Frequency (%)
20190805  4  4.0%
20190806  1  1.0%
20190807  1  1.0%
20190808  1  1.0%
20190809  2  2.0%
20190812  2  2.0%
20190813  1  1.0%
20190814  2  2.0%
20190819  4  4.0%
20190820  1  1.0%

Maximum values

Value  Count  Frequency (%)
20191031  2  2.0%
20191030  2  2.0%
20191029  1  1.0%
20191028  2  2.0%
20191023  2  2.0%
20191022  1  1.0%
20191021  2  2.0%
20191018  2  2.0%
20191017  3  3.0%
20191016  2  2.0%

성별
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

X: 80
M: 19
F: 1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: M
2nd row: M
3rd row: X
4th row: M
5th row: X

Common Values

Value  Count  Frequency (%)
X  80  80.0%
M  19  19.0%
F  1  1.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

연령대
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 9
Distinct (%): 9.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

xx: 80
55: 5
50: 3
30: 3
35: 3
Other values (4): 6

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 2
Unique (%): 2.0%

Sample

1st row: 55
2nd row: 50
3rd row: xx
4th row: 55
5th row: xx

Common Values

Value  Count  Frequency (%)
xx  80  80.0%
55  5  5.0%
50  3  3.0%
30  3  3.0%
35  3  3.0%
40  2  2.0%
45  2  2.0%
70  1  1.0%
65  1  1.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

소비인구(명)
Real number (ℝ)

Distinct: 7
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.57
Minimum: 22
Maximum: 66
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 22
5-th percentile: 22
Q1: 22
Median: 22
Q3: 30
95-th percentile: 44.4
Maximum: 66
Range: 44
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 9.8516829
Coefficient of variation (CV): 0.34482614
Kurtosis: 3.0479688
Mean: 28.57
Median Absolute Deviation (MAD): 0
Skewness: 1.7833744
Sum: 2857
Variance: 97.055657
Monotonicity: Not monotonic
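
Several of these descriptive statistics can be recomputed from the value counts of 소비인구(명) given in the report's frequency table. The sketch below assumes the variance uses the sample (n - 1) denominator, which agrees with the reported figures; the variable names are illustrative.

```python
import math
import statistics

# Value counts copied from the 소비인구(명) frequency table.
value_counts = {22: 57, 30: 21, 37: 10, 44: 7, 52: 1, 59: 3, 66: 1}

# Expand the counts back into one value per observation.
values = [value for value, count in value_counts.items() for _ in range(count)]

n = len(values)
total = sum(values)
mean = total / n
variance = sum((x - mean) ** 2 for x in values) / (n - 1)  # sample variance (assumed)
std = math.sqrt(variance)
cv = std / mean
median = statistics.median(values)
mad = statistics.median([abs(x - median) for x in values])

print(f"n={n} sum={total} mean={mean:.2f}")
print(f"variance={variance:.6f} std={std:.7f} cv={cv:.8f}")
print(f"median={median} MAD={mad}")
```

Because 57 of the 100 observations equal the median of 22, more than half of the absolute deviations are zero, which is why the MAD is 0 even though the standard deviation is close to 10.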

[Figure: Histogram with fixed size bins (bins=7)]

Common values

Value  Count  Frequency (%)
22  57  57.0%
30  21  21.0%
37  10  10.0%
44  7  7.0%
59  3  3.0%
66  1  1.0%
52  1  1.0%

Minimum values

Value  Count  Frequency (%)
22  57  57.0%
30  21  21.0%
37  10  10.0%
44  7  7.0%
52  1  1.0%
59  3  3.0%
66  1  1.0%

Maximum values

Value  Count  Frequency (%)
66  1  1.0%
59  3  3.0%
52  1  1.0%
44  7  7.0%
37  10  10.0%
30  21  21.0%
22  57  57.0%

Interactions

[Figures: 9 interaction plots (images not included)]

Correlations

[Figure: correlation heatmap]

(variable)  행정동코드  시군구명  행정동명  기준일자  성별  연령대  소비인구(명)
행정동코드  1.000  0.691  1.000  0.446  0.000  0.000  0.000
시군구명  0.691  1.000  1.000  0.463  0.000  0.000  0.000
행정동명  1.000  1.000  1.000  0.000  0.297  0.330  0.000
기준일자  0.446  0.463  0.000  1.000  0.378  0.000  0.000
성별  0.000  0.000  0.297  0.378  1.000  0.968  0.183
연령대  0.000  0.000  0.330  0.000  0.968  1.000  0.000
소비인구(명)  0.000  0.000  0.000  0.000  0.183  0.000  1.000

[Figure: correlation heatmap]

(variable)  연령대  시군구명  행정동명  성별
연령대  1.000  0.000  0.165  0.762
시군구명  0.000  1.000  0.979  0.000
행정동명  0.165  0.979  1.000  0.125
성별  0.762  0.000  0.125  1.000
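
The 0.979 between 시군구명 and 행정동명 in this matrix can be reproduced from the value counts alone. The sketch assumes the matrix holds Cramér's V with the Bergsma bias correction (the report text does not name the statistic here) and that every 행정동 belongs to exactly one 시군구, as the sample rows suggest. The function name `cramers_v_corrected` is illustrative.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic without continuity correction.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


# Rows: 시군구명. Columns: 종로1.2.3.4가동, 사직동, 청운효자동, 부암동, 평창동, 회현동.
table = [
    [47, 29, 12, 9, 2, 0],  # 종로구
    [0, 0, 0, 0, 0, 1],     # 중구 (assumed to contain only 회현동)
]

v = cramers_v_corrected(table)
print(round(v, 3))
```

Without the bias correction the same table would give exactly 1, since 행정동명 fully determines 시군구명; the correction accounts for the gap between 1 and 0.979 with only 100 rows.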

[Figure: correlation heatmap]

(variable)  행정동코드  기준일자  소비인구(명)  시군구명  행정동명  성별  연령대
행정동코드  1.000  -0.064  0.340  0.487  0.979  0.000  0.000
기준일자  -0.064  1.000  0.103  0.330  0.000  0.179  0.000
소비인구(명)  0.340  0.103  1.000  0.000  0.000  0.119  0.000
시군구명  0.487  0.330  0.000  1.000  0.979  0.000  0.000
행정동명  0.979  0.000  0.000  0.979  1.000  0.125  0.165
성별  0.000  0.179  0.119  0.000  0.125  1.000  0.762
연령대  0.000  0.000  0.000  0.000  0.165  0.762  1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

First rows

(index)  행정동코드  시도명  시군구명  행정동명  기준일자  성별  연령대  소비인구(명)
0  1111051500  서울특별시  종로구  청운효자동  20190819  M  55  30
1  1111051500  서울특별시  종로구  청운효자동  20190826  M  50  22
2  1111051500  서울특별시  종로구  청운효자동  20190916  X  xx  22
3  1111051500  서울특별시  종로구  청운효자동  20191028  M  55  22
4  1111051500  서울특별시  종로구  청운효자동  20190904  X  xx  22
5  1111051500  서울특별시  종로구  청운효자동  20191031  X  xx  22
6  1111051500  서울특별시  종로구  청운효자동  20190919  X  xx  22
7  1111051500  서울특별시  종로구  청운효자동  20191017  X  xx  30
8  1111051500  서울특별시  종로구  청운효자동  20191014  X  xx  22
9  1111051500  서울특별시  종로구  청운효자동  20191015  X  xx  30

Last rows

(index)  행정동코드  시도명  시군구명  행정동명  기준일자  성별  연령대  소비인구(명)
90  1111061500  서울특별시  종로구  종로1.2.3.4가동  20190830  X  xx  22
91  1111061500  서울특별시  종로구  종로1.2.3.4가동  20190829  X  xx  22
92  1111061500  서울특별시  종로구  종로1.2.3.4가동  20190828  X  xx  44
93  1111061500  서울특별시  종로구  종로1.2.3.4가동  20191021  X  xx  44
94  1111061500  서울특별시  종로구  종로1.2.3.4가동  20190826  X  xx  22
95  1111061500  서울특별시  종로구  종로1.2.3.4가동  20190823  X  xx  59
96  1111061500  서울특별시  종로구  종로1.2.3.4가동  20190822  X  xx  30
97  1111061500  서울특별시  종로구  종로1.2.3.4가동  20190821  X  xx  30
98  1111061500  서울특별시  종로구  종로1.2.3.4가동  20190821  M  30  22
99  1114054000  서울특별시  중구  회현동  20190829  X  xx  22