Overview

Dataset statistics

Number of variables: 8
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.8 KiB
Average record size in memory: 69.3 B

Variable types

Numeric: 4
Categorical: 4

Alerts

시도명 has constant value "서울특별시" [Constant]
행정동명 is highly overall correlated with 행정동코드 and 1 other field [High correlation]
시군구명 is highly overall correlated with 행정동코드 and 1 other field [High correlation]
행정동코드 is highly overall correlated with 시군구명 and 1 other field [High correlation]
시군구명 is highly imbalanced (56.7%) [Imbalance]
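
The 56.7% imbalance figure for 시군구명 can be approximately reproduced from the category counts this report lists for that variable (종로구 87, 중구 7, 용산구 6). The sketch below assumes the score is one minus the normalized Shannon entropy of the class distribution. That definition is an assumption about how the profiler derives the number, and the function name `imbalance_score` is hypothetical.

```python
from math import log2

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class distribution and k is the number of classes.
    0 means perfectly balanced; values near 1 mean one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class is not scored as imbalanced
    entropy = -sum((c / total) * log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / log2(k)

# Category counts for 시군구명, taken from this report.
sigungu_counts = [87, 7, 6]
score = imbalance_score(sigungu_counts)
print(f"{score:.1%}")
```

Under that assumption the score comes out at roughly 0.567, in line with the alert. A perfectly even split such as `[50, 50]` scores 0.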

Reproduction

Analysis started: 2023-12-10 13:45:08.084475
Analysis finished: 2023-12-10 13:45:12.705144
Duration: 4.62 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

행정동코드
Real number (ℝ)

HIGH CORRELATION 

Distinct: 12
Distinct (%): 12.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.1116291 × 10^9
Minimum: 1.111053 × 10^9
Maximum: 1.117063 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1.111053 × 10^9
5-th percentile: 1.111053 × 10^9
Q1: 1.111057 × 10^9
Median: 1.1110615 × 10^9
Q3: 1.1110615 × 10^9
95-th percentile: 1.1170511 × 10^9
Maximum: 1.117063 × 10^9
Range: 6010000
Interquartile range (IQR): 4500

Descriptive statistics

Standard deviation: 1578553.2
Coefficient of variation (CV): 0.0014200358
Kurtosis: 6.3600086
Mean: 1.1116291 × 10^9
Median Absolute Deviation (MAD): 0
Skewness: 2.738921
Sum: 1.1116291 × 10^11
Variance: 2.4918301 × 10^12
Monotonicity: Increasing
Histogram with fixed size bins (bins=12)

Common values

Value             Count  Frequency (%)
1111061500        52     52.0%
1111053000        22     22.0%
1111057000        9      9.0%
1114064500        7      7.0%
1117063000        3      3.0%
1111055000        1      1.0%
1111056000        1      1.0%
1111058000        1      1.0%
1111065000        1      1.0%
1117051000        1      1.0%
Other values (2)  2      2.0%

Minimum values

Value             Count  Frequency (%)
1111053000        22     22.0%
1111055000        1      1.0%
1111056000        1      1.0%
1111057000        9      9.0%
1111058000        1      1.0%
1111061500        52     52.0%
1111065000        1      1.0%
1114064500        7      7.0%
1117051000        1      1.0%
1117053000        1      1.0%

Maximum values

Value             Count  Frequency (%)
1117063000        3      3.0%
1117058000        1      1.0%
1117053000        1      1.0%
1117051000        1      1.0%
1114064500        7      7.0%
1111065000        1      1.0%
1111061500        52     52.0%
1111058000        1      1.0%
1111057000        9      9.0%
1111056000        1      1.0%

시도명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

서울특별시: 100

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 서울특별시
3rd row: 서울특별시
4th row: 서울특별시
5th row: 서울특별시

Common Values

Value       Count  Frequency (%)
서울특별시  100    100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value       Count  Frequency (%)
서울특별시  100    100.0%

시군구명
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

종로구: 87
중구: 7
용산구: 6

Length

Max length: 3
Median length: 3
Mean length: 2.93
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 종로구
2nd row: 종로구
3rd row: 종로구
4th row: 종로구
5th row: 종로구

Common Values

Value   Count  Frequency (%)
종로구  87     87.0%
중구    7      7.0%
용산구  6      6.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
종로구  87     87.0%
중구    7      7.0%
용산구  6      6.0%

행정동명
Categorical

HIGH CORRELATION 

Distinct: 12
Distinct (%): 12.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

종로1.2.3.4가동: 52
사직동: 22
무악동: 9
청구동: 7
이촌1동: 3
Other values (7): 7

Length

Max length: 11
Median length: 11
Mean length: 7.19
Min length: 3
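
The length statistics above follow directly from the value counts of 행정동명. The sketch below recomputes them with the standard library. The two names grouped as "Other values (2)" in the Common Values table are filled in as 남영동 and 효창동, an assumption read off the sample rows at the end of this report.

```python
import unicodedata
from statistics import mean, median

# Value counts for 행정동명. The ten named rows come from the Common Values
# table; 남영동 and 효창동 are an assumption taken from the Sample rows to
# fill in the two entries the table lumps together as "Other values (2)".
dong_counts = {
    "종로1.2.3.4가동": 52, "사직동": 22, "무악동": 9, "청구동": 7, "이촌1동": 3,
    "부암동": 1, "평창동": 1, "교남동": 1, "혜화동": 1, "후암동": 1,
    "남영동": 1, "효창동": 1,
}

def text_length(name: str) -> int:
    """Length in Unicode code points after NFC normalization, so that each
    Hangul syllable counts as one character."""
    return len(unicodedata.normalize("NFC", name))

# Expand to one length per observation (100 observations in total).
lengths = [text_length(name) for name, n in dong_counts.items() for _ in range(n)]

print(max(lengths), median(lengths), mean(lengths), min(lengths))
```

The median of 11 reflects that 종로1.2.3.4가동 alone accounts for 52 of the 100 rows, while the mean is pulled down by the three-character names.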

Unique

Unique: 7
Unique (%): 7.0%

Sample

1st row: 사직동
2nd row: 사직동
3rd row: 사직동
4th row: 사직동
5th row: 사직동

Common Values

Value             Count  Frequency (%)
종로1.2.3.4가동   52     52.0%
사직동            22     22.0%
무악동            9      9.0%
청구동            7      7.0%
이촌1동           3      3.0%
부암동            1      1.0%
평창동            1      1.0%
교남동            1      1.0%
혜화동            1      1.0%
후암동            1      1.0%
Other values (2)  2      2.0%

Length

Histogram of lengths of the category

Value             Count  Frequency (%)
종로1.2.3.4가동   52     52.0%
사직동            22     22.0%
무악동            9      9.0%
청구동            7      7.0%
이촌1동           3      3.0%
부암동            1      1.0%
평창동            1      1.0%
교남동            1      1.0%
혜화동            1      1.0%
후암동            1      1.0%
Other values (2)  2      2.0%

기준일자
Real number (ℝ)

Distinct: 53
Distinct (%): 53.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20190918
Minimum: 20190801
Maximum: 20191031
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 20190801
5-th percentile: 20190809
Q1: 20190823
Median: 20190921
Q3: 20191011
95-th percentile: 20191024
Maximum: 20191031
Range: 230
Interquartile range (IQR): 188.5

Descriptive statistics

Standard deviation: 83.121576
Coefficient of variation (CV): 4.1167803 × 10^-6
Kurtosis: -1.5386196
Mean: 20190918
Median Absolute Deviation (MAD): 92.5
Skewness: -0.064344993
Sum: 2.0190918 × 10^9
Variance: 6909.1964
Monotonicity: Not monotonic
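
Because 기준일자 was profiled as a real number, figures such as Range (230) and IQR (188.5) are differences between integers like 20191031 and 20190801, not day counts. The sketch below assumes the values encode calendar dates as YYYYMMDD, which the sample rows suggest but the report does not state. It compares the numeric range with the elapsed days between the reported minimum and maximum. The helper name `parse_yyyymmdd` is hypothetical.

```python
from datetime import date, datetime

def parse_yyyymmdd(value: int) -> date:
    """Interpret an integer such as 20190801 as the date 2019-08-01."""
    return datetime.strptime(str(value), "%Y%m%d").date()

# Minimum and maximum of 기준일자 as reported in the statistics.
minimum, maximum = 20190801, 20191031

numeric_range = maximum - minimum  # what the profiler reports as Range
calendar_span = (parse_yyyymmdd(maximum) - parse_yyyymmdd(minimum)).days

print(numeric_range, calendar_span)
```

If the YYYYMMDD reading holds, the column spans 91 calendar days, so the mean, variance, and quantiles in this block are better recomputed after converting the column to a date type.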
Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
20191014           8      8.0%
20190814           7      7.0%
20190902           5      5.0%
20191001           4      4.0%
20190925           4      4.0%
20190908           3      3.0%
20190831           3      3.0%
20191013           3      3.0%
20191024           3      3.0%
20190927           3      3.0%
Other values (43)  57     57.0%

Minimum values

Value     Count  Frequency (%)
20190801  1      1.0%
20190805  1      1.0%
20190806  2      2.0%
20190807  1      1.0%
20190809  1      1.0%
20190810  2      2.0%
20190811  2      2.0%
20190812  1      1.0%
20190814  7      7.0%
20190815  2      2.0%

Maximum values

Value     Count  Frequency (%)
20191031  1      1.0%
20191029  2      2.0%
20191028  1      1.0%
20191024  3      3.0%
20191022  1      1.0%
20191020  1      1.0%
20191016  2      2.0%
20191015  2      2.0%
20191014  8      8.0%
20191013  3      3.0%

성별
Categorical

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

F: 57
M: 43

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: M
2nd row: F
3rd row: M
4th row: F
5th row: M

Common Values

Value  Count  Frequency (%)
F      57     57.0%
M      43     43.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
f      57     57.0%
m      43     43.0%

연령대
Real number (ℝ)

Distinct: 7
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 41.5
Minimum: 20
Maximum: 55
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 20
5-th percentile: 20
Q1: 40
Median: 45
Q3: 50
95-th percentile: 50
Maximum: 55
Range: 35
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 10.624443
Coefficient of variation (CV): 0.25601067
Kurtosis: 0.25828861
Mean: 41.5
Median Absolute Deviation (MAD): 5
Skewness: -1.2859509
Sum: 4150
Variance: 112.87879
Monotonicity: Not monotonic
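
Several of the descriptive statistics for 연령대 can be checked from the value counts alone. The sketch below rebuilds the 100 observations from the counts listed in this report and recomputes the mean, variance, standard deviation, coefficient of variation, and median absolute deviation. It assumes the profiler uses the sample (n - 1) variance, which is what the reported numbers appear to match.

```python
from statistics import mean, median, stdev, variance

# Value counts for 연령대 as listed in this report.
age_counts = {45: 47, 50: 22, 20: 17, 40: 8, 55: 4, 30: 1, 25: 1}
ages = [value for value, n in age_counts.items() for _ in range(n)]

avg = mean(ages)
var = variance(ages)   # sample variance, n - 1 denominator
std = stdev(ages)      # sample standard deviation
cv = std / avg         # coefficient of variation
med = median(ages)
mad = median(abs(x - med) for x in ages)  # median absolute deviation

print(avg, round(var, 5), round(std, 6), round(cv, 8), mad)
```

The same approach applies to the other numeric columns, provided their full value counts are available.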
Histogram with fixed size bins (bins=7)

Common values

Value  Count  Frequency (%)
45     47     47.0%
50     22     22.0%
20     17     17.0%
40     8      8.0%
55     4      4.0%
30     1      1.0%
25     1      1.0%

Minimum values

Value  Count  Frequency (%)
20     17     17.0%
25     1      1.0%
30     1      1.0%
40     8      8.0%
45     47     47.0%
50     22     22.0%
55     4      4.0%

Maximum values

Value  Count  Frequency (%)
55     4      4.0%
50     22     22.0%
45     47     47.0%
40     8      8.0%
30     1      1.0%
25     1      1.0%
20     17     17.0%

소비인구(명)
Real number (ℝ)

Distinct: 7
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.7
Minimum: 22
Maximum: 66
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 22
5-th percentile: 22
Q1: 22
Median: 22
Q3: 30
95-th percentile: 52.35
Maximum: 66
Range: 44
Interquartile range (IQR): 8

Descriptive statistics

Standard deviation: 9.9650906
Coefficient of variation (CV): 0.3472157
Kurtosis: 3.6610641
Mean: 28.7
Median Absolute Deviation (MAD): 0
Skewness: 1.957229
Sum: 2870
Variance: 99.30303
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)

Common values

Value  Count  Frequency (%)
22     53     53.0%
30     29     29.0%
37     8      8.0%
59     4      4.0%
44     3      3.0%
52     2      2.0%
66     1      1.0%

Minimum values

Value  Count  Frequency (%)
22     53     53.0%
30     29     29.0%
37     8      8.0%
44     3      3.0%
52     2      2.0%
59     4      4.0%
66     1      1.0%

Maximum values

Value  Count  Frequency (%)
66     1      1.0%
59     4      4.0%
52     2      2.0%
44     3      3.0%
37     8      8.0%
30     29     29.0%
22     53     53.0%

Interactions


Correlations

              행정동코드  시군구명  행정동명  기준일자  성별   연령대  소비인구(명)
행정동코드    1.000       1.000     1.000     0.000     0.177  0.455   0.000
시군구명      1.000       1.000     1.000     0.000     0.148  0.415   0.000
행정동명      1.000       1.000     1.000     0.227     0.233  0.453   0.000
기준일자      0.000       0.000     0.227     1.000     0.000  0.000   0.000
성별          0.177       0.148     0.233     0.000     1.000  0.421   0.244
연령대        0.455       0.415     0.453     0.000     0.421  1.000   0.000
소비인구(명)  0.000       0.000     0.000     0.000     0.244  0.000   1.000

          행정동명  시군구명  성별
행정동명  1.000     0.952     0.168
시군구명  0.952     1.000     0.243
성별      0.168     0.243     1.000
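
The 0.952 between 행정동명 and 시군구명 in the categorical-only matrix is consistent with a bias-corrected Cramér's V. The sketch below computes that statistic with the standard library. The choice of statistic is an assumption, since the report text does not name it, and so is the contingency table, which is rebuilt from the value counts and sample rows on the premise that every 행정동 belongs to exactly one 시군구. The function name `cramers_v_corrected` is hypothetical.

```python
from math import sqrt

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list
    of rows, each row a list of observed counts. Every row and column
    must have a non-zero total."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[j] for row in table) for j in range(k)]

    # Pearson chi-squared statistic over all cells, zeros included.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

# Rows: 종로구, 중구, 용산구. Columns: the 12 행정동명 values.
# Assumption: counts are distributed as implied by the value tables and
# sample rows in this report, with each 행정동 inside a single 시군구.
table = [
    [52, 22, 9, 1, 1, 1, 1, 0, 0, 0, 0, 0],  # 종로구 (87)
    [0, 0, 0, 0, 0, 0, 0, 7, 0, 0, 0, 0],    # 중구 (7)
    [0, 0, 0, 0, 0, 0, 0, 0, 3, 1, 1, 1],    # 용산구 (6)
]
v = cramers_v_corrected(table)
print(round(v, 3))
```

Without the bias correction a perfectly nested pair like this would score exactly 1, so the gap down to about 0.95 comes from the small-sample adjustment rather than from any real mismatch between the two columns.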

              행정동코드  기준일자  연령대  소비인구(명)  시군구명  행정동명  성별
행정동코드    1.000       0.080     -0.144  -0.051        1.000     0.952     0.243
기준일자      0.080       1.000     -0.134  0.001         0.000     0.083     0.000
연령대        -0.144      -0.134    1.000   -0.097        0.300     0.231     0.439
소비인구(명)  -0.051      0.001     -0.097  1.000         0.000     0.000     0.253
시군구명      1.000       0.000     0.300   0.000         1.000     0.952     0.243
행정동명      0.952       0.083     0.231   0.000         0.952     1.000     0.168
성별          0.243       0.000     0.439   0.253         0.243     0.168     1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

Index  행정동코드  시도명      시군구명  행정동명  기준일자  성별  연령대  소비인구(명)
0      1111053000  서울특별시  종로구    사직동    20190908  M     50      30
1      1111053000  서울특별시  종로구    사직동    20190914  F     45      30
2      1111053000  서울특별시  종로구    사직동    20191013  M     45      30
3      1111053000  서울특별시  종로구    사직동    20190811  F     45      37
4      1111053000  서울특별시  종로구    사직동    20191020  M     45      22
5      1111053000  서울특별시  종로구    사직동    20190922  M     50      22
6      1111053000  서울특별시  종로구    사직동    20191012  F     45      44
7      1111053000  서울특별시  종로구    사직동    20190901  F     40      22
8      1111053000  서울특별시  종로구    사직동    20190908  F     45      52
9      1111053000  서울특별시  종로구    사직동    20190927  F     45      30

Last rows

Index  행정동코드  시도명      시군구명  행정동명  기준일자  성별  연령대  소비인구(명)
90     1114064500  서울특별시  중구      청구동    20190827  F     40      30
91     1114064500  서울특별시  중구      청구동    20190812  F     45      30
92     1114064500  서울특별시  중구      청구동    20190805  F     45      22
93     1114064500  서울특별시  중구      청구동    20191031  F     40      37
94     1117051000  서울특별시  용산구    후암동    20191007  F     45      30
95     1117053000  서울특별시  용산구    남영동    20190809  F     50      22
96     1117058000  서울특별시  용산구    효창동    20191022  F     40      22
97     1117063000  서울특별시  용산구    이촌1동   20191029  F     45      22
98     1117063000  서울특별시  용산구    이촌1동   20190826  F     45      30
99     1117063000  서울특별시  용산구    이촌1동   20190806  F     45      22