Overview

Dataset statistics

Number of variables: 8
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.7 KiB
Average record size in memory: 68.3 B

Variable types

Numeric: 3
Categorical: 5

Alerts

시도명 has the constant value "서울특별시" (Constant)
연령대 is highly correlated overall with 성별 (High correlation)
시군구명 is highly correlated overall with 행정동코드 and 행정동명 (High correlation)
행정동명 is highly correlated overall with 행정동코드 and 시군구명 (High correlation)
성별 is highly correlated overall with 연령대 (High correlation)
행정동코드 is highly correlated overall with 시군구명 and 행정동명 (High correlation)
시군구명 is highly imbalanced (67.3%) (Imbalance)
연령대 is highly imbalanced (63.2%) (Imbalance)
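
The two imbalance percentages can be checked by hand. The sketch below assumes the score is one minus the Shannon entropy of the class frequencies divided by the maximum possible entropy; that formula is an assumption, chosen because it matches both reported values, and the function name `imbalance_score` is hypothetical. The class counts are taken from the frequency tables in this report.

```python
import math


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # a single class leaves nothing to normalise against
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# Class counts copied from the report's frequency tables.
sigungu_counts = [94, 6]                     # 시군구명
age_band_counts = [83, 4, 4, 3, 2, 2, 1, 1]  # 연령대

print(f"{imbalance_score(sigungu_counts):.1%}")   # 67.3%
print(f"{imbalance_score(age_band_counts):.1%}")  # 63.2%
```

A score of 0% would mean perfectly balanced classes, and a score near 100% means almost all rows fall into one class.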

Reproduction

Analysis started: 2023-12-10 11:22:57.780449
Analysis finished: 2023-12-10 11:23:00.204218
Duration: 2.42 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

행정동코드
Real number (ℝ)

HIGH CORRELATION 

Distinct: 7
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.1112369 × 10^9
Minimum: 1.1110515 × 10^9
Maximum: 1.114057 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1.1110515 × 10^9
5-th percentile: 1.111053 × 10^9
Q1: 1.111053 × 10^9
Median: 1.1110555 × 10^9
Q3: 1.1110615 × 10^9
95-th percentile: 1.114055 × 10^9
Maximum: 1.114057 × 10^9
Range: 3005500
Interquartile range (IQR): 8500

Descriptive statistics

Standard deviation: 715824.97
Coefficient of variation (CV): 0.00064416953
Kurtosis: 12.400393
Mean: 1.1112369 × 10^9
Median Absolute Deviation (MAD): 4000
Skewness: 3.7619356
Sum: 1.1112369 × 10^11
Variance: 5.1240538 × 10^11
Monotonicity: Increasing
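
The mean, standard deviation, and coefficient of variation above can be rebuilt from the value counts listed for this variable. The following is a minimal sketch using only the Python standard library; it assumes the sample (n - 1) standard deviation, which is the variant that matches the reported figure. The dictionary name `value_counts` is illustrative.

```python
import statistics

# Value counts for 행정동코드, copied from the report's frequency table.
value_counts = {
    1111061500: 42,
    1111053000: 37,
    1111055000: 9,
    1111051500: 4,
    1114055000: 3,
    1114057000: 3,
    1111056000: 2,
}

# Expand the counts back into one value per observation.
values = [value for value, count in value_counts.items() for _ in range(count)]

mean = statistics.mean(values)
stdev = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
cv = stdev / mean                 # coefficient of variation

print(len(values), mean, round(stdev, 2), cv)
```

Because the column holds administrative codes rather than measurements, these figures describe how the codes are spread numerically and say little about the areas themselves.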
[Figure: Histogram with fixed size bins (bins=7)]

Common values
Value       Count  Frequency (%)
1111061500  42     42.0%
1111053000  37     37.0%
1111055000  9      9.0%
1111051500  4      4.0%
1114055000  3      3.0%
1114057000  3      3.0%
1111056000  2      2.0%

Minimum values
Value       Count  Frequency (%)
1111051500  4      4.0%
1111053000  37     37.0%
1111055000  9      9.0%
1111056000  2      2.0%
1111061500  42     42.0%
1114055000  3      3.0%
1114057000  3      3.0%

Maximum values
Value       Count  Frequency (%)
1114057000  3      3.0%
1114055000  3      3.0%
1111061500  42     42.0%
1111056000  2      2.0%
1111055000  9      9.0%
1111053000  37     37.0%
1111051500  4      4.0%

시도명
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

서울특별시: 100

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 서울특별시
3rd row: 서울특별시
4th row: 서울특별시
5th row: 서울특별시

Common Values

Value       Count  Frequency (%)
서울특별시  100    100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

시군구명
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

종로구: 94
중구: 6

Length

Max length: 3
Median length: 3
Mean length: 2.94
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 종로구
2nd row: 종로구
3rd row: 종로구
4th row: 종로구
5th row: 종로구

Common Values

Value   Count  Frequency (%)
종로구  94     94.0%
중구    6      6.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

행정동명
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

종로1.2.3.4가동: 42
사직동: 37
부암동: 9
청운효자동: 4
명동: 3
Other values (2): 5

Length

Max length: 11
Median length: 5
Mean length: 6.38
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 청운효자동
2nd row: 청운효자동
3rd row: 청운효자동
4th row: 청운효자동
5th row: 사직동

Common Values

Value            Count  Frequency (%)
종로1.2.3.4가동  42     42.0%
사직동           37     37.0%
부암동           9      9.0%
청운효자동       4      4.0%
명동             3      3.0%
필동             3      3.0%
평창동           2      2.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

기준일자
Real number (ℝ)

Distinct: 60
Distinct (%): 60.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20200919
Minimum: 20200803
Maximum: 20201030
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 20200803
5-th percentile: 20200807
Q1: 20200826
Median: 20200916
Q3: 20201012
95-th percentile: 20201026
Maximum: 20201030
Range: 227
Interquartile range (IQR): 186

Descriptive statistics

Standard deviation: 83.420572
Coefficient of variation (CV): 4.1295435 × 10^-6
Kurtosis: -1.5201713
Mean: 20200919
Median Absolute Deviation (MAD): 92
Skewness: -0.008459194
Sum: 2.0200919 × 10^9
Variance: 6958.9918
Monotonicity: Not monotonic
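
The values of 기준일자 look like dates written as YYYYMMDD integers, yet the report profiles the column as a real number, so figures such as the range are differences between integers and not spans of days. The sketch below assumes the YYYYMMDD interpretation, which the report does not state explicitly, and uses a hypothetical helper `parse_yyyymmdd` to contrast the two readings.

```python
from datetime import datetime


def parse_yyyymmdd(value):
    """Interpret an integer such as 20200803 as a calendar date (assumed encoding)."""
    return datetime.strptime(str(value), "%Y%m%d").date()


minimum, maximum = 20200803, 20201030  # Minimum and Maximum from the report

numeric_range = maximum - minimum  # what the profiler reports as Range
calendar_span = (parse_yyyymmdd(maximum) - parse_yyyymmdd(minimum)).days

print(numeric_range)  # 227
print(calendar_span)  # 88
```

Converting the column to a date type before profiling would make the quantile and dispersion statistics interpretable in days.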
[Figure: Histogram with fixed size bins (bins=50)]

Common values
Value              Count  Frequency (%)
20201023           4      4.0%
20200904           4      4.0%
20200828           4      4.0%
20201016           3      3.0%
20200807           3      3.0%
20200820           3      3.0%
20200825           3      3.0%
20200901           3      3.0%
20201012           3      3.0%
20200929           3      3.0%
Other values (50)  67     67.0%

Minimum 10 values
Value     Count  Frequency (%)
20200803  1      1.0%
20200804  1      1.0%
20200805  1      1.0%
20200806  2      2.0%
20200807  3      3.0%
20200808  1      1.0%
20200810  1      1.0%
20200811  1      1.0%
20200812  1      1.0%
20200813  1      1.0%

Maximum 10 values
Value     Count  Frequency (%)
20201030  2      2.0%
20201029  2      2.0%
20201028  1      1.0%
20201026  2      2.0%
20201024  1      1.0%
20201023  4      4.0%
20201022  1      1.0%
20201021  1      1.0%
20201020  1      1.0%
20201019  2      2.0%

성별
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

X: 83
M: 17

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: X
2nd row: X
3rd row: X
4th row: X
5th row: X

Common Values

Value  Count  Frequency (%)
X      83     83.0%
M      17     17.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

연령대
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 8
Distinct (%): 8.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

xx: 83
55: 4
50: 4
35: 3
60: 2
Other values (3): 4

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 2
Unique (%): 2.0%

Sample

1st row: xx
2nd row: xx
3rd row: xx
4th row: xx
5th row: xx

Common Values

Value  Count  Frequency (%)
xx     83     83.0%
55     4      4.0%
50     4      4.0%
35     3      3.0%
60     2      2.0%
30     2      2.0%
45     1      1.0%
70     1      1.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: bar chart of the common values]

소비인구(명)
Real number (ℝ)

Distinct: 6
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 28.023923
Minimum: 22.479082
Maximum: 67.437247
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 22.479082
5-th percentile: 22.479082
Q1: 22.479082
Median: 22.479082
Q3: 29.97211
95-th percentile: 44.958165
Maximum: 67.437247
Range: 44.958165
Interquartile range (IQR): 7.4930275

Descriptive statistics

Standard deviation: 8.4947687
Coefficient of variation (CV): 0.30312561
Kurtosis: 4.9003901
Mean: 28.023923
Median Absolute Deviation (MAD): 0
Skewness: 2.0160821
Sum: 2802.3923
Variance: 72.161095
Monotonicity: Not monotonic
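
A MAD of 0 alongside a nonzero standard deviation can look like an error. It follows from the distribution: 58 of the 100 observations equal the median, so more than half of the absolute deviations are zero. The sketch below reproduces this from the value counts listed for this variable, using only the standard library; the variable names are illustrative.

```python
import statistics

# Value counts for 소비인구(명), copied from the report's frequency table.
value_counts = {
    22.479082395: 58,
    29.97210986: 23,
    37.465137325: 12,
    44.95816479: 3,
    52.451192255: 3,
    67.437247185: 1,
}

# Expand the counts back into one value per observation.
values = [value for value, count in value_counts.items() for _ in range(count)]

median = statistics.median(values)
# Median absolute deviation: the median of the distances from the median.
mad = statistics.median(abs(v - median) for v in values)
total = sum(values)

print(median, mad, round(total, 4))
```

For a column this concentrated on a single value, the standard deviation or the IQR is the more informative spread measure.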
[Figure: Histogram with fixed size bins (bins=6)]

Common values
Value         Count  Frequency (%)
22.479082395  58     58.0%
29.97210986   23     23.0%
37.465137325  12     12.0%
44.95816479   3      3.0%
52.451192255  3      3.0%
67.437247185  1      1.0%

Minimum values
Value         Count  Frequency (%)
22.479082395  58     58.0%
29.97210986   23     23.0%
37.465137325  12     12.0%
44.95816479   3      3.0%
52.451192255  3      3.0%
67.437247185  1      1.0%

Maximum values
Value         Count  Frequency (%)
67.437247185  1      1.0%
52.451192255  3      3.0%
44.95816479   3      3.0%
37.465137325  12     12.0%
29.97210986   23     23.0%
22.479082395  58     58.0%

Interactions

[Figures: 9 interaction plots]

Correlations

[Figure: correlation heatmap]

              행정동코드  시군구명  행정동명  기준일자  성별   연령대  소비인구(명)
행정동코드    1.000       0.990     1.000     0.117     0.000  0.239   0.000
시군구명      0.990       1.000     1.000     0.109     0.000  0.252   0.000
행정동명      1.000       1.000     1.000     0.203     0.303  0.302   0.000
기준일자      0.117       0.109     0.203     1.000     0.294  0.224   0.000
성별          0.000       0.000     0.303     0.294     1.000  1.000   0.446
연령대        0.239       0.252     0.302     0.224     1.000  1.000   0.000
소비인구(명)  0.000       0.000     0.000     0.000     0.446  0.000   1.000

[Figure: correlation heatmap]

          연령대  시군구명  행정동명  성별
연령대    1.000   0.182     0.164     0.969
시군구명  0.182   1.000     0.974     0.000
행정동명  0.164   0.974     1.000     0.315
성별      0.969   0.000     0.315     1.000
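
The 0.969 between 성별 and 연령대 in the categorical-only matrix is consistent with a bias-corrected Cramér's V. The sketch below is a check under stated assumptions and does not claim to be the profiler's own implementation. It assumes that every row with 성별 = X has 연령대 = xx, as the sample rows and the matching counts of 83 suggest, and that the 17 rows with 성별 = M are spread over the seven remaining age bands with the counts from the 연령대 frequency table. The function name `cramers_v_corrected` is hypothetical.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Small-sample bias correction applied to phi^2 and to the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


# Assumed contingency table. Rows: 성별 (X, M).
# Columns: 연령대 (xx, 55, 50, 35, 60, 30, 45, 70).
table = [
    [83, 0, 0, 0, 0, 0, 0, 0],
    [0, 4, 4, 3, 2, 2, 1, 1],
]

print(round(cramers_v_corrected(table), 3))  # 0.969
```

Under these assumptions the two columns are perfectly associated, and the value falls short of 1.000 only because of the small-sample correction.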

[Figure: correlation heatmap]

              행정동코드  기준일자  소비인구(명)  시군구명  행정동명  성별   연령대
행정동코드    1.000       0.094     0.347         0.910     0.974     0.000  0.182
기준일자      0.094       1.000     0.162         0.080     0.121     0.191  0.116
소비인구(명)  0.347       0.162     1.000         0.000     0.000     0.314  0.000
시군구명      0.910       0.080     0.000         1.000     0.974     0.000  0.182
행정동명      0.974       0.121     0.000         0.974     1.000     0.315  0.164
성별          0.000       0.191     0.314         0.000     0.315     1.000  0.969
연령대        0.182       0.116     0.000         0.182     0.164     0.969  1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

First rows

    행정동코드  시도명      시군구명  행정동명    기준일자  성별  연령대  소비인구(명)
0   1111051500  서울특별시  종로구    청운효자동  20200903  X     xx      22.479082
1   1111051500  서울특별시  종로구    청운효자동  20200923  X     xx      22.479082
2   1111051500  서울특별시  종로구    청운효자동  20200924  X     xx      22.479082
3   1111051500  서울특별시  종로구    청운효자동  20200929  X     xx      22.479082
4   1111053000  서울특별시  종로구    사직동      20201007  X     xx      22.479082
5   1111053000  서울특별시  종로구    사직동      20201012  M     60      22.479082
6   1111053000  서울특별시  종로구    사직동      20201009  M     55      22.479082
7   1111053000  서울특별시  종로구    사직동      20200807  M     30      22.479082
8   1111053000  서울특별시  종로구    사직동      20200829  X     xx      22.479082
9   1111053000  서울특별시  종로구    사직동      20200929  X     xx      29.97211

Last rows

    행정동코드  시도명      시군구명  행정동명         기준일자  성별  연령대  소비인구(명)
90  1111061500  서울특별시  종로구    종로1.2.3.4가동  20201028  M     70      22.479082
91  1111061500  서울특별시  종로구    종로1.2.3.4가동  20201008  X     xx      44.958165
92  1111061500  서울특별시  종로구    종로1.2.3.4가동  20200923  X     xx      22.479082
93  1111061500  서울특별시  종로구    종로1.2.3.4가동  20200909  X     xx      52.451192
94  1114055000  서울특별시  중구      명동             20201023  X     xx      22.479082
95  1114055000  서울특별시  중구      명동             20200925  X     xx      22.479082
96  1114055000  서울특별시  중구      명동             20200904  X     xx      29.97211
97  1114057000  서울특별시  중구      필동             20201007  M     55      22.479082
98  1114057000  서울특별시  중구      필동             20201023  X     xx      22.479082
99  1114057000  서울특별시  중구      필동             20200904  M     30      22.479082