Overview

Dataset statistics

Number of variables: 13
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 10.5 KiB
Average record size in memory: 107.3 B

Variable types

Numeric: 1
Categorical: 6
Boolean: 6

Alerts

examin_ym has constant value "202204" (Constant)
marn_sports_supli_prchs_at has constant value "False" (Constant)
answrr_oc_area_nm is highly overall correlated with climd_camping_fshng_supli_prchs_at (High correlation)
climd_camping_fshng_supli_prchs_at is highly overall correlated with answrr_oc_area_nm (High correlation)
bal_supli_prchs_at is highly imbalanced (75.8%) (Imbalance)
climd_camping_fshng_supli_prchs_at is highly imbalanced (85.9%) (Imbalance)
bcycl_wrkot_yoga_supli_prchs_at is highly imbalanced (71.4%) (Imbalance)
golf_supli_prchs_at is highly imbalanced (91.9%) (Imbalance)
wnt_sports_supli_prchs_at is highly imbalanced (91.9%) (Imbalance)
respond_id has unique values (Unique)
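
The imbalance percentages in the alerts are consistent with a score of one minus the normalized Shannon entropy of the class counts. The sketch below illustrates that calculation under this assumption; it is not taken from the profiler's source, the function name `imbalance_score` is hypothetical, and the counts are the False/True counts listed in the Boolean variable sections of this report.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the base-2 Shannon entropy of the
    class proportions and k is the number of classes.
    0 means perfectly balanced; values near 1 mean one class dominates."""
    k = len(counts)
    if k < 2:
        # Assumption: a single-class column is covered by the Constant alert.
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# False/True counts as reported for each Boolean variable (100 rows each).
reported_counts = {
    "bal_supli_prchs_at": [96, 4],
    "climd_camping_fshng_supli_prchs_at": [98, 2],
    "bcycl_wrkot_yoga_supli_prchs_at": [95, 5],
    "golf_supli_prchs_at": [99, 1],
    "wnt_sports_supli_prchs_at": [99, 1],
}

for name, counts in reported_counts.items():
    print(f"{name}: {imbalance_score(counts) * 100:.1f}%")
```

Under this definition a 50/50 split scores 0, so a score of 71.4% already corresponds to a 95/5 split.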

Reproduction

Analysis started: 2023-12-10 10:02:16.288826
Analysis finished: 2023-12-10 10:02:18.300408
Duration: 2.01 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

respond_id
Real number (ℝ)

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 103660.11
Minimum: 199
Maximum: 3242878
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 199
5-th percentile: 725.1
Q1: 3399.75
Median: 5873.5
Q3: 10576.25
95-th percentile: 14898.95
Maximum: 3242878
Range: 3242679
Interquartile range (IQR): 7176.5
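
The range and IQR follow directly from the quantiles listed here, and they show how far the maximum sits from the bulk of the values. The sketch below recomputes them and applies Tukey's 1.5 × IQR fences, a common rule of thumb that is an assumption of this example and not something the report itself applies.

```python
# Quantile statistics for respond_id, copied from the report.
minimum, q1, q3, maximum = 199, 3399.75, 10576.25, 3242878
p95 = 14898.95

value_range = maximum - minimum   # maximum minus minimum
iqr = q3 - q1                     # spread of the middle 50% of values

# Tukey fences (illustrative convention, not part of the report).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

print("range:", value_range)
print("IQR:", iqr)
print("fences:", lower_fence, upper_fence)
print("maximum above upper fence:", maximum > upper_fence)
print("95th percentile above upper fence:", p95 > upper_fence)
```

The 95th percentile stays inside the upper fence while the maximum lies far beyond it, so only a handful of very large identifiers stretch the range.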

Descriptive statistics

Standard deviation: 554781.24
Coefficient of variation (CV): 5.351926
Kurtosis: 29.89392
Mean: 103660.11
Median Absolute Deviation (MAD): 3239.5
Skewness: 5.5941265
Sum: 10366011
Variance: 3.0778222 × 10^11
Monotonicity: Not monotonic
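
Several of these statistics are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. The sketch below checks those relationships using the rounded figures printed in this report, so agreement is only expected up to rounding.

```python
# Values copied from the report (already rounded for display).
n = 100
mean = 103660.11
std = 554781.24

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance as the square of the standard deviation
total = mean * n       # sum of all values

print(f"CV       = {cv:.6f}")
print(f"variance = {variance:.7e}")
print(f"sum      = {total:.0f}")
```

A CV above 5 together with a mean (103660.11) far larger than the median (5873.5) points to a small number of very large values dominating the distribution.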
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value | Count | Frequency (%)
199 | 1 | 1.0%
7094 | 1 | 1.0%
9592 | 1 | 1.0%
9495 | 1 | 1.0%
8817 | 1 | 1.0%
8717 | 1 | 1.0%
8671 | 1 | 1.0%
8573 | 1 | 1.0%
8370 | 1 | 1.0%
7351 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Minimum 10 values

Value | Count | Frequency (%)
199 | 1 | 1.0%
354 | 1 | 1.0%
497 | 1 | 1.0%
547 | 1 | 1.0%
689 | 1 | 1.0%
727 | 1 | 1.0%
982 | 1 | 1.0%
988 | 1 | 1.0%
1267 | 1 | 1.0%
1467 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
3242878 | 1 | 1.0%
3242247 | 1 | 1.0%
3242000 | 1 | 1.0%
15661 | 1 | 1.0%
15392 | 1 | 1.0%
14873 | 1 | 1.0%
14840 | 1 | 1.0%
14550 | 1 | 1.0%
14227 | 1 | 1.0%
14152 | 1 | 1.0%

examin_ym
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202204
2nd row: 202204
3rd row: 202204
4th row: 202204
5th row: 202204

Common Values

Value | Count | Frequency (%)
202204 | 100 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Words

Value | Count | Frequency (%)
202204 | 100 | 100.0%

sexdstn_flag_cd
Categorical

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: F
2nd row: F
3rd row: M
4th row: M
5th row: M

Common Values

Value | Count | Frequency (%)
M | 60 | 60.0%
F | 40 | 40.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Words

Value | Count | Frequency (%)
m | 60 | 60.0%
f | 40 | 40.0%

agrde_flag_nm
Categorical

Distinct: 4
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 60대
2nd row: 40대
3rd row: 60대
4th row: 60대
5th row: 60대

Common Values

Value | Count | Frequency (%)
40대 | 31 | 31.0%
50대 | 29 | 29.0%
60대 | 26 | 26.0%
30대 | 14 | 14.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Words

Value | Count | Frequency (%)
40대 | 31 | 31.0%
50대 | 29 | 29.0%
60대 | 26 | 26.0%
30대 | 14 | 14.0%

answrr_oc_area_nm
Categorical

HIGH CORRELATION 

Distinct: 15
Distinct (%): 15.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 12
Median length: 5
Mean length: 4.56
Min length: 3

Unique

Unique: 3
Unique (%): 3.0%

Sample

1st row: 서울특별시
2nd row: 부산광역시
3rd row: 서울특별시
4th row: 전라남도
5th row: 충청북도

Common Values

Value | Count | Frequency (%)
서울특별시 | 32 | 32.0%
경기도 | 23 | 23.0%
인천광역시 | 12 | 12.0%
부산광역시 | 6 | 6.0%
광주광역시 | 5 | 5.0%
충청북도 | 4 | 4.0%
전라남도 | 3 | 3.0%
대구광역시 | 3 | 3.0%
대전광역시 | 3 | 3.0%
충청남도(세종시 포함) | 2 | 2.0%
Other values (5) | 7 | 7.0%

Length

[Figure: Histogram of lengths of the category]

Words

Value | Count | Frequency (%)
서울특별시 | 32 | 31.4%
경기도 | 23 | 22.5%
인천광역시 | 12 | 11.8%
부산광역시 | 6 | 5.9%
광주광역시 | 5 | 4.9%
충청북도 | 4 | 3.9%
전라남도 | 3 | 2.9%
대구광역시 | 3 | 2.9%
대전광역시 | 3 | 2.9%
충청남도(세종시 | 2 | 2.0%
Other values (6) | 9 | 8.8%

hshld_income_dgree_nm
Categorical

Distinct: 5
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 13
Median length: 8
Mean length: 9.85
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 300이상500만원 미만
2nd row: 700만원 이상
3rd row: 500이상700만원 미만
4th row: 500이상700만원 미만
5th row: 300이상500만원 미만

Common Values

Value | Count | Frequency (%)
300만원 미만 | 27 | 27.0%
300이상500만원 미만 | 24 | 24.0%
700만원 이상 | 22 | 22.0%
500이상700만원 미만 | 20 | 20.0%
무응답 | 7 | 7.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Words

Value | Count | Frequency (%)
미만 | 71 | 36.8%
300만원 | 27 | 14.0%
300이상500만원 | 24 | 12.4%
700만원 | 22 | 11.4%
이상 | 22 | 11.4%
500이상700만원 | 20 | 10.4%
무응답 | 7 | 3.6%

prchs_mth_nm
Categorical

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 4
Mean length: 3.53
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 온라인
2nd row: 오프라인
3rd row: 온라인
4th row: 온라인
5th row: 온라인

Common Values

Value | Count | Frequency (%)
오프라인 | 53 | 53.0%
온라인 | 47 | 47.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Words

Value | Count | Frequency (%)
오프라인 | 53 | 53.0%
온라인 | 47 | 47.0%

bal_supli_prchs_at
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 232.0 B

Value | Count | Frequency (%)
False | 96 | 96.0%
True | 4 | 4.0%

climd_camping_fshng_supli_prchs_at
Boolean

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 232.0 B

Value | Count | Frequency (%)
False | 98 | 98.0%
True | 2 | 2.0%

bcycl_wrkot_yoga_supli_prchs_at
Boolean

IMBALANCE

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 232.0 B

Value | Count | Frequency (%)
False | 95 | 95.0%
True | 5 | 5.0%

golf_supli_prchs_at
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 232.0 B

Value | Count | Frequency (%)
False | 99 | 99.0%
True | 1 | 1.0%

marn_sports_supli_prchs_at
Boolean

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 232.0 B

Value | Count | Frequency (%)
False | 100 | 100.0%

wnt_sports_supli_prchs_at
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 232.0 B

Value | Count | Frequency (%)
False | 99 | 99.0%
True | 1 | 1.0%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

Variable | respond_id | sexdstn_flag_cd | agrde_flag_nm | answrr_oc_area_nm | hshld_income_dgree_nm | prchs_mth_nm | bal_supli_prchs_at | climd_camping_fshng_supli_prchs_at | bcycl_wrkot_yoga_supli_prchs_at | golf_supli_prchs_at | wnt_sports_supli_prchs_at
respond_id | 1.000 | 0.000 | 0.131 | 0.000 | 0.000 | 0.000 | 0.000 | 0.241 | 0.000 | 0.000 | 0.000
sexdstn_flag_cd | 0.000 | 1.000 | 0.000 | 0.270 | 0.165 | 0.000 | 0.088 | 0.000 | 0.000 | 0.000 | 0.000
agrde_flag_nm | 0.131 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.077 | 0.043 | 0.000 | 0.000
answrr_oc_area_nm | 0.000 | 0.270 | 0.000 | 1.000 | 0.000 | 0.197 | 0.063 | 0.706 | 0.481 | 0.000 | 0.000
hshld_income_dgree_nm | 0.000 | 0.165 | 0.000 | 0.000 | 1.000 | 0.059 | 0.080 | 0.000 | 0.000 | 0.000 | 0.000
prchs_mth_nm | 0.000 | 0.000 | 0.000 | 0.197 | 0.059 | 1.000 | 0.207 | 0.000 | 0.054 | 0.000 | 0.000
bal_supli_prchs_at | 0.000 | 0.088 | 0.000 | 0.063 | 0.080 | 0.207 | 1.000 | 0.000 | 0.000 | 0.331 | 0.000
climd_camping_fshng_supli_prchs_at | 0.241 | 0.000 | 0.077 | 0.706 | 0.000 | 0.000 | 0.000 | 1.000 | 0.133 | 0.000 | 0.000
bcycl_wrkot_yoga_supli_prchs_at | 0.000 | 0.000 | 0.043 | 0.481 | 0.000 | 0.054 | 0.000 | 0.133 | 1.000 | 0.000 | 0.000
golf_supli_prchs_at | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.331 | 0.000 | 0.000 | 1.000 | 0.000
wnt_sports_supli_prchs_at | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000

[Figure: correlation heatmap]

Variable | answrr_oc_area_nm | golf_supli_prchs_at | bcycl_wrkot_yoga_supli_prchs_at | agrde_flag_nm | hshld_income_dgree_nm | wnt_sports_supli_prchs_at | climd_camping_fshng_supli_prchs_at | sexdstn_flag_cd | bal_supli_prchs_at | prchs_mth_nm
answrr_oc_area_nm | 1.000 | 0.000 | 0.409 | 0.000 | 0.000 | 0.000 | 0.612 | 0.226 | 0.037 | 0.163
golf_supli_prchs_at | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.215 | 0.000
bcycl_wrkot_yoga_supli_prchs_at | 0.409 | 0.000 | 1.000 | 0.022 | 0.000 | 0.000 | 0.085 | 0.000 | 0.000 | 0.033
agrde_flag_nm | 0.000 | 0.000 | 0.022 | 1.000 | 0.000 | 0.000 | 0.047 | 0.000 | 0.000 | 0.000
hshld_income_dgree_nm | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.197 | 0.094 | 0.068
wnt_sports_supli_prchs_at | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000
climd_camping_fshng_supli_prchs_at | 0.612 | 0.000 | 0.085 | 0.047 | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000
sexdstn_flag_cd | 0.226 | 0.000 | 0.000 | 0.000 | 0.197 | 0.000 | 0.000 | 1.000 | 0.055 | 0.000
bal_supli_prchs_at | 0.037 | 0.215 | 0.000 | 0.000 | 0.094 | 0.000 | 0.000 | 0.055 | 1.000 | 0.132
prchs_mth_nm | 0.163 | 0.000 | 0.033 | 0.000 | 0.068 | 0.000 | 0.000 | 0.000 | 0.132 | 1.000

[Figure: correlation heatmap]

Variable | respond_id | sexdstn_flag_cd | agrde_flag_nm | answrr_oc_area_nm | hshld_income_dgree_nm | prchs_mth_nm | bal_supli_prchs_at | climd_camping_fshng_supli_prchs_at | bcycl_wrkot_yoga_supli_prchs_at | golf_supli_prchs_at | wnt_sports_supli_prchs_at
respond_id | 1.000 | 0.000 | 0.088 | 0.000 | 0.000 | 0.000 | 0.000 | 0.155 | 0.000 | 0.000 | 0.000
sexdstn_flag_cd | 0.000 | 1.000 | 0.000 | 0.226 | 0.197 | 0.000 | 0.055 | 0.000 | 0.000 | 0.000 | 0.000
agrde_flag_nm | 0.088 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.047 | 0.022 | 0.000 | 0.000
answrr_oc_area_nm | 0.000 | 0.226 | 0.000 | 1.000 | 0.000 | 0.163 | 0.037 | 0.612 | 0.409 | 0.000 | 0.000
hshld_income_dgree_nm | 0.000 | 0.197 | 0.000 | 0.000 | 1.000 | 0.068 | 0.094 | 0.000 | 0.000 | 0.000 | 0.000
prchs_mth_nm | 0.000 | 0.000 | 0.000 | 0.163 | 0.068 | 1.000 | 0.132 | 0.000 | 0.033 | 0.000 | 0.000
bal_supli_prchs_at | 0.000 | 0.055 | 0.000 | 0.037 | 0.094 | 0.132 | 1.000 | 0.000 | 0.000 | 0.215 | 0.000
climd_camping_fshng_supli_prchs_at | 0.155 | 0.000 | 0.047 | 0.612 | 0.000 | 0.000 | 0.000 | 1.000 | 0.085 | 0.000 | 0.000
bcycl_wrkot_yoga_supli_prchs_at | 0.000 | 0.000 | 0.022 | 0.409 | 0.000 | 0.033 | 0.000 | 0.085 | 1.000 | 0.000 | 0.000
golf_supli_prchs_at | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.215 | 0.000 | 0.000 | 1.000 | 0.000
wnt_sports_supli_prchs_at | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000
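
All coefficients in these tables lie between 0 and 1, as is typical of association measures for categorical pairs such as Cramér's V. The report does not state which measure produced each table, so the following is only an illustrative sketch of plain (uncorrected) Cramér's V on a contingency table. The function name `cramers_v` and the example counts are hypothetical and are not drawn from this dataset.

```python
import math

def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows.
    Returns a value in [0, 1]; 0 means no association, 1 means perfect association."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    k = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or k <= 0:
        return 0.0
    # Pearson chi-squared statistic: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))

# Hypothetical 2x2 tables (rows: two regions, columns: purchased False/True).
perfectly_associated = [[10, 0], [0, 10]]
independent = [[5, 5], [5, 5]]

print(cramers_v(perfectly_associated))
print(cramers_v(independent))
```

With only 100 rows and several classes holding one to five observations, coefficients like these can swing sharply on a few records, so the high-correlation alert is best read as a prompt for inspection.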

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

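First rows

index | respond_id | examin_ym | sexdstn_flag_cd | agrde_flag_nm | answrr_oc_area_nm | hshld_income_dgree_nm | prchs_mth_nm | bal_supli_prchs_at | climd_camping_fshng_supli_prchs_at | bcycl_wrkot_yoga_supli_prchs_at | golf_supli_prchs_at | marn_sports_supli_prchs_at | wnt_sports_supli_prchs_at
0 | 199 | 202204 | F | 60대 | 서울특별시 | 300이상500만원 미만 | 온라인 | N | N | N | N | N | N
1 | 3242000 | 202204 | F | 40대 | 부산광역시 | 700만원 이상 | 오프라인 | N | N | N | N | N | N
2 | 354 | 202204 | M | 60대 | 서울특별시 | 500이상700만원 미만 | 온라인 | N | N | N | N | N | N
3 | 497 | 202204 | M | 60대 | 전라남도 | 500이상700만원 미만 | 온라인 | N | N | N | N | N | N
4 | 547 | 202204 | M | 60대 | 충청북도 | 300이상500만원 미만 | 온라인 | N | N | N | N | N | N
5 | 689 | 202204 | M | 40대 | 경기도 | 500이상700만원 미만 | 온라인 | N | N | N | N | N | N
6 | 727 | 202204 | M | 50대 | 경기도 | 300이상500만원 미만 | 오프라인 | N | N | N | N | N | N
7 | 3242247 | 202204 | M | 30대 | 경기도 | 300만원 미만 | 온라인 | N | Y | N | N | N | N
8 | 982 | 202204 | M | 50대 | 서울특별시 | 300만원 미만 | 온라인 | N | N | N | N | N | N
9 | 988 | 202204 | F | 50대 | 부산광역시 | 300만원 미만 | 오프라인 | N | N | N | N | N | N

Last rows

index | respond_id | examin_ym | sexdstn_flag_cd | agrde_flag_nm | answrr_oc_area_nm | hshld_income_dgree_nm | prchs_mth_nm | bal_supli_prchs_at | climd_camping_fshng_supli_prchs_at | bcycl_wrkot_yoga_supli_prchs_at | golf_supli_prchs_at | marn_sports_supli_prchs_at | wnt_sports_supli_prchs_at
90 | 13801 | 202204 | M | 50대 | 경기도 | 300만원 미만 | 오프라인 | N | N | N | N | N | N
91 | 13810 | 202204 | F | 30대 | 경기도 | 700만원 이상 | 오프라인 | N | N | N | N | N | N
92 | 14108 | 202204 | M | 40대 | 인천광역시 | 300만원 미만 | 오프라인 | N | N | N | N | N | N
93 | 14152 | 202204 | M | 40대 | 경기도 | 700만원 이상 | 온라인 | N | N | N | N | N | N
94 | 14227 | 202204 | M | 60대 | 경기도 | 300이상500만원 미만 | 온라인 | N | N | N | N | N | N
95 | 14550 | 202204 | F | 60대 | 서울특별시 | 300만원 미만 | 오프라인 | N | N | N | N | N | N
96 | 14840 | 202204 | M | 50대 | 인천광역시 | 700만원 이상 | 오프라인 | N | N | N | N | N | N
97 | 14873 | 202204 | M | 40대 | 경기도 | 500이상700만원 미만 | 오프라인 | N | N | N | N | N | N
98 | 15392 | 202204 | F | 40대 | 광주광역시 | 300만원 미만 | 오프라인 | N | N | N | N | N | N
99 | 15661 | 202204 | M | 60대 | 대전광역시 | 300이상500만원 미만 | 온라인 | N | N | N | N | N | N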