Overview

Dataset statistics

Number of variables: 6
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.2 KiB
Average record size in memory: 53.3 B

Variable types

Categorical: 4
Text: 1
Numeric: 1

Alerts

anals_trget_day is highly overall correlated with anals_trget_year and 2 other fields [High correlation]
anals_trget_mt is highly overall correlated with anals_trget_year and 2 other fields [High correlation]
anals_trget_year is highly overall correlated with anals_trget_mt and 2 other fields [High correlation]
one_area_nm is highly overall correlated with anals_trget_year and 2 other fields [High correlation]
anals_trget_year is highly imbalanced (80.6%) [Imbalance]
anals_trget_mt is highly imbalanced (80.6%) [Imbalance]
anals_trget_day is highly imbalanced (80.6%) [Imbalance]
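
The 80.6% imbalance figure is consistent with a score of one minus the normalized Shannon entropy of the class counts. The sketch below recomputes it from the 97/3 split reported for these three columns. The function name `imbalance_score` is illustrative, and the formula is an assumption about how the profiler derives the number, although it does reproduce the reported value.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class counts and k is the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no meaningful balance; treated as 0 here
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Class counts taken from the report: 97 rows in one class, 3 in the other.
score = imbalance_score([97, 3])
print(f"{score:.1%}")  # prints 80.6%
```

A perfectly balanced column scores 0 and a column dominated by a single value approaches 1, so 80.6% indicates that one value carries almost all rows.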

Reproduction

Analysis started: 2023-12-10 10:08:04.397987
Analysis finished: 2023-12-10 10:08:05.656715
Duration: 1.26 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

anals_trget_year
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2018
2nd row: 2020
3rd row: 2018
4th row: 2018
5th row: 2018

Common Values

Value  Count  Frequency (%)
2018   97     97.0%
2020   3      3.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

anals_trget_mt
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 2
Median length: 1
Mean length: 1.03
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 12
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      97     97.0%
12     3      3.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

anals_trget_day
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 31
2nd row: 30
3rd row: 31
4th row: 31
5th row: 31

Common Values

Value  Count  Frequency (%)
31     97     97.0%
30     3      3.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

one_area_nm
Categorical

HIGH CORRELATION 

Distinct: 9
Distinct (%): 9.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 5
Median length: 3
Mean length: 3.7
Min length: 3

Unique

Unique: 2
Unique (%): 2.0%

Sample

1st row: 강원도
2nd row: 충청북도
3rd row: 강원도
4th row: 강원도
5th row: 강원도

Common Values

Value       Count  Frequency (%)
경기도      36     36.0%
강원도      16     16.0%
부산광역시  15     15.0%
경상남도    13     13.0%
경상북도    10     10.0%
광주광역시  5      5.0%
충청북도    3      3.0%
대구광역시  1      1.0%
서울특별시  1      1.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

two_area_nm
Text

Distinct: 95
Distinct (%): 95.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 9
Median length: 3
Mean length: 3.99
Min length: 2

Characters and Unicode

Total characters: 399
Distinct characters: 94
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
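
As a rough illustration of how such character properties can be read, the sketch below tallies the Unicode general category of each character in the five sample values listed for this text column. It uses only Python's standard `unicodedata` module; it is not the profiler's own implementation, and the variable names are illustrative.

```python
import unicodedata
from collections import Counter

# The five sample values of the text column, copied from the Sample list.
samples = ["강릉시", "청주시 청원구", "동해시", "삼척시", "속초시"]

# Count the Unicode general category of every character.
# "Lo" is Other Letter (Hangul syllables); "Zs" is Space Separator.
category_counts = Counter(
    unicodedata.category(ch) for value in samples for ch in value
)
print(category_counts)  # Counter({'Lo': 18, 'Zs': 1})
```

The same two categories, Other Letter and Space Separator, are the only ones the report finds across all 399 characters of the column.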

Unique

Unique: 90
Unique (%): 90.0%

Sample

1st row: 강릉시
2nd row: 청주시 청원구
3rd row: 동해시
4th row: 삼척시
5th row: 속초시

Words

Value              Count  Frequency (%)
창원시             5      4.0%
수원시             4      3.2%
고양시             3      2.4%
용인시             3      2.4%
북구               3      2.4%
남구               3      2.4%
안양시             2      1.6%
안산시             2      1.6%
성남시             2      1.6%
청주시             2      1.6%
Other values (92)  96     76.8%

Most occurring characters

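Value              Count  Frequency (%)
[character lost]   61     15.3%
[character lost]   49     12.3%
(space)            25     6.3%
[character lost]   18     4.5%
[character lost]   15     3.8%
[character lost]   14     3.5%
[character lost]   12     3.0%
[character lost]   11     2.8%
[character lost]   9      2.3%
[character lost]   8      2.0%
Other values (84)  177    44.4%

Most occurring categories

Value            Count  Frequency (%)
Other Letter     374    93.7%
Space Separator  25     6.3%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
[character lost]   61     16.3%
[character lost]   49     13.1%
[character lost]   18     4.8%
[character lost]   15     4.0%
[character lost]   14     3.7%
[character lost]   12     3.2%
[character lost]   11     2.9%
[character lost]   9      2.4%
[character lost]   8      2.1%
[character lost]   8      2.1%
Other values (83)  169    45.2%
Space Separator
Value    Count  Frequency (%)
(space)  25     100.0%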

Most occurring scripts

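Value   Count  Frequency (%)
Hangul  374    93.7%
Common  25     6.3%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
[character lost]   61     16.3%
[character lost]   49     13.1%
[character lost]   18     4.8%
[character lost]   15     4.0%
[character lost]   14     3.7%
[character lost]   12     3.2%
[character lost]   11     2.9%
[character lost]   9      2.4%
[character lost]   8      2.1%
[character lost]   8      2.1%
Other values (83)  169    45.2%
Common
Value    Count  Frequency (%)
(space)  25     100.0%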

Most occurring blocks

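Value   Count  Frequency (%)
Hangul  374    93.7%
ASCII   25     6.3%

Most frequent character per block

Hangul
Value              Count  Frequency (%)
[character lost]   61     16.3%
[character lost]   49     13.1%
[character lost]   18     4.8%
[character lost]   15     4.0%
[character lost]   14     3.7%
[character lost]   12     3.2%
[character lost]   11     2.9%
[character lost]   9      2.4%
[character lost]   8      2.1%
[character lost]   8      2.1%
Other values (83)  169    45.2%
ASCII
Value    Count  Frequency (%)
(space)  25     100.0%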

lon_co
Real number (ℝ)

Distinct: 99
Distinct (%): 99.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2287.56
Minimum: 1
Maximum: 11049
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 136.9
Q1: 581
median: 1422.5
Q3: 3253.5
95-th percentile: 8550.2
Maximum: 11049
Range: 11048
Interquartile range (IQR): 2672.5
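
The range and IQR are simple differences of the quantiles listed above, which the short sketch below recomputes. The upper fence at Q3 + 1.5 × IQR is an illustrative convention (Tukey's rule) for spotting unusually large values and is not something the report itself applies.

```python
# Values copied from the quantile statistics of lon_co.
minimum, q1, q3, maximum = 1, 581, 3253.5, 11049

value_range = maximum - minimum  # spread of the full data
iqr = q3 - q1                    # spread of the middle 50% of the data

# Tukey's rule: values above Q3 + 1.5 * IQR are often treated as outliers.
upper_fence = q3 + 1.5 * iqr

print(value_range, iqr, upper_fence)  # 11048 2672.5 7262.25
```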

Descriptive statistics

Standard deviation: 2482.1448
Coefficient of variation (CV): 1.0850622
Kurtosis: 2.6925771
Mean: 2287.56
Median Absolute Deviation (MAD): 1060
Skewness: 1.7307779
Sum: 228756
Variance: 6161043
Monotonicity: Not monotonic
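
Several of these statistics are derived from one another, so they can be cross-checked from the reported values alone. The sketch below is a consistency check under that reading, not a recomputation from the raw data; the variable names are illustrative.

```python
# Values copied from the report for lon_co.
total, n = 228756, 100  # Sum and number of observations
std = 2482.1448         # Standard deviation

mean = total / n        # mean is the sum divided by the row count
cv = std / mean         # coefficient of variation
variance = std ** 2     # variance is the squared standard deviation

print(mean, round(cv, 7), round(variance))
```

A CV above 1 means the standard deviation exceeds the mean, which fits the strong right skew (skewness 1.73) reported for this column.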
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value              Count  Frequency (%)
3243               2      2.0%
587                1      1.0%
1004               1      1.0%
233                1      1.0%
238                1      1.0%
679                1      1.0%
973                1      1.0%
3101               1      1.0%
1881               1      1.0%
842                1      1.0%
Other values (89)  89     89.0%

Minimum 10 values

Value  Count  Frequency (%)
1      1      1.0%
67     1      1.0%
79     1      1.0%
86     1      1.0%
97     1      1.0%
139    1      1.0%
157    1      1.0%
166    1      1.0%
168    1      1.0%
217    1      1.0%

Maximum 10 values

Value  Count  Frequency (%)
11049  1      1.0%
10361  1      1.0%
9220   1      1.0%
8937   1      1.0%
8782   1      1.0%
8538   1      1.0%
7758   1      1.0%
7227   1      1.0%
6211   1      1.0%
6177   1      1.0%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

                  anals_trget_year  anals_trget_mt  anals_trget_day  one_area_nm  two_area_nm  lon_co
anals_trget_year  1.000             0.963           0.963            1.000        1.000        0.000
anals_trget_mt    0.963             1.000           0.963            1.000        1.000        0.000
anals_trget_day   0.963             0.963           1.000            1.000        1.000        0.000
one_area_nm       1.000             1.000           1.000            1.000        0.000        0.377
two_area_nm       1.000             1.000           1.000            0.000        1.000        1.000
lon_co            0.000             0.000           0.000            0.377        1.000        1.000

[Figure: correlation heatmap]

                  anals_trget_day  anals_trget_mt  anals_trget_year  one_area_nm
anals_trget_day   1.000            0.826           0.826             0.964
anals_trget_mt    0.826            1.000           0.826             0.964
anals_trget_year  0.826            0.826           1.000             0.964
one_area_nm       0.964            0.964           0.964             1.000
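
The report does not name the coefficient behind this matrix. With that caveat, the 0.826 entries match what a bias-corrected Cramér's V with Yates' continuity correction gives for two perfectly associated binary columns split 97/3. The sketch below works that out by hand; the helper `cramers_v_2x2` is illustrative, and the perfect association between year and month is an assumption based on the sample rows, not something the report states.

```python
import math

def cramers_v_2x2(a, b, c, d):
    """Bias-corrected Cramér's V for the 2x2 table [[a, b], [c, d]],
    with Yates' continuity correction in the chi-squared statistic."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            # Yates: shrink each |observed - expected| by at most 0.5.
            diff = abs(observed[i][j] - expected)
            diff -= min(0.5, diff)
            chi2 += diff ** 2 / expected
    phi2 = chi2 / n
    r = k = 2  # number of rows and columns in the table
    # Bias correction for small samples.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(r_corr - 1, k_corr - 1))

# Assumed cross-tabulation of anals_trget_year against anals_trget_mt:
# 97 rows with (2018, 1), 3 rows with (2020, 12), and no mixed pairs.
v = cramers_v_2x2(97, 0, 0, 3)
print(round(v, 3))  # 0.826
```

Under this reading, the value falls short of 1.000 only because the continuity correction penalizes the very small minority class, not because the columns disagree on any row.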

[Figure: correlation heatmap]

                  lon_co  anals_trget_year  anals_trget_mt  anals_trget_day  one_area_nm
lon_co            1.000   0.000             0.000           0.000            0.178
anals_trget_year  0.000   1.000             0.826           0.826            0.964
anals_trget_mt    0.000   0.826             1.000           0.826            0.964
anals_trget_day   0.000   0.826             0.826           1.000            0.964
one_area_nm       0.178   0.964             0.964           0.964            1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

    anals_trget_year  anals_trget_mt  anals_trget_day  one_area_nm  two_area_nm    lon_co
0   2018              1               31               강원도       강릉시         587
1   2020              12              30               충청북도     청주시 청원구  1943
2   2018              1               31               강원도       동해시         990
3   2018              1               31               강원도       삼척시         446
4   2018              1               31               강원도       속초시         802
5   2018              1               31               강원도       양구군         270
6   2018              1               31               강원도       양양군         79
7   2020              12              30               충청북도     청주시 흥덕구  1761
8   2018              1               31               강원도       원주시         3316
9   2018              1               31               강원도       인제군         168

Last rows

    anals_trget_year  anals_trget_mt  anals_trget_day  one_area_nm  two_area_nm  lon_co
90  2018              1               31               부산광역시   부산진구     3314
91  2018              1               31               부산광역시   북구         2957
92  2018              1               31               부산광역시   사상구       495
93  2018              1               31               부산광역시   사하구       391
94  2018              1               31               부산광역시   서구         352
95  2018              1               31               부산광역시   연제구       688
96  2018              1               31               부산광역시   영도구       783
97  2018              1               31               부산광역시   중구         387
98  2018              1               31               부산광역시   해운대구     4922
99  2018              1               31               서울특별시   강남구       8782