Overview

Dataset statistics

Number of variables: 7
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.1 KiB
Average record size in memory: 62.3 B

Variable types

Categorical: 3
Text: 1
Numeric: 3

Alerts

anals_trget_year has constant value "2020" [Constant]
anals_trget_mt is highly overall correlated with one_area_nm [High correlation]
one_area_nm is highly overall correlated with anals_trget_mt [High correlation]
public_lbrry_co is highly overall correlated with popltn_co [High correlation]
popltn_co is highly overall correlated with public_lbrry_co and 1 other field [High correlation]
avrg_popltn_co is highly overall correlated with popltn_co [High correlation]
anals_trget_mt is highly imbalanced (80.6%) [Imbalance]
two_area_nm has unique values [Unique]
popltn_co has unique values [Unique]
avrg_popltn_co has unique values [Unique]
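
The 80.6% imbalance figure for anals_trget_mt can be reproduced from that variable's value counts (97 rows with value 1, 3 rows with value 12). The sketch below assumes the score is one minus the Shannon entropy of the class distribution, normalised by its maximum log2(number of classes). That definition is an assumption about how the report derives the number, and the helper name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max for a list of class counts.

    0.0 means perfectly balanced classes, values near 1.0 mean one class dominates.
    """
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # assumption: a single class is not scored as imbalanced
    total = sum(counts)
    # Shannon entropy in bits; empty classes contribute nothing.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)

# Value counts of anals_trget_mt as listed in the report: 1 -> 97, 12 -> 3.
score = imbalance_score([97, 3])
print(f"{score:.1%}")  # 80.6%
```

Under this definition a 50/50 split scores 0, so a score of 80.6% reflects how close the column is to being constant.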

Reproduction

Analysis started: 2023-12-10 10:06:18.534544
Analysis finished: 2023-12-10 10:06:20.689684
Duration: 2.16 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

anals_trget_year
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Value counts: 2020 (100)

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020
2nd row: 2020
3rd row: 2020
4th row: 2020
5th row: 2020

Common Values

Value | Count | Frequency (%)
2020 | 100 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

anals_trget_mt
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Value counts: 1 (97), 12 (3)

Length

Max length: 2
Median length: 1
Mean length: 1.03
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 12
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value | Count | Frequency (%)
1 | 97 | 97.0%
12 | 3 | 3.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

one_area_nm
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B
Value counts: 경기도 (41), 경상남도 (22), 경상북도 (18), 강원도 (16), 충청북도 (3)

Length

Max length: 4
Median length: 3
Mean length: 3.43
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강원도
2nd row: 충청북도
3rd row: 강원도
4th row: 강원도
5th row: 강원도

Common Values

Value | Count | Frequency (%)
경기도 | 41 | 41.0%
경상남도 | 22 | 22.0%
경상북도 | 18 | 18.0%
강원도 | 16 | 16.0%
충청북도 | 3 | 3.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

two_area_nm
Text

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 9
Median length: 3
Mean length: 4.01
Min length: 3

Characters and Unicode

Total characters: 401
Distinct characters: 91
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: 강릉시
2nd row: 청주시 청원구
3rd row: 동해시
4th row: 삼척시
5th row: 속초시

Word frequencies

Value | Count | Frequency (%)
창원시 | 5 | 4.1%
수원시 | 4 | 3.3%
용인시 | 3 | 2.4%
고양시 | 3 | 2.4%
청주시 | 2 | 1.6%
성남시 | 2 | 1.6%
안양시 | 2 | 1.6%
안산시 | 2 | 1.6%
평택시 | 1 | 0.8%
밀양시 | 1 | 0.8%
Other values (98) | 98 | 79.7%

Most occurring characters

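Value | Count | Frequency (%)
(not preserved) | 70 | 17.5%
(not preserved) | 33 | 8.2%
(not preserved) | 26 | 6.5%
(space) | 23 | 5.7%
(not preserved) | 16 | 4.0%
(not preserved) | 15 | 3.7%
(not preserved) | 14 | 3.5%
(not preserved) | 14 | 3.5%
(not preserved) | 11 | 2.7%
(not preserved) | 10 | 2.5%
Other values (81) | 169 | 42.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 378 | 94.3%
Space Separator | 23 | 5.7%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not preserved) | 70 | 18.5%
(not preserved) | 33 | 8.7%
(not preserved) | 26 | 6.9%
(not preserved) | 16 | 4.2%
(not preserved) | 15 | 4.0%
(not preserved) | 14 | 3.7%
(not preserved) | 14 | 3.7%
(not preserved) | 11 | 2.9%
(not preserved) | 10 | 2.6%
(not preserved) | 9 | 2.4%
Other values (80) | 160 | 42.3%

Space Separator
Value | Count | Frequency (%)
(space) | 23 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 378 | 94.3%
Common | 23 | 5.7%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(not preserved) | 70 | 18.5%
(not preserved) | 33 | 8.7%
(not preserved) | 26 | 6.9%
(not preserved) | 16 | 4.2%
(not preserved) | 15 | 4.0%
(not preserved) | 14 | 3.7%
(not preserved) | 14 | 3.7%
(not preserved) | 11 | 2.9%
(not preserved) | 10 | 2.6%
(not preserved) | 9 | 2.4%
Other values (80) | 160 | 42.3%

Common
Value | Count | Frequency (%)
(space) | 23 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 378 | 94.3%
ASCII | 23 | 5.7%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(not preserved) | 70 | 18.5%
(not preserved) | 33 | 8.7%
(not preserved) | 26 | 6.9%
(not preserved) | 16 | 4.2%
(not preserved) | 15 | 4.0%
(not preserved) | 14 | 3.7%
(not preserved) | 14 | 3.7%
(not preserved) | 11 | 2.9%
(not preserved) | 10 | 2.6%
(not preserved) | 9 | 2.4%
Other values (80) | 160 | 42.3%

ASCII
Value | Count | Frequency (%)
(space) | 23 | 100.0%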

public_lbrry_co
Real number (ℝ)

HIGH CORRELATION 

Distinct: 15
Distinct (%): 15.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.67
Minimum: 1
Maximum: 17
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 4
Q3: 6
95-th percentile: 11.05
Maximum: 17
Range: 16
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.2382031
Coefficient of variation (CV): 0.69340538
Kurtosis: 3.1318755
Mean: 4.67
Median Absolute Deviation (MAD): 2
Skewness: 1.5342263
Sum: 467
Variance: 10.48596
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=15)]

Common Values

Value | Count | Frequency (%)
3 | 18 | 18.0%
2 | 14 | 14.0%
5 | 13 | 13.0%
1 | 13 | 13.0%
6 | 11 | 11.0%
7 | 10 | 10.0%
4 | 9 | 9.0%
8 | 3 | 3.0%
12 | 2 | 2.0%
9 | 2 | 2.0%
Other values (5) | 5 | 5.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 13 | 13.0%
2 | 14 | 14.0%
3 | 18 | 18.0%
4 | 9 | 9.0%
5 | 13 | 13.0%
6 | 11 | 11.0%
7 | 10 | 10.0%
8 | 3 | 3.0%
9 | 2 | 2.0%
10 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
17 | 1 | 1.0%
16 | 1 | 1.0%
15 | 1 | 1.0%
12 | 2 | 2.0%
11 | 1 | 1.0%
10 | 1 | 1.0%
9 | 2 | 2.0%
8 | 3 | 3.0%
7 | 10 | 10.0%
6 | 11 | 11.0%
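
Because public_lbrry_co has only 15 distinct values, its full distribution can be rebuilt from the ascending and descending frequency listings, and most of the summary statistics can then be cross-checked. The sketch below uses only the Python standard library. It assumes the report uses the sample variance (n - 1 denominator) and linearly interpolated quantiles; both are assumptions about the profiler, chosen because they reproduce the reported figures.

```python
import math
import statistics

# Value -> count for public_lbrry_co, combined from the ascending and
# descending frequency listings (15 distinct values, 100 rows in total).
counts = {1: 13, 2: 14, 3: 18, 4: 9, 5: 13, 6: 11, 7: 10, 8: 3,
          9: 2, 10: 1, 11: 1, 12: 2, 15: 1, 16: 1, 17: 1}

# Expand the table to one entry per observation.
values = sorted(v for v, c in counts.items() for _ in range(c))

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std = math.sqrt(variance)
cv = std / mean                         # coefficient of variation
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
mad = statistics.median(abs(v - median) for v in values)

print(f"n={len(values)} sum={sum(values)} mean={mean}")
print(f"variance={variance:.5f} std={std:.7f} cv={cv:.8f}")
print(f"Q1={q1} median={median} Q3={q3} IQR={q3 - q1} MAD={mad}")
```

The rebuilt sample has 100 observations summing to 467, which matches the reported Sum and Mean, so the two listings together cover every row.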

popltn_co
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 201865.64
Minimum: 9521
Maximum: 828947
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 9521
5-th percentile: 27014.75
Q1: 54445
Median: 176362
Q3: 285624.25
95-th percentile: 477370
Maximum: 828947
Range: 819426
Interquartile range (IQR): 231179.25

Descriptive statistics

Standard deviation: 172734.01
Coefficient of variation (CV): 0.85568804
Kurtosis: 2.2375334
Mean: 201865.64
Median Absolute Deviation (MAD): 121450.5
Skewness: 1.3500848
Sum: 20186564
Variance: 2.9837039 × 10^10
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value | Count | Frequency (%)
213328 | 1 | 1.0%
43539 | 1 | 1.0%
219933 | 1 | 1.0%
193975 | 1 | 1.0%
177026 | 1 | 1.0%
62182 | 1 | 1.0%
347489 | 1 | 1.0%
27131 | 1 | 1.0%
351168 | 1 | 1.0%
35336 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Minimum 10 values

Value | Count | Frequency (%)
9521 | 1 | 1.0%
16999 | 1 | 1.0%
22526 | 1 | 1.0%
23821 | 1 | 1.0%
24806 | 1 | 1.0%
27131 | 1 | 1.0%
27709 | 1 | 1.0%
31567 | 1 | 1.0%
32052 | 1 | 1.0%
32296 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
828947 | 1 | 1.0%
818760 | 1 | 1.0%
702545 | 1 | 1.0%
542713 | 1 | 1.0%
514876 | 1 | 1.0%
475396 | 1 | 1.0%
467673 | 1 | 1.0%
453961 | 1 | 1.0%
451876 | 1 | 1.0%
437789 | 1 | 1.0%

avrg_popltn_co
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 41158.648
Minimum: 8402.8
Maximum: 140991
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 8402.8
5-th percentile: 14551.203
Q1: 25531.75
Median: 39214.8
Q3: 54263.997
95-th percentile: 73411.083
Maximum: 140991
Range: 132588.2
Interquartile range (IQR): 28732.247

Descriptive statistics

Standard deviation: 21442.214
Coefficient of variation (CV): 0.52096497
Kurtosis: 3.5937672
Mean: 41158.648
Median Absolute Deviation (MAD): 15155.865
Skewness: 1.2533243
Sum: 4115864.8
Variance: 4.5976854 × 10^8
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value | Count | Frequency (%)
53332.0 | 1 | 1.0%
21769.5 | 1 | 1.0%
73311.0 | 1 | 1.0%
48493.75 | 1 | 1.0%
59008.67 | 1 | 1.0%
20727.33 | 1 | 1.0%
49641.29 | 1 | 1.0%
27131.0 | 1 | 1.0%
58528.0 | 1 | 1.0%
35336.0 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Minimum 10 values

Value | Count | Frequency (%)
8402.8 | 1 | 1.0%
9521.0 | 1 | 1.0%
11366.75 | 1 | 1.0%
12403.0 | 1 | 1.0%
14377.8 | 1 | 1.0%
14560.33 | 1 | 1.0%
15459.0 | 1 | 1.0%
15575.75 | 1 | 1.0%
15783.5 | 1 | 1.0%
16148.0 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
140991.0 | 1 | 1.0%
92599.5 | 1 | 1.0%
87743.0 | 1 | 1.0%
87550.5 | 1 | 1.0%
75312.67 | 1 | 1.0%
73311.0 | 1 | 1.0%
73307.0 | 1 | 1.0%
66338.33 | 1 | 1.0%
65051.5 | 1 | 1.0%
64479.67 | 1 | 1.0%

Interactions

[Figures: 9 interaction plots]

Correlations

[Figure: correlation plot]
variable | anals_trget_mt | one_area_nm | two_area_nm | public_lbrry_co | popltn_co | avrg_popltn_co
anals_trget_mt | 1.000 | 1.000 | 1.000 | 0.110 | 0.000 | 0.000
one_area_nm | 1.000 | 1.000 | 1.000 | 0.438 | 0.406 | 0.228
two_area_nm | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
public_lbrry_co | 0.110 | 0.438 | 1.000 | 1.000 | 0.876 | 0.000
popltn_co | 0.000 | 0.406 | 1.000 | 0.876 | 1.000 | 0.470
avrg_popltn_co | 0.000 | 0.228 | 1.000 | 0.000 | 0.470 | 1.000

[Figure: correlation plot]
variable | anals_trget_mt | one_area_nm
anals_trget_mt | 1.000 | 0.985
one_area_nm | 0.985 | 1.000

[Figure: correlation plot]
variable | public_lbrry_co | popltn_co | avrg_popltn_co | anals_trget_mt | one_area_nm
public_lbrry_co | 1.000 | 0.855 | 0.292 | 0.102 | 0.279
popltn_co | 0.855 | 1.000 | 0.716 | 0.000 | 0.243
avrg_popltn_co | 0.292 | 0.716 | 1.000 | 0.000 | 0.152
anals_trget_mt | 0.102 | 0.000 | 0.000 | 1.000 | 0.985
one_area_nm | 0.279 | 0.243 | 0.152 | 0.985 | 1.000
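
The 0.985 between anals_trget_mt and one_area_nm can be reproduced from the value counts alone. The report text does not say which coefficient is used for this categorical pair; the sketch below assumes a bias-corrected Cramér's V, and further assumes that the three 충청북도 rows are exactly the three rows with anals_trget_mt = 12 (the sample rows show two of them). The function name `cramers_v_corrected` is hypothetical.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows.

    Assumes every row and column total is positive.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2_corr = max(0.0, chi2 / n - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

# Rows: anals_trget_mt = 1, anals_trget_mt = 12.
# Columns: 강원도, 경기도, 경상남도, 경상북도, 충청북도.
table = [
    [16, 41, 22, 18, 0],
    [0, 0, 0, 0, 3],
]
print(round(cramers_v_corrected(table), 3))  # 0.985
```

Without the bias correction the same table gives exactly 1, since the month is fully determined by the province under the stated assumption.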

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

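First 10 rows

index | anals_trget_year | anals_trget_mt | one_area_nm | two_area_nm | public_lbrry_co | popltn_co | avrg_popltn_co
0 | 2020 | 1 | 강원도 | 강릉시 | 4 | 213328 | 53332.0
1 | 2020 | 12 | 충청북도 | 청주시 청원구 | 5 | 194373 | 38874.6
2 | 2020 | 1 | 강원도 | 동해시 | 3 | 90417 | 30139.0
3 | 2020 | 1 | 강원도 | 삼척시 | 3 | 66806 | 22268.67
4 | 2020 | 1 | 강원도 | 속초시 | 3 | 81840 | 27280.0
5 | 2020 | 1 | 강원도 | 양구군 | 1 | 22526 | 22526.0
6 | 2020 | 1 | 강원도 | 양양군 | 1 | 27709 | 27709.0
7 | 2020 | 12 | 충청북도 | 청주시 흥덕구 | 5 | 265866 | 53173.2
8 | 2020 | 1 | 강원도 | 원주시 | 4 | 350202 | 87550.5
9 | 2020 | 1 | 강원도 | 인제군 | 2 | 31567 | 15783.5

Last 10 rows

index | anals_trget_year | anals_trget_mt | one_area_nm | two_area_nm | public_lbrry_co | popltn_co | avrg_popltn_co
90 | 2020 | 1 | 경상북도 | 상주시 | 2 | 99814 | 49907.0
91 | 2020 | 1 | 경상북도 | 성주군 | 2 | 43975 | 21987.5
92 | 2020 | 1 | 경상북도 | 안동시 | 6 | 159844 | 26640.67
93 | 2020 | 1 | 경상북도 | 영덕군 | 1 | 37233 | 37233.0
94 | 2020 | 1 | 경상북도 | 영양군 | 1 | 16999 | 16999.0
95 | 2020 | 1 | 경상북도 | 영주시 | 3 | 104985 | 34995.0
96 | 2020 | 1 | 경상북도 | 영천시 | 2 | 102163 | 51081.5
97 | 2020 | 1 | 경상북도 | 예천군 | 2 | 55192 | 27596.0
98 | 2020 | 1 | 경상북도 | 울릉군 | 1 | 9521 | 9521.0
99 | 2020 | 1 | 경상북도 | 울진군 | 3 | 49188 | 16396.0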
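
In the sample rows, avrg_popltn_co appears to equal popltn_co divided by public_lbrry_co, rounded to two decimals, which would explain why the two population columns are flagged as correlated. The check below is a minimal sketch over a handful of the rows shown in the sample; the relationship is inferred from those rows and is not stated anywhere in the report, and the helper name `population_per_library` is hypothetical.

```python
# (two_area_nm, public_lbrry_co, popltn_co, avrg_popltn_co), copied from the sample rows.
rows = [
    ("강릉시", 4, 213328, 53332.0),
    ("청주시 청원구", 5, 194373, 38874.6),
    ("동해시", 3, 90417, 30139.0),
    ("삼척시", 3, 66806, 22268.67),
    ("원주시", 4, 350202, 87550.5),
    ("안동시", 6, 159844, 26640.67),
    ("울진군", 3, 49188, 16396.0),
]

def population_per_library(population, libraries):
    """Assumed derivation of avrg_popltn_co: residents per public library."""
    return population / libraries

# Collect any row whose reported average differs from the derived one by
# more than the two-decimal rounding used in the report.
mismatches = [
    name
    for name, libraries, population, reported in rows
    if abs(population_per_library(population, libraries) - reported) >= 0.01
]
print(mismatches)  # []
```

If the relationship holds for all 100 rows, avrg_popltn_co carries no information beyond the other two numeric columns and could be dropped or recomputed when modelling.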