Dataset statistics
Number of variables | 7 |
---|---|
Number of observations | 100 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 6.1 KiB |
Average record size in memory | 62.3 B |
Variable types
Categorical | 3 |
---|---|
Text | 1 |
Numeric | 3 |
Dataset
Description | Sample |
---|---|
Author | National Library of Korea (국립중앙도서관) |
URL | https://www.bigdata-culture.kr/bigdata/user/data_market/detail.do?id=eff883d0-1524-11ec-bbc0-d7035fffebeb |
anals_trget_year has constant value "2020" | Constant |
anals_trget_mt is highly overall correlated with one_area_nm | High correlation |
one_area_nm is highly overall correlated with anals_trget_mt | High correlation |
public_lbrry_co is highly overall correlated with popltn_co | High correlation |
popltn_co is highly overall correlated with public_lbrry_co and 1 other fields | High correlation |
avrg_popltn_co is highly overall correlated with popltn_co | High correlation |
anals_trget_mt is highly imbalanced (80.6%) | Imbalance |
two_area_nm has unique values | Unique |
popltn_co has unique values | Unique |
avrg_popltn_co has unique values | Unique |
Reproduction
Analysis started | 2023-12-10 10:06:18.534544 |
---|---|
Analysis finished | 2023-12-10 10:06:20.689684 |
Duration | 2.16 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
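A report like this one could be regenerated along the following lines. This is a minimal sketch, not the configuration actually used for this run: the CSV path is hypothetical, the report title is arbitrary, and the `dtype` override is an assumption made so that `anals_trget_year` and `anals_trget_mt` are read as strings and therefore profiled as Categorical, as they are in this report.

```python
def build_report(csv_path: str, out_path: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report (needs pandas and ydata-profiling)."""
    # Imported lazily so this module still loads when the packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: year and month are kept as strings so they are typed as Categorical.
    df = pd.read_csv(
        csv_path,
        dtype={"anals_trget_year": str, "anals_trget_mt": str},
    )

    report = ProfileReport(df, title="Dataset statistics")
    report.to_file(out_path)
```

Calling `build_report("libraries.csv")` (a hypothetical file name) would write `report.html` next to the script.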
anals_trget_year
Categorical
CONSTANT
 
Distinct | 1 |
---|---|
Distinct (%) | 1.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
2020 | 100 |
---|---|
Length
Max length | 4 |
---|---|
Median length | 4 |
Mean length | 4 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 2020 |
---|---|
2nd row | 2020 |
3rd row | 2020 |
4th row | 2020 |
5th row | 2020 |
Common Values
Value | Count | Frequency (%) |
2020 | 100 | 100.0% |
anals_trget_mt
Categorical
HIGH CORRELATION
  IMBALANCE
 
Distinct | 2 |
---|---|
Distinct (%) | 2.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
1 | 97 |
---|---|
12 | 3 |
Length
Max length | 2 |
---|---|
Median length | 1 |
Mean length | 1.03 |
Min length | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 1 |
---|---|
2nd row | 12 |
3rd row | 1 |
4th row | 1 |
5th row | 1 |
Common Values
Value | Count | Frequency (%) |
1 | 97 | 97.0% |
12 | 3 | 3.0% |
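The 80.6% imbalance reported for this variable is consistent with an entropy-based score, 1 - H / log2(k), where H is the Shannon entropy of the class shares and k is the number of classes. The sketch below applies that formula to the two counts in the table above; the function name is illustrative and the formula is offered as a plausible reading of the alert, not as a statement of the profiler's internals.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the base-2 Shannon entropy of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class carries no imbalance information
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Counts for anals_trget_mt: value "1" appears 97 times, value "12" appears 3 times.
score = imbalance_score([97, 3])
print(f"{score:.1%}")  # 80.6%
```

A perfectly balanced column scores 0, and the score approaches 1 as a single class dominates.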
one_area_nm
Categorical
HIGH CORRELATION
 
Distinct | 5 |
---|---|
Distinct (%) | 5.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
경기도 | 41 |
---|---|
경상남도 | 22 |
경상북도 | 18 |
강원도 | 16 |
충청북도 | 3 |
Length
Max length | 4 |
---|---|
Median length | 3 |
Mean length | 3.43 |
Min length | 3 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 강원도 |
---|---|
2nd row | 충청북도 |
3rd row | 강원도 |
4th row | 강원도 |
5th row | 강원도 |
Common Values
Value | Count | Frequency (%) |
경기도 | 41 | 41.0% |
경상남도 | 22 | 22.0% |
경상북도 | 18 | 18.0% |
강원도 | 16 | 16.0% |
충청북도 | 3 | 3.0% |
two_area_nm
Text
UNIQUE
 
Distinct | 100 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 932.0 B |
Value | Count | Frequency (%) |
창원시 | 5 | 4.1% |
수원시 | 4 | 3.3% |
용인시 | 3 | 2.4% |
고양시 | 3 | 2.4% |
청주시 | 2 | 1.6% |
성남시 | 2 | 1.6% |
안양시 | 2 | 1.6% |
안산시 | 2 | 1.6% |
평택시 | 1 | 0.8% |
밀양시 | 1 | 0.8% |
Other values (98) | 98 | 79.7% |
Most occurring characters
Value | Count | Frequency (%) |
시 | 70 | 17.5% |
군 | 33 | 8.2% |
구 | 26 | 6.5% |
(space) | 23 | 5.7% |
양 | 16 | 4.0% |
원 | 15 | 3.7% |
주 | 14 | 3.5% |
천 | 14 | 3.5% |
산 | 11 | 2.7% |
안 | 10 | 2.5% |
Other values (81) | 169 | 42.1% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 378 | 94.3% |
Space Separator | 23 | 5.7% |
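The character and category counts in this section can be reproduced with the standard library. The sketch below is only an illustration of the technique: it uses the ten `two_area_nm` values that appear in the first sample rows of this report rather than all 100 values, so its totals are smaller than the ones in the tables. In Unicode terms, `Lo` is "Other Letter" and `Zs` is "Space Separator".

```python
import unicodedata
from collections import Counter

# The first ten two_area_nm values shown in the sample rows of this report.
names = ["강릉시", "청주시 청원구", "동해시", "삼척시", "속초시",
         "양구군", "양양군", "청주시 흥덕구", "원주시", "인제군"]

# Count individual characters; spaces inside a name are counted as well.
chars = Counter("".join(names))

# Count Unicode general categories (Lo = Other Letter, Zs = Space Separator).
categories = Counter(unicodedata.category(ch) for name in names for ch in name)

print(chars.most_common(3))
print(dict(categories))
```

Running the same two counters over the full column would be expected to give the 378 letters and 23 spaces reported above.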
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
시 | 70 | 18.5% |
군 | 33 | 8.7% |
구 | 26 | 6.9% |
양 | 16 | 4.2% |
원 | 15 | 4.0% |
주 | 14 | 3.7% |
천 | 14 | 3.7% |
산 | 11 | 2.9% |
안 | 10 | 2.6% |
창 | 9 | 2.4% |
Other values (80) | 160 | 42.3% |
Space Separator
Value | Count | Frequency (%) |
(space) | 23 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 378 | 94.3% |
Common | 23 | 5.7% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
시 | 70 | 18.5% |
군 | 33 | 8.7% |
구 | 26 | 6.9% |
양 | 16 | 4.2% |
원 | 15 | 4.0% |
주 | 14 | 3.7% |
천 | 14 | 3.7% |
산 | 11 | 2.9% |
안 | 10 | 2.6% |
창 | 9 | 2.4% |
Other values (80) | 160 | 42.3% |
Common
Value | Count | Frequency (%) |
(space) | 23 | 100.0% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 378 | 94.3% |
ASCII | 23 | 5.7% |
Most frequent character per block
Hangul
Value | Count | Frequency (%) |
시 | 70 | 18.5% |
군 | 33 | 8.7% |
구 | 26 | 6.9% |
양 | 16 | 4.2% |
원 | 15 | 4.0% |
주 | 14 | 3.7% |
천 | 14 | 3.7% |
산 | 11 | 2.9% |
안 | 10 | 2.6% |
창 | 9 | 2.4% |
Other values (80) | 160 | 42.3% |
ASCII
Value | Count | Frequency (%) |
(space) | 23 | 100.0% |
public_lbrry_co
Real number (ℝ)
HIGH CORRELATION
 
Distinct | 15 |
---|---|
Distinct (%) | 15.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 4.67 |
Minimum | 1 |
---|---|
Maximum | 17 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.0 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 1 |
Q1 | 2 |
median | 4 |
Q3 | 6 |
95-th percentile | 11.05 |
Maximum | 17 |
Range | 16 |
Interquartile range (IQR) | 4 |
Descriptive statistics
Standard deviation | 3.2382031 |
---|---|
Coefficient of variation (CV) | 0.69340538 |
Kurtosis | 3.1318755 |
Mean | 4.67 |
Median Absolute Deviation (MAD) | 2 |
Skewness | 1.5342263 |
Sum | 467 |
Variance | 10.48596 |
Monotonicity | Not monotonic |
Common Values
Value | Count | Frequency (%) |
3 | 18 | 18.0% |
2 | 14 | 14.0% |
5 | 13 | 13.0% |
1 | 13 | 13.0% |
6 | 11 | 11.0% |
7 | 10 | 10.0% |
4 | 9 | 9.0% |
8 | 3 | 3.0% |
12 | 2 | 2.0% |
9 | 2 | 2.0% |
Other values (5) | 5 | 5.0% |
Minimum 10 values
Value | Count | Frequency (%) |
1 | 13 | 13.0% |
2 | 14 | 14.0% |
3 | 18 | 18.0% |
4 | 9 | 9.0% |
5 | 13 | 13.0% |
6 | 11 | 11.0% |
7 | 10 | 10.0% |
8 | 3 | 3.0% |
9 | 2 | 2.0% |
10 | 1 | 1.0% |
Maximum 10 values
Value | Count | Frequency (%) |
17 | 1 | 1.0% |
16 | 1 | 1.0% |
15 | 1 | 1.0% |
12 | 2 | 2.0% |
11 | 1 | 1.0% |
10 | 1 | 1.0% |
9 | 2 | 2.0% |
8 | 3 | 3.0% |
7 | 10 | 10.0% |
6 | 11 | 11.0% |
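Because `public_lbrry_co` takes only 15 distinct values, the whole column can be rebuilt from the frequency tables above and the summary statistics recomputed as a cross-check. The sketch below uses only the standard library. It assumes the sample (n - 1) variance and linearly interpolated quartiles (the `inclusive` method), which is what the reported figures are consistent with.

```python
import statistics

# Value -> count, combined from the frequency tables above (15 distinct values, 100 rows).
freq = {1: 13, 2: 14, 3: 18, 4: 9, 5: 13, 6: 11, 7: 10, 8: 3,
        9: 2, 10: 1, 11: 1, 12: 2, 15: 1, 16: 1, 17: 1}
data = sorted(value for value, count in freq.items() for _ in range(count))

mean = statistics.mean(data)
variance = statistics.variance(data)  # sample variance, n - 1 denominator
std = statistics.stdev(data)
cv = std / mean                       # coefficient of variation
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1
mad = statistics.median(abs(x - median) for x in data)  # median absolute deviation

print(mean, round(variance, 5), round(std, 4), round(cv, 4))
print(q1, median, q3, iqr, mad)
```

The recomputed sum (467), mean (4.67), variance (10.48596), quartiles (2, 4, 6), IQR (4) and MAD (2) agree with the tables for this variable.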
popltn_co
Real number (ℝ)
HIGH CORRELATION
  UNIQUE
 
Distinct | 100 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 201865.64 |
Minimum | 9521 |
---|---|
Maximum | 828947 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.0 KiB |
Quantile statistics
Minimum | 9521 |
---|---|
5-th percentile | 27014.75 |
Q1 | 54445 |
median | 176362 |
Q3 | 285624.25 |
95-th percentile | 477370 |
Maximum | 828947 |
Range | 819426 |
Interquartile range (IQR) | 231179.25 |
Descriptive statistics
Standard deviation | 172734.01 |
---|---|
Coefficient of variation (CV) | 0.85568804 |
Kurtosis | 2.2375334 |
Mean | 201865.64 |
Median Absolute Deviation (MAD) | 121450.5 |
Skewness | 1.3500848 |
Sum | 20186564 |
Variance | 2.9837039 × 10^10 |
Monotonicity | Not monotonic |
Common Values
Value | Count | Frequency (%) |
213328 | 1 | 1.0% |
43539 | 1 | 1.0% |
219933 | 1 | 1.0% |
193975 | 1 | 1.0% |
177026 | 1 | 1.0% |
62182 | 1 | 1.0% |
347489 | 1 | 1.0% |
27131 | 1 | 1.0% |
351168 | 1 | 1.0% |
35336 | 1 | 1.0% |
Other values (90) | 90 | 90.0% |
Minimum 10 values
Value | Count | Frequency (%) |
9521 | 1 | 1.0% |
16999 | 1 | 1.0% |
22526 | 1 | 1.0% |
23821 | 1 | 1.0% |
24806 | 1 | 1.0% |
27131 | 1 | 1.0% |
27709 | 1 | 1.0% |
31567 | 1 | 1.0% |
32052 | 1 | 1.0% |
32296 | 1 | 1.0% |
Maximum 10 values
Value | Count | Frequency (%) |
828947 | 1 | 1.0% |
818760 | 1 | 1.0% |
702545 | 1 | 1.0% |
542713 | 1 | 1.0% |
514876 | 1 | 1.0% |
475396 | 1 | 1.0% |
467673 | 1 | 1.0% |
453961 | 1 | 1.0% |
451876 | 1 | 1.0% |
437789 | 1 | 1.0% |
avrg_popltn_co
Real number (ℝ)
HIGH CORRELATION
  UNIQUE
 
Distinct | 100 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 41158.648 |
Minimum | 8402.8 |
---|---|
Maximum | 140991 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 1.0 KiB |
Quantile statistics
Minimum | 8402.8 |
---|---|
5-th percentile | 14551.203 |
Q1 | 25531.75 |
median | 39214.8 |
Q3 | 54263.997 |
95-th percentile | 73411.083 |
Maximum | 140991 |
Range | 132588.2 |
Interquartile range (IQR) | 28732.247 |
Descriptive statistics
Standard deviation | 21442.214 |
---|---|
Coefficient of variation (CV) | 0.52096497 |
Kurtosis | 3.5937672 |
Mean | 41158.648 |
Median Absolute Deviation (MAD) | 15155.865 |
Skewness | 1.2533243 |
Sum | 4115864.8 |
Variance | 4.5976854 × 10^8 |
Monotonicity | Not monotonic |
Common Values
Value | Count | Frequency (%) |
53332.0 | 1 | 1.0% |
21769.5 | 1 | 1.0% |
73311.0 | 1 | 1.0% |
48493.75 | 1 | 1.0% |
59008.67 | 1 | 1.0% |
20727.33 | 1 | 1.0% |
49641.29 | 1 | 1.0% |
27131.0 | 1 | 1.0% |
58528.0 | 1 | 1.0% |
35336.0 | 1 | 1.0% |
Other values (90) | 90 | 90.0% |
Minimum 10 values
Value | Count | Frequency (%) |
8402.8 | 1 | 1.0% |
9521.0 | 1 | 1.0% |
11366.75 | 1 | 1.0% |
12403.0 | 1 | 1.0% |
14377.8 | 1 | 1.0% |
14560.33 | 1 | 1.0% |
15459.0 | 1 | 1.0% |
15575.75 | 1 | 1.0% |
15783.5 | 1 | 1.0% |
16148.0 | 1 | 1.0% |
Maximum 10 values
Value | Count | Frequency (%) |
140991.0 | 1 | 1.0% |
92599.5 | 1 | 1.0% |
87743.0 | 1 | 1.0% |
87550.5 | 1 | 1.0% |
75312.67 | 1 | 1.0% |
73311.0 | 1 | 1.0% |
73307.0 | 1 | 1.0% |
66338.33 | 1 | 1.0% |
65051.5 | 1 | 1.0% |
64479.67 | 1 | 1.0% |
| anals_trget_mt | one_area_nm | two_area_nm | public_lbrry_co | popltn_co | avrg_popltn_co |
---|---|---|---|---|---|---|
anals_trget_mt | 1.000 | 1.000 | 1.000 | 0.110 | 0.000 | 0.000 |
one_area_nm | 1.000 | 1.000 | 1.000 | 0.438 | 0.406 | 0.228 |
two_area_nm | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
public_lbrry_co | 0.110 | 0.438 | 1.000 | 1.000 | 0.876 | 0.000 |
popltn_co | 0.000 | 0.406 | 1.000 | 0.876 | 1.000 | 0.470 |
avrg_popltn_co | 0.000 | 0.228 | 1.000 | 0.000 | 0.470 | 1.000 |
| anals_trget_mt | one_area_nm |
---|---|---|
anals_trget_mt | 1.000 | 0.985 |
one_area_nm | 0.985 | 1.000 |
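The 0.985 between `anals_trget_mt` and `one_area_nm` is consistent with a bias-corrected Cramér's V computed on their contingency table. The sketch below is a standard-library illustration of that statistic, not the profiler's own code. The table is an assumption: it supposes that the three 충청북도 rows are exactly the three month-12 rows, which matches the sample rows shown in this report but cannot be confirmed from the summary tables alone.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(map(sum, table))
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    # Pearson chi-squared statistic, without continuity correction.
    chi2 = sum(
        (obs - row_tot[i] * col_tot[j] / n) ** 2 / (row_tot[i] * col_tot[j] / n)
        for i, row in enumerate(table)
        for j, obs in enumerate(row)
    )
    r, k = len(row_tot), len(col_tot)
    # Bias correction shrinks phi^2 and the effective table dimensions.
    phi2 = max(0.0, chi2 / n - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2 / min(r_corr - 1, k_corr - 1))


# Rows: month "1", month "12". Columns: 경기도, 경상남도, 경상북도, 강원도, 충청북도.
# Assumption: all three 충청북도 rows are the month-12 rows.
table = [[41, 22, 18, 16, 0],
         [0, 0, 0, 0, 3]]
print(round(cramers_v_corrected(table), 3))  # 0.985
```

Under that assumption the association is perfect, and the value falls slightly below 1 only because of the bias correction on a sample of 100 rows.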
| public_lbrry_co | popltn_co | avrg_popltn_co | anals_trget_mt | one_area_nm |
---|---|---|---|---|---|
public_lbrry_co | 1.000 | 0.855 | 0.292 | 0.102 | 0.279 |
popltn_co | 0.855 | 1.000 | 0.716 | 0.000 | 0.243 |
avrg_popltn_co | 0.292 | 0.716 | 1.000 | 0.000 | 0.152 |
anals_trget_mt | 0.102 | 0.000 | 0.000 | 1.000 | 0.985 |
one_area_nm | 0.279 | 0.243 | 0.152 | 0.985 | 1.000 |
First rows
| anals_trget_year | anals_trget_mt | one_area_nm | two_area_nm | public_lbrry_co | popltn_co | avrg_popltn_co |
---|---|---|---|---|---|---|---|
0 | 2020 | 1 | 강원도 | 강릉시 | 4 | 213328 | 53332.0 |
1 | 2020 | 12 | 충청북도 | 청주시 청원구 | 5 | 194373 | 38874.6 |
2 | 2020 | 1 | 강원도 | 동해시 | 3 | 90417 | 30139.0 |
3 | 2020 | 1 | 강원도 | 삼척시 | 3 | 66806 | 22268.67 |
4 | 2020 | 1 | 강원도 | 속초시 | 3 | 81840 | 27280.0 |
5 | 2020 | 1 | 강원도 | 양구군 | 1 | 22526 | 22526.0 |
6 | 2020 | 1 | 강원도 | 양양군 | 1 | 27709 | 27709.0 |
7 | 2020 | 12 | 충청북도 | 청주시 흥덕구 | 5 | 265866 | 53173.2 |
8 | 2020 | 1 | 강원도 | 원주시 | 4 | 350202 | 87550.5 |
9 | 2020 | 1 | 강원도 | 인제군 | 2 | 31567 | 15783.5 |
Last rows
| anals_trget_year | anals_trget_mt | one_area_nm | two_area_nm | public_lbrry_co | popltn_co | avrg_popltn_co |
---|---|---|---|---|---|---|---|
90 | 2020 | 1 | 경상북도 | 상주시 | 2 | 99814 | 49907.0 |
91 | 2020 | 1 | 경상북도 | 성주군 | 2 | 43975 | 21987.5 |
92 | 2020 | 1 | 경상북도 | 안동시 | 6 | 159844 | 26640.67 |
93 | 2020 | 1 | 경상북도 | 영덕군 | 1 | 37233 | 37233.0 |
94 | 2020 | 1 | 경상북도 | 영양군 | 1 | 16999 | 16999.0 |
95 | 2020 | 1 | 경상북도 | 영주시 | 3 | 104985 | 34995.0 |
96 | 2020 | 1 | 경상북도 | 영천시 | 2 | 102163 | 51081.5 |
97 | 2020 | 1 | 경상북도 | 예천군 | 2 | 55192 | 27596.0 |
98 | 2020 | 1 | 경상북도 | 울릉군 | 1 | 9521 | 9521.0 |
99 | 2020 | 1 | 경상북도 | 울진군 | 3 | 49188 | 16396.0 |
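In the sample rows, `avrg_popltn_co` appears to equal `popltn_co` divided by `public_lbrry_co`, rounded to two decimals (presumably the population per public library). That would also explain why `avrg_popltn_co` is flagged as correlated with `popltn_co`. The sketch below checks the relationship on a handful of rows copied from the tables above; the helper name is illustrative, and the relationship is inferred from the sample rather than documented by the data provider.

```python
# (public_lbrry_co, popltn_co, avrg_popltn_co) copied from the sample rows above.
rows = [
    (4, 213328, 53332.0),
    (5, 194373, 38874.6),
    (3, 66806, 22268.67),
    (6, 159844, 26640.67),
    (1, 9521, 9521.0),
]


def population_per_library(libraries: int, population: int) -> float:
    """Population divided by library count, rounded to two decimals."""
    return round(population / libraries, 2)


for libraries, population, reported in rows:
    computed = population_per_library(libraries, population)
    # Allow for rounding in the reported value.
    assert abs(computed - reported) < 0.005, (libraries, population, reported)
```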