Overview

Dataset statistics

Number of variables: 6
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.2 KiB
Average record size in memory: 53.3 B

Variable types

Categorical: 3
Numeric: 3

Alerts

anals_trget_year is highly overall correlated with kdc_nm (High correlation)
kdc_nm is highly overall correlated with anals_trget_year (High correlation)
book_co is highly overall correlated with lon_co (High correlation)
lon_co is highly overall correlated with book_co and 1 other field (High correlation)
rate_value is highly overall correlated with lon_co (High correlation)
anals_trget_year is highly imbalanced (80.6%) (Imbalance)
book_co has unique values (Unique)
lon_co has unique values (Unique)
rate_value has unique values (Unique)
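
The 80.6% imbalance figure for anals_trget_year can be reproduced from the reported counts (97 and 3). The sketch below assumes the score is one minus the Shannon entropy of the class frequencies divided by log2 of the number of classes. The report itself does not state this formula, so treat it as an assumption that happens to agree with the number shown. The function name imbalance_score is hypothetical.

```python
import math


def imbalance_score(counts):
    """Assumed definition: 1 - H(p) / log2(k), where H is Shannon entropy in bits
    and k is the number of classes. 0 means perfectly balanced, 1 means one class only."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# Counts taken from the anals_trget_year value table in this report.
counts = {"2010": 97, "2019": 3}
score = imbalance_score(list(counts.values()))
print(f"{score:.1%}")
```

Under this assumption the score rounds to the 80.6% shown in the alert.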

Reproduction

Analysis started: 2023-12-10 10:18:03.722695
Analysis finished: 2023-12-10 10:18:05.619178
Duration: 1.9 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

anals_trget_year
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

2010: 97
2019: 3

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2010
2nd row: 2019
3rd row: 2010
4th row: 2010
5th row: 2010

Common Values

Value  Count  Frequency (%)
2010   97     97.0%
2019   3      3.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

kdc_nm
Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): 8.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

200: 16
300: 16
400: 16
500: 16
100: 15
Other values (3): 21

Length

Max length: 3
Median length: 3
Mean length: 2.97
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 000
2nd row: 미상 ("unknown")
3rd row: 000
4th row: 000
5th row: 000

Common Values

Value  Count  Frequency (%)
200    16     16.0%
300    16     16.0%
400    16     16.0%
500    16     16.0%
100    15     15.0%
000    14     14.0%
600    4      4.0%
미상   3      3.0%
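
The length statistics for kdc_nm follow directly from the value counts: every code has three characters except 미상, which has two. A minimal sketch that recomputes them from the Common Values table, assuming length is measured in Unicode code points (which is what Python's len does for str):

```python
import statistics

# Value counts for kdc_nm, copied from the Common Values table.
value_counts = {
    "200": 16, "300": 16, "400": 16, "500": 16,
    "100": 15, "000": 14, "600": 4, "미상": 3,
}

total = sum(value_counts.values())

# Expand to one length per observation so the statistics are weighted by count.
lengths = [len(value) for value, n in value_counts.items() for _ in range(n)]

mean_length = sum(lengths) / total
median_length = statistics.median(lengths)
print(total, min(lengths), max(lengths), median_length, round(mean_length, 2))
```

The counts sum to the 100 observations, and the weighted mean is (97 × 3 + 3 × 2) / 100 = 2.97.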

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

area_nm
Categorical

Distinct: 16
Distinct (%): 16.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

강원도: 7
제주특별자치도: 7
경상남도: 7
경상북도: 7
충청남도: 7
Other values (11): 65

Length

Max length: 7
Median length: 5
Mean length: 4.61
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강원도
2nd row: 제주특별자치도
3rd row: 경상남도
4th row: 경상북도
5th row: 광주광역시

Common Values

Value             Count  Frequency (%)
강원도            7      7.0%
제주특별자치도    7      7.0%
경상남도          7      7.0%
경상북도          7      7.0%
충청남도          7      7.0%
광주광역시        6      6.0%
대구광역시        6      6.0%
대전광역시        6      6.0%
세종특별자치시    6      6.0%
울산광역시        6      6.0%
Other values (6)  35     35.0%

Length

[Figure: Histogram of lengths of the category]

book_co
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 211812.14
Minimum: 2491
Maximum: 2549281
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 2491
5-th percentile: 13060.7
Q1: 60505.5
Median: 119790
Q3: 183071
95-th percentile: 661997.1
Maximum: 2549281
Range: 2546790
Interquartile range (IQR): 122565.5

Descriptive statistics

Standard deviation: 336167.36
Coefficient of variation (CV): 1.5871015
Kurtosis: 25.448746
Mean: 211812.14
Median Absolute Deviation (MAD): 61638
Skewness: 4.499044
Sum: 21181214
Variance: 1.1300849 × 10^11
Monotonicity: Not monotonic
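
Several of the book_co figures are derived from others in the same block, which makes them easy to cross-check. The sketch below recomputes range, IQR, CV, variance and mean from the reported values using their standard definitions. It works only on the rounded numbers printed in this report, not on the underlying data, so small rounding differences are expected.

```python
# Values copied from the book_co statistics in this report.
minimum, maximum = 2491, 2549281
q1, q3 = 60505.5, 183071
mean, std = 211812.14, 336167.36
total, n_obs = 21181214, 100

derived = {
    "range": maximum - minimum,    # Maximum - Minimum
    "iqr": q3 - q1,                # Q3 - Q1
    "cv": std / mean,              # Standard deviation / Mean
    "variance": std ** 2,          # Standard deviation squared
    "mean_from_sum": total / n_obs,  # Sum / Number of observations
}

for name, value in derived.items():
    print(f"{name}: {value:.7g}")
```

The recomputed range (2546790) and IQR (122565.5) match the report exactly; CV and variance agree to the precision of the printed standard deviation.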
[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value              Count  Frequency (%)
131593             1      1.0%
182471             1      1.0%
124420             1      1.0%
64206              1      1.0%
6185               1      1.0%
503756             1      1.0%
142236             1      1.0%
205762             1      1.0%
53816              1      1.0%
73529              1      1.0%
Other values (90)  90     90.0%

Minimum 10 values

Value  Count  Frequency (%)
2491   1      1.0%
3961   1      1.0%
5824   1      1.0%
6185   1      1.0%
6481   1      1.0%
13407  1      1.0%
27238  1      1.0%
29939  1      1.0%
31968  1      1.0%
32997  1      1.0%

Maximum 10 values

Value    Count  Frequency (%)
2549281  1      1.0%
1463156  1      1.0%
1114933  1      1.0%
1044755  1      1.0%
1006260  1      1.0%
643878   1      1.0%
600249   1      1.0%
527380   1      1.0%
503756   1      1.0%
427089   1      1.0%

lon_co
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 277399.62
Minimum: 1879
Maximum: 3859151
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1879
5-th percentile: 12971.65
Q1: 63393
Median: 123309.5
Q3: 250345
95-th percentile: 1000155.8
Maximum: 3859151
Range: 3857272
Interquartile range (IQR): 186952

Descriptive statistics

Standard deviation: 528355.9
Coefficient of variation (CV): 1.9046742
Kurtosis: 25.129748
Mean: 277399.62
Median Absolute Deviation (MAD): 77413
Skewness: 4.6242985
Sum: 27739962
Variance: 2.7915995 × 10^11
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value              Count  Frequency (%)
45670              1      1.0%
155476             1      1.0%
274678             1      1.0%
168421             1      1.0%
9108               1      1.0%
1274018            1      1.0%
259709             1      1.0%
393365             1      1.0%
140808             1      1.0%
209962             1      1.0%
Other values (90)  90     90.0%

Minimum 10 values

Value  Count  Frequency (%)
1879   1      1.0%
2905   1      1.0%
3235   1      1.0%
5021   1      1.0%
9108   1      1.0%
13175  1      1.0%
19652  1      1.0%
32706  1      1.0%
35498  1      1.0%
37607  1      1.0%

Maximum 10 values

Value    Count  Frequency (%)
3859151  1      1.0%
2508004  1      1.0%
2173193  1      1.0%
1290228  1      1.0%
1274018  1      1.0%
985742   1      1.0%
919482   1      1.0%
780364   1      1.0%
627264   1      1.0%
546130   1      1.0%

rate_value
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 121.84165
Minimum: 24.07
Maximum: 285.55
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 24.07
5-th percentile: 47.00085
Q1: 80.1185
Median: 116.0475
Q3: 151.936
95-th percentile: 253.09635
Maximum: 285.55
Range: 261.48
Interquartile range (IQR): 71.8175

Descriptive statistics

Standard deviation: 56.350337
Coefficient of variation (CV): 0.4624883
Kurtosis: 0.53442747
Mean: 121.84165
Median Absolute Deviation (MAD): 36.0055
Skewness: 0.82676679
Sum: 12184.165
Variance: 3175.3605
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value              Count  Frequency (%)
34.705             1      1.0%
85.206             1      1.0%
220.767            1      1.0%
262.313            1      1.0%
147.259            1      1.0%
252.904            1      1.0%
182.59             1      1.0%
191.175            1      1.0%
261.647            1      1.0%
285.55             1      1.0%
Other values (90)  90     90.0%

Minimum 10 values

Value   Count  Frequency (%)
24.07   1      1.0%
31.646  1      1.0%
34.705  1      1.0%
44.823  1      1.0%
46.941  1      1.0%
47.004  1      1.0%
51.68   1      1.0%
53.582  1      1.0%
55.358  1      1.0%
55.635  1      1.0%

Maximum 10 values

Value    Count  Frequency (%)
285.55   1      1.0%
264.471  1      1.0%
262.313  1      1.0%
261.647  1      1.0%
256.751  1      1.0%
252.904  1      1.0%
224.947  1      1.0%
220.767  1      1.0%
191.175  1      1.0%
187.243  1      1.0%

Interactions

[Figures: 9 interaction plots]

Correlations

[Figure: correlation heatmap]

                  anals_trget_year  kdc_nm  area_nm  book_co  lon_co  rate_value
anals_trget_year  1.000             1.000   0.000    0.000    0.000   0.367
kdc_nm            1.000             1.000   0.000    0.367    0.000   0.502
area_nm           0.000             0.000   1.000    0.436    0.463   0.595
book_co           0.000             0.367   0.436    1.000    0.981   0.098
lon_co            0.000             0.000   0.463    0.981    1.000   0.516
rate_value        0.367             0.502   0.595    0.098    0.516   1.000

[Figure: correlation heatmap]

                  anals_trget_year  kdc_nm  area_nm
anals_trget_year  1.000             0.969   0.000
kdc_nm            0.969             1.000   0.000
area_nm           0.000             0.000   1.000

[Figure: correlation heatmap]

                  book_co  lon_co  rate_value  anals_trget_year  kdc_nm  area_nm
book_co           1.000    0.839   0.014       0.000             0.206   0.206
lon_co            0.839    1.000   0.500       0.000             0.000   0.221
rate_value        0.014    0.500   1.000       0.269             0.263   0.269
anals_trget_year  0.000    0.000   0.269       1.000             0.969   0.000
kdc_nm            0.206    0.000   0.263       0.969             1.000   0.000
area_nm           0.206    0.221   0.269       0.000             0.000   1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly pick out patterns in data completion.]

Sample

First rows

    anals_trget_year  kdc_nm  area_nm          book_co  lon_co  rate_value
0   2010              000     강원도           131593   45670   34.705
1   2019              미상    제주특별자치도   174714   327140  187.243
2   2010              000     경상남도         105093   122747  116.798
3   2010              000     경상북도         62492    80863   129.397
4   2010              000     광주광역시       33447    35498   106.132
5   2010              000     대구광역시       171159   129626  75.734
6   2010              000     대전광역시       116226   94988   81.727
7   2019              미상    충청남도         223241   162800  72.926
8   2010              000     세종특별자치시   6481     2905    44.823
9   2010              000     울산광역시       40609    46365   114.174

Last rows

    anals_trget_year  kdc_nm  area_nm          book_co  lon_co  rate_value
90  2010              500     인천광역시       131603   154389  117.314
91  2010              500     전라남도         154907   137140  88.531
92  2010              500     전라북도         74560    106030  142.208
93  2010              500     제주특별자치도   90610    51225   56.533
94  2010              500     충청남도         156534   111865  71.464
95  2010              500     충청북도         156149   101341  64.9
96  2010              600     강원도           185217   58614   31.646
97  2010              600     경기도           1006260  919482  91.376
98  2010              600     경상남도         140916   113127  80.28
99  2010              600     경상북도         65765    118445  180.103
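
The report does not define rate_value. In the sample rows, however, it appears to equal lon_co divided by book_co, expressed as a percentage. The sketch below checks that reading against the first ten sample rows; the relationship is an observation from these rows only, not something the report states, and implied_rate is a hypothetical helper name.

```python
# (book_co, lon_co, rate_value) for the first ten sample rows of this report.
rows = [
    (131593, 45670, 34.705),
    (174714, 327140, 187.243),
    (105093, 122747, 116.798),
    (62492, 80863, 129.397),
    (33447, 35498, 106.132),
    (171159, 129626, 75.734),
    (116226, 94988, 81.727),
    (223241, 162800, 72.926),
    (6481, 2905, 44.823),
    (40609, 46365, 114.174),
]


def implied_rate(book_co, lon_co):
    """Assumed relationship: lon_co as a percentage of book_co."""
    return lon_co / book_co * 100


# Largest gap between the assumed formula and the reported rate_value.
max_abs_error = max(abs(implied_rate(b, l) - r) for b, l, r in rows)
print(f"largest deviation: {max_abs_error:.4f}")
```

If this reading holds for the full dataset, rate_value is fully determined by the other two numeric columns, which is worth keeping in mind when interpreting the correlation tables.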