Overview

Dataset statistics

Number of variables: 4
Number of observations: 50
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.8 KiB
Average record size in memory: 37.6 B

Variable types

Categorical: 1
Text: 1
Numeric: 2

Alerts

BASE_YM has constant value "202106" (Constant)
SIGNGU_CD is highly overall correlated with CREDT_LON_AVRG_BLCE_PRICE (High correlation)
CREDT_LON_AVRG_BLCE_PRICE is highly overall correlated with SIGNGU_CD (High correlation)
SIGNGU_CD has unique values (Unique)
CREDT_LON_AVRG_BLCE_PRICE has unique values (Unique)

Reproduction

Analysis started: 2023-12-10 09:57:12.233463
Analysis finished: 2023-12-10 09:57:13.802722
Duration: 1.57 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
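
The reported duration can be checked against the two timestamps above. The sketch below uses only the Python standard library; the variable names are illustrative and are not part of the report.

```python
from datetime import datetime

# Timestamps copied from the "Analysis started" / "Analysis finished" fields.
started = datetime.fromisoformat("2023-12-10 09:57:12.233463")
finished = datetime.fromisoformat("2023-12-10 09:57:13.802722")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```

Rounded to two decimals, the difference agrees with the 1.57 seconds shown in the Duration field.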

Variables

BASE_YM
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 532.0 B

Value   Count
202106  50

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202106
2nd row: 202106
3rd row: 202106
4th row: 202106
5th row: 202106

Common Values

Value   Count  Frequency (%)
202106  50     100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot; 202106 accounts for all 50 rows (100.0%)]

SIGNGU_NM
Text

Distinct: 48
Distinct (%): 96.0%
Missing: 0
Missing (%): 0.0%
Memory size: 532.0 B

Length

Max length: 8
Median length: 3
Mean length: 3.62
Min length: 2

Characters and Unicode

Total characters: 181
Distinct characters: 64
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
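
As a rough illustration of this kind of character analysis, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It uses only the five SIGNGU_NM values listed under Sample in this report, so its totals cover those five values and not the full column. The standard library exposes the general category but not the script or block, so only categories are counted here.

```python
import unicodedata
from collections import Counter

# First five SIGNGU_NM values, copied from the Sample list in this report.
names = ["옥천군", "부안군", "동래구", "포항시 남구", "안산시 단원구"]

# Concatenate the values and count the general category of each character.
text = "".join(names)
categories = Counter(unicodedata.category(ch) for ch in text)

# "Lo" is Other Letter (Hangul syllables); "Zs" is Space Separator.
print(len(text), dict(categories))
```

For these five values the count is 22 characters: 20 in category Lo and 2 in category Zs, the same two categories the report lists for the whole column.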

Unique

Unique: 46
Unique (%): 92.0%

Sample

1st row: 옥천군
2nd row: 부안군
3rd row: 동래구
4th row: 포항시 남구
5th row: 안산시 단원구

Value              Count  Frequency (%)
고양시             3      5.2%
수원시             2      3.4%
중구               2      3.4%
북구               2      3.4%
군포시             1      1.7%
달성군             1      1.7%
장안구             1      1.7%
고성군             1      1.7%
옥천군             1      1.7%
고령군             1      1.7%
Other values (43)  43     74.1%
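
The counts in the table above sum to 58, not 50. That is consistent with counting whitespace-separated words: 50 values plus the 8 space separators the report lists. This is an inference from the numbers, not something the report states. A minimal sketch of such a word count follows, using only the 20 SIGNGU_NM values visible in this report's Sample section, so its totals differ from the full-column table.

```python
from collections import Counter

# The 20 SIGNGU_NM values visible in the Sample section (of 50 rows in total).
names = [
    "옥천군", "부안군", "동래구", "포항시 남구", "안산시 단원구",
    "의성군", "계룡시", "충주시", "목포시", "고양시 일산서구",
    "천안시 동남구", "임실군", "광명시", "고양시 일산동구", "달성군",
    "군포시", "수원시 장안구", "곡성군", "장성군", "동작구",
]

# Split each value on whitespace and count the resulting words.
tokens = [tok for name in names for tok in name.split()]
counts = Counter(tokens)
total = len(tokens)

# Show the most frequent words with their share of all words.
for word, n in counts.most_common(3):
    print(f"{word}\t{n}\t{n / total:.1%}")
```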

Most occurring characters

Value              Count  Frequency (%)
                   21     11.6%
                   20     11.0%
                   18     9.9%
                   8      4.4%
                   7      3.9%
                   6      3.3%
                   6      3.3%
                   5      2.8%
                   5      2.8%
                   5      2.8%
Other values (54)  80     44.2%

Most occurring categories

Value            Count  Frequency (%)
Other Letter     173    95.6%
Space Separator  8      4.4%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
                   21     12.1%
                   20     11.6%
                   18     10.4%
                   7      4.0%
                   6      3.5%
                   6      3.5%
                   5      2.9%
                   5      2.9%
                   5      2.9%
                   5      2.9%
Other values (53)  75     43.4%

Space Separator
Value              Count  Frequency (%)
                   8      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  173    95.6%
Common  8      4.4%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
                   21     12.1%
                   20     11.6%
                   18     10.4%
                   7      4.0%
                   6      3.5%
                   6      3.5%
                   5      2.9%
                   5      2.9%
                   5      2.9%
                   5      2.9%
Other values (53)  75     43.4%

Common
Value              Count  Frequency (%)
                   8      100.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  173    95.6%
ASCII   8      4.4%

Most frequent character per block

Hangul
Value              Count  Frequency (%)
                   21     12.1%
                   20     11.6%
                   18     10.4%
                   7      4.0%
                   6      3.5%
                   6      3.5%
                   5      2.9%
                   5      2.9%
                   5      2.9%
                   5      2.9%
Other values (53)  75     43.4%

ASCII
Value              Count  Frequency (%)
                   8      100.0%

SIGNGU_CD
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 50
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 38794.52
Minimum: 11230
Maximum: 48820
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 582.0 B

Quantile statistics

Minimum: 11230
5th percentile: 18164.5
Q1: 30117.5
Median: 42480
Q3: 45787.5
95th percentile: 47803
Maximum: 48820
Range: 37590
Interquartile range (IQR): 15670

Descriptive statistics

Standard deviation: 10148.417
Coefficient of variation (CV): 0.26159409
Kurtosis: 1.0791186
Mean: 38794.52
Median Absolute Deviation (MAD): 3935
Skewness: -1.3673471
Sum: 1939726
Variance: 1.0299037 × 10^8
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]
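
Several of the SIGNGU_CD statistics above follow from one another by simple arithmetic. The sketch below recomputes the mean, range, IQR, coefficient of variation, and variance from the other reported figures as a consistency check. The variable names are illustrative, and the inputs are the rounded values printed in this report, so small rounding differences are expected.

```python
# Values copied from the SIGNGU_CD statistics in this report.
n = 50
total = 1939726
minimum, maximum = 11230, 48820
q1, q3 = 30117.5, 45787.5
std = 10148.417

mean = total / n                 # Mean = Sum / number of observations
value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # IQR = Q3 - Q1
cv = std / mean                  # CV = standard deviation / mean
variance = std ** 2              # Variance = standard deviation squared

print(mean, value_range, iqr, round(cv, 6), f"{variance:.4e}")
```

Each recomputed value agrees with the corresponding reported figure to the precision shown in the report.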

Common Values

Value              Count  Frequency (%)
43730              1      2.0%
41250              1      2.0%
27230              1      2.0%
47830              1      2.0%
27260              1      2.0%
45720              1      2.0%
46820              1      2.0%
43760              1      2.0%
47920              1      2.0%
42210              1      2.0%
Other values (40)  40     80.0%

Minimum 10 values

Value  Count  Frequency (%)
11230  1      2.0%
11350  1      2.0%
11590  1      2.0%
26200  1      2.0%
26260  1      2.0%
26320  1      2.0%
26350  1      2.0%
27110  1      2.0%
27230  1      2.0%
27260  1      2.0%

Maximum 10 values

Value  Count  Frequency (%)
48820  1      2.0%
47920  1      2.0%
47830  1      2.0%
47770  1      2.0%
47760  1      2.0%
47730  1      2.0%
47250  1      2.0%
47111  1      2.0%
46880  1      2.0%
46820  1      2.0%

CREDT_LON_AVRG_BLCE_PRICE
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 50
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6644.12
Minimum: 4318
Maximum: 12187
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 582.0 B

Quantile statistics

Minimum: 4318
5th percentile: 4758.6
Q1: 5362.25
Median: 6144
Q3: 7639.75
95th percentile: 9725.15
Maximum: 12187
Range: 7869
Interquartile range (IQR): 2277.5

Descriptive statistics

Standard deviation: 1704.2409
Coefficient of variation (CV): 0.25650363
Kurtosis: 1.2974188
Mean: 6644.12
Median Absolute Deviation (MAD): 1128.5
Skewness: 1.1273762
Sum: 332206
Variance: 2904437
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]

Common Values

Value              Count  Frequency (%)
5366               1      2.0%
5935               1      2.0%
6742               1      2.0%
4467               1      2.0%
9974               1      2.0%
5820               1      2.0%
5169               1      2.0%
4737               1      2.0%
5826               1      2.0%
7323               1      2.0%
Other values (40)  40     80.0%

Minimum 10 values

Value  Count  Frequency (%)
4318   1      2.0%
4467   1      2.0%
4737   1      2.0%
4785   1      2.0%
4896   1      2.0%
4977   1      2.0%
4992   1      2.0%
5002   1      2.0%
5019   1      2.0%
5030   1      2.0%

Maximum 10 values

Value  Count  Frequency (%)
12187  1      2.0%
10739  1      2.0%
9974   1      2.0%
9421   1      2.0%
9024   1      2.0%
8965   1      2.0%
8421   1      2.0%
8230   1      2.0%
8063   1      2.0%
7992   1      2.0%

Interactions

[Figures: four interaction plots]

Correlations

[Figure: correlation heatmap]

                           SIGNGU_NM  SIGNGU_CD  CREDT_LON_AVRG_BLCE_PRICE
SIGNGU_NM                  1.000      0.682      0.954
SIGNGU_CD                  0.682      1.000      0.409
CREDT_LON_AVRG_BLCE_PRICE  0.954      0.409      1.000

[Figure: correlation heatmap]

                           SIGNGU_CD  CREDT_LON_AVRG_BLCE_PRICE
SIGNGU_CD                  1.000      -0.544
CREDT_LON_AVRG_BLCE_PRICE  -0.544     1.000
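
The extracted text does not say which coefficient each matrix uses. For orientation only, the sketch below shows how Pearson and Spearman coefficients between the two numeric variables could be computed with the standard library. The helper functions are illustrative, not the profiler's implementation. The input is the 20 rows visible in this report's Sample section, not all 50 observations, so the output is not expected to reproduce the values above.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def ranks(values):
    """Ranks 1..n in ascending order; assumes no ties (true for the data below)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# The 20 rows visible in the Sample section (indices 0-9 and 40-49).
signgu_cd = [43730, 45800, 26260, 47111, 41273, 47730, 44250, 43130, 46110, 41287,
             44131, 45750, 41210, 41285, 27710, 41410, 41111, 46720, 46880, 11590]
price = [5366, 5118, 7837, 6631, 4977, 4318, 10739, 5878, 5931, 9024,
         6385, 5506, 7360, 9421, 8421, 6876, 6899, 5019, 5581, 8063]

print(f"Pearson:  {pearson(signgu_cd, price):.3f}")
print(f"Spearman: {spearman(signgu_cd, price):.3f}")
```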

Missing values

[Figure: nullity count by column]
A simple visualization of nullity by column.
[Figure: nullity matrix]
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

    BASE_YM  SIGNGU_NM        SIGNGU_CD  CREDT_LON_AVRG_BLCE_PRICE
0   202106   옥천군           43730      5366
1   202106   부안군           45800      5118
2   202106   동래구           26260      7837
3   202106   포항시 남구      47111      6631
4   202106   안산시 단원구    41273      4977
5   202106   의성군           47730      4318
6   202106   계룡시           44250      10739
7   202106   충주시           43130      5878
8   202106   목포시           46110      5931
9   202106   고양시 일산서구  41287      9024

Last rows

    BASE_YM  SIGNGU_NM        SIGNGU_CD  CREDT_LON_AVRG_BLCE_PRICE
40  202106   천안시 동남구    44131      6385
41  202106   임실군           45750      5506
42  202106   광명시           41210      7360
43  202106   고양시 일산동구  41285      9421
44  202106   달성군           27710      8421
45  202106   군포시           41410      6876
46  202106   수원시 장안구    41111      6899
47  202106   곡성군           46720      5019
48  202106   장성군           46880      5581
49  202106   동작구           11590      8063