Overview

Dataset statistics

Number of variables: 4
Number of observations: 50
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.8 KiB
Average record size in memory: 37.6 B

Variable types

Categorical: 1
Text: 1
Numeric: 2

Alerts

BASE_YM has constant value "202106" (Constant)
SIGNGU_CD is highly overall correlated with THIRTY_DAY_ABOVE_ARRRG_NMPR_CO (High correlation)
THIRTY_DAY_ABOVE_ARRRG_NMPR_CO is highly overall correlated with SIGNGU_CD (High correlation)
SIGNGU_CD has unique values (Unique)

Reproduction

Analysis started: 2023-12-10 09:53:57.605526
Analysis finished: 2023-12-10 09:53:59.117749
Duration: 1.51 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
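A report like this one can be regenerated with ydata-profiling. The following is a minimal sketch, not the configuration used for this run: the function name `build_report`, the paths `csv_path` and `html_path`, and the report title are hypothetical, and reading BASE_YM as a string is an assumption made so that it is profiled as a category rather than a number.

```python
def build_report(csv_path: str, html_path: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report (illustrative sketch)."""
    # Imported lazily so this module can be loaded without the optional
    # third-party dependencies installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: BASE_YM holds year-month labels such as "202106",
    # so it is read as text instead of an integer.
    df = pd.read_csv(csv_path, dtype={"BASE_YM": str})

    # Build the profile and write it out as a standalone HTML file.
    report = ProfileReport(df, title="Dataset profile")
    report.to_file(html_path)
```

Calling `build_report("data.csv")` with a hypothetical input file would write `report.html` to the working directory.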

Variables

BASE_YM
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 532.0 B
202106: 50

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202106
2nd row: 202106
3rd row: 202106
4th row: 202106
5th row: 202106

Common Values

Value | Count | Frequency (%)
202106 | 50 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
202106 | 50 | 100.0%

SIGNGU_NM
Text

Distinct: 45
Distinct (%): 90.0%
Missing: 0
Missing (%): 0.0%
Memory size: 532.0 B

Length

Max length: 8
Median length: 3
Mean length: 3.46
Min length: 2

Characters and Unicode

Total characters: 173
Distinct characters: 58
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
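As an illustration of such character properties, the general category of each code point can be read with Python's standard `unicodedata` module. This is a sketch over the five sample values listed for this variable, not the profiler's own implementation; the helper name `category_counts` is made up for the example. Hangul syllables fall under `Lo` (Other Letter) and the space under `Zs` (Space Separator), the same two categories the report lists.

```python
import unicodedata
from collections import Counter

# The five sample values shown for this variable in the report.
samples = ["진안군", "포항시 북구", "북구", "영양군", "고령군"]


def category_counts(values):
    """Count characters by Unicode general category (for example 'Lo', 'Zs')."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


counts = category_counts(samples)
print(counts)
```

The standard library exposes the general category only; script and block lookups need additional data and are not covered here.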

Unique

Unique: 41
Unique (%): 82.0%

Sample

1st row: 진안군
2nd row: 포항시 북구
3rd row: 북구
4th row: 영양군
5th row: 고령군

Value | Count | Frequency (%)
중구 | 3 | 5.2%
북구 | 3 | 5.2%
남구 | 2 | 3.4%
동구 | 2 | 3.4%
고양시 | 2 | 3.4%
수원시 | 2 | 3.4%
보령시 | 1 | 1.7%
인제군 | 1 | 1.7%
진안군 | 1 | 1.7%
완도군 | 1 | 1.7%
Other values (40) | 40 | 69.0%
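The percentages in the table above do not use the 50 observations as their denominator. The counts sum to 58, which would match whitespace-separated words, since the 8 space characters reported for this variable split 8 of the 50 values into two words each. That reading is an inference from the numbers, not something the report states. A short check, with the hypothetical helper `frequency_pct`:

```python
# Counts copied from the table above; "Other values (40)" is kept as one bucket.
counts = {
    "중구": 3, "북구": 3, "남구": 2, "동구": 2, "고양시": 2, "수원시": 2,
    "보령시": 1, "인제군": 1, "진안군": 1, "완도군": 1,
    "Other values (40)": 40,
}

# Denominator: the total number of counted items, not the number of rows.
total = sum(counts.values())


def frequency_pct(count, total):
    """Share of the total, as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


percentages = {value: frequency_pct(n, total) for value, n in counts.items()}
print(total, percentages)
```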

Most occurring characters

Value | Count | Frequency (%)
 | 27 | 15.6%
 | 18 | 10.4%
 | 14 | 8.1%
 | 8 | 4.6%
 | 7 | 4.0%
 | 5 | 2.9%
 | 4 | 2.3%
 | 4 | 2.3%
 | 4 | 2.3%
 | 4 | 2.3%
Other values (48) | 78 | 45.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 165 | 95.4%
Space Separator | 8 | 4.6%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 27 | 16.4%
 | 18 | 10.9%
 | 14 | 8.5%
 | 7 | 4.2%
 | 5 | 3.0%
 | 4 | 2.4%
 | 4 | 2.4%
 | 4 | 2.4%
 | 4 | 2.4%
 | 4 | 2.4%
Other values (47) | 74 | 44.8%

Space Separator
Value | Count | Frequency (%)
 | 8 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 165 | 95.4%
Common | 8 | 4.6%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 27 | 16.4%
 | 18 | 10.9%
 | 14 | 8.5%
 | 7 | 4.2%
 | 5 | 3.0%
 | 4 | 2.4%
 | 4 | 2.4%
 | 4 | 2.4%
 | 4 | 2.4%
 | 4 | 2.4%
Other values (47) | 74 | 44.8%

Common
Value | Count | Frequency (%)
 | 8 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 165 | 95.4%
ASCII | 8 | 4.6%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 27 | 16.4%
 | 18 | 10.9%
 | 14 | 8.5%
 | 7 | 4.2%
 | 5 | 3.0%
 | 4 | 2.4%
 | 4 | 2.4%
 | 4 | 2.4%
 | 4 | 2.4%
 | 4 | 2.4%
Other values (47) | 74 | 44.8%

ASCII
Value | Count | Frequency (%)
 | 8 | 100.0%

SIGNGU_CD
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 50
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 36366.82
Minimum: 11215
Maximum: 48740
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 582.0 B

Quantile statistics

Minimum: 11215
5th percentile: 11300.5
Q1: 27828.75
Median: 41284
Q3: 45717.5
95th percentile: 47765.5
Maximum: 48740
Range: 37525
Interquartile range (IQR): 17888.75

Descriptive statistics

Standard deviation: 11522.14
Coefficient of variation (CV): 0.31683112
Kurtosis: -0.20031943
Mean: 36366.82
Median Absolute Deviation (MAD): 6441
Skewness: -0.91769709
Sum: 1818341
Variance: 1.3275972 × 10^8
Monotonicity: Not monotonic
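Several of these figures follow from others in the report, so they can be cross-checked with basic arithmetic. The sketch below uses only the rounded values printed for SIGNGU_CD, so the derived coefficient of variation and variance agree with the reported ones only up to rounding. The variable names are chosen for the example.

```python
# Values copied from the SIGNGU_CD statistics in this report.
n = 50
total = 1818341
minimum, maximum = 11215, 48740
q1, q3 = 27828.75, 45717.5
std = 11522.14

mean = total / n                 # reported Mean: 36366.82
value_range = maximum - minimum  # reported Range: 37525
iqr = q3 - q1                    # reported IQR: 17888.75
cv = std / mean                  # reported CV: 0.31683112
variance = std ** 2              # reported Variance: about 1.3276e8

print(mean, value_range, iqr, round(cv, 6), f"{variance:.4e}")
```

The squared standard deviation is on the order of 10^8, the magnitude given for the variance above.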
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
45720 | 1 | 2.0%
26200 | 1 | 2.0%
27110 | 1 | 2.0%
41131 | 1 | 2.0%
11215 | 1 | 2.0%
11260 | 1 | 2.0%
41115 | 1 | 2.0%
30110 | 1 | 2.0%
44180 | 1 | 2.0%
46780 | 1 | 2.0%
Other values (40) | 40 | 80.0%

Minimum 10 values

Value | Count | Frequency (%)
11215 | 1 | 2.0%
11230 | 1 | 2.0%
11260 | 1 | 2.0%
11350 | 1 | 2.0%
11500 | 1 | 2.0%
26110 | 1 | 2.0%
26200 | 1 | 2.0%
26290 | 1 | 2.0%
26320 | 1 | 2.0%
26410 | 1 | 2.0%

Maximum 10 values

Value | Count | Frequency (%)
48740 | 1 | 2.0%
47830 | 1 | 2.0%
47770 | 1 | 2.0%
47760 | 1 | 2.0%
47730 | 1 | 2.0%
47720 | 1 | 2.0%
47250 | 1 | 2.0%
47113 | 1 | 2.0%
46910 | 1 | 2.0%
46890 | 1 | 2.0%

THIRTY_DAY_ABOVE_ARRRG_NMPR_CO
Real number (ℝ)

HIGH CORRELATION 

Distinct: 49
Distinct (%): 98.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1327.58
Minimum: 75
Maximum: 3354
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 582.0 B

Quantile statistics

Minimum: 75
5th percentile: 173.85
Q1: 365
Median: 1518
Q3: 2039.75
95th percentile: 2962.65
Maximum: 3354
Range: 3279
Interquartile range (IQR): 1674.75

Descriptive statistics

Standard deviation: 970.42492
Coefficient of variation (CV): 0.73097284
Kurtosis: -1.1052517
Mean: 1327.58
Median Absolute Deviation (MAD): 932.5
Skewness: 0.30460822
Sum: 66379
Variance: 941724.53
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=49)

Common Values

Value | Count | Frequency (%)
1809 | 2 | 4.0%
168 | 1 | 2.0%
1043 | 1 | 2.0%
2090 | 1 | 2.0%
610 | 1 | 2.0%
2128 | 1 | 2.0%
2884 | 1 | 2.0%
1653 | 1 | 2.0%
746 | 1 | 2.0%
250 | 1 | 2.0%
Other values (39) | 39 | 78.0%

Minimum 10 values

Value | Count | Frequency (%)
75 | 1 | 2.0%
102 | 1 | 2.0%
168 | 1 | 2.0%
181 | 1 | 2.0%
186 | 1 | 2.0%
197 | 1 | 2.0%
206 | 1 | 2.0%
232 | 1 | 2.0%
236 | 1 | 2.0%
250 | 1 | 2.0%

Maximum 10 values

Value | Count | Frequency (%)
3354 | 1 | 2.0%
3171 | 1 | 2.0%
3027 | 1 | 2.0%
2884 | 1 | 2.0%
2809 | 1 | 2.0%
2666 | 1 | 2.0%
2475 | 1 | 2.0%
2366 | 1 | 2.0%
2289 | 1 | 2.0%
2136 | 1 | 2.0%

Interactions


Correlations

 | SIGNGU_NM | SIGNGU_CD | THIRTY_DAY_ABOVE_ARRRG_NMPR_CO
SIGNGU_NM | 1.000 | 0.722 | 0.812
SIGNGU_CD | 0.722 | 1.000 | 0.525
THIRTY_DAY_ABOVE_ARRRG_NMPR_CO | 0.812 | 0.525 | 1.000

 | SIGNGU_CD | THIRTY_DAY_ABOVE_ARRRG_NMPR_CO
SIGNGU_CD | 1.000 | -0.666
THIRTY_DAY_ABOVE_ARRRG_NMPR_CO | -0.666 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

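First rows

 | BASE_YM | SIGNGU_NM | SIGNGU_CD | THIRTY_DAY_ABOVE_ARRRG_NMPR_CO
0 | 202106 | 진안군 | 45720 | 168
1 | 202106 | 포항시 북구 | 47113 | 2115
2 | 202106 | 북구 | 27230 | 3027
3 | 202106 | 영양군 | 47760 | 75
4 | 202106 | 고령군 | 47830 | 236
5 | 202106 | 동대문구 | 11230 | 2136
6 | 202106 | 광산구 | 29200 | 3171
7 | 202106 | 영덕군 | 47770 | 326
8 | 202106 | 상주시 | 47250 | 480
9 | 202106 | 목포시 | 46110 | 2289

Last rows

 | BASE_YM | SIGNGU_NM | SIGNGU_CD | THIRTY_DAY_ABOVE_ARRRG_NMPR_CO
40 | 202106 | 의성군 | 47730 | 259
41 | 202106 | 수원시 영통구 | 41117 | 1401
42 | 202106 | 양양군 | 42830 | 186
43 | 202106 | 달성군 | 27710 | 1728
44 | 202106 | 동구 | 31170 | 1659
45 | 202106 | 금정구 | 26410 | 1558
46 | 202106 | 인제군 | 42810 | 181
47 | 202106 | 중구 | 26110 | 503
48 | 202106 | 중구 | 30140 | 1821
49 | 202106 | 고양시 덕양구 | 41281 | 2666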