Overview

Dataset statistics

Number of variables: 7
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.0 KiB
Average record size in memory: 61.3 B

Variable types

Categorical: 3
Text: 1
Numeric: 3

Alerts

FILE_NAME has constant value "KC_623_CLT_SALE_BOOK_STR_MAP_2019" (Constant)
base_ymd has constant value "20200214" (Constant)
hadm_cd is highly overall correlated with book_str_cnt and 2 other fields (High correlation)
book_str_cnt is highly overall correlated with hadm_cd and 1 other field (High correlation)
residnt_cnt_sum is highly overall correlated with hadm_cd and 1 other field (High correlation)
sido_nm is highly overall correlated with hadm_cd (High correlation)
hadm_cd has unique values (Unique)
residnt_cnt_sum has unique values (Unique)
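
The Constant and Unique alerts can be illustrated with a distinct-value count per column. The following is a minimal sketch, assuming a column is Constant when it holds one distinct value and Unique when every value is distinct. The rule and the helper names (`classify`, `column_alerts`) are hypothetical illustrations, not ydata-profiling's implementation, and the data is only the first ten rows of the Sample section.

```python
FILE_NAME = "KC_623_CLT_SALE_BOOK_STR_MAP_2019"
BASE_YMD = "20200214"

# (hadm_cd, book_str_cnt, residnt_cnt_sum) for rows 0-9 of the Sample section.
SAMPLE = [
    (42150, 52, 162905),
    (42820, 2, 18628),
    (42170, 9, 68772),
    (42230, 5, 45409),
    (42210, 7, 63203),
    (42800, 3, 14731),
    (42830, 1, 19462),
    (42750, 3, 25832),
    (42130, 55, 267203),
    (42810, 5, 20917),
]

COLUMNS = ["hadm_cd", "book_str_cnt", "residnt_cnt_sum", "FILE_NAME", "base_ymd"]
ROWS = [row + (FILE_NAME, BASE_YMD) for row in SAMPLE]


def classify(values):
    """Return 'Constant', 'Unique', or None for one column (assumed rule)."""
    distinct = len(set(values))
    if distinct == 1:
        return "Constant"
    if distinct == len(values):
        return "Unique"
    return None


def column_alerts(columns, rows):
    """Map each column name to its alert label (or None)."""
    return {name: classify([row[i] for row in rows]) for i, name in enumerate(columns)}


ALERTS = column_alerts(COLUMNS, ROWS)
for name, label in ALERTS.items():
    print(f"{name}: {label}")
```

On these ten rows the labels agree with the report's alerts for the five columns shown; book_str_cnt repeats the values 3 and 5, so it receives neither label.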

Reproduction

Analysis started: 2023-12-10 09:52:41.004196
Analysis finished: 2023-12-10 09:52:43.746755
Duration: 2.74 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

sido_nm
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

경기도: 42
경상남도: 22
강원도: 18
경상북도: 18

Length

Max length: 4
Median length: 3
Mean length: 3.4
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강원도
2nd row: 강원도
3rd row: 강원도
4th row: 강원도
5th row: 강원도

Common Values

Value  Count  Frequency (%)
경기도  42  42.0%
경상남도  22  22.0%
강원도  18  18.0%
경상북도  18  18.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

sgg_nm
Text

Distinct: 99
Distinct (%): 99.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 9
Median length: 3
Mean length: 3.97
Min length: 3

Characters and Unicode

Total characters: 397
Distinct characters: 93
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 98
Unique (%): 98.0%

Sample

1st row: 강릉시
2nd row: 고성군
3rd row: 동해시
4th row: 삼척시
5th row: 속초시

Value  Count  Frequency (%)
창원시  5  4.1%
수원시  4  3.3%
고양시  3  2.5%
성남시  3  2.5%
용인시  3  2.5%
안양시  2  1.6%
안산시  2  1.6%
고성군  2  1.6%
영덕군  1  0.8%
안동시  1  0.8%
Other values (96)  96  78.7%
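
The percentages in this frequency table do not equal count / 100 observations. They are consistent with count / 122, where 122 is the sum of all listed counts and also equals the 100 rows plus the 22 space separators reported in the character statistics. That suggests the table counts whitespace-separated words rather than whole cell values, which is an inference from the numbers and not something the report states. A short arithmetic check, using only the counts shown above:

```python
# Counts copied from the sgg_nm frequency table in the report.
TOP_COUNTS = {
    "창원시": 5, "수원시": 4, "고양시": 3, "성남시": 3, "용인시": 3,
    "안양시": 2, "안산시": 2, "고성군": 2, "영덕군": 1, "안동시": 1,
}
OTHER_COUNT = 96  # the "Other values (96)" row

# Denominator implied by the table: every counted word, not the 100 rows.
TOTAL_TOKENS = sum(TOP_COUNTS.values()) + OTHER_COUNT


def frequency_pct(count, total=TOTAL_TOKENS):
    """Percentage of all counted words, rounded to one decimal like the report."""
    return round(100 * count / total, 1)


PERCENTAGES = {word: frequency_pct(n) for word, n in TOP_COUNTS.items()}

print("total words:", TOTAL_TOKENS)
for word, pct in PERCENTAGES.items():
    print(f"{word}: {pct}%")
print(f"other: {frequency_pct(OTHER_COUNT)}%")
```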

Most occurring characters

Value  Count  Frequency (%)
[?]  68  17.1%
[?]  35  8.8%
[?]  25  6.3%
(space)  22  5.5%
[?]  16  4.0%
[?]  14  3.5%
[?]  14  3.5%
[?]  11  2.8%
[?]  11  2.8%
[?]  10  2.5%
Other values (83)  171  43.1%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  375  94.5%
Space Separator  22  5.5%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
[?]  68  18.1%
[?]  35  9.3%
[?]  25  6.7%
[?]  16  4.3%
[?]  14  3.7%
[?]  14  3.7%
[?]  11  2.9%
[?]  11  2.9%
[?]  10  2.7%
[?]  10  2.7%
Other values (82)  161  42.9%
Space Separator
Value  Count  Frequency (%)
(space)  22  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  375  94.5%
Common  22  5.5%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
[?]  68  18.1%
[?]  35  9.3%
[?]  25  6.7%
[?]  16  4.3%
[?]  14  3.7%
[?]  14  3.7%
[?]  11  2.9%
[?]  11  2.9%
[?]  10  2.7%
[?]  10  2.7%
Other values (82)  161  42.9%
Common
Value  Count  Frequency (%)
(space)  22  100.0%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  375  94.5%
ASCII  22  5.5%

Most frequent character per block

Hangul
Value  Count  Frequency (%)
[?]  68  18.1%
[?]  35  9.3%
[?]  25  6.7%
[?]  16  4.3%
[?]  14  3.7%
[?]  14  3.7%
[?]  11  2.9%
[?]  11  2.9%
[?]  10  2.7%
[?]  10  2.7%
Other values (82)  161  42.9%
ASCII
Value  Count  Frequency (%)
(space)  22  100.0%

hadm_cd
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 44255.9
Minimum: 41111
Maximum: 48890
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 41111
5-th percentile: 41132.9
Q1: 41425
median: 42725
Q3: 47905
95-th percentile: 48840.5
Maximum: 48890
Range: 7779
Interquartile range (IQR): 6480

Descriptive statistics

Standard deviation: 3170.1308
Coefficient of variation (CV): 0.071631822
Kurtosis: -1.7528404
Mean: 44255.9
Median Absolute Deviation (MAD): 1543.5
Skewness: 0.39435263
Sum: 4425590
Variance: 10049729
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value  Count  Frequency (%)
42150  1  1.0%
48840  1  1.0%
48123  1  1.0%
48127  1  1.0%
48125  1  1.0%
48740  1  1.0%
48170  1  1.0%
48720  1  1.0%
48330  1  1.0%
48860  1  1.0%
Other values (90)  90  90.0%

Minimum 10 values

Value  Count  Frequency (%)
41111  1  1.0%
41113  1  1.0%
41115  1  1.0%
41117  1  1.0%
41131  1  1.0%
41133  1  1.0%
41135  1  1.0%
41150  1  1.0%
41171  1  1.0%
41173  1  1.0%

Maximum 10 values

Value  Count  Frequency (%)
48890  1  1.0%
48880  1  1.0%
48870  1  1.0%
48860  1  1.0%
48850  1  1.0%
48840  1  1.0%
48820  1  1.0%
48740  1  1.0%
48730  1  1.0%
48720  1  1.0%

book_str_cnt
Real number (ℝ)

HIGH CORRELATION 

Distinct: 52
Distinct (%): 52.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26.17
Minimum: 0
Maximum: 107
Zeros: 1
Zeros (%): 1.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 5
median: 19
Q3: 45.25
95-th percentile: 66.1
Maximum: 107
Range: 107
Interquartile range (IQR): 40.25

Descriptive statistics

Standard deviation: 24.164443
Coefficient of variation (CV): 0.92336427
Kurtosis: 0.14284925
Mean: 26.17
Median Absolute Deviation (MAD): 16.5
Skewness: 0.86324801
Sum: 2617
Variance: 583.9203
Monotonicity: Not monotonic
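
Several of the book_str_cnt statistics follow from one another, so they can be cross-checked with plain arithmetic. The sketch below assumes the usual definitions (mean = sum / n, range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared) and uses only figures printed in this section. The variable names are illustrative.

```python
import math

# Values copied from the book_str_cnt section of the report.
N = 100
TOTAL = 2617
MINIMUM, MAXIMUM = 0, 107
Q1, Q3 = 5, 45.25
STD = 24.164443

mean = TOTAL / N               # reported Mean: 26.17
value_range = MAXIMUM - MINIMUM  # reported Range: 107
iqr = Q3 - Q1                  # reported IQR: 40.25
cv = STD / mean                # reported CV: 0.92336427
variance = STD ** 2            # reported Variance: 583.9203

# Compare each derived value with the figure printed in the report.
checks = {
    "mean": math.isclose(mean, 26.17),
    "range": value_range == 107,
    "iqr": iqr == 40.25,
    "cv": math.isclose(cv, 0.92336427, abs_tol=1e-6),
    "variance": math.isclose(variance, 583.9203, abs_tol=1e-3),
}
for name, ok in checks.items():
    print(f"{name}: {'consistent' if ok else 'mismatch'}")
```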
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value  Count  Frequency (%)
1  8  8.0%
3  6  6.0%
46  5  5.0%
5  5  5.0%
2  5  5.0%
8  4  4.0%
7  4  4.0%
4  4  4.0%
9  3  3.0%
50  3  3.0%
Other values (42)  53  53.0%

Minimum 10 values

Value  Count  Frequency (%)
0  1  1.0%
1  8  8.0%
2  5  5.0%
3  6  6.0%
4  4  4.0%
5  5  5.0%
6  1  1.0%
7  4  4.0%
8  4  4.0%
9  3  3.0%

Maximum 10 values

Value  Count  Frequency (%)
107  1  1.0%
85  1  1.0%
83  1  1.0%
81  1  1.0%
68  1  1.0%
66  1  1.0%
63  2  2.0%
59  1  1.0%
58  1  1.0%
56  1  1.0%

residnt_cnt_sum
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 161769.75
Minimum: 5198
Maximum: 730558
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 5198
5-th percentile: 17990.55
Q1: 35464.5
median: 132815.5
Q3: 246408.75
95-th percentile: 417416.75
Maximum: 730558
Range: 725360
Interquartile range (IQR): 210944.25

Descriptive statistics

Standard deviation: 145624.98
Coefficient of variation (CV): 0.90019905
Kurtosis: 1.8646798
Mean: 161769.75
Median Absolute Deviation (MAD): 100570
Skewness: 1.2584857
Sum: 16176975
Variance: 2.1206633 × 10^10
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value  Count  Frequency (%)
162905  1  1.0%
30561  1  1.0%
191489  1  1.0%
162616  1  1.0%
144573  1  1.0%
45069  1  1.0%
280079  1  1.0%
18082  1  1.0%
269055  1  1.0%
24133  1  1.0%
Other values (90)  90  90.0%

Minimum 10 values

Value  Count  Frequency (%)
5198  1  1.0%
11314  1  1.0%
14731  1  1.0%
15570  1  1.0%
16253  1  1.0%
18082  1  1.0%
18628  1  1.0%
19462  1  1.0%
20388  1  1.0%
20917  1  1.0%

Maximum 10 values

Value  Count  Frequency (%)
730558  1  1.0%
618926  1  1.0%
521159  1  1.0%
447439  1  1.0%
423150  1  1.0%
417115  1  1.0%
414376  1  1.0%
364214  1  1.0%
352389  1  1.0%
351337  1  1.0%

FILE_NAME
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

KC_623_CLT_SALE_BOOK_STR_MAP_2019: 100

Length

Max length: 33
Median length: 33
Mean length: 33
Min length: 33

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: KC_623_CLT_SALE_BOOK_STR_MAP_2019
2nd row: KC_623_CLT_SALE_BOOK_STR_MAP_2019
3rd row: KC_623_CLT_SALE_BOOK_STR_MAP_2019
4th row: KC_623_CLT_SALE_BOOK_STR_MAP_2019
5th row: KC_623_CLT_SALE_BOOK_STR_MAP_2019

Common Values

Value  Count  Frequency (%)
KC_623_CLT_SALE_BOOK_STR_MAP_2019  100  100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

base_ymd
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

20200214: 100

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20200214
2nd row: 20200214
3rd row: 20200214
4th row: 20200214
5th row: 20200214

Common Values

Value  Count  Frequency (%)
20200214  100  100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Interactions

[Figures: nine interaction plots]

Correlations

[Figure: correlation heatmap]

                 sido_nm  sgg_nm  hadm_cd  book_str_cnt  residnt_cnt_sum
sido_nm          1.000    0.675   1.000    0.406         0.487
sgg_nm           0.675    1.000   0.862    1.000         1.000
hadm_cd          1.000    0.862   1.000    0.449         0.400
book_str_cnt     0.406    1.000   0.449    1.000         0.835
residnt_cnt_sum  0.487    1.000   0.400    0.835         1.000

[Figure: correlation heatmap]

                 hadm_cd  book_str_cnt  residnt_cnt_sum  sido_nm
hadm_cd          1.000    -0.545        -0.567           0.990
book_str_cnt     -0.545   1.000         0.928            0.262
residnt_cnt_sum  -0.567   0.928         1.000            0.304
sido_nm          0.990    0.262         0.304            1.000
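
The report does not say which coefficient each matrix uses. As an illustration of how a rank-based coefficient between two numeric columns can be computed, here is a minimal sketch of Spearman's rho (Pearson correlation of tie-averaged ranks) for book_str_cnt and residnt_cnt_sum. Spearman is an assumption chosen for the example, the helper names are hypothetical, and the input is only rows 0-9 of the Sample section, so the result is not expected to reproduce the coefficients computed on all 100 observations.

```python
import math

# (book_str_cnt, residnt_cnt_sum) for rows 0-9 of the report's Sample section.
PAIRS = [
    (52, 162905), (2, 18628), (9, 68772), (5, 45409), (7, 63203),
    (3, 14731), (1, 19462), (3, 25832), (55, 267203), (5, 20917),
]


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions start..end
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


RHO = spearman([b for b, _ in PAIRS], [r for _, r in PAIRS])
print(f"Spearman rho on 10 sample rows: {RHO:.3f}")
```

Averaging the ranks of tied values matters here because book_str_cnt repeats the values 3 and 5 within these ten rows.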

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

    sido_nm  sgg_nm  hadm_cd  book_str_cnt  residnt_cnt_sum  FILE_NAME  base_ymd
0   강원도  강릉시  42150  52  162905  KC_623_CLT_SALE_BOOK_STR_MAP_2019  20200214
1   강원도  고성군  42820  2  18628  KC_623_CLT_SALE_BOOK_STR_MAP_2019  20200214
2   강원도  동해시  42170  9  68772  KC_623_CLT_SALE_BOOK_STR_MAP_2019  20200214
3   강원도  삼척시  42230  5  45409  KC_623_CLT_SALE_BOOK_STR_MAP_2019  20200214
4   강원도  속초시  42210  7  63203  KC_623_CLT_SALE_BOOK_STR_MAP_2019  20200214
5   강원도  양구군  42800  3  14731  KC_623_CLT_SALE_BOOK_STR_MAP_2019  20200214
6   강원도  양양군  42830  1  19462  KC_623_CLT_SALE_BOOK_STR_MAP_2019  20200214
7   강원도  영월군  42750  3  25832  KC_623_CLT_SALE_BOOK_STR_MAP_2019  20200214
8   강원도  원주시  42130  55  267203  KC_623_CLT_SALE_BOOK_STR_MAP_2019  20200214
9   강원도  인제군  42810  5  20917  KC_623_CLT_SALE_BOOK_STR_MAP_2019  20200214

Last rows

    sido_nm  sgg_nm  hadm_cd  book_str_cnt  residnt_cnt_sum  FILE_NAME  base_ymd
90  경상북도  상주시  47250  10  68854  KC_623_CLT_SALE_BOOK_STR_MAP_2019  20200214
91  경상북도  성주군  47840  1  32010  KC_623_CLT_SALE_BOOK_STR_MAP_2019  20200214
92  경상북도  안동시  47170  32  119201  KC_623_CLT_SALE_BOOK_STR_MAP_2019  20200214
93  경상북도  영덕군  47770  2  25962  KC_623_CLT_SALE_BOOK_STR_MAP_2019  20200214
94  경상북도  영양군  47760  1  11314  KC_623_CLT_SALE_BOOK_STR_MAP_2019  20200214
95  경상북도  영주시  47210  23  79658  KC_623_CLT_SALE_BOOK_STR_MAP_2019  20200214
96  경상북도  영천시  47230  12  73702  KC_623_CLT_SALE_BOOK_STR_MAP_2019  20200214
97  경상북도  예천군  47900  7  36291  KC_623_CLT_SALE_BOOK_STR_MAP_2019  20200214
98  경상북도  울릉군  47940  0  5198  KC_623_CLT_SALE_BOOK_STR_MAP_2019  20200214
99  경상북도  울진군  47930  9  36399  KC_623_CLT_SALE_BOOK_STR_MAP_2019  20200214