Overview

Dataset statistics

Number of variables: 9
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.6 KiB
Average record size in memory: 78.3 B

Variable types

Categorical: 6
Text: 1
Numeric: 2

Alerts

sccnt_ym has a constant value [Constant]
file_name has a constant value [Constant]
base_ymd has a constant value [Constant]
ctprvn_nm is highly overall correlated with adstrd_cd and 1 other field [High correlation]
mlsfc is highly overall correlated with ctprvn_nm and 1 other field [High correlation]
adstrd_cd is highly overall correlated with ctprvn_nm [High correlation]
sccnt is highly overall correlated with mlsfc [High correlation]
mlsfc is highly imbalanced (80.6%) [Imbalance]
sccnt is highly imbalanced (50.8%) [Imbalance]
sgnr_nm has unique values [Unique]
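
The two imbalance percentages in the alerts are consistent with one minus the normalized Shannon entropy of the category counts. The sketch below illustrates that reading; it is an assumption about how the score could be derived, not a statement of the ydata-profiling implementation. The function name `imbalance_score` is hypothetical, and the counts are the ones this report lists for mlsfc and sccnt.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k): H is the Shannon entropy (in bits) of the
    class proportions and k is the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class gets a score of 0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Category counts copied from the report.
mlsfc_counts = [97, 3]
sccnt_counts = [54, 42, 1, 1, 1, 1]

print(f"mlsfc: {imbalance_score(mlsfc_counts):.1%}")
print(f"sccnt: {imbalance_score(sccnt_counts):.1%}")
```

With these counts the sketch gives values that round to the 80.6% and 50.8% shown in the alerts.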

Reproduction

Analysis started: 2023-12-10 10:09:33.131896
Analysis finished: 2023-12-10 10:09:35.275012
Duration: 2.14 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
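
As a quick consistency check, the reported duration can be recomputed from the two timestamps. This is a minimal sketch using only the Python standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-10 10:09:33.131896")
finished = datetime.fromisoformat("2023-12-10 10:09:35.275012")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```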

Variables

ctprvn_nm
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 3
Mean length: 3.4
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 강원도
2nd row: 충청북도
3rd row: 강원도
4th row: 강원도
5th row: 강원도

Common Values

Value  Count  Frequency (%)
경기도  44  44.0%
경상남도  22  22.0%
강원도  16  16.0%
경상북도  15  15.0%
충청북도  3  3.0%
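
The length statistics for ctprvn_nm can be cross-checked from its category counts. The sketch below is one way to do that with the standard library; the dictionary is copied from the Common Values table, and the variable names are illustrative.

```python
import statistics
import unicodedata

# Category counts for ctprvn_nm as listed in the report.
value_counts = {"경기도": 44, "경상남도": 22, "강원도": 16, "경상북도": 15, "충청북도": 3}

# One length per observation; NFC keeps each Hangul syllable as a single code point.
lengths = [
    len(unicodedata.normalize("NFC", value))
    for value, count in value_counts.items()
    for _ in range(count)
]

length_summary = {
    "max": max(lengths),
    "median": statistics.median(lengths),
    "mean": statistics.mean(lengths),
    "min": min(lengths),
}
print(length_summary)
```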

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

sgnr_nm
Text

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 9
Median length: 3
Mean length: 4.05
Min length: 3

Characters and Unicode

Total characters: 405
Distinct characters: 91
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
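
As an illustration of such character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not how the report itself was produced. The sample strings are taken from the sgnr_nm sample rows, and joining them with spaces is only a device to make a second category appear. In this notation "Lo" is Other Letter and "Zs" is Space Separator.

```python
import unicodedata
from collections import Counter

# Sample sgnr_nm values copied from the report; the spaces are added here
# purely for illustration.
samples = ["강릉시", "진천군", "동해시", "삼척시", "속초시"]
text = unicodedata.normalize("NFC", " ".join(samples))

# General category of every code point in the text.
category_counts = Counter(unicodedata.category(ch) for ch in text)

for category, count in category_counts.most_common():
    print(f"{category}: {count} ({count / len(text):.1%})")
```

The standard library exposes general categories and character names but not script or block properties, so those two breakdowns would need an additional data source.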

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: 강릉시
2nd row: 진천군
3rd row: 동해시
4th row: 삼척시
5th row: 속초시
Value  Count  Frequency (%)
창원시  5  4.0%
수원시  4  3.2%
용인시  3  2.4%
부천시  3  2.4%
성남시  3  2.4%
고양시  3  2.4%
안양시  2  1.6%
안산시  2  1.6%
양산시  1  0.8%
산청군  1  0.8%
Other values (97)  97  78.2%

Most occurring characters

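Value  Count  Frequency (%)
(not shown)  72  17.8%
(not shown)  31  7.7%
(not shown)  27  6.7%
(space)  24  5.9%
(not shown)  16  4.0%
(not shown)  16  4.0%
(not shown)  15  3.7%
(not shown)  13  3.2%
(not shown)  11  2.7%
(not shown)  10  2.5%
Other values (81)  170  42.0%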

Most occurring categories

Value  Count  Frequency (%)
Other Letter  381  94.1%
Space Separator  24  5.9%

Most frequent character per category

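Other Letter
Value  Count  Frequency (%)
(not shown)  72  18.9%
(not shown)  31  8.1%
(not shown)  27  7.1%
(not shown)  16  4.2%
(not shown)  16  4.2%
(not shown)  15  3.9%
(not shown)  13  3.4%
(not shown)  11  2.9%
(not shown)  10  2.6%
(not shown)  9  2.4%
Other values (80)  161  42.3%
Space Separator
Value  Count  Frequency (%)
(space)  24  100.0%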

Most occurring scripts

Value  Count  Frequency (%)
Hangul  381  94.1%
Common  24  5.9%

Most frequent character per script

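Hangul
Value  Count  Frequency (%)
(not shown)  72  18.9%
(not shown)  31  8.1%
(not shown)  27  7.1%
(not shown)  16  4.2%
(not shown)  16  4.2%
(not shown)  15  3.9%
(not shown)  13  3.4%
(not shown)  11  2.9%
(not shown)  10  2.6%
(not shown)  9  2.4%
Other values (80)  161  42.3%
Common
Value  Count  Frequency (%)
(space)  24  100.0%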

Most occurring blocks

Value  Count  Frequency (%)
Hangul  381  94.1%
ASCII  24  5.9%

Most frequent character per block

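Hangul
Value  Count  Frequency (%)
(not shown)  72  18.9%
(not shown)  31  8.1%
(not shown)  27  7.1%
(not shown)  16  4.2%
(not shown)  16  4.2%
(not shown)  15  3.9%
(not shown)  13  3.4%
(not shown)  11  2.9%
(not shown)  10  2.6%
(not shown)  9  2.4%
Other values (80)  161  42.3%
ASCII
Value  Count  Frequency (%)
(space)  24  100.0%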

adstrd_cd
Real number (ℝ)

HIGH CORRELATION 

Distinct: 97
Distinct (%): 97.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.3858224 × 10^9
Minimum: 1.138051 × 10^9
Maximum: 4.889025 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1.138051 × 10^9
5-th percentile: 4.113082 × 10^9
Q1: 4.1405535 × 10^9
Median: 4.247541 × 10^9
Q3: 4.7832752 × 10^9
95-th percentile: 4.884075 × 10^9
Maximum: 4.889025 × 10^9
Range: 3.750974 × 10^9
Interquartile range (IQR): 6.427217 × 10^8

Descriptive statistics

Standard deviation: 4.534638 × 10^8
Coefficient of variation (CV): 0.10339311
Kurtosis: 25.529683
Mean: 4.3858224 × 10^9
Median Absolute Deviation (MAD): 1.284707 × 10^8
Skewness: -3.5325477
Sum: 4.3858224 × 10^11
Variance: 2.0562942 × 10^17
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value  Count  Frequency (%)
4812760000  3  3.0%
4833051000  2  2.0%
4159025900  1  1.0%
4872025000  1  1.0%
4886025000  1  1.0%
4824052000  1  1.0%
4827025000  1  1.0%
4884025000  1  1.0%
4825052000  1  1.0%
4882025000  1  1.0%
Other values (87)  87  87.0%

Minimum 10 values

Value  Count  Frequency (%)
1138051000  1  1.0%
4111159700  1  1.0%
4111368000  1  1.0%
4111574000  1  1.0%
4111752000  1  1.0%
4113152000  1  1.0%
4113351000  1  1.0%
4113566500  1  1.0%
4115062000  1  1.0%
4117154000  1  1.0%

Maximum 10 values

Value  Count  Frequency (%)
4889025000  1  1.0%
4888025000  1  1.0%
4887025000  1  1.0%
4886025000  1  1.0%
4885025000  1  1.0%
4884025000  1  1.0%
4882025000  1  1.0%
4874025300  1  1.0%
4873025000  1  1.0%
4872025000  1  1.0%

sccnt_ym
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202001
2nd row: 202001
3rd row: 202001
4th row: 202001
5th row: 202001

Common Values

Value  Count  Frequency (%)
202001  100  100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

mlsfc
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 5
Median length: 5
Mean length: 4.97
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 공공도서관
2nd row: 문예회관
3rd row: 공공도서관
4th row: 공공도서관
5th row: 공공도서관

Common Values

Value  Count  Frequency (%)
공공도서관 (public library)  97  97.0%
문예회관 (culture and arts center)  3  3.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

fclt_cnt
Real number (ℝ)

Distinct: 14
Distinct (%): 14.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.45
Minimum: 1
Maximum: 17
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 4
Q3: 6
95-th percentile: 10.05
Maximum: 17
Range: 16
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.0989571
Coefficient of variation (CV): 0.69639486
Kurtosis: 2.8011485
Mean: 4.45
Median Absolute Deviation (MAD): 2
Skewness: 1.4297864
Sum: 445
Variance: 9.6035354
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=14)]

Common Values

Value  Count  Frequency (%)
3  18  18.0%
1  15  15.0%
2  15  15.0%
5  12  12.0%
4  10  10.0%
6  9  9.0%
7  8  8.0%
8  4  4.0%
10  2  2.0%
9  2  2.0%
Other values (4)  5  5.0%

Minimum 10 values

Value  Count  Frequency (%)
1  15  15.0%
2  15  15.0%
3  18  18.0%
4  10  10.0%
5  12  12.0%
6  9  9.0%
7  8  8.0%
8  4  4.0%
9  2  2.0%
10  2  2.0%

Maximum 10 values

Value  Count  Frequency (%)
17  1  1.0%
15  1  1.0%
12  1  1.0%
11  2  2.0%
10  2  2.0%
9  2  2.0%
8  4  4.0%
7  8  8.0%
6  9  9.0%
5  12  12.0%
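
The quantile and descriptive statistics reported for fclt_cnt can be cross-checked from its value counts, which the frequency tables in this section give in full (the 14 distinct values sum to 100 observations). The sketch below assumes sample (n - 1) variance and quantiles by linear interpolation between closest ranks; both assumptions are consistent with the figures in this report. The helper name `quantile` is hypothetical.

```python
import statistics

# Value counts for fclt_cnt, taken from the frequency tables in this section.
value_counts = {1: 15, 2: 15, 3: 18, 4: 10, 5: 12, 6: 9, 7: 8,
                8: 4, 9: 2, 10: 2, 11: 2, 12: 1, 15: 1, 17: 1}
data = sorted(v for v, n in value_counts.items() for _ in range(n))


def quantile(sorted_values, q):
    """Quantile by linear interpolation between the two closest ranks."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


mean = statistics.mean(data)
std = statistics.stdev(data)          # sample standard deviation (n - 1)
variance = statistics.variance(data)  # sample variance (n - 1)
cv = std / mean                       # coefficient of variation
q1, median, q3 = (quantile(data, q) for q in (0.25, 0.5, 0.75))
iqr = q3 - q1
mad = statistics.median(abs(x - median) for x in data)  # median absolute deviation

print(f"sum={sum(data)} mean={mean} std={std:.7f} var={variance:.7f} cv={cv:.8f}")
print(f"q1={q1} median={median} q3={q3} iqr={iqr} mad={mad} p95={quantile(data, 0.95):.2f}")
```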

sccnt
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 6
Distinct (%): 6.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 4
Mean length: 2.7
Min length: 1

Unique

Unique: 4
Unique (%): 4.0%

Sample

1st row: 0
2nd row: 896
3rd row: <NA>
4th row: 0
5th row: <NA>

Common Values

Value  Count  Frequency (%)
<NA>  54  54.0%
0  42  42.0%
896  1  1.0%
582  1  1.0%
575  1  1.0%
533  1  1.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

file_name
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 32
Median length: 32
Mean length: 32
Min length: 32

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: KC_597_DGT_CLT_STATN_BIZAEA_2021
2nd row: KC_597_DGT_CLT_STATN_BIZAEA_2021
3rd row: KC_597_DGT_CLT_STATN_BIZAEA_2021
4th row: KC_597_DGT_CLT_STATN_BIZAEA_2021
5th row: KC_597_DGT_CLT_STATN_BIZAEA_2021

Common Values

Value  Count  Frequency (%)
KC_597_DGT_CLT_STATN_BIZAEA_2021  100  100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

base_ymd
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20200101
2nd row: 20200101
3rd row: 20200101
4th row: 20200101
5th row: 20200101

Common Values

Value  Count  Frequency (%)
20200101  100  100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Interactions

[Figures: interaction plots]

Correlations

           ctprvn_nm  sgnr_nm  adstrd_cd  mlsfc  fclt_cnt  sccnt
ctprvn_nm  1.000      1.000    0.766      1.000  0.336     0.582
sgnr_nm    1.000      1.000    1.000      1.000  1.000     1.000
adstrd_cd  0.766      1.000    1.000      0.106  0.296     0.283
mlsfc      1.000      1.000    0.106      1.000  0.000     0.549
fclt_cnt   0.336      1.000    0.296      0.000  1.000     0.000
sccnt      0.582      1.000    0.283      0.549  0.000     1.000

           sccnt  ctprvn_nm  mlsfc
sccnt      1.000  0.247      0.640
ctprvn_nm  0.247  1.000      0.985
mlsfc      0.640  0.985      1.000

           adstrd_cd  fclt_cnt  ctprvn_nm  mlsfc  sccnt
adstrd_cd  1.000      -0.479    0.619      0.143  0.034
fclt_cnt   -0.479     1.000     0.198      0.000  0.000
ctprvn_nm  0.619      0.198     1.000      0.985  0.247
mlsfc      0.143      0.000     0.985      1.000  0.640
sccnt      0.034      0.000     0.247      0.640  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
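
The report counts 0 missing cells, yet sccnt contains 54 values shown as <NA>, each with a length of 4, which suggests the marker is stored as literal text. The sketch below shows one possible way to count nullity by column while treating such a marker as missing. It assumes the marker is the string "<NA>"; `NA_TOKENS` and `nullity_by_column` are hypothetical names, and the rows are a subset of the first sample rows.

```python
# Assumption: these tokens should count as missing in addition to None.
NA_TOKENS = {"<NA>", ""}

# A few columns from the first sample rows of the report.
rows = [
    {"sgnr_nm": "강릉시", "fclt_cnt": 4, "sccnt": "0"},
    {"sgnr_nm": "진천군", "fclt_cnt": 1, "sccnt": "896"},
    {"sgnr_nm": "동해시", "fclt_cnt": 3, "sccnt": "<NA>"},
    {"sgnr_nm": "삼척시", "fclt_cnt": 3, "sccnt": "0"},
    {"sgnr_nm": "속초시", "fclt_cnt": 3, "sccnt": "<NA>"},
]


def nullity_by_column(records, na_tokens=NA_TOKENS):
    """Count cells per column that are None or equal to a missing-value token."""
    counts = {}
    for record in records:
        for column, value in record.items():
            is_missing = value is None or (isinstance(value, str) and value in na_tokens)
            counts[column] = counts.get(column, 0) + int(is_missing)
    return counts


nullity = nullity_by_column(rows)
print(nullity)
```

If the marker were normalised this way before profiling, sccnt would presumably be reported with missing values rather than with <NA> as its most frequent category.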

Sample

First rows

index  ctprvn_nm  sgnr_nm  adstrd_cd  sccnt_ym  mlsfc  fclt_cnt  sccnt  file_name  base_ymd
0  강원도  강릉시  4215061500  202001  공공도서관  4  0  KC_597_DGT_CLT_STATN_BIZAEA_2021  20200101
1  충청북도  진천군  4375025000  202001  문예회관  1  896  KC_597_DGT_CLT_STATN_BIZAEA_2021  20200101
2  강원도  동해시  4217054000  202001  공공도서관  3  <NA>  KC_597_DGT_CLT_STATN_BIZAEA_2021  20200101
3  강원도  삼척시  4223057000  202001  공공도서관  3  0  KC_597_DGT_CLT_STATN_BIZAEA_2021  20200101
4  강원도  속초시  4221056000  202001  공공도서관  3  <NA>  KC_597_DGT_CLT_STATN_BIZAEA_2021  20200101
5  강원도  양구군  4280025000  202001  공공도서관  1  <NA>  KC_597_DGT_CLT_STATN_BIZAEA_2021  20200101
6  강원도  양양군  4283025000  202001  공공도서관  1  <NA>  KC_597_DGT_CLT_STATN_BIZAEA_2021  20200101
7  충청북도  청주시  4311251000  202001  문예회관  2  <NA>  KC_597_DGT_CLT_STATN_BIZAEA_2021  20200101
8  강원도  원주시  4213025000  202001  공공도서관  4  <NA>  KC_597_DGT_CLT_STATN_BIZAEA_2021  20200101
9  강원도  인제군  4281025000  202001  공공도서관  2  <NA>  KC_597_DGT_CLT_STATN_BIZAEA_2021  20200101

Last rows

index  ctprvn_nm  sgnr_nm  adstrd_cd  sccnt_ym  mlsfc  fclt_cnt  sccnt  file_name  base_ymd
90  경상북도  김천시  4715053600  202001  공공도서관  1  <NA>  KC_597_DGT_CLT_STATN_BIZAEA_2021  20200101
91  경상북도  문경시  4728059000  202001  공공도서관  5  <NA>  KC_597_DGT_CLT_STATN_BIZAEA_2021  20200101
92  경상북도  봉화군  4792025000  202001  공공도서관  1  <NA>  KC_597_DGT_CLT_STATN_BIZAEA_2021  20200101
93  경상북도  상주시  4725052000  202001  공공도서관  2  <NA>  KC_597_DGT_CLT_STATN_BIZAEA_2021  20200101
94  경상북도  성주군  4784025000  202001  공공도서관  2  <NA>  KC_597_DGT_CLT_STATN_BIZAEA_2021  20200101
95  경상북도  안동시  4717058500  202001  공공도서관  5  <NA>  KC_597_DGT_CLT_STATN_BIZAEA_2021  20200101
96  경상북도  영덕군  4777025000  202001  공공도서관  1  <NA>  KC_597_DGT_CLT_STATN_BIZAEA_2021  20200101
97  경상북도  영양군  4776025000  202001  공공도서관  1  <NA>  KC_597_DGT_CLT_STATN_BIZAEA_2021  20200101
98  경상북도  영주시  4721063000  202001  공공도서관  3  <NA>  KC_597_DGT_CLT_STATN_BIZAEA_2021  20200101
99  경상북도  영천시  4723025000  202001  공공도서관  2  <NA>  KC_597_DGT_CLT_STATN_BIZAEA_2021  20200101