Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 10000
Missing cells (%): 12.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 771.5 KiB
Average record size in memory: 79.0 B
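The derived figures in this block follow from the raw counts. The short check below is only a sketch: it copies the numbers from the report and assumes that the percentage is taken over all cells (variables × observations) and that 1 KiB = 1024 bytes.

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 8
n_observations = 10_000
missing_cells = 10_000
total_size_kib = 771.5

# Assumption: the missing-cell percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Both results agree with the reported 12.5% and 79.0 B. The 10000 missing cells equal the number of observations, which is consistent with the alert that a single column (휴일여부명) is entirely missing.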

Variable types

Numeric: 5
Categorical: 2
Unsupported: 1

Alerts

연도 (year) is highly overall correlated with 연월일 (date, YYYYMMDD) [High correlation]
연월일 is highly overall correlated with 연도 [High correlation]
요일순번 (day-of-week index) is highly overall correlated with 요일 (day of week) and 1 other field [High correlation]
요일 is highly overall correlated with 요일순번 and 1 other field [High correlation]
휴일여부 (holiday flag) is highly overall correlated with 요일순번 and 1 other field [High correlation]
휴일여부명 (holiday flag name) has 10000 (100.0%) missing values [Missing]
연월일 has unique values [Unique]
휴일여부명 is an unsupported type, check if it needs cleaning or further analysis [Unsupported]

Reproduction

Analysis started: 2023-12-10 22:28:05.173781
Analysis finished: 2023-12-10 22:28:09.127462
Duration: 3.95 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연도 (year)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 31
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2014.9793
Minimum: 2000
Maximum: 2030
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2000
5th percentile: 2001
Q1: 2007
Median: 2015
Q3: 2023
95th percentile: 2029
Maximum: 2030
Range: 30
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 8.928821
Coefficient of variation (CV): 0.0044312222
Kurtosis: -1.1965978
Mean: 2014.9793
Median Absolute Deviation (MAD): 8
Skewness: 0.00033060113
Sum: 20149793
Variance: 79.723844
Monotonicity: Not monotonic
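Several of these statistics are tied together by simple identities (mean = sum / n, variance = standard deviation squared, CV = standard deviation / mean, IQR = Q3 − Q1). The sketch below re-derives them from the 연도 figures in this report; it is an illustrative consistency check, not the profiler's own computation.

```python
# Values copied from the 연도 statistics in this report.
n = 10_000
total = 20_149_793
std = 8.928821
q1, q3 = 2007, 2023
minimum, maximum = 2000, 2030

mean = total / n                 # reported: 2014.9793
variance = std ** 2              # reported: 79.723844
cv = std / mean                  # reported: 0.0044312222
iqr = q3 - q1                    # reported: 16
value_range = maximum - minimum  # reported: 30

print(f"mean={mean:.4f} variance={variance:.6f} cv={cv:.10f}")
print(f"iqr={iqr} range={value_range}")
```

Because the reported standard deviation is already rounded, the recomputed variance and CV can differ from the report in the last printed digits.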
Histogram with fixed size bins (bins=31)

Common values

Value | Count | Frequency (%)
2014 | 340 | 3.4%
2017 | 330 | 3.3%
2018 | 330 | 3.3%
2000 | 329 | 3.3%
2026 | 328 | 3.3%
2008 | 327 | 3.3%
2010 | 327 | 3.3%
2024 | 327 | 3.3%
2022 | 327 | 3.3%
2002 | 326 | 3.3%
Other values (21) | 6709 | 67.1%

Minimum 10 values

Value | Count | Frequency (%)
2000 | 329 | 3.3%
2001 | 322 | 3.2%
2002 | 326 | 3.3%
2003 | 313 | 3.1%
2004 | 319 | 3.2%
2005 | 323 | 3.2%
2006 | 321 | 3.2%
2007 | 319 | 3.2%
2008 | 327 | 3.3%
2009 | 323 | 3.2%

Maximum 10 values

Value | Count | Frequency (%)
2030 | 318 | 3.2%
2029 | 318 | 3.2%
2028 | 321 | 3.2%
2027 | 313 | 3.1%
2026 | 328 | 3.3%
2025 | 318 | 3.2%
2024 | 327 | 3.3%
2023 | 325 | 3.2%
2022 | 327 | 3.3%
2021 | 314 | 3.1%


Real number (ℝ)

Distinct: 12
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.5282
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 4
Median: 7
Q3: 10
95th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.4495209
Coefficient of variation (CV): 0.52840307
Kurtosis: -1.2052583
Mean: 6.5282
Median Absolute Deviation (MAD): 3
Skewness: -0.0038003324
Sum: 65282
Variance: 11.899195
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)

Common values

Value | Count | Frequency (%)
12 | 869 | 8.7%
7 | 852 | 8.5%
5 | 845 | 8.5%
3 | 841 | 8.4%
8 | 840 | 8.4%
10 | 839 | 8.4%
1 | 839 | 8.4%
6 | 839 | 8.4%
4 | 837 | 8.4%
11 | 819 | 8.2%
Other values (2) | 1580 | 15.8%

Minimum 10 values

Value | Count | Frequency (%)
1 | 839 | 8.4%
2 | 774 | 7.7%
3 | 841 | 8.4%
4 | 837 | 8.4%
5 | 845 | 8.5%
6 | 839 | 8.4%
7 | 852 | 8.5%
8 | 840 | 8.4%
9 | 806 | 8.1%
10 | 839 | 8.4%

Maximum 10 values

Value | Count | Frequency (%)
12 | 869 | 8.7%
11 | 819 | 8.2%
10 | 839 | 8.4%
9 | 806 | 8.1%
8 | 840 | 8.4%
7 | 852 | 8.5%
6 | 839 | 8.4%
5 | 845 | 8.5%
4 | 837 | 8.4%
3 | 841 | 8.4%


Real number (ℝ)

Distinct: 31
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.6627
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 2
Q1: 8
Median: 16
Q3: 23
95th percentile: 29
Maximum: 31
Range: 30
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 8.7986516
Coefficient of variation (CV): 0.56175829
Kurtosis: -1.1915853
Mean: 15.6627
Median Absolute Deviation (MAD): 8
Skewness: 0.015663845
Sum: 156627
Variance: 77.41627
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)

Common values

Value | Count | Frequency (%)
1 | 343 | 3.4%
17 | 339 | 3.4%
6 | 338 | 3.4%
8 | 337 | 3.4%
13 | 335 | 3.4%
5 | 335 | 3.4%
9 | 334 | 3.3%
26 | 334 | 3.3%
24 | 333 | 3.3%
2 | 333 | 3.3%
Other values (21) | 6639 | 66.4%

Minimum 10 values

Value | Count | Frequency (%)
1 | 343 | 3.4%
2 | 333 | 3.3%
3 | 314 | 3.1%
4 | 330 | 3.3%
5 | 335 | 3.4%
6 | 338 | 3.4%
7 | 321 | 3.2%
8 | 337 | 3.4%
9 | 334 | 3.3%
10 | 325 | 3.2%

Maximum 10 values

Value | Count | Frequency (%)
31 | 196 | 2.0%
30 | 291 | 2.9%
29 | 299 | 3.0%
28 | 324 | 3.2%
27 | 324 | 3.2%
26 | 334 | 3.3%
25 | 324 | 3.2%
24 | 333 | 3.3%
23 | 332 | 3.3%
22 | 321 | 3.2%

연월일 (date, YYYYMMDD)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20150461
Minimum: 20000101
Maximum: 20301230
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20000101
5th percentile: 20010711
Q1: 20071015
Median: 20150618
Q3: 20230315
95th percentile: 20290601
Maximum: 20301230
Range: 301129
Interquartile range (IQR): 159300.5

Descriptive statistics

Standard deviation: 89288.071
Coefficient of variation (CV): 0.0044310683
Kurtosis: -1.1965249
Mean: 20150461
Median Absolute Deviation (MAD): 79650
Skewness: 0.00033144059
Sum: 2.0150461 × 10^11
Variance: 7.9723596 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
20271206 | 1 | < 0.1%
20190706 | 1 | < 0.1%
20000116 | 1 | < 0.1%
20100906 | 1 | < 0.1%
20170808 | 1 | < 0.1%
20081028 | 1 | < 0.1%
20071018 | 1 | < 0.1%
20190811 | 1 | < 0.1%
20100116 | 1 | < 0.1%
20120522 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Minimum 10 values

Value | Count | Frequency (%)
20000101 | 1 | < 0.1%
20000102 | 1 | < 0.1%
20000103 | 1 | < 0.1%
20000105 | 1 | < 0.1%
20000107 | 1 | < 0.1%
20000108 | 1 | < 0.1%
20000109 | 1 | < 0.1%
20000110 | 1 | < 0.1%
20000111 | 1 | < 0.1%
20000112 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
20301230 | 1 | < 0.1%
20301229 | 1 | < 0.1%
20301228 | 1 | < 0.1%
20301227 | 1 | < 0.1%
20301226 | 1 | < 0.1%
20301225 | 1 | < 0.1%
20301224 | 1 | < 0.1%
20301223 | 1 | < 0.1%
20301222 | 1 | < 0.1%
20301221 | 1 | < 0.1%
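The profiler treats 연월일 as a plain real number, so figures such as Range (301129) and IQR are differences between YYYYMMDD-encoded integers, not day counts. The sketch below decodes the integers into calendar dates using only the standard library. The helper names are hypothetical, and the 요일순번 encoding (1 = Sunday through 7 = Saturday) is an assumption inferred from the rows in the Sample section, not something the report states.

```python
from datetime import date


def decode_yyyymmdd(value: int) -> date:
    """Split an integer such as 20271206 into a calendar date."""
    year, rest = divmod(value, 10_000)
    month, day = divmod(rest, 100)
    return date(year, month, day)


def weekday_index(d: date) -> int:
    """Assumed 요일순번 encoding: 1 = Sunday, 2 = Monday, ..., 7 = Saturday."""
    return d.isoweekday() % 7 + 1


# (연월일, 요일순번) pairs taken from rows in the Sample section of this report.
sample = [(20271206, 2), (20001015, 1), (20150722, 4), (20300810, 7)]
for encoded, reported_index in sample:
    d = decode_yyyymmdd(encoded)
    print(encoded, d.isoformat(), weekday_index(d), reported_index)

# Span between the reported minimum and maximum, measured in real days.
span_days = (decode_yyyymmdd(20301230) - decode_yyyymmdd(20000101)).days
print("calendar span in days:", span_days)
```

For analysis it is usually more meaningful to profile the decoded dates than the raw integers, since consecutive days across a month boundary differ by far more than 1 in the encoded form.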

요일순번 (day-of-week index)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.9924
Minimum: 1
Maximum: 7
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 2
Median: 4
Q3: 6
95th percentile: 7
Maximum: 7
Range: 6
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.0039321
Coefficient of variation (CV): 0.5019367
Kurtosis: -1.2538323
Mean: 3.9924
Median Absolute Deviation (MAD): 2
Skewness: 0.0075265714
Sum: 39924
Variance: 4.0157438
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)

Common values

Value | Count | Frequency (%)
2 | 1445 | 14.4%
1 | 1439 | 14.4%
7 | 1437 | 14.4%
4 | 1431 | 14.3%
3 | 1421 | 14.2%
6 | 1414 | 14.1%
5 | 1413 | 14.1%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1439 | 14.4%
2 | 1445 | 14.4%
3 | 1421 | 14.2%
4 | 1431 | 14.3%
5 | 1413 | 14.1%
6 | 1414 | 14.1%
7 | 1437 | 14.4%

Maximum 10 values

Value | Count | Frequency (%)
7 | 1437 | 14.4%
6 | 1414 | 14.1%
5 | 1413 | 14.1%
4 | 1431 | 14.3%
3 | 1421 | 14.2%
2 | 1445 | 14.4%
1 | 1439 | 14.4%

요일 (day of week)
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
1445 
1439 
1437 
1431 
1421 
Other values (2)
2827 

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row
2nd row
3rd row
4th row
5th row

Common Values

Value | Count | Frequency (%)
 | 1445 | 14.4%
 | 1439 | 14.4%
 | 1437 | 14.4%
 | 1431 | 14.3%
 | 1421 | 14.2%
 | 1414 | 14.1%
 | 1413 | 14.1%

Length

Histogram of lengths of the category


휴일여부 (holiday flag)
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
8 | 7040
7 | 1426
1 | 1425
9 | 109

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 8
2nd row: 1
3rd row: 8
4th row: 7
5th row: 1

Common Values

Value | Count | Frequency (%)
8 | 7040 | 70.4%
7 | 1426 | 14.3%
1 | 1425 | 14.2%
9 | 109 | 1.1%

Length

Histogram of lengths of the category


휴일여부명 (holiday flag name)
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 10000
Missing (%): 100.0%
Memory size: 166.0 KiB
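Since every value of 휴일여부명 is missing, the column carries no information and is a natural candidate for removal before further analysis. The following is a minimal standard-library sketch of that cleaning step. The function name is hypothetical, the rows are abbreviated from the Sample section, and `None` stands in for `<NA>`.

```python
def fully_missing_columns(rows):
    """Return the column names whose value is None in every row."""
    if not rows:
        return []
    return [col for col in rows[0] if all(row.get(col) is None for row in rows)]


# Two abbreviated rows shaped like the Sample section; None stands in for <NA>.
rows = [
    {"연월일": 20271206, "요일순번": 2, "휴일여부": "8", "휴일여부명": None},
    {"연월일": 20001015, "요일순번": 1, "휴일여부": "1", "휴일여부명": None},
]

empty = fully_missing_columns(rows)
cleaned = [{k: v for k, v in row.items() if k not in empty} for row in rows]

print("fully missing:", empty)
print("remaining columns:", list(cleaned[0]))
```

Whether to drop the column or to repopulate it from 휴일여부 depends on how the data will be used; the report alone does not settle that.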

Interactions

[25 interaction plot images; not reproduced in this text version.]

Correlations

 | 연도 | (column 2) | (column 3) | 연월일 | 요일순번 | 요일 | 휴일여부
연도 | 1.000 | 0.000 | 0.000 | 0.994 | 0.000 | 0.000 | 0.151
(column 2) | 0.000 | 1.000 | 0.000 | 0.009 | 0.000 | 0.000 | 0.060
(column 3) | 0.000 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.045
연월일 | 0.994 | 0.009 | 0.000 | 1.000 | 0.000 | 0.000 | 0.151
요일순번 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000 | 0.874
요일 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000 | 0.874
휴일여부 | 0.151 | 0.060 | 0.045 | 0.151 | 0.874 | 0.874 | 1.000

 | 휴일여부 | 요일
휴일여부 | 1.000 | 0.813
요일 | 0.813 | 1.000

 | 연도 | (column 2) | (column 3) | 연월일 | 요일순번 | 요일 | 휴일여부
연도 | 1.000 | -0.002 | -0.005 | 0.999 | -0.001 | 0.000 | 0.091
(column 2) | -0.002 | 1.000 | 0.012 | 0.030 | -0.001 | 0.000 | 0.036
(column 3) | -0.005 | 0.012 | 1.000 | -0.002 | -0.001 | 0.000 | 0.018
연월일 | 0.999 | 0.030 | -0.002 | 1.000 | -0.002 | 0.000 | 0.091
요일순번 | -0.001 | -0.001 | -0.001 | -0.002 | 1.000 | 1.000 | 0.813
요일 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000 | 0.813
휴일여부 | 0.091 | 0.036 | 0.018 | 0.091 | 0.813 | 0.813 | 1.000
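The near-perfect correlation between 연도 and 연월일 is structural: the encoded date is essentially the year multiplied by 10000 plus a small month-and-day term. The sketch below illustrates this with a hand-written Pearson coefficient over the first ten rows listed in the Sample section. The report does not state which coefficient each matrix uses, so Pearson is an assumption chosen for illustration, and ten rows will not reproduce the reported values exactly.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


# (연도, 연월일) pairs from the first ten rows of the Sample section.
pairs = [
    (2027, 20271206), (2000, 20001015), (2015, 20150722), (2030, 20300810),
    (2014, 20140105), (2028, 20280724), (2012, 20121208), (2020, 20200324),
    (2017, 20170807), (2019, 20190624),
]
years = [year for year, _ in pairs]
dates = [encoded for _, encoded in pairs]

r = pearson(years, dates)
print(f"Pearson r between 연도 and 연월일 on the sample rows: {r:.6f}")
```

Because one of the two columns is almost a deterministic function of the other, keeping both in a model adds redundancy rather than information.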

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completeness.

Sample

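First rows

 | 연도 | (column 2) | (column 3) | 연월일 | 요일순번 | 요일 | 휴일여부 | 휴일여부명
6293 | 2027 | 12 | 6 | 20271206 | 2 |  | 8 | <NA>
7616 | 2000 | 10 | 15 | 20001015 | 1 |  | 1 | <NA>
8175 | 2015 | 7 | 22 | 20150722 | 4 |  | 8 | <NA>
7791 | 2030 | 8 | 10 | 20300810 | 7 |  | 7 | <NA>
637 | 2014 | 1 | 5 | 20140105 | 1 |  | 1 | <NA>
6509 | 2028 | 7 | 24 | 20280724 | 2 |  | 8 | <NA>
267 | 2012 | 12 | 8 | 20121208 | 7 |  | 7 | <NA>
9782 | 2020 | 3 | 24 | 20200324 | 3 |  | 8 | <NA>
8880 | 2017 | 8 | 7 | 20170807 | 2 |  | 8 | <NA>
9526 | 2019 | 6 | 24 | 20190624 | 2 |  | 8 | <NA>

Last rows

 | 연도 | (column 2) | (column 3) | 연월일 | 요일순번 | 요일 | 휴일여부 | 휴일여부명
5389 | 2004 | 4 | 13 | 20040413 | 3 |  | 8 | <NA>
9226 | 2018 | 8 | 10 | 20180810 | 6 |  | 8 | <NA>
3966 | 2021 | 3 | 1 | 20210301 | 2 |  | 9 | <NA>
6322 | 2028 | 1 | 4 | 20280104 | 3 |  | 8 | <NA>
3292 | 2025 | 7 | 9 | 20250709 | 4 |  | 8 | <NA>
1515 | 2011 | 1 | 31 | 20110131 | 2 |  | 8 | <NA>
5560 | 2004 | 10 | 10 | 20041010 | 1 |  | 1 | <NA>
4261 | 2000 | 12 | 30 | 20001230 | 7 |  | 7 | <NA>
2895 | 2024 | 5 | 19 | 20240519 | 1 |  | 1 | <NA>
4161 | 2000 | 9 | 12 | 20000912 | 3 |  | 8 | <NA>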