Overview

Dataset statistics

Number of variables: 3
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 12
Duplicate rows (%): 12.0%
Total size in memory: 2.7 KiB
Average record size in memory: 27.3 B

Variable types

Categorical: 1
Text: 1
Numeric: 1

Alerts

Duplicates: Dataset has 12 (12.0%) duplicate rows

Reproduction

Analysis started: 2023-12-10 13:09:57.507809
Analysis finished: 2023-12-10 13:09:58.122773
Duration: 0.61 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
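
The reported duration can be checked directly from the two timestamps above. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the report.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block of the report.
started = datetime.fromisoformat("2023-12-10 13:09:57.507809")
finished = datetime.fromisoformat("2023-12-10 13:09:58.122773")

# Elapsed wall-clock time between start and finish, in seconds.
duration_seconds = (finished - started).total_seconds()

# Rounded to two decimals, as the report displays it.
print(f"{duration_seconds:.2f} seconds")
```

Rounded to two decimals, the difference agrees with the Duration line.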

Variables

시설명 (facility name)
Categorical

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value     Count
20210201  83
20210202  17

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20210201
2nd row: 20210201
3rd row: 20210201
4th row: 20210201
5th row: 20210201

Common Values

Value     Count  Frequency (%)
20210201  83     83.0%
20210202  17     17.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

Value     Count  Frequency (%)
20210201  83     83.0%
20210202  17     17.0%

공급날짜 (supply date)
Text

Distinct: 62
Distinct (%): 62.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 7
Median length: 5
Mean length: 5.19
Min length: 5

Characters and Unicode

Total characters: 519
Distinct characters: 82
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
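
As a rough illustration of these character properties, the sketch below looks up the Unicode general category for each character of the five sample values listed in this report. It uses Python's standard `unicodedata` module, which exposes the general category (`Lo` is Other Letter, `Nd` is Decimal Number) but not the script or block, so those two properties are not reproduced here. The variable names are illustrative.

```python
import unicodedata
from collections import Counter

# The five sample values shown for this variable in the report.
sample_names = ["임하가압장", "주산가압장", "논산가압장", "월야가압장", "서삼가압장"]

# Normalize to NFC so each Hangul syllable is a single code point.
normalized = [unicodedata.normalize("NFC", name) for name in sample_names]

# Count characters by Unicode general category.
category_counts = Counter(
    unicodedata.category(ch) for name in normalized for ch in name
)

# Count individual characters across the sample.
char_counts = Counter(ch for name in normalized for ch in name)

print(category_counts)
print(char_counts.most_common(3))
print(unicodedata.category("2"), unicodedata.name("가"))
```

Every character in these five values falls in the Other Letter category, which is consistent with that category dominating the full column.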

Unique

Unique: 45
Unique (%): 45.0%

Sample

1st row: 임하가압장
2nd row: 주산가압장
3rd row: 논산가압장
4th row: 월야가압장
5th row: 서삼가압장

Value              Count  Frequency (%)
미금가압장          9      9.0%
용인가압장          7      7.0%
판교가압장          6      6.0%
사등가압장          5      5.0%
광명가압장          4      4.0%
구선가압장          2      2.0%
교동가압장          2      2.0%
전동가압장          2      2.0%
포병대가압장        2      2.0%
북이북하가압장      2      2.0%
Other values (52)  59     59.0%

Most occurring characters

Value              Count  Frequency (%)
                   104    20.0%
                   101    19.5%
                   100    19.3%
                   11     2.1%
                   10     1.9%
                   9      1.7%
                   8      1.5%
                   8      1.5%
                   7      1.3%
                   7      1.3%
Other values (72)  154    29.7%

Most occurring categories

Value           Count  Frequency (%)
Other Letter    516    99.4%
Decimal Number  3      0.6%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
                   104    20.2%
                   101    19.6%
                   100    19.4%
                   11     2.1%
                   10     1.9%
                   9      1.7%
                   8      1.6%
                   8      1.6%
                   7      1.4%
                   7      1.4%
Other values (70)  151    29.3%

Decimal Number
Value  Count  Frequency (%)
2      2      66.7%
1      1      33.3%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  516    99.4%
Common  3      0.6%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
                   104    20.2%
                   101    19.6%
                   100    19.4%
                   11     2.1%
                   10     1.9%
                   9      1.7%
                   8      1.6%
                   8      1.6%
                   7      1.4%
                   7      1.4%
Other values (70)  151    29.3%

Common
Value  Count  Frequency (%)
2      2      66.7%
1      1      33.3%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  516    99.4%
ASCII   3      0.6%

Most frequent character per block

Hangul
Value              Count  Frequency (%)
                   104    20.2%
                   101    19.6%
                   100    19.4%
                   11     2.1%
                   10     1.9%
                   9      1.7%
                   8      1.6%
                   8      1.6%
                   7      1.4%
                   7      1.4%
Other values (70)  151    29.3%

ASCII
Value  Count  Frequency (%)
2      2      66.7%
1      1      33.3%

공급량 (supply volume)
Real number (ℝ)

Distinct: 81
Distinct (%): 81.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 161496.92
Minimum: 0
Maximum: 2275700
Zeros: 1
Zeros (%): 1.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 598.9
Q1: 4009.75
Median: 31606.5
Q3: 94000
95-th percentile: 907400
Maximum: 2275700
Range: 2275700
Interquartile range (IQR): 89990.25
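
The range and interquartile range follow from the other quantile statistics by subtraction. A minimal sketch, with the figures copied from the list above and illustrative variable names:

```python
# Quantile statistics copied from the report.
minimum = 0
q1 = 4009.75
q3 = 94000
maximum = 2275700

# Range is the spread between the extremes.
value_range = maximum - minimum

# Interquartile range is the spread of the middle 50% of the data.
iqr = q3 - q1

print(value_range, iqr)
```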

Descriptive statistics

Standard deviation: 390058.29
Coefficient of variation (CV): 2.4152677
Kurtosis: 17.722078
Mean: 161496.92
Median Absolute Deviation (MAD): 28625
Skewness: 4.027558
Sum: 16149692
Variance: 1.5214547 × 10^11
Monotonicity: Not monotonic
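
Several of these figures are related by simple arithmetic: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below checks those relationships from the rounded values shown in the report, so small differences in the last digits are expected. Variable names are illustrative.

```python
# Values copied from the report (the standard deviation is rounded to 2 decimals).
total = 16149692
n_observations = 100
std = 390058.29

mean = total / n_observations   # arithmetic mean
cv = std / mean                 # coefficient of variation
variance = std ** 2             # variance as squared standard deviation

print(f"mean={mean:.2f}, cv={cv:.5f}, variance={variance:.4e}")
```

A coefficient of variation above 2 means the standard deviation is more than twice the mean, which fits the strong right skew reported above.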
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value              Count  Frequency (%)
375700             4      4.0%
94000              4      4.0%
46887              4      4.0%
43687              3      3.0%
54012              2      2.0%
54762              2      2.0%
50813              2      2.0%
2275700            2      2.0%
907400             2      2.0%
394809             2      2.0%
Other values (71)  73     73.0%

Minimum 10 values

Value  Count  Frequency (%)
0      1      1.0%
379    1      1.0%
442    1      1.0%
460    1      1.0%
483    1      1.0%
605    1      1.0%
723    1      1.0%
994    1      1.0%
1421   1      1.0%
1542   1      1.0%

Maximum 10 values

Value    Count  Frequency (%)
2275700  2      2.0%
1368300  2      2.0%
907400   2      2.0%
528616   1      1.0%
411889   2      2.0%
403584   1      1.0%
394809   2      2.0%
375700   4      4.0%
194560   1      1.0%
171478   1      1.0%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]

          시설명  공급날짜  공급량
시설명    1.000   0.000     0.000
공급날짜  0.000   1.000     0.000
공급량    0.000   0.000     1.000

[Figure: correlation heatmap]

        공급량  시설명
공급량  1.000   0.000
시설명  0.000   1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows

Index  시설명    공급날짜      공급량
0      20210201  임하가압장    403584
1      20210201  주산가압장    32064
2      20210201  논산가압장    0
3      20210201  월야가압장    10098
4      20210201  서삼가압장    2799
5      20210201  무장가압장    11118
6      20210201  정읍가압장    31149
7      20210201  오수가압장    605
8      20210201  용인가압장    46887
9      20210201  미금가압장    375700

Last rows

Index  시설명    공급날짜      공급량
90     20210202  교동가압장    6208
91     20210202  괴산가압장    3018
92     20210202  고사포가압장  442
93     20210202  포병대가압장  460
94     20210202  용인가압장    43687
95     20210202  사등가압장    54762
96     20210202  용인가압장    43687
97     20210202  사등가압장    54762
98     20210202  용인가압장    43687
99     20210202  김천가압장    26947

Duplicate rows

Most frequently occurring

Index  시설명    공급날짜      공급량   # duplicates
1      20210201  미금가압장    94000    4
2      20210201  미금가압장    375700   4
4      20210201  용인가압장    46887    4
11     20210202  용인가압장    43687    3
0      20210201  광명가압장    394809   2
3      20210201  사등가압장    54012    2
5      20210201  의정부가압장  50813    2
6      20210201  판교가압장    907400   2
7      20210201  판교가압장    1368300  2
8      20210201  판교가압장    2275700  2
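
To illustrate how a "# duplicates" count of this kind can be obtained, the sketch below groups identical rows and counts them. It is not the profiler's own implementation. The input is only the last ten rows shown in the Sample section (indices 90 to 99), not the full dataset, and the variable names are illustrative.

```python
from collections import Counter

# Last ten rows from the Sample section, as (시설명, 공급날짜, 공급량) tuples.
tail_rows = [
    ("20210202", "교동가압장", 6208),
    ("20210202", "괴산가압장", 3018),
    ("20210202", "고사포가압장", 442),
    ("20210202", "포병대가압장", 460),
    ("20210202", "용인가압장", 43687),
    ("20210202", "사등가압장", 54762),
    ("20210202", "용인가압장", 43687),
    ("20210202", "사등가압장", 54762),
    ("20210202", "용인가압장", 43687),
    ("20210202", "김천가압장", 26947),
]

# Count how often each complete row occurs.
row_counts = Counter(tail_rows)

# Keep only rows that occur more than once.
duplicate_groups = {row: n for row, n in row_counts.items() if n > 1}

for row, n in sorted(duplicate_groups.items(), key=lambda item: -item[1]):
    print(row, n)
```

Within these ten rows alone, the 용인가압장 / 43687 row occurs three times, which matches its "# duplicates" value in the table above.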