Overview

Dataset statistics

Number of variables: 3
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 11
Duplicate rows (%): 11.0%
Total size in memory: 2.7 KiB
Average record size in memory: 27.3 B
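
The percentage and size figures above are related by simple arithmetic. The sketch below recomputes them from the reported values, assuming 1 KiB = 1024 B and that the total size is the average record size multiplied by the number of observations; the variable names are illustrative only.

```python
# Figures copied from the dataset statistics above.
n_rows = 100
duplicate_rows = 11
avg_record_bytes = 27.3  # reported average record size in memory

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = 100 * duplicate_rows / n_rows

# Total size implied by the average record size (1 KiB = 1024 B).
total_kib = avg_record_bytes * n_rows / 1024

print(f"{duplicate_pct:.1f}%")
print(f"{total_kib:.1f} KiB")
```

Rounded to one decimal place, both values agree with the figures reported above.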

Variable types

Categorical: 1
Text: 1
Numeric: 1

Alerts

Dataset has 11 (11.0%) duplicate rows (Duplicates)
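
A minimal sketch of how duplicated rows can be found, using the first ten rows from the Sample section of this report. It counts each distinct row and keeps those that occur more than once. Whether the profiler counts duplicates in exactly this way is an assumption, and the tuple layout is chosen here for illustration.

```python
from collections import Counter

# First ten rows of the Sample section, as (date-like value, facility, amount).
rows = [
    ("20201201", "주산가압장", 30592),
    ("20201201", "유정가압장", 4056),
    ("20201201", "논산가압장", 20169),
    ("20201201", "광명가압장", 409630),
    ("20201201", "미금가압장", 95770),
    ("20201201", "신림가압장", 21872),
    ("20201201", "논산가압장", 2579),
    ("20201201", "미금가압장", 95770),
    ("20201201", "수분가압장", 5028),
    ("20201201", "평촌가압장", 7325),
]

# Count identical rows, then keep only rows seen more than once.
counts = Counter(rows)
duplicated = {row: n for row, n in counts.items() if n > 1}

print(duplicated)
```

Within these ten rows only the 미금가압장 row with amount 95770 repeats; the two 논산가압장 rows differ in amount and therefore are not duplicates.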

Reproduction

Analysis started: 2024-04-20 16:01:11.504283
Analysis finished: 2024-04-20 16:01:12.313661
Duration: 0.81 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
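
The reported duration can be checked against the two timestamps. This is a small sketch using only the Python standard library; it assumes both timestamps are in the same time zone.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block above.
started = datetime.fromisoformat("2024-04-20 16:01:11.504283")
finished = datetime.fromisoformat("2024-04-20 16:01:12.313661")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()

print(f"{duration:.2f} seconds")
```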

Variables

시설명
Categorical

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 928.0 B

20201201: 84
20201202: 16

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20201201
2nd row: 20201201
3rd row: 20201201
4th row: 20201201
5th row: 20201201

Common Values

Value      Count  Frequency (%)
20201201   84     84.0%
20201202   16     16.0%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: plot of common values]

공급날짜
Text

Distinct: 62
Distinct (%): 62.0%
Missing: 0
Missing (%): 0.0%
Memory size: 928.0 B

Length

Max length: 7
Median length: 5
Mean length: 5.15
Min length: 5

Characters and Unicode

Total characters: 515
Distinct characters: 82
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
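
As an illustration of these character properties, the sketch below uses Python's standard `unicodedata` module to count general categories in a string. The first string is taken from the Sample rows of this variable; the second, `제2가압장`, is a hypothetical name used only to show a digit, and the helper `category_counts` is not part of the profiler.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count Unicode general categories (e.g. 'Lo', 'Nd') in text."""
    return Counter(unicodedata.category(ch) for ch in text)


sample = "주산가압장"       # value taken from this report
hypothetical = "제2가압장"  # hypothetical value containing a digit

print(category_counts(sample))
print(category_counts(hypothetical))

# The character name also identifies the script of a Hangul syllable.
print(unicodedata.name("가"))
```

Here `Lo` corresponds to the "Other Letter" category and `Nd` to "Decimal Number", the two categories listed for this variable.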

Unique

Unique: 46
Unique (%): 46.0%

Sample

1st row: 주산가압장
2nd row: 유정가압장
3rd row: 논산가압장
4th row: 광명가압장
5th row: 미금가압장

Value              Count  Frequency (%)
미금가압장          11     11.0%
판교가압장          7      7.0%
용인가압장          5      5.0%
논산가압장          4      4.0%
광명가압장          4      4.0%
사등가압장          3      3.0%
금가가압장          2      2.0%
주산가압장          2      2.0%
상무대가압장        2      2.0%
낙양가압장          2      2.0%
Other values (52)  58     58.0%

Most occurring characters

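Value              Count  Frequency (%)
(missing)          104    20.2%
(missing)          102    19.8%
(missing)          100    19.4%
(missing)          13     2.5%
(missing)          13     2.5%
(missing)          10     1.9%
(missing)          8      1.6%
(missing)          7      1.4%
(missing)          6      1.2%
(missing)          5      1.0%
Other values (72)  147    28.5%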

Most occurring categories

Value           Count  Frequency (%)
Other Letter    512    99.4%
Decimal Number  3      0.6%

Most frequent character per category

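Other Letter

Value              Count  Frequency (%)
(missing)          104    20.3%
(missing)          102    19.9%
(missing)          100    19.5%
(missing)          13     2.5%
(missing)          13     2.5%
(missing)          10     2.0%
(missing)          8      1.6%
(missing)          7      1.4%
(missing)          6      1.2%
(missing)          5      1.0%
Other values (70)  144    28.1%

Decimal Number

Value  Count  Frequency (%)
2      2      66.7%
1      1      33.3%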

Most occurring scripts

Value   Count  Frequency (%)
Hangul  512    99.4%
Common  3      0.6%

Most frequent character per script

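Hangul

Value              Count  Frequency (%)
(missing)          104    20.3%
(missing)          102    19.9%
(missing)          100    19.5%
(missing)          13     2.5%
(missing)          13     2.5%
(missing)          10     2.0%
(missing)          8      1.6%
(missing)          7      1.4%
(missing)          6      1.2%
(missing)          5      1.0%
Other values (70)  144    28.1%

Common

Value  Count  Frequency (%)
2      2      66.7%
1      1      33.3%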

Most occurring blocks

Value   Count  Frequency (%)
Hangul  512    99.4%
ASCII   3      0.6%

Most frequent character per block

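Hangul

Value              Count  Frequency (%)
(missing)          104    20.3%
(missing)          102    19.9%
(missing)          100    19.5%
(missing)          13     2.5%
(missing)          13     2.5%
(missing)          10     2.0%
(missing)          8      1.6%
(missing)          7      1.4%
(missing)          6      1.2%
(missing)          5      1.0%
Other values (70)  144    28.1%

ASCII

Value  Count  Frequency (%)
2      2      66.7%
1      1      33.3%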

공급량
Real number (ℝ)

Distinct: 83
Distinct (%): 83.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 183236.31
Minimum: 147
Maximum: 2166500
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 147
5-th percentile: 1086.85
Q1: 4922.5
median: 31630
Q3: 103778
95-th percentile: 909045
Maximum: 2166500
Range: 2166353
Interquartile range (IQR): 98855.5
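
Two of the quantile statistics are derived from the others: the range is the maximum minus the minimum, and the IQR is Q3 minus Q1. A short sketch recomputing both from the reported values (variable names are illustrative):

```python
# Values copied from the quantile statistics above.
minimum, maximum = 147, 2166500
q1, q3 = 4922.5, 103778

# Range: spread between the smallest and largest observation.
value_range = maximum - minimum

# Interquartile range: spread of the middle 50% of observations.
iqr = q3 - q1

print(value_range)
print(iqr)
```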

Descriptive statistics

Standard deviation: 415430.52
Coefficient of variation (CV): 2.2671845
Kurtosis: 13.765339
Mean: 183236.31
Median Absolute Deviation (MAD): 29637
Skewness: 3.6256015
Sum: 18323631
Variance: 1.7258251 × 10^11
Monotonicity: Not monotonic
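
Several of these descriptive statistics can be cross-checked against each other: the CV is the standard deviation divided by the mean, the variance is the standard deviation squared, and the sum is the mean times the number of observations. The sketch below does this from the rounded values reported above, so small rounding differences are expected; the variable names are illustrative.

```python
import math

# Values copied from the statistics above (already rounded in the report).
n = 100
mean = 183236.31
std = 415430.52

cv = std / mean       # coefficient of variation
variance = std ** 2   # variance as the squared standard deviation
total = mean * n      # sum of all observations

print(f"CV       = {cv:.7f}")
print(f"Variance = {variance:.7e}")
print(f"Sum      = {total:.0f}")

# Compare with the reported figures, allowing for rounding.
print(math.isclose(cv, 2.2671845, rel_tol=1e-6))
print(math.isclose(variance, 1.7258251e11, rel_tol=1e-6))
```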
Histogram with fixed size bins (bins=50)

Value              Count  Frequency (%)
362300             4      4.0%
45754              4      4.0%
95770              4      4.0%
2123100            2      2.0%
892100             2      2.0%
63354              2      2.0%
364600             2      2.0%
416471             2      2.0%
1231000            2      2.0%
59288              2      2.0%
Other values (73)  74     74.0%

Minimum 10 values

Value  Count  Frequency (%)
147    1      1.0%
515    1      1.0%
523    1      1.0%
670    1      1.0%
704    1      1.0%
1107   1      1.0%
1244   1      1.0%
1475   1      1.0%
1580   1      1.0%
1665   1      1.0%

Maximum 10 values

Value    Count  Frequency (%)
2166500  1      1.0%
2123100  2      2.0%
1231000  2      2.0%
892100   2      2.0%
467364   1      1.0%
416471   2      2.0%
409630   2      2.0%
405568   1      1.0%
364600   2      2.0%
362300   4      4.0%

Interactions


Correlations

           시설명   공급날짜   공급량
시설명      1.000    0.000     0.075
공급날짜    0.000    1.000     0.000
공급량      0.075    0.000     1.000

           공급량   시설명
공급량      1.000    0.057
시설명      0.057    1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

    시설명     공급날짜        공급량
0   20201201  주산가압장      30592
1   20201201  유정가압장      4056
2   20201201  논산가압장      20169
3   20201201  광명가압장      409630
4   20201201  미금가압장      95770
5   20201201  신림가압장      21872
6   20201201  논산가압장      2579
7   20201201  미금가압장      95770
8   20201201  수분가압장      5028
9   20201201  평촌가압장      7325

Last rows

    시설명     공급날짜        공급량
90  20201202  금가가압장      4400
91  20201202  미금가압장      364600
92  20201202  광명가압장      416471
93  20201202  부곡가압장      18691
94  20201202  월야가압장      8180
95  20201202  용인가압장      44513
96  20201202  미금가압장      364600
97  20201202  상무대가압장    3707
98  20201202  광명가압장      416471
99  20201202  팔도가압장      21036

Duplicate rows

Most frequently occurring

   시설명     공급날짜        공급량    # duplicates
1  20201201  미금가압장      95770    4
2  20201201  미금가압장      362300   4
4  20201201  용인가압장      45754    4
0  20201201  광명가압장      409630   2
3  20201201  사등가압장      59288    2
5  20201201  의정부가압장    63354    2
6  20201201  판교가압장      892100   2
7  20201201  판교가압장      1231000  2
8  20201201  판교가압장      2123100  2
9  20201202  광명가압장      416471   2