Overview

Dataset statistics

Number of variables: 7
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 6
Duplicate rows (%): 6.0%
Total size in memory: 5.9 KiB
Average record size in memory: 60.3 B
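
The percentage and size figures above follow from the raw counts. The sketch below recomputes them from values copied out of the table; the variable names are illustrative, and 1 KiB is taken as 1024 B.

```python
# Values copied from the dataset statistics above.
n_rows = 100
n_duplicates = 6
avg_record_bytes = 60.3

# Share of duplicate rows, as a percentage of all observations.
duplicate_pct = 100 * n_duplicates / n_rows

# Total size implied by the average record size, in KiB (1 KiB = 1024 B).
total_kib = n_rows * avg_record_bytes / 1024

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Total size in memory: {total_kib:.1f} KiB")
```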

Variable types

Categorical: 5
Numeric: 2

Alerts

년도 has constant value "2017" | Constant
시점 has constant value "전반기" | Constant
Dataset has 6 (6.0%) duplicate rows | Duplicates
행정구역(시도).1 is highly overall correlated with 행정구역(시도) | High correlation
행정구역(시도) is highly overall correlated with 행정구역(시도).1 | High correlation
음용 is highly overall correlated with 비음용 | High correlation
비음용 is highly overall correlated with 음용 | High correlation
행정구역(시도) is highly imbalanced (70.1%) | Imbalance
행정구역(시도).1 is highly imbalanced (70.1%) | Imbalance
음용 has 55 (55.0%) zeros | Zeros
비음용 has 20 (20.0%) zeros | Zeros
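
The report does not state how the imbalance percentage is computed. The 70.1% figure is consistent with one minus the normalized Shannon entropy of the value counts, which is the assumption behind the sketch below. The function name `imbalance_score` is illustrative, and the counts come from the 행정구역(시도) frequencies in this report (one value with 85 rows, 15 values with 1 row each).

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the counts
    and k is the number of distinct values. 0 means perfectly balanced;
    values near 1 mean a single value dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        # A single-valued column has no balance to measure; 0.0 is a convention chosen here.
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)

# 행정구역(시도): "전체" appears 85 times, 15 other values appear once each.
counts = [85] + [1] * 15
score = imbalance_score(counts)
print(f"{score:.1%}")  # 70.1%
```

A uniform column scores 0 under this definition, so the score rises as one value crowds out the rest.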

Reproduction

Analysis started: 2023-12-10 10:56:06.108418
Analysis finished: 2023-12-10 10:56:07.889014
Duration: 1.78 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
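
A report of this kind can be regenerated roughly as sketched below. The source file and its options are not given in the report, so `csv_path` and `html_path` are hypothetical, and default profiling settings are assumed rather than the contents of config.json.

```python
def build_report(csv_path, html_path="report.html"):
    """Profile a CSV file and write an HTML report.

    Both paths are hypothetical; the original input file is not named in the report.
    """
    # Imported lazily so this function can be defined without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    report = ProfileReport(df)   # default settings (an assumption)
    report.to_file(html_path)    # writes the HTML report to disk
    return report
```

The `.1` suffix on 행정구역(시도).1 is what `pandas.read_csv` appends to a repeated column name, which suggests the source file has two columns with the same header.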

Variables

년도 (year)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
2017 | 100

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2017
2nd row: 2017
3rd row: 2017
4th row: 2017
5th row: 2017

Common Values

Value | Count | Frequency (%)
2017 | 100 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

시점 (time point)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
전반기 (first half) | 100

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전반기
2nd row: 전반기
3rd row: 전반기
4th row: 전반기
5th row: 전반기

Common Values

Value | Count | Frequency (%)
전반기 | 100 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

행정구역(시도) (administrative district, city/province)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 16
Distinct (%): 16.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
전체 (total) | 85
종로구 | 1
중구 | 1
용산구 | 1
성동구 | 1
Other values (11) | 11

Length

Max length: 4
Median length: 2
Mean length: 2.16
Min length: 2

Unique

Unique: 15
Unique (%): 15.0%

Sample

1st row: 전체
2nd row: 전체
3rd row: 전체
4th row: 전체
5th row: 전체

Common Values

Value | Count | Frequency (%)
전체 | 85 | 85.0%
종로구 | 1 | 1.0%
중구 | 1 | 1.0%
용산구 | 1 | 1.0%
성동구 | 1 | 1.0%
광진구 | 1 | 1.0%
동대문구 | 1 | 1.0%
중랑구 | 1 | 1.0%
성북구 | 1 | 1.0%
강북구 | 1 | 1.0%
Other values (6) | 6 | 6.0%

Length

[Figure: Histogram of lengths of the category]

행정구역(시도).1 (administrative district, city/province)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 16
Distinct (%): 16.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
전체 (total) | 85
종로구 | 1
중구 | 1
용산구 | 1
성동구 | 1
Other values (11) | 11

Length

Max length: 4
Median length: 2
Mean length: 2.16
Min length: 2

Unique

Unique: 15
Unique (%): 15.0%

Sample

1st row: 전체
2nd row: 전체
3rd row: 전체
4th row: 전체
5th row: 전체

Common Values

Value | Count | Frequency (%)
전체 | 85 | 85.0%
종로구 | 1 | 1.0%
중구 | 1 | 1.0%
용산구 | 1 | 1.0%
성동구 | 1 | 1.0%
광진구 | 1 | 1.0%
동대문구 | 1 | 1.0%
중랑구 | 1 | 1.0%
성북구 | 1 | 1.0%
강북구 | 1 | 1.0%
Other values (6) | 6 | 6.0%

Length

[Figure: Histogram of lengths of the category]

구분 (category)
Categorical

Distinct: 5
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
전체 (total) | 32
생활 (domestic) | 17
공업 (industrial) | 17
농업 (agricultural) | 17
기타 (other) | 17

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전체
2nd row: 전체
3rd row: 전체
4th row: 전체
5th row: 전체

Common Values

Value | Count | Frequency (%)
전체 | 32 | 32.0%
생활 | 17 | 17.0%
공업 | 17 | 17.0%
농업 | 17 | 17.0%
기타 | 17 | 17.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

음용 (drinking use)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 22
Distinct (%): 22.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.32
Minimum: 0
Maximum: 102
Zeros: 55
Zeros (%): 55.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 19
95-th percentile: 62.35
Maximum: 102
Range: 102
Interquartile range (IQR): 19

Descriptive statistics

Standard deviation: 24.907863
Coefficient of variation (CV): 1.7393759
Kurtosis: 2.9215576
Mean: 14.32
Median Absolute Deviation (MAD): 0
Skewness: 1.8853039
Sum: 1432
Variance: 620.40162
Monotonicity: Not monotonic
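
Several of the figures above are linked by simple identities: the CV is the standard deviation divided by the mean, the variance is the squared standard deviation, and the range and IQR are differences of quantiles. The sketch below recomputes them from values copied out of the 음용 statistics; the variable names are illustrative. The median and MAD are both 0 because 55% of the values are zero.

```python
# Values copied from the 음용 statistics above.
n = 100
total = 1432
mean = 14.32
std = 24.907863
minimum, maximum = 0, 102
q1, q3 = 0, 19

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range

print(f"mean     = {total / n:.2f}")
print(f"CV       = {cv:.4f}")
print(f"variance = {variance:.2f}")
print(f"range    = {value_range}, IQR = {iqr}")
```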
[Figure: Histogram with fixed size bins (bins=22)]

Common values

Value | Count | Frequency (%)
0 | 55 | 55.0%
1 | 5 | 5.0%
4 | 4 | 4.0%
54 | 4 | 4.0%
18 | 3 | 3.0%
19 | 3 | 3.0%
3 | 2 | 2.0%
33 | 2 | 2.0%
88 | 2 | 2.0%
38 | 2 | 2.0%
Other values (12) | 18 | 18.0%

Minimum 10 values

Value | Count | Frequency (%)
0 | 55 | 55.0%
1 | 5 | 5.0%
2 | 2 | 2.0%
3 | 2 | 2.0%
4 | 4 | 4.0%
9 | 2 | 2.0%
18 | 3 | 3.0%
19 | 3 | 3.0%
20 | 2 | 2.0%
30 | 2 | 2.0%

Maximum 10 values

Value | Count | Frequency (%)
102 | 2 | 2.0%
88 | 2 | 2.0%
69 | 1 | 1.0%
62 | 1 | 1.0%
56 | 1 | 1.0%
55 | 1 | 1.0%
54 | 4 | 4.0%
50 | 1 | 1.0%
48 | 1 | 1.0%
47 | 2 | 2.0%

비음용 (non-drinking use)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 47
Distinct (%): 47.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 27.01
Minimum: 0
Maximum: 167
Zeros: 20
Zeros (%): 20.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1.75
Median: 8
Q3: 38
95-th percentile: 114.05
Maximum: 167
Range: 167
Interquartile range (IQR): 36.25

Descriptive statistics

Standard deviation: 38.220386
Coefficient of variation (CV): 1.4150457
Kurtosis: 2.4176839
Mean: 27.01
Median Absolute Deviation (MAD): 8
Skewness: 1.7478941
Sum: 2701
Variance: 1460.7979
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=47)]

Common values

Value | Count | Frequency (%)
0 | 20 | 20.0%
4 | 10 | 10.0%
1 | 5 | 5.0%
7 | 5 | 5.0%
10 | 4 | 4.0%
8 | 4 | 4.0%
3 | 4 | 4.0%
75 | 2 | 2.0%
28 | 2 | 2.0%
11 | 2 | 2.0%
Other values (37) | 42 | 42.0%

Minimum 10 values

Value | Count | Frequency (%)
0 | 20 | 20.0%
1 | 5 | 5.0%
2 | 1 | 1.0%
3 | 4 | 4.0%
4 | 10 | 10.0%
5 | 2 | 2.0%
6 | 2 | 2.0%
7 | 5 | 5.0%
8 | 4 | 4.0%
9 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
167 | 1 | 1.0%
152 | 1 | 1.0%
118 | 2 | 2.0%
115 | 1 | 1.0%
114 | 1 | 1.0%
111 | 1 | 1.0%
110 | 1 | 1.0%
107 | 1 | 1.0%
97 | 1 | 1.0%
78 | 1 | 1.0%

Interactions

[Figure: interaction plots]

Correlations

[Figure: correlation heatmap]

 | 행정구역(시도) | 행정구역(시도).1 | 구분 | 음용 | 비음용
행정구역(시도) | 1.000 | 1.000 | 0.000 | 0.000 | 0.000
행정구역(시도).1 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000
구분 | 0.000 | 0.000 | 1.000 | 0.397 | 0.471
음용 | 0.000 | 0.000 | 0.397 | 1.000 | 0.872
비음용 | 0.000 | 0.000 | 0.471 | 0.872 | 1.000

[Figure: correlation heatmap]

 | 행정구역(시도).1 | 행정구역(시도) | 구분
행정구역(시도).1 | 1.000 | 1.000 | 0.000
행정구역(시도) | 1.000 | 1.000 | 0.000
구분 | 0.000 | 0.000 | 1.000

[Figure: correlation heatmap]

 | 음용 | 비음용 | 행정구역(시도) | 행정구역(시도).1 | 구분
음용 | 1.000 | 0.746 | 0.000 | 0.000 | 0.236
비음용 | 0.746 | 1.000 | 0.000 | 0.000 | 0.288
행정구역(시도) | 0.000 | 0.000 | 1.000 | 1.000 | 0.000
행정구역(시도).1 | 0.000 | 0.000 | 1.000 | 1.000 | 0.000
구분 | 0.236 | 0.288 | 0.000 | 0.000 | 1.000
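
The extracted text does not say which association measure each matrix uses. For pairs of categorical columns, Cramér's V is one common choice, and it yields 1 for two columns that carry the same values, as 행정구역(시도) and 행정구역(시도).1 appear to do here. The sketch below is an uncorrected Cramér's V written for illustration and is not the report's own implementation. The function name and the sample data are hypothetical.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equal-length categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed counts per (x, y) pair
    row = Counter(xs)              # marginal counts of xs
    col = Counter(ys)              # marginal counts of ys
    if len(row) < 2 or len(col) < 2:
        # V is undefined when a column is constant; 0.0 is a convention chosen here.
        return 0.0
    chi2 = 0.0
    for x in row:
        for y in col:
            expected = row[x] * col[y] / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (min(len(row), len(col)) - 1)))

# Hypothetical data shaped like the report: one dominant value plus 15 singletons.
district = ["전체"] * 85 + [f"district_{i}" for i in range(15)]
district_copy = list(district)
print(round(cramers_v(district, district_copy), 3))  # 1.0
```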

Missing values

[Figure] A simple visualization of nullity by column.

[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

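First rows

Index | 년도 | 시점 | 행정구역(시도) | 행정구역(시도).1 | 구분 | 음용 | 비음용
0 | 2017 | 전반기 | 전체 | 전체 | 전체 | 47 | 75
1 | 2017 | 전반기 | 전체 | 전체 | 전체 | 38 | 152
2 | 2017 | 전반기 | 전체 | 전체 | 전체 | 30 | 38
3 | 2017 | 전반기 | 전체 | 전체 | 전체 | 9 | 40
4 | 2017 | 전반기 | 전체 | 전체 | 전체 | 19 | 30
5 | 2017 | 전반기 | 전체 | 전체 | 전체 | 20 | 38
6 | 2017 | 전반기 | 전체 | 전체 | 전체 | 4 | 6
7 | 2017 | 전반기 | 전체 | 전체 | 전체 | 102 | 167
8 | 2017 | 전반기 | 전체 | 전체 | 전체 | 55 | 78
9 | 2017 | 전반기 | 전체 | 전체 | 전체 | 54 | 63

Last rows

Index | 년도 | 시점 | 행정구역(시도) | 행정구역(시도).1 | 구분 | 음용 | 비음용
90 | 2017 | 전반기 | 중랑구 | 중랑구 | 전체 | 0 | 7
91 | 2017 | 전반기 | 성북구 | 성북구 | 전체 | 0 | 4
92 | 2017 | 전반기 | 강북구 | 강북구 | 전체 | 0 | 4
93 | 2017 | 전반기 | 도봉구 | 도봉구 | 전체 | 1 | 3
94 | 2017 | 전반기 | 노원구 | 노원구 | 전체 | 4 | 3
95 | 2017 | 전반기 | 은평구 | 은평구 | 전체 | 1 | 5
96 | 2017 | 전반기 | 서대문구 | 서대문구 | 전체 | 0 | 4
97 | 2017 | 전반기 | 마포구 | 마포구 | 전체 | 0 | 8
98 | 2017 | 전반기 | 양천구 | 양천구 | 전체 | 3 | 1
99 | 2017 | 전반기 | 전체 | 전체 | 전체 | 18 | 118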

Duplicate rows

Most frequently occurring

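Index | 년도 | 시점 | 행정구역(시도) | 행정구역(시도).1 | 구분 | 음용 | 비음용 | # duplicates
2 | 2017 | 전반기 | 전체 | 전체 | 기타 | 0 | 0 | 17
4 | 2017 | 전반기 | 전체 | 전체 | 농업 | 0 | 1 | 4
0 | 2017 | 전반기 | 전체 | 전체 | 공업 | 0 | 7 | 2
1 | 2017 | 전반기 | 전체 | 전체 | 공업 | 0 | 10 | 2
3 | 2017 | 전반기 | 전체 | 전체 | 농업 | 0 | 0 | 2
5 | 2017 | 전반기 | 전체 | 전체 | 농업 | 0 | 10 | 2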
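
The table lists six distinct rows that repeat, and their occurrence counts sum to 29 (17 + 4 + 2 + 2 + 2 + 2). The "6 duplicate rows" figure in the overview therefore appears to count distinct repeated rows rather than surplus copies. The sketch below shows that distinction on hypothetical rows, not the actual dataset. The function name `duplicate_groups` is illustrative.

```python
from collections import Counter

def duplicate_groups(rows):
    """Return {row: count} for every row that occurs more than once."""
    counts = Counter(rows)
    return {row: c for row, c in counts.items() if c > 1}

# Hypothetical rows of (구분, 음용, 비음용); these are not taken from the dataset.
rows = [("기타", 0, 0)] * 3 + [("농업", 0, 1)] * 2 + [("생활", 4, 6)]
groups = duplicate_groups(rows)

print(len(groups))           # distinct rows that repeat: 2
print(sum(groups.values()))  # rows involved in duplication: 5
```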