Overview

Dataset statistics

Number of variables: 10
Number of observations: 400
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 33.7 KiB
Average record size in memory: 86.3 B

Variable types

DateTime: 1
Categorical: 6
Numeric: 3

Alerts

ETL일시 has constant value "" [Constant]
원천테이블 has constant value "" [Constant]
행정동코드 has constant value "" [Constant]
기준년월일 has constant value "" [Constant]
내국인수 is highly overall correlated with 성별구분코드 and 1 other field [High correlation]
성별구분코드 is highly overall correlated with 내국인수 and 1 other field [High correlation]
연령대구분코드 is highly overall correlated with 내국인수 and 1 other field [High correlation]
단기외국인수 is highly imbalanced (86.6%) [Imbalance]
24시간대구분코드 has 30 (7.5%) zeros [Zeros]
내국인수 has 28 (7.0%) zeros [Zeros]
장기외국인수 has 386 (96.5%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-10 06:18:25.786834
Analysis finished: 2023-12-10 06:18:28.540172
Duration: 2.75 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
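
A report of this kind could be regenerated along the following lines. This is a minimal sketch, assuming the data is available as a CSV file: the helper name `build_profile`, the file paths, and the CSV input format are all assumptions, since the report does not say how the data was loaded. `ProfileReport` and its `to_file` method are the public entry points of ydata-profiling.

```python
from pathlib import Path


def build_profile(csv_path: str, html_path: str, title: str = "Profiling Report") -> Path:
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so that defining the helper needs no third-party packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumed input format; the original data source is not named in the report.
    df = pd.read_csv(csv_path)

    report = ProfileReport(df, title=title)
    out = Path(html_path)
    report.to_file(out)
    return out
```

A call such as `build_profile("population.csv", "report.html")` (both paths hypothetical) would write the HTML file. Timings, memory figures, and type inference may differ from the values above depending on library versions and how the columns are read.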

Variables

ETL일시
Date

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB
Minimum: 2020-02-10 00:14:58
Maximum: 2020-02-10 00:14:58
Histogram with fixed size bins (bins=1)

원천테이블
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value | Count
_ | 400

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: _
2nd row: _
3rd row: _
4th row: _
5th row: _

Common Values

Value | Count | Frequency (%)
_ | 400 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
(blank) | 400 | 100.0%

24시간대구분코드
Real number (ℝ)

ZEROS 

Distinct: 14
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.175
Minimum: 0
Maximum: 13
Zeros: 30
Zeros (%): 7.5%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 3
median: 6
Q3: 9.25
95-th percentile: 12
Maximum: 13
Range: 13
Interquartile range (IQR): 6.25

Descriptive statistics

Standard deviation: 3.857665
Coefficient of variation (CV): 0.62472307
Kurtosis: -1.1937208
Mean: 6.175
Median Absolute Deviation (MAD): 3
Skewness: 0.013625252
Sum: 2470
Variance: 14.881579
Monotonicity: Increasing

Histogram with fixed size bins (bins=14)

Common values

Value | Count | Frequency (%)
0 | 30 | 7.5%
1 | 30 | 7.5%
2 | 30 | 7.5%
3 | 30 | 7.5%
4 | 30 | 7.5%
5 | 30 | 7.5%
6 | 30 | 7.5%
7 | 30 | 7.5%
8 | 30 | 7.5%
9 | 30 | 7.5%
Other values (4) | 100 | 25.0%

Minimum 10 values

Value | Count | Frequency (%)
0 | 30 | 7.5%
1 | 30 | 7.5%
2 | 30 | 7.5%
3 | 30 | 7.5%
4 | 30 | 7.5%
5 | 30 | 7.5%
6 | 30 | 7.5%
7 | 30 | 7.5%
8 | 30 | 7.5%
9 | 30 | 7.5%

Maximum 10 values

Value | Count | Frequency (%)
13 | 10 | 2.5%
12 | 30 | 7.5%
11 | 30 | 7.5%
10 | 30 | 7.5%
9 | 30 | 7.5%
8 | 30 | 7.5%
7 | 30 | 7.5%
6 | 30 | 7.5%
5 | 30 | 7.5%
4 | 30 | 7.5%
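
The summary figures for 24시간대구분코드 can be rechecked from its value counts alone. The sketch below rebuilds the 400 observations from the frequencies listed above (codes 0 to 12 occur 30 times each, code 13 occurs 10 times) and recomputes the statistics with Python's standard `statistics` module. Two assumptions are made here: that the reported variance is the sample variance (n - 1 denominator), and that the quartiles use linear interpolation over the sorted data, which `method="inclusive"` implements. Both choices reproduce the reported values.

```python
import statistics

# Value counts as reported: codes 0..12 occur 30 times each, code 13 occurs 10 times.
counts = {code: 30 for code in range(13)}
counts[13] = 10

# Expand the frequency table back into the individual observations.
values = sorted(v for v, reps in counts.items() for _ in range(reps))

n = len(values)
total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)          # square root of the sample variance
cv = std / mean                         # coefficient of variation

# Quartiles with linear interpolation between order statistics.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")

print(n, total, mean)
print(round(variance, 6), round(std, 6), round(cv, 8))
print(q1, median, q3, q3 - q1)
```

The Q3 value of 9.25 falls between two codes because the 75% position lands a quarter of the way from the last 9 to the first 10 in the sorted data.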

성별구분코드
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value | Count
F | 190
M | 182
- | 28

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: -
2nd row: -
3rd row: F
4th row: F
5th row: F

Common Values

Value | Count | Frequency (%)
F | 190 | 47.5%
M | 182 | 45.5%
- | 28 | 7.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
f | 190 | 47.5%
m | 182 | 45.5%
(blank) | 28 | 7.0%

연령대구분코드
Categorical

HIGH CORRELATION 

Distinct: 14
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value | Count
age_70 | 52
_ | 28
age_10 | 27
age_15 | 27
age_20 | 27
Other values (9) | 239

Length

Max length: 6
Median length: 6
Mean length: 5.65
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: _
2nd row: _
3rd row: age_10
4th row: age_15
5th row: age_20

Common Values

Value | Count | Frequency (%)
age_70 | 52 | 13.0%
_ | 28 | 7.0%
age_10 | 27 | 6.8%
age_15 | 27 | 6.8%
age_20 | 27 | 6.8%
age_25 | 27 | 6.8%
age_30 | 27 | 6.8%
age_35 | 27 | 6.8%
age_40 | 27 | 6.8%
age_45 | 27 | 6.8%
Other values (4) | 104 | 26.0%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
age_70 | 52 | 13.0%
(blank) | 28 | 7.0%
age_10 | 27 | 6.8%
age_15 | 27 | 6.8%
age_20 | 27 | 6.8%
age_25 | 27 | 6.8%
age_30 | 27 | 6.8%
age_35 | 27 | 6.8%
age_40 | 27 | 6.8%
age_45 | 27 | 6.8%
Other values (4) | 104 | 26.0%

행정동코드
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value | Count
11110560 | 400

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 11110560
2nd row: 11110560
3rd row: 11110560
4th row: 11110560
5th row: 11110560

Common Values

Value | Count | Frequency (%)
11110560 | 400 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
11110560 | 400 | 100.0%

내국인수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 142
Distinct (%): 35.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 103.875
Minimum: 0
Maximum: 235
Zeros: 28
Zeros (%): 7.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 68
median: 107
Q3: 142
95-th percentile: 187
Maximum: 235
Range: 235
Interquartile range (IQR): 74

Descriptive statistics

Standard deviation: 52.13059
Coefficient of variation (CV): 0.50185886
Kurtosis: -0.55042537
Mean: 103.875
Median Absolute Deviation (MAD): 37
Skewness: -0.16790319
Sum: 41550
Variance: 2717.5984
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 28 | 7.0%
40 | 12 | 3.0%
160 | 11 | 2.8%
87 | 8 | 2.0%
86 | 7 | 1.8%
68 | 7 | 1.8%
185 | 6 | 1.5%
116 | 6 | 1.5%
41 | 6 | 1.5%
83 | 6 | 1.5%
Other values (132) | 303 | 75.8%

Minimum 10 values

Value | Count | Frequency (%)
0 | 28 | 7.0%
35 | 1 | 0.2%
36 | 5 | 1.2%
37 | 1 | 0.2%
38 | 3 | 0.8%
39 | 4 | 1.0%
40 | 12 | 3.0%
41 | 6 | 1.5%
42 | 4 | 1.0%
43 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
235 | 1 | 0.2%
220 | 1 | 0.2%
209 | 2 | 0.5%
208 | 1 | 0.2%
205 | 1 | 0.2%
204 | 1 | 0.2%
199 | 1 | 0.2%
195 | 2 | 0.5%
194 | 1 | 0.2%
192 | 3 | 0.8%

장기외국인수
Real number (ℝ)

ZEROS 

Distinct: 10
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.905
Minimum: 0
Maximum: 77
Zeros: 386
Zeros (%): 96.5%
Negative: 0
Negative (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 77
Range: 77
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 10.210122
Coefficient of variation (CV): 5.3596441
Kurtosis: 28.551386
Mean: 1.905
Median Absolute Deviation (MAD): 0
Skewness: 5.4018192
Sum: 762
Variance: 104.24659
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=10)

Common values

Value | Count | Frequency (%)
0 | 386 | 96.5%
43 | 3 | 0.8%
44 | 2 | 0.5%
57 | 2 | 0.5%
61 | 2 | 0.5%
49 | 1 | 0.2%
50 | 1 | 0.2%
69 | 1 | 0.2%
64 | 1 | 0.2%
77 | 1 | 0.2%

Minimum 10 values

Value | Count | Frequency (%)
0 | 386 | 96.5%
43 | 3 | 0.8%
44 | 2 | 0.5%
49 | 1 | 0.2%
50 | 1 | 0.2%
57 | 2 | 0.5%
61 | 2 | 0.5%
64 | 1 | 0.2%
69 | 1 | 0.2%
77 | 1 | 0.2%

Maximum 10 values

Value | Count | Frequency (%)
77 | 1 | 0.2%
69 | 1 | 0.2%
64 | 1 | 0.2%
61 | 2 | 0.5%
57 | 2 | 0.5%
50 | 1 | 0.2%
49 | 1 | 0.2%
44 | 2 | 0.5%
43 | 3 | 0.8%
0 | 386 | 96.5%

단기외국인수
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value | Count
0 | 386
8 | 8
7 | 4
9 | 2

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 7
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 386 | 96.5%
8 | 8 | 2.0%
7 | 4 | 1.0%
9 | 2 | 0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 386 | 96.5%
8 | 8 | 2.0%
7 | 4 | 1.0%
9 | 2 | 0.5%
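
The 86.6% imbalance figure reported for 단기외국인수 is consistent with one minus the normalized Shannon entropy of the category counts, that is, 1 - H / log2(k) for k classes. The sketch below recomputes it from the four counts listed above. The function name `imbalance_score` and the handling of the single-class case are assumptions for illustration; this is a reconstruction of the calculation, not a quotation of the library's code.

```python
import math

# Category counts for 단기외국인수 as listed in the Common Values table.
counts = {"0": 386, "8": 8, "7": 4, "9": 2}


def imbalance_score(freqs):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits over k classes."""
    total = sum(freqs)
    k = len(freqs)
    if k < 2:
        # Assumed convention: a single class has no entropy to normalize.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in freqs if c > 0)
    return 1.0 - entropy / math.log2(k)


score = imbalance_score(list(counts.values()))
print(f"{score:.1%}")
```

A score of 0 corresponds to perfectly balanced classes and a score near 1 to a column dominated by one value, which is the case here with 386 of 400 rows equal to 0.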

기준년월일
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.3 KiB

Value | Count
20200201 | 400

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20200201
2nd row: 20200201
3rd row: 20200201
4th row: 20200201
5th row: 20200201

Common Values

Value | Count | Frequency (%)
20200201 | 400 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
20200201 | 400 | 100.0%

Interactions

[Interaction plots: 9 panels, images not preserved]

Correlations

Variable | 24시간대구분코드 | 성별구분코드 | 연령대구분코드 | 내국인수 | 장기외국인수 | 단기외국인수
24시간대구분코드 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
성별구분코드 | 0.000 | 1.000 | 0.832 | 0.819 | 0.800 | 0.487
연령대구분코드 | 0.000 | 0.832 | 1.000 | 0.911 | 0.476 | 0.575
내국인수 | 0.000 | 0.819 | 0.911 | 1.000 | 0.477 | 0.567
장기외국인수 | 0.000 | 0.800 | 0.476 | 0.477 | 1.000 | 0.000
단기외국인수 | 0.000 | 0.487 | 0.575 | 0.567 | 0.000 | 1.000

Variable | 연령대구분코드 | 성별구분코드 | 단기외국인수
연령대구분코드 | 1.000 | 0.686 | 0.359
성별구분코드 | 0.686 | 1.000 | 0.484
단기외국인수 | 0.359 | 0.484 | 1.000

Variable | 24시간대구분코드 | 내국인수 | 장기외국인수 | 성별구분코드 | 연령대구분코드 | 단기외국인수
24시간대구분코드 | 1.000 | 0.054 | 0.019 | 0.000 | 0.000 | 0.000
내국인수 | 0.054 | 1.000 | -0.307 | 0.712 | 0.674 | 0.373
장기외국인수 | 0.019 | -0.307 | 1.000 | 0.479 | 0.254 | 0.000
성별구분코드 | 0.000 | 0.712 | 0.479 | 1.000 | 0.686 | 0.484
연령대구분코드 | 0.000 | 0.674 | 0.254 | 0.686 | 1.000 | 0.359
단기외국인수 | 0.000 | 0.373 | 0.000 | 0.484 | 0.359 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

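First rows

Index | ETL일시 | 원천테이블 | 24시간대구분코드 | 성별구분코드 | 연령대구분코드 | 행정동코드 | 내국인수 | 장기외국인수 | 단기외국인수 | 기준년월일
0 | 2020-02-10 00:14:58.0 | _ | 0 | - | _ | 11110560 | 0 | 43 | 0 | 20200201
1 | 2020-02-10 00:14:58.0 | _ | 0 | - | _ | 11110560 | 0 | 0 | 7 | 20200201
2 | 2020-02-10 00:14:58.0 | _ | 0 | F | age_10 | 11110560 | 35 | 0 | 0 | 20200201
3 | 2020-02-10 00:14:58.0 | _ | 0 | F | age_15 | 11110560 | 74 | 0 | 0 | 20200201
4 | 2020-02-10 00:14:58.0 | _ | 0 | F | age_20 | 11110560 | 137 | 0 | 0 | 20200201
5 | 2020-02-10 00:14:58.0 | _ | 0 | F | age_25 | 11110560 | 98 | 0 | 0 | 20200201
6 | 2020-02-10 00:14:58.0 | _ | 0 | F | age_30 | 11110560 | 87 | 0 | 0 | 20200201
7 | 2020-02-10 00:14:58.0 | _ | 0 | F | age_35 | 11110560 | 124 | 0 | 0 | 20200201
8 | 2020-02-10 00:14:58.0 | _ | 0 | F | age_40 | 11110560 | 121 | 0 | 0 | 20200201
9 | 2020-02-10 00:14:58.0 | _ | 0 | F | age_45 | 11110560 | 158 | 0 | 0 | 20200201

Last rows

Index | ETL일시 | 원천테이블 | 24시간대구분코드 | 성별구분코드 | 연령대구분코드 | 행정동코드 | 내국인수 | 장기외국인수 | 단기외국인수 | 기준년월일
390 | 2020-02-10 00:14:58.0 | _ | 13 | - | _ | 11110560 | 0 | 77 | 0 | 20200201
391 | 2020-02-10 00:14:58.0 | _ | 13 | - | _ | 11110560 | 0 | 0 | 8 | 20200201
392 | 2020-02-10 00:14:58.0 | _ | 13 | F | age_10 | 11110560 | 44 | 0 | 0 | 20200201
393 | 2020-02-10 00:14:58.0 | _ | 13 | F | age_15 | 11110560 | 75 | 0 | 0 | 20200201
394 | 2020-02-10 00:14:58.0 | _ | 13 | F | age_20 | 11110560 | 110 | 0 | 0 | 20200201
395 | 2020-02-10 00:14:58.0 | _ | 13 | F | age_25 | 11110560 | 110 | 0 | 0 | 20200201
396 | 2020-02-10 00:14:58.0 | _ | 13 | F | age_30 | 11110560 | 93 | 0 | 0 | 20200201
397 | 2020-02-10 00:14:58.0 | _ | 13 | F | age_35 | 11110560 | 135 | 0 | 0 | 20200201
398 | 2020-02-10 00:14:58.0 | _ | 13 | F | age_40 | 11110560 | 116 | 0 | 0 | 20200201
399 | 2020-02-10 00:14:58.0 | _ | 13 | F | age_45 | 11110560 | 172 | 0 | 0 | 20200201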