Overview

Dataset statistics

Number of variables: 7
Number of observations: 75
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.5 KiB
Average record size in memory: 60.8 B

Variable types

Numeric: 1
Categorical: 5
Text: 1

Alerts

UPPER_CTGRY_NM has constant value "교육" (Constant)
LWPRT_CTGRY_NM is highly overall correlated with SRCHWRD_NM (High correlation)
SRCHWRD_NM is highly overall correlated with SEQ_NO and 1 other field (High correlation)
SEQ_NO is highly overall correlated with SRCHWRD_NM and 1 other field (High correlation)
ANALS_YM is highly overall correlated with SEQ_NO (High correlation)
SEQ_NO has unique values (Unique)
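
The Constant and Unique alerts can be read off the distinct counts reported for each variable. The following is a minimal sketch of that rule, not the profiler's own implementation: `count_alerts` is a hypothetical helper, and the counts are copied from the Variables section of this report.

```python
# Distinct-value counts as reported per variable (75 observations in total).
N_OBSERVATIONS = 75
DISTINCT_COUNTS = {
    "SEQ_NO": 75,
    "SRCHWRD_NM": 5,
    "UPPER_CTGRY_NM": 1,
    "LWPRT_CTGRY_NM": 3,
    "ALL_KWRD_RANK_CO": 5,
    "ASKWRD_NM": 48,
    "ANALS_YM": 3,
}


def count_alerts(distinct_counts, n_rows):
    """Hypothetical helper: flag columns that are constant or fully unique."""
    alerts = {}
    for name, distinct in distinct_counts.items():
        if distinct == 1:
            alerts[name] = "Constant"      # a single value in every row
        elif distinct == n_rows:
            alerts[name] = "Unique"        # every row has its own value
    return alerts


alerts = count_alerts(DISTINCT_COUNTS, N_OBSERVATIONS)
print(alerts)
```

Under these counts only UPPER_CTGRY_NM and SEQ_NO are flagged, which agrees with the two non-correlation alerts listed above.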

Reproduction

Analysis started: 2023-12-10 09:45:22.114430
Analysis finished: 2023-12-10 09:45:23.427717
Duration: 1.31 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

SEQ_NO
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 75
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 343.33333
Minimum: 16
Maximum: 720
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 807.0 B

Quantile statistics

Minimum: 16
5-th percentile: 19.7
Q1: 159.5
Median: 298
Q3: 516.5
95-th percentile: 716.3
Maximum: 720
Range: 704
Interquartile range (IQR): 357

Descriptive statistics

Standard deviation: 216.29611
Coefficient of variation (CV): 0.62998868
Kurtosis: -1.0936019
Mean: 343.33333
Median Absolute Deviation (MAD): 200
Skewness: 0.055592554
Sum: 25750
Variance: 46784.009
Monotonicity: Strictly increasing
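
Several of the SEQ_NO statistics are derived from others, so they can be cross-checked. The sketch below recomputes the mean, range, IQR, coefficient of variation, and variance from the values printed in this report. It assumes the usual definitions (mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared), and small differences in the last digits are expected because the inputs are already rounded.

```python
# Values copied from the SEQ_NO statistics in this report.
n = 75
total = 25750
minimum, maximum = 16, 720
q1, q3 = 159.5, 516.5
std = 216.29611

mean = total / n               # arithmetic mean
value_range = maximum - minimum
iqr = q3 - q1                  # interquartile range
cv = std / mean                # coefficient of variation
variance = std ** 2            # agrees with the report up to rounding

print(f"mean={mean:.5f} range={value_range} iqr={iqr} "
      f"cv={cv:.8f} variance={variance:.3f}")
```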
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value              Count  Frequency (%)
16                 1      1.3%
479                1      1.3%
516                1      1.3%
500                1      1.3%
499                1      1.3%
498                1      1.3%
497                1      1.3%
496                1      1.3%
480                1      1.3%
478                1      1.3%
Other values (65)  65     86.7%

Minimum 10 values

Value  Count  Frequency (%)
16     1      1.3%
17     1      1.3%
18     1      1.3%
19     1      1.3%
20     1      1.3%
36     1      1.3%
37     1      1.3%
38     1      1.3%
39     1      1.3%
40     1      1.3%

Maximum 10 values

Value  Count  Frequency (%)
720    1      1.3%
719    1      1.3%
718    1      1.3%
717    1      1.3%
716    1      1.3%
660    1      1.3%
659    1      1.3%
658    1      1.3%
657    1      1.3%
656    1      1.3%

SRCHWRD_NM
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 6.7%
Missing: 0
Missing (%): 0.0%
Memory size: 732.0 B

Value     Count
영어학원  15
수학학원  15
교육학원  15
자녀교육  15
국어학원  15

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 영어학원
2nd row: 영어학원
3rd row: 영어학원
4th row: 영어학원
5th row: 영어학원

Common Values

Value     Count  Frequency (%)
영어학원  15     20.0%
수학학원  15     20.0%
교육학원  15     20.0%
자녀교육  15     20.0%
국어학원  15     20.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

UPPER_CTGRY_NM
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 732.0 B

Value  Count
교육   75

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 교육
2nd row: 교육
3rd row: 교육
4th row: 교육
5th row: 교육

Common Values

Value  Count  Frequency (%)
교육   75     100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

LWPRT_CTGRY_NM
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 732.0 B

Value   Count
교과    30
공통    30
외국어  15

Length

Max length: 3
Median length: 2
Mean length: 2.2
Min length: 2
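
The length statistics follow from the category counts: each value contributes its character length once per occurrence. The sketch below reproduces them for LWPRT_CTGRY_NM. It assumes length means the number of Unicode code points, which is what Python's `len()` returns for these precomposed Hangul strings.

```python
import statistics

# Category counts for LWPRT_CTGRY_NM as reported in this section.
counts = {"교과": 30, "공통": 30, "외국어": 15}

# One length per row: repeat each value's length by its count.
lengths = [len(value) for value, count in counts.items() for _ in range(count)]

summary = {
    "max": max(lengths),
    "median": statistics.median(lengths),
    "mean": statistics.mean(lengths),
    "min": min(lengths),
}
print(summary)
```

The mean works out to (30 × 2 + 30 × 2 + 15 × 3) / 75 = 2.2, in line with the reported value.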

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 외국어
2nd row: 외국어
3rd row: 외국어
4th row: 외국어
5th row: 외국어

Common Values

Value   Count  Frequency (%)
교과    30     40.0%
공통    30     40.0%
외국어  15     20.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

ALL_KWRD_RANK_CO
Categorical

Distinct: 5
Distinct (%): 6.7%
Missing: 0
Missing (%): 0.0%
Memory size: 732.0 B

Value  Count
16     15
17     15
18     15
19     15
20     15

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 16
2nd row: 17
3rd row: 18
4th row: 19
5th row: 20

Common Values

Value  Count  Frequency (%)
16     15     20.0%
17     15     20.0%
18     15     20.0%
19     15     20.0%
20     15     20.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

ASKWRD_NM
Text

Distinct: 48
Distinct (%): 64.0%
Missing: 0
Missing (%): 0.0%
Memory size: 732.0 B

Length

Max length: 4
Median length: 2
Mean length: 2.3066667
Min length: 2

Characters and Unicode

Total characters: 173
Distinct characters: 83
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 34
Unique (%): 45.3%

Sample

1st row: 코로나
2nd row: 이제
3rd row: 엄마
4th row: 영어학원
5th row: 파닉스

Value              Count  Frequency (%)
선생님             5      6.7%
학교               5      6.7%
고민               4      5.3%
시작               3      4.0%
엄마               3      4.0%
이제               3      4.0%
유치원             3      4.0%
문제               3      4.0%
수업               2      2.7%
코로나             2      2.7%
Other values (38)  42     56.0%

Most occurring characters

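Value              Count  Frequency (%)
(not shown)        12     6.9%
(not shown)        8      4.6%
(not shown)        6      3.5%
(not shown)        6      3.5%
(not shown)        6      3.5%
(not shown)        5      2.9%
(not shown)        5      2.9%
(not shown)        5      2.9%
(not shown)        4      2.3%
(not shown)        4      2.3%
Other values (73)  112    64.7%

Most occurring categories

Value         Count  Frequency (%)
Other Letter  173    100.0%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
(not shown)        12     6.9%
(not shown)        8      4.6%
(not shown)        6      3.5%
(not shown)        6      3.5%
(not shown)        6      3.5%
(not shown)        5      2.9%
(not shown)        5      2.9%
(not shown)        5      2.9%
(not shown)        4      2.3%
(not shown)        4      2.3%
Other values (73)  112    64.7%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  173    100.0%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
(not shown)        12     6.9%
(not shown)        8      4.6%
(not shown)        6      3.5%
(not shown)        6      3.5%
(not shown)        6      3.5%
(not shown)        5      2.9%
(not shown)        5      2.9%
(not shown)        5      2.9%
(not shown)        4      2.3%
(not shown)        4      2.3%
Other values (73)  112    64.7%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  173    100.0%

Most frequent character per block

Hangul
Value              Count  Frequency (%)
(not shown)        12     6.9%
(not shown)        8      4.6%
(not shown)        6      3.5%
(not shown)        6      3.5%
(not shown)        6      3.5%
(not shown)        5      2.9%
(not shown)        5      2.9%
(not shown)        5      2.9%
(not shown)        4      2.3%
(not shown)        4      2.3%
Other values (73)  112    64.7%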

ANALS_YM
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 732.0 B

Value   Count
202101  25
202102  25
202103  25

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 202101
2nd row: 202101
3rd row: 202101
4th row: 202101
5th row: 202101

Common Values

Value   Count  Frequency (%)
202101  25     33.3%
202102  25     33.3%
202103  25     33.3%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

Interactions

[Figure: interactions plot]

Correlations

                  SEQ_NO  SRCHWRD_NM  LWPRT_CTGRY_NM  ALL_KWRD_RANK_CO  ASKWRD_NM  ANALS_YM
SEQ_NO            1.000   0.743       0.722           0.000             0.805      0.986
SRCHWRD_NM        0.743   1.000       1.000           0.000             0.521      0.000
LWPRT_CTGRY_NM    0.722   1.000       1.000           0.000             0.166      0.000
ALL_KWRD_RANK_CO  0.000   0.000       0.000           1.000             0.000      0.000
ASKWRD_NM         0.805   0.521       0.166           0.000             1.000      0.000
ANALS_YM          0.986   0.000       0.000           0.000             0.000      1.000

                  ANALS_YM  LWPRT_CTGRY_NM  ALL_KWRD_RANK_CO  SRCHWRD_NM
ANALS_YM          1.000     0.000           0.000             0.000
LWPRT_CTGRY_NM    0.000     1.000           0.000             0.986
ALL_KWRD_RANK_CO  0.000     0.000           1.000             0.000
SRCHWRD_NM        0.000     0.986           0.000             1.000

                  SEQ_NO  SRCHWRD_NM  LWPRT_CTGRY_NM  ALL_KWRD_RANK_CO  ANALS_YM
SEQ_NO            1.000   0.534       0.408           0.000             0.818
SRCHWRD_NM        0.534   1.000       0.986           0.000             0.000
LWPRT_CTGRY_NM    0.408   0.986       1.000           0.000             0.000
ALL_KWRD_RANK_CO  0.000   0.000       0.000           1.000             0.000
ANALS_YM          0.818   0.000       0.000           0.000             1.000
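
The 0.986 reported between SRCHWRD_NM and LWPRT_CTGRY_NM is consistent with a bias-corrected Cramér's V on two categorical columns where one fully determines the other. The sketch below illustrates that calculation with the standard library only. Whether the report used exactly this formula is an assumption. The word-to-category mapping is also partly assumed: four pairs appear in the Sample rows, and the pair for 교육학원 is inferred from the category counts (교과 30, 공통 30, 외국어 15). `cramers_v_corrected` is a hypothetical helper name.

```python
import math
from collections import Counter

# Assumed mapping from search word to lower category.
WORD_TO_CATEGORY = {
    "영어학원": "외국어",
    "수학학원": "교과",
    "국어학원": "교과",
    "자녀교육": "공통",
    "교육학원": "공통",  # assumption: inferred from counts, not shown in the sample rows
}
ROWS_PER_WORD = 15  # each search word occurs 15 times in the report

pairs = [(w, c) for w, c in WORD_TO_CATEGORY.items() for _ in range(ROWS_PER_WORD)]


def cramers_v_corrected(pairs):
    """Bias-corrected Cramér's V for a list of (x, y) category pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    row_totals = Counter(x for x, _ in pairs)
    col_totals = Counter(y for _, y in pairs)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, row_total in row_totals.items():
        for y, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Small-sample bias correction of phi^2 and of the table dimensions.
    phi2_corr = max(0.0, phi2 - (r - 1) * (k - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(r_corr - 1, k_corr - 1))


v = cramers_v_corrected(pairs)
print(round(v, 3))  # 0.986
```

Without the correction the same table gives V = 1, so the gap between 1.000 and 0.986 across the matrices above may simply reflect which variant each one uses.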

Missing values

[Figure: nullity count by column]
A simple visualization of nullity by column.
[Figure: nullity matrix]
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

    SEQ_NO  SRCHWRD_NM  UPPER_CTGRY_NM  LWPRT_CTGRY_NM  ALL_KWRD_RANK_CO  ASKWRD_NM  ANALS_YM
0   16      영어학원    교육            외국어          16                코로나     202101
1   17      영어학원    교육            외국어          17                이제       202101
2   18      영어학원    교육            외국어          18                엄마       202101
3   19      영어학원    교육            외국어          19                영어학원   202101
4   20      영어학원    교육            외국어          20                파닉스     202101
5   36      수학학원    교육            교과            16                유치원     202101
6   37      수학학원    교육            교과            17                선생님     202101
7   38      수학학원    교육            교과            18                고민       202101
8   39      수학학원    교육            교과            19                한글       202101
9   40      수학학원    교육            교과            20                그냥       202101

Last rows

    SEQ_NO  SRCHWRD_NM  UPPER_CTGRY_NM  LWPRT_CTGRY_NM  ALL_KWRD_RANK_CO  ASKWRD_NM  ANALS_YM
65  656     자녀교육    교육            공통            16                선생님     202103
66  657     자녀교육    교육            공통            17                가정       202103
67  658     자녀교육    교육            공통            18                학교       202103
68  659     자녀교육    교육            공통            19                자녀교육   202103
69  660     자녀교육    교육            공통            20                이상       202103
70  716     국어학원    교육            교과            16                코딩       202103
71  717     국어학원    교육            교과            17                스마트     202103
72  718     국어학원    교육            교과            18                독서       202103
73  719     국어학원    교육            교과            19                엄마       202103
74  720     국어학원    교육            교과            20                태권도     202103