Overview

Dataset statistics

Number of variables: 7
Number of observations: 90
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.3 KiB
Average record size in memory: 60.5 B

Variable types

Numeric: 2
Categorical: 4
Text: 1

Alerts

UPPER_CTGRY_NM has constant value "" (Constant)
LWPRT_CTGRY_NM has constant value "" (Constant)
SEQ_NO is highly overall correlated with ANALS_YM and 1 other field (High correlation)
ANALS_YM is highly overall correlated with SEQ_NO (High correlation)
SRCHWRD_NM is highly overall correlated with SEQ_NO (High correlation)
SEQ_NO has unique values (Unique)

Reproduction

Analysis started: 2024-04-17 13:30:21.886963
Analysis finished: 2024-04-17 13:30:22.794353
Duration: 0.91 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

SEQ_NO
Real number (ℝ)

Alerts: High correlation, Unique

Distinct: 90
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9009.0556
Minimum: 2506
Maximum: 14212
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 942.0 B

Quantile statistics

Minimum: 2506
5-th percentile: 3247.1
Q1: 8213.25
Median: 8563
Q3: 10691.75
95-th percentile: 13463.25
Maximum: 14212
Range: 11706
Interquartile range (IQR): 2478.5

Descriptive statistics

Standard deviation: 2677.7181
Coefficient of variation (CV): 0.29722517
Kurtosis: 0.75461479
Mean: 9009.0556
Median Absolute Deviation (MAD): 1565
Skewness: -0.53384585
Sum: 810815
Variance: 7170174
Monotonicity: Not monotonic
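
Several of the SEQ_NO figures above are derived from others in the same summary, so they can be cross-checked with plain arithmetic. The sketch below is only a consistency check on the rounded values printed in this report, not the profiler's own computation; the variable names are illustrative.

```python
# Values copied from the SEQ_NO summary in this report (already rounded there).
n = 90                      # number of observations
total = 810815              # Sum
minimum, maximum = 2506, 14212
q1, q3 = 8213.25, 10691.75
std = 2677.7181             # Standard deviation

mean = total / n            # Mean = Sum / number of observations
value_range = maximum - minimum
iqr = q3 - q1               # Interquartile range = Q3 - Q1
cv = std / mean             # Coefficient of variation = std / mean
variance = std ** 2         # Variance = std squared (up to rounding of std)

print(f"mean={mean:.4f} range={value_range} iqr={iqr} cv={cv:.8f} variance={variance:.0f}")
```

Small differences in the last digits are expected because the inputs are the rounded values shown in the report.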
Histogram with fixed size bins (bins=50)

Common Values

Value              Count  Frequency (%)
2506               1      1.1%
8254               1      1.1%
8252               1      1.1%
8251               1      1.1%
10714              1      1.1%
10713              1      1.1%
10712              1      1.1%
10711              1      1.1%
10710              1      1.1%
8575               1      1.1%
Other values (80)  80     88.9%

Minimum 10 values

Value  Count  Frequency (%)
2506   1      1.1%
2507   1      1.1%
2508   1      1.1%
2509   1      1.1%
2510   1      1.1%
4148   1      1.1%
4149   1      1.1%
4150   1      1.1%
4151   1      1.1%
4152   1      1.1%

Maximum 10 values

Value  Count  Frequency (%)
14212  1      1.1%
14211  1      1.1%
14210  1      1.1%
14209  1      1.1%
14208  1      1.1%
12553  1      1.1%
12552  1      1.1%
12551  1      1.1%
12550  1      1.1%
12549  1      1.1%

SRCHWRD_NM
Categorical

Alerts: High correlation

Distinct: 3
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 852.0 B

Category counts: 뮤지컬귀환 (30), 뮤지컬렌트 (30), 뮤지컬리지 (30)

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 뮤지컬귀환
2nd row: 뮤지컬귀환
3rd row: 뮤지컬귀환
4th row: 뮤지컬귀환
5th row: 뮤지컬귀환

Common Values

Value  Count  Frequency (%)
뮤지컬귀환  30  33.3%
뮤지컬렌트  30  33.3%
뮤지컬리지  30  33.3%

Length

Histogram of lengths of the category

UPPER_CTGRY_NM
Categorical

Alerts: Constant

Distinct: 1
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 852.0 B

Category counts: 문화공연 (90)

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 문화공연
2nd row: 문화공연
3rd row: 문화공연
4th row: 문화공연
5th row: 문화공연

Common Values

Value  Count  Frequency (%)
문화공연  90  100.0%

Length

Histogram of lengths of the category

LWPRT_CTGRY_NM
Categorical

Alerts: Constant

Distinct: 1
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 852.0 B

Category counts: 뮤지컬 (90)

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 뮤지컬
2nd row: 뮤지컬
3rd row: 뮤지컬
4th row: 뮤지컬
5th row: 뮤지컬

Common Values

Value  Count  Frequency (%)
뮤지컬  90  100.0%

Length

Histogram of lengths of the category

ALL_KWRD_RANK_CO
Categorical

Distinct: 5
Distinct (%): 5.6%
Missing: 0
Missing (%): 0.0%
Memory size: 852.0 B

Category counts: 16 (18), 17 (18), 18 (18), 19 (18), 20 (18)

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 16
2nd row: 17
3rd row: 18
4th row: 19
5th row: 20

Common Values

Value  Count  Frequency (%)
16     18     20.0%
17     18     20.0%
18     18     20.0%
19     18     20.0%
20     18     20.0%

Length

Histogram of lengths of the category

ASKWRD_NM
Text

Distinct: 73
Distinct (%): 81.1%
Missing: 0
Missing (%): 0.0%
Memory size: 852.0 B

Length

Max length: 4
Median length: 2
Mean length: 2.3333333
Min length: 2

Characters and Unicode

Total characters: 210
Distinct characters: 122
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 60
Unique (%): 66.7%

Sample

1st row: 미팅
2nd row: 경수
3rd row: 경력
4th row: 미스터
5th row: 제이미

Common Values

Value  Count  Frequency (%)
시간  4  4.4%
노래  4  4.4%
정다희  2  2.2%
조앤  2  2.2%
연극  2  2.2%
이성열  2  2.2%
프로그램  2  2.2%
좌석  2  2.2%
콘서트  2  2.2%
출처  2  2.2%
Other values (63)  66  73.3%

Most occurring characters

Value               Count  Frequency (%)
                    7      3.3%
                    5      2.4%
                    5      2.4%
                    5      2.4%
                    4      1.9%
                    4      1.9%
                    4      1.9%
                    4      1.9%
                    4      1.9%
                    3      1.4%
Other values (112)  165    78.6%

Most occurring categories

Value         Count  Frequency (%)
Other Letter  210    100.0%

Most frequent character per category

Other Letter: same counts as under "Most occurring characters", since all 210 characters fall in this category.

Most occurring scripts

Value   Count  Frequency (%)
Hangul  210    100.0%

Most frequent character per script

Hangul: same counts as under "Most occurring characters", since all 210 characters are in this script.

Most occurring blocks

Value   Count  Frequency (%)
Hangul  210    100.0%

Most frequent character per block

Hangul: same counts as under "Most occurring characters", since all 210 characters are in this block.

ANALS_YM
Real number (ℝ)

Alerts: High correlation

Distinct: 6
Distinct (%): 6.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 202009.5
Minimum: 202007
Maximum: 202012
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 942.0 B

Quantile statistics

Minimum: 202007
5-th percentile: 202007
Q1: 202008
Median: 202009.5
Q3: 202011
95-th percentile: 202012
Maximum: 202012
Range: 5
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.7173929
Coefficient of variation (CV): 8.501545 × 10⁻⁶
Kurtosis: -1.2722257
Mean: 202009.5
Median Absolute Deviation (MAD): 1.5
Skewness: 0
Sum: 18180855
Variance: 2.9494382
Monotonicity: Increasing
Histogram with fixed size bins (bins=6)

Common Values

Value   Count  Frequency (%)
202007  15     16.7%
202008  15     16.7%
202009  15     16.7%
202010  15     16.7%
202011  15     16.7%
202012  15     16.7%
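
Because ANALS_YM has only six distinct values with 15 rows each, the full column can be rebuilt from the frequency table and its summary statistics recomputed. The following is a minimal sketch using only the Python standard library. It assumes the report uses the sample (n - 1) variance, which is consistent with the reported 2.9494382; the variable names are illustrative.

```python
import statistics

# Frequency table for ANALS_YM as shown in this report: value -> count.
counts = {202007: 15, 202008: 15, 202009: 15, 202010: 15, 202011: 15, 202012: 15}

# Expand the table back into the 90 individual observations.
values = [ym for ym, count in counts.items() for _ in range(count)]

mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)   # sample variance, n - 1 denominator
std = statistics.stdev(values)           # square root of the sample variance
mad = statistics.median([abs(v - median) for v in values])  # median absolute deviation
cv = std / mean                          # coefficient of variation

print(len(values), sum(values), mean, median, variance, std, mad, cv)
```

The very small CV reflects that a YYYYMM code is being treated as a plain number: the spread of five units is tiny next to a mean near 202,009.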

Interactions

Interaction plots (4 figures)

Correlations

                  SEQ_NO  SRCHWRD_NM  ALL_KWRD_RANK_CO  ASKWRD_NM  ANALS_YM
SEQ_NO            1.000   1.000       0.000             0.000      0.583
SRCHWRD_NM        1.000   1.000       0.000             0.808      0.000
ALL_KWRD_RANK_CO  0.000   0.000       1.000             0.000      0.000
ASKWRD_NM         0.000   0.808       0.000             1.000      0.399
ANALS_YM          0.583   0.000       0.000             0.399      1.000

                  SRCHWRD_NM  ALL_KWRD_RANK_CO
SRCHWRD_NM        1.000       0.000
ALL_KWRD_RANK_CO  0.000       1.000

                  SEQ_NO  ANALS_YM  SRCHWRD_NM  ALL_KWRD_RANK_CO
SEQ_NO            1.000   0.798     0.971       0.000
ANALS_YM          0.798   1.000     0.000       0.000
SRCHWRD_NM        0.971   0.000     1.000       0.000
ALL_KWRD_RANK_CO  0.000   0.000     0.000       1.000
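
The second matrix above covers only the two categorical variables SRCHWRD_NM and ALL_KWRD_RANK_CO and reports 0.000 between them. One common association measure for such pairs is Cramér's V. Whether the report used exactly this statistic, or a bias-corrected variant, is an assumption here. The sketch below computes the plain, uncorrected form for a contingency table. The balanced table (6 rows per search word and rank) is also an assumption, inferred from the counts of 30 per search word and 18 per rank and from the sample rows.

```python
import math

def cramers_v(table):
    """Plain (uncorrected) Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of the two variables.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi2 / (n * k))

# Assumed layout: 3 search words x 5 ranks, 6 observations in every cell.
balanced = [[6] * 5 for _ in range(3)]
v_balanced = cramers_v(balanced)

# Contrast case (hypothetical data): each row category maps to exactly one column.
perfect = [[30, 0, 0], [0, 30, 0], [0, 0, 30]]
v_perfect = cramers_v(perfect)

print(v_balanced, v_perfect)
```

With every cell equal to its expected count the chi-squared statistic is zero, which matches the 0.000 entry; the contrast table shows the other extreme of the scale.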

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index  SEQ_NO  SRCHWRD_NM  UPPER_CTGRY_NM  LWPRT_CTGRY_NM  ALL_KWRD_RANK_CO  ASKWRD_NM  ANALS_YM
0      2506    뮤지컬귀환  문화공연  뮤지컬  16  미팅    202007
1      2507    뮤지컬귀환  문화공연  뮤지컬  17  경수    202007
2      2508    뮤지컬귀환  문화공연  뮤지컬  18  경력    202007
3      2509    뮤지컬귀환  문화공연  뮤지컬  19  미스터  202007
4      2510    뮤지컬귀환  문화공연  뮤지컬  20  제이미  202007
5      7520    뮤지컬렌트  문화공연  뮤지컬  16  영화    202007
6      7521    뮤지컬렌트  문화공연  뮤지컬  17  두훈    202007
7      7522    뮤지컬렌트  문화공연  뮤지컬  18  주년    202007
8      7523    뮤지컬렌트  문화공연  뮤지컬  19  정다희  202007
9      7524    뮤지컬렌트  문화공연  뮤지컬  20  시간    202007

Last rows

Index  SEQ_NO  SRCHWRD_NM  UPPER_CTGRY_NM  LWPRT_CTGRY_NM  ALL_KWRD_RANK_CO  ASKWRD_NM  ANALS_YM
80     12549   뮤지컬렌트  문화공연  뮤지컬  16  레아    202012
81     12550   뮤지컬렌트  문화공연  뮤지컬  17  유진선  202012
82     12551   뮤지컬렌트  문화공연  뮤지컬  18  노래    202012
83     12552   뮤지컬렌트  문화공연  뮤지컬  19  시간    202012
84     12553   뮤지컬렌트  문화공연  뮤지컬  20  엔젤    202012
85     14208   뮤지컬리지  문화공연  뮤지컬  16  펀홈    202012
86     14209   뮤지컬리지  문화공연  뮤지컬  17  가격    202012
87     14210   뮤지컬리지  문화공연  뮤지컬  18  도끼    202012
88     14211   뮤지컬리지  문화공연  뮤지컬  19  그냥    202012
89     14212   뮤지컬리지  문화공연  뮤지컬  20  하나    202012