Overview

Dataset statistics

Number of variables: 6
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.0 KiB
Average record size in memory: 51.3 B

Variable types

Numeric: 1
Categorical: 3
Text: 2

Alerts

FILE_NAME has constant value "" (Constant)
BASE_YMD has constant value "" (Constant)
NO is highly overall correlated with Entry_NM (High correlation)
Entry_NM is highly overall correlated with NO (High correlation)
NO has unique values (Unique)
Association_Korean_NM has unique values (Unique)
Association_English_NM has unique values (Unique)
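
As a rough illustration of what the Constant and Unique alerts above mean, the sketch below checks three rows copied from the Sample section by hand. The `rows` list and the `classify_columns` helper are hypothetical names introduced for this example; the alert logic inside ydata-profiling is not shown in this report and may differ.

```python
# Hypothetical helper: label a column "Constant" when every value is identical
# and "Unique" when every value is different, mirroring the alert names above.
rows = [
    {"NO": 1, "Entry_NM": "방탄소년단", "Association_English_NM": "RM",
     "FILE_NAME": "KC_DICTIONARY_ASS_INFO_2019", "BASE_YMD": "2019"},
    {"NO": 3, "Entry_NM": "방탄소년단", "Association_English_NM": "JIMIN",
     "FILE_NAME": "KC_DICTIONARY_ASS_INFO_2019", "BASE_YMD": "2019"},
    {"NO": 8, "Entry_NM": "블랙핑크", "Association_English_NM": "JISOO",
     "FILE_NAME": "KC_DICTIONARY_ASS_INFO_2019", "BASE_YMD": "2019"},
]


def classify_columns(rows):
    """Return {column: alert} for columns that are constant or fully unique."""
    alerts = {}
    for col in rows[0]:
        values = [row[col] for row in rows]
        distinct = len(set(values))
        if distinct == 1:
            alerts[col] = "Constant"
        elif distinct == len(values):
            alerts[col] = "Unique"
    return alerts


alerts = classify_columns(rows)
for col, label in alerts.items():
    print(f"{col}: {label}")
```

On these three rows Entry_NM gets no label, because it repeats a value without being constant.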

Reproduction

Analysis started: 2023-12-10 09:57:11.714323
Analysis finished: 2023-12-10 09:57:12.619185
Duration: 0.9 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
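
A report of this kind is typically produced with a few lines of Python. The sketch below is an assumption about how that might look, not the configuration actually used here: the input file is not named in this report, so both paths are placeholders, and `build_profile` is a hypothetical helper name.

```python
def build_profile(csv_path: str, html_path: str) -> None:
    """Profile a CSV file and write the result as an HTML report."""
    # Imported inside the function so the sketch loads even when the
    # optional pandas / ydata-profiling packages are not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    report = ProfileReport(df, title="Profiling report")
    report.to_file(html_path)


# Example call (both paths are placeholders):
# build_profile("input.csv", "report.html")
```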

Variables

NO
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 50.5
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5.95
Q1: 25.75
Median: 50.5
Q3: 75.25
95-th percentile: 95.05
Maximum: 100
Range: 99
Interquartile range (IQR): 49.5

Descriptive statistics

Standard deviation: 29.011492
Coefficient of variation (CV): 0.57448499
Kurtosis: -1.2
Mean: 50.5
Median Absolute Deviation (MAD): 25
Skewness: 0
Sum: 5050
Variance: 841.66667
Monotonicity: Strictly increasing
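
The figures above are consistent with NO being the integers 1 to 100. The sketch below recomputes several of them with the Python standard library under that assumption. The variance uses the sample (n - 1) form and the percentiles use linear interpolation, which is what the reported values imply.

```python
import statistics

# Assumption: NO holds the integers 1..100, as the statistics above suggest.
values = [float(n) for n in range(1, 101)]

total = sum(values)
mean = statistics.fmean(values)
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)
cv = std_dev / mean                      # coefficient of variation

median = statistics.median(values)
mad = statistics.median([abs(v - median) for v in values])

# "inclusive" interpolates linearly between the minimum and the maximum.
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
twentiles = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = twentiles[0], twentiles[-1]
iqr = q3 - q1

strictly_increasing = all(a < b for a, b in zip(values, values[1:]))

print(f"mean={mean} variance={variance:.5f} std={std_dev:.6f} cv={cv:.8f}")
print(f"p5={p5} q1={q1} median={median} q3={q3} p95={p95} iqr={iqr} mad={mad}")
print(f"sum={total} strictly_increasing={strictly_increasing}")
```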
Histogram with fixed size bins (bins=50)
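
To make "fixed size bins (bins=50)" concrete, here is a minimal sketch of equal-width binning. It assumes NO holds the integers 1 to 100 and that the bins span the minimum to the maximum with the maximum counted in the last bin. The plotting code behind the report may draw its bin edges differently.

```python
# Assumption: NO holds the integers 1..100; 50 equal-width bins over [min, max].
values = list(range(1, 101))
bins = 50
lo, hi = min(values), max(values)
width = (hi - lo) / bins  # 1.98 for this data

counts = [0] * bins
for v in values:
    # Clamp so that the maximum value falls into the last bin.
    index = min(int((v - lo) / width), bins - 1)
    counts[index] += 1

print(counts)
```

Under these assumptions every bin holds exactly two values, so the histogram is flat.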

Common Values

Value | Count | Frequency (%)
1 | 1 | 1.0%
65 | 1 | 1.0%
75 | 1 | 1.0%
74 | 1 | 1.0%
73 | 1 | 1.0%
72 | 1 | 1.0%
71 | 1 | 1.0%
70 | 1 | 1.0%
69 | 1 | 1.0%
68 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | 1.0%
2 | 1 | 1.0%
3 | 1 | 1.0%
4 | 1 | 1.0%
5 | 1 | 1.0%
6 | 1 | 1.0%
7 | 1 | 1.0%
8 | 1 | 1.0%
9 | 1 | 1.0%
10 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
100 | 1 | 1.0%
99 | 1 | 1.0%
98 | 1 | 1.0%
97 | 1 | 1.0%
96 | 1 | 1.0%
95 | 1 | 1.0%
94 | 1 | 1.0%
93 | 1 | 1.0%
92 | 1 | 1.0%
91 | 1 | 1.0%

Entry_NM
Categorical

HIGH CORRELATION 

Distinct: 13
Distinct (%): 13.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
세븐틴 | 13
엑스원 | 11
슈퍼주니어 | 10
트와이스 | 9
스트레이키즈 | 8
Other values (8) | 49

Length

Max length: 6
Median length: 5
Mean length: 3.98
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 방탄소년단
2nd row: 방탄소년단
3rd row: 방탄소년단
4th row: 방탄소년단
5th row: 방탄소년단

Common Values

Value | Count | Frequency (%)
세븐틴 | 13 | 13.0%
엑스원 | 11 | 11.0%
슈퍼주니어 | 10 | 10.0%
트와이스 | 9 | 9.0%
스트레이키즈 | 8 | 8.0%
에이티즈 | 8 | 8.0%
방탄소년단 | 7 | 7.0%
갓세븐 | 7 | 7.0%
드림캐쳐 | 7 | 7.0%
에버글로우 | 6 | 6.0%
Other values (3) | 14 | 14.0%

Length

Histogram of lengths of the category

Association_Korean_NM
Text

UNIQUE

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 2
Mean length: 2.15
Min length: 1

Characters and Unicode

Total characters: 215
Distinct characters: 105
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
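
As a small illustration of those character properties, the sketch below looks up the general category of each character in a few names from this variable, using Python's standard `unicodedata` module. The standard library exposes no script property, so the check on the character name is only a stand-in used here for illustration.

```python
import unicodedata
from collections import Counter

# A few values of this variable, as they appear in the report.
names = ["알엠", "지민", "슈가", "제이홉"]

category_counts = Counter()
all_hangul_syllables = True
for name in names:
    # NFC keeps each Hangul syllable as one precomposed code point.
    for ch in unicodedata.normalize("NFC", name):
        category_counts[unicodedata.category(ch)] += 1
        # Stand-in for a script check: inspect the official character name.
        if not unicodedata.name(ch).startswith("HANGUL SYLLABLE"):
            all_hangul_syllables = False

print(dict(category_counts), all_hangul_syllables)
```

The code `Lo` is the Unicode abbreviation for the "Other Letter" category listed in the tables of this section.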

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: 알엠
2nd row:
3rd row: 지민
4th row: 슈가
5th row: 제이홉

Value | Count | Frequency (%)
알엠 | 1 | 1.0%
민규 | 1 | 1.0%
가현 | 1 | 1.0%
다미 | 1 | 1.0%
유현 | 1 | 1.0%
한동 | 1 | 1.0%
시연 | 1 | 1.0%
수아 | 1 | 1.0%
지유 | 1 | 1.0%
디노 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Most occurring characters

Value | Count | Frequency (%)
 | 10 | 4.7%
 | 7 | 3.3%
 | 7 | 3.3%
 | 6 | 2.8%
 | 6 | 2.8%
 | 6 | 2.8%
 | 5 | 2.3%
 | 5 | 2.3%
 | 4 | 1.9%
 | 4 | 1.9%
Other values (95) | 155 | 72.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 215 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 10 | 4.7%
 | 7 | 3.3%
 | 7 | 3.3%
 | 6 | 2.8%
 | 6 | 2.8%
 | 6 | 2.8%
 | 5 | 2.3%
 | 5 | 2.3%
 | 4 | 1.9%
 | 4 | 1.9%
Other values (95) | 155 | 72.1%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 215 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 10 | 4.7%
 | 7 | 3.3%
 | 7 | 3.3%
 | 6 | 2.8%
 | 6 | 2.8%
 | 6 | 2.8%
 | 5 | 2.3%
 | 5 | 2.3%
 | 4 | 1.9%
 | 4 | 1.9%
Other values (95) | 155 | 72.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 215 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 10 | 4.7%
 | 7 | 3.3%
 | 7 | 3.3%
 | 6 | 2.8%
 | 6 | 2.8%
 | 6 | 2.8%
 | 5 | 2.3%
 | 5 | 2.3%
 | 4 | 1.9%
 | 4 | 1.9%
Other values (95) | 155 | 72.1%

Association_English_NM
Text

UNIQUE

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 15
Median length: 12
Mean length: 6.38
Min length: 1

Characters and Unicode

Total characters: 638
Distinct characters: 51
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: RM
2nd row: Jin
3rd row: JIMIN
4th row: SUGA
5th row: j-hope

Value | Count | Frequency (%)
lee | 3 | 2.5%
han | 2 | 1.7%
kim | 2 | 1.7%
rm | 1 | 0.8%
mingyu | 1 | 0.8%
chaeyoung | 1 | 0.8%
gahyeon | 1 | 0.8%
dami | 1 | 0.8%
yoohyeon | 1 | 0.8%
handong | 1 | 0.8%
Other values (104) | 104 | 88.1%

Most occurring characters

Value | Count | Frequency (%)
N | 57 | 8.9%
O | 46 | 7.2%
E | 33 | 5.2%
I | 31 | 4.9%
A | 30 | 4.7%
n | 30 | 4.7%
Y | 28 | 4.4%
S | 27 | 4.2%
H | 26 | 4.1%
o | 26 | 4.1%
Other values (41) | 304 | 47.6%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 419 | 65.7%
Lowercase Letter | 185 | 29.0%
Space Separator | 18 | 2.8%
Dash Punctuation | 12 | 1.9%
Other Punctuation | 3 | 0.5%
Decimal Number | 1 | 0.2%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
N | 57 | 13.6%
O | 46 | 11.0%
E | 33 | 7.9%
I | 31 | 7.4%
A | 30 | 7.2%
Y | 28 | 6.7%
S | 27 | 6.4%
H | 26 | 6.2%
G | 21 | 5.0%
U | 21 | 5.0%
Other values (15) | 99 | 23.6%

Lowercase Letter
Value | Count | Frequency (%)
n | 30 | 16.2%
o | 26 | 14.1%
e | 21 | 11.4%
u | 18 | 9.7%
g | 15 | 8.1%
a | 12 | 6.5%
h | 12 | 6.5%
y | 11 | 5.9%
i | 8 | 4.3%
m | 7 | 3.8%
Other values (11) | 25 | 13.5%

Other Punctuation
Value | Count | Frequency (%)
. | 2 | 66.7%
: | 1 | 33.3%

Space Separator
Value | Count | Frequency (%)
(space) | 18 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 12 | 100.0%

Decimal Number
Value | Count | Frequency (%)
8 | 1 | 100.0%
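
The per-category tables above can be approximated by grouping characters by their Unicode general category. The sketch below does this for a handful of values taken from this report rather than the full column, so its counts will not match the tables. The names `by_category` and `names` are introduced here for illustration only.

```python
import unicodedata
from collections import Counter, defaultdict

# A few values of Association_English_NM, as they appear in the report.
names = ["RM", "Jin", "JIMIN", "SUGA", "j-hope", "MIN GI"]

# Map each general-category code (e.g. "Lu") to a counter of its characters.
by_category = defaultdict(Counter)
for name in names:
    for ch in name:
        by_category[unicodedata.category(ch)][ch] += 1

for category, counter in sorted(by_category.items()):
    print(category, counter.most_common(3))
```

The two-letter codes correspond to the category names used in the report: `Lu` is Uppercase Letter, `Ll` is Lowercase Letter, `Zs` is Space Separator, `Pd` is Dash Punctuation, `Po` is Other Punctuation, and `Nd` is Decimal Number.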

Most occurring scripts

Value | Count | Frequency (%)
Latin | 604 | 94.7%
Common | 34 | 5.3%

Most frequent character per script

Latin
Value | Count | Frequency (%)
N | 57 | 9.4%
O | 46 | 7.6%
E | 33 | 5.5%
I | 31 | 5.1%
A | 30 | 5.0%
n | 30 | 5.0%
Y | 28 | 4.6%
S | 27 | 4.5%
H | 26 | 4.3%
o | 26 | 4.3%
Other values (36) | 270 | 44.7%

Common
Value | Count | Frequency (%)
(space) | 18 | 52.9%
- | 12 | 35.3%
. | 2 | 5.9%
: | 1 | 2.9%
8 | 1 | 2.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 638 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
N | 57 | 8.9%
O | 46 | 7.2%
E | 33 | 5.2%
I | 31 | 4.9%
A | 30 | 4.7%
n | 30 | 4.7%
Y | 28 | 4.4%
S | 27 | 4.2%
H | 26 | 4.1%
o | 26 | 4.1%
Other values (41) | 304 | 47.6%

FILE_NAME
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
KC_DICTIONARY_ASS_INFO_2019 | 100

Length

Max length: 27
Median length: 27
Mean length: 27
Min length: 27

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: KC_DICTIONARY_ASS_INFO_2019
2nd row: KC_DICTIONARY_ASS_INFO_2019
3rd row: KC_DICTIONARY_ASS_INFO_2019
4th row: KC_DICTIONARY_ASS_INFO_2019
5th row: KC_DICTIONARY_ASS_INFO_2019

Common Values

Value | Count | Frequency (%)
KC_DICTIONARY_ASS_INFO_2019 | 100 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
kc_dictionary_ass_info_2019 | 100 | 100.0%

BASE_YMD
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
2019 | 100

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2019
2nd row: 2019
3rd row: 2019
4th row: 2019
5th row: 2019

Common Values

Value | Count | Frequency (%)
2019 | 100 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2019 | 100 | 100.0%

Interactions


Correlations

 | NO | Entry_NM | Association_Korean_NM | Association_English_NM
NO | 1.000 | 0.934 | 1.000 | 1.000
Entry_NM | 0.934 | 1.000 | 1.000 | 1.000
Association_Korean_NM | 1.000 | 1.000 | 1.000 | 1.000
Association_English_NM | 1.000 | 1.000 | 1.000 | 1.000

 | NO | Entry_NM
NO | 1.000 | 0.745
Entry_NM | 0.745 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
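
As a rough sketch of what sits behind these two views, the code below builds a nullity matrix and per-column missing counts for three rows taken from the Sample section, using plain Python. Representing a missing cell as `None` is an assumption made for this example. The report draws the same information as plots.

```python
# Three rows from the Sample section; None would mark a missing cell.
rows = [
    {"NO": 1, "Entry_NM": "방탄소년단", "Association_English_NM": "RM"},
    {"NO": 3, "Entry_NM": "방탄소년단", "Association_English_NM": "JIMIN"},
    {"NO": 4, "Entry_NM": "방탄소년단", "Association_English_NM": "SUGA"},
]
columns = list(rows[0])

# Nullity matrix: one row per record, True where the cell is present.
nullity_matrix = [[row[col] is not None for col in columns] for row in rows]

# Per-column missing counts, the quantity shown by the per-column view.
missing_per_column = {
    col: sum(row[col] is None for row in rows) for col in columns
}
missing_cells = sum(missing_per_column.values())
missing_pct = 100.0 * missing_cells / (len(rows) * len(columns))

print(missing_per_column, f"{missing_pct:.1f}%")
```

None of these rows has a missing cell, which matches the "Missing cells: 0" figure in the overview.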

Sample

First rows

 | NO | Entry_NM | Association_Korean_NM | Association_English_NM | FILE_NAME | BASE_YMD
0 | 1 | 방탄소년단 | 알엠 | RM | KC_DICTIONARY_ASS_INFO_2019 | 2019
1 | 2 | 방탄소년단 |  | Jin | KC_DICTIONARY_ASS_INFO_2019 | 2019
2 | 3 | 방탄소년단 | 지민 | JIMIN | KC_DICTIONARY_ASS_INFO_2019 | 2019
3 | 4 | 방탄소년단 | 슈가 | SUGA | KC_DICTIONARY_ASS_INFO_2019 | 2019
4 | 5 | 방탄소년단 | 제이홉 | j-hope | KC_DICTIONARY_ASS_INFO_2019 | 2019
5 | 6 | 방탄소년단 |  | V | KC_DICTIONARY_ASS_INFO_2019 | 2019
6 | 7 | 방탄소년단 | 정국 | JUNGKOOK | KC_DICTIONARY_ASS_INFO_2019 | 2019
7 | 8 | 블랙핑크 | 지수 | JISOO | KC_DICTIONARY_ASS_INFO_2019 | 2019
8 | 9 | 블랙핑크 | 제니 | JENNIE | KC_DICTIONARY_ASS_INFO_2019 | 2019
9 | 10 | 블랙핑크 | 로제 | ROSE | KC_DICTIONARY_ASS_INFO_2019 | 2019

Last rows

 | NO | Entry_NM | Association_Korean_NM | Association_English_NM | FILE_NAME | BASE_YMD
90 | 91 | 에이티즈 | 여상 | YEOSANG | KC_DICTIONARY_ASS_INFO_2019 | 2019
91 | 92 | 에이티즈 |  | SAN | KC_DICTIONARY_ASS_INFO_2019 | 2019
92 | 93 | 에이티즈 | 민기 | MIN GI | KC_DICTIONARY_ASS_INFO_2019 | 2019
93 | 94 | 에이티즈 | 우영 | WOO YOUNG | KC_DICTIONARY_ASS_INFO_2019 | 2019
94 | 95 | 에이티즈 | 종호 | JONG HO | KC_DICTIONARY_ASS_INFO_2019 | 2019
95 | 96 | 레드벨벳 | 아이린 | IRENE | KC_DICTIONARY_ASS_INFO_2019 | 2019
96 | 97 | 레드벨벳 | 슬기 | SEULGI | KC_DICTIONARY_ASS_INFO_2019 | 2019
97 | 98 | 레드벨벳 | 웬디 | WENDY | KC_DICTIONARY_ASS_INFO_2019 | 2019
98 | 99 | 레드벨벳 | 조이 | JOY | KC_DICTIONARY_ASS_INFO_2019 | 2019
99 | 100 | 레드벨벳 | 예리 | YERI | KC_DICTIONARY_ASS_INFO_2019 | 2019