Overview

Dataset statistics

Number of variables: 6
Number of observations: 80
Missing cells: 146
Missing cells (%): 30.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.0 KiB
Average record size in memory: 51.6 B
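The missing-cell percentage can be checked by hand from the counts in this report. The sketch below is only an arithmetic check, not the profiler's own code; the per-column missing counts are taken from the Alerts section.

```python
# Counts copied from this report.
n_variables = 6
n_observations = 80
missing_per_column = {"Synonyms_Korean_NM": 74, "Synonyms_English_NM": 72}

total_cells = n_variables * n_observations          # 480 cells in total
missing_cells = sum(missing_per_column.values())    # 74 + 72
missing_pct = 100 * missing_cells / total_cells

print(f"{missing_cells} of {total_cells} cells missing ({missing_pct:.1f}%)")
```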

Variable types

Numeric: 1
Text: 3
Categorical: 2

Alerts

FILE_NAME has constant value "KC_DICTIONARY_SNN_INFO_2019" (Constant)
BASE_YMD has constant value "2019" (Constant)
Synonyms_Korean_NM has 74 (92.5%) missing values (Missing)
Synonyms_English_NM has 72 (90.0%) missing values (Missing)
NO has unique values (Unique)
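As a rough illustration of how alerts of this kind could be derived, the sketch below flags a column as Constant, Missing or Unique. The function name `column_alerts` and its rules are assumptions made for this example; the profiler's real thresholds and logic may differ.

```python
def column_alerts(values):
    """Return illustrative alert labels for one column (None marks a missing cell)."""
    present = [v for v in values if v is not None]
    n_missing = len(values) - len(present)
    alerts = []
    # Constant: every non-missing cell holds the same value.
    if present and len(set(present)) == 1:
        alerts.append("Constant")
    # Missing: simplified rule that flags any missing cell at all.
    if n_missing > 0:
        alerts.append(f"Missing ({n_missing}, {100 * n_missing / len(values):.1f}%)")
    # Unique: no missing cells and no repeated value.
    if n_missing == 0 and len(values) > 1 and len(set(present)) == len(values):
        alerts.append("Unique")
    return alerts


print(column_alerts(["2019"] * 80))        # a BASE_YMD-like column
print(column_alerts(list(range(1, 81))))   # a NO-like column
```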

Reproduction

Analysis started: 2023-12-10 09:46:46.074928
Analysis finished: 2023-12-10 09:46:47.225494
Duration: 1.15 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
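A report of this kind is typically produced with a few lines of Python. The sketch below assumes pandas and ydata-profiling are installed; the function name `build_report`, the CSV path argument and the output file name are hypothetical, and the settings actually used for this report are only recorded in config.json.

```python
def build_report(csv_path, html_path="report.html"):
    """Profile a CSV file and write an HTML report; returns the output path."""
    # Imported lazily so the sketch can be read without the libraries installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Dataset profile")
    profile.to_file(html_path)
    return html_path
```

Calling `build_report("data.csv")` with a path to the source table would write `report.html` next to the script.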

Variables

NO
Real number (ℝ)

UNIQUE 

Distinct: 80
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.5
Minimum: 1
Maximum: 80
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 852.0 B

Quantile statistics

Minimum: 1
5-th percentile: 4.95
Q1: 20.75
Median: 40.5
Q3: 60.25
95-th percentile: 76.05
Maximum: 80
Range: 79
Interquartile range (IQR): 39.5
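Because NO runs from 1 to 80 without gaps, the quantiles above can be reproduced with linear interpolation between neighbouring ranks. The sketch below is a hand-written helper (`quantile` is a name chosen for this example), under the assumption that the profiler uses this common interpolation rule.

```python
import math


def quantile(sorted_values, q):
    """Quantile by linear interpolation between the two nearest ranks."""
    pos = q * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


no = list(range(1, 81))  # NO is strictly increasing from 1 to 80
q05, q1, med, q3, q95 = (quantile(no, q) for q in (0.05, 0.25, 0.5, 0.75, 0.95))
iqr = q3 - q1
print(q05, q1, med, q3, q95, iqr)
```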

Descriptive statistics

Standard deviation: 23.2379
Coefficient of variation (CV): 0.57377531
Kurtosis: -1.2
Mean: 40.5
Median Absolute Deviation (MAD): 20
Skewness: 0
Sum: 3240
Variance: 540
Monotonicity: Strictly increasing
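Most of these figures follow directly from the values 1 to 80 and can be checked with the standard library. The sketch below assumes the sample (n - 1) definition of variance, which matches the reported 540; kurtosis and skewness are left out because their estimators vary between libraries.

```python
import statistics

no = list(range(1, 81))

total = sum(no)
mean = statistics.mean(no)
variance = statistics.variance(no)   # sample variance (n - 1 denominator)
std = statistics.stdev(no)           # square root of the sample variance
cv = std / mean                      # coefficient of variation
med = statistics.median(no)
mad = statistics.median(abs(x - med) for x in no)  # median absolute deviation

print(total, mean, variance, round(std, 4), round(cv, 8), mad)
```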
Histogram with fixed size bins (bins=50)
Common Values

Value              Count  Frequency (%)
1                  1      1.2%
42                 1      1.2%
60                 1      1.2%
59                 1      1.2%
58                 1      1.2%
57                 1      1.2%
56                 1      1.2%
55                 1      1.2%
54                 1      1.2%
53                 1      1.2%
Other values (70)  70     87.5%

Minimum 10 values

Value  Count  Frequency (%)
1      1      1.2%
2      1      1.2%
3      1      1.2%
4      1      1.2%
5      1      1.2%
6      1      1.2%
7      1      1.2%
8      1      1.2%
9      1      1.2%
10     1      1.2%

Maximum 10 values

Value  Count  Frequency (%)
80     1      1.2%
79     1      1.2%
78     1      1.2%
77     1      1.2%
76     1      1.2%
75     1      1.2%
74     1      1.2%
73     1      1.2%
72     1      1.2%
71     1      1.2%

Entry_NM
Text

Distinct: 76
Distinct (%): 95.0%
Missing: 0
Missing (%): 0.0%
Memory size: 772.0 B

Length

Max length: 13
Median length: 11
Mean length: 4.9125
Min length: 2

Characters and Unicode

Total characters: 393
Distinct characters: 116
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
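The category counts reported for text variables can be reproduced with Python's `unicodedata` module. The sketch below counts general categories for one Entry_NM value taken from the Sample section; `category_counts` is a helper name chosen for this example. The two-letter codes map to the labels used in this report: `Lo` is Other Letter, `Zs` is Space Separator, `Ps` is Open Punctuation and `Pe` is Close Punctuation.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters by Unicode general category (for example 'Lo' or 'Zs')."""
    return Counter(unicodedata.category(ch) for ch in text)


# One Entry_NM value from the Sample section: ten Hangul syllables and one space.
counts = category_counts("플레디스 엔터테인먼트")
print(counts)
```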

Unique

Unique: 75
Unique (%): 93.8%
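Distinct counts the different values in a column, while Unique counts only the values that occur exactly once. With 80 rows, 76 distinct values and 75 unique ones, a single value must account for the remaining five rows, which agrees with the five sample rows that all read 방탄소년단. The sketch below illustrates the two counts; the 75 placeholder names are hypothetical stand-ins for the values that occur once.

```python
from collections import Counter

# "방탄소년단" fills five rows; the other 75 entries are hypothetical
# placeholders standing in for the names that occur exactly once.
entries = ["방탄소년단"] * 5 + [f"name_{i}" for i in range(75)]

freq = Counter(entries)
distinct = len(freq)                                  # different values
unique = sum(1 for n in freq.values() if n == 1)      # values seen exactly once
unique_pct = 100 * unique / len(entries)

print(distinct, unique, unique_pct)
```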

Sample

1st row: 방탄소년단
2nd row: 방탄소년단
3rd row: 방탄소년단
4th row: 방탄소년단
5th row: 방탄소년단
Value              Count  Frequency (%)
엔터테인먼트       12     12.1%
방탄소년단         5      5.1%
슈퍼엠             2      2.0%
에이티니           1      1.0%
네버랜드           1      1.0%
믿지               1      1.0%
서클               1      1.0%
이너               1      1.0%
위즈원             1      1.0%
몬베베             1      1.0%
Other values (73)  73     73.7%

Most occurring characters

Value               Count  Frequency (%)
                    24     6.1%
                    19     4.8%
                    16     4.1%
                    14     3.6%
                    14     3.6%
                    13     3.3%
                    13     3.3%
                    13     3.3%
                    12     3.1%
                    8      2.0%
Other values (106)  247    62.8%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       372    94.7%
Space Separator    19     4.8%
Open Punctuation   1      0.3%
Close Punctuation  1      0.3%

Most frequent character per category

Other Letter

Value               Count  Frequency (%)
                    24     6.5%
                    16     4.3%
                    14     3.8%
                    14     3.8%
                    13     3.5%
                    13     3.5%
                    13     3.5%
                    12     3.2%
                    8      2.2%
                    8      2.2%
Other values (103)  237    63.7%

Space Separator

Value  Count  Frequency (%)
       19     100.0%

Open Punctuation

Value  Count  Frequency (%)
(      1      100.0%

Close Punctuation

Value  Count  Frequency (%)
)      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  372    94.7%
Common  21     5.3%

Most frequent character per script

Hangul

Value               Count  Frequency (%)
                    24     6.5%
                    16     4.3%
                    14     3.8%
                    14     3.8%
                    13     3.5%
                    13     3.5%
                    13     3.5%
                    12     3.2%
                    8      2.2%
                    8      2.2%
Other values (103)  237    63.7%

Common

Value  Count  Frequency (%)
       19     90.5%
(      1      4.8%
)      1      4.8%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  372    94.7%
ASCII   21     5.3%

Most frequent character per block

Hangul

Value               Count  Frequency (%)
                    24     6.5%
                    16     4.3%
                    14     3.8%
                    14     3.8%
                    13     3.5%
                    13     3.5%
                    13     3.5%
                    12     3.2%
                    8      2.2%
                    8      2.2%
Other values (103)  237    63.7%

ASCII

Value  Count  Frequency (%)
       19     90.5%
(      1      4.8%
)      1      4.8%

Synonyms_Korean_NM
Text

MISSING 

Distinct: 6
Distinct (%): 100.0%
Missing: 74
Missing (%): 92.5%
Memory size: 772.0 B

Length

Max length: 9
Median length: 4.5
Mean length: 4.5
Min length: 2

Characters and Unicode

Total characters: 27
Distinct characters: 25
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2

Unique

Unique: 6
Unique (%): 100.0%

Sample

1st row: 비티에스
2nd row: 정호석
3rd row: 레벨
4th row: 에닉 소미 다우마
5th row: 여자아이들
Value       Count  Frequency (%)
비티에스    1      12.5%
정호석      1      12.5%
레벨        1      12.5%
에닉        1      12.5%
소미        1      12.5%
다우마      1      12.5%
여자아이들  1      12.5%
한국음악    1      12.5%

Most occurring characters

Value              Count  Frequency (%)
                   2      7.4%
                   2      7.4%
                   1      3.7%
                   1      3.7%
                   1      3.7%
                   1      3.7%
                   1      3.7%
                   1      3.7%
                   1      3.7%
                   1      3.7%
Other values (15)  15     55.6%

Most occurring categories

Value            Count  Frequency (%)
Other Letter     25     92.6%
Space Separator  2      7.4%

Most frequent character per category

Other Letter

Value              Count  Frequency (%)
                   2      8.0%
                   1      4.0%
                   1      4.0%
                   1      4.0%
                   1      4.0%
                   1      4.0%
                   1      4.0%
                   1      4.0%
                   1      4.0%
                   1      4.0%
Other values (14)  14     56.0%

Space Separator

Value  Count  Frequency (%)
       2      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  25     92.6%
Common  2      7.4%

Most frequent character per script

Hangul

Value              Count  Frequency (%)
                   2      8.0%
                   1      4.0%
                   1      4.0%
                   1      4.0%
                   1      4.0%
                   1      4.0%
                   1      4.0%
                   1      4.0%
                   1      4.0%
                   1      4.0%
Other values (14)  14     56.0%

Common

Value  Count  Frequency (%)
       2      100.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  25     92.6%
ASCII   2      7.4%

Most frequent character per block

Hangul

Value              Count  Frequency (%)
                   2      8.0%
                   1      4.0%
                   1      4.0%
                   1      4.0%
                   1      4.0%
                   1      4.0%
                   1      4.0%
                   1      4.0%
                   1      4.0%
                   1      4.0%
Other values (14)  14     56.0%

ASCII

Value  Count  Frequency (%)
       2      100.0%

Synonyms_English_NM
Text

MISSING 

Distinct: 8
Distinct (%): 100.0%
Missing: 72
Missing (%): 90.0%
Memory size: 772.0 B

Length

Max length: 18
Median length: 11
Mean length: 11.5
Min length: 5

Characters and Unicode

Total characters: 92
Distinct characters: 28
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 8
Unique (%): 100.0%

Sample

1st row: BANGTAN
2nd row: Bangtan the Boys
3rd row: Bangtan Sonyeondan
4th row: Beyond the Scene
5th row: Bangtan Boys
Value       Count  Frequency (%)
bangtan     4      26.7%
the         2      13.3%
boys        2      13.3%
sonyeondan  1      6.7%
beyond      1      6.7%
scene       1      6.7%
jeonsomi    1      6.7%
izone       1      6.7%
korean      1      6.7%
pop         1      6.7%
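The word table above appears to count lowercased, whitespace-separated words across all values (15 words in total, since 4 of 15 is 26.7%). The sketch below applies that assumed tokenization to the five sample rows only, so it covers 11 of the 15 words; the remaining three values are not shown in full in this report.

```python
from collections import Counter

# The five sample rows shown for Synonyms_English_NM.
sample = [
    "BANGTAN",
    "Bangtan the Boys",
    "Bangtan Sonyeondan",
    "Beyond the Scene",
    "Bangtan Boys",
]

# Lowercase each value, split on whitespace, and count every word.
words = Counter(word for value in sample for word in value.lower().split())
total = sum(words.values())

for word, count in words.most_common():
    print(word, count)
```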

Most occurring characters

Value              Count  Frequency (%)
n                  13     14.1%
o                  9      9.8%
e                  8      8.7%
a                  8      8.7%
B                  7      7.6%
                   7      7.6%
t                  5      5.4%
y                  4      4.3%
s                  3      3.3%
g                  3      3.3%
Other values (18)  25     27.2%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  63     68.5%
Uppercase Letter  22     23.9%
Space Separator   7      7.6%

Most frequent character per category

Lowercase Letter

Value             Count  Frequency (%)
n                 13     20.6%
o                 9      14.3%
e                 8      12.7%
a                 8      12.7%
t                 5      7.9%
y                 4      6.3%
s                 3      4.8%
g                 3      4.8%
h                 2      3.2%
d                 2      3.2%
Other values (5)  6      9.5%

Uppercase Letter

Value             Count  Frequency (%)
B                 7      31.8%
N                 3      13.6%
S                 2      9.1%
A                 2      9.1%
G                 1      4.5%
T                 1      4.5%
I                 1      4.5%
Z                 1      4.5%
O                 1      4.5%
E                 1      4.5%
Other values (2)  2      9.1%

Space Separator

Value  Count  Frequency (%)
       7      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Latin   85     92.4%
Common  7      7.6%

Most frequent character per script

Latin

Value              Count  Frequency (%)
n                  13     15.3%
o                  9      10.6%
e                  8      9.4%
a                  8      9.4%
B                  7      8.2%
t                  5      5.9%
y                  4      4.7%
s                  3      3.5%
g                  3      3.5%
N                  3      3.5%
Other values (17)  22     25.9%

Common

Value  Count  Frequency (%)
       7      100.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  92     100.0%

Most frequent character per block

ASCII

Value              Count  Frequency (%)
n                  13     14.1%
o                  9      9.8%
e                  8      8.7%
a                  8      8.7%
B                  7      7.6%
                   7      7.6%
t                  5      5.4%
y                  4      4.3%
s                  3      3.3%
g                  3      3.3%
Other values (18)  25     27.2%

FILE_NAME
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 772.0 B

Value                        Count
KC_DICTIONARY_SNN_INFO_2019  80

Length

Max length: 27
Median length: 27
Mean length: 27
Min length: 27

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: KC_DICTIONARY_SNN_INFO_2019
2nd row: KC_DICTIONARY_SNN_INFO_2019
3rd row: KC_DICTIONARY_SNN_INFO_2019
4th row: KC_DICTIONARY_SNN_INFO_2019
5th row: KC_DICTIONARY_SNN_INFO_2019

Common Values

Value                        Count  Frequency (%)
KC_DICTIONARY_SNN_INFO_2019  80     100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value                        Count  Frequency (%)
kc_dictionary_snn_info_2019  80     100.0%

BASE_YMD
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 772.0 B

Value  Count
2019   80

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2019
2nd row: 2019
3rd row: 2019
4th row: 2019
5th row: 2019

Common Values

Value  Count  Frequency (%)
2019   80     100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
2019   80     100.0%

Interactions


Correlations

                     NO     Entry_NM  Synonyms_Korean_NM  Synonyms_English_NM
NO                   1.000  1.000     1.000               1.000
Entry_NM             1.000  1.000     1.000               1.000
Synonyms_Korean_NM   1.000  1.000     1.000               1.000
Synonyms_English_NM  1.000  1.000     1.000               1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
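Nullity correlation can be read as an ordinary Pearson correlation computed on presence indicators (1 where a value is present, 0 where it is missing). The sketch below assumes that definition and applies it to the first ten rows of the Sample section only, so the result is illustrative rather than the value plotted in the heatmap; `pearson` is a helper written for this example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# 1 = value present, 0 = <NA>, for the first ten sample rows (NO 1 to 10).
korean_present = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
english_present = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

r = pearson(korean_present, english_present)
print(round(r, 3))
```

A positive value means the two synonym columns tend to be filled in on the same rows.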

Sample

First rows

    NO  Entry_NM    Synonyms_Korean_NM  Synonyms_English_NM  FILE_NAME                    BASE_YMD
0   1   방탄소년단  비티에스            BANGTAN              KC_DICTIONARY_SNN_INFO_2019  2019
1   2   방탄소년단  <NA>                Bangtan the Boys     KC_DICTIONARY_SNN_INFO_2019  2019
2   3   방탄소년단  <NA>                Bangtan Sonyeondan   KC_DICTIONARY_SNN_INFO_2019  2019
3   4   방탄소년단  <NA>                Beyond the Scene     KC_DICTIONARY_SNN_INFO_2019  2019
4   5   방탄소년단  <NA>                Bangtan Boys         KC_DICTIONARY_SNN_INFO_2019  2019
5   6   블랙핑크    <NA>                <NA>                 KC_DICTIONARY_SNN_INFO_2019  2019
6   7   엑스원      <NA>                <NA>                 KC_DICTIONARY_SNN_INFO_2019  2019
7   8   에버글로우  <NA>                <NA>                 KC_DICTIONARY_SNN_INFO_2019  2019
8   9   갓세븐      <NA>                <NA>                 KC_DICTIONARY_SNN_INFO_2019  2019
9   10  슈퍼주니어  <NA>                <NA>                 KC_DICTIONARY_SNN_INFO_2019  2019

Last rows

    NO  Entry_NM                 Synonyms_Korean_NM  Synonyms_English_NM  FILE_NAME                    BASE_YMD
70  71  플레디스 엔터테인먼트    <NA>                <NA>                 KC_DICTIONARY_SNN_INFO_2019  2019
71  72  케이큐 엔터테인먼트      <NA>                <NA>                 KC_DICTIONARY_SNN_INFO_2019  2019
72  73  더블랙레이블             <NA>                <NA>                 KC_DICTIONARY_SNN_INFO_2019  2019
73  74  더블유엠 엔터테인먼트    <NA>                <NA>                 KC_DICTIONARY_SNN_INFO_2019  2019
74  75  큐브 엔터테인먼트        <NA>                <NA>                 KC_DICTIONARY_SNN_INFO_2019  2019
75  76  알비더블유               <NA>                <NA>                 KC_DICTIONARY_SNN_INFO_2019  2019
76  77  스타쉽 엔터테인먼트      <NA>                <NA>                 KC_DICTIONARY_SNN_INFO_2019  2019
77  78  오프더레코드 엔터테인먼트  <NA>              <NA>                 KC_DICTIONARY_SNN_INFO_2019  2019
78  79  크래커 엔터테인먼트      <NA>                <NA>                 KC_DICTIONARY_SNN_INFO_2019  2019
79  80  판타지오                 <NA>                <NA>                 KC_DICTIONARY_SNN_INFO_2019  2019