Overview

Dataset statistics

Number of variables: 17
Number of observations: 100
Missing cells: 358
Missing cells (%): 21.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 13.8 KiB
Average record size in memory: 141.3 B
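
The missing-cell figures can be cross-checked against the per-variable counts reported in the Alerts section. The sketch below is illustrative only: the dictionary layout and variable names are assumptions, and the numbers are copied from this report.

```python
# Figures copied from the report; the data structure is only for illustration.
n_variables = 17
n_observations = 100
missing_by_column = {
    "eng_lang_nm": 89,
    "jan_lang_nm": 89,
    "chg_lang_nm": 90,
    "chb_lang_nm": 90,
}

total_cells = n_variables * n_observations
missing_cells = sum(missing_by_column.values())
missing_pct = 100 * missing_cells / total_cells

print(f"{missing_cells} of {total_cells} cells missing ({missing_pct:.1f}%)")
```

The four translated-name columns account for all 358 missing cells, so no other column contributes cells that the report counts as missing.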

Variable types

Text: 6
Numeric: 2
Categorical: 9

Alerts

base_ymd has constant value "2020-12-31" [Constant]
gov_dn_kor_lang_nm is highly overall correlated with xpos_lo and 8 other fields [High correlation]
city_do_jan_lang_nm is highly overall correlated with xpos_lo and 8 other fields [High correlation]
gov_dn_jan_lang_nm is highly overall correlated with xpos_lo and 8 other fields [High correlation]
city_gn_gu_cd is highly overall correlated with xpos_lo and 8 other fields [High correlation]
city_gn_gu_kor_lang_nm is highly overall correlated with xpos_lo and 8 other fields [High correlation]
city_gn_gu_jan_lang_nm is highly overall correlated with xpos_lo and 8 other fields [High correlation]
city_do_cd is highly overall correlated with xpos_lo and 8 other fields [High correlation]
city_do_kor_lang_nm is highly overall correlated with xpos_lo and 8 other fields [High correlation]
xpos_lo is highly overall correlated with city_do_cd and 7 other fields [High correlation]
ypos_la is highly overall correlated with city_do_cd and 7 other fields [High correlation]
city_do_cd is highly imbalanced (64.2%) [Imbalance]
city_gn_gu_cd is highly imbalanced (68.3%) [Imbalance]
city_do_kor_lang_nm is highly imbalanced (64.2%) [Imbalance]
city_gn_gu_kor_lang_nm is highly imbalanced (68.3%) [Imbalance]
eng_lang_nm has 89 (89.0%) missing values [Missing]
jan_lang_nm has 89 (89.0%) missing values [Missing]
chg_lang_nm has 90 (90.0%) missing values [Missing]
chb_lang_nm has 90 (90.0%) missing values [Missing]
xpos_lo has unique values [Unique]
ypos_la has unique values [Unique]
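
The report does not state how the imbalance percentages are derived. A minimal sketch, assuming the score is one minus the Shannon entropy of the class shares divided by the maximum possible entropy, reproduces the reported 64.2% and 68.3% from the category counts listed later in this report. The function name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy of the class shares.

    Assumed formula; 0 means perfectly balanced, values near 1 mean one
    class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Category counts taken from the Common Values tables of this report.
city_do_cd = [86, 11, 2, 1]
city_gn_gu_cd = [86, 11, 1, 1, 1]

print(f"city_do_cd:    {imbalance_score(city_do_cd):.1%}")
print(f"city_gn_gu_cd: {imbalance_score(city_gn_gu_cd):.1%}")
```

Under this assumption the imbalance alerts simply restate that 86 of the 100 rows fall in a single province and district.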

Reproduction

Analysis started: 2023-12-10 10:07:00.824033
Analysis finished: 2023-12-10 10:07:04.341397
Duration: 3.52 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

entrp_nm
Text

Distinct: 98
Distinct (%): 98.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 9
Median length: 8
Mean length: 4.96
Min length: 3

Characters and Unicode

Total characters: 496
Distinct characters: 183
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
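
As a rough illustration of the category counts, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, and the two-letter codes map to the report's labels as follows: `Lo` is Other Letter, `Po` is Other Punctuation, `Ll` and `Lu` are Lowercase and Uppercase Letter, and `Zs` is Space Separator. The standard library has no script or block lookup, so only categories are shown.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories over every character in the values."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

# The five sample values shown for this variable in the report.
sample = ["가평민박", "해온안", "강변민박", "강언덕", "맑은물산장"]
print(category_counts(sample))

# A Latin-script value from the eng_lang_nm sample, for contrast.
print(category_counts(["Gyegok Guest House"]))
```

Hangul syllables all fall under `Lo`, which is why the Korean name columns report a single dominant category.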

Unique

Unique: 96
Unique (%): 96.0%

Sample

1st row: 가평민박
2nd row: 해온안
3rd row: 강변민박
4th row: 강언덕
5th row: 맑은물산장

Value | Count | Frequency (%)
청솔민박 | 2 | 2.0%
황토민박 | 2 | 2.0%
선녀와나무터 | 1 | 1.0%
타운하우스 | 1 | 1.0%
가평민박 | 1 | 1.0%
한그린잔디 | 1 | 1.0%
통나무집 | 1 | 1.0%
청산유원지 | 1 | 1.0%
용수유원지 | 1 | 1.0%
산내들유원지 | 1 | 1.0%
Other values (88) | 88 | 88.0%

Most occurring characters

ValueCountFrequency (%)
35
 
7.1%
35
 
7.1%
20
 
4.0%
13
 
2.6%
13
 
2.6%
11
 
2.2%
11
 
2.2%
10
 
2.0%
8
 
1.6%
/ 8
 
1.6%
Other values (173) 332
66.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 488 | 98.4%
Other Punctuation | 8 | 1.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
35
 
7.2%
35
 
7.2%
20
 
4.1%
13
 
2.7%
13
 
2.7%
11
 
2.3%
11
 
2.3%
10
 
2.0%
8
 
1.6%
8
 
1.6%
Other values (172) 324
66.4%
Other Punctuation
ValueCountFrequency (%)
/ 8
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 488 | 98.4%
Common | 8 | 1.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
35
 
7.2%
35
 
7.2%
20
 
4.1%
13
 
2.7%
13
 
2.7%
11
 
2.3%
11
 
2.3%
10
 
2.0%
8
 
1.6%
8
 
1.6%
Other values (172) 324
66.4%
Common
ValueCountFrequency (%)
/ 8
100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 488 | 98.4%
ASCII | 8 | 1.6%

Most frequent character per block

Hangul
ValueCountFrequency (%)
35
 
7.2%
35
 
7.2%
20
 
4.1%
13
 
2.7%
13
 
2.7%
11
 
2.3%
11
 
2.3%
10
 
2.0%
8
 
1.6%
8
 
1.6%
Other values (172) 324
66.4%
ASCII
ValueCountFrequency (%)
/ 8
100.0%

xpos_lo
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 981804.76
Minimum: 126.24006
Maximum: 1136978
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 126.24006
5-th percentile: 987074.2
Q1: 992870.5
Median: 998429
Q3: 1000989.2
95-th percentile: 1135058
Maximum: 1136978
Range: 1136851.8
Interquartile range (IQR): 8118.75

Descriptive statistics

Standard deviation: 178830.51
Coefficient of variation (CV): 0.18214467
Kurtosis: 26.204549
Mean: 981804.76
Median Absolute Deviation (MAD): 3830.5
Skewness: -5.0447191
Sum: 98180476
Variance: 3.198035 × 10^10
Monotonicity: Not monotonic
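
Several of these statistics follow directly from others in the list. The sketch below recomputes the coefficient of variation, range, interquartile range and variance for xpos_lo from the rounded values printed in this report, so small differences in the last digits are expected. Variable names are illustrative.

```python
# Rounded values copied from the xpos_lo statistics in this report.
std = 178830.51
mean = 981804.76
minimum, maximum = 126.24006, 1136978.0
q1, q3 = 992870.5, 1000989.2

cv = std / mean                  # coefficient of variation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range
variance = std ** 2              # variance is the squared standard deviation

print(f"CV       = {cv:.6f}")
print(f"Range    = {value_range:.1f}")
print(f"IQR      = {iqr:.2f}")
print(f"Variance = {variance:.6e}")
```

The IQR of about 8 thousand against a range of over 1.1 million is consistent with the strongly negative skewness: a few very small values stretch the range far below the bulk of the data.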
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1000867.0 | 1 | 1.0%
998403.0 | 1 | 1.0%
989371.0 | 1 | 1.0%
993227.0 | 1 | 1.0%
988848.0 | 1 | 1.0%
990256.0 | 1 | 1.0%
989844.0 | 1 | 1.0%
986657.0 | 1 | 1.0%
987080.0 | 1 | 1.0%
989212.0 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Minimum 10 values

Value | Count | Frequency (%)
126.240059 | 1 | 1.0%
127.043389 | 1 | 1.0%
129.385268 | 1 | 1.0%
986657.0 | 1 | 1.0%
986964.0 | 1 | 1.0%
987080.0 | 1 | 1.0%
988308.0 | 1 | 1.0%
988470.0 | 1 | 1.0%
988508.0 | 1 | 1.0%
988848.0 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
1136978.0 | 1 | 1.0%
1135370.0 | 1 | 1.0%
1135366.0 | 1 | 1.0%
1135355.0 | 1 | 1.0%
1135134.0 | 1 | 1.0%
1135054.0 | 1 | 1.0%
1133658.0 | 1 | 1.0%
1133637.0 | 1 | 1.0%
1131116.0 | 1 | 1.0%
1130767.0 | 1 | 1.0%
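
The three smallest xpos_lo values (about 126 to 129) are several orders of magnitude below the rest (about 987,000 and up), which suggests that a few rows hold longitudes in degrees while the others hold projected coordinates. The report does not identify any coordinate reference system, so the sketch below is only a heuristic: the 180-degree threshold and the function name `split_by_magnitude` are assumptions.

```python
def split_by_magnitude(values, degree_limit=180.0):
    """Split values into those that could be degrees and those that cannot.

    Heuristic only: any value with absolute magnitude <= degree_limit is
    treated as a possible longitude or latitude in degrees.
    """
    degrees, projected = [], []
    for v in values:
        (degrees if abs(v) <= degree_limit else projected).append(v)
    return degrees, projected

# A few xpos_lo values taken from the extreme-value lists in this report.
xpos_lo_values = [126.240059, 127.043389, 129.385268,
                  986657.0, 986964.0, 1136978.0]

maybe_degrees, maybe_projected = split_by_magnitude(xpos_lo_values)
print("possible degrees:  ", maybe_degrees)
print("possible projected:", maybe_projected)
```

Rows flagged this way would need to be converted to a common coordinate system, or excluded, before the mean, standard deviation or correlations of xpos_lo are interpreted.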

ypos_la
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1916749.9
Minimum: 34.490406
Maximum: 1998814
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 34.490406
5-th percentile: 1954792.2
Q1: 1965351.2
Median: 1972035
Q3: 1988647.8
95-th percentile: 1996591.4
Maximum: 1998814
Range: 1998779.5
Interquartile range (IQR): 23296.5

Descriptive statistics

Standard deviation: 339029.2
Coefficient of variation (CV): 0.17687712
Kurtosis: 29.80022
Mean: 1916749.9
Median Absolute Deviation (MAD): 11364
Skewness: -5.5813989
Sum: 1.9167499 × 10^8
Variance: 1.149408 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1971396.0 | 1 | 1.0%
1964080.0 | 1 | 1.0%
1964506.0 | 1 | 1.0%
1970280.0 | 1 | 1.0%
1989175.0 | 1 | 1.0%
1978002.0 | 1 | 1.0%
1990476.0 | 1 | 1.0%
1983819.0 | 1 | 1.0%
1982844.0 | 1 | 1.0%
1989279.0 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Minimum 10 values

Value | Count | Frequency (%)
34.490406 | 1 | 1.0%
35.041119 | 1 | 1.0%
36.608114 | 1 | 1.0%
1954646.0 | 1 | 1.0%
1954720.0 | 1 | 1.0%
1954796.0 | 1 | 1.0%
1954810.0 | 1 | 1.0%
1954861.0 | 1 | 1.0%
1955230.0 | 1 | 1.0%
1955239.0 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
1998814.0 | 1 | 1.0%
1998566.0 | 1 | 1.0%
1998539.0 | 1 | 1.0%
1997588.0 | 1 | 1.0%
1996845.0 | 1 | 1.0%
1996578.0 | 1 | 1.0%
1996515.0 | 1 | 1.0%
1996210.0 | 1 | 1.0%
1996028.0 | 1 | 1.0%
1995537.0 | 1 | 1.0%

kor_lang_nm
Text

Distinct: 98
Distinct (%): 98.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 8
Median length: 7
Mean length: 4.88
Min length: 3

Characters and Unicode

Total characters: 488
Distinct characters: 182
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 96
Unique (%): 96.0%

Sample

1st row: 가평민박
2nd row: 해온안
3rd row: 강변민박
4th row: 강언덕
5th row: 맑은물산장

Value | Count | Frequency (%)
청솔민박 | 2 | 2.0%
황토민박 | 2 | 2.0%
선녀와나무터 | 1 | 1.0%
타운하우스 | 1 | 1.0%
가평민박 | 1 | 1.0%
한그린잔디 | 1 | 1.0%
통나무집 | 1 | 1.0%
청산유원지 | 1 | 1.0%
용수유원지 | 1 | 1.0%
산내들유원지 | 1 | 1.0%
Other values (88) | 88 | 88.0%

Most occurring characters

ValueCountFrequency (%)
35
 
7.2%
35
 
7.2%
20
 
4.1%
13
 
2.7%
13
 
2.7%
11
 
2.3%
11
 
2.3%
10
 
2.0%
8
 
1.6%
8
 
1.6%
Other values (172) 324
66.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 488
100.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
35
 
7.2%
35
 
7.2%
20
 
4.1%
13
 
2.7%
13
 
2.7%
11
 
2.3%
11
 
2.3%
10
 
2.0%
8
 
1.6%
8
 
1.6%
Other values (172) 324
66.4%

Most occurring scripts

ValueCountFrequency (%)
Hangul 488
100.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
35
 
7.2%
35
 
7.2%
20
 
4.1%
13
 
2.7%
13
 
2.7%
11
 
2.3%
11
 
2.3%
10
 
2.0%
8
 
1.6%
8
 
1.6%
Other values (172) 324
66.4%

Most occurring blocks

ValueCountFrequency (%)
Hangul 488
100.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
35
 
7.2%
35
 
7.2%
20
 
4.1%
13
 
2.7%
13
 
2.7%
11
 
2.3%
11
 
2.3%
10
 
2.0%
8
 
1.6%
8
 
1.6%
Other values (172) 324
66.4%

eng_lang_nm
Text

MISSING 

Distinct: 10
Distinct (%): 90.9%
Missing: 89
Missing (%): 89.0%
Memory size: 932.0 B

Length

Max length: 27
Median length: 15
Mean length: 13.545455
Min length: 5

Characters and Unicode

Total characters: 149
Distinct characters: 31
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9
Unique (%): 81.8%

Sample

1st row: Motel
2nd row: Cheongsol Garden Restaurant
3rd row: Gyegok Guest House
4th row: Bomulseom
5th row: Sarangchae

Value | Count | Frequency (%)
motel | 2 | 10.5%
restaurant | 2 | 10.5%
guest | 2 | 10.5%
house | 2 | 10.5%
cheongsol | 1 | 5.3%
garden | 1 | 5.3%
gyegok | 1 | 5.3%
bomulseom | 1 | 5.3%
sarangchae | 1 | 5.3%
saejip | 1 | 5.3%
Other values (5) | 5 | 26.3%

Most occurring characters

ValueCountFrequency (%)
e 16
 
10.7%
o 14
 
9.4%
a 12
 
8.1%
u 11
 
7.4%
s 10
 
6.7%
t 9
 
6.0%
n 9
 
6.0%
8
 
5.4%
g 7
 
4.7%
p 5
 
3.4%
Other values (21) 48
32.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 121
81.2%
Uppercase Letter 19
 
12.8%
Space Separator 8
 
5.4%
Other Punctuation 1
 
0.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 16
13.2%
o 14
11.6%
a 12
9.9%
u 11
9.1%
s 10
8.3%
t 9
 
7.4%
n 9
 
7.4%
g 7
 
5.8%
p 5
 
4.1%
l 4
 
3.3%
Other values (9) 24
19.8%
Uppercase Letter
ValueCountFrequency (%)
G 4
21.1%
S 3
15.8%
M 2
10.5%
R 2
10.5%
H 2
10.5%
C 2
10.5%
B 1
 
5.3%
T 1
 
5.3%
D 1
 
5.3%
A 1
 
5.3%
Space Separator
ValueCountFrequency (%)
8
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 140
94.0%
Common 9
 
6.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 16
 
11.4%
o 14
 
10.0%
a 12
 
8.6%
u 11
 
7.9%
s 10
 
7.1%
t 9
 
6.4%
n 9
 
6.4%
g 7
 
5.0%
p 5
 
3.6%
l 4
 
2.9%
Other values (19) 43
30.7%
Common
ValueCountFrequency (%)
8
88.9%
. 1
 
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 149
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 16
 
10.7%
o 14
 
9.4%
a 12
 
8.1%
u 11
 
7.4%
s 10
 
6.7%
t 9
 
6.0%
n 9
 
6.0%
8
 
5.4%
g 7
 
4.7%
p 5
 
3.4%
Other values (21) 48
32.2%

jan_lang_nm
Text

MISSING 

Distinct: 10
Distinct (%): 90.9%
Missing: 89
Missing (%): 89.0%
Memory size: 932.0 B

Length

Max length: 9
Median length: 8
Mean length: 5.0909091
Min length: 2

Characters and Unicode

Total characters: 56
Distinct characters: 35
Distinct categories: 2
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9
Unique (%): 81.8%

Sample

1st row: モ?テル
2nd row: チョンソルガ?デン
3rd row: ?谷民泊
4th row: ?島
5th row: サランチェ

Value | Count | Frequency (%)
モ?テル | 2 | 18.2%
チョンソルガ?デン | 1 | 9.1%
谷民泊 | 1 | 9.1%
(value not preserved) | 1 | 9.1%
サランチェ | 1 | 9.1%
セジプ民宿 | 1 | 9.1%
エリス山 | 1 | 9.1%
トンナムジプ | 1 | 9.1%
大成食堂 | 1 | 9.1%
スプソグィアチム | 1 | 9.1%

Most occurring characters

ValueCountFrequency (%)
? 6
 
10.7%
4
 
7.1%
3
 
5.4%
3
 
5.4%
3
 
5.4%
2
 
3.6%
2
 
3.6%
2
 
3.6%
2
 
3.6%
2
 
3.6%
Other values (25) 27
48.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 50
89.3%
Other Punctuation 6
 
10.7%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4
 
8.0%
3
 
6.0%
3
 
6.0%
3
 
6.0%
2
 
4.0%
2
 
4.0%
2
 
4.0%
2
 
4.0%
2
 
4.0%
2
 
4.0%
Other values (24) 25
50.0%
Other Punctuation
ValueCountFrequency (%)
? 6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Katakana 39
69.6%
Han 11
 
19.6%
Common 6
 
10.7%

Most frequent character per script

Katakana
ValueCountFrequency (%)
4
 
10.3%
3
 
7.7%
3
 
7.7%
3
 
7.7%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
Other values (14) 14
35.9%
Han
ValueCountFrequency (%)
2
18.2%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
宿 1
9.1%
1
9.1%
1
9.1%
1
9.1%
Common
ValueCountFrequency (%)
? 6
100.0%

Most occurring blocks

ValueCountFrequency (%)
Katakana 39
69.6%
CJK 11
 
19.6%
ASCII 6
 
10.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
? 6
100.0%
Katakana
ValueCountFrequency (%)
4
 
10.3%
3
 
7.7%
3
 
7.7%
3
 
7.7%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
Other values (14) 14
35.9%
CJK
ValueCountFrequency (%)
2
18.2%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
1
9.1%
宿 1
9.1%
1
9.1%
1
9.1%
1
9.1%

chg_lang_nm
Text

MISSING 

Distinct: 9
Distinct (%): 90.0%
Missing: 90
Missing (%): 90.0%
Memory size: 932.0 B

Length

Max length: 5
Median length: 4
Mean length: 3.8
Min length: 3

Characters and Unicode

Total characters: 38
Distinct characters: 25
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8
Unique (%): 80.0%

Sample

1st row: 汽?旅?
2nd row: ?松花?
3rd row: 山谷民宿
4th row: ?物?
5th row: 舍廊房

Value | Count | Frequency (%)
汽?旅 | 2 | 20.0%
松花 | 1 | 10.0%
山谷民宿 | 1 | 10.0%
(value not preserved) | 1 | 10.0%
舍廊房 | 1 | 10.0%
新屋民宿 | 1 | 10.0%
原木屋 | 1 | 10.0%
大成餐 | 1 | 10.0%
森里的早晨 | 1 | 10.0%

Most occurring characters

ValueCountFrequency (%)
? 9
23.7%
2
 
5.3%
2
 
5.3%
2
 
5.3%
宿 2
 
5.3%
2
 
5.3%
1
 
2.6%
1
 
2.6%
1
 
2.6%
1
 
2.6%
Other values (15) 15
39.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 29
76.3%
Other Punctuation 9
 
23.7%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2
 
6.9%
2
 
6.9%
2
 
6.9%
宿 2
 
6.9%
2
 
6.9%
1
 
3.4%
1
 
3.4%
1
 
3.4%
1
 
3.4%
1
 
3.4%
Other values (14) 14
48.3%
Other Punctuation
ValueCountFrequency (%)
? 9
100.0%

Most occurring scripts

ValueCountFrequency (%)
Han 29
76.3%
Common 9
 
23.7%

Most frequent character per script

Han
ValueCountFrequency (%)
2
 
6.9%
2
 
6.9%
2
 
6.9%
宿 2
 
6.9%
2
 
6.9%
1
 
3.4%
1
 
3.4%
1
 
3.4%
1
 
3.4%
1
 
3.4%
Other values (14) 14
48.3%
Common
ValueCountFrequency (%)
? 9
100.0%

Most occurring blocks

ValueCountFrequency (%)
CJK 29
76.3%
ASCII 9
 
23.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
? 9
100.0%
CJK
ValueCountFrequency (%)
2
 
6.9%
2
 
6.9%
2
 
6.9%
宿 2
 
6.9%
2
 
6.9%
1
 
3.4%
1
 
3.4%
1
 
3.4%
1
 
3.4%
1
 
3.4%
Other values (14) 14
48.3%

chb_lang_nm
Text

MISSING 

Distinct: 9
Distinct (%): 90.0%
Missing: 90
Missing (%): 90.0%
Memory size: 932.0 B

Length

Max length: 6
Median length: 5.5
Mean length: 4
Min length: 3

Characters and Unicode

Total characters: 40
Distinct characters: 30
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8
Unique (%): 80.0%

Sample

1st row: 汽車旅館
2nd row: ?松花園餐館
3rd row: 山谷民宿
4th row: 寶物島
5th row: 舍廊房

Value | Count | Frequency (%)
汽車旅館 | 2 | 20.0%
松花園餐館 | 1 | 10.0%
山谷民宿 | 1 | 10.0%
寶物島 | 1 | 10.0%
舍廊房 | 1 | 10.0%
新屋民宿 | 1 | 10.0%
原木屋 | 1 | 10.0%
大成餐館 | 1 | 10.0%
森里的早晨 | 1 | 10.0%

Most occurring characters

ValueCountFrequency (%)
4
 
10.0%
2
 
5.0%
2
 
5.0%
2
 
5.0%
2
 
5.0%
2
 
5.0%
2
 
5.0%
宿 2
 
5.0%
1
 
2.5%
1
 
2.5%
Other values (20) 20
50.0%

Most occurring categories

ValueCountFrequency (%)
Other Letter 39
97.5%
Other Punctuation 1
 
2.5%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
4
 
10.3%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
宿 2
 
5.1%
1
 
2.6%
1
 
2.6%
Other values (19) 19
48.7%
Other Punctuation
ValueCountFrequency (%)
? 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Han 39
97.5%
Common 1
 
2.5%

Most frequent character per script

Han
ValueCountFrequency (%)
4
 
10.3%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
宿 2
 
5.1%
1
 
2.6%
1
 
2.6%
Other values (19) 19
48.7%
Common
ValueCountFrequency (%)
? 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
CJK 39
97.5%
ASCII 1
 
2.5%

Most frequent character per block

CJK
ValueCountFrequency (%)
4
 
10.3%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
宿 2
 
5.1%
1
 
2.6%
1
 
2.6%
Other values (19) 19
48.7%
ASCII
ValueCountFrequency (%)
? 1
100.0%

city_do_cd
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 41
2nd row: 47
3rd row: 41
4th row: 41
5th row: 41

Common Values

Value | Count | Frequency (%)
41 | 86 | 86.0%
42 | 11 | 11.0%
46 | 2 | 2.0%
47 | 1 | 1.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

city_gn_gu_cd
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 3
Unique (%): 3.0%

Sample

1st row: 41820
2nd row: 47770
3rd row: 41820
4th row: 41820
5th row: 41820

Common Values

Value | Count | Frequency (%)
41820 | 86 | 86.0%
42150 | 11 | 11.0%
47770 | 1 | 1.0%
46900 | 1 | 1.0%
46790 | 1 | 1.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

city_do_kor_lang_nm
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 경기
2nd row: 경북
3rd row: 경기
4th row: 경기
5th row: 경기

Common Values

Value | Count | Frequency (%)
경기 | 86 | 86.0%
강원 | 11 | 11.0%
전남 | 2 | 2.0%
경북 | 1 | 1.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

city_gn_gu_kor_lang_nm
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 3
Unique (%): 3.0%

Sample

1st row: 가평군
2nd row: 영덕군
3rd row: 가평군
4th row: 가평군
5th row: 가평군

Common Values

Value | Count | Frequency (%)
가평군 | 86 | 86.0%
강릉시 | 11 | 11.0%
영덕군 | 1 | 1.0%
진도군 | 1 | 1.0%
화순군 | 1 | 1.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

gov_dn_kor_lang_nm
Categorical

HIGH CORRELATION 

Distinct: 10
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 3
Median length: 3
Mean length: 2.68
Min length: 2

Unique

Unique: 3
Unique (%): 3.0%

Sample

1st row: 가평읍
2nd row: 병곡면
3rd row: 가평읍
4th row: 가평읍
5th row: 가평읍

Common Values

Value | Count | Frequency (%)
북면 | 27 | 27.0%
설악면 | 19 | 19.0%
청평면 | 16 | 16.0%
가평읍 | 13 | 13.0%
강동면 | 11 | 11.0%
조종면 | 7 | 7.0%
상면 | 4 | 4.0%
병곡면 | 1 | 1.0%
진도읍 | 1 | 1.0%
동면 | 1 | 1.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

city_do_jan_lang_nm
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 3
Mean length: 3.1
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 京畿道
2nd row: <NA>
3rd row: 京畿道
4th row: 京畿道
5th row: 京畿道

Common Values

Value | Count | Frequency (%)
京畿道 | 79 | 79.0%
江原道 | 11 | 11.0%
<NA> | 10 | 10.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]
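
For city_do_jan_lang_nm the report shows 0 missing values, yet `<NA>` appears as a category with 10 occurrences and a maximum length of 4. This suggests, although the report does not confirm it, that the source file stores the literal text `<NA>` rather than a true null. A minimal sketch for normalising such placeholders before profiling follows; the placeholder set and the function name `normalise_missing` are assumptions.

```python
# Strings assumed to stand for "no value"; adjust to the actual source data.
PLACEHOLDERS = {"<NA>", "na", ""}

def normalise_missing(values, placeholders=PLACEHOLDERS):
    """Replace placeholder strings with None so they count as missing."""
    return [
        None if v is None or v.strip() in placeholders else v
        for v in values
    ]

# Counts taken from the Common Values table for city_do_jan_lang_nm.
city_do_jan_lang_nm = ["京畿道"] * 79 + ["江原道"] * 11 + ["<NA>"] * 10

cleaned = normalise_missing(city_do_jan_lang_nm)
missing = sum(v is None for v in cleaned)
distinct = len({v for v in cleaned if v is not None})

print(f"missing={missing}, distinct={distinct}")
```

With the placeholders treated as nulls, this column would report 10 missing values and 2 distinct values instead of 0 and 3.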

city_gn_gu_jan_lang_nm
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 3
Mean length: 3.1
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 加平郡
2nd row: <NA>
3rd row: 加平郡
4th row: 加平郡
5th row: 加平郡

Common Values

Value | Count | Frequency (%)
加平郡 | 79 | 79.0%
江陵市 | 11 | 11.0%
<NA> | 10 | 10.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

gov_dn_jan_lang_nm
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 7.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 3
Mean length: 2.79
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 加平邑
2nd row: <NA>
3rd row: 加平邑
4th row: 加平邑
5th row: 加平邑

Common Values

Value | Count | Frequency (%)
北面 | 27 | 27.0%
雪岳面 | 19 | 19.0%
?平面 | 16 | 16.0%
加平邑 | 13 | 13.0%
江東面 | 11 | 11.0%
<NA> | 10 | 10.0%
上面 | 4 | 4.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

base_ymd
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020-12-31
2nd row: 2020-12-31
3rd row: 2020-12-31
4th row: 2020-12-31
5th row: 2020-12-31

Common Values

Value | Count | Frequency (%)
2020-12-31 | 100 | 100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values listed above]

Interactions

[Figures: four interaction plots, not preserved in this text version]

Correlations

[Figure: correlation heatmap]

Variable | entrp_nm | xpos_lo | ypos_la | kor_lang_nm | eng_lang_nm | jan_lang_nm | chg_lang_nm | chb_lang_nm | city_do_cd | city_gn_gu_cd | city_do_kor_lang_nm | city_gn_gu_kor_lang_nm | gov_dn_kor_lang_nm | city_do_jan_lang_nm | city_gn_gu_jan_lang_nm | gov_dn_jan_lang_nm
entrp_nm | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.972 | 1.000 | 1.000 | 0.890
xpos_lo | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.997 | 0.997 | 1.000
ypos_la | 1.000 | 1.000 | 1.000 | 1.000 | NaN | NaN | NaN | NaN | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | NaN | NaN | NaN
kor_lang_nm | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.972 | 1.000 | 1.000 | 0.890
eng_lang_nm | 1.000 | 0.000 | NaN | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
jan_lang_nm | 1.000 | 0.000 | NaN | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
chg_lang_nm | 1.000 | 0.000 | NaN | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
chb_lang_nm | 1.000 | 0.000 | NaN | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000
city_do_cd | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.997 | 0.997 | 1.000
city_gn_gu_cd | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.997 | 0.997 | 1.000
city_do_kor_lang_nm | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.997 | 0.997 | 1.000
city_gn_gu_kor_lang_nm | 1.000 | 1.000 | 1.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.997 | 0.997 | 1.000
gov_dn_kor_lang_nm | 0.972 | 1.000 | 1.000 | 0.972 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
city_do_jan_lang_nm | 1.000 | 0.997 | NaN | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.997 | 0.997 | 0.997 | 0.997 | 1.000 | 1.000 | 0.997 | 1.000
city_gn_gu_jan_lang_nm | 1.000 | 0.997 | NaN | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.997 | 0.997 | 0.997 | 0.997 | 1.000 | 0.997 | 1.000 | 1.000
gov_dn_jan_lang_nm | 0.890 | 1.000 | NaN | 0.890 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

[Figure: correlation heatmap]

Variable | gov_dn_kor_lang_nm | city_do_jan_lang_nm | gov_dn_jan_lang_nm | city_gn_gu_cd | city_gn_gu_kor_lang_nm | city_gn_gu_jan_lang_nm | city_do_cd | city_do_kor_lang_nm
gov_dn_kor_lang_nm | 1.000 | 0.977 | 1.000 | 0.973 | 0.973 | 0.977 | 0.968 | 0.968
city_do_jan_lang_nm | 0.977 | 1.000 | 0.977 | 0.948 | 0.948 | 0.948 | 0.948 | 0.948
gov_dn_jan_lang_nm | 1.000 | 0.977 | 1.000 | 0.977 | 0.977 | 0.977 | 0.977 | 0.977
city_gn_gu_cd | 0.973 | 0.948 | 0.977 | 1.000 | 1.000 | 0.948 | 0.995 | 0.995
city_gn_gu_kor_lang_nm | 0.973 | 0.948 | 0.977 | 1.000 | 1.000 | 0.948 | 0.995 | 0.995
city_gn_gu_jan_lang_nm | 0.977 | 0.948 | 0.977 | 0.948 | 0.948 | 1.000 | 0.948 | 0.948
city_do_cd | 0.968 | 0.948 | 0.977 | 0.995 | 0.995 | 0.948 | 1.000 | 1.000
city_do_kor_lang_nm | 0.968 | 0.948 | 0.977 | 0.995 | 0.995 | 0.948 | 1.000 | 1.000

[Figure: correlation heatmap]

Variable | xpos_lo | ypos_la | city_do_cd | city_gn_gu_cd | city_do_kor_lang_nm | city_gn_gu_kor_lang_nm | gov_dn_kor_lang_nm | city_do_jan_lang_nm | city_gn_gu_jan_lang_nm | gov_dn_jan_lang_nm
xpos_lo | 1.000 | -0.007 | 0.995 | 0.990 | 0.995 | 0.990 | 0.963 | 0.948 | 0.948 | 0.977
ypos_la | -0.007 | 1.000 | 0.990 | 0.985 | 0.990 | 0.985 | 0.958 | 1.000 | 1.000 | 1.000
city_do_cd | 0.995 | 0.990 | 1.000 | 0.995 | 1.000 | 0.995 | 0.968 | 0.948 | 0.948 | 0.977
city_gn_gu_cd | 0.990 | 0.985 | 0.995 | 1.000 | 0.995 | 1.000 | 0.973 | 0.948 | 0.948 | 0.977
city_do_kor_lang_nm | 0.995 | 0.990 | 1.000 | 0.995 | 1.000 | 0.995 | 0.968 | 0.948 | 0.948 | 0.977
city_gn_gu_kor_lang_nm | 0.990 | 0.985 | 0.995 | 1.000 | 0.995 | 1.000 | 0.973 | 0.948 | 0.948 | 0.977
gov_dn_kor_lang_nm | 0.963 | 0.958 | 0.968 | 0.973 | 0.968 | 0.973 | 1.000 | 0.977 | 0.977 | 1.000
city_do_jan_lang_nm | 0.948 | 1.000 | 0.948 | 0.948 | 0.948 | 0.948 | 0.977 | 1.000 | 0.948 | 0.977
city_gn_gu_jan_lang_nm | 0.948 | 1.000 | 0.948 | 0.948 | 0.948 | 0.948 | 0.977 | 0.948 | 1.000 | 0.977
gov_dn_jan_lang_nm | 0.977 | 1.000 | 0.977 | 0.977 | 0.977 | 0.977 | 1.000 | 0.977 | 0.977 | 1.000
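
The report does not say which coefficient each matrix uses. For pairs of categorical variables, one common association measure is Cramér's V, and the sketch below shows the plain, uncorrected form as an illustration of why a code column and its name column (for example city_do_cd and city_do_kor_lang_nm) score 1.000. It is not claimed to match the profiler's exact computation, which may apply a bias correction or a different statistic. The function name `cramers_v` is hypothetical.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_totals = Counter(xs)
    y_totals = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for x, nx in x_totals.items():
        for y, ny in y_totals.items():
            expected = nx * ny / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(x_totals), len(y_totals)) - 1
    if k == 0:
        return float("nan")  # undefined when either variable is constant
    return math.sqrt(chi2 / (n * k))

# Province codes and names with the counts shown in this report.
codes = ["41"] * 86 + ["42"] * 11 + ["46"] * 2 + ["47"] * 1
names = ["경기"] * 86 + ["강원"] * 11 + ["전남"] * 2 + ["경북"] * 1

print(f"V(code, name) = {cramers_v(codes, names):.3f}")
```

Because each code maps to exactly one name, the contingency table is diagonal and the association is perfect. Many of the high-correlation alerts in this report reflect this kind of redundancy between code and name columns rather than an independent relationship.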

Missing values

[Figure: count plot] A simple visualization of nullity by column.
[Figure: nullity matrix] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
[Figure: nullity heatmap] The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
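
Nullity correlation can be illustrated as the Pearson correlation between two columns' missing-value indicators. This is a minimal sketch of that idea, not the profiler's own code; the function name `nullity_correlation` and the toy columns are hypothetical.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")  # undefined when a column is fully present or fully missing
    return cov / math.sqrt(var_a * var_b)

# Toy columns: the two translations are missing in exactly the same rows.
eng = ["Motel", None, None, "Bomulseom"]
jan = ["モ?テル", None, None, "?島"]
opposite = [None, "a", "b", None]

print(nullity_correlation(eng, jan))       # same pattern
print(nullity_correlation(eng, opposite))  # inverse pattern
```

A value near 1 means two columns tend to be missing in the same rows, which is the pattern to expect from the four translated-name columns here, since each is missing in roughly 90 of the 100 rows.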

Sample

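First rows

Index | entrp_nm | xpos_lo | ypos_la | kor_lang_nm | eng_lang_nm | jan_lang_nm | chg_lang_nm | chb_lang_nm | city_do_cd | city_gn_gu_cd | city_do_kor_lang_nm | city_gn_gu_kor_lang_nm | gov_dn_kor_lang_nm | city_do_jan_lang_nm | city_gn_gu_jan_lang_nm | gov_dn_jan_lang_nm | base_ymd
0 | 가평민박 | 1000867.0 | 1971396.0 | 가평민박 | <NA> | <NA> | <NA> | <NA> | 41 | 41820 | 경기 | 가평군 | 가평읍 | 京畿道 | 加平郡 | 加平邑 | 2020-12-31
1 | 해온안 | 129.385268 | 36.608114 | 해온안 | <NA> | <NA> | <NA> | <NA> | 47 | 47770 | 경북 | 영덕군 | 병곡면 | <NA> | <NA> | <NA> | 2020-12-31
2 | 강변민박 | 1002184.0 | 1978569.0 | 강변민박 | <NA> | <NA> | <NA> | <NA> | 41 | 41820 | 경기 | 가평군 | 가평읍 | 京畿道 | 加平郡 | 加平邑 | 2020-12-31
3 | 강언덕 | 1003076.0 | 1974845.0 | 강언덕 | <NA> | <NA> | <NA> | <NA> | 41 | 41820 | 경기 | 가평군 | 가평읍 | 京畿道 | 加平郡 | 加平邑 | 2020-12-31
4 | 맑은물산장 | 998431.0 | 1984328.0 | 맑은물산장 | <NA> | <NA> | <NA> | <NA> | 41 | 41820 | 경기 | 가평군 | 가평읍 | 京畿道 | 加平郡 | 加平邑 | 2020-12-31
5 | 산수민박 | 998904.0 | 1978239.0 | 산수민박 | <NA> | <NA> | <NA> | <NA> | 41 | 41820 | 경기 | 가평군 | 가평읍 | 京畿道 | 加平郡 | 加平邑 | 2020-12-31
6 | 산유펜션 | 1000603.0 | 1973393.0 | 산유펜션 | <NA> | <NA> | <NA> | <NA> | 41 | 41820 | 경기 | 가평군 | 가평읍 | 京畿道 | 加平郡 | 加平邑 | 2020-12-31
7 | 진도해안로민박 | 126.240059 | 34.490406 | 진도해안로민박 | <NA> | <NA> | <NA> | <NA> | 46 | 46900 | 전남 | 진도군 | 진도읍 | <NA> | <NA> | <NA> | 2020-12-31
8 | 안골민박 | 1002022.0 | 1978864.0 | 안골민박 | <NA> | <NA> | <NA> | <NA> | 41 | 41820 | 경기 | 가평군 | 가평읍 | 京畿道 | 加平郡 | 加平邑 | 2020-12-31
9 | 여울펜션 | 1000701.0 | 1982979.0 | 여울펜션 | <NA> | <NA> | <NA> | <NA> | 41 | 41820 | 경기 | 가평군 | 가평읍 | 京畿道 | 加平郡 | 加平邑 | 2020-12-31

Last rows

Index | entrp_nm | xpos_lo | ypos_la | kor_lang_nm | eng_lang_nm | jan_lang_nm | chg_lang_nm | chb_lang_nm | city_do_cd | city_gn_gu_cd | city_do_kor_lang_nm | city_gn_gu_kor_lang_nm | gov_dn_kor_lang_nm | city_do_jan_lang_nm | city_gn_gu_jan_lang_nm | gov_dn_jan_lang_nm | base_ymd
90 | 고성산민박 | 1135134.0 | 1966650.0 | 고성산민박 | <NA> | <NA> | <NA> | <NA> | 42 | 42150 | 강원 | 강릉시 | 강동면 | 江原道 | 江陵市 | 江東面 | 2020-12-31
91 | 고준재민박 | 1128468.0 | 1972954.0 | 고준재민박 | <NA> | <NA> | <NA> | <NA> | 42 | 42150 | 강원 | 강릉시 | 강동면 | 江原道 | 江陵市 | 江東面 | 2020-12-31
92 | 나폴리 | 1136978.0 | 1963941.0 | 나폴리 | Motel | モ?テル | 汽?旅? | 汽車旅館 | 42 | 42150 | 강원 | 강릉시 | 강동면 | 江原道 | 江陵市 | 江東面 | 2020-12-31
93 | 단골민박 | 1135355.0 | 1966462.0 | 단골민박 | <NA> | <NA> | <NA> | <NA> | 42 | 42150 | 강원 | 강릉시 | 강동면 | 江原道 | 江陵市 | 江東面 | 2020-12-31
94 | 등명민박 | 1133637.0 | 1968163.0 | 등명민박 | <NA> | <NA> | <NA> | <NA> | 42 | 42150 | 강원 | 강릉시 | 강동면 | 江原道 | 江陵市 | 江東面 | 2020-12-31
95 | 등명슈퍼민박 | 1133658.0 | 1968114.0 | 등명슈퍼민박 | <NA> | <NA> | <NA> | <NA> | 42 | 42150 | 강원 | 강릉시 | 강동면 | 江原道 | 江陵市 | 江東面 | 2020-12-31
96 | 모네펜션 | 1130767.0 | 1971055.0 | 모네펜션 | <NA> | <NA> | <NA> | <NA> | 42 | 42150 | 강원 | 강릉시 | 강동면 | 江原道 | 江陵市 | 江東面 | 2020-12-31
97 | 바다와해돋이 | 1135366.0 | 1966623.0 | 바다와해돋이 | <NA> | <NA> | <NA> | <NA> | 42 | 42150 | 강원 | 강릉시 | 강동면 | 江原道 | 江陵市 | 江東面 | 2020-12-31
98 | 보금자리민박 | 1135054.0 | 1966664.0 | 보금자리민박 | <NA> | <NA> | <NA> | <NA> | 42 | 42150 | 강원 | 강릉시 | 강동면 | 江原道 | 江陵市 | 江東面 | 2020-12-31
99 | 부부식당 | 1131116.0 | 1971363.0 | 부부식당 | <NA> | <NA> | <NA> | <NA> | 42 | 42150 | 강원 | 강릉시 | 강동면 | 江原道 | 江陵市 | 江東面 | 2020-12-31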