gimi9 Pandas Profiling

Dataset statistics

Number of variables	13
Number of observations	197
Missing cells	1
Missing cells (%)	< 0.1%
Duplicate rows	0
Duplicate rows (%)	0.0%
Total size in memory	20.5 KiB
Average record size in memory	106.7 B
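
The percentages and sizes above follow from the raw counts by simple arithmetic. The sketch below recomputes them from the figures in this table only; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table above.
n_variables = 13
n_observations = 197
missing_cells = 1
avg_record_bytes = 106.7  # "Average record size in memory"

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

# Share of missing cells, in percent (shown as "< 0.1%" in the report).
missing_pct = 100 * missing_cells / total_cells

# Total memory implied by the average record size, in KiB.
total_kib = avg_record_bytes * n_observations / 1024

print(total_cells, f"{missing_pct:.3f}%", f"{total_kib:.1f} KiB")
```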

Variable types

Text	10
Numeric	2
Categorical	1

Dataset

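Description	General information from the country overview data published on the Ministry of Foreign Affairs website, provided in CSV format. (Data update cycle: 12 months; for real-time information, see the API of the same name.)
Author	Ministry of Foreign Affairs (외교부)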
URL	https://www.data.go.kr/data/15076557/fileData.do

Alerts

`인구` is highly overall correlated with `면적`	High correlation
`면적` is highly overall correlated with `인구`	High correlation
`기후` is highly imbalanced (62.1%)	Imbalance
`한글국가명` has unique values	Unique
`영문국가명` has unique values	Unique
`수도` has unique values	Unique
`면적` has unique values	Unique
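
The 62.1% imbalance figure for `기후` can be reproduced from the category counts listed later in this report (150, 3, 3, and 41 values that occur once), assuming the score is one minus the normalized Shannon entropy of the category frequencies. That definition is an assumption here, not something the report states, and `imbalance_score` is a hypothetical helper name.

```python
import math

# Category counts for 기후 as listed in this report:
# "(NULL)" 150 times, two values 3 times each, and 41 values once each.
counts = [150, 3, 3] + [1] * 41


def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the
    category frequencies and k is the number of distinct categories.
    0 means perfectly balanced; values near 1 mean one category dominates.
    (Assumed definition, used only to illustrate the alert.)"""
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log(c / total) for c in counts)
    return 1 - entropy / math.log(k)


score = imbalance_score(counts)
print(f"{100 * score:.1f}%")
```

Under this assumed definition, the dominant `(NULL)` category (150 of 197 rows) is what drives the score up.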

Reproduction

Analysis started	2024-04-21 01:43:36.912858
Analysis finished	2024-04-21 01:43:40.270341
Duration	3.36 seconds
Software version	ydata-profiling v4.5.1
Download configuration	config.json

한글국가명
Text

UNIQUE

Distinct	197
Distinct (%)	100.0%
Missing	0
Missing (%)	0.0%
Memory size	1.7 KiB

Length

Max length	10
Median length	9
Mean length	3.964467
Min length	2

Characters and Unicode

Total characters	781
Distinct characters	180
Distinct categories	1
Distinct scripts	1
Distinct blocks	1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
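
As a rough illustration of the per-category counts in these sections, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. The two input strings are sample values shown in this report; the `CATEGORY_LABELS` mapping and `category_counts` helper are illustrative names, and the report itself may compute these tables differently.

```python
import unicodedata
from collections import Counter

# Readable labels for the general-category codes that occur below.
CATEGORY_LABELS = {
    "Lo": "Other Letter",
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}


def category_counts(values):
    """Count Unicode general categories over every character in values."""
    return Counter(unicodedata.category(ch) for value in values for ch in value)


# Sample values taken from the 한글국가명 and 수도 columns of this report.
counts = category_counts(["가나", "아크라(Accra)"])

for code, count in counts.most_common():
    print(CATEGORY_LABELS.get(code, code), count)
```

Script and block breakdowns are not available from `unicodedata` and would need an additional data source.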

Unique

Unique	197
Unique (%)	100.0%

Sample

1st row	가나
2nd row	가봉
3rd row	가이아나
4th row	감비아
5th row	과테말라

Value	Count	Frequency (%)
가나	1	0.5%
슬로베니아	1	0.5%
오스트리아	1	0.5%
온두라스	1	0.5%
요르단	1	0.5%
우간다	1	0.5%
우루과이	1	0.5%
우즈베키스탄	1	0.5%
우크라이나	1	0.5%
이라크	1	0.5%
Other values (187)	187	94.9%

Most occurring characters

Value	Count	Frequency (%)
아	55	7.0%
리	31	4.0%
스	28	3.6%
니	24	3.1%
르	24	3.1%
이	23	2.9%
라	20	2.6%
바	16	2.0%
나	16	2.0%
도	14	1.8%
Other values (170)	530	67.9%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	781	100.0%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
아	55	7.0%
리	31	4.0%
스	28	3.6%
니	24	3.1%
르	24	3.1%
이	23	2.9%
라	20	2.6%
바	16	2.0%
나	16	2.0%
도	14	1.8%
Other values (170)	530	67.9%

Most occurring scripts

Value	Count	Frequency (%)
Hangul	781	100.0%

Most frequent character per script

Hangul

Value	Count	Frequency (%)
아	55	7.0%
리	31	4.0%
스	28	3.6%
니	24	3.1%
르	24	3.1%
이	23	2.9%
라	20	2.6%
바	16	2.0%
나	16	2.0%
도	14	1.8%
Other values (170)	530	67.9%

Most occurring blocks

Value	Count	Frequency (%)
Hangul	781	100.0%

Most frequent character per block

Hangul

Value	Count	Frequency (%)
아	55	7.0%
리	31	4.0%
스	28	3.6%
니	24	3.1%
르	24	3.1%
이	23	2.9%
라	20	2.6%
바	16	2.0%
나	16	2.0%
도	14	1.8%
Other values (170)	530	67.9%

영문국가명
Text

UNIQUE

Distinct	197
Distinct (%)	100.0%
Missing	0
Missing (%)	0.0%
Memory size	1.7 KiB

Length

Max length	30
Median length	26
Mean length	8.5380711
Min length	3

Characters and Unicode

Total characters	1682
Distinct characters	57
Distinct categories	5
Distinct scripts	2
Distinct blocks	1

Unique

Unique	197
Unique (%)	100.0%

Sample

1st row	Ghana
2nd row	Gabon
3rd row	Guyana
4th row	Gambia
5th row	Guatemala

Value	Count	Frequency (%)
republic	5	2.0%
	5	2.0%
of	4	1.6%
united	3	1.2%
st	3	1.2%
guinea	3	1.2%
islands	3	1.2%
states	2	0.8%
sudan	2	0.8%
south	2	0.8%
Other values (217)	219	87.3%

Most occurring characters

Value	Count	Frequency (%)
a	249	14.8%
i	149	8.9%
n	125	7.4%
e	117	7.0%
o	90	5.4%
r	90	5.4%
u	66	3.9%
t	64	3.8%
l	58	3.4%
s	57	3.4%
Other values (47)	617	36.7%

Most occurring categories

Value	Count	Frequency (%)
Lowercase Letter	1361	80.9%
Uppercase Letter	251	14.9%
Space Separator	54	3.2%
Other Punctuation	12	0.7%
Dash Punctuation	4	0.2%

Most frequent character per category

Lowercase Letter

Value	Count	Frequency (%)
a	249	18.3%
i	149	10.9%
n	125	9.2%
e	117	8.6%
o	90	6.6%
r	90	6.6%
u	66	4.8%
t	64	4.7%
l	58	4.3%
s	57	4.2%
Other values (16)	296	21.7%

Uppercase Letter

Value	Count	Frequency (%)
S	29	11.6%
C	20	8.0%
M	20	8.0%
B	19	7.6%
A	17	6.8%
T	15	6.0%
G	15	6.0%
N	14	5.6%
L	13	5.2%
P	12	4.8%
Other values (14)	77	30.7%

Other Punctuation

Value	Count	Frequency (%)
.	3	25.0%
&	3	25.0%
?	3	25.0%
:	2	16.7%
'	1	8.3%

Space Separator

Value	Count	Frequency (%)
	54	100.0%

Dash Punctuation

Value	Count	Frequency (%)
-	4	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Latin	1612	95.8%
Common	70	4.2%

Most frequent character per script

Latin

Value	Count	Frequency (%)
a	249	15.4%
i	149	9.2%
n	125	7.8%
e	117	7.3%
o	90	5.6%
r	90	5.6%
u	66	4.1%
t	64	4.0%
l	58	3.6%
s	57	3.5%
Other values (40)	547	33.9%

Common

Value	Count	Frequency (%)
	54	77.1%
-	4	5.7%
.	3	4.3%
&	3	4.3%
?	3	4.3%
:	2	2.9%
'	1	1.4%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	1682	100.0%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
a	249	14.8%
i	149	8.9%
n	125	7.4%
e	117	7.0%
o	90	5.4%
r	90	5.4%
u	66	3.9%
t	64	3.8%
l	58	3.4%
s	57	3.4%
Other values (47)	617	36.7%

국가코드(ISO 2자리)
Text

Distinct	196
Distinct (%)	100.0%
Missing	1
Missing (%)	0.5%
Memory size	1.7 KiB

Length

Max length	2
Median length	2
Mean length	2
Min length	2

Characters and Unicode

Total characters	392
Distinct characters	26
Distinct categories	1
Distinct scripts	1
Distinct blocks	1

Unique

Unique	196
Unique (%)	100.0%

Sample

1st row	GH
2nd row	GA
3rd row	GY
4th row	GM
5th row	GT

Value	Count	Frequency (%)
gh	1	0.5%
si	1	0.5%
at	1	0.5%
hn	1	0.5%
jo	1	0.5%
ug	1	0.5%
uy	1	0.5%
uz	1	0.5%
ua	1	0.5%
iq	1	0.5%
Other values (186)	186	94.9%

Most occurring characters

Value	Count	Frequency (%)
M	30	7.7%
S	27	6.9%
T	24	6.1%
G	23	5.9%
A	22	5.6%
C	21	5.4%
B	21	5.4%
N	19	4.8%
E	19	4.8%
L	19	4.8%
Other values (16)	167	42.6%

Most occurring categories

Value	Count	Frequency (%)
Uppercase Letter	392	100.0%

Most frequent character per category

Uppercase Letter

Value	Count	Frequency (%)
M	30	7.7%
S	27	6.9%
T	24	6.1%
G	23	5.9%
A	22	5.6%
C	21	5.4%
B	21	5.4%
N	19	4.8%
E	19	4.8%
L	19	4.8%
Other values (16)	167	42.6%

Most occurring scripts

Value	Count	Frequency (%)
Latin	392	100.0%

Most frequent character per script

Latin

Value	Count	Frequency (%)
M	30	7.7%
S	27	6.9%
T	24	6.1%
G	23	5.9%
A	22	5.6%
C	21	5.4%
B	21	5.4%
N	19	4.8%
E	19	4.8%
L	19	4.8%
Other values (16)	167	42.6%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	392	100.0%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
M	30	7.7%
S	27	6.9%
T	24	6.1%
G	23	5.9%
A	22	5.6%
C	21	5.4%
B	21	5.4%
N	19	4.8%
E	19	4.8%
L	19	4.8%
Other values (16)	167	42.6%

수도
Text

UNIQUE

Distinct	197
Distinct (%)	100.0%
Missing	0
Missing (%)	0.0%
Memory size	1.7 KiB

Length

Max length	92
Median length	49
Mean length	17.771574
Min length	2

Characters and Unicode

Total characters	3501
Distinct characters	312
Distinct categories	11
Distinct scripts	4
Distinct blocks	5

Unique

Unique	197
Unique (%)	100.0%

Sample

1st row	아크라(Accra)
2nd row	리브르빌(Libreville)
3rd row	조지타운(Georgetown)
4th row	반줄(Banjul)
5th row	과테말라시티(Guatemala City)

Value	Count	Frequency (%)
약	15	3.4%
인구	13	3.0%
명	10	2.3%
	7	1.6%
기준	5	1.1%
city	4	0.9%
행정수도	3	0.7%
수도	3	0.7%
la	2	0.5%
33만명	2	0.5%
Other values (366)	373	85.4%

Most occurring characters

Value	Count	Frequency (%)
	242	6.9%
a	205	5.9%
(	195	5.6%
)	195	5.6%
o	103	2.9%
,	99	2.8%
i	98	2.8%
n	88	2.5%
r	76	2.2%
만	75	2.1%
Other values (302)	2125	60.7%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	1139	32.5%
Lowercase Letter	1096	31.3%
Decimal Number	270	7.7%
Space Separator	242	6.9%
Uppercase Letter	212	6.1%
Open Punctuation	195	5.6%
Close Punctuation	195	5.6%
Other Punctuation	146	4.2%
Dash Punctuation	4	0.1%
Initial Punctuation	1	< 0.1%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
만	75	6.6%
명	71	6.2%
라	34	3.0%
스	32	2.8%
리	28	2.5%
아	28	2.5%
도	23	2.0%
바	21	1.8%
마	19	1.7%
트	19	1.7%
Other values (230)	789	69.3%

Lowercase Letter

Value	Count	Frequency (%)
a	205	18.7%
o	103	9.4%
i	98	8.9%
n	88	8.0%
r	76	6.9%
e	71	6.5%
u	59	5.4%
t	56	5.1%
s	53	4.8%
l	46	4.2%
Other values (15)	241	22.0%

Uppercase Letter

Value	Count	Frequency (%)
S	19	9.0%
B	19	9.0%
M	19	9.0%
P	18	8.5%
A	17	8.0%
D	15	7.1%
N	13	6.1%
T	13	6.1%
C	13	6.1%
L	12	5.7%
Other values (13)	54	25.5%

Decimal Number

Value	Count	Frequency (%)
2	48	17.8%
1	47	17.4%
0	42	15.6%
7	30	11.1%
4	23	8.5%
3	21	7.8%
8	17	6.3%
6	17	6.3%
9	13	4.8%
5	12	4.4%

Other Punctuation

Value	Count	Frequency (%)
,	99	67.8%
.	20	13.7%
:	8	5.5%
?	7	4.8%
'	7	4.8%
/	2	1.4%
※	2	1.4%
·	1	0.7%

Space Separator

Value	Count	Frequency (%)
	242	100.0%

Open Punctuation

Value	Count	Frequency (%)
(	195	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	195	100.0%

Dash Punctuation

Value	Count	Frequency (%)
-	4	100.0%

Initial Punctuation

Value	Count	Frequency (%)
‘	1	100.0%

Final Punctuation

Value	Count	Frequency (%)
’	1	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Latin	1308	37.4%
Hangul	1135	32.4%
Common	1054	30.1%
Han	4	0.1%

Most frequent character per script

Hangul

Value	Count	Frequency (%)
만	75	6.6%
명	71	6.3%
라	34	3.0%
스	32	2.8%
리	28	2.5%
아	28	2.5%
도	23	2.0%
바	21	1.9%
마	19	1.7%
트	19	1.7%
Other values (227)	785	69.2%

Latin

Value	Count	Frequency (%)
a	205	15.7%
o	103	7.9%
i	98	7.5%
n	88	6.7%
r	76	5.8%
e	71	5.4%
u	59	4.5%
t	56	4.3%
s	53	4.1%
l	46	3.5%
Other values (38)	453	34.6%

Common

Value	Count	Frequency (%)
	242	23.0%
(	195	18.5%
)	195	18.5%
,	99	9.4%
2	48	4.6%
1	47	4.5%
0	42	4.0%
7	30	2.8%
4	23	2.2%
3	21	2.0%
Other values (14)	112	10.6%

Han

Value	Count	Frequency (%)
京	2	50.0%
東	1	25.0%
北	1	25.0%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	2357	67.3%
Hangul	1135	32.4%
CJK	4	0.1%
Punctuation	4	0.1%
None	1	< 0.1%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
	242	10.3%
a	205	8.7%
(	195	8.3%
)	195	8.3%
o	103	4.4%
,	99	4.2%
i	98	4.2%
n	88	3.7%
r	76	3.2%
e	71	3.0%
Other values (58)	985	41.8%

Hangul

Value	Count	Frequency (%)
만	75	6.6%
명	71	6.3%
라	34	3.0%
스	32	2.8%
리	28	2.5%
아	28	2.5%
도	23	2.0%
바	21	1.9%
마	19	1.7%
트	19	1.7%
Other values (227)	785	69.2%

CJK

Value	Count	Frequency (%)
京	2	50.0%
東	1	25.0%
北	1	25.0%

Punctuation

Value	Count	Frequency (%)
※	2	50.0%
‘	1	25.0%
’	1	25.0%

None

Value	Count	Frequency (%)
·	1	100.0%

인구
Real number (ℝ)

HIGH CORRELATION

Distinct	195
Distinct (%)	99.0%
Missing	0
Missing (%)	0.0%
Infinite	0
Infinite (%)	0.0%
Mean	39521004

Minimum	274
Maximum	1.40967 × 10⁹
Zeros	0
Zeros (%)	0.0%
Negative	0
Negative (%)	0.0%
Memory size	1.9 KiB

Quantile statistics

Minimum	274
5-th percentile	54211
Q1	1580000
median	8610000
Q3	28430000
95-th percentile	1.2612 × 10⁸
Maximum	1.40967 × 10⁹
Range	1.4096697 × 10⁹
Interquartile range (IQR)	26850000

Descriptive statistics

Standard deviation	1.4696817 × 10⁸
Coefficient of variation (CV)	3.7187357
Kurtosis	76.290503
Mean	39521004
Median Absolute Deviation (MAD)	8170000
Skewness	8.4222966
Sum	7.7856378 × 10⁹
Variance	2.1599643 × 10¹⁶
Monotonicity	Not monotonic
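
Several of the derived statistics above can be checked against one another. The sketch below recomputes them from the rounded figures reported for `인구`, so results agree with the report only up to rounding; variable names are illustrative.

```python
import math

# Figures copied from the 인구 summary above.
n = 197                      # number of observations
mean = 39521004
std = 1.4696817e8            # standard deviation
q1, q3 = 1580000, 28430000
minimum, maximum = 274, 1409670000

cv = std / mean              # coefficient of variation
iqr = q3 - q1                # interquartile range
value_range = maximum - minimum
variance = std ** 2
total = mean * n             # sum of all values, since no values are missing

print(f"CV={cv:.4f} IQR={iqr} range={value_range}")
print(f"variance={variance:.4e} sum={total:.4e}")
```

A coefficient of variation well above 1, together with the large skewness and kurtosis, is consistent with a few very large populations dominating the distribution.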

Histogram with fixed size bins (bins=50)

Common values

Value	Count	Frequency (%)
440000	2	1.0%
110000	2	1.0%
32400000	1	0.5%
62390000	1	0.5%
11310000	1	0.5%
47000000	1	0.5%
3480000	1	0.5%
33900000	1	0.5%
41600000	1	0.5%
39650000	1	0.5%
Other values (185)	185	93.9%

Minimum 10 values

Value	Count	Frequency (%)
274	1	0.5%
1000	1	0.5%
1600	1	0.5%
10800	1	0.5%
11925	1	0.5%
17601	1	0.5%
18000	1	0.5%
31223	1	0.5%
33745	1	0.5%
39055	1	0.5%

Maximum 10 values

Value	Count	Frequency (%)
1409670000	1	0.5%
1407000000	1	0.5%
334910000	1	0.5%
277430000	1	0.5%
230000000	1	0.5%
218540000	1	0.5%
210000000	1	0.5%
170000000	1	0.5%
143200000	1	0.5%
130120000	1	0.5%

인구설명
Text

Distinct	97
Distinct (%)	49.2%
Missing	0
Missing (%)	0.0%
Memory size	1.7 KiB

Length

Max length	49
Median length	20
Mean length	11.253807
Min length	5

Characters and Unicode

Total characters	2217
Distinct characters	111
Distinct categories	11
Distinct scripts	3
Distinct blocks	3

Unique

Unique	74
Unique (%)	37.6%

Sample

1st row	('22, EIU)
2nd row	(‘21 World Bank)
3rd row	(2019 CIA)
4th row	('22 World Bank)
5th row	(2022, IMF)

Value	Count	Frequency (%)
imf	45	11.0%
world	38	9.3%
bank	37	9.0%
2022	30	7.3%
2021	25	6.1%
22	19	4.6%
null	18	4.4%
21	14	3.4%
2023	12	2.9%
cia	12	2.9%
Other values (78)	160	39.0%

Most occurring characters

Value	Count	Frequency (%)
2	333	15.0%
	217	9.8%
)	174	7.8%
(	174	7.8%
0	110	5.0%
1	73	3.3%
I	69	3.1%
,	67	3.0%
'	54	2.4%
W	50	2.3%
Other values (101)	896	40.4%

Most occurring categories

Value	Count	Frequency (%)
Decimal Number	576	26.0%
Uppercase Letter	380	17.1%
Lowercase Letter	326	14.7%
Space Separator	217	9.8%
Close Punctuation	174	7.8%
Open Punctuation	174	7.8%
Other Punctuation	165	7.4%
Other Letter	157	7.1%
Dash Punctuation	24	1.1%
Initial Punctuation	19	0.9%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
년	10	6.4%
정	10	6.4%
추	10	6.4%
월	9	5.7%
계	9	5.7%
준	8	5.1%
기	8	5.1%
세	5	3.2%
만	5	3.2%
명	4	2.5%
Other values (50)	79	50.3%

Lowercase Letter

Value	Count	Frequency (%)
o	50	15.3%
r	48	14.7%
d	43	13.2%
l	43	13.2%
k	38	11.7%
a	38	11.7%
n	37	11.3%
e	10	3.1%
t	6	1.8%
m	5	1.5%
Other values (5)	8	2.5%

Uppercase Letter

Value	Count	Frequency (%)
I	69	18.2%
W	50	13.2%
M	47	12.4%
F	47	12.4%
B	45	11.8%
L	36	9.5%
U	26	6.8%
N	20	5.3%
A	15	3.9%
C	15	3.9%
Other values (4)	10	2.6%

Decimal Number

Value	Count	Frequency (%)
2	333	57.8%
0	110	19.1%
1	73	12.7%
3	35	6.1%
7	8	1.4%
5	5	0.9%
9	4	0.7%
8	3	0.5%
6	3	0.5%
4	2	0.3%

Other Punctuation

Value	Count	Frequency (%)
,	67	40.6%
'	54	32.7%
.	35	21.2%
/	7	4.2%
%	1	0.6%
:	1	0.6%

Space Separator

Value	Count	Frequency (%)
	217	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	174	100.0%

Open Punctuation

Value	Count	Frequency (%)
(	174	100.0%

Dash Punctuation

Value	Count	Frequency (%)
-	24	100.0%

Initial Punctuation

Value	Count	Frequency (%)
‘	19	100.0%

Final Punctuation

Value	Count	Frequency (%)
’	5	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Common	1354	61.1%
Latin	706	31.8%
Hangul	157	7.1%

Most frequent character per script

Hangul

Value	Count	Frequency (%)
년	10	6.4%
정	10	6.4%
추	10	6.4%
월	9	5.7%
계	9	5.7%
준	8	5.1%
기	8	5.1%
세	5	3.2%
만	5	3.2%
명	4	2.5%
Other values (50)	79	50.3%

Latin

Value	Count	Frequency (%)
I	69	9.8%
W	50	7.1%
o	50	7.1%
r	48	6.8%
M	47	6.7%
F	47	6.7%
B	45	6.4%
d	43	6.1%
l	43	6.1%
k	38	5.4%
Other values (19)	226	32.0%

Common

Value	Count	Frequency (%)
2	333	24.6%
	217	16.0%
)	174	12.9%
(	174	12.9%
0	110	8.1%
1	73	5.4%
,	67	4.9%
'	54	4.0%
.	35	2.6%
3	35	2.6%
Other values (12)	82	6.1%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	2036	91.8%
Hangul	157	7.1%
Punctuation	24	1.1%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
2	333	16.4%
	217	10.7%
)	174	8.5%
(	174	8.5%
0	110	5.4%
1	73	3.6%
I	69	3.4%
,	67	3.3%
'	54	2.7%
W	50	2.5%
Other values (39)	715	35.1%

Punctuation

Value	Count	Frequency (%)
‘	19	79.2%
’	5	20.8%

Hangul

Value	Count	Frequency (%)
년	10	6.4%
정	10	6.4%
추	10	6.4%
월	9	5.7%
계	9	5.7%
준	8	5.1%
기	8	5.1%
세	5	3.2%
만	5	3.2%
명	4	2.5%
Other values (50)	79	50.3%

면적
Real number (ℝ)

HIGH CORRELATION UNIQUE

Distinct	197
Distinct (%)	100.0%
Missing	0
Missing (%)	0.0%
Infinite	0
Infinite (%)	0.0%
Mean	679697.84

Minimum	0.44
Maximum	17090000
Zeros	0
Zeros (%)	0.0%
Negative	0
Negative (%)	0.0%
Memory size	1.9 KiB

Quantile statistics

Minimum	0.44
5-th percentile	290.6
Q1	20770
median	113370
Q3	527968
95-th percentile	2352234.6
Maximum	17090000
Range	17090000
Interquartile range (IQR)	507198

Descriptive statistics

Standard deviation	1907099.4
Coefficient of variation (CV)	2.8058047
Kurtosis	36.572186
Mean	679697.84
Median Absolute Deviation (MAD)	112914.7
Skewness	5.6100353
Sum	1.3390047 × 10⁸
Variance	3.6370282 × 10¹²
Monotonicity	Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value	Count	Frequency (%)
238537.0	1	0.5%
527968.0	1	0.5%
83879.0	1	0.5%
112492.0	1	0.5%
89342.0	1	0.5%
241038.0	1	0.5%
176000.0	1	0.5%
447400.0	1	0.5%
603500.0	1	0.5%
441839.0	1	0.5%
Other values (187)	187	94.9%

Minimum 10 values

Value	Count	Frequency (%)
0.44	1	0.5%
2.02	1	0.5%
21.0	1	0.5%
25.9	1	0.5%
60.5	1	0.5%
160.0	1	0.5%
182.0	1	0.5%
240.0	1	0.5%
259.0	1	0.5%
261.0	1	0.5%

Maximum 10 values

Value	Count	Frequency (%)
17090000.0	1	0.5%
9970000.0	1	0.5%
9830000.0	1	0.5%
9600000.0	1	0.5%
8510000.0	1	0.5%
7690000.0	1	0.5%
3287782.0	1	0.5%
2790000.0	1	0.5%
2724900.0	1	0.5%
2381741.0	1	0.5%

면적설명
Text

Distinct	150
Distinct (%)	76.1%
Missing	0
Missing (%)	0.0%
Memory size	1.7 KiB

Length

Max length	52
Median length	47
Mean length	13.670051
Min length	6

Characters and Unicode

Total characters	2693
Distinct characters	174
Distinct categories	9
Distinct scripts	4
Distinct blocks	5

Unique

Unique	125
Unique (%)	63.5%

Sample

1st row	(한반도 1.1배)
2nd row	(한반도의 1.2배)
3rd row	(NULL)
4th row	(한반도의 1/20)
5th row	(한반도의 1/2)

Value	Count	Frequency (%)
한반도의	130	23.0%
약	39	6.9%
한반도	14	2.5%
null	11	1.9%
세계	11	1.9%
구성	11	1.9%
1/4	10	1.8%
1/3	9	1.6%
도서로	7	1.2%
1/2	7	1.2%
Other values (207)	316	55.9%

Most occurring characters

Value	Count	Frequency (%)
	375	13.9%
(	201	7.5%
)	201	7.5%
도	173	6.4%
의	163	6.1%
한	158	5.9%
반	147	5.5%
1	133	4.9%
배	100	3.7%
/	74	2.7%
Other values (164)	968	35.9%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	1169	43.4%
Decimal Number	446	16.6%
Space Separator	375	13.9%
Open Punctuation	201	7.5%
Close Punctuation	201	7.5%
Other Punctuation	171	6.3%
Uppercase Letter	69	2.6%
Lowercase Letter	57	2.1%
Other Symbol	4	0.1%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
도	173	14.8%
의	163	13.9%
한	158	13.5%
반	147	12.6%
배	100	8.6%
약	45	3.8%
서	18	1.5%
개	16	1.4%
로	16	1.4%
성	15	1.3%
Other values (114)	318	27.2%

Lowercase Letter

Value	Count	Frequency (%)
a	13	22.8%
o	7	12.3%
n	6	10.5%
i	5	8.8%
r	4	7.0%
u	4	7.0%
g	3	5.3%
e	3	5.3%
d	2	3.5%
t	2	3.5%
Other values (7)	8	14.0%

Uppercase Letter

Value	Count	Frequency (%)
L	22	31.9%
U	12	17.4%
N	11	15.9%
A	7	10.1%
C	6	8.7%
I	5	7.2%
M	2	2.9%
R	1	1.4%
G	1	1.4%
B	1	1.4%

Decimal Number

Value	Count	Frequency (%)
1	133	29.8%
2	64	14.3%
3	58	13.0%
5	47	10.5%
0	36	8.1%
4	32	7.2%
7	25	5.6%
6	21	4.7%
8	18	4.0%
9	12	2.7%

Other Punctuation

Value	Count	Frequency (%)
/	74	43.3%
.	61	35.7%
,	21	12.3%
%	6	3.5%
※	4	2.3%
:	3	1.8%
*	2	1.2%

Other Symbol

Value	Count	Frequency (%)
㎢	3	75.0%
㎞	1	25.0%

Space Separator

Value	Count	Frequency (%)
	375	100.0%

Open Punctuation

Value	Count	Frequency (%)
(	201	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	201	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Common	1398	51.9%
Hangul	1163	43.2%
Latin	126	4.7%
Han	6	0.2%

Most frequent character per script

Hangul

Value	Count	Frequency (%)
도	173	14.9%
의	163	14.0%
한	158	13.6%
반	147	12.6%
배	100	8.6%
약	45	3.9%
서	18	1.5%
개	16	1.4%
로	16	1.4%
성	15	1.3%
Other values (108)	312	26.8%

Latin

Value	Count	Frequency (%)
L	22	17.5%
a	13	10.3%
U	12	9.5%
N	11	8.7%
A	7	5.6%
o	7	5.6%
C	6	4.8%
n	6	4.8%
I	5	4.0%
i	5	4.0%
Other values (18)	32	25.4%

Common

Value	Count	Frequency (%)
	375	26.8%
(	201	14.4%
)	201	14.4%
1	133	9.5%
/	74	5.3%
2	64	4.6%
.	61	4.4%
3	58	4.1%
5	47	3.4%
0	36	2.6%
Other values (12)	148	10.6%

Han

Value	Count	Frequency (%)
湖	1	16.7%
澎	1	16.7%
金	1	16.7%
門	1	16.7%
祖	1	16.7%
馬	1	16.7%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	1516	56.3%
Hangul	1163	43.2%
CJK	6	0.2%
Punctuation	4	0.1%
CJK Compat	4	0.1%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
	375	24.7%
(	201	13.3%
)	201	13.3%
1	133	8.8%
/	74	4.9%
2	64	4.2%
.	61	4.0%
3	58	3.8%
5	47	3.1%
0	36	2.4%
Other values (37)	266	17.5%

Hangul

Value	Count	Frequency (%)
도	173	14.9%
의	163	14.0%
한	158	13.6%
반	147	12.6%
배	100	8.6%
약	45	3.9%
서	18	1.5%
개	16	1.4%
로	16	1.4%
성	15	1.3%
Other values (108)	312	26.8%

Punctuation

Value	Count	Frequency (%)
※	4	100.0%

CJK Compat

Value	Count	Frequency (%)
㎢	3	75.0%
㎞	1	25.0%

CJK

Value	Count	Frequency (%)
湖	1	16.7%
澎	1	16.7%
金	1	16.7%
門	1	16.7%
祖	1	16.7%
馬	1	16.7%

언어
Text

Distinct	153
Distinct (%)	77.7%
Missing	0
Missing (%)	0.0%
Memory size	1.7 KiB

Length

Max length	70
Median length	42
Mean length	15.964467
Min length	2

Characters and Unicode

Total characters	3145
Distinct characters	252
Distinct categories	11
Distinct scripts	4
Distinct blocks	5

Unique

Unique	145
Unique (%)	73.6%

Sample

1st row	영어(공용어), 아산테어, 에웨어 등
2nd row	불어(공용어), Fang어
3rd row	영어(공용어), Creole(현지 토속어)
4th row	영어(공용어), Wolof어
5th row	스페인어(공용어), 23개 공인 원주민어

Value	Count	Frequency (%)
영어	43	8.2%
null	27	5.2%
영어(공용어	17	3.3%
스페인어	14	2.7%
아랍어	13	2.5%
통용	12	2.3%
불어	12	2.3%
불어(공용어	11	2.1%
및	11	2.1%
토착어	10	1.9%
Other values (267)	352	67.4%

Most occurring characters

Value	Count	Frequency (%)
어	470	14.9%
	329	10.5%
)	190	6.0%
(	190	6.0%
,	181	5.8%
용	108	3.4%
공	94	3.0%
영	82	2.6%
아	76	2.4%
스	55	1.7%
Other values (242)	1370	43.6%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	1831	58.2%
Space Separator	329	10.5%
Other Punctuation	232	7.4%
Close Punctuation	190	6.0%
Open Punctuation	190	6.0%
Uppercase Letter	137	4.4%
Lowercase Letter	136	4.3%
Decimal Number	92	2.9%
Dash Punctuation	6	0.2%
Initial Punctuation	1	< 0.1%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
어	470	25.7%
용	108	5.9%
공	94	5.1%
영	82	4.5%
아	76	4.2%
스	55	3.0%
불	30	1.6%
인	28	1.5%
랍	27	1.5%
르	26	1.4%
Other values (182)	835	45.6%

Lowercase Letter

Value	Count	Frequency (%)
a	19	14.0%
o	19	14.0%
i	14	10.3%
e	11	8.1%
l	9	6.6%
n	9	6.6%
h	8	5.9%
r	8	5.9%
g	7	5.1%
t	6	4.4%
Other values (11)	26	19.1%

Uppercase Letter

Value	Count	Frequency (%)
L	55	40.1%
N	27	19.7%
U	27	19.7%
C	5	3.6%
S	4	2.9%
P	4	2.9%
K	3	2.2%
M	2	1.5%
W	2	1.5%
F	2	1.5%
Other values (5)	6	4.4%

Decimal Number

Value	Count	Frequency (%)
1	17	18.5%
0	13	14.1%
2	12	13.0%
3	12	13.0%
5	11	12.0%
8	11	12.0%
9	5	5.4%
6	5	5.4%
4	3	3.3%
7	3	3.3%

Other Punctuation

Value	Count	Frequency (%)
,	181	78.0%
%	34	14.7%
.	6	2.6%
:	4	1.7%
·	3	1.3%
?	2	0.9%
*	1	0.4%
※	1	0.4%

Space Separator

Value	Count	Frequency (%)
	329	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	190	100.0%

Open Punctuation

Value	Count	Frequency (%)
(	190	100.0%

Dash Punctuation

Value	Count	Frequency (%)
-	6	100.0%

Initial Punctuation

Value	Count	Frequency (%)
‘	1	100.0%

Final Punctuation

Value	Count	Frequency (%)
’	1	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Hangul	1830	58.2%
Common	1041	33.1%
Latin	273	8.7%
Han	1	< 0.1%

Most frequent character per script

Hangul

Value	Count	Frequency (%)
어	470	25.7%
용	108	5.9%
공	94	5.1%
영	82	4.5%
아	76	4.2%
스	55	3.0%
불	30	1.6%
인	28	1.5%
랍	27	1.5%
르	26	1.4%
Other values (181)	834	45.6%

Latin

Value	Count	Frequency (%)
L	55	20.1%
N	27	9.9%
U	27	9.9%
a	19	7.0%
o	19	7.0%
i	14	5.1%
e	11	4.0%
l	9	3.3%
n	9	3.3%
h	8	2.9%
Other values (26)	75	27.5%

Common

Value	Count	Frequency (%)
	329	31.6%
)	190	18.3%
(	190	18.3%
,	181	17.4%
%	34	3.3%
1	17	1.6%
0	13	1.2%
2	12	1.2%
3	12	1.2%
5	11	1.1%
Other values (14)	52	5.0%

Han

Value	Count	Frequency (%)
語	1	100.0%

Most occurring blocks

Value	Count	Frequency (%)
Hangul	1830	58.2%
ASCII	1308	41.6%
None	3	0.1%
Punctuation	3	0.1%
CJK	1	< 0.1%

Most frequent character per block

Hangul

Value	Count	Frequency (%)
어	470	25.7%
용	108	5.9%
공	94	5.1%
영	82	4.5%
아	76	4.2%
스	55	3.0%
불	30	1.6%
인	28	1.5%
랍	27	1.5%
르	26	1.4%
Other values (181)	834	45.6%

ASCII

Value	Count	Frequency (%)
	329	25.2%
)	190	14.5%
(	190	14.5%
,	181	13.8%
L	55	4.2%
%	34	2.6%
N	27	2.1%
U	27	2.1%
a	19	1.5%
o	19	1.5%
Other values (46)	237	18.1%

None

Value	Count	Frequency (%)
·	3	100.0%

Punctuation

Value	Count	Frequency (%)
※	1	33.3%
‘	1	33.3%
’	1	33.3%

CJK

Value	Count	Frequency (%)
語	1	100.0%

종교
Text

Distinct	191
Distinct (%)	97.0%
Missing	0
Missing (%)	0.0%
Memory size	1.7 KiB

Length

Max length	65
Median length	45
Mean length	28.187817
Min length	6

Characters and Unicode

Total characters	5553
Distinct characters	183
Distinct categories	10
Distinct scripts	4
Distinct blocks	5

Unique

Unique	189
Unique (%)	95.9%

Sample

1st row	기독교(71%), 이슬람교(17.6%)
2nd row	기독교(가톨릭 포함) 85%, 회교 9.8%, 토착종교
3rd row	기독교 57%, 힌두교 33%, 회교 9%, 기타 1%
4th row	회교(90%), 기독교(9%) 등
5th row	가톨릭(41%), 개신교(38.8%), 기타(2.7%)

Value	Count	Frequency (%)
등	38	5.0%
기타	25	3.3%
기독교	18	2.4%
가톨릭	15	2.0%
이슬람교	12	1.6%
개신교	10	1.3%
이슬람교(수니파	10	1.3%
시아파	9	1.2%
힌두교	8	1.0%
및	7	0.9%
Other values (499)	610	80.1%

Most occurring characters

Value	Count	Frequency (%)
	570	10.3%
%	481	8.7%
)	428	7.7%
(	428	7.7%
,	371	6.7%
교	360	6.5%
.	163	2.9%
1	140	2.5%
기	134	2.4%
2	124	2.2%
Other values (173)	2354	42.4%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	2048	36.9%
Other Punctuation	1022	18.4%
Decimal Number	988	17.8%
Space Separator	570	10.3%
Close Punctuation	428	7.7%
Open Punctuation	428	7.7%
Lowercase Letter	35	0.6%
Uppercase Letter	26	0.5%
Dash Punctuation	7	0.1%
Final Punctuation	1	< 0.1%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
교	360	17.6%
기	134	6.5%
이	106	5.2%
톨	98	4.8%
릭	98	4.8%
슬	96	4.7%
람	95	4.6%
신	82	4.0%
가	75	3.7%
독	68	3.3%
Other values (129)	836	40.8%

Lowercase Letter

Value	Count	Frequency (%)
n	5	14.3%
h	4	11.4%
i	4	11.4%
o	3	8.6%
a	3	8.6%
m	2	5.7%
c	2	5.7%
s	2	5.7%
t	2	5.7%
e	2	5.7%
Other values (4)	6	17.1%

Decimal Number

Value	Count	Frequency (%)
1	140	14.2%
2	124	12.6%
5	119	12.0%
3	103	10.4%
7	93	9.4%
8	89	9.0%
9	89	9.0%
0	82	8.3%
4	78	7.9%
6	71	7.2%

Other Punctuation

Value	Count	Frequency (%)
%	481	47.1%
,	371	36.3%
.	163	15.9%
:	2	0.2%
※	2	0.2%
*	1	0.1%
/	1	0.1%
·	1	0.1%

Uppercase Letter

Value	Count	Frequency (%)
L	10	38.5%
U	5	19.2%
N	5	19.2%
C	2	7.7%
S	2	7.7%
R	1	3.8%
W	1	3.8%

Space Separator

Value	Count	Frequency (%)
	570	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	428	100.0%

Open Punctuation

Value	Count	Frequency (%)
(	428	100.0%

Dash Punctuation

Value	Count	Frequency (%)
-	7	100.0%

Final Punctuation

Value	Count	Frequency (%)
’	1	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Common	3444	62.0%
Hangul	2045	36.8%
Latin	61	1.1%
Han	3	0.1%

Most frequent character per script

Hangul

Value	Count	Frequency (%)
교	360	17.6%
기	134	6.6%
이	106	5.2%
톨	98	4.8%
릭	98	4.8%
슬	96	4.7%
람	95	4.6%
신	82	4.0%
가	75	3.7%
독	68	3.3%
Other values (126)	833	40.7%

Common

Value	Count	Frequency (%)
	570	16.6%
%	481	14.0%
)	428	12.4%
(	428	12.4%
,	371	10.8%
.	163	4.7%
1	140	4.1%
2	124	3.6%
5	119	3.5%
3	103	3.0%
Other values (13)	517	15.0%

Latin

Value	Count	Frequency (%)
L	10	16.4%
n	5	8.2%
U	5	8.2%
N	5	8.2%
h	4	6.6%
i	4	6.6%
o	3	4.9%
a	3	4.9%
m	2	3.3%
c	2	3.3%
Other values (11)	18	29.5%

Han

Value	Count	Frequency (%)
神	1	33.3%
道	1	33.3%
無	1	33.3%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	3501	63.0%
Hangul	2045	36.8%
Punctuation	3	0.1%
CJK	3	0.1%
None	1	< 0.1%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
	570	16.3%
%	481	13.7%
)	428	12.2%
(	428	12.2%
,	371	10.6%
.	163	4.7%
1	140	4.0%
2	124	3.5%
5	119	3.4%
3	103	2.9%
Other values (31)	574	16.4%

Hangul

Value	Count	Frequency (%)
교	360	17.6%
기	134	6.6%
이	106	5.2%
톨	98	4.8%
릭	98	4.8%
슬	96	4.7%
람	95	4.6%
신	82	4.0%
가	75	3.7%
독	68	3.3%
Other values (126)	833	40.7%

Punctuation

Value	Count	Frequency (%)
※	2	66.7%
’	1	33.3%

CJK

Value	Count	Frequency (%)
神	1	33.3%
道	1	33.3%
無	1	33.3%

None

Value	Count	Frequency (%)
·	1	100.0%

민족
Text

Distinct	156
Distinct (%)	79.2%
Missing	0
Missing (%)	0.0%
Memory size	1.7 KiB

Length

Max length	119
Median length	61
Mean length	29.060914
Min length	3

Characters and Unicode

Total characters	5725
Distinct characters	296
Distinct categories	9
Distinct scripts	4
Distinct blocks	4

Unique

Unique	155
Unique (%)	78.7%

Sample

1st row	아칸족(48%), 몰다그바니족(17%), 에웨족(14%), 가아단베족(7%), 구르마족(6%), 구안족(4%)
2nd row	Fang, Echira, Adouma 등 40여 종족
3rd row	동인도계 39.8%, 흑인 29.2%, 혼혈 19.9%, 아메리카 인디안 10.5%
4th row	Mandinka, Peul, Wolof 등
5th row	메스티소 56%, 마야인 41.7%, 흑인?흑인계 혼혈 0.2%, Garifuna인 0.1%, 외국인 0.2%, Xinca 원주민 1.8%

Value	Count	Frequency (%)
등	56	7.0%
null	42	5.3%
기타	20	2.5%
및	12	1.5%
혼혈	8	1.0%
유럽계	5	0.6%
2	5	0.6%
종족	4	0.5%
인도계	4	0.5%
백인	4	0.5%
Other values (581)	639	80.0%

Most occurring characters

Value	Count	Frequency (%)
	608	10.6%
%	416	7.3%
(	381	6.7%
)	381	6.7%
,	368	6.4%
인	190	3.3%
.	158	2.8%
1	143	2.5%
아	114	2.0%
2	112	2.0%
Other values (286)	2854	49.9%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	1859	32.5%
Other Punctuation	953	16.6%
Decimal Number	854	14.9%
Space Separator	608	10.6%
Lowercase Letter	423	7.4%
Open Punctuation	381	6.7%
Close Punctuation	381	6.7%
Uppercase Letter	259	4.5%
Dash Punctuation	7	0.1%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
인	190	10.2%
아	114	6.1%
계	100	5.4%
족	73	3.9%
등	58	3.1%
타	54	2.9%
기	49	2.6%
르	40	2.2%
스	38	2.0%
시	37	2.0%
Other values (218)	1106	59.5%

Lowercase Letter

Value	Count	Frequency (%)
a	77	18.2%
e	49	11.6%
n	40	9.5%
o	33	7.8%
i	31	7.3%
u	29	6.9%
s	24	5.7%
l	24	5.7%
r	24	5.7%
h	12	2.8%
Other values (13)	80	18.9%

Uppercase Letter

Value	Count	Frequency (%)
L	87	33.6%
N	43	16.6%
U	42	16.2%
B	12	4.6%
S	10	3.9%
M	9	3.5%
C	8	3.1%
K	8	3.1%
P	6	2.3%
A	5	1.9%
Other values (12)	29	11.2%

Decimal Number

Value	Count	Frequency (%)
1	143	16.7%
2	112	13.1%
6	82	9.6%
5	82	9.6%
4	80	9.4%
3	77	9.0%
8	76	8.9%
7	70	8.2%
9	69	8.1%
0	63	7.4%

Other Punctuation

Value	Count	Frequency (%)
%	416	43.7%
,	368	38.6%
.	158	16.6%
?	3	0.3%
:	2	0.2%
/	2	0.2%
·	2	0.2%
'	1	0.1%
*	1	0.1%

Space Separator

Value	Count	Frequency (%)
	608	100.0%

Open Punctuation

Value	Count	Frequency (%)
(	381	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	381	100.0%

Dash Punctuation

Value	Count	Frequency (%)
-	7	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Common	3184	55.6%
Hangul	1857	32.4%
Latin	682	11.9%
Han	2	< 0.1%

Most frequent character per script

Hangul

Value	Count	Frequency (%)
인	190	10.2%
아	114	6.1%
계	100	5.4%
족	73	3.9%
등	58	3.1%
타	54	2.9%
기	49	2.6%
르	40	2.2%
스	38	2.0%
시	37	2.0%
Other values (216)	1104	59.5%

Latin

Value	Count	Frequency (%)
L	87	12.8%
a	77	11.3%
e	49	7.2%
N	43	6.3%
U	42	6.2%
n	40	5.9%
o	33	4.8%
i	31	4.5%
u	29	4.3%
s	24	3.5%
Other values (35)	227	33.3%

Common

Value	Count	Frequency (%)
	608	19.1%
%	416	13.1%
(	381	12.0%
)	381	12.0%
,	368	11.6%
.	158	5.0%
1	143	4.5%
2	112	3.5%
6	82	2.6%
5	82	2.6%
Other values (13)	453	14.2%

Han

Value	Count	Frequency (%)
漢	1	50.0%
族	1	50.0%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	3864	67.5%
Hangul	1857	32.4%
None	2	< 0.1%
CJK	2	< 0.1%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
	608	15.7%
%	416	10.8%
(	381	9.9%
)	381	9.9%
,	368	9.5%
.	158	4.1%
1	143	3.7%
2	112	2.9%
L	87	2.3%
6	82	2.1%
Other values (57)	1128	29.2%

Hangul

Value	Count	Frequency (%)
인	190	10.2%
아	114	6.1%
계	100	5.4%
족	73	3.9%
등	58	3.1%
타	54	2.9%
기	49	2.6%
르	40	2.2%
스	38	2.0%
시	37	2.0%
Other values (216)	1104	59.5%

None

Value	Count	Frequency (%)
·	2	100.0%

CJK

Value	Count	Frequency (%)
漢	1	50.0%
族	1	50.0%

기후
Categorical

IMBALANCE

Distinct	44
Distinct (%)	22.3%
Missing	0
Missing (%)	0.0%
Memory size	1.7 KiB

(NULL)	150
열대성 기후	3
고온다습한 열대성 기후	3
고온 다습(5-10월 35-45도, 11월-4월 15-35도)	1
고온건조	1
Other values (39)	39

Length

Max length	52
Median length	6
Mean length	8.5228426
Min length	4

Unique

Unique	41
Unique (%)	20.8%

Sample

1st row	(NULL)
2nd row	(NULL)
3rd row	(NULL)
4th row	(NULL)
5th row	(NULL)

Common Values

Value	Count	Frequency (%)
(NULL)	150	76.1%
열대성 기후	3	1.5%
고온다습한 열대성 기후	3	1.5%
고온 다습(5-10월 35-45도, 11월-4월 15-35도)	1	0.5%
고온건조	1	0.5%
아열대, 사막성 건조기후	1	0.5%
대륙성, 아열대성(남부)	1	0.5%
열대우림, 사바나	1	0.5%
아열대 기후	1	0.5%
아열대성 해양기후(여름 33℃,겨울 13℃ 평균)	1	0.5%
Other values (34)	34	17.3%
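
The IMBALANCE flag can be checked against the counts in the table above. The sketch below assumes the score is one minus the Shannon entropy of the class counts divided by the maximum possible entropy (log2 of the number of classes). The name `imbalance_score` is illustrative and not taken from the profiling tool. Under that assumption, the 44 classes (150, 3, 3 and 41 singletons) give roughly 62.1%, in line with the alert at the top of the report.

```python
from math import log2


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a list of class counts.

    Assumed definition: 0.0 means perfectly balanced, 1.0 means a single
    class holds every row.
    """
    n_classes = len(counts)
    if n_classes <= 1:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * log2(c / total) for c in counts if c > 0)
    return 1 - entropy / log2(n_classes)


# Counts read off the Common Values table: "(NULL)" 150 times,
# two categories 3 times each, and 41 categories that occur once.
climate_counts = [150, 3, 3] + [1] * 41
score = imbalance_score(climate_counts)
print(f"{score:.1%}")
```

Most of the imbalance comes from the (NULL) placeholder, which is stored as an ordinary string and therefore counted as a category rather than as a missing value.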

Length

Histogram of lengths of the category

Value	Count	Frequency (%)
null	150	46.4%
기후	16	5.0%
열대성	9	2.8%
대륙성	8	2.5%
고온다습	7	2.2%
건조	5	1.5%
열대몬순	5	1.5%
아열대	5	1.5%
고온다습한	4	1.2%
연평균	4	1.2%
Other values (97)	110	34.1%

건국
Text

Distinct	75
Distinct (%)	38.1%
Missing	0
Missing (%)	0.0%
Memory size	1.7 KiB

Length

Max length	59
Median length	6
Mean length	12.766497
Min length	6
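
The length figures are plain character counts per cell. As a minimal sketch, the code below computes the same four statistics for the five sample rows listed in the Sample block of this variable, so its numbers differ from the full-column figures above. It also suggests why the minimum and the median are both 6: the placeholder string (NULL) is six characters long and appears to fill most rows.

```python
from statistics import mean, median

# The five sample rows shown for this variable (not the full column).
sample = [
    "1957.3.6.(영국에서 독립)",
    "1960.8.17. 프랑스로부터 독립",
    "(NULL)",
    "1965.2.18(영국에서 독립)",
    "(NULL)",
]

# Length is measured in Unicode code points, which is what len() returns for str.
lengths = [len(value) for value in sample]

summary = {
    "max": max(lengths),
    "median": median(lengths),
    "mean": mean(lengths),
    "min": min(lengths),
}
print(summary)
```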

Characters and Unicode

Total characters	2515
Distinct characters	148
Distinct categories	9
Distinct scripts	4
Distinct blocks	4

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
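
As an illustration of those character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` and the mapping from two-letter codes to the long names used in this report are assumptions made for the example; only the categories that occur in this column are mapped.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes seen in this column.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Pd": "Dash Punctuation",
    "Zs": "Space Separator",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw two-letter code for unmapped categories.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Two values taken from the sample rows of this variable.
sample = ["1957.3.6.(영국에서 독립)", "(NULL)"]
counts = category_counts(sample)
for name, n in counts.most_common():
    print(name, n)
```

Hangul syllables fall under Other Letter, which is why that category leads the table for this column even though most cells hold the Latin placeholder (NULL).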

Unique

Unique	74
Unique (%)	37.6%

Sample

1st row	1957.3.6.(영국에서 독립)
2nd row	1960.8.17. 프랑스로부터 독립
3rd row	(NULL)
4th row	1965.2.18(영국에서 독립)
5th row	(NULL)

Value	Count	Frequency (%)
null	123	33.6%
독립	58	15.8%
	31	8.5%
독립일	5	1.4%
국경일	3	0.8%
프랑스로부터	2	0.5%
영국으로부터	2	0.5%
독립(국경일	2	0.5%
최초	2	0.5%
주권회복	2	0.5%
Other values (134)	136	37.2%
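
The entries in the table above (for example `null` for (NULL), and `독립(국경일` with its inner parenthesis intact) are consistent with a simple tokenizer: lower-case each value, split on whitespace, then strip punctuation from both ends of each token. The sketch below assumes exactly that procedure; it is not taken from the profiling tool, and `word_counts` is a hypothetical helper.

```python
import string
from collections import Counter


def word_counts(values):
    """Lower-case, split on whitespace, strip ASCII punctuation at token edges."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            counts[token.strip(string.punctuation)] += 1
    return counts


# The five sample rows shown for this variable.
sample = [
    "1957.3.6.(영국에서 독립)",
    "1960.8.17. 프랑스로부터 독립",
    "(NULL)",
    "1965.2.18(영국에서 독립)",
    "(NULL)",
]
counts = word_counts(sample)
for word, n in counts.most_common(3):
    print(word, n)
```

Under this assumption, a token made up only of punctuation (a lone hyphen, say) is stripped down to an empty string, which would explain the blank entry in the table.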

Most occurring characters

Value	Count	Frequency (%)
L	246	9.8%
.	196	7.8%
(	192	7.6%
)	192	7.6%
(space)	171	6.8%
1	146	5.8%
U	124	4.9%
N	124	4.9%
9	95	3.8%
립	73	2.9%
Other values (138)	956	38.0%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	665	26.4%
Decimal Number	554	22.0%
Uppercase Letter	499	19.8%
Other Punctuation	221	8.8%
Open Punctuation	192	7.6%
Close Punctuation	192	7.6%
Space Separator	171	6.8%
Dash Punctuation	16	0.6%
Lowercase Letter	5	0.2%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
립	73	11.0%
독	70	10.5%
로	39	5.9%
부	39	5.9%
터	36	5.4%
국	35	5.3%
에	25	3.8%
일	24	3.6%
스	22	3.3%
서	21	3.2%
Other values (106)	281	42.3%

Decimal Number

Value	Count	Frequency (%)
1	146	26.4%
9	95	17.1%
0	66	11.9%
2	58	10.5%
6	56	10.1%
7	38	6.9%
8	29	5.2%
3	22	4.0%
4	22	4.0%
5	22	4.0%

Uppercase Letter

Value	Count	Frequency (%)
L	246	49.3%
U	124	24.8%
N	124	24.8%
A	2	0.4%
G	1	0.2%
V	1	0.2%
S	1	0.2%

Other Punctuation

Value	Count	Frequency (%)
.	196	88.7%
:	15	6.8%
,	7	3.2%
'	1	0.5%
※	1	0.5%
*	1	0.5%

Lowercase Letter

Value	Count	Frequency (%)
u	1	20.0%
s	1	20.0%
t	1	20.0%
a	1	20.0%
v	1	20.0%

Open Punctuation

Value	Count	Frequency (%)
(	192	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	192	100.0%

Space Separator

Value	Count	Frequency (%)
(space)	171	100.0%

Dash Punctuation

Value	Count	Frequency (%)
-	16	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Common	1346	53.5%
Hangul	664	26.4%
Latin	504	20.0%
Han	1	< 0.1%

Most frequent character per script

Hangul

Value	Count	Frequency (%)
립	73	11.0%
독	70	10.5%
로	39	5.9%
부	39	5.9%
터	36	5.4%
국	35	5.3%
에	25	3.8%
일	24	3.6%
스	22	3.3%
서	21	3.2%
Other values (105)	280	42.2%

Common

Value	Count	Frequency (%)
.	196	14.6%
(	192	14.3%
)	192	14.3%
(space)	171	12.7%
1	146	10.8%
9	95	7.1%
0	66	4.9%
2	58	4.3%
6	56	4.2%
7	38	2.8%
Other values (10)	136	10.1%

Latin

Value	Count	Frequency (%)
L	246	48.8%
U	124	24.6%
N	124	24.6%
A	2	0.4%
G	1	0.2%
u	1	0.2%
s	1	0.2%
t	1	0.2%
a	1	0.2%
v	1	0.2%
Other values (2)	2	0.4%

Han

Value	Count	Frequency (%)
前	1	100.0%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	1849	73.5%
Hangul	664	26.4%
CJK	1	< 0.1%
Punctuation	1	< 0.1%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
L	246	13.3%
.	196	10.6%
(	192	10.4%
)	192	10.4%
(space)	171	9.2%
1	146	7.9%
U	124	6.7%
N	124	6.7%
9	95	5.1%
0	66	3.6%
Other values (21)	297	16.1%

Hangul

Value	Count	Frequency (%)
립	73	11.0%
독	70	10.5%
로	39	5.9%
부	39	5.9%
터	36	5.4%
국	35	5.3%
에	25	3.8%
일	24	3.6%
스	22	3.3%
서	21	3.2%
Other values (105)	280	42.2%

CJK

Value	Count	Frequency (%)
前	1	100.0%

Punctuation

Value	Count	Frequency (%)
※	1	100.0%

[Figure: pairwise interaction plots of 인구 and 면적]

Phik (φk)

	인구	인구설명	면적	기후	건국
인구	1.000	0.984	0.573	0.000	0.000
인구설명	0.984	1.000	0.882	0.942	0.967
면적	0.573	0.882	1.000	0.000	0.000
기후	0.000	0.942	0.000	1.000	0.950
건국	0.000	0.967	0.000	0.950	1.000

Auto

	인구	면적	기후
인구	1.000	0.826	0.000
면적	0.826	1.000	0.000
기후	0.000	0.000	1.000
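
For the numeric pair 인구 and 면적, a rank-based coefficient is a natural choice because both columns span several orders of magnitude. The sketch below computes a Spearman rank correlation by hand. Whether the Auto setting uses Spearman for numeric pairs is an assumption here, and the input is only the ten "First rows" of the sample, so the result is not expected to match the 0.826 reported for the full column.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# 인구 and 면적 for the ten "First rows" of the sample (rows 0 to 9).
population = [32400000, 2340000, 750000, 2710000, 18640000,
              1000, 110000, 10390000, 13860000, 2110000]
area = [238537.0, 267000.0, 214969.0, 11295.0, 108889.0,
        0.44, 334.0, 131957.0, 246000.0, 36125.0]

rho = spearman(population, area)
print(round(rho, 3))
```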

Count: a simple visualization of nullity by column.

Matrix: the nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
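
A nullity matrix can be sketched as a grid of 0/1 flags, one row per record and one column per variable. In this dataset most gaps are stored as the literal string (NULL) rather than as empty cells, so the example below treats that marker as missing. That treatment is an assumption of the sketch, not something the report itself does, and the three rows use only a subset of the columns.

```python
NULL_MARKER = "(NULL)"  # placeholder string used in the source CSV

# Three records taken from the first rows of the sample, four columns only.
rows = [
    {"한글국가명": "가나", "면적설명": "(한반도 1.1배)",
     "기후": "(NULL)", "건국": "1957.3.6.(영국에서 독립)"},
    {"한글국가명": "가봉", "면적설명": "(한반도의 1.2배)",
     "기후": "(NULL)", "건국": "1960.8.17. 프랑스로부터 독립"},
    {"한글국가명": "가이아나", "면적설명": "(NULL)",
     "기후": "(NULL)", "건국": "(NULL)"},
]


def is_missing(value):
    """Treat None and the literal placeholder as missing."""
    return value is None or value == NULL_MARKER


columns = list(rows[0])

# 1 = value present, 0 = value missing.
nullity_matrix = [
    [0 if is_missing(row[col]) else 1 for col in columns]
    for row in rows
]

# Per-column count of present values, i.e. the data behind a "Count" bar chart.
present_counts = {
    col: sum(flags[i] for flags in nullity_matrix)
    for i, col in enumerate(columns)
}

for flags in nullity_matrix:
    print("".join("#" if f else "." for f in flags))
print(present_counts)
```

Counting the marker as missing gives a very different picture from the single missing cell reported in the dataset statistics, because the profiler sees (NULL) as a regular text value.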

First and last rows

	한글국가명	영문국가명	국가코드(ISO 2자리)	수도	인구	인구설명	면적	면적설명	언어	종교	민족	기후	건국
0	가나	Ghana	GH	아크라(Accra)	32400000	('22, EIU)	238537.0	(한반도 1.1배)	영어(공용어), 아산테어, 에웨어 등	기독교(71%), 이슬람교(17.6%)	아칸족(48%), 몰다그바니족(17%), 에웨족(14%), 가아단베족(7%), 구르마족(6%), 구안족(4%)	(NULL)	1957.3.6.(영국에서 독립)
1	가봉	Gabon	GA	리브르빌(Libreville)	2340000	(‘21 World Bank)	267000.0	(한반도의 1.2배)	불어(공용어), Fang어	기독교(가톨릭 포함) 85%, 회교 9.8%, 토착종교	Fang, Echira, Adouma 등 40여 종족	(NULL)	1960.8.17. 프랑스로부터 독립
2	가이아나	Guyana	GY	조지타운(Georgetown)	750000	(2019 CIA)	214969.0	(NULL)	영어(공용어), Creole(현지 토속어)	기독교 57%, 힌두교 33%, 회교 9%, 기타 1%	동인도계 39.8%, 흑인 29.2%, 혼혈 19.9%, 아메리카 인디안 10.5%	(NULL)	(NULL)
3	감비아	Gambia	GM	반줄(Banjul)	2710000	('22 World Bank)	11295.0	(한반도의 1/20)	영어(공용어), Wolof어	회교(90%), 기독교(9%) 등	Mandinka, Peul, Wolof 등	(NULL)	1965.2.18(영국에서 독립)
4	과테말라	Guatemala	GT	과테말라시티(Guatemala City)	18640000	(2022, IMF)	108889.0	(한반도의 1/2)	스페인어(공용어), 23개 공인 원주민어	가톨릭(41%), 개신교(38.8%), 기타(2.7%)	메스티소 56%, 마야인 41.7%, 흑인?흑인계 혼혈 0.2%, Garifuna인 0.1%, 외국인 0.2%, Xinca 원주민 1.8%	(NULL)	(NULL)
5	교황청	Vatican	VA	Vatican(로마 북서부위치)	1000	(NULL)	0.44	(성베드로 대성당 성베드로 광장 교황거처 및 사무실 등 : 서울 창경궁 정도의 면적)	라틴어(공식언어), 이탈리아어, 불어, 영어	가톨릭(Roman Catholic)	(NULL)	(NULL)	(NULL)
6	그레나다	Grenada	GD	세인트 조지스(St. George's)	110000	(NULL)	334.0	(한반도의 1/643)	영어, 파투아(프랑스 크레올)	가톨릭(53%), 성공회(13.8%), 개신교(33.2%)	흑인(82%), 유색혼혈 (13%), 남아시아계 및 유럽계(5%)	(NULL)	(NULL)
7	그리스	Greece	GR	아테네(Athens, 약370만명)	10390000	(NULL)	131957.0	(한반도의 2/3 한국의 1.3배) (본토81% 도서19%)	그리스어	그리스정교(90%), 기독교(3%), 이슬람교(1.3%) 등	(NULL)	(NULL)	(NULL)
8	기니	Guinea	GN	코나크리(Conakry)	13860000	('22 World Bank)	246000.0	(한반도 크기)	불어(공용어)	회교(80%), 기독교(15%), 토착종교(5%)	Peul(40%), Malink?(30%), Soussou(20%)	(NULL)	1958.10.2.(프랑스에서 독립)
9	기니비사우	Guinea-Bissau	GW	비사우(Bissau)	2110000	(‘22 World Bank)	36125.0	(한반도의 1/7배)	포르투갈어(공용어)	토속신앙(65%), 회교(30%), 기독교(5%)	Balantes, Fulas, Manjaca 등	(NULL)	1973.9.24(포르투갈로부터 독립)

	한글국가명	영문국가명	국가코드(ISO 2자리)	수도	인구	인구설명	면적	면적설명	언어	종교	민족	기후	건국
187	팔레스타인	Palestine	PS	라말라(Ramallah: 임시 수도)	5100000	('20)	6020.0	(NULL)	아랍어, 영어	이슬람교(98%), 기독교(1.37%)	(NULL)	(NULL)	(NULL)
188	페루	Peru	PE	리마(Lima, 1,000만명)	33400000	-2022	1280000.0	(한반도의 약 6배)	스페인어, 께추아어, 아이마라어	(NULL)	(NULL)	(NULL)	(NULL)
189	포르투갈	Portugal	PT	리스본(시내 57만명, 수도권내 287만명)	10340000	(2021년)	92225.61	(한반도의 약 2/5)	포르투갈어(로망스어계)	카톨릭(전체 인구의 80%이상이나 국교는 아님)	이베리아족, 켈트족, 라틴족, 게르만족, 무어족 등의 혼혈	대서양, 지중해 및 대륙성 혼합, 건기(5-10월)/우기(11-4월)	(NULL)
190	폴란드	Poland	PL	바르샤바(171만명)	38560000	(2022년)	312685.0	(한반도의 1.4배)	(NULL)	가톨릭(87%), 정교회, 개신교 등	폴란드인(96.9%), 독일인, 벨라루스인 등	(NULL)	(NULL)
191	프랑스	France	FR	파리(Paris)	67810000	-2022.1	675417.0	(속령 포함 / 한반도의 3.1배)	프랑스어	가톨릭, 신교, 유대교, 이슬람교	골족, 프랑크족 등	(NULL)	(NULL)
192	피지	Fiji	FJ	수바(Suva)	903000	(2021. World Bank)	18333.0	(세계 151위, 경상북도 크기 330여개의 도서로 구성)	영어(공식어), 피지어, 힌두어, 로투만어	기독교 64%, 힌두교 28%, 이슬람교 6%, 기타 2%	피지 원주민 56.8%, 인도계 37.5%, 기타 5.7% 등	(NULL)	(NULL)
193	핀란드	Finland	FI	헬싱키(Helsinki)	5590000	(2021.7월 / CIA 추정)	338145.0	(한반도의 약 1.5배)	핀란드어(87.6%), 스웨덴어(5.2%)	루터교(69.8%), 그리스정교(1.1%)	핀란드인, 스웨덴인, 사미족 등	(NULL)	(NULL)
194	필리핀	Philippines	PH	마닐라(Manila, 약 184만 명)	112890000	(2023, IMF)	300000.0	(한반도의 1.3배)	영어 및 타갈로그어	천주교(79%), 개신교(7%), 이슬람교(6%) 등	말레이계가 주인종이며 미국·스페인계 등 혼혈 다수	고온다습한 아열대성 기후	(NULL)
195	헝가리	Hungary	HU	부다페스트(170만명)	9770000	(NULL)	93030.0	(한반도의 2/5)	(NULL)	카톨릭(37.2%), 개신교(13.8%), 그리스정교(1.8%) 등	마자르인(85.6%), 루마니아(3.2%), 독일(1.9%), 기타(2.6%)	(NULL)	(NULL)
196	호주	Australia	AU	캔버라(Canberra)	25980000	(2022. 기준)	7690000.0	(한반도의 35배, 세계 6위)	영어	기독교 44%, 무교 39%, 기타(불교, 이슬람교 등) 17%	앵글로색슨 80%, 이시안, 원주민 및 기타 약 20%	(NULL)	(NULL)
