gimi9 Pandas Profiling

Dataset statistics

Number of variables	7
Number of observations	298
Missing cells	10
Missing cells (%)	0.5%
Duplicate rows	0
Duplicate rows (%)	0.0%
Total size in memory	16.7 KiB
Average record size in memory	57.4 B
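
The derived figures in this table can be checked from the raw counts. The short sketch below assumes only the numbers printed above; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" table above.
n_variables = 7
n_observations = 298
missing_cells = 10
total_size_kib = 16.7

# A cell is one (row, column) pair, so the missing share is taken over all cells.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 1 KiB = 1024 bytes; the average record size is total bytes per row.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Rounded to one decimal place, both results agree with the table (0.5% and 57.4 B).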

Variable types

Numeric	1
Categorical	1
Text	5

Dataset

Description	File download
Author	서울교통공사
URL	https://data.seoul.go.kr/dataList/OA-2751/F/1/datasetView.do

Alerts

`연번` is highly overall correlated with `호선`	High correlation
`호선` is highly overall correlated with `연번`	High correlation
`한자` has 10 (3.4%) missing values	Missing
`연번` has unique values	Unique

Reproduction

Analysis started	2024-04-29 22:02:00.568814
Analysis finished	2024-04-29 22:02:02.454678
Duration	1.89 seconds
Software version	ydata-profiling v4.5.1
Download configuration	config.json
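
The reported duration follows directly from the two timestamps. A minimal check, assuming the timestamps are in the same time zone and parsing them with the standard library:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table above.
started = datetime.fromisoformat("2024-04-29 22:02:00.568814")
finished = datetime.fromisoformat("2024-04-29 22:02:02.454678")

# Subtracting two datetimes yields a timedelta.
duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```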

연번 (serial number)
Real number (ℝ)

HIGH CORRELATION UNIQUE

Distinct	298
Distinct (%)	100.0%
Missing	0
Missing (%)	0.0%
Infinite	0
Infinite (%)	0.0%
Mean	149.5

Minimum	1
Maximum	298
Zeros	0
Zeros (%)	0.0%
Negative	0
Negative (%)	0.0%
Memory size	2.7 KiB

Quantile statistics

Minimum	1
5-th percentile	15.85
Q1	75.25
median	149.5
Q3	223.75
95-th percentile	283.15
Maximum	298
Range	297
Interquartile range (IQR)	148.5

Descriptive statistics

Standard deviation	86.169407
Coefficient of variation (CV)	0.57638399
Kurtosis	-1.2
Mean	149.5
Median Absolute Deviation (MAD)	74.5
Skewness	0
Sum	44551
Variance	7425.1667
Monotonicity	Strictly increasing
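
Because `연번` is unique, strictly increasing, and spans 1 to 298 over 298 rows, it can only be the integer sequence 1..298. Under that assumption, the quantile and descriptive statistics above can be reproduced with the Python standard library. This is a sketch for checking the numbers, not the profiler's own implementation; the `inclusive` quantile method is chosen here because it uses linear interpolation between data points.

```python
import statistics

# Assumption: the column is exactly the integers 1..298.
values = list(range(1, 299))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (divides by n - 1)
std = statistics.stdev(values)          # sample standard deviation
cv = std / mean                         # coefficient of variation

# Median absolute deviation: median of |x - median(x)|.
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

# Quartiles and 5th/95th percentiles with linear interpolation.
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
twentieths = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]

print(total, mean, round(variance, 4), round(std, 6), round(cv, 6))
print(p5, q1, q2, q3, p95, q3 - q1, mad)
```

The interquartile range is `q3 - q1`, and the range is simply maximum minus minimum (298 - 1 = 297).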

Histogram with fixed size bins (bins=50)

Common Values

Value	Count	Frequency (%)
1	1	0.3%
206	1	0.3%
204	1	0.3%
203	1	0.3%
202	1	0.3%
201	1	0.3%
200	1	0.3%
199	1	0.3%
198	1	0.3%
197	1	0.3%
Other values (288)	288	96.6%

Minimum 10 values

Value	Count	Frequency (%)
1	1	0.3%
2	1	0.3%
3	1	0.3%
4	1	0.3%
5	1	0.3%
6	1	0.3%
7	1	0.3%
8	1	0.3%
9	1	0.3%
10	1	0.3%

Maximum 10 values

Value	Count	Frequency (%)
298	1	0.3%
297	1	0.3%
296	1	0.3%
295	1	0.3%
294	1	0.3%
293	1	0.3%
292	1	0.3%
291	1	0.3%
290	1	0.3%
289	1	0.3%

호선 (line)
Categorical

HIGH CORRELATION

Distinct	9
Distinct (%)	3.0%
Missing	0
Missing (%)	0.0%
Memory size	2.5 KiB

5호선	56
2호선	51
7호선	51
6호선	39
3호선	34
Other values (4)	67

Length

Max length	3
Median length	3
Mean length	3
Min length	3

Unique

Unique	0
Unique (%)	0.0%

Sample

1st row	1호선
2nd row	1호선
3rd row	1호선
4th row	1호선
5th row	1호선

Common Values

Value	Count	Frequency (%)
5호선	56	18.8%
2호선	51	17.1%
7호선	51	17.1%
6호선	39	13.1%
3호선	34	11.4%
4호선	26	8.7%
8호선	18	6.0%
9호선	13	4.4%
1호선	10	3.4%
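
The frequency column is each count divided by the number of observations. The sketch below recomputes it from the counts in the table; the dictionary is transcribed by hand and the variable names are illustrative.

```python
# Counts copied from the "Common Values" table for 호선.
counts = {
    "5호선": 56, "2호선": 51, "7호선": 51, "6호선": 39, "3호선": 34,
    "4호선": 26, "8호선": 18, "9호선": 13, "1호선": 10,
}

total = sum(counts.values())

# Share of each line, as a percentage rounded to one decimal place.
shares = {line: round(100 * n / total, 1) for line, n in counts.items()}

# The variable summary lists the five largest categories and folds the
# remaining four into "Other values".
top5 = sorted(counts.values(), reverse=True)[:5]
other = total - sum(top5)

for line, pct in shares.items():
    print(f"{line}\t{counts[line]}\t{pct}%")
print(f"Other values (4)\t{other}")
```

The counts sum to 298, matching the number of observations, and the four smallest categories add up to the 67 shown as "Other values (4)".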

역명 (station name)
Text

Distinct	260
Distinct (%)	87.2%
Missing	0
Missing (%)	0.0%
Memory size	2.5 KiB

Length

Max length	16
Median length	14
Mean length	4.4228188
Min length	2

Characters and Unicode

Total characters	1318
Distinct characters	251
Distinct categories	7
Distinct scripts	3
Distinct blocks	4

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
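
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below classifies the five sample station names shown in this report. It covers categories only; script and block lookups are not in the standard library and would need a third-party package. The mapping from two-letter codes to the report's labels (for example `Lo` to "Other Letter") follows the Unicode general category names.

```python
import unicodedata
from collections import Counter

# The five sample rows of 역명 shown in this report.
sample_names = ["서울역", "시청", "종각", "종로3가", "종로5가"]

# Count general categories over every character of every name.
category_counts = Counter(
    unicodedata.category(ch) for name in sample_names for ch in name
)
print(category_counts)

# Categories of some individual characters that appear in the tables.
for ch in ["구", "3", "D", "(", ")", "·", " ", "∙", "ー"]:
    print(repr(ch), unicodedata.category(ch), unicodedata.name(ch))
```

Hangul syllables fall under `Lo` (Other Letter), digits under `Nd` (Decimal Number), and the bullet operator `∙` under `Sm` (Math Symbol), which is why a "Math Symbol" row can appear for a text column.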

Unique

Unique	223
Unique (%)	74.8%
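
"Distinct" and "Unique" measure different things. The sketch below assumes the usual reading: distinct counts the different values present, while unique counts the values that occur exactly once. The short list of names is made up for illustration and is not the dataset.

```python
from collections import Counter

# Illustrative data only: two names repeat, two occur once.
names = ["시청", "시청", "공덕", "공덕", "종각", "서울역"]

freq = Counter(names)
distinct = len(freq)                              # different values present
unique = sum(1 for n in freq.values() if n == 1)  # values seen exactly once
print(distinct, unique)

# The same ratios for the figures reported for 역명 (298 observations).
distinct_pct = round(100 * 260 / 298, 1)
unique_pct = round(100 * 223 / 298, 1)
print(distinct_pct, unique_pct)
```

A station served by several lines appears once per line, so it is distinct but not unique.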

Sample

1st row	서울역
2nd row	시청
3rd row	종각
4th row	종로3가
5th row	종로5가

Value	Count	Frequency (%)
동대문역사문화공원(ddp	3	1.0%
대림(구로구청	2	0.7%
영등포구청	2	0.7%
충정로(경기대입구	2	0.7%
충무로	2	0.7%
시청	2	0.7%
공덕	2	0.7%
사당	2	0.7%
석촌	2	0.7%
교대(법원·검찰청	2	0.7%
Other values (251)	278	93.0%

Most occurring characters

Value	Count	Frequency (%)
(	67	5.1%
)	67	5.1%
구	50	3.8%
대	50	3.8%
동	36	2.7%
청	32	2.4%
신	28	2.1%
원	26	2.0%
산	23	1.7%
문	20	1.5%
Other values (241)	919	69.7%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	1161	88.1%
Open Punctuation	67	5.1%
Close Punctuation	67	5.1%
Uppercase Letter	9	0.7%
Decimal Number	8	0.6%
Other Punctuation	4	0.3%
Space Separator	2	0.2%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
구	50	4.3%
대	50	4.3%
동	36	3.1%
청	32	2.8%
신	28	2.4%
원	26	2.2%
산	23	2.0%
문	20	1.7%
입	19	1.6%
성	18	1.6%
Other values (231)	859	74.0%

Decimal Number

Value	Count	Frequency (%)
3	5	62.5%
4	2	25.0%
5	1	12.5%

Uppercase Letter

Value	Count	Frequency (%)
D	6	66.7%
P	3	33.3%

Other Punctuation

Value	Count	Frequency (%)
·	3	75.0%
•	1	25.0%

Open Punctuation

Value	Count	Frequency (%)
(	67	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	67	100.0%

Space Separator

Value	Count	Frequency (%)
	2	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Hangul	1161	88.1%
Common	148	11.2%
Latin	9	0.7%

Most frequent character per script

Hangul

Value	Count	Frequency (%)
구	50	4.3%
대	50	4.3%
동	36	3.1%
청	32	2.8%
신	28	2.4%
원	26	2.2%
산	23	2.0%
문	20	1.7%
입	19	1.6%
성	18	1.6%
Other values (231)	859	74.0%

Common

Value	Count	Frequency (%)
(	67	45.3%
)	67	45.3%
3	5	3.4%
·	3	2.0%
4	2	1.4%
	2	1.4%
•	1	0.7%
5	1	0.7%

Latin

Value	Count	Frequency (%)
D	6	66.7%
P	3	33.3%

Most occurring blocks

Value	Count	Frequency (%)
Hangul	1161	88.1%
ASCII	153	11.6%
None	3	0.2%
Punctuation	1	0.1%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
(	67	43.8%
)	67	43.8%
D	6	3.9%
3	5	3.3%
P	3	2.0%
4	2	1.3%
	2	1.3%
5	1	0.7%

Hangul

Value	Count	Frequency (%)
구	50	4.3%
대	50	4.3%
동	36	3.1%
청	32	2.8%
신	28	2.4%
원	26	2.2%
산	23	2.0%
문	20	1.7%
입	19	1.6%
성	18	1.6%
Other values (231)	859	74.0%

None

Value	Count	Frequency (%)
·	3	100.0%

Punctuation

Value	Count	Frequency (%)
•	1	100.0%

한자 (Hanja)
Text

MISSING

Distinct	254
Distinct (%)	88.2%
Missing	10
Missing (%)	3.4%
Memory size	2.5 KiB

Length

Max length	16
Median length	14
Mean length	4.4548611
Min length	2

Characters and Unicode

Total characters	1283
Distinct characters	410
Distinct categories	8
Distinct scripts	4
Distinct blocks	6

Unique

Unique	221
Unique (%)	76.7%
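
For a column with missing values, the percentages above use two different denominators. This is inferred from the printed numbers, not from documented behaviour: the missing share is taken over all 298 rows, while the distinct and unique shares and the mean length match a denominator of the 288 non-missing rows.

```python
# Figures copied from the 한자 summary above.
n_rows = 298
n_missing = 10
n_distinct = 254
n_unique = 221
total_characters = 1283

n_present = n_rows - n_missing  # rows that actually hold a value

missing_pct = 100 * n_missing / n_rows        # over all rows
distinct_pct = 100 * n_distinct / n_present   # over non-missing rows
unique_pct = 100 * n_unique / n_present       # over non-missing rows
mean_length = total_characters / n_present    # characters per present value

print(f"{missing_pct:.1f}% {distinct_pct:.1f}% {unique_pct:.1f}% {mean_length:.7f}")
```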

Sample

1st row	市廳
2nd row	鐘閣
3rd row	鍾路3街
4th row	鍾路5街
5th row	東大門

Value	Count	Frequency (%)
東大門歷史文化公園(ddp	3	1.0%
市廳	2	0.7%
石村	2	0.7%
忠正路(京畿大入口	2	0.7%
綜合運動場	2	0.7%
孔德	2	0.7%
舍堂	2	0.7%
藥水	2	0.7%
泰陵入口	2	0.7%
까치山	2	0.7%
Other values (247)	270	92.8%

Most occurring characters

Value	Count	Frequency (%)
(	66	5.1%
)	66	5.1%
大	46	3.6%
廳	27	2.1%
山	23	1.8%
新	22	1.7%
口	20	1.6%
區	20	1.6%
入	20	1.6%
洞	15	1.2%
Other values (400)	958	74.7%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	1127	87.8%
Open Punctuation	66	5.1%
Close Punctuation	66	5.1%
Uppercase Letter	9	0.7%
Decimal Number	8	0.6%
Space Separator	4	0.3%
Other Punctuation	2	0.2%
Math Symbol	1	0.1%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
大	46	4.1%
廳	27	2.4%
山	23	2.0%
新	22	2.0%
口	20	1.8%
區	20	1.8%
入	20	1.8%
洞	15	1.3%
東	15	1.3%
門	14	1.2%
Other values (389)	905	80.3%

Decimal Number

Value	Count	Frequency (%)
3	5	62.5%
4	2	25.0%
5	1	12.5%

Uppercase Letter

Value	Count	Frequency (%)
D	6	66.7%
P	3	33.3%

Space Separator

Value	Count	Frequency (%)
	2	50.0%
	2	50.0%

Open Punctuation

Value	Count	Frequency (%)
(	66	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	66	100.0%

Other Punctuation

Value	Count	Frequency (%)
·	2	100.0%

Math Symbol

Value	Count	Frequency (%)
∙	1	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Han	985	76.8%
Common	147	11.5%
Hangul	142	11.1%
Latin	9	0.7%

Most frequent character per script

Han

Value	Count	Frequency (%)
大	46	4.7%
廳	27	2.7%
山	23	2.3%
新	22	2.2%
口	20	2.0%
區	20	2.0%
入	20	2.0%
洞	15	1.5%
東	15	1.5%
門	14	1.4%
Other values (319)	763	77.5%

Hangul

Value	Count	Frequency (%)
앞	8	5.6%
울	8	5.6%
서	7	4.9%
터	6	4.2%
리	6	4.2%
미	5	3.5%
널	5	3.5%
거	5	3.5%
이	4	2.8%
나	4	2.8%
Other values (60)	84	59.2%

Common

Value	Count	Frequency (%)
(	66	44.9%
)	66	44.9%
3	5	3.4%
	2	1.4%
	2	1.4%
·	2	1.4%
4	2	1.4%
5	1	0.7%
∙	1	0.7%

Latin

Value	Count	Frequency (%)
D	6	66.7%
P	3	33.3%

Most occurring blocks

Value	Count	Frequency (%)
CJK	951	74.1%
ASCII	151	11.8%
Hangul	142	11.1%
CJK Compat Ideographs	34	2.7%
None	4	0.3%
Math Operators	1	0.1%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
(	66	43.7%
)	66	43.7%
D	6	4.0%
3	5	3.3%
P	3	2.0%
	2	1.3%
4	2	1.3%
5	1	0.7%

CJK

Value	Count	Frequency (%)
大	46	4.8%
廳	27	2.8%
山	23	2.4%
新	22	2.3%
口	20	2.1%
區	20	2.1%
入	20	2.1%
洞	15	1.6%
東	15	1.6%
門	14	1.5%
Other values (301)	729	76.7%

Hangul

Value	Count	Frequency (%)
앞	8	5.6%
울	8	5.6%
서	7	4.9%
터	6	4.2%
리	6	4.2%
미	5	3.5%
널	5	3.5%
거	5	3.5%
이	4	2.8%
나	4	2.8%
Other values (60)	84	59.2%

CJK Compat Ideographs

Value	Count	Frequency (%)
龍	6	17.6%
梨	4	11.8%
歷	3	8.8%
女	3	8.8%
樂	2	5.9%
金	2	5.9%
陵	2	5.9%
蘆	2	5.9%
綠	1	2.9%
論	1	2.9%
Other values (8)	8	23.5%

None

Value	Count	Frequency (%)
	2	50.0%
·	2	50.0%

Math Operators

Value	Count	Frequency (%)
∙	1	100.0%

영문 (English)
Text

Distinct	263
Distinct (%)	88.3%
Missing	0
Missing (%)	0.0%
Memory size	2.5 KiB

Length

Max length	59
Median length	50
Mean length	14.275168
Min length	3

Characters and Unicode

Total characters	4254
Distinct characters	60
Distinct categories	10
Distinct scripts	2
Distinct blocks	3

Unique

Unique	229
Unique (%)	76.8%

Sample

1st row	Seoul Station
2nd row	City Hall
3rd row	Jonggak
4th row	Jongno 3(sam)ga
5th row	Jongno 5(o)ga

Value	Count	Frequency (%)
univ	24	4.7%
office	22	4.3%
seoul	11	2.2%
	9	1.8%
national	6	1.2%
center	6	1.2%
dongdaemun	5	1.0%
city	5	1.0%
euljiro	5	1.0%
terminal	5	1.0%
Other values (325)	413	80.8%

Most occurring characters

Value	Count	Frequency (%)
n	454	10.7%
o	355	8.3%
a	348	8.2%
g	302	7.1%
e	298	7.0%
	218	5.1%
i	203	4.8%
u	187	4.4%
s	121	2.8%
r	120	2.8%
Other values (50)	1648	38.7%

Most occurring categories

Value	Count	Frequency (%)
Lowercase Letter	3256	76.5%
Uppercase Letter	543	12.8%
Space Separator	219	5.1%
Open Punctuation	75	1.8%
Close Punctuation	75	1.8%
Other Punctuation	45	1.1%
Dash Punctuation	29	0.7%
Decimal Number	9	0.2%
Final Punctuation	2	< 0.1%
Initial Punctuation	1	< 0.1%

Most frequent character per category

Lowercase Letter

Value	Count	Frequency (%)
n	454	13.9%
o	355	10.9%
a	348	10.7%
g	302	9.3%
e	298	9.2%
i	203	6.2%
u	187	5.7%
s	121	3.7%
r	120	3.7%
m	111	3.4%
Other values (14)	757	23.2%

Uppercase Letter

Value	Count	Frequency (%)
S	103	19.0%
G	48	8.8%
C	44	8.1%
D	40	7.4%
M	31	5.7%
U	31	5.7%
O	30	5.5%
H	30	5.5%
N	25	4.6%
P	24	4.4%
Other values (12)	137	25.2%

Decimal Number

Value	Count	Frequency (%)
3	5	55.6%
4	2	22.2%
5	1	11.1%
1	1	11.1%

Other Punctuation

Value	Count	Frequency (%)
.	26	57.8%
'	10	22.2%
&	9	20.0%

Space Separator

Value	Count	Frequency (%)
	218	99.5%
	1	0.5%

Open Punctuation

Value	Count	Frequency (%)
(	75	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	75	100.0%

Dash Punctuation

Value	Count	Frequency (%)
-	29	100.0%

Final Punctuation

Value	Count	Frequency (%)
’	2	100.0%

Initial Punctuation

Value	Count	Frequency (%)
‘	1	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Latin	3799	89.3%
Common	455	10.7%

Most frequent character per script

Latin

Value	Count	Frequency (%)
n	454	12.0%
o	355	9.3%
a	348	9.2%
g	302	7.9%
e	298	7.8%
i	203	5.3%
u	187	4.9%
s	121	3.2%
r	120	3.2%
m	111	2.9%
Other values (36)	1300	34.2%

Common

Value	Count	Frequency (%)
	218	47.9%
(	75	16.5%
)	75	16.5%
-	29	6.4%
.	26	5.7%
'	10	2.2%
&	9	2.0%
3	5	1.1%
’	2	0.4%
4	2	0.4%
Other values (4)	4	0.9%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	4250	99.9%
Punctuation	3	0.1%
None	1	< 0.1%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
n	454	10.7%
o	355	8.4%
a	348	8.2%
g	302	7.1%
e	298	7.0%
	218	5.1%
i	203	4.8%
u	187	4.4%
s	121	2.8%
r	120	2.8%
Other values (47)	1644	38.7%

Punctuation

Value	Count	Frequency (%)
’	2	66.7%
‘	1	33.3%

None

Value	Count	Frequency (%)
	1	100.0%

중국어 (Chinese)
Text

Distinct	261
Distinct (%)	87.6%
Missing	0
Missing (%)	0.0%
Memory size	2.5 KiB

Length

Max length	14
Median length	12
Mean length	4.4060403
Min length	2

Characters and Unicode

Total characters	1313
Distinct characters	374
Distinct categories	5
Distinct scripts	3
Distinct blocks	4

Unique

Unique	225
Unique (%)	75.5%

Sample

1st row	首尔站
2nd row	市厅
3rd row	钟阁
4th row	钟路三街
5th row	钟路五街

Value	Count	Frequency (%)
东大门历史文化公园(ddp	3	1.0%
药水	2	0.7%
忠正路(京畿大学	2	0.7%
忠武路	2	0.7%
市厅	2	0.7%
孔德	2	0.7%
舍堂	2	0.7%
石村	2	0.7%
大林(九老区厅	2	0.7%
首尔教育大学(法院·检察厅	2	0.7%
Other values (251)	277	93.0%

Most occurring characters

Value	Count	Frequency (%)
(	66	5.0%
)	66	5.0%
大	47	3.6%
学	32	2.4%
厅	28	2.1%
新	25	1.9%
山	23	1.8%
区	22	1.7%
路	18	1.4%
东	15	1.1%
Other values (364)	971	74.0%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	1168	89.0%
Open Punctuation	66	5.0%
Close Punctuation	66	5.0%
Uppercase Letter	9	0.7%
Other Punctuation	4	0.3%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
大	47	4.0%
学	32	2.7%
厅	28	2.4%
新	25	2.1%
山	23	2.0%
区	22	1.9%
路	18	1.5%
东	15	1.3%
洞	15	1.3%
门	14	1.2%
Other values (359)	929	79.5%

Uppercase Letter

Value	Count	Frequency (%)
D	6	66.7%
P	3	33.3%

Open Punctuation

Value	Count	Frequency (%)
(	66	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	66	100.0%

Other Punctuation

Value	Count	Frequency (%)
·	4	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Han	1168	89.0%
Common	136	10.4%
Latin	9	0.7%

Most frequent character per script

Han

Value	Count	Frequency (%)
大	47	4.0%
学	32	2.7%
厅	28	2.4%
新	25	2.1%
山	23	2.0%
区	22	1.9%
路	18	1.5%
东	15	1.3%
洞	15	1.3%
门	14	1.2%
Other values (359)	929	79.5%

Common

Value	Count	Frequency (%)
(	66	48.5%
)	66	48.5%
·	4	2.9%

Latin

Value	Count	Frequency (%)
D	6	66.7%
P	3	33.3%

Most occurring blocks

Value	Count	Frequency (%)
CJK	1155	88.0%
ASCII	141	10.7%
CJK Compat Ideographs	13	1.0%
None	4	0.3%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
(	66	46.8%
)	66	46.8%
D	6	4.3%
P	3	2.1%

CJK

Value	Count	Frequency (%)
大	47	4.1%
学	32	2.8%
厅	28	2.4%
新	25	2.2%
山	23	2.0%
区	22	1.9%
路	18	1.6%
东	15	1.3%
洞	15	1.3%
门	14	1.2%
Other values (351)	916	79.3%

None

Value	Count	Frequency (%)
·	4	100.0%

CJK Compat Ideographs

Value	Count	Frequency (%)
梨	4	30.8%
女	3	23.1%
丹	1	7.7%
陵	1	7.7%
老	1	7.7%
林	1	7.7%
落	1	7.7%
金	1	7.7%

일본어 (Japanese)
Text

Distinct	261
Distinct (%)	87.6%
Missing	0
Missing (%)	0.0%
Memory size	2.5 KiB

Length

Max length	22
Median length	16
Mean length	5.6845638
Min length	2

Characters and Unicode

Total characters	1694
Distinct characters	88
Distinct categories	8
Distinct scripts	4
Distinct blocks	5

Unique

Unique	226
Unique (%)	75.8%

Sample

1st row	ソウルヨク
2nd row	シチョン
3rd row	チョンガク
4th row	チョンノサムガ
5th row	チョンノオガ

Value	Count	Frequency (%)
チョンノサムガ	3	1.0%
トンデムンヨクサムンファゴンウォン(ddp	3	1.0%
チョング	2	0.7%
ソウルヨク	2	0.7%
テルンイック	2	0.7%
ハプチョン	2	0.7%
チャムシル	2	0.7%
サムガクチ	2	0.7%
チョンハブンドンジャン	2	0.7%
コンドク	2	0.7%
Other values (251)	277	92.6%

Most occurring characters

Value	Count	Frequency (%)
ン	367	21.7%
チ	93	5.5%
ョ	87	5.1%
ク	72	4.3%
ム	68	4.0%
サ	58	3.4%
ル	51	3.0%
シ	43	2.5%
ジ	38	2.2%
ソ	34	2.0%
Other values (78)	783	46.2%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	1642	96.9%
Other Punctuation	12	0.7%
Open Punctuation	10	0.6%
Close Punctuation	10	0.6%
Uppercase Letter	9	0.5%
Space Separator	6	0.4%
Modifier Letter	4	0.2%
Math Symbol	1	0.1%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
ン	367	22.4%
チ	93	5.7%
ョ	87	5.3%
ク	72	4.4%
ム	68	4.1%
サ	58	3.5%
ル	51	3.1%
シ	43	2.6%
ジ	38	2.3%
ソ	34	2.1%
Other values (69)	731	44.5%

Uppercase Letter

Value	Count	Frequency (%)
D	6	66.7%
P	3	33.3%

Space Separator

Value	Count	Frequency (%)
	3	50.0%
	3	50.0%

Other Punctuation

Value	Count	Frequency (%)
・	12	100.0%

Open Punctuation

Value	Count	Frequency (%)
(	10	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	10	100.0%

Modifier Letter

Value	Count	Frequency (%)
ー	4	100.0%

Math Symbol

Value	Count	Frequency (%)
∙	1	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Katakana	1632	96.3%
Common	43	2.5%
Han	10	0.6%
Latin	9	0.5%

Most frequent character per script

Katakana

Value	Count	Frequency (%)
ン	367	22.5%
チ	93	5.7%
ョ	87	5.3%
ク	72	4.4%
ム	68	4.2%
サ	58	3.6%
ル	51	3.1%
シ	43	2.6%
ジ	38	2.3%
ソ	34	2.1%
Other values (61)	721	44.2%

Han

Value	Count	Frequency (%)
庁	2	20.0%
区	2	20.0%
村	1	10.0%
新	1	10.0%
成	1	10.0%
三	1	10.0%
公	1	10.0%
園	1	10.0%

Common

Value	Count	Frequency (%)
・	12	27.9%
(	10	23.3%
)	10	23.3%
ー	4	9.3%
	3	7.0%
	3	7.0%
∙	1	2.3%

Latin

Value	Count	Frequency (%)
D	6	66.7%
P	3	33.3%

Most occurring blocks

Value	Count	Frequency (%)
Katakana	1648	97.3%
ASCII	32	1.9%
CJK	10	0.6%
None	3	0.2%
Math Operators	1	0.1%

Most frequent character per block

Katakana

Value	Count	Frequency (%)
ン	367	22.3%
チ	93	5.6%
ョ	87	5.3%
ク	72	4.4%
ム	68	4.1%
サ	58	3.5%
ル	51	3.1%
シ	43	2.6%
ジ	38	2.3%
ソ	34	2.1%
Other values (63)	737	44.7%

ASCII

Value	Count	Frequency (%)
(	10	31.2%
)	10	31.2%
D	6	18.8%
	3	9.4%
P	3	9.4%

None

Value	Count	Frequency (%)
	3	100.0%

CJK

Value	Count	Frequency (%)
庁	2	20.0%
区	2	20.0%
村	1	10.0%
新	1	10.0%
成	1	10.0%
三	1	10.0%
公	1	10.0%
園	1	10.0%

Math Operators

Value	Count	Frequency (%)
∙	1	100.0%

Phik (φk)

	연번	호선
연번	1.000	0.946
호선	0.946	1.000

Auto

	연번	호선
연번	1.000	0.812
호선	0.812	1.000

Count
A simple visualization of nullity by column.

Matrix
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

First rows

	연번	호선	역명	한자	영문	중국어	일본어
0	1	1호선	서울역	<NA>	Seoul Station	首尔站	ソウルヨク
1	2	1호선	시청	市廳	City Hall	市厅	シチョン
2	3	1호선	종각	鐘閣	Jonggak	钟阁	チョンガク
3	4	1호선	종로3가	鍾路3街	Jongno 3(sam)ga	钟路三街	チョンノサムガ
4	5	1호선	종로5가	鍾路5街	Jongno 5(o)ga	钟路五街	チョンノオガ
5	6	1호선	동대문	東大門	Dongdaemun	东大门	トンデムン
6	7	1호선	동묘앞	東廟앞	Dongmyo	东庙	トンミョアプ
7	8	1호선	신설동	新設洞	Sinseoldong	新设洞	シンソルトン
8	9	1호선	제기동	祭基洞	Jegidong	祭基洞	チェギドン
9	10	1호선	청량리(서울시립대입구)	淸凉里(서울市立大入口)	Cheongnyangni(University of Seoul)	清凉里(首尔市立大学)	チョンニャンニ

Last rows

	연번	호선	역명	한자	영문	중국어	일본어
288	289	9호선	봉은사	奉恩寺	Bongeunsa	奉恩寺	ポンウンサ
289	290	9호선	종합운동장	綜合運動場	Sports Complex	综合运动场	チョンハブンドンジャン
290	291	9호선	삼전	三田	Samjeon	三田	サムジョン
291	292	9호선	석촌고분	石村古墳	Seokchon Gobun	石村古坟	ソクチョンコブン
292	293	9호선	석촌	石村	Seokchon	石村	ソクチョン
293	294	9호선	송파나루	松坡나루	Songpanaru	松坡渡口	ソンパナル
294	295	9호선	한성백제	漢城百済	Hanseong Baekje	汉城百济	ハンソンベクチェ
295	296	9호선	올림픽공원(한국체대)	올림픽公園(韓國體大)	Olympic Park(Korea National Sport University)	奥林匹克公园(韩国体育大学)	オリンピックゴンウォン
296	297	9호선	둔촌오륜	遁村五輪	Dunchon Oryun	遁村五轮	トゥンチョノリュン
297	298	9호선	중앙보훈병원	中央報勲病院	VHS Medical Center	中央报勋医院	チュンアンボフンビョンウォン
