gimi9 Pandas Profiling

Dataset statistics

Number of variables	7
Number of observations	784
Missing cells	18
Missing cells (%)	0.3%
Duplicate rows	0
Duplicate rows (%)	0.0%
Total size in memory	43.0 KiB
Average record size in memory	56.2 B
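
The percentage and per-record figures above follow from the raw counts. A minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (rows times columns) and the record size is total memory divided by the number of observations; the variable names are illustrative and not part of the report:

```python
# Figures copied from the Dataset statistics table above.
n_variables = 7
n_observations = 784
missing_cells = 18
total_memory_kib = 43.0  # already rounded in the report

# Assumption: the percentage is taken over all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes, averaged over all rows.
avg_record_bytes = total_memory_kib * 1024 / n_observations

print(f"{missing_pct:.1f}%")         # 0.3%
print(f"{avg_record_bytes:.1f} B")   # 56.2 B
```

Both results agree with the rounded values reported in the table.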

Variable types

Text	6
Categorical	1

Dataset

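Description	전철역코드 (station code), 전철역명 (station name), 전철명명(영문) (station name, English), 호선 (line), 외부코드 (external code), 전철명명(중문) (station name, Chinese), 전철명명(일문) (station name, Japanese)
Author	서울교통공사 (Seoul Metro)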
URL	https://data.seoul.go.kr/dataList/OA-15442/S/1/datasetView.do

Alerts

`전철명명(일문)` has 15 (1.9%) missing values	Missing
`전철역코드` has unique values	Unique
`외부코드` has unique values	Unique

Reproduction

Analysis started	2024-05-11 04:22:27.864564
Analysis finished	2024-05-11 04:22:30.777427
Duration	2.91 seconds
Software version	ydata-profiling v4.5.1
Download configuration	config.json

전철역코드
Text

UNIQUE

Distinct	784
Distinct (%)	100.0%
Missing	0
Missing (%)	0.0%
Memory size	6.3 KiB

Length

Max length	4
Median length	4
Mean length	4
Min length	4

Characters and Unicode

Total characters	3136
Distinct characters	11
Distinct categories	2
Distinct scripts	2
Distinct blocks	1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
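
As an illustration of the character properties mentioned above, the sketch below counts characters by Unicode general category using Python's standard `unicodedata` module. It is not the profiler's own implementation: the helper name `category_counts` is hypothetical, the last sample value is invented to show a second category, and the module returns abbreviations (`Nd` for Decimal Number, `Lu` for Uppercase Letter) rather than the long names used in the tables. Scripts and blocks are not covered here.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters by Unicode general category (e.g. 'Nd', 'Lu')."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# First five sample codes from the report, plus one hypothetical
# code containing a letter so that a second category appears.
sample = ["1018", "0150", "1006", "1407", "1727", "C101"]
counts = category_counts(sample)
print(counts["Nd"], counts["Lu"])  # 23 1
```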

Unique

Unique	784
Unique (%)	100.0%

Sample

1st row	1018
2nd row	0150
3rd row	1006
4th row	1407
5th row	1727

Value	Count	Frequency (%)
1018	1	0.1%
1276	1	0.1%
1323	1	0.1%
1273	1	0.1%
1286	1	0.1%
1283	1	0.1%
1282	1	0.1%
1275	1	0.1%
1270	1	0.1%
1220	1	0.1%
Other values (774)	774	98.7%

Most occurring characters

Value	Count	Frequency (%)
1	654	20.9%
2	521	16.6%
0	399	12.7%
4	329	10.5%
3	310	9.9%
5	235	7.5%
7	204	6.5%
8	197	6.3%
6	165	5.3%
9	113	3.6%

Most occurring categories

Value	Count	Frequency (%)
Decimal Number	3127	99.7%
Uppercase Letter	9	0.3%

Most frequent character per category

Decimal Number

Value	Count	Frequency (%)
1	654	20.9%
2	521	16.7%
0	399	12.8%
4	329	10.5%
3	310	9.9%
5	235	7.5%
7	204	6.5%
8	197	6.3%
6	165	5.3%
9	113	3.6%

Uppercase Letter

Value	Count	Frequency (%)
C	9	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Common	3127	99.7%
Latin	9	0.3%

Most frequent character per script

Common

Value	Count	Frequency (%)
1	654	20.9%
2	521	16.7%
0	399	12.8%
4	329	10.5%
3	310	9.9%
5	235	7.5%
7	204	6.5%
8	197	6.3%
6	165	5.3%
9	113	3.6%

Latin

Value	Count	Frequency (%)
C	9	100.0%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	3136	100.0%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
1	654	20.9%
2	521	16.6%
0	399	12.7%
4	329	10.5%
3	310	9.9%
5	235	7.5%
7	204	6.5%
8	197	6.3%
6	165	5.3%
9	113	3.6%

전철역명
Text

Distinct	645
Distinct (%)	82.3%
Missing	0
Missing (%)	0.0%
Memory size	6.3 KiB

Length

Max length	9
Median length	2
Mean length	2.8596939
Min length	2

Characters and Unicode

Total characters	2242
Distinct characters	306
Distinct categories	3
Distinct scripts	2
Distinct blocks	2

Unique

Unique	528
Unique (%)	67.3%
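
The gap between Distinct (645) and Unique (528) suggests that Distinct counts the different values present, while Unique counts only the values that occur exactly once. A minimal sketch under that assumption; the helper `distinct_and_unique` and the short list of names are hypothetical and only reuse station names that appear in this report:

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    freq = Counter(values)
    distinct = len(freq)
    unique = sum(1 for count in freq.values() if count == 1)
    return distinct, unique


# Hypothetical mini-column: one transfer station listed on two lines.
names = ["서울역", "서울역", "공덕", "석계", "두정"]
distinct, unique = distinct_and_unique(names)
print(distinct, unique)  # 4 3
```

Transfer stations that appear on several lines are the likely reason the two counts differ for this column.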

Sample

1st row	석계
2nd row	서울역
3rd row	영등포
4th row	온양온천
5th row	두정

Value	Count	Frequency (%)
김포공항	5	0.6%
왕십리	4	0.5%
서울역	4	0.5%
공덕	4	0.5%
청량리	4	0.5%
홍대입구	3	0.4%
상봉	3	0.4%
회기	3	0.4%
동대문역사문화공원	3	0.4%
종로3가	3	0.4%
Other values (635)	748	95.4%

Most occurring characters

Value	Count	Frequency (%)
대	66	2.9%
산	60	2.7%
구	53	2.4%
신	51	2.3%
동	49	2.2%
천	48	2.1%
정	44	2.0%
청	41	1.8%
원	40	1.8%
지	33	1.5%
Other values (296)	1757	78.4%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	2226	99.3%
Decimal Number	13	0.6%
Other Punctuation	3	0.1%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
대	66	3.0%
산	60	2.7%
구	53	2.4%
신	51	2.3%
동	49	2.2%
천	48	2.2%
정	44	2.0%
청	41	1.8%
원	40	1.8%
지	33	1.5%
Other values (288)	1741	78.2%

Decimal Number

Value	Count	Frequency (%)
3	5	38.5%
4	3	23.1%
1	2	15.4%
2	1	7.7%
9	1	7.7%
5	1	7.7%

Other Punctuation

Value	Count	Frequency (%)
.	2	66.7%
?	1	33.3%

Most occurring scripts

Value	Count	Frequency (%)
Hangul	2226	99.3%
Common	16	0.7%

Most frequent character per script

Hangul

Value	Count	Frequency (%)
대	66	3.0%
산	60	2.7%
구	53	2.4%
신	51	2.3%
동	49	2.2%
천	48	2.2%
정	44	2.0%
청	41	1.8%
원	40	1.8%
지	33	1.5%
Other values (288)	1741	78.2%

Common

Value	Count	Frequency (%)
3	5	31.2%
4	3	18.8%
.	2	12.5%
1	2	12.5%
2	1	6.2%
9	1	6.2%
?	1	6.2%
5	1	6.2%

Most occurring blocks

Value	Count	Frequency (%)
Hangul	2226	99.3%
ASCII	16	0.7%

Most frequent character per block

Hangul

Value	Count	Frequency (%)
대	66	3.0%
산	60	2.7%
구	53	2.4%
신	51	2.3%
동	49	2.2%
천	48	2.2%
정	44	2.0%
청	41	1.8%
원	40	1.8%
지	33	1.5%
Other values (288)	1741	78.2%

ASCII

Value	Count	Frequency (%)
3	5	31.2%
4	3	18.8%
.	2	12.5%
1	2	12.5%
2	1	6.2%
9	1	6.2%
?	1	6.2%
5	1	6.2%

전철명명(영문)
Text

Distinct	650
Distinct (%)	82.9%
Missing	0
Missing (%)	0.0%
Memory size	6.3 KiB

Length

Max length	46
Median length	34
Mean length	9.747449
Min length	3

Characters and Unicode

Total characters	7642
Distinct characters	65
Distinct categories	10
Distinct scripts	2
Distinct blocks	2

Unique

Unique	537
Unique (%)	68.5%

Sample

1st row	Seokgye
2nd row	Seoul Station
3rd row	Yeongdeungpo
4th row	Onyang oncheon
5th row	Dujeong

Value	Count	Frequency (%)
univ	34	3.2%
park	16	1.5%
office	15	1.4%
city	14	1.3%
incheon	11	1.0%
seoul	11	1.0%
hall	10	1.0%
market	9	0.9%
airport	9	0.9%
complex	8	0.8%
Other values (669)	912	86.9%

Most occurring characters

Value	Count	Frequency (%)
n	985	12.9%
o	732	9.6%
a	693	9.1%
g	591	7.7%
e	579	7.6%
i	353	4.6%
u	308	4.0%
(space)	274	3.6%
s	204	2.7%
m	192	2.5%
Other values (55)	2731	35.7%

Most occurring categories

Value	Count	Frequency (%)
Lowercase Letter	6203	81.2%
Uppercase Letter	1029	13.5%
Space Separator	274	3.6%
Dash Punctuation	46	0.6%
Other Punctuation	45	0.6%
Decimal Number	13	0.2%
Close Punctuation	10	0.1%
Open Punctuation	10	0.1%
Modifier Symbol	8	0.1%
Final Punctuation	4	0.1%

Most frequent character per category

Lowercase Letter

Value	Count	Frequency (%)
n	985	15.9%
o	732	11.8%
a	693	11.2%
g	591	9.5%
e	579	9.3%
i	353	5.7%
u	308	5.0%
s	204	3.3%
m	192	3.1%
l	185	3.0%
Other values (15)	1381	22.3%

Uppercase Letter

Value	Count	Frequency (%)
S	183	17.8%
G	120	11.7%
D	75	7.3%
C	72	7.0%
B	63	6.1%
H	59	5.7%
M	59	5.7%
J	55	5.3%
U	47	4.6%
Y	43	4.2%
Other values (14)	253	24.6%

Decimal Number

Value	Count	Frequency (%)
3	5	38.5%
1	3	23.1%
4	2	15.4%
5	1	7.7%
2	1	7.7%
9	1	7.7%

Other Punctuation

Value	Count	Frequency (%)
.	39	86.7%
'	4	8.9%
?	1	2.2%
,	1	2.2%

Space Separator

Value	Count	Frequency (%)
(space)	274	100.0%

Dash Punctuation

Value	Count	Frequency (%)
-	46	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	10	100.0%

Open Punctuation

Value	Count	Frequency (%)
(	10	100.0%

Modifier Symbol

Value	Count	Frequency (%)
`	8	100.0%

Final Punctuation

Value	Count	Frequency (%)
’	4	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Latin	7232	94.6%
Common	410	5.4%

Most frequent character per script

Latin

Value	Count	Frequency (%)
n	985	13.6%
o	732	10.1%
a	693	9.6%
g	591	8.2%
e	579	8.0%
i	353	4.9%
u	308	4.3%
s	204	2.8%
m	192	2.7%
l	185	2.6%
Other values (39)	2410	33.3%

Common

Value	Count	Frequency (%)
(space)	274	66.8%
-	46	11.2%
.	39	9.5%
)	10	2.4%
(	10	2.4%
`	8	2.0%
3	5	1.2%
’	4	1.0%
'	4	1.0%
1	3	0.7%
Other values (6)	7	1.7%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	7638	99.9%
Punctuation	4	0.1%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
n	985	12.9%
o	732	9.6%
a	693	9.1%
g	591	7.7%
e	579	7.6%
i	353	4.6%
u	308	4.0%
(space)	274	3.6%
s	204	2.7%
m	192	2.5%
Other values (54)	2727	35.7%

Punctuation

Value	Count	Frequency (%)
’	4	100.0%

호선
Categorical

Distinct	24
Distinct (%)	3.1%
Missing	0
Missing (%)	0.0%
Memory size	6.3 KiB

01호선	102
수인분당선	63
경의선	57
05호선	56
07호선	53
Other values (19)	453

Length

Max length	7
Median length	4
Mean length	4.0522959
Min length	3

Unique

Unique	0
Unique (%)	0.0%

Sample

1st row	01호선
2nd row	01호선
3rd row	01호선
4th row	01호선
5th row	01호선

Common Values

Value	Count	Frequency (%)
01호선	102	13.0%
수인분당선	63	8.0%
경의선	57	7.3%
05호선	56	7.1%
07호선	53	6.8%
04호선	51	6.5%
02호선	51	6.5%
03호선	44	5.6%
06호선	39	5.0%
09호선	38	4.8%
Other values (14)	230	29.3%

Length

Histogram of lengths of the category

외부코드
Text

UNIQUE

Distinct	784
Distinct (%)	100.0%
Missing	0
Missing (%)	0.0%
Memory size	6.3 KiB

Length

Max length	6
Median length	3
Mean length	3.4145408
Min length	2

Characters and Unicode

Total characters	2677
Distinct characters	20
Distinct categories	3
Distinct scripts	2
Distinct blocks	1

Unique

Unique	784
Unique (%)	100.0%

Sample

1st row	120
2nd row	133
3rd row	139
4th row	P176
5th row	P168

Value	Count	Frequency (%)
120	1	0.1%
k327	1	0.1%
p134	1	0.1%
k324	1	0.1%
k336	1	0.1%
k334	1	0.1%
k333	1	0.1%
k326	1	0.1%
k320	1	0.1%
k138	1	0.1%
Other values (774)	774	98.7%

Most occurring characters

Value	Count	Frequency (%)
1	513	19.2%
2	392	14.6%
3	282	10.5%
4	251	9.4%
5	203	7.6%
0	155	5.8%
6	149	5.6%
7	141	5.3%
9	134	5.0%
K	130	4.9%
Other values (10)	327	12.2%

Most occurring categories

Value	Count	Frequency (%)
Decimal Number	2311	86.3%
Uppercase Letter	353	13.2%
Dash Punctuation	13	0.5%

Most frequent character per category

Decimal Number

Value	Count	Frequency (%)
1	513	22.2%
2	392	17.0%
3	282	12.2%
4	251	10.9%
5	203	8.8%
0	155	6.7%
6	149	6.4%
7	141	6.1%
9	134	5.8%
8	91	3.9%

Uppercase Letter

Value	Count	Frequency (%)
K	130	36.8%
P	71	20.1%
I	57	16.1%
S	32	9.1%
D	16	4.5%
Y	15	4.2%
U	15	4.2%
A	14	4.0%
X	3	0.8%

Dash Punctuation

Value	Count	Frequency (%)
-	13	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Common	2324	86.8%
Latin	353	13.2%

Most frequent character per script

Common

Value	Count	Frequency (%)
1	513	22.1%
2	392	16.9%
3	282	12.1%
4	251	10.8%
5	203	8.7%
0	155	6.7%
6	149	6.4%
7	141	6.1%
9	134	5.8%
8	91	3.9%

Latin

Value	Count	Frequency (%)
K	130	36.8%
P	71	20.1%
I	57	16.1%
S	32	9.1%
D	16	4.5%
Y	15	4.2%
U	15	4.2%
A	14	4.0%
X	3	0.8%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	2677	100.0%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
1	513	19.2%
2	392	14.6%
3	282	10.5%
4	251	9.4%
5	203	7.6%
0	155	5.8%
6	149	5.6%
7	141	5.3%
9	134	5.0%
K	130	4.9%
Other values (10)	327	12.2%

전철명명(중문)
Text

Distinct	624
Distinct (%)	79.9%
Missing	3
Missing (%)	0.4%
Memory size	6.3 KiB

Length

Max length	14
Median length	2
Mean length	3.006402
Min length	2

Characters and Unicode

Total characters	2348
Distinct characters	444
Distinct categories	7
Distinct scripts	4
Distinct blocks	4

Unique

Unique	517
Unique (%)	66.2%

Sample

1st row	石溪
2nd row	首?
3rd row	永登浦
4th row	???泉
5th row	斗井

Value	Count	Frequency (%)
	25	3.2%
山	11	1.4%
大	9	1.1%
水	5	0.6%
江南	5	0.6%
江	5	0.6%
西	4	0.5%
富平	4	0.5%
洞	4	0.5%
新	4	0.5%
Other values (588)	711	90.3%

Most occurring characters

Value	Count	Frequency (%)
?	542	23.1%
大	67	2.9%
山	59	2.5%
新	46	2.0%
川	37	1.6%
南	26	1.1%
谷	26	1.1%
浦	23	1.0%
市	23	1.0%
路	23	1.0%
Other values (434)	1476	62.9%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	1759	74.9%
Other Punctuation	542	23.1%
Close Punctuation	18	0.8%
Open Punctuation	18	0.8%
Space Separator	6	0.3%
Uppercase Letter	3	0.1%
Decimal Number	2	0.1%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
大	67	3.8%
山	59	3.4%
新	46	2.6%
川	37	2.1%
南	26	1.5%
谷	26	1.5%
浦	23	1.3%
市	23	1.3%
路	23	1.3%
水	21	1.2%
Other values (426)	1408	80.0%

Uppercase Letter

Value	Count	Frequency (%)
D	2	66.7%
P	1	33.3%

Decimal Number

Value	Count	Frequency (%)
2	1	50.0%
1	1	50.0%

Other Punctuation

Value	Count	Frequency (%)
?	542	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	18	100.0%

Open Punctuation

Value	Count	Frequency (%)
(	18	100.0%

Space Separator

Value	Count	Frequency (%)
(space)	6	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Han	1750	74.5%
Common	586	25.0%
Hangul	9	0.4%
Latin	3	0.1%

Most frequent character per script

Han

Value	Count	Frequency (%)
大	67	3.8%
山	59	3.4%
新	46	2.6%
川	37	2.1%
南	26	1.5%
谷	26	1.5%
浦	23	1.3%
市	23	1.3%
路	23	1.3%
水	21	1.2%
Other values (423)	1399	79.9%

Common

Value	Count	Frequency (%)
?	542	92.5%
)	18	3.1%
(	18	3.1%
(space)	6	1.0%
2	1	0.2%
1	1	0.2%

Hangul

Value	Count	Frequency (%)
쒧	5	55.6%
쎱	3	33.3%
씉	1	11.1%

Latin

Value	Count	Frequency (%)
D	2	66.7%
P	1	33.3%

Most occurring blocks

Value	Count	Frequency (%)
CJK	1728	73.6%
ASCII	589	25.1%
CJK Compat Ideographs	22	0.9%
Hangul	9	0.4%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
?	542	92.0%
)	18	3.1%
(	18	3.1%
(space)	6	1.0%
D	2	0.3%
2	1	0.2%
1	1	0.2%
P	1	0.2%

CJK

Value	Count	Frequency (%)
大	67	3.9%
山	59	3.4%
新	46	2.7%
川	37	2.1%
南	26	1.5%
谷	26	1.5%
浦	23	1.3%
市	23	1.3%
路	23	1.3%
水	21	1.2%
Other values (408)	1377	79.7%

Hangul

Value	Count	Frequency (%)
쒧	5	55.6%
쎱	3	33.3%
씉	1	11.1%

CJK Compat Ideographs

Value	Count	Frequency (%)
金	4	18.2%
梨	3	13.6%
女	2	9.1%
陵	2	9.1%
林	1	4.5%
臨	1	4.5%
龍	1	4.5%
良	1	4.5%
樂	1	4.5%
落	1	4.5%
Other values (5)	5	22.7%

전철명명(일문)
Text

MISSING

Distinct	651
Distinct (%)	84.7%
Missing	15
Missing (%)	1.9%
Memory size	6.3 KiB

Length

Max length	22
Median length	15
Mean length	5.1456437
Min length	2

Characters and Unicode

Total characters	3957
Distinct characters	154
Distinct categories	7
Distinct scripts	4
Distinct blocks	5

Unique

Unique	550
Unique (%)	71.5%

Sample

1st row	ソッケ
2nd row	ソウル
3rd row	ヨンドゥンポ
4th row	オニャンオンチョン
5th row	トゥジョン

Value	Count	Frequency (%)
ワンシムニ	4	0.5%
コンドク	4	0.5%
サンボン	4	0.5%
チョンニャンニ	4	0.5%
スソ	3	0.4%
シチョン	3	0.4%
チョンノサムガ	3	0.4%
フェギ	3	0.4%
ホンデイック	3	0.4%
カンナム	3	0.4%
Other values (640)	738	95.6%

Most occurring characters

Value	Count	Frequency (%)
ン	866	21.9%
ョ	247	6.2%
チ	208	5.3%
ク	159	4.0%
サ	119	3.0%
ム	114	2.9%
ル	107	2.7%
ジ	100	2.5%
シ	84	2.1%
ウ	83	2.1%
Other values (144)	1870	47.3%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	3891	98.3%
Other Punctuation	50	1.3%
Close Punctuation	4	0.1%
Open Punctuation	4	0.1%
Space Separator	3	0.1%
Uppercase Letter	3	0.1%
Decimal Number	2	0.1%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
ン	866	22.3%
ョ	247	6.3%
チ	208	5.3%
ク	159	4.1%
サ	119	3.1%
ム	114	2.9%
ル	107	2.7%
ジ	100	2.6%
シ	84	2.2%
ウ	83	2.1%
Other values (133)	1804	46.4%

Other Punctuation

Value	Count	Frequency (%)
?	28	56.0%
·	20	40.0%
,	1	2.0%
.	1	2.0%

Uppercase Letter

Value	Count	Frequency (%)
D	2	66.7%
P	1	33.3%

Decimal Number

Value	Count	Frequency (%)
1	1	50.0%
２	1	50.0%

Close Punctuation

Value	Count	Frequency (%)
)	4	100.0%

Open Punctuation

Value	Count	Frequency (%)
(	4	100.0%

Space Separator

Value	Count	Frequency (%)
(space)	3	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Katakana	3770	95.3%
Han	121	3.1%
Common	63	1.6%
Latin	3	0.1%

Most frequent character per script

Katakana

Value	Count	Frequency (%)
ン	866	23.0%
ョ	247	6.6%
チ	208	5.5%
ク	159	4.2%
サ	119	3.2%
ム	114	3.0%
ル	107	2.8%
ジ	100	2.7%
シ	84	2.2%
ウ	83	2.2%
Other values (63)	1683	44.6%

Han

Value	Count	Frequency (%)
市	6	5.0%
大	6	5.0%
川	5	4.1%
場	5	4.1%
仁	4	3.3%
山	4	3.3%
谷	4	3.3%
石	4	3.3%
黔	3	2.5%
港	3	2.5%
Other values (60)	77	63.6%

Common

Value	Count	Frequency (%)
?	28	44.4%
·	20	31.7%
)	4	6.3%
(	4	6.3%
(space)	3	4.8%
,	1	1.6%
.	1	1.6%
1	1	1.6%
２	1	1.6%

Latin

Value	Count	Frequency (%)
D	2	66.7%
P	1	33.3%

Most occurring blocks

Value	Count	Frequency (%)
Katakana	3770	95.3%
CJK	119	3.0%
ASCII	45	1.1%
None	21	0.5%
CJK Compat Ideographs	2	0.1%

Most frequent character per block

Katakana

Value	Count	Frequency (%)
ン	866	23.0%
ョ	247	6.6%
チ	208	5.5%
ク	159	4.2%
サ	119	3.2%
ム	114	3.0%
ル	107	2.8%
ジ	100	2.7%
シ	84	2.2%
ウ	83	2.2%
Other values (63)	1683	44.6%

ASCII

Value	Count	Frequency (%)
?	28	62.2%
)	4	8.9%
(	4	8.9%
(space)	3	6.7%
D	2	4.4%
,	1	2.2%
P	1	2.2%
.	1	2.2%
1	1	2.2%

None

Value	Count	Frequency (%)
·	20	95.2%
２	1	4.8%

CJK

Value	Count	Frequency (%)
市	6	5.0%
大	6	5.0%
川	5	4.2%
場	5	4.2%
仁	4	3.4%
山	4	3.4%
谷	4	3.4%
石	4	3.4%
黔	3	2.5%
港	3	2.5%
Other values (58)	75	63.0%

CJK Compat Ideographs

Value	Count	Frequency (%)
女	1	50.0%
陵	1	50.0%

A simple visualization of nullity by column.

The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
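
One common way to quantify nullity correlation is the Pearson correlation between 0/1 "is missing" indicators of two columns. The sketch below assumes that definition, which may differ from what the report generator computes internally. The rows are hypothetical (they are not taken from the dataset), and `pearson` is a small helper written for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical rows: (Chinese name, Japanese name); None marks a missing cell.
rows = [
    ("石溪", "ソッケ"),
    (None, None),
    ("斗井", None),
    ("大方", "テバン"),
    (None, None),
    ("朱安", "チュアン"),
]

# 1 where the cell is missing, 0 where it is present.
zh_null = [int(zh is None) for zh, _ in rows]
ja_null = [int(ja is None) for _, ja in rows]

r = pearson(zh_null, ja_null)
print(round(r, 3))  # 0.707
```

A value near 1 means the two columns tend to be missing in the same rows; a value near 0 means missingness in one says little about the other.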

First rows

	전철역코드	전철역명	전철명명(영문)	호선	외부코드	전철명명(중문)	전철명명(일문)
0	1018	석계	Seokgye	01호선	120	石溪	ソッケ
1	0150	서울역	Seoul Station	01호선	133	首?	ソウル
2	1006	영등포	Yeongdeungpo	01호선	139	永登浦	ヨンドゥンポ
3	1407	온양온천	Onyang oncheon	01호선	P176	???泉	オニャンオンチョン
4	1727	두정	Dujeong	01호선	P168	斗井	トゥジョン
5	1720	진위	Jinwi	01호선	P161	振威	チヌィ
6	1005	대방	Daebang	01호선	137	大方	テバン
7	1910	덕계	Deokgye	01호선	106	德溪	トッケ
8	1809	주안	Juan	01호선	156	朱安	チュアン
9	1749	서동탄	Seodongtan	01호선	P157-1	西??	ソドンタン

Last rows

	전철역코드	전철역명	전철명명(영문)	호선	외부코드	전철명명(중문)	전철명명(일문)
774	3121	동수	Dongsu	인천선	I121	??	トンス
775	3120	부평	Bupyeong	인천선	I120	富平	プピョン
776	3119	부평시장	Bupyeong Market	인천선	I119	富平市場	富平市場
777	3118	부평구청	Bupyeong-gu Office	인천선	I118	富平??	プピョングチョン
778	3117	갈산	Galsan	인천선	I117	葛山	カルサン
779	3116	작전	Jakjeon	인천선	I116	?田	チャクチョン
780	3110	계양	Gyeyang	인천선	I110	桂?	ケヤン
781	3138	국제업무지구	Intl. Business District	인천선	I138	??????	ククチェオンムジグ
782	3137	센트럴파크	Central Park	인천선	I137	中央公?	セントラルパ?ク
783	3136	인천대입구	Incheon Nat'l Univ.	인천선	I136	仁川大?	インチョンデイック