gimi9 Pandas Profiling

Dataset statistics

Number of variables	10
Number of observations	99
Missing cells	0
Missing cells (%)	0.0%
Duplicate rows	0
Duplicate rows (%)	0.0%
Total size in memory	8.0 KiB
Average record size in memory	82.3 B

Variable types

Categorical	2
Text	6
Boolean	1
Numeric	1

Dataset

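Description	Station data for Seoul Metropolitan Subway Line 1 (수도권 1호선), with the following fields: railway operator name (철도운영기관명), line name (선명), station name (역명), English name (영어명), romanization (로마자), Japanese (일본어), Simplified Chinese (중국어간체), Traditional Chinese (중국어번체), transfer station flag (환승역여부), and opening date (신설일자).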
Author	Korea National Railway (국가철도공단)
URL	https://www.data.go.kr/data/15041013/fileData.do

Alerts

`선명` has constant value "1호선"	Constant
`철도운영기관명` is highly imbalanced (52.8%)	Imbalance
`역명` has unique values	Unique
`영어명` has unique values	Unique
`로마자` has unique values	Unique
`중국어간체` has unique values	Unique
`중국어번체` has unique values	Unique

Reproduction

Analysis started	2023-12-12 03:53:03.509599
Analysis finished	2023-12-12 03:53:05.025027
Duration	1.52 seconds
Software version	ydata-profiling v4.5.1
Download configuration	config.json

철도운영기관명
Categorical

IMBALANCE

Distinct	2
Distinct (%)	2.0%
Missing	0
Missing (%)	0.0%
Memory size	924.0 B

코레일	89
서울교통공사	10

Length

Max length	6
Median length	3
Mean length	3.3030303
Min length	3

Unique

Unique	0
Unique (%)	0.0%

Sample

1st row	코레일
2nd row	코레일
3rd row	코레일
4th row	코레일
5th row	코레일

Common Values

Value	Count	Frequency (%)
코레일	89	89.9%
서울교통공사	10	10.1%
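
The 52.8% imbalance figure in the alert for `철도운영기관명` can be reproduced from the two counts above. The sketch below assumes the score is defined as one minus the Shannon entropy of the class distribution divided by its maximum possible value; the report does not state the formula, so treat this definition and the function name `imbalance_score` as assumptions.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy (in bits) of the class counts."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        # Assumption: with a single class there is nothing to compare, so report 0.
        return 0.0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)

# Counts taken from the table above: 코레일 = 89, 서울교통공사 = 10.
score = imbalance_score([89, 10])
print(f"{score:.1%}")
```

A perfectly balanced column scores 0 under this definition, and the score approaches 1 as one class dominates.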

선명
Categorical

CONSTANT

Distinct	1
Distinct (%)	1.0%
Missing	0
Missing (%)	0.0%
Memory size	924.0 B

1호선	99

Length

Max length	3
Median length	3
Mean length	3
Min length	3

Unique

Unique	0
Unique (%)	0.0%

Sample

1st row	1호선
2nd row	1호선
3rd row	1호선
4th row	1호선
5th row	1호선

Common Values

Value	Count	Frequency (%)
1호선	99	100.0%

역명
Text

UNIQUE

Distinct	99
Distinct (%)	100.0%
Missing	0
Missing (%)	0.0%
Memory size	924.0 B

Length

Max length	12
Median length	2
Mean length	2.6464646
Min length	2

Characters and Unicode

Total characters	262
Distinct characters	118
Distinct categories	4
Distinct scripts	2
Distinct blocks	2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
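
As a rough illustration of how such per-character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` and the small label mapping are illustrative assumptions, not the profiler's own code. Script and block lookups are not available in the standard library, so only categories are shown.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this column.
CATEGORY_LABELS = {
    "Lo": "Other Letter",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}

def category_counts(text):
    """Count the Unicode general category of every character in text."""
    return Counter(unicodedata.category(ch) for ch in text)

# The longest station name in the sample rows (12 characters).
counts = category_counts("청량리(서울시립대입구)")
for code, n in counts.items():
    print(CATEGORY_LABELS.get(code, code), n)
```

Summing such counters over every value in the column would give totals of the kind listed under "Most occurring categories".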

Unique

Unique	99
Unique (%)	100.0%

Sample

1st row	가능
2nd row	가산디지털단지
3rd row	간석
4th row	개봉
5th row	관악

Value	Count	Frequency (%)
가능	1	1.0%
소요산	1	1.0%
의왕	1	1.0%
월계	1	1.0%
용산	1	1.0%
외대앞	1	1.0%
온양온천	1	1.0%
온수	1	1.0%
오산대	1	1.0%
오산	1	1.0%
Other values (89)	89	89.9%

Most occurring characters

Value	Count	Frequency (%)
동	12	4.6%
천	10	3.8%
산	10	3.8%
대	9	3.4%
정	7	2.7%
서	5	1.9%
신	5	1.9%
도	5	1.9%
부	4	1.5%
지	4	1.5%
Other values (108)	191	72.9%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	254	96.9%
Open Punctuation	3	1.1%
Close Punctuation	3	1.1%
Decimal Number	2	0.8%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
동	12	4.7%
천	10	3.9%
산	10	3.9%
대	9	3.5%
정	7	2.8%
서	5	2.0%
신	5	2.0%
도	5	2.0%
부	4	1.6%
지	4	1.6%
Other values (104)	183	72.0%

Decimal Number

Value	Count	Frequency (%)
5	1	50.0%
3	1	50.0%

Open Punctuation

Value	Count	Frequency (%)
(	3	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	3	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Hangul	254	96.9%
Common	8	3.1%

Most frequent character per script

Hangul

Value	Count	Frequency (%)
동	12	4.7%
천	10	3.9%
산	10	3.9%
대	9	3.5%
정	7	2.8%
서	5	2.0%
신	5	2.0%
도	5	2.0%
부	4	1.6%
지	4	1.6%
Other values (104)	183	72.0%

Common

Value	Count	Frequency (%)
(	3	37.5%
)	3	37.5%
5	1	12.5%
3	1	12.5%

Most occurring blocks

Value	Count	Frequency (%)
Hangul	254	96.9%
ASCII	8	3.1%

Most frequent character per block

Hangul

Value	Count	Frequency (%)
동	12	4.7%
천	10	3.9%
산	10	3.9%
대	9	3.5%
정	7	2.8%
서	5	2.0%
신	5	2.0%
도	5	2.0%
부	4	1.6%
지	4	1.6%
Other values (104)	183	72.0%

ASCII

Value	Count	Frequency (%)
(	3	37.5%
)	3	37.5%
5	1	12.5%
3	1	12.5%

영어명
Text

UNIQUE

Distinct	99
Distinct (%)	100.0%
Missing	0
Missing (%)	0.0%
Memory size	924.0 B

Length

Max length	34
Median length	21
Mean length	9.3030303
Min length	4

Characters and Unicode

Total characters	921
Distinct characters	52
Distinct categories	8
Distinct scripts	2
Distinct blocks	2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique	99
Unique (%)	100.0%

Sample

1st row	Ganeung
2nd row	Gasan Digital Complex
3rd row	Ganseok
4th row	Gaebong
5th row	Gwanak

Value	Count	Frequency (%)
univ	6	5.0%
station	3	2.5%
of	2	1.7%
osan	2	1.7%
jongno	2	1.7%
seoul	2	1.7%
hankuk	1	0.8%
oryu-dong	1	0.8%
onsu	1	0.8%
onyangoncheon	1	0.8%
Other values (100)	100	82.6%

Most occurring characters

Value	Count	Frequency (%)
n	137	14.9%
o	106	11.5%
g	84	9.1%
e	73	7.9%
a	73	7.9%
i	38	4.1%
u	35	3.8%
y	26	2.8%
S	25	2.7%
s	22	2.4%
Other values (42)	302	32.8%

Most occurring categories

Value	Count	Frequency (%)
Lowercase Letter	758	82.3%
Uppercase Letter	117	12.7%
Space Separator	24	2.6%
Other Punctuation	6	0.7%
Close Punctuation	5	0.5%
Open Punctuation	5	0.5%
Dash Punctuation	4	0.4%
Decimal Number	2	0.2%

Most frequent character per category

Lowercase Letter

Value	Count	Frequency (%)
n	137	18.1%
o	106	14.0%
g	84	11.1%
e	73	9.6%
a	73	9.6%
i	38	5.0%
u	35	4.6%
y	26	3.4%
s	22	2.9%
k	20	2.6%
Other values (15)	144	19.0%

Uppercase Letter

Value	Count	Frequency (%)
S	25	21.4%
D	17	14.5%
J	11	9.4%
G	11	9.4%
U	9	7.7%
B	8	6.8%
H	5	4.3%
O	5	4.3%
N	5	4.3%
C	5	4.3%
Other values (9)	16	13.7%

Space Separator

Value	Count	Frequency (%)
	21	87.5%
	3	12.5%

Decimal Number

Value	Count	Frequency (%)
3	1	50.0%
5	1	50.0%

Other Punctuation

Value	Count	Frequency (%)
.	6	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	5	100.0%

Open Punctuation

Value	Count	Frequency (%)
(	5	100.0%

Dash Punctuation

Value	Count	Frequency (%)
-	4	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Latin	875	95.0%
Common	46	5.0%

Most frequent character per script

Latin

Value	Count	Frequency (%)
n	137	15.7%
o	106	12.1%
g	84	9.6%
e	73	8.3%
a	73	8.3%
i	38	4.3%
u	35	4.0%
y	26	3.0%
S	25	2.9%
s	22	2.5%
Other values (34)	256	29.3%

Common

Value	Count	Frequency (%)
	21	45.7%
.	6	13.0%
)	5	10.9%
(	5	10.9%
-	4	8.7%
	3	6.5%
3	1	2.2%
5	1	2.2%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	918	99.7%
None	3	0.3%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
n	137	14.9%
o	106	11.5%
g	84	9.2%
e	73	8.0%
a	73	8.0%
i	38	4.1%
u	35	3.8%
y	26	2.8%
S	25	2.7%
s	22	2.4%
Other values (41)	299	32.6%

None

Value	Count	Frequency (%)
	3	100.0%

로마자
Text

UNIQUE

Distinct	99
Distinct (%)	100.0%
Missing	0
Missing (%)	0.0%
Memory size	924.0 B

Length

Max length	34
Median length	20
Mean length	9.1111111
Min length	4

Characters and Unicode

Total characters	902
Distinct characters	52
Distinct categories	8
Distinct scripts	2
Distinct blocks	2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique	99
Unique (%)	100.0%

Sample

1st row	Ganeung
2nd row	GasanDigital Complex
3rd row	Ganseok
4th row	Gaebong
5th row	Gwanak

Value	Count	Frequency (%)
univ	3	2.7%
dongducheon	2	1.8%
station	2	1.8%
pyeongtaek	2	1.8%
osan	2	1.8%
ganeung	1	0.9%
yeongdeungpo	1	0.9%
uijeongbu	1	0.9%
uiwang	1	0.9%
wolgye	1	0.9%
Other values (97)	97	85.8%

Most occurring characters

Value	Count	Frequency (%)
n	135	15.0%
o	103	11.4%
g	85	9.4%
e	73	8.1%
a	72	8.0%
i	38	4.2%
u	36	4.0%
S	27	3.0%
y	25	2.8%
k	20	2.2%
Other values (42)	288	31.9%

Most occurring categories

Value	Count	Frequency (%)
Lowercase Letter	746	82.7%
Uppercase Letter	118	13.1%
Space Separator	16	1.8%
Dash Punctuation	8	0.9%
Other Punctuation	6	0.7%
Open Punctuation	3	0.3%
Close Punctuation	3	0.3%
Decimal Number	2	0.2%

Most frequent character per category

Lowercase Letter

Value	Count	Frequency (%)
n	135	18.1%
o	103	13.8%
g	85	11.4%
e	73	9.8%
a	72	9.7%
i	38	5.1%
u	36	4.8%
y	25	3.4%
k	20	2.7%
h	19	2.5%
Other values (15)	140	18.8%

Uppercase Letter

Value	Count	Frequency (%)
S	27	22.9%
D	17	14.4%
J	11	9.3%
G	11	9.3%
B	8	6.8%
U	8	6.8%
C	5	4.2%
H	5	4.2%
O	5	4.2%
N	5	4.2%
Other values (9)	16	13.6%

Space Separator

Value	Count	Frequency (%)
	13	81.2%
	3	18.8%

Decimal Number

Value	Count	Frequency (%)
3	1	50.0%
5	1	50.0%

Dash Punctuation

Value	Count	Frequency (%)
-	8	100.0%

Other Punctuation

Value	Count	Frequency (%)
.	6	100.0%

Open Punctuation

Value	Count	Frequency (%)
(	3	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	3	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Latin	864	95.8%
Common	38	4.2%

Most frequent character per script

Latin

Value	Count	Frequency (%)
n	135	15.6%
o	103	11.9%
g	85	9.8%
e	73	8.4%
a	72	8.3%
i	38	4.4%
u	36	4.2%
S	27	3.1%
y	25	2.9%
k	20	2.3%
Other values (34)	250	28.9%

Common

Value	Count	Frequency (%)
	13	34.2%
-	8	21.1%
.	6	15.8%
(	3	7.9%
)	3	7.9%
	3	7.9%
3	1	2.6%
5	1	2.6%

Most occurring blocks

Value	Count	Frequency (%)
ASCII	899	99.7%
None	3	0.3%

Most frequent character per block

ASCII

Value	Count	Frequency (%)
n	135	15.0%
o	103	11.5%
g	85	9.5%
e	73	8.1%
a	72	8.0%
i	38	4.2%
u	36	4.0%
S	27	3.0%
y	25	2.8%
k	20	2.2%
Other values (41)	285	31.7%

None

Value	Count	Frequency (%)
	3	100.0%

일본어
Text

Distinct	98
Distinct (%)	99.0%
Missing	0
Missing (%)	0.0%
Memory size	924.0 B

Length

Max length	17
Median length	12
Mean length	4.8181818
Min length	2

Characters and Unicode

Total characters	477
Distinct characters	70
Distinct categories	3
Distinct scripts	3
Distinct blocks	3

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique	97
Unique (%)	98.0%

Sample

1st row	カヌン
2nd row	カサンデジタルダンジ
3rd row	カンソク
4th row	ケボン
5th row	クァナク

Value	Count	Frequency (%)
タンジョン	2	2.0%
ソンタン	1	1.0%
ウィワン	1	1.0%
ウォルゲ	1	1.0%
ヨンサン	1	1.0%
ウェデアプ	1	1.0%
オニャンオンチョン	1	1.0%
オンス	1	1.0%
オサンデ	1	1.0%
オサン	1	1.0%
Other values (88)	88	88.9%
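
`일본어` is the only text column where Distinct (98) and Unique (97) differ, because the value タンジョン occurs twice. The sketch below shows one reading of the two counts that is consistent with these numbers: "distinct" counts different values, while "unique" counts values that occur exactly once. The function name and the placeholder station labels are hypothetical.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values occurring exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# 99 rows: one value repeated twice plus 97 placeholder values that occur once each.
column = ["タンジョン", "タンジョン"] + [f"station_{i}" for i in range(97)]
print(distinct_and_unique(column))
```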

Most occurring characters

Value	Count	Frequency (%)
ン	114	23.9%
ョ	34	7.1%
チ	27	5.7%
ク	21	4.4%
ソ	15	3.1%
ト	15	3.1%
サ	14	2.9%
ジ	12	2.5%
ド	9	1.9%
ム	9	1.9%
Other values (60)	207	43.4%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	473	99.2%
Open Punctuation	2	0.4%
Close Punctuation	2	0.4%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
ン	114	24.1%
ョ	34	7.2%
チ	27	5.7%
ク	21	4.4%
ソ	15	3.2%
ト	15	3.2%
サ	14	3.0%
ジ	12	2.5%
ド	9	1.9%
ム	9	1.9%
Other values (58)	203	42.9%

Open Punctuation

Value	Count	Frequency (%)
(	2	100.0%

Close Punctuation

Value	Count	Frequency (%)
)	2	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Katakana	466	97.7%
Han	7	1.5%
Common	4	0.8%

Most frequent character per script

Katakana

Value	Count	Frequency (%)
ン	114	24.5%
ョ	34	7.3%
チ	27	5.8%
ク	21	4.5%
ソ	15	3.2%
ト	15	3.2%
サ	14	3.0%
ジ	12	2.6%
ド	9	1.9%
ム	9	1.9%
Other values (53)	196	42.1%

Han

Value	Count	Frequency (%)
大	2	28.6%
学	2	28.6%
駅	1	14.3%
九	1	14.3%
老	1	14.3%

Common

Value	Count	Frequency (%)
(	2	50.0%
)	2	50.0%

Most occurring blocks

Value	Count	Frequency (%)
Katakana	466	97.7%
CJK	7	1.5%
ASCII	4	0.8%

Most frequent character per block

Katakana

Value	Count	Frequency (%)
ン	114	24.5%
ョ	34	7.3%
チ	27	5.8%
ク	21	4.5%
ソ	15	3.2%
ト	15	3.2%
サ	14	3.0%
ジ	12	2.6%
ド	9	1.9%
ム	9	1.9%
Other values (53)	196	42.1%

ASCII

Value	Count	Frequency (%)
(	2	50.0%
)	2	50.0%

CJK

Value	Count	Frequency (%)
大	2	28.6%
学	2	28.6%
駅	1	14.3%
九	1	14.3%
老	1	14.3%

중국어간체
Text

UNIQUE

Distinct	99
Distinct (%)	100.0%
Missing	0
Missing (%)	0.0%
Memory size	924.0 B

Length

Max length	11
Median length	2
Mean length	2.7373737
Min length	2

Characters and Unicode

Total characters	271
Distinct characters	161
Distinct categories	4
Distinct scripts	2
Distinct blocks	3

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique	99
Unique (%)	100.0%

Sample

1st row	佳陵
2nd row	加山数码园区
3rd row	间石驛
4th row	开峰
5th row	冠岳

Value	Count	Frequency (%)
佳陵	1	1.0%
逍遙山	1	1.0%
义王	1	1.0%
月溪	1	1.0%
龙山	1	1.0%
韩国外国语大学	1	1.0%
温阳温泉	1	1.0%
温水	1	1.0%
乌山大学	1	1.0%
乌山	1	1.0%
Other values (89)	89	89.9%

Most occurring characters

Value	Count	Frequency (%)
山	10	3.7%
大	9	3.3%
川	7	2.6%
学	7	2.6%
东	6	2.2%
井	5	1.8%
洞	5	1.8%
新	5	1.8%
道	4	1.5%
)	3	1.1%
Other values (151)	210	77.5%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	264	97.4%
Close Punctuation	3	1.1%
Open Punctuation	3	1.1%
Space Separator	1	0.4%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
山	10	3.8%
大	9	3.4%
川	7	2.7%
学	7	2.7%
东	6	2.3%
井	5	1.9%
洞	5	1.9%
新	5	1.9%
道	4	1.5%
浦	3	1.1%
Other values (148)	203	76.9%

Close Punctuation

Value	Count	Frequency (%)
)	3	100.0%

Open Punctuation

Value	Count	Frequency (%)
(	3	100.0%

Space Separator

Value	Count	Frequency (%)
	1	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Han	264	97.4%
Common	7	2.6%

Most frequent character per script

Han

Value	Count	Frequency (%)
山	10	3.8%
大	9	3.4%
川	7	2.7%
学	7	2.7%
东	6	2.3%
井	5	1.9%
洞	5	1.9%
新	5	1.9%
道	4	1.5%
浦	3	1.1%
Other values (148)	203	76.9%

Common

Value	Count	Frequency (%)
)	3	42.9%
(	3	42.9%
	1	14.3%

Most occurring blocks

Value	Count	Frequency (%)
CJK	264	97.4%
ASCII	6	2.2%
None	1	0.4%

Most frequent character per block

CJK

Value	Count	Frequency (%)
山	10	3.8%
大	9	3.4%
川	7	2.7%
学	7	2.7%
东	6	2.3%
井	5	1.9%
洞	5	1.9%
新	5	1.9%
道	4	1.5%
浦	3	1.1%
Other values (148)	203	76.9%

ASCII

Value	Count	Frequency (%)
)	3	50.0%
(	3	50.0%

None

Value	Count	Frequency (%)
	1	100.0%

중국어번체
Text

UNIQUE

Distinct	99
Distinct (%)	100.0%
Missing	0
Missing (%)	0.0%
Memory size	924.0 B

Length

Max length	12
Median length	2
Mean length	2.6666667
Min length	2

Characters and Unicode

Total characters	264
Distinct characters	165
Distinct categories	4
Distinct scripts	3
Distinct blocks	4

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique	99
Unique (%)	100.0%

Sample

1st row	佳陵
2nd row	加山디지털團地
3rd row	間石
4th row	開峰
5th row	冠岳

Value	Count	Frequency (%)
佳陵	1	1.0%
逍遙山	1	1.0%
義王	1	1.0%
月溪	1	1.0%
龍山	1	1.0%
外大앞	1	1.0%
溫陽溫泉	1	1.0%
溫水	1	1.0%
烏山大	1	1.0%
烏山	1	1.0%
Other values (89)	89	89.9%

Most occurring characters

Value	Count	Frequency (%)
山	10	3.8%
大	9	3.4%
川	7	2.7%
東	6	2.3%
新	5	1.9%
井	5	1.9%
洞	5	1.9%
)	4	1.5%
道	4	1.5%
(	4	1.5%
Other values (155)	205	77.7%

Most occurring categories

Value	Count	Frequency (%)
Other Letter	254	96.2%
Close Punctuation	4	1.5%
Open Punctuation	4	1.5%
Decimal Number	2	0.8%

Most frequent character per category

Other Letter

Value	Count	Frequency (%)
山	10	3.9%
大	9	3.5%
川	7	2.8%
東	6	2.4%
新	5	2.0%
井	5	2.0%
洞	5	2.0%
道	4	1.6%
西	3	1.2%
安	3	1.2%
Other values (151)	197	77.6%

Decimal Number

Value	Count	Frequency (%)
3	1	50.0%
5	1	50.0%

Close Punctuation

Value	Count	Frequency (%)
)	4	100.0%

Open Punctuation

Value	Count	Frequency (%)
(	4	100.0%

Most occurring scripts

Value	Count	Frequency (%)
Han	247	93.6%
Common	10	3.8%
Hangul	7	2.7%

Most frequent character per script

Han

Value	Count	Frequency (%)
山	10	4.0%
大	9	3.6%
川	7	2.8%
東	6	2.4%
新	5	2.0%
井	5	2.0%
洞	5	2.0%
道	4	1.6%
西	3	1.2%
安	3	1.2%
Other values (145)	190	76.9%

Hangul

Value	Count	Frequency (%)
앞	2	28.6%
서	1	14.3%
울	1	14.3%
털	1	14.3%
지	1	14.3%
디	1	14.3%

Common

Value	Count	Frequency (%)
)	4	40.0%
(	4	40.0%
3	1	10.0%
5	1	10.0%

Most occurring blocks

Value	Count	Frequency (%)
CJK	240	90.9%
ASCII	10	3.8%
CJK Compat Ideographs	7	2.7%
Hangul	7	2.7%

Most frequent character per block

CJK

Value	Count	Frequency (%)
山	10	4.2%
大	9	3.8%
川	7	2.9%
東	6	2.5%
新	5	2.1%
井	5	2.1%
洞	5	2.1%
道	4	1.7%
西	3	1.2%
安	3	1.2%
Other values (139)	183	76.2%

ASCII

Value	Count	Frequency (%)
)	4	40.0%
(	4	40.0%
3	1	10.0%
5	1	10.0%

CJK Compat Ideographs

Value	Count	Frequency (%)
龍	2	28.6%
勒	1	14.3%
綠	1	14.3%
鷺	1	14.3%
里	1	14.3%
陵	1	14.3%

Hangul

Value	Count	Frequency (%)
앞	2	28.6%
서	1	14.3%
울	1	14.3%
털	1	14.3%
지	1	14.3%
디	1	14.3%

환승역여부
Boolean

Distinct	2
Distinct (%)	2.0%
Missing	0
Missing (%)	0.0%
Memory size	231.0 B

False	72
True	27

Common Values

Value	Count	Frequency (%)
False	72	72.7%
True	27	27.3%

신설일자
Real number (ℝ)

Distinct	21
Distinct (%)	21.2%
Missing	0
Missing (%)	0.0%
Infinite	0
Infinite (%)	0.0%
Mean	19852163

Minimum	19000708
Maximum	20220607
Zeros	0
Zeros (%)	0.0%
Negative	0
Negative (%)	0.0%
Memory size	1023.0 B

Quantile statistics

Minimum	19000708
5-th percentile	19740815
Q1	19740815
median	19740815
Q3	20050120
95-th percentile	20083110
Maximum	20220607
Range	1219899
Interquartile range (IQR)	309305

Descriptive statistics

Standard deviation	210226.04
Coefficient of variation (CV)	0.010589579
Kurtosis	4.4265764
Mean	19852163
Median Absolute Deviation (MAD)	0
Skewness	-1.3315758
Sum	1.9653641 × 10⁹
Variance	4.4194987 × 10¹⁰
Monotonicity	Not monotonic
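
Several of the figures above are derived from others in the same tables. The short check below recomputes the range, interquartile range, coefficient of variation, and variance from the reported minimum, maximum, quartiles, mean, and standard deviation. It uses the rounded values as printed in the report, so small differences in the last digits are expected.

```python
# Values copied from the statistics above.
minimum, maximum = 19000708, 20220607
q1, q3 = 19740815, 20050120
mean, std = 19852163, 210226.04

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance

print(value_range, iqr)
print(f"{cv:.9f}", f"{variance:.7e}")
```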

Histogram with fixed size bins (bins=21)

Value	Count	Frequency (%)
19740815	50	50.5%
20050120	11	11.1%
20061225	6	6.1%
20081220	5	5.1%
19860902	5	5.1%
20061215	4	4.0%
19850425	3	3.0%
20030430	2	2.0%
20100121	1	1.0%
20220607	1	1.0%
Other values (11)	11	11.1%
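
The values of `신설일자` look like dates packed into integers in YYYYMMDD form (for example 19740815), which is why the profiler treats the column as a real number. Under that assumption, statistics such as the mean or the interpolated 95-th percentile (20083110, which has no valid month) are not meaningful as dates. The sketch below shows one way to decode such integers before analysis; the function name `parse_yyyymmdd` is hypothetical.

```python
from datetime import date

def parse_yyyymmdd(value):
    """Decode an integer such as 19740815 into a date, or return None if it is not a valid date."""
    year, rest = divmod(value, 10000)
    month, day = divmod(rest, 100)
    try:
        return date(year, month, day)
    except ValueError:
        return None

print(parse_yyyymmdd(19740815))   # the most common value in the table above
print(parse_yyyymmdd(20083110))   # the reported 95-th percentile is not a real date
```

After decoding, differences between dates can be computed in days instead of relying on the integer range.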

Minimum 10 values

Value	Count	Frequency (%)
19000708	1	1.0%
19050101	1	1.0%
19111005	1	1.0%
19740815	50	50.5%
19800105	1	1.0%
19850425	3	3.0%
19860902	5	5.1%
19950216	1	1.0%
20030430	2	2.0%
20040401	1	1.0%

Maximum 10 values

Value	Count	Frequency (%)
20220607	1	1.0%
20211030	1	1.0%
20120118	1	1.0%
20100226	1	1.0%
20100121	1	1.0%
20081220	5	5.1%
20081215	1	1.0%
20061225	6	6.1%
20061215	4	4.0%
20051221	1	1.0%

Correlation table

	철도운영기관명	역명	영어명	로마자	일본어	중국어간체	중국어번체	환승역여부	신설일자
철도운영기관명	1.000	1.000	1.000	1.000	1.000	1.000	1.000	0.407	0.106
역명	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000
영어명	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000
로마자	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000
일본어	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000
중국어간체	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000
중국어번체	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000
환승역여부	0.407	1.000	1.000	1.000	1.000	1.000	1.000	1.000	0.212
신설일자	0.106	1.000	1.000	1.000	1.000	1.000	1.000	0.212	1.000

Correlation table

	환승역여부	철도운영기관명
환승역여부	1.000	0.267
철도운영기관명	0.267	1.000
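
The report does not name the measure behind this table. The value 0.267 is consistent with a bias-corrected Cramér's V computed on a Yates-corrected chi-square statistic, so the sketch below uses that as an assumption. The 2x2 counts are inferred from the report, not read from the raw file: 7 of the 10 서울교통공사 rows listed under "Last rows" are marked Y, and the Boolean summary gives 27 True in total, which leaves 20 True and 69 False for 코레일.

```python
import math

def cramers_v_2x2(table):
    """Bias-corrected Cramér's V for a 2x2 table, using a Yates-corrected chi-square."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            # Yates continuity correction: shrink each deviation by 0.5, never below 0.
            diff = max(0.0, abs(observed - expected) - 0.5)
            chi2 += diff ** 2 / expected
    phi2 = chi2 / n
    r = k = 2
    # Bias correction for small samples.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

# Rows: 서울교통공사, 코레일. Columns: transfer station (True), not a transfer station (False).
v = cramers_v_2x2([[7, 3], [20, 69]])
print(round(v, 3))
```

Agreement with the reported value suggests, but does not prove, that this is the measure used.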

Correlation table

	신설일자	철도운영기관명	환승역여부
신설일자	1.000	0.132	0.284
철도운영기관명	0.132	1.000	0.267
환승역여부	0.284	0.267	1.000

Count
A simple visualization of nullity by column.

Matrix
The nullity matrix is a data-dense display that lets you quickly pick out visual patterns in data completion.

First rows

	철도운영기관명	선명	역명	영어명	로마자	일본어	중국어간체	중국어번체	환승역여부	신설일자
0	코레일	1호선	가능	Ganeung	Ganeung	カヌン	佳陵	佳陵	N	20061215
1	코레일	1호선	가산디지털단지	Gasan Digital Complex	GasanDigital Complex	カサンデジタルダンジ	加山数码园区	加山디지털團地	Y	19740815
2	코레일	1호선	간석	Ganseok	Ganseok	カンソク	间石驛	間石	N	19740815
3	코레일	1호선	개봉	Gaebong	Gaebong	ケボン	开峰	開峰	N	19740815
4	코레일	1호선	관악	Gwanak	Gwanak	クァナク	冠岳	冠岳	N	19740815
5	코레일	1호선	광명	Gwangmyeong	Gwangmyeong	クァンミョン	光明	光明	N	20040401
6	코레일	1호선	광운대	Kwangwoon Univ.	Kwangwoon Univ.	クァンウンデ	光云大学	光云大	Y	19111005
7	코레일	1호선	구로	Guro Station	Guro Station	九老駅	九老站	九老	N	19740815
8	코레일	1호선	구일	Guil	Guil	クイル	九一	九一	N	19950216
9	코레일	1호선	군포	Gunpo	Gunpo	クンポ	军浦	軍浦	N	19740815

Last rows

	철도운영기관명	선명	역명	영어명	로마자	일본어	중국어간체	중국어번체	환승역여부	신설일자
89	서울교통공사	1호선	동대문	Dongdaemun	Dongdaemun	トンデムン	东大门	東大門	Y	19740815
90	서울교통공사	1호선	동묘앞	Dongmyo	Dongmyo	トンミョアプ	东庙	東廟앞	Y	20051221
91	서울교통공사	1호선	서울역	Seoul Station	Seoul Station	ソウルヨク	首尔站	首爾(驛)	Y	19740815
92	서울교통공사	1호선	시청	City Hall	City Hall	シチョン	市厅	市廳	Y	19740815
93	서울교통공사	1호선	신설동	Sinseoldong	Sinseol-dong	シンソルトン	新设洞	新設洞	Y	19740815
94	서울교통공사	1호선	제기동	Jegidong	Jegi-dong	チェギドン	祭基洞	祭基洞	N	19740815
95	서울교통공사	1호선	종각	Jonggak	Jonggak	チョンガク	钟阁	鐘閣	N	19740815
96	서울교통공사	1호선	종로3가	Jongno 3(sam)ga	Jongno3-ga	チョンノサムガ	钟路三街	鍾路3街	Y	19740815
97	서울교통공사	1호선	종로5가	Jongno 5(o)ga	Jongno5-ga	チョンノオガ	钟路五街	鍾路5街	N	19740815
98	서울교통공사	1호선	청량리(서울시립대입구)	Cheongnyangni(University of Seoul)	Cheongnyangni (SeoulSiripdaeip-gu)	チョンニャンニ	清凉里(首尔市立大学)	淸凉里(서울市立大入口)	Y	19740815
