Overview

Dataset statistics

Number of variables: 12
Number of observations: 100
Missing cells: 85
Missing cells (%): 7.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 9.9 KiB
Average record size in memory: 101.3 B
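
The missing-cell figures can be cross-checked from the counts the report itself gives. The sketch below is only an illustration: it assumes the percentage is taken over all cells (variables times observations), and it uses the per-column missing counts listed in the Alerts section. The variable names are made up for this example.

```python
# Figures copied from the report: 12 variables, 100 observations.
n_variables = 12
n_observations = 100

# Per-column missing counts from the Alerts section; every other
# column reports 0 missing values.
missing_per_column = {"tel_no": 33, "homepage_url": 52}

total_cells = n_variables * n_observations
missing_cells = sum(missing_per_column.values())
missing_pct = 100 * missing_cells / total_cells

print(missing_cells, f"{missing_pct:.1f}%")  # 85 7.1%
```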

Variable types

Categorical: 4
Text: 5
Numeric: 3

Alerts

base_ymd has constant value "2020-12-31" (Constant)
city_gn_gu_cd is highly overall correlated with ypos_la and 1 other field (High correlation)
xpos_lo is highly overall correlated with city_do_cd (High correlation)
ypos_la is highly overall correlated with city_gn_gu_cd and 1 other field (High correlation)
city_do_cd is highly overall correlated with city_gn_gu_cd and 2 other fields (High correlation)
city_do_cd is highly imbalanced (68.2%) (Imbalance)
tel_no has 33 (33.0%) missing values (Missing)
homepage_url has 52 (52.0%) missing values (Missing)
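
The 68.2% imbalance reported for city_do_cd can be reproduced from its value counts. The sketch below assumes the score is one minus the Shannon entropy of the class distribution divided by the maximum possible entropy. That definition is an assumption here, not something the report states, although it does yield the reported figure. The function name `imbalance_score` is hypothetical.

```python
import math

# Value counts for city_do_cd, as listed in its Common Values table.
counts = {"11": 89, "41": 7, "26": 3, "44": 1}


def imbalance_score(value_counts):
    """Return 1 - (Shannon entropy / maximum entropy) for a categorical column.

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    total = sum(value_counts.values())
    n_classes = len(value_counts)
    if n_classes < 2:
        return 0.0
    probs = [c / total for c in value_counts.values() if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(n_classes)


score = imbalance_score(counts)
print(f"{score:.1%}")  # 68.2%
```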

Reproduction

Analysis started: 2023-12-10 09:45:11.416409
Analysis finished: 2023-12-10 09:45:15.840019
Duration: 4.42 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

subway_line_no
Categorical

Distinct: 9
Distinct (%): 9.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
5호선 | 23
4호선 | 17
3호선 | 14
1호선 | 14
6호선 | 11
Other values (4) | 21

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 5호선
2nd row: 3호선
3rd row: 9호선
4th row: 5호선
5th row: 9호선

Common Values

Value | Count | Frequency (%)
5호선 | 23 | 23.0%
4호선 | 17 | 17.0%
3호선 | 14 | 14.0%
1호선 | 14 | 14.0%
6호선 | 11 | 11.0%
2호선 | 10 | 10.0%
9호선 | 5 | 5.0%
8호선 | 3 | 3.0%
7호선 | 3 | 3.0%

Length

Histogram of lengths of the category


subway_station_nm
Text

Distinct: 64
Distinct (%): 64.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 9
Median length: 7
Mean length: 3.22
Min length: 2

Characters and Unicode

Total characters: 322
Distinct characters: 102
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 43
Unique (%): 43.0%

Sample

1st row: 여의나루
2nd row: 체육공원
3rd row: 국회의사당
4th row: 여의도
5th row: 가양

Value | Count | Frequency (%)
종로3가 | 6 | 6.0%
시청 | 6 | 6.0%
충무로 | 6 | 6.0%
동대문역사문화공원 | 3 | 3.0%
광화문 | 3 | 3.0%
경복궁 | 3 | 3.0%
대공원 | 2 | 2.0%
이촌 | 2 | 2.0%
신금호 | 2 | 2.0%
이태원 | 2 | 2.0%
Other values (54) | 65 | 65.0%

Most occurring characters

ValueCountFrequency (%)
17
 
5.3%
12
 
3.7%
11
 
3.4%
11
 
3.4%
11
 
3.4%
10
 
3.1%
9
 
2.8%
9
 
2.8%
9
 
2.8%
8
 
2.5%
Other values (92) 215
66.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 313
97.2%
Decimal Number 9
 
2.8%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
17
 
5.4%
12
 
3.8%
11
 
3.5%
11
 
3.5%
11
 
3.5%
10
 
3.2%
9
 
2.9%
9
 
2.9%
9
 
2.9%
8
 
2.6%
Other values (89) 206
65.8%
Decimal Number
ValueCountFrequency (%)
3 6
66.7%
4 2
 
22.2%
5 1
 
11.1%

Most occurring scripts

ValueCountFrequency (%)
Hangul 313
97.2%
Common 9
 
2.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
17
 
5.4%
12
 
3.8%
11
 
3.5%
11
 
3.5%
11
 
3.5%
10
 
3.2%
9
 
2.9%
9
 
2.9%
9
 
2.9%
8
 
2.6%
Other values (89) 206
65.8%
Common
ValueCountFrequency (%)
3 6
66.7%
4 2
 
22.2%
5 1
 
11.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 313
97.2%
ASCII 9
 
2.8%

Most frequent character per block

Hangul
ValueCountFrequency (%)
17
 
5.4%
12
 
3.8%
11
 
3.5%
11
 
3.5%
11
 
3.5%
10
 
3.2%
9
 
2.9%
9
 
2.9%
9
 
2.9%
8
 
2.6%
Other values (89) 206
65.8%
ASCII
ValueCountFrequency (%)
3 6
66.7%
4 2
 
22.2%
5 1
 
11.1%

tourist_nm
Text

Distinct: 82
Distinct (%): 82.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 21
Median length: 12.5
Mean length: 6.41
Min length: 3

Characters and Unicode

Total characters: 641
Distinct characters: 183
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 68
Unique (%): 68.0%

Sample

1st row: 63 시티
2nd row: 강서체육공원
3rd row: KBS
4th row: KBS 홀
5th row: KBS스포츠월드

Value | Count | Frequency (%)
광희문 | 4 | 3.3%
러시아 | 3 | 2.5%
공사관 | 3 | 2.5%
국립중앙박물관 | 3 | 2.5%
(not preserved) | 3 | 2.5%
경찰박물관 | 2 | 1.6%
(not preserved) | 2 | 1.6%
kbs | 2 | 1.6%
교보문고 | 2 | 1.6%
경성일보필동사옥터 | 2 | 1.6%
Other values (86) | 96 | 78.7%

Most occurring characters

ValueCountFrequency (%)
27
 
4.2%
22
 
3.4%
19
 
3.0%
17
 
2.7%
14
 
2.2%
14
 
2.2%
13
 
2.0%
13
 
2.0%
13
 
2.0%
12
 
1.9%
Other values (173) 477
74.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 576
89.9%
Space Separator 22
 
3.4%
Uppercase Letter 15
 
2.3%
Decimal Number 11
 
1.7%
Close Punctuation 6
 
0.9%
Open Punctuation 6
 
0.9%
Other Punctuation 5
 
0.8%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
27
 
4.7%
19
 
3.3%
17
 
3.0%
14
 
2.4%
14
 
2.4%
13
 
2.3%
13
 
2.3%
13
 
2.3%
12
 
2.1%
12
 
2.1%
Other values (156) 422
73.3%
Uppercase Letter
ValueCountFrequency (%)
S 3
20.0%
B 3
20.0%
K 3
20.0%
N 2
13.3%
G 2
13.3%
L 2
13.3%
Decimal Number
ValueCountFrequency (%)
6 3
27.3%
1 3
27.3%
0 2
18.2%
3 1
 
9.1%
9 1
 
9.1%
4 1
 
9.1%
Other Punctuation
ValueCountFrequency (%)
. 3
60.0%
, 2
40.0%
Space Separator
ValueCountFrequency (%)
22
100.0%
Close Punctuation
ValueCountFrequency (%)
) 6
100.0%
Open Punctuation
ValueCountFrequency (%)
( 6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 576
89.9%
Common 50
 
7.8%
Latin 15
 
2.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
27
 
4.7%
19
 
3.3%
17
 
3.0%
14
 
2.4%
14
 
2.4%
13
 
2.3%
13
 
2.3%
13
 
2.3%
12
 
2.1%
12
 
2.1%
Other values (156) 422
73.3%
Common
ValueCountFrequency (%)
22
44.0%
) 6
 
12.0%
( 6
 
12.0%
6 3
 
6.0%
. 3
 
6.0%
1 3
 
6.0%
0 2
 
4.0%
, 2
 
4.0%
3 1
 
2.0%
9 1
 
2.0%
Latin
ValueCountFrequency (%)
S 3
20.0%
B 3
20.0%
K 3
20.0%
N 2
13.3%
G 2
13.3%
L 2
13.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 576
89.9%
ASCII 65
 
10.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
27
 
4.7%
19
 
3.3%
17
 
3.0%
14
 
2.4%
14
 
2.4%
13
 
2.3%
13
 
2.3%
13
 
2.3%
12
 
2.1%
12
 
2.1%
Other values (156) 422
73.3%
ASCII
ValueCountFrequency (%)
22
33.8%
) 6
 
9.2%
( 6
 
9.2%
S 3
 
4.6%
B 3
 
4.6%
K 3
 
4.6%
6 3
 
4.6%
. 3
 
4.6%
1 3
 
4.6%
N 2
 
3.1%
Other values (7) 11
16.9%

load_addr
Text

Distinct: 76
Distinct (%): 76.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 41
Median length: 28
Mean length: 18.4
Min length: 11

Characters and Unicode

Total characters: 1840
Distinct characters: 189
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 58
Unique (%): 58.0%

Sample

1st row: 서울특별시 영등포구 63로 50 한화금융센터_63
2nd row: 부산광역시 강서구 대저1동 2158
3rd row: 서울특별시 영등포구 여의공원로 13
4th row: 서울특별시 영등포구 여의공원로 13 한국방송공사
5th row: 서울특별시 강서구 공항대로 376 KBS88체육관

Value | Count | Frequency (%)
서울특별시 | 90 | 22.7%
중구 | 25 | 6.3%
종로구 | 18 | 4.5%
용산구 | 9 | 2.3%
강서구 | 7 | 1.8%
경기도 | 6 | 1.5%
정동 | 5 | 1.3%
영등포구 | 5 | 1.3%
강남구 | 4 | 1.0%
과천시 | 4 | 1.0%
Other values (169) | 224 | 56.4%

Most occurring characters

ValueCountFrequency (%)
297
 
16.1%
106
 
5.8%
103
 
5.6%
97
 
5.3%
91
 
4.9%
90
 
4.9%
90
 
4.9%
77
 
4.2%
57
 
3.1%
1 46
 
2.5%
Other values (179) 786
42.7%

Most occurring categories

ValueCountFrequency (%)
Other Letter 1290
70.1%
Space Separator 297
 
16.1%
Decimal Number 226
 
12.3%
Dash Punctuation 9
 
0.5%
Uppercase Letter 6
 
0.3%
Lowercase Letter 6
 
0.3%
Open Punctuation 2
 
0.1%
Close Punctuation 2
 
0.1%
Connector Punctuation 1
 
0.1%
Math Symbol 1
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
106
 
8.2%
103
 
8.0%
97
 
7.5%
91
 
7.1%
90
 
7.0%
90
 
7.0%
77
 
6.0%
57
 
4.4%
27
 
2.1%
21
 
1.6%
Other values (153) 531
41.2%
Decimal Number
ValueCountFrequency (%)
1 46
20.4%
2 35
15.5%
3 33
14.6%
6 23
10.2%
8 21
9.3%
7 19
8.4%
4 19
8.4%
5 12
 
5.3%
0 10
 
4.4%
9 8
 
3.5%
Lowercase Letter
ValueCountFrequency (%)
b 1
16.7%
u 1
16.7%
e 1
16.7%
l 1
16.7%
y 1
16.7%
t 1
16.7%
Uppercase Letter
ValueCountFrequency (%)
K 2
33.3%
S 2
33.3%
B 1
16.7%
H 1
16.7%
Space Separator
ValueCountFrequency (%)
297
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 9
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 1290
70.1%
Common 538
29.2%
Latin 12
 
0.7%

Most frequent character per script

Hangul
ValueCountFrequency (%)
106
 
8.2%
103
 
8.0%
97
 
7.5%
91
 
7.1%
90
 
7.0%
90
 
7.0%
77
 
6.0%
57
 
4.4%
27
 
2.1%
21
 
1.6%
Other values (153) 531
41.2%
Common
ValueCountFrequency (%)
297
55.2%
1 46
 
8.6%
2 35
 
6.5%
3 33
 
6.1%
6 23
 
4.3%
8 21
 
3.9%
7 19
 
3.5%
4 19
 
3.5%
5 12
 
2.2%
0 10
 
1.9%
Other values (6) 23
 
4.3%
Latin
ValueCountFrequency (%)
K 2
16.7%
S 2
16.7%
B 1
8.3%
b 1
8.3%
u 1
8.3%
H 1
8.3%
e 1
8.3%
l 1
8.3%
y 1
8.3%
t 1
8.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 1290
70.1%
ASCII 550
29.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
297
54.0%
1 46
 
8.4%
2 35
 
6.4%
3 33
 
6.0%
6 23
 
4.2%
8 21
 
3.8%
7 19
 
3.5%
4 19
 
3.5%
5 12
 
2.2%
0 10
 
1.8%
Other values (16) 35
 
6.4%
Hangul
ValueCountFrequency (%)
106
 
8.2%
103
 
8.0%
97
 
7.5%
91
 
7.1%
90
 
7.0%
90
 
7.0%
77
 
6.0%
57
 
4.4%
27
 
2.1%
21
 
1.6%
Other values (153) 531
41.2%

city_do_cd
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
11 | 89
41 | 7
26 | 3
44 | 1

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 1
Unique (%): 1.0%

Sample

1st row: 11
2nd row: 26
3rd row: 11
4th row: 11
5th row: 11

Common Values

Value | Count | Frequency (%)
11 | 89 | 89.0%
41 | 7 | 7.0%
26 | 3 | 3.0%
44 | 1 | 1.0%

Length

Histogram of lengths of the category


city_gn_gu_cd
Real number (ℝ)

HIGH CORRELATION 

Distinct: 26
Distinct (%): 26.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13872.95
Minimum: 11110
Maximum: 44200
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 11110
5-th percentile: 11110
Q1: 11140
Median: 11170
Q3: 11560
95-th percentile: 41290
Maximum: 44200
Range: 33090
Interquartile range (IQR): 420
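
As a quick consistency check, the range and interquartile range follow directly from the quantiles listed above. This is a minimal sketch using the reported values for city_gn_gu_cd; the dictionary and its keys are illustrative names only.

```python
# Quantiles of city_gn_gu_cd as reported above.
quantiles = {"min": 11110, "q1": 11140, "median": 11170, "q3": 11560, "max": 44200}

# Range is max minus min; IQR is the spread of the middle 50% of values.
value_range = quantiles["max"] - quantiles["min"]
iqr = quantiles["q3"] - quantiles["q1"]

print(value_range, iqr)  # 33090 420
```

The IQR (420) is tiny compared with the range (33090), which fits the strong right skew reported for this variable: most codes start with 11, and a few start with 26, 41, or 44.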

Descriptive statistics

Standard deviation: 8104.6202
Coefficient of variation (CV): 0.58420308
Kurtosis: 7.6437533
Mean: 13872.95
Median Absolute Deviation (MAD): 60
Skewness: 3.0182474
Sum: 1387295
Variance: 65684868
Monotonicity: Not monotonic
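
Several of these descriptive statistics are derived from one another. The sketch below recomputes the coefficient of variation, variance, and sum from the reported mean and standard deviation. Because the inputs are already rounded in the report, the results agree only approximately, and the variable names are illustrative.

```python
# Reported values for city_gn_gu_cd (already rounded in the report).
n = 100
mean = 13872.95
std = 8104.6202

cv = std / mean        # coefficient of variation: spread relative to the mean
variance = std ** 2    # variance is the square of the standard deviation
total = mean * n       # sum of all values

print(round(cv, 4))    # 0.5842
print(round(total))    # 1387295
```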
Histogram with fixed size bins (bins=26)

Common Values

Value | Count | Frequency (%)
11140 | 25 | 25.0%
11110 | 18 | 18.0%
11170 | 9 | 9.0%
11500 | 6 | 6.0%
11560 | 5 | 5.0%
11320 | 5 | 5.0%
11740 | 4 | 4.0%
41290 | 4 | 4.0%
11650 | 3 | 3.0%
11710 | 3 | 3.0%
Other values (16) | 18 | 18.0%

Minimum 10 values

Value | Count | Frequency (%)
11110 | 18 | 18.0%
11140 | 25 | 25.0%
11170 | 9 | 9.0%
11200 | 2 | 2.0%
11230 | 1 | 1.0%
11290 | 1 | 1.0%
11305 | 1 | 1.0%
11320 | 5 | 5.0%
11350 | 1 | 1.0%
11440 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
44200 | 1 | 1.0%
41370 | 1 | 1.0%
41290 | 4 | 4.0%
41250 | 1 | 1.0%
26440 | 1 | 1.0%
26350 | 1 | 1.0%
26260 | 1 | 1.0%
11740 | 4 | 4.0%
11710 | 3 | 3.0%
11650 | 3 | 3.0%

xpos_lo
Real number (ℝ)

HIGH CORRELATION 

Distinct: 74
Distinct (%): 74.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 127.05094
Minimum: 126.80359
Maximum: 129.12247
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 126.80359
5-th percentile: 126.85343
Q1: 126.97367
Median: 126.99074
Q3: 127.00921
95-th percentile: 127.14606
Maximum: 129.12247
Range: 2.318887
Interquartile range (IQR): 0.0355415

Descriptive statistics

Standard deviation: 0.36147086
Coefficient of variation (CV): 0.0028450861
Kurtosis: 28.043138
Mean: 127.05094
Median Absolute Deviation (MAD): 0.017642
Skewness: 5.3304232
Sum: 12705.094
Variance: 0.13066118
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
126.97314 | 5 | 5.0%
127.008424 | 4 | 4.0%
126.994106 | 4 | 4.0%
126.97995 | 3 | 3.0%
127.153556 | 2 | 2.0%
127.009214 | 2 | 2.0%
126.97385 | 2 | 2.0%
126.989789 | 2 | 2.0%
126.969666 | 2 | 2.0%
126.992458 | 2 | 2.0%
Other values (64) | 72 | 72.0%

Minimum 10 values

Value | Count | Frequency (%)
126.803586 | 1 | 1.0%
126.814124 | 1 | 1.0%
126.838397 | 1 | 1.0%
126.8476 | 1 | 1.0%
126.847875 | 1 | 1.0%
126.853727 | 1 | 1.0%
126.870111 | 1 | 1.0%
126.880801 | 1 | 1.0%
126.916423 | 2 | 2.0%
126.916558 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
129.122473 | 1 | 1.0%
129.096162 | 1 | 1.0%
128.971281 | 1 | 1.0%
127.153556 | 2 | 2.0%
127.14567 | 1 | 1.0%
127.130005 | 2 | 2.0%
127.123713 | 1 | 1.0%
127.111698 | 1 | 1.0%
127.07033 | 1 | 1.0%
127.064438 | 1 | 1.0%

ypos_la
Real number (ℝ)

HIGH CORRELATION 

Distinct: 74
Distinct (%): 74.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.468868
Minimum: 35.200909
Maximum: 37.944658
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 35.200909
5-th percentile: 37.417445
Q1: 37.523724
Median: 37.558878
Q3: 37.56902
95-th percentile: 37.589838
Maximum: 37.944658
Range: 2.743749
Interquartile range (IQR): 0.045296

Descriptive statistics

Standard deviation: 0.41288189
Coefficient of variation (CV): 0.011019332
Kurtosis: 25.944675
Mean: 37.468868
Median Absolute Deviation (MAD): 0.0199885
Skewness: -5.1009741
Sum: 3746.8868
Variance: 0.17047145
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
37.566617 | 5 | 5.0%
37.563945 | 4 | 4.0%
37.558878 | 4 | 4.0%
37.523724 | 3 | 3.0%
37.53926 | 2 | 2.0%
37.477749 | 2 | 2.0%
37.499973 | 2 | 2.0%
37.564247 | 2 | 2.0%
37.56902 | 2 | 2.0%
37.561089 | 2 | 2.0%
Other values (64) | 72 | 72.0%

Minimum 10 values

Value | Count | Frequency (%)
35.200909 | 1 | 1.0%
35.210222 | 1 | 1.0%
35.214377 | 1 | 1.0%
36.767899 | 1 | 1.0%
37.159031 | 1 | 1.0%
37.431046 | 1 | 1.0%
37.433683 | 1 | 1.0%
37.433971 | 1 | 1.0%
37.438934 | 1 | 1.0%
37.471431 | 1 | 1.0%

Maximum 10 values

Value | Count | Frequency (%)
37.944658 | 1 | 1.0%
37.672997 | 1 | 1.0%
37.665292 | 1 | 1.0%
37.649042 | 1 | 1.0%
37.593195 | 1 | 1.0%
37.589661 | 1 | 1.0%
37.587123 | 1 | 1.0%
37.582365 | 1 | 1.0%
37.58236 | 1 | 1.0%
37.58094 | 1 | 1.0%

tel_no
Text

MISSING 

Distinct: 52
Distinct (%): 77.6%
Missing: 33
Missing (%): 33.0%
Memory size: 932.0 B

Length

Max length: 12
Median length: 12
Mean length: 11.373134
Min length: 6

Characters and Unicode

Total characters: 762
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 40
Unique (%): 59.7%

Sample

1st row: 1833-7001
2nd row: 02-781-1000
3rd row: 02-781-1000
4th row: 02-2600-8808
5th row: 02-3773-1053

Value | Count | Frequency (%)
02-3369-5882 | 4 | 6.0%
02-2077-9000 | 3 | 4.5%
1544-1900 | 2 | 3.0%
02-3150-3681 | 2 | 3.0%
02-3425-5252 | 2 | 3.0%
02-472-2770 | 2 | 3.0%
02-813-9625 | 2 | 3.0%
02-2261-0517 | 2 | 3.0%
02-3455-9277 | 2 | 3.0%
02-781-1000 | 2 | 3.0%
Other values (42) | 44 | 65.7%
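
The percentages for tel_no only add up if they are taken over the 67 non-missing rows rather than all 100 observations. That reading is an inference from the numbers, not something the report states. The sketch below checks it against the reported distinct, unique, and total-character counts; the variable names are illustrative.

```python
# Reported figures for tel_no.
n_rows = 100
missing = 33
distinct = 52
unique = 40
total_chars = 762

# Assumption: ratios use the non-missing rows as the denominator.
present = n_rows - missing

distinct_pct = 100 * distinct / present
unique_pct = 100 * unique / present
mean_length = total_chars / present

print(present)                  # 67
print(f"{distinct_pct:.1f}%")   # 77.6%
print(f"{unique_pct:.1f}%")     # 59.7%
print(f"{mean_length:.6f}")     # 11.373134
```

The same denominator logic matches homepage_url, where 48 rows are present: 41 / 48 is 85.4% distinct and 1327 / 48 gives the reported mean length of 27.645833.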

Most occurring characters

ValueCountFrequency (%)
0 153
20.1%
- 130
17.1%
2 119
15.6%
3 55
 
7.2%
7 55
 
7.2%
1 54
 
7.1%
5 50
 
6.6%
8 45
 
5.9%
4 38
 
5.0%
9 33
 
4.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 632
82.9%
Dash Punctuation 130
 
17.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 153
24.2%
2 119
18.8%
3 55
 
8.7%
7 55
 
8.7%
1 54
 
8.5%
5 50
 
7.9%
8 45
 
7.1%
4 38
 
6.0%
9 33
 
5.2%
6 30
 
4.7%
Dash Punctuation
ValueCountFrequency (%)
- 130
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 762
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 153
20.1%
- 130
17.1%
2 119
15.6%
3 55
 
7.2%
7 55
 
7.2%
1 54
 
7.1%
5 50
 
6.6%
8 45
 
5.9%
4 38
 
5.0%
9 33
 
4.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 762
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 153
20.1%
- 130
17.1%
2 119
15.6%
3 55
 
7.2%
7 55
 
7.2%
1 54
 
7.1%
5 50
 
6.6%
8 45
 
5.9%
4 38
 
5.0%
9 33
 
4.3%

homepage_url
Text

MISSING 

Distinct: 41
Distinct (%): 85.4%
Missing: 52
Missing (%): 52.0%
Memory size: 932.0 B

Length

Max length: 66
Median length: 33
Mean length: 27.645833
Min length: 16

Characters and Unicode

Total characters: 1327
Distinct characters: 45
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 35
Unique (%): 72.9%

Sample

1st row: http://www.63city.co.kr/
2nd row: http://www.kbs.co.kr/
3rd row: http://office.kbs.co.kr/kbshall/
4th row: http://www.kbssw.co.kr/
5th row: http://www.lgscience.co.kr/

Value | Count | Frequency (%)
http://www.museum.go.kr | 3 | 6.2%
http://hanokmaeul.or.kr | 2 | 4.2%
http://www.nanta.co.kr | 2 | 4.2%
http://parks.seoul.go.kr/gildong | 2 | 4.2%
http://www.nseoultower.com | 2 | 4.2%
http://www.policemuseum.go.kr | 2 | 4.2%
http://www.ntok.go.kr | 1 | 2.1%
http://www.63city.co.kr | 1 | 2.1%
http://www.gugak.go.kr | 1 | 2.1%
http://www.nfm.go.kr | 1 | 2.1%
Other values (31) | 31 | 64.6%

Most occurring characters

ValueCountFrequency (%)
/ 143
 
10.8%
. 134
 
10.1%
t 110
 
8.3%
w 110
 
8.3%
o 87
 
6.6%
r 66
 
5.0%
k 66
 
5.0%
h 53
 
4.0%
p 53
 
4.0%
e 51
 
3.8%
Other values (35) 454
34.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 977
73.6%
Other Punctuation 324
 
24.4%
Decimal Number 13
 
1.0%
Uppercase Letter 9
 
0.7%
Math Symbol 2
 
0.2%
Connector Punctuation 1
 
0.1%
Dash Punctuation 1
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 110
 
11.3%
w 110
 
11.3%
o 87
 
8.9%
r 66
 
6.8%
k 66
 
6.8%
h 53
 
5.4%
p 53
 
5.4%
e 51
 
5.2%
a 50
 
5.1%
g 44
 
4.5%
Other values (15) 287
29.4%
Decimal Number
ValueCountFrequency (%)
1 4
30.8%
0 3
23.1%
5 2
15.4%
6 1
 
7.7%
3 1
 
7.7%
4 1
 
7.7%
9 1
 
7.7%
Other Punctuation
ValueCountFrequency (%)
/ 143
44.1%
. 134
41.4%
: 44
 
13.6%
? 2
 
0.6%
& 1
 
0.3%
Uppercase Letter
ValueCountFrequency (%)
S 3
33.3%
I 3
33.3%
M 1
 
11.1%
T 1
 
11.1%
E 1
 
11.1%
Math Symbol
ValueCountFrequency (%)
= 2
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 986
74.3%
Common 341
 
25.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 110
 
11.2%
w 110
 
11.2%
o 87
 
8.8%
r 66
 
6.7%
k 66
 
6.7%
h 53
 
5.4%
p 53
 
5.4%
e 51
 
5.2%
a 50
 
5.1%
g 44
 
4.5%
Other values (20) 296
30.0%
Common
ValueCountFrequency (%)
/ 143
41.9%
. 134
39.3%
: 44
 
12.9%
1 4
 
1.2%
0 3
 
0.9%
= 2
 
0.6%
? 2
 
0.6%
5 2
 
0.6%
_ 1
 
0.3%
6 1
 
0.3%
Other values (5) 5
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1327
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 143
 
10.8%
. 134
 
10.1%
t 110
 
8.3%
w 110
 
8.3%
o 87
 
6.6%
r 66
 
5.0%
k 66
 
5.0%
h 53
 
4.0%
p 53
 
4.0%
e 51
 
3.8%
Other values (35) 454
34.2%

sales_tm
Categorical

Distinct: 20
Distinct (%): 20.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
<NA> | 59
10:00~18:00 | 6
00:00~24:00 | 5
09:30~17:30 | 4
10:00~23:00 | 3
Other values (15) | 23

Length

Max length: 13
Median length: 4
Mean length: 6.89
Min length: 4

Unique

Unique: 9
Unique (%): 9.0%

Sample

1st row: 10:00~22:00
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 59 | 59.0%
10:00~18:00 | 6 | 6.0%
00:00~24:00 | 5 | 5.0%
09:30~17:30 | 4 | 4.0%
10:00~23:00 | 3 | 3.0%
09:00~18:00 | 3 | 3.0%
10:00~16:00 | 3 | 3.0%
09:30~22:00 | 2 | 2.0%
06:00~18:00 | 2 | 2.0%
09:00~17:30 | 2 | 2.0%
Other values (10) | 11 | 11.0%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
na | 59 | 57.8%
10:00~18:00 | 6 | 5.9%
00:00~24:00 | 5 | 4.9%
09:30~17:30 | 4 | 3.9%
10:00~23:00 | 3 | 2.9%
09:00~18:00 | 3 | 2.9%
10:00~16:00 | 3 | 2.9%
09:30~22:00 | 2 | 2.0%
06:00~18:00 | 2 | 2.0%
09:00~17:30 | 2 | 2.0%
Other values (12) | 13 | 12.7%

base_ymd
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
2020-12-31 | 100

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2020-12-31
2nd row: 2020-12-31
3rd row: 2020-12-31
4th row: 2020-12-31
5th row: 2020-12-31

Common Values

Value | Count | Frequency (%)
2020-12-31 | 100 | 100.0%
100.0%

Length

Histogram of lengths of the category


Interactions

[Figures: 9 interaction plots; images not preserved]

Correlations

Variable | subway_line_no | subway_station_nm | tourist_nm | load_addr | city_do_cd | city_gn_gu_cd | xpos_lo | ypos_la | tel_no | homepage_url | sales_tm
subway_line_no | 1.000 | 0.976 | 0.373 | 0.711 | 0.000 | 0.316 | 0.595 | 0.082 | 0.000 | 0.747 | 0.534
subway_station_nm | 0.976 | 1.000 | 0.997 | 0.997 | 0.999 | 1.000 | 0.995 | 0.988 | 0.979 | 0.983 | 0.896
tourist_nm | 0.373 | 0.997 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.997 | 1.000
load_addr | 0.711 | 0.997 | 1.000 | 1.000 | 0.998 | 1.000 | 1.000 | 1.000 | 0.999 | 1.000 | 0.998
city_do_cd | 0.000 | 0.999 | 1.000 | 0.998 | 1.000 | 0.892 | 0.673 | 0.864 | 1.000 | 1.000 | 0.000
city_gn_gu_cd | 0.316 | 1.000 | 1.000 | 1.000 | 0.892 | 1.000 | 0.941 | 0.788 | 1.000 | 1.000 | 0.000
xpos_lo | 0.595 | 0.995 | 1.000 | 1.000 | 0.673 | 0.941 | 1.000 | 0.766 | 1.000 | 1.000 | 0.692
ypos_la | 0.082 | 0.988 | 1.000 | 1.000 | 0.864 | 0.788 | 0.766 | 1.000 | 1.000 | 1.000 | 0.000
tel_no | 0.000 | 0.979 | 1.000 | 0.999 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
homepage_url | 0.747 | 0.983 | 0.997 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
sales_tm | 0.534 | 0.896 | 1.000 | 0.998 | 0.000 | 0.000 | 0.692 | 0.000 | 1.000 | 1.000 | 1.000

Variable | subway_line_no | city_do_cd | sales_tm
subway_line_no | 1.000 | 0.000 | 0.164
city_do_cd | 0.000 | 1.000 | 0.000
sales_tm | 0.164 | 0.000 | 1.000

Variable | city_gn_gu_cd | xpos_lo | ypos_la | subway_line_no | city_do_cd | sales_tm
city_gn_gu_cd | 1.000 | 0.212 | -0.701 | 0.137 | 0.960 | 0.000
xpos_lo | 0.212 | 1.000 | -0.154 | 0.310 | 0.700 | 0.462
ypos_la | -0.701 | -0.154 | 1.000 | 0.036 | 0.844 | 0.000
subway_line_no | 0.137 | 0.310 | 0.036 | 1.000 | 0.000 | 0.164
city_do_cd | 0.960 | 0.700 | 0.844 | 0.000 | 1.000 | 0.000
sales_tm | 0.000 | 0.462 | 0.000 | 0.164 | 0.000 | 1.000
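
The "High correlation" alerts in the Overview line up with the last (six-variable) correlation matrix above if a pair is flagged whenever the absolute coefficient is at least 0.5. The sketch below applies that rule. The 0.5 threshold and the helper name `highly_correlated` are assumptions made for illustration; the report does not state the cutoff, though this one reproduces the listed alerts.

```python
# Six-variable correlation matrix, copied from the report.
columns = ["city_gn_gu_cd", "xpos_lo", "ypos_la",
           "subway_line_no", "city_do_cd", "sales_tm"]
matrix = [
    [1.000, 0.212, -0.701, 0.137, 0.960, 0.000],
    [0.212, 1.000, -0.154, 0.310, 0.700, 0.462],
    [-0.701, -0.154, 1.000, 0.036, 0.844, 0.000],
    [0.137, 0.310, 0.036, 1.000, 0.000, 0.164],
    [0.960, 0.700, 0.844, 0.000, 1.000, 0.000],
    [0.000, 0.462, 0.000, 0.164, 0.000, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, not stated in the report


def highly_correlated(names, corr, threshold):
    """Map each variable to the other variables whose |correlation| >= threshold."""
    flagged = {}
    for i, name in enumerate(names):
        partners = [names[j] for j, value in enumerate(corr[i])
                    if j != i and abs(value) >= threshold]
        if partners:
            flagged[name] = partners
    return flagged


flagged = highly_correlated(columns, matrix, THRESHOLD)
for name, partners in flagged.items():
    print(f"{name}: {', '.join(partners)}")
```

Under this rule city_do_cd is paired with three variables, city_gn_gu_cd and ypos_la with two each, and xpos_lo with one, matching the counts in the alerts. subway_line_no and sales_tm are not flagged.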

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows

Index | subway_line_no | subway_station_nm | tourist_nm | load_addr | city_do_cd | city_gn_gu_cd | xpos_lo | ypos_la | tel_no | homepage_url | sales_tm | base_ymd
0 | 5호선 | 여의나루 | 63 시티 | 서울특별시 영등포구 63로 50 한화금융센터_63 | 11 | 11560 | 126.939757 | 37.519576 | 1833-7001 | http://www.63city.co.kr/ | 10:00~22:00 | 2020-12-31
1 | 3호선 | 체육공원 | 강서체육공원 | 부산광역시 강서구 대저1동 2158 | 26 | 26440 | 128.971281 | 35.210222 | <NA> | <NA> | <NA> | 2020-12-31
2 | 9호선 | 국회의사당 | KBS | 서울특별시 영등포구 여의공원로 13 | 11 | 11560 | 126.916423 | 37.525591 | 02-781-1000 | http://www.kbs.co.kr/ | <NA> | 2020-12-31
3 | 5호선 | 여의도 | KBS 홀 | 서울특별시 영등포구 여의공원로 13 한국방송공사 | 11 | 11560 | 126.916423 | 37.525591 | 02-781-1000 | http://office.kbs.co.kr/kbshall/ | <NA> | 2020-12-31
4 | 9호선 | 가양 | KBS스포츠월드 | 서울특별시 강서구 공항대로 376 KBS88체육관 | 11 | 11500 | 126.8476 | 37.556809 | 02-2600-8808 | http://www.kbssw.co.kr/ | <NA> | 2020-12-31
5 | 5호선 | 여의나루 | LG 사이언스 홀 | 서울특별시 영등포구 여의대로 128 | 11 | 11560 | 126.928961 | 37.527804 | 02-3773-1053 | http://www.lgscience.co.kr/ | 09:00~17:30 | 2020-12-31
6 | 2호선 | 역삼 | LG아트센터 | 서울특별시 강남구 논현로 508 | 11 | 11320 | 127.037619 | 37.502141 | 02-2005-0114 | http://www.lgart.com/ | <NA> | 2020-12-31
7 | 4호선 | 충렬사 | 충렬사 | 부산광역시 동래구 충렬대로 347 | 26 | 26260 | 129.096162 | 35.200909 | <NA> | <NA> | <NA> | 2020-12-31
8 | 4호선 | 명동 | N서울타워 | 서울특별시 용산구 남산공원길 126 | 11 | 11170 | 126.990751 | 37.550857 | 02-3455-9277 | http://www.nseoultower.com/ | 10:00~23:00 | 2020-12-31
9 | 8호선 | 장지역 | 가든파이브 | 서울특별시 송파구 충민로 66 가든파이브라이프 | 11 | 11710 | 127.123713 | 37.477635 | 02-2157-0100 | http://www.garden5.com/ | 10:30~21:00 | 2020-12-31

Last rows

Index | subway_line_no | subway_station_nm | tourist_nm | load_addr | city_do_cd | city_gn_gu_cd | xpos_lo | ypos_la | tel_no | homepage_url | sales_tm | base_ymd
90 | 5호선 | 종로3가 | 낙원떡, 악기상가 | 서울특별시 종로구 낙원동 | 11 | 11110 | 126.988624 | 37.572581 | <NA> | <NA> | <NA> | 2020-12-31
91 | 1호선 | 시청 | 난타전용극장 | 서울특별시 중구 명동길 26 유네스코회관 | 11 | 11140 | 126.983767 | 37.563465 | 02-739-8288 | http://www.nanta.co.kr/ | <NA> | 2020-12-31
92 | 2호선 | 시청 | 난타전용극장 | 서울특별시 중구 명동길 26 유네스코회관 | 11 | 11140 | 126.983767 | 37.563465 | 02-739-8288 | http://www.nanta.co.kr/ | <NA> | 2020-12-31
93 | 4호선 | 회현 | 남대문 | 서울특별시 중구 세종대로 40 | 11 | 11140 | 126.975326 | 37.559923 | 042-481-4650 | <NA> | <NA> | 2020-12-31
94 | 4호선 | 회현 | 남대문시장 | 서울특별시 중구 남대문시장4길 21 | 11 | 11140 | 126.977724 | 37.559223 | 02-753-2805 | http://namdaemunmarket.co.kr/ | 00:00~23:00 | 2020-12-31
95 | 6호선 | 한강진역 | 남산(N서울타워) | 서울특별시 용산구 남산공원길 126 | 11 | 11170 | 126.990751 | 37.550857 | 02-3455-9277 | http://www.nseoultower.com/ | 10:00~23:00 | 2020-12-31
96 | 4호선 | 충무로 | 남산골 한옥마을 | 서울특별시 중구 퇴계로34길 28 남산골한옥마을 | 11 | 11140 | 126.994106 | 37.558878 | 02-2261-0517 | http://hanokmaeul.or.kr/ | 09:00~21:00 | 2020-12-31
97 | 3호선 | 충무로 | 남산골한옥마을 | 서울특별시 중구 퇴계로34길 28 남산골한옥마을 | 11 | 11140 | 126.994106 | 37.558878 | 02-2261-0517 | http://hanokmaeul.or.kr/ | 09:00~21:00 | 2020-12-31
98 | 6호선 | 한강진역 | 남산야외식물원 | 서울특별시 용산구 소월로 323 | 11 | 11170 | 126.997021 | 37.541539 | 02-798-3771 | <NA> | 00:00~24:00 | 2020-12-31
99 | 4호선 | 명동 | 남산케이블카 | 서울특별시 중구 소파로 83 | 11 | 11140 | 126.983995 | 37.55661 | 02-753-2403 | http://www.cablecar.co.kr/ | 10:00~23:00 | 2020-12-31