Overview

Dataset statistics

Number of variables: 7
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.8 KiB
Average record size in memory: 59.3 B

Variable types

Numeric: 2
Categorical: 2
Text: 3

Alerts

sccnt_ym has constant value "2021-11" (Constant)
origin_ty has constant value "관광명소" (Constant)
seq has unique values (Unique)
origin_sn_id has unique values (Unique)
kwrd_nm has unique values (Unique)
srchwrd_nm has unique values (Unique)
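
The Constant and Unique alerts can be illustrated with a simple distinct-count check. The sketch below is not ydata-profiling's implementation. It assumes a column is Constant when it holds exactly one distinct value and Unique when every value is distinct, and it uses only the first ten rows listed in the Sample section of this report. The function name `classify_columns` is hypothetical.

```python
# First ten rows, copied from the Sample section of this report.
COLUMNS = ["seq", "sccnt_ym", "origin_sn_id", "kwrd_nm", "srchwrd_nm", "sccnt", "origin_ty"]
ROWS = [
    (283980, "2021-11", "SAN_0080", "구미산", "구미산", 910, "관광명소"),
    (282873, "2021-11", "KC498PP19N007924", "5.18자유공원", "5.18자유공원", 440, "관광명소"),
    (282862, "2021-11", "KC495PP19N014040", "12폭포", "12폭포", 47, "관광명소"),
    (284050, "2021-11", "SAN_0086", "구절산", "구절산", 850, "관광명소"),
    (282952, "2021-11", "KC495PP19N026268", "가사해수욕장", "가사해수욕장", 26, "관광명소"),
    (282954, "2021-11", "SAN_0007", "가산", "가산", 9440, "관광명소"),
    (282956, "2021-11", "KC498PP19N007314", "가산공원", "가산공원", 180, "관광명소"),
    (282875, "2021-11", "KC498PP19N005217", "518기념공원", "518기념공원", 930, "관광명소"),
    (282960, "2021-11", "KC498PP19N006181", "가산수변공원", "가산수변공원", 270, "관광명소"),
    (282962, "2021-11", "SAN_0008", "가섭산", "가섭산", 180, "관광명소"),
]


def classify_columns(columns, rows):
    """Return {column: "Constant" | "Unique"} for columns that trigger an alert."""
    alerts = {}
    for idx, name in enumerate(columns):
        distinct = {row[idx] for row in rows}
        if len(distinct) == 1:
            alerts[name] = "Constant"   # assumed rule: one distinct value
        elif len(distinct) == len(rows):
            alerts[name] = "Unique"     # assumed rule: no repeated value
    return alerts


alerts = classify_columns(COLUMNS, ROWS)
for name, kind in alerts.items():
    print(f"{name}: {kind}")
```

On these ten rows sccnt triggers neither alert because the value 180 occurs twice, which is consistent with the report listing only 66 distinct sccnt values.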

Reproduction

Analysis started: 2023-12-10 09:56:07.922256
Analysis finished: 2023-12-10 09:56:09.965138
Duration: 2.04 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
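
The reported duration follows directly from the two timestamps. A minimal check using only the Python standard library, assuming both timestamps share the same time zone:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2023-12-10 09:56:07.922256", FMT)
finished = datetime.strptime("2023-12-10 09:56:09.965138", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```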

Variables

seq
Real number (ℝ)

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 283148.32
Minimum: 282862
Maximum: 285081
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 282862
5-th percentile: 282953.9
Q1: 282996
Median: 283122
Q3: 283243.5
95-th percentile: 283285.1
Maximum: 285081
Range: 2219
Interquartile range (IQR): 247.5

Descriptive statistics

Standard deviation: 262.13386
Coefficient of variation (CV): 0.00092578285
Kurtosis: 31.33068
Mean: 283148.32
Median Absolute Deviation (MAD): 124
Skewness: 4.773381
Sum: 28314832
Variance: 68714.159
Monotonicity: Not monotonic
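
Several of the seq statistics are derived from others, so they can be cross-checked against each other. The sketch below recomputes them from the values reported here, using the standard definitions (mean = sum / n, range = max - min, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared). The variable names are illustrative only.

```python
# Values copied from the seq statistics in this report.
n = 100
total = 28314832
minimum, maximum = 282862, 285081
q1, q3 = 282996, 283243.5
std = 262.13386

mean = total / n                  # 283148.32
value_range = maximum - minimum   # 2219
iqr = q3 - q1                     # 247.5
cv = std / mean                   # about 0.00092578
variance = std ** 2               # about 68714.16

print(mean, value_range, iqr, cv, variance)
```

Small differences in the last digits of the variance are expected because the standard deviation is itself rounded in the report.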
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
283980 | 1 | 1.0%
283147 | 1 | 1.0%
283237 | 1 | 1.0%
283235 | 1 | 1.0%
283232 | 1 | 1.0%
283230 | 1 | 1.0%
283228 | 1 | 1.0%
283157 | 1 | 1.0%
283155 | 1 | 1.0%
283153 | 1 | 1.0%
Other values (90) | 90 | 90.0%
Minimum 10 values
Value | Count | Frequency (%)
282862 | 1 | 1.0%
282873 | 1 | 1.0%
282875 | 1 | 1.0%
282877 | 1 | 1.0%
282952 | 1 | 1.0%
282954 | 1 | 1.0%
282956 | 1 | 1.0%
282960 | 1 | 1.0%
282962 | 1 | 1.0%
282963 | 1 | 1.0%
Maximum 10 values
Value | Count | Frequency (%)
285081 | 1 | 1.0%
284050 | 1 | 1.0%
283980 | 1 | 1.0%
283289 | 1 | 1.0%
283287 | 1 | 1.0%
283285 | 1 | 1.0%
283283 | 1 | 1.0%
283281 | 1 | 1.0%
283279 | 1 | 1.0%
283277 | 1 | 1.0%

sccnt_ym
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
2021-11 | 100

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021-11
2nd row: 2021-11
3rd row: 2021-11
4th row: 2021-11
5th row: 2021-11

Common Values

Value | Count | Frequency (%)
2021-11 | 100 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
2021-11 | 100 | 100.0%

origin_sn_id
Text

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 16
Median length: 16
Mean length: 14.74
Min length: 8

Characters and Unicode

Total characters: 1474
Distinct characters: 22
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
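
As a rough illustration of the category property, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It covers only the five sample values listed for origin_sn_id, not the full column, and the mapping from short codes (`Nd`, `Lu`, `Pc`) to the long names used in this report is written out by hand. The helper name `category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter

# Short General_Category codes mapped to the long names used in this report.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Lu": "Uppercase Letter",
    "Pc": "Connector Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# The five sample values listed for origin_sn_id.
samples = [
    "SAN_0080",
    "KC498PP19N007924",
    "KC495PP19N014040",
    "SAN_0086",
    "KC495PP19N026268",
]
counts = category_counts(samples)
for name, count in counts.most_common():
    print(name, count)
```

Even on five values the ordering matches the full-column result: digits dominate, followed by uppercase letters, with the underscore as the only punctuation.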

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: SAN_0080
2nd row: KC498PP19N007924
3rd row: KC495PP19N014040
4th row: SAN_0086
5th row: KC495PP19N026268

Value | Count | Frequency (%)
san_0080 | 1 | 1.0%
kc495pp19n032283 | 1 | 1.0%
kc498pp19n006280 | 1 | 1.0%
kc498pp19n001768 | 1 | 1.0%
kc498pp19n000936 | 1 | 1.0%
kc498pp19n004635 | 1 | 1.0%
kc495pp19n026407 | 1 | 1.0%
kc498pp19n006689 | 1 | 1.0%
culture_001211 | 1 | 1.0%
kc498pp19n003369 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Most occurring characters

Value | Count | Frequency (%)
0 | 225 | 15.3%
9 | 190 | 12.9%
P | 164 | 11.1%
1 | 127 | 8.6%
4 | 116 | 7.9%
N | 97 | 6.6%
8 | 92 | 6.2%
C | 85 | 5.8%
K | 82 | 5.6%
7 | 58 | 3.9%
Other values (12) | 238 | 16.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 980 | 66.5%
Uppercase Letter | 476 | 32.3%
Connector Punctuation | 18 | 1.2%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
P | 164 | 34.5%
N | 97 | 20.4%
C | 85 | 17.9%
K | 82 | 17.2%
S | 15 | 3.2%
A | 15 | 3.2%
U | 6 | 1.3%
L | 3 | 0.6%
T | 3 | 0.6%
R | 3 | 0.6%
Decimal Number
Value | Count | Frequency (%)
0 | 225 | 23.0%
9 | 190 | 19.4%
1 | 127 | 13.0%
4 | 116 | 11.8%
8 | 92 | 9.4%
7 | 58 | 5.9%
5 | 57 | 5.8%
3 | 39 | 4.0%
2 | 38 | 3.9%
6 | 38 | 3.9%
Connector Punctuation
Value | Count | Frequency (%)
_ | 18 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 998 | 67.7%
Latin | 476 | 32.3%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 225 | 22.5%
9 | 190 | 19.0%
1 | 127 | 12.7%
4 | 116 | 11.6%
8 | 92 | 9.2%
7 | 58 | 5.8%
5 | 57 | 5.7%
3 | 39 | 3.9%
2 | 38 | 3.8%
6 | 38 | 3.8%
Latin
Value | Count | Frequency (%)
P | 164 | 34.5%
N | 97 | 20.4%
C | 85 | 17.9%
K | 82 | 17.2%
S | 15 | 3.2%
A | 15 | 3.2%
U | 6 | 1.3%
L | 3 | 0.6%
T | 3 | 0.6%
R | 3 | 0.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1474 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 225 | 15.3%
9 | 190 | 12.9%
P | 164 | 11.1%
1 | 127 | 8.6%
4 | 116 | 7.9%
N | 97 | 6.6%
8 | 92 | 6.2%
C | 85 | 5.8%
K | 82 | 5.6%
7 | 58 | 3.9%
Other values (12) | 238 | 16.1%

kwrd_nm
Text

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 10
Median length: 8
Mean length: 5.2
Min length: 2

Characters and Unicode

Total characters: 520
Distinct characters: 139
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
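
For a mixed Hangul and ASCII value, the category and block of each character can be inspected as sketched below. Python's standard `unicodedata` module exposes the general category but not the script or block properties, so the block here is approximated by a code point range check against the Hangul Syllables block (U+AC00 to U+D7AF). The labels and the helper name `describe` are illustrative and may not match the report's own naming exactly.

```python
import unicodedata

HANGUL_SYLLABLES = range(0xAC00, 0xD7B0)  # Unicode block U+AC00..U+D7AF


def describe(text):
    """Return (character, general category, block label) for each character."""
    rows = []
    # Normalize to NFC so that syllables are single precomposed code points.
    for ch in unicodedata.normalize("NFC", text):
        cp = ord(ch)
        if cp in HANGUL_SYLLABLES:
            block = "Hangul Syllables"
        elif cp < 0x80:
            block = "ASCII"
        else:
            block = "other"
        rows.append((ch, unicodedata.category(ch), block))
    return rows


rows = describe("5.18자유공원")  # 2nd sample row of kwrd_nm
for ch, category, block in rows:
    print(ch, category, block)
```

The digits come out as `Nd` (Decimal Number), the full stop as `Po` (Other Punctuation), and the Hangul syllables as `Lo` (Other Letter), which are the three categories reported for this variable.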

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: 구미산
2nd row: 5.18자유공원
3rd row: 12폭포
4th row: 구절산
5th row: 가사해수욕장

Value | Count | Frequency (%)
구미산 | 1 | 1.0%
강당계곡 | 1 | 1.0%
개롱공원 | 1 | 1.0%
개나리어린이공원 | 1 | 1.0%
개나리공원 | 1 | 1.0%
개금테마공원 | 1 | 1.0%
강문해수욕장 | 1 | 1.0%
강릉통일공원 | 1 | 1.0%
강릉임영관삼문 | 1 | 1.0%
강릉남대천체육공원 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Most occurring characters

ValueCountFrequency (%)
65
 
12.5%
61
 
11.7%
31
 
6.0%
25
 
4.8%
17
 
3.3%
13
 
2.5%
13
 
2.5%
10
 
1.9%
10
 
1.9%
10
 
1.9%
Other values (129) 265
51.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 509 | 97.9%
Decimal Number | 10 | 1.9%
Other Punctuation | 1 | 0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
65
 
12.8%
61
 
12.0%
31
 
6.1%
25
 
4.9%
17
 
3.3%
13
 
2.6%
13
 
2.6%
10
 
2.0%
10
 
2.0%
10
 
2.0%
Other values (123) 254
49.9%
Decimal Number
Value | Count | Frequency (%)
5 | 3 | 30.0%
1 | 3 | 30.0%
8 | 2 | 20.0%
2 | 1 | 10.0%
7 | 1 | 10.0%
Other Punctuation
Value | Count | Frequency (%)
. | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 509 | 97.9%
Common | 11 | 2.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
65
 
12.8%
61
 
12.0%
31
 
6.1%
25
 
4.9%
17
 
3.3%
13
 
2.6%
13
 
2.6%
10
 
2.0%
10
 
2.0%
10
 
2.0%
Other values (123) 254
49.9%
Common
Value | Count | Frequency (%)
5 | 3 | 27.3%
1 | 3 | 27.3%
8 | 2 | 18.2%
. | 1 | 9.1%
2 | 1 | 9.1%
7 | 1 | 9.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 509 | 97.9%
ASCII | 11 | 2.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
65
 
12.8%
61
 
12.0%
31
 
6.1%
25
 
4.9%
17
 
3.3%
13
 
2.6%
13
 
2.6%
10
 
2.0%
10
 
2.0%
10
 
2.0%
Other values (123) 254
49.9%
ASCII
Value | Count | Frequency (%)
5 | 3 | 27.3%
1 | 3 | 27.3%
8 | 2 | 18.2%
. | 1 | 9.1%
2 | 1 | 9.1%
7 | 1 | 9.1%

srchwrd_nm
Text

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 10
Median length: 8
Mean length: 5.2
Min length: 2

Characters and Unicode

Total characters: 520
Distinct characters: 139
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 100
Unique (%): 100.0%

Sample

1st row: 구미산
2nd row: 5.18자유공원
3rd row: 12폭포
4th row: 구절산
5th row: 가사해수욕장

Value | Count | Frequency (%)
구미산 | 1 | 1.0%
강당계곡 | 1 | 1.0%
개롱공원 | 1 | 1.0%
개나리어린이공원 | 1 | 1.0%
개나리공원 | 1 | 1.0%
개금테마공원 | 1 | 1.0%
강문해수욕장 | 1 | 1.0%
강릉통일공원 | 1 | 1.0%
강릉임영관삼문 | 1 | 1.0%
강릉남대천체육공원 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Most occurring characters

ValueCountFrequency (%)
65
 
12.5%
61
 
11.7%
31
 
6.0%
25
 
4.8%
17
 
3.3%
13
 
2.5%
13
 
2.5%
10
 
1.9%
10
 
1.9%
10
 
1.9%
Other values (129) 265
51.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 509 | 97.9%
Decimal Number | 10 | 1.9%
Other Punctuation | 1 | 0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
65
 
12.8%
61
 
12.0%
31
 
6.1%
25
 
4.9%
17
 
3.3%
13
 
2.6%
13
 
2.6%
10
 
2.0%
10
 
2.0%
10
 
2.0%
Other values (123) 254
49.9%
Decimal Number
Value | Count | Frequency (%)
5 | 3 | 30.0%
1 | 3 | 30.0%
8 | 2 | 20.0%
2 | 1 | 10.0%
7 | 1 | 10.0%
Other Punctuation
Value | Count | Frequency (%)
. | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 509 | 97.9%
Common | 11 | 2.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
65
 
12.8%
61
 
12.0%
31
 
6.1%
25
 
4.9%
17
 
3.3%
13
 
2.6%
13
 
2.6%
10
 
2.0%
10
 
2.0%
10
 
2.0%
Other values (123) 254
49.9%
Common
Value | Count | Frequency (%)
5 | 3 | 27.3%
1 | 3 | 27.3%
8 | 2 | 18.2%
. | 1 | 9.1%
2 | 1 | 9.1%
7 | 1 | 9.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 509 | 97.9%
ASCII | 11 | 2.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
65
 
12.8%
61
 
12.0%
31
 
6.1%
25
 
4.9%
17
 
3.3%
13
 
2.6%
13
 
2.6%
10
 
2.0%
10
 
2.0%
10
 
2.0%
Other values (123) 254
49.9%
ASCII
Value | Count | Frequency (%)
5 | 3 | 27.3%
1 | 3 | 27.3%
8 | 2 | 18.2%
. | 1 | 9.1%
2 | 1 | 9.1%
7 | 1 | 9.1%

sccnt
Real number (ℝ)

Distinct: 66
Distinct (%): 66.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16190.81
Minimum: 10
Maximum: 1493600
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 10
5-th percentile: 11
Q1: 40
Median: 90
Q3: 462.5
95-th percentile: 9558
Maximum: 1493600
Range: 1493590
Interquartile range (IQR): 422.5

Descriptive statistics

Standard deviation: 149291.11
Coefficient of variation (CV): 9.2207316
Kurtosis: 99.840289
Mean: 16190.81
Median Absolute Deviation (MAD): 72
Skewness: 9.988245
Sum: 1619081
Variance: 2.2287837 × 10^10
Monotonicity: Not monotonic
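
The gap between the mean (16190.81) and the median (90), together with the high skewness and kurtosis, points to a single extreme value dominating sccnt. The sketch below quantifies that using only figures reported here: it recomputes the mean after excluding the maximum and the share of the total contributed by that one row. The variable names are illustrative.

```python
# Values copied from the sccnt statistics in this report.
n = 100
total = 1619081
maximum = 1493600
median = 90

mean = total / n                                # 16190.81
mean_without_max = (total - maximum) / (n - 1)  # about 1267.48
share_of_max = maximum / total                  # about 0.92

print(f"mean: {mean:.2f}")
print(f"mean without the maximum: {mean_without_max:.2f}")
print(f"share of total from the maximum: {share_of_max:.1%}")
```

With the largest observation removed, the mean drops by more than an order of magnitude, so the median and the quantiles describe the typical row better than the mean does.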
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
60 | 5 | 5.0%
50 | 4 | 4.0%
90 | 4 | 4.0%
70 | 3 | 3.0%
11 | 3 | 3.0%
80 | 3 | 3.0%
10 | 3 | 3.0%
180 | 3 | 3.0%
40 | 3 | 3.0%
200 | 2 | 2.0%
Other values (56) | 67 | 67.0%
Minimum 10 values
Value | Count | Frequency (%)
10 | 3 | 3.0%
11 | 3 | 3.0%
13 | 1 | 1.0%
14 | 2 | 2.0%
16 | 2 | 2.0%
20 | 2 | 2.0%
25 | 2 | 2.0%
26 | 1 | 1.0%
27 | 1 | 1.0%
28 | 1 | 1.0%
Maximum 10 values
Value | Count | Frequency (%)
1493600 | 1 | 1.0%
27590 | 1 | 1.0%
24150 | 1 | 1.0%
12860 | 1 | 1.0%
11800 | 1 | 1.0%
9440 | 1 | 1.0%
9330 | 1 | 1.0%
4130 | 1 | 1.0%
2320 | 1 | 1.0%
1790 | 1 | 1.0%

origin_ty
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value | Count
관광명소 | 100

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 관광명소
2nd row: 관광명소
3rd row: 관광명소
4th row: 관광명소
5th row: 관광명소

Common Values

Value | Count | Frequency (%)
관광명소 | 100 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
관광명소 | 100 | 100.0%

Interactions


Correlations

Variable | seq | origin_sn_id | kwrd_nm | srchwrd_nm | sccnt
seq | 1.000 | 1.000 | 1.000 | 1.000 | 0.000
origin_sn_id | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
kwrd_nm | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
srchwrd_nm | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
sccnt | 0.000 | 1.000 | 1.000 | 1.000 | 1.000

Variable | seq | sccnt
seq | 1.000 | -0.016
sccnt | -0.016 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
(index) | seq | sccnt_ym | origin_sn_id | kwrd_nm | srchwrd_nm | sccnt | origin_ty
0 | 283980 | 2021-11 | SAN_0080 | 구미산 | 구미산 | 910 | 관광명소
1 | 282873 | 2021-11 | KC498PP19N007924 | 5.18자유공원 | 5.18자유공원 | 440 | 관광명소
2 | 282862 | 2021-11 | KC495PP19N014040 | 12폭포 | 12폭포 | 47 | 관광명소
3 | 284050 | 2021-11 | SAN_0086 | 구절산 | 구절산 | 850 | 관광명소
4 | 282952 | 2021-11 | KC495PP19N026268 | 가사해수욕장 | 가사해수욕장 | 26 | 관광명소
5 | 282954 | 2021-11 | SAN_0007 | 가산 | 가산 | 9440 | 관광명소
6 | 282956 | 2021-11 | KC498PP19N007314 | 가산공원 | 가산공원 | 180 | 관광명소
7 | 282875 | 2021-11 | KC498PP19N005217 | 518기념공원 | 518기념공원 | 930 | 관광명소
8 | 282960 | 2021-11 | KC498PP19N006181 | 가산수변공원 | 가산수변공원 | 270 | 관광명소
9 | 282962 | 2021-11 | SAN_0008 | 가섭산 | 가섭산 | 180 | 관광명소

Last rows
(index) | seq | sccnt_ym | origin_sn_id | kwrd_nm | srchwrd_nm | sccnt | origin_ty
90 | 283271 | 2021-11 | KC498PP19N003353 | 거금생태숲 | 거금생태숲 | 140 | 관광명소
91 | 283273 | 2021-11 | SAN_0030 | 거류산 | 거류산 | 1210 | 관광명소
92 | 283275 | 2021-11 | KC498PP19N000759 | 거류체육공원 | 거류체육공원 | 38 | 관광명소
93 | 283277 | 2021-11 | KC495PP19N014472 | 거림계곡 | 거림계곡 | 200 | 관광명소
94 | 283279 | 2021-11 | KC498PP19N004084 | 거마공원 | 거마공원 | 70 | 관광명소
95 | 283281 | 2021-11 | SAN_0031 | 거망산 | 거망산 | 90 | 관광명소
96 | 283283 | 2021-11 | KC495PP19N026269 | 거문도해수욕장 | 거문도해수욕장 | 14 | 관광명소
97 | 283285 | 2021-11 | SAN_0032 | 거문산 | 거문산 | 30 | 관광명소
98 | 283287 | 2021-11 | KC498PP19N007441 | 거북공원 | 거북공원 | 460 | 관광명소
99 | 283289 | 2021-11 | KC498PP19N007449 | 거북선공원 | 거북선공원 | 150 | 관광명소