Overview

Dataset statistics

Number of variables: 7
Number of observations: 30
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.9 KiB
Average record size in memory: 65.4 B

Variable types

Numeric: 5
Categorical: 1
Text: 1

Dataset

Description: Sample data
Author: 더아이엠씨
URL: https://bigdata-region.kr/#/dataset/a35d29be-f7f8-4ebb-87ec-0265ea6983d1

Alerts

수집년월 (collection year-month) has constant value "2010-01" [Constant]
분석인덱스 (analysis index) is highly overall correlated with 단어빈도 (word frequency) and 2 other fields [High correlation]
단어빈도 is highly overall correlated with 분석인덱스 and 2 other fields [High correlation]
연결정도중심성 (degree centrality) is highly overall correlated with 분석인덱스 and 2 other fields [High correlation]
매개중심성 (betweenness centrality) is highly overall correlated with 분석인덱스 and 2 other fields [High correlation]
분석인덱스 has unique values [Unique]
키워드명 (keyword name) has unique values [Unique]
매개중심성 has 7 (23.3%) zeros [Zeros]
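
The percentages quoted in the alerts follow directly from the counts and the 30 observations. A minimal sketch of that arithmetic; the helper name `share` is hypothetical and not part of the profiling tool:

```python
# Counts taken from the report: 30 observations, 7 zeros in 매개중심성,
# 1 distinct value in 수집년월, 30 distinct values in 분석인덱스 and 키워드명.
N_OBSERVATIONS = 30


def share(count: int, total: int = N_OBSERVATIONS) -> str:
    """Format count/total as a percentage with one decimal place."""
    return f"{100 * count / total:.1f}%"


zeros_share = share(7)      # Zeros alert for 매개중심성
constant_share = share(1)   # Distinct (%) of the constant column 수집년월
unique_share = share(30)    # Unique alert: every value is distinct
print(zeros_share, constant_share, unique_share)
```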

Reproduction

Analysis started: 2023-12-10 13:44:52.043655
Analysis finished: 2023-12-10 13:44:56.639664
Duration: 4.6 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
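
The reported duration can be checked from the two timestamps. A small sketch using only the standard library; the variable names are illustrative:

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2023-12-10 13:44:52.043655", FMT)
finished = datetime.strptime("2023-12-10 13:44:56.639664", FMT)

# Elapsed time in seconds, rounded to one decimal as in the report.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.1f} seconds")
```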

Variables

분석인덱스
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 30
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.5
Minimum: 1
Maximum: 30
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 1
5-th percentile: 2.45
Q1: 8.25
Median: 15.5
Q3: 22.75
95-th percentile: 28.55
Maximum: 30
Range: 29
Interquartile range (IQR): 14.5
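
Because 분석인덱스 takes each integer from 1 to 30 exactly once, the quantile statistics can be recomputed by hand. The sketch below assumes quantiles are taken by linear interpolation between neighbouring order statistics, which reproduces the values listed here; the function name `quantile` is illustrative.

```python
import math


def quantile(sorted_values, q):
    """Quantile by linear interpolation between the two nearest order statistics."""
    pos = (len(sorted_values) - 1) * q
    lower = math.floor(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


values = list(range(1, 31))  # the 30 values of 분석인덱스, already sorted
p05, q1, med, q3, p95 = (quantile(values, q) for q in (0.05, 0.25, 0.5, 0.75, 0.95))
iqr = q3 - q1
print(p05, q1, med, q3, p95, iqr)
```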

Descriptive statistics

Standard deviation: 8.8034084
Coefficient of variation (CV): 0.56796183
Kurtosis: -1.2
Mean: 15.5
Median Absolute Deviation (MAD): 7.5
Skewness: 0
Sum: 465
Variance: 77.5
Monotonicity: Strictly increasing
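
The descriptive statistics can be reproduced the same way. This sketch assumes the sample (n - 1) standard deviation and the bias-corrected skewness and excess-kurtosis estimators; those choices are an assumption about how the report computes them, although they do match the figures above for this column.

```python
import math
import statistics

values = list(range(1, 31))  # the 30 values of 분석인덱스
n = len(values)

mean = statistics.fmean(values)
std = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
cv = std / mean                 # coefficient of variation
med = statistics.median(values)
mad = statistics.median(abs(v - med) for v in values)  # median absolute deviation

# Central moments (dividing by n), then bias-corrected skewness and kurtosis.
m2 = sum((v - mean) ** 2 for v in values) / n
m3 = sum((v - mean) ** 3 for v in values) / n
m4 = sum((v - mean) ** 4 for v in values) / n
g1 = m3 / m2 ** 1.5
g2 = m4 / m2 ** 2 - 3
skewness = math.sqrt(n * (n - 1)) / (n - 2) * g1
kurtosis = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)

print(std, cv, mad, skewness, kurtosis)
```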
Histogram with fixed size bins (bins=30)
Common values

Value              Count  Frequency (%)
1                  1      3.3%
17                 1      3.3%
30                 1      3.3%
29                 1      3.3%
28                 1      3.3%
27                 1      3.3%
26                 1      3.3%
25                 1      3.3%
24                 1      3.3%
23                 1      3.3%
Other values (20)  20     66.7%

Minimum 10 values

Value  Count  Frequency (%)
1      1      3.3%
2      1      3.3%
3      1      3.3%
4      1      3.3%
5      1      3.3%
6      1      3.3%
7      1      3.3%
8      1      3.3%
9      1      3.3%
10     1      3.3%

Maximum 10 values

Value  Count  Frequency (%)
30     1      3.3%
29     1      3.3%
28     1      3.3%
27     1      3.3%
26     1      3.3%
25     1      3.3%
24     1      3.3%
23     1      3.3%
22     1      3.3%
21     1      3.3%

수집년월
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Value    Count
2010-01  30

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2010-01
2nd row: 2010-01
3rd row: 2010-01
4th row: 2010-01
5th row: 2010-01

Common Values

Value    Count  Frequency (%)
2010-01  30     100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value    Count  Frequency (%)
2010-01  30     100.0%

키워드명
Text

UNIQUE 

Distinct: 30
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 7
Median length: 2
Mean length: 2.4
Min length: 2

Characters and Unicode

Total characters: 72
Distinct characters: 55
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
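
As an illustration of those character properties, the sketch below inspects the 20 keywords that appear in the report's Sample section (the other 10 are not listed, so the totals here cover only part of the column). Python's `unicodedata` module exposes the general category directly but not the script, so the first word of each character name is used as a rough stand-in for the script; that shortcut is an assumption of this sketch, not the report's method.

```python
import unicodedata

# The 20 keywords visible in the report's Sample section (first and last rows).
keywords = [
    "경기도", "데이트", "연인", "데이트코스", "펜션",
    "맛집", "서울", "장소", "헤이리예술마을", "가평",
    "영화", "일산", "결혼", "공원", "양평",
    "겨울", "고양", "분위기", "남양주", "코스",
]
# Normalise to precomposed syllables so each syllable counts as one character.
keywords = [unicodedata.normalize("NFC", k) for k in keywords]

chars = [ch for k in keywords for ch in k]
lengths = [len(k) for k in keywords]

# General category of every character ("Lo" means Other Letter).
categories = {unicodedata.category(ch) for ch in chars}
# Rough script proxy: the first word of the character name, e.g. "HANGUL".
script_proxies = {unicodedata.name(ch).split()[0] for ch in chars}

print(len(chars), min(lengths), max(lengths), categories, script_proxies)
```

On these keywords every character falls in the Other Letter category and carries a Hangul name, in line with the single category, script, and block reported for the full column.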

Unique

Unique: 30
Unique (%): 100.0%

Sample

1st row: 경기도
2nd row: 데이트
3rd row: 연인
4th row: 데이트코스
5th row: 펜션

Common values

Value              Count  Frequency (%)
경기도             1      3.3%
데이트             1      3.3%
남양주             1      3.3%
분위기             1      3.3%
고양               1      3.3%
겨울               1      3.3%
양평               1      3.3%
공원               1      3.3%
결혼               1      3.3%
일산               1      3.3%
Other values (20)  20     66.7%

Most occurring characters

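Value              Count  Frequency (%)
(missing)          4      5.6%
(missing)          3      4.2%
(missing)          2      2.8%
(missing)          2      2.8%
(missing)          2      2.8%
(missing)          2      2.8%
(missing)          2      2.8%
(missing)          2      2.8%
(missing)          2      2.8%
(missing)          2      2.8%
Other values (45)  49     68.1%

Most occurring categories

Value         Count  Frequency (%)
Other Letter  72     100.0%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
(missing)          4      5.6%
(missing)          3      4.2%
(missing)          2      2.8%
(missing)          2      2.8%
(missing)          2      2.8%
(missing)          2      2.8%
(missing)          2      2.8%
(missing)          2      2.8%
(missing)          2      2.8%
(missing)          2      2.8%
Other values (45)  49     68.1%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  72     100.0%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
(missing)          4      5.6%
(missing)          3      4.2%
(missing)          2      2.8%
(missing)          2      2.8%
(missing)          2      2.8%
(missing)          2      2.8%
(missing)          2      2.8%
(missing)          2      2.8%
(missing)          2      2.8%
(missing)          2      2.8%
Other values (45)  49     68.1%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  72     100.0%

Most frequent character per block

Hangul
Value              Count  Frequency (%)
(missing)          4      5.6%
(missing)          3      4.2%
(missing)          2      2.8%
(missing)          2      2.8%
(missing)          2      2.8%
(missing)          2      2.8%
(missing)          2      2.8%
(missing)          2      2.8%
(missing)          2      2.8%
(missing)          2      2.8%
Other values (45)  49     68.1%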

단어빈도
Real number (ℝ)

HIGH CORRELATION 

Distinct: 22
Distinct (%): 73.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 42.3
Minimum: 14
Maximum: 291
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 14
5-th percentile: 15
Q1: 17.25
Median: 24.5
Q3: 38.5
95-th percentile: 143.3
Maximum: 291
Range: 277
Interquartile range (IQR): 21.25

Descriptive statistics

Standard deviation: 58.925523
Coefficient of variation (CV): 1.3930384
Kurtosis: 12.861769
Mean: 42.3
Median Absolute Deviation (MAD): 8.5
Skewness: 3.5732089
Sum: 1269
Variance: 3472.2172
Monotonicity: Decreasing
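
The monotonicity labels distinguish strict from non-strict ordering: 분석인덱스 is "Strictly increasing", while 단어빈도 is only "Decreasing" because it contains ties. A minimal sketch of such checks, run on the 20 단어빈도 values visible in the report's Sample section (the middle 10 rows are not shown, so this checks only the listed rows); the function names are illustrative:

```python
def is_strictly_increasing(xs):
    """True if every value is larger than the one before it."""
    return all(a < b for a, b in zip(xs, xs[1:]))


def is_non_increasing(xs):
    """True if no value is larger than the one before it (ties allowed)."""
    return all(a >= b for a, b in zip(xs, xs[1:]))


def is_strictly_decreasing(xs):
    """True if every value is smaller than the one before it."""
    return all(a > b for a, b in zip(xs, xs[1:]))


index = list(range(1, 31))  # 분석인덱스: 1..30
# 단어빈도 for rows 1-10 and rows 21-30, as listed in the Sample section.
freq_first = [291, 209, 63, 60, 50, 42, 41, 40, 34, 30]
freq_last = [19, 18, 17, 16, 16, 15, 15, 15, 15, 14]

print(is_strictly_increasing(index))               # index never repeats
print(is_non_increasing(freq_first + freq_last))   # decreasing, ties allowed
print(is_strictly_decreasing(freq_last))           # fails because of repeated 16 and 15
```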
Histogram with fixed size bins (bins=22)
Common values

Value              Count  Frequency (%)
15                 4      13.3%
25                 2      6.7%
16                 2      6.7%
22                 2      6.7%
30                 2      6.7%
24                 2      6.7%
291                1      3.3%
14                 1      3.3%
17                 1      3.3%
18                 1      3.3%
Other values (12)  12     40.0%

Minimum 10 values

Value  Count  Frequency (%)
14     1      3.3%
15     4      13.3%
16     2      6.7%
17     1      3.3%
18     1      3.3%
19     1      3.3%
22     2      6.7%
23     1      3.3%
24     2      6.7%
25     2      6.7%

Maximum 10 values

Value  Count  Frequency (%)
291    1      3.3%
209    1      3.3%
63     1      3.3%
60     1      3.3%
50     1      3.3%
42     1      3.3%
41     1      3.3%
40     1      3.3%
34     1      3.3%
30     2      6.7%
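
The large skewness and the gap between the mean (42.3) and the median (24.5) come from a few very large values. One common way to make that concrete is Tukey's 1.5 × IQR rule; this rule is not part of the report and is shown only as an illustrative sketch using the reported Q1 and Q3 and the largest values listed for 단어빈도.

```python
# Quartiles reported for 단어빈도.
q1, q3 = 17.25, 38.5
iqr = q3 - q1

# Tukey fences: values outside [lower_fence, upper_fence] are flagged.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# The largest values listed in the report (30 occurs twice).
largest_values = [291, 209, 63, 60, 50, 42, 41, 40, 34, 30, 30]
flagged = [v for v in largest_values if v > upper_fence]

print(upper_fence, flagged)
```

Under this rule only 291 and 209 lie above the upper fence; the minimum of 14 is well above the lower fence, so nothing is flagged on the low side.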

단어중요도
Real number (ℝ)

Distinct: 27
Distinct (%): 90.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.03012
Minimum: 0.018
Maximum: 0.0621
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 0.018
5-th percentile: 0.01912
Q1: 0.0232
Median: 0.0261
Q3: 0.03105
95-th percentile: 0.05691
Maximum: 0.0621
Range: 0.0441
Interquartile range (IQR): 0.00785

Descriptive statistics

Standard deviation: 0.01159695
Coefficient of variation (CV): 0.38502489
Kurtosis: 1.9535983
Mean: 0.03012
Median Absolute Deviation (MAD): 0.0046
Skewness: 1.6179432
Sum: 0.9036
Variance: 0.00013448924
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=27)
Common values

Value              Count  Frequency (%)
0.0238             2      6.7%
0.0261             2      6.7%
0.0231             2      6.7%
0.0242             1      3.3%
0.0222             1      3.3%
0.0287             1      3.3%
0.0311             1      3.3%
0.0309             1      3.3%
0.0282             1      3.3%
0.018              1      3.3%
Other values (17)  17     56.7%

Minimum 10 values

Value   Count  Frequency (%)
0.018   1      3.3%
0.0184  1      3.3%
0.02    1      3.3%
0.0201  1      3.3%
0.0209  1      3.3%
0.0222  1      3.3%
0.0231  2      6.7%
0.0235  1      3.3%
0.0237  1      3.3%
0.0238  2      6.7%

Maximum 10 values

Value   Count  Frequency (%)
0.0621  1      3.3%
0.057   1      3.3%
0.0568  1      3.3%
0.0472  1      3.3%
0.0391  1      3.3%
0.0376  1      3.3%
0.0332  1      3.3%
0.0311  1      3.3%
0.0309  1      3.3%
0.0308  1      3.3%

연결정도중심성
Real number (ℝ)

HIGH CORRELATION 

Distinct: 13
Distinct (%): 43.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.052496667
Minimum: 0.0069
Maximum: 0.3402
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 0.0069
5-th percentile: 0.0069
Q1: 0.0208
Median: 0.0347
Q3: 0.067675
95-th percentile: 0.141625
Maximum: 0.3402
Range: 0.3333
Interquartile range (IQR): 0.046875

Descriptive statistics

Standard deviation: 0.064849276
Coefficient of variation (CV): 1.2353027
Kurtosis: 13.604207
Mean: 0.052496667
Median Absolute Deviation (MAD): 0.0174
Skewness: 3.3661246
Sum: 1.5749
Variance: 0.0042054286
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=13)
Common values

Value             Count  Frequency (%)
0.0208            6      20.0%
0.0347            4      13.3%
0.0416            4      13.3%
0.0069            4      13.3%
0.0763            3      10.0%
0.0138            2      6.7%
0.3402            1      3.3%
0.1666            1      3.3%
0.0694            1      3.3%
0.1111            1      3.3%
Other values (3)  3      10.0%

Minimum 10 values

Value   Count  Frequency (%)
0.0069  4      13.3%
0.0138  2      6.7%
0.0208  6      20.0%
0.0277  1      3.3%
0.0347  4      13.3%
0.0416  4      13.3%
0.0625  1      3.3%
0.0694  1      3.3%
0.0763  3      10.0%
0.0833  1      3.3%

Maximum 10 values

Value   Count  Frequency (%)
0.3402  1      3.3%
0.1666  1      3.3%
0.1111  1      3.3%
0.0833  1      3.3%
0.0763  3      10.0%
0.0694  1      3.3%
0.0625  1      3.3%
0.0416  4      13.3%
0.0347  4      13.3%
0.0277  1      3.3%

매개중심성
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 24
Distinct (%): 80.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.052826667
Minimum: 0
Maximum: 0.4702
Zeros: 7
Zeros (%): 23.3%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0.0006
Median: 0.0297
Q3: 0.0637
95-th percentile: 0.159935
Maximum: 0.4702
Range: 0.4702
Interquartile range (IQR): 0.0631

Descriptive statistics

Standard deviation: 0.091216041
Coefficient of variation (CV): 1.7267045
Kurtosis: 15.692992
Mean: 0.052826667
Median Absolute Deviation (MAD): 0.02965
Skewness: 3.6556033
Sum: 1.5848
Variance: 0.0083203662
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=24)
Common values

Value              Count  Frequency (%)
0.0                7      23.3%
0.0566             1      3.3%
0.0586             1      3.3%
0.0001             1      3.3%
0.0319             1      3.3%
0.0721             1      3.3%
0.0102             1      3.3%
0.0021             1      3.3%
0.0137             1      3.3%
0.0592             1      3.3%
Other values (14)  14     46.7%

Minimum 10 values

Value   Count  Frequency (%)
0.0     7      23.3%
0.0001  1      3.3%
0.0021  1      3.3%
0.0065  1      3.3%
0.0077  1      3.3%
0.0086  1      3.3%
0.0102  1      3.3%
0.0137  1      3.3%
0.0275  1      3.3%
0.0319  1      3.3%

Maximum 10 values

Value   Count  Frequency (%)
0.4702  1      3.3%
0.2102  1      3.3%
0.0985  1      3.3%
0.0983  1      3.3%
0.0888  1      3.3%
0.0721  1      3.3%
0.0679  1      3.3%
0.0652  1      3.3%
0.0592  1      3.3%
0.0586  1      3.3%

Interactions

[Figures: 25 pairwise interaction plots, one for each pairing of the five numeric variables]

Correlations

                분석인덱스  키워드명  단어빈도  단어중요도  연결정도중심성  매개중심성
분석인덱스      1.000       1.000     0.729     0.333       0.465           0.000
키워드명        1.000       1.000     1.000     1.000       1.000           1.000
단어빈도        0.729       1.000     1.000     0.000       0.970           0.869
단어중요도      0.333       1.000     0.000     1.000       0.439           0.220
연결정도중심성  0.465       1.000     0.970     0.439       1.000           0.912
매개중심성      0.000       1.000     0.869     0.220       0.912           1.000

                분석인덱스  단어빈도  단어중요도  연결정도중심성  매개중심성
분석인덱스      1.000       -0.998    -0.057      -0.666          -0.604
단어빈도        -0.998      1.000     0.046       0.657           0.592
단어중요도      -0.057      0.046     1.000       0.260           0.270
연결정도중심성  -0.666      0.657     0.260       1.000           0.855
매개중심성      -0.604      0.592     0.270       0.855           1.000
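
The report does not name the coefficient behind the second matrix. The value of -0.998 between 분석인덱스 and 단어빈도 is consistent with a rank correlation, since 단어빈도 decreases as the index increases and departs from a perfect -1 only through ties. The sketch below computes Spearman's rank correlation (an assumption, not a confirmed method) on the 20 rows visible in the report's Sample section; with 10 rows missing it will not reproduce the reported figure exactly, only a value close to -1.

```python
import math


def average_ranks(xs):
    """Rank values from 1, giving tied values the average of their ranks."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to xs[order[i]].
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# 분석인덱스 and 단어빈도 for the 20 rows listed in the Sample section.
index = list(range(1, 11)) + list(range(21, 31))
word_freq = [291, 209, 63, 60, 50, 42, 41, 40, 34, 30,
             19, 18, 17, 16, 16, 15, 15, 15, 15, 14]

rho = spearman(index, word_freq)
print(round(rho, 3))
```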

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Row  분석인덱스  수집년월  키워드명        단어빈도  단어중요도  연결정도중심성  매개중심성
0    1           2010-01   경기도          291       0.0242      0.3402          0.4702
1    2           2010-01   데이트          209       0.0237      0.1666          0.2102
2    3           2010-01   연인            63        0.0235      0.0694          0.0584
3    4           2010-01   데이트코스      60        0.0238      0.0763          0.0679
4    5           2010-01   펜션            50        0.0621      0.1111          0.0983
5    6           2010-01   맛집            42        0.0391      0.0763          0.0652
6    7           2010-01   서울            41        0.0209      0.0625          0.0354
7    8           2010-01   장소            40        0.0261      0.0347          0.0275
8    9           2010-01   헤이리예술마을  34        0.057       0.0416          0.0985
9    10          2010-01   가평            30        0.0472      0.0347          0.0065

Last rows

Row  분석인덱스  수집년월  키워드명        단어빈도  단어중요도  연결정도중심성  매개중심성
20   21          2010-01   영화            19        0.0201      0.0138          0.0021
21   22          2010-01   일산            18        0.0308      0.0347          0.0102
22   23          2010-01   결혼            17        0.018       0.0069          0.0
23   24          2010-01   공원            16        0.0282      0.0416          0.0721
24   25          2010-01   양평            16        0.0309      0.0208          0.0
25   26          2010-01   겨울            15        0.0311      0.0347          0.0319
26   27          2010-01   고양            15        0.0231      0.0208          0.0001
27   28          2010-01   분위기          15        0.0238      0.0069          0.0
28   29          2010-01   남양주          15        0.0287      0.0208          0.0
29   30          2010-01   코스            14        0.0261      0.0416          0.0586