Overview

Dataset statistics

Number of variables: 3
Number of observations: 200
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.2 KiB
Average record size in memory: 26.7 B

Variable types

Numeric: 2
Text: 1

Dataset

Description: Analysis data based on the news database "BIGKinds", plus other metadata
Author: Korea Press Foundation (한국언론진흥재단)
URL: https://www.data.go.kr/data/15072750/fileData.do

Alerts

순위 is highly overall correlated with 빈도수 (High correlation)
빈도수 is highly overall correlated with 순위 (High correlation)
순위 has unique values (Unique)
키워드 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 14:47:51.279491
Analysis finished: 2023-12-12 14:47:52.114314
Duration: 0.83 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

순위 (rank)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 200
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 100.5
Minimum: 1
Maximum: 200
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 10.95
Q1: 50.75
Median: 100.5
Q3: 150.25
95-th percentile: 190.05
Maximum: 200
Range: 199
Interquartile range (IQR): 99.5

Descriptive statistics

Standard deviation: 57.879185
Coefficient of variation (CV): 0.57591228
Kurtosis: -1.2
Mean: 100.5
Median Absolute Deviation (MAD): 50
Skewness: 0
Sum: 20100
Variance: 3350
Monotonicity: Strictly increasing
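
Because 순위 takes every integer from 1 to 200 exactly once, the quantile and descriptive figures above can be re-derived without the original file. The following is a minimal sketch using only Python's standard library. It assumes that percentiles are linearly interpolated between order statistics and that the variance is the sample variance (dividing by n - 1); both assumptions are consistent with the reported numbers but are not stated in the report. The helper `percentile` is illustrative and not part of ydata-profiling.

```python
import math
import statistics

# Assumption: the column holds the integers 1..200, as the report's
# Distinct, Minimum, Maximum and Monotonicity fields imply.
ranks = list(range(1, 201))


def percentile(sorted_values, q):
    """Linearly interpolate between order statistics (q in [0, 1])."""
    position = q * (len(sorted_values) - 1)
    lower = math.floor(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


mean = statistics.mean(ranks)
median = statistics.median(ranks)
variance = statistics.variance(ranks)  # sample variance (divides by n - 1)
std_dev = math.sqrt(variance)
cv = std_dev / mean                    # coefficient of variation

p5 = percentile(ranks, 0.05)
q1 = percentile(ranks, 0.25)
q3 = percentile(ranks, 0.75)
p95 = percentile(ranks, 0.95)
iqr = q3 - q1

# Median absolute deviation: median of the distances from the median.
mad = statistics.median(abs(x - median) for x in ranks)

print(mean, median, variance, round(std_dev, 6), round(cv, 8))
print(round(p5, 2), q1, q3, round(p95, 2), iqr, mad)
```

The values agree with the report up to rounding, which suggests the column is a plain 1-to-200 ranking with no gaps.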
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
1                   1      0.5%
139                 1      0.5%
129                 1      0.5%
130                 1      0.5%
131                 1      0.5%
132                 1      0.5%
133                 1      0.5%
134                 1      0.5%
135                 1      0.5%
136                 1      0.5%
Other values (190)  190    95.0%

Value  Count  Frequency (%)
1      1      0.5%
2      1      0.5%
3      1      0.5%
4      1      0.5%
5      1      0.5%
6      1      0.5%
7      1      0.5%
8      1      0.5%
9      1      0.5%
10     1      0.5%

Value  Count  Frequency (%)
200    1      0.5%
199    1      0.5%
198    1      0.5%
197    1      0.5%
196    1      0.5%
195    1      0.5%
194    1      0.5%
193    1      0.5%
192    1      0.5%
191    1      0.5%

키워드 (keyword)
Text

UNIQUE

Distinct: 200
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB

Length

Max length: 7
Median length: 3
Mean length: 3.27
Min length: 3

Characters and Unicode

Total characters: 654
Distinct characters: 262
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
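
As an illustration of how such character properties can be read, the sketch below classifies the characters of a few keywords with Python's standard `unicodedata` module. The four keywords are copied from this report (they are not the full column), the `CATEGORY_NAMES` mapping only covers the six categories the report lists, and `block_of` is a hypothetical helper that recognises just the two blocks named here. The report labels the block "Hangul"; treating it as the Hangul Syllables range (U+AC00 to U+D7AF) is an assumption.

```python
import unicodedata
from collections import Counter

# A few keywords that appear in the report; not the full 200-row column.
keywords = ["대통령", "서비스", "IoT", "ceo"]

# Two-letter general-category codes mapped to the names shown in the report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Pc": "Connector Punctuation",
    "Po": "Other Punctuation",
}


def block_of(char):
    """Coarse block lookup covering only the two blocks the report lists."""
    code_point = ord(char)
    if code_point < 0x80:
        return "ASCII"
    if 0xAC00 <= code_point <= 0xD7AF:
        return "Hangul"  # assumption: the Hangul Syllables block
    return "Other"


category_counts = Counter()
block_counts = Counter()
for keyword in keywords:
    for char in keyword:
        code = unicodedata.category(char)
        category_counts[CATEGORY_NAMES.get(code, code)] += 1
        block_counts[block_of(char)] += 1

print(dict(category_counts))
print(dict(block_counts))
```

Hangul syllables fall under "Other Letter" because the script has no upper or lower case, which is why that category dominates the tables in this section.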

Unique

Unique: 200
Unique (%): 100.0%

Sample

1st row: 대통령
2nd row: 서비스
3rd row: 프로그램
4th row: 청와대
5th row: 트럼프
Value               Count  Frequency (%)
대통령              1      0.5%
선수들              1      0.5%
대법원              1      0.5%
프리미엄            1      0.5%
카탈루냐            1      0.5%
간담회              1      0.5%
전기차              1      0.5%
대변인              1      0.5%
ceo                 1      0.5%
인공지능            1      0.5%
Other values (190)  190    95.0%

Most occurring characters

Value               Count  Frequency (%)
                    22     3.4%
                    16     2.4%
                    15     2.3%
                    14     2.1%
                    13     2.0%
                    13     2.0%
                    12     1.8%
                    11     1.7%
                    11     1.7%
                    10     1.5%
Other values (252)  517    79.1%

Most occurring categories

Value                  Count  Frequency (%)
Other Letter           618    94.5%
Uppercase Letter       22     3.4%
Lowercase Letter       5      0.8%
Decimal Number         5      0.8%
Connector Punctuation  3      0.5%
Other Punctuation      1      0.2%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
                    22     3.6%
                    16     2.6%
                    15     2.4%
                    14     2.3%
                    13     2.1%
                    13     2.1%
                    12     1.9%
                    11     1.8%
                    11     1.8%
                    10     1.6%
Other values (225)  481    77.8%

Uppercase Letter
Value             Count  Frequency (%)
S                 2      9.1%
T                 2      9.1%
R                 2      9.1%
K                 2      9.1%
I                 2      9.1%
A                 2      9.1%
O                 1      4.5%
E                 1      4.5%
C                 1      4.5%
B                 1      4.5%
Other values (6)  6      27.3%

Decimal Number
Value  Count  Frequency (%)
4      1      20.0%
2      1      20.0%
0      1      20.0%
1      1      20.0%
7      1      20.0%

Lowercase Letter
Value  Count  Frequency (%)
o      2      40.0%
r      1      20.0%
e      1      20.0%
a      1      20.0%

Connector Punctuation
Value  Count  Frequency (%)
_      3      100.0%

Other Punctuation
Value  Count  Frequency (%)
&      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  618    94.5%
Latin   27     4.1%
Common  9      1.4%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
                    22     3.6%
                    16     2.6%
                    15     2.4%
                    14     2.3%
                    13     2.1%
                    13     2.1%
                    12     1.9%
                    11     1.8%
                    11     1.8%
                    10     1.6%
Other values (225)  481    77.8%

Latin
Value              Count  Frequency (%)
o                  2      7.4%
S                  2      7.4%
T                  2      7.4%
R                  2      7.4%
K                  2      7.4%
I                  2      7.4%
A                  2      7.4%
O                  1      3.7%
E                  1      3.7%
C                  1      3.7%
Other values (10)  10     37.0%

Common
Value  Count  Frequency (%)
_      3      33.3%
&      1      11.1%
4      1      11.1%
2      1      11.1%
0      1      11.1%
1      1      11.1%
7      1      11.1%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  618    94.5%
ASCII   36     5.5%

Most frequent character per block

Hangul
Value               Count  Frequency (%)
                    22     3.6%
                    16     2.6%
                    15     2.4%
                    14     2.3%
                    13     2.1%
                    13     2.1%
                    12     1.9%
                    11     1.8%
                    11     1.8%
                    10     1.6%
Other values (225)  481    77.8%

ASCII
Value              Count  Frequency (%)
_                  3      8.3%
o                  2      5.6%
S                  2      5.6%
T                  2      5.6%
R                  2      5.6%
K                  2      5.6%
I                  2      5.6%
A                  2      5.6%
O                  1      2.8%
E                  1      2.8%
Other values (17)  17     47.2%

빈도수 (frequency)
Real number (ℝ)

HIGH CORRELATION

Distinct: 176
Distinct (%): 88.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1248.4
Minimum: 562
Maximum: 13864
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 562
5-th percentile: 573.8
Q1: 664
Median: 935.5
Q3: 1393.75
95-th percentile: 2842.1
Maximum: 13864
Range: 13302
Interquartile range (IQR): 729.75

Descriptive statistics

Standard deviation: 1197.8927
Coefficient of variation (CV): 0.9595424
Kurtosis: 63.701097
Mean: 1248.4
Median Absolute Deviation (MAD): 315.5
Skewness: 6.7175905
Sum: 249680
Variance: 1434947
Monotonicity: Decreasing
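
Several of the 빈도수 figures are derived from others, so they can be cross-checked by simple arithmetic. The sketch below does this with the numbers copied from this report. It cannot recompute skewness, kurtosis, or MAD, which need the full column. The variable names are illustrative.

```python
import math

# Figures copied from the report for 빈도수.
n = 200
total = 249680
minimum, maximum = 562, 13864
q1, q3 = 664, 1393.75
variance = 1434947          # shown rounded to an integer in the report
reported_std = 1197.8927

mean = total / n                    # mean = sum / number of observations
value_range = maximum - minimum     # range = max - min
iqr = q3 - q1                       # interquartile range = Q3 - Q1
std_dev = math.sqrt(variance)       # standard deviation = sqrt(variance)
cv = reported_std / mean            # coefficient of variation = std / mean

print(mean, value_range, iqr)
print(round(std_dev, 2), round(cv, 6))
```

The mean (1248.4) sits well above the median (935.5), which matches the large positive skewness: a few very frequent keywords, led by the maximum of 13864, pull the mean upward.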
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
564                 3      1.5%
613                 3      1.5%
574                 3      1.5%
674                 3      1.5%
732                 2      1.0%
585                 2      1.0%
621                 2      1.0%
615                 2      1.0%
661                 2      1.0%
665                 2      1.0%
Other values (166)  176    88.0%

Value  Count  Frequency (%)
562    1      0.5%
563    1      0.5%
564    3      1.5%
565    1      0.5%
566    1      0.5%
568    2      1.0%
570    1      0.5%
574    3      1.5%
580    2      1.0%
581    2      1.0%

Value  Count  Frequency (%)
13864  1      0.5%
6773   1      0.5%
4445   1      0.5%
3697   1      0.5%
3594   1      0.5%
3333   1      0.5%
3256   1      0.5%
3130   1      0.5%
3026   1      0.5%
2977   1      0.5%

Interactions

[Figures: pairwise interaction plots for 순위 and 빈도수]

Correlations

        순위    빈도수
순위    1.000   0.768
빈도수  0.768   1.000

        순위    빈도수
순위    1.000   -1.000
빈도수  -1.000  1.000
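
The report text does not say which method produced each matrix. A coefficient of -1.000 is what a rank-based (Spearman) correlation gives when one column falls whenever the other rises, so, as an assumption, the sketch below computes Spearman's coefficient for the first ten rows of the Sample section using only the standard library. It does not attempt to reproduce the 0.768 value, whose method cannot be determined from the text. The helper names are illustrative and not part of ydata-profiling.

```python
import math

# First ten (순위, 빈도수) pairs, copied from the report's Sample section.
ranks = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
freqs = [13864, 6773, 4445, 3697, 3594, 3333, 3256, 3130, 3026, 2977]


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            result[order[k]] = shared
        start = end + 1
    return result


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(xs), average_ranks(ys))


rho = spearman(ranks, freqs)
print(round(rho, 3))
```

On the full column the tied 빈도수 values would keep the coefficient marginally above -1, which would still display as -1.000 at three decimals.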

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

     순위  키워드    빈도수
0    1     대통령    13864
1    2     서비스    6773
2    3     프로그램  4445
3    4     청와대    3697
4    5     트럼프    3594
5    6     글로벌    3333
6    7     위원장    3256
7    8     외국인    3130
8    9     삼성전자  3026
9    10    드라마    2977

     순위  키워드    빈도수
190  191   인천시    570
191  192   IoT       568
192  193   리스크    568
193  194   국정감사  566
194  195   개정안    565
195  196   태양광    564
196  197   순이익    564
197  198   협의회    564
198  199   문화재    563
199  200   성매매    562
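
The last ten rows show why the report describes 순위 as "Strictly increasing" but 빈도수 only as "Decreasing": the frequencies never rise, but some repeat (568 twice, 564 three times). A minimal sketch of such a check follows. The function `monotonicity` is a hypothetical helper whose label strings are chosen to mirror the report's wording, not ydata-profiling's actual implementation.

```python
# Last ten (순위, 빈도수) pairs, copied from the sample rows above.
tail_ranks = [191, 192, 193, 194, 195, 196, 197, 198, 199, 200]
tail_freqs = [570, 568, 568, 566, 565, 564, 564, 564, 563, 562]


def monotonicity(values):
    """Classify a sequence by comparing each element with its successor."""
    pairs = list(zip(values, values[1:]))
    if all(a < b for a, b in pairs):
        return "Strictly increasing"
    if all(a > b for a, b in pairs):
        return "Strictly decreasing"
    if all(a <= b for a, b in pairs):
        return "Increasing"   # ties allowed
    if all(a >= b for a, b in pairs):
        return "Decreasing"   # ties allowed
    return "Not monotonic"


print(monotonicity(tail_ranks))
print(monotonicity(tail_freqs))
```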