Overview

Dataset statistics

Number of variables: 4
Number of observations: 997
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 33.2 KiB
Average record size in memory: 34.1 B

Variable types

Numeric: 2
Text: 1
DateTime: 1

Dataset

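Description: Frequency counts of the search terms entered for materials hosted on the Multicultural Education Portal (다문화교육포털). For each keyword searched by visitors, the dataset gives the number of times it was searched.
Author: National Institute for Lifelong Education (국가평생교육진흥원)
URL: https://www.data.go.kr/data/15090342/fileData.do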

Alerts

Skewed: 검색빈도 is highly skewed (γ1 = 20.66581599)
Unique: 연번 has unique values
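
The skew alert reports γ1, a skewness coefficient. As a rough illustration of why a handful of very large counts yields a large positive γ1, the sketch below computes the adjusted Fisher-Pearson sample skewness with the standard library only. That the report uses exactly this estimator is an assumption, and both input lists are made-up values, not rows from the dataset.

```python
import math

def sample_skewness(values):
    """Adjusted Fisher-Pearson sample skewness (assumed estimator, see lead-in)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant data: no asymmetry to measure
    g1 = m3 / m2 ** 1.5
    return math.sqrt(n * (n - 1)) / (n - 2) * g1

# Hypothetical data: a symmetric sample and one with a single huge count.
symmetric = [1, 2, 3, 4, 5]
long_tail = [1, 1, 1, 2, 2, 3, 8, 500]

print(sample_skewness(symmetric))  # zero for a symmetric sample
print(sample_skewness(long_tail))  # strongly positive
```

A distribution like 검색빈도, where the median is 2 and the maximum is 9356, behaves like the second list: one extreme value dominates the third moment.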

Reproduction

Analysis started: 2023-12-13 00:05:18.805874
Analysis finished: 2023-12-13 00:05:19.451163
Duration: 0.65 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
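
The reported duration follows directly from the two timestamps. A minimal check, using only the values printed in this section:

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2023-12-13 00:05:18.805874", FMT)
finished = datetime.strptime("2023-12-13 00:05:19.451163", FMT)

# Elapsed wall-clock time between the two report timestamps, in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```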

Variables

연번 (serial number)
Real number (ℝ)

UNIQUE 

Distinct: 997
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 499.99198
Minimum: 1
Maximum: 998
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 1
5th percentile: 51.8
Q1: 251
Median: 500
Q3: 749
95th percentile: 948.2
Maximum: 998
Range: 997
Interquartile range (IQR): 498

Descriptive statistics

Standard deviation: 287.96722
Coefficient of variation (CV): 0.57594368
Kurtosis: -1.1997768
Mean: 499.99198
Median Absolute Deviation (MAD): 249
Skewness: -0.00016398895
Sum: 498492
Variance: 82925.118
Monotonicity: Strictly increasing
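
The figures reported for 연번 are mutually consistent, and together they show which serial number is absent. The sketch below uses only numbers printed in this report and assumes that 연번 holds integers, which the value tables suggest but the "Real number" type does not guarantee.

```python
# Values copied from the report for 연번.
n = 997
minimum, maximum = 1, 998
q1, q3 = 251, 749
mean, std = 499.99198, 287.96722
total = 498492  # reported "Sum"

value_range = maximum - minimum  # reported Range
iqr = q3 - q1                    # reported IQR
cv = std / mean                  # reported CV
variance = std ** 2              # reported Variance

# 997 distinct, strictly increasing values between 1 and 998 leave exactly
# one integer unused (assumption: all values are integers).
full_sum = maximum * (maximum + 1) // 2  # sum of 1..998
missing = full_sum - total

print(value_range, iqr, round(cv, 5), round(variance, 1), missing)
```

The skipped value is 9, which agrees with the lowest-values table for this variable, where the listing jumps from 8 to 10.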
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
1                   1      0.1%
672                 1      0.1%
659                 1      0.1%
660                 1      0.1%
661                 1      0.1%
662                 1      0.1%
663                 1      0.1%
664                 1      0.1%
665                 1      0.1%
666                 1      0.1%
Other values (987)  987    99.0%

Minimum 10 values

Value  Count  Frequency (%)
1      1      0.1%
2      1      0.1%
3      1      0.1%
4      1      0.1%
5      1      0.1%
6      1      0.1%
7      1      0.1%
8      1      0.1%
10     1      0.1%
11     1      0.1%

Maximum 10 values

Value  Count  Frequency (%)
998    1      0.1%
997    1      0.1%
996    1      0.1%
995    1      0.1%
994    1      0.1%
993    1      0.1%
992    1      0.1%
991    1      0.1%
990    1      0.1%
989    1      0.1%

검색어 (search term)
Text

Distinct: 971
Distinct (%): 97.4%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB

Length

Max length: 114
Median length: 31
Mean length: 8.2136409
Min length: 2

Characters and Unicode

Total characters: 8189
Distinct characters: 443
Distinct categories: 12
Distinct scripts: 4
Distinct blocks: 6
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
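
As a small illustration of those character properties, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. The three input strings are search terms that appear in this report's sample rows; the mapping from two-letter codes to the long names used in the report (for example `Lo` for Other Letter, `Zs` for Space Separator) follows the Unicode Standard. The helper name `category_counts` is made up for this example.

```python
import unicodedata
from collections import Counter

def category_counts(texts):
    """Count characters per Unicode general category across all texts."""
    counts = Counter()
    for text in texts:
        # Normalize to NFC so each Hangul syllable is one code point.
        for ch in unicodedata.normalize("NFC", text):
            counts[unicodedata.category(ch)] += 1
    return counts

sample = ["표준 한국어", "ppt", "한국어 진단활동자료_국어3"]
counts = category_counts(sample)

# Lo = Other Letter, Zs = Space Separator, Ll = Lowercase Letter,
# Pc = Connector Punctuation, Nd = Decimal Number
for code, count in counts.most_common():
    print(code, count)
```

Hangul syllables fall under Other Letter, which is why that category dominates the character tables for this variable.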

Unique

Unique: 948
Unique (%): 95.1%
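
"Distinct" and "Unique" measure different things. The sketch below assumes the usual profiling definitions: distinct is the number of different values, unique is the number of values that occur exactly once. The function name and the short list of terms are hypothetical.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Hypothetical column: "다문화" appears twice, the rest once.
terms = ["다문화", "한국어", "다문화", "ppt", "인권"]
print(distinct_and_unique(terms))  # (4, 3)
```

Under that reading, the reported 971 distinct and 948 unique values mean that 971 - 948 = 23 search terms appear in more than one row, and those repeats cover 997 - 948 = 49 rows.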

Sample

1st row: 베트남어
2nd row: 학적
3rd row: 스스로
4th row: 이중언어
5th row: 표준 한국어

Words

Value               Count  Frequency (%)
다문화              108    5.5%
다문화교육          43     2.2%
한국어              39     2.0%
위한                36     1.8%
학적관리            24     1.2%
교육                23     1.2%
다문화학생          20     1.0%
배우는              19     1.0%
어휘                17     0.9%
2019                16     0.8%
Other values (856)  1617   82.4%

Most occurring characters

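Value               Count  Frequency (%)
(space)             967    11.8%
[not preserved]     369    4.5%
[not preserved]     346    4.2%
[not preserved]     329    4.0%
[not preserved]     325    4.0%
[not preserved]     300    3.7%
[not preserved]     200    2.4%
[not preserved]     151    1.8%
[not preserved]     147    1.8%
[not preserved]     140    1.7%
Other values (433)  4915   60.0%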

Most occurring categories

Value                  Count  Frequency (%)
Other Letter           6736   82.3%
Space Separator        967    11.8%
Decimal Number         240    2.9%
Lowercase Letter       153    1.9%
Uppercase Letter       43     0.5%
Other Punctuation      15     0.2%
Connector Punctuation  13     0.2%
Open Punctuation       9      0.1%
Close Punctuation      7      0.1%
Dash Punctuation       4      < 0.1%
Other values (2)       2      < 0.1%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
[not preserved]     369    5.5%
[not preserved]     346    5.1%
[not preserved]     329    4.9%
[not preserved]     325    4.8%
[not preserved]     300    4.5%
[not preserved]     200    3.0%
[not preserved]     151    2.2%
[not preserved]     147    2.2%
[not preserved]     140    2.1%
[not preserved]     106    1.6%
Other values (367)  4323   64.2%

Lowercase Letter
Value              Count  Frequency (%)
k                  23     15.0%
d                  17     11.1%
t                  10     6.5%
l                  10     6.5%
e                  9      5.9%
c                  9      5.9%
g                  8      5.2%
p                  8      5.2%
n                  7      4.6%
s                  7      4.6%
Other values (14)  45     29.4%

Uppercase Letter
Value             Count  Frequency (%)
K                 9      20.9%
S                 8      18.6%
L                 7      16.3%
E                 3      7.0%
R                 2      4.7%
N                 2      4.7%
P                 2      4.7%
I                 1      2.3%
H                 1      2.3%
M                 1      2.3%
Other values (7)  7      16.3%

Decimal Number
Value  Count  Frequency (%)
1      62     25.8%
2      57     23.8%
0      55     22.9%
9      28     11.7%
8      11     4.6%
7      9      3.8%
4      8      3.3%
3      5      2.1%
5      3      1.2%
6      2      0.8%

Other Punctuation
Value  Count  Frequency (%)
,      6      40.0%
:      3      20.0%
'      2      13.3%
?      1      6.7%
\      1      6.7%
;      1      6.7%
/      1      6.7%

Open Punctuation
Value            Count  Frequency (%)
(                8      88.9%
[not preserved]  1      11.1%

Space Separator
Value    Count  Frequency (%)
(space)  967    100.0%

Connector Punctuation
Value  Count  Frequency (%)
_      13     100.0%

Close Punctuation
Value  Count  Frequency (%)
)      7      100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      4      100.0%

Final Punctuation
Value            Count  Frequency (%)
[not preserved]  1      100.0%

Initial Punctuation
Value            Count  Frequency (%)
[not preserved]  1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  6734   82.2%
Common  1257   15.3%
Latin   196    2.4%
Han     2      < 0.1%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
[not preserved]     369    5.5%
[not preserved]     346    5.1%
[not preserved]     329    4.9%
[not preserved]     325    4.8%
[not preserved]     300    4.5%
[not preserved]     200    3.0%
[not preserved]     151    2.2%
[not preserved]     147    2.2%
[not preserved]     140    2.1%
[not preserved]     106    1.6%
Other values (365)  4321   64.2%

Latin
Value              Count  Frequency (%)
k                  23     11.7%
d                  17     8.7%
t                  10     5.1%
l                  10     5.1%
e                  9      4.6%
K                  9      4.6%
c                  9      4.6%
S                  8      4.1%
g                  8      4.1%
p                  8      4.1%
Other values (31)  85     43.4%

Common
Value              Count  Frequency (%)
(space)            967    76.9%
1                  62     4.9%
2                  57     4.5%
0                  55     4.4%
9                  28     2.2%
_                  13     1.0%
8                  11     0.9%
7                  9      0.7%
4                  8      0.6%
(                  8      0.6%
Other values (15)  39     3.1%

Han
Value            Count  Frequency (%)
[not preserved]  1      50.0%
[not preserved]  1      50.0%

Most occurring blocks

Value        Count  Frequency (%)
Hangul       6727   82.1%
ASCII        1450   17.7%
Compat Jamo  7      0.1%
CJK          2      < 0.1%
Punctuation  2      < 0.1%
None         1      < 0.1%

Most frequent character per block

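ASCII
Value              Count  Frequency (%)
(space)            967    66.7%
1                  62     4.3%
2                  57     3.9%
0                  55     3.8%
9                  28     1.9%
k                  23     1.6%
d                  17     1.2%
_                  13     0.9%
8                  11     0.8%
t                  10     0.7%
Other values (53)  207    14.3%

Hangul
Value               Count  Frequency (%)
[not preserved]     369    5.5%
[not preserved]     346    5.1%
[not preserved]     329    4.9%
[not preserved]     325    4.8%
[not preserved]     300    4.5%
[not preserved]     200    3.0%
[not preserved]     151    2.2%
[not preserved]     147    2.2%
[not preserved]     140    2.1%
[not preserved]     106    1.6%
Other values (359)  4314   64.1%

Compat Jamo
Value            Count  Frequency (%)
[not preserved]  2      28.6%
[not preserved]  1      14.3%
[not preserved]  1      14.3%
[not preserved]  1      14.3%
[not preserved]  1      14.3%
[not preserved]  1      14.3%

CJK
Value            Count  Frequency (%)
[not preserved]  1      50.0%
[not preserved]  1      50.0%

Punctuation
Value            Count  Frequency (%)
[not preserved]  1      50.0%
[not preserved]  1      50.0%

None
Value            Count  Frequency (%)
[not preserved]  1      100.0%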

검색빈도 (search frequency)
Real number (ℝ)

SKEWED 

Distinct: 104
Distinct (%): 10.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.848546
Minimum: 1
Maximum: 9356
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 8.9 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 2
Q3: 8
95th percentile: 74.2
Maximum: 9356
Range: 9355
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 354.7044
Coefficient of variation (CV): 9.3716784
Kurtosis: 501.31789
Mean: 37.848546
Median Absolute Deviation (MAD): 1
Skewness: 20.665816
Sum: 37735
Variance: 125815.21
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
1                  473    47.4%
2                  91     9.1%
3                  60     6.0%
4                  34     3.4%
5                  34     3.4%
6                  21     2.1%
8                  21     2.1%
7                  17     1.7%
10                 14     1.4%
11                 14     1.4%
Other values (94)  218    21.9%

Minimum 10 values

Value  Count  Frequency (%)
1      473    47.4%
2      91     9.1%
3      60     6.0%
4      34     3.4%
5      34     3.4%
6      21     2.1%
7      17     1.7%
8      21     2.1%
9      13     1.3%
10     14     1.4%

Maximum 10 values

Value  Count  Frequency (%)
9356   1      0.1%
4169   1      0.1%
2260   1      0.1%
2214   1      0.1%
1901   1      0.1%
1852   1      0.1%
1180   1      0.1%
1062   1      0.1%
929    1      0.1%
438    1      0.1%
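
To put the skew of 검색빈도 in concrete terms, the sketch below works out how much of the total search volume the ten largest counts carry. It uses only the ten maximum values and the "Sum" reported for this variable; each of those values occurs in a single row.

```python
# Ten largest 검색빈도 values and the reported column sum.
top10 = [9356, 4169, 2260, 2214, 1901, 1852, 1180, 1062, 929, 438]
total_searches = 37735

top10_total = sum(top10)
top1_share = top10[0] / total_searches
top10_share = top10_total / total_searches

print(f"largest value: {top1_share:.1%} of all searches")
print(f"ten largest:   {top10_share:.1%} of all searches")
```

Ten of the 997 rows account for roughly two thirds of the recorded searches, while nearly half of the rows have a count of 1.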

자료생성일 (record creation date)
DateTime

Distinct: 953
Distinct (%): 95.6%
Missing: 0
Missing (%): 0.0%
Memory size: 7.9 KiB
Minimum: 2019-01-29 16:49:00
Maximum: 2021-09-14 15:03:00
Histogram with fixed size bins (bins=50)

Interactions

[Figures: scatter plots for each pairing of 연번 and 검색빈도]

Correlations

          연번    검색빈도
연번      1.000   0.229
검색빈도  0.229   1.000

          연번    검색빈도
연번      1.000   -0.235
검색빈도  -0.235  1.000
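
The report does not say which coefficient each matrix shows. As a sketch only, assuming one matrix is a linear (Pearson) coefficient and the other a rank-based (Spearman) one, the example below shows how a single extreme value can make the two disagree in sign. The data are hypothetical, and all function names are made up for this illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Hypothetical data: counts fall with the serial number, except one outlier.
serial = [1, 2, 3, 4, 5, 6]
counts = [5, 4, 3, 2, 1, 100]

print(round(pearson(serial, counts), 3))   # positive, driven by the outlier
print(round(spearman(serial, counts), 3))  # negative, ranks mostly decrease
```

With a variable as skewed as 검색빈도, a rank-based coefficient is usually the more robust summary, since it is not dominated by a few very large counts.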

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index  연번  검색어  검색빈도  자료생성일
0      1     베트남어  49  2021-08-01 14:30
1      2     학적  1062  2021-09-10 14:23
2      3     스스로  382  2021-09-10 21:33
3      4     이중언어  1901  2021-09-13 19:56
4      5     표준 한국어  1852  2021-09-14 15:03
5      6     학적 관리  929  2021-09-14 14:49
6      7     연수  61  2021-08-26 20:18
7      8     우수사례  261  2021-09-14 11:25
8      10    한국  34  2021-05-04 19:47
9      11    한국어 진단활동자료_국어3  2214  2021-09-14 14:00

Last rows

Index  연번  검색어  검색빈도  자료생성일
987    989   세계 음식 체험  1  2019-04-22 11:44
988    990   ppt  8  2020-05-17 18:40
989    991   운영가이드  5  2020-05-21 15:42
990    992   전문상담교사 대상 다문화 이해연수  1  2019-04-22 14:50
991    993   교육 과정 여녜 다문화 교육 수업 도움자료  1  2019-04-22 15:14
992    994   다문화교육의 특징  2  2020-11-25 16:20
993    995   눈을열어요  1  2019-04-22 15:40
994    996   2019다문화학생을 위한 학적관리메뉴얼  1  2019-04-22 15:44
995    997   다문화지원단  1  2019-04-22 18:20
996    998   인권  13  2021-08-30 13:16