Overview

Dataset statistics

Number of variables: 2
Number of observations: 3053
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 50.8 KiB
Average record size in memory: 17.0 B

Variable types

Numeric: 1
Text: 1

Dataset

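Description: Provides the schools attended by candidates who passed the 2019 clinical laboratory technologist examination (fields: number, school name), in a form that does not allow individuals to be identified.
URL: https://www.data.go.kr/data/15067562/fileData.do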

Alerts

번호 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 19:49:44.245074
Analysis finished: 2023-12-12 19:49:44.730461
Duration: 0.49 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

번호
Real number (ℝ)

UNIQUE 

Distinct: 3053
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1527
Minimum: 1
Maximum: 3053
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 27.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 153.6
Q1: 764
Median: 1527
Q3: 2290
95th percentile: 2900.4
Maximum: 3053
Range: 3052
Interquartile range (IQR): 1526

Descriptive statistics

Standard deviation: 881.46951
Coefficient of variation (CV): 0.57725574
Kurtosis: -1.2
Mean: 1527
Median Absolute Deviation (MAD): 763
Skewness: 0
Sum: 4661931
Variance: 776988.5
Monotonicity: Strictly increasing
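
The figures above are consistent with 번호 being the consecutive integers 1 to 3053. The sketch below recomputes them under that assumption, which is inferred from the minimum, maximum, distinct count, and monotonicity rather than stated by the report. The `percentile` helper is a hypothetical illustration of linear interpolation between ranks and is not taken from the profiling tool.

```python
import math

# Assumption: 번호 is the consecutive integer sequence 1..3053.
n = 3053
values = list(range(1, n + 1))

total = sum(values)
mean = total / n
# Sample variance (divisor n - 1); this is the variant that reproduces
# the reported variance for this sequence.
variance = sum((x - mean) ** 2 for x in values) / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation


def percentile(sorted_values, q):
    """Percentile by linear interpolation between the two closest ranks."""
    pos = q * (len(sorted_values) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


q1, q3 = percentile(values, 0.25), percentile(values, 0.75)
median = percentile(values, 0.5)
# Median absolute deviation: median of |x - median(x)|.
mad = percentile(sorted(abs(x - median) for x in values), 0.5)

print("sum:", total, "mean:", mean, "variance:", variance)
print("std:", round(std, 5), "cv:", round(cv, 8))
print("p5:", percentile(values, 0.05), "p95:", percentile(values, 0.95))
print("Q1:", q1, "Q3:", q3, "IQR:", q3 - q1, "MAD:", mad)
```

Because the variance only matches with the n - 1 divisor, the report's standard deviation and CV appear to be sample statistics rather than population ones.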
Histogram with fixed size bins (bins=50)
Value                Count  Frequency (%)
1                    1      < 0.1%
2040                 1      < 0.1%
2031                 1      < 0.1%
2032                 1      < 0.1%
2033                 1      < 0.1%
2034                 1      < 0.1%
2035                 1      < 0.1%
2036                 1      < 0.1%
2037                 1      < 0.1%
2038                 1      < 0.1%
Other values (3043)  3043   99.7%

Value  Count  Frequency (%)
1      1      < 0.1%
2      1      < 0.1%
3      1      < 0.1%
4      1      < 0.1%
5      1      < 0.1%
6      1      < 0.1%
7      1      < 0.1%
8      1      < 0.1%
9      1      < 0.1%
10     1      < 0.1%

Value  Count  Frequency (%)
3053   1      < 0.1%
3052   1      < 0.1%
3051   1      < 0.1%
3050   1      < 0.1%
3049   1      < 0.1%
3048   1      < 0.1%
3047   1      < 0.1%
3046   1      < 0.1%
3045   1      < 0.1%
3044   1      < 0.1%

학교
Text

Distinct: 58
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 24.0 KiB

Length

Max length: 26
Median length: 13
Mean length: 6.5735342
Min length: 4

Characters and Unicode

Total characters: 20069
Distinct characters: 93
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
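
As a rough illustration of how such character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module for three 학교 values taken from this report's sample rows. It only demonstrates the idea and is not necessarily how the profiling tool computes its tables.

```python
import unicodedata
from collections import Counter

# Three 학교 values copied from the sample rows of this report.
names = ["충북보건과학대학교", "신성대학교", "단국대학교(충남)"]

# Tally the general category of every character, for example
# "Lo" (Other Letter), "Ps" (Open Punctuation), "Pe" (Close Punctuation).
categories = Counter(
    unicodedata.category(ch) for name in names for ch in name
)

for code, count in categories.most_common():
    print(code, count)
```

Hangul syllables fall under "Lo" (Other Letter), which is why that category dominates the tables in this section.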

Unique

Unique: 5
Unique (%): 0.2%

Sample

1st row: 충북보건과학대학교
2nd row: 신성대학교
3rd row: 경복대학교
4th row: 대구보건대학교
5th row: 나사렛대학교

Value                    Count  Frequency (%)
대구보건대학교           288    9.4%
대전보건대학교           136    4.5%
원광보건대학교           127    4.2%
신한대학교(의정부캠퍼스  112    3.7%
김천대학교               107    3.5%
을지대학교(성남          99     3.2%
동남보건대학교           94     3.1%
진주보건대학교           91     3.0%
광주보건대학교           89     2.9%
세명대학교               84     2.7%
Other values (50)        1828   59.8%
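
The counts in this table sum to 3,055, two more than the 3,053 observations, and the percentages match shares of 3,055. One plausible reading, offered here as an assumption, is that the table counts whitespace-separated tokens rather than rows (the character statistics list 2 space separators). The sketch below only checks the arithmetic.

```python
# Counts copied from the 학교 frequency table in this report.
counts = {
    "대구보건대학교": 288,
    "대전보건대학교": 136,
    "원광보건대학교": 127,
    "신한대학교(의정부캠퍼스": 112,
    "김천대학교": 107,
    "을지대학교(성남": 99,
    "동남보건대학교": 94,
    "진주보건대학교": 91,
    "광주보건대학교": 89,
    "세명대학교": 84,
    "Other values (50)": 1828,
}

total = sum(counts.values())
# Percentage share of each entry, rounded to one decimal as in the report.
shares = {value: round(100 * count / total, 1) for value, count in counts.items()}

print("total:", total)
for value, share in shares.items():
    print(value, share)
```

With 3,053 as the denominator, 세명대학교 would round to 2.8% rather than the reported 2.7%, which is what points to the larger total.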

Most occurring characters

Value              Count  Frequency (%)
                   3726   18.6%
                   3261   16.2%
                   3011   15.0%
                   976    4.9%
                   918    4.6%
                   418    2.1%
                   396    2.0%
(                  371    1.8%
)                  371    1.8%
                   343    1.7%
Other values (83)  6278   31.3%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       19303  96.2%
Open Punctuation   371    1.8%
Close Punctuation  371    1.8%
Lowercase Letter   18     0.1%
Uppercase Letter   3      < 0.1%
Space Separator    2      < 0.1%
Decimal Number     1      < 0.1%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
                   3726   19.3%
                   3261   16.9%
                   3011   15.6%
                   976    5.1%
                   918    4.8%
                   418    2.2%
                   396    2.1%
                   343    1.8%
                   337    1.7%
                   295    1.5%
Other values (65)  5622   29.1%
Lowercase Letter
Value             Count  Frequency (%)
o                 3      16.7%
e                 2      11.1%
l                 2      11.1%
m                 2      11.1%
n                 2      11.1%
x                 1      5.6%
r                 1      5.6%
u                 1      5.6%
i                 1      5.6%
t                 1      5.6%
Other values (2)  2      11.1%
Uppercase Letter
Value  Count  Frequency (%)
C      2      66.7%
B      1      33.3%
Open Punctuation
Value  Count  Frequency (%)
(      371    100.0%
Close Punctuation
Value  Count  Frequency (%)
)      371    100.0%
Space Separator
Value    Count  Frequency (%)
(space)  2      100.0%
Decimal Number
Value  Count  Frequency (%)
1      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  19303  96.2%
Common  745    3.7%
Latin   21     0.1%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
                   3726   19.3%
                   3261   16.9%
                   3011   15.6%
                   976    5.1%
                   918    4.8%
                   418    2.2%
                   396    2.1%
                   343    1.8%
                   337    1.7%
                   295    1.5%
Other values (65)  5622   29.1%
Latin
Value             Count  Frequency (%)
o                 3      14.3%
e                 2      9.5%
l                 2      9.5%
C                 2      9.5%
m                 2      9.5%
n                 2      9.5%
x                 1      4.8%
r                 1      4.8%
u                 1      4.8%
i                 1      4.8%
Other values (4)  4      19.0%
Common
Value    Count  Frequency (%)
(        371    49.8%
)        371    49.8%
(space)  2      0.3%
1        1      0.1%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  19303  96.2%
ASCII   766    3.8%

Most frequent character per block

Hangul
Value              Count  Frequency (%)
                   3726   19.3%
                   3261   16.9%
                   3011   15.6%
                   976    5.1%
                   918    4.8%
                   418    2.2%
                   396    2.1%
                   343    1.8%
                   337    1.7%
                   295    1.5%
Other values (65)  5622   29.1%
ASCII
Value             Count  Frequency (%)
(                 371    48.4%
)                 371    48.4%
o                 3      0.4%
(space)           2      0.3%
e                 2      0.3%
l                 2      0.3%
C                 2      0.3%
m                 2      0.3%
n                 2      0.3%
x                 1      0.1%
Other values (8)  8      1.0%

Interactions


Correlations

      번호   학교
번호  1.000  0.350
학교  0.350  1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

      번호  학교
0     1     충북보건과학대학교
1     2     신성대학교
2     3     경복대학교
3     4     대구보건대학교
4     5     나사렛대학교
5     6     동남보건대학교
6     7     단국대학교(충남)
7     8     신성대학교
8     9     동남보건대학교
9     10    인제대학교

      번호  학교
3043  3044  광주보건대학교
3044  3045  동남보건대학교
3045  3046  혜전대학교
3046  3047  인제대학교
3047  3048  목포과학대학교
3048  3049  원광보건대학교
3049  3050  대구보건대학교
3050  3051  서영대학교
3051  3052  광양보건대학교
3052  3053  원광보건대학교