Overview

Dataset statistics

Number of variables: 4
Number of observations: 3316
Missing cells: 8
Missing cells (%): 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 107.0 KiB
Average record size in memory: 33.0 B
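
The derived figures above can be checked from the raw counts. The following is a minimal sketch of that arithmetic. It assumes, since the report does not say so, that the missing-cell percentage is taken over all cells (variables × observations) and that the average record size is the total memory divided by the number of observations.

```python
# Raw figures copied from the dataset statistics above.
n_variables = 4
n_observations = 3316
missing_cells = 8
total_size_kib = 107.0

# Assumption: the missing percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: average record size = total bytes / number of rows.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the rounded results agree with the reported 0.1% and 33.0 B.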

Variable types

Numeric: 1
Text: 3

Dataset

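Description: Heavy-water reactor abbreviation glossary from the Nuclear Technology Information System of Korea Hydro & Nuclear Power (KHNP). It provides heavy-water-reactor-related abbreviations and their expansions.
URL: https://www.data.go.kr/data/15117245/fileData.do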

Alerts

문서번호 (document number) has unique values: Unique

Reproduction

Analysis started: 2023-12-12 01:59:47.315937
Analysis finished: 2023-12-12 01:59:48.170142
Duration: 0.85 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

문서번호 (document number)
Real number (ℝ)

UNIQUE 

Distinct: 3316
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1658.5
Minimum: 1
Maximum: 3316
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 29.3 KiB

Quantile statistics

Minimum: 1
5th percentile: 166.75
Q1: 829.75
Median: 1658.5
Q3: 2487.25
95th percentile: 3150.25
Maximum: 3316
Range: 3315
Interquartile range (IQR): 1657.5

Descriptive statistics

Standard deviation: 957.39107
Coefficient of variation (CV): 0.57726323
Kurtosis: -1.2
Mean: 1658.5
Median Absolute Deviation (MAD): 829
Skewness: 0
Sum: 5499586
Variance: 916597.67
Monotonicity: Strictly increasing
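
The quantile and descriptive statistics above can be reproduced with a short standard-library sketch. It assumes the column holds exactly the integers 1 to 3316, which is consistent with the reported minimum, maximum, distinct count and strict monotonicity but is not stated outright. The `percentile` helper is a hypothetical name, and linear interpolation and the n − 1 variance divisor are assumptions chosen because they match the reported figures.

```python
import math
import statistics

# Assumption: the column is exactly 1, 2, ..., 3316.
values = list(range(1, 3317))
n = len(values)

def percentile(sorted_values, q):
    """Percentile by linear interpolation between the two nearest ranks."""
    position = q * (len(sorted_values) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

mean = statistics.mean(values)
median = statistics.median(values)
variance = statistics.variance(values)  # sample variance, divisor n - 1
std = math.sqrt(variance)
mad = statistics.median(abs(v - median) for v in values)

# Moment-based skewness and excess kurtosis (population moments).
m2 = sum((v - mean) ** 2 for v in values) / n
m3 = sum((v - mean) ** 3 for v in values) / n
m4 = sum((v - mean) ** 4 for v in values) / n
skewness = m3 / m2 ** 1.5
excess_kurtosis = m4 / m2 ** 2 - 3

print(f"Sum: {sum(values)}")
print(f"Mean: {mean}  Median: {median}  MAD: {mad}")
print(f"Variance: {variance:.2f}  Std: {std:.5f}  CV: {std / mean:.8f}")
print(f"5th: {percentile(values, 0.05):.2f}  Q1: {percentile(values, 0.25):.2f}")
print(f"Q3: {percentile(values, 0.75):.2f}  95th: {percentile(values, 0.95):.2f}")
print(f"Skewness: {skewness:.1f}  Kurtosis: {excess_kurtosis:.1f}")
```

A kurtosis near -1.2 and a skewness of 0 are what an evenly spaced sequence would give, which supports reading 문서번호 as a plain running row number.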
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | < 0.1%
2217 | 1 | < 0.1%
2207 | 1 | < 0.1%
2208 | 1 | < 0.1%
2209 | 1 | < 0.1%
2210 | 1 | < 0.1%
2211 | 1 | < 0.1%
2212 | 1 | < 0.1%
2213 | 1 | < 0.1%
2214 | 1 | < 0.1%
Other values (3306) | 3306 | 99.7%

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%
10 | 1 | < 0.1%

Value | Count | Frequency (%)
3316 | 1 | < 0.1%
3315 | 1 | < 0.1%
3314 | 1 | < 0.1%
3313 | 1 | < 0.1%
3312 | 1 | < 0.1%
3311 | 1 | < 0.1%
3310 | 1 | < 0.1%
3309 | 1 | < 0.1%
3308 | 1 | < 0.1%
3307 | 1 | < 0.1%

영문용어 (English term)
Text

Distinct: 3280
Distinct (%): 99.0%
Missing: 3
Missing (%): 0.1%
Memory size: 26.0 KiB

Length

Max length: 136
Median length: 70
Mean length: 26.971929
Min length: 2

Characters and Unicode

Total characters: 89358
Distinct characters: 99
Distinct categories: 11
Distinct scripts: 4
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
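
As an illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The `CATEGORY_NAMES` mapping and the `category_counts` helper are hypothetical names introduced here, the mapping covers only a subset of categories, and nothing here is claimed to be how the report itself computes its tables.

```python
import unicodedata
from collections import Counter

# Long names for some general-category codes (subset; others keep their code).
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Lo": "Other Letter",
    "Zs": "Space Separator",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Nd": "Decimal Number",
    "Sm": "Math Symbol",
}

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for text in values:
        for ch in text:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

# Two values from the report's sample rows.
sample = ["Atomics International", "Analog Input"]
counts = category_counts(sample)
for name, count in counts.most_common():
    print(name, count)
```

The standard library exposes general categories but not scripts or blocks, so those two breakdowns would need a third-party package or a lookup table.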

Unique

Unique: 3247
Unique (%): 98.0%

Sample

1st row: Atomics International
2nd row: Analog Input
3rd row: Authorized Inspection Agency
4th row: American Institute of Architects
5th row: Accelerator Information Center
Value | Count | Frequency (%)
reactor | 310 | 2.7%
system | 303 | 2.6%
of | 204 | 1.8%
nuclear | 183 | 1.6%
and | 142 | 1.2%
test | 134 | 1.2%
power | 113 | 1.0%
energy | 110 | 1.0%
water | 108 | 0.9%
control | 104 | 0.9%
Other values (2660) | 9747 | 85.1%
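
A word-frequency table of this kind can be approximated with a few lines of Python. This is a sketch only: it assumes tokens are lowercased and split on whitespace, which may not match the tokenizer the report used, and `word_frequencies` is a hypothetical helper name. The input is the ten 영문용어 values from the first block of the report's dataset sample, not the full column.

```python
from collections import Counter

# First ten 영문용어 values, copied from the report's sample rows.
terms = [
    "Atomics International",
    "Analog Input",
    "Authorized Inspection Agency",
    "American Institute of Architects",
    "Accelerator Information Center",
    "American Institute of Chemical Engineers",
    "Atomic Industrial Forum, Inc,",
    "Auxiliary Inerting Gas Subsystem",
    "Auxiliary Intermediate Heat Exchanger",
    "American Institute of Physics",
]

def word_frequencies(values):
    """Count lowercase, whitespace-separated tokens across all values."""
    counts = Counter()
    for value in values:
        # Assumption: no punctuation stripping, so "forum," stays as one token.
        counts.update(value.lower().split())
    return counts

freq = word_frequencies(terms)
total = sum(freq.values())
for word, count in freq.most_common(3):
    print(f"{word} | {count} | {100 * count / total:.1f}%")
```

On the full column the same approach would be expected to surface domain words such as "reactor" and "system", as in the table above.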

Most occurring characters

Value | Count | Frequency (%)
e | 9232 | 10.3%
(space) | 8180 | 9.2%
t | 6578 | 7.4%
a | 6171 | 6.9%
r | 5962 | 6.7%
i | 5868 | 6.6%
n | 5845 | 6.5%
o | 5632 | 6.3%
s | 3504 | 3.9%
l | 3355 | 3.8%
Other values (89) | 29031 | 32.5%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 70090 | 78.4%
Uppercase Letter | 10359 | 11.6%
Space Separator | 8180 | 9.2%
Other Punctuation | 295 | 0.3%
Dash Punctuation | 237 | 0.3%
Open Punctuation | 63 | 0.1%
Close Punctuation | 63 | 0.1%
Decimal Number | 44 | < 0.1%
Other Letter | 24 | < 0.1%
Math Symbol | 2 | < 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 9232
13.2%
t 6578
9.4%
a 6171
8.8%
r 5962
8.5%
i 5868
 
8.4%
n 5845
 
8.3%
o 5632
 
8.0%
s 3504
 
5.0%
l 3355
 
4.8%
c 3255
 
4.6%
Other values (17) 14688
21.0%
Uppercase Letter
ValueCountFrequency (%)
S 1130
 
10.9%
C 1118
 
10.8%
R 849
 
8.2%
P 802
 
7.7%
A 798
 
7.7%
E 660
 
6.4%
T 551
 
5.3%
I 532
 
5.1%
F 485
 
4.7%
D 473
 
4.6%
Other values (16) 2961
28.6%
Other Letter
ValueCountFrequency (%)
2
 
8.3%
2
 
8.3%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
Other values (12) 12
50.0%
Decimal Number
ValueCountFrequency (%)
0 9
20.5%
1 7
15.9%
5 6
13.6%
2 6
13.6%
3 4
9.1%
9 4
9.1%
6 3
 
6.8%
7 2
 
4.5%
4 2
 
4.5%
8 1
 
2.3%
Other Punctuation
ValueCountFrequency (%)
. 108
36.6%
, 90
30.5%
' 46
15.6%
/ 24
 
8.1%
& 21
 
7.1%
% 3
 
1.0%
# 2
 
0.7%
; 1
 
0.3%
Space Separator
ValueCountFrequency (%)
8180
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 237
100.0%
Open Punctuation
ValueCountFrequency (%)
( 63
100.0%
Close Punctuation
ValueCountFrequency (%)
) 63
100.0%
Math Symbol
ValueCountFrequency (%)
= 2
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 80448 | 90.0%
Common | 8885 | 9.9%
Hangul | 24 | < 0.1%
Greek | 1 | < 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 9232
 
11.5%
t 6578
 
8.2%
a 6171
 
7.7%
r 5962
 
7.4%
i 5868
 
7.3%
n 5845
 
7.3%
o 5632
 
7.0%
s 3504
 
4.4%
l 3355
 
4.2%
c 3255
 
4.0%
Other values (42) 25046
31.1%
Common
ValueCountFrequency (%)
8180
92.1%
- 237
 
2.7%
. 108
 
1.2%
, 90
 
1.0%
( 63
 
0.7%
) 63
 
0.7%
' 46
 
0.5%
/ 24
 
0.3%
& 21
 
0.2%
0 9
 
0.1%
Other values (14) 44
 
0.5%
Hangul
ValueCountFrequency (%)
2
 
8.3%
2
 
8.3%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
Other values (12) 12
50.0%
Greek
ValueCountFrequency (%)
β 1
100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 89333 | > 99.9%
Hangul | 24 | < 0.1%
None | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 9232
 
10.3%
8180
 
9.2%
t 6578
 
7.4%
a 6171
 
6.9%
r 5962
 
6.7%
i 5868
 
6.6%
n 5845
 
6.5%
o 5632
 
6.3%
s 3504
 
3.9%
l 3355
 
3.8%
Other values (66) 29006
32.5%
Hangul
ValueCountFrequency (%)
2
 
8.3%
2
 
8.3%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
1
 
4.2%
Other values (12) 12
50.0%
None
ValueCountFrequency (%)
β 1
100.0%

한글용어 (Korean term)
Text

Distinct: 3143
Distinct (%): 94.8%
Missing: 2
Missing (%): 0.1%
Memory size: 26.0 KiB

Length

Max length: 63
Median length: 41
Mean length: 10.713941
Min length: 1

Characters and Unicode

Total characters: 35506
Distinct characters: 618
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3049
Unique (%): 92.0%

Sample

1st row: 회사명(미국)
2nd row: 아날로그 입력신호
3rd row: 공인 검사 기관
4th row: 미국 건축 기구
5th row: 가속기 정보 센터(미국)
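Value | Count | Frequency (%)
계통 (system) | 259 | 2.6%
원자력 (nuclear power) | 203 | 2.1%
원자로 (reactor) | 151 | 1.5%
시험 (test) | 112 | 1.1%
핵연료 (nuclear fuel) | 94 | 1.0%
단위 (unit) | 70 | 0.7%
(value not shown) | 67 | 0.7%
미국 (United States) | 65 | 0.7%
안전 (safety) | 65 | 0.7%
시설 (facility) | 62 | 0.6%
Other values (3121) | 8693 | 88.3%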

Most occurring characters

ValueCountFrequency (%)
6535
 
18.4%
( 855
 
2.4%
) 855
 
2.4%
790
 
2.2%
686
 
1.9%
598
 
1.7%
561
 
1.6%
535
 
1.5%
491
 
1.4%
458
 
1.3%
Other values (608) 23142
65.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 26008 | 73.2%
Space Separator | 6535 | 18.4%
Open Punctuation | 870 | 2.5%
Close Punctuation | 870 | 2.5%
Uppercase Letter | 528 | 1.5%
Lowercase Letter | 276 | 0.8%
Other Punctuation | 221 | 0.6%
Decimal Number | 166 | 0.5%
Dash Punctuation | 21 | 0.1%
Math Symbol | 11 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
790
 
3.0%
686
 
2.6%
598
 
2.3%
561
 
2.2%
535
 
2.1%
491
 
1.9%
458
 
1.8%
433
 
1.7%
403
 
1.5%
403
 
1.5%
Other values (533) 20650
79.4%
Uppercase Letter
ValueCountFrequency (%)
A 59
11.2%
E 56
10.6%
N 47
 
8.9%
I 44
 
8.3%
C 42
 
8.0%
L 33
 
6.2%
S 33
 
6.2%
R 32
 
6.1%
O 32
 
6.1%
D 24
 
4.5%
Other values (16) 126
23.9%
Lowercase Letter
ValueCountFrequency (%)
e 33
12.0%
a 33
12.0%
o 28
10.1%
n 27
 
9.8%
l 19
 
6.9%
r 18
 
6.5%
t 14
 
5.1%
s 13
 
4.7%
h 11
 
4.0%
u 11
 
4.0%
Other values (15) 69
25.0%
Decimal Number
ValueCountFrequency (%)
1 65
39.2%
0 31
18.7%
2 25
 
15.1%
3 14
 
8.4%
9 9
 
5.4%
5 6
 
3.6%
6 6
 
3.6%
8 5
 
3.0%
4 3
 
1.8%
7 2
 
1.2%
Other Punctuation
ValueCountFrequency (%)
, 151
68.3%
· 40
 
18.1%
/ 21
 
9.5%
. 6
 
2.7%
% 2
 
0.9%
: 1
 
0.5%
Open Punctuation
ValueCountFrequency (%)
( 855
98.3%
[ 15
 
1.7%
Close Punctuation
ValueCountFrequency (%)
) 855
98.3%
] 15
 
1.7%
Math Symbol
ValueCountFrequency (%)
= 9
81.8%
+ 2
 
18.2%
Space Separator
ValueCountFrequency (%)
6535
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 21
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 26008 | 73.2%
Common | 8694 | 24.5%
Latin | 804 | 2.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
790
 
3.0%
686
 
2.6%
598
 
2.3%
561
 
2.2%
535
 
2.1%
491
 
1.9%
458
 
1.8%
433
 
1.7%
403
 
1.5%
403
 
1.5%
Other values (533) 20650
79.4%
Latin
ValueCountFrequency (%)
A 59
 
7.3%
E 56
 
7.0%
N 47
 
5.8%
I 44
 
5.5%
C 42
 
5.2%
e 33
 
4.1%
L 33
 
4.1%
S 33
 
4.1%
a 33
 
4.1%
R 32
 
4.0%
Other values (41) 392
48.8%
Common
ValueCountFrequency (%)
6535
75.2%
( 855
 
9.8%
) 855
 
9.8%
, 151
 
1.7%
1 65
 
0.7%
· 40
 
0.5%
0 31
 
0.4%
2 25
 
0.3%
/ 21
 
0.2%
- 21
 
0.2%
Other values (14) 95
 
1.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 26008 | 73.2%
ASCII | 9458 | 26.6%
None | 40 | 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6535
69.1%
( 855
 
9.0%
) 855
 
9.0%
, 151
 
1.6%
1 65
 
0.7%
A 59
 
0.6%
E 56
 
0.6%
N 47
 
0.5%
I 44
 
0.5%
C 42
 
0.4%
Other values (64) 749
 
7.9%
Hangul
ValueCountFrequency (%)
790
 
3.0%
686
 
2.6%
598
 
2.3%
561
 
2.2%
535
 
2.1%
491
 
1.9%
458
 
1.8%
433
 
1.7%
403
 
1.5%
403
 
1.5%
Other values (533) 20650
79.4%
None
ValueCountFrequency (%)
· 40
100.0%

약어 (abbreviation)
Text

Distinct: 2778
Distinct (%): 83.9%
Missing: 3
Missing (%): 0.1%
Memory size: 26.0 KiB

Length

Max length: 11
Median length: 10
Mean length: 3.3902807
Min length: 1

Characters and Unicode

Total characters: 11232
Distinct characters: 76
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2424
Unique (%): 73.2%

Sample

1st row: AI
2nd row: AI
3rd row: AIA
4th row: AIA
5th row: AIC
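
The five sample rows illustrate why the Distinct and Unique figures differ. The sketch below assumes that Distinct counts the different values present and that Unique counts the values occurring exactly once. The report does not spell these definitions out, but they fit its numbers (Unique is never larger than Distinct).

```python
from collections import Counter

# The five sample 약어 values shown above.
sample = ["AI", "AI", "AIA", "AIA", "AIC"]

counts = Counter(sample)
distinct = len(counts)                              # different values present
unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

print(f"Distinct: {distinct} ({100 * distinct / len(sample):.1f}%)")
print(f"Unique: {unique} ({100 * unique / len(sample):.1f}%)")
```

Here "AI" and "AIA" each occur twice, so only "AIC" counts as unique. The same effect explains why the same abbreviation can map to several terms in this glossary.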
Value | Count | Frequency (%)
(value not shown) | 24 | 0.7%
d | 14 | 0.4%
t | 13 | 0.4%
k | 12 | 0.4%
m | 12 | 0.4%
f | 11 | 0.3%
c | 11 | 0.3%
n | 10 | 0.3%
l | 10 | 0.3%
r | 10 | 0.3%
Other values (2700) | 3260 | 96.3%

Most occurring characters

ValueCountFrequency (%)
S 1060
 
9.4%
C 1058
 
9.4%
R 865
 
7.7%
A 797
 
7.1%
P 772
 
6.9%
E 694
 
6.2%
T 584
 
5.2%
I 542
 
4.8%
F 489
 
4.4%
D 480
 
4.3%
Other values (66) 3891
34.6%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 10513 | 93.6%
Lowercase Letter | 408 | 3.6%
Other Punctuation | 133 | 1.2%
Space Separator | 74 | 0.7%
Dash Punctuation | 32 | 0.3%
Decimal Number | 32 | 0.3%
Open Punctuation | 18 | 0.2%
Close Punctuation | 17 | 0.2%
Other Symbol | 2 | < 0.1%
Math Symbol | 2 | < 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 1060
 
10.1%
C 1058
 
10.1%
R 865
 
8.2%
A 797
 
7.6%
P 772
 
7.3%
E 694
 
6.6%
T 584
 
5.6%
I 542
 
5.2%
F 489
 
4.7%
D 480
 
4.6%
Other values (18) 3172
30.2%
Lowercase Letter
ValueCountFrequency (%)
e 40
 
9.8%
m 33
 
8.1%
a 27
 
6.6%
r 25
 
6.1%
t 25
 
6.1%
d 23
 
5.6%
k 22
 
5.4%
g 20
 
4.9%
l 20
 
4.9%
p 20
 
4.9%
Other values (17) 153
37.5%
Other Punctuation
ValueCountFrequency (%)
/ 62
46.6%
& 32
24.1%
, 20
 
15.0%
. 14
 
10.5%
· 3
 
2.3%
# 2
 
1.5%
Decimal Number
ValueCountFrequency (%)
2 9
28.1%
5 7
21.9%
1 6
18.8%
0 5
15.6%
3 4
12.5%
8 1
 
3.1%
Other Symbol
ValueCountFrequency (%)
1
50.0%
1
50.0%
Math Symbol
ValueCountFrequency (%)
1
50.0%
+ 1
50.0%
Space Separator
ValueCountFrequency (%)
74
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 32
100.0%
Open Punctuation
ValueCountFrequency (%)
( 18
100.0%
Close Punctuation
ValueCountFrequency (%)
) 17
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 10916 | 97.2%
Common | 311 | 2.8%
Greek | 5 | < 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 1060
 
9.7%
C 1058
 
9.7%
R 865
 
7.9%
A 797
 
7.3%
P 772
 
7.1%
E 694
 
6.4%
T 584
 
5.3%
I 542
 
5.0%
F 489
 
4.5%
D 480
 
4.4%
Other values (41) 3575
32.8%
Common
ValueCountFrequency (%)
74
23.8%
/ 62
19.9%
& 32
10.3%
- 32
10.3%
, 20
 
6.4%
( 18
 
5.8%
) 17
 
5.5%
. 14
 
4.5%
2 9
 
2.9%
5 7
 
2.3%
Other values (11) 26
 
8.4%
Greek
ValueCountFrequency (%)
σ 2
40.0%
Δ 1
20.0%
ρ 1
20.0%
Ω 1
20.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 11221 | 99.9%
None | 8 | 0.1%
Letterlike Symbols | 2 | < 0.1%
Math Operators | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 1060
 
9.4%
C 1058
 
9.4%
R 865
 
7.7%
A 797
 
7.1%
P 772
 
6.9%
E 694
 
6.2%
T 584
 
5.2%
I 542
 
4.8%
F 489
 
4.4%
D 480
 
4.3%
Other values (58) 3880
34.6%
None
ValueCountFrequency (%)
· 3
37.5%
σ 2
25.0%
Δ 1
 
12.5%
ρ 1
 
12.5%
Ω 1
 
12.5%
Letterlike Symbols
ValueCountFrequency (%)
1
50.0%
1
50.0%
Math Operators
ValueCountFrequency (%)
1
100.0%

Interactions

[Figure: interactions plot]

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
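
To make the idea of nullity correlation concrete, the sketch below turns each column into a 0/1 missing indicator and computes the Pearson correlation between indicators. The data is hypothetical and not taken from the dataset, `pearson` is a helper written here for illustration, and whether the report uses exactly this computation is an assumption.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when a column is never (or always) missing
    return cov / math.sqrt(var_x * var_y)

# Hypothetical columns, not taken from the dataset; None marks a missing cell.
columns = {
    "영문용어": ["term A", None, "term C", None, "term E", "term F"],
    "한글용어": ["용어 A", None, "용어 C", None, "용어 E", "용어 F"],
    "약어": ["A", None, "C", "D", None, "F"],
}

# 1 where the cell is missing, 0 where it is present.
nullity = {name: [int(v is None) for v in col] for name, col in columns.items()}

names = list(nullity)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        print(f"{a} vs {b}: {pearson(nullity[a], nullity[b]):+.2f}")
```

A value near +1 means two columns tend to be missing in the same rows, while a value near 0 means their gaps are unrelated. A column with no missing cells, such as 문서번호 here, has no defined nullity correlation.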

Sample

(index) | 문서번호 | 영문용어 | 한글용어 | 약어
0 | 1 | Atomics International | 회사명(미국) | AI
1 | 2 | Analog Input | 아날로그 입력신호 | AI
2 | 3 | Authorized Inspection Agency | 공인 검사 기관 | AIA
3 | 4 | American Institute of Architects | 미국 건축 기구 | AIA
4 | 5 | Accelerator Information Center | 가속기 정보 센터(미국) | AIC
5 | 6 | American Institute of Chemical Engineers | 미국 화학 공학 협회 | AICE, AICHE
6 | 7 | Atomic Industrial Forum, Inc, | 미국 원자력 산업 회의 | AIF
7 | 8 | Auxiliary Inerting Gas Subsystem | 보조 불활성 가스부계통 | AIGS
8 | 9 | Auxiliary Intermediate Heat Exchanger | 보조 중간열 교환기 | AIHX
9 | 10 | American Institute of Physics | 미국 물리 학회 | AIP

(index) | 문서번호 | 영문용어 | 한글용어 | 약어
3306 | 3307 | Union Carbide Corporation | 회사명(미국) | UCC
3307 | 3308 | University of California, Lawrence Radiation Laboratory | 캘리포니아 공학 로렌스 방사선 연구소(미국) | UCLR
3308 | 3309 | University of California Radiation Laboratory | 캘리포니아 대학 방사선 연구소(미국) | UCRL
3309 | 3310 | Union of Concerned Scientists | 반핵 과학자 단체명 | UCS
3310 | 3311 | Uranium Enrichment Associates | 우라늄 농축 협회(미국) | UEA
3311 | 3312 | University of Florida Teaching Reactor | 플로리다 대학 교육용 원자로 | UFTR
3312 | 3313 | Ultra High Frequency | 초 고주파 | UHF
3313 | 3314 | Upper High Injection | 상부덮개 주입 | UHI
3314 | 3315 | Earth Leakage Circuit Breaker | 누전차단기 | ELB
3315 | 3316 | No Fuse Breaker | 배선용 차단기, MCCB | NFB