Overview

Dataset statistics

Number of variables: 10
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 888.7 KiB
Average record size in memory: 91.0 B

Variable types

Numeric: 3
Categorical: 4
Text: 3

Dataset

Description: Data released through the 2021 Public Data Enterprise Matching Support Project: a Chinese-Korean patent translation corpus of 150,000 entries, published as CSV files.
URL: https://www.data.go.kr/data/15096709/fileData.do

Alerts

국가코드 has constant value "중국" (Constant)
언어코드 has constant value "중국어" (Constant)
순번 is highly overall correlated with 출원번호 (High correlation)
출원번호 is highly overall correlated with 순번 (High correlation)
문헌종류 is highly imbalanced (99.1%) (Imbalance)
순번 has unique values (Unique)
원문 has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 10:38:00.197786
Analysis finished: 2023-12-12 10:38:05.832149
Duration: 5.63 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
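
The report records the software version but not the code that produced it. The following is a minimal sketch of how a comparable report could be regenerated with ydata-profiling. The function name `build_report`, the CSV path, the output file name, the encoding default, and the random 10,000-row sample are all assumptions made for illustration; the report does not say how its 10,000 observations were selected.

```python
def build_report(csv_path, out_html="report.html", n_rows=10_000,
                 encoding="utf-8", seed=0):
    """Profile a sample of the corpus CSV and write an HTML report.

    All parameter names and defaults are hypothetical.
    """
    # Imported lazily so this module loads even without the dependencies.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)

    # Assumption: profile at most n_rows rows, drawn at random.
    sample = df.sample(n=min(n_rows, len(df)), random_state=seed)

    report = ProfileReport(sample, title="Chinese-Korean patent corpus")
    report.to_file(out_html)
    return out_html
```

Calling `build_report("corpus.csv")` with a hypothetical local path would write `report.html`. Files downloaded from data.go.kr may need `encoding="cp949"` rather than UTF-8.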

Variables

순번 (sequence number)
Real number (ℝ)

HIGH CORRELATION, UNIQUE

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11541.703
Minimum: 2
Maximum: 23082
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2
5-th percentile: 1163.9
Q1: 5807.75
Median: 11464.5
Q3: 17293.25
95-th percentile: 21922.2
Maximum: 23082
Range: 23080
Interquartile range (IQR): 11485.5

Descriptive statistics

Standard deviation: 6639.6863
Coefficient of variation (CV): 0.57527785
Kurtosis: -1.1891379
Mean: 11541.703
Median Absolute Deviation (MAD): 5735
Skewness: 0.0066448743
Sum: 1.1541703 × 10^8
Variance: 44085434
Monotonicity: Not monotonic
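
Several of the figures reported for 순번 follow directly from the others. As a rough check, the sketch below recomputes the coefficient of variation, interquartile range, range, sum, and variance from the mean, standard deviation, quartiles, and extremes. The helper name `derived_stats` is hypothetical, and the inputs are the rounded values shown in the report, so the results agree only up to rounding.

```python
def derived_stats(mean, std, q1, q3, minimum, maximum, n):
    """Recompute summary figures that follow from other reported values."""
    return {
        "cv": std / mean,            # coefficient of variation
        "iqr": q3 - q1,              # interquartile range
        "range": maximum - minimum,  # max minus min
        "sum": mean * n,             # total over all observations
        "variance": std ** 2,        # square of the standard deviation
    }


# Inputs copied from the 순번 statistics in this report.
stats = derived_stats(
    mean=11541.703,
    std=6639.6863,
    q1=5807.75,
    q3=17293.25,
    minimum=2,
    maximum=23082,
    n=10_000,
)

for name, value in stats.items():
    print(f"{name}: {value:.6g}")
```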
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
16977 | 1 | < 0.1%
14403 | 1 | < 0.1%
3894 | 1 | < 0.1%
11944 | 1 | < 0.1%
9715 | 1 | < 0.1%
12139 | 1 | < 0.1%
10074 | 1 | < 0.1%
11438 | 1 | < 0.1%
18790 | 1 | < 0.1%
16169 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Value | Count | Frequency (%)
2 | 1 | < 0.1%
3 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
10 | 1 | < 0.1%
13 | 1 | < 0.1%
14 | 1 | < 0.1%
16 | 1 | < 0.1%
20 | 1 | < 0.1%

Value | Count | Frequency (%)
23082 | 1 | < 0.1%
23080 | 1 | < 0.1%
23079 | 1 | < 0.1%
23077 | 1 | < 0.1%
23075 | 1 | < 0.1%
23073 | 1 | < 0.1%
23072 | 1 | < 0.1%
23071 | 1 | < 0.1%
23070 | 1 | < 0.1%
23067 | 1 | < 0.1%

국가코드 (country code)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 중국
2nd row: 중국
3rd row: 중국
4th row: 중국
5th row: 중국

Common Values

Value | Count | Frequency (%)
중국 | 10000 | 100.0%

출원번호 (application number)
Real number (ℝ)

HIGH CORRELATION

Distinct: 8413
Distinct (%): 84.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0181645 × 10^11
Minimum: 2.0148005 × 10^11
Maximum: 2.0201006 × 10^11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2.0148005 × 10^11
5-th percentile: 2.0161046 × 10^11
Q1: 2.0171077 × 10^11
Median: 2.0191046 × 10^11
Q3: 2.0191094 × 10^11
95-th percentile: 2.0191111 × 10^11
Maximum: 2.0201006 × 10^11
Range: 5.3001058 × 10^8
Interquartile range (IQR): 2.001629 × 10^8

Descriptive statistics

Standard deviation: 1.1566321 × 10^8
Coefficient of variation (CV): 0.0005731109
Kurtosis: -0.9704188
Mean: 2.0181645 × 10^11
Median Absolute Deviation (MAD): 775435.5
Skewness: -0.74570822
Sum: 2.0181645 × 10^15
Variance: 1.3377977 × 10^16
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
201680023751 | 2 | < 0.1%
201910916518 | 2 | < 0.1%
201880044609 | 2 | < 0.1%
201710783880 | 2 | < 0.1%
201910955961 | 2 | < 0.1%
201880032182 | 2 | < 0.1%
201680023462 | 2 | < 0.1%
201910961635 | 2 | < 0.1%
201710852451 | 2 | < 0.1%
201910754877 | 2 | < 0.1%
Other values (8403) | 9980 | 99.8%

Value | Count | Frequency (%)
201480053131 | 1 | < 0.1%
201480080770 | 2 | < 0.1%
201480081246 | 1 | < 0.1%
201480081682 | 2 | < 0.1%
201480082099 | 1 | < 0.1%
201480082676 | 1 | < 0.1%
201480082947 | 2 | < 0.1%
201480083537 | 2 | < 0.1%
201480083922 | 1 | < 0.1%
201480084578 | 2 | < 0.1%

Value | Count | Frequency (%)
202010063714 | 1 | < 0.1%
202010057095 | 1 | < 0.1%
202010050947 | 1 | < 0.1%
201980003367 | 2 | < 0.1%
201980003339 | 1 | < 0.1%
201980002508 | 2 | < 0.1%
201980002497 | 1 | < 0.1%
201980002489 | 2 | < 0.1%
201980002424 | 2 | < 0.1%
201980002400 | 1 | < 0.1%

언어코드 (language code)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 중국어
2nd row: 중국어
3rd row: 중국어
4th row: 중국어
5th row: 중국어

Common Values

Value | Count | Frequency (%)
중국어 | 10000 | 100.0%

국제특허분류코드(IPC) (International Patent Classification code)
Text

Distinct: 305
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 40000
Distinct characters: 30
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
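
As an illustration of what a per-character property count looks like, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. The five input strings are the sample rows shown for this variable. The helper name `category_counts` is hypothetical, and this is not necessarily how the report computes its figures. The module returns two-letter codes: `Lu` corresponds to "Uppercase Letter" and `Nd` to "Decimal Number". It does not expose script or block names, so those tables are not covered here.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    return Counter(
        unicodedata.category(ch)
        for text in values
        for ch in text
    )


# Sample rows of the IPC column, copied from this report.
ipc_sample = ["G06Q", "H01M", "B21F", "B02C", "G01R"]
counts = category_counts(ipc_sample)

for code, count in counts.most_common():
    print(code, count)
```

On these five codes the letters and digits split evenly, which matches the 50.0% / 50.0% split the report shows for the full column.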

Unique

Unique: 33
Unique (%): 0.3%

Sample

1st row: G06Q
2nd row: H01M
3rd row: B21F
4th row: B02C
5th row: G01R

Value | Count | Frequency (%)
g06f | 793 | 7.9%
g06q | 448 | 4.5%
g01n | 359 | 3.6%
b01d | 327 | 3.3%
b01j | 323 | 3.2%
g06k | 316 | 3.2%
h04l | 299 | 3.0%
h01l | 277 | 2.8%
g02b | 204 | 2.0%
h04w | 195 | 1.9%
Other values (295) | 6459 | 64.6%

Most occurring characters

Value | Count | Frequency (%)
0 | 7542 | 18.9%
B | 4558 | 11.4%
G | 4177 | 10.4%
1 | 3359 | 8.4%
6 | 2813 | 7.0%
2 | 2608 | 6.5%
H | 2515 | 6.3%
F | 2203 | 5.5%
4 | 1421 | 3.6%
D | 1014 | 2.5%
Other values (20) | 7790 | 19.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 20000 | 50.0%
Uppercase Letter | 20000 | 50.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
B 4558
22.8%
G 4177
20.9%
H 2515
12.6%
F 2203
11.0%
D 1014
 
5.1%
K 782
 
3.9%
L 762
 
3.8%
J 657
 
3.3%
N 624
 
3.1%
C 582
 
2.9%
Other values (10) 2126
10.6%
Decimal Number
ValueCountFrequency (%)
0 7542
37.7%
1 3359
16.8%
6 2813
 
14.1%
2 2608
 
13.0%
4 1421
 
7.1%
5 931
 
4.7%
3 702
 
3.5%
8 238
 
1.2%
9 206
 
1.0%
7 180
 
0.9%

Most occurring scripts

ValueCountFrequency (%)
Common 20000
50.0%
Latin 20000
50.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
B 4558
22.8%
G 4177
20.9%
H 2515
12.6%
F 2203
11.0%
D 1014
 
5.1%
K 782
 
3.9%
L 762
 
3.8%
J 657
 
3.3%
N 624
 
3.1%
C 582
 
2.9%
Other values (10) 2126
10.6%
Common
ValueCountFrequency (%)
0 7542
37.7%
1 3359
16.8%
6 2813
 
14.1%
2 2608
 
13.0%
4 1421
 
7.1%
5 931
 
4.7%
3 702
 
3.5%
8 238
 
1.2%
9 206
 
1.0%
7 180
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 40000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 7542
18.9%
B 4558
11.4%
G 4177
10.4%
1 3359
8.4%
6 2813
 
7.0%
2 2608
 
6.5%
H 2515
 
6.3%
F 2203
 
5.5%
4 1421
 
3.6%
D 1014
 
2.5%
Other values (20) 7790
19.5%

문헌종류 (document type)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 공개특허공보(A)
2nd row: 공개특허공보(A)
3rd row: 공개특허공보(A)
4th row: 공개특허공보(A)
5th row: 공개특허공보(A)

Common Values

Value | Count | Frequency (%)
공개특허공보(A) | 9992 | 99.9%
등록특허공보(B) | 8 | 0.1%
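
The 99.1% imbalance figure reported for 문헌종류 can be reproduced from the two class counts. The sketch below assumes the score is one minus the normalized Shannon entropy of the class distribution. That formula is an assumption checked only against this report's numbers, and the helper name `imbalance_score` is hypothetical.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy in bits
    of the class distribution and k is the number of classes.
    0 means perfectly balanced; values near 1 mean one class dominates."""
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no balance to measure
    total = sum(counts)
    entropy = -sum(
        (c / total) * math.log2(c / total)
        for c in counts
        if c > 0
    )
    return 1.0 - entropy / math.log2(k)


# Class counts copied from this report.
doc_type_score = imbalance_score([9992, 8])     # 문헌종류
section_score = imbalance_score([5247, 4753])   # 구분

print(f"문헌종류: {doc_type_score:.1%}")
print(f"구분: {section_score:.1%}")
```

Under this assumption the 구분 column, with a 5247 / 4753 split, scores close to zero, which is consistent with it not being flagged as imbalanced.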

구분 (section type)
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 명세서
2nd row: 청구항
3rd row: 명세서
4th row: 명세서
5th row: 명세서

Common Values

Value | Count | Frequency (%)
명세서 | 5247 | 52.5%
청구항 | 4753 | 47.5%

원문 (source text)
Text

UNIQUE

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 80
Median length: 56
Mean length: 52.7278
Min length: 24

Characters and Unicode

Total characters: 527278
Distinct characters: 2443
Distinct categories: 17
Distinct scripts: 4
Distinct blocks: 11
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10000
Unique (%): 100.0%

Sample

1st row: 将苛性比值预测结果与溶出后实际化验值相对误差低于0.1%的数据构成模型修正数据集,用于在线更新苛性比值误差预测模型,提高对工况的自适应能力。
2nd row: 根据权利要求1所述的多芯金属空气电池,其特征在于:所述电极夹板焊接或粘接到电池壳体。
3rd row: 上述两种成型方式,人工参与度极大,很难保证空间型发卡状扁铜线制成后产品的一致性,并且存在效率低、工人劳动强度大的缺点。
4th row: 现有技术中,金属回收装置粉碎效率低,因此,发明一种x装置来解决上述问题很有必要。
5th row: 最后,将电池性能预测值与电池更换阈值进行比较,如电池性能预测值超过电池更换阈值,则该电池需要更换。

Value | Count | Frequency (%)
根据权利要求1 | 78 | 0.7%
根据权利要求1所述的 | 30 | 0.3%
如权利要求1 | 28 | 0.3%
根据权利要求2 | 16 | 0.2%
任一项所述的方法。 | 15 | 0.1%
根据权利要求1-4 | 10 | 0.1%
根据权利要求1-3 | 9 | 0.1%
如权利要求1所述的 | 9 | 0.1%
根据权利要求 | 8 | 0.1%
根据权利要求3 | 7 | 0.1%
Other values (10271) | 10333 | 98.0%

Most occurring characters

ValueCountFrequency (%)
19928
 
3.8%
18427
 
3.5%
14713
 
2.8%
14449
 
2.7%
10000
 
1.9%
7506
 
1.4%
7094
 
1.3%
1 6233
 
1.2%
6031
 
1.1%
5892
 
1.1%
Other values (2433) 417005
79.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 460588
87.4%
Other Punctuation 34560
 
6.6%
Decimal Number 15268
 
2.9%
Uppercase Letter 5714
 
1.1%
Close Punctuation 3275
 
0.6%
Open Punctuation 3215
 
0.6%
Lowercase Letter 2828
 
0.5%
Dash Punctuation 645
 
0.1%
Space Separator 543
 
0.1%
Math Symbol 376
 
0.1%
Other values (7) 266
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
19928
 
4.3%
14713
 
3.2%
14449
 
3.1%
7506
 
1.6%
7094
 
1.5%
6031
 
1.3%
5892
 
1.3%
5130
 
1.1%
4925
 
1.1%
4853
 
1.1%
Other values (2285) 370067
80.3%
Lowercase Letter
ValueCountFrequency (%)
m 464
16.4%
i 219
 
7.7%
a 217
 
7.7%
e 210
 
7.4%
n 205
 
7.2%
o 141
 
5.0%
t 123
 
4.3%
r 117
 
4.1%
c 114
 
4.0%
l 112
 
4.0%
Other values (31) 906
32.0%
Uppercase Letter
ValueCountFrequency (%)
C 539
 
9.4%
S 531
 
9.3%
P 461
 
8.1%
D 378
 
6.6%
A 345
 
6.0%
M 333
 
5.8%
T 307
 
5.4%
O 268
 
4.7%
L 267
 
4.7%
I 258
 
4.5%
Other values (20) 2027
35.5%
Other Punctuation
ValueCountFrequency (%)
18427
53.3%
10000
28.9%
2715
 
7.9%
2017
 
5.8%
424
 
1.2%
/ 288
 
0.8%
. 286
 
0.8%
160
 
0.5%
: 91
 
0.3%
, 86
 
0.2%
Other values (9) 66
 
0.2%
Math Symbol
ValueCountFrequency (%)
232
61.7%
+ 29
 
7.7%
25
 
6.6%
21
 
5.6%
13
 
3.5%
± 11
 
2.9%
11
 
2.9%
× 10
 
2.7%
< 10
 
2.7%
> 6
 
1.6%
Other values (5) 8
 
2.1%
Decimal Number
ValueCountFrequency (%)
1 6233
40.8%
2 2285
 
15.0%
0 1780
 
11.7%
3 1303
 
8.5%
5 985
 
6.5%
4 907
 
5.9%
6 611
 
4.0%
7 473
 
3.1%
8 395
 
2.6%
9 296
 
1.9%
Close Punctuation
ValueCountFrequency (%)
) 2367
72.3%
898
 
27.4%
] 8
 
0.2%
1
 
< 0.1%
1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 2314
72.0%
891
 
27.7%
[ 8
 
0.2%
1
 
< 0.1%
1
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 631
97.8%
8
 
1.2%
5
 
0.8%
1
 
0.2%
Other Symbol
ValueCountFrequency (%)
82
61.7%
° 49
36.8%
1
 
0.8%
1
 
0.8%
Letter Number
ValueCountFrequency (%)
9
56.2%
3
 
18.8%
3
 
18.8%
1
 
6.2%
Other Number
ValueCountFrequency (%)
3
50.0%
1
 
16.7%
1
 
16.7%
1
 
16.7%
Space Separator
ValueCountFrequency (%)
541
99.6%
  2
 
0.4%
Final Punctuation
ValueCountFrequency (%)
47
83.9%
9
 
16.1%
Initial Punctuation
ValueCountFrequency (%)
47
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 4
100.0%
Modifier Symbol
ValueCountFrequency (%)
4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Han 460588
87.4%
Common 58132
 
11.0%
Latin 8465
 
1.6%
Greek 93
 
< 0.1%

Most frequent character per script

Han
ValueCountFrequency (%)
19928
 
4.3%
14713
 
3.2%
14449
 
3.1%
7506
 
1.6%
7094
 
1.5%
6031
 
1.3%
5892
 
1.3%
5130
 
1.1%
4925
 
1.1%
4853
 
1.1%
Other values (2285) 370067
80.3%
Common
ValueCountFrequency (%)
18427
31.7%
10000
17.2%
1 6233
 
10.7%
2715
 
4.7%
) 2367
 
4.1%
( 2314
 
4.0%
2 2285
 
3.9%
2017
 
3.5%
0 1780
 
3.1%
3 1303
 
2.2%
Other values (63) 8691
15.0%
Latin
ValueCountFrequency (%)
C 539
 
6.4%
S 531
 
6.3%
m 464
 
5.5%
P 461
 
5.4%
D 378
 
4.5%
A 345
 
4.1%
M 333
 
3.9%
T 307
 
3.6%
O 268
 
3.2%
L 267
 
3.2%
Other values (46) 4572
54.0%
Greek
ValueCountFrequency (%)
μ 35
37.6%
α 12
 
12.9%
λ 6
 
6.5%
Δ 5
 
5.4%
β 5
 
5.4%
δ 5
 
5.4%
θ 4
 
4.3%
φ 3
 
3.2%
γ 3
 
3.2%
Ω 3
 
3.2%
Other values (9) 12
 
12.9%

Most occurring blocks

ValueCountFrequency (%)
CJK 460588
87.4%
None 35988
 
6.8%
ASCII 30439
 
5.8%
Punctuation 117
 
< 0.1%
Letterlike Symbols 82
 
< 0.1%
Math Operators 36
 
< 0.1%
Number Forms 16
 
< 0.1%
Enclosed Alphanum 6
 
< 0.1%
Hiragana 4
 
< 0.1%
CJK Compat 1
 
< 0.1%

Most frequent character per block

CJK
ValueCountFrequency (%)
19928
 
4.3%
14713
 
3.2%
14449
 
3.1%
7506
 
1.6%
7094
 
1.5%
6031
 
1.3%
5892
 
1.3%
5130
 
1.1%
4925
 
1.1%
4853
 
1.1%
Other values (2285) 370067
80.3%
None
ValueCountFrequency (%)
18427
51.2%
10000
27.8%
2715
 
7.5%
2017
 
5.6%
898
 
2.5%
891
 
2.5%
424
 
1.2%
232
 
0.6%
160
 
0.4%
° 49
 
0.1%
Other values (31) 175
 
0.5%
ASCII
ValueCountFrequency (%)
1 6233
20.5%
) 2367
 
7.8%
( 2314
 
7.6%
2 2285
 
7.5%
0 1780
 
5.8%
3 1303
 
4.3%
5 985
 
3.2%
4 907
 
3.0%
- 631
 
2.1%
6 611
 
2.0%
Other values (73) 11023
36.2%
Letterlike Symbols
ValueCountFrequency (%)
82
100.0%
Punctuation
ValueCountFrequency (%)
47
40.2%
47
40.2%
9
 
7.7%
8
 
6.8%
3
 
2.6%
2
 
1.7%
1
 
0.9%
Math Operators
ValueCountFrequency (%)
21
58.3%
11
30.6%
2
 
5.6%
1
 
2.8%
1
 
2.8%
Number Forms
ValueCountFrequency (%)
9
56.2%
3
 
18.8%
3
 
18.8%
1
 
6.2%
Hiragana
ValueCountFrequency (%)
4
100.0%
Enclosed Alphanum
ValueCountFrequency (%)
3
50.0%
1
 
16.7%
1
 
16.7%
1
 
16.7%
CJK Compat
ValueCountFrequency (%)
1
100.0%
Geometric Shapes
ValueCountFrequency (%)
1
100.0%

번역문(최종) 검수 후 (final translation, after review)
Text

Distinct: 9996
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 187
Median length: 129
Mean length: 84.3309
Min length: 31

Characters and Unicode

Total characters: 843309
Distinct characters: 1159
Distinct categories: 17
Distinct scripts: 6
Distinct blocks: 13
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9992
Unique (%): 99.9%

Sample

1st row: 가성비 예측 결과와 용출 후 실제 테스트 값 간의 상대오차가 0.1% 미만인 데이터로 모델 수정 데이터 세트를 구성하고, 가성비 오차 예측 모델을 온라인으로 업데이트하여 작업 조건에 대한 적응 능력을 개선한다.
2nd row: 제1항에 있어서, 상기 전극 클립판은 전지케이스에 용접되거나 접착되는 것을 특징으로 하는 멀티 코어 금속 공기 전지.
3rd row: 상기 두 가지 성형 방법은 수작업의 참여도가 높으므로 공간형 헤어핀 모양의 플랫 구리 와이어가 제작된 후 제품의 일관성 확보가 어려우며, 작업 효율이 낮고 작업자의 노동강도가 큰 단점이 있다.
4th row: 종래 기술에서는 금속 회수 장치가 분쇄 효율이 낮으므로, x 장치를 발명하여 상기 문제를 해결할 필요가 매우 있다.
5th row: 마지막으로, 배터리 성능 예측 값과 배터리 교체 임계값을 비교하여, 배터리 성능 예측 값이 배터리 교체 임계값을 초과하는 경우, 해당 배터리의 교체가 필요하다.
ValueCountFrequency (%)
상기 10140
 
4.8%
있어서 4541
 
2.1%
하는 4037
 
1.9%
것을 4031
 
1.9%
특징으로 3802
 
1.8%
관한 3292
 
1.5%
2770
 
1.3%
2707
 
1.3%
제1항에 2596
 
1.2%
것이다 2050
 
1.0%
Other values (36502) 172854
81.2%

Most occurring characters

ValueCountFrequency (%)
202822
24.1%
18755
 
2.2%
18578
 
2.2%
, 16484
 
2.0%
16090
 
1.9%
15012
 
1.8%
14805
 
1.8%
13462
 
1.6%
11661
 
1.4%
11637
 
1.4%
Other values (1149) 504003
59.8%

Most occurring categories

ValueCountFrequency (%)
Other Letter 578268
68.6%
Space Separator 202822
 
24.1%
Other Punctuation 27689
 
3.3%
Decimal Number 17865
 
2.1%
Uppercase Letter 5819
 
0.7%
Close Punctuation 3513
 
0.4%
Open Punctuation 3455
 
0.4%
Lowercase Letter 2804
 
0.3%
Dash Punctuation 436
 
0.1%
Math Symbol 435
 
0.1%
Other values (7) 203
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
18755
 
3.2%
18578
 
3.2%
16090
 
2.8%
15012
 
2.6%
14805
 
2.6%
13462
 
2.3%
11661
 
2.0%
11637
 
2.0%
11261
 
1.9%
9289
 
1.6%
Other values (1011) 437718
75.7%
Lowercase Letter
ValueCountFrequency (%)
m 466
16.6%
a 227
 
8.1%
i 212
 
7.6%
e 203
 
7.2%
n 196
 
7.0%
o 137
 
4.9%
t 123
 
4.4%
c 117
 
4.2%
r 115
 
4.1%
l 110
 
3.9%
Other values (31) 898
32.0%
Uppercase Letter
ValueCountFrequency (%)
C 610
 
10.5%
S 538
 
9.2%
P 465
 
8.0%
D 382
 
6.6%
A 338
 
5.8%
M 335
 
5.8%
T 310
 
5.3%
L 269
 
4.6%
O 266
 
4.6%
I 265
 
4.6%
Other values (21) 2041
35.1%
Math Symbol
ValueCountFrequency (%)
~ 291
66.9%
+ 30
 
6.9%
= 24
 
5.5%
19
 
4.4%
< 14
 
3.2%
± 11
 
2.5%
11
 
2.5%
× 10
 
2.3%
> 7
 
1.6%
7
 
1.6%
Other values (7) 11
 
2.5%
Other Punctuation
ValueCountFrequency (%)
, 16484
59.5%
. 10296
37.2%
/ 299
 
1.1%
: 191
 
0.7%
% 176
 
0.6%
; 106
 
0.4%
" 64
 
0.2%
' 41
 
0.1%
· 13
 
< 0.1%
* 9
 
< 0.1%
Other values (5) 10
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 7407
41.5%
2 3246
18.2%
0 1787
 
10.0%
3 1579
 
8.8%
4 1026
 
5.7%
5 1014
 
5.7%
6 630
 
3.5%
7 471
 
2.6%
8 400
 
2.2%
9 305
 
1.7%
Other Symbol
ValueCountFrequency (%)
° 120
85.1%
16
 
11.3%
2
 
1.4%
1
 
0.7%
1
 
0.7%
1
 
0.7%
Letter Number
ValueCountFrequency (%)
9
56.2%
4
25.0%
2
 
12.5%
1
 
6.2%
Other Number
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%
Close Punctuation
ValueCountFrequency (%)
) 3505
99.8%
] 8
 
0.2%
Open Punctuation
ValueCountFrequency (%)
( 3447
99.8%
[ 8
 
0.2%
Final Punctuation
ValueCountFrequency (%)
7
87.5%
1
 
12.5%
Space Separator
ValueCountFrequency (%)
202822
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 436
100.0%
Format
ValueCountFrequency (%)
25
100.0%
Initial Punctuation
ValueCountFrequency (%)
6
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 578264
68.6%
Common 256402
30.4%
Latin 8547
 
1.0%
Greek 92
 
< 0.1%
Han 3
 
< 0.1%
Bopomofo 1
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
18755
 
3.2%
18578
 
3.2%
16090
 
2.8%
15012
 
2.6%
14805
 
2.6%
13462
 
2.3%
11661
 
2.0%
11637
 
2.0%
11261
 
1.9%
9289
 
1.6%
Other values (1008) 437714
75.7%
Common
ValueCountFrequency (%)
202822
79.1%
, 16484
 
6.4%
. 10296
 
4.0%
1 7407
 
2.9%
) 3505
 
1.4%
( 3447
 
1.3%
2 3246
 
1.3%
0 1787
 
0.7%
3 1579
 
0.6%
4 1026
 
0.4%
Other values (52) 4803
 
1.9%
Latin
ValueCountFrequency (%)
C 610
 
7.1%
S 538
 
6.3%
m 466
 
5.5%
P 465
 
5.4%
D 382
 
4.5%
A 338
 
4.0%
M 335
 
3.9%
T 310
 
3.6%
L 269
 
3.1%
O 266
 
3.1%
Other values (46) 4568
53.4%
Greek
ValueCountFrequency (%)
μ 33
35.9%
α 11
 
12.0%
λ 6
 
6.5%
Δ 5
 
5.4%
β 5
 
5.4%
δ 5
 
5.4%
θ 4
 
4.3%
Ω 3
 
3.3%
φ 3
 
3.3%
γ 3
 
3.3%
Other values (10) 14
15.2%
Han
ValueCountFrequency (%)
2
66.7%
1
33.3%
Bopomofo
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 578263
68.6%
ASCII 264666
31.4%
None 262
 
< 0.1%
Punctuation 39
 
< 0.1%
Math Operators 34
 
< 0.1%
Letterlike Symbols 16
 
< 0.1%
Number Forms 16
 
< 0.1%
CJK Compat 4
 
< 0.1%
CJK 3
 
< 0.1%
Enclosed Alphanum 3
 
< 0.1%
Other values (3) 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
202822
76.6%
, 16484
 
6.2%
. 10296
 
3.9%
1 7407
 
2.8%
) 3505
 
1.3%
( 3447
 
1.3%
2 3246
 
1.2%
0 1787
 
0.7%
3 1579
 
0.6%
4 1026
 
0.4%
Other values (76) 13067
 
4.9%
Hangul
ValueCountFrequency (%)
18755
 
3.2%
18578
 
3.2%
16090
 
2.8%
15012
 
2.6%
14805
 
2.6%
13462
 
2.3%
11661
 
2.0%
11637
 
2.0%
11261
 
1.9%
9289
 
1.6%
Other values (1007) 437713
75.7%
None
ValueCountFrequency (%)
° 120
45.8%
μ 33
 
12.6%
· 13
 
5.0%
± 11
 
4.2%
α 11
 
4.2%
× 10
 
3.8%
7
 
2.7%
λ 6
 
2.3%
Δ 5
 
1.9%
β 5
 
1.9%
Other values (21) 41
 
15.6%
Punctuation
ValueCountFrequency (%)
25
64.1%
7
 
17.9%
6
 
15.4%
1
 
2.6%
Math Operators
ValueCountFrequency (%)
19
55.9%
11
32.4%
2
 
5.9%
1
 
2.9%
1
 
2.9%
Letterlike Symbols
ValueCountFrequency (%)
16
100.0%
Number Forms
ValueCountFrequency (%)
9
56.2%
4
25.0%
2
 
12.5%
1
 
6.2%
CJK
ValueCountFrequency (%)
2
66.7%
1
33.3%
CJK Compat
ValueCountFrequency (%)
2
50.0%
1
25.0%
1
25.0%
Enclosed Alphanum
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%
Compat Jamo
ValueCountFrequency (%)
1
100.0%
Geometric Shapes
ValueCountFrequency (%)
1
100.0%
Bopomofo
ValueCountFrequency (%)
1
100.0%

글자수 (character count)
Real number (ℝ)

Distinct: 57
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 53.0478
Minimum: 24
Maximum: 80
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 24
5-th percentile: 29
Q1: 40
Median: 53
Q3: 66
95-th percentile: 77
Maximum: 80
Range: 56
Interquartile range (IQR): 26

Descriptive statistics

Standard deviation: 15.314698
Coefficient of variation (CV): 0.28869619
Kurtosis: -1.1934108
Mean: 53.0478
Median Absolute Deviation (MAD): 13
Skewness: -0.025075692
Sum: 530478
Variance: 234.53997
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
62 | 222 | 2.2%
65 | 216 | 2.2%
39 | 214 | 2.1%
61 | 210 | 2.1%
63 | 209 | 2.1%
68 | 209 | 2.1%
36 | 207 | 2.1%
60 | 206 | 2.1%
33 | 205 | 2.1%
66 | 202 | 2.0%
Other values (47) | 7900 | 79.0%

Value | Count | Frequency (%)
24 | 6 | 0.1%
25 | 5 | 0.1%
26 | 119 | 1.2%
27 | 137 | 1.4%
28 | 165 | 1.7%
29 | 162 | 1.6%
30 | 148 | 1.5%
31 | 164 | 1.6%
32 | 180 | 1.8%
33 | 205 | 2.1%

Value | Count | Frequency (%)
80 | 56 | 0.6%
79 | 135 | 1.4%
78 | 162 | 1.6%
77 | 179 | 1.8%
76 | 187 | 1.9%
75 | 192 | 1.9%
74 | 191 | 1.9%
73 | 196 | 2.0%
72 | 182 | 1.8%
71 | 169 | 1.7%

Interactions

[Nine interaction plots omitted.]

Correlations

Variable | 순번 | 출원번호 | 문헌종류 | 구분 | 글자수
순번 | 1.000 | 0.764 | 0.022 | 0.072 | 0.199
출원번호 | 0.764 | 1.000 | 0.053 | 0.135 | 0.304
문헌종류 | 0.022 | 0.053 | 1.000 | 0.000 | 0.026
구분 | 0.072 | 0.135 | 0.000 | 1.000 | 0.584
글자수 | 0.199 | 0.304 | 0.026 | 0.584 | 1.000

Variable | 문헌종류 | 구분
문헌종류 | 1.000 | 0.000
구분 | 0.000 | 1.000

Variable | 순번 | 출원번호 | 글자수 | 문헌종류 | 구분
순번 | 1.000 | 0.556 | 0.128 | 0.017 | 0.055
출원번호 | 0.556 | 1.000 | 0.233 | 0.041 | 0.104
글자수 | 0.128 | 0.233 | 1.000 | 0.020 | 0.451
문헌종류 | 0.017 | 0.041 | 0.020 | 1.000 | 0.000
구분 | 0.055 | 0.104 | 0.451 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 순번 | 국가코드 | 출원번호 | 언어코드 | 국제특허분류코드(IPC) | 문헌종류 | 구분 | 원문 | 번역문(최종) 검수 후 | 글자수
16976 | 16977 | 중국 | 201911254396 | 중국어 | G06Q | 공개특허공보(A) | 명세서 | 将苛性比值预测结果与溶出后实际化验值相对误差低于0.1%的数据构成模型修正数据集,用于在线更新苛性比值误差预测模型,提高对工况的自适应能力。 | 가성비 예측 결과와 용출 후 실제 테스트 값 간의 상대오차가 0.1% 미만인 데이터로 모델 수정 데이터 세트를 구성하고, 가성비 오차 예측 모델을 온라인으로 업데이트하여 작업 조건에 대한 적응 능력을 개선한다. | 70
8538 | 8539 | 중국 | 201810684591 | 중국어 | H01M | 공개특허공보(A) | 청구항 | 根据权利要求1所述的多芯金属空气电池,其特征在于:所述电极夹板焊接或粘接到电池壳体。 | 제1항에 있어서, 상기 전극 클립판은 전지케이스에 용접되거나 접착되는 것을 특징으로 하는 멀티 코어 금속 공기 전지. | 43
11971 | 11972 | 중국 | 201910801645 | 중국어 | B21F | 공개특허공보(A) | 명세서 | 上述两种成型方式,人工参与度极大,很难保证空间型发卡状扁铜线制成后产品的一致性,并且存在效率低、工人劳动强度大的缺点。 | 상기 두 가지 성형 방법은 수작업의 참여도가 높으므로 공간형 헤어핀 모양의 플랫 구리 와이어가 제작된 후 제품의 일관성 확보가 어려우며, 작업 효율이 낮고 작업자의 노동강도가 큰 단점이 있다. | 59
22519 | 22520 | 중국 | 201911110320 | 중국어 | B02C | 공개특허공보(A) | 명세서 | 现有技术中,金属回收装置粉碎效率低,因此,发明一种x装置来解决上述问题很有必要。 | 종래 기술에서는 금속 회수 장치가 분쇄 효율이 낮으므로, x 장치를 발명하여 상기 문제를 해결할 필요가 매우 있다. | 40
7101 | 7102 | 중국 | 201810939301 | 중국어 | G01R | 공개특허공보(A) | 명세서 | 最后,将电池性能预测值与电池更换阈值进行比较,如电池性能预测值超过电池更换阈值,则该电池需要更换。 | 마지막으로, 배터리 성능 예측 값과 배터리 교체 임계값을 비교하여, 배터리 성능 예측 값이 배터리 교체 임계값을 초과하는 경우, 해당 배터리의 교체가 필요하다. | 49
13579 | 13580 | 중국 | 201910904124 | 중국어 | H04M | 공개특허공보(A) | 청구항 | 根据权利要求1-4中任一项所述基于STA/LTA+DTW的智能手机地震异常事件检测方法,其特征在于:所述threshold2为1。 | 제1항 내지 제4항 중 어느 한 항에 있어서, 상기 threshold2은 1인 것을 특징으로 하는 STA/LTA+DTW 기반의 스마트폰 지진 이상 이벤트 검출 방법. | 66
17747 | 17748 | 중국 | 201880044707 | 중국어 | G01B | 공개특허공보(A) | 청구항 | 根据权利要求1所述的度量系统,其中所述一或多个非重复层包括:重复层的两个区域之间的顶层、底层或中间层中的至少一者。 | 제1항에 있어서, 상기 하나 또는 다수의 비반복층은, 반복층의 두 영역 사이에 위치하는 상층, 저층 또는 중간층 중 적어도 하나를 포함하는 메트릭 시스템. | 58
2850 | 2851 | 중국 | 201910882617 | 중국어 | B05D | 공개특허공보(A) | 명세서 | 进一步地,所述预处理绝缘层的厚度范围为20~150μm,所述滚涂涂层的厚度范围为100~400μm。 | 또한, 상기 전처리 절연층의 두께 범위는 20~150μm이고, 상기 롤 코팅층의 두께 범위는 100~400μm이다. | 50
819 | 820 | 중국 | 201680025043 | 중국어 | H04N | 공개특허공보(A) | 명세서 | 本公开涉及一种相机,并且更具体地,涉及自动地控制相机的操作模式。 | 본 개시는 카메라에 관한 것으로, 보다 구체적으로는 카메라의 동작 모드를 자동으로 제어하는 방법에 관한 것이다. | 32
20980 | 20981 | 중국 | 201910841480 | 중국어 | H04N | 공개특허공보(A) | 명세서 | 发明的目的在于,针对上述问题,提出一种用于机顶盒的多种开机logo实现方法。 | 발명의 목적은 상기한 문제에 대해 셋톱박스를 위한 다양한 부팅 logo 구현 방법을 제안함에 있다. | 38

Index | 순번 | 국가코드 | 출원번호 | 언어코드 | 국제특허분류코드(IPC) | 문헌종류 | 구분 | 원문 | 번역문(최종) 검수 후 | 글자수
14383 | 14384 | 중국 | 201910941651 | 중국어 | F23K | 공개특허공보(A) | 명세서 | 优选的,所述炉体的外顶部左右两侧均固定连接有排烟管,所述炉体的内底部固定安装有过滤炉条,所述底座与炉体对应的位置设置有用于清理燃料废渣的清灰抽屉。 | 바람직하게는, 상기 노체의 외상부 좌우 양측에는 모두 연기 배출관이 고정 연결되고, 상기 노체의 내저부에는 필터노 스트립이 고정 설치되며, 상기 베이스에는 노체와 대응되는 위치에 연료 폐기물을 청소하기 위한 재 청소 서랍이 설치된다. | 73
16898 | 16899 | 중국 | 201910933057 | 중국어 | F04B | 공개특허공보(A) | 청구항 | 根据权利要求1所述的一种拨动式开合装卡结构总成,其特征在于:所述的上压块上设有用于对上管卡及拨杆限位的凹槽,所述的上管卡的限位于凹槽中。 | 제1항에 있어서, 상기 상부 압력 블록에는 상기 상부 튜브 카드와 상기 시프트 레버의 위치 제한을 위한 홈이 설치되고, 상기 상부 튜브 카드는 상기 홈에서 위치가 제한되는 것을 특징으로 하는 토글식 개폐 카드 장착 구조 어셈블리. | 69
22796 | 22797 | 중국 | 201911146975 | 중국어 | G01C | 공개특허공보(A) | 청구항 | 根据权利要求2所述的自动驾驶中高精度地图获取方法,其特征在于:当车辆行驶至一个路径段的一半时,若下一个路径段的前段数据没有接收完成,则进行预警。 | 제2항에 있어서, 차량이 한 경로 구간의 절반까지 주행할 때, 다음 경로 구간의 선행 구간 데이터가 수신이 완료되지 않으면, 경고하는 것을 특징으로 하는 자율 주행 중 완전한 지도 획득 방법. | 73
13197 | 13198 | 중국 | 201911257892 | 중국어 | G01W | 공개특허공보(A) | 청구항 | 根据权利要求1所述的带旋转功能的微气象观测塔,其特征在于:所述下法兰(32)的底部设有用于防虫的挡板(323)。 | 제1항에 있어서, 상기 하부 플랜지(32)의 바닥부에 방충 배플이 설치되어 있는 것을 특징으로 하는 회전 기능이 구비된 미기상 관측탑. | 57
20457 | 20458 | 중국 | 201880034415 | 중국어 | B60R | 공개특허공보(A) | 명세서 | 此外,DE 10 2006 646 A1揭示一种安全带,其包括整合在织带中的加热元件。 | 그 외, DE 10 2006 646 A1은 벨트 스트랩에 통합되는 가열요소를 포함하는 안전벨트를 개시한다. | 43
4769 | 4770 | 중국 | 201680009487 | 중국어 | G01N | 공개특허공보(A) | 청구항 | 根据权利要求1所述的系统,其中:所述光源包括激光器。 | 제1항에 있어서, 상기 광원은 레이저를 포함하는, 시스템. | 26
15481 | 15482 | 중국 | 201911045836 | 중국어 | F24F | 공개특허공보(A) | 청구항 | 根据权利要求3所述的一种带冷凝水热回收的热泵新风机系统,其特征在于,所述喷水管(1232)的喷嘴朝向第一换热器(114)。 | 제3항에 있어서, 상기 분수관(1232)의 노즐은 제1 열교환기(114)를 향하도록 구성되는 것을 특징으로 하는 응축수 열회수를 갖는 히트 펌프 신형 송풍기 시스템. | 62
18807 | 18808 | 중국 | 201910903679 | 중국어 | B26F | 공개특허공보(A) | 청구항 | 根据权利要求3所述的电木板自动送料冲压设备,其特征在于:所述托板内还安装有若干根用于加热待加工电木板的加热管,所述加热管垂直于电木板移动方向间隔排列布置。 | 제3항에 있어서, 상기 팔레트 안에는 가공될 베이클라이트 판을 가열하기 위한 다수 개의 가열관이 더 설치되고, 상기 가열관은 베이클라이트 판의 이동 방향에 수직으로 간격을 두고 배열되어 배치되는 것을 특징으로 하는 베이클라이트 판 자동 공급 프레스 장치. | 78
21322 | 21323 | 중국 | 201910934801 | 중국어 | H04R | 공개특허공보(A) | 청구항 | 根据权利要求2所述的消声电路,其特征在于,所述驱动信号输出引脚通过第一限流电阻与所述驱动信号输入引脚电连接。 | 제2항에 있어서, 상기 구동 신호 출력 핀은 제1 전류 제한 저항을 통해 상기 구동 신호 입력 핀과 전기적으로 연결되는 것을 특징으로 하는 소음 회로. | 55
525 | 526 | 중국 | 201610472312 | 중국어 | H01L | 공개특허공보(A) | 명세서 | 本发明涉及一种制作半导体元件的方法,尤其是涉及一种形成间隙壁后去除部分栅极图案的方法。 | 본 발명은 반도체 소자의 제조 방법에 관한 것으로, 특히 스페이서를 형성한 후 게이트 패턴의 일부를 제거하는 방법에 관한 것이다. | 43