Overview

Dataset statistics

Number of variables: 3
Number of observations: 883
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 21.7 KiB
Average record size in memory: 25.1 B

Variable types

Numeric: 1
Text: 2

Dataset

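Description: Information on files attached to posts published on the official website of the Korean Agency for Technology and Standards (국가기술표준원), an agency of the Ministry of Trade, Industry and Energy (산업통상자원부). The dataset provides the post number, the file name, and the original file name.
URL: https://www.data.go.kr/data/15040687/fileData.do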

Alerts

파일명 (file name) has unique values (alert type: Unique)

Reproduction

Analysis started: 2023-12-12 18:27:35.710491
Analysis finished: 2023-12-12 18:27:36.305141
Duration: 0.59 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
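
A report of this kind could be regenerated roughly as follows. This is a minimal sketch, not the exact procedure used here: the CSV file name, the UTF-8 encoding, the report title, and the output path are all assumptions, and the original `config.json` is not reproduced. The third-party imports sit inside the function so the snippet can be loaded even where `pandas` and `ydata-profiling` are not installed.

```python
from pathlib import Path


def build_report(csv_path: str, html_path: str = "report.html") -> Path:
    """Profile a CSV file and write an HTML report (sketch with assumed paths)."""
    # Imported lazily so that defining the function does not require
    # the third-party packages to be installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the file is UTF-8 encoded. If decoding fails, another
    # encoding such as "cp949" may be needed.
    df = pd.read_csv(csv_path, encoding="utf-8")

    # Default settings only; the configuration of the original run is unknown.
    profile = ProfileReport(df, title="Attachment metadata profile")
    profile.to_file(html_path)
    return Path(html_path)


# Example call (hypothetical file name):
# build_report("attachments.csv")
```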

Variables

게시글번호 (post number)
Real number (ℝ)

Distinct: 593
Distinct (%): 67.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23624.561
Minimum: 19142
Maximum: 23941
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.9 KiB

Quantile statistics

Minimum: 19142
5-th percentile: 23372.1
Q1: 23483.5
Median: 23643
Q3: 23787.5
95-th percentile: 23911
Maximum: 23941
Range: 4799
Interquartile range (IQR): 304
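
The range and IQR follow directly from the quantiles listed above. The short check below recomputes them and, as an additional illustration, applies Tukey's 1.5 × IQR rule to the three lowest post numbers from the extreme-values listing. The fence rule is an assumption of this sketch; the report itself does not apply it.

```python
# Quantile values copied from the report.
minimum, q1, q3, maximum = 19142, 23483.5, 23787.5, 23941

value_range = maximum - minimum  # reported as 4799
iqr = q3 - q1                    # reported as 304

# Tukey fences (illustrative rule, not part of the report).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# The three lowest post numbers shown in the report.
lowest_values = [19142, 21269, 23340]
flagged = [v for v in lowest_values if v < lower_fence]

print(value_range, iqr, lower_fence, upper_fence, flagged)
```

Under this rule the two isolated low values (19142 and 21269) fall below the lower fence, which is consistent with the strongly negative skewness reported for this variable.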

Descriptive statistics

Standard deviation: 297.5825
Coefficient of variation (CV): 0.012596319
Kurtosis: 123.7114
Mean: 23624.561
Median Absolute Deviation (MAD): 152
Skewness: -8.8385162
Sum: 20860487
Variance: 88555.342
Monotonicity: Not monotonic
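
Several of these figures are tied together by simple identities: mean = sum / n, variance = standard deviation squared, and CV = standard deviation / mean. The sketch below recomputes them from the rounded values printed in the report, so it compares with small tolerances rather than exact equality. The tolerances are chosen for illustration only.

```python
import math

# Values copied from the report (already rounded for display).
n = 883
total = 20860487
mean = 23624.561
std = 297.5825
variance = 88555.342
cv = 0.012596319

recomputed_mean = total / n
recomputed_variance = std ** 2
recomputed_cv = std / mean

# Tolerances allow for the rounding applied to the displayed figures.
mean_ok = math.isclose(recomputed_mean, mean, abs_tol=1e-3)
variance_ok = math.isclose(recomputed_variance, variance, abs_tol=1e-2)
cv_ok = math.isclose(recomputed_cv, cv, abs_tol=1e-8)

print(mean_ok, variance_ok, cv_ok)
```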
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
23378                   5  0.6%
23784                   5  0.6%
23911                   5  0.6%
23798                   5  0.6%
23703                   5  0.6%
23663                   5  0.6%
23376                   5  0.6%
23377                   4  0.5%
23421                   4  0.5%
23445                   4  0.5%
Other values (583)    836  94.7%

Minimum 10 values

Value   Count  Frequency (%)
19142       2  0.2%
21269       2  0.2%
23340       1  0.1%
23341       2  0.2%
23342       1  0.1%
23343       1  0.1%
23344       1  0.1%
23345       2  0.2%
23346       1  0.1%
23347       4  0.5%

Maximum 10 values

Value   Count  Frequency (%)
23941       1  0.1%
23940       1  0.1%
23939       4  0.5%
23938       1  0.1%
23937       2  0.2%
23936       1  0.1%
23935       1  0.1%
23934       1  0.1%
23933       1  0.1%
23932       1  0.1%

파일명 (file name)
Text

UNIQUE 

Distinct: 883
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 7.0 KiB

Length

Max length: 29
Median length: 28
Mean length: 28.02718
Min length: 27

Characters and Unicode

Total characters: 24748
Distinct characters: 37
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
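
As a rough illustration of how category counts of this kind can be derived, the sketch below tallies Unicode general categories for the first sample file name using Python's standard `unicodedata` module. The mapping from two-letter codes to the long names used in this report is written out by hand for the five categories that occur in this variable. How ydata-profiling computes its own counts internally is not shown here.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in 파일명.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Pc": "Connector Punctuation",
    "Po": "Other Punctuation",
}


def category_counts(text: str) -> Counter:
    """Count characters of `text` by Unicode general category (long name)."""
    codes = (unicodedata.category(ch) for ch in text)
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)


counts = category_counts("23343_202210050806404390.pdf")
print(counts)
```

For this one file name the tally is 23 decimal digits, 3 lowercase letters, one underscore (Connector Punctuation) and one full stop (Other Punctuation), which mirrors the dominance of Decimal Number in the column-wide figures.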

Unique

Unique: 883
Unique (%): 100.0%

Sample

1st row: 23343_202210050806404390.pdf
2nd row: 23365_202210171516584330.pdf
3rd row: 23397_202210310711363180.jpg
4th row: 23397_202210310711363430.jpg
5th row: 23397_202210310711364551.jpg
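
The sample values suggest a stored-name pattern of the form `<post number>_<digits>.<extension>`. The sketch below splits a name along that pattern. Reading the first 14 digits after the underscore as a `YYYYMMDDhhmmss` upload timestamp is an assumption based only on how the samples look, and the meaning of the remaining trailing digits is unknown, so they are kept as an opaque string. `StoredName` and `parse_stored_name` are hypothetical names introduced for this illustration.

```python
import re
from datetime import datetime
from typing import NamedTuple


class StoredName(NamedTuple):
    post_number: int
    uploaded_at: datetime  # assumed meaning of the first 14 digits
    suffix: str            # trailing digits, meaning unknown
    extension: str


# <post number>_<14 digits><more digits>.<extension>
PATTERN = re.compile(r"^(\d+)_(\d{14})(\d+)\.(\w+)$")


def parse_stored_name(name: str) -> StoredName:
    """Split a stored file name into its apparent components."""
    match = PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected file name format: {name!r}")
    post, stamp, suffix, ext = match.groups()
    return StoredName(
        post_number=int(post),
        uploaded_at=datetime.strptime(stamp, "%Y%m%d%H%M%S"),
        suffix=suffix,
        extension=ext,
    )


parsed = parse_stored_name("23343_202210050806404390.pdf")
print(parsed)
```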

Value                          Count  Frequency (%)
23343_202210050806404390.pdf       1  0.1%
23458_202212011649448050.pdf       1  0.1%
23435_202211171751278570.pdf       1  0.1%
23468_202212061606027290.hwp       1  0.1%
23468_202212061606028261.pdf       1  0.1%
23468_202212061606029252.pdf       1  0.1%
23471_202212091738002430.pdf       1  0.1%
23471_202212091738003331.zip       1  0.1%
23474_202212121722181770.hwp       1  0.1%
23451_202211241718159630.pdf       1  0.1%
Other values (873)               873  98.9%

Most occurring characters

Value              Count  Frequency (%)
2                   4428  17.9%
0                   3670  14.8%
3                   2840  11.5%
1                   2782  11.2%
4                   1302  5.3%
5                   1223  4.9%
7                   1193  4.8%
6                    984  4.0%
8                    971  3.9%
9                    917  3.7%
Other values (27)   4438  17.9%

Most occurring categories

Value                  Count  Frequency (%)
Decimal Number         20310  82.1%
Lowercase Letter        2571  10.4%
Connector Punctuation    883  3.6%
Other Punctuation        883  3.6%
Uppercase Letter         101  0.4%

Most frequent character per category

Lowercase Letter

Value             Count  Frequency (%)
p                   841  32.7%
h                   474  18.4%
w                   473  18.4%
d                   265  10.3%
f                   263  10.2%
z                    75  2.9%
i                    74  2.9%
g                    31  1.2%
j                    31  1.2%
x                    27  1.1%
Other values (6)     17  0.7%

Decimal Number

Value  Count  Frequency (%)
2       4428  21.8%
0       3670  18.1%
3       2840  14.0%
1       2782  13.7%
4       1302  6.4%
5       1223  6.0%
7       1193  5.9%
6        984  4.8%
8        971  4.8%
9        917  4.5%

Uppercase Letter

Value  Count  Frequency (%)
P         31  30.7%
G         30  29.7%
N         25  24.8%
J          5  5.0%
X          4  4.0%
L          2  2.0%
S          2  2.0%
D          1  1.0%
F          1  1.0%

Connector Punctuation

Value  Count  Frequency (%)
_        883  100.0%

Other Punctuation

Value  Count  Frequency (%)
.        883  100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  22076  89.2%
Latin    2672  10.8%

Most frequent character per script

Latin

Value              Count  Frequency (%)
p                    841  31.5%
h                    474  17.7%
w                    473  17.7%
d                    265  9.9%
f                    263  9.8%
z                     75  2.8%
i                     74  2.8%
g                     31  1.2%
P                     31  1.2%
j                     31  1.2%
Other values (15)    114  4.3%

Common

Value             Count  Frequency (%)
2                  4428  20.1%
0                  3670  16.6%
3                  2840  12.9%
1                  2782  12.6%
4                  1302  5.9%
5                  1223  5.5%
7                  1193  5.4%
6                   984  4.5%
8                   971  4.4%
9                   917  4.2%
Other values (2)   1766  8.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  24748  100.0%

Most frequent character per block

ASCII

Value              Count  Frequency (%)
2                   4428  17.9%
0                   3670  14.8%
3                   2840  11.5%
1                   2782  11.2%
4                   1302  5.3%
5                   1223  4.9%
7                   1193  4.8%
6                    984  4.0%
8                    971  3.9%
9                    917  3.7%
Other values (27)   4438  17.9%

오리지널파일명 (original file name)
Text

Distinct: 814
Distinct (%): 92.2%
Missing: 0
Missing (%): 0.0%
Memory size: 7.0 KiB

Length

Max length: 93
Median length: 66
Mean length: 40.475651
Min length: 6

Characters and Unicode

Total characters: 35740
Distinct characters: 538
Distinct categories: 15
Distinct scripts: 3
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 764
Unique (%): 86.5%
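
"Distinct" and "Unique" measure different things here: distinct counts how many different values occur at all, while unique counts the values that occur exactly once. With 814 distinct and 764 unique values, 50 original file names appear more than once, and those repeats account for the remaining 883 − 764 = 119 rows. The sketch below shows the distinction on a small made-up list. The list contents and the helper name `distinct_and_unique` are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple


def distinct_and_unique(values: Iterable[str]) -> Tuple[int, int]:
    """Return (number of distinct values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Hypothetical original file names; "a.pdf" is attached twice.
names = ["a.pdf", "b.hwp", "test.JPG", "a.pdf", "c.jpg"]
result = distinct_and_unique(names)
print(result)
```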

Sample

1st row: 상임전문위원 서류전형 합격자 및 면접시험 공고.pdf
2nd row: 상임전문위원 채용시험 최종합격자 공고.pdf
3rd row: 16656528475150.jpg
4th row: 16656528475811.jpg
5th row: 16656528478745.jpg
ValueCountFrequency (%)
공고 157
 
3.6%
국가기술표준원 135
 
3.1%
kolas 85
 
1.9%
84
 
1.9%
2023년 60
 
1.4%
공고.hwp 47
 
1.1%
공인시험기관 46
 
1.0%
인정공고(국가기술표준원 39
 
0.9%
안전기준 38
 
0.9%
2023년도 33
 
0.7%
Other values (1999) 3693
83.6%

Most occurring characters

ValueCountFrequency (%)
3536
 
9.9%
2 1880
 
5.3%
0 1453
 
4.1%
. 1160
 
3.2%
) 878
 
2.5%
( 877
 
2.5%
p 849
 
2.4%
807
 
2.3%
1 782
 
2.2%
3 762
 
2.1%
Other values (528) 22756
63.7%

Most occurring categories

Value                  Count  Frequency (%)
Other Letter           17718  49.6%
Decimal Number          6074  17.0%
Space Separator         3536  9.9%
Lowercase Letter        2761  7.7%
Other Punctuation       1466  4.1%
Uppercase Letter        1456  4.1%
Close Punctuation        915  2.6%
Open Punctuation         914  2.6%
Connector Punctuation    472  1.3%
Dash Punctuation         401  1.1%
Other values (5)          27  0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
807
 
4.6%
711
 
4.0%
693
 
3.9%
651
 
3.7%
572
 
3.2%
541
 
3.1%
519
 
2.9%
483
 
2.7%
469
 
2.6%
448
 
2.5%
Other values (441) 11824
66.7%
Uppercase Letter

Value              Count  Frequency (%)
K                    204  14.0%
A                    168  11.5%
S                    164  11.3%
L                    149  10.2%
O                    147  10.1%
P                     96  6.6%
N                     80  5.5%
G                     77  5.3%
R                     66  4.5%
T                     63  4.3%
Other values (14)    242  16.6%

Lowercase Letter

Value              Count  Frequency (%)
p                    849  30.7%
h                    479  17.3%
w                    473  17.1%
d                    269  9.7%
f                    265  9.6%
i                     95  3.4%
z                     75  2.7%
g                     39  1.4%
j                     31  1.1%
x                     27  1.0%
Other values (13)    159  5.8%

Decimal Number

Value  Count  Frequency (%)
2       1880  31.0%
0       1453  23.9%
1        782  12.9%
3        762  12.5%
5        245  4.0%
6        241  4.0%
4        203  3.3%
7        196  3.2%
8        169  2.8%
9        143  2.4%
Other Punctuation
ValueCountFrequency (%)
. 1160
79.1%
, 253
 
17.3%
· 25
 
1.7%
' 10
 
0.7%
! 9
 
0.6%
? 3
 
0.2%
; 2
 
0.1%
& 2
 
0.1%
1
 
0.1%
% 1
 
0.1%
Math Symbol
ValueCountFrequency (%)
+ 3
37.5%
~ 2
25.0%
= 2
25.0%
1
 
12.5%
Close Punctuation
ValueCountFrequency (%)
) 878
96.0%
] 25
 
2.7%
12
 
1.3%
Open Punctuation
ValueCountFrequency (%)
( 877
96.0%
[ 25
 
2.7%
12
 
1.3%
Final Punctuation
ValueCountFrequency (%)
4
66.7%
2
33.3%
Initial Punctuation
ValueCountFrequency (%)
4
66.7%
2
33.3%
Other Symbol
ValueCountFrequency (%)
3
60.0%
2
40.0%
Space Separator

Value    Count  Frequency (%)
(space)   3536  100.0%

Connector Punctuation

Value  Count  Frequency (%)
_        472  100.0%

Dash Punctuation

Value  Count  Frequency (%)
-        401  100.0%

Modifier Symbol

Value  Count  Frequency (%)
`          2  100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  17720  49.6%
Common  13803  38.6%
Latin    4217  11.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
807
 
4.6%
711
 
4.0%
693
 
3.9%
651
 
3.7%
572
 
3.2%
541
 
3.1%
519
 
2.9%
483
 
2.7%
469
 
2.6%
448
 
2.5%
Other values (442) 11826
66.7%
Latin

Value              Count  Frequency (%)
p                    849  20.1%
h                    479  11.4%
w                    473  11.2%
d                    269  6.4%
f                    265  6.3%
K                    204  4.8%
A                    168  4.0%
S                    164  3.9%
L                    149  3.5%
O                    147  3.5%
Other values (37)   1050  24.9%

Common

Value              Count  Frequency (%)
(space)             3536  25.6%
2                   1880  13.6%
0                   1453  10.5%
.                   1160  8.4%
)                    878  6.4%
(                    877  6.4%
1                    782  5.7%
3                    762  5.5%
_                    472  3.4%
-                    401  2.9%
Other values (29)   1602  11.6%

Most occurring blocks

Value           Count  Frequency (%)
ASCII           17954  50.2%
Hangul          17717  49.6%
None               51  0.1%
Punctuation        13  < 0.1%
Misc Symbols        3  < 0.1%
Compat Jamo         1  < 0.1%
Math Operators      1  < 0.1%

Most frequent character per block

ASCII

Value              Count  Frequency (%)
(space)             3536  19.7%
2                   1880  10.5%
0                   1453  8.1%
.                   1160  6.5%
)                    878  4.9%
(                    877  4.9%
p                    849  4.7%
1                    782  4.4%
3                    762  4.2%
h                    479  2.7%
Other values (66)   5298  29.5%
Hangul
ValueCountFrequency (%)
807
 
4.6%
711
 
4.0%
693
 
3.9%
651
 
3.7%
572
 
3.2%
541
 
3.1%
519
 
2.9%
483
 
2.7%
469
 
2.6%
448
 
2.5%
Other values (440) 11823
66.7%
None
ValueCountFrequency (%)
· 25
49.0%
12
23.5%
12
23.5%
2
 
3.9%
Punctuation
ValueCountFrequency (%)
4
30.8%
4
30.8%
2
15.4%
2
15.4%
1
 
7.7%
Misc Symbols
ValueCountFrequency (%)
3
100.0%
Compat Jamo
ValueCountFrequency (%)
1
100.0%
Math Operators
ValueCountFrequency (%)
1
100.0%

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

# | 게시글번호 | 파일명 | 오리지널파일명
0 | 23343 | 23343_202210050806404390.pdf | 상임전문위원 서류전형 합격자 및 면접시험 공고.pdf
1 | 23365 | 23365_202210171516584330.pdf | 상임전문위원 채용시험 최종합격자 공고.pdf
2 | 23397 | 23397_202210310711363180.jpg | 16656528475150.jpg
3 | 23397 | 23397_202210310711363430.jpg | 16656528475811.jpg
4 | 23397 | 23397_202210310711364551.jpg | 16656528478745.jpg
5 | 23398 | 23398_202210310715302150.jpg | 16665945972121.jpg
6 | 23398 | 23398_202210310715303190.jpg | 16665945964850.jpg
7 | 23398 | 23398_202210310715304091.jpg | 16665945973763.jpg
8 | 23398 | 23398_202210310715305192.jpg | 16665945975325.jpg
9 | 23399 | 23399_202210310720220070.jpg | 16667683966195.jpg

Last rows

# | 게시글번호 | 파일명 | 오리지널파일명
873 | 23756 | 23756_202305170936246711.hwp | 0515(16석간)기계융합산업표준과, 한, 친환경 선박분야 ISO 국제표준 주도.hwp
874 | 23761 | 23761_202305220942129690.hwp | 0519(22조간)기계융합산업표준과, 우리 자율주행 기술, 국제표준으로 세계시장 진출.hwp
875 | 23416 | 23416_202211081627187940.hwp | 1103(04석간) 메타버스 서비스표준화 포럼.hwp
876 | 23434 | 23434_202211171410528180.hwp | 1117(18조간)바이오화학서비스표준과, 건물에너지 효율 위한 건물일체형 태양광(BIPV) KS 개정.hwp
877 | 23585 | 23585_202302122044413410.hwp | 0210(11조간)바이오화학서비스표준과, 우리나라 나노센서 성능평가 기술, 국제표준으로 제정.hwp
878 | 23609 | 23609_202302231245037280.hwp | 0223(조간)바이오화학서비스표준과, 우리나라 주도로 넷제로 에너지 국제표준 최초 개발_제출_최종.hwp
879 | 23687 | 23687_202304070924536500.hwp | 0406(7조간)바이오화학서비스표준과, 한국인 고령인구(70세_84세) 20년 전보다 키 크고 날씬해져.hwp
880 | 23822 | 23822_202306291517506800.hwp | 0628(29조간)바이오화학서비스표준과, 이차전지 양극재 분석 표준화로 K배터리 글로벌 경쟁력 강화.hwp
881 | 23934 | 23934_202308240916358411.hwp | 0823(24조간)바이오화학서비스표준과, 백신산업 분류체계 표준화로 글로벌 백신강국 토대 마련.hwp
882 | 23412 | 23412_202211031748080300.JPG | test.JPG
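
In the sample rows the stored file name appears to start with the post number and to keep the extension of the original file name. A quick way to check both observations is sketched below on three of the rows shown above. Whether the pattern holds for all 883 records is not established by this report, and the helper name `row_is_consistent` is hypothetical.

```python
import os

# (게시글번호, 파일명, 오리지널파일명) taken from the sample rows above.
rows = [
    (23343, "23343_202210050806404390.pdf",
     "상임전문위원 서류전형 합격자 및 면접시험 공고.pdf"),
    (23397, "23397_202210310711363180.jpg", "16656528475150.jpg"),
    (23412, "23412_202211031748080300.JPG", "test.JPG"),
]


def row_is_consistent(post_number: int, stored: str, original: str) -> bool:
    """Check the post-number prefix and that both names share an extension."""
    prefix_ok = stored.startswith(f"{post_number}_")
    stored_ext = os.path.splitext(stored)[1]
    original_ext = os.path.splitext(original)[1]
    return prefix_ok and stored_ext == original_ext


results = [row_is_consistent(*row) for row in rows]
print(results)
```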