Overview

Dataset statistics

Number of variables: 9
Number of observations: 10000
Missing cells: 29926
Missing cells (%): 33.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 810.5 KiB
Average record size in memory: 83.0 B
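
The two derived figures in this block follow from the raw counts. The sketch below is a minimal check, assuming the percentage is taken over all cells (variables × observations) and that 1 KiB = 1024 bytes; the variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" block of the report.
n_variables = 9
n_observations = 10_000
missing_cells = 29_926
total_size_kib = 810.5

# Assumption: the missing-cell percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Both values round to the figures shown in the report (33.3% and 83.0 B).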

Variable types

Text: 4
Categorical: 2
Numeric: 2
DateTime: 1

Dataset

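Description: Attachment file information from the Ceramic Materials Information Bank (세라믹소재정보은행) of the Korea Institute of Ceramic Engineering and Technology (한국세라믹기술원).
Author: Korea Institute of Ceramic Engineering and Technology (한국세라믹기술원)
URL: https://www.data.go.kr/data/15072093/fileData.do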

Alerts

파일개수 is highly overall correlated with 파일크기 and 2 other fields [High correlation]
파일타입 is highly overall correlated with 파일크기 and 2 other fields [High correlation]
파일크기 is highly overall correlated with 파일타입 and 1 other field [High correlation]
파일색인명 is highly overall correlated with 파일타입 and 1 other field [High correlation]
파일타입 is highly imbalanced (51.1%) [Imbalance]
파일개수 is highly imbalanced (97.5%) [Imbalance]
파일크기 has 9976 (99.8%) missing values [Missing]
등록일 has 9975 (99.8%) missing values [Missing]
파일색인명 has 9975 (99.8%) missing values [Missing]
파일시퀀스 has unique values [Unique]
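
The two imbalance percentages are consistent with an entropy-based score, one minus the Shannon entropy of the class frequencies divided by the maximum possible entropy for that number of classes. The sketch below rests on that assumption, which was checked only against the two figures in this report; the function name `imbalance_score` is illustrative. The class counts are taken from the Common Values tables of 파일타입 and 파일개수.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (in bits) of the counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Class counts copied from the Common Values tables in this report.
file_type_counts = [6571, 2393, 986, 37, 9, 4]   # 파일타입
file_count_counts = [9975, 25]                    # 파일개수 (<NA> counted as a class)

print(f"파일타입: {100 * imbalance_score(file_type_counts):.1f}%")
print(f"파일개수: {100 * imbalance_score(file_count_counts):.1f}%")
```

Under this assumption the script reproduces the 51.1% and 97.5% shown in the alerts.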

Reproduction

Analysis started: 2023-12-12 03:39:21.969624
Analysis finished: 2023-12-12 03:39:23.661054
Duration: 1.69 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
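
A report of this kind can be regenerated with ydata-profiling. The following is a rough sketch rather than the configuration actually used: the CSV path, output path, and report title are hypothetical, and the real settings live in the config.json referenced above. It assumes pandas and ydata-profiling are installed.

```python
def build_report(csv_path: str, out_path: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report.

    Both paths are hypothetical placeholders chosen for this sketch.
    """
    # Imported lazily so that defining the function does not require the libraries.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(
        df,
        title="Attachment file information",  # hypothetical title
        dataset={
            "description": "Attachment file information from the Ceramic "
                           "Materials Information Bank",
            "url": "https://www.data.go.kr/data/15072093/fileData.do",
        },
    )
    profile.to_file(out_path)
```

Calling `build_report("attachments.csv")` with a locally downloaded copy of the dataset would write `report.html` to the working directory; the file name here is an assumption.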

Variables

파일시퀀스
Text

UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 14
Median length: 14
Mean length: 14
Min length: 14

Characters and Unicode

Total characters: 140000
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
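
As an illustration of these character properties, the sketch below counts Unicode general categories for one identifier from the Sample block using Python's standard `unicodedata` module. The helper name `category_counts` is illustrative; note that the standard library exposes the general category but not the script or block.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count the Unicode general category (e.g. 'Lu', 'Nd', 'Pd') of each character."""
    return Counter(unicodedata.category(ch) for ch in text)


sample = "FIL-1000048553"  # first sample row of 파일시퀀스
counts = category_counts(sample)
print(dict(counts))
```

Each identifier of this shape contributes 3 uppercase letters (Lu), 1 dash (Pd), and 10 decimal digits (Nd). Multiplied by 10000 rows, that matches the 30000 / 10000 / 100000 split reported under "Most occurring categories".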

Unique

Unique: 10000
Unique (%): 100.0%

Sample

1st row: FIL-1000048553
2nd row: FIL-1000020380
3rd row: FIL-1000025674
4th row: FIL-1000017977
5th row: FIL-1000032838

Value                Count  Frequency (%)
fil-1000048553           1  < 0.1%
fil-1000021482           1  < 0.1%
fil-1000082389           1  < 0.1%
fil-1000036105           1  < 0.1%
fil-1000094817           1  < 0.1%
fil-1000094680           1  < 0.1%
fil-1000044734           1  < 0.1%
fil-1000018397           1  < 0.1%
fil-1000064422           1  < 0.1%
fil-1000024640           1  < 0.1%
Other values (9990)   9990  99.9%

Most occurring characters

Value             Count  Frequency (%)
0                 43218  30.9%
1                 17012  12.2%
-                 10000  7.1%
F                  9989  7.1%
I                  9989  7.1%
L                  9989  7.1%
4                  5407  3.9%
5                  5357  3.8%
3                  5208  3.7%
8                  5040  3.6%
Other values (7)  18791  13.4%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number    100000  71.4%
Uppercase Letter   30000  21.4%
Dash Punctuation   10000  7.1%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      43218  43.2%
1      17012  17.0%
4       5407  5.4%
5       5357  5.4%
3       5208  5.2%
8       5040  5.0%
2       5006  5.0%
6       4796  4.8%
9       4591  4.6%
7       4365  4.4%

Uppercase Letter
Value  Count  Frequency (%)
F       9989  33.3%
I       9989  33.3%
L       9989  33.3%
B         11  < 0.1%
A         11  < 0.1%
K         11  < 0.1%

Dash Punctuation
Value  Count  Frequency (%)
-      10000  100.0%

Most occurring scripts

Value    Count  Frequency (%)
Common  110000  78.6%
Latin    30000  21.4%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      43218  39.3%
1      17012  15.5%
-      10000  9.1%
4       5407  4.9%
5       5357  4.9%
3       5208  4.7%
8       5040  4.6%
2       5006  4.6%
6       4796  4.4%
9       4591  4.2%

Latin
Value  Count  Frequency (%)
F       9989  33.3%
I       9989  33.3%
L       9989  33.3%
B         11  < 0.1%
A         11  < 0.1%
K         11  < 0.1%

Most occurring blocks

Value   Count  Frequency (%)
ASCII  140000  100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
0                 43218  30.9%
1                 17012  12.2%
-                 10000  7.1%
F                  9989  7.1%
I                  9989  7.1%
L                  9989  7.1%
4                  5407  3.9%
5                  5357  3.8%
3                  5208  3.7%
8                  5040  3.6%
Other values (7)  18791  13.4%

파일타입
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

F02001: 6571
F02002: 2393
F02003: 986
F_MAMO: 37
F_NANO: 9

Length

Max length: 10
Median length: 6
Mean length: 6.0016
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: F02001
2nd row: F02001
3rd row: F02001
4th row: F02001
5th row: F02001

Common Values

Value       Count  Frequency (%)
F02001       6571  65.7%
F02002       2393  23.9%
F02003        986  9.9%
F_MAMO         37  0.4%
F_NANO          9  0.1%
F_NANO_ETC      4  < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value       Count  Frequency (%)
f02001       6571  65.7%
f02002       2393  23.9%
f02003        986  9.9%
f_mamo         37  0.4%
f_nano          9  0.1%
f_nano_etc      4  < 0.1%

원본명
Text

Distinct: 3442
Distinct (%): 34.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 14
Median length: 8
Mean length: 9.7816
Min length: 1

Characters and Unicode

Total characters: 97816
Distinct characters: 18
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3404
Unique (%): 34.0%

Sample

1st row: 1.05E+13
2nd row: 1.04E+13
3rd row: 1.04E+14
4th row: 1.04E+13
5th row: 1.05E+13

Value                Count  Frequency (%)
1.04e+13              2072  20.7%
1.05e+13              1160  11.6%
1.04e+14              1126  11.3%
1.05e+12               771  7.7%
1.02e+13               566  5.7%
1.04e+12               383  3.8%
1.02e+12               200  2.0%
1.05e+11               142  1.4%
1.04e+11                55  0.5%
1.02e+11                22  0.2%
Other values (3432)   3503  35.0%

Most occurring characters

Value             Count  Frequency (%)
0                 21731  22.2%
1                 20143  20.6%
E                  6584  6.7%
+                  6545  6.7%
.                  6545  6.7%
4                  6176  6.3%
3                  5394  5.5%
2                  4016  4.1%
5                  3485  3.6%
P                  2743  2.8%
Other values (8)  14454  14.8%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     67209  68.7%
Uppercase Letter   14774  15.1%
Math Symbol         6545  6.7%
Other Punctuation   6545  6.7%
Dash Punctuation    2743  2.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 21731
32.3%
1 20143
30.0%
4 6176
 
9.2%
3 5394
 
8.0%
2 4016
 
6.0%
5 3485
 
5.2%
9 1729
 
2.6%
8 1567
 
2.3%
6 1539
 
2.3%
7 1429
 
2.1%
Uppercase Letter
ValueCountFrequency (%)
E 6584
44.6%
P 2743
18.6%
R 2704
18.3%
O 2704
18.3%
Q 39
 
0.3%
Math Symbol
ValueCountFrequency (%)
+ 6545
100.0%
Other Punctuation
ValueCountFrequency (%)
. 6545
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2743
100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  83042  84.9%
Latin   14774  15.1%

Most frequent character per script

Common
ValueCountFrequency (%)
0 21731
26.2%
1 20143
24.3%
+ 6545
 
7.9%
. 6545
 
7.9%
4 6176
 
7.4%
3 5394
 
6.5%
2 4016
 
4.8%
5 3485
 
4.2%
- 2743
 
3.3%
9 1729
 
2.1%
Other values (3) 4535
 
5.5%
Latin
ValueCountFrequency (%)
E 6584
44.6%
P 2743
18.6%
R 2704
18.3%
O 2704
18.3%
Q 39
 
0.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  97816  100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 21731
22.2%
1 20143
20.6%
E 6584
 
6.7%
+ 6545
 
6.7%
. 6545
 
6.7%
4 6176
 
6.3%
3 5394
 
5.5%
2 4016
 
4.1%
5 3485
 
3.6%
P 2743
 
2.8%
Other values (8) 14454
14.8%

실제파일명
Text

Distinct: 2466
Distinct (%): 24.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 82
Median length: 80
Mean length: 17.6879
Min length: 5

Characters and Unicode

Total characters: 176879
Distinct characters: 161
Distinct categories: 11
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1864
Unique (%): 18.6%

Sample

1st row: 1050101022pic3.jpg
2nd row: 1040301010pic1.jpg
3rd row: 1040301010pic1.jpg
4th row: 1040301010pic1.jpg
5th row: 1050101040pic1.jpg

Value                Count  Frequency (%)
1040301010pic1.jpg    2107  16.7%
1050101040pic1.jpg     497  3.9%
1050101050pic5.jpg     244  1.9%
1040301020pic45.jpg    162  1.3%
10202060pic90.jpg      119  0.9%
1040301020pic43.jpg    118  0.9%
1040301020pic19.jpg    115  0.9%
1050101050pic1.jpg     113  0.9%
1050101060pic6.jpg     106  0.8%
102010pic6.jpg         101  0.8%
Other values (2658)   8930  70.8%

Most occurring characters

Value               Count  Frequency (%)
0                   33324  18.8%
1                   24118  13.6%
p                   16518  9.3%
.                   10333  5.8%
g                    8871  5.0%
i                    8579  4.9%
c                    7798  4.4%
j                    7640  4.3%
3                    6166  3.5%
4                    5858  3.3%
Other values (151)  47674  27.0%

Most occurring categories

Value                  Count  Frequency (%)
Decimal Number         84153  47.6%
Lowercase Letter       68728  38.9%
Other Punctuation      10364  5.9%
Uppercase Letter        6644  3.8%
Space Separator         2612  1.5%
Dash Punctuation        2056  1.2%
Connector Punctuation   1024  0.6%
Other Letter             637  0.4%
Open Punctuation         326  0.2%
Close Punctuation        326  0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
63
 
9.9%
58
 
9.1%
39
 
6.1%
35
 
5.5%
19
 
3.0%
18
 
2.8%
17
 
2.7%
16
 
2.5%
16
 
2.5%
16
 
2.5%
Other values (76) 340
53.4%
Lowercase Letter
ValueCountFrequency (%)
p 16518
24.0%
g 8871
12.9%
i 8579
12.5%
c 7798
11.3%
j 7640
11.1%
e 2179
 
3.2%
r 1865
 
2.7%
t 1862
 
2.7%
a 1716
 
2.5%
n 1612
 
2.3%
Other values (16) 10088
14.7%
Uppercase Letter
ValueCountFrequency (%)
P 763
11.5%
J 662
 
10.0%
C 643
 
9.7%
S 624
 
9.4%
G 584
 
8.8%
D 424
 
6.4%
M 335
 
5.0%
E 330
 
5.0%
A 311
 
4.7%
F 227
 
3.4%
Other values (16) 1741
26.2%
Decimal Number
ValueCountFrequency (%)
0 33324
39.6%
1 24118
28.7%
3 6166
 
7.3%
4 5858
 
7.0%
2 5346
 
6.4%
5 4195
 
5.0%
6 1982
 
2.4%
9 1314
 
1.6%
7 1068
 
1.3%
8 782
 
0.9%
Other Punctuation
ValueCountFrequency (%)
. 10333
99.7%
, 14
 
0.1%
% 9
 
0.1%
& 6
 
0.1%
' 2
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
= 7
77.8%
~ 1
 
11.1%
+ 1
 
11.1%
Space Separator
ValueCountFrequency (%)
2612
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2056
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1024
100.0%
Open Punctuation
ValueCountFrequency (%)
( 326
100.0%
Close Punctuation
ValueCountFrequency (%)
) 326
100.0%

Most occurring scripts

Value      Count  Frequency (%)
Common    100870  57.0%
Latin      75372  42.6%
Hangul       617  0.3%
Katakana      20  < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
63
 
10.2%
58
 
9.4%
39
 
6.3%
35
 
5.7%
19
 
3.1%
18
 
2.9%
17
 
2.8%
16
 
2.6%
16
 
2.6%
16
 
2.6%
Other values (72) 320
51.9%
Latin
ValueCountFrequency (%)
p 16518
21.9%
g 8871
11.8%
i 8579
11.4%
c 7798
10.3%
j 7640
10.1%
e 2179
 
2.9%
r 1865
 
2.5%
t 1862
 
2.5%
a 1716
 
2.3%
n 1612
 
2.1%
Other values (42) 16732
22.2%
Common
ValueCountFrequency (%)
0 33324
33.0%
1 24118
23.9%
. 10333
 
10.2%
3 6166
 
6.1%
4 5858
 
5.8%
2 5346
 
5.3%
5 4195
 
4.2%
2612
 
2.6%
- 2056
 
2.0%
6 1982
 
2.0%
Other values (13) 4880
 
4.8%
Katakana
ValueCountFrequency (%)
5
25.0%
5
25.0%
5
25.0%
5
25.0%

Most occurring blocks

Value         Count  Frequency (%)
ASCII        176242  99.6%
Hangul          610  0.3%
Katakana         20  < 0.1%
Compat Jamo       7  < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 33324
18.9%
1 24118
13.7%
p 16518
 
9.4%
. 10333
 
5.9%
g 8871
 
5.0%
i 8579
 
4.9%
c 7798
 
4.4%
j 7640
 
4.3%
3 6166
 
3.5%
4 5858
 
3.3%
Other values (65) 47037
26.7%
Hangul
ValueCountFrequency (%)
63
 
10.3%
58
 
9.5%
39
 
6.4%
35
 
5.7%
19
 
3.1%
18
 
3.0%
17
 
2.8%
16
 
2.6%
16
 
2.6%
16
 
2.6%
Other values (71) 313
51.3%
Compat Jamo
ValueCountFrequency (%)
7
100.0%
Katakana
ValueCountFrequency (%)
5
25.0%
5
25.0%
5
25.0%
5
25.0%

내부파일명
Text

Distinct: 2943
Distinct (%): 29.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 83
Median length: 80
Mean length: 18.0193
Min length: 2

Characters and Unicode

Total characters: 180193
Distinct characters: 153
Distinct categories: 11
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2448
Unique (%): 24.5%

Sample

1st row: 1050101022pic3.jpg
2nd row: 1040301010pic1.jpg
3rd row: 1040301010pic1.jpg
4th row: 1040301010pic1.jpg
5th row: 1050101040pic1.jpg

Value                Count  Frequency (%)
1040301010pic1.jpg    2107  16.7%
1050101040pic1.jpg     497  4.0%
1050101050pic5.jpg     244  1.9%
1040301020pic45.jpg    162  1.3%
10202060pic90.jpg      119  0.9%
1040301020pic43.jpg    118  0.9%
1040301020pic19.jpg    115  0.9%
1050101050pic1.jpg     113  0.9%
1050101060pic6.jpg     106  0.8%
102010pic6.jpg         101  0.8%
Other values (3153)   8898  70.7%

Most occurring characters

Value               Count  Frequency (%)
0                   35492  19.7%
1                   24966  13.9%
p                   16033  8.9%
.                    9797  5.4%
g                    8831  4.9%
i                    8551  4.7%
c                    7795  4.3%
j                    7634  4.2%
3                    6457  3.6%
4                    6105  3.4%
Other values (143)  48532  26.9%

Most occurring categories

Value                  Count  Frequency (%)
Decimal Number         89623  49.7%
Lowercase Letter       67160  37.3%
Other Punctuation       9828  5.5%
Uppercase Letter        6774  3.8%
Space Separator         2580  1.4%
Dash Punctuation        2056  1.1%
Connector Punctuation   1034  0.6%
Other Letter             477  0.3%
Close Punctuation        326  0.2%
Open Punctuation         326  0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
63
 
13.2%
58
 
12.2%
39
 
8.2%
35
 
7.3%
18
 
3.8%
13
 
2.7%
12
 
2.5%
11
 
2.3%
11
 
2.3%
9
 
1.9%
Other values (68) 208
43.6%
Lowercase Letter
ValueCountFrequency (%)
p 16033
23.9%
g 8831
13.1%
i 8551
12.7%
c 7795
11.6%
j 7634
11.4%
e 2179
 
3.2%
r 1865
 
2.8%
t 1861
 
2.8%
a 1715
 
2.6%
n 1611
 
2.4%
Other values (16) 9085
13.5%
Uppercase Letter
ValueCountFrequency (%)
P 764
11.3%
J 663
 
9.8%
C 638
 
9.4%
S 624
 
9.2%
G 584
 
8.6%
D 424
 
6.3%
M 351
 
5.2%
E 346
 
5.1%
A 311
 
4.6%
F 242
 
3.6%
Other values (16) 1827
27.0%
Decimal Number
ValueCountFrequency (%)
0 35492
39.6%
1 24966
27.9%
3 6457
 
7.2%
4 6105
 
6.8%
2 5671
 
6.3%
5 4502
 
5.0%
6 2223
 
2.5%
9 1560
 
1.7%
8 1386
 
1.5%
7 1261
 
1.4%
Other Punctuation
ValueCountFrequency (%)
. 9797
99.7%
, 14
 
0.1%
% 9
 
0.1%
& 6
 
0.1%
' 2
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
= 7
77.8%
~ 1
 
11.1%
+ 1
 
11.1%
Space Separator
ValueCountFrequency (%)
2580
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2056
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1034
100.0%
Close Punctuation
ValueCountFrequency (%)
) 326
100.0%
Open Punctuation
ValueCountFrequency (%)
( 326
100.0%

Most occurring scripts

Value      Count  Frequency (%)
Common    105782  58.7%
Latin      73934  41.0%
Hangul       457  0.3%
Katakana      20  < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
63
 
13.8%
58
 
12.7%
39
 
8.5%
35
 
7.7%
18
 
3.9%
13
 
2.8%
12
 
2.6%
11
 
2.4%
11
 
2.4%
9
 
2.0%
Other values (64) 188
41.1%
Latin
ValueCountFrequency (%)
p 16033
21.7%
g 8831
11.9%
i 8551
11.6%
c 7795
10.5%
j 7634
10.3%
e 2179
 
2.9%
r 1865
 
2.5%
t 1861
 
2.5%
a 1715
 
2.3%
n 1611
 
2.2%
Other values (42) 15859
21.5%
Common
ValueCountFrequency (%)
0 35492
33.6%
1 24966
23.6%
. 9797
 
9.3%
3 6457
 
6.1%
4 6105
 
5.8%
2 5671
 
5.4%
5 4502
 
4.3%
2580
 
2.4%
6 2223
 
2.1%
- 2056
 
1.9%
Other values (13) 5933
 
5.6%
Katakana
ValueCountFrequency (%)
5
25.0%
5
25.0%
5
25.0%
5
25.0%

Most occurring blocks

Value         Count  Frequency (%)
ASCII        179716  99.7%
Hangul          450  0.2%
Katakana         20  < 0.1%
Compat Jamo       7  < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 35492
19.7%
1 24966
13.9%
p 16033
 
8.9%
. 9797
 
5.5%
g 8831
 
4.9%
i 8551
 
4.8%
c 7795
 
4.3%
j 7634
 
4.2%
3 6457
 
3.6%
4 6105
 
3.4%
Other values (65) 48055
26.7%
Hangul
ValueCountFrequency (%)
63
 
14.0%
58
 
12.9%
39
 
8.7%
35
 
7.8%
18
 
4.0%
13
 
2.9%
12
 
2.7%
11
 
2.4%
11
 
2.4%
9
 
2.0%
Other values (63) 181
40.2%
Compat Jamo
ValueCountFrequency (%)
7
100.0%
Katakana
ValueCountFrequency (%)
5
25.0%
5
25.0%
5
25.0%
5
25.0%

파일크기
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 23
Distinct (%): 95.8%
Missing: 9976
Missing (%): 99.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 669227.33
Minimum: 4190
Maximum: 5095819
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 4190
5th percentile: 9824.45
Q1: 46716.75
Median: 84977.5
Q3: 404990.75
95th percentile: 2326219.4
Maximum: 5095819
Range: 5091629
Interquartile range (IQR): 358274

Descriptive statistics

Standard deviation: 1240283.8
Coefficient of variation (CV): 1.8533071
Kurtosis: 6.3354369
Mean: 669227.33
Median Absolute Deviation (MAD): 70225
Skewness: 2.4321328
Sum: 16061456
Variance: 1.5383038 × 10^12
Monotonicity: Not monotonic
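
Several of the 파일크기 statistics are simple functions of one another. The sketch below cross-checks them from the reported figures only, assuming the mean is taken over non-missing values and that the CV is the standard deviation divided by the mean; the variable names are illustrative.

```python
import math

# Figures copied from the 파일크기 section of the report.
n_observations = 10_000
n_missing = 9_976
total = 16_061_456            # Sum
std = 1_240_283.8             # Standard deviation
q1, q3 = 46_716.75, 404_990.75
minimum, maximum = 4_190, 5_095_819

n_valid = n_observations - n_missing   # non-missing values
mean = total / n_valid                 # assumption: mean over non-missing values
cv = std / mean                        # coefficient of variation
variance = std ** 2
iqr = q3 - q1
value_range = maximum - minimum

print(f"n_valid={n_valid}, mean={mean:.2f}, cv={cv:.4f}")
print(f"variance={variance:.4e}, iqr={iqr}, range={value_range}")
```

The derived values agree with the report to the precision shown. Only 24 observations underlie these statistics, so the skewness and kurtosis figures rest on a very small sample.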
Histogram with fixed size bins (bins=23)
Value              Count  Frequency (%)
230285                 2  < 0.1%
4190                   1  < 0.1%
367135                 1  < 0.1%
90323                  1  < 0.1%
73075                  1  < 0.1%
23901                  1  < 0.1%
518558                 1  < 0.1%
2232620                1  < 0.1%
104329                 1  < 0.1%
84271                  1  < 0.1%
Other values (13)     13  0.1%
(Missing)           9976  99.8%

Minimum 10 values
Value  Count  Frequency (%)
4190       1  < 0.1%
9521       1  < 0.1%
11544      1  < 0.1%
17961      1  < 0.1%
19068      1  < 0.1%
23901      1  < 0.1%
54322      1  < 0.1%
57696      1  < 0.1%
66263      1  < 0.1%
73075      1  < 0.1%

Maximum 10 values
Value    Count  Frequency (%)
5095819      1  < 0.1%
2342737      1  < 0.1%
2232620      1  < 0.1%
2219188      1  < 0.1%
2049000      1  < 0.1%
518558       1  < 0.1%
367135       1  < 0.1%
230285       2  < 0.1%
104329       1  < 0.1%
90323        1  < 0.1%

등록일
Date

MISSING 

Distinct: 13
Distinct (%): 52.0%
Missing: 9975
Missing (%): 99.8%
Memory size: 156.2 KiB
Minimum: 2008-08-05 00:00:00
Maximum: 2010-12-15 00:00:00
Histogram with fixed size bins (bins=13)

파일개수
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

<NA>: 9975
1: 25

Length

Max length: 4
Median length: 4
Mean length: 3.9925
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value  Count  Frequency (%)
<NA>    9975  99.8%
1         25  0.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
na      9975  99.8%
1         25  0.2%

파일색인명
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 25
Distinct (%): 100.0%
Missing: 9975
Missing (%): 99.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 752.16
Minimum: 226
Maximum: 872
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 226
5th percentile: 694.8
Q1: 733
Median: 765
Q3: 802
95th percentile: 855.6
Maximum: 872
Range: 646
Interquartile range (IQR): 69

Descriptive statistics

Standard deviation: 119.92209
Coefficient of variation (CV): 0.15943694
Kurtosis: 16.557659
Mean: 752.16
Median Absolute Deviation (MAD): 37
Skewness: -3.7022422
Sum: 18804
Variance: 14381.307
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=25)
Value              Count  Frequency (%)
226                    1  < 0.1%
789                    1  < 0.1%
692                    1  < 0.1%
733                    1  < 0.1%
762                    1  < 0.1%
806                    1  < 0.1%
802                    1  < 0.1%
736                    1  < 0.1%
765                    1  < 0.1%
706                    1  < 0.1%
Other values (15)     15  0.1%
(Missing)           9975  99.8%

Minimum 10 values
Value  Count  Frequency (%)
226        1  < 0.1%
692        1  < 0.1%
706        1  < 0.1%
707        1  < 0.1%
716        1  < 0.1%
723        1  < 0.1%
733        1  < 0.1%
736        1  < 0.1%
751        1  < 0.1%
761        1  < 0.1%

Maximum 10 values
Value  Count  Frequency (%)
872        1  < 0.1%
857        1  < 0.1%
850        1  < 0.1%
831        1  < 0.1%
826        1  < 0.1%
806        1  < 0.1%
802        1  < 0.1%
796        1  < 0.1%
789        1  < 0.1%
780        1  < 0.1%

Interactions

[Figures: four interaction plots, not reproduced in this text version]

Correlations

            파일타입  파일크기  등록일  파일색인명
파일타입    1.000     NaN       NaN     NaN
파일크기    NaN       1.000     0.285   0.000
등록일      NaN       0.285     1.000   1.000
파일색인명  NaN       0.000     1.000   1.000

          파일개수  파일타입
파일개수  1.000     1.000
파일타입  1.000     1.000

            파일크기  파일색인명  파일타입  파일개수
파일크기    1.000     -0.141      1.000     1.000
파일색인명  -0.141    1.000       1.000     1.000
파일타입    1.000     1.000       1.000     1.000
파일개수    1.000     1.000       1.000     1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
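
Nullity correlation can be illustrated by correlating presence indicators (1 = value present, 0 = missing) between two columns. The sketch below uses only the standard library and a small made-up table: the column contents and their pairing are hypothetical and are not rows from this dataset. The helper names `pearson` and `nullity_correlation` are illustrative.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return float("nan")  # a column that is always present (or always missing)
    return cov / math.sqrt(vx * vy)


def nullity_correlation(col_a, col_b):
    """Correlate the presence (non-None) indicators of two columns."""
    present_a = [0 if v is None else 1 for v in col_a]
    present_b = [0 if v is None else 1 for v in col_b]
    return pearson(present_a, present_b)


# Hypothetical toy columns: None marks a missing value.
registered = [None, "2008-08-05", None, None, "2010-12-15", None]
index_name = [None, 226, None, None, 872, None]
file_size = [None, 4190, None, None, None, None]

print(nullity_correlation(registered, index_name))
print(nullity_correlation(registered, file_size))
```

In the toy data the first pair is missing in exactly the same rows, so its nullity correlation is at the maximum; the second pair overlaps only partly and scores lower. A column with no missing values has no variance in its indicator and yields NaN here.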

Sample

       파일시퀀스      파일타입  원본명          실제파일명                   내부파일명                   파일크기  등록일  파일개수  파일색인명
12401  FIL-1000048553  F02001    1.05E+13        1050101022pic3.jpg           1050101022pic3.jpg           <NA>      <NA>    <NA>      <NA>
51309  FIL-1000020380  F02001    1.04E+13        1040301010pic1.jpg           1040301010pic1.jpg           <NA>      <NA>    <NA>      <NA>
44877  FIL-1000025674  F02001    1.04E+14        1040301010pic1.jpg           1040301010pic1.jpg           <NA>      <NA>    <NA>      <NA>
50239  FIL-1000017977  F02001    1.04E+13        1040301010pic1.jpg           1040301010pic1.jpg           <NA>      <NA>    <NA>      <NA>
55423  FIL-1000032838  F02001    1.05E+13        1050101040pic1.jpg           1050101040pic1.jpg           <NA>      <NA>    <NA>      <NA>
75991  FIL-1000093488  F02003    PRO-1000062694  Fig_9.jpg                    Fig_923.jpg                  <NA>      <NA>    <NA>      <NA>
30298  FIL-1000002381  F02001    1.02E+13        102010pic19.jpg              102010pic19.jpg              <NA>      <NA>    <NA>      <NA>
38535  FIL-1000013851  F02001    1.04E+13        1040301020pic21.jpg          1040301020pic21.jpg          <NA>      <NA>    <NA>      <NA>
34385  FIL-1000008017  F02001    1.04E+12        1040301010pic1.jpg           1040301010pic1.jpg           <NA>      <NA>    <NA>      <NA>
98598  FIL-1000179037  F02002    PRO-1000124481  fig.2 Thermal expansion.jpg  fig.2 Thermal expansion.jpg  <NA>      <NA>    <NA>      <NA>

       파일시퀀스      파일타입  원본명          실제파일명           내부파일명              파일크기  등록일  파일개수  파일색인명
13895  FIL-1000039381  F02001    1.04E+13        1040301030pic19.jpg  1040301030pic19.jpg     <NA>      <NA>    <NA>      <NA>
28235  FIL-1000004735  F02001    1.02E+13        102010pic10.jpg      102010pic10.jpg         <NA>      <NA>    <NA>      <NA>
78228  FIL-1000095711  F_MAMO    EQP-0000000052  wearloss.jpg         wearloss1000095711.jpg  <NA>      <NA>    <NA>      <NA>
 6430  FIL-1000041067  F02001    1.04E+13        1040301020pic47.jpg  1040301020pic47.jpg     <NA>      <NA>    <NA>      <NA>
23218  FIL-1000055857  F02001    2.01E+11        20105110pic13.jpg    20105110pic13.jpg       <NA>      <NA>    <NA>      <NA>
59819  FIL-1000049348  F02001    1.04E+13        1040301060pic7.jpg   1040301060pic7.jpg      <NA>      <NA>    <NA>      <NA>
 6509  FIL-1000061950  F02001    1.05E+12        1050101060pic3.jpg   1050101060pic3.jpg      <NA>      <NA>    <NA>      <NA>
46043  FIL-1000025940  F02001    1.04E+14        1040301010pic1.jpg   1040301010pic1.jpg      <NA>      <NA>    <NA>      <NA>
37001  FIL-1000017756  F02001    1.04E+13        1040301010pic1.jpg   1040301010pic1.jpg      <NA>      <NA>    <NA>      <NA>
28397  FIL-1000003935  F02001    1.02E+13        102010pic11.jpg      102010pic11.jpg         <NA>      <NA>    <NA>      <NA>