Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 722.7 KiB
Average record size in memory: 74.0 B
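The average record size can be checked against the total size in memory and the number of observations. A minimal sketch, assuming 1 KiB = 1024 bytes and using only the figures reported in this section (the variable names are illustrative):

```python
# Figures copied from the dataset statistics in this report.
total_size_kib = 722.7
n_observations = 10_000

# Assumption: 1 KiB = 1024 bytes. Average bytes held in memory per row.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```

The result rounds to the 74.0 B shown in the report.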

Variable types

Text: 2
Categorical: 2
DateTime: 2
Boolean: 1
Numeric: 1

Dataset

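Description: 결재문서 번호 (approval document number), 공개문서 유형명 (public document type name), 결재문서 제목 (approval document title), 결재상신일시 (approval submission datetime), 결재완료일시 (approval completion datetime), 보존연한 (retention period), 문서조회수 (document view count)
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-21077/S/1/datasetView.do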

Alerts

공개여부 has constant value "True" (Constant)
보존연한 has constant value "99" (Constant)
문서조회수 is highly skewed (γ1 = 22.6665905) (Skewed)
결재문서 번호 has unique values (Unique)
문서조회수 has 9572 (95.7%) zeros (Zeros)

Reproduction

Analysis started: 2024-04-29 20:56:10.336328
Analysis finished: 2024-04-29 20:56:12.064884
Duration: 1.73 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
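The reported duration is the difference between the two analysis timestamps. A small sketch that recomputes it with the standard library, assuming both timestamps are in the same time zone (the report does not state one):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-04-29 20:56:10.336328")
finished = datetime.fromisoformat("2024-04-29 20:56:12.064884")

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} s")
```

This agrees with the 1.73 seconds shown in the report after rounding.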

Variables

결재문서 번호
Text

UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 24
Median length: 22
Mean length: 21.8286
Min length: 21

Characters and Unicode

Total characters: 218286
Distinct characters: 19
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
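To illustrate what these character properties look like, the sketch below counts Unicode general categories for one identifier from this column using Python's `unicodedata` module. It is an illustration only; the helper name `category_counts` is hypothetical and the profiling tool's own implementation may differ.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category code (e.g. 'Lu', 'Nd', 'Pc')."""
    return Counter(unicodedata.category(ch) for ch in text)


# One identifier taken from this column.
sample_id = "SAE_FM001_210506_51611"
counts = category_counts(sample_id)

# 'Lu' = Uppercase Letter, 'Nd' = Decimal Number, 'Pc' = Connector Punctuation
print(counts)
```

For this identifier the counts are 14 decimal digits, 5 uppercase letters, and 3 underscores, in line with the categories the report lists for the column.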

Unique

Unique: 10000
Unique (%): 100.0%

Sample

1st row: SAE_FM001_210506_51611
2nd row: SAE_FM001_210209_15909
3rd row: SAE_CP001_210204_6036
4th row: SAE_CP002_201018_0453
5th row: SAE_CP001_210629_27717
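The sample identifiers appear to follow a PREFIX_FORMCODE_YYMMDD_SEQUENCE layout. That layout is an inference from the rows shown, not something the report documents, and the character statistics (one lowercase letter, 30001 underscores) suggest at least one identifier deviates from it. A hedged sketch of a tolerant parser, with hypothetical names throughout:

```python
import re
from datetime import date, datetime
from typing import NamedTuple, Optional


class DocumentId(NamedTuple):
    prefix: str
    form_code: str
    issued: date
    sequence: str  # kept as text so leading zeros survive


# Hypothetical layout inferred from the samples: PREFIX_FORMCODE_YYMMDD_SEQUENCE.
_ID_PATTERN = re.compile(r"^([A-Z]+)_([A-Z]{2}\d{3})_(\d{6})_(\d+)$")


def parse_document_id(raw: str) -> Optional[DocumentId]:
    """Return the parsed parts, or None when the value does not fit the assumed layout."""
    match = _ID_PATTERN.match(raw)
    if match is None:
        return None
    prefix, form_code, yymmdd, sequence = match.groups()
    try:
        issued = datetime.strptime(yymmdd, "%y%m%d").date()
    except ValueError:
        return None  # six digits that are not a valid date
    return DocumentId(prefix, form_code, issued, sequence)


parsed = parse_document_id("SAE_FM001_210506_51611")
print(parsed)
```

Returning `None` instead of raising keeps irregular identifiers visible for review rather than stopping a batch.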
Value | Count | Frequency (%)
sae_fm001_210506_51611 | 1 | < 0.1%
sae_fm003_210610_69319 | 1 | < 0.1%
sae_fm004_210330_34854 | 1 | < 0.1%
sae_am005_210525_6268 | 1 | < 0.1%
sae_em004_210514_0954 | 1 | < 0.1%
sae_fm001_210415_42287 | 1 | < 0.1%
sae_cp001_210730_33333 | 1 | < 0.1%
sae_fm002_210814_104549 | 1 | < 0.1%
sae_cp001_210205_6174 | 1 | < 0.1%
sae_cp002_210723_32096 | 1 | < 0.1%
Other values (9990) | 9990 | 99.9%

Most occurring characters

Value | Count | Frequency (%)
0 | 38599 | 17.7%
_ | 30001 | 13.7%
1 | 25359 | 11.6%
2 | 24544 | 11.2%
A | 10558 | 4.8%
E | 10132 | 4.6%
S | 10000 | 4.6%
3 | 9304 | 4.3%
4 | 7652 | 3.5%
5 | 7060 | 3.2%
Other values (9) | 45077 | 20.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 138284 | 63.3%
Uppercase Letter | 50000 | 22.9%
Connector Punctuation | 30001 | 13.7%
Lowercase Letter | 1 | < 0.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 38599 | 27.9%
1 | 25359 | 18.3%
2 | 24544 | 17.7%
3 | 9304 | 6.7%
4 | 7652 | 5.5%
5 | 7060 | 5.1%
7 | 6785 | 4.9%
6 | 6765 | 4.9%
8 | 6435 | 4.7%
9 | 5781 | 4.2%

Uppercase Letter
Value | Count | Frequency (%)
A | 10558 | 21.1%
E | 10132 | 20.3%
S | 10000 | 20.0%
M | 5951 | 11.9%
F | 5104 | 10.2%
C | 4206 | 8.4%
P | 4049 | 8.1%

Connector Punctuation
Value | Count | Frequency (%)
_ | 30001 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
x | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 168285 | 77.1%
Latin | 50001 | 22.9%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 38599 | 22.9%
_ | 30001 | 17.8%
1 | 25359 | 15.1%
2 | 24544 | 14.6%
3 | 9304 | 5.5%
4 | 7652 | 4.5%
5 | 7060 | 4.2%
7 | 6785 | 4.0%
6 | 6765 | 4.0%
8 | 6435 | 3.8%

Latin
Value | Count | Frequency (%)
A | 10558 | 21.1%
E | 10132 | 20.3%
S | 10000 | 20.0%
M | 5951 | 11.9%
F | 5104 | 10.2%
C | 4206 | 8.4%
P | 4049 | 8.1%
x | 1 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 218286 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 38599 | 17.7%
_ | 30001 | 13.7%
1 | 25359 | 11.6%
2 | 24544 | 11.2%
A | 10558 | 4.8%
E | 10132 | 4.6%
S | 10000 | 4.6%
3 | 9304 | 4.3%
4 | 7652 | 3.5%
5 | 7060 | 3.2%
Other values (9) | 45077 | 20.7%

공개문서 유형명
Categorical

Distinct: 31
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
일일민원일지 | 1883
일일작업일지 | 1809
설비 일일점검일지 | 1780
전기 일일점검일지 | 1736
영선 일일점검일지 | 780
Other values (26) | 2012

Length

Max length: 20
Median length: 19
Mean length: 8.0885
Min length: 2

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 전기 일일점검일지
2nd row: 전기 일일점검일지
3rd row: 일일민원일지
4th row: 일일작업일지
5th row: 일일민원일지

Common Values

Value | Count | Frequency (%)
일일민원일지 | 1883 | 18.8%
일일작업일지 | 1809 | 18.1%
설비 일일점검일지 | 1780 | 17.8%
전기 일일점검일지 | 1736 | 17.4%
영선 일일점검일지 | 780 | 7.8%
토목,건축,기타 일일점검일지 | 681 | 6.8%
기타 | 413 | 4.1%
지출품의서 | 330 | 3.3%
자재 및 물품 구매신청서 | 86 | 0.9%
공사ㆍ용역사업자 선정 결과 공고 | 81 | 0.8%
Other values (21) | 421 | 4.2%
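The "Other values" row can be derived from the ten listed categories together with the column totals (10000 rows, 31 distinct values). A minimal sketch of that arithmetic, with counts copied from the Common Values table and illustrative variable names:

```python
# Counts for the ten most common document types, copied from the report.
top_counts = {
    "일일민원일지": 1883,
    "일일작업일지": 1809,
    "설비 일일점검일지": 1780,
    "전기 일일점검일지": 1736,
    "영선 일일점검일지": 780,
    "토목,건축,기타 일일점검일지": 681,
    "기타": 413,
    "지출품의서": 330,
    "자재 및 물품 구매신청서": 86,
    "공사ㆍ용역사업자 선정 결과 공고": 81,
}
total_rows = 10_000
distinct_values = 31

# Whatever is not covered by the listed categories falls into "Other values".
other_count = total_rows - sum(top_counts.values())
other_distinct = distinct_values - len(top_counts)

# Frequency (%) per category, rounded to one decimal as in the report.
shares = {name: round(100 * count / total_rows, 1) for name, count in top_counts.items()}
print(other_distinct, other_count, shares["일일민원일지"])
```

The remainder matches the report's "Other values (21)" row with 421 rows.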

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
일일점검일지 | 4977 | 30.6%
일일민원일지 | 1883 | 11.6%
일일작업일지 | 1809 | 11.1%
설비 | 1780 | 11.0%
전기 | 1736 | 10.7%
영선 | 780 | 4.8%
토목,건축,기타 | 681 | 4.2%
기타 | 413 | 2.5%
지출품의서 | 330 | 2.0%
공고 | 219 | 1.3%
Other values (47) | 1639 | 10.1%

결재문서 제목
Text

Distinct: 1073
Distinct (%): 10.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 79
Median length: 57
Mean length: 9.5398
Min length: 5

Characters and Unicode

Total characters: 95398
Distinct characters: 385
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 996
Unique (%): 10.0%

Sample

1st row: 전기 일일점검일지
2nd row: 전기 일일점검일지
3rd row: 일일민원일지
4th row: 일일작업일지
5th row: 일일민원일지(21.6.28)
Value | Count | Frequency (%)
일일점검일지 | 4823 | 26.8%
전기 | 1726 | 9.6%
설비 | 1651 | 9.2%
일일작업일지 | 1625 | 9.0%
일일민원일지 | 1542 | 8.6%
영선 | 803 | 4.5%
토목/건축/기타 | 684 | 3.8%
종합업무일지 | 345 | 1.9%
공고 | 190 | 1.1%
지출품의서 | 177 | 1.0%
Other values (1499) | 4418 | 24.6%

Most occurring characters

ValueCountFrequency (%)
26250
27.5%
9511
 
10.0%
8170
 
8.6%
5115
 
5.4%
5097
 
5.3%
2787
 
2.9%
2388
 
2.5%
2195
 
2.3%
1865
 
2.0%
1856
 
1.9%
Other values (375) 30164
31.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 79540 | 83.4%
Space Separator | 8170 | 8.6%
Decimal Number | 4266 | 4.5%
Other Punctuation | 2123 | 2.2%
Close Punctuation | 583 | 0.6%
Open Punctuation | 581 | 0.6%
Uppercase Letter | 57 | 0.1%
Dash Punctuation | 49 | 0.1%
Connector Punctuation | 19 | < 0.1%
Math Symbol | 6 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
26250
33.0%
9511
 
12.0%
5115
 
6.4%
5097
 
6.4%
2787
 
3.5%
2388
 
3.0%
2195
 
2.8%
1865
 
2.3%
1856
 
2.3%
1850
 
2.3%
Other values (333) 20626
25.9%
Uppercase Letter
Value | Count | Frequency (%)
C | 12 | 21.1%
B | 12 | 21.1%
V | 6 | 10.5%
T | 5 | 8.8%
E | 3 | 5.3%
D | 3 | 5.3%
S | 3 | 5.3%
R | 2 | 3.5%
P | 2 | 3.5%
I | 2 | 3.5%
Other values (7) | 7 | 12.3%

Decimal Number
Value | Count | Frequency (%)
2 | 1314 | 30.8%
1 | 884 | 20.7%
0 | 869 | 20.4%
3 | 216 | 5.1%
4 | 196 | 4.6%
5 | 191 | 4.5%
6 | 180 | 4.2%
7 | 163 | 3.8%
8 | 142 | 3.3%
9 | 111 | 2.6%

Other Punctuation
Value | Count | Frequency (%)
/ | 1433 | 67.5%
. | 613 | 28.9%
, | 72 | 3.4%
' | 3 | 0.1%
? | 1 | < 0.1%
: | 1 | < 0.1%

Lowercase Letter
Value | Count | Frequency (%)
p | 2 | 50.0%
i | 1 | 25.0%
t | 1 | 25.0%

Space Separator
Value | Count | Frequency (%)
(space) | 8170 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 583 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 581 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 49 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 19 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 6 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 79540 | 83.4%
Common | 15797 | 16.6%
Latin | 61 | 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
26250
33.0%
9511
 
12.0%
5115
 
6.4%
5097
 
6.4%
2787
 
3.5%
2388
 
3.0%
2195
 
2.8%
1865
 
2.3%
1856
 
2.3%
1850
 
2.3%
Other values (333) 20626
25.9%
Common
Value | Count | Frequency (%)
(space) | 8170 | 51.7%
/ | 1433 | 9.1%
2 | 1314 | 8.3%
1 | 884 | 5.6%
0 | 869 | 5.5%
. | 613 | 3.9%
) | 583 | 3.7%
( | 581 | 3.7%
3 | 216 | 1.4%
4 | 196 | 1.2%
Other values (12) | 938 | 5.9%

Latin
Value | Count | Frequency (%)
C | 12 | 19.7%
B | 12 | 19.7%
V | 6 | 9.8%
T | 5 | 8.2%
E | 3 | 4.9%
D | 3 | 4.9%
S | 3 | 4.9%
R | 2 | 3.3%
p | 2 | 3.3%
P | 2 | 3.3%
Other values (10) | 11 | 18.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 79491 | 83.3%
ASCII | 15858 | 16.6%
Compat Jamo | 49 | 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
26250
33.0%
9511
 
12.0%
5115
 
6.4%
5097
 
6.4%
2787
 
3.5%
2388
 
3.0%
2195
 
2.8%
1865
 
2.3%
1856
 
2.3%
1850
 
2.3%
Other values (332) 20577
25.9%
ASCII
Value | Count | Frequency (%)
(space) | 8170 | 51.5%
/ | 1433 | 9.0%
2 | 1314 | 8.3%
1 | 884 | 5.6%
0 | 869 | 5.5%
. | 613 | 3.9%
) | 583 | 3.7%
( | 581 | 3.7%
3 | 216 | 1.4%
4 | 196 | 1.2%
Other values (32) | 999 | 6.3%
Compat Jamo
ValueCountFrequency (%)
49
100.0%

결재상신일시
DateTime

Distinct: 9761
Distinct (%): 97.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2020-08-13 17:08:58
Maximum: 2024-02-18 09:02:28
Histogram with fixed size bins (bins=50)

결재완료일시
DateTime

Distinct: 9993
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2020-08-14 09:29:06
Maximum: 2024-02-19 00:26:53
Histogram with fixed size bins (bins=50)

공개여부
Boolean

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 87.9 KiB
Value | Count | Frequency (%)
True | 10000 | 100.0%

보존연한
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 99
2nd row: 99
3rd row: 99
4th row: 99
5th row: 99

Common Values

Value | Count | Frequency (%)
99 | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
99 | 10000 | 100.0%

문서조회수
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 22
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1012
Minimum: 0
Maximum: 38
Zeros: 9572
Zeros (%): 95.7%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 0
Maximum: 38
Range: 38
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.91007768
Coefficient of variation (CV): 8.9928625
Kurtosis: 692.50424
Mean: 0.1012
Median Absolute Deviation (MAD): 0
Skewness: 22.66659
Sum: 1012
Variance: 0.82824138
Monotonicity: Not monotonic
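Several of these statistics can be recomputed from the value counts reported for this variable. The sketch below does so under two assumptions: the variance uses the sample (n - 1) denominator, and the one distinct value not individually listed in the report is 11 with a count of 2. That entry is inferred from the totals (10000 rows, 22 distinct values, sum 1012), not shown in the report.

```python
import math

# value -> count, taken from the report's frequency and extreme-value tables.
# ASSUMPTION: the entry 11 -> 2 is inferred from the reported totals.
frequency = {
    0: 9572, 1: 264, 2: 72, 3: 38, 4: 16, 5: 8, 6: 5, 7: 5, 8: 3, 9: 2,
    10: 3, 11: 2, 14: 1, 15: 1, 16: 1, 18: 1, 20: 1, 21: 1, 25: 1, 29: 1,
    31: 1, 38: 1,
}

n = sum(frequency.values())
total = sum(value * count for value, count in frequency.items())
mean = total / n

# Sample variance (n - 1 denominator), an assumption about the tool's convention.
variance = sum(count * (value - mean) ** 2 for value, count in frequency.items()) / (n - 1)
std_dev = math.sqrt(variance)
coefficient_of_variation = std_dev / mean
zero_share = frequency[0] / n

print(n, total, mean, round(variance, 8), round(std_dev, 8), round(coefficient_of_variation, 5))
```

Under these assumptions the recomputed sum, mean, variance, standard deviation, and CV agree with the reported values to the precision shown.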
Histogram with fixed size bins (bins=22)
Value | Count | Frequency (%)
0 | 9572 | 95.7%
1 | 264 | 2.6%
2 | 72 | 0.7%
3 | 38 | 0.4%
4 | 16 | 0.2%
5 | 8 | 0.1%
6 | 5 | 0.1%
7 | 5 | 0.1%
10 | 3 | < 0.1%
8 | 3 | < 0.1%
Other values (12) | 14 | 0.1%

Minimum 10 values
Value | Count | Frequency (%)
0 | 9572 | 95.7%
1 | 264 | 2.6%
2 | 72 | 0.7%
3 | 38 | 0.4%
4 | 16 | 0.2%
5 | 8 | 0.1%
6 | 5 | 0.1%
7 | 5 | 0.1%
8 | 3 | < 0.1%
9 | 2 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
38 | 1 | < 0.1%
31 | 1 | < 0.1%
29 | 1 | < 0.1%
25 | 1 | < 0.1%
21 | 1 | < 0.1%
20 | 1 | < 0.1%
18 | 1 | < 0.1%
16 | 1 | < 0.1%
15 | 1 | < 0.1%
14 | 1 | < 0.1%

Interactions


Correlations

Variable | 공개문서 유형명 | 문서조회수
공개문서 유형명 | 1.000 | 0.360
문서조회수 | 0.360 | 1.000

Variable | 문서조회수 | 공개문서 유형명
문서조회수 | 1.000 | 0.134
공개문서 유형명 | 0.134 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 결재문서 번호 | 공개문서 유형명 | 결재문서 제목 | 결재상신일시 | 결재완료일시 | 공개여부 | 보존연한 | 문서조회수
9541 | SAE_FM001_210506_51611 | 전기 일일점검일지 | 전기 일일점검일지 | 2021-05-06 06:05:43.0 | 2021-05-06 06:15:43.0 | Y | 99 | 0
5179 | SAE_FM001_210209_15909 | 전기 일일점검일지 | 전기 일일점검일지 | 2021-02-09 07:02:19.0 | 2021-02-09 07:09:59.0 | Y | 99 | 0
31298 | SAE_CP001_210204_6036 | 일일민원일지 | 일일민원일지 | 2021-02-04 09:02:19.0 | 2021-02-05 01:06:43.0 | Y | 99 | 0
41505 | SAE_CP002_201018_0453 | 일일작업일지 | 일일작업일지 | 2020-10-18 16:10:40.0 | 2020-10-19 11:33:30.0 | Y | 99 | 0
38753 | SAE_CP001_210629_27717 | 일일민원일지 | 일일민원일지(21.6.28) | 2021-06-29 01:06:11.0 | 2021-06-29 08:28:28.0 | Y | 99 | 0
573 | SAE_CP003_210511_19344 | 기타 | 종합업무일지(2021년 5월 8일) | 2021-05-11 00:05:28.0 | 2021-05-11 00:54:37.0 | Y | 99 | 0
78581 | SAE_FM004_210513_55386 | 토목,건축,기타 일일점검일지 | 토목/건축/기타 일일점검일지 | 2021-05-13 12:05:25.0 | 2021-05-14 00:13:13.0 | Y | 99 | 0
75922 | SAE_FM004_210210_16371 | 토목,건축,기타 일일점검일지 | 토목/건축/기타 일일점검일지 | 2021-02-10 07:02:26.0 | 2021-02-15 00:42:29.0 | Y | 99 | 0
24720 | SAE_FM001_210829_112501 | 전기 일일점검일지 | 전기 일일점검일지 | 2021-08-29 21:08:51.0 | 2021-08-30 00:55:54.0 | Y | 99 | 0
506 | SAE_CP003_210507_18737 | 기타 | 종합업무일지 | 2021-05-07 00:05:39.0 | 2021-05-10 02:37:42.0 | Y | 99 | 0

Index | 결재문서 번호 | 공개문서 유형명 | 결재문서 제목 | 결재상신일시 | 결재완료일시 | 공개여부 | 보존연한 | 문서조회수
41628 | SAE_CP002_201120_0836 | 일일작업일지 | 일일작업일지 | 2020-11-20 22:11:26.0 | 2020-11-23 08:51:09.0 | Y | 99 | 0
30973 | SAE_CP001_210128_5243 | 일일민원일지 | 일일민원일지 | 2021-01-28 14:01:33.0 | 2021-02-01 01:13:19.0 | Y | 99 | 0
74671 | SAE_FM001_210902_114903 | 전기 일일점검일지 | 전기 일일점검일지 | 2021-09-02 21:09:09.0 | 2021-09-02 23:49:44.0 | Y | 99 | 0
66154 | SAE_FM002_210713_87308 | 설비 일일점검일지 | 영선 일일점검일지 | 2021-07-13 22:07:28.0 | 2021-07-16 01:35:05.0 | Y | 99 | 0
9817 | SAE_FM001_210510_53763 | 전기 일일점검일지 | 전기 일일점검일지 | 2021-05-10 12:05:31.0 | 2021-05-27 06:08:06.0 | Y | 99 | 0
21482 | SAE_FM001_210818_106380 | 전기 일일점검일지 | 전기 일일점검일지 | 2021-08-18 08:08:16.0 | 2021-08-19 08:29:33.0 | Y | 99 | 0
16401 | SAE_CP001_210812_35436 | 일일민원일지 | 일일민원일지 | 2021-08-12 08:08:47.0 | 2021-08-13 02:23:36.0 | Y | 99 | 0
55266 | SAE_FM002_201231_5646 | 설비 일일점검일지 | 설비 일일점검일지 | 2020-12-31 12:12:55.0 | 2021-01-05 04:12:15.0 | Y | 99 | 0
70806 | SAE_FM003_210331_35171 | 영선 일일점검일지 | 영선 일일점검일지 | 2021-03-31 05:03:08.0 | 2021-04-01 07:43:43.0 | Y | 99 | 0
78690 | SAE_FM004_210517_57130 | 토목,건축,기타 일일점검일지 | 토목/건축/기타 일일점검일지 | 2021-05-17 09:05:48.0 | 2021-05-17 23:13:42.0 | Y | 99 | 0
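In the sample rows, the minute field of each 결재상신일시 (submission) value equals the month number, while the 결재완료일시 (completion) values show no such pattern. This is consistent with, but does not prove, a formatting slip in the source system, so durations computed from the submission time may be unreliable at minute resolution. A small sketch of the check, using a few timestamps copied from the sample rows (the helper name is hypothetical):

```python
from datetime import datetime

# Submission timestamps (결재상신일시) copied from some of the sample rows.
submitted_raw = [
    "2021-05-06 06:05:43.0",
    "2021-02-09 07:02:19.0",
    "2021-02-04 09:02:19.0",
    "2020-10-18 16:10:40.0",
    "2021-06-29 01:06:11.0",
    "2020-12-31 12:12:55.0",
]


def minute_equals_month(raw: str) -> bool:
    """True when the minute component equals the month number."""
    parsed = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S.%f")
    return parsed.minute == parsed.month


suspicious = [raw for raw in submitted_raw if minute_equals_month(raw)]
print(f"{len(suspicious)} of {len(submitted_raw)} timestamps have minute == month")
```

Running the same check over the full column would show whether the pattern holds beyond the sampled rows.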