Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 1265
Missing cells (%): 1.6%
Duplicate rows: 528
Duplicate rows (%): 5.3%
Total size in memory: 712.9 KiB
Average record size in memory: 73.0 B
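
The percentages and the average record size in this block can be re-derived from the raw counts. The sketch below is only a sanity check on the reported figures; it assumes that the missing-cell percentage is taken over all cells (rows times columns) and that 1 KiB is 1024 bytes. The variable names are illustrative.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 8
n_observations = 10_000
missing_cells = 1_265
duplicate_rows = 528
total_size_kib = 712.9


def pct(part: float, whole: float) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Assumption: missing cells are measured against all cells (rows x columns).
missing_pct = pct(missing_cells, n_variables * n_observations)
# Duplicate rows are measured against the number of rows.
duplicate_pct = pct(duplicate_rows, n_observations)
# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = round(total_size_kib * 1024 / n_observations, 1)

print(missing_pct, duplicate_pct, avg_record_bytes)
```

Under these assumptions the three results agree with the reported 1.6%, 5.3% and 73.0 B.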

Variable types

Categorical: 2
Text: 5
Numeric: 1

Dataset

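Description: 계약종류 (contract type), 기관명 (agency name), 발주부서명 (ordering department name), 건명 (item title), 발주시기 (order period), 발주(예상)금액(천원) (expected order amount, thousand KRW), 전화번호 (telephone number), 사업개요 (project overview)
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15616/S/1/datasetView.do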

Alerts

Dataset has 528 (5.3%) duplicate rows [Duplicates]
발주부서명 has 1264 (12.6%) missing values [Missing]
발주(예상)금액(천원) is highly skewed (γ1 = 82.55924256) [Skewed]
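
The skewness alert reports γ1, the third standardized moment. The report does not say which estimator it uses, so the sketch below uses the plain population form g1 = m3 / m2^1.5 as an assumption, applied to hypothetical toy values rather than the dataset. It only illustrates why a single very large amount pushes γ1 far above zero.

```python
def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (population form, an assumption)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical toy data, not taken from the dataset.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [10, 20, 20, 50, 12_000]  # one very large value dominates

print(skewness(symmetric))
print(skewness(right_tailed))
```

A symmetric sample gives zero, while the long right tail gives a clearly positive value.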

Reproduction

Analysis started: 2024-05-04 04:15:50.160816
Analysis finished: 2024-05-04 04:15:56.209133
Duration: 6.05 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
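
The reported duration follows from the two timestamps. A minimal check, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2024-05-04 04:15:50.160816")
finished = datetime.fromisoformat("2024-05-04 04:15:56.209133")

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()
print(round(duration_seconds, 2))
```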

Variables

계약종류
Categorical

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
용역 | 4527
공사 | 3491
물품 | 1977
용역 | 3
공사 | 2

Length

Max length: 4
Median length: 2
Mean length: 2.0008
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 용역
2nd row: 공사
3rd row: 용역
4th row: 용역
5th row: 용역

Common Values

Value | Count | Frequency (%)
용역 | 4527 | 45.3%
공사 | 3491 | 34.9%
물품 | 1977 | 19.8%
용역 | 3 | < 0.1%
공사 | 2 | < 0.1%
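
The table lists 용역 and 공사 twice, which suggests five distinct raw strings that render identically. The report does not show what distinguishes them. One reading that is consistent with the length statistics (max length 4, mean length 2.0008) is trailing whitespace, and the sketch below rebuilds the column under that assumption. The padded strings are hypothetical; only the counts come from the table.

```python
from collections import Counter

# Counts come from the Common Values table; the trailing spaces are an
# assumption chosen to match the reported length statistics.
values = (
    ["용역"] * 4527
    + ["공사"] * 3491
    + ["물품"] * 1977
    + ["용역  "] * 3  # assumed: two trailing spaces
    + ["공사 "] * 2   # assumed: one trailing space
)

raw_counts = Counter(values)
mean_length = sum(len(v) for v in values) / len(values)

# Stripping whitespace collapses the five raw strings into three categories.
clean_counts = Counter(v.strip() for v in values)

print(len(raw_counts), max(len(v) for v in values), mean_length)
print(clean_counts.most_common())
```

If this reading holds, normalizing with `strip()` before analysis would leave three categories.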

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
용역 | 4530 | 45.3%
공사 | 3493 | 34.9%
물품 | 1977 | 19.8%

기관명
Categorical

Distinct: 23
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
투자출연기관 | 1655
서울시(사업소) | 1539
<NA> | 1264
성북구 | 894
서대문구 | 884
Other values (18) | 3764

Length

Max length: 8
Median length: 7
Mean length: 4.7313
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 투자출연기관
2nd row: 투자출연기관
3rd row: <NA>
4th row: <NA>
5th row: 투자출연기관

Common Values

Value | Count | Frequency (%)
투자출연기관 | 1655 | 16.6%
서울시(사업소) | 1539 | 15.4%
<NA> | 1264 | 12.6%
성북구 | 894 | 8.9%
서대문구 | 884 | 8.8%
영등포구 | 865 | 8.6%
관악구 | 471 | 4.7%
서울시(본청) | 410 | 4.1%
중랑구 | 325 | 3.2%
은평구 | 256 | 2.6%
Other values (13) | 1437 | 14.4%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
투자출연기관 | 1655 | 16.6%
서울시(사업소 | 1539 | 15.4%
na | 1264 | 12.6%
성북구 | 894 | 8.9%
서대문구 | 884 | 8.8%
영등포구 | 865 | 8.6%
관악구 | 471 | 4.7%
서울시(본청 | 410 | 4.1%
중랑구 | 325 | 3.2%
은평구 | 256 | 2.6%
Other values (13) | 1437 | 14.4%

발주부서명
Text

MISSING 

Distinct: 326
Distinct (%): 3.7%
Missing: 1264
Missing (%): 12.6%
Memory size: 156.2 KiB

Length

Max length: 26
Median length: 20
Mean length: 11.904762
Min length: 3

Characters and Unicode

Total characters: 104000
Distinct characters: 234
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
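
As a rough illustration of the character properties mentioned here, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own implementation; the helper name `category_counts` is hypothetical, and the two input strings are values of the kind shown in this variable's sample. The two-letter codes map to the category names used in the report: `Lo` is Other Letter, `Zs` is Space Separator, `Nd` is Decimal Number and `Lu` is Uppercase Letter.

```python
import unicodedata
from collections import Counter


def category_counts(texts):
    """Count Unicode general categories (e.g. 'Lo', 'Zs', 'Nd') over all characters."""
    return Counter(unicodedata.category(ch) for text in texts for ch in text)


# Two department names of the kind shown in the Sample block.
samples = ["서울시설관리공단", "구로구 기획경제국 재무과"]
counts = category_counts(samples)
print(counts)
```

Hangul syllables fall under `Lo`, which is why Other Letter dominates the category tables for the Korean text columns.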

Unique

Unique: 83
Unique (%): 1.0%

Sample

1st row: 서울시설관리공단
2nd row: 서울경제진흥원
3rd row: 서울물재생시설공단
4th row: 구로구 기획경제국 재무과
5th row: 서울시설관리공단

Value | Count | Frequency (%)
재무과 | 5120 | 24.6%
기획재정국 | 3809 | 18.3%
기획경제국 | 977 | 4.7%
성북구 | 894 | 4.3%
서대문구 | 884 | 4.3%
영등포구 | 866 | 4.2%
소방행정과 | 479 | 2.3%
관악구 | 471 | 2.3%
서울시설관리공단 | 326 | 1.6%
중랑구 | 325 | 1.6%
Other values (355) | 6630 | 31.9%

Most occurring characters

ValueCountFrequency (%)
12045
 
11.6%
10032
 
9.6%
6653
 
6.4%
5510
 
5.3%
5336
 
5.1%
5242
 
5.0%
5173
 
5.0%
5059
 
4.9%
4997
 
4.8%
3929
 
3.8%
Other values (224) 40024
38.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 91599
88.1%
Space Separator 12045
 
11.6%
Decimal Number 347
 
0.3%
Uppercase Letter 9
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
10032
 
11.0%
6653
 
7.3%
5510
 
6.0%
5336
 
5.8%
5242
 
5.7%
5173
 
5.6%
5059
 
5.5%
4997
 
5.5%
3929
 
4.3%
1993
 
2.2%
Other values (215) 37675
41.1%
Decimal Number
ValueCountFrequency (%)
0 117
33.7%
5 100
28.8%
1 77
22.2%
2 30
 
8.6%
9 23
 
6.6%
Uppercase Letter
ValueCountFrequency (%)
T 3
33.3%
B 3
33.3%
S 3
33.3%
Space Separator
ValueCountFrequency (%)
12045
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 91599
88.1%
Common 12392
 
11.9%
Latin 9
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
10032
 
11.0%
6653
 
7.3%
5510
 
6.0%
5336
 
5.8%
5242
 
5.7%
5173
 
5.6%
5059
 
5.5%
4997
 
5.5%
3929
 
4.3%
1993
 
2.2%
Other values (215) 37675
41.1%
Common
ValueCountFrequency (%)
12045
97.2%
0 117
 
0.9%
5 100
 
0.8%
1 77
 
0.6%
2 30
 
0.2%
9 23
 
0.2%
Latin
ValueCountFrequency (%)
T 3
33.3%
B 3
33.3%
S 3
33.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 91599
88.1%
ASCII 12401
 
11.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
12045
97.1%
0 117
 
0.9%
5 100
 
0.8%
1 77
 
0.6%
2 30
 
0.2%
9 23
 
0.2%
T 3
 
< 0.1%
B 3
 
< 0.1%
S 3
 
< 0.1%
Hangul
ValueCountFrequency (%)
10032
 
11.0%
6653
 
7.3%
5510
 
6.0%
5336
 
5.8%
5242
 
5.7%
5173
 
5.6%
5059
 
5.5%
4997
 
5.5%
3929
 
4.3%
1993
 
2.2%
Other values (215) 37675
41.1%

건명
Text

Distinct: 9104
Distinct (%): 91.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 50
Median length: 42
Mean length: 20.5906
Min length: 2

Characters and Unicode

Total characters: 205906
Distinct characters: 876
Distinct categories: 16
Distinct scripts: 4
Distinct blocks: 8
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8390
Unique (%): 83.9%

Sample

1st row: 염곡동서지하차도 저압수전 공사 설계용역
2nd row: 돈암시장 어닝 설치
3rd row: 도시계획시설 입지기준 및 실시계획인가 개선방안 연구
4th row: 개발행위허가 기준개선(경사도 등) 및 합리적 관리방안 마련
5th row: 체험관 환경조성 및 개선

Value | Count | Frequency (%)
용역 | 1515 | 3.5%
 | 1215 | 2.8%
구매 | 828 | 1.9%
공사 | 639 | 1.5%
설치 | 476 | 1.1%
2023년 | 475 | 1.1%
2022년 | 463 | 1.1%
제작 | 393 | 0.9%
운영 | 363 | 0.8%
2021년 | 339 | 0.8%
Other values (12403) | 36350 | 84.4%

Most occurring characters

ValueCountFrequency (%)
33586
 
16.3%
2 5805
 
2.8%
5589
 
2.7%
4701
 
2.3%
3283
 
1.6%
3111
 
1.5%
2940
 
1.4%
2887
 
1.4%
0 2839
 
1.4%
2799
 
1.4%
Other values (866) 138366
67.2%

Most occurring categories

ValueCountFrequency (%)
Other Letter 152863
74.2%
Space Separator 33587
 
16.3%
Decimal Number 12671
 
6.2%
Uppercase Letter 1817
 
0.9%
Open Punctuation 1799
 
0.9%
Close Punctuation 1798
 
0.9%
Other Punctuation 640
 
0.3%
Dash Punctuation 283
 
0.1%
Lowercase Letter 266
 
0.1%
Math Symbol 161
 
0.1%
Other values (6) 21
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
5589
 
3.7%
4701
 
3.1%
3283
 
2.1%
3111
 
2.0%
2940
 
1.9%
2887
 
1.9%
2799
 
1.8%
2690
 
1.8%
2556
 
1.7%
2506
 
1.6%
Other values (768) 119801
78.4%
Uppercase Letter
ValueCountFrequency (%)
C 300
16.5%
S 180
9.9%
T 168
9.2%
V 139
 
7.6%
D 136
 
7.5%
I 115
 
6.3%
E 100
 
5.5%
B 94
 
5.2%
P 92
 
5.1%
L 76
 
4.2%
Other values (16) 417
22.9%
Lowercase Letter
ValueCountFrequency (%)
o 32
12.0%
e 29
 
10.9%
i 19
 
7.1%
t 19
 
7.1%
s 18
 
6.8%
r 17
 
6.4%
a 16
 
6.0%
n 15
 
5.6%
l 14
 
5.3%
u 12
 
4.5%
Other values (15) 75
28.2%
Other Punctuation
ValueCountFrequency (%)
, 299
46.7%
? 129
20.2%
. 69
 
10.8%
' 68
 
10.6%
/ 29
 
4.5%
19
 
3.0%
: 13
 
2.0%
& 7
 
1.1%
% 3
 
0.5%
! 2
 
0.3%
Other values (2) 2
 
0.3%
Decimal Number
ValueCountFrequency (%)
2 5805
45.8%
0 2839
22.4%
1 1470
 
11.6%
3 958
 
7.6%
4 593
 
4.7%
9 356
 
2.8%
5 235
 
1.9%
6 167
 
1.3%
7 132
 
1.0%
8 116
 
0.9%
Open Punctuation
ValueCountFrequency (%)
( 1772
98.5%
14
 
0.8%
[ 9
 
0.5%
3
 
0.2%
1
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 1770
98.4%
14
 
0.8%
] 10
 
0.6%
3
 
0.2%
1
 
0.1%
Math Symbol
ValueCountFrequency (%)
~ 112
69.6%
+ 17
 
10.6%
> 16
 
9.9%
< 16
 
9.9%
Space Separator
ValueCountFrequency (%)
33586
> 99.9%
  1
 
< 0.1%
Final Punctuation
ValueCountFrequency (%)
1
50.0%
1
50.0%
Initial Punctuation
ValueCountFrequency (%)
1
50.0%
1
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 283
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 11
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 2
100.0%
Other Number
ValueCountFrequency (%)
¾ 2
100.0%
Letter Number
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 152850
74.2%
Common 50962
 
24.8%
Latin 2081
 
1.0%
Han 13
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
5589
 
3.7%
4701
 
3.1%
3283
 
2.1%
3111
 
2.0%
2940
 
1.9%
2887
 
1.9%
2799
 
1.8%
2690
 
1.8%
2556
 
1.7%
2506
 
1.6%
Other values (764) 119788
78.4%
Latin
ValueCountFrequency (%)
C 300
14.4%
S 180
 
8.6%
T 168
 
8.1%
V 139
 
6.7%
D 136
 
6.5%
I 115
 
5.5%
E 100
 
4.8%
B 94
 
4.5%
P 92
 
4.4%
L 76
 
3.7%
Other values (41) 681
32.7%
Common
ValueCountFrequency (%)
33586
65.9%
2 5805
 
11.4%
0 2839
 
5.6%
( 1772
 
3.5%
) 1770
 
3.5%
1 1470
 
2.9%
3 958
 
1.9%
4 593
 
1.2%
9 356
 
0.7%
, 299
 
0.6%
Other values (37) 1514
 
3.0%
Han
ValueCountFrequency (%)
10
76.9%
1
 
7.7%
1
 
7.7%
1
 
7.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 152844
74.2%
ASCII 52975
 
25.7%
None 58
 
< 0.1%
CJK 13
 
< 0.1%
Compat Jamo 6
 
< 0.1%
Letterlike Symbols 4
 
< 0.1%
Punctuation 4
 
< 0.1%
Number Forms 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
33586
63.4%
2 5805
 
11.0%
0 2839
 
5.4%
( 1772
 
3.3%
) 1770
 
3.3%
1 1470
 
2.8%
3 958
 
1.8%
4 593
 
1.1%
9 356
 
0.7%
C 300
 
0.6%
Other values (73) 3526
 
6.7%
Hangul
ValueCountFrequency (%)
5589
 
3.7%
4701
 
3.1%
3283
 
2.1%
3111
 
2.0%
2940
 
1.9%
2887
 
1.9%
2799
 
1.8%
2690
 
1.8%
2556
 
1.7%
2506
 
1.6%
Other values (763) 119782
78.4%
None
ValueCountFrequency (%)
19
32.8%
14
24.1%
14
24.1%
3
 
5.2%
3
 
5.2%
¾ 2
 
3.4%
1
 
1.7%
1
 
1.7%
  1
 
1.7%
CJK
ValueCountFrequency (%)
10
76.9%
1
 
7.7%
1
 
7.7%
1
 
7.7%
Compat Jamo
ValueCountFrequency (%)
6
100.0%
Letterlike Symbols
ValueCountFrequency (%)
4
100.0%
Number Forms
ValueCountFrequency (%)
2
100.0%
Punctuation
ValueCountFrequency (%)
1
25.0%
1
25.0%
1
25.0%
1
25.0%

발주시기
Text

Distinct: 128
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 120000
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 27
Unique (%): 0.3%

Sample

1st row: 202210(2022)
2nd row: 202010(2020)
3rd row: 202005(2020)
4th row: 202104(2021)
5th row: 202303(2023)

Value | Count | Frequency (%)
202202(2022 | 386 | 3.9%
202301(2023 | 382 | 3.8%
202302(2023 | 352 | 3.5%
202102(2021 | 327 | 3.3%
202303(2023 | 324 | 3.2%
202201(2022 | 324 | 3.2%
202103(2021 | 305 | 3.0%
202203(2022 | 291 | 2.9%
202304(2023 | 279 | 2.8%
202210(2022 | 270 | 2.7%
Other values (118) | 6760 | 67.6%

Most occurring characters

ValueCountFrequency (%)
2 45434
37.9%
0 32095
26.7%
( 10000
 
8.3%
) 10000
 
8.3%
1 9514
 
7.9%
3 5923
 
4.9%
4 3039
 
2.5%
9 1741
 
1.5%
7 724
 
0.6%
5 672
 
0.6%
Other values (2) 858
 
0.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 100000
83.3%
Open Punctuation 10000
 
8.3%
Close Punctuation 10000
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 45434
45.4%
0 32095
32.1%
1 9514
 
9.5%
3 5923
 
5.9%
4 3039
 
3.0%
9 1741
 
1.7%
7 724
 
0.7%
5 672
 
0.7%
6 456
 
0.5%
8 402
 
0.4%
Open Punctuation
ValueCountFrequency (%)
( 10000
100.0%
Close Punctuation
ValueCountFrequency (%)
) 10000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 120000
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 45434
37.9%
0 32095
26.7%
( 10000
 
8.3%
) 10000
 
8.3%
1 9514
 
7.9%
3 5923
 
4.9%
4 3039
 
2.5%
9 1741
 
1.5%
7 724
 
0.6%
5 672
 
0.6%
Other values (2) 858
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 120000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 45434
37.9%
0 32095
26.7%
( 10000
 
8.3%
) 10000
 
8.3%
1 9514
 
7.9%
3 5923
 
4.9%
4 3039
 
2.5%
9 1741
 
1.5%
7 724
 
0.6%
5 672
 
0.6%
Other values (2) 858
 
0.7%

발주(예상)금액(천원)
Real number (ℝ)

SKEWED 

Distinct: 3595
Distinct (%): 35.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4284520.9
Minimum: 0
Maximum: 1.2206765 × 10^10
Zeros: 8
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 4581.4
Q1: 15000
Median: 42251.5
Q3: 200000
95-th percentile: 1660000
Maximum: 1.2206765 × 10^10
Range: 1.2206765 × 10^10
Interquartile range (IQR): 185000

Descriptive statistics

Standard deviation: 1.3075927 × 10^8
Coefficient of variation (CV): 30.518995
Kurtosis: 7601.4408
Mean: 4284520.9
Median Absolute Deviation (MAD): 34251.5
Skewness: 82.559243
Sum: 4.2845209 × 10^10
Variance: 1.7097987 × 10^16
Monotonicity: Not monotonic
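
Several of these statistics are tied together by simple identities: IQR = Q3 - Q1, range = maximum - minimum, CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. The sketch below re-derives them from the rounded figures printed in the report, so small differences in the last digits are expected. The variable names are illustrative.

```python
# Values copied from the quantile and descriptive statistics of this variable.
n = 10_000
mean = 4284520.9
std = 1.3075927e8
q1, q3 = 15000, 200000
minimum, maximum = 0, 1.2206765e10

iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance from the rounded standard deviation
total = mean * n                 # sum, assuming no missing values (Missing: 0)

print(iqr, value_range, round(cv, 3), f"{variance:.4e}", f"{total:.7e}")
```

A CV near 30 means the standard deviation is about thirty times the mean, which matches the heavy right tail flagged by the skewness alert.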
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
20000 | 460 | 4.6%
50000 | 335 | 3.4%
10000 | 253 | 2.5%
15000 | 243 | 2.4%
100000 | 197 | 2.0%
30000 | 195 | 1.9%
22000 | 194 | 1.9%
5000 | 135 | 1.4%
40000 | 132 | 1.3%
200000 | 132 | 1.3%
Other values (3585) | 7724 | 77.2%

Value | Count | Frequency (%)
0 | 8 | 0.1%
50 | 1 | < 0.1%
220 | 1 | < 0.1%
330 | 1 | < 0.1%
362 | 1 | < 0.1%
394 | 1 | < 0.1%
500 | 6 | 0.1%
506 | 1 | < 0.1%
550 | 1 | < 0.1%
600 | 4 | < 0.1%

Value | Count | Frequency (%)
12206764900 | 1 | < 0.1%
1998000000 | 1 | < 0.1%
1860000000 | 1 | < 0.1%
1600000000 | 1 | < 0.1%
1467000000 | 1 | < 0.1%
1000000000 | 1 | < 0.1%
923000000 | 1 | < 0.1%
915000000 | 1 | < 0.1%
900000000 | 1 | < 0.1%
800000000 | 2 | < 0.1%

전화번호
Text

Distinct: 5059
Distinct (%): 50.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 14
Median length: 12
Mean length: 11.2958
Min length: 1

Characters and Unicode

Total characters: 112958
Distinct characters: 39
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3176
Unique (%): 31.8%

Sample

1st row: 02-2290-4695
2nd row: 2241-3972
3rd row: 2133-8408
4th row: 02-2133-8421
5th row: 02-3660-2148

Value | Count | Frequency (%)
02-2241-4304 | 42 | 0.4%
02-6981-5221 | 38 | 0.4%
02-6981-7621 | 30 | 0.3%
02-330-1963 | 29 | 0.3%
02-6981-5421 | 25 | 0.2%
02-2670-3766 | 25 | 0.2%
02-6913-7721 | 23 | 0.2%
02-330-1715 | 23 | 0.2%
02-2670-3432 | 22 | 0.2%
02-6981-8221 | 22 | 0.2%
Other values (5047) | 9726 | 97.2%

Most occurring characters

ValueCountFrequency (%)
2 20744
18.4%
- 18635
16.5%
0 16424
14.5%
3 10419
9.2%
1 8761
7.8%
6 7985
 
7.1%
4 7215
 
6.4%
7 6523
 
5.8%
5 5440
 
4.8%
8 5416
 
4.8%
Other values (29) 5396
 
4.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 94255
83.4%
Dash Punctuation 18635
 
16.5%
Other Letter 33
 
< 0.1%
Space Separator 21
 
< 0.1%
Close Punctuation 11
 
< 0.1%
Other Punctuation 3
 
< 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
2
 
6.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
1
 
3.0%
1
 
3.0%
Other values (15) 15
45.5%
Decimal Number
ValueCountFrequency (%)
2 20744
22.0%
0 16424
17.4%
3 10419
11.1%
1 8761
9.3%
6 7985
 
8.5%
4 7215
 
7.7%
7 6523
 
6.9%
5 5440
 
5.8%
8 5416
 
5.7%
9 5328
 
5.7%
Dash Punctuation
ValueCountFrequency (%)
- 18635
100.0%
Space Separator
ValueCountFrequency (%)
21
100.0%
Close Punctuation
ValueCountFrequency (%)
) 11
100.0%
Other Punctuation
ValueCountFrequency (%)
, 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 112925
> 99.9%
Hangul 33
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
2
 
6.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
1
 
3.0%
1
 
3.0%
Other values (15) 15
45.5%
Common
ValueCountFrequency (%)
2 20744
18.4%
- 18635
16.5%
0 16424
14.5%
3 10419
9.2%
1 8761
7.8%
6 7985
 
7.1%
4 7215
 
6.4%
7 6523
 
5.8%
5 5440
 
4.8%
8 5416
 
4.8%
Other values (4) 5363
 
4.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 112925
> 99.9%
Hangul 33
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 20744
18.4%
- 18635
16.5%
0 16424
14.5%
3 10419
9.2%
1 8761
7.8%
6 7985
 
7.1%
4 7215
 
6.4%
7 6523
 
5.8%
5 5440
 
4.8%
8 5416
 
4.8%
Other values (4) 5363
 
4.7%
Hangul
ValueCountFrequency (%)
2
 
6.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
2
 
6.1%
1
 
3.0%
1
 
3.0%
Other values (15) 15
45.5%

사업개요
Text

Distinct: 8834
Distinct (%): 88.3%
Missing: 1
Missing (%): < 0.1%
Memory size: 156.2 KiB

Length

Max length: 100
Median length: 87
Mean length: 24.566357
Min length: 1

Characters and Unicode

Total characters: 245639
Distinct characters: 972
Distinct categories: 16
Distinct scripts: 4
Distinct blocks: 11
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8002
Unique (%): 80.0%

Sample

1st row: 설계용역
2nd row: - 돈암시장 구조물 일부 철거 공사 - 돈암시장 어닝, 막구조물설치 등
3rd row: 도시계획시설의 용도지역,지구의에 따른 입지제한 필요성 검토 및 주요 실시계획인가 개선방안 및 업무프로세스 모델 제시
4th row: 개발행위허가 기준의 적정성 진단 및 합리적 기준 마련
5th row: 체험관 환경조성 및 개선

Value | Count | Frequency (%)
 | 2468 | 4.4%
 | 1296 | 2.3%
설치 | 744 | 1.3%
구매 | 708 | 1.3%
위한 | 644 | 1.1%
 | 605 | 1.1%
용역 | 593 | 1.0%
정비 | 534 | 0.9%
공사 | 515 | 0.9%
교체 | 504 | 0.9%
Other values (17591) | 47875 | 84.8%

Most occurring characters

ValueCountFrequency (%)
47908
 
19.5%
3659
 
1.5%
3360
 
1.4%
3352
 
1.4%
3328
 
1.4%
3030
 
1.2%
3018
 
1.2%
2874
 
1.2%
2706
 
1.1%
2 2697
 
1.1%
Other values (962) 169707
69.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 173489
70.6%
Space Separator 47910
 
19.5%
Decimal Number 11493
 
4.7%
Other Punctuation 4463
 
1.8%
Uppercase Letter 2642
 
1.1%
Close Punctuation 1367
 
0.6%
Open Punctuation 1367
 
0.6%
Lowercase Letter 1238
 
0.5%
Math Symbol 740
 
0.3%
Other Symbol 530
 
0.2%
Other values (6) 400
 
0.2%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
3659
 
2.1%
3360
 
1.9%
3352
 
1.9%
3328
 
1.9%
3030
 
1.7%
3018
 
1.7%
2874
 
1.7%
2706
 
1.6%
2503
 
1.4%
2368
 
1.4%
Other values (846) 143291
82.6%
Uppercase Letter
ValueCountFrequency (%)
C 353
13.4%
L 322
12.2%
D 309
11.7%
S 243
9.2%
T 194
 
7.3%
V 161
 
6.1%
E 125
 
4.7%
P 121
 
4.6%
I 118
 
4.5%
A 114
 
4.3%
Other values (17) 582
22.0%
Lowercase Letter
ValueCountFrequency (%)
m 713
57.6%
a 119
 
9.6%
o 72
 
5.8%
k 62
 
5.0%
e 32
 
2.6%
n 31
 
2.5%
t 31
 
2.5%
i 29
 
2.3%
s 25
 
2.0%
p 24
 
1.9%
Other values (13) 100
 
8.1%
Other Punctuation
ValueCountFrequency (%)
, 2678
60.0%
. 617
 
13.8%
: 603
 
13.5%
? 166
 
3.7%
/ 143
 
3.2%
' 93
 
2.1%
88
 
2.0%
* 42
 
0.9%
@ 18
 
0.4%
& 8
 
0.2%
Other values (4) 7
 
0.2%
Decimal Number
ValueCountFrequency (%)
2 2697
23.5%
0 2583
22.5%
1 2019
17.6%
3 959
 
8.3%
4 791
 
6.9%
5 749
 
6.5%
9 478
 
4.2%
6 430
 
3.7%
8 404
 
3.5%
7 383
 
3.3%
Other Symbol
ValueCountFrequency (%)
219
41.3%
204
38.5%
38
 
7.2%
27
 
5.1%
24
 
4.5%
7
 
1.3%
7
 
1.3%
2
 
0.4%
1
 
0.2%
1
 
0.2%
Math Symbol
ValueCountFrequency (%)
= 324
43.8%
~ 319
43.1%
× 32
 
4.3%
> 21
 
2.8%
+ 18
 
2.4%
< 15
 
2.0%
11
 
1.5%
Close Punctuation
ValueCountFrequency (%)
) 1343
98.2%
] 10
 
0.7%
9
 
0.7%
4
 
0.3%
1
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 1343
98.2%
[ 10
 
0.7%
9
 
0.7%
4
 
0.3%
1
 
0.1%
Other Number
ValueCountFrequency (%)
² 4
57.1%
1
 
14.3%
1
 
14.3%
³ 1
 
14.3%
Letter Number
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%
Space Separator
ValueCountFrequency (%)
47908
> 99.9%
  2
 
< 0.1%
Final Punctuation
ValueCountFrequency (%)
16
88.9%
2
 
11.1%
Initial Punctuation
ValueCountFrequency (%)
7
77.8%
2
 
22.2%
Dash Punctuation
ValueCountFrequency (%)
- 358
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 173473
70.6%
Common 68266
 
27.8%
Latin 3889
 
1.6%
Han 11
 
< 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
3659
 
2.1%
3360
 
1.9%
3352
 
1.9%
3328
 
1.9%
3030
 
1.7%
3018
 
1.7%
2874
 
1.7%
2706
 
1.6%
2503
 
1.4%
2368
 
1.4%
Other values (841) 143275
82.6%
Common
ValueCountFrequency (%)
47908
70.2%
2 2697
 
4.0%
, 2678
 
3.9%
0 2583
 
3.8%
1 2019
 
3.0%
) 1343
 
2.0%
( 1343
 
2.0%
3 959
 
1.4%
4 791
 
1.2%
5 749
 
1.1%
Other values (52) 5196
 
7.6%
Latin
ValueCountFrequency (%)
m 713
18.3%
C 353
 
9.1%
L 322
 
8.3%
D 309
 
7.9%
S 243
 
6.2%
T 194
 
5.0%
V 161
 
4.1%
E 125
 
3.2%
P 121
 
3.1%
a 119
 
3.1%
Other values (44) 1229
31.6%
Han
ValueCountFrequency (%)
3
27.3%
3
27.3%
3
27.3%
1
 
9.1%
1
 
9.1%

Most occurring blocks

ValueCountFrequency (%)
Hangul 173443
70.6%
ASCII 71421
29.1%
CJK Compat 277
 
0.1%
Geometric Shapes 252
 
0.1%
None 163
 
0.1%
Compat Jamo 29
 
< 0.1%
Punctuation 27
 
< 0.1%
Arrows 11
 
< 0.1%
CJK 11
 
< 0.1%
Number Forms 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
47908
67.1%
2 2697
 
3.8%
, 2678
 
3.7%
0 2583
 
3.6%
1 2019
 
2.8%
) 1343
 
1.9%
( 1343
 
1.9%
3 959
 
1.3%
4 791
 
1.1%
5 749
 
1.0%
Other values (74) 8351
 
11.7%
Hangul
ValueCountFrequency (%)
3659
 
2.1%
3360
 
1.9%
3352
 
1.9%
3328
 
1.9%
3030
 
1.7%
3018
 
1.7%
2874
 
1.7%
2706
 
1.6%
2503
 
1.4%
2368
 
1.4%
Other values (837) 143245
82.6%
Geometric Shapes
ValueCountFrequency (%)
219
86.9%
24
 
9.5%
7
 
2.8%
2
 
0.8%
CJK Compat
ValueCountFrequency (%)
204
73.6%
38
 
13.7%
27
 
9.7%
7
 
2.5%
1
 
0.4%
None
ValueCountFrequency (%)
88
54.0%
× 32
 
19.6%
9
 
5.5%
9
 
5.5%
º 6
 
3.7%
² 4
 
2.5%
4
 
2.5%
4
 
2.5%
  2
 
1.2%
1
 
0.6%
Other values (4) 4
 
2.5%
Compat Jamo
ValueCountFrequency (%)
22
75.9%
6
 
20.7%
1
 
3.4%
Punctuation
ValueCountFrequency (%)
16
59.3%
7
25.9%
2
 
7.4%
2
 
7.4%
Arrows
ValueCountFrequency (%)
11
100.0%
CJK
ValueCountFrequency (%)
3
27.3%
3
27.3%
3
27.3%
1
 
9.1%
1
 
9.1%
Enclosed Alphanum
ValueCountFrequency (%)
1
50.0%
1
50.0%
Number Forms
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

Interactions


Correlations

 | 계약종류 | 기관명 | 발주(예상)금액(천원)
계약종류 | 1.000 | 0.303 | 0.000
기관명 | 0.303 | 1.000 | 0.175
발주(예상)금액(천원) | 0.000 | 0.175 | 1.000

 | 기관명 | 계약종류
기관명 | 1.000 | 0.155
계약종류 | 0.155 | 1.000

 | 발주(예상)금액(천원) | 계약종류 | 기관명
발주(예상)금액(천원) | 1.000 | 0.000 | 0.091
계약종류 | 0.000 | 1.000 | 0.155
기관명 | 0.091 | 0.155 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
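
Nullity correlation can be sketched as the Pearson correlation between per-column missingness indicators. The code below is a minimal standard-library illustration of that idea, not the code that produced the heatmap. The helper name `nullity_correlation` and the toy columns are hypothetical, and `None` stands in for a missing cell.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None  # undefined when a column is never (or always) missing
    return cov / sqrt(var_a * var_b)


# Hypothetical toy columns, not taken from the dataset.
agency = ["구로구", None, None, "성북구", "중랑구", None]
department = ["재무과", None, None, "재무과", "재무과", None]
title = ["a", "b", "c", "d", "e", "f"]

same_pattern = nullity_correlation(agency, department)
no_missing = nullity_correlation(agency, title)
print(same_pattern, no_missing)
```

Two columns that are always missing together score 1, and a column with no missing values has no defined nullity correlation.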

Sample

 | 계약종류 | 기관명 | 발주부서명 | 건명 | 발주시기 | 발주(예상)금액(천원) | 전화번호 | 사업개요
22401 | 용역 | 투자출연기관 | 서울시설관리공단 | 염곡동서지하차도 저압수전 공사 설계용역 | 202210(2022) | 10000 | 02-2290-4695 | 설계용역
49442 | 공사 | 투자출연기관 | 서울경제진흥원 | 돈암시장 어닝 설치 | 202010(2020) | 400000 | 2241-3972 | - 돈암시장 구조물 일부 철거 공사 - 돈암시장 어닝, 막구조물설치 등
51967 | 용역 | <NA> | <NA> | 도시계획시설 입지기준 및 실시계획인가 개선방안 연구 | 202005(2020) | 200000 | 2133-8408 | 도시계획시설의 용도지역,지구의에 따른 입지제한 필요성 검토 및 주요 실시계획인가 개선방안 및 업무프로세스 모델 제시
41181 | 용역 | <NA> | <NA> | 개발행위허가 기준개선(경사도 등) 및 합리적 관리방안 마련 | 202104(2021) | 200000 | 02-2133-8421 | 개발행위허가 기준의 적정성 진단 및 합리적 기준 마련
14833 | 용역 | 투자출연기관 | 서울물재생시설공단 | 체험관 환경조성 및 개선 | 202303(2023) | 20000 | 02-3660-2148 | 체험관 환경조성 및 개선
13565 | 공사 | 구로구 | 구로구 기획경제국 재무과 | 공원관리팀 기간제근로자 대기실 교체사업 | 202303(2023) | 120000 | 02-860-3084 | 공원 내 노후 기간제근로자 대기실 교체 등
26946 | 공사 | <NA> | <NA> | 개나리,반달어린이공원 창의놀이터 조성공사 | 202204(2022) | 400000 | 02-820-1395 | 어린이공원 창의놀이터 조성
10324 | 용역 | <NA> | <NA> | 청년 취업박람회(오프라인) | 202305(2023) | 18000 | 02-2148-2254 | 청년 취업박람회
57946 | 용역 | 투자출연기관 | 서울시설관리공단 | 개인택시사업자 콜장비 이전설치 | 201912(2019) | 11500 | 02-2290-7232 | 개인택시사업자 콜장비 이전설치
15827 | 용역 | 서대문구 | 서대문구 기획재정국 재무과 | 궁동산 등산로 정비사업 실시설계 용역 | 202302(2023) | 12000 | 02-330-1715 | 목재 데크, 목재계단 설치 등

 | 계약종류 | 기관명 | 발주부서명 | 건명 | 발주시기 | 발주(예상)금액(천원) | 전화번호 | 사업개요
36672 | 용역 | <NA> | <NA> | 2022년 수도요금 청구서 등 전산출력 용역 | 202110(2021) | 295533 | 02-3146-1564 | 2022년 수도요금 청구서 전산출력 용역
39701 | 공사 | 서울시(사업소) | 용산소방서 소방행정과 | 용산소방서 저녹스 진공온수보일러 교체공사 | 202106(2021) | 32500 | 02-6943-1421 | 용산소방서 저녹스 진공온수보일러 교체공사
6183 | 용역 | 성북구 | 성북구 기획재정국 재무과 | 정릉1동어린이집 그린리모델링 공사 감리용역 | 202311(2023) | 4250 | 02-2241-2556 | 그린리모델링으로 에너지소비량 절감 및 쾌적한 보육 환경 조성
39463 | 용역 | 투자출연기관 | 서울특별시평생교육진흥원 | 서울시민 평생학습 참여 실태조사 | 202106(2021) | 134000 | 02-719-6431 | 만 25~79세 서울시민 7천명 기준 표본설계 서울시민 대상 가구 방문 시민 대면조사 실시 데이터 분석 및 분석보고서 발간
53758 | 용역 | 서울시(사업소) | 남부도로사업소 기전과 | 터널 및 지하차도 비상발전기 정비 | 202004(2020) | 4800 | 3284-5495 | 비상발전기 유지보수
43718 | 용역 | 구로구 | 구로구 기획경제국 재무과 | 잣절 유아숲체험원 프로그램 용역 | 202103(2021) | 59000 | 02-860-3147 | 유아숲체험원 프로그램 운영
30259 | 용역 | 서울시(사업소) | 119특수구조단 행정지원과 | 2022년 119특수구조단 코로나19 방역소독 용역(1차) | 202202(2022) | 51417 | 02-3706-1916 | 2022년 코로나19대응 청사,소방차량,소방선박 방역 소독 용역
29433 | 용역 | 서울시(본청) | 관광체육국 관광산업과 | 서울 관광 M.V.P 테마 코스 확산 | 202203(2022) | 100000 | 02-2133-2786 | 서울관광 M.V.P 코스 활성화
25517 | 공사 | 중랑구 | 중랑구 기획재정국 재무과 | 2022년 가로변 녹지량 확충사업 | 202206(2022) | 50000 | 02-2094-2382 | 가로변 유휴공간 녹화
38860 | 용역 | 성북구 | 성북구 기획재정국 재무과 | 천장산(청량근린공원) 단절된 산책로 연결사업 실시설계용역 | 202107(2021) | 16000 | 02-2241-3665 | 실시설계용역

Duplicate rows

Most frequently occurring

 | 계약종류 | 기관명 | 발주부서명 | 건명 | 발주시기 | 발주(예상)금액(천원) | 전화번호 | 사업개요 | # duplicates
146 | 공사 | 영등포구 | 영등포구 기획재정국 재무과 | 공원 소나무 생육환경 개선사업 | 202303(2023) | 22000 | 02-2670-3761 | 공원 수목 정비 | 5
5 | 공사 | 관악구 | 관악구 기획경제국 재무과 | 2022년 배전선로 근접 가로수 가지치기 | 202202(2022) | 221000 | 02-879-6531 | 배전선로 근접 가로수 가지치기 | 4
9 | 공사 | 관악구 | 관악구 기획경제국 재무과 | 2022년 주차구획선 시설정비공사 | 202201(2022) | 110000 | 02-879-6902 | 주차구획선 정비공사(연간단가) | 4
22 | 공사 | 관악구 | 관악구 기획경제국 재무과 | 노후어린이공원 환경개선 사업(색동) | 202203(2022) | 200000 | 02-879-6504 | 색동어린이공원 환경개선 사업 | 4
26 | 공사 | 관악구 | 관악구 기획경제국 재무과 | 봉천,님현 보안등 유지보수공사(연간단가) | 202201(2022) | 450000 | 02-879-6782 | 보안등 유지보수 1식 | 4
28 | 공사 | 관악구 | 관악구 기획경제국 재무과 | 봉천초교 보행환경개선사업 | 202203(2022) | 280000 | 02-879-6861 | 보도설치, 미끄럼방지포장 | 4
32 | 공사 | 관악구 | 관악구 기획경제국 재무과 | 신림로11길 일대 등 3개소 하수관로 개량공사 | 202201(2022) | 894600 | 02-879-6813 | 하수관로 개량 D450~800㎜, L=710m | 4
33 | 공사 | 관악구 | 관악구 기획경제국 재무과 | 신림로73 ~ 137 일대 보도정비 공사 | 202201(2022) | 500000 | 02-879-6771 | 보도 및 보차도경계석, 측구 설치 등 | 4
40 | 공사 | 관악구 | 관악구 기획경제국 재무과 | 태양어린이공원 노후시설물 정비 | 202202(2022) | 300000 | 02-879-6504 | 태양어린이공원 노후시설물 정비 공사 | 4
46 | 공사 | 광진구 | 광진구 기획경제국 재무과 | 중곡빗물펌프장 관리사택 등 2개소 환경개선 | 202107(2021) | 50000 | 02-450-1628 | 실내건축리모델링 | 4
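
A listing of this kind can be approximated by counting identical rows and keeping the groups that occur more than once. The sketch below uses hypothetical toy rows and a hypothetical helper, `duplicate_groups`. It illustrates the "# duplicates" column as a group size and makes no claim about how the report arrives at its headline figure of 528 duplicate rows.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return (row, count) pairs for rows occurring more than once, most frequent first."""
    counts = Counter(tuple(r) for r in rows)
    return [(row, n) for row, n in counts.most_common() if n > 1]


# Hypothetical toy rows with three columns, not taken from the dataset.
rows = [
    ("공사", "관악구", "봉천초교 보행환경개선사업"),
    ("공사", "관악구", "봉천초교 보행환경개선사업"),
    ("용역", "성북구", "실시설계용역"),
    ("공사", "관악구", "봉천초교 보행환경개선사업"),
    ("용역", "성북구", "실시설계용역"),
    ("물품", "중랑구", "구매"),
]

groups = duplicate_groups(rows)
for row, n in groups:
    print(n, row)
```

Rows that appear only once are omitted, which mirrors a table that shows only the most frequently repeated records.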