Overview

Dataset statistics

Number of variables: 3
Number of observations: 280
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 4
Duplicate rows (%): 1.4%
Total size in memory: 7.0 KiB
Average record size in memory: 25.5 B

Variable types

Text: 2
Numeric: 1

Alerts

Dataset has 4 (1.4%) duplicate rows (Duplicates)

Reproduction

Analysis started: 2024-04-11 03:05:23.515519
Analysis finished: 2024-04-11 03:05:25.058088
Duration: 1.54 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

담당부서/공공기관 (department in charge / public institution)
Text

Distinct: 94
Distinct (%): 33.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Length

Max length: 10
Median length: 9
Mean length: 6.45
Min length: 3

Characters and Unicode

Total characters: 1806
Distinct characters: 151
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
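
As a rough illustration of how such properties can be read off in practice, the sketch below uses Python's standard `unicodedata` module to tally general categories for two values that appear in this column. The helper name `category_counts` is hypothetical, and this is not necessarily how the profiling tool builds its tables.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count Unicode general-category codes (e.g. 'Lo', 'Lu', 'Nd') in text."""
    return Counter(unicodedata.category(ch) for ch in text)


# Two values taken from this variable's sample and frequency listings.
for value in ("AI빅데이터산업과", "119종합상황실"):
    print(value, len(value), dict(category_counts(value)))
```

The codes `Lo`, `Lu` and `Nd` correspond to the Other Letter, Uppercase Letter and Decimal Number categories reported for this variable.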

Unique

Unique: 41
Unique (%): 14.6%

Sample

1st row: AI빅데이터산업과
2nd row: AI빅데이터산업과
3rd row: AI빅데이터산업과
4th row: 청년기회과
5th row: 청년기회과
Value | Count | Frequency (%)
경기콘텐츠진흥원 | 16 | 5.7%
ai빅데이터산업과 | 12 | 4.3%
정보기획담당관 | 11 | 3.9%
정보통신보안담당관 | 10 | 3.6%
경기도일자리재단 | 9 | 3.2%
경기문화재단 | 8 | 2.9%
경기도시장상권진흥원 | 8 | 2.9%
교육지원과 | 8 | 2.9%
119종합상황실 | 7 | 2.5%
미세먼지연구부 | 6 | 2.1%
Other values (84) | 185 | 66.1%

Most occurring characters

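Value | Count | Frequency (%)
[value lost] | 123 | 6.8%
[value lost] | 110 | 6.1%
[value lost] | 89 | 4.9%
[value lost] | 76 | 4.2%
[value lost] | 63 | 3.5%
[value lost] | 57 | 3.2%
[value lost] | 55 | 3.0%
[value lost] | 55 | 3.0%
[value lost] | 48 | 2.7%
[value lost] | 46 | 2.5%
Other values (141) | 1084 | 60.0%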

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1752 | 97.0%
Decimal Number | 30 | 1.7%
Uppercase Letter | 24 | 1.3%

Most frequent character per category

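Other Letter
Value | Count | Frequency (%)
[value lost] | 123 | 7.0%
[value lost] | 110 | 6.3%
[value lost] | 89 | 5.1%
[value lost] | 76 | 4.3%
[value lost] | 63 | 3.6%
[value lost] | 57 | 3.3%
[value lost] | 55 | 3.1%
[value lost] | 55 | 3.1%
[value lost] | 48 | 2.7%
[value lost] | 46 | 2.6%
Other values (137) | 1030 | 58.8%

Decimal Number
Value | Count | Frequency (%)
1 | 20 | 66.7%
9 | 10 | 33.3%

Uppercase Letter
Value | Count | Frequency (%)
I | 12 | 50.0%
A | 12 | 50.0%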

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1752 | 97.0%
Common | 30 | 1.7%
Latin | 24 | 1.3%

Most frequent character per script

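Hangul
Value | Count | Frequency (%)
[value lost] | 123 | 7.0%
[value lost] | 110 | 6.3%
[value lost] | 89 | 5.1%
[value lost] | 76 | 4.3%
[value lost] | 63 | 3.6%
[value lost] | 57 | 3.3%
[value lost] | 55 | 3.1%
[value lost] | 55 | 3.1%
[value lost] | 48 | 2.7%
[value lost] | 46 | 2.6%
Other values (137) | 1030 | 58.8%

Common
Value | Count | Frequency (%)
1 | 20 | 66.7%
9 | 10 | 33.3%

Latin
Value | Count | Frequency (%)
I | 12 | 50.0%
A | 12 | 50.0%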

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1752 | 97.0%
ASCII | 54 | 3.0%

Most frequent character per block

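Hangul
Value | Count | Frequency (%)
[value lost] | 123 | 7.0%
[value lost] | 110 | 6.3%
[value lost] | 89 | 5.1%
[value lost] | 76 | 4.3%
[value lost] | 63 | 3.6%
[value lost] | 57 | 3.3%
[value lost] | 55 | 3.1%
[value lost] | 55 | 3.1%
[value lost] | 48 | 2.7%
[value lost] | 46 | 2.6%
Other values (137) | 1030 | 58.8%

ASCII
Value | Count | Frequency (%)
1 | 20 | 37.0%
I | 12 | 22.2%
A | 12 | 22.2%
9 | 10 | 18.5%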

사업명 (project name)
Text

Distinct: 275
Distinct (%): 98.2%
Missing: 0
Missing (%): 0.0%
Memory size: 2.3 KiB

Length

Max length: 53
Median length: 34
Mean length: 19.346429
Min length: 5

Characters and Unicode

Total characters: 5417
Distinct characters: 373
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 270
Unique (%): 96.4%

Sample

1st row: 인공지능 리터러시 교육 운영 계획
2nd row: 인공지능 전문인력 양성
3rd row: 지역경제 빅데이터 플랫폼 운영
4th row: 경기도 청년지원사업단 운영
5th row: 시스템 구축 등 운영비
Value | Count | Frequency (%)
운영 | 84 | 7.6%
[value lost] | 47 | 4.2%
유지보수 | 41 | 3.7%
유지관리 | 40 | 3.6%
경기도 | 31 | 2.8%
시스템 | 30 | 2.7%
2024년 | 26 | 2.3%
홈페이지 | 20 | 1.8%
구축 | 18 | 1.6%
플랫폼 | 17 | 1.5%
Other values (520) | 753 | 68.0%
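
The word counts listed for this variable look like a tally of whitespace-separated tokens across all values. A minimal sketch of that kind of tally is shown here, using only the five sample rows listed for this variable. Lowercasing followed by whitespace splitting is an assumption and is not confirmed by the report.

```python
from collections import Counter

# The five sample rows shown for this variable (not the full 280-row column).
rows = [
    "인공지능 리터러시 교육 운영 계획",
    "인공지능 전문인력 양성",
    "지역경제 빅데이터 플랫폼 운영",
    "경기도 청년지원사업단 운영",
    "시스템 구축 등 운영비",
]

# Assumed tokenisation: lowercase each value, then split on whitespace.
word_counts = Counter(word for row in rows for word in row.lower().split())

for word, count in word_counts.most_common(3):
    print(word, count)
```

Because tokens are compared whole, `운영비` is counted separately from `운영` in this sketch.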

Most occurring characters

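Value | Count | Frequency (%)
(space) | 830 | 15.3%
[value lost] | 201 | 3.7%
[value lost] | 144 | 2.7%
[value lost] | 132 | 2.4%
[value lost] | 130 | 2.4%
[value lost] | 126 | 2.3%
[value lost] | 121 | 2.2%
[value lost] | 118 | 2.2%
[value lost] | 116 | 2.1%
[value lost] | 116 | 2.1%
Other values (363) | 3383 | 62.5%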

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4125 | 76.1%
Space Separator | 830 | 15.3%
Decimal Number | 177 | 3.3%
Uppercase Letter | 128 | 2.4%
Close Punctuation | 56 | 1.0%
Open Punctuation | 56 | 1.0%
Other Punctuation | 19 | 0.4%
Lowercase Letter | 16 | 0.3%
Dash Punctuation | 5 | 0.1%
Math Symbol | 4 | 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[value lost] | 201 | 4.9%
[value lost] | 144 | 3.5%
[value lost] | 132 | 3.2%
[value lost] | 130 | 3.2%
[value lost] | 126 | 3.1%
[value lost] | 121 | 2.9%
[value lost] | 118 | 2.9%
[value lost] | 116 | 2.8%
[value lost] | 116 | 2.8%
[value lost] | 115 | 2.8%
Other values (308) | 2806 | 68.0%

Uppercase Letter
Value | Count | Frequency (%)
S | 19 | 14.8%
D | 13 | 10.2%
I | 12 | 9.4%
G | 9 | 7.0%
R | 8 | 6.2%
C | 8 | 6.2%
B | 7 | 5.5%
A | 7 | 5.5%
E | 6 | 4.7%
W | 6 | 4.7%
Other values (12) | 33 | 25.8%

Lowercase Letter
Value | Count | Frequency (%)
e | 3 | 18.8%
r | 2 | 12.5%
c | 2 | 12.5%
o | 2 | 12.5%
m | 1 | 6.2%
d | 1 | 6.2%
u | 1 | 6.2%
a | 1 | 6.2%
n | 1 | 6.2%
y | 1 | 6.2%

Decimal Number
Value | Count | Frequency (%)
2 | 81 | 45.8%
4 | 39 | 22.0%
0 | 38 | 21.5%
1 | 9 | 5.1%
5 | 3 | 1.7%
3 | 3 | 1.7%
9 | 2 | 1.1%
6 | 1 | 0.6%
7 | 1 | 0.6%

Other Punctuation
Value | Count | Frequency (%)
/ | 5 | 26.3%
, | 5 | 26.3%
' | 4 | 21.1%
· | 3 | 15.8%
[value lost] | 1 | 5.3%
& | 1 | 5.3%

Math Symbol
Value | Count | Frequency (%)
~ | 3 | 75.0%
[value lost] | 1 | 25.0%

Space Separator
Value | Count | Frequency (%)
(space) | 830 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 56 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 56 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 5 | 100.0%

Final Punctuation
Value | Count | Frequency (%)
[value lost] | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4125 | 76.1%
Common | 1148 | 21.2%
Latin | 144 | 2.7%

Most frequent character per script

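Hangul
Value | Count | Frequency (%)
[value lost] | 201 | 4.9%
[value lost] | 144 | 3.5%
[value lost] | 132 | 3.2%
[value lost] | 130 | 3.2%
[value lost] | 126 | 3.1%
[value lost] | 121 | 2.9%
[value lost] | 118 | 2.9%
[value lost] | 116 | 2.8%
[value lost] | 116 | 2.8%
[value lost] | 115 | 2.8%
Other values (308) | 2806 | 68.0%

Latin
Value | Count | Frequency (%)
S | 19 | 13.2%
D | 13 | 9.0%
I | 12 | 8.3%
G | 9 | 6.2%
R | 8 | 5.6%
C | 8 | 5.6%
B | 7 | 4.9%
A | 7 | 4.9%
E | 6 | 4.2%
W | 6 | 4.2%
Other values (23) | 49 | 34.0%

Common
Value | Count | Frequency (%)
(space) | 830 | 72.3%
2 | 81 | 7.1%
) | 56 | 4.9%
( | 56 | 4.9%
4 | 39 | 3.4%
0 | 38 | 3.3%
1 | 9 | 0.8%
- | 5 | 0.4%
/ | 5 | 0.4%
, | 5 | 0.4%
Other values (12) | 24 | 2.1%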

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 4125 | 76.1%
ASCII | 1286 | 23.7%
None | 3 | 0.1%
Punctuation | 2 | < 0.1%
Math Operators | 1 | < 0.1%

Most frequent character per block

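ASCII
Value | Count | Frequency (%)
(space) | 830 | 64.5%
2 | 81 | 6.3%
) | 56 | 4.4%
( | 56 | 4.4%
4 | 39 | 3.0%
0 | 38 | 3.0%
S | 19 | 1.5%
D | 13 | 1.0%
I | 12 | 0.9%
G | 9 | 0.7%
Other values (41) | 133 | 10.3%

Hangul
Value | Count | Frequency (%)
[value lost] | 201 | 4.9%
[value lost] | 144 | 3.5%
[value lost] | 132 | 3.2%
[value lost] | 130 | 3.2%
[value lost] | 126 | 3.1%
[value lost] | 121 | 2.9%
[value lost] | 118 | 2.9%
[value lost] | 116 | 2.8%
[value lost] | 116 | 2.8%
[value lost] | 115 | 2.8%
Other values (308) | 2806 | 68.0%

None
Value | Count | Frequency (%)
· | 3 | 100.0%

Punctuation
Value | Count | Frequency (%)
[value lost] | 1 | 50.0%
[value lost] | 1 | 50.0%

Math Operators
Value | Count | Frequency (%)
[value lost] | 1 | 100.0%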

사업비(천원) (project budget, thousand KRW)
Real number (ℝ)

Distinct: 232
Distinct (%): 82.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 394223.87
Minimum: 1494
Maximum: 20174091
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 2.6 KiB

Quantile statistics

Minimum: 1494
5th percentile: 7498.5
Q1: 32370.75
Median: 92442.5
Q3: 307120.25
95th percentile: 1603538.9
Maximum: 20174091
Range: 20172597
Interquartile range (IQR): 274749.5
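
The two derived rows in this listing follow directly from the others: the range is the maximum minus the minimum, and the IQR is Q3 minus Q1. A quick check in Python, with the values copied from the listing (the variable names are illustrative only):

```python
# Quantile values copied from the report (unit: thousand KRW).
minimum = 1494
q1 = 32370.75
q3 = 307120.25
maximum = 20174091

value_range = maximum - minimum  # spread between the two extremes
iqr = q3 - q1                    # spread of the middle 50% of observations

print(value_range)  # 20172597
print(iqr)          # 274749.5
```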

Descriptive statistics

Standard deviation: 1371624.1
Coefficient of variation (CV): 3.4793025
Kurtosis: 157.78412
Mean: 394223.87
Median Absolute Deviation (MAD): 77178.5
Skewness: 11.454446
Sum: 1.1038268 × 10^8
Variance: 1.8813527 × 10^12
Monotonicity: Not monotonic
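
Several of these figures are tied together by simple identities: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. The sketch below re-derives them from the rounded values printed in the report, so agreement is only expected up to the displayed precision.

```python
# Rounded values as printed in the report.
n = 280             # number of observations
mean = 394223.87
std = 1371624.1

cv = std / mean      # coefficient of variation
variance = std ** 2  # variance is the squared standard deviation
total = mean * n     # sum of all observations

print(f"CV       = {cv:.7f}")
print(f"Variance = {variance:.7e}")
print(f"Sum      = {total:.7e}")
```

A CV well above 1, together with the large skewness and kurtosis, points to a heavily right-skewed budget distribution dominated by a few very large projects.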
[Figure: histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
50000 | 8 | 2.9%
35000 | 5 | 1.8%
40000 | 5 | 1.8%
45000 | 4 | 1.4%
20000 | 3 | 1.1%
30000 | 3 | 1.1%
19000 | 3 | 1.1%
60000 | 3 | 1.1%
150000 | 3 | 1.1%
38000 | 3 | 1.1%
Other values (222) | 240 | 85.7%

Minimum 10 values
Value | Count | Frequency (%)
1494 | 1 | 0.4%
1800 | 1 | 0.4%
2310 | 1 | 0.4%
3916 | 1 | 0.4%
4020 | 2 | 0.7%
4200 | 1 | 0.4%
5000 | 1 | 0.4%
5130 | 1 | 0.4%
5600 | 1 | 0.4%
5859 | 1 | 0.4%

Maximum 10 values
Value | Count | Frequency (%)
20174091 | 1 | 0.4%
6950918 | 1 | 0.4%
3930529 | 1 | 0.4%
3841451 | 1 | 0.4%
3592000 | 1 | 0.4%
2794184 | 1 | 0.4%
2790000 | 1 | 0.4%
2096554 | 1 | 0.4%
2065000 | 1 | 0.4%
1861944 | 1 | 0.4%

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation plot]

Variable | 담당부서/공공기관 | 사업비(천원)
담당부서/공공기관 | 1.000 | 0.000
사업비(천원) | 0.000 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index | 담당부서/공공기관 | 사업명 | 사업비(천원)
0 | AI빅데이터산업과 | 인공지능 리터러시 교육 운영 계획 | 750000
1 | AI빅데이터산업과 | 인공지능 전문인력 양성 | 300000
2 | AI빅데이터산업과 | 지역경제 빅데이터 플랫폼 운영 | 500000
3 | 청년기회과 | 경기도 청년지원사업단 운영 | 131000
4 | 청년기회과 | 시스템 구축 등 운영비 | 361205
5 | 청년기회과 | 청년 면접수당 | 50000
6 | 청년기회과 | 청년노동자 통장 운영 | 625000
7 | 교육지원과 | 교육관리시스템 기능보강(교육관리시스템 기능보강) | 40000
8 | 교육지원과 | 교육관리시스템 유지관리 및 학습지원센터 운영(이러닝 운영 및 콘텐츠 제작임차) | 767000
9 | 교육지원과 | 도서관 자동화(RFID)장비 유지관리 | 4020

Last rows

Index | 담당부서/공공기관 | 사업명 | 사업비(천원)
270 | 벤처스타트업과 | 경기 스타트업 플랫폼 운영 | 380000
271 | AI빅데이터산업과 | AI기반 맞춤형 돌봄 서비스 실증 확대 | 650000
272 | AI빅데이터산업과 | 경기 생성형 AI 데이터 플랫폼 구축 컨설팅 | 405000
273 | AI빅데이터산업과 | 경기도 인공지능 교육센터 구축 | 450000
274 | AI빅데이터산업과 | 공공데이터 개방 및 품질관리 | 500000
275 | AI빅데이터산업과 | 데이터 분석 | 540000
276 | AI빅데이터산업과 | 데이터 분석시스템 운영 및 유지보수 | 433000
277 | AI빅데이터산업과 | 마이데이터 통합 플랫폼 운영 | 935000
278 | AI빅데이터산업과 | 발달장애인 AI Care 사업 | 200000
279 | AI빅데이터산업과 | 분석용 민간데이터 구매 | 800000

Duplicate rows

Most frequently occurring

Index | 담당부서/공공기관 | 사업명 | 사업비(천원) | # duplicates
0 | 교육지원과 | 교육관리시스템 기능보강(교육관리시스템 기능보강) | 40000 | 2
1 | 교육지원과 | 교육관리시스템 유지관리 및 학습지원센터 운영(이러닝 운영 및 콘텐츠 제작임차) | 767000 | 2
2 | 교육지원과 | 도서관 자동화(RFID)장비 유지관리 | 4020 | 2
3 | 교육지원과 | 전산교육장비 유지관리(전산교육장비 유지관리) | 50000 | 2
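
The headline figure of 4 duplicate rows (1.4%) is consistent with counting the distinct rows that occur more than once (the four listed here) and dividing by the 280 observations. A minimal sketch of that counting follows. It runs on a small made-up list of rows rather than the real dataset, and it assumes this is how the figure is defined.

```python
from collections import Counter

# Hypothetical toy rows (department, project, budget); not the real dataset.
rows = [
    ("dept_a", "project_1", 40000),
    ("dept_a", "project_1", 40000),  # repeat of the row above
    ("dept_a", "project_2", 4020),
    ("dept_b", "project_3", 50000),
    ("dept_b", "project_3", 50000),  # repeat of the row above
]

occurrences = Counter(rows)

# Keep only rows seen more than once, with how often each occurs.
duplicates = {row: n for row, n in occurrences.items() if n > 1}

print(len(duplicates))                       # distinct duplicated rows
print(f"{len(duplicates) / len(rows):.1%}")  # share of all rows

# Same arithmetic with the report's own numbers: 4 out of 280.
print(f"{4 / 280:.1%}")
```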