Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 10000
Missing cells (%): 12.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 742.2 KiB
Average record size in memory: 76.0 B
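
The derived figures in this table follow from the raw counts. The sketch below recomputes the missing-cell percentage and the average record size from the numbers reported above; the variable names are illustrative and not part of the report. All 10000 missing cells are accounted for by the 수정id column, which is 100% missing.

```python
# Figures copied from the "Dataset statistics" table of this report.
n_variables = 8
n_observations = 10_000
missing_cells = 10_000
total_size_kib = 742.2

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```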

Variable types

Numeric: 3
Text: 1
DateTime: 2
Unsupported: 1
Boolean: 1

Dataset

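Description: 회계연도 (fiscal year), 제안번호 (proposal number), 순번 (sequence number), 내용 (content), 등록일 (registration date), 수정id (modifier ID), 수정일 (modification date), 삭제여부 (deleted flag)
Author: 서울특별시 (Seoul Metropolitan Government)
URL: https://data.seoul.go.kr/dataList/OA-15717/S/1/datasetView.do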

Alerts

회계연도 is highly overall correlated with 순번 [High correlation]
순번 is highly overall correlated with 회계연도 [High correlation]
삭제여부 is highly imbalanced (81.9%) [Imbalance]
수정id has 10000 (100.0%) missing values [Missing]
수정id is an unsupported type, check if it needs cleaning or further analysis [Unsupported]

Reproduction

Analysis started: 2024-05-18 00:04:00.493158
Analysis finished: 2024-05-18 00:04:07.941693
Duration: 7.45 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
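
The reported duration is the difference between the two timestamps. A minimal check, using only the values printed above:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
started = datetime.fromisoformat("2024-05-18 00:04:00.493158")
finished = datetime.fromisoformat("2024-05-18 00:04:07.941693")

duration_seconds = (finished - started).total_seconds()
print(f"Duration: {duration_seconds:.2f} seconds")
```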

Variables

회계연도
Real number (ℝ)

HIGH CORRELATION 

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2018.3337
Minimum: 2017
Maximum: 2024
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2017
5-th percentile: 2017
Q1: 2018
Median: 2018
Q3: 2018
95-th percentile: 2020
Maximum: 2024
Range: 7
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.058894
Coefficient of variation (CV): 0.00052463771
Kurtosis: 7.5729113
Mean: 2018.3337
Median Absolute Deviation (MAD): 0
Skewness: 2.4580636
Sum: 20183337
Variance: 1.1212564
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)

Most frequent values
Value | Count | Frequency (%)
2018 | 7320 | 73.2%
2020 | 1021 | 10.2%
2017 | 772 | 7.7%
2019 | 523 | 5.2%
2023 | 139 | 1.4%
2021 | 127 | 1.3%
2022 | 60 | 0.6%
2024 | 38 | 0.4%

Values in ascending order
Value | Count | Frequency (%)
2017 | 772 | 7.7%
2018 | 7320 | 73.2%
2019 | 523 | 5.2%
2020 | 1021 | 10.2%
2021 | 127 | 1.3%
2022 | 60 | 0.6%
2023 | 139 | 1.4%
2024 | 38 | 0.4%

Values in descending order
Value | Count | Frequency (%)
2024 | 38 | 0.4%
2023 | 139 | 1.4%
2022 | 60 | 0.6%
2021 | 127 | 1.3%
2020 | 1021 | 10.2%
2019 | 523 | 5.2%
2018 | 7320 | 73.2%
2017 | 772 | 7.7%
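
Because 회계연도 takes only eight distinct values, its descriptive statistics can be rebuilt from the value counts listed above. The sketch below does so with the standard library. It assumes the reported variance is the sample variance (divisor n - 1), which is what reproduces the figure in the report.

```python
import math

# Value counts for 회계연도, copied from the frequency table.
counts = {
    2017: 772, 2018: 7320, 2019: 523, 2020: 1021,
    2021: 127, 2022: 60, 2023: 139, 2024: 38,
}

n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n

# Sample variance (divisor n - 1); this divisor is an assumption.
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
std = math.sqrt(variance)
cv = std / mean  # coefficient of variation

print(f"n={n} sum={total} mean={mean:.4f}")
print(f"variance={variance:.7f} std={std:.6f} cv={cv:.11f}")
```

The near-zero coefficient of variation reflects a year value around 2018 with a spread of about one year, not an unusually stable variable.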

제안번호
Real number (ℝ)

Distinct: 716
Distinct (%): 7.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1732.0146
Minimum: 1
Maximum: 6861
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 129
Q1: 430
Median: 718
Q3: 2575
95-th percentile: 5588
Maximum: 6861
Range: 6860
Interquartile range (IQR): 2145

Descriptive statistics

Standard deviation: 1889.8439
Coefficient of variation (CV): 1.0911247
Kurtosis: -0.51083077
Mean: 1732.0146
Median Absolute Deviation (MAD): 461
Skewness: 1.0587887
Sum: 17320146
Variance: 3571509.8
Monotonicity: Not monotonic
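
Several of the 제안번호 statistics are simple functions of the others. The following sketch recomputes the range, interquartile range, coefficient of variation, and variance from the reported quantiles, mean, and standard deviation. Small differences in the last digits are expected because the inputs are already rounded.

```python
import math

# Values copied from the quantile and descriptive statistics for 제안번호.
minimum, maximum = 1, 6861
q1, q3 = 430, 2575
mean, std = 1732.0146, 1889.8439

value_range = maximum - minimum   # reported: 6860
iqr = q3 - q1                     # reported: 2145
cv = std / mean                   # reported: 1.0911247
variance = std ** 2               # reported: 3571509.8

print(value_range, iqr, round(cv, 7), round(variance, 1))
```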
Histogram with fixed size bins (bins=50)

Most frequent values
Value | Count | Frequency (%)
565 | 635 | 6.3%
478 | 530 | 5.3%
164 | 454 | 4.5%
863 | 434 | 4.3%
5588 | 389 | 3.9%
5414 | 254 | 2.5%
711 | 222 | 2.2%
4410 | 178 | 1.8%
59 | 162 | 1.6%
215 | 147 | 1.5%
Other values (706) | 6595 | 66.0%

Smallest values
Value | Count | Frequency (%)
1 | 6 | 0.1%
2 | 2 | < 0.1%
3 | 9 | 0.1%
5 | 11 | 0.1%
6 | 2 | < 0.1%
8 | 1 | < 0.1%
9 | 6 | 0.1%
11 | 2 | < 0.1%
14 | 1 | < 0.1%
15 | 5 | 0.1%

Largest values
Value | Count | Frequency (%)
6861 | 1 | < 0.1%
6768 | 1 | < 0.1%
6567 | 2 | < 0.1%
6361 | 1 | < 0.1%
6349 | 2 | < 0.1%
6344 | 1 | < 0.1%
6330 | 1 | < 0.1%
6315 | 1 | < 0.1%
6275 | 2 | < 0.1%
6256 | 14 | 0.1%

순번
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9148
Distinct (%): 91.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.9865001 × 10^13
Minimum: 1.7031815 × 10^12
Maximum: 2.4050812 × 10^14
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1.7031815 × 10^12
5-th percentile: 1.705251 × 10^12
Q1: 1.8032709 × 10^12
Median: 1.804102 × 10^12
Q3: 1.8042609 × 10^12
95-th percentile: 2.004231 × 10^14
Maximum: 2.4050812 × 10^14
Range: 2.3880494 × 10^14
Interquartile range (IQR): 9.8991621 × 10^8

Descriptive statistics

Standard deviation: 7.8730927 × 10^13
Coefficient of variation (CV): 1.9749385
Kurtosis: 0.5831569
Mean: 3.9865001 × 10^13
Median Absolute Deviation (MAD): 8.3059938 × 10^8
Skewness: 1.5967051
Sum: 3.9865001 × 10^17
Variance: 6.1985589 × 10^27
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values
Value | Count | Frequency (%)
1804231605121 | 6 | 0.1%
1804231606121 | 6 | 0.1%
1804161745104 | 6 | 0.1%
1803231426181 | 6 | 0.1%
1804171717108 | 5 | 0.1%
1804231608121 | 5 | 0.1%
1804171529107 | 5 | 0.1%
1804201125115 | 4 | < 0.1%
1803291059528 | 4 | < 0.1%
1804191522113 | 4 | < 0.1%
Other values (9138) | 9949 | 99.5%

Smallest values
Value | Count | Frequency (%)
1703181533028 | 1 | < 0.1%
1703241610029 | 1 | < 0.1%
1703241628030 | 1 | < 0.1%
1703241826031 | 1 | < 0.1%
1704041417033 | 1 | < 0.1%
1704101428035 | 1 | < 0.1%
1704131355038 | 1 | < 0.1%
1704181418040 | 1 | < 0.1%
1704181449041 | 1 | < 0.1%
1704181502042 | 1 | < 0.1%

Largest values
Value | Count | Frequency (%)
240508120816562 | 1 | < 0.1%
240508095816561 | 1 | < 0.1%
240507180616560 | 1 | < 0.1%
240507163516557 | 1 | < 0.1%
240501024716555 | 1 | < 0.1%
240501023816552 | 1 | < 0.1%
240501022916550 | 1 | < 0.1%
240501021516549 | 1 | < 0.1%
240430161116547 | 1 | < 0.1%
240430131616543 | 1 | < 0.1%

내용
Text

Distinct: 9064
Distinct (%): 90.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 300
Median length: 264
Mean length: 51.8443
Min length: 1

Characters and Unicode

Total characters: 518443
Distinct characters: 1339
Distinct categories: 15
Distinct scripts: 4
Distinct blocks: 9
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8850
Unique (%): 88.5%

Sample

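1st row: 가치 있는 일을 하는 정미화씨 화이팅 입니다 (English: "Cheering for Jeong Mi-hwa, who is doing worthwhile work.")
2nd row: 화재등 시민의 안전이 걸린 문제이니 반드시 예산 지원이 되었으면 좋겠습니다. (English: "This concerns citizens' safety, such as fires, so I hope it definitely receives budget support.")
3rd row: 시민의 안전을 위해 필요한것 같네요 (English: "This seems necessary for citizens' safety.")
4th row: 꼭 필요한 청소년들을 위한 청소년 축제입니다. (English: "This is a much-needed youth festival for young people.")
5th row: 아파트만 지어놓고 문화인프라는 없는 길음뉴타운 지역에 주민주도로 문화행사를 기획해서 스스로 공동체를 만든다니..응원합니다 (English: "In the Gireum New Town area, where only apartments were built and there is no cultural infrastructure, residents are planning cultural events themselves and building their own community. I support this.")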
ValueCountFrequency (%)
2182
 
1.9%
좋은 1772
 
1.5%
1622
 
1.4%
응원합니다 1381
 
1.2%
있는 1138
 
1.0%
필요한 1032
 
0.9%
같습니다 789
 
0.7%
785
 
0.7%
생각합니다 701
 
0.6%
좋겠습니다 687
 
0.6%
Other values (33122) 102298
89.4%

Most occurring characters

ValueCountFrequency (%)
109131
 
21.0%
14642
 
2.8%
14568
 
2.8%
12359
 
2.4%
. 11039
 
2.1%
7593
 
1.5%
7219
 
1.4%
6201
 
1.2%
6178
 
1.2%
5739
 
1.1%
Other values (1329) 323774
62.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 388046 | 74.8%
Space Separator | 109131 | 21.0%
Other Punctuation | 17277 | 3.3%
Modifier Symbol | 1655 | 0.3%
Decimal Number | 1098 | 0.2%
Lowercase Letter | 638 | 0.1%
Uppercase Letter | 309 | 0.1%
Other Symbol | 113 | < 0.1%
Dash Punctuation | 89 | < 0.1%
Connector Punctuation | 30 | < 0.1%
Other values (5) | 57 | < 0.1%
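
The category counts can be reconciled with the totals reported for 내용. The sketch below checks that the categories add up to the total character count and that the mean length equals total characters divided by the number of observations. The dictionary and variable names are illustrative.

```python
# Counts copied from the "Most occurring categories" table.
category_counts = {
    "Other Letter": 388046,
    "Space Separator": 109131,
    "Other Punctuation": 17277,
    "Modifier Symbol": 1655,
    "Decimal Number": 1098,
    "Lowercase Letter": 638,
    "Uppercase Letter": 309,
    "Other Symbol": 113,
    "Dash Punctuation": 89,
    "Connector Punctuation": 30,
    "Other values (5)": 57,
}
total_characters = 518443   # "Total characters" for 내용
n_observations = 10_000     # "Number of observations"

counted = sum(category_counts.values())
shares = {name: 100 * c / total_characters for name, c in category_counts.items()}
mean_length = total_characters / n_observations

print(counted == total_characters)
print(f"Other Letter: {shares['Other Letter']:.1f}%")
print(f"Mean length: {mean_length}")
```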

Most frequent character per category

Other Letter
ValueCountFrequency (%)
14642
 
3.8%
14568
 
3.8%
12359
 
3.2%
7593
 
2.0%
7219
 
1.9%
6201
 
1.6%
6178
 
1.6%
5739
 
1.5%
5413
 
1.4%
5160
 
1.3%
Other values (1241) 302974
78.1%
Lowercase Letter
ValueCountFrequency (%)
a 147
23.0%
d 47
 
7.4%
o 40
 
6.3%
s 39
 
6.1%
e 37
 
5.8%
t 35
 
5.5%
i 29
 
4.5%
f 29
 
4.5%
n 28
 
4.4%
c 26
 
4.1%
Other values (15) 181
28.4%
Uppercase Letter
ValueCountFrequency (%)
E 32
 
10.4%
T 29
 
9.4%
C 27
 
8.7%
A 24
 
7.8%
I 22
 
7.1%
B 22
 
7.1%
O 19
 
6.1%
G 18
 
5.8%
R 16
 
5.2%
N 16
 
5.2%
Other values (14) 84
27.2%
Decimal Number
ValueCountFrequency (%)
1 385
35.1%
0 187
17.0%
2 150
 
13.7%
3 129
 
11.7%
4 91
 
8.3%
5 47
 
4.3%
8 42
 
3.8%
7 27
 
2.5%
9 22
 
2.0%
6 18
 
1.6%
Other Punctuation
ValueCountFrequency (%)
. 11039
63.9%
! 4255
 
24.6%
, 1624
 
9.4%
? 306
 
1.8%
: 41
 
0.2%
@ 5
 
< 0.1%
4
 
< 0.1%
2
 
< 0.1%
1
 
< 0.1%
Other Symbol
ValueCountFrequency (%)
88
77.9%
16
 
14.2%
4
 
3.5%
2
 
1.8%
° 1
 
0.9%
1
 
0.9%
1
 
0.9%
Dash Punctuation
ValueCountFrequency (%)
- 88
98.9%
1
 
1.1%
Initial Punctuation
ValueCountFrequency (%)
11
64.7%
6
35.3%
Final Punctuation
ValueCountFrequency (%)
10
62.5%
6
37.5%
Math Symbol
ValueCountFrequency (%)
+ 9
90.0%
× 1
 
10.0%
Space Separator
ValueCountFrequency (%)
109131
100.0%
Modifier Symbol
ValueCountFrequency (%)
^ 1655
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 30
100.0%
Open Punctuation
ValueCountFrequency (%)
7
100.0%
Close Punctuation
ValueCountFrequency (%)
7
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 388036 | 74.8%
Common | 129450 | 25.0%
Latin | 947 | 0.2%
Han | 10 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
14642
 
3.8%
14568
 
3.8%
12359
 
3.2%
7593
 
2.0%
7219
 
1.9%
6201
 
1.6%
6178
 
1.6%
5739
 
1.5%
5413
 
1.4%
5160
 
1.3%
Other values (1233) 302964
78.1%
Latin
ValueCountFrequency (%)
a 147
 
15.5%
d 47
 
5.0%
o 40
 
4.2%
s 39
 
4.1%
e 37
 
3.9%
t 35
 
3.7%
E 32
 
3.4%
T 29
 
3.1%
i 29
 
3.1%
f 29
 
3.1%
Other values (39) 483
51.0%
Common
ValueCountFrequency (%)
109131
84.3%
. 11039
 
8.5%
! 4255
 
3.3%
^ 1655
 
1.3%
, 1624
 
1.3%
1 385
 
0.3%
? 306
 
0.2%
0 187
 
0.1%
2 150
 
0.1%
3 129
 
0.1%
Other values (29) 589
 
0.5%
Han
ValueCountFrequency (%)
2
20.0%
2
20.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 387620 | 74.8%
ASCII | 130228 | 25.1%
Compat Jamo | 416 | 0.1%
Misc Symbols | 111 | < 0.1%
Punctuation | 38 | < 0.1%
None | 19 | < 0.1%
CJK | 9 | < 0.1%
CJK Compat Ideographs | 1 | < 0.1%
Geometric Shapes | 1 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
109131
83.8%
. 11039
 
8.5%
! 4255
 
3.3%
^ 1655
 
1.3%
, 1624
 
1.2%
1 385
 
0.3%
? 306
 
0.2%
0 187
 
0.1%
2 150
 
0.1%
a 147
 
0.1%
Other values (60) 1349
 
1.0%
Hangul
ValueCountFrequency (%)
14642
 
3.8%
14568
 
3.8%
12359
 
3.2%
7593
 
2.0%
7219
 
1.9%
6201
 
1.6%
6178
 
1.6%
5739
 
1.5%
5413
 
1.4%
5160
 
1.3%
Other values (1209) 302548
78.1%
Compat Jamo
ValueCountFrequency (%)
99
23.8%
69
16.6%
63
15.1%
49
11.8%
27
 
6.5%
25
 
6.0%
23
 
5.5%
9
 
2.2%
9
 
2.2%
8
 
1.9%
Other values (14) 35
 
8.4%
Misc Symbols
ValueCountFrequency (%)
88
79.3%
16
 
14.4%
4
 
3.6%
2
 
1.8%
1
 
0.9%
Punctuation
ValueCountFrequency (%)
11
28.9%
10
26.3%
6
15.8%
6
15.8%
4
 
10.5%
1
 
2.6%
None
ValueCountFrequency (%)
7
36.8%
7
36.8%
2
 
10.5%
1
 
5.3%
° 1
 
5.3%
× 1
 
5.3%
CJK
ValueCountFrequency (%)
2
22.2%
2
22.2%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
1
11.1%
CJK Compat Ideographs
ValueCountFrequency (%)
1
100.0%
Geometric Shapes
ValueCountFrequency (%)
1
100.0%

등록일
DateTime

Distinct: 9975
Distinct (%): 99.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2017-03-18 15:33:27
Maximum: 2024-05-08 12:08:32
Histogram with fixed size bins (bins=50)

수정id
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 10000
Missing (%): 100.0%
Memory size: 166.0 KiB

수정일
DateTime

Distinct: 9975
Distinct (%): 99.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2017-03-18 15:33:27
Maximum: 2024-05-08 12:08:32
Histogram with fixed size bins (bins=50)

삭제여부
Boolean

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 87.9 KiB
Value | Count | Frequency (%)
True | 9726 | 97.3%
False | 274 | 2.7%
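
The 81.9% imbalance figure in the alert for 삭제여부 can be reproduced from these two counts with an entropy-based score: one minus the Shannon entropy of the class distribution, divided by the maximum possible entropy. That this is the exact formula used by the profiling tool is an assumption. The sketch only shows that the formula is consistent with the reported number, and `imbalance_score` is an illustrative name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H/H_max for a list of class counts (0 = balanced, 1 = one class only)."""
    if len(counts) < 2:
        return 0.0  # a single class has no defined imbalance; treated as 0 here
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    max_entropy = math.log2(len(counts))
    return 1 - entropy / max_entropy

# Counts for 삭제여부: True = 9726, False = 274.
score = imbalance_score([9726, 274])
print(f"Imbalance: {score:.1%}")
```

A perfectly balanced variable scores 0 under this definition, so a 97.3% / 2.7% split scoring about 0.82 is consistent with the alert.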

Interactions

[Interaction plots: 9 images not preserved]

Correlations

variable | 회계연도 | 제안번호 | 순번 | 삭제여부
회계연도 | 1.000 | 0.598 | 0.987 | 0.055
제안번호 | 0.598 | 1.000 | 0.620 | 0.076
순번 | 0.987 | 0.620 | 1.000 | 0.057
삭제여부 | 0.055 | 0.076 | 0.057 | 1.000

variable | 회계연도 | 제안번호 | 순번 | 삭제여부
회계연도 | 1.000 | -0.058 | 0.779 | 0.066
제안번호 | -0.058 | 1.000 | -0.003 | 0.058
순번 | 0.779 | -0.003 | 1.000 | 0.038
삭제여부 | 0.066 | 0.058 | 0.038 | 1.000
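
As a small illustration of how to read these matrices, the sketch below loads the first matrix shown above, checks that it is symmetric, and finds the strongest off-diagonal pair. The helper names are hypothetical, and no claim is made about which correlation measure each matrix uses, since the report text does not say.

```python
labels = ["회계연도", "제안번호", "순번", "삭제여부"]

# First correlation matrix, copied row by row from the report.
matrix = [
    [1.000, 0.598, 0.987, 0.055],
    [0.598, 1.000, 0.620, 0.076],
    [0.987, 0.620, 1.000, 0.057],
    [0.055, 0.076, 0.057, 1.000],
]

def is_symmetric(m):
    """True if m[i][j] == m[j][i] for every pair of indices."""
    size = len(m)
    return all(m[i][j] == m[j][i] for i in range(size) for j in range(size))

def strongest_pair(names, m):
    """Return (|r|, name_a, name_b) for the largest off-diagonal entry."""
    best = None
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            candidate = (abs(m[i][j]), names[i], names[j])
            if best is None or candidate[0] > best[0]:
                best = candidate
    return best

print(is_symmetric(matrix))
print(strongest_pair(labels, matrix))
```

The strongest pair is 회계연도 and 순번, which matches the two high-correlation alerts in the overview.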

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

index | 회계연도 | 제안번호 | 순번 | 내용 | 등록일 | 수정id | 수정일 | 삭제여부
13208 | 2018 | 164 | 1804061800688 | 가치 있는 일을 하는 정미화씨 화이팅 입니다 | 2018-04-06 18:00:49.0 | <NA> | 2018-04-06 18:00:49.0 | Y
10948 | 2018 | 478 | 1804041958640 | 화재등 시민의 안전이 걸린 문제이니 반드시 예산 지원이 되었으면 좋겠습니다. | 2018-04-04 19:58:33.0 | <NA> | 2018-04-04 19:58:33.0 | Y
14988 | 2017 | 525 | 1705171253333 | 시민의 안전을 위해 필요한것 같네요 | 2017-05-17 12:53:58.0 | <NA> | 2017-05-17 12:53:58.0 | Y
4782 | 2018 | 4835 | 1804231704122 | 꼭 필요한 청소년들을 위한 청소년 축제입니다. | 2018-04-23 17:04:41.0 | <NA> | 2018-04-23 17:04:41.0 | Y
6454 | 2018 | 909 | 1803262029382 | 아파트만 지어놓고 문화인프라는 없는 길음뉴타운 지역에 주민주도로 문화행사를 기획해서 스스로 공동체를 만든다니..응원합니다 | 2018-03-26 20:29:20.0 | <NA> | 2018-03-26 20:29:20.0 | Y
7929 | 2018 | 711 | 1804170926106 | 아빠목공소 멋지게 해내세요 | 2018-04-17 09:26:33.0 | <NA> | 2018-04-17 09:26:33.0 | Y
5935 | 2018 | 4228 | 1804180916110 | 시간을 알뜰히 쓸수 있을거 같아 좋아요. | 2018-04-18 09:16:24.0 | <NA> | 2018-04-18 09:16:24.0 | Y
10794 | 2018 | 478 | 1804141428962 | 길이 좁고 유사시 도로가 좁아 위험할 것 같습니다. 빠른 조치 바랍니다. | 2018-04-14 14:28:15.0 | <NA> | 2018-04-14 14:28:15.0 | Y
905 | 2020 | 2531 | 200316101015176 | 선정되어 좋은 결실이 되었음 좋겠습니다 청소년, 청년들위해 꼭 필요한 현실적인 방법인듯합니다 꼭 선정되었으면 좋겠네요 내 아이들을 위해서도 | 2020-03-16 10:10:18.0 | <NA> | 2020-03-16 10:10:18.0 | Y
15145 | 2017 | 15 | 1706091440112 | 학생들에게 꼭 필요한 교육이네요. 이런교육을 한다면 꼭 참여 하겠습니다. | 2017-06-09 14:40:30.0 | <NA> | 2017-06-09 14:40:30.0 | Y

index | 회계연도 | 제안번호 | 순번 | 내용 | 등록일 | 수정id | 수정일 | 삭제여부
14360 | 2017 | 1341 | 1706031706103 | 안양천에 종종 걸어다니는데 의자가 정말로 많이 녹슬고 낡았더라구요 빠른 정리 부탁 드려요 | 2017-06-03 17:06:24.0 | <NA> | 2017-06-03 17:06:24.0 | Y
4154 | 2018 | 5414 | 1803261625363 | 안전이 최우선시 되어야 합니다. 필수적으로 시행되야할 안 입니다. | 2018-03-26 16:25:24.0 | <NA> | 2018-03-26 16:25:24.0 | Y
2805 | 2019 | 111 | 190318171513717 | 아무런 법적 보호도 없는 상태인 느린학습자!! 이들에 대한 정책이 절실합니다. 가정에서 돌보는 부모들은 너무 힘듭니다. 부모들과 자녀느린학습자들에게 희망을 주는 참여예산이 되었으면 합니다. | 2019-03-18 17:15:24.0 | <NA> | 2019-03-18 17:15:24.0 | Y
11776 | 2018 | 306 | 1803231611190 | 꼭 필요하다고 생각합니다 | 2018-03-23 16:11:28.0 | <NA> | 2018-03-23 16:11:28.0 | Y
2950 | 2018 | 6219 | 1804121515907 | 응원합니다 | 2018-04-12 15:15:09.0 | <NA> | 2018-04-12 15:15:09.0 | Y
9111 | 2018 | 565 | 1804110828843 | 우리 동네에서두 연극을 볼수 있을꺼라구 아들딸에게 자랑 했습니다 연극은 대학로에서 보는줄 아는 울 아이들에 눈이 반짝반짝 비ㅉ나네요 손잡고 연극관람하러 갈께요 화이팅 하세요^^ | 2018-04-11 08:28:39.0 | <NA> | 2018-04-11 08:28:39.0 | Y
700 | 2020 | 2575 | 200324172015455 | 좋은의견 환영합니다 척추관리를 집에서 할수있는 사업 추진되길바랍니다. | 2020-03-24 17:20:52.0 | <NA> | 2020-03-24 17:20:52.0 | Y
13302 | 2018 | 164 | 1804030934602 | 응원합니다 | 2018-04-03 09:34:59.0 | <NA> | 2018-04-03 09:34:59.0 | Y
8424 | 2018 | 652 | 1804022257596 | 막연한 안전제일 광고보다는 시내버스에 안전벨트 설치 참좋은 안전아이디어 라고 생각합니다 안전의식이 투철한 시민이군요 제안자분에게 안전훈장을 드립니다 | 2018-04-02 22:57:57.0 | <NA> | 2018-04-02 22:57:57.0 | Y
4537 | 2018 | 4839 | 1804261327127 | 안전하게 놀아요 | 2018-04-26 13:27:45.0 | <NA> | 2018-04-26 13:27:45.0 | Y
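
In the sample rows, the first ten digits of 순번 match the 등록일 timestamp in `yymmddHHMM` form, followed by a serial suffix of varying length. This is a pattern observed in the sample, not something the source documents. If it holds across the dataset, it may explain why 순번 is flagged as highly correlated with 회계연도. The helper below is a hypothetical sketch for splitting a value under that assumption.

```python
from datetime import datetime

def split_sequence_id(seq_id):
    """Split a 순번 value into (timestamp, serial), ASSUMING the layout
    yymmddHHMM + serial suggested by the sample rows."""
    text = str(seq_id)
    stamp, serial = text[:10], text[10:]
    return datetime.strptime(stamp, "%y%m%d%H%M"), serial

# Two values taken from the sample rows (13 and 15 digits long).
ts_a, serial_a = split_sequence_id(1804061800688)
ts_b, serial_b = split_sequence_id(200316101015176)

print(ts_a, serial_a)
print(ts_b, serial_b)
```

The varying number of digits (13 for 2018 rows, 15 for 2020 rows) also accounts for the two orders of magnitude between the median and the maximum of 순번, so treating the column as an identifier rather than a numeric measure is likely more appropriate.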