Overview

Dataset statistics

Number of variables: 2
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 244.1 KiB
Average record size in memory: 25.0 B
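
The average record size looks like the total in-memory size divided by the number of observations. The short check below is a sketch under that assumption (and assuming 1 KiB = 1024 B); the variable names are illustrative and not part of the report.

```python
# Values copied from the dataset statistics in this report.
total_size_kib = 244.1
n_observations = 10_000

# Assumption: average record size = total size in bytes / number of rows.
avg_record_size_bytes = total_size_kib * 1024 / n_observations

print(f"{avg_record_size_bytes:.1f} B")
```

Rounded to one decimal place, the result agrees with the 25.0 B shown in the report.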

Variable types

Text: 1
Numeric: 1

Dataset

Description: 굴착예정지일련번호 (planned excavation site serial number), 년도 (year)
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-21183/S/1/datasetView.do

Alerts

굴착예정지일련번호 has unique values (Unique)

Reproduction

Analysis started: 2024-05-11 01:05:59.384946
Analysis finished: 2024-05-11 01:06:00.351362
Duration: 0.97 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

굴착예정지일련번호 (planned excavation site serial number)
Text

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 18
Median length: 18
Mean length: 18
Min length: 18

Characters and Unicode

Total characters: 180000
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
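
As a rough illustration of how such character properties can be read, the sketch below counts the Unicode general category of each character in one identifier, using Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, and this is not necessarily how the profiling tool computes its figures.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category code (e.g. 'Nd', 'Lu')."""
    return Counter(unicodedata.category(ch) for ch in text)


# First sample value of 굴착예정지일련번호 in this report.
sample_id = "SVR001200603150187"
counts = category_counts(sample_id)

# 'Nd' is Decimal Number, 'Lu' is Uppercase Letter.
for code, n in counts.items():
    print(code, n)
```

Each 18-character identifier contributes 15 decimal digits and 3 uppercase letters, which is consistent with the report's totals of 150000 and 30000 over 10000 rows.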

Unique

Unique: 10000
Unique (%): 100.0%

Sample

1st row: SVR001200603150187
2nd row: SVR001200705130002
3rd row: SVR001200605040239
4th row: SVR001200604030012
5th row: SVR001200511070193
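
The sample values suggest a fixed layout: a 6-character prefix (`SVR001`), an 8-digit date (`YYYYMMDD`), and a 4-digit sequence number. The report does not document this layout, so the parser below is only a sketch under that assumption; the names `ExcavationId` and `parse_excavation_id` are hypothetical.

```python
from datetime import date, datetime
from typing import NamedTuple


class ExcavationId(NamedTuple):
    prefix: str       # assumed system or office code, e.g. "SVR001"
    registered: date  # assumed date embedded in the identifier
    sequence: int     # assumed running number within that date


def parse_excavation_id(raw: str) -> ExcavationId:
    """Split an 18-character identifier into its assumed parts."""
    if len(raw) != 18:
        raise ValueError(f"expected 18 characters, got {len(raw)}")
    prefix, date_part, seq_part = raw[:6], raw[6:14], raw[14:]
    # strptime raises ValueError if the middle part is not a valid date.
    registered = datetime.strptime(date_part, "%Y%m%d").date()
    return ExcavationId(prefix, registered, int(seq_part))


parsed = parse_excavation_id("SVR001200603150187")
print(parsed.prefix, parsed.registered.isoformat(), parsed.sequence)
```

In the sample rows at the end of this report, the year embedded in each identifier matches the 년도 column, which supports the assumed layout for those rows but does not confirm it for the whole dataset.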
Value                Count  Frequency (%)
svr001200603150187       1  < 0.1%
svr001200505160029       1  < 0.1%
svr001200605270006       1  < 0.1%
svr001200506030037       1  < 0.1%
svr001200507060056       1  < 0.1%
svr001200608020181       1  < 0.1%
svr001200505300063       1  < 0.1%
svr001200705110288       1  < 0.1%
svr001200602160097       1  < 0.1%
svr001200704190158       1  < 0.1%
Other values (9990)   9990  99.9%

Most occurring characters

Value             Count  Frequency (%)
0                 70379  39.1%
1                 22992  12.8%
2                 18154  10.1%
S                 10000  5.6%
V                 10000  5.6%
R                 10000  5.6%
5                  7758  4.3%
6                  7713  4.3%
7                  6240  3.5%
3                  4869  2.7%
Other values (3)  11895  6.6%
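
The table above can be cross-checked against the per-character counts listed elsewhere in this report. The sketch below rebuilds the top ten and the "Other values" row from those counts; it uses only the standard library and is an illustration, not the profiling tool's own code.

```python
from collections import Counter

# Per-character counts as listed in this report's per-category tables.
char_counts = Counter({
    "0": 70379, "1": 22992, "2": 18154, "3": 4869, "4": 4337,
    "5": 7758, "6": 7713, "7": 6240, "8": 4113, "9": 3445,
    "S": 10000, "V": 10000, "R": 10000,
})

total = sum(char_counts.values())
top10 = char_counts.most_common(10)
# Everything outside the top ten is folded into one "Other values" row.
other = total - sum(count for _, count in top10)

for char, count in top10:
    print(f"{char}  {count}  {100 * count / total:.1f}%")
print(f"Other values ({len(char_counts) - 10})  {other}  {100 * other / total:.1f}%")
```

The three remaining characters (4, 8 and 9) add up to the 11895 reported as "Other values (3)", and the total matches the 180000 characters reported for the column.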

Most occurring categories

Value             Count   Frequency (%)
Decimal Number    150000  83.3%
Uppercase Letter   30000  16.7%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      70379  46.9%
1      22992  15.3%
2      18154  12.1%
5       7758  5.2%
6       7713  5.1%
7       6240  4.2%
3       4869  3.2%
4       4337  2.9%
8       4113  2.7%
9       3445  2.3%

Uppercase Letter
Value  Count  Frequency (%)
S      10000  33.3%
V      10000  33.3%
R      10000  33.3%

Most occurring scripts

Value   Count   Frequency (%)
Common  150000  83.3%
Latin    30000  16.7%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      70379  46.9%
1      22992  15.3%
2      18154  12.1%
5       7758  5.2%
6       7713  5.1%
7       6240  4.2%
3       4869  3.2%
4       4337  2.9%
8       4113  2.7%
9       3445  2.3%

Latin
Value  Count  Frequency (%)
S      10000  33.3%
V      10000  33.3%
R      10000  33.3%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  180000  100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
0                 70379  39.1%
1                 22992  12.8%
2                 18154  10.1%
S                 10000  5.6%
V                 10000  5.6%
R                 10000  5.6%
5                  7758  4.3%
6                  7713  4.3%
7                  6240  3.5%
3                  4869  2.7%
Other values (3)  11895  6.6%

년도 (year)
Real number (ℝ)

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2005.8928
Minimum: 2005
Maximum: 2013
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2005
5-th percentile: 2005
Q1: 2005
Median: 2006
Q3: 2006
95-th percentile: 2007
Maximum: 2013
Range: 8
Interquartile range (IQR): 1
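
The quantiles above can be recomputed from the value counts that this report lists for 년도. The sketch below expands the counts into a sorted list and applies linear interpolation between neighbouring ranks; that interpolation rule is an assumption, and the `quantile` helper is hypothetical rather than the tool's own routine.

```python
# Value counts for 년도 as listed in this report.
value_counts = {2005: 3476, 2006: 4146, 2007: 2365, 2008: 10, 2011: 1, 2013: 2}

# Expand the counts into one sorted list of 10000 observations.
values = sorted(v for v, c in value_counts.items() for _ in range(c))


def quantile(sorted_values, q):
    """Quantile with linear interpolation between the two nearest ranks."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])


q05, q1, median, q3, q95 = (quantile(values, q) for q in (0.05, 0.25, 0.5, 0.75, 0.95))
iqr = q3 - q1
value_range = values[-1] - values[0]

print(q05, q1, median, q3, q95, iqr, value_range)
```

Because almost all observations fall in 2005 to 2007, every quantile lands inside a long run of identical values, so the choice of interpolation rule does not change the results here.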

Descriptive statistics

Standard deviation: 0.76744189
Coefficient of variation (CV): 0.00038259367
Kurtosis: 0.38696699
Mean: 2005.8928
Median Absolute Deviation (MAD): 1
Skewness: 0.3731773
Sum: 20058928
Variance: 0.58896706
Monotonicity: Not monotonic
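
Several of these figures follow directly from the value counts of 년도 listed in this report. The sketch below recomputes the sum, mean, variance, standard deviation and coefficient of variation. It assumes the sample variance (divisor n - 1), since that is the convention that reproduces the reported variance; skewness and kurtosis are left out.

```python
import math

# Value counts for 년도 as listed in this report.
value_counts = {2005: 3476, 2006: 4146, 2007: 2365, 2008: 10, 2011: 1, 2013: 2}

n = sum(value_counts.values())
total = sum(v * c for v, c in value_counts.items())
mean = total / n

# Sample variance: squared deviations weighted by count, divided by n - 1.
variance = sum(c * (v - mean) ** 2 for v, c in value_counts.items()) / (n - 1)
std = math.sqrt(variance)

# Coefficient of variation: standard deviation relative to the mean.
cv = std / mean

print(f"sum={total} mean={mean:.4f} var={variance:.8f} std={std:.8f} cv={cv:.11f}")
```

The CV is tiny because the values are calendar years clustered around 2006, so a spread of under one year is small relative to a mean near 2006.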
Histogram with fixed size bins (bins=6)

Values by descending count
Value  Count  Frequency (%)
2006    4146  41.5%
2005    3476  34.8%
2007    2365  23.6%
2008      10  0.1%
2013       2  < 0.1%
2011       1  < 0.1%

Values in ascending order
Value  Count  Frequency (%)
2005    3476  34.8%
2006    4146  41.5%
2007    2365  23.6%
2008      10  0.1%
2011       1  < 0.1%
2013       2  < 0.1%

Values in descending order
Value  Count  Frequency (%)
2013       2  < 0.1%
2011       1  < 0.1%
2008      10  0.1%
2007    2365  23.6%
2006    4146  41.5%
2005    3476  34.8%

Interactions


Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

Index  굴착예정지일련번호    년도
33164  SVR001200603150187  2006
99195  SVR001200705130002  2007
51242  SVR001200605040239  2006
47645  SVR001200604030012  2006
33085  SVR001200511070193  2005
45315  SVR001200604250063  2006
26840  SVR001200512290011  2005
34804  SVR001200511140151  2005
99383  SVR001200705150175  2007
31635  SVR001200601170049  2006

Index  굴착예정지일련번호    년도
27791  SVR001200512010059  2005
87145  SVR001200709030079  2007
54503  SVR001200606080012  2006
  640  SVR001200504180029  2005
82976  SVR001200608210332  2006
85123  SVR001200704300306  2007
90338  SVR001200708240176  2007
37037  SVR001200511010478  2005
35689  SVR001200511300011  2005
59619  SVR001200703260183  2007