Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 1
Missing cells (%): < 0.1%
Duplicate rows: 13
Duplicate rows (%): 0.1%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B
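The percentages and the average record size follow from the raw counts in this section. A minimal Python sketch of that arithmetic, using only the figures listed here (the variable names are illustrative):

```python
# Figures copied from the dataset statistics in this section.
n_variables = 5
n_observations = 10_000
missing_cells = 1
duplicate_rows = 13
total_size_kib = 488.3

total_cells = n_variables * n_observations              # 50,000 cells
missing_pct = 100 * missing_cells / total_cells         # 0.002, reported as "< 0.1%"
duplicate_pct = 100 * duplicate_rows / n_observations   # 0.13, reported as "0.1%"
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"missing cells: {missing_pct:.3f}%")
print(f"duplicate rows: {duplicate_pct:.2f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```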

Variable types

DateTime: 1
Text: 2
Numeric: 2

Dataset

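Description: Information on soil deliveries to the Sudokwon Landfill. Open fields: 반입일자 (delivery date), 발주처 (ordering agency), 공사명 (construction project name), 반입량(t) (delivered quantity in tonnes), and 반입대수 (number of vehicle loads).
Author: 수도권매립지관리공사 (Sudokwon Landfill Site Management Corporation)
URL: https://www.data.go.kr/data/15064399/fileData.do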

Alerts

Dataset has 13 (0.1%) duplicate rows [Duplicates]
반입량(t) is highly overall correlated with 반입대수 [High correlation]
반입대수 is highly overall correlated with 반입량(t) [High correlation]
반입량(t) is highly skewed (γ1 = 23.03242065) [Skewed]
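
The γ1 value in the skewness alert is a sample skewness coefficient. The sketch below computes the adjusted Fisher-Pearson form, which is what pandas' `Series.skew()` returns. That the report uses exactly this estimator is an assumption, and the input lists are made-up illustrations rather than rows from this dataset.

```python
import math

def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness G1 (requires at least 3 values)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 values")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    g1 = m3 / m2 ** 1.5                              # biased moment coefficient
    return math.sqrt(n * (n - 1)) / (n - 2) * g1    # small-sample correction

symmetric = [15, 30, 45, 60, 75]          # illustrative values
long_tail = [15, 15, 30, 45, 60, 212415]  # one extreme value on the right
print(sample_skewness(symmetric))   # 0.0 for symmetric data
print(sample_skewness(long_tail))   # positive: a long right tail
```

A single very large value is enough to push the coefficient well above zero, which is the pattern the alert flags for 반입량(t).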

Reproduction

Analysis started: 2024-03-16 04:21:16.637824
Analysis finished: 2024-03-16 04:21:17.722353
Duration: 1.08 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
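
A report of this kind is produced with ydata-profiling's `ProfileReport` class. The sketch below shows the general shape of such a run. The CSV file name, the `cp949` encoding, and the report title are assumptions for illustration; the settings actually used are those in the downloadable config.json, which is not reproduced in this text.

```python
def build_report(csv_path, html_path="report.html"):
    """Profile a CSV file and write the HTML report (paths are hypothetical)."""
    # Imported lazily so the function can be defined without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Encoding is an assumption: Korean public-data CSVs are often cp949.
    df = pd.read_csv(csv_path, encoding="cp949", parse_dates=["반입일자"])
    profile = ProfileReport(df, title="Soil delivery profile")  # illustrative title
    profile.to_file(html_path)
    return html_path

# Example call (the file name is hypothetical):
# build_report("soil_deliveries.csv")
```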

Variables

반입일자
DateTime

Distinct: 3556
Distinct (%): 35.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2004-01-03 00:00:00
Maximum: 2024-01-24 00:00:00
Histogram with fixed size bins (bins=50)

발주처
Text

Distinct: 153
Distinct (%): 1.5%
Missing: 1
Missing (%): < 0.1%
Memory size: 156.2 KiB

Length

Max length: 11
Median length: 7
Mean length: 7.0230023
Min length: 3
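
The reported mean length is consistent with dividing the total character count by the number of non-missing values. A quick check with the figures this report gives for the variable (70223 total characters, 10000 observations, 1 missing value); the interpretation of the formula is inferred from the numbers, not stated by the report:

```python
total_characters = 70223
n_observations = 10_000
n_missing = 1

mean_length = total_characters / (n_observations - n_missing)
print(round(mean_length, 7))  # 7.0230023
```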

Characters and Unicode

Total characters: 70223
Distinct characters: 138
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
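
As a rough illustration of the category counts, Python's standard `unicodedata` module exposes the general category of each code point ("Lo" is Other Letter, "Lu" Uppercase Letter, "Zs" Space Separator). This is a sketch of the idea, not the profiler's own implementation; the standard library does not expose script or block properties, so only categories are shown.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count characters by Unicode general category (e.g. 'Lo', 'Lu', 'Zs')."""
    counts = Counter()
    for text in values:
        counts.update(unicodedata.category(ch) for ch in text)
    return counts

# Two 발주처 values that appear in this report's sample rows.
counts = category_counts(["SH공사", "인천시 서구청"])
print(counts)
```

Hangul syllables fall under "Lo", which is why Other Letter dominates the category tables for both text variables.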

Unique

Unique: 31
Unique (%): 0.3%

Sample

1st row: 금천구청
2nd row: 서부수도사업소
3rd row: 고양시청
4th row: 서울시시설관리공단
5th row: 서울시설공단

Value | Count | Frequency (%)
남부수도사업소 | 1411 | 14.1%
서울시시설관리공단 | 998 | 10.0%
강서수도사업소 | 841 | 8.4%
강남수도사업소 | 649 | 6.5%
강동수도사업소 | 583 | 5.8%
중부수도사업소 | 557 | 5.6%
동부수도사업소 | 474 | 4.7%
서부수도사업소 | 473 | 4.7%
영등포수도사업소 | 455 | 4.5%
서울시설공단 | 343 | 3.4%
Other values (138) | 3226 | 32.2%

Most occurring characters

Value | Count | Frequency (%)
 | 6790 | 9.7%
 | 6687 | 9.5%
 | 6293 | 9.0%
 | 6120 | 8.7%
 | 5971 | 8.5%
 | 3986 | 5.7%
 | 3633 | 5.2%
 | 3497 | 5.0%
 | 2326 | 3.3%
 | 2302 | 3.3%
Other values (128) | 22618 | 32.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 69447 | 98.9%
Uppercase Letter | 706 | 1.0%
Lowercase Letter | 46 | 0.1%
Space Separator | 18 | < 0.1%
Other Punctuation | 3 | < 0.1%
Other Symbol | 3 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 6790 | 9.8%
 | 6687 | 9.6%
 | 6293 | 9.1%
 | 6120 | 8.8%
 | 5971 | 8.6%
 | 3986 | 5.7%
 | 3633 | 5.2%
 | 3497 | 5.0%
 | 2326 | 3.3%
 | 2302 | 3.3%
Other values (117) | 21842 | 31.5%

Uppercase Letter
Value | Count | Frequency (%)
S | 338 | 47.9%
H | 338 | 47.9%
T | 15 | 2.1%
K | 15 | 2.1%

Lowercase Letter
Value | Count | Frequency (%)
l | 22 | 47.8%
h | 22 | 47.8%
g | 1 | 2.2%
s | 1 | 2.2%

Space Separator
Value | Count | Frequency (%)
(space) | 18 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 3 | 100.0%

Other Symbol
Value | Count | Frequency (%)
 | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 69450 | 98.9%
Latin | 752 | 1.1%
Common | 21 | < 0.1%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 6790 | 9.8%
 | 6687 | 9.6%
 | 6293 | 9.1%
 | 6120 | 8.8%
 | 5971 | 8.6%
 | 3986 | 5.7%
 | 3633 | 5.2%
 | 3497 | 5.0%
 | 2326 | 3.3%
 | 2302 | 3.3%
Other values (118) | 21845 | 31.5%

Latin
Value | Count | Frequency (%)
S | 338 | 44.9%
H | 338 | 44.9%
l | 22 | 2.9%
h | 22 | 2.9%
T | 15 | 2.0%
K | 15 | 2.0%
g | 1 | 0.1%
s | 1 | 0.1%

Common
Value | Count | Frequency (%)
(space) | 18 | 85.7%
/ | 3 | 14.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 69447 | 98.9%
ASCII | 773 | 1.1%
None | 3 | < 0.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 6790 | 9.8%
 | 6687 | 9.6%
 | 6293 | 9.1%
 | 6120 | 8.8%
 | 5971 | 8.6%
 | 3986 | 5.7%
 | 3633 | 5.2%
 | 3497 | 5.0%
 | 2326 | 3.3%
 | 2302 | 3.3%
Other values (117) | 21842 | 31.5%

ASCII
Value | Count | Frequency (%)
S | 338 | 43.7%
H | 338 | 43.7%
l | 22 | 2.8%
h | 22 | 2.8%
(space) | 18 | 2.3%
T | 15 | 1.9%
K | 15 | 1.9%
/ | 3 | 0.4%
g | 1 | 0.1%
s | 1 | 0.1%

None
Value | Count | Frequency (%)
 | 3 | 100.0%

공사명
Text

Distinct: 2490
Distinct (%): 24.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 44
Median length: 34
Mean length: 18.7725
Min length: 6

Characters and Unicode

Total characters: 187725
Distinct characters: 454
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 948
Unique (%): 9.5%

Sample

1st row: 가산동43번지일원침수방지사업
2nd row: 2018은평구관내상수도시설물유지관리공사
3rd row: 고양드론센터건립공사
4th row: 성동.광진구관내포장도로굴착복구공사
5th row: 2019안양천시민이용잔디광장조성공사

Value | Count | Frequency (%)
동작구관내포장도로굴착복구공사 | 222 | 2.1%
관악구관내포장도로굴착복구공사 | 195 | 1.8%
강서구관내포장도로굴착복구공사 | 164 | 1.5%
인천국제공항철도2-3a공구 | 110 | 1.0%
구로.금천구관내포장도로굴착복구공사 | 101 | 0.9%
영등포구관내포장도로굴착복구공사 | 100 | 0.9%
강남구관내포장도로굴착복구공사 | 78 | 0.7%
2022년 | 71 | 0.7%
동작관내포장도로굴착복구공사 | 65 | 0.6%
마포구관내포장도로굴착복구공사 | 61 | 0.6%
Other values (2542) | 9572 | 89.1%

Most occurring characters

Value | Count | Frequency (%)
 | 10915 | 5.8%
 | 10476 | 5.6%
 | 7320 | 3.9%
 | 7103 | 3.8%
 | 6581 | 3.5%
1 | 5914 | 3.2%
2 | 5820 | 3.1%
 | 5617 | 3.0%
 | 5318 | 2.8%
0 | 4415 | 2.4%
Other values (444) | 118246 | 63.0%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 152870 | 81.4%
Decimal Number | 27887 | 14.9%
Dash Punctuation | 2424 | 1.3%
Math Symbol | 1300 | 0.7%
Space Separator | 972 | 0.5%
Other Punctuation | 839 | 0.4%
Uppercase Letter | 522 | 0.3%
Open Punctuation | 439 | 0.2%
Close Punctuation | 439 | 0.2%
Lowercase Letter | 33 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 10915 | 7.1%
 | 10476 | 6.9%
 | 7320 | 4.8%
 | 7103 | 4.6%
 | 6581 | 4.3%
 | 5617 | 3.7%
 | 5318 | 3.5%
 | 4292 | 2.8%
 | 3635 | 2.4%
 | 3627 | 2.4%
Other values (401) | 87986 | 57.6%

Uppercase Letter
Value | Count | Frequency (%)
A | 246 | 47.1%
B | 75 | 14.4%
L | 55 | 10.5%
C | 27 | 5.2%
E | 24 | 4.6%
R | 23 | 4.4%
P | 21 | 4.0%
I | 19 | 3.6%
S | 12 | 2.3%
V | 9 | 1.7%
Other values (4) | 11 | 2.1%

Decimal Number
Value | Count | Frequency (%)
1 | 5914 | 21.2%
2 | 5820 | 20.9%
0 | 4415 | 15.8%
3 | 2217 | 7.9%
4 | 2004 | 7.2%
5 | 1625 | 5.8%
6 | 1584 | 5.7%
8 | 1504 | 5.4%
9 | 1450 | 5.2%
7 | 1354 | 4.9%

Other Punctuation
Value | Count | Frequency (%)
. | 578 | 68.9%
, | 197 | 23.5%
/ | 30 | 3.6%
# | 18 | 2.1%
· | 8 | 1.0%
& | 8 | 1.0%

Lowercase Letter
Value | Count | Frequency (%)
k | 17 | 51.5%
m | 13 | 39.4%
c | 1 | 3.0%
v | 1 | 3.0%
i | 1 | 3.0%

Math Symbol
Value | Count | Frequency (%)
~ | 1233 | 94.8%
 | 47 | 3.6%
> | 10 | 0.8%
< | 10 | 0.8%

Dash Punctuation
Value | Count | Frequency (%)
- | 2424 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 972 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 439 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 439 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 152870 | 81.4%
Common | 34300 | 18.3%
Latin | 555 | 0.3%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 10915 | 7.1%
 | 10476 | 6.9%
 | 7320 | 4.8%
 | 7103 | 4.6%
 | 6581 | 4.3%
 | 5617 | 3.7%
 | 5318 | 3.5%
 | 4292 | 2.8%
 | 3635 | 2.4%
 | 3627 | 2.4%
Other values (401) | 87986 | 57.6%

Common
Value | Count | Frequency (%)
1 | 5914 | 17.2%
2 | 5820 | 17.0%
0 | 4415 | 12.9%
- | 2424 | 7.1%
3 | 2217 | 6.5%
4 | 2004 | 5.8%
5 | 1625 | 4.7%
6 | 1584 | 4.6%
8 | 1504 | 4.4%
9 | 1450 | 4.2%
Other values (14) | 5343 | 15.6%

Latin
Value | Count | Frequency (%)
A | 246 | 44.3%
B | 75 | 13.5%
L | 55 | 9.9%
C | 27 | 4.9%
E | 24 | 4.3%
R | 23 | 4.1%
P | 21 | 3.8%
I | 19 | 3.4%
k | 17 | 3.1%
m | 13 | 2.3%
Other values (9) | 35 | 6.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 152861 | 81.4%
ASCII | 34800 | 18.5%
Math Operators | 47 | < 0.1%
Compat Jamo | 9 | < 0.1%
None | 8 | < 0.1%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 10915 | 7.1%
 | 10476 | 6.9%
 | 7320 | 4.8%
 | 7103 | 4.6%
 | 6581 | 4.3%
 | 5617 | 3.7%
 | 5318 | 3.5%
 | 4292 | 2.8%
 | 3635 | 2.4%
 | 3627 | 2.4%
Other values (400) | 87977 | 57.6%

ASCII
Value | Count | Frequency (%)
1 | 5914 | 17.0%
2 | 5820 | 16.7%
0 | 4415 | 12.7%
- | 2424 | 7.0%
3 | 2217 | 6.4%
4 | 2004 | 5.8%
5 | 1625 | 4.7%
6 | 1584 | 4.6%
8 | 1504 | 4.3%
9 | 1450 | 4.2%
Other values (31) | 5843 | 16.8%

Math Operators
Value | Count | Frequency (%)
 | 47 | 100.0%

Compat Jamo
Value | Count | Frequency (%)
 | 9 | 100.0%

None
Value | Count | Frequency (%)
· | 8 | 100.0%

반입량(t)
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 263
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 560.654
Minimum: 15
Maximum: 212415
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 15
5th percentile: 15
Q1: 45
Median: 120
Q3: 300
95th percentile: 1050
Maximum: 212415
Range: 212400
Interquartile range (IQR): 255

Descriptive statistics

Standard deviation: 5310.9335
Coefficient of variation (CV): 9.472747
Kurtosis: 625.9165
Mean: 560.654
Median Absolute Deviation (MAD): 90
Skewness: 23.032421
Sum: 5606540
Variance: 28206015
Monotonicity: Not monotonic
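
Several of these figures are derived from others in the same tables. A short check in Python, using only numbers reported for 반입량(t) (the variable names are illustrative):

```python
# Figures copied from the 반입량(t) statistics in this report.
mean, std, n = 560.654, 5310.9335, 10_000
q1, q3 = 45, 300
minimum, maximum = 15, 212415

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum
total = mean * n                 # sum of all values

print(f"CV={cv:.6f} variance={variance:.0f} IQR={iqr} "
      f"range={value_range} sum={total:.0f}")
```

A coefficient of variation above 9 means the standard deviation is more than nine times the mean, which fits the heavy right tail reported by the skewness and kurtosis values.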
Histogram with fixed size bins (bins=50)
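
Fixed-size binning splits the range between the minimum and maximum into equally wide intervals. The sketch below is a plain-Python illustration of that idea, assuming NumPy-style equal-width bins in which the maximum falls into the last bin; the input values are made up and the function name is hypothetical.

```python
def fixed_bin_histogram(values, bins=50):
    """Count values in `bins` equal-width bins spanning min(values)..max(values)."""
    lo, hi = min(values), max(values)
    if lo == hi:
        # Degenerate case: every value lands in the first bin.
        return [len(values)] + [0] * (bins - 1)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value is placed in the last bin rather than a new one.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts

# Illustrative values only: a few small loads and one very large one.
demo = [15, 30, 45, 60, 75, 212415]
counts = fixed_bin_histogram(demo, bins=50)
print(counts[0], counts[-1])  # 5 1
```

With a range this wide, each of the 50 bins covers several thousand tonnes, so nearly all observations collapse into the first bin.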

Value | Count | Frequency (%)
15 | 1178 | 11.8%
30 | 928 | 9.3%
45 | 877 | 8.8%
60 | 720 | 7.2%
75 | 445 | 4.5%
90 | 393 | 3.9%
120 | 346 | 3.5%
150 | 335 | 3.4%
105 | 322 | 3.2%
135 | 309 | 3.1%
Other values (253) | 4147 | 41.5%

Value | Count | Frequency (%)
15 | 1178 | 11.8%
20 | 1 | < 0.1%
30 | 928 | 9.3%
45 | 877 | 8.8%
60 | 720 | 7.2%
75 | 445 | 4.5%
90 | 393 | 3.9%
105 | 322 | 3.2%
120 | 346 | 3.5%
135 | 309 | 3.1%

Value | Count | Frequency (%)
212415 | 1 | < 0.1%
165375 | 1 | < 0.1%
147015 | 1 | < 0.1%
138240 | 1 | < 0.1%
129735 | 1 | < 0.1%
116160 | 1 | < 0.1%
108375 | 1 | < 0.1%
105840 | 1 | < 0.1%
98415 | 1 | < 0.1%
96000 | 1 | < 0.1%

반입대수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 221
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.3183
Minimum: 1
Maximum: 474
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 3
Median: 8
Q3: 20
95th percentile: 64
Maximum: 474
Range: 473
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 30.31666
Coefficient of variation (CV): 1.7505564
Kurtosis: 45.032808
Mean: 17.3183
Median Absolute Deviation (MAD): 6
Skewness: 5.4087968
Sum: 173183
Variance: 919.0999
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
1 | 1178 | 11.8%
2 | 943 | 9.4%
3 | 891 | 8.9%
4 | 714 | 7.1%
5 | 452 | 4.5%
6 | 402 | 4.0%
8 | 351 | 3.5%
10 | 339 | 3.4%
7 | 327 | 3.3%
9 | 300 | 3.0%
Other values (211) | 4103 | 41.0%

Value | Count | Frequency (%)
1 | 1178 | 11.8%
2 | 943 | 9.4%
3 | 891 | 8.9%
4 | 714 | 7.1%
5 | 452 | 4.5%
6 | 402 | 4.0%
7 | 327 | 3.3%
8 | 351 | 3.5%
9 | 300 | 3.0%
10 | 339 | 3.4%

Value | Count | Frequency (%)
474 | 1 | < 0.1%
468 | 1 | < 0.1%
450 | 1 | < 0.1%
418 | 1 | < 0.1%
406 | 1 | < 0.1%
402 | 1 | < 0.1%
374 | 1 | < 0.1%
350 | 1 | < 0.1%
339 | 1 | < 0.1%
324 | 1 | < 0.1%

Interactions

[Figures: pairwise interaction plots for 반입량(t) and 반입대수]

Correlations

 | 반입량(t) | 반입대수
반입량(t) | 1.000 | 0.189
반입대수 | 0.189 | 1.000

 | 반입량(t) | 반입대수
반입량(t) | 1.000 | 0.993
반입대수 | 0.993 | 1.000
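
The two matrices report very different coefficients (0.189 and 0.993) for the same pair of variables, and this text does not say which coefficient each one uses. One plausible reading, which is an assumption and not confirmed by the source, is that one matrix is a linear (Pearson) coefficient and the other a rank-based one such as Spearman. The sketch below uses made-up data to show how a single extreme value can depress a linear coefficient while leaving a rank-based one untouched.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Made-up data: tonnage rises with vehicle count, but one row is extreme.
trucks = [1, 2, 3, 4, 5, 6, 7, 8]
tonnes = [15, 30, 45, 60, 75, 90, 105, 212415]
print(round(pearson(trucks, tonnes), 3), round(spearman(trucks, tonnes), 3))
```

In this toy example the ordering of the two columns is identical, so the rank-based coefficient is 1, while the outlier pulls the linear coefficient well below that.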

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 반입일자 | 발주처 | 공사명 | 반입량(t) | 반입대수
21276 | 2013-09-30 | 금천구청 | 가산동43번지일원침수방지사업 | 510 | 34
28912 | 2019-07-03 | 서부수도사업소 | 2018은평구관내상수도시설물유지관리공사 | 15 | 1
31529 | 2021-09-10 | 고양시청 | 고양드론센터건립공사 | 45 | 3
4525 | 2005-07-18 | 서울시시설관리공단 | 성동.광진구관내포장도로굴착복구공사 | 45 | 3
29518 | 2019-10-29 | 서울시설공단 | 2019안양천시민이용잔디광장조성공사 | 30 | 2
22466 | 2014-09-01 | 서울시시설관리공단 | 성북1배수지급수취약환경관망이중화공사 | 150 | 10
31153 | 2020-12-11 | 서울시도시기반시설본부 | 월드컵대교건설공사 | 45 | 3
7812 | 2006-09-08 | 서울시시설관리공단 | 은평구관내상수도시설물및소화전보수공사 | 90 | 6
27950 | 2018-12-06 | 동부수도사업소 | 2017광진구관내상수도시설물설치및보수공사 | 15 | 1
10189 | 2007-07-30 | 영등포수도사업소 | 영등포구관내포장도로굴착복구공사 | 15 | 1

Index | 반입일자 | 발주처 | 공사명 | 반입량(t) | 반입대수
27777 | 2018-11-13 | 중부수도사업소 | 장위동246-173~189호외6개소배급수관정비공사 | 60 | 4
3418 | 2005-01-07 | 송파구청 | 위례성길-성내천간도로개설공사 | 525 | 35
31417 | 2021-06-15 | 인천시 서구청 | 원당문화체육센터건립공사 | 180 | 12
11272 | 2007-11-21 | 강남수도사업소 | 강남구일원동남부순환로배수관부설공사 | 45 | 3
17383 | 2010-07-12 | 서울시시설관리공단 | 교보타워사거리-양재역간송배수관정비공사 | 375 | 25
13551 | 2008-11-12 | 중부수도사업소 | 2008명륜3동등5개동상수도및계량기교체공사 | 60 | 4
6461 | 2006-03-02 | 인천국제공항철도 | 인천국제공항철도2-3A공구 | 195 | 13
14311 | 2009-03-16 | SH공사 | 상암2지구3단지아파트건설공사 | 2715 | 181
12284 | 2008-05-23 | 영등포구청 | 대림운동장지하공영주차장건설공사 | 825 | 55
16634 | 2010-01-29 | 남부수도사업소 | 2009관악.동작구관내긴급누수복구공사 | 45 | 3

Duplicate rows

Most frequently occurring

Index | 반입일자 | 발주처 | 공사명 | 반입량(t) | 반입대수 | # duplicates
0 | 2004-01-08 | 서울시건설안전본부 | 서남권농수산물시장지하차도현장 | 870 | 58 | 2
1 | 2004-06-01 | 강서수도사업소 | 강서구관내포장도로굴착복구공사 | 75 | 5 | 2
2 | 2005-07-27 | 마포구청 | 토정길확장공사2-3공구 | 375 | 25 | 2
3 | 2006-07-03 | 남부수도사업소 | 동작구관내포장도로굴착복구공사 | 45 | 3 | 2
4 | 2006-07-03 | 서울시시설관리공단 | 종로4~6가간보도및시설물정비공사 | 180 | 12 | 2
5 | 2006-07-03 | 중부수도사업소 | 명동외2개동배급수관및불용관정비공사 | 60 | 4 | 2
6 | 2006-09-07 | 강남수도사업소 | 강남구관내포장도로굴착복구공사 | 45 | 3 | 2
7 | 2006-09-07 | 남부수도사업소 | 동작구관내포장도로굴착복구공사 | 30 | 2 | 2
8 | 2006-09-07 | 영등포수도사업소 | 시흥1동외5개동상수도공사 | 15 | 1 | 2
9 | 2009-04-14 | 남부수도사업소 | 신림1동외4개동상수도공사 | 75 | 5 | 2
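
The "# duplicates" column gives the number of times each fully identical row occurs. A minimal sketch of that count with Python's `collections.Counter`, treating each row as a tuple; the helper name is hypothetical, and the demo rows reuse values from the table with the repetition written out for illustration.

```python
from collections import Counter

def duplicate_groups(rows):
    """Return {row: count} for rows that occur more than once."""
    counts = Counter(rows)
    return {row: c for row, c in counts.items() if c > 1}

# Rows as tuples: (반입일자, 발주처, 공사명, 반입량(t), 반입대수).
rows = [
    ("2004-06-01", "강서수도사업소", "강서구관내포장도로굴착복구공사", 75, 5),
    ("2004-06-01", "강서수도사업소", "강서구관내포장도로굴착복구공사", 75, 5),
    ("2005-07-27", "마포구청", "토정길확장공사2-3공구", 375, 25),
]
dups = duplicate_groups(rows)
print(len(dups), sum(dups.values()))  # 1 2
```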