Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 3
Missing cells (%): < 0.1%
Duplicate rows: 7
Duplicate rows (%): 0.1%
Total size in memory: 732.4 KiB
Average record size in memory: 75.0 B
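
The percentages and the average record size follow from the raw counts. A minimal sketch in plain Python, assuming missing cells are measured against all cells (variables × observations) and duplicate rows against all observations; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics above.
n_variables = 8
n_observations = 10_000
missing_cells = 3
duplicate_rows = 7
total_size_kib = 732.4

# Assumption: missing cells are a share of all cells in the table.
missing_pct = 100 * missing_cells / (n_variables * n_observations)

# Assumption: duplicate rows are a share of all observations.
duplicate_pct = 100 * duplicate_rows / n_observations

# Average record size in bytes (1 KiB = 1024 bytes).
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"missing cells: {missing_pct:.5f}%")
print(f"duplicate rows: {duplicate_pct:.2f}%")
print(f"average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the missing share is far below 0.1%, the duplicate share rounds to 0.1%, and the record size rounds to 75.0 B, in line with the figures above.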

Variable types

Text: 1
Categorical: 5
Numeric: 1
DateTime: 1

Dataset

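Description: Quarterly extraction status data. Extraction volumes are quantified by local government, year, and quarter, divided into metropolitan and basic (municipal) levels, and managed systematically by business type.
Author: 국토교통부 (Ministry of Land, Infrastructure and Transport)
URL: https://www.data.go.kr/data/15122697/fileData.do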

Alerts

Constant: 생성자 has constant value "SYSTEM"
Duplicates: Dataset has 7 (0.1%) duplicate rows
High correlation: 구분 is highly overall correlated with 업종등록명
High correlation: 업종등록명 is highly overall correlated with 구분
Imbalance: 채취년도 is highly imbalanced (66.4%)
Skewed: 채취량(생산량) is highly skewed (γ1 = 53.8945305)
Zeros: 채취량(생산량) has 8979 (89.8%) zeros
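
The imbalance figure for 채취년도 can be reproduced from the class counts listed later in this report. The sketch below assumes the score is one minus the Shannon entropy of the class shares, normalised by the logarithm of the number of classes; that definition is an assumption here, although it does yield the reported 66.4% for these counts. The function name is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no entropy to normalise
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log(k)

# Counts per year, copied from the 채취년도 variable section.
year_counts = {"2022": 8754, "2020": 404, "2021": 353, "2019": 349, "2023": 140}
score = imbalance_score(list(year_counts.values()))
print(f"{score:.1%}")
```

A score of 0 means all classes are equally frequent; a score near 1 means a single class dominates, as 2022 does here.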

Reproduction

Analysis started: 2023-12-12 00:02:43.500508
Analysis finished: 2023-12-12 00:02:44.552022
Duration: 1.05 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
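
A report of this kind is typically produced by passing a pandas DataFrame to `ProfileReport`. The sketch below is one way to do that, not the exact call used for this report: the helper name `build_report`, the title, the description text, and the output path are all assumptions, and the `dataset` metadata keys are given as I understand the library's configuration.

```python
def build_report(df, output_path="report.html"):
    """Profile a pandas DataFrame and write the HTML report to output_path."""
    # Imported lazily so this function can be defined without the package installed.
    from ydata_profiling import ProfileReport

    report = ProfileReport(
        df,
        title="Profiling Report",  # illustrative title
        dataset={
            # Illustrative metadata; shown in the report's Dataset section.
            "description": "Quarterly extraction status data",
            "url": "https://www.data.go.kr/data/15122697/fileData.do",
        },
    )
    report.to_file(output_path)
    return report
```

Usage would look like `build_report(pd.read_csv("extraction.csv"))`, where `extraction.csv` is a hypothetical local copy of the file behind the URL above.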

Variables

자지체명
Text

Distinct: 182
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 7
Median length: 6
Mean length: 5.917
Min length: 2

Characters and Unicode

Total characters: 59170
Distinct characters: 118
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
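
As an illustration of such properties, Python's standard `unicodedata` module exposes the general category of each code point (`Lo` is Other Letter, `Zs` is Space Separator). This is a minimal sketch on three of the sample values; the helper name `category_counts` is an assumption, and the report itself may derive its categories differently.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count characters by Unicode general category (e.g. 'Lo', 'Zs')."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

sample = ["전남 완도군", "대구 북구", "경기 화성시"]
print(category_counts(sample))
print(unicodedata.name("전"))  # the code point's name starts with "HANGUL SYLLABLE"
```

Every character in these values is either a Hangul syllable or a space, which is why the report finds only two categories, two scripts, and two blocks.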

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 전남 완도군
2nd row: 대구 북구
3rd row: 경기 화성시
4th row: 전남 해남군
5th row: 강원 평창군

Value | Count | Frequency (%)
충남 | 1329 | 6.7%
경기 | 1275 | 6.4%
경북 | 1205 | 6.0%
전남 | 947 | 4.7%
경남 | 941 | 4.7%
부산 | 861 | 4.3%
강원 | 793 | 4.0%
전북 | 739 | 3.7%
충북 | 620 | 3.1%
대구 | 390 | 2.0%
Other values (174) | 10859 | 54.4%

Most occurring characters

Value | Count | Frequency (%)
(space) | 9959 | 16.8%
… | 4499 | 7.6%
… | 3800 | 6.4%
… | 3714 | 6.3%
… | 3570 | 6.0%
… | 2711 | 4.6%
… | 2301 | 3.9%
… | 1995 | 3.4%
… | 1905 | 3.2%
… | 1739 | 2.9%
Other values (108) | 22977 | 38.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 49211 | 83.2%
Space Separator | 9959 | 16.8%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
… | 4499 | 9.1%
… | 3800 | 7.7%
… | 3714 | 7.5%
… | 3570 | 7.3%
… | 2711 | 5.5%
… | 2301 | 4.7%
… | 1995 | 4.1%
… | 1905 | 3.9%
… | 1739 | 3.5%
… | 1319 | 2.7%
Other values (107) | 21658 | 44.0%

Space Separator
Value | Count | Frequency (%)
(space) | 9959 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 49211 | 83.2%
Common | 9959 | 16.8%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
… | 4499 | 9.1%
… | 3800 | 7.7%
… | 3714 | 7.5%
… | 3570 | 7.3%
… | 2711 | 5.5%
… | 2301 | 4.7%
… | 1995 | 4.1%
… | 1905 | 3.9%
… | 1739 | 3.5%
… | 1319 | 2.7%
Other values (107) | 21658 | 44.0%

Common
Value | Count | Frequency (%)
(space) | 9959 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 49211 | 83.2%
ASCII | 9959 | 16.8%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 9959 | 100.0%

Hangul
Value | Count | Frequency (%)
… | 4499 | 9.1%
… | 3800 | 7.7%
… | 3714 | 7.5%
… | 3570 | 7.3%
… | 2711 | 5.5%
… | 2301 | 4.7%
… | 1995 | 4.1%
… | 1905 | 3.9%
… | 1739 | 3.5%
… | 1319 | 2.7%
Other values (107) | 21658 | 44.0%

구분
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

채취실적: 3852
허가실적: 3800
신고채취실적: 2348

Length

Max length: 6
Median length: 4
Mean length: 4.4696
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 허가실적
2nd row: 채취실적
3rd row: 신고채취실적
4th row: 허가실적
5th row: 허가실적

Common Values

Value | Count | Frequency (%)
채취실적 | 3852 | 38.5%
허가실적 | 3800 | 38.0%
신고채취실적 | 2348 | 23.5%

Length

Histogram of lengths of the category

Common Values (Plot)


Value | Count | Frequency (%)
채취실적 | 3852 | 38.5%
허가실적 | 3800 | 38.0%
신고채취실적 | 2348 | 23.5%

업종등록명
Categorical

HIGH CORRELATION 

Distinct: 12
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

바다골재(모래): 1048
육상골재(자갈): 1023
산림골재(자갈): 1010
육상골재(모래): 1009
산림골재(모래): 997
Other values (7): 4913

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 하천골재(자갈)
2nd row: 하천골재(자갈)
3rd row: 산림골재(모래)
4th row: 바다골재(모래)
5th row: 하천골재(모래)

Common Values

Value | Count | Frequency (%)
바다골재(모래) | 1048 | 10.5%
육상골재(자갈) | 1023 | 10.2%
산림골재(자갈) | 1010 | 10.1%
육상골재(모래) | 1009 | 10.1%
산림골재(모래) | 997 | 10.0%
하천골재(모래) | 995 | 10.0%
하천골재(자갈) | 991 | 9.9%
바다골재(자갈) | 984 | 9.8%
선별파쇄(자갈) | 495 | 5.0%
선별세척(자갈) | 492 | 4.9%
Other values (2) | 956 | 9.6%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
바다골재(모래 | 1048 | 10.5%
육상골재(자갈 | 1023 | 10.2%
산림골재(자갈 | 1010 | 10.1%
육상골재(모래 | 1009 | 10.1%
산림골재(모래 | 997 | 10.0%
하천골재(모래 | 995 | 10.0%
하천골재(자갈 | 991 | 9.9%
바다골재(자갈 | 984 | 9.8%
선별파쇄(자갈 | 495 | 5.0%
선별세척(자갈 | 492 | 4.9%
Other values (2) | 956 | 9.6%

채취년도
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

2022: 8754
2020: 404
2021: 353
2019: 349
2023: 140

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2022
2nd row: 2022
3rd row: 2023
4th row: 2022
5th row: 2022

Common Values

Value | Count | Frequency (%)
2022 | 8754 | 87.5%
2020 | 404 | 4.0%
2021 | 353 | 3.5%
2019 | 349 | 3.5%
2023 | 140 | 1.4%

Length

Histogram of lengths of the category

Common Values (Plot)


Value | Count | Frequency (%)
2022 | 8754 | 87.5%
2020 | 404 | 4.0%
2021 | 353 | 3.5%
2019 | 349 | 3.5%
2023 | 140 | 1.4%

채취분기
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

1: 2517
2: 2505
3: 2491
4: 2487

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4
2nd row: 2
3rd row: 2
4th row: 4
5th row: 2

Common Values

Value | Count | Frequency (%)
1 | 2517 | 25.2%
2 | 2505 | 25.1%
3 | 2491 | 24.9%
4 | 2487 | 24.9%

Length

Histogram of lengths of the category

Common Values (Plot)


Value | Count | Frequency (%)
1 | 2517 | 25.2%
2 | 2505 | 25.1%
3 | 2491 | 24.9%
4 | 2487 | 24.9%

채취량(생산량)
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 303
Distinct (%): 3.0%
Missing: 3
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.713714
Minimum: 0
Maximum: 13561
Zeros: 8979
Zeros (%): 89.8%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 54
Maximum: 13561
Range: 13561
Interquartile range (IQR): 0
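
With 89.8% of the values equal to zero, every quartile falls inside the run of zeros, which is why Q1, the median, Q3, and the IQR are all 0. The sketch below shows the effect on toy data with the same shape (mostly zeros, one positive value); the data are made up for illustration and are not taken from the dataset.

```python
import statistics

# Toy data: nine zeros and a single positive value.
values = [0] * 9 + [54]

# Quartiles by linear interpolation between the sorted data points.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1
print(q1, median, q3, iqr)
```

Because the IQR is zero, IQR-based outlier rules are uninformative for this column; the upper percentiles and the maximum describe the tail better.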

Descriptive statistics

Standard deviation: 173.80823
Coefficient of variation (CV): 11.060926
Kurtosis: 3872.6896
Mean: 15.713714
Median Absolute Deviation (MAD): 0
Skewness: 53.894531
Sum: 157090
Variance: 30209.302
Monotonicity: Not monotonic
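
Several of these figures are tied together by simple identities: the mean is the sum over the non-missing count, the CV is the standard deviation over the mean, and the variance is the squared standard deviation. The sketch below checks them against the reported numbers and adds a small skewness helper. The helper uses the adjusted Fisher-Pearson estimator, which is assumed (not confirmed by the report) to be the one behind the Skewness figure; all names are illustrative.

```python
import math

# Figures copied from the report.
n_observations, n_missing = 10_000, 3
total, std = 157_090, 173.80823

n_valid = n_observations - n_missing   # non-missing values
mean = total / n_valid                 # reported: 15.713714
cv = std / mean                        # reported: 11.060926
variance = std ** 2                    # reported: 30209.302

def sample_skewness(xs):
    """Adjusted Fisher-Pearson skewness of a sequence with at least 3 values."""
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n   # second central moment
    m3 = sum((x - m) ** 3 for x in xs) / n   # third central moment
    g1 = m3 / m2 ** 1.5
    return g1 * math.sqrt(n * (n - 1)) / (n - 2)

# Toy zero-heavy data: one large value pulls the skewness strongly positive.
toy_skew = sample_skewness([0, 0, 0, 0, 10])
print(round(mean, 6), round(cv, 6), round(variance, 3), round(toy_skew, 3))
```

Even five toy values with a single non-zero entry give a skewness above 2; the report's value of about 53.9 reflects the same pattern at the scale of 8979 zeros and a maximum of 13561.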
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
0 | 8979 | 89.8%
2 | 26 | 0.3%
3 | 21 | 0.2%
10 | 20 | 0.2%
5 | 19 | 0.2%
7 | 18 | 0.2%
16 | 16 | 0.2%
15 | 14 | 0.1%
1 | 14 | 0.1%
21 | 14 | 0.1%
Other values (293) | 856 | 8.6%

Minimum 10 values

Value | Count | Frequency (%)
0 | 8979 | 89.8%
1 | 14 | 0.1%
2 | 26 | 0.3%
3 | 21 | 0.2%
4 | 10 | 0.1%
5 | 19 | 0.2%
6 | 11 | 0.1%
7 | 18 | 0.2%
8 | 8 | 0.1%
9 | 9 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
13561 | 1 | < 0.1%
6221 | 1 | < 0.1%
2573 | 1 | < 0.1%
2333 | 1 | < 0.1%
2307 | 1 | < 0.1%
2204 | 1 | < 0.1%
1983 | 1 | < 0.1%
1707 | 1 | < 0.1%
1585 | 1 | < 0.1%
1501 | 1 | < 0.1%

생성일
DateTime

Distinct: 20
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2020-01-18 00:00:00
Maximum: 2023-07-20 00:00:00
Histogram with fixed size bins (bins=20)

생성자
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

SYSTEM: 10000

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SYSTEM
2nd row: SYSTEM
3rd row: SYSTEM
4th row: SYSTEM
5th row: SYSTEM

Common Values

Value | Count | Frequency (%)
SYSTEM | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)


Value | Count | Frequency (%)
system | 10000 | 100.0%

Interactions


Correlations

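 | 구분 | 업종등록명 | 채취년도 | 채취분기 | 채취량(생산량) | 생성일
구분 | 1.000 | 0.882 | 0.164 | 0.000 | 0.017 | 0.256
업종등록명 | 0.882 | 1.000 | 0.000 | 0.000 | 0.084 | 0.000
채취년도 | 0.164 | 0.000 | 1.000 | 0.000 | 0.029 | 0.997
채취분기 | 0.000 | 0.000 | 0.000 | 1.000 | 0.008 | 0.826
채취량(생산량) | 0.017 | 0.084 | 0.029 | 0.008 | 1.000 | 0.132
생성일 | 0.256 | 0.000 | 0.997 | 0.826 | 0.132 | 1.000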
 | 업종등록명 | 채취년도 | 채취분기 | 구분
업종등록명 | 1.000 | 0.000 | 0.000 | 0.626
채취년도 | 0.000 | 1.000 | 0.000 | 0.125
채취분기 | 0.000 | 0.000 | 1.000 | 0.000
구분 | 0.626 | 0.125 | 0.000 | 1.000
 | 채취량(생산량) | 구분 | 업종등록명 | 채취년도 | 채취분기
채취량(생산량) | 1.000 | 0.016 | 0.039 | 0.024 | 0.003
구분 | 0.016 | 1.000 | 0.626 | 0.125 | 0.000
업종등록명 | 0.039 | 0.626 | 1.000 | 0.000 | 0.000
채취년도 | 0.024 | 0.125 | 0.000 | 1.000 | 0.000
채취분기 | 0.003 | 0.000 | 0.000 | 0.000 | 1.000
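
The extracted text does not say which coefficient each matrix uses. One common measure of association between two categorical columns, such as 구분 and 업종등록명, is Cramér's V. The sketch below is the uncorrected textbook form in plain Python, offered as an illustration and not as the report's exact computation; the function name is an assumption.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))   # observed counts per (x, y) pair
    row = Counter(xs)              # marginal counts of x
    col = Counter(ys)              # marginal counts of y
    chi2 = 0.0
    for r, r_count in row.items():
        for c, c_count in col.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Perfect association gives 1, independence gives 0.
print(cramers_v(list("aabb"), list("xxyy")))
print(cramers_v(list("aabb"), list("xyxy")))
```

Values range from 0 (no association) to 1 (one column determines the other).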

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

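First rows

(index) | 자지체명 | 구분 | 업종등록명 | 채취년도 | 채취분기 | 채취량(생산량) | 생성일 | 생성자
12516 | 전남 완도군 | 허가실적 | 하천골재(자갈) | 2022 | 4 | 0 | 2023-04-05 | SYSTEM
2275 | 대구 북구 | 채취실적 | 하천골재(자갈) | 2022 | 2 | 0 | 2022-12-14 | SYSTEM
4851 | 경기 화성시 | 신고채취실적 | 산림골재(모래) | 2023 | 2 | 0 | 2023-07-20 | SYSTEM
12041 | 전남 해남군 | 허가실적 | 바다골재(모래) | 2022 | 4 | 231 | 2023-04-05 | SYSTEM
6204 | 강원 평창군 | 허가실적 | 하천골재(모래) | 2022 | 2 | 1 | 2022-12-14 | SYSTEM
2354 | 대구 수성구 | 채취실적 | 하천골재(자갈) | 2022 | 2 | 0 | 2022-12-14 | SYSTEM
8876 | 충남 금산군 | 허가실적 | 바다골재(자갈) | 2022 | 4 | 0 | 2023-04-05 | SYSTEM
1379 | 부산 사하구 | 허가실적 | 바다골재(자갈) | 2022 | 2 | 0 | 2022-12-14 | SYSTEM
5387 | 경기 가평군 | 허가실적 | 바다골재(모래) | 2022 | 2 | 0 | 2022-12-14 | SYSTEM
2934 | 인천 옹진군 | 채취실적 | 산림골재(자갈) | 2022 | 2 | 0 | 2022-12-14 | SYSTEM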
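
Last rows

(index) | 자지체명 | 구분 | 업종등록명 | 채취년도 | 채취분기 | 채취량(생산량) | 생성일 | 생성자
15141 | 경남 거제시 | 채취실적 | 육상골재(모래) | 2022 | 1 | 0 | 2022-12-14 | SYSTEM
6306 | 강원 정선군 | 채취실적 | 바다골재(모래) | 2022 | 1 | 0 | 2022-12-14 | SYSTEM
14924 | 경남 사천시 | 채취실적 | 육상골재(자갈) | 2022 | 3 | 0 | 2022-12-14 | SYSTEM
9035 | 충남 서천군 | 허가실적 | 바다골재(모래) | 2022 | 4 | 0 | 2023-04-05 | SYSTEM
6145 | 강원 영월군 | 채취실적 | 육상골재(모래) | 2022 | 1 | 0 | 2022-12-14 | SYSTEM
3502 | 경기 의정부시 | 채취실적 | 육상골재(모래) | 2022 | 4 | 0 | 2023-04-05 | SYSTEM
9608 | 충남 예산군 | 채취실적 | 하천골재(모래) | 2021 | 4 | 0 | 2021-07-06 | SYSTEM
9675 | 충남 예산군 | 신고채취실적 | 선별파쇄(자갈) | 2021 | 3 | 0 | 2021-07-06 | SYSTEM
10059 | 전북 군산시 | 채취실적 | 산림골재(모래) | 2022 | 2 | 0 | 2022-12-14 | SYSTEM
13287 | 경북 상주시 | 채취실적 | 산림골재(자갈) | 2022 | 1 | 0 | 2022-12-14 | SYSTEM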

Duplicate rows

Most frequently occurring

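(index) | 자지체명 | 구분 | 업종등록명 | 채취년도 | 채취분기 | 채취량(생산량) | 생성일 | 생성자 | # duplicates
0 | 경북 청송군 | 신고채취실적 | 선별파쇄(모래) | 2022 | 4 | 0 | 2023-04-05 | SYSTEM | 2
1 | 경북 청송군 | 채취실적 | 바다골재(모래) | 2022 | 4 | 0 | 2023-04-05 | SYSTEM | 2
2 | 경북 청송군 | 채취실적 | 바다골재(자갈) | 2022 | 4 | 0 | 2023-04-05 | SYSTEM | 2
3 | 경북 청송군 | 채취실적 | 하천골재(자갈) | 2022 | 4 | 0 | 2023-04-05 | SYSTEM | 2
4 | 경북 청송군 | 허가실적 | 산림골재(자갈) | 2022 | 4 | 0 | 2023-04-05 | SYSTEM | 2
5 | 경북 청송군 | 허가실적 | 육상골재(모래) | 2022 | 4 | 0 | 2023-04-05 | SYSTEM | 2
6 | 경북 청송군 | 허가실적 | 하천골재(자갈) | 2022 | 4 | 0 | 2023-04-05 | SYSTEM | 2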
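
Duplicate groups like the ones listed above can be found by counting identical rows. A minimal sketch in plain Python, using two copies of one listed row plus one distinct row as toy input; the function name `duplicate_groups` is an assumption, and the report's own counting may differ in detail.

```python
from collections import Counter

def duplicate_groups(rows):
    """Return {row: count} for every row that occurs more than once."""
    counts = Counter(tuple(r) for r in rows)
    return {row: c for row, c in counts.items() if c > 1}

rows = [
    ("경북 청송군", "허가실적", "하천골재(자갈)", "2022", "4", 0, "2023-04-05", "SYSTEM"),
    ("경북 청송군", "허가실적", "하천골재(자갈)", "2022", "4", 0, "2023-04-05", "SYSTEM"),
    ("경북 청송군", "채취실적", "바다골재(모래)", "2022", "4", 0, "2023-04-05", "SYSTEM"),
]
groups = duplicate_groups(rows)
for row, count in groups.items():
    print(row, count)
```

All seven groups in this report come from 경북 청송군 for 2022, quarter 4, with the same 생성일, which suggests that one batch was loaded twice; that reading is an inference from the table and not something the report states.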