Overview

Dataset statistics

Number of variables: 7
Number of observations: 2408
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 35
Duplicate rows (%): 1.5%
Total size in memory: 141.2 KiB
Average record size in memory: 60.1 B
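The overview figures above can be recomputed from the raw rows. The following is a minimal sketch in plain Python, not the profiler's own implementation: `dataset_overview` and the toy rows are hypothetical, and counting each repeated row once as a "duplicate row" is an assumption about how the figure of 35 is defined.

```python
from collections import Counter


def dataset_overview(rows):
    """Summarise a table given as a list of equal-length tuples (hypothetical helper)."""
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    n_cells = n_rows * n_cols
    # A cell counts as missing when it is None.
    missing = sum(cell is None for row in rows for cell in row)
    # Assumption: a "duplicate row" is a distinct row that occurs more than once.
    duplicated = sum(1 for count in Counter(rows).values() if count > 1)
    return {
        "variables": n_cols,
        "observations": n_rows,
        "missing_cells": missing,
        "missing_pct": 100 * missing / n_cells if n_cells else 0.0,
        "duplicate_rows": duplicated,
        "duplicate_pct": 100 * duplicated / n_rows if n_rows else 0.0,
    }


# Toy rows for illustration only; they are not taken from the dataset.
toy = [("A", 2023, 1), ("A", 2023, 1), ("B", 2022, 4), ("C", 2023, None)]
overview = dataset_overview(toy)
print(overview)
```

With the report's own numbers, 35 duplicate rows out of 2408 observations gives 100 * 35 / 2408, which rounds to the 1.5% shown above.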

Variable types

Text: 2
Categorical: 2
Numeric: 2
DateTime: 1

Dataset

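Description: Provides end-of-life vehicle records from the recycling system for electrical and electronic products and automobiles. Columns: 업체명 (company name), 자동차 실적 번호 (vehicle record number), 실적 년도 (record year), 실적 분기 (record quarter), 해체 대수 (number of vehicles dismantled), 해체 인수 중량(kg) (weight received for dismantling, in kg), 등록일 (registration date).
Author: 환경부 (Ministry of Environment)
URL: https://www.data.go.kr/data/15092402/fileData.do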

Alerts

Dataset has 35 (1.5%) duplicate rows [Duplicates]
실적 분기 is highly overall correlated with 실적 년도 [High correlation]
실적 년도 is highly overall correlated with 실적 분기 [High correlation]
해체 대수 is highly overall correlated with 해체 인수 중량(kg) [High correlation]
해체 인수 중량(kg) is highly overall correlated with 해체 대수 [High correlation]
해체 대수 has 593 (24.6%) zeros [Zeros]
해체 인수 중량(kg) has 593 (24.6%) zeros [Zeros]
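The Zeros alerts report how many values in a numeric column are exactly zero. A minimal sketch of that count follows; `zero_share` is a hypothetical helper and the toy values are illustrative, not taken from the dataset.

```python
def zero_share(values):
    """Return (number of zeros, percentage of zeros) for a numeric sequence."""
    zeros = sum(1 for v in values if v == 0)
    pct = 100 * zeros / len(values) if values else 0.0
    return zeros, pct


# Toy values for illustration only.
zeros, pct = zero_share([0, 0, 12, 395, 0, 74, 1, 443])
print(zeros, round(pct, 1))
```

Applied to the figures in the alerts, 593 zeros out of 2408 observations is 100 * 593 / 2408, which rounds to 24.6%.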

Reproduction

Analysis started: 2024-04-06 08:03:54.439133
Analysis finished: 2024-04-06 08:03:59.399815
Duration: 4.96 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
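A report of this kind can be regenerated with ydata-profiling. The sketch below is an assumption-laden outline rather than the exact procedure used here: the function name `build_report`, the file paths, the report title, and the `cp949` encoding are all hypothetical, and the original `config.json` settings are not reproduced.

```python
def build_report(csv_path, html_path="report.html", encoding="cp949"):
    """Profile a CSV file and write the HTML report (hypothetical helper)."""
    # Imported lazily so that defining the function does not require the libraries.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the downloaded file is a CSV in a Korean legacy encoding.
    df = pd.read_csv(csv_path, encoding=encoding)
    profile = ProfileReport(df, title="End-of-life vehicle dismantling records")
    profile.to_file(html_path)
    return profile
```

Calling `build_report("elv_records.csv")` with a locally downloaded copy of the file would write `report.html`. Inferred variable types, and therefore some sections, may differ from this report depending on library version and configuration.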

Variables

업체명
Text

Distinct: 584
Distinct (%): 24.3%
Missing: 0
Missing (%): 0.0%
Memory size: 18.9 KiB

Length

Max length: 26
Median length: 17
Mean length: 9.519103
Min length: 2

Characters and Unicode

Total characters: 22922
Distinct characters: 290
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
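Python's standard `unicodedata` module exposes the general category of each code point, which is enough to sketch the category counts reported for a text variable. `category_counts` is a hypothetical helper; the two names are taken from the Sample rows of this variable. Script and block lookups are not available in the standard library and are left out.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters by Unicode general category across all strings."""
    counts = Counter()
    for text in values:
        for ch in text:
            counts[unicodedata.category(ch)] += 1
    return counts


names = ["(유한)강릉자동차폐차장", "문경자동차해체재활용산업"]
counts = category_counts(names)
# Two-letter codes: Lo = Other Letter, Ps = Open Punctuation, Pe = Close Punctuation.
print(counts)
```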

Unique

Unique: 5
Unique (%): 0.2%
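"Distinct" and "Unique" measure different things: here 584 different values occur, but only 5 of them occur exactly once. A minimal sketch of the distinction, assuming that reading of the two labels (`distinct_and_unique` and the toy list are hypothetical):

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for count in counts.values() if count == 1)
    return distinct, unique


# Toy values: "b" and "d" occur once, "a" and "c" occur twice.
distinct, unique = distinct_and_unique(["a", "a", "b", "c", "c", "d"])
print(distinct, unique)
```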

Sample

1st row: (유한)강릉자동차폐차장
2nd row: (유한)강릉자동차폐차장
3rd row: (유한)강릉자동차폐차장
4th row: (유한)강릉자동차폐차장
5th row: 문경자동차해체재활용산업

Value                         Count  Frequency (%)
주식회사                      176    6.5%
현대자동차해체재활용산업      16     0.6%
테스트_파쇄재활용업           12     0.4%
영남자동차해체재활용산업      12     0.4%
(not preserved)               12     0.4%
그린자동차해체재활용산업      9      0.3%
주)지알엠                     8      0.3%
에너지플러스                  8      0.3%
비전오토폐차                  8      0.3%
광양폐차장                    8      0.3%
Other values (595)            2427   90.0%

Most occurring characters

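Value               Count  Frequency (%)
(not preserved)     1853   8.1%
(not preserved)     1258   5.5%
(not preserved)     1142   5.0%
(not preserved)     1141   5.0%
(not preserved)     1115   4.9%
)                   1015   4.4%
(                   1011   4.4%
(not preserved)     860    3.8%
(not preserved)     774    3.4%
(not preserved)     769    3.4%
Other values (280)  11984  52.3%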

Most occurring categories

Value                  Count  Frequency (%)
Other Letter           20395  89.0%
Close Punctuation      1015   4.4%
Open Punctuation       1011   4.4%
Space Separator        292    1.3%
Uppercase Letter       108    0.5%
Lowercase Letter       41     0.2%
Connector Punctuation  32     0.1%
Decimal Number         12     0.1%
Dash Punctuation       8      < 0.1%
Other Punctuation      8      < 0.1%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
(not preserved)     1853   9.1%
(not preserved)     1258   6.2%
(not preserved)     1142   5.6%
(not preserved)     1141   5.6%
(not preserved)     1115   5.5%
(not preserved)     860    4.2%
(not preserved)     774    3.8%
(not preserved)     769    3.8%
(not preserved)     718    3.5%
(not preserved)     676    3.3%
Other values (254)  10089  49.5%

Uppercase Letter
Value  Count  Frequency (%)
R      32     29.6%
A      24     22.2%
C      24     22.2%
N      8      7.4%
S      4      3.7%
E      4      3.7%
O      4      3.7%
D      4      3.7%
X      4      3.7%

Lowercase Letter
Value  Count  Frequency (%)
o      8      19.5%
c      8      19.5%
x      5      12.2%
s      4      9.8%
r      4      9.8%
a      4      9.8%
t      4      9.8%
u      4      9.8%

Decimal Number
Value  Count  Frequency (%)
1      4      33.3%
5      4      33.3%
2      4      33.3%

Close Punctuation
Value  Count  Frequency (%)
)      1015   100.0%

Open Punctuation
Value  Count  Frequency (%)
(      1011   100.0%

Space Separator
Value    Count  Frequency (%)
(space)  292    100.0%

Connector Punctuation
Value  Count  Frequency (%)
_      32     100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      8      100.0%

Other Punctuation
Value  Count  Frequency (%)
&      8      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  20395  89.0%
Common  2378   10.4%
Latin   149    0.7%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
(not preserved)     1853   9.1%
(not preserved)     1258   6.2%
(not preserved)     1142   5.6%
(not preserved)     1141   5.6%
(not preserved)     1115   5.5%
(not preserved)     860    4.2%
(not preserved)     774    3.8%
(not preserved)     769    3.8%
(not preserved)     718    3.5%
(not preserved)     676    3.3%
Other values (254)  10089  49.5%

Latin
Value             Count  Frequency (%)
R                 32     21.5%
A                 24     16.1%
C                 24     16.1%
N                 8      5.4%
o                 8      5.4%
c                 8      5.4%
x                 5      3.4%
S                 4      2.7%
E                 4      2.7%
O                 4      2.7%
Other values (7)  28     18.8%

Common
Value    Count  Frequency (%)
)        1015   42.7%
(        1011   42.5%
(space)  292    12.3%
_        32     1.3%
-        8      0.3%
&        8      0.3%
1        4      0.2%
5        4      0.2%
2        4      0.2%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  20395  89.0%
ASCII   2527   11.0%

Most frequent character per block

Hangul
Value               Count  Frequency (%)
(not preserved)     1853   9.1%
(not preserved)     1258   6.2%
(not preserved)     1142   5.6%
(not preserved)     1141   5.6%
(not preserved)     1115   5.5%
(not preserved)     860    4.2%
(not preserved)     774    3.8%
(not preserved)     769    3.8%
(not preserved)     718    3.5%
(not preserved)     676    3.3%
Other values (254)  10089  49.5%

ASCII
Value              Count  Frequency (%)
)                  1015   40.2%
(                  1011   40.0%
(space)            292    11.6%
_                  32     1.3%
R                  32     1.3%
A                  24     0.9%
C                  24     0.9%
N                  8      0.3%
-                  8      0.3%
o                  8      0.3%
Other values (16)  73     2.9%

자동차 실적 번호
Text

Distinct: 2373
Distinct (%): 98.5%
Missing: 0
Missing (%): 0.0%
Memory size: 18.9 KiB

Length

Max length: 14
Median length: 14
Mean length: 14
Min length: 14

Characters and Unicode

Total characters: 33712
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 2338
Unique (%): 97.1%

Sample

1st row: CAT12300558655
2nd row: CAT12300559132
3rd row: CAT12300560054
4th row: CAT12300558008
5th row: CAT12300560012

Value                Count  Frequency (%)
cat12300559403       2      0.1%
cat12300558727       2      0.1%
cat32300560053       2      0.1%
cat12300559352       2      0.1%
cat12300560099       2      0.1%
cat12300559405       2      0.1%
cat12300557856       2      0.1%
cat12300559794       2      0.1%
cat12300558516       2      0.1%
cat32300558995       2      0.1%
Other values (2363)  2388   99.2%

Most occurring characters

Value             Count  Frequency (%)
0                 5765   17.1%
5                 5362   15.9%
2                 3164   9.4%
3                 3133   9.3%
1                 2948   8.7%
C                 2408   7.1%
A                 2408   7.1%
T                 2408   7.1%
9                 1641   4.9%
8                 1588   4.7%
Other values (3)  2887   8.6%

Most occurring categories

Value             Count  Frequency (%)
Decimal Number    26488  78.6%
Uppercase Letter  7224   21.4%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      5765   21.8%
5      5362   20.2%
2      3164   11.9%
3      3133   11.8%
1      2948   11.1%
9      1641   6.2%
8      1588   6.0%
7      1236   4.7%
6      958    3.6%
4      693    2.6%

Uppercase Letter
Value  Count  Frequency (%)
C      2408   33.3%
A      2408   33.3%
T      2408   33.3%

Most occurring scripts

Value   Count  Frequency (%)
Common  26488  78.6%
Latin   7224   21.4%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      5765   21.8%
5      5362   20.2%
2      3164   11.9%
3      3133   11.8%
1      2948   11.1%
9      1641   6.2%
8      1588   6.0%
7      1236   4.7%
6      958    3.6%
4      693    2.6%

Latin
Value  Count  Frequency (%)
C      2408   33.3%
A      2408   33.3%
T      2408   33.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  33712  100.0%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
0                 5765   17.1%
5                 5362   15.9%
2                 3164   9.4%
3                 3133   9.3%
1                 2948   8.7%
C                 2408   7.1%
A                 2408   7.1%
T                 2408   7.1%
9                 1641   4.9%
8                 1588   4.7%
Other values (3)  2887   8.6%

실적 년도
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 18.9 KiB

Value  Count
2023   1801
2022   607

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023
2nd row: 2023
3rd row: 2023
4th row: 2022
5th row: 2023

Common Values

Value  Count  Frequency (%)
2023   1801   74.8%
2022   607    25.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
2023   1801   74.8%
2022   607    25.2%

실적 분기
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 18.9 KiB

Value  Count
4      607
1      606
2      604
3      591

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 2
3rd row: 3
4th row: 4
5th row: 3

Common Values

Value  Count  Frequency (%)
4      607    25.2%
1      606    25.2%
2      604    25.1%
3      591    24.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
4      607    25.2%
1      606    25.2%
2      604    25.1%
3      591    24.5%

해체 대수
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 658
Distinct (%): 27.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 227.11628
Minimum: 0
Maximum: 4131
Zeros: 593
Zeros (%): 24.6%
Negative: 0
Negative (%): 0.0%
Memory size: 21.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1.75
Median: 103
Q3: 266
95-th percentile: 853.65
Maximum: 4131
Range: 4131
Interquartile range (IQR): 264.25
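The quantile statistics are tied together by simple arithmetic: the range is maximum minus minimum (4131 - 0) and the IQR is Q3 minus Q1 (266 - 1.75 = 264.25). The sketch below computes the same summary with the standard library. `quantile_summary` is a hypothetical helper, the toy values are illustrative, and the choice of the "inclusive" (linear interpolation) method is an assumption about how the report's quartiles were defined.

```python
import statistics


def quantile_summary(values):
    """Quartiles, range and IQR using linear interpolation between data points."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "range": max(values) - min(values),
        "iqr": q3 - q1,
    }


# Toy values for illustration only.
summary = quantile_summary([0, 0, 2, 4, 10])
print(summary)
```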

Descriptive statistics

Standard deviation: 385.6769
Coefficient of variation (CV): 1.6981473
Kurtosis: 19.403595
Mean: 227.11628
Median Absolute Deviation (MAD): 103
Skewness: 3.8222556
Sum: 546896
Variance: 148746.67
Monotonicity: Not monotonic
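Several of these descriptive statistics can be cross-checked against each other: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. A short worked check using the figures reported above (variable names are illustrative):

```python
# Figures copied from the report for 해체 대수.
n_obs = 2408
total = 546896
std = 385.6769

mean = total / n_obs   # reported as 227.11628
cv = std / mean        # reported as 1.6981473
variance = std ** 2    # reported as 148746.67

print(f"mean={mean:.5f} cv={cv:.7f} variance={variance:.2f}")
```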
Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
0                   593    24.6%
110                 12     0.5%
94                  12     0.5%
3                   11     0.5%
31                  11     0.5%
82                  10     0.4%
2                   10     0.4%
95                  10     0.4%
29                  10     0.4%
102                 9      0.4%
Other values (648)  1720   71.4%

Minimum 10 values

Value  Count  Frequency (%)
0      593    24.6%
1      9      0.4%
2      10     0.4%
3      11     0.5%
4      7      0.3%
5      2      0.1%
6      4      0.2%
7      5      0.2%
8      7      0.3%
9      6      0.2%

Maximum 10 values

Value  Count  Frequency (%)
4131   1      < 0.1%
3397   1      < 0.1%
2901   1      < 0.1%
2846   1      < 0.1%
2836   1      < 0.1%
2797   1      < 0.1%
2687   1      < 0.1%
2623   1      < 0.1%
2583   1      < 0.1%
2551   1      < 0.1%

해체 인수 중량(kg)
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 1792
Distinct (%): 74.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 323440.47
Minimum: 0
Maximum: 5234311
Zeros: 593
Zeros (%): 24.6%
Negative: 0
Negative (%): 0.0%
Memory size: 21.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1920
Median: 153171.5
Q3: 393905
95-th percentile: 1208509
Maximum: 5234311
Range: 5234311
Interquartile range (IQR): 391985

Descriptive statistics

Standard deviation: 514602.49
Coefficient of variation (CV): 1.5910269
Kurtosis: 15.551864
Mean: 323440.47
Median Absolute Deviation (MAD): 153171.5
Skewness: 3.3935936
Sum: 7.7884466 × 10^8
Variance: 2.6481573 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
0                    593    24.6%
199950               2      0.1%
151320               2      0.1%
159675               2      0.1%
910                  2      0.1%
29870                2      0.1%
149330               2      0.1%
231455               2      0.1%
1239286              2      0.1%
1167287              2      0.1%
Other values (1782)  1797   74.6%

Minimum 10 values

Value  Count  Frequency (%)
0      593    24.6%
600    1      < 0.1%
910    2      0.1%
1000   1      < 0.1%
1620   1      < 0.1%
1700   1      < 0.1%
1800   2      0.1%
1830   1      < 0.1%
1950   1      < 0.1%
2000   1      < 0.1%

Maximum 10 values

Value    Count  Frequency (%)
5234311  1      < 0.1%
4232152  1      < 0.1%
4176455  1      < 0.1%
3571285  1      < 0.1%
3437770  1      < 0.1%
3431519  1      < 0.1%
3336834  1      < 0.1%
3309811  1      < 0.1%
3231668  1      < 0.1%
3225984  1      < 0.1%

등록일
DateTime

Distinct: 89
Distinct (%): 3.7%
Missing: 0
Missing (%): 0.0%
Memory size: 18.9 KiB
Minimum: 2023-01-02 00:00:00
Maximum: 2023-11-01 00:00:00
Histogram with fixed size bins (bins=50)

Interactions


Correlations

                    실적 년도  실적 분기  해체 대수  해체 인수 중량(kg)  등록일
실적 년도           1.000      1.000      0.031      0.000               1.000
실적 분기           1.000      1.000      0.000      0.000               1.000
해체 대수           0.031      0.000      1.000      0.987               0.000
해체 인수 중량(kg)  0.000      0.000      0.987      1.000               0.000
등록일              1.000      1.000      0.000      0.000               1.000

           실적 분기  실적 년도
실적 분기  1.000      1.000
실적 년도  1.000      1.000

                    해체 대수  해체 인수 중량(kg)  실적 년도  실적 분기
해체 대수           1.000      0.993               0.024      0.000
해체 인수 중량(kg)  0.993      1.000               0.000      0.000
실적 년도           0.024      0.000               1.000      1.000
실적 분기           0.000      0.000               1.000      1.000
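For two numeric columns such as 해체 대수 and 해체 인수 중량(kg), a coefficient close to 1 means that one grows almost linearly with the other. The sketch below computes a plain Pearson coefficient as an illustration. It is not necessarily the measure used for any of the matrices above, whose method labels are not shown in this text; `pearson` and the toy pairs are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Toy data: weight is exactly proportional to count, so the coefficient is 1.
counts = [1, 2, 3, 4]
weights = [1400, 2800, 4200, 5600]
r = pearson(counts, weights)
print(round(r, 3))
```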

Missing values

Count: a simple visualization of nullity by column.
Matrix: the nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
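Both displays are derived from a per-cell "is missing" flag. The following is a minimal sketch of that underlying computation in plain Python, assuming missing cells are represented as `None`; `nullity_matrix`, `nullity_by_column`, and the toy rows are hypothetical. In this dataset every count would be zero, since the report lists 0 missing cells.

```python
def nullity_matrix(rows):
    """True where a cell is present, False where it is missing (None)."""
    return [[cell is not None for cell in row] for row in rows]


def nullity_by_column(rows, columns):
    """Number of missing cells per column name."""
    missing = {name: 0 for name in columns}
    for row in rows:
        for name, cell in zip(columns, row):
            if cell is None:
                missing[name] += 1
    return missing


# Toy rows for illustration only.
columns = ["name", "year", "count"]
rows = [("A", 2023, 10), ("B", None, 0), ("C", 2022, None), ("D", None, 5)]
matrix = nullity_matrix(rows)
missing = nullity_by_column(rows, columns)
print(missing)
```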

Sample

First rows

      업체명                          자동차 실적 번호  실적 년도  실적 분기  해체 대수  해체 인수 중량(kg)  등록일
0     (유한)강릉자동차폐차장          CAT12300558655    2023       1          395        779015              2023-04-14
1     (유한)강릉자동차폐차장          CAT12300559132    2023       2          443        947673              2023-07-10
2     (유한)강릉자동차폐차장          CAT12300560054    2023       3          457        950888              2023-10-16
3     (유한)강릉자동차폐차장          CAT12300558008    2022       4          419        869253              2023-01-16
4     문경자동차해체재활용산업        CAT12300560012    2023       3          74         91745               2023-10-13
5     문경자동차해체재활용산업        CAT12300557790    2022       4          114        155215              2023-01-09
6     문경자동차해체재활용산업        CAT12300559081    2023       2          104        127950              2023-07-06
7     문경자동차해체재활용산업        CAT12300558486    2023       1          94         109430              2023-04-11
8     (주) 현대자동차해체재활용산업   CAT12300557836    2022       4          482        708351              2023-01-10
9     (주) 현대자동차해체재활용산업   CAT12300558399    2023       1          428        619873              2023-04-07

Last rows

      업체명                          자동차 실적 번호  실적 년도  실적 분기  해체 대수  해체 인수 중량(kg)  등록일
2398  카이로폐차장주식회사            CAT12300557524    2022       4          360        368950              2023-01-02
2399  카이로폐차장주식회사            CAT12300559706    2023       3          307        357240              2023-10-05
2400  주식회사 엠케이인터내셔널       CAT22300559941    2023       3          0          0                   2023-10-12
2401  주식회사 엠케이인터내셔널       CAT22300558572    2023       1          0          0                   2023-04-13
2402  주식회사 엠케이인터내셔널       CAT22300557963    2022       4          0          0                   2023-01-13
2403  주식회사 엠케이인터내셔널       CAT22300559338    2023       2          0          0                   2023-07-14
2404  동영폐차산업(주)                CAT12300558016    2022       4          541        791938              2023-01-16
2405  동영폐차산업(주)                CAT12300558701    2023       1          512        756277              2023-04-17
2406  동영폐차산업(주)                CAT12300559375    2023       2          553        817876              2023-07-14
2407  동영폐차산업(주)                CAT12300559944    2023       3          496        731852              2023-10-12

Duplicate rows

Most frequently occurring

   업체명                          자동차 실적 번호  실적 년도  실적 분기  해체 대수  해체 인수 중량(kg)  등록일      # duplicates
0  (주)도나폐차서비스               CAT12300558071    2022       4          0          0                   2023-02-01  2
1  (주)도나폐차서비스               CAT12300558771    2023       1          0          0                   2023-05-01  2
2  (주)도나폐차서비스               CAT12300559476    2023       2          0          0                   2023-08-01  2
3  (주)도나폐차서비스               CAT12300560162    2023       3          0          0                   2023-11-01  2
4  (주)아이카파_연천폐차장          CAT12300558023    2022       4          0          0                   2023-01-16  2
5  (주)아이카파_연천폐차장          CAT12300558658    2023       1          0          0                   2023-04-15  2
6  (주)아이카파_연천폐차장          CAT12300559352    2023       2          0          0                   2023-07-14  2
7  (주)아이카파_연천폐차장          CAT12300560099    2023       3          1          910                 2023-10-16  2
8  (주)에스피네이처 슈레더사업소    CAT22300557992    2022       4          0          0                   2023-01-13  2
9  (주)에스피네이처 슈레더사업소    CAT22300558558    2023       1          0          0                   2023-04-12  2