Overview

Dataset statistics

Number of variables: 6
Number of observations: 91
Missing cells: 5
Missing cells (%): 0.9%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.5 KiB
Average record size in memory: 50.5 B
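
As a quick sanity check, the missing-cell percentage appears to be the number of missing cells divided by the total number of cells (variables × observations). The sketch below only re-derives that figure from the numbers listed above; the variable names are illustrative and not part of the report.

```python
# Figures copied from the dataset statistics above.
n_variables = 6
n_observations = 91
missing_cells = 5

# Total number of cells in the table, then the share that is missing.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```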

Variable types

Numeric: 1
Text: 2
Categorical: 3

Dataset

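Description: Provides the current status of construction machinery businesses in Sejong Special Self-Governing City (세종특별자치시). The data consists of business name (상호), business type (사업유형), registration class (등록종별), operating status (영업상태), and address (주소).
URL: https://www.data.go.kr/data/15113731/fileData.do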

Alerts

등록종별 is highly overall correlated with 순번 and 1 other field (High correlation)
사업유형 is highly overall correlated with 순번 and 1 other field (High correlation)
순번 is highly overall correlated with 사업유형 and 1 other field (High correlation)
영업상태 is highly imbalanced (84.8%) (Imbalance)
주소 has 5 (5.5%) missing values (Missing)
순번 has unique values (Unique)
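
The 84.8% imbalance figure for 영업상태 is consistent with a score of 1 minus the normalised Shannon entropy of the class counts (89 and 2, as listed later in the report). The sketch below assumes that definition; the function name `imbalance_score` is hypothetical and is not taken from the profiling library.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for a list of class counts.

    H is the Shannon entropy (base 2) of the class distribution and k the
    number of classes: 0 means perfectly balanced, values near 1 mean that
    one class dominates.
    """
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class is scored as 0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Class counts of 영업상태 as reported: 영업 = 89, 재개업 = 2.
score = imbalance_score([89, 2])
print(f"{100 * score:.1f}%")
```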

Reproduction

Analysis started: 2023-12-12 05:10:31.854957
Analysis finished: 2023-12-12 05:10:32.666113
Duration: 0.81 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
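
The reported duration follows directly from the two timestamps. A minimal check using only the standard library, with the timestamps copied from the lines above:

```python
from datetime import datetime

# Timestamps as printed in the Reproduction section.
started = datetime.fromisoformat("2023-12-12 05:10:31.854957")
finished = datetime.fromisoformat("2023-12-12 05:10:32.666113")

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```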

Variables

순번
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 91
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 46
Minimum: 1
Maximum: 91
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 951.0 B

Quantile statistics

Minimum: 1
5-th percentile: 5.5
Q1: 23.5
Median: 46
Q3: 68.5
95-th percentile: 86.5
Maximum: 91
Range: 90
Interquartile range (IQR): 45
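
These quantiles can be reproduced under the assumption that 순번 is simply the integer sequence 1 to 91 (consistent with 91 distinct values, minimum 1 and maximum 91) and that percentiles are linearly interpolated between neighbouring values. The sketch uses the standard library's inclusive method as a stand-in for whatever the profiler does internally.

```python
import statistics

# Assumption: 순번 is the gap-free sequence 1..91.
values = list(range(1, 92))

# Quartiles with linear interpolation between neighbouring data points.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")

# 5th and 95th percentiles are the first and last of the 19 cut points for n=20.
cuts = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = cuts[0], cuts[-1]

value_range = max(values) - min(values)
iqr = q3 - q1
print(p5, q1, median, q3, p95, value_range, iqr)
```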

Descriptive statistics

Standard deviation: 26.41338
Coefficient of variation (CV): 0.57420392
Kurtosis: -1.2
Mean: 46
Median Absolute Deviation (MAD): 23
Skewness: 0
Sum: 4186
Variance: 697.66667
Monotonicity: Strictly increasing
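
Most of the descriptive statistics above can be re-derived by hand if 순번 is taken to be the integers 1 to 91 (an assumption, though it matches the sum of 4186). The sketch uses the sample variance (divisor n - 1), which is what the reported variance and standard deviation agree with.

```python
import statistics

# Assumption: 순번 is the gap-free sequence 1..91.
values = list(range(1, 92))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, divisor n - 1
std = statistics.stdev(values)          # square root of the sample variance
cv = std / mean                         # coefficient of variation

# Median absolute deviation: median distance from the median.
med = statistics.median(values)
mad = statistics.median(abs(v - med) for v in values)

print(total, mean, round(variance, 5), round(std, 5), round(cv, 8), mad)
```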
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | 1.1%
59 | 1 | 1.1%
68 | 1 | 1.1%
67 | 1 | 1.1%
66 | 1 | 1.1%
65 | 1 | 1.1%
64 | 1 | 1.1%
63 | 1 | 1.1%
62 | 1 | 1.1%
61 | 1 | 1.1%
Other values (81) | 81 | 89.0%

Value | Count | Frequency (%)
1 | 1 | 1.1%
2 | 1 | 1.1%
3 | 1 | 1.1%
4 | 1 | 1.1%
5 | 1 | 1.1%
6 | 1 | 1.1%
7 | 1 | 1.1%
8 | 1 | 1.1%
9 | 1 | 1.1%
10 | 1 | 1.1%

Value | Count | Frequency (%)
91 | 1 | 1.1%
90 | 1 | 1.1%
89 | 1 | 1.1%
88 | 1 | 1.1%
87 | 1 | 1.1%
86 | 1 | 1.1%
85 | 1 | 1.1%
84 | 1 | 1.1%
83 | 1 | 1.1%
82 | 1 | 1.1%

상호
Text

Distinct: 87
Distinct (%): 95.6%
Missing: 0
Missing (%): 0.0%
Memory size: 860.0 B

Length

Max length: 18
Median length: 12
Mean length: 8.1318681
Min length: 4

Characters and Unicode

Total characters: 740
Distinct characters: 147
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 83
Unique (%): 91.2%

Sample

1st row: 보성중기(주)
2nd row: (합자)합동종합중기
3rd row: (주)유천건설
4th row: 개별중기
5th row: 재형건기(주)

Value | Count | Frequency (%)
주식회사 | 6 | 5.5%
세종지점 | 5 | 4.6%
한국복합물류(주 | 2 | 1.8%
충북지게차엔지니어링 | 2 | 1.8%
연기수도 | 2 | 1.8%
주)대일정공중부연합중기 | 2 | 1.8%
현대중장비(주 | 2 | 1.8%
대성건설 | 1 | 0.9%
보성중기(주 | 1 | 0.9%
주)천호산업 | 1 | 0.9%
Other values (85) | 85 | 78.0%

Most occurring characters

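Value | Count | Frequency (%)
[not preserved] | 53 | 7.2%
) | 50 | 6.8%
( | 50 | 6.8%
[not preserved] | 34 | 4.6%
[not preserved] | 26 | 3.5%
[not preserved] | 25 | 3.4%
[not preserved] | 23 | 3.1%
[not preserved] | 19 | 2.6%
(space) | 18 | 2.4%
[not preserved] | 17 | 2.3%
Other values (137) | 425 | 57.4%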

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 619 | 83.6%
Close Punctuation | 50 | 6.8%
Open Punctuation | 50 | 6.8%
Space Separator | 18 | 2.4%
Decimal Number | 3 | 0.4%

Most frequent character per category

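Other Letter
Value | Count | Frequency (%)
[not preserved] | 53 | 8.6%
[not preserved] | 34 | 5.5%
[not preserved] | 26 | 4.2%
[not preserved] | 25 | 4.0%
[not preserved] | 23 | 3.7%
[not preserved] | 19 | 3.1%
[not preserved] | 17 | 2.7%
[not preserved] | 16 | 2.6%
[not preserved] | 16 | 2.6%
[not preserved] | 15 | 2.4%
Other values (132) | 375 | 60.6%

Decimal Number
Value | Count | Frequency (%)
1 | 2 | 66.7%
4 | 1 | 33.3%

Close Punctuation
Value | Count | Frequency (%)
) | 50 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 50 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 18 | 100.0%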

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 619 | 83.6%
Common | 121 | 16.4%

Most frequent character per script

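Hangul
Value | Count | Frequency (%)
[not preserved] | 53 | 8.6%
[not preserved] | 34 | 5.5%
[not preserved] | 26 | 4.2%
[not preserved] | 25 | 4.0%
[not preserved] | 23 | 3.7%
[not preserved] | 19 | 3.1%
[not preserved] | 17 | 2.7%
[not preserved] | 16 | 2.6%
[not preserved] | 16 | 2.6%
[not preserved] | 15 | 2.4%
Other values (132) | 375 | 60.6%

Common
Value | Count | Frequency (%)
) | 50 | 41.3%
( | 50 | 41.3%
(space) | 18 | 14.9%
1 | 2 | 1.7%
4 | 1 | 0.8%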

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 619 | 83.6%
ASCII | 121 | 16.4%

Most frequent character per block

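Hangul
Value | Count | Frequency (%)
[not preserved] | 53 | 8.6%
[not preserved] | 34 | 5.5%
[not preserved] | 26 | 4.2%
[not preserved] | 25 | 4.0%
[not preserved] | 23 | 3.7%
[not preserved] | 19 | 3.1%
[not preserved] | 17 | 2.7%
[not preserved] | 16 | 2.6%
[not preserved] | 16 | 2.6%
[not preserved] | 15 | 2.4%
Other values (132) | 375 | 60.6%

ASCII
Value | Count | Frequency (%)
) | 50 | 41.3%
( | 50 | 41.3%
(space) | 18 | 14.9%
1 | 2 | 1.7%
4 | 1 | 0.8%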

사업유형
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 5.5%
Missing: 0
Missing (%): 0.0%
Memory size: 860.0 B

대여업: 63
정비업: 18
해체재활용업: 4
매매업: 4
등록번호제작자: 2

Length

Max length: 7
Median length: 3
Mean length: 3.2197802
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 대여업
2nd row: 대여업
3rd row: 대여업
4th row: 대여업
5th row: 대여업

Common Values

Value | Count | Frequency (%)
대여업 | 63 | 69.2%
정비업 | 18 | 19.8%
해체재활용업 | 4 | 4.4%
매매업 | 4 | 4.4%
등록번호제작자 | 2 | 2.2%
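
The frequency column and the mean length reported for 사업유형 both follow from the category counts. The sketch below re-derives them from the counts in the table above; it assumes that length is measured in Unicode code points, which matches the reported maximum length of 7 for 등록번호제작자.

```python
# Category counts for 사업유형, copied from the Common Values table.
counts = {
    "대여업": 63,
    "정비업": 18,
    "해체재활용업": 4,
    "매매업": 4,
    "등록번호제작자": 2,
}

total = sum(counts.values())

# Relative frequency of each category, in percent, rounded as in the report.
freq = {label: round(100 * n / total, 1) for label, n in counts.items()}

# Mean string length, weighting each label's length by its count.
mean_length = sum(len(label) * n for label, n in counts.items()) / total

print(total, freq, round(mean_length, 7))
```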

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
대여업 | 63 | 69.2%
정비업 | 18 | 19.8%
해체재활용업 | 4 | 4.4%
매매업 | 4 | 4.4%
등록번호제작자 | 2 | 2.2%

등록종별
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 7.7%
Missing: 0
Missing (%): 0.0%
Memory size: 860.0 B

개별: 32
일반: 31
<NA>: 10
종합(덤프 및 믹서트럭): 8
부분(일반): 7
Other values (2): 3

Length

Max length: 13
Median length: 2
Mean length: 3.6373626
Min length: 2

Unique

Unique: 1
Unique (%): 1.1%

Sample

1st row: 일반
2nd row: 일반
3rd row: 개별
4th row: 개별
5th row: 일반

Common Values

Value | Count | Frequency (%)
개별 | 32 | 35.2%
일반 | 31 | 34.1%
<NA> | 10 | 11.0%
종합(덤프 및 믹서트럭) | 8 | 8.8%
부분(일반) | 7 | 7.7%
전문(유압) | 2 | 2.2%
종합(굴착기) | 1 | 1.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
개별 | 32 | 29.9%
일반 | 31 | 29.0%
na | 10 | 9.3%
종합(덤프 | 8 | 7.5%
및 | 8 | 7.5%
믹서트럭 | 8 | 7.5%
부분(일반 | 7 | 6.5%
전문(유압 | 2 | 1.9%
종합(굴착기 | 1 | 0.9%

영업상태
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 860.0 B

영업: 89
재개업: 2

Length

Max length: 3
Median length: 2
Mean length: 2.021978
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 영업
2nd row: 영업
3rd row: 영업
4th row: 영업
5th row: 영업

Common Values

Value | Count | Frequency (%)
영업 | 89 | 97.8%
재개업 | 2 | 2.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
영업 | 89 | 97.8%
재개업 | 2 | 2.2%

주소
Text

MISSING 

Distinct: 74
Distinct (%): 86.0%
Missing: 5
Missing (%): 5.5%
Memory size: 860.0 B

Length

Max length: 40
Median length: 29
Mean length: 21.895349
Min length: 17

Characters and Unicode

Total characters: 1883
Distinct characters: 140
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 67
Unique (%): 77.9%
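
The percentages for 주소 only add up if the five missing rows are excluded from the denominator: distinct, unique and mean length all appear to be computed over the 86 non-missing values, while the missing percentage uses all 91 rows. The sketch below checks this reading against the reported figures; it is an inference from the numbers, not a statement about the library's internals.

```python
# Figures copied from the 주소 statistics in this report.
n_observations = 91
n_missing = 5
distinct = 74
unique = 67
total_characters = 1883

# Rows that actually contain an address.
n_present = n_observations - n_missing

missing_pct = 100 * n_missing / n_observations  # over all rows
distinct_pct = 100 * distinct / n_present       # over non-missing rows
unique_pct = 100 * unique / n_present           # over non-missing rows
mean_length = total_characters / n_present      # characters per address

print(n_present, round(missing_pct, 1), round(distinct_pct, 1),
      round(unique_pct, 1), round(mean_length, 6))
```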

Sample

1st row: 세종특별자치시 조치원읍 새내로 197
2nd row: 세종특별자치시 조치원읍 충현로 34
3rd row: 세종특별자치시 전의면 운주산로 1220
4th row: 세종특별자치시 조치원읍 새내로 148-1
5th row: 세종특별자치시 전동면 수회길 76-21

Value | Count | Frequency (%)
세종특별자치시 | 85 | 23.2%
부강면 | 21 | 5.7%
조치원읍 | 17 | 4.6%
연동면 | 11 | 3.0%
금남면 | 7 | 1.9%
전동면 | 6 | 1.6%
청연로 | 5 | 1.4%
연청로 | 5 | 1.4%
연기면 | 5 | 1.4%
세종로 | 5 | 1.4%
Other values (151) | 199 | 54.4%

Most occurring characters

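Value | Count | Frequency (%)
(space) | 294 | 15.6%
[not preserved] | 102 | 5.4%
[not preserved] | 90 | 4.8%
[not preserved] | 90 | 4.8%
[not preserved] | 86 | 4.6%
[not preserved] | 85 | 4.5%
[not preserved] | 85 | 4.5%
[not preserved] | 85 | 4.5%
[not preserved] | 64 | 3.4%
1 | 61 | 3.2%
Other values (130) | 841 | 44.7%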

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1244 | 66.1%
Decimal Number | 302 | 16.0%
Space Separator | 294 | 15.6%
Dash Punctuation | 29 | 1.5%
Close Punctuation | 6 | 0.3%
Open Punctuation | 6 | 0.3%
Uppercase Letter | 2 | 0.1%

Most frequent character per category

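Other Letter
Value | Count | Frequency (%)
[not preserved] | 102 | 8.2%
[not preserved] | 90 | 7.2%
[not preserved] | 90 | 7.2%
[not preserved] | 86 | 6.9%
[not preserved] | 85 | 6.8%
[not preserved] | 85 | 6.8%
[not preserved] | 85 | 6.8%
[not preserved] | 64 | 5.1%
[not preserved] | 53 | 4.3%
[not preserved] | 32 | 2.6%
Other values (114) | 472 | 37.9%

Decimal Number
Value | Count | Frequency (%)
1 | 61 | 20.2%
2 | 48 | 15.9%
3 | 34 | 11.3%
4 | 32 | 10.6%
7 | 27 | 8.9%
9 | 26 | 8.6%
5 | 24 | 7.9%
0 | 21 | 7.0%
6 | 16 | 5.3%
8 | 13 | 4.3%

Uppercase Letter
Value | Count | Frequency (%)
A | 1 | 50.0%
B | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 294 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 29 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 6 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 6 | 100.0%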

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1244 | 66.1%
Common | 637 | 33.8%
Latin | 2 | 0.1%

Most frequent character per script

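Hangul
Value | Count | Frequency (%)
[not preserved] | 102 | 8.2%
[not preserved] | 90 | 7.2%
[not preserved] | 90 | 7.2%
[not preserved] | 86 | 6.9%
[not preserved] | 85 | 6.8%
[not preserved] | 85 | 6.8%
[not preserved] | 85 | 6.8%
[not preserved] | 64 | 5.1%
[not preserved] | 53 | 4.3%
[not preserved] | 32 | 2.6%
Other values (114) | 472 | 37.9%

Common
Value | Count | Frequency (%)
(space) | 294 | 46.2%
1 | 61 | 9.6%
2 | 48 | 7.5%
3 | 34 | 5.3%
4 | 32 | 5.0%
- | 29 | 4.6%
7 | 27 | 4.2%
9 | 26 | 4.1%
5 | 24 | 3.8%
0 | 21 | 3.3%
Other values (4) | 41 | 6.4%

Latin
Value | Count | Frequency (%)
A | 1 | 50.0%
B | 1 | 50.0%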

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1244 | 66.1%
ASCII | 639 | 33.9%

Most frequent character per block

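ASCII
Value | Count | Frequency (%)
(space) | 294 | 46.0%
1 | 61 | 9.5%
2 | 48 | 7.5%
3 | 34 | 5.3%
4 | 32 | 5.0%
- | 29 | 4.5%
7 | 27 | 4.2%
9 | 26 | 4.1%
5 | 24 | 3.8%
0 | 21 | 3.3%
Other values (6) | 43 | 6.7%

Hangul
Value | Count | Frequency (%)
[not preserved] | 102 | 8.2%
[not preserved] | 90 | 7.2%
[not preserved] | 90 | 7.2%
[not preserved] | 86 | 6.9%
[not preserved] | 85 | 6.8%
[not preserved] | 85 | 6.8%
[not preserved] | 85 | 6.8%
[not preserved] | 64 | 5.1%
[not preserved] | 53 | 4.3%
[not preserved] | 32 | 2.6%
Other values (114) | 472 | 37.9%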

Interactions


Correlations

 | 순번 | 상호 | 사업유형 | 등록종별 | 영업상태 | 주소
순번 | 1.000 | 0.726 | 0.912 | 0.749 | 0.000 | 0.848
상호 | 0.726 | 1.000 | 0.000 | 0.842 | 1.000 | 1.000
사업유형 | 0.912 | 0.000 | 1.000 | 1.000 | 0.000 | 0.000
등록종별 | 0.749 | 0.842 | 1.000 | 1.000 | 0.000 | 0.930
영업상태 | 0.000 | 1.000 | 0.000 | 0.000 | 1.000 | 1.000
주소 | 0.848 | 1.000 | 0.000 | 0.930 | 1.000 | 1.000

 | 등록종별 | 사업유형 | 영업상태
등록종별 | 1.000 | 0.974 | 0.000
사업유형 | 0.974 | 1.000 | 0.000
영업상태 | 0.000 | 0.000 | 1.000

 | 순번 | 사업유형 | 등록종별 | 영업상태
순번 | 1.000 | 0.592 | 0.505 | 0.000
사업유형 | 0.592 | 1.000 | 0.974 | 0.000
등록종별 | 0.505 | 0.974 | 1.000 | 0.000
영업상태 | 0.000 | 0.000 | 0.000 | 1.000
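
The three "High correlation" alerts at the top of the report line up with the last matrix above: each of 순번, 사업유형 and 등록종별 has two partners whose coefficient is at least 0.5, and 영업상태 has none. The sketch below reproduces that reading. The 0.5 threshold and the helper name `high_correlations` are assumptions made for illustration, not values taken from the report's configuration.

```python
# Last correlation matrix of the report, copied row by row.
names = ["순번", "사업유형", "등록종별", "영업상태"]
matrix = [
    [1.000, 0.592, 0.505, 0.000],
    [0.592, 1.000, 0.974, 0.000],
    [0.505, 0.974, 1.000, 0.000],
    [0.000, 0.000, 0.000, 1.000],
]

THRESHOLD = 0.5  # assumed alert threshold


def high_correlations(names, matrix, threshold):
    """Map each variable to the other variables whose |coefficient| >= threshold."""
    flagged = {}
    for i, row in enumerate(matrix):
        partners = [
            names[j]
            for j, value in enumerate(row)
            if j != i and abs(value) >= threshold
        ]
        if partners:
            flagged[names[i]] = partners
    return flagged


alerts = high_correlations(names, matrix, THRESHOLD)
for variable, partners in alerts.items():
    print(variable, "->", ", ".join(partners))
```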

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | 순번 | 상호 | 사업유형 | 등록종별 | 영업상태 | 주소
0 | 1 | 보성중기(주) | 대여업 | 일반 | 영업 | 세종특별자치시 조치원읍 새내로 197
1 | 2 | (합자)합동종합중기 | 대여업 | 일반 | 영업 | 세종특별자치시 조치원읍 충현로 34
2 | 3 | (주)유천건설 | 대여업 | 개별 | 영업 | 세종특별자치시 전의면 운주산로 1220
3 | 4 | 개별중기 | 대여업 | 개별 | 영업 | <NA>
4 | 5 | 재형건기(주) | 대여업 | 일반 | 영업 | 세종특별자치시 조치원읍 새내로 148-1
5 | 6 | (주)대청환경 | 대여업 | 일반 | 영업 | 세종특별자치시 전동면 수회길 76-21
6 | 7 | (합)대명건설 | 대여업 | 일반 | 영업 | 세종특별자치시 전동면 운주산로 708
7 | 8 | 모아건설중기 | 대여업 | 개별 | 영업 | <NA>
8 | 9 | 금강개별중기 | 대여업 | 개별 | 영업 | <NA>
9 | 10 | (주)세종종합중기조합 | 대여업 | 일반 | 영업 | 세종특별자치시 연기면 수왕로 56-3

Index | 순번 | 상호 | 사업유형 | 등록종별 | 영업상태 | 주소
81 | 82 | 금강자동차서비스(주) | 정비업 | 종합(덤프 및 믹서트럭) | 영업 | 세종특별자치시 부강면 연청로 745-46
82 | 83 | 만트럭버스코리아(주) | 정비업 | 종합(덤프 및 믹서트럭) | 영업 | 세종특별자치시 부강면 문곡리 19번지 2호 외 9필지
83 | 84 | 국제모터스 | 정비업 | 종합(덤프 및 믹서트럭) | 영업 | 세종특별자치시 전동면 심중리 473번지 5호
84 | 85 | 케이앤리(주) | 정비업 | 종합(덤프 및 믹서트럭) | 영업 | 세종특별자치시 소정면 운당리 91번지 7호
85 | 86 | 중원중기공업 | 정비업 | 전문(유압) | 영업 | 세종특별자치시 부강면 연청로 1149
86 | 87 | 주식회사 유로파트 세종정비공장지점 | 정비업 | 종합(덤프 및 믹서트럭) | 영업 | 세종특별자치시 연기면 연청로 269-5
87 | 88 | 한국복합물류(주) | 매매업 | <NA> | 영업 | 세종특별자치시 부강면 연청로 745-46
88 | 89 | 충북지게차엔지니어링 | 매매업 | <NA> | 영업 | 세종특별자치시 부강면 금호선말길 4
89 | 90 | 펌프카114 | 매매업 | <NA> | 영업 | 세종특별자치시 조치원읍 세종로 2749
90 | 91 | 현대중장비(주) | 매매업 | <NA> | 영업 | 세종특별자치시 연동면 명학산단1로 7