Overview

Dataset statistics

Number of variables: 6
Number of observations: 574
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 27.6 KiB
Average record size in memory: 49.2 B
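
The derived figures in this block can be cross-checked from the raw ones. The sketch below assumes that the average record size is simply the total memory divided by the number of observations, with 1 KiB taken as 1024 bytes. The helper names are hypothetical and are not part of the profiling tool.

```python
# Figures copied from the "Dataset statistics" block of the report.
TOTAL_KIB = 27.6   # total size in memory, already rounded to 0.1 KiB
N_ROWS = 574       # number of observations
N_COLS = 6         # number of variables


def average_record_size(total_kib: float, n_rows: int) -> float:
    """Bytes per row, assuming 1 KiB = 1024 bytes."""
    return total_kib * 1024 / n_rows


def missing_pct(missing_cells: int, n_rows: int, n_cols: int) -> float:
    """Share of missing cells among all n_rows * n_cols cells, in percent."""
    return 100 * missing_cells / (n_rows * n_cols)


print(round(average_record_size(TOTAL_KIB, N_ROWS), 1))  # bytes per record
print(missing_pct(0, N_ROWS, N_COLS))                    # percent missing
```

Because the total is itself rounded, the recomputed record size only agrees with the reported 49.2 B to the displayed precision.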

Variable types

Numeric: 1
Categorical: 3
Text: 2

Dataset

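Description: Registration status of construction machinery businesses in the construction industry of Gyeongsangbuk-do (경상북도), as of May 2023. Fields include business name, business type, registration class, and address.
URL: https://www.data.go.kr/data/15113741/fileData.do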

Alerts

사업유형 has constant value "대여업" (Constant)
상태 is highly imbalanced (94.9%) (Imbalance)
순번 has unique values (Unique)
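
The 94.9% imbalance figure for 상태 is consistent with one minus the normalized Shannon entropy of the class counts reported for that variable (569, 3 and 2). The sketch below reproduces the number under that assumption. Whether the profiling tool computes the score exactly this way is not stated in the report, and the function name is hypothetical.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the class shares.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    A single-class input returns 0.0 by convention in this sketch.
    """
    n_classes = len(counts)
    if n_classes <= 1:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)


# Class counts taken from the 상태 variable in this report.
status_counts = {"영업": 569, "휴업": 3, "재개업": 2}
score = imbalance_score(list(status_counts.values()))
print(f"{score:.1%}")
```

A perfectly balanced variable scores 0 under this definition, so a value this close to 1 indicates that nearly every row shares a single category.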

Reproduction

Analysis started: 2023-12-12 22:52:42.611590
Analysis finished: 2023-12-12 22:52:43.290240
Duration: 0.68 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

순번
Real number (ℝ)

UNIQUE 

Distinct: 574
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 287.5
Minimum: 1
Maximum: 574
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 5.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 29.65
Q1: 144.25
Median: 287.5
Q3: 430.75
95-th percentile: 545.35
Maximum: 574
Range: 573
Interquartile range (IQR): 286.5
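
These quantiles match what linear interpolation yields for the integers 1 to 574. The sketch below assumes that 순번 is exactly that sequence (suggested by the minimum, maximum, distinct count and sum in the report, but not stated outright) and uses the standard library's inclusive quantile method as a stand-in for whatever the profiler uses internally.

```python
import statistics

# Assumption: 순번 is the running index 1, 2, ..., 574.
values = list(range(1, 575))

# method="inclusive" interpolates linearly between the sample minimum and maximum.
quartiles = statistics.quantiles(values, n=4, method="inclusive")
twentieths = statistics.quantiles(values, n=20, method="inclusive")

q1, median, q3 = quartiles
p5, p95 = twentieths[0], twentieths[-1]   # 5th and 95th percentiles
iqr = q3 - q1
value_range = max(values) - min(values)

print(p5, q1, median, q3, p95, value_range, iqr)
```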

Descriptive statistics

Standard deviation: 165.8438
Coefficient of variation (CV): 0.57684801
Kurtosis: -1.2
Mean: 287.5
Median Absolute Deviation (MAD): 143.5
Skewness: 0
Sum: 165025
Variance: 27504.167
Monotonicity: Strictly increasing
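
The descriptive statistics can be recomputed in the same way. This is a sketch under the assumption that 순번 is exactly 1 to 574. The variance uses the sample (n - 1) denominator, which matches the reported value. Skewness and kurtosis are computed here from plain population moments, which may differ slightly from the estimator the profiler uses but agrees at the displayed precision.

```python
import math
import statistics

# Assumption: 순번 is the running index 1, 2, ..., 574.
values = list(range(1, 575))
n = len(values)

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)      # sample variance, n - 1 denominator
std = math.sqrt(variance)
cv = std / mean                             # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

# Population central moments for skewness and excess kurtosis.
m2 = sum((v - mean) ** 2 for v in values) / n
m3 = sum((v - mean) ** 3 for v in values) / n
m4 = sum((v - mean) ** 4 for v in values) / n
skewness = m3 / m2 ** 1.5
excess_kurtosis = m4 / m2 ** 2 - 3

# All differences between neighbours are positive, so the column is strictly increasing.
strictly_increasing = all(b > a for a, b in zip(values, values[1:]))

print(total, mean, round(variance, 3), round(std, 4), round(cv, 8), mad)
```

An excess kurtosis near -1.2 and zero skewness are what a uniformly spaced sequence produces, which fits a row counter rather than a measured quantity.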
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | 0.2%
387 | 1 | 0.2%
381 | 1 | 0.2%
382 | 1 | 0.2%
383 | 1 | 0.2%
384 | 1 | 0.2%
385 | 1 | 0.2%
386 | 1 | 0.2%
388 | 1 | 0.2%
379 | 1 | 0.2%
Other values (564) | 564 | 98.3%

Value | Count | Frequency (%)
1 | 1 | 0.2%
2 | 1 | 0.2%
3 | 1 | 0.2%
4 | 1 | 0.2%
5 | 1 | 0.2%
6 | 1 | 0.2%
7 | 1 | 0.2%
8 | 1 | 0.2%
9 | 1 | 0.2%
10 | 1 | 0.2%

Value | Count | Frequency (%)
574 | 1 | 0.2%
573 | 1 | 0.2%
572 | 1 | 0.2%
571 | 1 | 0.2%
570 | 1 | 0.2%
569 | 1 | 0.2%
568 | 1 | 0.2%
567 | 1 | 0.2%
566 | 1 | 0.2%
565 | 1 | 0.2%

상태
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 KiB
영업: 569
휴업: 3
재개업: 2

Length

Max length: 3
Median length: 2
Mean length: 2.0034843
Min length: 2
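
The length statistics follow directly from the category counts: 572 rows hold a two-character value and 2 rows hold the three-character value 재개업. The sketch below rebuilds them from the counts in this report, assuming that length means the number of Unicode code points in precomposed Hangul text.

```python
from statistics import mean, median

# Category counts taken from the 상태 variable in this report.
status_counts = {"영업": 569, "휴업": 3, "재개업": 2}

# One length per row; len() counts code points, so each precomposed
# Hangul syllable contributes 1.
lengths = [len(value) for value, count in status_counts.items() for _ in range(count)]

mean_length = mean(lengths)
median_length = median(lengths)

print(max(lengths), median_length, round(mean_length, 7), min(lengths))
```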

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 영업
2nd row: 영업
3rd row: 영업
4th row: 영업
5th row: 영업

Common Values

Value | Count | Frequency (%)
영업 | 569 | 99.1%
휴업 | 3 | 0.5%
재개업 | 2 | 0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
영업 | 569 | 99.1%
휴업 | 3 | 0.5%
재개업 | 2 | 0.3%

상호(명칭)
Text

Distinct: 567
Distinct (%): 98.8%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 KiB

Length

Max length: 15
Median length: 14
Mean length: 7.587108
Min length: 2

Characters and Unicode

Total characters: 4355
Distinct characters: 281
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
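
As a rough illustration of how such character properties can be read, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The sample string is the first sample row of this variable. The mapping from two-letter codes to the long names used in the report is written out by hand, and the function name is hypothetical.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Zs": "Space Separator",
    "Nd": "Decimal Number",
    "Lu": "Uppercase Letter",
    "Po": "Other Punctuation",
}


def category_counts(text):
    """Count characters in text by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


sample = "대림종합중기(주)"   # first sample row of 상호(명칭)
counts = category_counts(sample)

for code, count in counts.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
```

Hangul syllables fall under "Other Letter" because the script has no case distinction, which is why that category dominates the tables for this variable.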

Unique

Unique: 560
Unique (%): 97.6%
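
"Distinct" and "Unique" measure different things here, assuming the usual reading in which distinct counts the different values present and unique counts the values that occur exactly once. The sketch below shows the difference on hypothetical toy data and then applies the same bookkeeping to the figures reported for this variable.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Hypothetical toy data, not taken from the dataset.
toy = ["A", "B", "B", "C", "D", "D"]
print(distinct_and_unique(toy))

# Bookkeeping with the reported figures for 상호(명칭).
n_rows, n_distinct, n_unique = 574, 567, 560
repeated_values = n_distinct - n_unique   # values that occur more than once
rows_with_repeats = n_rows - n_unique     # rows holding those values
print(repeated_values, rows_with_repeats)
```

With 7 repeated values spread over 14 rows, each repeated business name must appear exactly twice.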

Sample

1st row: 대림종합중기(주)
2nd row: 대창건설(합)
3rd row: 태원종합중기(주)
4th row: (주)에스피네이처 하역사업소
5th row: 동우종합중기(주)

Value | Count | Frequency (%)
주식회사 | 64 | 9.7%
포항지점 | 3 | 0.5%
주)대성중기 | 2 | 0.3%
지원중기(주 | 2 | 0.3%
주)대도건기 | 2 | 0.3%
김천종합중기 | 2 | 0.3%
삼양종합건설(주 | 2 | 0.3%
보성종합건기(주 | 2 | 0.3%
주)미래종합건기 | 2 | 0.3%
주)에스피네이처 | 2 | 0.3%
Other values (576) | 579 | 87.5%

Most occurring characters

ValueCountFrequency (%)
434
 
10.0%
( 354
 
8.1%
) 354
 
8.1%
261
 
6.0%
168
 
3.9%
149
 
3.4%
108
 
2.5%
96
 
2.2%
90
 
2.1%
88
 
2.0%
Other values (271) 2253
51.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 3530 | 81.1%
Open Punctuation | 354 | 8.1%
Close Punctuation | 354 | 8.1%
Space Separator | 88 | 2.0%
Decimal Number | 19 | 0.4%
Uppercase Letter | 8 | 0.2%
Other Punctuation | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
434
 
12.3%
261
 
7.4%
168
 
4.8%
149
 
4.2%
108
 
3.1%
96
 
2.7%
90
 
2.5%
85
 
2.4%
74
 
2.1%
72
 
2.0%
Other values (252) 1993
56.5%
Decimal Number
ValueCountFrequency (%)
1 4
21.1%
2 4
21.1%
5 3
15.8%
4 2
10.5%
6 2
10.5%
9 2
10.5%
3 2
10.5%
Uppercase Letter
ValueCountFrequency (%)
C 2
25.0%
N 1
12.5%
L 1
12.5%
F 1
12.5%
E 1
12.5%
H 1
12.5%
P 1
12.5%
Other Punctuation
ValueCountFrequency (%)
· 1
50.0%
. 1
50.0%
Open Punctuation
ValueCountFrequency (%)
( 354
100.0%
Close Punctuation
ValueCountFrequency (%)
) 354
100.0%
Space Separator
ValueCountFrequency (%)
88
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 3530 | 81.1%
Common | 817 | 18.8%
Latin | 8 | 0.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
434
 
12.3%
261
 
7.4%
168
 
4.8%
149
 
4.2%
108
 
3.1%
96
 
2.7%
90
 
2.5%
85
 
2.4%
74
 
2.1%
72
 
2.0%
Other values (252) 1993
56.5%
Common
ValueCountFrequency (%)
( 354
43.3%
) 354
43.3%
88
 
10.8%
1 4
 
0.5%
2 4
 
0.5%
5 3
 
0.4%
4 2
 
0.2%
6 2
 
0.2%
9 2
 
0.2%
3 2
 
0.2%
Other values (2) 2
 
0.2%
Latin
ValueCountFrequency (%)
C 2
25.0%
N 1
12.5%
L 1
12.5%
F 1
12.5%
E 1
12.5%
H 1
12.5%
P 1
12.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 3530 | 81.1%
ASCII | 824 | 18.9%
None | 1 | < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
434
 
12.3%
261
 
7.4%
168
 
4.8%
149
 
4.2%
108
 
3.1%
96
 
2.7%
90
 
2.5%
85
 
2.4%
74
 
2.1%
72
 
2.0%
Other values (252) 1993
56.5%
ASCII
ValueCountFrequency (%)
( 354
43.0%
) 354
43.0%
88
 
10.7%
1 4
 
0.5%
2 4
 
0.5%
5 3
 
0.4%
C 2
 
0.2%
4 2
 
0.2%
6 2
 
0.2%
9 2
 
0.2%
Other values (8) 9
 
1.1%
None
ValueCountFrequency (%)
· 1
100.0%

사업유형
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 KiB
대여업: 574

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 대여업
2nd row: 대여업
3rd row: 대여업
4th row: 대여업
5th row: 대여업

Common Values

Value | Count | Frequency (%)
대여업 | 574 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
대여업 | 574 | 100.0%

등록종별
Categorical

Distinct: 2
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 KiB
일반: 352
개별: 222

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 일반
2nd row: 일반
3rd row: 일반
4th row: 일반
5th row: 일반

Common Values

Value | Count | Frequency (%)
일반 | 352 | 61.3%
개별 | 222 | 38.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
일반 | 352 | 61.3%
개별 | 222 | 38.7%

주소
Text

Distinct: 209
Distinct (%): 36.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 KiB

Length

Max length: 17
Median length: 12
Mean length: 11.945993
Min length: 11

Characters and Unicode

Total characters: 6857
Distinct characters: 174
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 133
Unique (%): 23.2%

Sample

1st row: 경상북도 포항시 남구
2nd row: 경상북도 영주시 구성로88번길
3rd row: 경상북도 포항시 북구
4th row: 경상북도 포항시 남구
5th row: 경상북도 구미시 백산로

Value | Count | Frequency (%)
경상북도 | 574 | 33.3%
포항시 | 158 | 9.2%
남구 | 123 | 7.1%
칠곡군 | 44 | 2.6%
영천시 | 42 | 2.4%
북구 | 35 | 2.0%
구미시 | 32 | 1.9%
성주군 | 25 | 1.5%
상주시 | 25 | 1.5%
문경시 | 25 | 1.5%
Other values (222) | 639 | 37.1%

Most occurring characters

ValueCountFrequency (%)
1148
16.7%
656
 
9.6%
629
 
9.2%
605
 
8.8%
584
 
8.5%
394
 
5.7%
200
 
2.9%
198
 
2.9%
167
 
2.4%
166
 
2.4%
Other values (164) 2110
30.8%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 5654 | 82.5%
Space Separator | 1148 | 16.7%
Decimal Number | 55 | 0.8%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
656
 
11.6%
629
 
11.1%
605
 
10.7%
584
 
10.3%
394
 
7.0%
200
 
3.5%
198
 
3.5%
167
 
3.0%
166
 
2.9%
160
 
2.8%
Other values (154) 1895
33.5%
Decimal Number
ValueCountFrequency (%)
1 12
21.8%
3 11
20.0%
2 11
20.0%
8 7
12.7%
5 5
9.1%
7 4
 
7.3%
4 3
 
5.5%
6 1
 
1.8%
0 1
 
1.8%
Space Separator
ValueCountFrequency (%)
1148
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 5654 | 82.5%
Common | 1203 | 17.5%

Most frequent character per script

Hangul
ValueCountFrequency (%)
656
 
11.6%
629
 
11.1%
605
 
10.7%
584
 
10.3%
394
 
7.0%
200
 
3.5%
198
 
3.5%
167
 
3.0%
166
 
2.9%
160
 
2.8%
Other values (154) 1895
33.5%
Common
ValueCountFrequency (%)
1148
95.4%
1 12
 
1.0%
3 11
 
0.9%
2 11
 
0.9%
8 7
 
0.6%
5 5
 
0.4%
7 4
 
0.3%
4 3
 
0.2%
6 1
 
0.1%
0 1
 
0.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 5654 | 82.5%
ASCII | 1203 | 17.5%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1148
95.4%
1 12
 
1.0%
3 11
 
0.9%
2 11
 
0.9%
8 7
 
0.6%
5 5
 
0.4%
7 4
 
0.3%
4 3
 
0.2%
6 1
 
0.1%
0 1
 
0.1%
Hangul
ValueCountFrequency (%)
656
 
11.6%
629
 
11.1%
605
 
10.7%
584
 
10.3%
394
 
7.0%
200
 
3.5%
198
 
3.5%
167
 
3.0%
166
 
2.9%
160
 
2.8%
Other values (154) 1895
33.5%

Interactions

[Figure: interaction plot]

Correlations

Variable | 순번 | 상태 | 등록종별
순번 | 1.000 | 0.100 | 0.491
상태 | 0.100 | 1.000 | 0.000
등록종별 | 0.491 | 0.000 | 1.000

Variable | 상태 | 등록종별
상태 | 1.000 | 0.000
등록종별 | 0.000 | 1.000

Variable | 순번 | 상태 | 등록종별
순번 | 1.000 | 0.059 | 0.375
상태 | 0.059 | 1.000 | 0.000
등록종별 | 0.375 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

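(index) | 순번 | 상태 | 상호(명칭) | 사업유형 | 등록종별 | 주소
0 | 1 | 영업 | 대림종합중기(주) | 대여업 | 일반 | 경상북도 포항시 남구
1 | 2 | 영업 | 대창건설(합) | 대여업 | 일반 | 경상북도 영주시 구성로88번길
2 | 3 | 영업 | 태원종합중기(주) | 대여업 | 일반 | 경상북도 포항시 북구
3 | 4 | 영업 | (주)에스피네이처 하역사업소 | 대여업 | 일반 | 경상북도 포항시 남구
4 | 5 | 영업 | 동우종합중기(주) | 대여업 | 일반 | 경상북도 구미시 백산로
5 | 6 | 영업 | (주)대성종합중기 | 대여업 | 일반 | 경상북도 상주시 경상대로
6 | 7 | 영업 | (주)구미종합중기 | 대여업 | 일반 | 경상북도 구미시 수출대로
7 | 8 | 영업 | 제일종합건기(주) | 대여업 | 일반 | 경상북도 안동시 풍산읍
8 | 9 | 영업 | (주)명선 | 대여업 | 일반 | 경상북도 안동시 운동장길
9 | 10 | 영업 | (주)용마루 | 대여업 | 일반 | 경상북도 포항시 남구

(index) | 순번 | 상태 | 상호(명칭) | 사업유형 | 등록종별 | 주소
564 | 565 | 영업 | 한일종합중기(주) | 대여업 | 일반 | 경상북도 영덕군 영덕읍
565 | 566 | 영업 | (주)덕성 | 대여업 | 일반 | 경상북도 영덕군 영덕읍
566 | 567 | 영업 | 힐링팜영농조합법인 | 대여업 | 개별 | 경상북도 영덕군 병곡면
567 | 568 | 영업 | (주)무진건설 | 대여업 | 일반 | 경상북도 영덕군 영덕읍
568 | 569 | 영업 | (주)대운산업개발 | 대여업 | 일반 | 경상북도 영덕군 영해면
569 | 570 | 영업 | (주)성덕산업 | 대여업 | 개별 | 경상북도 영덕군 영해면
570 | 571 | 영업 | 한영타워 | 대여업 | 일반 | 경상북도 청도군 각북면
571 | 572 | 영업 | 준창중기골재(주) | 대여업 | 일반 | 경상북도 청도군 화양읍
572 | 573 | 영업 | (주)백경건설 | 대여업 | 개별 | 경상북도 청도군 이서면
573 | 574 | 영업 | 청도종합중기 | 대여업 | 개별 | 경상북도 청도군 청도읍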