Overview

Dataset statistics

Number of variables: 5
Number of observations: 68
Missing cells: 4
Missing cells (%): 1.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.9 KiB
Average record size in memory: 43.9 B
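
The missing-cell percentage above can be checked by hand. The sketch below assumes it is simply the number of missing cells divided by the total number of cells (variables times observations); the variable names are illustrative and not part of the report.

```python
# Figures taken from the dataset statistics above.
n_variables = 5
n_observations = 68
missing_cells = 4

total_cells = n_variables * n_observations        # 340 cells in total
missing_pct = 100 * missing_cells / total_cells   # share of cells that are empty

print(f"{missing_pct:.1f}%")  # 1.2%
```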

Variable types

Numeric: 2
Text: 2
Categorical: 1

Dataset

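Description: Data on freight trucking transport business operators in Dalseong-gun, Daegu Metropolitan City. It provides the serial number, company name, company road-name address, number of vehicles owned, and similar fields.
URL: https://www.data.go.kr/data/15114900/fileData.do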

Alerts

연번 is highly overall correlated with 면허종류 (High correlation)
보유대수 is highly overall correlated with 면허종류 (High correlation)
면허종류 is highly overall correlated with 연번 and 1 other field (High correlation)
면허종류 is highly imbalanced (88.9%) (Imbalance)
연번 has 1 (1.5%) missing value (Missing)
업체명 has 1 (1.5%) missing value (Missing)
보유대수 has 1 (1.5%) missing value (Missing)
도로명주소 has 1 (1.5%) missing value (Missing)
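
The 88.9% imbalance figure is consistent with an entropy-based score: one minus the Shannon entropy of the category counts, normalised by the maximum entropy for that number of categories. That formula is an assumption here, since the report does not state how the score is computed. The counts (67 and 1) come from the 면허종류 section of this report.

```python
import math

# Category counts for 면허종류 as listed in this report.
counts = [67, 1]

total = sum(counts)
probs = [c / total for c in counts]

# Shannon entropy (in bits) of the observed category distribution.
entropy = -sum(p * math.log2(p) for p in probs if p > 0)

# Normalise by the maximum possible entropy for this many categories,
# then invert: 0 means perfectly balanced, 1 means a single category.
imbalance = 1 - entropy / math.log2(len(counts))

print(f"{imbalance:.1%}")  # 88.9%
```

Under this reading, a column split 50/50 between two categories would score 0%, so a value near 90% indicates that almost every row carries the same label.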

Reproduction

Analysis started: 2023-12-12 07:46:12.462978
Analysis finished: 2023-12-12 07:46:13.372260
Duration: 0.91 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번 (serial number)
Real number (ℝ)

Alerts: High correlation, Missing

Distinct: 67
Distinct (%): 100.0%
Missing: 1
Missing (%): 1.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 34
Minimum: 1
Maximum: 67
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 744.0 B

Quantile statistics

Minimum: 1
5th percentile: 4.3
Q1: 17.5
Median: 34
Q3: 50.5
95th percentile: 63.7
Maximum: 67
Range: 66
Interquartile range (IQR): 33

Descriptive statistics

Standard deviation: 19.485037
Coefficient of variation (CV): 0.57308932
Kurtosis: -1.2
Mean: 34
Median Absolute Deviation (MAD): 17
Skewness: 0
Sum: 2278
Variance: 379.66667
Monotonicity: Strictly increasing
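
Several of these figures can be reproduced with the Python standard library. The sketch assumes the 67 non-missing values of 연번 are exactly the integers 1 to 67, which the reported minimum, maximum, distinct count and strict monotonicity suggest but the report does not state outright. It also assumes the sample (n - 1) convention for variance and standard deviation.

```python
import statistics

# Assumed values: the integers 1..67 (see the lead-in for why).
values = list(range(1, 68))

mean = statistics.mean(values)           # arithmetic mean
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
std = statistics.stdev(values)           # sample standard deviation
cv = std / mean                          # coefficient of variation

median = statistics.median(values)
# Median absolute deviation: the median distance from the median.
mad = statistics.median(abs(v - median) for v in values)

print(f"sum={sum(values)} mean={mean:g} median={median:g} MAD={mad:g}")
# sum=2278 mean=34 median=34 MAD=17
print(round(variance, 5), round(std, 6), round(cv, 8))
```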
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 1 | 1.5%
44 | 1 | 1.5%
50 | 1 | 1.5%
49 | 1 | 1.5%
48 | 1 | 1.5%
47 | 1 | 1.5%
46 | 1 | 1.5%
45 | 1 | 1.5%
43 | 1 | 1.5%
2 | 1 | 1.5%
Other values (57) | 57 | 83.8%

Value | Count | Frequency (%)
1 | 1 | 1.5%
2 | 1 | 1.5%
3 | 1 | 1.5%
4 | 1 | 1.5%
5 | 1 | 1.5%
6 | 1 | 1.5%
7 | 1 | 1.5%
8 | 1 | 1.5%
9 | 1 | 1.5%
10 | 1 | 1.5%

Value | Count | Frequency (%)
67 | 1 | 1.5%
66 | 1 | 1.5%
65 | 1 | 1.5%
64 | 1 | 1.5%
63 | 1 | 1.5%
62 | 1 | 1.5%
61 | 1 | 1.5%
60 | 1 | 1.5%
59 | 1 | 1.5%
58 | 1 | 1.5%

업체명 (company name)
Text

Alerts: Missing

Distinct: 67
Distinct (%): 100.0%
Missing: 1
Missing (%): 1.5%
Memory size: 676.0 B

Length

Max length: 13
Median length: 5
Mean length: 6.0298507
Min length: 3

Characters and Unicode

Total characters: 404
Distinct characters: 115
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
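
As a rough illustration of those character properties, the standard-library `unicodedata` module reports the general category of each code point. The two names below are taken from the Sample list of this variable; how the profiling tool itself derives its category counts is not shown in the report, so this is only a sketch of the idea.

```python
import unicodedata
from collections import Counter

# Two company names taken from the Sample list of this variable.
names = ["(주)남경물류", "대성특수화물㈜"]

# Count Unicode general categories over every character in the names.
# Abbreviations: Lo = Other Letter, So = Other Symbol,
# Ps = Open Punctuation, Pe = Close Punctuation.
categories = Counter(unicodedata.category(ch) for name in names for ch in name)

print(categories)
```

Hangul syllables fall under Other Letter, the ASCII parentheses under Open and Close Punctuation, and the single-character company abbreviation ㈜ under Other Symbol.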

Unique

Unique: 67
Unique (%): 100.0%

Sample

1st row: (주)남경물류
2nd row: 대성특수화물㈜
3rd row: 대양운수㈜
4th row: 삼양기업㈜
5th row: (주)하나물류시스템
Value | Count | Frequency (%)
대구영업소 | 2 | 2.9%
주)남경물류 | 1 | 1.4%
㈜삼진통운 | 1 | 1.4%
㈜대구중기살수차 | 1 | 1.4%
㈜대경미니추레라 | 1 | 1.4%
㈜진솔물류 | 1 | 1.4%
㈜엠제이물류 | 1 | 1.4%
대성특수화물㈜ | 1 | 1.4%
㈜용진로지스 | 1 | 1.4%
㈜디와이물류 | 1 | 1.4%
Other values (58) | 58 | 84.1%

Most occurring characters

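Value | Count | Frequency (%)
[not preserved] | 60 | 14.9%
[not preserved] | 22 | 5.4%
[not preserved] | 19 | 4.7%
[not preserved] | 18 | 4.5%
[not preserved] | 15 | 3.7%
[not preserved] | 12 | 3.0%
[not preserved] | 12 | 3.0%
[not preserved] | 10 | 2.5%
[not preserved] | 10 | 2.5%
[not preserved] | 8 | 2.0%
Other values (105) | 218 | 54.0%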

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 328 | 81.2%
Other Symbol | 60 | 14.9%
Close Punctuation | 7 | 1.7%
Open Punctuation | 7 | 1.7%
Space Separator | 2 | 0.5%

Most frequent character per category

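Other Letter
Value | Count | Frequency (%)
[not preserved] | 22 | 6.7%
[not preserved] | 19 | 5.8%
[not preserved] | 18 | 5.5%
[not preserved] | 15 | 4.6%
[not preserved] | 12 | 3.7%
[not preserved] | 12 | 3.7%
[not preserved] | 10 | 3.0%
[not preserved] | 10 | 3.0%
[not preserved] | 8 | 2.4%
[not preserved] | 8 | 2.4%
Other values (101) | 194 | 59.1%

Other Symbol
Value | Count | Frequency (%)
[not preserved] | 60 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 7 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 7 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 2 | 100.0%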

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 388 | 96.0%
Common | 16 | 4.0%

Most frequent character per script

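Hangul
Value | Count | Frequency (%)
[not preserved] | 60 | 15.5%
[not preserved] | 22 | 5.7%
[not preserved] | 19 | 4.9%
[not preserved] | 18 | 4.6%
[not preserved] | 15 | 3.9%
[not preserved] | 12 | 3.1%
[not preserved] | 12 | 3.1%
[not preserved] | 10 | 2.6%
[not preserved] | 10 | 2.6%
[not preserved] | 8 | 2.1%
Other values (102) | 202 | 52.1%

Common
Value | Count | Frequency (%)
) | 7 | 43.8%
( | 7 | 43.8%
(space) | 2 | 12.5%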

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 328 | 81.2%
None | 60 | 14.9%
ASCII | 16 | 4.0%

Most frequent character per block

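None
Value | Count | Frequency (%)
[not preserved] | 60 | 100.0%

Hangul
Value | Count | Frequency (%)
[not preserved] | 22 | 6.7%
[not preserved] | 19 | 5.8%
[not preserved] | 18 | 5.5%
[not preserved] | 15 | 4.6%
[not preserved] | 12 | 3.7%
[not preserved] | 12 | 3.7%
[not preserved] | 10 | 3.0%
[not preserved] | 10 | 3.0%
[not preserved] | 8 | 2.4%
[not preserved] | 8 | 2.4%
Other values (101) | 194 | 59.1%

ASCII
Value | Count | Frequency (%)
) | 7 | 43.8%
( | 7 | 43.8%
(space) | 2 | 12.5%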

면허종류 (license type)
Categorical

Alerts: High correlation, Imbalance

Distinct: 2
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Memory size: 676.0 B

Value | Count
일반화물자동차운송사업 | 67
<NA> | 1

Length

Max length: 11
Median length: 11
Mean length: 10.897059
Min length: 4

Unique

Unique: 1
Unique (%): 1.5%

Sample

1st row: 일반화물자동차운송사업
2nd row: 일반화물자동차운송사업
3rd row: 일반화물자동차운송사업
4th row: 일반화물자동차운송사업
5th row: 일반화물자동차운송사업

Common Values

Value | Count | Frequency (%)
일반화물자동차운송사업 | 67 | 98.5%
<NA> | 1 | 1.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
일반화물자동차운송사업 | 67 | 98.5%
na | 1 | 1.5%

보유대수 (number of vehicles owned)
Real number (ℝ)

Alerts: High correlation, Missing

Distinct: 23
Distinct (%): 34.3%
Missing: 1
Missing (%): 1.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.820896
Minimum: 1
Maximum: 142
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 744.0 B

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 2
Median: 3
Q3: 9
95th percentile: 78
Maximum: 142
Range: 141
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 28.779162
Coefficient of variation (CV): 1.8190602
Kurtosis: 6.0995102
Mean: 15.820896
Median Absolute Deviation (MAD): 2
Skewness: 2.4645817
Sum: 1060
Variance: 828.24016
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=23)
Value | Count | Frequency (%)
1 | 15 | 22.1%
2 | 14 | 20.6%
3 | 8 | 11.8%
4 | 5 | 7.4%
5 | 3 | 4.4%
9 | 2 | 2.9%
78 | 2 | 2.9%
47 | 2 | 2.9%
7 | 2 | 2.9%
12 | 1 | 1.5%
Other values (13) | 13 | 19.1%

Value | Count | Frequency (%)
1 | 15 | 22.1%
2 | 14 | 20.6%
3 | 8 | 11.8%
4 | 5 | 7.4%
5 | 3 | 4.4%
6 | 1 | 1.5%
7 | 2 | 2.9%
8 | 1 | 1.5%
9 | 2 | 2.9%
12 | 1 | 1.5%

Value | Count | Frequency (%)
142 | 1 | 1.5%
102 | 1 | 1.5%
87 | 1 | 1.5%
78 | 2 | 2.9%
66 | 1 | 1.5%
60 | 1 | 1.5%
56 | 1 | 1.5%
48 | 1 | 1.5%
47 | 2 | 2.9%
42 | 1 | 1.5%

도로명주소 (road-name address)
Text

Alerts: Missing

Distinct: 56
Distinct (%): 83.6%
Missing: 1
Missing (%): 1.5%
Memory size: 676.0 B

Length

Max length: 49
Median length: 38
Mean length: 26.507463
Min length: 19

Characters and Unicode

Total characters: 1776
Distinct characters: 109
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 48
Unique (%): 71.6%

Sample

1st row: 대구광역시 달성군 구지면 창리서로6길 4-5
2nd row: 대구광역시 달성군 논공읍 비슬로373길 23-6
3rd row: 대구광역시 달성군 현풍읍 원교길 6
4th row: 대구광역시 달성군 현풍읍 원교길 6
5th row: 대구광역시 달성군 화원읍 성화로 11
Value | Count | Frequency (%)
대구광역시 | 67 | 18.0%
달성군 | 67 | 18.0%
화원읍 | 18 | 4.8%
현풍읍 | 12 | 3.2%
하빈면 | 12 | 3.2%
다사읍 | 11 | 3.0%
논공읍 | 6 | 1.6%
성화로 | 5 | 1.3%
하산길 | 5 | 1.3%
옥포읍 | 5 | 1.3%
Other values (111) | 164 | 44.1%

Most occurring characters

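Value | Count | Frequency (%)
(space) | 307 | 17.3%
[not preserved] | 77 | 4.3%
[not preserved] | 74 | 4.2%
[not preserved] | 74 | 4.2%
[not preserved] | 73 | 4.1%
[not preserved] | 70 | 3.9%
[not preserved] | 69 | 3.9%
[not preserved] | 67 | 3.8%
[not preserved] | 67 | 3.8%
1 | 61 | 3.4%
Other values (99) | 837 | 47.1%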

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 1130 | 63.6%
Space Separator | 307 | 17.3%
Decimal Number | 280 | 15.8%
Other Punctuation | 26 | 1.5%
Dash Punctuation | 13 | 0.7%
Close Punctuation | 9 | 0.5%
Open Punctuation | 9 | 0.5%
Uppercase Letter | 2 | 0.1%

Most frequent character per category

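Other Letter
Value | Count | Frequency (%)
[not preserved] | 77 | 6.8%
[not preserved] | 74 | 6.5%
[not preserved] | 74 | 6.5%
[not preserved] | 73 | 6.5%
[not preserved] | 70 | 6.2%
[not preserved] | 69 | 6.1%
[not preserved] | 67 | 5.9%
[not preserved] | 67 | 5.9%
[not preserved] | 56 | 5.0%
[not preserved] | 54 | 4.8%
Other values (82) | 449 | 39.7%

Decimal Number
Value | Count | Frequency (%)
1 | 61 | 21.8%
2 | 46 | 16.4%
4 | 32 | 11.4%
0 | 29 | 10.4%
5 | 27 | 9.6%
3 | 25 | 8.9%
6 | 17 | 6.1%
9 | 16 | 5.7%
7 | 15 | 5.4%
8 | 12 | 4.3%

Uppercase Letter
Value | Count | Frequency (%)
G | 1 | 50.0%
L | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 307 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
, | 26 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 13 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 9 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 9 | 100.0%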

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 1130 | 63.6%
Common | 644 | 36.3%
Latin | 2 | 0.1%

Most frequent character per script

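Hangul
Value | Count | Frequency (%)
[not preserved] | 77 | 6.8%
[not preserved] | 74 | 6.5%
[not preserved] | 74 | 6.5%
[not preserved] | 73 | 6.5%
[not preserved] | 70 | 6.2%
[not preserved] | 69 | 6.1%
[not preserved] | 67 | 5.9%
[not preserved] | 67 | 5.9%
[not preserved] | 56 | 5.0%
[not preserved] | 54 | 4.8%
Other values (82) | 449 | 39.7%

Common
Value | Count | Frequency (%)
(space) | 307 | 47.7%
1 | 61 | 9.5%
2 | 46 | 7.1%
4 | 32 | 5.0%
0 | 29 | 4.5%
5 | 27 | 4.2%
, | 26 | 4.0%
3 | 25 | 3.9%
6 | 17 | 2.6%
9 | 16 | 2.5%
Other values (5) | 58 | 9.0%

Latin
Value | Count | Frequency (%)
G | 1 | 50.0%
L | 1 | 50.0%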

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 1130 | 63.6%
ASCII | 646 | 36.4%

Most frequent character per block

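ASCII
Value | Count | Frequency (%)
(space) | 307 | 47.5%
1 | 61 | 9.4%
2 | 46 | 7.1%
4 | 32 | 5.0%
0 | 29 | 4.5%
5 | 27 | 4.2%
, | 26 | 4.0%
3 | 25 | 3.9%
6 | 17 | 2.6%
9 | 16 | 2.5%
Other values (7) | 60 | 9.3%

Hangul
Value | Count | Frequency (%)
[not preserved] | 77 | 6.8%
[not preserved] | 74 | 6.5%
[not preserved] | 74 | 6.5%
[not preserved] | 73 | 6.5%
[not preserved] | 70 | 6.2%
[not preserved] | 69 | 6.1%
[not preserved] | 67 | 5.9%
[not preserved] | 67 | 5.9%
[not preserved] | 56 | 5.0%
[not preserved] | 54 | 4.8%
Other values (82) | 449 | 39.7%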

Interactions


Correlations

Variable | 연번 | 업체명 | 보유대수 | 도로명주소
연번 | 1.000 | 1.000 | 0.232 | 0.919
업체명 | 1.000 | 1.000 | 1.000 | 1.000
보유대수 | 0.232 | 1.000 | 1.000 | 0.000
도로명주소 | 0.919 | 1.000 | 0.000 | 1.000

Variable | 연번 | 보유대수 | 면허종류
연번 | 1.000 | -0.474 | 1.000
보유대수 | -0.474 | 1.000 | 1.000
면허종류 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
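
To make nullity correlation concrete, the sketch below computes a Pearson correlation between two null-indicator columns. It assumes, as the Sample section of this report suggests, that the single missing value of 연번 and of 업체명 both fall in the last of the 68 rows. The `pearson` helper is written for this illustration and is not taken from the profiling tool.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Null indicators (1 = missing) for 68 rows; the position of the missing
# value is an assumption based on the last row of the Sample section.
null_serial = [0] * 67 + [1]   # 연번
null_name = [0] * 67 + [1]     # 업체명

r = pearson(null_serial, null_name)
print(round(r, 6))  # 1.0
```

A value of 1 means the two columns are always missing together; a value near 0 would mean that a gap in one says nothing about the other.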

Sample

Index | 연번 | 업체명 | 면허종류 | 보유대수 | 도로명주소
0 | 1 | (주)남경물류 | 일반화물자동차운송사업 | 12 | 대구광역시 달성군 구지면 창리서로6길 4-5
1 | 2 | 대성특수화물㈜ | 일반화물자동차운송사업 | 87 | 대구광역시 달성군 논공읍 비슬로373길 23-6
2 | 3 | 대양운수㈜ | 일반화물자동차운송사업 | 102 | 대구광역시 달성군 현풍읍 원교길 6
3 | 4 | 삼양기업㈜ | 일반화물자동차운송사업 | 42 | 대구광역시 달성군 현풍읍 원교길 6
4 | 5 | (주)하나물류시스템 | 일반화물자동차운송사업 | 66 | 대구광역시 달성군 화원읍 성화로 11
5 | 6 | 화진운수(주) | 일반화물자동차운송사업 | 9 | 대구광역시 달성군 현풍읍 국가산단북로 494
6 | 7 | (주)태성에너지 | 일반화물자동차운송사업 | 5 | 대구광역시 달성군 하빈면 하산길 159
7 | 8 | (주)진성통운 | 일반화물자동차운송사업 | 142 | 대구광역시 달성군 화원읍 사문진로 349, 삼주타운상가 202호
8 | 9 | (주)남경운수 | 일반화물자동차운송사업 | 78 | 대구광역시 달성군 화원읍 사문진로 349, 삼주타운상가 202호
9 | 10 | 성안특운㈜ | 일반화물자동차운송사업 | 47 | 대구광역시 달성군 화원읍 성화로 11

Index | 연번 | 업체명 | 면허종류 | 보유대수 | 도로명주소
58 | 59 | 영남특수화물㈜ | 일반화물자동차운송사업 | 8 | 대구광역시 달성군 논공읍 달성군청로1길 26
59 | 60 | 성광종합물류㈜ | 일반화물자동차운송사업 | 2 | 대구광역시 달성군 유가읍 용금공단길 28
60 | 61 | 장윤통운㈜ | 일반화물자동차운송사업 | 1 | 대구광역시 달성군 화원읍 비슬로539길 35, 108동 404호(대곡역래미안)
61 | 62 | ㈜대금종합물류 | 일반화물자동차운송사업 | 1 | 대구광역시 달성군 논공읍 금강로1길 3, 102호
62 | 63 | 유유리싸이클링㈜ | 일반화물자동차운송사업 | 3 | 대구광역시 달성군 하빈면 하산길 129-24
63 | 64 | ㈜이수로지스 | 일반화물자동차운송사업 | 2 | 대구광역시 달성군 화원읍 명천로27길 6, 101호
64 | 65 | ㈜제이와이로지스 | 일반화물자동차운송사업 | 2 | 대구광역시 달성군 다사읍 세천남로 54, 4층
65 | 66 | ㈜휴먼운수 | 일반화물자동차운송사업 | 1 | 대구광역시 달성군 현풍읍 테크노중앙대로5길 2-17, 로젠빌 101호
66 | 67 | 서연종합물류㈜ | 일반화물자동차운송사업 | 2 | 대구광역시 달성군 화원읍 성화로 12, 2층
67 | <NA> | <NA> | <NA> | <NA> | <NA>