Overview

Dataset statistics

Number of variables: 6
Number of observations: 85
Missing cells: 136
Missing cells (%): 26.7%
Duplicate rows: 1
Duplicate rows (%): 1.2%
Total size in memory: 4.3 KiB
Average record size in memory: 51.6 B
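
The headline percentages can be rechecked from the counts in this report. The sketch below assumes that only the four columns flagged as Missing (연번, 업체명, 차량대수, 주소) contribute missing cells, because 면허종류 and 현재운영여부 report 0 missing and list `<NA>` as an ordinary category value. All variable names are illustrative.

```python
# Counts taken from the report; the per-column split is an assumption.
n_rows = 85                 # Number of observations
n_cols = 6                  # Number of variables
missing_per_column = 34     # Missing count reported for each flagged column
columns_with_missing = 4    # 연번, 업체명, 차량대수, 주소

missing_cells = missing_per_column * columns_with_missing
total_cells = n_rows * n_cols
missing_pct = 100 * missing_cells / total_cells

# The report lists 1 duplicate row out of 85 observations.
duplicate_pct = 100 * 1 / n_rows

print(f"missing cells: {missing_cells} ({missing_pct:.1f}%)")
print(f"duplicate rows: {duplicate_pct:.1f}%")
```

Under that assumption the arithmetic reproduces the reported 136 missing cells, 26.7% and 1.2%.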

Variable types

Numeric: 2
Text: 2
Categorical: 2

Dataset

Description: Data on freight trucking companies located in Gimcheon City, providing the company name, license type, number of vehicles, address, and current operating status.
URL: https://www.data.go.kr/data/15114917/fileData.do

Alerts

Dataset has 1 (1.2%) duplicate rows [Duplicates]
현재운영여부 is highly overall correlated with 연번 and 2 other fields [High correlation]
면허종류 is highly overall correlated with 차량대수 and 1 other fields [High correlation]
연번 is highly overall correlated with 현재운영여부 [High correlation]
차량대수 is highly overall correlated with 면허종류 and 1 other fields [High correlation]
연번 has 34 (40.0%) missing values [Missing]
업체명 has 34 (40.0%) missing values [Missing]
차량대수 has 34 (40.0%) missing values [Missing]
주소 has 34 (40.0%) missing values [Missing]

Reproduction

Analysis started: 2023-12-12 17:39:36.879567
Analysis finished: 2023-12-12 17:39:37.867768
Duration: 0.99 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 51
Distinct (%): 100.0%
Missing: 34
Missing (%): 40.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 26
Minimum: 1
Maximum: 51
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 897.0 B

Quantile statistics

Minimum: 1
5-th percentile: 3.5
Q1: 13.5
Median: 26
Q3: 38.5
95-th percentile: 48.5
Maximum: 51
Range: 50
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 14.866069
Coefficient of variation (CV): 0.57177187
Kurtosis: -1.2
Mean: 26
Median Absolute Deviation (MAD): 13
Skewness: 0
Sum: 1326
Variance: 221
Monotonicity: Strictly increasing
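
The figures above are consistent with 연번 being a plain running number. As a minimal sketch, assuming the 51 non-missing values are exactly the integers 1 through 51 (suggested by 51 distinct values, minimum 1, maximum 51 and strict monotonicity, but not stated in the report), the statistics can be recomputed with the Python standard library. The `inclusive` quantile method is chosen here because it reproduces the reported quartiles; it is not confirmed to be what the profiler uses.

```python
import math
import statistics

# Assumed values: the serial numbers 1..51 (not confirmed by the report).
values = list(range(1, 52))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
std = math.sqrt(variance)
cv = std / mean                          # coefficient of variation

# Quartiles with linear interpolation between the closest ranks.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")

# Median absolute deviation around the median.
mad = statistics.median([abs(v - median) for v in values])

print(total, mean, variance, round(std, 6), round(cv, 8))
print(q1, median, q3, q3 - q1, mad)
```

With these assumed inputs the sum, variance, standard deviation, CV, quartiles, IQR and MAD agree with the values listed above.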
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
2 | 1 | 1.2%
29 | 1 | 1.2%
30 | 1 | 1.2%
31 | 1 | 1.2%
32 | 1 | 1.2%
33 | 1 | 1.2%
34 | 1 | 1.2%
35 | 1 | 1.2%
36 | 1 | 1.2%
37 | 1 | 1.2%
Other values (41) | 41 | 48.2%
(Missing) | 34 | 40.0%
Value | Count | Frequency (%)
1 | 1 | 1.2%
2 | 1 | 1.2%
3 | 1 | 1.2%
4 | 1 | 1.2%
5 | 1 | 1.2%
6 | 1 | 1.2%
7 | 1 | 1.2%
8 | 1 | 1.2%
9 | 1 | 1.2%
10 | 1 | 1.2%
Value | Count | Frequency (%)
51 | 1 | 1.2%
50 | 1 | 1.2%
49 | 1 | 1.2%
48 | 1 | 1.2%
47 | 1 | 1.2%
46 | 1 | 1.2%
45 | 1 | 1.2%
44 | 1 | 1.2%
43 | 1 | 1.2%
42 | 1 | 1.2%

업체명
Text

MISSING 

Distinct: 51
Distinct (%): 100.0%
Missing: 34
Missing (%): 40.0%
Memory size: 812.0 B

Length

Max length: 10
Median length: 9
Mean length: 6.745098
Min length: 4

Characters and Unicode

Total characters: 344
Distinct characters: 110
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 51
Unique (%): 100.0%

Sample

1st row: (주)고속렉카
2nd row: (주)그린물류
3rd row: (주)글로벌스피드
4th row: (주)글로벌퍼스트
5th row: (주)김구스카이차
ValueCountFrequency (%)
주)그린물류 1
 
1.8%
주)창신종합물류 1
 
1.8%
김천합동운수㈜ 1
 
1.8%
김천혁신카고크레인 1
 
1.8%
남김천렉카 1
 
1.8%
1
 
1.8%
푸름 1
 
1.8%
물류 1
 
1.8%
다산물류㈜ 1
 
1.8%
대박운수(주 1
 
1.8%
Other values (46) 46
82.1%

Most occurring characters

ValueCountFrequency (%)
27
 
7.8%
) 26
 
7.6%
( 26
 
7.6%
23
 
6.7%
17
 
4.9%
16
 
4.7%
15
 
4.4%
9
 
2.6%
8
 
2.3%
7
 
2.0%
Other values (100) 170
49.4%

Most occurring categories

ValueCountFrequency (%)
Other Letter 280
81.4%
Close Punctuation 26
 
7.6%
Open Punctuation 26
 
7.6%
Other Symbol 5
 
1.5%
Space Separator 5
 
1.5%
Decimal Number 2
 
0.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
27
 
9.6%
23
 
8.2%
17
 
6.1%
16
 
5.7%
15
 
5.4%
9
 
3.2%
8
 
2.9%
7
 
2.5%
6
 
2.1%
6
 
2.1%
Other values (94) 146
52.1%
Decimal Number
ValueCountFrequency (%)
1 1
50.0%
6 1
50.0%
Close Punctuation
ValueCountFrequency (%)
) 26
100.0%
Open Punctuation
ValueCountFrequency (%)
( 26
100.0%
Other Symbol
ValueCountFrequency (%)
5
100.0%
Space Separator
ValueCountFrequency (%)
5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 285
82.8%
Common 59
 
17.2%

Most frequent character per script

Hangul
ValueCountFrequency (%)
27
 
9.5%
23
 
8.1%
17
 
6.0%
16
 
5.6%
15
 
5.3%
9
 
3.2%
8
 
2.8%
7
 
2.5%
6
 
2.1%
6
 
2.1%
Other values (95) 151
53.0%
Common
ValueCountFrequency (%)
) 26
44.1%
( 26
44.1%
5
 
8.5%
1 1
 
1.7%
6 1
 
1.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 280
81.4%
ASCII 59
 
17.2%
None 5
 
1.5%

Most frequent character per block

Hangul
ValueCountFrequency (%)
27
 
9.6%
23
 
8.2%
17
 
6.1%
16
 
5.7%
15
 
5.4%
9
 
3.2%
8
 
2.9%
7
 
2.5%
6
 
2.1%
6
 
2.1%
Other values (94) 146
52.1%
ASCII
ValueCountFrequency (%)
) 26
44.1%
( 26
44.1%
5
 
8.5%
1 1
 
1.7%
6 1
 
1.7%
None
ValueCountFrequency (%)
5
100.0%

면허종류
Categorical

HIGH CORRELATION 

Distinct: 3
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Memory size: 812.0 B
(구)일반화물: 40
<NA>: 34
일반화물: 11

Length

Max length: 7
Median length: 4
Mean length: 5.4117647
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: (구)일반화물
2nd row: (구)일반화물
3rd row: 일반화물
4th row: (구)일반화물
5th row: (구)일반화물

Common Values

Value | Count | Frequency (%)
(구)일반화물 | 40 | 47.1%
<NA> | 34 | 40.0%
일반화물 | 11 | 12.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
구)일반화물 40
47.1%
na 34
40.0%
일반화물 11
 
12.9%

차량대수
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 19
Distinct (%): 37.3%
Missing: 34
Missing (%): 40.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.72549
Minimum: 1
Maximum: 98
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 897.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 3
Q3: 18.5
95-th percentile: 50.5
Maximum: 98
Range: 97
Interquartile range (IQR): 17.5

Descriptive statistics

Standard deviation: 20.037044
Coefficient of variation (CV): 1.7088449
Kurtosis: 9.4708859
Mean: 11.72549
Median Absolute Deviation (MAD): 2
Skewness: 2.9818276
Sum: 598
Variance: 401.48314
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=19)
Value | Count | Frequency (%)
1 | 14 | 16.5%
2 | 7 | 8.2%
3 | 6 | 7.1%
5 | 5 | 5.9%
4 | 2 | 2.4%
20 | 2 | 2.4%
21 | 2 | 2.4%
7 | 2 | 2.4%
18 | 1 | 1.2%
19 | 1 | 1.2%
Other values (9) | 9 | 10.6%
(Missing) | 34 | 40.0%
Value | Count | Frequency (%)
1 | 14 | 16.5%
2 | 7 | 8.2%
3 | 6 | 7.1%
4 | 2 | 2.4%
5 | 5 | 5.9%
7 | 2 | 2.4%
9 | 1 | 1.2%
18 | 1 | 1.2%
19 | 1 | 1.2%
20 | 2 | 2.4%
Value | Count | Frequency (%)
98 | 1 | 1.2%
84 | 1 | 1.2%
63 | 1 | 1.2%
38 | 1 | 1.2%
25 | 1 | 1.2%
24 | 1 | 1.2%
23 | 1 | 1.2%
22 | 1 | 1.2%
21 | 2 | 2.4%
20 | 2 | 2.4%

주소
Text

MISSING 

Distinct: 32
Distinct (%): 62.7%
Missing: 34
Missing (%): 40.0%
Memory size: 812.0 B

Length

Max length: 37
Median length: 26
Mean length: 19
Min length: 8

Characters and Unicode

Total characters: 969
Distinct characters: 98
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 27
Unique (%): 52.9%

Sample

1st row: 경상북도 김천시 새까끔길 5 (다수동)
2nd row: 경상북도 김천시 공단2길 30-30, 조흥은행 (대광동)
3rd row: 경상북도 김천시 중앙시장길 38, 1층 (모암동)
4th row: 경상북도 김천시 중앙시장길 38, 1층 (모암동)
5th row: 경상북도 김천시 농소면 벽소로 2119
ValueCountFrequency (%)
경상북도 51
22.9%
김천시 51
22.9%
아포읍 7
 
3.1%
금계길 7
 
3.1%
부곡동 6
 
2.7%
대학로 4
 
1.8%
어모면 4
 
1.8%
35 4
 
1.8%
영남대로 3
 
1.3%
평화동 3
 
1.3%
Other values (64) 83
37.2%

Most occurring characters

ValueCountFrequency (%)
172
17.8%
53
 
5.5%
53
 
5.5%
52
 
5.4%
51
 
5.3%
51
 
5.3%
51
 
5.3%
51
 
5.3%
1 29
 
3.0%
) 25
 
2.6%
Other values (88) 381
39.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 617
63.7%
Space Separator 172
 
17.8%
Decimal Number 115
 
11.9%
Close Punctuation 25
 
2.6%
Open Punctuation 25
 
2.6%
Other Punctuation 9
 
0.9%
Dash Punctuation 6
 
0.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
53
 
8.6%
53
 
8.6%
52
 
8.4%
51
 
8.3%
51
 
8.3%
51
 
8.3%
51
 
8.3%
25
 
4.1%
19
 
3.1%
18
 
2.9%
Other values (73) 193
31.3%
Decimal Number
ValueCountFrequency (%)
1 29
25.2%
3 21
18.3%
2 19
16.5%
4 9
 
7.8%
5 9
 
7.8%
9 7
 
6.1%
0 6
 
5.2%
7 6
 
5.2%
8 6
 
5.2%
6 3
 
2.6%
Space Separator
ValueCountFrequency (%)
172
100.0%
Close Punctuation
ValueCountFrequency (%)
) 25
100.0%
Open Punctuation
ValueCountFrequency (%)
( 25
100.0%
Other Punctuation
ValueCountFrequency (%)
, 9
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 617
63.7%
Common 352
36.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
53
 
8.6%
53
 
8.6%
52
 
8.4%
51
 
8.3%
51
 
8.3%
51
 
8.3%
51
 
8.3%
25
 
4.1%
19
 
3.1%
18
 
2.9%
Other values (73) 193
31.3%
Common
ValueCountFrequency (%)
172
48.9%
1 29
 
8.2%
) 25
 
7.1%
( 25
 
7.1%
3 21
 
6.0%
2 19
 
5.4%
4 9
 
2.6%
, 9
 
2.6%
5 9
 
2.6%
9 7
 
2.0%
Other values (5) 27
 
7.7%

Most occurring blocks

ValueCountFrequency (%)
Hangul 617
63.7%
ASCII 352
36.3%

Most frequent character per block

ASCII
ValueCountFrequency (%)
172
48.9%
1 29
 
8.2%
) 25
 
7.1%
( 25
 
7.1%
3 21
 
6.0%
2 19
 
5.4%
4 9
 
2.6%
, 9
 
2.6%
5 9
 
2.6%
9 7
 
2.0%
Other values (5) 27
 
7.7%
Hangul
ValueCountFrequency (%)
53
 
8.6%
53
 
8.6%
52
 
8.4%
51
 
8.3%
51
 
8.3%
51
 
8.3%
51
 
8.3%
25
 
4.1%
19
 
3.1%
18
 
2.9%
Other values (73) 193
31.3%

현재운영여부
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.4%
Missing: 0
Missing (%): 0.0%
Memory size: 812.0 B
영업중: 51
<NA>: 34

Length

Max length: 4
Median length: 3
Mean length: 3.4
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 영업중
2nd row: 영업중
3rd row: 영업중
4th row: 영업중
5th row: 영업중

Common Values

Value | Count | Frequency (%)
영업중 | 51 | 60.0%
<NA> | 34 | 40.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
영업중 51
60.0%
na 34
40.0%

Interactions

[Four interaction plots; images not preserved.]

Correlations

Variable | 연번 | 업체명 | 면허종류 | 차량대수 | 주소
연번 | 1.000 | 1.000 | 0.000 | 0.071 | 0.709
업체명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
면허종류 | 0.000 | 1.000 | 1.000 | 0.862 | 0.712
차량대수 | 0.071 | 1.000 | 0.862 | 1.000 | 0.750
주소 | 0.709 | 1.000 | 0.712 | 0.750 | 1.000
Variable | 현재운영여부 | 면허종류
현재운영여부 | 1.000 | 1.000
면허종류 | 1.000 | 1.000
Variable | 연번 | 차량대수 | 면허종류 | 현재운영여부
연번 | 1.000 | 0.104 | 0.169 | 1.000
차량대수 | 0.104 | 1.000 | 0.883 | 1.000
면허종류 | 0.169 | 0.883 | 1.000 | 1.000
현재운영여부 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

Index | 연번 | 업체명 | 면허종류 | 차량대수 | 주소 | 현재운영여부
0 | 1 | (주)고속렉카 | (구)일반화물 | 3 | 경상북도 김천시 새까끔길 5 (다수동) | 영업중
1 | 2 | (주)그린물류 | (구)일반화물 | 3 | 경상북도 김천시 공단2길 30-30, 조흥은행 (대광동) | 영업중
2 | 3 | (주)글로벌스피드 | 일반화물 | 38 | 경상북도 김천시 중앙시장길 38, 1층 (모암동) | 영업중
3 | 4 | (주)글로벌퍼스트 | (구)일반화물 | 3 | 경상북도 김천시 중앙시장길 38, 1층 (모암동) | 영업중
4 | 5 | (주)김구스카이차 | (구)일반화물 | 5 | 경상북도 김천시 농소면 벽소로 2119 | 영업중
5 | 6 | (주)김천고속화물 | (구)일반화물 | 1 | 경상북도 김천시 새들1길 2 (백옥동) | 영업중
6 | 7 | (주)김천레카 | (구)일반화물 | 5 | 경상북도 김천시 영남대로 1462-1 (부곡동) | 영업중
7 | 8 | (주)김천츄레라 | (구)일반화물 | 18 | 경상북도 김천시 대학로 33 (교동) | 영업중
8 | 9 | (주)명도티엔에스 | (구)일반화물 | 19 | 경상북도 김천시 아포읍 금계길 171 | 영업중
9 | 10 | (주)부곡물류 | (구)일반화물 | 2 | 경상북도 김천시 어모면 산업단지1로 34-6 | 영업중
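Index | 연번 | 업체명 | 면허종류 | 차량대수 | 주소 | 현재운영여부
75 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
76 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
77 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
78 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
79 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
80 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
81 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
82 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
83 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
84 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>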

Duplicate rows

Most frequently occurring

Index | 연번 | 업체명 | 면허종류 | 차량대수 | 주소 | 현재운영여부 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 34