Overview

Dataset statistics

Number of variables7
Number of observations106
Missing cells2
Missing cells (%)0.3%
Duplicate rows0
Duplicate rows (%)0.0%
Total size in memory6.1 KiB
Average record size in memory59.2 B
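
The percentages in this block follow directly from the counts. A minimal sketch of the arithmetic, using only the figures reported above (the variable names are illustrative and not part of the report):

```python
# Figures copied from the "Dataset statistics" block above.
n_variables = 7
n_observations = 106
missing_cells = 2
total_size_bytes = 6.1 * 1024  # "6.1 KiB" is already rounded in the report

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
approx_record_size = total_size_bytes / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"approx. record size: {approx_record_size:.1f} B")
```

The record size computed this way differs slightly from the reported 59.2 B because the total size is only available here rounded to one decimal place.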

Variable types

Numeric2
Text4
Categorical1

Dataset

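DescriptionFactory registration data for Danyang-gun, Chungcheongbuk-do, including fields such as serial number (연번), industry (업종), company name (업체명), factory location (공장위치), number of workers (근로자수), phone number (전화번호), and data reference date (데이터기준일자).
AuthorDanyang-gun, Chungcheongbuk-do (충청북도 단양군)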
URLhttps://www.data.go.kr/data/15034302/fileData.do

Alerts

데이터기준일자 has constant value "2022-10-30"Constant
전화번호 has 2 (1.9%) missing valuesMissing
연번 has unique valuesUnique
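
The three alerts can be reproduced from the per-variable distinct and missing counts. The sketch below uses simplified rules that are assumptions made for illustration (constant when there is one distinct value, unique when every row is distinct, missing when any value is absent). It is not ydata-profiling's own implementation, and `derive_alerts` is a hypothetical helper.

```python
n_rows = 106

# Distinct and missing counts as reported in the per-variable sections.
columns = {
    "연번": {"distinct": 106, "missing": 0},
    "업종명": {"distinct": 65, "missing": 0},
    "업체명": {"distinct": 105, "missing": 0},
    "공장위치": {"distinct": 100, "missing": 0},
    "근로자수": {"distinct": 30, "missing": 0},
    "전화번호": {"distinct": 97, "missing": 2},
    "데이터기준일자": {"distinct": 1, "missing": 0},
}


def derive_alerts(stats, n_rows):
    """Return (column, alert) pairs under the simplified rules described above."""
    alerts = []
    for name, s in stats.items():
        if s["distinct"] == 1:
            alerts.append((name, "Constant"))
        elif s["distinct"] == n_rows:
            alerts.append((name, "Unique"))
        if s["missing"] > 0:
            alerts.append((name, "Missing"))
    return alerts


alerts = derive_alerts(columns, n_rows)
for name, alert in alerts:
    print(f"{name}: {alert}")
```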

Reproduction

Analysis started2023-12-12 13:37:37.394808
Analysis finished2023-12-12 13:37:39.011854
Duration1.62 seconds
Software versionydata-profiling v4.5.1
Download configurationconfig.json
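
The duration is the difference between the two timestamps. A small check using Python's standard `datetime` module:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" block.
started = datetime.fromisoformat("2023-12-12 13:37:37.394808")
finished = datetime.fromisoformat("2023-12-12 13:37:39.011854")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```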

Variables

연번
Real number (ℝ)

UNIQUE 

Distinct106
Distinct (%)100.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean53.5
Minimum1
Maximum106
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.1 KiB

Quantile statistics

Minimum1
5-th percentile6.25
Q127.25
median53.5
Q379.75
95-th percentile100.75
Maximum106
Range105
Interquartile range (IQR)52.5

Descriptive statistics

Standard deviation30.743563
Coefficient of variation (CV)0.57464604
Kurtosis-1.2
Mean53.5
Median Absolute Deviation (MAD)26.5
Skewness0
Sum5671
Variance945.16667
MonotonicityStrictly increasing
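
These figures are what one would expect from a plain row counter. The sketch below assumes 연번 is exactly the integers 1 to 106 (consistent with the reported minimum, maximum, distinct count, sum, and strict monotonicity) and recomputes the main statistics with Python's standard library. The `inclusive` quantile method is used because it reproduces the reported quartiles here.

```python
import statistics

# Assumption: 연번 is the sequence 1..106 with no gaps.
values = list(range(1, 107))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)
cv = std / mean
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, round(variance, 5), round(std, 6), round(cv, 8))
print(q1, median, q3, iqr, mad)
```

Because the column only numbers the rows, it carries no information about the factories themselves and would normally serve as an index rather than a feature.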
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1 1
 
0.9%
81 1
 
0.9%
79 1
 
0.9%
78 1
 
0.9%
77 1
 
0.9%
76 1
 
0.9%
75 1
 
0.9%
74 1
 
0.9%
73 1
 
0.9%
72 1
 
0.9%
Other values (96) 96
90.6%
ValueCountFrequency (%)
1 1
0.9%
2 1
0.9%
3 1
0.9%
4 1
0.9%
5 1
0.9%
6 1
0.9%
7 1
0.9%
8 1
0.9%
9 1
0.9%
10 1
0.9%
ValueCountFrequency (%)
106 1
0.9%
105 1
0.9%
104 1
0.9%
103 1
0.9%
102 1
0.9%
101 1
0.9%
100 1
0.9%
99 1
0.9%
98 1
0.9%
97 1
0.9%

업종명
Text

Distinct65
Distinct (%)61.3%
Missing0
Missing (%)0.0%
Memory size980.0 B

Length

Max length61
Median length39
Mean length14.528302
Min length5

Characters and Unicode

Total characters1540
Distinct characters146
Distinct categories4
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique49
Unique (%)46.2%

Sample

1st row시멘트제조업 외 2종
2nd row시멘트제조업
3rd row시멘트제조업
4th row비금속광물분쇄물생산업
5th row석회및플라스터제조업
ValueCountFrequency (%)
제조업 23
 
8.9%
22
 
8.5%
석회및플라스터제조업 11
 
4.3%
기타 11
 
4.3%
비금속광물분쇄물생산업 7
 
2.7%
금속 6
 
2.3%
관련제품 6
 
2.3%
레미콘제조업 5
 
1.9%
토목공사및유사용기계장비제조업 5
 
1.9%
일반용 5
 
1.9%
Other values (108) 157
60.9%

Most occurring characters

ValueCountFrequency (%)
153
 
9.9%
125
 
8.1%
122
 
7.9%
108
 
7.0%
70
 
4.5%
48
 
3.1%
34
 
2.2%
, 32
 
2.1%
32
 
2.1%
29
 
1.9%
Other values (136) 787
51.1%

Most occurring categories

ValueCountFrequency (%)
Other Letter 1349
87.6%
Space Separator 153
 
9.9%
Other Punctuation 32
 
2.1%
Decimal Number 6
 
0.4%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
125
 
9.3%
122
 
9.0%
108
 
8.0%
70
 
5.2%
48
 
3.6%
34
 
2.5%
32
 
2.4%
29
 
2.1%
28
 
2.1%
28
 
2.1%
Other values (131) 725
53.7%
Decimal Number
ValueCountFrequency (%)
1 3
50.0%
7 2
33.3%
2 1
 
16.7%
Space Separator
ValueCountFrequency (%)
153
100.0%
Other Punctuation
ValueCountFrequency (%)
, 32
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 1349
87.6%
Common 191
 
12.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
125
 
9.3%
122
 
9.0%
108
 
8.0%
70
 
5.2%
48
 
3.6%
34
 
2.5%
32
 
2.4%
29
 
2.1%
28
 
2.1%
28
 
2.1%
Other values (131) 725
53.7%
Common
ValueCountFrequency (%)
153
80.1%
, 32
 
16.8%
1 3
 
1.6%
7 2
 
1.0%
2 1
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
Hangul 1349
87.6%
ASCII 191
 
12.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
153
80.1%
, 32
 
16.8%
1 3
 
1.6%
7 2
 
1.0%
2 1
 
0.5%
Hangul
ValueCountFrequency (%)
125
 
9.3%
122
 
9.0%
108
 
8.0%
70
 
5.2%
48
 
3.6%
34
 
2.5%
32
 
2.4%
29
 
2.1%
28
 
2.1%
28
 
2.1%
Other values (131) 725
53.7%

업체명
Text

Distinct105
Distinct (%)99.1%
Missing0
Missing (%)0.0%
Memory size980.0 B

Length

Max length27
Median length12.5
Mean length6.8018868
Min length2

Characters and Unicode

Total characters721
Distinct characters186
Distinct categories7
Distinct scripts3
Distinct blocks3

Unique

Unique104
Unique (%)98.1%

Sample

1st row성신양회㈜
2nd row한일시멘트㈜
3rd row한일현대시멘트㈜
4th row삼원산업
5th row㈜태경비케이 단양2공장
ValueCountFrequency (%)
㈜태경비케이 3
 
2.6%
㈜강농 2
 
1.7%
주식회사 2
 
1.7%
수풍산업㈜ 2
 
1.7%
자)효신아스콘 2
 
1.7%
단양군산림조합 1
 
0.9%
성신양회㈜ 1
 
0.9%
성원파일㈜ 1
 
0.9%
㈜성우 1
 
0.9%
씨알에프앤씨(주 1
 
0.9%
Other values (100) 100
86.2%

Most occurring characters

ValueCountFrequency (%)
63
 
8.7%
33
 
4.6%
27
 
3.7%
19
 
2.6%
19
 
2.6%
18
 
2.5%
17
 
2.4%
14
 
1.9%
13
 
1.8%
13
 
1.8%
Other values (176) 485
67.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 605
83.9%
Other Symbol 63
 
8.7%
Space Separator 33
 
4.6%
Close Punctuation 7
 
1.0%
Open Punctuation 7
 
1.0%
Decimal Number 4
 
0.6%
Uppercase Letter 2
 
0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
27
 
4.5%
19
 
3.1%
19
 
3.1%
18
 
3.0%
17
 
2.8%
14
 
2.3%
13
 
2.1%
13
 
2.1%
12
 
2.0%
12
 
2.0%
Other values (168) 441
72.9%
Decimal Number
ValueCountFrequency (%)
2 2
50.0%
1 2
50.0%
Uppercase Letter
ValueCountFrequency (%)
K 1
50.0%
J 1
50.0%
Other Symbol
ValueCountFrequency (%)
63
100.0%
Space Separator
ValueCountFrequency (%)
33
100.0%
Close Punctuation
ValueCountFrequency (%)
) 7
100.0%
Open Punctuation
ValueCountFrequency (%)
( 7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 668
92.6%
Common 51
 
7.1%
Latin 2
 
0.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
63
 
9.4%
27
 
4.0%
19
 
2.8%
19
 
2.8%
18
 
2.7%
17
 
2.5%
14
 
2.1%
13
 
1.9%
13
 
1.9%
12
 
1.8%
Other values (169) 453
67.8%
Common
ValueCountFrequency (%)
33
64.7%
) 7
 
13.7%
( 7
 
13.7%
2 2
 
3.9%
1 2
 
3.9%
Latin
ValueCountFrequency (%)
K 1
50.0%
J 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
Hangul 605
83.9%
None 63
 
8.7%
ASCII 53
 
7.4%

Most frequent character per block

None
ValueCountFrequency (%)
63
100.0%
ASCII
ValueCountFrequency (%)
33
62.3%
) 7
 
13.2%
( 7
 
13.2%
2 2
 
3.8%
1 2
 
3.8%
K 1
 
1.9%
J 1
 
1.9%
Hangul
ValueCountFrequency (%)
27
 
4.5%
19
 
3.1%
19
 
3.1%
18
 
3.0%
17
 
2.8%
14
 
2.3%
13
 
2.1%
13
 
2.1%
12
 
2.0%
12
 
2.0%
Other values (168) 441
72.9%

공장위치
Text

Distinct100
Distinct (%)94.3%
Missing0
Missing (%)0.0%
Memory size980.0 B

Length

Max length29
Median length27
Mean length23.320755
Min length19

Characters and Unicode

Total characters2472
Distinct characters82
Distinct categories4
Distinct scripts2
Distinct blocks2

Unique

Unique94
Unique (%)88.7%

Sample

1st row충청북도 단양군 매포읍 매포길 18
2nd row충청북도 단양군 매포읍 매포길 245
3rd row충청북도 단양군 매포읍 고양길 95
4th row충청북도 단양군 단양읍 단양로 522
5th row충청북도 단양군 매포읍 단양로 2203-22
ValueCountFrequency (%)
충청북도 106
20.1%
단양군 106
20.1%
매포읍 47
 
8.9%
대강면 21
 
4.0%
적성농공로 17
 
3.2%
단양읍 15
 
2.8%
단양로 13
 
2.5%
대강농공길 12
 
2.3%
적성면 8
 
1.5%
가곡면 6
 
1.1%
Other values (134) 176
33.4%

Most occurring characters

ValueCountFrequency (%)
567
22.9%
159
 
6.4%
147
 
5.9%
107
 
4.3%
106
 
4.3%
106
 
4.3%
106
 
4.3%
106
 
4.3%
1 74
 
3.0%
67
 
2.7%
Other values (72) 927
37.5%

Most occurring categories

ValueCountFrequency (%)
Other Letter 1513
61.2%
Space Separator 567
 
22.9%
Decimal Number 358
 
14.5%
Dash Punctuation 34
 
1.4%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
159
 
10.5%
147
 
9.7%
107
 
7.1%
106
 
7.0%
106
 
7.0%
106
 
7.0%
106
 
7.0%
67
 
4.4%
62
 
4.1%
53
 
3.5%
Other values (60) 494
32.7%
Decimal Number
ValueCountFrequency (%)
1 74
20.7%
2 64
17.9%
3 34
9.5%
7 32
8.9%
4 30
8.4%
6 29
 
8.1%
0 29
 
8.1%
5 26
 
7.3%
9 21
 
5.9%
8 19
 
5.3%
Space Separator
ValueCountFrequency (%)
567
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 34
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 1513
61.2%
Common 959
38.8%

Most frequent character per script

Hangul
ValueCountFrequency (%)
159
 
10.5%
147
 
9.7%
107
 
7.1%
106
 
7.0%
106
 
7.0%
106
 
7.0%
106
 
7.0%
67
 
4.4%
62
 
4.1%
53
 
3.5%
Other values (60) 494
32.7%
Common
ValueCountFrequency (%)
567
59.1%
1 74
 
7.7%
2 64
 
6.7%
- 34
 
3.5%
3 34
 
3.5%
7 32
 
3.3%
4 30
 
3.1%
6 29
 
3.0%
0 29
 
3.0%
5 26
 
2.7%
Other values (2) 40
 
4.2%

Most occurring blocks

ValueCountFrequency (%)
Hangul 1513
61.2%
ASCII 959
38.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
567
59.1%
1 74
 
7.7%
2 64
 
6.7%
- 34
 
3.5%
3 34
 
3.5%
7 32
 
3.3%
4 30
 
3.1%
6 29
 
3.0%
0 29
 
3.0%
5 26
 
2.7%
Other values (2) 40
 
4.2%
Hangul
ValueCountFrequency (%)
159
 
10.5%
147
 
9.7%
107
 
7.1%
106
 
7.0%
106
 
7.0%
106
 
7.0%
106
 
7.0%
67
 
4.4%
62
 
4.1%
53
 
3.5%
Other values (60) 494
32.7%

근로자수
Real number (ℝ)

Distinct30
Distinct (%)28.3%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean18.962264
Minimum1
Maximum414
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size1.1 KiB

Quantile statistics

Minimum1
5-th percentile1
Q12
median5
Q310.75
95-th percentile72.25
Maximum414
Range413
Interquartile range (IQR)8.75

Descriptive statistics

Standard deviation51.498483
Coefficient of variation (CV)2.7158404
Kurtosis39.975311
Mean18.962264
Median Absolute Deviation (MAD)3
Skewness5.936673
Sum2010
Variance2652.0938
MonotonicityNot monotonic
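
The gap between the mean (about 19) and the median (5), together with the high skewness, points to a few very large employers. One common way to make this concrete is Tukey's 1.5 × IQR rule. The report does not apply this rule itself, so the sketch below is an illustration built only from the reported quartiles, sum, and the largest values listed for this variable.

```python
# Quartiles, sum and count as reported for 근로자수.
q1, q3 = 2, 10.75
total_workers = 2010
n_rows = 106
std = 51.498483

# Largest observations from the report's table of maximum values
# (70 occurs three times).
largest = [414, 292, 106, 82, 80, 73, 70, 70, 70, 66, 41, 38]

iqr = q3 - q1
upper_fence = q3 + 1.5 * iqr  # Tukey's rule, an assumed convention
flagged = [v for v in largest if v > upper_fence]

mean = total_workers / n_rows
cv = std / mean
top_two_share = 100 * sum(largest[:2]) / total_workers

print(f"upper fence: {upper_fence}")
print(f"values above fence (among those listed): {len(flagged)}")
print(f"mean: {mean:.6f}, CV: {cv:.4f}")
print(f"share of workers at the two largest factories: {top_two_share:.1f}%")
```

Under this rule every value in the listed maximum table lies above the fence, and the two largest factories account for roughly a third of all reported workers, so summaries based on the median are likely more representative than the mean.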
Histogram with fixed size bins (bins=30)
ValueCountFrequency (%)
1 20
18.9%
3 11
10.4%
4 9
 
8.5%
2 9
 
8.5%
8 8
 
7.5%
5 8
 
7.5%
7 6
 
5.7%
6 4
 
3.8%
20 4
 
3.8%
70 3
 
2.8%
Other values (20) 24
22.6%
ValueCountFrequency (%)
1 20
18.9%
2 9
8.5%
3 11
10.4%
4 9
8.5%
5 8
 
7.5%
6 4
 
3.8%
7 6
 
5.7%
8 8
 
7.5%
9 3
 
2.8%
10 1
 
0.9%
ValueCountFrequency (%)
414 1
 
0.9%
292 1
 
0.9%
106 1
 
0.9%
82 1
 
0.9%
80 1
 
0.9%
73 1
 
0.9%
70 3
2.8%
66 1
 
0.9%
41 1
 
0.9%
38 1
 
0.9%

전화번호
Text

MISSING 

Distinct97
Distinct (%)93.3%
Missing2
Missing (%)1.9%
Memory size980.0 B

Length

Max length13
Median length12
Mean length12.019231
Min length11

Characters and Unicode

Total characters1250
Distinct characters11
Distinct categories2
Distinct scripts1
Distinct blocks1

Unique

Unique90
Unique (%)86.5%

Sample

1st row043-420-4100
2nd row043-420-5000
3rd row043-420-8900
4th row043-422-0570
5th row043-421-4811
ValueCountFrequency (%)
043-422-9771 2
 
1.9%
043-422-4321 2
 
1.9%
070-8250-6744 2
 
1.9%
043-422-2440 2
 
1.9%
043-422-0077 2
 
1.9%
043-422-5939 2
 
1.9%
043-421-9756 2
 
1.9%
043-422-7547 1
 
1.0%
043-421-6360 1
 
1.0%
043-422-3511 1
 
1.0%
Other values (87) 87
83.7%

Most occurring characters

ValueCountFrequency (%)
4 225
18.0%
- 208
16.6%
0 180
14.4%
2 180
14.4%
3 155
12.4%
1 74
 
5.9%
7 72
 
5.8%
5 50
 
4.0%
6 41
 
3.3%
9 37
 
3.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1042
83.4%
Dash Punctuation 208
 
16.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 225
21.6%
0 180
17.3%
2 180
17.3%
3 155
14.9%
1 74
 
7.1%
7 72
 
6.9%
5 50
 
4.8%
6 41
 
3.9%
9 37
 
3.6%
8 28
 
2.7%
Dash Punctuation
ValueCountFrequency (%)
- 208
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1250
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 225
18.0%
- 208
16.6%
0 180
14.4%
2 180
14.4%
3 155
12.4%
1 74
 
5.9%
7 72
 
5.8%
5 50
 
4.0%
6 41
 
3.3%
9 37
 
3.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1250
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 225
18.0%
- 208
16.6%
0 180
14.4%
2 180
14.4%
3 155
12.4%
1 74
 
5.9%
7 72
 
5.8%
5 50
 
4.0%
6 41
 
3.3%
9 37
 
3.0%
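
The character statistics suggest a uniform digits-and-dashes format: 208 dashes over 104 non-missing values is exactly two per number, and lengths range from 11 to 13. The sketch below checks values against a pattern inferred from those figures. The regular expression and the helper `is_valid_phone` are assumptions made for illustration, and the 11-character example is hypothetical rather than taken from the data.

```python
import re

# Assumed pattern: area or service code starting with 0 (2-3 digits),
# a 3-4 digit exchange, and a 4 digit line number, separated by dashes.
PHONE_RE = re.compile(r"0\d{1,2}-\d{3,4}-\d{4}")


def is_valid_phone(value):
    """Return True when value is a string matching the assumed pattern."""
    return value is not None and PHONE_RE.fullmatch(value) is not None


# Consistency checks on the reported totals.
non_missing = 106 - 2
dashes_per_value = 208 / non_missing
mean_length = 1250 / non_missing

examples = [
    "043-420-4100",   # from the sample rows
    "070-8250-6744",  # 13 characters, from the frequency table
    "02-123-4567",    # hypothetical 11-character number
    "0434204100",     # hypothetical value without dashes
    None,             # missing value
]
for value in examples:
    print(value, is_valid_phone(value))
print(dashes_per_value, round(mean_length, 6))
```

A check of this kind could be used to confirm that the two missing entries are the only values that do not follow the format.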

데이터기준일자
Categorical

CONSTANT 

Distinct1
Distinct (%)0.9%
Missing0
Missing (%)0.0%
Memory size980.0 B
2022-10-30
106 

Length

Max length10
Median length10
Mean length10
Min length10

Unique

Unique0
Unique (%)0.0%

Sample

1st row2022-10-30
2nd row2022-10-30
3rd row2022-10-30
4th row2022-10-30
5th row2022-10-30

Common Values

ValueCountFrequency (%)
2022-10-30 106
100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
2022-10-30 106
100.0%

Interactions

[Figures: pairwise scatter plots of the numeric variables 연번 and 근로자수; images not preserved in this export]

Correlations

Variable | 연번 | 업종명 | 공장위치 | 근로자수 | 전화번호
연번 | 1.000 | 0.757 | 0.849 | 0.000 | 0.790
업종명 | 0.757 | 1.000 | 0.991 | 0.906 | 0.991
공장위치 | 0.849 | 0.991 | 1.000 | 1.000 | 0.995
근로자수 | 0.000 | 0.906 | 1.000 | 1.000 | 1.000
전화번호 | 0.790 | 0.991 | 0.995 | 1.000 | 1.000

Variable | 연번 | 근로자수
연번 | 1.000 | -0.210
근로자수 | -0.210 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

연번업종명업체명공장위치근로자수전화번호데이터기준일자
01시멘트제조업 외 2종성신양회㈜충청북도 단양군 매포읍 매포길 18414043-420-41002022-10-30
12시멘트제조업한일시멘트㈜충청북도 단양군 매포읍 매포길 245292043-420-50002022-10-30
23시멘트제조업한일현대시멘트㈜충청북도 단양군 매포읍 고양길 9541043-420-89002022-10-30
34비금속광물분쇄물생산업삼원산업충청북도 단양군 단양읍 단양로 52212043-422-05702022-10-30
45석회및플라스터제조업㈜태경비케이 단양2공장충청북도 단양군 매포읍 단양로 2203-2270043-421-48112022-10-30
56석회및플라스터제조업현대석회㈜충청북도 단양군 단양읍 단양노동길 20322043-423-27502022-10-30
67석회및플라스터제조업㈜태경비케이 노동공장충청북도 단양군 단양읍 노동장현로 49170043-422-43212022-10-30
78비금속광물분쇄물생산업㈜영화케미칼충청북도 단양군 대강면 용부원길 82-481043-422-00872022-10-30
89탁주및약주제조업대강양조장충청북도 단양군 대강면 대강로 607043-422-00772022-10-30
910석회및플라스터제조업㈜기정소재충청북도 단양군 적성면 기동1길 383043-423-78692022-10-30
연번업종명업체명공장위치근로자수전화번호데이터기준일자
9697강주물주조업신보특수강충청북도 단양군 매포읍 단양로1397-271<NA>2022-10-30
9798농업및임업용기계제조업㈜강농충청북도 단양군 단성면 선암계곡로 10512043-422-59392022-10-30
9899곡물제분업소백산영농조합법인충청북도 단양군 매포읍 응실2길 416043-421-93792022-10-30
99100비금속광물분쇄물생산업주식회사 비엠씨충청북도 단양군 적성면 성곡길 1051043-421-82722022-10-30
100101간판및광고물제조업㈜참둘레길충청북도 단양군 적성면 적성농공로 30-166043-421-91302022-10-30
101102기타과실채소가공및저장처리업농업회사법인 주식회사 하동식품충청북도 단양군 적성면 금수산로 782070-7437-27102022-10-30
102103그외 기타 분류안된 섬유제품제조업㈜아보텍충청북도 단양군 매포읍 적성농공로 37-201043-421-57772022-10-30
103104일반용 도료 및 관련제품 제조업㈜케이디엠코팅스충청북도 단양군 매포읍 적성농공로 476043-422-42212022-10-30
104105콘크리트 타일, 기와, 벽돌 및 블록제조업㈜유환충청북도 단양군 매포읍 단양산업단지1로 189302-575-05572022-10-30
105106그 외 기타 특수목적용 기계 제조업(유)중원시스템충청북도 단양군 대강면 대강농공길 231043-855-52052022-10-30