Overview

Dataset statistics

Number of variables: 8
Number of observations: 501
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 33.4 KiB
Average record size in memory: 68.3 B
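
The average record size appears to be the total in-memory size divided by the number of observations. The sketch below checks that arithmetic with the rounded figures from the table, assuming 1 KiB = 1024 bytes; the function name is illustrative only and is not part of the report.

```python
# Figures copied from the Dataset statistics table.
TOTAL_SIZE_KIB = 33.4
N_OBSERVATIONS = 501


def average_record_size_bytes(total_kib: float, n_rows: int) -> float:
    """Average bytes per row, assuming 1 KiB = 1024 bytes."""
    return total_kib * 1024 / n_rows


avg_bytes = average_record_size_bytes(TOTAL_SIZE_KIB, N_OBSERVATIONS)
print(f"{avg_bytes:.1f} B")  # prints "68.3 B"
```

Because the total size is itself rounded to one decimal place, the result only agrees with the reported value after rounding.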

Variable types

Categorical: 3
Text: 1
Numeric: 4

Dataset

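Description: Data on the workforce delivering national smoking cessation support services in 2021, covering staff categories such as counselors and guidance officers. Scope: public health centers nationwide and regional smoking cessation support centers.
Author: Korea Health Promotion Institute (한국건강증진개발원)
URL: https://www.data.go.kr/data/15092414/fileData.do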

Alerts

[High correlation] 최종학력(고등학교 이하) is highly overall correlated with 인력구분
[High correlation] 최종학력(대학교) is highly overall correlated with 기관유형
[High correlation] 최종학력(대학원) is highly overall correlated with 기관유형
[High correlation] 기관유형 is highly overall correlated with 최종학력(대학교) and 1 other field
[High correlation] 인력구분 is highly overall correlated with 최종학력(고등학교 이하)
[Imbalance] 기관유형 is highly imbalanced (78.6%)
[Zeros] 최종학력(고등학교 이하) has 327 (65.3%) zeros
[Zeros] 최종학력(전문대) has 237 (47.3%) zeros
[Zeros] 최종학력(대학교) has 104 (20.8%) zeros
[Zeros] 최종학력(대학원) has 428 (85.4%) zeros
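
The 78.6% imbalance figure is consistent with one minus the normalized Shannon entropy of the class counts (484 and 17 for 기관유형), and the zero percentages follow from dividing by the 501 observations. The sketch below reproduces both. The entropy formula is an assumption about how the score is derived, since the report does not state it, and the helper names are hypothetical.

```python
import math


def imbalance_score(counts):
    """1 - normalized Shannon entropy: 0 means balanced, values near 1 mean one class dominates."""
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # assumed convention for a single-class column
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(n_classes)


def percent(part, whole):
    """Share of `part` in `whole`, as a percentage rounded to one decimal."""
    return round(100 * part / whole, 1)


N_OBSERVATIONS = 501
institution_counts = [484, 17]  # class counts for 기관유형
zero_counts = {"high school or below": 327, "junior college": 237,
               "university": 104, "graduate school": 428}

imbalance_pct = round(100 * imbalance_score(institution_counts), 1)
zero_pcts = {name: percent(n, N_OBSERVATIONS) for name, n in zero_counts.items()}
print(imbalance_pct)  # 78.6
print(zero_pcts)
```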

Reproduction

Analysis started: 2023-12-12 04:29:15.590950
Analysis finished: 2023-12-12 04:29:18.502938
Duration: 2.91 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
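
The reported duration is the difference between the two timestamps. A minimal check with the standard library, using the values exactly as printed in the report:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2023-12-12 04:29:15.590950")
finished = datetime.fromisoformat("2023-12-12 04:29:18.502938")

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")  # prints "2.91 seconds"
```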

Variables

기관유형 (institution type)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

보건소 (public health center): 484
금연지원센터 (smoking cessation support center): 17

Length

Max length: 6
Median length: 3
Mean length: 3.1017964
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 보건소
2nd row: 보건소
3rd row: 보건소
4th row: 보건소
5th row: 보건소

Common Values

Value | Count | Frequency (%)
보건소 | 484 | 96.6%
금연지원센터 | 17 | 3.4%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; counts as in the Common Values table]

지역 (region)
Categorical

Distinct: 17
Distinct (%): 3.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

경기도: 85
경상북도: 49
전라남도: 45
경상남도: 41
서울특별시: 40
Other values (12): 241

Length

Max length: 7
Median length: 5
Mean length: 4.1237525
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울특별시
2nd row: 서울특별시
3rd row: 서울특별시
4th row: 서울특별시
5th row: 서울특별시

Common Values

Value | Count | Frequency (%)
경기도 | 85 | 17.0%
경상북도 | 49 | 9.8%
전라남도 | 45 | 9.0%
경상남도 | 41 | 8.2%
서울특별시 | 40 | 8.0%
강원도 | 37 | 7.4%
충청남도 | 33 | 6.6%
부산광역시 | 31 | 6.2%
전라북도 | 28 | 5.6%
충청북도 | 27 | 5.4%
Other values (7) | 85 | 17.0%

Length

[Figure: Histogram of lengths of the category]

[Figure: plot of the common values; counts as in the Common Values table]

기관명 (institution name)
Text

Distinct: 275
Distinct (%): 54.9%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 14
Median length: 9
Mean length: 9.5748503
Min length: 8

Characters and Unicode

Total characters: 4797
Distinct characters: 153
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
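
As a rough illustration of those character properties, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module over three of the sample institution names. `Lo` corresponds to the report's "Other Letter" and `Zs` to "Space Separator". This is not necessarily how the profiling tool computes its tables, and the helper name is hypothetical.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in the given strings."""
    counts = Counter()
    for text in values:
        # Normalize to NFC so each Hangul syllable is a single code point.
        normalized = unicodedata.normalize("NFC", text)
        counts.update(unicodedata.category(ch) for ch in normalized)
    return counts


sample_names = ["서울 강남구보건소", "서울 강동구보건소", "서울 강북구보건소"]
counts = category_counts(sample_names)
print(counts)  # 24 characters in category Lo, 3 in category Zs
```

Each sample name has eight Hangul syllables and one space, which matches the two categories the report lists for this column.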

Unique

Unique: 49
Unique (%): 9.8%

Sample

1st row: 서울 강남구보건소
2nd row: 서울 강남구보건소
3rd row: 서울 강동구보건소
4th row: 서울 강북구보건소
5th row: 서울 강북구보건소

Words

Value | Count | Frequency (%)
경기 | 83 | 7.8%
경북 | 48 | 4.5%
전남 | 44 | 4.1%
경남 | 40 | 3.8%
서울 | 39 | 3.7%
강원 | 36 | 3.4%
충남 | 32 | 3.0%
부산 | 30 | 2.8%
전북 | 27 | 2.5%
충북 | 26 | 2.4%
Other values (273) | 661 | 62.0%
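
Most occurring characters

Value | Count | Frequency (%)
(space) | 565 | 11.8%
[unreadable] | 489 | 10.2%
[unreadable] | 484 | 10.1%
[unreadable] | 455 | 9.5%
[unreadable] | 198 | 4.1%
[unreadable] | 190 | 4.0%
[unreadable] | 181 | 3.8%
[unreadable] | 166 | 3.5%
[unreadable] | 152 | 3.2%
[unreadable] | 120 | 2.5%
Other values (143) | 1797 | 37.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 4232 | 88.2%
Space Separator | 565 | 11.8%

Most frequent character per category

Other Letter

Value | Count | Frequency (%)
[unreadable] | 489 | 11.6%
[unreadable] | 484 | 11.4%
[unreadable] | 455 | 10.8%
[unreadable] | 198 | 4.7%
[unreadable] | 190 | 4.5%
[unreadable] | 181 | 4.3%
[unreadable] | 166 | 3.9%
[unreadable] | 152 | 3.6%
[unreadable] | 120 | 2.8%
[unreadable] | 113 | 2.7%
Other values (142) | 1684 | 39.8%

Space Separator

Value | Count | Frequency (%)
(space) | 565 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 4232 | 88.2%
Common | 565 | 11.8%

Most frequent character per script

Hangul

Value | Count | Frequency (%)
[unreadable] | 489 | 11.6%
[unreadable] | 484 | 11.4%
[unreadable] | 455 | 10.8%
[unreadable] | 198 | 4.7%
[unreadable] | 190 | 4.5%
[unreadable] | 181 | 4.3%
[unreadable] | 166 | 3.9%
[unreadable] | 152 | 3.6%
[unreadable] | 120 | 2.8%
[unreadable] | 113 | 2.7%
Other values (142) | 1684 | 39.8%

Common

Value | Count | Frequency (%)
(space) | 565 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 4232 | 88.2%
ASCII | 565 | 11.8%

Most frequent character per block

ASCII

Value | Count | Frequency (%)
(space) | 565 | 100.0%

Hangul

Value | Count | Frequency (%)
[unreadable] | 489 | 11.6%
[unreadable] | 484 | 11.4%
[unreadable] | 455 | 10.8%
[unreadable] | 198 | 4.7%
[unreadable] | 190 | 4.5%
[unreadable] | 181 | 4.3%
[unreadable] | 166 | 3.9%
[unreadable] | 152 | 3.6%
[unreadable] | 120 | 2.8%
[unreadable] | 113 | 2.7%
Other values (142) | 1684 | 39.8%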

인력구분 (staff category)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

상담사 (counselor): 272
단속원 (enforcement officer): 229

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 상담사
2nd row: 단속원
3rd row: 상담사
4th row: 상담사
5th row: 단속원

Common Values

Value | Count | Frequency (%)
상담사 | 272 | 54.3%
단속원 | 229 | 45.7%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values; counts as in the Common Values table]

최종학력(고등학교 이하) (highest education level: high school or below)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 15
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.2355289
Minimum: 0
Maximum: 19
Zeros: 327
Zeros (%): 65.3%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 2
95-th percentile: 6
Maximum: 19
Range: 19
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 2.3614435
Coefficient of variation (CV): 1.9112814
Kurtosis: 11.279256
Mean: 1.2355289
Median Absolute Deviation (MAD): 0
Skewness: 2.8694115
Sum: 619
Variance: 5.5764152
Monotonicity: Not monotonic
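
Several of these figures are derived from one another, so they can be cross-checked with plain arithmetic: mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1, and range = maximum - minimum. A minimal sketch using the values reported for this variable, with illustrative variable names:

```python
# Values copied from the statistics for 최종학력(고등학교 이하).
n_observations = 501
total = 619
std_dev = 2.3614435
q1, q3 = 0, 2
minimum, maximum = 0, 19

mean = total / n_observations
coefficient_of_variation = std_dev / mean
variance = std_dev ** 2
iqr = q3 - q1
value_range = maximum - minimum

print(f"mean={mean:.7f} cv={coefficient_of_variation:.7f} variance={variance:.7f}")
print(f"iqr={iqr} range={value_range}")
```

The same relations hold for the other three numeric variables in the report.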

[Figure: Histogram with fixed size bins (bins=15)]

Common Values

Value | Count | Frequency (%)
0 | 327 | 65.3%
2 | 48 | 9.6%
1 | 35 | 7.0%
3 | 25 | 5.0%
4 | 20 | 4.0%
5 | 14 | 2.8%
6 | 11 | 2.2%
8 | 8 | 1.6%
7 | 5 | 1.0%
9 | 2 | 0.4%
Other values (5) | 6 | 1.2%

Minimum 10 values

Value | Count | Frequency (%)
0 | 327 | 65.3%
1 | 35 | 7.0%
2 | 48 | 9.6%
3 | 25 | 5.0%
4 | 20 | 4.0%
5 | 14 | 2.8%
6 | 11 | 2.2%
7 | 5 | 1.0%
8 | 8 | 1.6%
9 | 2 | 0.4%

Maximum 10 values

Value | Count | Frequency (%)
19 | 1 | 0.2%
14 | 1 | 0.2%
13 | 2 | 0.4%
11 | 1 | 0.2%
10 | 1 | 0.2%
9 | 2 | 0.4%
8 | 8 | 1.6%
7 | 5 | 1.0%
6 | 11 | 2.2%
5 | 14 | 2.8%

최종학력(전문대) (highest education level: junior college)
Real number (ℝ)

ZEROS

Distinct: 11
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.0499002
Minimum: 0
Maximum: 15
Zeros: 237
Zeros (%): 47.3%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 2
95-th percentile: 4
Maximum: 15
Range: 15
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.5084777
Coefficient of variation (CV): 1.436782
Kurtosis: 18.207373
Mean: 1.0499002
Median Absolute Deviation (MAD): 1
Skewness: 3.1230514
Sum: 526
Variance: 2.275505
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=11)]

Common Values

Value | Count | Frequency (%)
0 | 237 | 47.3%
1 | 130 | 25.9%
2 | 72 | 14.4%
3 | 36 | 7.2%
4 | 11 | 2.2%
5 | 8 | 1.6%
9 | 2 | 0.4%
6 | 2 | 0.4%
15 | 1 | 0.2%
7 | 1 | 0.2%

Minimum 10 values

Value | Count | Frequency (%)
0 | 237 | 47.3%
1 | 130 | 25.9%
2 | 72 | 14.4%
3 | 36 | 7.2%
4 | 11 | 2.2%
5 | 8 | 1.6%
6 | 2 | 0.4%
7 | 1 | 0.2%
8 | 1 | 0.2%
9 | 2 | 0.4%

Maximum 10 values

Value | Count | Frequency (%)
15 | 1 | 0.2%
9 | 2 | 0.4%
8 | 1 | 0.2%
7 | 1 | 0.2%
6 | 2 | 0.4%
5 | 8 | 1.6%
4 | 11 | 2.2%
3 | 36 | 7.2%
2 | 72 | 14.4%
1 | 130 | 25.9%

최종학력(대학교) (highest education level: university)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 19
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.5908184
Minimum: 0
Maximum: 29
Zeros: 104
Zeros (%): 20.8%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 2
Q3: 3
95-th percentile: 9
Maximum: 29
Range: 29
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 3.1588345
Coefficient of variation (CV): 1.219242
Kurtosis: 13.797926
Mean: 2.5908184
Median Absolute Deviation (MAD): 1
Skewness: 2.9383978
Sum: 1298
Variance: 9.9782355
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=19)]

Common Values

Value | Count | Frequency (%)
1 | 125 | 25.0%
0 | 104 | 20.8%
2 | 98 | 19.6%
3 | 60 | 12.0%
4 | 36 | 7.2%
5 | 18 | 3.6%
6 | 18 | 3.6%
8 | 10 | 2.0%
9 | 6 | 1.2%
10 | 5 | 1.0%
Other values (9) | 21 | 4.2%

Minimum 10 values

Value | Count | Frequency (%)
0 | 104 | 20.8%
1 | 125 | 25.0%
2 | 98 | 19.6%
3 | 60 | 12.0%
4 | 36 | 7.2%
5 | 18 | 3.6%
6 | 18 | 3.6%
7 | 4 | 0.8%
8 | 10 | 2.0%
9 | 6 | 1.2%

Maximum 10 values

Value | Count | Frequency (%)
29 | 1 | 0.2%
20 | 1 | 0.2%
16 | 2 | 0.4%
15 | 2 | 0.4%
14 | 1 | 0.2%
13 | 3 | 0.6%
12 | 2 | 0.4%
11 | 5 | 1.0%
10 | 5 | 1.0%
9 | 6 | 1.2%

최종학력(대학원) (highest education level: graduate school)
Real number (ℝ)

HIGH CORRELATION, ZEROS

Distinct: 13
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.41117764
Minimum: 0
Maximum: 17
Zeros: 428
Zeros (%): 85.4%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 17
Range: 17
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.7130659
Coefficient of variation (CV): 4.1662428
Kurtosis: 41.23982
Mean: 0.41117764
Median Absolute Deviation (MAD): 0
Skewness: 6.1014382
Sum: 206
Variance: 2.9345948
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=13)]

Common Values

Value | Count | Frequency (%)
0 | 428 | 85.4%
1 | 53 | 10.6%
2 | 3 | 0.6%
6 | 3 | 0.6%
5 | 2 | 0.4%
9 | 2 | 0.4%
10 | 2 | 0.4%
11 | 2 | 0.4%
4 | 2 | 0.4%
14 | 1 | 0.2%
Other values (3) | 3 | 0.6%

Minimum 10 values

Value | Count | Frequency (%)
0 | 428 | 85.4%
1 | 53 | 10.6%
2 | 3 | 0.6%
4 | 2 | 0.4%
5 | 2 | 0.4%
6 | 3 | 0.6%
8 | 1 | 0.2%
9 | 2 | 0.4%
10 | 2 | 0.4%
11 | 2 | 0.4%

Maximum 10 values

Value | Count | Frequency (%)
17 | 1 | 0.2%
14 | 1 | 0.2%
12 | 1 | 0.2%
11 | 2 | 0.4%
10 | 2 | 0.4%
9 | 2 | 0.4%
8 | 1 | 0.2%
6 | 3 | 0.6%
5 | 2 | 0.4%
4 | 2 | 0.4%

Interactions

[Figures: 16 pairwise interaction plots; images not preserved]

Correlations

[Figure: correlation heatmap]

 | 기관유형 | 지역 | 인력구분 | 최종학력(고등학교 이하) | 최종학력(전문대) | 최종학력(대학교) | 최종학력(대학원)
기관유형 | 1.000 | 0.000 | 0.241 | 0.000 | 0.412 | 0.950 | 0.996
지역 | 0.000 | 1.000 | 0.000 | 0.260 | 0.238 | 0.206 | 0.314
인력구분 | 0.241 | 0.000 | 1.000 | 0.635 | 0.389 | 0.202 | 0.106
최종학력(고등학교 이하) | 0.000 | 0.260 | 0.635 | 1.000 | 0.000 | 0.000 | 0.000
최종학력(전문대) | 0.412 | 0.238 | 0.389 | 0.000 | 1.000 | 0.566 | 0.715
최종학력(대학교) | 0.950 | 0.206 | 0.202 | 0.000 | 0.566 | 1.000 | 0.865
최종학력(대학원) | 0.996 | 0.314 | 0.106 | 0.000 | 0.715 | 0.865 | 1.000

[Figure: correlation heatmap]

 | 지역 | 기관유형 | 인력구분
지역 | 1.000 | 0.000 | 0.000
기관유형 | 0.000 | 1.000 | 0.155
인력구분 | 0.000 | 0.155 | 1.000

[Figure: correlation heatmap]

 | 최종학력(고등학교 이하) | 최종학력(전문대) | 최종학력(대학교) | 최종학력(대학원) | 기관유형 | 지역 | 인력구분
최종학력(고등학교 이하) | 1.000 | -0.305 | -0.209 | -0.076 | 0.000 | 0.105 | 0.637
최종학력(전문대) | -0.305 | 1.000 | -0.073 | 0.072 | 0.236 | 0.065 | 0.290
최종학력(대학교) | -0.209 | -0.073 | 1.000 | 0.221 | 0.800 | 0.087 | 0.150
최종학력(대학원) | -0.076 | 0.072 | 0.221 | 1.000 | 0.937 | 0.126 | 0.080
기관유형 | 0.000 | 0.236 | 0.800 | 0.937 | 1.000 | 0.000 | 0.155
지역 | 0.105 | 0.065 | 0.087 | 0.126 | 0.000 | 1.000 | 0.000
인력구분 | 0.637 | 0.290 | 0.150 | 0.080 | 0.155 | 0.000 | 1.000
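
The high-correlation alerts near the top of the report line up with the entries of the final matrix above whose absolute value is at least 0.5. The sketch below scans the upper triangle of that matrix for such pairs. The 0.5 cutoff is an assumption, since the report does not state the threshold it used, and the function name is hypothetical.

```python
LABELS = [
    "최종학력(고등학교 이하)", "최종학력(전문대)", "최종학력(대학교)",
    "최종학력(대학원)", "기관유형", "지역", "인력구분",
]

# Values copied from the final correlation matrix in the report.
MATRIX = [
    [1.000, -0.305, -0.209, -0.076, 0.000, 0.105, 0.637],
    [-0.305, 1.000, -0.073, 0.072, 0.236, 0.065, 0.290],
    [-0.209, -0.073, 1.000, 0.221, 0.800, 0.087, 0.150],
    [-0.076, 0.072, 0.221, 1.000, 0.937, 0.126, 0.080],
    [0.000, 0.236, 0.800, 0.937, 1.000, 0.000, 0.155],
    [0.105, 0.065, 0.087, 0.126, 0.000, 1.000, 0.000],
    [0.637, 0.290, 0.150, 0.080, 0.155, 0.000, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff; not stated in the report


def high_correlation_pairs(labels, matrix, threshold):
    """Return (label_a, label_b, value) for each upper-triangle entry with |value| >= threshold."""
    pairs = []
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            if abs(matrix[i][j]) >= threshold:
                pairs.append((labels[i], labels[j], matrix[i][j]))
    return pairs


pairs = high_correlation_pairs(LABELS, MATRIX, THRESHOLD)
for a, b, value in pairs:
    print(f"{a} ~ {b}: {value:.3f}")
```

Under that assumption, the scan returns three pairs: 최종학력(고등학교 이하) with 인력구분, 최종학력(대학교) with 기관유형, and 최종학력(대학원) with 기관유형.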

Missing values

[Figure: nullity by column]
A simple visualization of nullity by column.

[Figure: nullity matrix]
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

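First rows

 | 기관유형 | 지역 | 기관명 | 인력구분 | 최종학력(고등학교 이하) | 최종학력(전문대) | 최종학력(대학교) | 최종학력(대학원)
0 | 보건소 | 서울특별시 | 서울 강남구보건소 | 상담사 | 0 | 0 | 4 | 0
1 | 보건소 | 서울특별시 | 서울 강남구보건소 | 단속원 | 0 | 0 | 9 | 0
2 | 보건소 | 서울특별시 | 서울 강동구보건소 | 상담사 | 0 | 0 | 2 | 1
3 | 보건소 | 서울특별시 | 서울 강북구보건소 | 상담사 | 1 | 2 | 5 | 1
4 | 보건소 | 서울특별시 | 서울 강북구보건소 | 단속원 | 3 | 0 | 0 | 0
5 | 보건소 | 서울특별시 | 서울 강서구보건소 | 단속원 | 3 | 0 | 8 | 0
6 | 보건소 | 서울특별시 | 서울 관악구보건소 | 상담사 | 0 | 0 | 3 | 0
7 | 보건소 | 서울특별시 | 서울 관악구보건소 | 단속원 | 2 | 0 | 0 | 0
8 | 보건소 | 서울특별시 | 서울 광진구보건소 | 상담사 | 0 | 1 | 2 | 0
9 | 보건소 | 서울특별시 | 서울 광진구보건소 | 단속원 | 0 | 1 | 1 | 1

Last rows

 | 기관유형 | 지역 | 기관명 | 인력구분 | 최종학력(고등학교 이하) | 최종학력(전문대) | 최종학력(대학교) | 최종학력(대학원)
491 | 금연지원센터 | 경기도 | 경기남부금연지원센터 | 상담사 | 0 | 3 | 14 | 8
492 | 금연지원센터 | 경기도 | 경기북부금연지원센터 | 상담사 | 2 | 6 | 20 | 17
493 | 금연지원센터 | 강원도 | 강원금연지원센터 | 상담사 | 0 | 1 | 15 | 0
494 | 금연지원센터 | 충청북도 | 충북금연지원센터 | 상담사 | 0 | 3 | 11 | 5
495 | 금연지원센터 | 충청남도 | 충남금연지원센터 | 상담사 | 0 | 3 | 16 | 11
496 | 금연지원센터 | 전라북도 | 전북금연지원센터 | 상담사 | 0 | 2 | 11 | 6
497 | 금연지원센터 | 전라남도 | 전남금연지원센터 | 상담사 | 0 | 2 | 15 | 6
498 | 금연지원센터 | 경상북도 | 경북금연지원센터 | 상담사 | 1 | 4 | 13 | 4
499 | 금연지원센터 | 경상남도 | 경남금연지원센터 | 상담사 | 1 | 5 | 13 | 4
500 | 금연지원센터 | 제주특별자치도 | 제주금연지원센터 | 상담사 | 2 | 0 | 16 | 6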