Overview

Dataset statistics

Number of variables: 7
Number of observations: 22
Missing cells: 96
Missing cells (%): 62.3%
Duplicate rows: 1
Duplicate rows (%): 4.5%
Total size in memory: 1.5 KiB
Average record size in memory: 68.0 B
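
The percentages above can be checked by hand. The sketch below is a plain arithmetic check, not part of the profiling tool: the per-column missing counts are copied from the Alerts and Variables sections of this report, and the single zero is the categorical column, which reports no missing values.

```python
# Figures copied from this report; the variable names are illustrative only.
n_rows, n_cols = 22, 7

# Six columns report 16 missing values each; the categorical column reports 0.
missing_per_column = [16, 16, 16, 16, 0, 16, 16]

missing_cells = sum(missing_per_column)
missing_pct = 100 * missing_cells / (n_rows * n_cols)

# The report lists one duplicated row pattern (the all-<NA> row).
duplicate_pct = 100 * 1 / n_rows

print(missing_cells, f"{missing_pct:.1f}%", f"{duplicate_pct:.1f}%")
```

The 96 missing cells are 16 empty rows times the 6 columns that register them as missing, out of 22 x 7 = 154 cells.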

Variable types

Text: 1
Numeric: 5
Categorical: 1

Dataset

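Description: Data from the Magazine Industry Survey (잡지산업 실태조사): the number of businesses, as of 2010, classified by number of employees and by sales revenue. See the attached file for details.
Author: 한국언론진흥재단 (Korea Press Foundation)
URL: https://www.data.go.kr/data/15047566/fileData.do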

Alerts

Dataset has 1 (4.5%) duplicate rows [Duplicates]
5천만원 미만 is highly overall correlated with 5천만이상 1억5천만원 미만 and 3 other fields [High correlation]
5천만이상 1억5천만원 미만 is highly overall correlated with 5천만원 미만 and 3 other fields [High correlation]
1억5천만이상 5억원 미만 is highly overall correlated with 5천만원 미만 and 3 other fields [High correlation]
25억원 이상 is highly overall correlated with 5천만원 미만 and 3 other fields [High correlation]
무응답 is highly overall correlated with 5천만원 미만 and 3 other fields [High correlation]
구분 has 16 (72.7%) missing values [Missing]
5천만원 미만 has 16 (72.7%) missing values [Missing]
5천만이상 1억5천만원 미만 has 16 (72.7%) missing values [Missing]
1억5천만이상 5억원 미만 has 16 (72.7%) missing values [Missing]
25억원 이상 has 16 (72.7%) missing values [Missing]
무응답 has 16 (72.7%) missing values [Missing]
5천만이상 1억5천만원 미만 has 1 (4.5%) zeros [Zeros]
1억5천만이상 5억원 미만 has 1 (4.5%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-12 08:17:43.484511
Analysis finished: 2023-12-12 08:17:46.191944
Duration: 2.71 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

구분
Text

MISSING 

Distinct: 6
Distinct (%): 100.0%
Missing: 16
Missing (%): 72.7%
Memory size: 308.0 B

Length

Max length: 7
Median length: 6
Mean length: 5.5
Min length: 3

Characters and Unicode

Total characters: 33
Distinct characters: 16
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
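
As an illustration of those character properties, the following sketch recomputes the character totals and the general-category counts for the six 구분 labels with Python's standard `unicodedata` module. The label list is copied from the Sample table; the approach is an assumption about how such counts can be derived, not a description of the profiler's internals.

```python
import unicodedata
from collections import Counter

# The six non-missing values of the 구분 column, as shown in the Sample table.
labels = ["5인 미만", "5인~9인", "10인~14인", "15인~19인", "20인 이상", "무응답"]

# Normalise to NFC so each Hangul syllable is a single code point.
text = unicodedata.normalize("NFC", "".join(labels))

total_chars = len(text)
distinct_chars = len(set(text))

# General category per code point: Lo = Other Letter, Nd = Decimal Number,
# Sm = Math Symbol, Zs = Space Separator.
category_counts = Counter(unicodedata.category(ch) for ch in text)

print(total_chars, distinct_chars, dict(category_counts))
```

The counts line up with the tables in this section: 15 Other Letter, 13 Decimal Number, 3 Math Symbol and 2 Space Separator characters.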

Unique

Unique: 6
Unique (%): 100.0%

Sample

1st row: 5인 미만
2nd row: 5인~9인
3rd row: 10인~14인
4th row: 15인~19인
5th row: 20인 이상

Value | Count | Frequency (%)
5인 | 1 | 12.5%
미만 | 1 | 12.5%
5인~9인 | 1 | 12.5%
10인~14인 | 1 | 12.5%
15인~19인 | 1 | 12.5%
20인 | 1 | 12.5%
이상 | 1 | 12.5%
무응답 | 1 | 12.5%

Most occurring characters

Value | Count | Frequency (%)
인 | 8 | 24.2%
1 | 4 | 12.1%
5 | 3 | 9.1%
~ | 3 | 9.1%
(space) | 2 | 6.1%
9 | 2 | 6.1%
0 | 2 | 6.1%
미 | 1 | 3.0%
만 | 1 | 3.0%
4 | 1 | 3.0%
Other values (6) | 6 | 18.2%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 15 | 45.5%
Decimal Number | 13 | 39.4%
Math Symbol | 3 | 9.1%
Space Separator | 2 | 6.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
인 | 8 | 53.3%
미 | 1 | 6.7%
만 | 1 | 6.7%
이 | 1 | 6.7%
상 | 1 | 6.7%
무 | 1 | 6.7%
응 | 1 | 6.7%
답 | 1 | 6.7%

Decimal Number
Value | Count | Frequency (%)
1 | 4 | 30.8%
5 | 3 | 23.1%
9 | 2 | 15.4%
0 | 2 | 15.4%
4 | 1 | 7.7%
2 | 1 | 7.7%

Math Symbol
Value | Count | Frequency (%)
~ | 3 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 2 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 18 | 54.5%
Hangul | 15 | 45.5%

Most frequent character per script

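Hangul
Value | Count | Frequency (%)
인 | 8 | 53.3%
미 | 1 | 6.7%
만 | 1 | 6.7%
이 | 1 | 6.7%
상 | 1 | 6.7%
무 | 1 | 6.7%
응 | 1 | 6.7%
답 | 1 | 6.7%

Common
Value | Count | Frequency (%)
1 | 4 | 22.2%
5 | 3 | 16.7%
~ | 3 | 16.7%
(space) | 2 | 11.1%
9 | 2 | 11.1%
0 | 2 | 11.1%
4 | 1 | 5.6%
2 | 1 | 5.6%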

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 18 | 54.5%
Hangul | 15 | 45.5%

Most frequent character per block

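Hangul
Value | Count | Frequency (%)
인 | 8 | 53.3%
미 | 1 | 6.7%
만 | 1 | 6.7%
이 | 1 | 6.7%
상 | 1 | 6.7%
무 | 1 | 6.7%
응 | 1 | 6.7%
답 | 1 | 6.7%

ASCII
Value | Count | Frequency (%)
1 | 4 | 22.2%
5 | 3 | 16.7%
~ | 3 | 16.7%
(space) | 2 | 11.1%
9 | 2 | 11.1%
0 | 2 | 11.1%
4 | 1 | 5.6%
2 | 1 | 5.6%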

5천만원 미만
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 6
Distinct (%): 100.0%
Missing: 16
Missing (%): 72.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 58.666667
Minimum: 2
Maximum: 232
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 330.0 B

Quantile statistics

Minimum: 2
5-th percentile: 3.25
Q1: 9.5
median: 18.5
Q3: 60.5
95-th percentile: 192.5
Maximum: 232
Range: 230
Interquartile range (IQR): 51
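
These quantiles are consistent with linear interpolation between order statistics, which is the default convention in NumPy and pandas. The sketch below applies that rule to the six observed values of 5천만원 미만 (taken from the Sample table). The helper `quantile` is written for illustration and is not a function of the profiling library.

```python
import math

def quantile(values, q):
    """Quantile with linear interpolation between the two nearest order statistics."""
    data = sorted(values)
    pos = q * (len(data) - 1)          # fractional index into the sorted data
    lo, hi = math.floor(pos), math.ceil(pos)
    return data[lo] + (pos - lo) * (data[hi] - data[lo])

# Non-missing values of 5천만원 미만, copied from the Sample table.
values = [232, 74, 20, 7, 17, 2]

q05, q1, med, q3, q95 = (quantile(values, q) for q in (0.05, 0.25, 0.5, 0.75, 0.95))
iqr = q3 - q1

print(q05, q1, med, q3, q95, iqr)
```

For example, Q1 sits at fractional index 0.25 x 5 = 1.25 of the sorted values [2, 7, 17, 20, 74, 232], that is 7 + 0.25 x (17 - 7) = 9.5.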

Descriptive statistics

Standard deviation: 88.7596
Coefficient of variation (CV): 1.5129477
Kurtosis: 4.1868904
Mean: 58.666667
Median Absolute Deviation (MAD): 14
Skewness: 2.0426716
Sum: 352
Variance: 7878.2667
Monotonicity: Not monotonic
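
Several of these figures can be reproduced from the six observed values with Python's standard `statistics` module. This is a minimal sketch under the assumption that the variance and standard deviation are the sample versions (dividing by n - 1), which is what the reported numbers match. Skewness and kurtosis are left out.

```python
import statistics

# Non-missing values of 5천만원 미만, copied from the Sample table.
values = [232, 74, 20, 7, 17, 2]

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
std = statistics.stdev(values)           # sample standard deviation
cv = std / mean                          # coefficient of variation

# Median absolute deviation: median of |x - median(x)|.
median = statistics.median(values)
mad = statistics.median([abs(v - median) for v in values])

print(total, mean, variance, std, cv, mad)
```

A CV above 1 means the standard deviation exceeds the mean, which here is driven by the single large value 232.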
Histogram with fixed size bins (bins=6)
Value | Count | Frequency (%)
232 | 1 | 4.5%
74 | 1 | 4.5%
20 | 1 | 4.5%
7 | 1 | 4.5%
17 | 1 | 4.5%
2 | 1 | 4.5%
(Missing) | 16 | 72.7%

Value | Count | Frequency (%)
2 | 1 | 4.5%
7 | 1 | 4.5%
17 | 1 | 4.5%
20 | 1 | 4.5%
74 | 1 | 4.5%
232 | 1 | 4.5%

Value | Count | Frequency (%)
232 | 1 | 4.5%
74 | 1 | 4.5%
20 | 1 | 4.5%
17 | 1 | 4.5%
7 | 1 | 4.5%
2 | 1 | 4.5%

5천만이상 1억5천만원 미만
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 6
Distinct (%): 100.0%
Missing: 16
Missing (%): 72.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 19.666667
Minimum: 0
Maximum: 62
Zeros: 1
Zeros (%): 4.5%
Negative: 0
Negative (%): 0.0%
Memory size: 330.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0.75
Q1: 4
median: 9.5
Q3: 28.5
95-th percentile: 55
Maximum: 62
Range: 62
Interquartile range (IQR): 24.5

Descriptive statistics

Standard deviation: 24.005555
Coefficient of variation (CV): 1.2206214
Kurtosis: 1.1483766
Mean: 19.666667
Median Absolute Deviation (MAD): 8
Skewness: 1.3899104
Sum: 118
Variance: 576.26667
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value | Count | Frequency (%)
62 | 1 | 4.5%
34 | 1 | 4.5%
12 | 1 | 4.5%
3 | 1 | 4.5%
7 | 1 | 4.5%
0 | 1 | 4.5%
(Missing) | 16 | 72.7%

Value | Count | Frequency (%)
0 | 1 | 4.5%
3 | 1 | 4.5%
7 | 1 | 4.5%
12 | 1 | 4.5%
34 | 1 | 4.5%
62 | 1 | 4.5%

Value | Count | Frequency (%)
62 | 1 | 4.5%
34 | 1 | 4.5%
12 | 1 | 4.5%
7 | 1 | 4.5%
3 | 1 | 4.5%
0 | 1 | 4.5%

1억5천만이상 5억원 미만
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct: 6
Distinct (%): 100.0%
Missing: 16
Missing (%): 72.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 17.166667
Minimum: 0
Maximum: 43
Zeros: 1
Zeros (%): 4.5%
Negative: 0
Negative (%): 0.0%
Memory size: 330.0 B

Quantile statistics

Minimum: 0
5-th percentile: 1.25
Q1: 5.75
median: 12
Q3: 27.25
95-th percentile: 40
Maximum: 43
Range: 43
Interquartile range (IQR): 21.5

Descriptive statistics

Standard deviation: 16.654329
Coefficient of variation (CV): 0.97015507
Kurtosis: -0.81058358
Mean: 17.166667
Median Absolute Deviation (MAD): 9.5
Skewness: 0.7959263
Sum: 103
Variance: 277.36667
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value | Count | Frequency (%)
31 | 1 | 4.5%
43 | 1 | 4.5%
16 | 1 | 4.5%
5 | 1 | 4.5%
8 | 1 | 4.5%
0 | 1 | 4.5%
(Missing) | 16 | 72.7%

Value | Count | Frequency (%)
0 | 1 | 4.5%
5 | 1 | 4.5%
8 | 1 | 4.5%
16 | 1 | 4.5%
31 | 1 | 4.5%
43 | 1 | 4.5%

Value | Count | Frequency (%)
43 | 1 | 4.5%
31 | 1 | 4.5%
16 | 1 | 4.5%
8 | 1 | 4.5%
5 | 1 | 4.5%
0 | 1 | 4.5%

5억이상 25억원 미만
Categorical

Distinct: 6
Distinct (%): 27.3%
Missing: 0
Missing (%): 0.0%
Memory size: 308.0 B

Value | Count
<NA> | 16
14 | 2
19 | 1
17 | 1
13 | 1

Length

Max length: 4
Median length: 4
Mean length: 3.4090909
Min length: 1

Unique

Unique: 4
Unique (%): 18.2%
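
The length and uniqueness figures for this categorical column (5억이상 25억원 미만, judging by the Sample table) only add up if the 16 empty rows are counted as the four-character string `<NA>` rather than as missing values. That reading is an inference from the "Missing 0" entry, not something the report states. A short sketch under that assumption:

```python
import statistics
from collections import Counter

# Six observed values from the Sample table plus 16 rows assumed to hold the
# literal string "<NA>" (an inference, see the note above).
values = ["14", "19", "17", "14", "13", "0"] + ["<NA>"] * 16

lengths = [len(v) for v in values]
counts = Counter(values)

mean_length = statistics.mean(lengths)
median_length = statistics.median(lengths)
distinct = len(counts)                                  # different values seen
unique = sum(1 for c in counts.values() if c == 1)      # values seen exactly once

print(mean_length, median_length, min(lengths), max(lengths), distinct, unique)
```

With this reading, "Distinct" counts `<NA>` as one of six categories, while "Unique" counts only 19, 17, 13 and 0, which occur once each.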

Sample

1st row: 14
2nd row: 19
3rd row: 17
4th row: 14
5th row: 13

Common Values

Value | Count | Frequency (%)
<NA> | 16 | 72.7%
14 | 2 | 9.1%
19 | 1 | 4.5%
17 | 1 | 4.5%
13 | 1 | 4.5%
0 | 1 | 4.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 16 | 72.7%
14 | 2 | 9.1%
19 | 1 | 4.5%
17 | 1 | 4.5%
13 | 1 | 4.5%
0 | 1 | 4.5%

25억원 이상
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 6
Distinct (%): 100.0%
Missing: 16
Missing (%): 72.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.166667
Minimum: 1
Maximum: 23
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 330.0 B

Quantile statistics

Minimum: 1
5-th percentile: 1.25
Q1: 3.25
median: 8
Q3: 16.5
95-th percentile: 22
Maximum: 23
Range: 22
Interquartile range (IQR): 13.25

Descriptive statistics

Standard deviation: 8.998148
Coefficient of variation (CV): 0.88506373
Kurtosis: -1.463761
Mean: 10.166667
Median Absolute Deviation (MAD): 6.5
Skewness: 0.59895222
Sum: 61
Variance: 80.966667
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value | Count | Frequency (%)
23 | 1 | 4.5%
9 | 1 | 4.5%
7 | 1 | 4.5%
1 | 1 | 4.5%
19 | 1 | 4.5%
2 | 1 | 4.5%
(Missing) | 16 | 72.7%

Value | Count | Frequency (%)
1 | 1 | 4.5%
2 | 1 | 4.5%
7 | 1 | 4.5%
9 | 1 | 4.5%
19 | 1 | 4.5%
23 | 1 | 4.5%

Value | Count | Frequency (%)
23 | 1 | 4.5%
19 | 1 | 4.5%
9 | 1 | 4.5%
7 | 1 | 4.5%
2 | 1 | 4.5%
1 | 1 | 4.5%

무응답
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 6
Distinct (%): 100.0%
Missing: 16
Missing (%): 72.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 176.83333
Minimum: 29
Maximum: 513
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 330.0 B

Quantile statistics

Minimum: 29
5-th percentile: 30
Q1: 44.5
median: 100
Q3: 244.75
95-th percentile: 456.25
Maximum: 513
Range: 484
Interquartile range (IQR): 200.25

Descriptive statistics

Standard deviation: 189.8193
Coefficient of variation (CV): 1.0734362
Kurtosis: 1.2553201
Mean: 176.83333
Median Absolute Deviation (MAD): 69
Skewness: 1.402463
Sum: 1061
Variance: 36031.367
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value | Count | Frequency (%)
513 | 1 | 4.5%
286 | 1 | 4.5%
121 | 1 | 4.5%
33 | 1 | 4.5%
79 | 1 | 4.5%
29 | 1 | 4.5%
(Missing) | 16 | 72.7%

Value | Count | Frequency (%)
29 | 1 | 4.5%
33 | 1 | 4.5%
79 | 1 | 4.5%
121 | 1 | 4.5%
286 | 1 | 4.5%
513 | 1 | 4.5%

Value | Count | Frequency (%)
513 | 1 | 4.5%
286 | 1 | 4.5%
121 | 1 | 4.5%
79 | 1 | 4.5%
33 | 1 | 4.5%
29 | 1 | 4.5%

Interactions

[Figure: 25 pairwise scatter plots of the five numeric variables]

Correlations

Variable | 구분 | 5천만원 미만 | 5천만이상 1억5천만원 미만 | 1억5천만이상 5억원 미만 | 5억이상 25억원 미만 | 25억원 이상 | 무응답
구분 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
5천만원 미만 | 1.000 | 1.000 | 1.000 | 1.000 | 0.314 | 1.000 | 1.000
5천만이상 1억5천만원 미만 | 1.000 | 1.000 | 1.000 | 0.759 | 0.573 | 1.000 | 1.000
1억5천만이상 5억원 미만 | 1.000 | 1.000 | 0.759 | 1.000 | 0.942 | 0.942 | 0.759
5억이상 25억원 미만 | 1.000 | 0.314 | 0.573 | 0.942 | 1.000 | 0.942 | 0.573
25억원 이상 | 1.000 | 1.000 | 1.000 | 0.942 | 0.942 | 1.000 | 1.000
무응답 | 1.000 | 1.000 | 1.000 | 0.759 | 0.573 | 1.000 | 1.000

Variable | 5천만원 미만 | 5천만이상 1억5천만원 미만 | 1억5천만이상 5억원 미만 | 25억원 이상 | 무응답 | 5억이상 25억원 미만
5천만원 미만 | 1.000 | 1.000 | 0.943 | 0.771 | 1.000 | 0.000
5천만이상 1억5천만원 미만 | 1.000 | 1.000 | 0.943 | 0.771 | 1.000 | 0.000
1억5천만이상 5억원 미만 | 0.943 | 0.943 | 1.000 | 0.657 | 0.943 | 0.250
25억원 이상 | 0.771 | 0.771 | 0.657 | 1.000 | 0.771 | 0.250
무응답 | 1.000 | 1.000 | 0.943 | 0.771 | 1.000 | 0.000
5억이상 25억원 미만 | 0.000 | 0.000 | 0.250 | 0.250 | 0.000 | 1.000
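
The numeric-to-numeric entries of the second matrix (1.000, 0.943, 0.771, 0.657) are consistent with Spearman's rank correlation on the six observed rows. The report text does not name the coefficient, so this is an observation from the values, not a confirmed description of the method. The sketch below uses the tie-free formula 1 - 6 Σd² / (n(n² - 1)); the English variable names are illustrative.

```python
def ranks(values):
    """Rank from 1 (smallest) upward. Assumes no ties, which holds for these columns."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman(x, y):
    """Spearman rank correlation for tie-free data."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Observed rows 0-5, copied from the Sample table.
under_50m = [232, 74, 20, 7, 17, 2]          # 5천만원 미만
from_50m_to_150m = [62, 34, 12, 3, 7, 0]     # 5천만이상 1억5천만원 미만
from_150m_to_500m = [31, 43, 16, 5, 8, 0]    # 1억5천만이상 5억원 미만
over_2_5b = [23, 9, 7, 1, 19, 2]             # 25억원 이상

r_a = spearman(under_50m, from_50m_to_150m)
r_b = spearman(under_50m, from_150m_to_500m)
r_c = spearman(under_50m, over_2_5b)
r_d = spearman(from_150m_to_500m, over_2_5b)

print(round(r_a, 3), round(r_b, 3), round(r_c, 3), round(r_d, 3))
```

With only six observations, rank correlations move in coarse steps (each unit of Σd² changes the coefficient by about 0.029), so the "high correlation" alerts deserve cautious reading.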

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
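
To make the idea of nullity correlation concrete, the sketch below correlates presence indicators (1 = value present, 0 = missing) for two of the columns. It assumes rows 0 to 5 are populated and rows 6 to 21 are empty, which follows from the Sample table and the 16 missing values per column. The `pearson` helper is illustrative, not the profiler's implementation.

```python
def pearson(x, y):
    """Pearson correlation; returns None when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return None  # a column that is never (or always) missing has no nullity variance
    return cov / (vx * vy) ** 0.5

# Presence indicators for 22 rows: 6 populated rows followed by 16 empty ones.
present_gubun = [1] * 6 + [0] * 16        # 구분
present_under_50m = [1] * 6 + [0] * 16    # 5천만원 미만
present_categorical = [1] * 22            # 5억이상 25억원 미만 reports no missing values

r_same = pearson(present_gubun, present_under_50m)
r_const = pearson(present_gubun, present_categorical)

print(r_same, r_const)
```

Columns that are missing in exactly the same rows have a nullity correlation of 1, while a column with no missing values has a constant indicator, so its correlation is undefined.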

Sample

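Row | 구분 | 5천만원 미만 | 5천만이상 1억5천만원 미만 | 1억5천만이상 5억원 미만 | 5억이상 25억원 미만 | 25억원 이상 | 무응답
0 | 5인 미만 | 232 | 62 | 31 | 14 | 23 | 513
1 | 5인~9인 | 74 | 34 | 43 | 19 | 9 | 286
2 | 10인~14인 | 20 | 12 | 16 | 17 | 7 | 121
3 | 15인~19인 | 7 | 3 | 5 | 14 | 1 | 33
4 | 20인 이상 | 17 | 7 | 8 | 13 | 19 | 79
5 | 무응답 | 2 | 0 | 0 | 0 | 2 | 29
6 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
7 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
8 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
9 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>

Row | 구분 | 5천만원 미만 | 5천만이상 1억5천만원 미만 | 1억5천만이상 5억원 미만 | 5억이상 25억원 미만 | 25억원 이상 | 무응답
12 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
13 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
14 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
15 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
16 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
17 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
18 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
19 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
20 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>
21 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA>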

Duplicate rows

Most frequently occurring

Row | 구분 | 5천만원 미만 | 5천만이상 1억5천만원 미만 | 1억5천만이상 5억원 미만 | 5억이상 25억원 미만 | 25억원 이상 | 무응답 | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | 16