Overview

Dataset statistics

Number of variables: 7
Number of observations: 71
Missing cells: 120
Missing cells (%): 24.1%
Duplicate rows: 1
Duplicate rows (%): 1.4%
Total size in memory: 4.3 KiB
Average record size in memory: 61.8 B
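
The percentages in this block follow from the raw counts. The sketch below recomputes them; it assumes the missing-cell share is taken over all cells (variables × observations) and the duplicate share over all observations, which matches the reported figures. The variable names are illustrative and not part of the report.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 7
n_observations = 71
missing_cells = 120
duplicate_rows = 1
total_bytes = 4.3 * 1024  # "4.3 KiB" is already rounded in the report

total_cells = n_variables * n_observations          # 497
missing_pct = 100 * missing_cells / total_cells     # share of all cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_bytes / n_observations     # approximate only

print(f"Missing cells (%): {missing_pct:.1f}%")     # 24.1%
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")  # 1.4%
print(f"Average record size: about {avg_record_bytes:.0f} B")
```

The recomputed record size differs slightly from the reported 61.8 B because the total size is only given to one decimal place.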

Variable types

DateTime: 1
Categorical: 1
Numeric: 4
Text: 1

Dataset

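Description: Number of persons by veterans-affairs beneficiary category in Gyeongsangbuk-do (경상북도).
1. For the persons of national merit covered, see Article 4 of the Act on the Honorable Treatment of and Support for Persons of Distinguished Service to the State (「국가유공자 등 예우 및 지원에 관한 법률」).
2. War veterans are counted as persons registered under Article 2 of the Act on Honorable Treatment of War Veterans and Establishment of Related Associations (「참전유공자예우 및 단체설립에 관한 법률」).
Notes:
* Excluded: persons removed from the register (loss of nationality), persons below the grade criteria, simple decoration recipients, and persons with only a record of sacrifice (희생자력).
* Totals are sums over registration categories, not counts of distinct individuals.
* Agent Orange sequelae cases (고엽제후유증) are included under "전몰·전상·순직·공상군경" (military and police personnel killed or wounded in action, or killed or injured on duty) as defined in the act on persons of national merit, and are not counted twice.
Author: Ministry of Patriots and Veterans Affairs (국가보훈부)
URL: https://www.data.go.kr/data/15087450/fileData.do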

Alerts

기준년월 has constant value "2023-12-31" (Constant)
Dataset has 1 (1.4%) duplicate row (Duplicates)
순서 is highly overall correlated with 유족 and 1 other field (High correlation)
합계 is highly overall correlated with 본인 and 2 other fields (High correlation)
본인 is highly overall correlated with 합계 and 1 other field (High correlation)
유족 is highly overall correlated with 순서 and 2 other fields (High correlation)
지역명 is highly overall correlated with 순서 and 3 other fields (High correlation)
기준년월 has 20 (28.2%) missing values (Missing)
순서 has 20 (28.2%) missing values (Missing)
대상구분 has 20 (28.2%) missing values (Missing)
합계 has 20 (28.2%) missing values (Missing)
본인 has 20 (28.2%) missing values (Missing)
유족 has 20 (28.2%) missing values (Missing)
합계 has 3 (4.2%) zeros (Zeros)
본인 has 22 (31.0%) zeros (Zeros)
유족 has 10 (14.1%) zeros (Zeros)

Reproduction

Analysis started: 2024-03-14 09:10:10.519338
Analysis finished: 2024-03-14 09:10:15.867923
Duration: 5.35 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

기준년월 (reference year-month)
Date

CONSTANT  MISSING

Distinct: 1
Distinct (%): 2.0%
Missing: 20
Missing (%): 28.2%
Memory size: 696.0 B
Minimum: 2023-12-31 00:00:00
Maximum: 2023-12-31 00:00:00
Histogram with fixed size bins (bins=1)

지역명 (region name)
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Memory size: 696.0 B

Value     Count
경상북도  51
<NA>      20

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 경상북도
2nd row: 경상북도
3rd row: 경상북도
4th row: 경상북도
5th row: 경상북도

Common Values

Value     Count  Frequency (%)
경상북도  51     71.8%
<NA>      20     28.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value     Count  Frequency (%)
경상북도  51     71.8%
na        20     28.2%

순서 (row order)
Real number (ℝ)

HIGH CORRELATION  MISSING

Distinct: 51
Distinct (%): 100.0%
Missing: 20
Missing (%): 28.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 26
Minimum: 1
Maximum: 51
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 767.0 B

Quantile statistics

Minimum: 1
5-th percentile: 3.5
Q1: 13.5
median: 26
Q3: 38.5
95-th percentile: 48.5
Maximum: 51
Range: 50
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 14.866069
Coefficient of variation (CV): 0.57177187
Kurtosis: -1.2
Mean: 26
Median Absolute Deviation (MAD): 13
Skewness: 0
Sum: 1326
Variance: 221
Monotonicity: Strictly increasing
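
These figures are what one would expect if 순서 is simply the integers 1 through 51, each occurring once. That is an assumption, though the minimum, maximum, distinct count, and sum are all consistent with it. A minimal check using only the standard library, with the sample (n - 1) variance assumed:

```python
import statistics

# Assumption: the 51 non-missing 순서 values are exactly 1, 2, ..., 51.
values = list(range(1, 52))

mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, n - 1 denominator
std = statistics.stdev(values)
median = statistics.median(values)
mad = statistics.median([abs(v - median) for v in values])
cv = std / mean

print(sum(values), mean, variance, mad)  # 1326 26 221 13
print(f"{std:.6f}")                      # 14.866069
print(f"{cv:.8f}")                       # 0.57177187
```

Each printed value agrees with the corresponding entry in the report, which suggests 순서 is a plain row counter rather than a measured quantity.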
Histogram with fixed size bins (bins=50)

Common values
Value              Count  Frequency (%)
2                  1      1.4%
29                 1      1.4%
30                 1      1.4%
31                 1      1.4%
32                 1      1.4%
33                 1      1.4%
34                 1      1.4%
35                 1      1.4%
36                 1      1.4%
37                 1      1.4%
Other values (41)  41     57.7%
(Missing)          20     28.2%

Minimum 10 values
Value  Count  Frequency (%)
1      1      1.4%
2      1      1.4%
3      1      1.4%
4      1      1.4%
5      1      1.4%
6      1      1.4%
7      1      1.4%
8      1      1.4%
9      1      1.4%
10     1      1.4%

Maximum 10 values
Value  Count  Frequency (%)
51     1      1.4%
50     1      1.4%
49     1      1.4%
48     1      1.4%
47     1      1.4%
46     1      1.4%
45     1      1.4%
44     1      1.4%
43     1      1.4%
42     1      1.4%

대상구분 (beneficiary category)
Text

MISSING

Distinct: 48
Distinct (%): 94.1%
Missing: 20
Missing (%): 28.2%
Memory size: 696.0 B
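
The two percentages in this block appear to use different denominators: the missing share is taken over all 71 rows, while the distinct share only reproduces if it is taken over the 51 non-missing rows. This is inferred from the numbers, not stated in the report. A short sketch of the arithmetic:

```python
# Counts copied from the report; the choice of denominators is an inference.
n_rows = 71                      # all observations
n_missing = 20                   # missing 대상구분 values
n_present = n_rows - n_missing   # 51 non-missing values
distinct = 48

missing_pct = 100 * n_missing / n_rows      # share of all rows
distinct_pct = 100 * distinct / n_present   # share of non-missing rows

print(f"Missing (%): {missing_pct:.1f}%")    # 28.2%
print(f"Distinct (%): {distinct_pct:.1f}%")  # 94.1%
```

The same pattern holds for the other variables with missing values, for example 43 / 51 = 84.3% distinct for 합계, whereas the zero counts are reported against all 71 rows (3 / 71 = 4.2%).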

Length

Max length: 15
Median length: 11
Mean length: 6.9411765
Min length: 4

Characters and Unicode

Total characters: 354
Distinct characters: 80
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 45
Unique (%): 88.2%

Sample

1st row: [순국선열]
2nd row: 건국훈장
3rd row: 건국포장
4th row: 대통령표창
5th row: [애국지사]
Value              Count  Frequency (%)
대통령표창         2      3.6%
행불자             2      3.6%
또는               2      3.6%
건국훈장           2      3.6%
건국포장           2      3.6%
특수임무사망자     1      1.8%
6·25전쟁           1      1.8%
특수임무공로자     1      1.8%
고엽제후유의증     1      1.8%
보훈보상대상자     1      1.8%
Other values (40)  40     72.7%

Most occurring characters

Value              Count  Frequency (%)
(not preserved)    21     5.9%
(not preserved)    19     5.4%
(not preserved)    16     4.5%
·                  14     4.0%
(not preserved)    13     3.7%
(not preserved)    12     3.4%
(not preserved)    11     3.1%
]                  10     2.8%
[                  10     2.8%
1                  9      2.5%
Other values (70)  219    61.9%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       281    79.4%
Decimal Number     34     9.6%
Other Punctuation  15     4.2%
Close Punctuation  10     2.8%
Open Punctuation   10     2.8%
Space Separator    4      1.1%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
(not preserved)    21     7.5%
(not preserved)    19     6.8%
(not preserved)    16     5.7%
(not preserved)    13     4.6%
(not preserved)    12     4.3%
(not preserved)    11     3.9%
(not preserved)    9      3.2%
(not preserved)    8      2.8%
(not preserved)    7      2.5%
(not preserved)    7      2.5%
Other values (58)  158    56.2%

Decimal Number
Value  Count  Frequency (%)
1      9      26.5%
5      6      17.6%
8      5      14.7%
9      4      11.8%
4      4      11.8%
6      3      8.8%
2      3      8.8%

Other Punctuation
Value  Count  Frequency (%)
·      14     93.3%
.      1      6.7%

Close Punctuation
Value  Count  Frequency (%)
]      10     100.0%

Open Punctuation
Value  Count  Frequency (%)
[      10     100.0%

Space Separator
Value    Count  Frequency (%)
(space)  4      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  281    79.4%
Common  73     20.6%

Most frequent character per script

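Hangul
Value              Count  Frequency (%)
(not preserved)    21     7.5%
(not preserved)    19     6.8%
(not preserved)    16     5.7%
(not preserved)    13     4.6%
(not preserved)    12     4.3%
(not preserved)    11     3.9%
(not preserved)    9      3.2%
(not preserved)    8      2.8%
(not preserved)    7      2.5%
(not preserved)    7      2.5%
Other values (58)  158    56.2%

Common
Value             Count  Frequency (%)
·                 14     19.2%
]                 10     13.7%
[                 10     13.7%
1                 9      12.3%
5                 6      8.2%
8                 5      6.8%
(space)           4      5.5%
9                 4      5.5%
4                 4      5.5%
6                 3      4.1%
Other values (2)  4      5.5%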

Most occurring blocks

Value   Count  Frequency (%)
Hangul  281    79.4%
ASCII   59     16.7%
None    14     4.0%

Most frequent character per block

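Hangul
Value              Count  Frequency (%)
(not preserved)    21     7.5%
(not preserved)    19     6.8%
(not preserved)    16     5.7%
(not preserved)    13     4.6%
(not preserved)    12     4.3%
(not preserved)    11     3.9%
(not preserved)    9      3.2%
(not preserved)    8      2.8%
(not preserved)    7      2.5%
(not preserved)    7      2.5%
Other values (58)  158    56.2%

None
Value  Count  Frequency (%)
·      14     100.0%

ASCII
Value    Count  Frequency (%)
]        10     16.9%
[        10     16.9%
1        9      15.3%
5        6      10.2%
8        5      8.5%
(space)  4      6.8%
9        4      6.8%
4        4      6.8%
6        3      5.1%
2        3      5.1%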

합계 (total)
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS

Distinct: 43
Distinct (%): 84.3%
Missing: 20
Missing (%): 28.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 1786.1961
Minimum: 0
Maximum: 19853
Zeros: 3
Zeros (%): 4.2%
Negative: 0
Negative (%): 0.0%
Memory size: 767.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0.5
Q1: 12
median: 88
Q3: 976
95-th percentile: 11067
Maximum: 19853
Range: 19853
Interquartile range (IQR): 964

Descriptive statistics

Standard deviation: 3929.498
Coefficient of variation (CV): 2.1999253
Kurtosis: 9.8984428
Mean: 1786.1961
Median Absolute Deviation (MAD): 87
Skewness: 3.0392266
Sum: 91096
Variance: 15440954
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=43)

Common values
Value              Count  Frequency (%)
12                 4      5.6%
0                  3      4.2%
1                  2      2.8%
2                  2      2.8%
18                 2      2.8%
9869               1      1.4%
52                 1      1.4%
463                1      1.4%
88                 1      1.4%
348                1      1.4%
Other values (33)  33     46.5%
(Missing)          20     28.2%

Minimum 10 values
Value  Count  Frequency (%)
0      3      4.2%
1      2      2.8%
2      2      2.8%
6      1      1.4%
9      1      1.4%
10     1      1.4%
12     4      5.6%
15     1      1.4%
17     1      1.4%
18     2      2.8%

Maximum 10 values
Value  Count  Frequency (%)
19853  1      1.4%
12820  1      1.4%
12265  1      1.4%
9869   1      1.4%
5278   1      1.4%
5019   1      1.4%
4583   1      1.4%
4047   1      1.4%
3857   1      1.4%
2920   1      1.4%

본인 (the honoree in person)
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS

Distinct: 27
Distinct (%): 52.9%
Missing: 20
Missing (%): 28.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 1095.9412
Minimum: 0
Maximum: 12820
Zeros: 22
Zeros (%): 31.0%
Negative: 0
Negative (%): 0.0%
Memory size: 767.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 14
Q3: 276
95-th percentile: 6092.5
Maximum: 12820
Range: 12820
Interquartile range (IQR): 276

Descriptive statistics

Standard deviation: 2597.2269
Coefficient of variation (CV): 2.3698597
Kurtosis: 9.9371696
Mean: 1095.9412
Median Absolute Deviation (MAD): 14
Skewness: 3.0590312
Sum: 55893
Variance: 6745587.4
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=27)

Common values
Value              Count  Frequency (%)
0                  22     31.0%
5                  2      2.8%
238                2      2.8%
14                 2      2.8%
4110               1      1.4%
2920               1      1.4%
5278               1      1.4%
69                 1      1.4%
66                 1      1.4%
135                1      1.4%
Other values (17)  17     23.9%
(Missing)          20     28.2%

Minimum 10 values
Value  Count  Frequency (%)
0      22     31.0%
5      2      2.8%
11     1      1.4%
14     2      2.8%
18     1      1.4%
28     1      1.4%
31     1      1.4%
47     1      1.4%
66     1      1.4%
69     1      1.4%

Maximum 10 values
Value  Count  Frequency (%)
12820  1      1.4%
9869   1      1.4%
6907   1      1.4%
5278   1      1.4%
4110   1      1.4%
3857   1      1.4%
2920   1      1.4%
2892   1      1.4%
2797   1      1.4%
1976   1      1.4%

유족 (surviving family)
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS

Distinct: 35
Distinct (%): 68.6%
Missing: 20
Missing (%): 28.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 690.2549
Minimum: 0
Maximum: 12946
Zeros: 10
Zeros (%): 14.1%
Negative: 0
Negative (%): 0.0%
Memory size: 767.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1.5
median: 12
Q3: 187.5
95-th percentile: 3460.5
Maximum: 12946
Range: 12946
Interquartile range (IQR): 186

Descriptive statistics

Standard deviation: 2199.8934
Coefficient of variation (CV): 3.1870739
Kurtosis: 21.716079
Mean: 690.2549
Median Absolute Deviation (MAD): 12
Skewness: 4.5003662
Sum: 35203
Variance: 4839531
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=35)

Common values
Value              Count  Frequency (%)
0                  10     14.1%
1                  3      4.2%
7                  2      2.8%
34                 2      2.8%
12                 2      2.8%
4                  2      2.8%
2                  2      2.8%
174                1      1.4%
244                1      1.4%
8                  1      1.4%
Other values (25)  25     35.2%
(Missing)          20     28.2%

Minimum 10 values
Value  Count  Frequency (%)
0      10     14.1%
1      3      4.2%
2      2      2.8%
3      1      1.4%
4      2      2.8%
5      1      1.4%
6      1      1.4%
7      2      2.8%
8      1      1.4%
10     1      1.4%

Maximum 10 values
Value  Count  Frequency (%)
12946  1      1.4%
8155   1      1.4%
4343   1      1.4%
2578   1      1.4%
1691   1      1.4%
1250   1      1.4%
963    1      1.4%
751    1      1.4%
550    1      1.4%
479    1      1.4%

Interactions

(16 pairwise interaction plots; images not preserved.)

Correlations

          순서    대상구분  합계    본인    유족
순서      1.000   0.770     0.530   0.362   0.264
대상구분  0.770   1.000     1.000   1.000   1.000
합계      0.530   1.000     1.000   0.934   0.777
본인      0.362   1.000     0.934   1.000   0.728
유족      0.264   1.000     0.777   0.728   1.000

          순서     합계     본인    유족     지역명
순서      1.000    -0.088   0.326   -0.566   1.000
합계      -0.088   1.000    0.717   0.509    1.000
본인      0.326    0.717    1.000   0.036    1.000
유족      -0.566   0.509    0.036   1.000    1.000
지역명    1.000    1.000    1.000   1.000    1.000
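
The "High correlation" alerts can be reproduced from the second matrix by flagging every off-diagonal coefficient whose absolute value reaches a cutoff. The sketch below assumes a cutoff of 0.5; the report does not state the threshold, but that value yields exactly the partner counts listed in the alerts. The function name `correlated_partners` is illustrative.

```python
# Coefficients copied from the second correlation matrix.
columns = ["순서", "합계", "본인", "유족", "지역명"]
rows = [
    [1.000, -0.088, 0.326, -0.566, 1.000],
    [-0.088, 1.000, 0.717, 0.509, 1.000],
    [0.326, 0.717, 1.000, 0.036, 1.000],
    [-0.566, 0.509, 0.036, 1.000, 1.000],
    [1.000, 1.000, 1.000, 1.000, 1.000],
]
THRESHOLD = 0.5  # assumed cutoff, not stated in the report


def correlated_partners(names, matrix, threshold):
    """Map each column to the other columns with |coefficient| >= threshold."""
    partners = {}
    for i, name in enumerate(names):
        partners[name] = [
            other
            for j, other in enumerate(names)
            if i != j and abs(matrix[i][j]) >= threshold
        ]
    return partners


partners = correlated_partners(columns, rows, THRESHOLD)
for name, others in partners.items():
    print(name, "->", others)
```

The coefficient of 1.000 between 지역명 and every other column should be read with care: 지역명 has only two values, 경상북도 and <NA>, and the <NA> rows are the same rows where all other columns are missing.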

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows
    기준년월    지역명    순서  대상구분                    합계   본인  유족
0   2023-12-31  경상북도  1     [순국선열]                  60     0     60
1   2023-12-31  경상북도  2     건국훈장                    57     0     57
2   2023-12-31  경상북도  3     건국포장                    2      0     2
3   2023-12-31  경상북도  4     대통령표창                  1      0     1
4   2023-12-31  경상북도  5     [애국지사]                  479    0     479
5   2023-12-31  경상북도  6     건국훈장                    244    0     244
6   2023-12-31  경상북도  7     건국포장                    61     0     61
7   2023-12-31  경상북도  8     대통령표창                  174    0     174
8   2023-12-31  경상북도  9     [전몰·전상·순직·공상군경]   19853  6907  12946
9   2023-12-31  경상북도  10    전몰군경                    2578   0     2578

Last rows
    기준년월  지역명  순서  대상구분  합계  본인  유족
61  <NA>      <NA>    <NA>  <NA>      <NA>  <NA>  <NA>
62  <NA>      <NA>    <NA>  <NA>      <NA>  <NA>  <NA>
63  <NA>      <NA>    <NA>  <NA>      <NA>  <NA>  <NA>
64  <NA>      <NA>    <NA>  <NA>      <NA>  <NA>  <NA>
65  <NA>      <NA>    <NA>  <NA>      <NA>  <NA>  <NA>
66  <NA>      <NA>    <NA>  <NA>      <NA>  <NA>  <NA>
67  <NA>      <NA>    <NA>  <NA>      <NA>  <NA>  <NA>
68  <NA>      <NA>    <NA>  <NA>      <NA>  <NA>  <NA>
69  <NA>      <NA>    <NA>  <NA>      <NA>  <NA>  <NA>
70  <NA>      <NA>    <NA>  <NA>      <NA>  <NA>  <NA>
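
In the first ten rows, 합계 equals 본인 plus 유족, and each bracketed category equals the sum of the rows listed under it. The sketch below checks both on the sample rows and on the column sums reported earlier. Treating a bracketed row as the subtotal of the rows that follow it is an interpretation of the sample, not something the report states.

```python
# (순서, 대상구분, 합계, 본인, 유족) copied from the first ten sample rows.
sample = [
    (1, "[순국선열]", 60, 0, 60),
    (2, "건국훈장", 57, 0, 57),
    (3, "건국포장", 2, 0, 2),
    (4, "대통령표창", 1, 0, 1),
    (5, "[애국지사]", 479, 0, 479),
    (6, "건국훈장", 244, 0, 244),
    (7, "건국포장", 61, 0, 61),
    (8, "대통령표창", 174, 0, 174),
    (9, "[전몰·전상·순직·공상군경]", 19853, 6907, 12946),
    (10, "전몰군경", 2578, 0, 2578),
]

# Row-level check: total = honoree + surviving family.
mismatches = [row for row in sample if row[2] != row[3] + row[4]]
print("row mismatches:", mismatches)  # row mismatches: []

# Column-level check against the reported sums (합계, 본인, 유족).
print(91096 == 55893 + 35203)  # True

# Assumed grouping: a bracketed row is the subtotal of the rows after it.
first_group = sum(row[2] for row in sample[1:4])    # rows 2 to 4
second_group = sum(row[2] for row in sample[5:8])   # rows 6 to 8
print(first_group == sample[0][2], second_group == sample[4][2])  # True True
```

If the grouping assumption holds, summing 합계 over all rows counts each person in a group once in the subtotal row and once in a detail row, in addition to the overlap across categories that the dataset description already warns about.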

Duplicate rows

Most frequently occurring

   기준년월  지역명  순서  대상구분  합계  본인  유족  # duplicates
0  <NA>      <NA>    <NA>  <NA>      <NA>  <NA>  <NA>  20
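
The single duplicated pattern is the fully empty row, which occurs 20 times. Those 20 rows also appear to account for all 120 missing cells (20 rows × 6 columns, since 지역명 reports them as the literal category <NA>, not as missing). The sketch below uses stand-in data with the same shape, not the real dataset, to show how one duplicated pattern can cover 20 rows and how dropping empty rows leaves 51 records. The `category-N` labels are placeholders.

```python
from collections import Counter

N_COLUMNS = 7

# Stand-in data: 51 populated rows (placeholder values) plus 20 blank rows.
populated = [
    ("2023-12-31", "경상북도", i, f"category-{i}", 0, 0, 0)
    for i in range(1, 52)
]
blank = [(None,) * N_COLUMNS for _ in range(20)]
rows = populated + blank

# A "duplicate row" in the report is a distinct pattern seen more than once.
counts = Counter(rows)
duplicated_patterns = {row: n for row, n in counts.items() if n > 1}
print(len(duplicated_patterns), list(duplicated_patterns.values()))  # 1 [20]

# Dropping rows in which every cell is empty removes all of them.
cleaned = [row for row in rows if any(cell is not None for cell in row)]
print(len(rows), "->", len(cleaned))  # 71 -> 51
```

Re-profiling after such a cleanup would likely remove the Missing and Duplicates alerts, as well as the 지역명 correlation alert that depends on the <NA> category.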