Overview

Dataset statistics

Number of variables: 5
Number of observations: 38
Missing cells: 66
Missing cells (%): 34.7%
Duplicate rows: 1
Duplicate rows (%): 2.6%
Total size in memory: 1.7 KiB
Average record size in memory: 46.5 B
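
The missing-cell figures can be cross-checked by hand. The sketch below is only an arithmetic check, not part of the profiling tool; the per-column counts are the ones listed under Alerts, and the variable names are illustrative.

```python
# Figures copied from this report; the names below are illustrative only.
n_rows, n_cols = 38, 5
missing_per_column = {
    "FAMP_ID": 7,
    "FMLD_ADDR": 7,
    "PHT_DT": 7,
    "FILE_NM": 7,
    "IMG_URL": 38,
}

total_cells = n_rows * n_cols                     # every row/column intersection
missing_cells = sum(missing_per_column.values())  # sum of per-column missing counts
missing_pct = round(100 * missing_cells / total_cells, 1)

print(total_cells, missing_cells, missing_pct)
```

More than half of the missing cells (38 of 66) come from IMG_URL alone, which is empty in every row.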

Variable types

Numeric: 2
Text: 2
Unsupported: 1

Dataset

Description: Sample data
Author: (재)전남정보문화산업진흥원
URL: https://kadx.co.kr/opmk/frn/pmumkproductDetail/PMU_9673ca30-f762-480b-aa34-7d38726414d3/5

Alerts

Dataset has 1 (2.6%) duplicate rows [Duplicates]
FAMP_ID has 7 (18.4%) missing values [Missing]
FMLD_ADDR has 7 (18.4%) missing values [Missing]
PHT_DT has 7 (18.4%) missing values [Missing]
FILE_NM has 7 (18.4%) missing values [Missing]
IMG_URL has 38 (100.0%) missing values [Missing]
IMG_URL is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
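
The alerts point to two kinds of emptiness: seven rows with no values at all, and one column (IMG_URL) with no values in any row. Whether these should be dropped is a judgment call the report does not make. Assuming they should, a minimal standard-library sketch could look like the following; `drop_empty` is a hypothetical helper and the rows are taken from the Sample section.

```python
def drop_empty(rows):
    """Drop all-None rows, then drop columns that are None in every remaining row."""
    kept = [row for row in rows if any(v is not None for v in row.values())]
    if not kept:
        return []
    columns = [c for c in kept[0] if any(row[c] is not None for row in kept)]
    return [{c: row[c] for c in columns} for row in kept]


EMPTY = {"FAMP_ID": None, "FMLD_ADDR": None, "PHT_DT": None,
         "FILE_NM": None, "IMG_URL": None}

rows = [
    {"FAMP_ID": 6317556, "FMLD_ADDR": "전라북도 김제시 월봉동 551-7답",
     "PHT_DT": 20230117033145,
     "FILE_NM": "06317556_전라북도 김제시 월봉동 551-7답_230117033145.jpg",
     "IMG_URL": None},
    {"FAMP_ID": 6424611, "FMLD_ADDR": "전라북도 남원시 주천면 장안리 476대",
     "PHT_DT": 20230129032520,
     "FILE_NM": "06424611_전라북도 남원시 주천면 장안리 476대_230129032520.jpg",
     "IMG_URL": None},
    dict(EMPTY),  # stands in for the fully empty rows 31-37
    dict(EMPTY),
]

cleaned = drop_empty(rows)
print(len(cleaned), list(cleaned[0]))
```

Dropping the empty rows first matters: a column is only judged empty against the rows that remain.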

Reproduction

Analysis started: 2023-12-11 20:16:58.726345
Analysis finished: 2023-12-11 20:16:59.967887
Duration: 1.24 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

FAMP_ID
Type: Real number (ℝ)
Alerts: Missing

Distinct: 31
Distinct (%): 100.0%
Missing: 7
Missing (%): 18.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 6057995.2
Minimum: 5207356
Maximum: 7151780
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 474.0 B

Quantile statistics

Minimum: 5207356
5th percentile: 5207759
Q1: 5733668.5
Median: 6301983
Q3: 6407545
95th percentile: 7147181.5
Maximum: 7151780
Range: 1944424
Interquartile range (IQR): 673876.5

Descriptive statistics

Standard deviation: 636652.57
Coefficient of variation (CV): 0.10509295
Kurtosis: -0.91202414
Mean: 6057995.2
Median Absolute Deviation (MAD): 559907
Skewness: 0.21287809
Sum: 1.8779785 × 10^8
Variance: 4.0532649 × 10^11
Monotonicity: Not monotonic
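
Several of the FAMP_ID statistics are derived from others, so they can be checked against each other. The sketch below recomputes them from the rounded figures printed in this report, so small rounding differences are expected; the count of 31 non-missing values is 38 observations minus 7 missing.

```python
# Inputs copied from the FAMP_ID statistics in this report (already rounded).
minimum, maximum = 5207356, 7151780
q1, q3 = 5733668.5, 6407545
mean, std = 6057995.2, 636652.57
n_present = 38 - 7  # observations minus missing values

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance
total = mean * n_present         # Sum

print(value_range, iqr, round(cv, 8), f"{variance:.7e}", f"{total:.7e}")
```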
Histogram with fixed size bins (bins=31)
Common values

Value              Count  Frequency (%)
5227177            1      2.6%
5208004            1      2.6%
5743797            1      2.6%
5740549            1      2.6%
6326489            1      2.6%
5742076            1      2.6%
5734360            1      2.6%
6406050            1      2.6%
5217784            1      2.6%
6853867            1      2.6%
Other values (21)  21     55.3%
(Missing)          7      18.4%

Minimum 10 values

Value    Count  Frequency (%)
5207356  1      2.6%
5207514  1      2.6%
5208004  1      2.6%
5217784  1      2.6%
5219435  1      2.6%
5221420  1      2.6%
5227177  1      2.6%
5732977  1      2.6%
5734360  1      2.6%
5740549  1      2.6%

Maximum 10 values

Value    Count  Frequency (%)
7151780  1      2.6%
7149329  1      2.6%
7145034  1      2.6%
7141185  1      2.6%
6853867  1      2.6%
6425278  1      2.6%
6424611  1      2.6%
6409040  1      2.6%
6406050  1      2.6%
6369759  1      2.6%

FMLD_ADDR
Type: Text
Alerts: Missing

Distinct: 31
Distinct (%): 100.0%
Missing: 7
Missing (%): 18.4%
Memory size: 436.0 B

Length

Max length: 28
Median length: 26
Mean length: 23.354839
Min length: 18

Characters and Unicode

Total characters: 724
Distinct characters: 86
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
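
As an illustration of those character properties, the sketch below counts Unicode general categories for one address from the Sample rows, using Python's standard `unicodedata` module. This is not how the profiling tool computes its tables; it only shows what the category names refer to. `category_counts` is a hypothetical helper.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters in text by two-letter Unicode general category."""
    return Counter(unicodedata.category(ch) for ch in text)


# One FMLD_ADDR value taken from the Sample rows of this report.
sample = "전라북도 김제시 월봉동 551-7답"
counts = category_counts(sample)

# Lo = Other Letter, Zs = Space Separator, Nd = Decimal Number, Pd = Dash Punctuation
for code, n in sorted(counts.items()):
    print(code, n)
```

The four codes that appear here correspond to the four categories the report lists for FMLD_ADDR.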

Unique

Unique: 31
Unique (%): 100.0%

Sample

1st row: 전라남도 완도군 고금면 농상리 640-2 전
2nd row: 전라남도 완도군 금일읍 신구리 349-6 과
3rd row: 전라북도 김제시 월봉동 551-7답
4th row: 전라북도 남원시 주천면 장안리 476대
5th row: 전라북도 김제시 죽산면 홍산리 899-1답

Most occurring words

Value              Count  Frequency (%)
전라북도           12     7.5%
성산읍             8      5.0%
서귀포시           8      5.0%
제주특별자치도     8      5.0%
전라남도           7      4.4%
완도군             7      4.4%
김제시             6      3.8%
(not preserved)    5      3.1%
남원시             5      3.1%
서산시             4      2.5%
Other values (73)  90     56.2%

Most occurring characters

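Value              Count  Frequency (%)
(space)            129    17.8%
(not preserved)    38     5.2%
(not preserved)    29     4.0%
(not preserved)    26     3.6%
(not preserved)    24     3.3%
-                  23     3.2%
(not preserved)    21     2.9%
1                  20     2.8%
(not preserved)    19     2.6%
(not preserved)    17     2.3%
Other values (76)  378    52.2%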

Most occurring categories

Value             Count  Frequency (%)
Other Letter      451    62.3%
Space Separator   129    17.8%
Decimal Number    121    16.7%
Dash Punctuation  23     3.2%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
(not preserved)    38     8.4%
(not preserved)    29     6.4%
(not preserved)    26     5.8%
(not preserved)    24     5.3%
(not preserved)    21     4.7%
(not preserved)    19     4.2%
(not preserved)    17     3.8%
(not preserved)    15     3.3%
(not preserved)    15     3.3%
(not preserved)    14     3.1%
Other values (64)  233    51.7%

Decimal Number
Value  Count  Frequency (%)
1      20     16.5%
4      15     12.4%
2      14     11.6%
5      14     11.6%
3      14     11.6%
6      12     9.9%
9      10     8.3%
7      10     8.3%
8      7      5.8%
0      5      4.1%

Space Separator
Value    Count  Frequency (%)
(space)  129    100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      23     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  451    62.3%
Common  273    37.7%

Most frequent character per script

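Hangul
Value              Count  Frequency (%)
(not preserved)    38     8.4%
(not preserved)    29     6.4%
(not preserved)    26     5.8%
(not preserved)    24     5.3%
(not preserved)    21     4.7%
(not preserved)    19     4.2%
(not preserved)    17     3.8%
(not preserved)    15     3.3%
(not preserved)    15     3.3%
(not preserved)    14     3.1%
Other values (64)  233    51.7%

Common
Value             Count  Frequency (%)
(space)           129    47.3%
-                 23     8.4%
1                 20     7.3%
4                 15     5.5%
2                 14     5.1%
5                 14     5.1%
3                 14     5.1%
6                 12     4.4%
9                 10     3.7%
7                 10     3.7%
Other values (2)  12     4.4%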

Most occurring blocks

Value   Count  Frequency (%)
Hangul  451    62.3%
ASCII   273    37.7%

Most frequent character per block

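ASCII
Value             Count  Frequency (%)
(space)           129    47.3%
-                 23     8.4%
1                 20     7.3%
4                 15     5.5%
2                 14     5.1%
5                 14     5.1%
3                 14     5.1%
6                 12     4.4%
9                 10     3.7%
7                 10     3.7%
Other values (2)  12     4.4%

Hangul
Value              Count  Frequency (%)
(not preserved)    38     8.4%
(not preserved)    29     6.4%
(not preserved)    26     5.8%
(not preserved)    24     5.3%
(not preserved)    21     4.7%
(not preserved)    19     4.2%
(not preserved)    17     3.8%
(not preserved)    15     3.3%
(not preserved)    15     3.3%
(not preserved)    14     3.1%
Other values (64)  233    51.7%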

PHT_DT
Type: Real number (ℝ)
Alerts: Missing

Distinct: 31
Distinct (%): 100.0%
Missing: 7
Missing (%): 18.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.0230132 × 10^13
Minimum: 2.0230114 × 10^13
Maximum: 2.0230223 × 10^13
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 474.0 B

Quantile statistics

Minimum: 2.0230114 × 10^13
5th percentile: 2.0230115 × 10^13
Q1: 2.0230118 × 10^13
Median: 2.0230127 × 10^13
Q3: 2.0230129 × 10^13
95th percentile: 2.0230211 × 10^13
Maximum: 2.0230223 × 10^13
Range: 1.089908 × 10^8
Interquartile range (IQR): 10998744

Descriptive statistics

Standard deviation: 28357073
Coefficient of variation (CV): 1.4017246 × 10^-6
Kurtosis: 6.0940182
Mean: 2.0230132 × 10^13
Median Absolute Deviation (MAD): 8990527
Skewness: 2.6803027
Sum: 6.2713408 × 10^14
Variance: 8.0412359 × 10^14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)
Common values

Value              Count  Frequency (%)
20230115114142     1      2.6%
20230118041027     1      2.6%
20230128021536     1      2.6%
20230129010819     1      2.6%
20230117111343     1      2.6%
20230128035000     1      2.6%
20230128123052     1      2.6%
20230127114041     1      2.6%
20230118021611     1      2.6%
20230115125922     1      2.6%
Other values (21)  21     55.3%
(Missing)          7      18.4%

Minimum 10 values

Value           Count  Frequency (%)
20230114052553  1      2.6%
20230115114142  1      2.6%
20230115125922  1      2.6%
20230117025039  1      2.6%
20230117033145  1      2.6%
20230117111343  1      2.6%
20230117113446  1      2.6%
20230118021611  1      2.6%
20230118024241  1      2.6%
20230118030911  1      2.6%

Maximum 10 values

Value           Count  Frequency (%)
20230223043352  1      2.6%
20230212050418  1      2.6%
20230210011401  1      2.6%
20230131120228  1      2.6%
20230130055226  1      2.6%
20230130034702  1      2.6%
20230129125046  1      2.6%
20230129032520  1      2.6%
20230129010819  1      2.6%
20230128123052  1      2.6%
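
PHT_DT is profiled as a real number, but its 14-digit values (for example 20230115114142) look like timestamps written as YYYYMMDDHHMMSS. That reading is an assumption; the report itself does not state the encoding. Under that assumption, statistics such as mean and variance have little meaning for this column, and a conversion like the following sketch may be more useful. `parse_pht_dt` is a hypothetical helper.

```python
from datetime import datetime


def parse_pht_dt(value):
    """Parse a 14-digit YYYYMMDDHHMMSS value (int, float, or str) into a datetime.

    Assumes the digits encode year, month, day, hour, minute, second in that order.
    """
    return datetime.strptime(str(int(value)), "%Y%m%d%H%M%S")


# Smallest and largest PHT_DT values listed in this report.
first = parse_pht_dt(20230114052553)
last = parse_pht_dt(20230223043352)
print(first.isoformat(), last.isoformat(), (last - first).days)
```

Read this way, the column spans roughly 40 days, from mid-January to late February 2023.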

FILE_NM
Type: Text
Alerts: Missing

Distinct: 31
Distinct (%): 100.0%
Missing: 7
Missing (%): 18.4%
Memory size: 436.0 B

Length

Max length: 54
Median length: 52
Mean length: 49.354839
Min length: 44

Characters and Unicode

Total characters: 1530
Distinct characters: 91
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 31
Unique (%): 100.0%

Sample

1st row: 05207356_전라남도 완도군 고금면 농상리 640-2 전_230118034031.jpg
2nd row: 05227177_전라남도 완도군 금일읍 신구리 349-6 과_230115114142.jpg
3rd row: 06317556_전라북도 김제시 월봉동 551-7답_230117033145.jpg
4th row: 06424611_전라북도 남원시 주천면 장안리 476대_230129032520.jpg
5th row: 06303728_전라북도 김제시 죽산면 홍산리 899-1답_230130055226.jpg
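
In the sample rows, each FILE_NM value appears to be built from the other three columns: a zero-padded 8-digit FAMP_ID, the FMLD_ADDR text, and the PHT_DT value without its leading "20", joined by underscores and followed by ".jpg". This pattern is inferred from the rows shown here and may not hold for every record. A minimal sketch for splitting a name back into its parts, with a hypothetical helper `split_file_nm`:

```python
import re

# Pattern inferred from the sample rows; treat it as an assumption.
FILE_NM_PATTERN = re.compile(
    r"^(?P<famp_id>\d{8})_(?P<addr>.+)_(?P<stamp>\d{12})\.jpg$"
)


def split_file_nm(name):
    """Split a FILE_NM value into (FAMP_ID, FMLD_ADDR, PHT_DT)."""
    match = FILE_NM_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unexpected FILE_NM format: {name!r}")
    famp_id = int(match["famp_id"])       # drops the leading zero
    pht_dt = int("20" + match["stamp"])   # assumes years 2000-2099
    return famp_id, match["addr"], pht_dt


parts = split_file_nm("05207356_전라남도 완도군 고금면 농상리 640-2 전_230118034031.jpg")
print(parts)
```

If the pattern holds for all rows, FILE_NM carries no information beyond the other three columns, which would be consistent with the high values in the Correlations section.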

Most occurring words

Value               Count  Frequency (%)
서귀포시            8      5.0%
성산읍              8      5.0%
완도군              7      4.4%
김제시              6      3.8%
남원시              5      3.1%
온평리              4      2.5%
서산시              4      2.5%
고북면              3      1.9%
죽산면              2      1.2%
고금면              2      1.2%
Other values (106)  111    69.4%

Most occurring characters

Value              Count  Frequency (%)
0                  134    8.8%
(space)            129    8.4%
1                  119    7.8%
2                  109    7.1%
3                  90     5.9%
5                  71     4.6%
_                  62     4.1%
4                  62     4.1%
7                  49     3.2%
6                  39     2.5%
Other values (81)  666    43.5%

Most occurring categories

Value                  Count  Frequency (%)
Decimal Number         741    48.4%
Other Letter           451    29.5%
Space Separator        129    8.4%
Lowercase Letter       93     6.1%
Connector Punctuation  62     4.1%
Other Punctuation      31     2.0%
Dash Punctuation       23     1.5%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
(not preserved)    38     8.4%
(not preserved)    29     6.4%
(not preserved)    26     5.8%
(not preserved)    24     5.3%
(not preserved)    21     4.7%
(not preserved)    19     4.2%
(not preserved)    17     3.8%
(not preserved)    15     3.3%
(not preserved)    15     3.3%
(not preserved)    14     3.1%
Other values (64)  233    51.7%

Decimal Number
Value  Count  Frequency (%)
0      134    18.1%
1      119    16.1%
2      109    14.7%
3      90     12.1%
5      71     9.6%
4      62     8.4%
7      49     6.6%
6      39     5.3%
8      35     4.7%
9      33     4.5%

Lowercase Letter
Value  Count  Frequency (%)
j      31     33.3%
p      31     33.3%
g      31     33.3%

Space Separator
Value    Count  Frequency (%)
(space)  129    100.0%

Connector Punctuation
Value  Count  Frequency (%)
_      62     100.0%

Other Punctuation
Value  Count  Frequency (%)
.      31     100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      23     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  986    64.4%
Hangul  451    29.5%
Latin   93     6.1%

Most frequent character per script

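Hangul
Value              Count  Frequency (%)
(not preserved)    38     8.4%
(not preserved)    29     6.4%
(not preserved)    26     5.8%
(not preserved)    24     5.3%
(not preserved)    21     4.7%
(not preserved)    19     4.2%
(not preserved)    17     3.8%
(not preserved)    15     3.3%
(not preserved)    15     3.3%
(not preserved)    14     3.1%
Other values (64)  233    51.7%

Common
Value             Count  Frequency (%)
0                 134    13.6%
(space)           129    13.1%
1                 119    12.1%
2                 109    11.1%
3                 90     9.1%
5                 71     7.2%
_                 62     6.3%
4                 62     6.3%
7                 49     5.0%
6                 39     4.0%
Other values (4)  122    12.4%

Latin
Value  Count  Frequency (%)
j      31     33.3%
p      31     33.3%
g      31     33.3%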
33.3%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   1079   70.5%
Hangul  451    29.5%

Most frequent character per block

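ASCII
Value             Count  Frequency (%)
0                 134    12.4%
(space)           129    12.0%
1                 119    11.0%
2                 109    10.1%
3                 90     8.3%
5                 71     6.6%
_                 62     5.7%
4                 62     5.7%
7                 49     4.5%
6                 39     3.6%
Other values (7)  215    19.9%

Hangul
Value              Count  Frequency (%)
(not preserved)    38     8.4%
(not preserved)    29     6.4%
(not preserved)    26     5.8%
(not preserved)    24     5.3%
(not preserved)    21     4.7%
(not preserved)    19     4.2%
(not preserved)    17     3.8%
(not preserved)    15     3.3%
(not preserved)    15     3.3%
(not preserved)    14     3.1%
Other values (64)  233    51.7%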

IMG_URL
Type: Unsupported
Alerts: Missing, Rejected, Unsupported

Missing: 38
Missing (%): 100.0%
Memory size: 474.0 B

Interactions

[Four interaction plots; images not preserved]

Correlations

           FAMP_ID  FMLD_ADDR  PHT_DT  FILE_NM
FAMP_ID    1.000    1.000      0.729   1.000
FMLD_ADDR  1.000    1.000      1.000   1.000
PHT_DT     0.729    1.000      1.000   1.000
FILE_NM    1.000    1.000      1.000   1.000

         FAMP_ID  PHT_DT
FAMP_ID  1.000    0.310
PHT_DT   0.310    1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
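
One common way to compute a nullity correlation is the Pearson correlation between per-column "is missing" indicators. The sketch below follows that definition; whether the profiling tool uses exactly this formula is an assumption. `nullity_correlation` is a hypothetical helper, and the columns are stand-ins that reproduce only the missing-value pattern reported here (31 present and 7 missing values, with IMG_URL entirely missing).

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns.

    Returns None when either column is always present or always missing,
    because the correlation is undefined for a constant indicator.
    """
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / math.sqrt(var_a * var_b)


# Stand-in columns: only the missingness pattern matters, not the values.
famp_id = [1] * 31 + [None] * 7
pht_dt = [1] * 31 + [None] * 7
img_url = [None] * 38

print(nullity_correlation(famp_id, pht_dt))
print(nullity_correlation(famp_id, img_url))
```

Because the same seven rows are empty in FAMP_ID, FMLD_ADDR, PHT_DT and FILE_NM, their pairwise nullity correlations are all 1 under this definition, while IMG_URL has no defined value.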

Sample

First rows

Row | FAMP_ID | FMLD_ADDR | PHT_DT | FILE_NM | IMG_URL
0 | 5207356 | 전라남도 완도군 고금면 농상리 640-2 전 | 20230118034031 | 05207356_전라남도 완도군 고금면 농상리 640-2 전_230118034031.jpg | <NA>
1 | 5227177 | 전라남도 완도군 금일읍 신구리 349-6 과 | 20230115114142 | 05227177_전라남도 완도군 금일읍 신구리 349-6 과_230115114142.jpg | <NA>
2 | 6317556 | 전라북도 김제시 월봉동 551-7답 | 20230117033145 | 06317556_전라북도 김제시 월봉동 551-7답_230117033145.jpg | <NA>
3 | 6424611 | 전라북도 남원시 주천면 장안리 476대 | 20230129032520 | 06424611_전라북도 남원시 주천면 장안리 476대_230129032520.jpg | <NA>
4 | 6303728 | 전라북도 김제시 죽산면 홍산리 899-1답 | 20230130055226 | 06303728_전라북도 김제시 죽산면 홍산리 899-1답_230130055226.jpg | <NA>
5 | 5221420 | 전라남도 완도군 노화읍 신양리 454-1 전 | 20230114052553 | 05221420_전라남도 완도군 노화읍 신양리 454-1 전_230114052553.jpg | <NA>
6 | 5732977 | 제주특별자치도 서귀포시 성산읍 난산리 2392임 | 20230127031554 | 05732977_제주특별자치도 서귀포시 성산읍 난산리 2392임_230127031554.jpg | <NA>
7 | 6425278 | 전라북도 남원시 주천면 호기리 305-1답 | 20230118040358 | 06425278_전라북도 남원시 주천면 호기리 305-1답_230118040358.jpg | <NA>
8 | 6315294 | 전라북도 김제시 연정동 877-5답 | 20230126085529 | 06315294_전라북도 김제시 연정동 877-5답_230126085529.jpg | <NA>
9 | 5744020 | 제주특별자치도 서귀포시 성산읍 온평리 230-2전 | 20230128010031 | 05744020_제주특별자치도 서귀포시 성산읍 온평리 230-2전_230128010031.jpg | <NA>

Last rows

Row | FAMP_ID | FMLD_ADDR | PHT_DT | FILE_NM | IMG_URL
28 | 5740549 | 제주특별자치도 서귀포시 성산읍 신산리 1487-2임 | 20230129010819 | 05740549_제주특별자치도 서귀포시 성산읍 신산리 1487-2임_230129010819.jpg | <NA>
29 | 5743797 | 제주특별자치도 서귀포시 성산읍 온평리 531-1대 | 20230128021536 | 05743797_제주특별자치도 서귀포시 성산읍 온평리 531-1대_230128021536.jpg | <NA>
30 | 5208004 | 전라남도 완도군 신지면 동고리 771-1 전 | 20230118041027 | 05208004_전라남도 완도군 신지면 동고리 771-1 전_230118041027.jpg | <NA>
31 | <NA> | <NA> | <NA> | <NA> | <NA>
32 | <NA> | <NA> | <NA> | <NA> | <NA>
33 | <NA> | <NA> | <NA> | <NA> | <NA>
34 | <NA> | <NA> | <NA> | <NA> | <NA>
35 | <NA> | <NA> | <NA> | <NA> | <NA>
36 | <NA> | <NA> | <NA> | <NA> | <NA>
37 | <NA> | <NA> | <NA> | <NA> | <NA>

Duplicate rows

Most frequently occurring

Row | FAMP_ID | FMLD_ADDR | PHT_DT | FILE_NM | # duplicates
0 | <NA> | <NA> | <NA> | <NA> | 7