Overview

Dataset statistics

Number of variables: 4
Number of observations: 6258
Missing cells: 6006
Missing cells (%): 24.0%
Duplicate rows: 54
Duplicate rows (%): 0.9%
Total size in memory: 201.8 KiB
Average record size in memory: 33.0 B
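
The derived figures in this table follow from the raw counts. A minimal sketch that recomputes them (the variable names are illustrative, and the values are copied from the table above):

```python
# Figures copied from the "Dataset statistics" table of this report.
n_variables = 4
n_observations = 6258
missing_cells = 6006
duplicate_rows = 54
total_size_kib = 201.8

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations

missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_bytes = total_size_kib * 1024 / n_observations  # 1 KiB = 1024 B

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

The missing-cell total also equals the sum of the per-variable missing counts reported further down (0 + 55 + 0 + 5951 = 6006).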

Variable types

Text: 3
Numeric: 1

Dataset

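Description: Media storage records from the aerial photograph metadata of the National Geographic Information Institute (includes the media management number, the media data serial number, and related fields).
Author: Ministry of Land, Infrastructure and Transport, National Geographic Information Institute (국토교통부 국토지리정보원)
URL: https://www.data.go.kr/data/15067536/fileData.do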

Alerts

Dataset has 54 (0.9%) duplicate rows [Duplicates]
비고 has 5951 (95.1%) missing values [Missing]

Reproduction

Analysis started: 2023-12-12 17:21:15.176475
Analysis finished: 2023-12-12 17:21:15.734576
Duration: 0.56 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
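
A report like this one can be regenerated with ydata-profiling along the following lines. This is a sketch under stated assumptions, not the configuration actually used: the helper name `build_report`, the CSV path, the `cp949` encoding, and the choice to read the two identifier columns as text are all assumptions, and the original `config.json` is not reproduced here.

```python
def build_report(csv_path, output_path="report.html", encoding="cp949"):
    """Profile a CSV export and write an HTML report (hypothetical helper)."""
    # Imported lazily so this module loads even where the packages are absent.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: identifier-like columns are read as text so that values
    # such as 2006100002023 are not converted to integers.
    df = pd.read_csv(
        csv_path,
        encoding=encoding,
        dtype={"미디어관리번호": str, "사업지구코드": str},
    )
    report = ProfileReport(df, title="Aerial photograph media storage records")
    report.to_file(output_path)
    return output_path
```

Calling `build_report("media_storage.csv")` with a local copy of the file (the file name is a placeholder) would write `report.html` to the working directory.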

Variables

미디어관리번호 (media management number)
Text

Distinct: 3631
Distinct (%): 58.0%
Missing: 0
Missing (%): 0.0%
Memory size: 49.0 KiB

Length

Max length: 13
Median length: 13
Mean length: 12.989773
Min length: 1
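
The mean length is the total character count divided by the number of non-missing values. A small sketch that checks this against the figures reported for the three text variables in this report (the dictionary layout is illustrative):

```python
# Totals copied from this report; non_missing = 6258 observations minus
# the variable's missing count.
text_variables = {
    "미디어관리번호": {"total_chars": 81290, "non_missing": 6258 - 0, "mean_length": 12.989773},
    "사업지구코드": {"total_chars": 68233, "non_missing": 6258 - 55, "mean_length": 11.0},
    "비고": {"total_chars": 3157, "non_missing": 6258 - 5951, "mean_length": 10.283388},
}

results = {}
for name, stats in text_variables.items():
    computed = stats["total_chars"] / stats["non_missing"]
    # Compare with the reported mean, allowing for rounding in the report.
    results[name] = abs(computed - stats["mean_length"]) < 1e-6
    print(name, round(computed, 6), results[name])
```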

Characters and Unicode

Total characters: 81290
Distinct characters: 21
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
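
As an illustration of these character properties, the general category of each character can be read with Python's standard `unicodedata` module. This is a sketch rather than the profiler's own implementation; the helper name `category_counts` is illustrative, and the two values come from the sample rows of this report.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters by Unicode general category (e.g. Nd, Lu, Pd)."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Nd = Decimal Number, Lu = Uppercase Letter, Pd = Dash Punctuation.
sample = ["2006100002023", "AIR-2005-012B"]
counts = category_counts(sample)
print(counts)
```

The standard library exposes the general category but not the script or block properties, so those two breakdowns would need a third-party package.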

Unique

Unique: 3421
Unique (%): 54.7%
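
"Distinct" counts the different values present, while "Unique" counts the values that occur exactly once, which is why the unique count (3421) is lower than the distinct count (3631). A minimal sketch of the difference, using a toy list rather than the real column:

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique


# Toy data: three different values, one of which repeats.
values = [
    "2007110016001",
    "2007110016002",
    "AIR-2005-012B",
    "AIR-2005-012B",
    "AIR-2005-012B",
]
result = distinct_and_unique(values)
print(result)
```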

Sample

1st row: 2006100002023
2nd row: 2007110016001
3rd row: 2007110016002
4th row: 2007110016003
5th row: 2007110016004

Value | Count | Frequency (%)
air-2006-015a | 289 | 4.6%
air-2006-015b | 289 | 4.6%
air-2006-027a | 144 | 2.3%
air-2006-027b | 144 | 2.3%
air-2005-021a | 115 | 1.8%
air-2005-021b | 115 | 1.8%
air-2007-007a | 46 | 0.7%
air-2007-007b | 46 | 0.7%
air-2005-003a | 44 | 0.7%
air-2006-008a | 38 | 0.6%
Other values (3621) | 4988 | 79.7%

Most occurring characters

Value | Count | Frequency (%)
0 | 31790 | 39.1%
1 | 7229 | 8.9%
2 | 6992 | 8.6%
- | 5712 | 7.0%
9 | 4274 | 5.3%
A | 4075 | 5.0%
6 | 2995 | 3.7%
5 | 2970 | 3.7%
R | 2812 | 3.5%
I | 2770 | 3.4%
Other values (11) | 9671 | 11.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 64318 | 79.1%
Uppercase Letter | 11260 | 13.9%
Dash Punctuation | 5712 | 7.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 31790 | 49.4%
1 | 7229 | 11.2%
2 | 6992 | 10.9%
9 | 4274 | 6.6%
6 | 2995 | 4.7%
5 | 2970 | 4.6%
8 | 2182 | 3.4%
7 | 2166 | 3.4%
3 | 2029 | 3.2%
4 | 1691 | 2.6%

Uppercase Letter
Value | Count | Frequency (%)
A | 4075 | 36.2%
R | 2812 | 25.0%
I | 2770 | 24.6%
B | 1282 | 11.4%
W | 105 | 0.9%
D | 44 | 0.4%
E | 44 | 0.4%
M | 44 | 0.4%
O | 42 | 0.4%
T | 42 | 0.4%

Dash Punctuation
Value | Count | Frequency (%)
- | 5712 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 70030 | 86.1%
Latin | 11260 | 13.9%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 31790 | 45.4%
1 | 7229 | 10.3%
2 | 6992 | 10.0%
- | 5712 | 8.2%
9 | 4274 | 6.1%
6 | 2995 | 4.3%
5 | 2970 | 4.2%
8 | 2182 | 3.1%
7 | 2166 | 3.1%
3 | 2029 | 2.9%

Latin
Value | Count | Frequency (%)
A | 4075 | 36.2%
R | 2812 | 25.0%
I | 2770 | 24.6%
B | 1282 | 11.4%
W | 105 | 0.9%
D | 44 | 0.4%
E | 44 | 0.4%
M | 44 | 0.4%
O | 42 | 0.4%
T | 42 | 0.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 81290 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 31790 | 39.1%
1 | 7229 | 8.9%
2 | 6992 | 8.6%
- | 5712 | 7.0%
9 | 4274 | 5.3%
A | 4075 | 5.0%
6 | 2995 | 3.7%
5 | 2970 | 3.7%
R | 2812 | 3.5%
I | 2770 | 3.4%
Other values (11) | 9671 | 11.9%

사업지구코드 (project district code)
Text

Distinct: 375
Distinct (%): 6.0%
Missing: 55
Missing (%): 0.9%
Memory size: 49.0 KiB

Length

Max length: 11
Median length: 11
Mean length: 11
Min length: 11

Characters and Unicode

Total characters: 68233
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7
Unique (%): 0.1%

Sample

1st row: 200610A0002
2nd row: 200711A0016
3rd row: 200711A0016
4th row: 200711A0016
5th row: 200711A0016

Value | Count | Frequency (%)
198500a0003 | 147 | 2.4%
198609a0004 | 134 | 2.2%
198900a0002 | 96 | 1.5%
198900a0008 | 88 | 1.4%
198700a0001 | 87 | 1.4%
198800a0003 | 83 | 1.3%
198700a0011 | 77 | 1.2%
200505a0001 | 71 | 1.1%
198800a0004 | 68 | 1.1%
198500a0001 | 67 | 1.1%
Other values (365) | 5285 | 85.2%

Most occurring characters

Value | Count | Frequency (%)
0 | 30581 | 44.8%
1 | 8187 | 12.0%
9 | 7081 | 10.4%
A | 6083 | 8.9%
2 | 4200 | 6.2%
8 | 2740 | 4.0%
5 | 2105 | 3.1%
7 | 1952 | 2.9%
3 | 1728 | 2.5%
4 | 1723 | 2.5%
Other values (2) | 1853 | 2.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 61670 | 90.4%
Uppercase Letter | 6083 | 8.9%
Space Separator | 480 | 0.7%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 30581 | 49.6%
1 | 8187 | 13.3%
9 | 7081 | 11.5%
2 | 4200 | 6.8%
8 | 2740 | 4.4%
5 | 2105 | 3.4%
7 | 1952 | 3.2%
3 | 1728 | 2.8%
4 | 1723 | 2.8%
6 | 1373 | 2.2%

Uppercase Letter
Value | Count | Frequency (%)
A | 6083 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 480 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 62150 | 91.1%
Latin | 6083 | 8.9%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 30581 | 49.2%
1 | 8187 | 13.2%
9 | 7081 | 11.4%
2 | 4200 | 6.8%
8 | 2740 | 4.4%
5 | 2105 | 3.4%
7 | 1952 | 3.1%
3 | 1728 | 2.8%
4 | 1723 | 2.8%
6 | 1373 | 2.2%

Latin
Value | Count | Frequency (%)
A | 6083 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 68233 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 30581 | 44.8%
1 | 8187 | 12.0%
9 | 7081 | 10.4%
A | 6083 | 8.9%
2 | 4200 | 6.2%
8 | 2740 | 4.0%
5 | 2105 | 3.1%
7 | 1952 | 2.9%
3 | 1728 | 2.5%
4 | 1723 | 2.5%
Other values (2) | 1853 | 2.7%

미디어자료일련번호 (media data serial number)
Numeric

Distinct: 289
Distinct (%): 4.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22.274049
Minimum: 1
Maximum: 289
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 55.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 13
95-th percentile: 139
Maximum: 289
Range: 288
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 51.276088
Coefficient of variation (CV): 2.3020551
Kurtosis: 9.9493427
Mean: 22.274049
Median Absolute Deviation (MAD): 0
Skewness: 3.153189
Sum: 139391
Variance: 2629.2372
Monotonicity: Not monotonic
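
Several of these statistics are related by simple identities: mean = sum / n, variance = (standard deviation)^2, CV = standard deviation / mean, IQR = Q3 - Q1, and range = maximum - minimum. A short sketch that recomputes them from the figures reported for this variable (variable names are illustrative):

```python
import math

# Figures copied from this report for the numeric variable.
n = 6258            # observations, none missing
total = 139391      # Sum
std = 51.276088     # Standard deviation
q1, q3 = 1, 13
minimum, maximum = 1, 289

mean = total / n
variance = std ** 2
cv = std / mean                 # coefficient of variation
iqr = q3 - q1
value_range = maximum - minimum

# Compare with the reported values, allowing for rounding in the report.
checks = {
    "mean": math.isclose(mean, 22.274049, abs_tol=1e-6),
    "variance": math.isclose(variance, 2629.2372, abs_tol=1e-3),
    "cv": math.isclose(cv, 2.3020551, abs_tol=1e-6),
    "iqr": iqr == 12,
    "range": value_range == 288,
}
print(checks)
```

A median of 1 alongside a mean above 22 and a skewness above 3 is consistent with the frequency table for this variable, where the value 1 accounts for 58.2% of rows.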
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
1 | 3645 | 58.2%
2 | 192 | 3.1%
3 | 132 | 2.1%
4 | 106 | 1.7%
5 | 100 | 1.6%
6 | 96 | 1.5%
7 | 82 | 1.3%
8 | 78 | 1.2%
9 | 70 | 1.1%
10 | 67 | 1.1%
Other values (279) | 1690 | 27.0%

Minimum 10 values

Value | Count | Frequency (%)
1 | 3645 | 58.2%
2 | 192 | 3.1%
3 | 132 | 2.1%
4 | 106 | 1.7%
5 | 100 | 1.6%
6 | 96 | 1.5%
7 | 82 | 1.3%
8 | 78 | 1.2%
9 | 70 | 1.1%
10 | 67 | 1.1%

Maximum 10 values

Value | Count | Frequency (%)
289 | 2 | < 0.1%
288 | 2 | < 0.1%
287 | 2 | < 0.1%
286 | 2 | < 0.1%
285 | 2 | < 0.1%
284 | 2 | < 0.1%
283 | 2 | < 0.1%
282 | 2 | < 0.1%
281 | 2 | < 0.1%
280 | 2 | < 0.1%

비고
Text

MISSING 

Distinct: 136
Distinct (%): 44.3%
Missing: 5951
Missing (%): 95.1%
Memory size: 49.0 KiB

Length

Max length: 19
Median length: 11
Mean length: 10.283388
Min length: 3

Characters and Unicode

Total characters: 3157
Distinct characters: 113
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 50
Unique (%): 16.3%

Sample

1st row: 123
2nd row: 198700A0001
3rd row: 198700A0001
4th row: 198700A0001
5th row: 1987 서해안

Value | Count | Frequency (%)
1987 | 27 | 5.2%
서해안 | 27 | 5.2%
경기도 | 21 | 4.0%
1992 | 19 | 3.7%
경인지구 | 19 | 3.7%
경남 | 15 | 2.9%
신도시 | 9 | 1.7%
인천 | 6 | 1.2%
제주 | 6 | 1.2%
서울 | 6 | 1.2%
Other values (166) | 364 | 70.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 785 | 24.9%
1 | 252 | 8.0%
9 | 230 | 7.3%
(space) | 213 | 6.7%
A | 156 | 4.9%
2 | 151 | 4.8%
7 | 103 | 3.3%
8 | 103 | 3.3%
[not preserved] | 74 | 2.3%
4 | 69 | 2.2%
Other values (103) | 1021 | 32.3%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 1804 | 57.1%
Other Letter | 949 | 30.1%
Space Separator | 213 | 6.7%
Uppercase Letter | 160 | 5.1%
Lowercase Letter | 21 | 0.7%
Open Punctuation | 5 | 0.2%
Close Punctuation | 3 | 0.1%
Math Symbol | 2 | 0.1%

Most frequent character per category

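Other Letter
Value | Count | Frequency (%)
[not preserved] | 74 | 7.8%
[not preserved] | 63 | 6.6%
[not preserved] | 58 | 6.1%
[not preserved] | 45 | 4.7%
[not preserved] | 41 | 4.3%
[not preserved] | 39 | 4.1%
[not preserved] | 39 | 4.1%
[not preserved] | 39 | 4.1%
[not preserved] | 32 | 3.4%
[not preserved] | 27 | 2.8%
Other values (77) | 492 | 51.8%

Decimal Number
Value | Count | Frequency (%)
0 | 785 | 43.5%
1 | 252 | 14.0%
9 | 230 | 12.7%
2 | 151 | 8.4%
7 | 103 | 5.7%
8 | 103 | 5.7%
4 | 69 | 3.8%
3 | 38 | 2.1%
5 | 37 | 2.1%
6 | 36 | 2.0%

Lowercase Letter
Value | Count | Frequency (%)
l | 4 | 19.0%
o | 3 | 14.3%
u | 2 | 9.5%
x | 2 | 9.5%
i | 2 | 9.5%
n | 2 | 9.5%
a | 2 | 9.5%
m | 2 | 9.5%
s | 2 | 9.5%

Uppercase Letter
Value | Count | Frequency (%)
A | 156 | 97.5%
M | 2 | 1.2%
W | 2 | 1.2%

Space Separator
Value | Count | Frequency (%)
(space) | 213 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 5 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 3 | 100.0%

Math Symbol
Value | Count | Frequency (%)
~ | 2 | 100.0%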

Most occurring scripts

Value | Count | Frequency (%)
Common | 2027 | 64.2%
Hangul | 949 | 30.1%
Latin | 181 | 5.7%

Most frequent character per script

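Hangul
Value | Count | Frequency (%)
[not preserved] | 74 | 7.8%
[not preserved] | 63 | 6.6%
[not preserved] | 58 | 6.1%
[not preserved] | 45 | 4.7%
[not preserved] | 41 | 4.3%
[not preserved] | 39 | 4.1%
[not preserved] | 39 | 4.1%
[not preserved] | 39 | 4.1%
[not preserved] | 32 | 3.4%
[not preserved] | 27 | 2.8%
Other values (77) | 492 | 51.8%

Common
Value | Count | Frequency (%)
0 | 785 | 38.7%
1 | 252 | 12.4%
9 | 230 | 11.3%
(space) | 213 | 10.5%
2 | 151 | 7.4%
7 | 103 | 5.1%
8 | 103 | 5.1%
4 | 69 | 3.4%
3 | 38 | 1.9%
5 | 37 | 1.8%
Other values (4) | 46 | 2.3%

Latin
Value | Count | Frequency (%)
A | 156 | 86.2%
l | 4 | 2.2%
o | 3 | 1.7%
u | 2 | 1.1%
x | 2 | 1.1%
i | 2 | 1.1%
n | 2 | 1.1%
a | 2 | 1.1%
m | 2 | 1.1%
s | 2 | 1.1%
Other values (2) | 4 | 2.2%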

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2208 | 69.9%
Hangul | 949 | 30.1%

Most frequent character per block

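ASCII
Value | Count | Frequency (%)
0 | 785 | 35.6%
1 | 252 | 11.4%
9 | 230 | 10.4%
(space) | 213 | 9.6%
A | 156 | 7.1%
2 | 151 | 6.8%
7 | 103 | 4.7%
8 | 103 | 4.7%
4 | 69 | 3.1%
3 | 38 | 1.7%
Other values (16) | 108 | 4.9%

Hangul
Value | Count | Frequency (%)
[not preserved] | 74 | 7.8%
[not preserved] | 63 | 6.6%
[not preserved] | 58 | 6.1%
[not preserved] | 45 | 4.7%
[not preserved] | 41 | 4.3%
[not preserved] | 39 | 4.1%
[not preserved] | 39 | 4.1%
[not preserved] | 39 | 4.1%
[not preserved] | 32 | 3.4%
[not preserved] | 27 | 2.8%
Other values (77) | 492 | 51.8%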

Interactions


Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
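
Nullity correlation can be read as the Pearson correlation between the present/missing indicators of two columns. The sketch below shows that idea on toy data; the function name and the example columns are hypothetical and are not taken from the dataset or from the profiler's source.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the present (1) / missing (0) indicators."""
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Undefined when a column is fully present or fully missing.
        return float("nan")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical toy columns; None marks a missing cell.
code = ["200610A0002", None, "200711A0016", "200711A0016", None, "198500A0003"]
note = ["1987", None, None, "서해안", None, None]
r = nullity_correlation(code, note)
print(r)
```

A value near 1 means the two columns tend to be present or missing together, a value near -1 means one tends to be present when the other is missing, and columns with no missing values have no defined nullity correlation.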

Sample

First rows

Index | 미디어관리번호 | 사업지구코드 | 미디어자료일련번호 | 비고
0 | 2006100002023 | 200610A0002 | 1 | <NA>
1 | 2007110016001 | 200711A0016 | 1 | <NA>
2 | 2007110016002 | 200711A0016 | 1 | <NA>
3 | 2007110016003 | 200711A0016 | 1 | <NA>
4 | 2007110016004 | 200711A0016 | 1 | <NA>
5 | 2007110016005 | 200711A0016 | 1 | <NA>
6 | 2007110016006 | 200711A0016 | 1 | <NA>
7 | 2007110016007 | 200711A0016 | 1 | <NA>
8 | 2007110016008 | 200711A0016 | 1 | <NA>
9 | 2007110016009 | 200711A0016 | 1 | <NA>

Last rows

Index | 미디어관리번호 | 사업지구코드 | 미디어자료일련번호 | 비고
6248 | AIR-2005-012B | 199300A0002 | 1 | <NA>
6249 | AIR-2005-012B | 199300A0003 | 2 | <NA>
6250 | AIR-2005-012B | 200209A0002 | 3 | <NA>
6251 | AIR-2005-013A | 199600A0002 | 1 | <NA>
6252 | AIR-2005-013A | 199700A0023 | 2 | <NA>
6253 | AIR-2005-014A | 199100A0006 | 1 | <NA>
6254 | AIR-2005-014A | 199600A0003 | 2 | <NA>
6255 | AIR-2005-014A | 199600A0013 | 3 | <NA>
6256 | AIR-2005-015A | 196800A0002 | 1 | <NA>
6257 | AIR-2005-015A | 199400A0002 | 2 | <NA>

Duplicate rows

Most frequently occurring

Index | 미디어관리번호 | 사업지구코드 | 미디어자료일련번호 | 비고 | # duplicates
0 | 1970000003001 | 197000A0003 | 1 | <NA> | 2
1 | 2007000017002 | 200700A0017 | 1 | <NA> | 2
2 | 2007000017003 | 200700A0017 | 1 | <NA> | 2
3 | 2007000017004 | 200700A0017 | 1 | <NA> | 2
4 | 2007000017005 | 200700A0017 | 1 | <NA> | 2
5 | 2007000017006 | 200700A0017 | 1 | <NA> | 2
6 | 2007000017007 | 200700A0017 | 1 | <NA> | 2
7 | 2007000017008 | 200700A0017 | 1 | <NA> | 2
8 | 2007000017009 | 200700A0017 | 1 | <NA> | 2
9 | 2007000017010 | 200700A0017 | 1 | <NA> | 2
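
Fully duplicated rows such as these can be found by counting identical row tuples. A minimal sketch using the standard library; the helper name `duplicate_groups` is illustrative, and the toy list reuses a few row values from this report, with `None` standing in for `<NA>`.

```python
from collections import Counter


def duplicate_groups(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# Toy data: two rows appear twice, one row appears once.
rows = [
    ("2007000017002", "200700A0017", 1, None),
    ("2007000017002", "200700A0017", 1, None),
    ("2007000017003", "200700A0017", 1, None),
    ("2007000017003", "200700A0017", 1, None),
    ("2007110016001", "200711A0016", 1, None),
]
groups = duplicate_groups(rows)
for row, n in groups.items():
    print(row, n)
```

Rows are compared across all four columns, so two records count as duplicates only when every field, including a missing 비고, matches.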