Overview

Dataset statistics

Number of variables: 5
Number of observations: 68
Missing cells: 59
Missing cells (%): 17.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.9 KiB
Average record size in memory: 42.9 B
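
The missing-cell percentage above can be cross-checked by hand. The sketch below assumes the percentage is simply missing cells divided by total cells (variables × observations); the variable names are illustrative and not part of the report.

```python
# Figures copied from the Dataset statistics above.
n_variables = 5
n_observations = 68
missing_cells = 59

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = missing_cells / total_cells * 100

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

Under that assumption the result rounds to the 17.4% shown in the report.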

Variable types

Numeric: 1
Categorical: 2
Text: 2

Dataset

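Description: Larva breeding-site data from the disease-control geographic information system (방역지리정보시스템) of Goyang-si, Gyeonggi-do. It provides fields such as facility name (시설명칭), registration date (등록일), road-name address (도로명 주소), and detailed address (상세 주소).
Author: Goyang-si, Gyeonggi-do (경기도 고양시)
URL: https://www.data.go.kr/data/15127202/fileData.do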

Alerts

순번 is highly overall correlated with 시설명칭 (High correlation)
시설명칭 is highly overall correlated with 순번 and 1 other field (High correlation)
등록일 is highly overall correlated with 시설명칭 (High correlation)
등록일 is highly imbalanced (80.9%) (Imbalance)
상세 주소 has 59 (86.8%) missing values (Missing)
순번 has unique values (Unique)
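
The 80.9% imbalance figure for 등록일 is consistent with one minus the normalized Shannon entropy of the class counts (66 and 2, as listed under that variable's Common Values). The sketch below assumes that definition; the function name `imbalance_score` is hypothetical and this is not presented as the profiling library's own code.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy of the counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class
    dominates. A single-class column is treated as 0.0 here (an assumption).
    """
    k = len(counts)
    if k < 2:
        return 0.0
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Class counts for 등록일: 2023-10-23 occurs 66 times, 2023-10-20 twice.
score = imbalance_score([66, 2])
print(f"{score * 100:.1f}%")
```

With these counts the score rounds to 80.9%, matching the alert.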

Reproduction

Analysis started: 2024-03-23 05:45:26.540070
Analysis finished: 2024-03-23 05:45:27.457466
Duration: 0.92 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

순번 (sequence number)
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 68
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 34.5
Minimum: 1
Maximum: 68
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 744.0 B

Quantile statistics

Minimum: 1
5-th percentile: 4.35
Q1: 17.75
Median: 34.5
Q3: 51.25
95-th percentile: 64.65
Maximum: 68
Range: 67
Interquartile range (IQR): 33.5
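
These quantiles can be reproduced with linear interpolation between order statistics. The sketch below assumes the column holds the integers 1 to 68 (consistent with the minimum, maximum, distinct count, and strictly increasing order reported for 순번) and uses the standard library's `inclusive` quantile method as a stand-in for whatever the profiling tool uses internally.

```python
from statistics import quantiles

# Assumption: 순번 is simply the sequence 1..68.
values = list(range(1, 69))

# Quartiles via linear interpolation between the closest ranks.
q1, median, q3 = quantiles(values, n=4, method="inclusive")

# Cutting into 20 equal parts gives the 5th and 95th percentiles
# as the first and last cut points.
ventiles = quantiles(values, n=20, method="inclusive")
p5, p95 = ventiles[0], ventiles[-1]

iqr = q3 - q1
value_range = max(values) - min(values)

print(p5, q1, median, q3, p95, iqr, value_range)
```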

Descriptive statistics

Standard deviation: 19.77372
Coefficient of variation (CV): 0.5731513
Kurtosis: -1.2
Mean: 34.5
Median Absolute Deviation (MAD): 17
Skewness: 0
Sum: 2346
Variance: 391
Monotonicity: Strictly increasing
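
Most of these descriptive statistics follow directly from the sequence 1 to 68. The sketch below again assumes 순번 is exactly that sequence, and that the variance is the sample variance (divisor n - 1), which is what matches the reported value of 391.

```python
from statistics import mean, median, stdev, variance

# Assumption: 순번 is simply the sequence 1..68.
values = list(range(1, 69))

total = sum(values)
avg = mean(values)
var = variance(values)   # sample variance, divisor n - 1
sd = stdev(values)       # square root of the sample variance
cv = sd / avg            # coefficient of variation

# Median absolute deviation: median distance from the median.
center = median(values)
mad = median([abs(v - center) for v in values])

print(total, avg, var, round(sd, 5), round(cv, 7), mad)
```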
Histogram with fixed size bins (bins=50)
Common values

Value              Count  Frequency (%)
1                  1      1.5%
45                 1      1.5%
51                 1      1.5%
50                 1      1.5%
49                 1      1.5%
48                 1      1.5%
47                 1      1.5%
46                 1      1.5%
44                 1      1.5%
36                 1      1.5%
Other values (58)  58     85.3%

Minimum 10 values

Value  Count  Frequency (%)
1      1      1.5%
2      1      1.5%
3      1      1.5%
4      1      1.5%
5      1      1.5%
6      1      1.5%
7      1      1.5%
8      1      1.5%
9      1      1.5%
10     1      1.5%

Maximum 10 values

Value  Count  Frequency (%)
68     1      1.5%
67     1      1.5%
66     1      1.5%
65     1      1.5%
64     1      1.5%
63     1      1.5%
62     1      1.5%
61     1      1.5%
60     1      1.5%
59     1      1.5%

시설명칭 (facility name)
Categorical

HIGH CORRELATION 

Distinct: 9
Distinct (%): 13.2%
Missing: 0
Missing (%): 0.0%
Memory size: 676.0 B

Length

Max length: 4
Median length: 3
Mean length: 2.8088235
Min length: 2

Unique

Unique: 4
Unique (%): 5.9%

Sample

1st row: 근린공원
2nd row: 근린공원
3rd row: 근린공원
4th row: 근린공원
5th row: 농로

Common Values

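Value                        Count  Frequency (%)
배수로 (drainage ditch)       36     52.9%
농로 (farm road)              17     25.0%
근린공원 (neighborhood park)  4      5.9%
빗물받이 (storm drain inlet)  4      5.9%
축사 (livestock barn)         3      4.4%
웅덩이 (puddle)               1      1.5%
유수지 (retarding basin)      1      1.5%
저류지 (detention pond)       1      1.5%
하천 (stream)                 1      1.5%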
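
The report distinguishes "Distinct" (number of different values) from "Unique" (values that occur exactly once). The sketch below recomputes both, plus the mean value length, from the Common Values counts; the dictionary is transcribed from the table and the variable names are illustrative.

```python
# Category counts transcribed from the Common Values table for 시설명칭.
counts = {
    "배수로": 36,
    "농로": 17,
    "근린공원": 4,
    "빗물받이": 4,
    "축사": 3,
    "웅덩이": 1,
    "유수지": 1,
    "저류지": 1,
    "하천": 1,
}

n_rows = sum(counts.values())

# Distinct: how many different values appear at all.
distinct = len(counts)

# Unique: how many values appear exactly once.
unique = sum(1 for c in counts.values() if c == 1)

# Mean length in characters, weighted by how often each value occurs.
mean_length = sum(len(name) * c for name, c in counts.items()) / n_rows

print(distinct, unique, round(mean_length, 7))
```

The four values that occur once (웅덩이, 유수지, 저류지, 하천) account for the "Unique: 4" figure, while all nine categories count toward "Distinct".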

Length

Histogram of lengths of the category


등록일 (registration date)
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Memory size: 676.0 B

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2023-10-23
2nd row: 2023-10-23
3rd row: 2023-10-23
4th row: 2023-10-23
5th row: 2023-10-23

Common Values

Value       Count  Frequency (%)
2023-10-23  66     97.1%
2023-10-20  2      2.9%

Length

Histogram of lengths of the category


도로명 주소 (road-name address)
Text

Distinct: 66
Distinct (%): 97.1%
Missing: 0
Missing (%): 0.0%
Memory size: 676.0 B

Length

Max length: 34
Median length: 31
Mean length: 27.926471
Min length: 20

Characters and Unicode

Total characters: 1899
Distinct characters: 59
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
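
As a small illustration of those character properties, the sketch below counts Unicode general categories for one address taken from the Sample rows, using Python's standard `unicodedata` module. It is only a hand check on a single string, not the procedure the report itself runs.

```python
import unicodedata
from collections import Counter

# Fifth sample row of 도로명 주소.
address = "경기도 고양시 일산서구 덕이로 137-23 (덕이동)"

# unicodedata.category returns two-letter general category codes:
# Lo = Other Letter, Zs = Space Separator, Nd = Decimal Number,
# Ps = Open Punctuation, Pe = Close Punctuation, Pd = Dash Punctuation.
category_counts = Counter(unicodedata.category(ch) for ch in address)

for code, count in category_counts.most_common():
    print(code, count)
```

The two-letter codes correspond to the category names used in the tables for this variable (for example, `Lo` is reported as "Other Letter").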

Unique

Unique: 64
Unique (%): 94.1%

Sample

1st row: 경기도 고양시 일산서구 덕이동 산 131-7
2nd row: 경기도 고양시 일산서구 탄현동 92-29
3rd row: 경기도 고양시 일산서구 탄중로 284
4th row: 경기도 고양시 일산서구 일산로 552-1
5th row: 경기도 고양시 일산서구 덕이로 137-23 (덕이동)

Value               Count  Frequency (%)
경기도              68     16.9%
일산서구            68     16.9%
고양시              68     16.9%
구산동              15     3.7%
덕이동              13     3.2%
법곳동              13     3.2%
가좌동              11     2.7%
송산로              7      1.7%
대화동              7      1.7%
법곳길              7      1.7%
Other values (107)  126    31.3%

Most occurring characters

Value              Count  Frequency (%)
(space)            335    17.6%
(character lost)   110    5.8%
(character lost)   88     4.6%
(character lost)   69     3.6%
(character lost)   69     3.6%
(character lost)   69     3.6%
(character lost)   68     3.6%
(character lost)   68     3.6%
(character lost)   68     3.6%
(character lost)   68     3.6%
Other values (49)  887    46.7%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       1128   59.4%
Space Separator    335    17.6%
Decimal Number     281    14.8%
Open Punctuation   62     3.3%
Close Punctuation  62     3.3%
Dash Punctuation   29     1.5%
Other Punctuation  2      0.1%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
(character lost)   110    9.8%
(character lost)   88     7.8%
(character lost)   69     6.1%
(character lost)   69     6.1%
(character lost)   69     6.1%
(character lost)   68     6.0%
(character lost)   68     6.0%
(character lost)   68     6.0%
(character lost)   68     6.0%
(character lost)   68     6.0%
Other values (34)  383    34.0%

Decimal Number
Value  Count  Frequency (%)
2      48     17.1%
1      48     17.1%
3      34     12.1%
7      29     10.3%
5      28     10.0%
4      25     8.9%
9      24     8.5%
6      17     6.0%
8      16     5.7%
0      12     4.3%

Space Separator
Value    Count  Frequency (%)
(space)  335    100.0%

Open Punctuation
Value  Count  Frequency (%)
(      62     100.0%

Close Punctuation
Value  Count  Frequency (%)
)      62     100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      29     100.0%

Other Punctuation
Value  Count  Frequency (%)
,      2      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  1128   59.4%
Common  771    40.6%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
(character lost)   110    9.8%
(character lost)   88     7.8%
(character lost)   69     6.1%
(character lost)   69     6.1%
(character lost)   69     6.1%
(character lost)   68     6.0%
(character lost)   68     6.0%
(character lost)   68     6.0%
(character lost)   68     6.0%
(character lost)   68     6.0%
Other values (34)  383    34.0%

Common
Value             Count  Frequency (%)
(space)           335    43.5%
(                 62     8.0%
)                 62     8.0%
2                 48     6.2%
1                 48     6.2%
3                 34     4.4%
7                 29     3.8%
-                 29     3.8%
5                 28     3.6%
4                 25     3.2%
Other values (5)  71     9.2%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  1128   59.4%
ASCII   771    40.6%

Most frequent character per block

ASCII
Value             Count  Frequency (%)
(space)           335    43.5%
(                 62     8.0%
)                 62     8.0%
2                 48     6.2%
1                 48     6.2%
3                 34     4.4%
7                 29     3.8%
-                 29     3.8%
5                 28     3.6%
4                 25     3.2%
Other values (5)  71     9.2%

Hangul
Value              Count  Frequency (%)
(character lost)   110    9.8%
(character lost)   88     7.8%
(character lost)   69     6.1%
(character lost)   69     6.1%
(character lost)   69     6.1%
(character lost)   68     6.0%
(character lost)   68     6.0%
(character lost)   68     6.0%
(character lost)   68     6.0%
(character lost)   68     6.0%
Other values (34)  383    34.0%

상세 주소 (detailed address)
Text

MISSING 

Distinct: 9
Distinct (%): 100.0%
Missing: 59
Missing (%): 86.8%
Memory size: 676.0 B

Length

Max length: 9
Median length: 7
Mean length: 5.5555556
Min length: 2

Characters and Unicode

Total characters: 50
Distinct characters: 33
Distinct categories: 3
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
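
The block totals for this variable can be checked by classifying each code point by range. The sketch below uses the nine non-missing values listed for 상세 주소 and two assumptions: that the report's "Hangul" block corresponds to the Hangul Syllables range U+AC00 to U+D7AF, and that the value shown as "b동" in the word table is stored as "B동" (the report counts one Uppercase Letter, B). The helper `block_of` is hypothetical.

```python
from collections import Counter

# The nine non-missing values of 상세 주소 (with "B동" assumed uppercase).
values = [
    "한뫼도서관", "가좌마을101동", "유리공장", "대화마을", "대화마을상가동",
    "B동", "나동", "가좌배드민턴경기장", "궁골공원개방화장실",
]


def block_of(ch):
    """Classify a character into a coarse Unicode block by code point."""
    cp = ord(ch)
    if cp <= 0x7F:
        return "ASCII"
    if 0xAC00 <= cp <= 0xD7AF:  # Hangul Syllables block (assumed mapping)
        return "Hangul"
    return "Other"


block_counts = Counter(block_of(ch) for value in values for ch in value)
total_chars = sum(block_counts.values())
mean_length = total_chars / len(values)

print(dict(block_counts), total_chars, round(mean_length, 7))
```

Under these assumptions the counts agree with the report: 46 Hangul and 4 ASCII characters, 50 in total, with a mean length of about 5.56.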

Unique

Unique: 9
Unique (%): 100.0%

Sample

1st row: 한뫼도서관
2nd row: 가좌마을101동
3rd row: 유리공장
4th row: 대화마을
5th row: 대화마을상가동

Value               Count  Frequency (%)
한뫼도서관          1      11.1%
가좌마을101동       1      11.1%
유리공장            1      11.1%
대화마을            1      11.1%
대화마을상가동      1      11.1%
b동                 1      11.1%
나동                1      11.1%
가좌배드민턴경기장  1      11.1%
궁골공원개방화장실  1      11.1%

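Most occurring characters

Value              Count  Frequency (%)
(character lost)   4      8.0%
(character lost)   3      6.0%
(character lost)   3      6.0%
(character lost)   3      6.0%
(character lost)   3      6.0%
(character lost)   3      6.0%
(character lost)   2      4.0%
(character lost)   2      4.0%
(character lost)   2      4.0%
1                  2      4.0%
Other values (23)  23     46.0%

Most occurring categories

Value             Count  Frequency (%)
Other Letter      46     92.0%
Decimal Number    3      6.0%
Uppercase Letter  1      2.0%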

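Most frequent character per category

Other Letter
Value              Count  Frequency (%)
(character lost)   4      8.7%
(character lost)   3      6.5%
(character lost)   3      6.5%
(character lost)   3      6.5%
(character lost)   3      6.5%
(character lost)   3      6.5%
(character lost)   2      4.3%
(character lost)   2      4.3%
(character lost)   2      4.3%
(character lost)   1      2.2%
Other values (20)  20     43.5%

Decimal Number
Value  Count  Frequency (%)
1      2      66.7%
0      1      33.3%

Uppercase Letter
Value  Count  Frequency (%)
B      1      100.0%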

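Most occurring scripts

Value   Count  Frequency (%)
Hangul  46     92.0%
Common  3      6.0%
Latin   1      2.0%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
(character lost)   4      8.7%
(character lost)   3      6.5%
(character lost)   3      6.5%
(character lost)   3      6.5%
(character lost)   3      6.5%
(character lost)   3      6.5%
(character lost)   2      4.3%
(character lost)   2      4.3%
(character lost)   2      4.3%
(character lost)   1      2.2%
Other values (20)  20     43.5%

Common
Value  Count  Frequency (%)
1      2      66.7%
0      1      33.3%

Latin
Value  Count  Frequency (%)
B      1      100.0%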

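Most occurring blocks

Value   Count  Frequency (%)
Hangul  46     92.0%
ASCII   4      8.0%

Most frequent character per block

Hangul
Value              Count  Frequency (%)
(character lost)   4      8.7%
(character lost)   3      6.5%
(character lost)   3      6.5%
(character lost)   3      6.5%
(character lost)   3      6.5%
(character lost)   3      6.5%
(character lost)   2      4.3%
(character lost)   2      4.3%
(character lost)   2      4.3%
(character lost)   1      2.2%
Other values (20)  20     43.5%

ASCII
Value  Count  Frequency (%)
1      2      50.0%
B      1      25.0%
0      1      25.0%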


Correlations

             순번   시설명칭  등록일  도로명 주소  상세 주소
순번         1.000  0.799     0.000   0.973        1.000
시설명칭     0.799  1.000     0.650   0.000        1.000
등록일       0.000  0.650     1.000   0.000        NaN
도로명 주소  0.973  0.000     0.000   1.000        1.000
상세 주소    1.000  1.000     NaN     1.000        1.000

          등록일  시설명칭
등록일    1.000   0.621
시설명칭  0.621   1.000

          순번   시설명칭  등록일
순번      1.000  0.522     0.000
시설명칭  0.522  1.000     0.621
등록일    0.000  0.621     1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
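
To make the two displays concrete, the sketch below counts missing values per column and draws a tiny text nullity matrix. It uses only 순번 and 상세 주소 from the first ten rows shown in the Sample section, with `None` standing in for `<NA>`; the `#`/`.` rendering is an illustrative choice, not the report's own plot.

```python
# (순번, 상세 주소) pairs from the first ten sample rows; None marks <NA>.
COLUMNS = ["순번", "상세 주소"]
rows = [
    (1, None), (2, None), (3, "한뫼도서관"), (4, None), (5, None),
    (6, None), (7, None), (8, None), (9, None), (10, "가좌마을101동"),
]

# Nullity by column: how many cells are missing in each column.
missing = {
    col: sum(1 for row in rows if row[i] is None)
    for i, col in enumerate(COLUMNS)
}

# Nullity matrix: one line per row, '#' for present and '.' for missing.
nullity_matrix = ["".join("." if cell is None else "#" for cell in row) for row in rows]

print(missing)
print("\n".join(nullity_matrix))
```

In this ten-row slice, 순번 is always present and 상세 주소 is missing in eight rows, which mirrors the pattern behind the 86.8% missing rate reported for the full column.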

Sample

First rows

(index)  순번  시설명칭  등록일      도로명 주소  상세 주소
0   1   근린공원  2023-10-23  경기도 고양시 일산서구 덕이동 산 131-7  <NA>
1   2   근린공원  2023-10-23  경기도 고양시 일산서구 탄현동 92-29  <NA>
2   3   근린공원  2023-10-23  경기도 고양시 일산서구 탄중로 284  한뫼도서관
3   4   근린공원  2023-10-23  경기도 고양시 일산서구 일산로 552-1  <NA>
4   5   농로  2023-10-23  경기도 고양시 일산서구 덕이로 137-23 (덕이동)  <NA>
5   6   농로  2023-10-23  경기도 고양시 일산서구 법곳길 145-24 (법곳동)  <NA>
6   7   농로  2023-10-23  경기도 고양시 일산서구 법곳길 177-2 (법곳동)  <NA>
7   8   농로  2023-10-23  경기도 고양시 일산서구 법곳길 234 (법곳동)  <NA>
8   9   농로  2023-10-23  경기도 고양시 일산서구 이산포길 182-24 (법곳동)  <NA>
9   10  농로  2023-10-23  경기도 고양시 일산서구 가좌4로 29 (가좌동,가좌마을)  가좌마을101동

Last rows

(index)  순번  시설명칭  등록일      도로명 주소  상세 주소
58  59  빗물받이  2023-10-23  경기도 고양시 일산서구 송산로 387-7  가좌배드민턴경기장
59  60  빗물받이  2023-10-23  경기도 고양시 일산서구 고봉로 91-3 (주엽동)  궁골공원개방화장실
60  61  빗물받이  2023-10-23  경기도 고양시 일산서구 송포백송길 57-10 (덕이동)  <NA>
61  62  웅덩이  2023-10-23  경기도 고양시 일산서구 대화로 54-6  <NA>
62  63  유수지  2023-10-20  경기도 고양시 일산서구 하이파크로 83 (덕이동)  <NA>
63  64  저류지  2023-10-23  경기도 고양시 일산서구 하이파크로 83 (덕이동)  <NA>
64  65  축사  2023-10-23  경기도 고양시 일산서구 구산로 37-62 (구산동)  <NA>
65  66  축사  2023-10-23  경기도 고양시 일산서구 법곳길 560 (구산동)  <NA>
66  67  축사  2023-10-23  경기도 고양시 일산서구 법곳길 533 (구산동)  <NA>
67  68  하천  2023-10-23  경기도 고양시 일산서구 곳산길 96 (구산동)  <NA>