Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 2673
Missing cells (%): 5.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 478.5 KiB
Average record size in memory: 49.0 B
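The derived figures in this block follow from the raw counts. A minimal sketch of that arithmetic, using only numbers quoted in this report (the variable names and the column label `(unnamed)` are illustrative, not part of the report):

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 5
n_observations = 10_000
missing_cells = 2_673
total_bytes = 478.5 * 1024  # 478.5 KiB expressed in bytes

# Per-column missing counts as listed in the per-variable sections.
per_column_missing = {
    "지역코드": 0,
    "시도": 0,
    "구군": 9,
    "읍면동": 129,
    "(unnamed)": 2535,
}

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
avg_record_bytes = total_bytes / n_observations

print(f"total cells: {total_cells}")
print(f"missing cells (%): {missing_pct:.1f}")
print(f"average record size: {avg_record_bytes:.1f} B")
print(f"per-column missing sums to: {sum(per_column_missing.values())}")
```

The per-column missing counts add up to the dataset-level total, so the 5.3% figure is a share of all 50,000 cells rather than of rows.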

Variable types

Numeric: 1
Categorical: 1
Text: 3

Dataset

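Description: Region code data for Jinan-gun, Jeonbuk Special Self-Governing Province (전북특별자치도 진안군), extracted from the automatic weather observation system (자동기상현황관측시스템). It provides the region code (지역코드), province (시도), district (구군), dong (동), and ri (리).
Author: 전북특별자치도 진안군 (Jinan-gun, Jeonbuk Special Self-Governing Province)
URL: https://www.data.go.kr/data/15119115/fileData.do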

Alerts

지역코드 is highly overall correlated with 시도 (High correlation)
시도 is highly overall correlated with 지역코드 (High correlation)
읍면동 has 129 (1.3%) missing values (Missing)
(unnamed column) has 2535 (25.4%) missing values (Missing)
지역코드 has unique values (Unique)
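The high correlation between 지역코드 and 시도 would be expected if the code is positional. The report does not state the layout; the sketch below assumes the 10-digit structure commonly used for Korean legal-district codes (2 digits for 시도, 3 for 시군구, 3 for 읍면동, 2 for 리), which is consistent with the sample rows at the end of this report. `RegionCode` and `split_region_code` are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class RegionCode(NamedTuple):
    sido: str          # first 2 digits (assumed: province / metropolitan city)
    sigungu: str       # next 3 digits (assumed: city / county / district)
    eupmyeondong: str  # next 3 digits (assumed: eup / myeon / dong)
    ri: str            # last 2 digits (assumed: ri, "00" when absent)


def split_region_code(code: int) -> RegionCode:
    """Split a 10-digit region code into its assumed positional parts."""
    if code < 0:
        raise ValueError(f"region code must be non-negative: {code}")
    digits = f"{code:010d}"
    if len(digits) != 10:
        raise ValueError(f"expected at most 10 digits, got {code}")
    return RegionCode(digits[:2], digits[2:5], digits[5:8], digits[8:])


# (code, 시도) pairs taken from the Sample section of this report.
samples = [
    (4111113300, "경기도"),
    (4784032000, "경상북도"),
    (4519046023, "전라북도"),
    (4519039023, "전라북도"),
]

for code, sido_name in samples:
    parts = split_region_code(code)
    print(code, sido_name, parts)
```

Under this assumption the leading two digits determine the 시도, and rows whose last column is missing end in `00`, which matches every such row in the sample.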

Reproduction

Analysis started: 2024-03-14 19:35:50.483531
Analysis finished: 2024-03-14 19:35:52.467993
Duration: 1.98 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

지역코드
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.3440963 × 10^9
Minimum: 1.1 × 10^9
Maximum: 4.972032 × 10^9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1.1 × 10^9
5-th percentile: 2.771031 × 10^9
Q1: 4.276778 × 10^9
Median: 4.519025 × 10^9
Q3: 4.71704 × 10^9
95-th percentile: 4.884025 × 10^9
Maximum: 4.972032 × 10^9
Range: 3.872032 × 10^9
Interquartile range (IQR): 4.40262 × 10^8

Descriptive statistics

Standard deviation: 7.0786402 × 10^8
Coefficient of variation (CV): 0.16294851
Kurtosis: 9.1390564
Mean: 4.3440963 × 10^9
Median Absolute Deviation (MAD): 2.060105 × 10^8
Skewness: -2.9136806
Sum: 4.3440963 × 10^13
Variance: 5.0107148 × 10^17
Monotonicity: Not monotonic
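Several of these statistics are derived from the others, which gives a quick consistency check. A minimal sketch using only the rounded values printed in this report (so agreement is expected only to about seven significant digits):

```python
import math

# Values as printed in the quantile and descriptive statistics above.
n = 10_000
mean = 4.3440963e9
std = 7.0786402e8
minimum, maximum = 1.1e9, 4.972032e9
q1, q3 = 4.276778e9, 4.71704e9

cv = std / mean                # coefficient of variation
variance = std ** 2            # variance is the squared standard deviation
value_range = maximum - minimum
iqr = q3 - q1                  # interquartile range
total = mean * n               # sum of all values

print(f"CV       = {cv:.8f}")
print(f"variance = {variance:.7e}")
print(f"range    = {value_range:.6e}")
print(f"IQR      = {iqr:.5e}")
print(f"sum      = {total:.7e}")
```

Each derived value agrees with the reported figure up to rounding of the inputs.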
Histogram with fixed size bins (bins=50)
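With 50 fixed-size bins, each bin spans the value range divided by 50. The report does not show how its histogram is computed; the sketch below assumes equal-width bins over [minimum, maximum] with the last bin closed on the right, which is the convention `numpy.histogram` uses. `bin_index` is a hypothetical helper, and the minimum and maximum are the exact extreme values listed for 지역코드.

```python
def bin_index(value: float, lo: float, hi: float, bins: int) -> int:
    """Return the 0-based index of the equal-width bin that contains value."""
    if bins <= 0 or hi <= lo:
        raise ValueError("need bins > 0 and hi > lo")
    if not lo <= value <= hi:
        raise ValueError(f"{value} is outside [{lo}, {hi}]")
    width = (hi - lo) / bins
    # The maximum would land in bin `bins`; clamp it into the last bin.
    return min(int((value - lo) // width), bins - 1)


lo, hi, bins = 1_100_000_000, 4_972_032_026, 50
width = (hi - lo) / bins
print(f"bin width: {width:,.2f}")
print("bin of the minimum:", bin_index(lo, lo, hi, bins))
print("bin of the maximum:", bin_index(hi, lo, hi, bins))
print("bin of 4111113300:", bin_index(4_111_113_300, lo, hi, bins))
```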
Value                Count  Frequency (%)
4111113300           1      < 0.1%
3171039000           1      < 0.1%
4371037021           1      < 0.1%
4784038041           1      < 0.1%
4713025627           1      < 0.1%
2871034029           1      < 0.1%
4775032038           1      < 0.1%
4793035029           1      < 0.1%
4571036028           1      < 0.1%
4817036022           1      < 0.1%
Other values (9990)  9990   99.9%

Value                Count  Frequency (%)
1100000000           1      < 0.1%
1111010200           1      < 0.1%
1111010400           1      < 0.1%
1111010500           1      < 0.1%
1111011200           1      < 0.1%
1111011300           1      < 0.1%
1111011600           1      < 0.1%
1111011700           1      < 0.1%
1111011800           1      < 0.1%
1111012000           1      < 0.1%

Value                Count  Frequency (%)
4972032026           1      < 0.1%
4972032023           1      < 0.1%
4972032022           1      < 0.1%
4972032021           1      < 0.1%
4972032000           1      < 0.1%
4972031030           1      < 0.1%
4972031028           1      < 0.1%
4972031027           1      < 0.1%
4972031026           1      < 0.1%
4972031025           1      < 0.1%

시도
Categorical

HIGH CORRELATION 

Distinct: 17
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

경상북도           1647
전라남도           1456
경상남도           1174
충청남도           1154
경기도             1051
Other values (12)  3518

Length

Max length: 5
Median length: 4
Mean length: 3.9041
Min length: 3

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 경기도
2nd row: 경상북도
3rd row: 경기도
4th row: 경상북도
5th row: 충청북도

Common Values

Value             Count  Frequency (%)
경상북도          1647   16.5%
전라남도          1456   14.6%
경상남도          1174   11.7%
충청남도          1154   11.5%
경기도            1051   10.5%
전라북도          921    9.2%
충청북도          785    7.8%
강원도            759    7.6%
서울특별시        251    2.5%
대구광역시        147    1.5%
Other values (7)  655    6.6%

Length

Histogram of lengths of the category

구군
Text

Distinct: 237
Distinct (%): 2.4%
Missing: 9
Missing (%): 0.1%
Memory size: 156.2 KiB

Length

Max length: 6
Median length: 3
Mean length: 3.0715644
Min length: 2

Characters and Unicode

Total characters: 30688
Distinct characters: 142
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
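As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The report's labels correspond to the two-letter codes `Lo` (Other Letter), `Nd` (Decimal Number), `Ps` (Open Punctuation), and `Pe` (Close Punctuation). This is a minimal sketch, not the profiler's own implementation; `category_counts` is a hypothetical helper, the first input uses the five 구군 sample values shown in this report, and the second input is an invented string.

```python
import unicodedata
from collections import Counter
from typing import Iterable, Optional


def category_counts(values: Iterable[Optional[str]]) -> Counter:
    """Count characters per Unicode general category, skipping missing values."""
    counts: Counter = Counter()
    for value in values:
        if value is None:
            continue
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# The five 구군 sample values listed in this report (18 Hangul syllables).
gugun_sample = ["수원시장안구", "성주군", "화성시", "고령군", "제천시"]
print(category_counts(gugun_sample))

# Invented string mixing Hangul, a digit, and parentheses.
print(category_counts(["가1(나)", None]))
```

Script and block names (Hangul, Common, Han, ASCII, CJK) are not available from `unicodedata`; they would need a separate data source.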

Unique

Unique: 15
Unique (%): 0.2%

Sample

1st row: 수원시장안구
2nd row: 성주군
3rd row: 화성시
4th row: 고령군
5th row: 제천시

Value               Count  Frequency (%)
청원군              137    1.4%
중구                123    1.2%
고성군              121    1.2%
안동시              118    1.2%
상주시              115    1.2%
영천시              109    1.1%
공주시              108    1.1%
순천시              107    1.1%
나주시              106    1.1%
합천군              103    1.0%
Other values (227)  8844   88.5%

Most occurring characters

Value               Count  Frequency (%)
                    5250   17.1%
                    4154   13.5%
                    1297   4.2%
                    1292   4.2%
                    1211   3.9%
                    1004   3.3%
                    1003   3.3%
                    754    2.5%
                    649    2.1%
                    583    1.9%
Other values (132)  13491  44.0%

Most occurring categories

Value         Count  Frequency (%)
Other Letter  30688  100.0%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
                    5250   17.1%
                    4154   13.5%
                    1297   4.2%
                    1292   4.2%
                    1211   3.9%
                    1004   3.3%
                    1003   3.3%
                    754    2.5%
                    649    2.1%
                    583    1.9%
Other values (132)  13491  44.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  30688  100.0%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
                    5250   17.1%
                    4154   13.5%
                    1297   4.2%
                    1292   4.2%
                    1211   3.9%
                    1004   3.3%
                    1003   3.3%
                    754    2.5%
                    649    2.1%
                    583    1.9%
Other values (132)  13491  44.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  30688  100.0%

Most frequent character per block

Hangul
Value               Count  Frequency (%)
                    5250   17.1%
                    4154   13.5%
                    1297   4.2%
                    1292   4.2%
                    1211   3.9%
                    1004   3.3%
                    1003   3.3%
                    754    2.5%
                    649    2.1%
                    583    1.9%
Other values (132)  13491  44.0%

읍면동
Text

MISSING 

Distinct: 2659
Distinct (%): 26.9%
Missing: 129
Missing (%): 1.3%
Memory size: 156.2 KiB

Length

Max length: 6
Median length: 3
Mean length: 3.0183365
Min length: 2

Characters and Unicode

Total characters: 29794
Distinct characters: 342
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1267
Unique (%): 12.8%

Sample

1st row: 천천동
2nd row: 용암면
3rd row: 팔탄면
4th row: 운수면
5th row: 신백동

Value                Count  Frequency (%)
남면                 73     0.7%
서면                 72     0.7%
북면                 50     0.5%
금성면               33     0.3%
동면                 32     0.3%
대강면               24     0.2%
성산면               23     0.2%
봉산면               23     0.2%
대덕면               22     0.2%
대산면               21     0.2%
Other values (2649)  9498   96.2%

Most occurring characters

Value               Count  Frequency (%)
                    6683   22.4%
                    2175   7.3%
                    1460   4.9%
                    896    3.0%
                    523    1.8%
                    511    1.7%
                    454    1.5%
                    409    1.4%
                    382    1.3%
                    378    1.3%
Other values (332)  15923  53.4%

Most occurring categories

Value           Count  Frequency (%)
Other Letter    29591  99.3%
Decimal Number  203    0.7%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
                    6683   22.6%
                    2175   7.4%
                    1460   4.9%
                    896    3.0%
                    523    1.8%
                    511    1.7%
                    454    1.5%
                    409    1.4%
                    382    1.3%
                    378    1.3%
Other values (324)  15720  53.1%

Decimal Number
Value  Count  Frequency (%)
2      61     30.0%
1      59     29.1%
3      40     19.7%
4      24     11.8%
5      8      3.9%
6      7      3.4%
7      3      1.5%
8      1      0.5%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  29591  99.3%
Common  203    0.7%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
                    6683   22.6%
                    2175   7.4%
                    1460   4.9%
                    896    3.0%
                    523    1.8%
                    511    1.7%
                    454    1.5%
                    409    1.4%
                    382    1.3%
                    378    1.3%
Other values (324)  15720  53.1%

Common
Value  Count  Frequency (%)
2      61     30.0%
1      59     29.1%
3      40     19.7%
4      24     11.8%
5      8      3.9%
6      7      3.4%
7      3      1.5%
8      1      0.5%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  29591  99.3%
ASCII   203    0.7%

Most frequent character per block

Hangul
Value               Count  Frequency (%)
                    6683   22.6%
                    2175   7.4%
                    1460   4.9%
                    896    3.0%
                    523    1.8%
                    511    1.7%
                    454    1.5%
                    409    1.4%
                    382    1.3%
                    378    1.3%
Other values (324)  15720  53.1%

ASCII
Value  Count  Frequency (%)
2      61     30.0%
1      59     29.1%
3      40     19.7%
4      24     11.8%
5      8      3.9%
6      7      3.4%
7      3      1.5%
8      1      0.5%


Text

MISSING 

Distinct: 4364
Distinct (%): 58.5%
Missing: 2535
Missing (%): 25.4%
Memory size: 156.2 KiB
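The Distinct (%) figure here is only reproducible if the denominator is the number of non-missing rows rather than all 10,000 observations. A minimal sketch of that check, using the counts printed for this column (the Unique count is taken from the Unique block further down; `pct` is a hypothetical helper):

```python
def pct(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100 * part / whole


n_rows = 10_000
missing = 2_535
distinct = 4_364
unique = 3_073

non_missing = n_rows - missing
distinct_pct = pct(distinct, non_missing)
unique_pct = pct(unique, non_missing)
distinct_pct_all_rows = pct(distinct, n_rows)  # alternative denominator

print(f"non-missing rows: {non_missing}")
print(f"distinct (%) over non-missing rows: {distinct_pct:.1f}")
print(f"unique (%) over non-missing rows:   {unique_pct:.1f}")
print(f"distinct (%) over all rows:         {distinct_pct_all_rows:.1f}")
```

Only the non-missing denominator matches the 58.5% and 41.2% shown in this section; dividing by all rows gives a different figure.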

Length

Max length: 7
Median length: 3
Mean length: 2.9887475
Min length: 1

Characters and Unicode

Total characters: 22311
Distinct characters: 375
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3073
Unique (%): 41.2%

Sample

1st row: 대정리
2nd row: 이식리
3rd row: 상장리
4th row: 우용리
5th row: 송산리

Value                Count  Frequency (%)
대곡리               25     0.3%
신흥리               23     0.3%
금곡리               23     0.3%
신촌리               21     0.3%
용산리               20     0.3%
송정리               20     0.3%
동산리               18     0.2%
신월리               15     0.2%
봉산리               15     0.2%
오산리               15     0.2%
Other values (4354)  7270   97.4%

Most occurring characters

Value               Count  Frequency (%)
                    7479   33.5%
                    596    2.7%
                    485    2.2%
                    363    1.6%
                    345    1.5%
                    334    1.5%
                    314    1.4%
                    281    1.3%
                    260    1.2%
                    234    1.0%
Other values (365)  11620  52.1%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       22285  99.9%
Decimal Number     20     0.1%
Open Punctuation   3      < 0.1%
Close Punctuation  3      < 0.1%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
                    7479   33.6%
                    596    2.7%
                    485    2.2%
                    363    1.6%
                    345    1.5%
                    334    1.5%
                    314    1.4%
                    281    1.3%
                    260    1.2%
                    234    1.1%
Other values (354)  11594  52.0%

Decimal Number
Value  Count  Frequency (%)
1      8      40.0%
2      3      15.0%
6      2      10.0%
4      2      10.0%
7      1      5.0%
9      1      5.0%
0      1      5.0%
3      1      5.0%
8      1      5.0%

Open Punctuation
Value  Count  Frequency (%)
(      3      100.0%

Close Punctuation
Value  Count  Frequency (%)
)      3      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  22279  99.9%
Common  26     0.1%
Han     6      < 0.1%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
                    7479   33.6%
                    596    2.7%
                    485    2.2%
                    363    1.6%
                    345    1.5%
                    334    1.5%
                    314    1.4%
                    281    1.3%
                    260    1.2%
                    234    1.1%
Other values (349)  11588  52.0%

Common
Value  Count  Frequency (%)
1      8      30.8%
(      3      11.5%
)      3      11.5%
2      3      11.5%
6      2      7.7%
4      2      7.7%
7      1      3.8%
9      1      3.8%
0      1      3.8%
3      1      3.8%

Han
Value  Count  Frequency (%)
       2      33.3%
       1      16.7%
       1      16.7%
       1      16.7%
       1      16.7%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  22279  99.9%
ASCII   26     0.1%
CJK     6      < 0.1%

Most frequent character per block

Hangul
Value               Count  Frequency (%)
                    7479   33.6%
                    596    2.7%
                    485    2.2%
                    363    1.6%
                    345    1.5%
                    334    1.5%
                    314    1.4%
                    281    1.3%
                    260    1.2%
                    234    1.1%
Other values (349)  11588  52.0%

ASCII
Value  Count  Frequency (%)
1      8      30.8%
(      3      11.5%
)      3      11.5%
2      3      11.5%
6      2      7.7%
4      2      7.7%
7      1      3.8%
9      1      3.8%
0      1      3.8%
3      1      3.8%

CJK
Value  Count  Frequency (%)
       2      33.3%
       1      16.7%
       1      16.7%
       1      16.7%
       1      16.7%

Interactions


Correlations

          지역코드  시도
지역코드  1.000     0.990
시도      0.990     1.000

          지역코드  시도
지역코드  1.000     0.965
시도      0.965     1.000

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
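Nullity correlation can be read as the Pearson correlation between the "is present" indicators of two columns. This is a minimal pure-Python sketch of that idea, not the plotting library's implementation; `nullity_correlation` is a hypothetical helper and the two short columns are toy data that reuse a few values from this report.

```python
from math import sqrt
from typing import Optional, Sequence


def nullity_correlation(col_a: Sequence[Optional[str]],
                        col_b: Sequence[Optional[str]]) -> float:
    """Pearson correlation of the presence indicators (1 = present, 0 = missing)."""
    if len(col_a) != len(col_b) or not col_a:
        raise ValueError("columns must be non-empty and of equal length")
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is never (or always) missing has no nullity variance.
        return float("nan")
    return cov / sqrt(var_a * var_b)


# Toy columns: the second is present only where the first is present.
eupmyeondong = ["천천동", "용암면", None, "산내면", None, "청남면"]
ri = [None, None, None, "대정리", None, "상장리"]

print(nullity_correlation(eupmyeondong, ri))
```

A value near 1 means the two columns tend to be missing together, a value near -1 means one tends to be present when the other is missing, and a column with no missing values has no defined nullity correlation.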

Sample

(index)  지역코드    시도        구군          읍면동  (unnamed)
1979     4111113300  경기도      수원시장안구  천천동  <NA>
17401    4784032000  경상북도    성주군        용암면  <NA>
3397     4159036000  경기도      화성시        팔탄면  <NA>
17294    4783032000  경상북도    고령군        운수면  <NA>
5954     4315012000  충청북도    제천시        신백동  <NA>
10483    4519046023  전라북도    남원시        산내면  대정리
6533     4372040036  충청북도    보은군        산외면  이식리
9094     4479035031  충청남도    청양군        청남면  상장리
18450    4822010400  경상남도    통영시        항남동  <NA>
11139    4575040000  전라북도    임실군        덕치면  <NA>

(index)  지역코드    시도        구군          읍면동  (unnamed)
10435    4519039023  전라북도    남원시        덕과면  사율리
16247    4728025325  경상북도    문경시        가은읍  성저리
6949     4376031021  충청북도    괴산군        감물면  오성리
12932    4678039030  전라남도    보성군        회천면  천포리
18764    4827010600  경상남도    밀양시        용평동  <NA>
131      1114014200  서울특별시  중구          예장동  <NA>
1565     2920016600  광주광역시  광산구        삼도동  <NA>
16971    4776033035  경상북도    영양군        일월면  오리리
931      2726011000  대구광역시  수성구        파동    <NA>
6232     4371038000  충청북도    청원군        부용면  <NA>