Overview

Dataset statistics

Number of variables: 7
Number of observations: 10000
Missing cells: 6184
Missing cells (%): 8.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 654.3 KiB
Average record size in memory: 67.0 B
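
The two derived figures in this table can be checked from the raw counts. The sketch below is only an arithmetic check on the numbers reported above; the variable names are illustrative and are not part of the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 7
n_observations = 10_000
missing_cells = 6_184
total_size_kib = 654.3

# Every observation has one cell per variable.
total_cells = n_variables * n_observations

# Share of empty cells over the whole table, in percent.
missing_pct = 100 * missing_cells / total_cells

# Average bytes per row: total size (KiB -> bytes) divided by row count.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```

Both results round to the values shown in the table (8.8% and 67.0 B).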

Variable types

Numeric: 3
Categorical: 1
Text: 3

Dataset

Description: Basic-information database for the website's menus, members, and content.
Author: 국가평생교육진흥원 (National Institute for Lifelong Education)
URL: https://www.data.go.kr/data/15071863/fileData.do

Alerts

일련번호 is highly overall correlated with 시도 (High correlation)
우편번호1 is highly overall correlated with 시도 (High correlation)
시도 is highly overall correlated with 일련번호 and 1 other field (High correlation)
번지 has 6184 (61.8%) missing values (Missing)

Reproduction

Analysis started: 2023-12-12 15:41:48.051114
Analysis finished: 2023-12-12 15:41:50.380112
Duration: 2.33 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
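
The reported duration follows directly from the two timestamps. A minimal check, assuming both timestamps are in the same time zone (the report does not state one):

```python
from datetime import datetime

# Timestamp layout used in the "Reproduction" table.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-12 15:41:48.051114", FMT)
finished = datetime.strptime("2023-12-12 15:41:50.380112", FMT)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()

print(f"Duration: {duration:.2f} seconds")
```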

Variables

일련번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9954
Distinct (%): 99.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25672.814
Minimum: -744
Maximum: 51663
Zeros: 0
Zeros (%): 0.0%
Negative: 98
Negative (%): 1.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: -744
5-th percentile: 2147.8
Q1: 12638.5
Median: 25670
Q3: 38848.25
95-th percentile: 49059.05
Maximum: 51663
Range: 52407
Interquartile range (IQR): 26209.75
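
Range and IQR are plain differences of the quantiles listed above. The following sketch recomputes them for 일련번호 from the reported values; the dictionary keys are illustrative names only.

```python
# Quantiles of 일련번호 as reported in the table above.
quantiles = {
    "min": -744,
    "q1": 12638.5,
    "median": 25670,
    "q3": 38848.25,
    "max": 51663,
}

# Range spans the full extent of the data.
value_range = quantiles["max"] - quantiles["min"]

# IQR spans the middle 50% of the data.
iqr = quantiles["q3"] - quantiles["q1"]

print("Range:", value_range)
print("Interquartile range (IQR):", iqr)
```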

Descriptive statistics

Standard deviation: 15093.228
Coefficient of variation (CV): 0.58790703
Kurtosis: -1.2085683
Mean: 25672.814
Median Absolute Deviation (MAD): 13114
Skewness: -0.0053276874
Sum: 2.5672814 × 10^8
Variance: 2.2780552 × 10^8
Monotonicity: Not monotonic
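
Several of these statistics are tied together by simple identities: CV is the standard deviation divided by the mean, variance is the squared standard deviation, and the sum is the mean times the number of observations. The sketch below checks the 일련번호 figures against those identities. Because the inputs are the rounded values printed in the report, the results agree only up to rounding.

```python
import math

# Values reported for 일련번호 (rounded as printed in the report).
n = 10_000
mean = 25672.814
std = 15093.228

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the squared standard deviation
total = mean * n       # sum of all values

print(f"CV       ~ {cv:.6f}")
print(f"Variance ~ {variance:.4e}")
print(f"Sum      ~ {total:.7e}")

# Compare with the reported figures, allowing for rounding of the inputs.
assert math.isclose(cv, 0.58790703, abs_tol=1e-6)
assert math.isclose(variance, 2.2780552e8, rel_tol=1e-6)
assert math.isclose(total, 2.5672814e8, rel_tol=1e-9)
```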
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
-1 | 16 | 0.2%
-3 | 8 | 0.1%
-11 | 5 | 0.1%
-4 | 4 | < 0.1%
-2 | 4 | < 0.1%
-10 | 3 | < 0.1%
-5 | 3 | < 0.1%
-19 | 2 | < 0.1%
-114 | 2 | < 0.1%
-17 | 2 | < 0.1%
Other values (9944) | 9951 | 99.5%

Value | Count | Frequency (%)
-744 | 1 | < 0.1%
-512 | 1 | < 0.1%
-397 | 1 | < 0.1%
-355 | 1 | < 0.1%
-306 | 1 | < 0.1%
-303 | 1 | < 0.1%
-272 | 1 | < 0.1%
-214 | 1 | < 0.1%
-198 | 1 | < 0.1%
-173 | 1 | < 0.1%

Value | Count | Frequency (%)
51663 | 1 | < 0.1%
51662 | 1 | < 0.1%
51660 | 1 | < 0.1%
51657 | 1 | < 0.1%
51655 | 1 | < 0.1%
51640 | 1 | < 0.1%
51639 | 1 | < 0.1%
51636 | 1 | < 0.1%
51635 | 1 | < 0.1%
51632 | 1 | < 0.1%

우편번호1
Real number (ℝ)

HIGH CORRELATION 

Distinct: 257
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 460.919
Minimum: 100
Maximum: 799
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 100
5-th percentile: 133
Q1: 325
Median: 472
Q3: 618
95-th percentile: 750
Maximum: 799
Range: 699
Interquartile range (IQR): 293

Descriptive statistics

Standard deviation: 198.16191
Coefficient of variation (CV): 0.42992784
Kurtosis: -1.0220946
Mean: 460.919
Median Absolute Deviation (MAD): 146
Skewness: -0.27584476
Sum: 4609190
Variance: 39268.142
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
135 | 121 | 1.2%
706 | 104 | 1.0%
139 | 101 | 1.0%
704 | 94 | 0.9%
472 | 93 | 0.9%
151 | 90 | 0.9%
702 | 86 | 0.9%
560 | 81 | 0.8%
500 | 80 | 0.8%
730 | 79 | 0.8%
Other values (247) | 9071 | 90.7%

Value | Count | Frequency (%)
100 | 77 | 0.8%
110 | 76 | 0.8%
120 | 41 | 0.4%
121 | 63 | 0.6%
122 | 49 | 0.5%
130 | 49 | 0.5%
131 | 69 | 0.7%
132 | 51 | 0.5%
133 | 39 | 0.4%
134 | 51 | 0.5%

Value | Count | Frequency (%)
799 | 5 | 0.1%
791 | 54 | 0.5%
790 | 32 | 0.3%
780 | 63 | 0.6%
770 | 55 | 0.5%
769 | 50 | 0.5%
767 | 27 | 0.3%
766 | 30 | 0.3%
764 | 11 | 0.1%
763 | 24 | 0.2%

우편번호2
Real number (ℝ)

Distinct: 558
Distinct (%): 5.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 761.6673
Minimum: 3
Maximum: 998
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 3
5-th percentile: 131
Q1: 758
Median: 820
Q3: 861
95-th percentile: 933
Maximum: 998
Range: 995
Interquartile range (IQR): 103

Descriptive statistics

Standard deviation: 208.73582
Coefficient of variation (CV): 0.27405118
Kurtosis: 5.2297309
Mean: 761.6673
Median Absolute Deviation (MAD): 50
Skewness: -2.4557413
Sum: 7616673
Variance: 43570.642
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
831 | 137 | 1.4%
821 | 131 | 1.3%
822 | 127 | 1.3%
812 | 123 | 1.2%
851 | 122 | 1.2%
841 | 117 | 1.2%
861 | 117 | 1.2%
811 | 115 | 1.1%
802 | 115 | 1.1%
842 | 110 | 1.1%
Other values (548) | 8786 | 87.9%

Value | Count | Frequency (%)
3 | 1 | < 0.1%
4 | 1 | < 0.1%
10 | 37 | 0.4%
11 | 11 | 0.1%
12 | 9 | 0.1%
13 | 7 | 0.1%
14 | 5 | 0.1%
15 | 4 | < 0.1%
16 | 2 | < 0.1%
17 | 3 | < 0.1%

Value | Count | Frequency (%)
998 | 2 | < 0.1%
997 | 1 | < 0.1%
996 | 3 | < 0.1%
995 | 1 | < 0.1%
994 | 4 | < 0.1%
993 | 2 | < 0.1%
992 | 3 | < 0.1%
991 | 4 | < 0.1%
990 | 6 | 0.1%
989 | 4 | < 0.1%

시도
Categorical

HIGH CORRELATION 

Distinct: 16
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

경기: 1609
서울: 1569
경북: 935
전남: 755
경남: 726
Other values (11): 4406

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울
2nd row: 경기
3rd row: 광주
4th row: 광주
5th row: 전남

Common Values

Value | Count | Frequency (%)
경기 | 1609 | 16.1%
서울 | 1569 | 15.7%
경북 | 935 | 9.3%
전남 | 755 | 7.5%
경남 | 726 | 7.3%
부산 | 673 | 6.7%
충남 | 669 | 6.7%
대구 | 557 | 5.6%
전북 | 553 | 5.5%
강원 | 552 | 5.5%
Other values (6) | 1402 | 14.0%

Length

Histogram of lengths of the category

구군
Text

Distinct: 228
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 9
Median length: 3
Mean length: 3.4132
Min length: 2

Characters and Unicode

Total characters: 34132
Distinct characters: 141
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
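
As an illustration of these character properties, the sketch below uses Python's standard `unicodedata` module to count the Unicode general categories in one 구군 value. It only illustrates the idea and is not necessarily how the profiling tool computes its tables; the helper name `category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters in `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


# A 구군 value taken from the sample rows of this report.
sample = "부천시 오정구"
counts = category_counts(sample)

# 'Lo' is Other Letter (the Hangul syllables), 'Zs' is Space Separator.
for code, count in counts.items():
    print(code, count)
```

The two categories found here, Other Letter and Space Separator, are the same two that the report lists for this variable.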

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 종로구
2nd row: 화성시
3rd row: 광산구
4th row: 남구
5th row: 장흥군

Value | Count | Frequency (%)
북구 | 305 | 2.7%
남구 | 300 | 2.7%
중구 | 236 | 2.1%
서구 | 231 | 2.1%
동구 | 217 | 1.9%
창원시 | 187 | 1.7%
전주시 | 123 | 1.1%
강남구 | 121 | 1.1%
성남시 | 112 | 1.0%
고양시 | 107 | 1.0%
Other values (228) | 9244 | 82.7%

Most occurring characters

Value | Count | Frequency (%)
 | 4960 | 14.5%
 | 4046 | 11.9%
 | 2541 | 7.4%
(space) | 1183 | 3.5%
 | 1057 | 3.1%
 | 1024 | 3.0%
 | 928 | 2.7%
 | 838 | 2.5%
 | 827 | 2.4%
 | 780 | 2.3%
Other values (131) | 15948 | 46.7%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 32949 | 96.5%
Space Separator | 1183 | 3.5%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 4960 | 15.1%
 | 4046 | 12.3%
 | 2541 | 7.7%
 | 1057 | 3.2%
 | 1024 | 3.1%
 | 928 | 2.8%
 | 838 | 2.5%
 | 827 | 2.5%
 | 780 | 2.4%
 | 770 | 2.3%
Other values (130) | 15178 | 46.1%

Space Separator
Value | Count | Frequency (%)
(space) | 1183 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 32949 | 96.5%
Common | 1183 | 3.5%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 4960 | 15.1%
 | 4046 | 12.3%
 | 2541 | 7.7%
 | 1057 | 3.2%
 | 1024 | 3.1%
 | 928 | 2.8%
 | 838 | 2.5%
 | 827 | 2.5%
 | 780 | 2.4%
 | 770 | 2.3%
Other values (130) | 15178 | 46.1%

Common
Value | Count | Frequency (%)
(space) | 1183 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 32949 | 96.5%
ASCII | 1183 | 3.5%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 4960 | 15.1%
 | 4046 | 12.3%
 | 2541 | 7.7%
 | 1057 | 3.2%
 | 1024 | 3.1%
 | 928 | 2.8%
 | 838 | 2.5%
 | 827 | 2.5%
 | 780 | 2.4%
 | 770 | 2.3%
Other values (130) | 15178 | 46.1%

ASCII
Value | Count | Frequency (%)
(space) | 1183 | 100.0%

동리
Text

Distinct: 8139
Distinct (%): 81.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 27
Median length: 25
Mean length: 7.0811
Min length: 2

Characters and Unicode

Total characters: 70811
Distinct characters: 603
Distinct categories: 8
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7257
Unique (%): 72.6%

Sample

1st row: 인의동
2nd row: 서신면 전곡리
3rd row: 월계동 벽산아파트
4th row: 주월2동
5th row: 장흥읍 건산리

Value | Count | Frequency (%)
주공아파트 | 71 | 0.4%
사서함 | 56 | 0.3%
현대아파트 | 52 | 0.3%
남면 | 47 | 0.3%
서면 | 31 | 0.2%
중동 | 25 | 0.2%
서울중앙우체국사서함 | 23 | 0.1%
북면 | 22 | 0.1%
금곡동 | 22 | 0.1%
여의도동 | 20 | 0.1%
Other values (8107) | 15988 | 97.7%

Most occurring characters

Value | Count | Frequency (%)
 | 6643 | 9.4%
(space) | 6357 | 9.0%
 | 3810 | 5.4%
 | 3015 | 4.3%
 | 1970 | 2.8%
 | 1827 | 2.6%
 | 1766 | 2.5%
1 | 1379 | 1.9%
2 | 1283 | 1.8%
 | 1266 | 1.8%
Other values (593) | 41495 | 58.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 60036 | 84.8%
Space Separator | 6357 | 9.0%
Decimal Number | 4110 | 5.8%
Uppercase Letter | 181 | 0.3%
Other Punctuation | 59 | 0.1%
Close Punctuation | 31 | < 0.1%
Open Punctuation | 31 | < 0.1%
Lowercase Letter | 6 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
 | 6643 | 11.1%
 | 3810 | 6.3%
 | 3015 | 5.0%
 | 1970 | 3.3%
 | 1827 | 3.0%
 | 1766 | 2.9%
 | 1266 | 2.1%
 | 1035 | 1.7%
 | 978 | 1.6%
 | 795 | 1.3%
Other values (555) | 36931 | 61.5%

Uppercase Letter
Value | Count | Frequency (%)
S | 32 | 17.7%
K | 32 | 17.7%
T | 26 | 14.4%
G | 18 | 9.9%
L | 15 | 8.3%
I | 11 | 6.1%
B | 8 | 4.4%
C | 5 | 2.8%
N | 5 | 2.8%
A | 5 | 2.8%
Other values (12) | 24 | 13.3%

Decimal Number
Value | Count | Frequency (%)
1 | 1379 | 33.6%
2 | 1283 | 31.2%
3 | 690 | 16.8%
4 | 323 | 7.9%
5 | 143 | 3.5%
6 | 110 | 2.7%
7 | 75 | 1.8%
8 | 55 | 1.3%
9 | 38 | 0.9%
0 | 14 | 0.3%

Other Punctuation
Value | Count | Frequency (%)
. | 58 | 98.3%
& | 1 | 1.7%

Space Separator
Value | Count | Frequency (%)
(space) | 6357 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 31 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 31 | 100.0%

Lowercase Letter
Value | Count | Frequency (%)
e | 6 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 60036 | 84.8%
Common | 10588 | 15.0%
Latin | 187 | 0.3%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
 | 6643 | 11.1%
 | 3810 | 6.3%
 | 3015 | 5.0%
 | 1970 | 3.3%
 | 1827 | 3.0%
 | 1766 | 2.9%
 | 1266 | 2.1%
 | 1035 | 1.7%
 | 978 | 1.6%
 | 795 | 1.3%
Other values (555) | 36931 | 61.5%

Latin
Value | Count | Frequency (%)
S | 32 | 17.1%
K | 32 | 17.1%
T | 26 | 13.9%
G | 18 | 9.6%
L | 15 | 8.0%
I | 11 | 5.9%
B | 8 | 4.3%
e | 6 | 3.2%
C | 5 | 2.7%
N | 5 | 2.7%
Other values (13) | 29 | 15.5%

Common
Value | Count | Frequency (%)
(space) | 6357 | 60.0%
1 | 1379 | 13.0%
2 | 1283 | 12.1%
3 | 690 | 6.5%
4 | 323 | 3.1%
5 | 143 | 1.4%
6 | 110 | 1.0%
7 | 75 | 0.7%
. | 58 | 0.5%
8 | 55 | 0.5%
Other values (5) | 115 | 1.1%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 60036 | 84.8%
ASCII | 10775 | 15.2%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
 | 6643 | 11.1%
 | 3810 | 6.3%
 | 3015 | 5.0%
 | 1970 | 3.3%
 | 1827 | 3.0%
 | 1766 | 2.9%
 | 1266 | 2.1%
 | 1035 | 1.7%
 | 978 | 1.6%
 | 795 | 1.3%
Other values (555) | 36931 | 61.5%

ASCII
Value | Count | Frequency (%)
(space) | 6357 | 59.0%
1 | 1379 | 12.8%
2 | 1283 | 11.9%
3 | 690 | 6.4%
4 | 323 | 3.0%
5 | 143 | 1.3%
6 | 110 | 1.0%
7 | 75 | 0.7%
. | 58 | 0.5%
8 | 55 | 0.5%
Other values (28) | 302 | 2.8%

번지
Text

MISSING 

Distinct: 2883
Distinct (%): 75.6%
Missing: 6184
Missing (%): 61.8%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 11
Mean length: 7.2843291
Min length: 1

Characters and Unicode

Total characters: 27797
Distinct characters: 24
Distinct categories: 6
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
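
For 번지, these properties explain why the report lists a Math Symbol category and a Math Operators block: the ranges appear to be written with a tilde operator rather than the ASCII tilde. The sketch below assumes that separator is U+223C. The report itself names only the category and block, and the helpers `in_math_operators_block` and `split_range` are hypothetical names used for illustration.

```python
import unicodedata
from typing import Optional, Tuple

# Assumption: the range separator in 번지 is U+223C (TILDE OPERATOR),
# not the ASCII tilde U+007E.
SEPARATOR = "\u223c"


def in_math_operators_block(ch: str) -> bool:
    """True if `ch` lies in the Mathematical Operators block (U+2200..U+22FF)."""
    return 0x2200 <= ord(ch) <= 0x22FF


def split_range(value: str) -> Tuple[str, Optional[str]]:
    """Split a value such as '821∼940' or '(101∼103동)' at the separator.

    Surrounding parentheses are dropped. Returns (start, end), where end
    is None if the value contains no separator.
    """
    start, sep, end = value.strip("()").partition(SEPARATOR)
    return (start, end) if sep else (start, None)


print(unicodedata.name(SEPARATOR), unicodedata.category(SEPARATOR))
print(in_math_operators_block(SEPARATOR))
print(split_range("821\u223c940"))
print(split_range("(101\u223c103동)"))
```

Any cleaning step that searches for the ASCII character `~` would miss these ranges, so the separator is worth normalising before parsing.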

Unique

Unique: 2601
Unique (%): 68.2%
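
The percentages for 번지 use two different denominators. Missing (%) is taken over all 10000 observations, whereas Distinct (%), Unique (%), and Mean length match the reported values only when computed over the non-missing rows. The sketch below reproduces the figures under that reading, which is inferred from the numbers themselves rather than stated in the report.

```python
# Figures reported for 번지.
n_rows = 10_000
missing = 6_184
distinct = 2_883
unique = 2_601
total_chars = 27_797

# Rows that actually hold a value.
present = n_rows - missing

missing_pct = 100 * missing / n_rows      # over all rows
distinct_pct = 100 * distinct / present   # over non-missing rows
unique_pct = 100 * unique / present       # over non-missing rows
mean_length = total_chars / present       # characters per non-missing value

print(f"Non-missing rows: {present}")
print(f"Missing (%):  {missing_pct:.1f}%")
print(f"Distinct (%): {distinct_pct:.1f}%")
print(f"Unique (%):   {unique_pct:.1f}%")
print(f"Mean length:  {mean_length:.7f}")
```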

Sample

1st row: (101∼103동)
2nd row: 821∼940
3rd row: 198∼229
4th row: (101∼106동)
5th row: 79∼120

Value | Count | Frequency (%)
101∼106동 | 55 | 1.4%
101∼107동 | 50 | 1.3%
101∼108동 | 44 | 1.2%
101∼105동 | 39 | 1.0%
101∼103동 | 39 | 1.0%
101∼104동 | 34 | 0.9%
101∼109동 | 30 | 0.8%
101∼110동 | 27 | 0.7%
101∼112동 | 19 | 0.5%
101∼102동 | 16 | 0.4%
Other values (2873) | 3463 | 90.7%

Most occurring characters

Value | Count | Frequency (%)
1 | 5289 | 19.0%
0 | 3541 | 12.7%
∼ | 3400 | 12.2%
2 | 1967 | 7.1%
3 | 1781 | 6.4%
5 | 1571 | 5.7%
4 | 1555 | 5.6%
9 | 1494 | 5.4%
6 | 1466 | 5.3%
7 | 1311 | 4.7%
Other values (14) | 4422 | 15.9%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 21164 | 76.1%
Math Symbol | 3400 | 12.2%
Other Letter | 1158 | 4.2%
Open Punctuation | 1033 | 3.7%
Close Punctuation | 1033 | 3.7%
Uppercase Letter | 9 | < 0.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 5289 | 25.0%
0 | 3541 | 16.7%
2 | 1967 | 9.3%
3 | 1781 | 8.4%
5 | 1571 | 7.4%
4 | 1555 | 7.3%
9 | 1494 | 7.1%
6 | 1466 | 6.9%
7 | 1311 | 6.2%
8 | 1189 | 5.6%

Other Letter
Value | Count | Frequency (%)
 | 1033 | 89.2%
 | 112 | 9.7%
 | 6 | 0.5%
 | 2 | 0.2%
 | 2 | 0.2%
 | 2 | 0.2%
 | 1 | 0.1%

Uppercase Letter
Value | Count | Frequency (%)
A | 4 | 44.4%
B | 2 | 22.2%
C | 2 | 22.2%
E | 1 | 11.1%

Math Symbol
Value | Count | Frequency (%)
∼ | 3400 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1033 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1033 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 26630 | 95.8%
Hangul | 1158 | 4.2%
Latin | 9 | < 0.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 5289 | 19.9%
0 | 3541 | 13.3%
∼ | 3400 | 12.8%
2 | 1967 | 7.4%
3 | 1781 | 6.7%
5 | 1571 | 5.9%
4 | 1555 | 5.8%
9 | 1494 | 5.6%
6 | 1466 | 5.5%
7 | 1311 | 4.9%
Other values (3) | 3255 | 12.2%

Hangul
Value | Count | Frequency (%)
 | 1033 | 89.2%
 | 112 | 9.7%
 | 6 | 0.5%
 | 2 | 0.2%
 | 2 | 0.2%
 | 2 | 0.2%
 | 1 | 0.1%

Latin
Value | Count | Frequency (%)
A | 4 | 44.4%
B | 2 | 22.2%
C | 2 | 22.2%
E | 1 | 11.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 23239 | 83.6%
Math Operators | 3400 | 12.2%
Hangul | 1158 | 4.2%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 5289 | 22.8%
0 | 3541 | 15.2%
2 | 1967 | 8.5%
3 | 1781 | 7.7%
5 | 1571 | 6.8%
4 | 1555 | 6.7%
9 | 1494 | 6.4%
6 | 1466 | 6.3%
7 | 1311 | 5.6%
8 | 1189 | 5.1%
Other values (6) | 2075 | 8.9%

Math Operators
Value | Count | Frequency (%)
∼ | 3400 | 100.0%

Hangul
Value | Count | Frequency (%)
 | 1033 | 89.2%
 | 112 | 9.7%
 | 6 | 0.5%
 | 2 | 0.2%
 | 2 | 0.2%
 | 2 | 0.2%
 | 1 | 0.1%

Interactions


Correlations

Variable | 일련번호 | 우편번호1 | 우편번호2 | 시도
일련번호 | 1.000 | 0.946 | 0.355 | 0.950
우편번호1 | 0.946 | 1.000 | 0.313 | 0.969
우편번호2 | 0.355 | 0.313 | 1.000 | 0.311
시도 | 0.950 | 0.969 | 0.311 | 1.000

Variable | 일련번호 | 우편번호1 | 우편번호2 | 시도
일련번호 | 1.000 | 0.327 | 0.220 | 0.786
우편번호1 | 0.327 | 1.000 | 0.054 | 0.857
우편번호2 | 0.220 | 0.054 | 1.000 | 0.127
시도 | 0.786 | 0.857 | 0.127 | 1.000
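
The high-correlation alerts at the top of the report name exactly the pairs whose coefficient in the second matrix above is large. The sketch below lists the pairs at or above a cut-off. The cut-off of 0.5 is an assumption, since the report does not state the threshold it used; any value between 0.327 and 0.786 selects the same pairs.

```python
from itertools import combinations

# Second correlation matrix from the report, in row and column order.
columns = ["일련번호", "우편번호1", "우편번호2", "시도"]
matrix = [
    [1.000, 0.327, 0.220, 0.786],
    [0.327, 1.000, 0.054, 0.857],
    [0.220, 0.054, 1.000, 0.127],
    [0.786, 0.857, 0.127, 1.000],
]

# Assumed cut-off; the report does not state the threshold it applied.
THRESHOLD = 0.5

# Visit each unordered pair once (upper triangle of the matrix).
high_pairs = [
    (columns[i], columns[j], matrix[i][j])
    for i, j in combinations(range(len(columns)), 2)
    if abs(matrix[i][j]) >= THRESHOLD
]

for left, right, value in high_pairs:
    print(f"{left} ~ {right}: {value:.3f}")
```

With this cut-off the only pairs are 일련번호 with 시도 and 우편번호1 with 시도, which matches the three alerts.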

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

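Index | 일련번호 | 우편번호1 | 우편번호2 | 시도 | 구군 | 동리 | 번지
43414 | 7227 | 110 | 410 | 서울 | 종로구 | 인의동 | <NA>
21871 | 29638 | 445 | 883 | 경기 | 화성시 | 서신면 전곡리 | <NA>
35077 | 16149 | 506 | 707 | 광주 | 광산구 | 월계동 벽산아파트 | (101∼103동)
35020 | 16391 | 503 | 837 | 광주 | 남구 | 주월2동 | 821∼940
9597 | 42051 | 529 | 801 | 전남 | 장흥군 | 장흥읍 건산리 | <NA>
2828 | 48867 | 340 | 931 | 충남 | 예산군 | 고덕면 석곡3리 | <NA>
27118 | 24121 | 421 | 821 | 경기 | 부천시 오정구 | 원종2동 | 198∼229
20931 | 30641 | 621 | 705 | 경남 | 김해시 | 내동 김해등기소 | <NA>
35433 | 15915 | 409 | 912 | 인천 | 옹진군 | 백령면 북포리 | <NA>
1845 | 49781 | 360 | 737 | 충북 | 청주시 상당구 | 용암2동 용암소라아파트 | (101∼106동)

Index | 일련번호 | 우편번호1 | 우편번호2 | 시도 | 구군 | 동리 | 번지
14705 | 36807 | 790 | 906 | 경북 | 포항시 남구 | 오천읍 세계2리 | <NA>
35159 | 16231 | 503 | 871 | 광주 | 남구 | 노대동 송화마을휴먼시아2단지아파트 | (201∼208동)
20337 | 31260 | 626 | 725 | 경남 | 양산시 | 덕계동 경보아파트 | (A∼B동)
694 | 51086 | 373 | 812 | 충북 | 옥천군 | 안내면 동대리 | <NA>
12245 | 39519 | 530 | 836 | 전남 | 목포시 | 용당1동 | 1057∼1095
36025 | 15295 | 403 | 758 | 인천 | 부평구 | 삼산1동 주공미래타운5단지 | (501∼509동)
15320 | 36175 | 750 | 907 | 경북 | 영주시 | 영주1동 | 334∼392
5749 | 45982 | 695 | 911 | 제주 | 제주시 | 애월읍 수산리 | <NA>
17712 | 33736 | 676 | 871 | 경남 | 함양군 | 백전면 경백리 | <NA>
38889 | 12267 | 704 | 831 | 대구 | 달서구 | 월성2동 | 1077∼1089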
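
우편번호1 and 우편번호2 look like the two three-digit halves of a six-digit postal code, stored as numbers. That would explain why 우편번호2 has a minimum of 3: a leading-zero value such as 003 loses its zeros when read as a number. The sketch below rebuilds a zero-padded code under that assumption. The report does not describe the columns, so the interpretation, the hyphenated layout, and the helper name `format_postcode` are all hypothetical.

```python
def format_postcode(part1: int, part2: int) -> str:
    """Join two numeric halves into a zero-padded 'NNN-NNN' string.

    Assumes each half is a three-digit group whose leading zeros may
    have been dropped when the column was parsed as a number.
    """
    if not (0 <= part1 <= 999 and 0 <= part2 <= 999):
        raise ValueError("each half must be between 0 and 999")
    return f"{part1:03d}-{part2:03d}"


# Halves from the 종로구 sample row (우편번호1 = 110, 우편번호2 = 410).
print(format_postcode(110, 410))

# The reported minima of the two columns (100 and 3), paired here only
# to show the padding; they need not occur in the same row.
print(format_postcode(100, 3))
```

Keeping the halves as zero-padded strings rather than numbers would also stop the profiler from reporting means and quantiles for what is really an identifier.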