Overview

Dataset statistics

Number of variables: 4
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 419.9 KiB
Average record size in memory: 43.0 B
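
The two memory figures are consistent with each other: dividing the total size by the number of observations gives the average record size. A minimal check, assuming 1 KiB = 1024 bytes (the variable names are illustrative and not part of the report):

```python
# Values copied from the "Dataset statistics" figures above.
total_kib = 419.9          # Total size in memory, in KiB
n_observations = 10_000    # Number of observations
n_variables = 4            # Number of variables
missing_cells = 0          # Missing cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_kib * 1024 / n_observations
# Share of missing cells over all cells (rows x columns).
missing_pct = 100 * missing_cells / (n_observations * n_variables)

print(f"Average record size: {avg_record_bytes:.1f} B")
print(f"Missing cells: {missing_pct:.1f}%")
```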

Variable types

Numeric: 2
Categorical: 1
Text: 1

Dataset

Description: Gyeonggi-do Gyeonggi Statistics System dimension information
Author: Gyeonggi-do (경기도)
URL: https://data.gg.go.kr/portal/data/service/selectServicePage.do?&infId=Y3KQ0RNVT4RZK0SJ7Q1R33499065&infSeq=1

Alerts

조직번호 has constant value "210" (Constant)
표항목인식번호 is highly overall correlated with 최종변경일 (High correlation)
최종변경일 is highly overall correlated with 표항목인식번호 (High correlation)

Reproduction

Analysis started: 2023-12-10 22:30:38.325642
Analysis finished: 2023-12-10 22:30:39.062766
Duration: 0.74 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

표항목인식번호 (table item identification number)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 7307
Distinct (%): 73.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 38319.243
Minimum: 1
Maximum: 512846
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 87
Q1: 735
Median: 4045
Q3: 65766.75
95-th percentile: 197232.6
Maximum: 512846
Range: 512845
Interquartile range (IQR): 65031.75

Descriptive statistics

Standard deviation: 60813.479
Coefficient of variation (CV): 1.5870219
Kurtosis: 3.3696824
Mean: 38319.243
Median Absolute Deviation (MAD): 3881
Skewness: 1.8965717
Sum: 3.8319243 × 10^8
Variance: 3.6982793 × 10^9
Monotonicity: Not monotonic
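
Several of the figures above follow from the others by simple arithmetic: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the square of the standard deviation. A short sketch that recomputes them from the reported (rounded) values, so small rounding differences are expected:

```python
# Values copied from the quantile and descriptive statistics above.
minimum, maximum = 1, 512846
q1, q3 = 735, 65766.75
mean, std = 38319.243, 60813.479

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance, from the rounded standard deviation

print(value_range, iqr, round(cv, 7), f"{variance:.4e}")
```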
Histogram with fixed size bins (bins=50)
Common values

Value                  Count   Frequency (%)
74                     10      0.1%
48                     10      0.1%
55                     10      0.1%
101                    10      0.1%
112                    10      0.1%
19                     10      0.1%
27                     10      0.1%
14                     10      0.1%
23                     9       0.1%
110                    9       0.1%
Other values (7297)    9902    99.0%
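
The "Other values" row can be derived from the rows shown and the variable's totals: the distinct count minus the ten listed values, and the number of observations minus their combined count. A minimal sketch of that arithmetic, using only numbers from this report (the variable names are illustrative):

```python
# Rows shown in the common-values table above: (value, count).
top_rows = [(74, 10), (48, 10), (55, 10), (101, 10), (112, 10),
            (19, 10), (27, 10), (14, 10), (23, 9), (110, 9)]
n_observations = 10_000   # Number of observations
n_distinct = 7307         # Distinct values of this variable

shown_count = sum(count for _, count in top_rows)
other_distinct = n_distinct - len(top_rows)    # distinct values not listed
other_count = n_observations - shown_count     # rows holding those values
other_pct = 100 * other_count / n_observations

print(f"Other values ({other_distinct}) {other_count} {other_pct:.1f}%")
```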
Minimum 10 values

Value    Count   Frequency (%)
1        8       0.1%
2        5       0.1%
3        4       < 0.1%
4        4       < 0.1%
5        9       0.1%
6        7       0.1%
7        9       0.1%
8        2       < 0.1%
9        3       < 0.1%
10       6       0.1%
Maximum 10 values

Value     Count   Frequency (%)
512846    1       < 0.1%
512841    1       < 0.1%
512840    1       < 0.1%
431284    1       < 0.1%
431280    1       < 0.1%
431279    1       < 0.1%
202858    1       < 0.1%
202845    1       < 0.1%
202840    1       < 0.1%
202833    1       < 0.1%

조직번호 (organization number)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 210
2nd row: 210
3rd row: 210
4th row: 210
5th row: 210

Common Values

Value    Count    Frequency (%)
210      10000    100.0%

Length

Histogram of lengths of the category

Common Values (Plot)


통계표ID (statistical table ID)
Text

Distinct: 253
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 19
Median length: 11
Mean length: 11.9806
Min length: 10

Characters and Unicode

Total characters: 119806
Distinct characters: 29
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
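
As an illustration of how such character properties can be read, the sketch below uses Python's standard `unicodedata` module to count characters by Unicode general category for the five sample rows listed for this variable. It is not the report's own implementation, and the helper name `category_counts` is made up for this example.

```python
import unicodedata
from collections import Counter

# The five sample rows listed in this variable's Sample section.
samples = ["DT_1B00003_BK", "DT_1K00003_BK", "DT_1L00004",
           "DT_210J0047", "DT_210J0047"]

def category_counts(text: str) -> Counter:
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)

totals = Counter()
for s in samples:
    totals += category_counts(s)

# 'Lu' = Uppercase Letter, 'Nd' = Decimal Number, 'Pc' = Connector Punctuation
print(dict(totals))
```

The three category codes that appear correspond to the three categories the report lists for this variable.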

Unique

Unique: 28
Unique (%): 0.3%

Sample

1st row: DT_1B00003_BK
2nd row: DT_1K00003_BK
3rd row: DT_1L00004
4th row: DT_210J0047
5th row: DT_210J0047

Value                 Count   Frequency (%)
dt_210j0047           3855    38.6%
dt_210j0045           526     5.3%
dt_1b00003_bk         403     4.0%
dt_210j0044           361     3.6%
dt_1p00035            205     2.1%
dt_210j0038           157     1.6%
dt_1h00001_1_bk       131     1.3%
dt_21002_k001         129     1.3%
dt_21002_k026         106     1.1%
dt_21002_l010a        101     1.0%
Other values (243)    4026    40.3%

Most occurring characters

Value                Count    Frequency (%)
0                    32045    26.7%
_                    14131    11.8%
1                    13561    11.3%
T                    10336    8.6%
2                    10152    8.5%
D                    10079    8.4%
4                    6936     5.8%
J                    5220     4.4%
7                    4298     3.6%
K                    2662     2.2%
Other values (19)    10386    8.7%

Most occurring categories

Value                    Count    Frequency (%)
Decimal Number           71262    59.5%
Uppercase Letter         34413    28.7%
Connector Punctuation    14131    11.8%

Most frequent character per category

Uppercase Letter

Value               Count    Frequency (%)
T                   10336    30.0%
D                   10079    29.3%
J                   5220     15.2%
K                   2662     7.7%
B                   2471     7.2%
A                   801      2.3%
P                   527      1.5%
M                   513      1.5%
E                   442      1.3%
I                   349      1.0%
Other values (8)    1013     2.9%

Decimal Number

Value    Count    Frequency (%)
0        32045    45.0%
1        13561    19.0%
2        10152    14.2%
4        6936     9.7%
7        4298     6.0%
3        2068     2.9%
5        1205     1.7%
6        364      0.5%
8        349      0.5%
9        284      0.4%

Connector Punctuation

Value    Count    Frequency (%)
_        14131    100.0%

Most occurring scripts

Value     Count    Frequency (%)
Common    85393    71.3%
Latin     34413    28.7%

Most frequent character per script

Latin

Value               Count    Frequency (%)
T                   10336    30.0%
D                   10079    29.3%
J                   5220     15.2%
K                   2662     7.7%
B                   2471     7.2%
A                   801      2.3%
P                   527      1.5%
M                   513      1.5%
E                   442      1.3%
I                   349      1.0%
Other values (8)    1013     2.9%

Common

Value    Count    Frequency (%)
0        32045    37.5%
_        14131    16.5%
1        13561    15.9%
2        10152    11.9%
4        6936     8.1%
7        4298     5.0%
3        2068     2.4%
5        1205     1.4%
6        364      0.4%
8        349      0.4%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    119806    100.0%

Most frequent character per block

ASCII

Value                Count    Frequency (%)
0                    32045    26.7%
_                    14131    11.8%
1                    13561    11.3%
T                    10336    8.6%
2                    10152    8.5%
D                    10079    8.4%
4                    6936     5.8%
J                    5220     4.4%
7                    4298     3.6%
K                    2662     2.2%
Other values (19)    10386    8.7%

최종변경일 (last modified date)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 69
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20165729
Minimum: 20150625
Maximum: 20230616
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 20150625
5-th percentile: 20150625
Q1: 20150722
Median: 20170424
Q3: 20170510
95-th percentile: 20210223
Maximum: 20230616
Range: 79991
Interquartile range (IQR): 19788

Descriptive statistics

Standard deviation: 15422.032
Coefficient of variation (CV): 0.00076476441
Kurtosis: 4.9535042
Mean: 20165729
Median Absolute Deviation (MAD): 86
Skewness: 1.7859807
Sum: 2.0165729 × 10^11
Variance: 2.3783907 × 10^8
Monotonicity: Not monotonic
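
The values of 최종변경일 look like calendar dates written as YYYYMMDD integers. This is an assumption based on the column name ("last modified date") and on values such as 20150625; the report itself treats the column as a plain real number. If the assumption holds, the arithmetic statistics above (Range, IQR, mean, variance) describe the integer encoding rather than elapsed time. A minimal sketch of the difference, with `parse_yyyymmdd` as a hypothetical helper:

```python
from datetime import date, datetime

def parse_yyyymmdd(value: int) -> date:
    """Interpret an integer such as 20150625 as the date 2015-06-25."""
    return datetime.strptime(str(value), "%Y%m%d").date()

# Minimum and maximum copied from the statistics above.
first = parse_yyyymmdd(20150625)
last = parse_yyyymmdd(20230616)

numeric_range = 20230616 - 20150625   # what the report lists as "Range"
calendar_days = (last - first).days   # elapsed days between the two dates

print(first, last, numeric_range, calendar_days)
```

Converting the column to a date type before profiling would make the summary statistics interpretable as time spans.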
Histogram with fixed size bins (bins=50)
Common values

Value                Count   Frequency (%)
20170510             3855    38.6%
20170416             1044    10.4%
20150625             734     7.3%
20150626             644     6.4%
20170504             429     4.3%
20150629             408     4.1%
20150727             292     2.9%
20150728             177     1.8%
20150724             174     1.7%
20150722             173     1.7%
Other values (59)    2070    20.7%
Minimum 10 values

Value       Count   Frequency (%)
20150625    734     7.3%
20150626    644     6.4%
20150629    408     4.1%
20150702    101     1.0%
20150703    69      0.7%
20150706    10      0.1%
20150707    20      0.2%
20150708    14      0.1%
20150709    45      0.4%
20150714    19      0.2%
Maximum 10 values

Value       Count   Frequency (%)
20230616    85      0.9%
20230203    25      0.2%
20221110    1       < 0.1%
20221109    74      0.7%
20221006    32      0.3%
20220328    42      0.4%
20210602    9       0.1%
20210318    8       0.1%
20210316    76      0.8%
20210311    129     1.3%

Interactions


Correlations

                  표항목인식번호    최종변경일
표항목인식번호    1.000             0.726
최종변경일        0.726             1.000

                  표항목인식번호    최종변경일
표항목인식번호    1.000             0.645
최종변경일        0.645             1.000
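
The two matrices above are not labelled with the coefficient used. Purely as an illustration of how two correlation measures can differ on the same pair of columns, the sketch below computes the Pearson coefficient and the Spearman rank coefficient by hand. The choice of these two coefficients is an assumption, the function names are made up for this example, and the five data points are hypothetical, not rows from the dataset.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Hypothetical values: monotonically related, but far from linear.
ids = [10, 200, 3000, 40000, 500000]
dates = [20150625, 20150626, 20170510, 20210311, 20230616]

print(round(pearson(ids, dates), 3), round(spearman(ids, dates), 3))
```

On monotone but non-linear data like this, the rank-based coefficient reaches 1 while the Pearson coefficient stays below it, which is one reason two matrices for the same columns can show different values.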

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Row index and 표항목인식번호 (run together)    조직번호    통계표ID           최종변경일
6547210603                                    210         DT_1B00003_BK      20150629
70685706                                      210         DT_1K00003_BK      20150625
978522348                                     210         DT_1L00004         20150724
2543685261                                    210         DT_210J0047        20170510
2324476628                                    210         DT_210J0047        20170510
820423802                                     210         DT_1P00035         20150727
3714457554                                    210         DT_210J0047        20170510
31827964                                      210         DT_210J0038        20170416
816961744                                     210         DT_1I00114         20150730
67364265                                      210         DT_1M00009_4_BK    20150625

Row index and 표항목인식번호 (run together)    조직번호    통계표ID           최종변경일
6401654                                       210         DT_1B00004_BK      20150626
851062691                                     210         DT_21002_K001      20210311
2009860014                                    210         DT_210J0047        20170510
7348133                                       210         DT_1M00013_BK      20150626
925692107                                     210         DT_1P00035         20150727
88824485                                      210         DT_1L00004         20150724
3389876081                                    210         DT_210J0047        20170510
97687616                                      210         DT_21002H006A      20150703
8758735                                       210         DT_1C00009_1       20150721
1646258850                                    210         DT_210J0047        20170510