Dataset statistics
Number of variables | 7 |
---|---|
Number of observations | 209 |
Missing cells | 10 |
Missing cells (%) | 0.7% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 12.6 KiB |
Average record size in memory | 61.6 B |
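The missing-cell percentage follows from the counts in the table above: 7 variables times 209 observations gives 1,463 cells, of which 10 are missing. A minimal Python sketch of that arithmetic (the function name is illustrative and not part of ydata-profiling):

```python
def missing_cell_pct(n_missing: int, n_vars: int, n_obs: int) -> float:
    """Share of missing cells, in percent, over all cells in the table."""
    total_cells = n_vars * n_obs
    return 100.0 * n_missing / total_cells

# Counts taken from the "Dataset statistics" table.
pct = missing_cell_pct(n_missing=10, n_vars=7, n_obs=209)
print(f"{pct:.1f}%")  # prints "0.7%"
```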
Variable types
Categorical | 4 |
---|---|
Text | 1 |
Numeric | 2 |
Dataset
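Description | Statistical table recording points (통계표수록지점) extracted from the Gyeonggi Statistics System (경기통계시스템), Gyeonggi-do |
---|---|
Author | Gyeonggi-do (경기도) |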
URL | https://data.gg.go.kr/portal/data/service/selectServicePage.do?&infId=TCCJ2MYDU86J86GP1V1133521570&infSeq=1 |
조직번호 has constant value "210" | Constant |
공표구분 is highly overall correlated with 수록시점 and 3 other fields | High correlation |
수집유형 is highly overall correlated with 수록시점 and 3 other fields | High correlation |
주기구분 is highly overall correlated with 수록시점 and 2 other fields | High correlation |
수록시점 is highly overall correlated with 주기구분 and 2 other fields | High correlation |
최종수정일 is highly overall correlated with 수집유형 and 1 other field | High correlation |
수집유형 is highly imbalanced (83.7%) | Imbalance |
공표구분 is highly imbalanced (83.7%) | Imbalance |
최종수정일 has 10 (4.8%) missing values | Missing |
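The 83.7% imbalance reported for 수집유형 and 공표구분 is consistent with an entropy-based score, 1 - H / log2(k), applied to the class counts 204 and 5 listed for each of those variables. The sketch below assumes that formula. The function name is illustrative, and the exact ydata-profiling implementation should be checked against its source.

```python
import math

def imbalance_score(counts: list[int]) -> float:
    """Return 1 - H / log2(k), where H is the Shannon entropy (base 2)
    of the class distribution and k is the number of classes.
    0 means perfectly balanced; values near 1 mean one class dominates."""
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class is treated as not imbalanced
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

score = imbalance_score([204, 5])
print(f"{score:.1%}")  # prints "83.7%"
```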
Reproduction
Analysis started | 2023-12-10 21:16:15.164808 |
---|---|
Analysis finished | 2023-12-10 21:16:15.801203 |
Duration | 0.64 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
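A report like this one can be regenerated with ydata-profiling's `ProfileReport`. The sketch below is hedged: the CSV path, output path, and title are hypothetical placeholders, the source is assumed to be a CSV export of the dataset at the URL above, and the settings in the original config.json are not reproduced.

```python
def build_report(csv_path: str, out_path: str = "report.html") -> None:
    """Profile a CSV file and write an HTML report."""
    # Imported inside the function so this module loads even where
    # pandas or ydata-profiling is not installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Dataset profile")  # hypothetical title
    profile.to_file(out_path)

# Example call (the path is a placeholder, not the original file name):
# build_report("gyeonggi_stat_tables.csv")
```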
조직번호 (organization number)
Categorical
CONSTANT
Distinct | 1 |
---|---|
Distinct (%) | 0.5% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.8 KiB |
Length
Max length | 3 |
---|---|
Median length | 3 |
Mean length | 3 |
Min length | 3 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 210 |
---|---|
2nd row | 210 |
3rd row | 210 |
4th row | 210 |
5th row | 210 |
Common Values
Value | Count | Frequency (%) |
210 | 209 | 100.0% |
통계표 테이블 ID (statistical table ID)
Text
Distinct | 112 |
---|---|
Distinct (%) | 53.6% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.8 KiB |
Length
Max length | 23 |
---|---|
Median length | 19 |
Mean length | 13.736842 |
Min length | 11 |
Characters and Unicode
Total characters | 2871 |
---|---|
Distinct characters | 30 |
Distinct categories | 3 |
Distinct scripts | 2 |
Distinct blocks | 1 |
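The category counts in this section correspond to Unicode general categories. As an illustration, the sketch below tallies categories for one sample ID using Python's standard `unicodedata` module. It is not the profiler's own code, and the helper name is made up for this example.

```python
import unicodedata
from collections import Counter

def category_counts(text: str) -> Counter:
    """Count characters per Unicode general category (e.g. Lu, Nd, Pc)."""
    return Counter(unicodedata.category(ch) for ch in text)

counts = category_counts("DT_21002_M023")
# Lu = Uppercase Letter, Nd = Decimal Number, Pc = Connector Punctuation
print(dict(counts))
```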
Unique
Unique | 91 |
---|---|
Unique (%) | 43.5% |
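The report separates distinct values (how many different values occur) from unique values (how many occur exactly once), which is why 통계표 테이블 ID has 112 distinct but only 91 unique values. A small sketch of that distinction, using a few IDs from the sample rows as toy input:

```python
from collections import Counter

def distinct_and_unique(values: list[str]) -> tuple[int, int]:
    """Return (distinct, unique): distinct counts different values,
    unique counts values that appear exactly once."""
    freq = Counter(values)
    distinct = len(freq)
    unique = sum(1 for n in freq.values() if n == 1)
    return distinct, unique

ids = ["DT_21002_M023", "DT_21002_M023", "DT_21002_N001", "DT_2021057_2_1"]
print(distinct_and_unique(ids))  # prints "(3, 2)"
```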
Sample
1st row | DT_21002_M023 |
---|---|
2nd row | DT_21002_M023 |
3rd row | DT_21002_M023 |
4th row | DT_21002_M023 |
5th row | DT_21002_M023 |
Value | Count | Frequency (%) |
dt_21002a004_1 | 28 | 13.4% |
dt_21002_k010 | 9 | 4.3% |
dt_21002_m023 | 8 | 3.8% |
dt_21002_p001 | 8 | 3.8% |
dt_statm_0008 | 8 | 3.8% |
dt_statm_0021 | 8 | 3.8% |
dt_21002_j010 | 7 | 3.3% |
dt_2020037_005 | 5 | 2.4% |
dt_21002_j001_1 | 5 | 2.4% |
dt_2020037_006 | 5 | 2.4% |
Other values (102) | 118 | 56.5% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 761 | 26.5% |
_ | 450 | 15.7% |
2 | 401 | 14.0% |
1 | 327 | 11.4% |
T | 241 | 8.4% |
D | 214 | 7.5% |
7 | 73 | 2.5% |
4 | 48 | 1.7% |
5 | 48 | 1.7% |
A | 46 | 1.6% |
Other values (20) | 262 | 9.1% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 1754 | 61.1% |
Uppercase Letter | 667 | 23.2% |
Connector Punctuation | 450 | 15.7% |
Most frequent character per category
Uppercase Letter
Value | Count | Frequency (%) |
T | 241 | 36.1% |
D | 214 | 32.1% |
A | 46 | 6.9% |
M | 36 | 5.4% |
K | 20 | 3.0% |
S | 16 | 2.4% |
J | 16 | 2.4% |
P | 15 | 2.2% |
N | 13 | 1.9% |
E | 9 | 1.3% |
Other values (9) | 41 | 6.1% |
Decimal Number
Value | Count | Frequency (%) |
0 | 761 | 43.4% |
2 | 401 | 22.9% |
1 | 327 | 18.6% |
7 | 73 | 4.2% |
4 | 48 | 2.7% |
5 | 48 | 2.7% |
3 | 42 | 2.4% |
8 | 23 | 1.3% |
6 | 20 | 1.1% |
9 | 11 | 0.6% |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 450 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 2204 | 76.8% |
Latin | 667 | 23.2% |
Most frequent character per script
Latin
Value | Count | Frequency (%) |
T | 241 | 36.1% |
D | 214 | 32.1% |
A | 46 | 6.9% |
M | 36 | 5.4% |
K | 20 | 3.0% |
S | 16 | 2.4% |
J | 16 | 2.4% |
P | 15 | 2.2% |
N | 13 | 1.9% |
E | 9 | 1.3% |
Other values (9) | 41 | 6.1% |
Common
Value | Count | Frequency (%) |
0 | 761 | 34.5% |
_ | 450 | 20.4% |
2 | 401 | 18.2% |
1 | 327 | 14.8% |
7 | 73 | 3.3% |
4 | 48 | 2.2% |
5 | 48 | 2.2% |
3 | 42 | 1.9% |
8 | 23 | 1.0% |
6 | 20 | 0.9% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 2871 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 761 | 26.5% |
_ | 450 | 15.7% |
2 | 401 | 14.0% |
1 | 327 | 11.4% |
T | 241 | 8.4% |
D | 214 | 7.5% |
7 | 73 | 2.5% |
4 | 48 | 1.7% |
5 | 48 | 1.7% |
A | 46 | 1.6% |
Other values (20) | 262 | 9.1% |
주기구분 (period type)
Categorical
HIGH CORRELATION
Distinct | 5 |
---|---|
Distinct (%) | 2.4% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.8 KiB |
Length
Max length | 1 |
---|---|
Median length | 1 |
Mean length | 1 |
Min length | 1 |
Unique
Unique | 1 |
---|---|
Unique (%) | 0.5% |
Sample
1st row | Y |
---|---|
2nd row | Y |
3rd row | Y |
4th row | Y |
5th row | Y |
Common Values
Value | Count | Frequency (%) |
Y | 115 | 55.0% |
F | 48 | 23.0% |
M | 42 | 20.1% |
Q | 3 | 1.4% |
H | 1 | 0.5% |
수록시점 (recording time point)
Real number (ℝ)
HIGH CORRELATION
Distinct | 61 |
---|---|
Distinct (%) | 29.2% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 45934.081 |
Minimum | 2001 |
---|---|
Maximum | 202211 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.0 KiB |
Quantile statistics
Minimum | 2001 |
---|---|
5-th percentile | 2004.4 |
Q1 | 2018 |
median | 2020 |
Q3 | 2021 |
95-th percentile | 202011.6 |
Maximum | 202211 |
Range | 200210 |
Interquartile range (IQR) | 3 |
Descriptive statistics
Standard deviation | 82869.759 |
---|---|
Coefficient of variation (CV) | 1.8041018 |
Kurtosis | -0.14914667 |
Mean | 45934.081 |
Median Absolute Deviation (MAD) | 2 |
Skewness | 1.3609996 |
Sum | 9600223 |
Variance | 6.867397 × 10^9 |
Monotonicity | Not monotonic |
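Several of the statistics reported for this variable are derived from others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below rechecks those relationships from the reported figures. Small differences are expected because the inputs are rounded.

```python
# Figures copied from the 수록시점 statistics in this report.
minimum, maximum = 2001, 202211
q1, q3 = 2018, 2021
mean, std = 45934.081, 82869.759

value_range = maximum - minimum   # 200210
iqr = q3 - q1                     # 3
cv = std / mean                   # about 1.8041
variance = std ** 2               # about 6.8674e9

print(value_range, iqr, round(cv, 4), f"{variance:.4e}")
```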
Common Values
Value | Count | Frequency (%) |
2020 | 48 | 23.0% |
2021 | 24 | 11.5% |
2018 | 24 | 11.5% |
2019 | 13 | 6.2% |
2010 | 5 | 2.4% |
2011 | 5 | 2.4% |
2022 | 5 | 2.4% |
2017 | 5 | 2.4% |
2005 | 4 | 1.9% |
2004 | 4 | 1.9% |
Other values (51) | 72 | 34.4% |
Minimum 10 values
Value | Count | Frequency (%) |
2001 | 2 | 1.0% |
2002 | 2 | 1.0% |
2003 | 3 | 1.4% |
2004 | 4 | 1.9% |
2005 | 4 | 1.9% |
2006 | 1 | 0.5% |
2007 | 2 | 1.0% |
2008 | 4 | 1.9% |
2009 | 3 | 1.4% |
2010 | 5 | 2.4% |
Maximum 10 values
Value | Count | Frequency (%) |
202211 | 2 | 1.0% |
202210 | 2 | 1.0% |
202209 | 2 | 1.0% |
202208 | 2 | 1.0% |
202207 | 2 | 1.0% |
202012 | 1 | 0.5% |
202011 | 1 | 0.5% |
202010 | 1 | 0.5% |
202009 | 1 | 0.5% |
202008 | 1 | 0.5% |
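The extreme values suggest that 수록시점 mixes two encodings: four-digit years (2001 to 2022) and six-digit codes that look like YYYYMM (202008 to 202211). That would explain why the mean (45934.081) and standard deviation are far from any individual year. Assuming that reading is correct, a hypothetical helper could split the codes into year and month before analysis. Quarterly and half-yearly rows (주기구분 Q and H) may use another layout that is not visible in this report, so other code lengths are rejected rather than guessed.

```python
from typing import Optional, Tuple

def parse_period(code: int) -> Tuple[int, Optional[int]]:
    """Split a period code into (year, month).
    Assumes 4-digit codes are YYYY and 6-digit codes are YYYYMM."""
    text = str(code)
    if len(text) == 4:
        return int(text), None
    if len(text) == 6:
        year, month = int(text[:4]), int(text[4:])
        if not 1 <= month <= 12:
            raise ValueError(f"month out of range in {code}")
        return year, month
    raise ValueError(f"unrecognised period code: {code}")

print(parse_period(2020))    # prints "(2020, None)"
print(parse_period(202211))  # prints "(2022, 11)"
```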
수집유형 (collection type)
Categorical
HIGH CORRELATION
IMBALANCE
Distinct | 2 |
---|---|
Distinct (%) | 1.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.8 KiB |
Length
Max length | 7 |
---|---|
Median length | 7 |
Mean length | 6.9282297 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 1212713 |
---|---|
2nd row | 1212713 |
3rd row | 1212713 |
4th row | 1212713 |
5th row | 1212713 |
Common Values
Value | Count | Frequency (%) |
1212713 | 204 | 97.6% |
<NA> | 5 | 2.4% |
공표구분 (publication type)
Categorical
HIGH CORRELATION
IMBALANCE
Distinct | 2 |
---|---|
Distinct (%) | 1.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 1.8 KiB |
Length
Max length | 7 |
---|---|
Median length | 7 |
Mean length | 6.9282297 |
Min length | 4 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 1210110 |
---|---|
2nd row | 1210110 |
3rd row | 1210110 |
4th row | 1210110 |
5th row | 1210110 |
Common Values
Value | Count | Frequency (%) |
1210110 | 204 | 97.6% |
<NA> | 5 | 2.4% |
최종수정일 (last modified date)
Real number (ℝ)
HIGH CORRELATION
MISSING
Distinct | 46 |
---|---|
Distinct (%) | 23.1% |
Missing | 10 |
Missing (%) | 4.8% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 20205068 |
Minimum | 20131231 |
---|---|
Maximum | 20230412 |
Zeros | 0 |
Zeros (%) | 0.0% |
Negative | 0 |
Negative (%) | 0.0% |
Memory size | 2.0 KiB |
Quantile statistics
Minimum | 20131231 |
---|---|
5-th percentile | 20131231 |
Q1 | 20200509 |
median | 20220603 |
Q3 | 20221168 |
95-th percentile | 20230406 |
Maximum | 20230412 |
Range | 99181 |
Interquartile range (IQR) | 20659 |
Descriptive statistics
Standard deviation | 31744.737 |
---|---|
Coefficient of variation (CV) | 0.0015711274 |
Kurtosis | 1.3334663 |
Mean | 20205068 |
Median Absolute Deviation (MAD) | 9720 |
Skewness | -1.6709603 |
Sum | 4.0208086 × 10^9 |
Variance | 1.0077283 × 10^9 |
Monotonicity | Not monotonic |
Common Values
Value | Count | Frequency (%) |
20131231 | 28 | 13.4% |
20220603 | 18 | 8.6% |
20230406 | 17 | 8.1% |
20200509 | 12 | 5.7% |
20220602 | 11 | 5.3% |
20211221 | 9 | 4.3% |
20230323 | 9 | 4.3% |
20220919 | 8 | 3.8% |
20200508 | 8 | 3.8% |
20221212 | 7 | 3.3% |
Other values (36) | 72 | 34.4% |
(Missing) | 10 | 4.8% |
Minimum 10 values
Value | Count | Frequency (%) |
20131231 | 28 | 13.4% |
20170216 | 2 | 1.0% |
20190417 | 1 | 0.5% |
20200508 | 8 | 3.8% |
20200509 | 12 | 5.7% |
20200511 | 2 | 1.0% |
20200515 | 1 | 0.5% |
20200623 | 1 | 0.5% |
20200831 | 6 | 2.9% |
20201113 | 2 | 1.0% |
Maximum 10 values
Value | Count | Frequency (%) |
20230412 | 1 | 0.5% |
20230406 | 17 | 8.1% |
20230329 | 4 | 1.9% |
20230328 | 1 | 0.5% |
20230323 | 9 | 4.3% |
20230103 | 1 | 0.5% |
20221220 | 1 | 0.5% |
20221219 | 1 | 0.5% |
20221213 | 1 | 0.5% |
20221212 | 7 | 3.3% |
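최종수정일 is profiled as a real number, but its values look like dates written as YYYYMMDD integers. Treating them as numbers yields interpolated quantiles such as Q3 = 20221168, which is not a valid calendar date. Under that assumption, a sketch for converting the integers to dates before profiling (the helper name is illustrative):

```python
from datetime import date
from typing import Optional

def parse_yyyymmdd(value: int) -> Optional[date]:
    """Convert an integer such as 20230412 to a date.
    Returns None when the digits do not form a valid calendar date."""
    year, rest = divmod(value, 10000)
    month, day = divmod(rest, 100)
    try:
        return date(year, month, day)
    except ValueError:
        return None

print(parse_yyyymmdd(20230412))  # prints "2023-04-12"
print(parse_yyyymmdd(20221168))  # prints "None"
```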
Variable | 주기구분 | 수록시점 | 최종수정일 |
---|---|---|---|
주기구분 | 1.000 | 1.000 | 0.553 |
수록시점 | 1.000 | 1.000 | 0.766 |
최종수정일 | 0.553 | 0.766 | 1.000 |
Variable | 공표구분 | 수집유형 | 주기구분 |
---|---|---|---|
공표구분 | 1.000 | 1.000 | 1.000 |
수집유형 | 1.000 | 1.000 | 1.000 |
주기구분 | 1.000 | 1.000 | 1.000 |
Variable | 수록시점 | 최종수정일 | 주기구분 | 수집유형 | 공표구분 |
---|---|---|---|---|---|
수록시점 | 1.000 | 0.025 | 0.993 | 1.000 | 1.000 |
최종수정일 | 0.025 | 1.000 | 0.495 | 1.000 | 1.000 |
주기구분 | 0.993 | 0.495 | 1.000 | 1.000 | 1.000 |
수집유형 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
공표구분 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
First rows
Index | 조직번호 | 통계표 테이블 ID | 주기구분 | 수록시점 | 수집유형 | 공표구분 | 최종수정일 |
---|---|---|---|---|---|---|---|
0 | 210 | DT_21002_M023 | Y | 2005 | 1212713 | 1210110 | 20221102 |
1 | 210 | DT_21002_M023 | Y | 2008 | 1212713 | 1210110 | 20221102 |
2 | 210 | DT_21002_M023 | Y | 2010 | 1212713 | 1210110 | 20221103 |
3 | 210 | DT_21002_M023 | Y | 2011 | 1212713 | 1210110 | 20221103 |
4 | 210 | DT_21002_M023 | Y | 2019 | 1212713 | 1210110 | 20221210 |
5 | 210 | DT_21002_M023 | Y | 2020 | 1212713 | 1210110 | 20221210 |
6 | 210 | DT_21002_N001 | Y | 2008 | 1212713 | 1210110 | 20221104 |
7 | 210 | DT_2021057_2_1 | F | 2021 | 1212713 | 1210110 | 20230406 |
8 | 210 | DT_21002_N001 | Y | 2019 | 1212713 | 1210110 | 20221104 |
9 | 210 | DT_21002_N001 | Y | 2020 | 1212713 | 1210110 | 20221210 |
Last rows
Index | 조직번호 | 통계표 테이블 ID | 주기구분 | 수록시점 | 수집유형 | 공표구분 | 최종수정일 |
---|---|---|---|---|---|---|---|
199 | 210 | DT_21002_J001_1 | Y | 2009 | 1212713 | 1210110 | 20220919 |
200 | 210 | DT_21002_J001_1 | Y | 2010 | 1212713 | 1210110 | 20220919 |
201 | 210 | DT_21002_O005 | Y | 2018 | 1212713 | 1210110 | 20221212 |
202 | 210 | DT_21002_J001_1 | Y | 2011 | 1212713 | 1210110 | 20220919 |
203 | 210 | DT_21002_J001_1 | Y | 2012 | 1212713 | 1210110 | 20220919 |
204 | 210 | DT_21002_J001_1 | Y | 2013 | 1212713 | 1210110 | 20220919 |
205 | 210 | DT_21002_O005 | Y | 2019 | 1212713 | 1210110 | 20221212 |
206 | 210 | DT_21002_O005 | Y | 2020 | 1212713 | 1210110 | 20221212 |
207 | 210 | DT_21002_M023 | Y | 2003 | 1212713 | 1210110 | 20221102 |
208 | 210 | DT_21002_M023 | Y | 2004 | 1212713 | 1210110 | 20221102 |