Overview

Dataset statistics

Number of variables: 4
Number of observations: 500
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 17.2 KiB
Average record size in memory: 35.3 B

Variable types

Categorical: 2
Numeric: 2

Dataset

Description: Sample data
Author: Statistics Korea (통계청)
URL: https://bigdata.seoul.go.kr/data/selectSampleData.do?sample_data_seq=35

Alerts

년도구분(YYYY) (year) has constant value "2010" [Constant]
가구_수(STA_CNT) (number of households) has 101 (20.2%) zeros [Zeros]
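
Both alerts follow from simple counts over a column. As a minimal sketch (not ydata-profiling's own implementation), the hypothetical helper `column_alerts` below flags a column as constant when it holds a single distinct value and reports the share of zeros. The 10% zero threshold is an assumption chosen for illustration, and the toy columns only mimic the counts reported above.

```python
from collections import Counter


def column_alerts(values, zero_threshold=0.10):
    """Return simple alert strings for one column (illustrative only).

    zero_threshold is an assumed cutoff, not a documented default.
    """
    alerts = []
    n = len(values)
    counts = Counter(values)

    # Constant: every row holds the same value.
    if len(counts) == 1:
        only_value = next(iter(counts))
        alerts.append(f'has constant value "{only_value}"')

    # Zeros: share of entries that are numerically zero.
    n_zeros = sum(1 for v in values if isinstance(v, (int, float)) and v == 0)
    if n and n_zeros / n > zero_threshold:
        alerts.append(f"has {n_zeros} ({n_zeros / n:.1%}) zeros")

    return alerts


# Toy columns mimicking the reported counts (500 rows, 101 zeros).
year = ["2010"] * 500
households = [0] * 101 + [1] * 399

print(column_alerts(year))        # constant alert
print(column_alerts(households))  # zeros alert
```

The zero share is just 101 / 500 = 20.2%, which matches the figure in the alert.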

Reproduction

Analysis started: 2023-12-10 14:50:15.346795
Analysis finished: 2023-12-10 14:50:17.221558
Duration: 1.87 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

년도구분(YYYY) (year)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Top categories: 2010 (500)

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2010
2nd row: 2010
3rd row: 2010
4th row: 2010
5th row: 2010

Common Values

Value   Count   Frequency (%)
2010    500     100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

집계구코드(OA_CD) (census output area code)
Real number (ℝ)

Distinct: 499
Distinct (%): 99.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.6252838 × 10^12
Minimum: 1.101055 × 10^12
Maximum: 3.902012 × 10^12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 1.101055 × 10^12
5th percentile: 1.108085 × 10^12
Q1: 2.1078045 × 10^12
Median: 3.1036085 × 10^12
Q3: 3.4012513 × 10^12
95th percentile: 3.8080872 × 10^12
Maximum: 3.902012 × 10^12
Range: 2.800957 × 10^12
Interquartile range (IQR): 1.2934468 × 10^12

Descriptive statistics

Standard deviation: 9.4262879 × 10^11
Coefficient of variation (CV): 0.35905786
Kurtosis: -1.1006594
Mean: 2.6252838 × 10^12
Median Absolute Deviation (MAD): 6.0255151 × 10^11
Skewness: -0.52642271
Sum: 1.3126419 × 10^15
Variance: 8.8854903 × 10^23
Monotonicity: Not monotonic
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value                Count   Frequency (%)
3105260021500        2       0.4%
3102357010200        1       0.2%
3439037010003        1       0.2%
3739012010800        1       0.2%
3738011011000        1       0.2%
1123078010301        1       0.2%
2103065020100        1       0.2%
1110054010301        1       0.2%
3117055030700        1       0.2%
3103051021400        1       0.2%
Other values (489)   489     97.8%

Minimum 10 values

Value           Count   Frequency (%)
1101055010005   1       0.2%
1101056020012   1       0.2%
1103063020008   1       0.2%
1103070010006   1       0.2%
1104055020007   1       0.2%
1104070010017   1       0.2%
1105058010008   1       0.2%
1105064020003   1       0.2%
1106071020003   1       0.2%
1106080040001   1       0.2%

Maximum 10 values

Value           Count   Frequency (%)
3902012020003   1       0.2%
3901055040001   1       0.2%
3901052070009   1       0.2%
3901014030004   1       0.2%
3840039020100   1       0.2%
3839011010600   1       0.2%
3837038010600   1       0.2%
3835011020500   1       0.2%
3834011020700   1       0.2%
3833039010100   1       0.2%

가구종류_구분코드(STA_CD) (household type code)
Categorical

Distinct: 25
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Top categories:
TO_GA_001           40
GA_SD_001           36
GA_CO_003           30
GA_PO_002           28
GA_CO_010           27
Other values (20)   339

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Unique

Unique: 1
Unique (%): 0.2%

Sample

1st row: GA_CO_002
2nd row: TO_GA_001
3rd row: TO_GA_001
4th row: GA_CO_002
5th row: GA_SD_005

Common Values

Value               Count   Frequency (%)
TO_GA_001           40      8.0%
GA_SD_001           36      7.2%
GA_CO_003           30      6.0%
GA_PO_002           28      5.6%
GA_CO_010           27      5.4%
GA_SD_003           26      5.2%
GA_SD_005           25      5.0%
TO_GA_002           23      4.6%
GA_CO_009           23      4.6%
GA_PO_005           22      4.4%
Other values (15)   220     44.0%

Length

[Figure: histogram of lengths of the category]

Value frequencies (lowercased)

Value               Count   Frequency (%)
to_ga_001           40      8.0%
ga_sd_001           36      7.2%
ga_co_003           30      6.0%
ga_po_002           28      5.6%
ga_co_010           27      5.4%
ga_sd_003           26      5.2%
ga_sd_005           25      5.0%
to_ga_002           23      4.6%
ga_co_009           23      4.6%
ga_po_005           22      4.4%
Other values (15)   220     44.0%

가구_수(STA_CNT) (number of households)
Real number (ℝ)

ZEROS

Distinct: 175
Distinct (%): 35.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 62.218
Minimum: 0
Maximum: 508
Zeros: 101
Zeros (%): 20.2%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 3
Median: 28
Q3: 101
95th percentile: 209.35
Maximum: 508
Range: 508
Interquartile range (IQR): 98

Descriptive statistics

Standard deviation: 76.133087
Coefficient of variation (CV): 1.2236505
Kurtosis: 4.3711757
Mean: 62.218
Median Absolute Deviation (MAD): 28
Skewness: 1.7752546
Sum: 31109
Variance: 5796.247
Monotonicity: Not monotonic
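
Several of these figures are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the square of the standard deviation, the interquartile range is Q3 minus Q1, and the mean is the sum divided by the number of observations. The short check below uses only the values reported above (the variable names are illustrative) and confirms that the rounded figures agree with one another.

```python
import math

# Figures copied from the report for 가구_수(STA_CNT).
mean = 62.218
std = 76.133087
variance = 5796.247
cv = 1.2236505
q1, q3, iqr = 3, 101, 98
total, n_rows = 31109, 500

# Coefficient of variation = standard deviation / mean.
assert math.isclose(std / mean, cv, rel_tol=1e-6)

# Variance = standard deviation squared.
assert math.isclose(std ** 2, variance, rel_tol=1e-6)

# Interquartile range = Q3 - Q1; mean = sum / number of rows.
assert q3 - q1 == iqr
assert math.isclose(total / n_rows, mean)

print("reported statistics are mutually consistent")
```

A CV above 1 together with a median (28) well below the mean (62.218) is consistent with the positive skewness reported for this variable.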
[Figure: histogram with fixed size bins (bins=50)]

Common values

Value                Count   Frequency (%)
0                    101     20.2%
3                    20      4.0%
7                    14      2.8%
6                    12      2.4%
2                    9       1.8%
8                    9       1.8%
12                   9       1.8%
18                   7       1.4%
24                   6       1.2%
11                   6       1.2%
Other values (165)   307     61.4%

Minimum 10 values

Value   Count   Frequency (%)
0       101     20.2%
1       1       0.2%
2       9       1.8%
3       20      4.0%
4       2       0.4%
6       12      2.4%
7       14      2.8%
8       9       1.8%
9       3       0.6%
10      4       0.8%

Maximum 10 values

Value   Count   Frequency (%)
508     1       0.2%
465     1       0.2%
357     1       0.2%
332     1       0.2%
321     1       0.2%
318     1       0.2%
308     1       0.2%
293     1       0.2%
274     1       0.2%
269     1       0.2%

Interactions

[Figures: four interaction plots; images not preserved]

Correlations

[Figure: correlation heatmap]

                           집계구코드(OA_CD)   가구종류_구분코드(STA_CD)   가구_수(STA_CNT)
집계구코드(OA_CD)            1.000              0.000                      0.126
가구종류_구분코드(STA_CD)    0.000              1.000                      0.000
가구_수(STA_CNT)             0.126              0.000                      1.000

[Figure: correlation heatmap]

                           집계구코드(OA_CD)   가구_수(STA_CNT)   가구종류_구분코드(STA_CD)
집계구코드(OA_CD)            1.000              0.054              0.000
가구_수(STA_CNT)             0.054              1.000              0.000
가구종류_구분코드(STA_CD)    0.000              0.000              1.000

Missing values

[Figure: a simple visualization of nullity by column.]
[Figure: nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

First rows

Index   년도구분(YYYY)   집계구코드(OA_CD)   가구종류_구분코드(STA_CD)   가구_수(STA_CNT)
0       2010             3102357010200      GA_CO_002                  7
1       2010             2404061030015      TO_GA_001                  293
2       2010             1119076012301      TO_GA_001                  18
3       2010             3122053031200      GA_CO_002                  42
4       2010             3808036010200      GA_SD_005                  7
5       2010             2102069030100      GA_PO_002                  0
6       2010             3704059020300      GA_CO_004                  0
7       2010             1116055020010      GA_SD_003                  47
8       2010             3401155060801      GA_SD_002                  11
9       2010             2304058020600      GA_SD_006                  7

Last rows

Index   년도구분(YYYY)   집계구코드(OA_CD)   가구종류_구분코드(STA_CD)   가구_수(STA_CNT)
490     2010             3811255020700      GA_PO_004                  0
491     2010             1108065020303      GA_CO_010                  127
492     2010             3110356020400      GA_SD_001                  152
493     2010             2108055040600      GA_PO_004                  0
494     2010             1119061020001      GA_PO_002                  83
495     2010             3107063020700      GA_SD_002                  2
496     2010             3206051040300      GA_PO_006                  90
497     2010             1116053010007      GA_SD_006                  36
498     2010             3811255030200      GA_CO_003                  207
499     2010             3102160011300      TO_GA_002                  4