Overview

Dataset statistics

Number of variables: 5
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 4.3 KiB
Average record size in memory: 44.3 B

Variable types

Numeric: 3
Text: 2

Dataset

Description: Sample data
Author: Statistics Korea (통계청)
URL: https://bigdata.seoul.go.kr/data/selectSampleData.do?sample_data_seq=35

Alerts

집계구코드(CENSUS_AREA_CD) has unique values [Unique]
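
This alert means that each of the 100 observations carries a different census area code, as expected for an identifier column. A minimal sketch of such a check in plain Python follows, using only the ten smallest codes listed in this report. The helper name `is_unique` is hypothetical and is not part of ydata-profiling.

```python
def is_unique(values):
    """Return True when no value occurs more than once."""
    values = list(values)
    return len(set(values)) == len(values)


# Ten smallest census area codes listed in this report, kept as strings so
# the 13-digit identifiers are not displayed in scientific notation.
codes = [
    "1101071010008", "1107057010202", "1107062080012", "1109060010002",
    "1111064030001", "1111079060101", "1112072010401", "1115061020004",
    "1120054030003", "1120055020002",
]

print(is_unique(codes))               # True
print(is_unique(codes + codes[:1]))   # False: one code repeated
```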

Reproduction

Analysis started: 2023-12-10 14:50:26.837969
Analysis finished: 2023-12-10 14:50:28.842479
Duration: 2 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

집계구코드(CENSUS_AREA_CD)
Real number (ℝ)

UNIQUE 

Distinct: 100
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.7763943 × 10^12
Minimum: 1.101071 × 10^12
Maximum: 3.832037 × 10^12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1.101071 × 10^12
5-th percentile: 1.1110783 × 10^12
Q1: 2.2060715 × 10^12
Median: 3.1053545 × 10^12
Q3: 3.4014528 × 10^12
95-th percentile: 3.8051636 × 10^12
Maximum: 3.832037 × 10^12
Range: 2.730966 × 10^12
Interquartile range (IQR): 1.1953813 × 10^12

Descriptive statistics

Standard deviation: 8.4061555 × 10^11
Coefficient of variation (CV): 0.3027724
Kurtosis: -0.36794929
Mean: 2.7763943 × 10^12
Median Absolute Deviation (MAD): 5.9579048 × 10^11
Skewness: -0.82930846
Sum: 2.7763943 × 10^14
Variance: 7.066345 × 10^23
Monotonicity: Not monotonic
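
Several of the figures above are derived from one another, which allows a quick consistency check. The sketch below uses the rounded values printed in this report, so agreement holds only to the printed precision. The variable names are illustrative.

```python
import math

# Values as printed in this report for 집계구코드(CENSUS_AREA_CD).
mean = 2.7763943e12
std = 8.4061555e11
minimum, maximum = 1.101071e12, 3.832037e12
q1, q3 = 2.2060715e12, 3.4014528e12
n = 100

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range
total = mean * n                 # sum of all observations

print(f"CV       = {cv:.7f}")
print(f"Variance = {variance:.6e}")
print(f"Range    = {value_range:.6e}")
print(f"IQR      = {iqr:.7e}")
print(f"Sum      = {total:.7e}")
```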
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
3701263010800 | 1 | 1.0%
3432033010004 | 1 | 1.0%
3105352031700 | 1 | 1.0%
2303067010500 | 1 | 1.0%
3107059050200 | 1 | 1.0%
3402040010005 | 1 | 1.0%
3811557021200 | 1 | 1.0%
2404057030019 | 1 | 1.0%
2206067031200 | 1 | 1.0%
3105356042200 | 1 | 1.0%
Other values (90) | 90 | 90.0%

Minimum 10 values
Value | Count | Frequency (%)
1101071010008 | 1 | 1.0%
1107057010202 | 1 | 1.0%
1107062080012 | 1 | 1.0%
1109060010002 | 1 | 1.0%
1111064030001 | 1 | 1.0%
1111079060101 | 1 | 1.0%
1112072010401 | 1 | 1.0%
1115061020004 | 1 | 1.0%
1120054030003 | 1 | 1.0%
1120055020002 | 1 | 1.0%

Maximum 10 values
Value | Count | Frequency (%)
3832037021600 | 1 | 1.0%
3811557021200 | 1 | 1.0%
3811364030300 | 1 | 1.0%
3811251051200 | 1 | 1.0%
3807055021000 | 1 | 1.0%
3805064030600 | 1 | 1.0%
3803071023900 | 1 | 1.0%
3742036010300 | 1 | 1.0%
3710035021300 | 1 | 1.0%
3706039010200 | 1 | 1.0%

시군구코드(SIGNGU_CD)
Real number (ℝ)

Distinct: 71
Distinct (%): 71.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 34033.59
Minimum: 11170
Maximum: 50110
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 11170
5-th percentile: 11465.5
Q1: 26320
Median: 41133.5
Q3: 44180
95-th percentile: 48131.25
Maximum: 50110
Range: 38940
Interquartile range (IQR): 17860
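
The fractional 5-th percentile (11465.5) is consistent with linear interpolation between neighbouring ranks. The sketch below reproduces it from the smallest values and counts listed in this variable's extreme-value table. The function name `quantile_linear` is illustrative, and because the report does not list all 100 values, the remainder are padded with the reported maximum as a placeholder, which does not affect the 5-th percentile.

```python
def quantile_linear(sorted_values, q):
    """Quantile with linear interpolation between the two nearest ranks."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


# Smallest values with their counts, as listed in this report.
head = [11170] + [11230] + [11305] * 2 + [11380] + [11470] + [11500] * 3

# Placeholder padding (assumption): the remaining values are not listed,
# so they are filled with the reported maximum, 50110.
values = sorted(head + [50110] * (100 - len(head)))

p5 = quantile_linear(values, 0.05)
print(round(p5, 2))  # 11465.5
```

With 100 observations the 5-th percentile sits at rank position 0.05 × 99 = 4.95, that is, 95% of the way from 11380 to 11470.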

Descriptive statistics

Standard deviation: 12421.633
Coefficient of variation (CV): 0.36498157
Kurtosis: -0.83342186
Mean: 34033.59
Median Absolute Deviation (MAD): 7054
Skewness: -0.67208126
Sum: 3403359
Variance: 1.5429697 × 10^8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
11710 | 4 | 4.0%
45111 | 4 | 4.0%
26320 | 3 | 3.0%
48250 | 3 | 3.0%
11500 | 3 | 3.0%
28170 | 2 | 2.0%
27260 | 2 | 2.0%
47130 | 2 | 2.0%
41480 | 2 | 2.0%
41360 | 2 | 2.0%
Other values (61) | 73 | 73.0%

Minimum 10 values
Value | Count | Frequency (%)
11170 | 1 | 1.0%
11230 | 1 | 1.0%
11305 | 2 | 2.0%
11380 | 1 | 1.0%
11470 | 1 | 1.0%
11500 | 3 | 3.0%
11545 | 1 | 1.0%
11560 | 2 | 2.0%
11710 | 4 | 4.0%
26110 | 2 | 2.0%

Maximum 10 values
Value | Count | Frequency (%)
50110 | 2 | 2.0%
48250 | 3 | 3.0%
48125 | 1 | 1.0%
48121 | 1 | 1.0%
47750 | 1 | 1.0%
47150 | 1 | 1.0%
47130 | 2 | 2.0%
46910 | 1 | 1.0%
46150 | 2 | 2.0%
46130 | 2 | 2.0%

시군구명(SIGNGU_NM)
Text

Distinct: 71
Distinct (%): 71.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 4
Median length: 3
Mean length: 2.95
Min length: 2

Characters and Unicode

Total characters: 295
Distinct characters: 76
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 52
Unique (%): 52.0%

Sample

1st row: 영등포구
2nd row: 남구
3rd row: 달서구
4th row: 강릉시
5th row: 아산시

Value | Count | Frequency (%)
북구 | 5 | 5.0%
부산진구 | 3 | 3.0%
남구 | 3 | 3.0%
중구 | 3 | 3.0%
창원시 | 3 | 3.0%
용인시 | 3 | 3.0%
영등포구 | 3 | 3.0%
남동구 | 3 | 3.0%
제주시 | 2 | 2.0%
서구 | 2 | 2.0%
Other values (61) | 70 | 70.0%

Most occurring characters

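Value | Count | Frequency (%)
(lost) | 53 | 18.0%
(lost) | 44 | 14.9%
(lost) | 9 | 3.1%
(lost) | 8 | 2.7%
(lost) | 8 | 2.7%
(lost) | 8 | 2.7%
(lost) | 8 | 2.7%
(lost) | 8 | 2.7%
(lost) | 7 | 2.4%
(lost) | 7 | 2.4%
Other values (66) | 135 | 45.8%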

Most occurring categories

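Value | Count | Frequency (%)
Other Letter | 295 | 100.0%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(lost) | 53 | 18.0%
(lost) | 44 | 14.9%
(lost) | 9 | 3.1%
(lost) | 8 | 2.7%
(lost) | 8 | 2.7%
(lost) | 8 | 2.7%
(lost) | 8 | 2.7%
(lost) | 8 | 2.7%
(lost) | 7 | 2.4%
(lost) | 7 | 2.4%
Other values (66) | 135 | 45.8%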

Most occurring scripts

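Value | Count | Frequency (%)
Hangul | 295 | 100.0%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(lost) | 53 | 18.0%
(lost) | 44 | 14.9%
(lost) | 9 | 3.1%
(lost) | 8 | 2.7%
(lost) | 8 | 2.7%
(lost) | 8 | 2.7%
(lost) | 8 | 2.7%
(lost) | 8 | 2.7%
(lost) | 7 | 2.4%
(lost) | 7 | 2.4%
Other values (66) | 135 | 45.8%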

Most occurring blocks

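Value | Count | Frequency (%)
Hangul | 295 | 100.0%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(lost) | 53 | 18.0%
(lost) | 44 | 14.9%
(lost) | 9 | 3.1%
(lost) | 8 | 2.7%
(lost) | 8 | 2.7%
(lost) | 8 | 2.7%
(lost) | 8 | 2.7%
(lost) | 8 | 2.7%
(lost) | 7 | 2.4%
(lost) | 7 | 2.4%
Other values (66) | 135 | 45.8%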

행정동코드(ADSTRD_CD)
Real number (ℝ)

Distinct: 58
Distinct (%): 58.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 566.73
Minimum: 250
Maximum: 780
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 250
5-th percentile: 307.6
Q1: 527.5
Median: 584
Q3: 650.25
95-th percentile: 740.1
Maximum: 780
Range: 530
Interquartile range (IQR): 122.75

Descriptive statistics

Standard deviation: 128.02792
Coefficient of variation (CV): 0.22590638
Kurtosis: 0.47002218
Mean: 566.73
Median Absolute Deviation (MAD): 64
Skewness: -0.95103022
Sum: 56673
Variance: 16391.149
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
520 | 6 | 6.0%
550 | 5 | 5.0%
320 | 4 | 4.0%
540 | 4 | 4.0%
660 | 3 | 3.0%
700 | 3 | 3.0%
585 | 3 | 3.0%
640 | 3 | 3.0%
580 | 3 | 3.0%
570 | 3 | 3.0%
Other values (48) | 63 | 63.0%

Minimum 10 values
Value | Count | Frequency (%)
250 | 2 | 2.0%
253 | 2 | 2.0%
262 | 1 | 1.0%
310 | 1 | 1.0%
320 | 4 | 4.0%
330 | 1 | 1.0%
370 | 2 | 2.0%
380 | 1 | 1.0%
400 | 1 | 1.0%
410 | 1 | 1.0%

Maximum 10 values
Value | Count | Frequency (%)
780 | 1 | 1.0%
770 | 2 | 2.0%
751 | 1 | 1.0%
742 | 1 | 1.0%
740 | 1 | 1.0%
720 | 1 | 1.0%
710 | 3 | 3.0%
702 | 1 | 1.0%
700 | 3 | 3.0%
690 | 1 | 1.0%
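
The value tables for this variable are plain frequency counts, which can be sketched with `collections.Counter`. The input below is the 행정동코드(ADSTRD_CD) column of the 20 rows shown in the Sample section rather than all 100 observations, so the counts differ from those reported. The helper name `common_values` is hypothetical.

```python
from collections import Counter


def common_values(values, top=3):
    """Return (value, count, percent) rows for the most frequent values."""
    counts = Counter(values)
    n = len(values)
    return [(v, c, 100.0 * c / n) for v, c in counts.most_common(top)]


# ADSTRD_CD values from the 20 sample rows only (first 10 and last 10 rows).
adstrd_cd = [650, 702, 550, 310, 585, 320, 540, 320, 320, 640,
             250, 571, 641, 380, 540, 615, 570, 624, 520, 652]

for value, count, pct in common_values(adstrd_cd):
    print(value, count, f"{pct:.1f}%")
```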

행정동명(ADSTRD_NM)
Text

Distinct: 93
Distinct (%): 93.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Length

Max length: 6
Median length: 4
Mean length: 3.58
Min length: 2

Characters and Unicode

Total characters: 358
Distinct characters: 102
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
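
As an illustration of these character properties, the sketch below counts characters by Unicode general category using Python's standard `unicodedata` module. In that module `Lo` corresponds to Other Letter, `Nd` to Decimal Number, and `Po` to Other Punctuation. The input is the five sample rows listed for this variable, and the helper name `category_counts` is hypothetical.

```python
import unicodedata
from collections import Counter


def category_counts(strings):
    """Count characters by Unicode general category (e.g. 'Lo', 'Nd', 'Po')."""
    return Counter(unicodedata.category(ch) for s in strings for ch in s)


# The five sample rows shown for 행정동명(ADSTRD_NM) in this report.
names = ["우만2동", "수곡2동", "광안4동", "운암2동", "동삼1동"]

counts = category_counts(names)
for category, count in counts.most_common():
    print(category, count)
```

The standard library does not expose script or block properties, so those two breakdowns would need another data source.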

Unique

Unique: 88
Unique (%): 88.0%

Sample

1st row: 우만2동
2nd row: 수곡2동
3rd row: 광안4동
4th row: 운암2동
5th row: 동삼1동

Value | Count | Frequency (%)
사2동 | 3 | 3.0%
화정1동 | 3 | 3.0%
초평동 | 2 | 2.0%
풍암동 | 2 | 2.0%
연희동 | 2 | 2.0%
신장2동 | 1 | 1.0%
궁내동 | 1 | 1.0%
구포2동 | 1 | 1.0%
인후1동 | 1 | 1.0%
진안동 | 1 | 1.0%
Other values (83) | 83 | 83.0%

Most occurring characters

Value | Count | Frequency (%)
(lost) | 85 | 23.7%
2 | 25 | 7.0%
1 | 18 | 5.0%
3 | 10 | 2.8%
(lost) | 10 | 2.8%
(lost) | 10 | 2.8%
(lost) | 8 | 2.2%
(lost) | 7 | 2.0%
(lost) | 6 | 1.7%
(lost) | 5 | 1.4%
Other values (92) | 174 | 48.6%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 299 | 83.5%
Decimal Number | 57 | 15.9%
Other Punctuation | 2 | 0.6%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(lost) | 85 | 28.4%
(lost) | 10 | 3.3%
(lost) | 10 | 3.3%
(lost) | 8 | 2.7%
(lost) | 7 | 2.3%
(lost) | 6 | 2.0%
(lost) | 5 | 1.7%
(lost) | 5 | 1.7%
(lost) | 5 | 1.7%
(lost) | 5 | 1.7%
Other values (85) | 153 | 51.2%

Decimal Number
Value | Count | Frequency (%)
2 | 25 | 43.9%
1 | 18 | 31.6%
3 | 10 | 17.5%
4 | 2 | 3.5%
6 | 1 | 1.8%
7 | 1 | 1.8%

Other Punctuation
Value | Count | Frequency (%)
, | 2 | 100.0%

Most occurring scripts

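Value | Count | Frequency (%)
Hangul | 299 | 83.5%
Common | 59 | 16.5%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
(lost) | 85 | 28.4%
(lost) | 10 | 3.3%
(lost) | 10 | 3.3%
(lost) | 8 | 2.7%
(lost) | 7 | 2.3%
(lost) | 6 | 2.0%
(lost) | 5 | 1.7%
(lost) | 5 | 1.7%
(lost) | 5 | 1.7%
(lost) | 5 | 1.7%
Other values (85) | 153 | 51.2%

Common
Value | Count | Frequency (%)
2 | 25 | 42.4%
1 | 18 | 30.5%
3 | 10 | 16.9%
, | 2 | 3.4%
4 | 2 | 3.4%
6 | 1 | 1.7%
7 | 1 | 1.7%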

Most occurring blocks

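Value | Count | Frequency (%)
Hangul | 299 | 83.5%
ASCII | 59 | 16.5%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
(lost) | 85 | 28.4%
(lost) | 10 | 3.3%
(lost) | 10 | 3.3%
(lost) | 8 | 2.7%
(lost) | 7 | 2.3%
(lost) | 6 | 2.0%
(lost) | 5 | 1.7%
(lost) | 5 | 1.7%
(lost) | 5 | 1.7%
(lost) | 5 | 1.7%
Other values (85) | 153 | 51.2%

ASCII
Value | Count | Frequency (%)
2 | 25 | 42.4%
1 | 18 | 30.5%
3 | 10 | 16.9%
, | 2 | 3.4%
4 | 2 | 3.4%
6 | 1 | 1.7%
7 | 1 | 1.7%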

Interactions

[Nine interaction plots omitted]

Correlations

Variable | 집계구코드(CENSUS_AREA_CD) | 시군구코드(SIGNGU_CD) | 시군구명(SIGNGU_NM) | 행정동코드(ADSTRD_CD) | 행정동명(ADSTRD_NM)
집계구코드(CENSUS_AREA_CD) | 1.000 | 0.000 | 0.557 | 0.079 | 0.741
시군구코드(SIGNGU_CD) | 0.000 | 1.000 | 0.826 | 0.352 | 0.801
시군구명(SIGNGU_NM) | 0.557 | 0.826 | 1.000 | 0.746 | 0.959
행정동코드(ADSTRD_CD) | 0.079 | 0.352 | 0.746 | 1.000 | 0.867
행정동명(ADSTRD_NM) | 0.741 | 0.801 | 0.959 | 0.867 | 1.000

Variable | 집계구코드(CENSUS_AREA_CD) | 시군구코드(SIGNGU_CD) | 행정동코드(ADSTRD_CD)
집계구코드(CENSUS_AREA_CD) | 1.000 | -0.016 | -0.119
시군구코드(SIGNGU_CD) | -0.016 | 1.000 | -0.103
행정동코드(ADSTRD_CD) | -0.119 | -0.103 | 1.000
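
The report does not state which coefficient underlies each matrix. Purely as an illustration of how a pairwise coefficient between two numeric columns can be computed, here is a minimal Pearson sketch in plain Python. It uses only the first ten rows of the Sample section, so it will not reproduce the values above, and the function name `pearson` is illustrative.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# First ten sample rows only: SIGNGU_CD and ADSTRD_CD.
signgu_cd = [11710, 11500, 45111, 47750, 26290, 26710, 28237, 48125, 43113, 41290]
adstrd_cd = [650, 702, 550, 310, 585, 320, 540, 320, 320, 640]

r = pearson(signgu_cd, adstrd_cd)
print(f"{r:.3f}")
```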

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
Index | 집계구코드(CENSUS_AREA_CD) | 시군구코드(SIGNGU_CD) | 시군구명(SIGNGU_NM) | 행정동코드(ADSTRD_CD) | 행정동명(ADSTRD_NM)
0 | 3701263010800 | 11710 | 영등포구 | 650 | 우만2동
1 | 3501269020003 | 11500 | 남구 | 702 | 수곡2동
2 | 1115061020004 | 45111 | 달서구 | 550 | 광안4동
3 | 2108063020700 | 47750 | 강릉시 | 310 | 운암2동
4 | 2303077050200 | 26290 | 아산시 | 585 | 동삼1동
5 | 3105155011200 | 26710 | 달서구 | 320 | 도고면
6 | 3742036010300 | 28237 | 도봉구 | 540 | 양재1동
7 | 3602060020601 | 48125 | 구리시 | 320 | 화명1동
8 | 3114052021600 | 43113 | 임실군 | 320 | 일곡동
9 | 3832037021600 | 41290 | 천안시 | 640 | 괴정3동

Last rows
Index | 집계구코드(CENSUS_AREA_CD) | 시군구코드(SIGNGU_CD) | 시군구명(SIGNGU_NM) | 행정동코드(ADSTRD_CD) | 행정동명(ADSTRD_NM)
90 | 2503065010017 | 45111 | 성동구 | 250 | 호계3동
91 | 2206073031200 | 41115 | 부산진구 | 571 | 갈산2동
92 | 1112072010401 | 41480 | 서귀포시 | 641 | 금호2동
93 | 3639034010003 | 11500 | 장성군 | 380 | 상관면
94 | 3301268020009 | 41281 | 중구 | 540 | 사2동
95 | 3113012013700 | 41465 | 신안군 | 615 | 구서2동
96 | 3101162021100 | 41150 | 부산진구 | 570 | 용봉동
97 | 3102374022700 | 26530 | 성북구 | 624 | 명일2동
98 | 3119353011600 | 11380 | 창원시 | 520 | 화정1동
99 | 3103062010400 | 26110 | 강서구 | 652 | 구월2동