Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 507.8 KiB
Average record size in memory: 52.0 B
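
The average record size is the total memory footprint divided by the number of observations. A quick check of that arithmetic, assuming the report counts 1 KiB as 1024 bytes (the variable names are illustrative):

```python
# Figures copied from the dataset statistics above.
total_size_kib = 507.8
n_observations = 10_000

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the result agrees with the reported 52.0 B.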

Variable types

Numeric: 4
Text: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15067/S/1/datasetView.do

Alerts

ARSID is highly overall correlated with 표준ID and 1 other field (High correlation)
표준ID is highly overall correlated with ARSID and 1 other field (High correlation)
Y좌표 is highly overall correlated with ARSID and 1 other field (High correlation)
ARSID has unique values (Unique)
표준ID has unique values (Unique)

Reproduction

Analysis started: 2024-04-29 17:03:22.441251
Analysis finished: 2024-04-29 17:03:26.264247
Duration: 3.82 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

ARSID
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14262.663
Minimum: 1001
Maximum: 25999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1001
5-th percentile: 2523.95
Q1: 8766.75
Median: 14531.5
Q3: 20512.25
95-th percentile: 24371.05
Maximum: 25999
Range: 24998
Interquartile range (IQR): 11745.5
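
The range and IQR are simple differences of the values listed above (25999 - 1001 = 24998 and 20512.25 - 8766.75 = 11745.5). The sketch below shows one way to compute such a summary with the standard library. It assumes linear interpolation between ranks, which may or may not match the profiler's exact method, and `quantile_summary` is an illustrative name, not part of any library.

```python
import statistics

def quantile_summary(values):
    """Return min, quartiles, max, range and IQR for a list of numbers."""
    data = sorted(values)
    # method="inclusive" interpolates linearly between the closest ranks.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "min": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": data[-1],
        "range": data[-1] - data[0],
        "iqr": q3 - q1,
    }

# Ten ARSID values that appear in this report, used only as toy input.
sample_arsid = [10747, 23858, 11172, 9167, 19572, 4189, 12841, 24443, 22431, 19360]
print(quantile_summary(sample_arsid))
```

Because the toy input holds only ten values, its quartiles will not match the full-dataset figures.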

Descriptive statistics

Standard deviation: 6920.448
Coefficient of variation (CV): 0.4852143
Kurtosis: -1.1030543
Mean: 14262.663
Median Absolute Deviation (MAD): 5810
Skewness: -0.14332373
Sum: 1.4262663 × 10^8
Variance: 47892601
Monotonicity: Not monotonic
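
Several of these figures are derived from the mean and standard deviation: the coefficient of variation is their ratio, the variance is the squared standard deviation, and the sum is the mean times the number of observations. A small cross-check, using only numbers printed in this report (the inputs are rounded, so agreement is approximate):

```python
import math

# Figures copied from the ARSID statistics above.
n = 10_000
mean = 14262.663
std = 6920.448

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the squared standard deviation
total = mean * n       # sum of all values

print(f"CV       = {cv:.7f}")
print(f"Variance = {variance:.0f}")
print(f"Sum      = {total:.7e}")
```

Up to rounding, the three results agree with the reported CV, variance and sum.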
Histogram with fixed size bins (bins=50)
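
A fixed-size-bin histogram splits the interval between the minimum and maximum into equally wide bins and counts the values in each. A minimal sketch of that idea follows; `fixed_width_histogram` is an illustrative helper, not the plotting code the report itself used.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case: every value is identical, so one bin holds them all.
        return [len(values)] + [0] * (bins - 1), [lo] * (bins + 1)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum would land on index `bins`, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return counts, edges

# Ten ARSID values that appear in this report, binned into 5 intervals.
sample_arsid = [10747, 23858, 11172, 9167, 19572, 4189, 12841, 24443, 22431, 19360]
counts, edges = fixed_width_histogram(sample_arsid, bins=5)
print(counts)
```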
Common values
Value                Count  Frequency (%)
10747                1      < 0.1%
23858                1      < 0.1%
11172                1      < 0.1%
9167                 1      < 0.1%
19572                1      < 0.1%
4189                 1      < 0.1%
12841                1      < 0.1%
24443                1      < 0.1%
22431                1      < 0.1%
19360                1      < 0.1%
Other values (9990)  9990   99.9%

Minimum 10 values
Value  Count  Frequency (%)
1001   1      < 0.1%
1002   1      < 0.1%
1003   1      < 0.1%
1004   1      < 0.1%
1005   1      < 0.1%
1006   1      < 0.1%
1007   1      < 0.1%
1009   1      < 0.1%
1011   1      < 0.1%
1012   1      < 0.1%

Maximum 10 values
Value  Count  Frequency (%)
25999  1      < 0.1%
25998  1      < 0.1%
25997  1      < 0.1%
25996  1      < 0.1%
25995  1      < 0.1%
25994  1      < 0.1%
25990  1      < 0.1%
25989  1      < 0.1%
25988  1      < 0.1%
25784  1      < 0.1%

표준ID
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.1316541 × 10^8
Minimum: 1 × 10^8
Maximum: 1.2900019 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1 × 10^8
5-th percentile: 1.0100029 × 10^8
Q1: 1.0790016 × 10^8
Median: 1.1390003 × 10^8
Q3: 1.1990001 × 10^8
95-th percentile: 1.230003 × 10^8
Maximum: 1.2900019 × 10^8
Range: 29000185
Interquartile range (IQR): 11999854

Descriptive statistics

Standard deviation: 6905034.6
Coefficient of variation (CV): 0.061017184
Kurtosis: -1.1072166
Mean: 1.1316541 × 10^8
Median Absolute Deviation (MAD): 5999881.5
Skewness: -0.14877249
Sum: 1.1316541 × 10^12
Variance: 4.7679503 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                Count  Frequency (%)
109900046            1      < 0.1%
122900018            1      < 0.1%
110000693            1      < 0.1%
108000079            1      < 0.1%
118900046            1      < 0.1%
103000090            1      < 0.1%
111900154            1      < 0.1%
123000636            1      < 0.1%
121900025            1      < 0.1%
118000596            1      < 0.1%
Other values (9990)  9990   99.9%

Minimum 10 values
Value      Count  Frequency (%)
100000001  1      < 0.1%
100000002  1      < 0.1%
100000003  1      < 0.1%
100000004  1      < 0.1%
100000005  1      < 0.1%
100000006  1      < 0.1%
100000007  1      < 0.1%
100000009  1      < 0.1%
100000010  1      < 0.1%
100000011  1      < 0.1%

Maximum 10 values
Value      Count  Frequency (%)
129000186  1      < 0.1%
124900124  1      < 0.1%
124900123  1      < 0.1%
124900122  1      < 0.1%
124900121  1      < 0.1%
124900119  1      < 0.1%
124900117  1      < 0.1%
124900116  1      < 0.1%
124900115  1      < 0.1%
124900114  1      < 0.1%

정류장명 (stop name)
Text

Distinct: 6454
Distinct (%): 64.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 26
Median length: 19
Mean length: 7.4514
Min length: 2

Characters and Unicode

Total characters: 74514
Distinct characters: 652
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
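
As a rough illustration of such character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories over three stop names taken from this report. It is an illustrative example, not the profiler's own implementation, and `category_counts` is a hypothetical helper name.

```python
import unicodedata
from collections import Counter

def category_counts(texts):
    """Count Unicode general categories (e.g. 'Lo', 'Nd') over all characters."""
    return Counter(unicodedata.category(ch) for text in texts for ch in text)

# Stop names copied from the sample rows of this report.
names = ["청구아파트", "상도3차삼성래미안후문", "교육촌.벼루말"]
counts = category_counts(names)

# 'Lo' = Other Letter, 'Nd' = Decimal Number, 'Po' = Other Punctuation
for code, count in counts.most_common():
    print(code, count)
```

Hangul syllables fall under Other Letter (`Lo`), which is why that category dominates the tables in this section.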

Unique

Unique: 3827
Unique (%): 38.3%
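
Distinct and unique counts measure different things: distinct is the number of different values, while unique is understood here as the number of values that occur exactly once. That reading is consistent with the figures above (6454 distinct, 3827 unique), but it is an assumption about the profiler. A minimal sketch with a hypothetical helper:

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Toy input built from stop names in this report; one name is repeated.
names = ["벽산아파트", "당산역", "벽산아파트", "국민은행"]
print(distinct_and_unique(names))
```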

Sample

1st row: 청구아파트
2nd row: 상도3차삼성래미안후문
3rd row: 청한빌라
4th row: 서부트럭터미널
5th row: 가락래미안앞

Value                Count  Frequency (%)
벽산아파트           12     0.1%
우성아파트           12     0.1%
북서울꿈의숲         12     0.1%
국민은행             11     0.1%
경남아파트           11     0.1%
현대아파트           10     0.1%
새마을금고           10     0.1%
가산디지털단지역     9      0.1%
삼성래미안아파트     9      0.1%
당산역               9      0.1%
Other values (6453)  9909   99.0%

Most occurring characters

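Value               Count  Frequency (%)
(not shown)         2300   3.1%
(not shown)         2067   2.8%
(not shown)         2021   2.7%
(not shown)         1999   2.7%
.                   1967   2.6%
(not shown)         1683   2.3%
(not shown)         1454   2.0%
(not shown)         1433   1.9%
(not shown)         1272   1.7%
(not shown)         1213   1.6%
Other values (642)  57105  76.6%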

Most occurring categories

Value              Count  Frequency (%)
Other Letter       69428  93.2%
Decimal Number     2318   3.1%
Other Punctuation  1977   2.7%
Uppercase Letter   660    0.9%
Open Punctuation   42     0.1%
Close Punctuation  42     0.1%
Lowercase Letter   22     < 0.1%
Space Separator    14     < 0.1%
Dash Punctuation   11     < 0.1%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
(not shown)         2300   3.3%
(not shown)         2067   3.0%
(not shown)         2021   2.9%
(not shown)         1999   2.9%
(not shown)         1683   2.4%
(not shown)         1454   2.1%
(not shown)         1433   2.1%
(not shown)         1272   1.8%
(not shown)         1213   1.7%
(not shown)         1168   1.7%
Other values (599)  52818  76.1%

Uppercase Letter
Value              Count  Frequency (%)
T                  103    15.6%
K                  85     12.9%
A                  63     9.5%
S                  58     8.8%
C                  55     8.3%
P                  48     7.3%
G                  41     6.2%
B                  39     5.9%
M                  27     4.1%
L                  27     4.1%
Other values (13)  114    17.3%

Decimal Number
Value  Count  Frequency (%)
1      692    29.9%
2      442    19.1%
3      334    14.4%
4      193    8.3%
5      154    6.6%
0      142    6.1%
7      115    5.0%
9      98     4.2%
6      97     4.2%
8      51     2.2%

Other Punctuation
Value  Count  Frequency (%)
.      1967   99.5%
&      7      0.4%
·      3      0.2%

Lowercase Letter
Value  Count  Frequency (%)
e      18     81.8%
k      2      9.1%
t      2      9.1%

Open Punctuation
Value  Count  Frequency (%)
(      42     100.0%

Close Punctuation
Value  Count  Frequency (%)
)      42     100.0%

Space Separator
Value    Count  Frequency (%)
(space)  14     100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      11     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  69428  93.2%
Common  4404   5.9%
Latin   682    0.9%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
(not shown)         2300   3.3%
(not shown)         2067   3.0%
(not shown)         2021   2.9%
(not shown)         1999   2.9%
(not shown)         1683   2.4%
(not shown)         1454   2.1%
(not shown)         1433   2.1%
(not shown)         1272   1.8%
(not shown)         1213   1.7%
(not shown)         1168   1.7%
Other values (599)  52818  76.1%

Latin
Value              Count  Frequency (%)
T                  103    15.1%
K                  85     12.5%
A                  63     9.2%
S                  58     8.5%
C                  55     8.1%
P                  48     7.0%
G                  41     6.0%
B                  39     5.7%
M                  27     4.0%
L                  27     4.0%
Other values (16)  136    19.9%

Common
Value             Count  Frequency (%)
.                 1967   44.7%
1                 692    15.7%
2                 442    10.0%
3                 334    7.6%
4                 193    4.4%
5                 154    3.5%
0                 142    3.2%
7                 115    2.6%
9                 98     2.2%
6                 97     2.2%
Other values (7)  170    3.9%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  69428  93.2%
ASCII   5083   6.8%
None    3      < 0.1%

Most frequent character per block

Hangul
Value               Count  Frequency (%)
(not shown)         2300   3.3%
(not shown)         2067   3.0%
(not shown)         2021   2.9%
(not shown)         1999   2.9%
(not shown)         1683   2.4%
(not shown)         1454   2.1%
(not shown)         1433   2.1%
(not shown)         1272   1.8%
(not shown)         1213   1.7%
(not shown)         1168   1.7%
Other values (599)  52818  76.1%

ASCII
Value              Count  Frequency (%)
.                  1967   38.7%
1                  692    13.6%
2                  442    8.7%
3                  334    6.6%
4                  193    3.8%
5                  154    3.0%
0                  142    2.8%
7                  115    2.3%
T                  103    2.0%
9                  98     1.9%
Other values (32)  843    16.6%

None
Value  Count  Frequency (%)
·      3      100.0%

X좌표
Real number (ℝ)

Distinct: 9979
Distinct (%): 99.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 126.98531
Minimum: 126.76899
Maximum: 127.18176
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 126.76899
5-th percentile: 126.84191
Q1: 126.91677
Median: 126.99425
Q3: 127.0503
95-th percentile: 127.12603
Maximum: 127.18176
Range: 0.41277403
Interquartile range (IQR): 0.13352492

Descriptive statistics

Standard deviation: 0.085533522
Coefficient of variation (CV): 0.00067357022
Kurtosis: -0.8578543
Mean: 126.98531
Median Absolute Deviation (MAD): 0.067558138
Skewness: -0.048888561
Sum: 1269853.1
Variance: 0.0073159833
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                Count  Frequency (%)
127.0324578678       3      < 0.1%
127.0156755717       2      < 0.1%
126.8862740506       2      < 0.1%
127.1358428731       2      < 0.1%
126.8992453681       2      < 0.1%
126.9275473964       2      < 0.1%
127.18176            2      < 0.1%
126.982473           2      < 0.1%
127.0366528058       2      < 0.1%
126.947598192        2      < 0.1%
Other values (9969)  9979   99.8%

Minimum 10 values
Value           Count  Frequency (%)
126.7689859695  1      < 0.1%
126.7690800105  1      < 0.1%
126.7975146083  1      < 0.1%
126.7978104431  1      < 0.1%
126.7983534326  1      < 0.1%
126.7984674574  1      < 0.1%
126.798649      1      < 0.1%
126.7986847129  1      < 0.1%
126.798773      1      < 0.1%
126.799863985   1      < 0.1%

Maximum 10 values
Value           Count  Frequency (%)
127.18176       2      < 0.1%
127.1802657501  1      < 0.1%
127.18013       1      < 0.1%
127.1795399999  1      < 0.1%
127.1795016106  1      < 0.1%
127.1792290002  1      < 0.1%
127.1783352627  1      < 0.1%
127.1780401265  1      < 0.1%
127.1779976114  1      < 0.1%
127.1779314283  1      < 0.1%

Y좌표
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9979
Distinct (%): 99.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.550954
Minimum: 37.43078
Maximum: 37.781594
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 37.43078
5-th percentile: 37.472091
Q1: 37.502887
Median: 37.54943
Q3: 37.591673
95-th percentile: 37.648221
Maximum: 37.781594
Range: 0.35081385
Interquartile range (IQR): 0.088786501

Descriptive statistics

Standard deviation: 0.055201447
Coefficient of variation (CV): 0.0014700411
Kurtosis: -0.76330348
Mean: 37.550954
Median Absolute Deviation (MAD): 0.044721739
Skewness: 0.27758783
Sum: 375509.54
Variance: 0.0030471997
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                Count  Frequency (%)
37.6255057955        3      < 0.1%
37.520832794         2      < 0.1%
37.5783661809        2      < 0.1%
37.5398231139        2      < 0.1%
37.5781517573        2      < 0.1%
37.549895761         2      < 0.1%
37.560348747         2      < 0.1%
37.5661890393        2      < 0.1%
37.490902            2      < 0.1%
37.4941827082        2      < 0.1%
Other values (9969)  9979   99.8%

Minimum 10 values
Value          Count  Frequency (%)
37.430779662   1      < 0.1%
37.434793586   1      < 0.1%
37.4348444186  1      < 0.1%
37.4349898625  1      < 0.1%
37.4350042396  1      < 0.1%
37.4355268028  1      < 0.1%
37.436857042   1      < 0.1%
37.437324573   1      < 0.1%
37.4379208212  1      < 0.1%
37.437948957   1      < 0.1%

Maximum 10 values
Value          Count  Frequency (%)
37.7815935083  1      < 0.1%
37.690199      1      < 0.1%
37.6899469943  1      < 0.1%
37.6898660327  1      < 0.1%
37.6893523043  1      < 0.1%
37.6891947508  1      < 0.1%
37.689128      1      < 0.1%
37.6890060442  1      < 0.1%
37.6887853785  1      < 0.1%
37.6879849018  1      < 0.1%

Interactions

[Figures: 16 pairwise interaction plots; images not reproduced]

Correlations

        ARSID   표준ID  X좌표   Y좌표
ARSID   1.000   0.986   0.900   0.734
표준ID  0.986   1.000   0.894   0.737
X좌표   0.900   0.894   1.000   0.413
Y좌표   0.734   0.737   0.413   1.000

        ARSID   표준ID  X좌표   Y좌표
ARSID   1.000   0.999   -0.080  -0.673
표준ID  0.999   1.000   -0.080  -0.672
X좌표   -0.080  -0.080  1.000   0.248
Y좌표   -0.673  -0.672  0.248   1.000
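
This extract does not say which coefficient each matrix reports. As a rough illustration of how a linear (Pearson) and a rank-based (Spearman) coefficient are computed, here is a self-contained sketch. The helper names are illustrative, this is not the profiler's own implementation, and it assumes neither input is constant.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (non-constant inputs)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# ARSID and Y좌표 from the first five sample rows of this report (toy input).
arsid = [10747, 20244, 10512, 15702, 24459]
y = [37.657507, 37.500955, 37.657274, 37.507789, 37.499248]
print(pearson(arsid, y), spearman(arsid, y))
```

With only five rows the output is illustrative and will not reproduce the matrix entries, which were computed on all 10000 observations.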

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
Index  ARSID  표준ID     정류장명                X좌표       Y좌표
3738   10747  109900046  청구아파트              127.026413  37.657507
8360   20244  119000138  상도3차삼성래미안후문   126.953671  37.500955
3636   10512  109900102  청한빌라                127.015205  37.657274
6218   15702  114000379  서부트럭터미널          126.843663  37.507789
10707  24459  123000367  가락래미안앞            127.132484  37.499248
586    3004   102000004  신용산역                126.966634  37.527937
9127   21877  120900148  두산아파트              126.946721  37.484732
419    2148   101000049  남산3호터널             126.9826    37.55943
2608   8361   107000263  우방아파트              127.009406  37.600236
2776   8731   107900118  래미안610동앞           127.022239  37.605543

Last rows
Index  ARSID  표준ID     정류장명                X좌표       Y좌표
4244   11604  107900332  교육촌.벼루말           127.052493  37.622002
1345   4863   103900262  경일중고등학교          127.049125  37.542413
1685   6103   105000020  용두동한국의류시험연구원  127.029154  37.577436
5632   14575  113900265  경성중고.홍익디자인고   126.919211  37.561964
5713   14757  113900108  염리초등학교            126.946792  37.542578
2043   7160   106000066  혜원여중고입구          127.095091  37.59445
5065   13340  112000437  아현가구단지            126.960928  37.558633
381    2101   101000009  만리동고개              126.962526  37.551703
4958   13175  112000092  북가좌2동주민센터       126.911479  37.580291
7275   17866  116000391  구로역애경백화점        126.882851  37.500382