Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 722.7 KiB
Average record size in memory: 74.0 B
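As a quick sanity check, the average record size follows from the total memory size divided by the number of observations. The sketch below only re-derives the reported figure from the two numbers above; the variable names are illustrative and not part of the report.

```python
# Re-derive the average record size from the reported totals.
KIB = 1024                      # bytes per KiB
total_bytes = 722.7 * KIB       # "Total size in memory" from the report
rows = 10_000                   # "Number of observations" from the report

avg_record_bytes = total_bytes / rows
print(f"{avg_record_bytes:.1f} B")  # 74.0 B
```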

Variable types

Categorical: 4
Numeric: 2
Text: 1
DateTime: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government
URL: https://data.seoul.go.kr/dataList/OA-15964/S/1/datasetView.do

Alerts

모델번호 has constant value "SDOT001" (Constant)
지역 is highly overall correlated with 시리얼 and 2 other fields (High correlation)
자치구 is highly overall correlated with 시리얼 and 2 other fields (High correlation)
행정동 is highly overall correlated with 시리얼 and 2 other fields (High correlation)
시리얼 is highly overall correlated with 지역 and 2 other fields (High correlation)
방문자수 has 1588 (15.9%) zeros (Zeros)
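The Constant and Zeros alerts can be reproduced with very simple checks. The following is a minimal sketch, not ydata-profiling's actual implementation: the helper names and the tiny example columns are hypothetical, and only the final line uses figures from the report.

```python
def zero_share(values):
    """Return (count, percent) of entries equal to zero."""
    zeros = sum(1 for v in values if v == 0)
    return zeros, 100.0 * zeros / len(values)


def is_constant(values):
    """True when every entry holds the same value."""
    return len(set(values)) == 1


# Hypothetical mini-columns, for illustration only.
visitors = [0, 12, 0, 4, 9, 191, 7, 21, 65, 0]
model = ["SDOT001"] * 10

print(zero_share(visitors))   # (3, 30.0)
print(is_constant(model))     # True

# The figure in the alert: 1588 zeros out of 10000 observations.
reported_zero_pct = round(100 * 1588 / 10000, 1)
print(reported_zero_pct)      # 15.9
```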

Reproduction

Analysis started: 2024-05-11 06:52:13.922333
Analysis finished: 2024-05-11 06:52:15.588201
Duration: 1.67 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

모델번호 (model number)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
SDOT001 | 10000

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SDOT001
2nd row: SDOT001
3rd row: SDOT001
4th row: SDOT001
5th row: SDOT001

Common Values

Value | Count | Frequency (%)
SDOT001 | 10000 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

시리얼 (serial)
Real number (ℝ)

HIGH CORRELATION

Distinct: 53
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4030.2073
Minimum: 4001
Maximum: 4064
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 4001
5-th percentile: 4004
Q1: 4016.75
Median: 4030
Q3: 4043
95-th percentile: 4061
Maximum: 4064
Range: 63
Interquartile range (IQR): 26.25

Descriptive statistics

Standard deviation: 17.110634
Coefficient of variation (CV): 0.0042455966
Kurtosis: -0.92441764
Mean: 4030.2073
Median Absolute Deviation (MAD): 13
Skewness: 0.12880325
Sum: 40302073
Variance: 292.7738
Monotonicity: Not monotonic
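Several of the statistics for 시리얼 are derived from one another, which makes them easy to cross-check. The sketch below recomputes the range, IQR, coefficient of variation, variance, and sum from the reported values only. The tolerances are an assumption chosen to absorb the rounding in the printed figures.

```python
import math

# Values as printed in the report for 시리얼.
n = 10_000
mean = 4030.2073
std = 17.110634
variance = 292.7738
total = 40302073
minimum, maximum = 4001, 4064
q1, q3 = 4016.75, 4043

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range
cv = std / mean                   # Coefficient of variation

print(value_range)                # 63
print(iqr)                        # 26.25
print(math.isclose(cv, 0.0042455966, rel_tol=1e-6))      # True
print(math.isclose(std ** 2, variance, rel_tol=1e-6))    # True
print(math.isclose(mean * n, total, rel_tol=1e-9))       # True
```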
[Figure: Histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
4001 | 217 | 2.2%
4004 | 214 | 2.1%
4014 | 212 | 2.1%
4032 | 206 | 2.1%
4037 | 203 | 2.0%
4023 | 203 | 2.0%
4049 | 203 | 2.0%
4026 | 202 | 2.0%
4025 | 201 | 2.0%
4062 | 201 | 2.0%
Other values (43) | 7938 | 79.4%

Value | Count | Frequency (%)
4001 | 217 | 2.2%
4002 | 193 | 1.9%
4004 | 214 | 2.1%
4005 | 175 | 1.8%
4006 | 183 | 1.8%
4007 | 187 | 1.9%
4008 | 169 | 1.7%
4009 | 197 | 2.0%
4010 | 188 | 1.9%
4013 | 199 | 2.0%

Value | Count | Frequency (%)
4064 | 198 | 2.0%
4062 | 201 | 2.0%
4061 | 182 | 1.8%
4060 | 182 | 1.8%
4054 | 179 | 1.8%
4053 | 164 | 1.6%
4051 | 180 | 1.8%
4050 | 192 | 1.9%
4049 | 203 | 2.0%
4048 | 197 | 2.0%

측정시간 (measurement time)
Text

Distinct: 4973
Distinct (%): 49.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 190000
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
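To illustrate how such character properties can be read, the sketch below uses Python's standard `unicodedata` module to count the general categories in one timestamp string taken from this variable's sample rows. It is not the profiler's own code. The two-letter codes `Nd`, `Pd`, `Po`, and `Pc` are the Unicode abbreviations for Decimal Number, Dash Punctuation, Other Punctuation, and Connector Punctuation.

```python
import unicodedata
from collections import Counter

# One value from the sample rows of this variable.
sample = "2024-01-27_12:05:00"

# General category of every character in the string.
categories = Counter(unicodedata.category(ch) for ch in sample)

print(len(sample))        # 19
print(dict(categories))   # {'Nd': 14, 'Pd': 2, 'Pc': 1, 'Po': 2}
```

With 10000 rows of 19 characters each, these per-string counts scale to the totals reported for the character categories: 140000 decimal digits, 20000 dashes, 20000 colons, and 10000 underscores.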

Unique

Unique: 2271
Unique (%): 22.7%

Sample

1st row: 2024-01-27_12:05:00
2nd row: 2024-01-28_22:32:00
3rd row: 2024-01-28_14:56:00
4th row: 2024-01-27_23:27:00
5th row: 2024-01-30_06:35:00

Value | Count | Frequency (%)
2024-01-29_09:50:00 | 11 | 0.1%
2024-01-28_00:50:00 | 8 | 0.1%
2024-01-27_09:20:00 | 8 | 0.1%
2024-01-26_21:20:00 | 8 | 0.1%
2024-01-29_00:50:00 | 8 | 0.1%
2024-01-28_01:30:00 | 8 | 0.1%
2024-01-24_05:50:00 | 8 | 0.1%
2024-01-25_17:20:00 | 7 | 0.1%
2024-01-30_18:40:00 | 7 | 0.1%
2024-01-26_12:00:00 | 7 | 0.1%
Other values (4963) | 9920 | 99.2%

Most occurring characters

Value | Count | Frequency (%)
0 | 51329 | 27.0%
2 | 33832 | 17.8%
- | 20000 | 10.5%
: | 20000 | 10.5%
1 | 17884 | 9.4%
4 | 14398 | 7.6%
_ | 10000 | 5.3%
5 | 5541 | 2.9%
3 | 4641 | 2.4%
6 | 4398 | 2.3%
Other values (3) | 7977 | 4.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 140000 | 73.7%
Dash Punctuation | 20000 | 10.5%
Other Punctuation | 20000 | 10.5%
Connector Punctuation | 10000 | 5.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 51329 | 36.7%
2 | 33832 | 24.2%
1 | 17884 | 12.8%
4 | 14398 | 10.3%
5 | 5541 | 4.0%
3 | 4641 | 3.3%
6 | 4398 | 3.1%
7 | 3481 | 2.5%
9 | 2325 | 1.7%
8 | 2171 | 1.6%

Dash Punctuation
Value | Count | Frequency (%)
- | 20000 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
: | 20000 | 100.0%

Connector Punctuation
Value | Count | Frequency (%)
_ | 10000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 190000 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 51329 | 27.0%
2 | 33832 | 17.8%
- | 20000 | 10.5%
: | 20000 | 10.5%
1 | 17884 | 9.4%
4 | 14398 | 7.6%
_ | 10000 | 5.3%
5 | 5541 | 2.9%
3 | 4641 | 2.4%
6 | 4398 | 2.3%
Other values (3) | 7977 | 4.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 190000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 51329 | 27.0%
2 | 33832 | 17.8%
- | 20000 | 10.5%
: | 20000 | 10.5%
1 | 17884 | 9.4%
4 | 14398 | 7.6%
_ | 10000 | 5.3%
5 | 5541 | 2.9%
3 | 4641 | 2.4%
6 | 4398 | 2.3%
Other values (3) | 7977 | 4.2%

지역 (area)
Categorical

HIGH CORRELATION

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
main_street | 6469
parks | 2060
traditional_markets | 935
public_facilities | 192
commercial_area | 180

Length

Max length: 19
Median length: 11
Mean length: 10.7812
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: traditional_markets
2nd row: parks
3rd row: main_street
4th row: main_street
5th row: main_street

Common Values

Value | Count | Frequency (%)
main_street | 6469 | 64.7%
parks | 2060 | 20.6%
traditional_markets | 935 | 9.3%
public_facilities | 192 | 1.9%
commercial_area | 180 | 1.8%
residential_area | 164 | 1.6%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]

자치구 (district)
Categorical

HIGH CORRELATION

Distinct: 18
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
Seoul_Grand_Park | 1137
Jung-gu | 1129
Gangseo-gu | 958
Jongno-gu | 927
Gangnam-gu | 916
Other values (13) | 4933

Length

Max length: 16
Median length: 11
Mean length: 10.4862
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Gangseo-gu
2nd row: Seoul_Grand_Park
3rd row: Dobong-gu
4th row: Seongdong-gu
5th row: Jung-gu

Common Values

Value | Count | Frequency (%)
Seoul_Grand_Park | 1137 | 11.4%
Jung-gu | 1129 | 11.3%
Gangseo-gu | 958 | 9.6%
Jongno-gu | 927 | 9.3%
Gangnam-gu | 916 | 9.2%
Seocho-gu | 799 | 8.0%
Gangdong-gu | 738 | 7.4%
Seodaemun-gu | 580 | 5.8%
Gwangjin-gu | 549 | 5.5%
Seongdong-gu | 391 | 3.9%
Other values (8) | 1876 | 18.8%

Length

[Figure: Histogram of lengths of the category]

행정동 (administrative dong)
Categorical

HIGH CORRELATION

Distinct: 47
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
Hongje3-dong | 388
Gahoe-dong | 381
Daechi4-dong | 366
Buam-dong | 364
Ihwa-dong | 360
Other values (42) | 8141

Length

Max length: 16
Median length: 14
Mean length: 12.2374
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Banghwa1-dong
2nd row: valet_parking1
3rd row: Chang1-dong
4th row: Seongsu1ga2-dong
5th row: Myeong-dong

Common Values

Value | Count | Frequency (%)
Hongje3-dong | 388 | 3.9%
Gahoe-dong | 381 | 3.8%
Daechi4-dong | 366 | 3.7%
Buam-dong | 364 | 3.6%
Ihwa-dong | 360 | 3.6%
Myeong-dong | 360 | 3.6%
Seocho3-dong | 217 | 2.2%
Banpo3-dong | 214 | 2.1%
Sindang-dong | 212 | 2.1%
Mangwon-dong1 | 206 | 2.1%
Other values (37) | 6932 | 69.3%

Length

[Figure: Histogram of lengths of the category]

방문자수 (number of visitors)
Real number (ℝ)

ZEROS

Distinct: 366
Distinct (%): 3.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 36.1803
Minimum: 0
Maximum: 627
Zeros: 1588
Zeros (%): 15.9%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
Median: 13
Q3: 43
95-th percentile: 156
Maximum: 627
Range: 627
Interquartile range (IQR): 41

Descriptive statistics

Standard deviation: 61.757765
Coefficient of variation (CV): 1.7069445
Kurtosis: 17.487693
Mean: 36.1803
Median Absolute Deviation (MAD): 13
Skewness: 3.5847943
Sum: 361803
Variance: 3814.0216
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]
Value | Count | Frequency (%)
0 | 1588 | 15.9%
1 | 545 | 5.5%
2 | 456 | 4.6%
3 | 386 | 3.9%
4 | 324 | 3.2%
5 | 284 | 2.8%
7 | 244 | 2.4%
8 | 230 | 2.3%
6 | 208 | 2.1%
9 | 190 | 1.9%
Other values (356) | 5545 | 55.5%

Value | Count | Frequency (%)
0 | 1588 | 15.9%
1 | 545 | 5.5%
2 | 456 | 4.6%
3 | 386 | 3.9%
4 | 324 | 3.2%
5 | 284 | 2.8%
6 | 208 | 2.1%
7 | 244 | 2.4%
8 | 230 | 2.3%
9 | 190 | 1.9%

Value | Count | Frequency (%)
627 | 1 | < 0.1%
605 | 1 | < 0.1%
598 | 1 | < 0.1%
569 | 1 | < 0.1%
568 | 1 | < 0.1%
566 | 1 | < 0.1%
559 | 1 | < 0.1%
553 | 1 | < 0.1%
551 | 1 | < 0.1%
540 | 2 | < 0.1%

등록일 (registration date)
DateTime

Distinct: 2872
Distinct (%): 28.7%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2024-01-24 00:08:07
Maximum: 2024-01-30 23:58:02

[Figure: Histogram with fixed size bins (bins=50)]

Interactions

[Figures: four interaction plots; images not preserved]

Correlations

Variable | 시리얼 | 지역 | 자치구 | 행정동 | 방문자수
시리얼 | 1.000 | 0.762 | 0.962 | 0.998 | 0.453
지역 | 0.762 | 1.000 | 0.921 | 0.993 | 0.202
자치구 | 0.962 | 0.921 | 1.000 | 1.000 | 0.422
행정동 | 0.998 | 0.993 | 1.000 | 1.000 | 0.658
방문자수 | 0.453 | 0.202 | 0.422 | 0.658 | 1.000

Variable | 지역 | 자치구 | 행정동
지역 | 1.000 | 0.640 | 0.944
자치구 | 0.640 | 1.000 | 0.993
행정동 | 0.944 | 0.993 | 1.000

Variable | 시리얼 | 방문자수 | 지역 | 자치구 | 행정동
시리얼 | 1.000 | -0.302 | 0.536 | 0.821 | 0.979
방문자수 | -0.302 | 1.000 | 0.108 | 0.175 | 0.290
지역 | 0.536 | 0.108 | 1.000 | 0.640 | 0.944
자치구 | 0.821 | 0.175 | 0.640 | 1.000 | 0.993
행정동 | 0.979 | 0.290 | 0.944 | 0.993 | 1.000
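
The report text does not say which association measure each matrix uses. For pairs of categorical variables, one common choice is Cramér's V, which ranges from 0 (no association) to 1 (one variable determines the other). The sketch below is a plain, uncorrected Cramér's V on hypothetical toy data; it is not claimed to reproduce the numbers above, and the function name and inputs are illustrative.

```python
import math
from collections import Counter


def cramers_v(xs, ys):
    """Uncorrected Cramér's V for two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))     # observed cell counts
    row = Counter(xs)                # row marginals
    col = Counter(ys)                # column marginals

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for r, r_count in row.items():
        for c, c_count in col.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row), len(col)) - 1
    if k == 0:
        return 0.0  # a constant column carries no association information
    return math.sqrt(chi2 / (n * k))


# Hypothetical toy data: district fully determines area type here.
district = ["A", "A", "B", "B"]
area = ["park", "park", "street", "street"]
print(cramers_v(district, area))   # 1.0

# Hypothetical toy data with no association at all.
print(cramers_v(["A", "A", "B", "B"], ["x", "y", "x", "y"]))   # 0.0
```

Values close to 1 between 자치구 and 행정동 are what one would expect when each 행정동 belongs to a single 자치구.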

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

(index) | 모델번호 | 시리얼 | 측정시간 | 지역 | 자치구 | 행정동 | 방문자수 | 등록일
25669 | SDOT001 | 4024 | 2024-01-27_12:05:00 | traditional_markets | Gangseo-gu | Banghwa1-dong | 12 | 2024-01-27 12:18:01
35824 | SDOT001 | 4045 | 2024-01-28_22:32:00 | parks | Seoul_Grand_Park | valet_parking1 | 0 | 2024-01-28 22:48:01
33381 | SDOT001 | 4028 | 2024-01-28_14:56:00 | main_street | Dobong-gu | Chang1-dong | 4 | 2024-01-28 15:08:03
29255 | SDOT001 | 4042 | 2024-01-27_23:27:00 | main_street | Seongdong-gu | Seongsu1ga2-dong | 4 | 2024-01-27 23:38:02
45487 | SDOT001 | 4018 | 2024-01-30_06:35:00 | main_street | Jung-gu | Myeong-dong | 9 | 2024-01-30 06:48:01
25435 | SDOT001 | 4019 | 2024-01-27_11:20:00 | main_street | Jung-gu | Hoehyeon-dong | 191 | 2024-01-27 11:38:00
21336 | SDOT001 | 4025 | 2024-01-26_21:47:00 | main_street | Gangseo-gu | Gonghang-dong | 7 | 2024-01-26 21:58:01
28835 | SDOT001 | 4040 | 2024-01-27_22:07:00 | main_street | Jongno-gu | Ihwa-dong | 21 | 2024-01-27 22:18:02
11139 | SDOT001 | 4036 | 2024-01-25_12:40:00 | main_street | Yangcheon-gu | Sinjeong4-dong | 65 | 2024-01-25 12:58:00
8819 | SDOT001 | 4020 | 2024-01-25_05:26:00 | main_street | Yongsan-gu | Itaewon2-dong | 0 | 2024-01-25 05:38:01

(index) | 모델번호 | 시리얼 | 측정시간 | 지역 | 자치구 | 행정동 | 방문자수 | 등록일
34020 | SDOT001 | 4028 | 2024-01-28_16:56:00 | main_street | Dobong-gu | Chang1-dong | 0 | 2024-01-28 17:08:02
49718 | SDOT001 | 4030 | 2024-01-30_19:57:00 | parks | Gangbuk-gu | Beon3-dong | 9 | 2024-01-30 20:08:01
35561 | SDOT001 | 4032 | 2024-01-28_21:44:00 | main_street | Mapo-gu | Mangwon-dong1 | 20 | 2024-01-28 21:58:02
36523 | SDOT001 | 4010 | 2024-01-29_01:05:00 | main_street | Gangnam-gu | Yeoksam1-dong | 7 | 2024-01-29 01:18:03
26215 | SDOT001 | 4029 | 2024-01-27_13:57:00 | traditional_markets | Gangbuk-gu | Suyu3-dong | 47 | 2024-01-27 14:08:01
13413 | SDOT001 | 4053 | 2024-01-25_19:50:00 | residential_area | Gwangjin-gu | Gwangjang-dong | 8 | 2024-01-25 20:08:01
14309 | SDOT001 | 4050 | 2024-01-25_22:36:00 | public_facilities | Seodaemun-gu | Cheonyeon-dong | 0 | 2024-01-25 22:48:03
18022 | SDOT001 | 4005 | 2024-01-26_11:12:00 | main_street | Seocho-gu | Yangjae1-dong | 75 | 2024-01-26 11:28:02
17756 | SDOT001 | 4049 | 2024-01-26_10:22:00 | traditional_markets | Seodaemun-gu | Hongje3-dong | 29 | 2024-01-26 10:38:02
29634 | SDOT001 | 4026 | 2024-01-28_00:52:00 | main_street | Gangseo-gu | Gayang-dong1 | 10 | 2024-01-28 01:08:03
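
The 측정시간 column was profiled as Text because it joins date and time with an underscore, whereas 등록일 was recognised as a DateTime. The sketch below shows one way to parse both with the standard library and measure the delay between measurement and registration. The format strings are inferred from the sample values and the function name is hypothetical.

```python
from datetime import datetime

# Formats inferred from the sample values (assumption, not stated in the report).
MEASURED_FMT = "%Y-%m-%d_%H:%M:%S"     # 측정시간, e.g. 2024-01-27_12:05:00
REGISTERED_FMT = "%Y-%m-%d %H:%M:%S"   # 등록일, e.g. 2024-01-27 12:18:01


def registration_lag_seconds(measured, registered):
    """Seconds elapsed between the measurement and its registration."""
    m = datetime.strptime(measured, MEASURED_FMT)
    r = datetime.strptime(registered, REGISTERED_FMT)
    return (r - m).total_seconds()


# Values from the first sample row (index 25669).
lag = registration_lag_seconds("2024-01-27_12:05:00", "2024-01-27 12:18:01")
print(lag)  # 781.0
```

Parsing 측정시간 this way before profiling would let it be treated as a DateTime rather than as free text.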