Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 722.7 KiB
Average record size in memory: 74.0 B
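
The average record size follows from the total memory size and the row count. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes (the helper name `average_record_size` is illustrative and not part of the report):

```python
KIB = 1024  # bytes per KiB (assumed binary unit, as the "KiB" label suggests)


def average_record_size(total_kib: float, n_rows: int) -> float:
    """Return the average number of bytes per row."""
    return total_kib * KIB / n_rows


# Figures taken from the dataset statistics above.
avg_bytes = average_record_size(722.7, 10000)
print(f"{avg_bytes:.1f} B")
```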

Variable types

Categorical: 4
Numeric: 2
Text: 1
DateTime: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15964/S/1/datasetView.do

Alerts

모델번호 has constant value "SDOT001" (Constant)
지역 is highly overall correlated with 시리얼 and 2 other fields (High correlation)
자치구 is highly overall correlated with 시리얼 and 2 other fields (High correlation)
행정동 is highly overall correlated with 시리얼 and 2 other fields (High correlation)
시리얼 is highly overall correlated with 지역 and 2 other fields (High correlation)
방문자수 has 1616 (16.2%) zeros (Zeros)
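
The Constant and Zeros alerts can be illustrated with two small checks. This is a sketch on hypothetical toy columns, not the profiler's own implementation; `zero_share` and `is_constant` are illustrative names.

```python
def zero_share(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


def is_constant(values):
    """True when every entry holds the same value."""
    return len(set(values)) == 1


# Hypothetical toy columns, not rows from the dataset.
visitors = [0, 3, 0, 12, 7, 0, 1, 25, 4, 9]
model = ["SDOT001"] * 10

print(f"{zero_share(visitors):.1%}")
print(is_constant(model))

# The share stated in the alert: 1616 zeros out of 10000 observations.
reported_share = 1616 / 10000
print(f"{reported_share:.1%}")
```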

Reproduction

Analysis started: 2024-05-11 06:51:40.170411
Analysis finished: 2024-05-11 06:51:42.965736
Duration: 2.8 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
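
The reported duration can be checked from the two timestamps above. A short sketch using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2024-05-11 06:51:40.170411")
finished = datetime.fromisoformat("2024-05-11 06:51:42.965736")

duration = (finished - started).total_seconds()
print(f"{duration:.1f} seconds")
```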

Variables

모델번호
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
SDOT001: 10000

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SDOT001
2nd row: SDOT001
3rd row: SDOT001
4th row: SDOT001
5th row: SDOT001

Common Values

Value  Count  Frequency (%)
SDOT001  10000  100.0%

Length

Histogram of lengths of the category

시리얼
Real number (ℝ)

HIGH CORRELATION 

Distinct: 53
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4030.112
Minimum: 4001
Maximum: 4064
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 4001
5-th percentile: 4004
Q1: 4016
median: 4030
Q3: 4043
95-th percentile: 4061
Maximum: 4064
Range: 63
Interquartile range (IQR): 27
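
Range and IQR are simple differences of the quantiles listed above. The sketch below recomputes them and, as an added illustration that is not part of the report, applies the common 1.5 × IQR rule of thumb to see whether the extremes would count as outliers.

```python
# Quantiles of 시리얼 as listed in the report.
quantiles = {"min": 4001, "q1": 4016, "median": 4030, "q3": 4043, "max": 4064}

value_range = quantiles["max"] - quantiles["min"]
iqr = quantiles["q3"] - quantiles["q1"]

# Tukey-style fences (an assumption for illustration; the report does not use them).
lower_fence = quantiles["q1"] - 1.5 * iqr
upper_fence = quantiles["q3"] + 1.5 * iqr
has_outliers = quantiles["min"] < lower_fence or quantiles["max"] > upper_fence

print(value_range, iqr, lower_fence, upper_fence, has_outliers)
```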

Descriptive statistics

Standard deviation: 17.026792
Coefficient of variation (CV): 0.004224893
Kurtosis: -0.92662364
Mean: 4030.112
Median Absolute Deviation (MAD): 13
Skewness: 0.14852503
Sum: 40301120
Variance: 289.91165
Monotonicity: Not monotonic
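
Several of these descriptive statistics are tied together by simple identities: CV is the standard deviation divided by the mean, variance is the squared standard deviation, and the sum is the mean times the number of observations. A quick consistency check on the reported figures:

```python
from math import isclose

# Values copied from the descriptive statistics of 시리얼.
mean = 4030.112
std = 17.026792
n_rows = 10000

cv = std / mean          # coefficient of variation
variance = std ** 2      # variance from the standard deviation
total = mean * n_rows    # sum from the mean and the row count

print(f"CV={cv:.9f} variance={variance:.5f} sum={total:.0f}")
```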
Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
4006  215  2.1%
4022  209  2.1%
4014  208  2.1%
4041  206  2.1%
4008  206  2.1%
4034  205  2.1%
4010  204  2.0%
4045  203  2.0%
4060  202  2.0%
4047  201  2.0%
Other values (43)  7941  79.4%

Minimum 10 values

Value  Count  Frequency (%)
4001  180  1.8%
4002  186  1.9%
4004  182  1.8%
4005  185  1.8%
4006  215  2.1%
4007  174  1.7%
4008  206  2.1%
4009  197  2.0%
4010  204  2.0%
4013  198  2.0%

Maximum 10 values

Value  Count  Frequency (%)
4064  188  1.9%
4062  194  1.9%
4061  169  1.7%
4060  202  2.0%
4054  189  1.9%
4053  172  1.7%
4051  172  1.7%
4050  173  1.7%
4049  186  1.9%
4048  180  1.8%

측정시간
Text

Distinct: 5013
Distinct (%): 50.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 190000
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
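
To see where the four character categories come from, the standard library's `unicodedata.category` can be applied to one of the sample values. This is a sketch on a single string; scaling to 10000 rows of 19 characters each is an assumption based on the fixed length reported above.

```python
import unicodedata
from collections import Counter

# One sample value of the text variable, copied from the report.
sample = "2024-01-09_08:32:00"

# General category per character: Nd = Decimal Number, Pd = Dash Punctuation,
# Po = Other Punctuation, Pc = Connector Punctuation.
categories = Counter(unicodedata.category(ch) for ch in sample)
print(dict(categories))

# Every value has the same layout, so per-row counts scale with the row count.
n_rows = 10000
totals = {cat: count * n_rows for cat, count in categories.items()}
print(totals)
```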

Unique

Unique: 2341
Unique (%): 23.4%

Sample

1st row: 2024-01-09_08:32:00
2nd row: 2024-01-13_15:15:00
3rd row: 2024-01-09_16:45:00
4th row: 2024-01-13_16:05:00
5th row: 2024-01-10_18:37:00
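
The sample values use an underscore between date and time, which is likely why the column is typed as Text rather than DateTime. A minimal sketch of parsing them with an explicit format string (the name `parse_measured_at` is illustrative):

```python
from datetime import datetime

# Layout inferred from the sample rows: date, underscore, time.
MEASURED_AT_FORMAT = "%Y-%m-%d_%H:%M:%S"


def parse_measured_at(text: str) -> datetime:
    """Parse a value such as '2024-01-09_08:32:00' into a datetime."""
    return datetime.strptime(text, MEASURED_AT_FORMAT)


samples = [
    "2024-01-09_08:32:00",
    "2024-01-13_15:15:00",
    "2024-01-09_16:45:00",
]
parsed = [parse_measured_at(s) for s in samples]
print(parsed[0].isoformat(sep=" "))
```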

Value  Count  Frequency (%)
2024-01-13_20:40:00  8  0.1%
2024-01-12_23:00:00  8  0.1%
2024-01-09_18:20:00  8  0.1%
2024-01-13_14:10:00  8  0.1%
2024-01-11_14:10:00  8  0.1%
2024-01-12_08:30:00  8  0.1%
2024-01-13_19:20:00  8  0.1%
2024-01-13_13:50:00  7  0.1%
2024-01-13_07:20:00  7  0.1%
2024-01-14_17:20:00  7  0.1%
Other values (5003)  9923  99.2%

Most occurring characters

Value  Count  Frequency (%)
0  54189  28.5%
2  26819  14.1%
1  26515  14.0%
-  20000  10.5%
:  20000  10.5%
4  14326  7.5%
_  10000  5.3%
3  4606  2.4%
5  4151  2.2%
6  2823  1.5%
Other values (3)  6571  3.5%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  140000  73.7%
Dash Punctuation  20000  10.5%
Other Punctuation  20000  10.5%
Connector Punctuation  10000  5.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0  54189  38.7%
2  26819  19.2%
1  26515  18.9%
4  14326  10.2%
3  4606  3.3%
5  4151  3.0%
6  2823  2.0%
9  2336  1.7%
8  2218  1.6%
7  2017  1.4%

Dash Punctuation
Value  Count  Frequency (%)
-  20000  100.0%

Other Punctuation
Value  Count  Frequency (%)
:  20000  100.0%

Connector Punctuation
Value  Count  Frequency (%)
_  10000  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Common  190000  100.0%

Most frequent character per script

Common: same counts as the "Most occurring characters" table, because every character belongs to this script.

Most occurring blocks

Value  Count  Frequency (%)
ASCII  190000  100.0%

Most frequent character per block

ASCII: same counts as the "Most occurring characters" table, because every character belongs to this block.

지역
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
main_street: 6431
parks: 2119
traditional_markets: 933
public_facilities: 173
residential_area: 172

Length

Max length: 19
Median length: 11
Mean length: 10.7336
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: main_street
2nd row: main_street
3rd row: parks
4th row: parks
5th row: main_street

Common Values

Value  Count  Frequency (%)
main_street  6431  64.3%
parks  2119  21.2%
traditional_markets  933  9.3%
public_facilities  173  1.7%
residential_area  172  1.7%
commercial_area  172  1.7%

Length

Histogram of lengths of the category

자치구
Categorical

HIGH CORRELATION 

Distinct: 18
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Jung-gu: 1167
Seoul_Grand_Park: 1145
Gangnam-gu: 991
Gangseo-gu: 964
Jongno-gu: 901
Other values (13): 4832

Length

Max length: 16
Median length: 11
Mean length: 10.4901
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Gangseo-gu
2nd row: Jongno-gu
3rd row: Seoul_Grand_Park
4th row: Songpa-gu
5th row: Gangdong-gu

Common Values

Value  Count  Frequency (%)
Jung-gu  1167  11.7%
Seoul_Grand_Park  1145  11.5%
Gangnam-gu  991  9.9%
Gangseo-gu  964  9.6%
Jongno-gu  901  9.0%
Gangdong-gu  755  7.5%
Seocho-gu  733  7.3%
Gwangjin-gu  553  5.5%
Seodaemun-gu  551  5.5%
Eunpyeong-gu  387  3.9%
Other values (8)  1853  18.5%

Length

Histogram of lengths of the category

행정동
Categorical

HIGH CORRELATION 

Distinct: 47
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Daechi4-dong: 403
Myeong-dong: 385
Hongje3-dong: 378
Buam-dong: 368
Ihwa-dong: 367
Other values (42): 8099

Length

Max length: 16
Median length: 14
Mean length: 12.2255
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Gayang-dong1
2nd row: Gahoe-dong
3rd row: meeting_bridge1
4th row: Jamsil6-dong
5th row: Seongnae1-dong

Common Values

Value  Count  Frequency (%)
Daechi4-dong  403  4.0%
Myeong-dong  385  3.9%
Hongje3-dong  378  3.8%
Buam-dong  368  3.7%
Ihwa-dong  367  3.7%
Gahoe-dong  364  3.6%
Amsa3-dong  215  2.1%
Daejo-dong  209  2.1%
Sindang-dong  208  2.1%
Jamsil6-dong  205  2.1%
Other values (37)  6898  69.0%

Length

Histogram of lengths of the category

방문자수
Real number (ℝ)

ZEROS 

Distinct: 365
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 36.4491
Minimum: 0
Maximum: 642
Zeros: 1616
Zeros (%): 16.2%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
median: 13
Q3: 43
95-th percentile: 161
Maximum: 642
Range: 642
Interquartile range (IQR): 41

Descriptive statistics

Standard deviation: 62.90336
Coefficient of variation (CV): 1.7257864
Kurtosis: 17.521574
Mean: 36.4491
Median Absolute Deviation (MAD): 13
Skewness: 3.5835508
Sum: 364491
Variance: 3956.8327
Monotonicity: Not monotonic
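
For a skewed count variable like this one, the median absolute deviation is a more robust spread measure than the standard deviation. A minimal sketch of the usual definition, the median of absolute deviations from the median, assuming no scaling constant is applied; the input is a hypothetical toy list, not the dataset.

```python
from statistics import median


def median_absolute_deviation(values):
    """Median of the absolute deviations from the median (unscaled)."""
    center = median(values)
    return median(abs(v - center) for v in values)


# Hypothetical toy data chosen for illustration only.
toy_visitors = [0, 2, 13, 43, 161]
mad = median_absolute_deviation(toy_visitors)
print(mad)
```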
Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
0  1616  16.2%
1  588  5.9%
2  456  4.6%
3  381  3.8%
4  310  3.1%
5  275  2.8%
6  239  2.4%
8  213  2.1%
7  210  2.1%
9  191  1.9%
Other values (355)  5521  55.2%

Minimum 10 values

Value  Count  Frequency (%)
0  1616  16.2%
1  588  5.9%
2  456  4.6%
3  381  3.8%
4  310  3.1%
5  275  2.8%
6  239  2.4%
7  210  2.1%
8  213  2.1%
9  191  1.9%

Maximum 10 values

Value  Count  Frequency (%)
642  1  < 0.1%
633  1  < 0.1%
629  1  < 0.1%
624  1  < 0.1%
594  1  < 0.1%
586  1  < 0.1%
567  1  < 0.1%
556  1  < 0.1%
553  1  < 0.1%
550  1  < 0.1%

등록일
DateTime

Distinct: 2923
Distinct (%): 29.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2024-01-08 00:28:00
Maximum: 2024-01-14 23:58:02
Histogram with fixed size bins (bins=50)

Correlations

          시리얼  지역  자치구  행정동  방문자수
시리얼    1.000  0.756  0.961  0.998  0.457
지역      0.756  1.000  0.919  0.993  0.208
자치구    0.961  0.919  1.000  1.000  0.417
행정동    0.998  0.993  1.000  1.000  0.655
방문자수  0.457  0.208  0.417  0.655  1.000

          지역  자치구  행정동
지역      1.000  0.635  0.942
자치구    0.635  1.000  0.993
행정동    0.942  0.993  1.000

          시리얼  방문자수  지역  자치구  행정동
시리얼    1.000  -0.310  0.529  0.817  0.979
방문자수  -0.310  1.000  0.110  0.173  0.288
지역      0.529  0.110  1.000  0.635  0.942
자치구    0.817  0.173  0.635  1.000  0.993
행정동    0.979  0.288  0.942  0.993  1.000
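
The extracted text does not name the measure behind each matrix. For pairs of categorical variables, one common association measure is Cramér's V; the sketch below is an assumption about the kind of statistic involved, not the profiler's confirmed method. It uses the uncorrected formula V = sqrt(chi² / (n · (min(r, c) − 1))) on hypothetical toy data.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for r, r_count in row_totals.items():
        for c, c_count in col_totals.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


# Hypothetical toy columns: each district maps to exactly one area type.
district = ["Jung-gu", "Jung-gu", "Songpa-gu", "Songpa-gu"]
area = ["main_street", "main_street", "parks", "parks"]
print(cramers_v(district, area))
```

A value near 1, as between 자치구 and 행정동 above, is what one would expect when one variable almost determines the other, for example when each neighbourhood lies in exactly one district.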

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

모델번호시리얼측정시간지역자치구행정동방문자수등록일
10041SDOT00140262024-01-09_08:32:00main_streetGangseo-guGayang-dong1492024-01-09 08:48:02
41803SDOT00140372024-01-13_15:15:00main_streetJongno-guGahoe-dong482024-01-13 15:28:01
12617SDOT00140622024-01-09_16:45:00parksSeoul_Grand_Parkmeeting_bridge1212024-01-09 16:58:01
42091SDOT00140342024-01-13_16:05:00parksSongpa-guJamsil6-dong1382024-01-13 16:18:02
20603SDOT00140072024-01-10_18:37:00main_streetGangdong-guSeongnae1-dong222024-01-10 18:48:02
36488SDOT00140212024-01-12_21:56:00parksEunpyeong-guNokbeon-dong22024-01-12 22:08:02
14832SDOT00140312024-01-09_23:30:00traditional_marketsSeodaemun-guHongje3-dong12024-01-09 23:48:03
38945SDOT00140392024-01-13_06:11:00main_streetJongno-guSamcheong-dong02024-01-13 06:28:01
7348SDOT00140022024-01-08_23:26:00main_streetSeocho-guSeocho4-dong312024-01-08 23:38:02
8552SDOT00140202024-01-09_03:56:00main_streetYongsan-guItaewon2-dong02024-01-09 04:08:01
모델번호시리얼측정시간지역자치구행정동방문자수등록일
5118SDOT00140492024-01-08_16:22:00traditional_marketsSeodaemun-guHongje3-dong602024-01-08 16:38:03
36351SDOT00140542024-01-12_21:20:00parksSeoul_Grand_Parkmeeting_bridge202024-01-12 21:38:02
49799SDOT00140272024-01-14_18:00:00main_streetGangseo-guDeungchon-dong1472024-01-14 18:18:02
46110SDOT00140642024-01-14_06:27:00parksSeoul_Grand_Parkskylift22024-01-14 06:38:02
10840SDOT00140262024-01-09_11:02:00main_streetGangseo-guGayang-dong1392024-01-09 11:18:02
31765SDOT00140302024-01-12_07:07:00parksGangbuk-guBeon3-dong32024-01-12 07:18:00
22612SDOT00140352024-01-11_01:15:00main_streetGwanak-guNakseongdae-dong212024-01-11 01:28:01
7896SDOT00140362024-01-09_01:30:00main_streetYangcheon-guSinjeong4-dong32024-01-09 01:48:01
10295SDOT00140352024-01-09_09:25:00main_streetGwanak-guNakseongdae-dong192024-01-09 09:38:00
39084SDOT00140472024-01-13_06:46:00traditional_marketsGwangjin-guHwayang-dong372024-01-13 06:58:00