Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 722.7 KiB
Average record size in memory: 74.0 B
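
The average record size looks like the total in-memory size divided by the number of observations. The following is a minimal check of that arithmetic, assuming 1 KiB means 1024 bytes; the variable names are illustrative and not taken from the report.

```python
# Figures copied from the "Dataset statistics" block.
total_size_kib = 722.7
n_observations = 10_000
n_variables = 8

# Assumption: 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024
avg_record_bytes = total_size_bytes / n_observations

# Number of cells that the "Missing cells (%)" figure is measured against.
n_cells = n_observations * n_variables

print(f"average record size: {avg_record_bytes:.1f} B")
print(f"total cells: {n_cells}")
```

Under that assumption the result rounds to the 74.0 B shown in the report.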

Variable types

Categorical: 4
Numeric: 2
Text: 1
DateTime: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15964/S/1/datasetView.do

Alerts

Constant: 모델번호 (model number) has constant value "SDOT001"
High correlation: 지역 (area type) is highly overall correlated with 시리얼 (serial) and 2 other fields
High correlation: 자치구 (district) is highly overall correlated with 시리얼 and 2 other fields
High correlation: 행정동 (administrative neighborhood) is highly overall correlated with 시리얼 and 2 other fields
High correlation: 시리얼 is highly overall correlated with 지역 and 2 other fields
Zeros: 방문자수 (visitor count) has 1537 (15.4%) zeros
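
The Constant and Zeros alerts can be re-derived from simple counts. The sketch below uses two hypothetical helper functions (`constant_alert` and `zeros_share` are not part of ydata-profiling) and toy columns rebuilt from the counts reported in this document, not the real rows.

```python
def constant_alert(values):
    """Return the single repeated value if the column is constant, else None."""
    distinct = set(values)
    return next(iter(distinct)) if len(distinct) == 1 else None


def zeros_share(values):
    """Return (count, percentage) of exact zeros in a numeric column."""
    n_zeros = sum(1 for v in values if v == 0)
    return n_zeros, 100.0 * n_zeros / len(values)


# Toy columns: only the distinct count and the zero count are faithful
# to the report; the non-zero visitor values are placeholders.
model_numbers = ["SDOT001"] * 10_000
visitors = [0] * 1537 + [1] * 8463

constant_value = constant_alert(model_numbers)
n_zeros, pct = zeros_share(visitors)

print(f"constant value: {constant_value}")
print(f"{n_zeros} ({pct:.1f}%) zeros")
```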

Reproduction

Analysis started: 2024-05-11 06:51:57.833384
Analysis finished: 2024-05-11 06:51:59.323772
Duration: 1.49 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

모델번호
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
SDOT001: 10000

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: SDOT001
2nd row: SDOT001
3rd row: SDOT001
4th row: SDOT001
5th row: SDOT001

Common Values

Value  Count  Frequency (%)
SDOT001  10000  100.0%

Length

Histogram of lengths of the category


시리얼
Real number (ℝ)

HIGH CORRELATION 

Distinct: 53
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4030.3441
Minimum: 4001
Maximum: 4064
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 4001
5-th percentile: 4004
Q1: 4017
Median: 4030
Q3: 4043
95-th percentile: 4061
Maximum: 4064
Range: 63
Interquartile range (IQR): 26

Descriptive statistics

Standard deviation: 17.052718
Coefficient of variation (CV): 0.0042310823
Kurtosis: -0.9371639
Mean: 4030.3441
Median Absolute Deviation (MAD): 13
Skewness: 0.11530352
Sum: 40303441
Variance: 290.79517
Monotonicity: Not monotonic
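
Several of the figures for 시리얼 are derived from others in the same block. As a quick consistency check (a sketch with illustrative variable names, using only numbers printed in this report), the range, IQR, variance and sum can be recomputed as follows.

```python
# Values copied from the quantile and descriptive statistics of 시리얼.
minimum, maximum = 4001, 4064
q1, q3 = 4017, 4043
mean = 4030.3441
std = 17.052718
n = 10_000

value_range = maximum - minimum   # 63
iqr = q3 - q1                     # 26
variance = std ** 2               # square of the standard deviation
total = round(mean * n)           # sum recovered from mean and row count

print(value_range, iqr, round(variance, 3), total)
```

The recomputed variance agrees with the reported 290.79517 to within rounding of the printed standard deviation.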
Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
4030  209  2.1%
4034  209  2.1%
4038  209  2.1%
4053  207  2.1%
4010  204  2.0%
4029  204  2.0%
4047  203  2.0%
4005  200  2.0%
4048  200  2.0%
4051  198  2.0%
Other values (43)  7957  79.6%

Minimum 10 values

Value  Count  Frequency (%)
4001  192  1.9%
4002  195  1.9%
4004  183  1.8%
4005  200  2.0%
4006  182  1.8%
4007  179  1.8%
4008  188  1.9%
4009  197  2.0%
4010  204  2.0%
4013  183  1.8%

Maximum 10 values

Value  Count  Frequency (%)
4064  194  1.9%
4062  174  1.7%
4061  184  1.8%
4060  189  1.9%
4054  175  1.8%
4053  207  2.1%
4051  198  2.0%
4050  191  1.9%
4049  194  1.9%
4048  200  2.0%

측정시간
Text

Distinct: 5013
Distinct (%): 50.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 190000
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2320
Unique (%): 23.2%

Sample

1st row: 2024-01-04_06:51:00
2nd row: 2024-01-01_12:47:00
3rd row: 2024-01-07_12:55:00
4th row: 2024-01-06_08:36:00
5th row: 2024-01-06_17:10:00
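
The sample rows suggest that this text field is a fixed-width timestamp with an underscore between date and time. A minimal sketch of validating and parsing it with the standard library follows; the pattern and format string are inferred from the five sample rows only, so treat them as assumptions about the full column.

```python
import re
from datetime import datetime

# Assumed layout, inferred from the sample rows: YYYY-MM-DD_HH:MM:SS
PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}_\d{2}:\d{2}:\d{2}")
FMT = "%Y-%m-%d_%H:%M:%S"

samples = [
    "2024-01-04_06:51:00",
    "2024-01-01_12:47:00",
    "2024-01-07_12:55:00",
    "2024-01-06_08:36:00",
    "2024-01-06_17:10:00",
]

all_match = all(PATTERN.fullmatch(s) is not None for s in samples)
lengths = {len(s) for s in samples}

# Ten digits plus '-', ':' and '_' give the 13 distinct characters reported.
alphabet = set("0123456789-:_")

parsed = [datetime.strptime(s, FMT) for s in samples]

print(all_match, lengths, len(alphabet))
print(parsed[0].isoformat())
```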

Value  Count  Frequency (%)
2024-01-06_01:30:00  9  0.1%
2024-01-07_17:10:00  9  0.1%
2024-01-03_16:50:00  8  0.1%
2024-01-06_15:00:00  8  0.1%
2024-01-06_22:56:00  8  0.1%
2024-01-02_03:10:00  8  0.1%
2024-01-04_18:10:00  8  0.1%
2024-01-01_12:20:00  8  0.1%
2024-01-01_14:00:00  7  0.1%
2024-01-01_02:50:00  7  0.1%
Other values (5003)  9920  99.2%

Most occurring characters

Value  Count  Frequency (%)
0  59755  31.4%
2  26838  14.1%
-  20000  10.5%
:  20000  10.5%
1  19369  10.2%
4  14368  7.6%
_  10000  5.3%
5  5597  2.9%
3  4553  2.4%
6  4371  2.3%
Other values (3)  5149  2.7%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  140000  73.7%
Dash Punctuation  20000  10.5%
Other Punctuation  20000  10.5%
Connector Punctuation  10000  5.3%
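
The category totals follow directly from the fixed timestamp layout: each 19-character value holds 14 digits, 2 hyphens, 2 colons and 1 underscore. The sketch below recomputes them with Python's `unicodedata` module, using one sample value and assuming all 10000 rows share that layout.

```python
import unicodedata
from collections import Counter

sample = "2024-01-04_06:51:00"
n_rows = 10_000  # assumption: every row has the same 19-character layout

# General category codes: Nd = Decimal Number, Pd = Dash Punctuation,
# Po = Other Punctuation, Pc = Connector Punctuation.
per_row = Counter(unicodedata.category(ch) for ch in sample)
totals = {cat: count * n_rows for cat, count in per_row.items()}

for cat, total in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(cat, total)
```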

Most frequent character per category

Decimal Number

Value  Count  Frequency (%)
0  59755  42.7%
2  26838  19.2%
1  19369  13.8%
4  14368  10.3%
5  5597  4.0%
3  4553  3.3%
6  4371  3.1%
7  3383  2.4%
8  893  0.6%
9  873  0.6%

Dash Punctuation

Value  Count  Frequency (%)
-  20000  100.0%

Other Punctuation

Value  Count  Frequency (%)
:  20000  100.0%

Connector Punctuation

Value  Count  Frequency (%)
_  10000  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Common  190000  100.0%

Most frequent character per script

Common

Value  Count  Frequency (%)
0  59755  31.4%
2  26838  14.1%
-  20000  10.5%
:  20000  10.5%
1  19369  10.2%
4  14368  7.6%
_  10000  5.3%
5  5597  2.9%
3  4553  2.4%
6  4371  2.3%
Other values (3)  5149  2.7%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  190000  100.0%

Most frequent character per block

ASCII

Value  Count  Frequency (%)
0  59755  31.4%
2  26838  14.1%
-  20000  10.5%
:  20000  10.5%
1  19369  10.2%
4  14368  7.6%
_  10000  5.3%
5  5597  2.9%
3  4553  2.4%
6  4371  2.3%
Other values (3)  5149  2.7%

지역
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
main_street: 6363
parks: 2072
traditional_markets: 969
residential_area: 207
commercial_area: 198

Length

Max length: 19
Median length: 11
Mean length: 10.8293
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: commercial_area
2nd row: parks
3rd row: main_street
4th row: traditional_markets
5th row: main_street

Common Values

Value  Count  Frequency (%)
main_street  6363  63.6%
parks  2072  20.7%
traditional_markets  969  9.7%
residential_area  207  2.1%
commercial_area  198  2.0%
public_facilities  191  1.9%
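
The length statistics for 지역 can be recovered from the category counts alone, since every row holds one of six labels. A short sketch, using the counts listed in this section (the dictionary name is illustrative):

```python
import statistics

# Category counts copied from the common-values table for 지역.
counts = {
    "main_street": 6363,
    "parks": 2072,
    "traditional_markets": 969,
    "residential_area": 207,
    "commercial_area": 198,
    "public_facilities": 191,
}

n = sum(counts.values())

# Expand to one string length per row, then summarise.
lengths = [len(label) for label, c in counts.items() for _ in range(c)]

mean_length = sum(lengths) / n
median_length = statistics.median(lengths)

print(n, min(lengths), max(lengths), median_length, mean_length)
```

The weighted mean works out to 108293 / 10000 = 10.8293, matching the reported mean length.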

Length

Histogram of lengths of the category


자치구
Categorical

HIGH CORRELATION 

Distinct: 18
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Jung-gu: 1143
Seoul_Grand_Park: 1087
Gangnam-gu: 948
Jongno-gu: 919
Gangseo-gu: 909
Other values (13): 4994

Length

Max length: 16
Median length: 11
Mean length: 10.4664
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Gangdong-gu
2nd row: Seoul_Grand_Park
3rd row: Gangnam-gu
4th row: Gangbuk-gu
5th row: Gangdong-gu

Common Values

Value  Count  Frequency (%)
Jung-gu  1143  11.4%
Seoul_Grand_Park  1087  10.9%
Gangnam-gu  948  9.5%
Jongno-gu  919  9.2%
Gangseo-gu  909  9.1%
Seocho-gu  770  7.7%
Gangdong-gu  750  7.5%
Gwangjin-gu  610  6.1%
Seodaemun-gu  566  5.7%
Gangbuk-gu  413  4.1%
Other values (8)  1885  18.9%

Length

Histogram of lengths of the category

행정동
Categorical

HIGH CORRELATION 

Distinct: 47
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Buam-dong: 395
Daechi4-dong: 385
Myeong-dong: 385
Gahoe-dong: 380
Hongje3-dong: 375
Other values (42): 8080

Length

Max length: 16
Median length: 14
Mean length: 12.1998
Min length: 7

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Buam-dong
2nd row: skylift
3rd row: Yeoksam1-dong
4th row: Suyu3-dong
5th row: Cheonho1-dong

Common Values

Value  Count  Frequency (%)
Buam-dong  395  4.0%
Daechi4-dong  385  3.9%
Myeong-dong  385  3.9%
Gahoe-dong  380  3.8%
Hongje3-dong  375  3.8%
Ihwa-dong  352  3.5%
Beon3-dong  209  2.1%
Jamsil6-dong  209  2.1%
Gwangjang-dong  207  2.1%
Suyu3-dong  204  2.0%
Other values (37)  6899  69.0%

Length

Histogram of lengths of the category

방문자수
Real number (ℝ)

ZEROS 

Distinct: 413
Distinct (%): 4.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.8578
Minimum: 0
Maximum: 731
Zeros: 1537
Zeros (%): 15.4%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
Median: 13
Q3: 42
95-th percentile: 166
Maximum: 731
Range: 731
Interquartile range (IQR): 40

Descriptive statistics

Standard deviation: 68.995067
Coefficient of variation (CV): 1.8224796
Kurtosis: 19.836927
Mean: 37.8578
Median Absolute Deviation (MAD): 13
Skewness: 3.9151275
Sum: 378578
Variance: 4760.3192
Monotonicity: Not monotonic
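
The visitor counts are strongly right-skewed: the mean is almost three times the median, and about 15% of readings are zero. A small sketch, using only figures printed in this section, shows how the mean and coefficient of variation relate to the other statistics and what the mean becomes once zero readings are set aside (that last figure is derived here and does not appear in the report).

```python
# Figures copied from the 방문자수 statistics.
n = 10_000
total = 378_578
std = 68.995067
median = 13
n_zeros = 1537

mean = total / n                 # mean recovered from sum and row count
cv = std / mean                  # coefficient of variation
mean_to_median = mean / median   # rough indicator of right skew

# Mean over non-zero readings only; zeros contribute nothing to the sum.
nonzero_mean = total / (n - n_zeros)

print(f"mean={mean:.4f} cv={cv:.7f} mean/median={mean_to_median:.2f}")
print(f"mean of non-zero readings={nonzero_mean:.2f}")
```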
Histogram with fixed size bins (bins=50)

Common values

Value  Count  Frequency (%)
0  1537  15.4%
1  551  5.5%
2  459  4.6%
3  352  3.5%
4  318  3.2%
5  269  2.7%
6  256  2.6%
7  229  2.3%
9  200  2.0%
10  198  2.0%
Other values (403)  5631  56.3%

Minimum 10 values

Value  Count  Frequency (%)
0  1537  15.4%
1  551  5.5%
2  459  4.6%
3  352  3.5%
4  318  3.2%
5  269  2.7%
6  256  2.6%
7  229  2.3%
8  165  1.7%
9  200  2.0%

Maximum 10 values

Value  Count  Frequency (%)
731  1  < 0.1%
648  1  < 0.1%
641  1  < 0.1%
627  1  < 0.1%
622  1  < 0.1%
607  1  < 0.1%
602  1  < 0.1%
599  1  < 0.1%
592  1  < 0.1%
589  1  < 0.1%

등록일
DateTime

Distinct: 2939
Distinct (%): 29.4%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2024-01-01 00:08:05
Maximum: 2024-01-07 23:58:03
Histogram with fixed size bins (bins=50)


Correlations

          시리얼   지역    자치구   행정동   방문자수
시리얼     1.000  0.768  0.963  0.998  0.466
지역       0.768  1.000  0.924  0.994  0.177
자치구     0.963  0.924  1.000  1.000  0.412
행정동     0.998  0.994  1.000  1.000  0.652
방문자수   0.466  0.177  0.412  0.652  1.000

          지역    자치구   행정동
지역       1.000  0.647  0.945
자치구     0.647  1.000  0.992
행정동     0.945  0.992  1.000

          시리얼   방문자수  지역    자치구   행정동
시리얼     1.000  -0.284  0.544  0.825  0.979
방문자수   -0.284  1.000  0.094  0.171  0.285
지역       0.544  0.094  1.000  0.647  0.945
자치구     0.825  0.171  0.647  1.000  0.992
행정동     0.979  0.285  0.945  0.992  1.000
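
The High correlation alerts can be related back to the last matrix above by applying a cut-off to the absolute coefficients. The sketch below assumes a threshold of 0.5; that value is not stated anywhere in this report, and `highly_correlated` is a hypothetical helper, not a ydata-profiling function.

```python
THRESHOLD = 0.5  # assumption: the report does not state the cut-off it used

# Last (5 x 5) correlation matrix, rows and columns in the same order.
fields = ["시리얼", "방문자수", "지역", "자치구", "행정동"]
matrix = [
    [1.000, -0.284, 0.544, 0.825, 0.979],
    [-0.284, 1.000, 0.094, 0.171, 0.285],
    [0.544, 0.094, 1.000, 0.647, 0.945],
    [0.825, 0.171, 0.647, 1.000, 0.992],
    [0.979, 0.285, 0.945, 0.992, 1.000],
]


def highly_correlated(name):
    """List the other fields whose |coefficient| with `name` meets the threshold."""
    row = matrix[fields.index(name)]
    return [
        other
        for other, value in zip(fields, row)
        if other != name and abs(value) >= THRESHOLD
    ]


for name in fields:
    partners = highly_correlated(name)
    if partners:
        print(f"{name}: {', '.join(partners)}")
```

With this cut-off, 시리얼, 지역, 자치구 and 행정동 each have three partners and 방문자수 has none, which is consistent with the four alerts listed earlier. Applying the same cut-off to the first matrix would also flag 방문자수 (0.652 with 행정동), so the alerts appear to correspond to the last matrix.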

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index)  모델번호  시리얼  측정시간  지역  자치구  행정동  방문자수  등록일
24523  SDOT001  4051  2024-01-04_06:51:00  commercial_area  Gangdong-gu  Buam-dong  27  2024-01-04 07:08:02
3875  SDOT001  4064  2024-01-01_12:47:00  parks  Seoul_Grand_Park  skylift  12  2024-01-01 12:58:01
48428  SDOT001  4010  2024-01-07_12:55:00  main_street  Gangnam-gu  Yeoksam1-dong  144  2024-01-07 13:08:04
40020  SDOT001  4029  2024-01-06_08:36:00  traditional_markets  Gangbuk-gu  Suyu3-dong  27  2024-01-06 08:48:00
42791  SDOT001  4033  2024-01-06_17:10:00  main_street  Gangdong-gu  Cheonho1-dong  70  2024-01-06 17:28:02
49921  SDOT001  4030  2024-01-07_17:47:00  parks  Gangbuk-gu  Beon3-dong  11  2024-01-07 17:58:01
18174  SDOT001  4002  2024-01-03_10:26:00  main_street  Seocho-gu  Seocho4-dong  162  2024-01-03 10:38:02
9043  SDOT001  4051  2024-01-02_05:11:00  commercial_area  Gangdong-gu  Buam-dong  4  2024-01-02 05:28:01
41615  SDOT001  4015  2024-01-06_13:30:00  main_street  Jung-gu  Gwanghui-dong  469  2024-01-06 13:48:00
8918  SDOT001  4020  2024-01-02_04:56:00  main_street  Yongsan-gu  Itaewon2-dong  4  2024-01-02 05:08:02

(index)  모델번호  시리얼  측정시간  지역  자치구  행정동  방문자수  등록일
51442  SDOT001  4031  2024-01-07_22:20:00  traditional_markets  Seodaemun-gu  Hongje3-dong  6  2024-01-07 22:38:02
39789  SDOT001  4001  2024-01-06_07:40:00  main_street  Seocho-gu  Seocho3-dong  4  2024-01-06 07:58:02
29838  SDOT001  4027  2024-01-04_23:30:00  main_street  Gangseo-gu  Deungchon-dong  125  2024-01-04 23:48:01
20580  SDOT001  4064  2024-01-03_18:07:00  parks  Seoul_Grand_Park  skylift  9  2024-01-03 18:18:00
11786  SDOT001  4045  2024-01-02_13:52:00  parks  Seoul_Grand_Park  valet_parking  141  2024-01-02 14:08:03
23380  SDOT001  4005  2024-01-04_03:12:00  main_street  Seocho-gu  Yangjae1-dong  1  2024-01-04 03:28:02
9492  SDOT001  4048  2024-01-02_06:44:00  main_street  Gwangjin-gu  Guui1-dong  22  2024-01-02 06:58:01
45695  SDOT001  4020  2024-01-07_02:46:00  main_street  Yongsan-gu  Itaewon2-dong  0  2024-01-07 02:58:00
26111  SDOT001  4025  2024-01-04_11:57:00  main_street  Gangseo-gu  Gonghang-dong  16  2024-01-04 12:08:03
8402  SDOT001  4033  2024-01-02_03:10:00  main_street  Gangdong-gu  Cheonho1-dong  2  2024-01-02 03:28:02
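
In the sample rows, 등록일 (registration time) trails 측정시간 (measurement time) by several minutes. A minimal sketch of measuring that lag for the first three sample rows follows; the two format strings are inferred from the sample values and the variable names are illustrative.

```python
from datetime import datetime

MEASURED_FMT = "%Y-%m-%d_%H:%M:%S"    # 측정시간 layout seen in the samples
REGISTERED_FMT = "%Y-%m-%d %H:%M:%S"  # 등록일 layout seen in the samples

# (측정시간, 등록일) pairs from the first three sample rows.
rows = [
    ("2024-01-04_06:51:00", "2024-01-04 07:08:02"),
    ("2024-01-01_12:47:00", "2024-01-01 12:58:01"),
    ("2024-01-07_12:55:00", "2024-01-07 13:08:04"),
]

lags_seconds = [
    int(
        (
            datetime.strptime(registered, REGISTERED_FMT)
            - datetime.strptime(measured, MEASURED_FMT)
        ).total_seconds()
    )
    for measured, registered in rows
]

for (measured, _), lag in zip(rows, lags_seconds):
    print(f"{measured}: registered {lag // 60} min {lag % 60} s later")
```

Three rows are too few to generalise from; running the same calculation over the full column would show whether the lag is bounded.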