Overview

Dataset statistics

Number of variables: 6
Number of observations: 68
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 34
Duplicate rows (%): 50.0%
Total size in memory: 3.6 KiB
Average record size in memory: 53.9 B
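
The percentage and size figures above follow from simple ratios. The sketch below recomputes them from the reported numbers only; the variable names are illustrative and not part of the report.

```python
# Figures copied from the Dataset statistics table above.
n_observations = 68
n_duplicate_rows = 34
avg_record_bytes = 53.9  # reported average record size, in bytes

# Share of rows flagged as duplicates.
duplicate_pct = 100 * n_duplicate_rows / n_observations

# Total size implied by the average record size, converted to KiB.
total_kib = avg_record_bytes * n_observations / 1024

print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Total size: {total_kib:.1f} KiB")
```

Both results round to the values shown in the table (50.0% and 3.6 KiB).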

Variable types

DateTime: 1
Text: 1
Numeric: 2
Categorical: 2

Dataset

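Description: 기준년월 (reference year-month), 주차장명 (parking lot name), 정기권대수(소형) (season-pass allocation, small vehicles), 정기권대수(대형) (season-pass allocation, large vehicles), 현재판매량(소형) (current sales, small vehicles), 현재판매량(대형) (current sales, large vehicles)
Author: 서울시설공단 (Seoul Facilities Corporation)
URL: https://data.seoul.go.kr/dataList/OA-2209/S/1/datasetView.do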

Alerts

기준년월 has constant value "" (Constant)
정기권대수(대형) has constant value "" (Constant)
현재판매량(대형) has constant value "" (Constant)
Dataset has 34 (50.0%) duplicate rows (Duplicates)
정기권대수(소형) is highly overall correlated with 현재판매량(소형) (High correlation)
현재판매량(소형) is highly overall correlated with 정기권대수(소형) (High correlation)
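
A "Constant" alert means a column holds a single distinct value. The sketch below shows one way to detect that, using the first six rows from this report's Sample section as input. The helper name `constant_columns` is hypothetical, and this is not necessarily how ydata-profiling implements the check.

```python
# Column names and the first six rows as shown in the Sample section.
columns = ["기준년월", "주차장명", "정기권대수(소형)", "정기권대수(대형)",
           "현재판매량(소형)", "현재판매량(대형)"]
rows = [
    ("2024-04", "가양라이품", 30, 0, 30, 0),
    ("2024-04", "개화산역", 330, 0, 330, 0),
    ("2024-04", "개화역", 290, 0, 290, 0),
    ("2024-04", "구파발역", 240, 0, 240, 0),
    ("2024-04", "도봉산역", 230, 0, 230, 0),
    ("2024-04", "동구로", 50, 0, 43, 0),
]

def constant_columns(columns, rows):
    """Return the names of columns that hold exactly one distinct value."""
    return [name for i, name in enumerate(columns)
            if len({row[i] for row in rows}) == 1]

constant = constant_columns(columns, rows)
print(constant)
```

On these rows the check flags the same three columns as the alerts above.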

Reproduction

Analysis started: 2024-05-11 09:35:02.782545
Analysis finished: 2024-05-11 09:35:09.706227
Duration: 6.92 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
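
The reported duration can be cross-checked against the two timestamps. The sketch below uses only the Python standard library and assumes both timestamps are in the same time zone.

```python
from datetime import datetime

# Timestamp layout used by the report (microsecond precision).
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2024-05-11 09:35:02.782545", FMT)
finished = datetime.strptime("2024-05-11 09:35:09.706227", FMT)

# Elapsed wall-clock time between start and finish, in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```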

Variables

기준년월
Date

CONSTANT 

Distinct: 1
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 676.0 B
Minimum: 2024-04-01 00:00:00
Maximum: 2024-04-01 00:00:00
Histogram with fixed size bins (bins=1)

주차장명
Text

Distinct: 34
Distinct (%): 50.0%
Missing: 0
Missing (%): 0.0%
Memory size: 676.0 B

Length

Max length: 8
Median length: 7
Mean length: 4.3529412
Min length: 2

Characters and Unicode

Total characters: 296
Distinct characters: 84
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
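
To illustrate how character properties can be tallied, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The two input strings are parking-lot names from this report's sample rows. The `CATEGORY_NAMES` mapping and the helper name are illustrative, not the report tool's implementation.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that appear in this report.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
}

def category_counts(text):
    """Count the characters of `text` by Unicode general category."""
    # Normalise to NFC so each Hangul syllable is a single code point.
    text = unicodedata.normalize("NFC", text)
    counts = Counter()
    for ch in text:
        code = unicodedata.category(ch)
        counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

print(category_counts("동구로(승합)"))
print(category_counts("장안1동"))
```

Hangul syllables fall under "Other Letter", which is why that category dominates the tables for this variable.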

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 가양라이품
2nd row: 개화산역
3rd row: 개화역
4th row: 구파발역
5th row: 도봉산역

Value  Count  Frequency (%)
가양라이품  2  2.9%
개화산역  2  2.9%
구로디지털단지역  2  2.9%
훈련원공원  2  2.9%
한강진역  2  2.9%
학여울역  2  2.9%
천호역  2  2.9%
천왕역  2  2.9%
종묘  2  2.9%
볕우물  2  2.9%
Other values (24)  48  70.6%

Most occurring characters

Value  Count  Frequency (%)
(missing)  30  10.1%
(missing)  12  4.1%
(missing)  10  3.4%
(missing)  10  3.4%
(missing)  8  2.7%
(missing)  8  2.7%
)  6  2.0%
(missing)  6  2.0%
(missing)  6  2.0%
(missing)  6  2.0%
Other values (74)  194  65.5%

Most occurring categories

Value  Count  Frequency (%)
Other Letter  274  92.6%
Close Punctuation  6  2.0%
Open Punctuation  6  2.0%
Decimal Number  6  2.0%
Uppercase Letter  4  1.4%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
(missing)  30  10.9%
(missing)  12  4.4%
(missing)  10  3.6%
(missing)  10  3.6%
(missing)  8  2.9%
(missing)  8  2.9%
(missing)  6  2.2%
(missing)  6  2.2%
(missing)  6  2.2%
(missing)  6  2.2%
Other values (67)  172  62.8%

Decimal Number
Value  Count  Frequency (%)
1  2  33.3%
2  2  33.3%
4  2  33.3%

Uppercase Letter
Value  Count  Frequency (%)
R  2  50.0%
V  2  50.0%

Close Punctuation
Value  Count  Frequency (%)
)  6  100.0%

Open Punctuation
Value  Count  Frequency (%)
(  6  100.0%

Most occurring scripts

Value  Count  Frequency (%)
Hangul  274  92.6%
Common  18  6.1%
Latin  4  1.4%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
(missing)  30  10.9%
(missing)  12  4.4%
(missing)  10  3.6%
(missing)  10  3.6%
(missing)  8  2.9%
(missing)  8  2.9%
(missing)  6  2.2%
(missing)  6  2.2%
(missing)  6  2.2%
(missing)  6  2.2%
Other values (67)  172  62.8%

Common
Value  Count  Frequency (%)
)  6  33.3%
(  6  33.3%
1  2  11.1%
2  2  11.1%
4  2  11.1%

Latin
Value  Count  Frequency (%)
R  2  50.0%
V  2  50.0%

Most occurring blocks

Value  Count  Frequency (%)
Hangul  274  92.6%
ASCII  22  7.4%

Most frequent character per block

Hangul
Value  Count  Frequency (%)
(missing)  30  10.9%
(missing)  12  4.4%
(missing)  10  3.6%
(missing)  10  3.6%
(missing)  8  2.9%
(missing)  8  2.9%
(missing)  6  2.2%
(missing)  6  2.2%
(missing)  6  2.2%
(missing)  6  2.2%
Other values (67)  172  62.8%

ASCII
Value  Count  Frequency (%)
)  6  27.3%
(  6  27.3%
1  2  9.1%
R  2  9.1%
V  2  9.1%
2  2  9.1%
4  2  9.1%

정기권대수(소형)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 33
Distinct (%): 48.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 315.67647
Minimum: 11
Maximum: 1580
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 744.0 B

Quantile statistics

Minimum: 11
5-th percentile: 16.05
Q1: 49
Median: 180
Q3: 370
95-th percentile: 1125
Maximum: 1580
Range: 1569
Interquartile range (IQR): 321

Descriptive statistics

Standard deviation: 382.58217
Coefficient of variation (CV): 1.2119439
Kurtosis: 2.5646228
Mean: 315.67647
Median Absolute Deviation (MAD): 147.5
Skewness: 1.7636443
Sum: 21466
Variance: 146369.12
Monotonicity: Not monotonic
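
Several of the statistics above are derived from the others. The sketch below recomputes them from the reported values for 정기권대수(소형) as a consistency check. The variable names are illustrative, and small differences in the last digits come from rounding in the reported inputs.

```python
# Values copied from the statistics reported above for 정기권대수(소형).
n = 68
total = 21466
minimum, maximum = 11, 1580
q1, q3 = 49, 370
std = 382.58217

mean = total / n                 # Sum divided by the number of observations
value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance is the squared standard deviation

print(f"mean={mean:.5f} range={value_range} iqr={iqr} "
      f"cv={cv:.7f} variance={variance:.2f}")
```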
Histogram with fixed size bins (bins=33)
Common Values

Value  Count  Frequency (%)
25  4  5.9%
30  2  2.9%
70  2  2.9%
15  2  2.9%
49  2  2.9%
11  2  2.9%
340  2  2.9%
65  2  2.9%
1160  2  2.9%
130  2  2.9%
Other values (23)  46  67.6%

Minimum 10 values

Value  Count  Frequency (%)
11  2  2.9%
15  2  2.9%
18  2  2.9%
25  4  5.9%
30  2  2.9%
35  2  2.9%
40  2  2.9%
49  2  2.9%
50  2  2.9%
65  2  2.9%

Maximum 10 values

Value  Count  Frequency (%)
1580  2  2.9%
1160  2  2.9%
1060  2  2.9%
1000  2  2.9%
810  2  2.9%
620  2  2.9%
520  2  2.9%
460  2  2.9%
370  2  2.9%
340  2  2.9%

정기권대수(대형)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 676.0 B

Value  Count
0  68

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0  68  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0  68  100.0%

현재판매량(소형)
Real number (ℝ)

HIGH CORRELATION 

Distinct: 32
Distinct (%): 47.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 291.17647
Minimum: 11
Maximum: 1580
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 744.0 B

Quantile statistics

Minimum: 11
5-th percentile: 15.7
Q1: 39
Median: 179.5
Q3: 363
95-th percentile: 1091.85
Maximum: 1580
Range: 1569
Interquartile range (IQR): 324

Descriptive statistics

Standard deviation: 357.22951
Coefficient of variation (CV): 1.2268488
Kurtosis: 4.2227727
Mean: 291.17647
Median Absolute Deviation (MAD): 147
Skewness: 2.0611945
Sum: 19800
Variance: 127612.92
Monotonicity: Not monotonic
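
Dividing this variable's Sum by the Sum reported for 정기권대수(소형) (21466) gives a rough overall share of the small-vehicle allocation that has been sold. The report itself does not include this ratio. The sketch assumes the two columns are directly comparable counts.

```python
# Sum values copied from the two numeric variables in this report.
allocated_total = 21466  # Sum of 정기권대수(소형)
sold_total = 19800       # Sum of 현재판매량(소형)

# Overall share of the allocation that has been sold, in percent.
sold_share_pct = 100 * sold_total / allocated_total
print(f"Sold share: {sold_share_pct:.1f}%")
```

Because every row appears twice in the data, the duplicates scale both sums equally and do not change the ratio.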
Histogram with fixed size bins (bins=32)
Common Values

Value  Count  Frequency (%)
35  4  5.9%
25  4  5.9%
30  2  2.9%
65  2  2.9%
363  2  2.9%
15  2  2.9%
11  2  2.9%
340  2  2.9%
70  2  2.9%
200  2  2.9%
Other values (22)  44  64.7%

Minimum 10 values

Value  Count  Frequency (%)
11  2  2.9%
15  2  2.9%
17  2  2.9%
25  4  5.9%
30  2  2.9%
35  4  5.9%
39  2  2.9%
43  2  2.9%
65  2  2.9%
70  2  2.9%

Maximum 10 values

Value  Count  Frequency (%)
1580  2  2.9%
1109  2  2.9%
1060  2  2.9%
748  2  2.9%
620  2  2.9%
519  2  2.9%
402  2  2.9%
370  2  2.9%
363  2  2.9%
340  2  2.9%

현재판매량(대형)
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Memory size: 676.0 B

Value  Count
0  68

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0  68  100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0  68  100.0%

Interactions

[Figures: pairwise interaction plots for the numeric variables]

Correlations

Variable  주차장명  정기권대수(소형)  현재판매량(소형)
주차장명  1.000  1.000  1.000
정기권대수(소형)  1.000  1.000  0.967
현재판매량(소형)  1.000  0.967  1.000

Variable  정기권대수(소형)  현재판매량(소형)
정기권대수(소형)  1.000  0.998
현재판매량(소형)  0.998  1.000
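
The report does not say which coefficient each matrix uses, so the sketch below uses Pearson correlation purely as an illustration. The input is the 20 (정기권대수(소형), 현재판매량(소형)) pairs visible in this report's Sample section. The full dataset has 68 rows, so the result is not expected to match the coefficients above exactly.

```python
import math

# Value pairs read from the first and last ten rows of the Sample section.
pairs = [
    (30, 30), (330, 330), (290, 290), (240, 240), (230, 230),
    (50, 43), (18, 17), (1060, 1060), (460, 402), (80, 80),
    (65, 65), (70, 70), (1160, 1109), (235, 235), (1580, 1580),
    (160, 159), (140, 140), (810, 748), (25, 25), (35, 35),
]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

xs, ys = zip(*pairs)
r = pearson(xs, ys)
print(f"Pearson r on the sample rows: {r:.3f}")
```

A value close to 1 is consistent with the High correlation alert raised for these two columns.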

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Index  기준년월  주차장명  정기권대수(소형)  정기권대수(대형)  현재판매량(소형)  현재판매량(대형)
0  2024-04  가양라이품  30  0  30  0
1  2024-04  개화산역  330  0  330  0
2  2024-04  개화역  290  0  290  0
3  2024-04  구파발역  240  0  240  0
4  2024-04  도봉산역  230  0  230  0
5  2024-04  동구로  50  0  43  0
6  2024-04  동구로(승합)  18  0  17  0
7  2024-04  동대문  1060  0  1060  0
8  2024-04  마포유수지  460  0  402  0
9  2024-04  볕우물  80  0  80  0

Last rows

Index  기준년월  주차장명  정기권대수(소형)  정기권대수(대형)  현재판매량(소형)  현재판매량(대형)
58  2024-04  장안1동  65  0  65  0
59  2024-04  장안2동  70  0  70  0
60  2024-04  종묘  1160  0  1109  0
61  2024-04  천왕역  235  0  235  0
62  2024-04  천호역  1580  0  1580  0
63  2024-04  학여울역  160  0  159  0
64  2024-04  한강진역  140  0  140  0
65  2024-04  훈련원공원  810  0  748  0
66  2024-04  구로디지털단지역  25  0  25  0
67  2024-04  신대방역  35  0  35  0

Duplicate rows

Most frequently occurring

Index  기준년월  주차장명  정기권대수(소형)  정기권대수(대형)  현재판매량(소형)  현재판매량(대형)  # duplicates
0  2024-04  가양라이품  30  0  30  0  2
1  2024-04  개화산역  330  0  330  0  2
2  2024-04  개화역  290  0  290  0  2
3  2024-04  구로디지털단지역  25  0  25  0  2
4  2024-04  구파발역  240  0  240  0  2
5  2024-04  도봉산역  230  0  230  0  2
6  2024-04  동구로  50  0  43  0  2
7  2024-04  동구로(승합)  18  0  17  0  2
8  2024-04  마포유수지  460  0  402  0  2
9  2024-04  동대문  1060  0  1060  0  2
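
One common way to count duplicate rows is to tally identical records and count every occurrence after the first. The sketch below applies that definition to three rows from the table above, each entered twice to mirror the "# duplicates" column. The helper name `duplicate_summary` is hypothetical, and the report tool may define the count differently.

```python
from collections import Counter

# Three rows from the table above; an illustrative subset, not the full data.
unique_rows = [
    ("2024-04", "가양라이품", 30, 0, 30, 0),
    ("2024-04", "개화산역", 330, 0, 330, 0),
    ("2024-04", "동구로", 50, 0, 43, 0),
]
rows = unique_rows * 2  # every row occurs exactly twice

def duplicate_summary(rows):
    """Return (count of rows repeating an earlier row, share in percent)."""
    counts = Counter(rows)
    n_duplicates = sum(c - 1 for c in counts.values())
    return n_duplicates, 100 * n_duplicates / len(rows)

n_duplicates, pct = duplicate_summary(rows)
print(n_duplicates, f"{pct:.1f}%")
```

When every row occurs exactly twice, as in this dataset, the share is 50%, matching the 34 of 68 rows reported in the overview.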