Overview

Dataset statistics

Number of variables: 4
Number of observations: 330
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 11.1 KiB
Average record size in memory: 34.4 B

Variable types

Numeric: 1
Text: 2
Categorical: 1

Dataset

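Description: Retailers that sold first-prize online lottery tickets (automatic number selection), covering draws 859 to 911 (2019-05-18 to 2020-05-16). The fields provided are serial number (순번), business name (상호), region (지역), and number of first-prize automatic-selection wins (1등 자동 당첨 건수).
Author: 기획재정부 (Ministry of Economy and Finance)
URL: https://www.data.go.kr/data/15059963/fileData.do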

Alerts

1등 자동 당첨 건수 is highly imbalanced (81.8%) [Imbalance]
순번 has unique values [Unique]
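
The 81.8% imbalance figure is consistent with one minus the normalized Shannon entropy of the class counts. The sketch below checks this against the counts this report lists for 1등 자동 당첨 건수 (310, 18, 1, 1). The function name `imbalance_score` is illustrative, and the formula is an assumption about how the profiler derives the score, not something the report states.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class shares and k is the number of classes."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single class is given a score of 0
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Class counts for the values 1, 2, 5 and 3, as listed in the report.
counts = [310, 18, 1, 1]
score = imbalance_score(counts)
print(f"{score:.1%}")
```

A score of 0 would mean perfectly balanced classes, and a score near 1 means almost all rows fall into one class, as is the case here (310 of 330 rows have the value 1).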

Reproduction

Analysis started: 2023-12-12 12:30:16.697931
Analysis finished: 2023-12-12 12:30:17.337137
Duration: 0.64 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

순번 (serial number)
Real number (ℝ)

UNIQUE

Distinct: 330
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 165.5
Minimum: 1
Maximum: 330
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 3.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 17.45
Q1: 83.25
Median: 165.5
Q3: 247.75
95th percentile: 313.55
Maximum: 330
Range: 329
Interquartile range (IQR): 164.5
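
These quantiles match linear interpolation between order statistics, which is the default method in NumPy and pandas. A minimal sketch, assuming 순번 is simply the integers 1 to 330 (an inference from the minimum, maximum, and distinct count reported here); `quantile_linear` is an illustrative helper, not a library function:

```python
def quantile_linear(sorted_values, q):
    """Quantile by linear interpolation between the two nearest order statistics."""
    position = q * (len(sorted_values) - 1)   # fractional 0-based index
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])

values = list(range(1, 331))  # assumed contents of the column

q05 = quantile_linear(values, 0.05)
q1 = quantile_linear(values, 0.25)
median = quantile_linear(values, 0.50)
q3 = quantile_linear(values, 0.75)
q95 = quantile_linear(values, 0.95)
iqr = q3 - q1
print(q05, q1, median, q3, q95, iqr)
```

For example, the 5th percentile sits at fractional index 0.05 × 329 = 16.45, which is 45% of the way from the value 17 to the value 18, giving 17.45.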

Descriptive statistics

Standard deviation: 95.407023
Coefficient of variation (CV): 0.57647748
Kurtosis: -1.2
Mean: 165.5
Median Absolute Deviation (MAD): 82.5
Skewness: 0
Sum: 54615
Variance: 9102.5
Monotonicity: Strictly increasing
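
Assuming the column holds the integers 1 to 330 (an inference from the minimum, maximum, and distinct count, not something the report states outright), most of these figures can be reproduced with Python's standard library. The variance is the sample variance (divisor n - 1), which is the version that reproduces 9102.5.

```python
import statistics

values = list(range(1, 331))  # assumed contents of the 순번 column

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, divisor n - 1
std_dev = statistics.stdev(values)       # sample standard deviation
cv = std_dev / mean                      # coefficient of variation
median = statistics.median(values)
# Median absolute deviation: median distance of each value from the median.
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, variance, round(std_dev, 6), round(cv, 8), mad)
```

Skewness and kurtosis are left out of the sketch; a symmetric, evenly spaced sequence has zero skewness, which agrees with the value reported.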
[Figure: Histogram with fixed size bins (bins=50)]

Common values

Value               Count  Frequency (%)
1                   1      0.3%
228                 1      0.3%
226                 1      0.3%
225                 1      0.3%
224                 1      0.3%
223                 1      0.3%
222                 1      0.3%
221                 1      0.3%
220                 1      0.3%
219                 1      0.3%
Other values (320)  320    97.0%

Ten smallest values

Value  Count  Frequency (%)
1      1      0.3%
2      1      0.3%
3      1      0.3%
4      1      0.3%
5      1      0.3%
6      1      0.3%
7      1      0.3%
8      1      0.3%
9      1      0.3%
10     1      0.3%

Ten largest values

Value  Count  Frequency (%)
330    1      0.3%
329    1      0.3%
328    1      0.3%
327    1      0.3%
326    1      0.3%
325    1      0.3%
324    1      0.3%
323    1      0.3%
322    1      0.3%
321    1      0.3%

상호 (business name)
Text

Distinct: 299
Distinct (%): 90.6%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

Length

Max length: 16
Median length: 13
Mean length: 6.0515152
Min length: 2

Characters and Unicode

Total characters: 1997
Distinct characters: 314
Distinct categories: 11
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
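
As a rough illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories for one business name taken from this report's sample rows. The two-letter codes correspond to the category names used in the report (`Lo` is Other Letter, `Nd` is Decimal Number, `Sm` is Math Symbol). The helper name `category_counts` is illustrative, and this is not necessarily how the profiler builds its tables.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count the characters of `text` by Unicode general category (e.g. 'Lo', 'Nd')."""
    return Counter(unicodedata.category(ch) for ch in text)

name = "해피+24시편의점"  # a 상호 value from the report's sample rows
counts = category_counts(name)
print(counts)
```

Here the six Hangul syllables fall under Other Letter, the two digits under Decimal Number, and the plus sign under Math Symbol.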

Unique

Unique: 279
Unique (%): 84.5%
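
The report distinguishes distinct values (how many different values occur) from unique values (how many values occur exactly once). A minimal sketch of the difference on a small made-up list; the function name and the data are illustrative and not drawn from the dataset:

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    frequency = Counter(values)
    distinct = len(frequency)
    unique = sum(1 for count in frequency.values() if count == 1)
    return distinct, unique

# Hypothetical names: "A" and "C" repeat, so they are distinct but not unique.
sample = ["A", "A", "B", "C", "C", "C", "D"]
distinct, unique = distinct_and_unique(sample)
print(distinct, unique)
```

This is why 상호 can have 299 distinct values but only 279 unique ones: 20 of the distinct names appear on more than one row.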

Sample

1st row: 일등복권편의점
2nd row: 오케이상사
3rd row: 세진전자통신
4th row: 라이프마트
5th row: 스파

Most frequent words

Value               Count  Frequency (%)
행운복권방          6      1.7%
로또복권            5      1.4%
노다지복권방        5      1.4%
복권명당            3      0.8%
가판점              3      0.8%
복권방              3      0.8%
가로판매점          2      0.6%
복권판매점          2      0.6%
픽미                2      0.6%
복권                2      0.6%
Other values (307)  321    90.7%

Most occurring characters

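Value               Count  Frequency (%)
[not recovered]     128    6.4%
[not recovered]     119    6.0%
[not recovered]     88     4.4%
[not recovered]     68     3.4%
[not recovered]     62     3.1%
[not recovered]     59     3.0%
[not recovered]     38     1.9%
[not recovered]     32     1.6%
)                   30     1.5%
(                   30     1.5%
Other values (304)  1343   67.3%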

Most occurring categories

Value               Count  Frequency (%)
Other Letter        1792   89.7%
Decimal Number      59     3.0%
Uppercase Letter    47     2.4%
Close Punctuation   30     1.5%
Open Punctuation    30     1.5%
Space Separator     24     1.2%
Lowercase Letter    10     0.5%
Dash Punctuation    2      0.1%
Other Symbol        1      0.1%
Math Symbol         1      0.1%

Most frequent character per category

Other Letter

Value               Count  Frequency (%)
[not recovered]     128    7.1%
[not recovered]     119    6.6%
[not recovered]     88     4.9%
[not recovered]     68     3.8%
[not recovered]     62     3.5%
[not recovered]     59     3.3%
[not recovered]     38     2.1%
[not recovered]     32     1.8%
[not recovered]     29     1.6%
[not recovered]     26     1.5%
Other values (272)  1143   63.8%

Uppercase Letter

Value  Count  Frequency (%)
S      13     27.7%
G      12     25.5%
C      5      10.6%
U      5      10.6%
A      4      8.5%
L      3      6.4%
W      1      2.1%
Y      1      2.1%
M      1      2.1%
R      1      2.1%

Decimal Number

Value  Count  Frequency (%)
2      25     42.4%
5      14     23.7%
4      11     18.6%
1      4      6.8%
7      2      3.4%
0      1      1.7%
9      1      1.7%
6      1      1.7%

Lowercase Letter

Value  Count  Frequency (%)
o      3      30.0%
t      3      30.0%
l      1      10.0%
g      1      10.0%
s      1      10.0%
e      1      10.0%

Close Punctuation

Value  Count  Frequency (%)
)      30     100.0%

Open Punctuation

Value  Count  Frequency (%)
(      30     100.0%

Space Separator

Value    Count  Frequency (%)
(space)  24     100.0%

Dash Punctuation

Value  Count  Frequency (%)
-      2      100.0%

Other Symbol

Value            Count  Frequency (%)
[not recovered]  1      100.0%

Math Symbol

Value  Count  Frequency (%)
+      1      100.0%

Other Punctuation

Value  Count  Frequency (%)
/      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  1793   89.8%
Common  147    7.4%
Latin   57     2.9%

Most frequent character per script

Hangul

Value               Count  Frequency (%)
[not recovered]     128    7.1%
[not recovered]     119    6.6%
[not recovered]     88     4.9%
[not recovered]     68     3.8%
[not recovered]     62     3.5%
[not recovered]     59     3.3%
[not recovered]     38     2.1%
[not recovered]     32     1.8%
[not recovered]     29     1.6%
[not recovered]     26     1.5%
Other values (273)  1144   63.8%

Latin

Value             Count  Frequency (%)
S                 13     22.8%
G                 12     21.1%
C                 5      8.8%
U                 5      8.8%
A                 4      7.0%
o                 3      5.3%
t                 3      5.3%
L                 3      5.3%
l                 1      1.8%
W                 1      1.8%
Other values (7)  7      12.3%

Common

Value             Count  Frequency (%)
)                 30     20.4%
(                 30     20.4%
2                 25     17.0%
(space)           24     16.3%
5                 14     9.5%
4                 11     7.5%
1                 4      2.7%
-                 2      1.4%
7                 2      1.4%
+                 1      0.7%
Other values (4)  4      2.7%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  1792   89.7%
ASCII   204    10.2%
None    1      0.1%

Most frequent character per block

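Hangul

Value               Count  Frequency (%)
[not recovered]     128    7.1%
[not recovered]     119    6.6%
[not recovered]     88     4.9%
[not recovered]     68     3.8%
[not recovered]     62     3.5%
[not recovered]     59     3.3%
[not recovered]     38     2.1%
[not recovered]     32     1.8%
[not recovered]     29     1.6%
[not recovered]     26     1.5%
Other values (272)  1143   63.8%

ASCII

Value              Count  Frequency (%)
)                  30     14.7%
(                  30     14.7%
2                  25     12.3%
(space)            24     11.8%
5                  14     6.9%
S                  13     6.4%
G                  12     5.9%
4                  11     5.4%
C                  5      2.5%
U                  5      2.5%
Other values (21)  35     17.2%

None

Value            Count  Frequency (%)
[not recovered]  1      100.0%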

지역 (region)
Text

Distinct: 139
Distinct (%): 42.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

Length

Max length: 7
Median length: 6
Mean length: 5.9030303
Min length: 2

Characters and Unicode

Total characters: 1948
Distinct characters: 107
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 56
Unique (%): 17.0%

Sample

1st row: 대구 달서구
2nd row: 서울 서초구
3rd row: 대구 서구
4th row: 인천 중구
5th row: 서울 노원구

Most frequent words

Value               Count  Frequency (%)
경기                83     12.7%
서울                62     9.5%
부산                22     3.4%
경남                21     3.2%
인천                19     2.9%
충남                14     2.1%
광주                13     2.0%
전남                13     2.0%
충북                13     2.0%
강원                13     2.0%
Other values (130)  383    58.4%

Most occurring characters

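Value              Count  Frequency (%)
(space)            326    16.7%
[not recovered]    164    8.4%
[not recovered]    163    8.4%
[not recovered]    116    6.0%
[not recovered]    91     4.7%
[not recovered]    83     4.3%
[not recovered]    70     3.6%
[not recovered]    60     3.1%
[not recovered]    50     2.6%
[not recovered]    49     2.5%
Other values (97)  776    39.8%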

Most occurring categories

Value            Count  Frequency (%)
Other Letter     1622   83.3%
Space Separator  326    16.7%

Most frequent character per category

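Other Letter

Value              Count  Frequency (%)
[not recovered]    164    10.1%
[not recovered]    163    10.0%
[not recovered]    116    7.2%
[not recovered]    91     5.6%
[not recovered]    83     5.1%
[not recovered]    70     4.3%
[not recovered]    60     3.7%
[not recovered]    50     3.1%
[not recovered]    49     3.0%
[not recovered]    45     2.8%
Other values (96)  731    45.1%

Space Separator

Value    Count  Frequency (%)
(space)  326    100.0%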

Most occurring scripts

Value   Count  Frequency (%)
Hangul  1622   83.3%
Common  326    16.7%

Most frequent character per script

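Hangul

Value              Count  Frequency (%)
[not recovered]    164    10.1%
[not recovered]    163    10.0%
[not recovered]    116    7.2%
[not recovered]    91     5.6%
[not recovered]    83     5.1%
[not recovered]    70     4.3%
[not recovered]    60     3.7%
[not recovered]    50     3.1%
[not recovered]    49     3.0%
[not recovered]    45     2.8%
Other values (96)  731    45.1%

Common

Value    Count  Frequency (%)
(space)  326    100.0%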

Most occurring blocks

Value   Count  Frequency (%)
Hangul  1622   83.3%
ASCII   326    16.7%

Most frequent character per block

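ASCII

Value    Count  Frequency (%)
(space)  326    100.0%

Hangul

Value              Count  Frequency (%)
[not recovered]    164    10.1%
[not recovered]    163    10.0%
[not recovered]    116    7.2%
[not recovered]    91     5.6%
[not recovered]    83     5.1%
[not recovered]    70     4.3%
[not recovered]    60     3.7%
[not recovered]    50     3.1%
[not recovered]    49     3.0%
[not recovered]    45     2.8%
Other values (96)  731    45.1%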

1등 자동 당첨 건수 (number of first-prize automatic-selection wins)
Categorical

IMBALANCE

Distinct: 4
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Memory size: 2.7 KiB

Value  Count
1      310
2      18
5      1
3      1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 2
Unique (%): 0.6%

Sample

1st row: 5
2nd row: 3
3rd row: 2
4th row: 2
5th row: 2

Common Values

Value  Count  Frequency (%)
1      310    93.9%
2      18     5.5%
5      1      0.3%
3      1      0.3%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values: 1 (310, 93.9%), 2 (18, 5.5%), 5 (1, 0.3%), 3 (1, 0.3%)]

Interactions


Correlations

                    순번    1등 자동 당첨 건수
순번                1.000   0.610
1등 자동 당첨 건수  0.610   1.000

                    순번    1등 자동 당첨 건수
순번                1.000   0.410
1등 자동 당첨 건수  0.410   1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First 10 rows

Index  순번  상호             지역           1등 자동 당첨 건수
0      1     일등복권편의점   대구 달서구    5
1      2     오케이상사       서울 서초구    3
2      3     세진전자통신     대구 서구      2
3      4     라이프마트       인천 중구      2
4      5     스파             서울 노원구    2
5      6     노다지복권방     인천 미추홀구  2
6      7     흥부네박터졌네   인천 계양구    2
7      8     오천억복권방     광주 서구      2
8      9     해피+24시편의점  광주 북구      2
9      10    토큰박스         경기 남양주시  2

Last 10 rows

Index  순번  상호                 지역           1등 자동 당첨 건수
320    321   GS25(수영광안점)     부산 수영구    1
321    322   CU(입석강변점)       대구 동구      1
322    323   씨스페이스(주안1-2)  인천 미추홀구  1
323    324   GS25(계산동경점)     인천 계양구    1
324    325   GS25(인천관교점)     인천 미추홀구  1
325    326   GS25(수원행복점)     경기 수원시    1
326    327   CU(강릉내곡점)       강원 강릉시    1
327    328   GS25(청주수곡점)     충북 청주시    1
328    329   GS25(천안시민점)     충남 천안시    1
329    330   GS25(양산혜인점)     경남 양산시    1