Overview

Dataset statistics

Number of variables: 5
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 488.3 KiB
Average record size in memory: 50.0 B

Variable types

Numeric: 2
Categorical: 2
DateTime: 1

Dataset

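Description: Comprehensive statistics on demographic attributes (region, sex, and so on) of early-intervention cases related to addiction and gambling problems caused by the legal or illegal gambling industry
Author: 한국도박문제관리센터 (Korea Center on Gambling Problems)
URL: https://www.data.go.kr/data/15094350/fileData.do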

Alerts

성별 (sex) is highly imbalanced (51.3%) [Imbalance]
번호 (record number) has unique values [Unique]
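
The imbalance figure in the alert can be checked from the category counts listed for 성별 later in this report. The sketch below assumes the score is one minus the Shannon entropy normalised by log2 of the number of categories; that formula is an assumption made here, and the function name `imbalance_score` is illustrative. Under that assumption the result comes out close to the reported 51.3%.

```python
import math

# Category counts for 성별, copied from this report.
counts = {"남성": 6542, "여성": 3402, "남자": 48, "여자": 8}


def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy in bits.

    Assumed definition: 0 means perfectly balanced categories,
    1 means a single category holds every observation.
    """
    total = sum(counts.values())
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum(
        (c / total) * math.log2(c / total) for c in counts.values() if c > 0
    )
    return 1.0 - entropy / math.log2(k)


score = imbalance_score(counts)
print(f"{score:.1%}")
```

Note that the two rare labels (남자, 여자) raise the category count to four, which pushes the score up even though they cover only 56 rows.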

Reproduction

Analysis started: 2023-12-12 02:31:30.373329
Analysis finished: 2023-12-12 02:31:31.678872
Duration: 1.31 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

번호 (record number)
Real number (ℝ)

UNIQUE 

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22295.432
Minimum: 2
Maximum: 48189
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 2
5-th percentile: 2307.65
Q1: 11254.75
Median: 22419
Q3: 33162.5
95-th percentile: 42472.1
Maximum: 48189
Range: 48187
Interquartile range (IQR): 21907.75

Descriptive statistics

Standard deviation: 12831.558
Coefficient of variation (CV): 0.57552408
Kurtosis: -1.1540285
Mean: 22295.432
Median Absolute Deviation (MAD): 10959
Skewness: 0.016342256
Sum: 2.2295432 × 10^8
Variance: 1.6464888 × 10^8
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=50)]
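
Several of the figures above are derived from others, so they can be cross-checked with plain arithmetic. The sketch below copies the reported values for 번호 and recomputes the range, IQR, coefficient of variation, sum, and variance; the variable names are illustrative, and small differences are expected because the inputs are already rounded.

```python
# Values copied from the quantile and descriptive statistics for 번호.
n = 10000
minimum, maximum = 2, 48189
q1, q3 = 11254.75, 33162.5
mean, std = 22295.432, 12831.558

value_range = maximum - minimum  # reported: 48187
iqr = q3 - q1                    # reported: 21907.75
cv = std / mean                  # reported: 0.57552408
total = mean * n                 # reported: 2.2295432 × 10^8
variance = std ** 2              # reported: 1.6464888 × 10^8

print(value_range, iqr, round(cv, 8), total, variance)
```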
Common values

Value                Count  Frequency (%)
41744                1      < 0.1%
26618                1      < 0.1%
1665                 1      < 0.1%
25390                1      < 0.1%
7528                 1      < 0.1%
4346                 1      < 0.1%
433                  1      < 0.1%
21319                1      < 0.1%
24724                1      < 0.1%
28266                1      < 0.1%
Other values (9990)  9990   99.9%

Minimum 10 values

Value  Count  Frequency (%)
2      1      < 0.1%
8      1      < 0.1%
9      1      < 0.1%
12     1      < 0.1%
16     1      < 0.1%
25     1      < 0.1%
27     1      < 0.1%
28     1      < 0.1%
33     1      < 0.1%
37     1      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
48189  1      < 0.1%
48160  1      < 0.1%
48150  1      < 0.1%
48148  1      < 0.1%
48133  1      < 0.1%
48131  1      < 0.1%
48125  1      < 0.1%
48070  1      < 0.1%
48028  1      < 0.1%
47967  1      < 0.1%

지역 (region)
Categorical

Distinct: 15
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Most frequent categories:
서울: 4237
제주: 749
경기남부: 748
강원: 651
대구: 575
Other values (10): 3040

Length

Max length: 4
Median length: 2
Mean length: 2.4814
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 경남
2nd row: 경북
3rd row: 광주전남
4th row: 인천
5th row: 서울

Common Values

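Value                        Count  Frequency (%)
서울 (Seoul)                 4237   42.4%
제주 (Jeju)                  749    7.5%
경기남부 (Gyeonggi South)    748    7.5%
강원 (Gangwon)               651    6.5%
대구 (Daegu)                 575    5.8%
경기북부 (Gyeonggi North)    546    5.5%
경남 (Gyeongnam)             441    4.4%
인천 (Incheon)               417    4.2%
부산울산 (Busan-Ulsan)       396    4.0%
전북 (Jeonbuk)               339    3.4%
Other values (5)             901    9.0%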

Length

[Figure: histogram of lengths of the category]

성별 (sex)
Categorical

IMBALANCE

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Most frequent categories:
남성: 6542
여성: 3402
남자: 48
여자: 8

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 여성
2nd row: 남성
3rd row: 남성
4th row: 여성
5th row: 남성

Common Values

Value          Count  Frequency (%)
남성 (male)    6542   65.4%
여성 (female)  3402   34.0%
남자 (man)     48     0.5%
여자 (woman)   8      0.1%
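
The table shows four labels for what looks like a two-valued attribute: 남자 and 여자 appear alongside 남성 and 여성. A minimal sketch of merging them follows. It assumes that 남자/여자 are variant spellings of 남성/여성; the report itself does not confirm this, and the `CANONICAL` mapping and `normalize` helper are hypothetical names.

```python
from collections import Counter

# Counts copied from the Common Values table for 성별.
raw_counts = {"남성": 6542, "여성": 3402, "남자": 48, "여자": 8}

# Hypothetical mapping: assumes the rare labels are variants of the common ones.
CANONICAL = {"남자": "남성", "여자": "여성"}


def normalize(label):
    """Map a variant label to its canonical form; leave other labels unchanged."""
    return CANONICAL.get(label, label)


merged = Counter()
for label, count in raw_counts.items():
    merged[normalize(label)] += count

for label, count in merged.most_common():
    print(label, count, f"{count / sum(merged.values()):.1%}")
```

After merging, two categories remain, so any imbalance score computed over the number of categories would change as well.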

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

연령대 (age group)
Real number (ℝ)

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18.083
Minimum: 10
Maximum: 70
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10
5-th percentile: 10
Q1: 10
Median: 10
Q3: 20
95-th percentile: 40
Maximum: 70
Range: 60
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 11.502536
Coefficient of variation (CV): 0.63609668
Kurtosis: 1.4889753
Mean: 18.083
Median Absolute Deviation (MAD): 0
Skewness: 1.4249628
Sum: 180830
Variance: 132.30834
Monotonicity: Not monotonic

[Figure: histogram with fixed-size bins (bins=7)]
Value  Count  Frequency (%)
10     5762   57.6%
20     1865   18.6%
30     1305   13.1%
40     767    7.7%
50     214    2.1%
60     71     0.7%
70     16     0.2%
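
Because 연령대 takes only seven distinct values, its summary statistics can be recomputed directly from the frequency table above. The sketch below does so in plain Python. It assumes the reported variance is the sample variance (divisor n - 1), an assumption that the numbers appear to support.

```python
import math

# Value -> count, copied from the frequency table for 연령대.
freq = {10: 5762, 20: 1865, 30: 1305, 40: 767, 50: 214, 60: 71, 70: 16}

n = sum(freq.values())
total = sum(value * count for value, count in freq.items())
mean = total / n

# Sample variance (divisor n - 1); assumed to match the report's convention.
squared_dev = sum(count * (value - mean) ** 2 for value, count in freq.items())
variance = squared_dev / (n - 1)
std = math.sqrt(variance)
cv = std / mean

# Share of the most common value; above 50% it fixes the median at that value.
top_share = freq[10] / n

print(n, total, mean, round(variance, 5), round(std, 6), round(cv, 8), top_share)
```

More than half of the observations equal 10, so the median is 10 and more than half of the absolute deviations from it are zero. That is why the reported MAD is 0 despite a standard deviation of about 11.5.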

검사일자 (test date)
DateTime

Distinct: 391
Distinct (%): 3.9%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2020-09-26 00:00:00
Maximum: 2021-10-31 00:00:00
[Figure: histogram with fixed-size bins (bins=50)]
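
The distinct count can be compared with the length of the observed date range to see how many calendar days have no records. The sketch below uses only the minimum, maximum, and distinct count reported above; the variable names are illustrative.

```python
from datetime import date

# Minimum, maximum, and distinct count copied from the date variable's summary.
first_day = date(2020, 9, 26)
last_day = date(2021, 10, 31)
distinct_dates = 391

# Number of calendar days in the range, counting both endpoints.
calendar_days = (last_day - first_day).days + 1
days_without_records = calendar_days - distinct_dates
coverage = distinct_dates / calendar_days

print(calendar_days, days_without_records, f"{coverage:.1%}")
```

This only counts days with no records; it says nothing about which days they are or how records are spread across the remaining dates.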

Interactions

[Figures: four interaction plots (images not preserved)]

Correlations

[Figure: correlation heatmap]

         번호     지역     성별     연령대
번호     1.000   0.428   0.366   0.300
지역     0.428   1.000   0.239   0.414
성별     0.366   0.239   1.000   0.281
연령대   0.300   0.414   0.281   1.000

[Figure: correlation heatmap]

         지역     성별
지역     1.000   0.138
성별     0.138   1.000

[Figure: correlation heatmap]

         번호     연령대   지역     성별
번호     1.000   -0.224  0.174   0.227
연령대   -0.224  1.000   0.201   0.196
지역     0.174   0.201   1.000   0.138
성별     0.227   0.196   0.138   1.000

Missing values

[Figure: nullity bar chart]
A simple visualization of nullity by column.

[Figure: nullity matrix]
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

(index)  번호    지역       성별   연령대  검사일자
41433    41744   경남       여성   40      2021-10-04
6224     6225    경북       남성   20      2021-01-26
38973    38974   광주전남   남성   30      2021-07-05
34271    34272   인천       여성   10      2021-07-12
16972    16973   서울       남성   10      2021-05-27
17062    17063   서울       여성   10      2021-05-27
14432    14433   제주       여성   10      2021-06-11
26990    26991   대구       남성   20      2021-08-09
21962    21963   서울       남성   30      2021-05-16
14097    14098   경남       남성   10      2021-05-06

Last rows

(index)  번호    지역       성별   연령대  검사일자
27595    27596   서울       남성   10      2021-07-15
13441    13442   서울       여성   10      2021-04-30
5740     5741    전북       남성   30      2021-01-29
13931    13932   전북       남성   10      2021-06-21
42900    43810   서울       남성   10      2021-10-01
27232    27233   강원       남성   10      2021-09-01
14315    14316   서울       남성   10      2021-06-18
17765    17766   서울       남성   10      2021-05-20
39978    40079   서울       남성   10      2021-10-22
20599    20600   서울       남성   10      2021-05-20