Overview

Dataset statistics

Number of variables: 9
Number of observations: 100
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 8
Duplicate rows (%): 8.0%
Total size in memory: 7.5 KiB
Average record size in memory: 76.3 B

Variable types

Numeric: 3
Categorical: 6

Alerts

Dataset has 8 (8.0%) duplicate rows [Duplicates]
test_cnt is highly overall correlated with center_nm [High correlation]
test_age is highly overall correlated with center_nm and 2 other fields [High correlation]
test_ymd is highly overall correlated with center_nm and 2 other fields [High correlation]
center_nm is highly overall correlated with test_cnt and 5 other fields [High correlation]
age_gbn is highly overall correlated with test_age and 2 other fields [High correlation]
test_gbn is highly overall correlated with test_age and 2 other fields [High correlation]
input_gbn is highly overall correlated with center_nm [High correlation]
input_gbn is highly imbalanced (78.9%) [Imbalance]
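
The 78.9% imbalance figure for input_gbn can be reproduced from the category counts reported for that variable (95, 3 and 2) as one minus the Shannon entropy divided by its maximum possible value. Whether the report computes it exactly this way is an assumption; the sketch below only shows that this formula yields the same number. The function name `imbalance_score` is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the class shares."""
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # assumption: a single class is scored as 0 in this sketch
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# Counts of input_gbn from its Common Values table: 관리자, <NA>, 인바디
score = imbalance_score([95, 3, 2])
print(f"{score:.1%}")  # 78.9%
```

A perfectly balanced variable scores 0 under this formula, and the score approaches 1 as a single category dominates.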

Reproduction

Analysis started: 2023-12-10 10:12:01.324574
Analysis finished: 2023-12-10 10:12:04.460354
Duration: 3.14 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
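
As a quick consistency check, the reported duration follows from the two timestamps. This is a minimal sketch; the format string is an assumption based on how the timestamps are printed here.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed from the printed timestamps

started = datetime.strptime("2023-12-10 10:12:01.324574", FMT)
finished = datetime.strptime("2023-12-10 10:12:04.460354", FMT)

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 3.14 seconds
```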

Variables

test_cnt
Real number (ℝ)

HIGH CORRELATION 

Distinct: 8
Distinct (%): 8.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.86
Minimum: 1
Maximum: 11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 2
95-th percentile: 5
Maximum: 11
Range: 10
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.7057242
Coefficient of variation (CV): 0.91705601
Kurtosis: 10.296552
Mean: 1.86
Median Absolute Deviation (MAD): 0
Skewness: 2.9268518
Sum: 186
Variance: 2.9094949
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=8)]

Common values

Value  Count  Frequency (%)
1      66     66.0%
2      14     14.0%
3      8      8.0%
4      5      5.0%
5      3      3.0%
8      2      2.0%
6      1      1.0%
11     1      1.0%
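
Because the test_cnt frequency table lists all eight distinct values, most of the descriptive statistics above can be recomputed from it. The sketch below uses only the Python standard library and assumes the report's variance is the sample variance (n - 1 denominator), which is inferred from the fact that it reproduces 2.9094949.

```python
import statistics

# Value -> count, copied from the test_cnt frequency table
freq = {1: 66, 2: 14, 3: 8, 4: 5, 5: 3, 6: 1, 8: 2, 11: 1}

# Expand the table back into the 100 individual observations
values = [value for value, n in freq.items() for _ in range(n)]

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)
cv = std / mean                          # coefficient of variation
median = statistics.median(values)
mad = statistics.median([abs(v - median) for v in values])

print(total, mean, round(variance, 7), round(std, 7), round(cv, 6), median, mad)
```

The MAD of 0 follows from 66 of the 100 observations being equal to the median.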

center_nm
Categorical

HIGH CORRELATION 

Distinct: 39
Distinct (%): 39.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value              Count
안동               12
동구(광주)         12
성남               12
영동               7
경산               4
Other values (34)  53

Length

Max length: 10
Median length: 2
Mean length: 3
Min length: 2

Unique

Unique: 21
Unique (%): 21.0%

Sample

1st row: 서초
2nd row: KSPO대구
3rd row: 안동
4th row: 성남
5th row: 영동

Common Values

Value              Count  Frequency (%)
안동               12     12.0%
동구(광주)         12     12.0%
성남               12     12.0%
영동               7      7.0%
경산               4      4.0%
제주               4      4.0%
마포               3      3.0%
스포원(금정)       3      3.0%
의정부             3      3.0%
구미               3      3.0%
Other values (29)  37     37.0%

Length

[Figure: histogram of lengths of the category]

age_gbn
Categorical

HIGH CORRELATION 

Distinct: 4
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value   Count
성인    43
청소년  36
노인    18
유소년  3

Length

Max length: 3
Median length: 2
Mean length: 2.39
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 성인
2nd row: 노인
3rd row: 청소년
4th row: 청소년
5th row: 청소년

Common Values

Value                Count  Frequency (%)
성인 (adult)         43     43.0%
청소년 (adolescent)  36     36.0%
노인 (senior)        18     18.0%
유소년 (child)       3      3.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

test_gbn
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value  Count
출장   56
일반   44

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 일반
2nd row: 일반
3rd row: 출장
4th row: 출장
5th row: 출장

Common Values

Value                Count  Frequency (%)
출장 (on-site visit) 56     56.0%
일반 (regular)       44     44.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

test_age
Real number (ℝ)

HIGH CORRELATION 

Distinct: 43
Distinct (%): 43.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.7
Minimum: 11
Maximum: 77
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 11
5-th percentile: 13
Q1: 14
Median: 21.5
Q3: 55.5
95-th percentile: 72.05
Maximum: 77
Range: 66
Interquartile range (IQR): 41.5

Descriptive statistics

Standard deviation: 22.369938
Coefficient of variation (CV): 0.68409597
Kurtosis: -1.0262858
Mean: 32.7
Median Absolute Deviation (MAD): 7.5
Skewness: 0.81719232
Sum: 3270
Variance: 500.41414
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=43)]

Common values

Value              Count  Frequency (%)
14                 16     16.0%
13                 9      9.0%
15                 8      8.0%
19                 5      5.0%
67                 4      4.0%
26                 4      4.0%
21                 4      4.0%
72                 3      3.0%
22                 3      3.0%
25                 3      3.0%
Other values (33)  41     41.0%

Minimum 10 values

Value  Count  Frequency (%)
11     1      1.0%
12     2      2.0%
13     9      9.0%
14     16     16.0%
15     8      8.0%
17     1      1.0%
18     2      2.0%
19     5      5.0%
20     2      2.0%
21     4      4.0%

Maximum 10 values

Value  Count  Frequency (%)
77     1      1.0%
75     1      1.0%
74     2      2.0%
73     1      1.0%
72     3      3.0%
71     1      1.0%
70     1      1.0%
69     1      1.0%
67     4      4.0%
66     2      2.0%

input_gbn
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 3
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value   Count
관리자  95
<NA>    3
인바디  2

Length

Max length: 4
Median length: 3
Mean length: 3.03
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 관리자
2nd row: <NA>
3rd row: 관리자
4th row: 관리자
5th row: 관리자

Common Values

Value                   Count  Frequency (%)
관리자 (administrator)  95     95.0%
<NA>                    3      3.0%
인바디 (InBody)         2      2.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]
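
The length statistics for input_gbn (max length 4, mean length 3.03) only add up if "<NA>" is counted as a four-character string. This suggests the three affected rows hold a text placeholder rather than a true missing value, even though the report shows Missing 0. That reading is an inference from the numbers, not something the report states. A minimal sketch of the check, with a hypothetical cleanup step that treats the placeholder as missing:

```python
# Category -> count, copied from the input_gbn Common Values table
counts = {"관리자": 95, "<NA>": 3, "인바디": 2}

total = sum(counts.values())
mean_length = sum(len(value) * n for value, n in counts.items()) / total
max_length = max(len(value) for value in counts)

# Hypothetical cleanup: count the "<NA>" placeholder as a missing value
PLACEHOLDER = "<NA>"
missing = counts.get(PLACEHOLDER, 0)
missing_share = missing / total

print(mean_length, max_length, missing, f"{missing_share:.1%}")
```

Under that assumption the variable would have 3 missing cells (3.0%) and two real categories instead of three.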

cert_gbn
Categorical

Distinct: 4
Distinct (%): 4.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value   Count
참가증  63
3등급   15
2등급   13
1등급   9

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2등급
2nd row: 1등급
3rd row: 참가증
4th row: 3등급
5th row: 참가증

Common Values

Value                               Count  Frequency (%)
참가증 (participation certificate)  63     63.0%
3등급 (grade 3)                     15     15.0%
2등급 (grade 2)                     13     13.0%
1등급 (grade 1)                     9      9.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

test_ymd
Real number (ℝ)

HIGH CORRELATION 

Distinct: 9
Distinct (%): 9.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 20211113
Minimum: 20211105
Maximum: 20211125
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.0 KiB

Quantile statistics

Minimum: 20211105
5-th percentile: 20211108
Q1: 20211109
Median: 20211112
Q3: 20211119
95-th percentile: 20211119
Maximum: 20211125
Range: 20
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 4.4404022
Coefficient of variation (CV): 2.1970103 × 10^-7
Kurtosis: -0.57397746
Mean: 20211113
Median Absolute Deviation (MAD): 3
Skewness: 0.74859508
Sum: 2.0211113 × 10^9
Variance: 19.717172
Monotonicity: Not monotonic

[Figure: histogram with fixed size bins (bins=9)]

Common values

Value     Count  Frequency (%)
20211112  26     26.0%
20211119  24     24.0%
20211110  19     19.0%
20211108  13     13.0%
20211109  12     12.0%
20211116  2      2.0%
20211105  2      2.0%
20211123  1      1.0%
20211125  1      1.0%
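
The test_ymd values look like calendar dates stored as integers in YYYYMMDD form (for example 20211105), which the report has profiled as real numbers. That interpretation is an assumption based on the values alone. Under it, statistics such as the mean, range and coefficient of variation describe the integer encoding rather than the dates. A minimal sketch of parsing the values, with a hypothetical helper `parse_ymd`:

```python
from datetime import datetime


def parse_ymd(value):
    """Convert an integer such as 20211105 to a date, assuming YYYYMMDD."""
    return datetime.strptime(str(value), "%Y%m%d").date()


first = parse_ymd(20211105)  # reported Minimum
last = parse_ymd(20211125)   # reported Maximum
span_days = (last - first).days
print(first.isoformat(), last.isoformat(), span_days)  # 2021-11-05 2021-11-25 20

# Across a month boundary the integer difference stops matching the day count
numeric_gap = 20211201 - 20211130
calendar_gap = (parse_ymd(20211201) - parse_ymd(20211130)).days
print(numeric_gap, calendar_gap)  # 71 1
```

Here the reported Range of 20 happens to equal the span in days only because every observation falls within November 2021.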

test_sex
Categorical

Distinct: 2
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 932.0 B

Value  Count
M      57
F      43

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: M
2nd row: M
3rd row: M
4th row: F
5th row: M

Common Values

Value  Count  Frequency (%)
M      57     57.0%
F      43     43.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Interactions

[Figures: nine interaction plots]

Correlations

[Figure: correlation heatmap]

             test_cnt  center_nm    age_gbn   test_gbn   test_age  input_gbn   cert_gbn   test_ymd   test_sex
test_cnt        1.000      0.905      0.230      0.700      0.483      0.178      0.000      0.289      0.000
center_nm       0.905      1.000      0.898      0.844      0.926      0.777      0.791      0.900      0.609
age_gbn         0.230      0.898      1.000      0.651      0.887      0.000      0.000      0.638      0.000
test_gbn        0.700      0.844      0.651      1.000      0.722      0.000      0.352      0.601      0.000
test_age        0.483      0.926      0.887      0.722      1.000      0.348      0.250      0.469      0.000
input_gbn       0.178      0.777      0.000      0.000      0.348      1.000      0.000      0.000      0.000
cert_gbn        0.000      0.791      0.000      0.352      0.250      0.000      1.000      0.330      0.000
test_ymd        0.289      0.900      0.638      0.601      0.469      0.000      0.330      1.000      0.150
test_sex        0.000      0.609      0.000      0.000      0.000      0.000      0.000      0.150      1.000

[Figure: correlation heatmap]

              age_gbn   cert_gbn  input_gbn   test_sex   test_gbn  center_nm
age_gbn         1.000      0.000      0.000      0.000      0.451      0.553
cert_gbn        0.000      1.000      0.000      0.000      0.232      0.424
input_gbn       0.000      0.000      1.000      0.000      0.000      0.536
test_sex        0.000      0.000      0.000      1.000      0.000      0.405
test_gbn        0.451      0.232      0.000      0.000      1.000      0.588
center_nm       0.553      0.424      0.536      0.405      0.588      1.000

[Figure: correlation heatmap]

             test_cnt   test_age   test_ymd  center_nm    age_gbn   test_gbn  input_gbn   cert_gbn   test_sex
test_cnt        1.000      0.358      0.449      0.512      0.000      0.314      0.159      0.000      0.000
test_age        0.358      1.000      0.378      0.534      0.735      0.541      0.254      0.144      0.000
test_ymd        0.449      0.378      1.000      0.528      0.587      0.630      0.000      0.215      0.116
center_nm       0.512      0.534      0.528      1.000      0.553      0.588      0.536      0.424      0.405
age_gbn         0.000      0.735      0.587      0.553      1.000      0.451      0.000      0.000      0.000
test_gbn        0.314      0.541      0.630      0.588      0.451      1.000      0.000      0.232      0.000
input_gbn       0.159      0.254      0.000      0.536      0.000      0.000      1.000      0.000      0.000
cert_gbn        0.000      0.144      0.215      0.424      0.000      0.232      0.000      1.000      0.000
test_sex        0.000      0.000      0.116      0.405      0.000      0.000      0.000      0.000      1.000
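
The High correlation alerts in the Overview are consistent with applying a cutoff of 0.5 to the last correlation matrix in this section: every field and partner count in the alerts is reproduced, and cert_gbn and test_sex correctly produce no alert. Both the cutoff and the choice of matrix are inferred from the numbers rather than stated in the report. A minimal sketch, with the hypothetical helper `highly_correlated`:

```python
FIELDS = ["test_cnt", "test_age", "test_ymd", "center_nm", "age_gbn",
          "test_gbn", "input_gbn", "cert_gbn", "test_sex"]

# Rows of the last correlation matrix in this section, in FIELDS order
MATRIX = [
    [1.000, 0.358, 0.449, 0.512, 0.000, 0.314, 0.159, 0.000, 0.000],
    [0.358, 1.000, 0.378, 0.534, 0.735, 0.541, 0.254, 0.144, 0.000],
    [0.449, 0.378, 1.000, 0.528, 0.587, 0.630, 0.000, 0.215, 0.116],
    [0.512, 0.534, 0.528, 1.000, 0.553, 0.588, 0.536, 0.424, 0.405],
    [0.000, 0.735, 0.587, 0.553, 1.000, 0.451, 0.000, 0.000, 0.000],
    [0.314, 0.541, 0.630, 0.588, 0.451, 1.000, 0.000, 0.232, 0.000],
    [0.159, 0.254, 0.000, 0.536, 0.000, 0.000, 1.000, 0.000, 0.000],
    [0.000, 0.144, 0.215, 0.424, 0.000, 0.232, 0.000, 1.000, 0.000],
    [0.000, 0.000, 0.116, 0.405, 0.000, 0.000, 0.000, 0.000, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, inferred from the alerts


def highly_correlated(fields, matrix, threshold):
    """Map each field to the other fields whose coefficient exceeds the threshold."""
    result = {}
    for i, name in enumerate(fields):
        partners = [fields[j] for j, value in enumerate(matrix[i])
                    if j != i and value > threshold]
        if partners:
            result[name] = partners
    return result


alerts = highly_correlated(FIELDS, MATRIX, THRESHOLD)
for name, partners in alerts.items():
    print(name, "->", ", ".join(partners))
```

For example, center_nm maps to six partners, matching "test_cnt and 5 other fields" in the alert list.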

Missing values

[Figure: nullity by column] A simple visualization of nullity by column.
[Figure: nullity matrix] The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

First rows

     test_cnt  center_nm   age_gbn  test_gbn  test_age  input_gbn  cert_gbn  test_ymd  test_sex
0    3         서초        성인     일반      25        관리자     2등급     20211119  M
1    2         KSPO대구    노인     일반      71        <NA>       1등급     20211123  M
2    1         안동        청소년   출장      14        관리자     참가증    20211110  M
3    1         성남        청소년   출장      15        관리자     3등급     20211109  F
4    1         영동        청소년   출장      13        관리자     참가증    20211112  M
5    1         영동        청소년   출장      14        관리자     참가증    20211112  F
6    1         성남        청소년   출장      15        관리자     참가증    20211109  F
7    2         서대문      성인     일반      19        <NA>       3등급     20211125  F
8    1         영동        청소년   출장      14        관리자     참가증    20211112  F
9    1         성남        청소년   출장      15        관리자     참가증    20211109  F

Last rows

     test_cnt  center_nm   age_gbn  test_gbn  test_age  input_gbn  cert_gbn  test_ymd  test_sex
90   11        삼척        청소년   일반      17        관리자     참가증    20211110  M
91   1         동구(광주)  성인     출장      63        관리자     참가증    20211109  F
92   1         안동        청소년   출장      14        관리자     참가증    20211108  M
93   1         동구(광주)  노인     출장      74        관리자     참가증    20211109  M
94   1         영동        청소년   출장      13        관리자     참가증    20211109  F
95   1         동구(광주)  노인     출장      75        관리자     참가증    20211110  M
96   1         영동        청소년   출장      13        관리자     참가증    20211109  F
97   1         동구(광주)  노인     출장      69        관리자     참가증    20211110  F
98   1         동구(광주)  노인     출장      72        관리자     참가증    20211110  M
99   1         동구(광주)  노인     출장      72        관리자     참가증    20211109  F

Duplicate rows

Most frequently occurring

     test_cnt  center_nm   age_gbn  test_gbn  test_age  input_gbn  cert_gbn  test_ymd  test_sex  # duplicates
3    1         안동        청소년   출장      14        관리자     참가증    20211108  M         4
0    1         성남        청소년   출장      13        관리자     참가증    20211108  F         3
2    1         성남        청소년   출장      15        관리자     참가증    20211109  F         3
1    1         성남        청소년   출장      13        관리자     참가증    20211108  M         2
4    1         안동        청소년   출장      14        관리자     참가증    20211110  M         2
5    1         안동        청소년   출장      14        관리자     참가증    20211112  M         2
6    1         영동        청소년   출장      13        관리자     참가증    20211109  F         2
7    1         영동        청소년   출장      14        관리자     참가증    20211112  F         2
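
The "Duplicate rows: 8 (8.0%)" statistic in the Overview equals the number of lines in the table above, that is, the number of distinct rows that occur more than once, while the "# duplicates" column sums to 20 observations. This reading is inferred from the table; the report does not define the statistic. A minimal sketch of the distinction, using a hypothetical helper `duplicate_groups` and toy rows shortened to three fields:

```python
from collections import Counter


def duplicate_groups(rows):
    """Map each row that occurs more than once to its number of occurrences."""
    counts = Counter(rows)
    return {row: n for row, n in counts.items() if n > 1}


# '# duplicates' column of the table above
group_sizes = [4, 3, 3, 2, 2, 2, 2, 2]
n_groups = len(group_sizes)       # distinct rows that repeat
n_rows = sum(group_sizes)         # observations that belong to a repeated row
n_redundant = n_rows - n_groups   # copies that deduplication would drop
print(n_groups, n_rows, n_redundant)  # 8 20 12

# Toy rows (hypothetical) to exercise the helper
rows = [
    ("안동", 14, "M"), ("안동", 14, "M"), ("성남", 13, "F"),
    ("영동", 13, "F"), ("성남", 13, "F"), ("안동", 14, "M"),
]
groups = duplicate_groups(rows)
print(groups)
```

Under this reading, dropping duplicates would leave 88 of the 100 observations.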