Overview

Dataset statistics

Number of variables: 13
Number of observations: 26
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.1 KiB
Average record size in memory: 121.1 B

Variable types

Text: 1
Numeric: 3
Categorical: 9

Dataset

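Description: In-hospital cancer registry patient statistics published as open data by Daejeon Veterans Hospital (대전보훈병원). The data include the department name with monthly counts (January to December) and a grand total. In this report the department column is named 구분 and the month columns are named 1월 through 12월 (January through December).
URL: https://www.data.go.kr/data/15066388/fileData.do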

Alerts

4월 has constant value "0" (Constant)
5월 has constant value "0" (Constant)
8월 is highly overall correlated with 1월 and 7 other fields (High correlation)
12월 is highly overall correlated with 1월 and 7 other fields (High correlation)
6월 is highly overall correlated with 1월 and 7 other fields (High correlation)
9월 is highly overall correlated with 1월 and 7 other fields (High correlation)
7월 is highly overall correlated with 1월 and 8 other fields (High correlation)
11월 is highly overall correlated with 1월 and 7 other fields (High correlation)
1월 is highly overall correlated with 2월 and 7 other fields (High correlation)
2월 is highly overall correlated with 1월 and 7 other fields (High correlation)
3월 is highly overall correlated with 1월 and 8 other fields (High correlation)
10월 is highly overall correlated with 3월 and 1 other field (High correlation)
7월 is highly imbalanced (53.7%) (Imbalance)
8월 is highly imbalanced (51.5%) (Imbalance)
9월 is highly imbalanced (76.5%) (Imbalance)
11월 is highly imbalanced (51.5%) (Imbalance)
12월 is highly imbalanced (50.1%) (Imbalance)
구분 has unique values (Unique)
1월 has 14 (53.8%) zeros (Zeros)
2월 has 17 (65.4%) zeros (Zeros)
3월 has 17 (65.4%) zeros (Zeros)
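
The imbalance percentages in the alerts are consistent with a score of one minus the normalized Shannon entropy of the category counts. The sketch below rests on that assumption and is not a copy of the profiler's code: `imbalance_score` is a hypothetical helper, and the counts are copied from the Common Values tables later in this report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max for a list of category counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one
    class dominates. Hypothetical helper, assumed formula.
    """
    total = sum(counts)
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0  # assumption: a constant column is not scored
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(n_classes)


# Category counts per column, taken from the Common Values tables.
value_counts = {
    "7월": [21, 2, 1, 1, 1],
    "8월": [21, 3, 1, 1],
    "9월": [25, 1],
    "11월": [21, 3, 1, 1],
    "12월": [21, 2, 2, 1],
}

scores = {name: round(100 * imbalance_score(c), 1) for name, c in value_counts.items()}
print(scores)
```

Under the same assumption, 10월 (counts 22 and 4) scores about 38%, which would explain why it carries no imbalance alert if the alert threshold is 50%.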

Reproduction

Analysis started: 2023-12-12 18:46:40.711537
Analysis finished: 2023-12-12 18:46:43.506221
Duration: 2.79 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

구분
Text

UNIQUE 

Distinct: 26
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B

Length

Max length: 7
Median length: 6
Mean length: 4.5384615
Min length: 2

Characters and Unicode

Total characters: 118
Distinct characters: 55
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 26
Unique (%): 100.0%

Sample

1st row: 내과 (Internal Medicine)
2nd row: 소화기내과 (Gastroenterology)
3rd row: 심장내과 (Cardiology)
4th row: 호흡기내과 (Pulmonology)
5th row: 내분비내과 (Endocrinology)
Value               Count  Frequency (%)
내과                1      3.6%
소화기내과          1      3.6%
영상의학과          1      3.6%
한방진료과          1      3.6%
마취통증의학과      1      3.6%
치과                1      3.6%
비뇨의학과          1      3.6%
이비인후과          1      3.6%
안과                1      3.6%
산부인과            1      3.6%
Other values (18)   18     64.3%

Most occurring characters

Value               Count  Frequency (%)
(lost)              26     22.0%
(lost)              9      7.6%
(lost)              7      5.9%
(lost)              7      5.9%
(lost)              4      3.4%
(lost)              3      2.5%
(lost)              3      2.5%
(lost)              3      2.5%
(lost)              3      2.5%
(lost)              2      1.7%
Other values (45)   51     43.2%

Most occurring categories

Value               Count  Frequency (%)
Other Letter        116    98.3%
Space Separator     2      1.7%

Most frequent character per category

Other Letter

Value               Count  Frequency (%)
(lost)              26     22.4%
(lost)              9      7.8%
(lost)              7      6.0%
(lost)              7      6.0%
(lost)              4      3.4%
(lost)              3      2.6%
(lost)              3      2.6%
(lost)              3      2.6%
(lost)              3      2.6%
(lost)              2      1.7%
Other values (44)   49     42.2%

Space Separator

Value               Count  Frequency (%)
(space)             2      100.0%

Most occurring scripts

Value               Count  Frequency (%)
Hangul              116    98.3%
Common              2      1.7%

Most frequent character per script

Hangul

Value               Count  Frequency (%)
(lost)              26     22.4%
(lost)              9      7.8%
(lost)              7      6.0%
(lost)              7      6.0%
(lost)              4      3.4%
(lost)              3      2.6%
(lost)              3      2.6%
(lost)              3      2.6%
(lost)              3      2.6%
(lost)              2      1.7%
Other values (44)   49     42.2%

Common

Value               Count  Frequency (%)
(space)             2      100.0%

Most occurring blocks

Value               Count  Frequency (%)
Hangul              116    98.3%
ASCII               2      1.7%

Most frequent character per block

Hangul

Value               Count  Frequency (%)
(lost)              26     22.4%
(lost)              9      7.8%
(lost)              7      6.0%
(lost)              7      6.0%
(lost)              4      3.4%
(lost)              3      2.6%
(lost)              3      2.6%
(lost)              3      2.6%
(lost)              3      2.6%
(lost)              2      1.7%
Other values (44)   49     42.2%

ASCII

Value               Count  Frequency (%)
(space)             2      100.0%

1월
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 8
Distinct (%): 30.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.6538462
Minimum: 0
Maximum: 64
Zeros: 14
Zeros (%): 53.8%
Negative: 0
Negative (%): 0.0%
Memory size: 366.0 B

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 1.75
95th percentile: 17
Maximum: 64
Range: 64
Interquartile range (IQR): 1.75
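
The fractional Q3 of 1.75 arises from linear interpolation between neighbouring order statistics. The sketch below rebuilds the 26 values of 1월 from the value counts listed in this report and recomputes the quantiles with the standard library. Using `statistics.quantiles` with `method="inclusive"` is an assumption about the interpolation rule; it is chosen because it reproduces the figures above, not because it is known to be what the profiler calls.

```python
import statistics

# 1월 values rebuilt from the value/count table of this report (26 observations).
counts = {0: 14, 1: 5, 2: 1, 3: 1, 4: 1, 9: 1, 17: 2, 64: 1}
values = sorted(v for v, n in counts.items() for _ in range(n))

# method="inclusive" interpolates linearly at position (n - 1) * p.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")

# With n=20 the last cut point is the 95th percentile.
p95 = statistics.quantiles(values, n=20, method="inclusive")[-1]

iqr = q3 - q1
print(q1, median, q3, p95, iqr)
```

For Q3 the position is (26 - 1) * 0.75 = 18.75, which lies three quarters of the way between the 19th sorted value (1) and the 20th (2), hence 1.75.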

Descriptive statistics

Standard deviation: 13.001361
Coefficient of variation (CV): 2.7936808
Kurtosis: 18.783769
Mean: 4.6538462
Median Absolute Deviation (MAD): 0
Skewness: 4.1640274
Sum: 121
Variance: 169.03538
Monotonicity: Not monotonic
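
Several of the descriptive statistics for 1월 can be cross-checked from the value counts alone. The following is a minimal sketch using only the standard library. It assumes the reported variance and standard deviation are the sample versions (divisor n - 1), which is what the numbers above agree with. Skewness and kurtosis are left out because their exact estimator is not stated in the report.

```python
import statistics

# 1월 values rebuilt from the value/count table of this report (26 observations).
counts = {0: 14, 1: 5, 2: 1, 3: 1, 4: 1, 9: 1, 17: 2, 64: 1}
values = [v for v, n in counts.items() for _ in range(n)]

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, divisor n - 1
std = statistics.stdev(values)          # square root of the sample variance
cv = std / mean                         # coefficient of variation

# Median absolute deviation: median of |x - median(x)|.
med = statistics.median(values)
mad = statistics.median(abs(v - med) for v in values)

print(total, mean, variance, std, cv, mad)
```

The MAD is 0 because more than half of the observations equal the median, which is itself 0.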
Histogram with fixed size bins (bins=8)

Common values

Value  Count  Frequency (%)
0      14     53.8%
1      5      19.2%
17     2      7.7%
4      1      3.8%
64     1      3.8%
9      1      3.8%
2      1      3.8%
3      1      3.8%

Smallest values (ascending)

Value  Count  Frequency (%)
0      14     53.8%
1      5      19.2%
2      1      3.8%
3      1      3.8%
4      1      3.8%
9      1      3.8%
17     2      7.7%
64     1      3.8%

Largest values (descending)

Value  Count  Frequency (%)
64     1      3.8%
17     2      7.7%
9      1      3.8%
4      1      3.8%
3      1      3.8%
2      1      3.8%
1      5      19.2%
0      14     53.8%

2월
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 6
Distinct (%): 23.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.3076923
Minimum: 0
Maximum: 11
Zeros: 17
Zeros (%): 65.4%
Negative: 0
Negative (%): 0.0%
Memory size: 366.0 B

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 1
95th percentile: 7
Maximum: 11
Range: 11
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 2.7679484
Coefficient of variation (CV): 2.1166664
Kurtosis: 6.1211919
Mean: 1.3076923
Median Absolute Deviation (MAD): 0
Skewness: 2.5469402
Sum: 34
Variance: 7.6615385
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=6)

Common values

Value  Count  Frequency (%)
0      17     65.4%
1      4      15.4%
7      2      7.7%
2      1      3.8%
3      1      3.8%
11     1      3.8%

Smallest values (ascending)

Value  Count  Frequency (%)
0      17     65.4%
1      4      15.4%
2      1      3.8%
3      1      3.8%
7      2      7.7%
11     1      3.8%

Largest values (descending)

Value  Count  Frequency (%)
11     1      3.8%
7      2      7.7%
3      1      3.8%
2      1      3.8%
1      4      15.4%
0      17     65.4%

3월
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 6
Distinct (%): 23.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.5384615
Minimum: 0
Maximum: 14
Zeros: 17
Zeros (%): 65.4%
Negative: 0
Negative (%): 0.0%
Memory size: 366.0 B

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0
Q3: 1
95th percentile: 10.5
Maximum: 14
Range: 14
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 3.6247016
Coefficient of variation (CV): 2.356056
Kurtosis: 7.3858652
Mean: 1.5384615
Median Absolute Deviation (MAD): 0
Skewness: 2.8292727
Sum: 40
Variance: 13.138462
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=6)

Common values

Value  Count  Frequency (%)
0      17     65.4%
1      5      19.2%
12     1      3.8%
14     1      3.8%
3      1      3.8%
6      1      3.8%

Smallest values (ascending)

Value  Count  Frequency (%)
0      17     65.4%
1      5      19.2%
3      1      3.8%
6      1      3.8%
12     1      3.8%
14     1      3.8%

Largest values (descending)

Value  Count  Frequency (%)
14     1      3.8%
12     1      3.8%
6      1      3.8%
3      1      3.8%
1      5      19.2%
0      17     65.4%

4월
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      26     100.0%

Length

Figure: histogram of lengths of the category.

Common Values (Plot)

Figure: bar chart of the common values tabulated above.

5월
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      26     100.0%

Length

Figure: histogram of lengths of the category.

Common Values (Plot)

Figure: bar chart of the common values tabulated above.

6월
Categorical

HIGH CORRELATION 

Distinct: 5
Distinct (%): 19.2%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 3
Unique (%): 11.5%
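
The report separates "Distinct" (how many different values occur) from "Unique" (how many values occur exactly once). A minimal sketch of that distinction for 6월 follows, with the column rebuilt from the Common Values counts in this report. The variable names are illustrative and are not taken from the profiler.

```python
from collections import Counter

# 6월 as categorical strings: 18 x "0", 5 x "1", and one each of "4", "3", "8".
column = ["0"] * 18 + ["1"] * 5 + ["4", "3", "8"]

freq = Counter(column)

distinct = len(freq)                               # different values present
unique = sum(1 for n in freq.values() if n == 1)   # values seen exactly once

distinct_pct = round(100 * distinct / len(column), 1)
unique_pct = round(100 * unique / len(column), 1)

print(distinct, distinct_pct, unique, unique_pct)
```

Here "0" and "1" are distinct but not unique, while "4", "3" and "8" each appear once and so count towards both figures.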

Sample

1st row: 1
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      18     69.2%
1      5      19.2%
4      1      3.8%
3      1      3.8%
8      1      3.8%

Length

Figure: histogram of lengths of the category.

Common Values (Plot)

Figure: bar chart of the common values tabulated above.

7월
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 19.2%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B

Length

Max length: 2
Median length: 1
Mean length: 1.0384615
Min length: 1

Unique

Unique: 3
Unique (%): 11.5%

Sample

1st row: 0
2nd row: 5
3rd row: 0
4th row: 1
5th row: 0

Common Values

Value  Count  Frequency (%)
0      21     80.8%
5      2      7.7%
1      1      3.8%
15     1      3.8%
4      1      3.8%

Length

Figure: histogram of lengths of the category.

Common Values (Plot)

Figure: bar chart of the common values tabulated above.

8월
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 15.4%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B

Length

Max length: 2
Median length: 1
Mean length: 1.0384615
Min length: 1

Unique

Unique: 2
Unique (%): 7.7%

Sample

1st row: 0
2nd row: 2
3rd row: 0
4th row: 1
5th row: 0

Common Values

Value  Count  Frequency (%)
0      21     80.8%
1      3      11.5%
2      1      3.8%
21     1      3.8%

Length

Figure: histogram of lengths of the category.

Common Values (Plot)

Figure: bar chart of the common values tabulated above.

9월
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 7.7%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 1
Unique (%): 3.8%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      25     96.2%
4      1      3.8%

Length

Figure: histogram of lengths of the category.

Common Values (Plot)

Figure: bar chart of the common values tabulated above.

10월
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 7.7%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 0
5th row: 1

Common Values

Value  Count  Frequency (%)
0      22     84.6%
1      4      15.4%

Length

Figure: histogram of lengths of the category.

Common Values (Plot)

Figure: bar chart of the common values tabulated above.

11월
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 15.4%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 2
Unique (%): 7.7%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 1
5th row: 0

Common Values

Value  Count  Frequency (%)
0      21     80.8%
1      3      11.5%
8      1      3.8%
3      1      3.8%

Length

Figure: histogram of lengths of the category.

Common Values (Plot)

Figure: bar chart of the common values tabulated above.

12월
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 4
Distinct (%): 15.4%
Missing: 0
Missing (%): 0.0%
Memory size: 340.0 B

Length

Max length: 2
Median length: 1
Mean length: 1.0384615
Min length: 1

Unique

Unique: 1
Unique (%): 3.8%

Sample

1st row: 0
2nd row: 2
3rd row: 0
4th row: 1
5th row: 0

Common Values

Value  Count  Frequency (%)
0      21     80.8%
2      2      7.7%
1      2      7.7%
38     1      3.8%

Length

Figure: histogram of lengths of the category.

Common Values (Plot)

Figure: bar chart of the common values tabulated above.

Interactions

Figures: nine pairwise interaction plots.

Correlations

       구분   1월    2월    3월    6월    7월    8월    9월    10월   11월   12월
구분   1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
1월    1.000  1.000  0.630  1.000  0.704  0.759  0.949  1.000  0.675  0.997  0.929
2월    1.000  0.630  1.000  0.934  0.968  0.979  0.814  0.512  0.000  0.656  0.858
3월    1.000  1.000  0.934  1.000  0.936  0.967  0.867  1.000  0.498  0.916  0.806
6월    1.000  0.704  0.968  0.936  1.000  0.939  0.742  1.000  0.000  0.658  0.809
7월    1.000  0.759  0.979  0.967  0.939  1.000  0.835  1.000  0.500  0.777  0.850
8월    1.000  0.949  0.814  0.867  0.742  0.835  1.000  1.000  0.550  0.959  0.990
9월    1.000  1.000  0.512  1.000  1.000  1.000  1.000  1.000  0.000  1.000  1.000
10월   1.000  0.675  0.000  0.498  0.000  0.500  0.550  0.000  1.000  0.602  0.000
11월   1.000  0.997  0.656  0.916  0.658  0.777  0.959  1.000  0.602  1.000  0.964
12월   1.000  0.929  0.858  0.806  0.809  0.850  0.990  1.000  0.000  0.964  1.000

       8월    12월   10월   6월    9월    7월    11월
8월    1.000  0.865  0.354  0.667  0.957  0.787  0.725
12월   0.865  1.000  0.000  0.751  0.957  0.808  0.742
10월   0.354  0.000  1.000  0.000  0.000  0.564  0.393
6월    0.667  0.751  0.000  1.000  0.935  0.650  0.572
9월    0.957  0.957  0.000  0.935  1.000  0.935  0.957
7월    0.787  0.808  0.564  0.650  0.935  1.000  0.710
11월   0.725  0.742  0.393  0.572  0.957  0.710  1.000

       1월    2월    3월    6월    7월    8월    9월    10월   11월   12월
1월    1.000  0.533  0.802  0.622  0.689  0.696  0.957  0.451  0.929  0.645
2월    0.533  1.000  0.722  0.744  0.788  0.759  0.577  0.000  0.569  0.819
3월    0.802  0.722  1.000  0.643  0.740  0.832  0.935  0.562  0.904  0.748
6월    0.622  0.744  0.643  1.000  0.650  0.667  0.935  0.000  0.572  0.751
7월    0.689  0.788  0.740  0.650  1.000  0.787  0.935  0.564  0.710  0.808
8월    0.696  0.759  0.832  0.667  0.787  1.000  0.957  0.354  0.725  0.865
9월    0.957  0.577  0.935  0.935  0.935  0.957  1.000  0.000  0.957  0.957
10월   0.451  0.000  0.562  0.000  0.564  0.354  0.000  1.000  0.393  0.000
11월   0.929  0.569  0.904  0.572  0.710  0.725  0.957  0.393  1.000  0.742
12월   0.645  0.819  0.748  0.751  0.808  0.865  0.957  0.000  0.742  1.000

Missing values

Figure: a simple visualization of nullity by column.
Figure: nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.

Sample

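First rows

    구분            1월  2월  3월  4월  5월  6월  7월  8월  9월  10월  11월  12월  Department (English)
0   내과            1    1    1    0    0    1    0    0    0    0     0     0     Internal Medicine
1   소화기내과      17   7    12   0    0    0    5    2    0    1     1     2     Gastroenterology
2   심장내과        1    0    0    0    0    0    0    0    0    0     0     0     Cardiology
3   호흡기내과      4    2    1    0    0    0    1    1    0    0     1     1     Pulmonology
4   내분비내과      0    1    0    0    0    0    0    0    0    1     0     0     Endocrinology
5   류마티스내과    1    0    0    0    0    0    0    0    0    0     0     0     Rheumatology
6   신장내과        1    0    0    0    0    0    0    0    0    0     0     0     Nephrology
7   혈액종양내과    64   7    14   0    0    4    15   21   4    0     8     38    Hematology-Oncology
8   소아청소년과    0    0    0    0    0    0    0    0    0    0     0     0     Pediatrics
9   신 경 과        1    0    0    0    0    1    0    0    0    0     0     0     Neurology

Last rows

    구분            1월  2월  3월  4월  5월  6월  7월  8월  9월  10월  11월  12월  Department (English)
16  신경외과        2    0    1    0    0    1    5    0    0    1     0     0     Neurosurgery
17  산부인과        0    0    0    0    0    0    0    0    0    0     0     0     Obstetrics and Gynecology
18  안과            0    0    0    0    0    0    0    0    0    0     0     0     Ophthalmology
19  이비인후과      0    0    0    0    0    0    0    0    0    0     0     0     Otorhinolaryngology (ENT)
20  비뇨의학과      3    11   1    0    0    8    0    1    0    0     0     2     Urology
21  치과            0    0    0    0    0    0    0    0    0    0     0     0     Dentistry
22  마취통증의학과  0    0    0    0    0    0    0    0    0    0     0     0     Anesthesiology and Pain Medicine
23  한방진료과      0    0    0    0    0    0    0    0    0    0     0     0     Korean Medicine
24  영상의학과      0    0    0    0    0    0    0    0    0    0     0     0     Radiology
25  응급의학과      0    1    1    0    0    1    0    0    0    0     0     0     Emergency Medicine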