Overview

Dataset statistics

Number of variables: 6
Number of observations: 441
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 21.2 KiB
Average record size in memory: 49.3 B

Variable types

Numeric: 1
Text: 2
Unsupported: 1
Categorical: 2

Dataset

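Description: Data on calibration standard equipment, calibration resolution management, the calibration vendor MASTER, calibration results, calibration intake, and standard equipment codes.
Author: Korea Institute of Radiological and Medical Sciences (한국원자력의학원)
URL: https://www.data.go.kr/data/15092449/fileData.do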

Alerts

등록자 has constant value "조규석" [Constant]
병원구분 is highly imbalanced (95.2%) [Imbalance]
번호 has unique values [Unique]
품목일련번호 is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
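The 95.2% in the imbalance alert is consistent with one minus the normalized Shannon entropy of the class counts. That definition is an assumption made here for illustration (the report itself does not state the formula), and `imbalance_score` is a hypothetical helper name. A minimal sketch using the 병원구분 counts listed in this report:

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the class counts.

    Assumed definition: 0 means perfectly balanced classes, values near 1
    mean one class dominates.
    """
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    if len(probs) < 2:
        return 0.0  # a single class leaves nothing to compare
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(len(probs))


# Class counts of 병원구분 as listed in this report (441 rows in total).
counts = [436, 2, 1, 1, 1]
score = imbalance_score(counts)
print(f"{score:.1%}")
```

Under this assumed definition, the share of the largest class (98.9%) and the imbalance score (95.2%) are different quantities, which is why the two percentages in the report do not match.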

Reproduction

Analysis started: 2023-12-13 00:20:48.417024
Analysis finished: 2023-12-13 00:20:48.740080
Duration: 0.32 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

번호 (number)
Real number (ℝ)

UNIQUE

Distinct: 441
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 221
Minimum: 1
Maximum: 441
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 23
Q1: 111
Median: 221
Q3: 331
95th percentile: 419
Maximum: 441
Range: 440
Interquartile range (IQR): 220
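The quantiles above can be checked by hand, because 번호 is strictly increasing from 1 to 441. The sketch below assumes linearly interpolated percentiles at position pct × (n − 1) / 100 in the sorted data; `percentile` is a hypothetical helper, not part of the profiling tool.

```python
def percentile(sorted_values, pct):
    """Linearly interpolated percentile for an integer pct in 0..100."""
    # Integer arithmetic keeps the position exact (no floating-point drift).
    lower, rem = divmod(pct * (len(sorted_values) - 1), 100)
    if rem == 0:
        return sorted_values[lower]
    step = sorted_values[lower + 1] - sorted_values[lower]
    return sorted_values[lower] + step * rem / 100


values = list(range(1, 442))  # 번호 runs from 1 to 441
summary = {pct: percentile(values, pct) for pct in (5, 25, 50, 75, 95)}
iqr = summary[75] - summary[25]
print(summary, iqr)
```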

Descriptive statistics

Standard deviation: 127.44999
Coefficient of variation (CV): 0.57669679
Kurtosis: -1.2
Mean: 221
Median Absolute Deviation (MAD): 110
Skewness: 0
Sum: 97461
Variance: 16243.5
Monotonicity: Strictly increasing
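Most of these figures follow directly from the values 1 to 441. As a rough check, the sketch below recomputes them with the standard library, assuming the report uses the sample variance (n − 1 denominator) and defines MAD as the median of absolute deviations from the median:

```python
import statistics

values = list(range(1, 442))  # 번호 runs from 1 to 441

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance, n - 1 denominator
std = statistics.stdev(values)
cv = std / mean  # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, variance, round(std, 5), round(cv, 8), mad)
```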
[Figure: Histogram with fixed size bins (bins=50)]
Common values
Value | Count | Frequency (%)
1 | 1 | 0.2%
291 | 1 | 0.2%
302 | 1 | 0.2%
301 | 1 | 0.2%
300 | 1 | 0.2%
299 | 1 | 0.2%
298 | 1 | 0.2%
297 | 1 | 0.2%
296 | 1 | 0.2%
295 | 1 | 0.2%
Other values (431) | 431 | 97.7%
Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 0.2%
2 | 1 | 0.2%
3 | 1 | 0.2%
4 | 1 | 0.2%
5 | 1 | 0.2%
6 | 1 | 0.2%
7 | 1 | 0.2%
8 | 1 | 0.2%
9 | 1 | 0.2%
10 | 1 | 0.2%
Maximum 10 values
Value | Count | Frequency (%)
441 | 1 | 0.2%
440 | 1 | 0.2%
439 | 1 | 0.2%
438 | 1 | 0.2%
437 | 1 | 0.2%
436 | 1 | 0.2%
435 | 1 | 0.2%
434 | 1 | 0.2%
433 | 1 | 0.2%
432 | 1 | 0.2%

접수번호 (receipt number)
Text

Distinct: 440
Distinct (%): 99.8%
Missing: 0
Missing (%): 0.0%
Memory size: 3.6 KiB

Length

Max length: 11
Median length: 11
Mean length: 11
Min length: 11

Characters and Unicode

Total characters: 4851
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
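As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below classifies one sample value from this column; scripts and blocks are not available in the standard library, so only categories are shown.

```python
import unicodedata
from collections import Counter

sample = "2021-CW-001"  # first sample value of 접수번호

# General category codes: Nd = Decimal Number, Pd = Dash Punctuation,
# Lu = Uppercase Letter.
categories = Counter(unicodedata.category(ch) for ch in sample)
print(dict(categories))
```

Each 11-character value contributes 7 digits, 2 dashes and 2 uppercase letters, which matches the per-category totals for 441 rows (3087, 882 and 882).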

Unique

Unique: 439
Unique (%): 99.5%
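"Distinct" and "Unique" count different things: distinct is the number of different values, while unique is the number of values that occur exactly once. A single value appearing twice among 441 rows gives 440 distinct and 439 unique values. The identifiers below are made up for illustration and are not the real column contents.

```python
from collections import Counter

# Hypothetical data: 440 different identifiers plus one repeated entry.
ids = [f"ID-{i:03d}" for i in range(440)] + ["ID-000"]

counts = Counter(ids)
distinct = len(counts)  # number of different values
unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once
repeated = [value for value, c in counts.items() if c > 1]

print(len(ids), distinct, unique, repeated)
```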

Sample

1st row: 2021-CW-001
2nd row: 2021-CW-002
3rd row: 2021-CW-003
4th row: 2021-CW-004
5th row: 2021-CW-005
Value | Count | Frequency (%)
2022-cw-054 | 2 | 0.5%
2022-cw-062 | 1 | 0.2%
2022-cw-072 | 1 | 0.2%
2022-cw-071 | 1 | 0.2%
2022-cw-070 | 1 | 0.2%
2022-cw-069 | 1 | 0.2%
2022-cw-068 | 1 | 0.2%
2022-cw-067 | 1 | 0.2%
2022-cw-066 | 1 | 0.2%
2022-cw-065 | 1 | 0.2%
Other values (430) | 430 | 97.5%

Most occurring characters

Value | Count | Frequency (%)
2 | 1209 | 24.9%
- | 882 | 18.2%
0 | 769 | 15.9%
1 | 520 | 10.7%
C | 441 | 9.1%
W | 424 | 8.7%
4 | 85 | 1.8%
3 | 85 | 1.8%
5 | 84 | 1.7%
6 | 84 | 1.7%
Other values (4) | 268 | 5.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 3087 | 63.6%
Dash Punctuation | 882 | 18.2%
Uppercase Letter | 882 | 18.2%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 1209 | 39.2%
0 | 769 | 24.9%
1 | 520 | 16.8%
4 | 85 | 2.8%
3 | 85 | 2.8%
5 | 84 | 2.7%
6 | 84 | 2.7%
7 | 84 | 2.7%
8 | 84 | 2.7%
9 | 83 | 2.7%

Uppercase Letter
Value | Count | Frequency (%)
C | 441 | 50.0%
W | 424 | 48.1%
K | 17 | 1.9%

Dash Punctuation
Value | Count | Frequency (%)
- | 882 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 3969 | 81.8%
Latin | 882 | 18.2%

Most frequent character per script

Common
Value | Count | Frequency (%)
2 | 1209 | 30.5%
- | 882 | 22.2%
0 | 769 | 19.4%
1 | 520 | 13.1%
4 | 85 | 2.1%
3 | 85 | 2.1%
5 | 84 | 2.1%
6 | 84 | 2.1%
7 | 84 | 2.1%
8 | 84 | 2.1%

Latin
Value | Count | Frequency (%)
C | 441 | 50.0%
W | 424 | 48.1%
K | 17 | 1.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 4851 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
2 | 1209 | 24.9%
- | 882 | 18.2%
0 | 769 | 15.9%
1 | 520 | 10.7%
C | 441 | 9.1%
W | 424 | 8.7%
4 | 85 | 1.8%
3 | 85 | 1.8%
5 | 84 | 1.7%
6 | 84 | 1.7%
Other values (4) | 268 | 5.5%

품목일련번호 (item serial number)
Unsupported

REJECTED  UNSUPPORTED

Missing: 0
Missing (%): 0.0%
Memory size: 3.6 KiB
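One plausible reason for the Unsupported type is that the column mixes text serials such as XW180333 with purely numeric ones such as 1590 (both appear in the Sample rows of this report). This is an assumption; the report does not state the cause. A minimal sketch of detecting the mix and normalizing everything to strings:

```python
# First five 품목일련번호 values from the Sample section. Assumption: the
# numeric-looking serials were loaded as integers rather than strings.
raw = ["XW180333", "XW180177", 1590, 690, 704]

# More than one Python type in the column signals a mixed-type column.
type_names = {type(value).__name__ for value in raw}

# Treat every serial as text so the column has a single, consistent type.
cleaned = [str(value) for value in raw]

print(sorted(type_names), cleaned)
```

Serial numbers are identifiers rather than quantities, so storing them as text also avoids accidental arithmetic or loss of leading zeros.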

차기교정일자 (next calibration date)
Text

Distinct: 219
Distinct (%): 49.7%
Missing: 0
Missing (%): 0.0%
Memory size: 3.6 KiB

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 3528
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 99
Unique (%): 22.4%

Sample

1st row: 22.02.02
2nd row: 22.02.08
3rd row: 22.02.08
4th row: 22.02.09
5th row: 22.02.10
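The values look like two-digit year, month and day separated by dots, which would explain why the column is profiled as text rather than as dates. Assuming that YY.MM.DD reading is correct (the report does not confirm it), the strings could be parsed as follows; `parse_calibration_date` is a hypothetical helper.

```python
from datetime import datetime


def parse_calibration_date(text):
    """Parse a 'YY.MM.DD' string into a date (assumed format)."""
    return datetime.strptime(text, "%y.%m.%d").date()


# Values taken from the Sample rows of this report.
samples = ["22.02.02", "22.02.08", "23.12.28"]
parsed = [parse_calibration_date(s) for s in samples]
print([d.isoformat() for d in parsed])
```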
Value | Count | Frequency (%)
22.12.27 | 8 | 1.8%
23.05.09 | 7 | 1.6%
22.06.03 | 6 | 1.4%
23.12.27 | 6 | 1.4%
23.12.20 | 6 | 1.4%
23.07.18 | 5 | 1.1%
23.12.13 | 5 | 1.1%
22.06.04 | 5 | 1.1%
22.06.21 | 5 | 1.1%
23.02.08 | 5 | 1.1%
Other values (209) | 383 | 86.8%

Most occurring characters

Value | Count | Frequency (%)
2 | 940 | 26.6%
. | 882 | 25.0%
0 | 501 | 14.2%
3 | 363 | 10.3%
1 | 338 | 9.6%
7 | 119 | 3.4%
6 | 113 | 3.2%
5 | 102 | 2.9%
4 | 77 | 2.2%
8 | 54 | 1.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 2646 | 75.0%
Other Punctuation | 882 | 25.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 940 | 35.5%
0 | 501 | 18.9%
3 | 363 | 13.7%
1 | 338 | 12.8%
7 | 119 | 4.5%
6 | 113 | 4.3%
5 | 102 | 3.9%
4 | 77 | 2.9%
8 | 54 | 2.0%
9 | 39 | 1.5%

Other Punctuation
Value | Count | Frequency (%)
. | 882 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 3528 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
2 | 940 | 26.6%
. | 882 | 25.0%
0 | 501 | 14.2%
3 | 363 | 10.3%
1 | 338 | 9.6%
7 | 119 | 3.4%
6 | 113 | 3.2%
5 | 102 | 2.9%
4 | 77 | 2.2%
8 | 54 | 1.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 3528 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
2 | 940 | 26.6%
. | 882 | 25.0%
0 | 501 | 14.2%
3 | 363 | 10.3%
1 | 338 | 9.6%
7 | 119 | 3.4%
6 | 113 | 3.2%
5 | 102 | 2.9%
4 | 77 | 2.2%
8 | 54 | 1.5%

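병원구분 (hospital classification)
Categorical

IMBALANCE

Distinct: 5
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.6 KiB
병원 (hospital): 436
연구소 (research institute): 2
학교 (school): 1
기업 (company): 1
산업체 (industrial firm): 1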

Length

Max length: 3
Median length: 2
Mean length: 2.0068027
Min length: 2

Unique

Unique: 3
Unique (%): 0.7%

Sample

1st row: 병원
2nd row: 병원
3rd row: 병원
4th row: 병원
5th row: 병원

Common Values

Value | Count | Frequency (%)
병원 (hospital) | 436 | 98.9%
연구소 (research institute) | 2 | 0.5%
학교 (school) | 1 | 0.2%
기업 (company) | 1 | 0.2%
산업체 (industrial firm) | 1 | 0.2%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

등록자 (registrant)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 3.6 KiB
조규석: 441

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 조규석
2nd row: 조규석
3rd row: 조규석
4th row: 조규석
5th row: 조규석

Common Values

Value | Count | Frequency (%)
조규석 | 441 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Interactions

[Figure: interactions plot]

Correlations

[Figure: correlation heatmap]
Variable | 번호 | 병원구분
번호 | 1.000 | 0.000
병원구분 | 0.000 | 1.000

Missing values

[Figure: missing values count plot] A simple visualization of nullity by column.
[Figure: nullity matrix] A nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

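First rows
Index | 번호 | 접수번호 | 품목일련번호 | 차기교정일자 | 병원구분 | 등록자
0 | 1 | 2021-CW-001 | XW180333 | 22.02.02 | 병원 | 조규석
1 | 2 | 2021-CW-002 | XW180177 | 22.02.02 | 병원 | 조규석
2 | 3 | 2021-CW-003 | 1590 | 22.02.08 | 병원 | 조규석
3 | 4 | 2021-CW-004 | 690 | 22.02.09 | 병원 | 조규석
4 | 5 | 2021-CW-005 | 704 | 22.02.10 | 병원 | 조규석
5 | 6 | 2021-CW-006 | 2151 | 22.02.15 | 병원 | 조규석
6 | 7 | 2021-CW-007 | 7108 | 22.02.16 | 병원 | 조규석
7 | 8 | 2021-CW-008 | 1323 | 22.02.18 | 병원 | 조규석
8 | 9 | 2021-CW-009 | 725 | 22.02.19 | 병원 | 조규석
9 | 10 | 2021-CW-010 | 2810 | 22.03.04 | 병원 | 조규석

Last rows
Index | 번호 | 접수번호 | 품목일련번호 | 차기교정일자 | 병원구분 | 등록자
431 | 432 | 2022-CW-197 | XAJ182203 | 23.12.15 | 병원 | 조규석
432 | 433 | 2022-CW-198 | 2075 | 23.12.21 | 병원 | 조규석
433 | 434 | 2022-CW-199 | 2046 | 23.12.21 | 병원 | 조규석
434 | 435 | 2022-CW-200 | 2020 | 23.12.22 | 병원 | 조규석
435 | 436 | 2022-CW-201 | 10612 | 23.12.22 | 병원 | 조규석
436 | 437 | 2022-CW-202 | 10613 | 23.12.22 | 병원 | 조규석
437 | 438 | 2022-CW-203 | 2193 | 23.12.23 | 병원 | 조규석
438 | 439 | 2022-CW-204 | 2639 | 23.12.23 | 병원 | 조규석
439 | 440 | 2022-CW-205 | 1108 | 23.12.28 | 병원 | 조규석
440 | 441 | 2022-CW-206 | 2269 | 23.12.28 | 병원 | 조규석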