Overview

Dataset statistics

Number of variables: 5
Number of observations: 69
Missing cells: 67
Missing cells (%): 19.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 3.0 KiB
Average record size in memory: 44.9 B
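
The missing-cell percentage can be checked from the counts in this block. The sketch below assumes the percentage is simply missing cells divided by total cells (variables × observations); the variable names are illustrative and do not come from the report.

```python
# Figures copied from the Dataset statistics block.
n_variables = 5
n_observations = 69
missing_cells = 67

# Assumption: the denominator is every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

All 67 missing cells belong to the 비고 column, so the other four columns are complete.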

Variable types

Numeric: 2
Text: 2
Categorical: 1

Dataset

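Description: This data file covers certificate issuance fees at unmanned civil-service kiosks (무인민원발급기) in Gyeyang-gu, Incheon Metropolitan City. It lists the certificate type, the fee charged within the district (관내), and the fee charged outside the district (관외).
Author: Gyeyang-gu, Incheon Metropolitan City (인천광역시 계양구)
URL: https://data.incheon.go.kr/findData/publicDataDetail?dataId=15101766&srcSe=7661IVAWM27C61E190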

Alerts

비고 has constant value "" (Constant)
수수료_관외 is highly overall correlated with 수수료_관내 (High correlation)
수수료_관내 is highly overall correlated with 수수료_관외 (High correlation)
수수료_관내 is highly imbalanced (59.9%) (Imbalance)
비고 has 67 (97.1%) missing values (Missing)
연번 has unique values (Unique)
증명서종류 has unique values (Unique)
수수료_관외 has 50 (72.5%) zeros (Zeros)

Reproduction

Analysis started: 2024-04-21 16:01:49.589659
Analysis finished: 2024-04-21 16:01:50.721502
Duration: 1.13 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

UNIQUE 

Distinct: 69
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 35
Minimum: 1
Maximum: 69
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 749.0 B

Quantile statistics

Minimum: 1
5-th percentile: 4.4
Q1: 18
median: 35
Q3: 52
95-th percentile: 65.6
Maximum: 69
Range: 68
Interquartile range (IQR): 34
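
The fractional percentiles (4.4 and 65.6) are consistent with linear interpolation between neighbouring order statistics. The sketch below is a check under two assumptions: that 연번 is the sequence 1..69, and that the report interpolates at position (n - 1) × p, which is what `method="inclusive"` does in Python's `statistics.quantiles`.

```python
import statistics

# Assumption: 연번 runs 1..69 with no gaps.
values = list(range(1, 70))

# method="inclusive" interpolates at position (n - 1) * p.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
twentieths = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = twentieths[0], twentieths[-1]

iqr = q3 - q1
print(p5, q1, median, q3, p95, iqr)
```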

Descriptive statistics

Standard deviation: 20.062403
Coefficient of variation (CV): 0.5732115
Kurtosis: -1.2
Mean: 35
Median Absolute Deviation (MAD): 17
Skewness: 0
Sum: 2415
Variance: 402.5
Monotonicity: Strictly increasing
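
These figures can be reproduced with the standard library alone. This is a minimal sketch that assumes 연번 is the sequence 1..69 (implied by the minimum, maximum, distinct count and strict monotonicity) and that the variance uses the sample (n - 1) denominator.

```python
import statistics

# Assumption: 연번 runs 1..69 with no gaps.
values = list(range(1, 70))

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)
cv = std_dev / mean                      # coefficient of variation

# Median absolute deviation: median of |x - median(x)|.
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, variance, round(std_dev, 6), round(cv, 7), mad)
```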
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
1                   1      1.4%
45                  1      1.4%
51                  1      1.4%
50                  1      1.4%
49                  1      1.4%
48                  1      1.4%
47                  1      1.4%
46                  1      1.4%
44                  1      1.4%
53                  1      1.4%
Other values (59)   59     85.5%

Value  Count  Frequency (%)
1      1      1.4%
2      1      1.4%
3      1      1.4%
4      1      1.4%
5      1      1.4%
6      1      1.4%
7      1      1.4%
8      1      1.4%
9      1      1.4%
10     1      1.4%

Value  Count  Frequency (%)
69     1      1.4%
68     1      1.4%
67     1      1.4%
66     1      1.4%
65     1      1.4%
64     1      1.4%
63     1      1.4%
62     1      1.4%
61     1      1.4%
60     1      1.4%

증명서종류
Text

UNIQUE 

Distinct: 69
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 680.0 B

Length

Max length: 29
Median length: 23
Mean length: 13.826087
Min length: 4

Characters and Unicode

Total characters: 954
Distinct characters: 147
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
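
As an illustration of those character properties, the sketch below counts Unicode general categories for one value from this column using Python's `unicodedata` module. It only illustrates the idea and is not necessarily how the report computes its tables.

```python
import unicodedata
from collections import Counter

# One value copied from this variable's Sample block (4th row).
text = "토지(임야)대장등본, 대지권등록부"

# unicodedata.category returns a two-letter General Category code:
# "Lo" Other Letter, "Zs" Space Separator, "Ps"/"Pe" Open/Close Punctuation,
# "Po" Other Punctuation, "Nd" Decimal Number.
category_counts = Counter(unicodedata.category(ch) for ch in text)

for code, count in category_counts.most_common():
    print(code, count, f"{100 * count / len(text):.1f}%")
```

Hangul syllables fall under "Lo" (Other Letter), which is why that category dominates the tables for this column.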

Unique

Unique: 69
Unique (%): 100.0%

Sample

1st row: 주민등록 등초본
2nd row: 개별공시지가확인서
3rd row: 토지이용계획확인서
4th row: 토지(임야)대장등본, 대지권등록부
5th row: 건축물대장

Value                Count  Frequency (%)
증명                 4      2.9%
납부확인서           4      2.9%
졸업자               4      2.9%
지역가입자           3      2.1%
건강장기요양보험료   3      2.1%
검정고시             3      2.1%
고용보험             3      2.1%
산재보험             2      1.4%
2003년이후           2      1.4%
직장가입자           2      1.4%
Other values (100)   110    78.6%

Most occurring characters

Value                Count  Frequency (%)
(space)              71     7.4%
[missing]            51     5.3%
[missing]            44     4.6%
[missing]            44     4.6%
[missing]            30     3.1%
(                    30     3.1%
)                    30     3.1%
[missing]            29     3.0%
[missing]            18     1.9%
[missing]            17     1.8%
Other values (137)   590    61.8%

Most occurring categories

Value               Count  Frequency (%)
Other Letter        786    82.4%
Space Separator     71     7.4%
Open Punctuation    30     3.1%
Close Punctuation   30     3.1%
Other Punctuation   21     2.2%
Decimal Number      16     1.7%

Most frequent character per category

Other Letter
Value                Count  Frequency (%)
[missing]            51     6.5%
[missing]            44     5.6%
[missing]            44     5.6%
[missing]            30     3.8%
[missing]            29     3.7%
[missing]            18     2.3%
[missing]            17     2.2%
[missing]            17     2.2%
[missing]            16     2.0%
[missing]            16     2.0%
Other values (125)   504    64.1%

Decimal Number
Value  Count  Frequency (%)
0      6      37.5%
2      4      25.0%
3      3      18.8%
1      1      6.2%
9      1      6.2%
8      1      6.2%

Other Punctuation
Value  Count  Frequency (%)
,      13     61.9%
/      5      23.8%
·      3      14.3%

Space Separator
Value    Count  Frequency (%)
(space)  71     100.0%

Open Punctuation
Value  Count  Frequency (%)
(      30     100.0%

Close Punctuation
Value  Count  Frequency (%)
)      30     100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  786    82.4%
Common  168    17.6%

Most frequent character per script

Hangul
Value                Count  Frequency (%)
[missing]            51     6.5%
[missing]            44     5.6%
[missing]            44     5.6%
[missing]            30     3.8%
[missing]            29     3.7%
[missing]            18     2.3%
[missing]            17     2.2%
[missing]            17     2.2%
[missing]            16     2.0%
[missing]            16     2.0%
Other values (125)   504    64.1%

Common
Value              Count  Frequency (%)
(space)            71     42.3%
(                  30     17.9%
)                  30     17.9%
,                  13     7.7%
0                  6      3.6%
/                  5      3.0%
2                  4      2.4%
·                  3      1.8%
3                  3      1.8%
1                  1      0.6%
Other values (2)   2      1.2%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  786    82.4%
ASCII   165    17.3%
None    3      0.3%

Most frequent character per block

ASCII
Value    Count  Frequency (%)
(space)  71     43.0%
(        30     18.2%
)        30     18.2%
,        13     7.9%
0        6      3.6%
/        5      3.0%
2        4      2.4%
3        3      1.8%
1        1      0.6%
9        1      0.6%

Hangul
Value                Count  Frequency (%)
[missing]            51     6.5%
[missing]            44     5.6%
[missing]            44     5.6%
[missing]            30     3.8%
[missing]            29     3.7%
[missing]            18     2.3%
[missing]            17     2.2%
[missing]            17     2.2%
[missing]            16     2.0%
[missing]            16     2.0%
Other values (125)   504    64.1%

None
Value  Count  Frequency (%)
·      3      100.0%

수수료_관내
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 5
Distinct (%): 7.2%
Missing: 0
Missing (%): 0.0%
Memory size: 680.0 B

Value  Count
0      58
500    4
800    3
1000   3
300    1

Length

Max length: 4
Median length: 1
Mean length: 1.3623188
Min length: 1

Unique

Unique: 1
Unique (%): 1.4%

Sample

1st row: 0
2nd row: 800
3rd row: 1000
4th row: 500
5th row: 500

Common Values

Value  Count  Frequency (%)
0      58     84.1%
500    4      5.8%
800    3      4.3%
1000   3      4.3%
300    1      1.4%
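
The 59.9% imbalance figure reported for this variable is consistent with one minus the normalized Shannon entropy of these counts. The sketch below assumes that definition; the formula is an assumption checked against the reported number, not something stated in the report.

```python
import math

# Counts from the Common Values table for 수수료_관내.
counts = {"0": 58, "500": 4, "800": 3, "1000": 3, "300": 1}

total = sum(counts.values())
probs = [c / total for c in counts.values()]

# Shannon entropy of the class distribution, in nats.
entropy = -sum(p * math.log(p) for p in probs)

# Assumed definition: 1 - entropy / log(number of classes).
# 0 means perfectly balanced classes; 1 means a single class holds everything.
imbalance = 1 - entropy / math.log(len(counts))

print(f"{100 * imbalance:.1f}%")
```

Dividing by the log of the class count makes the score independent of the logarithm base.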

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0      58     84.1%
500    4      5.8%
800    3      4.3%
1000   3      4.3%
300    1      1.4%

수수료_관외
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 6
Distinct (%): 8.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 160.86957
Minimum: 0
Maximum: 1000
Zeros: 50
Zeros (%): 72.5%
Negative: 0
Negative (%): 0.0%
Memory size: 749.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 300
95-th percentile: 800
Maximum: 1000
Range: 1000
Interquartile range (IQR): 300

Descriptive statistics

Standard deviation: 287.5952
Coefficient of variation (CV): 1.7877539
Kurtosis: 1.5911934
Mean: 160.86957
Median Absolute Deviation (MAD): 0
Skewness: 1.6282324
Sum: 11100
Variance: 82710.997
Monotonicity: Not monotonic
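
Because this variable takes only six distinct values, its summary statistics can be rebuilt from the value counts the report lists for it. The sketch below assumes sample (n - 1) variance; the `frequency` dictionary is transcribed from the report's value/count listing for 수수료_관외.

```python
import statistics

# Value -> count for 수수료_관외, transcribed from the report.
frequency = {0: 50, 200: 1, 300: 1, 500: 12, 800: 2, 1000: 3}

# Expand the table back into the 69 individual observations.
values = [value for value, n in frequency.items() for _ in range(n)]

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)
zeros_pct = 100 * frequency[0] / len(values)

print(total, round(mean, 5), round(variance, 3), round(std_dev, 4), round(zeros_pct, 1))
```

With 50 of 69 observations equal to zero, the median and MAD are both 0 while the mean is pulled up by the few non-zero fees, which matches the positive skewness.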
Histogram with fixed size bins (bins=6)
Value  Count  Frequency (%)
0      50     72.5%
500    12     17.4%
1000   3      4.3%
800    2      2.9%
200    1      1.4%
300    1      1.4%

Value  Count  Frequency (%)
0      50     72.5%
200    1      1.4%
300    1      1.4%
500    12     17.4%
800    2      2.9%
1000   3      4.3%

Value  Count  Frequency (%)
1000   3      4.3%
800    2      2.9%
500    12     17.4%
300    1      1.4%
200    1      1.4%
0      50     72.5%

비고
Text

CONSTANT  MISSING 

Distinct: 1
Distinct (%): 50.0%
Missing: 67
Missing (%): 97.1%
Memory size: 680.0 B

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 10
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 관외 불가
2nd row: 관외 불가

Value  Count  Frequency (%)
관외   2      50.0%
불가   2      50.0%

Most occurring characters

Value    Count  Frequency (%)
관       2      20.0%
외       2      20.0%
(space)  2      20.0%
불       2      20.0%
가       2      20.0%

Most occurring categories

Value            Count  Frequency (%)
Other Letter     8      80.0%
Space Separator  2      20.0%

Most frequent character per category

Other Letter
Value  Count  Frequency (%)
관     2      25.0%
외     2      25.0%
불     2      25.0%
가     2      25.0%

Space Separator
Value    Count  Frequency (%)
(space)  2      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  8      80.0%
Common  2      20.0%

Most frequent character per script

Hangul
Value  Count  Frequency (%)
관     2      25.0%
외     2      25.0%
불     2      25.0%
가     2      25.0%

Common
Value    Count  Frequency (%)
(space)  2      100.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  8      80.0%
ASCII   2      20.0%

Most frequent character per block

Hangul
Value  Count  Frequency (%)
관     2      25.0%
외     2      25.0%
불     2      25.0%
가     2      25.0%

ASCII
Value    Count  Frequency (%)
(space)  2      100.0%

Interactions

Four pairwise scatter plots of the numeric variables (연번, 수수료_관외); figures not reproduced.

Correlations

              연번    증명서종류  수수료_관내  수수료_관외
연번          1.000   1.000       0.610        0.622
증명서종류    1.000   1.000       1.000        1.000
수수료_관내   0.610   1.000       1.000        0.888
수수료_관외   0.622   1.000       0.888        1.000

              연번    수수료_관외  수수료_관내
연번          1.000   -0.295       0.283
수수료_관외   -0.295  1.000        0.815
수수료_관내   0.283   0.815        1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(index)  연번  증명서종류  수수료_관내  수수료_관외  비고
0   1   주민등록 등초본   0     200   <NA>
1   2   개별공시지가확인서   800   800   <NA>
2   3   토지이용계획확인서   1000  1000  <NA>
3   4   토지(임야)대장등본, 대지권등록부   500   500   <NA>
4   5   건축물대장   500   500   <NA>
5   6   자동차등록원부(갑, 을)   300   300   <NA>
6   7   건설기계등록원부(갑, 을)   500   500   <NA>
7   8   수급자증명서   0     0     <NA>
8   9   장애인증명서   0     0     <NA>
9   10  한부모가족증명서   0     0     <NA>

(index)  연번  증명서종류  수수료_관내  수수료_관외  비고
59  60  산재보험 자격이력내역서(근로자용)   0  0  <NA>
60  61  산재보험 일용근로내역서(근로자용)   0  0  <NA>
61  62  보험급여 지급확인원(근로자용)   0  0  <NA>
62  63  고용·산재보험가입증명원(법인/개인)   0  0  <NA>
63  64  고용·산재보험 신고 및 완납여부 증명원(법인/개인)   0  0  <NA>
64  65  산재요양승인반려여부확인서(법인/개인)   0  0  <NA>
65  66  여권발급기록증명서(국/영)   0  0  <NA>
66  67  여권발급신청서류증명서   0  0  <NA>
67  68  여권실효확인서(국/영)   0  0  <NA>
68  69  여권정보증명서   0  0  <NA>