Overview

Dataset statistics

Number of variables: 4
Number of observations: 73
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.5 KiB
Average record size in memory: 34.8 B

Variable types

Numeric: 1
Text: 1
Categorical: 2

Dataset

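Description: Data on the documents that can be issued by unmanned civil-service kiosks in Yeonsu-gu, Incheon Metropolitan City, consisting of fields such as serial number, document name, fee, and identity verification.
Author: Yeonsu-gu, Incheon Metropolitan City (인천광역시 연수구)
URL: https://data.incheon.go.kr/findData/publicDataDetail?dataId=15116400&srcSe=7661IVAWM27C61E190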

Alerts

Column names: 연번 = serial number, 서류명 = document name, 수수료(원) = fee (KRW), 본인확인(지문) = identity verification (fingerprint).

High correlation: 수수료(원) is highly overall correlated with 본인확인(지문)
High correlation: 본인확인(지문) is highly overall correlated with 수수료(원)
Imbalance: 수수료(원) is highly imbalanced (56.4%)
Imbalance: 본인확인(지문) is highly imbalanced (59.0%)
Unique: 연번 has unique values
Unique: 서류명 has unique values
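
The two Imbalance percentages can be checked by hand. The sketch below assumes the score is one minus the normalized Shannon entropy of the class counts; that definition is an assumption here, chosen because it reproduces both reported figures from the counts listed under Common Values in this report.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy (in bits)
    of the class distribution and k is the number of classes.
    0 means perfectly balanced; values near 1 mean one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no defined balance; treated as 0 here
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Class counts copied from the Common Values tables of this report.
fee_counts = [56, 7, 2, 2, 2, 1, 1, 1, 1]   # 수수료(원), 9 categories
fingerprint_counts = [67, 6]                # 본인확인(지문), 2 categories

fee_pct = round(imbalance_score(fee_counts) * 100, 1)
fingerprint_pct = round(imbalance_score(fingerprint_counts) * 100, 1)
print(fee_pct, fingerprint_pct)
```

Under this assumed definition, a column split evenly across its categories would score 0%.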

Reproduction

Analysis started: 2024-01-28 14:26:32.958182
Analysis finished: 2024-01-28 14:26:33.364220
Duration: 0.41 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
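
A report of this kind could be regenerated along the following lines. This is a minimal sketch, not the configuration actually used (that is in config.json): the function name, the file paths, the report title, and the `cp949` encoding are all assumptions, and `pandas` and `ydata-profiling` must be installed.

```python
def build_profile(csv_path: str, html_path: str, encoding: str = "cp949") -> None:
    """Profile a CSV export of the dataset and write the report as HTML."""
    # Imported lazily so this module loads even without the third-party packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding is a guess: Korean public-data CSVs are commonly cp949 or utf-8.
    df = pd.read_csv(csv_path, encoding=encoding)

    report = ProfileReport(
        df,
        title="Yeonsu-gu kiosk documents",  # hypothetical title
    )
    report.to_file(html_path)
```

A call such as `build_profile("kiosk_documents.csv", "report.html")` would then write the HTML report; both file names are placeholders.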

Variables

연번 (serial number)
Real number (ℝ)

UNIQUE

Distinct: 73
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37
Minimum: 1
Maximum: 73
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 789.0 B

Quantile statistics

Minimum: 1
5-th percentile: 4.6
Q1: 19
Median: 37
Q3: 55
95-th percentile: 69.4
Maximum: 73
Range: 72
Interquartile range (IQR): 36

Descriptive statistics

Standard deviation: 21.217131
Coefficient of variation (CV): 0.57343598
Kurtosis: -1.2
Mean: 37
Median Absolute Deviation (MAD): 18
Skewness: 0
Sum: 2701
Variance: 450.16667
Monotonicity: Strictly increasing
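
Because 연번 is simply the sequence 1 to 73, most of the figures above can be recomputed with the Python standard library alone. The sketch below assumes the sample (n - 1) variance and linearly interpolated percentiles, which is what matches the reported values.

```python
import math
import statistics

values = list(range(1, 74))  # 연번 runs from 1 to 73 with no gaps

mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (n - 1 denominator)
std_dev = statistics.stdev(values)       # sample standard deviation
cv = std_dev / mean                      # coefficient of variation
median = statistics.median(values)
# Median absolute deviation: median of the distances from the median.
mad = statistics.median(abs(v - median) for v in values)

# "inclusive" interpolates linearly between the minimum and maximum.
quartiles = statistics.quantiles(values, n=4, method="inclusive")
vigintiles = statistics.quantiles(values, n=20, method="inclusive")
p5, p95 = vigintiles[0], vigintiles[-1]

print(mean, median, mad, sum(values))
print(round(variance, 5), round(std_dev, 6), round(cv, 8))
print(quartiles, p5, p95)
```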
Histogram with fixed size bins (bins=50)

Common values

Value              Count  Frequency (%)
1                  1      1.4%
56                 1      1.4%
54                 1      1.4%
53                 1      1.4%
52                 1      1.4%
51                 1      1.4%
50                 1      1.4%
49                 1      1.4%
48                 1      1.4%
47                 1      1.4%
Other values (63)  63     86.3%

Minimum 10 values

Value  Count  Frequency (%)
1      1      1.4%
2      1      1.4%
3      1      1.4%
4      1      1.4%
5      1      1.4%
6      1      1.4%
7      1      1.4%
8      1      1.4%
9      1      1.4%
10     1      1.4%

Maximum 10 values

Value  Count  Frequency (%)
73     1      1.4%
72     1      1.4%
71     1      1.4%
70     1      1.4%
69     1      1.4%
68     1      1.4%
67     1      1.4%
66     1      1.4%
65     1      1.4%
64     1      1.4%

서류명 (document name)
Text

UNIQUE

Distinct: 73
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 716.0 B

Length

Max length: 29
Median length: 19
Mean length: 12.712329
Min length: 4

Characters and Unicode

Total characters: 928
Distinct characters: 143
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
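
As an illustration of that idea, the sketch below counts Unicode general categories for three document names taken from the sample rows of this report, using Python's `unicodedata` module. The two-letter codes it returns correspond to the category names used in this report: `Lo` is Other Letter, `Ps` Open Punctuation, `Pe` Close Punctuation, `Po` Other Punctuation, and `Zs` Space Separator. The helper name `category_counts` is made up for this example.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count the Unicode general category of every character in text."""
    return Counter(unicodedata.category(ch) for ch in text)


# Three 서류명 values copied from the sample rows of this report.
sample_names = [
    "주민등록등본(초본)",
    "토지(임야)대장등본",
    "건설기계등록원부(갑,을)",
]

totals = Counter()
for name in sample_names:
    totals.update(category_counts(name))

print(category_counts(sample_names[0]))
print(totals)
```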

Unique

Unique: 73
Unique (%): 100.0%

Sample

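1st row: 주민등록등본(초본) (certified copy (abstract) of resident registration)
2nd row: 개별공시지가확인서 (individual officially assessed land price certificate)
3rd row: 토지이용계획확인서 (land use plan confirmation)
4th row: 토지(임야)대장등본 (certified copy of the land (forest land) register)
5th row: 건축물대장 (building register)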
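
Most frequent words

Value                                                        Count  Frequency (%)
납부확인서 (payment confirmation)                              6      4.6%
국민연금보험료 (National Pension contribution)                 4      3.1%
포함 (including)                                              4      3.1%
지역가입자 (regional subscriber)                               3      2.3%
건강장기요양보험료 (health and long-term care insurance premium)  3      2.3%
검정고시 (qualification examination)                          3      2.3%
고용보험 (employment insurance)                               3      2.3%
산재보험 (industrial accident insurance)                      2      1.5%
증명서 (certificate)                                          2      1.5%
부가가치세 (value-added tax)                                   2      1.5%
Other values (92)                                            99     75.6%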

Most occurring characters

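Value                     Count  Frequency (%)
(space)                   58     6.2%
(Hangul, glyph missing)   47     5.1%
(Hangul, glyph missing)   44     4.7%
(Hangul, glyph missing)   43     4.6%
(Hangul, glyph missing)   29     3.1%
(                         28     3.0%
)                         28     3.0%
(Hangul, glyph missing)   26     2.8%
(Hangul, glyph missing)   19     2.0%
(Hangul, glyph missing)   19     2.0%
Other values (133)        587    63.3%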

Most occurring categories

Value              Count  Frequency (%)
Other Letter       800    86.2%
Space Separator    58     6.2%
Open Punctuation   28     3.0%
Close Punctuation  28     3.0%
Other Punctuation  14     1.5%

Most frequent character per category

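Other Letter

Value                     Count  Frequency (%)
(Hangul, glyph missing)   47     5.9%
(Hangul, glyph missing)   44     5.5%
(Hangul, glyph missing)   43     5.4%
(Hangul, glyph missing)   29     3.6%
(Hangul, glyph missing)   26     3.2%
(Hangul, glyph missing)   19     2.4%
(Hangul, glyph missing)   19     2.4%
(Hangul, glyph missing)   19     2.4%
(Hangul, glyph missing)   18     2.2%
(Hangul, glyph missing)   18     2.2%
Other values (127)        518    64.8%

Other Punctuation

Value  Count  Frequency (%)
,      7      50.0%
/      5      35.7%
·      2      14.3%

Space Separator

Value    Count  Frequency (%)
(space)  58     100.0%

Open Punctuation

Value  Count  Frequency (%)
(      28     100.0%

Close Punctuation

Value  Count  Frequency (%)
)      28     100.0%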

Most occurring scripts

Value   Count  Frequency (%)
Hangul  800    86.2%
Common  128    13.8%

Most frequent character per script

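Hangul

Value                     Count  Frequency (%)
(Hangul, glyph missing)   47     5.9%
(Hangul, glyph missing)   44     5.5%
(Hangul, glyph missing)   43     5.4%
(Hangul, glyph missing)   29     3.6%
(Hangul, glyph missing)   26     3.2%
(Hangul, glyph missing)   19     2.4%
(Hangul, glyph missing)   19     2.4%
(Hangul, glyph missing)   19     2.4%
(Hangul, glyph missing)   18     2.2%
(Hangul, glyph missing)   18     2.2%
Other values (127)        518    64.8%

Common

Value    Count  Frequency (%)
(space)  58     45.3%
(        28     21.9%
)        28     21.9%
,        7      5.5%
/        5      3.9%
·        2      1.6%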

Most occurring blocks

Value   Count  Frequency (%)
Hangul  800    86.2%
ASCII   126    13.6%
None    2      0.2%

Most frequent character per block

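ASCII

Value    Count  Frequency (%)
(space)  58     46.0%
(        28     22.2%
)        28     22.2%
,        7      5.6%
/        5      4.0%

Hangul

Value                     Count  Frequency (%)
(Hangul, glyph missing)   47     5.9%
(Hangul, glyph missing)   44     5.5%
(Hangul, glyph missing)   43     5.4%
(Hangul, glyph missing)   29     3.6%
(Hangul, glyph missing)   26     3.2%
(Hangul, glyph missing)   19     2.4%
(Hangul, glyph missing)   19     2.4%
(Hangul, glyph missing)   19     2.4%
(Hangul, glyph missing)   18     2.2%
(Hangul, glyph missing)   18     2.2%
Other values (127)        518    64.8%

None

Value  Count  Frequency (%)
·      2      100.0%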

수수료(원) (fee, KRW)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 9
Distinct (%): 12.3%
Missing: 0
Missing (%): 0.0%
Memory size: 716.0 B

Top categories: 무료 (free) 56; 500: 7; 1000: 2; 300: 2; 관내:800+관외:800 (in-district: 800 + out-of-district: 800): 2; Other values (4): 4

Length

Max length: 15
Median length: 2
Mean length: 2.8219178
Min length: 2

Unique

Unique: 4
Unique (%): 5.5%

Sample

1st row: 200
2nd row: 800
3rd row: 1000
4th row: 500
5th row: 500

Common Values

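Value                                                                    Count  Frequency (%)
무료 (free)                                                               56     76.7%
500                                                                      7      9.6%
1000                                                                     2      2.7%
300                                                                      2      2.7%
관내:800+관외:800 (in-district: 800 + out-of-district: 800)                2      2.7%
200                                                                      1      1.4%
800                                                                      1      1.4%
관내:500+관외:1,500 (in-district: 500 + out-of-district: 1,500)            1      1.4%
관내:500+관외:불가 (in-district: 500 + out-of-district: not available)     1      1.4%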
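
The column is categorical because it mixes plain amounts, the word 무료 (free), and compound in-district/out-of-district strings. For numeric analysis, one possible normalization is sketched below. The function names are hypothetical, and the sketch assumes that a single amount applies both inside and outside the district and that 불가 (not available) should map to `None`; neither assumption is stated in the source data.

```python
import re
from typing import Optional, Tuple

FREE = "무료"          # "free"
UNAVAILABLE = "불가"   # "not available"


def _to_won(text: str) -> Optional[int]:
    """Convert one fee token to an integer amount in won."""
    text = text.strip()
    if text == FREE:
        return 0
    if text == UNAVAILABLE:
        return None
    return int(text.replace(",", ""))  # drop thousands separators such as "1,500"


def parse_fee(raw: str) -> Tuple[Optional[int], Optional[int]]:
    """Return (in_district_fee, out_of_district_fee) in won."""
    # Compound form: 관내:<fee>+관외:<fee> (in-district / out-of-district).
    match = re.fullmatch(r"관내:(.+?)\+관외:(.+)", raw.strip())
    if match:
        return _to_won(match.group(1)), _to_won(match.group(2))
    # Assumption: a single value applies to both cases.
    fee = _to_won(raw)
    return fee, fee


for raw in ["무료", "500", "관내:500+관외:1,500", "관내:500+관외:불가"]:
    print(raw, parse_fee(raw))
```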

Length

Histogram of lengths of the category

Common Values (Plot)

본인확인(지문) (identity verification by fingerprint)
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Memory size: 716.0 B

Top categories: 필요 (required) 67; 불필요 (not required) 6

Length

Max length: 3
Median length: 2
Mean length: 2.0821918
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 필요 (required)
2nd row: 불필요 (not required)
3rd row: 불필요 (not required)
4th row: 불필요 (not required)
5th row: 불필요 (not required)

Common Values

Value                  Count  Frequency (%)
필요 (required)         67     91.8%
불필요 (not required)   6      8.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Interactions


Correlations

                연번     서류명   수수료(원)  본인확인(지문)
연번             1.000   1.000   0.502      0.615
서류명           1.000   1.000   1.000      1.000
수수료(원)       0.502   1.000   1.000      0.761
본인확인(지문)   0.615   1.000   0.761      1.000

                수수료(원)  본인확인(지문)
수수료(원)       1.000      0.739
본인확인(지문)   0.739      1.000

                연번     수수료(원)  본인확인(지문)
연번             1.000   0.250      0.447
수수료(원)       0.250   1.000      0.739
본인확인(지문)   0.447   0.739      1.000
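
The matrices above are not labeled with the measure they use. For two categorical columns such as 수수료(원) and 본인확인(지문), a common association measure is Cramér's V. The sketch below is an illustrative implementation with an optional bias correction, not a confirmed description of how this report's values were computed. The two contingency tables are toy data; the real cross-tabulation is not shown in this report, so the 0.739 figure cannot be reproduced here.

```python
import math


def cramers_v(table, bias_correction=True):
    """Cramér's V for a contingency table given as a list of rows of counts."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic (no continuity correction).
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    r, k = len(table), len(table[0])
    if bias_correction:
        # Small-sample correction of phi squared and of the table dimensions.
        phi2 = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
        r = r - (r - 1) ** 2 / (n - 1)
        k = k - (k - 1) ** 2 / (n - 1)

    denom = min(r - 1, k - 1)
    return math.sqrt(phi2 / denom) if denom > 0 else 0.0


# Toy tables (hypothetical counts, not taken from the dataset).
perfect = [[10, 0], [0, 10]]       # each row value determines the column value
independent = [[5, 5], [5, 5]]     # no association at all

print(cramers_v(perfect), cramers_v(independent))
```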

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

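First rows

Index  연번  서류명                                                                      수수료(원)            본인확인(지문)
0      1    주민등록등본(초본) (certified copy (abstract) of resident registration)          200                  필요
1      2    개별공시지가확인서 (individual officially assessed land price certificate)    800                  불필요
2      3    토지이용계획확인서 (land use plan confirmation)                               1000                 불필요
3      4    토지(임야)대장등본 (certified copy of the land (forest land) register)        500                  불필요
4      5    건축물대장 (building register)                                               500                  불필요
5      6    건설기계등록원부(갑,을) (construction machinery registration ledger, parts A and B)  관내:500+관외:1,500  필요
6      7    자동차등록원부(갑,을) (motor vehicle registration ledger, parts A and B)       300                  필요
7      8    국민기초수급자증명 (national basic livelihood recipient certificate)          무료                 필요
8      9    장애인증명서 (disability certificate)                                        무료                 필요
9      10   한부모가족증명서 (single-parent family certificate)                           무료                 필요

Last rows

Index  연번  서류명                                                                      수수료(원)  본인확인(지문)
63     64   여권발급기록증명서(국/영) (passport issuance record certificate, Korean/English)  무료       필요
64     65   여권발급신청서류증명서 (passport application documents certificate)            무료       필요
65     66   여권실효확인서(국/영) (passport invalidation confirmation, Korean/English)     무료       필요
66     67   여권정보증명서 (passport information certificate)                             무료       필요
67     68   국민연금 가입자 가입증명 (National Pension subscriber enrollment certificate)   무료       필요
68     69   국민연금 수급증명(지급내역) (National Pension benefit certificate, payment details)  무료   필요
69     70   연금소득원천징수영수증 (pension income withholding tax receipt)                 무료       필요
70     71   연금산정용 가입내역 확인서 (enrollment history confirmation for pension calculation)  무료  필요
71     72   국민연금보험료 소득공제용 납부확인서 (National Pension contribution payment confirmation for income deduction)  무료  필요
72     73   국민연금보험료 납부확인서 (National Pension contribution payment confirmation)   무료       필요

Value key: 무료 = free; 필요 = required; 불필요 = not required; 관내 = in-district; 관외 = out-of-district.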