Overview

Dataset statistics

Number of variables: 8
Number of observations: 500
Missing cells: 150
Missing cells (%): 3.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 32.8 KiB
Average record size in memory: 67.3 B
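
The missing-cell percentage is the number of missing cells divided by the total number of cells (observations × variables). A quick check of that arithmetic follows; `missing_cell_pct` is an illustrative helper name, not part of the profiling tool.

```python
def missing_cell_pct(missing_cells: int, n_rows: int, n_cols: int) -> float:
    """Share of missing cells as a percentage of all cells in the table."""
    return 100.0 * missing_cells / (n_rows * n_cols)


# Figures taken from the dataset statistics: 150 missing cells, 500 rows, 8 columns.
pct = missing_cell_pct(missing_cells=150, n_rows=500, n_cols=8)
print(f"{pct:.2f}%")  # 3.75%, which the report displays rounded to 3.8%
```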

Variable types

Text: 4
Categorical: 1
Boolean: 1
Numeric: 2

Dataset

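Description: This file contains the Korea Credit Guarantee Fund's "other customer information: tax accountant and accountant information" (고객기타정보세무사회계사정보). Please refer to it when using the data.
Author: 신용보증기금 (Korea Credit Guarantee Fund)
URL: https://www.data.go.kr/data/15093112/fileData.do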

Alerts

삭제여부 has constant value "False" (Constant)
세무사회계사구분코드 is highly imbalanced (56.4%) (Imbalance)
사무소명 has 150 (30.0%) missing values (Missing)
세무사회계사ID has unique values (Unique)
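
The 56.4% imbalance figure can be reproduced from the class counts of 세무사회계사구분코드 (455 and 45) as one minus the normalized Shannon entropy. Treat this as a sketch under that assumption: the report itself does not state the formula, and `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy).

    0 means perfectly balanced classes; values near 1 mean one class dominates.
    """
    if len(counts) < 2:
        # The score is not meaningful for a single class; 0.0 is a convention chosen here.
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(len(counts))


score = imbalance_score([455, 45])
print(f"{score:.1%}")  # 56.4%
```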

Reproduction

Analysis started: 2023-12-12 07:15:48.968824
Analysis finished: 2023-12-12 07:15:50.059320
Duration: 1.09 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
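
A report of this kind can be regenerated along the following lines. This is a minimal sketch, not the configuration that produced the report: `build_report`, the file paths, the title and the `encoding` default are assumptions, and the settings stored in config.json are not reproduced here.

```python
def build_report(csv_path, out_path="report.html", encoding="utf-8"):
    """Profile a CSV file and write an HTML report (hypothetical helper)."""
    # Imported lazily so this module loads even without the third-party packages.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # The encoding of the source file is an assumption; adjust it if decoding fails.
    df = pd.read_csv(csv_path, encoding=encoding)

    profile = ProfileReport(
        df,
        title="Dataset profile",  # assumed title
        dataset={
            "author": "신용보증기금",
            "url": "https://www.data.go.kr/data/15093112/fileData.do",
        },
    )
    profile.to_file(out_path)
    return out_path
```

Calling `build_report("data.csv")` with a local copy of the file (the name is a placeholder) would write `report.html` next to the script.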

Variables

세무사회계사ID
Text

Distinct: 500
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 5000
Distinct characters: 62
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
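
As an illustration of that idea, the general category of each character can be looked up with Python's standard `unicodedata` module and tallied. The sketch below uses the five sample values of 세무사회계사ID; the `CATEGORY_NAMES` mapping and the `category_counts` helper are illustrative names, and this is not necessarily how the profiling tool computes its tables.

```python
import unicodedata
from collections import Counter

# Long names for the general-category codes that occur in these sample values.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


sample = ["9bnEWvraeC", "9dnSX9ueMM", "9dnSOEUvER", "9bnEWvf1FC", "9dnOG0K06G"]
counts = category_counts(sample)
print(counts.most_common())
```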

Unique

Unique: 500
Unique (%): 100.0%

Sample

1st row: 9bnEWvraeC
2nd row: 9dnSX9ueMM
3rd row: 9dnSOEUvER
4th row: 9bnEWvf1FC
5th row: 9dnOG0K06G
ValueCountFrequency (%)
9bnewvraec 1
 
0.2%
9dmtxcvx11 1
 
0.2%
9cvnpx6scz 1
 
0.2%
9dmsonrhzn 1
 
0.2%
9dmsova5t3 1
 
0.2%
9dmsovbv82 1
 
0.2%
9bof9eu0hy 1
 
0.2%
9dmsxjsrul 1
 
0.2%
9cszso7pcp 1
 
0.2%
9dmsb16qpj 1
 
0.2%
Other values (490) 490
98.0%

Most occurring characters

ValueCountFrequency (%)
9 551
 
11.0%
d 388
 
7.8%
m 242
 
4.8%
n 230
 
4.6%
c 136
 
2.7%
b 123
 
2.5%
E 103
 
2.1%
W 92
 
1.8%
v 86
 
1.7%
S 75
 
1.5%
Other values (52) 2974
59.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2320
46.4%
Uppercase Letter 1616
32.3%
Decimal Number 1064
21.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
d 388
16.7%
m 242
 
10.4%
n 230
 
9.9%
c 136
 
5.9%
b 123
 
5.3%
v 86
 
3.7%
z 71
 
3.1%
y 70
 
3.0%
o 69
 
3.0%
f 66
 
2.8%
Other values (16) 839
36.2%
Uppercase Letter
ValueCountFrequency (%)
E 103
 
6.4%
W 92
 
5.7%
S 75
 
4.6%
P 74
 
4.6%
J 69
 
4.3%
H 68
 
4.2%
C 66
 
4.1%
A 63
 
3.9%
V 63
 
3.9%
N 63
 
3.9%
Other values (16) 880
54.5%
Decimal Number
ValueCountFrequency (%)
9 551
51.8%
2 74
 
7.0%
3 71
 
6.7%
0 65
 
6.1%
7 57
 
5.4%
4 56
 
5.3%
1 53
 
5.0%
5 50
 
4.7%
8 46
 
4.3%
6 41
 
3.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 3936
78.7%
Common 1064
 
21.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
d 388
 
9.9%
m 242
 
6.1%
n 230
 
5.8%
c 136
 
3.5%
b 123
 
3.1%
E 103
 
2.6%
W 92
 
2.3%
v 86
 
2.2%
S 75
 
1.9%
P 74
 
1.9%
Other values (42) 2387
60.6%
Common
ValueCountFrequency (%)
9 551
51.8%
2 74
 
7.0%
3 71
 
6.7%
0 65
 
6.1%
7 57
 
5.4%
4 56
 
5.3%
1 53
 
5.0%
5 50
 
4.7%
8 46
 
4.3%
6 41
 
3.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9 551
 
11.0%
d 388
 
7.8%
m 242
 
4.8%
n 230
 
4.6%
c 136
 
2.7%
b 123
 
2.5%
E 103
 
2.1%
W 92
 
1.8%
v 86
 
1.7%
S 75
 
1.5%
Other values (52) 2974
59.5%

세무사회계사구분코드
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

ValueCountFrequency (%)
1 455
91.0%
2 45
 
9.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: common values plot]

등록관리번호
Text

Distinct: 459
Distinct (%): 91.8%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 8
Median length: 6
Mean length: 6.296
Min length: 1

Characters and Unicode

Total characters: 3148
Distinct characters: 29
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 421
Unique (%): 84.2%

Sample

1st row: U-2861-7
2nd row: W33285
3rd row: U52361
4th row: U-5388-2
5th row: T40767
ValueCountFrequency (%)
w04749 3
 
0.6%
t93654 3
 
0.6%
w52998 3
 
0.6%
w46682 2
 
0.4%
u85276 2
 
0.4%
w30048 2
 
0.4%
p01591 2
 
0.4%
1 2
 
0.4%
u27808 2
 
0.4%
t43851 2
 
0.4%
Other values (449) 477
95.4%

Most occurring characters

ValueCountFrequency (%)
1 308
9.8%
5 272
8.6%
0 269
8.5%
2 269
8.5%
3 263
8.4%
- 252
 
8.0%
8 226
 
7.2%
7 222
 
7.1%
6 218
 
6.9%
9 214
 
6.8%
Other values (19) 635
20.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2467
78.4%
Uppercase Letter 419
 
13.3%
Dash Punctuation 252
 
8.0%
Lowercase Letter 10
 
0.3%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
W 155
37.0%
U 101
24.1%
T 63
15.0%
P 28
 
6.7%
C 24
 
5.7%
I 21
 
5.0%
V 12
 
2.9%
A 8
 
1.9%
D 4
 
1.0%
M 1
 
0.2%
Other values (2) 2
 
0.5%
Decimal Number
ValueCountFrequency (%)
1 308
12.5%
5 272
11.0%
0 269
10.9%
2 269
10.9%
3 263
10.7%
8 226
9.2%
7 222
9.0%
6 218
8.8%
9 214
8.7%
4 206
8.4%
Lowercase Letter
ValueCountFrequency (%)
w 3
30.0%
a 2
20.0%
u 2
20.0%
y 1
 
10.0%
n 1
 
10.0%
v 1
 
10.0%
Dash Punctuation
ValueCountFrequency (%)
- 252
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2719
86.4%
Latin 429
 
13.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
W 155
36.1%
U 101
23.5%
T 63
14.7%
P 28
 
6.5%
C 24
 
5.6%
I 21
 
4.9%
V 12
 
2.8%
A 8
 
1.9%
D 4
 
0.9%
w 3
 
0.7%
Other values (8) 10
 
2.3%
Common
ValueCountFrequency (%)
1 308
11.3%
5 272
10.0%
0 269
9.9%
2 269
9.9%
3 263
9.7%
- 252
9.3%
8 226
8.3%
7 222
8.2%
6 218
8.0%
9 214
7.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3148
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 308
9.8%
5 272
8.6%
0 269
8.5%
2 269
8.5%
3 263
8.4%
- 252
 
8.0%
8 226
 
7.2%
7 222
 
7.1%
6 218
 
6.9%
9 214
 
6.8%
Other values (19) 635
20.2%

사무소명
Text

MISSING 

Distinct: 320
Distinct (%): 91.4%
Missing: 150
Missing (%): 30.0%
Memory size: 4.0 KiB

Length

Max length: 14
Median length: 12
Mean length: 7.4428571
Min length: 2

Characters and Unicode

Total characters: 2605
Distinct characters: 240
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 295
Unique (%): 84.3%
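
The percentages and the mean length reported for 사무소명 appear to be computed over the 350 non-missing values rather than over all 500 rows. That reading is an inference from the numbers, not something the report states; the arithmetic can be checked as follows.

```python
n_rows = 500
n_missing = 150
n_present = n_rows - n_missing  # 350 non-missing values

# Counts copied from the 사무소명 summary.
distinct, unique, total_chars = 320, 295, 2605

distinct_pct = 100 * distinct / n_present
unique_pct = 100 * unique / n_present
mean_length = total_chars / n_present

print(f"Distinct (%): {distinct_pct:.1f}%")  # 91.4%
print(f"Unique (%): {unique_pct:.1f}%")      # 84.3%
print(f"Mean length: {mean_length:.7f}")     # 7.4428571
```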

Sample

1st row: 세무법인다솔
2nd row: 세무법인 세움 충무로지점
3rd row: 세무법인반석
4th row: 세무법인한맘홍성지점
5th row: 세무법인한맘홍성지점
ValueCountFrequency (%)
세무법인 25
 
6.1%
세무사 5
 
1.2%
청주지점 4
 
1.0%
세무회계 4
 
1.0%
대전지점 4
 
1.0%
한울회계법인 3
 
0.7%
역삼지점 3
 
0.7%
세무회계사무소 3
 
0.7%
역삼지사 2
 
0.5%
대신회계법인 2
 
0.5%
Other values (336) 358
86.7%

Most occurring characters

ValueCountFrequency (%)
355
 
13.6%
276
 
10.6%
159
 
6.1%
159
 
6.1%
149
 
5.7%
136
 
5.2%
135
 
5.2%
85
 
3.3%
64
 
2.5%
56
 
2.1%
Other values (230) 1031
39.6%

Most occurring categories

ValueCountFrequency (%)
Other Letter 2520
96.7%
Space Separator 64
 
2.5%
Close Punctuation 5
 
0.2%
Open Punctuation 5
 
0.2%
Uppercase Letter 4
 
0.2%
Lowercase Letter 4
 
0.2%
Other Punctuation 3
 
0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
355
 
14.1%
276
 
11.0%
159
 
6.3%
159
 
6.3%
149
 
5.9%
136
 
5.4%
135
 
5.4%
85
 
3.4%
56
 
2.2%
47
 
1.9%
Other values (217) 963
38.2%
Uppercase Letter
ValueCountFrequency (%)
D 1
25.0%
H 1
25.0%
B 1
25.0%
L 1
25.0%
Lowercase Letter
ValueCountFrequency (%)
w 1
25.0%
z 1
25.0%
i 1
25.0%
a 1
25.0%
Other Punctuation
ValueCountFrequency (%)
& 2
66.7%
. 1
33.3%
Space Separator
ValueCountFrequency (%)
64
100.0%
Close Punctuation
ValueCountFrequency (%)
) 5
100.0%
Open Punctuation
ValueCountFrequency (%)
( 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Hangul 2520
96.7%
Common 77
 
3.0%
Latin 8
 
0.3%

Most frequent character per script

Hangul
ValueCountFrequency (%)
355
 
14.1%
276
 
11.0%
159
 
6.3%
159
 
6.3%
149
 
5.9%
136
 
5.4%
135
 
5.4%
85
 
3.4%
56
 
2.2%
47
 
1.9%
Other values (217) 963
38.2%
Latin
ValueCountFrequency (%)
D 1
12.5%
H 1
12.5%
w 1
12.5%
z 1
12.5%
i 1
12.5%
B 1
12.5%
a 1
12.5%
L 1
12.5%
Common
ValueCountFrequency (%)
64
83.1%
) 5
 
6.5%
( 5
 
6.5%
& 2
 
2.6%
. 1
 
1.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 2520
96.7%
ASCII 85
 
3.3%

Most frequent character per block

Hangul
ValueCountFrequency (%)
355
 
14.1%
276
 
11.0%
159
 
6.3%
159
 
6.3%
149
 
5.9%
136
 
5.4%
135
 
5.4%
85
 
3.4%
56
 
2.2%
47
 
1.9%
Other values (217) 963
38.2%
ASCII
ValueCountFrequency (%)
64
75.3%
) 5
 
5.9%
( 5
 
5.9%
& 2
 
2.4%
D 1
 
1.2%
H 1
 
1.2%
w 1
 
1.2%
. 1
 
1.2%
z 1
 
1.2%
i 1
 
1.2%
Other values (3) 3
 
3.5%

삭제여부
Boolean

CONSTANT 

Distinct: 1
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 632.0 B
ValueCountFrequency (%)
False 500
100.0%

최종수정수
Real number (ℝ)

Distinct: 7
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.642
Minimum: 1
Maximum: 9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 1
Q3: 2
95th percentile: 4
Maximum: 9
Range: 8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.0158247
Coefficient of variation (CV): 0.61865085
Kurtosis: 7.9050754
Mean: 1.642
Median Absolute Deviation (MAD): 0
Skewness: 2.3052262
Sum: 821
Variance: 1.0318998
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=7)]
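Common values

Value | Count | Frequency (%)
1 | 300 | 60.0%
2 | 126 | 25.2%
3 | 46 | 9.2%
4 | 16 | 3.2%
5 | 8 | 1.6%
6 | 3 | 0.6%
9 | 1 | 0.2%

Extreme values (minimum)

Value | Count | Frequency (%)
1 | 300 | 60.0%
2 | 126 | 25.2%
3 | 46 | 9.2%
4 | 16 | 3.2%
5 | 8 | 1.6%
6 | 3 | 0.6%
9 | 1 | 0.2%

Extreme values (maximum)

Value | Count | Frequency (%)
9 | 1 | 0.2%
6 | 3 | 0.6%
5 | 8 | 1.6%
4 | 16 | 3.2%
3 | 46 | 9.2%
2 | 126 | 25.2%
1 | 300 | 60.0%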
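
The descriptive statistics reported for 최종수정수 can be recomputed from its seven value counts alone. The sketch below uses only the Python standard library; the reported variance matches the sample variance (n − 1 denominator), and the coefficient of variation is the standard deviation divided by the mean.

```python
import math
import statistics

# Value -> count, copied from the frequency table for 최종수정수.
freq = {1: 300, 2: 126, 3: 46, 4: 16, 5: 8, 6: 3, 9: 1}

# Expand the table back into the 500 individual observations.
values = [value for value, count in freq.items() for _ in range(count)]

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = math.sqrt(variance)
cv = std / mean

print(f"Sum: {total}")                  # 821
print(f"Mean: {mean:.3f}")              # 1.642
print(f"Variance: {variance:.7f}")
print(f"Standard deviation: {std:.7f}")
print(f"CV: {cv:.8f}")
```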

처리시각
Text

Distinct: 494
Distinct (%): 98.8%
Missing: 0
Missing (%): 0.0%
Memory size: 4.0 KiB

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 3500
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 488
Unique (%): 97.6%

Sample

1st row: 07:24.8
2nd row: 59:49.5
3rd row: 33:36.7
4th row: 56:51.7
5th row: 30:09.5
ValueCountFrequency (%)
33:04.5 2
 
0.4%
28:50.2 2
 
0.4%
16:26.5 2
 
0.4%
59:58.9 2
 
0.4%
17:31.2 2
 
0.4%
29:48.7 2
 
0.4%
32:01.2 1
 
0.2%
07:24.8 1
 
0.2%
39:48.0 1
 
0.2%
07:14.7 1
 
0.2%
Other values (484) 484
96.8%

Most occurring characters

ValueCountFrequency (%)
: 500
14.3%
. 500
14.3%
3 341
9.7%
2 325
9.3%
1 318
9.1%
0 316
9.0%
5 307
8.8%
4 285
8.1%
9 175
 
5.0%
8 159
 
4.5%
Other values (2) 274
7.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2500
71.4%
Other Punctuation 1000
 
28.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 341
13.6%
2 325
13.0%
1 318
12.7%
0 316
12.6%
5 307
12.3%
4 285
11.4%
9 175
7.0%
8 159
6.4%
7 138
5.5%
6 136
 
5.4%
Other Punctuation
ValueCountFrequency (%)
: 500
50.0%
. 500
50.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3500
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
: 500
14.3%
. 500
14.3%
3 341
9.7%
2 325
9.3%
1 318
9.1%
0 316
9.0%
5 307
8.8%
4 285
8.1%
9 175
 
5.0%
8 159
 
4.5%
Other values (2) 274
7.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3500
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
: 500
14.3%
. 500
14.3%
3 341
9.7%
2 325
9.3%
1 318
9.1%
0 316
9.0%
5 307
8.8%
4 285
8.1%
9 175
 
5.0%
8 159
 
4.5%
Other values (2) 274
7.8%
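
Every 처리시각 value is 7 characters long and built only from digits, `:` and `.`, which is consistent with a minutes:seconds.tenths fragment of a timestamp. That interpretation is an assumption; the report only profiles the column as text. Under that assumption, the values could be validated and converted as in this sketch, where `PATTERN` and `to_seconds` are illustrative names.

```python
import re

# Shape suggested by the samples: two digits, colon, two digits, period, one digit.
PATTERN = re.compile(r"(\d{2}):(\d{2})\.(\d)")


def to_seconds(value: str) -> float:
    """Convert 'MM:SS.f' to seconds, assuming the fields are minutes and seconds."""
    match = PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"unexpected format: {value!r}")
    minutes, seconds, tenths = (int(group) for group in match.groups())
    return minutes * 60 + seconds + tenths / 10


# The five sample rows shown for this column.
samples = ["07:24.8", "59:49.5", "33:36.7", "56:51.7", "30:09.5"]
converted = [to_seconds(s) for s in samples]
print(converted)
```

If the fragments really are truncated timestamps, the hour and date are lost, so these values cannot be ordered or compared across rows.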

처리직원번호
Real number (ℝ)

Distinct: 276
Distinct (%): 55.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5315.696
Minimum: 2369
Maximum: 6210
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 KiB

Quantile statistics

Minimum: 2369
5th percentile: 3619
Q1: 4905.5
Median: 5471.5
Q3: 6008
95th percentile: 6114.4
Maximum: 6210
Range: 3841
Interquartile range (IQR): 1102.5

Descriptive statistics

Standard deviation: 758.027
Coefficient of variation (CV): 0.14260165
Kurtosis: 0.34039554
Mean: 5315.696
Median Absolute Deviation (MAD): 540.5
Skewness: -0.99916157
Sum: 2657848
Variance: 574604.94
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]
Common values

Value | Count | Frequency (%)
5570 | 10 | 2.0%
6043 | 9 | 1.8%
6008 | 8 | 1.6%
4054 | 7 | 1.4%
5549 | 6 | 1.2%
6102 | 6 | 1.2%
5901 | 6 | 1.2%
5061 | 6 | 1.2%
4760 | 5 | 1.0%
5370 | 5 | 1.0%
Other values (266) | 432 | 86.4%

Extreme values (minimum)

Value | Count | Frequency (%)
2369 | 1 | 0.2%
2846 | 1 | 0.2%
3348 | 2 | 0.4%
3388 | 2 | 0.4%
3391 | 1 | 0.2%
3420 | 1 | 0.2%
3525 | 2 | 0.4%
3550 | 1 | 0.2%
3593 | 3 | 0.6%
3598 | 3 | 0.6%

Extreme values (maximum)

Value | Count | Frequency (%)
6210 | 1 | 0.2%
6207 | 3 | 0.6%
6201 | 1 | 0.2%
6189 | 1 | 0.2%
6183 | 1 | 0.2%
6171 | 2 | 0.4%
6169 | 1 | 0.2%
6166 | 1 | 0.2%
6163 | 1 | 0.2%
6156 | 2 | 0.4%

Interactions

[Figures: four interaction plots]

Correlations

Variable | 세무사회계사구분코드 | 최종수정수 | 처리직원번호
세무사회계사구분코드 | 1.000 | 0.062 | 0.000
최종수정수 | 0.062 | 1.000 | 0.014
처리직원번호 | 0.000 | 0.014 | 1.000

Variable | 최종수정수 | 처리직원번호 | 세무사회계사구분코드
최종수정수 | 1.000 | -0.028 | 0.065
처리직원번호 | -0.028 | 1.000 | 0.000
세무사회계사구분코드 | 0.065 | 0.000 | 1.000

Missing values

[Figure] A simple visualization of nullity by column.
[Figure] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

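First rows

Index | 세무사회계사ID | 세무사회계사구분코드 | 등록관리번호 | 사무소명 | 삭제여부 | 최종수정수 | 처리시각 | 처리직원번호
0 | 9bnEWvraeC | 1 | U-2861-7 | 세무법인다솔 | N | 3 | 07:24.8 | 5539
1 | 9dnSX9ueMM | 1 | W33285 | 세무법인 세움 충무로지점 | N | 2 | 59:49.5 | 6086
2 | 9dnSOEUvER | 1 | U52361 | 세무법인반석 | N | 1 | 33:36.7 | 5048
3 | 9bnEWvf1FC | 1 | U-5388-2 | <NA> | N | 3 | 56:51.7 | 4601
4 | 9dnOG0K06G | 1 | T40767 | 세무법인한맘홍성지점 | N | 1 | 30:09.5 | 5549
5 | 9dnOGXOsmv | 1 | T40767 | 세무법인한맘홍성지점 | N | 1 | 29:26.0 | 5549
6 | 9dnOCLcOZD | 1 | U24981 | 한우리세무회계 | N | 1 | 25:15.1 | 5549
7 | 9dnOCHvhc9 | 2 | U24981 | 한우리세무회계 | N | 1 | 24:20.4 | 5549
8 | 9dnOA5t5GC | 1 | W49767 | 세무법인호연강남지점 | N | 2 | 06:09.7 | 5061
9 | 9dnOxqFG3e | 1 | W09432 | 수세무회계사무소 | N | 1 | 03:51.1 | 6006

Last rows

Index | 세무사회계사ID | 세무사회계사구분코드 | 등록관리번호 | 사무소명 | 삭제여부 | 최종수정수 | 처리시각 | 처리직원번호
490 | 9dmyEddtB9 | 1 | P23701 | <NA> | N | 1 | 07:06.8 | 5868
491 | 9dmyDHNPSd | 1 | W17754 | <NA> | N | 1 | 59:22.6 | 4060
492 | 9dmyBHNhtC | 1 | U12203 | 더편한세무법인수원지점 | N | 1 | 28:50.2 | 5716
493 | 9dmyx8QIqd | 1 | W46628 | 세무회계창연 | N | 2 | 34:45.9 | 4532
494 | 9dmxtEMsVj | 1 | 12668 | 세무그룹 큰길 | N | 1 | 39:16.4 | 4431
495 | 9dmxtgNz2l | 1 | 5402 | 앙인욱 | N | 1 | 33:22.0 | 5844
496 | 9dmxsbZwZn | 1 | P71766 | 세무법인세종 마산지점 | N | 1 | 16:54.9 | 4256
497 | 9dmxqmfxna | 1 | H-4008-8 | 한재원 | N | 1 | 48:54.2 | 5344
498 | 9cUS3uFdzI | 1 | w46174 | 도영찬세무사사무실 | N | 3 | 42:17.7 | 4442
499 | 9bnEWvCOg0 | 1 | V-2306-2 | 우리세무회계사무소 | N | 3 | 03:51.9 | 5152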