Overview

Dataset statistics

Number of variables: 6
Number of observations: 54
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.7 KiB
Average record size in memory: 50.4 B

Variable types

Categorical: 3
Text: 2
DateTime: 1

Alerts

처리내용 has constant value "등록" (Constant)
시군명 is highly overall correlated with 비고 (High correlation)
비고 is highly overall correlated with 시군명 (High correlation)
비고 is highly imbalanced (77.1%) (Imbalance)
처리일자 has unique values (Unique)

Reproduction

Analysis started: 2023-12-10 21:35:24.902762
Analysis finished: 2023-12-10 21:35:25.665710
Duration: 0.76 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

시군명
Categorical

HIGH CORRELATION 

Distinct: 24
Distinct (%): 44.4%
Missing: 0
Missing (%): 0.0%
Memory size: 564.0 B
수원시: 10
성남시: 5
부천시: 4
안산시: 3
평택시: 3
Other values (19): 29

Length

Max length: 4
Median length: 3
Mean length: 3.0740741
Min length: 3

Unique

Unique: 11
Unique (%): 20.4%
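
The Distinct and Unique figures for this variable differ, so a short illustration may help. The sketch below assumes "distinct" counts the different values in the column and "unique" counts the values that occur exactly once, which is consistent with the numbers reported here. The helper name `distinct_and_unique` and the toy list are hypothetical and are not the real data.

```python
from collections import Counter

def distinct_and_unique(values):
    """Return (distinct, unique): how many different values exist,
    and how many of them occur exactly once."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique

# Toy column for illustration only (not the profiled dataset).
toy = ["수원시", "수원시", "화성시", "파주시"]
print(distinct_and_unique(toy))  # (3, 2)

# The report's percentages are relative to the 54 observations.
print(f"{24 / 54:.1%}", f"{11 / 54:.1%}")  # 44.4% 20.4%
```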

Sample

1st row: 수원시
2nd row: 수원시
3rd row: 수원시
4th row: 화성시
5th row: 화성시

Common Values

Value | Count | Frequency (%)
수원시 | 10 | 18.5%
성남시 | 5 | 9.3%
부천시 | 4 | 7.4%
안산시 | 3 | 5.6%
평택시 | 3 | 5.6%
의정부시 | 3 | 5.6%
화성시 | 3 | 5.6%
안성시 | 2 | 3.7%
군포시 | 2 | 3.7%
광주시 | 2 | 3.7%
Other values (14) | 17 | 31.5%
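
The Frequency (%) column appears to be each count divided by the 54 observations. As a rough check under that assumption, the sketch below recomputes the percentages from the counts listed above. The function name `frequency_table` is illustrative and not part of ydata-profiling.

```python
# Counts copied from the Common Values table for 시군명.
counts = {
    "수원시": 10, "성남시": 5, "부천시": 4, "안산시": 3, "평택시": 3,
    "의정부시": 3, "화성시": 3, "안성시": 2, "군포시": 2, "광주시": 2,
    "Other values (14)": 17,
}

def frequency_table(counts):
    """Return (value, count, percentage string) rows, where the
    percentage is the count's share of the total, to one decimal."""
    total = sum(counts.values())
    return [(value, c, f"{c / total:.1%}") for value, c in counts.items()]

rows = frequency_table(counts)
for value, count, pct in rows:
    print(value, count, pct)
```

The counts sum to 54, matching the number of observations in the overview.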

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
수원시 | 10 | 18.5%
성남시 | 5 | 9.3%
부천시 | 4 | 7.4%
안산시 | 3 | 5.6%
평택시 | 3 | 5.6%
의정부시 | 3 | 5.6%
화성시 | 3 | 5.6%
오산시 | 2 | 3.7%
김포시 | 2 | 3.7%
용인시 | 2 | 3.7%
Other values (14) | 17 | 31.5%

업체명
Text

Distinct: 53
Distinct (%): 98.1%
Missing: 0
Missing (%): 0.0%
Memory size: 564.0 B

Length

Max length: 13
Median length: 11
Mean length: 6.7777778
Min length: 2

Characters and Unicode

Total characters: 366
Distinct characters: 111
Distinct categories: 5
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
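
As an illustration of the character properties mentioned above, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. This is a minimal sketch and not necessarily how the profiling tool computes its tables. The function name `category_counts` is hypothetical, and the two strings are business names that appear in this report.

```python
import unicodedata
from collections import Counter

def category_counts(text):
    """Count characters by Unicode general category code,
    e.g. 'Lo' (Other Letter), 'Lu' (Uppercase Letter), 'Zs' (Space Separator)."""
    return Counter(unicodedata.category(ch) for ch in text)

# Latin capitals followed by Hangul syllables.
print(category_counts("MKM국제결혼중개업소"))

# Hangul syllables separated by spaces.
print(category_counts("베트남 스토리 국제결혼"))
```

Summing such counters over every value in a column would give a category table of the kind shown in this section.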

Unique

Unique: 52
Unique (%): 96.3%

Sample

1st row: 프라임국제결혼
2nd row: 아이에이치국제결혼
3rd row: 타오국제결혼정보
4th row: 안유엠국제결혼
5th row: 소망

Value | Count | Frequency (%)
국제결혼 | 3 | 5.0%
새봄결혼정보 | 2 | 3.3%
㈜은가비결혼정보사 | 1 | 1.7%
한국결혼문화원 | 1 | 1.7%
그랜드국제국내결혼 | 1 | 1.7%
해피러스 | 1 | 1.7%
가든국내.국제결혼 | 1 | 1.7%
아리아국제결혼 | 1 | 1.7%
투게더국제결혼 | 1 | 1.7%
유앤 | 1 | 1.7%
Other values (47) | 47 | 78.3%

Most occurring characters

ValueCountFrequency (%)
39
 
10.7%
39
 
10.7%
32
 
8.7%
29
 
7.9%
14
 
3.8%
12
 
3.3%
10
 
2.7%
9
 
2.5%
8
 
2.2%
6
 
1.6%
Other values (101) 168
45.9%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 352 | 96.2%
Space Separator | 6 | 1.6%
Uppercase Letter | 5 | 1.4%
Other Symbol | 2 | 0.5%
Other Punctuation | 1 | 0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
39
 
11.1%
39
 
11.1%
32
 
9.1%
29
 
8.2%
14
 
4.0%
12
 
3.4%
10
 
2.8%
9
 
2.6%
8
 
2.3%
6
 
1.7%
Other values (94) 154
43.8%
Uppercase Letter
Value | Count | Frequency (%)
M | 2 | 40.0%
G | 1 | 20.0%
Y | 1 | 20.0%
K | 1 | 20.0%
Space Separator
Value | Count | Frequency (%)
(space) | 6 | 100.0%
Other Symbol
ValueCountFrequency (%)
2
100.0%
Other Punctuation
Value | Count | Frequency (%)
. | 1 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 354 | 96.7%
Common | 7 | 1.9%
Latin | 5 | 1.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
39
 
11.0%
39
 
11.0%
32
 
9.0%
29
 
8.2%
14
 
4.0%
12
 
3.4%
10
 
2.8%
9
 
2.5%
8
 
2.3%
6
 
1.7%
Other values (95) 156
44.1%
Latin
Value | Count | Frequency (%)
M | 2 | 40.0%
G | 1 | 20.0%
Y | 1 | 20.0%
K | 1 | 20.0%
Common
Value | Count | Frequency (%)
(space) | 6 | 85.7%
. | 1 | 14.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 352 | 96.2%
ASCII | 12 | 3.3%
None | 2 | 0.5%

Most frequent character per block

Hangul
ValueCountFrequency (%)
39
 
11.1%
39
 
11.1%
32
 
9.1%
29
 
8.2%
14
 
4.0%
12
 
3.4%
10
 
2.8%
9
 
2.6%
8
 
2.3%
6
 
1.7%
Other values (94) 154
43.8%
ASCII
Value | Count | Frequency (%)
(space) | 6 | 50.0%
M | 2 | 16.7%
. | 1 | 8.3%
G | 1 | 8.3%
Y | 1 | 8.3%
K | 1 | 8.3%
None
ValueCountFrequency (%)
2
100.0%

소재지주소
Text

Distinct: 51
Distinct (%): 94.4%
Missing: 0
Missing (%): 0.0%
Memory size: 564.0 B

Length

Max length: 26
Median length: 22
Mean length: 18.907407
Min length: 11

Characters and Unicode

Total characters: 1021
Distinct characters: 133
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 48
Unique (%): 88.9%

Sample

1st row: 경기도 수원시 권선구 오목천로 54
2nd row: 경기도 수원시 팔달구 경수대로 425
3rd row: 경기도 수원시 팔달구 덕영대로 923-10
4th row: 경기도 화성시 동탄원천로(능동)
5th row: 경기도 화성시 마도면 청원로 104

Value | Count | Frequency (%)
경기도 | 54 | 25.0%
수원시 | 10 | 4.6%
팔달구 | 6 | 2.8%
성남시 | 5 | 2.3%
부천시 | 4 | 1.9%
안산시 | 3 | 1.4%
의정부시 | 3 | 1.4%
화성시 | 3 | 1.4%
평택시 | 3 | 1.4%
중원구 | 3 | 1.4%
Other values (105) | 122 | 56.5%

Most occurring characters

ValueCountFrequency (%)
162
 
15.9%
56
 
5.5%
56
 
5.5%
56
 
5.5%
54
 
5.3%
50
 
4.9%
40
 
3.9%
) 34
 
3.3%
( 34
 
3.3%
25
 
2.4%
Other values (123) 454
44.5%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 721 | 70.6%
Space Separator | 162 | 15.9%
Decimal Number | 67 | 6.6%
Close Punctuation | 34 | 3.3%
Open Punctuation | 34 | 3.3%
Dash Punctuation | 3 | 0.3%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
56
 
7.8%
56
 
7.8%
56
 
7.8%
54
 
7.5%
50
 
6.9%
40
 
5.5%
25
 
3.5%
19
 
2.6%
18
 
2.5%
15
 
2.1%
Other values (109) 332
46.0%
Decimal Number
Value | Count | Frequency (%)
1 | 11 | 16.4%
6 | 10 | 14.9%
2 | 9 | 13.4%
5 | 8 | 11.9%
4 | 8 | 11.9%
3 | 6 | 9.0%
9 | 5 | 7.5%
7 | 4 | 6.0%
0 | 4 | 6.0%
8 | 2 | 3.0%
Space Separator
Value | Count | Frequency (%)
(space) | 162 | 100.0%
Close Punctuation
Value | Count | Frequency (%)
) | 34 | 100.0%
Open Punctuation
Value | Count | Frequency (%)
( | 34 | 100.0%
Dash Punctuation
Value | Count | Frequency (%)
- | 3 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 721 | 70.6%
Common | 300 | 29.4%

Most frequent character per script

Hangul
ValueCountFrequency (%)
56
 
7.8%
56
 
7.8%
56
 
7.8%
54
 
7.5%
50
 
6.9%
40
 
5.5%
25
 
3.5%
19
 
2.6%
18
 
2.5%
15
 
2.1%
Other values (109) 332
46.0%
Common
Value | Count | Frequency (%)
(space) | 162 | 54.0%
) | 34 | 11.3%
( | 34 | 11.3%
1 | 11 | 3.7%
6 | 10 | 3.3%
2 | 9 | 3.0%
5 | 8 | 2.7%
4 | 8 | 2.7%
3 | 6 | 2.0%
9 | 5 | 1.7%
Other values (4) | 13 | 4.3%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 721 | 70.6%
ASCII | 300 | 29.4%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 162 | 54.0%
) | 34 | 11.3%
( | 34 | 11.3%
1 | 11 | 3.7%
6 | 10 | 3.3%
2 | 9 | 3.0%
5 | 8 | 2.7%
4 | 8 | 2.7%
3 | 6 | 2.0%
9 | 5 | 1.7%
Other values (4) | 13 | 4.3%
Hangul
ValueCountFrequency (%)
56
 
7.8%
56
 
7.8%
56
 
7.8%
54
 
7.5%
50
 
6.9%
40
 
5.5%
25
 
3.5%
19
 
2.6%
18
 
2.5%
15
 
2.1%
Other values (109) 332
46.0%

처리일자
Date

UNIQUE 

Distinct: 54
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 564.0 B
Minimum: 2008-08-13 00:00:00
Maximum: 2023-06-30 00:00:00
Histogram with fixed size bins (bins=50)

처리내용
Categorical

CONSTANT 

Distinct: 1
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 564.0 B
등록: 54

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 등록
2nd row: 등록
3rd row: 등록
4th row: 등록
5th row: 등록

Common Values

Value | Count | Frequency (%)
등록 | 54 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
등록 | 54 | 100.0%

비고
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 3.7%
Missing: 0
Missing (%): 0.0%
Memory size: 564.0 B
영업: 52
휴업: 2

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 영업
2nd row: 영업
3rd row: 영업
4th row: 영업
5th row: 영업

Common Values

Value | Count | Frequency (%)
영업 | 52 | 96.3%
휴업 | 2 | 3.7%
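
The 77.1% imbalance reported for 비고 is consistent with one minus the normalised Shannon entropy of the class counts (52 and 2). The sketch below works through that arithmetic. The formula is an assumption inferred from the numbers and is not taken from the report itself. The function name `imbalance_score` and the handling of single-class columns are illustrative.

```python
import math

def imbalance_score(counts):
    """1 - H / log2(k), where H is the Shannon entropy (base 2) of the
    class proportions and k is the number of classes.
    0.0 means perfectly balanced; values near 1.0 mean one class dominates."""
    k = len(counts)
    if k < 2:
        return 0.0  # assumption: a single-class column is scored as 0
    n = sum(counts)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

score = imbalance_score([52, 2])
print(f"{score:.1%}")  # 77.1%
```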

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
영업 | 52 | 96.3%
휴업 | 2 | 3.7%

Correlations

Variable | 시군명 | 업체명 | 소재지주소 | 처리일자 | 비고
시군명 | 1.000 | 0.848 | 1.000 | 1.000 | 0.877
업체명 | 0.848 | 1.000 | 0.990 | 1.000 | 1.000
소재지주소 | 1.000 | 0.990 | 1.000 | 1.000 | 1.000
처리일자 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
비고 | 0.877 | 1.000 | 1.000 | 1.000 | 1.000

Variable | 비고 | 시군명
비고 | 1.000 | 0.559
시군명 | 0.559 | 1.000

Variable | 시군명 | 비고
시군명 | 1.000 | 0.559
비고 | 0.559 | 1.000
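
Association between two categorical columns such as 시군명 and 비고 is commonly measured with Cramér's V. The sketch below shows the plain, uncorrected statistic on a toy contingency table. It does not reproduce the 0.559 reported above, because the full data is not included in this section, and it is not confirmed here which variant (for example, a bias-corrected one) the profiling tool applies. The function name `cramers_v` is hypothetical.

```python
import math

def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows.
    V = sqrt(chi2 / (n * (min(rows, cols) - 1))), ranging from 0 to 1."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    k = min(len(row_totals), len(col_totals)) - 1
    if n == 0 or k <= 0:
        return 0.0
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))

# Toy tables (not the profiled data): rows are one variable's classes,
# columns are the other's.
print(cramers_v([[10, 0], [0, 10]]))  # perfectly associated
print(cramers_v([[5, 5], [5, 5]]))    # independent
```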

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Row | 시군명 | 업체명 | 소재지주소 | 처리일자 | 처리내용 | 비고
0 | 수원시 | 프라임국제결혼 | 경기도 수원시 권선구 오목천로 54 | 2022-09-08 | 등록 | 영업
1 | 수원시 | 아이에이치국제결혼 | 경기도 수원시 팔달구 경수대로 425 | 2023-01-04 | 등록 | 영업
2 | 수원시 | 타오국제결혼정보 | 경기도 수원시 팔달구 덕영대로 923-10 | 2023-06-30 | 등록 | 영업
3 | 화성시 | 안유엠국제결혼 | 경기도 화성시 동탄원천로(능동) | 2021-03-23 | 등록 | 영업
4 | 화성시 | 소망 | 경기도 화성시 마도면 청원로 104 | 2022-06-21 | 등록 | 영업
5 | 성남시 | 유니코리아 | 경기도 성남시 수정구 위례서일로(창곡동) | 2019-06-12 | 등록 | 영업
6 | 부천시 | MKM국제결혼중개업소 | 경기도 부천시 연동로(옥길동) | 2023-04-07 | 등록 | 영업
7 | 안산시 | 케이국제결혼 | 경기도 안산시 단원구 새뿔길 (신길동) | 2022-07-01 | 등록 | 영업
8 | 평택시 | 제이제이국제결혼 | 경기도 평택시 고덕국제대로(고덕동) | 2022-03-16 | 등록 | 영업
9 | 파주시 | 쉘위 | 경기도 파주시 경의로1024 | 2023-05-23 | 등록 | 영업

Row | 시군명 | 업체명 | 소재지주소 | 처리일자 | 처리내용 | 비고
44 | 광명시 | 조은국제결혼 | 경기도 광명시 일직로 72(일직동) | 2023-04-10 | 등록 | 영업
45 | 군포시 | 리더스우즈벡국제결혼 | 경기도 군포시 군포로(당동) | 2019-10-25 | 등록 | 영업
46 | 양주시 | 베트남 스토리 국제결혼 | 경기도 양주시 은현면 운하로 | 2023-06-23 | 등록 | 영업
47 | 오산시 | 새봄결혼정보 | 경기도 오산시 수목원로610(세교동) | 2017-07-27 | 등록 | 영업
48 | 오산시 | 코라두리 | 경기도 오산시 외삼미로66(외삼미동) | 2023-02-14 | 등록 | 영업
49 | 이천시 | 다연국제결혼 | 경기도 이천시 부발읍 가좌로 41-2 1층 | 2018-07-20 | 등록 | 영업
50 | 안성시 | 한베결혼정보 | 경기도 안성시 미양면 안성맞춤대로 689 | 2014-01-07 | 등록 | 영업
51 | 안성시 | 한국결혼문화원 | 경기도 안성시 안성맞춤대로 763 | 2021-01-24 | 등록 | 영업
52 | 포천시 | ㈜은가비결혼정보사 | 경기도 포천시 송우로 | 2008-08-13 | 등록 | 영업
53 | 양평군 | 니드유 | 경기도 양평군 강상면 강남로 | 2018-12-10 | 등록 | 휴업