Overview

Dataset statistics

Number of variables: 6
Number of observations: 33
Missing cells: 25
Missing cells (%): 12.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 1.7 KiB
Average record size in memory: 54.0 B
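
The percentages and sizes in this block can be cross-checked with a little arithmetic. The sketch below assumes that the missing-cell percentage is taken over all cells (variables × observations) and that the total size is roughly the average record size times the number of rows; both are assumptions about how the figures were derived, not something the report states.

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 6
n_observations = 33
missing_cells = 25
avg_record_bytes = 54.0

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: total size is about (average record size) x (number of rows).
total_kib = avg_record_bytes * n_observations / 1024

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"approx. size in memory: {total_kib:.1f} KiB")
```

Under these assumptions the results agree with the reported 12.6% and 1.7 KiB.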

Variable types

Numeric: 2
Categorical: 1
Text: 3

Dataset

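Description: Data on registered hanok (traditional Korean house) stay businesses in Chungcheongbuk-do, covering location, business name, address, representative, number of rooms, and telephone number.
URL: https://www.data.go.kr/data/15029392/fileData.do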

Alerts

번호 (No.) is highly overall correlated with 시군구 (city/county) [High correlation]
시군구 (city/county) is highly overall correlated with 번호 (No.) [High correlation]
전화번호 (phone number) has 25 (75.8%) missing values [Missing]
번호 (No.) has unique values [Unique]
가옥명 (house name) has unique values [Unique]
주 소 (address) has unique values [Unique]
객실 수 (number of rooms) has 1 (3.0%) zeros [Zeros]

Reproduction

Analysis started: 2023-12-12 19:35:51.612587
Analysis finished: 2023-12-12 19:35:52.722765
Duration: 1.11 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
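
A report of this kind could be regenerated along the following lines. This is a minimal sketch, not the configuration actually used: the CSV path, the output file name, the report title, and the `cp949` encoding (common for Korean public-data CSV exports) are all assumptions, and the function name `build_report` is hypothetical. It needs the third-party `pandas` and `ydata-profiling` packages.

```python
def build_report(csv_path, html_path="report.html", encoding="cp949"):
    """Profile a CSV export of the dataset and write an HTML report.

    csv_path, html_path and encoding are illustrative assumptions.
    """
    # Imported lazily so that defining the function does not require the
    # third-party packages to be installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path, encoding=encoding)
    report = ProfileReport(df, title="Overview")
    report.to_file(html_path)
    return report
```

Calling `build_report("hanok.csv")` (a hypothetical file name) would write `report.html` next to the script.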

Variables

번호
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 33
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 17
Minimum: 1
Maximum: 33
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 429.0 B

Quantile statistics

Minimum: 1
5-th percentile: 2.6
Q1: 9
Median: 17
Q3: 25
95-th percentile: 31.4
Maximum: 33
Range: 32
Interquartile range (IQR): 16
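
The quantiles for 번호 can be reproduced with linear interpolation between order statistics. The sketch below assumes the column holds the integers 1 to 33, which is consistent with the reported minimum, maximum, distinct count, and sum but is not stated outright; the helper `percentile` is a hypothetical name written for illustration.

```python
def percentile(values, p):
    """Percentile with linear interpolation between order statistics."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * p / 100
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Assumption: 번호 is a running row number from 1 to 33.
numbers = list(range(1, 34))

summary = {p: round(percentile(numbers, p), 1) for p in (5, 25, 50, 75, 95)}
iqr = summary[75] - summary[25]
print(summary)
print("IQR:", iqr)
```

With that assumption the sketch returns 2.6, 9, 17, 25, and 31.4 for the five percentiles and an IQR of 16, matching the values listed for this variable.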

Descriptive statistics

Standard deviation: 9.6695398
Coefficient of variation (CV): 0.56879646
Kurtosis: -1.2
Mean: 17
Median Absolute Deviation (MAD): 8
Skewness: 0
Sum: 561
Variance: 93.5
Monotonicity: Strictly increasing
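
Several of these descriptive statistics follow directly from the values 1 to 33. The sketch below uses only the Python standard library and assumes, as the report's figures suggest, that 번호 is exactly that sequence and that the variance is the sample variance (divisor n - 1).

```python
import statistics

# Assumption: 번호 is a running row number from 1 to 33.
numbers = list(range(1, 34))

total = sum(numbers)
mean = statistics.mean(numbers)
variance = statistics.variance(numbers)  # sample variance, divisor n - 1
std = statistics.stdev(numbers)
cv = std / mean                          # coefficient of variation
median = statistics.median(numbers)
# Median absolute deviation: median of |x - median|.
mad = statistics.median(abs(x - median) for x in numbers)

print(total, mean, variance, round(std, 7), round(cv, 8), mad)
```

The computed sum (561), mean (17), variance (93.5), standard deviation, CV, and MAD (8) agree with the figures listed for this variable.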
Histogram with fixed size bins (bins=33)
Common values
Value | Count | Frequency (%)
1 | 1 | 3.0%
26 | 1 | 3.0%
20 | 1 | 3.0%
21 | 1 | 3.0%
22 | 1 | 3.0%
23 | 1 | 3.0%
24 | 1 | 3.0%
25 | 1 | 3.0%
27 | 1 | 3.0%
2 | 1 | 3.0%
Other values (23) | 23 | 69.7%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | 3.0%
2 | 1 | 3.0%
3 | 1 | 3.0%
4 | 1 | 3.0%
5 | 1 | 3.0%
6 | 1 | 3.0%
7 | 1 | 3.0%
8 | 1 | 3.0%
9 | 1 | 3.0%
10 | 1 | 3.0%

Maximum 10 values
Value | Count | Frequency (%)
33 | 1 | 3.0%
32 | 1 | 3.0%
31 | 1 | 3.0%
30 | 1 | 3.0%
29 | 1 | 3.0%
28 | 1 | 3.0%
27 | 1 | 3.0%
26 | 1 | 3.0%
25 | 1 | 3.0%
24 | 1 | 3.0%

시군구
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): 21.2%
Missing: 0
Missing (%): 0.0%
Memory size: 396.0 B

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 2
Unique (%): 6.1%

Sample

1st row: 청주시
2nd row: 청주시
3rd row: 청주시
4th row: 청주시
5th row: 청주시

Common Values

Value | Count | Frequency (%)
청주시 | 12 | 36.4%
보은군 | 8 | 24.2%
충주시 | 7 | 21.2%
옥천군 | 2 | 6.1%
단양군 | 2 | 6.1%
제천시 | 1 | 3.0%
증평군 | 1 | 3.0%

Length

Histogram of lengths of the category


가옥명
Text

UNIQUE 

Distinct: 33
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 396.0 B

Length

Max length: 17
Median length: 12
Mean length: 6.6666667
Min length: 3

Characters and Unicode

Total characters: 220
Distinct characters: 115
Distinct categories: 7
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 33
Unique (%): 100.0%

Sample

1st row: 고선재
2nd row: 가영당
3rd row: 근지당
4th row: 만화당
5th row: 고은당
Value | Count | Frequency (%)
가옥 | 5 | 9.4%
보은 | 2 | 3.8%
가영당 | 1 | 1.9%
btlm1960 | 1 | 1.9%
솔내음한옥집 | 1 | 1.9%
박도수 | 1 | 1.9%
은강재(임의백 | 1 | 1.9%
종택 | 1 | 1.9%
최재한 | 1 | 1.9%
최감찰댁 | 1 | 1.9%
Other values (38) | 38 | 71.7%

Most occurring characters

ValueCountFrequency (%)
20
 
9.1%
13
 
5.9%
8
 
3.6%
7
 
3.2%
6
 
2.7%
6
 
2.7%
5
 
2.3%
5
 
2.3%
5
 
2.3%
4
 
1.8%
Other values (105) 141
64.1%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 183 | 83.2%
Space Separator | 20 | 9.1%
Decimal Number | 5 | 2.3%
Uppercase Letter | 4 | 1.8%
Lowercase Letter | 4 | 1.8%
Open Punctuation | 2 | 0.9%
Close Punctuation | 2 | 0.9%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
13
 
7.1%
8
 
4.4%
7
 
3.8%
6
 
3.3%
6
 
3.3%
5
 
2.7%
5
 
2.7%
5
 
2.7%
4
 
2.2%
4
 
2.2%
Other values (89) 120
65.6%
Decimal Number
ValueCountFrequency (%)
0 1
20.0%
6 1
20.0%
9 1
20.0%
1 1
20.0%
7 1
20.0%
Uppercase Letter
ValueCountFrequency (%)
M 1
25.0%
L 1
25.0%
T 1
25.0%
B 1
25.0%
Lowercase Letter
ValueCountFrequency (%)
y 1
25.0%
a 1
25.0%
t 1
25.0%
s 1
25.0%
Space Separator
ValueCountFrequency (%)
20
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 183 | 83.2%
Common | 29 | 13.2%
Latin | 8 | 3.6%

Most frequent character per script

Hangul
ValueCountFrequency (%)
13
 
7.1%
8
 
4.4%
7
 
3.8%
6
 
3.3%
6
 
3.3%
5
 
2.7%
5
 
2.7%
5
 
2.7%
4
 
2.2%
4
 
2.2%
Other values (89) 120
65.6%
Common
ValueCountFrequency (%)
20
69.0%
( 2
 
6.9%
) 2
 
6.9%
0 1
 
3.4%
6 1
 
3.4%
9 1
 
3.4%
1 1
 
3.4%
7 1
 
3.4%
Latin
ValueCountFrequency (%)
M 1
12.5%
L 1
12.5%
y 1
12.5%
a 1
12.5%
t 1
12.5%
s 1
12.5%
T 1
12.5%
B 1
12.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 183 | 83.2%
ASCII | 37 | 16.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
20
54.1%
( 2
 
5.4%
) 2
 
5.4%
0 1
 
2.7%
6 1
 
2.7%
9 1
 
2.7%
M 1
 
2.7%
1 1
 
2.7%
L 1
 
2.7%
y 1
 
2.7%
Other values (6) 6
 
16.2%
Hangul
ValueCountFrequency (%)
13
 
7.1%
8
 
4.4%
7
 
3.8%
6
 
3.3%
6
 
3.3%
5
 
2.7%
5
 
2.7%
5
 
2.7%
4
 
2.2%
4
 
2.2%
Other values (89) 120
65.6%

객실 수
Real number (ℝ)

ZEROS 

Distinct: 11
Distinct (%): 33.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.2424242
Minimum: 0
Maximum: 13
Zeros: 1
Zeros (%): 3.0%
Negative: 0
Negative (%): 0.0%
Memory size: 429.0 B

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
Median: 4
Q3: 5
95-th percentile: 12
Maximum: 13
Range: 13
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 3.3169103
Coefficient of variation (CV): 0.78184314
Kurtosis: 1.3492674
Mean: 4.2424242
Median Absolute Deviation (MAD): 2
Skewness: 1.4033848
Sum: 140
Variance: 11.001894
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
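Common values
Value | Count | Frequency (%)
2 | 9 | 27.3%
4 | 8 | 24.2%
1 | 3 | 9.1%
3 | 3 | 9.1%
5 | 2 | 6.1%
12 | 2 | 6.1%
8 | 2 | 6.1%
6 | 1 | 3.0%
9 | 1 | 3.0%
13 | 1 | 3.0%

Minimum 10 values
Value | Count | Frequency (%)
0 | 1 | 3.0%
1 | 3 | 9.1%
2 | 9 | 27.3%
3 | 3 | 9.1%
4 | 8 | 24.2%
5 | 2 | 6.1%
6 | 1 | 3.0%
8 | 2 | 6.1%
9 | 1 | 3.0%
12 | 2 | 6.1%

Maximum 10 values
Value | Count | Frequency (%)
13 | 1 | 3.0%
12 | 2 | 6.1%
9 | 1 | 3.0%
8 | 2 | 6.1%
6 | 1 | 3.0%
5 | 2 | 6.1%
4 | 8 | 24.2%
3 | 3 | 9.1%
2 | 9 | 27.3%
1 | 3 | 9.1%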
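
Because the frequency tables for 객실 수 together list every distinct value with its count, the summary statistics can be rebuilt from them. The sketch below expands the value counts into a list and recomputes a few figures with the standard library; the dictionary is transcribed from the tables, and the sample-variance convention (divisor n - 1) is an assumption.

```python
import statistics

# Value -> count, transcribed from the frequency tables for 객실 수.
room_counts = {0: 1, 1: 3, 2: 9, 3: 3, 4: 8, 5: 2, 6: 1, 8: 2, 9: 1, 12: 2, 13: 1}

# Expand the counts back into one entry per observation.
rooms = [value for value, count in room_counts.items() for _ in range(count)]

n = len(rooms)
total = sum(rooms)
mean = total / n
median = statistics.median(rooms)
variance = statistics.variance(rooms)  # assumption: sample variance (n - 1)
zeros_pct = 100 * room_counts[0] / n

print(n, total, round(mean, 7), median, round(variance, 6), round(zeros_pct, 1))
```

The rebuilt list has 33 observations summing to 140, and the mean, median, variance, and zero share agree with the values reported for this variable.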

주 소
Text

UNIQUE 

Distinct: 33
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 396.0 B

Length

Max length: 28
Median length: 25
Mean length: 23.666667
Min length: 16

Characters and Unicode

Total characters: 781
Distinct characters: 99
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 33
Unique (%): 100.0%

Sample

1st row: 충청북도 청주시 상당구 남일면 윗고분터길 33-15
2nd row: 충청북도 청주시 청원구 오창읍 미래지로 71-54
3rd row: 충청북도 청주시 청원구 오창읍 미래지로 71-58
4th row: 충청북도 청주시 청원구 오창읍 미래지로 71-51
5th row: 충청북도 청주시 상당구 문의면 남계2길 35-11
Value | Count | Frequency (%)
충청북도 | 33 | 18.9%
청주시 | 12 | 6.9%
청원구 | 8 | 4.6%
보은군 | 8 | 4.6%
충주시 | 6 | 3.4%
오창읍 | 6 | 3.4%
미래지로 | 6 | 3.4%
거현송죽로 | 4 | 2.3%
삼승면 | 4 | 2.3%
상당구 | 3 | 1.7%
Other values (77) | 85 | 48.6%

Most occurring characters

ValueCountFrequency (%)
144
 
18.4%
53
 
6.8%
40
 
5.1%
34
 
4.4%
1 34
 
4.4%
33
 
4.2%
3 22
 
2.8%
20
 
2.6%
- 20
 
2.6%
19
 
2.4%
Other values (89) 362
46.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 492 | 63.0%
Space Separator | 144 | 18.4%
Decimal Number | 125 | 16.0%
Dash Punctuation | 20 | 2.6%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
53
 
10.8%
40
 
8.1%
34
 
6.9%
33
 
6.7%
20
 
4.1%
19
 
3.9%
19
 
3.9%
18
 
3.7%
16
 
3.3%
13
 
2.6%
Other values (77) 227
46.1%
Decimal Number
ValueCountFrequency (%)
1 34
27.2%
3 22
17.6%
7 13
 
10.4%
0 13
 
10.4%
2 11
 
8.8%
5 9
 
7.2%
6 8
 
6.4%
8 7
 
5.6%
4 5
 
4.0%
9 3
 
2.4%
Space Separator
ValueCountFrequency (%)
144
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 20
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 492 | 63.0%
Common | 289 | 37.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
53
 
10.8%
40
 
8.1%
34
 
6.9%
33
 
6.7%
20
 
4.1%
19
 
3.9%
19
 
3.9%
18
 
3.7%
16
 
3.3%
13
 
2.6%
Other values (77) 227
46.1%
Common
ValueCountFrequency (%)
144
49.8%
1 34
 
11.8%
3 22
 
7.6%
- 20
 
6.9%
7 13
 
4.5%
0 13
 
4.5%
2 11
 
3.8%
5 9
 
3.1%
6 8
 
2.8%
8 7
 
2.4%
Other values (2) 8
 
2.8%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 492 | 63.0%
ASCII | 289 | 37.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
144
49.8%
1 34
 
11.8%
3 22
 
7.6%
- 20
 
6.9%
7 13
 
4.5%
0 13
 
4.5%
2 11
 
3.8%
5 9
 
3.1%
6 8
 
2.8%
8 7
 
2.4%
Other values (2) 8
 
2.8%
Hangul
ValueCountFrequency (%)
53
 
10.8%
40
 
8.1%
34
 
6.9%
33
 
6.7%
20
 
4.1%
19
 
3.9%
19
 
3.9%
18
 
3.7%
16
 
3.3%
13
 
2.6%
Other values (77) 227
46.1%

전화번호
Text

MISSING 

Distinct: 8
Distinct (%): 100.0%
Missing: 25
Missing (%): 75.8%
Memory size: 396.0 B

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 96
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8
Unique (%): 100.0%

Sample

1st row: 043-298-0148
2nd row: 043-233-9986
3rd row: 043-201-2052
4th row: 043-731-4430
5th row: 043-730-3414
Value | Count | Frequency (%)
043-298-0148 | 1 | 12.5%
043-233-9986 | 1 | 12.5%
043-201-2052 | 1 | 12.5%
043-731-4430 | 1 | 12.5%
043-730-3414 | 1 | 12.5%
043-835-4153 | 1 | 12.5%
043-423-1345 | 1 | 12.5%
043-421-0906 | 1 | 12.5%

Most occurring characters

Value | Count | Frequency (%)
3 | 18 | 18.8%
4 | 17 | 17.7%
- | 16 | 16.7%
0 | 15 | 15.6%
2 | 7 | 7.3%
1 | 7 | 7.3%
9 | 4 | 4.2%
8 | 4 | 4.2%
5 | 4 | 4.2%
6 | 2 | 2.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 80 | 83.3%
Dash Punctuation | 16 | 16.7%
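
The character and category counts for 전화번호 can be re-derived with Python's standard `unicodedata` module, which exposes the Unicode general category of each code point. The sketch below uses the eight non-missing phone numbers listed for this variable; `Nd` is the category code for Decimal Number and `Pd` for Dash Punctuation. It illustrates the idea and is not necessarily how the profiling tool computes its tables.

```python
import unicodedata
from collections import Counter

# The eight non-missing values of 전화번호, copied from the report.
phone_numbers = [
    "043-298-0148",
    "043-233-9986",
    "043-201-2052",
    "043-731-4430",
    "043-730-3414",
    "043-835-4153",
    "043-423-1345",
    "043-421-0906",
]

text = "".join(phone_numbers)

# Count characters and their Unicode general categories.
char_counts = Counter(text)
category_counts = Counter(unicodedata.category(ch) for ch in text)

print(len(text))
print(dict(category_counts))
print(char_counts.most_common(4))
```

This yields 96 characters in total, 80 in category `Nd` and 16 in `Pd`, with `3`, `4`, `-`, and `0` as the four most frequent characters, in line with the tables for this variable.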

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 18
22.5%
4 17
21.2%
0 15
18.8%
2 7
 
8.8%
1 7
 
8.8%
9 4
 
5.0%
8 4
 
5.0%
5 4
 
5.0%
6 2
 
2.5%
7 2
 
2.5%
Dash Punctuation
ValueCountFrequency (%)
- 16
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 96 | 100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
3 18
18.8%
4 17
17.7%
- 16
16.7%
0 15
15.6%
2 7
 
7.3%
1 7
 
7.3%
9 4
 
4.2%
8 4
 
4.2%
5 4
 
4.2%
6 2
 
2.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 96 | 100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 18
18.8%
4 17
17.7%
- 16
16.7%
0 15
15.6%
2 7
 
7.3%
1 7
 
7.3%
9 4
 
4.2%
8 4
 
4.2%
5 4
 
4.2%
6 2
 
2.1%

Interactions

[Figure: four interaction plots for the numeric variables 번호 and 객실 수]

Correlations

Variable | 번호 | 시군구 | 가옥명 | 객실 수 | 주 소 | 전화번호
번호 | 1.000 | 0.781 | 1.000 | 0.451 | 1.000 | 1.000
시군구 | 0.781 | 1.000 | 1.000 | 0.126 | 1.000 | 1.000
가옥명 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
객실 수 | 0.451 | 0.126 | 1.000 | 1.000 | 1.000 | 1.000
주 소 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
전화번호 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Variable | 번호 | 객실 수 | 시군구
번호 | 1.000 | 0.256 | 0.528
객실 수 | 0.256 | 1.000 | 0.000
시군구 | 0.528 | 0.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
Index | 번호 | 시군구 | 가옥명 | 객실 수 | 주 소 | 전화번호
0 | 1 | 청주시 | 고선재 | 5 | 충청북도 청주시 상당구 남일면 윗고분터길 33-15 | 043-298-0148
1 | 2 | 청주시 | 가영당 | 1 | 충청북도 청주시 청원구 오창읍 미래지로 71-54 | 043-233-9986
2 | 3 | 청주시 | 근지당 | 2 | 충청북도 청주시 청원구 오창읍 미래지로 71-58 | <NA>
3 | 4 | 청주시 | 만화당 | 2 | 충청북도 청주시 청원구 오창읍 미래지로 71-51 | <NA>
4 | 5 | 청주시 | 고은당 | 2 | 충청북도 청주시 상당구 문의면 남계2길 35-11 | <NA>
5 | 6 | 청주시 | 남화제 | 1 | 충청북도 청주시 청원구 오창읍 미래지로 71-60 | <NA>
6 | 7 | 청주시 | 문화재생공동체 터무니 | 2 | 충청북도 청주시 상당구 수영로158번길 47 | <NA>
7 | 8 | 청주시 | 초정행궁 | 12 | 충청북도 청주시 청원구 내수읍 초정약수로 851 | 043-201-2052
8 | 9 | 청주시 | 예궁 한옥마을 | 8 | 충청북도 청주시 청원구 내수읍 학평길 14-8 | <NA>
9 | 10 | 청주시 | 다올stay | 4 | 충청북도 청주시 흥덕구 옥산면 국사오산로 102 | <NA>

Last rows
Index | 번호 | 시군구 | 가옥명 | 객실 수 | 주 소 | 전화번호
23 | 24 | 보은군 | 최동근 가옥 | 2 | 충청북도 보은군 삼승면 거현송죽로 333-14 | <NA>
24 | 25 | 보은군 | 최혁재 가옥 | 4 | 충청북도 보은군 삼승면 거현송죽로 333-8 | <NA>
25 | 26 | 보은군 | 보은 우당고택 | 4 | 충청북도 보은군 장안면 개안길 10-2 | <NA>
26 | 27 | 보은군 | 보은 선병묵고가 전통한옥 체험장 | 2 | 충청북도 보은군 장안면 개안길 60 | <NA>
27 | 28 | 보은군 | 보은서원한옥 | 2 | 충청북도 보은군 장안면 장안로 311 | <NA>
28 | 29 | 옥천군 | 아리랑가옥 | 9 | 충청북도 옥천군 옥천읍 향수3길 11 | 043-731-4430
29 | 30 | 옥천군 | 옥천전통문화체험관 | 13 | 충청북도 옥천군 옥천읍 향수길 100 | 043-730-3414
30 | 31 | 증평군 | 증평민속체험박물관 | 0 | 충청북도 증평군 증평읍 둔덕길 89 | 043-835-4153
31 | 32 | 단양군 | 조자형 가옥 | 4 | 충청북도 단양군 가곡면 여천덕천로 737 | 043-423-1345
32 | 33 | 단양군 | 단촌서원 고택펜션 | 12 | 충청북도 단양군 단성면 북상하리길 103-10 | 043-421-0906