Overview

Dataset statistics

Number of variables: 5
Number of observations: 191
Missing cells: 16
Missing cells (%): 1.7%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 8.1 KiB
Average record size in memory: 43.7 B

Variable types

Numeric: 2
Text: 1
Categorical: 2

Dataset

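Description: Information on the shops in the underground shopping arcade in front of Daejeon Station (Jungang-ro Underground 200, Dong-gu), operated by the Daejeon Metropolitan City Facilities Management Corporation. Columns: serial number (일렬번호), shop name (상가이름), shop type (상가유형), detailed shop type (상가유형_상세), phone number (전화번호).
Author: 대전광역시시설관리공단 (Daejeon Metropolitan City Facilities Management Corporation)
URL: https://www.data.go.kr/data/15123936/fileData.do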

Alerts

상가유형 is highly imbalanced (74.7%) [Imbalance]
전화번호 has 16 (8.4%) missing values [Missing]
일렬번호 has unique values [Unique]
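
The percentages in these alerts can be re-derived from counts reported elsewhere in this report. The sketch below is a reconstruction under an assumption, not the profiler's own code: it takes the imbalance score to be one minus the Shannon entropy of the class counts divided by log2 of the number of classes, which happens to reproduce the 74.7% shown above. The helper names `imbalance_score` and `missing_pct` are hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k) for k class counts.

    0.0 means the classes are perfectly balanced; values near 1.0 mean
    one class dominates. Assumed formula, not taken from the report.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # convention chosen here for a single class
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

def missing_pct(n_missing, n_total):
    """Share of missing entries, in percent."""
    return 100.0 * n_missing / n_total

# Counts of the six 상가유형 values as listed in this report
# (the literal "<NA>" string counts as its own category).
shop_type_counts = [173, 5, 5, 4, 3, 1]

imbalance = round(100 * imbalance_score(shop_type_counts), 1)
phone_missing = round(missing_pct(16, 191), 1)      # 전화번호 column
cells_missing = round(missing_pct(16, 191 * 5), 1)  # all cells in the table
print(imbalance, phone_missing, cells_missing)
```

The same 16 missing phone numbers account for both the 8.4% column-level figure and the 1.7% dataset-level figure; only the denominator changes.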

Reproduction

Analysis started: 2023-12-12 18:03:37.518561
Analysis finished: 2023-12-12 18:03:38.268870
Duration: 0.75 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

일렬번호
Real number (ℝ)

UNIQUE 

Distinct: 191
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2809.8796
Minimum: 2706
Maximum: 2923
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.8 KiB

Quantile statistics

Minimum: 2706
5-th percentile: 2715.5
Q1: 2757.5
Median: 2807
Q3: 2862.5
95-th percentile: 2907.5
Maximum: 2923
Range: 217
Interquartile range (IQR): 105

Descriptive statistics

Standard deviation: 61.694333
Coefficient of variation (CV): 0.021956219
Kurtosis: -1.1603993
Mean: 2809.8796
Median Absolute Deviation (MAD): 53
Skewness: 0.082675142
Sum: 536687
Variance: 3806.1907
Monotonicity: Strictly increasing
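
Several of the figures above are simple functions of one another, so they can be cross-checked by hand. The sketch below recomputes the range, IQR, mean, coefficient of variation and variance for 일렬번호 from the other reported values. The variable names are illustrative, and the inputs are the rounded numbers printed in this report, so agreement is only expected up to rounding.

```python
import math

# Values copied from the 일렬번호 statistics in this report.
n = 191
minimum, maximum = 2706, 2923
q1, q3 = 2757.5, 2862.5
total = 536687
std = 61.694333

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # IQR = Q3 - Q1
mean = total / n                  # Mean = Sum / number of observations
cv = std / mean                   # CV = standard deviation / mean
variance = std ** 2               # Variance = standard deviation squared

print(value_range, iqr, round(mean, 4), round(cv, 9), round(variance, 4))
```

A CV of about 0.022 reflects that the serial numbers occupy a narrow band (2706 to 2923) relative to their magnitude.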
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
2706                1      0.5%
2707                1      0.5%
2839                1      0.5%
2840                1      0.5%
2841                1      0.5%
2842                1      0.5%
2843                1      0.5%
2844                1      0.5%
2845                1      0.5%
2846                1      0.5%
Other values (181)  181    94.8%
Minimum 10 values
Value  Count  Frequency (%)
2706   1      0.5%
2707   1      0.5%
2708   1      0.5%
2709   1      0.5%
2710   1      0.5%
2711   1      0.5%
2712   1      0.5%
2713   1      0.5%
2714   1      0.5%
2715   1      0.5%
Maximum 10 values
Value  Count  Frequency (%)
2923   1      0.5%
2922   1      0.5%
2921   1      0.5%
2920   1      0.5%
2919   1      0.5%
2918   1      0.5%
2911   1      0.5%
2910   1      0.5%
2909   1      0.5%
2908   1      0.5%

상가이름
Text

Distinct: 101
Distinct (%): 52.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 KiB

Length

Max length: 9
Median length: 8
Mean length: 3.8848168
Min length: 2

Characters and Unicode

Total characters: 742
Distinct characters: 194
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 39
Unique (%): 20.4%

Sample

1st row: 몽실
2nd row: 몽실
3rd row: 흙비
4th row: 흙비
5th row: 흙비
Value              Count  Frequency (%)
큰별통신           6      3.1%
공실               6      3.1%
자방모드           4      2.1%
연예인             4      2.1%
올포유             4      2.1%
여성크로커         4      2.1%
천보당안경콘택트   4      2.1%
밤블비             3      1.5%
올리브정           3      1.5%
청코너             3      1.5%
Other values (93)  154    79.0%

Most occurring characters

Value               Count  Frequency (%)
                    18     2.4%
                    18     2.4%
                    17     2.3%
                    17     2.3%
                    12     1.6%
                    12     1.6%
                    11     1.5%
                    11     1.5%
                    11     1.5%
                    10     1.3%
Other values (184)  605    81.5%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       729    98.2%
Space Separator    4      0.5%
Decimal Number     4      0.5%
Open Punctuation   2      0.3%
Close Punctuation  2      0.3%
Math Symbol        1      0.1%

Most frequent character per category

Other Letter
Value               Count  Frequency (%)
                    18     2.5%
                    18     2.5%
                    17     2.3%
                    17     2.3%
                    12     1.6%
                    12     1.6%
                    11     1.5%
                    11     1.5%
                    11     1.5%
                    10     1.4%
Other values (178)  592    81.2%
Decimal Number
Value  Count  Frequency (%)
2      2      50.0%
0      2      50.0%
Space Separator
Value    Count  Frequency (%)
(space)  4      100.0%
Open Punctuation
Value  Count  Frequency (%)
(      2      100.0%
Close Punctuation
Value  Count  Frequency (%)
)      2      100.0%
Math Symbol
Value  Count  Frequency (%)
~      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  729    98.2%
Common  13     1.8%

Most frequent character per script

Hangul
Value               Count  Frequency (%)
                    18     2.5%
                    18     2.5%
                    17     2.3%
                    17     2.3%
                    12     1.6%
                    12     1.6%
                    11     1.5%
                    11     1.5%
                    11     1.5%
                    10     1.4%
Other values (178)  592    81.2%
Common
Value    Count  Frequency (%)
(space)  4      30.8%
2        2      15.4%
0        2      15.4%
(        2      15.4%
)        2      15.4%
~        1      7.7%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  729    98.2%
ASCII   13     1.8%

Most frequent character per block

Hangul
Value               Count  Frequency (%)
                    18     2.5%
                    18     2.5%
                    17     2.3%
                    17     2.3%
                    12     1.6%
                    12     1.6%
                    11     1.5%
                    11     1.5%
                    11     1.5%
                    10     1.4%
Other values (178)  592    81.2%
ASCII
Value    Count  Frequency (%)
(space)  4      30.8%
2        2      15.4%
0        2      15.4%
(        2      15.4%
)        2      15.4%
~        1      7.7%

상가유형
Categorical

IMBALANCE 

Distinct: 6
Distinct (%): 3.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 KiB

Length

Max length: 4
Median length: 1
Mean length: 1.078534
Min length: 1

Unique

Unique: 1
Unique (%): 0.5%

Sample

1st row: 3
2nd row: 3
3rd row: 3
4th row: 3
5th row: 3

Common Values

Value  Count  Frequency (%)
3      173    90.6%
2      5      2.6%
<NA>   5      2.6%
4      4      2.1%
5      3      1.6%
1      1      0.5%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
3      173    90.6%
2      5      2.6%
na     5      2.6%
4      4      2.1%
5      3      1.6%
1      1      0.5%

상가유형_상세
Categorical

Distinct: 30
Distinct (%): 15.7%
Missing: 0
Missing (%): 0.0%
Memory size: 1.6 KiB

Length

Max length: 4
Median length: 2
Mean length: 1.9371728
Min length: 1

Unique

Unique: 10
Unique (%): 5.2%

Sample

1st row: 20
2nd row: 20
3rd row: 20
4th row: 20
5th row: 20

Common Values

Value              Count  Frequency (%)
20                 99     51.8%
35                 10     5.2%
5                  9      4.7%
29                 9      4.7%
3                  7      3.7%
23                 6      3.1%
<NA>               5      2.6%
10                 4      2.1%
22                 4      2.1%
14                 3      1.6%
Other values (20)  35     18.3%

Length

Histogram of lengths of the category
Value              Count  Frequency (%)
20                 99     51.8%
35                 10     5.2%
5                  9      4.7%
29                 9      4.7%
3                  7      3.7%
23                 6      3.1%
na                 5      2.6%
22                 4      2.1%
10                 4      2.1%
14                 3      1.6%
Other values (20)  35     18.3%

전화번호
Real number (ℝ)

MISSING 

Distinct: 86
Distinct (%): 49.1%
Missing: 16
Missing (%): 8.4%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.2249612 × 10^8
Minimum: 4.2221089 × 10^8
Maximum: 4.2863769 × 10^8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.8 KiB

Quantile statistics

Minimum: 4.2221089 × 10^8
5-th percentile: 4.2221789 × 10^8
Q1: 4.2224448 × 10^8
Median: 4.2252486 × 10^8
Q3: 4.2255517 × 10^8
95-th percentile: 4.2257906 × 10^8
Maximum: 4.2863769 × 10^8
Range: 6426793
Interquartile range (IQR): 310688

Descriptive statistics

Standard deviation: 581709.71
Coefficient of variation (CV): 0.0013768403
Kurtosis: 77.702608
Mean: 4.2249612 × 10^8
Median Absolute Deviation (MAD): 47274
Skewness: 8.1716015
Sum: 7.3936822 × 10^10
Variance: 3.3838618 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value              Count  Frequency (%)
422233777          6      3.1%
422531904          6      3.1%
422269867          4      2.1%
422579064          4      2.1%
422565454          4      2.1%
422228525          4      2.1%
422522149          4      2.1%
422212942          3      1.6%
422225636          3      1.6%
422540717          3      1.6%
Other values (76)  134    70.2%
(Missing)          16     8.4%
Minimum 10 values
Value      Count  Frequency (%)
422210893  2      1.0%
422211473  2      1.0%
422212646  2      1.0%
422212942  3      1.6%
422220016  3      1.6%
422221222  3      1.6%
422223837  2      1.0%
422225573  1      0.5%
422225636  3      1.6%
422227557  2      1.0%
Maximum 10 values
Value      Count  Frequency (%)
428637686  1      0.5%
425343138  2      1.0%
422742006  2      1.0%
422579255  3      1.6%
422579064  4      2.1%
422576887  2      1.0%
422572941  2      1.0%
422572131  2      1.0%
422571754  3      1.6%
422571753  2      1.0%

Correlations

               일렬번호  상가유형  상가유형_상세  전화번호
일렬번호       1.000     0.397     0.807          0.334
상가유형       0.397     1.000     0.766          0.540
상가유형_상세  0.807     0.766     1.000          0.000
전화번호       0.334     0.540     0.000          1.000

               상가유형_상세  상가유형
상가유형_상세  1.000          0.457
상가유형       0.457          1.000

               일렬번호  전화번호  상가유형  상가유형_상세
일렬번호       1.000     -0.106    0.173     0.418
전화번호       -0.106    1.000     0.478     0.000
상가유형       0.173     0.478     1.000     0.457
상가유형_상세  0.418     0.000     0.457     1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows
Index  일렬번호  상가이름      상가유형  상가유형_상세  전화번호
0      2706      몽실          3         20             422210893
1      2707      몽실          3         20             422210893
2      2708      흙비          3         20             422225636
3      2709      흙비          3         20             422225636
4      2710      흙비          3         20             422225636
5      2711      해풍사        3         16             422562039
6      2712      보라          3         20             422521601
7      2713      보라          3         20             422521601
8      2714      탐나라        3         10             422524857
9      2715      탐나라        3         10             422524857

Last rows
Index  일렬번호  상가이름      상가유형  상가유형_상세  전화번호
181    2908      큰별통신      3         29             422233777
182    2909      큰별통신      3         29             422233777
183    2910      큰별통신      3         29             422233777
184    2911      큰별통신      3         29             422233777
185    2918      은성사        3         2              422564790
186    2919      그린월렛      3         35             <NA>
187    2920      미성건강카페  5         35             422249115
188    2921      종료          <NA>      <NA>           <NA>
189    2922      명품가발      3         35             422266667
190    2923      공예협동조합  4         35             428637686
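
The profiler treats 전화번호 as a real number, so every value in the sample appears as a nine-digit integer such as 422210893, and statistics like the mean or variance carry little meaning for it. One plausible reading, offered as an assumption rather than something the source states, is that the leading zero of the Daejeon area code 042 was dropped when the column was parsed as numeric. The sketch below restores a readable string under that assumption; `format_phone` is a hypothetical helper, and the 3-3-4 digit grouping is assumed as well.

```python
def format_phone(value):
    """Turn a numerically parsed phone value back into 'AAA-BBB-CCCC'.

    Assumes the only damage is a dropped leading zero, so the value is
    left-padded to 10 digits. Returns None for missing values.
    """
    # NaN is the only float that is not equal to itself.
    if value is None or value != value:
        return None
    digits = str(int(value)).zfill(10)
    if len(digits) != 10:
        raise ValueError(f"unexpected phone number length: {digits!r}")
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"

# Values taken from the sample rows; the missing entry is modelled as NaN.
for raw in (422210893, 428637686.0, float("nan")):
    print(raw, "->", format_phone(raw))
```

If the source file is reloaded, reading this column as text (for example with `dtype=str` in `pandas.read_csv`) would likely avoid the numeric interpretation in the first place.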