Overview

Dataset statistics

Number of variables: 9
Number of observations: 30
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.4 KiB
Average record size in memory: 82.4 B

Variable types

Numeric: 4
Categorical: 4
Text: 1

Dataset

Description: Sample data
Author: 경기도경제과학진흥원 (Gyeonggido Business & Science Accelerator)
URL: https://bigdata-region.kr/#/dataset/6b2fa631-41dd-4320-afa2-fe012e0bea1a

Alerts

법정동코드 has constant value "4182025021" (Constant)
우편번호 has constant value "12413" (Constant)
용도지역명 has constant value "일반상업지역" (Constant)
분석인덱스 is highly overall correlated with 주용도명 (High correlation)
지역별업종수 is highly overall correlated with 일반결제금액 and 1 other field (High correlation)
일반결제금액 is highly overall correlated with 지역별업종수 and 1 other field (High correlation)
정책결제금액 is highly overall correlated with 지역별업종수 and 1 other field (High correlation)
주용도명 is highly overall correlated with 분석인덱스 (High correlation)
주용도명 is highly imbalanced (53.1%) (Imbalance)
분석인덱스 has unique values (Unique)
분석인덱스 has 1 (3.3%) zeros (Zeros)
일반결제금액 has 13 (43.3%) zeros (Zeros)
정책결제금액 has 9 (30.0%) zeros (Zeros)
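
The 53.1% imbalance figure for 주용도명 is consistent with one minus the normalized Shannon entropy of its class counts (27 and 3). The sketch below works under that assumption; the function name `imbalance_score` is hypothetical and is not taken from the profiling tool.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k) for class counts (0 = balanced, 1 = one class dominates).

    Assumed formula: it reproduces the 53.1% shown in the alert, but it is
    not confirmed to be the profiler's exact implementation.
    """
    k = len(counts)
    if k < 2:
        raise ValueError("at least two classes are needed")
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Class counts of 주용도명 taken from its Common Values table.
score = imbalance_score([27, 3])
print(f"{score:.1%}")
```

A perfectly balanced two-class column scores 0 under this definition, so the 90/10 split sits roughly halfway toward a constant column.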

Reproduction

Analysis started: 2023-12-10 14:18:03.416586
Analysis finished: 2023-12-10 14:18:06.093842
Duration: 2.68 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

분석인덱스 (analysis index)
Real number (ℝ)

HIGH CORRELATION  UNIQUE  ZEROS

Distinct: 30
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.5
Minimum: 0
Maximum: 29
Zeros: 1
Zeros (%): 3.3%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 0
5-th percentile: 1.45
Q1: 7.25
Median: 14.5
Q3: 21.75
95-th percentile: 27.55
Maximum: 29
Range: 29
Interquartile range (IQR): 14.5

Descriptive statistics

Standard deviation: 8.8034084
Coefficient of variation (CV): 0.60713162
Kurtosis: -1.2
Mean: 14.5
Median Absolute Deviation (MAD): 7.5
Skewness: 0
Sum: 435
Variance: 77.5
Monotonicity: Strictly increasing
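
The quantile and descriptive statistics for 분석인덱스 can be checked by hand. The sketch below assumes the column is simply the row counter 0 to 29 (which matches the 30 distinct values, minimum 0, maximum 29, and strictly increasing order) and assumes linearly interpolated quantiles and the sample (n - 1) variance. The helper `linear_quantile` is hypothetical.

```python
import math
import statistics


def linear_quantile(sorted_values, q):
    """Quantile with linear interpolation between the two nearest ranks."""
    position = (len(sorted_values) - 1) * q
    lower = math.floor(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


# Assumption: the column holds the integers 0..29 in order.
values = list(range(30))

mean = statistics.mean(values)
variance = statistics.variance(values)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(values)
cv = std_dev / mean                     # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)
q1 = linear_quantile(values, 0.25)
q3 = linear_quantile(values, 0.75)

print(mean, variance, round(std_dev, 7), round(cv, 8))
print(q1, median, q3, q3 - q1, mad)
```

Under these assumptions the computed values agree with the figures listed for this variable, which suggests the report uses the sample variance rather than the population variance.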

Histogram with fixed size bins (bins=30)

Common values
Value              Count  Frequency (%)
0                  1      3.3%
16                 1      3.3%
29                 1      3.3%
28                 1      3.3%
27                 1      3.3%
26                 1      3.3%
25                 1      3.3%
24                 1      3.3%
23                 1      3.3%
22                 1      3.3%
Other values (20)  20     66.7%

Minimum 10 values
Value  Count  Frequency (%)
0      1      3.3%
1      1      3.3%
2      1      3.3%
3      1      3.3%
4      1      3.3%
5      1      3.3%
6      1      3.3%
7      1      3.3%
8      1      3.3%
9      1      3.3%

Maximum 10 values
Value  Count  Frequency (%)
29     1      3.3%
28     1      3.3%
27     1      3.3%
26     1      3.3%
25     1      3.3%
24     1      3.3%
23     1      3.3%
22     1      3.3%
21     1      3.3%
20     1      3.3%

법정동코드 (legal-dong code)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Value       Count
4182025021  30

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 4182025021
2nd row: 4182025021
3rd row: 4182025021
4th row: 4182025021
5th row: 4182025021

Common Values

Value       Count  Frequency (%)
4182025021  30     100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: value counts, 4182025021: 30 (100.0%)]

우편번호 (postal code)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Value  Count
12413  30

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 12413
2nd row: 12413
3rd row: 12413
4th row: 12413
5th row: 12413

Common Values

Value  Count  Frequency (%)
12413  30     100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: value counts, 12413: 30 (100.0%)]

용도지역명 (zoning district name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Value         Count
일반상업지역  30

Length

Max length: 6
Median length: 6
Mean length: 6
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 일반상업지역
2nd row: 일반상업지역
3rd row: 일반상업지역
4th row: 일반상업지역
5th row: 일반상업지역

Common Values

Value         Count  Frequency (%)
일반상업지역  30     100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: value counts, 일반상업지역: 30 (100.0%)]

주용도명 (main use name)
Categorical

HIGH CORRELATION  IMBALANCE

Distinct: 2
Distinct (%): 6.7%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Value              Count
제1종근린생활시설  27
제2종근린생활시설  3

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 제1종근린생활시설
2nd row: 제1종근린생활시설
3rd row: 제1종근린생활시설
4th row: 제1종근린생활시설
5th row: 제1종근린생활시설

Common Values

Value              Count  Frequency (%)
제1종근린생활시설  27     90.0%
제2종근린생활시설  3      10.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: value counts, 제1종근린생활시설: 27 (90.0%), 제2종근린생활시설: 3 (10.0%)]

가맹점업종명 (merchant business category name)
Text

Distinct: 27
Distinct (%): 90.0%
Missing: 0
Missing (%): 0.0%
Memory size: 372.0 B

Length

Max length: 8
Median length: 6
Mean length: 4
Min length: 2

Characters and Unicode

Total characters: 120
Distinct characters: 72
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 24
Unique (%): 80.0%

Sample

1st row: 건강식품
2nd row: 건축자재
3rd row: 기타
4th row: 기타의료기관
5th row: 레저업소

Value              Count  Frequency (%)
건강식품           2      6.1%
기타               2      6.1%
건축자재           2      6.1%
연료판매점         1      3.0%
영리               1      3.0%
유통업             1      3.0%
학원               1      3.0%
직물               1      3.0%
주방용구           1      3.0%
전기제품           1      3.0%
Other values (20)  20     60.6%

Most occurring characters

Value              Count  Frequency (%)
[?]                5      4.2%
[?]                5      4.2%
[?]                4      3.3%
[?]                4      3.3%
[?]                4      3.3%
[?]                3      2.5%
[?]                3      2.5%
[?]                3      2.5%
[?]                3      2.5%
[?]                3      2.5%
Other values (62)  83     69.2%

Most occurring categories

Value              Count  Frequency (%)
Other Letter       116    96.7%
Space Separator    3      2.5%
Other Punctuation  1      0.8%

Most frequent character per category

Other Letter
Value              Count  Frequency (%)
[?]                5      4.3%
[?]                5      4.3%
[?]                4      3.4%
[?]                4      3.4%
[?]                4      3.4%
[?]                3      2.6%
[?]                3      2.6%
[?]                3      2.6%
[?]                3      2.6%
[?]                3      2.6%
Other values (60)  79     68.1%

Space Separator
Value    Count  Frequency (%)
(space)  3      100.0%

Other Punctuation
Value  Count  Frequency (%)
.      1      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Hangul  116    96.7%
Common  4      3.3%

Most frequent character per script

Hangul
Value              Count  Frequency (%)
[?]                5      4.3%
[?]                5      4.3%
[?]                4      3.4%
[?]                4      3.4%
[?]                4      3.4%
[?]                3      2.6%
[?]                3      2.6%
[?]                3      2.6%
[?]                3      2.6%
[?]                3      2.6%
Other values (60)  79     68.1%

Common
Value    Count  Frequency (%)
(space)  3      75.0%
.        1      25.0%

Most occurring blocks

Value   Count  Frequency (%)
Hangul  116    96.7%
ASCII   4      3.3%

Most frequent character per block

Hangul
Value              Count  Frequency (%)
[?]                5      4.3%
[?]                5      4.3%
[?]                4      3.4%
[?]                4      3.4%
[?]                4      3.4%
[?]                3      2.6%
[?]                3      2.6%
[?]                3      2.6%
[?]                3      2.6%
[?]                3      2.6%
Other values (60)  79     68.1%

ASCII
Value    Count  Frequency (%)
(space)  3      75.0%
.        1      25.0%

지역별업종수 (business-type count by region)
Real number (ℝ)

HIGH CORRELATION

Distinct: 19
Distinct (%): 63.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2221.6
Minimum: 90
Maximum: 24079
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 90
5-th percentile: 338
Q1: 601
Median: 1014
Q3: 1698
95-th percentile: 7180.1
Maximum: 24079
Range: 23989
Interquartile range (IQR): 1097

Descriptive statistics

Standard deviation: 4492.5703
Coefficient of variation (CV): 2.0222229
Kurtosis: 20.687961
Mean: 2221.6
Median Absolute Deviation (MAD): 646
Skewness: 4.3640754
Sum: 66648
Variance: 20183188
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=19)

Common values
Value             Count  Frequency (%)
676               6      20.0%
338               5      16.7%
1352              2      6.7%
1014              2      6.7%
2090              1      3.3%
1074              1      3.3%
1412              1      3.3%
90                1      3.3%
1692              1      3.3%
24079             1      3.3%
Other values (9)  9      30.0%

Minimum 10 values
Value  Count  Frequency (%)
90     1      3.3%
338    5      16.7%
398    1      3.3%
576    1      3.3%
676    6      20.0%
1014   2      6.7%
1074   1      3.3%
1352   2      6.7%
1412   1      3.3%
1463   1      3.3%

Maximum 10 values
Value  Count  Frequency (%)
24079  1      3.3%
9134   1      3.3%
4792   1      3.3%
3872   1      3.3%
2090   1      3.3%
2056   1      3.3%
1742   1      3.3%
1700   1      3.3%
1692   1      3.3%
1463   1      3.3%
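
Taken together, the frequency tables for 지역별업종수 list all 19 distinct values, so the full 30-value column can be rebuilt and the summary figures cross-checked. This is a minimal sketch: the dictionary is transcribed from the tables, and the "inclusive" quantile method is an assumption chosen because it reproduces the reported quartiles.

```python
import statistics

# Value -> count, transcribed from the frequency tables for 지역별업종수.
frequency = {
    676: 6, 338: 5, 1352: 2, 1014: 2,
    90: 1, 398: 1, 576: 1, 1074: 1, 1412: 1, 1463: 1, 1692: 1, 1700: 1,
    1742: 1, 2056: 1, 2090: 1, 3872: 1, 4792: 1, 9134: 1, 24079: 1,
}

# Expand the counts back into a sorted list of observations.
values = sorted(v for v, count in frequency.items() for _ in range(count))

total = sum(values)
mean = total / len(values)
# Assumed method: linear interpolation over the closed range of the data.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")

print(len(values), len(frequency))
print(total, mean)
print(q1, median, q3, q3 - q1)
```

The single value 24079 is more than ten times the mean, which is consistent with the large skewness and kurtosis reported for this variable.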

일반결제금액 (general payment amount)
Real number (ℝ)

HIGH CORRELATION  ZEROS

Distinct: 15
Distinct (%): 50.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1364006.3
Minimum: 0
Maximum: 23013700
Zeros: 13
Zeros (%): 43.3%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 25000
Q3: 472000
95-th percentile: 4207197
Maximum: 23013700
Range: 23013700
Interquartile range (IQR): 472000

Descriptive statistics

Standard deviation: 4264918.2
Coefficient of variation (CV): 3.1267584
Kurtosis: 24.855274
Mean: 1364006.3
Median Absolute Deviation (MAD): 25000
Skewness: 4.8326798
Sum: 40920190
Variance: 1.8189527 × 10^13
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=15)

Common values
Value             Count  Frequency (%)
0                 13     43.3%
280000            2      6.7%
30000             2      6.7%
20000             2      6.7%
69030             1      3.3%
937200            1      3.3%
1577000           1      3.3%
2987200           1      3.3%
4203710           1      3.3%
4210050           1      3.3%
Other values (5)  5      16.7%

Minimum 10 values
Value    Count  Frequency (%)
0        13     43.3%
20000    2      6.7%
30000    2      6.7%
69030    1      3.3%
158000   1      3.3%
230000   1      3.3%
280000   2      6.7%
536000   1      3.3%
937200   1      3.3%
1577000  1      3.3%

Maximum 10 values
Value     Count  Frequency (%)
23013700  1      3.3%
4210050   1      3.3%
4203710   1      3.3%
2987200   1      3.3%
2338300   1      3.3%
1577000   1      3.3%
937200    1      3.3%
536000    1      3.3%
280000    2      6.7%
230000    1      3.3%

정책결제금액 (policy payment amount)
Real number (ℝ)

HIGH CORRELATION  ZEROS

Distinct: 19
Distinct (%): 63.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4507860.7
Minimum: 0
Maximum: 68985530
Zeros: 9
Zeros (%): 30.0%
Negative: 0
Negative (%): 0.0%
Memory size: 402.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 80400
Q3: 1806000
95-th percentile: 16903600
Maximum: 68985530
Range: 68985530
Interquartile range (IQR): 1806000

Descriptive statistics

Standard deviation: 13028324
Coefficient of variation (CV): 2.8901347
Kurtosis: 22.19194
Mean: 4507860.7
Median Absolute Deviation (MAD): 80400
Skewness: 4.5129892
Sum: 1.3523582 × 10^8
Variance: 1.6973724 × 10^14
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=19)

Common values
Value             Count  Frequency (%)
0                 9      30.0%
40000             2      6.7%
108400            2      6.7%
75800             2      6.7%
16519090          1      3.3%
470000            1      3.3%
144600            1      3.3%
1811000           1      3.3%
68985530          1      3.3%
6902800           1      3.3%
Other values (9)  9      30.0%

Minimum 10 values
Value    Count  Frequency (%)
0        9      30.0%
10000    1      3.3%
19210    1      3.3%
40000    2      6.7%
75800    2      6.7%
85000    1      3.3%
108400   2      6.7%
144600   1      3.3%
470000   1      3.3%
1161700  1      3.3%

Maximum 10 values
Value     Count  Frequency (%)
68985530  1      3.3%
17218200  1      3.3%
16519090  1      3.3%
8451450   1      3.3%
8210240   1      3.3%
6902800   1      3.3%
3007600   1      3.3%
1811000   1      3.3%
1791000   1      3.3%
1161700   1      3.3%

Interactions

[Figures: 16 interaction plots]

Correlations

              분석인덱스  주용도명  가맹점업종명  지역별업종수  일반결제금액  정책결제금액
분석인덱스    1.000       1.000     0.771         0.000         0.482         0.586
주용도명      1.000       1.000     0.000         0.000         0.000         0.000
가맹점업종명  0.771       0.000     1.000         1.000         1.000         1.000
지역별업종수  0.000       0.000     1.000         1.000         0.672         0.940
일반결제금액  0.482       0.000     1.000         0.672         1.000         0.871
정책결제금액  0.586       0.000     1.000         0.940         0.871         1.000

              분석인덱스  지역별업종수  일반결제금액  정책결제금액  주용도명
분석인덱스    1.000       0.001         -0.028        -0.000        0.845
지역별업종수  0.001       1.000         0.666         0.755         0.000
일반결제금액  -0.028      0.666         1.000         0.831         0.000
정책결제금액  -0.000      0.755         0.831         1.000         0.000
주용도명      0.845       0.000         0.000         0.000         1.000
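
The High correlation alerts reported for this dataset line up with the pairs in the five-column matrix whose absolute coefficient is at least 0.5. The sketch below flags those pairs; the 0.5 cutoff and the helper name `high_correlation_pairs` are assumptions made for illustration, not settings read from the report's configuration.

```python
from itertools import combinations

columns = ["분석인덱스", "지역별업종수", "일반결제금액", "정책결제금액", "주용도명"]
matrix = [
    [1.000, 0.001, -0.028, -0.000, 0.845],
    [0.001, 1.000, 0.666, 0.755, 0.000],
    [-0.028, 0.666, 1.000, 0.831, 0.000],
    [-0.000, 0.755, 0.831, 1.000, 0.000],
    [0.845, 0.000, 0.000, 0.000, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, not confirmed by the report


def high_correlation_pairs(names, coefficients, threshold):
    """Return (name_a, name_b, coefficient) for each pair at or above the threshold."""
    pairs = []
    for i, j in combinations(range(len(names)), 2):
        if abs(coefficients[i][j]) >= threshold:
            pairs.append((names[i], names[j], coefficients[i][j]))
    return pairs


flagged = high_correlation_pairs(columns, matrix, THRESHOLD)
for name_a, name_b, coefficient in flagged:
    print(f"{name_a} <-> {name_b}: {coefficient:.3f}")
```

With this cutoff, four pairs are flagged, and every variable carrying a High correlation alert appears in at least one of them.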

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

Row  분석인덱스  법정동코드  우편번호  용도지역명    주용도명           가맹점업종명    지역별업종수  일반결제금액  정책결제금액
0    0           4182025021  12413     일반상업지역  제1종근린생활시설  건강식품        1352          280000        40000
1    1           4182025021  12413     일반상업지역  제1종근린생활시설  건축자재        676           30000         75800
2    2           4182025021  12413     일반상업지역  제1종근린생활시설  기타            676           20000         108400
3    3           4182025021  12413     일반상업지역  제1종근린생활시설  기타의료기관    338           69030         19210
4    4           4182025021  12413     일반상업지역  제1종근린생활시설  레저업소        1700          937200        3007600
5    5           4182025021  12413     일반상업지역  제1종근린생활시설  문화.취미       1014          0             0
6    6           4182025021  12413     일반상업지역  제1종근린생활시설  보건위생        9134          1577000       8451450
7    7           4182025021  12413     일반상업지역  제1종근린생활시설  사무통신        576           0             1161700
8    8           4182025021  12413     일반상업지역  제1종근린생활시설  수리서비스      676           0             10000
9    9           4182025021  12413     일반상업지역  제1종근린생활시설  숙박업          338           0             0

Last rows

Row  분석인덱스  법정동코드  우편번호  용도지역명    주용도명           가맹점업종명     지역별업종수  일반결제금액  정책결제금액
20   20          4182025021  12413     일반상업지역  제1종근린생활시설  자동차정비 유지  1692          536000        1811000
21   21          4182025021  12413     일반상업지역  제1종근린생활시설  자동차판매       338           0             0
22   22          4182025021  12413     일반상업지역  제1종근린생활시설  전기제품         90            0             0
23   23          4182025021  12413     일반상업지역  제1종근린생활시설  주방용구         338           0             0
24   24          4182025021  12413     일반상업지역  제1종근린생활시설  직물             1412          0             144600
25   25          4182025021  12413     일반상업지역  제1종근린생활시설  학원             1074          0             0
26   26          4182025021  12413     일반상업지역  제1종근린생활시설  회원제형태       676           230000        470000
27   27          4182025021  12413     일반상업지역  제2종근린생활시설  건강식품         1352          280000        40000
28   28          4182025021  12413     일반상업지역  제2종근린생활시설  건축자재         676           30000         75800
29   29          4182025021  12413     일반상업지역  제2종근린생활시설  기타             676           20000         108400