Overview

Dataset statistics

Number of variables: 6
Number of observations: 4422
Missing cells: 57
Missing cells (%): 0.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 224.7 KiB
Average record size in memory: 52.0 B
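
The percentage and per-record figures follow from the raw counts. A minimal sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (variables × observations) and that 1 KiB = 1024 B; the variable names are illustrative only:

```python
# Figures copied from the dataset statistics in this report.
n_variables = 6
n_observations = 4422
missing_cells = 57
total_size_kib = 224.7

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = missing_cells / total_cells

# Assumption: 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells (%): {missing_pct:.1%}")
print(f"Average record size in memory: {avg_record_bytes:.1f} B")
```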

Variable types

Numeric: 2
Categorical: 3
Text: 1

Dataset

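Description: Data related to the National Tax Service (국세청) held on the Busan Metropolitan City Big Data Platform, providing quantities broken down by district (gu/gun), year, month, and type.
Author: Busan Metropolitan City (부산광역시)
URL: https://www.data.go.kr/data/15063702/fileData.do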

Alerts

수량 has 57 (1.3%) missing values [Missing]

Reproduction

Analysis started: 2023-12-12 06:04:28.640899
Analysis finished: 2023-12-12 06:04:29.626741
Duration: 0.99 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

순위
Real number (ℝ)

Distinct: 100
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 51.36635
Minimum: 1
Maximum: 100
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 39.0 KiB

Quantile statistics

Minimum: 1
5th percentile: 6
Q1: 27
Median: 52
Q3: 76
95th percentile: 96
Maximum: 100
Range: 99
Interquartile range (IQR): 49

Descriptive statistics

Standard deviation: 28.791848
Coefficient of variation (CV): 0.56051963
Kurtosis: -1.1905806
Mean: 51.36635
Median Absolute Deviation (MAD): 25
Skewness: -0.034117021
Sum: 227142
Variance: 828.97048
Monotonicity: Not monotonic
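
Several of the 순위 statistics are simple functions of one another. The sketch below recomputes them from the reported figures as a consistency check; it works only with the rounded values printed in this report, so small differences in the last digits are expected:

```python
# Figures copied from the 순위 statistics in this report.
n_observations = 4422
total = 227142
std = 28.791848
minimum, q1, q3, maximum = 1, 27, 76, 100

mean = total / n_observations   # Sum divided by the number of observations
variance = std ** 2             # Variance is the squared standard deviation
cv = std / mean                 # Coefficient of variation
iqr = q3 - q1                   # Interquartile range
value_range = maximum - minimum

print(f"mean={mean:.5f} variance={variance:.3f} cv={cv:.6f}")
print(f"IQR={iqr} range={value_range}")
```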
Histogram with fixed size bins (bins=50)

Value  Count  Frequency (%)
3  48  1.1%
55  48  1.1%
63  48  1.1%
62  48  1.1%
61  48  1.1%
60  48  1.1%
58  48  1.1%
94  48  1.1%
56  48  1.1%
54  48  1.1%
Other values (90)  3942  89.1%

Value  Count  Frequency (%)
1  24  0.5%
2  48  1.1%
3  48  1.1%
4  48  1.1%
5  48  1.1%
6  48  1.1%
7  48  1.1%
8  18  0.4%
9  48  1.1%
10  48  1.1%

Value  Count  Frequency (%)
100  48  1.1%
99  48  1.1%
98  48  1.1%
97  48  1.1%
96  48  1.1%
95  48  1.1%
94  48  1.1%
93  45  1.0%
92  48  1.1%
91  18  0.4%
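
The Frequency (%) column is each count divided by the number of observations. A short sketch that reproduces a few of the 순위 percentages from this report; the helper name `freq_pct` is illustrative and not part of any library:

```python
N_OBSERVATIONS = 4422

def freq_pct(count, n=N_OBSERVATIONS):
    """Format a count as a percentage of all observations, to one decimal."""
    return f"{count / n:.1%}"

# value -> count, copied from the 순위 value tables in this report
counts = {1: 24, 8: 18, 93: 45, 100: 48}
for value, count in counts.items():
    print(value, count, freq_pct(count))

# The ten listed common values plus "Other values" account for every row.
covered_rows = 10 * 48 + 3942
```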

구군명
Categorical

Distinct: 16
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 34.7 KiB

해운대구: 297
부산진구: 291
연제구: 288
동래구: 282
금정구: 282
Other values (11): 2982

Length

Max length: 4
Median length: 3
Mean length: 2.8256445
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 중구
2nd row: 중구
3rd row: 중구
4th row: 서구
5th row: 서구

Common Values

Value  Count  Frequency (%)
해운대구  297  6.7%
부산진구  291  6.6%
연제구  288  6.5%
동래구  282  6.4%
금정구  282  6.4%
사상구  282  6.4%
동구  279  6.3%
사하구  279  6.3%
남구  276  6.2%
수영구  276  6.2%
Other values (6)  1590  36.0%

Length

Histogram of lengths of the category

연도
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 34.7 KiB

2017: 2948
2016: 1474

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2017
2nd row: 2017
3rd row: 2016
4th row: 2017
5th row: 2017

Common Values

Value  Count  Frequency (%)
2017  2948  66.7%
2016  1474  33.3%

Length

Histogram of lengths of the category



(unnamed variable)
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 34.7 KiB

11: 2948
10: 1474

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 11
2nd row: 10
3rd row: 11
4th row: 11
5th row: 10

Common Values

Value  Count  Frequency (%)
11  2948  66.7%
10  1474  33.3%

Length

Histogram of lengths of the category


수량
Real number (ℝ)

MISSING 

Distinct: 441
Distinct (%): 10.1%
Missing: 57
Missing (%): 1.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 81.657732
Minimum: 3
Maximum: 854
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 39.0 KiB

Quantile statistics

Minimum: 3
5th percentile: 5
Q1: 18
Median: 44
Q3: 95
95th percentile: 290.8
Maximum: 854
Range: 851
Interquartile range (IQR): 77

Descriptive statistics

Standard deviation: 115.56045
Coefficient of variation (CV): 1.4151808
Kurtosis: 14.466104
Mean: 81.657732
Median Absolute Deviation (MAD): 31
Skewness: 3.4327062
Sum: 356436
Variance: 13354.218
Monotonicity: Not monotonic
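
Because 수량 has missing values, its mean is taken over the non-missing rows only. The sketch below checks that reading against the reported figures; it assumes the 57 missing rows are simply excluded from the denominator, which is consistent with the printed mean:

```python
# Figures copied from the 수량 statistics in this report.
n_observations = 4422
n_missing = 57
total = 356436
std = 115.56045

n_valid = n_observations - n_missing      # rows that actually carry a value
mean = total / n_valid                    # mean over non-missing rows only
cv = std / mean                           # coefficient of variation
missing_pct = n_missing / n_observations  # share of rows with no value

print(f"valid rows={n_valid} mean={mean:.6f} cv={cv:.6f}")
print(f"Missing (%): {missing_pct:.1%}")
```

Dividing the sum by all 4422 rows instead would give roughly 80.6, which does not match the reported mean.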
Histogram with fixed size bins (bins=50)

Value  Count  Frequency (%)
4  102  2.3%
7  83  1.9%
8  78  1.8%
3  76  1.7%
5  76  1.7%
6  76  1.7%
11  75  1.7%
10  69  1.6%
16  67  1.5%
13  66  1.5%
Other values (431)  3597  81.3%

Value  Count  Frequency (%)
3  76  1.7%
4  102  2.3%
5  76  1.7%
6  76  1.7%
7  83  1.9%
8  78  1.8%
9  66  1.5%
10  69  1.6%
11  75  1.7%
12  58  1.3%

Value  Count  Frequency (%)
854  1  < 0.1%
852  1  < 0.1%
843  2  < 0.1%
840  1  < 0.1%
836  1  < 0.1%
831  1  < 0.1%
815  1  < 0.1%
813  1  < 0.1%
811  1  < 0.1%
809  2  < 0.1%

유형
Text

Distinct: 100
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Memory size: 34.7 KiB

Length

Max length: 10
Median length: 8
Mean length: 5.0217096
Min length: 2

Characters and Unicode

Total characters: 22206
Distinct characters: 176
Distinct categories: 4
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
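
As an illustration of those character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below tallies categories for two 유형 values that appear in this report's Sample sections. It is not the profiler's own code, and the mapping of two-letter codes to names is written out by hand:

```python
import unicodedata
from collections import Counter

# Two 유형 values taken from the sample rows in this report.
values = ["가구점", "pc방"]

# Readable names for the general-category codes reported for this column.
CATEGORY_NAMES = {
    "Lo": "Other Letter",
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
}

# Count the Unicode general category of every character in every value.
category_counts = Counter(
    unicodedata.category(ch) for value in values for ch in value
)

for code, count in category_counts.most_common():
    print(CATEGORY_NAMES.get(code, code), count)
```

Hangul syllables fall under "Other Letter" because the script has no upper or lower case, which is why that category dominates the column.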

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 가구점
2nd row: 가구점
3rd row: 가구점
4th row: 가구점
5th row: 가구점

Value  Count  Frequency (%)
의원  336  7.0%
가구점  48  1.0%
여행사  48  1.0%
일반외과  48  1.0%
이비인후과  48  1.0%
이발소  48  1.0%
이륜자동차판매점  48  1.0%
의료용품가게  48  1.0%
기타외국식전문점  48  1.0%
예술학원  48  1.0%
Other values (92)  4035  84.0%

Most occurring characters

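Value  Count  Frequency (%)
(not preserved)  1356  6.1%
(not preserved)  975  4.4%
(not preserved)  831  3.7%
(not preserved)  810  3.6%
(not preserved)  573  2.6%
(not preserved)  528  2.4%
(not preserved)  504  2.3%
(not preserved)  399  1.8%
(not preserved)  396  1.8%
(not preserved)  384  1.7%
Other values (166)  15450  69.6%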

Most occurring categories

Value  Count  Frequency (%)
Other Letter  21657  97.5%
Space Separator  381  1.7%
Lowercase Letter  96  0.4%
Uppercase Letter  72  0.3%

Most frequent character per category

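Other Letter
Value  Count  Frequency (%)
(not preserved)  1356  6.3%
(not preserved)  975  4.5%
(not preserved)  831  3.8%
(not preserved)  810  3.7%
(not preserved)  573  2.6%
(not preserved)  528  2.4%
(not preserved)  504  2.3%
(not preserved)  399  1.8%
(not preserved)  396  1.8%
(not preserved)  384  1.8%
Other values (160)  14901  68.8%

Uppercase Letter
Value  Count  Frequency (%)
L  24  33.3%
G  24  33.3%
P  24  33.3%

Lowercase Letter
Value  Count  Frequency (%)
c  48  50.0%
p  48  50.0%

Space Separator
Value  Count  Frequency (%)
(space)  381  100.0%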

Most occurring scripts

Value  Count  Frequency (%)
Hangul  21657  97.5%
Common  381  1.7%
Latin  168  0.8%

Most frequent character per script

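Hangul
Value  Count  Frequency (%)
(not preserved)  1356  6.3%
(not preserved)  975  4.5%
(not preserved)  831  3.8%
(not preserved)  810  3.7%
(not preserved)  573  2.6%
(not preserved)  528  2.4%
(not preserved)  504  2.3%
(not preserved)  399  1.8%
(not preserved)  396  1.8%
(not preserved)  384  1.8%
Other values (160)  14901  68.8%

Latin
Value  Count  Frequency (%)
c  48  28.6%
p  48  28.6%
L  24  14.3%
G  24  14.3%
P  24  14.3%

Common
Value  Count  Frequency (%)
(space)  381  100.0%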

Most occurring blocks

Value  Count  Frequency (%)
Hangul  21261  95.7%
ASCII  549  2.5%
Compat Jamo  396  1.8%

Most frequent character per block

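Hangul
Value  Count  Frequency (%)
(not preserved)  1356  6.4%
(not preserved)  975  4.6%
(not preserved)  831  3.9%
(not preserved)  810  3.8%
(not preserved)  573  2.7%
(not preserved)  528  2.5%
(not preserved)  504  2.4%
(not preserved)  399  1.9%
(not preserved)  384  1.8%
(not preserved)  384  1.8%
Other values (159)  14517  68.3%

Compat Jamo
Value  Count  Frequency (%)
(not preserved)  396  100.0%

ASCII
Value  Count  Frequency (%)
(space)  381  69.4%
c  48  8.7%
p  48  8.7%
L  24  4.4%
G  24  4.4%
P  24  4.4%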

Interactions

[Four interaction plots rendered with Matplotlib v3.7.2; images not preserved]

Correlations

(variable)  순위  구군명  연도  (unnamed)  수량  유형
순위  1.000  0.000  0.000  0.000  0.407  1.000
구군명  0.000  1.000  0.000  0.000  0.268  0.000
연도  0.000  0.000  1.000  0.706  0.000  0.000
(unnamed)  0.000  0.000  0.706  1.000  0.000  0.000
수량  0.407  0.268  0.000  0.000  1.000  0.804
유형  1.000  0.000  0.000  0.000  0.804  1.000

(variable)  구군명  (unnamed)  연도
구군명  1.000  0.000  0.000
(unnamed)  0.000  1.000  0.499
연도  0.000  0.499  1.000
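
The labels of the correlation tabs did not survive in this copy. Since the three-variable matrix covers only the categorical columns, one plausible reading is that it shows Cramér's V. The sketch below is a check under that assumption and is not the profiler's own code. It implements bias-corrected Cramér's V with Yates' continuity correction for 2×2 tables. The cross-tabulation is also an assumption, inferred from the sample rows: 1474 rows each for (2017, 11), (2017, 10) and (2016, 11), and none for (2016, 10).

```python
import math

def cramers_v(table, yates=True):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    r, k = len(row_totals), len(col_totals)

    # Yates' continuity correction is only defined for 2x2 tables.
    use_yates = yates and r == 2 and k == 2

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if use_yates:
                diff -= min(0.5, diff)
            chi2 += diff ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the table dimensions by their
    # expected values under independence.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

# Assumed counts: rows are 연도 (2017, 2016), columns are the unnamed
# variable (11, 10). Inferred from the sample rows, not stated in the report.
table = [[1474, 1474], [1474, 0]]
v = cramers_v(table)
print(f"{v:.3f}")
```

Under these assumptions the result rounds to the 0.499 shown for 연도 and the unnamed variable. Without the continuity correction it would round to 0.500.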

(variable)  순위  수량  구군명  연도  (unnamed)
순위  1.000  0.123  0.000  0.000  0.000
수량  0.123  1.000  0.108  0.000  0.000
구군명  0.000  0.108  1.000  0.000  0.000
연도  0.000  0.000  0.000  1.000  0.499
(unnamed)  0.000  0.000  0.000  0.499  1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

(index)  순위  구군명  연도  (unnamed)  수량  유형
0  3  중구  2017  11  13  가구점
1  3  중구  2017  10  13  가구점
2  3  중구  2016  11  16  가구점
3  3  서구  2017  11  9  가구점
4  3  서구  2017  10  9  가구점
5  3  서구  2016  11  9  가구점
6  3  동구  2017  11  64  가구점
7  3  동구  2017  10  62  가구점
8  3  동구  2016  11  64  가구점
9  3  영도구  2017  11  7  가구점

Last rows

(index)  순위  구군명  연도  (unnamed)  수량  유형
4412  2  연제구  2016  11  40  pc방
4413  2  수영구  2017  11  37  pc방
4414  2  수영구  2017  10  37  pc방
4415  2  수영구  2016  11  39  pc방
4416  2  사상구  2017  11  53  pc방
4417  2  사상구  2017  10  54  pc방
4418  2  사상구  2016  11  52  pc방
4419  2  기장군  2017  11  25  pc방
4420  2  기장군  2017  10  25  pc방
4421  2  기장군  2016  11  24  pc방