Overview

Dataset statistics

Number of variables: 7
Number of observations: 1121
Missing cells: 2214
Missing cells (%): 28.2%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 64.7 KiB
Average record size in memory: 59.1 B
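
The two derived figures in this block follow from the raw counts. A minimal sketch in Python, with the counts copied by hand from the statistics above (the variable names are illustrative, not part of the report):

```python
# Counts copied from the dataset statistics in this report.
n_variables = 7
n_observations = 1121
missing_cells = 2214
total_size_kib = 64.7

# Share of missing cells among all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average record size in bytes (1 KiB = 1024 bytes).
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```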

Variable types

Numeric: 3
Categorical: 2
Text: 2

Dataset

Description: Status of child meal card affiliated stores in Gyeongsangnam-do (경상남도 아동급식카드 가맹점 현황)
Author: Gyeongsangnam-do (경상남도)
URL: https://www.data.go.kr/data/15068650/fileData.do

Alerts

High correlation: 업종 is highly overall correlated with 시도
High correlation: 시도 is highly overall correlated with 연번 and 3 other fields
High correlation: 연번 is highly overall correlated with 시도
High correlation: 순번 is highly overall correlated with 시도
High correlation: 개수 is highly overall correlated with 시도
Imbalance: 시도 is highly imbalanced (90.3%)
Missing: 연번 has 1107 (98.8%) missing values
Missing: 시군구 has 1107 (98.8%) missing values
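
The 90.3% imbalance figure for 시도 can be reproduced from its two category counts (1107 and 14). The sketch below assumes the score is one minus the Shannon entropy normalized by the maximum entropy for that number of classes. That assumption matches the reported number, but the function name and the handling of the single-class case are illustrative choices.

```python
import math


def imbalance_score(counts):
    """Return 1 - (Shannon entropy / maximum entropy) for class counts.

    0.0 means perfectly balanced classes, values near 1.0 mean one class
    dominates. A single class is defined here as 0.0 (an assumption).
    """
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(n_classes)


# Counts of the two 시도 categories reported in this document.
score = imbalance_score([1107, 14])
print(f"{score:.1%}")
```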

Reproduction

Analysis started: 2023-12-12 23:25:50.173722
Analysis finished: 2023-12-12 23:25:51.682840
Duration: 1.51 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

연번
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 14
Distinct (%): 100.0%
Missing: 1107
Missing (%): 98.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.5
Minimum: 1
Maximum: 14
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1.65
Q1: 4.25
Median: 7.5
Q3: 10.75
95-th percentile: 13.35
Maximum: 14
Range: 13
Interquartile range (IQR): 6.5

Descriptive statistics

Standard deviation: 4.1833001
Coefficient of variation (CV): 0.55777335
Kurtosis: -1.2
Mean: 7.5
Median Absolute Deviation (MAD): 3.5
Skewness: 0
Sum: 105
Variance: 17.5
Monotonicity: Strictly increasing
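
Because 연번 has exactly 14 distinct non-missing values with minimum 1, maximum 14, and sum 105, they must be the integers 1 through 14, so most of the statistics above can be checked by hand. A minimal sketch using only the Python standard library. It assumes sample variance (divisor n - 1) and linearly interpolated quantiles, both of which agree with the reported figures.

```python
import statistics

# The 14 non-missing 연번 values: 1, 2, ..., 14.
values = list(range(1, 15))

mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance, divisor n - 1
std_dev = statistics.stdev(values)
cv = std_dev / mean                      # coefficient of variation
median = statistics.median(values)
# Median absolute deviation around the median.
mad = statistics.median(abs(v - median) for v in values)
# Quartiles with linear interpolation between closest ranks.
q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

print(mean, variance, round(std_dev, 7), round(cv, 8), mad, q1, q3, iqr)
```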
Histogram with fixed size bins (bins=14)
Value | Count | Frequency (%)
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%
5 | 1 | 0.1%
6 | 1 | 0.1%
7 | 1 | 0.1%
8 | 1 | 0.1%
9 | 1 | 0.1%
10 | 1 | 0.1%
Other values (4) | 4 | 0.4%
(Missing) | 1107 | 98.8%

Value | Count | Frequency (%)
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%
5 | 1 | 0.1%
6 | 1 | 0.1%
7 | 1 | 0.1%
8 | 1 | 0.1%
9 | 1 | 0.1%
10 | 1 | 0.1%

Value | Count | Frequency (%)
14 | 1 | 0.1%
13 | 1 | 0.1%
12 | 1 | 0.1%
11 | 1 | 0.1%
10 | 1 | 0.1%
9 | 1 | 0.1%
8 | 1 | 0.1%
7 | 1 | 0.1%
6 | 1 | 0.1%
5 | 1 | 0.1%

시도
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 8.9 KiB

Length

Max length: 4
Median length: 4
Mean length: 3.9750223
Min length: 2
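
The mean length can be reproduced from the two category counts, assuming the missing marker is profiled as the literal four-character string `<NA>` (which the maximum length of 4 and the missing count of 0 both suggest). A short sketch, with the counts copied from this report:

```python
# Category counts for 시도 as reported: the marker "<NA>" and the value "경남".
category_counts = {"<NA>": 1107, "경남": 14}

total_rows = sum(category_counts.values())
# Weighted average of string lengths; len() counts Unicode code points.
mean_length = sum(len(value) * count for value, count in category_counts.items()) / total_rows

print(total_rows, round(mean_length, 7))
```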

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 경남
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 1107 | 98.8%
경남 | 14 | 1.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 1107 | 98.8%
경남 | 14 | 1.2%

시군구
Text

MISSING 

Distinct: 14
Distinct (%): 100.0%
Missing: 1107
Missing (%): 98.8%
Memory size: 8.9 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 42
Distinct characters: 24
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 14
Unique (%): 100.0%

Sample

1st row: 창원시
2nd row: 진주시
3rd row: 통영시
4th row: 사천시
5th row: 김해시

Value | Count | Frequency (%)
창원시 | 1 | 7.1%
진주시 | 1 | 7.1%
통영시 | 1 | 7.1%
사천시 | 1 | 7.1%
김해시 | 1 | 7.1%
밀양시 | 1 | 7.1%
양산시 | 1 | 7.1%
창녕군 | 1 | 7.1%
고성군 | 1 | 7.1%
남해군 | 1 | 7.1%
Other values (4) | 4 | 28.6%

Most occurring characters

ValueCountFrequency (%)
7
16.7%
7
16.7%
3
 
7.1%
3
 
7.1%
2
 
4.8%
2
 
4.8%
1
 
2.4%
1
 
2.4%
1
 
2.4%
1
 
2.4%
Other values (14) 14
33.3%

Most occurring categories

ValueCountFrequency (%)
Other Letter 42
100.0%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
7
16.7%
7
16.7%
3
 
7.1%
3
 
7.1%
2
 
4.8%
2
 
4.8%
1
 
2.4%
1
 
2.4%
1
 
2.4%
1
 
2.4%
Other values (14) 14
33.3%

Most occurring scripts

ValueCountFrequency (%)
Hangul 42
100.0%

Most frequent character per script

Hangul
ValueCountFrequency (%)
7
16.7%
7
16.7%
3
 
7.1%
3
 
7.1%
2
 
4.8%
2
 
4.8%
1
 
2.4%
1
 
2.4%
1
 
2.4%
1
 
2.4%
Other values (14) 14
33.3%

Most occurring blocks

ValueCountFrequency (%)
Hangul 42
100.0%

Most frequent character per block

Hangul
ValueCountFrequency (%)
7
16.7%
7
16.7%
3
 
7.1%
3
 
7.1%
2
 
4.8%
2
 
4.8%
1
 
2.4%
1
 
2.4%
1
 
2.4%
1
 
2.4%
Other values (14) 14
33.3%

순번
Real number (ℝ)

HIGH CORRELATION 

Distinct: 332
Distinct (%): 29.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 84.021409
Minimum: 1
Maximum: 332
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5
Q1: 23
Median: 56
Q3: 115
95-th percentile: 276
Maximum: 332
Range: 331
Interquartile range (IQR): 92

Descriptive statistics

Standard deviation: 82.148143
Coefficient of variation (CV): 0.97770489
Kurtosis: 0.94872175
Mean: 84.021409
Median Absolute Deviation (MAD): 39
Skewness: 1.341166
Sum: 94188
Variance: 6748.3174
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 14 | 1.2%
3 | 14 | 1.2%
2 | 14 | 1.2%
4 | 13 | 1.2%
5 | 13 | 1.2%
6 | 13 | 1.2%
7 | 13 | 1.2%
8 | 13 | 1.2%
9 | 13 | 1.2%
10 | 13 | 1.2%
Other values (322) | 988 | 88.1%

Value | Count | Frequency (%)
1 | 14 | 1.2%
2 | 14 | 1.2%
3 | 14 | 1.2%
4 | 13 | 1.2%
5 | 13 | 1.2%
6 | 13 | 1.2%
7 | 13 | 1.2%
8 | 13 | 1.2%
9 | 13 | 1.2%
10 | 13 | 1.2%

Value | Count | Frequency (%)
332 | 1 | 0.1%
331 | 1 | 0.1%
330 | 1 | 0.1%
329 | 1 | 0.1%
328 | 1 | 0.1%
327 | 1 | 0.1%
326 | 1 | 0.1%
325 | 1 | 0.1%
324 | 1 | 0.1%
323 | 1 | 0.1%

가맹점(업체) 명
Text

Distinct: 1005
Distinct (%): 89.7%
Missing: 0
Missing (%): 0.0%
Memory size: 8.9 KiB

Length

Max length: 23
Median length: 19
Mean length: 6.0115968
Min length: 2

Characters and Unicode

Total characters: 6739
Distinct characters: 536
Distinct categories: 12
Distinct scripts: 4
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 944
Unique (%): 84.2%

Sample

1st row: 성일할인마트
2nd row: 한솥도시락
3rd row: 려미원
4th row: 365할인마트
5th row: 봄내수제돈까스

Value | Count | Frequency (%)
씨유(cu | 16 | 1.2%
코리아세븐 | 13 | 1.0%
하나로마트 | 12 | 0.9%
gs25 | 11 | 0.9%
김밥천국 | 9 | 0.7%
세븐일레븐 | 8 | 0.6%
파리바게뜨 | 7 | 0.5%
뚜레쥬르 | 7 | 0.5%
농협 | 6 | 0.5%
cu | 6 | 0.5%
Other values (1077) | 1199 | 92.7%

Most occurring characters

ValueCountFrequency (%)
263
 
3.9%
224
 
3.3%
209
 
3.1%
181
 
2.7%
124
 
1.8%
103
 
1.5%
92
 
1.4%
90
 
1.3%
87
 
1.3%
86
 
1.3%
Other values (526) 5280
78.3%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 6194 | 91.9%
Space Separator | 181 | 2.7%
Uppercase Letter | 116 | 1.7%
Open Punctuation | 71 | 1.1%
Close Punctuation | 70 | 1.0%
Decimal Number | 61 | 0.9%
Lowercase Letter | 20 | 0.3%
Other Punctuation | 19 | 0.3%
Other Symbol | 4 | 0.1%
Math Symbol | 1 | < 0.1%
Other values (2) | 2 | < 0.1%

Most frequent character per category

Other Letter
ValueCountFrequency (%)
263
 
4.2%
224
 
3.6%
209
 
3.4%
124
 
2.0%
103
 
1.7%
92
 
1.5%
90
 
1.5%
87
 
1.4%
86
 
1.4%
83
 
1.3%
Other values (480) 4833
78.0%
Uppercase Letter
ValueCountFrequency (%)
C 31
26.7%
U 24
20.7%
G 16
13.8%
S 15
12.9%
K 5
 
4.3%
D 5
 
4.3%
T 3
 
2.6%
B 3
 
2.6%
I 3
 
2.6%
H 3
 
2.6%
Other values (5) 8
 
6.9%
Lowercase Letter
ValueCountFrequency (%)
e 5
25.0%
h 3
15.0%
c 2
 
10.0%
s 2
 
10.0%
k 2
 
10.0%
u 1
 
5.0%
l 1
 
5.0%
a 1
 
5.0%
b 1
 
5.0%
i 1
 
5.0%
Decimal Number
ValueCountFrequency (%)
2 20
32.8%
5 19
31.1%
0 5
 
8.2%
4 5
 
8.2%
1 4
 
6.6%
9 3
 
4.9%
6 2
 
3.3%
3 2
 
3.3%
7 1
 
1.6%
Other Punctuation
ValueCountFrequency (%)
& 11
57.9%
, 3
 
15.8%
. 3
 
15.8%
' 2
 
10.5%
Space Separator
ValueCountFrequency (%)
181
100.0%
Open Punctuation
ValueCountFrequency (%)
( 71
100.0%
Close Punctuation
ValueCountFrequency (%)
) 70
100.0%
Other Symbol
ValueCountFrequency (%)
4
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 6197 | 92.0%
Common | 405 | 6.0%
Latin | 136 | 2.0%
Han | 1 | < 0.1%

Most frequent character per script

Hangul
ValueCountFrequency (%)
263
 
4.2%
224
 
3.6%
209
 
3.4%
124
 
2.0%
103
 
1.7%
92
 
1.5%
90
 
1.5%
87
 
1.4%
86
 
1.4%
83
 
1.3%
Other values (480) 4836
78.0%
Latin
ValueCountFrequency (%)
C 31
22.8%
U 24
17.6%
G 16
11.8%
S 15
11.0%
e 5
 
3.7%
K 5
 
3.7%
D 5
 
3.7%
h 3
 
2.2%
T 3
 
2.2%
B 3
 
2.2%
Other values (16) 26
19.1%
Common
ValueCountFrequency (%)
181
44.7%
( 71
 
17.5%
) 70
 
17.3%
2 20
 
4.9%
5 19
 
4.7%
& 11
 
2.7%
0 5
 
1.2%
4 5
 
1.2%
1 4
 
1.0%
, 3
 
0.7%
Other values (9) 16
 
4.0%
Han
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 6193 | 91.9%
ASCII | 541 | 8.0%
None | 4 | 0.1%
CJK | 1 | < 0.1%

Most frequent character per block

Hangul
ValueCountFrequency (%)
263
 
4.2%
224
 
3.6%
209
 
3.4%
124
 
2.0%
103
 
1.7%
92
 
1.5%
90
 
1.5%
87
 
1.4%
86
 
1.4%
83
 
1.3%
Other values (479) 4832
78.0%
ASCII
ValueCountFrequency (%)
181
33.5%
( 71
 
13.1%
) 70
 
12.9%
C 31
 
5.7%
U 24
 
4.4%
2 20
 
3.7%
5 19
 
3.5%
G 16
 
3.0%
S 15
 
2.8%
& 11
 
2.0%
Other values (35) 83
15.3%
None
ValueCountFrequency (%)
4
100.0%
CJK
ValueCountFrequency (%)
1
100.0%

개수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 41
Distinct (%): 3.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.4183764
Minimum: 1
Maximum: 333
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 1
95-th percentile: 6
Maximum: 333
Range: 332
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 18.595381
Coefficient of variation (CV): 5.4398283
Kurtosis: 189.95048
Mean: 3.4183764
Median Absolute Deviation (MAD): 0
Skewness: 12.807499
Sum: 3832
Variance: 345.7882
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=41)
Value | Count | Frequency (%)
1 | 994 | 88.7%
2 | 35 | 3.1%
3 | 16 | 1.4%
5 | 9 | 0.8%
4 | 9 | 0.8%
8 | 5 | 0.4%
13 | 5 | 0.4%
14 | 4 | 0.4%
25 | 3 | 0.3%
9 | 3 | 0.3%
Other values (31) | 38 | 3.4%

Value | Count | Frequency (%)
1 | 994 | 88.7%
2 | 35 | 3.1%
3 | 16 | 1.4%
4 | 9 | 0.8%
5 | 9 | 0.8%
6 | 3 | 0.3%
7 | 2 | 0.2%
8 | 5 | 0.4%
9 | 3 | 0.3%
10 | 2 | 0.2%

Value | Count | Frequency (%)
333 | 1 | 0.1%
316 | 1 | 0.1%
216 | 1 | 0.1%
180 | 1 | 0.1%
164 | 1 | 0.1%
138 | 1 | 0.1%
107 | 1 | 0.1%
104 | 1 | 0.1%
91 | 1 | 0.1%
77 | 1 | 0.1%

업종
Categorical

HIGH CORRELATION 

Distinct: 15
Distinct (%): 1.3%
Missing: 0
Missing (%): 0.0%
Memory size: 8.9 KiB

Length

Max length: 9
Median length: 5
Mean length: 4.2033898
Min length: 2

Unique

Unique: 3
Unique (%): 0.3%

Sample

1st row: 마트
2nd row: 일반음식점
3rd row: 휴게음식점
4th row: 마트
5th row: 일반음식점

Common Values

Value | Count | Frequency (%)
일반음식점 | 551 | 49.2%
마트 | 209 | 18.6%
휴게음식점 | 119 | 10.6%
소매업 | 63 | 5.6%
편의점 | 46 | 4.1%
일반음식점 | 37 | 3.3%
제과점 | 35 | 3.1%
반찬가게 | 30 | 2.7%
숙박및음식점업 | 19 | 1.7%
중식 | 4 | 0.4%
Other values (5) | 8 | 0.7%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
일반음식점 | 588 | 52.5%
마트 | 209 | 18.6%
휴게음식점 | 119 | 10.6%
소매업 | 63 | 5.6%
편의점 | 46 | 4.1%
제과점 | 35 | 3.1%
반찬가게 | 30 | 2.7%
숙박및음식점업 | 19 | 1.7%
중식 | 4 | 0.4%
한식 | 3 | 0.3%
Other values (4) | 5 | 0.4%

Interactions


Correlations

 | 연번 | 시군구 | 순번 | 개수 | 업종
연번 | 1.000 | 1.000 | NaN | 0.654 | 0.540
시군구 | 1.000 | 1.000 | NaN | 1.000 | 1.000
순번 | NaN | NaN | 1.000 | 0.000 | 0.276
개수 | 0.654 | 1.000 | 0.000 | 1.000 | 0.384
업종 | 0.540 | 1.000 | 0.276 | 0.384 | 1.000

 | 업종 | 시도
업종 | 1.000 | 1.000
시도 | 1.000 | 1.000

 | 연번 | 순번 | 개수 | 시도 | 업종
연번 | 1.000 | NaN | -0.201 | 1.000 | 0.000
순번 | NaN | 1.000 | -0.321 | 1.000 | 0.106
개수 | -0.201 | -0.321 | 1.000 | 1.000 | 0.175
시도 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000
업종 | 0.000 | 0.106 | 0.175 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

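First rows

(index) | 연번 | 시도 | 시군구 | 순번 | 가맹점(업체) 명 | 개수 | 업종
0 | 1 | 경남 | 창원시 | 1 | 성일할인마트 | 1 | 마트
1 | <NA> | <NA> | <NA> | 2 | 한솥도시락 | 5 | 일반음식점
2 | <NA> | <NA> | <NA> | 3 | 려미원 | 1 | 휴게음식점
3 | <NA> | <NA> | <NA> | 4 | 365할인마트 | 8 | 마트
4 | <NA> | <NA> | <NA> | 5 | 봄내수제돈까스 | 1 | 일반음식점
5 | <NA> | <NA> | <NA> | 6 | 북경 | 1 | 일반음식점
6 | <NA> | <NA> | <NA> | 7 | 금용 | 1 | 일반음식점
7 | <NA> | <NA> | <NA> | 8 | 차이랑 | 1 | 일반음식점
8 | <NA> | <NA> | <NA> | 9 | 중화요리 | 1 | 일반음식점
9 | <NA> | <NA> | <NA> | 10 | 중국관 | 1 | 일반음식점

Last rows

(index) | 연번 | 시도 | 시군구 | 순번 | 가맹점(업체) 명 | 개수 | 업종
1111 | <NA> | <NA> | <NA> | 19 | 대백마트(상림점) | 1 | 마트
1112 | <NA> | <NA> | <NA> | 20 | 송정마트 | 1 | 마트
1113 | <NA> | <NA> | <NA> | 21 | 이지할인마트 | 1 | 마트
1114 | <NA> | <NA> | <NA> | 22 | OK포인트마트(거창점) | 1 | 마트
1115 | <NA> | <NA> | <NA> | 23 | 아림할인마트 | 1 | 마트
1116 | <NA> | <NA> | <NA> | 24 | 시장할인마트 | 1 | 마트
1117 | <NA> | <NA> | <NA> | 25 | 주식회사 스카이시티아림식자재 | 1 | 마트
1118 | <NA> | <NA> | <NA> | 26 | 거창할인마트 | 1 | 마트
1119 | <NA> | <NA> | <NA> | 27 | 만물식자재마트 | 1 | 마트
1120 | <NA> | <NA> | <NA> | 28 | 유가네닭갈비거창점 | 1 | 한식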
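
In the sample rows, 연번, 시도, and 시군구 are populated only on the first row of a group and are `<NA>` on the rows that follow, which would account for the 1107 missing values reported for 연번 and 시군구. One possible reading is that the source file used merged or group-header cells. Under that assumption (an interpretation of the sample, not something the report states), the group values could be carried down with a forward fill. A minimal standard-library sketch, where `forward_fill` is a hypothetical helper and the rows are abbreviated copies of the first three sample rows:

```python
NA = "<NA>"  # missing marker as it appears in the sample rows


def forward_fill(rows, columns):
    """Return copies of rows with each NA in `columns` replaced by the
    most recent non-missing value seen above it in the same column."""
    last_seen = {}
    filled = []
    for row in rows:
        new_row = dict(row)  # leave the input rows untouched
        for col in columns:
            if new_row[col] == NA:
                new_row[col] = last_seen.get(col, NA)
            else:
                last_seen[col] = new_row[col]
        filled.append(new_row)
    return filled


rows = [
    {"연번": "1", "시도": "경남", "시군구": "창원시", "순번": 1, "업종": "마트"},
    {"연번": NA, "시도": NA, "시군구": NA, "순번": 2, "업종": "일반음식점"},
    {"연번": NA, "시도": NA, "시군구": NA, "순번": 3, "업종": "휴게음식점"},
]

filled = forward_fill(rows, ["연번", "시도", "시군구"])
for row in filled:
    print(row["시도"], row["시군구"], row["순번"], row["업종"])
```

Whether a forward fill is appropriate depends on how the original file was laid out, so the result should be checked against the source data before any per-region analysis.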