Overview

Dataset statistics

Number of variables: 10
Number of observations: 10000
Missing cells: 9833
Missing cells (%): 9.8%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 878.9 KiB
Average record size in memory: 90.0 B
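
The percentage and per-record figures above follow from the raw counts. A minimal sketch of that arithmetic, using only numbers from this table; the variable names are illustrative, and the 1 KiB = 1024 B convention is an assumption:

```python
# Figures copied from the "Dataset statistics" table above.
n_variables = 10
n_observations = 10_000
missing_cells = 9_833
total_size_kib = 878.9

# Share of missing cells across the whole table (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Average bytes per record, assuming 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

All 9833 missing cells come from a single column (수신일시), which is why the table-wide share is 9.8% while that column alone is 98.3% missing.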

Variable types

Categorical: 4
Numeric: 2
Text: 2
DateTime: 2

Dataset

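Description: Processing-result information on hazardous and illegal products, provided by the Hazardous Product Sales Blocking System (위해상품판매차단시스템) of the Korea Chamber of Commerce and Industry (대한상공회의소), which blocks items such as products recalled after harmful substances were detected.
URL: https://www.data.go.kr/data/15040692/fileData.do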

Alerts

Constant: 전송상태 has constant value ""
High correlation: 처리상태 is highly overall correlated with 재고단위
High correlation: 검사기관명 is highly overall correlated with 문서번호 and 1 other field
High correlation: 재고단위 is highly overall correlated with 문서번호 and 3 other fields
High correlation: 문서번호 is highly overall correlated with 검사기관명 and 1 other field
High correlation: 차수 is highly overall correlated with 재고단위
Imbalance: 처리상태 is highly imbalanced (92.2%)
Imbalance: 재고단위 is highly imbalanced (96.6%)
Missing: 수신일시 has 9833 (98.3%) missing values
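
The imbalance percentages in the alerts are consistent with a normalised-entropy score, 1 - H / log2(k), where H is the Shannon entropy of the class shares and k is the number of classes. This is inferred from the numbers in this report rather than taken from the tool's documentation, so treat the function below as a sketch; `imbalance_score` is an illustrative name.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no entropy to normalise
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Value counts copied from the variable sections of this report.
status_counts = [9833, 165, 2]   # 처리상태: <NA>, NTS, BLK
unit_counts = [9965, 35]         # 재고단위: <NA>, PCS

print(f"처리상태: {imbalance_score(status_counts):.1%}")
print(f"재고단위: {imbalance_score(unit_counts):.1%}")
```

With these counts the scores round to the 92.2% and 96.6% shown in the alerts. A score near 100% means almost every row falls into one class.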

Reproduction

Analysis started: 2023-12-12 04:19:50.070675
Analysis finished: 2023-12-12 04:19:51.562084
Duration: 1.49 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
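
A report of this kind can be regenerated with ydata-profiling. The sketch below is not the script that produced this report: the file name, the `cp949` encoding, the 10000-row sampling step, and the report title are all assumptions, and `build_report` is a hypothetical helper.

```python
def build_report(csv_path, output_path="report.html", encoding="cp949"):
    """Profile a CSV export and write an HTML report (sketch, not the original script)."""
    # Imported lazily so the function can be defined without the packages installed.
    import pandas as pd
    from ydata_profiling import ProfileReport

    # Assumption: the portal's file is a CSV in a Korean legacy encoding.
    df = pd.read_csv(csv_path, encoding=encoding)

    # Assumption: the report profiled a random sample of at most 10000 rows.
    sample = df.sample(n=min(10_000, len(df)), random_state=0)

    # Hypothetical title; the original report title is not shown in this text.
    profile = ProfileReport(sample, title="Hazardous product blocking results")
    profile.to_file(output_path)
    return output_path

# Example call (hypothetical file name):
# build_report("hazardous_product_results.csv")
```

The sampling step is a guess based on the non-consecutive row indices in the Sample section; drop it to profile the full file.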

Variables

검사기관명
Categorical

HIGH CORRELATION 

Distinct: 8
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
산업통상자원부 국가기술표준원 | 3375
식약처 식품 | 3323
식품의약품안전처 화장품정책과 | 1120
식품의약품안전처 의약외품정책과 | 918
환경부 화학제품과 | 805
Other values (3) | 459

Length

Max length: 16
Median length: 15
Mean length: 11.4555
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 식약처 식품
2nd row: 산업통상자원부 국가기술표준원
3rd row: 식품의약품안전처 화장품정책과
4th row: 산업통상자원부 국가기술표준원
5th row: 산업통상자원부 국가기술표준원

Common Values

Value | Count | Frequency (%)
산업통상자원부 국가기술표준원 | 3375 | 33.8%
식약처 식품 | 3323 | 33.2%
식품의약품안전처 화장품정책과 | 1120 | 11.2%
식품의약품안전처 의약외품정책과 | 918 | 9.2%
환경부 화학제품과 | 805 | 8.1%
식약처 수입유통안전과 | 410 | 4.1%
식품의약품안전처 의료기기안전국 | 44 | 0.4%
환경부 어린이용품 | 5 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
식약처 | 3733 | 18.7%
산업통상자원부 | 3375 | 16.9%
국가기술표준원 | 3375 | 16.9%
식품 | 3323 | 16.6%
식품의약품안전처 | 2082 | 10.4%
화장품정책과 | 1120 | 5.6%
의약외품정책과 | 918 | 4.6%
환경부 | 810 | 4.0%
화학제품과 | 805 | 4.0%
수입유통안전과 | 410 | 2.1%
Other values (2) | 49 | 0.2%
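
The table above counts whitespace-separated words rather than whole categories, which is why 식약처 (3733) exceeds any single category count. A minimal sketch that rebuilds those word counts from the category counts in "Common Values"; splitting on whitespace is an assumption about how the words were tokenised:

```python
from collections import Counter

# Category counts copied from the "Common Values" table for 검사기관명.
category_counts = {
    "산업통상자원부 국가기술표준원": 3375,
    "식약처 식품": 3323,
    "식품의약품안전처 화장품정책과": 1120,
    "식품의약품안전처 의약외품정책과": 918,
    "환경부 화학제품과": 805,
    "식약처 수입유통안전과": 410,
    "식품의약품안전처 의료기기안전국": 44,
    "환경부 어린이용품": 5,
}

# Split each category on whitespace and weight every token by the row count.
word_counts = Counter()
for category, count in category_counts.items():
    for word in category.split():
        word_counts[word] += count

total_words = sum(word_counts.values())
for word, count in word_counts.most_common(5):
    print(f"{word} | {count} | {100 * count / total_words:.1f}%")
```

Every category here has exactly two words, so the 10000 rows yield 20000 words and each percentage is half of what the same count would be in the category table.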

문서번호
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1157
Distinct (%): 11.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.7722575 × 10^10
Minimum: 10014789
Maximum: 2.0221123 × 10^11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10014789
5-th percentile: 10018310
Q1: 10018391
Median: 3.0001592 × 10^9
Q3: 2.0221005 × 10^11
95-th percentile: 2.0221116 × 10^11
Maximum: 2.0221123 × 10^11
Range: 2.0220122 × 10^11
Interquartile range (IQR): 2.0220003 × 10^11

Descriptive statistics

Standard deviation: 9.4353477 × 10^10
Coefficient of variation (CV): 1.3932352
Kurtosis: -1.4759684
Mean: 6.7722575 × 10^10
Median Absolute Deviation (MAD): 2.9901409 × 10^9
Skewness: 0.72337705
Sum: 6.7722575 × 10^14
Variance: 8.9025786 × 10^21
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
3000159656 | 100 | 1.0%
202210040001 | 95 | 0.9%
3000159235 | 95 | 0.9%
202210200001 | 92 | 0.9%
202210200006 | 92 | 0.9%
3000158750 | 86 | 0.9%
202210200007 | 85 | 0.9%
202210210001 | 84 | 0.8%
3000160261 | 83 | 0.8%
202210170001 | 81 | 0.8%
Other values (1147) | 9107 | 91.1%

Minimum 10 values

Value | Count | Frequency (%)
10014789 | 1 | < 0.1%
10015038 | 1 | < 0.1%
10015051 | 1 | < 0.1%
10015063 | 1 | < 0.1%
10015074 | 1 | < 0.1%
10015081 | 1 | < 0.1%
10015083 | 1 | < 0.1%
10015108 | 1 | < 0.1%
10015255 | 1 | < 0.1%
10015300 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
202211230015 | 4 | < 0.1%
202211230014 | 1 | < 0.1%
202211230013 | 1 | < 0.1%
202211230012 | 3 | < 0.1%
202211230011 | 2 | < 0.1%
202211230010 | 2 | < 0.1%
202211230009 | 2 | < 0.1%
202211230008 | 2 | < 0.1%
202211230007 | 1 | < 0.1%
202211230006 | 3 | < 0.1%
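
The values listed for 문서번호 fall into three digit-length groups (8, 10, and 12 digits), which suggests that the column holds identifiers from several numbering schemes rather than a quantity. Under that reading, which is an assumption, the mean and variance above describe a mixture of ID formats and are of limited use. A minimal sketch that groups a few of the listed values by digit length:

```python
from collections import defaultdict

# Document numbers copied from the frequency tables for 문서번호.
doc_numbers = [
    3000159656, 3000159235, 3000158750, 3000160261,
    202210040001, 202210200001, 202210200006, 202210170001,
    10014789, 10015038, 10015051,
]

# Group the identifiers by how many digits they have.
by_length = defaultdict(list)
for number in doc_numbers:
    by_length[len(str(number))].append(number)

for length in sorted(by_length):
    print(length, by_length[length])
```

If the schemes are agency-specific, that would also be consistent with the high correlation the report flags between 문서번호 and 검사기관명.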

차수
Real number (ℝ)

HIGH CORRELATION 

Distinct: 6
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.053
Minimum: 1
Maximum: 6
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 1
95-th percentile: 1
Maximum: 6
Range: 5
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.33015436
Coefficient of variation (CV): 0.3135369
Kurtosis: 68.561863
Mean: 1.053
Median Absolute Deviation (MAD): 0
Skewness: 7.6416648
Sum: 10530
Variance: 0.1090019
Monotonicity: Not monotonic
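
Because 차수 takes only six distinct values, the sum, mean, variance, and coefficient of variation can be recomputed from the value counts reported for this variable. A minimal sketch; the use of the sample variance (n - 1 denominator) is an assumption based on the match with the reported figure:

```python
import math

# Value counts copied from the frequency table for 차수.
value_counts = {1: 9685, 2: 155, 3: 127, 4: 13, 5: 18, 6: 2}

n = sum(value_counts.values())
total = sum(value * count for value, count in value_counts.items())
mean = total / n

# Sample variance (n - 1 in the denominator), assumed from the match with the report.
squared_dev = sum(count * (value - mean) ** 2 for value, count in value_counts.items())
variance = squared_dev / (n - 1)
std = math.sqrt(variance)
cv = std / mean

print(f"Sum: {total}, Mean: {mean:.3f}")
print(f"Variance: {variance:.7f}, Std: {std:.8f}, CV: {cv:.7f}")
```

The coefficient of variation is the standard deviation divided by the mean, so it is unitless.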
Histogram with fixed size bins (bins=6)

Common values

Value | Count | Frequency (%)
1 | 9685 | 96.9%
2 | 155 | 1.6%
3 | 127 | 1.3%
5 | 18 | 0.2%
4 | 13 | 0.1%
6 | 2 | < 0.1%

Minimum 10 values

Value | Count | Frequency (%)
1 | 9685 | 96.9%
2 | 155 | 1.6%
3 | 127 | 1.3%
4 | 13 | 0.1%
5 | 18 | 0.2%
6 | 2 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
6 | 2 | < 0.1%
5 | 18 | 0.2%
4 | 13 | 0.1%
3 | 127 | 1.3%
2 | 155 | 1.6%
1 | 9685 | 96.9%

업체명
Text

Distinct: 548
Distinct (%): 5.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 27
Median length: 18
Mean length: 7.6136
Min length: 2

Characters and Unicode

Total characters: 76136
Distinct characters: 394
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: 로그인 편의점
2nd row: 마켓오리진17
3rd row: 할렐루야마트
4th row: 현대홈마트(반구점)
5th row: 왕마트(둔촌동)

Words

Value | Count | Frequency (%)
주식회사 | 314 | 2.5%
포시즌 | 252 | 2.0%
광주타이어킹점 | 225 | 1.8%
엔마트 | 206 | 1.6%
진영알뜰마트 | 180 | 1.4%
레몬비 | 172 | 1.4%
포시즌마트 | 147 | 1.2%
청송(영덕)휴게소 | 133 | 1.1%
사러가 | 131 | 1.0%
js마트 | 116 | 0.9%
Other values (599) | 10680 | 85.1%

Most occurring characters

Value | Count | Frequency (%)
[Hangul] | 6524 | 8.6%
[Hangul] | 6205 | 8.1%
) | 2949 | 3.9%
( | 2887 | 3.8%
[Hangul] | 2870 | 3.8%
[space] | 2556 | 3.4%
[Hangul] | 1998 | 2.6%
[Hangul] | 1311 | 1.7%
[Hangul] | 1163 | 1.5%
[Hangul] | 944 | 1.2%
Other values (384) | 46729 | 61.4%

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 65312 | 85.8%
Close Punctuation | 2949 | 3.9%
Open Punctuation | 2887 | 3.8%
Space Separator | 2556 | 3.4%
Decimal Number | 1211 | 1.6%
Uppercase Letter | 1047 | 1.4%
Lowercase Letter | 85 | 0.1%
Other Punctuation | 60 | 0.1%
Dash Punctuation | 21 | < 0.1%
Math Symbol | 8 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
[Hangul] | 6524 | 10.0%
[Hangul] | 6205 | 9.5%
[Hangul] | 2870 | 4.4%
[Hangul] | 1998 | 3.1%
[Hangul] | 1311 | 2.0%
[Hangul] | 1163 | 1.8%
[Hangul] | 944 | 1.4%
[Hangul] | 878 | 1.3%
[Hangul] | 841 | 1.3%
[Hangul] | 822 | 1.3%
Other values (343) | 41756 | 63.9%

Uppercase Letter
Value | Count | Frequency (%)
S | 164 | 15.7%
C | 160 | 15.3%
J | 130 | 12.4%
D | 126 | 12.0%
G | 103 | 9.8%
L | 80 | 7.6%
Q | 58 | 5.5%
A | 43 | 4.1%
M | 29 | 2.8%
T | 27 | 2.6%
Other values (7) | 127 | 12.1%

Decimal Number
Value | Count | Frequency (%)
2 | 365 | 30.1%
5 | 233 | 19.2%
0 | 158 | 13.0%
1 | 148 | 12.2%
3 | 102 | 8.4%
4 | 94 | 7.8%
6 | 57 | 4.7%
7 | 41 | 3.4%
8 | 7 | 0.6%
9 | 6 | 0.5%

Lowercase Letter
Value | Count | Frequency (%)
a | 22 | 25.9%
r | 11 | 12.9%
t | 11 | 12.9%
y | 11 | 12.9%
i | 11 | 12.9%
l | 11 | 12.9%
w | 8 | 9.4%

Other Punctuation
Value | Count | Frequency (%)
* | 48 | 80.0%
. | 12 | 20.0%

Close Punctuation
Value | Count | Frequency (%)
) | 2949 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 2887 | 100.0%

Space Separator
Value | Count | Frequency (%)
[space] | 2556 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 21 | 100.0%

Math Symbol
Value | Count | Frequency (%)
+ | 8 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 65312 | 85.8%
Common | 9692 | 12.7%
Latin | 1132 | 1.5%

Most frequent character per script

Hangul
Value | Count | Frequency (%)
[Hangul] | 6524 | 10.0%
[Hangul] | 6205 | 9.5%
[Hangul] | 2870 | 4.4%
[Hangul] | 1998 | 3.1%
[Hangul] | 1311 | 2.0%
[Hangul] | 1163 | 1.8%
[Hangul] | 944 | 1.4%
[Hangul] | 878 | 1.3%
[Hangul] | 841 | 1.3%
[Hangul] | 822 | 1.3%
Other values (343) | 41756 | 63.9%

Latin
Value | Count | Frequency (%)
S | 164 | 14.5%
C | 160 | 14.1%
J | 130 | 11.5%
D | 126 | 11.1%
G | 103 | 9.1%
L | 80 | 7.1%
Q | 58 | 5.1%
A | 43 | 3.8%
M | 29 | 2.6%
T | 27 | 2.4%
Other values (14) | 212 | 18.7%

Common
Value | Count | Frequency (%)
) | 2949 | 30.4%
( | 2887 | 29.8%
[space] | 2556 | 26.4%
2 | 365 | 3.8%
5 | 233 | 2.4%
0 | 158 | 1.6%
1 | 148 | 1.5%
3 | 102 | 1.1%
4 | 94 | 1.0%
6 | 57 | 0.6%
Other values (7) | 143 | 1.5%

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 65312 | 85.8%
ASCII | 10824 | 14.2%

Most frequent character per block

Hangul
Value | Count | Frequency (%)
[Hangul] | 6524 | 10.0%
[Hangul] | 6205 | 9.5%
[Hangul] | 2870 | 4.4%
[Hangul] | 1998 | 3.1%
[Hangul] | 1311 | 2.0%
[Hangul] | 1163 | 1.8%
[Hangul] | 944 | 1.4%
[Hangul] | 878 | 1.3%
[Hangul] | 841 | 1.3%
[Hangul] | 822 | 1.3%
Other values (343) | 41756 | 63.9%

ASCII
Value | Count | Frequency (%)
) | 2949 | 27.2%
( | 2887 | 26.7%
[space] | 2556 | 23.6%
2 | 365 | 3.4%
5 | 233 | 2.2%
S | 164 | 1.5%
C | 160 | 1.5%
0 | 158 | 1.5%
1 | 148 | 1.4%
J | 130 | 1.2%
Other values (31) | 1074 | 9.9%

송신일시
Date

Distinct: 4908
Distinct (%): 49.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2022-10-01 00:00:35
Maximum: 2022-11-28 11:19:26
Histogram with fixed size bins (bins=50)

수신일시
Date

MISSING 

Distinct: 130
Distinct (%): 77.8%
Missing: 9833
Missing (%): 98.3%
Memory size: 156.2 KiB
Minimum: 2022-10-04 11:22:34
Maximum: 2023-03-25 21:44:07
Histogram with fixed size bins (bins=50)

처리상태
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
<NA> | 9833
NTS | 165
BLK | 2

Length

Max length: 4
Median length: 4
Mean length: 3.9833
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 9833 | 98.3%
NTS | 165 | 1.7%
BLK | 2 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 9833 | 98.3%
nts | 165 | 1.7%
blk | 2 | < 0.1%
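
The report lists 처리상태 with 0 missing values even though `<NA>` appears 9833 times, and the mean length of 3.9833 is what results when `<NA>` is counted as a four-character string alongside the three-character codes. This suggests, as an inference from the figures only, that `<NA>` was profiled as an ordinary category. A minimal sketch of both readings, using the counts from this section:

```python
# Counts copied from the "Common Values" table for 처리상태.
status_counts = {"<NA>": 9833, "NTS": 165, "BLK": 2}

n = sum(status_counts.values())

# Mean string length when "<NA>" is kept as an ordinary four-character category.
mean_length = sum(len(value) * count for value, count in status_counts.items()) / n

# Treat the placeholder as missing (assumption) and keep only real status codes.
real_counts = {v: c for v, c in status_counts.items() if v != "<NA>"}
missing_pct = 100 * status_counts["<NA>"] / n

print(f"Mean length: {mean_length:.4f}")
print(f"Missing if '<NA>' is null: {missing_pct:.1f}%")
print(f"Remaining codes: {real_counts}")
```

Read this way, only 167 rows carry a real status code, the same number of rows that have a 수신일시 value.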

사업자등록번호
Text

Distinct: 550
Distinct (%): 5.5%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 120000
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: 185-03-02256
2nd row: 191-81-00703
3rd row: 221-05-47835
4th row: 620-14-10333
5th row: 775-37-00420

Words

Value | Count | Frequency (%)
511-45-00215 | 225 | 2.2%
799-04-00879 | 180 | 1.8%
842-85-00395 | 133 | 1.3%
868-27-00607 | 130 | 1.3%
303-13-08236 | 116 | 1.2%
106-85-25515 | 115 | 1.1%
111-85-00758 | 108 | 1.1%
217-81-27663 | 84 | 0.8%
411-93-11373 | 72 | 0.7%
295-11-00171 | 71 | 0.7%
Other values (540) | 8766 | 87.7%

Most occurring characters

Value | Count | Frequency (%)
- | 20000 | 16.7%
0 | 17304 | 14.4%
1 | 15225 | 12.7%
2 | 9915 | 8.3%
8 | 9317 | 7.8%
3 | 9026 | 7.5%
6 | 8979 | 7.5%
5 | 8908 | 7.4%
4 | 7880 | 6.6%
7 | 7451 | 6.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 100000 | 83.3%
Dash Punctuation | 20000 | 16.7%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 17304 | 17.3%
1 | 15225 | 15.2%
2 | 9915 | 9.9%
8 | 9317 | 9.3%
3 | 9026 | 9.0%
6 | 8979 | 9.0%
5 | 8908 | 8.9%
4 | 7880 | 7.9%
7 | 7451 | 7.5%
9 | 5995 | 6.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 20000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 120000 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
- | 20000 | 16.7%
0 | 17304 | 14.4%
1 | 15225 | 12.7%
2 | 9915 | 8.3%
8 | 9317 | 7.8%
3 | 9026 | 7.5%
6 | 8979 | 7.5%
5 | 8908 | 7.4%
4 | 7880 | 6.6%
7 | 7451 | 6.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 120000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
- | 20000 | 16.7%
0 | 17304 | 14.4%
1 | 15225 | 12.7%
2 | 9915 | 8.3%
8 | 9317 | 7.8%
3 | 9026 | 7.5%
6 | 8979 | 7.5%
5 | 8908 | 7.4%
4 | 7880 | 6.6%
7 | 7451 | 6.2%
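
The character statistics for 사업자등록번호 (every value 12 characters long, 20000 dashes and 100000 digits across 10000 rows) are consistent with a fixed ddd-dd-ddddd layout. A minimal sketch of a format check built on that observation; the pattern is inferred from the sample rows, `looks_like_registration_number` is an illustrative name, and no checksum is verified:

```python
import re

# Pattern inferred from the sample rows: three digits, two digits, five digits.
REG_NO_PATTERN = re.compile(r"[0-9]{3}-[0-9]{2}-[0-9]{5}")

def looks_like_registration_number(value):
    """Return True when value has the ddd-dd-ddddd shape seen in this column."""
    return REG_NO_PATTERN.fullmatch(value) is not None

# Sample rows copied from the report.
samples = ["185-03-02256", "191-81-00703", "221-05-47835", "620-14-10333", "775-37-00420"]

print(all(looks_like_registration_number(s) for s in samples))
print(looks_like_registration_number("1850302256"))
```

A check like this only catches layout problems, such as missing dashes; it says nothing about whether a number is actually registered.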

전송상태
Categorical

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
CP | 10000

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CP
2nd row: CP
3rd row: CP
4th row: CP
5th row: CP

Common Values

Value | Count | Frequency (%)
CP | 10000 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
cp | 10000 | 100.0%

재고단위
Categorical

HIGH CORRELATION, IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
<NA> | 9965
PCS | 35

Length

Max length: 4
Median length: 4
Mean length: 3.9965
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: <NA>
2nd row: <NA>
3rd row: <NA>
4th row: <NA>
5th row: <NA>

Common Values

Value | Count | Frequency (%)
<NA> | 9965 | 99.7%
PCS | 35 | 0.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
na | 9965 | 99.7%
pcs | 35 | 0.4%

Correlations

Variable | 검사기관명 | 문서번호 | 차수 | 처리상태
검사기관명 | 1.000 | 1.000 | 0.251 | 0.249
문서번호 | 1.000 | 1.000 | 0.084 | 0.000
차수 | 0.251 | 0.084 | 1.000 | 0.000
처리상태 | 0.249 | 0.000 | 0.000 | 1.000

Variable | 처리상태 | 검사기관명 | 재고단위
처리상태 | 1.000 | 0.262 | 1.000
검사기관명 | 0.262 | 1.000 | 1.000
재고단위 | 1.000 | 1.000 | 1.000

Variable | 문서번호 | 차수 | 검사기관명 | 처리상태 | 재고단위
문서번호 | 1.000 | -0.059 | 0.992 | 0.000 | 1.000
차수 | -0.059 | 1.000 | 0.142 | 0.000 | 1.000
검사기관명 | 0.992 | 0.142 | 1.000 | 0.262 | 1.000
처리상태 | 0.000 | 0.000 | 0.262 | 1.000 | 1.000
재고단위 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

(index) | 검사기관명 | 문서번호 | 차수 | 업체명 | 송신일시 | 수신일시 | 처리상태 | 사업자등록번호 | 전송상태 | 재고단위
88841 | 식약처 식품 | 3000160585 | 1 | 로그인 편의점 | 2022-11-21 10:41:47 | <NA> | <NA> | 185-03-02256 | CP | <NA>
46333 | 산업통상자원부 국가기술표준원 | 10018391 | 1 | 마켓오리진17 | 2022-10-24 17:30:49 | <NA> | <NA> | 191-81-00703 | CP | <NA>
23985 | 식품의약품안전처 화장품정책과 | 202210200001 | 1 | 할렐루야마트 | 2022-10-21 07:20:28 | <NA> | <NA> | 221-05-47835 | CP | <NA>
44465 | 산업통상자원부 국가기술표준원 | 10018380 | 1 | 현대홈마트(반구점) | 2022-10-24 17:19:30 | <NA> | <NA> | 620-14-10333 | CP | <NA>
47092 | 산업통상자원부 국가기술표준원 | 10018352 | 1 | 왕마트(둔촌동) | 2022-10-24 17:36:32 | <NA> | <NA> | 775-37-00420 | CP | <NA>
94127 | 식약처 식품 | 3000160467 | 1 | 매일마트 | 2022-11-27 09:08:05 | <NA> | <NA> | 610-21-58658 | CP | <NA>
11340 | 식품의약품안전처 의약외품정책과 | 202202160025 | 1 | 진영알뜰마트 | 2022-10-08 10:05:33 | <NA> | <NA> | 799-04-00879 | CP | <NA>
16551 | 식약처 식품 | 3000158729 | 1 | 레몬비 덕천연수원점 | 2022-10-17 18:56:42 | <NA> | <NA> | 446-06-01304 | CP | <NA>
71765 | 식약처 식품 | 3000153128 | 1 | 보광할인마트 | 2022-11-09 09:04:08 | <NA> | <NA> | 106-85-25515 | CP | <NA>
82188 | 식품의약품안전처 의약외품정책과 | 202211160006 | 1 | 우리홈마트 | 2022-11-16 19:53:44 | <NA> | <NA> | 211-78-88532 | CP | <NA>

Last rows

(index) | 검사기관명 | 문서번호 | 차수 | 업체명 | 송신일시 | 수신일시 | 처리상태 | 사업자등록번호 | 전송상태 | 재고단위
18467 | 환경부 화학제품과 | 202210130005 | 2 | 포토피아슈퍼마켓 | 2022-10-18 06:55:39 | <NA> | <NA> | 605-81-98462 | CP | <NA>
91093 | 식약처 식품 | 3000160788 | 1 | (주)부자 군위휴게소(하) | 2022-11-24 15:46:38 | <NA> | <NA> | 508-85-04448 | CP | <NA>
73742 | 식약처 식품 | 3000160173 | 1 | 삼성빅마트 | 2022-11-10 19:05:18 | <NA> | <NA> | 305-32-33727 | CP | <NA>
11572 | 산업통상자원부 국가기술표준원 | 10017967 | 1 | 코코마트 | 2022-10-11 15:34:01 | <NA> | <NA> | 314-86-63344 | CP | <NA>
55555 | 산업통상자원부 국가기술표준원 | 10018310 | 1 | 경성대학생회관매점 | 2022-10-25 09:00:04 | <NA> | <NA> | 617-18-39841 | CP | <NA>
75897 | 산업통상자원부 국가기술표준원 | 10018464 | 1 | (주)에이치앤디이 송산휴게소 | 2022-11-14 15:41:42 | <NA> | <NA> | 143-85-02199 | CP | <NA>
36376 | 산업통상자원부 국가기술표준원 | 10018334 | 1 | 홈마트 만촌점 | 2022-10-24 16:38:45 | <NA> | <NA> | 502-26-28974 | CP | <NA>
85877 | 식품의약품안전처 의약외품정책과 | 202110190014 | 1 | JS마트 | 2022-11-17 11:03:49 | <NA> | <NA> | 303-13-08236 | CP | <NA>
55941 | 산업통상자원부 국가기술표준원 | 10018306 | 1 | 대성수퍼 | 2022-10-25 09:01:17 | <NA> | <NA> | 133-25-06821 | CP | <NA>
57379 | 환경부 화학제품과 | 202210250012 | 1 | 뉴아람마트(회원점) | 2022-10-25 22:55:11 | <NA> | <NA> | 608-18-81103 | CP | <NA>