Overview

Dataset statistics

Number of variables: 6
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 576.2 KiB
Average record size in memory: 59.0 B
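
The last two figures are related: the average record size is the total in-memory size divided by the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB means 1024 bytes (the variable names are illustrative and not part of the report):

```python
# Values copied from the dataset statistics in this report.
total_size_kib = 576.2
n_observations = 10_000

# Assumption: 1 KiB = 1024 bytes.
total_size_bytes = total_size_kib * 1024

# Average number of bytes held in memory per row.
avg_record_bytes = total_size_bytes / n_observations

print(f"{avg_record_bytes:.1f} B")
```

Rounded to one decimal place, the result agrees with the 59.0 B reported.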

Variable types

Text: 2
DateTime: 1
Numeric: 3

Dataset

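Description: Training data on consumer complaints from the Fair Trade Commission (공정거래위원회), presented as case records by receiving institution. The data includes the case title, institution code, processing result code, and other fields.
Author: 공정거래위원회 (Fair Trade Commission)
URL: https://www.data.go.kr/data/15098333/fileData.do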

Alerts

사건번호(ACCIDENT_NO) has unique values (Unique)

Reproduction

Analysis started: 2023-12-12 23:25:13.851198
Analysis finished: 2023-12-12 23:25:16.410310
Duration: 2.56 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

사건번호(ACCIDENT_NO) (Text)

Distinct: 10000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 12
Mean length: 12
Min length: 12

Characters and Unicode

Total characters: 120000
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
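
As a rough illustration of how such character properties can be read, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. It is not the profiler's own implementation. The helper name `category_counts` is hypothetical, and the input is limited to the five sample case numbers shown in this report.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            # e.g. "Nd" = Decimal Number, "Pd" = Dash Punctuation
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample case numbers from this report (not the full column).
samples = [
    "2020-0060988",
    "2020-0032642",
    "2020-0005438",
    "2020-0008838",
    "2020-0003166",
]

counts = category_counts(samples)
print(dict(counts))
```

On these samples only two categories appear, decimal digits and the dash, which is consistent with the two distinct categories reported for the full column.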

Unique

Unique: 10000
Unique (%): 100.0%

Sample

1st row: 2020-0060988
2nd row: 2020-0032642
3rd row: 2020-0005438
4th row: 2020-0008838
5th row: 2020-0003166
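
The sample rows suggest a fixed layout: a four-digit year, a hyphen, and a seven-digit sequence, which fits the constant length of 12 and the 11 distinct characters reported. The sketch below checks values against that layout. The pattern is an assumption inferred from the samples, not a documented specification, and `is_valid_case_no` is a hypothetical helper.

```python
import re

# Assumed layout inferred from the samples: YYYY-NNNNNNN (ASCII digits only).
CASE_NO_PATTERN = re.compile(r"[0-9]{4}-[0-9]{7}")


def is_valid_case_no(value: str) -> bool:
    """Return True if `value` matches the assumed case-number layout exactly."""
    return CASE_NO_PATTERN.fullmatch(value) is not None


samples = [
    "2020-0060988",
    "2020-0032642",
    "2020-0005438",
    "2020-0008838",
    "2020-0003166",
]

for case_no in samples:
    print(case_no, is_valid_case_no(case_no))
```

A check like this can flag malformed identifiers before profiling, but the pattern should be confirmed against the publisher's documentation before it is relied on.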
Value                Count  Frequency (%)
2020-0060988         1      < 0.1%
2020-0093995         1      < 0.1%
2020-0048906         1      < 0.1%
2020-0059831         1      < 0.1%
2020-0089053         1      < 0.1%
2020-0083063         1      < 0.1%
2020-0043452         1      < 0.1%
2020-0066773         1      < 0.1%
2020-0054060         1      < 0.1%
2020-0065958         1      < 0.1%
Other values (9990)  9990   99.9%

Most occurring characters

Value  Count  Frequency (%)
0      44728  37.3%
2      24906  20.8%
-      10000  8.3%
1      6086   5.1%
7      4994   4.2%
3      4961   4.1%
6      4918   4.1%
9      4886   4.1%
4      4868   4.1%
5      4834   4.0%

Most occurring categories

Value             Count   Frequency (%)
Decimal Number    110000  91.7%
Dash Punctuation  10000   8.3%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      44728  40.7%
2      24906  22.6%
1      6086   5.5%
7      4994   4.5%
3      4961   4.5%
6      4918   4.5%
9      4886   4.4%
4      4868   4.4%
5      4834   4.4%
8      4819   4.4%

Dash Punctuation
Value  Count  Frequency (%)
-      10000  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  120000  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      44728  37.3%
2      24906  20.8%
-      10000  8.3%
1      6086   5.1%
7      4994   4.2%
3      4961   4.1%
6      4918   4.1%
9      4886   4.1%
4      4868   4.1%
5      4834   4.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  120000  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      44728  37.3%
2      24906  20.8%
-      10000  8.3%
1      6086   5.1%
7      4994   4.2%
3      4961   4.1%
6      4918   4.1%
9      4886   4.1%
4      4868   4.1%
5      4834   4.0%

접수일자(RCPT_YMD) (DateTime)

Distinct: 56
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2020-01-01 00:00:00
Maximum: 2020-02-27 00:00:00
Histogram with fixed size bins (bins=50)
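
The minimum and maximum bound how many distinct dates the column could hold. The sketch below compares that bound with the reported distinct count, assuming every value is a whole calendar day (both extremes carry a time of 00:00:00). Variable names are illustrative.

```python
from datetime import date

# Extremes and distinct count copied from the RCPT_YMD statistics.
first_day = date(2020, 1, 1)
last_day = date(2020, 2, 27)
distinct_dates = 56

# Inclusive number of calendar days between the two extremes.
calendar_days = (last_day - first_day).days + 1

# Days inside the range on which no complaint was received.
days_without_records = calendar_days - distinct_dates

print(calendar_days, days_without_records)
```

Under that assumption the range spans 58 calendar days, so 2 days in the period have no records.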

기관코드(INSTITUTION_CODE) (Numeric)

Distinct: 149
Distinct (%): 1.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 35955.484
Minimum: 10000
Maximum: 41711
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 10000
5-th percentile: 10000
Q1: 40239
Median: 40440
Q3: 41004
95-th percentile: 41704
Maximum: 41711
Range: 31711
Interquartile range (IQR): 765

Descriptive statistics

Standard deviation: 10636.126
Coefficient of variation (CV): 0.29581375
Kurtosis: 1.9458606
Mean: 35955.484
Median Absolute Deviation (MAD): 339
Skewness: -1.94231
Sum: 3.5955484 × 10^8
Variance: 1.1312719 × 10^8
Monotonicity: Not monotonic
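
Several of the figures above are derived from others, so they can be cross-checked. The sketch below recomputes the range, IQR, coefficient of variation, variance and sum from the reported values for 기관코드(INSTITUTION_CODE). It uses the standard textbook definitions and works from rounded inputs, so agreement is only expected to within rounding.

```python
import math

# Reported statistics for INSTITUTION_CODE.
n = 10_000
minimum, maximum = 10000, 41711
q1, q3 = 40239, 41004
mean, std = 35955.484, 10636.126

value_range = maximum - minimum  # Range = max - min
iqr = q3 - q1                    # IQR = Q3 - Q1
cv = std / mean                  # CV = standard deviation / mean
variance = std ** 2              # Variance = standard deviation squared
total = mean * n                 # Sum = mean * number of observations

print(value_range, iqr)
print(f"{cv:.6f} {variance:.6e} {total:.6e}")

# The recomputed values should agree with the report up to rounding.
assert math.isclose(cv, 0.29581375, rel_tol=1e-6)
assert math.isclose(variance, 1.1312719e8, rel_tol=1e-6)
```

The large standard deviation relative to the IQR reflects the cluster of records at code 10000, far below the bulk of the codes in the 40000 range.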
Histogram with fixed size bins (bins=50)
Value               Count  Frequency (%)
10000               1383   13.8%
41101               435    4.3%
41004               320    3.2%
40314               309    3.1%
41105               277    2.8%
40305               257    2.6%
40315               192    1.9%
40514               188    1.9%
41104               174    1.7%
41704               169    1.7%
Other values (139)  6296   63.0%

Value  Count  Frequency (%)
10000  1383   13.8%
20100  3      < 0.1%
30100  28     0.3%
30200  29     0.3%
30300  17     0.2%
30400  21     0.2%
30500  15     0.1%
30600  16     0.2%
30700  25     0.2%
30800  167    1.7%

Value  Count  Frequency (%)
41711  148    1.5%
41709  120    1.2%
41707  24     0.2%
41706  90     0.9%
41705  60     0.6%
41704  169    1.7%
41703  125    1.2%
41702  102    1.0%
41700  7      0.1%
41109  18     0.2%

사건제목(ACCIDENT_TITLE) (Text)

Distinct: 9855
Distinct (%): 98.6%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 96
Median length: 62
Mean length: 21.6733
Min length: 1

Characters and Unicode

Total characters: 216733
Distinct characters: 1082
Distinct categories: 12
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9752
Unique (%): 97.5%

Sample

1st row: 우한폐렴 확산으로 인한 나트랑 여행 환불불가의 부당함에 이의제기
2nd row: 배송 완료된 제품에 대해 가격 오기를 이유로 사업자가 차액 청구함
3rd row: 체크페이 앱에서 온누리 모바일상품권 등록할때 불편해요
4th row: [포털][카카오] 전자상거래 피해구제신청 - 한국소비자원/[기타 정보]
5th row: 기계구입후 고장으로인한 교환문의
Value                 Count  Frequency (%)
문의                  2945   5.4%
환불                  972    1.8%
(value missing)       719    1.3%
취소                  689    1.3%
인한                  627    1.1%
관련                  581    1.1%
위약금                573    1.0%
요청                  561    1.0%
(value missing)       547    1.0%
요구                  397    0.7%
Other values (14931)  46193  84.3%

Most occurring characters

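Value                Count   Frequency (%)
(space)              47978   22.1%
(value missing)      5128    2.4%
(value missing)      4444    2.1%
(value missing)      3064    1.4%
(value missing)      3009    1.4%
(value missing)      2891    1.3%
(value missing)      2775    1.3%
(value missing)      2772    1.3%
(value missing)      2743    1.3%
(value missing)      2563    1.2%
Other values (1072)  139366  64.3%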

Most occurring categories

Value              Count   Frequency (%)
Other Letter       163804  75.6%
Space Separator    47978   22.1%
Decimal Number     1725    0.8%
Uppercase Letter   972     0.4%
Other Punctuation  909     0.4%
Close Punctuation  428     0.2%
Open Punctuation   385     0.2%
Lowercase Letter   365     0.2%
Dash Punctuation   99      < 0.1%
Math Symbol        65      < 0.1%
Other values (2)   3       < 0.1%

Most frequent character per category

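Other Letter
Value               Count   Frequency (%)
(value missing)     5128    3.1%
(value missing)     4444    2.7%
(value missing)     3064    1.9%
(value missing)     3009    1.8%
(value missing)     2891    1.8%
(value missing)     2775    1.7%
(value missing)     2772    1.7%
(value missing)     2743    1.7%
(value missing)     2563    1.6%
(value missing)     2543    1.6%
Other values (992)  131872  80.5%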

Uppercase Letter
Value              Count  Frequency (%)
T                  169    17.4%
S                  157    16.2%
A                  130    13.4%
V                  80     8.2%
G                  67     6.9%
L                  65     6.7%
K                  62     6.4%
C                  47     4.8%
P                  46     4.7%
X                  23     2.4%
Other values (14)  126    13.0%

Lowercase Letter
Value              Count  Frequency (%)
s                  89     24.4%
a                  77     21.1%
t                  50     13.7%
v                  33     9.0%
k                  20     5.5%
p                  17     4.7%
l                  12     3.3%
g                  10     2.7%
c                  10     2.7%
e                  8      2.2%
Other values (12)  39     10.7%

Other Punctuation
Value            Count  Frequency (%)
.                631    69.4%
/                196    21.6%
%                24     2.6%
*                23     2.5%
'                12     1.3%
!                7      0.8%
;                6      0.7%
:                5      0.6%
(value missing)  2      0.2%
&                2      0.2%

Decimal Number
Value  Count  Frequency (%)
1      498    28.9%
0      283    16.4%
2      254    14.7%
3      238    13.8%
5      159    9.2%
4      87     5.0%
9      69     4.0%
7      48     2.8%
8      45     2.6%
6      44     2.6%

Math Symbol
Value  Count  Frequency (%)
>      19     29.2%
<      18     27.7%
+      12     18.5%
=      8      12.3%
~      8      12.3%

Close Punctuation
Value  Count  Frequency (%)
)      383    89.5%
]      45     10.5%

Open Punctuation
Value  Count  Frequency (%)
(      341    88.6%
[      44     11.4%

Space Separator
Value    Count  Frequency (%)
(space)  47978  100.0%

Dash Punctuation
Value  Count  Frequency (%)
-      99     100.0%

Connector Punctuation
Value  Count  Frequency (%)
_      2      100.0%

Control
Value            Count  Frequency (%)
(not displayed)  1      100.0%

Most occurring scripts

Value   Count   Frequency (%)
Hangul  163804  75.6%
Common  51592   23.8%
Latin   1337    0.6%

Most frequent character per script

Hangul
Value               Count   Frequency (%)
(value missing)     5128    3.1%
(value missing)     4444    2.7%
(value missing)     3064    1.9%
(value missing)     3009    1.8%
(value missing)     2891    1.8%
(value missing)     2775    1.7%
(value missing)     2772    1.7%
(value missing)     2743    1.7%
(value missing)     2563    1.6%
(value missing)     2543    1.6%
Other values (992)  131872  80.5%

Latin
Value              Count  Frequency (%)
T                  169    12.6%
S                  157    11.7%
A                  130    9.7%
s                  89     6.7%
V                  80     6.0%
a                  77     5.8%
G                  67     5.0%
L                  65     4.9%
K                  62     4.6%
t                  50     3.7%
Other values (36)  391    29.2%

Common
Value              Count  Frequency (%)
(space)            47978  93.0%
.                  631    1.2%
1                  498    1.0%
)                  383    0.7%
(                  341    0.7%
0                  283    0.5%
2                  254    0.5%
3                  238    0.5%
/                  196    0.4%
5                  159    0.3%
Other values (24)  631    1.2%

Most occurring blocks

Value        Count   Frequency (%)
Hangul       163794  75.6%
ASCII        52927   24.4%
Compat Jamo  10      < 0.1%
Punctuation  2       < 0.1%

Most frequent character per block

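ASCII
Value              Count  Frequency (%)
(space)            47978  90.6%
.                  631    1.2%
1                  498    0.9%
)                  383    0.7%
(                  341    0.6%
0                  283    0.5%
2                  254    0.5%
3                  238    0.4%
/                  196    0.4%
T                  169    0.3%
Other values (69)  1956   3.7%

Hangul
Value               Count   Frequency (%)
(value missing)     5128    3.1%
(value missing)     4444    2.7%
(value missing)     3064    1.9%
(value missing)     3009    1.8%
(value missing)     2891    1.8%
(value missing)     2775    1.7%
(value missing)     2772    1.7%
(value missing)     2743    1.7%
(value missing)     2563    1.6%
(value missing)     2543    1.6%
Other values (983)  131862  80.5%

Punctuation
Value            Count  Frequency (%)
(value missing)  2      100.0%

Compat Jamo
Value            Count  Frequency (%)
(value missing)  2      20.0%
(value missing)  1      10.0%
(value missing)  1      10.0%
(value missing)  1      10.0%
(value missing)  1      10.0%
(value missing)  1      10.0%
(value missing)  1      10.0%
(value missing)  1      10.0%
(value missing)  1      10.0%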

상담이유코드(DSCSN_REASON_CODE) (Numeric)

Distinct: 16
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 609.4348
Minimum: 601
Maximum: 616
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 601
5-th percentile: 603
Q1: 607
Median: 609
Q3: 611
95-th percentile: 616
Maximum: 616
Range: 15
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 3.4644695
Coefficient of variation (CV): 0.0056847255
Kurtosis: -0.13168962
Mean: 609.4348
Median Absolute Deviation (MAD): 2
Skewness: 0.21149008
Sum: 6094348
Variance: 12.002549
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=16)
Value             Count  Frequency (%)
611               2753   27.5%
608               1785   17.8%
607               1632   16.3%
616               1006   10.1%
606               961    9.6%
610               575    5.8%
615               417    4.2%
603               276    2.8%
601               137    1.4%
602               133    1.3%
Other values (6)  325    3.2%

Value  Count  Frequency (%)
601    137    1.4%
602    133    1.3%
603    276    2.8%
604    49     0.5%
605    7      0.1%
606    961    9.6%
607    1632   16.3%
608    1785   17.8%
609    117    1.2%
610    575    5.8%

Value  Count  Frequency (%)
616    1006   10.1%
615    417    4.2%
614    19     0.2%
613    117    1.2%
612    16     0.2%
611    2753   27.5%
610    575    5.8%
609    117    1.2%
608    1785   17.8%
607    1632   16.3%

처리결과코드(PRCS_RESULT_CODE) (Numeric)

Distinct: 25
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 528.2863
Minimum: 401
Maximum: 612
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 401
5-th percentile: 501
Q1: 502
Median: 509
Q3: 527
95-th percentile: 610
Maximum: 612
Range: 211
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 40.45611
Coefficient of variation (CV): 0.076579895
Kurtosis: -0.036253472
Mean: 528.2863
Median Absolute Deviation (MAD): 8
Skewness: 1.3224907
Sum: 5282863
Variance: 1636.6968
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=25)
Value              Count  Frequency (%)
509                2391   23.9%
501                2298   23.0%
502                1103   11.0%
527                1092   10.9%
610                632    6.3%
603                498    5.0%
505                270    2.7%
504                237    2.4%
605                233    2.3%
510                192    1.9%
Other values (15)  1054   10.5%

Value  Count  Frequency (%)
401    3      < 0.1%
501    2298   23.0%
502    1103   11.0%
504    237    2.4%
505    270    2.7%
506    17     0.2%
507    113    1.1%
509    2391   23.9%
510    192    1.9%
511    123    1.2%

Value  Count  Frequency (%)
612    9      0.1%
611    1      < 0.1%
610    632    6.3%
609    72     0.7%
608    79     0.8%
607    111    1.1%
606    51     0.5%
605    233    2.3%
604    184    1.8%
603    498    5.0%

Interactions

[Figures: nine pairwise interaction plots]

Correlations

 | 접수일자(RCPT_YMD) | 기관코드(INSTITUTION_CODE) | 상담이유코드(DSCSN_REASON_CODE) | 처리결과코드(PRCS_RESULT_CODE)
접수일자(RCPT_YMD) | 1.000 | 0.667 | 0.159 | 0.095
기관코드(INSTITUTION_CODE) | 0.667 | 1.000 | 0.010 | 0.126
상담이유코드(DSCSN_REASON_CODE) | 0.159 | 0.010 | 1.000 | 0.316
처리결과코드(PRCS_RESULT_CODE) | 0.095 | 0.126 | 0.316 | 1.000

 | 기관코드(INSTITUTION_CODE) | 상담이유코드(DSCSN_REASON_CODE) | 처리결과코드(PRCS_RESULT_CODE)
기관코드(INSTITUTION_CODE) | 1.000 | -0.012 | 0.024
상담이유코드(DSCSN_REASON_CODE) | -0.012 | 1.000 | -0.091
처리결과코드(PRCS_RESULT_CODE) | 0.024 | -0.091 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

(row index) | 사건번호(ACCIDENT_NO) | 접수일자(RCPT_YMD) | 기관코드(INSTITUTION_CODE) | 사건제목(ACCIDENT_TITLE) | 상담이유코드(DSCSN_REASON_CODE) | 처리결과코드(PRCS_RESULT_CODE)
51890 | 2020-0060988 | 2020-01-31 | 40515 | 우한폐렴 확산으로 인한 나트랑 여행 환불불가의 부당함에 이의제기 | 611 | 527
27566 | 2020-0032642 | 2020-01-17 | 30200 | 배송 완료된 제품에 대해 가격 오기를 이유로 사업자가 차액 청구함 | 603 | 502
4590 | 2020-0005438 | 2020-01-03 | 10000 | 체크페이 앱에서 온누리 모바일상품권 등록할때 불편해요 | 602 | 502
7854 | 2020-0008838 | 2020-01-06 | 10000 | [포털][카카오] 전자상거래 피해구제신청 - 한국소비자원/[기타 정보] | 615 | 509
2879 | 2020-0003166 | 2020-01-03 | 41702 | 기계구입후 고장으로인한 교환문의 | 606 | 501
35448 | 2020-0041921 | 2020-01-22 | 41703 | 음식점에서 옷이 타버림 배상요구함 | 608 | 502
8427 | 2020-0009448 | 2020-01-07 | 40801 | 1년전 보이스피싱 통장정지 개설 문의 | 616 | 505
816 | 2020-0000707 | 2020-01-02 | 40438 | 세탁하자로 신체상의 피해발생하여 배상요청하는건 | 608 | 527
88336 | 2020-0102189 | 2020-02-18 | 40508 | 코웨이안마의자 동일고장 3회 발생 문의 | 608 | 601
56377 | 2020-0065764 | 2020-02-03 | 41101 | 부동산 중개업자님 불공정거래 신고하고자 문의 10 | 607 | 510

(row index) | 사건번호(ACCIDENT_NO) | 접수일자(RCPT_YMD) | 기관코드(INSTITUTION_CODE) | 사건제목(ACCIDENT_TITLE) | 상담이유코드(DSCSN_REASON_CODE) | 처리결과코드(PRCS_RESULT_CODE)
23299 | 2020-0027520 | 2020-01-15 | 10000 | (중재원각하)무릎인공관절 수술 후 재시술받은데 따른 문의 | 616 | 509
35406 | 2020-0042032 | 2020-01-22 | 40227 | 란탈정수기 약정기간 도래후 해지시 발생한 설치비 지불 요청건. | 611 | 509
55942 | 2020-0065473 | 2020-02-03 | 40818 | 10개월전에 방판으로 구매한 전집류 반품문의 | 616 | 509
52233 | 2020-0060781 | 2020-01-31 | 10000 | 신종코로나바이러스 발생 여행출발전 신의칙상의 주의의무에 관한 손해배상의 건 | 607 | 502
87117 | 2020-0100889 | 2020-02-17 | 40628 | 신발세탁 후 하자 건 | 615 | 501
59931 | 2020-0058408 | 2020-01-31 | 30800 | 주택수리 | 616 | 507
56701 | 2020-0066166 | 2020-02-03 | 10000 | 중고 오토바이 구입후 하루만에 시동이 켜지지 않아 환불하려 합니다. | 608 | 509
90000 | 2020-0104949 | 2020-02-19 | 40646 | 업체 사기로 인한 피해구제 신청을 하고자 함 | 607 | 509
43958 | 2020-0052860 | 2020-01-29 | 40305 | 전자담배 파손 보험 계약해지 요청건 | 603 | 502
23610 | 2020-0028163 | 2020-01-15 | 40807 | 키즈카페에서 어린이 골절 치료비 청구 건 | 616 | 509