Overview

Dataset statistics

Number of variables: 4
Number of observations: 711
Missing cells: 124
Missing cells (%): 4.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 23.7 KiB
Average record size in memory: 34.2 B
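
As a quick sanity check on the figures above, the missing-cell percentage is the number of missing cells divided by the total number of cells (observations × variables). The sketch below is only illustrative; the function name `missing_cell_pct` is hypothetical and is not part of ydata-profiling.

```python
def missing_cell_pct(missing_cells: int, n_rows: int, n_cols: int) -> float:
    """Return missing cells as a percentage of all cells in the table."""
    total_cells = n_rows * n_cols
    return 100.0 * missing_cells / total_cells


# Figures copied from the dataset statistics above.
pct = missing_cell_pct(missing_cells=124, n_rows=711, n_cols=4)
print(f"{pct:.1f}%")  # prints "4.4%"
```

The same function with `n_cols=1` gives the per-column missing rate reported in the alerts (62 of 711 values).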

Variable types

Numeric: 2
Text: 2

Dataset

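Description: Consumer complaint training data from the Korea Fair Trade Commission (공정거래위원회). Within the simple-inquiry consumer complaint training data, this is the code table of dispute types used to identify the intent of a consultation.
Author: Korea Fair Trade Commission (공정거래위원회)
URL: https://www.data.go.kr/data/15098372/fileData.do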

Alerts

High correlation: 분쟁유형코드(DISPUTE_TYPE_CODE) is highly overall correlated with 상위코드(PARENT_CODE)
High correlation: 상위코드(PARENT_CODE) is highly overall correlated with 분쟁유형코드(DISPUTE_TYPE_CODE)
Missing: 상위코드(PARENT_CODE) has 62 (8.7%) missing values
Missing: 상위분쟁유형(DISPUTE_TYPE_NAME) has 62 (8.7%) missing values
Skewed: 분쟁유형코드(DISPUTE_TYPE_CODE) is highly skewed (γ1 = 20.0833458)
Unique: 분쟁유형코드(DISPUTE_TYPE_CODE) has unique values

Reproduction

Analysis started: 2023-12-12 10:20:00.582747
Analysis finished: 2023-12-12 10:20:01.715825
Duration: 1.13 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
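
The reported duration follows directly from the two timestamps above. A minimal sketch of that arithmetic, using only the Python standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamp layout used by the "Analysis started" / "Analysis finished" fields.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2023-12-12 10:20:00.582747", FMT)
finished = datetime.strptime("2023-12-12 10:20:01.715825", FMT)

# Subtracting two datetimes yields a timedelta; convert it to seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} s")  # prints "1.13 s"
```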

Variables

분쟁유형코드(DISPUTE_TYPE_CODE)
Real number (ℝ)

HIGH CORRELATION  SKEWED  UNIQUE 

Distinct: 711
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.7124309 × 10^10
Minimum: 1 × 10^9
Maximum: 1 × 10^12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.4 KiB

Quantile statistics

Minimum: 1 × 10^9
5th percentile: 8.0001 × 10^9
Q1: 9.0027 × 10^9
Median: 2.30002 × 10^10
Q3: 3.800045 × 10^10
95th percentile: 5.70001 × 10^10
Maximum: 1 × 10^12
Range: 9.99 × 10^11
Interquartile range (IQR): 2.899775 × 10^10

Descriptive statistics

Standard deviation: 4.0167557 × 10^10
Coefficient of variation (CV): 1.4808693
Kurtosis: 485.85279
Mean: 2.7124309 × 10^10
Median Absolute Deviation (MAD): 1.39984 × 10^10
Skewness: 20.083346
Sum: 1.9285384 × 10^13
Variance: 1.6134326 × 10^21
Monotonicity: Not monotonic
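
Several of the derived statistics above can be cross-checked from the primary ones: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the IQR and range are simple differences. The sketch below recomputes them from the rounded values printed in this report, so agreement is only expected to the displayed precision.

```python
import math

# Values copied from the statistics reported for 분쟁유형코드(DISPUTE_TYPE_CODE).
mean = 2.7124309e10
std = 4.0167557e10
q1 = 9.0027e9
q3 = 3.800045e10
minimum = 1e9
maximum = 1e12

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

# Compare against the reported figures with a loose relative tolerance,
# because the inputs are already rounded.
print(math.isclose(cv, 1.4808693, rel_tol=1e-6))
print(math.isclose(variance, 1.6134326e21, rel_tol=1e-6))
print(math.isclose(iqr, 2.899775e10, rel_tol=1e-6))
print(math.isclose(value_range, 9.99e11, rel_tol=1e-6))
```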
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
15000300040 | 1 | 0.1%
9000400100 | 1 | 0.1%
9000100060 | 1 | 0.1%
9000100070 | 1 | 0.1%
9000200010 | 1 | 0.1%
9000200020 | 1 | 0.1%
9000200030 | 1 | 0.1%
9000200040 | 1 | 0.1%
9000200050 | 1 | 0.1%
9000200060 | 1 | 0.1%
Other values (701) | 701 | 98.6%

Minimum 10 values
Value | Count | Frequency (%)
1000000000 | 1 | 0.1%
1000100000 | 1 | 0.1%
1000100010 | 1 | 0.1%
1000100020 | 1 | 0.1%
2000000000 | 1 | 0.1%
2000100000 | 1 | 0.1%
2000100010 | 1 | 0.1%
2000100020 | 1 | 0.1%
3000000000 | 1 | 0.1%
3000100000 | 1 | 0.1%

Maximum 10 values
Value | Count | Frequency (%)
1000000000000 | 1 | 0.1%
62000100030 | 1 | 0.1%
62000100020 | 1 | 0.1%
62000100010 | 1 | 0.1%
62000100000 | 1 | 0.1%
62000000000 | 1 | 0.1%
61000100040 | 1 | 0.1%
61000100030 | 1 | 0.1%
61000100020 | 1 | 0.1%
61000100010 | 1 | 0.1%

분쟁유형(DISPUTE_TYPE_NAME)
Text

Distinct: 654
Distinct (%): 92.0%
Missing: 0
Missing (%): 0.0%
Memory size: 5.7 KiB

Length

Max length: 128
Median length: 68
Mean length: 24.279887
Min length: 2

Characters and Unicode

Total characters: 17263
Distinct characters: 467
Distinct categories: 10
Distinct scripts: 3
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 611
Unique (%): 85.9%

Sample

1st row: 4) 판매자가 구입자의 철회권 행사를 제한하기 위해 임의로 포장을 훼손한 경우
2nd row: 5) 판매원 신분 허위 판매처 허위인 계약
3rd row: 6) 회원제 판매 또는 복합상품 판매 후 일부 계약 불이행
4th row: 7) 정기간행물 구독계약을 중도해지한 경우 (서면 계약해지의사 도달일 기준)
5th row: 8) 도서 음반 정기간행물 계약의 중도해지 시 제공받은 사은품

Words
Value | Count | Frequency (%)
경우 | 126 | 3.0%
1 | 123 | 3.0%
2 | 119 | 2.9%
인한 | 102 | 2.5%
3 | 87 | 2.1%
4 | 66 | 1.6%
업종 | 60 | 1.4%
발생한 | 57 | 1.4%
또는 | 56 | 1.4%
이내에 | 51 | 1.2%
Other values (1545) | 3294 | 79.5%

Most occurring characters

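Value | Count | Frequency (%)
(space) | 3607 | 20.9%
) | 710 | 4.1%
(not recovered) | 308 | 1.8%
(not recovered) | 306 | 1.8%
(not recovered) | 274 | 1.6%
(not recovered) | 259 | 1.5%
(not recovered) | 257 | 1.5%
(not recovered) | 252 | 1.5%
(not recovered) | 244 | 1.4%
1 | 242 | 1.4%
Other values (457) | 10804 | 62.6%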

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 11630 | 67.4%
Space Separator | 3607 | 20.9%
Decimal Number | 808 | 4.7%
Close Punctuation | 710 | 4.1%
Open Punctuation | 183 | 1.1%
Other Punctuation | 181 | 1.0%
Lowercase Letter | 100 | 0.6%
Uppercase Letter | 27 | 0.2%
Dash Punctuation | 15 | 0.1%
Math Symbol | 2 | < 0.1%

Most frequent character per category

Other Letter
Value | Count | Frequency (%)
(not recovered) | 308 | 2.6%
(not recovered) | 306 | 2.6%
(not recovered) | 274 | 2.4%
(not recovered) | 259 | 2.2%
(not recovered) | 257 | 2.2%
(not recovered) | 252 | 2.2%
(not recovered) | 244 | 2.1%
(not recovered) | 209 | 1.8%
(not recovered) | 206 | 1.8%
(not recovered) | 201 | 1.7%
Other values (403) | 9114 | 78.4%

Lowercase Letter
Value | Count | Frequency (%)
i | 13 | 13.0%
n | 11 | 11.0%
o | 10 | 10.0%
e | 10 | 10.0%
a | 8 | 8.0%
g | 8 | 8.0%
r | 7 | 7.0%
k | 6 | 6.0%
t | 5 | 5.0%
p | 4 | 4.0%
Other values (8) | 18 | 18.0%

Uppercase Letter
Value | Count | Frequency (%)
C | 4 | 14.8%
R | 4 | 14.8%
O | 3 | 11.1%
A | 2 | 7.4%
S | 2 | 7.4%
F | 2 | 7.4%
N | 2 | 7.4%
T | 2 | 7.4%
B | 1 | 3.7%
L | 1 | 3.7%
Other values (4) | 4 | 14.8%

Decimal Number
Value | Count | Frequency (%)
1 | 242 | 30.0%
2 | 155 | 19.2%
3 | 119 | 14.7%
4 | 88 | 10.9%
5 | 75 | 9.3%
6 | 50 | 6.2%
7 | 30 | 3.7%
0 | 22 | 2.7%
8 | 14 | 1.7%
9 | 13 | 1.6%

Other Punctuation
Value | Count | Frequency (%)
/ | 96 | 53.0%
. | 75 | 41.4%
· | 5 | 2.8%
: | 3 | 1.7%
* | 1 | 0.6%
% | 1 | 0.6%

Math Symbol
Value | Count | Frequency (%)
± | 1 | 50.0%
+ | 1 | 50.0%

Space Separator
Value | Count | Frequency (%)
(space) | 3607 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 710 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 183 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 15 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Hangul | 11630 | 67.4%
Common | 5506 | 31.9%
Latin | 127 | 0.7%

Most frequent character per script

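Hangul
Value | Count | Frequency (%)
(not recovered) | 308 | 2.6%
(not recovered) | 306 | 2.6%
(not recovered) | 274 | 2.4%
(not recovered) | 259 | 2.2%
(not recovered) | 257 | 2.2%
(not recovered) | 252 | 2.2%
(not recovered) | 244 | 2.1%
(not recovered) | 209 | 1.8%
(not recovered) | 206 | 1.8%
(not recovered) | 201 | 1.7%
Other values (403) | 9114 | 78.4%

Latin
Value | Count | Frequency (%)
i | 13 | 10.2%
n | 11 | 8.7%
o | 10 | 7.9%
e | 10 | 7.9%
a | 8 | 6.3%
g | 8 | 6.3%
r | 7 | 5.5%
k | 6 | 4.7%
t | 5 | 3.9%
C | 4 | 3.1%
Other values (22) | 45 | 35.4%

Common
Value | Count | Frequency (%)
(space) | 3607 | 65.5%
) | 710 | 12.9%
1 | 242 | 4.4%
( | 183 | 3.3%
2 | 155 | 2.8%
3 | 119 | 2.2%
/ | 96 | 1.7%
4 | 88 | 1.6%
. | 75 | 1.4%
5 | 75 | 1.4%
Other values (12) | 156 | 2.8%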

Most occurring blocks

Value | Count | Frequency (%)
Hangul | 11630 | 67.4%
ASCII | 5627 | 32.6%
None | 6 | < 0.1%

Most frequent character per block

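ASCII
Value | Count | Frequency (%)
(space) | 3607 | 64.1%
) | 710 | 12.6%
1 | 242 | 4.3%
( | 183 | 3.3%
2 | 155 | 2.8%
3 | 119 | 2.1%
/ | 96 | 1.7%
4 | 88 | 1.6%
. | 75 | 1.3%
5 | 75 | 1.3%
Other values (42) | 277 | 4.9%

Hangul
Value | Count | Frequency (%)
(not recovered) | 308 | 2.6%
(not recovered) | 306 | 2.6%
(not recovered) | 274 | 2.4%
(not recovered) | 259 | 2.2%
(not recovered) | 257 | 2.2%
(not recovered) | 252 | 2.2%
(not recovered) | 244 | 2.1%
(not recovered) | 209 | 1.8%
(not recovered) | 206 | 1.8%
(not recovered) | 201 | 1.7%
Other values (403) | 9114 | 78.4%

None
Value | Count | Frequency (%)
· | 5 | 83.3%
± | 1 | 16.7%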

상위코드(PARENT_CODE)
Real number (ℝ)

HIGH CORRELATION  MISSING 

Distinct: 185
Distinct (%): 28.5%
Missing: 62
Missing (%): 8.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.6706176 × 10^10
Minimum: 1 × 10^9
Maximum: 1 × 10^12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 6.4 KiB

Quantile statistics

Minimum: 1 × 10^9
5th percentile: 8.0001 × 10^9
Q1: 9.0024 × 10^9
Median: 2.30001 × 10^10
Q3: 3.80002 × 10^10
95th percentile: 5.660004 × 10^10
Maximum: 1 × 10^12
Range: 9.99 × 10^11
Interquartile range (IQR): 2.89978 × 10^10

Descriptive statistics

Standard deviation: 4.1655241 × 10^10
Coefficient of variation (CV): 1.5597606
Kurtosis: 461.27369
Mean: 2.6706176 × 10^10
Median Absolute Deviation (MAD): 1.39989 × 10^10
Skewness: 19.760268
Sum: 1.7332308 × 10^13
Variance: 1.7351591 × 10^21
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
9000000000 | 34 | 4.8%
9000400000 | 16 | 2.3%
34000600000 | 12 | 1.7%
16000100000 | 12 | 1.7%
38000000000 | 9 | 1.3%
53000100000 | 9 | 1.3%
38000100000 | 9 | 1.3%
15000300000 | 9 | 1.3%
57000100000 | 8 | 1.1%
38000300000 | 8 | 1.1%
Other values (175) | 523 | 73.6%
(Missing) | 62 | 8.7%

Minimum 10 values
Value | Count | Frequency (%)
1000000000 | 1 | 0.1%
1000100000 | 2 | 0.3%
2000000000 | 1 | 0.1%
2000100000 | 2 | 0.3%
3000000000 | 1 | 0.1%
3000100000 | 2 | 0.3%
4000000000 | 1 | 0.1%
4000100000 | 1 | 0.1%
5000000000 | 1 | 0.1%
5000100000 | 6 | 0.8%

Maximum 10 values
Value | Count | Frequency (%)
1000000000000 | 1 | 0.1%
62000100000 | 3 | 0.4%
62000000000 | 1 | 0.1%
61000100000 | 4 | 0.6%
61000000000 | 1 | 0.1%
60000100000 | 5 | 0.7%
60000000000 | 1 | 0.1%
59000100000 | 3 | 0.4%
59000000000 | 1 | 0.1%
58000100000 | 3 | 0.4%

상위분쟁유형(DISPUTE_TYPE_NAME)
Text

MISSING

Distinct: 62
Distinct (%): 9.6%
Missing: 62
Missing (%): 8.7%
Memory size: 5.7 KiB

Length

Max length: 32
Median length: 28
Mean length: 16.684129
Min length: 8

Characters and Unicode

Total characters: 10828
Distinct characters: 162
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 15. 문화용품·기타(4개 업종)
2nd row: 15. 문화용품·기타(4개 업종)
3rd row: 15. 문화용품·기타(4개 업종)
4th row: 15. 문화용품·기타(4개 업종)
5th row: 15. 문화용품·기타(4개 업종)

Words
Value | Count | Frequency (%)
업종 | 591 | 27.3%
9 | 173 | 8.0%
공산품(30개 | 173 | 8.0%
(not recovered) | 82 | 3.8%
38 | 51 | 2.4%
의약품 | 51 | 2.4%
화학제품(10개 | 51 | 2.4%
품종 | 51 | 2.4%
34 | 39 | 1.8%
운수업(9개 | 39 | 1.8%
Other values (130) | 861 | 39.8%

Most occurring characters

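Value | Count | Frequency (%)
(space) | 1519 | 14.0%
(not recovered) | 929 | 8.6%
(not recovered) | 650 | 6.0%
. | 649 | 6.0%
( | 642 | 5.9%
(not recovered) | 642 | 5.9%
) | 639 | 5.9%
1 | 418 | 3.9%
3 | 406 | 3.7%
(not recovered) | 394 | 3.6%
Other values (152) | 3940 | 36.4%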

Most occurring categories

Value | Count | Frequency (%)
Other Letter | 5376 | 49.6%
Decimal Number | 1956 | 18.1%
Space Separator | 1519 | 14.0%
Other Punctuation | 696 | 6.4%
Open Punctuation | 642 | 5.9%
Close Punctuation | 639 | 5.9%

Most frequent character per category

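Other Letter
Value | Count | Frequency (%)
(not recovered) | 929 | 17.3%
(not recovered) | 650 | 12.1%
(not recovered) | 642 | 11.9%
(not recovered) | 394 | 7.3%
(not recovered) | 226 | 4.2%
(not recovered) | 191 | 3.6%
(not recovered) | 90 | 1.7%
(not recovered) | 83 | 1.5%
(not recovered) | 82 | 1.5%
(not recovered) | 82 | 1.5%
Other values (137) | 2007 | 37.3%

Decimal Number
Value | Count | Frequency (%)
1 | 418 | 21.4%
3 | 406 | 20.8%
0 | 268 | 13.7%
9 | 242 | 12.4%
2 | 180 | 9.2%
4 | 146 | 7.5%
5 | 104 | 5.3%
8 | 88 | 4.5%
6 | 58 | 3.0%
7 | 46 | 2.4%

Other Punctuation
Value | Count | Frequency (%)
. | 649 | 93.2%
· | 47 | 6.8%

Space Separator
Value | Count | Frequency (%)
(space) | 1519 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 642 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 639 | 100.0%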

Most occurring scripts

Value | Count | Frequency (%)
Common | 5452 | 50.4%
Hangul | 5376 | 49.6%

Most frequent character per script

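Hangul
Value | Count | Frequency (%)
(not recovered) | 929 | 17.3%
(not recovered) | 650 | 12.1%
(not recovered) | 642 | 11.9%
(not recovered) | 394 | 7.3%
(not recovered) | 226 | 4.2%
(not recovered) | 191 | 3.6%
(not recovered) | 90 | 1.7%
(not recovered) | 83 | 1.5%
(not recovered) | 82 | 1.5%
(not recovered) | 82 | 1.5%
Other values (137) | 2007 | 37.3%

Common
Value | Count | Frequency (%)
(space) | 1519 | 27.9%
. | 649 | 11.9%
( | 642 | 11.8%
) | 639 | 11.7%
1 | 418 | 7.7%
3 | 406 | 7.4%
0 | 268 | 4.9%
9 | 242 | 4.4%
2 | 180 | 3.3%
4 | 146 | 2.7%
Other values (5) | 343 | 6.3%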

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 5405 | 49.9%
Hangul | 5376 | 49.6%
None | 47 | 0.4%

Most frequent character per block

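ASCII
Value | Count | Frequency (%)
(space) | 1519 | 28.1%
. | 649 | 12.0%
( | 642 | 11.9%
) | 639 | 11.8%
1 | 418 | 7.7%
3 | 406 | 7.5%
0 | 268 | 5.0%
9 | 242 | 4.5%
2 | 180 | 3.3%
4 | 146 | 2.7%
Other values (4) | 296 | 5.5%

Hangul
Value | Count | Frequency (%)
(not recovered) | 929 | 17.3%
(not recovered) | 650 | 12.1%
(not recovered) | 642 | 11.9%
(not recovered) | 394 | 7.3%
(not recovered) | 226 | 4.2%
(not recovered) | 191 | 3.6%
(not recovered) | 90 | 1.7%
(not recovered) | 83 | 1.5%
(not recovered) | 82 | 1.5%
(not recovered) | 82 | 1.5%
Other values (137) | 2007 | 37.3%

None
Value | Count | Frequency (%)
· | 47 | 100.0%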

Interactions

[Figures: four interaction plots; images not reproduced]

Correlations

Variable | 분쟁유형코드(DISPUTE_TYPE_CODE) | 상위코드(PARENT_CODE) | 상위분쟁유형(DISPUTE_TYPE_NAME)
분쟁유형코드(DISPUTE_TYPE_CODE) | 1.000 | 0.705 | 0.000
상위코드(PARENT_CODE) | 0.705 | 1.000 | 0.000
상위분쟁유형(DISPUTE_TYPE_NAME) | 0.000 | 0.000 | 1.000

Variable | 분쟁유형코드(DISPUTE_TYPE_CODE) | 상위코드(PARENT_CODE)
분쟁유형코드(DISPUTE_TYPE_CODE) | 1.000 | 0.993
상위코드(PARENT_CODE) | 0.993 | 1.000

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.

Sample

First rows
Row | 분쟁유형코드(DISPUTE_TYPE_CODE) | 분쟁유형(DISPUTE_TYPE_NAME) | 상위코드(PARENT_CODE) | 상위분쟁유형(DISPUTE_TYPE_NAME)
0 | 15000300040 | 4) 판매자가 구입자의 철회권 행사를 제한하기 위해 임의로 포장을 훼손한 경우 | 15000300000 | 15. 문화용품·기타(4개 업종)
1 | 15000300050 | 5) 판매원 신분 허위 판매처 허위인 계약 | 15000300000 | 15. 문화용품·기타(4개 업종)
2 | 15000300060 | 6) 회원제 판매 또는 복합상품 판매 후 일부 계약 불이행 | 15000300000 | 15. 문화용품·기타(4개 업종)
3 | 15000300070 | 7) 정기간행물 구독계약을 중도해지한 경우 (서면 계약해지의사 도달일 기준) | 15000300000 | 15. 문화용품·기타(4개 업종)
4 | 15000300080 | 8) 도서 음반 정기간행물 계약의 중도해지 시 제공받은 사은품 | 15000300000 | 15. 문화용품·기타(4개 업종)
5 | 15000300090 | 9) 청약철회기간 이후 계약 해제 시(법령상 청약철회가 가능한 거래의 경우) | 15000300000 | 15. 문화용품·기타(4개 업종)
6 | 15000400010 | 1) 구입 후 1개월 이내에 정상적인 사용상태에서 발생한 성능/기능상의 하자로 중요한 수리를 요할 때 | 15000400000 | 15. 문화용품·기타(4개 업종)
7 | 15000400020 | 2) 품질보증기간 이내에 정상적인 사용상태에서 발생한 성능/기능상의 하자 | 15000400000 | 15. 문화용품·기타(4개 업종)
8 | 15000400030 | 3) 사용자가 수리 의뢰한 제품을 사업자가 분실했을 경우 | 15000400000 | 15. 문화용품·기타(4개 업종)
9 | 15000400040 | 4) 부품보유기간 이내에 수리용 부품을 보유하고 있지 않아 발생한 피해 | 15000400000 | 15. 문화용품·기타(4개 업종)

Last rows
Row | 분쟁유형코드(DISPUTE_TYPE_CODE) | 분쟁유형(DISPUTE_TYPE_NAME) | 상위코드(PARENT_CODE) | 상위분쟁유형(DISPUTE_TYPE_NAME)
701 | 47000000000 | 47. 자동차운전학원(1개 업종) | <NA> | <NA>
702 | 48000000000 | 48. 자동차 정비업(1개 업종) | <NA> | <NA>
703 | 49000000000 | 49. 전자지급수단발행업(1개 업종) | <NA> | <NA>
704 | 50000000000 | 50. 주차장업(2개 업종) | <NA> | <NA>
705 | 51000000000 | 51. 주택건설업(1개 업종) | <NA> | <NA>
706 | 52000000000 | 52. 중고전자제품매매업(1개 업종) | <NA> | <NA>
707 | 53000000000 | 53. 중고자동차매매업(1개 업종) | <NA> | <NA>
708 | 54000000000 | 54. 창호공사업(1개 업종) | <NA> | <NA>
709 | 55000000000 | 55. 청소대행서비스업(1개 업종) | <NA> | <NA>
710 | 56000000000 | 56. 체육시설업 레저용역업 및 할인회원권업(3개 업종) | <NA> | <NA>
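
The sample rows suggest, though the report does not state it, that 상위코드(PARENT_CODE) can be obtained from 분쟁유형코드(DISPUTE_TYPE_CODE) by zeroing trailing digits: item codes such as 15000300040 map to 15000300000, while top-level codes such as 47000000000 have no parent. The sketch below encodes that inferred rule. The function name `infer_parent_code` and both block sizes are assumptions drawn only from the rows and frequency tables shown in this report; they may not hold for every row (for example the outlier code 1000000000000), and the mid-level step in particular is not confirmed by any sample row.

```python
from typing import Optional

# Assumed block sizes, inferred from the sample rows only (hypothetical).
ITEM_BLOCK = 100_000          # last five digits identify an item within a group
TOP_BLOCK = 1_000_000_000     # digits below 10^9 identify a group within a category


def infer_parent_code(code: int) -> Optional[int]:
    """Guess the parent of a dispute-type code by zeroing its trailing digits.

    Returns None for codes that look like top-level categories.
    """
    if code % ITEM_BLOCK != 0:
        # Item-level code: drop the item digits to get the group code.
        return code - code % ITEM_BLOCK
    if code % TOP_BLOCK != 0:
        # Group-level code: drop the group digits to get the category code.
        return code - code % TOP_BLOCK
    # Nothing left to strip: treat as a top-level category with no parent.
    return None


# Rows 0 and 701 from the sample tables above.
print(infer_parent_code(15000300040))  # 15000300000
print(infer_parent_code(47000000000))  # None
```

If this rule does hold for most rows, it would be consistent with the high correlation reported between the two code columns, since each parent code would be a truncation of its child code.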