Dataset statistics
Number of variables | 4 |
---|---|
Number of observations | 504 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 16.4 KiB |
Average record size in memory | 33.3 B |
Variable types
Categorical | 2 |
---|---|
Text | 2 |
Dataset
Description | Sample data |
---|---|
Author | Seoul Metropolitan Government (smart card company) |
URL | https://bigdata.seoul.go.kr/data/selectSampleData.do?sample_data_seq=13 |
코드구분명(GBN_NM) is highly overall correlated with 코드구분(GBN_CD) | High correlation |
코드구분(GBN_CD) is highly overall correlated with 코드구분명(GBN_NM) | High correlation |
코드구분(GBN_CD) is highly imbalanced (62.1%) | Imbalance |
코드구분명(GBN_NM) is highly imbalanced (62.1%) | Imbalance |
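The 62.1% imbalance figure is consistent with one minus the normalized Shannon entropy of the category counts. The sketch below reproduces it from the counts this report lists for 코드구분(GBN_CD) (424, 42, 23, 13, 2). The function name `imbalance_score` is illustrative, and the formula is an assumption checked only against this report's number, not a statement of how the profiling tool computes it internally.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log(k), where H is the Shannon entropy of the counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log(k)


# Category counts for 코드구분(GBN_CD), taken from the Common Values table.
gbn_cd_counts = [424, 42, 23, 13, 2]
score = imbalance_score(gbn_cd_counts)
print(f"{score:.1%}")
```

Because 코드구분명(GBN_NM) has the same counts, it gets the same score.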
Reproduction
Analysis started | 2024-01-14 06:50:15.371379 |
---|---|
Analysis finished | 2024-01-14 06:50:15.725309 |
Duration | 0.35 seconds |
Software version | ydata-profiling v4.5.1 |
Download configuration | config.json |
코드구분(GBN_CD)
Categorical
HIGH CORRELATION, IMBALANCE
Distinct | 5 |
---|---|
Distinct (%) | 1.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 4.1 KiB |
3 | 424 |
---|---|
1 | 42 |
2 | 23 |
4 | 13 |
5 | 2 |
Length
Max length | 1 |
---|---|
Median length | 1 |
Mean length | 1 |
Min length | 1 |
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 1 |
---|---|
2nd row | 1 |
3rd row | 1 |
4th row | 1 |
5th row | 1 |
Common Values
Value | Count | Frequency (%) |
3 | 424 | 84.1% |
1 | 42 | 8.3% |
2 | 23 | 4.6% |
4 | 13 | 2.6% |
5 | 2 | 0.4% |
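The frequency column is each count divided by the 504 observations. A minimal sketch that rebuilds the table from the counts above, using only the standard library (the variable names are illustrative):

```python
from collections import Counter

# Rebuild the 코드구분(GBN_CD) column from the counts reported above.
values = ["3"] * 424 + ["1"] * 42 + ["2"] * 23 + ["4"] * 13 + ["5"] * 2

counts = Counter(values)
total = sum(counts.values())

# One (value, count, percent) row per category, most frequent first.
table = [(value, count, round(100 * count / total, 1))
         for value, count in counts.most_common()]

for value, count, pct in table:
    print(f"{value} | {count} | {pct}% |")
```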
코드구분명(GBN_NM)
Categorical
HIGH CORRELATION, IMBALANCE
Distinct | 5 |
---|---|
Distinct (%) | 1.0% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 4.1 KiB |
교통카드발행사 | 424 |
---|---|
교통수단코드 | 42 |
사용자구분코드 | 23 |
1회권사용자구분코드 | 13 |
1회권발행사ID | 2 |
Length
Max length | 10 |
---|---|
Median length | 7 |
Mean length | 6.9980159 |
Min length | 6 |
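The length statistics follow from the five category names and their counts. The sketch below recomputes them, assuming length is counted in Unicode code points after NFC normalization (an assumption; the report does not say how length is measured):

```python
import statistics
import unicodedata

# Category names of 코드구분명(GBN_NM) with the counts reported in this section.
name_counts = {
    "교통카드발행사": 424,
    "교통수단코드": 42,
    "사용자구분코드": 23,
    "1회권사용자구분코드": 13,
    "1회권발행사ID": 2,
}

# Expand to one length per observation (504 values in total).
lengths = []
for name, count in name_counts.items():
    n_chars = len(unicodedata.normalize("NFC", name))
    lengths.extend([n_chars] * count)

summary = {
    "max": max(lengths),
    "median": statistics.median(lengths),
    "mean": round(statistics.mean(lengths), 7),
    "min": min(lengths),
}
print(summary)
```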
Unique
Unique | 0 |
---|---|
Unique (%) | 0.0% |
Sample
1st row | 교통수단코드 |
---|---|
2nd row | 교통수단코드 |
3rd row | 교통수단코드 |
4th row | 교통수단코드 |
5th row | 교통수단코드 |
Common Values
Value | Count | Frequency (%) |
교통카드발행사 | 424 | 84.1% |
교통수단코드 | 42 | 8.3% |
사용자구분코드 | 23 | 4.6% |
1회권사용자구분코드 | 13 | 2.6% |
1회권발행사ID | 2 | 0.4% |
코드값(CODE)
Text
Distinct | 497 |
---|---|
Distinct (%) | 98.6% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 4.1 KiB |
Value | Count | Frequency (%) |
c900001 | 2 | 0.4% |
22 | 2 | 0.4% |
23 | 2 | 0.4% |
24 | 2 | 0.4% |
13 | 2 | 0.4% |
c900008 | 2 | 0.4% |
21 | 2 | 0.4% |
3119013 | 1 | 0.2% |
3119002 | 1 | 0.2% |
3119003 | 1 | 0.2% |
Other values (487) | 487 | 96.6% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 1113 | 35.0% |
1 | 599 | 18.8% |
3 | 400 | 12.6% |
2 | 333 | 10.5% |
9 | 158 | 5.0% |
4 | 156 | 4.9% |
5 | 154 | 4.8% |
7 | 108 | 3.4% |
6 | 85 | 2.7% |
8 | 68 | 2.1% |
Other values (3) | 6 | 0.2% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 3174 | 99.8% |
Uppercase Letter | 6 | 0.2% |
Most frequent character per category
Decimal Number
Value | Count | Frequency (%) |
0 | 1113 | 35.1% |
1 | 599 | 18.9% |
3 | 400 | 12.6% |
2 | 333 | 10.5% |
9 | 158 | 5.0% |
4 | 156 | 4.9% |
5 | 154 | 4.9% |
7 | 108 | 3.4% |
6 | 85 | 2.7% |
8 | 68 | 2.1% |
Uppercase Letter
Value | Count | Frequency (%) |
C | 4 | 66.7% |
S | 1 | 16.7% |
L | 1 | 16.7% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 3174 | 99.8% |
Latin | 6 | 0.2% |
Most frequent character per script
Common
Value | Count | Frequency (%) |
0 | 1113 | 35.1% |
1 | 599 | 18.9% |
3 | 400 | 12.6% |
2 | 333 | 10.5% |
9 | 158 | 5.0% |
4 | 156 | 4.9% |
5 | 154 | 4.9% |
7 | 108 | 3.4% |
6 | 85 | 2.7% |
8 | 68 | 2.1% |
Latin
Value | Count | Frequency (%) |
C | 4 | 66.7% |
S | 1 | 16.7% |
L | 1 | 16.7% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 3180 | 100.0% |
Most frequent character per block
ASCII
Value | Count | Frequency (%) |
0 | 1113 | 35.0% |
1 | 599 | 18.8% |
3 | 400 | 12.6% |
2 | 333 | 10.5% |
9 | 158 | 5.0% |
4 | 156 | 4.9% |
5 | 154 | 4.8% |
7 | 108 | 3.4% |
6 | 85 | 2.7% |
8 | 68 | 2.1% |
Other values (3) | 6 | 0.2% |
코드명(CODE_NM)
Text
Distinct | 182 |
---|---|
Distinct (%) | 36.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 4.1 KiB |
Length
Max length | 24 |
---|---|
Median length | 18 |
Mean length | 5.8293651 |
Min length | 2 |
Characters and Unicode
Total characters | 2938 |
---|---|
Distinct characters | 193 |
Distinct categories | 10 |
Distinct scripts | 3 |
Distinct blocks | 2 |
Unique
Unique | 158 |
---|---|
Unique (%) | 31.3% |
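The two counts measure different things, on the reading that "Distinct" is the number of different values and "Unique" is the number of values that occur exactly once. That reading is an assumption, but it matches the categorical columns above, which have 5 distinct values and 0 unique ones. A small sketch with a made-up sample (the helper name and sample list are hypothetical):

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Hypothetical sample, not the real column: two names repeat, two appear once.
sample = ["티머니", "티머니", "코레일1회권", "간선버스", "순환버스", "순환버스"]
distinct, unique = distinct_and_unique(sample)
print(distinct, unique)
```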
Sample
1st row | 마을버스(105) |
---|---|
2nd row | 간선버스 |
3rd row | 지선버스(120) |
4th row | 지선버스(121) |
5th row | 광역버스(130) |
Value | Count | Frequency (%) |
티머니 | 99 | 16.7% |
레일플러스 | 70 | 11.8% |
이비 | 43 | 7.3% |
한페이시스 | 38 | 6.4% |
카드 | 29 | 4.9% |
조합 | 28 | 4.7% |
선불 | 27 | 4.6% |
경기도버스조합 | 18 | 3.0% |
일반 | 10 | 1.7% |
인천시버스조합 | 8 | 1.3% |
Other values (173) | 223 | 37.6% |
Most occurring characters
Value | Count | Frequency (%) |
스 | 168 | 5.7% |
카 | 158 | 5.4% |
이 | 121 | 4.1% |
드 | 104 | 3.5% |
티 | 103 | 3.5% |
니 | 102 | 3.5% |
머 | 102 | 3.5% |
일 | 98 | 3.3% |
(space) | 89 | 3.0% |
레 | 72 | 2.5% |
Other values (183) | 1821 | 62.0% |
Most occurring categories
Value | Count | Frequency (%) |
Other Letter | 2378 | 80.9% |
Lowercase Letter | 131 | 4.5% |
Uppercase Letter | 95 | 3.2% |
Space Separator | 89 | 3.0% |
Decimal Number | 86 | 2.9% |
Connector Punctuation | 64 | 2.2% |
Close Punctuation | 40 | 1.4% |
Open Punctuation | 40 | 1.4% |
Dash Punctuation | 14 | 0.5% |
Other Punctuation | 1 | < 0.1% |
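The category names in this table are Unicode general categories (for example, "Other Letter" is `Lo`, "Decimal Number" is `Nd`, "Open Punctuation" is `Ps`). The sketch below shows how such a tally can be produced with Python's standard `unicodedata` module, using three of the sample values from this section. The helper name `category_counts` is illustrative, and this is not necessarily how the profiling tool itself derives the numbers.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters by Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# First three sample rows of 코드명(CODE_NM).
sample = ["마을버스(105)", "간선버스", "지선버스(120)"]
counts = category_counts(sample)

for category, count in counts.most_common():
    print(category, count)
```

Hangul syllables fall under `Lo`, which is why "Other Letter" dominates the table.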
Most frequent character per category
Other Letter
Value | Count | Frequency (%) |
스 | 168 | 7.1% |
카 | 158 | 6.6% |
이 | 121 | 5.1% |
드 | 104 | 4.4% |
티 | 103 | 4.3% |
니 | 102 | 4.3% |
머 | 102 | 4.3% |
일 | 98 | 4.1% |
레 | 72 | 3.0% |
플 | 70 | 2.9% |
Other values (131) | 1280 | 53.8% |
Uppercase Letter
Value | Count | Frequency (%) |
P | 12 | 12.6% |
C | 11 | 11.6% |
B | 9 | 9.5% |
S | 9 | 9.5% |
K | 7 | 7.4% |
M | 6 | 6.3% |
O | 6 | 6.3% |
T | 6 | 6.3% |
A | 5 | 5.3% |
Q | 5 | 5.3% |
Other values (9) | 19 | 20.0% |
Lowercase Letter
Value | Count | Frequency (%) |
a | 39 | 29.8% |
n | 27 | 20.6% |
y | 17 | 13.0% |
b | 16 | 12.2% |
d | 6 | 4.6% |
p | 4 | 3.1% |
e | 4 | 3.1% |
o | 4 | 3.1% |
m | 3 | 2.3% |
i | 2 | 1.5% |
Other values (7) | 9 | 6.9% |
Decimal Number
Value | Count | Frequency (%) |
1 | 32 | 37.2% |
5 | 13 | 15.1% |
0 | 8 | 9.3% |
8 | 7 | 8.1% |
4 | 7 | 8.1% |
3 | 7 | 8.1% |
2 | 6 | 7.0% |
9 | 3 | 3.5% |
7 | 2 | 2.3% |
6 | 1 | 1.2% |
Space Separator
Value | Count | Frequency (%) |
(space) | 89 | 100.0% |
Connector Punctuation
Value | Count | Frequency (%) |
_ | 64 | 100.0% |
Close Punctuation
Value | Count | Frequency (%) |
) | 40 | 100.0% |
Open Punctuation
Value | Count | Frequency (%) |
( | 40 | 100.0% |
Dash Punctuation
Value | Count | Frequency (%) |
- | 14 | 100.0% |
Other Punctuation
Value | Count | Frequency (%) |
. | 1 | 100.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Hangul | 2378 | 80.9% |
Common | 334 | 11.4% |
Latin | 226 | 7.7% |
Most frequent character per script
Hangul
Value | Count | Frequency (%) |
스 | 168 | 7.1% |
카 | 158 | 6.6% |
이 | 121 | 5.1% |
드 | 104 | 4.4% |
티 | 103 | 4.3% |
니 | 102 | 4.3% |
머 | 102 | 4.3% |
일 | 98 | 4.1% |
레 | 72 | 3.0% |
플 | 70 | 2.9% |
Other values (131) | 1280 | 53.8% |
Latin
Value | Count | Frequency (%) |
a | 39 | 17.3% |
n | 27 | 11.9% |
y | 17 | 7.5% |
b | 16 | 7.1% |
P | 12 | 5.3% |
C | 11 | 4.9% |
B | 9 | 4.0% |
S | 9 | 4.0% |
K | 7 | 3.1% |
M | 6 | 2.7% |
Other values (26) | 73 | 32.3% |
Common
Value | Count | Frequency (%) |
(space) | 89 | 26.6% |
_ | 64 | 19.2% |
) | 40 | 12.0% |
( | 40 | 12.0% |
1 | 32 | 9.6% |
- | 14 | 4.2% |
5 | 13 | 3.9% |
0 | 8 | 2.4% |
8 | 7 | 2.1% |
4 | 7 | 2.1% |
Other values (6) | 20 | 6.0% |
Most occurring blocks
Value | Count | Frequency (%) |
Hangul | 2378 | 80.9% |
ASCII | 560 | 19.1% |
Most frequent character per block
Hangul
Value | Count | Frequency (%) |
스 | 168 | 7.1% |
카 | 158 | 6.6% |
이 | 121 | 5.1% |
드 | 104 | 4.4% |
티 | 103 | 4.3% |
니 | 102 | 4.3% |
머 | 102 | 4.3% |
일 | 98 | 4.1% |
레 | 72 | 3.0% |
플 | 70 | 2.9% |
Other values (131) | 1280 | 53.8% |
ASCII
Value | Count | Frequency (%) |
(space) | 89 | 15.9% |
_ | 64 | 11.4% |
) | 40 | 7.1% |
( | 40 | 7.1% |
a | 39 | 7.0% |
1 | 32 | 5.7% |
n | 27 | 4.8% |
y | 17 | 3.0% |
b | 16 | 2.9% |
- | 14 | 2.5% |
Other values (42) | 182 | 32.5% |
Correlations
Variable | 코드구분(GBN_CD) | 코드구분명(GBN_NM) |
---|---|---|
코드구분(GBN_CD) | 1.000 | 1.000 |
코드구분명(GBN_NM) | 1.000 | 1.000 |
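A correlation of 1.000 between the two categorical columns is what one would expect if every code maps to exactly one name, which the identical count distributions (424, 42, 23, 13, 2) suggest. The sketch below computes plain Cramér's V on the resulting diagonal contingency table. Both the one-to-one mapping and the choice of uncorrected Cramér's V are assumptions made for illustration; the profiling tool may use a bias-corrected variant.

```python
import math


def cramers_v(table):
    """Uncorrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals))
    return math.sqrt(chi2 / (n * (k - 1)))


# Diagonal table: each GBN_CD value co-occurs with exactly one GBN_NM value.
counts = [424, 42, 23, 13, 2]
table = [[c if i == j else 0 for j in range(len(counts))]
         for i, c in enumerate(counts)]

v = cramers_v(table)
print(f"{v:.3f}")
```

In practice this means one of the two columns is redundant for modeling purposes: it carries no information the other does not.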
First rows
Index | 코드구분(GBN_CD) | 코드구분명(GBN_NM) | 코드값(CODE) | 코드명(CODE_NM) |
---|---|---|---|---|
0 | 1 | 교통수단코드 | 105 | 마을버스(105) |
1 | 1 | 교통수단코드 | 115 | 간선버스 |
2 | 1 | 교통수단코드 | 120 | 지선버스(120) |
3 | 1 | 교통수단코드 | 121 | 지선버스(121) |
4 | 1 | 교통수단코드 | 130 | 광역버스(130) |
5 | 1 | 교통수단코드 | 131 | 심야버스(131) |
6 | 1 | 교통수단코드 | 140 | 순환버스 |
7 | 1 | 교통수단코드 | 142 | 도심순환(142) |
8 | 1 | 교통수단코드 | 201 | 서울메트로 |
9 | 1 | 교통수단코드 | 202 | 한국철도공사 |
Last rows
Index | 코드구분(GBN_CD) | 코드구분명(GBN_NM) | 코드값(CODE) | 코드명(CODE_NM) |
---|---|---|---|---|
494 | 4 | 1회권사용자구분코드 | 23 | 장애 |
495 | 4 | 1회권사용자구분코드 | 24 | 동반무임 |
496 | 4 | 1회권사용자구분코드 | 41 | 영어 일반 |
497 | 4 | 1회권사용자구분코드 | 42 | 일어 일반 |
498 | 4 | 1회권사용자구분코드 | 43 | 중국어 일반 |
499 | 4 | 1회권사용자구분코드 | 44 | 영어 어린이 |
500 | 4 | 1회권사용자구분코드 | 45 | 일어 어린이 |
501 | 4 | 1회권사용자구분코드 | 46 | 중국어 어린이 |
502 | 5 | 1회권발행사ID | C900001 | 코레일1회권 |
503 | 5 | 1회권발행사ID | C900008 | 티머니 |
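The sample rows show that the dataset is a code lookup table: 코드구분(GBN_CD) selects a code group and 코드값(CODE) selects an entry within it. Since 코드값(CODE) alone is not unique (497 distinct values in 504 rows), a lookup likely needs both columns as the key. A minimal sketch using a few of the rows shown above (the `lookup` helper and the in-memory `rows` list are illustrative, not part of the dataset):

```python
# (GBN_CD, CODE, CODE_NM) triples copied from the sample rows above.
rows = [
    ("1", "105", "마을버스(105)"),
    ("1", "115", "간선버스"),
    ("4", "23", "장애"),
    ("5", "C900001", "코레일1회권"),
    ("5", "C900008", "티머니"),
]

# Composite key: the same CODE can appear under more than one GBN_CD.
code_names = {(gbn_cd, code): name for gbn_cd, code, name in rows}


def lookup(gbn_cd, code, default=None):
    """Return the code name for a (group, code) pair, or `default` if absent."""
    return code_names.get((gbn_cd, code), default)


print(lookup("5", "C900008"))
print(lookup("1", "23"))
```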