Overview

Dataset statistics

Number of variables: 5
Number of observations: 204
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 8.7 KiB
Average record size in memory: 43.6 B

Variable types

Categorical: 3
Numeric: 2

Dataset

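Description: Ticket issuance counts for passenger trains of the Korea Railroad Corporation (한국철도공사), broken down by sex, age group, and province of residence. The dataset provides sex (성별), age group (연령), province code (시도코드), province of residence (거주시도), and total tickets issued (총 발매 수).
Author: 한국철도공사 (Korea Railroad Corporation)
URL: https://www.data.go.kr/data/15108357/fileData.do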

Alerts

High correlation: 시도코드 is highly overall correlated with 거주시도
High correlation: 거주시도 is highly overall correlated with 시도코드
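
The alert is what one would expect if each province code identifies exactly one province name. The sketch below checks that property on the 17 distinct code/name pairs visible in the sample rows of this report. It assumes those pairs are representative of the full column; `PAIRS` and `is_bijection` are illustrative names, not part of the report.

```python
# Distinct (시도코드, 거주시도) pairs copied from the sample rows of this report.
PAIRS = [
    (11, "서울"), (26, "부산"), (27, "대구"), (28, "인천"), (29, "광주"),
    (30, "대전"), (31, "울산"), (36, "세종"), (41, "경기"), (42, "강원"),
    (43, "충북"), (44, "충남"), (45, "전북"), (46, "전남"), (47, "경북"),
    (48, "경남"), (50, "제주"),
]


def is_bijection(pairs):
    """Return True if every code maps to one name and every name to one code."""
    forward, backward = {}, {}
    for code, name in pairs:
        # setdefault stores the first mapping seen and returns it afterwards.
        if forward.setdefault(code, name) != name:
            return False
        if backward.setdefault(name, code) != code:
            return False
    return True


print(is_bijection(PAIRS))
```

When two columns are related one-to-one like this, either one can usually be dropped before modelling without losing information.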

Reproduction

Analysis started: 2023-12-12 10:40:20.079848
Analysis finished: 2023-12-12 10:40:21.011982
Duration: 0.93 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json
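
The reported duration is the difference between the two timestamps. A small Python check, using only the values printed above (the variable names are illustrative):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the Reproduction block of this report.
started = datetime.strptime("2023-12-12 10:40:20.079848", FMT)
finished = datetime.strptime("2023-12-12 10:40:21.011982", FMT)

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```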

Variables

성별 (sex)
Categorical

Distinct: 2
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB
Top categories: 1 (102), 2 (102)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value  Count  Frequency (%)
1      102    50.0%
2      102    50.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; values 1 and 2 each account for 102 rows (50.0%)]

연령 (age group)
Categorical

Distinct: 6
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB
Top categories: 20대 (34), 30대 (34), 40대 (34), 50대 (34), 60대이상 (34)

Length

Max length: 5
Median length: 3
Mean length: 3.1666667
Min length: 2
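
These length figures can be reproduced from the six age-group labels in this section's Common Values table, each of which occurs 34 times. A minimal Python sketch, assuming lengths are counted in characters (Unicode code points), which is how Python's `len` treats strings:

```python
from statistics import mean, median

# The six age-group labels from the Common Values table; each occurs 34 times.
labels = ["20대", "30대", "40대", "50대", "60대이상", "기타"]
lengths = [len(label) for label in labels for _ in range(34)]

print("max:", max(lengths))
print("median:", median(lengths))
print("mean:", round(mean(lengths), 7))
print("min:", min(lengths))
```

The mean is not a whole number because one label has five characters and another only two, while the remaining four have three.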

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20대
2nd row: 20대
3rd row: 20대
4th row: 20대
5th row: 20대

Common Values

Value                   Count  Frequency (%)
20대 (20s)              34     16.7%
30대 (30s)              34     16.7%
40대 (40s)              34     16.7%
50대 (50s)              34     16.7%
60대이상 (60 and over)  34     16.7%
기타 (other)            34     16.7%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values; each of the six age groups accounts for 34 rows (16.7%)]

시도코드 (province code)
Real number (ℝ)

HIGH CORRELATION

Distinct: 17
Distinct (%): 8.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 36.705882
Minimum: 11
Maximum: 50
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 11
5-th percentile: 11
Q1: 29
Median: 41
Q3: 45
95-th percentile: 50
Maximum: 50
Range: 39
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 10.279434
Coefficient of variation (CV): 0.28004868
Kurtosis: -0.059455855
Mean: 36.705882
Median Absolute Deviation (MAD): 7
Skewness: -0.7482056
Sum: 7488
Variance: 105.66676
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=17)]

Common Values

Value             Count  Frequency (%)
11                12     5.9%
26                12     5.9%
50                12     5.9%
48                12     5.9%
47                12     5.9%
46                12     5.9%
45                12     5.9%
44                12     5.9%
43                12     5.9%
42                12     5.9%
Other values (7)  84     41.2%

Minimum 10 values

Value  Count  Frequency (%)
11     12     5.9%
26     12     5.9%
27     12     5.9%
28     12     5.9%
29     12     5.9%
30     12     5.9%
31     12     5.9%
36     12     5.9%
41     12     5.9%
42     12     5.9%

Maximum 10 values

Value  Count  Frequency (%)
50     12     5.9%
48     12     5.9%
47     12     5.9%
46     12     5.9%
45     12     5.9%
44     12     5.9%
43     12     5.9%
42     12     5.9%
41     12     5.9%
36     12     5.9%
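
Because every one of the 17 codes occurs exactly 12 times, the whole column can be rebuilt from the frequency tables and the summary statistics recomputed. The Python sketch below does that with the standard library. It assumes the reported standard deviation is the sample version (n - 1 in the denominator), which is what `statistics.stdev` computes; the variable names are illustrative.

```python
from statistics import mean, median, stdev

# The 17 distinct codes listed in the frequency tables; each occurs 12 times.
codes = [11, 26, 27, 28, 29, 30, 31, 36, 41, 42, 43, 44, 45, 46, 47, 48, 50]
values = [code for code in codes for _ in range(12)]

avg = mean(values)
sd = stdev(values)  # sample standard deviation (n - 1 in the denominator)
med = median(values)
mad = median([abs(v - med) for v in values])  # median absolute deviation

print(f"sum={sum(values)} mean={avg:.6f} std={sd:.6f} var={sd ** 2:.5f}")
print(f"cv={sd / avg:.8f} median={med} mad={mad}")
```

Note that these statistics treat the code as a quantity. Since 시도코드 is an identifier for a province, the mean and standard deviation describe the numbering scheme rather than anything about ticket sales.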

거주시도 (province of residence)
Categorical

HIGH CORRELATION

Distinct: 17
Distinct (%): 8.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 KiB
Top categories: 서울 (12), 부산 (12), 대구 (12), 인천 (12), 광주 (12), other values (12 categories, 144 rows)

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 서울
2nd row: 부산
3rd row: 대구
4th row: 인천
5th row: 광주

Common Values

Value             Count  Frequency (%)
서울              12     5.9%
부산              12     5.9%
대구              12     5.9%
인천              12     5.9%
광주              12     5.9%
대전              12     5.9%
울산              12     5.9%
세종              12     5.9%
경기              12     5.9%
강원              12     5.9%
Other values (7)  84     41.2%

Length

[Figure: Histogram of lengths of the category]

Value             Count  Frequency (%)
서울              12     5.9%
강원              12     5.9%
경남              12     5.9%
경북              12     5.9%
전남              12     5.9%
전북              12     5.9%
충남              12     5.9%
충북              12     5.9%
경기              12     5.9%
부산              12     5.9%
Other values (7)  84     41.2%

총 발매 수 (total tickets issued)
Real number (ℝ)

Distinct: 198
Distinct (%): 97.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 434814.22
Minimum: 3200
Maximum: 3026200
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 1.9 KiB

Quantile statistics

Minimum: 3200
5-th percentile: 20815
Q1: 107825
Median: 249250
Q3: 485525
95-th percentile: 1926090
Maximum: 3026200
Range: 3023000
Interquartile range (IQR): 377700

Descriptive statistics

Standard deviation: 594714.58
Coefficient of variation (CV): 1.3677441
Kurtosis: 7.5456167
Mean: 434814.22
Median Absolute Deviation (MAD): 189150
Skewness: 2.7412641
Sum: 88702100
Variance: 3.5368543 × 10^11
Monotonicity: Not monotonic
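
Several of these figures are derived from others in the same section, so they can be cross-checked without the raw data. The Python sketch below recomputes the mean, range, IQR, coefficient of variation, and variance from the reported sum, count, extremes, quartiles, and standard deviation. All inputs are copied from this report; the variable names are illustrative.

```python
# Figures as reported in this section and in the dataset overview.
n = 204
total = 88_702_100
minimum, maximum = 3_200, 3_026_200
q1, q3 = 107_825, 485_525
std = 594_714.58

avg = total / n                # mean = sum / number of observations
value_range = maximum - minimum
iqr = q3 - q1                  # interquartile range
cv = std / avg                 # coefficient of variation
var = std ** 2                 # variance is the squared standard deviation

print(f"mean={avg:.2f} range={value_range} iqr={iqr}")
print(f"cv={cv:.7f} variance={var:.7e}")
```

A mean (434814.22) well above the median (249250), together with a skewness of 2.74, indicates a long right tail: a small number of sex, age, and province combinations account for very large ticket counts.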

[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value               Count  Frequency (%)
39900               2      1.0%
23800               2      1.0%
19300               2      1.0%
13600               2      1.0%
22700               2      1.0%
57500               2      1.0%
1898800             1      0.5%
2452300             1      0.5%
353400              1      0.5%
279300              1      0.5%
Other values (188)  188    92.2%

Minimum 10 values

Value  Count  Frequency (%)
3200   1      0.5%
4200   1      0.5%
4700   1      0.5%
9100   1      0.5%
10300  1      0.5%
13600  2      1.0%
19300  2      1.0%
19800  1      0.5%
20500  1      0.5%
22600  1      0.5%

Maximum 10 values

Value    Count  Frequency (%)
3026200  1      0.5%
2953600  1      0.5%
2901700  1      0.5%
2818300  1      0.5%
2634900  1      0.5%
2528500  1      0.5%
2452300  1      0.5%
2291900  1      0.5%
2085800  1      0.5%
2051900  1      0.5%

Interactions

[Figures: four interaction plots]

Correlations

[Figure: correlation heatmap]

            성별    연령    시도코드  거주시도  총 발매 수
성별        1.000   0.000   0.000     0.000     0.000
연령        0.000   1.000   0.000     0.000     0.422
시도코드    0.000   0.000   1.000     1.000     0.459
거주시도    0.000   0.000   1.000     1.000     0.540
총 발매 수  0.000   0.422   0.459     0.540     1.000

[Figure: correlation heatmap]

          연령    거주시도  성별
연령      1.000   0.000     0.000
거주시도  0.000   1.000     0.000
성별      0.000   0.000     1.000

[Figure: correlation heatmap]

            시도코드  총 발매 수  성별    연령    거주시도
시도코드    1.000     -0.296      0.000   0.000   0.977
총 발매 수  -0.296    1.000       0.000   0.236   0.240
성별        0.000     0.000       1.000   0.000   0.000
연령        0.000     0.236       0.000   1.000   0.000
거주시도    0.977     0.240       0.000   0.000   1.000
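
The zero entries among 성별, 연령, and 거주시도 are consistent with a fully crossed layout: 2 sexes × 6 age groups × 17 provinces = 204 rows, matching the per-category counts of 102, 34, and 12. The Python sketch below builds such a layout and computes Cramér's V for two pairs of columns. It rests on two assumptions not stated in the report: that every combination occurs exactly once, and that Cramér's V (without bias correction) is a reasonable stand-in for the association measure behind the categorical matrix. The helper `cramers_v` is illustrative.

```python
from collections import Counter
from itertools import product
from math import sqrt


def cramers_v(pairs):
    """Cramér's V (no bias correction) for a list of (a, b) category pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    row = Counter(a for a, _ in pairs)
    col = Counter(b for _, b in pairs)
    chi2 = 0.0
    for a, b in product(row, col):
        expected = row[a] * col[b] / n  # count expected under independence
        chi2 += (joint[(a, b)] - expected) ** 2 / expected
    k = min(len(row), len(col)) - 1
    return sqrt(chi2 / (n * k))


sexes = [1, 2]
ages = ["20대", "30대", "40대", "50대", "60대이상", "기타"]
regions = range(17)  # placeholders standing in for the 17 provinces

rows = list(product(sexes, ages, regions))  # one row per combination
print(len(rows))
print(cramers_v([(s, a) for s, a, _ in rows]))
print(cramers_v([(a, r) for _, a, r in rows]))
```

In a balanced design every cell of the contingency table equals its expected count, so the chi-squared statistic is zero and so is the association. The zeros therefore describe how the table was laid out, not passenger behaviour.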

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display that lets you quickly pick out patterns in data completion.]

Sample

First rows

Index  성별  연령  시도코드  거주시도  총 발매 수
0      1     20대  11        서울      1909600
1      1     20대  26        부산      584400
2      1     20대  27        대구      701500
3      1     20대  28        인천      318200
4      1     20대  29        광주      187600
5      1     20대  30        대전      691200
6      1     20대  31        울산      329900
7      1     20대  36        세종      155400
8      1     20대  41        경기      2051900
9      1     20대  42        강원      324900

Last rows

Index  성별  연령  시도코드  거주시도  총 발매 수
194    2     기타  36        세종      13600
195    2     기타  41        경기      177000
196    2     기타  42        강원      23700
197    2     기타  43        충북      19300
198    2     기타  44        충남      57500
199    2     기타  45        전북      27800
200    2     기타  46        전남      25200
201    2     기타  47        경북      73000
202    2     기타  48        경남      38200
203    2     기타  50        제주      4700