Overview

Dataset statistics

Number of variables            3
Number of observations         4492
Missing cells                  0
Missing cells (%)              0.0%
Duplicate rows                 0
Duplicate rows (%)             0.0%
Total size in memory           114.2 KiB
Average record size in memory  26.0 B
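
The average record size follows from the total memory size and the number of observations. A quick check, assuming that 1 KiB is 1024 bytes and that the reported total is rounded to one decimal place:

```python
# Figures copied from the dataset statistics above.
total_size_kib = 114.2
n_observations = 4492

# Assumption: KiB means 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")  # prints "26.0 B"
```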

Variable types

Numeric  2
Text     1

Reproduction

Analysis started        2023-12-10 21:22:27.481224
Analysis finished       2023-12-10 21:22:28.237366
Duration                0.76 seconds
Software version        ydata-profiling v4.5.1
Download configuration  config.json

Variables

업체ID (company ID)
Real number (ℝ)

Distinct      122
Distinct (%)  2.7%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          4073219.2
Minimum       1000002
Maximum       4154500
Zeros         0
Zeros (%)     0.0%
Negative      0
Negative (%)  0.0%
Memory size   39.6 KiB

Quantile statistics

Minimum                    1000002
5-th percentile            4100200
Q1                         4100300
Median                     4101100
Q3                         4103800
95-th percentile           4109100
Maximum                    4154500
Range                      3154498
Interquartile range (IQR)  3500

Descriptive statistics

Standard deviation               302206.95
Coefficient of variation (CV)    0.074193639
Kurtosis                         99.527667
Mean                             4073219.2
Median Absolute Deviation (MAD)  900
Skewness                         -10.072349
Sum                              1.82969 × 10^10
Variance                         9.1329043 × 10^10
Monotonicity                     Not monotonic
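
The derived figures for this variable (range, IQR, coefficient of variation, variance) can be reproduced from the reported minimum, maximum, quartiles, mean, and standard deviation. A minimal sketch using only numbers printed in this report; small differences are expected because the report rounds its inputs:

```python
# Values copied from the 업체ID statistics in this report.
minimum, maximum = 1000002, 4154500
q1, q3 = 4100300, 4103800
mean, std = 4073219.2, 302206.95

value_range = maximum - minimum  # range = max - min
iqr = q3 - q1                    # interquartile range = Q3 - Q1
cv = std / mean                  # coefficient of variation = std / mean
variance = std ** 2              # variance = std squared

print(value_range, iqr)
print(f"CV = {cv:.4f}, variance = {variance:.4e}")
```

The IQR (3500) is tiny compared with the range (3154498), which is consistent with the handful of values near 1000002 and the strongly negative skewness reported for this variable.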
Histogram with fixed size bins (bins=50)
Common values

Value               Count  Frequency (%)
4100300             766    17.1%
4100200             706    15.7%
4100900             198    4.4%
4101400             190    4.2%
4100600             167    3.7%
4103800             139    3.1%
4100500             116    2.6%
4101800             112    2.5%
4104200             109    2.4%
4103100             91     2.0%
Other values (112)  1898   42.3%

Minimum 10 values

Value    Count  Frequency (%)
1000002  1      < 0.1%
1000003  1      < 0.1%
1000005  1      < 0.1%
1000008  1      < 0.1%
1000009  2      < 0.1%
1000010  1      < 0.1%
1000011  1      < 0.1%
1000012  1      < 0.1%
1000013  1      < 0.1%
1000014  1      < 0.1%

Maximum 10 values

Value    Count  Frequency (%)
4154500  13     0.3%
4154400  1      < 0.1%
4150200  1      < 0.1%
4150100  1      < 0.1%
4149400  8      0.2%
4143600  13     0.3%
4113100  1      < 0.1%
4112800  1      < 0.1%
4112400  2      < 0.1%
4112200  2      < 0.1%

노선ID (route ID)
Real number (ℝ)

Distinct      3839
Distinct (%)  85.5%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          2.3560724 × 10^8
Minimum       0
Maximum       9.9900011 × 10^8
Zeros         2
Zeros (%)     < 0.1%
Negative      0
Negative (%)  0.0%
Memory size   39.6 KiB

Quantile statistics

Minimum                    0
5-th percentile            2.000003 × 10^8
Q1                         2.2100002 × 10^8
Median                     2.3300008 × 10^8
Q3                         2.3400141 × 10^8
95-th percentile           2.4000003 × 10^8
Maximum                    9.9900011 × 10^8
Range                      9.9900011 × 10^8
Interquartile range (IQR)  13001391

Descriptive statistics

Standard deviation               82700609
Coefficient of variation (CV)    0.35101047
Kurtosis                         79.589823
Mean                             2.3560724 × 10^8
Median Absolute Deviation (MAD)  3999974
Skewness                         8.9162434
Sum                              1.0583477 × 10^12
Variance                         6.8393907 × 10^15
Monotonicity                     Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                Count  Frequency (%)
999000107            31     0.7%
999000108            14     0.3%
219000017            4      0.1%
218000001            4      0.1%
218000002            4      0.1%
218000004            4      0.1%
219000008            4      0.1%
229000066            4      0.1%
229000060            4      0.1%
229000063            4      0.1%
Other values (3829)  4415   98.3%

Minimum 10 values

Value      Count  Frequency (%)
0          2      < 0.1%
200000006  1      < 0.1%
200000008  1      < 0.1%
200000009  1      < 0.1%
200000010  1      < 0.1%
200000012  1      < 0.1%
200000013  1      < 0.1%
200000014  1      < 0.1%
200000015  1      < 0.1%
200000016  1      < 0.1%

Maximum 10 values

Value      Count  Frequency (%)
999000108  14     0.3%
999000107  31     0.7%
999000106  1      < 0.1%
999000105  1      < 0.1%
999000102  1      < 0.1%
999000101  1      < 0.1%
999000100  1      < 0.1%
999000099  1      < 0.1%
249000002  1      < 0.1%
241000350  1      < 0.1%

노선명 (route name)
Text

Distinct      2199
Distinct (%)  49.0%
Missing       0
Missing (%)   0.0%
Memory size   35.2 KiB

Length

Max length     13
Median length  10
Mean length    3.8020926
Min length     1

Characters and Unicode

Total characters     17079
Distinct characters  107
Distinct categories  7
Distinct scripts     3
Distinct blocks      2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
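
To illustrate how such character properties can be read, the sketch below counts the Unicode general category of each character in a few route names taken from the sample rows of this report. It uses Python's standard `unicodedata` module; this is an illustration only and is not necessarily how the profiling tool computes its tables. The two-letter codes map to the category names used in this report: `Nd` is Decimal Number, `Pd` is Dash Punctuation, `Lu` is Uppercase Letter, and `Lo` is Other Letter.

```python
import unicodedata
from collections import Counter

# Route names taken from the sample rows of this report.
route_names = ["9-3", "1-2", "1-1", "8", "52", "M7106"]

# Count the two-letter general category of every character.
category_counts = Counter(
    unicodedata.category(ch) for name in route_names for ch in name
)
print(category_counts)  # Counter({'Nd': 13, 'Pd': 3, 'Lu': 1})

# Hangul syllables fall under "Lo" (Other Letter).
print(unicodedata.category("가"))  # Lo
```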

Unique

Unique      1305
Unique (%)  29.1%
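
The percentages and the mean length reported for this text variable follow from the raw counts. A short check, using only figures printed in this report (4492 observations, 2199 distinct values, 1305 unique values, 17079 total characters):

```python
# Counts copied from this report.
n_observations = 4492
n_distinct = 2199
n_unique = 1305
total_characters = 17079

distinct_pct = 100 * n_distinct / n_observations
unique_pct = 100 * n_unique / n_observations
mean_length = total_characters / n_observations

print(f"Distinct (%): {distinct_pct:.1f}")
print(f"Unique (%):   {unique_pct:.1f}")
print(f"Mean length:  {mean_length:.7f}")
```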

Sample

1st row  9-3
2nd row  1-2
3rd row  1-1
4th row  8
5th row  52

Value                Count  Frequency (%)
2222                 31     0.7%
3                    24     0.5%
2                    19     0.4%
33                   19     0.4%
7                    18     0.4%
1                    18     0.4%
8                    17     0.4%
100                  17     0.4%
11                   17     0.4%
5                    15     0.3%
Other values (2189)  4297   95.7%

Most occurring characters

Value              Count  Frequency (%)
1                  2509   14.7%
-                  2497   14.6%
2                  2122   12.4%
0                  1850   10.8%
3                  1753   10.3%
5                  1283   7.5%
9                  1012   5.9%
8                  892    5.2%
7                  822    4.8%
4                  722    4.2%
Other values (97)  1617   9.5%

Most occurring categories

Value              Count  Frequency (%)
Decimal Number     13661  80.0%
Dash Punctuation   2497   14.6%
Other Letter       413    2.4%
Uppercase Letter   356    2.1%
Close Punctuation  75     0.4%
Open Punctuation   75     0.4%
Lowercase Letter   2      < 0.1%

Most frequent character per category

Other Letter
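Value                Count  Frequency (%)
[character missing]  39     9.4%
[character missing]  22     5.3%
[character missing]  22     5.3%
[character missing]  21     5.1%
[character missing]  20     4.8%
[character missing]  18     4.4%
[character missing]  18     4.4%
[character missing]  17     4.1%
[character missing]  15     3.6%
[character missing]  14     3.4%
Other values (70)    207    50.1%
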
Uppercase Letter
Value             Count  Frequency (%)
A                 82     23.0%
B                 79     22.2%
M                 58     16.3%
P                 38     10.7%
G                 35     9.8%
H                 25     7.0%
N                 15     4.2%
C                 11     3.1%
D                 4      1.1%
Y                 4      1.1%
Other values (3)  5      1.4%

Decimal Number
Value  Count  Frequency (%)
1      2509   18.4%
2      2122   15.5%
0      1850   13.5%
3      1753   12.8%
5      1283   9.4%
9      1012   7.4%
8      892    6.5%
7      822    6.0%
4      722    5.3%
6      696    5.1%

Dash Punctuation
Value  Count  Frequency (%)
-      2497   100.0%

Close Punctuation
Value  Count  Frequency (%)
)      75     100.0%

Open Punctuation
Value  Count  Frequency (%)
(      75     100.0%

Lowercase Letter
Value  Count  Frequency (%)
a      2      100.0%

Most occurring scripts

Value   Count  Frequency (%)
Common  16308  95.5%
Hangul  413    2.4%
Latin   358    2.1%

Most frequent character per script

Hangul
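Value                Count  Frequency (%)
[character missing]  39     9.4%
[character missing]  22     5.3%
[character missing]  22     5.3%
[character missing]  21     5.1%
[character missing]  20     4.8%
[character missing]  18     4.4%
[character missing]  18     4.4%
[character missing]  17     4.1%
[character missing]  15     3.6%
[character missing]  14     3.4%
Other values (70)    207    50.1%
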
Latin
Value             Count  Frequency (%)
A                 82     22.9%
B                 79     22.1%
M                 58     16.2%
P                 38     10.6%
G                 35     9.8%
H                 25     7.0%
N                 15     4.2%
C                 11     3.1%
D                 4      1.1%
Y                 4      1.1%
Other values (4)  7      2.0%

Common
Value             Count  Frequency (%)
1                 2509   15.4%
-                 2497   15.3%
2                 2122   13.0%
0                 1850   11.3%
3                 1753   10.7%
5                 1283   7.9%
9                 1012   6.2%
8                 892    5.5%
7                 822    5.0%
4                 722    4.4%
Other values (3)  846    5.2%

Most occurring blocks

Value   Count  Frequency (%)
ASCII   16666  97.6%
Hangul  413    2.4%

Most frequent character per block

ASCII
Value              Count  Frequency (%)
1                  2509   15.1%
-                  2497   15.0%
2                  2122   12.7%
0                  1850   11.1%
3                  1753   10.5%
5                  1283   7.7%
9                  1012   6.1%
8                  892    5.4%
7                  822    4.9%
4                  722    4.3%
Other values (17)  1204   7.2%

Hangul
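Value                Count  Frequency (%)
[character missing]  39     9.4%
[character missing]  22     5.3%
[character missing]  22     5.3%
[character missing]  21     5.1%
[character missing]  20     4.8%
[character missing]  18     4.4%
[character missing]  18     4.4%
[character missing]  17     4.1%
[character missing]  15     3.6%
[character missing]  14     3.4%
Other values (70)    207    50.1%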

Interactions

[Figures: four pairwise interaction plots for 업체ID and 노선ID]

Correlations

         업체ID   노선ID
업체ID   1.000    0.654
노선ID   0.654    1.000

         업체ID   노선ID
업체ID   1.000    -0.363
노선ID   -0.363   1.000
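
The two matrices above give coefficients of opposite sign for the same pair of variables, and the extracted text does not say which correlation method produced each one. As a hedged illustration only, the sketch below uses hypothetical toy data (not the dataset in this report) to show how a linear coefficient (Pearson) and a rank-based coefficient (Spearman) can disagree in sign when a single extreme point is present. The helper functions are written for this example and assume there are no tied values.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)

def ranks(values):
    """Ranks 1..n in ascending order; assumes no tied values."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Hypothetical toy data: one extreme point (x=1, y=1000) plus an
# otherwise increasing trend.
x = [1, 100, 101, 102, 103, 104]
y = [1000, 1, 2, 3, 4, 5]

print(f"Pearson:  {pearson(x, y):.3f}")   # negative, dominated by the outlier
print(f"Spearman: {spearman(x, y):.3f}")  # positive, about 0.143
```

Whether this explains the values in this report is an assumption; confirming it would require the configuration used for the run.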

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

      업체ID    노선ID      노선명
0     4102100   208000020   9-3
1     4102100   208000021   1-2
2     4102100   208000024   1-1
3     4102100   208000028   8
4     4102100   208000030   52
5     4102100   208000034   3-1
6     4102100   208000035   52-1
7     4102100   208000036   1-5
8     4102100   208000037   5-1
9     4102100   208000038   83

Last rows

      업체ID    노선ID      노선명
4482  4154500   218000012   M7106
4483  4154500   218000016   37
4484  4154500   219000008   55
4485  4154500   219000017   76
4486  4154500   229000032   34
4487  4154500   229000036   360
4488  4154500   229000040   73
4489  4154500   229000060   790
4490  4154500   229000063   850
4491  4154500   229000066   550