Overview

Dataset statistics

Number of variables: 8
Number of observations: 10000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 742.2 KiB
Average record size in memory: 76.0 B
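
The average record size follows from the two figures above it. A minimal check, assuming the usual convention that 1 KiB is 1024 bytes (the variable names are illustrative, not part of the report):

```python
# Values copied from the dataset statistics.
total_size_kib = 742.2
n_observations = 10_000

# Assumes 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations
print(f"{avg_record_bytes:.1f} B")  # 76.0 B
```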

Variable types

Categorical: 4
Text: 1
Numeric: 2
DateTime: 1

Dataset

Description: File download
Author: Seoul Metropolitan Government (서울특별시)
URL: https://data.seoul.go.kr/dataList/OA-15972/S/1/datasetView.do

Alerts

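기관 명 (agency name) has constant value "노원구" [Constant]
모델명 (model name) has constant value "GDS-100T" [Constant]
불량여부(정상:0, 불량:1) (defect status; normal: 0, defective: 1) is highly imbalanced (92.0%) [Imbalance]
위도 (latitude) is highly skewed (γ1 = -54.26324649) [Skewed]
경도 (longitude) is highly skewed (γ1 = -70.68390657) [Skewed]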
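
The 92.0% imbalance figure can be reproduced from the class counts reported further down (9901 normal, 99 defective). The sketch below assumes the score is one minus the Shannon entropy normalized by its maximum for the number of classes; that formula is an assumption about how the profiler derives the alert, and the function name is illustrative.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for evenly split classes, close to 1 when one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    # Shannon entropy in bits, skipping empty classes.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)

# Counts for defect status: 0 (normal) and 1 (defective).
score = imbalance_score([9901, 99])
print(f"{score:.1%}")  # 92.0%
```

Under this definition a 99:1 split scores about 0.92, which matches the alert.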

Reproduction

Analysis started: 2024-05-04 00:43:36.078528
Analysis finished: 2024-05-04 00:43:40.155613
Duration: 4.08 seconds
Software version: ydata-profiling v4.5.1
Download configuration: config.json

Variables

기관 명 (agency name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
노원구: 10000

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 노원구
2nd row: 노원구
3rd row: 노원구
4th row: 노원구
5th row: 노원구

Common Values

Value | Count | Frequency (%)
노원구 | 10000 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot. 노원구: 10000 (100.0%)]

모델명 (model name)
Categorical

CONSTANT

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
GDS-100T: 10000

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: GDS-100T
2nd row: GDS-100T
3rd row: GDS-100T
4th row: GDS-100T
5th row: GDS-100T

Common Values

Value | Count | Frequency (%)
GDS-100T | 10000 | 100.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot. GDS-100T: 10000 (100.0%)]

시리얼 (serial)
Text

Distinct: 4184
Distinct (%): 41.8%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Length

Max length: 12
Median length: 7
Mean length: 7.6777
Min length: 7

Characters and Unicode

Total characters: 76777
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1000
Unique (%): 10.0%

Sample

1st row: 12-8508
2nd row: 15-5004
3rd row: 16-6666
4th row: 12-2621
5th row: 2019-0003729
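
The sample rows suggest at least two serial formats: a 7-character form such as 12-8508 and a 12-character form such as 2019-0003729, which matches the reported minimum and maximum lengths. The sketch below classifies serials by shape. Both patterns are hypothetical, inferred only from these five rows, and the full column may contain other shapes (the character statistics also report a few lowercase letters).

```python
import re

# Hypothetical patterns inferred from the five sample rows only.
SHORT_SERIAL = re.compile(r"\d{2}-\d{4}")  # e.g. 12-8508
LONG_SERIAL = re.compile(r"\d{4}-\d{7}")   # e.g. 2019-0003729

def serial_format(serial: str) -> str:
    """Label a serial as 'short', 'long', or 'other' by its shape."""
    if SHORT_SERIAL.fullmatch(serial):
        return "short"
    if LONG_SERIAL.fullmatch(serial):
        return "long"
    return "other"

samples = ["12-8508", "15-5004", "16-6666", "12-2621", "2019-0003729"]
labels = [serial_format(s) for s in samples]
print(labels)  # ['short', 'short', 'short', 'short', 'long']
```

Counting the "other" label over the full column would show how many serials fall outside the two assumed shapes.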

Value | Count | Frequency (%)
16-4883 | 7 | 0.1%
12-2804 | 6 | 0.1%
10-3666 | 6 | 0.1%
2019-0002299 | 6 | 0.1%
15-1680 | 6 | 0.1%
21-7616 | 6 | 0.1%
12-1890 | 6 | 0.1%
25-2069 | 6 | 0.1%
21-4195 | 6 | 0.1%
25-6863 | 6 | 0.1%
Other values (4174) | 9939 | 99.4%

Most occurring characters

Value | Count | Frequency (%)
1 | 12974 | 16.9%
2 | 10690 | 13.9%
- | 10000 | 13.0%
0 | 9835 | 12.8%
5 | 6571 | 8.6%
9 | 5087 | 6.6%
6 | 4664 | 6.1%
4 | 4377 | 5.7%
3 | 4349 | 5.7%
7 | 4220 | 5.5%
Other values (3) | 4010 | 5.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 66749 | 86.9%
Dash Punctuation | 10000 | 13.0%
Lowercase Letter | 28 | < 0.1%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 12974 | 19.4%
2 | 10690 | 16.0%
0 | 9835 | 14.7%
5 | 6571 | 9.8%
9 | 5087 | 7.6%
6 | 4664 | 7.0%
4 | 4377 | 6.6%
3 | 4349 | 6.5%
7 | 4220 | 6.3%
8 | 3982 | 6.0%

Lowercase Letter
Value | Count | Frequency (%)
l | 14 | 50.0%
r | 14 | 50.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 10000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 76749 | > 99.9%
Latin | 28 | < 0.1%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 12974 | 16.9%
2 | 10690 | 13.9%
- | 10000 | 13.0%
0 | 9835 | 12.8%
5 | 6571 | 8.6%
9 | 5087 | 6.6%
6 | 4664 | 6.1%
4 | 4377 | 5.7%
3 | 4349 | 5.7%
7 | 4220 | 5.5%

Latin
Value | Count | Frequency (%)
l | 14 | 50.0%
r | 14 | 50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 76777 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 12974 | 16.9%
2 | 10690 | 13.9%
- | 10000 | 13.0%
0 | 9835 | 12.8%
5 | 6571 | 8.6%
9 | 5087 | 6.6%
6 | 4664 | 6.1%
4 | 4377 | 5.7%
3 | 4349 | 5.7%
7 | 4220 | 5.5%
Other values (3) | 4010 | 5.2%

위도 (latitude)
Real number (ℝ)

SKEWED

Distinct: 3973
Distinct (%): 39.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.644181
Minimum: 31.629479
Maximum: 37.929737
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 31.629479
5-th percentile: 37.618838
Q1: 37.6246
median: 37.649362
Q3: 37.663975
95-th percentile: 37.675444
Maximum: 37.929737
Range: 6.300258
Interquartile range (IQR): 0.039375
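
The quantiles show how far the extremes sit from the bulk of the data: the IQR is about 0.04 degrees, while the range exceeds 6 degrees. One common way to flag such values is Tukey's 1.5 × IQR fences. This is not something the report itself computes; it is a sketch applied to the reported latitude quantiles, with illustrative names.

```python
# Latitude quantiles copied from the report.
q1, q3 = 37.6246, 37.663975
iqr = q3 - q1  # approximately 0.039375, as reported

# Tukey fences: a conventional rule of thumb, not the profiler's own method.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

def is_outlier(value: float) -> bool:
    """True if the value lies outside the 1.5 * IQR fences."""
    return value < lower_fence or value > upper_fence

for label, value in [("minimum", 31.629479), ("median", 37.649362), ("maximum", 37.929737)]:
    print(label, value, is_outlier(value))
```

Both the reported minimum and maximum fall outside the fences, which is consistent with the strongly negative skewness driven by a few rows near 31.63.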

Descriptive statistics

Standard deviation: 0.106364
Coefficient of variation (CV): 0.0028255097
Kurtosis: 3066.7792
Mean: 37.644181
Median Absolute Deviation (MAD): 0.020803
Skewness: -54.263246
Sum: 376441.81
Variance: 0.0113133
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
37.624869 | 9 | 0.1%
37.656214 | 9 | 0.1%
37.675216 | 9 | 0.1%
37.622571 | 8 | 0.1%
37.675135 | 8 | 0.1%
37.672039 | 8 | 0.1%
37.673793 | 8 | 0.1%
37.672042 | 8 | 0.1%
37.670066 | 8 | 0.1%
37.6734 | 8 | 0.1%
Other values (3963) | 9917 | 99.2%

Minimum 10 values
Value | Count | Frequency (%)
31.629479 | 3 | < 0.1%
37.323564 | 2 | < 0.1%
37.61207 | 2 | < 0.1%
37.614475 | 3 | < 0.1%
37.614544 | 1 | < 0.1%
37.614683 | 2 | < 0.1%
37.614823 | 2 | < 0.1%
37.614824 | 2 | < 0.1%
37.615128 | 3 | < 0.1%
37.615143 | 4 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
37.929737 | 2 | < 0.1%
37.68483 | 2 | < 0.1%
37.684206 | 2 | < 0.1%
37.684121 | 2 | < 0.1%
37.683963 | 2 | < 0.1%
37.683925 | 1 | < 0.1%
37.683732 | 1 | < 0.1%
37.683705 | 1 | < 0.1%
37.68347 | 1 | < 0.1%
37.683428 | 5 | 0.1%

경도 (longitude)
Real number (ℝ)

SKEWED

Distinct: 3930
Distinct (%): 39.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 127.05212
Minimum: 37.624061
Maximum: 127.70959
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 166.0 KiB

Quantile statistics

Minimum: 37.624061
5-th percentile: 127.05167
Q1: 127.06066
median: 127.07093
Q3: 127.07754
95-th percentile: 127.08546
Maximum: 127.70959
Range: 90.085534
Interquartile range (IQR): 0.016882

Descriptive statistics

Standard deviation: 1.2649896
Coefficient of variation (CV): 0.009956462
Kurtosis: 4995.9794
Mean: 127.05212
Median Absolute Deviation (MAD): 0.007834
Skewness: -70.683907
Sum: 1270521.2
Variance: 1.6001987
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Common values
Value | Count | Frequency (%)
127.076381 | 10 | 0.1%
127.059021 | 10 | 0.1%
127.069653 | 9 | 0.1%
127.077981 | 9 | 0.1%
127.075549 | 9 | 0.1%
127.068972 | 8 | 0.1%
127.058561 | 8 | 0.1%
127.075047 | 8 | 0.1%
127.058627 | 8 | 0.1%
127.056543 | 8 | 0.1%
Other values (3920) | 9913 | 99.1%

Minimum 10 values
Value | Count | Frequency (%)
37.6240612 | 2 | < 0.1%
127.040071 | 2 | < 0.1%
127.041928 | 1 | < 0.1%
127.042075 | 2 | < 0.1%
127.042219 | 3 | < 0.1%
127.042411 | 3 | < 0.1%
127.042431 | 2 | < 0.1%
127.042596 | 1 | < 0.1%
127.042597 | 1 | < 0.1%
127.042598 | 3 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
127.709595 | 3 | < 0.1%
127.111781 | 4 | < 0.1%
127.111452 | 1 | < 0.1%
127.111332 | 2 | < 0.1%
127.111225 | 4 | < 0.1%
127.111159 | 2 | < 0.1%
127.111155 | 3 | < 0.1%
127.11091 | 2 | < 0.1%
127.110735 | 3 | < 0.1%
127.110713 | 3 | < 0.1%
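
The smallest longitude, 37.6240612, lies inside the latitude column's 5th to 95th percentile band (37.618838 to 37.675444), which hints that a latitude value may have been recorded in the longitude field for those two rows. The check below is only a sketch: the function name and the 0.1 degree margin are illustrative assumptions, and a match does not establish the cause.

```python
# Latitude 5th and 95th percentiles copied from the report.
LAT_P5, LAT_P95 = 37.618838, 37.675444

def looks_like_latitude(lon_value: float, margin: float = 0.1) -> bool:
    """True if a longitude value falls inside the latitude band (plus a hypothetical margin)."""
    return LAT_P5 - margin <= lon_value <= LAT_P95 + margin

# The two smallest reported longitude values.
suspicious = [v for v in (37.6240612, 127.040071) if looks_like_latitude(v)]
print(suspicious)  # [37.6240612]
```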

등 밝기(%) (lamp brightness, %)
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
0: 5347
100: 4653

Length

Max length: 3
Median length: 1
Mean length: 1.9306
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 100
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 5347 | 53.5%
100 | 4653 | 46.5%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot. 0: 5347 (53.5%); 100: 4653 (46.5%)]

불량여부(정상:0, 불량:1) (defect status; normal: 0, defective: 1)
Categorical

IMBALANCE

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
0: 9901
1: 99

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 9901 | 99.0%
1 | 99 | 1.0%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: Common values plot. 0: 9901 (99.0%); 1: 99 (1.0%)]

등록일자 (registration date)
DateTime

Distinct: 5816
Distinct (%): 58.2%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB
Minimum: 2024-01-29 00:08:32
Maximum: 2024-01-29 11:17:26
[Figure: Histogram with fixed size bins (bins=50)]

Interactions

[Figures: four interaction plots, images not preserved]

Correlations

 | 위도 | 경도 | 등 밝기(%) | 불량여부(정상:0, 불량:1)
위도 | 1.000 | 0.000 | 0.004 | 0.000
경도 | 0.000 | 1.000 | 0.000 | 0.000
등 밝기(%) | 0.004 | 0.000 | 1.000 | 0.037
불량여부(정상:0, 불량:1) | 0.000 | 0.000 | 0.037 | 1.000

 | 등 밝기(%) | 불량여부(정상:0, 불량:1)
등 밝기(%) | 1.000 | 0.023
불량여부(정상:0, 불량:1) | 0.023 | 1.000

 | 위도 | 경도 | 등 밝기(%) | 불량여부(정상:0, 불량:1)
위도 | 1.000 | -0.096 | 0.003 | 0.000
경도 | -0.096 | 1.000 | 0.000 | 0.000
등 밝기(%) | 0.003 | 0.000 | 1.000 | 0.023
불량여부(정상:0, 불량:1) | 0.000 | 0.000 | 0.023 | 1.000

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

 | 기관 명 | 모델명 | 시리얼 | 위도 | 경도 | 등 밝기(%) | 불량여부(정상:0, 불량:1) | 등록일자
8532 | 노원구 | GDS-100T | 12-8508 | 37.626403 | 127.078073 | 0 | 0 | 2024-01-29 03:13:22
27560 | 노원구 | GDS-100T | 15-5004 | 37.673444 | 127.052498 | 100 | 0 | 2024-01-29 10:12:15
15591 | 노원구 | GDS-100T | 16-6666 | 37.623994 | 127.060174 | 0 | 0 | 2024-01-29 05:18:10
8615 | 노원구 | GDS-100T | 12-2621 | 37.621447 | 127.078251 | 0 | 0 | 2024-01-29 03:13:42
5538 | 노원구 | GDS-100T | 2019-0003729 | 37.661014 | 127.071604 | 0 | 0 | 2024-01-29 01:48:38
19740 | 노원구 | GDS-100T | 21-8353 | 37.643252 | 127.051735 | 100 | 0 | 2024-01-29 07:11:44
4578 | 노원구 | GDS-100T | 15-7927 | 37.624498 | 127.089996 | 0 | 0 | 2024-01-29 01:44:56
8909 | 노원구 | GDS-100T | 12-9783 | 37.630669 | 127.065542 | 0 | 0 | 2024-01-29 03:14:55
29790 | 노원구 | GDS-100T | 10-3748 | 37.616273 | 127.062958 | 100 | 0 | 2024-01-29 11:10:52
22334 | 노원구 | GDS-100T | 17-3784 | 37.637181 | 127.111332 | 100 | 0 | 2024-01-29 08:10:31

 | 기관 명 | 모델명 | 시리얼 | 위도 | 경도 | 등 밝기(%) | 불량여부(정상:0, 불량:1) | 등록일자
4197 | 노원구 | GDS-100T | 12-9954 | 37.648285 | 127.083809 | 0 | 0 | 2024-01-29 00:20:05
1997 | 노원구 | GDS-100T | 21-7304 | 37.663282 | 127.066565 | 0 | 0 | 2024-01-29 00:13:50
1073 | 노원구 | GDS-100T | 2019-0003709 | 37.674101 | 127.050695 | 0 | 0 | 2024-01-29 00:11:16
5780 | 노원구 | GDS-100T | 16-5631 | 37.636971 | 127.078061 | 0 | 0 | 2024-01-29 01:49:36
16308 | 노원구 | GDS-100T | 2019-0002378 | 37.664598 | 127.07055 | 0 | 0 | 2024-01-29 06:09:18
17503 | 노원구 | GDS-100T | 12-7588 | 37.624568 | 127.07855 | 100 | 0 | 2024-01-29 06:14:11
24901 | 노원구 | GDS-100T | 25-8125 | 37.636569 | 127.073407 | 100 | 0 | 2024-01-29 09:10:01
16188 | 노원구 | GDS-100T | 25-4159 | 37.671206 | 127.080394 | 0 | 0 | 2024-01-29 06:08:51
23309 | 노원구 | GDS-100T | 16-7369 | 37.626055 | 127.094204 | 100 | 0 | 2024-01-29 08:14:24
8603 | 노원구 | GDS-100T | 2019-0000748 | 37.653421 | 127.071039 | 0 | 0 | 2024-01-29 03:13:39